Patent application title: CYANOBACTERIA SAXITOXIN GENE CLUSTER AND DETECTION OF CYANOTOXIC ORGANISMS
Inventors:
Brett A. Neilan (New South Wales, AU)
Troco Kaan Mihali (New South Wales, AU)
Ralf Kellmann (Nesttun, NO)
Young Jae Jeon (New South Wales, AU)
Assignees:
NEWSOUTH INNOVATIONS PTY LIMITED
IPC8 Class: AC12Q168FI
USPC Class:
435 612
Class name: Measuring or testing process involving enzymes or micro-organisms; composition or test strip therefore; processes of forming such composition or test strip involving nucleic acid with significant amplification step (e.g., polymerase chain reaction (pcr), etc.)
Publication date: 2011-06-02
Patent application number: 20110129842
Abstract:
The present invention relates to methods for the detection of
cyanobacteria, dinoflagellates, and in particular, methods for the
detection of cyanotoxic organisms. Kits for the detection of
cyanobacteria, dinoflagellates, and cyanotoxic organisms are provided.
The invention further relates to methods of screening for compounds that
modulate the activity of polynucleotides and/or polypeptides of the
saxitoxin and cylindrospermopsin biosynthetic pathways.
Claims:
1. An isolated polynucleotide comprising a nucleotide sequence sharing at
least 90% sequence homology with SEQ ID NO: 1 or a fragment thereof,
wherein said fragment encodes a protein of a saxitoxin biosynthetic
pathway.
2. The polynucleotide according to claim 1, wherein said fragment comprises a nucleotide sequence sharing at least 90% sequence homology with a sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, and SEQ ID NO: 68.
3. (canceled)
4. An isolated saxitoxin biosynthetic pathway polypeptide comprising an amino acid sequence sharing at least 90% sequence homology with a sequence selected from the group consisting of SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31, SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 49, SEQ ID NO: 51, SEQ ID NO: 53, SEQ ID NO: 55, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 63, SEQ ID NO: 65, SEQ ID NO: 67, and SEQ ID NO: 69.
5. (canceled)
6. (canceled)
7. (canceled)
8. (canceled)
9. A method for detecting a cyanotoxic organism, the method comprising the steps of obtaining a sample for use in the method and analyzing the sample for the presence of a saxitoxin (SXT) cluster gene present only in saxitoxin-producing organisms, wherein said analyzing comprises detecting: (i) a polynucleotide comprising a sequence selected from the group consisting of: SEQ ID NO: 14, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 36, and variants and fragments thereof, (ii) a ribonucleic acid or complementary DNA encoded by a sequence according to (i), (iii) a polypeptide comprising a sequence selected from the group consisting of: SEQ ID NO: 15, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 37, and variants and fragments thereof, and wherein said presence is indicative of cyanotoxic organisms in the sample.
10. The method according to claim 9, wherein said cyanotoxic organisms are cyanobacteria.
11. The method according to claim 9, wherein said analyzing comprises amplification of DNA from the sample by polymerase chain reaction and detecting the amplified sequences.
12. The method according to claim 11, wherein said polymerase chain reaction utilises one or more primers comprising a sequence selected from the group consisting of SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID NO: 72, SEQ ID NO: 73, SEQ ID NO: 74, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID NO: 113, SEQ ID NO: 114, SEQ ID NO: 115, SEQ ID NO: 116, SEQ ID NO: 117, SEQ ID NO: 118, SEQ ID NO: 119, SEQ ID NO: 120, SEQ ID NO: 121, SEQ ID NO: 122, SEQ ID NO: 123, SEQ ID NO: 124, SEQ ID NO: 125, SEQ ID NO: 126, SEQ ID NO: 127, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 132, SEQ ID NO: 133, SEQ ID NO: 134, and variants and fragments thereof.
13. The method according to claim 9, further comprising analyzing the sample for the presence of one or more of: (i) a polynucleotide comprising a sequence selected from the group consisting of: SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 109, and variants and fragments thereof, (ii) a ribonucleic acid or complementary DNA encoded by a sequence according to (i), (iii) a polypeptide comprising a sequence selected from the group consisting of: SEQ ID NO: 82, SEQ ID NO: 84, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 94, SEQ ID NO: 96, SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 102, SEQ ID NO: 104, SEQ ID NO: 106, SEQ ID NO: 108, and SEQ ID NO: 110, and variants and fragments thereof.
14. The method according to claim 13, wherein said analyzing comprises amplification of DNA from the sample by polymerase chain reaction utilizing one or more primers comprising a sequence selected from the group consisting of SEQ ID NO: 111, SEQ ID NO: 112, and variants and fragments thereof.
15. (canceled)
16. (canceled)
17. (canceled)
18. (canceled)
19. (canceled)
20. (canceled)
21. (canceled)
22. (canceled)
23. A kit for the detection of cyanotoxic organisms, the kit comprising at least one agent for detecting the presence of an SXT cluster gene present only in saxitoxin-producing organisms, wherein said agent detects: (i) a polynucleotide comprising a sequence selected from the group consisting of: SEQ ID NO: 14, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 36, and variants and fragments thereof, (ii) a ribonucleic acid or complementary DNA encoded by a sequence according to (i), (iii) a polypeptide comprising a sequence selected from the group consisting of: SEQ ID NO: 15, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 37, and variants and fragments thereof, and wherein said presence is indicative of cyanotoxic organisms in the sample.
24. The kit according to claim 23, wherein said at least one agent is a primer, antibody or probe.
25. The kit according to claim 24, wherein said primer or probe comprises a sequence selected from the group consisting of SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID NO: 72, SEQ ID NO: 73, SEQ ID NO: 74, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID NO: 113, SEQ ID NO: 114, SEQ ID NO: 115, SEQ ID NO: 116, SEQ ID NO: 117, SEQ ID NO: 118, SEQ ID NO: 119, SEQ ID NO: 120, SEQ ID NO: 121, SEQ ID NO: 122, SEQ ID NO: 123, SEQ ID NO: 124, SEQ ID NO: 125, SEQ ID NO: 126, SEQ ID NO: 127, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 132, SEQ ID NO: 133, SEQ ID NO: 134, and variants and fragments thereof.
26. The kit according to claim 23, further comprising at least one additional agent for detecting the presence of one or more of: (i) a polynucleotide comprising a sequence selected from the group consisting of: SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 109, and variants and fragments thereof, (ii) a ribonucleic acid or complementary DNA encoded by a sequence according to (i), (iii) a polypeptide comprising a sequence selected from the group consisting of: SEQ ID NO: 82, SEQ ID NO: 84, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 94, SEQ ID NO: 96, SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 102, SEQ ID NO: 104, SEQ ID NO: 106, SEQ ID NO: 108, and SEQ ID NO: 110, and variants and fragments thereof.
27. The kit according to claim 26, wherein said at least one additional agent is a primer, antibody or probe.
28. The kit according to claim 27, wherein said primer or probe comprises a sequence selected from the group consisting of SEQ ID NO: 109, SEQ ID NO: 110, and variants and fragments thereof.
29. (canceled)
30. (canceled)
31. (canceled)
32. (canceled)
33. The method according to claim 10, wherein said cyanotoxic organisms are dinoflagellates.
34. The kit according to claim 23, wherein said cyanotoxic organisms are cyanobacteria.
35. The kit according to claim 23, wherein said cyanotoxic organisms are dinoflagellates.
Description:
RELATED APPLICATIONS
[0001] This application claims the benefit of PCT Application No. PCT/AU2008/001805 filed on Dec. 5, 2008, which claims the benefit of Australian Patent Application No. 2008902056 filed on Apr. 24, 2008, which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
[0002] The present invention relates to methods for the detection of cyanobacteria, dinoflagellates, and in particular, methods for the detection of cyanotoxic organisms. Kits for the detection of cyanobacteria, dinoflagellates, and cyanotoxic organisms are provided. The invention further relates to methods of screening for compounds that modulate the activity of polynucleotides and/or polypeptides of the saxitoxin and cylindrospermopsin biosynthetic pathways.
BACKGROUND
[0003] Cyanobacteria, also known as blue-green algae, are photosynthetic bacteria widespread in marine and freshwater environments. Of particular significance for water quality and human and animal health are those cyanobacteria which produce toxic compounds. Under eutrophic conditions cyanobacteria tend to form large blooms which drastically elevate toxin concentrations. Cyanobacterial blooms may flourish and expand in coastal waters, streams, lakes, and in drinking water and recreational reservoirs. The toxins they produce can pose a serious health risk for humans and animals, and this problem is internationally relevant since most toxic cyanobacteria have a global distribution.
[0004] A diverse range of cyanobacterial genera are well known for the formation of toxic blue-green algal blooms on water surfaces. Saxitoxin (SXT) and its analogues cause the paralytic shellfish poisoning (PSP) syndrome, which affects human health and impacts coastal shellfish economies worldwide. PSP toxins are unique alkaloids, being produced by both prokaryotes and eukaryotes. PSP toxins are among the most potent and pervasive algal toxins and are considered a serious toxicological health risk that may affect humans, animals and ecosystems worldwide. These toxins block voltage-gated sodium and calcium channels, and prolong the gating of potassium channels, preventing the transduction of neuronal signals. It has been estimated that more than 2000 human cases of PSP occur globally every year. Moreover, coastal blooms of toxin-producing microorganisms result in millions of dollars of economic damage due to PSP toxin contamination of seafood and the continuous requirement for costly biotoxin monitoring programs. Early warning systems to anticipate paralytic shellfish toxin (PST)-producing algal blooms, such as PCR and ELISA-based screening, are as yet unavailable due to the lack of data on the genetic basis of PST production.
[0005] SXT is a tricyclic perhydropurine alkaloid which can be substituted at various positions leading to more than 30 naturally occurring SXT analogues. Although SXT biosynthesis seems complex and unique, organisms from two kingdoms, including certain species of marine dinoflagellates and freshwater cyanobacteria, are capable of producing these toxins, apparently by the same biosynthetic route. In spite of considerable efforts none of the enzymes or genes involved in the biosynthesis and modification of SXT have been previously identified.
[0006] The occurrence of the cyanobacterial genus Cylindrospermopsis has been documented on all continents and therefore poses a significant public health threat on a global scale. The major toxin produced by Cylindrospermopsis is cylindrospermopsin (CYR). Besides posing a threat to human health, cylindrospermopsin also causes significant economic losses for farmers due to the poisoning of livestock with cylindrospermopsin-contaminated drinking water. Cylindrospermopsin has hepatotoxic, general cytotoxic and neurotoxic effects and is a potential carcinogen. Its toxicity is due to the inhibition of glutathione and protein synthesis as well as the inhibition of cytochrome P450. Six cyanobacterial species have so far been identified as producing cylindrospermopsin: Cylindrospermopsis raciborskii, Aphanizomenon ovalisporum, Aphanizomenon flos-aquae, Umezakia natans, Raphidiopsis curvata and Anabaena bergii. Incidents of human poisoning with cylindrospermopsin have only been reported in sub-tropical Australia to date; however, C. raciborskii and A. flos-aquae have recently been detected in areas with more temperate climates. The tendency of C. raciborskii to form dense blooms and the invasiveness of the producer organisms give rise to global concerns for drinking water quality and necessitate the monitoring of drinking water reserves for the presence of cylindrospermopsin producers.
[0007] There is a need for rapid and accurate methods for detecting cyanobacteria, and in particular those strains which are capable of producing cyanotoxins such as saxitoxin and cylindrospermopsin. Rapid and accurate methods for detecting cyanotoxic organisms are needed for assessing the potential health hazard of cyanobacterial blooms and for the implementation of effective water management strategies to minimize the effects of toxic bloom outbreaks.
SUMMARY
[0008] In a first aspect, there is provided an isolated polynucleotide comprising a sequence according to SEQ ID NO: 1 or a variant or fragment thereof.
[0009] In one embodiment of the first aspect, the fragment comprises a sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof.
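By way of illustration only, whether a candidate sequence qualifies as a variant of a reference sequence may be assessed numerically. The following is a minimal sketch, assuming that percent identity over a pre-computed pairwise alignment is used as a proxy for the "at least 90% sequence homology" recited in claim 1. The function names, the gap character and the example sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns where both sequences have a gap are
    skipped; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if the aligned pair reaches the identity threshold (assumed 90%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-base sequences differing at one position.
reference = "ACGTACGTAC"
candidate = "ACGTACGTAT"
print(percent_identity(reference, candidate))
print(meets_threshold(reference, candidate))
```

In practice the alignment itself would be produced by a dedicated alignment tool, and the choice of identity versus similarity scoring affects the result.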
[0010] In a second aspect, there is provided an isolated ribonucleic acid or an isolated complementary DNA encoded by a sequence according to the first aspect.
[0011] In a third aspect, there is provided an isolated polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31, SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 49, SEQ ID NO: 51, SEQ ID NO: 53, SEQ ID NO: 55, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 63, SEQ ID NO: 65, SEQ ID NO: 67, SEQ ID NO: 69, and variants and fragments thereof.
[0012] In one embodiment, there is provided a probe or primer that hybridises specifically with one or more of: a polynucleotide according to the first aspect, a ribonucleic acid or complementary DNA according to the second aspect, or a polypeptide according to the third aspect.
[0013] In another embodiment, there is provided a vector comprising a polynucleotide according to the first aspect, or a ribonucleic acid or complementary DNA according to the second aspect. The vector may be an expression vector.
[0014] In another embodiment, a host cell is provided comprising the vector.
[0015] In another embodiment, there is provided an isolated antibody capable of binding specifically to a polypeptide according to the third aspect.
[0016] In a fourth aspect, there is provided a method for the detection of cyanobacteria, the method comprising the steps of obtaining a sample for use in the method and analyzing the sample for the presence of one or more of:
[0017] (i) a polynucleotide comprising a sequence according to the first aspect,
[0018] (ii) a ribonucleic acid or complementary DNA according to the second aspect,
[0019] (iii) a polypeptide comprising a sequence according to the third aspect, wherein said presence is indicative of cyanobacteria in the sample.
[0020] In a fifth aspect, there is provided a method for detecting a cyanotoxic organism, the method comprising the steps of obtaining a sample for use in the method and analyzing the sample for the presence of one or more of:
[0021] (i) a polynucleotide comprising a sequence selected from the group consisting of: SEQ ID NO: 14, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 36, and variants and fragments thereof,
[0022] (ii) a ribonucleic acid or complementary DNA encoded by a sequence according to (i),
[0023] (iii) a polypeptide comprising a sequence selected from the group consisting of: SEQ ID NO: 15, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 37, and variants and fragments thereof, wherein said presence is indicative of cyanotoxic organisms in the sample.
[0024] In one embodiment of the fifth aspect, the cyanotoxic organism is a cyanobacterium or a dinoflagellate.
[0025] In one embodiment of the fourth and fifth aspects, analyzing the sample comprises amplification of DNA from the sample by polymerase chain reaction and detecting the amplified sequences. The polymerase chain reaction may utilise one or more primers comprising a sequence selected from the group consisting of SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID NO: 72, SEQ ID NO: 73, SEQ ID NO: 74, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID NO: 113, SEQ ID NO: 114, SEQ ID NO: 115, SEQ ID NO: 116, SEQ ID NO: 117, SEQ ID NO: 118, SEQ ID NO: 119, SEQ ID NO: 120, SEQ ID NO: 121, SEQ ID NO: 122, SEQ ID NO: 123, SEQ ID NO: 124, SEQ ID NO: 125, SEQ ID NO: 126, SEQ ID NO: 127, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 132, SEQ ID NO: 133, SEQ ID NO: 134, and variants and fragments thereof.
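The amplification step described above can be sketched in silico as follows. This is an illustrative simplification, assuming exact primer matches on a single-stranded template written 5' to 3'; real primers tolerate mismatches and may be degenerate. The function names and the template and primer sequences are hypothetical placeholders and do not correspond to any SEQ ID NO of the sequence listing.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template: str, forward: str, reverse: str) -> Optional[str]:
    """Predict the amplicon for a primer pair, or None if no product.

    The forward primer must match the template strand exactly, and the
    reverse primer must match the opposite strand downstream of it.
    """
    template = template.upper()
    fwd = forward.upper()
    # Site on the template strand where the reverse primer anneals.
    rev_site = reverse_complement(reverse)
    start = template.find(fwd)
    if start == -1:
        return None
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


# Hypothetical template and primer pair.
template = "TTTT" + "ACGTGCA" + "GGGGCCCC" + "TTAGGCA" + "AAAA"
forward_primer = "ACGTGCA"
reverse_primer = reverse_complement("TTAGGCA")

amplicon = in_silico_pcr(template, forward_primer, reverse_primer)
print(amplicon, len(amplicon) if amplicon else None)
```

A predicted amplicon length obtained in this way could be compared with the band size observed after gel electrophoresis.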
[0026] In another embodiment of the fourth and fifth aspects, the method comprises further analyzing the sample for the presence of one or more of:
[0027] (i) a polynucleotide comprising a sequence selected from the group consisting of: SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 109, and variants and fragments thereof,
[0028] (ii) a ribonucleic acid or complementary DNA encoded by a sequence according to (i),
[0029] (iii) a polypeptide comprising a sequence selected from the group consisting of: SEQ ID NO: 82, SEQ ID NO: 84, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 94, SEQ ID NO: 96, SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 102, SEQ ID NO: 104, SEQ ID NO: 106, SEQ ID NO: 108, and SEQ ID NO: 110, and variants and fragments thereof.
[0030] The further analysis of the sample may comprise amplification of DNA from the sample by polymerase chain reaction. The polymerase chain reaction may utilise one or more primers comprising a sequence selected from the group consisting of SEQ ID NO: 111, SEQ ID NO: 112, or variants or fragments thereof.
[0031] In a sixth aspect, there is provided a method for the detection of dinoflagellates, the method comprising the steps of obtaining a sample for use in the method and analyzing the sample for the presence of one or more of:
[0032] (i) a polynucleotide comprising a sequence according to the first aspect,
[0033] (ii) a ribonucleic acid or complementary DNA according to the second aspect,
[0034] (iii) a polypeptide comprising a sequence according to the third aspect, wherein said presence is indicative of dinoflagellates in the sample.
[0035] In one embodiment of the sixth aspect, analysing the sample comprises amplification of DNA from the sample by polymerase chain reaction and detecting the amplified sequences.
[0036] In one embodiment of the fourth, fifth, and sixth aspects, the detection comprises one or both of gel electrophoresis and nucleic acid sequencing. The sample may comprise one or more isolated or cultured organisms. The sample may be an environmental sample. The environmental sample may be derived from salt water, fresh water or a blue-green algal bloom.
[0037] In a seventh aspect, there is provided a kit for the detection of cyanobacteria, the kit comprising at least one agent for detecting the presence of one or more of:
[0038] (i) a polynucleotide comprising a sequence according to the first aspect,
[0039] (ii) a ribonucleic acid or complementary DNA according to the second aspect,
[0040] (iii) a polypeptide comprising a sequence according to the third aspect, wherein said presence is indicative of cyanobacteria in the sample.
[0041] In an eighth aspect, there is provided a kit for the detection of cyanotoxic organisms, the kit comprising at least one agent for detecting the presence of one or more of:
[0042] (i) a polynucleotide comprising a sequence selected from the group consisting of: SEQ ID NO: 14, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 36, and variants and fragments thereof,
[0043] (ii) a ribonucleic acid or complementary DNA encoded by a sequence according to (i),
[0044] (iii) a polypeptide comprising a sequence selected from the group consisting of SEQ ID NO: 15, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 37, and variants and fragments thereof, wherein said presence is indicative of cyanotoxic organisms in the sample.
[0045] In one embodiment of the seventh and eighth aspects, the at least one agent is a primer, antibody or probe. The primer or probe may comprise a sequence selected from the group consisting of SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID NO: 72, SEQ ID NO: 73, SEQ ID NO: 74, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID NO: 113, SEQ ID NO: 114, SEQ ID NO: 115, SEQ ID NO: 116, SEQ ID NO: 117, SEQ ID NO: 118, SEQ ID NO: 119, SEQ ID NO: 120, SEQ ID NO: 121, SEQ ID NO: 122, SEQ ID NO: 123, SEQ ID NO: 124, SEQ ID NO: 125, SEQ ID NO: 126, SEQ ID NO: 127, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 132, SEQ ID NO: 133, SEQ ID NO: 134, and variants and fragments thereof.
[0046] In another embodiment of the seventh and eighth aspects, the kit further comprises at least one additional agent for detecting the presence of one or more of:
[0047] (i) a polynucleotide comprising a sequence selected from the group consisting of: SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 109, and variants and fragments thereof,
[0048] (ii) a ribonucleic acid or complementary DNA encoded by a sequence according to (i),
[0049] (iii) a polypeptide comprising a sequence selected from the group consisting of: SEQ ID NO: 82, SEQ ID NO: 84, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 94, SEQ ID NO: 96, SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 102, SEQ ID NO: 104, SEQ ID NO: 106, SEQ ID NO: 108, and SEQ ID NO: 110, and variants and fragments thereof.
[0050] The at least one additional agent may be a primer, antibody or probe. The primer or probe may comprise a sequence selected from the group consisting of SEQ ID NO: 109, SEQ ID NO: 110, and variants and fragments thereof.
[0051] In a ninth aspect, there is provided a kit for the detection of dinoflagellates, the kit comprising at least one agent for detecting the presence of one or more of:
[0052] (i) a polynucleotide comprising a sequence according to the first aspect,
[0053] (ii) a ribonucleic acid or complementary DNA according to the second aspect,
[0054] (iii) a polypeptide comprising a sequence according to the third aspect, wherein said presence is indicative of dinoflagellates in the sample.
[0055] In a tenth aspect, there is provided a method of screening for a compound that modulates the expression or activity of one or more polypeptides according to the third aspect, the method comprising contacting the polypeptide with a candidate compound under conditions suitable to enable interaction of the candidate compound and the polypeptide, and assaying for activity of the polypeptide.
[0056] In one embodiment of the tenth aspect, modulating the expression or activity of one or more polypeptides comprises inhibiting the expression or activity of said polypeptide.
[0057] In another embodiment of the tenth aspect, modulating the expression or activity of one or more polypeptides comprises enhancing the expression or activity of said polypeptide.
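The outcome of such a screening assay may be summarised by comparing polypeptide activity with and without the candidate compound. The following is a hedged sketch only: the activity units, the 10% tolerance band and the function name are hypothetical assumptions chosen for illustration, and the specification does not prescribe any particular threshold.

```python
from typing import Tuple


def classify_modulation(
    control_activity: float,
    treated_activity: float,
    tolerance_pct: float = 10.0,
) -> Tuple[str, float]:
    """Classify a candidate compound from paired activity measurements.

    Returns a label and the percent change relative to the untreated
    control. Changes smaller than tolerance_pct (an assumed cut-off) in
    either direction are reported as "no effect".
    """
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    change = 100.0 * (treated_activity - control_activity) / control_activity
    if change <= -tolerance_pct:
        return "inhibitor", change
    if change >= tolerance_pct:
        return "enhancer", change
    return "no effect", change


# Hypothetical activity readings in arbitrary units.
print(classify_modulation(100.0, 40.0))
print(classify_modulation(200.0, 300.0))
print(classify_modulation(100.0, 105.0))
```

A real assay would use replicate measurements and a statistical test rather than a fixed percentage cut-off.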
[0058] In an eleventh aspect, there is provided an isolated polynucleotide comprising a sequence according to SEQ ID NO: 80 or a variant or fragment thereof.
[0059] In one embodiment of the eleventh aspect, the fragment comprises a sequence selected from the group consisting of SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 109, and variants and fragments thereof.
[0060] In a twelfth aspect, there is provided a ribonucleic acid or complementary DNA encoded by a sequence according to the eleventh aspect.
[0061] In a thirteenth aspect, there is provided an isolated polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 82, SEQ ID NO: 84, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 94, SEQ ID NO: 96, SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 102, SEQ ID NO: 104, SEQ ID NO: 106, SEQ ID NO: 108, SEQ ID NO: 110, and variants and fragments thereof.
[0062] In one embodiment, there is provided a probe or primer that hybridises specifically with one or more of a polynucleotide according to the eleventh aspect, a ribonucleic acid or complementary DNA according to the twelfth aspect, or a polypeptide according to the thirteenth aspect.
[0063] In another embodiment, there is provided a vector comprising a polynucleotide according to the eleventh aspect, or a ribonucleic acid or complementary DNA according to the twelfth aspect. The vector may be an expression vector. In one embodiment, a host cell is provided comprising the vector.
[0064] In another embodiment, there is provided an isolated antibody capable of binding specifically to a polypeptide according to the thirteenth aspect.
[0065] In a fourteenth aspect, there is provided a method for the detection of cyanobacteria, the method comprising the steps of obtaining a sample for use in the method and analyzing the sample for the presence of one or more of:
[0066] (i) a polynucleotide comprising a sequence according to the eleventh aspect,
[0067] (ii) a ribonucleic acid or complementary DNA according to the twelfth aspect,
[0068] (iii) a polypeptide comprising a sequence according to the thirteenth aspect, wherein said presence is indicative of cyanobacteria in the sample.
[0069] In a fifteenth aspect, there is provided a method for detecting a cyanotoxic organism, the method comprising the steps of obtaining a sample for use in the method and analyzing the sample for the presence of one or more of:
[0070] (i) a polynucleotide comprising a sequence according to SEQ ID NO: 95 or a variant or fragment thereof,
[0071] (ii) a ribonucleic acid or complementary DNA encoded by a sequence according to (i),
[0072] (iii) a polypeptide comprising a sequence according to SEQ ID NO: 96, or a variant or fragment thereof, wherein said presence is indicative of a cyanotoxic organism in the sample.
[0073] In one embodiment of the fifteenth aspect, the cyanotoxic organism is a cyanobacterium.
[0074] In one embodiment of the fourteenth and fifteenth aspects, analyzing the sample comprises amplification of DNA from the sample by polymerase chain reaction and detecting the amplified sequences. The polymerase chain reaction may utilise one or more primers comprising a sequence selected from the group consisting of SEQ ID NO: 111, SEQ ID NO: 112 and variants and fragments thereof.
[0075] In another embodiment of the fourteenth and fifteenth aspects, the method comprises further analyzing the sample for the presence of one or more of:
[0076] (i) a polynucleotide comprising a sequence selected from the group consisting of: SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof,
[0077] (ii) a ribonucleic acid or complementary DNA encoded by a sequence according to (i),
[0078] (iii) a polypeptide comprising a sequence selected from the group consisting of: SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31, SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 49, SEQ ID NO: 51, SEQ ID NO: 53, SEQ ID NO: 55, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 63, SEQ ID NO: 65, SEQ ID NO: 67, SEQ ID NO: 69, and variants and fragments thereof.
[0079] The further analysis of the sample may comprise amplification of DNA from the sample by polymerase chain reaction. The polymerase chain reaction may utilise one or more primers comprising a sequence selected from the group consisting of SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID NO: 72, SEQ ID NO: 73, SEQ ID NO: 74, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID NO: 113, SEQ ID NO: 114, SEQ ID NO: 115, SEQ ID NO: 116, SEQ ID NO: 117, SEQ ID NO: 118, SEQ ID NO: 119, SEQ ID NO: 120, SEQ ID NO: 121, SEQ ID NO: 122, SEQ ID NO: 123, SEQ ID NO: 124, SEQ ID NO: 125, SEQ ID NO: 126, SEQ ID NO: 127, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 132, SEQ ID NO: 133, SEQ ID NO: 134, and variants and fragments thereof.
[0080] In a sixteenth aspect, there is provided a method for detecting a cylindrospermopsin-producing organism, the method comprising the steps of obtaining a sample for use in the method and analyzing the sample for the presence of one or both of: [0081] (i) a polynucleotide comprising a sequence according to SEQ ID NO: 95 or a variant or fragments thereof, [0082] (ii) a ribonucleic acid or complementary DNA encoded by a sequence according to (i), [0083] (iii) a polypeptide comprising a sequence according to SEQ ID NO: 96, or a variant or fragments thereof, wherein said presence is indicative of a cylindrospermopsin-producing organism in the sample.
[0084] In one embodiment of the sixteenth aspect, the cyanotoxic organism is a cyanobacterium. In another embodiment of the sixteenth aspect, analyzing the sample comprises amplification of DNA from the sample by polymerase chain reaction and detecting the amplified sequences. The polymerase chain reaction may utilise one or more primers comprising a sequence selected from the group consisting of SEQ ID NO: 111, SEQ ID NO: 112 and variants and fragments thereof.
[0085] In one embodiment of the fourteenth, fifteenth, and sixteenth aspects, the detection comprises one or both of gel electrophoresis and nucleic acid sequencing. The sample may comprise one or more isolated or cultured organisms. The sample may be an environmental sample. The environmental sample may be derived from salt water, fresh water or a blue-green algal bloom.
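As a rough illustration of the amplification step described above, the following Python sketch locates a forward primer and the binding site of a reverse primer on a template strand and returns the product that the primer pair would be expected to bound. The template and primer sequences are invented placeholders and are not any of SEQ ID NOs: 70-79 or 111-134; the function names are likewise hypothetical. Exact matching is assumed for simplicity, whereas real primer annealing tolerates some mismatches.

```python
# Complement table for upper-case DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def predict_amplicon(template: str, forward: str, reverse: str):
    """Return the product bounded by exact primer matches, or None.

    The forward primer must match the given strand. The reverse primer
    anneals to the given strand, so its reverse complement must appear
    on the given strand downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    site = reverse_complement(reverse)
    hit = template.find(site, start + len(forward))
    if hit == -1:
        return None
    return template[start:hit + len(site)]


# Hypothetical sequences, for illustration only.
TEMPLATE = "TTGACCGATCGGATTACAGCTAGGCTTAACGGTACCATGCAAGT"
FORWARD = "GACCGATCGG"
REVERSE = "GCATGGTACC"

amplicon = predict_amplicon(TEMPLATE, FORWARD, REVERSE)
print(len(amplicon) if amplicon else "no product predicted")
```

A returned product corresponds to the band that would be sought by gel electrophoresis; a result of None corresponds to the absence of an amplified fragment under the exact-match assumption.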
[0086] In a seventeenth aspect, there is provided a kit for the detection of cyanobacteria, the kit comprising at least one agent for detecting the presence of one or more of: [0087] (i) a polynucleotide comprising a sequence according to the eleventh aspect, [0088] (ii) a ribonucleic acid or complementary DNA according to the twelfth aspect, [0089] (iii) a polypeptide comprising a sequence according to the thirteenth aspect, wherein said presence is indicative of cyanobacteria in the sample.
[0090] In an eighteenth aspect, there is provided a kit for the detection of cyanotoxic organisms, the kit comprising at least one agent for detecting the presence of one or more of: [0091] (i) a polynucleotide comprising a sequence according to SEQ ID NO: 95 or a variant or fragment thereof, [0092] (ii) a ribonucleic acid or complementary DNA encoded by a sequence according to (i), [0093] (iii) a polypeptide comprising a sequence according to SEQ ID NO: 96, or a variant or fragment thereof, wherein said presence is indicative of cyanotoxic organisms in the sample.
[0094] In one embodiment of the seventeenth and eighteenth aspects, the at least one agent is a primer, antibody or probe. The primer or probe may comprise a sequence selected from the group consisting of SEQ ID NO: 111, SEQ ID NO: 112 and variants and fragments thereof.
[0095] In another embodiment of the seventeenth and eighteenth aspects, the kit may further comprise at least one additional agent for detecting the presence of one or more nucleotide sequences selected from the group consisting of: [0096] (i) a polynucleotide comprising a sequence selected from the group consisting of: SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof, [0097] (ii) a ribonucleic acid or complementary DNA encoded by a sequence according to (i), [0098] (iii) a polypeptide comprising a sequence selected from the group consisting of: SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31, SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 49, SEQ ID NO: 51, SEQ ID NO: 53, SEQ ID NO: 55, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 63, SEQ ID NO: 65, SEQ ID NO: 67, SEQ ID NO: 69, and variants and fragments thereof.
[0099] The at least one additional agent may be a primer, antibody or probe. The primer or probe may comprise a sequence selected from the group consisting of SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID NO: 72, SEQ ID NO: 73, SEQ ID NO: 74, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID NO: 113, SEQ ID NO: 114, SEQ ID NO: 115, SEQ ID NO: 116, SEQ ID NO: 117, SEQ ID NO: 118, SEQ ID NO: 119, SEQ ID NO: 120, SEQ ID NO: 121, SEQ ID NO: 122, SEQ ID NO: 123, SEQ ID NO: 124, SEQ ID NO: 125, SEQ ID NO: 126, SEQ ID NO: 127, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 132, SEQ ID NO: 133, SEQ ID NO: 134, and variants and fragments thereof.
[0100] In a nineteenth aspect, there is provided a kit for the detection of cylindrospermopsin-producing organisms, the kit comprising at least one agent for detecting the presence of one or more of: [0101] (i) a polynucleotide comprising a sequence according to SEQ ID NO: 95 or a variant or fragment thereof, [0102] (ii) a ribonucleic acid or complementary DNA encoded by a sequence according to (i), [0103] (iii) a polypeptide comprising a sequence according to SEQ ID NO: 96, or a variant or fragment thereof, wherein said presence is indicative of a cylindrospermopsin-producing organism in the sample.
[0104] In a twentieth aspect, there is provided a method of screening for a compound that modulates the expression or activity of one or more polypeptides according to the thirteenth aspect, the method comprising contacting the polypeptide with a candidate compound under conditions suitable to enable interaction of the candidate compound and the polypeptide, and assaying for activity of the polypeptide.
[0105] In one embodiment of the twentieth aspect, modulating the expression or activity of one or more polypeptides comprises inhibiting the expression or activity of said polypeptide.
[0106] In another embodiment of the twentieth aspect, modulating the expression or activity of one or more polypeptides comprises enhancing the expression or activity of said polypeptide.
DEFINITIONS
[0107] As used in this application, the singular forms "a", "an" and "the" include plural references unless the context clearly dictates otherwise. For example, the term "a stem cell" also includes a plurality of stem cells.
[0108] As used herein, the term "comprising" means "including." Variations of the word "comprising", such as "comprise" and "comprises," have correspondingly varied meanings. Thus, for example, a polynucleotide "comprising" a sequence encoding a protein may consist exclusively of that sequence or may include one or more additional sequences.
[0109] As used herein, the terms "antibody" and "antibodies" include IgG (including IgG1, IgG2, IgG3, and IgG4), IgA (including IgA1 and IgA2), IgD, IgE, or IgM, and IgY, whole antibodies, including single-chain whole antibodies, and antigen-binding fragments thereof. Antigen-binding antibody fragments include, but are not limited to, Fab, Fab' and F(ab')2, Fd, single-chain Fvs (scFv), single-chain antibodies, disulfide-linked Fvs (sdFv) and fragments comprising either a VL or VH domain. The antibodies may be from any animal origin. Antigen-binding antibody fragments, including single-chain antibodies, may comprise the variable region(s) alone or in combination with all or part of the following: hinge region, CH1, CH2, and CH3 domains. Also included are any combinations of variable region(s) and hinge region, CH1, CH2, and CH3 domains. Antibodies may be monoclonal, polyclonal, chimeric, multispecific, humanized, and human monoclonal and polyclonal antibodies which specifically bind the biological molecule.
[0110] As used herein, the terms "polypeptide" and "protein" are used interchangeably and are taken to have the same meaning.
[0111] As used herein, the terms "nucleotide sequence" and "polynucleotide sequence" are used interchangeably and are taken to have the same meaning.
[0112] As used herein, the term "kit" refers to any delivery system for delivering materials. In the context of the detection assays described herein, such delivery systems include systems that allow for the storage, transport, or delivery of reaction reagents (for example labels, reference samples, supporting material, etc. in the appropriate containers) and/or supporting materials (for example, buffers, written instructions for performing the assay etc.) from one location to another. For example, kits include one or more enclosures, such as boxes, containing the relevant reaction reagents and/or supporting materials.
[0113] Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is solely for the purpose of providing a context for the present invention. It is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present invention before the priority date of this application.
[0114] For the purposes of description all documents referred to herein are incorporated by reference unless otherwise stated.
BRIEF DESCRIPTION OF THE DRAWINGS
[0115] A preferred embodiment of the present invention will now be described, by way of an example only, with reference to the accompanying drawings wherein:
[0116] FIG. 1A is a table showing the distribution of the sxt genes in toxic and non-toxic cyanobacteria. PSP, saxitoxin; CYLN, cylindrospermopsin; +, gene fragment amplified; -, no gene detected.
[0117] FIG. 1B is a table showing primer sequences used to amplify various SXT genes.
[0118] FIG. 2 is a table showing sxt genes from the saxitoxin gene cluster of C. raciborskii T3, their putative length, their BLAST similarity match with similar protein sequences from other organisms, and their predicted function.
[0119] FIG. 3 is a diagram showing the structural organisation of the sxt gene cluster from C. raciborskii T3. Abbreviations used are: IS4, insertion sequence 4; at, aminotransferase; dmt, drug metabolite transporter; ompR, transcriptional regulator of ompR family; penP, penicillin binding; smf, gene predicted to be involved in DNA uptake. The scale indicates the gene cluster length in base pairs.
[0120] FIG. 4 is a flow diagram showing the pathway for SXT biosynthesis and the putative functions of sxt genes.
[0121] FIG. 5 shows MS/MS spectra of selected ions from cellular extracts of Cylindrospermopsis raciborskii T3. The predicted fragmentation of ions and the corresponding m/z values are indicated. FIG. 5A, arginine (m/z 175); FIG. 5B, saxitoxin (m/z 300); FIG. 5C, intermediate A' (m/z 187); FIG. 5D, intermediate C' (m/z 211); FIG. 5E, intermediate E' (m/z 225).
[0122] FIG. 6 is a table showing the cyr genes from the cylindrospermopsin gene cluster of C. raciborskii AWT205, their putative length, their BLAST similarity match with similar protein sequences from other organisms, and their predicted function.
[0123] FIG. 7 is a table showing the distribution of the sulfotransferase gene (cyrJ) in toxic and non-toxic cyanobacteria. 16S rRNA gene amplification is shown as a positive control. CYLN, cylindrospermopsin; SXT, saxitoxin; N.D., not detected; +, gene fragment amplified; -, no gene detected; NA, not available; AWQC, Australian Water Quality Center.
[0124] FIG. 8 is a flow diagram showing the biosynthetic pathway of cylindrospermopsin biosynthesis.
[0125] FIG. 9 is a diagram showing the structural organization of the cylindrospermopsin gene cluster from C. raciborskii AWT205. Scale indicates gene cluster length in base pairs.
DESCRIPTION
[0126] The inventors have identified a gene cluster responsible for saxitoxin biosynthesis (the SXT gene cluster) and a gene cluster responsible for cylindrospermopsin biosynthesis (the CYR gene cluster). The full sequence of each gene cluster has been determined and functional activities assigned to each of the genes identified therein. Based on this information, the inventors have elucidated the full saxitoxin and cylindrospermopsin biosynthetic pathways.
[0127] Accordingly, the invention provides polynucleotide and polypeptide sequences derived from each of the SXT and CYR gene clusters and in particular, sequences relating to the specific genes within each pathway. Methods and kits for the detection of cyanobacterial strains in a sample are provided based on the presence (or absence) in the sample of one or more of the sequences of the invention. The inventors have determined that certain open-reading frames present in the SXT gene cluster of saxitoxin-producing microorganisms are absent in the SXT gene cluster of microorganisms that do not produce saxitoxin. Similarly, it has been discovered that one open-reading frame present in the CYR gene cluster of cylindrospermopsin-producing microorganisms is absent in non-cylindrospermopsin-producing microorganisms. Accordingly, the invention provides methods and kits for the detection of toxin-producing microorganisms.
[0128] Also provided by the invention are screening methods for the identification of compounds capable of modulating the expression or activity of proteins in the saxitoxin and/or cylindrospermopsin biosynthetic pathways.
Polynucleotides and Polypeptides
[0129] The inventors have determined the full polynucleotide sequence of the saxitoxin (SXT) gene cluster and the cylindrospermopsin (CYR) gene cluster.
[0130] In accordance with aspects and embodiments of the invention, the SXT gene cluster may have, but is not limited to, the polynucleotide sequence as set forth in SEQ ID NO: 1 (GenBank accession number DQ787200), or display sufficient sequence identity thereto to hybridise to the sequence of SEQ ID NO: 1.
[0131] The SXT gene cluster comprises 31 genes and 30 intergenic regions.
[0132] Gene 1 of the SXT gene cluster is a 759 base pair (bp) nucleotide sequence set forth in SEQ ID NO: 4. The nucleotide sequence of SXT Gene 1 ranges from the nucleotide in position 1625 up to the nucleotide in position 2383 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 1 (SXTD) is set forth in SEQ ID NO: 5.
[0133] Gene 2 of the SXT gene cluster is a 396 bp nucleotide sequence set forth in SEQ ID NO: 6. The nucleotide sequence of SXT Gene 2 ranges from the nucleotide in position 2621 up to the nucleotide in position 3016 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 2 (ORF3) is set forth in SEQ ID NO: 7.
[0134] Gene 3 of the SXT gene cluster is a 360 bp nucleotide sequence set forth in SEQ ID NO: 8. The nucleotide sequence of SXT Gene 3 ranges from the nucleotide in position 2955 up to the nucleotide in position 3314 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 3 (ORF4) is set forth in SEQ ID NO: 9.
[0135] Gene 4 of the SXT gene cluster is a 354 bp nucleotide sequence set forth in SEQ ID NO: 10. The nucleotide sequence of SXT Gene 4 ranges from the nucleotide in position 3647 up to the nucleotide in position 4000 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 4 (SXTC) is set forth in SEQ ID NO: 11.
[0136] Gene 5 of the SXT gene cluster is a 957 bp nucleotide sequence set forth in SEQ ID NO: 12. The nucleotide sequence of SXT Gene 5 ranges from the nucleotide in position 4030 up to the nucleotide in position 4986 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 5 (SXTB) is set forth in SEQ ID NO: 13.
[0137] Gene 6 of the SXT gene cluster is a 3738 bp nucleotide sequence set forth in SEQ ID NO: 14. The nucleotide sequence of SXT Gene 6 ranges from the nucleotide in position 5047 up to the nucleotide in position 8784 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 6 (SXTA) is set forth in SEQ ID NO: 15.
[0138] Gene 7 of the SXT gene cluster is a 387 bp nucleotide sequence set forth in SEQ ID NO: 16. The nucleotide sequence of SXT Gene 7 ranges from the nucleotide in position 9140 up to the nucleotide in position 9526 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 7 (SXTE) is set forth in SEQ ID NO: 17.
[0139] Gene 8 of the SXT gene cluster is a 1416 bp nucleotide sequence set forth in SEQ ID NO: 18. The nucleotide sequence of SXT Gene 8 ranges from the nucleotide in position 9686 up to the nucleotide in position 11101 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 8 (SXTF) is set forth in SEQ ID NO: 19.
[0140] Gene 9 of the SXT gene cluster is an 1134 bp nucleotide sequence set forth in SEQ ID NO: 20. The nucleotide sequence of SXT Gene 9 ranges from the nucleotide in position 11112 up to the nucleotide in position 12245 of SEQ ID NO: 1. The polypeptide sequence encoded by SXT Gene 9 (SXTG) is set forth in SEQ ID NO: 21.
[0141] Gene 10 of the SXT gene cluster is a 1005 bp nucleotide sequence set forth in SEQ ID NO: 22. The nucleotide sequence of SXT Gene 10 ranges from the nucleotide in position 12314 up to the nucleotide in position 13318 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 10 (SXTH) is set forth in SEQ ID NO: 23.
[0142] Gene 11 of the SXT gene cluster is an 1839 bp nucleotide sequence set forth in SEQ ID NO: 24. The nucleotide sequence of SXT Gene 11 ranges from the nucleotide in position 13476 up to the nucleotide in position 15314 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 11 (SXTI) is set forth in SEQ ID NO: 25.
[0143] Gene 12 of the SXT gene cluster is a 444 bp nucleotide sequence set forth in SEQ ID NO: 26. The nucleotide sequence of SXT Gene 12 ranges from the nucleotide in position 15318 up to the nucleotide in position 15761 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 12 (SXTJ) is set forth in SEQ ID NO: 27.
[0144] Gene 13 of the SXT gene cluster is a 165 bp nucleotide sequence set forth in SEQ ID NO: 28. The nucleotide sequence of SXT Gene 13 ranges from the nucleotide in position 15761 up to the nucleotide in position 15925 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 13 (SXTK) is set forth in SEQ ID NO: 29.
[0145] Gene 14 of the SXT gene cluster is a 1299 bp nucleotide sequence set forth in SEQ ID NO: 30. The nucleotide sequence of SXT Gene 14 ranges from the nucleotide in position 15937 up to the nucleotide in position 17235 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 14 (SXTL) is set forth in SEQ ID NO: 31.
[0146] Gene 15 of the SXT gene cluster is a 1449 bp nucleotide sequence set forth in SEQ ID NO: 32. The nucleotide sequence of SXT Gene 15 ranges from the nucleotide in position 17323 up to the nucleotide in position 18771 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 15 (SXTM) is set forth in SEQ ID NO: 33.
[0147] Gene 16 of the SXT gene cluster is an 831 bp nucleotide sequence set forth in SEQ ID NO: 34. The nucleotide sequence of SXT Gene 16 ranges from the nucleotide in position 19119 up to the nucleotide in position 19949 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 16 (SXTN) is set forth in SEQ ID NO: 35.
[0148] Gene 17 of the SXT gene cluster is a 774 bp nucleotide sequence set forth in SEQ ID NO: 36. The nucleotide sequence of SXT Gene 17 ranges from the nucleotide in position 20238 up to the nucleotide in position 21011 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 17 (SXTX) is set forth in SEQ ID NO: 37.
[0149] Gene 18 of the SXT gene cluster is a 327 bp nucleotide sequence set forth in SEQ ID NO: 38. The nucleotide sequence of SXT Gene 18 ranges from the nucleotide in position 21175 up to the nucleotide in position 21501 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 18 (SXTW) is set forth in SEQ ID NO: 39.
[0150] Gene 19 of the SXT gene cluster is a 1653 bp nucleotide sequence set forth in SEQ ID NO: 40. The nucleotide sequence of SXT Gene 19 ranges from the nucleotide in position 21542 up to the nucleotide in position 23194 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 19 (SXTV) is set forth in SEQ ID NO: 41.
[0151] Gene 20 of the SXT gene cluster is a 750 bp nucleotide sequence set forth in SEQ ID NO: 42. The nucleotide sequence of SXT Gene 20 ranges from the nucleotide in position 23199 up to the nucleotide in position 23948 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 20 (SXTU) is set forth in SEQ ID NO: 43.
[0152] Gene 21 of the SXT gene cluster is a 1005 bp nucleotide sequence set forth in SEQ ID NO: 44. The nucleotide sequence of SXT Gene 21 ranges from the nucleotide in position 24091 up to the nucleotide in position 25095 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 21 (SXTT) is set forth in SEQ ID NO: 45.
[0153] Gene 22 of the SXT gene cluster is a 726 bp nucleotide sequence set forth in SEQ ID NO: 46. The nucleotide sequence of SXT Gene 22 ranges from the nucleotide in position 25173 up to the nucleotide in position 25898 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 22 (SXTS) is set forth in SEQ ID NO: 47.
[0154] Gene 23 of the SXT gene cluster is a 576 bp nucleotide sequence set forth in SEQ ID NO: 48. The nucleotide sequence of SXT Gene 23 ranges from the nucleotide in position 25974 up to the nucleotide in position 26549 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 23 (ORF24) is set forth in SEQ ID NO: 49.
[0155] Gene 24 of the SXT gene cluster is a 777 bp nucleotide sequence set forth in SEQ ID NO: 50. The nucleotide sequence of SXT Gene 24 ranges from the nucleotide in position 26605 up to the nucleotide in position 27381 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 24 (SXTR) is set forth in SEQ ID NO: 51.
[0156] Gene 25 of the SXT gene cluster is a 777 bp nucleotide sequence set forth in SEQ ID NO: 52. The nucleotide sequence of SXT Gene 25 ranges from the nucleotide in position 27392 up to the nucleotide in position 28168 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 25 (SXTQ) is set forth in SEQ ID NO: 53.
[0157] Gene 26 of the SXT gene cluster is a 1227 bp nucleotide sequence set forth in SEQ ID NO: 54. The nucleotide sequence of SXT Gene 26 ranges from the nucleotide in position 28281 up to the nucleotide in position 29507 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 26 (SXTP) is set forth in SEQ ID NO: 55.
[0158] Gene 27 of the SXT gene cluster is a 603 bp nucleotide sequence set forth in SEQ ID NO: 56. The nucleotide sequence of SXT Gene 27 ranges from the nucleotide in position 29667 up to the nucleotide in position 30269 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 27 (SXTO) is set forth in SEQ ID NO: 57.
[0159] Gene 28 of the SXT gene cluster is a 1350 bp nucleotide sequence set forth in SEQ ID NO: 58. The nucleotide sequence of SXT Gene 28 ranges from the nucleotide in position 30612 up to the nucleotide in position 31961 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 28 (ORF29) is set forth in SEQ ID NO: 59.
[0160] Gene 29 of the SXT gene cluster is a 666 bp nucleotide sequence set forth in SEQ ID NO: 60. The nucleotide sequence of SXT Gene 29 ranges from the nucleotide in position 32612 up to the nucleotide in position 33277 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 29 (SXTY) is set forth in SEQ ID NO: 61.
[0161] Gene 30 of the SXT gene cluster is a 1353 bp nucleotide sequence set forth in SEQ ID NO: 62. The nucleotide sequence of SXT Gene 30 ranges from the nucleotide in position 33325 up to the nucleotide in position 34677 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 30 (SXTZ) is set forth in SEQ ID NO: 63.
[0162] Gene 31 of the SXT gene cluster is an 819 bp nucleotide sequence set forth in SEQ ID NO: 64. The nucleotide sequence of SXT Gene 31 ranges from the nucleotide in position 35029 up to the nucleotide in position 35847 of SEQ ID NO: 1. The polypeptide sequence encoded by Gene 31 (OMPR) is set forth in SEQ ID NO: 65.
[0163] The 5' border region of the SXT gene cluster comprises a 1320 bp gene (orf1), the sequence of which is set forth in SEQ ID NO: 2. The nucleotide sequence of orf1 ranges from the nucleotide in position 1 up to the nucleotide in position 1320 of SEQ ID NO: 1. The polypeptide sequence encoded by orf1 is set forth in SEQ ID NO: 3.
[0164] The 3' border region of SXT gene cluster comprises a 774 bp gene (hisA), the sequence of which is set forth in SEQ ID NO: 66. The nucleotide sequence of hisA ranges from the nucleotide in position 35972 up to the nucleotide in position 36745 of SEQ ID NO: 1. The polypeptide sequence encoded by hisA is set forth in SEQ ID NO: 67.
[0165] The 3' border region of SXT gene cluster also comprises a 396 bp gene (orfA), the sequence of which is set forth in SEQ ID NO: 68. The nucleotide sequence of orfA ranges from the nucleotide in position 37060 up to the nucleotide in position 37455 of SEQ ID NO: 1. The polypeptide sequence encoded by orfA is set forth in SEQ ID NO: 69.
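The positions quoted above are 1-based and inclusive, so each stated length equals the end position minus the start position plus one. The following Python sketch checks a handful of the quoted ranges against their stated lengths and shows how such a range maps onto a 0-based string slice. The function names and the placeholder sequence are hypothetical; the actual sequence of SEQ ID NO: 1 is not reproduced here, and the placeholder length is simply the last position quoted above.

```python
# (start, end, stated length in bp), copied from the paragraphs above.
SXT_RANGES = {
    "Gene 1": (1625, 2383, 759),
    "Gene 6": (5047, 8784, 3738),
    "Gene 9": (11112, 12245, 1134),
    "Gene 31": (35029, 35847, 819),
    "hisA": (35972, 36745, 774),
    "orfA": (37060, 37455, 396),
}


def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def extract_range(cluster: str, start: int, end: int) -> str:
    """Slice a 1-based, inclusive range out of a 0-based Python string."""
    return cluster[start - 1:end]


# Names whose stated length disagrees with their coordinates (expected: none).
mismatches = [
    name
    for name, (start, end, stated) in SXT_RANGES.items()
    if span_length(start, end) != stated
]

# Placeholder standing in for SEQ ID NO: 1 (assumption: filler bases only).
placeholder_cluster = "N" * 37455
start, end, _ = SXT_RANGES["Gene 1"]
gene1 = extract_range(placeholder_cluster, start, end)
print(mismatches, len(gene1))
```

The same check can be applied to any of the remaining ranges by adding entries to the dictionary.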
[0166] In accordance with other aspects and embodiments of the invention, the CYR gene cluster may have, but is not limited to, the nucleotide sequence as set forth in SEQ ID NO: 80 (GenBank accession number EU140798), or display sufficient sequence identity thereto to hybridise to the sequence of SEQ ID NO: 80.
[0167] The CYR gene cluster comprises 15 genes and 14 intergenic regions.
[0168] Gene 1 of the CYR gene cluster is a 5631 bp nucleotide sequence set forth in SEQ ID NO: 81. The nucleotide sequence of CYR Gene 1 ranges from the nucleotide in position 444 up to the nucleotide in position 6074 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 1 (CYRD) is set forth in SEQ ID NO: 82.
[0169] Gene 2 of the CYR gene cluster is a 4074 bp nucleotide sequence set forth in SEQ ID NO: 83. The nucleotide sequence of CYR Gene 2 ranges from the nucleotide in position 6130 up to the nucleotide in position 10203 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 2 (CYRF) is set forth in SEQ ID NO: 84.
[0170] Gene 3 of the CYR gene cluster is a 1437 bp nucleotide sequence set forth in SEQ ID NO: 85. The nucleotide sequence of CYR Gene 3 ranges from the nucleotide in position 10251 up to the nucleotide in position 11687 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 3 (CYRG) is set forth in SEQ ID NO: 86.
[0171] Gene 4 of the CYR gene cluster is an 831 bp nucleotide sequence set forth in SEQ ID NO: 87. The nucleotide sequence of CYR Gene 4 ranges from the nucleotide in position 11741 up to the nucleotide in position 12571 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 4 (CYRI) is set forth in SEQ ID NO: 88.
[0172] Gene 5 of the CYR gene cluster is a 1398 bp nucleotide sequence set forth in SEQ ID NO: 89. The nucleotide sequence of CYR Gene 5 ranges from the nucleotide in position 12568 up to the nucleotide in position 13965 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 5 (CYRK) is set forth in SEQ ID NO: 90.
[0173] Gene 6 of the CYR gene cluster is a 750 bp nucleotide sequence set forth in SEQ ID NO: 91. The nucleotide sequence of CYR Gene 6 ranges from the nucleotide in position 14037 up to the nucleotide in position 14786 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 6 (CYRL) is set forth in SEQ ID NO: 92.
[0174] Gene 7 of the CYR gene cluster is a 1431 bp nucleotide sequence set forth in SEQ ID NO: 93. The nucleotide sequence of CYR Gene 7 ranges from the nucleotide in position 14886 up to the nucleotide in position 16316 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 7 (CYRH) is set forth in SEQ ID NO: 94.
[0175] Gene 8 of the CYR gene cluster is a 780 bp nucleotide sequence set forth in SEQ ID NO: 95. The nucleotide sequence of CYR Gene 8 ranges from the nucleotide in position 16893 up to the nucleotide in position 17672 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 8 (CYRJ) is set forth in SEQ ID NO: 96.
[0176] Gene 9 of the CYR gene cluster is an 1176 bp nucleotide sequence set forth in SEQ ID NO: 97. The nucleotide sequence of CYR Gene 9 ranges from the nucleotide in position 18113 up to the nucleotide in position 19288 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 9 (CYRA) is set forth in SEQ ID NO: 98.
[0177] Gene 10 of the CYR gene cluster is an 8754 bp nucleotide sequence set forth in SEQ ID NO: 99. The nucleotide sequence of CYR Gene 10 ranges from the nucleotide in position 19303 up to the nucleotide in position 28056 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 10 (CYRB) is set forth in SEQ ID NO: 100.
[0178] Gene 11 of the CYR gene cluster is a 5667 bp nucleotide sequence set forth in SEQ ID NO: 101. The nucleotide sequence of CYR Gene 11 ranges from the nucleotide in position 28061 up to the nucleotide in position 33727 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 11 (CYRE) is set forth in SEQ ID NO: 102.
[0179] Gene 12 of the CYR gene cluster is a 5004 bp nucleotide sequence set forth in SEQ ID NO: 103. The nucleotide sequence of CYR Gene 12 ranges from the nucleotide in position 34299 up to the nucleotide in position 39302 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 12 (CYRC) is set forth in SEQ ID NO: 104.
[0180] Gene 13 of the CYR gene cluster is a 318 bp nucleotide sequence set forth in SEQ ID NO: 105. The nucleotide sequence of CYR Gene 13 ranges from the nucleotide in position 39366 up to the nucleotide in position 39683 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 13 (CYRM) is set forth in SEQ ID NO: 106.
[0181] Gene 14 of the CYR gene cluster is a 600 bp nucleotide sequence set forth in SEQ ID NO: 107. The nucleotide sequence of CYR Gene 14 ranges from the nucleotide in position 39793 up to the nucleotide in position 40392 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 14 (CYRN) is set forth in SEQ ID NO: 108.
[0182] Gene 15 of the CYR gene cluster is a 1548 bp nucleotide sequence set forth in SEQ ID NO: 109. The nucleotide sequence of CYR Gene 15 ranges from the nucleotide in position 40501 up to the nucleotide in position 42048 of SEQ ID NO: 80. The polypeptide sequence encoded by Gene 15 (CYRO) is set forth in SEQ ID NO: 110.
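The positions quoted for consecutive genes also determine the spacing between them. As a minimal sketch, the spacing between two neighbouring genes can be taken as the start of the downstream gene minus the end of the upstream gene minus one, with a negative value meaning that the two quoted ranges overlap. The example below uses the positions quoted above for CYR Genes 3 to 6; the function name is hypothetical.

```python
# (name, start, end) on SEQ ID NO: 80, copied from the paragraphs above.
CYR_RANGES = [
    ("Gene 3", 10251, 11687),
    ("Gene 4", 11741, 12571),
    ("Gene 5", 12568, 13965),
    ("Gene 6", 14037, 14786),
]


def neighbour_spacing(ranges):
    """Return (upstream, downstream, spacing in bp) for adjacent ranges.

    Coordinates are 1-based and inclusive. A negative spacing means the
    two ranges overlap by that many positions.
    """
    ordered = sorted(ranges, key=lambda item: item[1])
    result = []
    for (up_name, _, up_end), (down_name, down_start, _) in zip(ordered, ordered[1:]):
        result.append((up_name, down_name, down_start - up_end - 1))
    return result


spacing = neighbour_spacing(CYR_RANGES)
for upstream, downstream, bp in spacing:
    print(f"{upstream} -> {downstream}: {bp} bp")
```

On the quoted positions, the ranges given for Gene 4 and Gene 5 share four positions (12568 to 12571), which the sketch reports as a spacing of -4.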
[0183] In general, the nucleic acids and polypeptides of the invention are of an isolated or purified form.
[0184] In addition to the SXT and CYR polynucleotides and polypeptide sequences set forth herein, also included within the scope of the present invention are variants and fragments thereof.
[0185] SXT and CYR polynucleotides disclosed herein may be deoxyribonucleic acids (DNA), ribonucleic acids (RNA) or complementary deoxyribonucleic acids (cDNA).
[0186] RNA may be derived from RNA polymerase-catalyzed transcription of a DNA sequence. The RNA may be a primary transcript derived from transcription of a corresponding DNA sequence. RNA may also undergo post-transcriptional processing. For example, a primary RNA transcript may undergo post-transcriptional processing to form a mature RNA. Messenger RNA (mRNA) refers to RNA derived from a corresponding open reading frame that may be translated into protein by the cell. cDNA refers to a double-stranded DNA that is complementary to and derived from mRNA. Sense RNA refers to an RNA transcript that includes the mRNA and so can be translated into protein by the cell. Antisense RNA refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and may be used to block the expression of a target gene.
[0187] The skilled addressee will recognise that RNA and cDNA sequences encoded by the SXT and CYR DNA sequences disclosed herein may be derived using the genetic code. An RNA sequence may be derived from a given DNA sequence by generating a sequence that is complementary to the particular DNA sequence. The complementary sequence may be generated by converting each cytosine ('C') base in the DNA sequence to a guanine ('G') base, each guanine ('G') base in the DNA sequence to a cytosine ('C') base, each thymidine ('T') base in the DNA sequence to an adenine ('A') base, and each adenine ('A') base in the DNA sequence to a uracil ('U') base.
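The base-by-base conversion described above can be sketched in Python with a translation table. The sketch follows the paragraph literally: it emits the complementary base at each position in the same order as the input and does not reverse the strand, so the output reads in the opposite 5'-to-3' sense to the input. The function name and the example sequence are hypothetical.

```python
# DNA base -> complementary RNA base, as listed in the paragraph above.
DNA_TO_RNA = str.maketrans({"C": "G", "G": "C", "T": "A", "A": "U"})


def dna_to_complementary_rna(dna: str) -> str:
    """Return the RNA sequence complementary to a DNA sequence.

    Only the four unambiguous bases are accepted; anything else raises
    ValueError rather than being passed through silently.
    """
    dna = dna.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return dna.translate(DNA_TO_RNA)


# Invented example sequence, for illustration only.
rna = dna_to_complementary_rna("ATGCCGTA")
print(rna)
```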
[0188] A complementary DNA (cDNA) sequence may be derived from a DNA sequence by deriving an RNA sequence from the DNA sequence as above, then converting the RNA sequence into a cDNA sequence. An RNA sequence can be converted into a cDNA sequence by converting each cytosine ('C') base in the RNA sequence to a guanine ('G') base, each guanine ('G') base in the RNA sequence to a cytosine ('C') base, each uracil ('U') base in the RNA sequence to an adenine ('A') base, and each adenine ('A') base in the RNA sequence to a thymidine ('T') base.
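Applying the two conversions in succession can be sketched as follows. Because each step replaces every base with its complement, a cDNA derived in this way has the same base sequence as the DNA strand from which the RNA was derived. The function names and the example sequence are hypothetical and serve only to illustrate the two mappings.

```python
# DNA base -> complementary RNA base.
DNA_TO_RNA = str.maketrans("CGTA", "GCAU")
# RNA base -> complementary DNA base.
RNA_TO_CDNA = str.maketrans("CGUA", "GCAT")


def dna_to_rna(dna: str) -> str:
    """Derive the complementary RNA sequence from a DNA sequence."""
    return dna.upper().translate(DNA_TO_RNA)


def rna_to_cdna(rna: str) -> str:
    """Derive the complementary DNA (cDNA) sequence from an RNA sequence."""
    return rna.upper().translate(RNA_TO_CDNA)


# Invented example sequence, for illustration only.
dna = "ATGGCTTACGGA"
rna = dna_to_rna(dna)
cdna = rna_to_cdna(rna)
print(rna, cdna, cdna == dna)
```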
[0189] The term "variant" as used herein refers to a substantially similar sequence. In general, two sequences are "substantially similar" if the two sequences have a specified percentage of amino acid residues or nucleotides that are the same (percentage of "sequence identity"), over a specified region, or, when not specified, over the entire sequence. Accordingly, a "variant" of a polynucleotide or polypeptide sequence disclosed herein may share at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 83%, 85%, 88%, 90%, 93%, 95%, 96%, 97%, 98% or 99% sequence identity with the reference sequence.
[0190] In general, polypeptide sequence variants possess qualitative biological activity in common. Polynucleotide sequence variants generally encode polypeptides which generally possess qualitative biological activity in common. Also included within the meaning of the term "variant" are homologues of polynucleotides and polypeptides of the invention. A polynucleotide homologue is typically from a different bacterial species but sharing substantially the same biological function or activity as the corresponding polynucleotide disclosed herein. A polypeptide homologue is typically from a different bacterial species but sharing substantially the same biological function or activity as the corresponding polypeptide disclosed herein. For example, homologues of the polynucleotides and polypeptides disclosed herein include, but are not limited to those from different species of cyanobacteria.
[0191] Further, the term "variant" also includes analogues of the polypeptides of the invention. A polypeptide "analogue" is a polypeptide which is a derivative of a polypeptide of the invention, which derivative comprises addition, deletion, or substitution of one or more amino acids, such that the polypeptide retains substantially the same function. The term "conservative amino acid substitution" refers to a substitution or replacement of one amino acid for another amino acid with similar properties within a polypeptide chain (primary sequence of a protein). For example, the substitution of the charged amino acid glutamic acid (Glu) for the similarly charged amino acid aspartic acid (Asp) would be a conservative amino acid substitution.
[0192] In general, the percentage of sequence identity between two sequences may be determined by comparing two optimally aligned sequences over a comparison window.
[0193] The portion of the sequence in the comparison window may, for example, comprise deletions or additions (i.e. gaps) in comparison to the reference sequence (for example, a polynucleotide or polypeptide sequence disclosed herein), which does not comprise deletions or additions, in order to align the two sequences optimally. A percentage of sequence identity may then be calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100 to yield the percentage of sequence identity.
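By way of non-limiting illustration only, the calculation described above may be sketched in Python as follows. The sketch assumes that the two sequences have already been optimally aligned and that gaps are written as `-`. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of positions in the comparison window at which the
    identical base or residue occurs in both aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # A gap character never counts as a matched position.
    matched = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matched / len(aligned_a)


# Hypothetical aligned pair: 7 matched positions in a window of 9.
print(percent_identity("ACGT-ACGT", "ACGTTACCT"))
```

The result may then be compared against whichever identity threshold is specified for the embodiment in question.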
[0194] In the context of two or more nucleic acid or polypeptide sequences, the percentage of sequence identity refers to the specified percentage of amino acid residues or nucleotides that are the same over a specified region, (or, when not specified, over the entire sequence), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection.
[0195] For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters. Methods of alignment of sequences for comparison are well known in the art. Optimal alignment of sequences for comparison can be determined conventionally using known computer programs, including, but not limited to: CLUSTAL in the PC/Gene program (available from Intelligenetics, Mountain View, Calif.); the ALIGN program (Version 2.0) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the GCG Wisconsin Genetics Software Package, Version 10 (available from Accelrys Inc., 9685 Scranton Road, San Diego, Calif., USA).
[0196] The BESTFIT program (Wisconsin Sequence Analysis Package, for Unix, Genetics Computer Group, University Research Park, 575 Science Drive, Madison, Wis. 53711) uses the local homology algorithm of Smith and Waterman to find the best segment of homology between two sequences (Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981)). When using BESTFIT or any other sequence alignment program to determine the degree of homology between sequences, the parameters may be set such that the percentage of identity is calculated over the full length of the reference nucleotide sequence and that gaps in homology of up to 5% of the total number of nucleotides in the reference sequence are allowed.
[0197] GAP uses the algorithm described in Needleman and Wunsch (1970) J. Mol. Biol. 48:443-453, to find the alignment of two complete sequences that maximizes the number of matches and minimizes the number of gaps. GAP considers all possible alignments and gap positions and creates the alignment with the largest number of matched bases and the fewest gaps. It allows for the provision of a gap creation penalty and a gap extension penalty in units of matched bases. GAP presents one member of the family of best alignments.
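By way of non-limiting illustration only, a global alignment score of the Needleman and Wunsch type may be sketched in Python as follows. This is not the GAP program itself. For brevity the sketch assumes a single linear gap penalty rather than separate gap creation and gap extension penalties, and the scoring values shown are hypothetical.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b
    under a simple linear gap penalty (an assumption of this sketch)."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = prev[j] + gap      # a[i-1] aligned against a gap
            gap_in_a = curr[j - 1] + gap  # b[j-1] aligned against a gap
            curr.append(max(diagonal, gap_in_b, gap_in_a))
        prev = curr
    return prev[-1]


print(needleman_wunsch_score("ACGT", "ACT"))
```

Only the score is returned. Recovering the alignment itself would require retaining the full score matrix and tracing back through it.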
[0198] Another method for determining the best overall match between a query sequence and a subject sequence, also referred to as a global sequence alignment, can be determined using the FASTDB computer program based on the algorithm of Brutlag and colleagues (Comp. App. Biosci. 6:237-245 (1990)). In a sequence alignment the query and subject sequences are both DNA sequences. An RNA sequence can be compared by converting U's to T's. The result of said global sequence alignment is in percent identity.
[0199] The BLAST and BLAST 2.0 algorithms may be used for determining percent sequence identity and sequence similarity. These are described in Altschul et al. (1990) J. Mol. Biol. 215:403-410, and Altschul et al. (1997) Nuc. Acids Res. 25:3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a comparison of both strands.
For amino acid sequences, the BLASTP program uses as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci USA 89:10915), alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands. The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably less than about 0.01, and most preferably less than about 0.001.
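By way of non-limiting illustration only, the word-seeding and extension steps described above may be sketched in Python for nucleotide sequences as follows. This is a simplified sketch and is not the BLAST software. It assumes exact word matches only (no neighborhood threshold T), ungapped extension in one direction only, and a hypothetical drop-off value X. The function names are likewise hypothetical.

```python
def find_word_hits(query, subject, w=11):
    """Return (query_pos, subject_pos) pairs where a word of length w
    in the query exactly matches a word in the subject."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    hits = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            hits.append((i, j))
    return hits


def extend_right(query, subject, i, j, m=5, n=-4, x=20):
    """Extend an ungapped alignment rightwards from (i, j).
    Returns (best cumulative score, length at which it was reached)."""
    score = best = best_len = k = 0
    while i + k < len(query) and j + k < len(subject):
        score += m if query[i + k] == subject[j + k] else n
        k += 1
        if score > best:
            best, best_len = score, k
        # Halt when the score reaches zero or below, or has fallen off
        # by at least x from its maximum achieved value.
        if score <= 0 or best - score >= x:
            break
    return best, best_len


hits = find_word_hits("ACGTACGTACGT", "TTACGTACGTACGTTT")
print(hits, extend_right("ACGTACGTACGT", "TTACGTACGTACGTTT", *hits[0]))
```

A fuller treatment would also extend leftwards from each seed and combine the two extensions into a single high scoring sequence pair.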
[0200] The invention also contemplates fragments of the polypeptides disclosed herein. A polypeptide "fragment" is a polypeptide molecule that encodes a constituent or is a constituent of a polypeptide of the invention or variant thereof. Typically the fragment possesses qualitative biological activity in common with the polypeptide of which it is a constituent. The peptide fragment may be between about 5 to about 3000 amino acids in length, between about 5 to about 2750 amino acids in length, between about 5 to about 2500 amino acids in length, between about 5 to about 2250 amino acids in length, between about 5 to about 2000 amino acids in length, between about 5 to about 1750 amino acids in length, between about 5 to about 1500 amino acids in length, between about 5 to about 1250 amino acids in length, between about 5 to about 1000 amino acids in length, between about 5 to about 900 amino acids in length, between about 5 to about 800 amino acids in length, between about 5 to about 700 amino acids in length, between about 5 to about 600 amino acids in length, between about 5 to about 500 amino acids in length, between about 5 to about 450 amino acids in length, between about 5 to about 400 amino acids in length, between about 5 to about 350 amino acids in length, between about 5 to about 300 amino acids in length, between about 5 to about 250 amino acids in length, between about 5 to about 200 amino acids in length, between about 5 to about 175 amino acids in length, between about 5 to about 150 amino acids in length, between about 5 to about 125 amino acids in length, between about 5 to about 100 amino acids in length, between about 5 to about 75 amino acids in length, between about 5 to about 50 amino acids in length, between about 5 to about 40 amino acids in length, between about 5 to about 30 amino acids in length, between about 5 to about 20 amino acids in length, and between about 5 to about 15 amino acids in length. 
Alternatively, the peptide fragment may be between about 5 to about 10 amino acids in length.
[0201] Also contemplated are fragments of the polynucleotides disclosed herein. A polynucleotide "fragment" is a polynucleotide molecule that encodes a constituent or is a constituent of a polynucleotide of the invention or variant thereof. Fragments of a polynucleotide do not necessarily need to encode polypeptides which retain biological activity. The fragment may, for example, be useful as a hybridization probe or PCR primer. The fragment may be derived from a polynucleotide of the invention or alternatively may be synthesized by some other means, for example by chemical synthesis.
[0202] Certain embodiments of the invention relate to fragments of SEQ ID NO: 1. A fragment of SEQ ID NO: 1 may comprise, for example, a constituent of SEQ ID NO: 1 in which the 5' gene border region gene orfl is absent. Alternatively, a fragment of SEQ ID NO: 1 may comprise, for example, a constituent of SEQ ID NO: 1 in which the 3' gene border region gene hisA is absent. Alternatively, a fragment of SEQ ID NO: 1 may comprise, for example, a constituent of SEQ ID NO: 1 in which the 3' gene border region gene orfA is absent. Alternatively, a fragment of SEQ ID NO: 1 may comprise, for example, a constituent of SEQ ID NO: 1 in which the 5' gene border region gene orfl is absent and the 3' border region gene orfA is absent. Alternatively, a fragment of SEQ ID NO: 1 may comprise, for example, a constituent of SEQ ID NO: 1 in which the 5' gene border region gene orfl is absent and the 3' border region genes hisA and orfA are absent.
[0203] In other embodiments, a fragment of SEQ ID NO: 1 may comprise one or more SXT open reading frames. The SXT open reading frame may be selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants thereof.
[0204] Additional embodiments of the invention relate to fragments of SEQ ID NO: 80. The fragment of SEQ ID NO: 80 may comprise one or more CYR open reading frames. The CYR open reading frame may be selected from the group consisting of SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 109, and variants thereof.
[0205] In particular embodiments, the polynucleotides of the invention may be cloned into a vector. The vector may comprise, for example, a DNA, RNA or complementary DNA (cDNA) sequence. The vector may be a plasmid vector, a viral vector, or any other suitable vehicle adapted for the insertion of foreign sequences, their introduction into cells and the expression of the introduced sequences. Typically the vector is an expression vector and may include expression control and processing sequences such as a promoter, an enhancer, ribosome binding sites, polyadenylation signals and transcription termination sequences. The invention also contemplates host cells transformed by such vectors. For example, the polynucleotides of the invention may be cloned into a vector which is transformed into a bacterial host cell, for example E. coli. Methods for the construction of vectors and their transformation into host cells are generally known in the art, and described in, for example, Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2nd ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.), and Ausubel F. M. et al. (Eds) Current Protocols in Molecular Biology (2007), John Wiley and Sons, Inc.
Nucleotide Probes, Primers and Antibodies
[0206] The invention contemplates nucleotides and fragments based on the sequences of the polynucleotides disclosed herein for use as primers and probes for the identification of homologous sequences.
[0207] The nucleotides and fragments may be in the form of oligonucleotides. Oligonucleotides are short stretches of nucleotide residues suitable for use in nucleic acid amplification reactions such as PCR, typically being at least about 5 nucleotides to about 80 nucleotides in length, more typically about 10 nucleotides in length to about 50 nucleotides in length, and even more typically about 15 nucleotides in length to about 30 nucleotides in length.
[0208] Probes are nucleotide sequences of variable length, for example between about 10 nucleotides and several thousand nucleotides, for use in detection of homologous sequences, typically by hybridization. Hybridization probes may be genomic DNA fragments, cDNA fragments, RNA fragments, or other oligonucleotides.
[0209] Methods for the design and/or production of nucleotide probes and/or primers are generally known in the art, and are described in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2nd ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.); Itakura K. et al. (1984) Annu. Rev. Biochem. 53:323; Innis et al., (Eds) (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, New York); Innis and Gelfand, (Eds) (1995) PCR Strategies (Academic Press, New York); and Innis and Gelfand, (Eds) (1999) PCR Methods Manual (Academic Press, New York). Nucleotide primers and probes may be prepared, for example, by chemical synthesis techniques, for example the phosphodiester and phosphotriester methods (see for example Narang S. A. et al. (1979) Meth. Enzymol. 68:90; Brown, E. L. et al. (1979) Meth. Enzymol. 68:109; and U.S. Pat. No. 4,356,270), or the diethylphosphoramidite method (see Beaucage S. L. et al. (1981) Tetrahedron Letters, 22:1859-1862). A method for synthesizing oligonucleotides on a modified solid support is described in U.S. Pat. No. 4,458,066.
[0210] The nucleic acids of the invention, including the above-mentioned probes and primers, may be labelled by incorporation of a marker to facilitate their detection. Techniques for labelling and detecting nucleic acids are described, for example, in Ausubel F. M. et al. (Eds) Current Protocols in Molecular Biology (2007), John Wiley and Sons, Inc.
[0211] Examples of suitable markers include fluorescent molecules (e.g. acetylaminofluorene, bromodeoxyuridine, digoxigenin, fluorescein) and radioactive isotopes (e.g. 32P, 35S, 3H, 33P). Detection of the marker may be achieved, for example, by chemical, photochemical, immunochemical, biochemical, or spectroscopic techniques.
[0212] The probes and primers of the invention may be used, for example, to detect or isolate cyanobacteria and/or dinoflagellates in a sample of interest. Additionally or alternatively, the probes and primers of the invention may be used to detect or isolate a cyanotoxic organism and/or a cylindrospermopsin-producing organism in a sample of interest. Additionally or alternatively, the probes or primers of the invention may be used to isolate corresponding sequences in other organisms including, for example, other bacterial species. Methods such as the polymerase chain reaction (PCR), hybridization, and the like can be used to identify such sequences based on their sequence homology to the sequences set forth herein. Sequences that are selected based on their sequence identity to the entire sequences set forth herein or to fragments thereof are encompassed by the embodiments. Such sequences include sequences that are orthologs of the disclosed sequences. The term "orthologs" refers to genes derived from a common ancestral gene and which are found in different species as a result of speciation. Genes found in different species are considered orthologs when their nucleotide sequences and/or their encoded protein sequences share substantial identity as defined elsewhere herein. Functions of orthologs are often highly conserved among species.
[0213] In hybridization techniques, all or part of a known nucleotide sequence is used to generate a probe that selectively hybridizes to other corresponding nucleic acid sequences present in a given sample. The hybridization probes may be genomic DNA fragments, cDNA fragments, RNA fragments, or other oligonucleotides, and may be labelled with a detectable marker. Thus, for example, probes for hybridization can be made by labelling synthetic oligonucleotides based on the sequences of the invention.
[0214] The level of homology (sequence identity) between probe and the target sequence will largely be determined by the stringency of hybridization conditions. In particular the nucleotide sequence used as a probe may hybridize to a homologue or other variant of a polynucleotide disclosed herein under conditions of low stringency, medium stringency or high stringency. There are numerous conditions and factors, well known to those skilled in the art, which may be employed to alter the stringency of hybridization. These include, for instance, the length and nature (DNA, RNA, base composition) of the nucleic acid to be hybridized to a specified nucleic acid; the concentration of salts and other components, such as the presence or absence of formamide, dextran sulfate, polyethylene glycol etc; and the temperature of the hybridization and/or washing steps.
[0215] Typically, stringent hybridization conditions will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. Exemplary low stringency conditions include hybridization with a buffer solution of 30% to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulfate) at 37° C., and a wash in 1× to 2×SSC (20×SSC=3.0 M NaCl/0.3 M trisodium citrate) at 50° C. to 55° C. Exemplary moderate stringency conditions include hybridization in 40% to 45% formamide, 1.0 M NaCl, 1% SDS at 37° C., and a wash in 0.5× to 1×SSC at 55° C. to 60° C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C., and a final wash in 0.1×SSC at 60° C. to 65° C. for at least about 20 minutes. Optionally, wash buffers may comprise about 0.1% to about 1% SDS. The duration of hybridization is generally less than about 24 hours, usually about 4 to about 12 hours.
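By way of non-limiting illustration only, the salt concentrations implied by the exemplary wash conditions above may be computed from the stated composition of 20×SSC, as sketched in Python below. The data structure and function name are hypothetical. The numerical values are those recited in the preceding paragraph.

```python
# Composition of the 20x SSC stock as stated in the text (mol/L).
SSC_20X = {"NaCl": 3.0, "trisodium citrate": 0.3}

# Exemplary wash conditions from the text:
# (SSC fold range, temperature range in degrees C).
WASH_CONDITIONS = {
    "low": ((1.0, 2.0), (50, 55)),
    "moderate": ((0.5, 1.0), (55, 60)),
    "high": ((0.1, 0.1), (60, 65)),
}


def ssc_molarity(fold):
    """Return (NaCl, trisodium citrate) molarity of an n-fold SSC buffer."""
    scale = fold / 20.0
    return SSC_20X["NaCl"] * scale, SSC_20X["trisodium citrate"] * scale


for name, ((fold_lo, fold_hi), (t_lo, t_hi)) in WASH_CONDITIONS.items():
    nacl_lo, _ = ssc_molarity(fold_lo)
    nacl_hi, _ = ssc_molarity(fold_hi)
    print(f"{name}: {nacl_lo:.3f}-{nacl_hi:.3f} M NaCl at {t_lo}-{t_hi} deg C")
```

On this arithmetic, a 0.1×SSC wash corresponds to about 0.015 M NaCl and about 0.0015 M trisodium citrate.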
[0216] Under a PCR approach, oligonucleotide primers can be designed for use in PCR reactions to amplify corresponding DNA sequences from cDNA or genomic DNA extracted from any organism of interest. Methods for designing PCR primers and PCR cloning are generally known in the art and are disclosed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2nd ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.); Ausubel F. M. et al. (Eds) Current Protocols in Molecular Biology (2007), John Wiley and Sons, Inc; Maniatis et al. Molecular Cloning (1982), 280-281; Innis et al. (Eds) (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, New York); Innis and Gelfand, (Eds) (1995) PCR Strategies (Academic Press, New York); and Innis and Gelfand, (Eds) (1999) PCR Methods Manual (Academic Press, New York). Known methods of PCR include, but are not limited to, methods using paired primers, nested primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers, partially-mismatched primers, and the like.
[0217] The skilled addressee will recognise that the primers described herein for use in PCR or RT-PCR may also be used as probes for the detection of SXT or CYR sequences.
[0218] Also contemplated by the invention are antibodies which are capable of binding specifically to the polypeptides of the invention. The antibodies may be used to qualitatively or quantitatively detect and analyse one or more SXT or CYR polypeptides in a given sample. By "binding specifically" it will be understood that the antibody is capable of binding to the target polypeptide or fragment thereof with a higher affinity than it binds to an unrelated protein. For example, the antibody may bind to the polypeptide or fragment thereof with a binding constant in the range of at least about 10⁻⁴ M to about 10⁻¹⁰ M. Preferably the binding constant is at least about 10⁻⁵ M, or at least about 10⁻⁶ M, more preferably the binding constant of the antibody to the SXT or CYR polypeptide or fragment thereof is at least about 10⁻⁷ M, at least about 10⁻⁸ M, or at least about 10⁻⁹ M or more.
[0219] Antibodies of the invention may exist in a variety of forms, including for example as a whole antibody, or as an antibody fragment, or other immunologically active fragment thereof, such as complementarity determining regions. Similarly, the antibody may exist as an antibody fragment having functional antigen-binding domains, that is, heavy and light chain variable domains. Also, the antibody fragment may exist in a form selected from the group consisting of, but not limited to: Fv, Fab, F(ab)2, scFv (single chain Fv), dAb (single domain antibody), chimeric antibodies, bi-specific antibodies, diabodies and triabodies.
[0220] An antibody `fragment` may be produced by modification of a whole antibody or by synthesis of the desired antibody fragment. Methods of generating antibodies, including antibody fragments, are known in the art and include, for example, synthesis by recombinant DNA technology. The skilled addressee will be aware of methods of synthesising antibodies, such as those described in, for example, U.S. Pat. No. 5,296,348 and Ausubel F. M. et al. (Eds) Current Protocols in Molecular Biology (2007), John Wiley and Sons, Inc.
[0221] Preferably antibodies are prepared from discrete regions or fragments of the SXT or CYR polypeptide of interest. An antigenic portion of a polypeptide of interest may be of any appropriate length, such as from about 5 to about 15 amino acids. Preferably, an antigenic portion contains at least about 5, 6, 7, 8, 9, 10, 11, 12, 13 or 14 amino acid residues.
[0222] In the context of this specification reference to an antibody specific to a SXT or CYR polypeptide of the invention includes an antibody that is specific to a fragment of the polypeptide of interest.
[0223] Antibodies that specifically bind to a polypeptide of the invention can be prepared, for example, using the purified SXT or CYR polypeptides or their nucleic acid sequences using any suitable methods known in the art. For example, a monoclonal antibody, typically containing Fab portions, may be prepared using hybridoma technology described in Harlow and Lane (Eds) Antibodies--A Laboratory Manual, (1988), Cold Spring Harbor Laboratory, N.Y; Coligan, Current Protocols in Immunology (1991); Goding, Monoclonal Antibodies: Principles and Practice (1986) 2nd ed; and Kohler & Milstein, (1975) Nature 256: 495-497. Such techniques include, but are not limited to, antibody preparation by selection of antibodies from libraries of recombinant antibodies in phage or similar vectors, as well as preparation of polyclonal and monoclonal antibodies by immunizing rabbits or mice (see, for example, Huse et al. (1989) Science 246: 1275-1281; Ward et al. (1989) Nature 341: 544-546).
[0224] It will also be understood that antibodies of the invention include humanised antibodies, chimeric antibodies and fully human antibodies. An antibody of the invention may be a bi-specific antibody, having binding specificity to more than one antigen or epitope. For example, the antibody may have specificity for one or more SXT or CYR polypeptide or fragments thereof, and additionally have binding specificity for another antigen. Methods for the preparation of humanised antibodies, chimeric antibodies, fully human antibodies, and bispecific antibodies are known in the art and include, for example as described in U.S. Pat. No. 6,995,243 issued Feb. 7, 2006 to Garabedian, et al. and entitled "Antibodies that recognize and bind phosphorylated human glucocorticoid receptor and methods of using same".
[0225] Generally, a sample potentially comprising SXT or CYR polypeptides can be contacted with an antibody that specifically binds the SXT or CYR polypeptide or fragment thereof. Optionally, the antibody can be fixed to a solid support to facilitate washing and subsequent isolation of the complex, prior to contacting the antibody with a sample. Examples of solid supports include, for example, microtitre plates, beads, sticks, or microbeads. Antibodies can also be attached to a ProteinChip array or a probe substrate as described above.
[0226] Detectable labels for the identification of antibodies bound to the SXT or CYR polypeptides of the invention include, but are not limited to, fluorochromes, fluorescent dyes, radiolabels, enzymes such as horseradish peroxidase, alkaline phosphatase and others commonly used in the art, and colorimetric labels including colloidal gold or coloured glass or plastic beads. Alternatively, the marker in the sample can be detected using an indirect assay, wherein, for example, a second, labelled antibody is used to detect bound marker-specific antibody.
[0227] Methods for detecting the presence of, or measuring the amount of, an antibody-marker complex include, for example, detection of fluorescence, chemiluminescence, luminescence, absorbance, birefringence, transmittance, reflectance, or refractive index such as surface plasmon resonance, ellipsometry, a resonant mirror method, a grating coupler wave guide method or interferometry. Radio frequency methods include multipolar resonance spectroscopy. Electrochemical methods include amperometry and voltammetry methods. Optical methods include imaging methods and non-imaging methods and microscopy.
[0228] Useful assays for detecting the presence of, or measuring the amount of, an antibody-marker complex include, for example, enzyme-linked immunosorbent assay (ELISA), a radioimmune assay (RIA), or a Western blot assay. Such methods are described in, for example, Clinical Immunology (Stites & Terr, eds., 7th ed. 1991); Methods in Cell Biology: Antibodies in Cell Biology, volume 37 (Asai, ed. 1993); and Harlow & Lane, supra.
Methods and Kits for Detection
[0229] The invention provides methods and kits for the detection and/or isolation of SXT nucleic acids and polypeptides. Also provided are methods and kits for the detection and/or isolation of CYR nucleic acids and polypeptides.
[0230] In one aspect, the invention provides a method for the detection of cyanobacteria. The skilled addressee will understand that the detection of "cyanobacteria" encompasses the detection of one or more cyanobacteria. The method comprises obtaining a sample for use in the method, and detecting the presence of one or more SXT polynucleotides or polypeptides as disclosed herein, or a variant or fragment thereof. The presence of SXT polynucleotides, polypeptides, or variants or fragments thereof, is indicative of cyanobacteria in the sample.
[0231] The SXT polynucleotide may comprise a sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof.
[0232] Alternatively, the SXT polynucleotide may be an RNA or cDNA encoded by a sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof and/or polypeptides as disclosed herein, or a variant or fragment thereof.
[0233] The SXT polypeptide may comprise an amino acid sequence selected from the group consisting of SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31, SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 49, SEQ ID NO: 51, SEQ ID NO: 53, SEQ ID NO: 55, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 63, SEQ ID NO: 65, SEQ ID NO: 67, SEQ ID NO: 69, and variants and fragments thereof.
[0234] The inventors have determined that several genes of the SXT gene cluster exist in saxitoxin-producing organisms, and are absent in organisms with the SXT gene cluster that do not produce saxitoxin. Specifically, the inventors have identified that gene 6 (sxtA) (SEQ ID NO: 14), gene 9 (sxtG) (SEQ ID NO: 20), gene 10 (sxtH) (SEQ ID NO: 22), gene 11 (sxtI) (SEQ ID NO: 24) and gene 17 (sxtX) (SEQ ID NO: 36) of the SXT gene cluster are present only in organisms that produce saxitoxin.
[0235] Accordingly, in another aspect the invention provides a method of detecting a cyanotoxic organism. The method comprises obtaining a sample for use in the method, and detecting a cyanotoxic organism based on the detection of one or more SXT polynucleotides comprising a sequence set forth in SEQ ID NO: 14 (sxtA, gene 6), SEQ ID NO: 20 (sxtG, gene 9), SEQ ID NO: 22 (sxtH, gene 10), SEQ ID NO: 24 (sxtI, gene 11), SEQ ID NO: 36 (sxtX, gene 17), or variants or fragments thereof. Additionally or alternatively, a cyanotoxic organism may be detected based on the detection of an RNA or cDNA comprising a sequence encoded by SEQ ID NO: 14 (sxtA, gene 6), SEQ ID NO: 20 (sxtG, gene 9), SEQ ID NO: 22 (sxtH, gene 10), SEQ ID NO: 24 (sxtI, gene 11), SEQ ID NO: 36 (sxtX, gene 17), or variants or fragments thereof. Additionally or alternatively, a cyanotoxic organism may be detected based on the detection of one or more polypeptides comprising a sequence set forth in SEQ ID NO: 15 (SXTA), SEQ ID NO: 21 (SXTG), SEQ ID NO: 23 (SXTH), SEQ ID NO: 25 (SXTI), SEQ ID NO: 37 (SXTX), or variants or fragments thereof, in a sample suspected of comprising one or more cyanotoxic organisms. The cyanotoxic organism may be any organism capable of producing saxitoxin. In a preferred embodiment of the invention, the cyanotoxic organism is a cyanobacterium or a dinoflagellate.
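The decision rule described in the preceding paragraph can be illustrated with a minimal Python sketch. It is not part of the disclosed method: it assumes that an upstream assay (for example PCR, probe hybridization or an antibody-based assay) has already produced a list of detected gene names, and the function names `is_cyanotoxic` and `detected_markers` are hypothetical. Only the gene names and SEQ ID NOs are taken from the text.

```python
# Marker genes named in the text, mapped to their polynucleotide SEQ ID NOs.
SAXITOXIN_MARKERS = {"sxtA": 14, "sxtG": 20, "sxtH": 22, "sxtI": 24, "sxtX": 36}


def detected_markers(detected_genes):
    """Return the sorted saxitoxin marker genes found among the detected genes."""
    return sorted(g for g in set(detected_genes) if g in SAXITOXIN_MARKERS)


def is_cyanotoxic(detected_genes):
    """True if one or more of the marker genes was detected (hypothetical helper)."""
    return len(detected_markers(detected_genes)) > 0


# Illustrative input: the gene names would come from an upstream assay.
sample_result = ["sxtX", "other_gene", "sxtA"]
print(detected_markers(sample_result), is_cyanotoxic(sample_result))
```

The sketch treats detection of any single marker as sufficient, mirroring the "one or more" wording of the paragraph; a real workflow might require several markers or confirmatory assays.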
[0236] In certain embodiments of the invention, the methods for detecting cyanobacteria or the methods for detecting cyanotoxic organisms may further comprise the detection of one or more CYR polynucleotides or CYR polypeptides as disclosed herein, or a variant or fragment thereof. The CYR polynucleotide may comprise a sequence selected from the group consisting of SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 109, and variants or fragments thereof.
[0237] Alternatively, the CYR polynucleotide may be an RNA or cDNA encoded by a polynucleotide sequence selected from the group consisting of SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 109, and variants or fragments thereof.
[0238] The CYR polypeptide may comprise a sequence selected from the group consisting of SEQ ID NO: 82, SEQ ID NO: 84, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 94, SEQ ID NO: 96, SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 102, SEQ ID NO: 104, SEQ ID NO: 106, SEQ ID NO: 108, and SEQ ID NO: 110, and variants or fragments thereof.
[0239] The inventors have determined gene 8 (cyrJ) (SEQ ID NO: 95) of the CYR gene cluster exists in cylindrospermopsin-producing organisms, and is absent in organisms with the CYR gene cluster that do not produce cylindrospermopsin. Accordingly, the methods for detecting cyanobacteria or the methods for detecting cyanotoxic organisms may further comprise the detection of a cylindrospermopsin-producing organism based on the detection of a CYR polynucleotide comprising a sequence set forth in SEQ ID NO: 95, or a variant or fragment thereof. Additionally or alternatively, the methods for detecting cyanobacteria or the methods for detecting cyanotoxic organisms may further comprise the detection of a cylindrospermopsin-producing organism based on the detection of an RNA or cDNA comprising a sequence encoded by SEQ ID NO: 95, or a variant or fragment thereof. Additionally or alternatively, the methods for detecting cyanobacteria or the methods for detecting cyanotoxic organisms may further comprise the detection of a cylindrospermopsin-producing organism based on the detection of a CYR polypeptide comprising a sequence set forth in SEQ ID NO: 96, or a variant or fragment thereof.
[0240] In another aspect, the invention provides a method for the detection of cyanobacteria. The skilled addressee will understand that the detection of "cyanobacteria" encompasses the detection of one or more cyanobacteria. The method comprises obtaining a sample for use in the method, and detecting the presence of one or more CYR polynucleotides or polypeptides as disclosed herein, or a variant or fragment thereof. The presence of CYR polynucleotides, polypeptides, or variants or fragments thereof, is indicative of cyanobacteria in the sample.
[0241] The CYR polynucleotide may comprise a sequence selected from the group consisting of SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 109 and variants and fragments thereof.
[0242] Alternatively, the CYR polynucleotide may be an RNA or cDNA encoded by a sequence selected from the group consisting of SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 109 and variants and fragments thereof.
[0243] The CYR polypeptide may comprise a sequence selected from the group consisting of SEQ ID NO: 82, SEQ ID NO: 84, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 94, SEQ ID NO: 96, SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 102, SEQ ID NO: 104, SEQ ID NO: 106, SEQ ID NO: 108, and SEQ ID NO: 110, and variants or fragments thereof.
[0244] In another aspect of the invention there is provided a method of detecting a cylindrospermopsin-producing organism based on the detection of CYR gene 8 (cyrJ). The method comprises obtaining a sample for use in the method, and detecting the presence of a CYR polynucleotide comprising a sequence set forth in SEQ ID NO: 95, or a variant or fragment thereof. Additionally or alternatively, the method for detecting a cylindrospermopsin-producing organism based on the detection of CYR gene 8 (cyrJ) may comprise the detection of an RNA or cDNA comprising a sequence encoded by SEQ ID NO: 95, or a variant or fragment thereof. Additionally or alternatively, the method for detecting a cylindrospermopsin-producing organism based on the detection of CYR gene 8 (cyrJ) may comprise the detection of a CYR polypeptide comprising a sequence set forth in SEQ ID NO: 96, or a variant or fragment thereof.
[0245] In certain embodiments of the invention, the methods for detecting cyanobacteria comprising the detection of CYR sequences or variants or fragments thereof further comprise the detection of one or more SXT polynucleotides or SXT polypeptides as disclosed herein, or a variant or fragment thereof.
[0246] The SXT polynucleotide may comprise a sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof.
[0247] Alternatively, the SXT polynucleotide may be an RNA or cDNA encoded by a sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof and/or polypeptides as disclosed herein, or a variant or fragment thereof.
[0248] The SXT polypeptide may comprise an amino acid sequence selected from the group consisting of SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31, SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 49, SEQ ID NO: 51, SEQ ID NO: 53, SEQ ID NO: 55, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 63, SEQ ID NO: 65, SEQ ID NO: 67, SEQ ID NO: 69, and variants and fragments thereof.
[0249] In another aspect, the invention provides a method for the detection of dinoflagellates. The skilled addressee will understand that the detection of "dinoflagellates" encompasses the detection of one or more dinoflagellates. The method comprises obtaining a sample for use in the method, and detecting the presence of one or more SXT polynucleotides or polypeptides as disclosed herein, or a variant or fragment thereof. The presence of SXT polynucleotides, polypeptides, or variants or fragments thereof, is indicative of dinoflagellates in the sample.
[0250] The SXT polynucleotide may comprise a sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof.
[0251] Alternatively, the SXT polynucleotide may be an RNA or cDNA encoded by a sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof and/or polypeptides as disclosed herein, or a variant or fragment thereof.
[0252] The SXT polypeptide may comprise an amino acid sequence selected from the group consisting of SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31, SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 49, SEQ ID NO: 51, SEQ ID NO: 53, SEQ ID NO: 55, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 63, SEQ ID NO: 65, SEQ ID NO: 67, SEQ ID NO: 69, and variants and fragments thereof.
[0253] A sample for use in accordance with the methods described herein may be suspected of comprising one or more cyanotoxic organisms. The cyanotoxic organisms may be one or more cyanobacteria and/or one or more dinoflagellates. Additionally or alternatively, a sample for use in accordance with the methods described herein may be suspected of comprising one or more cyanobacteria and/or one or more dinoflagellates. A sample for use in accordance with the methods described herein may be a comparative or control sample, for example, a sample comprising a known concentration or density of cyanobacteria and/or dinoflagellates, or a sample comprising one or more known species or strains of cyanobacteria and/or dinoflagellates.
[0254] A sample for use in accordance with the methods described herein may be derived from any source. For example, a sample may be an environmental sample. The environmental sample may be derived, for example, from salt water, fresh water or a blue-green algal bloom. Alternatively, the sample may be derived from a laboratory source, such as a culture, or a commercial source.
[0255] It will be appreciated by those in the art that the methods and kits disclosed herein are generally suitable for detecting any organisms in which the SXT and/or CYR gene clusters are present. Suitable cyanobacteria to which the methods of the invention are applicable may be selected from the orders Oscillatoriales, Chroococcales, Nostocales and Stigonematales. For example, the cyanobacteria may be selected from the genera Anabaena, Nostoc, Microcystis, Planktothrix, Oscillatoria, Phormidium, and Nodularia. For example, the cyanobacteria may be selected from the species Cylindrospermopsis raciborskii T3, Cylindrospermopsis raciborskii AWT205, Aphanizomenon ovalisporum, Aphanizomenon flos-aquae, Aphanizomenon sp., Umezakia natans, Raphidiopsis curvata, Anabaena bergii, Lyngbya wollei, and Anabaena circinalis. Examples of suitable dinoflagellates to which the methods and kits of the invention are applicable may be selected from the genera Alexandrium, Pyrodinium and Gymnodinium. The methods and kits of the invention may also be employed for the discovery of novel hepatotoxic species or genera in culture collections or from environmental samples. The methods and kits of the invention may also be employed to detect cyanotoxins that accumulate in other animals, for example, fish and shellfish.
[0256] Detection of SXT and CYR polynucleotides and polypeptides disclosed herein may be performed using any suitable method. For example, methods for the detection of SXT and CYR polynucleotides and/or polypeptides disclosed herein may involve the use of a primer, probe or antibody specific for one or more SXT and CYR polynucleotides and polypeptides. Suitable techniques and assays in which the skilled addressee may utilise a primer, probe or antibody specific for one or more SXT and CYR polynucleotides and polypeptides include, for example, the polymerase chain reaction (and related variations of this technique), antibody based assays such as ELISA and flow cytometry, and fluorescent microscopy. Methods by which the SXT and CYR polypeptides disclosed herein may be identified are generally known in the art, and are described for example in Coligan J. E. et al. (Eds) Current Protocols in Protein Science (2007), John Wiley and Sons, Inc; Walker, J. M., (Ed) (1988) New Protein Techniques: Methods in Molecular Biology, Humana Press, Clifton, N.J. and Scopes, R. K. (1987) Protein Purification: Principles and Practice, 3rd. Ed., Springer-Verlag, New York, N.Y. For example, SXT and CYR polypeptides disclosed herein may be detected by western blot or spectrophotometric analysis. Other examples of suitable methods for the detection of SXT and CYR polypeptides are described, for example, in U.S. Pat. No. 4,683,195, U.S. Pat. No. 6,228,578, U.S. Pat. No. 7,282,355, U.S. Pat. No. 7,348,147 and PCT publication No. WO/2007/056723.
[0257] In a preferred embodiment of the invention, the detection of SXT and CYR polynucleotides and polypeptides is achieved by amplification of DNA from the sample of interest by polymerase chain reaction, using primers that hybridise specifically to the SXT and/or CYR sequence, or a variant or fragment thereof, and detecting the amplified sequence.
[0258] Nucleic acids and polypeptides for analysis using methods and kits disclosed herein may be extracted from organisms either in mixed culture or as individual species or genus isolates. Accordingly, the organisms may be cultured prior to nucleic acid and/or polypeptide isolation or alternatively nucleic acid and/or polypeptides may be extracted directly from environmental samples, such as water samples or blue-green algal blooms.
[0259] Suitable methods for the extraction and purification of nucleic acids for analysis using the methods and kits of the invention are generally known in the art and are described, for example, in Ausubel F. M. et al. (Eds) Current Protocols in Molecular Biology (2007), John Wiley and Sons, Inc; Neilan (1995) Appl. Environ. Microbiol. 61:2286-2291; and Neilan et al. (2002) Astrobiol. 2:271-280. The skilled addressee will readily appreciate that the invention is not limited to the specific methods for nucleic acid isolation described therein and other suitable methods are encompassed by the invention. The invention may be performed without nucleic acid isolation prior to analysis of the nucleic acid.
[0260] Suitable methods for the extraction and purification of polypeptides for the purposes of the invention are generally known in the art and are described, for example, in Coligan J. E. et al. (Eds) Current Protocols in Protein Science (2007), John Wiley and Sons, Inc; Walker, J. M., (Ed) (1988) New Protein Techniques: Methods in Molecular Biology, Humana Press, Clifton, N.J. and Scopes, R. K. (1987) Protein Purification: Principles and Practice, 3rd. Ed., Springer-Verlag, New York, N.Y. Examples of suitable techniques for protein extraction include, but are not limited to dialysis, ultrafiltration, and precipitation. Protein purification techniques suitable for use include, but are not limited to, reverse-phase chromatography, hydrophobic interaction chromatography, centrifugation, gel filtration, ammonium sulfate precipitation, and ion exchange.
[0261] In accordance with the methods and kits of the invention, SXT and CYR polynucleotides or variants or fragments thereof may be detected by any suitable means known in the art. In a preferred embodiment of the invention, SXT and CYR polynucleotides are detected by PCR amplification. Under the PCR approach, oligonucleotide primers can be designed for use in PCR reactions to amplify SXT and CYR polynucleotides of the invention. Also encompassed by the invention is the PCR amplification of complementary DNA (cDNA) produced by reverse transcription of messenger RNA (mRNA) transcribed from SXT and CYR sequences (RT-PCR). Known methods of PCR include, but are not limited to, methods using paired primers, nested primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers, partially-mismatched primers, and the like. Methods for designing PCR and RT-PCR primers are generally known in the art and are disclosed, for example, in Ausubel F. M. et al. (Eds) Current Protocols in Molecular Biology (2007), John Wiley and Sons, Inc; Maniatis et al. Molecular Cloning (1982), 280-281; Innis et al. (Eds) (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, New York); Innis and Gelfand, (Eds) (1995) PCR Strategies (Academic Press, New York); Innis and Gelfand, (Eds) (1999) PCR Methods Manual (Academic Press, New York); and Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2nd ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.).
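By way of illustration only, the paired-primer approach can be sketched in silico as follows. This minimal Python example assumes exact primer matches on a single template strand and uses short made-up sequences; `predict_amplicon` and `reverse_complement` are hypothetical helper names, and none of the sequences correspond to the SEQ ID NOs of the invention.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def predict_amplicon(template, forward, reverse):
    """Return the region a primer pair would amplify, or None if either
    primer has no exact binding site (simplifying assumption)."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its reverse
    # complement is searched for downstream of the forward primer.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


# Made-up template: flank + forward site + spacer + reverse site + flank.
template = "TTT" + "ACGTACG" + "GGGGG" + "CCATTGA" + "AA"
product = predict_amplicon(template, "ACGTACG", "TCAATGG")
print(product, len(product) if product else 0)
```

Real primers tolerate some mismatches, as discussed below, so an exact-match search of this kind underestimates the set of sequences that could be amplified.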
[0262] The skilled addressee will readily appreciate that various parameters of PCR and RT-PCR procedures may be altered without affecting the ability to achieve the desired product. For example, the salt concentration may be varied or the time and/or temperature of one or more of the denaturation, annealing and extension steps may be varied. Similarly, the amount of DNA, cDNA, or RNA template may also be varied depending on the amount of nucleic acid available or the optimal amount of template required for efficient amplification. The primers for use in the methods and kits of the present invention are typically oligonucleotides of at least about 5 nucleotides to about 80 nucleotides in length, more typically about 10 nucleotides to about 50 nucleotides in length, and even more typically about 15 nucleotides to about 30 nucleotides in length. The skilled addressee will recognise that the primers described herein may be useful for a number of different applications, including but not limited to PCR, RT-PCR, and use of probes for the detection of SXT or CYR sequences.
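The nested length ranges stated above can be expressed as a simple check. The following Python sketch is illustrative only: it treats the "about" bounds as hard limits, which is an assumption made for the sake of the example, and the name `length_tier` and the tier labels are hypothetical.

```python
# Length ranges from the text, narrowest first; "about" is treated as exact
# here purely for illustration.
LENGTH_TIERS = [
    ("even more typical", 15, 30),
    ("more typical", 10, 50),
    ("typical", 5, 80),
]


def length_tier(primer):
    """Return the narrowest stated length range that contains the primer."""
    n = len(primer)
    for name, low, high in LENGTH_TIERS:
        if low <= n <= high:
            return name
    return "outside stated ranges"


# Made-up 20-mer used only to exercise the function.
print(length_tier("ACGTACGTACGTACGTACGT"))
```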
[0263] Such primers can be prepared by any suitable method, including, for example, direct chemical synthesis or cloning and restriction of appropriate sequences. Not all bases in the primer need reflect the sequence of the template molecule to which the primer will hybridize; the primer need only contain sufficient complementary bases to enable the primer to hybridize to the template. A primer may also include mismatch bases at one or more positions, being bases that are not complementary to bases in the template, but rather are designed to incorporate changes into the DNA upon base extension or amplification. A primer may include additional bases, for example in the form of a restriction enzyme recognition sequence at the 5' end, to facilitate cloning of the amplified DNA.
[0264] The invention provides a method of detecting a cyanotoxic organism based on the detection of one or more of SXT gene 6 (sxtA), SXT gene 9 (sxtG), SXT gene 10 (sxtH), SXT gene 11 (sxtI) and SXT gene 17 (sxtX) (SEQ ID NOS: 14, 20, 22, 24, and 36 respectively), or fragments or variants thereof. Additionally or alternatively, a cyanotoxic organism may be detected based on the detection of one or more of the following SXT polypeptides: SXTA (SEQ ID NO: 15), SXTG (SEQ ID NO: 21), SXTH (SEQ ID NO: 23), SXTI (SEQ ID NO: 25), SXTX (SEQ ID NO: 37), or fragments or variants thereof.
[0265] The skilled addressee will recognise that any primers capable of amplifying the stated SXT and/or CYR sequences, or variants or fragments thereof, are suitable for use in the methods of the invention. For example, suitable oligonucleotide primer pairs for the PCR amplification of SXT gene 6 (sxtA) may comprise a first primer comprising the sequence of SEQ ID NO: 70 and a second primer comprising the sequence of SEQ ID NO: 71, a first primer comprising the sequence of SEQ ID NO: 72 and a second primer comprising the sequence of SEQ ID NO: 73, a first primer comprising the sequence of SEQ ID NO: 74 and a second primer comprising the sequence of SEQ ID NO: 75, a first primer comprising the sequence of SEQ ID NO: 76 and a second primer comprising the sequence of SEQ ID NO: 77, a first primer comprising the sequence of SEQ ID NO: 78 and a second primer comprising the sequence of SEQ ID NO: 79, a first primer comprising the sequence of SEQ ID NO: 113 and a second primer comprising the sequence of SEQ ID NO: 114, or a first primer comprising the sequence of SEQ ID NO: 115 or SEQ ID NO: 116 and a second primer comprising the sequence of SEQ ID NO: 117.
[0266] Suitable oligonucleotide primer pairs for the amplification of SXT gene 9 (sxtG) may comprise a first primer comprising the sequence of SEQ ID NO: 118 and a second primer comprising the sequence of SEQ ID NO: 119, or a first primer comprising the sequence of SEQ ID NO: 120 and a second primer comprising the sequence of SEQ ID NO: 121.
[0267] Suitable oligonucleotide primer pairs for the amplification of SXT gene 10 (sxtH) may comprise a first primer comprising the sequence of SEQ ID NO: 122 and a second primer comprising the sequence of SEQ ID NO: 123.
[0268] Suitable oligonucleotide primer pairs for the amplification of SXT gene 11 (sxtI) may comprise a first primer comprising the sequence of SEQ ID NO: 124 or SEQ ID NO: 125 and a second primer comprising the sequence of SEQ ID NO: 126, or a first primer comprising the sequence of SEQ ID NO: 127 and a second primer comprising the sequence of SEQ ID NO: 128.
[0269] Suitable oligonucleotide primer pairs for the amplification of SXT gene 17 (sxtX) may comprise a first primer comprising the sequence of SEQ ID NO: 129 and a second primer comprising the sequence of SEQ ID NO: 130, or a first primer comprising the sequence of SEQ ID NO: 131 and a second primer comprising the sequence of SEQ ID NO: 132.
[0270] The skilled addressee will recognise that fragments and variants of the above-mentioned primer pairs may also efficiently amplify SXT gene 6 (sxtA), SXT gene 9 (sxtG), SXT gene 10 (sxtH), SXT gene 11 (sxtI) or SXT gene 17 (sxtX) sequences.
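For convenience, the exemplified SXT primer pairs listed above may be summarised as a lookup table. The Python sketch below records only the SEQ ID NOs stated in the preceding paragraphs, as (first primer, second primer) tuples; the names `PRIMER_PAIRS` and `pairs_for` are hypothetical, and the primer sequences themselves are not reproduced here.

```python
# SEQ ID NOs of exemplified primer pairs, as (first primer, second primer).
PRIMER_PAIRS = {
    "sxtA": [(70, 71), (72, 73), (74, 75), (76, 77), (78, 79),
             (113, 114), (115, 117), (116, 117)],
    "sxtG": [(118, 119), (120, 121)],
    "sxtH": [(122, 123)],
    "sxtI": [(124, 126), (125, 126), (127, 128)],
    "sxtX": [(129, 130), (131, 132)],
}


def pairs_for(gene):
    """Return the exemplified primer pairs for a gene, or an empty list."""
    return list(PRIMER_PAIRS.get(gene, []))


for gene in PRIMER_PAIRS:
    print(gene, pairs_for(gene))
```

Where the text offers alternative first primers with a shared second primer (for example SEQ ID NO: 115 or SEQ ID NO: 116 with SEQ ID NO: 117), the table lists each combination as its own pair.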
[0271] In certain embodiments of the invention, polynucleotide sequences derived from the CYR gene are detected based on the detection of CYR gene 8 (cyrJ) (SEQ ID NO: 95). Suitable oligonucleotide primer pairs for the PCR amplification of CYR gene 8 (cyrJ) may comprise a first primer having the sequence of SEQ ID NO: 111 or a fragment or variant thereof and a second primer having the sequence of SEQ ID NO: 112 or a fragment thereof.
[0272] Also included within the scope of the present invention are variants and fragments of the exemplified oligonucleotide primers. The skilled addressee will also recognise that the invention is not limited to the use of the specific primers exemplified, and alternative primer sequences may also be used, provided the primers are designed appropriately so as to enable the amplification of SXT and/or CYR sequences. Suitable primer sequences can be determined by those skilled in the art using routine procedures without undue experimentation. The location of suitable primers for the amplification of SXT and/or CYR sequences may be determined by such factors as G+C content and the ability for a sequence to form unwanted secondary structures.
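The two factors mentioned above, G+C content and the potential for unwanted secondary structure, can be screened computationally. The following Python sketch is a simplified illustration under stated assumptions: the hairpin check looks only for an exact self-complementary stem of fixed length separated by a minimum loop, the default values of `stem` and `min_loop` are arbitrary illustrative choices, and all function names and test sequences are hypothetical.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_self_complementary_stem(seq, stem=4, min_loop=3):
    """True if any stem-length window has its reverse complement further
    downstream, leaving at least min_loop bases for a hairpin loop."""
    seq = seq.upper()
    for i in range(len(seq) - stem + 1):
        target = reverse_complement(seq[i:i + stem])
        if seq.find(target, i + stem + min_loop) != -1:
            return True
    return False


# Made-up candidate primer used only to exercise the functions.
candidate = "GGGGAAACCCC"
print(gc_fraction(candidate), has_self_complementary_stem(candidate))
```

Dedicated primer design tools use thermodynamic models rather than exact string matching, so a screen of this kind is at best a first filter.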
[0273] Suitable methods of analysis of the amplified nucleic acids are well known to those skilled in the art and are described, for example, in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2nd ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.); Ausubel F. M. et al. (Eds) Current Protocols in Molecular Biology (2007), John Wiley and Sons, Inc; and Maniatis et al. Molecular Cloning (1982), 280-281. Suitable methods of analysis of the amplified nucleic acids include, for example, gel electrophoresis which may or may not be preceded by restriction enzyme digestion, and/or nucleic acid sequencing. Gel electrophoresis may comprise agarose gel electrophoresis or polyacrylamide gel electrophoresis, techniques commonly used by those skilled in the art for separation of DNA fragments on the basis of size. The concentration of agarose or polyacrylamide in the gel in large part determines the resolution ability of the gel and the appropriate concentration of agarose or polyacrylamide will therefore depend on the size of the DNA fragments to be distinguished.
[0274] In other embodiments of the invention, SXT and CYR polynucleotides and variants or fragments thereof may be detected by the use of suitable probes. The probes of the invention are based on the sequences of SXT and/or CYR polynucleotides disclosed herein. Probes are nucleotide sequences of variable length, for example between about 10 nucleotides and several thousand nucleotides, for use in detection of homologous sequences, typically by hybridization. Hybridization probes of the invention may be genomic DNA fragments, cDNA fragments, RNA fragments, or other oligonucleotides.
[0275] Methods for the design and/or production of nucleotide probes are generally known in the art, and are described, for example, in Robinson P. J., et al. (Eds) Current Protocols in Cytometry (2007), John Wiley and Sons, Inc; Ausubel F. M. et al. (Eds) Current Protocols in Molecular Biology (2007), John Wiley and Sons, Inc; Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2nd ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.); and Maniatis et al. Molecular Cloning (1982), 280-281. Nucleotide probes may be prepared, for example, by chemical synthesis techniques, for example, the phosphodiester and phosphotriester methods (see for example Narang S. A. et al. (1979) Meth. Enzymol. 68:90; Brown, E. L. et al. (1979) Meth. Enzymol. 68:109; and U.S. Pat. No. 4,356,270) or the diethylphosphoramidite method (see Beaucage S. L. et al. (1981) Tetrahedron Letters, 22:1859-1862). A method for synthesizing oligonucleotides on a modified solid support is described in U.S. Pat. No. 4,458,066.
[0276] The probes of the invention may be labelled by incorporation of a marker to facilitate their detection. Techniques for labelling and detecting nucleic acids are described, for example, in Ausubel F. M. et al. (Eds) Current Protocols in Molecular Biology (2007), John Wiley and Sons, Inc. Examples of suitable markers include fluorescent molecules (e.g. acetylaminofluorene, 5-bromodeoxyuridine, digoxigenin, fluorescein) and radioactive isotopes (e.g. 32P, 35S, 3H, 33P). Detection of the marker may be achieved, for example, by chemical, photochemical, immunochemical, biochemical, or spectroscopic techniques.
[0277] The methods and kits of the invention also encompass the use of antibodies which are capable of binding specifically to the polypeptides of the invention. The antibodies may be used to qualitatively or quantitatively detect and analyse one or more SXT or CYR polypeptides in a given sample. Methods for the generation and use of antibodies are generally known in the art and described in, for example, Harlow and Lane (Eds) Antibodies--A Laboratory Manual, (1988), Cold Spring Harbor Laboratory, N.Y.; Coligan, Current Protocols in Immunology (1991); Goding, Monoclonal Antibodies: Principles and Practice (1986) 2nd ed; and Kohler & Milstein, (1975) Nature 256: 495-497. The antibodies may be conjugated to a fluorochrome allowing detection, for example, by flow cytometry, immunohistochemistry or other means known in the art. Alternatively, the antibody may be bound to a substrate allowing colorimetric or chemiluminescent detection. The invention also contemplates the use of secondary antibodies capable of binding to one or more antibodies capable of binding specifically to the polypeptides of the invention.
[0278] The invention also provides kits for the detection of cyanotoxic organisms and/or cyanobacteria, and/or dinoflagellates. In general, the kits of the invention comprise at least one agent for detecting the presence of one or more SXT and/or CYR polynucleotides or polypeptides disclosed herein, or a variant or fragment thereof. Any suitable agent capable of detecting SXT and/or CYR sequences of the invention may be included in the kit. Non-limiting examples include primers, probes and antibodies.
[0279] In one aspect, the invention provides a kit for the detection of cyanobacteria, the kit comprising at least one agent for detecting the presence of one or more SXT polynucleotides or polypeptides as disclosed herein, or a variant or fragment thereof.
[0280] The SXT polynucleotide may comprise a sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof.
[0281] Alternatively, the SXT polynucleotide may be an RNA or cDNA encoded by a sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof and/or polypeptides as disclosed herein, or a variant or fragment thereof.
[0282] The SXT polypeptide may comprise an amino acid sequence selected from the group consisting of SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31, SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 49, SEQ ID NO: 51, SEQ ID NO: 53, SEQ ID NO: 55, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 63, SEQ ID NO: 65, SEQ ID NO: 67, SEQ ID NO: 69, and variants and fragments thereof.
[0283] Also provided is a kit for the detection of cyanotoxic organisms. The kit comprises at least one agent for detecting the presence of one or more SXT polynucleotides or polypeptides as disclosed herein, or a variant or fragment thereof.
[0284] The SXT polynucleotide may comprise a sequence selected from the group consisting of SEQ ID NO: 14, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 36, and variants and fragments thereof.
[0285] Alternatively, the SXT polynucleotide may be an RNA or cDNA encoded by a sequence selected from the group consisting of SEQ ID NO: 14, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 36, and variants and fragments thereof.
[0286] The SXT polypeptide may comprise an amino acid sequence selected from the group consisting of SEQ ID NO: 15, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 37, and variants and fragments thereof.
[0287] The at least one agent may be any suitable reagent for the detection of SXT polynucleotides and/or polypeptides disclosed herein. For example, the agent may be a primer, an antibody or a probe. By way of exemplification only, the primers or probes may comprise a sequence selected from the group consisting of SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID NO: 72, SEQ ID NO: 73, SEQ ID NO: 74, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID NO: 113, SEQ ID NO: 114, SEQ ID NO: 115, SEQ ID NO: 116, SEQ ID NO: 117, SEQ ID NO: 118, SEQ ID NO: 119, SEQ ID NO: 120, SEQ ID NO: 121, SEQ ID NO: 122, SEQ ID NO: 123, SEQ ID NO: 124, SEQ ID NO: 125, SEQ ID NO: 126, SEQ ID NO: 127, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 132, SEQ ID NO: 133, SEQ ID NO: 134, and variants and fragments thereof.
[0288] In certain embodiments of the invention, the kits for the detection of cyanobacteria or cyanotoxic organisms may further comprise at least one additional agent capable of detecting one or more CYR polynucleotide and/or CYR polypeptide sequences as disclosed herein, or a variant or fragment thereof.
[0289] The CYR polynucleotide may comprise a polynucleotide comprising a sequence selected from the group consisting of: SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 109, and variants and fragments thereof.
[0290] Alternatively, the CYR polynucleotide may comprise a ribonucleic acid or complementary DNA encoded by a sequence selected from the group consisting of: SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 109, and variants and fragments thereof.
[0291] The CYR polypeptide may comprise a polypeptide comprising a sequence selected from the group consisting of: SEQ ID NO: 82, SEQ ID NO: 84, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 94, SEQ ID NO: 96, SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 102, SEQ ID NO: 104, SEQ ID NO: 106, SEQ ID NO: 108, and SEQ ID NO: 110, and variants and fragments thereof.
[0292] The at least one additional agent may be selected, for example, from the group consisting of primers, antibodies and probes. A suitable primer or probe may comprise a sequence selected from the group consisting of SEQ ID NO: 111, SEQ ID NO: 112, and variants and fragments thereof.
[0293] In another aspect, the invention provides a kit for the detection of cyanobacteria, the kit comprising at least one agent for detecting the presence of one or more CYR polynucleotides or polypeptides as disclosed herein, or a variant or fragment thereof.
[0294] The CYR polynucleotide may comprise a sequence selected from the group consisting of SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 109, and variants and fragments thereof.
[0295] Alternatively, the CYR polynucleotide may be an RNA or cDNA encoded by a sequence selected from the group consisting of SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 109, and variants and fragments thereof.
[0296] The CYR polypeptide may comprise a sequence selected from the group consisting of SEQ ID NO: 82, SEQ ID NO: 84, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 94, SEQ ID NO: 96, SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 102, SEQ ID NO: 104, SEQ ID NO: 106, SEQ ID NO: 108, and SEQ ID NO: 110, and variants or fragments thereof.
[0297] In certain embodiments of the invention, the kits for detecting cyanobacteria comprising one or more agents for the detection of CYR sequences or variants or fragments thereof, may further comprise at least one additional agent capable of detecting one or more of the SXT polynucleotides and/or SXT polypeptides disclosed herein, or variants or fragments thereof.
[0298] The SXT polynucleotide may comprise a sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof.
[0299] Alternatively, the SXT polynucleotide may be an RNA or cDNA encoded by a sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof and/or polypeptides as disclosed herein, or a variant or fragment thereof.
[0300] The at least one agent may be any suitable reagent for the detection of CYR polynucleotides and/or polypeptides disclosed herein. For example, the agent may be a primer, an antibody or a probe. By way of exemplification only, the primers or probes may comprise a sequence selected from the group consisting of SEQ ID NO: 111, SEQ ID NO: 112, and variants and fragments thereof.
[0301] Also provided is a kit for the detection of dinoflagellates, the kit comprising at least one agent for detecting the presence of one or more SXT polynucleotides or polypeptides as disclosed herein, or a variant or fragment thereof.
[0302] The SXT polynucleotide may comprise a sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof.
[0303] Alternatively, the SXT polynucleotide may be an RNA or cDNA encoded by a sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, and variants and fragments thereof and/or polypeptides as disclosed herein, or a variant or fragment thereof.
[0304] The SXT polypeptide may comprise an amino acid sequence selected from the group consisting of SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31, SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 49, SEQ ID NO: 51, SEQ ID NO: 53, SEQ ID NO: 55, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 63, SEQ ID NO: 65, SEQ ID NO: 67, SEQ ID NO: 69, and variants and fragments thereof.
[0305] In general, the kits of the invention may comprise any number of additional components. By way of non-limiting example, the additional components may include reagents for cell culture, reference samples, buffers, labels, and written instructions for performing the detection assay.
Methods of Screening
[0306] The polypeptides and polynucleotides of the present invention, and fragments and analogues thereof are useful for the screening and identification of compounds and agents that interact with these molecules. In particular, desirable compounds are those that modulate the activity of these polypeptides and polynucleotides. Such compounds may exert a modulatory effect by activating, stimulating, increasing, inhibiting or preventing expression or activity of the polypeptides and/or polynucleotides. Suitable compounds may exert their effect by virtue of either a direct (for example binding) or indirect interaction.
[0307] Compounds which bind, or otherwise interact with the polypeptides and polynucleotides of the invention, and specifically compounds which modulate their activity, may be identified by a variety of suitable methods. Non-limiting methods include the two-hybrid method, co-immunoprecipitation, affinity purification, mass spectroscopy, tandem affinity purification, phage display, label transfer, DNA microarrays/gene coexpression and protein microarrays.
[0308] For example, a two-hybrid assay may be used to determine whether a candidate agent or plurality of candidate agents interacts or binds with a polypeptide of the invention or a variant or fragment thereof. The yeast two-hybrid assay system is a yeast-based genetic assay typically used for detecting protein-protein interactions (Fields and Song, Nature 340: 245-246 (1989)). The assay makes use of the multi-domain nature of transcriptional activators. For example, the DNA-binding domain of a known transcriptional activator may be fused to a polypeptide of the invention or a variant or fragment thereof, and the activation domain of the transcriptional activator fused to the candidate agent. Interaction between the candidate agent and the polypeptide of the invention or a variant or fragment thereof, will bring the DNA-binding and activation domains of the transcriptional activator into close proximity. Subsequent transcription of a specific reporter gene activated by the transcriptional activator allows the detection of an interaction.
[0309] In a modification of the technique above, a fusion protein may be constructed by fusing the polypeptide of the invention or a variant or fragment thereof to a detectable tag, for example alkaline phosphatase, and using a modified form of immunoprecipitation as described by Flanagan and Leder (Flanagan and Leder, Cell 63:185-194 (1990)).
[0310] Alternatively, co-immunoprecipitation may be used to determine whether a candidate agent or plurality of candidate agents interacts or binds with a polypeptide of the invention or a variant or fragment thereof. Using this technique, cyanotoxic organisms, cyanobacteria and/or dinoflagellates may be lysed under nondenaturing conditions suitable for the preservation of protein-protein interactions. The resulting solution can then be incubated with an antibody specific for a polypeptide of the invention or a variant or fragment thereof and immunoprecipitated from the bulk solution, for example by capture with an antibody-binding protein attached to a solid support. Immunoprecipitation of the polypeptide of the invention or a variant or fragment thereof by this method facilitates the co-immunoprecipitation of an agent associated with that protein. The identification of an associated agent can be established using a number of methods known in the art, including but not limited to SDS-PAGE, western blotting, and mass spectrometry.
[0311] Alternatively, the phage display method may be used to determine whether a candidate agent or plurality of candidate agents interacts or binds with a polypeptide of the invention or a variant or fragment thereof. Phage display is a test to screen for protein interactions by integrating multiple genes from a gene bank into phage. Under this method, recombinant DNA techniques are used to express numerous genes as fusions with the coat protein of a bacteriophage such that the peptide or protein product of each gene is displayed on the surface of the viral particle. A whole library of phage-displayed peptides or protein products of interest can be produced in this way. The resulting libraries of phage-displayed peptides or protein products may then be screened for the ability to bind a polypeptide of the invention or a variant or fragment thereof. DNA extracted from interacting phage contains the sequences of interacting proteins.
[0312] Alternatively, affinity chromatography may be used to determine whether a candidate agent or plurality of candidate agents interacts or binds with a polypeptide of the invention or a variant or fragment thereof. For example, a polypeptide of the invention or a variant or fragment thereof, may be immobilised on a support (such as sepharose) and cell lysates passed over the column. Proteins binding to the immobilised polypeptide of the invention or a variant or fragment thereof may then be eluted from the column and identified, for example by N-terminal amino acid sequencing.
[0313] Potential modulators of the activity of the polypeptides of the invention may be generated for screening by the above methods by a number of techniques known to those skilled in the art. For example, methods such as X-ray crystallography and nuclear magnetic resonance spectroscopy may be used to model the structure of a polypeptide of the invention or a variant or fragment thereof, thus facilitating the design of potential modulating agents using computer-based modeling. Various forms of combinatorial chemistry may also be used to generate putative modulators.
[0314] Polypeptides of the invention and appropriate variants or fragments thereof can be used in high-throughput screens to assay candidate compounds for the ability to bind to, or otherwise interact therewith. These candidate compounds can be further screened against functional polypeptides to determine the effect of the compound on polypeptide activity.
[0315] The present invention also contemplates compounds which may exert their modulatory effect on polypeptides of the invention by altering expression of the polypeptide. In this case, such compounds may be identified by comparing the level of expression of the polypeptide in the presence of a candidate compound with the level of expression in the absence of the candidate compound.
[0316] It will be appreciated that the methods described above are merely examples of the types of methods that may be utilised to identify agents that are capable of interacting with, or modulating the activity of polypeptides of the invention or variants or fragments thereof. Other suitable methods will be known by persons skilled in the art and are within the scope of this invention.
[0317] Using the methods described above, an agent may be identified that is an agonist of a polypeptide of the invention or a variant or fragment thereof. Agents which are agonists enhance one or more of the biological activities of the polypeptide. Alternatively, the methods described above may identify an agent that is an antagonist of a polypeptide of the invention or a variant or fragment thereof. Agents which are antagonists retard one or more of the biological activities of the polypeptide.
[0318] Antibodies may act as agonists or antagonists of a polypeptide of the invention or a variant or fragment thereof. Preferably suitable antibodies are prepared from discrete regions or fragments of the polypeptides of the invention or variants or fragments thereof. An antigenic portion of a polypeptide of interest may be of any appropriate length, such as from about 5 to about 15 amino acids. Preferably, an antigenic portion contains at least about 5, 6, 7, 8, 9, 10, 11, 12, 13 or 14 amino acid residues.
[0319] Methods for the generation of suitable antibodies will be readily appreciated by those skilled in the art. For example, a monoclonal antibody specific for a polypeptide of the invention or a variant or fragment thereof, typically containing Fab portions, may be prepared using hybridoma technology described in Antibodies-A Laboratory Manual, Harlow and Lane, eds., Cold Spring Harbor Laboratory, N.Y. (1988).
[0320] In essence, in the preparation of monoclonal antibodies directed toward a polypeptide of the invention or a variant or fragment thereof, any technique that provides for the production of antibody molecules by continuous cell lines in culture may be used. These include the hybridoma technique originally developed by Kohler et al., Nature, 256:495-497 (1975), as well as the trioma technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today, 4:72 (1983)), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., in Monoclonal Antibodies and Cancer Therapy, pp. 77-96, Alan R. Liss, Inc., (1985)). Immortal, antibody-producing cell lines can be created by techniques other than fusion, such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. See, for example, M. Schreier et al., "Hybridoma Techniques" Cold Spring Harbor Laboratory, (1980); Hammerling et al., "Monoclonal Antibodies and T-cell Hybridomas" Elsevier/North-Holland Biochemical Press, Amsterdam (1981); and Kennett et al., "Monoclonal Antibodies", Plenum Press (1980).
[0321] In brief, as a means of producing a hybridoma from which the monoclonal antibody is produced, a myeloma or other self-perpetuating cell line is fused with lymphocytes obtained from the spleen of a mammal hyperimmunised with a recognition factor-binding portion thereof, or recognition factor, or an origin-specific DNA-binding portion thereof. Hybridomas producing a monoclonal antibody useful in practicing this invention are identified by their ability to immunoreact with the present recognition factors and their ability to inhibit specified transcriptional activity in target cells.
[0322] A monoclonal antibody useful in practicing the invention can be produced by initiating a monoclonal hybridoma culture comprising a nutrient medium containing a hybridoma that secretes antibody molecules of the appropriate antigen specificity. The culture is maintained under conditions and for a time period sufficient for the hybridoma to secrete the antibody molecules into the medium. The antibody-containing medium is then collected. The antibody molecules can then be further isolated by well-known techniques.
[0323] Similarly, there are various procedures known in the art which may be used for the production of polyclonal antibodies. For the production of polyclonal antibodies against a polypeptide of the invention or a variant or fragment thereof, various host animals can be immunized by injection with a polypeptide of the invention, or a variant or fragment thereof, including but not limited to rabbits, chickens, mice, rats, sheep, goats, etc. Further, the polypeptide variant or fragment thereof can be conjugated to an immunogenic carrier (e.g., bovine serum albumin (BSA) or keyhole limpet hemocyanin (KLH)). Also, various adjuvants may be used to increase the immunological response, including but not limited to Freund's (complete and incomplete), mineral gels such as aluminium hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum.
[0324] Screening for the desired antibody can also be accomplished by a variety of techniques known in the art. Assays for immunospecific binding of antibodies may include, but are not limited to, radioimmunoassays, ELISAs (enzyme-linked immunosorbent assay), sandwich immunoassays, immunoradiometric assays, gel diffusion precipitation reactions, immunodiffusion assays, in situ immunoassays, Western blots, precipitation reactions, agglutination assays, complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, and the like (see, for example, Ausubel et al., Current Protocols in Molecular Biology, Vol. 1, John Wiley & Sons, Inc., New York (1994)). Antibody binding may be detected by virtue of a detectable label on the primary antibody. Alternatively, the antibody may be detected by virtue of its binding with a secondary antibody or reagent which is appropriately labelled. A variety of methods are known in the art for detecting binding in an immunoassay and are included in the scope of the present invention.
[0325] The antibody (or fragment thereof) raised against a polypeptide of the invention or a variant or fragment thereof, has binding affinity for that protein. Preferably, the antibody (or fragment thereof) has binding affinity or avidity greater than about 10⁵ M⁻¹, more preferably greater than about 10⁶ M⁻¹, more preferably still greater than about 10⁷ M⁻¹ and most preferably greater than about 10⁸ M⁻¹.
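By way of illustration only, the affinity thresholds in the preceding paragraph, which are association constants, can be restated as dissociation constants using Kd = 1/Ka. The minimal Python sketch below assumes a simple 1:1 binding model; the function name `ka_to_kd_molar` is hypothetical and forms no part of the disclosure.

```python
def ka_to_kd_molar(ka_per_molar: float) -> float:
    """Convert an association constant (M^-1) to a dissociation constant (M).

    Assumes simple 1:1 binding, for which Kd = 1 / Ka.
    """
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / ka_per_molar


# Thresholds named in the text, from preferred to most preferred.
thresholds = [1e5, 1e6, 1e7, 1e8]
kd_values = [ka_to_kd_molar(ka) for ka in thresholds]

for ka, kd in zip(thresholds, kd_values):
    print(f"Ka > {ka:.0e} M^-1 corresponds to Kd < {kd:.0e} M")
```

Under this assumption, a larger association constant corresponds to a smaller dissociation constant, so the most preferred threshold is also the tightest binding.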
[0326] In terms of obtaining a suitable amount of an antibody according to the present invention, one may manufacture the antibody(s) using batch fermentation with serum free medium. After fermentation the antibody may be purified via a multistep procedure incorporating chromatography and viral inactivation/removal steps. For instance, the antibody may be first separated by Protein A affinity chromatography and then treated with solvent/detergent to inactivate any lipid enveloped viruses. Further purification, typically by anion and cation exchange chromatography may be used to remove residual proteins, solvents/detergents and nucleic acids. The purified antibody may be further purified and formulated into 0.9% saline using gel filtration columns. The formulated bulk preparation may then be sterilised and viral filtered and dispensed.
[0327] Embodiments of the invention may utilise antisense technology to inhibit the expression of a nucleic acid of the invention or a fragment or variant thereof by blocking translation of the encoded polypeptide. Antisense technology takes advantage of the fact that nucleic acids pair with complementary sequences. Suitable antisense molecules can be manufactured by chemical synthesis or, in the case of antisense RNA, by transcription in vitro or in vivo when linked to a promoter, by methods known to those skilled in the art.
[0328] For example, antisense oligonucleotides, typically of 18-30 nucleotides in length, may be generated which are at least substantially complementary across their length to a region of the nucleotide sequence of the polynucleotide of interest. Binding of the antisense oligonucleotides to their complementary cellular nucleotide sequences may interfere with transcription, RNA processing, transport, translation and/or mRNA stability. Suitable antisense oligonucleotides may be prepared by methods well known to those of skill in the art and may be designed to target and bind to regulatory regions of the nucleotide sequence or to coding (gene) or non-coding (intergenic region) sequences. Typically antisense oligonucleotides will be synthesized on automated synthesizers. Suitable antisense oligonucleotides may include modifications designed to improve their delivery into cells, their stability once inside a cell, and/or their binding to the appropriate target. For example, the antisense oligonucleotide may be modified by the addition of one or more phosphorothioate linkages, or the inclusion of one or more morpholine rings into the backbone (so-called `morpholino` oligonucleotides).
[0329] An alternative antisense technology, known as RNA interference (RNAi), may be used, according to known methods in the art (see for example WO 99/49029 and WO 01/70949), to inhibit the expression of a polynucleotide. RNAi refers to a means of selective post-transcriptional gene silencing by destruction of specific mRNA by small interfering RNA molecules (siRNA). The siRNA is generated by cleavage of double stranded RNA, where one strand is identical to the message to be inactivated. Double-stranded RNA molecules may be synthesised in which one strand is identical to a specific region of the target mRNA transcript and introduced directly. Alternatively corresponding dsDNA can be employed, which, once presented intracellularly, is converted into dsRNA. Methods for the synthesis of suitable molecules for use in RNAi and for achieving post-transcriptional gene silencing are known to those of skill in the art.
[0330] A further means of inhibiting expression may be achieved by introducing catalytic antisense nucleic acid constructs, such as ribozymes, which are capable of cleaving mRNA transcripts and thereby preventing the production of wild type protein. Ribozymes are targeted to and anneal with a particular sequence by virtue of two regions of sequence complementarity to the target flanking the ribozyme catalytic site. After binding, the ribozyme cleaves the target in a site-specific manner. The design and testing of ribozymes which specifically recognise and cleave sequences of interest can be achieved by techniques well known to those in the art (see for example Lieber and Strauss, 1995, Molecular and Cellular Biology, 15:540-551).
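By way of illustration only, an antisense oligonucleotide that is fully complementary to a chosen target region is the reverse complement of that region. The minimal Python sketch below assumes an unmodified DNA oligonucleotide and an unambiguous A/C/G/T target; the function name `antisense_oligo` and the toy target sequence are hypothetical and are not taken from any SEQ ID NO disclosed herein.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_oligo(target_dna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo, written 5'->3', that is fully complementary to
    target_dna[start:start + length] (0-based start).

    The 18-30 nt bound mirrors the typical length range stated in the text.
    """
    if not 18 <= length <= 30:
        raise ValueError("length is outside the typical 18-30 nt range")
    region = target_dna.upper()[start:start + length]
    if len(region) != length:
        raise ValueError("target region runs past the end of the sequence")
    # Reverse the region, then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(region))


# Made-up 30 nt target used purely for demonstration.
toy_target = "ATGGCTAGCTTGACCGATCGATTGCAAGCT"
oligo = antisense_oligo(toy_target, start=0, length=20)
print(oligo)
```

Backbone modifications such as phosphorothioate linkages or morpholine rings are not modelled here; the sketch only addresses base complementarity.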
[0331] The invention will now be described with reference to specific examples, which should not be construed as in any way limiting the scope of the invention.
EXAMPLES
Example 1
Cyanobacterial Cultures and Characterisation of the SXT Gene Cluster
[0333] Cyanobacterial strains used in the present study (FIG. 1) were grown in Jaworski medium in static batch culture at 26° C. under continuous illumination (10 μmol m⁻² s⁻¹). Total genomic DNA was extracted from cyanobacterial cells by lysozyme/SDS/proteinase K lysis followed by phenol-chloroform extraction as described in Neilan, B. A. 1995. Appl Environ Microbiol 61:2286-2291. DNA in the supernatant was precipitated with 2 volumes of -20° C. ethanol, washed with 70% ethanol, dissolved in TE-buffer (10:1), and stored at -20° C. PCR primer sequences used for the amplification of sxt ORFs are shown in FIG. 1B.
[0334] PCR amplicons were separated by agarose gel electrophoresis in TAE buffer (40 mM Tris-acetate, 1 mM EDTA, pH 7.8), and visualised by UV transillumination after staining in ethidium bromide (0.5 μg/ml). Sequencing of unknown regions of DNA was performed by adaptor-mediated PCR as described in Moffitt et al. (2004) Appl. Environ. Microbiol. 70:6353-6362. Automated DNA sequencing was performed using the PRISM Big Dye cycle sequencing system and a model 373 sequencer (Applied Biosystems). Sequence data were analysed using ABI Prism-Autoassembler software, and percentage similarity and identity to other translated sequences determined using BLAST in conjunction with the National Center for Biotechnology Information (NIH). Fugue blast (http://www-cryst.bioc.cam.ac.uk/fugue/) was used to identify distant homologs via sequence-structure comparisons. The sxt gene clusters were assembled using the software Phred, Phrap, and Consed (http://www.phrap.org/phredphrapconsed.html), and open reading frames manually identified. The GenBank accession number for the sxt gene cluster from C. raciborskii T3 is DQ787200.
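The open reading frames in this work were identified manually. Purely for illustration of what such an identification involves, the minimal Python sketch below scans the three forward reading frames of a DNA string for ATG-to-stop spans. It assumes ATG as the only start codon and ignores the reverse strand, both of which are simplifications; the function name `find_orfs` and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3):
    """Return sorted (start, end) spans, 0-based and half-open, of ORFs that
    run from an ATG to the first in-frame stop codon on the forward strand.

    min_codons counts the start and stop codons as well.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the ORF and look for the next ATG
    return sorted(orfs)


# Made-up sequence containing one short ORF in the third reading frame.
toy_dna = "CCATGAAATTTTAGGG"
toy_orfs = find_orfs(toy_dna)
print(toy_orfs, [toy_dna[s:e] for s, e in toy_orfs])
```

A practical analysis would also consider alternative start codons and the complementary strand, and would weigh each candidate against homology evidence as described in the Examples.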
Example 2
Mass Spectrometric Analysis of SXT Intermediates
[0335] Bacterial extracts and SXT standards were analysed by HPLC (Thermo Finnigan Surveyor HPLC and autosampler) coupled to an ion trap mass spectrometer (Thermo Finnigan LCQ Deca XP Plus) fitted with an electrospray source. Separation of analytes was obtained on a 2.1 mm×150 mm Phenomenex Luna 3 micron C18 column at 100 μL/min. Analysis was performed using a gradient starting at 5% acetonitrile in 10 mM heptafluorobutyric acid (HFBA). This was maintained for 10 min, then ramped to 100% acetonitrile over 30 min. Conditions were held at 100% acetonitrile for 10 min to wash the column and then returned to 5% acetonitrile in 10 mM HFBA and again held for 10 min to equilibrate the column for the next sample. This resulted in a runtime of 60 min per sample. Sample volumes of 10-100 μL were injected for each analysis. The HPLC eluate directly entered the electrospray source, which was programmed as follows: electrospray voltage 5 kV, sheath gas flow rate 30 arbitrary units, auxiliary gas flow rate 5 arbitrary units. The capillary temperature was 200° C. and had a voltage of 47 V. Ion optics were optimised for maximum sensitivity before sample analysis using the instrument's autotune function with a standard toxin solution. Mass spectra were acquired in the centroid mode over the m/z range 145-650. Mass range setting was `normal`, with 200 ms maximum ion injection time and automatic gain control (AGC) on. Tandem mass spectra were obtained over a m/z range relevant to the precursor ion. Collision energy was typically 20-30 ThermoFinnigan arbitrary units, and was optimised for maximal information using standards where available.
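By way of illustration only, the gradient programme described above can be written as a list of timed steps, from which the 60 min runtime and the solvent composition at any time point follow. The minimal Python sketch below assumes that the ramp is linear and that the return to 5% acetonitrile is instantaneous, neither of which is stated in the text; the names `GRADIENT_STEPS`, `total_runtime_min` and `percent_acetonitrile` are hypothetical.

```python
# Each step is (duration in min, starting % acetonitrile, ending % acetonitrile).
GRADIENT_STEPS = [
    (10.0, 5.0, 5.0),      # initial hold at 5% acetonitrile in 10 mM HFBA
    (30.0, 5.0, 100.0),    # ramp to 100% (linear shape is an assumption)
    (10.0, 100.0, 100.0),  # hold at 100% to wash the column
    (10.0, 5.0, 5.0),      # re-equilibration (instant return is an assumption)
]


def total_runtime_min(steps=GRADIENT_STEPS) -> float:
    """Sum the step durations to give the runtime per sample."""
    return sum(duration for duration, _, _ in steps)


def percent_acetonitrile(t_min: float, steps=GRADIENT_STEPS) -> float:
    """Return the programmed % acetonitrile at t_min minutes into the run."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start_pct, end_pct in steps:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start_pct + fraction * (end_pct - start_pct)
        elapsed += duration
    raise ValueError("time is beyond the end of the run")


print(total_runtime_min())
print(percent_acetonitrile(25.0))
```

The step durations sum to the 60 min runtime stated in the text, which serves as a consistency check on the programme as transcribed.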
Example 3
Identification and Sequencing of the SXT Gene Cluster in Cylindrospermopsis raciborskii T3
[0336] O-carbamoyltransferase was initially detected in C. raciborskii T3 via degenerate PCR, and later named sxtI. Further investigation showed that homologues of sxtI were exclusively present in SXT toxin-producing strains of four cyanobacterial genera (Table 1), thus representing a good candidate gene in SXT toxin biosynthesis. The sequence of the complete putative SXT biosynthetic gene cluster (sxt) was then obtained by genome walking up- and downstream of sxtI in C. raciborskii T3 (FIG. 3). In C. raciborskii T3, this sxt gene cluster spans approximately 35000 bp, encoding 31 open reading frames (FIG. 2). The cluster also included other genes encoding SXT-biosynthesis enzymes, including a methyltransferase (sxtA1), a class II aminotransferase (sxtA4), an amidinotransferase (sxtG), and dioxygenases (sxtH), in addition to the O-carbamoyltransferase (sxtI). PCR screening of selected sxt open reading frames in toxic and non-toxic cyanobacteria strains showed that they were exclusively present in SXT toxin-producing isolates (FIG. 1A), indicating the association of these genes with the toxic phenotype. In the following passages we describe the open reading frames in the putative sxt gene cluster and their predicted functions, based on bioinformatic analysis, LC-MS/MS data on biosynthetic intermediates and in vitro biosynthesis, when applicable.
Example 4
Functional Prediction of the Parent Molecule SXT Biosynthetic Genes
[0337] Bioinformatic analysis of the sxt gene cluster revealed that it contains a previously undescribed example of a polyketide synthase (PKS)-like structure, named sxtA. SxtA possesses four catalytic domains, SxtA1 to SxtA4. An iterated PSI-BLAST search revealed low sequence homology of SxtA1 to S-adenosylmethionine (SAM)-dependent methyltransferases. Further analysis revealed the presence of three conserved sequence motifs in SxtA1 (278-ITDMGCGDG-286, 359-DPENLLHI-366, and 424-VVNKHGLMIL-433) that are specific for SAM-dependent methyltransferases. SxtA2 is related to GCN5-related N-acetyl transferases (GNAT). GNATs catalyse the transfer of acetate from acetyl-CoA to various heteroatoms, and have been reported in association with other unconventional PKSs, such as PedI, where they load the acyl carrier protein (ACP) with acetate. SxtA3 is related to an ACP, and provides a phosphopantetheinyl-attachment site. SxtA4 is homologous to class II aminotransferases and was most similar to 8-amino-7-oxononanoate synthase (AONS). Class II aminotransferases are a monophyletic group of pyridoxal phosphate (PLP)-dependent enzymes, and the only enzymes that are known to perform Claisen-condensations of amino acids. We therefore reasoned that SxtA performs the first step in SXT biosynthesis, involving a Claisen-condensation.
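The motif annotations above give 1-based start and end residue numbers. The sketch below checks that each span matches its motif length and shows how an exact-match scan could locate such a motif in a protein string. It is a minimal illustration: the function names are assumptions introduced here, and the toy sequence is made up and is not the SxtA sequence.

```python
# Motifs reported for SxtA1 in [0337]: (start, sequence, end), 1-based inclusive.
SXTA1_MOTIFS = [
    (278, "ITDMGCGDG", 286),
    (359, "DPENLLHI", 366),
    (424, "VVNKHGLMIL", 433),
]


def coordinates_consistent(start, motif, end):
    """True if the 1-based inclusive span has the same length as the motif."""
    return end - start + 1 == len(motif)


def find_motif(protein, motif):
    """Return the 1-based (start, end) of the first exact match, or None."""
    index = protein.find(motif)
    if index < 0:
        return None
    return index + 1, index + len(motif)


if __name__ == "__main__":
    for start, motif, end in SXTA1_MOTIFS:
        print(motif, coordinates_consistent(start, motif, end))
    # Toy sequence (hypothetical, NOT SxtA) used only to demonstrate the scan.
    toy = "MKK" + "ITDMGCGDG" + "AAA"
    print(find_motif(toy, "ITDMGCGDG"))
```

Exact string matching is the simplest possible choice; conserved motifs in real homologues usually tolerate substitutions, so a profile-based search would be more appropriate in practice.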
[0338] The predicted reaction sequence of SxtA, based on its primary structure, is the loading of the ACP (SxtA3) with acetate from acetyl-CoA, followed by the SxtA1-catalysed methylation of acetyl-ACP, converting it to propionyl-ACP. The class II aminotransferase domain, SxtA4, would then perform a Claisen-condensation between propionyl-ACP and arginine (FIG. 4). The putative product of SxtA is thus 4-amino-3-oxoguanidinoheptane, which is here designated as compound A' (FIG. 4). To verify this pathway for SXT biosynthesis based on comparative gene sequence analysis, cell extracts of C. raciborskii T3 were screened by LC-MS/MS for the presence of compound A' (FIG. 5) as well as arginine and SXT as controls. Arginine and SXT were readily detected (FIG. 5) and produced the expected fragment ions. In addition, LC-MS/MS data obtained from m/z 187 were consistent with the presence of structure A' in C. raciborskii T3 (FIG. 5). MS/MS spectra showed the expected fragment ions (m/z 170 and m/z 128) after the loss of ammonia and guanidine from A'. The LC-MS/MS data strongly supported the predicted function of SxtA and thus a revised initiating reaction in the SXT biosynthesis pathway.
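The fragment ions quoted for the m/z 187 precursor follow from simple neutral-loss arithmetic. The sketch below reproduces them using nominal (integer) masses and assumes a singly charged precursor; the dictionary and function names are illustrative assumptions, not part of any instrument software.

```python
# Nominal masses of the neutral losses named in [0338] (C = 12, H = 1, N = 14).
NEUTRAL_LOSSES = {
    "ammonia (NH3)": 14 + 3 * 1,               # 17
    "guanidine (CH5N3)": 12 + 5 * 1 + 3 * 14,  # 59
}


def expected_fragments(precursor_mz, losses=NEUTRAL_LOSSES):
    """Map each neutral loss to the nominal m/z of the singly charged fragment."""
    return {name: precursor_mz - mass for name, mass in losses.items()}


if __name__ == "__main__":
    for name, mz in expected_fragments(187).items():
        print(name, mz)
```

Loss of ammonia from m/z 187 gives m/z 170, and loss of guanidine gives m/z 128, matching the values reported in the MS/MS spectra.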
[0339] sxtG encodes a putative amidinotransferase, which had the highest amino acid sequence similarity to L-arginine:lysine amidinotransferases. It is proposed that the product of SxtA is the substrate for the amidinotransferase SxtG, which transfers an amidino group from arginine to the α-amino group of A' (FIG. 4), thus producing 4,7-diguanidino-3-oxoheptane, designated compound B' (FIG. 3). This hypothetical sequence of reactions was also supported by the detection of C' by LC-MS/MS (FIG. 4). Cell extracts from C. raciborskii T3, however, did not contain any measurable levels of B' (4,7-diguanidino-3-oxoheptane). A likely explanation for the failure to detect the intermediate B' is its rapid cyclisation to form C' via the action of SxtB.
[0340] The sxt gene cluster encodes an enzyme, SxtB, similar to the cytidine deaminase-like enzymes from γ-proteobacteria. The catalytic mechanism of cytidine deaminase is a retro-aldol cleavage of ammonia from cytidine, which is the same reaction mechanism in the reverse direction as the formation of the first heterocycle in the conversion from B' to C' (FIG. 4). It is therefore suggested that SxtB catalyses this retro-aldol-like condensation (step 4, FIG. 4).
[0341] The incorporation of methionine methyl into SXT and its subsequent hydroxylation were studied. Only one methionine methyl-derived hydrogen is retained in SXT, and a 1,2-H shift has been observed between acetate-derived C-5 and C-6 of SXT. Hydroxylation of the methyl side-chain of the SXT precursor proceeds via epoxidation of a double bond between the SAM-derived methyl group and the acetate-derived C-6. This incorporation pattern may result from an electrophilic attack of methionine methyl on the double bond between C-5 and C-6, which would have formed during the preceding cyclisation. Subsequently, the new methylene side-chain would be epoxidated, followed by opening to an aldehyde, and subsequent reduction to a hydroxyl. Retention of only one methionine methyl-derived hydrogen, the 1,2-H shift between C-5 and C-6, and the absence of a 1,2-H shift between C-1 and C-5 are entirely consistent with the results of this study, whereby the introduction of methionine methyl precedes the formation of the three heterocycles.
[0342] sxtD encodes an enzyme with sequence similarity to sterol desaturase and is the only candidate desaturase present in the sxt gene cluster. SxtD is predicted to introduce a double bond between C-1 and C-5 of C', and cause a 1,2-H shift between C-5 and C-6 (compound D', FIG. 3). The gene product of sxtS has sequence homology to non-heme iron 2-oxoglutarate-dependent (2OG) dioxygenases. These are multifunctional enzymes that can perform hydroxylation, epoxidation, desaturation, cyclisation, and expansion reactions. 2OG dioxygenases have been reported to catalyse the oxidative formation of heterocycles. SxtS could therefore perform the consecutive epoxidation of the new double bond, and opening of the epoxide to an aldehyde with concomitant bicyclisation. This explains the retention of only one methionine methyl-derived hydrogen, and the lack of a 1,2-H shift between C-1 and C-5 of SXT (steps 5 to 7, FIG. 4). SxtU has sequence similarity to short-chain alcohol dehydrogenases. The most similar enzyme with a known function is clavaldehyde dehydrogenase (AAF86624), which reduces the terminal aldehyde of clavulanate-9-aldehyde to an alcohol. SxtU is therefore predicted to reduce the terminal aldehyde group of the SXT precursor in step 8 (FIG. 4), forming compound E'.
[0343] The concerted action of SxtD, SxtS and SxtU is therefore the hydroxylation and bicyclisation of compound C' to E' (FIG. 4). In support of this proposed pathway of SXT biosynthesis, LC-MS/MS data obtained from m/z 211 and m/z 225 allowed the detection of compounds C' and E' from C. raciborskii T3 (FIG. 5). On the other hand, no evidence could be found by LC-MS/MS for intermediates B (m/z 216) and C (m/z 198). MS/MS spectra showed the expected fragment ions after the loss of ammonia and guanidine from C', as well as the loss of water in the case of E'.
[0344] The detection of E' indicated that the final reactions leading to the complete SXT molecule are the O-carbamoylation of its free hydroxyl group and an oxidation of C-12. The actual sequence of these final reactions, however, remains uncertain. The gene product of sxtI is most similar to a predicted O-carbamoyltransferase from Trichodesmium erythraeum (accession ABG50968) and other predicted O-carbamoyltransferases from cyanobacteria. O-carbamoyltransferases invariably transfer a carbamoyl group from carbamoylphosphate to a free hydroxyl group. Our data indicate that SxtI may catalyse the transfer of a carbamoyl group from carbamoylphosphate to the free hydroxyl group of E'. Homologues of sxtJ and sxtK with a known function were not found in the databases; however, it was noted that sxtJ and sxtK homologues were often encoded adjacent to O-carbamoyltransferase genes.
[0345] The sxt gene cluster contains two genes, sxtH and sxtT, each encoding a terminal oxygenase subunit of bacterial phenylpropionate and related ring-hydroxylating dioxygenases. The closest homologue with a predicted function was capreomycidine hydroxylase from Streptomyces vinaceus, which hydroxylates a ring carbon (C-6) of capreomycidine. SxtH and SxtT may therefore perform a similar function in SXT biosynthesis, that is, the oxidation or hydroxylation and oxidation of C-12, converting F' into SXT.
[0346] Members belonging to bacterial phenylpropionate and related ring-hydroxylating dioxygenases are multi-component enzymes, as they require an oxygenase reductase for their regeneration after each catalytic cycle. The sxt gene cluster provides a putative electron transport system, which would fulfill this function. sxtV encodes a 4Fe-4S ferredoxin with high sequence homology to a ferredoxin from Nostoc punctiforme. sxtW was most similar to fumarate reductase/succinate dehydrogenase-like enzymes from A. variabilis and Nostoc punctiforme, followed by AsfA from Pseudomonas putida. AsfA and AsfB are enzymes involved in the transport of electrons resulting from the catabolism of aryl sulfonates. SxtV could putatively extract an electron pair from succinate, converting it to fumarate, and then transfer the electrons via ferredoxin (SxtW) to SxtH and SxtT.
Example 5
Comparative Sequence Analysis and Functional Assignment of SXT Tailoring Genes
[0347] Following synthesis of the parent molecule SXT, modifying enzymes introduce various functional groups. In addition to SXT, C. raciborskii T3 produces N-1 hydroxylated (neoSXT), decarbamoylated (dcSXT), and N-sulfurylated (GTX-5) toxins, whereas A. circinalis AWQC131C produces decarbamoylated (dcSXT), O-sulfurylated (GTX-3/2, dcGTX-3/2), as well as both O- and N-sulfurylated toxins (C-1/2), but no N-1 hydroxylated toxins.
[0348] sxtX encodes an enzyme with homology to cephalosporin hydroxylase. sxtX was only detected in C. raciborskii T3, A. flos-aquae NH-5, and Lyngbya wollei, which produce N-1 hydroxylated analogues of SXT, such as neoSXT. This component of the gene cluster was not present in any strain of A. circinalis, which is probably the reason why this species does not produce N-1 hydroxylated PSP toxins (FIG. 1A). The predicted function of SxtX is therefore the N-1 hydroxylation of SXT.
[0349] A. circinalis AWQC131C and C. raciborskii T3 also produce N- and O-sulfated analogues of SXT (GTX-5, C-2/3, (dc)GTX-3/4). The activity of two 3'-phosphate 5'-phosphosulfate (PAPS)-dependent sulfotransferases, which were specific for the N-21 of SXT and GTX-3/2, and O-22 of 11-hydroxy SXT, respectively, has been described from the SXT toxin-producing dinoflagellate Gymnodinium catenatum. The sxt gene cluster from C. raciborskii T3 encodes a putative sulfotransferase, SxtN. A PSI-BLAST search with SxtN identified only 25 hypothetical proteins of unknown function with an E value above the threshold (0.005). A profile library search, however, revealed significant structural relatedness of SxtN to estrogen sulfotransferase (1AQU) (Z-score=24.02) and other sulfotransferases. SxtN has a conserved N-terminal region, which corresponds to the adenosine 3'-phosphate 5'-phosphosulfate (PAPS) binding region in 1AQU. It is not known, however, whether SxtN transfers a sulfate group to N-21 or O-22. Interestingly, the sxt gene cluster encodes an adenylylsulfate kinase (APSK), SxtO, homologues of which are involved in the formation of PAPS (FIG. 2). APSK phosphorylates the product of ATP-sulfurylase, adenylylsulfate, converting it to PAPS. Other biosynthetic gene clusters that result in sulfated secondary metabolites also contain genes required for the production of PAPS.
[0350] Decarbamoylated analogues of SXT could be produced via either of two hypothetical scenarios. Enzymes that act downstream of the carbamoyltransferase, SxtI, in the biosynthesis of PSP toxins are proposed to have broad substrate specificity, processing both carbamoylated and decarbamoylated precursors of SXT. Alternatively, hydrolytic cleavage of the carbamoyl moiety from SXT or its precursors may occur. SxtL is related to GDSL-lipases, which are multifunctional enzymes with thioesterase, arylesterase, protease and lysophospholipase activities. The function of SxtL could therefore include the hydrolytic cleavage of the carbamoyl group from SXT analogues.
Example 6
Cluster-Associated SXT Genes Involved in Metabolite Transport
[0351] sxtF and sxtM encode two proteins with high sequence similarity to sodium-driven multidrug and toxic compound extrusion (MATE) proteins of the NorM family. Members of the NorM family of MATE proteins are bacterial sodium-driven antiporters that export cationic substances. All of the PSP toxins are cationic substances, except for the C-toxins, which are zwitterionic. It is therefore probable that SxtF and SxtM are also involved in the export of PSP toxins. A mutational study of NorM from V. parahaemolyticus identified three conserved negatively charged residues (D32, E251, and D367) that confer substrate specificity; however, the mechanism of substrate recognition remains unknown. In SxtF, the residue corresponding to E251 of NorM is conserved, whereas those corresponding to D32 and D367 are replaced by the neutral amino acids asparagine and tyrosine, respectively. Residues corresponding to D32 and E251 are conserved in SxtM, but D367 is replaced by histidine. The changes in substrate-binding residues may reflect the differences in PSP toxin substrates transported by these proteins.
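The residue comparison described above can be tabulated and queried as follows. This is a minimal sketch that assumes "conserved" means the identical residue is present at the aligned position; the dictionary layout and function names are illustrative assumptions, and no alignment is performed here.

```python
# Residues at the three NorM positions as described in [0351] (one-letter codes).
NORM_POSITIONS = ("D32", "E251", "D367")
RESIDUES = {
    "NorM": {"D32": "D", "E251": "E", "D367": "D"},
    "SxtF": {"D32": "N", "E251": "E", "D367": "Y"},  # Asn and Tyr replace D32, D367
    "SxtM": {"D32": "D", "E251": "E", "D367": "H"},  # His replaces D367
}


def conserved_positions(protein):
    """NorM positions at which the protein carries the same residue as NorM."""
    return [pos for pos in NORM_POSITIONS
            if RESIDUES[protein][pos] == RESIDUES["NorM"][pos]]


def substitutions(protein):
    """NorM positions at which the protein differs, mapped to the replacing residue."""
    return {pos: RESIDUES[protein][pos] for pos in NORM_POSITIONS
            if RESIDUES[protein][pos] != RESIDUES["NorM"][pos]}


if __name__ == "__main__":
    for name in ("SxtF", "SxtM"):
        print(name, conserved_positions(name), substitutions(name))
```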
Example 7
Putative Transcriptional Regulators of Saxitoxin Synthase
[0352] Environmental factors, such as nitrogen and phosphate availability, have been reported to regulate the production of PSP toxins in dinoflagellates and cyanobacteria. Two transcriptional factors, sxtY and sxtZ, related to PhoU and OmpR, respectively, as well as a two-component regulator histidine kinase, were identified proximal to the 3'-end of the sxt gene cluster in C. raciborskii T3. PhoU-related proteins are negative regulators of phosphate uptake, whereas OmpR-like proteins are involved in the regulation of a variety of metabolisms, including nitrogen and osmotic balance. It is therefore likely that PSP toxin production in C. raciborskii T3 is regulated at the transcriptional level in response to the availability of phosphate, as well as other environmental factors.
Example 8
Phylogenetic Origins of the SXT Genes
[0353] The sxt gene cluster from C. raciborskii T3 has a true mosaic structure. Approximately half of the sxt genes of C. raciborskii T3 were most similar to counterparts from other cyanobacteria; however, the remaining genes had their closest matches with homologues from proteobacteria, actinomycetes, sphingobacteria, and firmicutes. There is an increasing body of evidence that horizontal gene transfer (HGT) is a major driving force behind the evolution of prokaryotic genomes, and cyanobacterial genomes are known to be greatly affected by HGT, often involving transposases and phages. The fact that the majority of sxt genes are most closely related to homologues from other cyanobacteria suggests that SXT biosynthesis may have evolved in an ancestral cyanobacterium that successively acquired the remaining genes from other bacteria via HGT. The structural organisation of the investigated sxt gene cluster, as well as the presence of several transposases related to the IS4-family, suggests that small cassettes of sxt genes are mobile.
Example 9
Cyanobacterial Cultures and Characterisation of the CYR Gene Cluster
[0354] Cyanobacterial strains were grown in Jaworski medium as described in Example 1 above. Total genomic DNA was extracted from cyanobacterial cells by lysozyme/SDS/proteinase K lysis followed by phenol-chloroform extraction as described previously (Neilan, B. A. (1995) Appl. Environ. Microbiol. 61:2286-2291). DNA in the supernatant was precipitated with 2 volumes of -20° C. ethanol, washed with 70% ethanol, dissolved in TE-buffer (10:1), and stored at -20° C.
[0355] Characterization of unknown regions of DNA flanking the putative cylindrospermopsin biosynthesis genes was performed using an adaptor-mediated PCR as described in Moffitt et al. (2004) Appl. Environ. Microbiol. 70:6353-6362. PCRs were performed in 20 μl reaction volumes containing 1×Taq polymerase buffer, 2.5 mM MgCl2, 0.2 mM deoxynucleotide triphosphates, 10 pmol each of the forward and reverse primers, between 10 and 100 ng genomic DNA and 0.2 U of Taq polymerase (Fischer Biotech, Australia). Thermal cycling was performed in a GeneAmp PCR System 2400 Thermal cycler (Perkin Elmer Corporation, Norwalk, Conn.). Cycling began with a denaturing step at 94° C. for 3 min followed by 30 cycles of denaturation at 94° C. for 10 s, primer annealing between 55° C. and 65° C. for 20 s and a DNA strand extension at 72° C. for 1-3 min. Amplification was completed by a final extension step at 72° C. for 7 min. Amplified DNA was separated by agarose gel electrophoresis in TAE buffer (40 mM Tris-acetate, 1 mM EDTA, pH 7.8), and visualized by UV transillumination after staining with ethidium bromide (0.5 μg/ml).
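To give a feel for the length of the cycling programme, the programmed hold times can be summed. The sketch below is an illustrative estimate only: the function and parameter names are assumptions introduced here, and the temperature ramping time of the thermal cycler, which the text does not state, is ignored.

```python
def programmed_time_s(extension_s, cycles=30, initial_denature_s=180,
                      denature_s=10, anneal_s=20, final_extension_s=420):
    """Total programmed hold time in seconds, ignoring temperature ramping."""
    per_cycle = denature_s + anneal_s + extension_s
    return initial_denature_s + cycles * per_cycle + final_extension_s


if __name__ == "__main__":
    # Extension time ranged from 1 to 3 min in the protocol above.
    for extension_min in (1, 3):
        minutes = programmed_time_s(extension_min * 60) / 60
        print(extension_min, "min extension:", minutes, "min programmed")
```

With a 1 min extension the programmed holds total 55 min, and with a 3 min extension they total 115 min; actual run times would be longer once ramping is included.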
[0356] Automated DNA sequencing was performed using the PRISM Big Dye cycle sequencing system and a model 373 sequencer (Applied Biosystems, Foster City, Calif.). Sequence data were analyzed using ABI Prism-Autoassembler software, while identity/similarity values to other translated sequences were determined using BLAST in conjunction with the National Center for Biotechnology Information (NIH, Bethesda, Md.). Fugue blast (http://www-cryst.bioc.cam.ac.uk/fugue/) was used to identify distant homologs via sequence-structure comparisons. The gene clusters were assembled using the software Phred, Phrap, and Consed (http://www.phrap.org/phredphrapconsed.html), and open reading frames were manually identified. Polyketide synthase and non-ribosomal peptide synthetase domains were determined using the specialized databases based on crystal structures (http://www-ab.informatik.uni-tuebingen.de/software/NRPSpredictor; http://www.tigr.org/jravel/nrps/, http://www.nii.res.in/nrps-pks.html).
Example 10
Genetic Screening of Cylindrospermopsin-Producing and Non-Producing Cyanobacterial Strains
[0357] Cylindrospermopsin-producing and non-producing cyanobacterial strains were screened for the presence of the sulfotransferase gene cyrJ using the primer set cynsulfF (5' ACTTCTCTCCTTTCCCTATC 3') (SEQ ID NO: 111) and cylnamR (5' GAGTGAAAATGCGTAGAACTTG 3') (SEQ ID NO: 112). Genomic DNA was tested for positive amplification using the 16S rRNA gene primers 27F and 809 as described in Neilan et al. (1997) Int. J. Syst. Bacteriol. 47:693-697. Amplicons were sequenced, as described in Example 9 above, to verify the identity of the gene fragment.
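Basic properties of the two cyrJ screening primers can be computed directly from the sequences given above. The sketch below reports length, GC content and a rule-of-thumb melting temperature using the Wallace rule, Tm = 2(A+T) + 4(G+C). The Wallace rule is an assumption chosen here for simplicity; it is only a rough guide for short oligonucleotides and is not stated in the source as the method used to design these primers.

```python
# Primer sequences (5' to 3') as given in [0357].
PRIMERS = {
    "cynsulfF": "ACTTCTCTCCTTTCCCTATC",
    "cylnamR": "GAGTGAAAATGCGTAGAACTTG",
}


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb Tm in deg C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), round(gc_percent(seq), 1), wallace_tm(seq))
```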
[0358] The biosynthesis of cylindrospermopsin involves an amidinotransferase, an NRPS, and a PKS (AoaA, AoaB and AoaC, respectively). In order to obtain the entire sequence of the cylindrospermopsin biosynthesis gene cluster, we used adaptor-mediated `gene-walking` technology, initiating the process from a partial sequence of the amidinotransferase gene from C. raciborskii AWT205. Successive outward facing primers were designed and the entire gene cluster spanning 43 kb was sequenced, together with a further 3.5 kb on either side of the toxin gene cluster.
[0359] These flanking regions encode putative accessory genes (hyp genes), which include molecular chaperones involved in the maturation of hydrogenases. Because these genes flank the cylindrospermopsin gene cluster at both ends, we postulate that the toxin gene cluster was inserted into this area of the genome, thus interrupting the hyp gene cluster. This genetic rearrangement is mechanistically supported by the presence of transposase-like sequences within the cylindrospermopsin cluster.
[0360] Bioinformatic analysis of the toxin gene cluster was based on gene function inference using sequence alignments (NCBI BLAST), predicted structural homologies (Fugue Blast), and analysis of PKS and NRPS domains using specialized blast servers based on crystal structures. The cylindrospermopsin biosynthesis cluster contains 15 ORFs, which encode all the functions required for the biosynthesis, regulation and export of the toxin cylindrospermopsin (FIG. 6).
Example 11
Formation of the CYR Carbon Skeleton
[0361] The first step in formation of the carbon skeleton of cylindrospermopsin involves the synthesis of guanidinoacetate via transamidination of glycine. CyrA, the AoaA homolog, which encodes an amidinotransferase similar to the human arginine:glycine amidinotransferase GATM, transfers a guanidino group from a donor molecule, most likely arginine, onto an acceptor molecule of glycine thus forming guanidinoacetate (FIG. 8, step 1).
[0362] The next step (FIG. 8, step 2) in the biosynthesis is carried out by CyrB (AoaB homolog), a mixed NRPS-PKS. CyrB spans 8.7 kb and encodes the following domains: an adenylation domain (A domain) and a peptidyl carrier protein (PCP) of an NRPS, followed by a β-ketosynthase domain (KS), acyltransferase domain (AT), dehydratase domain (DH), methyltransferase domain (MT), ketoreductase domain (KR), and an acyl carrier protein (ACP) of PKS origin. CyrB therefore must catalyse the second reaction, since it is the only gene containing an A domain that could recruit a starter unit for subsequent PKS extensions. The specific amino acid activated by the CyrB A domain cannot be predicted, as its substrate specificity conferring residues do not match any in the available databases (http://www-ab.informatik.uni-tuebingen.de/software/NRPSpredictor; http://www.tigr.org/jravel/nrps/, http://www.nii.res.in/nrps-pks.html). So far, no other NRPS has been described that utilizes guanidinoacetate as a substrate. The A domain is thought to activate guanidinoacetate, which is then transferred via the swinging arm of the peptidyl carrier protein (PCP) to the KS domain. The AT domain activates malonyl-CoA and attaches it to the ACP. This is followed by a condensation reaction between the activated guanidinoacetate and malonyl-CoA in the KS domain. CyrB contains two reducing domains, KR and DH. Their concerted reaction reduces the keto group to a hydroxyl followed by elimination of H2O, resulting in a double bond between C13 and C14. The methyltransferase (MT) domain identified in CyrB via the NRPS/PKS databases (Example 9 above) is homologous to S-adenosylmethionine (SAM)-dependent MTs. It is therefore suggested that the MT methylates C13. It is proposed that a nucleophilic attack of the amidino group at N19 onto the newly formed double bond between C13 and C14 occurs via a `Michael addition`. The cyclization follows Baldwin's rules for ring closure (Baldwin et al. (1977) J. Org. Chem. 42:3846-3852), resulting in the formation of the first ring in cylindrospermopsin. This reaction could be spontaneous and may not require enzymatic catalysis, as it is energetically favourable. This is the first of three ring formations.
[0363] The third step (FIG. 8, step 3) in the biosynthesis involves CyrC (AoaC homolog), which encodes a PKS with KS, AT, KR, and ACP domains. The action of these domains results in the elongation of the growing chain by an acetate via activation of malonyl-CoA by the AT domain, its transfer to ACP and condensation at the KS domain with the product of CyrB. The elongated chain is bound to the ACP of CyrC and the KR domain reduces the keto group to a hydroxyl group on C12. The PKS module carrying out this step contains a KR domain and does not contain a DH domain; this corresponds only to CyrC.
[0364] CyrC is followed by CyrD (FIG. 8, step 4), a PKS with five domains: KS, AT, DH, KR, and an ACP. The action of this PKS module on the product of CyrC results in the addition of one acetate and the reduction of the keto group on C10 to a hydroxyl and dehydration to a double bond between C9 and C10. This double bond is the site of a nucleophilic attack by the amidino group N19 via another Michael addition that again follows Baldwin's rules of ring closure, resulting in the formation of the second ring, the first 6-membered ring made in cylindrospermopsin.
[0365] The product of CyrD is the substrate for CyrE (step 5 in FIG. 8), a PKS containing KS, AT, DH, and KR domains and an ACP. Since this sequence of domains is identical to that of CyrD, it is not possible at this stage to ascertain which PKS acts first, but as their action is proposed to be identical it is immaterial at this point. CyrE catalyzes the addition of one acetate and the formation of a double bond between C7 and C8. This double bond is attacked by N18 via a Michael addition and the third cyclisation occurs, resulting in the second 6-membered ring.
[0366] CyrF is the final PKS module (step 6 of FIG. 8) and is a minimal PKS containing only a KS, AT, and ACP. CyrF acts on the product of CyrE and elongates the chain by an acetate, leaving C4 and C6 unreduced.
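The relationship between the reductive domains present in each PKS and the predicted state of the β-carbon after chain extension can be summarised programmatically. The sketch below applies the general rule that KR alone yields a hydroxyl, KR plus DH yields a double bond, and neither leaves the keto group. It is an illustration under that assumption; the dictionary and function names are introduced here, and enoylreductase domains are not considered because none are described in this cluster.

```python
# Domain content of each enzyme as listed in [0362]-[0366].
PKS_DOMAINS = {
    "CyrB": ["A", "PCP", "KS", "AT", "DH", "MT", "KR", "ACP"],
    "CyrC": ["KS", "AT", "KR", "ACP"],
    "CyrD": ["KS", "AT", "DH", "KR", "ACP"],
    "CyrE": ["KS", "AT", "DH", "KR", "ACP"],
    "CyrF": ["KS", "AT", "ACP"],
}


def beta_carbon_state(domains):
    """Predict the fate of the beta-keto group from the reductive domains present."""
    has_kr = "KR" in domains
    has_dh = "DH" in domains
    if has_kr and has_dh:
        return "double bond"  # reduction to hydroxyl, then dehydration
    if has_kr:
        return "hydroxyl"     # reduction only
    return "keto"             # no reduction (a DH alone has no hydroxyl to act on)


if __name__ == "__main__":
    for enzyme, domains in PKS_DOMAINS.items():
        print(enzyme, beta_carbon_state(domains))
```

The predictions agree with the text: a double bond after CyrB, CyrD and CyrE, a hydroxyl after CyrC, and an unreduced keto group after CyrF.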
[0367] Step 7 in the pathway (FIG. 8) involves the formation of the uracil ring, a reaction that is required for the toxicity of the final cylindrospermopsin compound. The cylindrospermopsin gene cluster encodes two enzymes with high sequence similarity (87%) that have been denoted CyrG and CyrH. A Psi-blast search (NCBI) followed by a Fugue profile library search (see materials and methods) revealed that CyrG and CyrH are most similar to the enzyme family of amidohydrolases/ureases/dihydroorotases, whose members catalyze the formation and cleavage of N-C bonds. It is proposed that these enzymes transfer a second guanidino group from a donor molecule, such as arginine or urea, onto C6 and C4 of cylindrospermopsin, resulting in the formation of the uracil ring. These enzymes carry out two or three reactions depending on the guanidino donor. The first reaction consists of the formation of a covalent bond between the N of the guanidino donor and C6 of cylindrospermopsin, followed by an elimination of H2O forming a double bond between C5 and C6. The second reaction catalyses the formation of a bond between the second N on the guanidino donor and C4 of cylindrospermopsin, concomitantly with the breaking of the thioester bond between the acyl carrier protein of CyrE and cylindrospermopsin, causing the release of the molecule from the enzyme complex. Feeding experiments with labeled acetate have shown that the oxygen at C4 is of acetate origin and is not lost during biosynthesis, therefore requiring the de novo formation of the uracil ring. The third reaction, if required, would catalyze the cleavage of the guanidino group from a donor molecule other than urea. The action of CyrG and CyrH in the formation of the uracil ring in cylindrospermopsin describes a novel biosynthesis pathway of a pyrimidine.
[0368] One theory suggests a linear polyketide which readily assumes a favorable conformation for the formation of the rings. Cyclization may thus be spontaneous and not under enzymatic control. These analyses show that this may happen step-wise, with successive ring formation of the appropriate intermediate as it is synthesized. This mechanism also explains the lack of a thioesterase or cyclization domain, which are usually associated with NRPS/PKS modules and catalyze the release and cyclization of the final product from the enzyme complex.
Example 12
CYR Tailoring Reactions
[0369] Cylindrospermopsin biosynthesis requires the action of tailoring enzymes in order to complete the biosynthesis, catalyzing the sulfation at C12 and hydroxylation at C7. Analysis of the cylindrospermopsin gene cluster revealed three candidate enzymes for the tailoring reactions involved in the biosynthesis of cylindrospermopsin, namely CyrI, CyrJ, and CyrN. The sulfation of cylindrospermopsin at C12 is likely to be carried out by the action of a sulfotransferase. CyrJ encodes a protein that is most similar to human 3'-phosphoadenylyl sulfate (PAPS) dependent sulfotransferases. The cylindrospermopsin gene cluster also encodes an adenylylsulfate kinase (ASK), namely CyrN. ASKs are enzymes that catalyze the formation of PAPS, which is the sulfate donor for sulfotransferases. It is proposed that CyrJ sulfates cylindrospermopsin at C12 while CyrN creates the pool of PAPS required for this reaction. Screening of cylindrospermopsin-producing and non-producing strains revealed that the sulfotransferase genes were only present in cylindrospermopsin-producing strains, further affirming the involvement of this entire cluster in the biosynthesis of cylindrospermopsin (FIG. 7). The cyrJ gene might therefore be a good candidate for a toxin probe, as it is more distinctive than NRPS and PKS genes and would presumably have less cross-reactivity with other gene clusters containing these genes, which are common in cyanobacteria. The final tailoring reaction is carried out by CyrI. A Fugue search and an iterated Psi-Blast revealed that CyrI is similar to a hydroxylase belonging to the 2-oxoglutarate and Fe(II)-dependent oxygenase superfamily, which includes the mammalian prolyl 4-hydroxylase alpha subunit that catalyzes the hydroxylation of collagen. It is proposed that CyrI catalyzes the hydroxylation of C7, a residue that, along with the uracil ring, seems to confer much of the toxicity of cylindrospermopsin. The hydroxylation at C7 by CyrI is probably the final step in the biosynthesis of cylindrospermopsin.
Example 13
CYR Toxin Transport
[0370] Cylindrospermopsin and other cyanobacterial toxins appear to be exported out of the producing cells. The cylindrospermopsin gene cluster contains an ORF denoted CyrK, the product of which is most similar to sodium ion driven multi-drug and toxic compound extrusion proteins (MATE) of the NorM family. It is postulated that CyrK is a transporter for cylindrospermopsin, based on this homology and its central location in the cluster. Heterologous expression and characterization of the protein are currently being undertaken to verify its putative role in cylindrospermopsin export.
Example 14
Transcriptional Regulation of the Toxin Gene Cluster
[0371] Cylindrospermopsin production has been shown to be highest when fixed nitrogen is eliminated from the growth media (Saker et al. (1999) J. Phycol 35:599-606). Flanking the cylindrospermopsin gene cluster are "hyp" gene homologs involved in the maturation of hydrogenases. In the cyanobacterium Nostoc PCC73102 they are under the regulation of the global nitrogen regulator NtcA, that activates transcription of nitrogen assimilation genes. It is plausible that the cylindrospermopsin gene cluster is under the same regulation, as it is located wholly within the "hyp" gene cluster in C. raciborskii AWT205, and no obvious promoter region in the cylindrospermopsin gene cluster could be identified.
[0372] Finally, the cylindrospermopsin cluster also includes an ORF at its 3'-end designated CyrO. By homology, it encodes a hypothetical protein that appears to possess an ATP binding cassette, and is similar to WD repeat proteins, which have diverse regulatory and signal transduction roles. CyrO may also have a role in transcriptional regulation and DNA binding. It also shows homology to AAA family proteins that often perform chaperone-like functions and assist in the assembly, operation, or disassembly of protein complexes. Further insights into the role of CyrO are hindered due to low sequence homology with other proteins in databases.
[0373] The foregoing describes preferred forms of the present invention. It is to be understood that the present invention should not be restricted to the particular embodiment(s) shown above. Modifications and variations obvious to those skilled in the art can be made thereto without departing from the scope of the present invention.
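The sequence listing that follows is printed in blocks of ten nucleotides, and in this text rendering the running position counts (60, 120, and so on) are fused to the start of the next block. As a minimal sketch, assuming only sequence rows are supplied (not the header line carrying the organism name), the bare nucleotide string can be recovered by dropping digits and whitespace. The helper name `clean_listing` is hypothetical; the two rows are copied from the listing for SEQ ID NO: 1.

```python
import re


def clean_listing(rows):
    """Join sequence rows, dropping running position counts and spacing."""
    joined = re.sub(r"[\d\s]+", "", "".join(rows)).lower()
    # Guard against header text (organism names, "DNA") leaking into the result.
    if re.fullmatch(r"[acgtn]*", joined) is None:
        raise ValueError("non-nucleotide characters found; pass sequence rows only")
    return joined


# Two rows copied from the listing for SEQ ID NO: 1.
rows = [
    "ttatttttta ttgagtttag caatagttat ttcacccttt 60ttatccatga ttgtgggtat",
    "ttacgaaaat attaaattta gggtattatt tgatttggtg 120gtcagggcac taatggtggt",
]
seq = clean_listing(rows)
print(len(seq), seq[:20])
```

The check for non-nucleotide characters is a deliberate design choice: it makes the helper fail loudly if a header line is passed by mistake rather than silently producing a corrupted sequence.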
Sequence CWU
1
Number of SEQ ID NOs: 186. SEQ ID NO: 1; length: 37606; type: DNA; organism: Cylindrospermopsis raciborskii T3
atgatcccag ctaaaaaagt
ttatttttta ttgagtttag caatagttat ttcacccttt 60ttatccatga ttgtgggtat
ttacgaaaat attaaattta gggtattatt tgatttggtg 120gtcagggcac taatggtggt
tgactgcttc aatatcaaaa aacatcgggt caaaattagt 180cgtcaattac ctctacgttt
atctattgga cgtgagaatt tagtaatatt gaaggtagag 240tctgggaatg tcaatagtgc
tattcaaatt cgtgattact atcccacaga atttcccgta 300tccacatcta acctgatagt
taaccttccc cctaatcata ctcaggaagt aaagtacacc 360attcgaccta atcaacgggg
agaattttgg tggggaaata ttcaagttcg acagctggga 420aattggtctc tagggtggga
caattggcaa attccccaaa aaactgtggc taaggtgtat 480cctgatttgt taggactcag
atccctcgct attcgtttaa ccctacaatc ttctggatct 540atcactaaat tgcgtcaacg
gggaatggga acggaatttg ccgaactccg taattactgc 600atgggggatg atctacggtt
aattgattgg aaagctacag ctagacgtgc ttatggaaat 660ctgagtcccc tagtaagagt
tttagagcct caacaggaac aaactctgct tatattatta 720gatcgtggta gactaatgac
agctaatgta caagggttaa aacgatatga ttggggttta 780aataccacct tgtctttggc
attagcagga ttacataggg gcgatcgcgt aggagtaggg 840gtatttgact cccagctgca
tacctggata cctccagagc gaggacaaaa tcatctcaat 900cggcttatag acagacttac
acctattgaa ccagtgttag tggagtctga ttatttaaat 960gccattacct atgtagtaaa
acaacagact cgtagatctc tagtagtgtt aattactgat 1020ttagtcgatg ttactgcttc
ccatgaacta ctagtagcgc tgtgtaaatt agtgcctcga 1080tatctacctt tttgtgtaac
actcagggat cctgggattg ataaaatagc tcataatttt 1140agtcaagact taacacaggc
ttataatcga gcagtttctt tggacttgat atcacaaaga 1200gaaattgctt ttgctcagtt
gaaacaacag ggagttttgg tgttggatgc accagcaaat 1260caaatttccg agcagttggt
agaaaggtac ttacaaatca aagccaaaaa tcagatttga 1320ctccctgtcg agataattga
gaacttctgg aaagaatagc ccaataaact cgacaaagaa 1380cgtggttaga agttctttaa
agagtctatc atgccgaatc atattttaac agaagagcga 1440tcgctcttcc taagggatag
agtctgaaag ccacttcaac ggacgataat gcaactcttg 1500ttccagctgg agtgcggaga
attaccacat ccgaaataga caaaaagaaa taattggagt 1560taagaagata agtacataaa
tagtgataat atacaaaact agtcagcacg gattaaattt 1620actaatgata gatacaatat
cagtactatt aagagagtgg actgtaattt cccttacagg 1680tttagccttc tggctttggg
aaattcgctc tcccttccat caaattgaat acaaagctaa 1740attcttcaag gaattgggat
gggcgggaat atcattcgtc tttagaaatg tttatgcata 1800tgtttctgtg gcaattataa
aactattgag ttctctattt atgggagagt cagcaaattt 1860tgcaggagta atgtatgtgc
ccctctggct gaggatcatc actgcatata tattacagga 1920cttaactgac tatctattac
acaggacaat gcatagtaat cagtttcttt ggttgacgca 1980caaatggcat cattcaacaa
agcaatcatg gtggctgagt ggaaacaaag atagctttac 2040cggcggactt ttatatactg
ttacagcttt gtggtttcca ctgctggaca ttccctcaga 2100ggttatgtct gtagtggcag
tacatcaagt gattcataac aattggatac acctcaatgt 2160aaagtggaac tcctggttag
gaataattga atggatttat gttacgcccc gtattcacac 2220tttgcatcat cttgatacag
ggggaagaaa tttgagttct atgtttactt tcatcgaccg 2280attatttgga acctatgtgt
ttccagaaaa ctttgatata gaaaaatcta aaaatagatt 2340ggatgatcaa tcagtaacgg
tgaagacaat tttgggtttt taatagactt gggttctaag 2400tggaatggac ggaaaaaatg
gcggttaccc gcatctttaa tatatcctct ttttggggtt 2460gagatttgga taaagcggct
tgtactctgt cattattcaa atagccatgg cgttgcatat 2520ttgcgggatg atttaagatt
ttctcctaat ttgaaaaatt tctcttgtag gacgattgcg 2580aagcactcgc gagattgcat
tattaataaa accctgatag tcacccccaa cttattgcag 2640aaaaactttt ttctcttagg
taataaatta gtagtttaat tgaaaagcat agcatctctt 2700ttgacttgga ataacaaaat
gtcttacgat gtagtctagc taaatagtga cgcaaacgac 2760tgttttctcc ctcaactcta
gtcattgatg ttttactaat aatttggtct ccatcgggaa 2820taaattttgg gtaaacttta
tagccatccg taatccaaaa ataggatttc caatgctcta 2880tctttttcca taatttggca
aatgttttgg cacttctatc tcccactaca tattgaataa 2940ttcccgaacg tttgttatct
acaactgtcc agacccatat cttgtttttt tttaccaata 3000aatgtttcca actcatccag
ttgacaaact tcaggtgttt gggaattatt attactatct 3060gataactgac gacctagctt
tttgacccaa cgaatgactg tattgtgatt tactttagtc 3120attctttcaa ttgccctaaa
tccattccca tttacataca tggttaaaca tgcttccttt 3180acttcttggg aataacctct
aggagaataa gattcaataa attgacgacc acaattcttg 3240cattgataat tttgttttcc
ccttctctgg ccattttttc taatattatt ggaatcacag 3300tttgaacagt tcatcttgat
ttcttcctcg cggcgatcgc ctgctaaaaa ttcttcccct 3360tattatacat catcccgtgc
aggtgcaacg cccaaatagc catagtttat gatcggtatc 3420gaattcgcta ttgttttttc
tgccatatcc cttacctaag atgggacgat attcgctcat 3480aataccactg tcaattagat
catcagcaac atggtgagtg tatcctgacg accatcgata 3540tggccaccaa gatcactagc
taccccactg ggcaacaatt cgagtaaaag cgagtagccc 3600tactgtagca ttgaaaccat
ccaagtttga agttaaatac ctaaaattat gacctcattt 3660tcatttctag acgttcagca
acgggcatta actcacgtat cagatcaaag tttcctacgt 3720tccgtctcat ccagtctaat
aagaattttt ctccttcatc tagcttacct ttatcatcaa 3780caaaaaccat ctgctcgcac
caatctacaa atccggaatt agtcatctca tagactaaaa 3840tgatgggagg aaagtgtgcg
aatcccattt tttcaatgac ttccatacaa accagcttaa 3900atacttgttc gtttgtcaat
tcattagaca taaagaattt tcctttaatc aattctgttt 3960ctaatcctac cacagagtaa
taactcttgg tctggaacat aaattattct gtttttatca 4020atgcgtaagt cataacttat
tacttgacgg agttgcaggg gcatacctta acttgacctt 4080gggagcgata gaagaaagga
aggcttcagt gacgggtctt tgactaatcc cagtttccac 4140ttcaactaaa acagcatcac
aaatgtcgaa tagtgattga gaatatctat tcatattcat 4200gaaagtcaga gcagattcca
tcggagacat ggatgaatta aaggcagcgt tttcagcgta 4260tcgacctgta aatatattcc
cgtgggaatc ttttaacgct acccctgcaa aatttttcgt 4320gtagggagca taactttgat
tggcagcgga tagagcagca agcacaacat catcggtaga 4380ataggtctcc agatcatgaa
atactgtttg cattaatcca cctgtgagtc ctagatccgc 4440tggtccaaat ggctcgggta
gaaaatgtgg gagtttattt gaggtataag tttgctcagg 4500ctgtgattca ttagacttca
caagaagaac aaaattttga tttacagttg ccatctcgta 4560taaaaattgt cggcagtatc
cacatggtgc ttcgtggatt gctaatgctt gtaaaccggt 4620ttctccgtgc aaccacgcat
ttatggtggc ggattgttct gcgtgaactg agaaactaag 4680tgcctgtcct acaaattcca
tgtcggcacc aaaataaaga gttccagaac ccagttgatt 4740cttagattgt ggtttaccaa
gagcgatcgc ccctacataa aactgcgata ttggtaccct 4800agcataagtt gcggctacgg
gtagtaattg aatcattaac gtactaatat tagtaccaag 4860tcgatcaatc caagatgcga
caacacttga gtcaattaca gcatgttggg caagaattgt 4920ccttaactct gattgaatgg
aacgtggaac cttggcaatc gcctgttcta atgctacatg 4980ggtcatttgg gttattcttg
gacagagaga taaagatata ttagttttta tgaatcaatt 5040tcccacttaa tgcttgagta
tgttttcctc ctgcttacaa ggcaaagctt tccttttttg 5100tagcaaatcc caaactgctt
tgagagattt aattgcttgg tctatctcct cttcggtatt 5160ggcggctgta atcgaaaacc
ttaaagcact tttatttaaa ggtacgattg gaaaaatagc 5220aggagtaatt aaaataccat
attcccaaag gagttgacac acatcaatca tgtgttgagc 5280atctcccact aacacgccta
cgatgggaac gtaaccatag ttatccactt cgaatccaat 5340ggctcttgct tgtgtaacca
atttgtgagt taggtgataa atttgttttc ttaactgctc 5400cccctcctga cgattcacct
gtaatccggc taaggcactt gccaaactcg caacaggaga 5460aggaccagaa aatatggcag
tccaagcgtt gcggaagttg gttttgatcc ggcgatcgcc 5520acaagttaag aatgctgcgt
aagaagaata ggctttggac aaaccagcta catagatgat 5580attatcctct gcaaaccgca
ggtcaaaata attcaccatc ccgtttcctt tgtaaccgta 5640aggcatatcg ctgctgggat
tttcgcccaa aatgccaaaa ccatgagcat catccatgta 5700aattaaggca ttgtactctt
ttgccagatg cacgtaagct ggcagatcgg gaaaatctgc 5760cgacatggaa tacacgccat
caatgacaat aatctttact tgttcaggcg gatattttgc 5820tagtttttcg gctaaatcgt
tcaaatcatt atgtcgatat tggatgaact gggctccttt 5880gtgctgagcc agacagcacg
cttcataaat acaacgatgt gcagctatgt caccaaagat 5940gacaccatta ttcccagtta
atagtggtaa aattcctatc tgaagcagtg ttacagctgg 6000aaatactaaa acatcaggta
cgcctaaaag tttggacaat tcttcctcca attcctcata 6060aattgctggg gaagcaacaa
gccgagtcca gcttggatgt gtgccccatt tatccaaagc 6120tggtggaatt gcttccttaa
cttttggatg caagtcaaga cctaaatagt tgcaagaagc 6180aaagtctatc acccaatgtc
cgtcaattag caccttgcga ccttgttgtt ctgtgacgac 6240tcttgtgact tgaggaattt
tttgttggtt aactacgttt tccagagtgt tgatttcgtt 6300ggctgagtca acaggtggag
ctagatcaga ttgtttctct tgtaccactt ggttttggaa 6360ataagtgatg atggcagttg
gagtgttctt ttgtaaaaag aacgttccag acagattgat 6420ccctaaacgt tcctctagga
gcgtttgcag ttctaataaa tctaaagaat ctaatcccat 6480atccagcagt ttttgttgtg
gagcgtaggc tgcctgacgt tgggaaccca ttacttttaa 6540gatgcattct ttaacgagat
ccgctacagt tttgttttcc ttagttgcag atgttgcttt 6600tggtaccaat gaaccaattg
ctgagttaat atacggtcct ttgcgatcac caggcgagtg 6660caaagcactg tcgcgcaggt
tatattcaat caaaataccc atgccgagat tatctgtatc 6720ttccggacga taattagcaa
taattcccct aatttcggct cctcccgaca catggaaacc 6780cacaattgga tccagaagct
gtcgttgctc attgtgtagc tttaaatact ccatcatcgg 6840catttgggaa taattgacat
aatttcgaca gcgagttaca cccaccacgc tctcaatgcc 6900gcctttcagg gtacagtagt
aaagcataaa gtcccgcaat tcatttccta acccccgcgc 6960ctgaaactca ggtagaatat
ttagtgcgag cagttgaata actgaccctt ggggagtatg 7020taacgtcggc acttgcgcat
attttacatt ctctaatgcc tcagtgctgg taattgtttg 7080ggaataaatc gcaccaataa
tttgatcttc tataatcagc actaaattac cttgcgggtt 7140tagctcaagt cttcgccgaa
tttcatgagt agatgcccgt aaattttctg gccaacactt 7200gacctccaag tcaactaagg
caggtaaatc tgacaaatag gcatgactaa ttttgtaagg 7260tcttttctcg aagtaattaa
gcgtaatgcg agtaaaagga aatgtttttg ggtatctttt 7320agaaagctct agttttggaa
atagacctac ttgtgcagca gacatgagaa aaacctcagc 7380ttccacaaga tactgctgag
aaaatccctg aaacgcatcg aaatgtaagt tttcgctttt 7440gtctaaaaac tgatagacta
cccttggttc caaacaatgg acctccaaaa tcattaaacc 7500gtgtttattg accacttgag
accatctttc taagtgttcc accaaacttt gcaccataac 7560atgaggagga ataagctctc
cttgatcatc gacacagact gattggtaag gtaagtgagc 7620acgttctttc aattcgtttc
ttttctgagg aggaataaag agacgatcat ggtcgaggaa 7680cgaacggatg tgcaggatat
tttcgggatc atgaatgcca tgagcttcta aagaacgcac 7740catttgttct gggttcccaa
tatctccctg taaaactaag tggggaaggc tagcaagggt 7800gcgtgtggta gcttttaaag
aagcttcgtt ataatctaca cctataagac gcaggggata 7860ctgttcgagt gcttttcccc
tagcagactt aaattgaatg gtttcccaga ctcgtttcag 7920gagagttcca tcgccacacc
ccatgtcagt aatgtatttg ggttgttctt ctaatggcaa 7980ctgattgaat actgagagga
tactttcttc taaatcggca aaatatttct ggtgttgaaa 8040tccactcccg atcacgttaa
gggtgcgatc aatgtgcctt tcgtgaccgg aagcatctct 8100ttggaatacg gagagacaat
tgccaaacaa tacatcatga atgcgggaca acataggagt 8160gtaggacgcc actatggctg
tattcaaggc tcgctctccc ataaatcgac caagttcggt 8220tatggtcaaa cgacctgctg
taaggtcagc ccagccaagg tggagaaata acttacccaa 8280ctcttcttgc actgttgagc
ttaatgagga gagcaaaggt ttgtcctccg aatctgcaag 8340caagttgtgt ttgtgcagtg
ccagcaggag tgggatgacc agtaatccat ctaaaaaatc 8400tgccattagg ggattgtcca
ggttccacaa ttggcaagaa cgctcaatcc atcttcccag 8460caaatttcct tgtttccctt
ctaaataaga ctgaattggt aggttgtaca attgaagaat 8520gtcttccgaa attttgttgt
gaatcgctgc ttctgcggtt agagagtatt taagctcctt 8580atttcgggaa agccaatgta
aagactcgag catcctcaaa gcaacttgaa aatgtccgct 8640gttagctccc agatgttcca
ccatttggtt taaagagaga ggactttcat cggcgagtaa 8700ttcaaaaaca cctttttctc
gacacgcaag aataacggga accgccacaa agccgtgagt 8760ataacgatta atcttttgta
acatttagac gattattgat taatttatga ggaatgcatt 8820tttagtgcat accacgagat
tttgattgtc tcagaagttg tgtgaaaaag caagacaagt 8880agaccaaaaa aataagctaa
ataagtgtag tagcaataaa aagacgaatc gcaattgtac 8940gtgtcttgac taacaagcca
agtctctcta gataataatc gccctctacc agttgcgtaa 9000gtcccattgt tgttttaaac
tttaattgct aattaaacag ttatcaaatc ctgttcataa 9060cggatattta cagcaatttt
cggttatata aaattgcata tactgtaagt aatagcagaa 9120aattaattta ggtaggaaaa
tgttgaaaga tttcaaccag tttttaatca gaacactagc 9180attcgtattc gcatttggta
ttttcttaac cactggagtt ggcattgcta aagctgacta 9240cctagttaaa ggtggaaaga
ttaccaatgt tcaaaatact tcttctaacg gtgataatta 9300tgccgttagt atcagcggtg
ggtttggtcc ttgcgcagat agagtgatta tcctaccaac 9360ttcaggagtg ataaatcgag
acattcatat gcgtggctat gaagccgcat taactgcact 9420atccaatggc tttttagtag
atatttacga ctatactggc tcttcttgca gcaatggtgg 9480ccaactaact attaccaacc
aattaggtaa gctaatcagc aattaggttg tatcatgata 9540agatgaagta gtttaaccat
ggcaccacca gccaaaaact ttttaacgct agggtgtaac 9600agttatgggt gtggaatgta
ggttgtatcc agtgcatgaa acagccataa ttttagtata 9660agcaaacact aagattggag
aattcatgga aacaacctca aaaaaattta agtcagatct 9720gatattagaa gcacgagcaa
gcctaaagtt gggaatcccc ttagtcattt cacaaatgtg 9780cgaaacgggt atttatacag
cgaatgcagt catgatgggt ttacttggta cgcaagtttt 9840ggccgccggt gctttgggcg
cgctcgcttt tttgacctta ttatttgcct gccatggtat 9900tctctcagta ggaggatcac
tagcagccga agcttttggg gcaaataaaa tagatgaagt 9960tagtcgtatt gcttccgggc
aaatatggct agcagttacc ttgtctttac ctgcaatgct 10020tctgctttgg catggcgata
ctatcttgct gctattcggt caagaggaaa gcaatgtgtt 10080attgacaaaa acgtatttac
actcaatttt atggggcttt cccgctgcgc ttagtatttt 10140gacattaaga ggcattgcct
ctgctctcaa cgttccccga ttgataacta ttactatgct 10200cactcagctg atattgaata
ccgccgccga ttatgtgtta atattcggta aatttggtct 10260tcctcaactt ggtttggctg
gaataggctg ggcaactgct ctgggttttt gggttagttt 10320tacattgggg cttatcttgc
tgattttctc cctgaaagtt agagattata aacttttccg 10380ctacttgcat cagtttgata
aacagatctt tgtcaaaatt tttcaaactg gatggcccat 10440ggggtttcaa tggggggcgg
aaacggcact atttaacgtc accgcttggg tagcagggta 10500tttaggaacg gtaacattag
cagcccatga tattggcttc caaacggcag aactggcgat 10560ggttatacca ctcggagtcg
gcaatgtcgc tatgacaaga gtaggtcaga gtataggaga 10620aaaaaaccct ttgggtgcaa
gaagggtagc atcgattgga attacaatag ttggcattta 10680tgccagtatt gtagcacttg
ttttctggtt gtttccatat caaattgccg gaatttattt 10740aaatataaac aatcccgaga
atatcgaagc aattaagaaa gcaactactt ttatcccctt 10800ggcgggacta ttccaaatgt
tttacagtat tcaaataatt attgttgggg ctttggtcgg 10860tctgcgggat acatttgttc
cagtatcaat gaacttaatt gtctggggtc ttggattggc 10920aggaagctat ttcatggcaa
tcattttagg atgggggggg atcgggattt ggttggctat 10980ggttttgagt ccactcctct
cggcagttat tttaactgtt cgtttttatc gagtgattga 11040caatcttctt gccaacagtg
atgatatgtt acagaatgcg tctgttacta ctctaggctg 11100agaaaagcta tatgaccaat
caaaataacc aagaattaga gaacgattta ccaatcgcca 11160agcagccttg tccggtcaat
tcttataatg agtgggacac acttgaggag gtcattgttg 11220gtagtgttga aggtgcaatg
ttaccggccc tagaaccaat caacaaatgg acattccctt 11280ttgaagaatt ggaatctgcc
caaaagatac tctctgagag gggaggagtt ccttatccac 11340cagagatgat tacattagca
cacaaagaac taaatgaatt tattcacatt cttgaagcag 11400aaggggtcaa agttcgtcga
gttaaacctg tagatttctc tgtccccttc tccacaccag 11460cttggcaagt aggaagtggt
ttttgtgccg ccaatcctcg cgatgttttt ttggtgattg 11520ggaatgagat tattgaagca
ccaatggcag atcgcaaccg ctattttgaa acttgggcgt 11580atcgagagat gctcaaggaa
tattttcagg caggagctaa gtggactgca gcgccgaagc 11640cacaattatt cgacgcacag
tatgacttca atttccagtt tcctcaactg ggggagccgc 11700cgcgtttcgt cgttacagag
tttgaaccga cttttgatgc ggcagatttt gtgcgctgtg 11760gacgagatat ttttggtcaa
aaaagtcatg tgactaatgg tttgggcata gaatggttac 11820aacgtcactt ggaagacgaa
taccgtattc atattattga atcgcattgt ccggaagcac 11880tgcacatcga taccacctta
atgcctcttg cacctggcaa aatactagta aatccagaat 11940ttgtagatgt taataaattg
ccaaaaatcc tgaaaagctg ggacattttg gttgcacctt 12000accccaacca tatacctcaa
aaccagctga gactggtcag tgaatgggca ggtttgaatg 12060tactgatgtt agatgaagag
cgagtcattg tagaaaaaaa ccaggagcag atgattaaag 12120cactgaaaga ttggggattt
aagcctattg tttgccattt tgaaagctac tatccatttt 12180taggatcatt tcactgtgca
acattagacg ttcgccgacg cggaactctt cagtcctatt 12240tttaagattt atttcgatta
tcctttatcc tgatcatcca gagtgataag agcattacaa 12300ctaggagaca attatgacaa
ctgctgacct aatcttaatt aacaactggt acgtagtcgc 12360aaaggtggaa gattgtaaac
caggaagtat caccacggct cttttattgg gagttaagtt 12420ggtactatgg cgcagtcgtg
aacagaattc ccccatacag atatggcaag actactgccc 12480tcaccgaggt gtggctctgt
ctatgggaga aattgttaat aatactttgg tttgtccgta 12540tcacggatgg agatataatc
aagcaggtaa atgcgtacat atcccggctc accctgacat 12600gacaccccca gcaagtgccc
aagccaagat ctatcattgc caggagcgat acggattagt 12660atgggtgtgc ttaggtgatc
ctgtcaatga tataccttca ttacccgaat gggacgatcc 12720gaattatcat aatacttgta
ctaaatctta ttttattcaa gctagtgcgt ttcgtgtaat 12780ggataatttc atagatgtat
ctcattttcc ttttgtccac gacggtgggt taggtgatcg 12840caaccacgca caaattgaag
aatttgaggt aaaagtagac aaagatggca ttagcatagg 12900taaccttaaa ctccagatgc
caaggtttaa cagcagtaac gaagatgact catggactct 12960ttaccaaagg attagtcatc
ccttgtgtca atactatatt actgaatcct ctgaaattcg 13020gactgcggat ttgatgctgg
taacaccgat tgatgaagac aacagcttag tgcgaatgtt 13080agtaacgtgg aaccgctccg
aaatattaga gtcaacggta ctagaggaat ttgacgaaac 13140aatagaacaa gatattccga
ttatacactc tcaacagcca gcgcgtttac cactgttacc 13200ttcaaagcag ataaacatgc
aatggttgtc acaggaaata catgtaccgt cagatcgatg 13260cacagttgcc tatcgtcgat
ggctaaagga actgggcgtt acctatggtg tttgttaatt 13320tcagggttgt tggtatctgg
ataggtatgg ttttgagtcc actgctatct ggagggattt 13380taatggttgg tttttatcaa
cagcttgcca ataagtatta ctaatagtga tgatggggaa 13440gagaatcaaa ctatactcac
caacaaggtg ttaaaatgca gatcttagga atttcagctt 13500actaccacga tagtgctgcc
gcgatggtta tcgatggcga aattgttgct gcagctcagg 13560aagaacgttt ctcaagacga
aagcacgatg ctgggtttcc gactggagcg attacttact 13620gtctaaaaca agtaggaacc
aagttacaat atatcgatca aattgttttt tacgacaagc 13680cattagtcaa atttgagcgg
ttgctagaaa catatttagc atatgcccca aagggatttg 13740gctcgtttat tactgctatg
cccgtttggc tcaaagaaaa gctttaccta aaaacacttt 13800taaaaaaaga attggcgctt
ttgggggagt gcaaagcttc tcaattgcct cctctactgt 13860ttacctcaca tcaccaagcc
catgcggccg ctgctttttt tcccagtcct tttcagcgtg 13920ctgccgttct gtgcttagat
ggtgtaggag agtgggcaac tacttctgtc tggttgggag 13980aaggaaataa actcacacca
caatgggaaa ttgattttcc ccattccctc ggtttgcttt 14040actcagcgtt tacctactac
actgggttca aagttaactc aggtgagtac aaactcatgg 14100gtttagcacc ctacggggaa
cccaaatatg tggaccaaat tctcaagcat ttgttggatc 14160tcaaagaaga tggtactttt
aggttgaata tggactactt caactacacg gtggggctaa 14220ccatgaccaa tcataagttc
catagtatgt ttggaggacc accacgccag gcggaaggaa 14280aaatctccca aagagacatg
gatctggcaa gttcgatcca aaaggtgact gaagaagtca 14340tactgcgtct ggctagaact
atcaaaaaag aactgggtgt agagtatcta tgtttagcag 14400gtggtgtcgg tctcaattgc
gtggctaacg gacgaattct ccgagaaagt gatttcaaag 14460atatttggat tcaacccgca
gcaggagatg ccggtagtgc agtgggagca gctttagcga 14520tttggcatga ataccataag
aaacctcgca cttcaacagc aggcgatcgc atgaaaggtt 14580cttatctggg acctagcttt
agcgaggcgg agattctcca gtttcttaat tctgttaaca 14640taccctacca tcgatgcgtt
gataacgaac ttatggctcg tcttgcagaa attttagacc 14700agggaaatgt tgtaggctgg
ttttctggac gaatggagtt tggtccgcgt gctttgggtg 14760gccgttcgat tattggcgat
tcacgcagtc caaaaatgca atcggtcatg aacctgaaaa 14820ttaaatatcg tgagtccttc
cgtccatttg ctccttcagt cttggctgaa cgagtctccg 14880actacttcga tcttgatcgt
cctagtcctt atatgctttt ggtagcacaa gtcaaagaga 14940atctgcacat tcctatgaca
caagagcaac acgagctatt tgggatcgag aagctgaatg 15000ttcctcgttc ccaaattccc
gcagtcactc acgttgatta ctcagctcgt attcagacag 15060ttcacaaaga aacgaatcct
cgttactacg agttaattcg tcattttgag gcacgaactg 15120gttgtgctgt cttggtcaat
acttcgttta atgtccgcgg cgaaccaatt gtttgtactc 15180ccgaagacgc ttatcgatgc
tttatgagaa ctgaaatgga ctatttggtt atggagaatt 15240tcttgttggt caaatctgaa
cagccacggg gaaatagtga tgagtcatgg caaaaagaat 15300tcgagttaga ttaacttatg
agtgaatttt tcccacaaaa aagtggtaaa ttaaagatgg 15360aacagataaa agaacttgac
aaaaaaggat tgcgtgagtt tggactgatt ggcggttcta 15420tagtggcggt tttattcggc
tttttactgc cagttatacg ccatcattcc ttatcagtta 15480tcccttgggt tgttgctgga
tttctctgga tttgggcaat aatcgcacct acgactttaa 15540gttttattta ccaaatatgg
atgaggattg gacttgtttt aggatggata caaacacgaa 15600ttattttggg agttttattt
tatataatga tcacaccaat aggattcata agacggctgt 15660tgaatcaaga tccaatgacg
cgaatcttcg agccagagtt gccaacttat cgccaattga 15720gtaagtcaag aactacacaa
agtatggaga aaccattcta atgctaaaag acacttggga 15780ttttattaaa gacattgccg
gatttattaa agaacaaaaa aactatttgt tgattcccct 15840aattatcacc ctggtatcct
tgggggcgct gattgtcttt gctcaatctt ctgcgatcgc 15900acctttcatt tacactcttt
tttaaattgc catattatga gtaacttcaa gggttcggta 15960aagatagcat tgatgggaat
attgattttt tgtgggctaa tctttggcgt agcatttgtt 16020gaaattgggt tacgtattgc
cgggatcgaa cacatagcat tccatagcat tgatgaacac 16080agggggtggg tagggcgacc
tcatgtttcc gggtggtata gaaccgaagg tgaagctcac 16140atccaaatga atagtgatgg
ctttcgagat cgagaacaca tcaaggtcaa accagaaaat 16200accttcagga tagcgctgtt
gggagattcc tttgtagagt ccatgcaagt accgttggag 16260caaaatttgg cagcagttat
agaaggagaa atcagtagtt gtatagcttt agctggacga 16320aaggcggaag tgattaattt
tggagtgact ggttatggaa cagaccaaga actaattact 16380ctacgggaga aagtttggga
ctattcacct gatatagtag tgctagattt ttatactggc 16440aacgacattg ttgataactc
ccgtgcgctg agtcagaaat tctatcctaa tgaactaggt 16500tcactaaagc cgttttttat
acttagagat ggtaatctgg tggttgatgc ttcgtttatc 16560aatacggata attatcgctc
aaagctgaca tggtggggca aaacttatat gaaaataaaa 16620gaccactcac ggattttaca
ggttttaaac atggtacggg atgctcttaa caactctagt 16680agagggtttt cttctcaagc
tatagaggaa ccgttattta gtgatggaaa acaggataca 16740aaattgagcg ggttttttga
tatctacaaa ccacctactg accctgaatg gcaacaggca 16800tggcaagtca cagagaaact
gattagctca atgcaacacg aggtgactgc gaagaaagca 16860gattttttag ttgttacttt
tggcggtccc tttcaacgag aacctttagt gcgtcaaaaa 16920gaaatgcaag aattgggtct
gactgattgg ttttacccag agaagcgaat tacacgtttg 16980ggtgaggatg aggggttcag
tgtactcaat ctcagcccaa atttgcaggt ttattctgag 17040cagaacaatg cttgcctata
tgggtttgat gatactcaag gctgtgtagg gcattggaat 17100gctttaggac atcaggtagc
aggaaaaatg attgcatcga agatttgtca acagcagatg 17160agagaaagta tattgcctca
taagcacgac ccttcaagcc aaagctcacc tattacccaa 17220tcagtgatcc aataaagaac
tgggcatcac ttatgatgtt tactaatttc agttccgttg 17280atgttaatgc gtaactttta
ttactagttg taaagctgag atatgacaaa taccgaaaga 17340ggattagcag aaataacatc
aacaggatat aagtcagagc ttagatcgga ggcacgagtt 17400agcctccaac tggcaattcc
cttagtcctt gtcgaaatat gcggaacgag tattaatgtg 17460gtggatgtag tcatgatggg
cttacttggt actcaagttt tggctgctgg tgccttgggt 17520gcgatcgctt ttttatctgt
atcgaatact tgttataata tgcttttgtc gggggtagca 17580aaggcatctg aggcttttgg
ggcaaacaaa atagatcagg ttagtcgtat tgcttctggg 17640caaatatggc tggcactcac
cttgtctttg cctgcaatgc ttttgctttg gtatatggat 17700actatattgg tgctatttgg
tcaagttgaa agcaacacat taattgcaaa aacgtattta 17760cactcaattg tgtggggatt
tccggcggca gttggtattt tgatattaag aggcattgcc 17820tctgctgtga acgtccccca
attggtaact gtgacgatgc tagtagggct ggtcttgaat 17880gccccggcca attatgtatt
aatgttcggt aaatttggtc ttcctgaact tggtttagct 17940ggaataggct gggcaagtac
tttggttttt tggattagtt ttctagtggg ggttgtcttg 18000ctgattttct ccccaaaagt
tagagattat aaacttttcc gctacttgca tcagtttgat 18060cgacagacgg ttgtggaaat
ttttcaaact ggatggccta tgggttttct actgggagtg 18120gaatcagtag tattgagcct
caccgcttgg ttaacaggct atttgggaac agtaacatta 18180gcagctcatg agatcgcgat
ccaaacagca gaactggcga tagtgatacc actcggaatc 18240gggaatgttg ccgtcacgag
agtaggtcag actataggag aaaaaaaccc tttgggtgct 18300agaagggcag cattgattgg
gattatgatt ggtggcattt atgccagtct tgtggcagtc 18360attttctggt tgtttccata
tcagattgcg ggactttatt taaaaataaa cgatccagag 18420agtatggaag cagttaagac
agcaactaat tttctcttct tggcgggatt attccaattt 18480tttcatagcg ttcaaataat
tgttgttggg gttttaatag ggttgcagga tacgtttatc 18540ccattgttaa tgaatttggt
aggctggggt cttggcttgg cagtaagcta ttacatggga 18600atcattttat gttggggagg
tatgggtatc tggttaggtc tggttttgag tccactcctg 18660tccggactta ttttaatggt
tcgtttttat caagagattg ccaataggat tgccaatagt 18720gatgatgggc aagagagtat
atctattgac aacgttgaag aactctcctg acgaacagat 18780tgaattgcct tggtcttgac
acttcgttaa cctaagcatg agagtatagg ctatactctg 18840ccgtggttaa ctgagtgttg
tcctggatcg aggacgcagc ctggctgagc aacaaaaaag 18900actggaatct tgacctgtca
atggttttaa ctgctagttt gcggctggtg tcagcagctt 18960cgccatttct gcgcctaaga
cttgacctag ccataatatt ttagtattat gatgagcgat 19020cttaatcaaa ggcaaaaaat
ttacaattaa tctattgtta cattaatttt gctcctcatt 19080ctgtttaaat tttcagtgac
attgtaatct aactcaaaat gaaaacaaac aaacatatag 19140ctatgtgggc ttgtcctaga
agtcgttcta ctgtaattac ccgtgctttt gagaacttag 19200atgggtgtgt tgtttatgat
gagcctctag aggctccgaa tgtcttgatg acaacttaca 19260cgatgagtaa cagtcgtacg
ttagcagaag aagacttaaa gcaattaata ctgcaaaata 19320atgtagaaac agacctcaag
aaagttatag aacaattgac tggagattta ccggacggaa 19380aattattctc atttcaaaaa
atgataacag gtgactatag atctgaattt ggaatagatt 19440gggcaaaaaa gctaactaac
ttctttttaa taaggcatcc ccaagatatt attttttctt 19500tcgatatagc ggagagaaag
acaggtatca cagaaccatt cacacaacaa aatcttggca 19560tgaaaacact ttatgaagtt
ttccaacaaa ttgaagttat tacagggcaa acacctttag 19620ttattcactc agatgatata
attaaaaacc ctccttctgc tttgaaatgg ctgtgtaaaa 19680acttagggct tgcatttgat
gaaaagatgc tgacatggaa agcaaatcta gaagactcca 19740atttaaagta tacaaaatta
tatgctaatt ctgcgtctgg cagttcagaa ccttggtttg 19800aaactttaag atcgaccaaa
acatttctcg cctatgaaaa gaaggagaaa aaattaccag 19860ctcggttaat acctctacta
gatgaatcta ttccttacta tgaaaaactc ttacagcatt 19920gtcatatttt tgaatggtca
gaacactgag tttgatcgta accgttcaga ggggggatag 19980aagcgcgatt agggagatcc
aaaaaataaa atatctagcc gtctaacctc tttattttca 20040tcgattcttc ttaccgttcc
ctattccctc ccttcaccag ttcgtttttg ggtaggtgca 20100agatctgagc ctcccaccta
gggccgatct ggcagtgcgc gatcgccact agcccatgga 20160aaactagcac tttttgggga
acagccaaaa cctttattga gtaagaattt gaaaaagtgc 20220aagttaagag gcaatgacta
aaaatttttt tctactcttt tcaggataga attccagttt 20280ctagagccgt tgtaaccgta
catatcttga tagtacgtat cgatgaggta ctcattttcg 20340tggagcatta accagctttt
taactccgct aatttctgct ctcctttttc tattaattct 20400tgctcatcca aatcatccct
gtccaactcc tccctgtcca actcccacat agttttgttg 20460gtatcttcga caatcaagta
gtctccactt tttagaccgt tttcgtgaaa atattcaact 20520actcccaccg cattagcatg
ggcatcttct acgatcaacc agggatgagc aagcccagaa 20580agcagttccg acgacattat
tgcacccata ttgttacaat ccccctctaa aaaatgaacg 20640cgagagtcag tttttgcttt
ctcgtcgagt agggaaagat cgatatcgat acagtagaca 20700caaccttcta tttggaacag
ttctaagtga tcggctagcc aaatcgcgct gccaccgctt 20760aatgctccta tttcgattat
tgttttcggg cgaagctcat acaggagcat tgaataaaga 20820gctatttcgg tgcacccttt
caggaagggt atccctttcc aagtgaacaa atcgcggttt 20880gccaagagcg ctctccaagc
tggcactgga atagcacatt tatcttctct ttcagaaatt 20940ttggcaaacc gattaggttt
gaaaggtgca actttatagg cggcttcttg aacaaatttt 21000tggaagctca tctaattttc
ctcttaggtg ttagaacatt tgtaaaatct tggcgatttt 21060ttgttttctt tcttgaatat
agcaaccgcc aaggcggttt gagcataaac tggatgtagt 21120ccccgtgttt tacggttgag
acttaggtaa agcggctttg tttgtactct cccattattc 21180aaatagccgt agtttatgat
cggtatccaa ttcgctattg ttttttctgc catatcccca 21240acctaagatg cgacgatatt
cacccataat gccactgtca attaaatcat cctcgttgac 21300tgcaacattg gtatgagatt
gcggcgcaac atagagcgca tccgcaggac aatatgcttc 21360acagatgaaa caagtttgac
agtcttcctg tcgggcgatc gcaggcggtt ggttgggaac 21420tgcatcaaag acattggtag
ggcatacttg gacgcaaaca ttacaattaa tacagagttt 21480atggctgaca agctcgatca
tcatactgct cctgctacaa ctttaatact ggggctgtgg 21540tttaagtggt taatactggt
ggtgtagcgc tcgcatcctt cacccaatcc cgtctcaccc 21600aaagcctttc taagccgccc
gtggcttggt aataaagctg atttggatcg gtttcaggat 21660agtctatgcg aatatgttcg
ctacgcgttt ccttgcgatg taaagcgcta aaatatgccc 21720atcgtgctac agacacaaga
gcagccgctc gacgagaaaa ttccagatcg cgcactgtat 21780cttgtttcgg gttcccttgt
acttgctgcc acagcatttc taatttggcg agggaatcca 21840aaagtccctg ctcacagcgc
aagtaattct tctctaatgg gaacatctcg gcttgtacac 21900cgcggacaac tgcctcgcta
tcgaatgttt cggaaccagg gtactgggaa cgtaatccgg 21960cttgacctgc tggacgcaca
acccgttcat ggacatgagc gcccaaactc ttggcaaagg 22020cggctgcacc ttcccctgcc
cattgtcctg tagagattgc ccaagcagca ttaggaccat 22080cacccccaga agctatccca
gctaaaaact cccgcgatgc tgcatctccg gcggcataca 22140gtccaggaac ttttgtacca
caactatcat tcacaatccg aattccacct gtaccacgga 22200ctgtaccttc taaaaccagt
gttacaggta ctcgttctgt ataagggtca atgccagctt 22260ttttataggg tagaaaggcg
atgaagtgag acttttcaac caatgcttgg atttcaggtg 22320tggctcgatc caaacgagca
taaacgggac ctttcaggag ggcattgggc aggaacgatg 22380gatcgcgacg accattgata
tagccaccaa gatcgttacc tgcctcatcg gtgtaactag 22440cccagtaaaa gggagcagcc
cttgtcactg tggcattgaa agcggtcgag atggtatagt 22500gactggaagc ttccatactg
gagagttcgc cgccagcttc caccgccatc agcagtccat 22560cgcctgtatt ggtattgcaa
cctaaagctt tacttaggaa tgcacaaccg ccattcgcta 22620gaactactgc accagcgcga
acggtatagg tgcgatgatt ttgcctctgt acacctctag 22680ctccagccac ggagccgtcc
tgggctaata acagttctag agccggactt tggtcgaaaa 22740tttgcacacc cacacgcaac
aggttcttgc gaagtacccg catatattcc ggaccataat 22800aactctggcg cacggattcc
ccattttctt tggggaaacg atagccccaa tcttccacta 22860agggcaaact cagccaagct
ttttcaatta cacgttcaat ccaacgtaag ttagcgaggt 22920tatttccttt gctgtaacat
tcggatacat ctttctccca attctctgga gaaggtgcca 22980tgacgctatt gccactggca
gcagctgcac cgctcgtacc tagaaaacct ttatcaacaa 23040tgatgacttt gacaccttgg
gctccagccg cccatgctgc ccatgcggcg gcaggaccac 23100caccaattac cagcacgtca
gcagttaatt gtagttcagt gccgctatag gctgtaagca 23160attgcttttc ctccttgttt
aaagtcaagt tcatactttt aattatcttc tgcagtcggt 23220cgaatcaaaa tttcatttac
atttacatga tcgggttgtg tcactgcata aattatagct 23280cttgcaatat cctcactttg
taaaggtgtt attgtactaa gttgttcttt actaagctgt 23340ttcgtgatcg ggtcagaaat
taagtcatta aatggcgtat cgactaaacc tggctcaatg 23400atggtaacgc gaatgttgtc
taaagatacc tcctggcgta atgcttctga aagagcattg 23460acgcctgatt tggcagcact
ataaacgacc gcaccggact gcgctatcct gccatcgaca 23520gaagatatat tgactatatg
accggatttt tgggccttca gaagaggcaa aactgcgtgg 23580atagcatata aaactcccag
aacattcaca tcgaatgctc gcctccagtc tgcgggattt 23640ccagtatcaa ttgcaccaaa
cacaccaatt cctgcattat tcaccaaaat atctacatgt 23700cctagctcaa ccttggtctt
ttggactaga tgatttactt gagattcgtc tgtaatatct 23760gtaacaatag gcaatgcttg
accaccactg gcttcaatcc gttttgctag tgcatgcaaa 23820agctcagcac gtcttgcggc
gatcgcaact tttgccccct ccgcagctaa agcaaatgct 23880gtagcctctc caatcccaga
ggaagctcca gtaataatcg ccacttttcc atccaattta 23940cctgccatca gtcactcctt
agttttcgtt ttgctggtgc aatatgtaat aagtgcgttt 24000tgtacttgat tttgttcttt
ggtgattttt atataggagc gcataaagtg cttagtgatc 24060actttatttt ttagtgccat
tcaacttaaa ttaacaaacc ccataagtaa cacctagttg 24120ctttagccat cgacgatagg
caagtgtgca tctatctgat ggtacgtgga tttcgtgtga 24180aaacaattgt gtatttatct
gctttggagt taacagtggt aaacgtaccg gctgttgtgc 24240atgtaagatc cgaatatctt
gttctattgt ttcgtcatat tcagttagca tctttgactc 24300taacgtttca tacccgttcc
acattatcaa catacgcaat acactatttt cctcatcaat 24360cggtgtgatc gtcattaaat
ccacaatcct catttcaggg gattctgaaa cgcagtattg 24420acataaagga tgactaagcc
tgaaccaatt aacccaagag tcatcttcga tatggctgac 24480aatccttgat gtctggaatt
gatacttacc catagtaagg ccatctttat ctaatttcac 24540ctcaaattct tccacttttg
tataattgcg atcacctaac caaccgtcat ggataaaagg 24600aaaatgagac acgtctaagg
aattatccat cacacgaaac gcactagctt taatcaagta 24660agacttggta taagtcttgt
gataattcgg atcatcccat tcaggaaatg aaggtatatc 24720attaacagga tcgcccaagc
acacccacac taagccatag cgctcctggg agtgatatgt 24780cctggcttca gcacttgccg
gtggtaccat gccagggtga gctgggatct gtatgcattt 24840accagcctca ttgtatctcc
atccgtgata cggacaaact aaagtattat tcgtaatttc 24900tcccatagac agaggaacac
ctcggtgggg gcagtagtca agccatacct gtatgggtga 24960attttgttca taactgcgcc
ataataccaa cttcactccc aacaaacgag atctggtgat 25020acttccaggt ttacagtctt
ctacattggc gactacgtgc cagttattga ttaagattgg 25080gtcggtagtt gtcataattg
tctcctagtt ttgccagcca gcgaggcgta agtcagaatt 25140taagtttatg cttgtgtttg
agcctgcgat cgctaaatta tccttttcaa ggcatccacc 25200aacagtggtt tgatgttgtt
ttttgtaaaa atcagagtta gcatcctgta atcggtaatt 25260gaagtgttgg cagctgcggt
atgccataca gttggtgtat aaaacattgc tgcccctcct 25320ggaagtgaaa gacatatttc
tgcatttagt gaattggcag aagatgaatc taatgagtgt 25380tcccattggt ggctacttgg
tataactcgc attgtaccca tagtattatc tgtatcctgt 25440aagtatatag ttatgaatac
catggcttga ttggctactg gaaccaacaa ccgaagcgcg 25500tcgtcattta actcgttttt
tgacatggat gcaagtgcgt tcaatacttc aactacatat 25560ccatggtctt gatgccaagc
aatgtatcct gtacctgcac gaattatggc tagatcggtg 25620atcaatagga agatatcaga
cccaattaga gcctgtactg gtcccatcac agttggaagc 25680tctaaaagcc tctgaattat
cttttgatac ctaactggat ctgggatagt atgctcagac 25740caccactcat agtcacccgc
caatactccc ccacgttttt gttcggtaat aagttctact 25800tcatgccgta tttcttcaat
taacgctttt ggtacagctt cttcaactgt gaaataacca 25860tcatttgtgt aagcttgttt
ttgttccgct gtgagcatct ctcttattct cttgcaattc 25920aaaggattta gtggatcgtc
tggacataat taaggtcaat actgctgtaa ctatcaatgg 25980ttagtaggaa ttatcctata
gctgttcttt ctctggatag aagaaaggtt gtgagaagct 26040cgctccgact tcatttcagc
caatttttct gcagaccaat actgaaaata tcccaatctt 26100aataattcat cactagcctc
ttgtaactgg ctgaatgact gtactgatgc taaaacatac 26160ttagggtgag ttatgattac
gttattcaca ttctccgcgt catcaccaac atattgtttg 26220tctggatgcg atcctaaagc
taccaaatcg tattctggta atacataatt cgccttggta 26280atgtaccttt ccaacctctg
tgcatctagg ttttgagggt cgcagccaaa aatcaccatt 26340tcaaagtcat tattccatgt
tcttatctgt tccattagaa gctctggcag ttcaggtcca 26400tgaaaccaac gaacactaac
acggttattt aaccaagctg ccttcgcgta aggacagggt 26460ggaaaatttc ctgttagagg
attgggaatg ctgacaacat tgataatcca atcctctatt 26520tcttggcgaa attgttcgat
atttatcata actgttgatt tttcctcctt tgtagtaatt 26580agtagttaaa ggatttagtg
gatattaatc taggtcatag tataaccata tattaggctc 26640gatgtatatt cccatattgt
tgggatagtc aattttgaca ggtactaagc ctttgggaat 26700aatatagtca ccagtttctg
gaaaacgcat cccaactcta tcttcccaac cgtcaatagt 26760atcattaatt gttgtggatt
taaaacagat ccctgcaatt ttagccccat gtttgacatt 26820aactcgtaac caagggtcaa
atataagacc atttttatct cgccaggtaa tataccgctc 26880tatgggtata agtgggtaaa
gatattttag gcttggacgt gcagccatga tcaaagaatt 26940aagaccgtgg tattgagcaa
gttctttcat gtatccaatc agatactgac tcaagttttt 27000gccttgatac tctggtagga
ttgaaatcga tactacacat aacgcattag gcaggcggtt 27060ctgttctcgg tcttcaagcc
acttggctaa agcccagtca caaccttcgt ccggtaactc 27120atcaaaacgg ctttcataag
ttaaagggat acagtttcct tgcgctatca taagctgtgt 27180ggtagcttct actaacccaa
actggaattc tggataaatt tcaaatagag ctaaggaagc 27240tggatctgcc cagacatcat
gtatcaaaaa ttttgggtat gcttgatcaa agacactcat 27300cgtcctttcc acaaaatcag
aagtttcttt tggggttaca aagctatact ctaaattatg 27360ctgtacaatt tgaatggtca
ttggttattg gctaatcctt aaatttatac tggaagtcaa 27420atgagatctc actatcgtta
ttatctggaa gtacttgcac tgtcaattca ttaccgactt 27480tcccattccc aggcataatt
aataagttag ggtgaggtgg aatgccgtcg tactgtcgga 27540cgcggcgaaa aatgctcgaa
ttctcgccac catgtttatt caagaggact tcaactggtg 27600tgatgacaaa agtcattcct
gacccaaggt ggcgcgatcg ccgcttttga tttgctggag 27660tggaaacact aacaaataag
gcacaccctc ctagagaata agaccagtta gcagactgcg 27720gatcggcaga ccaatggcag
ggacaagaca ccgcatcaag gctatgtaac gcattcaaaa 27780aatcaaatgc ttgacctgca
tattcctcta ctgtaagaac tgttggttca ggtgggaaaa 27840agatgacaag tgtcagaaga
tccgcatttt cgtgctgaag caattcgttt tcattaactt 27900catcaatgta tttgtagata
ccctcaagcg tatgctcaac caagatcggg tcagttaaag 27960atgagactat caggtatcta
atcattccct tctgttcccc gatagttccc cagaagcaag 28020ggaaggcaga atcgctgatt
gtttcaacaa atgttgagta gctagtgcgt acccaagcag 28080gaaggcactc ctctagaaga
gaggattcca tctggctttt gttccagatt ggtgtaactc 28140cgtcaggaca taaattcttg
attaccatag ctgagttgaa aagtgagctt atttatacaa 28200aaacgatgga agtgacacct
gatggatggg acttcaaccc cctacacata attattatca 28260ttactatgtg gcaggtcctt
ctatatctta ttttttggaa gtccctgaaa attattcaac 28320aagatcgaga cgttgttgtt
gccagaattt gtgacagcca ggtcaagctt gctgtcgccg 28380ttgaaatccg caattgctat
agattcagga ttagtaccga ctggaaagtt agtagctatg 28440ccaaaagacc cattaccatt
tcctggtaag accgagacgt tattgctact ataatttgta 28500acagccaggt caagtttact
gtcgccattc acatctctaa tcgctacaga gtagggatta 28560gtaccggctg gaaagttagt
ggctgcgcca aaagacccat taccatttcc cagtaagacc 28620gagacgttat tgctgctagt
atttgcaaca gccaggtcaa gcttgctgtc gccatttaca 28680tccccagttg ctacaaatat
gggattagta ccgactggaa agttagtggc tgcgccaaaa 28740gacccattac catttcccag
taagaccgag acgttattgc tgacccaatt tgtaatagca 28800aggtcgagct tactgtcgct
attaaaatcc gcaatcgcta cggaaatcga ataagtatcg 28860acagggaagc tgctggctgc
gccaaaagac ccattaccat ttcccagtaa aaccaagacc 28920ttattgtcga accaatttgt
aaaagcaagg tcaagctcac tatcgttatt cacatctcca 28980atggctacag aataagggtt
agtaccaact gaaaagttag tggctgcgcc aaaagaccca 29040ttaccatttc ctagtaagac
cgagacgtta ttgctactaa aatttgcaac agccaggtca 29100agcttgctgt cgccatttac
atccccagtc actacaaaga cgggattagt accgactgga 29160aagttagtgg ctgcgccaaa
agacccatta ccatttccca gtaagaccga gacgttattg 29220tcgaaccaat ttgtaacagc
caggtcgagc ttactatcgc tattgaaatc cccaactgct 29280acagagtcag catcaagacc
agttgggaag ttaatagcag tagcataact actcctgtgg 29340gcaaatctca ctcctacgga
caaattaacc ggaacactaa attgcccaga aagcttttca 29400ttcttcagat aatagtcagt
tatatttgct aatgcaacag gagttataca taaaaatgta 29460ctaacagata atatccccgc
tataattagt aaagtgagcc ttttcacgag ttgtatagtt 29520caaatgtatt aacaatgttt
gtagccatac accatcgtgt atgaagaaag gtattgatcg 29580caaaatatct atccttgatc
tagcctatca cctaagttaa gccatattga gttctattta 29640gattttcttt ataaatcagc
tataatctat tgtttgaaaa ttgtgaattt gttttccacg 29700tatttgagta gttgttctag
gctttcctcg acggtgagtt cggatgtttc cacccataaa 29760tctgggctat tgggtggttc
ataaggggcg ctgattcccg taaatccatc tatttcccca 29820ctgcgtgctt ttagataaag
acctttcgga tcacgctgct cacaaagttc cagtggagtt 29880gcaatgtata cttcatgaaa
tagatctcca gctagtctac gcacctgttc tcggtcattc 29940ctgtagggtg agatgaaggc
agtgatcact aggcatcctg actccgcaaa gagtttggca 30000acctcaccca aacgacggat
attttctgag cgatcactag cagaaaatcc taaatcggaa 30060cacagtccat gacgaacact
atcaccatct aaaacaaagg tagaccatcc tttctcgaac 30120aaagtctgct ctaattttaa
agccaatgtt gttttaccag ccccggacag tccagtaaac 30180catagaatcc cgcttttatg
accattcttt agataacgat catatggaga tataagatgt 30240tttgtatagt gaatattagt
tgatttcata ttgctggagt ttagactaaa cagaagagcg 30300atcgctccat gcctgagatt
ttagtcagta tttccactcc tgtcaaacca ccaaaaacac 30360ggggtaacct ggaaaattcc
cctggggatc agctgaaaac tgctgtttaa cctgcattat 30420tcatgaaggc aaaaacagga
aaaacaaaac ctaacattta taccccaatt tatggcggaa 30480ctaacttaat aagtaaaaag
taaattaaac ctaattaaaa tccctgattt taaccccaaa 30540atcaatattt taaacctcaa
aacttctctt aatcccccat ttagacacac ctatcctatc 30600aaggcttaat tttaagaaaa
aattatttca aactcgctcg ccaaacgctc cataatcaaa 30660ttaatttcag acgaaaaagg
acagtaatat ggtagctcta ccaacaccct tcttgcggaa 30720actgtcacct tcgctgctat
tttgataatc gtttccctta acctaggaac ctgggcttta 30780gccagttttg ttccctgtgc
tgcttgccga attcccaaca ttaaaatgta agctgcttga 30840gataaaaata accgaaactg
attgacaata aatttctcac agctgagtct atctgatttt 30900atccccagtt ttaattcctt
aattctatgc tctgaagtag ctcctctttg aacataaaat 30960ttatcgtata aatcctgagc
ttctgtttcc aagctagtaa ttataaatct aggattgggt 31020cctttttcta gccattctgc
tttcataatt actcgccgag gttctgacca actccgagct 31080gcgtaataca catcatcaaa
taaacgaact ttttctcctg tgcgacaata ttccagtctg 31140gctcggtcaa gaaggtaatt
aatttttcgt tttaagacat cattattgct gaatccaaaa 31200acatatccaa ccccgctttt
ttcacaaacc tcaatgattt ctggtaacga gaaacccccg 31260tctcccctca gaacaattct
aatttcaggt aaggctcttt tgattcgcaa aaataaccat 31320tttagaatgc cagctactcc
tttaccagag tgagaatttc ccgcccttag ttgtagaact 31380aatggataac cactggaagc
ttcattaatc agaactggaa agtagatatc atgcctatgg 31440taaccattaa ataagctcag
ttgttgatga ccatgagtta gagcatccca cgcatctatg 31500tccaggacaa tctcttttga
ttcccgagga taggattcta ggaatttatc aacaaataac 31560cgacgaattt gtttgatatc
tttttgagtc acctgatttt ctaaacgact catagttggt 31620tgactagcta ataagttttc
tcctactgtg ggaacttgat tacaaactag cttaaaaatt 31680ggatcttggc gcaatttatt
actatcgttg ctatcttcat agccagcaat tatttgataa 31740attcgttggc taattaattg
agaaagagaa tgtttgactt tagtttggtc ccgattatcc 31800gtcaaacaat ctgccatatc
ttgacaaatt tttacctttt cttctacttg tcgtgccaga 31860ataattccgc catcactact
taaactcata tcagaaaaag tcagatctaa agttttttta 31920tcgaagaaat ttaaagataa
tcttgaggaa gatttagtca tatatagtgg ataggtttaa 31980tttttaaaat cctgatttat
tatagctgtt tttattcctt tttttcagtt tataactaaa 32040gttagttatt atttaatttg
gtgacggata ggaattacag agtgttggga tgacaaaatt 32100gccgtagctg ttgcagtata
accctttcag cgatttttat tctactctga tgaataatcc 32160aggataggct tgccatcact
ttctgggtag acaatgtcag gcgcgattgt ctccccaccc 32220tgattaacgt tagattttat
cacccccagt tgagtttttg gtgcaatttc cctcaccata 32280tctatacctc ccattcactt
tggtattgac tcaatcggtt caatttacta taacatgact 32340tatgtggggg tgtgtgcata
ccctcactta aaattaatgg atttgaatct cctcgcactg 32400ctgcaacttg aaaaactctg
agagtcagtt gagagctaac tctaccagga ggagagtttt 32460taaaaacccc cttcccgagc
gatcgcataa tttatggtat acaagaatag tgggtgaaaa 32520actaactggc gatcgctctt
ttcatttaag agacacccct tagttttttt tgcagtctca 32580tgaatttaaa cgatatctaa
ttattttcaa cctatctttg ccctgtaaca atgtatgcta 32640ccctttgacc aatattagta
gcatgatctg ccattctctc taaacactga attgctaatg 32700ttaatagtaa aatgggctcc
actaccccgg gaacatcttt ctgctgcgcc aaattacgat 32760ataacttttt gtaagcatca
tctactgtat catctaataa tttaatcctt ctaccactaa 32820tctcgtctaa atccgctaaa
gctactaggc tggtagccaa catagattgg gcatgatcgg 32880acataatggc aacctccccc
aaagtaggat gggggggata gggaaatatt ttcattgcta 32940tttctgccaa atctttggca
tagtccccaa tacgttccaa gtctctaact aattgcatga 33000atgagcttaa acaccgagat
tcttggtctg tgggagcttg actgctcata attgtggcac 33060aatcgacttc tatttgtctg
tagaagcgat caattttttt gtctaatctc cgtatttgct 33120cagctgctgt taaatcccga
ttgaatagag cttggtgact cagacggaat gactgctcta 33180ctaaagcacc catacgcaaa
acatctcgtt ccagtctttt aatggcacgt ataggttgag 33240gtttttcaaa aattgtatat
ttcacaacag ctttcatatt tttaatctcg ggtttaatat 33300atttctagct attatagtct
tgattcagaa atatccgcca tcatgttgaa ccacctgggg 33360aagatgaatt tgtatccaag
caccaccggt atcaggatgg ttcatggccc tgattttgcc 33420accatgagct ataattattt
ggcggacaat ggataaccct aaaccactac cagtaatttc 33480tactgtttca ttctcagagc
gggactcgcg gtgtctagct ttgtcccccc gataaaatct 33540ttgaaagaca tggggtagat
ccatgggagc aaatccaacc ccggaatcaa taatgttaat 33600ttctaaaatc tgatttgata
cttggtttaa tattgtatct gcttctggat caaccccatt 33660aatagacttc tccccacaaa
ctggattcat ttcaatgaaa atagtaccgt tcaggttgct 33720gtatttaata cagttatcta
acagattaag aaacacttga taaattctgg acttatcagc 33780acatatatag accttttccg
ggccggagta agaaatacta agatgctgat tagcggctag 33840gggctctaaa ttctcccaga
ctgaaaaaat tagggagcgg acttctagca tttccaaatt 33900cagttgtatg gaggaggtta
tttccatctg ggtcaggtct aaccaatttt ggactaaatt 33960aattagtctg tcaacctcct
gcatcaagcg gatgacccaa cggtttagag ggggatctaa 34020gcgagtttgc agggtttctg
cgaccagacg aatggaagtc agaggtgttc tcagttcatg 34080ggccaggtct gaaaaagagc
ggtcacgttg ctgatgaatg tctacaaatt gttggtgact 34140ttctagaaac acacccactt
gtccccccgg taggggaaaa ctgttagctg ctaaagacaa 34200tggctttaat cctaaaatac
cctgaccatg atctcgggaa gggtgaaaaa tccactcttg 34260catttgcggt ttttgccaat
cccgggtttg ctcaattaac tgatccagct cataggatct 34320cactaattcc agtagcaggc
gcacttgacc cggttgccat ctttgtaaat acagcatttc 34380ccgcgcgcac tgattacacc
atagtagttg gttttcttca tctacttgta aatatcccaa 34440aggcgcagca tccagcaact
gttcataagc tttgagtgac aagcgtaagt tttgttgctc 34500atctctaacg gtagatattt
tacgatgtaa tccagctaat aggggtaata atatcttttc 34560agcgtgaggg tttaagggtt
gggttaactg ctccaaatga ctgttaagtt gaaattgttg 34620ccaaagccaa aaaccaaaac
cgactgccaa acccagaaga aatcccaata agaacatttg 34680atcgtaagtg tgctatttga
ccggaattaa agggggagga tccaagcacg gtctttacag 34740gacggctttt tctaattgtt
aaattataat tataatcggt agggactgct ttgggaaaat 34800gcgatcgccc aggtatctgt
aaccatttct gtaccacagg ttagactgga tcaggtaact 34860gatacacttc ttgctgaatt
ttatgtccaa tcaaaatgac aactcccaaa atgataactc 34920ccgtgacaag agccaaaaac
ccgaatccag cagatggttt aaaataaaaa gaccacgacc 34980acctaaagga ataggaaaac
caaaaacaga atagcccaca tatagaaatc aaccaaatct 35040atagccaaaa cccctaactg
tgacaatata ttctggatgg ctagggtcta actctaattt 35100ttccctcagc catcgaatgt
gaacatccac cgttttactg tcaccaacaa aatcaggacc 35160ccaaacctgg tctaataact
gttcccgtga ccacaccctg cgagcataac tcataaatag 35220ttctagtaac cggaattctt
tcggtgacaa gctcacctcc ctccctctca ctaacacccg 35280acattcctga ggatttaaac
tgatatcctt atattttaaa gtgggtatca agggcaaatt 35340agaaaaccgc tgacgacgta
acagggcgcg acacctagcc accatttccc gtacgctaaa 35400aggcttagtt aggtaatcat
ccgcccctac ctctaaaccc agcacccggt cagtttcact 35460acctttcgca ctcagaatta
aaatcggtat ggaattaccc tggtgacgta acaaacgaca 35520aatatctaat ccgttgattt
gtggcaacat caagtctagc acaagcaggt cgaaggataa 35580ctcaccaggt tgggtctcta
aattcctgat taattccaca gcacaacgac catccttagc 35640agtcacaact tcataacctt
caccctctaa ggctactaca agcatctctc ggatcagttc 35700ttcgtcttcc actattaaaa
cgcgactaac tggttcaata tccgatttag tgaagtatct 35760agggtaattc agtagtatac
attgataaca aaaatttgta agaatgtact ggtctgggtt 35820tcccactagt atatgatcct
cactcattga tgccacatat tggggaacac ggaattcttg 35880tattcaatac aacaatttgc
ttaaatttat aattcaaata ggtgttttat agaaaatttt 35940gtcgaatatt tccacatttg
tggcttttag ttcaggcaaa acgagagaag tctaaagtgg 36000gtggaatatc ctgaattctt
ccaggaccta tagcccgtag tgcttctggt aaactaatat 36060ccccagtata tagggcttta
cccacaatta ctcctgtaac cccctgatgt tctaaagata 36120ataaggttaa taggtcagta
acagaaccca cacccccaga ggcaatcacg ggtatggaaa 36180tagcagatac caagtctctt
aatgctcgca agtttggtcc ctgaagcgta ccatcacggt 36240ttatatccgt ataaataata
gctgccgcac ccaattcctg catttgggtt gctagttggg 36300gggccaaaat ttgagaagtt
tctaaccaac ccctggtagc aactagacca ttccgcgcat 36360caatcccaat tataatttgc
tgggggaatt gttcacacag tccttgaacc agatctggtt 36420gctctactgc tacagttccc
agaattgccc actgtacccc aagattaaat aactgtataa 36480cgctggagct atcacgtatt
cctccgccaa cttcaatagg tatggaaata gcattggtaa 36540tagcttctat agtagataaa
ttaactattt taccagtttt tgctccatct aaatctacta 36600aatgtagtct tgttgctcct
tggtctgccc acattttagc ggtttccaca gggttatggc 36660tgtaaacctg ggattgtgca
tagtcacctt tgtagagtct tacacaacgc ccctctaata 36720gatctattgc tgggataact
tccatgacta attagtgaat aggttaattt cagttgagct 36780aaatggagaa ggagggattc
gaaccctcgg atggacctta cgattccatc aacagattag 36840caatctgccg ctttcgacca
ctcagccacc tctccaggtt tgttataaat tatgatgggt 36900caatcctaac agacaatttt
tggcttgtca agagattttt tgcaagtgga ggaggaaatc 36960cgtcagggat ttcaatcctg
gtcaactttt ttttgatttt gaatataaag ttaagtttaa 37020caatttctag tggcgctcct
ccaacagtag atataaaata tgagttggtc cacaatgaag 37080gacgtcttga ttttaatagt
caaatccctc caaatccatt ataatcccat gaatgctctt 37140tcaattccta cctggattat
ccatatttct agtgtcattg aatgggtagt tgccatttcc 37200ctcatctgga aatatggcga
actgacccaa aaccatagtt ggaggggatt tgccttaggt 37260atgatacccg ccttaattag
cgccctatcc gcttgtacct ggcattattt cgataatccc 37320cagtccctag aatggttagt
caccctccag gctactacta cgttaatagg taattttact 37380ctttgggcag cagcagtctg
ggtttggcgt tctactcgac cgaatgaggt tctcagtatc 37440tcaaataagg agtagaccgt
tatgatgtca aaagaaactc tctttgctct ctccctgttc 37500ccctatttgg gaatgttgtg
gtttctcagt cgcagtcccc aaatgccccc ttaagggctc 37560tatggattct atggcacttt
agtatttgtt ggtgttacca ttccag
37606
SEQ ID NO: 2 (length: 1320; type: DNA; organism: Cylindrospermopsis raciborskii T3)
atgatcccag ctaaaaaagt
ttatttttta ttgagtttag caatagttat ttcacccttt 60ttatccatga ttgtgggtat
ttacgaaaat attaaattta gggtattatt tgatttggtg 120gtcagggcac taatggtggt
tgactgcttc aatatcaaaa aacatcgggt caaaattagt 180cgtcaattac ctctacgttt
atctattgga cgtgagaatt tagtaatatt gaaggtagag 240tctgggaatg tcaatagtgc
tattcaaatt cgtgattact atcccacaga atttcccgta 300tccacatcta acctgatagt
taaccttccc cctaatcata ctcaggaagt aaagtacacc 360attcgaccta atcaacgggg
agaattttgg tggggaaata ttcaagttcg acagctggga 420aattggtctc tagggtggga
caattggcaa attccccaaa aaactgtggc taaggtgtat 480cctgatttgt taggactcag
atccctcgct attcgtttaa ccctacaatc ttctggatct 540atcactaaat tgcgtcaacg
gggaatggga acggaatttg ccgaactccg taattactgc 600atgggggatg atctacggtt
aattgattgg aaagctacag ctagacgtgc ttatggaaat 660ctgagtcccc tagtaagagt
tttagagcct caacaggaac aaactctgct tatattatta 720gatcgtggta gactaatgac
agctaatgta caagggttaa aacgatatga ttggggttta 780aataccacct tgtctttggc
attagcagga ttacataggg gcgatcgcgt aggagtaggg 840gtatttgact cccagctgca
tacctggata cctccagagc gaggacaaaa tcatctcaat 900cggcttatag acagacttac
acctattgaa ccagtgttag tggagtctga ttatttaaat 960gccattacct atgtagtaaa
acaacagact cgtagatctc tagtagtgtt aattactgat 1020ttagtcgatg ttactgcttc
ccatgaacta ctagtagcgc tgtgtaaatt agtgcctcga 1080tatctacctt tttgtgtaac
actcagggat cctgggattg ataaaatagc tcataatttt 1140agtcaagact taacacaggc
ttataatcga gcagtttctt tggacttgat atcacaaaga 1200gaaattgctt ttgctcagtt
gaaacaacag ggagttttgg tgttggatgc accagcaaat 1260caaatttccg agcagttggt
agaaaggtac ttacaaatca aagccaaaaa tcagatttga
1320
SEQ ID NO: 3 (length: 439; type: PRT; organism: Cylindrospermopsis raciborskii T3)
Met Ile Pro Ala Lys Lys Val
Tyr Phe Leu Leu Ser Leu Ala Ile Val1 5 10
15Ile Ser Pro Phe Leu Ser Met Ile Val Gly Ile Tyr Glu
Asn Ile Lys 20 25 30Phe Arg
Val Leu Phe Asp Leu Val Val Arg Ala Leu Met Val Val Asp 35
40 45Cys Phe Asn Ile Lys Lys His Arg Val Lys
Ile Ser Arg Gln Leu Pro 50 55 60Leu
Arg Leu Ser Ile Gly Arg Glu Asn Leu Val Ile Leu Lys Val Glu65
70 75 80Ser Gly Asn Val Asn Ser
Ala Ile Gln Ile Arg Asp Tyr Tyr Pro Thr 85
90 95Glu Phe Pro Val Ser Thr Ser Asn Leu Ile Val Asn
Leu Pro Pro Asn 100 105 110His
Thr Gln Glu Val Lys Tyr Thr Ile Arg Pro Asn Gln Arg Gly Glu 115
120 125Phe Trp Trp Gly Asn Ile Gln Val Arg
Gln Leu Gly Asn Trp Ser Leu 130 135
140Gly Trp Asp Asn Trp Gln Ile Pro Gln Lys Thr Val Ala Lys Val Tyr145
150 155 160Pro Asp Leu Leu
Gly Leu Arg Ser Leu Ala Ile Arg Leu Thr Leu Gln 165
170 175Ser Ser Gly Ser Ile Thr Lys Leu Arg Gln
Arg Gly Met Gly Thr Glu 180 185
190Phe Ala Glu Leu Arg Asn Tyr Cys Met Gly Asp Asp Leu Arg Leu Ile
195 200 205Asp Trp Lys Ala Thr Ala Arg
Arg Ala Tyr Gly Asn Leu Ser Pro Leu 210 215
220Val Arg Val Leu Glu Pro Gln Gln Glu Gln Thr Leu Leu Ile Leu
Leu225 230 235 240Asp Arg
Gly Arg Leu Met Thr Ala Asn Val Gln Gly Leu Lys Arg Tyr
245 250 255Asp Trp Gly Leu Asn Thr Thr
Leu Ser Leu Ala Leu Ala Gly Leu His 260 265
270Arg Gly Asp Arg Val Gly Val Gly Val Phe Asp Ser Gln Leu
His Thr 275 280 285Trp Ile Pro Pro
Glu Arg Gly Gln Asn His Leu Asn Arg Leu Ile Asp 290
295 300Arg Leu Thr Pro Ile Glu Pro Val Leu Val Glu Ser
Asp Tyr Leu Asn305 310 315
320Ala Ile Thr Tyr Val Val Lys Gln Gln Thr Arg Arg Ser Leu Val Val
325 330 335Leu Ile Thr Asp Leu
Val Asp Val Thr Ala Ser His Glu Leu Leu Val 340
345 350Ala Leu Cys Lys Leu Val Pro Arg Tyr Leu Pro Phe
Cys Val Thr Leu 355 360 365Arg Asp
Pro Gly Ile Asp Lys Ile Ala His Asn Phe Ser Gln Asp Leu 370
375 380Thr Gln Ala Tyr Asn Arg Ala Val Ser Leu Asp
Leu Ile Ser Gln Arg385 390 395
400Glu Ile Ala Phe Ala Gln Leu Lys Gln Gln Gly Val Leu Val Leu Asp
405 410 415Ala Pro Ala Asn
Gln Ile Ser Glu Gln Leu Val Glu Arg Tyr Leu Gln 420
425 430Ile Lys Ala Lys Asn Gln Ile
435
SEQ ID NO: 4 (length: 759; type: DNA; organism: Cylindrospermopsis raciborskii T3)
atgatagata caatatcagt
actattaaga gagtggactg taatttccct tacaggttta 60gccttctggc tttgggaaat
tcgctctccc ttccatcaaa ttgaatacaa agctaaattc 120ttcaaggaat tgggatgggc
gggaatatca ttcgtcttta gaaatgttta tgcatatgtt 180tctgtggcaa ttataaaact
attgagttct ctatttatgg gagagtcagc aaattttgca 240ggagtaatgt atgtgcccct
ctggctgagg atcatcactg catatatatt acaggactta 300actgactatc tattacacag
gacaatgcat agtaatcagt ttctttggtt gacgcacaaa 360tggcatcatt caacaaagca
atcatggtgg ctgagtggaa acaaagatag ctttaccggc 420ggacttttat atactgttac
agctttgtgg tttccactgc tggacattcc ctcagaggtt 480atgtctgtag tggcagtaca
tcaagtgatt cataacaatt ggatacacct caatgtaaag 540tggaactcct ggttaggaat
aattgaatgg atttatgtta cgccccgtat tcacactttg 600catcatcttg atacaggggg
aagaaatttg agttctatgt ttactttcat cgaccgatta 660tttggaacct atgtgtttcc
agaaaacttt gatatagaaa aatctaaaaa tagattggat 720gatcaatcag taacggtgaa
gacaattttg ggtttttaa
759
SEQ ID NO: 5 (length: 252; type: PRT; organism: Cylindrospermopsis raciborskii T3)
Met Ile Asp Thr Ile Ser Val
Leu Leu Arg Glu Trp Thr Val Ile Ser1 5 10
15Leu Thr Gly Leu Ala Phe Trp Leu Trp Glu Ile Arg Ser
Pro Phe His 20 25 30Gln Ile
Glu Tyr Lys Ala Lys Phe Phe Lys Glu Leu Gly Trp Ala Gly 35
40 45Ile Ser Phe Val Phe Arg Asn Val Tyr Ala
Tyr Val Ser Val Ala Ile 50 55 60Ile
Lys Leu Leu Ser Ser Leu Phe Met Gly Glu Ser Ala Asn Phe Ala65
70 75 80Gly Val Met Tyr Val Pro
Leu Trp Leu Arg Ile Ile Thr Ala Tyr Ile 85
90 95Leu Gln Asp Leu Thr Asp Tyr Leu Leu His Arg Thr
Met His Ser Asn 100 105 110Gln
Phe Leu Trp Leu Thr His Lys Trp His His Ser Thr Lys Gln Ser 115
120 125Trp Trp Leu Ser Gly Asn Lys Asp Ser
Phe Thr Gly Gly Leu Leu Tyr 130 135
140Thr Val Thr Ala Leu Trp Phe Pro Leu Leu Asp Ile Pro Ser Glu Val145
150 155 160Met Ser Val Val
Ala Val His Gln Val Ile His Asn Asn Trp Ile His 165
170 175Leu Asn Val Lys Trp Asn Ser Trp Leu Gly
Ile Ile Glu Trp Ile Tyr 180 185
190Val Thr Pro Arg Ile His Thr Leu His His Leu Asp Thr Gly Gly Arg
195 200 205Asn Leu Ser Ser Met Phe Thr
Phe Ile Asp Arg Leu Phe Gly Thr Tyr 210 215
220Val Phe Pro Glu Asn Phe Asp Ile Glu Lys Ser Lys Asn Arg Leu
Asp225 230 235 240Asp Gln
Ser Val Thr Val Lys Thr Ile Leu Gly Phe 245
250
SEQ ID NO: 6 (length: 396; type: DNA; organism: Cylindrospermopsis raciborskii T3)
tcacccccaa cttattgcag
aaaaactttt ttctcttagg taataaatta gtagtttaat 60tgaaaagcat agcatctctt
ttgacttgga ataacaaaat gtcttacgat gtagtctagc 120taaatagtga cgcaaacgac
tgttttctcc ctcaactcta gtcattgatg ttttactaat 180aatttggtct ccatcgggaa
taaattttgg gtaaacttta tagccatccg taatccaaaa 240ataggatttc caatgctcta
tctttttcca taatttggca aatgttttgg cacttctatc 300tcccactaca tattgaataa
ttcccgaacg tttgttatct acaactgtcc agacccatat 360cttgtttttt tttaccaata
aatgtttcca actcat
396
SEQ ID NO: 7 (length: 131; type: PRT; organism: Cylindrospermopsis raciborskii T3)
Met Ser Trp Lys His Leu Leu
Val Lys Lys Asn Lys Ile Trp Val Trp1 5 10
15Thr Val Val Asp Asn Lys Arg Ser Gly Ile Ile Gln Tyr
Val Val Gly 20 25 30Asp Arg
Ser Ala Lys Thr Phe Ala Lys Leu Trp Lys Lys Ile Glu His 35
40 45Trp Lys Ser Tyr Phe Trp Ile Thr Asp Gly
Tyr Lys Val Tyr Pro Lys 50 55 60Phe
Ile Pro Asp Gly Asp Gln Ile Ile Ser Lys Thr Ser Met Thr Arg65
70 75 80Val Glu Gly Glu Asn Ser
Arg Leu Arg His Tyr Leu Ala Arg Leu His 85
90 95Arg Lys Thr Phe Cys Tyr Ser Lys Ser Lys Glu Met
Leu Cys Phe Ser 100 105 110Ile
Lys Leu Leu Ile Tyr Tyr Leu Arg Glu Lys Ser Phe Ser Ala Ile 115
120 125Ser Trp Gly
130
SEQ ID NO: 8 (length: 360; type: DNA; organism: Cylindrospermopsis raciborskii T3)
ttatctacaa ctgtccagac
ccatatcttg ttttttttta ccaataaatg tttccaactc 60atccagttga caaacttcag
gtgtttggga attattatta ctatctgata actgacgacc 120tagctttttg acccaacgaa
tgactgtatt gtgatttact ttagtcattc tttcaattgc 180cctaaatcca ttcccattta
catacatggt taaacatgct tcctttactt cttgggaata 240acctctagga gaataagatt
caataaattg acgaccacaa ttcttgcatt gataattttg 300ttttcccctt ctctggccat
tttttctaat attattggaa tcacagtttg aacagttcat
360
SEQ ID NO: 9 (length: 119; type: PRT; organism: Cylindrospermopsis raciborskii T3)
Met Asn Cys Ser Asn Cys Asp
Ser Asn Asn Ile Arg Lys Asn Gly Gln1 5 10
15Arg Arg Gly Lys Gln Asn Tyr Gln Cys Lys Asn Cys Gly
Arg Gln Phe 20 25 30Ile Glu
Ser Tyr Ser Pro Arg Gly Tyr Ser Gln Glu Val Lys Glu Ala 35
40 45Cys Leu Thr Met Tyr Val Asn Gly Asn Gly
Phe Arg Ala Ile Glu Arg 50 55 60Met
Thr Lys Val Asn His Asn Thr Val Ile Arg Trp Val Lys Lys Leu65
70 75 80Gly Arg Gln Leu Ser Asp
Ser Asn Asn Asn Ser Gln Thr Pro Glu Val 85
90 95Cys Gln Leu Asp Glu Leu Glu Thr Phe Ile Gly Lys
Lys Lys Gln Asp 100 105 110Met
Gly Leu Asp Ser Cys Arg 115
SEQ ID NO: 10 (length: 354; type: DNA; organism: Cylindrospermopsis raciborskii T3)
ttatgacctc attttcattt ctagacgttc agcaacgggc attaactcac gtatcagatc
60aaagtttcct acgttccgtc tcatccagtc taataagaat ttttctcctt catctagctt
120acctttatca tcaacaaaaa ccatctgctc gcaccaatct acaaatccgg aattagtcat
180ctcatagact aaaatgatgg gaggaaagtg tgcgaatccc attttttcaa tgacttccat
240acaaaccagc ttaaatactt gttcgtttgt caattcatta gacataaaga attttccttt
300aatcaattct gtttctaatc ctaccacaga gtaataactc ttggtctgga acat
354
SEQ ID NO: 11 (length: 117; type: PRT; organism: Cylindrospermopsis raciborskii T3)
Met Phe Gln Thr Lys Ser
Tyr Tyr Ser Val Val Gly Leu Glu Thr Glu1 5
10 15Leu Ile Lys Gly Lys Phe Phe Met Ser Asn Glu Leu
Thr Asn Glu Gln 20 25 30Val
Phe Lys Leu Val Cys Met Glu Val Ile Glu Lys Met Gly Phe Ala 35
40 45His Phe Pro Pro Ile Ile Leu Val Tyr
Glu Met Thr Asn Ser Gly Phe 50 55
60Val Asp Trp Cys Glu Gln Met Val Phe Val Asp Asp Lys Gly Lys Leu65
70 75 80Asp Glu Gly Glu Lys
Phe Leu Leu Asp Trp Met Arg Arg Asn Val Gly 85
90 95Asn Phe Asp Leu Ile Arg Glu Leu Met Pro Val
Ala Glu Arg Leu Glu 100 105
Met Lys Met Arg Ser 115
SEQ ID NO: 12 (length: 957; type: DNA; organism: Cylindrospermopsis raciborskii T3)
tcataactta ttacttgacg gagttgcagg ggcatacctt aacttgacct tgggagcgat
60agaagaaagg aaggcttcag tgacgggtct ttgactaatc ccagtttcca cttcaactaa
120aacagcatca caaatgtcga atagtgattg agaatatcta ttcatattca tgaaagtcag
180agcagattcc atcggagaca tggatgaatt aaaggcagcg ttttcagcgt atcgacctgt
240aaatatattc ccgtgggaat cttttaacgc tacccctgca aaatttttcg tgtagggagc
300ataactttga ttggcagcgg atagagcagc aagcacaaca tcatcggtag aataggtctc
360cagatcatga aatactgttt gcattaatcc acctgtgagt cctagatccg ctggtccaaa
420tggctcgggt agaaaatgtg ggagtttatt tgaggtataa gtttgctcag gctgtgattc
480attagacttc acaagaagaa caaaattttg atttacagtt gccatctcgt ataaaaattg
540tcggcagtat ccacatggtg cttcgtggat tgctaatgct tgtaaaccgg tttctccgtg
600caaccacgca tttatggtgg cggattgttc tgcgtgaact gagaaactaa gtgcctgtcc
660tacaaattcc atgtcggcac caaaataaag agttccagaa cccagttgat tcttagattg
720tggtttacca agagcgatcg cccctacata aaactgcgat attggtaccc tagcataagt
780tgcggctacg ggtagtaatt gaatcattaa cgtactaata ttagtaccaa gtcgatcaat
840ccaagatgcg acaacacttg agtcaattac agcatgttgg gcaagaattg tccttaactc
900tgattgaatg gaacgtggaa ccttggcaat cgcctgttct aatgctacat gggtcat
957
SEQ ID NO: 13 (length: 318; type: PRT; organism: Cylindrospermopsis raciborskii T3)
Met Thr His Val Ala Leu
Glu Gln Ala Ile Ala Lys Val Pro Arg Ser1 5
10 15Ile Gln Ser Glu Leu Arg Thr Ile Leu Ala Gln His
Ala Val Ile Asp 20 25 30Ser
Ser Val Val Ala Ser Trp Ile Asp Arg Leu Gly Thr Asn Ile Ser 35
40 45Thr Leu Met Ile Gln Leu Leu Pro Val
Ala Ala Thr Tyr Ala Arg Val 50 55
60Pro Ile Ser Gln Phe Tyr Val Gly Ala Ile Ala Leu Gly Lys Pro Gln65
70 75 80Ser Lys Asn Gln Leu
Gly Ser Gly Thr Leu Tyr Phe Gly Ala Asp Met 85
90 95Glu Phe Val Gly Gln Ala Leu Ser Phe Ser Val
His Ala Glu Gln Ser 100 105
110Ala Thr Ile Asn Ala Trp Leu His Gly Glu Thr Gly Leu Gln Ala Leu
115 120 125Ala Ile His Glu Ala Pro Cys
Gly Tyr Cys Arg Gln Phe Leu Tyr Glu 130 135
140Met Ala Thr Val Asn Gln Asn Phe Val Leu Leu Val Lys Ser Asn
Glu145 150 155 160Ser Gln
Pro Glu Gln Thr Tyr Thr Ser Asn Lys Leu Pro His Phe Leu
165 170 175Pro Glu Pro Phe Gly Pro Ala
Asp Leu Gly Leu Thr Gly Gly Leu Met 180 185
190Gln Thr Val Phe His Asp Leu Glu Thr Tyr Ser Thr Asp Asp
Val Val 195 200 205Leu Ala Ala Leu
Ser Ala Ala Asn Gln Ser Tyr Ala Pro Tyr Thr Lys 210
215 220Asn Phe Ala Gly Val Ala Leu Lys Asp Ser His Gly
Asn Ile Phe Thr225 230 235
240Gly Arg Tyr Ala Glu Asn Ala Ala Phe Asn Ser Ser Met Ser Pro Met
245 250 255Glu Ser Ala Leu Thr
Phe Met Asn Met Asn Arg Tyr Ser Gln Ser Leu 260
265 270Phe Asp Ile Cys Asp Ala Val Leu Val Glu Val Glu
Thr Gly Ile Ser 275 280 285Gln Arg
Pro Val Thr Glu Ala Phe Leu Ser Ser Ile Ala Pro Lys Val 290
295 300Lys Leu Arg Tyr Ala Pro Ala Thr Pro Ser Ser
Asn Lys Leu305 310
315
SEQ ID NO: 14 (length: 3738; type: DNA; organism: Cylindrospermopsis raciborskii T3)
ttaatgcttg agtatgtttt
cctcctgctt acaaggcaaa gctttccttt tttgtagcaa 60atcccaaact gctttgagag
atttaattgc ttggtctatc tcctcttcgg tattggcggc 120tgtaatcgaa aaccttaaag
cacttttatt taaaggtacg attggaaaaa tagcaggagt 180aattaaaata ccatattccc
aaaggagttg acacacatca atcatgtgtt gagcatctcc 240cactaacacg cctacgatgg
gaacgtaacc atagttatcc acttcgaatc caatggctct 300tgcttgtgta accaatttgt
gagttaggtg ataaatttgt tttcttaact gctccccctc 360ctgacgattc acctgtaatc
cggctaaggc acttgccaaa ctcgcaacag gagaaggacc 420agaaaatatg gcagtccaag
cgttgcggaa gttggttttg atccggcgat cgccacaagt 480taagaatgct gcgtaagaag
aataggcttt ggacaaacca gctacataga tgatattatc 540ctctgcaaac cgcaggtcaa
aataattcac catcccgttt cctttgtaac cgtaaggcat 600atcgctgctg ggattttcgc
ccaaaatgcc aaaaccatga gcatcatcca tgtaaattaa 660ggcattgtac tcttttgcca
gatgcacgta agctggcaga tcgggaaaat ctgccgacat 720ggaatacacg ccatcaatga
caataatctt tacttgttca ggcggatatt ttgctagttt 780ttcggctaaa tcgttcaaat
cattatgtcg atattggatg aactgggctc ctttgtgctg 840agccagacag cacgcttcat
aaatacaacg atgtgcagct atgtcaccaa agatgacacc 900attattccca gttaatagtg
gtaaaattcc tatctgaagc agtgttacag ctggaaatac 960taaaacatca ggtacgccta
aaagtttgga caattcttcc tccaattcct cataaattgc 1020tggggaagca acaagccgag
tccagcttgg atgtgtgccc catttatcca aagctggtgg 1080aattgcttcc ttaacttttg
gatgcaagtc aagacctaaa tagttgcaag aagcaaagtc 1140tatcacccaa tgtccgtcaa
ttagcacctt gcgaccttgt tgttctgtga cgactcttgt 1200gacttgagga attttttgtt
ggttaactac gttttccaga gtgttgattt cgttggctga 1260gtcaacaggt ggagctagat
cagattgttt ctcttgtacc acttggtttt ggaaataagt 1320gatgatggca gttggagtgt
tcttttgtaa aaagaacgtt ccagacagat tgatccctaa 1380acgttcctct aggagcgttt
gcagttctaa taaatctaaa gaatctaatc ccatatccag 1440cagtttttgt tgtggagcgt
aggctgcctg acgttgggaa cccattactt ttaagatgca 1500ttctttaacg agatccgcta
cagttttgtt ttccttagtt gcagatgttg cttttggtac 1560caatgaacca attgctgagt
taatatacgg tcctttgcga tcaccaggcg agtgcaaagc 1620actgtcgcgc aggttatatt
caatcaaaat acccatgccg agattatctg tatcttccgg 1680acgataatta gcaataattc
ccctaatttc ggctcctccc gacacatgga aacccacaat 1740tggatccaga agctgtcgtt
gctcattgtg tagctttaaa tactccatca tcggcatttg 1800ggaataattg acataatttc
gacagcgagt tacacccacc acgctctcaa tgccgccttt 1860cagggtacag tagtaaagca
taaagtcccg caattcattt cctaaccccc gcgcctgaaa 1920ctcaggtaga atatttagtg
cgagcagttg aataactgac ccttggggag tatgtaacgt 1980cggcacttgc gcatatttta
cattctctaa tgcctcagtg ctggtaattg tttgggaata 2040aatcgcacca ataatttgat
cttctataat cagcactaaa ttaccttgcg ggtttagctc 2100aagtcttcgc cgaatttcat
gagtagatgc ccgtaaattt tctggccaac acttgacctc 2160caagtcaact aaggcaggta
aatctgacaa ataggcatga ctaattttgt aaggtctttt 2220ctcgaagtaa ttaagcgtaa
tgcgagtaaa aggaaatgtt tttgggtatc ttttagaaag 2280ctctagtttt ggaaatagac
ctacttgtgc agcagacatg agaaaaacct cagcttccac 2340aagatactgc tgagaaaatc
cctgaaacgc atcgaaatgt aagttttcgc ttttgtctaa 2400aaactgatag actacccttg
gttccaaaca atggacctcc aaaatcatta aaccgtgttt 2460attgaccact tgagaccatc
tttctaagtg ttccaccaaa ctttgcacca taacatgagg 2520aggaataagc tctccttgat
catcgacaca gactgattgg taaggtaagt gagcacgttc 2580tttcaattcg tttcttttct
gaggaggaat aaagagacga tcatggtcga ggaacgaacg 2640gatgtgcagg atattttcgg
gatcatgaat gccatgagct tctaaagaac gcaccatttg 2700ttctgggttc ccaatatctc
cctgtaaaac taagtgggga aggctagcaa gggtgcgtgt 2760ggtagctttt aaagaagctt
cgttataatc tacacctata agacgcaggg gatactgttc 2820gagtgctttt cccctagcag
acttaaattg aatggtttcc cagactcgtt tcaggagagt 2880tccatcgcca caccccatgt
cagtaatgta tttgggttgt tcttctaatg gcaactgatt 2940gaatactgag aggatacttt
cttctaaatc ggcaaaatat ttctggtgtt gaaatccact 3000cccgatcacg ttaagggtgc
gatcaatgtg cctttcgtga ccggaagcat ctctttggaa 3060tacggagaga caattgccaa
acaatacatc atgaatgcgg gacaacatag gagtgtagga 3120cgccactatg gctgtattca
aggctcgctc tcccataaat cgaccaagtt cggttatggt 3180caaacgacct gctgtaaggt
cagcccagcc aaggtggaga aataacttac ccaactcttc 3240ttgcactgtt gagcttaatg
aggagagcaa aggtttgtcc tccgaatctg caagcaagtt 3300gtgtttgtgc agtgccagca
ggagtgggat gaccagtaat ccatctaaaa aatctgccat 3360taggggattg tccaggttcc
acaattggca agaacgctca atccatcttc ccagcaaatt 3420tccttgtttc ccttctaaat
aagactgaat tggtaggttg tacaattgaa gaatgtcttc 3480cgaaattttg ttgtgaatcg
ctgcttctgc ggttagagag tatttaagct ccttatttcg 3540ggaaagccaa tgtaaagact
cgagcatcct caaagcaact tgaaaatgtc cgctgttagc 3600tcccagatgt tccaccattt
ggtttaaaga gagaggactt tcatcggcga gtaattcaaa 3660aacacctttt tctcgacacg
caagaataac gggaaccgcc acaaagccgt gagtataacg 3720attaatcttt tgtaacat
3738
SEQ ID NO: 15 (length: 1245; type: PRT; organism: Cylindrospermopsis raciborskii T3)
Met Leu Gln Lys Ile Asn
Arg Tyr Thr His Gly Phe Val Ala Val Pro1 5
10 15Val Ile Leu Ala Cys Arg Glu Lys Gly Val Phe Glu
Leu Leu Ala Asp 20 25 30Glu
Ser Pro Leu Ser Leu Asn Gln Met Val Glu His Leu Gly Ala Asn 35
40 45Ser Gly His Phe Gln Val Ala Leu Arg
Met Leu Glu Ser Leu His Trp 50 55
60Leu Ser Arg Asn Lys Glu Leu Lys Tyr Ser Leu Thr Ala Glu Ala Ala65
70 75 80Ile His Asn Lys Ile
Ser Glu Asp Ile Leu Gln Leu Tyr Asn Leu Pro 85
90 95Ile Gln Ser Tyr Leu Glu Gly Lys Gln Gly Asn
Leu Leu Gly Arg Trp 100 105
110Ile Glu Arg Ser Cys Gln Leu Trp Asn Leu Asp Asn Pro Leu Met Ala
115 120 125Asp Phe Leu Asp Gly Leu Leu
Val Ile Pro Leu Leu Leu Ala Leu His 130 135
140Lys His Asn Leu Leu Ala Asp Ser Glu Asp Lys Pro Leu Leu Ser
Ser145 150 155 160Leu Ser
Ser Thr Val Gln Glu Glu Leu Gly Lys Leu Phe Leu His Leu
165 170 175Gly Trp Ala Asp Leu Thr Ala
Gly Arg Leu Thr Ile Thr Glu Leu Gly 180 185
190Arg Phe Met Gly Glu Arg Ala Leu Asn Thr Ala Ile Val Ala
Ser Tyr 195 200 205Thr Pro Met Leu
Ser Arg Ile His Asp Val Leu Phe Gly Asn Cys Leu 210
215 220Ser Val Phe Gln Arg Asp Ala Ser Gly His Glu Arg
His Ile Asp Arg225 230 235
240Thr Leu Asn Val Ile Gly Ser Gly Phe Gln His Gln Lys Tyr Phe Ala
245 250 255Asp Leu Glu Glu Ser
Ile Leu Ser Val Phe Asn Gln Leu Pro Leu Glu 260
265 270Glu Gln Pro Lys Tyr Ile Thr Asp Met Gly Cys Gly
Asp Gly Thr Leu 275 280 285Leu Lys
Arg Val Trp Glu Thr Ile Gln Phe Lys Ser Ala Arg Gly Lys 290
295 300Ala Leu Glu Gln Tyr Pro Leu Arg Leu Ile Gly
Val Asp Tyr Asn Glu305 310 315
320Ala Ser Leu Lys Ala Thr Thr Arg Thr Leu Ala Ser Leu Pro His Leu
325 330 335Val Leu Gln Gly
Asp Ile Gly Asn Pro Glu Gln Met Val Arg Ser Leu 340
345 350Glu Ala His Gly Ile His Asp Pro Glu Asn Ile
Leu His Ile Arg Ser 355 360 365Phe
Leu Asp His Asp Arg Leu Phe Ile Pro Pro Gln Lys Arg Asn Glu 370
375 380Leu Lys Glu Arg Ala His Leu Pro Tyr Gln
Ser Val Cys Val Asp Asp385 390 395
400Gln Gly Glu Leu Ile Pro Pro His Val Met Val Gln Ser Leu Val
Glu 405 410 415His Leu Glu
Arg Trp Ser Gln Val Val Asn Lys His Gly Leu Met Ile 420
425 430Leu Glu Val His Cys Leu Glu Pro Arg Val
Val Tyr Gln Phe Leu Asp 435 440
445Lys Ser Glu Asn Leu His Phe Asp Ala Phe Gln Gly Phe Ser Gln Gln 450
455 460Tyr Leu Val Glu Ala Glu Val Phe
Leu Met Ser Ala Ala Gln Val Gly465 470
475 480Leu Phe Pro Lys Leu Glu Leu Ser Lys Arg Tyr Pro
Lys Thr Phe Pro 485 490
495Phe Thr Arg Ile Thr Leu Asn Tyr Phe Glu Lys Arg Pro Tyr Lys Ile
500 505 510Ser His Ala Tyr Leu Ser
Asp Leu Pro Ala Leu Val Asp Leu Glu Val 515 520
525Lys Cys Trp Pro Glu Asn Leu Arg Ala Ser Thr His Glu Ile
Arg Arg 530 535 540Arg Leu Glu Leu Asn
Pro Gln Gly Asn Leu Val Leu Ile Ile Glu Asp545 550
555 560Gln Ile Ile Gly Ala Ile Tyr Ser Gln Thr
Ile Thr Ser Thr Glu Ala 565 570
575Leu Glu Asn Val Lys Tyr Ala Gln Val Pro Thr Leu His Thr Pro Gln
580 585 590Gly Ser Val Ile Gln
Leu Leu Ala Leu Asn Ile Leu Pro Glu Phe Gln 595
600 605Ala Arg Gly Leu Gly Asn Glu Leu Arg Asp Phe Met
Leu Tyr Tyr Cys 610 615 620Thr Leu Lys
Gly Gly Ile Glu Ser Val Val Gly Val Thr Arg Cys Arg625
630 635 640Asn Tyr Val Asn Tyr Ser Gln
Met Pro Met Met Glu Tyr Leu Lys Leu 645
650 655His Asn Glu Gln Arg Gln Leu Leu Asp Pro Ile Val
Gly Phe His Val 660 665 670Ser
Gly Gly Ala Glu Ile Arg Gly Ile Ile Ala Asn Tyr Arg Pro Glu 675
680 685Asp Thr Asp Asn Leu Gly Met Gly Ile
Leu Ile Glu Tyr Asn Leu Arg 690 695
700Asp Ser Ala Leu His Ser Pro Gly Asp Arg Lys Gly Pro Tyr Ile Asn705
710 715 720Ser Ala Ile Gly
Ser Leu Val Pro Lys Ala Thr Ser Ala Thr Lys Glu 725
730 735Asn Lys Thr Val Ala Asp Leu Val Lys Glu
Cys Ile Leu Lys Val Met 740 745
750Gly Ser Gln Arg Gln Ala Ala Tyr Ala Pro Gln Gln Lys Leu Leu Asp
755 760 765Met Gly Leu Asp Ser Leu Asp
Leu Leu Glu Leu Gln Thr Leu Leu Glu 770 775
780Glu Arg Leu Gly Ile Asn Leu Ser Gly Thr Phe Phe Leu Gln Lys
Asn785 790 795 800Thr Pro
Thr Ala Ile Ile Thr Tyr Phe Gln Asn Gln Val Val Gln Glu
805 810 815Lys Gln Ser Asp Leu Ala Pro
Pro Val Asp Ser Ala Asn Glu Ile Asn 820 825
830Thr Leu Glu Asn Val Val Asn Gln Gln Lys Ile Pro Gln Val
Thr Arg 835 840 845Val Val Thr Glu
Gln Gln Gly Arg Lys Val Leu Ile Asp Gly His Trp 850
855 860Val Ile Asp Phe Ala Ser Cys Asn Tyr Leu Gly Leu
Asp Leu His Pro865 870 875
880Lys Val Lys Glu Ala Ile Pro Pro Ala Leu Asp Lys Trp Gly Thr His
885 890 895Pro Ser Trp Thr Arg
Leu Val Ala Ser Pro Ala Ile Tyr Glu Glu Leu 900
905 910Glu Glu Glu Leu Ser Lys Leu Leu Gly Val Pro Asp
Val Leu Val Phe 915 920 925Pro Ala
Val Thr Leu Leu Gln Ile Gly Ile Leu Pro Leu Leu Thr Gly 930
935 940Asn Asn Gly Val Ile Phe Gly Asp Ile Ala Ala
His Arg Cys Ile Tyr945 950 955
960Glu Ala Cys Cys Leu Ala Gln His Lys Gly Ala Gln Phe Ile Gln Tyr
965 970 975Arg His Asn Asp
Leu Asn Asp Leu Ala Glu Lys Leu Ala Lys Tyr Pro 980
985 990Pro Glu Gln Val Lys Ile Ile Val Ile Asp Gly
Val Tyr Ser Met Ser 995 1000
1005Ala Asp Phe Pro Asp Leu Pro Ala Tyr Val His Leu Ala Lys Glu
1010 1015 1020Tyr Asn Ala Leu Ile Tyr
Met Asp Asp Ala His Gly Phe Gly Ile 1025 1030
1035Leu Gly Glu Asn Pro Ser Ser Asp Met Pro Tyr Gly Tyr Lys
Gly 1040 1045 1050Asn Gly Met Val Asn
Tyr Phe Asp Leu Arg Phe Ala Glu Asp Asn 1055 1060
1065Ile Ile Tyr Val Ala Gly Leu Ser Lys Ala Tyr Ser Ser
Tyr Ala 1070 1075 1080Ala Phe Leu Thr
Cys Gly Asp Arg Arg Ile Lys Thr Asn Phe Arg 1085
1090 1095Asn Ala Trp Thr Ala Ile Phe Ser Gly Pro Ser
Pro Val Ala Ser 1100 1105 1110Leu Ala
Ser Ala Leu Ala Gly Leu Gln Val Asn Arg Gln Glu Gly 1115
1120 1125Glu Gln Leu Arg Lys Gln Ile Tyr His Leu
Thr His Lys Leu Val 1130 1135 1140Thr
Gln Ala Arg Ala Ile Gly Phe Glu Val Asp Asn Tyr Gly Tyr 1145
1150 1155Val Pro Ile Val Gly Val Leu Val Gly
Asp Ala Gln His Met Ile 1160 1165
1170Asp Val Cys Gln Leu Leu Trp Glu Tyr Gly Ile Leu Ile Thr Pro
1175 1180 1185Ala Ile Phe Pro Ile Val
Pro Leu Asn Lys Ser Ala Leu Arg Phe 1190 1195
1200Ser Ile Thr Ala Ala Asn Thr Glu Glu Glu Ile Asp Gln Ala
Ile 1205 1210 1215Lys Ser Leu Lys Ala
Val Trp Asp Leu Leu Gln Lys Arg Lys Ala 1220 1225
1230Leu Pro Cys Lys Gln Glu Glu Asn Ile Leu Lys His
1235 1240 1245
<210> 16
<211> 387
<212> DNA
<213> Cylindrospermopsis raciborskii T3
<400> 16
atgttgaaag atttcaacca gtttttaatc agaacactag cattcgtatt
cgcatttggt 60attttcttaa ccactggagt tggcattgct aaagctgact acctagttaa
aggtggaaag 120attaccaatg ttcaaaatac ttcttctaac ggtgataatt atgccgttag
tatcagcggt 180gggtttggtc cttgcgcaga tagagtgatt atcctaccaa cttcaggagt
gataaatcga 240gacattcata tgcgtggcta tgaagccgca ttaactgcac tatccaatgg
ctttttagta 300gatatttacg actatactgg ctcttcttgc agcaatggtg gccaactaac
tattaccaac 360caattaggta agctaatcag caattag
387
<210> 17
<211> 128
<212> PRT
<213> Cylindrospermopsis raciborskii T3
<400> 17
Met Leu Lys
Asp Phe Asn Gln Phe Leu Ile Arg Thr Leu Ala Phe Val1 5
10 15Phe Ala Phe Gly Ile Phe Leu Thr Thr
Gly Val Gly Ile Ala Lys Ala 20 25
30Asp Tyr Leu Val Lys Gly Gly Lys Ile Thr Asn Val Gln Asn Thr Ser
35 40 45Ser Asn Gly Asp Asn Tyr Ala
Val Ser Ile Ser Gly Gly Phe Gly Pro 50 55
60Cys Ala Asp Arg Val Ile Ile Leu Pro Thr Ser Gly Val Ile Asn Arg65
70 75 80Asp Ile His Met
Arg Gly Tyr Glu Ala Ala Leu Thr Ala Leu Ser Asn 85
90 95Gly Phe Leu Val Asp Ile Tyr Asp Tyr Thr
Gly Ser Ser Cys Ser Asn 100 105
110Gly Gly Gln Leu Thr Ile Thr Asn Gln Leu Gly Lys Leu Ile Ser Asn
115 120 125
<210> 18
<211> 1416
<212> DNA
<213> Cylindrospermopsis raciborskii T3
<400> 18
atggaaacaa cctcaaaaaa atttaagtca gatctgatat tagaagcacg
agcaagccta 60aagttgggaa tccccttagt catttcacaa atgtgcgaaa cgggtattta
tacagcgaat 120gcagtcatga tgggtttact tggtacgcaa gttttggccg ccggtgcttt
gggcgcgctc 180gcttttttga ccttattatt tgcctgccat ggtattctct cagtaggagg
atcactagca 240gccgaagctt ttggggcaaa taaaatagat gaagttagtc gtattgcttc
cgggcaaata 300tggctagcag ttaccttgtc tttacctgca atgcttctgc tttggcatgg
cgatactatc 360ttgctgctat tcggtcaaga ggaaagcaat gtgttattga caaaaacgta
tttacactca 420attttatggg gctttcccgc tgcgcttagt attttgacat taagaggcat
tgcctctgct 480ctcaacgttc cccgattgat aactattact atgctcactc agctgatatt
gaataccgcc 540gccgattatg tgttaatatt cggtaaattt ggtcttcctc aacttggttt
ggctggaata 600ggctgggcaa ctgctctggg tttttgggtt agttttacat tggggcttat
cttgctgatt 660ttctccctga aagttagaga ttataaactt ttccgctact tgcatcagtt
tgataaacag 720atctttgtca aaatttttca aactggatgg cccatggggt ttcaatgggg
ggcggaaacg 780gcactattta acgtcaccgc ttgggtagca gggtatttag gaacggtaac
attagcagcc 840catgatattg gcttccaaac ggcagaactg gcgatggtta taccactcgg
agtcggcaat 900gtcgctatga caagagtagg tcagagtata ggagaaaaaa accctttggg
tgcaagaagg 960gtagcatcga ttggaattac aatagttggc atttatgcca gtattgtagc
acttgttttc 1020tggttgtttc catatcaaat tgccggaatt tatttaaata taaacaatcc
cgagaatatc 1080gaagcaatta agaaagcaac tacttttatc cccttggcgg gactattcca
aatgttttac 1140agtattcaaa taattattgt tggggctttg gtcggtctgc gggatacatt
tgttccagta 1200tcaatgaact taattgtctg gggtcttgga ttggcaggaa gctatttcat
ggcaatcatt 1260ttaggatggg gggggatcgg gatttggttg gctatggttt tgagtccact
cctctcggca 1320gttattttaa ctgttcgttt ttatcgagtg attgacaatc ttcttgccaa
cagtgatgat 1380atgttacaga atgcgtctgt tactactcta ggctga
1416
<210> 19
<211> 471
<212> PRT
<213> Cylindrospermopsis raciborskii T3
<400> 19
Met Glu Thr
Thr Ser Lys Lys Phe Lys Ser Asp Leu Ile Leu Glu Ala1 5
10 15Arg Ala Ser Leu Lys Leu Gly Ile Pro
Leu Val Ile Ser Gln Met Cys 20 25
30Glu Thr Gly Ile Tyr Thr Ala Asn Ala Val Met Met Gly Leu Leu Gly
35 40 45Thr Gln Val Leu Ala Ala Gly
Ala Leu Gly Ala Leu Ala Phe Leu Thr 50 55
60Leu Leu Phe Ala Cys His Gly Ile Leu Ser Val Gly Gly Ser Leu Ala65
70 75 80Ala Glu Ala Phe
Gly Ala Asn Lys Ile Asp Glu Val Ser Arg Ile Ala 85
90 95Ser Gly Gln Ile Trp Leu Ala Val Thr Leu
Ser Leu Pro Ala Met Leu 100 105
110Leu Leu Trp His Gly Asp Thr Ile Leu Leu Leu Phe Gly Gln Glu Glu
115 120 125Ser Asn Val Leu Leu Thr Lys
Thr Tyr Leu His Ser Ile Leu Trp Gly 130 135
140Phe Pro Ala Ala Leu Ser Ile Leu Thr Leu Arg Gly Ile Ala Ser
Ala145 150 155 160Leu Asn
Val Pro Arg Leu Ile Thr Ile Thr Met Leu Thr Gln Leu Ile
165 170 175Leu Asn Thr Ala Ala Asp Tyr
Val Leu Ile Phe Gly Lys Phe Gly Leu 180 185
190Pro Gln Leu Gly Leu Ala Gly Ile Gly Trp Ala Thr Ala Leu
Gly Phe 195 200 205Trp Val Ser Phe
Thr Leu Gly Leu Ile Leu Leu Ile Phe Ser Leu Lys 210
215 220Val Arg Asp Tyr Lys Leu Phe Arg Tyr Leu His Gln
Phe Asp Lys Gln225 230 235
240Ile Phe Val Lys Ile Phe Gln Thr Gly Trp Pro Met Gly Phe Gln Trp
245 250 255Gly Ala Glu Thr Ala
Leu Phe Asn Val Thr Ala Trp Val Ala Gly Tyr 260
265 270Leu Gly Thr Val Thr Leu Ala Ala His Asp Ile Gly
Phe Gln Thr Ala 275 280 285Glu Leu
Ala Met Val Ile Pro Leu Gly Val Gly Asn Val Ala Met Thr 290
295 300Arg Val Gly Gln Ser Ile Gly Glu Lys Asn Pro
Leu Gly Ala Arg Arg305 310 315
320Val Ala Ser Ile Gly Ile Thr Ile Val Gly Ile Tyr Ala Ser Ile Val
325 330 335Ala Leu Val Phe
Trp Leu Phe Pro Tyr Gln Ile Ala Gly Ile Tyr Leu 340
345 350Asn Ile Asn Asn Pro Glu Asn Ile Glu Ala Ile
Lys Lys Ala Thr Thr 355 360 365Phe
Ile Pro Leu Ala Gly Leu Phe Gln Met Phe Tyr Ser Ile Gln Ile 370
375 380Ile Ile Val Gly Ala Leu Val Gly Leu Arg
Asp Thr Phe Val Pro Val385 390 395
400Ser Met Asn Leu Ile Val Trp Gly Leu Gly Leu Ala Gly Ser Tyr
Phe 405 410 415Met Ala Ile
Ile Leu Gly Trp Gly Gly Ile Gly Ile Trp Leu Ala Met 420
425 430Val Leu Ser Pro Leu Leu Ser Ala Val Ile
Leu Thr Val Arg Phe Tyr 435 440
445Arg Val Ile Asp Asn Leu Leu Ala Asn Ser Asp Asp Met Leu Gln Asn 450
455 460Ala Ser Val Thr Thr Leu Gly465
470
<210> 20
<211> 1134
<212> DNA
<213> Cylindrospermopsis raciborskii T3
<400> 20
atgaccaatc
aaaataacca agaattagag aacgatttac caatcgccaa gcagccttgt 60ccggtcaatt
cttataatga gtgggacaca cttgaggagg tcattgttgg tagtgttgaa 120ggtgcaatgt
taccggccct agaaccaatc aacaaatgga cattcccttt tgaagaattg 180gaatctgccc
aaaagatact ctctgagagg ggaggagttc cttatccacc agagatgatt 240acattagcac
acaaagaact aaatgaattt attcacattc ttgaagcaga aggggtcaaa 300gttcgtcgag
ttaaacctgt agatttctct gtccccttct ccacaccagc ttggcaagta 360ggaagtggtt
tttgtgccgc caatcctcgc gatgtttttt tggtgattgg gaatgagatt 420attgaagcac
caatggcaga tcgcaaccgc tattttgaaa cttgggcgta tcgagagatg 480ctcaaggaat
attttcaggc aggagctaag tggactgcag cgccgaagcc acaattattc 540gacgcacagt
atgacttcaa tttccagttt cctcaactgg gggagccgcc gcgtttcgtc 600gttacagagt
ttgaaccgac ttttgatgcg gcagattttg tgcgctgtgg acgagatatt 660tttggtcaaa
aaagtcatgt gactaatggt ttgggcatag aatggttaca acgtcacttg 720gaagacgaat
accgtattca tattattgaa tcgcattgtc cggaagcact gcacatcgat 780accaccttaa
tgcctcttgc acctggcaaa atactagtaa atccagaatt tgtagatgtt 840aataaattgc
caaaaatcct gaaaagctgg gacattttgg ttgcacctta ccccaaccat 900atacctcaaa
accagctgag actggtcagt gaatgggcag gtttgaatgt actgatgtta 960gatgaagagc
gagtcattgt agaaaaaaac caggagcaga tgattaaagc actgaaagat 1020tggggattta
agcctattgt ttgccatttt gaaagctact atccattttt aggatcattt 1080cactgtgcaa
cattagacgt tcgccgacgc ggaactcttc agtcctattt ttaa
1134
<210> 21
<211> 377
<212> PRT
<213> Cylindrospermopsis raciborskii T3
<400> 21
Met Thr Asn Gln Asn Asn
Gln Glu Leu Glu Asn Asp Leu Pro Ile Ala1 5
10 15Lys Gln Pro Cys Pro Val Asn Ser Tyr Asn Glu Trp
Asp Thr Leu Glu 20 25 30Glu
Val Ile Val Gly Ser Val Glu Gly Ala Met Leu Pro Ala Leu Glu 35
40 45Pro Ile Asn Lys Trp Thr Phe Pro Phe
Glu Glu Leu Glu Ser Ala Gln 50 55
60Lys Ile Leu Ser Glu Arg Gly Gly Val Pro Tyr Pro Pro Glu Met Ile65
70 75 80Thr Leu Ala His Lys
Glu Leu Asn Glu Phe Ile His Ile Leu Glu Ala 85
90 95Glu Gly Val Lys Val Arg Arg Val Lys Pro Val
Asp Phe Ser Val Pro 100 105
110Phe Ser Thr Pro Ala Trp Gln Val Gly Ser Gly Phe Cys Ala Ala Asn
115 120 125Pro Arg Asp Val Phe Leu Val
Ile Gly Asn Glu Ile Ile Glu Ala Pro 130 135
140Met Ala Asp Arg Asn Arg Tyr Phe Glu Thr Trp Ala Tyr Arg Glu
Met145 150 155 160Leu Lys
Glu Tyr Phe Gln Ala Gly Ala Lys Trp Thr Ala Ala Pro Lys
165 170 175Pro Gln Leu Phe Asp Ala Gln
Tyr Asp Phe Asn Phe Gln Phe Pro Gln 180 185
190Leu Gly Glu Pro Pro Arg Phe Val Val Thr Glu Phe Glu Pro
Thr Phe 195 200 205Asp Ala Ala Asp
Phe Val Arg Cys Gly Arg Asp Ile Phe Gly Gln Lys 210
215 220Ser His Val Thr Asn Gly Leu Gly Ile Glu Trp Leu
Gln Arg His Leu225 230 235
240Glu Asp Glu Tyr Arg Ile His Ile Ile Glu Ser His Cys Pro Glu Ala
245 250 255Leu His Ile Asp Thr
Thr Leu Met Pro Leu Ala Pro Gly Lys Ile Leu 260
265 270Val Asn Pro Glu Phe Val Asp Val Asn Lys Leu Pro
Lys Ile Leu Lys 275 280 285Ser Trp
Asp Ile Leu Val Ala Pro Tyr Pro Asn His Ile Pro Gln Asn 290
295 300Gln Leu Arg Leu Val Ser Glu Trp Ala Gly Leu
Asn Val Leu Met Leu305 310 315
320Asp Glu Glu Arg Val Ile Val Glu Lys Asn Gln Glu Gln Met Ile Lys
325 330 335Ala Leu Lys Asp
Trp Gly Phe Lys Pro Ile Val Cys His Phe Glu Ser 340
345 350Tyr Tyr Pro Phe Leu Gly Ser Phe His Cys Ala
Thr Leu Asp Val Arg 355 360 365Arg
Arg Gly Thr Leu Gln Ser Tyr Phe 370
375
<210> 22
<211> 1005
<212> DNA
<213> Cylindrospermopsis raciborskii T3
<400> 22
atgacaactg ctgacctaat
cttaattaac aactggtacg tagtcgcaaa ggtggaagat 60tgtaaaccag gaagtatcac
cacggctctt ttattgggag ttaagttggt actatggcgc 120agtcgtgaac agaattcccc
catacagata tggcaagact actgccctca ccgaggtgtg 180gctctgtcta tgggagaaat
tgttaataat actttggttt gtccgtatca cggatggaga 240tataatcaag caggtaaatg
cgtacatatc ccggctcacc ctgacatgac acccccagca 300agtgcccaag ccaagatcta
tcattgccag gagcgatacg gattagtatg ggtgtgctta 360ggtgatcctg tcaatgatat
accttcatta cccgaatggg acgatccgaa ttatcataat 420acttgtacta aatcttattt
tattcaagct agtgcgtttc gtgtaatgga taatttcata 480gatgtatctc attttccttt
tgtccacgac ggtgggttag gtgatcgcaa ccacgcacaa 540attgaagaat ttgaggtaaa
agtagacaaa gatggcatta gcataggtaa ccttaaactc 600cagatgccaa ggtttaacag
cagtaacgaa gatgactcat ggactcttta ccaaaggatt 660agtcatccct tgtgtcaata
ctatattact gaatcctctg aaattcggac tgcggatttg 720atgctggtaa caccgattga
tgaagacaac agcttagtgc gaatgttagt aacgtggaac 780cgctccgaaa tattagagtc
aacggtacta gaggaatttg acgaaacaat agaacaagat 840attccgatta tacactctca
acagccagcg cgtttaccac tgttaccttc aaagcagata 900aacatgcaat ggttgtcaca
ggaaatacat gtaccgtcag atcgatgcac agttgcctat 960cgtcgatggc taaaggaact
gggcgttacc tatggtgttt gttaa
1005
<210> 23
<211> 334
<212> PRT
<213> Cylindrospermopsis raciborskii T3
<400> 23
Met Thr Thr Ala Asp Leu
Ile Leu Ile Asn Asn Trp Tyr Val Val Ala1 5
10 15Lys Val Glu Asp Cys Lys Pro Gly Ser Ile Thr Thr
Ala Leu Leu Leu 20 25 30Gly
Val Lys Leu Val Leu Trp Arg Ser Arg Glu Gln Asn Ser Pro Ile 35
40 45Gln Ile Trp Gln Asp Tyr Cys Pro His
Arg Gly Val Ala Leu Ser Met 50 55
60Gly Glu Ile Val Asn Asn Thr Leu Val Cys Pro Tyr His Gly Trp Arg65
70 75 80Tyr Asn Gln Ala Gly
Lys Cys Val His Ile Pro Ala His Pro Asp Met 85
90 95Thr Pro Pro Ala Ser Ala Gln Ala Lys Ile Tyr
His Cys Gln Glu Arg 100 105
110Tyr Gly Leu Val Trp Val Cys Leu Gly Asp Pro Val Asn Asp Ile Pro
115 120 125Ser Leu Pro Glu Trp Asp Asp
Pro Asn Tyr His Asn Thr Cys Thr Lys 130 135
140Ser Tyr Phe Ile Gln Ala Ser Ala Phe Arg Val Met Asp Asn Phe
Ile145 150 155 160Asp Val
Ser His Phe Pro Phe Val His Asp Gly Gly Leu Gly Asp Arg
165 170 175Asn His Ala Gln Ile Glu Glu
Phe Glu Val Lys Val Asp Lys Asp Gly 180 185
190Ile Ser Ile Gly Asn Leu Lys Leu Gln Met Pro Arg Phe Asn
Ser Ser 195 200 205Asn Glu Asp Asp
Ser Trp Thr Leu Tyr Gln Arg Ile Ser His Pro Leu 210
215 220Cys Gln Tyr Tyr Ile Thr Glu Ser Ser Glu Ile Arg
Thr Ala Asp Leu225 230 235
240Met Leu Val Thr Pro Ile Asp Glu Asp Asn Ser Leu Val Arg Met Leu
245 250 255Val Thr Trp Asn Arg
Ser Glu Ile Leu Glu Ser Thr Val Leu Glu Glu 260
265 270Phe Asp Glu Thr Ile Glu Gln Asp Ile Pro Ile Ile
His Ser Gln Gln 275 280 285Pro Ala
Arg Leu Pro Leu Leu Pro Ser Lys Gln Ile Asn Met Gln Trp 290
295 300Leu Ser Gln Glu Ile His Val Pro Ser Asp Arg
Cys Thr Val Ala Tyr305 310 315
320Arg Arg Trp Leu Lys Glu Leu Gly Val Thr Tyr Gly Val Cys
325 330
<210> 24
<211> 1839
<212> DNA
<213> Cylindrospermopsis raciborskii T3
<400> 24
atgcagatct taggaatttc agcttactac cacgatagtg ctgccgcgat ggttatcgat
60ggcgaaattg ttgctgcagc tcaggaagaa cgtttctcaa gacgaaagca cgatgctggg
120tttccgactg gagcgattac ttactgtcta aaacaagtag gaaccaagtt acaatatatc
180gatcaaattg ttttttacga caagccatta gtcaaatttg agcggttgct agaaacatat
240ttagcatatg ccccaaaggg atttggctcg tttattactg ctatgcccgt ttggctcaaa
300gaaaagcttt acctaaaaac acttttaaaa aaagaattgg cgcttttggg ggagtgcaaa
360gcttctcaat tgcctcctct actgtttacc tcacatcacc aagcccatgc ggccgctgct
420ttttttccca gtccttttca gcgtgctgcc gttctgtgct tagatggtgt aggagagtgg
480gcaactactt ctgtctggtt gggagaagga aataaactca caccacaatg ggaaattgat
540tttccccatt ccctcggttt gctttactca gcgtttacct actacactgg gttcaaagtt
600aactcaggtg agtacaaact catgggttta gcaccctacg gggaacccaa atatgtggac
660caaattctca agcatttgtt ggatctcaaa gaagatggta cttttaggtt gaatatggac
720tacttcaact acacggtggg gctaaccatg accaatcata agttccatag tatgtttgga
780ggaccaccac gccaggcgga aggaaaaatc tcccaaagag acatggatct ggcaagttcg
840atccaaaagg tgactgaaga agtcatactg cgtctggcta gaactatcaa aaaagaactg
900ggtgtagagt atctatgttt agcaggtggt gtcggtctca attgcgtggc taacggacga
960attctccgag aaagtgattt caaagatatt tggattcaac ccgcagcagg agatgccggt
1020agtgcagtgg gagcagcttt agcgatttgg catgaatacc ataagaaacc tcgcacttca
1080acagcaggcg atcgcatgaa aggttcttat ctgggaccta gctttagcga ggcggagatt
1140ctccagtttc ttaattctgt taacataccc taccatcgat gcgttgataa cgaacttatg
1200gctcgtcttg cagaaatttt agaccaggga aatgttgtag gctggttttc tggacgaatg
1260gagtttggtc cgcgtgcttt gggtggccgt tcgattattg gcgattcacg cagtccaaaa
1320atgcaatcgg tcatgaacct gaaaattaaa tatcgtgagt ccttccgtcc atttgctcct
1380tcagtcttgg ctgaacgagt ctccgactac ttcgatcttg atcgtcctag tccttatatg
1440cttttggtag cacaagtcaa agagaatctg cacattccta tgacacaaga gcaacacgag
1500ctatttggga tcgagaagct gaatgttcct cgttcccaaa ttcccgcagt cactcacgtt
1560gattactcag ctcgtattca gacagttcac aaagaaacga atcctcgtta ctacgagtta
1620attcgtcatt ttgaggcacg aactggttgt gctgtcttgg tcaatacttc gtttaatgtc
1680cgcggcgaac caattgtttg tactcccgaa gacgcttatc gatgctttat gagaactgaa
1740atggactatt tggttatgga gaatttcttg ttggtcaaat ctgaacagcc acggggaaat
1800agtgatgagt catggcaaaa agaattcgag ttagattaa
1839
<210> 25
<211> 612
<212> PRT
<213> Cylindrospermopsis raciborskii T3
<400> 25
Met Gln Ile Leu Gly Ile
Ser Ala Tyr Tyr His Asp Ser Ala Ala Ala1 5
10 15Met Val Ile Asp Gly Glu Ile Val Ala Ala Ala Gln
Glu Glu Arg Phe 20 25 30Ser
Arg Arg Lys His Asp Ala Gly Phe Pro Thr Gly Ala Ile Thr Tyr 35
40 45Cys Leu Lys Gln Val Gly Thr Lys Leu
Gln Tyr Ile Asp Gln Ile Val 50 55
60Phe Tyr Asp Lys Pro Leu Val Lys Phe Glu Arg Leu Leu Glu Thr Tyr65
70 75 80Leu Ala Tyr Ala Pro
Lys Gly Phe Gly Ser Phe Ile Thr Ala Met Pro 85
90 95Val Trp Leu Lys Glu Lys Leu Tyr Leu Lys Thr
Leu Leu Lys Lys Glu 100 105
110Leu Ala Leu Leu Gly Glu Cys Lys Ala Ser Gln Leu Pro Pro Leu Leu
115 120 125Phe Thr Ser His His Gln Ala
His Ala Ala Ala Ala Phe Phe Pro Ser 130 135
140Pro Phe Gln Arg Ala Ala Val Leu Cys Leu Asp Gly Val Gly Glu
Trp145 150 155 160Ala Thr
Thr Ser Val Trp Leu Gly Glu Gly Asn Lys Leu Thr Pro Gln
165 170 175Trp Glu Ile Asp Phe Pro His
Ser Leu Gly Leu Leu Tyr Ser Ala Phe 180 185
190Thr Tyr Tyr Thr Gly Phe Lys Val Asn Ser Gly Glu Tyr Lys
Leu Met 195 200 205Gly Leu Ala Pro
Tyr Gly Glu Pro Lys Tyr Val Asp Gln Ile Leu Lys 210
215 220His Leu Leu Asp Leu Lys Glu Asp Gly Thr Phe Arg
Leu Asn Met Asp225 230 235
240Tyr Phe Asn Tyr Thr Val Gly Leu Thr Met Thr Asn His Lys Phe His
245 250 255Ser Met Phe Gly Gly
Pro Pro Arg Gln Ala Glu Gly Lys Ile Ser Gln 260
265 270Arg Asp Met Asp Leu Ala Ser Ser Ile Gln Lys Val
Thr Glu Glu Val 275 280 285Ile Leu
Arg Leu Ala Arg Thr Ile Lys Lys Glu Leu Gly Val Glu Tyr 290
295 300Leu Cys Leu Ala Gly Gly Val Gly Leu Asn Cys
Val Ala Asn Gly Arg305 310 315
320Ile Leu Arg Glu Ser Asp Phe Lys Asp Ile Trp Ile Gln Pro Ala Ala
325 330 335Gly Asp Ala Gly
Ser Ala Val Gly Ala Ala Leu Ala Ile Trp His Glu 340
345 350Tyr His Lys Lys Pro Arg Thr Ser Thr Ala Gly
Asp Arg Met Lys Gly 355 360 365Ser
Tyr Leu Gly Pro Ser Phe Ser Glu Ala Glu Ile Leu Gln Phe Leu 370
375 380Asn Ser Val Asn Ile Pro Tyr His Arg Cys
Val Asp Asn Glu Leu Met385 390 395
400Ala Arg Leu Ala Glu Ile Leu Asp Gln Gly Asn Val Val Gly Trp
Phe 405 410 415Ser Gly Arg
Met Glu Phe Gly Pro Arg Ala Leu Gly Gly Arg Ser Ile 420
425 430Ile Gly Asp Ser Arg Ser Pro Lys Met Gln
Ser Val Met Asn Leu Lys 435 440
445Ile Lys Tyr Arg Glu Ser Phe Arg Pro Phe Ala Pro Ser Val Leu Ala 450
455 460Glu Arg Val Ser Asp Tyr Phe Asp
Leu Asp Arg Pro Ser Pro Tyr Met465 470
475 480Leu Leu Val Ala Gln Val Lys Glu Asn Leu His Ile
Pro Met Thr Gln 485 490
495Glu Gln His Glu Leu Phe Gly Ile Glu Lys Leu Asn Val Pro Arg Ser
500 505 510Gln Ile Pro Ala Val Thr
His Val Asp Tyr Ser Ala Arg Ile Gln Thr 515 520
525Val His Lys Glu Thr Asn Pro Arg Tyr Tyr Glu Leu Ile Arg
His Phe 530 535 540Glu Ala Arg Thr Gly
Cys Ala Val Leu Val Asn Thr Ser Phe Asn Val545 550
555 560Arg Gly Glu Pro Ile Val Cys Thr Pro Glu
Asp Ala Tyr Arg Cys Phe 565 570
575Met Arg Thr Glu Met Asp Tyr Leu Val Met Glu Asn Phe Leu Leu Val
580 585 590Lys Ser Glu Gln Pro
Arg Gly Asn Ser Asp Glu Ser Trp Gln Lys Glu 595
600 605Phe Glu Leu Asp 610
<210> 26
<211> 444
<212> DNA
<213> Cylindrospermopsis raciborskii T3
<400> 26
atgagtgaat ttttcccaca aaaaagtggt aaattaaaga tggaacagat
aaaagaactt 60gacaaaaaag gattgcgtga gtttggactg attggcggtt ctatagtggc
ggttttattc 120ggctttttac tgccagttat acgccatcat tccttatcag ttatcccttg
ggttgttgct 180ggatttctct ggatttgggc aataatcgca cctacgactt taagttttat
ttaccaaata 240tggatgagga ttggacttgt tttaggatgg atacaaacac gaattatttt
gggagtttta 300ttttatataa tgatcacacc aataggattc ataagacggc tgttgaatca
agatccaatg 360acgcgaatct tcgagccaga gttgccaact tatcgccaat tgagtaagtc
aagaactaca 420caaagtatgg agaaaccatt ctaa
444
<210> 27
<211> 147
<212> PRT
<213> Cylindrospermopsis raciborskii T3
<400> 27
Met Ser Glu
Phe Phe Pro Gln Lys Ser Gly Lys Leu Lys Met Glu Gln1 5
10 15Ile Lys Glu Leu Asp Lys Lys Gly Leu
Arg Glu Phe Gly Leu Ile Gly 20 25
30Gly Ser Ile Val Ala Val Leu Phe Gly Phe Leu Leu Pro Val Ile Arg
35 40 45His His Ser Leu Ser Val Ile
Pro Trp Val Val Ala Gly Phe Leu Trp 50 55
60Ile Trp Ala Ile Ile Ala Pro Thr Thr Leu Ser Phe Ile Tyr Gln Ile65
70 75 80Trp Met Arg Ile
Gly Leu Val Leu Gly Trp Ile Gln Thr Arg Ile Ile 85
90 95Leu Gly Val Leu Phe Tyr Ile Met Ile Thr
Pro Ile Gly Phe Ile Arg 100 105
110Arg Leu Leu Asn Gln Asp Pro Met Thr Arg Ile Phe Glu Pro Glu Leu
115 120 125Pro Thr Tyr Arg Gln Leu Ser
Lys Ser Arg Thr Thr Gln Ser Met Glu 130 135
140Lys Pro Phe145
<210> 28
<211> 165
<212> DNA
<213> Cylindrospermopsis raciborskii T3
<400> 28
atgctaaaag acacttggga ttttattaaa gacattgccg gatttattaa agaacaaaaa
60aactatttgt tgattcccct aattatcacc ctggtatcct tgggggcgct gattgtcttt
120gctcaatctt ctgcgatcgc acctttcatt tacactcttt tttaa
165
<210> 29
<211> 54
<212> PRT
<213> Cylindrospermopsis raciborskii T3
<400> 29
Met Leu Lys Asp Thr Trp Asp
Phe Ile Lys Asp Ile Ala Gly Phe Ile1 5 10
15Lys Glu Gln Lys Asn Tyr Leu Leu Ile Pro Leu Ile Ile
Thr Leu Val 20 25 30Ser Leu
Gly Ala Leu Ile Val Phe Ala Gln Ser Ser Ala Ile Ala Pro 35
40 45Phe Ile Tyr Thr Leu Phe
50
<210> 30
<211> 1299
<212> DNA
<213> Cylindrospermopsis raciborskii T3
<400> 30
atgagtaact tcaagggttc
ggtaaagata gcattgatgg gaatattgat tttttgtggg 60ctaatctttg gcgtagcatt
tgttgaaatt gggttacgta ttgccgggat cgaacacata 120gcattccata gcattgatga
acacaggggg tgggtagggc gacctcatgt ttccgggtgg 180tatagaaccg aaggtgaagc
tcacatccaa atgaatagtg atggctttcg agatcgagaa 240cacatcaagg tcaaaccaga
aaataccttc aggatagcgc tgttgggaga ttcctttgta 300gagtccatgc aagtaccgtt
ggagcaaaat ttggcagcag ttatagaagg agaaatcagt 360agttgtatag ctttagctgg
acgaaaggcg gaagtgatta attttggagt gactggttat 420ggaacagacc aagaactaat
tactctacgg gagaaagttt gggactattc acctgatata 480gtagtgctag atttttatac
tggcaacgac attgttgata actcccgtgc gctgagtcag 540aaattctatc ctaatgaact
aggttcacta aagccgtttt ttatacttag agatggtaat 600ctggtggttg atgcttcgtt
tatcaatacg gataattatc gctcaaagct gacatggtgg 660ggcaaaactt atatgaaaat
aaaagaccac tcacggattt tacaggtttt aaacatggta 720cgggatgctc ttaacaactc
tagtagaggg ttttcttctc aagctataga ggaaccgtta 780tttagtgatg gaaaacagga
tacaaaattg agcgggtttt ttgatatcta caaaccacct 840actgaccctg aatggcaaca
ggcatggcaa gtcacagaga aactgattag ctcaatgcaa 900cacgaggtga ctgcgaagaa
agcagatttt ttagttgtta cttttggcgg tccctttcaa 960cgagaacctt tagtgcgtca
aaaagaaatg caagaattgg gtctgactga ttggttttac 1020ccagagaagc gaattacacg
tttgggtgag gatgaggggt tcagtgtact caatctcagc 1080ccaaatttgc aggtttattc
tgagcagaac aatgcttgcc tatatgggtt tgatgatact 1140caaggctgtg tagggcattg
gaatgcttta ggacatcagg tagcaggaaa aatgattgca 1200tcgaagattt gtcaacagca
gatgagagaa agtatattgc ctcataagca cgacccttca 1260agccaaagct cacctattac
ccaatcagtg atccaataa
1299
<210> 31
<211> 432
<212> PRT
<213> Cylindrospermopsis raciborskii T3
<400> 31
Met Ser Asn Phe Lys Gly
Ser Val Lys Ile Ala Leu Met Gly Ile Leu1 5
10 15Ile Phe Cys Gly Leu Ile Phe Gly Val Ala Phe Val
Glu Ile Gly Leu 20 25 30Arg
Ile Ala Gly Ile Glu His Ile Ala Phe His Ser Ile Asp Glu His 35
40 45Arg Gly Trp Val Gly Arg Pro His Val
Ser Gly Trp Tyr Arg Thr Glu 50 55
60Gly Glu Ala His Ile Gln Met Asn Ser Asp Gly Phe Arg Asp Arg Glu65
70 75 80His Ile Lys Val Lys
Pro Glu Asn Thr Phe Arg Ile Ala Leu Leu Gly 85
90 95Asp Ser Phe Val Glu Ser Met Gln Val Pro Leu
Glu Gln Asn Leu Ala 100 105
110Ala Val Ile Glu Gly Glu Ile Ser Ser Cys Ile Ala Leu Ala Gly Arg
115 120 125Lys Ala Glu Val Ile Asn Phe
Gly Val Thr Gly Tyr Gly Thr Asp Gln 130 135
140Glu Leu Ile Thr Leu Arg Glu Lys Val Trp Asp Tyr Ser Pro Asp
Ile145 150 155 160Val Val
Leu Asp Phe Tyr Thr Gly Asn Asp Ile Val Asp Asn Ser Arg
165 170 175Ala Leu Ser Gln Lys Phe Tyr
Pro Asn Glu Leu Gly Ser Leu Lys Pro 180 185
190Phe Phe Ile Leu Arg Asp Gly Asn Leu Val Val Asp Ala Ser
Phe Ile 195 200 205Asn Thr Asp Asn
Tyr Arg Ser Lys Leu Thr Trp Trp Gly Lys Thr Tyr 210
215 220Met Lys Ile Lys Asp His Ser Arg Ile Leu Gln Val
Leu Asn Met Val225 230 235
240Arg Asp Ala Leu Asn Asn Ser Ser Arg Gly Phe Ser Ser Gln Ala Ile
245 250 255Glu Glu Pro Leu Phe
Ser Asp Gly Lys Gln Asp Thr Lys Leu Ser Gly 260
265 270Phe Phe Asp Ile Tyr Lys Pro Pro Thr Asp Pro Glu
Trp Gln Gln Ala 275 280 285Trp Gln
Val Thr Glu Lys Leu Ile Ser Ser Met Gln His Glu Val Thr 290
295 300Ala Lys Lys Ala Asp Phe Leu Val Val Thr Phe
Gly Gly Pro Phe Gln305 310 315
320Arg Glu Pro Leu Val Arg Gln Lys Glu Met Gln Glu Leu Gly Leu Thr
325 330 335Asp Trp Phe Tyr
Pro Glu Lys Arg Ile Thr Arg Leu Gly Glu Asp Glu 340
345 350Gly Phe Ser Val Leu Asn Leu Ser Pro Asn Leu
Gln Val Tyr Ser Glu 355 360 365Gln
Asn Asn Ala Cys Leu Tyr Gly Phe Asp Asp Thr Gln Gly Cys Val 370
375 380Gly His Trp Asn Ala Leu Gly His Gln Val
Ala Gly Lys Met Ile Ala385 390 395
400Ser Lys Ile Cys Gln Gln Gln Met Arg Glu Ser Ile Leu Pro His
Lys 405 410 415His Asp Pro
Ser Ser Gln Ser Ser Pro Ile Thr Gln Ser Val Ile Gln 420
425 430
<210> 32
<211> 1449
<212> DNA
<213> Cylindrospermopsis raciborskii T3
<400> 32
atgacaaata ccgaaagagg attagcagaa ataacatcaa caggatataa gtcagagctt
60agatcggagg cacgagttag cctccaactg gcaattccct tagtccttgt cgaaatatgc
120ggaacgagta ttaatgtggt ggatgtagtc atgatgggct tacttggtac tcaagttttg
180gctgctggtg ccttgggtgc gatcgctttt ttatctgtat cgaatacttg ttataatatg
240cttttgtcgg gggtagcaaa ggcatctgag gcttttgggg caaacaaaat agatcaggtt
300agtcgtattg cttctgggca aatatggctg gcactcacct tgtctttgcc tgcaatgctt
360ttgctttggt atatggatac tatattggtg ctatttggtc aagttgaaag caacacatta
420attgcaaaaa cgtatttaca ctcaattgtg tggggatttc cggcggcagt tggtattttg
480atattaagag gcattgcctc tgctgtgaac gtcccccaat tggtaactgt gacgatgcta
540gtagggctgg tcttgaatgc cccggccaat tatgtattaa tgttcggtaa atttggtctt
600cctgaacttg gtttagctgg aataggctgg gcaagtactt tggttttttg gattagtttt
660ctagtggggg ttgtcttgct gattttctcc ccaaaagtta gagattataa acttttccgc
720tacttgcatc agtttgatcg acagacggtt gtggaaattt ttcaaactgg atggcctatg
780ggttttctac tgggagtgga atcagtagta ttgagcctca ccgcttggtt aacaggctat
840ttgggaacag taacattagc agctcatgag atcgcgatcc aaacagcaga actggcgata
900gtgataccac tcggaatcgg gaatgttgcc gtcacgagag taggtcagac tataggagaa
960aaaaaccctt tgggtgctag aagggcagca ttgattggga ttatgattgg tggcatttat
1020gccagtcttg tggcagtcat tttctggttg tttccatatc agattgcggg actttattta
1080aaaataaacg atccagagag tatggaagca gttaagacag caactaattt tctcttcttg
1140gcgggattat tccaattttt tcatagcgtt caaataattg ttgttggggt tttaataggg
1200ttgcaggata cgtttatccc attgttaatg aatttggtag gctggggtct tggcttggca
1260gtaagctatt acatgggaat cattttatgt tggggaggta tgggtatctg gttaggtctg
1320gttttgagtc cactcctgtc cggacttatt ttaatggttc gtttttatca agagattgcc
1380aataggattg ccaatagtga tgatgggcaa gagagtatat ctattgacaa cgttgaagaa
1440ctctcctga
1449
<210> 33
<211> 482
<212> PRT
<213> Cylindrospermopsis raciborskii T3
<400> 33
Met Thr Asn Thr Glu Arg
Gly Leu Ala Glu Ile Thr Ser Thr Gly Tyr1 5
10 15Lys Ser Glu Leu Arg Ser Glu Ala Arg Val Ser Leu
Gln Leu Ala Ile 20 25 30Pro
Leu Val Leu Val Glu Ile Cys Gly Thr Ser Ile Asn Val Val Asp 35
40 45Val Val Met Met Gly Leu Leu Gly Thr
Gln Val Leu Ala Ala Gly Ala 50 55
60Leu Gly Ala Ile Ala Phe Leu Ser Val Ser Asn Thr Cys Tyr Asn Met65
70 75 80Leu Leu Ser Gly Val
Ala Lys Ala Ser Glu Ala Phe Gly Ala Asn Lys 85
90 95Ile Asp Gln Val Ser Arg Ile Ala Ser Gly Gln
Ile Trp Leu Ala Leu 100 105
110Thr Leu Ser Leu Pro Ala Met Leu Leu Leu Trp Tyr Met Asp Thr Ile
115 120 125Leu Val Leu Phe Gly Gln Val
Glu Ser Asn Thr Leu Ile Ala Lys Thr 130 135
140Tyr Leu His Ser Ile Val Trp Gly Phe Pro Ala Ala Val Gly Ile
Leu145 150 155 160Ile Leu
Arg Gly Ile Ala Ser Ala Val Asn Val Pro Gln Leu Val Thr
165 170 175Val Thr Met Leu Val Gly Leu
Val Leu Asn Ala Pro Ala Asn Tyr Val 180 185
190Leu Met Phe Gly Lys Phe Gly Leu Pro Glu Leu Gly Leu Ala
Gly Ile 195 200 205Gly Trp Ala Ser
Thr Leu Val Phe Trp Ile Ser Phe Leu Val Gly Val 210
215 220Val Leu Leu Ile Phe Ser Pro Lys Val Arg Asp Tyr
Lys Leu Phe Arg225 230 235
240Tyr Leu His Gln Phe Asp Arg Gln Thr Val Val Glu Ile Phe Gln Thr
245 250 255Gly Trp Pro Met Gly
Phe Leu Leu Gly Val Glu Ser Val Val Leu Ser 260
265 270Leu Thr Ala Trp Leu Thr Gly Tyr Leu Gly Thr Val
Thr Leu Ala Ala 275 280 285His Glu
Ile Ala Ile Gln Thr Ala Glu Leu Ala Ile Val Ile Pro Leu 290
295 300Gly Ile Gly Asn Val Ala Val Thr Arg Val Gly
Gln Thr Ile Gly Glu305 310 315
320Lys Asn Pro Leu Gly Ala Arg Arg Ala Ala Leu Ile Gly Ile Met Ile
325 330 335Gly Gly Ile Tyr
Ala Ser Leu Val Ala Val Ile Phe Trp Leu Phe Pro 340
345 350Tyr Gln Ile Ala Gly Leu Tyr Leu Lys Ile Asn
Asp Pro Glu Ser Met 355 360 365Glu
Ala Val Lys Thr Ala Thr Asn Phe Leu Phe Leu Ala Gly Leu Phe 370
375 380Gln Phe Phe His Ser Val Gln Ile Ile Val
Val Gly Val Leu Ile Gly385 390 395
400Leu Gln Asp Thr Phe Ile Pro Leu Leu Met Asn Leu Val Gly Trp
Gly 405 410 415Leu Gly Leu
Ala Val Ser Tyr Tyr Met Gly Ile Ile Leu Cys Trp Gly 420
425 430Gly Met Gly Ile Trp Leu Gly Leu Val Leu
Ser Pro Leu Leu Ser Gly 435 440
445Leu Ile Leu Met Val Arg Phe Tyr Gln Glu Ile Ala Asn Arg Ile Ala 450
455 460Asn Ser Asp Asp Gly Gln Glu Ser
Ile Ser Ile Asp Asn Val Glu Glu465 470
475 480Leu Ser
<210> 34
<211> 831
<212> DNA
<213> Cylindrospermopsis raciborskii T3
<400> 34
atgaaaacaa acaaacatat agctatgtgg gcttgtccta gaagtcgttc tactgtaatt
60acccgtgctt ttgagaactt agatgggtgt gttgtttatg atgagcctct agaggctccg
120aatgtcttga tgacaactta cacgatgagt aacagtcgta cgttagcaga agaagactta
180aagcaattaa tactgcaaaa taatgtagaa acagacctca agaaagttat agaacaattg
240actggagatt taccggacgg aaaattattc tcatttcaaa aaatgataac aggtgactat
300agatctgaat ttggaataga ttgggcaaaa aagctaacta acttcttttt aataaggcat
360ccccaagata ttattttttc tttcgatata gcggagagaa agacaggtat cacagaacca
420ttcacacaac aaaatcttgg catgaaaaca ctttatgaag ttttccaaca aattgaagtt
480attacagggc aaacaccttt agttattcac tcagatgata taattaaaaa ccctccttct
540gctttgaaat ggctgtgtaa aaacttaggg cttgcatttg atgaaaagat gctgacatgg
600aaagcaaatc tagaagactc caatttaaag tatacaaaat tatatgctaa ttctgcgtct
660ggcagttcag aaccttggtt tgaaacttta agatcgacca aaacatttct cgcctatgaa
720aagaaggaga aaaaattacc agctcggtta atacctctac tagatgaatc tattccttac
780tatgaaaaac tcttacagca ttgtcatatt tttgaatggt cagaacactg a
831
<210> 35
<211> 276
<212> PRT
<213> Cylindrospermopsis raciborskii T3
<400> 35
Met Lys Thr Asn Lys His
Ile Ala Met Trp Ala Cys Pro Arg Ser Arg1 5
10 15Ser Thr Val Ile Thr Arg Ala Phe Glu Asn Leu Asp
Gly Cys Val Val 20 25 30Tyr
Asp Glu Pro Leu Glu Ala Pro Asn Val Leu Met Thr Thr Tyr Thr 35
40 45Met Ser Asn Ser Arg Thr Leu Ala Glu
Glu Asp Leu Lys Gln Leu Ile 50 55
60Leu Gln Asn Asn Val Glu Thr Asp Leu Lys Lys Val Ile Glu Gln Leu65
70 75 80Thr Gly Asp Leu Pro
Asp Gly Lys Leu Phe Ser Phe Gln Lys Met Ile 85
90 95Thr Gly Asp Tyr Arg Ser Glu Phe Gly Ile Asp
Trp Ala Lys Lys Leu 100 105
110Thr Asn Phe Phe Leu Ile Arg His Pro Gln Asp Ile Ile Phe Ser Phe
115 120 125Asp Ile Ala Glu Arg Lys Thr
Gly Ile Thr Glu Pro Phe Thr Gln Gln 130 135
140Asn Leu Gly Met Lys Thr Leu Tyr Glu Val Phe Gln Gln Ile Glu
Val145 150 155 160Ile Thr
Gly Gln Thr Pro Leu Val Ile His Ser Asp Asp Ile Ile Lys
165 170 175Asn Pro Pro Ser Ala Leu Lys
Trp Leu Cys Lys Asn Leu Gly Leu Ala 180 185
190Phe Asp Glu Lys Met Leu Thr Trp Lys Ala Asn Leu Glu Asp
Ser Asn 195 200 205Leu Lys Tyr Thr
Lys Leu Tyr Ala Asn Ser Ala Ser Gly Ser Ser Glu 210
215 220Pro Trp Phe Glu Thr Leu Arg Ser Thr Lys Thr Phe
Leu Ala Tyr Glu225 230 235
240Lys Lys Glu Lys Lys Leu Pro Ala Arg Leu Ile Pro Leu Leu Asp Glu
245 250 255Ser Ile Pro Tyr Tyr
Glu Lys Leu Leu Gln His Cys His Ile Phe Glu 260
265 270Trp Ser Glu His
275
<210> 36
<211> 774
<212> DNA
<213> Cylindrospermopsis raciborskii T3
<400> 36
ctaaaaattt ttttctactc
ttttcaggat agaattccag tttctagagc cgttgtaacc 60gtacatatct tgatagtacg
tatcgatgag gtactcattt tcgtggagca ttaaccagct 120ttttaactcc gctaatttct
gctctccttt ttctattaat tcttgctcat ccaaatcatc 180cctgtccaac tcctccctgt
ccaactccca catagttttg ttggtatctt cgacaatcaa 240gtagtctcca ctttttagac
cgttttcgtg aaaatattca actactccca ccgcattagc 300atgggcatct tctacgatca
accagggatg agcaagccca gaaagcagtt ccgacgacat 360tattgcaccc atattgttac
aatccccctc taaaaaatga acgcgagagt cagtttttgc 420tttctcgtcg agtagggaaa
gatcgatatc gatacagtag acacaacctt ctatttggaa 480cagttctaag tgatcggcta
gccaaatcgc gctgccaccg cttaatgctc ctatttcgat 540tattgttttc gggcgaagct
catacaggag cattgaataa agagctattt cggtgcaccc 600tttcaggaag ggtatccctt
tccaagtgaa caaatcgcgg tttgccaaga gcgctctcca 660agctggcact ggaatagcac
atttatcttc tctttcagaa attttggcaa accgattagg 720tttgaaaggt gcaactttat
aggcggcttc ttgaacaaat ttttggaagc tcat
774
<210> 37
<211> 257
<212> PRT
<213> Cylindrospermopsis raciborskii T3
<400> 37
Met Ser Phe Gln Lys Phe
Val Gln Glu Ala Ala Tyr Lys Val Ala Pro1 5
10 15Phe Lys Pro Asn Arg Phe Ala Lys Ile Ser Glu Arg
Glu Asp Lys Cys 20 25 30Ala
Ile Pro Val Pro Ala Trp Arg Ala Leu Leu Ala Asn Arg Asp Leu 35
40 45Phe Thr Trp Lys Gly Ile Pro Phe Leu
Lys Gly Cys Thr Glu Ile Ala 50 55
60Leu Tyr Ser Met Leu Leu Tyr Glu Leu Arg Pro Lys Thr Ile Ile Glu65
70 75 80Ile Gly Ala Leu Ser
Gly Gly Ser Ala Ile Trp Leu Ala Asp His Leu 85
90 95Glu Leu Phe Gln Ile Glu Gly Cys Val Tyr Cys
Ile Asp Ile Asp Leu 100 105
110Ser Leu Leu Asp Glu Lys Ala Lys Thr Asp Ser Arg Val His Phe Leu
115 120 125Glu Gly Asp Cys Asn Asn Met
Gly Ala Ile Met Ser Ser Glu Leu Leu 130 135
140Ser Gly Leu Ala His Pro Trp Leu Ile Val Glu Asp Ala His Ala
Asn145 150 155 160Ala Val
Gly Val Val Glu Tyr Phe His Glu Asn Gly Leu Lys Ser Gly
165 170 175Asp Tyr Leu Ile Val Glu Asp
Thr Asn Lys Thr Met Trp Glu Leu Asp 180 185
190Arg Glu Glu Leu Asp Arg Asp Asp Leu Asp Glu Gln Glu Leu
Ile Glu 195 200 205Lys Gly Glu Gln
Lys Leu Ala Glu Leu Lys Ser Trp Leu Met Leu His 210
215 220Glu Asn Glu Tyr Leu Ile Asp Thr Tyr Tyr Gln Asp
Met Tyr Gly Tyr225 230 235
240Asn Gly Ser Arg Asn Trp Asn Ser Ile Leu Lys Arg Val Glu Lys Asn
245 250
255
Phe
<210> 38 <211> 327 <212> DNA <213> Cylindrospermopsis raciborskii T3 <400> 38
ttattcaaat agccgtagtt
tatgatcggt atccaattcg ctattgtttt ttctgccata 60tccccaacct aagatgcgac
gatattcacc cataatgcca ctgtcaatta aatcatcctc 120gttgactgca acattggtat
gagattgcgg cgcaacatag agcgcatccg caggacaata 180tgcttcacag atgaaacaag
tttgacagtc ttcctgtcgg gcgatcgcag gcggttggtt 240gggaactgca tcaaagacat
tggtagggca tacttggacg caaacattac aattaataca 300gagtttatgg ctgacaagct
cgatcat
327
<210> 39 <211> 108 <212> PRT <213> Cylindrospermopsis raciborskii T3 <400> 39
Met Ile Glu Leu Val Ser
His Lys Leu Cys Ile Asn Cys Asn Val Cys1 5
10 15Val Gln Val Cys Pro Thr Asn Val Phe Asp Ala Val
Pro Asn Gln Pro 20 25 30Pro
Ala Ile Ala Arg Gln Glu Asp Cys Gln Thr Cys Phe Ile Cys Glu 35
40 45Ala Tyr Cys Pro Ala Asp Ala Leu Tyr
Val Ala Pro Gln Ser His Thr 50 55
60Asn Val Ala Val Asn Glu Asp Asp Leu Ile Asp Ser Gly Ile Met Gly65
70 75 80Glu Tyr Arg Arg Ile
Leu Gly Trp Gly Tyr Gly Arg Lys Asn Asn Ser 85
90 95Glu Leu Asp Thr Asp His Lys Leu Arg Leu Phe
Glu 100 105
<210> 40 <211> 1653 <212> DNA <213> Cylindrospermopsis raciborskii T3 <400> 40
ttaagtggtt aatactggtg gtgtagcgct cgcatccttc acccaatccc
gtctcaccca 60aagcctttct aagccgcccg tggcttggta ataaagctga tttggatcgg
tttcaggata 120gtctatgcga atatgttcgc tacgcgtttc cttgcgatgt aaagcgctaa
aatatgccca 180tcgtgctaca gacacaagag cagccgctcg acgagaaaat tccagatcgc
gcactgtatc 240ttgtttcggg ttcccttgta cttgctgcca cagcatttct aatttggcga
gggaatccaa 300aagtccctgc tcacagcgca agtaattctt ctctaatggg aacatctcgg
cttgtacacc 360gcggacaact gcctcgctat cgaatgtttc ggaaccaggg tactgggaac
gtaatccggc 420ttgacctgct ggacgcacaa cccgttcatg gacatgagcg cccaaactct
tggcaaaggc 480ggctgcacct tcccctgccc attgtcctgt agagattgcc caagcagcat
taggaccatc 540acccccagaa gctatcccag ctaaaaactc ccgcgatgct gcatctccgg
cggcatacag 600tccaggaact tttgtaccac aactatcatt cacaatccga attccacctg
taccacggac 660tgtaccttct aaaaccagtg ttacaggtac tcgttctgta taagggtcaa
tgccagcttt 720tttatagggt agaaaggcga tgaagtgaga cttttcaacc aatgcttgga
tttcaggtgt 780ggctcgatcc aaacgagcat aaacgggacc tttcaggagg gcattgggca
ggaacgatgg 840atcgcgacga ccattgatat agccaccaag atcgttacct gcctcatcgg
tgtaactagc 900ccagtaaaag ggagcagccc ttgtcactgt ggcattgaaa gcggtcgaga
tggtatagtg 960actggaagct tccatactgg agagttcgcc gccagcttcc accgccatca
gcagtccatc 1020gcctgtattg gtattgcaac ctaaagcttt acttaggaat gcacaaccgc
cattcgctag 1080aactactgca ccagcgcgaa cggtataggt gcgatgattt tgcctctgta
cacctctagc 1140tccagccacg gagccgtcct gggctaataa cagttctaga gccggacttt
ggtcgaaaat 1200ttgcacaccc acacgcaaca ggttcttgcg aagtacccgc atatattccg
gaccataata 1260actctggcgc acggattccc cattttcttt ggggaaacga tagccccaat
cttccactaa 1320gggcaaactc agccaagctt tttcaattac acgttcaatc caacgtaagt
tagcgaggtt 1380atttcctttg ctgtaacatt cggatacatc tttctcccaa ttctctggag
aaggtgccat 1440gacgctattg ccactggcag cagctgcacc gctcgtacct agaaaacctt
tatcaacaat 1500gatgactttg acaccttggg ctccagccgc ccatgctgcc catgcggcgg
caggaccacc 1560accaattacc agcacgtcag cagttaattg tagttcagtg ccgctatagg
ctgtaagcaa 1620ttgcttttcc tccttgttta aagtcaagtt cat
1653
<210> 41 <211> 550 <212> PRT <213> Cylindrospermopsis raciborskii T3 <400> 41
Met Asn Leu
Thr Leu Asn Lys Glu Glu Lys Gln Leu Leu Thr Ala Tyr1 5
10 15Ser Gly Thr Glu Leu Gln Leu Thr Ala
Asp Val Leu Val Ile Gly Gly 20 25
30Gly Pro Ala Ala Ala Trp Ala Ala Trp Ala Ala Gly Ala Gln Gly Val
35 40 45Lys Val Ile Ile Val Asp Lys
Gly Phe Leu Gly Thr Ser Gly Ala Ala 50 55
60Ala Ala Ser Gly Asn Ser Val Met Ala Pro Ser Pro Glu Asn Trp Glu65
70 75 80Lys Asp Val Ser
Glu Cys Tyr Ser Lys Gly Asn Asn Leu Ala Asn Leu 85
90 95Arg Trp Ile Glu Arg Val Ile Glu Lys Ala
Trp Leu Ser Leu Pro Leu 100 105
110Val Glu Asp Trp Gly Tyr Arg Phe Pro Lys Glu Asn Gly Glu Ser Val
115 120 125Arg Gln Ser Tyr Tyr Gly Pro
Glu Tyr Met Arg Val Leu Arg Lys Asn 130 135
140Leu Leu Arg Val Gly Val Gln Ile Phe Asp Gln Ser Pro Ala Leu
Glu145 150 155 160Leu Leu
Leu Ala Gln Asp Gly Ser Val Ala Gly Ala Arg Gly Val Gln
165 170 175Arg Gln Asn His Arg Thr Tyr
Thr Val Arg Ala Gly Ala Val Val Leu 180 185
190Ala Asn Gly Gly Cys Ala Phe Leu Ser Lys Ala Leu Gly Cys
Asn Thr 195 200 205Asn Thr Gly Asp
Gly Leu Leu Met Ala Val Glu Ala Gly Gly Glu Leu 210
215 220Ser Ser Met Glu Ala Ser Ser His Tyr Thr Ile Ser
Thr Ala Phe Asn225 230 235
240Ala Thr Val Thr Arg Ala Ala Pro Phe Tyr Trp Ala Ser Tyr Thr Asp
245 250 255Glu Ala Gly Asn Asp
Leu Gly Gly Tyr Ile Asn Gly Arg Arg Asp Pro 260
265 270Ser Phe Leu Pro Asn Ala Leu Leu Lys Gly Pro Val
Tyr Ala Arg Leu 275 280 285Asp Arg
Ala Thr Pro Glu Ile Gln Ala Leu Val Glu Lys Ser His Phe 290
295 300Ile Ala Phe Leu Pro Tyr Lys Lys Ala Gly Ile
Asp Pro Tyr Thr Glu305 310 315
320Arg Val Pro Val Thr Leu Val Leu Glu Gly Thr Val Arg Gly Thr Gly
325 330 335Gly Ile Arg Ile
Val Asn Asp Ser Cys Gly Thr Lys Val Pro Gly Leu 340
345 350Tyr Ala Ala Gly Asp Ala Ala Ser Arg Glu Phe
Leu Ala Gly Ile Ala 355 360 365Ser
Gly Gly Asp Gly Pro Asn Ala Ala Trp Ala Ile Ser Thr Gly Gln 370
375 380Trp Ala Gly Glu Gly Ala Ala Ala Phe Ala
Lys Ser Leu Gly Ala His385 390 395
400Val His Glu Arg Val Val Arg Pro Ala Gly Gln Ala Gly Leu Arg
Ser 405 410 415Gln Tyr Pro
Gly Ser Glu Thr Phe Asp Ser Glu Ala Val Val Arg Gly 420
425 430Val Gln Ala Glu Met Phe Pro Leu Glu Lys
Asn Tyr Leu Arg Cys Glu 435 440
445Gln Gly Leu Leu Asp Ser Leu Ala Lys Leu Glu Met Leu Trp Gln Gln 450
455 460Val Gln Gly Asn Pro Lys Gln Asp
Thr Val Arg Asp Leu Glu Phe Ser465 470
475 480Arg Arg Ala Ala Ala Leu Val Ser Val Ala Arg Trp
Ala Tyr Phe Ser 485 490
495Ala Leu His Arg Lys Glu Thr Arg Ser Glu His Ile Arg Ile Asp Tyr
500 505 510Pro Glu Thr Asp Pro Asn
Gln Leu Tyr Tyr Gln Ala Thr Gly Gly Leu 515 520
525Glu Arg Leu Trp Val Arg Arg Asp Trp Val Lys Asp Ala Ser
Ala Thr 530 535 540Pro Pro Val Leu Thr
Thr
545 550
<210> 42 <211> 750 <212> DNA <213> Cylindrospermopsis raciborskii T3 <400> 42
ttaattatct tctgcagtcg gtcgaatcaa aatttcattt acatttacat gatcgggttg
60tgtcactgca taaattatag ctcttgcaat atcctcactt tgtaaaggtg ttattgtact
120aagttgttct ttactaagct gtttcgtgat cgggtcagaa attaagtcat taaatggcgt
180atcgactaaa cctggctcaa tgatggtaac gcgaatgttg tctaaagata cctcctggcg
240taatgcttct gaaagagcat tgacgcctga tttggcagca ctataaacga ccgcaccgga
300ctgcgctatc ctgccatcga cagaagatat attgactata tgaccggatt tttgggcctt
360cagaagaggc aaaactgcgt ggatagcata taaaactccc agaacattca catcgaatgc
420tcgcctccag tctgcgggat ttccagtatc aattgcacca aacacaccaa ttcctgcatt
480attcaccaaa atatctacat gtcctagctc aaccttggtc ttttggacta gatgatttac
540ttgagattcg tctgtaatat ctgtaacaat aggcaatgct tgaccaccac tggcttcaat
600ccgttttgct agtgcatgca aaagctcagc acgtcttgcg gcgatcgcaa cttttgcccc
660ctccgcagct aaagcaaatg ctgtagcctc tccaatccca gaggaagctc cagtaataat
720cgccactttt ccatccaatt tacctgccat
750
<210> 43 <211> 249 <212> PRT <213> Cylindrospermopsis raciborskii T3 <400> 43
Met Ala Gly Lys Leu Asp
Gly Lys Val Ala Ile Ile Thr Gly Ala Ser1 5
10 15Ser Gly Ile Gly Glu Ala Thr Ala Phe Ala Leu Ala
Ala Glu Gly Ala 20 25 30Lys
Val Ala Ile Ala Ala Arg Arg Ala Glu Leu Leu His Ala Leu Ala 35
40 45Lys Arg Ile Glu Ala Ser Gly Gly Gln
Ala Leu Pro Ile Val Thr Asp 50 55
60Ile Thr Asp Glu Ser Gln Val Asn His Leu Val Gln Lys Thr Lys Val65
70 75 80Glu Leu Gly His Val
Asp Ile Leu Val Asn Asn Ala Gly Ile Gly Val 85
90 95Phe Gly Ala Ile Asp Thr Gly Asn Pro Ala Asp
Trp Arg Arg Ala Phe 100 105
110Asp Val Asn Val Leu Gly Val Leu Tyr Ala Ile His Ala Val Leu Pro
115 120 125Leu Leu Lys Ala Gln Lys Ser
Gly His Ile Val Asn Ile Ser Ser Val 130 135
140Asp Gly Arg Ile Ala Gln Ser Gly Ala Val Val Tyr Ser Ala Ala
Lys145 150 155 160Ser Gly
Val Asn Ala Leu Ser Glu Ala Leu Arg Gln Glu Val Ser Leu
165 170 175Asp Asn Ile Arg Val Thr Ile
Ile Glu Pro Gly Leu Val Asp Thr Pro 180 185
190Phe Asn Asp Leu Ile Ser Asp Pro Ile Thr Lys Gln Leu Ser
Lys Glu 195 200 205Gln Leu Ser Thr
Ile Thr Pro Leu Gln Ser Glu Asp Ile Ala Arg Ala 210
215 220Ile Ile Tyr Ala Val Thr Gln Pro Asp His Val Asn
Val Asn Glu Ile225 230 235
240Leu Ile Arg Pro Thr Ala Glu Asp Asn
245
<210> 44 <211> 1005 <212> DNA <213> Cylindrospermopsis raciborskii T3 <400> 44
ttaacaaacc ccataagtaa
cacctagttg ctttagccat cgacgatagg caagtgtgca 60tctatctgat ggtacgtgga
tttcgtgtga aaacaattgt gtatttatct gctttggagt 120taacagtggt aaacgtaccg
gctgttgtgc atgtaagatc cgaatatctt gttctattgt 180ttcgtcatat tcagttagca
tctttgactc taacgtttca tacccgttcc acattatcaa 240catacgcaat acactatttt
cctcatcaat cggtgtgatc gtcattaaat ccacaatcct 300catttcaggg gattctgaaa
cgcagtattg acataaagga tgactaagcc tgaaccaatt 360aacccaagag tcatcttcga
tatggctgac aatccttgat gtctggaatt gatacttacc 420catagtaagg ccatctttat
ctaatttcac ctcaaattct tccacttttg tataattgcg 480atcacctaac caaccgtcat
ggataaaagg aaaatgagac acgtctaagg aattatccat 540cacacgaaac gcactagctt
taatcaagta agacttggta taagtcttgt gataattcgg 600atcatcccat tcaggaaatg
aaggtatatc attaacagga tcgcccaagc acacccacac 660taagccatag cgctcctggg
agtgatatgt cctggcttca gcacttgccg gtggtaccat 720gccagggtga gctgggatct
gtatgcattt accagcctca ttgtatctcc atccgtgata 780cggacaaact aaagtattat
tcgtaatttc tcccatagac agaggaacac ctcggtgggg 840gcagtagtca agccatacct
gtatgggtga attttgttca taactgcgcc ataataccaa 900cttcactccc aacaaacgag
atctggtgat acttccaggt ttacagtctt ctacattggc 960gactacgtgc cagttattga
ttaagattgg gtcggtagtt gtcat
1005
<210> 45 <211> 334 <212> PRT <213> Cylindrospermopsis raciborskii T3 <400> 45
Met Thr Thr Thr Asp Pro
Ile Leu Ile Asn Asn Trp His Val Val Ala1 5
10 15Asn Val Glu Asp Cys Lys Pro Gly Ser Ile Thr Arg
Ser Arg Leu Leu 20 25 30Gly
Val Lys Leu Val Leu Trp Arg Ser Tyr Glu Gln Asn Ser Pro Ile 35
40 45Gln Val Trp Leu Asp Tyr Cys Pro His
Arg Gly Val Pro Leu Ser Met 50 55
60Gly Glu Ile Thr Asn Asn Thr Leu Val Cys Pro Tyr His Gly Trp Arg65
70 75 80Tyr Asn Glu Ala Gly
Lys Cys Ile Gln Ile Pro Ala His Pro Gly Met 85
90 95Val Pro Pro Ala Ser Ala Glu Ala Arg Thr Tyr
His Ser Gln Glu Arg 100 105
110Tyr Gly Leu Val Trp Val Cys Leu Gly Asp Pro Val Asn Asp Ile Pro
115 120 125Ser Phe Pro Glu Trp Asp Asp
Pro Asn Tyr His Lys Thr Tyr Thr Lys 130 135
140Ser Tyr Leu Ile Lys Ala Ser Ala Phe Arg Val Met Asp Asn Ser
Leu145 150 155 160Asp Val
Ser His Phe Pro Phe Ile His Asp Gly Trp Leu Gly Asp Arg
165 170 175Asn Tyr Thr Lys Val Glu Glu
Phe Glu Val Lys Leu Asp Lys Asp Gly 180 185
190Leu Thr Met Gly Lys Tyr Gln Phe Gln Thr Ser Arg Ile Val
Ser His 195 200 205Ile Glu Asp Asp
Ser Trp Val Asn Trp Phe Arg Leu Ser His Pro Leu 210
215 220Cys Gln Tyr Cys Val Ser Glu Ser Pro Glu Met Arg
Ile Val Asp Leu225 230 235
240Met Thr Ile Thr Pro Ile Asp Glu Glu Asn Ser Val Leu Arg Met Leu
245 250 255Ile Met Trp Asn Gly
Tyr Glu Thr Leu Glu Ser Lys Met Leu Thr Glu 260
265 270Tyr Asp Glu Thr Ile Glu Gln Asp Ile Arg Ile Leu
His Ala Gln Gln 275 280 285Pro Val
Arg Leu Pro Leu Leu Thr Pro Lys Gln Ile Asn Thr Gln Leu 290
295 300Phe Ser His Glu Ile His Val Pro Ser Asp Arg
Cys Thr Leu Ala Tyr305 310 315
320Arg Arg Trp Leu Lys Gln Leu Gly Val Thr Tyr Gly Val Cys
325 330
<210> 46 <211> 726 <212> DNA <213> Cylindrospermopsis raciborskii T3 <400> 46
ctaaattatc cttttcaagg catccaccaa cagtggtttg atgttgtttt ttgtaaaaat
60cagagttagc atcctgtaat cggtaattga agtgttggca gctgcggtat gccatacagt
120tggtgtataa aacattgctg cccctcctgg aagtgaaaga catatttctg catttagtga
180attggcagaa gatgaatcta atgagtgttc ccattggtgg ctacttggta taactcgcat
240tgtacccata gtattatctg tatcctgtaa gtatatagtt atgaatacca tggcttgatt
300ggctactgga accaacaacc gaagcgcgtc gtcatttaac tcgttttttg acatggatgc
360aagtgcgttc aatacttcaa ctacatatcc atggtcttga tgccaagcaa tgtatcctgt
420acctgcacga attatggcta gatcggtgat caataggaag atatcagacc caattagagc
480ctgtactggt cccatcacag ttggaagctc taaaagcctc tgaattatct tttgatacct
540aactggatct gggatagtat gctcagacca ccactcatag tcacccgcca atactccccc
600acgtttttgt tcggtaataa gttctacttc atgccgtatt tcttcaatta acgcttttgg
660tacagcttct tcaactgtga aataaccatc atttgtgtaa gcttgttttt gttccgctgt
720gagcat
726
<210> 47 <211> 241 <212> PRT <213> Cylindrospermopsis raciborskii T3 <400> 47
Met Leu Thr Ala Glu Gln
Lys Gln Ala Tyr Thr Asn Asp Gly Tyr Phe1 5
10 15Thr Val Glu Glu Ala Val Pro Lys Ala Leu Ile Glu
Glu Ile Arg His 20 25 30Glu
Val Glu Leu Ile Thr Glu Gln Lys Arg Gly Gly Val Leu Ala Gly 35
40 45Asp Tyr Glu Trp Trp Ser Glu His Thr
Ile Pro Asp Pro Val Arg Tyr 50 55
60Gln Lys Ile Ile Gln Arg Leu Leu Glu Leu Pro Thr Val Met Gly Pro65
70 75 80Val Gln Ala Leu Ile
Gly Ser Asp Ile Phe Leu Leu Ile Thr Asp Leu 85
90 95Ala Ile Ile Arg Ala Gly Thr Gly Tyr Ile Ala
Trp His Gln Asp His 100 105
110Gly Tyr Val Val Glu Val Leu Asn Ala Leu Ala Ser Met Ser Lys Asn
115 120 125Glu Leu Asn Asp Asp Ala Leu
Arg Leu Leu Val Pro Val Ala Asn Gln 130 135
140Ala Met Val Phe Ile Thr Ile Tyr Leu Gln Asp Thr Asp Asn Thr
Met145 150 155 160Gly Thr
Met Arg Val Ile Pro Ser Ser His Gln Trp Glu His Ser Leu
165 170 175Asp Ser Ser Ser Ala Asn Ser
Leu Asn Ala Glu Ile Cys Leu Ser Leu 180 185
190Pro Gly Gly Ala Ala Met Phe Tyr Thr Pro Thr Val Trp His
Thr Ala 195 200 205Ala Ala Asn Thr
Ser Ile Thr Asp Tyr Arg Met Leu Thr Leu Ile Phe 210
215 220Thr Lys Asn Asn Ile Lys Pro Leu Leu Val Asp Ala
Leu Lys Arg Ile225 230 235
240
Ile
<210> 48 <211> 576 <212> DNA <213> Cylindrospermopsis raciborskii T3 <400> 48
tcaatggtta gtaggaatta
tcctatagct gttctttctc tggatagaag aaaggttgtg 60agaagctcgc tccgacttca
tttcagccaa tttttctgca gaccaatact gaaaatatcc 120caatcttaat aattcatcac
tagcctcttg taactggctg aatgactgta ctgatgctaa 180aacatactta gggtgagtta
tgattacgtt attcacattc tccgcgtcat caccaacata 240ttgtttgtct ggatgcgatc
ctaaagctac caaatcgtat tctggtaata cataattcgc 300cttggtaatg tacctttcca
acctctgtgc atctaggttt tgagggtcgc agccaaaaat 360caccatttca aagtcattat
tccatgttct tatctgttcc attagaagct ctggcagttc 420aggtccatga aaccaacgaa
cactaacacg gttatttaac caagctgcct tcgcgtaagg 480acagggtgga aaatttcctg
ttagaggatt gggaatgctg acaacattga taatccaatc 540ctctatttct tggcgaaatt
gttcgatatt tatcat
576
<210> 49 <211> 191 <212> PRT <213> Cylindrospermopsis raciborskii T3 <400> 49
Met Ile Asn Ile Glu Gln
Phe Arg Gln Glu Ile Glu Asp Trp Ile Ile1 5
10 15Asn Val Val Ser Ile Pro Asn Pro Leu Thr Gly Asn
Phe Pro Pro Cys 20 25 30Pro
Tyr Ala Lys Ala Ala Trp Leu Asn Asn Arg Val Ser Val Arg Trp 35
40 45Phe His Gly Pro Glu Leu Pro Glu Leu
Leu Met Glu Gln Ile Arg Thr 50 55
60Trp Asn Asn Asp Phe Glu Met Val Ile Phe Gly Cys Asp Pro Gln Asn65
70 75 80Leu Asp Ala Gln Arg
Leu Glu Arg Tyr Ile Thr Lys Ala Asn Tyr Val 85
90 95Leu Pro Glu Tyr Asp Leu Val Ala Leu Gly Ser
His Pro Asp Lys Gln 100 105
110Tyr Val Gly Asp Asp Ala Glu Asn Val Asn Asn Val Ile Ile Thr His
115 120 125Pro Lys Tyr Val Leu Ala Ser
Val Gln Ser Phe Ser Gln Leu Gln Glu 130 135
140Ala Ser Asp Glu Leu Leu Arg Leu Gly Tyr Phe Gln Tyr Trp Ser
Ala145 150 155 160Glu Lys
Leu Ala Glu Met Lys Ser Glu Arg Ala Ser His Asn Leu Ser
165 170 175Ser Ile Gln Arg Lys Asn Ser
Tyr Arg Ile Ile Pro Thr Asn His 180 185
190
<210> 50 <211> 777 <212> DNA <213> Cylindrospermopsis raciborskii T3 <400> 50
ttaatctagg
tcatagtata accatatatt aggctcgatg tatattccca tattgttggg 60atagtcaatt
ttgacaggta ctaagccttt gggaataata tagtcaccag tttctggaaa 120acgcatccca
actctatctt cccaaccgtc aatagtatca ttaattgttg tggatttaaa 180acagatccct
gcaattttag ccccatgttt gacattaact cgtaaccaag ggtcaaatat 240aagaccattt
ttatctcgcc aggtaatata ccgctctatg ggtataagtg ggtaaagata 300ttttaggctt
ggacgtgcag ccatgatcaa agaattaaga ccgtggtatt gagcaagttc 360tttcatgtat
ccaatcagat actgactcaa gtttttgcct tgatactctg gtaggattga 420aatcgatact
acacataacg cattaggcag gcggttctgt tctcggtctt caagccactt 480ggctaaagcc
cagtcacaac cttcgtccgg taactcatca aaacggcttt cataagttaa 540agggatacag
tttccttgcg ctatcataag ctgtgtggta gcttctacta acccaaactg 600gaattctgga
taaatttcaa atagagctaa ggaagctgga tctgcccaga catcatgtat 660caaaaatttt
gggtatgctt gatcaaagac actcatcgtc ctttccacaa aatcagaagt 720ttcttttggg
gttacaaagc tatactctaa attatgctgt acaatttgaa tggtcat
777
<210> 51 <211> 258 <212> PRT <213> Cylindrospermopsis raciborskii T3 <400> 51
Met Thr Ile Gln Ile Val
Gln His Asn Leu Glu Tyr Ser Phe Val Thr1 5
10 15Pro Lys Glu Thr Ser Asp Phe Val Glu Arg Thr Met
Ser Val Phe Asp 20 25 30Gln
Ala Tyr Pro Lys Phe Leu Ile His Asp Val Trp Ala Asp Pro Ala 35
40 45Ser Leu Ala Leu Phe Glu Ile Tyr Pro
Glu Phe Gln Phe Gly Leu Val 50 55
60Glu Ala Thr Thr Gln Leu Met Ile Ala Gln Gly Asn Cys Ile Pro Leu65
70 75 80Thr Tyr Glu Ser Arg
Phe Asp Glu Leu Pro Asp Glu Gly Cys Asp Trp 85
90 95Ala Leu Ala Lys Trp Leu Glu Asp Arg Glu Gln
Asn Arg Leu Pro Asn 100 105
110Ala Leu Cys Val Val Ser Ile Ser Ile Leu Pro Glu Tyr Gln Gly Lys
115 120 125Asn Leu Ser Gln Tyr Leu Ile
Gly Tyr Met Lys Glu Leu Ala Gln Tyr 130 135
140His Gly Leu Asn Ser Leu Ile Met Ala Ala Arg Pro Ser Leu Lys
Tyr145 150 155 160Leu Tyr
Pro Leu Ile Pro Ile Glu Arg Tyr Ile Thr Trp Arg Asp Lys
165 170 175Asn Gly Leu Ile Phe Asp Pro
Trp Leu Arg Val Asn Val Lys His Gly 180 185
190Ala Lys Ile Ala Gly Ile Cys Phe Lys Ser Thr Thr Ile Asn
Asp Thr 195 200 205Ile Asp Gly Trp
Glu Asp Arg Val Gly Met Arg Phe Pro Glu Thr Gly 210
215 220Asp Tyr Ile Ile Pro Lys Gly Leu Val Pro Val Lys
Ile Asp Tyr Pro225 230 235
240Asn Asn Met Gly Ile Tyr Ile Glu Pro Asn Ile Trp Leu Tyr Tyr Asp
245 250 255
Leu Asp
<210> 52 <211> 777 <212> DNA <213> Cylindrospermopsis raciborskii T3 <400> 52
ctaatcctta aatttatact
ggaagtcaaa tgagatctca ctatcgttat tatctggaag 60tacttgcact gtcaattcat
taccgacttt cccattccca ggcataatta ataagttagg 120gtgaggtgga atgccgtcgt
actgtcggac gcggcgaaaa atgctcgaat tctcgccacc 180atgtttattc aagaggactt
caactggtgt gatgacaaaa gtcattcctg acccaaggtg 240gcgcgatcgc cgcttttgat
ttgctggagt ggaaacacta acaaataagg cacaccctcc 300tagagaataa gaccagttag
cagactgcgg atcggcagac caatggcagg gacaagacac 360cgcatcaagg ctatgtaacg
cattcaaaaa atcaaatgct tgacctgcat attcctctac 420tgtaagaact gttggttcag
gtgggaaaaa gatgacaagt gtcagaagat ccgcattttc 480gtgctgaagc aattcgtttt
cattaacttc atcaatgtat ttgtagatac cctcaagcgt 540atgctcaacc aagatcgggt
cagttaaaga tgagactatc aggtatctaa tcattccctt 600ctgttccccg atagttcccc
agaagcaagg gaaggcagaa tcgctgattg tttcaacaaa 660tgttgagtag ctagtgcgta
cccaagcagg aaggcactcc tctagaagag aggattccat 720ctggcttttg ttccagattg
gtgtaactcc gtcaggacat aaattcttga ttaccat
777
<210> 53 <211> 258 <212> PRT <213> Cylindrospermopsis raciborskii T3 <400> 53
Met Val Ile Lys Asn Leu
Cys Pro Asp Gly Val Thr Pro Ile Trp Asn1 5
10 15Lys Ser Gln Met Glu Ser Ser Leu Leu Glu Glu Cys
Leu Pro Ala Trp 20 25 30Val
Arg Thr Ser Tyr Ser Thr Phe Val Glu Thr Ile Ser Asp Ser Ala 35
40 45Phe Pro Cys Phe Trp Gly Thr Ile Gly
Glu Gln Lys Gly Met Ile Arg 50 55
60Tyr Leu Ile Val Ser Ser Leu Thr Asp Pro Ile Leu Val Glu His Thr65
70 75 80Leu Glu Gly Ile Tyr
Lys Tyr Ile Asp Glu Val Asn Glu Asn Glu Leu 85
90 95Leu Gln His Glu Asn Ala Asp Leu Leu Thr Leu
Val Ile Phe Phe Pro 100 105
110Pro Glu Pro Thr Val Leu Thr Val Glu Glu Tyr Ala Gly Gln Ala Phe
115 120 125Asp Phe Leu Asn Ala Leu His
Ser Leu Asp Ala Val Ser Cys Pro Cys 130 135
140His Trp Ser Ala Asp Pro Gln Ser Ala Asn Trp Ser Tyr Ser Leu
Gly145 150 155 160Gly Cys
Ala Leu Phe Val Ser Val Ser Thr Pro Ala Asn Gln Lys Arg
165 170 175Arg Ser Arg His Leu Gly Ser
Gly Met Thr Phe Val Ile Thr Pro Val 180 185
190Glu Val Leu Leu Asn Lys His Gly Gly Glu Asn Ser Ser Ile
Phe Arg 195 200 205Arg Val Arg Gln
Tyr Asp Gly Ile Pro Pro His Pro Asn Leu Leu Ile 210
215 220Met Pro Gly Asn Gly Lys Val Gly Asn Glu Leu Thr
Val Gln Val Leu225 230 235
240Pro Asp Asn Asn Asp Ser Glu Ile Ser Phe Asp Phe Gln Tyr Lys Phe
245 250 255
Lys Asp
<210> 54 <211> 1227 <212> DNA <213> Cylindrospermopsis raciborskii T3 <400> 54
ctatatctta ttttttggaa
gtccctgaaa attattcaac aagatcgaga cgttgttgtt 60gccagaattt gtgacagcca
ggtcaagctt gctgtcgccg ttgaaatccg caattgctat 120agattcagga ttagtaccga
ctggaaagtt agtagctatg ccaaaagacc cattaccatt 180tcctggtaag accgagacgt
tattgctact ataatttgta acagccaggt caagtttact 240gtcgccattc acatctctaa
tcgctacaga gtagggatta gtaccggctg gaaagttagt 300ggctgcgcca aaagacccat
taccatttcc cagtaagacc gagacgttat tgctgctagt 360atttgcaaca gccaggtcaa
gcttgctgtc gccatttaca tccccagttg ctacaaatat 420gggattagta ccgactggaa
agttagtggc tgcgccaaaa gacccattac catttcccag 480taagaccgag acgttattgc
tgacccaatt tgtaatagca aggtcgagct tactgtcgct 540attaaaatcc gcaatcgcta
cggaaatcga ataagtatcg acagggaagc tgctggctgc 600gccaaaagac ccattaccat
ttcccagtaa aaccaagacc ttattgtcga accaatttgt 660aaaagcaagg tcaagctcac
tatcgttatt cacatctcca atggctacag aataagggtt 720agtaccaact gaaaagttag
tggctgcgcc aaaagaccca ttaccatttc ctagtaagac 780cgagacgtta ttgctactaa
aatttgcaac agccaggtca agcttgctgt cgccatttac 840atccccagtc actacaaaga
cgggattagt accgactgga aagttagtgg ctgcgccaaa 900agacccatta ccatttccca
gtaagaccga gacgttattg tcgaaccaat ttgtaacagc 960caggtcgagc ttactatcgc
tattgaaatc cccaactgct acagagtcag catcaagacc 1020agttgggaag ttaatagcag
tagcataact actcctgtgg gcaaatctca ctcctacgga 1080caaattaacc ggaacactaa
attgcccaga aagcttttca ttcttcagat aatagtcagt 1140tatatttgct aatgcaacag
gagttataca taaaaatgta ctaacagata atatccccgc 1200tataattagt aaagtgagcc
ttttcac
1227
<210> 55 <211> 408 <212> PRT <213> Cylindrospermopsis raciborskii T3 <400> 55
Met Lys Arg Leu Thr Leu
Leu Ile Ile Ala Gly Ile Leu Ser Val Ser1 5
10 15Thr Phe Leu Cys Ile Thr Pro Val Ala Leu Ala Asn
Ile Thr Asp Tyr 20 25 30Tyr
Leu Lys Asn Glu Lys Leu Ser Gly Gln Phe Ser Val Pro Val Asn 35
40 45Leu Ser Val Gly Val Arg Phe Ala His
Arg Ser Ser Tyr Ala Thr Ala 50 55
60Ile Asn Phe Pro Thr Gly Leu Asp Ala Asp Ser Val Ala Val Gly Asp65
70 75 80Phe Asn Ser Asp Ser
Lys Leu Asp Leu Ala Val Thr Asn Trp Phe Asp 85
90 95Asn Asn Val Ser Val Leu Leu Gly Asn Gly Asn
Gly Ser Phe Gly Ala 100 105
110Ala Thr Asn Phe Pro Val Gly Thr Asn Pro Val Phe Val Val Thr Gly
115 120 125Asp Val Asn Gly Asp Ser Lys
Leu Asp Leu Ala Val Ala Asn Phe Ser 130 135
140Ser Asn Asn Val Ser Val Leu Leu Gly Asn Gly Asn Gly Ser Phe
Gly145 150 155 160Ala Ala
Thr Asn Phe Ser Val Gly Thr Asn Pro Tyr Ser Val Ala Ile
165 170 175Gly Asp Val Asn Asn Asp Ser
Glu Leu Asp Leu Ala Phe Thr Asn Trp 180 185
190Phe Asp Asn Lys Val Leu Val Leu Leu Gly Asn Gly Asn Gly
Ser Phe 195 200 205Gly Ala Ala Ser
Ser Phe Pro Val Asp Thr Tyr Ser Ile Ser Val Ala 210
215 220Ile Ala Asp Phe Asn Ser Asp Ser Lys Leu Asp Leu
Ala Ile Thr Asn225 230 235
240Trp Val Ser Asn Asn Val Ser Val Leu Leu Gly Asn Gly Asn Gly Ser
245 250 255Phe Gly Ala Ala Thr
Asn Phe Pro Val Gly Thr Asn Pro Ile Phe Val 260
265 270Ala Thr Gly Asp Val Asn Gly Asp Ser Lys Leu Asp
Leu Ala Val Ala 275 280 285Asn Thr
Ser Ser Asn Asn Val Ser Val Leu Leu Gly Asn Gly Asn Gly 290
295 300Ser Phe Gly Ala Ala Thr Asn Phe Pro Ala Gly
Thr Asn Pro Tyr Ser305 310 315
320Val Ala Ile Arg Asp Val Asn Gly Asp Ser Lys Leu Asp Leu Ala Val
325 330 335Thr Asn Tyr Ser
Ser Asn Asn Val Ser Val Leu Pro Gly Asn Gly Asn 340
345 350Gly Ser Phe Gly Ile Ala Thr Asn Phe Pro Val
Gly Thr Asn Pro Glu 355 360 365Ser
Ile Ala Ile Ala Asp Phe Asn Gly Asp Ser Lys Leu Asp Leu Ala 370
375 380Val Thr Asn Ser Gly Asn Asn Asn Val Ser
Ile Leu Leu Asn Asn Phe385 390 395
400Gln Gly Leu Pro Lys Asn Lys Ile
405
<210> 56 <211> 603 <212> DNA <213> Cylindrospermopsis raciborskii T3 <400> 56
ctattgtttg aaaattgtga
atttgttttc cacgtatttg agtagttgtt ctaggctttc 60ctcgacggtg agttcggatg
tttccaccca taaatctggg ctattgggtg gttcataagg 120ggcgctgatt cccgtaaatc
catctatttc cccactgcgt gcttttagat aaagaccttt 180cggatcacgc tgctcacaaa
gttccagtgg agttgcaatg tatacttcat gaaatagatc 240tccagctagt ctacgcacct
gttctcggtc attcctgtag ggtgagatga aggcagtgat 300cactaggcat cctgactccg
caaagagttt ggcaacctca cccaaacgac ggatattttc 360tgagcgatca ctagcagaaa
atcctaaatc ggaacacagt ccatgacgaa cactatcacc 420atctaaaaca aaggtagacc
atcctttctc gaacaaagtc tgctctaatt ttaaagccaa 480tgttgtttta ccagccccgg
acagtccagt aaaccataga atcccgcttt tatgaccatt 540ctttagataa cgatcatatg
gagatataag atgttttgta tagtgaatat tagttgattt 600cat
603
<210> 57 <211> 200 <212> PRT <213> Cylindrospermopsis raciborskii T3 <400> 57
Met Lys Ser Thr Asn Ile
His Tyr Thr Lys His Leu Ile Ser Pro Tyr1 5
10 15Asp Arg Tyr Leu Lys Asn Gly His Lys Ser Gly Ile
Leu Trp Phe Thr 20 25 30Gly
Leu Ser Gly Ala Gly Lys Thr Thr Leu Ala Leu Lys Leu Glu Gln 35
40 45Thr Leu Phe Glu Lys Gly Trp Ser Thr
Phe Val Leu Asp Gly Asp Ser 50 55
60Val Arg His Gly Leu Cys Ser Asp Leu Gly Phe Ser Ala Ser Asp Arg65
70 75 80Ser Glu Asn Ile Arg
Arg Leu Gly Glu Val Ala Lys Leu Phe Ala Glu 85
90 95Ser Gly Cys Leu Val Ile Thr Ala Phe Ile Ser
Pro Tyr Arg Asn Asp 100 105
110Arg Glu Gln Val Arg Arg Leu Ala Gly Asp Leu Phe His Glu Val Tyr
115 120 125Ile Ala Thr Pro Leu Glu Leu
Cys Glu Gln Arg Asp Pro Lys Gly Leu 130 135
140Tyr Leu Lys Ala Arg Ser Gly Glu Ile Asp Gly Phe Thr Gly Ile
Ser145 150 155 160Ala Pro
Tyr Glu Pro Pro Asn Ser Pro Asp Leu Trp Val Glu Thr Ser
165 170 175Glu Leu Thr Val Glu Glu Ser
Leu Glu Gln Leu Leu Lys Tyr Val Glu 180 185
190Asn Lys Phe Thr Ile Phe Lys Gln 195
200
<210> 58 <211> 1350 <212> DNA <213> Cylindrospermopsis raciborskii T3 <400> 58
ttaagaaaaa attatttcaa
actcgctcgc caaacgctcc ataatcaaat taatttcaga 60cgaaaaagga cagtaatatg
gtagctctac caacaccctt cttgcggaaa ctgtcacctt 120cgctgctatt ttgataatcg
tttcccttaa cctaggaacc tgggctttag ccagttttgt 180tccctgtgct gcttgccgaa
ttcccaacat taaaatgtaa gctgcttgag ataaaaataa 240ccgaaactga ttgacaataa
atttctcaca gctgagtcta tctgatttta tccccagttt 300taattcctta attctatgct
ctgaagtagc tcctctttga acataaaatt tatcgtataa 360atcctgagct tctgtttcca
agctagtaat tataaatcta ggattgggtc ctttttctag 420ccattctgct ttcataatta
ctcgccgagg ttctgaccaa ctccgagctg cgtaatacac 480atcatcaaat aaacgaactt
tttctcctgt gcgacaatat tccagtctgg ctcggtcaag 540aaggtaatta atttttcgtt
ttaagacatc attattgctg aatccaaaaa catatccaac 600cccgcttttt tcacaaacct
caatgatttc tggtaacgag aaacccccgt ctcccctcag 660aacaattcta atttcaggta
aggctctttt gattcgcaaa aataaccatt ttagaatgcc 720agctactcct ttaccagagt
gagaatttcc cgcccttagt tgtagaacta atggataacc 780actggaagct tcattaatca
gaactggaaa gtagatatca tgcctatggt aaccattaaa 840taagctcagt tgttgatgac
catgagttag agcatcccac gcatctatgt ccaggacaat 900ctcttttgat tcccgaggat
aggattctag gaatttatca acaaataacc gacgaatttg 960tttgatatct ttttgagtca
cctgattttc taaacgactc atagttggtt gactagctaa 1020taagttttct cctactgtgg
gaacttgatt acaaactagc ttaaaaattg gatcttggcg 1080caatttatta ctatcgttgc
tatcttcata gccagcaatt atttgataaa ttcgttggct 1140aattaattga gaaagagaat
gtttgacttt agtttggtcc cgattatccg tcaaacaatc 1200tgccatatct tgacaaattt
ttaccttttc ttctacttgt cgtgccagaa taattccgcc 1260atcactactt aaactcatat
cagaaaaagt cagatctaaa gtttttttat cgaagaaatt 1320taaagataat cttgaggaag
atttagtcat
1350
<210> 59 <211> 449 <212> PRT <213> Cylindrospermopsis raciborskii T3 <400> 59
Met Thr Lys Ser Ser Ser
Arg Leu Ser Leu Asn Phe Phe Asp Lys Lys1 5
10 15Thr Leu Asp Leu Thr Phe Ser Asp Met Ser Leu Ser
Ser Asp Gly Gly 20 25 30Ile
Ile Leu Ala Arg Gln Val Glu Glu Lys Val Lys Ile Cys Gln Asp 35
40 45Met Ala Asp Cys Leu Thr Asp Asn Arg
Asp Gln Thr Lys Val Lys His 50 55
60Ser Leu Ser Gln Leu Ile Ser Gln Arg Ile Tyr Gln Ile Ile Ala Gly65
70 75 80Tyr Glu Asp Ser Asn
Asp Ser Asn Lys Leu Arg Gln Asp Pro Ile Phe 85
90 95Lys Leu Val Cys Asn Gln Val Pro Thr Val Gly
Glu Asn Leu Leu Ala 100 105
110Ser Gln Pro Thr Met Ser Arg Leu Glu Asn Gln Val Thr Gln Lys Asp
115 120 125Ile Lys Gln Ile Arg Arg Leu
Phe Val Asp Lys Phe Leu Glu Ser Tyr 130 135
140Pro Arg Glu Ser Lys Glu Ile Val Leu Asp Ile Asp Ala Trp Asp
Ala145 150 155 160Leu Thr
His Gly His Gln Gln Leu Ser Leu Phe Asn Gly Tyr His Arg
165 170 175His Asp Ile Tyr Phe Pro Val
Leu Ile Asn Glu Ala Ser Ser Gly Tyr 180 185
190Pro Leu Val Leu Gln Leu Arg Ala Gly Asn Ser His Ser Gly
Lys Gly 195 200 205Val Ala Gly Ile
Leu Lys Trp Leu Phe Leu Arg Ile Lys Arg Ala Leu 210
215 220Pro Glu Ile Arg Ile Val Leu Arg Gly Asp Gly Gly
Phe Ser Leu Pro225 230 235
240Glu Ile Ile Glu Val Cys Glu Lys Ser Gly Val Gly Tyr Val Phe Gly
245 250 255Phe Ser Asn Asn Asp
Val Leu Lys Arg Lys Ile Asn Tyr Leu Leu Asp 260
265 270Arg Ala Arg Leu Glu Tyr Cys Arg Thr Gly Glu Lys
Val Arg Leu Phe 275 280 285Asp Asp
Val Tyr Tyr Ala Ala Arg Ser Trp Ser Glu Pro Arg Arg Val 290
295 300Ile Met Lys Ala Glu Trp Leu Glu Lys Gly Pro
Asn Pro Arg Phe Ile305 310 315
320Ile Thr Ser Leu Glu Thr Glu Ala Gln Asp Leu Tyr Asp Lys Phe Tyr
325 330 335Val Gln Arg Gly
Ala Thr Ser Glu His Arg Ile Lys Glu Leu Lys Leu 340
345 350Gly Ile Lys Ser Asp Arg Leu Ser Cys Glu Lys
Phe Ile Val Asn Gln 355 360 365Phe
Arg Leu Phe Leu Ser Gln Ala Ala Tyr Ile Leu Met Leu Gly Ile 370
375 380Arg Gln Ala Ala Gln Gly Thr Lys Leu Ala
Lys Ala Gln Val Pro Arg385 390 395
400Leu Arg Glu Thr Ile Ile Lys Ile Ala Ala Lys Val Thr Val Ser
Ala 405 410 415Arg Arg Val
Leu Val Glu Leu Pro Tyr Tyr Cys Pro Phe Ser Ser Glu 420
425 430Ile Asn Leu Ile Met Glu Arg Leu Ala Ser
Glu Phe Glu Ile Ile Phe 435 440
445
Ser
<210> 60 <211> 666 <212> DNA <213> Cylindrospermopsis raciborskii T3 <400> 60
ctatctttgc cctgtaacaa
tgtatgctac cctttgacca atattagtag catgatctgc 60cattctctct aaacactgaa
ttgctaatgt taatagtaaa atgggctcca ctaccccggg 120aacatctttc tgctgcgcca
aattacgata taactttttg taagcatcat ctactgtatc 180atctaataat ttaatccttc
taccactaat ctcgtctaaa tccgctaaag ctactaggct 240ggtagccaac atagattggg
catgatcgga cataatggca acctccccca aagtaggatg 300ggggggatag ggaaatattt
tcattgctat ttctgccaaa tctttggcat agtccccaat 360acgttccaag tctctaacta
attgcatgaa tgagcttaaa caccgagatt cttggtctgt 420gggagcttga ctgctcataa
ttgtggcaca atcgacttct atttgtctgt agaagcgatc 480aatttttttg tctaatctcc
gtatttgctc agctgctgtt aaatcccgat tgaatagagc 540ttggtgactc agacggaatg
actgctctac taaagcaccc atacgcaaaa catctcgttc 600cagtctttta atggcacgta
taggttgagg tttttcaaaa attgtatatt tcacaacagc 660tttcat
666
<210> 61 <211> 221 <212> PRT <213> Cylindrospermopsis raciborskii T3 <400> 61
Met Lys Ala Val Val Lys
Tyr Thr Ile Phe Glu Lys Pro Gln Pro Ile1 5
10 15Arg Ala Ile Lys Arg Leu Glu Arg Asp Val Leu Arg
Met Gly Ala Leu 20 25 30Val
Glu Gln Ser Phe Arg Leu Ser His Gln Ala Leu Phe Asn Arg Asp 35
40 45Leu Thr Ala Ala Glu Gln Ile Arg Arg
Leu Asp Lys Lys Ile Asp Arg 50 55
60Phe Tyr Arg Gln Ile Glu Val Asp Cys Ala Thr Ile Met Ser Ser Gln65
70 75 80Ala Pro Thr Asp Gln
Glu Ser Arg Cys Leu Ser Ser Phe Met Gln Leu 85
90 95Val Arg Asp Leu Glu Arg Ile Gly Asp Tyr Ala
Lys Asp Leu Ala Glu 100 105
110Ile Ala Met Lys Ile Phe Pro Tyr Pro Pro His Pro Thr Leu Gly Glu
115 120 125Val Ala Ile Met Ser Asp His
Ala Gln Ser Met Leu Ala Thr Ser Leu 130 135
140Val Ala Leu Ala Asp Leu Asp Glu Ile Ser Gly Arg Arg Ile Lys
Leu145 150 155 160Leu Asp
Asp Thr Val Asp Asp Ala Tyr Lys Lys Leu Tyr Arg Asn Leu
165 170 175Ala Gln Gln Lys Asp Val Pro
Gly Val Val Glu Pro Ile Leu Leu Leu 180 185
190Thr Leu Ala Ile Gln Cys Leu Glu Arg Met Ala Asp His Ala
Thr Asn 195 200 205Ile Gly Gln Arg
Val Ala Tyr Ile Val Thr Gly Gln Arg 210 215
220
<210> 62 <211> 1353 <212> DNA <213> Cylindrospermopsis raciborskii T3 <400> 62
tcagaaatat
ccgccatcat gttgaaccac ctggggaaga tgaatttgta tccaagcacc 60accggtatca
ggatggttca tggccctgat tttgccacca tgagctataa ttatttggcg 120gacaatggat
aaccctaaac cactaccagt aatttctact gtttcattct cagagcggga 180ctcgcggtgt
ctagctttgt ccccccgata aaatctttga aagacatggg gtagatccat 240gggagcaaat
ccaaccccgg aatcaataat gttaatttct aaaatctgat ttgatacttg 300gtttaatatt
gtatctgctt ctggatcaac cccattaata gacttctccc cacaaactgg 360attcatttca
atgaaaatag taccgttcag gttgctgtat ttaatacagt tatctaacag 420attaagaaac
acttgataaa ttctggactt atcagcacat atatagacct tttccgggcc 480ggagtaagaa
atactaagat gctgattagc ggctaggggc tctaaattct cccagactga 540aaaaattagg
gagcggactt ctagcatttc caaattcagt tgtatggagg aggttatttc 600catctgggtc
aggtctaacc aattttggac taaattaatt agtctgtcaa cctcctgcat 660caagcggatg
acccaacggt ttagaggggg atctaagcga gtttgcaggg tttctgcgac 720cagacgaatg
gaagtcagag gtgttctcag ttcatgggcc aggtctgaaa aagagcggtc 780acgttgctga
tgaatgtcta caaattgttg gtgactttct agaaacacac ccacttgtcc 840ccccggtagg
ggaaaactgt tagctgctaa agacaatggc tttaatccta aaataccctg 900accatgatct
cgggaagggt gaaaaatcca ctcttgcatt tgcggttttt gccaatcccg 960ggtttgctca
attaactgat ccagctcata ggatctcact aattccagta gcaggcgcac 1020ttgacccggt
tgccatcttt gtaaatacag catttcccgc gcgcactgat tacaccatag 1080tagttggttt
tcttcatcta cttgtaaata tcccaaaggc gcagcatcca gcaactgttc 1140ataagctttg
agtgacaagc gtaagttttg ttgctcatct ctaacggtag atattttacg 1200atgtaatcca
gctaataggg gtaataatat cttttcagcg tgagggttta agggttgggt 1260taactgctcc
aaatgactgt taagttgaaa ttgttgccaa agccaaaaac caaaaccgac 1320tgccaaaccc
agaagaaatc ccaataagaa cat
1353
<210> 63 <211> 450 <212> PRT <213> Cylindrospermopsis raciborskii T3 <400> 63
Met Phe Leu Leu Gly Phe
Leu Leu Gly Leu Ala Val Gly Phe Gly Phe1 5
10 15Trp Leu Trp Gln Gln Phe Gln Leu Asn Ser His Leu
Glu Gln Leu Thr 20 25 30Gln
Pro Leu Asn Pro His Ala Glu Lys Ile Leu Leu Pro Leu Leu Ala 35
40 45Gly Leu His Arg Lys Ile Ser Thr Val
Arg Asp Glu Gln Gln Asn Leu 50 55
60Arg Leu Ser Leu Lys Ala Tyr Glu Gln Leu Leu Asp Ala Ala Pro Leu65
70 75 80Gly Tyr Leu Gln Val
Asp Glu Glu Asn Gln Leu Leu Trp Cys Asn Gln 85
90 95Cys Ala Arg Glu Met Leu Tyr Leu Gln Arg Trp
Gln Pro Gly Gln Val 100 105
110Arg Leu Leu Leu Glu Leu Val Arg Ser Tyr Glu Leu Asp Gln Leu Ile
115 120 125Glu Gln Thr Arg Asp Trp Gln
Lys Pro Gln Met Gln Glu Trp Ile Phe 130 135
140His Pro Ser Arg Asp His Gly Gln Gly Ile Leu Gly Leu Lys Pro
Leu145 150 155 160Ser Leu
Ala Ala Asn Ser Phe Pro Leu Pro Gly Gly Gln Val Gly Val
165 170 175Phe Leu Glu Ser His Gln Gln
Phe Val Asp Ile His Gln Gln Arg Asp 180 185
190Arg Ser Phe Ser Asp Leu Ala His Glu Leu Arg Thr Pro Leu
Thr Ser 195 200 205Ile Arg Leu Val
Ala Glu Thr Leu Gln Thr Arg Leu Asp Pro Pro Leu 210
215 220Asn Arg Trp Val Ile Arg Leu Met Gln Glu Val Asp
Arg Leu Ile Asn225 230 235
240Leu Val Gln Asn Trp Leu Asp Leu Thr Gln Met Glu Ile Thr Ser Ser
245 250 255Ile Gln Leu Asn Leu
Glu Met Leu Glu Val Arg Ser Leu Ile Phe Ser 260
265 270Val Trp Glu Asn Leu Glu Pro Leu Ala Ala Asn Gln
His Leu Ser Ile 275 280 285Ser Tyr
Ser Gly Pro Glu Lys Val Tyr Ile Cys Ala Asp Lys Ser Arg 290
295 300Ile Tyr Gln Val Phe Leu Asn Leu Leu Asp Asn
Cys Ile Lys Tyr Ser305 310 315
320Asn Leu Asn Gly Thr Ile Phe Ile Glu Met Asn Pro Val Cys Gly Glu
325 330 335Lys Ser Ile Asn
Gly Val Asp Pro Glu Ala Asp Thr Ile Leu Asn Gln 340
345 350Val Ser Asn Gln Ile Leu Glu Ile Asn Ile Ile
Asp Ser Gly Val Gly 355 360 365Phe
Ala Pro Met Asp Leu Pro His Val Phe Gln Arg Phe Tyr Arg Gly 370
375 380Asp Lys Ala Arg His Arg Glu Ser Arg Ser
Glu Asn Glu Thr Val Glu385 390 395
400Ile Thr Gly Ser Gly Leu Gly Leu Ser Ile Val Arg Gln Ile Ile
Ile 405 410 415Ala His Gly
Gly Lys Ile Arg Ala Met Asn His Pro Asp Thr Gly Gly 420
425 430Ala Trp Ile Gln Ile His Leu Pro Gln Val
Val Gln His Asp Gly Gly 435 440 445
Tyr Phe 450
64 819 DNA Cylindrospermopsis raciborskii T3
64 tcaaccaaat
ctatagccaa aacccctaac tgtgacaata tattctggat ggctagggtc 60taactctaat
ttttccctca gccatcgaat gtgaacatcc accgttttac tgtcaccaac 120aaaatcagga
ccccaaacct ggtctaataa ctgttcccgt gaccacaccc tgcgagcata 180actcataaat
agttctagta accggaattc tttcggtgac aagctcacct ccctccctct 240cactaacacc
cgacattcct gaggatttaa actgatatcc ttatatttta aagtgggtat 300caagggcaaa
ttagaaaacc gctgacgacg taacagggcg cgacacctag ccaccatttc 360ccgtacgcta
aaaggcttag ttaggtaatc atccgcccct acctctaaac ccagcacccg 420gtcagtttca
ctacctttcg cactcagaat taaaatcggt atggaattac cctggtgacg 480taacaaacga
caaatatcta atccgttgat ttgtggcaac atcaagtcta gcacaagcag 540gtcgaaggat
aactcaccag gttgggtctc taaattcctg attaattcca cagcacaacg 600accatcctta
gcagtcacaa cttcataacc ttcaccctct aaggctacta caagcatctc 660tcggatcagt
tcttcgtctt ccactattaa aacgcgacta actggttcaa tatccgattt 720agtgaagtat
ctagggtaat tcagtagtat acattgataa caaaaatttg taagaatgta 780ctggtctggg
tttcccacta gtatatgatc ctcactcat
819
65 272 PRT Cylindrospermopsis raciborskii T3
65 Met Ser Glu Asp His Ile
Leu Val Gly Asn Pro Asp Gln Tyr Ile Leu1 5
10 15Thr Asn Phe Cys Tyr Gln Cys Ile Leu Leu Asn Tyr
Pro Arg Tyr Phe 20 25 30Thr
Lys Ser Asp Ile Glu Pro Val Ser Arg Val Leu Ile Val Glu Asp 35
40 45Glu Glu Leu Ile Arg Glu Met Leu Val
Val Ala Leu Glu Gly Glu Gly 50 55
60Tyr Glu Val Val Thr Ala Lys Asp Gly Arg Cys Ala Val Glu Leu Ile65
70 75 80Arg Asn Leu Glu Thr
Gln Pro Gly Glu Leu Ser Phe Asp Leu Leu Val 85
90 95Leu Asp Leu Met Leu Pro Gln Ile Asn Gly Leu
Asp Ile Cys Arg Leu 100 105
110Leu Arg His Gln Gly Asn Ser Ile Pro Ile Leu Ile Leu Ser Ala Lys
115 120 125Gly Ser Glu Thr Asp Arg Val
Leu Gly Leu Glu Val Gly Ala Asp Asp 130 135
140Tyr Leu Thr Lys Pro Phe Ser Val Arg Glu Met Val Ala Arg Cys
Arg145 150 155 160Ala Leu
Leu Arg Arg Gln Arg Phe Ser Asn Leu Pro Leu Ile Pro Thr
165 170 175Leu Lys Tyr Lys Asp Ile Ser
Leu Asn Pro Gln Glu Cys Arg Val Leu 180 185
190Val Arg Gly Arg Glu Val Ser Leu Ser Pro Lys Glu Phe Arg
Leu Leu 195 200 205Glu Leu Phe Met
Ser Tyr Ala Arg Arg Val Trp Ser Arg Glu Gln Leu 210
215 220Leu Asp Gln Val Trp Gly Pro Asp Phe Val Gly Asp
Ser Lys Thr Val225 230 235
240Asp Val His Ile Arg Trp Leu Arg Glu Lys Leu Glu Leu Asp Pro Ser
245 250 255His Pro Glu Tyr Ile
Val Thr Val Arg Gly Phe Gly Tyr Arg Phe Gly 260
265 270
66 774 DNA Cylindrospermopsis raciborskii T3
66 tcaggcaaaa cgagagaagt ctaaagtggg tggaatatcc tgaattcttc caggacctat
60agcccgtagt gcttctggta aactaatatc cccagtatat agggctttac ccacaattac
120tcctgtaacc ccctgatgtt ctaaagataa taaggttaat aggtcagtaa cagaacccac
180acccccagag gcaatcacgg gtatggaaat agcagatacc aagtctctta atgctcgcaa
240gtttggtccc tgaagcgtac catcacggtt tatatccgta taaataatag ctgccgcacc
300caattcctgc atttgggttg ctagttgggg ggccaaaatt tgagaagttt ctaaccaacc
360cctggtagca actagaccat tccgcgcatc aatcccaatt ataatttgct gggggaattg
420ttcacacagt ccttgaacca gatctggttg ctctactgct acagttccca gaattgccca
480ctgtacccca agattaaata actgtataac gctggagcta tcacgtattc ctccgccaac
540ttcaataggt atggaaatag cattggtaat agcttctata gtagataaat taactatttt
600accagttttt gctccatcta aatctactaa atgtagtctt gttgctcctt ggtctgccca
660cattttagcg gtttccacag ggttatggct gtaaacctgg gattgtgcat agtcaccttt
720gtagagtctt acacaacgcc cctctaatag atctattgct gggataactt ccat
774
67 257 PRT Cylindrospermopsis raciborskii T3
67 Met Glu Val Ile Pro Ala
Ile Asp Leu Leu Glu Gly Arg Cys Val Arg1 5
10 15Leu Tyr Lys Gly Asp Tyr Ala Gln Ser Gln Val Tyr
Ser His Asn Pro 20 25 30Val
Glu Thr Ala Lys Met Trp Ala Asp Gln Gly Ala Thr Arg Leu His 35
40 45Leu Val Asp Leu Asp Gly Ala Lys Thr
Gly Lys Ile Val Asn Leu Ser 50 55
60Thr Ile Glu Ala Ile Thr Asn Ala Ile Ser Ile Pro Ile Glu Val Gly65
70 75 80Gly Gly Ile Arg Asp
Ser Ser Ser Val Ile Gln Leu Phe Asn Leu Gly 85
90 95Val Gln Trp Ala Ile Leu Gly Thr Val Ala Val
Glu Gln Pro Asp Leu 100 105
110Val Gln Gly Leu Cys Glu Gln Phe Pro Gln Gln Ile Ile Ile Gly Ile
115 120 125Asp Ala Arg Asn Gly Leu Val
Ala Thr Arg Gly Trp Leu Glu Thr Ser 130 135
140Gln Ile Leu Ala Pro Gln Leu Ala Thr Gln Met Gln Glu Leu Gly
Ala145 150 155 160Ala Ala
Ile Ile Tyr Thr Asp Ile Asn Arg Asp Gly Thr Leu Gln Gly
165 170 175Pro Asn Leu Arg Ala Leu Arg
Asp Leu Val Ser Ala Ile Ser Ile Pro 180 185
190Val Ile Ala Ser Gly Gly Val Gly Ser Val Thr Asp Leu Leu
Thr Leu 195 200 205Leu Ser Leu Glu
His Gln Gly Val Thr Gly Val Ile Val Gly Lys Ala 210
215 220Leu Tyr Thr Gly Asp Ile Ser Leu Pro Glu Ala Leu
Arg Ala Ile Gly225 230 235
240Pro Gly Arg Ile Gln Asp Ile Pro Pro Thr Leu Asp Phe Ser Arg Phe
245 250 255
Ala
68 396 DNA Cylindrospermopsis raciborskii T3
68 atgagttggt ccacaatgaa
ggacgtcttg attttaatag tcaaatccct ccaaatccat 60tataatccca tgaatgctct
ttcaattcct acctggatta tccatatttc tagtgtcatt 120gaatgggtag ttgccatttc
cctcatctgg aaatatggcg aactgaccca aaaccatagt 180tggaggggat ttgccttagg
tatgataccc gccttaatta gcgccctatc cgcttgtacc 240tggcattatt tcgataatcc
ccagtcccta gaatggttag tcaccctcca ggctactact 300acgttaatag gtaattttac
tctttgggca gcagcagtct gggtttggcg ttctactcga 360ccgaatgagg ttctcagtat
ctcaaataag gagtag
396
69 131 PRT Cylindrospermopsis raciborskii T3
69 Met Ser Trp Ser Thr Met
Lys Asp Val Leu Ile Leu Ile Val Lys Ser1 5
10 15Leu Gln Ile His Tyr Asn Pro Met Asn Ala Leu Ser
Ile Pro Thr Trp 20 25 30Ile
Ile His Ile Ser Ser Val Ile Glu Trp Val Val Ala Ile Ser Leu 35
40 45Ile Trp Lys Tyr Gly Glu Leu Thr Gln
Asn His Ser Trp Arg Gly Phe 50 55
60Ala Leu Gly Met Ile Pro Ala Leu Ile Ser Ala Leu Ser Ala Cys Thr65
70 75 80Trp His Tyr Phe Asp
Asn Pro Gln Ser Leu Glu Trp Leu Val Thr Leu 85
90 95Gln Ala Thr Thr Thr Leu Ile Gly Asn Phe Thr
Leu Trp Ala Ala Ala 100 105
110Val Trp Val Trp Arg Ser Thr Arg Pro Asn Glu Val Leu Ser Ile Ser
115 120 125Asn Lys Glu
130
70 20 DNA Artificial Based on Cylindrospermopsis raciborskii T3 sequence
70 ttaattgctt ggtctatctc 20
71 20 DNA Artificial Based on Cylindrospermopsis raciborskii T3 sequence
71 caataccgaa gaggagatag 20
72 20 DNA Artificial Based on Cylindrospermopsis raciborskii T3 sequence
72 taggcgtgtt agtgggagat 20
73 20 DNA Artificial Based on Cylindrospermopsis raciborskii T3 sequence
73 tgtgtaacca atttgtgagt 20
74 20 DNA Artificial Based on Cylindrospermopsis raciborskii T3 sequence
74 ttagccggat tacaggtgaa 20
75 20 DNA Artificial Based on Cylindrospermopsis raciborskii T3 sequence
75 ctggactcgg cttgttgctt 20
76 20 DNA Artificial Based on Cylindrospermopsis raciborskii T3 sequence
76 cagcgagtta cacccaccac 20
77 20 DNA Artificial Based on Cylindrospermopsis raciborskii T3 sequence
77 ctcgcactaa atattctacc 20
78 19 DNA Artificial Based on Cylindrospermopsis raciborskii T3 sequence
78 aaaacctcag cttccacaa 19
79 22 DNA Artificial Based on Cylindrospermopsis raciborskii T3 sequence
79 atgattttgg aggtccattg tt 22
80 42156 DNA Cylindrospermopsis raciborskii AWT205
80 gtttttactg
caaaagcata ttcatattat attctaatag ggttggtgga atattcaagg 60ggaggttaga
aaatgcgatc gctcttatga atgaggttgt ctatccgaat atcaaatatt 120ggtggttgaa
aaaagacctt atatgcggac acagattccc atgatgaaaa tatatcattg 180tcaagtcaat
tagtcaaccc cccaatagac atctccgaaa aagaatcaaa gtgtgataaa 240atttgcagta
cagcaggata taaaatagtt tttcctctat acttctgagt gtaggcttgc 300gtccgccccc
gggcgcacgt ttgcggtttg ctaaggagtt aaacacggtg cgttaatatg 360tatcagcaac
ctgagataac agctcgttga atgcttagcg gttaagtcca gtcattgctc 420gtagcagtcg
ctcttgattc aggatgcggt ctaagttcaa cattaatgtc accctacttg 480tctgcttgat
tattatccct tattttccaa caactctaat gaaagtacct ataacagcaa 540acgaagatgc
agctacatta cttcagcgtg ttggactgtc cctaaaggaa gcacaccaac 600aacttgaggc
aatgcaacgc cgagcgcacg aaccgatcgc aattgtgggg ctggggctgc 660ggtttccggg
agctgattca ccacagacat tctggaaact acttcagaat ggtgttgata 720tggtcaccga
aatccctagc gatcgctggg cagttgatga atactatgat ccccaacctg 780ggtgtccagg
caaaatgtat attcgtgaag ccgcttttgt tgatgcagtg gataaattcg 840atgcctcgtt
ttttgatatt tcgccacgtg aagcggccaa tatagatccc cagcatagaa 900tgttgctgga
ggtagcttgg gaggcactcg aaagggctgg cattgctccc agccaattga 960tggatagcca
aacgggggta tttgtcggga tgagcgaaaa tgactattat gctcacctag 1020aaaatacagg
ggatcatcat aatgtctatg cggcaacggg caatagcaat tactatgctc 1080cggggcgttt
atcctatcta ttggggcttc aaggacctaa catggtcgtt gatagtgcct 1140gttcctcctc
cttagtggct gtacatcttg cctgtaatag tttgcggatg ggagaatgtg 1200atctggcact
ggctggtggc gttcagctta tgttaatccc agaccctatg attgggactg 1260cccagttaaa
tgcctttgcg accgatggtc gtagtaaaac atttgacgct gccgccgatg 1320gctatggacg
cggcgaaggt tgtggcatga ttgtacttaa aagaataagt gacgcgatcg 1380tggcagacga
tccaatttta gccgtaatcc ggggtagtgc agtcaatcat ggcgggcgta 1440gcagtggttt
aactgcccct aataagctgt ctcaagaagc cttactgcgt caggcactac 1500aaaacgccaa
ggttcagccg gaagcagtca gttatatcga agcccatggc acagggacac 1560aactgggcga
cccgattgag gtgggagcat taacgaccgt ctttggatct tctcgttcag 1620aacccttgtg
gattggctct gtcaaaacta atatcggaca cctagaacca gccgctggta 1680ttgcggggtt
aataaaagtc attttatcat tacaagaaaa acagattcct cccagtctcc 1740attttcaaaa
ccctaatccc ttcattgatt gggaatcttc gccagttcaa gtgccgacac 1800agtgtgtacc
ctggactggg aaagagcgcg tcgctggagt tagctcgttt ggtatgagcg 1860gtacaaactg
tcatctagtt gtcgcagaag cacctgtccg ccaaaacgaa aaatctgaaa 1920atgcaccgga
gcgtccttgt cacattctga ccctttcagc caaaaccgaa gcggcactca 1980acgcattggt
agcccgttac atggcatttc tcagggaagc gcccgccata tccctagctg 2040atctttgtta
tagtgccaat gtcgggcgta atctttttgc ccatcgctta agttttatct 2100ccgagaacat
cgcgcagtta tcagaacaat tagaacactg cccacagcag gctacaatgc 2160caacgcaaca
taatgtgata ctagataatc aactcagccc tcaaatcgct tttctgttta 2220ctggacaagg
ttcgcagtac atcaacatgg ggcgtgagct ttacgaaact cagcccacct 2280tccgtcggat
tatggacgaa tgtgacgaca ttctgcatcc attgttgggt gaatcaattc 2340tgaacatact
ctacacttcc cctagcaaac ttaatcaaac cgtttatacc caacctgccc 2400tttttgcttt
tgaatatgcc ctagcaaaac tatggatatc atggggtatt gagcctgatg 2460tcgtactggg
tcacagcgtg ggtgaatatg tagccgcttg tctggcgggt gtctttagtt 2520tagaagatgg
gttaaaactc attgcatctc gtggatgttt gatgcaagcc ttaccgccgg 2580ggaaaatgct
tagtatcaga agcaatgaga tcggagtgaa agcgctcatc gcgccttata 2640gtgcagaagt
atcaattgca gcaatcaatg gacagcaaag cgtggtgatc tccggcaaag 2700ctgaaattat
agataattta gcagcagagt ttgcatcgga aggcatcaaa acacacctaa 2760ttacagtctc
ccacgctttc cactcgccaa tgatgacccc catgctgaaa gcattccgag 2820acgttgccag
caccatcagc tataggtcac ccagtttatc actgatttct aacggtacag 2880ggcaattggc
aacaaaggag gttgctacac ctgattattg ggtgcgtcat gtccattcta 2940ccgtccgttt
tgccgatggt attgccacat tggcagaaca gaatactgac atcctcctag 3000aagtaggacc
caaaccaata ttgttgggta tggcaaagca gatttatagt gaaaacggtt 3060cagctagtca
tccgctcatg ctacccagtt tgcgtgaaga tggcaacgat tggcagcaga 3120tgctttctac
ttgtggacaa cttgtagtta atggagtcaa gattgactgg gcgggttttg 3180acaaggatta
ttcacgacac aaaatattgt tgcccaccta tccgtttcag agagaacgat 3240attggattga
aagctccgtc aaaaagcccc aaaaacagga gctgcgccca atgttggata 3300agatgatccg
gctaccatca gagaacaaag tggtgtttga aaccgagttt ggcgtgcgac 3360agatgcctca
tatctccgat catcagatat acggtgaagt cattgtaccg ggggcagtat 3420tagcttcctt
aatcttcaat gcagcgcagg ttttataccc agactatcag catgaattaa 3480ctgatattgc
tttttatcag ccaattatct ttcatgacga cgatacggtg atcgtgcagg 3540cgattttcag
ccctgataag tcacaggaga atcaaagcca tcaaacattt ccacccatga 3600gcttccagat
tattagcttc atgccggatg gtcccttaga gaacaaaccg aaagtccatg 3660tcacagggtg
tctgagaatg ttgcgcgatg cccaaccgcc aacactctcc ccgaccgaaa 3720tacgtcagcg
ctgtccacat accgtaaatg gtcatgactg gtacaatagc ttagtcaaac 3780aaaaatttga
aatgggtcct tcctttaggt gggtacagca actttggcat ggggaaaatg 3840aagcattgac
ccgtcttcac ataccagatg tggtcggctc tgtatcagga catcaacttc 3900acggcatatt
gctcgatggt tcactttcaa ccaccgctgt catggagtac gagtacggag 3960actccgcgac
cagagttcct ttgtcatttg cttctctgca actgtacaaa cccgtcacgg 4020gaacagagtg
gtggtgctac gcgaggaaga ttggggaatt caaatatgac ttccagatta 4080tgaatgaaat
cggggaaacc ttggtgaaag caattggctt tgtacttcgt gaagcctctc 4140ccgaaaaatt
cctcagaaca acatacgtac acaactggct tgtagacatt gaatggcaag 4200ctcaatcaac
ttccctagtc ccttctgatg gcactatctc tggcagttgt ttggttttat 4260cagatcagca
tggaacaggg gctgcattgg cacaaaggct agacaatgct ggagtgccag 4320tgaccatgat
ctatgctgat ctgatactgg acaattacga attaatattc cgtactttgc 4380cagatttaca
acaagtcgtc tatttatggg ggttggatca aaaagaggat tgtcacccca 4440tgaagcaagc
agaggataac tgtacatcgg tgctatatct tgtgcaagca ttactcaata 4500cctactcaac
cccgccatcc ctgcttattg tcacctgtga tgcacaagcg gtggttgaac 4560aagatcgagt
aaatggcttc gcccaatcgt ctttgttggg acttgccaaa gttatcatgc 4620tagaacaccc
agaattgtcc tgtgtttaca tggatgtgga agccggatat ttacagcaag 4680atgtggcgaa
cacgatattt acacagctaa aaagaggcca tctatcaaag gacggagaag 4740agagtcagtt
ggcttggcgc aatggacaag catacgtagc acgtcttagt caatataaac 4800ccaaatccga
acaactggtt gagatccgca gcgatcgcag ctatttgatc actggtggac 4860ggggcggtgt
cggcttacaa atcgcacggt ggttagtgga aaagggggct aaacatctcg 4920ttttgttggg
gcgcagtcag accagttccg aagtcagtct ggtgttggat gagctagaat 4980cagccggggc
gcaaatcatt gtggctcaag ctgatattag cgatgagaag gtattagcgc 5040agattctgac
caatctaacc gtacctctgt gtggtgtaat ccacgccgca ggagtgcttg 5100atgatgcgag
tctactccaa caaactccag ccaagctcaa aaaagttcta ttgccaaaag 5160cagagggggc
ttggattctg cataatttga ccctggagca gcgactagac ttctttgttc 5220tcttttcttc
tgccagttct ctattaggtg cgccagggca ggccaactat tcagcagcca 5280atgctttcct
agatggttta gctgcctatc ggcgagggcg aggactcccc tgtttgtcta 5340tctgctgggg
ggcatgggat caagtcggta tggctgcacg acaagggcta ctggacaagt 5400taccgcaaag
aggtgaagag gccatcccgt tacagaaagg cttagacctc ttcggcgaat 5460tactgaacga
gccagccgct caaattggtg tgatcccaat tcaatggact cgcttcttgg 5520atcatcaaaa
aggtaatttg cctttttatg agaagttttc taagtctagc cggaaagcgc 5580agagttacga
ttcgatggca gtcagtcaca cagaagatat tcagaggaaa ctgaagcaag 5640ctgctgtgca
agatcgacca aaattattag aagtgcatct tcgctctcaa gtcgctcaac 5700tgttaggaat
aaacgtggca gagctaccaa atgaagaagg aattggtttt gttacattag 5760gtcttgactc
gctcacctct attgaactgc gtaacagttt acaacgcaca ttagattgtt 5820cattacctgt
cacctttgct tttgactacc caactataga aatagcggtt aagtacctaa 5880cacaagttgt
aattgcaccg atggaaagca cagcatcgca gcaaacagac tctttatcag 5940caatgttcac
agatacttcg tccatcggga gaattcttga caacgaaaca gatgtgttag 6000acagcgaaat
gcaaagtgat gaagatgaat ctttgtctac acttatacaa aaattatcaa 6060cacatttgga
ttaggagtga tcaataatta tacattgcgg acgtgagcat acaagtaaag 6120gaaaaatgaa
tgaacgcttt gtcagaaaat caggtaactt ctatagtcaa gaaggcattg 6180aacaaaatag
aggagttaca agccgaactt gaccgtttaa aatacgcgca acgggaacca 6240atcgccatca
ttggaatggg ctgtcgcttt cctggtgcag acacacctga agctttttgg 6300aaattattgc
acaatggggt tgatgctatc caagagattc caaaaagccg ttgggatatt 6360gacgactatt
atgatcccac accagcaaca cccggcaaaa tgtatacacg ttttggtggt 6420tttctcgacc
aaatagcagc cttcgaccct gagttctttc gcatttctac tcgtgaggca 6480atcagcttag
accctcaaca gagattgctt ctggaagtga gttgggaagc cttagaacgg 6540gctgggctga
caggcaataa actgactaca caaacaggtg tctttgttgg catcagtgaa 6600agtgattatc
gtgatttgat tatgcgtaat ggttctgacc tagatgtata ttctggttca 6660ggtaactgcc
atagtacagc cagcgggcgt ttatcttatt atttgggact tactggaccc 6720aatttgtccc
ttgataccgc ctgttcgtcc tctttggttt gtgtggcatt ggctgtcaag 6780agcctacgtc
aacaggagtg tgatttggca ttggcgggtg gtgtacagat acaagtgata 6840ccagatggct
ttatcaaagc ctgtcaatcc cgtatgttgt cgcctgatgg acggtgcaaa 6900acatttgatt
tccaggcaga tggttatgcc cgtgctgagg ggtgtgggat ggtagttctc 6960aaacgcctat
ccgatgcaat tgctgacaat gataatatcc tggccttgat tcgtggtgcc 7020gcagtcaatc
atgatggcta cacgagtgga ttaaccgttc ccagtggtcc ctcacaacgg 7080gcggtgatcc
aacaggcatt agcggatgct ggaatacacc cggatcaaat tagctatatt 7140gaggcacatg
gcacaggtac atccttaggc gatcctattg aaatgggtgc gattgggcaa 7200gtctttggtc
aacgctcaca gatgcttttc gtcggttcgg tcaagacgaa tattggtcat 7260actgaggctg
ctgctggtat tgctggtctc atcaaggttg tactctcaat gcagcacggt 7320gaaatcccag
caaacttaca cttcgaccag ccaagtcctt atattaactg ggatcaatta 7380ccagtcagta
tcccaacaga aacaatacct tggtctacta gcgatcgctt tgcaggagtc 7440agtagctttg
gctttagtgg cacaaactct catatcgtac tagaggcagc cccaaacata 7500gagcaaccta
ctgatgatat taatcaaacg ccgcatattt tgaccttagc tgcaaaaaca 7560cccgcagccc
tgcaagaact ggctcggcgt tatgcgactc agatagagac ctctcccgat 7620gttcctctgg
cggacatttg tttcacagca cacatagggc gtaaacattt taaacatagg 7680tttgcggtag
tcacggaatc taaagagcaa ctgcgtttgc aattggatgc atttgcacaa 7740tcagggggtg
tggggcgaga agtcaaatcg ctaccaaaga tagcctttct ttttacaggt 7800caaggctcac
agtatgtggg aatgggtcgt caactttacg aaaaccaacc taccttccga 7860aaagcactcg
cccattgtga tgacatcttg cgtgctggtg catatttcga ccgatcacta 7920ctttcgattc
tctacccaga gggaaaatca gaagccattc accaaaccgc ttatactcag 7980cccgcgcttt
ttgctcttga gtatgcgatc gctcagttgt ggcactcctg gggtatcaaa 8040ccagatatcg
tgatggggca tagtgtaggt gaatacgtcg ccgcttgtgt ggcgggcata 8100ttttctttag
aggatgggct gaaactaatt gctactcgtg gtcgtctgat gcaatcccta 8160cctcaagacg
gaacgatggt ttcttctttg gcaagtgaag ctcgtatcca ggaagctatt 8220acaccttacc
gagatgatgt gtcaatcgca gcgataaatg ggacagaaag cgtggttatc 8280tctggcaaac
gcacctctgt gatggcaatt gctgaacaac tcgccaccgt tggcatcaag 8340acacgccaac
tgacggtttc ccatgccttc cattcaccac ttatgacacc catcttggat 8400gagttccgcc
aggtggcagc cagtatcacc tatcaccagc ccaagttgct acttgtctcc 8460aacgtctccg
ggaaagtggc cggccctgaa atcaccagac cagattactg ggtacgccat 8520gtccgtgagg
cagtgcgctt tgccgatgga gtgaggacgc tgaatgaaca aggtgtcaat 8580atctttctgg
aaatcggttc taccgctacc ctgttgggca tggcactgcg agtaaatgag 8640gaagattcaa
atgcctcaaa aggaacttcg tcttgctacc tgcccagttt acgggaaagc 8700cagaaggatt
gtcagcagat gttcactagt ctgggtgagt tgtacgtaca tggatatgat 8760attgattggg
gtgcatttaa tcggggatat caaggacgca aggtgatatt gccaacctat 8820ccgtttcagc
gacaacgtta ttggcttccc gaccctaagt tggcacaaag ttccgattta 8880gatacctttc
aagctcagag cagcgcatca tcacaaaatc ctagcgctgt gtccacttta 8940ctgatggaat
atttgcaagc aggtgatgtc caatctttag ttgggctttt ggatgatgaa 9000cggaaactct
ctgctgctga acgaattgca ctacccagta ttttggagtt tttggtagag 9060gaacaacagc
gacaaataag ctcaaccaca actcctcaaa cagttttaca aaaaataagt 9120caaacttccc
atgaggacag atatgaaata ttgaagaacc tgatcaaatc tgaaatcgaa 9180acgattatca
aaagtgttcc ctccgatgaa caaatgtttt ctgacttagg aattgattcc 9240ttgatggcga
tcgaactgcg taataagctc cgttctgcta tagggttgga actgccagtg 9300gcaatagtat
ttgaccatcc cacgattaag cagttaacta acttcgtact ggacagaatt 9360gtgccgcagg
cagaccaaaa ggacgttccc accgaatcct tgtttgcttc taaacaggag 9420atatcagttg
aggagcagtc ttttgcaatt accaagctgg gcttatcccc tgcttcccac 9480tccctgcatc
ttcctccatg gacggttaga cctgcggtaa tggcagatgt aacaaaacta 9540agccaacttg
aaagagaggc ctatggctgg atcggagaag gagcgatcgc cccgccccat 9600ctcattgccg
atcgcatcaa tttactcaac agtggtgata tgccttggtt ctgggtaatg 9660gagcgatcag
gagagttggg cgcgtggcag gtgctacaac cgacatctgt tgatccatat 9720acttatggaa
gttgggatga agtaactgac caaggtaaac tgcaagcaac cttcgaccca 9780agtggacgca
atgtgtatat tgtcgcgggt gggtctagca acctccccac ggtagccagc 9840cacctcatga
cgcttcagac tttattgatg ctgcgggaaa ctggtcgtga cacaatcttt 9900gtctgtctgg
caatgccagg ttatgccaaa taccacagtc aaacaggaaa atcgccggaa 9960gagtatattg
cgctgactga cgaggatggt atcccaatgg acgagtttat tgcactttct 10020gtctacgact
ggcctgttac cccatcgttt cgtgttctgc gagacggtta tccacctgat 10080cgagattctg
gtggtcacgc agttagtacg gttttccagc tcaatgattt cgatggagcg 10140atcgaagaaa
catatcgtcg tattatccgc catgccgatg tccttggtct cgaaagaggc 10200taaatttcag
gcgttggtga atagaaccca cattccgcag ataaggtctt atgaataaaa 10260aacaggtaga
cacattgtta atacacgctc atctttttac catgcagggc aatggcctgg 10320gatatattgc
cgatggggca attgcggttc agggtagcca gatcgtagca gtggattcga 10380cagaggcttt
gctgagtcat tttgaaggaa ataaaacaat taatgcggta aattgtgcag 10440tgttgcctgg
actaattgat gctcatatac atacgacttg tgctattctg cgtggagtgg 10500cacaggatgt
aaccaattgg ctaatggacg cgacaattcc ttatgcactt cagatgacac 10560ccgcagtaaa
tatagccgga acgcgcttga gtgtactcga agggctgaaa gcaggaacaa 10620ccacattcgg
cgattctgag actccttacc cgctctgggg agagtttttc gatgaaattg 10680gggtacgtgc
tattctatcc cctgccttta acgcctttcc actagaatgg tcggcatgga 10740aggagggaga
cctctatccc ttcgatatga aggcaggacg acgtggtatg gaagaggctg 10800tggattttgc
ttgtgcatgg aatggagccg cagagggacg tatcaccact atgttgggac 10860tacaggcggc
ggatatgcta ccactggaga tcctacacgc agctaaagag attgcccaac 10920gggaaggctt
aatgctgcat attcatgtgg cccagggaga tcgagaaaca aaacaaattg 10980tcaaacgata
tggtaagcgt ccgatcgcat ttctagctga aattggctac ttggacgaac 11040agttgctggc
agttcacctc accgatgcca cagatgaaga agtgatacaa gtagccaaaa 11100gtggtgctgg
catggcactc tgttcgggcg ctattggcat cattgacggt cttgttccgc 11160ccgctcatgt
ttttcgacaa gcaggcggtt ccgttgcact cggttctgat caagcctgtg 11220gcaacaactg
ttgtaacatc ttcaatgaaa tgaagctgac cgccttattc aacaaaataa 11280aatatcatga
tccaaccatt atgccggctt gggaagtcct gcgtatggct accatcgaag 11340gagcgcaggc
gattggttta gatcacaaga ttggctctct tcaagtgggc aaagaagccg 11400acctgatctt
aatagacctc agttccccta acctctcgcc caccctgctc aaccctattc 11460gtaaccttgt
acctaacttg gtgtatgctg cttcaggaca tgaagttaaa agcgtcatgg 11520tggcgggaaa
acttttagtg gaagactacc aagtcctcac ggtagatgag tccgctattc 11580tcgctgaagc
gcaagtacaa gctcaacaac tctgccaacg tgtgaccgct gaccccattc 11640acaaaaagat
ggtgttaatg gaagcgatgg ctaagggtaa attatagata caggcttatc 11700tgcaacaaca
tttctgaatc aaacctggag gggcaaacca atgaccatat atgaaaataa 11760gttgagtagt
tatcaaaaaa atcaagatgc cataatatct gcaaaagaac tcgaagaatg 11820gcatttaatt
ggacttctag accattcaat agatgcggta atagtaccga attattttct 11880tgagcaagag
tgtatgacaa tttcagagag aataaaaaag agtaaatatt ttagcgctta 11940tcccggtcat
ccatcagtaa gtagcttggg acaagagttg tatgaatgcg aaagtgagct 12000tgaattagca
aagtatcaag aagacgcacc cacattgatt aaagaaatgc ggaggctggt 12060acatccgtac
ataagtccaa ttgatagact tagggttgaa gttgatgata tttggagtta 12120tggctgtaat
ttagcaaaac ttggtgataa aaaactgttt gcgggtatcg ttagagagtt 12180taaagaagat
aaccctggcg caccacattg tgacgtaatg gcatggggtt ttctcgaata 12240ttataaagat
aaaccaaata tcataaatca aatcgcagca aatgtatatt taaaaacgtc 12300tgcatcagga
ggagaaatag tgctttggga tgaatggcca actcaaagcg aatatatagc 12360atacaaaaca
gatgatccag ctagtttcgg tcttgatagc aaaaagatcg cacaaccaaa 12420acttgagatc
caaccgaacc agggagattt aattctattc aattccatga gaattcatgc 12480ggtgaaaaag
atagaaactg gtgtacgtat gacatgggga tgtttgattg gatactctgg 12540aactgataaa
ccgcttgtta tttggactta atgtagcgtt tccatttgag tcaaggcacg 12600agaagcttct
aaagctggaa tagatacact atcattctca actacactct caaatgtcct 12660aggtaactgt
gccccaaaca tcagcattcc aatggcgttg aacaaaaaga aagccaacca 12720caagatatgg
ttactctcaa atttaacagc agctacatcc gcaggtaaaa atcctacacc 12780aaacgcgatt
aagttaacat tgcggagagt atgcccttga gccaaaccca agaagtaccc 12840acatagtatg
caacatactg aattgcatac taggacaagt accaaccagg gaataaaaat 12900atcaatattc
tcaataattt ctgcgtggtt ggttaacaac ccaaaaacat catcgggaaa 12960tagccaacac
gctccgccga aaaccagact cactagcaga gccattccca cagaaacttt 13020tgccagaggt
gctaactgtt ctgtggctcc tttcccttta aaatttcctg ccagagtttc 13080tgtacagaat
cccaatcctt caacaatgta gatgctcaaa gcccatatct gtaagagcaa 13140ggcattttga
gcgtagataa ttgtccccat ttgtgcccct tcgtagttaa acgttaagtt 13200ggtaaacata
caaactaaat tgctgacaaa gatgtttcca ttgagagtta aggtggagcg 13260tatagctttt
atgtcccaaa tttttccagc taattctttt acctcttgcc acgggatttc 13320tttgcagaca
aaaaacaatc ccaccaatag ggtgagatat tgacttgcag cagaagctac 13380tcctgccccc
atgctcgacc agtctaagtg gataataaac aagtagtcga gtgcgatatt 13440ggcagcattg
cccacaaccg acaacaacac aactaagcca tttttttccc gtcccagaaa 13500ccagccaagc
aggacaaagt tgagcaaaat ggcaggcgct ccccaactct gggtgttaaa 13560atacgcttga
gctgaagact tcacctctgg gccgacatct agtatagaaa accccaacac 13620ccctaacggg
tactgtaaca gtatgatcgc cacccccagc accagagcaa ttaaaccatt 13680aagcagtccc
gccaacagta cgccctctcg gtcatctcgt ccgactgctt gtgctgttaa 13740cgcagtggta
cccattcgta aaaacgataa aacaaagtag agaaagttaa gcaggtttcc 13800agcaagggct
actccagcta ggtagtggat ttccgagaga tgacctaaga acatgatact 13860gactaaatta
ctcagtggta ctataatatt cgataggacg ttggtaaaag ctagtcggaa 13920gtagcggggt
ataaagtcat actggcttgg aaatgtcagg ctcataagat taatttgaca 13980gtagagttgt
tggaaaataa gggataataa tcaagcagac aagtagggtg acattaatgt 14040tgaacttaga
ccgcatcctg aatcaagagc gactgctacg agaaatgact ggacttaacc 14100gccaagcatt
caacgagctg ttatctcagt ttgctgatac ctatgaacgc accgtgttca 14160actccttagc
aaaccgcaaa cgtgcgcccg ggggcggacg caagcctaca ctcagaagta 14220tagaggaaaa
actattttat atcctgctgt actgcaaatg ttatccgacg tttgacttgc 14280tgagtgtgtt
gttcaacttt gaccgctcct gtgctcatga ttgggtacat cgactactgt 14340ctgtgctaga
aaccacttta ggagaaaagc aagttttgcc agcacgcaaa ctcaggagca 14400tggaggaatt
caccaaaagg tttccagatg tgaaggaggt gattgtggat ggtacggagc 14460gtccagtcca
gcgtcctcaa aaccgagaac gccaaaaaga gtattactct ggcaagaaaa 14520agcggcatac
atgcaagcag attacagtca gcacaaggga gaaacgagtg attattcgga 14580cggaaaccag
agcaggtaaa gtgcatgaca aacggctact ccatgaatca gagatagtgc 14640aatacattcc
tgatgaagta gcaatagagg gagatttggg ttttcatggg ttggagaaag 14700aatttgtcaa
tgtccattta ccacacaaga aaccgaaagg tatcgaagca aggaggcatg 14760gcggcgggat
gggtcagttt ttataagaga gttttgacaa tataaataaa agacttttga 14820caaccagact
tggcattact tagtttcagt ctttcatctc aagtttacgt tattctgagg 14880cgaacatgaa
tcttataaca acaaaaaaac aggtagatac attagtgata cacgctcatc 14940tttttaccat
gcagggaaat ggtgtgggat atattgcaga tggggcactt gcggttgagg 15000gtagccgtat
tgtagcagtt gattcgacgg aggcgttgct gagtcatttt gagggcagaa 15060aggttattga
gtccgcgaat tgtgccgtct tgcctgggct gattaatgct cacgtagaca 15120caagtttggt
gctgatgcgt ggggcggcgc aagatgtaac taattggcta atggacgcga 15180ccatgcctta
ttttgctcac atgacacccg tggcgagtat ggctgcaaca cgcttaaggg 15240tggtagaaga
gttgaaagca ggcacaacaa cattctgtga caataaaatt attagccccc 15300tgtggggcga
atttttcgat gaaattggtg tacgggctag tttagctcct atgttcgatg 15360cactcccact
ggagatgcca ccgcttcaag acggggagct ttatcccttc gatatcaagg 15420cgggacggcg
ggcgatggca gaggctgtgg attttgcctg tgggtggaat ggggcagcag 15480aggggcgtat
cactaccatg ttaggaatgt attcgccaga tatgatgccg cttgagatgc 15540tacgcgcagc
caaagagatt gctcaacggg aaggcttaat gctgcatttt catgtagcgc 15600agggagatcg
ggaaacagag caaatcgtta aacgatatgg taagcgtccg atcgcatttc 15660tagctgagat
tggctacttg gacgaacagt tgctggcagt tcacctcacc gatgccaccg 15720atgaagaggt
gatacaagta gccaaaagtg gcgctggcat ggtactctgt tcgggaatga 15780ttggcactat
tgacggtatc gtgccgcccg ctcatgtgtt tcggcaagca ggcggacccg 15840ttgcgctagg
cagcagctac aataatattt tccatgagat gaagctgacc gccttattca 15900acaaaataaa
atatcacgat ccaaccatta tgccggcttg ggaagtcctg cgtatggcta 15960ccatcgaagg
agcgcgggcg attggtttag atcacaagat tggctctctt gaagttggca 16020aagaagccga
cctgatctta atagacctca gcacccctaa cctctcaccc actctgctta 16080accccattcg
taaccttgta cctaatttcg tgtacgctgc ttcaggacat gaagttaaaa 16140gtgtcatggt
ggcgggaaaa ctgttattgg aagactacca agtcctcaca gtagatgagt 16200ctgctatcat
tgctgaagca caattgcaag cccaacagat ttctcaatgc gtagcatctg 16260accctatcca
caaaaaaatg gtgctgatgg cggcgatggc aaggggccaa ttgtaggaat 16320ggtcttgagt
tatctagtaa gctaagttgc caactaacaa ttaaaaatac gaagcaggtg 16380ataaggcaga
attacagcag gttgtctttc ggatcgctcg ttggatcttt gtaccttccc 16440tagtcatggc
gatcgccctc atcgtcttcg cccaacccgt gatgagcctg ttcggtgcag 16500agtttgctgt
ggctcattgg tagccgatac catccctcca actgacttgt catgatagtc 16560atggtgcgac
tttcccttcg gtactgataa actgggattg aatccctttc agagtcatca 16620tgatagattt
gggaagtcta aatgtggtcg agaagaaagt gcttttccca tgttgagaat 16680agtcacatta
acatcagcat caaaacgcct aattctagat tttacctatg gtttcagcca 16740aggtaaagga
actgagtcta aattacacgc cgtcatgaga taatatgatt attaattttc 16800tgtatagccc
agttaattat acttgattgt aggctatttt tagcctcttc taatgaagaa 16860tccagactaa
tccttatgta cgggaatatg ttatgcaaga aaaacgaatc gcaatgtggt 16920ctgtgccacg
aagtttgggt acagtgctgc tacaagcctg gtcgagtcgg ccagataccg 16980tagtctttga
tgaacttctc tcctttccct atctctttat caaagggaaa gatatgggct 17040ttacttggac
agaccttgat tctagccaaa tgccccacgc agattggcga tccgtcatcg 17100atctgttaaa
ggctcccctg cctgaaggga aatcaatcat cgatctgtta aaggctcccc 17160tgcctgaagg
gaaatcaatt tgctatcaga agcatcaagc gtatcattta atcgaagaga 17220ccatggggat
tgagtggata ttgcccttca gcaactgctt tctgattcgc caacccaaag 17280aaatgctctt
atcttttcgt aagattgtgc cacattttac ctttgaagaa acaggctgga 17340tcgaattaaa
acggctgttt gactatgtac atcaaacgag cggagtaatc ccgcctgtca 17400tagatgcaca
cgacttgctg aacgatccgc ggagaatgct ctccaagctt tgtcaggttg 17460taggggttga
gtttaccgag acaatgctca gttggccccc catggaggtc gagttgaacg 17520aaaaactagc
cccttggtac agcaccgtag caagttctac gcattttcac tcgtatcaga 17580ataaaaatga
gtcgttgccg ctatatcttg tcgatatttg taaacgctgc gatgaaatat 17640atcaggaatt
atatcaattt cgactttatt agagagtatt ggtaatgaaa attttgaatt 17700agtgaagaaa
tagaagttga gaatatagac catctaggga tagagactta tgctggacgg 17760attcaacaac
atcaggacaa ttacccacgt cagagtgatt ttagctttgc tgtttacgga 17820caattatgga
tttatggcat ggaactatag gctgatttag ctctaagctt aattagtctt 17880aaacctcata
aacgcctctt tttcaagcgt ggctttcagg ctctatccct tatgaaacaa 17940gctgtttgac
cactttgtca cccggtaagg agaaaaacct taaacccaag cagaaaaaat 18000tagcccgtaa
aaaaaaggga agtaaatcaa ggaaatatag ggtaatatat ttttcacaag 18060tttatcaatt
gtaatctact tgattcagta aattaattaa ggtgttgaag agatgcaaac 18120aagaattgta
aatagctgga atgagtggga tgaactaaag gagatggttg tcgggattgc 18180agatggtgct
tattttgaac caactgagcc aggtaaccgc cctgctttac gcgataagaa 18240cattgccaaa
atgttctctt ttcccagggg tccgaaaaag caagaggtaa cagagaaagc 18300taatgaggag
ttgaatgggc tggtagcgct tctagaatca cagggcgtaa ctgtacgccg 18360cccagagaaa
cataactttg gcctgtctgt gaagacacca ttctttgagg tagagaatca 18420atattgtgcg
gtctgcccac gtgatgttat gatcaccttt gggaacgaaa ttctcgaagc 18480aactatgtca
cggcggtcac gcttctttga gtatttaccc tatcgcaaac tagtctatga 18540atattggcat
aaagatccag atatgatctg gaatgctgcg cctaaaccga ctatgcaaaa 18600tgccatgtac
cgcgaagatt tctgggagtg tccgatggaa gatcgatttg agagtatgca 18660tgattttgag
ttctgcgtca cccaggatga ggtgattttt gacgcagcag actgtagccg 18720ctttggccgt
gatatttttg tgcaggagtc aatgacgact aatcgtgcag ggattcgctg 18780gctcaaacgg
catttagagc cgcgtcgctt ccgcgtgcat gatattcact tcccactaga 18840tattttccca
tcccacattg attgtacttt tgtcccctta gcacctgggg ttgtgttagt 18900gaatccagat
cgccccatca aagagggtga agagaaactc ttcatggata acggttggca 18960attcatcgaa
gcacccctcc ccacttccac cgacgatgag atgcctatgt tctgccagtc 19020cagtaagtgg
ttggcgatga atgtgttaag catttccccc aagaaggtca tctgtgaaga 19080gcaagagcat
ccgcttcatg agttgctaga taaacacggc tttgaggtct atccaattcc 19140ctttcgcaat
gtctttgagt ttggcggttc gctccattgt gccacctggg atatccatcg 19200cacgggaacc
tgtgaggatt acttccctaa actaaactat acgccggtaa ctgcatcaac 19260caatggcgtt
tctcgcttca tcatttagta ggttttatag ttatgcaaaa gagagaaagc 19320ccacagatac
tatttgatgg gaatggaaca caatctgagt ttccagatag ttgcattcac 19380cacttgttcg
aggatcaagc cgcaaagcga ccggatgcga tcgctctcat tgacggtgag 19440caatccctta
cctacgggga actaaatgta cgcgctaacc acctagccca gcatctcttg 19500tccctaggct
gtcaacccga tgacctcctc gccatctgca tcgagcgttc ggcagaactc 19560tttattggtt
tgttgggtat cctaaaagcc ggatgtgctt atgtgccttt ggatgtaggc 19620tatcctggcg
atcgcataga gtatatgttg cgggactcgg atgcgcgtat tttactaacc 19680tcaacggatg
tcgctaagaa acttgcctta accatacctg cattgcaaga gtgccaaacc 19740gtctatttag
atcaagagat atttgagtat gattttcatt ttttagcgat agctaaacta 19800ttacataacc
aatacttgag attattacat ttttattttt ataccttgat tcagcaatgc 19860caggcaactt
cggtttccca agggattcag acacaggttc tccccaataa tctcgcttac 19920tgcatttaca
cctctggctc taccggaaat cccaaaggga tcttgatgga acatcgctca 19980ctggtgaata
tgctttggtg gcatcagcaa acgcggcctt cggttcaggg tgttaggacg 20040ctgcaatttt
gtgcagtcag ctttgacttt tcctgccatg aaattttttc taccctctgt 20100cttggcggga
tattggtctt ggtgccagag gcagtgcgcc aaaatccctt tgcattggct 20160gagttcatca
gtcaacagaa aattgaaaaa ttgtttcttc ccgttatagc attactacag 20220ttggccgaag
ctgtaaatgg gaataaaagc acctccctcg cgctttgcga agttatcact 20280accggggagc
agatgcagat cacacctgct gtcgccaacc tctttcagaa aaccggggcg 20340atgttgcata
atcactacgg ggcaacagaa tttcaagatg ccaccactca taccctcaag 20400ggcaatccag
agggctggcc aacactggtg ccagtgggtc gtccactgca caatgttcaa 20460gtgtatattc
tggatgaggc acagcaacct gtacctcttg gtggagaggg tgaattctgt 20520attggtggta
ttggactggc tcgtggctat cacaatttgc ctgacctaac gaatgaaaaa 20580tttattccca
atccatttgg ggctaatgag aacgctaaaa aactctaccg cacaggggac 20640ttggcacgct
acctacccga cggcacgatt gagcatttag gacggataga ccaccaggtt 20700aagatccgag
gtttccgcgt ggaattgggg gaaattgagt ccgtgctggc aagtcaccaa 20760gctgtgcgtg
aatgtgccgt tgtggcacgg gagattgcag gtcatacaca gttggtaggg 20820tatatcatag
caaaggatac acttaatctc agtttcgaca aacttgaacc tatcctgcgt 20880caatattcgg
aagcggtgct gccagaatac atgataccca ctcggttcat caatatcagt 20940aatatgccgt
tgactcccag tggtaaactt gaccgcaggg cattacctga tcccaaaggc 21000gatcgccctg
cattgtctac cccacttgtc aagcctcgta cccagacaga gaaacgttta 21060gcagagattt
ggggcagtta tcttgctgta gatattgtgg gaacccacga caatttcttt 21120gatctaggcg
gtacgtcact gctattgact caagcgcaca aattcctgtg cgagaccttt 21180aatattaatt
tgtccgctgt ctcactcttt caatatccca caattcagac attggcacaa 21240tatattgatt
gccaaggaga cacaacctca agcgatacag catccaggca caagaaagta 21300cgtaaaaagc
agtccggtga cagcaacgat attgccatca tcagtgtggc aggtcgcttt 21360ccgggtgctg
aaacgattga gcagttctgg cataatctct gtaatggtgt tgaatccatc 21420acccttttta
gtgatgatga gctagagcag actttgcctg agttatttaa taatcccgct 21480tatgtcaaag
caggtgcggt gctagaaggc gttgaattat ttgatgctac cttttttggc 21540tacagcccca
aagaagctgc ggtgacagac cctcagcaac ggattttgct agagtgtgcc 21600tgggaagcat
ttgaacgggc tggctacaac cccgaaacct atccagaacc agttggtgtt 21660tatgctggtt
caagcctgag tacctatctg cttaacaata ttggctctgc tttaggcata 21720attaccgagc
aaccctttat tgaaacggat atggagcagt ttcaggctaa aattggcaat 21780gaccggagct
atcttgctac acgcatctct tacaagctga atctcaaggg tccaagcgtc 21840aatgtgcaga
ccgcctgctc aacctcgtta gttgcggttc acatggcctg tcagagtctc 21900attagtggag
agtgtcaaat ggctttagcc ggtggtattt ctgtggttgt accacagaag 21960gggggctatc
tctacgaaga aggcatggtt cgttcccagg atggtcattg tcgcgccttt 22020gatgccgaag
cccaagggac tatatttggc aatggcggcg gcttggtttt gcttaaacgg 22080ttgcaggatg
cactggacga taacgacaac attatggcag tcatcaaagc cacagccatc 22140aacaacgacg
gtgcgctcaa gatgggctac acagcaccga gcgtggatgg gcaagctgat 22200gtaattagcg
aggcgattgc tatcgctgac atagatgcaa gcaccattgg ctatgtagaa 22260gctcatggca
cagccaccca attgggtgat ccgattgaag tagcagggtt agcaagggca 22320tttcagcgta
gtacggacag cgtccttggt aaacaacaat gcgctattgg atcagttaaa 22380actaatattg
gccacttaga tgaggcggca ggcattgccg gactgataaa ggctgctcta 22440gctctacaat
atggacagat tccaccgagc ttgcactatg ccaatcctaa tccacggatt 22500gattttgacg
caaccccatt ttttgtcaac acagaactac gcgaatggtc aaggaatggt 22560tatcctcggc
gggcgggggt gagttctttt ggtgtgggtg gaactaacag ccatattgtg 22620ctggaggagt
cgcctgtaaa gcaacccaca ttgttctctt ctttgccaga acgcagtcat 22680catctgctga
cgctttctgc ccatacacaa gaggctttgc atgagttggt gcaacgctac 22740atccaacata
acgagacaca ccttgatatt aacttaggcg acctctgttt cacagccaat 22800acgggacgca
agcattttga gcatcgccta gcggttgtag ccgaatcaat ccctggctta 22860caggcacaac
tggaaactgc acagactgcg atttcagcac agaaaaaaaa tgccccgccg 22920acgatcgcat
tcctgtttac aggtcaaggc tcacaataca ttaacatggg gcgcaccctc 22980tacgatactg
aatcaacatt ccgtgcagcc cttgaccgat gtgaaaccat tctccaaaat 23040ttagggatcg
agtccattct ctccgttatt tttggttcat ctgagcatgg actctcatta 23100gatgacacag
cctataccca gcccgcactc tttgccatcg aatacgcgct ctatcaatta 23160tggaagtcgt
ggggcatcca gccctcagtg gtgataggtc atagtgtagg tgaatatgtg 23220tccgcttgtg
tggcgggagt ctttagctta gaggatgggt tgaaactgat tgcagaacga 23280ggacgactga
tacaggcact tcctcgtgat gggagcatgg tttccgtgat ggcaagcgag 23340aagcgtattg
cagatatcat tttaccttat gggggacagg tagggatcgc cgcgattaat 23400ggcccacaaa
gtgttgtaat ttctgggcaa cagcaagcga ttgatgctat ttgtgccatc 23460ttggaaactg
agggcatcaa aagcaagaag ctaaacgtct cccatgcctt ccactcgccg 23520ctagtggaag
caatgttaga ctctttcttg caggttgcac aagaggtcac ttactcgcaa 23580cctcaaatca
agcttatctc taatgtaacg ggaacattgg caagccatga atcttgtccc 23640gatgaacttc
cgatcaccac cgcagagtat tgggtacgtc atgtgcgaca gcccgtccgg 23700tttgcggcgg
gaatggagag ccttgagggt caaggggtaa acgtatttat agaaatcggt 23760cctaaacctg
ttcttttagg catgggacgc gactgcttgc ctgaacaaga gggactttgg 23820ttgcctagtt
tgcgcccaaa acaggatgat tggcaacagg tgttaagtag tttgcgtgat 23880ctatacttag
caggtgtaac cgtagattgg agcagtttcg atcaggggta tgctcgtcgc 23940cgtgtgccac
taccgactta tccttggcag cgagagcggc attgggtaga gccaattatt 24000cgtcaacggc
aatcagtatt acaagccaca aataccacca agctaactcg taacgccagc 24060gtggcgcagc
atcctctgct tggtcaacgg ctgcatttgt cgcggactca agagatttac 24120tttcaaacct
tcatccactc cgacttccca atatgggttg ctgatcataa agtatttgga 24180aatgtcatca
ttccgggtgt cgcctatttt gagatggcac tggcagcagg gaaggcactt 24240aaaccagaca
gtatattttg gctcgaagat gtatccatcg cccaagcact gattattccc 24300gatgaagggc
aaactgtgca aatagtatta agcccacagg aagagtcagc ttattttttt 24360gaaatcctct
ctttagaaaa agaaaactct tgggtgcttc atgcctctgg taagctagtc 24420gcccaagagc
aagtgctaga aaccgagcca attgacttga ttgcgttaca ggcacattgt 24480tccgaagaag
tgtcagtaga tgtgctatat caggaagaaa tggcgcgccg gctggatatg 24540ggtccaatga
tgcgtggggt gaagcagctt tggcgttatc cgctctcctt tgccaaaagt 24600catgatgcga
tcgcactcgc caaggtcagc ttgccagaaa tcttgcttca tgagtccaat 24660gcctaccaat
tccatcctgt aatcttggat gcggggctgc aaatgataac ggtctcttat 24720cctgaagcaa
accaaggcca gacttatgta cctgttggta tagagggtct acaagtctat 24780ggtcgtccca
gttcagaact ttggtgtcgc gcccaatatc ggcctccttt ggatacagat 24840caaaggcagg
gtattgattt gctgccaaag aaattgattg cagacttgca tctatttgat 24900acccagggtc
gtgtggttgc catcatgttt ggtgtgcaat ctgtccttgt gggacgggaa 24960gcaatgttgc
gatcgcaaga tacttggcga aattggcttt atcaagtcct gtggaaacct 25020caagcctgtt
ttggactttt accgaattac ctgccaaccc cagataagat tcggaaacgc 25080ctggaaacaa
agttagcgac attgatcatc gaagctaatt tggcgactta tgcgatcgcc 25140tatacccaac
tggaaaggtt aagtctagct tacgttgtgg cggctttccg acaaatgggc 25200tggctgtttc
aacccggtga gcgtttttcc accgcccaga aggtatcagc gttaggaatc 25260gttgatcaac
atcggcaact attcgctcgt ttgctcgaca ttctagccga agcagacata 25320ctccgcagcg
aaaacttgat gacgatatgg gaagtcattt catacccgga aacgattgat 25380atacaggtac
ttcttgacga cctcgaagcc aaagaagcag aagccgaagt cacactggtt 25440tcccgttgca
gtgcaaaatt ggccgaagta ttacaaggaa aatgtgaccc catacagttg 25500ctctttcccg
caggggacac aacaacgtta agcaaactct atcgtgaagc cccagttttg 25560ggtgttacta
atactctagt ccaagaagcg cttctttccg ccctggagca gttgccgccg 25620gaacgtggtt
ggcgaatttt agagattggt gctggaacag gtggaaccac agcctacttg 25680ttaccgcatc
tgcctgggga tcagacaaaa tatgtcttta ccgatattag tgcctttttt 25740cttgccaaag
cggaagagcg ttttaaagat tacccgtttg tacgttatca ggtattagat 25800atcgaacaag
caccacaggc gcaaggattt gaaccccaaa tatacgattt aatcgtagca 25860gcggatgtct
tgcatgctac tagtgacctg cgtcaaactc ttgtacatat ccggcaatta 25920ttagcgccgg
gcgggatgtt gatcctgatg gaagacagcg aacccgcacg ctgggctgat 25980ttaacctttg
gcttaacaga aggctggtgg aagtttacag accatgactt acgccccaac 26040catccgctat
tgtctcctga gcagtggcaa atcttgttgt cagaaatggg atttagtcaa 26100acaaccgcct
tatggccaaa aatagatagc ccccataaat tgccacggga ggcggtgatt 26160gtggcgcgta
atgaaccagc catcagaaaa ccccgaagat ggctgatctt ggctgacgag 26220gagattggtg
gactactagc caaacagcta cgtgaagaag gagaagattg tatactcctc 26280ttgccagggg
aaaagtacac agagagagat tcacaaacgt ttacaatcaa tcctggagat 26340attgaagagt
ggcaacagtt attgaaccga gtaccgaaca tacaagaaat tgtacattgt 26400tggagtatgg
tttccactga cttagataga gccactattt tcagttgcag cagtacgctg 26460catttagttc
aagcattagc aaactatcca aaaaaccctc gcttgtcact tgtcacccta 26520ggcgcacaag
ccgttaacga acatcatgtt caaaatgtag ttggagcagc cctctggggc 26580atgggaaagg
taattgcact cgaacaccca gagctacaag tagcacaaat ggatttagac 26640ccgaatggga
aggttaaggc gcaagtagaa gtgcttaggg atgaacttct cgccagaaaa 26700gaccctgcat
cagcaatgtc tgtgcctgat ctgcaaacac gacctcatga aaagcaaata 26760gcctttcgtg
agcaaacacg ttatgtggca agactttcgc ccttagaccg ccccaatcct 26820ggagagaaag
gcacacaaga ggctcttacc ttccgtgatg atggcagcta tctgattgct 26880ggtggtttag
gcggactggg gttagtggtg gctcgttttc tggttacaaa tggggctaaa 26940taccttgtgc
tagtcggacg acgtggtgcg agggaggaac agcaagctca attaagcgaa 27000ctagagcaac
tcggagcttc cgtgaaagtt ttacaagccg atattgctga tgcagaacaa 27060ctagcccaag
cactttcagc agtaacctac ccaccattac ggggtgttat tcatgcggca 27120ggtacattga
acgatgggat tctacagcag caaagttggc aagcctttaa agaagtgatg 27180aatcccaagg
tagcaggtgc gtggaaccta catatactga caaaaaatca gcctttagac 27240ttctttgtcc
tgttctcctc cgccacctct ttgttaggta acgctggaca agccaatcac 27300gccgccgcaa
atgctttcct tgatgggtta gcctcctatc gtcgtcactt aggactaccg 27360agcctctcga
ttaattgggg gacatggagc gaagtgggaa ttgcggctcg acttgaacta 27420gataagttgt
ccagcaaaca gggagaggga accattacgc taggacaggg cttacaaatt 27480cttgagcagt
tgctcaaaga cgagaatggg gtgtatcaag tgggtgtcat gcctatcaac 27540tggacacaat
tcttagcaag gcaattgact ccgcagccgt tcttcagcga tgccatgaag 27600agtattgaca
cctctgtagg taaactaacc ttgcaggagc gggactcttg cccccaaggt 27660tacgggcata
atattcgaga gcaattagag aacgctccgc ccaaagaggg tctgactctc 27720ttgcaggctc
atgttcggga gcaggtttcc caagttttgg ggatagacac gaagacatta 27780ttggcagaac
aagacgtggg tttctttacc ctggggatgg attcgctgac ctctgtcgag 27840ttaagaaaca
ggttacaagc cagtttgggc tgctctcttt cttccacttt ggcttttgac 27900tatccaacac
aacaggctct tgtgaattat cttgccaatg aattgctggg aacccctgag 27960cagctacaag
agcctgaatc tgatgaagaa gatcagatat cgtcaatgga tgacatcgtg 28020cagttgctgt
ccgcgaaact agagatggaa atttaagccc atggatgaaa aactaagaac 28080atacgaacga
ttaatcaagc aatcctatca caagatagag gctctggaag ctgaagttaa 28140caggttgaag
caaacccaat gtgaacctat cgccatcgtc ggcatgggct gtcgttttcc 28200tggtgcgaat
agtccagaag cgttttggca gttgttgtgt gatggggttg atgctattcg 28260tgagatacca
aaaaatcgat gggttgttga tgcctacata gatgaaaatt tggaccgcgc 28320agacaagaca
tcaatgcgat ttggcgggtt tgtcgagcaa cttgagaagt ttgatgccca 28380attctttggc
atatcaccgc gagaagcggt ttctcttgac cctcagcaac gtttgttatt 28440agaagtaagt
tgggaagcac tggaaaatgc agcggtgata ccaccttcgg caacgggcgt 28500attcgtcggt
attagtaacc ttgattatcg tgaaacgctc ttgaagcaag gagcaattgg 28560tacttatttt
gcttcgggta atgcccatag cacagccagt ggtcgcttgt cttactttct 28620cggtctgaca
ggcccctgtc tctcgataga tacagcttgt tcttcgtcgt tggtcgctgt 28680acatcagtca
ctgataagtc tgcgtcagcg agaatgtgac ttagcgttgg ttgggggagt 28740ccatcggctg
atagccccag aggaaagtgt ctcgttagca aaagcccata tgttatctcc 28800cgatggtcgt
tgcaaagtct ttgatgcgtc ggcaaacggg tatgtccgag ccgaaggatg 28860tggcatgata
gtcctcaaac gattatcgga cgcgcaagct gatggggata aaatcttggc 28920gttgattcgc
gggtcagcca taaatcaaga cggtcgcacg agtggcttga ccgttccaaa 28980tggtccccaa
caagccgacg tgattcgcca agccctcgcc aatagtggca taagaccaga 29040acaagttaac
tatgtagaag ctcatggcac agggacttcc ctaggagacc cgattgaggt 29100cggcgcgttg
ggaacgatct ttaatcaacg ctcccaacct ttaattattg gttcagttaa 29160aacaaatatt
gggcatctag aagcagcagc agggattgct ggactgatta aagtcgtcct 29220tgccatgcag
catggagaaa ttccacctaa tttacacttt caccagccca atcctcgcat 29280taactgggat
aaattgccaa tcaggatccc cacagaacga acagcttggc ctactggcga 29340tcgcatcgca
gggataagtt ctttcggctt tagtggcact aattctcatg tcgtgttaga 29400ggaagcccca
aaaatagagc cgtctacttt agagattcat tcaaagcagt atgtttttac 29460cttatcagca
gcgacacctc aagcactaca agaacttact cagcgttatg taacttatct 29520cactgaacac
ttacaagaga gtctggcgga tatttgcttt acagccaaca cagggcgcaa 29580acactttaga
catcgctttg cagtagtagc agagtctaaa acccagttgc gccaacaatt 29640ggaaacgttt
gcccaatcgg gagaggggca ggggaagagg acatctctct caaaaatagc 29700ttttctcttt
acaggtcaag gctcacagta tgtggggatg gggcaagaac tttatgagag 29760ccaacccacc
ttccggcaaa ccattgaccg atgtgatgag attcttcgtt cactgttggg 29820caaatcaatc
ctctcaatac tctatcccag ccaacaaatg ggattggaaa cgccatccca 29880aattgatgaa
accgcctata ctcaacccac tcttttttct cttgaatatg cactggcgca 29940gttgtggcgc
tcctggggta ttgagcctga tgtggtgatg gggcatagtg tgggagaata 30000tgtggccgct
tgtgtggcgg gtgtcttttc tttagaggat ggactcaaac taattgctga 30060aagaggccgt
ctgatgcaag aattgcctcc cgatggggcg atggtttcag ttatggccaa 30120taaatcgcgc
atagagcaag caattcaatc tgtcagccga gaggtttcta ttgcggccat 30180caatggacct
gagagtgtgg ttatctctgg taaaagggag atattacaac agattaccga 30240acatctggtt
gccgaaggca ttaagacacg ccaactgaag gtctctcatg cctttcactc 30300accattgatg
gagccaatat taggtcagtt ccgccgagtt gccaatacca tcacctatcg 30360gccaccgcaa
attaaccttg tctcaaatgt cacaggcgga caggtgtata aagaaatcgc 30420tactcccgat
tattgggtga gacatctgca agagactgtc cgttttgcgg atggggttaa 30480ggtgttacat
gaacagaatg tcaatttcat gctcgaaatt ggtcccaaac ccacactgct 30540gggcatggtt
gagttacaaa gttctgagaa tccattttct atgccaatga tgatgcccag 30600tttgcgtcag
aatcgtagcg actggcagca gatgttggag agcttgagtc aactctatgt 30660tcatggtgtt
gagattgact ggatcggttt taataaagac tatgtgcgac ataaagttgt 30720cctgccgaca
tacccatggc agaaggagcg ttactgggta gaattggatc aacagaagca 30780cgccgctaaa
aatctacatc ctctactgga caggtgcatg aagctgcctc gtcataacga 30840aacaattttt
gagaaagaat ttagtctaga gacattgccc tttcttgctg actatcgcat 30900ttatggttca
gttgtgtcgc caggtgcaag ttatctatca atgatactaa gtattgccga 30960gtcgtatgca
aatggtcatt tgaatggagg gaatagtgca aagcaaacca cttatttact 31020aaaggatgtc
acattcccag tacctcttgt gatctctgat gaggcaaatt acatggtgca 31080agttgcttgt
tctctctctt gtgctgcgcc acacaatcgt ggcgacgaga cgcagtttga 31140attgttcagt
tttgctgaga atgtacctga aagtagcagt ataaatgctg attttcagac 31200acccattatt
catgcaaaag ggcaatttaa gcttgaagat acagcacctc ctaaagtgga 31260gctagaagaa
ctacaagcgg gttgtcccca agaaattgat ctcaaccttt tctatcaaac 31320attcacagac
aaaggttttg tttttggatc tcgttttcgc tggttagaac aaatctgggt 31380gggcgatgga
gaagcattgg cgcgtctgcg acaaccggaa agtattgaat cgtttaaagg 31440atatgtgatt
catcccggtt tgttggatgc ctgtacacaa gtcccatttg caatttcgtc 31500tgacgatgaa
aataggcaat cagaaacgac aatgcccttt gcgctgaatg aattacgttg 31560ttatcagcct
gcaaacggac aaatgtggtg ggttcatgca acagaaaaag atagatatac 31620atgggatgtt
tctctgtttg atgagagcgg gcaagttatt gcggaattta taggtttaga 31680agttcgtgct
gctatgcccg aaggcttact aagggcagac ttttggcata actggctcta 31740tacagtgaat
tggcgatcgc aacctctaca aatcccagag gtgctggata ttaataagac 31800aggtgcagaa
acatggcttc tttttgcaca accagaggga ataggagcgg acttagccga 31860atatttgcag
agccaaggaa agcactgtgt ttttgtagtg cctgggagtg agtatacagt 31920gaccgagcaa
cacattggac gcactggaca tcttgatgtg acgaaactga caaaaattgt 31980cacgatcaat
cctgcttctc ctcatgacta taaatatttt ttagaaactc tgacggacat 32040tagattacct
tgtgaacata tactctattt atggaatcgt tatgatttaa caaatacttc 32100taatcatcgg
acagaattga ctgtaccaga tatagtctta aacttatgta ctagtcttac 32160ttatttggta
caagccctta gccacatggg tttttccccg aaattatggc taattacaca 32220aaatagtcaa
gcggttggta gtgacttagc gaatttagaa atcgaacaat ccccattatg 32280ggcattgggt
cgaagcatcc gcgccgaaca ccctgaattt gattgccgtt gtttagattt 32340tgacacgctc
tcaaatatcg caccactctt gttgaaagag atgcaagcta tagactatga 32400atctcaaatt
gcttaccgac aaggaacgcg ctatgttgca cgactaattc gtaatcaatc 32460agaatgtcac
gcaccgattc aaacaggaat ccgtcctgat ggcagctatt tgattacagg 32520tggattaggc
ggtctaggat tgcaggtagc actcgccctt gcggacgctg gagcaagaca 32580cttgatcctc
aatagtcgcc gtggtacggt ctccaaagaa gcccagttaa ttattgaccg 32640actacgccaa
gaggatgtta gggttgattt gattgcggca gatgtctctg atgcggcaga 32700tagcgaacga
ctcttagtag aaagtcagcg caagacctct cttcgaggga ttgtccatgt 32760tgcgggagtc
ttggatgatg gcatcctgct ccaacaaaat caagagcgtt ttgaaaaagt 32820gatggcggct
aaggtacgcg gagcttggca tctggaccaa cagagccaaa ccctcgattt 32880agatttcttt
gttgcgttct catctgttgc gtcgctcata gaagaaccag gacaagccaa 32940ttacgccgca
gcgaatgcgt ttttggattc attaatgtat tatcgtcaca taaagggatc 33000taatagcttg
agtatcaact ggggggcttg ggcagaagtc ggcatggcag ccaatttatc 33060atgggaacaa
cggggaatcg cggcaatttc tccaaagcaa gggaggcata ttctcgtcca 33120acttattcaa
aaacttaatc agcatacaat cccccaagtt gctgtacaac cgaccaattg 33180ggctgaatat
ctatcccatg atggcgtgaa tatgccattc tatgaatatt ttacacacca 33240cttgcgtaac
gaaaaagaag ccaaattgcg gcaaacagca ggcagcacct cagaggaagt 33300cagtctgcgg
caacagcttc aaacactctc agagaaagac cgggatgccc ttttgatgga 33360acatcttcaa
aaaactgcga tcagagttct cggtttggca tctaatcaaa aaattgatcc 33420ctatcaggga
ttgatgaata tgggactaga ctctttgatg gcggttgaat ttcggaatca 33480cttgatacgt
agtttagaac gccctctgcc agccactctg ctctttaatt gcccaacact 33540tgattcattg
catgattacc tagtcgcaaa aatgtttgat gatgcccctc agaaggcaga 33600gcaaatggca
caaccaacaa cactgacagc acacagcata tcaatagaat ccaaaataga 33660tgataacgaa
agcgtggatg acattgcaca aatgctggca caagcactca atatcgcctt 33720tgagtagcaa
tgggcagccc ttaacctttc aaggtgacta atcaatagac ctcttgcaca 33780attgtttctg
tggtacaata agtggtttta ggttttatgt atatttgggt gttgttgcga 33840tagctacgct
cgccgaaggc atcacaaatt caaagatagg cgtgtgattc taacttttag 33900cttaacgggt
gacaaggcgg ctaaagagct tgtttcataa gggatagagc ctgaaagccc 33960cgttgaaaaa
agaggcgttt atgaggcttg agattgatta aattcagagc taaatcagcc 34020cataattcca
taccataaat ccatagttgt ccgtagagac caaagctaaa atcactttga 34080cgtgggtact
tgtcctgatg ttgttgaatc ccacattcag catgagtaaa tatactcaaa 34140atatttttcc
cagcaggtta agtgttctaa tcctaagtct gatatcttat ttttgataag 34200ggacttaccg
cgtaatagtt aaatttttgt atagcctaat tttacttggt ttaaggctct 34260tttttgctct
tttggtgaat tattcaggat aatcaaagat gagtcagccc aattatggca 34320ttttgatgaa
aaatgcgttg aacgaaataa atagcctacg atcgcaacta gctgcggtag 34380aagcccaaaa
aaatgagtct attgccattg ttggtatgag ttgccgtttt ccaggcggtg 34440caactactcc
agagcgtttt tgggtattac tgcgcgaggg tatatcagcc attacagaaa 34500tccctgctga
tcgctgggat gttgataaat attatgatgc tgaccccaca tcgtccggta 34560aaatgcatac
tcgttacggc ggttttctga atgaagttga tacatttgag ccatcattct 34620ttaatattgc
tgcccgtgaa gccgttagca tggatccaca gcaacgcttg ctacttgaag 34680tcagttggga
agctctggaa tccggtaata ttgttcctgc aactcttttt gatagttcca 34740ctggtgtatt
tatcggtatt ggtggtagca actacaaatc tttaatgatc gaaaacagga 34800gtcggatcgg
gaaaaccgat ttgtatgagt taagtggcac tgatgtgagt gttgctgccg 34860gcaggatatc
ctatgtcctg ggtttgatgg gtcccagttt tgtgattgat acagcttgtt 34920catcttcttt
ggtctcagtt catcaagcct gtcagagtct gcgtcagaga gaatgtgatc 34980tagcactagc
tggtggagtc ggtttactca ttgatccaga tgagatgatt ggtctttctc 35040aaggggggat
gctggcacct gatggtagtt gtaaaacatt tgatgccaat gcaaatggct 35100atgtgcgagg
cgaaggttgt gggatgattg ttctaaaacg tctctcggat gcaacagccg 35160atggggataa
tattcttgcc atcattcgtg ggtctatggt taatcatgat ggtcatagca 35220gtggtttaac
tgctccaaga ggccccgcac aagtctctgt cattaagcaa gccttagata 35280gagcaggtat
tgcaccggat gccgtaagtt atttagaagc ccatggtaca ggcacacccc 35340ttggtgatcc
tatcgagatg gattcattga acgaagtgtt tggtcggaga acagaaccac 35400tttgggtcgg
ctcagttaag acaaatattg gtcatttaga agccgcgtcc ggtattgcag 35460ggctgattaa
ggttgtcttg atgctaaaaa acaagcagat tcctcctcac ttgcatttca 35520agacaccaaa
tccatatatt gattggaaaa atctcccggt cgaaattccg accacccttc 35580atgcttggga
tgacaagaca ttgaaggaca gaaagcgaat tgcaggggtt agttctttta 35640gtttcagtgg
tactaacgcc cacattgtat tatctgaagc cccatctagc gaactaatta 35700gtaatcatgc
ggcagtggaa agaccatggc acttgttaac ccttagtgct aagaatgagg 35760aagcgttggc
taacttggtt gggctttatc agtcatttat ttctactact gatgcaagtc 35820ttgccgatat
atgctacact gctaatacgg cacgaaccca tttttctcat cgccttgctc 35880tatcggctac
ttcacacatc caaatagagg ctcttttagc cgcttataag gaagggtcgg 35940tgagtttgag
catcaatcaa ggttgtgtcc tttccaacag tcgtgcgccg aaggtcgctt 36000ttctctttac
aggtcaaggt tcgcaatatg tgcaaatggc tggagaactt tatgagaccc 36060agcctacttt
ccgtaattgc ttagatcgct gtgccgaaat cttgcaatcc atcttttcat 36120cgagaaacag
cccttgggga aacccactgc tttcggtatt atatccaaac catgagtcaa 36180aggaaattga
ccagacggct tatacccaac ctgccctttt tgctgtagaa tatgccctag 36240cacagatgtg
gcggtcgtgg ggaatcgagc cagatatcgt aatgggtcat agcataggtg 36300aatatgtggc
agcttgtgtg gcggggatct tttctctgga ggatggtctc aaacttgctg 36360ccgaaagagg
ccgtttgatg caggcgctac cacaaaatgg cgagatggtt gctatatcgg 36420cctcccttga
ggaagttaag ccggctattc aatctgacca gcgagttgtg atagcggcgg 36480taaatggacc
acgaagtgtc gtcatttcgg gcgatcgcca agctgtgcaa gtcttcacca 36540acaccctaga
agatcaagga atccggtgca agagactgtc tgtttcacac gctttccact 36600ctccattgat
gaaaccaatg gagcaggagt tcgcacaggt ggccagggaa atcaactata 36660gtcctccaaa
aatagctctt gtcagtaatc taaccggcga cttgatttca cctgagtctt 36720ccctggagga
aggagtgatc gcttcccctg gttactgggt aaatcattta tgcaatcctg 36780tcttgttcgc
tgatggtatt gcaactatgc aagcgcagga tgtccaagtc ttccttgaag 36840ttggaccaaa
accgacctta tcaggactag tgcaacaata ttttgacgag gttgcccata 36900gcgatcgccc
tgtcaccatt cccaccttgc gccccaagca acccaactgg cagacactat 36960tggagagttt
gggacaactg tatgcgcttg gtgtccaggt aaattgggcg ggctttgata 37020gagattacac
cagacgcaaa gtaagcctac ccacctatgc ttggaagcgt caacgttatt 37080ggctagagaa
acagtccgct ccacgtttag aaacaacaca agttcgtccc gcaactgcca 37140ttgtagagca
tcttgaacaa ggcaatgtgc cgaaaatcgt ggacttgtta gcggcgacgg 37200atgtactttc
aggcgaagca cggaaattgc tacccagcat cattgaacta ttggttgcaa 37260aacatcgtga
ggaagcgaca cagaagccca tctgcgattg gctttatgaa gtggtttggc 37320aaccccagtt
gctgacccta tctaccttac ctgctgtgga aacagagggt agacaatggc 37380tcatcttcgc
cgatgctagt ggacacggtg aagcacttgc ggctcaatta cgtcagcaag 37440gggatataat
tacgcttgtc tatgctggtc taaaatatca ctcggctaat aataaacaaa 37500ataccggggg
ggacatccca tattttcaga ttgatccgat ccaaagggag gattatgaaa 37560ggttgtttgc
tgctttgcct ccactgtatg gtattgttca tctttggagt ttagatatac 37620ttagcttgga
caaagtatct aacctaattg aaaatgtaca attaggtagt ggcacgctat 37680taaatttaat
acagacagtc ttgcaacttg aaacgcccac ccctagcttg tggctcgtga 37740caaagaacgc
gcaagctgtg cgtaaaaacg atagcctagt cggagtgctt cagtcaccct 37800tatggggtat
gggtaaggtg atagccttag aacaccctga actcaactgt gtatcaatcg 37860accttgatgg
tgaagggctt ccagatgaac aagccaagtt tctggcggct gaactccgcg 37920ccgcctccga
gttcagacat accaccattc cccacgaaag tcaagttgct tggcgtaata 37980ggactcgcta
tgtgtcacgg ttcaaaggtt atcagaagca tcccgcgacc tcatcaaaaa 38040tgcctattcg
accagatgcc acttatttga tcacgggcgg ctttggtggt ttgggcttgc 38100ttgtggctcg
ttggatggtt gaacaggggg ctacccatct atttctgatg ggacgcagcc 38160aacccaaacc
agccgcccaa aaacaactgc aagagatagc cgcgctgggt gcaacagtga 38220cggtggtgca
agccgatgtt ggcatccgct cccaagtagc caatgtgttg gcacagattg 38280ataaggcata
tcctttggct ggtattattc atactgccgg tgtattagac gacggaatct 38340tattgcagca
aaattgggcg cgttttagca aggtgttcgc ccccaaacta gagggagctt 38400ggcatctaca
tacactgact gaagagatgc cgcttgattt ctttatttgt ttttcctcaa 38460cagcaggatt
gctgggcagt ggtggacaag ctaactatgc tgctgccaat gcctttttag 38520atgcctttgc
ccatcatcgg cgaatacaag gcttgccagc tctctcgatt aactgggacg 38580cttggtctca
agtgggaatg acggtacgtc tccaacaagc ttcttcacaa agcaccacag 38640ttgggcaaga
tattagcact ttggaaattt caccagaaca gggattgcaa atctttgcct 38700atcttctgca
acaaccatcc gcccaaatag cggccatttc taccgatggg cttcgcaaga 38760tgtacgacac
aagctcggcc ttttttgctt tacttgatct tgacaggtct tcctccacta 38820cccaggagca
atctacactt tctcatgaag ttggccttac cttactcgaa caattgcagc 38880aagctcggcc
aaaagagcga gagaaaatgt tactgcgcca tctacagacc caagttgctg 38940cggtcttgcg
tagtcccgaa ctgcccgcag ttcatcaacc cttcactgac ttggggatgg 39000attcgttgat
gtcacttgaa ttgatgcggc gtttggaaga aagtctgggg attcagatgc 39060ctgcaacgct
tgcattcgat tatcctatgg tagaccgttt ggctaagttt atactgactc 39120aaatatgtat
aaattctgag ccagatacct cagcagttct cacaccagat ggaaatgggg 39180aggaaaaaga
cagtaataag gacagaagta ccagcacttc cgttgactca aatattactt 39240ccatggcaga
agatttattc gcactcgaat ccttactaaa taaaataaaa agagatcaat 39300aatagagctg
ttgggaaata aaagcatatt tccggatgac agaacttccc ccatcccgat 39360tgaatttatg
ctgcatctaa atagaagttc catagccctg cactgaccaa catcaattga 39420tcatcaaaat
cggtcacacg attcctatat gtgggataaa atttgcagta cagcaggata 39480taaaatagtt
tttcctctat acttctgagt gtaggcttgc gtccgccccc gggcgcacgt 39540ttgcggtttg
ctaaggagtt gaacacggtg cgttcatagg tatcagcaaa ctgagataac 39600agctcgttga
atgcttggcg gttaagtcca gtcattgctc gtagcagtcg ctcttgattc 39660aggatgcggt
ctaagttcaa cattaatgtc accctacttg tctgcttgat tattatccct 39720tattttccaa
caactctatt atagcttatc ttattttgga gtttaactac atgaaaatcg 39780ctgtaaagac
tcctactgag tgaaagtgaa cttctttccc acgtattcga gtagctgttg 39840taagctggcc
tcgatggaaa gttccgaagt ttccaccagt aaatctggtg ttctcggtgg 39900ttcgtaggga
gcgctaattc ccgtaaaaga ctcaatttct ccacggcgtg cttttgcata 39960gagacccttg
gggtcacgtt gttcacaaat ttccatcgga gttgcaatat atacttcatg 40020aaacagatct
ccggacagaa tacggatttg ctcccggtct ttcctgtaag gtgaaatgaa 40080agcagtaatc
actaaacaac ccgaatccgc aaaaagtttg gccacctcgc caatacgacg 40140aatattttcc
gcacgatcag cagcagaaaa tcccaagtca gcacataatc catgacggat 40200attgtcacca
tcaaggacaa aagtatacca acctttctgg aacaaaatcc gctctaattc 40260tagagccaat
gttgttttac ctgatcctga taatccagtg aaccatagaa ttccatttcg 40320gtgaccattc
tttaaacaac gatcaaatgg ggacacaaga tgttttgtat gttgaatatt 40380gcttgatttc
atatctatga taaatatgat aaaagtgatt ggccaaacag aactgctcac 40440ccaataatat
agttaaaggt tattttttca aaaactcctt ctaaattata gctcacaatt 40500atgcctaaat
actttaatac tgctggaccc tgtaaatccg aaatccacta tatgctctct 40560cccacagctc
gactaccgga tttgaaagca ctaattgacg gagaaaacta ctttataatt 40620cacgcgccgc
gacaagtcgg caaaactaca gctatgatag ccttagcacg agaattgact 40680gatagtggaa
aatataccgc agttattctt tccgttgaag tgggatcagt attctcccat 40740aatccccagc
aagcggagca ggttatttta gaagaatgga aacaggcaat caaattttat 40800ttacccaaag
aactacaacc atcctattgg ccagagcgtg aaacagactc aggaataggc 40860aaaactttaa
gtgagtggtc cgcacaatct ccaagacctc ttgtaatctt tttacatgaa 40920atcgattccc
taacagatga agctttaatc ctaattttaa gacaattacg ctcaggtttt 40980ccccgtcgtc
ctcggggatt tccccattcg gtggggttaa ttggtatgcg ggatgtgcgg 41040gactataagg
ttaaatctgg tggaagtgaa cgactgaata cgtcaagtcc tttcaatatc 41100aaagcggaat
ccttgacttt aagtaatttc actctgtcag aggtggaaga actttactta 41160caacatacgc
aagctacagg acaaattttt accccggaag caattaaaca agcattttat 41220ttaaccgatg
ggcaaccatg gttagtaaac gccctagctc gtcaagccac tcaggtgtta 41280gtgaaagata
ttactcaacc cattaccgct gaagtaatta accaagccaa agaagttctg 41340attcagcgcc
aggataccca tttggatagt ttggcagagc gcttacggga agatcgggtc 41400aaagccatta
ttcaacctat gttagctgga tcggacttac cagatacccc agaggatgat 41460cgccgtttct
tgctagattt aggcttggta aagcgcagtc ccttgggagg actaaccatt 41520gccaatccca
tttaccagga ggtgattcct cgtgttttgt cccagggtag tcaggatagt 41580ctaccccaga
ttcaacctac ttggttaaat actgataata ctttaaatcc tgacaaactc 41640ttaaatgctt
tcctagagtt ttggcgacaa catggggaac cattactcaa aagtgcgcct 41700tatcatgaaa
ttgctcccca tttagttttg atggcgtttt tacatcgggt agtgaatggt 41760ggtggcactt
tagaacggga atatgccgtt ggttctggaa gaatggatat ttgtttacgc 41820tatggcaagg
tagtgatggg catagagtta aaggtttggg ggggaaaatc ggatccgtta 41880acgaagggtt
tgacccaatt ggataaatat ctgggtgggt taggattaga tagaggttgg 41940ttagtaattt
ttgatcaccg tccgggatta ccacccatgg gtgagaggat tagtatggaa 42000caggccatta
gtccagaggg aagaaccatt acagtgattc gtagctagag cgttagatat 42060cagatgattg
aacctcaatt attgtgcaac gccacatttt ctttccaaag atgtatgtta 42120aactctagta
aactctaatt aggtcgagaa agagat 42156
SEQ ID NO: 81
LENGTH: 5631
TYPE: DNA
ORGANISM: Cylindrospermopsis raciborskii AWT205
SEQUENCE: 81
atgcggtcta
agttcaacat taatgtcacc ctacttgtct gcttgattat tatcccttat 60tttccaacaa
ctctaatgaa agtacctata acagcaaacg aagatgcagc tacattactt 120cagcgtgttg
gactgtccct aaaggaagca caccaacaac ttgaggcaat gcaacgccga 180gcgcacgaac
cgatcgcaat tgtggggctg gggctgcggt ttccgggagc tgattcacca 240cagacattct
ggaaactact tcagaatggt gttgatatgg tcaccgaaat ccctagcgat 300cgctgggcag
ttgatgaata ctatgatccc caacctgggt gtccaggcaa aatgtatatt 360cgtgaagccg
cttttgttga tgcagtggat aaattcgatg cctcgttttt tgatatttcg 420ccacgtgaag
cggccaatat agatccccag catagaatgt tgctggaggt agcttgggag 480gcactcgaaa
gggctggcat tgctcccagc caattgatgg atagccaaac gggggtattt 540gtcgggatga
gcgaaaatga ctattatgct cacctagaaa atacagggga tcatcataat 600gtctatgcgg
caacgggcaa tagcaattac tatgctccgg ggcgtttatc ctatctattg 660gggcttcaag
gacctaacat ggtcgttgat agtgcctgtt cctcctcctt agtggctgta 720catcttgcct
gtaatagttt gcggatggga gaatgtgatc tggcactggc tggtggcgtt 780cagcttatgt
taatcccaga ccctatgatt gggactgccc agttaaatgc ctttgcgacc 840gatggtcgta
gtaaaacatt tgacgctgcc gccgatggct atggacgcgg cgaaggttgt 900ggcatgattg
tacttaaaag aataagtgac gcgatcgtgg cagacgatcc aattttagcc 960gtaatccggg
gtagtgcagt caatcatggc gggcgtagca gtggtttaac tgcccctaat 1020aagctgtctc
aagaagcctt actgcgtcag gcactacaaa acgccaaggt tcagccggaa 1080gcagtcagtt
atatcgaagc ccatggcaca gggacacaac tgggcgaccc gattgaggtg 1140ggagcattaa
cgaccgtctt tggatcttct cgttcagaac ccttgtggat tggctctgtc 1200aaaactaata
tcggacacct agaaccagcc gctggtattg cggggttaat aaaagtcatt 1260ttatcattac
aagaaaaaca gattcctccc agtctccatt ttcaaaaccc taatcccttc 1320attgattggg
aatcttcgcc agttcaagtg ccgacacagt gtgtaccctg gactgggaaa 1380gagcgcgtcg
ctggagttag ctcgtttggt atgagcggta caaactgtca tctagttgtc 1440gcagaagcac
ctgtccgcca aaacgaaaaa tctgaaaatg caccggagcg tccttgtcac 1500attctgaccc
tttcagccaa aaccgaagcg gcactcaacg cattggtagc ccgttacatg 1560gcatttctca
gggaagcgcc cgccatatcc ctagctgatc tttgttatag tgccaatgtc 1620gggcgtaatc
tttttgccca tcgcttaagt tttatctccg agaacatcgc gcagttatca 1680gaacaattag
aacactgccc acagcaggct acaatgccaa cgcaacataa tgtgatacta 1740gataatcaac
tcagccctca aatcgctttt ctgtttactg gacaaggttc gcagtacatc 1800aacatggggc
gtgagcttta cgaaactcag cccaccttcc gtcggattat ggacgaatgt 1860gacgacattc
tgcatccatt gttgggtgaa tcaattctga acatactcta cacttcccct 1920agcaaactta
atcaaaccgt ttatacccaa cctgcccttt ttgcttttga atatgcccta 1980gcaaaactat
ggatatcatg gggtattgag cctgatgtcg tactgggtca cagcgtgggt 2040gaatatgtag
ccgcttgtct ggcgggtgtc tttagtttag aagatgggtt aaaactcatt 2100gcatctcgtg
gatgtttgat gcaagcctta ccgccgggga aaatgcttag tatcagaagc 2160aatgagatcg
gagtgaaagc gctcatcgcg ccttatagtg cagaagtatc aattgcagca 2220atcaatggac
agcaaagcgt ggtgatctcc ggcaaagctg aaattataga taatttagca 2280gcagagtttg
catcggaagg catcaaaaca cacctaatta cagtctccca cgctttccac 2340tcgccaatga
tgacccccat gctgaaagca ttccgagacg ttgccagcac catcagctat 2400aggtcaccca
gtttatcact gatttctaac ggtacagggc aattggcaac aaaggaggtt 2460gctacacctg
attattgggt gcgtcatgtc cattctaccg tccgttttgc cgatggtatt 2520gccacattgg
cagaacagaa tactgacatc ctcctagaag taggacccaa accaatattg 2580ttgggtatgg
caaagcagat ttatagtgaa aacggttcag ctagtcatcc gctcatgcta 2640cccagtttgc
gtgaagatgg caacgattgg cagcagatgc tttctacttg tggacaactt 2700gtagttaatg
gagtcaagat tgactgggcg ggttttgaca aggattattc acgacacaaa 2760atattgttgc
ccacctatcc gtttcagaga gaacgatatt ggattgaaag ctccgtcaaa 2820aagccccaaa
aacaggagct gcgcccaatg ttggataaga tgatccggct accatcagag 2880aacaaagtgg
tgtttgaaac cgagtttggc gtgcgacaga tgcctcatat ctccgatcat 2940cagatatacg
gtgaagtcat tgtaccgggg gcagtattag cttccttaat cttcaatgca 3000gcgcaggttt
tatacccaga ctatcagcat gaattaactg atattgcttt ttatcagcca 3060attatctttc
atgacgacga tacggtgatc gtgcaggcga ttttcagccc tgataagtca 3120caggagaatc
aaagccatca aacatttcca cccatgagct tccagattat tagcttcatg 3180ccggatggtc
ccttagagaa caaaccgaaa gtccatgtca cagggtgtct gagaatgttg 3240cgcgatgccc
aaccgccaac actctccccg accgaaatac gtcagcgctg tccacatacc 3300gtaaatggtc
atgactggta caatagctta gtcaaacaaa aatttgaaat gggtccttcc 3360tttaggtggg
tacagcaact ttggcatggg gaaaatgaag cattgacccg tcttcacata 3420ccagatgtgg
tcggctctgt atcaggacat caacttcacg gcatattgct cgatggttca 3480ctttcaacca
ccgctgtcat ggagtacgag tacggagact ccgcgaccag agttcctttg 3540tcatttgctt
ctctgcaact gtacaaaccc gtcacgggaa cagagtggtg gtgctacgcg 3600aggaagattg
gggaattcaa atatgacttc cagattatga atgaaatcgg ggaaaccttg 3660gtgaaagcaa
ttggctttgt acttcgtgaa gcctctcccg aaaaattcct cagaacaaca 3720tacgtacaca
actggcttgt agacattgaa tggcaagctc aatcaacttc cctagtccct 3780tctgatggca
ctatctctgg cagttgtttg gttttatcag atcagcatgg aacaggggct 3840gcattggcac
aaaggctaga caatgctgga gtgccagtga ccatgatcta tgctgatctg 3900atactggaca
attacgaatt aatattccgt actttgccag atttacaaca agtcgtctat 3960ttatgggggt
tggatcaaaa agaggattgt caccccatga agcaagcaga ggataactgt 4020acatcggtgc
tatatcttgt gcaagcatta ctcaatacct actcaacccc gccatccctg 4080cttattgtca
cctgtgatgc acaagcggtg gttgaacaag atcgagtaaa tggcttcgcc 4140caatcgtctt
tgttgggact tgccaaagtt atcatgctag aacacccaga attgtcctgt 4200gtttacatgg
atgtggaagc cggatattta cagcaagatg tggcgaacac gatatttaca 4260cagctaaaaa
gaggccatct atcaaaggac ggagaagaga gtcagttggc ttggcgcaat 4320ggacaagcat
acgtagcacg tcttagtcaa tataaaccca aatccgaaca actggttgag 4380atccgcagcg
atcgcagcta tttgatcact ggtggacggg gcggtgtcgg cttacaaatc 4440gcacggtggt
tagtggaaaa gggggctaaa catctcgttt tgttggggcg cagtcagacc 4500agttccgaag
tcagtctggt gttggatgag ctagaatcag ccggggcgca aatcattgtg 4560gctcaagctg
atattagcga tgagaaggta ttagcgcaga ttctgaccaa tctaaccgta 4620cctctgtgtg
gtgtaatcca cgccgcagga gtgcttgatg atgcgagtct actccaacaa 4680actccagcca
agctcaaaaa agttctattg ccaaaagcag agggggcttg gattctgcat 4740aatttgaccc
tggagcagcg actagacttc tttgttctct tttcttctgc cagttctcta 4800ttaggtgcgc
cagggcaggc caactattca gcagccaatg ctttcctaga tggtttagct 4860gcctatcggc
gagggcgagg actcccctgt ttgtctatct gctggggggc atgggatcaa 4920gtcggtatgg
ctgcacgaca agggctactg gacaagttac cgcaaagagg tgaagaggcc 4980atcccgttac
agaaaggctt agacctcttc ggcgaattac tgaacgagcc agccgctcaa 5040attggtgtga
tcccaattca atggactcgc ttcttggatc atcaaaaagg taatttgcct 5100ttttatgaga
agttttctaa gtctagccgg aaagcgcaga gttacgattc gatggcagtc 5160agtcacacag
aagatattca gaggaaactg aagcaagctg ctgtgcaaga tcgaccaaaa 5220ttattagaag
tgcatcttcg ctctcaagtc gctcaactgt taggaataaa cgtggcagag 5280ctaccaaatg
aagaaggaat tggttttgtt acattaggtc ttgactcgct cacctctatt 5340gaactgcgta
acagtttaca acgcacatta gattgttcat tacctgtcac ctttgctttt 5400gactacccaa
ctatagaaat agcggttaag tacctaacac aagttgtaat tgcaccgatg 5460gaaagcacag
catcgcagca aacagactct ttatcagcaa tgttcacaga tacttcgtcc 5520atcgggagaa
ttcttgacaa cgaaacagat gtgttagaca gcgaaatgca aagtgatgaa 5580gatgaatctt
tgtctacact tatacaaaaa ttatcaacac atttggatta g
5631
82 1876 PRT Cylindrospermopsis raciborskii AWT205 82
Met Arg Ser Lys Phe
Asn Ile Asn Val Thr Leu Leu Val Cys Leu Ile1 5
10 15Ile Ile Pro Tyr Phe Pro Thr Thr Leu Met Lys
Val Pro Ile Thr Ala 20 25
30Asn Glu Asp Ala Ala Thr Leu Leu Gln Arg Val Gly Leu Ser Leu Lys
35 40 45Glu Ala His Gln Gln Leu Glu Ala
Met Gln Arg Arg Ala His Glu Pro 50 55
60Ile Ala Ile Val Gly Leu Gly Leu Arg Phe Pro Gly Ala Asp Ser Pro65
70 75 80Gln Thr Phe Trp Lys
Leu Leu Gln Asn Gly Val Asp Met Val Thr Glu 85
90 95Ile Pro Ser Asp Arg Trp Ala Val Asp Glu Tyr
Tyr Asp Pro Gln Pro 100 105
110Gly Cys Pro Gly Lys Met Tyr Ile Arg Glu Ala Ala Phe Val Asp Ala
115 120 125Val Asp Lys Phe Asp Ala Ser
Phe Phe Asp Ile Ser Pro Arg Glu Ala 130 135
140Ala Asn Ile Asp Pro Gln His Arg Met Leu Leu Glu Val Ala Trp
Glu145 150 155 160Ala Leu
Glu Arg Ala Gly Ile Ala Pro Ser Gln Leu Met Asp Ser Gln
165 170 175Thr Gly Val Phe Val Gly Met
Ser Glu Asn Asp Tyr Tyr Ala His Leu 180 185
190Glu Asn Thr Gly Asp His His Asn Val Tyr Ala Ala Thr Gly
Asn Ser 195 200 205Asn Tyr Tyr Ala
Pro Gly Arg Leu Ser Tyr Leu Leu Gly Leu Gln Gly 210
215 220Pro Asn Met Val Val Asp Ser Ala Cys Ser Ser Ser
Leu Val Ala Val225 230 235
240His Leu Ala Cys Asn Ser Leu Arg Met Gly Glu Cys Asp Leu Ala Leu
245 250 255Ala Gly Gly Val Gln
Leu Met Leu Ile Pro Asp Pro Met Ile Gly Thr 260
265 270Ala Gln Leu Asn Ala Phe Ala Thr Asp Gly Arg Ser
Lys Thr Phe Asp 275 280 285Ala Ala
Ala Asp Gly Tyr Gly Arg Gly Glu Gly Cys Gly Met Ile Val 290
295 300Leu Lys Arg Ile Ser Asp Ala Ile Val Ala Asp
Asp Pro Ile Leu Ala305 310 315
320Val Ile Arg Gly Ser Ala Val Asn His Gly Gly Arg Ser Ser Gly Leu
325 330 335Thr Ala Pro Asn
Lys Leu Ser Gln Glu Ala Leu Leu Arg Gln Ala Leu 340
345 350Gln Asn Ala Lys Val Gln Pro Glu Ala Val Ser
Tyr Ile Glu Ala His 355 360 365Gly
Thr Gly Thr Gln Leu Gly Asp Pro Ile Glu Val Gly Ala Leu Thr 370
375 380Thr Val Phe Gly Ser Ser Arg Ser Glu Pro
Leu Trp Ile Gly Ser Val385 390 395
400Lys Thr Asn Ile Gly His Leu Glu Pro Ala Ala Gly Ile Ala Gly
Leu 405 410 415Ile Lys Val
Ile Leu Ser Leu Gln Glu Lys Gln Ile Pro Pro Ser Leu 420
425 430His Phe Gln Asn Pro Asn Pro Phe Ile Asp
Trp Glu Ser Ser Pro Val 435 440
445Gln Val Pro Thr Gln Cys Val Pro Trp Thr Gly Lys Glu Arg Val Ala 450
455 460Gly Val Ser Ser Phe Gly Met Ser
Gly Thr Asn Cys His Leu Val Val465 470
475 480Ala Glu Ala Pro Val Arg Gln Asn Glu Lys Ser Glu
Asn Ala Pro Glu 485 490
495Arg Pro Cys His Ile Leu Thr Leu Ser Ala Lys Thr Glu Ala Ala Leu
500 505 510Asn Ala Leu Val Ala Arg
Tyr Met Ala Phe Leu Arg Glu Ala Pro Ala 515 520
525Ile Ser Leu Ala Asp Leu Cys Tyr Ser Ala Asn Val Gly Arg
Asn Leu 530 535 540Phe Ala His Arg Leu
Ser Phe Ile Ser Glu Asn Ile Ala Gln Leu Ser545 550
555 560Glu Gln Leu Glu His Cys Pro Gln Gln Ala
Thr Met Pro Thr Gln His 565 570
575Asn Val Ile Leu Asp Asn Gln Leu Ser Pro Gln Ile Ala Phe Leu Phe
580 585 590Thr Gly Gln Gly Ser
Gln Tyr Ile Asn Met Gly Arg Glu Leu Tyr Glu 595
600 605Thr Gln Pro Thr Phe Arg Arg Ile Met Asp Glu Cys
Asp Asp Ile Leu 610 615 620His Pro Leu
Leu Gly Glu Ser Ile Leu Asn Ile Leu Tyr Thr Ser Pro625
630 635 640Ser Lys Leu Asn Gln Thr Val
Tyr Thr Gln Pro Ala Leu Phe Ala Phe 645
650 655Glu Tyr Ala Leu Ala Lys Leu Trp Ile Ser Trp Gly
Ile Glu Pro Asp 660 665 670Val
Val Leu Gly His Ser Val Gly Glu Tyr Val Ala Ala Cys Leu Ala 675
680 685Gly Val Phe Ser Leu Glu Asp Gly Leu
Lys Leu Ile Ala Ser Arg Gly 690 695
700Cys Leu Met Gln Ala Leu Pro Pro Gly Lys Met Leu Ser Ile Arg Ser705
710 715 720Asn Glu Ile Gly
Val Lys Ala Leu Ile Ala Pro Tyr Ser Ala Glu Val 725
730 735Ser Ile Ala Ala Ile Asn Gly Gln Gln Ser
Val Val Ile Ser Gly Lys 740 745
750Ala Glu Ile Ile Asp Asn Leu Ala Ala Glu Phe Ala Ser Glu Gly Ile
755 760 765Lys Thr His Leu Ile Thr Val
Ser His Ala Phe His Ser Pro Met Met 770 775
780Thr Pro Met Leu Lys Ala Phe Arg Asp Val Ala Ser Thr Ile Ser
Tyr785 790 795 800Arg Ser
Pro Ser Leu Ser Leu Ile Ser Asn Gly Thr Gly Gln Leu Ala
805 810 815Thr Lys Glu Val Ala Thr Pro
Asp Tyr Trp Val Arg His Val His Ser 820 825
830Thr Val Arg Phe Ala Asp Gly Ile Ala Thr Leu Ala Glu Gln
Asn Thr 835 840 845Asp Ile Leu Leu
Glu Val Gly Pro Lys Pro Ile Leu Leu Gly Met Ala 850
855 860Lys Gln Ile Tyr Ser Glu Asn Gly Ser Ala Ser His
Pro Leu Met Leu865 870 875
880Pro Ser Leu Arg Glu Asp Gly Asn Asp Trp Gln Gln Met Leu Ser Thr
885 890 895Cys Gly Gln Leu Val
Val Asn Gly Val Lys Ile Asp Trp Ala Gly Phe 900
905 910Asp Lys Asp Tyr Ser Arg His Lys Ile Leu Leu Pro
Thr Tyr Pro Phe 915 920 925Gln Arg
Glu Arg Tyr Trp Ile Glu Ser Ser Val Lys Lys Pro Gln Lys 930
935 940Gln Glu Leu Arg Pro Met Leu Asp Lys Met Ile
Arg Leu Pro Ser Glu945 950 955
960Asn Lys Val Val Phe Glu Thr Glu Phe Gly Val Arg Gln Met Pro His
965 970 975Ile Ser Asp His
Gln Ile Tyr Gly Glu Val Ile Val Pro Gly Ala Val 980
985 990Leu Ala Ser Leu Ile Phe Asn Ala Ala Gln Val
Leu Tyr Pro Asp Tyr 995 1000
1005Gln His Glu Leu Thr Asp Ile Ala Phe Tyr Gln Pro Ile Ile Phe
1010 1015 1020His Asp Asp Asp Thr Val
Ile Val Gln Ala Ile Phe Ser Pro Asp 1025 1030
1035Lys Ser Gln Glu Asn Gln Ser His Gln Thr Phe Pro Pro Met
Ser 1040 1045 1050Phe Gln Ile Ile Ser
Phe Met Pro Asp Gly Pro Leu Glu Asn Lys 1055 1060
1065Pro Lys Val His Val Thr Gly Cys Leu Arg Met Leu Arg
Asp Ala 1070 1075 1080Gln Pro Pro Thr
Leu Ser Pro Thr Glu Ile Arg Gln Arg Cys Pro 1085
1090 1095His Thr Val Asn Gly His Asp Trp Tyr Asn Ser
Leu Val Lys Gln 1100 1105 1110Lys Phe
Glu Met Gly Pro Ser Phe Arg Trp Val Gln Gln Leu Trp 1115
1120 1125His Gly Glu Asn Glu Ala Leu Thr Arg Leu
His Ile Pro Asp Val 1130 1135 1140Val
Gly Ser Val Ser Gly His Gln Leu His Gly Ile Leu Leu Asp 1145
1150 1155Gly Ser Leu Ser Thr Thr Ala Val Met
Glu Tyr Glu Tyr Gly Asp 1160 1165
1170Ser Ala Thr Arg Val Pro Leu Ser Phe Ala Ser Leu Gln Leu Tyr
1175 1180 1185Lys Pro Val Thr Gly Thr
Glu Trp Trp Cys Tyr Ala Arg Lys Ile 1190 1195
1200Gly Glu Phe Lys Tyr Asp Phe Gln Ile Met Asn Glu Ile Gly
Glu 1205 1210 1215Thr Leu Val Lys Ala
Ile Gly Phe Val Leu Arg Glu Ala Ser Pro 1220 1225
1230Glu Lys Phe Leu Arg Thr Thr Tyr Val His Asn Trp Leu
Val Asp 1235 1240 1245Ile Glu Trp Gln
Ala Gln Ser Thr Ser Leu Val Pro Ser Asp Gly 1250
1255 1260Thr Ile Ser Gly Ser Cys Leu Val Leu Ser Asp
Gln His Gly Thr 1265 1270 1275Gly Ala
Ala Leu Ala Gln Arg Leu Asp Asn Ala Gly Val Pro Val 1280
1285 1290Thr Met Ile Tyr Ala Asp Leu Ile Leu Asp
Asn Tyr Glu Leu Ile 1295 1300 1305Phe
Arg Thr Leu Pro Asp Leu Gln Gln Val Val Tyr Leu Trp Gly 1310
1315 1320Leu Asp Gln Lys Glu Asp Cys His Pro
Met Lys Gln Ala Glu Asp 1325 1330
1335Asn Cys Thr Ser Val Leu Tyr Leu Val Gln Ala Leu Leu Asn Thr
1340 1345 1350Tyr Ser Thr Pro Pro Ser
Leu Leu Ile Val Thr Cys Asp Ala Gln 1355 1360
1365Ala Val Val Glu Gln Asp Arg Val Asn Gly Phe Ala Gln Ser
Ser 1370 1375 1380Leu Leu Gly Leu Ala
Lys Val Ile Met Leu Glu His Pro Glu Leu 1385 1390
1395Ser Cys Val Tyr Met Asp Val Glu Ala Gly Tyr Leu Gln
Gln Asp 1400 1405 1410Val Ala Asn Thr
Ile Phe Thr Gln Leu Lys Arg Gly His Leu Ser 1415
1420 1425Lys Asp Gly Glu Glu Ser Gln Leu Ala Trp Arg
Asn Gly Gln Ala 1430 1435 1440Tyr Val
Ala Arg Leu Ser Gln Tyr Lys Pro Lys Ser Glu Gln Leu 1445
1450 1455Val Glu Ile Arg Ser Asp Arg Ser Tyr Leu
Ile Thr Gly Gly Arg 1460 1465 1470Gly
Gly Val Gly Leu Gln Ile Ala Arg Trp Leu Val Glu Lys Gly 1475
1480 1485Ala Lys His Leu Val Leu Leu Gly Arg
Ser Gln Thr Ser Ser Glu 1490 1495
1500Val Ser Leu Val Leu Asp Glu Leu Glu Ser Ala Gly Ala Gln Ile
1505 1510 1515Ile Val Ala Gln Ala Asp
Ile Ser Asp Glu Lys Val Leu Ala Gln 1520 1525
1530Ile Leu Thr Asn Leu Thr Val Pro Leu Cys Gly Val Ile His
Ala 1535 1540 1545Ala Gly Val Leu Asp
Asp Ala Ser Leu Leu Gln Gln Thr Pro Ala 1550 1555
1560Lys Leu Lys Lys Val Leu Leu Pro Lys Ala Glu Gly Ala
Trp Ile 1565 1570 1575Leu His Asn Leu
Thr Leu Glu Gln Arg Leu Asp Phe Phe Val Leu 1580
1585 1590Phe Ser Ser Ala Ser Ser Leu Leu Gly Ala Pro
Gly Gln Ala Asn 1595 1600 1605Tyr Ser
Ala Ala Asn Ala Phe Leu Asp Gly Leu Ala Ala Tyr Arg 1610
1615 1620Arg Gly Arg Gly Leu Pro Cys Leu Ser Ile
Cys Trp Gly Ala Trp 1625 1630 1635Asp
Gln Val Gly Met Ala Ala Arg Gln Gly Leu Leu Asp Lys Leu 1640
1645 1650Pro Gln Arg Gly Glu Glu Ala Ile Pro
Leu Gln Lys Gly Leu Asp 1655 1660
1665Leu Phe Gly Glu Leu Leu Asn Glu Pro Ala Ala Gln Ile Gly Val
1670 1675 1680Ile Pro Ile Gln Trp Thr
Arg Phe Leu Asp His Gln Lys Gly Asn 1685 1690
1695Leu Pro Phe Tyr Glu Lys Phe Ser Lys Ser Ser Arg Lys Ala
Gln 1700 1705 1710Ser Tyr Asp Ser Met
Ala Val Ser His Thr Glu Asp Ile Gln Arg 1715 1720
1725Lys Leu Lys Gln Ala Ala Val Gln Asp Arg Pro Lys Leu
Leu Glu 1730 1735 1740Val His Leu Arg
Ser Gln Val Ala Gln Leu Leu Gly Ile Asn Val 1745
1750 1755Ala Glu Leu Pro Asn Glu Glu Gly Ile Gly Phe
Val Thr Leu Gly 1760 1765 1770Leu Asp
Ser Leu Thr Ser Ile Glu Leu Arg Asn Ser Leu Gln Arg 1775
1780 1785Thr Leu Asp Cys Ser Leu Pro Val Thr Phe
Ala Phe Asp Tyr Pro 1790 1795 1800Thr
Ile Glu Ile Ala Val Lys Tyr Leu Thr Gln Val Val Ile Ala 1805
1810 1815Pro Met Glu Ser Thr Ala Ser Gln Gln
Thr Asp Ser Leu Ser Ala 1820 1825
1830Met Phe Thr Asp Thr Ser Ser Ile Gly Arg Ile Leu Asp Asn Glu
1835 1840 1845Thr Asp Val Leu Asp Ser
Glu Met Gln Ser Asp Glu Asp Glu Ser 1850 1855
1860Leu Ser Thr Leu Ile Gln Lys Leu Ser Thr His Leu Asp
1865 1870 1875
83 4074 DNA Cylindrospermopsis raciborskii AWT205 83
atgaacgctt tgtcagaaaa tcaggtaact tctatagtca
agaaggcatt gaacaaaata 60gaggagttac aagccgaact tgaccgttta aaatacgcgc
aacgggaacc aatcgccatc 120attggaatgg gctgtcgctt tcctggtgca gacacacctg
aagctttttg gaaattattg 180cacaatgggg ttgatgctat ccaagagatt ccaaaaagcc
gttgggatat tgacgactat 240tatgatccca caccagcaac acccggcaaa atgtatacac
gttttggtgg ttttctcgac 300caaatagcag ccttcgaccc tgagttcttt cgcatttcta
ctcgtgaggc aatcagctta 360gaccctcaac agagattgct tctggaagtg agttgggaag
ccttagaacg ggctgggctg 420acaggcaata aactgactac acaaacaggt gtctttgttg
gcatcagtga aagtgattat 480cgtgatttga ttatgcgtaa tggttctgac ctagatgtat
attctggttc aggtaactgc 540catagtacag ccagcgggcg tttatcttat tatttgggac
ttactggacc caatttgtcc 600cttgataccg cctgttcgtc ctctttggtt tgtgtggcat
tggctgtcaa gagcctacgt 660caacaggagt gtgatttggc attggcgggt ggtgtacaga
tacaagtgat accagatggc 720tttatcaaag cctgtcaatc ccgtatgttg tcgcctgatg
gacggtgcaa aacatttgat 780ttccaggcag atggttatgc ccgtgctgag gggtgtggga
tggtagttct caaacgccta 840tccgatgcaa ttgctgacaa tgataatatc ctggccttga
ttcgtggtgc cgcagtcaat 900catgatggct acacgagtgg attaaccgtt cccagtggtc
cctcacaacg ggcggtgatc 960caacaggcat tagcggatgc tggaatacac ccggatcaaa
ttagctatat tgaggcacat 1020ggcacaggta catccttagg cgatcctatt gaaatgggtg
cgattgggca agtctttggt 1080caacgctcac agatgctttt cgtcggttcg gtcaagacga
atattggtca tactgaggct 1140gctgctggta ttgctggtct catcaaggtt gtactctcaa
tgcagcacgg tgaaatccca 1200gcaaacttac acttcgacca gccaagtcct tatattaact
gggatcaatt accagtcagt 1260atcccaacag aaacaatacc ttggtctact agcgatcgct
ttgcaggagt cagtagcttt 1320ggctttagtg gcacaaactc tcatatcgta ctagaggcag
ccccaaacat agagcaacct 1380actgatgata ttaatcaaac gccgcatatt ttgaccttag
ctgcaaaaac acccgcagcc 1440ctgcaagaac tggctcggcg ttatgcgact cagatagaga
cctctcccga tgttcctctg 1500gcggacattt gtttcacagc acacataggg cgtaaacatt
ttaaacatag gtttgcggta 1560gtcacggaat ctaaagagca actgcgtttg caattggatg
catttgcaca atcagggggt 1620gtggggcgag aagtcaaatc gctaccaaag atagcctttc
tttttacagg tcaaggctca 1680cagtatgtgg gaatgggtcg tcaactttac gaaaaccaac
ctaccttccg aaaagcactc 1740gcccattgtg atgacatctt gcgtgctggt gcatatttcg
accgatcact actttcgatt 1800ctctacccag agggaaaatc agaagccatt caccaaaccg
cttatactca gcccgcgctt 1860tttgctcttg agtatgcgat cgctcagttg tggcactcct
ggggtatcaa accagatatc 1920gtgatggggc atagtgtagg tgaatacgtc gccgcttgtg
tggcgggcat attttcttta 1980gaggatgggc tgaaactaat tgctactcgt ggtcgtctga
tgcaatccct acctcaagac 2040ggaacgatgg tttcttcttt ggcaagtgaa gctcgtatcc
aggaagctat tacaccttac 2100cgagatgatg tgtcaatcgc agcgataaat gggacagaaa
gcgtggttat ctctggcaaa 2160cgcacctctg tgatggcaat tgctgaacaa ctcgccaccg
ttggcatcaa gacacgccaa 2220ctgacggttt cccatgcctt ccattcacca cttatgacac
ccatcttgga tgagttccgc 2280caggtggcag ccagtatcac ctatcaccag cccaagttgc
tacttgtctc caacgtctcc 2340gggaaagtgg ccggccctga aatcaccaga ccagattact
gggtacgcca tgtccgtgag 2400gcagtgcgct ttgccgatgg agtgaggacg ctgaatgaac
aaggtgtcaa tatctttctg 2460gaaatcggtt ctaccgctac cctgttgggc atggcactgc
gagtaaatga ggaagattca 2520aatgcctcaa aaggaacttc gtcttgctac ctgcccagtt
tacgggaaag ccagaaggat 2580tgtcagcaga tgttcactag tctgggtgag ttgtacgtac
atggatatga tattgattgg 2640ggtgcattta atcggggata tcaaggacgc aaggtgatat
tgccaaccta tccgtttcag 2700cgacaacgtt attggcttcc cgaccctaag ttggcacaaa
gttccgattt agataccttt 2760caagctcaga gcagcgcatc atcacaaaat cctagcgctg
tgtccacttt actgatggaa 2820tatttgcaag caggtgatgt ccaatcttta gttgggcttt
tggatgatga acggaaactc 2880tctgctgctg aacgaattgc actacccagt attttggagt
ttttggtaga ggaacaacag 2940cgacaaataa gctcaaccac aactcctcaa acagttttac
aaaaaataag tcaaacttcc 3000catgaggaca gatatgaaat attgaagaac ctgatcaaat
ctgaaatcga aacgattatc 3060aaaagtgttc cctccgatga acaaatgttt tctgacttag
gaattgattc cttgatggcg 3120atcgaactgc gtaataagct ccgttctgct atagggttgg
aactgccagt ggcaatagta 3180tttgaccatc ccacgattaa gcagttaact aacttcgtac
tggacagaat tgtgccgcag 3240gcagaccaaa aggacgttcc caccgaatcc ttgtttgctt
ctaaacagga gatatcagtt 3300gaggagcagt cttttgcaat taccaagctg ggcttatccc
ctgcttccca ctccctgcat 3360cttcctccat ggacggttag acctgcggta atggcagatg
taacaaaact aagccaactt 3420gaaagagagg cctatggctg gatcggagaa ggagcgatcg
ccccgcccca tctcattgcc 3480gatcgcatca atttactcaa cagtggtgat atgccttggt
tctgggtaat ggagcgatca 3540ggagagttgg gcgcgtggca ggtgctacaa ccgacatctg
ttgatccata tacttatgga 3600agttgggatg aagtaactga ccaaggtaaa ctgcaagcaa
ccttcgaccc aagtggacgc 3660aatgtgtata ttgtcgcggg tgggtctagc aacctcccca
cggtagccag ccacctcatg 3720acgcttcaga ctttattgat gctgcgggaa actggtcgtg
acacaatctt tgtctgtctg 3780gcaatgccag gttatgccaa ataccacagt caaacaggaa
aatcgccgga agagtatatt 3840gcgctgactg acgaggatgg tatcccaatg gacgagttta
ttgcactttc tgtctacgac 3900tggcctgtta ccccatcgtt tcgtgttctg cgagacggtt
atccacctga tcgagattct 3960ggtggtcacg cagttagtac ggttttccag ctcaatgatt
tcgatggagc gatcgaagaa 4020acatatcgtc gtattatccg ccatgccgat gtccttggtc
tcgaaagagg ctaa 4074
84 1357 PRT Cylindrospermopsis raciborskii AWT205 84
Met Asn Ala Leu Ser Glu Asn Gln Val Thr Ser Ile Val Lys Lys Ala1
5 10 15Leu Asn Lys Ile
Glu Glu Leu Gln Ala Glu Leu Asp Arg Leu Lys Tyr 20
25 30Ala Gln Arg Glu Pro Ile Ala Ile Ile Gly Met
Gly Cys Arg Phe Pro 35 40 45Gly
Ala Asp Thr Pro Glu Ala Phe Trp Lys Leu Leu His Asn Gly Val 50
55 60Asp Ala Ile Gln Glu Ile Pro Lys Ser Arg
Trp Asp Ile Asp Asp Tyr65 70 75
80Tyr Asp Pro Thr Pro Ala Thr Pro Gly Lys Met Tyr Thr Arg Phe
Gly 85 90 95Gly Phe Leu
Asp Gln Ile Ala Ala Phe Asp Pro Glu Phe Phe Arg Ile 100
105 110Ser Thr Arg Glu Ala Ile Ser Leu Asp Pro
Gln Gln Arg Leu Leu Leu 115 120
125Glu Val Ser Trp Glu Ala Leu Glu Arg Ala Gly Leu Thr Gly Asn Lys 130
135 140Leu Thr Thr Gln Thr Gly Val Phe
Val Gly Ile Ser Glu Ser Asp Tyr145 150
155 160Arg Asp Leu Ile Met Arg Asn Gly Ser Asp Leu Asp
Val Tyr Ser Gly 165 170
175Ser Gly Asn Cys His Ser Thr Ala Ser Gly Arg Leu Ser Tyr Tyr Leu
180 185 190Gly Leu Thr Gly Pro Asn
Leu Ser Leu Asp Thr Ala Cys Ser Ser Ser 195 200
205Leu Val Cys Val Ala Leu Ala Val Lys Ser Leu Arg Gln Gln
Glu Cys 210 215 220Asp Leu Ala Leu Ala
Gly Gly Val Gln Ile Gln Val Ile Pro Asp Gly225 230
235 240Phe Ile Lys Ala Cys Gln Ser Arg Met Leu
Ser Pro Asp Gly Arg Cys 245 250
255Lys Thr Phe Asp Phe Gln Ala Asp Gly Tyr Ala Arg Ala Glu Gly Cys
260 265 270Gly Met Val Val Leu
Lys Arg Leu Ser Asp Ala Ile Ala Asp Asn Asp 275
280 285Asn Ile Leu Ala Leu Ile Arg Gly Ala Ala Val Asn
His Asp Gly Tyr 290 295 300Thr Ser Gly
Leu Thr Val Pro Ser Gly Pro Ser Gln Arg Ala Val Ile305
310 315 320Gln Gln Ala Leu Ala Asp Ala
Gly Ile His Pro Asp Gln Ile Ser Tyr 325
330 335Ile Glu Ala His Gly Thr Gly Thr Ser Leu Gly Asp
Pro Ile Glu Met 340 345 350Gly
Ala Ile Gly Gln Val Phe Gly Gln Arg Ser Gln Met Leu Phe Val 355
360 365Gly Ser Val Lys Thr Asn Ile Gly His
Thr Glu Ala Ala Ala Gly Ile 370 375
380Ala Gly Leu Ile Lys Val Val Leu Ser Met Gln His Gly Glu Ile Pro385
390 395 400Ala Asn Leu His
Phe Asp Gln Pro Ser Pro Tyr Ile Asn Trp Asp Gln 405
410 415Leu Pro Val Ser Ile Pro Thr Glu Thr Ile
Pro Trp Ser Thr Ser Asp 420 425
430Arg Phe Ala Gly Val Ser Ser Phe Gly Phe Ser Gly Thr Asn Ser His
435 440 445Ile Val Leu Glu Ala Ala Pro
Asn Ile Glu Gln Pro Thr Asp Asp Ile 450 455
460Asn Gln Thr Pro His Ile Leu Thr Leu Ala Ala Lys Thr Pro Ala
Ala465 470 475 480Leu Gln
Glu Leu Ala Arg Arg Tyr Ala Thr Gln Ile Glu Thr Ser Pro
485 490 495Asp Val Pro Leu Ala Asp Ile
Cys Phe Thr Ala His Ile Gly Arg Lys 500 505
510His Phe Lys His Arg Phe Ala Val Val Thr Glu Ser Lys Glu
Gln Leu 515 520 525Arg Leu Gln Leu
Asp Ala Phe Ala Gln Ser Gly Gly Val Gly Arg Glu 530
535 540Val Lys Ser Leu Pro Lys Ile Ala Phe Leu Phe Thr
Gly Gln Gly Ser545 550 555
560Gln Tyr Val Gly Met Gly Arg Gln Leu Tyr Glu Asn Gln Pro Thr Phe
565 570 575Arg Lys Ala Leu Ala
His Cys Asp Asp Ile Leu Arg Ala Gly Ala Tyr 580
585 590Phe Asp Arg Ser Leu Leu Ser Ile Leu Tyr Pro Glu
Gly Lys Ser Glu 595 600 605Ala Ile
His Gln Thr Ala Tyr Thr Gln Pro Ala Leu Phe Ala Leu Glu 610
615 620Tyr Ala Ile Ala Gln Leu Trp His Ser Trp Gly
Ile Lys Pro Asp Ile625 630 635
640Val Met Gly His Ser Val Gly Glu Tyr Val Ala Ala Cys Val Ala Gly
645 650 655Ile Phe Ser Leu
Glu Asp Gly Leu Lys Leu Ile Ala Thr Arg Gly Arg 660
665 670Leu Met Gln Ser Leu Pro Gln Asp Gly Thr Met
Val Ser Ser Leu Ala 675 680 685Ser
Glu Ala Arg Ile Gln Glu Ala Ile Thr Pro Tyr Arg Asp Asp Val 690
695 700Ser Ile Ala Ala Ile Asn Gly Thr Glu Ser
Val Val Ile Ser Gly Lys705 710 715
720Arg Thr Ser Val Met Ala Ile Ala Glu Gln Leu Ala Thr Val Gly
Ile 725 730 735Lys Thr Arg
Gln Leu Thr Val Ser His Ala Phe His Ser Pro Leu Met 740
745 750Thr Pro Ile Leu Asp Glu Phe Arg Gln Val
Ala Ala Ser Ile Thr Tyr 755 760
765His Gln Pro Lys Leu Leu Leu Val Ser Asn Val Ser Gly Lys Val Ala 770
775 780Gly Pro Glu Ile Thr Arg Pro Asp
Tyr Trp Val Arg His Val Arg Glu785 790
795 800Ala Val Arg Phe Ala Asp Gly Val Arg Thr Leu Asn
Glu Gln Gly Val 805 810
815Asn Ile Phe Leu Glu Ile Gly Ser Thr Ala Thr Leu Leu Gly Met Ala
820 825 830Leu Arg Val Asn Glu Glu
Asp Ser Asn Ala Ser Lys Gly Thr Ser Ser 835 840
845Cys Tyr Leu Pro Ser Leu Arg Glu Ser Gln Lys Asp Cys Gln
Gln Met 850 855 860Phe Thr Ser Leu Gly
Glu Leu Tyr Val His Gly Tyr Asp Ile Asp Trp865 870
875 880Gly Ala Phe Asn Arg Gly Tyr Gln Gly Arg
Lys Val Ile Leu Pro Thr 885 890
895Tyr Pro Phe Gln Arg Gln Arg Tyr Trp Leu Pro Asp Pro Lys Leu Ala
900 905 910Gln Ser Ser Asp Leu
Asp Thr Phe Gln Ala Gln Ser Ser Ala Ser Ser 915
920 925Gln Asn Pro Ser Ala Val Ser Thr Leu Leu Met Glu
Tyr Leu Gln Ala 930 935 940Gly Asp Val
Gln Ser Leu Val Gly Leu Leu Asp Asp Glu Arg Lys Leu945
950 955 960Ser Ala Ala Glu Arg Ile Ala
Leu Pro Ser Ile Leu Glu Phe Leu Val 965
970 975Glu Glu Gln Gln Arg Gln Ile Ser Ser Thr Thr Thr
Pro Gln Thr Val 980 985 990Leu
Gln Lys Ile Ser Gln Thr Ser His Glu Asp Arg Tyr Glu Ile Leu 995
1000 1005Lys Asn Leu Ile Lys Ser Glu Ile
Glu Thr Ile Ile Lys Ser Val 1010 1015
1020Pro Ser Asp Glu Gln Met Phe Ser Asp Leu Gly Ile Asp Ser Leu
1025 1030 1035Met Ala Ile Glu Leu Arg
Asn Lys Leu Arg Ser Ala Ile Gly Leu 1040 1045
1050Glu Leu Pro Val Ala Ile Val Phe Asp His Pro Thr Ile Lys
Gln 1055 1060 1065Leu Thr Asn Phe Val
Leu Asp Arg Ile Val Pro Gln Ala Asp Gln 1070 1075
1080Lys Asp Val Pro Thr Glu Ser Leu Phe Ala Ser Lys Gln
Glu Ile 1085 1090 1095Ser Val Glu Glu
Gln Ser Phe Ala Ile Thr Lys Leu Gly Leu Ser 1100
1105 1110Pro Ala Ser His Ser Leu His Leu Pro Pro Trp
Thr Val Arg Pro 1115 1120 1125Ala Val
Met Ala Asp Val Thr Lys Leu Ser Gln Leu Glu Arg Glu 1130
1135 1140Ala Tyr Gly Trp Ile Gly Glu Gly Ala Ile
Ala Pro Pro His Leu 1145 1150 1155Ile
Ala Asp Arg Ile Asn Leu Leu Asn Ser Gly Asp Met Pro Trp 1160
1165 1170Phe Trp Val Met Glu Arg Ser Gly Glu
Leu Gly Ala Trp Gln Val 1175 1180
1185Leu Gln Pro Thr Ser Val Asp Pro Tyr Thr Tyr Gly Ser Trp Asp
1190 1195 1200Glu Val Thr Asp Gln Gly
Lys Leu Gln Ala Thr Phe Asp Pro Ser 1205 1210
1215Gly Arg Asn Val Tyr Ile Val Ala Gly Gly Ser Ser Asn Leu
Pro 1220 1225 1230Thr Val Ala Ser His
Leu Met Thr Leu Gln Thr Leu Leu Met Leu 1235 1240
1245Arg Glu Thr Gly Arg Asp Thr Ile Phe Val Cys Leu Ala
Met Pro 1250 1255 1260Gly Tyr Ala Lys
Tyr His Ser Gln Thr Gly Lys Ser Pro Glu Glu 1265
1270 1275Tyr Ile Ala Leu Thr Asp Glu Asp Gly Ile Pro
Met Asp Glu Phe 1280 1285 1290Ile Ala
Leu Ser Val Tyr Asp Trp Pro Val Thr Pro Ser Phe Arg 1295
1300 1305Val Leu Arg Asp Gly Tyr Pro Pro Asp Arg
Asp Ser Gly Gly His 1310 1315 1320Ala
Val Ser Thr Val Phe Gln Leu Asn Asp Phe Asp Gly Ala Ile 1325
1330 1335Glu Glu Thr Tyr Arg Arg Ile Ile Arg
His Ala Asp Val Leu Gly 1340 1345
1350Leu Glu Arg Gly 1355
85 1437 DNA Cylindrospermopsis raciborskii AWT205 85
atgaataaaa aacaggtaga cacattgtta atacacgctc atctttttac
catgcagggc 60aatggcctgg gatatattgc cgatggggca attgcggttc agggtagcca
gatcgtagca 120gtggattcga cagaggcttt gctgagtcat tttgaaggaa ataaaacaat
taatgcggta 180aattgtgcag tgttgcctgg actaattgat gctcatatac atacgacttg
tgctattctg 240cgtggagtgg cacaggatgt aaccaattgg ctaatggacg cgacaattcc
ttatgcactt 300cagatgacac ccgcagtaaa tatagccgga acgcgcttga gtgtactcga
agggctgaaa 360gcaggaacaa ccacattcgg cgattctgag actccttacc cgctctgggg
agagtttttc 420gatgaaattg gggtacgtgc tattctatcc cctgccttta acgcctttcc
actagaatgg 480tcggcatgga aggagggaga cctctatccc ttcgatatga aggcaggacg
acgtggtatg 540gaagaggctg tggattttgc ttgtgcatgg aatggagccg cagagggacg
tatcaccact 600atgttgggac tacaggcggc ggatatgcta ccactggaga tcctacacgc
agctaaagag 660attgcccaac gggaaggctt aatgctgcat attcatgtgg cccagggaga
tcgagaaaca 720aaacaaattg tcaaacgata tggtaagcgt ccgatcgcat ttctagctga
aattggctac 780ttggacgaac agttgctggc agttcacctc accgatgcca cagatgaaga
agtgatacaa 840gtagccaaaa gtggtgctgg catggcactc tgttcgggcg ctattggcat
cattgacggt 900cttgttccgc ccgctcatgt ttttcgacaa gcaggcggtt ccgttgcact
cggttctgat 960caagcctgtg gcaacaactg ttgtaacatc ttcaatgaaa tgaagctgac
cgccttattc 1020aacaaaataa aatatcatga tccaaccatt atgccggctt gggaagtcct
gcgtatggct 1080accatcgaag gagcgcaggc gattggttta gatcacaaga ttggctctct
tcaagtgggc 1140aaagaagccg acctgatctt aatagacctc agttccccta acctctcgcc
caccctgctc 1200aaccctattc gtaaccttgt acctaacttg gtgtatgctg cttcaggaca
tgaagttaaa 1260agcgtcatgg tggcgggaaa acttttagtg gaagactacc aagtcctcac
ggtagatgag 1320tccgctattc tcgctgaagc gcaagtacaa gctcaacaac tctgccaacg
tgtgaccgct 1380gaccccattc acaaaaagat ggtgttaatg gaagcgatgg ctaagggtaa
attatag 1437
86 478 PRT Cylindrospermopsis raciborskii AWT205 86
Met Asn
Lys Lys Gln Val Asp Thr Leu Leu Ile His Ala His Leu Phe1 5
10 15Thr Met Gln Gly Asn Gly Leu Gly
Tyr Ile Ala Asp Gly Ala Ile Ala 20 25
30Val Gln Gly Ser Gln Ile Val Ala Val Asp Ser Thr Glu Ala Leu
Leu 35 40 45Ser His Phe Glu Gly
Asn Lys Thr Ile Asn Ala Val Asn Cys Ala Val 50 55
60Leu Pro Gly Leu Ile Asp Ala His Ile His Thr Thr Cys Ala
Ile Leu65 70 75 80Arg
Gly Val Ala Gln Asp Val Thr Asn Trp Leu Met Asp Ala Thr Ile
85 90 95Pro Tyr Ala Leu Gln Met Thr
Pro Ala Val Asn Ile Ala Gly Thr Arg 100 105
110Leu Ser Val Leu Glu Gly Leu Lys Ala Gly Thr Thr Thr Phe
Gly Asp 115 120 125Ser Glu Thr Pro
Tyr Pro Leu Trp Gly Glu Phe Phe Asp Glu Ile Gly 130
135 140Val Arg Ala Ile Leu Ser Pro Ala Phe Asn Ala Phe
Pro Leu Glu Trp145 150 155
160Ser Ala Trp Lys Glu Gly Asp Leu Tyr Pro Phe Asp Met Lys Ala Gly
165 170 175Arg Arg Gly Met Glu
Glu Ala Val Asp Phe Ala Cys Ala Trp Asn Gly 180
185 190Ala Ala Glu Gly Arg Ile Thr Thr Met Leu Gly Leu
Gln Ala Ala Asp 195 200 205Met Leu
Pro Leu Glu Ile Leu His Ala Ala Lys Glu Ile Ala Gln Arg 210
215 220Glu Gly Leu Met Leu His Ile His Val Ala Gln
Gly Asp Arg Glu Thr225 230 235
240Lys Gln Ile Val Lys Arg Tyr Gly Lys Arg Pro Ile Ala Phe Leu Ala
245 250 255Glu Ile Gly Tyr
Leu Asp Glu Gln Leu Leu Ala Val His Leu Thr Asp 260
265 270Ala Thr Asp Glu Glu Val Ile Gln Val Ala Lys
Ser Gly Ala Gly Met 275 280 285Ala
Leu Cys Ser Gly Ala Ile Gly Ile Ile Asp Gly Leu Val Pro Pro 290
295 300Ala His Val Phe Arg Gln Ala Gly Gly Ser
Val Ala Leu Gly Ser Asp305 310 315
320Gln Ala Cys Gly Asn Asn Cys Cys Asn Ile Phe Asn Glu Met Lys
Leu 325 330 335Thr Ala Leu
Phe Asn Lys Ile Lys Tyr His Asp Pro Thr Ile Met Pro 340
345 350Ala Trp Glu Val Leu Arg Met Ala Thr Ile
Glu Gly Ala Gln Ala Ile 355 360
365Gly Leu Asp His Lys Ile Gly Ser Leu Gln Val Gly Lys Glu Ala Asp 370
375 380Leu Ile Leu Ile Asp Leu Ser Ser
Pro Asn Leu Ser Pro Thr Leu Leu385 390
395 400Asn Pro Ile Arg Asn Leu Val Pro Asn Leu Val Tyr
Ala Ala Ser Gly 405 410
415His Glu Val Lys Ser Val Met Val Ala Gly Lys Leu Leu Val Glu Asp
420 425 430Tyr Gln Val Leu Thr Val
Asp Glu Ser Ala Ile Leu Ala Glu Ala Gln 435 440
445Val Gln Ala Gln Gln Leu Cys Gln Arg Val Thr Ala Asp Pro
Ile His 450 455 460Lys Lys Met Val Leu
Met Glu Ala Met Ala Lys Gly Lys Leu465 470
475
87 831 DNA Cylindrospermopsis raciborskii AWT205 87
atgaccatat atgaaaataa
gttgagtagt tatcaaaaaa atcaagatgc cataatatct 60gcaaaagaac tcgaagaatg
gcatttaatt ggacttctag accattcaat agatgcggta 120atagtaccga attattttct
tgagcaagag tgtatgacaa tttcagagag aataaaaaag 180agtaaatatt ttagcgctta
tcccggtcat ccatcagtaa gtagcttggg acaagagttg 240tatgaatgcg aaagtgagct
tgaattagca aagtatcaag aagacgcacc cacattgatt 300aaagaaatgc ggaggctggt
acatccgtac ataagtccaa ttgatagact tagggttgaa 360gttgatgata tttggagtta
tggctgtaat ttagcaaaac ttggtgataa aaaactgttt 420gcgggtatcg ttagagagtt
taaagaagat aaccctggcg caccacattg tgacgtaatg 480gcatggggtt ttctcgaata
ttataaagat aaaccaaata tcataaatca aatcgcagca 540aatgtatatt taaaaacgtc
tgcatcagga ggagaaatag tgctttggga tgaatggcca 600actcaaagcg aatatatagc
atacaaaaca gatgatccag ctagtttcgg tcttgatagc 660aaaaagatcg cacaaccaaa
acttgagatc caaccgaacc agggagattt aattctattc 720aattccatga gaattcatgc
ggtgaaaaag atagaaactg gtgtacgtat gacatgggga 780tgtttgattg gatactctgg
aactgataaa ccgcttgtta tttggactta a
831
88 276 PRT Cylindrospermopsis raciborskii AWT205 88
Met Thr Ile Tyr Glu
Asn Lys Leu Ser Ser Tyr Gln Lys Asn Gln Asp1 5
10 15Ala Ile Ile Ser Ala Lys Glu Leu Glu Glu Trp
His Leu Ile Gly Leu 20 25
30Leu Asp His Ser Ile Asp Ala Val Ile Val Pro Asn Tyr Phe Leu Glu
35 40 45Gln Glu Cys Met Thr Ile Ser Glu
Arg Ile Lys Lys Ser Lys Tyr Phe 50 55
60Ser Ala Tyr Pro Gly His Pro Ser Val Ser Ser Leu Gly Gln Glu Leu65
70 75 80Tyr Glu Cys Glu Ser
Glu Leu Glu Leu Ala Lys Tyr Gln Glu Asp Ala 85
90 95Pro Thr Leu Ile Lys Glu Met Arg Arg Leu Val
His Pro Tyr Ile Ser 100 105
110Pro Ile Asp Arg Leu Arg Val Glu Val Asp Asp Ile Trp Ser Tyr Gly
115 120 125Cys Asn Leu Ala Lys Leu Gly
Asp Lys Lys Leu Phe Ala Gly Ile Val 130 135
140Arg Glu Phe Lys Glu Asp Asn Pro Gly Ala Pro His Cys Asp Val
Met145 150 155 160Ala Trp
Gly Phe Leu Glu Tyr Tyr Lys Asp Lys Pro Asn Ile Ile Asn
165 170 175Gln Ile Ala Ala Asn Val Tyr
Leu Lys Thr Ser Ala Ser Gly Gly Glu 180 185
190Ile Val Leu Trp Asp Glu Trp Pro Thr Gln Ser Glu Tyr Ile
Ala Tyr 195 200 205Lys Thr Asp Asp
Pro Ala Ser Phe Gly Leu Asp Ser Lys Lys Ile Ala 210
215 220Gln Pro Lys Leu Glu Ile Gln Pro Asn Gln Gly Asp
Leu Ile Leu Phe225 230 235
240Asn Ser Met Arg Ile His Ala Val Lys Lys Ile Glu Thr Gly Val Arg
245 250 255Met Thr Trp Gly Cys
Leu Ile Gly Tyr Ser Gly Thr Asp Lys Pro Leu 260
265 270Val Ile Trp Thr
275
89 1398 DNA Cylindrospermopsis raciborskii AWT205 89
ttaatgtagc gtttccattt
gagtcaaggc acgagaagct tctaaagctg gaatagatac 60actatcattc tcaactacac
tctcaaatgt cctaggtaac tgtgccccaa acatcagcat 120tccaatggcg ttgaacaaaa
agaaagccaa ccacaagata tggttactct caaatttaac 180agcagctaca tccgcaggta
aaaatcctac accaaacgcg attaagttaa cattgcggag 240agtatgccct tgagccaaac
ccaagaagta cccacatagt atgcaacata ctgaattgca 300tactaggaca agtaccaacc
agggaataaa aatatcaata ttctcaataa tttctgcgtg 360gttggttaac aacccaaaaa
catcatcggg aaatagccaa cacgctccgc cgaaaaccag 420actcactagc agagccattc
ccacagaaac ttttgccaga ggtgctaact gttctgtggc 480tcctttccct ttaaaatttc
ctgccagagt ttctgtacag aatcccaatc cttcaacaat 540gtagatgctc aaagcccata
tctgtaagag caaggcattt tgagcgtaga taattgtccc 600catttgtgcc ccttcgtagt
taaacgttaa gttggtaaac atacaaacta aattgctgac 660aaagatgttt ccattgagag
ttaaggtgga gcgtatagct tttatgtccc aaatttttcc 720agctaattct tttacctctt
gccacgggat ttctttgcag acaaaaaaca atcccaccaa 780tagggtgaga tattgacttg
cagcagaagc tactcctgcc cccatgctcg accagtctaa 840gtggataata aacaagtagt
cgagtgcgat attggcagca ttgcccacaa ccgacaacaa 900cacaactaag ccattttttt
cccgtcccag aaaccagcca agcaggacaa agttgagcaa 960aatggcaggc gctccccaac
tctgggtgtt aaaatacgct tgagctgaag acttcacctc 1020tgggccgaca tctagtatag
aaaaccccaa cacccctaac gggtactgta acagtatgat 1080cgccaccccc agcaccagag
caattaaacc attaagcagt cccgccaaca gtacgccctc 1140tcggtcatct cgtccgactg
cttgtgctgt taacgcagtg gtacccattc gtaaaaacga 1200taaaacaaag tagagaaagt
taagcaggtt tccagcaagg gctactccag ctaggtagtg 1260gatttccgag agatgaccta
agaacatgat actgactaaa ttactcagtg gtactataat 1320attcgatagg acgttggtaa
aagctagtcg gaagtagcgg ggtataaagt catactggct 1380tggaaatgtc aggctcat
1398
90 465 PRT Cylindrospermopsis raciborskii AWT205 90
Met Ser Leu Thr Phe
Pro Ser Gln Tyr Asp Phe Ile Pro Arg Tyr Phe1 5
10 15Arg Leu Ala Phe Thr Asn Val Leu Ser Asn Ile
Ile Val Pro Leu Ser 20 25
30Asn Leu Val Ser Ile Met Phe Leu Gly His Leu Ser Glu Ile His Tyr
35 40 45Leu Ala Gly Val Ala Leu Ala Gly
Asn Leu Leu Asn Phe Leu Tyr Phe 50 55
60Val Leu Ser Phe Leu Arg Met Gly Thr Thr Ala Leu Thr Ala Gln Ala65
70 75 80Val Gly Arg Asp Asp
Arg Glu Gly Val Leu Leu Ala Gly Leu Leu Asn 85
90 95Gly Leu Ile Ala Leu Val Leu Gly Val Ala Ile
Ile Leu Leu Gln Tyr 100 105
110Pro Leu Gly Val Leu Gly Phe Ser Ile Leu Asp Val Gly Pro Glu Val
115 120 125Lys Ser Ser Ala Gln Ala Tyr
Phe Asn Thr Gln Ser Trp Gly Ala Pro 130 135
140Ala Ile Leu Leu Asn Phe Val Leu Leu Gly Trp Phe Leu Gly Arg
Glu145 150 155 160Lys Asn
Gly Leu Val Val Leu Leu Ser Val Val Gly Asn Ala Ala Asn
165 170 175Ile Ala Leu Asp Tyr Leu Phe
Ile Ile His Leu Asp Trp Ser Ser Met 180 185
190Gly Ala Gly Val Ala Ser Ala Ala Ser Gln Tyr Leu Thr Leu
Leu Val 195 200 205Gly Leu Phe Phe
Val Cys Lys Glu Ile Pro Trp Gln Glu Val Lys Glu 210
215 220Leu Ala Gly Lys Ile Trp Asp Ile Lys Ala Ile Arg
Ser Thr Leu Thr225 230 235
240Leu Asn Gly Asn Ile Phe Val Ser Asn Leu Val Cys Met Phe Thr Asn
245 250 255Leu Thr Phe Asn Tyr
Glu Gly Ala Gln Met Gly Thr Ile Ile Tyr Ala 260
265 270Gln Asn Ala Leu Leu Leu Gln Ile Trp Ala Leu Ser
Ile Tyr Ile Val 275 280 285Glu Gly
Leu Gly Phe Cys Thr Glu Thr Leu Ala Gly Asn Phe Lys Gly 290
295 300Lys Gly Ala Thr Glu Gln Leu Ala Pro Leu Ala
Lys Val Ser Val Gly305 310 315
320Met Ala Leu Leu Val Ser Leu Val Phe Gly Gly Ala Cys Trp Leu Phe
325 330 335Pro Asp Asp Val
Phe Gly Leu Leu Thr Asn His Ala Glu Ile Ile Glu 340
345 350Asn Ile Asp Ile Phe Ile Pro Trp Leu Val Leu
Val Leu Val Cys Asn 355 360 365Ser
Val Cys Cys Ile Leu Cys Gly Tyr Phe Leu Gly Leu Ala Gln Gly 370
375 380His Thr Leu Arg Asn Val Asn Leu Ile Ala
Phe Gly Val Gly Phe Leu385 390 395
400Pro Ala Asp Val Ala Ala Val Lys Phe Glu Ser Asn His Ile Leu
Trp 405 410 415Leu Ala Phe
Phe Leu Phe Asn Ala Ile Gly Met Leu Met Phe Gly Ala 420
425 430Gln Leu Pro Arg Thr Phe Glu Ser Val Val
Glu Asn Asp Ser Val Ser 435 440
445Ile Pro Ala Leu Glu Ala Ser Arg Ala Leu Thr Gln Met Glu Thr Leu 450
455 460His 465
91 750 DNA Cylindrospermopsis raciborskii AWT205 91
atgttgaact tagaccgcat cctgaatcaa gagcgactgc
tacgagaaat gactggactt 60aaccgccaag cattcaacga gctgttatct cagtttgctg
atacctatga acgcaccgtg 120ttcaactcct tagcaaaccg caaacgtgcg cccgggggcg
gacgcaagcc tacactcaga 180agtatagagg aaaaactatt ttatatcctg ctgtactgca
aatgttatcc gacgtttgac 240ttgctgagtg tgttgttcaa ctttgaccgc tcctgtgctc
atgattgggt acatcgacta 300ctgtctgtgc tagaaaccac tttaggagaa aagcaagttt
tgccagcacg caaactcagg 360agcatggagg aattcaccaa aaggtttcca gatgtgaagg
aggtgattgt ggatggtacg 420gagcgtccag tccagcgtcc tcaaaaccga gaacgccaaa
aagagtatta ctctggcaag 480aaaaagcggc atacatgcaa gcagattaca gtcagcacaa
gggagaaacg agtgattatt 540cggacggaaa ccagagcagg taaagtgcat gacaaacggc
tactccatga atcagagata 600gtgcaataca ttcctgatga agtagcaata gagggagatt
tgggttttca tgggttggag 660aaagaatttg tcaatgtcca tttaccacac aagaaaccga
aaggtatcga agcaaggagg 720catggcggcg ggatgggtca gtttttataa
750
92 249 PRT Cylindrospermopsis raciborskii AWT205 92
Met Leu Asn Leu Asp Arg Ile Leu Asn Gln Glu Arg Leu Leu Arg Glu1
5 10 15Met Thr Gly Leu Asn Arg
Gln Ala Phe Asn Glu Leu Leu Ser Gln Phe 20 25
30Ala Asp Thr Tyr Glu Arg Thr Val Phe Asn Ser Leu Ala
Asn Arg Lys 35 40 45Arg Ala Pro
Gly Gly Gly Arg Lys Pro Thr Leu Arg Ser Ile Glu Glu 50
55 60Lys Leu Phe Tyr Ile Leu Leu Tyr Cys Lys Cys Tyr
Pro Thr Phe Asp65 70 75
80Leu Leu Ser Val Leu Phe Asn Phe Asp Arg Ser Cys Ala His Asp Trp
85 90 95Val His Arg Leu Leu Ser
Val Leu Glu Thr Thr Leu Gly Glu Lys Gln 100
105 110Val Leu Pro Ala Arg Lys Leu Arg Ser Met Glu Glu
Phe Thr Lys Arg 115 120 125Phe Pro
Asp Val Lys Glu Val Ile Val Asp Gly Thr Glu Arg Pro Val 130
135 140Gln Arg Pro Gln Asn Arg Glu Arg Gln Lys Glu
Tyr Tyr Ser Gly Lys145 150 155
160Lys Lys Arg His Thr Cys Lys Gln Ile Thr Val Ser Thr Arg Glu Lys
165 170 175Arg Val Ile Ile
Arg Thr Glu Thr Arg Ala Gly Lys Val His Asp Lys 180
185 190Arg Leu Leu His Glu Ser Glu Ile Val Gln Tyr
Ile Pro Asp Glu Val 195 200 205Ala
Ile Glu Gly Asp Leu Gly Phe His Gly Leu Glu Lys Glu Phe Val 210
215 220Asn Val His Leu Pro His Lys Lys Pro Lys
Gly Ile Glu Ala Arg Arg225 230 235
240His Gly Gly Gly Met Gly Gln Phe Leu
245
93 1431 DNA Cylindrospermopsis raciborskii AWT205
93 atgaatctta taacaacaaa
aaaacaggta gatacattag tgatacacgc tcatcttttt 60accatgcagg gaaatggtgt
gggatatatt gcagatgggg cacttgcggt tgagggtagc 120cgtattgtag cagttgattc
gacggaggcg ttgctgagtc attttgaggg cagaaaggtt 180attgagtccg cgaattgtgc
cgtcttgcct gggctgatta atgctcacgt agacacaagt 240ttggtgctga tgcgtggggc
ggcgcaagat gtaactaatt ggctaatgga cgcgaccatg 300ccttattttg ctcacatgac
acccgtggcg agtatggctg caacacgctt aagggtggta 360gaagagttga aagcaggcac
aacaacattc tgtgacaata aaattattag ccccctgtgg 420ggcgaatttt tcgatgaaat
tggtgtacgg gctagtttag ctcctatgtt cgatgcactc 480ccactggaga tgccaccgct
tcaagacggg gagctttatc ccttcgatat caaggcggga 540cggcgggcga tggcagaggc
tgtggatttt gcctgtgggt ggaatggggc agcagagggg 600cgtatcacta ccatgttagg
aatgtattcg ccagatatga tgccgcttga gatgctacgc 660gcagccaaag agattgctca
acgggaaggc ttaatgctgc attttcatgt agcgcaggga 720gatcgggaaa cagagcaaat
cgttaaacga tatggtaagc gtccgatcgc atttctagct 780gagattggct acttggacga
acagttgctg gcagttcacc tcaccgatgc caccgatgaa 840gaggtgatac aagtagccaa
aagtggcgct ggcatggtac tctgttcggg aatgattggc 900actattgacg gtatcgtgcc
gcccgctcat gtgtttcggc aagcaggcgg acccgttgcg 960ctaggcagca gctacaataa
tattttccat gagatgaagc tgaccgcctt attcaacaaa 1020ataaaatatc acgatccaac
cattatgccg gcttgggaag tcctgcgtat ggctaccatc 1080gaaggagcgc gggcgattgg
tttagatcac aagattggct ctcttgaagt tggcaaagaa 1140gccgacctga tcttaataga
cctcagcacc cctaacctct cacccactct gcttaacccc 1200attcgtaacc ttgtacctaa
tttcgtgtac gctgcttcag gacatgaagt taaaagtgtc 1260atggtggcgg gaaaactgtt
attggaagac taccaagtcc tcacagtaga tgagtctgct 1320atcattgctg aagcacaatt
gcaagcccaa cagatttctc aatgcgtagc atctgaccct 1380atccacaaaa aaatggtgct
gatggcggcg atggcaaggg gccaattgta g
1431
94 476 PRT Cylindrospermopsis raciborskii AWT205
94 Met Asn Leu Ile Thr
Thr Lys Lys Gln Val Asp Thr Leu Val Ile His1 5
10 15Ala His Leu Phe Thr Met Gln Gly Asn Gly Val
Gly Tyr Ile Ala Asp 20 25
30Gly Ala Leu Ala Val Glu Gly Ser Arg Ile Val Ala Val Asp Ser Thr
35 40 45Glu Ala Leu Leu Ser His Phe Glu
Gly Arg Lys Val Ile Glu Ser Ala 50 55
60Asn Cys Ala Val Leu Pro Gly Leu Ile Asn Ala His Val Asp Thr Ser65
70 75 80Leu Val Leu Met Arg
Gly Ala Ala Gln Asp Val Thr Asn Trp Leu Met 85
90 95Asp Ala Thr Met Pro Tyr Phe Ala His Met Thr
Pro Val Ala Ser Met 100 105
110Ala Ala Thr Arg Leu Arg Val Val Glu Glu Leu Lys Ala Gly Thr Thr
115 120 125Thr Phe Cys Asp Asn Lys Ile
Ile Ser Pro Leu Trp Gly Glu Phe Phe 130 135
140Asp Glu Ile Gly Val Arg Ala Ser Leu Ala Pro Met Phe Asp Ala
Leu145 150 155 160Pro Leu
Glu Met Pro Pro Leu Gln Asp Gly Glu Leu Tyr Pro Phe Asp
165 170 175Ile Lys Ala Gly Arg Arg Ala
Met Ala Glu Ala Val Asp Phe Ala Cys 180 185
190Gly Trp Asn Gly Ala Ala Glu Gly Arg Ile Thr Thr Met Leu
Gly Met 195 200 205Tyr Ser Pro Asp
Met Met Pro Leu Glu Met Leu Arg Ala Ala Lys Glu 210
215 220Ile Ala Gln Arg Glu Gly Leu Met Leu His Phe His
Val Ala Gln Gly225 230 235
240Asp Arg Glu Thr Glu Gln Ile Val Lys Arg Tyr Gly Lys Arg Pro Ile
245 250 255Ala Phe Leu Ala Glu
Ile Gly Tyr Leu Asp Glu Gln Leu Leu Ala Val 260
265 270His Leu Thr Asp Ala Thr Asp Glu Glu Val Ile Gln
Val Ala Lys Ser 275 280 285Gly Ala
Gly Met Val Leu Cys Ser Gly Met Ile Gly Thr Ile Asp Gly 290
295 300Ile Val Pro Pro Ala His Val Phe Arg Gln Ala
Gly Gly Pro Val Ala305 310 315
320Leu Gly Ser Ser Tyr Asn Asn Ile Phe His Glu Met Lys Leu Thr Ala
325 330 335Leu Phe Asn Lys
Ile Lys Tyr His Asp Pro Thr Ile Met Pro Ala Trp 340
345 350Glu Val Leu Arg Met Ala Thr Ile Glu Gly Ala
Arg Ala Ile Gly Leu 355 360 365Asp
His Lys Ile Gly Ser Leu Glu Val Gly Lys Glu Ala Asp Leu Ile 370
375 380Leu Ile Asp Leu Ser Thr Pro Asn Leu Ser
Pro Thr Leu Leu Asn Pro385 390 395
400Ile Arg Asn Leu Val Pro Asn Phe Val Tyr Ala Ala Ser Gly His
Glu 405 410 415Val Lys Ser
Val Met Val Ala Gly Lys Leu Leu Leu Glu Asp Tyr Gln 420
425 430Val Leu Thr Val Asp Glu Ser Ala Ile Ile
Ala Glu Ala Gln Leu Gln 435 440
445Ala Gln Gln Ile Ser Gln Cys Val Ala Ser Asp Pro Ile His Lys Lys 450
455 460Met Val Leu Met Ala Ala Met Ala
Arg Gly Gln Leu465 470
475
95 780 DNA Cylindrospermopsis raciborskii AWT205
95 atgcaagaaa aacgaatcgc
aatgtggtct gtgccacgaa gtttgggtac agtgctgcta 60caagcctggt cgagtcggcc
agataccgta gtctttgatg aacttctctc ctttccctat 120ctctttatca aagggaaaga
tatgggcttt acttggacag accttgattc tagccaaatg 180ccccacgcag attggcgatc
cgtcatcgat ctgttaaagg ctcccctgcc tgaagggaaa 240tcaatcatcg atctgttaaa
ggctcccctg cctgaaggga aatcaatttg ctatcagaag 300catcaagcgt atcatttaat
cgaagagacc atggggattg agtggatatt gcccttcagc 360aactgctttc tgattcgcca
acccaaagaa atgctcttat cttttcgtaa gattgtgcca 420cattttacct ttgaagaaac
aggctggatc gaattaaaac ggctgtttga ctatgtacat 480caaacgagcg gagtaatccc
gcctgtcata gatgcacacg acttgctgaa cgatccgcgg 540agaatgctct ccaagctttg
tcaggttgta ggggttgagt ttaccgagac aatgctcagt 600tggcccccca tggaggtcga
gttgaacgaa aaactagccc cttggtacag caccgtagca 660agttctacgc attttcactc
gtatcagaat aaaaatgagt cgttgccgct atatcttgtc 720gatatttgta aacgctgcga
tgaaatatat caggaattat atcaatttcg actttattag
780
96 259 PRT Cylindrospermopsis raciborskii AWT205
96 Met Gln Glu Lys Arg
Ile Ala Met Trp Ser Val Pro Arg Ser Leu Gly1 5
10 15Thr Val Leu Leu Gln Ala Trp Ser Ser Arg Pro
Asp Thr Val Val Phe 20 25
30Asp Glu Leu Leu Ser Phe Pro Tyr Leu Phe Ile Lys Gly Lys Asp Met
35 40 45Gly Phe Thr Trp Thr Asp Leu Asp
Ser Ser Gln Met Pro His Ala Asp 50 55
60Trp Arg Ser Val Ile Asp Leu Leu Lys Ala Pro Leu Pro Glu Gly Lys65
70 75 80Ser Ile Ile Asp Leu
Leu Lys Ala Pro Leu Pro Glu Gly Lys Ser Ile 85
90 95Cys Tyr Gln Lys His Gln Ala Tyr His Leu Ile
Glu Glu Thr Met Gly 100 105
110Ile Glu Trp Ile Leu Pro Phe Ser Asn Cys Phe Leu Ile Arg Gln Pro
115 120 125Lys Glu Met Leu Leu Ser Phe
Arg Lys Ile Val Pro His Phe Thr Phe 130 135
140Glu Glu Thr Gly Trp Ile Glu Leu Lys Arg Leu Phe Asp Tyr Val
His145 150 155 160Gln Thr
Ser Gly Val Ile Pro Pro Val Ile Asp Ala His Asp Leu Leu
165 170 175Asn Asp Pro Arg Arg Met Leu
Ser Lys Leu Cys Gln Val Val Gly Val 180 185
190Glu Phe Thr Glu Thr Met Leu Ser Trp Pro Pro Met Glu Val
Glu Leu 195 200 205Asn Glu Lys Leu
Ala Pro Trp Tyr Ser Thr Val Ala Ser Ser Thr His 210
215 220Phe His Ser Tyr Gln Asn Lys Asn Glu Ser Leu Pro
Leu Tyr Leu Val225 230 235
240Asp Ile Cys Lys Arg Cys Asp Glu Ile Tyr Gln Glu Leu Tyr Gln Phe
245 250 255Arg Leu
Tyr
97 1176 DNA Cylindrospermopsis raciborskii AWT205
97 atgcaaacaa gaattgtaaa
tagctggaat gagtgggatg aactaaagga gatggttgtc 60gggattgcag atggtgctta
ttttgaacca actgagccag gtaaccgccc tgctttacgc 120gataagaaca ttgccaaaat
gttctctttt cccaggggtc cgaaaaagca agaggtaaca 180gagaaagcta atgaggagtt
gaatgggctg gtagcgcttc tagaatcaca gggcgtaact 240gtacgccgcc cagagaaaca
taactttggc ctgtctgtga agacaccatt ctttgaggta 300gagaatcaat attgtgcggt
ctgcccacgt gatgttatga tcacctttgg gaacgaaatt 360ctcgaagcaa ctatgtcacg
gcggtcacgc ttctttgagt atttacccta tcgcaaacta 420gtctatgaat attggcataa
agatccagat atgatctgga atgctgcgcc taaaccgact 480atgcaaaatg ccatgtaccg
cgaagatttc tgggagtgtc cgatggaaga tcgatttgag 540agtatgcatg attttgagtt
ctgcgtcacc caggatgagg tgatttttga cgcagcagac 600tgtagccgct ttggccgtga
tatttttgtg caggagtcaa tgacgactaa tcgtgcaggg 660attcgctggc tcaaacggca
tttagagccg cgtcgcttcc gcgtgcatga tattcacttc 720ccactagata ttttcccatc
ccacattgat tgtacttttg tccccttagc acctggggtt 780gtgttagtga atccagatcg
ccccatcaaa gagggtgaag agaaactctt catggataac 840ggttggcaat tcatcgaagc
acccctcccc acttccaccg acgatgagat gcctatgttc 900tgccagtcca gtaagtggtt
ggcgatgaat gtgttaagca tttcccccaa gaaggtcatc 960tgtgaagagc aagagcatcc
gcttcatgag ttgctagata aacacggctt tgaggtctat 1020ccaattccct ttcgcaatgt
ctttgagttt ggcggttcgc tccattgtgc cacctgggat 1080atccatcgca cgggaacctg
tgaggattac ttccctaaac taaactatac gccggtaact 1140gcatcaacca atggcgtttc
tcgcttcatc atttag
1176
98 391 PRT Cylindrospermopsis raciborskii AWT205
98 Met Gln Thr Arg Ile
Val Asn Ser Trp Asn Glu Trp Asp Glu Leu Lys1 5
10 15Glu Met Val Val Gly Ile Ala Asp Gly Ala Tyr
Phe Glu Pro Thr Glu 20 25
30Pro Gly Asn Arg Pro Ala Leu Arg Asp Lys Asn Ile Ala Lys Met Phe
35 40 45Ser Phe Pro Arg Gly Pro Lys Lys
Gln Glu Val Thr Glu Lys Ala Asn 50 55
60Glu Glu Leu Asn Gly Leu Val Ala Leu Leu Glu Ser Gln Gly Val Thr65
70 75 80Val Arg Arg Pro Glu
Lys His Asn Phe Gly Leu Ser Val Lys Thr Pro 85
90 95Phe Phe Glu Val Glu Asn Gln Tyr Cys Ala Val
Cys Pro Arg Asp Val 100 105
110Met Ile Thr Phe Gly Asn Glu Ile Leu Glu Ala Thr Met Ser Arg Arg
115 120 125Ser Arg Phe Phe Glu Tyr Leu
Pro Tyr Arg Lys Leu Val Tyr Glu Tyr 130 135
140Trp His Lys Asp Pro Asp Met Ile Trp Asn Ala Ala Pro Lys Pro
Thr145 150 155 160Met Gln
Asn Ala Met Tyr Arg Glu Asp Phe Trp Glu Cys Pro Met Glu
165 170 175Asp Arg Phe Glu Ser Met His
Asp Phe Glu Phe Cys Val Thr Gln Asp 180 185
190Glu Val Ile Phe Asp Ala Ala Asp Cys Ser Arg Phe Gly Arg
Asp Ile 195 200 205Phe Val Gln Glu
Ser Met Thr Thr Asn Arg Ala Gly Ile Arg Trp Leu 210
215 220Lys Arg His Leu Glu Pro Arg Arg Phe Arg Val His
Asp Ile His Phe225 230 235
240Pro Leu Asp Ile Phe Pro Ser His Ile Asp Cys Thr Phe Val Pro Leu
245 250 255Ala Pro Gly Val Val
Leu Val Asn Pro Asp Arg Pro Ile Lys Glu Gly 260
265 270Glu Glu Lys Leu Phe Met Asp Asn Gly Trp Gln Phe
Ile Glu Ala Pro 275 280 285Leu Pro
Thr Ser Thr Asp Asp Glu Met Pro Met Phe Cys Gln Ser Ser 290
295 300Lys Trp Leu Ala Met Asn Val Leu Ser Ile Ser
Pro Lys Lys Val Ile305 310 315
320Cys Glu Glu Gln Glu His Pro Leu His Glu Leu Leu Asp Lys His Gly
325 330 335Phe Glu Val Tyr
Pro Ile Pro Phe Arg Asn Val Phe Glu Phe Gly Gly 340
345 350Ser Leu His Cys Ala Thr Trp Asp Ile His Arg
Thr Gly Thr Cys Glu 355 360 365Asp
Tyr Phe Pro Lys Leu Asn Tyr Thr Pro Val Thr Ala Ser Thr Asn 370
375 380Gly Val Ser Arg Phe Ile Ile385
390
99 8754 DNA Cylindrospermopsis raciborskii AWT205
99 atgcaaaaga
gagaaagccc acagatacta tttgatggga atggaacaca atctgagttt 60ccagatagtt
gcattcacca cttgttcgag gatcaagccg caaagcgacc ggatgcgatc 120gctctcattg
acggtgagca atcccttacc tacggggaac taaatgtacg cgctaaccac 180ctagcccagc
atctcttgtc cctaggctgt caacccgatg acctcctcgc catctgcatc 240gagcgttcgg
cagaactctt tattggtttg ttgggtatcc taaaagccgg atgtgcttat 300gtgcctttgg
atgtaggcta tcctggcgat cgcatagagt atatgttgcg ggactcggat 360gcgcgtattt
tactaacctc aacggatgtc gctaagaaac ttgccttaac catacctgca 420ttgcaagagt
gccaaaccgt ctatttagat caagagatat ttgagtatga ttttcatttt 480ttagcgatag
ctaaactatt acataaccaa tacttgagat tattacattt ttatttttat 540accttgattc
agcaatgcca ggcaacttcg gtttcccaag ggattcagac acaggttctc 600cccaataatc
tcgcttactg catttacacc tctggctcta ccggaaatcc caaagggatc 660ttgatggaac
atcgctcact ggtgaatatg ctttggtggc atcagcaaac gcggccttcg 720gttcagggtg
ttaggacgct gcaattttgt gcagtcagct ttgacttttc ctgccatgaa 780attttttcta
ccctctgtct tggcgggata ttggtcttgg tgccagaggc agtgcgccaa 840aatccctttg
cattggctga gttcatcagt caacagaaaa ttgaaaaatt gtttcttccc 900gttatagcat
tactacagtt ggccgaagct gtaaatggga ataaaagcac ctccctcgcg 960ctttgcgaag
ttatcactac cggggagcag atgcagatca cacctgctgt cgccaacctc 1020tttcagaaaa
ccggggcgat gttgcataat cactacgggg caacagaatt tcaagatgcc 1080accactcata
ccctcaaggg caatccagag ggctggccaa cactggtgcc agtgggtcgt 1140ccactgcaca
atgttcaagt gtatattctg gatgaggcac agcaacctgt acctcttggt 1200ggagagggtg
aattctgtat tggtggtatt ggactggctc gtggctatca caatttgcct 1260gacctaacga
atgaaaaatt tattcccaat ccatttgggg ctaatgagaa cgctaaaaaa 1320ctctaccgca
caggggactt ggcacgctac ctacccgacg gcacgattga gcatttagga 1380cggatagacc
accaggttaa gatccgaggt ttccgcgtgg aattggggga aattgagtcc 1440gtgctggcaa
gtcaccaagc tgtgcgtgaa tgtgccgttg tggcacggga gattgcaggt 1500catacacagt
tggtagggta tatcatagca aaggatacac ttaatctcag tttcgacaaa 1560cttgaaccta
tcctgcgtca atattcggaa gcggtgctgc cagaatacat gatacccact 1620cggttcatca
atatcagtaa tatgccgttg actcccagtg gtaaacttga ccgcagggca 1680ttacctgatc
ccaaaggcga tcgccctgca ttgtctaccc cacttgtcaa gcctcgtacc 1740cagacagaga
aacgtttagc agagatttgg ggcagttatc ttgctgtaga tattgtggga 1800acccacgaca
atttctttga tctaggcggt acgtcactgc tattgactca agcgcacaaa 1860ttcctgtgcg
agacctttaa tattaatttg tccgctgtct cactctttca atatcccaca 1920attcagacat
tggcacaata tattgattgc caaggagaca caacctcaag cgatacagca 1980tccaggcaca
agaaagtacg taaaaagcag tccggtgaca gcaacgatat tgccatcatc 2040agtgtggcag
gtcgctttcc gggtgctgaa acgattgagc agttctggca taatctctgt 2100aatggtgttg
aatccatcac cctttttagt gatgatgagc tagagcagac tttgcctgag 2160ttatttaata
atcccgctta tgtcaaagca ggtgcggtgc tagaaggcgt tgaattattt 2220gatgctacct
tttttggcta cagccccaaa gaagctgcgg tgacagaccc tcagcaacgg 2280attttgctag
agtgtgcctg ggaagcattt gaacgggctg gctacaaccc cgaaacctat 2340ccagaaccag
ttggtgttta tgctggttca agcctgagta cctatctgct taacaatatt 2400ggctctgctt
taggcataat taccgagcaa ccctttattg aaacggatat ggagcagttt 2460caggctaaaa
ttggcaatga ccggagctat cttgctacac gcatctctta caagctgaat 2520ctcaagggtc
caagcgtcaa tgtgcagacc gcctgctcaa cctcgttagt tgcggttcac 2580atggcctgtc
agagtctcat tagtggagag tgtcaaatgg ctttagccgg tggtatttct 2640gtggttgtac
cacagaaggg gggctatctc tacgaagaag gcatggttcg ttcccaggat 2700ggtcattgtc
gcgcctttga tgccgaagcc caagggacta tatttggcaa tggcggcggc 2760ttggttttgc
ttaaacggtt gcaggatgca ctggacgata acgacaacat tatggcagtc 2820atcaaagcca
cagccatcaa caacgacggt gcgctcaaga tgggctacac agcaccgagc 2880gtggatgggc
aagctgatgt aattagcgag gcgattgcta tcgctgacat agatgcaagc 2940accattggct
atgtagaagc tcatggcaca gccacccaat tgggtgatcc gattgaagta 3000gcagggttag
caagggcatt tcagcgtagt acggacagcg tccttggtaa acaacaatgc 3060gctattggat
cagttaaaac taatattggc cacttagatg aggcggcagg cattgccgga 3120ctgataaagg
ctgctctagc tctacaatat ggacagattc caccgagctt gcactatgcc 3180aatcctaatc
cacggattga ttttgacgca accccatttt ttgtcaacac agaactacgc 3240gaatggtcaa
ggaatggtta tcctcggcgg gcgggggtga gttcttttgg tgtgggtgga 3300actaacagcc
atattgtgct ggaggagtcg cctgtaaagc aacccacatt gttctcttct 3360ttgccagaac
gcagtcatca tctgctgacg ctttctgccc atacacaaga ggctttgcat 3420gagttggtgc
aacgctacat ccaacataac gagacacacc ttgatattaa cttaggcgac 3480ctctgtttca
cagccaatac gggacgcaag cattttgagc atcgcctagc ggttgtagcc 3540gaatcaatcc
ctggcttaca ggcacaactg gaaactgcac agactgcgat ttcagcacag 3600aaaaaaaatg
ccccgccgac gatcgcattc ctgtttacag gtcaaggctc acaatacatt 3660aacatggggc
gcaccctcta cgatactgaa tcaacattcc gtgcagccct tgaccgatgt 3720gaaaccattc
tccaaaattt agggatcgag tccattctct ccgttatttt tggttcatct 3780gagcatggac
tctcattaga tgacacagcc tatacccagc ccgcactctt tgccatcgaa 3840tacgcgctct
atcaattatg gaagtcgtgg ggcatccagc cctcagtggt gataggtcat 3900agtgtaggtg
aatatgtgtc cgcttgtgtg gcgggagtct ttagcttaga ggatgggttg 3960aaactgattg
cagaacgagg acgactgata caggcacttc ctcgtgatgg gagcatggtt 4020tccgtgatgg
caagcgagaa gcgtattgca gatatcattt taccttatgg gggacaggta 4080gggatcgccg
cgattaatgg cccacaaagt gttgtaattt ctgggcaaca gcaagcgatt 4140gatgctattt
gtgccatctt ggaaactgag ggcatcaaaa gcaagaagct aaacgtctcc 4200catgccttcc
actcgccgct agtggaagca atgttagact ctttcttgca ggttgcacaa 4260gaggtcactt
actcgcaacc tcaaatcaag cttatctcta atgtaacggg aacattggca 4320agccatgaat
cttgtcccga tgaacttccg atcaccaccg cagagtattg ggtacgtcat 4380gtgcgacagc
ccgtccggtt tgcggcggga atggagagcc ttgagggtca aggggtaaac 4440gtatttatag
aaatcggtcc taaacctgtt cttttaggca tgggacgcga ctgcttgcct 4500gaacaagagg
gactttggtt gcctagtttg cgcccaaaac aggatgattg gcaacaggtg 4560ttaagtagtt
tgcgtgatct atacttagca ggtgtaaccg tagattggag cagtttcgat 4620caggggtatg
ctcgtcgccg tgtgccacta ccgacttatc cttggcagcg agagcggcat 4680tgggtagagc
caattattcg tcaacggcaa tcagtattac aagccacaaa taccaccaag 4740ctaactcgta
acgccagcgt ggcgcagcat cctctgcttg gtcaacggct gcatttgtcg 4800cggactcaag
agatttactt tcaaaccttc atccactccg acttcccaat atgggttgct 4860gatcataaag
tatttggaaa tgtcatcatt ccgggtgtcg cctattttga gatggcactg 4920gcagcaggga
aggcacttaa accagacagt atattttggc tcgaagatgt atccatcgcc 4980caagcactga
ttattcccga tgaagggcaa actgtgcaaa tagtattaag cccacaggaa 5040gagtcagctt
atttttttga aatcctctct ttagaaaaag aaaactcttg ggtgcttcat 5100gcctctggta
agctagtcgc ccaagagcaa gtgctagaaa ccgagccaat tgacttgatt 5160gcgttacagg
cacattgttc cgaagaagtg tcagtagatg tgctatatca ggaagaaatg 5220gcgcgccggc
tggatatggg tccaatgatg cgtggggtga agcagctttg gcgttatccg 5280ctctcctttg
ccaaaagtca tgatgcgatc gcactcgcca aggtcagctt gccagaaatc 5340ttgcttcatg
agtccaatgc ctaccaattc catcctgtaa tcttggatgc ggggctgcaa 5400atgataacgg
tctcttatcc tgaagcaaac caaggccaga cttatgtacc tgttggtata 5460gagggtctac
aagtctatgg tcgtcccagt tcagaacttt ggtgtcgcgc ccaatatcgg 5520cctcctttgg
atacagatca aaggcagggt attgatttgc tgccaaagaa attgattgca 5580gacttgcatc
tatttgatac ccagggtcgt gtggttgcca tcatgtttgg tgtgcaatct 5640gtccttgtgg
gacgggaagc aatgttgcga tcgcaagata cttggcgaaa ttggctttat 5700caagtcctgt
ggaaacctca agcctgtttt ggacttttac cgaattacct gccaacccca 5760gataagattc
ggaaacgcct ggaaacaaag ttagcgacat tgatcatcga agctaatttg 5820gcgacttatg
cgatcgccta tacccaactg gaaaggttaa gtctagctta cgttgtggcg 5880gctttccgac
aaatgggctg gctgtttcaa cccggtgagc gtttttccac cgcccagaag 5940gtatcagcgt
taggaatcgt tgatcaacat cggcaactat tcgctcgttt gctcgacatt 6000ctagccgaag
cagacatact ccgcagcgaa aacttgatga cgatatggga agtcatttca 6060tacccggaaa
cgattgatat acaggtactt cttgacgacc tcgaagccaa agaagcagaa 6120gccgaagtca
cactggtttc ccgttgcagt gcaaaattgg ccgaagtatt acaaggaaaa 6180tgtgacccca
tacagttgct ctttcccgca ggggacacaa caacgttaag caaactctat 6240cgtgaagccc
cagttttggg tgttactaat actctagtcc aagaagcgct tctttccgcc 6300ctggagcagt
tgccgccgga acgtggttgg cgaattttag agattggtgc tggaacaggt 6360ggaaccacag
cctacttgtt accgcatctg cctggggatc agacaaaata tgtctttacc 6420gatattagtg
ccttttttct tgccaaagcg gaagagcgtt ttaaagatta cccgtttgta 6480cgttatcagg
tattagatat cgaacaagca ccacaggcgc aaggatttga accccaaata 6540tacgatttaa
tcgtagcagc ggatgtcttg catgctacta gtgacctgcg tcaaactctt 6600gtacatatcc
ggcaattatt agcgccgggc gggatgttga tcctgatgga agacagcgaa 6660cccgcacgct
gggctgattt aacctttggc ttaacagaag gctggtggaa gtttacagac 6720catgacttac
gccccaacca tccgctattg tctcctgagc agtggcaaat cttgttgtca 6780gaaatgggat
ttagtcaaac aaccgcctta tggccaaaaa tagatagccc ccataaattg 6840ccacgggagg
cggtgattgt ggcgcgtaat gaaccagcca tcagaaaacc ccgaagatgg 6900ctgatcttgg
ctgacgagga gattggtgga ctactagcca aacagctacg tgaagaagga 6960gaagattgta
tactcctctt gccaggggaa aagtacacag agagagattc acaaacgttt 7020acaatcaatc
ctggagatat tgaagagtgg caacagttat tgaaccgagt accgaacata 7080caagaaattg
tacattgttg gagtatggtt tccactgact tagatagagc cactattttc 7140agttgcagca
gtacgctgca tttagttcaa gcattagcaa actatccaaa aaaccctcgc 7200ttgtcacttg
tcaccctagg cgcacaagcc gttaacgaac atcatgttca aaatgtagtt 7260ggagcagccc
tctggggcat gggaaaggta attgcactcg aacacccaga gctacaagta 7320gcacaaatgg
atttagaccc gaatgggaag gttaaggcgc aagtagaagt gcttagggat 7380gaacttctcg
ccagaaaaga ccctgcatca gcaatgtctg tgcctgatct gcaaacacga 7440cctcatgaaa
agcaaatagc ctttcgtgag caaacacgtt atgtggcaag actttcgccc 7500ttagaccgcc
ccaatcctgg agagaaaggc acacaagagg ctcttacctt ccgtgatgat 7560ggcagctatc
tgattgctgg tggtttaggc ggactggggt tagtggtggc tcgttttctg 7620gttacaaatg
gggctaaata ccttgtgcta gtcggacgac gtggtgcgag ggaggaacag 7680caagctcaat
taagcgaact agagcaactc ggagcttccg tgaaagtttt acaagccgat 7740attgctgatg
cagaacaact agcccaagca ctttcagcag taacctaccc accattacgg 7800ggtgttattc
atgcggcagg tacattgaac gatgggattc tacagcagca aagttggcaa 7860gcctttaaag
aagtgatgaa tcccaaggta gcaggtgcgt ggaacctaca tatactgaca 7920aaaaatcagc
ctttagactt ctttgtcctg ttctcctccg ccacctcttt gttaggtaac 7980gctggacaag
ccaatcacgc cgccgcaaat gctttccttg atgggttagc ctcctatcgt 8040cgtcacttag
gactaccgag cctctcgatt aattggggga catggagcga agtgggaatt 8100gcggctcgac
ttgaactaga taagttgtcc agcaaacagg gagagggaac cattacgcta 8160ggacagggct
tacaaattct tgagcagttg ctcaaagacg agaatggggt gtatcaagtg 8220ggtgtcatgc
ctatcaactg gacacaattc ttagcaaggc aattgactcc gcagccgttc 8280ttcagcgatg
ccatgaagag tattgacacc tctgtaggta aactaacctt gcaggagcgg 8340gactcttgcc
cccaaggtta cgggcataat attcgagagc aattagagaa cgctccgccc 8400aaagagggtc
tgactctctt gcaggctcat gttcgggagc aggtttccca agttttgggg 8460atagacacga
agacattatt ggcagaacaa gacgtgggtt tctttaccct ggggatggat 8520tcgctgacct
ctgtcgagtt aagaaacagg ttacaagcca gtttgggctg ctctctttct 8580tccactttgg
cttttgacta tccaacacaa caggctcttg tgaattatct tgccaatgaa 8640ttgctgggaa
cccctgagca gctacaagag cctgaatctg atgaagaaga tcagatatcg 8700tcaatggatg
acatcgtgca gttgctgtcc gcgaaactag agatggaaat ttaa
8754
100 2917 PRT Cylindrospermopsis raciborskii AWT205
100 Met Gln Lys Arg
Glu Ser Pro Gln Ile Leu Phe Asp Gly Asn Gly Thr1 5
10 15Gln Ser Glu Phe Pro Asp Ser Cys Ile His
His Leu Phe Glu Asp Gln 20 25
30Ala Ala Lys Arg Pro Asp Ala Ile Ala Leu Ile Asp Gly Glu Gln Ser
35 40 45Leu Thr Tyr Gly Glu Leu Asn Val
Arg Ala Asn His Leu Ala Gln His 50 55
60Leu Leu Ser Leu Gly Cys Gln Pro Asp Asp Leu Leu Ala Ile Cys Ile65
70 75 80Glu Arg Ser Ala Glu
Leu Phe Ile Gly Leu Leu Gly Ile Leu Lys Ala 85
90 95Gly Cys Ala Tyr Val Pro Leu Asp Val Gly Tyr
Pro Gly Asp Arg Ile 100 105
110Glu Tyr Met Leu Arg Asp Ser Asp Ala Arg Ile Leu Leu Thr Ser Thr
115 120 125Asp Val Ala Lys Lys Leu Ala
Leu Thr Ile Pro Ala Leu Gln Glu Cys 130 135
140Gln Thr Val Tyr Leu Asp Gln Glu Ile Phe Glu Tyr Asp Phe His
Phe145 150 155 160Leu Ala
Ile Ala Lys Leu Leu His Asn Gln Tyr Leu Arg Leu Leu His
165 170 175Phe Tyr Phe Tyr Thr Leu Ile
Gln Gln Cys Gln Ala Thr Ser Val Ser 180 185
190Gln Gly Ile Gln Thr Gln Val Leu Pro Asn Asn Leu Ala Tyr
Cys Ile 195 200 205Tyr Thr Ser Gly
Ser Thr Gly Asn Pro Lys Gly Ile Leu Met Glu His 210
215 220Arg Ser Leu Val Asn Met Leu Trp Trp His Gln Gln
Thr Arg Pro Ser225 230 235
240Val Gln Gly Val Arg Thr Leu Gln Phe Cys Ala Val Ser Phe Asp Phe
245 250 255Ser Cys His Glu Ile
Phe Ser Thr Leu Cys Leu Gly Gly Ile Leu Val 260
265 270Leu Val Pro Glu Ala Val Arg Gln Asn Pro Phe Ala
Leu Ala Glu Phe 275 280 285Ile Ser
Gln Gln Lys Ile Glu Lys Leu Phe Leu Pro Val Ile Ala Leu 290
295 300Leu Gln Leu Ala Glu Ala Val Asn Gly Asn Lys
Ser Thr Ser Leu Ala305 310 315
320Leu Cys Glu Val Ile Thr Thr Gly Glu Gln Met Gln Ile Thr Pro Ala
325 330 335Val Ala Asn Leu
Phe Gln Lys Thr Gly Ala Met Leu His Asn His Tyr 340
345 350Gly Ala Thr Glu Phe Gln Asp Ala Thr Thr His
Thr Leu Lys Gly Asn 355 360 365Pro
Glu Gly Trp Pro Thr Leu Val Pro Val Gly Arg Pro Leu His Asn 370
375 380Val Gln Val Tyr Ile Leu Asp Glu Ala Gln
Gln Pro Val Pro Leu Gly385 390 395
400Gly Glu Gly Glu Phe Cys Ile Gly Gly Ile Gly Leu Ala Arg Gly
Tyr 405 410 415His Asn Leu
Pro Asp Leu Thr Asn Glu Lys Phe Ile Pro Asn Pro Phe 420
425 430Gly Ala Asn Glu Asn Ala Lys Lys Leu Tyr
Arg Thr Gly Asp Leu Ala 435 440
445Arg Tyr Leu Pro Asp Gly Thr Ile Glu His Leu Gly Arg Ile Asp His 450
455 460Gln Val Lys Ile Arg Gly Phe Arg
Val Glu Leu Gly Glu Ile Glu Ser465 470
475 480Val Leu Ala Ser His Gln Ala Val Arg Glu Cys Ala
Val Val Ala Arg 485 490
495Glu Ile Ala Gly His Thr Gln Leu Val Gly Tyr Ile Ile Ala Lys Asp
500 505 510Thr Leu Asn Leu Ser Phe
Asp Lys Leu Glu Pro Ile Leu Arg Gln Tyr 515 520
525Ser Glu Ala Val Leu Pro Glu Tyr Met Ile Pro Thr Arg Phe
Ile Asn 530 535 540Ile Ser Asn Met Pro
Leu Thr Pro Ser Gly Lys Leu Asp Arg Arg Ala545 550
555 560Leu Pro Asp Pro Lys Gly Asp Arg Pro Ala
Leu Ser Thr Pro Leu Val 565 570
575Lys Pro Arg Thr Gln Thr Glu Lys Arg Leu Ala Glu Ile Trp Gly Ser
580 585 590Tyr Leu Ala Val Asp
Ile Val Gly Thr His Asp Asn Phe Phe Asp Leu 595
600 605Gly Gly Thr Ser Leu Leu Leu Thr Gln Ala His Lys
Phe Leu Cys Glu 610 615 620Thr Phe Asn
Ile Asn Leu Ser Ala Val Ser Leu Phe Gln Tyr Pro Thr625
630 635 640Ile Gln Thr Leu Ala Gln Tyr
Ile Asp Cys Gln Gly Asp Thr Thr Ser 645
650 655Ser Asp Thr Ala Ser Arg His Lys Lys Val Arg Lys
Lys Gln Ser Gly 660 665 670Asp
Ser Asn Asp Ile Ala Ile Ile Ser Val Ala Gly Arg Phe Pro Gly 675
680 685Ala Glu Thr Ile Glu Gln Phe Trp His
Asn Leu Cys Asn Gly Val Glu 690 695
700Ser Ile Thr Leu Phe Ser Asp Asp Glu Leu Glu Gln Thr Leu Pro Glu705
710 715 720Leu Phe Asn Asn
Pro Ala Tyr Val Lys Ala Gly Ala Val Leu Glu Gly 725
730 735Val Glu Leu Phe Asp Ala Thr Phe Phe Gly
Tyr Ser Pro Lys Glu Ala 740 745
750Ala Val Thr Asp Pro Gln Gln Arg Ile Leu Leu Glu Cys Ala Trp Glu
755 760 765Ala Phe Glu Arg Ala Gly Tyr
Asn Pro Glu Thr Tyr Pro Glu Pro Val 770 775
780Gly Val Tyr Ala Gly Ser Ser Leu Ser Thr Tyr Leu Leu Asn Asn
Ile785 790 795 800Gly Ser
Ala Leu Gly Ile Ile Thr Glu Gln Pro Phe Ile Glu Thr Asp
805 810 815Met Glu Gln Phe Gln Ala Lys
Ile Gly Asn Asp Arg Ser Tyr Leu Ala 820 825
830Thr Arg Ile Ser Tyr Lys Leu Asn Leu Lys Gly Pro Ser Val
Asn Val 835 840 845Gln Thr Ala Cys
Ser Thr Ser Leu Val Ala Val His Met Ala Cys Gln 850
855 860Ser Leu Ile Ser Gly Glu Cys Gln Met Ala Leu Ala
Gly Gly Ile Ser865 870 875
880Val Val Val Pro Gln Lys Gly Gly Tyr Leu Tyr Glu Glu Gly Met Val
885 890 895Arg Ser Gln Asp Gly
His Cys Arg Ala Phe Asp Ala Glu Ala Gln Gly 900
905 910Thr Ile Phe Gly Asn Gly Gly Gly Leu Val Leu Leu
Lys Arg Leu Gln 915 920 925Asp Ala
Leu Asp Asp Asn Asp Asn Ile Met Ala Val Ile Lys Ala Thr 930
935 940Ala Ile Asn Asn Asp Gly Ala Leu Lys Met Gly
Tyr Thr Ala Pro Ser945 950 955
960Val Asp Gly Gln Ala Asp Val Ile Ser Glu Ala Ile Ala Ile Ala Asp
965 970 975Ile Asp Ala Ser
Thr Ile Gly Tyr Val Glu Ala His Gly Thr Ala Thr 980
985 990Gln Leu Gly Asp Pro Ile Glu Val Ala Gly Leu
Ala Arg Ala Phe Gln 995 1000
1005Arg Ser Thr Asp Ser Val Leu Gly Lys Gln Gln Cys Ala Ile Gly
1010 1015 1020Ser Val Lys Thr Asn Ile
Gly His Leu Asp Glu Ala Ala Gly Ile 1025 1030
1035Ala Gly Leu Ile Lys Ala Ala Leu Ala Leu Gln Tyr Gly Gln
Ile 1040 1045 1050Pro Pro Ser Leu His
Tyr Ala Asn Pro Asn Pro Arg Ile Asp Phe 1055 1060
1065Asp Ala Thr Pro Phe Phe Val Asn Thr Glu Leu Arg Glu
Trp Ser 1070 1075 1080Arg Asn Gly Tyr
Pro Arg Arg Ala Gly Val Ser Ser Phe Gly Val 1085
1090 1095Gly Gly Thr Asn Ser His Ile Val Leu Glu Glu
Ser Pro Val Lys 1100 1105 1110Gln Pro
Thr Leu Phe Ser Ser Leu Pro Glu Arg Ser His His Leu 1115
1120 1125Leu Thr Leu Ser Ala His Thr Gln Glu Ala
Leu His Glu Leu Val 1130 1135 1140Gln
Arg Tyr Ile Gln His Asn Glu Thr His Leu Asp Ile Asn Leu 1145
1150 1155Gly Asp Leu Cys Phe Thr Ala Asn Thr
Gly Arg Lys His Phe Glu 1160 1165
1170His Arg Leu Ala Val Val Ala Glu Ser Ile Pro Gly Leu Gln Ala
1175 1180 1185Gln Leu Glu Thr Ala Gln
Thr Ala Ile Ser Ala Gln Lys Lys Asn 1190 1195
1200Ala Pro Pro Thr Ile Ala Phe Leu Phe Thr Gly Gln Gly Ser
Gln 1205 1210 1215Tyr Ile Asn Met Gly
Arg Thr Leu Tyr Asp Thr Glu Ser Thr Phe 1220 1225
1230Arg Ala Ala Leu Asp Arg Cys Glu Thr Ile Leu Gln Asn
Leu Gly 1235 1240 1245Ile Glu Ser Ile
Leu Ser Val Ile Phe Gly Ser Ser Glu His Gly 1250
1255 1260Leu Ser Leu Asp Asp Thr Ala Tyr Thr Gln Pro
Ala Leu Phe Ala 1265 1270 1275Ile Glu
Tyr Ala Leu Tyr Gln Leu Trp Lys Ser Trp Gly Ile Gln 1280
1285 1290Pro Ser Val Val Ile Gly His Ser Val Gly
Glu Tyr Val Ser Ala 1295 1300 1305Cys
Val Ala Gly Val Phe Ser Leu Glu Asp Gly Leu Lys Leu Ile 1310
1315 1320Ala Glu Arg Gly Arg Leu Ile Gln Ala
Leu Pro Arg Asp Gly Ser 1325 1330
1335Met Val Ser Val Met Ala Ser Glu Lys Arg Ile Ala Asp Ile Ile
1340 1345 1350Leu Pro Tyr Gly Gly Gln
Val Gly Ile Ala Ala Ile Asn Gly Pro 1355 1360
1365Gln Ser Val Val Ile Ser Gly Gln Gln Gln Ala Ile Asp Ala
Ile 1370 1375 1380Cys Ala Ile Leu Glu
Thr Glu Gly Ile Lys Ser Lys Lys Leu Asn 1385 1390
1395Val Ser His Ala Phe His Ser Pro Leu Val Glu Ala Met
Leu Asp 1400 1405 1410Ser Phe Leu Gln
Val Ala Gln Glu Val Thr Tyr Ser Gln Pro Gln 1415
1420 1425Ile Lys Leu Ile Ser Asn Val Thr Gly Thr Leu
Ala Ser His Glu 1430 1435 1440Ser Cys
Pro Asp Glu Leu Pro Ile Thr Thr Ala Glu Tyr Trp Val 1445
1450 1455Arg His Val Arg Gln Pro Val Arg Phe Ala
Ala Gly Met Glu Ser 1460 1465 1470Leu
Glu Gly Gln Gly Val Asn Val Phe Ile Glu Ile Gly Pro Lys 1475
1480 1485Pro Val Leu Leu Gly Met Gly Arg Asp
Cys Leu Pro Glu Gln Glu 1490 1495
1500Gly Leu Trp Leu Pro Ser Leu Arg Pro Lys Gln Asp Asp Trp Gln
1505 1510 1515Gln Val Leu Ser Ser Leu
Arg Asp Leu Tyr Leu Ala Gly Val Thr 1520 1525
1530Val Asp Trp Ser Ser Phe Asp Gln Gly Tyr Ala Arg Arg Arg
Val 1535 1540 1545Pro Leu Pro Thr Tyr
Pro Trp Gln Arg Glu Arg His Trp Val Glu 1550 1555
1560Pro Ile Ile Arg Gln Arg Gln Ser Val Leu Gln Ala Thr
Asn Thr 1565 1570 1575Thr Lys Leu Thr
Arg Asn Ala Ser Val Ala Gln His Pro Leu Leu 1580
1585 1590Gly Gln Arg Leu His Leu Ser Arg Thr Gln Glu
Ile Tyr Phe Gln 1595 1600 1605Thr Phe
Ile His Ser Asp Phe Pro Ile Trp Val Ala Asp His Lys 1610
1615 1620Val Phe Gly Asn Val Ile Ile Pro Gly Val
Ala Tyr Phe Glu Met 1625 1630 1635Ala
Leu Ala Ala Gly Lys Ala Leu Lys Pro Asp Ser Ile Phe Trp 1640
1645 1650Leu Glu Asp Val Ser Ile Ala Gln Ala
Leu Ile Ile Pro Asp Glu 1655 1660
1665Gly Gln Thr Val Gln Ile Val Leu Ser Pro Gln Glu Glu Ser Ala
1670 1675 1680Tyr Phe Phe Glu Ile Leu
Ser Leu Glu Lys Glu Asn Ser Trp Val 1685 1690
1695Leu His Ala Ser Gly Lys Leu Val Ala Gln Glu Gln Val Leu
Glu 1700 1705 1710Thr Glu Pro Ile Asp
Leu Ile Ala Leu Gln Ala His Cys Ser Glu 1715 1720
1725Glu Val Ser Val Asp Val Leu Tyr Gln Glu Glu Met Ala
Arg Arg 1730 1735 1740Leu Asp Met Gly
Pro Met Met Arg Gly Val Lys Gln Leu Trp Arg 1745
1750 1755Tyr Pro Leu Ser Phe Ala Lys Ser His Asp Ala
Ile Ala Leu Ala 1760 1765 1770Lys Val
Ser Leu Pro Glu Ile Leu Leu His Glu Ser Asn Ala Tyr 1775
1780 1785Gln Phe His Pro Val Ile Leu Asp Ala Gly
Leu Gln Met Ile Thr 1790 1795 1800Val
Ser Tyr Pro Glu Ala Asn Gln Gly Gln Thr Tyr Val Pro Val 1805
1810 1815Gly Ile Glu Gly Leu Gln Val Tyr Gly
Arg Pro Ser Ser Glu Leu 1820 1825
1830Trp Cys Arg Ala Gln Tyr Arg Pro Pro Leu Asp Thr Asp Gln Arg
1835 1840 1845Gln Gly Ile Asp Leu Leu
Pro Lys Lys Leu Ile Ala Asp Leu His 1850 1855
1860Leu Phe Asp Thr Gln Gly Arg Val Val Ala Ile Met Phe Gly
Val 1865 1870 1875Gln Ser Val Leu Val
Gly Arg Glu Ala Met Leu Arg Ser Gln Asp 1880 1885
1890Thr Trp Arg Asn Trp Leu Tyr Gln Val Leu Trp Lys Pro
Gln Ala 1895 1900 1905Cys Phe Gly Leu
Leu Pro Asn Tyr Leu Pro Thr Pro Asp Lys Ile 1910
1915 1920Arg Lys Arg Leu Glu Thr Lys Leu Ala Thr Leu
Ile Ile Glu Ala 1925 1930 1935Asn Leu
Ala Thr Tyr Ala Ile Ala Tyr Thr Gln Leu Glu Arg Leu 1940
1945 1950Ser Leu Ala Tyr Val Val Ala Ala Phe Arg
Gln Met Gly Trp Leu 1955 1960 1965Phe
Gln Pro Gly Glu Arg Phe Ser Thr Ala Gln Lys Val Ser Ala 1970
1975 1980Leu Gly Ile Val Asp Gln His Arg Gln
Leu Phe Ala Arg Leu Leu 1985 1990
1995Asp Ile Leu Ala Glu Ala Asp Ile Leu Arg Ser Glu Asn Leu Met
2000 2005 2010Thr Ile Trp Glu Val Ile
Ser Tyr Pro Glu Thr Ile Asp Ile Gln 2015 2020
2025Val Leu Leu Asp Asp Leu Glu Ala Lys Glu Ala Glu Ala Glu
Val 2030 2035 2040Thr Leu Val Ser Arg
Cys Ser Ala Lys Leu Ala Glu Val Leu Gln 2045 2050
2055Gly Lys Cys Asp Pro Ile Gln Leu Leu Phe Pro Ala Gly
Asp Thr 2060 2065 2070Thr Thr Leu Ser
Lys Leu Tyr Arg Glu Ala Pro Val Leu Gly Val 2075
2080 2085Thr Asn Thr Leu Val Gln Glu Ala Leu Leu Ser
Ala Leu Glu Gln 2090 2095 2100Leu Pro
Pro Glu Arg Gly Trp Arg Ile Leu Glu Ile Gly Ala Gly 2105
2110 2115Thr Gly Gly Thr Thr Ala Tyr Leu Leu Pro
His Leu Pro Gly Asp 2120 2125 2130Gln
Thr Lys Tyr Val Phe Thr Asp Ile Ser Ala Phe Phe Leu Ala 2135
2140 2145Lys Ala Glu Glu Arg Phe Lys Asp Tyr
Pro Phe Val Arg Tyr Gln 2150 2155
2160Val Leu Asp Ile Glu Gln Ala Pro Gln Ala Gln Gly Phe Glu Pro
2165 2170 2175Gln Ile Tyr Asp Leu Ile
Val Ala Ala Asp Val Leu His Ala Thr 2180 2185
2190Ser Asp Leu Arg Gln Thr Leu Val His Ile Arg Gln Leu Leu
Ala 2195 2200 2205Pro Gly Gly Met Leu
Ile Leu Met Glu Asp Ser Glu Pro Ala Arg 2210 2215
2220Trp Ala Asp Leu Thr Phe Gly Leu Thr Glu Gly Trp Trp
Lys Phe 2225 2230 2235Thr Asp His Asp
Leu Arg Pro Asn His Pro Leu Leu Ser Pro Glu 2240
2245 2250Gln Trp Gln Ile Leu Leu Ser Glu Met Gly Phe
Ser Gln Thr Thr 2255 2260 2265Ala Leu
Trp Pro Lys Ile Asp Ser Pro His Lys Leu Pro Arg Glu 2270
2275 2280Ala Val Ile Val Ala Arg Asn Glu Pro Ala
Ile Arg Lys Pro Arg 2285 2290 2295Arg
Trp Leu Ile Leu Ala Asp Glu Glu Ile Gly Gly Leu Leu Ala 2300
2305 2310Lys Gln Leu Arg Glu Glu Gly Glu Asp
Cys Ile Leu Leu Leu Pro 2315 2320
2325Gly Glu Lys Tyr Thr Glu Arg Asp Ser Gln Thr Phe Thr Ile Asn
2330 2335 2340Pro Gly Asp Ile Glu Glu
Trp Gln Gln Leu Leu Asn Arg Val Pro 2345 2350
2355Asn Ile Gln Glu Ile Val His Cys Trp Ser Met Val Ser Thr
Asp 2360 2365 2370Leu Asp Arg Ala Thr
Ile Phe Ser Cys Ser Ser Thr Leu His Leu 2375 2380
2385Val Gln Ala Leu Ala Asn Tyr Pro Lys Asn Pro Arg Leu
Ser Leu 2390 2395 2400Val Thr Leu Gly
Ala Gln Ala Val Asn Glu His His Val Gln Asn 2405
2410 2415Val Val Gly Ala Ala Leu Trp Gly Met Gly Lys
Val Ile Ala Leu 2420 2425 2430Glu His
Pro Glu Leu Gln Val Ala Gln Met Asp Leu Asp Pro Asn 2435
2440 2445Gly Lys Val Lys Ala Gln Val Glu Val Leu
Arg Asp Glu Leu Leu 2450 2455 2460Ala
Arg Lys Asp Pro Ala Ser Ala Met Ser Val Pro Asp Leu Gln 2465
2470 2475Thr Arg Pro His Glu Lys Gln Ile Ala
Phe Arg Glu Gln Thr Arg 2480 2485
2490Tyr Val Ala Arg Leu Ser Pro Leu Asp Arg Pro Asn Pro Gly Glu
2495 2500 2505Lys Gly Thr Gln Glu Ala
Leu Thr Phe Arg Asp Asp Gly Ser Tyr 2510 2515
2520Leu Ile Ala Gly Gly Leu Gly Gly Leu Gly Leu Val Val Ala
Arg 2525 2530 2535Phe Leu Val Thr Asn
Gly Ala Lys Tyr Leu Val Leu Val Gly Arg 2540 2545
2550Arg Gly Ala Arg Glu Glu Gln Gln Ala Gln Leu Ser Glu
Leu Glu 2555 2560 2565Gln Leu Gly Ala
Ser Val Lys Val Leu Gln Ala Asp Ile Ala Asp 2570
2575 2580Ala Glu Gln Leu Ala Gln Ala Leu Ser Ala Val
Thr Tyr Pro Pro 2585 2590 2595Leu Arg
Gly Val Ile His Ala Ala Gly Thr Leu Asn Asp Gly Ile 2600
2605 2610Leu Gln Gln Gln Ser Trp Gln Ala Phe Lys
Glu Val Met Asn Pro 2615 2620 2625Lys
Val Ala Gly Ala Trp Asn Leu His Ile Leu Thr Lys Asn Gln 2630
2635 2640Pro Leu Asp Phe Phe Val Leu Phe Ser
Ser Ala Thr Ser Leu Leu 2645 2650
2655Gly Asn Ala Gly Gln Ala Asn His Ala Ala Ala Asn Ala Phe Leu
2660 2665 2670Asp Gly Leu Ala Ser Tyr
Arg Arg His Leu Gly Leu Pro Ser Leu 2675 2680
2685Ser Ile Asn Trp Gly Thr Trp Ser Glu Val Gly Ile Ala Ala
Arg 2690 2695 2700Leu Glu Leu Asp Lys
Leu Ser Ser Lys Gln Gly Glu Gly Thr Ile 2705 2710
2715Thr Leu Gly Gln Gly Leu Gln Ile Leu Glu Gln Leu Leu
Lys Asp 2720 2725 2730Glu Asn Gly Val
Tyr Gln Val Gly Val Met Pro Ile Asn Trp Thr 2735
2740 2745Gln Phe Leu Ala Arg Gln Leu Thr Pro Gln Pro
Phe Phe Ser Asp 2750 2755 2760Ala Met
Lys Ser Ile Asp Thr Ser Val Gly Lys Leu Thr Leu Gln 2765
2770 2775Glu Arg Asp Ser Cys Pro Gln Gly Tyr Gly
His Asn Ile Arg Glu 2780 2785 2790Gln
Leu Glu Asn Ala Pro Pro Lys Glu Gly Leu Thr Leu Leu Gln 2795
2800 2805Ala His Val Arg Glu Gln Val Ser Gln
Val Leu Gly Ile Asp Thr 2810 2815
2820Lys Thr Leu Leu Ala Glu Gln Asp Val Gly Phe Phe Thr Leu Gly
2825 2830 2835Met Asp Ser Leu Thr Ser
Val Glu Leu Arg Asn Arg Leu Gln Ala 2840 2845
2850Ser Leu Gly Cys Ser Leu Ser Ser Thr Leu Ala Phe Asp Tyr
Pro 2855 2860 2865Thr Gln Gln Ala Leu
Val Asn Tyr Leu Ala Asn Glu Leu Leu Gly 2870 2875
2880Thr Pro Glu Gln Leu Gln Glu Pro Glu Ser Asp Glu Glu
Asp Gln 2885 2890 2895Ile Ser Ser Met
Asp Asp Ile Val Gln Leu Leu Ser Ala Lys Leu 2900
2905 2910Glu Met Glu Ile
2915
101 5667 DNA Cylindrospermopsis raciborskii AWT205
101 atggatgaaa
aactaagaac atacgaacga ttaatcaagc aatcctatca caagatagag 60gctctggaag
ctgaagttaa caggttgaag caaacccaat gtgaacctat cgccatcgtc 120ggcatgggct
gtcgttttcc tggtgcgaat agtccagaag cgttttggca gttgttgtgt 180gatggggttg
atgctattcg tgagatacca aaaaatcgat gggttgttga tgcctacata 240gatgaaaatt
tggaccgcgc agacaagaca tcaatgcgat ttggcgggtt tgtcgagcaa 300cttgagaagt
ttgatgccca attctttggc atatcaccgc gagaagcggt ttctcttgac 360cctcagcaac
gtttgttatt agaagtaagt tgggaagcac tggaaaatgc agcggtgata 420ccaccttcgg
caacgggcgt attcgtcggt attagtaacc ttgattatcg tgaaacgctc 480ttgaagcaag
gagcaattgg tacttatttt gcttcgggta atgcccatag cacagccagt 540ggtcgcttgt
cttactttct cggtctgaca ggcccctgtc tctcgataga tacagcttgt 600tcttcgtcgt
tggtcgctgt acatcagtca ctgataagtc tgcgtcagcg agaatgtgac 660ttagcgttgg
ttgggggagt ccatcggctg atagccccag aggaaagtgt ctcgttagca 720aaagcccata
tgttatctcc cgatggtcgt tgcaaagtct ttgatgcgtc ggcaaacggg 780tatgtccgag
ccgaaggatg tggcatgata gtcctcaaac gattatcgga cgcgcaagct 840gatggggata
aaatcttggc gttgattcgc gggtcagcca taaatcaaga cggtcgcacg 900agtggcttga
ccgttccaaa tggtccccaa caagccgacg tgattcgcca agccctcgcc 960aatagtggca
taagaccaga acaagttaac tatgtagaag ctcatggcac agggacttcc 1020ctaggagacc
cgattgaggt cggcgcgttg ggaacgatct ttaatcaacg ctcccaacct 1080ttaattattg
gttcagttaa aacaaatatt gggcatctag aagcagcagc agggattgct 1140ggactgatta
aagtcgtcct tgccatgcag catggagaaa ttccacctaa tttacacttt 1200caccagccca
atcctcgcat taactgggat aaattgccaa tcaggatccc cacagaacga 1260acagcttggc
ctactggcga tcgcatcgca gggataagtt ctttcggctt tagtggcact 1320aattctcatg
tcgtgttaga ggaagcccca aaaatagagc cgtctacttt agagattcat 1380tcaaagcagt
atgtttttac cttatcagca gcgacacctc aagcactaca agaacttact 1440cagcgttatg
taacttatct cactgaacac ttacaagaga gtctggcgga tatttgcttt 1500acagccaaca
cagggcgcaa acactttaga catcgctttg cagtagtagc agagtctaaa 1560acccagttgc
gccaacaatt ggaaacgttt gcccaatcgg gagaggggca ggggaagagg 1620acatctctct
caaaaatagc ttttctcttt acaggtcaag gctcacagta tgtggggatg 1680gggcaagaac
tttatgagag ccaacccacc ttccggcaaa ccattgaccg atgtgatgag 1740attcttcgtt
cactgttggg caaatcaatc ctctcaatac tctatcccag ccaacaaatg 1800ggattggaaa
cgccatccca aattgatgaa accgcctata ctcaacccac tcttttttct 1860cttgaatatg
cactggcgca gttgtggcgc tcctggggta ttgagcctga tgtggtgatg 1920gggcatagtg
tgggagaata tgtggccgct tgtgtggcgg gtgtcttttc tttagaggat 1980ggactcaaac
taattgctga aagaggccgt ctgatgcaag aattgcctcc cgatggggcg 2040atggtttcag
ttatggccaa taaatcgcgc atagagcaag caattcaatc tgtcagccga 2100gaggtttcta
ttgcggccat caatggacct gagagtgtgg ttatctctgg taaaagggag 2160atattacaac
agattaccga acatctggtt gccgaaggca ttaagacacg ccaactgaag 2220gtctctcatg
cctttcactc accattgatg gagccaatat taggtcagtt ccgccgagtt 2280gccaatacca
tcacctatcg gccaccgcaa attaaccttg tctcaaatgt cacaggcgga 2340caggtgtata
aagaaatcgc tactcccgat tattgggtga gacatctgca agagactgtc 2400cgttttgcgg
atggggttaa ggtgttacat gaacagaatg tcaatttcat gctcgaaatt 2460ggtcccaaac
ccacactgct gggcatggtt gagttacaaa gttctgagaa tccattttct 2520atgccaatga
tgatgcccag tttgcgtcag aatcgtagcg actggcagca gatgttggag 2580agcttgagtc
aactctatgt tcatggtgtt gagattgact ggatcggttt taataaagac 2640tatgtgcgac
ataaagttgt cctgccgaca tacccatggc agaaggagcg ttactgggta 2700gaattggatc
aacagaagca cgccgctaaa aatctacatc ctctactgga caggtgcatg 2760aagctgcctc
gtcataacga aacaattttt gagaaagaat ttagtctaga gacattgccc 2820tttcttgctg
actatcgcat ttatggttca gttgtgtcgc caggtgcaag ttatctatca 2880atgatactaa
gtattgccga gtcgtatgca aatggtcatt tgaatggagg gaatagtgca 2940aagcaaacca
cttatttact aaaggatgtc acattcccag tacctcttgt gatctctgat 3000gaggcaaatt
acatggtgca agttgcttgt tctctctctt gtgctgcgcc acacaatcgt 3060ggcgacgaga
cgcagtttga attgttcagt tttgctgaga atgtacctga aagtagcagt 3120ataaatgctg
attttcagac acccattatt catgcaaaag ggcaatttaa gcttgaagat 3180acagcacctc
ctaaagtgga gctagaagaa ctacaagcgg gttgtcccca agaaattgat 3240ctcaaccttt
tctatcaaac attcacagac aaaggttttg tttttggatc tcgttttcgc 3300tggttagaac
aaatctgggt gggcgatgga gaagcattgg cgcgtctgcg acaaccggaa 3360agtattgaat
cgtttaaagg atatgtgatt catcccggtt tgttggatgc ctgtacacaa 3420gtcccatttg
caatttcgtc tgacgatgaa aataggcaat cagaaacgac aatgcccttt 3480gcgctgaatg
aattacgttg ttatcagcct gcaaacggac aaatgtggtg ggttcatgca 3540acagaaaaag
atagatatac atgggatgtt tctctgtttg atgagagcgg gcaagttatt 3600gcggaattta
taggtttaga agttcgtgct gctatgcccg aaggcttact aagggcagac 3660ttttggcata
actggctcta tacagtgaat tggcgatcgc aacctctaca aatcccagag 3720gtgctggata
ttaataagac aggtgcagaa acatggcttc tttttgcaca accagaggga 3780ataggagcgg
acttagccga atatttgcag agccaaggaa agcactgtgt ttttgtagtg 3840cctgggagtg
agtatacagt gaccgagcaa cacattggac gcactggaca tcttgatgtg 3900acgaaactga
caaaaattgt cacgatcaat cctgcttctc ctcatgacta taaatatttt 3960ttagaaactc
tgacggacat tagattacct tgtgaacata tactctattt atggaatcgt 4020tatgatttaa
caaatacttc taatcatcgg acagaattga ctgtaccaga tatagtctta 4080aacttatgta
ctagtcttac ttatttggta caagccctta gccacatggg tttttccccg 4140aaattatggc
taattacaca aaatagtcaa gcggttggta gtgacttagc gaatttagaa 4200atcgaacaat
ccccattatg ggcattgggt cgaagcatcc gcgccgaaca ccctgaattt 4260gattgccgtt
gtttagattt tgacacgctc tcaaatatcg caccactctt gttgaaagag 4320atgcaagcta
tagactatga atctcaaatt gcttaccgac aaggaacgcg ctatgttgca 4380cgactaattc
gtaatcaatc agaatgtcac gcaccgattc aaacaggaat ccgtcctgat 4440ggcagctatt
tgattacagg tggattaggc ggtctaggat tgcaggtagc actcgccctt 4500gcggacgctg
gagcaagaca cttgatcctc aatagtcgcc gtggtacggt ctccaaagaa 4560gcccagttaa
ttattgaccg actacgccaa gaggatgtta gggttgattt gattgcggca 4620gatgtctctg
atgcggcaga tagcgaacga ctcttagtag aaagtcagcg caagacctct 4680cttcgaggga
ttgtccatgt tgcgggagtc ttggatgatg gcatcctgct ccaacaaaat 4740caagagcgtt
ttgaaaaagt gatggcggct aaggtacgcg gagcttggca tctggaccaa 4800cagagccaaa
ccctcgattt agatttcttt gttgcgttct catctgttgc gtcgctcata 4860gaagaaccag
gacaagccaa ttacgccgca gcgaatgcgt ttttggattc attaatgtat 4920tatcgtcaca
taaagggatc taatagcttg agtatcaact ggggggcttg ggcagaagtc 4980ggcatggcag
ccaatttatc atgggaacaa cggggaatcg cggcaatttc tccaaagcaa 5040gggaggcata
ttctcgtcca acttattcaa aaacttaatc agcatacaat cccccaagtt 5100gctgtacaac
cgaccaattg ggctgaatat ctatcccatg atggcgtgaa tatgccattc 5160tatgaatatt
ttacacacca cttgcgtaac gaaaaagaag ccaaattgcg gcaaacagca 5220ggcagcacct
cagaggaagt cagtctgcgg caacagcttc aaacactctc agagaaagac 5280cgggatgccc
ttttgatgga acatcttcaa aaaactgcga tcagagttct cggtttggca 5340tctaatcaaa
aaattgatcc ctatcaggga ttgatgaata tgggactaga ctctttgatg 5400gcggttgaat
ttcggaatca cttgatacgt agtttagaac gccctctgcc agccactctg 5460ctctttaatt
gcccaacact tgattcattg catgattacc tagtcgcaaa aatgtttgat 5520gatgcccctc
agaaggcaga gcaaatggca caaccaacaa cactgacagc acacagcata 5580tcaatagaat
ccaaaataga tgataacgaa agcgtggatg acattgcaca aatgctggca 5640caagcactca
atatcgcctt tgagtag
5667
102 1888 PRT Cylindrospermopsis raciborskii AWT205
102 Met Asp Glu Lys
Leu Arg Thr Tyr Glu Arg Leu Ile Lys Gln Ser Tyr1 5
10 15His Lys Ile Glu Ala Leu Glu Ala Glu Val
Asn Arg Leu Lys Gln Thr 20 25
30Gln Cys Glu Pro Ile Ala Ile Val Gly Met Gly Cys Arg Phe Pro Gly
35 40 45Ala Asn Ser Pro Glu Ala Phe Trp
Gln Leu Leu Cys Asp Gly Val Asp 50 55
60Ala Ile Arg Glu Ile Pro Lys Asn Arg Trp Val Val Asp Ala Tyr Ile65
70 75 80Asp Glu Asn Leu Asp
Arg Ala Asp Lys Thr Ser Met Arg Phe Gly Gly 85
90 95Phe Val Glu Gln Leu Glu Lys Phe Asp Ala Gln
Phe Phe Gly Ile Ser 100 105
110Pro Arg Glu Ala Val Ser Leu Asp Pro Gln Gln Arg Leu Leu Leu Glu
115 120 125Val Ser Trp Glu Ala Leu Glu
Asn Ala Ala Val Ile Pro Pro Ser Ala 130 135
140Thr Gly Val Phe Val Gly Ile Ser Asn Leu Asp Tyr Arg Glu Thr
Leu145 150 155 160Leu Lys
Gln Gly Ala Ile Gly Thr Tyr Phe Ala Ser Gly Asn Ala His
165 170 175Ser Thr Ala Ser Gly Arg Leu
Ser Tyr Phe Leu Gly Leu Thr Gly Pro 180 185
190Cys Leu Ser Ile Asp Thr Ala Cys Ser Ser Ser Leu Val Ala
Val His 195 200 205Gln Ser Leu Ile
Ser Leu Arg Gln Arg Glu Cys Asp Leu Ala Leu Val 210
215 220Gly Gly Val His Arg Leu Ile Ala Pro Glu Glu Ser
Val Ser Leu Ala225 230 235
240Lys Ala His Met Leu Ser Pro Asp Gly Arg Cys Lys Val Phe Asp Ala
245 250 255Ser Ala Asn Gly Tyr
Val Arg Ala Glu Gly Cys Gly Met Ile Val Leu 260
265 270Lys Arg Leu Ser Asp Ala Gln Ala Asp Gly Asp Lys
Ile Leu Ala Leu 275 280 285Ile Arg
Gly Ser Ala Ile Asn Gln Asp Gly Arg Thr Ser Gly Leu Thr 290
295 300Val Pro Asn Gly Pro Gln Gln Ala Asp Val Ile
Arg Gln Ala Leu Ala305 310 315
320Asn Ser Gly Ile Arg Pro Glu Gln Val Asn Tyr Val Glu Ala His Gly
325 330 335Thr Gly Thr Ser
Leu Gly Asp Pro Ile Glu Val Gly Ala Leu Gly Thr 340
345 350Ile Phe Asn Gln Arg Ser Gln Pro Leu Ile Ile
Gly Ser Val Lys Thr 355 360 365Asn
Ile Gly His Leu Glu Ala Ala Ala Gly Ile Ala Gly Leu Ile Lys 370
375 380Val Val Leu Ala Met Gln His Gly Glu Ile
Pro Pro Asn Leu His Phe385 390 395
400His Gln Pro Asn Pro Arg Ile Asn Trp Asp Lys Leu Pro Ile Arg
Ile 405 410 415Pro Thr Glu
Arg Thr Ala Trp Pro Thr Gly Asp Arg Ile Ala Gly Ile 420
425 430Ser Ser Phe Gly Phe Ser Gly Thr Asn Ser
His Val Val Leu Glu Glu 435 440
445Ala Pro Lys Ile Glu Pro Ser Thr Leu Glu Ile His Ser Lys Gln Tyr 450
455 460Val Phe Thr Leu Ser Ala Ala Thr
Pro Gln Ala Leu Gln Glu Leu Thr465 470
475 480Gln Arg Tyr Val Thr Tyr Leu Thr Glu His Leu Gln
Glu Ser Leu Ala 485 490
495Asp Ile Cys Phe Thr Ala Asn Thr Gly Arg Lys His Phe Arg His Arg
500 505 510Phe Ala Val Val Ala Glu
Ser Lys Thr Gln Leu Arg Gln Gln Leu Glu 515 520
525Thr Phe Ala Gln Ser Gly Glu Gly Gln Gly Lys Arg Thr Ser
Leu Ser 530 535 540Lys Ile Ala Phe Leu
Phe Thr Gly Gln Gly Ser Gln Tyr Val Gly Met545 550
555 560Gly Gln Glu Leu Tyr Glu Ser Gln Pro Thr
Phe Arg Gln Thr Ile Asp 565 570
575Arg Cys Asp Glu Ile Leu Arg Ser Leu Leu Gly Lys Ser Ile Leu Ser
580 585 590Ile Leu Tyr Pro Ser
Gln Gln Met Gly Leu Glu Thr Pro Ser Gln Ile 595
600 605Asp Glu Thr Ala Tyr Thr Gln Pro Thr Leu Phe Ser
Leu Glu Tyr Ala 610 615 620Leu Ala Gln
Leu Trp Arg Ser Trp Gly Ile Glu Pro Asp Val Val Met625
630 635 640Gly His Ser Val Gly Glu Tyr
Val Ala Ala Cys Val Ala Gly Val Phe 645
650 655Ser Leu Glu Asp Gly Leu Lys Leu Ile Ala Glu Arg
Gly Arg Leu Met 660 665 670Gln
Glu Leu Pro Pro Asp Gly Ala Met Val Ser Val Met Ala Asn Lys 675
680 685Ser Arg Ile Glu Gln Ala Ile Gln Ser
Val Ser Arg Glu Val Ser Ile 690 695
700Ala Ala Ile Asn Gly Pro Glu Ser Val Val Ile Ser Gly Lys Arg Glu705
710 715 720Ile Leu Gln Gln
Ile Thr Glu His Leu Val Ala Glu Gly Ile Lys Thr 725
730 735Arg Gln Leu Lys Val Ser His Ala Phe His
Ser Pro Leu Met Glu Pro 740 745
750Ile Leu Gly Gln Phe Arg Arg Val Ala Asn Thr Ile Thr Tyr Arg Pro
755 760 765Pro Gln Ile Asn Leu Val Ser
Asn Val Thr Gly Gly Gln Val Tyr Lys 770 775
780Glu Ile Ala Thr Pro Asp Tyr Trp Val Arg His Leu Gln Glu Thr
Val785 790 795 800Arg Phe
Ala Asp Gly Val Lys Val Leu His Glu Gln Asn Val Asn Phe
805 810 815Met Leu Glu Ile Gly Pro Lys
Pro Thr Leu Leu Gly Met Val Glu Leu 820 825
830Gln Ser Ser Glu Asn Pro Phe Ser Met Pro Met Met Met Pro
Ser Leu 835 840 845Arg Gln Asn Arg
Ser Asp Trp Gln Gln Met Leu Glu Ser Leu Ser Gln 850
855 860Leu Tyr Val His Gly Val Glu Ile Asp Trp Ile Gly
Phe Asn Lys Asp865 870 875
880Tyr Val Arg His Lys Val Val Leu Pro Thr Tyr Pro Trp Gln Lys Glu
885 890 895Arg Tyr Trp Val Glu
Leu Asp Gln Gln Lys His Ala Ala Lys Asn Leu 900
905 910His Pro Leu Leu Asp Arg Cys Met Lys Leu Pro Arg
His Asn Glu Thr 915 920 925Ile Phe
Glu Lys Glu Phe Ser Leu Glu Thr Leu Pro Phe Leu Ala Asp 930
935 940Tyr Arg Ile Tyr Gly Ser Val Val Ser Pro Gly
Ala Ser Tyr Leu Ser945 950 955
960Met Ile Leu Ser Ile Ala Glu Ser Tyr Ala Asn Gly His Leu Asn Gly
965 970 975Gly Asn Ser Ala
Lys Gln Thr Thr Tyr Leu Leu Lys Asp Val Thr Phe 980
985 990Pro Val Pro Leu Val Ile Ser Asp Glu Ala Asn
Tyr Met Val Gln Val 995 1000
1005Ala Cys Ser Leu Ser Cys Ala Ala Pro His Asn Arg Gly Asp Glu
1010 1015 1020Thr Gln Phe Glu Leu Phe
Ser Phe Ala Glu Asn Val Pro Glu Ser 1025 1030
1035Ser Ser Ile Asn Ala Asp Phe Gln Thr Pro Ile Ile His Ala
Lys 1040 1045 1050Gly Gln Phe Lys Leu
Glu Asp Thr Ala Pro Pro Lys Val Glu Leu 1055 1060
1065Glu Glu Leu Gln Ala Gly Cys Pro Gln Glu Ile Asp Leu
Asn Leu 1070 1075 1080Phe Tyr Gln Thr
Phe Thr Asp Lys Gly Phe Val Phe Gly Ser Arg 1085
1090 1095Phe Arg Trp Leu Glu Gln Ile Trp Val Gly Asp
Gly Glu Ala Leu 1100 1105 1110Ala Arg
Leu Arg Gln Pro Glu Ser Ile Glu Ser Phe Lys Gly Tyr 1115
1120 1125Val Ile His Pro Gly Leu Leu Asp Ala Cys
Thr Gln Val Pro Phe 1130 1135 1140Ala
Ile Ser Ser Asp Asp Glu Asn Arg Gln Ser Glu Thr Thr Met 1145
1150 1155Pro Phe Ala Leu Asn Glu Leu Arg Cys
Tyr Gln Pro Ala Asn Gly 1160 1165
1170Gln Met Trp Trp Val His Ala Thr Glu Lys Asp Arg Tyr Thr Trp
1175 1180 1185Asp Val Ser Leu Phe Asp
Glu Ser Gly Gln Val Ile Ala Glu Phe 1190 1195
1200Ile Gly Leu Glu Val Arg Ala Ala Met Pro Glu Gly Leu Leu
Arg 1205 1210 1215Ala Asp Phe Trp His
Asn Trp Leu Tyr Thr Val Asn Trp Arg Ser 1220 1225
1230Gln Pro Leu Gln Ile Pro Glu Val Leu Asp Ile Asn Lys
Thr Gly 1235 1240 1245Ala Glu Thr Trp
Leu Leu Phe Ala Gln Pro Glu Gly Ile Gly Ala 1250
1255 1260Asp Leu Ala Glu Tyr Leu Gln Ser Gln Gly Lys
His Cys Val Phe 1265 1270 1275Val Val
Pro Gly Ser Glu Tyr Thr Val Thr Glu Gln His Ile Gly 1280
1285 1290Arg Thr Gly His Leu Asp Val Thr Lys Leu
Thr Lys Ile Val Thr 1295 1300 1305Ile
Asn Pro Ala Ser Pro His Asp Tyr Lys Tyr Phe Leu Glu Thr 1310
1315 1320Leu Thr Asp Ile Arg Leu Pro Cys Glu
His Ile Leu Tyr Leu Trp 1325 1330
1335Asn Arg Tyr Asp Leu Thr Asn Thr Ser Asn His Arg Thr Glu Leu
1340 1345 1350Thr Val Pro Asp Ile Val
Leu Asn Leu Cys Thr Ser Leu Thr Tyr 1355 1360
1365Leu Val Gln Ala Leu Ser His Met Gly Phe Ser Pro Lys Leu
Trp 1370 1375 1380Leu Ile Thr Gln Asn
Ser Gln Ala Val Gly Ser Asp Leu Ala Asn 1385 1390
1395Leu Glu Ile Glu Gln Ser Pro Leu Trp Ala Leu Gly Arg
Ser Ile 1400 1405 1410Arg Ala Glu His
Pro Glu Phe Asp Cys Arg Cys Leu Asp Phe Asp 1415
1420 1425Thr Leu Ser Asn Ile Ala Pro Leu Leu Leu Lys
Glu Met Gln Ala 1430 1435 1440Ile Asp
Tyr Glu Ser Gln Ile Ala Tyr Arg Gln Gly Thr Arg Tyr 1445
1450 1455Val Ala Arg Leu Ile Arg Asn Gln Ser Glu
Cys His Ala Pro Ile 1460 1465 1470Gln
Thr Gly Ile Arg Pro Asp Gly Ser Tyr Leu Ile Thr Gly Gly 1475
1480 1485Leu Gly Gly Leu Gly Leu Gln Val Ala
Leu Ala Leu Ala Asp Ala 1490 1495
1500Gly Ala Arg His Leu Ile Leu Asn Ser Arg Arg Gly Thr Val Ser
1505 1510 1515Lys Glu Ala Gln Leu Ile
Ile Asp Arg Leu Arg Gln Glu Asp Val 1520 1525
1530Arg Val Asp Leu Ile Ala Ala Asp Val Ser Asp Ala Ala Asp
Ser 1535 1540 1545Glu Arg Leu Leu Val
Glu Ser Gln Arg Lys Thr Ser Leu Arg Gly 1550 1555
1560Ile Val His Val Ala Gly Val Leu Asp Asp Gly Ile Leu
Leu Gln 1565 1570 1575Gln Asn Gln Glu
Arg Phe Glu Lys Val Met Ala Ala Lys Val Arg 1580
1585 1590Gly Ala Trp His Leu Asp Gln Gln Ser Gln Thr
Leu Asp Leu Asp 1595 1600 1605Phe Phe
Val Ala Phe Ser Ser Val Ala Ser Leu Ile Glu Glu Pro 1610
1615 1620Gly Gln Ala Asn Tyr Ala Ala Ala Asn Ala
Phe Leu Asp Ser Leu 1625 1630 1635Met
Tyr Tyr Arg His Ile Lys Gly Ser Asn Ser Leu Ser Ile Asn 1640
1645 1650Trp Gly Ala Trp Ala Glu Val Gly Met
Ala Ala Asn Leu Ser Trp 1655 1660
1665Glu Gln Arg Gly Ile Ala Ala Ile Ser Pro Lys Gln Gly Arg His
1670 1675 1680Ile Leu Val Gln Leu Ile
Gln Lys Leu Asn Gln His Thr Ile Pro 1685 1690
1695Gln Val Ala Val Gln Pro Thr Asn Trp Ala Glu Tyr Leu Ser
His 1700 1705 1710Asp Gly Val Asn Met
Pro Phe Tyr Glu Tyr Phe Thr His His Leu 1715 1720
1725Arg Asn Glu Lys Glu Ala Lys Leu Arg Gln Thr Ala Gly
Ser Thr 1730 1735 1740Ser Glu Glu Val
Ser Leu Arg Gln Gln Leu Gln Thr Leu Ser Glu 1745
1750 1755Lys Asp Arg Asp Ala Leu Leu Met Glu His Leu
Gln Lys Thr Ala 1760 1765 1770Ile Arg
Val Leu Gly Leu Ala Ser Asn Gln Lys Ile Asp Pro Tyr 1775
1780 1785Gln Gly Leu Met Asn Met Gly Leu Asp Ser
Leu Met Ala Val Glu 1790 1795 1800Phe
Arg Asn His Leu Ile Arg Ser Leu Glu Arg Pro Leu Pro Ala 1805
1810 1815Thr Leu Leu Phe Asn Cys Pro Thr Leu
Asp Ser Leu His Asp Tyr 1820 1825
1830Leu Val Ala Lys Met Phe Asp Asp Ala Pro Gln Lys Ala Glu Gln
1835 1840 1845Met Ala Gln Pro Thr Thr
Leu Thr Ala His Ser Ile Ser Ile Glu 1850 1855
1860Ser Lys Ile Asp Asp Asn Glu Ser Val Asp Asp Ile Ala Gln
Met 1865 1870 1875Leu Ala Gln Ala Leu
Asn Ile Ala Phe Glu 1880
1885
103 5004 DNA Cylindrospermopsis raciborskii AWT205
103 atgagtcagc
ccaattatgg cattttgatg aaaaatgcgt tgaacgaaat aaatagccta 60cgatcgcaac
tagctgcggt agaagcccaa aaaaatgagt ctattgccat tgttggtatg 120agttgccgtt
ttccaggcgg tgcaactact ccagagcgtt tttgggtatt actgcgcgag 180ggtatatcag
ccattacaga aatccctgct gatcgctggg atgttgataa atattatgat 240gctgacccca
catcgtccgg taaaatgcat actcgttacg gcggttttct gaatgaagtt 300gatacatttg
agccatcatt ctttaatatt gctgcccgtg aagccgttag catggatcca 360cagcaacgct
tgctacttga agtcagttgg gaagctctgg aatccggtaa tattgttcct 420gcaactcttt
ttgatagttc cactggtgta tttatcggta ttggtggtag caactacaaa 480tctttaatga
tcgaaaacag gagtcggatc gggaaaaccg atttgtatga gttaagtggc 540actgatgtga
gtgttgctgc cggcaggata tcctatgtcc tgggtttgat gggtcccagt 600tttgtgattg
atacagcttg ttcatcttct ttggtctcag ttcatcaagc ctgtcagagt 660ctgcgtcaga
gagaatgtga tctagcacta gctggtggag tcggtttact cattgatcca 720gatgagatga
ttggtctttc tcaagggggg atgctggcac ctgatggtag ttgtaaaaca 780tttgatgcca
atgcaaatgg ctatgtgcga ggcgaaggtt gtgggatgat tgttctaaaa 840cgtctctcgg
atgcaacagc cgatggggat aatattcttg ccatcattcg tgggtctatg 900gttaatcatg
atggtcatag cagtggttta actgctccaa gaggccccgc acaagtctct 960gtcattaagc
aagccttaga tagagcaggt attgcaccgg atgccgtaag ttatttagaa 1020gcccatggta
caggcacacc ccttggtgat cctatcgaga tggattcatt gaacgaagtg 1080tttggtcgga
gaacagaacc actttgggtc ggctcagtta agacaaatat tggtcattta 1140gaagccgcgt
ccggtattgc agggctgatt aaggttgtct tgatgctaaa aaacaagcag 1200attcctcctc
acttgcattt caagacacca aatccatata ttgattggaa aaatctcccg 1260gtcgaaattc
cgaccaccct tcatgcttgg gatgacaaga cattgaagga cagaaagcga 1320attgcagggg
ttagttcttt tagtttcagt ggtactaacg cccacattgt attatctgaa 1380gccccatcta
gcgaactaat tagtaatcat gcggcagtgg aaagaccatg gcacttgtta 1440acccttagtg
ctaagaatga ggaagcgttg gctaacttgg ttgggcttta tcagtcattt 1500atttctacta
ctgatgcaag tcttgccgat atatgctaca ctgctaatac ggcacgaacc 1560catttttctc
atcgccttgc tctatcggct acttcacaca tccaaataga ggctctttta 1620gccgcttata
aggaagggtc ggtgagtttg agcatcaatc aaggttgtgt cctttccaac 1680agtcgtgcgc
cgaaggtcgc ttttctcttt acaggtcaag gttcgcaata tgtgcaaatg 1740gctggagaac
tttatgagac ccagcctact ttccgtaatt gcttagatcg ctgtgccgaa 1800atcttgcaat
ccatcttttc atcgagaaac agcccttggg gaaacccact gctttcggta 1860ttatatccaa
accatgagtc aaaggaaatt gaccagacgg cttataccca acctgccctt 1920tttgctgtag
aatatgccct agcacagatg tggcggtcgt ggggaatcga gccagatatc 1980gtaatgggtc
atagcatagg tgaatatgtg gcagcttgtg tggcggggat cttttctctg 2040gaggatggtc
tcaaacttgc tgccgaaaga ggccgtttga tgcaggcgct accacaaaat 2100ggcgagatgg
ttgctatatc ggcctccctt gaggaagtta agccggctat tcaatctgac 2160cagcgagttg
tgatagcggc ggtaaatgga ccacgaagtg tcgtcatttc gggcgatcgc 2220caagctgtgc
aagtcttcac caacacccta gaagatcaag gaatccggtg caagagactg 2280tctgtttcac
acgctttcca ctctccattg atgaaaccaa tggagcagga gttcgcacag 2340gtggccaggg
aaatcaacta tagtcctcca aaaatagctc ttgtcagtaa tctaaccggc 2400gacttgattt
cacctgagtc ttccctggag gaaggagtga tcgcttcccc tggttactgg 2460gtaaatcatt
tatgcaatcc tgtcttgttc gctgatggta ttgcaactat gcaagcgcag 2520gatgtccaag
tcttccttga agttggacca aaaccgacct tatcaggact agtgcaacaa 2580tattttgacg
aggttgccca tagcgatcgc cctgtcacca ttcccacctt gcgccccaag 2640caacccaact
ggcagacact attggagagt ttgggacaac tgtatgcgct tggtgtccag 2700gtaaattggg
cgggctttga tagagattac accagacgca aagtaagcct acccacctat 2760gcttggaagc
gtcaacgtta ttggctagag aaacagtccg ctccacgttt agaaacaaca 2820caagttcgtc
ccgcaactgc cattgtagag catcttgaac aaggcaatgt gccgaaaatc 2880gtggacttgt
tagcggcgac ggatgtactt tcaggcgaag cacggaaatt gctacccagc 2940atcattgaac
tattggttgc aaaacatcgt gaggaagcga cacagaagcc catctgcgat 3000tggctttatg
aagtggtttg gcaaccccag ttgctgaccc tatctacctt acctgctgtg 3060gaaacagagg
gtagacaatg gctcatcttc gccgatgcta gtggacacgg tgaagcactt 3120gcggctcaat
tacgtcagca aggggatata attacgcttg tctatgctgg tctaaaatat 3180cactcggcta
ataataaaca aaataccggg ggggacatcc catattttca gattgatccg 3240atccaaaggg
aggattatga aaggttgttt gctgctttgc ctccactgta tggtattgtt 3300catctttgga
gtttagatat acttagcttg gacaaagtat ctaacctaat tgaaaatgta 3360caattaggta
gtggcacgct attaaattta atacagacag tcttgcaact tgaaacgccc 3420acccctagct
tgtggctcgt gacaaagaac gcgcaagctg tgcgtaaaaa cgatagccta 3480gtcggagtgc
ttcagtcacc cttatggggt atgggtaagg tgatagcctt agaacaccct 3540gaactcaact
gtgtatcaat cgaccttgat ggtgaagggc ttccagatga acaagccaag 3600tttctggcgg
ctgaactccg cgccgcctcc gagttcagac ataccaccat tccccacgaa 3660agtcaagttg
cttggcgtaa taggactcgc tatgtgtcac ggttcaaagg ttatcagaag 3720catcccgcga
cctcatcaaa aatgcctatt cgaccagatg ccacttattt gatcacgggc 3780ggctttggtg
gtttgggctt gcttgtggct cgttggatgg ttgaacaggg ggctacccat 3840ctatttctga
tgggacgcag ccaacccaaa ccagccgccc aaaaacaact gcaagagata 3900gccgcgctgg
gtgcaacagt gacggtggtg caagccgatg ttggcatccg ctcccaagta 3960gccaatgtgt
tggcacagat tgataaggca tatcctttgg ctggtattat tcatactgcc 4020ggtgtattag
acgacggaat cttattgcag caaaattggg cgcgttttag caaggtgttc 4080gcccccaaac
tagagggagc ttggcatcta catacactga ctgaagagat gccgcttgat 4140ttctttattt
gtttttcctc aacagcagga ttgctgggca gtggtggaca agctaactat 4200gctgctgcca
atgccttttt agatgccttt gcccatcatc ggcgaataca aggcttgcca 4260gctctctcga
ttaactggga cgcttggtct caagtgggaa tgacggtacg tctccaacaa 4320gcttcttcac
aaagcaccac agttgggcaa gatattagca ctttggaaat ttcaccagaa 4380cagggattgc
aaatctttgc ctatcttctg caacaaccat ccgcccaaat agcggccatt 4440tctaccgatg
ggcttcgcaa gatgtacgac acaagctcgg ccttttttgc tttacttgat 4500cttgacaggt
cttcctccac tacccaggag caatctacac tttctcatga agttggcctt 4560accttactcg
aacaattgca gcaagctcgg ccaaaagagc gagagaaaat gttactgcgc 4620catctacaga
cccaagttgc tgcggtcttg cgtagtcccg aactgcccgc agttcatcaa 4680cccttcactg
acttggggat ggattcgttg atgtcacttg aattgatgcg gcgtttggaa 4740gaaagtctgg
ggattcagat gcctgcaacg cttgcattcg attatcctat ggtagaccgt 4800ttggctaagt
ttatactgac tcaaatatgt ataaattctg agccagatac ctcagcagtt 4860ctcacaccag
atggaaatgg ggaggaaaaa gacagtaata aggacagaag taccagcact 4920tccgttgact
caaatattac ttccatggca gaagatttat tcgcactcga atccttacta 4980aataaaataa
aaagagatca ataa
5004
104 1667 PRT Cylindrospermopsis raciborskii AWT205
104 Met Ser Gln Pro
Asn Tyr Gly Ile Leu Met Lys Asn Ala Leu Asn Glu1 5
10 15Ile Asn Ser Leu Arg Ser Gln Leu Ala Ala
Val Glu Ala Gln Lys Asn 20 25
30Glu Ser Ile Ala Ile Val Gly Met Ser Cys Arg Phe Pro Gly Gly Ala
35 40 45Thr Thr Pro Glu Arg Phe Trp Val
Leu Leu Arg Glu Gly Ile Ser Ala 50 55
60Ile Thr Glu Ile Pro Ala Asp Arg Trp Asp Val Asp Lys Tyr Tyr Asp65
70 75 80Ala Asp Pro Thr Ser
Ser Gly Lys Met His Thr Arg Tyr Gly Gly Phe 85
90 95Leu Asn Glu Val Asp Thr Phe Glu Pro Ser Phe
Phe Asn Ile Ala Ala 100 105
110Arg Glu Ala Val Ser Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Val
115 120 125Ser Trp Glu Ala Leu Glu Ser
Gly Asn Ile Val Pro Ala Thr Leu Phe 130 135
140Asp Ser Ser Thr Gly Val Phe Ile Gly Ile Gly Gly Ser Asn Tyr
Lys145 150 155 160Ser Leu
Met Ile Glu Asn Arg Ser Arg Ile Gly Lys Thr Asp Leu Tyr
165 170 175Glu Leu Ser Gly Thr Asp Val
Ser Val Ala Ala Gly Arg Ile Ser Tyr 180 185
190Val Leu Gly Leu Met Gly Pro Ser Phe Val Ile Asp Thr Ala
Cys Ser 195 200 205Ser Ser Leu Val
Ser Val His Gln Ala Cys Gln Ser Leu Arg Gln Arg 210
215 220Glu Cys Asp Leu Ala Leu Ala Gly Gly Val Gly Leu
Leu Ile Asp Pro225 230 235
240Asp Glu Met Ile Gly Leu Ser Gln Gly Gly Met Leu Ala Pro Asp Gly
245 250 255Ser Cys Lys Thr Phe
Asp Ala Asn Ala Asn Gly Tyr Val Arg Gly Glu 260
265 270Gly Cys Gly Met Ile Val Leu Lys Arg Leu Ser Asp
Ala Thr Ala Asp 275 280 285Gly Asp
Asn Ile Leu Ala Ile Ile Arg Gly Ser Met Val Asn His Asp 290
295 300Gly His Ser Ser Gly Leu Thr Ala Pro Arg Gly
Pro Ala Gln Val Ser305 310 315
320Val Ile Lys Gln Ala Leu Asp Arg Ala Gly Ile Ala Pro Asp Ala Val
325 330 335Ser Tyr Leu Glu
Ala His Gly Thr Gly Thr Pro Leu Gly Asp Pro Ile 340
345 350Glu Met Asp Ser Leu Asn Glu Val Phe Gly Arg
Arg Thr Glu Pro Leu 355 360 365Trp
Val Gly Ser Val Lys Thr Asn Ile Gly His Leu Glu Ala Ala Ser 370
375 380Gly Ile Ala Gly Leu Ile Lys Val Val Leu
Met Leu Lys Asn Lys Gln385 390 395
400Ile Pro Pro His Leu His Phe Lys Thr Pro Asn Pro Tyr Ile Asp
Trp 405 410 415Lys Asn Leu
Pro Val Glu Ile Pro Thr Thr Leu His Ala Trp Asp Asp 420
425 430Lys Thr Leu Lys Asp Arg Lys Arg Ile Ala
Gly Val Ser Ser Phe Ser 435 440
445Phe Ser Gly Thr Asn Ala His Ile Val Leu Ser Glu Ala Pro Ser Ser 450
455 460Glu Leu Ile Ser Asn His Ala Ala
Val Glu Arg Pro Trp His Leu Leu465 470
475 480Thr Leu Ser Ala Lys Asn Glu Glu Ala Leu Ala Asn
Leu Val Gly Leu 485 490
495Tyr Gln Ser Phe Ile Ser Thr Thr Asp Ala Ser Leu Ala Asp Ile Cys
500 505 510Tyr Thr Ala Asn Thr Ala
Arg Thr His Phe Ser His Arg Leu Ala Leu 515 520
525Ser Ala Thr Ser His Ile Gln Ile Glu Ala Leu Leu Ala Ala
Tyr Lys 530 535 540Glu Gly Ser Val Ser
Leu Ser Ile Asn Gln Gly Cys Val Leu Ser Asn545 550
555 560Ser Arg Ala Pro Lys Val Ala Phe Leu Phe
Thr Gly Gln Gly Ser Gln 565 570
575Tyr Val Gln Met Ala Gly Glu Leu Tyr Glu Thr Gln Pro Thr Phe Arg
580 585 590Asn Cys Leu Asp Arg
Cys Ala Glu Ile Leu Gln Ser Ile Phe Ser Ser 595
600 605Arg Asn Ser Pro Trp Gly Asn Pro Leu Leu Ser Val
Leu Tyr Pro Asn 610 615 620His Glu Ser
Lys Glu Ile Asp Gln Thr Ala Tyr Thr Gln Pro Ala Leu625
630 635 640Phe Ala Val Glu Tyr Ala Leu
Ala Gln Met Trp Arg Ser Trp Gly Ile 645
650 655Glu Pro Asp Ile Val Met Gly His Ser Ile Gly Glu
Tyr Val Ala Ala 660 665 670Cys
Val Ala Gly Ile Phe Ser Leu Glu Asp Gly Leu Lys Leu Ala Ala 675
680 685Glu Arg Gly Arg Leu Met Gln Ala Leu
Pro Gln Asn Gly Glu Met Val 690 695
700Ala Ile Ser Ala Ser Leu Glu Glu Val Lys Pro Ala Ile Gln Ser Asp705
710 715 720Gln Arg Val Val
Ile Ala Ala Val Asn Gly Pro Arg Ser Val Val Ile 725
730 735Ser Gly Asp Arg Gln Ala Val Gln Val Phe
Thr Asn Thr Leu Glu Asp 740 745
750Gln Gly Ile Arg Cys Lys Arg Leu Ser Val Ser His Ala Phe His Ser
755 760 765Pro Leu Met Lys Pro Met Glu
Gln Glu Phe Ala Gln Val Ala Arg Glu 770 775
780Ile Asn Tyr Ser Pro Pro Lys Ile Ala Leu Val Ser Asn Leu Thr
Gly785 790 795 800Asp Leu
Ile Ser Pro Glu Ser Ser Leu Glu Glu Gly Val Ile Ala Ser
805 810 815Pro Gly Tyr Trp Val Asn His
Leu Cys Asn Pro Val Leu Phe Ala Asp 820 825
830Gly Ile Ala Thr Met Gln Ala Gln Asp Val Gln Val Phe Leu
Glu Val 835 840 845Gly Pro Lys Pro
Thr Leu Ser Gly Leu Val Gln Gln Tyr Phe Asp Glu 850
855 860Val Ala His Ser Asp Arg Pro Val Thr Ile Pro Thr
Leu Arg Pro Lys865 870 875
880Gln Pro Asn Trp Gln Thr Leu Leu Glu Ser Leu Gly Gln Leu Tyr Ala
885 890 895Leu Gly Val Gln Val
Asn Trp Ala Gly Phe Asp Arg Asp Tyr Thr Arg 900
905 910Arg Lys Val Ser Leu Pro Thr Tyr Ala Trp Lys Arg
Gln Arg Tyr Trp 915 920 925Leu Glu
Lys Gln Ser Ala Pro Arg Leu Glu Thr Thr Gln Val Arg Pro 930
935 940Ala Thr Ala Ile Val Glu His Leu Glu Gln Gly
Asn Val Pro Lys Ile945 950 955
960Val Asp Leu Leu Ala Ala Thr Asp Val Leu Ser Gly Glu Ala Arg Lys
965 970 975Leu Leu Pro Ser
Ile Ile Glu Leu Leu Val Ala Lys His Arg Glu Glu 980
985 990Ala Thr Gln Lys Pro Ile Cys Asp Trp Leu Tyr
Glu Val Val Trp Gln 995 1000
1005Pro Gln Leu Leu Thr Leu Ser Thr Leu Pro Ala Val Glu Thr Glu
1010 1015 1020Gly Arg Gln Trp Leu Ile
Phe Ala Asp Ala Ser Gly His Gly Glu 1025 1030
1035Ala Leu Ala Ala Gln Leu Arg Gln Gln Gly Asp Ile Ile Thr
Leu 1040 1045 1050Val Tyr Ala Gly Leu
Lys Tyr His Ser Ala Asn Asn Lys Gln Asn 1055 1060
1065Thr Gly Gly Asp Ile Pro Tyr Phe Gln Ile Asp Pro Ile
Gln Arg 1070 1075 1080Glu Asp Tyr Glu
Arg Leu Phe Ala Ala Leu Pro Pro Leu Tyr Gly 1085
1090 1095Ile Val His Leu Trp Ser Leu Asp Ile Leu Ser
Leu Asp Lys Val 1100 1105 1110Ser Asn
Leu Ile Glu Asn Val Gln Leu Gly Ser Gly Thr Leu Leu 1115
1120 1125Asn Leu Ile Gln Thr Val Leu Gln Leu Glu
Thr Pro Thr Pro Ser 1130 1135 1140Leu
Trp Leu Val Thr Lys Asn Ala Gln Ala Val Arg Lys Asn Asp 1145
1150 1155Ser Leu Val Gly Val Leu Gln Ser Pro
Leu Trp Gly Met Gly Lys 1160 1165
1170Val Ile Ala Leu Glu His Pro Glu Leu Asn Cys Val Ser Ile Asp
1175 1180 1185Leu Asp Gly Glu Gly Leu
Pro Asp Glu Gln Ala Lys Phe Leu Ala 1190 1195
1200Ala Glu Leu Arg Ala Ala Ser Glu Phe Arg His Thr Thr Ile
Pro 1205 1210 1215His Glu Ser Gln Val
Ala Trp Arg Asn Arg Thr Arg Tyr Val Ser 1220 1225
1230Arg Phe Lys Gly Tyr Gln Lys His Pro Ala Thr Ser Ser
Lys Met 1235 1240 1245Pro Ile Arg Pro
Asp Ala Thr Tyr Leu Ile Thr Gly Gly Phe Gly 1250
1255 1260Gly Leu Gly Leu Leu Val Ala Arg Trp Met Val
Glu Gln Gly Ala 1265 1270 1275Thr His
Leu Phe Leu Met Gly Arg Ser Gln Pro Lys Pro Ala Ala 1280
1285 1290Gln Lys Gln Leu Gln Glu Ile Ala Ala Leu
Gly Ala Thr Val Thr 1295 1300 1305Val
Val Gln Ala Asp Val Gly Ile Arg Ser Gln Val Ala Asn Val 1310
1315 1320Leu Ala Gln Ile Asp Lys Ala Tyr Pro
Leu Ala Gly Ile Ile His 1325 1330
1335Thr Ala Gly Val Leu Asp Asp Gly Ile Leu Leu Gln Gln Asn Trp
1340 1345 1350Ala Arg Phe Ser Lys Val
Phe Ala Pro Lys Leu Glu Gly Ala Trp 1355 1360
1365His Leu His Thr Leu Thr Glu Glu Met Pro Leu Asp Phe Phe
Ile 1370 1375 1380Cys Phe Ser Ser Thr
Ala Gly Leu Leu Gly Ser Gly Gly Gln Ala 1385 1390
1395Asn Tyr Ala Ala Ala Asn Ala Phe Leu Asp Ala Phe Ala
His His 1400 1405 1410Arg Arg Ile Gln
Gly Leu Pro Ala Leu Ser Ile Asn Trp Asp Ala 1415
1420 1425Trp Ser Gln Val Gly Met Thr Val Arg Leu Gln
Gln Ala Ser Ser 1430 1435 1440Gln Ser
Thr Thr Val Gly Gln Asp Ile Ser Thr Leu Glu Ile Ser 1445
1450 1455Pro Glu Gln Gly Leu Gln Ile Phe Ala Tyr
Leu Leu Gln Gln Pro 1460 1465 1470Ser
Ala Gln Ile Ala Ala Ile Ser Thr Asp Gly Leu Arg Lys Met 1475
1480 1485Tyr Asp Thr Ser Ser Ala Phe Phe Ala
Leu Leu Asp Leu Asp Arg 1490 1495
1500Ser Ser Ser Thr Thr Gln Glu Gln Ser Thr Leu Ser His Glu Val
1505 1510 1515Gly Leu Thr Leu Leu Glu
Gln Leu Gln Gln Ala Arg Pro Lys Glu 1520 1525
1530Arg Glu Lys Met Leu Leu Arg His Leu Gln Thr Gln Val Ala
Ala 1535 1540 1545Val Leu Arg Ser Pro
Glu Leu Pro Ala Val His Gln Pro Phe Thr 1550 1555
1560Asp Leu Gly Met Asp Ser Leu Met Ser Leu Glu Leu Met
Arg Arg 1565 1570 1575Leu Glu Glu Ser
Leu Gly Ile Gln Met Pro Ala Thr Leu Ala Phe 1580
1585 1590Asp Tyr Pro Met Val Asp Arg Leu Ala Lys Phe
Ile Leu Thr Gln 1595 1600 1605Ile Cys
Ile Asn Ser Glu Pro Asp Thr Ser Ala Val Leu Thr Pro 1610
1615 1620Asp Gly Asn Gly Glu Glu Lys Asp Ser Asn
Lys Asp Arg Ser Thr 1625 1630 1635Ser
Thr Ser Val Asp Ser Asn Ile Thr Ser Met Ala Glu Asp Leu 1640
1645 1650Phe Ala Leu Glu Ser Leu Leu Asn Lys
Ile Lys Arg Asp Gln 1655 1660
1665105318DNACylindrospermopsis raciborskii AWT205 105ttatgctgca
tctaaataga agttccatag ccctgcactg accaacatca attgatcatc 60aaaatcggtc
acacgattcc tatatgtggg ataaaatttg cagtacagca ggatataaaa 120tagtttttcc
tctatacttc tgagtgtagg cttgcgtccg cccccgggcg cacgtttgcg 180gtttgctaag
gagttgaaca cggtgcgttc ataggtatca gcaaactgag ataacagctc 240gttgaatgct
tggcggttaa gtccagtcat tgctcgtagc agtcgctctt gattcaggat 300gcggtctaag
ttcaacat
318106105PRTCylindrospermopsis raciborskii AWT205 106Met Leu Asn Leu Asp
Arg Ile Leu Asn Gln Glu Arg Leu Leu Arg Ala1 5
10 15Met Thr Gly Leu Asn Arg Gln Ala Phe Asn Glu
Leu Leu Ser Gln Phe 20 25
30Ala Asp Thr Tyr Glu Arg Thr Val Phe Asn Ser Leu Ala Asn Arg Lys
35 40 45Arg Ala Pro Gly Gly Gly Arg Lys
Pro Thr Leu Arg Ser Ile Glu Glu 50 55
60Lys Leu Phe Tyr Ile Leu Leu Tyr Cys Lys Phe Tyr Pro Thr Tyr Arg65
70 75 80Asn Arg Val Thr Asp
Phe Asp Asp Gln Leu Met Leu Val Ser Ala Gly 85
90 95Leu Trp Asn Phe Tyr Leu Asp Ala Ala
100 105107600DNACylindrospermopsis raciborskii AWT205
107ctactgagtg aaagtgaact tctttcccac gtattcgagt agctgttgta agctggcctc
60gatggaaagt tccgaagttt ccaccagtaa atctggtgtt ctcggtggtt cgtagggagc
120gctaattccc gtaaaagact caatttctcc acggcgtgct tttgcataga gacccttggg
180gtcacgttgt tcacaaattt ccatcggagt tgcaatatat acttcatgaa acagatctcc
240ggacagaata cggatttgct cccggtcttt cctgtaaggt gaaatgaaag cagtaatcac
300taaacaaccc gaatccgcaa aaagtttggc cacctcgcca atacgacgaa tattttccgc
360acgatcagca gcagaaaatc ccaagtcagc acataatcca tgacggatat tgtcaccatc
420aaggacaaaa gtataccaac ctttctggaa caaaatccgc tctaattcta gagccaatgt
480tgttttacct gatcctgata atccagtgaa ccatagaatt ccatttcggt gaccattctt
540taaacaacga tcaaatgggg acacaagatg ttttgtatgt tgaatattgc ttgatttcat
600108199PRTCylindrospermopsis raciborskii AWT205 108Met Lys Ser Ser Asn
Ile Gln His Thr Lys His Leu Val Ser Pro Phe1 5
10 15Asp Arg Cys Leu Lys Asn Gly His Arg Asn Gly
Ile Leu Trp Phe Thr 20 25
30Gly Leu Ser Gly Ser Gly Lys Thr Thr Leu Ala Leu Glu Leu Glu Arg
35 40 45Ile Leu Phe Gln Lys Gly Trp Tyr
Thr Phe Val Leu Asp Gly Asp Asn 50 55
60Ile Arg His Gly Leu Cys Ala Asp Leu Gly Phe Ser Ala Ala Asp Arg65
70 75 80Ala Glu Asn Ile Arg
Arg Ile Gly Glu Val Ala Lys Leu Phe Ala Asp 85
90 95Ser Gly Cys Leu Val Ile Thr Ala Phe Ile Ser
Pro Tyr Arg Lys Asp 100 105
110Arg Glu Gln Ile Arg Ile Leu Ser Gly Asp Leu Phe His Glu Val Tyr
115 120 125Ile Ala Thr Pro Met Glu Ile
Cys Glu Gln Arg Asp Pro Lys Gly Leu 130 135
140Tyr Ala Lys Ala Arg Arg Gly Glu Ile Glu Ser Phe Thr Gly Ile
Ser145 150 155 160Ala Pro
Tyr Glu Pro Pro Arg Thr Pro Asp Leu Leu Val Glu Thr Ser
165 170 175Glu Leu Ser Ile Glu Ala Ser
Leu Gln Gln Leu Leu Glu Tyr Val Gly 180 185
190Lys Lys Phe Thr Phe Thr Gln
1951091548DNACylindrospermopsis raciborskii AWT205 109atgcctaaat
actttaatac tgctggaccc tgtaaatccg aaatccacta tatgctctct 60cccacagctc
gactaccgga tttgaaagca ctaattgacg gagaaaacta ctttataatt 120cacgcgccgc
gacaagtcgg caaaactaca gctatgatag ccttagcacg agaattgact 180gatagtggaa
aatataccgc agttattctt tccgttgaag tgggatcagt attctcccat 240aatccccagc
aagcggagca ggttatttta gaagaatgga aacaggcaat caaattttat 300ttacccaaag
aactacaacc atcctattgg ccagagcgtg aaacagactc aggaataggc 360aaaactttaa
gtgagtggtc cgcacaatct ccaagacctc ttgtaatctt tttacatgaa 420atcgattccc
taacagatga agctttaatc ctaattttaa gacaattacg ctcaggtttt 480ccccgtcgtc
ctcggggatt tccccattcg gtggggttaa ttggtatgcg ggatgtgcgg 540gactataagg
ttaaatctgg tggaagtgaa cgactgaata cgtcaagtcc tttcaatatc 600aaagcggaat
ccttgacttt aagtaatttc actctgtcag aggtggaaga actttactta 660caacatacgc
aagctacagg acaaattttt accccggaag caattaaaca agcattttat 720ttaaccgatg
ggcaaccatg gttagtaaac gccctagctc gtcaagccac tcaggtgtta 780gtgaaagata
ttactcaacc cattaccgct gaagtaatta accaagccaa agaagttctg 840attcagcgcc
aggataccca tttggatagt ttggcagagc gcttacggga agatcgggtc 900aaagccatta
ttcaacctat gttagctgga tcggacttac cagatacccc agaggatgat 960cgccgtttct
tgctagattt aggcttggta aagcgcagtc ccttgggagg actaaccatt 1020gccaatccca
tttaccagga ggtgattcct cgtgttttgt cccagggtag tcaggatagt 1080ctaccccaga
ttcaacctac ttggttaaat actgataata ctttaaatcc tgacaaactc 1140ttaaatgctt
tcctagagtt ttggcgacaa catggggaac cattactcaa aagtgcgcct 1200tatcatgaaa
ttgctcccca tttagttttg atggcgtttt tacatcgggt agtgaatggt 1260ggtggcactt
tagaacggga atatgccgtt ggttctggaa gaatggatat ttgtttacgc 1320tatggcaagg
tagtgatggg catagagtta aaggtttggg ggggaaaatc ggatccgtta 1380acgaagggtt
tgacccaatt ggataaatat ctgggtgggt taggattaga tagaggttgg 1440ttagtaattt
ttgatcaccg tccgggatta ccacccatgg gtgagaggat tagtatggaa 1500caggccatta
gtccagaggg aagaaccatt acagtgattc gtagctag
1548110515PRTCylindrospermopsis raciborskii AWT205 110Met Pro Lys Tyr Phe
Asn Thr Ala Gly Pro Cys Lys Ser Glu Ile His1 5
10 15Tyr Met Leu Ser Pro Thr Ala Arg Leu Pro Asp
Leu Lys Ala Leu Ile 20 25
30Asp Gly Glu Asn Tyr Phe Ile Ile His Ala Pro Arg Gln Val Gly Lys
35 40 45Thr Thr Ala Met Ile Ala Leu Ala
Arg Glu Leu Thr Asp Ser Gly Lys 50 55
60Tyr Thr Ala Val Ile Leu Ser Val Glu Val Gly Ser Val Phe Ser His65
70 75 80Asn Pro Gln Gln Ala
Glu Gln Val Ile Leu Glu Glu Trp Lys Gln Ala 85
90 95Ile Lys Phe Tyr Leu Pro Lys Glu Leu Gln Pro
Ser Tyr Trp Pro Glu 100 105
110Arg Glu Thr Asp Ser Gly Ile Gly Lys Thr Leu Ser Glu Trp Ser Ala
115 120 125Gln Ser Pro Arg Pro Leu Val
Ile Phe Leu His Glu Ile Asp Ser Leu 130 135
140Thr Asp Glu Ala Leu Ile Leu Ile Leu Arg Gln Leu Arg Ser Gly
Phe145 150 155 160Pro Arg
Arg Pro Arg Gly Phe Pro His Ser Val Gly Leu Ile Gly Met
165 170 175Arg Asp Val Arg Asp Tyr Lys
Val Lys Ser Gly Gly Ser Glu Arg Leu 180 185
190Asn Thr Ser Ser Pro Phe Asn Ile Lys Ala Glu Ser Leu Thr
Leu Ser 195 200 205Asn Phe Thr Leu
Ser Glu Val Glu Glu Leu Tyr Leu Gln His Thr Gln 210
215 220Ala Thr Gly Gln Ile Phe Thr Pro Glu Ala Ile Lys
Gln Ala Phe Tyr225 230 235
240Leu Thr Asp Gly Gln Pro Trp Leu Val Asn Ala Leu Ala Arg Gln Ala
245 250 255Thr Gln Val Leu Val
Lys Asp Ile Thr Gln Pro Ile Thr Ala Glu Val 260
265 270Ile Asn Gln Ala Lys Glu Val Leu Ile Gln Arg Gln
Asp Thr His Leu 275 280 285Asp Ser
Leu Ala Glu Arg Leu Arg Glu Asp Arg Val Lys Ala Ile Ile 290
295 300Gln Pro Met Leu Ala Gly Ser Asp Leu Pro Asp
Thr Pro Glu Asp Asp305 310 315
320Arg Arg Phe Leu Leu Asp Leu Gly Leu Val Lys Arg Ser Pro Leu Gly
325 330 335Gly Leu Thr Ile
Ala Asn Pro Ile Tyr Gln Glu Val Ile Pro Arg Val 340
345 350Leu Ser Gln Gly Ser Gln Asp Ser Leu Pro Gln
Ile Gln Pro Thr Trp 355 360 365Leu
Asn Thr Asp Asn Thr Leu Asn Pro Asp Lys Leu Leu Asn Ala Phe 370
375 380Leu Glu Phe Trp Arg Gln His Gly Glu Pro
Leu Leu Lys Ser Ala Pro385 390 395
400Tyr His Glu Ile Ala Pro His Leu Val Leu Met Ala Phe Leu His
Arg 405 410 415Val Val Asn
Gly Gly Gly Thr Leu Glu Arg Glu Tyr Ala Val Gly Ser 420
425 430Gly Arg Met Asp Ile Cys Leu Arg Tyr Gly
Lys Val Val Met Gly Ile 435 440
445Glu Leu Lys Val Trp Gly Gly Lys Ser Asp Pro Leu Thr Lys Gly Leu 450
455 460Thr Gln Leu Asp Lys Tyr Leu Gly
Gly Leu Gly Leu Asp Arg Gly Trp465 470
475 480Leu Val Ile Phe Asp His Arg Pro Gly Leu Pro Pro
Met Gly Glu Arg 485 490
495Ile Ser Met Glu Gln Ala Ile Ser Pro Glu Gly Arg Thr Ile Thr Val
500 505 510Ile Arg Ser
51511120DNAArtificialBased on Cylindrospermopsis raciborskii AWT205
sequence 111acttctctcc tttccctatc
2011222DNAArtificialBased on Cylindrospermopsis raciborskii
AWT205 sequence 112gagtgaaaat gcgtagaact tg
2211322DNAArtificialBased on Cylindrospermopsis
raciborskii T3 sequence 113cccaatatct ccctgtaaaa ct
2211420DNAArtificialBased on
Cylindrospermopsis raciborskii T3 sequence 114tggcaattgt ctctccgtat
2011520DNAArtificialBased
on Cylindrospermopsis raciborskii T3 sequence 115ctcgccgatg
aaagtcctct
2011620DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 116gcgtgtcgag aaaaaggtgt
2011720DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 117ctcgacacgc aagaataacg
2011821DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 118atgcttctgc tttggcatgg c
2111921DNAArtificialBased on Cylindrospermopsis raciborskii
T3 sequence 119taactcgacg aactttgacc c
2112019DNAArtificialBased on Cylindrospermopsis
raciborskii T3 120gccgccaatc ctcgcgatg
1912122DNAArtificialBased on Cylindrospermopsis raciborskii
T3 sequence 121gaacgtctaa tgttgcacag tg
2212223DNAArtificialBased on Cylindrospermopsis
raciborskii T3 sequence 122ctggtacgta gtcgcaaagg tgg
2312326DNAArtificialBased on
Cylindrospermopsis raciborskii T3 sequence 123ctgacggtac atgtatttcc
tgtgac 2612430DNAArtificialBased
on Cylindrospermopsis raciborskii T3 sequence 124cgtctcatat
gcagatctta ggaatttcag
3012525DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 125gcttactacc acgatagtgc tgccg
2512622DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 126tctatgttta gcaggtggtg tc
2212720DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 127ttctgcaaga cgagccataa
2012820DNAArtificialBased on Cylindrospermopsis raciborskii
T3 sequence 128ggttcgccgc ggacattaaa
2012920DNAArtificialBased on Cylindrospermopsis
raciborskii T3 sequence 129atgctaatgc ggtgggagta
2013020DNAArtificialBased on
Cylindrospermopsis raciborskii T3 sequence 130aaagcagttc cgacgacatt
2013123DNAArtificialBased
on Cylindrospermopsis raciborskii T3 sequence 131cctatttcga
ttattgtttt cgg
2313220DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 132gataccgatc ataaactacg
2013321DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 133gcaaattttg caggagtaat g
2113421DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 134gcaaattttg caggagtaat g
2113523DNAArtificialBased on Cylindrospermopsis raciborskii
T3 sequence 135ttttgggtaa actttatagc cat
2313622DNAArtificialBased on Cylindrospermopsis
raciborskii T3 sequence 136tgggtctgga cagttgtaga ta
2213723DNAArtificialBased on
Cylindrospermopsis raciborskii T3 sequence 137aaggggaaaa caaaattatc
aat 2313820DNAArtificialBased
on Cylindrospermopsis raciborskii T3 sequence 138ggcgatcgcc
tgctaaaaat
2013923DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 139cctcattttc atttctagac gtt
2314020DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 140ccacttcaac taaaacagca
2014120DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 141aaaaattttg gaggggtagc
2014220DNAArtificialBased on Cylindrospermopsis raciborskii
T3 sequence 142atccaagatg cgacaacact
2014321DNAArtificialBased on Cylindrospermopsis
raciborskii T3 sequence 143ggtccttgcg cagatagagt g
2114421DNAArtificialBased on
Cylindrospermopsis raciborskii T3 sequence 144cactctatct gcgcaaggac
c 2114521DNAArtificialBased
on Cylindrospermopsis raciborskii T3 sequence 145tgactgcatt
cgctgtataa a
2114622DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 146ttcataagac ggctgttgaa tc
2214730DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 147ctcgagttaa aaaagagtgt aaatgaaagg
3014823DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 148ttctataact gctgccaaat ttt
2314923DNAArtificialBased on Cylindrospermopsis raciborskii
T3 sequence 149aattttggag tgactggtta tgg
2315023DNAArtificialBased on Cylindrospermopsis
raciborskii T3 sequence 150ccataaccag tcactccaaa att
2315121DNAArtificialBased on
Cylindrospermopsis raciborskii T3 sequence 151ttttagttgt tacttttggc
g 2115220DNAArtificialBased
on Cylindrospermopsis raciborskii T3 sequence 152acagcagatg
agagaaagta
2015320DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 153gggttgtctt gctgattttc
2015422DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 154cattaaaata agtccggaca gg
2215520DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 155ttaaacagaa tgaggagcaa
2015620DNAArtificialBased on Cylindrospermopsis raciborskii
T3 sequence 156aaacaacaca cccatctaag
2015720DNAArtificialBased on Cylindrospermopsis
raciborskii T3 sequence 157ttaataaggc atccccaaga
2015820DNAArtificialBased on
Cylindrospermopsis raciborskii T3 sequence 158gaaatggctg tgtaaaaact
2015920DNAArtificialBased
on Cylindrospermopsis raciborskii T3 sequence 159tctgccatat
ccccaaccta
2016020DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 160gatcgcccga caggaagact
2016120DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 161tccggcttga cctgctggac
2016220DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 162tgcgatgatt ttgcctctgt
2016320DNAArtificialBased on Cylindrospermopsis raciborskii
T3 sequence 163aaaatttgca cacccacacg
2016427DNAArtificialBased on Cylindrospermopsis
raciborskii T3 sequence 164ttggattgaa cgtgtaattg aaaaagc
2716527DNAArtificialBased on
Cylindrospermopsis raciborskii T3 sequence 165gctttttcaa ttacacgttc
aatccaa 2716619DNAArtificialBased
on Cylindrospermopsis raciborskii T3 sequence 166aaatggcgta
tcgactaac
1916721DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 167atataggagc gcataaagtg c
2116820DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 168cttggtataa gtcttgtgat
2016920DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 169aacactcatt agattcatct
2017021DNAArtificialBased on Cylindrospermopsis raciborskii
T3 sequence 170tccactaaat cctttgaatt g
2117121DNAArtificialBased on Cylindrospermopsis
raciborskii T3 sequence 171tgtttgtctg gatgcgatcc t
2117220DNAArtificialBased on
Cylindrospermopsis raciborskii T3 sequence 172gcagttcagg tccatgaaac
2017320DNAArtificialBased
on Cylindrospermopsis raciborskii T3 sequence 173agcccagtca
caaccttcgt
2017421DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 174tctggaagta cttgcactgt c
2117522DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 175tgtaactccg tcaggacata aa
2217623DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 176tgcaaatttt agtagcaata acg
2317727DNAArtificialBased on Cylindrospermopsis raciborskii
T3 sequence 177ctttactaat tatagcgggg atattat
2717820DNAArtificialBased on Cylindrospermopsis
raciborskii T3 sequence 178cagtggggaa atagatggat
2017920DNAArtificialBased on
Cylindrospermopsis raciborskii T3 sequence 179tggtcataaa agcgggattc
2018018DNAArtificialBased
on Cylindrospermopsis raciborskii T3 sequence 180ggatcttggc gcaattta
1818123DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 181gttagagact tggaacgtat tgg
2318219DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 182ccaaacccag aagaaatcc
1918322DNAArtificialBased on Cylindrospermopsis raciborskii T3
sequence 183aatctatagc caaaacccct aa
2218419DNAArtificialBased on Cylindrospermopsis raciborskii
T3 sequence 184actgtgtgaa caattcccc
1918529DNAArtificialBased on Cylindrospermopsis
raciborskii T3 sequence 185gcaacaagac tacatttagt agatttaga
2918627DNAArtificialBased on
Cylindrospermopsis raciborskii T3 sequence 186gctttttcaa ttacacgttc
aatccaa 27