Patent application title: Techniques for Linking Non-Coding and Gene-Coding Deoxyribonucleic Acid Sequences and Applications Thereof
Inventors:
Isidore Rigoutsos (Asloria, NY, US)
Tien Huynh (Yorktown Heights, NY, US)
Aristotelis Tsirigos (Astoria, NY, US)
Alice Carolyn Mchardy (New York, NY, US)
Kevin Charles Miranda (Mcdowall, AU)
Assignees:
International Business Machines Corporation
IPC8 Class: AC07H2102FI
USPC Class:
536 254
Class name: Nitrogen containing n-glycosides, polymers thereof, metal derivatives (e.g., nucleic acids, oligonucleotides, etc.) separation or purification of polynucleotides or oligonucleotides
Publication date: 2009-04-09
Patent application number: 20090093624
Claims:
1. A method of designing one or more sequences of short interfering RNAs
that can interact with one or more sites in a given transcript of a given
sequence in a given organism and result in the down-regulation of the
expression of a protein product encoded by the given transcript, the
method comprising the steps of:identifying one or regions of interest in
the sequence of the given transcript;sub-selecting one or more regions
from the collection of the regions of interest;generating one or more
derived sequences from the sequence of the one or more sub-selected
regions;using the one or more derived sequences to create one or more
instances of a corresponding molecule that the one or more derived
sequences represent; andusing the one or more instances of the created
molecule in an appropriate environment to regulate the expression of the
given transcript.
2. The method of claim 1, wherein a region of interest in the collection of regions of interest is identified to be an instance of a motif that has one or more copies in intergenic and intronic regions of the genome of interest and one or more copies in untranslated and amino acid coding regions of one or more genes in the genome of interest.
3. The method of claim 1, wherein a region of interest in the collection of regions of interest is identified using a method that is based on pattern discovery.
4. The method of claim 1, wherein a region of interest in the collection of regions of interest is identified to be a target island.
5. The method of claim 1, wherein a region of interest is located in a 5' untranslated region of the given transcript.
6. The method of claim 1, wherein a region of interest is located in an amino acid coding region of the given transcript.
7. The method of claim 6, wherein a region of interest is located in a 3' untranslated region of the given transcript.
8. The method of claim 1, wherein the genome of interest is a eukaryotic genome.
9. The method of claim 8, where the eukaryotic genome is a human genome.
10. The method of claim 8, wherein the eukaryotic genome is a mouse genome.
11. The method of claim 8, wherein the eukaryotic genome is a rat genome.
12. The method of claim 8, wherein the eukaryotic genome is a dog genome.
13. The method of claim 8, wherein the eukaryotic genome is a fruit fly genome.
14. The method of claim 8, wherein the eukaryotic genome is a worm genome.
15. The method of claim 1, wherein a region of interest is sub-selected based on one or more of its attributes.
16. The method of claim 15, wherein an attribute is length of the region.
17. The method of claim 15, wherein an attribute is location of the region in the transcript.
18. The method of claim 1, wherein a derived sequence is a reverse complement of the sequence of the one or more sub-selected regions.
19. The method of claim 1, wherein a derived sequence is a near-reverse complement of the sequence of the one or more sub-selected regions.
20. The method of claim 1, wherein the one or more copies of the molecule can be built using any of a set of biochemical processes.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001]This application is a divisional of pending U.S. application Ser. No. 11/367,512, filed Mar. 3, 2006, the disclosure of which is incorporated by reference herein. The present application claims priority to U.S. provisional applications identified as: Ser. No. 60/658,251 (attorney docket no. YOR920050130US1), filed Mar. 3, 2005, and entitled "Overrepresentation of Nucleotides in Human Chromosomes and in the 3' Untranslated Regions of Human Genes;" and Ser. No. 60/696,213 (attorney docket no. YOR920050350US1), filed Jul. 1, 2005, and entitled "Techniques For Linking Non-Coding And Gene-Coding Deoxyribonucleic Acid Sequences," the disclosures of which are incorporated by reference herein.
FIELD OF THE INVENTION
[0002]The present invention relates to genes and, more particularly, to techniques for linking regions of a genome.
BACKGROUND OF THE INVENTION
[0003]It is known that the intergenic and intronic regions comprise most of the genomic sequence of higher organisms. The intergenic and intronic regions are collectively referred to as the "non-coding region" of an organism's genome, as opposed to the "gene-" or "protein-coding region" of the genome. Even though recent work suggested participation of the intergenic and intronic regions in a regulatory role, for the most part, their true function remains elusive. The search for conserved motifs, presumed to be regulatory and control signals, in regions upstream of the 5' untranslated regions (5'UTRs) of genes, has been the focus of research activities for many years.
[0004]More recently, researchers began studying the 3' untranslated regions (3'UTRs) of genes where they discovered conserved regions and showed them to be functionally significant, in direct analogy to the cis-motifs of promoter regions. Large-scale comparative analyses allowed researchers to also study conservation in the vicinity of genes and elsewhere in the genome with great success. However, these studies were carried out on only a handful of organisms at a time because of the magnitude of the necessary computations.
[0005]The analysis of 3'UTRs intensified further after it was discovered that they contain binding sites that are targeted by short interfering ribonucleic acids (RNAs) that induce the post-transcriptional control of the corresponding gene's expression through either messenger RNA (mRNA) degradation or translational inhibition. Accumulating evidence that non-coding RNAs control developmental and physiological processes and that a considerable part of the human genome is transcribed, has helped researchers identify "functional" elements in areas of the genome that are not associated with protein-coding regions.
[0006]Thus, techniques for efficiently and effectively identifying and associating non-coding regions with gene coding regions of a genome would be desirable.
SUMMARY OF THE INVENTION
[0007]Techniques for linking non-coding and gene coding regions of a genome are provided, in accordance with an illustrative embodiment of the present invention.
[0008]In a first aspect of the invention, a method of determining associations between non-coding sequences and gene coding sequences in a genome of an organism comprises the following steps. At least one conserved region is identified from a plurality of the non-coding sequences. The at least one conserved region is linked with one or more of the gene coding sequences of the genome. The at least one conserved region is associated with one or more biological processes of the organism.
[0009]In a second aspect of the invention, a method of designing one or more sequences of small interfering RNAs that can interact with one or more sites in a given transcript of a given sequence in a given organism and result in the down-regulation of the expression of the protein product encoded by the given transcript comprises the following steps. One or regions of interest are identified in the sequence of a given transcript. One or more regions are sub-selected from the collection of these regions. One or more derived sequences are generated from the sequence of the one or more sub-selected regions. The one or more derived sequences are used to create one more instances of the corresponding molecule that the one or more derived sequences represent. The one or more instances of the created molecule are used in an appropriate environment to regulate the expression of the given transcript.
[0010]In a third aspect of the invention, a method of engineering a given transcript of a given gene in a given organism in order to regulate its expression comprises the following steps. One or more regions of interest are identified in the sequence of a given transcript. One or more regions are sub-selected from the collection of these regions. The one or more sub-selected regions are used to make one or more modifications to the sequence of the given transcript.
[0011]A more complete understanding of the present invention, as well as further features and advantages of the present invention, will be obtained by reference to the following detailed description and drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0012]FIG. 1 is a diagram illustrating an exemplary methodology for determining associations between non-coding and gene coding sequences in a genome of an organism, according to an embodiment of the present invention;
[0013]FIG. 2 is a diagram illustrating a probability density function and a cumulative distribution for lengths of patterns discovered in analyzed intergenic and intronic sequences of the human genome, according to an embodiment of the present invention;
[0014]FIG. 3 is a diagram illustrating a probability density function for a number of 16-mers with a given number of copies in a random input set, according to an embodiment of the present invention;
[0015]FIG. 4 is a diagram illustrating a preprocessing methodology, according to an embodiment of the present invention;
[0016]FIG. 5 is a diagram illustrating pattern specificity and number of appearances, according to an embodiment of the present invention;
[0017]FIG. 6 is a diagram illustrating a process for determining logically distinct patterns, according to an embodiment of the present invention;
[0018]FIG. 7 is a diagram illustrating a probability density function for lengths of pyknons, according to an embodiment of the present invention;
[0019]FIG. 8 is a diagram illustrating a number of blocks that are shared by three sets whose union makes up a pyknon collection, according to an embodiment of the present invention;
[0020]FIG. 9 is a diagram illustrating a cumulative function for a number of intergenic and intronic copies of pyknons, according to an embodiment of the present invention;
[0021]FIG. 10 is a diagram illustrating a cumulative function showing what percentage of affected transcripts contain N or more pyknon instances, according to an embodiment of the present invention;
[0022]FIG. 11 is a diagram illustrating combinatorial arrangements of pyknons in 3'UTRs, according to an embodiment of the present invention;
[0023]FIG. 12 is a diagram illustrating combinatorial arrangements of pyknons in 5'UTRs, according to an embodiment of the present invention;
[0024]FIG. 13 is a diagram illustrating combinatorial arrangements of pyknons in amino acid-coding regions, according to an embodiment of the present invention;
[0025]FIG. 14 is a diagram illustrating a probability density function and corresponding cumulative function for variable representing the fraction of pyknon instances located in repeat-free regions, according to an embodiment of the present invention;
[0026]FIG. 15 is a diagram illustrating a partial list of biological processes whose corresponding genes show significant enrichment or depletion in pyknon instances in their 5'UTRs, CRs or 3'UTRs, according to an embodiment of the present invention;
[0027]FIG. 16 is a diagram illustrating probability density functions for the distance between starting points of consecutive instances of pyknons, according to an embodiment of the present invention;
[0028]FIG. 17 is a diagram illustrating probability density functions for the number of intergenic and intronic copies of variable-length strings derived by counting instances of 3'UTR-conserved pyknons after shifting, according to an embodiment of the present invention;
[0029]FIG. 18 is a diagram illustrating the number of intergenic/intronic neighborhoods each of which contains the reverse complement of a pyknon and is predicted to fold into a hairpin-like structure, according to an embodiment of the present invention;
[0030]FIG. 19 is a diagram illustrating the number of positions per 10,000 nucleotides that human pyknons cover in other genomes, according to an embodiment of the present invention;
[0031]FIG. 20 is a diagram illustrating the number of human pyknons that can be found in the untranslated and coding regions of other genomes, and the number of intergenic/intronic positions that human pyknons cover in other genomes, according to an embodiment of the present invention;
[0032]FIG. 21 is a diagram illustrating a first methodology for using pyknons, according to an embodiment of the present invention;
[0033]FIG. 22 is a diagram illustrating a second methodology for using pyknons, according to an embodiment of the present invention; and
[0034]FIG. 23 is a block diagram of an exemplary hardware implementation of one or more of the methodologies of the present invention, according to an embodiment of the present invention.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
[0035]The sequence listings referred to herein can be found in electronic text format as the file, 1500-675 PatentIn Sequence Listing. ST25, created on Wednesday, Jul. 12, 2006, having a size of 102 Kilobytes, the contents of which are incorporated by reference herein.
[0036]FIG. 1 is a diagram illustrating exemplary methodology 100 for determining associations between non-coding and gene coding sequences in a genome of an organism. In step 102, non-coding sequences from the genome of an organism are obtained. Preferably, the non-coding sequences comprise intergenic and/or intronic sequences. As used herein, the term "intergenic" refers generally to any portion of a deoxyribonucleic acid (DNA) strand that is between gene sequences. Further, as used herein, the term "intronic" refers to any portion of a DNA strand that is encompassed by an intron.
[0037]According to one exemplary embodiment, the genome comprises the human genome. Further, as will be described below, the starting point of the present techniques may be the genome of a single organism, e.g., the human genome.
[0038]In step 104, conserved regions in the intergenic/intronic sequences are identified. As will be described in detail below, these conserved regions may be identified by used pattern discovery techniques to identify patterns existing in the sequences.
[0039]In step 106, the identified conserved regions (also referred to as `conserved motifs`) of the intergenic/intronic non-coding sequences are linked to gene coding regions of the genome. Specifically, instances of the patterns, described, for example, in conjunction with the description of step 104, above, may be searched for in gene coding regions of the genome. For example, as will be described in detail below, sequences will be identified, e.g., that are at least 16 nucleotide bases in length and appear a minimum of 40 times in the non-coding region of a genome, and which also appear at least one time in the coding region of the genome. These identified sequences (motifs) link the coding and non-coding regions of the genome. As will also be described in detail below, linking the conserved regions and gene coding regions of the genome provides for an association to be made with the biological processes of the organism.
[0040]In step 108, as will be described in detail below, the identified motifs that link the non-genic and genic regions of a genome also provide for an association between these motifs and specific biological processes that even persists in organisms beyond the human genome.
[0041]As used herein, the phrase "conserved region" may also be referred to as a "conserved motif" or a "conserved block" or "an exceptionally-well-conserved block (EWCB)" or a "pyknon." The term "pyknon" is from the Greek adjective πυκνoc/πυκν{acute over (η)}/πυκνoν meaning "serried, dense, frequent."
[0042]Accordingly, using an unsupervised pattern discovery method, we processed the human intergenic and intronic regions and catalogued all variable-length patterns with identically conserved copies and multiplicities above what is expected by chance. Among the millions of discovered patterns, we found a subset of 127,998 patterns, termed pyknons, which have one or more additional non-overlapping instances in the untranslated and protein-coding regions of 30,675 transcripts from 20,059 human genes. The pyknons are found to arrange combinatorially in the S' untranslated, coding, and 3' untranslated regions of numerous human genes where they form mosaics. Consecutive instances of pyknons in these regions exhibit a strong bias in their relative placement favoring distances of about 22 nucleotides. We also found that pyknons are enriched in a statistically significant manner in genes involved in specific processes, e.g., cell communication, transcription, regulation of transcription, signaling, transport, etc. For about 1/3rd of the pyknons, the intergenic/intronic instances of their reverse complement lie within 382,244 non-overlapping regions, typically 60-80 nucleotides long, which are predicted to form double-stranded, energetically stable, hairpin-shaped RNA secondary structures; additionally, the pyknons subsume about 40% of the known microRNA sequences, thus suggesting a possible link with post-transcriptional gene silencing and RNA interference. Cross-genome comparisons revealed that many of the pyknons are also present in the 3'UTRs of genes from other vertebrates and invertebrates where they are over-represented in similar biological processes as in the human genome. These novel and unexpected findings suggest potential functional connections between the coding and non-coding parts of the human genome.
[0043]Thus, in accordance with an illustrative methodology of the invention, we examine whether highly specific patterns exist within a single genome that may act as targets or sources for such putative regulatory activity or as a `vocabulary` for yet undiscovered mechanisms. Our analysis represents a substantial point of departure from previous efforts. First, we carry out all of the analysis on a single genome. Second, we seek patterns in the intergenic and intronic regions of the genome (not the UTRs or protein coding) regions. Third, the pattern instances transcend chromosomal boundaries. And, fourth, we rely on the unsupervised discovery of conserved motifs instead of searching schemes. In particular, we sought to discover identically conserved, variable-length motifs of certain minimum length but unlimited maximum length in human intergenic and intronic regions. We discovered more than 66 million motifs with multiplicities well above what is expected by chance. A sizeable subset of these motifs, referred to as the pyknons, have one or more additional instances in the untranslated and coding regions of almost all known human genes and exhibit properties that suggest the possibility of an extensive link between the non-genic and genic regions of the genome and a connection with post-transcriptional gene silencing (PTGS) and RNA interference (RNAi).
[0044]As described, for example, in conjunction with the description of step 104 of FIG. 1, above, according to the techniques described herein, a pattern discovery step may be performed. We used the parallel version of a pattern discovery algorithm described in I. Rigoutsos et al., Combinatorial Pattern Discovery in Biological Sequences: the TEIRESIAS Algorithm, 14 BIOINFORMATICS 1, pgs. 55-67 (January 1998) (hereinafter "Rigoutsos"), the disclosure of which is incorporated by reference herein. The pattern discovery (Teiresias) algorithm seeks variable-length motifs that are identically conserved across all of their instances, comprise a minimum of L=16 nucleotides and appear a minimum of K=40 copies in the processed input (see below regarding the values of L, K). The algorithm guarantees the reporting of all composition-maximal and length-maximal patterns satisfying these parameters. The input comprised the intergenic and intronic sequences (step 102 of FIG. 1) of the human genome from ENSEMBL Rel. 31 (see Stabenau, A., McVicker, G., Melsopp, C., Proctor, G., Clamp, M. & Birney, E. (2004) Genome Res 14, 929-33) for a total of 6,039,720,050 nucleotides. The input did not include the reverse complement of the 5' untranslated, amino acid coding or 3' untranslated regions of any human genes. This exclusion ensures that any discovered patterns are not connected in any way to sequences of known genes, protein motifs or domains. The algorithm ran on a shared-memory architecture with 128 Gigabytes of main memory and 8 processors running at a clock frequency of 1.75 GHz, and generated an initial set Pinit of 66+ million statistically significant patterns (see below). Most of the patterns in Pinit were a few tens of nucleotides in length. FIG. 2 shows the probability density function (in black) and cumulative distribution (in light gray) for the lengths of the more than 66 million patterns discovered in the analyzed intergenic and intronic sequences of the human genome. These patterns form the set Pinit. As can be seen from FIG. 2, more than 95% of all discovered patterns are shorter than 100 nucleotides. Note that the primary Y-axis is logarithmic whereas the secondary is linear.
[0045]The Teiresias discovery algorithm that we used for this analysis requires the setting of three parameters: L, W and K. The parameter L controls the minimum possible size of the discovered patterns but has no bearing on the patterns' maximum length; the latter is not constrained in any way. The parameter W satisfies the inequality W≧L and controls the `degree of conservation` across the various instances of the reported patterns: smaller (respectively, larger) values of W will tolerate fewer (respectively, more) mismatches across the instances. Since for this analysis, we are interested only in patterns with identically conserved instances, we set W=L (i.e., the patterns contained no "wild cards"). Finally, the parameter K controls the minimum required number of appearances before a pattern can be reported by the algorithm.
[0046]For a given choice of t, W and K, the algorithm guarantees the reporting of all patterns that have K or more appearances in the processed input and are such that any L consecutive (but not necessarily contiguous) positions span at most W positions. These patterns are generally overlapping: a given sequence location can simultaneously appear in multiple, distinct, non-redundant patterns. It is also important to stress three properties of the algorithm. First, as stated above, the value L does not impose any constraint on the maximum length of a pattern which is unbounded. Second, each reported pattern will be maximal in composition, i.e., it cannot be made more specific by specifying the value of a wild-card without decreasing the number of locations where it appears. And, third, each reported pattern will be maximal in length, i.e., it cannot be made longer without decreasing the number of locations where it appears. In this discussion, we use the terms pattern, block and motif interchangeably.
[0047]Opting for small L values generally permits the identification of shorter conserved motifs that may be present in the processed input, in addition to all longer ones--see above properties. Generally, for short motifs to be claimed as statistically significant they need to have a large number of copies in the processed input; requiring a lot of copies runs the risk of discarding bona fide motifs. On the other hand, larger values of L will generally permit the identification of statistically significant motifs even if these motifs repeat only a small number of times. This happens at the expense of significant decreases in sensitivity; i.e. bona fide motifs will be missed.
[0048]For our analysis, we have selected L=16, a value that strikes a balance between the desirable sensitivity (which favors lower L values) and achievable specificity (which favors higher L values). We stress that the maximality properties of the pattern discovery step ensure that we will be able to report any and all motifs that are 16 nucleotides or longer. And as explained above, we will set W=L.
[0049]The last parameter that needs to be set is K, the required number of appearances for a pattern to be reported. K needs to be set to a value that can ensure that the reported patterns could not have been derived from a random database with the same size as the input at hand. In order to determine this value, we used several randomly-shuffled versions of our intergenic and intronic input (of approximately 6 billion bases) and in there sought frequent, fixed-size 16-mers with all low-complexity 16-mers removed by NSEG (see Wootton, J. C. & Federhen, S. (1993) Computers in Chemistry 17, 149-163). The idea here is that if a randomly-shuffled version of our input set cannot give rise to any 16-mers that appear more than KX times, then it will also be true that no patterns exist in the input set that are longer than 16 nucleotides and have more than KX copies. Several iterations of this process allowed us to establish that KX=23. FIG. 3 shows the probability density function for the number of 16-mers with a given number of copies in the random input set--note that both the X and Y axes are logarithmic. From this, it follows that a randomly-shuffled version of our input set cannot possibly give rise to patterns which are longer than 16 nucleotides and have more than 23 copies: in fact, as a pattern increases in length, the number of times it appears in a given input set can only decrease. We thus opted for the even larger threshold of K=40 for our pattern discovery step.
[0050]Before we sought to discover patterns in the intergenic and intronic regions of the human genome, we preprocessed the sequences and removed: a) all the regions that corresponded to 5' untranslated, coding and 3' untranslated regions of known genes; and, b) all the regions that were the reverse complement of 5' untranslated, coding and 3' untranslated regions of known genes. We show this preprocessing step pictorially in FIG. 4. The genomic input before the preprocessing step is shown above the arrow, and the input upon which pattern discovery is run is shown below the arrow.
[0051]Under the assumption that all four nucleotides are equiprobable (i.e., pA=pT=pC=pG=1/4), independent, and, identically distributed, we estimate the probability p of a pattern of length l to be p=4-l. We can compute the probability Prk to observe k instances of a given pattern in a database of size D (D>1) to be Prk≈(pD)ke-pD/k! (Poisson distribution). The least specific pattern that our method will discover is one that is the shortest possible (i.e., l=L=16) and appears the smallest allowed number of times (i.e., k=K=40): if D=6.0×109 bases (=all chromosomes and both strands), then Prk=1.95×10-43.
[0052]We now revisit this calculation by taking into account the nucleotides' natural probability of occurrence. Using ENSEMBL Release 31 from May 2005 (based on NCBI Assembly 35 from July 2004) as our database D, we see that the fraction of bases that are undetermined across the 24 human chromosomes ranges from roughly 1.2 to 61.0% for the Y chromosome. Of course, the following constraints should apply: pA=pT and pC=pG. Since the fractions of nucleotides that are undetermined are not equal, the required balance between A/T and C/G is only approximately preserved. Ignoring the unspecified positions and recomputing ratios based on the remaining bases, we find that pA=pT≈3/10 and pC=pG≈2/10.
[0053]Let us consider a block of size l and let "match" indicate the match between the i-th character of this block and a character c at position in a database D of nucleotide sequences. Then it is easy to see that:
Pr ( match ) = Pr ( match with c ) = Pr ( match ( c is one of A , C , G , T ) ) = Pr ( ( match c = A ) ( match c = C ) ( match c = G ) ( match c = T ) ) = Pr ( match A ) Pr ( A ) + Pr ( match C ) Pr ( C ) + Pr ( match G ) Pr ( G ) + Pr ( match T ) P ( T ) = Pr ( A ) Pr ( A ) + Pr ( C ) Pr ( C ) + Pr ( G ) Pr ( G ) + Pr ( T ) P ( T ) = Pr ( A ) 2 + Pr ( C ) 2 + Pr ( G ) 2 + Pr ( T ) 2 = p A 2 + p C 2 + p G 2 + p T 2 = 0.3 2 + 0.3 2 + 0.2 2 + 0.2 2 = 0.26 ##EQU00001##
[0054]In this analysis, we consider blocks of length l with l≧16. Naturally, these shortest blocks will be associated with the largest probability p of observing a pattern accidentally--the value p decreases as the value of l increases. The probability that a block of length l=16 will have one instance in the database D is then p1Pr(match)16=(0.26)16 or p1=4.4*10-10.
[0055]An alternative way to approach this is to assume that the block of length l is constructed by drawing from the same nucleotide distribution that gives rise to the database D. Then, a block of length l=16 would comprise pA*16≈5 A's, pC*16≈5 C's, pG*16≈3 G's and pT*16≈3 T's. Then, the probability that this block will arise accidentally is p2pA5*pC5*pG3*pT3=3.8*10.sup- .-10.
[0056]We can compute the probability of finding k accidental instances in a database D that contains 6×109 bases where each of the instances is independent of all the preceding instances using the Poisson distribution Prk≈(pD)ke-pD/k!. The probability Prk that a 16-mer will appear k times with k=40 is equal to 4.5*10-33 (resp. 2.6*10-35) if p1 (resp. p2) is used in the calculation.
[0057]We thus can see that even if we take into account the natural frequency of appearance in the human genome of each of the four nucleotides, the probability that one of our discovered blocks is accidental remains very small even for blocks of size 16 that appear only 40 times.
[0058]Alternatively, we can estimate the significance of our patterns using z-scores: for the least specific patterns of length 16 that have exactly 40 identical copies we obtain the remarkably-high value of z=32.66. Longer patterns and patterns with more intergenic/intronic copies have even higher z-scores. These analyses confirm in different ways that every one of our discovered patterns is statistically significant and not the result of a random process. These conclusions hold true for the reverse complements of the discovered patterns as well and for the pyknons that are a subset of the discovered patterns Pinit.
[0059]It is to be noted that we will use the terms "coding" and "coding region" (abbreviated as CR and CRs) to refer to the translated, amino-acid coding part of exons.
[0060]We now describe the step of determining which of the discovered patterns have additional instances in the 5'UTRs, CRs or 3'UTRs of known genes. Once the pattern discovery step has produced the set Pinit of variable length patterns, we processed it to identify `logically distinct` patterns using the following approach. Let there be a recurrent logical unit which appears several times in the intergenic/intronic regions of the human genome; each one of its instances is assumed to have different lengths that reflect varying degrees of conservation. For simplicity, we assume here that different degrees of conservation will result in variable length instances of the pattern. We only seek patterns with identically-conserved instances so this is a correct assumption. For example's sake, we will assume that all variations of the logical unit contain an intact copy of an 18-nucleotide core motif; let TCCCATACCACGGGGATT (SEQ ID NO: 1) represent this core. As the instances of the logical unit become longer and thus more specific, the number of appearances in the input decreases. FIG. 5 shows this example in more detail. Several hypothetical variations of the logical unit are aligned around the common core motif and the number of instances is listed next to each variation.
[0061]We reasoned that these patterns should be processed in order of decreasing value of the total number of positions that they span: this number is simply the product of each pattern's length by the number of times it appears in the input. As patterns are examined in turn, some of them are selected and kept whereas others collide with earlier-made selections.
[0062]Two collision scenarios are possible and we examine them with the help of FIG. 6. Two blocks, light and darker gray, are shown therein together with a `reference set` of sequences. The light gray block corresponds to a pattern that has already been examined and placed at all its instances. The instances of the darker gray block show the intended placements for the pattern currently under consideration. The blocks collide at two locations (they overlap in the first and second sequence) but the rest of their instances are disjoint. We have two possibilities regarding the handling of collisions. The darker gray block is kept if and only if there is at least one other location in the reference sequence set where it can be placed without generating a collision (e.g. the fifth and sixth sequences in FIG. 6). Alternatively, the darker gray block is kept if and only if it generates no collisions whatsoever with any block that has already been selected and placed. We have opted for the stricter, second choice: if a pattern's instance uses a position that has already been claimed by an earlier-selected pattern, then the pattern under consideration will be discarded and not considered further. Generally, it will be redundant variations of the same pattern that will generate collisions: only one pattern will be used to represent a core motif such as the one shown in FIG. 5.
[0063]The one remaining element is to decide which sequences to use as the reference set. We have chosen to use each of the 5'UTRs, CRs, and 3'UTRs in turn. Sub-selecting among the patterns in Pinit with the help of each of the 5' untranslated, coding and 3' untranslated regions gives rise to the pattern collections P5'UTR, PCODING and P3'UTR respectively. The union of these sets, P5'UTR U PCODING U P3'UTR comprises the pyknons, i.e., patterns that were originally discovered in the intergenic and intronic regions of the human genome and which have additional instances in the 5' untranslated, coding and 3' untranslated regions of known human genes.
[0064]We used the above steps to determine which of the discovered patterns has additional instances in the untranslated and coding regions of genes. After filtering the surviving patterns for low-complexity with NSEG (Wootton, J. C. & Federhen, S. (1993) Computers in Chemistry 17, 149-163), we generated three patterns sets P5'UTR, PCODING and P3'UTR that contained 12,267, 54,396 and 67,544 patterns respectively and had additional instances in 5'UTRs, CRs and 3'UTRs. The union of P5'UTR U PCODING U P3'UTR contained 127,998 patterns indicating that the three pattern sets are largely disjoint. We refer to these 127,998 patterns as pyknons.
[0065]We know describe some properties of the pyknons. The three patterns sets P5'UTR, PCODING and P3'UTR contain 12,267, 54,396 and 67,544 blocks respectively. The union P5'UTRU PCODING U P3'UTR comprises the 127,998 pyknons. In FIG. 7, we show the probability density function for the length of the pyknons; the function is shown separately for each of the three subsets that make up the pyknon collection. Note that the Y-axis is logarithmic.
[0066]The patterns in each of the three collections, P5'UTR, PCODING and P3'UTR, fall into one of two types. "Type-A" patterns are patterns whose reverse complement is also present in the same collection (note that reverse palindromes are included among the type-A patterns). "Type-B" patterns are patterns whose reverse complement is absent from the collection. The breakdown for each of P5'UTR, PCODING and P3'UTR is as follows: P5'UTR contains 217 type-A blocks and 11,835 type-B blocks; PCODING contains 1,038 type-A blocks and 52,330 type-B blocks; and P3'UTR contains 2,501 type-A blocks and 62,577 type-B blocks. The clear majority of the blocks in each of the three collections are type-B blocks.
[0067]With respect to their content, the three collections are largely disjoint, a characteristic that presumably reflects sequence differences that are inherent to the actual 5'UTRs, CRs and 3'UTRs. FIG. 8 shows pictorially the relationship among the members of the three sets P5'UTR, PCODING and P3'UTR: note the small cardinalities of the various intersections.
[0068]Finally, we comment on the number of intergenic and intronic copies of a pyknon. This number spans a very wide range of values with the most frequent pyknon having 356,989 copies--the minimum number of copies is, by design, equal to K=40. For about 95% of the pyknons, their intergenic/intronic copies are fewer than 2,000. FIG. 9 shows the cumulative distributions for the number of intergenic and intronic copies of the pyknons--the distribution is again shown separately for each of P5'UTR, PCODING and P3'UTR in order to highlight the similarities and differences of the three sets.
[0069]The pyknons also exhibit a number of properties that connect the non-genic and genic regions of the human genome, as well as other genomes, in unexpected ways. In particular: [0070]The pyknons have one or more instances within nearly all known genes. The 127,998 pyknons that we originally discovered in the human intergenic and intronic regions have an additional 226,874 non-overlapping copies in the 5'UTRs, CRs or 3'UTRs of 20,059 genes (30,675 transcripts). That is, more than 90% of all human genes contain one or more pyknon instances. The pyknons in P5'UTR cover 3.82% of the 6,947,437 nucleotides in human 5'UTRs; the pyknons in PCODING cover 3.04% of the 50,737,024 nucleotides in human CRs; and, the pyknons in P3'UTR cover 7.33% of the 25,597,040 nucleotides in human 3'UTRs. The distribution of pyknons in the various transcripts is not uniform. FIG. 10 shows the cumulative for the number of transcripts with a given number of pyknons instances in them. As can be seen, about 52% of the 30,675 affected transcripts contain four or more pyknon instances; of these about 2,200 transcripts contain 20 or more pyknon instances in them.
[0071]The pyknons arrange combinatorially in many human 5'UTRs, CRs and 3'UTRs forming mosaics. In those cases where we find many pyknons in one transcript, the pyknons arrange combinatorially and form mosaics. FIG. 1 shows an example of such a combinatorial arrangement in the 3'UTRs of birc4 (an apoptosis inhibitor) and nine other human genes. The 3'UTR of birc4 contains 100 instances of 95 distinct pyknons: of these, 22 are also present in the 3'UTRs of the other nine genes shown. One or more instances of the 95 pyknons from birc4's 3'UTR exist in the 3'UTRs of 2,306 transcripts (data not shown).
[0072]We next show two more examples, one involving 5' untranslated and the other involving coding regions. It is important to stress here that the pyknons are initially discovered in an input that includes neither untranslated/amino-acid-coding sequences nor their reverse-complement; thus, pyknon arrangements such as the ones shown in the following two examples represent non-trivial findings from the standpoint of statistical significance. FIG. 12 shows an example of combinatorial rearrangement in the 5'UTRs of ENSG00000196809 a gene of unknown function and 8 more human genes. 63 distinct pyknons have a total of 65 instances in the 5'UTR of ENSG00000196809. Of the 63 pyknons in the 5'UTR of ENSG00000196809, nine are also shared with the remaining eight genes of the shown group.
[0073]FIG. 13 shows an example of combinatorial rearrangement in human coding regions with the help of the amino-acid-coding sequences from 10 distinct genes: 9 pyknons have a total of 124 instances in the coding regions of the shown transcripts with several of the conserved pyknons appearing twice or more in a given sequence.
[0074]Recall that we initially discovered the pyknons in an input that included neither transcribed gene-related sequences nor their reverse-complement. Thus, finding so many pyknons with instances in human 5'UTRs, CRs and 3'UTRs is significant, especially in view of the three striking examples of combinatorial rearrangements shown above.
[0075]The pyknons account for 1/6th of the human intergenic and intronic regions. The intergenic and intronic copies of the pyknons span 692,393,548 positions on the forward and reverse strands. For those pyknons whose reverse complements are not already in the list of 127,998 pyknons, their Watson-strand instances impose constraints on their Crick-strand instances. Considering this and recalculating shows that 898,424,004 positions, i.e., about 1/6th of the human intergenic/intronic regions, are covered by pyknons and their reverse complement.
[0076]The pyknons are non-redundant. We clustered the pyknons using a scheme based on BLASTN (Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990) J Mol Biol 215, 403-10). Two pyknons are redundant if they agree on at least X % of their positions. Since our collection includes pyknon pairs whose members are the reverse complement of one another, we had to ensure that the clustering scheme did not over-count: when comparing sequences A and B, we examined for redundancy the pair (A,B) and the pair (reverse-complement-of-A,B). Clustering at X=70, 80 and 90%, we generated clusters with 32621, 44417 and 89159 pyknons respectively. The high numbers of surviving clusters show that the pyknons are largely distinct.
[0077]We next describe the BLASTN-based clustering scheme. Let us assume that we are given a set of N sequences of nucleic acids of variable length, and a user-defined threshold X for the permitted, maximum remaining pair-wise sequence similarity. Then, we carry out the following steps:
TABLE-US-00001 .box-solid. sort the N sequences in order of decreasing length ; let Si denote the i-th sequence of the sorted set ; let Sl be the longest sequence of the sorted set ; .box-solid. CLEANED_UP_SET Sl .box-solid. for i = 2 through N do .box-solid. use Si as query to run BLAST against the current contents of CLEAN .box-solid. if the top BLAST hit T agrees with Si or with the reverse complement of Si at more than X% of T's positions then make Si a member of the cluster represented by T ; discard Si ; else CLEANED_UP_SET CLEANED_UP_SET U { Si } ; end-for-loop
[0078]Upon termination, the set CLEANED_UP_SET contains sequences no pair of which agrees on more than X % of the positions in the shorter of the two sequences.
[0079]On pyknons and repeat elements. 1,292 pyknons (1.0%) have instances occurring exclusively inside repeat elements as determined with the help of RepeatMasker (Smit, A. & Green, P. RepeatMasker: ftp.genome.washington.edu/RM/RepeatMasker.html). Seventy-ninepyknons have instances exclusively in repeat-free regions. And, the remaining 126,627 pyknons (98.9% of total) have instances both inside repeat elements and in repeat-free regions. A question that arises here is what fraction, on average, of the total number of copies of pyknons is generated from repeat-free regions. We have computed the probability density and cumulative functions for this fraction, and plot them in FIG. 14. As can be seen, about 60% of the pyknons have more than 90% of their copies inside repeat elements. However, the remaining 40% of the pyknons, which amounts to a little more than 50,000 pyknons, have between 10% and 100% of their instances in regions that are free of repeats.
[0080]The pyknons are distinct from the "ultraconserved elements." 52 pyknons have instances in 46 of the 481 ultraconserved elements (Bejerano, G., Pheasant, M., Makunin, I., Stephen, S., Kent, W. J., Mattick, J. S. & Haussler, D. (2004) Science 304, 1321-5) and cover 0.67% of the 126,007 positions: uc.73+ contains four pyknons; uc.23+, uc.66+, ue.143+ and uc.414+ each contain two pyknons; the remaining 41 elements contain a single pyknon each.
[0081]The pyknons are associated with specific biological processes. For 663 GO terms (Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., Davis, A. P., Dolinski, K., Dwight, S. S., Eppig, J. T., Harris, M. A., Hill, D. P., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J. C., Richardson, J. E., Ringwald, M., Rubin, G. M. & Sherlock, G. (2000) Nat Genet 25, 25-9) describing biological processes at varying levels of detail, we found that the corresponding genes had either a significant enrichment or a significant depletion in pyknon instances. FIG. 15 shows a partial list of GO terms that are enriched or depleted in pyknons.
[0082]We determined these associations as follows: each gene was included in a list n times, where n is the number of pyknons found in its 5' untranslated, coding or 3' untranslated region, respectively--to avoid over-counting, pyknons with multiple instances in the transcript(s) of a given gene were counted only once. For the sets of sequences belonging to human 5'UTRs, CRs and 3'UTRs, respectively, the binomial distribution was used to estimate the significance of enrichment (or depletion) of pyknons encountered in a group of genes associated with a certain term, compared to the expected frequency of this term in a background set comprising all genes with 5' untranslated, coding or 3, untranslated regions respectively.
[0083]Two control tests helped ensure the significance of our findings. First, we generated gene lists identical to the ones derived from the real data but which were created by random associations with pyknons: we found that only 1 of the generated 84,780 p-values exceeded our selected significance threshold of a Bonferroni-corrected|log(p-value)| of about 2.3 (data not shown). Second, we examined the relation between GO-process associations and the amount of sequence covered by the pyknons: this test allowed us to rule out the possibility that the derived significant enrichment/depletion were due to variations in sequence length for the genes associated with given cellular processes.
[0084]The relative positioning of pyknons in 5'UTRs, CRs and 3'UTRs is strongly biased but consecutive pyknon instances are not correlated. We examined the distances between consecutive pyknons, independently for each of the 5'UTRs, CRs and 3'UTRs. FIG. 16 shows the calculated probability density functions. Given the stringent criteria, we used when selecting pyknons, the coverage of each region is not dense, hence the tail-heavy distributions. The three curves have similar shapes, pronounced peaks at abscissas 18 and 22, and an overall preference for distances between 18 and 31 nucleotides.
[0085]We next examined whether or not the pyknons are fragments of larger conserved regions. Let b denote a pyknon and let us assume that, unbeknownst to us, b is part of a larger-size conserved unit B. Then B will correspond to a larger area than the instance carved out by b, and thus there will be length(B)-length(b)+1 strings in the immediate neighborhood of b whose intergenic and intronic counterparts have as many identically-conserved copies as B. We tested this possibility in 3'UTRs by taking each instance of a pyknon, shifting it by +d (resp. -d), generating a new string and locating the new string's instances in the human intergenic and intronic regions. Had b been part of a larger logical unit, then for some values of d the number of intergenic and intronic copies of the newly formed string would have remained identical to those of b. On the other hand, if b were not part of a larger unit, then the new string would now cross the "natural boundaries" of the underlying presumed logical units and the new string's intergenic/intronic copies would be reduced drastically. Given the strict criteria that we used in identifying pyknons, it is possible that we discarded blocks that are conserved in intergenic/intronic regions and have instances in human coding regions. In this case, a shift of +/-d may end up generating a string that was not included in our set of pyknons but continues to have numerous intergenic/intronic copies. FIG. 17 shows the results obtained for the 3'UTRs for dε and separately for intergenic (top half) and intronic (bottom half) regions; the curves for d=0 correspond to the pyknons in P3'UTR. Note that even for a small shift of d=2 positions, the derived, shifted strings have strikingly fewer copies than the pyknons in P3'UTR, and this holds true for both the intergenic and intronic instances. We obtained similar results for negative d's (data not shown).
[0086]The pyknons are possibly linked to PTGS. The most conspicuous feature of FIG. 16 is the strong preference for distances typically encountered in the context of PTGS. By definition, the 127,998 pyknons have one or more instances in the untranslated and coding regions of human genes: for each pyknon, we generated its reverse complement β, identified all of β's intergenic and intronic instances, and predicted the RNA structure and folding energy of the immediately surrounding neighborhoods using the Vienna package (Hofacker, I. L., Fontana, W., Stadler, P., Bonhoeffer, L. S., Tacker, M. & Schuster, P. (1994) Monatshefte f. Chemie 125, 167-188). We discarded structures whose predicted folding energies were >-30 Kcal/mol, and structures (including ones with favorable folding energies) that were predicted to locally self-hybridize, even if the involved positions represented a miniscule fraction of the total length of the regions under consideration. We also discarded structures that contained either a single large bulge or many unmatched bases. Each of the surviving regions was predicted to fold into a hairpin-shaped RNA structure that had a straightforward arm-loop-arm architecture, contained very small bulges if any, and was energetically very stable. The analysis identified 380,084 non-overlapping regions predicted to form hairpin-shaped structures (298,197 in intergenic and 81,887 in intronic sequences). These 380,084 regions contained instances of the reverse complement of 37,421 pyknons (29.23% of total). In terms of length, the clear majority of these regions are between 60 and 80 nucleotides long.
[0087]FIG. 18 shows the density of the surviving regions per 10,000 nucleotides and for each chromosome separately. The density is reported for each chromosome and separately for the intergenic and intronic regions. Per unit length, there are more predicted hairpins in intronic rather than intergenic regions but the shear difference in the magnitude of these regions results in the intergenic regions contributing the bulk of the hairpins. Interestingly, the density of discovered hairpins is not constant across chromosomes: chromosomes 16, 17, 19 and 22 who are the most densely-packed in terms of predicted hairpins are also among the shortest in length. We emphasize that the average pyknon has length similar to that of a typical microRNA and that there is a straightforward sense-antisense relationship between segments of the 380,084 hairpins and the pyknons instances in human 5'UTRs/CRs/3'UTRs. Also note that the regions containing the 81,887 intronic hairpins will be transcribed: these regions account for 21,727 of the 37,421 pyknons that are linked to hairpins.
[0088]If pyknons are indeed connected to PTGS, then two hypotheses arise from FIG. 16: a) in addition to 3'UTRs, gene silencing is likely effected through the 5'UTRs and amino acid coding regions; and, b) RNAi products in animals likely form distinct quantized categories based on size and have preferences for lengths of 18, 22, 24, 26, 29, 30 and 31 nucleotides.
[0089]The pyknons relate to known microRNAs. We formed the union of the RFAM (Griffiths-Jones, S., Bateman, A., Marshall, M., Khanna, A. & Eddy, S. R. (2003) Nucleic Acids Res 31, 439-41) and pyknon collections, and clustered it with the above-described BLASTN-based scheme, using a threshold of pair-wise remaining sequence similarity of 70%; i.e., we allowed up to six mismatches in 22 nucleotides. When comparing two sequences A and B, we avoided over-counting by examining for redundancy the pairs (A,B) and (reverse-complement-of-A,B). In total, 1,087 known microRNAs clustered with 689 pyknons across 279 of the 32,994 formed clusters.
[0090]The pyknons relate to recently discovered 3'UTR motifs. We compared the pyknons in P3'UTR to the 72 8-mer motifs that were recently reported to be conserved in human, mouse, rat and dog 3'UTRs (Xie, X., Lu, J., Kulbokas, E. J., Golub, T. R., Mootha, V., Lindblad-Toh, K., Lander, E. S. & Kellis, M. (2005) Nature 434, 338-45). We say that one of these 8-mers coincides with a pyknon of length l if one of the following conditions holds: the 8-mer agrees with letters l-7 through l of a pyknon (`type 0` agreement); the 8-mer agrees with letters l-8 through l-1 (`type 1` agreement); or, the 8-mer agrees with letters l-9 through l-2 (`type 2` agreement). Of the 72 reported conserved 8-mers, 39 were in `type 0` agreement, 10 in `type 1` agreement, and seven in `type 2` agreement with one or more pyknons from P3'UTR. Six of the 8-mers did not match at all any of the pyknons in P3'UTR. In summary, the pyknons that we have derived by intragenomic analysis overlap with 56 out of the 72 motifs that were discovered through cross-species comparisons.
[0091]Human pyknons are also present in other genomes where they associate with similar biological processes. In FIG. 19, and for each of 7 genomes in turn, we show how many positions in region X of the genome at hand are covered by the human pyknons contained in set PX, X={5'UTR,CODING,3'UTR}. We account for length differences across genomes by reporting the number of covered positions per 10,000 nucleotides. FIG. 20 shows how many of the human pyknons contained in set PX can also be found in the region X of the genome under consideration, X={5'UTR,CODING,3'UTR}. FIG. 20 also shows the total number of intergenic and intronic positions covered by those of the human pyknons that are also in other genomes. Notably, the human genome contains more than 600 million nucleotides that are associated with identical copies of pyknons and are absent from the mouse and rat genomes. Interestingly, the human pyknons have many instances in the intergenic and intronic regions of the phylogenetically distant worm and fruit-fly genomes covering about 1.6 million nucleotides in each.
[0092]A set of 6,160 human-genome-derived pyknons are simultaneously present in human 3'UTRs (5,752 genes) and mouse 3'UTRs (4,905 genes) whereas a second set of 388 pyknons are simultaneously present in human 3'UTRs (565 genes), mouse 3'UTRs (673 genes) and fruit-fly 3'UTRs (554 genes). Strikingly, we found these two sets of common pyknons to be significantly over-represented in the same biological processes in these other genomes (i.e. mouse and fruit-fly) as in the human genome, even though the pyknons were initially discovered by processing the human genome in isolation (data not shown). The common processes include regulation of transcription, cell communication, signal transduction etc. Finally, for each of the 388 pyknons in this second set, we manually analyzed about 130 nucleotide-long neighborhoods centered on the instances of each pyknon across the human, mouse and fruit-fly 3'UTRs and for a total of more than 4,000 such neighborhoods: notably, we did not find any instance of syntenic conservation across the three genomes.
[0093]Accordingly, as explained above, we explored the existence of sequence-based links between coding and non-coding regions of the human genome and identified 127,998 pyknons with a combined 226,874 non-overlapping instances in the 5'UTRs, CRs or 3' UTRs of 30,675 transcripts from 20,059 human genes. In transcripts that contained multiple pyknon instances, we found that the pyknons arrange themselves combinatorially forming mosaics. Statistical analysis revealed that the untranslated and/or coding regions of genes associated with specific biological processes are significantly enriched/depleted in pyknons.
[0094]We also found that the pyknon placement in 5'UTRs, CRs and 3'UTRs is strongly biased: the starting positions of consecutive pyknons show a clear preference for distances between 18 and 31 nucleotides. Importantly, we found an apparent lack of correlation between consecutive pyknon instances in these regions. The observed bias in the relative placement of the pyknons is conspicuously reminiscent of lengths that are associated with small RNA molecules that induce PTGS, suggesting the hypothesis that the pyknons' instances correspond to binding sites for microRNAs. Analysis of the regions immediately surrounding the intergenic and intronic instances of the reverse complement of the 127,998 discovered pyknons revealed that 30.0% of the pyknons have instances within about 380,000 distinct, non-overlapping regions between 60 and 80 nucleotides in length that are predicted to fold into hairpin-shaped RNA secondary structures with folding energies ≦-30 Kcal/mol. Many of these predicted hairpin-shaped structures are located inside known introns and thus will be transcribed. Our analysis also suggests that PTGS may be effected though the genes' 5'UTR and amino acid regions, in addition to their 3'UTRs. Another resulting hypothesis is that RNAi products in animals likely fall into distinct categories that are quantized in terms of size and have preferences for lengths of 18, 22, 24, 26, 29, 30 and 31 nucleotides. Notably, through sequence-based analysis, we showed that about 40% of the known microRNAs are similar to 689 pyknons, and that the pyknons subsume 56 of the 72 recently reported 3'UTR motifs, lending further support to the possibility of a connection between the pyknons and RNAi/PTGS.
[0095]The intergenic/intronic copies of the 127,998 pyknons constrain approximately 900 million nucleotides of the human genome. Instances of human pyknons can also be found in other genomes namely C. elegans, D. melanogaster, G. gallus, M. musculus, R. norvegicus and C. familiaris. The number of human pyknons that can be located in the 5'UTRs, CRs and 3'UTRs of other genomes decreases with phylogenetic distance. Strikingly, the pyknons that we found inside mouse and fruit-fly 3'UTRs were over-represented in the same biological processes as in the human genome. On a related note, more than 600 million bases, which correspond to identically conserved intergenic and intronic copies of human pyknons, are not present in the mouse and rat genomes.
[0096]The fact that some of the intergenic/intronic copies of pyknons originate in repeat elements may lead one to assume that our analysis has merely `rediscovered` such elements. However, as mentioned above, more than 50,000 of the pyknons have many of their instances in repeat-free regions. Moreover, the typical length of a pyknon is substantially smaller than, e.g., that of an ALU. It was recently reported that genes can achieve evolutionary novelty through the `careful` incorporation of ALUs in their coding regions (Iwashita, S., Osada, N., Itoh, T., Sezaki, M., Oshima, K., Hashimoto, E., Kitagawa-Arita, Y., Takahashi, I., Masui, T., Hashimoto, K. & Makalowski, W. (2003) Mol Biol Evol 20, 1556-63; and Lev-Maor, G., Sorek, R., Shomron, N. & Ast, G. (2003) Science 300, 1288-91). Also, the "pack-mule" paradigm revealed that entire genes, large fragments from a single gene, or fragments from multiple genes can be `hijacked` by transposable elements (Jiang, N., Bao, Z., Zhang, X., Eddy, S. R. & Wessler, S. R. (2004) Nature 431, 569-73). However, `fortuitous coincidence` is generally considered the prevailing mechanism by which such potential is unleashed. Contrasting this, the combinatorial arrangement of the pyknons within the untranslated and coding regions of genes together with the large number of instances in these regions and the association of pyknons with specific biological processes suggests that their placement is not accidental and likely serves a specific purpose. Our findings do not rule out a link with transposable elements. On the contrary, the findings seem to support a dynamic view of a genome (Jorgensen, R. A. (2004) Cold Spring Harb Symp Quant Biol 69, 349-54) that has leamed to respond, and likely continues to respond, to environmental challenges or "stress" in a controlled, organized manner.
[0097]Taken together, the results suggest the existence of an extensive link between the non-coding and gene-coding parts in animal genomes. It is conceivable that this link could be the result of integration into the genome of dsRNA-breakdown products. Since many genes are known to give rise to antisense transcripts, it is possible that these genes were at some point subjected to RNAi-mediated dsRNA breakdown which in turn gave rise to products about 20 nucleotides in length. The latter, through repeated integration, could have eventually given rise to the numerous intergenic and intronic copies of the pyknons that we have identified. However, this explanation would have to be reconciled with four of our findings. First, the pyknons have identically conserved copies in non-genic regions. Second, pyknons appear to favor a specific size and, in genic regions, a specific relative placement. Third, slight modification of the 3'UTR instances of the pyknons by either prepending or appending immediately neighboring positions results in new strings whose intergenic and intronic copies are markedly decreased. And fourth, we can discover human pyknons in other organisms such as the mouse and the fruit-fly where they exhibit a persistent enrichment within specific processes yet are not the result of syntenic conservation. It may well be that we are seeing traces of an organized, coordinated activity that involves nearly all known genes. The existence of a pyknon-based regulatory layer that is massive in scope and extent, originates in the non-coding part of the genome, operates through the genes' untranslated and coding regions, and, is likely linked to PTGS, is a tantalizing possibility. Moreover, the observed disparity in the number of intergenic/intronic positions covered by human pyknons in the human and the phylogenetically-close mouse/rat genomes suggests that pyknons and thus the presumed regulatory layer may be organism-specific to some degree ("pyknome"). Addressing such questions might eventually help explain the apparent lack of correlation between the number of amino-acid coding genes in an organism and the organism's apparent complexity.
[0098]In the above description, and in order to identify motifs that are present in both non-genic and genic regions, we proceeded by first carrying out pattern discovery in the intergenic and intronic regions of the human genome. Once those patterns were determined, we identified additional instances for them in the genic regions of the genome and in particular in the 5' untranslated, amino acid coding and 3' untranslated regions of the genes. In other words, the computational analysis flowed from the non-genic to the genic-regions. But there is nothing that inherently prevents us from carrying out the computation in the other direction, i.e., from the genic to the non-genic regions, although there is potential for a loss in sensitivity that might result in the identification of smaller sets of motifs linking non-genic with genic regions. One could carry out the genic/non-genic analysis in a number of ways. For example, one could use a pattern discovery method to process the full collection of 5' untranslated, amino acid coding and 3' untranslated regions (with the regions processed separately or together), identifying recurrent motifs contained therein, and finally establishing links with the non-genic regions of the genome by locating the intergenic and intronic copies for these motifs.
[0099]Instead of working with the full length sequences of the genes' untranslated and coding regions, an alternative method would be to delineate areas of interest in these regions (effectively subselecting), analyzing those areas to derive motifs, and finally locating additional instances of these motifs in the non-genic parts of the genome. Such areas of interest could, for example, be known or putative microRNA binding sites. Alternatively, the areas of interest could be what, in our work on the problem of RNA interference, we refer to as "target islands." A detailed description of the work is described in the U.S. patent application identified as Ser. No. 11/351,821, filed on Feb. 10, 2006, and entitled "System and Method for Identification of MicroRNA Target Sites and Corresponding Targeting MicroRNA Sequences," the disclosure of which is incorporated herein.
[0100]Summarily, our approach for finding microRNA target sites is known as rna22 and proceeds as follows: it discovers statistically significant patterns that are contained in the sequences of known microRNAs, generates their reverse complement, identifies all the instances of these reverse-complement patterns in a region of interest (namely one of 5'UTRs, CRs or 3'UTRs) and finally reports groups of consecutive locations from the region of interest as long as they are `hit` a minimum number of times by these patterns. Generally, the groups of consecutive locations that rna22 reports will be variable in length and may correspond to one or more binding sites: consequently, and so as to not loose generality, we have been referring to them as "target islands."
[0101]Let us assume that target islands are available for the region of interest. One could proceed by doing an all-against-all comparison of the target islands forming clusters. Any two target-islands that end up in the same cluster have the property that their corresponding sequences share a substantial portion of their extent, say a minimum of N locations. Initially, each target island is in its own cluster. There is always the possibility that the thresholds used in the various stages of the process are too stringent, thus resulting in the method to miss some target-islands that could have otherwise become members of some cluster c. In order to account for this, one could enhance the cluster-forming process as follows. Using the Clustal-W multiple alignment algorithm (Chenna, R., Sugawara, H., Koike, T., Lopez, R., Gibson, T. J., Higgins, D. G. & Thompson, J. D. (2003) Nucleic Acids Res 31, 3497-500), we could align the sequences in cluster c and extract the core region of the alignment, then use it to search the sequences of interest for instances of the core region that were skipped because of the employed thresholds. If a given cluster contains more than one core regions then it can be replaced by as many new clusters as the number of its distinct core regions. For each one of the formed clusters whose core region that resulted from the Clustal-W alignment of its members is at least N nucleotides in length, we report the region as a (genic) motif.
[0102]Optionally, one can discard core regions that exhibit low-complexity using the NSEG algorithm (Wootton, J. C. & Federhen, S. (1993) Computers in Chemistry 17, 149-163). These motifs are then sought in the corresponding genome's intergenic/intronic regions instances to establish links between coding and non-coding parts of the genome. Finally, it is clear that instead of clustering the target islands to determine motifs, one could simply use a pattern discovery approach and subselect among the reported patterns to keep only those that, for example, satisfy a minimum length requirement, or some other property.
[0103]Given the above description, a few points should be noted. First, it is clear that the method which we have described and the ensuing analysis is not specific to the human genome; in fact, it can be carried out separately in other eukaryotic genomes such as chimpanzee, mouse, rat, dog, chicken, fruit-fly, worm, etc. It is expected that the resulting pyknomes will have non-zero intersections with one another but will likely also contain organism-specific pyknons. Whether generated from the human or some other genome, the pyknons are statistically significant and link the non-genic and genic regions of the genome at hand. The links that are instantiated by the pyknons are `natural` in that they involve large numbers of sequences that occur naturally in the genome at hand. Consequently, the pyknons would form natural candidates for a number of processes that to date have been carried out using schemes that make use of local information alone and do not take into account long-range conservations of the kind that we presented in our discussion.
[0104]One such application would be the design of small interfering RNAs (siRNAs) to regulate the gene expression of one specific gene. Some of our pyknons have the property of being shared by two or more genes which allows the design of siRNAs that can interfere with a cluster of genes at once. As illustrated in the flow diagram of FIG. 21, a method for designing one or more sequences of siRNAs that can interact with one or more sites in a given transcript of a given sequence in a given organism and result in the down-regulation of the expression of the protein product encoded by the given transcript can comprise the following steps. One or regions of interest are identified in the sequence of a given transcript (step 2102). One or more regions are sub-selected from the collection of these regions (step 2104). One or more derived sequences are generated from the sequence of the one or more sub-selected regions (step 2106). The one or more derived sequences are used to create one more instances of the corresponding molecule that the one or more derived sequences represent (step 2108). The one or more instances of the created molecule are used in an appropriate environment to regulate the expression of the given transcript (step 2110).
[0105]Further, the method of designing one or more siRNAs may use a region of interest in the collection of regions of interest identified to be an instance of a motif that has one or more copies in the intergenic and intronic regions of the genome of interest, and one or more copies in the untranslated and amino acid coding regions of one or more genes in the genome of interest, each such region of interest being computed using the method and system for finding pyknons described above.
[0106]The method may use a region of interest in the collection of regions of interest identified using a method that is based on pattern discovery, for example, the method described in the above-referenced U.S. patent application identified as Ser. No. 11/351,821. A region of interest in the collection of regions of interest can also be identified to be a target island that is computed using the method also described in the above-referenced U.S. patent application identified as Ser. No. 11/351,821.
[0107]The method of designing one or more siRNAs may also use a region of interest, for example, located in the 5' untranslated region of the given transcript, located in the amino acid coding region of the given transcript, or located in the 3' untranslated region of the given transcript.
[0108]As detailed above, the method of designing one or more siRNAs can be used where the genome of interest is a eukaryotic genome, and wherein the eukaryotic genome is, for example, is the human genome, the mouse genome, the rat genome, the dog genome, the fruit fly genome, or the worm genome.
[0109]Also, the method of designing one or more siRNAs may use a region of interest that is sub-selected based on one or more of its attributes. These attributes may include, for example, the region's length and the region's location in the transcript.
[0110]The method of designing one or more siRNAs can also use a derived sequence that is, for example, the reverse complement of the sequence of the one or more sub-selected regions, or a near-reverse complement of the sequence of the one or more sub-selected regions, i.e. it contains mismatches at one or more locations.
[0111]The method of designing one or more siRNAs can be used such that the one or more copies of the molecule can be built using any of a set of biochemical processes.
[0112]Another application would involve the rational use of pyknons to appropriately engineer a transcript of interest in order to control its expression (either up-regulate or down-regulate) in a specific tissue or for a specific cellular process. For example, one could remove one or more of the pyknons existing in the transcript of interest leading to an up-regulation of the transcript. Alternatively, one could down-regulate the transcript of interest by adding more instances of existing pyknons and rely on the naturally occurring agent that targets this pyknon to induce down-regulation. Or one could add the sequence of a pyknon that is not among those contained in the transcript and selective control the transcript's expression by adding or removing appropriately generated instances of the reverse complement of the pyknon.
[0113]As illustrated in the flow diagram of FIG. 22, a method for engineering a given transcript of a given gene in a given organism in order to regulate its expression may comprise the following steps. One or more regions of interest are identified in the sequence of a given transcript (step 2202). One or more regions are sub-selected from the collection of these regions (step 2204). The one or more sub-selected regions are used to make one or more modifications to the sequence of the given transcript (step 2206).
[0114]Further, the method of engineering a given transcript to regulate gene expression can comprise many of the same steps as mentioned above in the method for designing one or more siRNAs. For example, the method of engineering a given transcript to regulate gene expression may use a region of interest in the collection of regions of interest identified to be an instance of a motif that has one or more copies in the intergenic and intronic regions of the genome of interest, and one or more copies in the untranslated and amino acid coding regions of one or more genes in the genome of interest. The motif can be computed, for example, using the pyknons discovery method and system described above.
[0115]Also, as above, the method of engineering a given transcript to regulate gene expression may use a region of interest in the collection of regions of interest computed using a method that is based on pattern discovery, for example, the method described in the above-referenced U.S. patent application identified as Ser. No. 11/351,821.
[0116]The present method may also use a region of interest, for example, located in the 5' untranslated region of the given transcript, located in the amino acid coding region of the given transcript, or located in the 3, untranslated region of the given transcript.
[0117]Also, similar to the above methodology, the method of engineering a given transcript to regulate gene expression may use a region of interest that is sub-selected based on one or more of its attributes including, for example, the region's length and the region's location in the transcript. Additional attributes may include the association of the region with a given biological process, the region's association with a given tissue, and the region's association with a given cellular compartment.
[0118]Further, the method of engineering a given transcript to regulate gene expression can include a modification that, for example, comprises an extension of the sequence of the given transcript, or a shortening of the sequence of the given transcript. The extension can, for example, comprise one or more instances of a region of interest, and the shortening can, for example, comprise one or more instances of a region of interest.
[0119]Another application of pyknons, for example, would be the measuring of the impact that one or more pyknons can have on a gene's regulation "by proxy." This would entail the engineering of an assay that involves a reporter gene (for example, luciferase) and instances of the one or more pyknons placed downstream from the region that codes for the reporter's amino acid sequence. Then, one can measure the impact on the expression of the reporter gene by using various combinations of appropriately generated instances of the reverse complement of these pyknons. The observations made in the context of the reporter assay can then be carried over to the gene that is studied. Additional applications are also possible if one assumes that for the organism that is being studied the sequences of the corresponding pyknons are available.
[0120]FIG. 23 is a block diagram of an exemplary hardware implementation of one or more of the methodologies of the present invention. That is, apparatus 2300 may implement one or more of the steps/components described above in the context of FIGS. 1-22. Apparatus 2300 comprises a computer system 2310 that interacts with media 2350. Computer system 2310 comprises a processor 2320, a network interface 2325, a memory 2330, a media interface 2335 and an optional display 2340. Network interface 2325 allows computer system 2310 to connect to a network, while media interface 2335 allows computer system 2310 to interact with media 2350, such as a Digital Versatile Disk (DVD) or a hard drive.
[0121]As is known in the art, the methods and apparatus discussed herein may be distributed as an article of manufacture that itself comprises a computer-readable medium having computer-readable code means embodied thereon. The computer-readable program code means is operable, in conjunction with a computer system such as computer system 2310, to carry out all or some of the steps to perform one or more of the methods or create the apparatus discussed herein. For example, the computer-readable code is configured to implement a method of determining associations between non-coding sequences and gene coding sequences in a genome of an organism, by the steps of: identifying at least one conserved region from a plurality of the non-coding sequences; and linking the at least one conserved region with one or more of the gene coding sequences of the genome to associate the at least one conserved region with one or more biological processes of the organism. The computer-readable medium may be a recordable medium (e.g., floppy disks, hard drive, optical disks such as a DVD, or memory cards) or may be a transmission medium (e.g., a network comprising fiber-optics, the world-wide web, cables, or a wireless channel using time-division multiple access, code-division multiple access, or other radio-frequency channel). Any medium known or developed that can store information suitable for use with a computer system may be used. The computer-readable code means is any mechanism for allowing a computer to read instructions and data, such as magnetic variations on a magnetic medium or height variations on the surface of a compact disk.
[0122]Memory 2330 configures the processor 2320 to implement the methods, steps, and functions disclosed herein. The memory 2330 could be distributed or local and the processor 2320 could be distributed or singular. The memory 2330 could be implemented as an electrical, magnetic or optical memory, or any combination of these or other types of storage devices. Moreover, the term "memory" should be construed broadly enough to encompass any information able to be read from or written to an address in the addressable space accessed by processor 2320. With this definition, information on a network, accessible through network interface 2325, is still within memory 2330 because the processor 2320 can retrieve the information from the network. It should be noted that each distributed processor that makes up processor 2320 generally contains its own addressable memory space. It should also be noted that some or all of computer system 2310 can be incorporated into an application-specific or general-use integrated circuit.
[0123]Optional video display 2340 is any type of video display suitable for interacting with a human user of apparatus 2300. Generally, video display 2440 is a computer monitor or other similar video display.
[0124]Although illustrative embodiments of the present invention have been described herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various other changes and modifications may be made by one skilled in the art without departing from the scope or spirit of the invention.
Sequence CWU
1
52118DNAUnknownThe pattern is a representation of an 18-nucleotide
core motif presently assumed to be present in all variations of the
logical unit which appears several times in the intergenic/intronic
regions of the human genome. 1tcccatacca cggggatt
18218DNAUnknownThe pattern is a representation
of a core pattern presently assumed to be present in all variations
of the logical unit which appears several times in the
intergenic/intronic regions of the human genome. 2tcccatacca cggggatt
18319DNAUnknownThe pattern
is a representation of a core pattern presently assumed to be
present in all variations of the logical unit which appears several
times in the intergenic/intronic regions of the human genome.
3tcccatacca cggggatta
19419DNAUnknownThe pattern is a representation of a core pattern
presently assumed to be present in all variations of the logical
unit which appears several times in the intergenic/intronic regions
of the human genome. 4ctcccatacc acggggatt
19520DNAUnknownThe pattern is a representation of a
core pattern presently assumed to be present in all variations of
the logical unit which appears several times in the
intergenic/intronic regions of the human genome. 5ctcccatacc acggggatta
20620DNAUnknownThe pattern
is a representation of a core pattern presently assumed to be
present in all variations of the logical unit which appears several
times in the intergenic/intronic regions of the human genome.
6tcccatacca cggggattac
20721DNAUnknownThe pattern is a representation of a core pattern
presently assumed to be present in all variations of the logical
unit which appears several times in the intergenic/intronic regions
of the human genome. 7gcctcccata ccacggggat t
21821DNAUnknownThe pattern is a representation of a
core pattern presently assumed to be present in all variations of
the logical unit which appears several times in the
intergenic/intronic regions of the human genome. 8tcccatacca cggggattac a
21922DNAUnknownThe pattern
is a representation of a core pattern presently assumed to be
present in all variations of the logical unit which appears several
times in the intergenic/intronic regions of the human genome.
9gcctcccata ccacggggat ta
221024DNAUnknownThe pattern is a representation of a core pattern
presently assumed to be present in all variations of the logical
unit which appears several times in the intergenic/intronic regions
of the human genome. 10gcctcccata ccacggggat taca
241123DNAUnknownThe pattern is a representation of a
core pattern presently assumed to be present in all variations of
the logical unit which appears several times in the
intergenic/intronic regions of the human genome. 11ggcctcccat accacgggga
tta 231216DNAUnknownThe
pattern is an example of a pyknon. 12tgcactccag cctggg
161319DNAUnknownThe pattern is an
example of a pyknon. 13taatcccagc actttggga
191417DNAUnknownThe pattern is an example of a pyknon.
14ggctgaggca ggagaat
171516DNAUnknownThe pattern is an example of a pyknon. 15gaggttgcag
tgagcc
16166791DNAUnknownUnknown organism 16nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nntttcttga tttttcagnn 180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnatactgat ttaatttcnn 300nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 480nnnnnnnnnn nnnnnnnnnn nnnnnnattt
tcttttaaag ttnnnnnnnn nnnnnnnnnn 540nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 600nnnnnnnnnn nnnactgaca tggaaagatn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 660nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 780nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 840nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nntgtattac ttttgtaann nnnattttta 900gaaagtattn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1020nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnntggat gaaaaatatt tnnnnnnnnn 1080nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1140nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1200nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1260nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1320nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1380nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1440nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1500nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1560nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1620nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnctcacaac aaacctatnn 1680nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1740nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1800nnnnnnnnnn nnnnnnnatt catttaaaac
attnnnnnnn nnnnnnnnnn nnnnnnnnnn 1860ctttttgaga tggagtcttn nnnnnnnccc
aggctggagt gcannnnnnn natctctgct 1920cactgcannn nnngccttct gggttcaagn
nattctcgtg cctcagcnnn nnnagtagct 1980ggaattacag nnnnnngcca ccatgcccga
ctannnnnnn nnnnnnnnnn nnnnnnnnnn 2040nnnnnnnnnn nnttggccag gctggtatnn
aactcctgac ctcaagnnnn nnnnnnnnnn 2100nnnnntccca aagtgctggg attacaggct
tgagccacca nnnnnnnnnn nnnnnnnnnn 2160nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2220nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2280nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2340nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2400nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2460nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2520nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2580nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2640nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2700nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2760nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2820nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnttattt tacattttag 2880nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2940nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3000nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3060nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnaaatttg tattttgaan nnnnnnnnnn 3240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3300nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnncagtg 3420gctcacgcct gtaatcccag cactttggga
nnnnnnnnnn nnnngatcac gaggtcagga 3480gnnnnagacc atcctggcta acannnnnnn
nnnnnntctc tactaaaaaa cannnnntta 3540gccgggcgtg gtggnnnnnn nctgtagtcc
cagctactnn nnnggctgag gcaggagaat 3600ggtgtgaacc cgggaggnng agcttgcagt
gagccnnnnn nnnnnnnntg cactccagcc 3660tgggnnnnag agcaagactc tgtcnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3780nnnnntctgt gaaaggaaaa tnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3840nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
actcccatcc taatacnnnn nnnnnnnnnn 3900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4020nnncggtggc tcatgcctgt aatcccagca
ctttgggann nnnnnnnnnn nngatcacct 4080gaggtcggga ggnnnagacc agcctgacca
annnnnnnnn nnnnnnnnnn nnnnnnnnnn 4140nnnnnnttag ctgggcgtgg tgnnnnnnnn
ctgtaatccc agctactnnn nnggctgagg 4200caggagaatn ncttgaaccc aggaggcnga
ggttgtggtg agcgnnnnnn nnnnnnntgc 4260actccagcct gggcaacaag agcaaaactn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4320nnnnnnnnnn nnnnnnnnnn nnnnnnntta
gccaggcgtg gtgnnnnnnn nnnnnnnnng 4380cagctactct ggaggnagag gcaggaggat
cacttgagcc catgaattng aggcagcagt 4440gagctnnnnn nnnnnnnntg tactccagtc
tgggnnacag agtgagaccc cannnnnnnn 4500nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnttgaaaag attattctnn 4560nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4620nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4680nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnntcattt tacaggtgag 4740nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4800nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4860nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4920nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4980nnnnnnnnnn nnnnnnnncc atggattcaa
ccaannnnnn nnnnnnnnnn nnnnnnnnnn 5040nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5100nnnnnnnctg ttgttaagca acannntaac
aactatttac atnnnnnnnn nnnnnnnnnn 5160nnnnnnnnnn nnnnnnnnna gcaattattt
ttaaannnnn nnnnnnnatt gtattaggta 5220ttannnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnccag cctggacaaa agnnnaaacc 5280ctgtctctac aaaannnnnn nnnnttagct
gggcatggtg nnnnnnnnnt gtagtcctgg 5340ctactnnnnn nnnnnnnnnn ggaggatcgc
ttgagtnnnn nnnnnngagg ctgcattgag 5400ctnnnnnnnn nnnnntgcat tccagcctgg
gnnnnnnnnn nagaccttgt ctcagaannn 5460nnnnnnnnnn nnnnnnnnnn nnnnnnnnat
tatatgcaaa tactnnnnnn nnnnnnnnnn 5520nnnnnnncag tctcactgtg ttgnnnngga
tggagtgcaa tggcacaatc ttggctcatn 5580nnnnnnnnnn nnnnnnnnnc agctgggact
acaggnnnnn nnnnnngtgc ccagttaatt 5640ttttttgtat tcttagtaga gannnnnnnn
nnnnnnnttg gccaggctag tctnnaattt 5700ctgacctcaa gnnnnnnnnn tcccaaagtg
ctgggattac aggcgtgagc caccannnnn 5760nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnttgaccag gctggtctnn 5820aactcctgat ctcaggtgat nnnnnnnnct
cggcctcaca aagtgctggg attacaggtg 5880tgaaccacnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5940nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6000nnnnnnnnnn nnnggcacgg tggctcacgc
ctnnnnnnnn nnnnnnnncc gaggctgagg 6060caggnnnctc acctgaggtc aggagttnna
gaccagcctg gccaannnnn nnnnnncctg 6120tctgtacaaa aannnnnnna tagctgggca
tggtgnnnnn nnnctgtagt cccagctact 6180nnnnnnactg aggcaggaga atnncttgaa
cctgggaggc ngaggttgca gggagccnnn 6240nnnnnnnnnn cgcactccag cctaggngat
agagtgagac tccnnnnnnn nnnnnnnnnn 6300nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6540nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6600nnnnnnnnnn atgttttgag acagagnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6660nnnnnnnnnn nnnnnnnnnn nngaaaacta
agaaaattnn nnnnnnnnnn nnnnnnnnnn 6720nnnnnnnnnn nnnnnnnnnt tatttttctg
tgaatnnnnn nnnnncaata aaatactatt 6780cnnnnnnnnn n
6791171732DNAUnknownUnknown organism
17nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
60nnnnnngcca ggtatggtgg ctnnnnnnnn taatcccagc actttgggan nnnnnnnnnn
120nnnnatcacc tgatgtcagg agttnnagac cagcctggcc aannnnnnnn nnnnnnnnnn
180nnnnnnnnnn nnnnnnnnac tagccaggcg tggtgnnnnn nnnctgtaat cccagctact
240nnnnnggctg aggcaggaga atnncttgaa cccaggaggc ngaggttgca gtgagccnnn
300nnnnnnncac tgcactccag cccnngtgac agtgtgagac tnnnnnnnnn nnnnnnnnnn
360nnnnnnnnnn nnncggtggc tcaagcctgt aatcccagca ctttgggann nnnnnnnnnn
420nngatcacga ggtcaggagn nnnagaccat cctggctaac annnnnnnnn nnnnnnnnnn
480nnnnnnnnnn nnnnnnnnta gtcgggcgtg gtggnnnnnn cctgtattca cagctannnn
540ngaggttgag gcaggagnnn ngggtgaacc cgggaggnng agcttgcagt gagccnnnnn
600nnnnnnnntg cactccagcc tgggngacag agccagactc cnnnnnnnnn nnnnnnnnnn
660nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
780naggctcaca cctgtaatcc nnnnnnnnnn ngacgctgag gtgggaggat cacttgagcc
840caggagttnn nnnnnnnnnt gggcaatata gtgagannnn nnnnctacaa aaaagttttt
900nnnnnnnnnn nagcatggtg gcacatgnct gtagtcccac ctactnnnnn nnnnnnnnna
960gggtcacctg agcctnnnnn nnngaggctg cagtgagccn nnnnnnnnnn nntgcactcc
1020agcctgggnn nnagagtaag accctgtcnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1080nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1140nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1200nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1260nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1320nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1380nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1440nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1500nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1560nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnntttta ttgagcagtt tnnnnnnnnn
1620nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1680nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nn
1732185255DNAUnknownUnknown organism 18nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 300nnnnnnnnnn nnaaagaaga tcattttgnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nagcagtggc 420tcacacctnn taatcccagc actttgggan
nnnnaggtgg gcggatcacc cnnnnnnnnn 480nnnnnnnaaa ccagcctgac caacatggtg
aaaccctgtc tctactaaat nttagcgggg 540tgtggtgnnn ngcacctgta atcgcagnnn
nnnnngaggc tgagacagga gnnnnncttg 600aaccctagag gcngagtttg cagtgagccn
nnnnnnnncc attgtactcc agctnnnnnn 660nnnnagtaag actctgtctc nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 780nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 840nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1020nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1080nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1140nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1200nnnnntggga gtcatccttg annnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1260nnnnnnnnnn nnnnnnnggc caggcatggt
ggcnnnnnnn nntaatccca gcattttggg 1320annnnnnnnn nnnnnnnnnn nntgaggtca
ggagctcaag accagcctgg ccaannnnnn 1380nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnntagtcg ggcgtggtgg nnnnnnnctg 1440taatcccagc tactnnnnng gctgaggcag
gagaatnnct tgaacctggg aggcngaggt 1500tgcagtgagc cnnnnnnnnn nnnntgtact
ccagcctggg nnnnnnagtg agactctgtc 1560ttnnnnnnnn nnnnnnnnnc tgataaatat
tgatgnnnnn nnnnnnnnnn nnnnnnnnnn 1620nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1680nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1740nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1800nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1860nnnnnnnnnn nnnncagctc cactaggaag
nnnnnnnnnn nnnnnnnncc attcaattcc 1920atttnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1980nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2040nnnnnnnnnn nnnnnngagc tctttgaggc
cannnnnnnn nnnnnnnnnn nnnnnnnnnn 2100nnnnnnnnnn nnnnnnnncc tagcacatag
taggnnntga atgaatgaat gaannnnnnn 2160nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnntcatttt 2220atgaagctan nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2280nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2340nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2400ttttatcaga aaaaaannnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2460nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2520nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2580nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2640nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2700nnnnnnnnnn nnnnnnnnnn nacttaatcc
ccagtgtnnn nnnnnnnnnn nnnnnnnnnn 2760nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2820nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnna caaaggaatg 2880aagagnnnnn nnagcattta ggccatttnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2940nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3000nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3060nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3300nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3540nnnnnnnnnn nnnnaaatgg tatttagaaa
nnnnnnnnnn nnnncggtgg ctcatgcctg 3600taatcccagc actttgggan nntgaggcag
gcggatcact nnnnnnnnnn nnnnnnagac 3660cagcctggcc aannnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnttagccgg 3720gcgtggtggn nnnnnnctgt aatcccagct
actnnnnagg ctgaggcagg agaaannncc 3780tgaacccaga aggcngaggt tgcagtgagc
cnnnnnnnnn nnnnnncact gcactccagc 3840cgnnnnnnnn nnnnnnncct ctgtctcaaa
aaaannnnnn nnnnnnnnnn nnnnnnnnnn 3900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnt ttctttatct gtaaannnnn 3960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4020nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4080nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4140nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4200nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4260nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4320nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnngtttag 4380aaagtaaaaa nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4440nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4500nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4560nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4620nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4680nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4740nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4800nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnncc caggttggag 4860tgcannnnnn nnnnnnnggc tcactgcaac
ctcngcctcc cgggttcaag nnnnnctcct 4920gtctcagcct cnngagtacc tgggactacn
nnnnnnnnnn nnnnngcccg gctaattttt 4980tgtatttgta gtagagannn nnnnnnnnnn
tgttagccag gatggtctnn nnctcctgac 5040ctcatgatcn nnnnnnnnnn nnnnnnnnnt
cccaaagtgc tgggattaca ggcgtgagcc 5100cccgnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5160nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5220nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnn 5255192780DNAUnknownUnknown organism
19nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
60nncctgttgt ggggagggnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
300nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
540nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
600nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
660nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
780nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
840nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nncggtggct catgcctgnc atcccagcac
900tttgggannn nnnnnnnnnn nnnnaacctg aggtcaggag ttnnnnnnnn nnnnnaacaa
960catggtgaaa cnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnntgcctgc
1020ctgtaatccn nnnnnnnnng aggctgaggc aggaaaannn cttgaacccg aaaggcngag
1080gttgcagtgt gccnnnnnnn nnncactgaa ctccagcctn nncaacaaga gtgaaactnn
1140nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1200nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnt attgaacact
1260tactannnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1320nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1380nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnggcagag ccaggatttn nnnnnnnnnn
1440nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1500nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1560nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1620nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnntgttttt
1680taaaaagaan nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1740nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1800nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1860nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1920nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1980nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2040nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnntttac agacaaggaa annnnnnnnn
2100nnnnnnnnnn nnnnggccag gcatggtggc nnnnnnnnnt aatcccagca gtttgggagg
2160ctgaggtggg aggagnnnnn nnnnnnnnnn nnnnnnnnag accagcctag gcaannnnnn
2220nnnnncccat ctctacaaaa anttatctgg gcctggtgnn nnnnnnctgt agtcccagct
2280actnnagagg ctgaggtggg annnnnnctt gagcccagaa gttgaggctg cagtgagccn
2340nnnnnnnnnn nntgtactcc agcctgggnn ncaaagcaaa accctgtnnn nnnnnnnnnn
2400nnnnnnnnnn nnnncggtgg ctcacacctg taatcccaac actttgggan nnnnnnnnnn
2460nnnnatcacc tgaggtcagg agttnnagac cagcctggcc aannnnnnnn nnnnnncctc
2520tactgaaaat acaaaannnn nnnnngcatt gtggcacatg cnnnnnnnnn nnnatcacct
2580gaggtcagga gttnnagacc agcctggcca annnnnnnnn nnnnnnnnnn nnnnnnnnnn
2640nnnnnnntta gctgggcgtg gtgnnnnnnn nctgtagtcc cagctactnn nnnggctgag
2700gcaggagaat nncttgaacc tgggaggtng aggttgcagt aagccnnnnn nnnnnnnntg
2760cactccagcc tgggnnnnnn
2780203330DNAUnknownUnknown organism 20nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120nnnnnnnnnn nnnnnnnnnn actgtactgt
atttatnnnn nnnnnnnnnn nnnnnnnnnn 180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240nnatatttta cagaaatann nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 300nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnt tattcatttg tttaannnnn 480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 540nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 600nnnncagtgg ctcatgcctg nnnnnnnnnn
nnnngctgag gcaggaggac agtttgaggc 660caggagnnnn agactagcct ggacaannnn
nnnnnnnnnn nntctctaca aaaacataaa 720aataaattag ctggnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnna 780aaaataggct gggtgtggtg gttcatgcct
gtaatcctag cactttggga nnnnnnnnnn 840nnnggatcac ctgaggtcag gtnnnnnnnn
nnnnnnnncc caatatggtg aaacnnnnnn 900nnnnnnnnnn nnnnnnnnnn tagccaggtg
tggtggnnnn nnnctgtagt cccagctact 960nnngaggctg acacaggagn nnnncttgaa
cccaggaagt ngaggctgca gtgagccnnn 1020nnnnnnnnnn tgcactccag cctgggnnnn
nnagtgagac tctgtctcnn nnnnnnnnnn 1080nnnnnnnnnn nnnnnnnnnn nnnngtgtgg
tggcatgtgt ntgtggtccc agctactnnn 1140nnnnnnnnnn nnnnnnnnnn nnnnnnnncc
agaaggtcaa ggctnnnntg agctgtgatt 1200gcatnnntgc actccagcct gggnnnnagc
aagaccctat ctcnnnnnnn nnnnnnnnnn 1260nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnaattta caatttacaa nnnnnnnnnn 1320nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn ctgaagaact 1380ttctttnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1440nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1500nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1560nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1620nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1680nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1740nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1800nnnnattttg tttctaaata nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1860nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1920nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn agagatgggg tctcgctnnn 1980nnncccaggc tggagtgcan nnnnnnnnnn
nnnntagctc actgcagtct nnnnnnnnnn 2040nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnaaatg catttttaaa annnnnnnnn 2100nnnnnnnnnn nnnnnnnntt tctgattaat
aaatnnnnnn nnnnnnnnnn nnnnnnnnnn 2160nnnnnnnnnn nnnnnnnnnn nnnnnnnnnt
gtatgtgcca catttnnnnn nnnnnnnnnn 2220nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2280nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2340nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2400nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2460nnnnnnnnnn nnnnnnnnnn ncacacacac
atgtgtgnnn nnnnnnnnnn nnnnnnnnnn 2520nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2580nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2640nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2700nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2760nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2820nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2880nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2940nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3000nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnncagt 3060ggctcacgcc tgtaatccca gcactttggg
annnnnnnnn nnnnnnnnaa cctgaggtca 3120ggagttnnag accagcctga ccaannnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3180ttagccaggc atggtgnnnn nnnnctgtaa
tcccagctac tnnnnnggct gaggcaggag 3240aatnncttga acccgggagg cngaggttgc
agttagccnn nnnnnnnnnn ntgcactcca 3300gcctgggcaa caagagtaaa actnnnnnnn
3330214018DNAUnknownUnknown organism
21nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
120nnnnnnnnca cacacacacc cctgnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
180nnnnnnnnnn nnnnaattgt tttttctaat nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
300nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnctttt tgaattttta
360annnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnncagtgg cccatgcctg taatcccagc
540actttggggn nnnnnnnnnn nnnnnatcac ctgaggacag gagttnnaga ccagtctggc
600caannnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn tagccgggca tggtggnnnn
660nnnctgtaat cccagctact nnnnnggctg aggcaggaga atnncttgaa cctaggaggc
720ngaggttgca gtgagccnnn nnnnnnnnnn tgcactccag cctgggngac aagagtgaaa
780ctcnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
840nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1020nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1080nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnntttgaat taatttttcn
1140nnnnnnnnnn nnnnnnnnnn nngcctctaa aactgtgann nnnnnnnnnn nnnnngcatc
1200agaatcacct gnagggcttg ttaaaacann nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1260nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1320nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1380nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1440nnnnnnnnnn nnnnnacatt tattgagctc cnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1500nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1560nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1620nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1680nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1740nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1800nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1860nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1920nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnagggcag
1980aggaaacaan nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2040nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2100nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2160nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2220nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2280nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2340nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2400nnnnnnnnnn nnncaatggc tcatgcctgt aatcccaaca ctttgggagg ccaaggtggg
2460aggannnnnn nnnnnnnnnn nnnnnagacc agcctggaca annnagtgag atcctatctc
2520nnnnnnnnnn nnnnnnttag ccaggcatgg tgnnnnnnnc ctatagtcct ggctannnnn
2580nnngctgagg caggaggatn nnnnnnnnnn nnnnnnnnna ggctgcagta agccannnnn
2640nngccactgc actcagccnn nnnnnnnnag caagaccctg tctcnnnnnn nnnnnnnnnn
2700nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2760nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2820nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2880nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2940nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3000nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3060nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3300nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3540nnnnnnnnnn nnnnnnnnnn nnnntctggg ctgggcacag nnnnnnncac actttgggag
3600gccnnnnnnn nnnnnnnatc acctgaggtc aggagttnna gaccagcctg gccaannnnn
3660nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nttagctggg cgtggtgnnn nnnnnctgta
3720atcccagcta ctnnnnnggc tgaggcagga gaatnncttg aacccaggag gcngaggttg
3780cagtgagccn nnnnnnnnnn nntgcactcc agcctgggng acaagagcaa aactcnnnnn
3840nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnn
4018225449DNAUnknownUnknown organism 22nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 300nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 540nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 600nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 660nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnntgtc 780acccagctgg agnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 840nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nntggttgga gcaggaggnn 900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1020nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1080nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1140nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1200nnnnnnnnnn nnnnnnnnnn nnnnnnnnca
gtggctcacg cctannnnnn nnnnnnnnnn 1260ngaagccgag gcgggtggat caccnnnnnn
nnnnnnnnnn agaccatcct ggctaacann 1320nnnnnnncct atctctacaa aaanttagct
gggtgtggtg nnnnnnnnct gtagtcccag 1380ctactnnnnn ggctgaggca ggagaatnnn
nnnaacctct gcctccgggn nnnnnnnnnn 1440nnncctgcct cagcgtcccn agtagcaggg
actacanncg tgcaccacca tgcccgnnnn 1500nnnnnngtat ttttagtaga gannnnnnnn
nnnnnnnttg gccaggctgg tctnnaactc 1560ttgacctcaa gnnnnnnnnn ngccaaagtg
ctgggattat aggcgtgagc caccgnnnnn 1620nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1680nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nctgccagga 1740ctgggttnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1800nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1860nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1920nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1980nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2040nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2100nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2160nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2220nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2280nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2340nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2400nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnntgggg acttgggggg aannnnnnnn 2460nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2520nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2580nnnnnnnnnn nnnncctctc tgggctgcag
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2640nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2700nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2760nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2820nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2880nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2940nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3000nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3060nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3300nnnnnnnnna gtttcactct tgttgcccag
gctggagtgc annnnnnnnn nnnnngctca 3360cctcaacctc cgcctcccag gttcaagnna
ttctcctgcc tcagccnnnn nagtagctgg 3420gattacagnn nnnnnnnnnn atgcctggct
aattttgtat ttttagtaga gannnnnnnn 3480nnnnnnnnnn nnnnagctgg tctcgaactc
nnnnnnnnnn nnnatctgcc cacttcggcc 3540tcccaaagtg ctgggattan aggcatgagc
caccgcnnnn nnnnnnnnnn nnnnnnngag 3600acagagtctc actnnnnnnn cccaggttgg
agtgcannnn nnnnatctga gctcactgca 3660nnnnnngcct cttgggttca agnnattctc
ctgcctcagc cnnnnnagta gctaggacta 3720cannnnnnnn ncaccatgcc cagctaannn
nngtattttt agtagagann nnnnnnnnnn 3780nnnttggcca ggctggtctn nnnnnnnnna
cctaaggtga tccacnnnnn nnnnnntccc 3840aaagtgctgg gattannnnn ntgagccatg
gcacctgnnn nnnnnnnnnn nnnnnnnnnn 3900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3960nnnnnnngga aaggggtcag ggcnnnnnnn
nnnnnnnnnn nnnnnnggaa actgaggccc 4020agnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnctctgcc 4080tctgggattn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4140nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4200nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4260nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nngtgggcag cccaggagnn 4320nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4380nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnncatct gtgaaatggg 4440annnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4500nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4560nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4620nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4680nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn tgcctgggtt 4740tgaatcnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnntctgtgc cttcatttcn 4800nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4860nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4920nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4980nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5040nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5100nnnnnnnnnn nnnnnnngct gggattccag
gcannnnnnn nnnnnnnnnn nnnnnnnnnn 5160nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnntaatcc cagcactttg ggannnnnnn 5220nnnnnnngat cacgaggtca ggagnnnnag
accatcctag ctaacanagt gaaaccctat 5280ctcnnnnnnn nnnnnnnnnn nttagctggg
catggtgnnn nnnnnntgta gtcccagata 5340ctnnnnnnnn taaggcagaa gaatcgcttc
aacctgggag gcngaggttg cagtgagccn 5400nnnnnnnnnn nntgcactcc agcctgggnn
nnnnnnnnnn nnnnnnnnn 5449231974DNAUnknownUnknown organism
23nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnncggtg gctcacacct
240gtaatcccag cactttggga nnnnnnnnnn nnnngatcac aaggtcagga gnnnnagacc
300atcctgtcta acannnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnaaat tagtcggaca
360tgnnnnnnnn nnnctgtagt cccagctact nnnnnggctg aggcaggaga atggcgtgaa
420cccaggaggn ngagcttgca gtgagccnnn nnnnnnnnnn tgcactccag cctgggngac
480agagtgagac tccnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
540nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnc agtggctcac gcctgnnnnn
600nnnncacatt gggaggctga ggnnnnnnna tcacctgagg tcaggagttn nagaccagcc
660tggccaannn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
720nnnnnnnctg taatcccagc tactnnngag gctgaggccg gagaannnct tgagcccgag
780aggtnnnnnn nnnnnnaagc caagatcatg ccantgcact ccagcctggg caacacaggg
840agactcnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn tttgagaggc ctaggcnnnn nnnnnnnnnn
960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1020nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1080nnnnnnnnnn nnnnnnnnnn nnnnnntttt tcttgtagag gtnnnnnnnn nnnnnnnnnn
1140nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1200nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1260nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1320nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1380nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1440nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1500nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1560nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnncaaaaa
1620tgggcaaaat nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1680nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1740nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1800nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1860nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1920nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnn
1974241256DNAUnknownUnknown organism 24nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn ntttttgttt ttgaggcnnn 60nnnnnnnnnn ntcgcccagg gtggagtnnn
nnnnnnnnnn nnnggctcat tgcaacctcn 120gcctcccggc ttcaagnnnn nctgctgcct
cagcctcnng agtagctggg ataacaggtg 180tggtggcgca tgcctnnnnn nncagctatt
ctggaggctn nnnnnnnnnn nnnncttgaa 240cccaggaggt ngaggttgca gtgagccnnn
nnnnnnnnnn nnnnncaaca agagcaaaac 300tnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn nnnnnnnnnn nnnaagagat
gacatttgan nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn ntgcaaaggc cctgaggnnn 480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnntaat 540cccagcactt tgggannnnn nnnnnnnnnn
atcacctgag gtcaggagtt nnagaccagc 600ctggccaann nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnttagct gggtgtggtg 660nnnnnnnnct gtaatcccag atactnnnnn
ngctgaggca tgagaatnnc ttgaacctgg 720gaggcnnnnn ntgcagtgag ctgagatnnn
nnnnntgcac tccagcctgg gnnnnnnnnn 780nnnnnnnnnn nnnnnnnnnn nnnnnnnnnt
agctgggcct ggtggnnnnn nnntatagtc 840ccagctactn nngaggctaa ggcaggagnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnctg 900tgagctatga ttgnnnnnnt atactccagc
ctggcnnnnn ngcaagaccc tgtctttnnn 960naaaaaaaaa tctaggcnng gcacggtggc
tcacgcctnt aatctcaaca ctttgnnnnn 1020nnnnnnnnnn nnnatcacct gaggtcagga
gttnnagacc agactgacca annnnnnnnn 1080nnnnnntctc tactaaaaac gcaaanttag
ccgggcgtgg tggnnnnnnn ctgtaatccc 1140agctactnnn nnggctgagg caggagaatn
ncttgaaccc aggaggcnnn nnntgcagtg 1200agctgagatn nnnnnnntgc actccagcct
aggcaacaag agtgaaactn nnnnnn 1256255980DNAUnknownUnknown organism
25nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nntcctctgt
120ctctgcctnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
300nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
480nnnnnnnnnn nttgaatttt gtaaaatnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
540nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
600nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
660nnnnnnnnnn nnnnnnnnnn nnnnnnnnnc agtggctcac acctgtaatc ccagcccttt
720gggannctga ggcaggtgga ttncttgagc ccaggagttn nnnnnnnnnn nnnnnctcat
780ctctacaaaa annnnnnnnn nngtggtggc gtgtacctct agtcccagct acccnnnnnn
840nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnncta ggctgcagtg agcnnnnnnn
900nnnnnnntgc actccagcct ggtnnacagt gaggccctgt cnnnnnnnnn nnnnnnnnnn
960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1020nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1080nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnaaaa aaaatttttc tgnnnnnnnn
1140nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnactg agcacttaaa
1200atnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1260nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1320nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn aaaatgttac tgaaatnnnn nnnnnnnnnn
1380nnnnnnnnnn nnnnnnnnnn nnnnnnnnna tcatgttatt tttctnnnnn nnnnnnnnnn
1440nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1500nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nttggaaatt attttaannn
1560nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1620nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1680nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1740nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1800nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nttttgttga aagcaaannn
1860nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1920nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1980nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2040nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2100nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2160nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2220nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2280nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2340nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2400nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2460nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2520nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2580nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2640nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2700nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2760nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2820nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2880nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
2940nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3000nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3060nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3300nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3540nnnnnnnnnn nnatattcac tttttaaann nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3600nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3660nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3780nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3840nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
3960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
4020nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
4080nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
4140nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
4200nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
4260nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
4320nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
4380nnnnnnnnnn nnnnatacat atttacataa nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
4440nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnatagaaa cttttaaaan nnnnnnnnng
4500ggcatggtgg ctcacnnnnn taatcccagc actttgtnnn nnngaggtgg gcggatctcc
4560taaggtcagg agttcnagac cagcctggcc aanatggtga aaccccgtct tnnnnnnnnn
4620nnnnnnnntt agccaggcgt ggtgnnnnnn nnctgtaatc ctagctactc aggaggcagg
4680aaaatnnctt gaacctggga ggcngaggtt gcagtgagcc nnnnnnnnnn nnntgcactc
4740cagcctgggn gacagagtga gactccnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
4800nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
4860nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
4920nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
4980nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
5040nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
5100nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
5160nnnnnnnnnn nnnnnnnnnt gagaatattt taaatnnnnn nnnnnnnnnn nnnnnnnnnn
5220nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnatttt
5280attttcaaga tnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
5340nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
5400nnnnnnnnnn nnnnnnnnnn ntcattccaa ttttagtnnn nnnnnnnnnn nnnnnnnnnn
5460nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
5520nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
5580nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
5640nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnncaaag
5700gccgggcgcg gnnnnnnnnn nnnntaatcc cagcactttg ggannnnnnn nnnnnnngat
5760cacgaggtca ggagnnnnag accatcctgg ctaacannnn nnnnnnnnnn nnnnnnnnnn
5820nnnnnnnnnn attagccagg cctggtnnnn nnnnnctgta gtcccagcta ctnnngaggc
5880tgaggcagga gaacggcgtg aacccgggag gnngagcttg cagtgagccn nnnnnnnnnn
5940nntgcactcc agcctaggng gcagagccag actccnnnnn
59802619DNAUnknownThe pattern is an example of a pyknon. 26tcccaaagtg
ctgggatta
192716DNAUnknownThe pattern is an example of a pyknon. 27cccaggctgg
agtgca
162817DNAUnknownThe pattern is an example of a pyknon. 28agtagctggg
attacag
172920DNAUnknownThe pattern is an example of a pyknon. 29aactcctgac
ctcaggtgat
20305002DNAHomosapiensmisc_featuren indicates a nucleotide separating the
pyknons in which the nucleotide is a, c, t, or g. 30nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 300nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 480nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 540nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 600nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 660nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnntttctt cattcattca 720nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnngacagtc 780tcgctctgtn
ncccaggctg gagtgcannn nnnnnatctc agctcactgc annnnnngcc 840ttccaggttc
aagnnagtct cctgcctcag cctcnnnagt agctgggatt acagnnnnnn 900nnnaccacgc
ctggctaatt nnnncatttt tagtagagac gnnncttcac catgttggcc 960annnnnnnnn
nnaactcctg acctcaggta atnnnnnnnn nnntcccaaa gtgctgagat 1020tannnnngtg
agctaccgtg cccnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1080nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1140nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1200nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnng 1260gtggtcaggg
aaggcnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1320nnnnnnnnnn
nnnnnnnnna ggcaggagga acagcnnnnn nnnnnnnnnn nnnnnnnnnn 1380nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1440nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1500nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnttc attcaacaaa 1560tatnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1620nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1680nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1740nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1800nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1860nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1920nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1980nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2040nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2100nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2160nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2220nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2280nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2340nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2400nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2460nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnttgccc aggctggagt tnnnnnnnnn 2520nagctcactg
caacctccac ctcccagatt caagtgattc tnnnnnnnnn nnnnnnnnaa 2580tagctgggac
tacannnnna tgtcaccatg cccagnnnnn nnnnnnnnnn nnnnnnnnnn 2640nnnnnnnnnn
nnnnnnnttg gccaggctgg tctnnaactc ctggcctcaa anaatctacc 2700tgccttggnn
tccaaagtgc taggattaca ggtgtgagcc accgnnnnnn nnnnnnnnnn 2760nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2820nnnnnnnnnn
ncccaggctg gagtgcannn nnnnnnnnnn ggctcattgc aacctcnacc 2880tcccgggttc
aagnnnnnnn nnnnnntcag cttcctgagt agctnnnnnt acaggcactt 2940gccacnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnngac agggttttgc catnttggcc 3000aggctggtct
nnaactcctg acctcaggtg atnnnnnnnn nnnnnnntcc caaagtgctg 3060gcattanngg
cttgagccac cacgnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3120nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3180nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3240nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3300nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3360nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3420nnnnnnnnnn
nnnnnggagt ctcgctctgt cnnnnnngct ggagtgcagt gacntgatct 3480cggctcactg
cagnctccgc cttctgggtt cannnnattc tcctgcctca gcctcnnnag 3540tagctgggac
tacagnnnnn nnccacctcg cctggctann nnnnnnnnnn nnnnnnnnnn 3600nnnnnnnnnn
nnnntgttag ccaggatggt ctnnnnctcg tgacctcgtg atcnnccagc 3660ctcggcctcc
cnnnnnnnnn nnnnnnnnnn ntgagatgga gtttcgcnnn nnnnnntcag 3720gctggagtgc
aatgnnnnna tctcagctca ctgcannnnn ngcctcccag gttcaagnna 3780ttctcccacc
tcagcnnnnn nagtagctgg gattacagnn nnnnnnnnnn nnnnnnnnnn 3840nnnnnnnnnn
nnnnnnnnnn nnnggggttt cactatgttg gccaagctgg tcttgaactc 3900ctgacctcag
gtgatnnnnn nnnnnnnnnn tcccaaagtg ctgggattan nnnnnnnnnn 3960nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4020nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4080nnnnnnnnnn
nnnntggagg tagggcctgg nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4140nnnnnnnnnn
nnnnnnnnnt gaatgaaaga atgaannnnt ttgggaggcc aaggtgggtn 4200natcacctga
ggtcaggagt tnnagaccag cctggccaan nnnnnnnnnn nnnncctcta 4260ctaaaaatac
aaaaaattag ctgggcgtnn nnnnnnntgc ctgtaatcac agcnnnnnnn 4320gaggctgagg
caggagaatn nnnnnnnnnn nnnnnnnnga ggttgcagtg agccnnnnnn 4380ntgccattgc
actccagnnn nggcaacaga gcgagactnn nnnnnnnnnn nnnnnnnnnn 4440nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4500nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4560nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4620nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4680nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4740nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4800nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4860nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4920nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4980nnnnnnnnnn
nnnnnnnnnn nn
5002311393DNAHomosapiensmisc_featuren indicates a nucleotide separating
the pyknons in which the nucleotide is a, c, t, or g. 31nnnnngagtc
ttgctctgtt gnngaggctg gagtgcagtg atctcagctc actggannnn 60nngcctcctg
ggttcaagnn nnnnnnnnnt ctcaacctcc caagtntgcc accacacctg 120ccnnnnnnnn
nnntttttag tagggatggn nnnnnnnnnn ttgggcaggc tggtctnnaa 180ctcctgacct
caagnnnnnn nnnnnnnnnn nnntcccaga gtgctgggat tannnnnntg 240agccaccatg
cctgnnnnnn nncctgtctc taaaacaann nnnnnnnnnn nnnnnnnnnn 300nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnca gaaattttct 480ttttnnnnnn
nnnnnnnnnn nnnnnnngag acagagtctt actnnntcgc tcaggctgga 540gtnnnnnnnn
nnnatctcag ctcactgcaa cctctgcctc ctggtnnnnn nnnnnnctcc 600ttcctcagcc
tcnnnagtag ctgggattac acnnnnnnnc caccgcgcct ggctannnnn 660nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnttggcca ggctggtctn naactcctga 720cctcaggtga
tnnnnnnnnn nnnagcctcc caaaatgctn nnnnnncagg catgagccac 780cgnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 840nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 900nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 960nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1020nctcagtttc
ttcatttnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1080nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnggggtt gctgccagct gnnnnnnnnn 1140nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1200nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1260nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1320nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1380nnnnnnnnnn
nnn
139332602DNAHomosapiensmisc_featuren indicates a nucleotide separating
the pyknons in which the nucleotide is a, c, t, or g. 32nnnntgctgg
gattacaggc gcnnnnnacc acgcccggct aatnnnnnnn nnnnnnnnnn 60nnnnnngggt
ttcactgtgt tgnnnaggat ggtctctatc tcnnnnnnnn gtgatatgcc 120cgcctcnnnn
tcccaaagtg ctgggattac aggcttgagc caccgnnnnn ggcctattta 180tttattnnnn
nngagacgga gtgttgctnn nnnnncccag gctggagtgc annnnnnnnn 240nnnnggctca
ctgcaacctc ngcctcccgg gttcaagnna ttctcctgcc tcagcctcnn 300nagtagctgg
gattacagnn nnnnnnnnnn acacccggct aattttgtat ttttagtaga 360aanngggttt
ctccatgttg nnnnnnctgg tttcgaactc ccnnnnnnnn gtcatctgcc 420tgcctcnnnn
tcccaaagtg ctgggattac aggcgtgagc caccgnnnnn nnnnnnntct 480gtatttttaa
aaannnnnnn nnnnnnnnnn nnnntaaatg atttgcccaa nnnnnnnnnn 540nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 600nn
602332558DNAHomosapiensmisc_featuren indicates a nucleotide separating
the pyknons in which the nucleotide is a, c, t, or g. 33nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 300nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 480nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 540nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 600nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 660nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 720nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 780nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 840nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 900nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 960nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1020nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1080nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1140nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1200nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1260acaacaccca
agtcctnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1320nnngcctgag
tgcagtggcn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1380nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1440nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nttcttcaaa gaagaaannn 1500nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn tttgagacag 1560agtctcnnnn
nnnnncccag gctggagtgc annnnnnnnn nnnnnnnggc tcactacaac 1620ctcngcctcc
caggttcaaa nnattctcct gcctcagcct cnnnagtagc tgggattaca 1680gnnnnnnnnn
nnnnnnnnnn nnnnnnnata gagatggggt ttcnnnnnnt tggccaggct 1740ggtctnnaac
tcctgacctc aggtgatnnn nnnnnnnnnn nntcccaaag tgctgggatt 1800annnnnnnnn
nncaccacac ccagccatnn nnnnnnatta tgttttctaa aannnnnnnn 1860nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnca gactctgcct caaannnnnn 1920nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1980nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2040nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2100nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn ngagtcttgc 2160tctgttgnnn
cgctggagtg cagtggnnnn nnnnnggctc actgcaacct cnngctccca 2220ggttcaagcn
nnnnnnnnna gtagctggga ctacagnnnn nnnnnnnnac atccggctaa 2280ttttnnnnnn
nnnnnnatag agacggggtt ttnnnnnntt ggccaggatg gtctnnnnct 2340cctgacctca
tgatcnnnnn nnnnnagcct cccaaagtgc tggggnnnna ggcatgagcc 2400accacgnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnngt cttgctctgt 2460caccnnnnnn
nnnnnnnntg cagccttgaa ctccnnnnnn nnnnnnnncc tcctgcctca 2520gcctcccgan
nnnnnnnnnn nnacaggtat gtaccacc
2558341265DNAHomosapiensmisc_featuren indicates a nucleotide separating
the pyknons in which the nucleotide is a, c, t, or g. 34nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 300nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 480nnnnnnnnnn
nnnnnnnnnn nnnnagagtc ttgctctgtc gcccaggctg gagtgcannn 540nnnnnnnnnn
ggctcactgc aacctcngcc tcccaggttc aagnnattct cctgcctcag 600cctcnnntgt
agctgggatt acaggcnnnn nncaccatgc ccggctaatn nnnttatttt 660tagtagagan
nnnnnnnnnn nnnnttggcc aggctggtct nnaactcctg acctcaggtg 720atnnnnnnnn
nnnnnnntcc caaagtgctg ggattacagg cgtgagccac cgnnnnnnnn 780nnnnnnnnnn
nnnnnnnnnn ntggtcttgc tgtgtcaccc aggctggagt gcannnnnnn 840nnnnnnnnnn
nnnnnnnntc ttgaattcct gggcnnnnnn nnnnnnnnnn nnnaagcctc 900ccaagtagct
nnnnnnncag gcacatacta ccannnnnnn nnnnnnnnnn nnnntttttg 960tagagacagg
nnnnttacta tgttgcccag actggtcttg aactccnnnn nnnnnnnnnn 1020nnnnnnnnnn
nnnnnnnnnn nnnnnnnatt acaggtgtga gcctnnnnnn nnnnnnnnnn 1080nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn ngtttttttg agacagggtn 1140nnnnnnnnnn
ncccaggctg gagtgcannn nnnnnnnnnn ggctcactgc agcctcnncc 1200tcccaaggct
caggnnatcc tcttgcctca gcnnnnnnnn nnnnnnnnac caagtagctg 1260ggacn
126535582DNAHomosapiensmisc_featuren indicates a nucleotide separating
the pyknons in which the nucleotide is a, c, t, or g. 35nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60nnnnnnggct
ggagggcaga ggnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn cccaggctgg agtgcannnn nnngatctcg 240gctcactgcg
nnnnnnnnnn nnnnnnnnnn nnnnattctc ctgcctcagc ctcnnnagta 300gctgggatta
cagnnnnnnn ccatcacgcc cggctannnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn
nnnttggtca ggctagtctn naactcctga cctcaggtga tnnnnnnnnn 420nnnntcccaa
agtgctggga ttacaggcgt gagccaccgn nnnnnnnnnn nnnnnnnnnn 480nnnnnnnnnn
nnnnnnnnnn nnngcagaga ctggagggan nnnnnnnnnn nnnnnnnnnn 540nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nn
582362077DNAHomosapiensmisc_featuren indicates a nucleotide separating
the pyknons in which the nucleotide is a, c, t, or g. 36nnnnnnntct
gttgccaggc tggagtgccg tggtgtgatc tcggctcact gcaacctcnn 60nctctcaggt
tcaagcgatt ctcctgcctc agcctcctca gtagctggga ttnnnaggca 120tgcactacca
tnnnnnnnnn nnnnnnnnnn nnnnnntaga gatggggttt caccacnttg 180cccaggctgg
tctnnaactc ctgacctcaa cnnnnnnnnn nnnnnnnnnn tcccaaagtg 240ctgggattac
aggcgtgagc caccgnnccc ggcctgattt tttnnnnnnn nnnnnnnnnn 300nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 480nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 540ctggataaag
agaatgnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 600nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 660nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 720nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 780nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 840nnnnnnnnnn
ngagtcttgc tctgttgctc aggctggagt gcannnnnnn gatctcagct 900cattgcnnnn
nnngcccccc aggttcaagn nnnnttcctg cctcagcctc nnnagtagct 960gggattacag
nnnnnnnnnn nnacacctgg ctaattttnn nnnattttag tagagatggn 1020nnnnnnnnnn
nttggccagg atggtctnnn nnnnnnnncc tcaagtgatc tgcctgnnnn 1080nnnntcccaa
agtgctagga ttacaggtgt gagccatcan nnnnnnnnnn nnnnnnnnnn 1140nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnt 1200ttccttatct
gtaaannnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1260nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1320nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn ngaatgactt 1380tggggtannn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnntt ctttcttttc 1440catgnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1500nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1560nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1620nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1680nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1740nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1800nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1860nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1920nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1980nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2040nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnn
2077374111DNAHomosapiensmisc_featuren indicates a nucleotide separating
the pyknons in which the nucleotide is a, c, t, or g. 37nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240nnnnnnnnnn
nnnnnacata ttttaaattc tnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 300nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 480nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 540nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 600nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 660nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 720nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 780nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 840nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 900nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 960nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1020nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1080nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1140nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1200nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1260nnnnnnnnnn
nnnnngagtc ttgctctgtt gcccaggctg gagtgcannn nnnnngtctc 1320agctcactgc
annnnnngcc tcttgggttc aagnnnnnnt cctacctcag cctcctnagt 1380agctgggatt
acagnnnnnt accaccatgt ccagcnnnnn nnnnnnnttt tactagagat 1440ggggtttnnn
nnnnttggcc aggctggtct nnnnctcctg acctcatgat cnnnnnnnnn 1500nnnnntccca
aagtgctggg attacaggca tgagccaccg nnnnnnnnnn nnnnnnnnnn 1560nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1620nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1680nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1740nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnngaatggg 1800caaaagctgn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1860nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1920nnnnnnnnnn
nnnngtcttg ctctgtcacc naggctagag tgcagtgnnn nnnnttcagc 1980tcactgcaac
ctnngcctcc cgggttcaag nnnnnnnnnn gcctccacct cctgagcagc 2040tgggattaca
ggnnnnnnnc accatgcccg gctaatnnnn nnnnnnnnnn nnnnnnnnnn 2100nnnnnnnnnn
tgttagccag gattgtctnn nnctcttgac ctcgtgatcn nnnnnnnnnn 2160nnntcccaaa
gtgctgggat tacaggcgtg agccactgnn nnnnnnnnnn nnnnnnnnnn 2220nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2280nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2340nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2400nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2460nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2520nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnncaa 2580tgttttacag
tttnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2640nnnnnnnnnn
nnnnnnnnnn nnnctcaagg ttacacagcn nnnnnnnnnn nnnnnnnnnn 2700nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2760nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2820nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2880nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2940nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3000gagtgtgtgt
gagagannnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3060nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3120nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3180nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3240nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3300nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3360nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3420nnaacccagg
aggttgaann nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3480nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3540nnnnnnnnnn
nnnnnnnnnn nnatctgacc tttctctcnn nnnnnnnnnn nnnnnnnnnn 3600nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3660nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3720nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3780nnnnnnnnnn
nntgtgccag gcccagggnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3840nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3900nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3960nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4020nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4080nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn n
4111381516DNAHomosapiensmisc_featuren indicates a nucleotide separating
the pyknons in which the nucleotide is a, c, t, or g. 38nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nttgtcgccc acgctggnnn 120nnnnnnnnnn
atctcagttc actgcannnn nngcctccca ggttcaagnn nnnnnnntgc 180ctcagcttcc
cgannagcta ggactacagg tnnnnnnnnn nacgctcagc taattttnnn 240nnnnnnnnnn
nnnnngatgg ggttttccca tnttggccag gctggtctnn aactcctgac 300ctcagannnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnncaggcg tgagccactg 360nnnnnnnnac
attttggtat gtttnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 480nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 540nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 600nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 660nnnnnntgaa
caggcaactt acannnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 720nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 780nnnnnnnnaa
taatattttc ttcannnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 840nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 900nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 960nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1020nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1080nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnntgct ggagtgcaat ggnnnnnnnn 1140nggctcactg
caacctcngc ctcctgggtt caggnnnnnc tcctgcctca gactcnnnag 1200tagctgggat
tacagnngcc tgccaccacg cccggnnnnn nnnnnnnnnn nnnnnnnnnn 1260nngtgtttca
ctatgttgnn ngggctggtc tcgaactnct gccctcaggt gatcnnnnnn 1320nnnnnnnntc
ccaaagtgct gggattanng gcgtgagcca ctgctnnnnn nnnnnnnnnn 1380nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn ntagatgaaa caagattnnn nnnnnnnnnn 1440nnnnnnnnnn
ntatgcatgt atctttannn nnnntttttt tttaaagact nnnnnnnnnn 1500nnnnnnnnnn
nnnnnn
15163916DNAUnknownThe pattern is an example of a pyknon. 39aaatgtgaag
aatgtg
164016DNAUnknownThe pattern is an example of a pyknon. 40tcatactgga
gagaaa
164116DNAUnknownThe pattern is an example of a pyknon. 41acaagtgtga
agaatg
164216DNAUnknownThe pattern is an example of a pyknon. 42catactgaag
agaaac
16431497DNAUnknownUnknown organism 43nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnaagg aaaaaaacct 180ttnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240nnnnnnnnnn nnnnnnagca gagcataaaa
gannnnnnnn nnnnnnnnnn nnnnnnnnnn 300nnnnnnnnnn nnnnnnnnnn nnatttacag
tttaaaaann nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 540nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnaaa 600tgtgaagaat gtgnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnt 660catactggag agaaannnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnaaatg tgaagaatgt 780gnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnntca tactggagag 840aaannnnnna aatgtgaaga atgtgnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnaaatgtg aagaatgtgn nnnnnnnnnn 960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnntcata ctggagagaa annnnnnaaa 1020tgtgaagaat gtgnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1080nnnnnnnnnn nnnnnnnnnn naaatgtgaa
gaatgtgnnn nnnnnnnnnn nnnnnnnnnn 1140nnnnnnnnnn nnnnnnnnnn nnntcatact
ggagagaaan nnnnnaaatg tgaagaatgt 1200gnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1260nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1320nnnnnnnnnn ntcatactgg agagaaannn
nnnaaatgtg aagaatgtgn nnnnnnnnnn 1380nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnntcata ctggagagaa annnnnnnnn 1440nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnn 149744996DNAUnknownUnknown organism
44nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnaa atgtgaagaa
60tgtgnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
120nnnnnnnnnn nnnnnnnnnn nnnnnnnaaa tgtgaagaat gtgnnnnnnn nnnnnnnnnn
180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnt catactggag agaaannnnn nnnnnnnnnn
240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnntcatact
300ggagagaaan nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
360nnnnnnnnnn nnnntcatac tggagagaaa nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
480aaatgtgaag aatgtgnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
540nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
600nnnnnnnnnn nnnnnnnnnn nnnnnntcat actggagaga aannnnnnnn nnnnnnnnnn
660nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn tcatactgga
720gagaaannnn nnaaatgtga agaatgtgnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
780nnnnnnnnnn nnnntcatac tggagagaaa nnnnnnaaat gtaaagaatg tgnnnnnnnn
840nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnntc atactggaga gaaannnnnn
900aaatgtgaag aatgtgnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnn
996451182DNAUnknownUnknown organism 45nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180nnnnnnnnnn nnnnnnnnnn nnnnaaatgt
gaagaatgtg nnnnnnnnnn nnnnnnnnnn 240nnnnnnnnnn nnnnnnnnnn nnnnnntcat
actggagaga aannnnnnaa atgtgaagaa 300tgtgnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 540aaatgtgaag aatgtgnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 600nnncatactg aagagaaacn nnnnaaatgt
gaagaatgtg nnnnnnnnnn nnnnnnnnnn 660nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnntgtg aaaaatgtgg cannnnnnnn 720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 780nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 840nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 900nnnaaatgtg aagaatgtgn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 960nnnnnnnnnn nnnnnnnnnn nnnnaaatgt
gaagaatgtg nnnnnnnnnn nnnnnnnnnn 1020nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnaa atgtgaagaa 1080tgtgnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn tcatactgga 1140gagaaannnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nn 1182461968DNAUnknownUnknown organism
46nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnaatttat atagaaatgn nnnnnnnnnn
120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnntctttt
300caaaaagcaa nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnntttaca gttaagaaaa
360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnntgata aatatttgaa annnnnnnnn
480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
540nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
600nnnnnnnnnn nnnnnnnnaa atgtaaagaa tgtgnnnnnn nnnnnnnnnn nnnnnnnnnn
660nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
780nnnnnnaaat gtgaagaatg tgnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
840nnnnnnnntc atactggaga gaaannnnac aagtgtgaag aatgnnnnnn nnnnnnnnnn
900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnacaagtgt
960gaagaatgnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnac
1020actggagaga aaccnnnnaa atgtgaagaa tgtgnnnnnn nnnnnnnnnn nnnnnnnnnn
1080nnnnnnnnnn nnnnnnnnnn tcatactgga gagaaannnn nnnnnnnnnn nnnnnnnnnn
1140nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1200nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1260nnnnnnnntc atactggaga gaaannnnac aagtgtgaag aatgnnnnnn nnnnnnnnnn
1320nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnaaatgt
1380gaagaatgtg nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnac
1440actggagaga aaccnnnnaa atgtgaagaa tgtgnnnnnn nnnnnnnnnn nnnnnnnnnn
1500nnnnnnnnnn nnnnnnnnnn tcatactgga gagaaannnn nnaaatgtga agaatgtgnn
1560nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnncatac tgaagagaaa
1620cnnnnnnnnt gtgaaaaatg tggcannnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1680nnnnnnnntc atactggaga gaaannnnnn aaatgtgaag aatgtgnnnn nnnnnnnnnn
1740nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnaaatgt
1800gaagaatgtg nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1860nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1920nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnn
1968471191DNAUnknownUnknown organism 47nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180nnnnnnnnnn nnnnnnnnnn nnnnaaatgt
gaagaatgtg nnnnnnnnnn taaacaattc 240tcaaaannnn nnnnnnnnnn nnnnnntcat
actggagaga aannnnnnaa atgtgaagaa 300tgtgnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 540aaatgtgaag aatgtgnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 600nnncatactg aagagaaacn nnnnaaatgt
gaagaatgtg nnnnnnnnnn nnnnnnnnnn 660nnnnnnnnnn ntcatactgg agagaaannn
nnnnnntgtg aaaaatgtgg cannnnnnnn 720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 780nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 840nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 900nnnnnnnnna aatgtgaaga atgtgnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 960nnnnnnnnnn nncatactga agagaaacnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1020nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnncata ctgaagagaa acnnnnnaaa 1080tgtgaagaat gtgnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnt 1140catactggag agaaannnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn n 1191481371DNAUnknownUnknown organism
48nnnnnnnnnn nnnnnnnnnn ncagccagag cagggcannn nnnnnnnnnn nnnnnnnnnn
60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
240nnnnnnnnnn nnnnnnnnnn nnnnnnnnna aagaaacatt tcaaannnnn nnnnnnnnnn
300nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
420nnnnnnnntc atactggaga gaaannnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nntcatactg gagagaaann nnnnnnnnnn
540nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
600nnnnnnnnnn nnnnnnnnaa atgtgaagaa tgtgnnnnnn nnnnnnnnnn nnnnnnnnnn
660nnnnnnnnnn nnnnnnnnnn tcatactgga gagaaannnn nnnnnnnnnn nnnnnnnnnn
720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
780nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
840nnnnntcata ctggagagaa annnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1020nnnnnnnnnn nnnnnaaatg tgaagaatgt gnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1080nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1140nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn ntcatactgg agagaaannn
1200nnnaaatgtg aagaatgtgn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1260nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1320nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn n
137149914DNAUnknownUnknown organism 49nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120nnnnnnnnnn nnaaatgtga agaatgtgnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnaaatgtg aagaatgtgn nnnnnnnnnn 240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnaaa 300tgtgaagaat gtgnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnnnn nnnnnnnnnn nnntcatact
ggagagaaan nnnnnnnnnn nnnnnnnnnn 480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnntca tactggagag 540aaannnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 600nnnnnnnnnn ntcatactgg agagaaannn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 660nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnaattcata tggaattgnn nnnnnnnaaa 720tgtgaagaat gtgnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnactaatca taagagaann 780nacactggag agaaaccnnn naaatgtgaa
gaatgtgnnn nnnnnnnnnn nnnnnnnnnn 840nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnaaatg tgaagaatgt 900gnnnnnnnnn nnnn
914501617DNAUnknownUnknown organism
50nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
240nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
300nnnnnnnaaa aatgtggaaa tgannnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
420nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnntttaata aattttcacn nnnnnnnnnn
480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnaaatgta aagaatgtgn nnnnnnnnnn
540nnnnnnnnnn nnnnnnnnnn nnnnnnnnaa agaaattata ccaannnnnn nnnnnnnnnn
600nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
660nnnnnnnnnn nnnnnnnnaa atgtgaagaa tgtgnnnnnn nnnnnnnnnn nnnnnnnnnn
720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnaaatgtga agaatgtgnn
780nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
840nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn aaatgtgaag aatgtgnnnn nnnnnnnnnn
960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1020nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1080nnnnnnnnnn nnnnnnnnaa atgtaaagaa tgtgnnnnnn nnnnnnnaca ctcctcagcc
1140cttnnnnnnn nnnnnnnnnn nnacactgga gagaaaccnn nnaaatgtga agaatgtgnn
1200nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1260nnnnnnaaat gtgaagaatg tgnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1320nnnnnnnnnn nnnnnnnnnn nnnnnnnnac aagtgtgaag aatgnnnnnn nnnnnnnnnn
1380nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnaaatgt
1440gaagaatgtg nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnntcat
1500actggagaga aannnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1560nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnn
1617511530DNAUnknownUnknown organism 51nnnnnnnnnn nnnnnnnnnn ncagccagag
cagggcannn nnnnnnnnnn nnnnnnnnnn 60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240nnnnnnnnnn nnnnnnnnnn nnnnnnnnna
aagaaacatt tcaaannnnn nnnnnnnnnn 300nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 420nnnnnnnntc atactggaga gaaannnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 480nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nntcatactg gagagaaann nnnnnnnnnn 540nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 600nnnnnnnnnn nnnnnnnnaa atgtgaagaa
tgtgnnnnnn nnnnnnnnnn nnnnnnnnnn 660nnnnnnnnnn nnnnnnnnnn tcatactgga
gagaaannnn nnnnnnnnnn nnnnnnnnnn 720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 780nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 840nnnnnnnntc atactggaga gaaannnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1020nnnnnnnnnn nnnnnnnnaa atgtgaagaa
tgtgnnnnnn nnnnnnnnnn nnnnnnnnnn 1080nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1140nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnntcatac tggagagaaa 1200nnnnnnaaat gtgaagaatg tgnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1260nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1320nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1380nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1440nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
nnnnntcata ctggagagaa annnnnnaaa 1500tgtgaagaat gtgnnnnnnn nnnnnnnnnn
1530521209DNAUnknownUnknown organism
52nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
60nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
120nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
180nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn tttacaaata
240agaaaannnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
300nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
360nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
420aaatgtaaag aatgtgnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
480nnncatactg aagagaaacn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
540nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnacaa gtgtgaagaa
600tgnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
660nnnnnnnnnn nnaaatgtga agaatgtgnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
720nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
780nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
840nnnnnnnnna aatgtgaaga atgtgnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
900nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnaaatgtg aagaatgtgn nnnnnnnnnn
960nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnncata ctgaagagaa acnnnnnaaa
1020tgtgaagaat gtgnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
1080catactgaag agaaacnnnn nnnnnntgaa gaatgtatca gannnnnnnn nnnnnnnnnn
1140nnnnnnnnnn nnnnnnnnnn nnntcatact ggagagaaan nnnnnnnnnn nnnnnnnnnn
1200nnnnnnnnn
1209
User Contributions:
Comment about this patent or add new information about this topic:
People who visited this patent also read: | |
Patent application number | Title |
---|---|
20150007082 | CABIN MANAGEMENT SYSTEM HAVING A THREE-DIMENSIONAL OPERATING PANEL |
20150007081 | MULTI-CHANNEL CONVERSION PATH POSITION REPORTING |
20150007080 | SYSTEM AND METHOD FOR PRODUCING AND SUBMITTING A PIREP |
20150007079 | COMBINING PARALLEL COORDINATES AND HISTOGRAMS |
20150007078 | Data Displays in a Tile-Based User Interface |