Patent application title: METHOD AND KIT FOR DISCRIMINATING BETWEEN BREAST CANCER AND BENIGN BREAST DISEASE
Inventors:
IPC8 Class: AC12Q168FI
USPC Class:
1 1
Class name:
Publication date: 2016-10-20
Patent application number: 20160304944
Abstract:
A method and kit for discriminating between breast cancer and benign
breast disease by the determination of the expression level of at least
one target gene having a nucleic acid sequence selected from the nucleic
acid sequences set forth in SEQ ID NOs: 1, 2 or 3, 4 and 5 or 6 to obtain
an expression profile for the patient, and the comparison of the
expression profile of the patient with expression profiles of target
genes from patients previously clinically classified as breast cancer and
expression profiles of target genes from patients previously clinically
classified as benign breast disease.
Claims:
1. A method for discriminating between breast cancer and benign breast
disease in a biological sample from a patient, the method comprising the
following steps: a) obtaining the biological sample comprising a
biological material from the patient, b) contacting the biological
material from the biological sample with at least one specific reagent
for at least one target gene and no more than 28 specific reagents for 28
target genes comprising the full-length nucleic acid sequences set forth
in SEQ ID NOS: 1 to 44, wherein the at least one reagent is specific for
at least one target gene comprising the full-length nucleic acid sequence
set forth in SEQ ID NOS: 1 to 6, c) measuring the expression level of the
at least one target gene to obtain an expression profile for the patient,
and d) performing clustering analysis of the expression profile of the
patient with expression profiles of the at least one target gene from
patients previously clinically classified as having breast cancer and
expression profiles of the at least one target gene from patients
previously clinically classified as having benign breast disease,
wherein: if the expression profile of the patient is clustered with the
expression profiles from patients previously clinically classified as
having breast cancer, then the patient is diagnosed to have breast
cancer, and if the expression profile of the patient is clustered with
the expression profiles from patients previously clinically classified as
having benign breast disease, then the patient is diagnosed to have a
benign breast disease.
2. The method as claimed in claim 1, wherein: in step b) the biological material from the biological sample is contacted with reagents specific for a combination of at least 4 and no more than 28 target genes, the at least four reagents being specific for at least four different target genes respectively comprising the full-length nucleic acid sequences set forth in: 1) SEQ ID NO: 1; and 2) SEQ ID NO: 2 or 3; and 3) SEQ ID NO: 4; and 4) SEQ ID NO: 5 or 6; and the expression level of the target genes is measured in step c) to obtain the expression profile for the patient.
3. The method as claimed in claim 1, wherein in step b) the biological material is brought into contact with reagents specific for a combination of 28 target genes, and the expression level of the 28 genes is measured in step c) to obtain the expression profile for the patient.
4. The method as claimed in claim 1, wherein the biological sample taken from the patient is a blood sample.
5. The method as claimed in claim 1, wherein the biological material comprises nucleic acids.
6. The method as claimed in claim 1, wherein the at least one specific reagent of step b) comprises at least one hybridization probe.
7. The method as claimed in claim 6, wherein the specific reagents of step b) comprise at least one hybridization probe and at least one primer.
8. The method as claimed in claim 7, wherein the specific reagents of step b) comprise one hybridization probe and two primers.
9. A kit for discriminating breast cancer from benign breast disease in a biological sample from a patient, comprising at least one specific reagent for at least one target gene and no more than 28 specific reagents for 28 target genes comprising the full-length nucleic acid sequences set forth in SEQ ID NOS: 1 to 44, wherein the at least one reagent is specific for at least one target gene comprising the full-length nucleic acid sequence set forth in SEQ ID NOS: 1 to 6.
10. The kit as claimed in claim 9, wherein the at least one specific reagent comprises at least four reagents respectively specific for at least four target genes and no more than 28 reagents, wherein the target genes are selected from the group consisting of genes comprising the full-length nucleic acid sequences set forth in SEQ ID NOS: 1 to 44, the at least four reagents being specific for at least four different target genes respectively comprising the full-length nucleic acid sequences set forth in: 1) SEQ ID NO: 1; and 2) SEQ ID NO: 2 or 3; and 3) SEQ ID NO: 4; and 4) SEQ ID NO: 5 or 6.
11. The kit as claimed in claim 10, comprising reagents specific for a combination of 28 target genes.
12. A method comprising manufacturing the kit of claim 9.
13. A method comprising manufacturing the kit of claim 10.
14. A method comprising manufacturing the kit of claim 11.
15. The method as claimed in claim 12, wherein the at least one specific reagent comprises at least one hybridization probe.
16. The method as claimed in claim 15, wherein the at least one specific reagent comprises at least one hybridization probe and at least one primer.
17. The method as claimed in claim 16, wherein the specific reagent comprises one hybridization probe and two primers.
Description:
[0001] This is a divisional of application Ser. No. 13/696,937 filed Nov.
28, 2012, which is a National Stage Application of PCT/CN2010/073342
filed May 28, 2010. The entire disclosures of the prior applications are
hereby incorporated by reference herein in their entirety.
FIELD OF THE INVENTION
[0002] The present invention relates to the field of the discrimination between breast cancer and benign breast disease. Particularly, the present invention relates to a method and kit for discriminating between breast cancer and benign breast disease.
BACKGROUND
[0003] Breast cancer is the most common cancer in women worldwide. Because the pathogenesis of breast cancer is inadequately understood, early diagnosis is of great significance. Currently, mammographic screening is the most frequently used method for breast cancer detection. Several large randomized trials have shown that it can reduce breast cancer morbidity by 20 to 40 percent in women aged 40 to 69. Mammography is currently the gold standard for early breast cancer detection, but its reported overall sensitivity is significantly reduced in certain subsets of women, particularly in women with radiographically dense breasts and those at increased risk of breast cancer. Estimates of film mammographic sensitivity in women with extremely dense breasts range from 48 to 63%. Mammography has the disadvantages of low sensitivity and specificity, especially in younger women, and of painful compression during the procedure. In addition, because of small breast volume and high breast density, mammographic screening fails to give a clear result in many cases, which are often classified as BI-RADS 0 (BI-RADS: Breast Imaging Reporting and Data System).
[0004] BI-RADS was developed in 1993 by the American College of Radiology (ACR) to standardize mammographic reporting, improve communication, reduce confusion regarding mammographic findings, aid research, and facilitate outcomes monitoring. According to the Mammography Quality Standards Act (MQSA) of 1997 [Final Rule 62(208): 55988], all mammograms in the United States must be reported using one of these assessment categories. Each mammographic study should be assigned a single assessment based on the most concerning findings. Classifications are divided into an incomplete assessment (category 0) and completed assessments (categories 1, 2, 3, 4, 5, 6). BI-RADS category 0 is defined as an incomplete assessment, which means that additional imaging is needed. Follow-up is usually recommended, which entails a long, expensive and anxiety-producing process based on ultrasonography, magnetic resonance imaging (MRI) or even biopsy. Ultrasonography, even combined with mammography, is associated with a high rate of false positive results, which lead to unnecessary invasive steps. The long waiting time for an MRI appointment is detrimental to patients. MRI also has a high rate of false positive results, together with a high cost. Given these factors, there is a strong need for a new, easy-to-use test that would improve breast cancer detection and indicate patient risk, particularly when mammography is inconclusive.
[0005] Serum biomarkers such as CEA and CA15-3 do not perform well in cancer screening [1]. Recently, some literature has described the possibility of early diagnosis of breast cancer using gene-expression patterns in peripheral blood cells [2]. The results of these pilot studies indicate that cancer causes characteristic changes in the biochemical environment of blood and that, as a result, the expression pattern of some identified genes can be used to discriminate between cancer and control groups with high accuracy. However, no alternative based on blood biomarkers has yet succeeded in discriminating, within BI-RADS 0 patients, between breast cancer (BC) and benign breast disease (BBD).
SUMMARY OF THE INVENTION
[0006] The present invention provides a method for discriminating between breast cancer and benign breast disease in a biological sample from a patient, wherein it comprises the following steps: a) obtaining the biological sample comprising a biological material from the patient, b) contacting the biological material from the biological sample with at least one specific reagent for at least one target gene and no more than 28 specific reagents for 28 target genes comprising the nucleic acid sequences set forth in SEQ ID NOs 1 to 44, wherein the at least one reagent is specific for at least a target gene comprising a nucleic acid sequence selected from the nucleic acid sequences set forth in SEQ ID NOs: 1, 2 or 3, 4 and 5 or 6, and c) determining the expression level of at least one target gene comprising a nucleic acid sequence selected from the nucleic acid sequences set forth in SEQ ID NOs: 1, 2 or 3, 4 and 5 or 6 to obtain an expression profile for the patient, and d) performing analysis of the expression profile of the patient with expression profiles of target genes from patients previously clinically classified as breast cancer and expression profiles of target genes from patients previously clinically classified as benign breast disease, wherein: if the expression profile of the patient is clustered with the expression profiles from patients previously clinically classified as breast cancer, then the patient is prognosticated to have breast cancer, and if the expression profile of the patient is clustered with the expression profiles from patients previously clinically classified as benign breast disease, then the patient is prognosticated to have a benign breast disease.
[0007] In one embodiment, in step b) the biological material is brought into contact with reagents specific for a combination of at least 4 and no more than 28 target genes, wherein the reagents include at least reagents specific for the target genes comprising the nucleic acid sequence set forth in SEQ ID NOs 1, 2 or 3, 4 and 5 or 6, respectively, and the expression level of at least said 4 genes is determined in step c) to obtain the expression profile for the patient.
[0008] In another embodiment, in step b) the biological material is brought into contact with reagents specific for a combination of 28 genes, wherein the reagents include reagents specific for the target genes comprising the nucleic acid sequence set forth in SEQ ID NOs 1 to 44 respectively, and the expression level of the 28 genes is determined in step c) to obtain the expression profile for the patient.
[0009] Particularly, the biological sample taken from the patient is a blood sample. More particularly, the biological material comprises nucleic acids.
[0010] In one embodiment, the at least one specific reagent of step b) comprises at least one hybridization probe. In another embodiment, the specific reagents of step b) comprise at least one hybridization probe and at least one primer. In a further embodiment, the specific reagents of step b) comprise one hybridization probe and two primers.
[0011] The present invention also provides a kit for discriminating breast cancer from benign breast disease in a biological sample from a patient, comprising at least one specific reagent for at least one target gene and no more than 28 specific reagents for 28 target genes comprising the nucleic acid sequences set forth in SEQ ID NOs 1 to 44, wherein the at least one reagent is specific for at least a target gene comprising a nucleic acid sequence selected from the nucleic acid sequences set forth in SEQ ID NOs: 1, 2 or 3, 4 and 5 or 6.
[0012] In one embodiment, the kit of the present invention comprises reagents specific for a combination of at least 4 and no more than 28 target genes, wherein the reagents include at least reagents specific for the target genes comprising the nucleic acid sequence set forth in SEQ ID NOs 1, 2 or 3, 4 and 5 or 6, respectively.
[0013] In another embodiment, the kit of the present invention comprises reagents specific for a combination of 28 target genes, wherein the reagents include reagents specific for the target genes comprising the nucleic acid sequence set forth in SEQ ID NOs 1 to 44.
[0014] The present invention also relates to the use of at least one specific reagent for at least one target gene and no more than 28 specific reagents for 28 target genes comprising the nucleic acid sequences set forth in SEQ ID NOs 1 to 44 in the manufacture of a composition for discriminating breast cancer from benign breast disease in a biological sample from a patient, wherein the at least one reagent is specific for at least a target gene comprising a nucleic acid sequence selected from the nucleic acid sequences set forth in SEQ ID NOs: 1, 2 or 3, 4 and 5 or 6.
[0015] In one embodiment, the present invention relates to use of reagents specific for a combination of at least 4 and no more than 28 target genes in the manufacture of a composition for discriminating breast cancer from benign breast disease in a biological sample from a patient, wherein the reagents include at least reagents specific for the target genes comprising the nucleic acid sequence set forth in SEQ ID NOs 1, 2 or 3, 4 and 5 or 6, respectively.
[0016] In another embodiment, the present invention relates to use of reagents specific for a combination of 28 target genes in the manufacture of a composition for discriminating breast cancer from benign breast disease in a biological sample from a patient, wherein the reagents include reagents specific for the target genes comprising the nucleic acid sequence set forth in SEQ ID NOs 1 to 44.
DETAILED DESCRIPTION OF THE INVENTION
[0017] The present invention proposes to solve all the drawbacks of the prior art by providing a diagnostic tool for discriminating, within BI-RADS 0 patients, between BC and BBD. Considering that most patients whose mammography is classified as BI-RADS 0 have a breast lesion, the present study aims to discriminate BC from BBD. This is very different from earlier research, which focused on the expression patterns of breast cancer patients and of patients with no signs of this disease. This approach eliminates from the detection of cancer some factors that are not cancer-specific, such as the regulation of certain inflammatory responses.
[0018] Surprisingly, the inventors have demonstrated that the analysis of the expression of at least one target gene selected from CHI3L1, CLEC4C, LILRA3 and TUBB2A provides information that is sufficient for distinguishing BBD patients from BC patients. Of course, analyzing the expression of the above target genes in combination improves the sensitivity and the specificity of the result, as does analyzing the expression profile of the 28 target genes described below in Table 1, which include CHI3L1, CLEC4C, LILRA3 and TUBB2A.
TABLE 1
SEQ ID NO | Abbreviated name | Name of gene | Accession number
1 | CHI3L1 | Chitinase 3-like 1 (cartilage glycoprotein-39) | ENST00000255409
2 | CLEC4C | C-type lectin domain family 4, member C | ENST00000354629
3 | | | ENST00000360345
4 | LILRA3 | Leukocyte immunoglobulin-like receptor, subfamily A (without TM domain), member 3 | ENST00000251390
5 | TUBB2A | Tubulin, beta 2A | ENST00000259218
6 | | | ENST00000333628
7 | ADAM12 | ADAM metallopeptidase domain 12 | ENST00000368676
8 | CHURC1 | Churchill domain containing 1 | ENST00000359118
9 | RNF182 | Ring finger protein 182 | ENST00000313403
10 | TMEM176B | Transmembrane protein 176B | ENST00000326442
11 | | | ENST00000429904
12 | | | ENST00000434545
13 | | | ENST00000447204
14 | FAM118A | Family with sequence similarity 118, member A | ENST00000216214
15 | | | ENST00000441876
16 | ANKRD20A | Ankyrin repeat domain 20 family, member A1/2/3/4/5 | ENST00000377477
17 | KLRC1/2 | Killer cell lectin-like receptor subfamily C, member 1/2 | ENST00000347831
18 | | | ENST00000359151
19 | | | ENST00000381902
20 | KIAA1671 | KIAA1671 protein | ENST00000358431
21 | ZBTB44 | Zinc finger and BTB domain containing 44 | ENST00000454539
22 | LQK1 | LQK1 hypothetical protein short isoform | NR_027285
23 | | | NR_027286
24 | APOBEC3A | Apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3A | ENST00000249116
25 | | | ENST00000402255
26 | LOC283788 | Homo sapiens cDNA FLJ90087 fis, clone HEMBA1005230, weakly similar to zinc protein 140 | NR_027436
27 | FAM87A/B | Family with sequence similarity 87, member A/B | ENST00000330148
28 | LOC642236 | Similar to FRG1 protein (FSHD region gene 1 protein) | ENST00000226798
29 | C4A/B | Complement component 4A/B | ENST00000428596
30 | ENTPD5 | Ectonucleoside triphosphate diphosphohydrolase 5 | ENST00000334696
31 | LOC728263 | Similar to hCG1818012 | NG_008780
32 | MGC15705 | Putative uncharacterized protein MGC15705 | ENST00000425084
33 | FAM160A1 | Family with sequence similarity 160 A1 | ENST00000340515
34 | | | ENST00000435205
35 | PLXDC1 | Plexin domain containing 1 | ENST00000315392
36 | SFN | Stratifin | ENST00000339276
37 | CLU | Clusterin | ENST00000316403
38 | | | ENST00000380446
39 | | | ENST00000405140
40 | PSPH | Phosphoserine phosphatase | ENST00000275605
41 | | | ENST00000395471
42 | | | ENST00000437355
43 | HLA-DQB1 | Major Histocompatibility Complex, class II, DQB1 | ENST00000399084
44 | | | ENST00000434651
[0019] Several variants sometimes exist for the same target gene, as shown, for example, in Table 1. In the present invention, all the variants are relevant and are analyzed without distinction. It is clearly understood that, if various isoforms of these genes exist, all the isoforms are relevant for the present invention.
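By way of illustration only, the relationship between the 44 sequences and the 28 target genes of Table 1 can be sketched as a lookup structure. The assignment of each unnamed row to the gene named on the preceding row is a reading of Table 1, and the names `TABLE_1` and `gene_for_seq_id` are hypothetical helpers that do not appear in the application.

```python
# Illustrative sketch: SEQ ID NOs grouped by target gene, transcribed from Table 1.
# Assumption: a row without a gene name is a variant of the gene named above it.
TABLE_1 = {
    "CHI3L1": [1],
    "CLEC4C": [2, 3],
    "LILRA3": [4],
    "TUBB2A": [5, 6],
    "ADAM12": [7],
    "CHURC1": [8],
    "RNF182": [9],
    "TMEM176B": [10, 11, 12, 13],
    "FAM118A": [14, 15],
    "ANKRD20A": [16],
    "KLRC1/2": [17, 18, 19],
    "KIAA1671": [20],
    "ZBTB44": [21],
    "LQK1": [22, 23],
    "APOBEC3A": [24, 25],
    "LOC283788": [26],
    "FAM87A/B": [27],
    "LOC642236": [28],
    "C4A/B": [29],
    "ENTPD5": [30],
    "LOC728263": [31],
    "MGC15705": [32],
    "FAM160A1": [33, 34],
    "PLXDC1": [35],
    "SFN": [36],
    "CLU": [37, 38, 39],
    "PSPH": [40, 41, 42],
    "HLA-DQB1": [43, 44],
}


def gene_for_seq_id(seq_id):
    """Return the abbreviated gene name to which a SEQ ID NO belongs."""
    for gene, seq_ids in TABLE_1.items():
        if seq_id in seq_ids:
            return gene
    raise KeyError(f"SEQ ID NO {seq_id} is not listed in Table 1")


# Count the genes and the sequences covered by the mapping.
gene_count = len(TABLE_1)
sequence_count = sum(len(seq_ids) for seq_ids in TABLE_1.values())
print(gene_count, sequence_count)
```

Under this reading, a reagent specific for either SEQ ID NO: 2 or SEQ ID NO: 3 targets the same gene, which is consistent with the "2 or 3" and "5 or 6" wording used throughout.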
[0020] The inventors have identified peripheral blood mRNA signatures which can help to discriminate breast cancer from benign breast disease, with a particular interest in patients with non-conclusive mammography.
[0021] Accordingly the present invention relates to a method for discriminating between breast cancer and benign breast disease in a biological sample from a patient, wherein it comprises the following steps:
[0022] a) obtaining the biological sample comprising a biological material from the patient,
[0023] b) contacting the biological material from the biological sample with at least one specific reagent for at least one target gene and no more than 28 specific reagents for 28 target genes comprising the nucleic acid sequences set forth in SEQ ID NOs 1 to 44, wherein the at least one reagent is specific for at least a target gene comprising a nucleic acid sequence selected from the nucleic acid sequences set forth in SEQ ID NOs: 1, 2 or 3, 4 and 5 or 6, and
[0024] c) determining the expression level of at least one target gene comprising a nucleic acid sequence selected from the nucleic acid sequences set forth in SEQ ID NOs: 1, 2 or 3, 4 and 5 or 6 to obtain an expression profile for the patient, and
[0025] d) performing analysis of the expression profile of the patient with expression profiles of target genes from patients previously clinically classified as breast cancer and expression profiles of target genes from patients previously clinically classified as benign breast disease, wherein: if the expression profile of the patient is clustered with the expression profiles from patients previously clinically classified as breast cancer, then the patient is prognosticated to have breast cancer, and if the expression profile of the patient is clustered with the expression profiles from patients previously clinically classified as benign breast disease, then the patient is prognosticated to have a benign breast disease.
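By way of illustration only, the comparison of step d) can be sketched as follows. The passage above does not name a particular clustering algorithm, so this sketch substitutes a simple nearest-centroid rule with Euclidean distance as a stand-in. All function names and all expression values are hypothetical; the numbers are invented for the example and carry no biological meaning.

```python
import math


def centroid(profiles):
    """Mean expression level per gene across a group of reference profiles."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]


def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_group(patient, bc_profiles, bbd_profiles):
    """Assign the patient profile to the closer reference group.

    Assumption: nearest-centroid assignment stands in for the clustering
    analysis; a tie is arbitrarily resolved as "BBD".
    """
    d_bc = euclidean(patient, centroid(bc_profiles))
    d_bbd = euclidean(patient, centroid(bbd_profiles))
    return "BC" if d_bc < d_bbd else "BBD"


# Synthetic reference profiles, one value per gene for four genes.
bc_reference = [[8.0, 3.0, 6.5, 2.0], [7.5, 3.5, 6.0, 2.5]]
bbd_reference = [[4.0, 6.0, 3.0, 5.5], [4.5, 6.5, 3.5, 5.0]]

patient_profile = [7.8, 3.2, 6.1, 2.2]
result = assign_group(patient_profile, bc_reference, bbd_reference)
print(result)
```

A hierarchical or other clustering procedure could be substituted for the centroid rule without changing the overall flow: reference profiles from both clinically classified groups are required, and the patient profile is assigned to the group with which it clusters.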
[0026] In one or more embodiments it is possible in step b) to bring the biological material into contact with reagents specific for a combination of at least 2, or at least 3 or at least 4 target genes and no more than 28 target genes, wherein the reagents include at least reagents specific for the target genes comprising the nucleic acid sequence set forth in any one of SEQ ID NOs 1, 2 or 3, 4 and 5 or 6, respectively, and the expression level of at least 2, 3 or 4 genes is determined in step c).
[0027] Examples of combination of target genes are described below:
[0028] SEQ ID NO: 1 and SEQ ID NO: 2 or 3
[0029] SEQ ID NO: 1 and SEQ ID NO: 4
[0030] SEQ ID NO: 1 and SEQ ID NO: 5 or 6
[0031] SEQ ID NO: 2 or 3 and SEQ ID NO: 4
[0032] SEQ ID NO: 2 or 3 and SEQ ID NO: 5 or 6
[0033] SEQ ID NO: 4 and SEQ ID NO: 5 or 6
[0034] SEQ ID NO: 1, SEQ ID NO: 2 or 3 and SEQ ID NO: 4
[0035] SEQ ID NO: 1, SEQ ID NO: 2 or 3 and SEQ ID NO: 5 or 6
[0036] SEQ ID NO: 1, SEQ ID NO: 4 and SEQ ID NO: 5 or 6
[0037] SEQ ID NO: 2 or 3, SEQ ID NO: 4 and SEQ ID NO: 5 or 6
[0038] SEQ ID NO: 4, SEQ ID NO: 5 or 6 and SEQ ID NO: 2 or 3, and
[0039] SEQ ID NO: 1, SEQ ID NO: 2 or 3, SEQ ID NO: 4 and SEQ ID NO: 5 or 6; the following combinations of target genes being preferred:
[0040] SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 4 and SEQ ID NO: 5, and
[0041] SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4 and SEQ ID NO: 6.
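By way of illustration only, the combinations of two, three or four of the core targets listed above can be enumerated programmatically. The names `CORE_TARGETS` and `core_combinations` are hypothetical. The list above gives one three-member combination twice in a different order (paragraphs [0037] and [0038]), so the enumeration yields eleven distinct sets.

```python
from itertools import combinations

# The four core targets; "2 or 3" and "5 or 6" denote variants of one gene each.
CORE_TARGETS = [
    "SEQ ID NO: 1",
    "SEQ ID NO: 2 or 3",
    "SEQ ID NO: 4",
    "SEQ ID NO: 5 or 6",
]


def core_combinations():
    """Return every combination of 2, 3 or 4 of the core targets."""
    combos = []
    for size in (2, 3, 4):
        combos.extend(combinations(CORE_TARGETS, size))
    return combos


all_combos = core_combinations()
for combo in all_combos:
    print(", ".join(combo))
```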
[0042] Consequently, in one embodiment of the method of the present invention in step b) the biological material is brought into contact with reagents specific for a combination of at least 4 and no more than 28 target genes, wherein the reagents include at least reagents specific for the target genes comprising the nucleic acid sequence set forth in SEQ ID NOs 1, 2 or 3, 4 and 5 or 6, respectively, and the expression level of at least said 4 genes is determined in step c) to obtain the expression profile for the patient.
[0043] In another embodiment of the method in step b) the biological material is brought into contact with reagents specific for a combination of 28 genes, wherein the reagents include reagents specific for the target genes comprising the nucleic acid sequence set forth in SEQ ID NOs 1 to 44 respectively, and the expression level of the 28 genes is determined in step c) to obtain the expression profile for the patient.
[0044] The biological sample taken from the patient is any sample liable to contain a biological material as defined hereinafter, in particular blood, plasma, serum, tissue or a circulating-cell sample, a blood sample being preferred. This biological sample is provided by any type of sampling known to those skilled in the art.
[0045] In an embodiment of the method of the invention, the biological material can be extracted from the biological sample by any of the nucleic acid extraction and purification protocols well known to those skilled in the art. In another embodiment of the present invention the target biological material is not extracted from the biological sample and its analysis is directly performed from the sample.
[0046] The term "biological material" is intended to mean any material that makes it possible to detect the expression of a target gene. The biological material may in particular comprise proteins, or nucleic acids, such as, in particular, deoxyribonucleic acids (DNA) or ribonucleic acids (RNA). The nucleic acid may in particular be an RNA (ribonucleic acid).
[0047] According to a preferred embodiment of the invention, the biological material is extracted in step and comprises nucleic acids, preferably RNAs, and even more preferably total RNA. Total RNA comprises transfer RNAs (tRNA), messenger RNAs (mRNAs), such as the mRNAs transcribed from the target gene, but also transcribed from any other gene, and ribosomal RNAs. This biological material comprises material specific for a target gene, such as in particular the mRNAs transcribed from the target gene or the proteins derived from these mRNAs.
[0048] By way of indication, the nucleic acid extraction can be carried out by:
a step consisting of lysis of the cells present in the biological sample, in order to release the nucleic acids contained in the cells of the patient. By way of example, use may be made of the methods of lysis as described in patent applications: WO 00/05338 regarding mixed magnetic and mechanical lysis, WO 99/53304 regarding electrical lysis, WO 99/15321 regarding mechanical lysis. Those skilled in the art may use other well-known methods of lysis, such as thermal or osmotic shocks or chemical lyses using chaotropic agents such as guanidinium salts (U.S. Pat. No. 5,234,809);
a purification step, for separating the nucleic acids from the other cellular constituents released in the lysis step. This generally makes it possible to concentrate the nucleic acids, and can be adapted to the purification of DNA or of RNA. By way of example, use may be made of magnetic particles optionally coated with oligonucleotides, by adsorption or covalence (in this respect, see U.S. Pat. No. 4,672,040 and U.S. Pat. No. 5,750,338), and the nucleic acids which are bound to these magnetic particles can thus be purified by means of a washing step. This nucleic acid purification step is particularly advantageous if it is desired to subsequently amplify said nucleic acids. A particularly advantageous embodiment of these magnetic particles is described in patent applications: WO-A-97/45202 and WO-A-99/35500.
[0049] The term "specific reagent" is intended to mean a reagent which, when it is brought into contact with biological material as defined above, binds with the material specific for said target gene. By way of indication, when the specific reagent and the biological material are of nucleic origin, bringing the specific reagent into contact with the biological material allows the specific reagent to hybridize with the material specific for the target gene. The term "hybridization" is intended to mean the process during which, under appropriate conditions, two nucleotide fragments bind with stable and specific hydrogen bonds so as to form a double-stranded complex. These hydrogen bonds form between the complementary adenine (A) and thymine (T) (or uracil (U)) bases (this is referred to as an A-T bond) or between the complementary guanine (G) and cytosine (C) bases (this is referred to as a G-C bond). The hybridization of two nucleotide fragments may be complete (reference is then made to complementary nucleotide fragments or sequences), i.e. the double-stranded complex obtained during this hybridization comprises only A-T bonds and C-G bonds. This hybridization may be partial (reference is then made to sufficiently complementary nucleotide fragments or sequences), i.e. the double-stranded complex obtained comprises A-T bonds and C-G bonds that make it possible to form the double-stranded complex, but also bases not bound to a complementary base. The hybridization between two nucleotide fragments depends on the working conditions that are used, and in particular on the stringency. The stringency is defined in particular as a function of the base composition of the two nucleotide fragments, and also by the degree of mismatching between two nucleotide fragments.
The stringency can also depend on the reaction parameters, such as the concentration and the type of ionic species present in the hybridization solution, the nature and the concentration of denaturing agents and/or the hybridization temperature. All these data are well known and the appropriate conditions can be determined by those skilled in the art. In general, depending on the length of the nucleotide fragments that it is intended to hybridize, the hybridization temperature is between approximately 20 and 70° C., in particular between 35 and 65° C. in a saline solution at a concentration of approximately 0.5 to 1 M. A sequence, or nucleotide fragment, or oligonucleotide, or polynucleotide, is a series of nucleotide motifs assembled together by phosphoric ester bonds, characterized by the informational sequence of the natural nucleic acids, capable of hybridizing to a nucleotide fragment, it being possible for the series to contain monomers having different structures and to be obtained from a natural nucleic acid molecule and/or by genetic recombination and/or by chemical synthesis. A motif is a derivative of a monomer which may be a natural nucleotide of nucleic acid, the constitutive elements of which are a sugar, a phosphate group and a nitrogenous base; in DNA, the sugar is deoxy-2-ribose, in RNA, the sugar is ribose; depending on whether DNA or RNA is involved, the nitrogenous base is selected from adenine, guanine, uracil, cytosine and thymine; alternatively the monomer is a nucleotide that is modified in at least one of the three constitutive elements; by way of example, the modification may occur either at the level of the bases, with modified bases such as inosine, methyl-5-deoxycytidine, deoxyuridine, dimethylamino-5-deoxyuridine, diamino-2,6-purine, bromo-5-deoxyuridine or any other modified base capable of hybridization, or at the level of the sugar, for example the replacement of at least one deoxyribose with a polyamide (P. E. Nielsen et al, Science, 254, 1497-1500 (1991)[3]), or else at the level of the phosphate group, for example its replacement with esters in particular selected from diphosphates, alkyl- and arylphosphonates and phosphorothioates.
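By way of illustration only, the distinction between complete and partial hybridization described above can be sketched for two DNA fragments of equal length. The function names are hypothetical, only the four natural DNA bases are handled (an RNA strand would pair A with U), and the sketch labels any duplex with at least one unpaired position as "partial" without judging whether the fragments are sufficiently complementary under given stringency conditions.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the strand that pairs perfectly with seq, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_mismatches(probe, target):
    """Count positions where the probe does not pair with the target.

    Assumption: both fragments have the same length and are aligned
    end to end in antiparallel orientation.
    """
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    perfect_partner = reverse_complement(target)
    return sum(1 for p, q in zip(probe.upper(), perfect_partner) if p != q)


def hybridization_type(probe, target):
    """Label the duplex 'complete' (no mismatch) or 'partial' (some mismatch)."""
    return "complete" if count_mismatches(probe, target) == 0 else "partial"


target = "ATGCGTAC"
print(reverse_complement(target))
print(hybridization_type("GTACGCAT", target))
print(hybridization_type("GTACGGAT", target))
```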
[0050] According to a specific embodiment of the invention, the specific reagent comprises at least one hybridization probe or at least one hybridization probe and at least one primer which is specific for the target gene or at least one hybridization probe and two primers specific for the target genes.
[0051] For the purpose of the present invention, the term "amplification primer" is intended to mean a nucleotide fragment comprising from 5 to 100 nucleotides, preferably from 15 to 30 nucleotides, that allows the initiation of an enzymatic polymerization, for instance an enzymatic amplification reaction. The term "enzymatic amplification reaction" is intended to mean a process which generates multiple copies of a nucleotide fragment through the action of at least one enzyme. Such amplification reactions are well known to those skilled in the art and mention may in particular be made of the following techniques: PCR (polymerase chain reaction), as described in U.S. Pat. No. 4,683,195, U.S. Pat. No. 4,683,202 and U.S. Pat. No. 4,800,159, LCR (ligase chain reaction), disclosed, for example, in patent application EP 0 201 184, RCR (repair chain reaction), described in patent application WO 90/01069, 3SR (self sustained sequence replication) with patent application WO 90/06995, NASBA (nucleic acid sequence-based amplification) with patent application WO 91/02818, TMA (transcription mediated amplification) with U.S. Pat. No. 5,399,491 and RT-PCR. When the enzymatic amplification is a PCR, the specific reagent comprises at least two amplification primers, specific for a target gene, that allow the amplification of the material specific for the target gene. The material specific for the target gene then preferably comprises a complementary DNA obtained by reverse transcription of messenger RNA derived from the target gene (reference is then made to target-gene-specific cDNA) or a complementary RNA obtained by transcription of the cDNAs specific for a target gene (reference is then made to target-gene-specific cRNA). When the enzymatic amplification is a PCR carried out after a reverse transcription reaction, reference is made to RT-PCR.
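By way of illustration only, the length ranges given above for an amplification primer can be expressed as a simple check. The function name and the three labels are hypothetical; only the numeric bounds (5 to 100 nucleotides, preferably 15 to 30) come from the definition above, and the sketch does not assess specificity for a target gene.

```python
def primer_length_class(primer):
    """Classify a primer by length against the ranges in the definition."""
    n = len(primer)
    if 15 <= n <= 30:
        return "preferred"      # 15 to 30 nucleotides
    if 5 <= n <= 100:
        return "allowed"        # 5 to 100 nucleotides, outside the preferred range
    return "out of range"       # shorter than 5 or longer than 100 nucleotides


# Arbitrary example sequences, chosen only for their lengths.
print(primer_length_class("ACGTACGTACGTACGTACGT"))  # 20 nucleotides
print(primer_length_class("ACGTACGT"))              # 8 nucleotides
print(primer_length_class("ACG"))                   # 3 nucleotides
```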
[0052] The term "hybridization probe" is intended to mean a nucleotide fragment comprising at least 5 nucleotides, such as from 5 to 100 nucleotides, in particular from 10 to 75 nucleotides, such as 15-35 nucleotides and 60-70 nucleotides, having a hybridization specificity under given conditions so as to form a hybridization complex with the material specific for a target gene. In the present invention, the material specific for the target gene may be a nucleotide sequence included in a messenger RNA derived from the target gene (reference is then made to target-gene-specific mRNA), a nucleotide sequence included in a complementary DNA obtained by reverse transcription of said messenger RNA (reference is then made to target-gene-specific cDNA), or else a nucleotide sequence included in a complementary RNA obtained by transcription of said cDNA as described above (reference will then be made to target-gene-specific cRNA). The hybridization probe may include a label for its detection. The term "detection" is intended to mean either a direct detection such as a counting method, or an indirect detection by a method of detection using a label. Many methods of detection exist for detecting nucleic acids (see, for example, Kricka et al., Clinical Chemistry, 1999, no 45 (4), p. 453-458 [4] or Keller G. H. et al., DNA Probes, 2nd Ed., Stockton Press, 1993, sections 5 and 6, p. 173-249 [5]). The term "label" is intended to mean a tracer capable of generating a signal that can be detected. 
A non-limiting list of these tracers includes enzymes which produce a signal that can be detected, for example, by colorimetry, fluorescence or luminescence, such as horseradish peroxidase, alkaline phosphatase, beta-galactosidase, glucose-6-phosphate dehydrogenase; chromophores such as fluorescent, luminescent or dye compounds; electron-dense groups detectable by electron microscopy or by virtue of their electrical properties such as conductivity, by amperometry or voltammetry methods, or by impedance measurement; groups that can be detected by optical methods such as diffraction, surface plasmon resonance, or contact angle variation, or by physical methods such as atomic force spectroscopy, tunnel effect, etc.; radioactive molecules such as ³²P, ³⁵S or ¹²⁵I.
[0053] For the purpose of the present invention, the hybridization probe may be a "detection" probe. In this case, the "detection" probe is labeled by means of a label. The detection probe may in particular be a "molecular beacon" detection probe as described by Tyagi & Kramer (Nature biotech, 1996, 14:303-308 [6]). These "molecular beacons" become fluorescent during the hybridization. They have a stem-loop-type structure and contain a fluorophore and a "quencher" group. The binding of the specific loop sequence to its complementary target nucleic acid sequence causes the stem to open, which results in the emission of a fluorescent signal upon excitation at the appropriate wavelength. The detection probe may in particular be a "reporter probe" comprising a "color-coded barcode" according to NanoString™'s technology.
[0054] For the detection of the hybridization reaction, use may be made of target sequences that have been labeled, directly (in particular by the incorporation of a label within the target sequence) or indirectly (in particular using a detection probe as defined above). It is in particular possible to carry out, before the hybridization step, a step consisting in labeling and/or cleaving the target sequence, for example using a labeled deoxy-ribonucleotide triphosphate during the enzymatic amplification reaction. The cleavage may be carried out in particular by the action of imidazole or of manganese chloride. The target sequence may also be labeled after the amplification step, for example by hybridizing a detection probe according to the sandwich hybridization technique described in document WO 91/19812. Another specific preferred method of labeling nucleic acids is described in application FR 2780059.
[0055] According to a preferred embodiment of the invention, the detection probe comprises a fluorophore and a quencher.
[0056] According to an even more preferred embodiment of the invention, the hybridization probe comprises an FAM (6-carboxy-fluorescein) or ROX (6-carboxy-X-rhodamine) fluorophore at its 5' end and a quencher (Dabsyl) at its 3' end.
[0057] The hybridization probe may also be a "capture" probe. In this case, the "capture" probe is immobilized or can be immobilized on a solid substrate by any appropriate means, i.e. directly or indirectly, for example by covalence or adsorption. As solid substrate, use may be made of synthetic materials or natural materials, optionally chemically modified, in particular polysaccharides such as cellulose-based materials, for example paper, cellulose derivatives such as cellulose acetate and nitrocellulose or dextran, polymers, copolymers, in particular based on styrene-type monomers, natural fibers such as cotton, and synthetic fibers such as nylon; inorganic materials such as silica, quartz, glasses or ceramics; latices; magnetic particles; metal derivatives, gels, etc. The solid substrate may be in the form of a microtitration plate, of a membrane as described in application WO-A-94/12670 or of a particle. It is also possible to immobilize on the substrate several different capture probes, each being specific for a target gene. In particular, a biochip on which a large number of probes can be immobilized may be used as substrate. The term "biochip" is intended to mean a solid substrate that is small in size, to which a multitude of capture probes are attached at predetermined positions. The biochip, or DNA chip, concept dates from the beginning of the 1990s. It is based on a multidisciplinary technology that integrates microelectronics, nucleic acid chemistry, image analysis and information technology. The operating principle is based on a foundation of molecular biology: the hybridization phenomenon, i.e. the pairing, by complementarity, of the bases of two DNA and/or RNA sequences. The biochip method is based on the use of capture probes attached to a solid substrate, on which probes a sample of target nucleotide fragments directly or indirectly labeled with fluorochromes is made to act. 
The capture probes are positioned specifically on the substrate or chip and each hybridization gives a specific piece of information, in relation to the target nucleotide fragment. The pieces of information obtained are cumulative, and make it possible, for example, to quantify the level of expression of one or more target genes. In order to analyze the expression of a target gene, a substrate comprising a multitude of probes, which correspond to all or part of the target gene, which is transcribed to mRNA, can then be prepared. For the purpose of the present invention, the term "low-density substrate" is intended to mean a substrate comprising fewer than 50 probes. For the purpose of the present invention, the term "medium-density substrate" is intended to mean a substrate comprising from 50 probes to 10 000 probes. For the purpose of the present invention, the term "high-density substrate" is intended to mean a substrate comprising more than 10 000 probes.
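By way of illustration only, the three density classes defined above can be expressed as a small Python helper. This is a minimal sketch: the function name `substrate_density` is hypothetical and is not part of the invention; only the thresholds (fewer than 50, from 50 to 10 000, more than 10 000 probes) come from the text.

```python
def substrate_density(n_probes: int) -> str:
    """Classify a substrate by its number of probes.

    Thresholds follow the definitions in the text:
    fewer than 50 probes -> low density,
    from 50 to 10 000 probes -> medium density,
    more than 10 000 probes -> high density.
    """
    if n_probes < 0:
        raise ValueError("probe count cannot be negative")
    if n_probes < 50:
        return "low-density"
    if n_probes <= 10_000:
        return "medium-density"
    return "high-density"


# A 28-probe panel falls in the low-density class;
# an array with 54,675 probe sets falls in the high-density class.
print(substrate_density(28))
print(substrate_density(54_675))
```

Under these definitions, a substrate carrying one probe per gene of a 28-gene panel would be a low-density substrate.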
[0058] The cDNAs or cRNAs specific for a target gene that it is desired to analyze are then hybridized, for example, to specific capture probes. After hybridization, the substrate or chip is washed and the labeled cDNA or cRNA/capture probe complexes are revealed by means of a high-affinity ligand bound, for example, to a fluorochrome-type label. The fluorescence is read, for example, with a scanner and the analysis of the fluorescence is processed by information technology. By way of indication, mention may be made of the DNA chips developed by the company Affymetrix ("Accessing Genetic Information with High-Density DNA arrays", M. Chee et al., Science, 1996, 274, 610-614 [7]. "Light-generated oligonucleotide arrays for rapid DNA sequence analysis", A. Caviani Pease et al., Proc. Natl. Acad. Sci. USA, 1994, 91, 5022-5026 [8]), for molecular diagnoses. In this technology, the capture probes are generally small in size, around 25 nucleotides. Other examples of biochips are given in the publications by G. Ramsay, Nature Biotechnology, 1998, No. 16, p. 40-44 [9]; F. Ginot, Human Mutation, 1997, No. 10, p. 1-10 [10]; J. Cheng et al, Molecular diagnosis, 1996, No. 1 (3), p. 183-200 [11]; T. Livache et al, Nucleic Acids Research, 1994, No. 22 (15), p. 2915-2921 [12]; J. Cheng et al, Nature Biotechnology, 1998, No. 16, p. 541-546 [13] or in U.S. Pat. No. 4,981,783, U.S. Pat. No. 5,700,637, U.S. Pat. No. 5,445,934, U.S. Pat. No. 5,744,305 and U.S. Pat. No. 5,807,522. The main characteristic of the solid substrate should be to conserve the hybridization characteristics of the capture probes on the target nucleotide fragments while at the same time generating a minimum background noise for the method of detection. Three main types of fabrication can be distinguished for immobilizing the probes on the substrate.
[0059] First of all, there is a first technique which consists in depositing pre-synthesized probes. The attachment of the probes is carried out by direct transfer, by means of micropipettes or of microdots or by means of an inkjet device. This technique allows the attachment of probes having a size ranging from a few bases (5 to 10) up to relatively large sizes of 60 bases (printing) to a few hundred bases (microdeposition).
[0060] Printing is an adaptation of the method used by inkjet printers. It is based on the propulsion of very small spheres of fluid (volume <1 nl) at a rate that may reach 4000 drops/second. The printing does not involve any contact between the system releasing the fluid and the surface on which it is deposited.
[0061] Microdeposition consists in attaching long probes of a few tens to several hundred bases to the surface of a glass slide. These probes are generally extracted from databases and are in the form of amplified and purified products. This technique makes it possible to produce chips called microarrays that carry approximately ten thousand spots, called recognition zones, of DNA on a surface area of a little less than 4 cm². The use of nylon membranes, referred to as "macroarrays", which carry products that have been amplified, generally by PCR, with a diameter of 0.5 to 1 mm and the maximum density of which is 25 spots/cm², should not however be forgotten. This very flexible technique is used by many laboratories. In the present invention, the latter technique is considered to be included among biochips. A certain volume of sample can, however, be deposited at the bottom of a microtitration plate, in each well, as is the case in patent applications WO-A-00/71750 and FR 00/14896, or a certain number of drops that are separate from one another can be deposited at the bottom of one and the same Petri dish, according to another patent application, FR 00/14691.
[0062] The second technique for attaching the probes to the substrate or chip is called in situ synthesis. This technique results in the production of short probes directly at the surface of the chip. It relies on in situ oligonucleotide synthesis (see, in particular, patent applications WO 89/10977 and WO 90/03382) and derives from the oligonucleotide synthesizer process. It consists in moving a reaction chamber, in which the oligonucleotide extension reaction takes place, along the glass surface.
[0063] Finally, the third technique is called photolithography, which is the process behind the biochips developed by Affymetrix. It is also an in situ synthesis. Photolithography is derived from microprocessor techniques. The surface of the chip is modified by the attachment of photolabile chemical groups that can be light-activated. Once illuminated, these groups are capable of reacting with the 3' end of an oligonucleotide. By protecting this surface with masks of defined shapes, it is possible to selectively illuminate and therefore activate areas of the chip where it is desired to attach one or other of the four nucleotides. The successive use of different masks makes it possible to alternate cycles of protection/reaction and therefore to produce the oligonucleotide probes on spots of approximately a few tens of square micrometers (µm²). This resolution makes it possible to create up to several hundred thousand spots on a surface area of a few square centimeters (cm²). Photolithography has the advantage of being massively parallel: it makes it possible to create a chip of N-mers in only 4×N cycles. All these techniques can be used with the present invention. According to a preferred embodiment of the invention, the at least one specific reagent of step b) defined above comprises at least one hybridization probe which is preferably immobilized on a substrate. This substrate is preferably a low-, high- or medium-density substrate as defined above.
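The cycle count stated above for photolithography can be illustrated with a short calculation. The sketch below is an illustration only; the function names are hypothetical. It assumes one masking cycle per nucleotide type at each probe position, which gives the 4×N figure from the text, and compares it with the number of distinct N-mer sequences (4 to the power N) that such a synthesis could in principle address.

```python
def photolithography_cycles(probe_length: int, n_nucleotides: int = 4) -> int:
    """Cycles needed to build N-mers: one cycle per nucleotide type per position."""
    return n_nucleotides * probe_length


def distinct_sequences(probe_length: int, n_nucleotides: int = 4) -> int:
    """Number of different N-mer sequences over the nucleotide alphabet."""
    return n_nucleotides ** probe_length


# Probes of around 25 nucleotides, the size mentioned for this type of chip.
n = 25
print(photolithography_cycles(n))  # 4 x 25 cycles
print(distinct_sequences(n))       # 4 ** 25 possible 25-mers
```

The number of cycles grows linearly with probe length, whereas the number of possible sequences grows exponentially, which is why parallel synthesis is attractive for dense chips.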
[0064] These hybridization steps on a substrate comprising a multitude of probes may be preceded by an enzymatic amplification reaction step, as defined above, in order to increase the amount of target genetic material.
[0065] In step c), the determination of the expression level of a target gene can be carried out by any of the protocols known to those skilled in the art. In general, the expression of a target gene can be analyzed by detecting the mRNAs (messenger RNAs) that are transcribed from the target gene at a given moment or by detecting the proteins derived from these mRNAs.
[0066] The invention preferably relates to the determination of the expression level of a target gene by detection of the mRNAs derived from this target gene according to any of the protocols well known to those skilled in the art. According to a specific embodiment of the invention, the expression level of several target genes is determined simultaneously, by detection of several different mRNAs, each mRNA being derived from a target gene.
[0067] When the specific reagent comprises at least one amplification primer, it is possible to determine the expression level of the target gene in the following way: 1) After having extracted, as biological material, the total RNA (comprising the transfer RNAs (tRNAs), the ribosomal RNAs (rRNAs) and the messenger RNAs (mRNAs)) from a biological sample as presented above, a reverse transcription step is carried out in order to obtain the complementary DNAs (or cDNAs) of said mRNAs. By way of indication, this reverse transcription reaction can be carried out using a reverse transcriptase enzyme which makes it possible to obtain, from an RNA fragment, a complementary DNA fragment. The reverse transcriptase enzyme from AMV (Avian Myeloblastosis Virus) or from MMLV (Moloney Murine Leukaemia Virus) can in particular be used. When it is more particularly desired to obtain only the cDNAs of the mRNAs, this reverse transcription step is carried out in the presence of nucleotide fragments comprising only thymine bases (polyT), which hybridize by complementarity to the polyA sequence of the mRNAs so as to form a polyT-polyA complex which then serves as a starting point for the reverse transcription reaction carried out by the reverse transcriptase enzyme. cDNAs complementary to the mRNAs derived from a target gene (target-gene-specific cDNA) and cDNAs complementary to the mRNAs derived from genes other than the target gene (cDNAs not specific for the target gene) are then obtained. 2) The amplification primer(s) specific for a target gene is (are) brought into contact with the target-gene-specific cDNAs and the cDNAs not specific for the target gene. The amplification primer(s) specific for a target gene hybridize(s) with the target-gene-specific cDNAs and a predetermined region, of known length, of the cDNAs originating from the mRNAs derived from the target gene is specifically amplified. 
The cDNAs not specific for the target gene are not amplified, whereas a large amount of target-gene-specific cDNAs is then obtained. For the purpose of the present invention, reference is made, without distinction, to "target-gene-specific cDNAs" or to "cDNAs originating from the mRNAs derived from the target gene". This step can be carried out in particular by means of a PCR-type amplification reaction or by any other amplification technique as defined above. By PCR, it is also possible to simultaneously amplify several different cDNAs, each one being specific for different target genes, by using several pairs of different amplification primers, each one being specific for a target gene: reference is then made to multiplex amplification. 3) The expression of the target gene is determined by detecting and quantifying the target-gene-specific cDNAs obtained in step 2) above. This detection can be carried out after electrophoretic migration of the target-gene-specific cDNAs according to their size. The gel and the medium for the migration can include ethidium bromide so as to allow direct detection of the target-gene-specific cDNAs when the gel is placed, after a given migration period, on a UV (ultraviolet)-ray light table, through the emission of a light signal. The greater the amount of target-gene-specific cDNAs, the brighter this light signal. These electrophoresis techniques are well known to those skilled in the art. The target-gene-specific cDNAs can also be detected and quantified using a quantification range obtained by means of an amplification reaction carried out until saturation. In order to take into account the variability in enzymatic efficiency that may be observed during the various steps (reverse transcription, PCR, etc.), the expression of a target gene of various groups of patients can be normalized by simultaneously determining the expression of a "housekeeping" gene, the expression of which is similar in the various groups of patients. 
By taking the ratio of the expression of the target gene to the expression of the housekeeping gene, i.e. the ratio of the amount of target-gene-specific cDNAs to the amount of housekeeping-gene-specific cDNAs, any variability between the various experiments is thus corrected. Those skilled in the art may refer in particular to the following publications: Bustin S A, J Mol Endocrinol, 2002, 29: 23-39; Giulietti A, Methods, 2001, 25: 386-401.
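The normalization against a housekeeping gene described above can be sketched as follows. This is a minimal illustration: the function name and the cDNA amounts are hypothetical, and it simply computes the ratio of target-gene-specific cDNA to housekeeping-gene-specific cDNA.

```python
def normalized_expression(target_cdna: float, housekeeping_cdna: float) -> float:
    """Ratio of target-gene cDNA amount to housekeeping-gene cDNA amount."""
    if housekeeping_cdna <= 0:
        raise ValueError("housekeeping amount must be positive")
    return target_cdna / housekeeping_cdna


# Hypothetical amounts (arbitrary units) for the same sample measured in two
# experiments whose overall enzymatic efficiency differs by a factor of two.
run_a = normalized_expression(target_cdna=200.0, housekeeping_cdna=1000.0)
run_b = normalized_expression(target_cdna=100.0, housekeeping_cdna=500.0)

# The raw target amounts differ, but the normalized values agree.
print(run_a, run_b)
```

Because a change in efficiency scales both amounts by the same factor, the factor cancels in the ratio, which is the correction the paragraph describes.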
[0068] When the specific reagent comprises at least one hybridization probe, the expression of a target gene can be determined in the following way: 1) After having extracted, as biological material, the total RNA from a biological sample as presented above, a reverse transcription step is carried out as described above in order to obtain cDNAs complementary to the mRNAs derived from a target gene (target-gene-specific cDNA) and cDNAs complementary to the mRNAs derived from genes other than the target gene (cDNA not specific for the target gene). 2) All the cDNAs are brought into contact with a substrate, on which are immobilized capture probes specific for the target gene whose expression it is desired to analyze, in order to carry out a hybridization reaction between the target-gene-specific cDNAs and the capture probes, the cDNAs not specific for the target gene not hybridizing to the capture probes. The hybridization reaction can be carried out on a solid substrate which includes all the materials as indicated above. According to a preferred embodiment, the hybridization probe is immobilized on a substrate. Preferably, the substrate is a low-, high- or medium-density substrate as defined above. The hybridization reaction may be preceded by a step consisting of enzymatic amplification of the target-gene-specific cDNAs as described above, so as to obtain a large amount of target-gene-specific cDNAs and to increase the probability of a target-gene-specific cDNA hybridizing to a capture probe specific for the target gene. The hybridization reaction may also be preceded by a step consisting in labeling and/or cleaving the target-gene-specific cDNAs as described above, for example using a labeled deoxyribonucleotide triphosphate for the amplification reaction. The cleavage can be carried out in particular by the action of imidazole and manganese chloride. 
The target-gene-specific cDNA can also be labeled after the amplification step, for example by hybridizing a labeled probe according to the sandwich hybridization technique described in document WO-A-91/19812. Other preferred specific methods for labeling and/or cleaving nucleic acids are described in applications WO 99/65926, WO 01/44507, WO 01/44506, WO 02/090584, WO 02/090319. 3) A step consisting of detection of the hybridization reaction is subsequently carried out. The detection can be carried out by bringing the substrate on which the capture probes specific for the target gene are hybridized with the target-gene-specific cDNAs into contact with a "detection" probe labeled with a label, and detecting the signal emitted by the label. When the target-gene-specific cDNA has been labeled beforehand with a label, the signal emitted by the label is detected directly.
[0069] When the at least one specific reagent brought into contact in step b) comprises at least one hybridization probe, the expression of a target gene can also be determined in the following way: 1) After having extracted, as biological material, the total RNA from a biological sample as presented above, a reverse transcription step is carried out as described above in order to obtain the cDNAs of the mRNAs of the biological material. The polymerization of the complementary RNA of the cDNA is subsequently carried out using a T7 polymerase enzyme which functions under the control of a promoter and which makes it possible to obtain, from a DNA template, the complementary RNA. The cRNAs of the cDNAs of the mRNAs specific for the target gene (reference is then made to target-gene-specific cRNA) and the cRNAs of the cDNAs of the mRNAs not specific for the target gene are then obtained. 2) All the cRNAs are brought into contact with a substrate on which are immobilized capture probes specific for the target gene whose expression it is desired to analyze, in order to carry out a hybridization reaction between the target-gene-specific cRNAs and the capture probes, the cRNAs not specific for the target gene not hybridizing to the capture probes. When it is desired to simultaneously analyze the expression of several target genes, several different capture probes can be immobilized on the substrate, each one being specific for a target gene. The hybridization reaction may also be preceded by a step consisting in labeling and/or cleaving the target-gene-specific cRNAs as described above. 3) A step consisting of detection of the hybridization reaction is subsequently carried out. The detection can be carried out by bringing the substrate on which the capture probes specific for the target gene are hybridized with the target-gene-specific cRNA into contact with a "detection" probe labeled with a label, and detecting the signal emitted by the label. 
When the target-gene-specific cRNA has been labeled beforehand with a label, the signal emitted by the label is detected directly. The use of cRNA is particularly advantageous when a substrate of biochip type on which a large number of probes are hybridized is used.
[0070] The invention also relates to a substrate comprising at least 4 hybridization probes selected from probes specific for the target genes with a nucleic acid sequence having any one of SEQ ID NOs 1 to 44, and in particular 4 hybridization probes specific for the target genes with a nucleic acid sequence having any one of SEQ ID NOs 1, 2 or 3, 4 and 5 or 6.
[0071] The invention further relates to the use of a substrate as defined above, for discriminating BC from BBD.
[0072] The present invention also concerns a kit for discriminating breast cancer from benign breast disease in a biological sample from a patient, comprising at least one specific reagent for at least one target gene and no more than 28 specific reagents for 28 target genes comprising the nucleic acid sequences set forth in SEQ ID NOs 1 to 44, wherein the at least one reagent is specific for at least a target gene comprising a nucleic acid sequence selected from the nucleic acid sequences set forth in SEQ ID NOs: 1, 2 or 3, 4 and 5 or 6.
[0073] The specific reagents can target a combination of at least two, three or four genes, as described above in more detail, but no more than 28 genes. In one embodiment, the kit comprises reagents specific for a combination of at least 4 and no more than 28 target genes, wherein the reagents include at least reagents specific for the target genes comprising the nucleic acid sequence set forth in SEQ ID NOs 1, 2 or 3, 4 and 5 or 6, respectively. In another embodiment the kit comprises reagents specific for a combination of 28 target genes, wherein the reagents include reagents specific for the target genes comprising the nucleic acid sequence set forth in SEQ ID NOs 1 to 28.
EXAMPLES
[0074] I) Materials and Methods
[0075] 1. Characteristics of Patients and Samples
[0076] Blood samples were collected from 84 patients with breast cancer and 94 patients with benign breast disease in this study. All patients had been referred to the Breast Surgery Department of Cancer Hospital, Fudan University (Shanghai, China) with suspected breast cancer between July 2007 and December 2008. Each of them underwent mammographic screening in the hospital, and the BI-RADS category of each patient was determined by three professional radiologists. About 2.5 ml of peripheral blood was collected from each of the 84 women with BC and the 94 women with BBD, in PAXgene™ Blood RNA tubes (PreAnalytix) containing an RNA stabilizing solution. All blood samples were collected before the fine-needle aspiration, or any other invasive step, that was indicated for cytological investigation of the suspected breast lesion. Diagnosis of breast cancer was made on the basis of identification of cancer cells in the core-needle biopsy or surgical specimen. Diagnosis of benign disease was made on the basis of the absence of cancer cells at open biopsy. The protocol was approved by the local Ethical Committee for Clinical Research and written informed consent was obtained from all the patients recruited for the study. Final pathologic tumor stage was determined with the TNM staging system and tumors were graded using the Nottingham system. In addition to tumor type and tumor grade, estrogen receptor (ER), progesterone receptor (PR) and Human Epidermal growth factor Receptor 2 (HER2) status and lymph node status were assessed in each tumor.
[0077] 2. RNA Extraction and Microarray Analysis
[0078] Total RNA was extracted with the PAXgene Blood RNA® kit (PreAnalytix) according to the manufacturer's instructions. The quantity of total RNA was measured by spectrophotometry at an optical density (OD) of 260 nm and the quality was assessed using the RNA 6000 Nano LabChip on a 2100 Bioanalyzer (Agilent Technologies). Only samples with an RNA Integrity Number (RIN) between 7 and 10 were analyzed. 50 ng of total RNA was then reverse transcribed and linearly amplified to single-strand cDNA using Ribo-SPIA Ovation technology with the WT-Ovation RNA Amplification System (NuGen Technologies), according to the manufacturer's standard protocol, and the products were purified with the QIAquick PCR purification kit (Qiagen GmbH). 2 µg of amplified and purified cDNA was subsequently fragmented with RQ1 RNase-Free DNase (Promega corporation) and labeled with biotinylated deoxynucleoside triphosphates by Terminal Transferase (Roche Diagnostics GmbH) and DNA labeling reagent (Affymetrix). The labeled cDNA was hybridized onto the HG U133 plus 2.0 Array (Affymetrix) in a Hybridization Oven 640 (Affymetrix) at 60 rpm and 50°C for 18 h. The HG U133 plus 2.0 Array contains 54,675 probe sets representing approximately 39,000 best characterized human genes. After hybridization, the arrays were washed and stained according to the Affymetrix protocol EukGE-WS2v4 using an Affymetrix fluidic station FS450. The arrays were scanned with the Affymetrix scanner 3000.
[0079] 3. Microarray Data Analysis
[0080] Quality Control and Preprocessing. Quality control analyses were performed according to the standard Affymetrix quality control parameters. Based on these evaluation criteria, all blood sample measurements fulfilled the minimal quality requirements. The Affymetrix expression arrays were preprocessed by RMA (Robust Multi-chip Average) [10] with background correction, quantile normalization and median polish summarization. Probesets with extreme signal intensity (lower than 50 or higher than 2¹⁴) were filtered out. Then, filtering based on sequence information was performed according to the Entrez Gene database. Probesets without Entrez Gene ID annotation were removed. Where multiple probesets mapped to the same Entrez Gene ID, only the probeset with the largest Interquartile Range was retained and the others were removed. Finally, to reduce the likelihood of batch effects, a normalization algorithm, ComBat [11], was applied. The ComBat method (statistics.byu.edu/johnson/ComBat/) applies either a parametric or a nonparametric empirical Bayes framework for adjusting batch effects in a given data set.
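The probeset filtering steps described in this paragraph can be sketched in Python as below. This is an illustration under stated assumptions, not the code used in the study (which was run in R with Bioconductor): the function names and toy data are hypothetical, "extreme signal intensity" is interpreted here as the mean intensity across samples, and the bounds are parameters whose defaults (50 and 2**14) are an assumed reading of the thresholds.

```python
from statistics import mean, quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1) of a list of intensities."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1


def filter_probesets(probesets, low=50.0, high=2 ** 14):
    """Apply the three filters described in the text.

    probesets: dict mapping probeset name -> (entrez_id or None, intensities).
    1. Drop probesets without an Entrez Gene ID.
    2. Drop probesets whose mean intensity is outside [low, high]
       (assumption: the text does not say which statistic is thresholded).
    3. For each Entrez Gene ID keep only the probeset with the largest IQR.
    """
    best = {}  # entrez_id -> (iqr, probeset name)
    for name, (entrez_id, values) in probesets.items():
        if entrez_id is None:
            continue
        if not (low <= mean(values) <= high):
            continue
        spread = iqr(values)
        if entrez_id not in best or spread > best[entrez_id][0]:
            best[entrez_id] = (spread, name)
    return sorted(name for _, name in best.values())


# Hypothetical toy data: names, gene IDs and intensities are invented.
toy = {
    "ps_a": (101, [100, 200, 300, 400]),   # same gene as ps_b, larger IQR
    "ps_b": (101, [240, 250, 260, 270]),
    "ps_c": (None, [500, 600, 700, 800]),  # no Entrez Gene ID
    "ps_d": (202, [10, 20, 30, 40]),       # mean intensity below 50
    "ps_e": (303, [900, 950, 1000, 1100]),
}
print(filter_probesets(toy))
```

Batch adjustment with ComBat is not reproduced here; it would be applied to the retained probesets after this filtering.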
[0081] 4. Molecular Signature Identification.
[0082] After pre-processing to reduce redundant probesets and batch variation across the expression data, molecular signature identification was performed on the preprocessed expression data. The 84 BC and 94 BBD samples with mammographic results and confirmed pathologic information were categorized into two groups: 79 BC+73 BBD with BI-RADS 1-5, and 5 BC+21 BBD with BI-RADS 0. The 79 BC+73 BBD samples with BI-RADS 1-5 were used as the training set to identify genes of interest by a Recursive Feature Elimination (RFE) procedure and to build the classification model with a Support Vector Machine (SVM) [12-13]. Within the training set, a 5-fold cross-validation process was conducted to determine the optimal gene sets. A list of the top 100 genes was identified by RFE on four fifths of the training set. A classification model was created from these top 100 genes and tested on the remaining fifth of the training set. This process was run for 1000 iterations, generating one thousand top-100 gene lists. Finally, the genes that appeared in all one thousand top-100 gene lists were identified as the most robust genes and were used to generate the final model on the whole training set. The model was then applied to the completely unseen samples, 5 BC+21 BBD with BI-RADS 0.
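The repeated-resampling selection described in this paragraph (rank genes on four fifths of the training set, keep the top list, repeat, and retain only genes present in every list) can be sketched as follows. This is a simplified illustration, not the procedure used in the study: the ranking step here uses the absolute difference of class means as a stand-in for SVM-based Recursive Feature Elimination, the held-out fifth is not used to test a classifier, and all names and data are hypothetical.

```python
import random
from statistics import mean


def rank_genes(X, y, rows):
    """Stand-in ranking (NOT SVM-RFE): sort genes by |mean(class 1) - mean(class 0)|."""
    n_genes = len(X[0])
    scores = []
    for g in range(n_genes):
        pos = [X[i][g] for i in rows if y[i] == 1]
        neg = [X[i][g] for i in rows if y[i] == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)


def stable_genes(X, y, top_k, n_iter=1000, n_folds=5, seed=0):
    """Genes that appear in the top_k list of every resampling iteration."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    stable = None
    for _ in range(n_iter):
        rng.shuffle(indices)
        held_out = set(indices[: len(indices) // n_folds])  # one fifth set aside
        train_rows = [i for i in indices if i not in held_out]
        top = set(rank_genes(X, y, train_rows)[:top_k])
        stable = top if stable is None else stable & top
    return sorted(stable)


# Hypothetical toy data: 20 samples, 6 genes; genes 0 and 1 differ between classes.
data_rng = random.Random(1)
y = [1] * 10 + [0] * 10
X = [
    [5 * label + data_rng.uniform(-1, 1), -5 * label + data_rng.uniform(-1, 1)]
    + [data_rng.uniform(-1, 1) for _ in range(4)]
    for label in y
]
print(stable_genes(X, y, top_k=2, n_iter=50))
```

Requiring a gene to appear in every list is a strict criterion: a gene that is informative only for a particular subset of samples is dropped, which favors a smaller and more robust panel.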
[0083] The preprocessing and statistical steps were executed using R-environment with Bioconductor libraries [14-18].
[0084] II) Results
[0085] 1. Patient Characteristics
[0086] The present study was performed on 178 samples from 84 BC and 94 BBD patients with mammographic results and confirmed pathologic information, which were then categorized into two groups: 79 BC+73 BBD with BI-RADS 1-5, and 5 BC+21 BBD with BI-RADS 0. Table 2 summarizes the clinical characteristics of these BC and BBD patient populations. Briefly, 92% of the cancer patients presented a T0-T2 tumor; 70% and 32% of the tumors were hormone receptor positive and Her2 positive, respectively. Benign findings included 51.1% breast disease, 27.7% breast fibroadenoma and 21.2% intracanalicular papilloma.
TABLE 2 Characteristics of the population

Benign Breast Disease (BBD): 94 patients
  Age (years): median 47.4; range 34-75
  Menopausal status
    Postmenopausal: 30 (33.7%)
    Premenopausal: 59 (66.3%)
    Not determined: 5
  Type of disease
    Breast disease: 48 (51.1%)
    Breast fibroadenoma: 26 (27.7%)
    Intracanalicular papilloma: 20 (21.2%)

Breast cancer (BC): 84 patients
  Age (years): median 42.5; range 31-77
  Tumor type
    Ductal carcinoma in Situ (DCIS): 11 (13.1%)
    Intra Ductal carcinoma (IDC): 73 (86.9%)
  Tumor size
    T1 (0.1-2 cm): 44 (52.4%)
    T2 (>2-5 cm): 34 (40.5%)
    T3 (>5 cm): 1 (1.2%)
    Unknown: 5 (5.9%)
  Nodal status
    Positive: 25 (29.8%)
    Negative: 57 (67.8%)
    Unknown: 2 (2.4%)
  TNM Stage
    0: 10 (11.9%)
    I: 28 (33.3%)
    II: 33 (39.3%)
    III: 11 (13.1%)
    Unknown: 2 (2.4%)
  Histological grade
    I: 1 (1.2%)
    I-II: 3 (3.6%)
    II: 43 (51.2%)
    II-III: 8 (9.5%)
    III: 18 (21.4%)
    Unknown: 11 (13.1%)
  Estrogen receptor status
    Negative: 19 (22.6%)
    Positive: 65 (77.4%)
  Progesterone receptor status
    Negative: 20 (23.8%)
    Positive: 64 (76.2%)
  Her-2 status
    Negative: 53 (63.1%)
    Positive: 31 (36.9%)
[0087] 2. Construction and Performance of the Model
[0088] Using the Recursive Feature Elimination (RFE) procedure and Support Vector Machine (SVM) classification, a 28-gene panel (Table 1) was developed to discriminate BC from BBD patients with BI-RADS 1-5. This 28-gene panel was then tested in the BI-RADS 0 group.
[0089] Among the 28 predictive genes, 15 are down-regulated in BC compared to BBD and 13 are up-regulated in BC versus BBD, as summarized in Table 3.
TABLE 3

SEQ ID NOs  Affymetrix probeset  Abbreviated name  Mean signal  P-value        Fold change  Expression in BC versus BBD
1           209395_at            CHI3L1            271          5.74 × 10^-3   1.22         Down-regulated
2-3         1552552_s_at         CLEC4C            49           5.59 × 10^-3   1.20         Down-regulated
4           206881_s_at          LILRA3            73           4 × 10^-6      1.43         Down-regulated
5-6         204141_at            TUBB2A            684          5.82 × 10^-2   1.30         Down-regulated
7           213790_at            ADAM12            74           2.53 × 10^-3   1.13         Up-regulated
8           226736_at            CHURC1            124          5.54 × 10^-4   1.26         Up-regulated
9           230720_at            RNF182            49           3.52 × 10^-3   1.58         Up-regulated
10-13       220532_at            TMEM176B          97           1.70 × 10^-2   1.21         Up-regulated
14-15       219629_at            FAM118A           100          1.49 × 10^-1   1.12         Up-regulated
16          156960_s_at          ANKRD20A          70           7.80 × 10^-2   1.11         Down-regulated
17-19       206785_s_at          KLRC1/2           93           4.87 × 10^-2   1.15         Down-regulated
20          225525_at            KIAA1671          69           1.75 × 10^-2   1.12         Up-regulated
21          1554469_at           ZBTB44            58           2.16 × 10^-3   1.13         Down-regulated
22-23       235126_at            LQK1              83           2.66 × 10^-2   1.14         Up-regulated
24-25       210873_x_at          APOBEC3A          335          3.52 × 10^-1   1.12         Down-regulated
26          229187_at            LOC283788         94           1.91 × 10^-1   1.08         Up-regulated
27          1559140_at           FAM87A/B          68           2.32 × 10^-2   1.09         Up-regulated
28          242770_at            LOC642236         49           2.35 × 10^-2   1.14         Up-regulated
29          214428_x_at          C4A/B             55           4.77 × 10^-2   1.11         Down-regulated
30          1554094_at           ENTPDS            87           4.70 × 10^-5   1.11         Down-regulated
31          215610_at            LOC728263         89           2.03 × 10^-3   1.09         Up-regulated
32          1553623_at           MGC15705          79           2.57 × 10^-2   1.08         Down-regulated
33-34       242687_at            FAM160A1          50           2.48 × 10^-2   1.08         Up-regulated
35          219700_at            PLXDC1            107          3.82 × 10^-3   1.14         Down-regulated
36          33323_r_at           SFN               54           1.26 × 10^-1   1.09         Down-regulated
37-39       208791_at            CLU               112          2.37 × 10^-1   1.08         Up-regulated
40-42       205048_s_at          PSPH              68           4.18 × 10^-1   1.06         Down-regulated
43-44       212999_x_at          HLA-DQB1          120          1.00 × 10^-1   1.23         Down-regulated
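As a consistency check on Table 3, the following minimal sketch tallies the direction of regulation across the 28 genes and lists the genes with the largest fold changes. The gene names, fold changes and directions are transcribed from Table 3; the variable names are hypothetical, and the fold change is treated as a magnitude because the table lists values above 1 for both directions.

```python
from collections import Counter

# (abbreviated name, fold change, direction in BC versus BBD), from Table 3
TABLE_3 = [
    ("CHI3L1", 1.22, "down"), ("CLEC4C", 1.20, "down"),
    ("LILRA3", 1.43, "down"), ("TUBB2A", 1.30, "down"),
    ("ADAM12", 1.13, "up"), ("CHURC1", 1.26, "up"),
    ("RNF182", 1.58, "up"), ("TMEM176B", 1.21, "up"),
    ("FAM118A", 1.12, "up"), ("ANKRD20A", 1.11, "down"),
    ("KLRC1/2", 1.15, "down"), ("KIAA1671", 1.12, "up"),
    ("ZBTB44", 1.13, "down"), ("LQK1", 1.14, "up"),
    ("APOBEC3A", 1.12, "down"), ("LOC283788", 1.08, "up"),
    ("FAM87A/B", 1.09, "up"), ("LOC642236", 1.14, "up"),
    ("C4A/B", 1.11, "down"), ("ENTPDS", 1.11, "down"),
    ("LOC728263", 1.09, "up"), ("MGC15705", 1.08, "down"),
    ("FAM160A1", 1.08, "up"), ("PLXDC1", 1.14, "down"),
    ("SFN", 1.09, "down"), ("CLU", 1.08, "up"),
    ("PSPH", 1.06, "down"), ("HLA-DQB1", 1.23, "down"),
]

# Number of genes per direction of regulation.
direction_counts = Counter(direction for _, _, direction in TABLE_3)

# Three genes with the largest fold-change magnitude.
largest = sorted(TABLE_3, key=lambda row: row[1], reverse=True)[:3]

print(len(TABLE_3), dict(direction_counts))
print([name for name, _, _ in largest])
```

The tally gives 15 down-regulated and 13 up-regulated genes, matching the counts stated in paragraph [0089].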
[0090] 4-Gene Signature
[0091] In a first training set, the 4-gene panel CHI3L1, CLEC4C, LILRA3 and TUBB2A classified samples as malignant or benign with an estimated accuracy of 71% (76% sensitivity and 66% specificity).
[0092] Of the 79 breast cancer samples, 60 were classified correctly, while 48 of the 73 benign samples were assigned to the correct class (Table 4a).
TABLE 4a Classification value for the identified signature on Training Dataset

                                  Prediction outcome
Training set                      BBD     BC
Pathological diagnosis   BBD      48      25
                         BC       19      60

Accuracy = 71%, Sensitivity = 76%, Specificity = 66%
[0093] The performance metrics of the model on the independent BI-RADS 0 test set are reported in Table 4b. Three of the five cancer samples were correctly classified, and 8 of the 21 benign samples were correctly classified, giving a sensitivity of 60% and a specificity of 38%. The accuracy of the model on the BI-RADS 0 test set was 42%.
TABLE 4b Classification value for the identified signature on Independent Test Dataset

                                  Prediction outcome
Test set                          BBD     BC
Pathological diagnosis   BBD      8       13
                         BC       2       3

Accuracy = 42%, Sensitivity = 60%, Specificity = 38%
[0094] 28-Gene Signature
[0095] In the training set, the 28-gene panel classified samples as malignant or benign with an estimated accuracy of 88% (94% sensitivity and 84% specificity).
[0096] Of the 79 breast cancer samples, 74 were classified correctly, while 61 of the 73 benign samples were assigned to the correct class (Table 5a).
TABLE 5a Classification value for the identified signature on Training Dataset

                                  Prediction outcome
Training set                      BBD     BC
Pathological diagnosis   BBD      61      12
                         BC       5       74

Accuracy = 88%, Sensitivity = 94%, Specificity = 84%
[0097] The performance metrics of the model on the independent BI-RADS 0 test set are reported in Table 5b. Four of the five cancer samples were correctly classified, and 15 of the 21 benign samples were correctly classified, giving a sensitivity of 80% and a specificity of 71%. The accuracy of the model on the BI-RADS 0 test set was 73%.
TABLE 5b Classification value for the identified signature on Independent Test Dataset

                                  Prediction outcome
Test set                          BBD     BC
Pathological diagnosis   BBD      15      6
                         BC       1       4

Accuracy = 73%, Sensitivity = 80%, Specificity = 71%
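The accuracy, sensitivity and specificity quoted for both signatures follow directly from the four confusion matrices (Tables 4a, 4b, 5a and 5b). The sketch below recomputes them, treating BC as the positive class; the counts are taken from the tables, while the class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tn: int  # BBD predicted as BBD
    fp: int  # BBD predicted as BC
    fn: int  # BC predicted as BBD
    tp: int  # BC predicted as BC

    def accuracy(self):
        return (self.tp + self.tn) / (self.tp + self.tn + self.fp + self.fn)

    def sensitivity(self):
        # Fraction of BC samples correctly classified.
        return self.tp / (self.tp + self.fn)

    def specificity(self):
        # Fraction of BBD samples correctly classified.
        return self.tn / (self.tn + self.fp)


TABLES = {
    "4a (4-gene, training)": Confusion(tn=48, fp=25, fn=19, tp=60),
    "4b (4-gene, BI-RADS 0 test)": Confusion(tn=8, fp=13, fn=2, tp=3),
    "5a (28-gene, training)": Confusion(tn=61, fp=12, fn=5, tp=74),
    "5b (28-gene, BI-RADS 0 test)": Confusion(tn=15, fp=6, fn=1, tp=4),
}

for name, cm in TABLES.items():
    print(f"Table {name}: accuracy={cm.accuracy():.1%}, "
          f"sensitivity={cm.sensitivity():.1%}, "
          f"specificity={cm.specificity():.1%}")
```

The recomputed values agree with the reported percentages to the nearest whole percent, with one exception: the training accuracy of the 28-gene panel is 135/152, or about 88.8%, which is reported as 88%. The BI-RADS 0 test set contains only five cancer samples, so each misclassified cancer changes the test sensitivity by 20 percentage points.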
[0098] The inventors also analyzed whether any clinical characteristics were significantly overrepresented among the incorrectly predicted subjects. They found that the only false negative case in the test set was a 46-year-old woman who had Paget's disease and DCIS.
BIBLIOGRAPHIC REFERENCES
[0099] 1. Margaret M. Eberl, MPH, Chester H. Fox, Stephen B. Edge, Cathleen A. Carter, and Martin C. Mahoney. BI-RADS Classification for Management of Abnormal Mammograms, The Journal of the American Board of Family Medicine 19:161-1
[0100] 2. Whitney A R, Diehn M, Popper S J, Alizadeh A A, Boldrick J C, Relman D A, Brown P O. Individuality and variation in gene expression patterns in human blood. Proc Natl Acad Sci USA. 2003, 18;100(4):1896-901.
[0101] 3. P. E. Nielsen et al, Science, 254, 1497-1500 (1991).
[0102] 4. Kricka et al., Clinical Chemistry, 1999, no 45 (4), p. 453-458.
[0103] 5. Keller G. H. et al., DNA Probes, 2nd Ed., Stockton Press, 1993, sections 5 and 6, p. 173-249.
[0104] 6. Tyagi & Kramer, Nature Biotech, 1996, 14:303-308.
[0105] 7. M. Chee et al., Science, 1996, 274, 610-614.
[0106] 8. A. Caviani Pease et al., Proc. Natl. Acad. Sci. USA, 1994, 91, 5022-5026.
[0107] 9. G. Ramsay, Nature Biotechnology, 1998, No. 16, p. 40-44.
[0108] 10. F. Ginot, Human Mutation, 1997, No. 10, p. 1-10.
[0109] 11. J. Cheng et al, Molecular diagnosis, 1996, No. 1 (3), p. 183-200.
[0110] 12. T. Livache et al, Nucleic Acids Research, 1994, No. 22 (15), p. 2915-2921.
[0111] 13. J. Cheng et al, Nature Biotechnology, 1998, No. 16, p. 541-546.
[0112] 14. Harris Drucker, Chris J. C. Burges, Linda Kaufman, Alex Smola and Vladimir Vapnik (1997). "Support Vector Regression Machines". Advances in Neural Information Processing Systems 9, NIPS 1996, 155-161, MIT Press.
[0113] 15. R Development Core Team (2009). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL www.R-project.org
[0114] 16. Gentleman R C, Carey V J, Bates D M, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, et al.: Bioconductor: open software development for computational biology and bioinformatics.
[0115] 17. Crispin J Miller. simpleaffy (2009): Very simple high level analysis of Affymetrix data. R package version 2.22.0. www.bioconductor.org, bioinformatics.picr.man.ac.uk/simpleaffy/
[0116] 18. R. Gentleman, V. Carey, W. Huber and F. Hahne (2009). genefilter: genefilter: methods for filtering genes from microarray experiments. R package version 1.28.0.
Sequence CWU
1
1
4411792DNAHomo sapiens 1agtggagtgg gacaggtata taaaggaagt acagggcctg
gggaagaggc cctgtctagg 60tagctggcac caggagccgt gggcaaggga agaggccaca
ccctgccctg ctctgctgca 120gccagaatgg gtgtgaaggc gtctcaaaca ggctttgtgg
tcctggtgct gctccagtgc 180tgctctgcat acaaactggt ctgctactac accagctggt
cccagtaccg ggaaggcgat 240gggagctgct tcccagatgc ccttgaccgc ttcctctgta
cccacatcat ctacagcttt 300gccaatataa gcaacgatca catcgacacc tgggagtgga
atgatgtgac gctctacggc 360atgctcaaca cactcaagaa caggaacccc aacctgaaga
ctctcttgtc tgtcggagga 420tggaactttg ggtctcaaag attttccaag atagcctcca
acacccagag tcgccggact 480ttcatcaagt cagtaccgcc atttctgcgc acccatggct
ttgatgggct ggaccttgcc 540tggctctacc ctggacggag agacaaacag cattttacca
ccctaatcaa ggaaatgaag 600gccgaattta taaaggaagc ccagccaggg aaaaagcagc
tcctgctcag cgcagcactg 660tctgcgggga aggtcaccat tgacagcagc tatgacattg
ccaagatatc ccaacacctg 720gatttcatta gcatcatgac ctacgatttt catggagcct
ggcgtgggac cacaggccat 780cacagtcccc tgttccgagg tcaggaggat gcaagtcctg
acagattcag caacactgac 840tatgctgtgg ggtacatgtt gaggctgggg gctcctgcca
gtaagctggt gatgggcatc 900cccaccttcg ggaggagctt cactctggct tcttctgaga
ctggtgttgg agccccaatc 960tcaggaccgg gaattccagg ccggttcacc aaggaggcag
ggacccttgc ctactatgag 1020atctgtgact tcctccgcgg agccacagtc catagaatcc
tcggccagca ggtcccctat 1080gccaccaagg gcaaccagtg ggtaggatac gacgaccagg
aaagcgtcaa aagcaaggtg 1140cagtacctga aggacaggca gctggcgggc gccatggtat
gggccctgga cctggatgac 1200ttccagggct ccttctgtgg ccaggatctg cgcttccctc
tcaccaatgc catcaaggat 1260gcactcgctg caacgtagcc ctctgttctg cacacagcac
gggggccaag gatgccccgt 1320ccccctctgg ctccagctgg ccgggagcct gatcacctgc
cctgctgagt cccaggctga 1380gcctcagtct ccctcccttg gggcctatgc agaggtccac
aacacacaga tttgagctca 1440gccctggtgg gcagagaggt agggatgggg ctgtggggat
agtgaggcat cgcaatgtaa 1500gactcgggat tagtacacac ttgttgatta atggaaatgt
ttacagatcc ccaagcctgg 1560caagggaatt tcttcaactc cctgcccccc agccctcctt
atcaaaggac accattttgg 1620caagctctat caccaaggag ccaaacatcc tacaagacac
agtgaccata ctaattatac 1680cccctgcaaa gcccagcttg aaaccttcac ttaggaacgt
aatcgtgtcc cctatcctac 1740ttccccttcc taattccaca gctgctcaat aaagtacaag
agcttaacag tg 17922922DNAHomo sapiens 2aattgaaagc tttagctcac
tgcagagtct cctaagtcac atctcttcct ttgcaagagt 60aggcgaagaa ggatctaagg
gcttggcttg tttgaaagaa ccacaccccg aaagtaacat 120ctttggagaa agtgatacaa
gagcttctgc acccacctga tagaggaagt ccaaagggtg 180tgcgcacaca caatggtgcc
tgaagaagag cctcaagacc gagtgcctca caattttatg 240tatagcaaaa ctgtcaagag
gctgtccaag ttacgagagt atcaacagta tcatccaagc 300ctgacctgcg tcatggaagg
aaaggacata gaagattgga gctgctgccc aaccccttgg 360acttcatttc agtctagttg
ctactttatt tctactggga tgcaatcttg gactaagagt 420caaaagaact gttctgtgat
gggggctgat ctggtggtga tcaacaccag ggaagaacag 480gatttcatca ttcagaatct
gaaaagaaat tcttcttatt ttctggggct gtcagatcca 540gggggtcggc gacattggca
atgggttgac cagacaccat acaatgaaaa tgtcacattc 600tggcactcag gtgaacccaa
taaccttgat gagcgttgtg cgataataaa tttccgttct 660tcagaagaat ggggctggaa
tgacattcac tgtcatgtac ctcagaagtc aatttgcaag 720atgaagaaga tctacatata
aatgaaatat tctccctgga aatgtgtttg ggttggcatc 780caccgttgta gaaagctaaa
ttgatttttt aatttatgtg taagttttgt acaaggaatg 840cccctaaaat gtttcagcag
gctgtcacct attacactta tgatataatc cattcacaca 900ttcatttatt catttattca
tt 92231015DNAHomo sapiens
3aattgaaagc tttagctcac tgcagagtct cctaagtcac atctcttcct ttgcaagagt
60aggcgaagaa ggatctaagg gcttggcttg tttgaaagaa ccacaccccg aaagtaacat
120ctttggagaa agtgatacaa gagcttctgc acccacctga tagaggaagt ccaaagggtg
180tgcgcacaca caatggtgcc tgaagaagag cctcaagacc gagagaaagg actctggtgg
240ttccagttga aggtctggtc catggcagtc gtatccatct tgctcctcag tgtctgtttc
300actgtgagtt ctgtggtgcc tcacaatttt atgtatagca aaactgtcaa gaggctgtcc
360aagttacgag agtatcaaca gtatcatcca agcctgacct gcgtcatgga aggaaaggac
420atagaagatt ggagctgctg cccaacccct tggacttcat ttcagtctag ttgctacttt
480atttctactg ggatgcaatc ttggactaag agtcaaaaga actgttctgt gatgggggct
540gatctggtgg tgatcaacac cagggaagaa caggatttca tcattcagaa tctgaaaaga
600aattcttctt attttctggg gctgtcagat ccagggggtc ggcgacattg gcaatgggtt
660gaccagacac catacaatga aaatgtcaca ttctggcact caggtgaacc caataacctt
720gatgagcgtt gtgcgataat aaatttccgt tcttcagaag aatggggctg gaatgacatt
780cactgtcatg tacctcagaa gtcaatttgc aagatgaaga agatctacat ataaatgaaa
840tattctccct ggaaatgtgt ttgggttggc atccaccgtt gtagaaagct aaattgattt
900tttaatttat gtgtaagttt tgtacaagga atgcccctaa aatgtttcag caggctgtca
960cctattacac ttatgatata atccattcac acattcattt attcatttat tcatt
101541604DNAHomo sapiens 4gagcctccaa gtgtccacac cctgtgtgtc ctctgtcctg
ccagcaccga gggctcatcc 60atccacagag cagtgcagtg ggaggagacg ccatgacccc
catcctcacg gtcctgatct 120gtctcgggct gagcctggac cccaggaccc acgtgcaggc
agggcccctc cccaagccca 180ccctctgggc tgagccaggc tctgtgatca cccaagggag
tcctgtgacc ctcaggtgtc 240aggggagcct ggagacgcag gagtaccatc tatatagaga
aaagaaaaca gcactctgga 300ttacacggat cccacaggag cttgtgaaga agggccagtt
ccccatccta tccatcacct 360gggaacatgc agggcggtat tgctgtatct atggcagcca
cactgcaggc ctctcagaga 420gcagtgaccc cctggagctg gtggtgacag gagcctacag
caaacccacc ctctcagctc 480tgcccagccc tgtggtgacc tcaggaggga atgtgaccat
ccagtgtgac tcacaggtgg 540catttgatgg cttcattctg tgtaaggaag gagaagatga
acacccacaa tgcctgaact 600cccattccca tgcccgtggg tcatcccggg ccatcttctc
cgtgggcccc gtgagcccaa 660gtcgcaggtg gtcgtacagg tgctatggtt atgactcgcg
cgctccctat gtgtggtctc 720tacccagtga tctcctgggg ctcctggtcc caggtgtttc
taagaagcca tcactctcag 780tgcagccggg tcctgtcgtg gcccctgggg agaagctgac
cttccagtgt ggctctgatg 840ccggctacga cagatttgtt ctgtacaagg agtggggacg
tgacttcctc cagcgccctg 900gccggcagcc ccaggctggg ctctcccagg ccaacttcac
cctgggccct gtgagccgct 960cctacggggg ccagtacaca tgctccggtg catacaacct
ctcctccgag tggtcggccc 1020ccagcgaccc cctggacatc ctgatcacag gacagatccg
tgccagaccc ttcctctccg 1080tgcggccggg ccccacagtg gcctcaggag agaacgtgac
cctgctgtgt cagtcacagg 1140gagggatgca cactttcctt ttgaccaagg agggggcagc
tgattccccg ctgcgtctaa 1200aatcaaagcg ccaatctcat aagtaccagg ctgaattccc
catgagtcct gtgacctcgg 1260cccacgcggg gacctacagg tgctacggct cactcagctc
caacccctac ctgctgactc 1320accccagtga ccccctggag ctcgtggtct caggagcagc
tgagaccctc agcccaccac 1380aaaacaagtc cgactccaag gctggtgagt gaggagatgc
ttgccgtgat gacgctgggc 1440acagagggtc aggtcctgtc aagaggagct gggtgtcctg
ggtggacatt tgaagaatta 1500tattcattcc aacttgaaga attattcaac acctttaaca
atgtatatgt gaagtacttt 1560attctttcat attttaaaaa taaaagataa ttatccatga
gaaa 160451998DNAHomo sapiens 5ggggggctcg ggctgggggc
gcggcctgtg ccggccgccc caccctcctt gcataaaagc 60cggagcccgc ggggccggcg
ctctcagccc gtcggttccc gagcgccttc ccggtgaccc 120cgcagtgggt gtgtgagggg
aggacggaca gacccagacg ccgccggacc aggaggacgc 180tgacgaggca ccatgcgtga
gatcgtgcac atccaggcgg gccagtgcgg caaccagatc 240ggcgccaagt tttgggaggt
catcagtgat gagcatggga ttgaccccac tggcagttac 300catggagaca gtgatttgca
gctggagaga atcaatgttt actacaatga agccactggt 360aacaaatatg ttcctcgggc
catcctcgtg gatctggagc caggcacgat ggattcggtt 420aggtctggac cattcggcca
gatcttcaga ccagacaatt tcgtgtttgg ccagagtgga 480gccgggaata actgggccaa
gggccactac acagagggag ccgagctggt cgactcggtc 540ctggatgtgg tgaggaagga
gtcagagagc tgtgactgtc tccagggctt ccagctgacc 600cactctctgg ggggcggcac
ggggtccggg atgggcaccc tgctcatcag caagatccgg 660gaagagtacc cagaccgcat
catgaacacc ttcagcgtca tgccctcacc caaggtgtca 720gacacggtgg tggagcccta
caacgccacc ctctcggtcc accagctggt ggaaaacaca 780gatgaaacct actgcattga
caacgaggcc ctgtatgaca tctgcttccg caccctgaag 840ctgaccaccc ccacctacgg
ggacctcaac cacctggtgt cggccaccat gagcggggtc 900accacctgcc tgcgcttccc
gggccagctg aacgcagacc tgcgcaagct ggcggtgaac 960atggtgccct tccctcgcct
gcacttcttc atgcccggct tcgcgcccct gaccagccgg 1020ggcagccagc agtaccgggc
gctcacggtg cccgagctca cccagcagat gttcgactcc 1080aagaacatga tggccgcctg
cgacccgcgc cacggccgct acctgacggt ggctgccatc 1140ttccggggcc gcatgtccat
gaaggaggtg gacgagcaga tgctcaacgt gcagaacaag 1200aacagcagct acttcgtgga
gtggatcccc aacaacgtga agacggccgt gtgcgacatc 1260ccgccccgcg gcctgaagat
gtcggccacc ttcatcggca acagcacggc catccaggag 1320ctgttcaagc gcatctccga
gcagttcacg gccatgttcc ggcgcaaggc cttcctgcac 1380tggtacacgg gcgagggcat
ggacgagatg gagttcaccg aggccgagag caacatgaac 1440gacctggtgt ccgagtacca
gcagtaccag gacgccacgg ccgacgaaca aggggagttc 1500gaggaggagg agggcgagga
cgaggcgtag atgcccccgc gagacgggtt agggaaagcg 1560gaggaggaaa gcgagggggt
ggggggcttc ccgggacgat aacctggcag tggaaggaaa 1620gaagcatggt ctactttagg
tgtgcgctgg gtctctggtg ctcttcactg ttgcctgtca 1680cttttttttt ccttttttgt
aatattgatg acatcaatgt aacatttgag atatttctga 1740attactgttg taatggctaa
aatcacataa acgtttgtgt cggaatggtg tcctctcttt 1800ctcttccttt ttctctttat
taacgattta aatgtaactt tctgaacaca ttgcattgaa 1860ttcttccttt aacaaaaagc
aaaggcgtag gtaaaagctc aaatgaattt attctttcgg 1920tatggtaaaa ttgaaccaat
cacagttaag atgagagatc aacctgagtt ttaaaatacc 1980tttaataaat attagttg
199861595DNAHomo sapiens
6gcccgccggt ccacgccgcg caccgctccg agggccagcg ccacccgctc cgcagccggc
60accatgcgcg agatcgtgca catccaggcg ggccagtgcg gcaaccagat cggcgccaag
120ttttgggagg tcatcagcga tgagcatggg atcgacccca caggcagtta ccatggagac
180agtgacttgc agctggagag aatcaacgtg tactacaatg aggctgctgg taacaaatat
240gtacctcggg ccatcctggt ggatctggag cctggcacca tggactctgt caggtctgga
300cccttcggcc agatcttcag accagacaac ttcgtgttcg gccagagtgg agccgggaat
360aactgggcca agggccacta cacagaggga gccgagctgg tcgactcggt cctggatgtg
420gtgaggaagg agtcagagag ctgtgactgt ctccagggct tccagctgac ccactctctg
480gggggcggca cggggtccgg gatgggcacc ctgctcatca gcaagatccg ggaagagtac
540ccagaccgca tcatgaacac cttcagcgtc atgccctcac ccaaggtgtc agacacggtg
600gtggagccct acaacgccac cctctctgtc caccagctgg tggaaaacac agatgaaacc
660tactccattg ataacgaggc cctgtatgac atctgcttcc gcaccctgaa gctgaccacc
720cccacctacg gggacctcaa ccacctggtg tcggccacca tgagcggggt caccacctgc
780ctgcgcttcc cgggccagct gaacgcagac ctgcgcaagc tggcggtgaa catggtgccc
840ttccctcgcc tgcacttctt catgcccggc ttcgcgcccc tgaccagccg gggcagccag
900cagtaccggg cgctcacggt gcccgagctc acccagcaga tgttcgactc caagaacatg
960atggccgcct gcgacccgcg ccacggccgc tacctgacgg tggctgccat cttccggggc
1020cgcatgtcca tgaaggaggt ggacgagcag atgctcaacg tgcagaacaa gaacagcagc
1080tacttcgtgg agtggatccc caacaacgtg aagacggccg tgtgcgacat cccgccccgc
1140ggcctgaaga tgtcggccac cttcatcggc aacagcacgg ccatccagga gctgttcaag
1200cgcatctccg agcagttcac ggccatgttc cggcgcaagg ccttcctgca ctggtacacg
1260ggcgagggca tggacgagat ggagttcacc gaggccgaga gcaacatgaa cgacctggtg
1320tccgagtacc agcagtacca ggacgccacg gccgacgaac aaggggagtt cgaggaggag
1380gagggcgagg acgaggctta aaaacttctc agatcaatcg tgcatcctta gtgaacttct
1440gttgtcctca agcatggtct ttctacttgt aaactatggt gctcagtttt gcctctgtta
1500gaaattcaca ctgttgatgt aatgatgtgg aactcctcta aaaattacag tattgtctgt
1560gaaggtatct atactaataa aaaagcatgt gtaga
159573313DNAHomo sapiens 7cactaacgct cttcctagtc cccgggccaa ctcggacagt
ttgctcattt attgcaacgg 60tcaaggctgg cttgtgccag aacggcgcgc gcgcgcgcac
gcacgcacac acacgggggg 120aaactttttt aaaaatgaaa ggctagaaga gctcagcggc
ggcgcgggcg ctgcgcgagg 180gctccggagc tgactcgccg aggcaggaaa tccctccggt
cgcgacgccc ggccccggct 240cggcgcccgc gtgggatggt gcagcgctcg ccgccgggcc
cgagagctgc tgcactgaag 300gccggcgacg atggcagcgc gcccgctgcc cgtgtccccc
gcccgcgccc tcctgctcgc 360cctggccggt gctctgctcg cgccctgcga ggcccgaggg
gtgagcttat ggaaccaagg 420aagagctgat gaagttgtca gtgcctctgt tgggagtggg
gacctctgga tcccagtgaa 480gagcttcgac tccaagaatc atccagaagt gctgaatatt
cgactacaac gggaaagcaa 540agaactgatc ataaatctgg aaagaaatga aggtctcatt
gccagcagtt tcacggaaac 600ccactatctg caagacggta ctgatgtctc cctcgctcga
aattacacgg taattctggg 660tcactgttac taccatggac atgtacgggg atattctgat
tcagcagtca gtctcagcac 720gtgttctggt ctcaggggac ttattgtgtt tgaaaatgaa
agctatgtct tagaaccaat 780gaaaagtgca accaacagat acaaactctt cccagcgaag
aagctgaaaa gcgtccgggg 840atcatgtgga tcacatcaca acacaccaaa cctcgctgca
aagaatgtgt ttccaccacc 900ctctcagaca tgggcaagaa ggcataaaag agagaccctc
aaggcaacta agtatgtgga 960gctggtgatc gtggcagaca accgagagtt tcagaggcaa
ggaaaagatc tggaaaaagt 1020taagcagcga ttaatagaga ttgctaatca cgttgacaag
ttttacagac cactgaacat 1080tcggatcgtg ttggtaggcg tggaagtgtg gaatgacatg
gacaaatgct ctgtaagtca 1140ggacccattc accagcctcc atgaatttct ggactggagg
aagatgaagc ttctacctcg 1200caaatcccat gacaatgcgc agcttgtcag tggggtttat
ttccaaggga ccaccatcgg 1260catggcccca atcatgagca tgtgcacggc agaccagtct
gggggaattg tcatggacca 1320ttcagacaat ccccttggtg cagccgtgac cctggcacat
gagctgggcc acaatttcgg 1380gatgaatcat gacacactgg acaggggctg tagctgtcaa
atggcggttg agaaaggagg 1440ctgcatcatg aacgcttcca ccgggtaccc atttcccatg
gtgttcagca gttgcagcag 1500gaaggacttg gagaccagcc tggagaaagg aatgggggtg
tgcctgttta acctgccgga 1560agtcagggag tctttcgggg gccagaagtg tgggaacaga
tttgtggaag aaggagagga 1620gtgtgactgt ggggagccag aggaatgtat gaatcgctgc
tgcaatgcca ccacctgtac 1680cctgaagccg gacgctgtgt gcgcacatgg gctgtgctgt
gaagactgcc agctgaagcc 1740tgcaggaaca gcgtgcaggg actccagcaa ctcctgtgac
ctcccagagt tctgcacagg 1800ggccagccct cactgcccag ccaacgtgta cctgcacgat
gggcactcat gtcaggatgt 1860ggacggctac tgctacaatg gcatctgcca gactcacgag
cagcagtgtg tcacgctctg 1920gggaccaggt gctaaacctg cccctgggat ctgctttgag
agagtcaatt ctgcaggtga 1980tccttatggc aactgtggca aagtctcgaa gagttccttt
gccaaatgcg agatgagaga 2040tgctaaatgt ggaaaaatcc agtgtcaagg aggtgccagc
cggccagtca ttggtaccaa 2100tgccgtttcc atagaaacaa acatccccct gcagcaagga
ggccggattc tgtgccgggg 2160gacccacgtg tacttgggcg atgacatgcc ggacccaggg
cttgtgcttg caggcacaaa 2220gtgtgcagat ggaaaaatct gcctgaatcg tcaatgtcaa
aatattagtg tctttggggt 2280tcacgagtgt gcaatgcagt gccacggcag aggggtgtgc
aacaacagga agaactgcca 2340ctgcgaggcc cactgggcac ctcccttctg tgacaagttt
ggctttggag gaagcacaga 2400cagcggcccc atccggcaag cagaagcaag gcaggaagct
gcagagtcca acagggagcg 2460cggccagggc caggagcccg tgggatcgca ggagcatgcg
tctactgcct cactgacact 2520catctgagcc ctcccatgac atggagaccg tgaccagtgc
tgctgcagag gaggtcacgc 2580gtccccaagg cctcctgtga ctggcagcat tgactctgtg
gctttgccat cgtttccatg 2640acaacagaca caacacagtt ctcggggctc aggaggggaa
gtccagccta ccaggcacgt 2700ctgcagaaac agtgcaagga agggcagcga cttcctggtt
gagcttctgc taaaacatgg 2760acatgcttca gtgctgctcc tgagagagta gcaggttacc
actctggcag gccccagccc 2820tgcagcaagg aggaagagga ctcaaaagtc tggcctttca
ctgagcctcc acagcagtgg 2880gggagaagca agggttgggc ccagtgtccc ctttccccag
tgacacctca gccttggcag 2940ccctgatgac tggtctctgg ctgcaactta atgctctgat
atggctttta gcatttatta 3000tatgaaaata gcagggtttt agtttttaat ttatcagaga
ccctgccacc cattccatct 3060ccatccaagc aaactgaatg gcaatgaaac aaactggaga
agaaggtagg agaaagggcg 3120gtgaactctg gctctttgct gtggacatgc gtgaccagca
gtactcaggt ttgagggttt 3180gcagaaagcc agggaaccca cagagtcacc aacccttcat
ttaacaagta agaatgttaa 3240aaagtgaaaa caatgtaaga gcctaactcc atcccccgtg
gccattactg cataaaatag 3300agtgcatttg aaa
331383381DNAHomo sapiens 8aaccgtatct cagttctcgc
gaggtttcgt cttcccggaa gcgttggagg acattccctg 60ttgactgcgt cgcgatgtgt
ggcgactgtg tggagaagga atatcccaac cggggtaata 120cctgcctgga gaatggatct
ttcttactga actttacagg ctgtgcagtg tgcagtaagc 180gggattttat gctgatcaca
aacaaatcct tgaaagaaga agatggagaa gaaatagtta 240cctatgatcc agatttgtgt
aagaattgtc atcatgtaat agccagacat gagtatacat 300tcagtatcat ggatgaattt
caggagtata ccatgctgtg tctgttatgc ggcaaagccg 360aagatactat cagtattctc
cctgatgacc cccgacaaat gactctctta ttctaaggat 420ccttctacag atctgttata
actatattgt gttggtttac aatacagcaa gcctgatggt 480ttgtcttatt tcattcatac
tgaaaattct ttgcatattt tttttctgct tagacttact 540tattctttga ggaaaaaagg
taatgtagga gctcattgtt ctctagagct gagctcttct 600gctaaagttc aaagttcaca
tcagtgtagc cagagtgaag catctttgtt agcagttatg 660tggtcagaaa acaactagaa
tgggagcaca cctggtgccc agattttggt ttccaataaa 720aggaaccagg actccttaga
aaatgaactg attctagaac tgggatatga actgtacaag 780atgagcctag agtatcttgt
gccacagaaa aatgaagtac tgaaattaaa aaaaaaaaat 840catagtgatt gggagtatgt
cagcatgatc gaactgaagg atctctgaaa ggccacagct 900ggagcaattt gagtaacaaa
ataaagtagt attgggttat aacccaaact atataataaa 960tatttacaag cccaaactaa
tataaatgat tgaatgggaa caaatctctt gtgcagaaga 1020atttcaaata atttatgtag
atactcccca cttagtgggc tctgcatagt gacttccttg 1080cagagtataa tatggaaagg
ggggaaaaga gtaactttac agtggagaaa tctgacagac 1140accagctcag ccaggtgatt
gaggttaaca tcatccatga taagcccttt tgatagcatg 1200tatccttgat agcatcacat
catacagaat gtgataagaa tggcacttta cctctgtggt 1260cctcctcccc agaactcaga
acccaagtct gatcaggaga aaaacatcac acaaattcca 1320attgagaagc attctttaaa
atacctaacc agtacttctc aaaactatca aggccataaa 1380aaacaaggaa aatctgagac
atggtcacag ccaagaggag cctgagaaga catgactgct 1440agatgtaatg tggtatccta
catggaatta tataccacga aaagaaaaaa aggtcatttg 1500gtagaaacta gggaaatctg
aatacagtgt ggacttcagt taatataaat gagaaaatgg 1560tttcattggc tttaaaggga
ttttagaata tcttactttg catagtttca tgagcactgg 1620ccaggaggtt ataaactggc
atttaggcct ctctgccatt tagtagcagt atagtcatta 1680agcaaagcat tactctggac
tttattgtcc tgtttctatc aaattggact aacaaaaatt 1740gatataatac attcatcaca
gtattgttag gagatttaaa tgaattagtg aatgtaagat 1800actttagaaa gcatgtaaag
tactagaaac aatgaggtgg cttcaattag tgctcattct 1860gagacctaaa tttaaagggt
caggcattag aaacaaaagt gtttcttcat atctaggaga 1920tttggaaatt ctgaactaag
gtcttataag aacaagaaat gagcaaagtg ttcattgtag 1980atactaggga aaagctttgt
tgaataacag taatgatgat aatatctgag ctttattaag 2040tacttactac atgctaagag
ctttttaagc acttatttaa tcctcattaa caaattcata 2100aggtaggtgc tattgatagt
ttagccataa ccagatatgg cttgttattt gtaagttatc 2160atgaaaaggt ttatattcac
tattctttat ttcagtgtag cacttttaga gccaaaaaaa 2220ctcaggcaca aagaagttaa
ataacttgcc gggtgcggtg gctcacccct gtaatcccag 2280cactttggga ggccaaagca
ggcggatcac ctgaggtcag gagtttgaga ccagccgggc 2340caacatggtg caacctcgtc
tttactaaaa atacaaaaac aaaaattagc caggggtggt 2400ggtatgcacc tgcagtccca
gcaacatggg aggctgaaca ggagaatcgc ttgaacccgc 2460gacagggagg ttgtgatgag
ccaaggtcat gccactgcac tccagcctgg gagacagagc 2520aagactccat ctcaaaaaca
aaaacaaaac aaaacaacaa caaaaaaaga agttaaataa 2580cttgtttaag atcacacagc
tagtgcagtt accccctctc agtaattatg tttggcaaat 2640atgaaataca aggatctggc
aaggttagga agaggttaag aatatcagtg acagtactct 2700gaggcatctg ggcatgtcag
atgaagattt attattcaga aaagttaagt cagctgttgc 2760agggatcagt tgttttattc
ctggaaaatg tttttcattt tgctgcaaat ctatatctga 2820cctgctgagg aaatcgtttg
gtgaaatgaa tccagaaagc agagaaaatg cttcattact 2880ttaaatagca gcattaagcc
cattatagcc tctgaattgt cccttccttg agtttgctaa 2940tgcttccaac ttagtcattt
gaagcccaag agtctaattt tatatgccct gccaatgtcc 3000tcatctattg cagaatgtat
aattatctat ttgttttgga ctatatgtta caaaaattta 3060aaacataaga tcctctctct
atatttcatt attggtgaac ccacattgtg ctgtgttttg 3120tgatatttta tcattttcta
gattcttatg tggatctaat aacgaccact tgaacccagt 3180ttcacagaat ccttattttt
ctgttcaaat taggaattag ggacatggat gaaattggaa 3240atcttcattc tcagtaaact
atcgcaagga caaaaaacca aacaccacat gttctcactc 3300atagatagga attgaacaat
gagaacacat ggagacagga aggggaacat cacactctag 3360ggactgttgt ggggtgaggg g
338193301DNAHomo sapiens
9aggcgccgcc gcagccggag cggctcccgg gccctgggcc gccgccggcc aggaagaaat
60acttgtgttg gctgcatttc cagggatgct accagagctc aaggctgtca cctggtcttg
120cccagaagag ccgttcttag aggcaggact tgatgaaggc tttcctgctg atggaatagg
180tttgctagag ctggccttgg aattagaacc cttcatgtgg cctttataaa tatgcgtttg
240agacagagtt atatgcagaa gttgaaaatg cctggaagat ttctggtttc tttcactact
300tatcctgcct ttttgcatcg ctgccagatt tggatgatat gatattcaga ggggcacctt
360aatcaaagcc attcttcaac aagacccacc tggcataaga ttgcacacat aattcaagat
420ggccagtcaa cctcctgaag acactgcgga gtctcaggcc tctgatgagc tggagtgcaa
480aatctgttac aatcgataca atctgaaaca gaggaaaccc aaagtgctgg agtgttgtca
540tagggtttgt gccaaatgcc tctacaagat catagacttt ggggactccc cacaaggtgt
600cattgtctgt cctttctgca ggtttgagac gtgcctgcca gatgatgaag ttagtagcct
660gcccgatgac aacaacatcc ttgtaaactt gacttgtgga ggcaaaggga agaagtgcct
720gccagagaac cctactgagc tgctgctcac ccccaagagg ctggcctctc tggtcagtcc
780ttctcacacg tcctccaact gcctggtcat aaccatcatg gaggtgcaga gagagagctc
840cccgtccctg agctccactc ctgtggtaga attttatagg cctgcgagtt tcgactctgt
900caccactgtg tcacacaact ggactgtgtg gaactgcacg tccctgctgt ttcagacatc
960catccgggtg ttagtgtggt tgctaggttt gctctacttc agctccttac ccttaggaat
1020ctacttactg gtgtctaaga aagtcaccct tggggtcgtc tttgtcagcc tggtcccttc
1080gagcctcgtt attcttatgg tgtatggttt ttgccagtgt gtttgtcatg aatttctaga
1140ctgtatggca cctccttctt aactgatatg caaaataaga aattggacac acattgccct
1200gtttgagtgt gaagttagat aatttataat ttattttctt ttatgttctt tatgattagt
1260atccatgaca ttaacaaaac ccttggccac atgttgactt gattggtttt cctgtaggct
1320ggaagtaaaa atgttcattt ctacttaggg gttagcaaaa ttgtataagc tcacacttca
1380tggagcactg acagcagtga gtcttcccag agaaaggaca gggtttctct cagcactgcc
1440agacagacgg caggggtgtg gtgtgttata ctatagggag agcatggatc cctcctttcg
1500tattcatgga tgttctatga tggcagttgg acacaagagg aaagttgctc tgagacacaa
1560agtgtgtact cctttcccac cccatacccc tggtattgga acaccctaga attgtcttca
1620ggtgttttct ctcttaagtc actctctgtg gtcggcgatc ccattgagat acttgtttcc
1680tctgcccatt ctcccttgcc aagaggaaac tcggttccat tcccttgggt agtttcaact
1740gaataccagt tttgccacat tactaaggag aatgaaaagc actgaggaat ttgcatcaca
1800gtcagcttca tggcagaatg tggccatttg tccttgagac acactctttc ctccatgtct
1860gttttcctct ctcctcagtc ttatctgaga agaatggagg agaaggaact tctcatacag
1920cggttattat tgatgaaaac cttcatttga gtctttcctt ataacatctt agttttggtt
1980tttttaaacc acattgccca atcagcctca ccccttgtcc tgaaaaggtt ccatttaaat
2040tagttgctat aaattcatca atactttttt tccctattat atttttggtt ctattaggat
2100ttacttaact gaatcttata acaattcgag gtgaactgtg gcaatgaaaa ccagaaacag
2160ttaatgagat gcttcagctc acagtttgaa gtgctgagaa cctaagtatt ttgctgtacg
2220gtactgagct gtaccaaaat atgatggttt aggtttatgt gcaagacttt gtgttgtagt
2280ctagacaaag gggtgggcaa gagacatgca aagctgaagc cctgcttgaa aagacccttc
2340aaggaagtaa aatggcaggg gcagagtgca gcttaacatg ttgctatccc tgttgttttt
2400gagttggttt tggaatggat tcaagttctt acacaattta ttttgaatac aagcataatc
2460taggtgattt gagttaatga acttcttttc atgatgtagg gaaagttgaa tgtatatatt
2520tctaagaaga atttgtttag cagattacaa gttggcaaaa tagactgttc acagaaacta
2580ggcaaaaatt taaaaaaaca ttctagtctc taaaacccat tactaatgat taacattaaa
2640atatttgtaa ctcttagaaa gggggcatta ctaagacgac tttaacttgt tatgaaatct
2700ttgttgtgtg atgcaggtac agtgcgccca ttccaactgg aatagcagtt tgattttaat
2760tgtaaaacta aacttcggga atatgtatgc ccaaagtaag taggatgaga atagtataca
2820tgggatatgg tccaatgaat ttaagcccca agatacagct aaatacattt atgatttcat
2880aaaatctagt ttagatagca ttgtgatgca atttccagaa atccatttgt gtttagagta
2940aataccatgt ttagaagatg ttttgtggtt tggatttata tatttgtaag gtttttttaa
3000aaaaatgttc gttttgtttg aaatgtaaca ttgagtaaat tggtgagtta tataatgaga
3060tttctagaaa gctctggaca tgggtacgat gtgttttgct tctctgtata atgtctacag
3120tgataaactt gtgtctccgt gtattgtggc agtctttttt tctagttaat ttggctttag
3180agagcaatct ttgtatgaca ccagaaaact cttcatgcta ttgaatgata aaaagataat
3240gctttaatat tttattcact gtgatactat tttgtttgtc tattaaattg ttattatttc
3300c
3301101129DNAHomo sapiens 10gtcgctcccc tgctcggggt cagctagtgt ccgctctgct
cggccgcggg ctcccggagg 60actgcaggca ggatgacgca aaacacggtg attgtgaatg
gagttgctat ggcctctagg 120ccatcccagc ccacccacgt caacgtccac atccaccagg
agtcagcttt gacacaactg 180ctgaaagctg gaggttctct gaagaagttt ctttttcacc
ctggggacac tgtgccttcc 240acagccagga ttggttatga gcagctggct ctaggggtga
ctcagatatt gctgggggtt 300gtgagttgtg ttcttggagt gtgtctcagc ttggggccct
ggactgtgct gagtgcctca 360ggctgtgcct tctgggcggg gtctgtggtg atcgcagcag
gagctggggc cattgtccat 420gagaagcacc cgggcaaact tgctggctat atatccagcc
tgctcaccct ggcaggcttt 480gctacagcta tggctgctgt tgtcctctgc gtgaatagct
tcatctggca aactgaaccc 540tttttataca tcgacactgt gtgtgatcgc tcagaccctg
tcttccctac cactgggtac 600agatggatgc ggcgaagtca agagaaccaa tggcagaagg
aggagtgtag agcttacatg 660cagatgctga ggaagttgtt cacagcaatc cgtgccctgt
tcctggctgt ctgtgtcttg 720aaggtcattg tgtccttggt ttccttggga gtaggtcttc
gaaacttgtg tggccagagc 780tcccagcccc tgaatgagga aggatcagag aagaggctac
tgggggagaa ttcagtgccc 840ccttcgccct ctagggagca gacctccact gccattgtcc
tgtgagctgc caaagacccc 900acggggtgcc cgcatgtccc tgtctagggc agcccagggc
ccccactcct ggctcctcac 960acttgcctcc cctatggccg ctctccagac cctcctcctt
tcttctcccc acatccgcac 1020ctgctgttcc cactctgggg ttctcaagtc catgaacaga
tattgttgca ttttccacaa 1080tgctgattaa acataataaa caatccagaa aagcagtttt
gcccagaaa 1129111206DNAHomo sapiens 11cccacctcag cctcccaaag
cactgggatt acaggcgtga gctattgtgc ccagctggga 60tcttgacaaa gacactattt
ctctcctttc acctgtgctg tgtatttttc cctcgcctag 120ttcccagacc tcactgctat
atgtcttctc cctggcaggc aggatgacgc aaaacacggt 180gattgtgaat ggagttgcta
tggcctctag gccatcccag cccacccacg tcaacgtcca 240catccaccag gagtcagctt
tgacacaact gctgaaagct ggaggttctc tgaagaagtt 300tctttttcac cctggggaca
ctgtgccttc cacagccagg attggttatg agcagctggc 360tctaggggtg actcagatat
tgctgggggt tgtgagttgt gttcttggag tgtgtctcag 420cttggggccc tggactgtgc
tgagtgcctc aggctgtgcc ttctgggcgg ggtctgtggt 480gatcgcagca ggagctgggg
ccattgtcca tgagaagcac ccgggcaaac ttgctggcta 540tatatccagc ctgctcaccc
tggcaggctt tgctacagct atggctgctg ttgtcctctg 600cgtgaatagc ttcatctggc
aaactgaacc ctttttatac atcgacactg tgtgtgatcg 660ctcagaccct gtcttcccta
ccactgggta cagatggatg cggcgaagtc aagagaacca 720atggcagaag gaggagtgta
gagcttacat gcagatgctg aggaagttgt tcacagcaat 780ccgtgccctg ttcctggctg
tctgtgtctt gaaggtcatt gtgtccttgg tttccttggg 840agtaggtctt cgaaacttgt
gtggccagag ctcccagccc ctgaatgagg aaggatcaga 900gaagaggcta ctgggggaga
attcagtgcc cccttcgccc tctagggagc agacctccac 960tgccattgtc ctgtgagctg
ccaaagaccc cacggggtgc ccgcatgtcc ctgtctaggg 1020cagcccaggg cccccactcc
tggctcctca cacttgcctc ccctatggcc gctctccaga 1080ccctcctcct ttcttctccc
cacatccgca cctgctgttc ccactctggg gttctcaagt 1140ccatgaacag atattgttgc
attttccaca atgctgatta aacataataa acaatccaga 1200aaagca
1206
<210> 12
<211> 1303
<212> DNA
<213> Homo sapiens
<400> 12
cgagtgggag cgcagggggc gcgcgggggc tcggtcttct cccgcccggt ccgctcctcc
60gggcggaacc ccctttgccg ggcccccggg ccgcagctgg tttcgggaaa ccgcgcagcc
120tgggcggggc cgcagccccc acccgtgcat ctgccgtctc cgccctcccg tcgctcccct
180gctcggggtc agctagtgtc cgctctgctc ggccgcgggc tcccggagga ctgcaggtga
240ggccaggcag gatgacgcaa aacacggtga ttgtgaatgg agttgctatg gcctctaggc
300catcccagcc cacccacgtc aacgtccaca tccaccagga gtcagctttg acacaactgc
360tgaaagctgg aggttctctg aagaagtttc tttttcaccc tggggacact gtgccttcca
420cagccaggat tggttatgag cagctggctc taggggtgac tcagatattg ctgggggttg
480tgagttgtgt tcttggagtg tgtctcagct tggggccctg gactgtgctg agtgcctcag
540gctgtgcctt ctgggcgggg tctgtggtga tcgcagcagg agctggggcc attgtccatg
600agaagcaccc gggcaaactt gctggctata tatccagcct gctcaccctg gcaggctttg
660ctacagctat ggctgctgtt gtcctctgcg tgaatagctt catctggcaa actgaaccct
720ttttatacat cgacactgtg tgtgatcgct cagaccctgt cttccctacc actgggtaca
780gatggatgcg gcgaagtcaa gagaaccaat ggcagaagga ggagtgtaga gcttacatgc
840agatgctgag gaagttgttc acagcaatcc gtgccctgtt cctggctgtc tgtgtcttga
900aggtcattgt gtccttggtt tccttgggag taggtcttcg aaacttgtgt ggccagagct
960cccagcccct gaatgaggaa ggatcagaga agaggctact gggggagaat tcagtgcccc
1020cttcgccctc tagggagcag acctccactg ccattgtcct gtgagctgcc aaagacccca
1080cggggtgccc gcatgtccct gtctagggca gcccagggcc cccactcctg gctcctcaca
1140cttgcctccc ctatggccgc tctccagacc ctcctccttt cttctcccca catccgcacc
1200tgctgttccc actctggggt tctcaagtcc atgaacagat attgttgcat tttccacaat
1260gctgattaaa cataataaac aatccagaaa agcagttttg ccc
1303
<210> 13
<211> 1430
<212> DNA
<213> Homo sapiens
<400> 13
gctgaccatg ctggaactgc ggcgactaca gagcctgcgg
gaacctcccc tttcgcccaa 60gatctgctct gtccccctca tcctcctccc agggccctgg
cgtctgggtc aagcagcgcc 120ccacacctcg acccctcacc ccctcctccc gggctcttcc
tgcggcctcc cctccacagt 180ccgcaggctc tgggacagga ccgagtcctt ggctgcctgt
ggagctcctg tgccagcagc 240tgcgccccgg ctgcgctccg gataccccca tccccgccac
cgccgacctc ccgctccacc 300gactgctgct cacgcccgac gggttcacgc cgcccctgcc
ccgtgaagga ccgcgctgcg 360gtgcggaggc aggatgacgc aaaacacggt gattgtgaat
ggagttgcta tggcctctag 420gccatcccag cccacccacg tcaacgtcca catccaccag
gagtcagctt tgacacaact 480gctgaaagct ggaggttctc tgaagaagtt tctttttcac
cctggggaca ctgtgccttc 540cacagccagg attggttatg agcagctggc tctaggggtg
actcagatat tgctgggggt 600tgtgagttgt gttcttggag tgtgtctcag cttggggccc
tggactgtgc tgagtgcctc 660aggctgtgcc ttctgggcgg ggtctgtggt gatcgcagca
ggagctgggg ccattgtcca 720tgagaagcac ccgggcaaac ttgctggcta tatatccagc
ctgctcaccc tggcaggctt 780tgctacagct atggctgctg ttgtcctctg cgtgaatagc
ttcatctggc aaactgaacc 840ctttttatac atcgacactg tgtgtgatcg ctcagaccct
gtcttcccta ccactgggta 900cagatggatg cggcgaagtc aagagaacca atggcagaag
gaggagtgta gagcttacat 960gcagatgctg aggaagttgt tcacagcaat ccgtgccctg
ttcctggctg tctgtgtctt 1020gaaggtcatt gtgtccttgg tttccttggg agtaggtctt
cgaaacttgt gtggccagag 1080ctcccagccc ctgaatgagg aaggatcaga gaagaggcta
ctgggggaga attcagtgcc 1140cccttcgccc tctagggagc agacctccac tgccattgtc
ctgtgagctg ccaaagaccc 1200cacggggtgc ccgcatgtcc ctgtctaggg cagcccaggg
cccccactcc tggctcctca 1260cacttgcctc ccctatggcc gctctccaga ccctcctcct
ttcttctccc cacatccgca 1320cctgctgttc ccactctggg gttctcaagt ccatgaacag
atattgttgc attttccaca 1380atgctgatta aacataataa acaatccaga aaagcagttt
tgcccagaaa 1430
<210> 14
<211> 3458
<212> DNA
<213> Homo sapiens
<400> 14
aaaagacttc agtggcagac
aaaggaggag taataagatc gctagggggc ccgtgcccag 60cccacccacg cacaatctca
gtcctcgcaa tacccacaag gtaggtgcta ggatcacacc 120ctttacggac gcggcacctg
cgacagggat gcgcgaggag tcagggggcc tcgccggatc 180gaacctaagc tggggaagag
tatttcttgt atttttagga gaaattctca gcctcgggga 240agagtatttc ttgatgaggg
aagagcgcgg ggaagacact cacgcacgca caaacatgtg 300ggcggccatg gtgtgcccag
cgccgtgctg gcttctggga acccccagtg gacaagacgg 360acaaggtacc ggctctcagg
ggaagtggga gccagtcaca agcgtaccta atttcggaga 420gtgacaagta ctctgaaaaa
gaaagaaggt agggctggtg actggccaat ttaagcgggc 480aggagtctgc tgggggacgg
agaccagcct caggtctggg ttggggacag aagctgtgcc 540taagtgtggt gcaggatgca
gttgcaaagg agcgcttccg atcgcacttg atgctcgcca 600cgtccctgca aagtgctccc
gccccctttc tgcaaatgag gaaacgggac gcgcggctcg 660ccgggccagc ccgcgtgcct
gcgcagtccc ctccccgaga accatcccct tgccccgccc 720agcgtcaggg gtgcgcggcc
gccgagagac cccggaggcg tagccggctg cggaggcgaa 780gaggtggcag cgcgagctgg
gaccagcgtc tcggaggcgc cgcagaattc acagatggat 840tcagtggaaa agacaacaaa
tagaagtgaa caaaaatcca gaaagttttt aaaaagcctc 900atccggaaac agccccagga
actgctcctg gttatcggga ctggcgtcag cgcagcagtg 960gcccccggaa tccctgccct
ttgctcgtgg agaagctgca tcgaggccgt catcgaggct 1020gcagagcagc tggaggtgct
gcaccccgga gacgtcgccg agttccggag gaaagtgaca 1080aaggaccggg acctgttggt
tgtcgcccat gatctgatcc ggaagatgtc acctcgcaca 1140ggcgatgcca agcccagctt
cttccaggac tgcctgatgg aggtgtttga cgacctggag 1200cagcacatcc ggagtcctgt
ggtgctgcag tcgatcctca gcctgatgga caggggcgcc 1260atggtcctga ccaccaacta
tgacaacctg ctggaggcct ttggccggcg gcagaacaag 1320cccatggagt ccctggactt
gaaggacaag accaaggtcc ttgaatgggc aagagggcac 1380atgaagtacg gcgtcctcca
cattcacggc ctctacacgg acccctgcgg ggtggtgctg 1440gacccatcgg ggtataaaga
cgtcactcaa gacgcagaag tcatggaagt cctccagaac 1500ttataccgca ccaagtcctt
tctgtttgtg ggctgtgggg agacccttcg tgatcagata 1560ttccaggccc tctttcttta
ctccgtgccg aataaggtgg atttggagca ctacatgctt 1620gtgctgaagg agaatgaaga
ccatttcttt aagcatcagg cagatatgct tctgcacgga 1680atcaaagttg tatcctacgg
ggactgtttt gaccactttc caggatatgt gcaagacctt 1740gccactcaga tctgcaaaca
gcaaagccca gatgctgatc gcgtggacag caccacatta 1800ttgggtaatg catgccagga
ctgtgcaaag aggaagttag aagagaatgg aattgaagtt 1860tcaaaaaaac gcacacaatc
agatactgat gatgctggag ggtcttgaaa tctttacagt 1920aaaacctgca acttgaaaac
tagccttctg taaccacagt gcccaaacga agaggaatgt 1980atggagaact ccacgtggat
ctctgattgc gaaaccgtca catacaccaa gagagccaca 2040tgggcatgtg gccctcaagg
ctgggtgaga gggctcccct gtgtgttgaa ctatgcagga 2100gggtgacgcg gacacatttc
aggtggactt tgcaaggact gatggatagc tacctcaggg 2160accagaatcc gtgggaaggg
atggacctgg tgttcccgtt cccatctgac aggctctctt 2220ttgtcaaggt ggtatttttc
gtaataaaag gggaagagta aagactgtcc aagcaacagt 2280agctgccaaa gagaaaatac
gaaatagaca cttttttttt ttgagtcaga gtctcactct 2340gtcacccagg acagagtgca
gtggtacgat ctcaagctca ctgcagccac caccgcctgg 2400gctcaagtga ttctcctgcc
tcagcctccc gagtagctgg gattacaggc gtccaccacc 2460atgcccagct aattttttta
tttttagtag agttggagtt tcaccatgtt ggccaggatg 2520gtctcgaact cttgacctca
ggtgatccac ccgccttggc ctcccaaagt gctaggatta 2580caggcatgag ccactgcgcc
cagcaaaata aacacatttt ataatttgta tgtggaaaca 2640tgttactata gaaagcattt
taaaggtacg ttttaaaggt ccactgttaa atagtaaaga 2700atgaatccgc tagcgaaaat
gtttttaggg agaacagctg gatcaaaagg gcttctttgg 2760aattaggttg ttttagtaac
ttctgttcca aagaaacaca ggtctgatat tgctaagaac 2820tgaaatcgga ggagccagag
gcccttttca gtccaggcca acattgtgca cggccactgt 2880gggactgaca accgggatag
ctcaagttcg agagaccagg tttcaaacat tataagttcc 2940aggctttgca agtctttatt
ctctggggta atatccagtc tttctgttat tgtctcttaa 3000aattctcttc catggcccac
attaagggag tttgcagaga gtgagggagg caaaacttga 3060aaagggcctg caacacttta
aaccttctca ggttcaccca catgaaacgg ctgtgctgag 3120tgtgctgccg gtgcccgggg
agcttctctg actgtgaccc ggcagaggct tctgtggcgg 3180tgcatgagcg gccctacagt
ggagggttct ctttggaaac aaacagccct gcttggtttc 3240agtttgaggc cacttatctt
caatgtgaca tttcttgcca agccctgtga cactccccat 3300tgatgactcc cataggtaca
gataaagtta agaacaggaa acagaagggt aggatgcata 3360gggagggaga gaagccctga
aaactttttt tttctttttg aagcatggaa aacaaatctt 3420ttatgccact ccagccataa
ataaaatttt aacttcaa 3458
<210> 15
<211> 1776
<212> DNA
<213> Homo sapiens
<400> 15
cggggcgggg cctccgggga ccgcggggcc gttggtttcg ggacggaacg ttcacgcggc
60tggggcgggc gcgcggggga agggtttgcg gcggcgccgc tgccggctaa cgcggagggg
120cgcctggagg cggcgtggcg tccgctctgg ctccgactcc ggctctcgct ctcgcttcta
180gcccgcgtgc ctgcgcagtc ccctccccga gaaccatccc cttgccccgc ccagcgtcag
240gggtgcgcgg ccgccgagag accccggagg cgtagccggc tgcggaggcg aagaggtggc
300agcgcgagct gggaccagcg tctcggaggc gccgcagaat tcacagatgg attcagtgga
360aaagacaaca aatagaagtg aacaaaaatc cagaaagttt ttaaaaagcc tcatccggaa
420acagccccag gaactgctcc tggttatcgg gactggcgtc agcgcagcag tggcccccgg
480aatccctgcc ctttgctcgt ggagaagctg catcgaggcc gtcatcgagg ctgcagagca
540gctggaggtg ctgcaccccg gagacgtcgc cgagttccgg aggaaagtga caaaggaccg
600ggacctgttg gttgtcgccc atgatctgat ccggaagatg tcacctcgca caggcgatgc
660caagcccagc ttcttccagg actgcctgat ggaggtgttt gacgacctgg agcagcacat
720ccggagtcct gtggtgctgc agtcgatcct cagcctgatg gacaggggcg ccatggtcct
780gaccaccaac tatgacaacc tgctggaggc ctttggccgg cggcagaaca agcccatgga
840gtccctggac ttgaaggaca agaccaaggt ccttgaatgg gcaagagggc acatgaagta
900cggcgtcctc cacattcacg gcctctacac ggacccctgc ggggtggtgc tggacccatc
960ggggtataaa gacgtcactc aagacgcaga agtcatggaa gtcctccaga acttataccg
1020caccaagtcc tttctgtttg tgggctgtgg ggagaccctt cgtgatcaga tattccaggc
1080cctctttctt tactccgtgc cgaataaggt ggatttggag cactacatgc ttgtgctgaa
1140ggagaatgaa gaccatttct ttaagcatca ggcagatatg cttctgcacg gaatcaaagt
1200tgtatcctac ggggactgtt ttgaccactt tccaggatat gtgcaagacc ttgccactca
1260gatctgcaaa cagcaaagcc cagatgctga tcgcgtggac agcaccacat tattgggtaa
1320tgcatgccag gactgtgcaa agaggaagtt agaagagaat ggaattgaag tttcaaaaaa
1380acgcacacaa tcagatactg atgatgctgg agggtcttga aatctttaca gtaaaacctg
1440caacttgaaa actagccttc tgtaaccaca gtgcccaaac gaagaggaat gtatggagaa
1500ctccacgtgg atctctgatt gcgaaaccgt cacatacacc aagagagcca catgggcatg
1560tggccctcaa ggctgggtga gagggctccc ctgtgtgttg aactatgcag gagggtgacg
1620cggacacatt tcaggtggac tttgcaagga ctgatggata gctacctcag ggaccagaat
1680ccgtgggaag ggatggacct ggtgttcccg ttcccatctg acaggctctc ttttgtcaag
1740gtggtatttt tcgtaataaa aggggaagag taaaga
1776
<210> 16
<211> 3511
<212> DNA
<213> Homo sapiens
<400> 16
ggggatttgg ggctgggtcg gccggggtcg gggagggggg
tggtgaaaag gtgacaggga 60gctgcccccg ctcaagagcc ggtggttggg ggtctgagaa
gaagtcacca atatgaagtt 120attcggcttc gggagccgca ggggccagac ggcccagggc
tccatagacc acgtctacac 180gggttccgga taccgaatcc gggactccga actgcagaag
atccacaggg cagctgtcaa 240aggcgacgcc gcggaggtgg agcgctgcct ggcgcgcagg
agcggagacc tggacgccct 300ggacaagcag cacagaactg ctctacactt ggcctgtacc
agtggccatg tgcaagtggt 360cactctcctg gttaacagaa aatgccagat tgatgtctgt
gacaaagaaa acagaacgcc 420tttgatacag gctgtccatt gccaggaaga ggcttgtgcc
gttattctgc tggaacatgg 480cgccaatcca aaccttaagg atatctacgg caacactgct
ctccattatg ccgtgtatag 540tgagagcacc tcactggcag aaaaactgct ttcccatggt
gcacatattg aagcactgga 600caaggacaat aataccccac ttttatttgc tataatttgc
aagaaagaga aaatggtgga 660atttttattg aaaaagaaag caagttcaca tgccgttgat
aggctgagac ggtcagctct 720catgcttgct gtatactatg actcaccagg tattgtcaat
atccttctta agcaaaatat 780tgatgtcttc gctcaagaca tgtgtggacg agatgcagaa
gattatgcta tttctcatca 840tttgacaaaa attcaacaac aaattttgga acataaaaag
aagatactta aaaaggagaa 900atcagatgtt ggaagttctg atgaatctgc agtcagcatt
ttccatgaac tgcgtgtgga 960ttcattgcct gcatcggatg acaaagactt gaatgttgct
actaagcagt gtgtccccga 1020gaaagtgtca gagcctttac ctggatcttc gcatgaaaaa
ggaaacagaa tagtcaatgg 1080acaaggagaa gggcctcctg caaaacatcc ttccttgaag
cctagcactg aagtggaaga 1140tcctgctgtg aaaggagcag tacaaagaaa gaatgtacag
acattgagag cagaacaagc 1200cttaccagtg gcttcagagg aagagcaaga aaggcatgaa
agaagtgaaa agaagcaacc 1260acaggtcaaa gaaggaaata atacaaacaa aagtgaaaaa
atacaacttt cagaaaatat 1320atgtgatagt acatcttctg ctgctgctgg cagattaacc
caacaaagaa agattgggaa 1380aacgtatcct cagcaatttc ccaagaagct gaaggaagag
catgatagat gcaccttaaa 1440acaagaaaat gaagaaaaaa caaatgttaa tatgctgtac
aaaaaaaata gagaagaatt 1500agaaaggaaa gagaaacaat ataagaaaga agttgaagca
aaacaacttg aaccaactgt 1560tcagtcacta gagatgaaat caaagactgc aagaaatact
ccaaattggg attttcataa 1620tcatgaagaa atgaaaggtc tgatggatga aaattgcatt
ttgaaggcag atattgctat 1680actcagacag gaaatatgta caatgaaaaa tgacaacttg
gaaaaagaaa ataaatatct 1740taaggacatt aaaattgtta aagaaacaaa tgctgccctt
gaaaagtata taaaactcaa 1800tgaggaaatg ataacagaaa cagcattccg gtatcaacaa
gagcttaatg atctcaaggc 1860tgagaataca aggctcaatg ccgaactgtt gaaggaaaaa
gaaagcaaga aaagactgga 1920agctgacatt gaatcttatc agtctagact ggctgctgct
ataagcaaac acagtgaaag 1980tgtgaaaaca gaaagaaacc taaaacttgc tttagagaga
acacgagatg tttctgtaca 2040agtagaaatg agttctgcta tttccaaagt aaaagctgag
aatgagtttc ttactgaaca 2100actttctgaa acacaaatta aattcaatgc cttaaaagat
aagttccgta agacaagaga 2160tagtctcaga aaaaagtcat tggctttaga aactgtacaa
aacgacctaa gccaaacaca 2220gcagcaaaca caggaaatga aagagatgta tcaaaatgca
gaagctaaag tgaataattc 2280cactggaaag tggaactgtg tagaagagag gatatgtcac
ctccaacgtg aaaatgcgtg 2340gcttgtacag caactagatg acgttcatca gaaagaggat
cataaagaga tagtaactaa 2400tatccaaaga ggctttattg agagtggaaa gaaagacctc
gtgctagaag agaaaagtaa 2460gaagctaatg aatgaatgtg atcatttaaa agaaagtctc
tttcagtatg agagagagaa 2520aacagaagga gtagtaagta tcaaggaaga taaatatttt
caaacttcta gaaagacaat 2580ttaaacattt ggttctggat acatgttgaa cttagttgaa
tataaaaatc tagattaaaa 2640gtgtgtttac catactgtat aattccattt acatgaagca
tccagaaaag ataaatgtat 2700agggacaaaa agtagattca tgtttgcaag gggctggggc
tggaagctgg tagtgactgc 2760taatgggcat gaggaatctt acagtgatgg aaatgctcta
aagttggatt gtagagatgg 2820ctgcacaact cagtaaatgt actaaaaatc ttttaactta
aaacagatac attctatagt 2880atgtaaatta tatttcaaca aagctgtttt aataaaaaaa
ggaaaaatgt gtttactata 2940tcggcttaga aacatgcctc atttctagga aataaaagat
agaggtgaga gatgatttac 3000tttgagaaaa gacattgtgt cacctatgaa attttattag
gcacagagtc atattttaag 3060gtagatagtt ctgtattgct gaaatagtaa ttttaatgtc
tttatgttgc cacatgttaa 3120gaccataatg tagttataaa tggaaatgtt tacacctgaa
gtgagtattt tcaaattaaa 3180atttaattaa gtgattttct tcgacactta attctagatt
ccccagatga attgaagtgt 3240attgctgtgt cttgtaatac cttgctttaa ctagcttttt
atgtatttta gttggtatag 3300ctttgttatt attcatatta acaaatctga aaatatgtca
aattacgtgt ttttatgacc 3360atgtaatgtt ttaaaggcac ctacttgtta taaaatcata
atttaggata aatgtggtaa 3420aacttagcaa aactatattt ggtttagttt tcccactggt
atttatagtt tactttgaat 3480atttatatta ataattagct cataattttt a
3511
<210> 17
<211> 1338
<212> DNA
<213> Homo sapiens
<400> 17
gggcagtctg cgaagattgc
aggcattgtt tgttcttgtc ttggatttat gcctttaaat 60ttcacctttt attacacagc
tatagcaggc ctttttatga gactaacctg gcctctccac 120taaaggatgt gtgactttct
ggggacagaa gagtacagtc cctgacatca cacactgcag 180agatggataa ccaaggagta
atctactcag acctgaatct gcccccaaac ccaaagaggc 240agcaacgaaa acctaaaggc
aataaaaact ccattttagc aactgaacag gaaataacct 300atgcggaatt aaaccttcaa
aaagcttctc aggattttca agggaatgac aaaacctatc 360actgcaaaga tttaccatca
gctccagaga agctcattgt tgggatcctg ggaattatct 420gtcttatctt aatggcctct
gtggtaacga tagttgttat tccctcacgt cattgtggcc 480attgtcctga ggagtggatt
acatattcca acagttgtta ctacattggt aaggaaagaa 540gaacttggga agagagtttg
ctggcctgta cttcgaagaa ctccagtctg ctttctatag 600ataatgaaga agaaatgaaa
tttctgtcca tcatttcacc atcctcatgg attggtgtgt 660ttcgtaacag cagtcatcat
ccatgggtga caatgaatgg tttggctttc aaacatgaga 720taaaagactc agataatgct
gaacttaact gtgcagtgct acaagtaaat cgacttaaat 780cagcccagtg tggatcttca
ataatatatc attgtaagca taagctttag aggtaaagcg 840tttgcatttg cagtgcatca
gataaattgt atatttctta aaatagaaat atattatgat 900tgcataaatc ttaaaatgaa
ttatgttatt tgctctaata agaaaattct aaatcaatta 960ttgaaacagg atacacacaa
ttactaaagt acagacatcc tagcatttgt gtcgggctca 1020ttttgctcaa catggtattt
gtggttttca gcctttctaa aagttgcatg ttatgtgagt 1080cagcttatag gaagtaccaa
gaacagtcaa acccatggag acagaaagta gaatagtggt 1140tgccaatgtc tgagggaggt
tgaaatagga gatgacctct aactgataga acgttacttt 1200gtgtcgtgat gaaaactttc
taaatttcag tagtggtgat ggttgtaact ctgcgaatat 1260actaaacatc attgattttt
aatcatttta agtgcatgaa atgtatgctt tgtacacgac 1320acttcaataa agctatcc
1338
<210> 18
<211> 1392
<212> DNA
<213> Homo sapiens
<400> 18
gggcagtctg cgaagattgc aggcattgtt tgttcttgtc ttggatttat gcctttaaat
60ttcacctttt attacacagc tatagcaggc ctttttatga gactaacctg gcctctccac
120taaaggatgt gtgactttct ggggacagaa gagtacagtc cctgacatca cacactgcag
180agatggataa ccaaggagta atctactcag acctgaatct gcccccaaac ccaaagaggc
240agcaacgaaa acctaaaggc aataaaaact ccattttagc aactgaacag gaaataacct
300atgcggaatt aaaccttcaa aaagcttctc aggattttca agggaatgac aaaacctatc
360actgcaaaga tttaccatca gctccagaga agctcattgt tgggatcctg ggaattatct
420gtcttatctt aatggcctct gtggtaacga tagttgttat tccctctaca ttaatacaga
480ggcacaacaa ttcttccctg aatacaagaa ctcagaaagc acgtcattgt ggccattgtc
540ctgaggagtg gattacatat tccaacagtt gttactacat tggtaaggaa agaagaactt
600gggaagagag tttgctggcc tgtacttcga agaactccag tctgctttct atagataatg
660aagaagaaat gaaatttctg tccatcattt caccatcctc atggattggt gtgtttcgta
720acagcagtca tcatccatgg gtgacaatga atggtttggc tttcaaacat gagataaaag
780actcagataa tgctgaactt aactgtgcag tgctacaagt aaatcgactt aaatcagccc
840agtgtggatc ttcaataata tatcattgta agcataagct ttagaggtaa agcgtttgca
900tttgcagtgc atcagataaa ttgtatattt cttaaaatag aaatatatta tgattgcata
960aatcttaaaa tgaattatgt tatttgctct aataagaaaa ttctaaatca attattgaaa
1020caggatacac acaattacta aagtacagac atcctagcat ttgtgtcggg ctcattttgc
1080tcaacatggt atttgtggtt ttcagccttt ctaaaagttg catgttatgt gagtcagctt
1140ataggaagta ccaagaacag tcaaacccat ggagacagaa agtagaatag tggttgccaa
1200tgtctgaggg aggttgaaat aggagatgac ctctaactga tagaacgtta ctttgtgtcg
1260tgatgaaaac tttctaaatt tcagtagtgg tgatggttgt aactctgcga atatactaaa
1320catcattgat ttttaatcat tttaagtgca tgaaatgtat gctttgtaca cgacacttca
1380ataaagctat cc
1392
<210> 19
<211> 1209
<212> DNA
<213> Homo sapiens
<400> 19
tgcagagatg aataaacaaa gaggaacctt ctcagaagtg
agtctggccc aggacccaaa 60gcggcagcaa aggaaaccta aaggcaataa aagctccatt
tcaggaaccg aacaggaaat 120attccaagta gaattaaatc ttcaaaatcc ttccctgaat
catcaaggga ttgataaaat 180atatgactgc caaggtttac tgccacctcc agagaagctc
actgccgagg tcctaggaat 240catttgcatt gtcctgatgg ccactgtgtt aaaaacaata
gttcttattc ctttcctgga 300gcagaacaat ttttccccga atacaagaac gcagaaagca
cgtcattgtg gccattgtcc 360tgaggagtgg attacatatt ccaacagttg ttattacatt
ggtaaggaaa gaagaacttg 420ggaagagagt ttgctggcct gtacttcgaa gaactccagt
ctgctttcta tagataatga 480agaagaaatg aaatttctgg ccagcatttt accttcctca
tggattggtg tgtttcgtaa 540cagcagtcat catccatggg tgacaataaa tggtttggct
ttcaaacata agataaaaga 600ctcagataat gctgaactta actgtgcagt gctacaagta
aatcgactta aatcagccca 660gtgtggatct tcaatgatat atcattgtaa gcataagctt
tagaagtaaa gcatttgcgt 720ttgcagtgca tcagatacat tttatatttc ttaaaataga
aatattatga ttgcataaat 780ctgaaaatga attatgttat ttgctctgat acaaaaattc
taaatcaatt attgaaatag 840gatgcacaca attactaaag tacagacatc ctagcatttg
tgtcgggctc attttgctca 900acatggtatt tgtggttttc agcctttcta aaagttgcat
gttatgtgag tcagcttata 960ggaagtacca agaacagtca aacccatgga gacagaaagt
agaatagtgg ttgccaatgt 1020ctcagggagg ttgaaatagg agatgaccac taattgatag
aacgtttctt tgtgtcgtga 1080tgaaaacttt ctaaatttca gtagtggtga tggttgtaac
tctgcgaata tactaaacat 1140cattgatttt taatcatttt aagtgcatga aatgtatgct
ttgtacatga cacttcaata 1200aagctatcc
1209
<210> 20
<211> 10490
<212> DNA
<213> Homo sapiens
<400> 20
acccccatga atccacaacc
ataaccatgg ccacgcgggt cgaggtgggc tccataacgc 60ccttgacggc cgtgccaggc
ctgggtgaga tgggcaagga ggagaccctg acgaggacct 120acttcctcca ggccggcgaa
gcctctgggg ctcccccagc ccggatcttg gaagcgaaga 180gccccctgcg gagcccggcc
cggttactcc ctctgccaag gctcgccccc aaacccttct 240cgaaggagca ggacgtgaaa
tctcctgtcc cgtctctgcg gcccagttcg actggacctt 300ccccctctgg ggggctctct
gaggagccag cagcaaagga tctggacaac aggatgcccg 360gcttggtggg gcaggaggtg
ggcagtgggg agggcccgag gacgagctcg cccctcttca 420acaaggctgt gttcctgcgg
cccagctcca gcaccatgat tctcttcgaa accaccaaaa 480gcggccccgc tctggggaag
gcggttagtg agggggcgga ggaggccaag ctaggtgtgt 540ccggctcccg gcctgaggtg
gctgccaagc ccgccctgcc cacccagaag cctgcgggga 600cccttccccg gtcagctccc
ctgtctcagg acacaaaacc acctgtaccc caagaggagg 660caggccaaga ccatcctccc
tcaaaggcca gcagtgtgga ggacacggca cgcccccttg 720tggagcccag gcctcgcctg
aagagaaggc ccgtgtctgc cattttcacg gagtccattc 780agcctcagaa gccaggcccc
ggcgcagcgg ccacagtggg caaagtgcca cccacccctc 840ccgagaagac gtgggtgagg
aagcccaggc ccttgtccat ggacctcacg gcccggtttg 900agaacaaaga ggccttgctg
aggaaggtgg ccgatgaagg aagtggaccc acagcagggg 960atatggctgg gctagagagg
cccagagcag cgtccaagct ggacagggac tgtttggtca 1020aggcggaggc tcctcttcat
gatcctgatt tggacttcct ggaggtggcc aagaaaatcc 1080gtgaacggaa ggagaagatg
ctttcgaagc cggagatggg cagccccaga gccctggtgg 1140ggggctcatc tggggtcacc
cccagcaatg accagagtcc ctgggaagaa aaggccaagc 1200tggacccaga gccagagaag
gctgctgagt ccccctcacc caggctggga aggggcctag 1260aacttgctga ggttaagagc
agagtggcgg atggggaggc cgcggcaggg ggagagtggg 1320cctccaggag gagtgtcagg
aagtgcatca gcctgtttcg ggaggacagc accttggcct 1380tggcagtggg gtctgaatct
cccctggcca cccctgcgtc cccatcggcg gcaccagagc 1440cggagaaagg ggttgtgagc
gttcaggaac ggatcagagg ctggactgcc gagagctcag 1500aggctaagcc cgaggtcagg
aggaggacgt tccaggctcg gccgctgtcg gcggatttga 1560ccaaattgtt ttcaagttca
gcttccagca acgaagtcaa atatgagaag agtgctgagc 1620tgagcggcga gtttcctaag
gaaccgagag aaaagcaaaa ggaggggcac agtttggatg 1680gagcatgcat cccgagaagc
ccctggaagc ctgggacact ccgggataag tccaggcaga 1740cggagcagaa ggttagctct
aaccaagacc ccgacagctg tcgcggtgga agctcagtgg 1800aggccccgtg cccttctgac
gtcactccag aggatgaccg gagcttccag actgtgtggg 1860ccacagtatt tgagcaccac
gtggagagac acacagtggc tgaccagtcg ggacgttgtc 1920tctccaccac accccctggt
gacatggccc atgcccgtgt ctcagaaccc aggccgaggc 1980ctgagatggg ctcttggctg
ggcagggacc caccagacat gacaaaactg aagaaagaga 2040actccagagg gtttgacaat
cccgagacgg agaaattggg accaaccacc cttttgaatg 2100gtgaactgag accgtatcac
acgcctctcc gggacaaata ccctttgtct gaaaaccaca 2160ataataacac cttcctcaaa
cacttggaaa atcctcccac atcgcagaga attgagccca 2220gatatgacat tgtgcatgca
gtgggagagc gtgtgcacag cgaggccatc tcaccggcac 2280cggaggagaa agcggtcacg
ctccgcagcc tcaggtcttg gctctcactg aaggacaggc 2340agctgtccca ggaggtcacc
cctgctgacc tggagtgtgg tttggaaggt caggcggggt 2400ccgtccaaag ggccagtttg
atttgggaag ctcgaggcat gcctgaggct agtggaccga 2460agtttggggg caattgcccg
tttcccaaat ggacaggcgg ggcagtggtg agctcgcaca 2520aagccaccgt ggcagtcagc
gaagagcact gtgctcccgg ggccacctcc gtcagggcta 2580tcaaggctgc catctgggaa
agccagcatg aggggccaga gggggccaga agcaagccag 2640gagtgggagc aaggggccca
ccccagggat gccccctcga tcctctttcc agggctacga 2700atgggccttc tgactcccaa
gcacgaacac atccagatgc atttgctgtg cagaaagggc 2760ccttcattgt agccgccagg
gagggtgatc cagggccggc ccaggtgcca cagcctgcag 2820tcagaatgcg gaaagccggc
gccatggacc agagaatgga cagatggcgg cggcggactt 2880taccccccaa cgtgaaattt
gatacattca gttctcttgt cccagaggac tctccacatg 2940tggggcacag acgaacagat
tatgtgagcc ccacagccag tgccttaaga aaacctcaac 3000tatcccacta cagggtggag
acccaggagg tgaacccagg tgcttcacgg gaccagactt 3060ccccagcagt gaagcaaggg
tcacctgtgg aacccaaggc gacatttttt gcagtcacct 3120atcagattcc caatactcaa
aaggcaaagg gtgtggttct gtcaggagct gaaagcttgc 3180tggaacattc tagaaaaatc
actccaccct cgtctcctca ttctttaaca tccactttgg 3240tttctcttgg tcatgaagag
gcattggaga tggcaggcag taaaaactgg atgaagggac 3300gagagcatga aaatgcaagc
attttaaaaa ctctgaagcc aacagaccgt ccatcatctc 3360ttggggcctg gagtctggac
cctttcaatg gaagaatcat tgatgtggat gccttatgga 3420gtcatcgggg atcagaagat
ggccctcgtc ctcaaagcaa ttggaaggaa agtgcgaaca 3480agatgtcccc cagcggcgga
gctccccaaa ccaccccgac tctgaggagt cgtccaaaag 3540atcttcctgt gagaaggaag
actgatgtga tcagtgacac gttcccaggt aaaatcagag 3600atggctacag atccagcgtt
cttgacattg acgccctgat ggcagagtac caggagctgt 3660cgctgaaagt ccctggggag
gctcaggaga ggaggagtcc caccgtggag cccagtacgt 3720tgcctcggga gaggcctgtt
cagctgggcg gggtggagca gagaaggagg agcctgaagg 3780agatgcccga taccgggggt
ctctggaaac cggccagttc tgccgaaata aatcacagtt 3840tcactcctgg cttaggcaag
cagctggcag agaccttgga gacagccatg ggcaccaaat 3900ctagccctcc cttctgggct
ctgccaccct cggctccttc tgaaaggtat ccagggggct 3960ctcctatacc tgcggatccc
aggaaaaaaa cggggtttgc tgaggatgac agaaaggcct 4020ttgccagtaa acatcatgtt
gcaaagtgtc agaattacct ggctgagtca aagccctctg 4080gtcgggagga tccaggcagt
ggggtcaggg tgtcacccaa atcgcccccc actgaccaga 4140agaaagggac cccaaggaaa
tccaccgggc ggggagagga ggacagtgtg gcccagtggg 4200gtgaccaccc acgtgactgt
ggacgggtgc cgctggatat caagagggcc tactcagaga 4260aggggccccc tgccaacatc
cgagagggcc tgtccatcat gcatgaagcc agagagagga 4320ggcgagagca gcccaaaggg
aggcccagcc ttactggaga gaatttggag gccaaaatgg 4380gaccctgttg gtgggagtca
gggactggag acagtcacaa ggtgctgcca cgggacctgg 4440agaaggagga tgccccccag
gagaaggagc gaccgctcca gcaggtgtcc cctgtggcct 4500cggttccctg gagaagccac
agcttctgca aagacaggag gagtgggccc tttgtggacc 4560agctgaagca gtgtttctcc
cggcagccca ctgaacccaa ggacactgac accctcgtgc 4620acgaagccgg cagccagtat
gggacgtgga cagagcagtg ccagagtggg gagagcttgg 4680ccactgagtc cccagatagc
agtgccacat cgacaaggaa acagcccccc agcagccgtt 4740tgtcttctct gtcctcccaa
acggagccca cctcggcagg ggaccagtat gactgctcca 4800gggaccagcg gagcaccagc
gtggaccact ccagcactga cctggaatcc accgatggga 4860tggaggggcc gcctccaccg
gacgcctgcc ctgaaaagag agtagatgac ttctccttca 4920ttgatcaaac ctcagtcctc
gactcaagtg ccctcaagac ccgggtgcag ctcagcaaga 4980gaagccgccg ccgggccccc
atctcccact ccctccggcg cagccgattt agtgagtccg 5040agagcagatc acctttggag
gatgagactg acaacacgtg gatgttcaaa gactcaacgg 5100aggagaaatc acccaggaag
gaggagtcgg atgaggagga gacggcatcc aaagctgaga 5160ggacccctgt cagccatcct
cagaggatgc ctgcgtttcc aggcatggat ccggcagtgc 5220taaaggctca gctgcacaag
aggccagagg tggacagtcc tggcgagacc cccagctggg 5280caccccaacc caagagcccc
aagtccccct tccagcctgg ggtgctgggc agtcgcgtgc 5340tgccttccag catggacaag
gatgagaggt cggatgaacc ctctccccag tggctaaagg 5400aattgaaatc caagaagagg
caaagtcttt atgagaacca agtttgacca ggcagggaac 5460actgccacat ctacgtaaca
gaagccttaa ccatcagaac ccgcagacga ggccgagctg 5520ctgccgtttc ttcctgcaca
acgcttacgt gcctgggccc ttcccattgg attgagaacg 5580ctatccagtg cccctgtcct
cgccaggctc tccctggatc cagacgggaa gacccaacct 5640ccaggagcac tcgctcatct
ccccagacag cacttcaggc tgggaaagga gccaggctgc 5700ccagaaacgt ccatgtgggt
ggttttgctt tttatgtaaa aatttgcatt tctacctatt 5760ttagacacct tcatcagctc
caagaacatc agtgtgaggc tggaattgtc tccccaggtt 5820ctgaaaaaca cctgcatttg
tgaaagcacc tcctcacccg accccggggc aggtcttttt 5880ttggaaggac atttccaagg
aagatcaaac tgttcatttt ctgctgtagt ctcggtgcag 5940ggtattaccg ctgggttgag
gttttctatt cttttttttt ggtgagttcc ttcttcccct 6000tgaggaatca gcagttccaa
tttttaaaca tactctccct gattatgtgt atttctcttt 6060ctagtggggc tgtgtgcatc
gttggcctat gttattgtat gtcatttttg tttgctttag 6120aaggcgataa agcaataatt
cagctaattt tctttgaaac ttgataggta tatatgtgtt 6180tacgttaaag gacaggagga
aagatgtgcg aataatttgt tctgaagtat ccagtcacct 6240caggttatct tatccctatc
caagctgttt agaagttata attgtcactc ttgtatttat 6300ttccatggct tcttttcatt
tgagctctgg tttcggtagg gtgacctttg ccccttggcc 6360ttaagggttc ataaactgca
gcccaagtgg tggtgccttt gcttatgaat catcagacct 6420gccctagtaa tctttttctc
cagagtcttc tttgaacatg acgtgggctg ctggcatgga 6480ggagttgttc caaactggcc
tcagaacaga tgcatgaatg aagggtactt tttgcttttg 6540ctcacctcat tttccttcat
ctttcttgat gaatccatta tttgcaaatg ctgtcaaaac 6600atgctttcct tctccctggc
tacctctcag gagttcattt gtctctctgt ctcagcttcc 6660tccctcatct tcctttaccc
ctcttctctg tcttagtggg tggatgggca tggatagaaa 6720tcctctgttt ctttaggtta
ggataaacac ctgggccctg gtgaggcagt tgctgttcag 6780caccagggac agcaactgca
gggctgtgag gccggagccc accttgtgtc ctgttcgatg 6840agctttggtt actgatgagc
aattgccaga tcatgtcacc cacccagcct tttacacctg 6900ggagcctcat gcatctgggt
ttgggagttt ggccctgctg atgatagttt gttttctctt 6960cattccctga attagtcatc
agttctccgt ggccatttgg ggattcatgc ctgcaactgc 7020ttcagtcgaa ttctttttta
aagatccagt ttatgtttta taataatttt gatcttatgg 7080ctttcaacgt caataggatc
ccctttaaaa ataccctgct agtgttttgt tttgttttgt 7140ttttctgaga caaggtctta
ctctgttgct caagctggag tggagtggca cgatcatagc 7200tcactaacac ctgggctcaa
gtgattctcc cacctcagcc tcccaagtag ctgggactac 7260aggtgtatgc taccatgacc
agctaatttt taaatatttt tgtagagaca agggtcttac 7320catgttgccc aggctggtct
cgaacttctg ggctcaagca attgtccctc ctcagcctcc 7380caaagtgctc ggattacaga
catgagccac tgcacctggc ctgaaacaca ttcctagtct 7440tttgatcctg acttcttatc
tgggctgctg tctctctttc cttcaggtgg aaaggacccc 7500ttggtaccat ctcaagcagt
aggaaaggac ctgctcttac atatattgat ggtcctcatg 7560caaaactctc aattcttaat
tagtcatgtc tcagctcaag gatatcagac aggaatgaaa 7620cacacttgaa aatgagattg
acctggagat tttttttccc taatctctca taccttaatt 7680ggaaaaataa tcaattaatt
ctatgttaat taggatatac aaagttcacc ctccttgaaa 7740gtgactaggg caagccctga
agatcttcct cacctccttt tatttttcta taaccttgtc 7800tcctccagca ccacagggaa
gacaatcaca gtgggtcaag agcgaccctc tttcacgtgg 7860gctctgccat gacctctgag
acctgcttat gatcagtgca atgaagttag aagtaactga 7920tgattgggag cctttgcaga
tagctgggca aatgggtgat ttacttatcc ccattctaaa 7980tggagtgagc tctctttgag
gctaagcaag gaggcgttgt atgctagttt ctagactttg 8040cctggagacc cctttggaaa
tctgtcttct ttttaaactc acttaatatg ccttaatcat 8100ctgtgtgtaa tggagtcatc
cgctcctcaa tctaaccctc ctcccctggg gctttggctg 8160tcctcaatga gagtttcatg
cagaatggaa aatcctctat atgtacaatc tctctccccc 8220tcatttctct tcctcctcac
ctccaccacc cctttgcaca tcagcatttt aacagctgat 8280cttttgagaa gcctgtatct
ttttcctctt cagtagatac ccttcttcat ggtcctttgc 8340ctaatcaaac agaggctttt
ggcctttgaa aatccatgac aaggcctcag aaatcagtgt 8400tgtggaggat tactccatgc
caccggagaa actctggtga aagagaaacc tcgtggtctt 8460taggatgttg ggattttgag
tgaacctgac ctgatagcct caggattcag ggaaaggaca 8520atcagatggc ggtgttttcc
agggggacgc gccaaatcat gtggtttcag acaattgtgt 8580ttgcctttgt gcctccctgg
aagggaggcc aactaagggt atcaccaaga agccaaaaga 8640gaaataggca tgagcctgtg
gttttaaact ttacaggctg ggcaaaggat ttagaaagac 8700ccttagcatg attttcctaa
aagagacctt agctgctcca acctggtgct gatagctgct 8760ttgttgatct atgctttaaa
atttttcttt ataatgcccc cagatggctc ctggaactag 8820tcgtaattgc aaactgtaaa
aatccctcct ccccagtgta gatatttaaa ccagagtaag 8880tgatggggag acattctgtg
gtctctgaat gtgccttccc cctcaccgtg tgttaaaaca 8940caaaagccga agttccatgg
catcatgatt ccgaggggct ggagggatag gacccactcc 9000acatctaaag gggatctgct
ttgggctcgg tcccattagc gagtggggga ctcttgctgt 9060gtgctaagag gctgctagga
ctcacccagt tggaattctg ggtgggctca ggaagtttag 9120agccacgtaa aaagctggta
ggcatgagtg tgccaggtct ttgccagcct gcgtctcctt 9180ttgcaccccc caatccagag
tttgctttct tttgactaaa ttggctcctg cagggggaag 9240ggcagaaagc taggccctct
gctctggaaa gtcggcctga ggtttccggc aagttaaccc 9300ttaaaatgga cacccctcag
cccgccctcc cctttggcct tcccagaatc tccttcagtg 9360gttgctctca cacctgtgcc
ataacatcat cttccatgac ttggacgggc acttccttga 9420caattcctat tggcatcaca
cgggctacaa attatgctgt tttctaaaga atttgaactt 9480tttttttttt cttttcttga
gacacggtct tgctctgttg gccaggctgg aatgcagtgg 9540cacaatcata gctcactcca
gcctccaact cctgggctca agcaatcctc tcatctccac 9600ctccagagta gcctagatga
caggcgcaca tcaccacgcc cagctgatct tttaaatctg 9660ttttgtagaa acaggatctc
actatattgc ccaagctggt cttaaacttg gtctcaagtg 9720attctactgc ctcagcctcc
caaagtgctg ggattacagg cctgagccac catgcccagc 9780tgaatttgag ctttttaata
atctcattcc acatagcctt atagatcctg taaatagggg 9840gggtcacaaa agtaatatat
tgtgttatgg aagataattt tgtactgtgc tgtttcctaa 9900atcataccaa tatcctaaag
tcatgcactt cccagatgat cgtgatcctc caaatgcttt 9960gtaagatggg gcagggcgtg
gaaatatata tatatacaca cacacacaga gacacacaca 10020cacacaagta tagtatatat
tttcctaacc tttcttctgg gtccttcctc agatctttga 10080gtcacgatag aaaaggagct
cgagttcttt gtgtaggaaa gttaagcttc ctgcctgcgg 10140tgttcttgca attgccttag
gaattcacaa gctctaggag ttctgaacgg aaggcagacg 10200agaggcactt tatccagtcc
cagaaagaat ctctaaccgt gtgactgaga agtcatctag 10260aaaaacttat atttttaatg
taaaaacaaa tggggcttac cagacctcac agagtattgg 10320acgtctacaa gtgcttttat
attttgtaac tgtaaagaag tttcatatgc acagaagagc 10380agttggaaat ctggtcgact
gcaataaaac aagatgacct ttgcatgtac aaagatgttg 10440cattcagact atgaaaatag
caaataaagc tttggtgcaa gttgcattgg 10490
<210> 21
<211> 9335
<212> DNA
<213> Homo sapiens
<400> 21
aaccgtcccc aacggtcgcc agcgcgcggc tcccgggccc cctcgctcct cccctctggc
60ccctccgagc gctgggggtg cttccctccc cgcctcctcg cactcccact cgcgggcacg
120ccggtggctg tgccgccatg ccccaccaca ggctgctcac gggcctaggc gcggcggccc
180gagatagggg cccctgagcc tcccggggga ggagagaggc agcgagccgc gctccgcccc
240ctccccccgg gccgccgccg ccgccgcctc gccgccccag cctggcggga gaaggaggag
300gagcagcagg cctccgggcc cggcgccgcc gcccgcagga catttctgat ttatttgcat
360ccttgaagag catctgtaga agaggaagta aaagatgggt gtgaaaacat ttactcatag
420ctcctcttcc cacagccagg aaatgcttgg aaagctaaat atgctgcgaa atgatggaca
480tttttgtgat atcactattc gtgtccagga caaaatcttc cgggcacata aggtggtact
540agcagcttgc agtgatttct ttcgcaccaa acttgtaggc caagccgagg atgagaacaa
600gaatgtgttg gatctgcatc atgttacagt gactggcttt atacctcttt tagaatatgc
660ttacacagcc actctatcaa ttaacacaga aaatattatt gatgttctag cagcagccag
720ctatatgcaa atgttcagtg ttgccagcac ctgctcagag ttcatgaaat caagcatttt
780atggaataca cccaacagcc aacctgaaaa gggtctagat gctggacaag aaaataattc
840taactgcaat tttacttctc gagatgggag catttctccc gtgtcctcag agtgcagtgt
900ggtagaaaga accattcctg tctgccgaga atcccggaga aagcgcaaaa gctacattgt
960tatgtctcct gaaagtcctg taaagtgtgg cacacaaaca agctcacccc aggtattgaa
1020ttcttcagct tcctactcag aaaatagaaa ccaaccagtt gactcttcct tagcttttcc
1080ttggactttt ccttttggaa ttgatcgaag gattcagcct gagaaagtta agcaagcaga
1140aaatacccgg actttagaat tacctggccc atctgagacc ggtagaagaa tggctgatta
1200tgtgacttgt gagagcacaa aaactacctt gcctttaggt accgaagaag atgtccgggt
1260caaagtagaa agattaagtg atgaggaggt ccatgaggaa gtgtcccagc ctgtcagtgc
1320atctcagagt tcgctgagtg atcagcagac agttccagga agtgaacaag tccaagagga
1380ccttctgatt agtccacagt cttcctctat aggctcagta gatgaaggcg tttctgaggg
1440cttgcctaca cttcaaagca cgtctagcac taatgctcct ccggatgatg atgatcgatt
1500ggaaaatgtt cagtatccct accaactcta cattgctcct tccaccagca gtacagagcg
1560accaagtcca aatggtcccg acagaccttt tcagtgtcca acctgcgggg tgcgattcac
1620ccgtattcag aacctaaagc agcacatgct catccactca ggaattaaac catttcagtg
1680tgaccgctgt gggaaaaagt tcaccagggc ttactcgcta aagatgcatc gcctaaagca
1740tgaagtgatt tcctagtacc ggactactta aaccaggaac aagaagagac ccttgttcaa
1800tatgatcttg gagaacacgg ttttgaaagc aactcctctg ttcaaatgcc tgtcatttca
1860caggtctcct caacccagaa ttgtgaaagc acttttccct tggggtctct tggtgggctg
1920gcagaaaaag aggaagaagt gccagagcag ccaaagagca gtgcttgtgc tgaggcaacc
1980agagatgacc ccccaaaatc agagctgtct tctataacta ttgagtagtt ttgtgatttg
2040gcttcagttt tgttttttgg aaagtgcctg tgcttggtct tgtacattta aatttaattt
2100aattttttaa acaaaaaaag ccgggtggga gggaggggga gattgggaaa gaatttccct
2160ttttactttc tgagccctga aactgatttt atttttccta actgagagat tgcttctgta
2220agtacacaat aacatgatgt tgaaacagaa actatgagac ttaaggagaa ctggttgact
2280taaaacatat accagttcct tcttccattg ttaaaagtag gctaacaaca gatcattagc
2340tagagaggaa atcagatgat tattgacctt cttgagacaa gagggtacat gagaaactaa
2400ttactaaaca gcttgacaaa tggcctgagt agatacttac tgctgtacac aggatgttgt
2460ataatatttt gtaaagcctg ttgtttttgg aagtatttat ggtaagcttt cttaaaaatt
2520attatggtaa atacttctga aatccggcgt acttttcttt aagctttgtc atttctgtta
2580tgatttttca tggtgaaatt ttggtactga gatgggcatt ctctgtacct ttatagtacc
2640actccaaagg caaggaacca tgattgacaa cagtcaagct gtggatgaaa tgaccaggaa
2700cggagaatga agtatgtaaa tcccagcttc ataggaactc ttctcatact gcttttcaga
2760ttaaaattgc tgtttacctg gtctccgaat gtaatgcctg actgtgtcat tgcccggatc
2820agttttcccc ctgccccatc aatatgttct cttgcatata ttggcgtgct gccatataaa
2880gtaaaaatac tggagatatt ctatatttta tatataggtt tatgtgttgt tggggatgtt
2940ttcattgtgc tcttttggac ataataaata attctctatt gaggctacat tctttttttt
3000tttctttttt tttaaaagaa tggcatctca ctctgttgcc caggctggag tatagtggct
3060aagtcatagc tcactgcagc ttcgaactcc tgggctcagg ccattctcct gcctcagcct
3120cttgagtatc taggattata ggcatgctcc accacacctg gttaacttca tttttatttt
3180ttgtagagat gaggtctcac tatggtgccc aggctggtct tgaactccta gaccaagtga
3240tcctcctcct ttggcctccc agactgctgg gatcactgca ccggccaagg ctgcattctt
3300aacccaacta gattgtttac tgaatcccat atgacagcga tacattgtcc ttacatattt
3360atttttagac attgcaaagt tattaaaaac agttaactat agtttttaca caacgtaggc
3420aacaatgaag agtatagact gtaagatttt catctatgac tcataaatct gggagaaaaa
3480aattattaag actaatgaga aactgaaaac cttaaactaa tgaatattat ttctgctgct
3540aaaaatatga aactttctgg tctgtagttg aaatttgtat gatcctctag acttgggtat
3600acttttcatc tggtgccatt aaagcatctc taatattgat cctaaatatt tgtaagtcca
3660tgagcagtga actttggaat aagtttctgt gtagataccc aaagtttaat aattatgaga
3720gcacctgatt tgatagacag aaaatacagt tctttagtca aaacacaaga tctgaatttt
3780gttcaggtgc tagaccatac taaatgtata tatttttaat tatagtgatt tgtttcattt
3840ttttagattg gctaattctg taattttttc cccaaaaaca tgtgaagaaa ggaaaagtaa
3900attaaattcc ttagaactgt tttaggttaa gattctctgt gtctgcccat attctgcagt
3960ccttaacttg ttttcaactc tttacctcac tcatgaactt gtttttaccc atttgctgcc
4020aaacataggt gtgttccctt caggagaatc agcatataca ggttatgata ggctcccacc
4080atttatgctt cttcactgat agggttgcat tactttctgc agcagactat aatacttcat
4140atagtactgc tgtgtctcaa ctggaaaggt caggaatttt aatgttacgt tgtggtcttt
4200gaaaactgtt aggcctagca atagataaat ctcaaaatta attttaaaat ctgtattgac
4260aagaataggt aaaattatga ccaggaatca ttgttaccct tagttccaag aggtggtttt
4320tgaaagagca agaggaaaga aaaaagaaaa gaaggaaaga agaataaaga agagaaacgt
4380gtaagaatgg tttgcagatt gattggttaa aagtgttttt agcacatccc caaccctgaa
4440aacttcgcat taagagccaa gcacaatgtt ggtagctcca agatattctg actgtgttct
4500cagaatgagg gattacccat cactgggtat tccttcccaa gtagaaactt tagattttca
4560ctggtaatac acattgccaa gttttatggg aaatctgaat atactgtgaa aatgcatatc
4620tggttagttg tctgctgccc agatcttatc aataccagta actaaccagt atttaacata
4680aaatgataca aataaaggcc tttttctatt tcagtgaggg tacatttttc ttgatatata
4740tgtactttaa ggatattgga tctgtttatg gatctgtttt aggaaacaga tttgcaaggg
4800ataattgtat atatagtagt atttaggttt atttcaaatt catcttaggg atgcctagat
4860gcataatttt taccaggaca tattgaaaat attgcaaaga gatagccagt tatattatcc
4920cattcattag aaattaccag tgtaactaaa cataaatatt ccagtttaga gtgcttaaac
4980gtagctatct ttcttaaggc caggagggta actttgtggt atctaaaggg cttaaatttc
5040agaatgcaga ataaattgcc tttttaaaac caagcatttt gtacaagctt ttattttcag
5100ttttttaact accaaatagg tgtgatgtac tttagaagta aacaaagatg ttcacccatg
5160ataatggatg ttaaagctcc tgcagttgtt cttttcgtgt ttaaagggta tattctaata
5220gtggaagcat caaacatgtc agtgatttca tctcattcag aaataaagag attaatattg
5280gtctttattt tttggcattt taagttttat ataaatggat gcagatggag attatcaccc
5340acaaatatat ttaaatggat ttttcttaat ttgaatttca agtaatcttt ttatttcaag
5400taattagtta cgcatttagt tacattgcca gttttttttt tttatcaatt tcagtaagac
5460aaaatatact aaaatgttta aatagcctca ttcaatctta cattttgaca tttcagcaat
5520cattctggct tacagtaatt agatcccctg ttacgacaca tgccctttgt tcttaataac
5580tagcaaaaaa aaaaaaaaaa aaaaactttc atcttgttaa aatactttgc caaatgaaat
5640agactagtca atacatctga tgtccataat tattggtaac tcagttacct tctaactaat
5700aggctggttc aggagactct cccagtttat aaatggttct cttgggagcc tttggaagct
5760gtattaaatc tttcagtctt ttatttctaa ttttttctct taatctaaat agaggccagt
5820tatctatttt atcagctttt attcttgaag attctcagat tatgttttag tcccttttag
5880ctttaatagt ccttgaaaaa tacattactg tataatgtgg caattctgta acagagactt
5940attacttgaa tgaataatcc taaaatttta atattttagc tgaagtttga gatttgtgga
6000atgaacaaaa gaattagaaa ctttcatatg ttactttgtt tcagtcatct gcaaagtatg
6060aagctgtaat tctgaaatac acatccaagt gaatgagaat taaaaatttt ctaaatatta
6120atactaactg ggaaaaaaaa actagtgtga agtttacagt tagaagaaac agacccaaag
6180ttgccagaag gtaataaata aatgtagttt tcactgtaag taagttattg acgtaagatg
6240ctttatttgt aatatattta gattttgaaa gttattgaga gatgaatgta taaaagctaa
6300attttctttt ctgaagcagt gaaacaaaat tggggtaaca aggaagctct gttgtggcaa
6360acatgtctat gaggaatatt aaaactaagc atactcccac aggctttaaa ctcaaactat
6420gaacatttaa attaagttgt tccttatttt gcctataccc atttttatct ttcattgtcg
6480tttttgcttg acagtatggt gacagagtat ttttatttgg aaagtcctca gcaagatgaa
6540ttagccaaaa gcaataatgg ttcagattaa acaataaagt ggaattgatt caatcccagg
6600ctcaactaag aacagtcgct ctctggatgt ttcattttag acgatagata agttgagatg
6660ttgtaatatt tatggggggt taagcctgtg tcagttatgg gatgaagact tgtagtacca
6720gtaccatcag tggtcatact tttttttaac tttttactaa actaatacag ttagacattt
6780ccactctatc gtgattatat ttttatgatg ggaaaataaa aacacttcca tgtttttata
6840aatagtctct gcaaagattt cagatgttat tggtatctcg gtttggcagt atctgaaaaa
6900ttgagattgt ctttgaaatg tttgtgctac ttttacttaa gtaaaccccc actgtgcaag
6960acccaggccg gcttcagcta ataccaaggt ttctgtgtgc ataatagttt acagagaact
7020taagagtaag gactgcggat taaaaacaaa acttttttta actttaaaat ttttagtttt
7080tgttcaaagt acctggttta taaagtcaaa ttcttttatt agttcctttc tcgtttaaat
7140tgactgatgt tgctgatgaa gcttaaagtc ccaggcacgg ttgtggcgat atactgataa
7200aattggtgcc tagtggtggg aggagctcca gtgtcaggac ttttattaaa aggcccttgt
7260tttcccaaat gccaatctag ccacatttag atttcattat tcaataaaac agatgaaaaa
7320tcatcccata aatgaatgtt gaggttacca aagtacatca cctgctgagg aaggataaat
7380cttcctgctt taagggagcc ctgtcatctc tcctcttaat gcacgtttcc cttggtatta
7440gtggaagctg tgttcaagat gggaagcctt tcctgcagtt cttagaaaca cctgctttct
7500aaggagagcc ttttctagga ttagcttatg tgtgttttct ctaggcgatt ttttatttca
7560gttaccaatt taattttcaa gttgacagat gctgtgtaaa gtctctcata atgagagtag
7620tccattaaat tgttgaaagt tgcactgctt ttcatctttc aggtacctga aatgagtgac
7680atcaggtatt tggaaggagt aagatcataa actgtattca ttttcttcct tgtacaaagt
7740gatgacttct aatgcttata tctcaaggta ttttttaaaa aagcaacggt ccctaataga
7800gtaaaatttg gttttggtcc aagttcccaa taatgtattt aatgtttctg ttgtttactg
7860gtgcctcccg ttgcatcagg tagagattgc ctgcctcttt gtagggcagc cttgtggcac
7920cttatgtcca acttggagga tagtatatgg cttctttgtg cctctactat cttttcaaaa
7980gccattttat aaaaatccta ggtagcctat tttaatattt aaatatatat atttgtgaaa
8040gaacttttag aacagacctt ttctttttac tttaaaattc ctgtatttcc atttttaaga
8100gtaaatttaa tctccaggat ttagaagtgt ctttccagag aagcataatg agaaagtcag
8160actgaggtaa taagaccaga attaagtgat agaagaaact gttgtttggt taaaggacac
8220agatttgaag gaaaaaaatt ttgatgtaac aattttttaa ataaaatttt gtttttctgt
8280aatgtcatat ttgctgctac agtagctcaa tattttacag ggctaacata aagctggctc
8340catttaaaaa ctggagtact tcctagtgca gccagcctag gcggaaactg tacaccatgg
8400tcttccagat gggtgactga tggctttggg tagctgatgc atgctttaat atttgcctat
8460agcccggcag caaggaagtc ggggcggggg gactttttta ccctgccagt tatagcattg
8520tgattctttc tgggcactgg cattttgtga aactctcaag ggaaggtgat gcaggggaga
8580aaatgtgaat taaattacat agatgggtgt ttttatgtct tctacccctt tcctagaatt
8640agtacaactc ttaactgtgc cagtccccag ttcaccagct ttgtatccag tcgtcatctc
8700attcaagtat ggctttactt ggtgacactg gccatagcta agttaacttg gcatgtttga
8760cttttgacaa taacaaaaat ggttttggat tttgttttat ttccaaaaaa tgtatacaat
8820atcagaactt cacattttat atactagtat ctggctatta gtattttaca ggaaccatag
8880ttcttggtga ctacatatat atatatattt ttgtgacctt ttttgtaaac taagtgccgt
8940ttcaacgtta caatcatttt tagggttatt gtaatcaatg tgaatatcat gttttttcaa
9000atctgttctg agcctatagt gtttgctttg tgaacatgtg tattgtatat attctgtata
9060gttatattgt actgaaatta gcttgtttga tataaggaaa atatgtattg agtacctttt
9120tgctagcctg attgtttaat ctttttaaaa aaggtttaaa ctttttttaa aaaaaaaatc
9180tttaaactgg cctttattac atggtcacac ataaagttgc agttaggaaa gggatgggca
9240gggaaaaact agttttgagt gtctttagat agaaacatga gactaaggtt tgattttgtt
9300ttcgttttct cattaaaata tcttatgctt tatgg
9335
22 942 DNA Homo sapiens
22 gagaacttcg ggaggcgggg gaggaaagcg gccggcgagc
gctggctgac attttcctgc 60ccggaaggat gcatggcccg gggtctcctg cacctgaggg
tgggcgggag gcggccccgc 120gggttgtgtt gctggaaaaa ggggtctcga tccagacccc
aagagagggt tcttgggtct 180acttcaggga agaattggag gcgagtcaca gagcgcagtg
aaggaagcaa gtttattgga 240atctactccg ttagagagtg caagtcctca gactgcagga
ggaggaactc ccgtccttcg 300gtagtgtcac tacttagagg cagctgtgag gagctgtgat
taatcgtgga atgtgctgat 360gtgcccacta aagatgcctg gtaagcagtg gctcctctaa
caagtgggca gcttggaact 420gggaacgaga gagccaccga aacagaatga gggagacaaa
gaagtgggca tcacccttaa 480ctccacaagg gcttcccaag ggcgctgtac cgtacgcagg
cccaggacga actttatttc 540tcgccccagc atcgctgtcc ttgtcggtga gaccctggct
ttagggcaga caggaccacg 600tttcataagt tcatgctgtc ccagcagagg aataacgcca
gaaagtgttc cagtacaacc 660agagaaagag agtccatgag aaatctgccc ttgtgaagtt
ggaatcccct caacctcacc 720ccgctgactt gaatgaagcg actgagacgg gctatgatgg
agcagatccg gttgctcgac 780cttctccctt gcaccaacac atgtagttaa tagttactgg
acatgcatat tcagtgggtt 840ccaggtacca aacttgtatt gaatggtatg tgccagacac
cgtcttgaga tctggagaat 900aaaaaaataa aaaaataaaa acgagacatc cagagcacag
ta 942
23 688 DNA Homo sapiens
23 gagaacttcg ggaggcgggg
gaggaaagcg gccggcgagc gctggctgac attttcctgc 60ccggaaggat gcatggcccg
gggtctcctg cacctgaggg tgggcgggag gcggccccgc 120gggttgtgtt gctggaaaaa
ggggtctcga tccagacccc aagagagggt tcttgggtct 180acttcaggga agaattggag
gcgagtcaca gagcgcagtg aaggaagcaa gtttattgga 240atctactccg ttagagagtg
caagtcctca gactgcagga ggaggaactc ccgtccttcg 300gtagtgtcac tacttagagg
cagctgtgag gagctgtgat taatcgtgga atgtgctgat 360gtgcccacta aagaggaata
acgccagaaa gtgttccagt acaaccagag aaagagagtc 420catgagaaat ctgcccttgt
gaagttggaa tcccctcaac ctcaccccgc tgacttgaat 480gaagcgactg agacgggcta
tgatggagca gatccggttg ctcgaccttc tcccttgcac 540caacacatgt agttaatagt
tactggacat gcatattcag tgggttccag gtaccaaact 600tgtattgaat ggtatgtgcc
agacaccgtc ttgagatctg gagaataaaa aaataaaaaa 660ataaaaacga gacatccaga
gcacagta 688
24 1444 DNA Homo sapiens
24 ggagaagggg tggggcaggg tatcgctgac tcagcagctt ccaggttgct ctgatgatat
60attaaggctc ctgaatccta agagaatgtt ggtgaagatc ttaacaccac gccttgagca
120agtcgcaaga gcgggaggac acagaccagg aaccgagaag ggacaagcac atggaagcca
180gcccagcatc cgggcccaga cacttgatgg atccacacat attcacttcc aactttaaca
240atggcattgg aaggcataag acctacctgt gctacgaagt ggagcgcctg gacaatggca
300cctcggtcaa gatggaccag cacaggggct ttctacacaa ccaggctaag aatcttctct
360gtggctttta cggccgccat gcggagctgc gcttcttgga cctggttcct tctttgcagt
420tggacccggc ccagatctac agggtcactt ggttcatctc ctggagcccc tgcttctcct
480ggggctgtgc cggggaagtg cgtgcgttcc ttcaggagaa cacacacgtg agactgcgta
540tcttcgctgc ccgcatctat gattacgacc ccctatataa ggaggcactg caaatgctgc
600gggatgctgg ggcccaagtc tccatcatga cctacgatga atttaagcac tgctgggaca
660cctttgtgga ccaccaggga tgtcccttcc agccctggga tggactagat gagcacagcc
720aagccctgag tgggaggctg cgggccattc tccagaatca gggaaactga aggatgggcc
780tcagtctcta aggaaggcag agacctgggt tgagcagcag aataaaagat cttcttccaa
840gaaatgcaaa cagaccgttc accaccatct ccagctgctc acagacgcca gcaaagcagt
900atgctcccga tcaagtagat ttttaaaaaa tcagagtggg ccgggcgcgg tggctcacgc
960ctgtaatccc agcactttgg aggccaaggc gggtggatca cgaggtcagg agatcgagac
1020catcctggct aacacggtga aaccctgtct ctactaaaaa tacaaaaaat tagccaggcg
1080tggtggcggg cgcctgtagt cccagctact ctggaggctg aggcaggaga gtagcgtgaa
1140cccgggaggc agagcttgcg gtgagccgag attgcgctac tgcactccag cctgggcgac
1200agtaccagac tccatctcaa aaaaaaaaaa accagactga attaatttta actgaaaatt
1260tctcttatgt tccaagtaca caatagtaag attatgctca atattctcag aataattttc
1320aatgtattaa tgaaatgaaa tgataatttg gcttcatatc tagactaaca caaaattaag
1380aatcttccat aattgctttt gctcagtaac tgtgtcatga attgcaagag tttccacaaa
1440cact
1444
25 1478 DNA Homo sapiens
25 ggcctccctg ccacggggat gctgcctttt ctgctccggg
tgtttccacg aggcaggcat 60ggaatcttcc ctggacaagc gacataccgt ggagagacag
gctcctgaat cctaagagaa 120tgttggtgaa gatcttaaca ccacgccttg agcaagtcgc
aagagcggga ggacacagac 180caggaaccga gaagggacaa gcacatggaa gccagcccag
catccgggcc cagacacttg 240atggatccac acatattcac ttccaacttt aacaatggca
ttggaaggca taagacctac 300ctgtgctacg aagtggagcg cctggacaat ggcacctcgg
tcaagatgga ccagcacagg 360ggctttctac acaaccaggc taagaatctt ctctgtggct
tttacggccg ccatgcggag 420ctgcgcttct tggacctggt tccttctttg cagttggacc
cggcccagat ctacagggtc 480acttggttca tctcctggag cccctgcttc tcctggggct
gtgccgggga agtgcgtgcg 540ttccttcagg agaacacaca cgtgagactg cgtatcttcg
ctgcccgcat ctatgattac 600gaccccctat ataaggaggc actgcaaatg ctgcgggatg
ctggggccca agtctccatc 660atgacctacg atgaatttaa gcactgctgg gacacctttg
tggaccacca gggatgtccc 720ttccagccct gggatggact agatgagcac agccaagccc
tgagtgggag gctgcgggcc 780attctccaga atcagggaaa ctgaaggatg ggcctcagtc
tctaaggaag gcagagacct 840gggttgagca gcagaataaa agatcttctt ccaagaaatg
caaacagacc gttcaccacc 900atctccagct gctcacagac gccagcaaag cagtatgctc
ccgatcaagt agatttttaa 960aaaatcagag tgggccgggc gcggtggctc acgcctgtaa
tcccagcact ttggaggcca 1020aggcgggtgg atcacgaggt caggagatcg agaccatcct
ggctaacacg gtgaaaccct 1080gtctctacta aaaatacaaa aaattagcca ggcgtggtgg
cgggcgcctg tagtcccagc 1140tactctggag gctgaggcag gagagtagcg tgaacccggg
aggcagagct tgcggtgagc 1200cgagattgcg ctactgcact ccagcctggg cgacagtacc
agactccatc tcaaaaaaaa 1260aaaaaccaga ctgaattaat tttaactgaa aatttctctt
atgttccaag tacacaatag 1320taagattatg ctcaatattc tcagaataat tttcaatgta
ttaatgaaat gaaatgataa 1380tttggcttca tatctagact aacacaaaat taagaatctt
ccataattgc ttttgctcag 1440taactgtgtc atgaattgca agagtttcca caaacact
1478
26 3680 DNA Homo sapiens
26 attcagcttt tgggtgaaga
cggaggcggg ttctggacag acgtacgctg tcagggagtg 60tttacttcgc ctccacttct
gttcctcccc gccctggtgc tgctccgggt cacatactcg 120tcctgagccg gcttcagcct
ctctgcgcag aagtgtcccg gagccatggc cgagtaatct 180tatgcgaagt ctaccaagct
tgtgctcaag ggaaccaaga cgaagaggtg ggtcctgcac 240ctccggcggg agcctcctca
gttcttttcg gacgcactcc acccccctcg aatccggtgg 300aagccgtggc gcggagagcc
ggctttgtgg cctcccaggc tttgccctgg cccctgtccg 360ggctggactg aggccggacc
gcggttcctg gcgcctgtgc agagaggggc agcctcccgc 420actgacgacc ctggaaacag
gatagacggg cgggtgaccc gtggccccgt acccacgagt 480ttcggtcctc tgaggcatct
ctgcaggcct ctgcctgtaa gaagaaaaag agcaaagaga 540agaagagaaa aagagaagaa
gatgaagaaa cccagtttga tatgttggaa tctggtgaac 600agtaacaaac tttgatgaaa
tttcaggaac catagccatt gaaatggatg agggaaccta 660tatacatgca ctcgacgatg
gtctttttac cctgggagct ccacacaaag aaggggaaaa 720tggctttgtt ggcctcaaat
ggctgcttta ttagatgcaa tgaagcaggg gacatagaag 780caaaaagtaa aacagcagga
gaagaagaaa tgatcaagat tagatcctgt gctgaaagag 840aaaccaagaa aaaagatgac
attccagaag aagacaaagg aaatgtcaaa caatgtgaaa 900tcaattatgt aaagaaattt
cagagcttcc aagagcacaa acttaaaata agtaaagaag 960acagtaaaat tcttaaaaag
gctcagaaag atggattttt gcatgagacg cttctggaca 1020gagtcgcagt catagcctgc
cttcatctct gaagggacca ggtaacacac ggagtgcctt 1080ggcagcctca tcattgcatg
caatcttcac agacaatatc tcactgcagc cacctttctg 1140caggagtaca ggtgtgacat
tatactgatg acatcattct cagagaaaat tcattttaca 1200cactgaagat acacagacac
aaagagcctt atacaaaggg aatggacctt tccccaacac 1260agtagtgcaa ggccctgcca
ctttgcttca attcctgaaa attccttggt aaacttgggg 1320ctgctctatt cctgacactt
tcaagaaata gttattcatc ctctcagcat ccacaacgtt 1380aatacaagcc caacattctc
taatttgtgt tttgtttttt ggagacacag tcttgctctg 1440tcacacaggc tggtatgaag
tggcatgatc tcagctcact gcagccttga cctcccgacc 1500tcaagcaatt ctcccatttc
agctttccaa gtagctagga ctacaggtgt gtgccacgac 1560actcagttaa ttttgtttat
ttttttgtag agaaaagctc tcgctatgtt acccgggctg 1620gtctcgaact cctgggctca
agagatcctc tggcctccgc ctcccaaagt tctgggatta 1680caggtgtgag tccatgtgcc
cagcccttta attctttact tggggttcag gtgactacat 1740atttcttact tacaaatttt
acttaatctc gctgatgctg tcacttgaaa gctgacccac 1800cttgaatgaa gcctcctcca
acaagtctgg aatccatccg aatttaaatt aacaggcgct 1860cctatgaatt ccatcagaga
acacacagta gatgctttag caacctcttc ccatgcctcc 1920gaagtatctg gtttgcattg
tggtggccat tagtcatcca tgggcttctg atgtaaaaaa 1980caaacctgcc tcttttgacc
ctgtgctgta cagcatcagg gcagtgattg ctggccacat 2040actggaccct tgaaacagag
ggctctgcat ccatgtccat gagccttcat gcccatctgg 2100ccatcaggcc ttgggaagca
gcaccccaca gctttggcac agctgcagtg acctccttgc 2160ctccgaatgg agtcaaatgt
gtacacgctg caattctcat ctgcaggaag cactggcctc 2220cttcatcctt aggctatagt
gctgacactg gcctccttca tcctcagact gtggtgccgg 2280atgtcacccc tctgcgaggc
ccttgggatc gactgagtga ccagcagtga agttcgtcag 2340ccttctgaat ggatgccatg
atcagatgtg atggagctca gtgggatgct gctgctttcc 2400ctcccttagc taggatgtcc
ctgataaagg atgacaccca agcctcagca caactggcca 2460aacttgaggt ggtcatcata
gcactgatgc tgggccaaca attagcccca tttgtacctt 2520tttacaaact ttttgacaat
tgccaagaat cgtccacctt ccctccccat tgaattaaat 2580acacttcttg tctcatggat
actcagaata ccaatcaagg taacagatgc ctttatttta 2640actaaggaca cagtacagat
ctcacaggga cactccttat cccttgcaga gttccagaca 2700ctactgatgg tcaccaaagc
aacatttcat cagaaaacac agtgctgggc ttgtgaagaa 2760ggtgtccagc agagcttcca
ctgcccctgt aggctgcagg cagctgcttc agttgagaga 2820tacactgagc tcctcaaaga
attcctattt aaggtacaaa gcagcgactg gatgccctgc 2880tggattccac tattgccaca
ggctttgata actctaagtt catgctcctt aggaaagtgc 2940actttttaca ccaatgttag
acagttccct cagttgcctt tactggacat caaggagttc 3000acatttttga caatctttca
gtcctgaact gtccaagggg tggaggtagc tccatacagg 3060aagctgcctg ctgcgtgtgg
atcagtaata tcagacttgc ccaacagctc actggagaca 3120tcaaaaccaa tgcatggggg
atgcatgacg tcacccaaat atttaccaca cgcttttgct 3180gccagagacc ccagatcact
tttcctgcct cctgctaaac aaatgggagc agtagcttct 3240catcagcctc cacatttcat
tactcataat tgtggccttc ctaatgtctc atgcccctaa 3300gcacttatta caatgtctca
acagccttct ctcagcctct caaaatatta gtcatcatta 3360aaataaatgt ctgatgttaa
agcatcaggg tcaattgtac tgtgacacct aaaatttcat 3420gctgcatccc accaggtcct
cagcctaatg gctttcatgc ccccaacaga ctgctaaact 3480ctttgaatta gtcaagcata
tccctcaagg gacaaggggt taccccaccc tctgatttct 3540accaagcctg cctcccgcac
cctgtgctcc agagtgaacc cccgggtaga cctgcacaga 3600tgcagtgtca tcccctgttg
ggctgagtat gcgagatgaa taaattacgg tgaatttcgt 3660ctaaaaaaaa aaaaaaaaaa
3680
27 4010 DNA Homo sapiens
27 gttcttcttt ctacctttct aagattccac catgttgctt gtttagttca accagtgctg
60atagggacac gtttgcactt gtctctcagt gaacattcgc aagagatttt ctaggcctcc
120aggcagaagc cgctgctgac ggggagagtc tgctggagga agtcacatgg cagatgccag
180aggaccagcc tgggagaggc tttctagcaa ccaggacacc tgccgcagtg ccacgctttc
240tccaggaaac aaggacagct gcacgggtgc ccacacgaga gtcggcgacc agcctcggcg
300gtggggttcc tgcaacgcgt gttcctgctg gggttctcct ttctgctgtt ccgagctaca
360gtgcggatca tgctccccat acgtggagtc cagggacagc acagcccggg gactgcccag
420ccagccagga agtctggctt tcagagaact gggaagacca tcttccactt tatcccatct
480gggagactgt aaacatggat tccactgtct cggacagtga gacgattccc agtcttttcc
540tgttcctcct ctaggtatgt agctgctatc agcttccaga aacgctggaa ggaagggcgt
600tgctttggca ggacacgtgg ggagatgggc tgggagggga gatcccggag gtaaggaggg
660gaggctttcc tctccccaca ctgggaactc agtttctagc cccaggttcc tggttgtgtc
720cggcatccca ggagcatgag tgggaagctg ctgcatccac ctgccgtgtg cagtcgggga
780ccagctgcag agcgcactat gcatccttta gactctgcag gctcaagtga aaacccagct
840ctgccactgc tctctggatg accttgccac ctccctctcc tcctttgaaa agtgtgggaa
900atgctgcttt tcatcaagtc caagtcacca tgtattgtgg cttaaacggc agtgttattt
960tctttctcag tgacacataa catactgctg cgttttttaa cagatgctgt cttggacgca
1020gtgaagtctg ccgatggctg tttcagagcc gttaggatgg tgaagtgaga caatgtgaat
1080ggagtgtttt gtgaggctca gtgctgcata aatgctagcc attatctaat agcaagaatt
1140ccttgaaatg tttctttctc tcaaatttat gaaaagatta gaaaatagtt ctaaaaatta
1200aaaatgcttc tggtcggagc ggcttacatg atggcagtga aagtgtgcag cttcatgggt
1260cccactgcgt gcacttggca tggagagaac gaaatgtgga ggccagcgat accaagaaca
1320cccctgtgag ccgcgctggg tgcagccccc acggccggcg ggcagtgaca ccgcagtttc
1380ctctcctgcc ggcatccacg cttcctgtgc ccaggtgctg gaagctcagt cttcatgtcc
1440ccttctcagt tatgtgtctg tagggctggg acagggaagg aagtcagtga agctggttgg
1500atcttccaag ggatgctctg tgtgcaaggc tcggggacaa ggatggctgg gggagtcaga
1560ctgtcagtga agctgagggg cggatggagc ctcaccacgc ctgagtccct gttttccagc
1620tgcagatgct gaccctccag agctttgttt accagcccct gtgactgcga tctggagtca
1680gccaggcgtt cctccttgtc cggtagattt tattaagtgc tcaatgtgtg ccaagccaaa
1740tgtccgtttg acttactgca tgtccgagaa gcgatgccgt tgaaggaatt tcctccccaa
1800gcctgaagga attgaacacc aggaatgact ggaaccctgg agagagagaa ctggatttct
1860ggaggaaaat cactcgtttt aaggaagcag catcctggac ccttgaggcc ctggaggaag
1920cgggcagcac agctcggagg cgggtgtggc tggaggacgg ccgtcgcacc tgcgaaattc
1980tgtctgtggt acgtggttcc ttcctggctc tgggaaccac ctggatactt gcattcttcc
2040cttttccttt ctattctctt tcaagtcacc ctccttgaga cagccctcca gtccaggcca
2100aatctcagcc tgcccttggt ccgctgtggt tgggcctgca cccaagccat gagcacacgc
2160agcaattgtg gcagcagaag cttcctctgg gctcagactc aggctgatgc tgcgtcagga
2220cctgccgcgg tctcggctgg gcttcctggg actcggtggt tgtgggctga ttgtaaagca
2280cggaatgact cttagaaact gggcgtcatt ctttgtggtt ttccaagctt ggtctctgat
2340gatactccag gtcttaggag acatgctgaa tatttattat gcttacattc aagcaacatt
2400aacccttaag gttgatgtag ctccccgtct ttttttccca gaaggaggag cactgaagga
2460acacttttcc agtatggatt ctttccagct ccgagaagct ggaggcacac ggatccctcg
2520gccagctctc atctatggac gtgctgtagt cacaaggact gtgactaagg ctcagtccct
2580gaagagtgcc ttggcatggg ctgctttagg ctgtaaacac ccagttttat ccactttatg
2640tgaagaaagc caacaagggg catggagtga gttccgcagg ttttagtggc tgcggaggct
2700ggtgctcagt ggggatgatg gagggaaggc gcctccctct gcgggccccg aggtctgtgc
2760gggaatcagc tctgcagttg tgtccagggg cagccgtaga ccacacacgg caggctcaca
2820gctctgttcc atgagaactt tatacacaaa agcagacggg ctggcttggc ctctggatca
2880taatctgctg acccctgggg caacgttggc tgcagcggag atggctgctc cccggtgggg
2940tgtgtgctca gcccgcagcc cccgccctcc ggactccgtt cgcctctgct ctcagctttg
3000cacctcgtca ttgtcttcta attgtgcatc cctggactgc gtgacctaca aggctctcag
3060cacaacaaga ctctatgatt ctgtctattg gaacaaaaag ccagtgaggc aagtgtatca
3120tcctgttgat gaattcacag aattaactct gggagttggg gacagtttgt attcttcttc
3180cagacactct ctgtttctgc tggatggaaa ggttctgcta cttgtcctgt ggtcaggccc
3240agctgatgga atggaatgga agtgactcag ccccttactg gcagaaactt taaaagccgc
3300acaacattcc tgcaccctcc cctctgccat gagcctggca gtgctcagga tgggaaaatt
3360atttcacctg ggcctgagga tacaggagct actcccagcg tgcagtggaa gagaagcatg
3420ggcaagtaat taaactttgt gttttcaagc cacagaggtt ttttgaggtt gtttgctacc
3480atgctttgtc cctacaaaca cagtcatgga gaaggccagt ggcagagcct gagccgttcg
3540tgcatctgtt caccagcatc cagaataaca atagattttt gaaacattcc tgagaaaatt
3600ctgggagttg cataccggcc agtcttattc cctaaagttg ttccttctaa agggtgggat
3660gaccaaaaat ttcagaaaag caaaccaccg ctgaaaggca acgttatttc tgttggcaga
3720aggcggcctg agcaatctag attttccacg gttcaccaac tagtttttaa ggaaatatgg
3780ctgtgagagg aataaaacat aattcctacc tttaaggaac tcagagaagt gaattaaagg
3840aagtcacaga tcaggcaacc aaccacacaa agtttctaag agcaaactgt tcaggtcggc
3900aagtcactct tatccactgt tttgccttct gaggtttcag ttactctcag tcagtcatgg
3960tccaaaaaca ttaaatgaaa aattccagaa ataaacaatt cacacgtttt
4010
28 1042 DNA Homo sapiens
28 gaaacccgga agtggaactc tgagccattc agcgtttggg
tgaagacgga ggcgggttct 60acagagacgt aggctgtcag ggagtgttta tttcgcgtcc
gcttctgttt ctccgcgccc 120ctgtgctgcc ccgactcaca tactcgtcca gaaccggcct
cagcctctcc gcgcagaagt 180ttcccggagc catggccgag tactcctacg tgaagtctac
caagctcgtg ctcaagggaa 240ccaagacgaa gagtaagaag aaaaagagca aagataagaa
aagaaaaaga gaagaagatg 300aagaaaccca gcttgatatt gttggaatct ggtggacagt
aacaaacttt ggtgaaattt 360caggaaccat agccattgaa atggataagg gaacctatat
acatgcactc gacaatggtc 420tttttaccct gggagctcca cacaaagaag ttgatgaggg
ccctagtcct ccagagcagt 480ttacggctgt caaattatct gattccagaa tcgccctgaa
gtctggctat ggaaaatatc 540ttggtataaa ttcagatgga cttgttgttg ggcgttcaga
tgcaattgga ccaagagaac 600aatgggaacc agtctttcaa aatgggaaaa tggctttgtt
ggcctcaaat agctgcttta 660ttagatgcaa tgaagcaggg gacatagaag caaaaagtaa
aacagcagga gaagaagaaa 720tgatcaagat tagatcctgt gctgaaagag aaaccaagaa
aaaagatgac attccagaag 780aagacaaagg aaatgtaaaa caatgtgaaa tcaattatgt
aaagaaattt cagagcttcc 840aagaccacaa acttaaaata agtaaagaag acagtaaaat
tcttaaaaag gctcggaaag 900atggattttt gcatgagacg cttctggaca ggagagccaa
attgaaagcc gacagatact 960gcaagtgact gggatttttg tttctgcctt atctttctgt
gtttttttct gaataaaata 1020ttcagaggaa atgcttttac ag
1042
29 5460 DNA Homo sapiens
29 agagtcaact ctgccccgag
gcctagcttg gccagaaggt agcagacaga cagacggatc 60taacctctct tggatcctcc
agccatgagg ctgctctggg ggctgatctg ggcatccagc 120ttcttcacct tatctctgca
gaagcccagg ttgctcttgt tctctccttc tgtggttcat 180ctgggggtcc ccctatcggt
gggggtgcag ctccaggatg tgccccgagg acaggtagtg 240aaaggatcag tgttcctgag
aaacccatct cgtaataatg tcccctgctc cccaaaggtg 300gacttcaccc ttagctcaga
aagagacttc gcactcctca gtctccaggt gcccttgaaa 360gatgcgaaga gctgtggcct
ccatcaactc ctcagaggcc ctgaggtcca gctggtggcc 420cattcgccat ggctaaagga
ctctctgtcc agaacgacaa acatccaggg tatcaacctg 480ctcttctcct ctcgccgggg
gcacctcttt ttgcagacgg accagcccat ttacaaccct 540ggccagcggg ttcggtaccg
ggtctttgct ctggatcaga agatgcgccc gagcactgac 600accatcacag tcatggtgga
gaactctcac ggcctccgcg tgcggaagaa ggaggtgtac 660atgccctcgt ccatcttcca
ggatgacttt gtgatcccag acatctcaga gccagggacc 720tggaagatct cagcccgatt
ctcagatggc ctggaatcca acagcagcac ccagtttgag 780gtgaagaaat atgtccttcc
caactttgag gtgaagatca cccctggaaa gccctacatc 840ctgacggtgc caggccatct
tgatgaaatg cagttagaca tccaggccag gtacatctat 900gggaagccag tgcagggggt
ggcatatgtg cgctttgggc tcctagatga ggatggtaag 960aagactttct ttcgggggct
ggagagtcag accaagctgg tgaatggaca gagccacatt 1020tccctctcaa aggcagagtt
ccaggacgcc ctggagaagc tgaatatggg cattactgac 1080ctccaggggc tgcgcctcta
cgttgctgca gccatcattg agtctccagg tggggagatg 1140gaggaggcag agctcacatc
ctggtatttt gtgtcatctc ccttctcctt ggatcttagc 1200aagaccaagc gacaccttgt
gcctggggcc cccttcctgc tgcaggcctt ggtccgtgag 1260atgtcaggct ccccagcttc
tggcattcct gtcaaagttt ctgccacggt gtcttctcct 1320gggtctgttc ctgaagtcca
ggacattcag caaaacacag acgggagcgg ccaagtcagc 1380attccaataa ttatccctca
gaccatctca gagctgcagc tctcagtatc tgcaggctcc 1440ccacatccag cgatagccag
gctcactgtg gcagccccac cttcaggagg ccccgggttt 1500ctgtctattg agcggccgga
ttctcgacct cctcgtgttg gggacactct gaacctgaac 1560ttgcgagccg tgggcagtgg
ggccaccttt tctcattact actacatgat cctatcccga 1620gggcagatcg tgttcatgaa
tcgagagccc aagaggaccc tgacctcggt ctcggtgttt 1680gtggaccatc acctggcacc
ctccttctac tttgtggcct tctactacca tggagaccac 1740ccagtggcca actccctgcg
agtggatgtc caggctgggg cctgcgaggg caagctggag 1800ctcagcgtgg acggtgccaa
gcagtaccgg aacggggagt ccgtgaagct ccacttagaa 1860accgactccc tagccctggt
ggcgctggga gccttggaca cagctctgta tgctgcaggc 1920agcaagtccc acaagcccct
caacatgggc aaggtctttg aagctatgaa cagctatgac 1980ctcggctgtg gtcctggggg
tggggacagt gcccttcagg tgttccaggc agcgggcctg 2040gccttttctg atggagacca
gtggacctta tccagaaaga gactaagctg tcccaaggag 2100aagacaaccc ggaaaaagag
aaacgtgaac ttccaaaagg cgattaatga gaaattgggt 2160cagtatgctt ccccgacagc
caagcgctgc tgccaggatg gggtgacacg tctgcccatg 2220atgcgttcct gcgagcagcg
ggcagcccgc gtgcagcagc cggactgccg ggagcccttc 2280ctgtcctgct gccaatttgc
tgagagtctg cgcaagaaga gcagggacaa gggccaggcg 2340ggcctccaac gagccctgga
gatcctgcag gaggaggacc tgattgatga ggatgacatt 2400cccgtgcgca gcttcttccc
agagaactgg ctctggagag tggaaacagt ggaccgcttt 2460caaatattga cactgtggct
ccccgactct ctgaccacgt gggagatcca tggcctgagc 2520ctgtccaaaa ccaaaggcct
atgtgtggcc accccagtcc agctccgggt gttccgcgag 2580ttccacctgc acctccgcct
gcccatgtct gtccgccgct ttgagcagct ggagctgcgg 2640cctgtcctct ataactacct
ggataaaaac ctgactgtga gcgtccacgt gtccccagtg 2700gaggggctgt gcctggctgg
gggcggaggg ctggcccagc aggtgctggt gcctgcgggc 2760tctgcccggc ctgttgcctt
ctctgtggtg cccacggcag ccgccgctgt gtctctgaag 2820gtggtggctc gagggtcctt
cgaattccct gtgggagatg cggtgtccaa ggttctgcag 2880attgagaagg aaggggccat
ccatagagag gagctggtct atgaactcaa ccccttggac 2940caccgaggcc ggaccttgga
aatacctggc aactctgatc ccaatatgat ccctgatggg 3000gactttaaca gctacgtcag
ggttacagcc tcagatccat tggacacttt aggctctgag 3060ggggccttgt caccaggagg
cgtggcctcc ctcttgaggc ttcctcgagg ctgtggggag 3120caaaccatga tctacttggc
tccgacactg gctgcttccc gctacctgga caagacagag 3180cagtggagca cactgcctcc
cgagaccaag gaccacgccg tggatctgat ccagaaaggc 3240tacatgcgga tccagcagtt
tcggaaggcg gatggttcct atgcggcttg gttgtcacgg 3300gacagcagca cctggctcac
agcctttgtg ttgaaggtcc tgagtttggc ccaggagcag 3360gtaggaggct cgcctgagaa
actgcaggag acatctaact ggcttctgtc ccagcagcag 3420gctgacggct cgttccagga
cccctgtcca gtgttagaca ggagcatgca ggggggtttg 3480gtgggcaatg atgagactgt
ggcactcaca gcctttgtga ccatcgccct tcatcatggg 3540ctggccgtct tccaggatga
gggtgcagag ccattgaagc agagagtgga agcctccatc 3600tcaaaggcaa actcattttt
gggggagaaa gcaagtgctg ggctcctggg tgcccacgca 3660gctgccatca cggcctatgc
cctgacactg accaaggcgc ctgtggacct gctcggtgtt 3720gcccacaaca acctcatggc
aatggcccag gagactggag ataacctgta ctggggctca 3780gtcactggtt ctcagagcaa
tgccgtgtcg cccaccccgg ctcctcgcaa cccatccgac 3840cccatgcccc aggccccagc
cctgtggatt gaaaccacag cctacgccct gctgcacctc 3900ctgcttcacg agggcaaagc
agagatggca gaccaggctt cggcctggct cacccgtcag 3960ggcagcttcc aagggggatt
ccgcagtacc caagacacgg tgattgccct ggatgccctg 4020tctgcctact ggattgcctc
ccacaccact gaggagaggg gtctcaatgt gactctcagc 4080tccacaggcc ggaatgggtt
caagtcccac gcgctgcagc tgaacaaccg ccagattcgc 4140ggcctggagg aggagctgca
gttttccttg ggcagcaaga tcaatgtgaa ggtgggagga 4200aacagcaaag gaaccctgaa
ggtccttcgt acctacaatg tcctggacat gaagaacacg 4260acctgccagg acctacagat
agaagtgaca gtcaaaggcc acgtcgagta cacgatggaa 4320gcaaacgagg actatgagga
ctatgagtac gatgagcttc cagccaagga tgacccagat 4380gcccctctgc agcccgtgac
acccctgcag ctgtttgagg gtcggaggaa ccgccgcagg 4440agggaggcgc ccaaggtggt
ggaggagcag gagtccaggg tgcactacac cgtgtgcatc 4500tggcggaacg gcaaggtggg
gctgtctggc atggccatcg cggacgtcac cctcctgagt 4560ggattccacg ccctgcgtgc
tgacctggag aagctgacct ccctctctga ccgttacgtg 4620agtcactttg agaccgaggg
gccccacgtc ctgctgtatt ttgactcggt ccccacctcc 4680cgggagtgcg tgggctttga
ggctgtgcag gaagtgccgg tggggctggt gcagccggcc 4740agcgcaaccc tgtacgacta
ctacaacccc gagcgcagat gttctgtgtt ttacggggca 4800ccaagtaaga gcagactctt
ggccaccttg tgttctgctg aagtctgcca gtgtgctgag 4860gggaagtgcc ctcgccagcg
tcgcgccctg gagcggggtc tgcaggacga ggatggctac 4920aggatgaagt ttgcctgcta
ctacccccgt gtggagtacg gcttccaggt taaggttctc 4980cgagaagaca gcagagctgc
tttccgcctc tttgagacca agatcaccca agtcctgcac 5040ttcaccaagg atgtcaaggc
cgctgctaat cagatgcgca acttcctggt tcgagcctcc 5100tgccgccttc gcttggaacc
tgggaaagaa tatttgatca tgggtctgga tggggccacc 5160tatgacctcg agggacaccc
ccagtacctg ctggactcga atagctggat cgaggagatg 5220ccctctgaac gcctgtgccg
gagcacccgc cagcgggcag cctgtgccca gctcaacgac 5280ttcctccagg agtatggcac
tcaggggtgc caggtgtgag ggctgccctc ccacctccgc 5340tgggaggaac ctgaacctgg
gaaccatgaa gctggaagca ctgctgtgtc cgctttcatg 5400aacacagcct gggaccaggg
catattaaag gcttttggca gcaaagtgtc agtgttggca 5460
30 1963 DNA Homo sapiens
30 gttttccttg ttcctggtca acaaagaaat gtggagtgtc ttggctgaat cctcatacag
60acaagatcat tatggtgctg ttaggtagga cttgtatcca gatgtaaggt tgaaaaagtg
120atataataaa ggaaccaagg agaaaattca gaaggaaaga aaaaattgcc tctgcaggtg
180tgcgagcagg attgcttctg caacaaaagc ctccacccag ccacatcttg ggaaaagaat
240ggccacttct tggggcacag tctttttcat gctggtggta tcctgtgttt gcagcgctgt
300ctcccacagg aaccagcaga cttggtttga gggtatcttc ctgtcttcca tgtgccccat
360caatgtcagc gccagcacct tgtatggaat tatgtttgat gcagggagca ctggaactcg
420aattcatgtt tacacctttg tgcagaaaat gccaggacag cttccaattc tagaagggga
480agtttttgat tctgtgaagc caggactttc tgcttttgta gatcaaccta agcagggtgc
540tgagaccgtt caagggctct tagaggtggc caaagactca atcccccgaa gtcactggaa
600aaagacccca gtggtcctaa aggcaacagc aggactacgc ttactgccag aacacaaagc
660caaggctctg ctctttgagg taaaggagat cttcaggaag tcacctttcc tggtaccaaa
720gggcagtgtt agcatcatgg atggatccga cgaaggcata ttagcttggg ttactgtgaa
780ttttctgaca ggtcagctgc atggccacag acaggagact gtggggacct tggacctagg
840gggagcctcc acccaaatca cgttcctgcc ccagtttgag aaaactctgg aacaaactcc
900taggggctac ctcacttcct ttgagatgtt taacagcact tataagctct atacacatag
960ttacctggga tttggattga aagctgcaag actagcaacc ctgggagccc tggagacaga
1020agggactgat gggcacactt tccggagtgc ctgtttaccg agatggttgg aagcagagtg
1080gatctttggg ggtgtgaaat accagtatgg tggcaaccaa gaaggggagg tgggctttga
1140gccctgctat gccgaagtgc tgagggtggt acgaggaaaa cttcaccagc cagaggaggt
1200ccagagaggt tccttctatg ctttctctta ctattatgac cgagctgttg acacagacat
1260gattgattat gaaaaggggg gtattttaaa agttgaagat tttgaaagaa aagccaggga
1320agtgtgtgat aacttggaaa acttcacctc aggcagtcct ttcctgtgca tggatctcag
1380ctacatcaca gccctgttaa aggatggctt tggctttgca gacagcacag tcttacagct
1440cacaaagaaa gtgaacaaca tagagacggg ctgggccttg ggggccacct ttcacctgtt
1500gcagtctctg ggcatctccc attgaggcca cgtacttcct tggagacctg catttgccaa
1560caccttttta aggggaggag agagcactta gtttctgaac tagtctgggg acatcctgga
1620cttgagccta gagatttagg tttaattaat tttacacatc taatgtgaac tgctgcctaa
1680ccactcaaga gtacacagct ggcaccagag catcacagag agccctgtga gccaaaaagt
1740atagttttgg aacttaacct tggagtgaga gcccagggac aggtccctgg aaaccaaaga
1800aaaatcgcat ttcaaccctt tgagtgcctc attccactga atatttaaat tttcctctta
1860aatggtaaac tgacttattg caatcccaag acccatcaat atcagtattt ttttcctccc
1920tatacagtgc cctgcccacc cttatctgca cccacctccc ctg
1963
31 9604 DNA Homo sapiens
31
tgcagggctg tgaccgtcta tgacaagccg gcatctttct
ttcaagacac acctctggac 60ctgcagcgcc agctcttcat gaagctcagc ggcacacact
ctccgttcag ggcccggtag 120gcctcccatc ctcagctgcc ttctctcctg ctcgccactg
ccctggcctg tccccttctc 180actgcagacc tgggaaccca ctcacccagg ggttggcaaa
gtaaggctac aggccagtct 240cctgcttttg taaatcaagt gtcattggga cacagcacac
tcattaactt ctgagttgtc 300tacagccgcc tttgagctgc aatagcagaa tcgtgttttg
caacagagaa cctgtggccc 360gcaaagcctg aagtatttac tctctggccc tttaagaaat
gtttgtggac ccctgcgctg 420tcttactctc ctgccaatgg gttcccaggc ctgtggcagg
acctgtggac ctgtgtgtcc 480cctggggtgt ctcatggggc taaggagggg acctttgtgc
aggtccacgc accctgaggt 540gtgcccctgt gtaagctggg gtggtgtggg agggcgtccc
tgcaccctca tcttgagtcc 600aggggatgat aagacagtaa gtcccgtgga gaaaaggaat
gagtcagtct tgtttgctgt 660tgtaaactta tcacccagca acaatattag agaaagcaag
cccaggcctc ggatggcagg 720ggtgacctgg tgctgctgat gtggccgggc accccaacct
ttgggagcct gcaggccttg 780ccacggcagg agatgcccgt cctgggtcct gggcctgctc
tgtggcctct cacaagcttt 840tttcctgctc tttcagctca gaacctgagg acccagccac
ggagcggtcg gccttcatga 900agagggatgc tgggagcggg ctggtgatgc gtctccacga
gcggccagcc ctgctggtca 960gcagcacagg ctggacaggt ctgcacgacc cctgcaacac
ttggggttgg tgtgacaggc 1020acctggccaa cctgtgttgt cctcacacct gccagtcctt
catgccccca ccctgccacg 1080gtctcaatga gaaggggagg tcgtgagagc tgtaagaggg
gtgtctagaa acaggaccct 1140gacattcaat tctcttctca tagaggacga agacttctcc
atcctgctgg cagctttaga 1200aagtaggtgt gtggctgcgg tgaggagctc tgggcttgtc
gggggccact gagctgtgag 1260ctgcttgcct ggcctgcagc atgttcctgt ccctggccac
tgggtggggc agcctgggga 1320cagtggggat ggtggaggtg ggccgccttg aatccccagt
tgggtcattg agtgaccagg 1380ccctcaggct gaaatgcccc ctccaggaga gtatttcaca
gaggctggtg gcctccccac 1440cagagcagtg ctctttctcc acctgaacag gtgactctgg
ctattgttta tttaaaagtt 1500tttttctgaa tgggcatggt ggctcacacc tgtaatccta
gaactctggg aggccgaggc 1560aggcaggtca cctgagggca ggagttcgag gccaacatgg
cgaaacctgt ctctactaaa 1620aatacaaaaa ttagctgggt atggtggtgg gggcctgtaa
tcccagctac ttgggaggct 1680gatgcacgag aattacttga acccgggagg cagaggttgc
agtgaactga catcacgcca 1740ttgcattcca gcctgggtga cagagccaga gtctgtcgga
aaaaaaaaaa aaaaaaattc 1800taccagaaat tccgtgtaga attgtttctt tttttaaaca
cagagtttga acaactgact 1860cttgacggac acaaccttcc ttctctcgtc tgtgtgataa
caggtaccgc ctgggcccct 1920gggtgtctgt gtggttgggg gatggtggat ggggaggggc
acgcagcctt taccctgtgc 1980ttcccacgat cttgtctcct taatcctcac tgcagctctc
tgccataggg acttatactg 2040cttgacatgg gggaaactga ggctcagagg gtttcacagc
agggcaggga gcccagattt 2100gaatctgtag ataccaagct ttctactttt tcagtagttt
ccaagcatct tttttgttgt 2160tgttgttaca tcactggtgt cttttttttt tttttgagac
acagtctcta tcgcccaggc 2220tggggtgcag tggtgtgatc ttggctcact gcaacctcca
cctctcacat tcaagcaatt 2280ctcgtgcctt agcctcccga gtagctggga ctacaggggc
ccaccacacc cagctaattt 2340ttgtattttt agtagagatg gagtttcacc ctgttggcca
ggctggtctt gaactcctga 2400cctcaggtga gccacccaac ttggcctccc aatatgctcg
aattacaggc atgaatcact 2460gtgtctggcc atgtcattgg tgccttaacc aagcgtcttt
taatttttta aacggaagag 2520cccctgtccc acagttactg ctgctgagcc ctttcaaggt
gactcagtga ggagggagaa 2580aagcggaagc ggtgtgggaa gaggcggggt ctgggccagc
tgctgctcct gctctcctcc 2640ctcctctggc ctctaggctc ccaggagtgg tttggaacct
gcgccatgtg ctctggaggc 2700tgtggcaggg caggggcgtc ttggaacctg cgccatgtgc
tctgggggct gtgccaggga 2760agggggagtc ctcgtgtccc ctgcgcacaa cacagacaga
aggctggatc cacccagtgg 2820gcggtcgggt gccaggccag tgcttactcc gccatgtttg
cagcccgagg ccagctggct 2880gcaggtgcag gactatgcct caggggtcag ggtgcacaca
cccctgcatg tctcggggct 2940cctgggtagc ttctggaagg gcccagatgg ggcctgactg
gagctgccga ggggtggagc 3000ttctgggaaa aggatccctc ctagggggga gtgtcttgag
cctggggcca tgtggcaggg 3060acagagacgg gtccatggca gtgtctcctc ttctctgtga
aggcaaaggg cctctgaggg 3120agtattacag ccgcctcatc caccagtagc atttccagca
catccaggtc tgcaccccct 3180ggctggaggg ccgaggacta cccccgcttc taggtgagag
gccagcggga ggctcaggga 3240ggaggcaggg ccttaagcag ggggaacagg ggtgggcagg
atgtactttt tctgaaaagg 3300tggctctgga ggccacttgg gcacgggacc tgggctctgg
ctgaactccc gggaggaggc 3360tactttctgg tgtgccagcc cctcccagcc aggtggcccc
agaggccctt taccaagggg 3420tttgaggagg tcacatcctt tcagcctgcc acgccctcca
ttcagtcctc ttccttcctg 3480caggagggct gggcctgggg ttggggccac tgttgcccag
gtgtaggaag gcagtggctt 3540tgggaggtac agggacgatg tgtcaaacag cgtcgcctct
cccagtgaga tggttctgct 3600ttgcctccgt ctctttcccc gttgttttct ccaagtgggg
agttgtgtct tggtcctgat 3660gcgtctctag agccgcatct tccagcttct agtgagcaga
gcagttggag gctgaggcct 3720tttcctggca ggactctcca gctagtcttt gttttagaca
gtctcgctct gttgcctagg 3780ctggagtgca cgatctcagc tcatgcaacc tccgcctcct
gggttcaagc gattctccca 3840cctcagcctc ccgagtagat tatgggatta caggagccct
ccacaacacc tggcttattt 3900ttgtattttt agtagaaaca gggattcacc atgttggcca
gactggtctt gaactcctga 3960cctcaagtga tcctcctgtc ttggcctccc taagtgctgg
gattccaggc gtgacccatc 4020acgcctggtc ccagctagtc tttagaaatg ttaagccgtt
tcgctttatt ttcacactga 4080cagctggttt gtagtgggtg tgctgtggtt tattattatt
attattatta ttattattat 4140tattattatt ttgagaagga gtttcgctct tgtagcccag
gctgcagtgt aatggcacga 4200tcttggctca ctgcaacctc tgccttccca ggttcaagca
attctcctgc ctcagcctcc 4260tgagtagctg ggattacagg cacctgccac gacacttcgc
taattttgtg tttcttttag 4320tagagatggg gtttcaccac gttggccagg ctggtcttga
actcctgacc tcaggtgatc 4380cgcccacctt ggcctcccaa aatgctggga ttacatgcgg
gaggtgaacc tgggaggtgg 4440aggttgcagt gagctgagat tgtgccactg cactccagcc
tgggtgacag agtgagactc 4500tgtctcaaaa caaaacaaca acaacaaaaa aaccaaattg
tgattacgta gaaaaagtgt 4560caacttacat tttcagatgt cccagccaga ccatgtggct
gcttggccag cttaagccac 4620ttgtgcttgg ggctgcgggg ggccttatct gattttcact
cccctcgggg gatgctgcct 4680cactgtgctg ggaggatttg tgttcccagg gcagagacca
gctctctgac cgcacccctc 4740ttgcctagca gggtcggtgg acctgggtgt ctgtctagac
acgtcctcca gtggcctgga 4800cctgcccatg aaggtggtgg acatgttcag gagctgtttg
cctgtgtgtg cgttgtgaac 4860ttcaagtggt aggagcagaa cccgaatctt tctggggata
gcttcacaga tccaccgctg 4920agggggaaac agtgcagagc cagctgccca cagtgaggcc
ctgccccttg gtcagtccag 4980cacacactgg aggccatgag gaggagccct gtggttactg
tggctgggct gagcctcact 5040gaagtagttg cttccattta gaactcatgt tatatttagg
ttggtacaaa agtaatcacg 5100gtttttgcca ttaaaaatgg caataacttt tgcaccaacc
taatatgaaa aaagaaagca 5160ccttaaatac tagaactcca ctcggggctt ttgctcctag
agtagaattg gcgggaattg 5220cctgcaggct tacatggttt tctttgtttt tctctcccac
catgtccctt ttggccaagc 5280tcacatggtg ggtttgaatg agttaaatga gtgtcatgct
gtggcctcac tgcacccagc 5340atagatgggt gtttggaagg gtggcgttag aggagattct
agaagcagta gccccagcac 5400aagttgagcc cttggcccct gctcaggagc cggctcctgg
atgggattca gggattcaag 5460cccctcgtgt gagctgagct cagggaacgt cttgatcaaa
tctggtgccc tagaaaagtc 5520atcttttatg tgctgaacca gtctccaggg ggttgcctta
cttgttccac agccatggaa 5580ttaagaaaaa catacaaaaa taattcttca gtccttgaag
agcatccagc acagaaggta 5640caaaccctcc ttaaggctcc ctcctcaaat cggtttggcc
attttgatgt gcactccccc 5700aggcctttat acccttcaga tgccaaatct aagaaccagc
tcccagaaac cacaccccct 5760gttccaatcc ccagcctggc ttgagcgtgg ggtgcgaggg
gagcccaggt gggcacccca 5820ggggtctggt gtcttctcca ggcagctctc aggctccctt
ggttctctct gcagtttaca 5880tgagctggtg aaacatcaag aaaacggctt ggtctttgag
gactcagagg aactggcagc 5940tcagctgcag gtagccacat ctgccactaa gccagggtgg
gcagggttct ggagactggc 6000accgagccac gctccctgat ccctgcttcc cacagtcggg
gtgggaccat gtggggtctg 6060gcggaaaagc tagggaggga gcagaggtca cagaggctgg
cctactctgc tgtcccgttt 6120cggtacagta ggctcgggaa agttaggaca caaccccacc
tgcccgcctg cccgctggat 6180ttatggaaca gacactccac aaatgacgct ggagccgggt
gggccgggct gcagtttagg 6240aagtgagcag gatcaggtag gtgagtgggc aaagggagct
tctgggacca gccttgaaag 6300atgggtggaa ttctgcaaat gttacttgtt tcttattgca
aaaagtaata catcgttctt 6360gccaacagaa tgactggcag gattttcagt aaaggtccaa
gtcggaagtc atttagactg 6420ggtcccctag tctctgtcag aaccatggta ctctgttggg
gtgtgaaagt agccacagat 6480catctgtaga ttaaggggtg tggctttgtt ccaataaatc
tttatttaca aacacaggct 6540gtgggctgga tttggcctgc aggctgtagt ttgtgatcct
tgattcagac agtttagcaa 6600ggctgaaaag aacaccgaca cccccttgtt acccacagat
gggtgggact tggccagagg 6660ccaagaggag ggtgctcgca ggggagcata cagcatgaag
aggccgggag gtgccccagg 6720acaccaagtg tgggaaagtg ggacatgcgg ggaagttccc
agaaagcgtg atgtcaagtt 6780ggaggcggag cgctgctggg gcgtgaagag tctcgagtcc
aagtgaggga gttaagaact 6840tggtaggggt tgttgttggg tcgggcacct ggggtcagcc
aggtggtgac ctgggatgcg 6900gtggggacag gcaatgaggt aagctctgct ctttagtatt
atgcagatgc ttttctcaaa 6960ctttcctgat cctgcaggca agctaaacct gttccggaag
aacctgcggg agtcgcagca 7020gctccgatgg gatgagagct gggtgcagac tgtgctccct
ttggttatgg acacataact 7080cctgggccag aggctaaaac cccagggccc ctgctgtcct
tcccgcagct tcttcttgga 7140gtctcagggc aaaccctttc aagcagcgcc tcccagtggc
cagaagctga aatgacggca 7200gtggtgccgc ctggtgagtg aattgcttct gtgacccggg
aagctgtggt tggctctgat 7260ttcttttttg gaggcttgga aacacttcct ctcttcttct
gttcttcacg ccccatgccc 7320ctgctagcgt actactgttc tgtgacttcc ctgtgacctc
tgcagtactc ctcatcctgc 7380gtttggtctc caggtgtcgc ctttctgccg tgttcctaat
attttgattc ctgtcttgaa 7440aaaagcacct gctgcacagt aagcccaggg atgtggcagc
tgcagcgggc ttggctttgt 7500gaggaaccgg gtgtgtccac gttgggggaa catcatactt
gatacacacg tttttatttg 7560cacaaagaaa atgctatttt tagagccaga attttcatgt
ctgatttatg gtgattttct 7620taagaaccag aactgctggc agaaaggggg cacccacacg
cttagatagc cgatgtctta 7680ttagagggca gtttgtggtt cctgatttgg aaattaatat
tctccaaaca ttccagtcca 7740atgaaagttt tatccgcttt cccatgtaaa aattcttccc
atgagagtga cttgatcctc 7800acaatcccgt tgaagtcgcg tgtgagtcct acagtattag
gttcagcatt gccatctcca 7860agtgctcttt gtagggaaac agtttctggt catgacaagc
ttccacttcc catctgatcc 7920tggcctggcc tggaaacaga gcacatgtgt ttgaggatgg
cggtgtttgg ggacaggaca 7980tgagcgtatt gtgtggggct gctaggacag gcgtggtgtg
gtgggggagt gtccaagtca 8040gtctacttgt ttcacagttt cccagtccca cccaggtacc
tagaattggc ctccaggatg 8100ggaccagaaa tctggttttg catagaaatg gttagcagca
ggcaccgtgc cgctgtccac 8160tctctgctgg cgtctgcccc agcacttggc acagcgggac
agaagcagag ctctgaaccc 8220acatctacct ggctgcccag tcaacccact cttcacaaag
cttagaaagc ggccgggcac 8280agtggctcac gcctgtaatc ccaacacttt gggaggccaa
ggtgggggga tcacttgagg 8340tcaggagttc gagagcagcc tggccaacat ggtgaaaccc
catctctaca aaaatacaaa 8400aattagccag gcatgatggc gggtgcctgt aatcccagct
acctgggagg ctgaggcagg 8460agaattgctt gaacccagga ggcagaggtt gcagtgagct
gagattgtgc cactgcactc 8520cagcctgagt gacagagtga gactccattt caaaaaaaaa
acaccaaaaa aaacaaaaca 8580aaacaaacaa acaaaaacac acacacacag cttagaaggg
gctggtgttc tcataagcac 8640agatgtctga agagccgtta gccagaatga ttcttttttt
tttttttttt ttgagatacg 8700atcttgttct ttcacccagg ctggagtgca gtggcacagt
cattgctcac cacagccttg 8760actcctgggc tctagcaatc ctcccatttc ctgactagct
gggatgacag gtgcgtgcca 8820ccatgccagt aattttttta ttttgtagag atggggtcct
gaacccatgg cctcaaatga 8880tgctcctgcc tcagcctctt ttattatttt ttttttggac
ggagtttaac tctgttcccc 8940tagctggagt gcagtggcgc aatctcagct cactgcgatg
cctcccaggt tcaagtgatc 9000ctcctgcatc agcatcccga gtagctggga ttataggcgt
gcagcaccac gcctggctaa 9060tttttgtgtt tttagtagag atggggtttc accgtgttgg
ccaggctggt cttgatctct 9120tgacctcaag tgatccgctc acctcagcct cccaaatcct
cagcctctta cagtgttggg 9180attacaggtg tgagacactg tgacccggga tgattttcaa
tcacagtttt ttgttacaag 9240tggaaaatgc atatctataa aaatgaagta gtgcagacat
gaatgtgtag aagtctctat 9300aatcctgcca tccaaggatg gcacctgtta acgagtgtat
tagggatgtc caatcttttg 9360gcctccctgt gccacattgg aagaagaatc accttgggcc
acacataaaa tacactaacg 9420ctagcaatag ctgatgagct aaaagaaaaa aattcacaaa
aaaacctcgt actgttttaa 9480gaaagtttac agatttgtgt tgggccgcag gttggacaag
cctgctatat atatattcta 9540ggttttctcc tataggtata cttatgtgaa aatgatgatt
gtgataattt tttttttgag 9600atga
9604
32 1164 DNA Homo sapiens
32
ggtcagctgc acctagaagg
ggccccctcc tggctcaaaa atggctgcaa ccacatgagc 60catgtgactc aggatggagg
gtggcactgt ccacaaatcg gcttctctgc agagcccact 120ctgcaaggct gcggtgaggg
gctgaggctg gctgcctggt gggatggcct gagatcctgc 180tcctgctagg gccagaggcc
aactgcctct agggttgctt ggtgcctcag taacaggtgg 240tttgagtgga tgtctcctgc
aggaaccatg actagaaatg tggttagaca agaatttgag 300gctccaggga agccacagga
ttctagccag caggatgcct gcttaatcct cgtaaaagga 360aactggacaa caaacgagat
ggaggtaaaa tgaaatccca aaggaacgcc tccaaggatc 420ccgtcttcta cctgtcagga
gatgttctat ttttctaact tccccacctt cagcataaca 480ggatttagta actgtttaga
ccttttgaag aacaaagaca aaaataggaa gttgtgtcag 540aggagtgcag tcatctctta
ttcatggtag tgttttggtt cgattttgga gcatccacca 600ggttcgcgta caggagaaga
aacacggatg atctctgtct ttattttggg ggaggacaat 660ccggccttct tgggactgtt
ccagcccacc tcccagatgc ttcggtcagt ctgatccacg 720agatatctca ggctgggaag
aacacatctt gggccaactt ccacagcagc catgcgtcca 780aggatacaga atcaggcagg
ccgataaaag gattttgact tgggtttcac aggaaggagg 840tatgccatca aaccccactg
tagtcaagga tcagaaggtg cttttagcag gaactgaagc 900tttttagtat aaggcccact
tagggtgcca gcaccctgac agtgtccctt taatccactt 960cagatctggt tgactcatca
tcaggtgttt ccattcactt tcaaaattac acagtcacag 1020gataacagtg agatccactg
ctaaagggta aagggtcaga ggctgaagag tactgtgtaa 1080tgtctgcatg agatttgttt
taaaaaagtc tgaattggca tggtattttg aatgctgctg 1140gaactgtaat aaaggttcac
cttt 1164
33 2947 DNA Homo sapiens
33
acattcttcc cgggcactcc tgagtttgag ccgggcctgg aggacctggg ccaagacttt
60cgagcggcgg ccgcccgagg tgcgaggagc cagcccggct ttcctcactg ggtcccgcgc
120aggcgtcccc gggaccgcag agcaaacttt ctggactatc tgaggacact tgtccagcga
180gccccactgc tcggggagga ggagccacgg ccggggacag gtgatacact tgggtgttga
240aggacatttt tgaaatcatg agaactcaat gtttgactat gaatgtttcg ttataactgc
300ctggaaggtt agcgtcaaag aagttgagat ttttaaagtc ttcttctagg ggtttccagc
360agagccaaat gttagaaaaa tctttccgct cctctgaaga gtgaagtgag caaatacaac
420ccagcagtag gttattgaag acagcagccc caggttttgg aaggtgacaa tgaaatgtga
480agaagttaca tttctcaaac ttgaaagtta gtgacggctt accaaatttt aatgaaaatt
540aaatatgact tagaagcatt gatttatgaa ggcttatgat gtcatcggtt tcgacagaaa
600gcaaactcca gcaggctgtg agcctacagg gagttgaccc agaaacatgc atgattgtat
660ttaaaaacca ctgggcacag gttgtgaaaa tcttggagaa gcacgacccc ttgaagaaca
720cccaggcaaa atatgggtct atccctccag atgaggccag tgccgtgcag aattacgtag
780aacacatgct cttcttgttg attgaagagc aagccaaaga tgctgcaatg gggccgattc
840tggaatttgt ggtctctgag aacatcatgg agaaactttt cctttggagc ttgagaaggg
900agtttactga tgagactaaa attgagcagc taaagatgta tgagatgttg gtcacccagt
960cgcaccagcc tctgctgcac cacaaaccca ttctgaagcc tctgatgatg ttgctgagct
1020cttgttcagg aacaaccacc cccactgtgg aggagaagct ggttgtccta ctcaatcagc
1080tctgttccat tcttgccaaa gatccatcca ttttagaact cttcttccac actagtgaag
1140accaaggcgc tgccaacttc ctcatcttct cccttctgat tcccttcatt caccgagagg
1200ggtcagtagg ccagcaagct cgggatgcat tgctcttcat catgtctctt tctgctgaga
1260acaccatggt ggcccatcac atcgtggaga acacctactt ttgtccagta cttgcaactg
1320ggctcagtgg tctctactct tccctgccta caaagctaga agatgaggag gatgactttg
1380actcttttat agcggagatg cctgctgtag agactgtgcc ttccccattt gtggggagag
1440atgaggctgc ctttgccagt cgccatcccg tgaggactca aagcacccca ttcacaggcc
1500cattcatcag cgtagtcctg tcaaagctgg agaacatgct ggagaactct ttacatgtta
1560atttgctgct tatcgggatc attactcagc tagccagcta cccccagcca ctcctgcgct
1620cctttctgct caacaccaac atggtcttcc agccaagcgt ccgctctctc tatcaggtcc
1680ttgcatctgt gaaaaacaag attgaacagt ttgcttctgt ggagagagac ttcccagggc
1740tcctcattca agctcagcag tacctgctct tccgtgtgga catgtctgat atgacccctg
1800cagcactaac caaagatccc attcaggagg cttccaggac aggaagtggc aagaaccttt
1860tggatggacc tccaagagtg cttcagccct tcctgaccca cagaaccaag gtggctgagg
1920caccccccaa cctgcccctg ccggtgagga accccatgct ggctgctgcc ctcttcccag
1980agttcctgaa ggagctggcg gccttggccc aggaacactc cattctgtgc tacaagatct
2040tgggtgactt tgaggactcc tgctgttagc tttttttttt ttttttaata gaggttcttg
2100ttttgtaagg ttttagtgtc ttgactgaat gttaaatgca aagctgctta caaagatttc
2160tactttaatg tttcctgaca atacttgatt tgtggggagg ggaattttct gtatctttcc
2220tctctctctc tagccgggcc tttccacctt atgttatata tagaatgtaa gtctcataag
2280ctggttgctc ccttggcagt tttctttgct ctgtttttcc tccttatatt tttttggttg
2340tcattctcct atccctttga gttactcttc ttgcagctca gatcacgtca agcagatatt
2400ggggttcagt gatgtctggt gatgtctgga agtgccccat gtcagaattc cagctgttca
2460gcagcacagg aagattgtac acctgcaact gtgcgaatgg tcctgttgcc tcctgcattt
2520tggcctctgt tctataaagg aagagtaaag atggagctcc tcctgcctcc atcacgaaag
2580cacatatcat ctgtcccttt ggattttact tccaggacgt gtgtcgtccc cagcgtgtgt
2640tgccttatgg tgccggcaga gcctcagcta tctgcctggg aagtcggatg tccttggaga
2700gaatttggaa tgcagataat ttttcttatt tcttgagagc ttactttaat cagcatgaca
2760ctacctaaac actgaagatg gccttatatt agtaagattt gcacaaaatt aagtatacct
2820atgcaaacta ttactttggt ttttaggagt ttgatcagat gaagaagtaa tggtatcaca
2880tatatatgta agaagacaac catcattatt tttgtaagtg ttttataaaa acaaactgat
2940taacttg
2947
34 4576 DNA Homo sapiens
34
acattcttcc cgggcactcc tgagtttgag ccgggcctgg
aggacctggg ccaagacttt 60cgagcggcgg ccgcccgagg tgcgaggagc cagcccggct
ttcctcactg ggtcccgcgc 120aggcgtcccc gggaccgcag agcaaacttt ctggactatc
tgaggacact tgtccagcga 180gccccactgc tcggggagga ggagccacgg ccggggacag
gtgatacact tgggtgttga 240aggacatttt tgaaatcatg agaactcaat gtttgactat
gaatgtttcg ttataactgc 300ctggaaggtt agcgtcaaag aagttgagat ttttaaagtc
ttcttctagg ggtttccagc 360agagccaaat gttagaaaaa tctttccgct cctctgaaga
gtgaagtgag caaatacaac 420ccagcagtag gttattgaag acagcagccc caggttttgg
aaggtgacaa tgaaatgtga 480agaagttaca tttctcaaac ttgaaagtta gtgacggctt
accaaatttt aatgaaaatt 540aaatatgact tagaagcatt gatttatgaa ggcttatgat
gtcatcggtt tcgacagaaa 600gcaaactcca gcaggctgtg agcctacagg gagttgaccc
agaaacatgc atgattgtat 660ttaaaaacca ctgggcacag gttgtgaaaa tcttggagaa
gcacgacccc ttgaagaaca 720cccaggcaaa atatgggtct atccctccag atgaggccag
tgccgtgcag aattacgtag 780aacacatgct cttcttgttg attgaagagc aagccaaaga
tgctgcaatg gggccgattc 840tggaatttgt ggtctctgag aacatcatgg agaaactttt
cctttggagc ttgagaaggg 900agtttactga tgagactaaa attgagcagc taaagatgta
tgagatgttg gtcacccagt 960cgcaccagcc tctgctgcac cacaaaccca ttctgaagcc
tctgatgatg ttgctgagct 1020cttgttcagg aacaaccacc cccactgtgg aggagaagct
ggttgtccta ctcaatcagc 1080tctgttccat tcttgccaaa gatccatcca ttttagaact
cttcttccac actagtgaag 1140accaaggcgc tgccaacttc ctcatcttct cccttctgat
tcccttcatt caccgagagg 1200ggtcagtagg ccagcaagct cgggatgcat tgctcttcat
catgtctctt tctgctgaga 1260acaccatggt ggcccatcac atcgtggaga acacctactt
ttgtccagta cttgcaactg 1320ggctcagtgg tctctactct tccctgccta caaagctaga
agagaaaggc gaggaatggc 1380actgccttct gaaagatgac tggcttctac ttccttctct
tgtccagttc atgaactccc 1440tggagttttg caatgcagtc atacaggtgg ctcacccctt
gattcgaaat cagcttgtca 1500attacattta caatggattt ttggtaccag tcttggctcc
tgctctccat aaggtgactg 1560tggaagaggt catgaccaca actgcatatc tggacctttt
cctgcgtagc atctccgagc 1620cagcactact tgagatcttc ctccgtttta tcctattgca
ccagcacgag aatgtccaca 1680tcctagacac tctcacgagt cgaatcaaca ccccgtttcg
gctttgtgtg gtgtctctgg 1740cattattcag aactctcatt ggtttacatt gtgaagatgt
gatgttacag ctagttctaa 1800ggtatctgat cccctgcaat cacatgatgc tgagtcagag
gtgggctgtg aaggagagag 1860actgttactc tgtttctgcg gccaagcttc tcgccttgac
tcctgtctgc tgctccagcg 1920ggatcactct gacgctgggg aaccaagaga gggattatat
tctctggtca aagtgtatgc 1980atgacacttc agggcctgtg gagcggccat tccccgaagc
gttctccgag tcagcctgca 2040ttgtggagta tgggaaagcc ctggacatca gctacctgca
gtacctgtgg gaggcccaca 2100ccaacatcct ccgctgcatg agggactgcc gtgtctggtc
cgccctgtat gatggcgact 2160cccccgaccc tgagatgttt ctccagagtc tgacggagga
gggcagtgtg agctcggcct 2220gccctgtgtt cgggctcccg caacaactcc ccaggaagac
aggacctcag ctggctccca 2280gaaaggacaa gagccagaca gagctggaat gggatgacag
ctatgacact ggaatctcct 2340caggggctga cgtgggctcc ccagggcctt atgatgatct
ggaggtttca ggccccccag 2400cacccattga tccccccaaa cacatccagg agatgaagaa
gaatgccctc ctgctcttca 2460aagggtccta catagaagag tcggactttc aggatgatgt
gatggtgtac aggctgtgtg 2520ctgagaagga ctccgaggac atgaaggatt ctcaggagga
agctgctagg ccaccagctg 2580aagcccaggc tgaagttcag agtgtcccca tcaacaacgg
ccccctcctc agcacccagc 2640cagagacaga ttcagaggag gagtggaata gggacaattc
agacccgttt cacagtgagc 2700ccaaggagcc aaagcaagag agggaacctg aagcagcccc
agaatccaac tcagagttag 2760catcccctgc ccctgaggca gagcacagct ctaacctgac
agccgcccac ccggagagcg 2820aggagctcat tgcccagtat gaccaaatca ttaaagagct
ggattccggc gccgagggct 2880tgatggaaca gaattacccc acacctgatc ccttgcttct
cactaaggag gaagaaggga 2940aggaagagag taaaggagaa aaggagaagg aggggaagaa
ggagctagaa gatgaggagg 3000atgactttga ctcttttata gcggagatgc ctgctgtaga
gactgtgcct tccccatttg 3060tggggagaga tgaggctgcc tttgccagtc gccatcccgt
gaggactcaa agcaccccat 3120tcacaggccc attcatcagc gtagtcctgt caaagctgga
gaacatgctg gagaactctt 3180tacatgttaa tttgctgctt atcgggatca ttactcagct
agccagctac ccccagccac 3240tcctgcgctc ctttctgctc aacaccaaca tggtcttcca
gccaagcgtc cgctctctct 3300atcaggtcct tgcatctgtg aaaaacaaga ttgaacagtt
tgcttctgtg gagagagact 3360tcccagggct cctcattcaa gctcagcagt acctgctctt
ccgtgtggac atgtctgata 3420tgacccctgc agcactaacc aaagatccca ttcaggaggc
ttccaggaca ggaagtggca 3480agaacctttt ggatggacct ccaagagtgc ttcagccctt
cctgacccac agaaccaagg 3540tggctgaggc accccccaac ctgcccctgc cggtgaggaa
ccccatgctg gctgctgccc 3600tcttcccaga gttcctgaag gagctggcgg ccttggccca
ggaacactcc attctgtgct 3660acaagatctt gggtgacttt gaggactcct gctgttagct
tttttttttt tttttaatag 3720aggttcttgt tttgtaaggt tttagtgtct tgactgaatg
ttaaatgcaa agctgcttac 3780aaagatttct actttaatgt ttcctgacaa tacttgattt
gtggggaggg gaattttctg 3840tatctttcct ctctctctct agccgggcct ttccacctta
tgttatatat agaatgtaag 3900tctcataagc tggttgctcc cttggcagtt ttctttgctc
tgtttttcct ccttatattt 3960ttttggttgt cattctccta tccctttgag ttactcttct
tgcagctcag atcacgtcaa 4020gcagatattg gggttcagtg atgtctggtg atgtctggaa
gtgccccatg tcagaattcc 4080agctgttcag cagcacagga agattgtaca cctgcaactg
tgcgaatggt cctgttgcct 4140cctgcatttt ggcctctgtt ctataaagga agagtaaaga
tggagctcct cctgcctcca 4200tcacgaaagc acatatcatc tgtccctttg gattttactt
ccaggacgtg tgtcgtcccc 4260agcgtgtgtt gccttatggt gccggcagag cctcagctat
ctgcctggga agtcggatgt 4320ccttggagag aatttggaat gcagataatt tttcttattt
cttgagagct tactttaatc 4380agcatgacac tacctaaaca ctgaagatgg ccttatatta
gtaagatttg cacaaaatta 4440agtataccta tgcaaactat tactttggtt tttaggagtt
tgatcagatg aagaagtaat 4500ggtatcacat atatatgtaa gaagacaacc atcattattt
ttgtaagtgt tttataaaaa 4560caaactgatt aacttg
4576
35 6218 DNA Homo sapiens
35
agctccgctc gcgctctcgc
cgctcctgcc ggctcgcccg gccccgcgct ccgccgtctc 60ctcgccgccc gcccctccgc
cagccccggg gaccgcgcgg ccgcagcctg agccagggcc 120ccctccctcg tcaggaccgg
ggcagcaagc aggccggggg caggtccggg cacccaccat 180gcgaggcgag ctctggctcc
tggtgctggt gctcagggag gctgcccggg cgctgagccc 240ccagcccgga gcaggtcacg
atgagggccc aggctctgga tgggctgcca aagggaccgt 300gcggggctgg aaccggagag
cccgagagag ccctgggcat gtgtcagagc cggacaggac 360ccagctgagc caggacctgg
gtgggggcac cctggccatg gacacgctgc cagataacag 420gaccagggtg gtggaggaca
accacagcta ttatgtgtcc cgtctctatg gccccagcga 480gccccacagc cgggaactgt
gggtagatgt ggccgaggcc aaccggagcc aagtgaagat 540ccacacaata ctctccaaca
cccaccggca ggcttcgaga gtggtcttgt cctttgattt 600ccctttctac gggcatcctc
tgcggcagat caccatagca actggaggct tcatcttcat 660gggggacgtg atccatcgga
tgctcacagc tactcagtat gtggcgcccc tgatggccaa 720cttcaaccct ggctactccg
acaactccac agttgtttac tttgacaatg ggacagtctt 780tgtggttcag tgggaccacg
tttatctcca aggctgggaa gacaagggca gtttcacctt 840ccaggcagct ctgcaccatg
acggccgcat tgtctttgcc tataaagaga tccctatgtc 900tgtcccggaa atcagctcct
cccagcatcc tgtcaaaacc ggcctatcgg atgccttcat 960gattctcaat ccatccccgg
atgtgccaga atctcggcga aggagcatct ttgaatatca 1020ccgcatagag ctggacccca
gcaaggtcac cagcatgtcg gccgtggagt tcaccccatt 1080gccgacctgc ctgcagcata
ggagctgtga cgcctgcatg tcctcagacc tgaccttcaa 1140ctgcagctgg tgccatgtcc
tccagagatg ctccagtggc tttgaccgct atcgccagga 1200gtggatggac tatggctgtg
cacaggaggc agagggcagg atgtgcgagg acttccagga 1260tgaggaccac gactcagcct
cccctgacac ttccttcagc ccctatgatg gagacctcac 1320cactacctcc tcctccctct
tcatcgacag cctcaccaca gaagatgaca ccaagttgaa 1380tccctatgca ggaggagacg
gccttcagaa caacctgtcc cccaagacaa agggcactcc 1440tgtgcacctg ggcaccatcg
tgggcatcgt gctggcagtc ctcctcgtgg cggccatcat 1500cctggctgga atttacatca
atggccaccc cacatccaat gctgcgctct tcttcatcga 1560gcgtagacct caccactggc
cagccatgaa gtttcgcagc caccctgacc attccaccta 1620tgcggaggtg gagccctcgg
gccatgagaa ggagggcttc atggaggctg agcagtgctg 1680agaacaccaa gtctcccctt
tgaagacttt gaggccacag aaaagacagt taaagcaaag 1740aagagaagtg acttttcctg
gcctctccca gcatgccctg ggctgagatg agatggtggt 1800ttatggctcc agagctgctg
ctcgcttcgt cagcacaccc cgaatattga agagggggcc 1860aaaaaacaac cacatggatt
ttttatagga acaacaacct aatctcatcc tgttttgatg 1920caagggttct cttctgtgtc
ttgtaaccat gaaacagcag aagaactaac ataactaact 1980ccatttttgt ttaaggggcc
tttacctatt cctgcaccta ggctaggata actttagagc 2040actgacataa aacgcaaaaa
caggaatcat gccgtttgca aaactaactc tgggattaaa 2100ggggaagcat gtaaacagct
aactgttttt gttaaagatt tataggaatg aggaggtttg 2160gctattgtca catgacagac
tgttagccaa ggacaaagaa gttctgcaaa cctcccctgg 2220acccttgctg gtgtccagat
gtctgcggtt gtcagcccct tcctttcccc cgacctaaac 2280ataaaagaca aggcaaagcc
cgcataattt taagacggtt ctttaggaca ttagtccacc 2340atcttcttgg tttgctggct
ctccgaaata aagtcccttt ccttgctcca actccttgtc 2400tctcaacgta ttggctatga
cgcagcaagc agaatgaatt tggactcagt tacaggctgt 2460caatggtctg ctctgtagca
gtctcagagc ctccccgacc cactacctgg agatagccag 2520atagccagat gccctgctcc
tggccacctt taaagcccct gcatatgaca caggttaact 2580aaagtcaaga ttggggctgc
tgcattccag gttccctaga ctcacaagct ggtccttggc 2640caggtgcagt ggctcacgcc
tgtaatccca gcactttggg aggctgaggc aggcggatca 2700cctgaagtca gaagtttgag
accagcctgg ccaacataat taaaatgtct ctactaaaaa 2760tacaaaaaat tagctgggtg
tggtgacgct tgcctgtatc ccagctactc aggaagctga 2820gacacgagaa tcacttgaac
ctgggaggca gaggttgcag tgagctcaga tagtgccact 2880gcactccagc ctgggtgaca
gagcgagact ccgtctcaaa aaaaaaaaaa gaaagcagat 2940cctcatggct atagagttgg
cattttagcc ccagcttctg tagctctgaa agcctaaaga 3000aggtattctc tccatctgtt
aaacacagta tagtggctct cagcccttgg ggcatgttat 3060catgggaggg aagtcaaata
agaggagaga aaagaactca agggggaaac tgcattttta 3120ggctttgctc tcttaccttg
ccctttctac tcagaaccaa taacttctgc atcaaaacat 3180gttacagcct gcatcaaggg
ctttacccca acctgcagcc cagccttccc tgggtgagct 3240tgctatgcgc agccacattt
accatgtggg gctccctatt ctgatggcct gttcggtgcc 3300gggtttactc actgccctgt
tctgatgtca gtgcctgtac atacctccaa aggcaggact 3360tgcctgataa atatttttcc
tcctctgaac tggattttat aggcattaaa gacaagtcgg 3420gtggctagag ggctccttga
gacataccta gcagggaact gcaggtggat tctgttgaga 3480ggcaaagcac ctgagtggtt
gggacacagg cagctggcat gggagggact ttttttgaga 3540cagggtctca ctgtgtcgcc
cagggcaagg atgcccaaag acaccaggtt ggagaggcac 3600ctgccaacta cttgctttcc
ctggagcctg catgtgcctg tggggtgggg aggcgtaggg 3660gtctacggct gcctgagatg
ggtgtgcaca gtgtgtgaag tacctacctc cttgccttgc 3720tggactgtca gccagtcgca
gggccggcca caagacccat gtctccatct ggtcatactc 3780catagctacc aagttaacct
gctctaaact ttggagaact ggatctgtcc aataaacgct 3840tatttggcca agcctgatgg
ctcgtgcctg tactcccagc actttgggag gctgaggtgg 3900gagggttgct tgagcccagg
ggtttgagac cagcttgggc aacaacaaca aaaatgccag 3960gtgtggtggg gtgcacctgt
agtcccagct actagggagg ctgagccagg aggatcactt 4020gagcccggga ggttgaggct
gcagtggggg gtcataatca tgccactgta ctccagcctg 4080ggtgacagag tgagaccctg
tctccgaaaa aaaaaaaaaa aaaaagaacg gaaaaagaaa 4140tgcttacatt gtcagggatc
ctgtagacaa tcattaactc tatgagatgc ttggttctat 4200ttttttggga gactttgtcc
aagtgttttg gcttaagaaa tccataggcc tctcttggtg 4260acacatctct agtacttttt
gtcataaaca aacaggccat ctgccgccaa atacatccac 4320tccccatgcc actgacatcc
tatgggtcag ccaggcttgc tttgactgag gccgaggcat 4380ctggaacttt ctctgcctgc
aggggctagc agcagaggct tcaccgcatc accacccctt 4440cctccactcc tgacattctt
tcccttcagg gatccaaaat ggttggccga gctcccagtg 4500ggaaaacgtg tgctagagtt
ggggagtgag atgagtggtg ctgtccatgg aatcaggcca 4560cagcaggaac tgccccactg
gccatttgag acacacacag gtggtaaatg ctctgctggt 4620gggctgtgct tccctcattc
agagagctct gttacagccc actgtgtcct ttagaagctt 4680gaaaggaacc caactctttg
ctgcactgtc ctttttcttc ctcaaattca gaccctcctt 4740ccaccggcac ccccctactc
caccctcagc tcttccttgc ctggtttatc aagcagagct 4800gaggccccac gtttccaact
ctgattgtca cttgcatctt cacaaaggat aaaccacgga 4860gcaactggaa aaccatcagc
caagcgttcg gatgagtctg gttattggtc cacccccgac 4920cagattccct tacacttaac
tcacttcttt ctttggcaat gaccctcatg acatgtataa 4980atgggtatga ctaagaagag
gctgtgatct aacatttatt tgctgccatt ttttactctg 5040gggagaagca gccccaactc
atcactggga aagaactccc cctgcaaacc agctaaattt 5100gataatttaa accccctgcc
cctaaaactt ctcacagagc tggggagttg gtggcaactt 5160tccaagtcaa ggtcttgctt
agaaagtcct tcactacatg gccaggtgca gtggctcacg 5220cctgtagtcc caggtacttg
ggagcctgag gcaggaggat tgcttgagct caggagttca 5280aggctgcaga gagctatgat
catcccactg catttgttta aaaataaatt tttaaaattt 5340gtgtgtttta tcaggggtct
cctgtacagt gtatctgtgt atgtttgtgt gtgtgtttgt 5400atacagcctt gtttaatgtt
ttgagcaata agatatgcac acacaggtat tttgttgcta 5460aagagattgg acaaggttgt
agctgtgctc aggcttcagc ttggtttgtt aaattgagag 5520ataaacaatg acaagagctg
ccagccaacc acactattca aaaagcaaag tgttcaccac 5580taaagctaac cattcatctg
gttgcaggca aggctaaggc tctctctcct ctagttcctg 5640gaacagactc acagattggc
atgaagcact gatcaggggc tgcactcaga ctccctggcc 5700aagcaaacct acaccagaag
agtcagtgtc acagatatga tgcggccaat ctctgtctcc 5760aaaaacctac ctgaacttaa
tggtagaatt caaagatctg gggactgagg gcacccagcc 5820ttctaaaaca caatgtattc
atgtgtttag tgtaaactct ctgcatggat tctcagtgtt 5880aataataaaa ggaagcattc
ttttacaact cctgctgtgt gcaaaagaaa gtgcaaagga 5940tttggagtgg cattccgaag
atcaccacac ataccttggt tctgatggct gctgaactcc 6000gacttcttcg ctgagacatg
actgtgggaa cagcctccag ctatctgctc atcagaggtg 6060ctttcctcaa cctcctgcac
cacctccaag agaaacagcc taaaaagaaa ccccagctgt 6120ttacttatat tggtctgtaa
atccctggaa gtaaacccca tgcattttta tctactgtct 6180gaggacatac aataaatctg
agaaagtcta tgctgtca 6218
36 1295 DNA Homo sapiens
36
ttggtcccag gcagcagtta gcccgccgcc cgcctgtgtg tccccagagc catggagaga
60gccagtctga tccagaaggc caagctggca gagcaggccg aacgctatga ggacatggca
120gccttcatga aaggcgccgt ggagaagggc gaggagctct cctgcgaaga gcgaaacctg
180ctctcagtag cctataagaa cgtggtgggc ggccagaggg ctgcctggag ggtgctgtcc
240agtattgagc agaaaagcaa cgaggagggc tcggaggaga aggggcccga ggtgcgtgag
300taccgggaga aggtggagac tgagctccag ggcgtgtgcg acaccgtgct gggcctgctg
360gacagccacc tcatcaagga ggccggggac gccgagagcc gggtcttcta cctgaagatg
420aagggtgact actaccgcta cctggccgag gtggccaccg gtgacgacaa gaagcgcatc
480attgactcag cccggtcagc ctaccaggag gccatggaca tcagcaagaa ggagatgccg
540cccaccaacc ccatccgcct gggcctggcc ctgaactttt ccgtcttcca ctacgagatc
600gccaacagcc ccgaggaggc catctctctg gccaagacca ctttcgacga ggccatggct
660gatctgcaca ccctcagcga ggactcctac aaagacagca ccctcatcat gcagctgctg
720cgagacaacc tgacactgtg gacggccgac aacgccgggg aagagggggg cgaggctccc
780caggagcccc agagctgagt gttgcccgcc accgccccgc cctgccccct ccagtccccc
840accctgccga gaggactagt atggggtggg aggccccacc cttctcccct aggcgctgtt
900cttgctccaa agggctccgt ggagagggac tggcagagct gaggccacct ggggctgggg
960atcccactct tcttgcagct gttgagcgca cctaaccact ggtcatgccc ccacccctgc
1020tctccgcacc cgcttcctcc cgaccccagg accaggctac ttctcccctc ctcttgcctc
1080cctcctgccc ctgctgcctc tgatcgtagg aattgaggag tgtcccgcct tgtggctgag
1140aactggacag tggcaggggc tggagatggg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg
1200tgtgcgcgcg cgccagtgca agaccgagat tgagggaaag catgtctgct gggtgtgacc
1260atgtttcctc tcaataaagt tcccctgtga cactc
1295
37 2903 DNA Homo sapiens
37
gctgcaccgg cccccacctc ccggcttcca gaaagctccc
cttgctttcc gcggcattct 60ttgggcgtga gtcatgcagg tttgcagcca gccccaaagg
gggtgtgtgc gcgagcagag 120cgctataaat acggcgcctc ccagtgccca caacgcggcg
tcgccaggag gagcgcgcgg 180gcacagggtg ccgctgaccg aggcgtgcaa agactccaga
attggaggca tgatgaagac 240tctgctgctg tttgtggggc tgctgctgac ctgggagagt
gggcaggtcc tgggggacca 300gacggtctca gacaatgagc tccaggaaat gtccaatcag
ggaagtaagt acgtcaataa 360ggaaattcaa aatgctgtca acggggtgaa acagataaag
actctcatag aaaaaacaaa 420cgaagagcgc aagacactgc tcagcaacct agaagaagcc
aagaagaaga aagaggatgc 480cctaaatgag accagggaat cagagacaaa gctgaaggag
ctcccaggag tgtgcaatga 540gaccatgatg gccctctggg aagagtgtaa gccctgcctg
aaacagacct gcatgaagtt 600ctacgcacgc gtctgcagaa gtggctcagg cctggttggc
cgccagcttg aggagttcct 660gaaccagagc tcgcccttct acttctggat gaatggtgac
cgcatcgact ccctgctgga 720gaacgaccgg cagcagacgc acatgctgga tgtcatgcag
gaccacttca gccgcgcgtc 780cagcatcata gacgagctct tccaggacag gttcttcacc
cgggagcccc aggataccta 840ccactacctg cccttcagcc tgccccaccg gaggcctcac
ttcttctttc ccaagtcccg 900catcgtccgc agcttgatgc ccttctctcc gtacgagccc
ctgaacttcc acgccatgtt 960ccagcccttc cttgagatga tacacgaggc tcagcaggcc
atggacatcc acttccatag 1020cccggccttc cagcacccgc caacagaatt catacgagaa
ggcgacgatg accggactgt 1080gtgccgggag atccgccaca actccacggg ctgcctgcgg
atgaaggacc agtgtgacaa 1140gtgccgggag atcttgtctg tggactgttc caccaacaac
ccctcccagg ctaagctgcg 1200gcgggagctc gacgaatccc tccaggtcgc tgagaggttg
accaggaaat acaacgagct 1260gctaaagtcc taccagtgga agatgctcaa cacctcctcc
ttgctggagc agctgaacga 1320gcagtttaac tgggtgtccc ggctggcaaa cctcacgcaa
ggcgaagacc agtactatct 1380gcgggtcacc acggtggctt cccacacttc tgactcggac
gttccttccg gtgtcactga 1440ggtggtcgtg aagctctttg actctgatcc catcactgtg
acggtccctg tagaagtctc 1500caggaagaac cctaaattta tggagaccgt ggcggagaaa
gcgctgcagg aataccgcaa 1560aaagcaccgg gaggagtgag atgtggatgt tgcttttgca
cctacggggg catctgagtc 1620cagctccccc caagatgagc tgcagccccc cagagagagc
tctgcacgtc accaagtaac 1680caggccccag cctccaggcc cccaactccg cccagcctct
ccccgctctg gatcctgcac 1740tctaacactc gactctgctg ctcatgggaa gaacagaatt
gctcctgcat gcaactaatt 1800caataaaact gtcttgtgag ctgatcgctt ggagggtcct
ctttttatgt tgagttgctg 1860cttcccggca tgccttcatt ttgctatggg gggcaggcag
gggggatgga aaataagtag 1920aaacaaaaaa gcagtggcta agatggtata gggactgtca
taccagtgaa gaataaaagg 1980gtgaagaata aaagggatat gatgacaagg ttgatccact
tcaagaattg cttgctttca 2040ggaagagaga tgtgtttcaa caagccaact aaaatatatt
gctgcaaatg gaagcttttc 2100tgttctatta taaaactgtc gatgtattct gaccaaggtg
cgacaatctc ctaaaggaat 2160acactgaaag ttaaggagaa gaatcagtaa gtgtaaggtg
tacttggtat tataatgcat 2220aattgatgtt ttcgttatga aaacatttgg tgcccagaag
tccaaattat cagttttatt 2280tgtaagagct attgcttttg cagcggtttt atttgtaaaa
gctgttgatt tcgagttgta 2340agagctcagc atcccagggg catcttcttg actgtggcat
ttcctgtcca ccgccggttt 2400atatgatctt catacctttc cctggaccac aggcgtttct
cggcttttag tctgaaccat 2460agctgggctg cagtacccta cgctgccagc aggtggccat
gactacccgt ggtaccaatc 2520tcagtcttaa agctcaggct tttcgttcat taacattctc
tgatagaatt ctggtcatca 2580gatgtactgc aatggaacaa aactcatctg gctgcatccc
aggtgtgtag caaagtccac 2640atgtaaattt atagcttaga atattcttaa gtcactgtcc
cttgtctctc tttgaagtta 2700taaacaacaa acttaaagct tagcttatgt ccaaggtaag
tattttagca tggctgtcaa 2760ggaaattcag agtaaagtca gtgtgattca cttaatgata
tacattaatt agaattatgg 2820ggtcagaggt atttgcttaa gtgatcataa ttgtaaagta
tatgtcacat tgtcacatta 2880atgtcacact gtttcaaaag tta
2903
38 1666 DNA Homo sapiens
38 cggcgtcgcc aggaggagcg
cgcgggcaca gggtgccgct gaccgaggcg tgcaaagact 60ccagaattgg aggcatgatg
aagactctgc tgctgtttgt ggggctgctg ctgacctggg 120agagtgggca ggtcctgggg
gaccagacgg tctcagacaa tgagctccag gaaatgtcca 180atcagggaag taagtacgtc
aataaggaaa ttcaaaatgc tgtcaacggg gtgaaacaga 240taaagactct catagaaaaa
acaaacgaag agcgcaagac actgctcagc aacctagaag 300aagccaagaa gaagaaagag
gatgccctaa atgagaccag ggaatcagag acaaagctga 360aggagctccc aggagtgtgc
aatgagacca tgatggccct ctgggaagag tgtaagccct 420gcctgaaaca gacctgcatg
aagttctacg cacgcgtctg cagaagtggc tcaggcctgg 480ttggccgcca gcttgaggag
ttcctgaacc agagctcgcc cttctacttc tggatgaatg 540gtgaccgcat cgactccctg
ctggagaacg accggcagca gacgcacatg ctggatgtca 600tgcaggacca cttcagccgc
gcgtccagca tcatagacga gctcttccag gacaggttct 660tcacccggga gccccaggat
acctaccact acctgccctt cagcctgccc caccggaggc 720ctcacttctt ctttcccaag
tcccgcatcg tccgcagctt gatgcccttc tctccgtacg 780agcccctgaa cttccacgcc
atgttccagc ccttccttga gatgatacac gaggctcagc 840aggccatgga catccacttc
catagcccgg ccttccagca cccgccaaca gaattcatac 900gagaaggcga cgatgaccgg
actgtgtgcc gggagatccg ccacaactcc acgggctgcc 960tgcggatgaa ggaccagtgt
gacaagtgcc gggagatctt gtctgtggac tgttccacca 1020acaacccctc ccaggctaag
ctgcggcggg agctcgacga atccctccag gtcgctgaga 1080ggttgaccag gaaatacaac
gagctgctaa agtcctacca gtggaagatg ctcaacacct 1140cctccttgct ggagcagctg
aacgagcagt ttaactgggt gtcccggctg gcaaacctca 1200cgcaaggcga agaccagtac
tatctgcggg tcaccacggt ggcttcccac acttctgact 1260cggacgttcc ttccggtgtc
actgaggtgg tcgtgaagct ctttgactct gatcccatca 1320ctgtgacggt ccctgtagaa
gtctccagga agaaccctaa atttatggag accgtggcgg 1380agaaagcgct gcaggaatac
cgcaaaaagc accgggagga gtgagatgtg gatgttgctt 1440ttgcacctac gggggcatct
gagtccagct ccccccaaga tgagctgcag ccccccagag 1500agagctctgc acgtcaccaa
gtaaccaggc cccagcctcc aggcccccaa ctccgcccag 1560cctctccccg ctctggatcc
tgcactctaa cactcgactc tgctgctcat gggaagaaca 1620gaattgctcc tgcatgcaac
taattcaata aaactgtctt gtgagc 1666
39 2961 DNA Homo sapiens
39 gggcagcctg ctgtcggctt agaggggatg ggcagtgtgg agggcctggc agagcaagag
60gactcatcct tccaaaggga ctttctctgg gaagcctgct cctcgggcca ctgcgaaccc
120tctctactct ccgaagggaa ttgtccttcc tggcttccac tacttccacc cctgaatgca
180caggcagccc ggcccaagtc tcccactagg gatgcagatg gattcggtgt gaagggctgg
240ctgctgttgc ctccggctct tgaaagtcaa gttcagaggc gtgcaaagac tccagaattg
300gaggcatgat gaagactctg ctgctgtttg tggggctgct gctgacctgg gagagtgggc
360aggtcctggg ggaccagacg gtctcagaca atgagctcca ggaaatgtcc aatcagggaa
420gtaagtacgt caataaggaa attcaaaatg ctgtcaacgg ggtgaaacag ataaagactc
480tcatagaaaa aacaaacgaa gagcgcaaga cactgctcag caacctagaa gaagccaaga
540agaagaaaga ggatgcccta aatgagacca gggaatcaga gacaaagctg aaggagctcc
600caggagtgtg caatgagacc atgatggccc tctgggaaga gtgtaagccc tgcctgaaac
660agacctgcat gaagttctac gcacgcgtct gcagaagtgg ctcaggcctg gttggccgcc
720agcttgagga gttcctgaac cagagctcgc ccttctactt ctggatgaat ggtgaccgca
780tcgactccct gctggagaac gaccggcagc agacgcacat gctggatgtc atgcaggacc
840acttcagccg cgcgtccagc atcatagacg agctcttcca ggacaggttc ttcacccggg
900agccccagga tacctaccac tacctgccct tcagcctgcc ccaccggagg cctcacttct
960tctttcccaa gtcccgcatc gtccgcagct tgatgccctt ctctccgtac gagcccctga
1020acttccacgc catgttccag cccttccttg agatgataca cgaggctcag caggccatgg
1080acatccactt ccatagcccg gccttccagc acccgccaac agaattcata cgagaaggcg
1140acgatgaccg gactgtgtgc cgggagatcc gccacaactc cacgggctgc ctgcggatga
1200aggaccagtg tgacaagtgc cgggagatct tgtctgtgga ctgttccacc aacaacccct
1260cccaggctaa gctgcggcgg gagctcgacg aatccctcca ggtcgctgag aggttgacca
1320ggaaatacaa cgagctgcta aagtcctacc agtggaagat gctcaacacc tcctccttgc
1380tggagcagct gaacgagcag tttaactggg tgtcccggct ggcaaacctc acgcaaggcg
1440aagaccagta ctatctgcgg gtcaccacgg tggcttccca cacttctgac tcggacgttc
1500cttccggtgt cactgaggtg gtcgtgaagc tctttgactc tgatcccatc actgtgacgg
1560tccctgtaga agtctccagg aagaacccta aatttatgga gaccgtggcg gagaaagcgc
1620tgcaggaata ccgcaaaaag caccgggagg agtgagatgt ggatgttgct tttgcaccta
1680cgggggcatc tgagtccagc tccccccaag atgagctgca gccccccaga gagagctctg
1740cacgtcacca agtaaccagg ccccagcctc caggccccca actccgccca gcctctcccc
1800gctctggatc ctgcactcta acactcgact ctgctgctca tgggaagaac agaattgctc
1860ctgcatgcaa ctaattcaat aaaactgtct tgtgagctga tcgcttggag ggtcctcttt
1920ttatgttgag ttgctgcttc ccggcatgcc ttcattttgc tatggggggc aggcaggggg
1980gatggaaaat aagtagaaac aaaaaagcag tggctaagat ggtataggga ctgtcatacc
2040agtgaagaat aaaagggtga agaataaaag ggatatgatg acaaggttga tccacttcaa
2100gaattgcttg ctttcaggaa gagagatgtg tttcaacaag ccaactaaaa tatattgctg
2160caaatggaag cttttctgtt ctattataaa actgtcgatg tattctgacc aaggtgcgac
2220aatctcctaa aggaatacac tgaaagttaa ggagaagaat cagtaagtgt aaggtgtact
2280tggtattata atgcataatt gatgttttcg ttatgaaaac atttggtgcc cagaagtcca
2340aattatcagt tttatttgta agagctattg cttttgcagc ggttttattt gtaaaagctg
2400ttgatttcga gttgtaagag ctcagcatcc caggggcatc ttcttgactg tggcatttcc
2460tgtccaccgc cggtttatat gatcttcata cctttccctg gaccacaggc gtttctcggc
2520ttttagtctg aaccatagct gggctgcagt accctacgct gccagcaggt ggccatgact
2580acccgtggta ccaatctcag tcttaaagct caggcttttc gttcattaac attctctgat
2640agaattctgg tcatcagatg tactgcaatg gaacaaaact catctggctg catcccaggt
2700gtgtagcaaa gtccacatgt aaatttatag cttagaatat tcttaagtca ctgtcccttg
2760tctctctttg aagttataaa caacaaactt aaagcttagc ttatgtccaa ggtaagtatt
2820ttagcatggc tgtcaaggaa attcagagta aagtcagtgt gattcactta atgatataca
2880ttaattagaa ttatggggtc agaggtattt gcttaagtga tcataattgt aaagtatatg
2940tcacattgtc acattaatgt c
2961
40 2131 DNA Homo sapiens
40 gttcccggca ttccgtgctc cttggttccg gcgttggagc
tctttggggc ccagctttgc 60ggacccggga gctcgggacg caggcggggc ttgtgctccg
cgggggcagg gcgtagggtg 120ggcctcctac ctcccctgat ctcgcggttt gttccgtttc
attggagctt cccggaccgt 180gtgctcgacg gtgccctagg tgccgtgggg ccacacgcga
gtctgataag caccctcccc 240cggaatcatg cggtgctgtg aggcctagcg aagatgaaga
tagaatgcaa ggtagaaagt 300gctggatacc tttagaaagc tgcaggactg gtgcgatggg
agttgagacg taagaacctg 360cccgtccgta gggctctgga tgctgctgag gcccgaggcc
cctatggcag atttgaaaat 420tcacccttgt agagtcattc ctgcctttga gcggactccc
ttttaagcag atctcaagag 480agcgttcggt ggaggccctg ggtctgcaca gctcacctcc
ctgggaactg ctcgcccgag 540cgtcggagcc ggcgctggcc ccctgcagcc ggaaggttgc
agccgcagga gccccggagg 600cccaggacac agggctcttg ctcttgcaga atccacaggt
ctttcttgag gaaatctgta 660gacagaactt tgtgctgcgt ttttatctag ggaaggaaca
gaagagtgtc gtctcctaga 720aatctagcac tggagaaacg aggaaaattc ttccagcgat
ggtctcccac tcagagctga 780ggaagctttt ctactcagca gatgctgtgt gttttgatgt
tgacagcacg gtcatcagag 840aagaaggaat cgatgagcta gccaaaatct gtggcgttga
ggacgcggtg tcagaaatga 900cacggcgagc catgggcggg gcagtgcctt tcaaagctgc
tctcacagag cgcttagccc 960tcatccagcc ctccagggag caggtgcaga gactcatagc
agagcaaccc ccacacctga 1020cccccggcat aagggagctg gtaagtcgcc tacaggagcg
aaatgttcag gttttcctaa 1080tatctggtgg ctttaggagt attgtagagc atgttgcttc
aaagctcaat atcccagcaa 1140ccaatgtatt tgccaatagg ctgaaattct actttaacgg
tgaatatgca ggttttgatg 1200agacgcagcc aacagctgaa tctggtggaa aaggaaaagt
gattaaactt ttaaaggaaa 1260aatttcattt taagaaaata atcatgattg gagatggtgc
cacagatatg gaagcctgtc 1320ctcctgctga tgctttcatt ggatttggag gaaatgtgat
caggcaacaa gtcaaggata 1380acgccaaatg gtatatcact gattttgtag agctgctggg
agaactggaa gaataacatc 1440cattgtcgta cagctccaaa caacttcaga tgaattttta
caagttatac agattgatac 1500tgtttgctta cagttgccta ttacaacttg ctatagaaag
ttggtacaaa tgatctgtac 1560tttaaactac agttaggaat cctagaagat tgcttttttt
ttttttttaa ctgtagttcc 1620agtattatat gatgactatt gatttcctgg agaggttttt
tttttttttg agacagaatc 1680ttgctctgtt gcccaggctg gagtgcagtg gcgcggtctc
ggctcactgc aagctctgcc 1740tcccaggttc acgccattct cctgcctcag cctcccgagt
agctgggact acaggcaccc 1800gccaccacat ccggctaatt ttttgtattt ttagtagaga
cggggtttga ccgtgttagc 1860caggatggtc ttgatctcct gaccttgtga tccgcctgcc
tcagcctccc aaagtgctgg 1920gattacaggc ttgggccacc gcgcccagcc aatgtcctag
agagttttgt gatctgaatt 1980ctttatgtat atttgtagct atatttcata caaagtgctt
taagtgtgga gagtcaatta 2040aacaccttta ctcttagaaa tacggattcg gcagccttca
gtgaatattg gtttctcttt 2100ggtatgtcaa taaaagttta tccgtatgtc a
2131
41 2178 DNA Homo sapiens
41 gttcccggca ttccgtgctc
cttggttccg gcgttggagc tctttggggc ccagctttgc 60ggacccggga gctcgggacg
caggcggggc ttgtgctccg cgggggcagg gcgtagggtg 120ggcctcctac ctcccctgat
ctcgcggttt gttccgtttc attggagctt cccggaccgt 180gtgctcgacg gtgccctagg
tgccgtgggg ccacacgcga gtctgataag caccctcccc 240cggaatcatg cggtgctgtg
aggcctagcg aagatgaaga tagaatgcaa ggtagaaagt 300gctggatacc tttagaaagc
tgcaggactg gtgcgatggg agttgagacg taagaacctg 360cccgtccgta gggctctgga
tgctgctgag gcccgaggcc cctatggcag atttgaaaat 420tcacccttgt agagtcattc
ctgcctttga gcggactccc ttttaagttt acagaagcac 480ttgcagaact catcagaagc
caccccgctt atcagcagat ctcaagagag cgttcggtgg 540aggccctggg tctgcacagc
tcacctccct gggaactgct cgcccgagcg tcggagccgg 600cgctggcccc ctgcagccgg
aaggttgcag ccgcaggagc cccggaggcc caggacacag 660ggctcttgct cttgcagaat
ccacaggtct ttcttgagga aatctgtaga cagaactttg 720tgctgcgttt ttatctaggg
aaggaacaga agagtgtcgt ctcctagaaa tctagcactg 780gagaaacgag gaaaattctt
ccagcgatgg tctcccactc agagctgagg aagcttttct 840actcagcaga tgctgtgtgt
tttgatgttg acagcacggt catcagagaa gaaggaatcg 900atgagctagc caaaatctgt
ggcgttgagg acgcggtgtc agaaatgaca cggcgagcca 960tgggcggggc agtgcctttc
aaagctgctc tcacagagcg cttagccctc atccagccct 1020ccagggagca ggtgcagaga
ctcatagcag agcaaccccc acacctgacc cccggcataa 1080gggagctggt aagtcgccta
caggagcgaa atgttcaggt tttcctaata tctggtggct 1140ttaggagtat tgtagagcat
gttgcttcaa agctcaatat cccagcaacc aatgtatttg 1200ccaataggct gaaattctac
tttaacggtg aatatgcagg ttttgatgag acgcagccaa 1260cagctgaatc tggtggaaaa
ggaaaagtga ttaaactttt aaaggaaaaa tttcatttta 1320agaaaataat catgattgga
gatggtgcca cagatatgga agcctgtcct cctgctgatg 1380ctttcattgg atttggagga
aatgtgatca ggcaacaagt caaggataac gccaaatggt 1440atatcactga ttttgtagag
ctgctgggag aactggaaga ataacatcca ttgtcgtaca 1500gctccaaaca acttcagatg
aatttttaca agttatacag attgatactg tttgcttaca 1560gttgcctatt acaacttgct
atagaaagtt ggtacaaatg atctgtactt taaactacag 1620ttaggaatcc tagaagattg
cttttttttt ttttttaact gtagttccag tattatatga 1680tgactattga tttcctggag
aggttttttt tttttttgag acagaatctt gctctgttgc 1740ccaggctgga gtgcagtggc
gcggtctcgg ctcactgcaa gctctgcctc ccaggttcac 1800gccattctcc tgcctcagcc
tcccgagtag ctgggactac aggcacccgc caccacatcc 1860ggctaatttt ttgtattttt
agtagagacg gggtttgacc gtgttagcca ggatggtctt 1920gatctcctga ccttgtgatc
cgcctgcctc agcctcccaa agtgctggga ttacaggctt 1980gggccaccgc gcccagccaa
tgtcctagag agttttgtga tctgaattct ttatgtatat 2040ttgtagctat atttcataca
aagtgcttta agtgtggaga gtcaattaaa cacctttact 2100cttagaaata cggattcggc
agccttcagt gaatattggt ttctctttgg tatgtcaata 2160aaagtttatc cgtatgtc
2178
42 1442 DNA Homo sapiens
42 ctcctacctc ccctgatctc gcggtttgtt ccgtttcatt ggagcttccc ggaccgtgtg
60ctcgacggtg ccctaggtgc cgtggggcca cacgcgagtc tgataagcac cctcccccgg
120aatcatgcgg tgctgtgagg cctagcgaag atgaagatag aatgcaaggt agaaagtgct
180ggataccttt agaaagctgc aggactggtg cgatgggagt tgagacgtaa gaacctgccc
240gtccgtaggg ctctggatgc tgctgaggcc cgaggcccct atggcagatt tgaaaattca
300cccttgtaga gtcattcctg cctttgagcg gactcccttt taaggaggaa aattcttcca
360gcgatggtct cccactcaga gctgaggaag cttttctact cagcagatgc tgtgtgtttt
420gatgttgaca gcacggtcat cagagaagaa ggaatcgatg agctagccaa aatctgtggc
480gttgaggacg cggtgtcaga aatgacacgg cgagccatgg gcggggcagt gcctttcaaa
540gctgctctca cagagcgctt agccctcatc cagccctcca gggagcaggt gcagagactc
600atagcagagc aacccccaca cctgaccccc ggcataaggg agctggtaag tcgcctacag
660gagcgaaatg ttcaggtttt cctaatatct ggtggcttta ggagtattgt agagcatgtt
720gcttcaaagc tcaatatccc agcaaccaat gtatttgcca ataggctgaa attctacttt
780aacggtgaat atgcaggttt tgatgagacg cagccaacag ctgaatctgg tggaaaagga
840aaagtgatta aacttttaaa ggaaaaattt cattttaaga aaataatcat gattggagat
900ggtgccacag atatggaagc ctgtcctcct gctgatgctt tcattggatt tggaggaaat
960gtgatcaggc aacaagtcaa ggataacgcc aaatggtata tcactgattt tgtagagctg
1020ctgggagaac tggaagaata acatccattg tcgtacagct ccaaacaact tcagatgaat
1080ttttacaagt tatacagatt gatactgttt gcttacagtt gcctattaca acttgctata
1140gaaagttggt acaaatgatc tgtactttaa actacagtta ggaatcctag aagattgctt
1200tttttttttt tttaactgta gttccagtat tatatgatga ctattgattt cctggagagt
1260tttgtgatct gaattcttta tgtatatttg tagctatatt tcatacaaag tgctttaagt
1320gtggagagtc aattaaacac ctttactctt agaaatacgg attcggcagc cttcagtgaa
1380tattggtttc tctttggtat gtcaataaaa gtttatccgt atgtcagaac ggatttgtgg
1440aa
1442
43 1734 DNA Homo sapiens
43 aggatggatt agagacctct attttgaggc gcactgatgt
aggggctgag gaaggacatt 60gagggcacct tcaggtctct ctgcctattc ttccttgccc
caactccatt ccaggtgtac 120atcagatcca tcaggtccga gctgtgttga ctaccactgc
ttttcccttc gtctcagtta 180tgtcttggaa gaaggctttg cggatccccg gagaccttcg
ggtagcaact gtcaccttga 240tgctggcgat gctgagctcc ctactggctg agggcagaga
ctctcccgag gatttcgtgt 300tccagtttaa gggcatgtgc tacttcacca acgggacgga
gcgcgtgcgt cttgtgacca 360gatacatcta taaccgagag gagtacgcgc gcttcgacag
cgacgtgggg gtgtaccgcg 420cggtgacgcc gcaggggcgg cctgatgccg agtactggaa
cagccagaag gaagtcctgg 480aggggacccg ggcggagttg gacacggtgt gcagacacaa
ctacgaggtg gcgttccgcg 540ggatcttgca gaggagagtg gagcccacag tgaccatctc
cccatccagg acagaggccc 600tcaaccacca caacctgctg gtctgctcgg tgacagattt
ctatccaggc cagatcaaag 660tccggtggtt tcggaatgat caggaggaga cagccggcgt
tgtgtccacc ccccttatta 720ggaatggtga ctggactttc cagatcctgg tgatgctgga
aatgactccc cagcgtggag 780atgtctacac ctgccacgtg gagcacccca gcctccagag
ccccatcacc gtggagtggc 840gggctcagtc tgaatctgcc cagagcaaga tgctgagtgg
cgttggaggc ttcgtgctgg 900ggctgatctt ccttgggctg ggccttatca tccgtcaaag
gagtcagaaa gggcttctgc 960actgactcct gagactattt taactaggat tggttatcac
tcttctgtga tgcctgctta 1020tgcctgccca gaattcccag ctgcctgtgt cagcttgtcc
ccctgagatc aaagtcctac 1080agtggctgtc acgcagccac caggtcatct cctttcatcc
ccaccccaag gcgctggctg 1140tgactctgct tcctgcactg acccagagcc tctgcctgtg
catggccagc tgcgtctact 1200caggtcccaa ggggtttctg tttctattct ttcctcagac
tgctcaagag aagcacatga 1260aaaacattac ctgactttag agctttttta cataattaaa
catgatcctg agttatctgt 1320attctgaact ttcttaattg agaagaggca ggaaatcact
gcagaatgaa ggaacatccc 1380ttgaggtgac ccagcaaacc tgtggccaga aggaggattg
taccttgaaa agacactgaa 1440agcattttgg ggtgtgaagt aagggtgggc agaggaggta
gaaaataatt caattgtcgc 1500atcattcatg gttctttaat actgatgctc agtgcattgg
ccttagaata tcccagcctc 1560tcttctggtt tggtgagtgc tgtgtaaata agcatggtag
aattgtttgg agacatatat 1620agtgatcctt ggtcactggt gtttcaaaca ttctggaaag
tcacatcgat caagaatatt 1680ttttattttt aagaaagcat aaccagcaat aaaaatacta
tttttgagtc taaa 1734
44 1642 DNA Homo sapiens
44 acaggttttt attctttctg
ccaggtacat cagatccatc aggtccgagc tgtgttgact 60accactgctt ttcccttcgt
ctcagttatg tcttggaaga aggctttgcg gatccccgga 120gaccttcggg tagcaactgt
caccttgatg ctggcgatgc tgagctccct actggctgag 180ggcagagact ctcccgagga
tttcgtgttc cagtttaagg gcatgtgcta cttcaccaac 240gggacggagc gcgtgcgtct
tgtgaccaga tacatctata accgagagga gtacgcgcgc 300ttcgacagcg acgtgggggt
gtaccgcgcg gtgacgccgc aggggcggcc tgatgccgag 360tactggaaca gccagaagga
agtcctggag gggacccggg cggagttgga cacggtgtgc 420agacacaact acgaggtggc
gttccgcggg atcttgcaga ggagagtgga gcccacagtg 480accatctccc catccaggac
agaggccctc aaccaccaca acctgctggt ctgctcggtg 540acagatttct atccaggcca
gatcaaagtc cggtggtttc ggaatgatca ggaggagaca 600gccggcgttg tgtccacccc
ccttattagg aatggtgact ggactttcca gatcctggtg 660atgctggaaa tgactcccca
gcgtggagat gtctacacct gccacgtgga gcaccccagc 720ctccagagcc ccatcaccgt
ggagtggcgg gctcagtctg aatctgccca gagcaagatg 780ctgagtggcg ttggaggctt
cgtgctgggg ctgatcttcc ttgggctggg ccttatcatc 840cgtcaaagga gtcagaaagg
gcttctgcac tgactcctga gactatttta actaggattg 900gttatcactc ttctgtgatg
cctgcttatg cctgcccaga attcccagct gcctgtgtca 960gcttgtcccc ctgagatcaa
agtcctacag tggctgtcac gcagccacca ggtcatctcc 1020tttcatcccc accccaaggc
gctggctgtg actctgcttc ctgcactgac ccagagcctc 1080tgcctgtgca tggccagctg
cgtctactca ggtcccaagg ggtttctgtt tctattcttt 1140cctcagactg ctcaagagaa
gcacatgaaa aacattacct gactttagag cttttttaca 1200taattaaaca tgatcctgag
ttatctgtat tctgaacttt cttaattgag aagaggcagg 1260aaatcactgc agaatgaagg
aacatccctt gaggtgaccc agcaaacctg tggccagaag 1320gaggattgta ccttgaaaag
acactgaaag cattttgggg tgtgaagtaa gggtgggcag 1380aggaggtaga aaataattca
attgtcgcat cattcatggt tctttaatac tgatgctcag 1440tgcattggcc ttagaatatc
ccagcctctc ttctggtttg gtgagtgctg tgtaaataag 1500catggtagaa ttgtttggag
acatatatag tgatccttgg tcactggtgt ttcaaacatt 1560ctggaaagtc acatcgatca
agaatatttt ttatttttaa gaaagcataa ccagcaataa 1620aaatactatt tttgagtcta
aa 1642