Patent application title: Gene Expression Profiling in Biopsied Tumor Tissues
Inventors:
Joffre B. Baker (Montara, CA, US)
Maureen T. Cronin (Los Altos, CA, US)
Michael C. Kiefer (Clayton, CA, US)
Steve Shak (Hillsborough, CA, US)
Michael Graham Walker (Sunnyvale, CA, US)
IPC8 Class: AC12Q168FI
USPC Class:
435 6
Class name: Chemistry: molecular biology and microbiology measuring or testing process involving enzymes or micro-organisms; composition or test strip therefore; processes of forming such composition or test strip involving nucleic acid
Publication date: 2010-08-19
Patent application number: 20100209920
Claims:
1.-45. (canceled)
46. A method comprising: assaying a level of a RNA transcript of CEGP1 in a tissue sample obtained from a primary ductal or lobular breast tumor of a human patient; normalizing said level against a level of at least one reference RNA transcript in said tissue sample to provide a normalized CEGP1 expression level; and predicting the likelihood of long-term survival of said patient without recurrence of breast cancer by comparing said normalized CEGP1 expression level to CEGP1 expression data obtained from reference breast cancer samples, wherein an increased normalized CEGP1 expression level is positively correlated with an increased likelihood of long-term survival without breast cancer recurrence in said patients.
47. The method of claim 46 further comprising assaying a level of a RNA transcript of one or more genes selected from the group consisting of: STK15, Ki-67, PR, GSTM3, ESR1, HNF3A, BIRC5, BAG1, BCL2, CCNB1, and GSTM1 in said tissue sample; normalizing the level of the RNA transcript of the one or more genes against a level of at least one reference RNA transcript in said tissue sample to provide a normalized level of said one or more genes; and comparing said normalized level of said one or more genes to gene expression data from said one or more genes obtained from reference breast cancer samples, wherein increased expression of one or more of BIRC5, CCNB1, STK15 and Ki-67, negatively correlates with an increased likelihood of long-term survival without breast cancer recurrence, and increased expression of one or more of BAG1, BCL2, PR, GSTM1, GSTM3, ESR1 and HNF3A positively correlates with an increased likelihood of long-term survival without breast cancer recurrence.
48. The method of claim 46 wherein the breast tumor is an invasive breast tumor, and said method further comprises assaying a level of a RNA transcript of one or more genes selected from the group consisting of: FOXM1, PRAME, BCL2, STK15, Ki-67, PR, BBC3, NME1, BIRC5, GATA3, TFRC, YB-1, DPYD, CA9, Contig51037, RPS6K1 and Her2 in said tissue sample.
49. The method of claim 46 wherein said breast tumor is estrogen receptor (ER) positive breast tumor.
50. The method of claim 49 further comprising assaying a level of a RNA transcript of one or more genes selected from the group consisting of: PRAME, BCL2, FOXM1, DIABLO, EPHX1, HIF1A, VEGFC, Ki-67, IGF1R, VDR, NME1, GSTM3, Contig51037, CDC25B, CTSB, p27, CDH1, and IGFBP3 in said tissue sample.
51. The method of claim 47 wherein the levels of 2 or more RNA transcripts are assayed.
52. The method of claim 46, wherein said tissue sample is a fixed, wax-embedded breast cancer tissue specimen of said patient.
53. The method of claim 46, wherein said tissue sample is from a fine needle biopsy.
54. The method of claim 46, further comprising creating a report based upon the normalized CEGP1 expression level.
55. The method of claim 54, wherein said report includes a prediction of the likelihood of long term survival of said patient without the recurrence of breast cancer.
56. The method of claim 55, wherein said report comprises information concerning a recommendation for a treatment modality of said patient.
57. The method of claim 46, wherein said gene expression data is produced using a multivariate analysis using the Cox Proportional Hazards model.
58. The method of claim 46 wherein said assaying is done by reverse transcriptase polymerase chain reaction (RT-PCR).
59. The method of claim 46, wherein said assaying is done after a primary ductal carcinoma has been surgically removed from a breast of said patient.
60. The method of claim 59, wherein said primary ductal carcinoma is an invasive ductal carcinoma.
61. The method of claim 46, wherein said assaying is done after a primary lobular carcinoma has been surgically removed from a breast of said patient.
62. The method of claim 61, wherein said primary lobular carcinoma is an invasive lobular carcinoma.
Description:
CROSS-REFERENCE
[0001]This application claims the benefit under 35 U.S.C. 119(e) of provisional application Ser. Nos. 60/412,049, filed Sep. 18, 2002 and 60/364,890, filed Mar. 13, 2002, the entire disclosures of which are hereby incorporated by reference.
BACKGROUND OF THE INVENTION
Field of the Invention
[0002]The present invention relates to gene expression profiling in biopsied tumor tissues. In particular, the present invention concerns sensitive methods to measure mRNA levels in biopsied tumor tissues, including archived paraffin-embedded biopsy material. In addition, the invention provides a set of genes the expression of which is important in the diagnosis and treatment of breast cancer.
[0003]Oncologists have a number of treatment options available to them, including different combinations of chemotherapeutic drugs that are characterized as "standard of care," and a number of drugs that do not carry a label claim for a particular cancer, but for which there is evidence of efficacy in that cancer. Best likelihood of good treatment outcome requires that patients be assigned to optimal available cancer treatment, and that this assignment be made as quickly as possible following diagnosis.
[0004]Currently, diagnostic tests used in clinical practice are single analyte, and therefore do not capture the potential value of knowing relationships between dozens of different markers. Moreover, diagnostic tests are frequently not quantitative, relying on immunohistochemistry. This method often yields different results in different laboratories, in part because the reagents are not standardized, and in part because the interpretations are subjective and cannot be easily quantified. RNA-based tests have not often been used because of the problem of RNA degradation over time and the fact that it is difficult to obtain fresh tissue samples from patients for analysis. Fixed paraffin-embedded tissue is more readily available and methods have been established to detect RNA in fixed tissue. However, these methods typically do not allow for the study of large numbers of genes (DNA or RNA) from small amounts of material. Thus, traditionally fixed tissue has been rarely used other than for immunohistochemistry detection of proteins.
[0005]Recently, several groups have published studies concerning the classification of various cancer types by microarray gene expression analysis (see, e.g. Golub et al., Science 286:531-537 (1999); Bhattacharjee et al., Proc. Natl. Acad. Sci. USA 98:13790-13795 (2001); Chen-Hsiang et al., Bioinformatics 17 (Suppl. 1):S316-S322 (2001); Ramaswamy et al., Proc. Natl. Acad. Sci. USA 98:15149-15154 (2001)). Certain classifications of human breast cancers based on gene expression patterns have also been reported (Martin et al., Cancer Res. 60:2232-2238 (2000); West et al., Proc. Natl. Acad. Sci. USA 98:11462-11467 (2001); Sorlie et al., Proc. Natl. Acad. Sci. USA 98:10869-10874 (2001); Yan et al., Cancer Res. 61:8375-8380 (2001)). However, these studies mostly focus on improving and refining the already established classification of various types of cancer, including breast cancer, and generally do not provide new insights into the relationships of the differentially expressed genes, and do not link the findings to treatment strategies in order to improve the clinical outcome of cancer therapy.
[0006]Although modern molecular biology and biochemistry have revealed more than 100 genes whose activities influence the behavior of tumor cells, the state of their differentiation, and their sensitivity or resistance to certain therapeutic drugs, with a few exceptions, the status of these genes has not been exploited for the purpose of routinely making clinical decisions about drug treatments. One notable exception is the use of estrogen receptor (ER) protein expression in breast carcinomas to select patients for treatment with anti-estrogen drugs, such as tamoxifen. Another exceptional example is the use of ErbB2 (Her2) protein expression in breast carcinomas to select patients for treatment with the Her2 antagonist drug Herceptin® (Genentech, Inc., South San Francisco, Calif.).
[0007]Despite recent advances, the challenge of cancer treatment remains to target specific treatment regimens to pathogenically distinct tumor types, and ultimately personalize tumor treatment in order to maximize outcome. Hence, a need exists for tests that simultaneously provide predictive information about patient responses to the variety of treatment options. This is particularly true for breast cancer, the biology of which is poorly understood. It is clear that the classification of breast cancer into a few subgroups, such as the ErbB2+ subgroup, and subgroups characterized by low to absent gene expression of the estrogen receptor (ER) and a few additional transcriptional factors (Perou et al. Nature 406:747-752 (2000)) does not reflect the cellular and molecular heterogeneity of breast cancer, and does not allow the design of treatment strategies maximizing patient response.
SUMMARY OF THE INVENTION
[0008]The present invention provides (1) sensitive methods to measure mRNA levels in biopsied tumor tissue, (2) a set of approximately 190 genes, the expression of which is important in the diagnosis of breast cancer, and (3) the significance of abnormally low or high expression for the genes identified and included in the gene set, through activation or disruption of biochemical regulatory pathways that influence patient response to particular drugs used or potentially useful in the treatment of breast cancer. These results permit assessment of genomic evidence of the efficacy of more than a dozen relevant drugs.
[0009]The present invention accommodates the use of archived paraffin-embedded biopsy material for assay of all markers in the set, and therefore is compatible with the most widely available type of biopsy material. The invention presents an efficient method for extraction of RNA from wax-embedded, fixed tissues, which reduces the cost of a mass-production process for acquiring this information without sacrificing quality of the analysis. In addition, the invention describes a novel, highly effective method for amplifying mRNA copy number, which permits increased assay sensitivity and the ability to monitor expression of large numbers of different genes given the limited amounts of biopsy material. The invention also captures the predictive significance of relationships between expressions of certain markers in the breast cancer marker set. Finally, for each member of the gene set, the invention specifies the oligonucleotide sequences to be used in the test.
[0010]In one aspect, the invention concerns a method for predicting clinical outcome for a patient diagnosed with cancer, comprising
[0011]determining the expression level of one or more genes, or their expression products, selected from the group consisting of p53BP2, cathepsin B, cathepsin L, Ki67/MiB1, and thymidine kinase in a cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference cancer tissue set,
[0012]wherein a poor outcome is predicted if:
[0013](a) the expression level of p53BP2 is in the lower 10th percentile; or
[0014](b) the expression level of either cathepsin B or cathepsin L is in the upper 10th percentile; or
[0015](c) the expression level of either Ki67/MiB1 or thymidine kinase is in the upper 10th percentile.
[0016]Poor clinical outcome can be measured, for example, in terms of shortened survival or increased risk of cancer recurrence, e.g. following surgical removal of the cancer.
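The percentile criteria (a) to (c) above can be illustrated with a minimal sketch. It is not the applicants' implementation: the function names, the dictionary layout, and the percentile convention (percent of reference values at or below the patient's value, with "upper 10th percentile" read as a rank above 90 and "lower 10th percentile" as a rank of 10 or less) are all assumptions made for illustration.

```python
from bisect import bisect_right


def percentile_rank(value, reference):
    """Percent of reference values that are <= value (0 to 100)."""
    ordered = sorted(reference)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


def poor_outcome_flags(patient, reference_set):
    """Return which of criteria (a)-(c) a patient's normalized levels meet.

    patient: dict mapping gene name -> normalized expression level
    reference_set: dict mapping gene name -> list of normalized levels
                   measured in the reference cancer tissue set
    Both structures are hypothetical; the source does not define a data format.
    """
    rank = {g: percentile_rank(v, reference_set[g]) for g, v in patient.items()}
    flags = []
    # (a) p53BP2 in the lower 10th percentile
    if rank.get("p53BP2", 100.0) <= 10.0:
        flags.append("a")
    # (b) cathepsin B or cathepsin L in the upper 10th percentile
    if any(rank.get(g, 0.0) > 90.0 for g in ("cathepsin B", "cathepsin L")):
        flags.append("b")
    # (c) Ki67/MiB1 or thymidine kinase in the upper 10th percentile
    if any(rank.get(g, 0.0) > 90.0 for g in ("Ki67/MiB1", "thymidine kinase")):
        flags.append("c")
    return flags
```

Under this reading, a poor outcome would be predicted whenever the returned list is non-empty, since the criteria are joined by "or".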
[0017]In another embodiment, the invention concerns a method of predicting the likelihood of the recurrence of cancer, following treatment, in a cancer patient, comprising determining the expression level of p27, or its expression product, in a cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference cancer tissue set, wherein an expression level in the upper 10th percentile indicates decreased risk of recurrence following treatment.
[0018]In another aspect, the invention concerns a method for classifying cancer comprising, determining the expression level of two or more genes selected from the group consisting of Bcl2, hepatocyte nuclear factor 3, ER, ErbB2, and Grb7, or their expression products, in a cancer tissue, normalized against a control gene or genes, and compared to the amount found in a reference cancer tissue set, wherein (i) tumors expressing at least one of Bcl2, hepatocyte nuclear factor 3, and ER, or their expression products, above the mean expression level in the reference tissue set are classified as having a good prognosis for disease free and overall patient survival following treatment; and (ii) tumors expressing elevated levels of ErbB2 and Grb7, or their expression products, at levels ten-fold or more above the mean expression level in the reference tissue set are classified as having poor prognosis of disease free and overall patient survival following treatment.
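The two classification rules in the preceding paragraph can be sketched as follows. This is an illustrative reading only: it assumes normalized levels are on a linear scale (so that "ten-fold above the mean" is a simple ratio), it reports the two rules as independent flags because the text does not say how a tumor meeting both would be handled, and all names are hypothetical.

```python
from statistics import mean

GOOD_MARKERS = ("Bcl2", "hepatocyte nuclear factor 3", "ER")
POOR_MARKERS = ("ErbB2", "Grb7")


def classify_tumor(tumor, reference_set, fold=10.0):
    """Apply rules (i) and (ii) to one tumor.

    tumor: dict gene -> normalized expression level (linear scale assumed)
    reference_set: dict gene -> list of normalized levels in reference tissues
    """
    ref_mean = {g: mean(levels) for g, levels in reference_set.items()}
    # (i) at least one good-prognosis marker above the reference mean
    good = any(tumor[g] > ref_mean[g] for g in GOOD_MARKERS if g in tumor)
    # (ii) both ErbB2 and Grb7 at least `fold` times the reference mean
    poor = all(g in tumor and tumor[g] >= fold * ref_mean[g] for g in POOR_MARKERS)
    return {"good_prognosis": good, "poor_prognosis": poor}
```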
[0019]All types of cancer are included, such as, for example, breast cancer, colon cancer, lung cancer, prostate cancer, hepatocellular cancer, gastric cancer, pancreatic cancer, cervical cancer, ovarian cancer, liver cancer, bladder cancer, cancer of the urinary tract, thyroid cancer, renal cancer, carcinoma, melanoma, and brain cancer. The foregoing methods are particularly suitable for prognosis/classification of breast cancer.
[0020]In all previous aspects, in a specific embodiment, the expression level is determined using RNA obtained from a formalin-fixed, paraffin-embedded tissue sample. While all techniques of gene expression profiling, as well as proteomics techniques, are suitable for use in performing the foregoing aspects of the invention, the gene expression levels are often determined by reverse transcription polymerase chain reaction (RT-PCR).
[0021]If the source of the tissue is a formalin-fixed, paraffin embedded tissue sample, the RNA is often fragmented.
[0022]The expression data can be further subjected to multivariate analysis, for example using the Cox Proportional Hazards model.
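For reference, the Cox Proportional Hazards model in its standard textbook form is given below. Treating the covariates x_1, ..., x_p as normalized expression levels of the assayed genes is an assumption made here for illustration; the text above does not specify the covariates.

```latex
h(t \mid x_1, \dots, x_p) = h_0(t)\, \exp\!\left( \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p \right)
```

Here h_0(t) is the unspecified baseline hazard and each coefficient \beta_j is estimated from the data. The quantity \exp(\beta_j) is the hazard ratio associated with a one-unit increase in x_j with the other covariates held fixed, so a negative \beta_j would correspond to a marker whose higher expression is associated with lower risk.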
[0023]In a further aspect, the invention concerns a method for the preparation of nucleic acid from a fixed, wax-embedded tissue specimen, comprising:
[0024](a) incubating a section of the fixed, wax-embedded tissue specimen at a temperature of about 56° C. to 70° C. in a lysis buffer, in the presence of a protease, without prior dewaxing, to form a lysis solution;
[0025](b) cooling the lysis solution to a temperature where the wax solidifies; and
[0026](c) isolating the nucleic acid from the lysis solution.
[0027]The lysis buffer may comprise urea, such as 4M urea. In a particular embodiment, incubation in step (a) of the foregoing method is performed at about 65° C.
[0028]In another particular embodiment, the protease used in the foregoing method is proteinase K.
[0029]In another embodiment, the cooling in step (b) is performed at room temperature.
[0030]In a further embodiment, the nucleic acid is isolated after protein removal with 2.5 M NH4OAc.
[0031]The nucleic acid can, for example, be total nucleic acid present in the fixed, wax-embedded tissue specimen.
[0032]In yet another embodiment, the total nucleic acid is isolated by precipitation from the lysis solution, following protein removal, with 2.5 M NH4OAc. The precipitation may, for example, be performed with isopropanol.
[0033]The method described above may further comprise the step of removing DNA from the total nucleic acid, for example by DNAse treatment.
[0034]The tissue specimen may, for example, be obtained from a tumor, and the RNA may be obtained from a microdissected portion of the tissue specimen enriched for tumor cells.
[0035]All types of tumor are included, such as, without limitation, breast cancer, colon cancer, lung cancer, prostate cancer, hepatocellular cancer, gastric cancer, pancreatic cancer, cervical cancer, ovarian cancer, liver cancer, bladder cancer, cancer of the urinary tract, thyroid cancer, renal cancer, carcinoma, melanoma, and brain cancer, in particular breast cancer.
[0036]The method described above may further comprise the step of subjecting the RNA to gene expression profiling. Thus, the gene expression profile may be completed for a set of genes comprising at least two of the genes listed in Table 1.
[0037]Although all methods of gene expression profiling are contemplated, in a particular embodiment, gene expression profiling is performed by RT-PCR which may be preceded by an amplification step.
[0038]In another aspect, the invention concerns a method for preparing fragmented RNA for gene expression analysis, comprising the steps of:
[0039](a) mixing the RNA with at least one gene-specific, single-stranded DNA scaffold under conditions such that fragments of the RNA complementary to the DNA scaffold hybridize with the DNA scaffold;
[0040](b) extending the hybridized RNA fragments with a DNA polymerase to form a DNA-DNA duplex; and
[0041](c) removing the DNA scaffold from the duplex.
[0042]In a specific embodiment, in step (a) of this method, the RNA may be mixed with a mixture of single-stranded DNA templates specific for each gene of interest.
[0043]The method can further comprise the step of heat-denaturing and reannealing the duplexed DNA to the DNA scaffold, with or without additional overlapping scaffolds, and further extending the duplexed sense strand with DNA polymerase prior to removal of the scaffold in step (c).
[0044]The DNA templates may be, but do not need to be, fully complementary to the gene of interest.
[0045]In a particular embodiment, at least one of the DNA templates is complementary to a specific segment of the gene of interest.
[0046]In another embodiment, the DNA templates include sequences complementary to polymorphic variants of the same gene.
[0047]The DNA template may include one or more dUTP or rNTP sites. In this case, in step (c) the DNA template may be removed by fragmenting the DNA template present in the DNA-DNA duplex formed in step (b) at the dUTP or rNTP sites.
[0048]In an important embodiment, the RNA is extracted from fixed, wax-embedded tissue specimens, and purified sufficiently to act as a substrate in an enzyme assay. The RNA purification may, but does not need to, include an oligo-dT based step.
[0049]In a further aspect, the invention concerns a method for amplifying RNA fragments in a sample comprising fragmented RNA representing at least one gene of interest, comprising the steps of:
[0050](a) contacting the sample with a pool of single-stranded DNA scaffolds comprising an RNA polymerase promoter at the 5' end under conditions such that the RNA fragments complementary to the DNA scaffolds hybridize with the DNA scaffolds;
[0051](b) extending the hybridized RNA fragments with a DNA polymerase along the DNA scaffolds to form DNA-DNA duplexes;
[0052](c) amplifying the gene or genes of interest by in vitro transcription; and
[0053](d) removing the DNA scaffolds from the duplexes.
[0054]An exemplary promoter is the T7 RNA polymerase promoter, while an exemplary DNA polymerase is DNA polymerase I.
[0055]In step (d) the DNA scaffolds may be removed, for example, by treatment with DNase I.
[0056]In a further embodiment, the pool of single-stranded DNA scaffolds comprises partial or complete gene sequences of interest, such as a library of cDNA clones.
[0057]In a specific embodiment, the sample represents a whole genome or a fraction thereof. In a preferred embodiment, the genome is the human genome.
[0058]In another aspect, the invention concerns a method of preparing a personalized genomics profile for a patient, comprising the steps of:
[0059](a) subjecting RNA extracted from a tissue obtained from the patient to gene expression analysis;
[0060](b) determining the expression level in such tissue of at least two genes selected from the gene set listed in Table 1, wherein the expression level is normalized against a control gene or genes, and is compared to the amount found in a cancer tissue reference set; and
[0061](c) creating a report summarizing the data obtained by the gene expression analysis.
[0062]The tissue obtained from the patient may, but does not have to, comprise cancer cells. Just as before, the cancer can, for example, be breast cancer, colon cancer, lung cancer, prostate cancer, hepatocellular cancer, gastric cancer, pancreatic cancer, cervical cancer, ovarian cancer, liver cancer, bladder cancer, cancer of the urinary tract, thyroid cancer, renal cancer, carcinoma, melanoma, or brain cancer, breast cancer being particularly preferred.
[0063]In a particular embodiment, the RNA is obtained from a microdissected portion of breast cancer tissue enriched for cancer cells. The control gene set may, for example, comprise β-actin and ribosomal protein LP0.
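Normalization against a control gene set, as required in step (b), can be sketched for RT-PCR data as follows. The delta-Ct formula used here (relative expression = 2 raised to the negative difference between a gene's threshold cycle and the mean threshold cycle of the control genes) is a common convention and is an assumption; this section does not state the applicants' normalization formula. The gene labels and function name are placeholders.

```python
from statistics import mean


def normalize_ct(ct_values, control_genes):
    """Normalize RT-PCR threshold cycles (Ct) against control genes.

    ct_values: dict gene -> measured Ct (lower Ct means more transcript)
    control_genes: names of the control (reference) genes in ct_values
    Returns dict gene -> 2 ** -(Ct_gene - mean control Ct) for non-control genes,
    which assumes roughly 100% amplification efficiency.
    """
    control_ct = mean(ct_values[g] for g in control_genes)
    return {
        gene: 2.0 ** -(ct - control_ct)
        for gene, ct in ct_values.items()
        if gene not in control_genes
    }
```

For example, with control Ct values of 20.0 and 22.0 (mean 21.0), a test gene measured at Ct 23.0 would receive a normalized level of 0.25, and one measured at Ct 21.0 a level of 1.0.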
[0064]The report prepared for the use of the patient or the patient's physician, may include the identification of at least one drug potentially beneficial in the treatment of the patient.
[0065]Step (b) of the foregoing method may comprise the step of determining the expression level of a gene specifically influencing cellular sensitivity to a drug, where the gene can, for example, be selected from the group consisting of aldehyde dehydrogenase 1A1, aldehyde dehydrogenase 1A3, amphiregulin, ARG, BRK, BCRP, CD9, CD31, CD82/KAI-1, COX2, c-abl, c-kit, c-kit L, CYP1B1, CYP2C9, DHFR, dihydropyrimidine dehydrogenase, EGF, epiregulin, ER-alpha, ErbB-1, ErbB-2, ErbB-3, ErbB-4, ER-beta, farnesyl pyrophosphate synthetase, gamma-GCS (glutamyl cysteine synthetase), GATA3, geranyl geranyl pyrophosphate synthetase, Grb7, GST-alpha, GST-pi, HB-EGF, hsp 27, human chorionic gonadotropin/CGA, IGF-1, IGF-2, IGF1R, KDR, LIV1, Lung Resistance Protein/MVP, Lot1, MDR-1, microsomal epoxide hydrolase, MMP9, MRP1, MRP2, MRP3, MRP4, PAI1, PDGF-A, PDGF-B, PDGF-C, PDGF-D, PDGFR-alpha, PDGFR-beta, PLAG1 (pleiomorphic adenoma 1), PREP prolyl endopeptidase, progesterone receptor, pS2/trefoil factor 1, PTEN, PTB1b, RAR-alpha, RAR-beta2, Reduced Folate Carrier, SXR, TGF-alpha, thymidine phosphorylase, thymidylate synthase, topoisomerase II-alpha, topoisomerase II-beta, VEGF, XIST, and YB-1.
[0066]In another embodiment, step (b) of the foregoing process includes determining the expression level of multidrug resistance factors, such as, for example, gamma-glutamyl-cysteine synthetase (GCS), GST-α, GST-π, MDR-1, MRP1-4, breast cancer resistance protein (BCRP), lung cancer resistance protein (MVP), SXR, or YB-1.
[0067]In another embodiment, step (b) of the foregoing process comprises determination of the expression level of eukaryotic translation initiation factor 4E (EIF4E).
[0068]In yet another embodiment, step (b) of the foregoing process comprises determination of the expression level of a DNA repair enzyme.
[0069]In a further embodiment, step (b) of the foregoing process comprises determination of the expression level of a cell cycle regulator, such as, for example, c-MYC, c-Src, Cyclin D1, Ha-Ras, mdm2, p14ARF, p21WAF1/CI, p16INK4a/p14, p23, p27, p53, PI3K, PKC-epsilon, or PKC-delta.
[0070]In a still further embodiment, step (b) of the foregoing process comprises determination of the expression level of a tumor suppressor or a related protein, such as, for example, APC or E-cadherin.
[0071]In another embodiment, step (b) of the foregoing method comprises determination of the expression level of a gene regulating apoptosis, such as, for example, p53, Bcl2, Bcl-xL, Bak, Bax, and related factors, NFκB, CIAP1, CIAP2, survivin, and related factors, p53BP1/ASPP1, or p53BP2/ASPP2.
[0072]In yet another embodiment, step (b) of the foregoing process comprises determination of the expression level of a factor that controls cell invasion or angiogenesis, such as, for example, uPA, PAI1, cathepsin B, C, and L, scatter factor (HGF), c-met, KDR, VEGF, or CD31.
[0073]In a different embodiment, step (b) of the foregoing method comprises determination of the expression level of a marker for immune or inflammatory cells or processes, such as, for example, Ig light chain λ, CD18, CD3, CD68, Fas (CD95), or Fas Ligand.
[0074]In a further embodiment, step (b) of the foregoing process comprises determination of the expression level of a cell proliferation marker, such as, for example, Ki67/MiB1, PCNA, Pin1, or thymidine kinase.
[0075]In a still further embodiment, step (b) of the foregoing process comprises determination of the expression level of a growth factor or growth factor receptor, such as, for example, IGF1, IGF2, IGFBP3, IGF1R, FGF2, CSF-1, CSF-1R/fms, SCF-1, IL6 or IL8.
[0076]In another embodiment, step (b) of the foregoing process comprises determination of the expression level of a gene marker that defines a subclass of breast cancer, where the gene marker can, for example, be GRO1 oncogene alpha, Grb7, cytokeratins 5 and 17, retinol binding protein 4, hepatocyte nuclear factor 3, integrin subunit alpha 7, or lipoprotein lipase.
[0077]In a still further aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to 5-fluorouracil (5-FU) or an analog thereof, comprising the steps of:
[0078](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis;
[0079](b) determining the expression level in the tissue of thymidylate synthase mRNA, wherein the expression level is normalized against a control gene or genes, and is compared to the amount found in a reference breast cancer tissue set; and
[0080](c) predicting patient response based on the normalized thymidylate synthase mRNA level.
[0081]Step (b) of the foregoing method can further comprise determining the expression level of dihydropyrimidine phosphorylase.
[0082]In another embodiment, step (b) of the method can further comprise determining the expression level of thymidine phosphorylase.
[0083]In yet another embodiment, a positive response to 5-FU or an analog thereof is predicted if: (i) normalized thymidylate synthase mRNA level determined in step (b) is at or below the 15th percentile; or (ii) the sum of normalized expression levels of thymidylate synthase and dihydropyrimidine phosphorylase determined in step (b) is at or below the 25th percentile; or (iii) the sum of normalized expression levels of thymidylate synthase, dihydropyrimidine phosphorylase, plus thymidine phosphorylase determined in step (b) is at or below the 20th percentile.
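Criteria (i) to (iii) of the preceding paragraph can be sketched as follows. This is an illustrative reading under stated assumptions: normalized levels are summed on a linear scale, the percentile of a sum is taken against the same sum computed for each sample of the reference breast cancer tissue set, and a percentile is the percent of reference values at or below the patient's value. The function names and data layout are hypothetical.

```python
TS = "thymidylate synthase"
DPD = "dihydropyrimidine phosphorylase"
TP = "thymidine phosphorylase"

# (genes whose normalized levels are summed, percentile cutoff) for (i)-(iii)
CRITERIA = [((TS,), 15.0), ((TS, DPD), 25.0), ((TS, DPD, TP), 20.0)]


def percent_at_or_below(value, reference_values):
    """Percent of reference values that are <= value."""
    return 100.0 * sum(r <= value for r in reference_values) / len(reference_values)


def predicts_5fu_response(patient, reference_samples):
    """Return True if any of criteria (i)-(iii) is met.

    patient: dict gene -> normalized level for the patient's tumor
    reference_samples: list of dicts, one per reference tumor, same keys
    A criterion is skipped when the patient lacks one of its genes.
    """
    for genes, cutoff in CRITERIA:
        if not all(g in patient for g in genes):
            continue
        patient_sum = sum(patient[g] for g in genes)
        ref_sums = [sum(sample[g] for g in genes) for sample in reference_samples]
        if percent_at_or_below(patient_sum, ref_sums) <= cutoff:
            return True
    return False
```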
[0084]In a further embodiment, in step (b) of the foregoing method the expression level of c-myc and wild-type p53 is determined. In this case, a positive response to 5-FU or an analog thereof is predicted, if the normalized expression level of c-myc relative to the normalized expression level of wild-type p53 is in the upper 15th percentile.
[0085]In a still further embodiment, in step (b) of the foregoing method, expression level of NFκB and cIAP2 is determined. In this particular embodiment, resistance to 5-FU or an analog thereof is typically predicted if the normalized expression level of NFκB and cIAP2 is at or above the 10th percentile.
[0086]In another aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to methotrexate or an analog thereof, comprising the steps of:
[0087](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and
[0088](b) predicting decreased patient sensitivity to methotrexate or analog if (i) DHFR levels are more than tenfold higher than the average expression level of DHFR in the control gene set, or (ii) the normalized expression levels of members of the reduced folate carrier (RFC) family are below the 10th percentile.
[0089]In yet another aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to an anthracycline or an analog thereof, comprising the steps of:
[0090](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and
[0091](b) predicting patient resistance or decreased sensitivity to the anthracycline or analog if (i) the normalized expression level of topoisomerase IIα is below the 10th percentile, or (ii) the normalized expression level of topoisomerase IIβ is below the 10th percentile, or (iii) the combined normalized topoisomerase IIα or IIβ expression levels are below the 10th percentile.
[0092]In a different aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to docetaxel, comprising the steps of:
[0093](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and
[0094](b) predicting reduced sensitivity to docetaxel if the normalized expression level of CYP1B1 is in the upper 10th percentile.
[0095]The invention further concerns a method for predicting the response of a patient diagnosed with breast cancer to cyclophosphamide or an analog thereof, comprising
[0096](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and
[0097](b) predicting reduced sensitivity to the cyclophosphamide or analog if the sum of the expression levels of aldehyde dehydrogenase 1A1 and 1A3 is more than tenfold higher than the average of their combined expression levels in the reference tissue set.
[0098]In a further aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to anti-estrogen therapy, comprising
[0099](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set that contains both specimens negative for and positive for estrogen receptor-α (ERα) and progesterone receptor-α (PRα); and
[0100](b) predicting patient response based upon the normalized expression levels of ERα or PRα, and at least one of microsomal epoxide hydrolase, pS2/trefoil factor 1, GATA3 and human chorionic gonadotropin.
[0101]In a specific embodiment, lack of response or decreased responsiveness is predicted if (i) the normalized expression level of microsomal epoxide hydrolase is in the upper 10th percentile; or (ii) the normalized expression level of pS2/trefoil factor 1, or GATA3 or human chorionic gonadotropin is at or below the corresponding average expression level in said breast cancer tissue set, regardless of the expression level of ERα or PRα in the breast cancer tissue obtained from the patient.
[0102]In another aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to a taxane, comprising the steps of:
[0103](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and
[0104](b) predicting reduced sensitivity to taxane if (i) no or minimal XIST expression is detected; or (ii) the normalized expression level of GST-π or prolyl endopeptidase (PREP) is in the upper 10th percentile; or (iii) the normalized expression level of PLAG1 is in the upper 10th percentile.
[0105]The invention also concerns a method for predicting the response of a patient diagnosed with breast cancer to cisplatin or an analog thereof, comprising the steps of:
[0106](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and
[0107](b) predicting resistance or reduced sensitivity if the normalized expression level of ERCC1 is in the upper 10th percentile.
[0108]The invention further concerns a method for predicting the response of a patient diagnosed with breast cancer to an ErbB2 or EGFR antagonist, comprising the steps of:
[0109](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and
[0110](b) predicting patient response based on the normalized expression levels of at least one of Grb7, IGF1R, IGF1 and IGF2.
[0111]In a particular embodiment, a positive response is predicted if the normalized expression level of Grb7 is in the upper 10th percentile, and the expression of IGF1R, IGF1 and IGF2 is not elevated above the 90th percentile.
[0112]In a further particular embodiment, a decreased responsiveness is predicted if the expression level of at least one of IGF1R, IGF1 and IGF2 is elevated.
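The two particular embodiments above can be sketched as a single decision function. The sketch is illustrative and rests on stated assumptions: "upper 10th percentile" is read as a percentile rank at or above 90 within the reference set, "elevated" in the second embodiment is read as above the 90th percentile, and all function names and the dictionary layout are hypothetical.

```python
def percentile_rank(value, reference):
    """Return the percentage of reference values at or below `value`."""
    return 100.0 * sum(1 for r in reference if r <= value) / len(reference)


def predict_antagonist_response(patient, reference, cutoff=90.0):
    """Classify response to an ErbB2 or EGFR antagonist.

    patient:   dict mapping gene name -> normalized expression level
    reference: dict mapping gene name -> list of levels in the reference set
    """
    grb7_high = percentile_rank(patient["Grb7"], reference["Grb7"]) >= cutoff
    # Assumption: "elevated" means above the 90th percentile of the reference set.
    igf_elevated = any(
        percentile_rank(patient[gene], reference[gene]) > cutoff
        for gene in ("IGF1R", "IGF1", "IGF2")
    )
    if igf_elevated:
        return "decreased"
    if grb7_high:
        return "positive"
    return "indeterminate"
```

The "indeterminate" outcome is an addition of this sketch for cases the two embodiments do not address.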
[0113]In another aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to a bis-phosphonate drug, comprising the steps of:
[0114](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and
[0115](b) predicting a positive response if the breast cancer tissue obtained from the patient expresses mutant Ha-Ras and additionally expresses farnesyl pyrophosphate synthetase or geranyl pyrophosphate synthetase at a normalized expression level at or above the 90th percentile.
[0116]In yet another aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to treatment with a cyclooxygenase 2 inhibitor, comprising the steps of:
[0117](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and
[0118](b) predicting a positive response if the normalized expression level of COX2 in the breast cancer tissue obtained from the patient is at or above the 90th percentile.
[0119]The invention further concerns a method for predicting the response of a patient diagnosed with breast cancer to an EGF receptor (EGFR) antagonist, comprising the steps of:
[0120](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and
[0121](b) predicting a positive response to an EGFR antagonist, if (i) the normalized expression level of EGFR is at or above the 10th percentile, and (ii) the normalized expression level of at least one of epiregulin, TGF-α, amphiregulin, ErbB3, BRK, CD9, MMP9, CD82, and Lot1 is above the 90th percentile.
[0122]In another aspect, the invention concerns a method for monitoring the response of a patient diagnosed with breast cancer to treatment with an EGFR antagonist, comprising monitoring the expression level of a gene selected from the group consisting of epiregulin, TGF-α, amphiregulin, ErbB3, BRK, CD9, MMP9, CD82, and Lot1 in the patient during treatment, wherein reduction in the expression level is indicative of positive response to such treatment.
[0123]In yet another aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to a drug targeting a tyrosine kinase selected from the group consisting of abl, c-kit, PDGFR-α, PDGFR-β and ARG, comprising the steps of:
[0124](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set;
[0125](b) determining the normalized expression level of a tyrosine kinase selected from the group consisting of abl, c-kit, PDGFR-α, PDGFR-β and ARG, and the cognate ligand of the tyrosine kinase, and if the normalized expression level of the tyrosine kinase is in the upper 10th percentile,
[0126](c) determining whether the sequence of the tyrosine kinase contains any mutation,
[0127]wherein a positive response is predicted if (i) the normalized expression level of the tyrosine kinase is in the upper 10th percentile, (ii) the sequence of the tyrosine kinase contains an activating mutation, or (iii) the normalized expression level of the tyrosine kinase is normal and the expression level of the ligand is in the upper 10th percentile.
[0128]Another aspect of the invention is a method for predicting the response of a patient diagnosed with breast cancer to treatment with an anti-angiogenic drug, comprising the steps of:
[0129](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and
[0130](b) predicting a positive response if (i) the normalized expression level of VEGF is in the upper 10th percentile and (ii) the normalized expression level of KDR or CD31 is in the upper 20th percentile.
[0131]A further aspect of the invention is a method for predicting the likelihood that a patient diagnosed with breast cancer develops resistance to a drug interacting with the MRP-1 gene coding for the multidrug resistance protein P-glycoprotein, comprising the steps of:
[0132](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis to determine the expression level of PTP1b, wherein the expression level is normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and
[0133](b) concluding that the patient is likely to develop resistance to said drug if the normalized expression level of the MRP-1 gene is above the 90th percentile.
[0134]The invention further relates to a method for predicting the likelihood that a patient diagnosed with breast cancer develops resistance to a chemotherapeutic drug or toxin used in cancer treatment, comprising the steps of:
[0135](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and
[0136](b) determining the normalized expression levels of at least one of the following genes: MDR1, SGTα, GST-π, SXR, BCRP, YB-1, and LRP/MVP, wherein the finding of a normalized expression level in the upper 4th percentile is an indication that the patient is likely to develop resistance to the drug.
[0137]Also included herein is a method for measuring the translational efficiency of VEGF mRNA in a breast cancer tissue sample, comprising determining the expression levels of the VEGF and EIF4E mRNA in the sample, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein a higher normalized EIF4E expression level for the same VEGF expression level is indicative of relatively higher translational efficiency for VEGF.
[0138]In another aspect, the invention provides a method for predicting the response of a patient diagnosed with breast cancer to a VEGF antagonist, comprising determining the expression level of VEGF and EIF4E mRNA normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein a VEGF expression level above the 90th percentile and an EIF4E expression level above the 50th percentile is a predictor of good patient response.
[0139]The invention further provides a method for predicting the likelihood of the recurrence of breast cancer in a patient diagnosed with breast cancer, comprising determining the ratio of p53:p21 mRNA expression or p53:mdm2 mRNA expression in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein an above normal ratio is indicative of a higher risk of recurrence. Typically, a higher risk of recurrence is indicated if the ratio is in the upper 10th percentile.
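The ratio test described above can be sketched as follows. This is a minimal illustration under stated assumptions: normalized levels are on a linear scale and strictly positive, "upper 10th percentile" is read as a percentile rank at or above 90 among the reference ratios, and the function name is hypothetical.

```python
def ratio_in_upper_decile(p53, partner, reference_pairs, cutoff=90.0):
    """Return True if the patient's p53:partner ratio (partner being p21 or
    mdm2) ranks at or above `cutoff` percent of the reference-set ratios.

    reference_pairs: iterable of (p53, partner) normalized levels,
    one pair per reference specimen (assumed positive, linear scale).
    """
    patient_ratio = p53 / partner
    reference_ratios = [a / b for a, b in reference_pairs]
    at_or_below = sum(1 for r in reference_ratios if r <= patient_ratio)
    return 100.0 * at_or_below / len(reference_ratios) >= cutoff
```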
[0140]In yet another aspect, the invention concerns a method for predicting the likelihood of the recurrence of breast cancer in a breast cancer patient following surgery, comprising determining the expression level of cyclin D1 in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein an expression level in the upper 10th percentile indicates increased risk of recurrence following surgery. In a particular embodiment of this method, the patient is subjected to adjuvant chemotherapy, if the expression level is in the upper 10th percentile.
[0141]Another aspect of the invention is a method for predicting the likelihood of the recurrence of breast cancer in a breast cancer patient following surgery, comprising determining the expression level of APC or E-cadherin in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein an expression level in the upper 5th percentile indicates high risk of recurrence following surgery, and heightened risk of shortened survival.
[0142]A further aspect of the invention is a method for predicting the response of a patient diagnosed with breast cancer to treatment with a proapoptotic drug comprising determining the expression levels of Bcl2 and c-MYC in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein (i) a Bcl2 expression level in the upper 10th percentile in the absence of elevated expression of c-MYC indicates good response, and (ii) a good response is not indicated if the expression level of c-MYC is elevated, regardless of the expression level of Bcl2.
[0143]A still further aspect of the invention is a method for predicting treatment outcome for a patient diagnosed with breast cancer, comprising the steps of:
[0144](a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and
[0145](b) determining the normalized expression levels of NFκB and at least one gene selected from the group consisting of cIAP1, cIAP2, XIAP, and Survivin,
[0146]wherein a poor prognosis is indicated if the expression levels for NFκB and at least one of the genes selected from the group consisting of cIAP1, cIAP2, XIAP, and Survivin are in the upper 5th percentile.
[0147]The invention further concerns a method for predicting treatment outcome for a patient diagnosed with breast cancer, comprising determining the expression levels of p53BP1 and p53BP2 in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein a poor outcome is predicted if the expression level of either p53BP1 or p53BP2 is in the lower 10th percentile.
[0148]The invention additionally concerns a method for predicting treatment outcome for a patient diagnosed with breast cancer, comprising determining the expression levels of uPA and PAI1 in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein (i) a poor outcome is predicted if the expression levels of uPA and PAI1 are in the upper 20th percentile, and (ii) a decreased risk of recurrence is predicted if the expression levels of uPA and PAI1 are not elevated above the mean observed in the breast cancer reference set. In a particular embodiment, poor outcome is measured in terms of shortened survival or increased risk of cancer recurrence following surgery. In another particular embodiment, uPA and PAI1 are expressed at normal levels, and the patient is subjected to adjuvant chemotherapy following surgery.
[0149]Another aspect of the invention is a method for predicting treatment outcome in a patient diagnosed with breast cancer, comprising determining the expression levels of cathepsin B and cathepsin L in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein a poor outcome is predicted if the expression level of either cathepsin B or cathepsin L is in the upper 10th percentile. Just as before, poor treatment outcome may be measured, for example, in terms of shortened survival or increased risk of cancer recurrence.
[0150]A further aspect of the invention is a method for devising the treatment of a patient diagnosed with breast cancer, comprising the steps of
[0151](a) determining the expression levels of scatter factor and c-met in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, and
[0152](b) suggesting prompt aggressive chemotherapeutic treatment if the expression levels of scatter factor and c-met, or the combination of both, are above the 90th percentile.
[0153]A still further aspect of the invention is a method for predicting treatment outcome for a patient diagnosed with breast cancer, comprising determining the expression levels of VEGF, CD31, and KDR in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein a poor treatment outcome is predicted if the expression level of any of VEGF, CD31, and KDR is in the upper 10th percentile.
[0154]Yet another aspect of the invention is a method for predicting treatment outcome for a patient diagnosed with breast cancer, comprising determining the expression levels of Ki67/MiB1, PCNA, Pin1, and thymidine kinase in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein a poor treatment outcome is predicted if the expression level of any of Ki67/MiB1, PCNA, Pin1, and thymidine kinase is in the upper 10th percentile.
[0155]The invention further concerns a method for predicting treatment outcome for a patient diagnosed with breast cancer, comprising determining the expression level of soluble and full length CD95 in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein the presence of soluble CD95 correlates with poor patient survival.
[0156]The invention also concerns a method for predicting treatment outcome for a patient diagnosed with breast cancer, comprising determining the expression levels of IGF1, IGF1R and IGFBP3 in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein a poor treatment outcome is predicted if the sum of the expression levels of IGF1, IGF1R and IGFBP3 is in the upper 10th percentile.
[0157]The invention additionally concerns a method for classifying breast cancer, comprising determining the expression level of two or more genes selected from the group consisting of Bcl12, hepatocyte nuclear factor 3, LIV1, ER, lipoprotein lipase, retinol binding protein 4, integrin α7, cytokeratin 5, cytokeratin 17, GRO oncogene, ErbB2 and Grb7, in a breast cancer tissue, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein (i) tumors expressing at least one of Bcl1, hepatocyte nuclear factor 3, LIV1, and ER above the mean expression level in the reference tissue set are classified as having a good prognosis for disease free and overall patient survival following surgical removal; (ii) tumors characterized by elevated expression of at least one of lipoprotein lipase, retinol binding protein 4, and integrin α7 compared to the reference tissue set are classified as having intermediate prognosis of disease free and overall patient survival following surgical removal; and (iii) tumors expressing either elevated levels of cytokeratins 5 and 17, and GRO oncogene at levels four-fold or greater above the mean expression level in the reference tissue set, or ErbB2 and Grb7 at levels ten-fold or more above the mean expression level in the reference tissue set are classified as having poor prognosis of disease free and overall patient survival following surgical removal.
[0158]Another aspect of the invention is a panel of two or more gene specific primers selected from the group consisting of the forward and reverse primers listed in Table 2.
[0159]Yet another aspect of the invention is a method for reverse transcription of a fragmented RNA population in RT-PCR amplification, comprising using a multiplicity of gene specific primers as the reverse primers in the amplification reaction. In a particular embodiment, the method uses between two and about 40,000 gene specific primers in the same amplification reaction. In another embodiment, the gene specific primers are about 18 to 24 bases, such as about 20 bases in length. In another embodiment, the Tm of the primers is about 58-60° C. The primers can, for example, be selected from the group consisting of the forward and reverse primers listed in Table 2.
[0160]The invention also concerns a method of reverse transcriptase driven first strand cDNA synthesis, comprising using a gene specific primer of about 18 to 24 bases in length and having a Tm optimum between about 58° C. and about 60° C. In a particular embodiment, the first strand cDNA synthesis is followed by PCR DNA amplification, and the primer serves as the reverse primer that drives the PCR amplification. In another embodiment, the method uses a plurality of gene specific primers in the same first strand cDNA synthesis reaction mixture. The number of the gene specific primers can, for example, be between 2 and about 40,000.
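The primer length and Tm windows described above can be illustrated with a simple screening sketch. The text does not state how Tm is calculated, so the sketch substitutes the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is only a rough approximation for short oligonucleotides. The function names and the hard cutoffs standing in for "about" are assumptions.

```python
def wallace_tm(primer):
    """Approximate Tm in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C).

    This rule is a stand-in chosen for the sketch, not a method taken
    from the text.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def is_candidate_primer(primer, min_len=18, max_len=24, tm_low=58, tm_high=60):
    """Return True if the primer is 18 to 24 bases long and its approximate
    Tm falls within the 58-60 degrees C window."""
    return min_len <= len(primer) <= max_len and tm_low <= wallace_tm(primer) <= tm_high
```

In practice a nearest-neighbor Tm model and the actual salt and primer concentrations would give a different estimate, so the window here is illustrative only.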
[0161]In a different aspect, the invention concerns a method of predicting the likelihood of long-term survival of a breast cancer patient without the recurrence of breast cancer, following surgical removal of the primary tumor, comprising determining the expression level of one or more prognostic RNA transcripts or their products in a breast cancer tissue sample obtained from said patient, normalized against the expression level of all RNA transcripts or their products in said breast cancer tissue sample, or of a reference set of RNA transcripts or their products, wherein the prognostic transcript is the transcript of one or more genes selected from the group consisting of: FOXM1, PRAME, Bcl2, STK15, CEGP1, Ki-67, GSTM1, CA9, PR, BBC3, NME1, SURV, GATA3, TFRC, YB-1, DPYD, GSTM3, RPS6KB1, Src, Chk1, ID1, EstR1, p27, CCNB1, XIAP, Chk2, CDC25B, IGF1R, AK055699, PI3KC2A, TGFB3, BAGI1, CYP3A4, EpCAM, VEGFC, pS2, hENT1, WISP1, HNF3A, NFKBp65, BRCA2, EGFR, TK1, VDR, Contig51037, pENT1, EPHX1, IF1A, DIABLO, CDH1, HIF1α, IGFBP3, CTSB, and Her2, wherein overexpression of one or more of FOXM1, PRAME, STK15, Ki-67, CA9, NME1, SURV, TFRC, YB-1, RPS6KB1, Src, Chk1, CCNB1, Chk2, CDC25B, CYP3A4, EpCAM, VEGFC, hENT1, BRCA2, EGFR, TK1, VDR, EPHX1, IF1A, Contig51037, CDH1, HIF1α, IGFBP3, CTSB, Her2, and pENT1 indicates a decreased likelihood of long-term survival without breast cancer recurrence, and the overexpression of one or more of Bcl2, CEGP1, GSTM1, PR, BBC3, GATA3, DPYD, GSTM3, ID1, EstR1, p27, XIAP, IGF1R, AK055699, PI3KC2A, TGFB3, BAGI1, pS2, WISP1, HNF3A, NFKBp65, and DIABLO indicates an increased likelihood of long-term survival without breast cancer recurrence.
[0162]In a particular embodiment of this method, the expression level of at least 2, preferably at least 5, more preferably at least 10, most preferably at least 15 prognostic transcripts or their expression products is determined.
[0163]When the breast cancer is invasive breast carcinoma, including both estrogen receptor (ER) overexpressing (ER positive) and ER negative tumors, the analysis includes determination of the expression levels of the transcripts of at least two of the following genes, or their expression products: FOXM1, PRAME, Bcl2, STK15, CEGP1, Ki-67, GSTM1, PR, BBC3, NME1, SURV, GATA3, TFRC, YB-1, DPYD, Src, CA9, Contig51037, RPS6K1 and Her2.
[0164]When the breast cancer is ER positive invasive breast carcinoma, the analysis includes determination of the expression levels of the transcripts of at least two of the following genes, or their expression products: PRAME, Bcl2, FOXM1, DIABLO, EPHX1, HIF1A, VEGFC, Ki-67, IGF1R, VDR, NME1, GSTM3, Contig51037, CDC25B, CTSB, p27, CDH1, and IGFBP3.
[0165]Just as before, it is preferred to determine the expression levels of at least 5, more preferably at least 10, most preferably at least 15 genes, or their respective expression products.
[0166]In a particular embodiment, the expression level of one or more prognostic RNA transcripts is determined, where RNA may, for example, be obtained from a fixed, wax-embedded breast cancer tissue specimen of the patient. The isolation of RNA can, for example, be carried out following any of the procedures described above or throughout the application, or by any other method known in the art.
[0167]In yet another aspect, the invention concerns an array comprising polynucleotides hybridizing to the following genes: FOXM1, PRAME, Bcl2, STK15, CEGP1, Ki-67, GSTM1, PR, BBC3, NME1, SURV, GATA3, TFRC, YB-1, DPYD, CA9, Contig51037, RPS6K1 and Her2, immobilized on a solid surface.
[0168]In a particular embodiment, the array comprises polynucleotides hybridizing to the following genes: FOXM1, PRAME, Bcl2, STK15, CEGP1, Ki-67, GSTM1, CA9, PR, BBC3, NME1, SURV, GATA3, TFRC, YB-1, DPYD, GSTM3, RPS6KB1, Src, Chk1, ID1, EstR1, p27, CCNB1, XIAP, Chk2, CDC25B, IGF1R, AK055699, PI3KC2A, TGFB3, BAGI1, CYP3A4, EpCAM, VEGFC, pS2, hENT1, WISP1, HNF3A, NFKBp65, BRCA2, EGFR, TK1, VDR, Contig51037, pENT1, EPHX1, IF1A, CDH1, HIF1α, IGFBP3, CTSB, Her2 and DIABLO.
[0169]In a further aspect, the invention concerns a method of predicting the likelihood of long-term survival of a patient diagnosed with invasive breast cancer, without the recurrence of breast cancer, following surgical removal of the primary tumor, comprising the steps of:
[0170](1) determining the expression levels of the RNA transcripts or the expression products of genes of a gene set selected from the group consisting of
[0171](a) Bcl2, cyclinG1, NFKBp65, NME1, EPHX1, TOP2B, DR5, TERC, Src, DIABLO;
[0172](b) Ki67, XIAP, hENT1, TS, CD9, p27, cyclinG1, pS2, NFKBp65, CYP3A4;
[0173](c) GSTM1, XIAP, Ki67, TS, cyclinG1, p27, CYP3A4, pS2, NFKBp65, ErbB3;
[0174](d) PR, NME1, XIAP, upa, cyclinG1, Contig51037, TERC, EPHX1, ALDH1A3, CTSL;
[0175](e) CA9, NME1, TERC, cyclinG1, EPHX1, DPYD, Src, TOP2B, NFKBp65, VEGFC;
[0176](f) TFRC, XIAP, Ki67, TS, cyclinG1, p27, CYP3A4, pS2, ErbB3, NFKBp65;
[0177](g) Bcl2, PRAME, cyclinG1, FOXM1, NFKBp65, TS, XIAP, Ki67, CYP3A4, p27;
[0178](h) FOXM1, cyclinG1, XIAP, Contig51037, PRAME, TS, Ki67, PDGFRa, p27, NFKBp65;
[0179](i) PRAME, FOXM1, cyclinG1, XIAP, Contig51037, TS, Ki67, PDGFRa, p27, NFKBp65;
[0180](j) Ki67, XIAP, PRAME, hENT1, Contig51037, TS, CD9, p27, ErbB3, cyclinG1;
[0181](k) STK15, XIAP, PRAME, PLAUR, p27, CTSL, CD18, PREP, p53, RPS6KB1;
[0182](l) GSTM1, XIAP, PRAME, p27, Contig51037, ErbB3, GSTp, EREG, ID1, PLAUR;
[0183](m) PR, PRAME, NME1, XIAP, PLAUR, cyclinG1, Contig51037, TERC, EPHX1, DR5;
[0184](n) CA9, FOXM1, cyclinG1, XIAP, TS, Ki67, NFKBp65, CYP3A4, GSTM3, p27;
[0185](o) TFRC, XIAP, PRAME, p27, Contig51037, ErbB3, DPYD, TERC, NME1, VEGFC; and
[0186](p) CEGP1, PRAME, hENT1, XIAP, Contig51037, ErbB3, DPYD, NFKBp65, ID1, TS
in a breast cancer tissue sample obtained from said patient, normalized against the expression levels of all RNA transcripts or their products in said breast cancer tissue sample, or of a reference set of RNA transcripts or their products;
[0187](2) subjecting the data obtained in step (1) to statistical analysis; and
[0188](3) determining whether the likelihood of said long-term survival has increased or decreased.
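The normalization against a reference set of transcripts referred to in step (1) can be sketched for RT-PCR data. This is a hedged illustration, not the procedure of the invention: it assumes expression is reported as threshold cycle (Ct) values, that the normalized value is the mean reference Ct minus the gene's Ct, and that the reference gene names (REF1, REF2) are placeholders.

```python
from statistics import mean


def normalize_expression(ct_values, reference_genes):
    """Normalize test-gene Ct values against the mean Ct of a reference set.

    ct_values:       dict mapping gene name -> Ct value for one specimen
    reference_genes: names of the reference transcripts (must be in ct_values)

    Returns a dict of normalized values for the non-reference genes.
    Higher values mean higher expression; under ideal PCR efficiency one
    unit corresponds to roughly a two-fold difference.
    """
    reference_mean = mean(ct_values[gene] for gene in reference_genes)
    return {
        gene: reference_mean - ct
        for gene, ct in ct_values.items()
        if gene not in reference_genes
    }
```

The normalized values for a gene set could then be passed to the statistical analysis of step (2).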
[0189]In a still further aspect, the invention concerns a method of predicting the likelihood of long-term survival of a patient diagnosed with estrogen receptor (ER)-positive invasive breast cancer, without the recurrence of breast cancer, following surgical removal of the primary tumor, comprising the steps of:
[0190](1) determining the expression levels of the RNA transcripts or the expression products of genes of a gene set selected from the group consisting of
[0191](a) PRAME, p27, IGFBP2, HIF1A, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;
[0192](b) Contig51037, EPHX1, Ki67, TIMP2, cyclinG1, DPYD, CYP3A4, TP, AIB1, CYP2C8;
[0193](c) Bcl2, hENT1, FOXM1, Contig51037, cyclinG1, Contig46653, PTEN, CYP3A4, TIMP2, AREG;
[0194](d) HIF1A, PRAME, p27, IGFBP2, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;
[0195](e) IGF1R, PRAME, EPHX1, Contig51037, cyclinG1, Bcl2, NME1, PTEN, TBP, TIMP2;
[0196](f) FOXM1, Contig51037, VEGFC, TBP, HIF1A, DPYD, RAD51C, DCR3, cyclinG1, BAG1;
[0197](g) EPHX1, Contig51037, Ki67, TIMP2, cyclinG1, DPYD, CYP3A4, TP, AIB1, CYP2C8;
[0198](h) Ki67, VEGFC, VDR, GSTM3, p27, upa, ITGA7, rhoC, TERC, Pin1;
[0199](i) CDC25B, Contig51037, hENT1, Bcl2, HLAG, TERC, NME1, upa, ID1, CYP;
[0200](j) VEGFC, Ki67, VDR, GSTM3, p27, upa, ITGA7, rhoC, TERC, Pin1;
[0201](k) CTSB, PRAME, p27, IGFBP2, EPHX1, CTSL, BAD, DR5, DCR3, XIAP;
[0202](l) DIABLO, Ki67, hENT1, TIMP2, ID1, p27, KRT19, IGFBP2, TS, PDGFB;
[0203](m) p27, PRAME, IGFBP2, HIF1A, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;
[0204](n) CDH1, PRAME, VEGFC, HIF1A, DPYD, TIMP2, CYP3A4, EstR1, RBP4, p27;
[0205](o) IGFBP3, PRAME, p27, Bcl2, XIAP, EstR1, Ki67, TS, Src, VEGF;
[0206](p) GSTM3, PRAME, p27, IGFBP3, XIAP, FGF2, hENT1, PTEN, EstR1, APC;
[0207](q) hENT1, Bcl2, FOXM1, Contig51037, cyclinG1, Contig46653, PTEN, CYP3A4, TIMP2, AREG;
[0208](r) STK15, VEGFC, PRAME, p27, GCLC, hENT1, ID1, TIMP2, EstR1, MCP1;
[0209](s) NME1, PRAME, p27, IGFBP3, XIAP, PTEN, hENT1, Bcl2, CYP3A4, HLAG;
[0210](t) VDR, Bcl2, p27, hENT1, p53, PI3KC2A, EIF4E, TFRC, MCM3, ID1;
[0211](u) EIF4E, Contig51037, EPHX1, cyclinG1, Bcl2, DR5, TBP, PTEN, NME1, HER2;
[0212](v) CCNB1, PRAME, VEGFC, HIF1A, hENT1, GCLC, TIMP2, ID1, p27, upa;
[0213](w) ID1, PRAME, DIABLO, hENT1, p27, PDGFRa, NME1, BIN1, BRCA1, TP;
[0214](x) FBXO5, PRAME, IGFBP3, p27, GSTM3, hENT1, XIAP, FGF2, TS, PTEN;
[0215](y) GUS, HIF1A, VEGFC, GSTM3, DPYD, hENT1, FBXO5, CA9, CYP, KRT18; and
[0216](z) Bclx, Bcl2, hENT1, Contig51037, HLAG, CD9, ID1, BRCA1, BIN1, HBEGF;
[0217](2) subjecting the data obtained in step (1) to statistical analysis; and
[0218](3) determining whether the likelihood of said long-term survival has increased or decreased.
[0219]In a different aspect, the invention concerns an array comprising polynucleotides hybridizing to a gene set selected from the group consisting of
[0220](a) Bcl2, cyclinG1, NFKBp65, NME1, EPHX1, TOP2B, DR5, TERC, Src, DIABLO;
[0221](b) Ki67, XIAP, hENT1, TS, CD9, p27, cyclinG1, pS2, NFKBp65, CYP3A4;
[0222](c) GSTM1, XIAP, Ki67, TS, cyclinG1, p27, CYP3A4, pS2, NFKBp65, ErbB3;
[0223](d) PR, NME1, XIAP, upa, cyclinG1, Contig51037, TERC, EPHX1, ALDH1A3, CTSL;
[0224](e) CA9, NME1, TERC, cyclinG1, EPHX1, DPYD, Src, TOP2B, NFKBp65, VEGFC;
[0225](f) TFRC, XIAP, Ki67, TS, cyclinG1, p27, CYP3A4, pS2, ErbB3, NFKBp65;
[0226](g) Bcl2, PRAME, cyclinG1, FOXM1, NFKBp65, TS, XIAP, Ki67, CYP3A4, p27;
[0227](h) FOXM1, cyclinG1, XIAP, Contig51037, PRAME, TS, Ki67, PDGFRa, p27, NFKBp65;
[0228](i) PRAME, FOXM1, cyclinG1, XIAP, Contig51037, TS, Ki67, PDGFRa, p27, NFKBp65;
[0229](j) Ki67, XIAP, PRAME, hENT1, Contig51037, TS, CD9, p27, ErbB3, cyclinG1;
[0230](k) STK15, XIAP, PRAME, PLAUR, p27, CTSL, CD18, PREP, p53, RPS6KB1;
[0231](l) GSTM1, XIAP, PRAME, p27, Contig51037, ErbB3, GSTp, EREG, ID1, PLAUR;
[0232](m) PR, PRAME, NME1, XIAP, PLAUR, cyclinG1, Contig51037, TERC, EPHX1, DR5;
[0233](n) CA9, FOXM1, cyclinG1, XIAP, TS, Ki67, NFKBp65, CYP3A4, GSTM3, p27;
[0234](o) TFRC, XIAP, PRAME, p27, Contig51037, ErbB3, DPYD, TERC, NME1, VEGFC; and
[0235](p) CEGP1, PRAME, hENT1, XIAP, Contig51037, ErbB3, DPYD, NFKBp65, ID1, TS,
immobilized on a solid surface.
[0236]In an additional aspect, the invention concerns an array comprising polynucleotides hybridizing to a gene set selected from the group consisting of:
[0237](a) PRAME, p27, IGFBP2, HIF1A, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;
[0238](b) Contig51037, EPHX1, Ki67, TIMP2, cyclinG1, DPYD, CYP3A4, TP, AIB1, CYP2C8;
[0239](c) Bcl2, hENT1, FOXM1, Contig51037, cyclinG1, Contig46653, PTEN, CYP3A4, TIMP2, AREG;
[0240](d) HIF1A, PRAME, p27, IGFBP2, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;
[0241](e) IGF1R, PRAME, EPHX1, Contig51037, cyclinG1, Bcl2, NME1, PTEN, TBP, TIMP2;
[0242](f) FOXM1, Contig51037, VEGFC, TBP, HIF1A, DPYD, RAD51C, DCR3, cyclinG1, BAG1;
[0243](g) EPHX1, Contig51037, Ki67, TIMP2, cyclinG1, DPYD, CYP3A4, TP, AIB1, CYP2C8;
[0244](h) Ki67, VEGFC, VDR, GSTM3, p27, upa, ITGA7, rhoC, TERC, Pin1;
[0245](i) CDC25B, Contig51037, hENT1, Bcl2, HLAG, TERC, NME1, upa, ID1, CYP;
[0246](j) VEGFC, Ki67, VDR, GSTM3, p27, upa, ITGA7, rhoC, TERC, Pin1;
[0247](k) CTSB, PRAME, p27, IGFBP2, EPHX1, CTSL, BAD, DR5, DCR3, XIAP;
[0248](l) DIABLO, Ki67, hENT1, TIMP2, ID1, p27, KRT19, IGFBP2, TS, PDGFB;
[0249](m) p27, PRAME, IGFBP2, HIF1A, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;
[0250](n) CDH1, PRAME, VEGFC, HIF1A, DPYD, TIMP2, CYP3A4, EstR1, RBP4, p27;
[0251](o) IGFBP3, PRAME, p27, Bcl2, XIAP, EstR1, Ki67, TS, Src, VEGF;
[0252](p) GSTM3, PRAME, p27, IGFBP3, XIAP, FGF2, hENT1, PTEN, EstR1, APC;
[0253](q) hENT1, Bcl2, FOXM1, Contig51037, cyclinG1, Contig46653, PTEN, CYP3A4, TIMP2, AREG;
[0254](r) STK15, VEGFC, PRAME, p27, GCLC, hENT1, ID1, TIMP2, EstR1, MCP1;
[0255](s) NME1, PRAME, p27, IGFBP3, XIAP, PTEN, hENT1, Bcl2, CYP3A4, HLAG;
[0256](t) VDR, Bcl2, p27, hENT1, p53, PI3KC2A, EIF4E, TFRC, MCM3, ID1;
[0257](u) EIF4E, Contig51037, EPHX1, cyclinG1, Bcl2, DR5, TBP, PTEN, NME1, HER2;
[0258](v) CCNB1, PRAME, VEGFC, HIF1A, hENT1, GCLC, TIMP2, ID1, p27, upa;
[0259](w) ID1, PRAME, DIABLO, hENT1, p27, PDGFRa, NME1, BIN1, BRCA1, TP;
[0260](x) FBXO5, PRAME, IGFBP3, p27, GSTM3, hENT1, XIAP, FGF2, TS, PTEN;
[0261](y) GUS, HIF1A, VEGFC, GSTM3, DPYD, hENT1, FBXO5, CA9, CYP, KRT18; and
[0262](z) Bclx, Bcl2, hENT1, Contig51037, HLAG, CD9, ID1, BRCA1, BIN1, HBEGF,
immobilized on a solid surface.
[0263]In all aspects, the polynucleotides can be cDNAs ("cDNA arrays") that are typically about 500 to 5000 bases long, although shorter or longer cDNAs can also be used and are within the scope of this invention. Alternatively, the polynucleotides can be oligonucleotides (DNA microarrays), which are typically about 20 to 80 bases long, although shorter and longer oligonucleotides are also suitable and are within the scope of the invention. The solid surface can, for example, be glass or nylon, or any other solid surface typically used in preparing arrays, such as microarrays, and is typically glass.
BRIEF DESCRIPTION OF THE DRAWINGS
[0264]FIG. 1 is a chart illustrating the overall workflow of the process of the invention for measurement of gene expression. In the Figure, FPET stands for "fixed paraffin-embedded tissue," and "RT-PCR" stands for "reverse transcriptase PCR." RNA concentration is determined by using the commercial RiboGreen® RNA Quantitation Reagent and Protocol.
[0265]FIG. 2 is a flow chart showing the steps of an RNA extraction method according to the invention alongside a flow chart of a representative commercial method.
[0266]FIG. 3 is a scheme illustrating the steps of an improved method for preparing fragmented mRNA for expression profiling analysis.
[0267]FIG. 4 illustrates methods for amplification of RNA prior to RT-PCR.
[0268]FIG. 5 illustrates an alternative scheme for repair and amplification of fragmented mRNA.
[0269]FIG. 6 shows the measurement of estrogen receptor mRNA levels in 40 FPE breast cancer specimens via RT-PCR. Three 10 micron sections were used for each measurement. Each data point represents the average of triplicate measurements.
[0270]FIG. 7 shows the results of the measurement of progesterone receptor mRNA levels in 40 FPE breast cancer specimens via RT-PCR performed as described in the legend of FIG. 6 above.
[0271]FIG. 8 shows results from an IVT/RT-PCR experiment.
[0272]FIG. 9 is a representation of the expression of 92 genes across 70 FPE breast cancer specimens. The y-axis shows expression as cycle threshold (Ct) values. These genes are a subset of the genes listed in Table 1.
[0273]Table 1 shows a breast cancer gene list.
[0274]Table 2 sets forth amplicon and primer sequences used for amplification of fragmented mRNA.
[0275]Table 3 shows the Accession Nos. and SEQ ID NOS of the breast cancer genes examined.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
A. Definitions
[0276]Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Singleton et al., Dictionary of Microbiology and Molecular Biology 2nd ed., J. Wiley & Sons (New York, N.Y. 1994), and March, Advanced Organic Chemistry Reactions, Mechanisms and Structure 4th ed., John Wiley & Sons (New York, N.Y. 1992), provide one skilled in the art with a general guide to many of the terms used in the present application.
[0277]One skilled in the art will recognize many methods and materials similar or equivalent to those described herein, which could be used in the practice of the present invention. Indeed, the present invention is in no way limited to the methods and materials described. For purposes of the present invention, the following terms are defined below.
[0278]The term "microarray" refers to an ordered arrangement of hybridizable array elements, preferably polynucleotide probes, on a substrate.
[0279]The term "polynucleotide," when used in singular or plural, generally refers to any polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA or modified RNA or DNA. Thus, for instance, polynucleotides as defined herein include, without limitation, single- and double-stranded DNA, DNA including single- and double-stranded regions, single- and double-stranded RNA, and RNA including single- and double-stranded regions, hybrid molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded or include single- and double-stranded regions. In addition, the term "polynucleotide" as used herein refers to triple-stranded regions comprising RNA or DNA or both RNA and DNA. The strands in such regions may be from the same molecule or from different molecules. The regions may include all of one or more of the molecules, but more typically involve only a region of some of the molecules. One of the molecules of a triple-helical region often is an oligonucleotide. The term "polynucleotide" specifically includes DNAs and RNAs that contain one or more modified bases. Thus, DNAs or RNAs with backbones modified for stability or for other reasons are "polynucleotides" as that term is intended herein. Moreover, DNAs or RNAs comprising unusual bases, such as inosine, or modified bases, such as tritiated bases, are included within the term "polynucleotides" as defined herein. In general, the term "polynucleotide" embraces all chemically, enzymatically and/or metabolically modified forms of unmodified polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses and cells, including simple and complex cells.
[0280]The term "oligonucleotide" refers to a relatively short polynucleotide, including, without limitation, single-stranded deoxyribonucleotides, single- or double-stranded ribonucleotides, RNA:DNA hybrids and double-stranded DNAs. Oligonucleotides, such as single-stranded DNA probe oligonucleotides, are often synthesized by chemical methods, for example using automated oligonucleotide synthesizers that are commercially available. However, oligonucleotides can be made by a variety of other methods, including in vitro recombinant DNA-mediated techniques and by expression of DNAs in cells and organisms.
[0281]The terms "differentially expressed gene," "differential gene expression" and their synonyms, which are used interchangeably, refer to a gene whose expression is activated to a higher or lower level in a subject suffering from a disease, specifically cancer, such as breast cancer, relative to its expression in a normal or control subject. The terms also include genes whose expression is activated to a higher or lower level at different stages of the same disease. It is also understood that a differentially expressed gene may be either activated or inhibited at the nucleic acid level or protein level, or may be subject to alternative splicing to result in a different polypeptide product. Such differences may be evidenced by a change in mRNA levels, surface expression, secretion or other partitioning of a polypeptide, for example. Differential gene expression may include a comparison of expression between two or more genes, or a comparison of the ratios of the expression between two or more genes, or even a comparison of two differently processed products of the same gene, which differ between normal subjects and subjects suffering from a disease, specifically cancer, or between various stages of the same disease. Differential expression includes both quantitative, as well as qualitative, differences in the temporal or cellular expression pattern in a gene or its expression products among, for example, normal and diseased cells, or among cells which have undergone different disease events or disease stages. For the purpose of this invention, "differential gene expression" is considered to be present when there is at least an about two-fold, preferably at least about four-fold, more preferably at least about six-fold, most preferably at least about ten-fold difference between the expression of a given gene in normal and diseased subjects, or in various stages of disease development in a diseased subject.
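As an illustration only, and not part of the disclosed method, the fold-difference tiers in the preceding definition can be sketched in Python. The function names, the use of linear-scale expression values, and the treatment of the "about" thresholds as hard cutoffs are all assumptions made for this example.

```python
def fold_difference(expr_a: float, expr_b: float) -> float:
    """Return the fold difference (always >= 1) between two positive values."""
    if expr_a <= 0 or expr_b <= 0:
        raise ValueError("expression values must be positive")
    ratio = expr_a / expr_b
    # Report under- and over-expression symmetrically.
    return ratio if ratio >= 1 else 1 / ratio


def fold_difference_tier(expr_diseased: float, expr_normal: float) -> str:
    """Classify a gene by the two/four/six/ten-fold tiers (hypothetical labels)."""
    fold = fold_difference(expr_diseased, expr_normal)
    tiers = [
        (10, "at least ten-fold"),
        (6, "at least six-fold"),
        (4, "at least four-fold"),
        (2, "at least two-fold"),
    ]
    for cutoff, label in tiers:
        if fold >= cutoff:
            return label
    return "below two-fold"
```

A gene measured at one quarter of its control level falls in the same tier as one measured at four times its control level, because the definition covers expression that is either higher or lower.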
[0282]The phrase "gene amplification" refers to a process by which multiple copies of a gene or gene fragment are formed in a particular cell or cell line. The duplicated region (a stretch of amplified DNA) is often referred to as an "amplicon." Usually, the amount of the messenger RNA (mRNA) produced, i.e., the level of gene expression, also increases in proportion to the number of copies made of the particular gene expressed.
[0283]The term "prognosis" is used herein to refer to the prediction of the likelihood of cancer-attributable death or progression, including recurrence, metastatic spread, and drug resistance, of a neoplastic disease, such as breast cancer. The term "prediction" is used herein to refer to the likelihood that a patient will respond either favorably or unfavorably to a drug or set of drugs, and also the extent of those responses. The predictive methods of the present invention can be used clinically to make treatment decisions by choosing the most appropriate treatment modalities for any particular patient. The predictive methods of the present invention are valuable tools in predicting if a patient is likely to respond favorably to a treatment regimen, such as surgical intervention, chemotherapy with a given drug or drug combination, and/or radiation therapy.
[0284]The term "increased resistance" to a particular drug or treatment option, when used in accordance with the present invention, means decreased response to a standard dose of the drug or to a standard treatment protocol.
[0285]The term "decreased sensitivity" to a particular drug or treatment option, when used in accordance with the present invention, means decreased response to a standard dose of the drug or to a standard treatment protocol, where decreased response can be compensated for (at least partially) by increasing the dose of drug, or the intensity of treatment.
[0286]"Patient response" can be assessed using any endpoint indicating a benefit to the patient, including, without limitation, (1) inhibition, to some extent, of tumor growth, including slowing down and complete growth arrest; (2) reduction in the number of tumor cells; (3) reduction in tumor size; (4) inhibition (i.e., reduction, slowing down or complete stopping) of tumor cell infiltration into adjacent peripheral organs and/or tissues; (5) inhibition (i.e. reduction, slowing down or complete stopping) of metastasis; (6) enhancement of anti-tumor immune response, which may, but does not have to, result in the regression or rejection of the tumor; (7) relief, to some extent, of one or more symptoms associated with the tumor; (8) increase in the length of survival following treatment; and/or (9) decreased mortality at a given point of time following treatment.
[0287]The term "treatment" refers to both therapeutic treatment and prophylactic or preventative measures, wherein the object is to prevent or slow down (lessen) the targeted pathologic condition or disorder. Those in need of treatment include those already with the disorder as well as those prone to have the disorder or those in whom the disorder is to be prevented. In tumor (e.g., cancer) treatment, a therapeutic agent may directly decrease the pathology of tumor cells, or render the tumor cells more susceptible to treatment by other therapeutic agents, e.g., radiation and/or chemotherapy.
[0288]The term "tumor," as used herein, refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues.
[0289]The terms "cancer" and "cancerous" refer to or describe the physiological condition in mammals that is typically characterized by unregulated cell growth. Examples of cancer include but are not limited to, breast cancer, colon cancer, lung cancer, prostate cancer, hepatocellular cancer, gastric cancer, pancreatic cancer, cervical cancer, ovarian cancer, liver cancer, bladder cancer, cancer of the urinary tract, thyroid cancer, renal cancer, carcinoma, melanoma, and brain cancer.
[0290]The "pathology" of cancer includes all phenomena that compromise the well-being of the patient. This includes, without limitation, abnormal or uncontrollable cell growth, metastasis, interference with the normal functioning of neighboring cells, release of cytokines or other secretory products at abnormal levels, suppression or aggravation of inflammatory or immunological response, neoplasia, premalignancy, malignancy, invasion of surrounding or distant tissues or organs, such as lymph nodes, etc.
[0291]"Stringency" of hybridization reactions is readily determinable by one of ordinary skill in the art, and generally is an empirical calculation dependent upon probe length, washing temperature, and salt concentration. In general, longer probes require higher temperatures for proper annealing, while shorter probes need lower temperatures. Hybridization generally depends on the ability of denatured DNA to reanneal when complementary strands are present in an environment below their melting temperature. The higher the degree of desired homology between the probe and hybridizable sequence, the higher the relative temperature which can be used. As a result, it follows that higher relative temperatures would tend to make the reaction conditions more stringent, while lower temperatures less so. For additional details and explanation of stringency of hybridization reactions, see Ausubel et al., Current Protocols in Molecular Biology, Wiley Interscience Publishers, (1995).
[0292]"Stringent conditions" or "high stringency conditions", as defined herein, typically: (1) employ low ionic strength and high temperature for washing, for example 0.015 M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate at 50° C.; (2) employ during hybridization a denaturing agent, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride, 75 mM sodium citrate at 42° C.; or (3) employ 50% formamide, 5×SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5×Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% SDS, and 10% dextran sulfate at 42° C., with washes at 42° C. in 0.2×SSC (sodium chloride/sodium citrate) and 50% formamide at 55° C., followed by a high-stringency wash consisting of 0.1×SSC containing EDTA at 55° C.
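The salt concentrations quoted above can be cross-checked with a short Python sketch. It assumes the standard composition of 1×SSC (150 mM sodium chloride, 15 mM trisodium citrate), which is consistent with the 5×SSC values given in condition (3); the function name is hypothetical.

```python
# Assumed composition of 1x SSC, in millimolar.
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}


def ssc_molarity(strength: float) -> dict:
    """Return the molar concentration (mol/L) of each salt in `strength`x SSC."""
    return {
        salt: round(millimolar * strength / 1000.0, 6)
        for salt, millimolar in SSC_1X_MM.items()
    }


# Strengths that appear in the stringent-condition examples.
for strength in (5, 0.2, 0.1):
    print(f"{strength}x SSC -> {ssc_molarity(strength)}")
```

Under this assumption, the wash in condition (1), 0.015 M sodium chloride/0.0015 M sodium citrate, corresponds to 0.1×SSC, the same salt strength as the final high-stringency wash in condition (3).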
[0293]"Moderately stringent conditions" may be identified as described by Sambrook et al., Molecular Cloning: A Laboratory Manual, New York: Cold Spring Harbor Press, 1989, and include the use of washing solution and hybridization conditions (e.g., temperature, ionic strength and % SDS) less stringent than those described above. An example of moderately stringent conditions is overnight incubation at 37° C. in a solution comprising: 20% formamide, 5×SSC (1×SSC being 150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5×Denhardt's solution, 10% dextran sulfate, and 20 mg/ml denatured sheared salmon sperm DNA, followed by washing the filters in 1×SSC at about 37-50° C. The skilled artisan will recognize how to adjust the temperature, ionic strength, etc. as necessary to accommodate factors such as probe length and the like. In the context of the present invention, reference to "at least one," "at least two," "at least five," etc. of the genes listed in any particular gene set means any one or any and all combinations of the genes listed.
[0294]The terms "splicing" and "RNA splicing" are used interchangeably and refer to RNA processing that removes introns and joins exons to produce mature mRNA with continuous coding sequence that moves into the cytoplasm of a eukaryotic cell.
[0295]In theory, the term "exon" refers to any segment of an interrupted gene that is represented in the mature RNA product (B. Lewin. Genes IV Cell Press, Cambridge Mass. 1990). In theory the term "intron" refers to any segment of DNA that is transcribed but removed from within the transcript by splicing together the exons on either side of it. Operationally, exon sequences occur in the mRNA sequence of a gene as defined by Ref. Seq ID numbers. Operationally, intron sequences are the intervening sequences within the genomic DNA of a gene, bracketed by exon sequences and having GT and AG splice consensus sequences at their 5' and 3' boundaries.
B. Detailed Description
[0296]The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, and biochemistry, which are within the skill of the art. Such techniques are explained fully in the literature, such as, "Molecular Cloning: A Laboratory Manual", 2nd edition (Sambrook et al., 1989); "Oligonucleotide Synthesis" (M. J. Gait, ed., 1984); "Animal Cell Culture" (R. I. Freshney, ed., 1987); "Methods in Enzymology" (Academic Press, Inc.); "Handbook of Experimental Immunology", 4th edition (D. M. Weir & C. C. Blackwell, eds., Blackwell Science Inc., 1987); "Gene Transfer Vectors for Mammalian Cells" (J. M. Miller & M. P. Calos, eds., 1987); "Current Protocols in Molecular Biology" (F. M. Ausubel et al., eds., 1987); and "PCR: The Polymerase Chain Reaction", (Mullis et al., eds., 1994).
[0297]1. Gene Expression Profiling
[0298]In general, methods of gene expression profiling can be divided into two large groups: methods based on hybridization analysis of polynucleotides, and methods based on sequencing of polynucleotides. The most commonly used methods known in the art for the quantification of mRNA expression in a sample include northern blotting and in situ hybridization (Parker & Barnes, Methods in Molecular Biology 106:247-283 (1999)); RNAse protection assays (Hod, Biotechniques 13:852-854 (1992)); and reverse transcription polymerase chain reaction (RT-PCR) (Weis et al., Trends in Genetics 8:263-264 (1992)). Alternatively, antibodies may be employed that can recognize specific duplexes, including DNA duplexes, RNA duplexes, and DNA-RNA hybrid duplexes or DNA-protein duplexes. Representative methods for sequencing-based gene expression analysis include Serial Analysis of Gene Expression (SAGE), and gene expression analysis by massively parallel signature sequencing (MPSS).
[0299]2. Reverse Transcriptase PCR (RT-PCR)
[0300]Of the techniques listed above, the most sensitive and most flexible quantitative method is RT-PCR, which can be used to compare mRNA levels in different sample populations, in normal and tumor tissues, with or without drug treatment, to characterize patterns of gene expression, to discriminate between closely related mRNAs, and to analyze RNA structure.
[0301]The first step is the isolation of mRNA from a target sample. The starting material is typically total RNA isolated from human tumors or tumor cell lines, and corresponding normal tissues or cell lines, respectively. Thus RNA can be isolated from a variety of primary tumors, including breast, lung, colon, prostate, brain, liver, kidney, pancreas, spleen, thymus, testis, ovary, uterus, etc., tumor, or tumor cell lines, with pooled DNA from healthy donors. If the source of mRNA is a primary tumor, mRNA can be extracted, for example, from frozen or archived paraffin-embedded and fixed (e.g. formalin-fixed) tissue samples.
[0302]General methods for mRNA extraction are well known in the art and are disclosed in standard textbooks of molecular biology, including Ausubel et al., Current Protocols in Molecular Biology, John Wiley and Sons (1997). Methods for RNA extraction from paraffin embedded tissues are disclosed, for example, in Rupp and Locker, Lab Invest. 56:A67 (1987), and De Andres et al., BioTechniques 18:42-44 (1995). In particular, RNA isolation can be performed using a purification kit, buffer set and protease from commercial manufacturers, such as Qiagen, according to the manufacturer's instructions. For example, total RNA from cells in culture can be isolated using Qiagen RNeasy mini-columns. Other commercially available RNA isolation kits include MasterPure® Complete DNA and RNA Purification Kit (EPICENTRE®, Madison, Wis.), and Paraffin Block RNA Isolation Kit (Ambion, Inc.). Total RNA from tissue samples can be isolated using RNA Stat-60 (Tel-Test). RNA prepared from tumor can be isolated, for example, by cesium chloride density gradient centrifugation.
[0303]As RNA cannot serve as a template for PCR, the first step in gene expression profiling by RT-PCR is the reverse transcription of the RNA template into cDNA, followed by its exponential amplification in a PCR reaction. The two most commonly used reverse transcriptases are avian myeloblastosis virus reverse transcriptase (AMV-RT) and Moloney murine leukemia virus reverse transcriptase (MMLV-RT). The reverse transcription step is typically primed using specific primers, random hexamers, or oligo-dT primers, depending on the circumstances and the goal of expression profiling. For example, extracted RNA can be reverse-transcribed using a GeneAmp RNA PCR kit (Perkin Elmer, Calif., USA), following the manufacturer's instructions. The derived cDNA can then be used as a template in the subsequent PCR reaction.
[0304]Although the PCR step can use a variety of thermostable DNA-dependent DNA polymerases, it typically employs the Taq DNA polymerase, which has a 5'-3' nuclease activity but lacks a 3'-5' proofreading exonuclease activity. Thus, TaqMan® PCR typically utilizes the 5'-nuclease activity of Taq or Tth polymerase to hydrolyze a hybridization probe bound to its target amplicon, but any enzyme with equivalent 5' nuclease activity can be used. Two oligonucleotide primers are used to generate an amplicon typical of a PCR reaction. A third oligonucleotide, or probe, is designed to detect nucleotide sequence located between the two PCR primers. The probe is non-extendible by Taq DNA polymerase enzyme, and is labeled with a reporter fluorescent dye and a quencher fluorescent dye. Any laser-induced emission from the reporter dye is quenched by the quenching dye when the two dyes are located close together as they are on the probe. During the amplification reaction, the Taq DNA polymerase enzyme cleaves the probe in a template-dependent manner. The resultant probe fragments dissociate in solution, and signal from the released reporter dye is free from the quenching effect of the second fluorophore. One molecule of reporter dye is liberated for each new molecule synthesized, and detection of the unquenched reporter dye provides the basis for quantitative interpretation of the data.
[0305]TaqMan® RT-PCR can be performed using commercially available equipment, such as, for example, ABI PRISM 7700® Sequence Detection System® (Perkin-Elmer-Applied Biosystems, Foster City, Calif., USA), or Lightcycler (Roche Molecular Biochemicals, Mannheim, Germany). In a preferred embodiment, the 5' nuclease procedure is run on a real-time quantitative PCR device such as the ABI PRISM 7700® Sequence Detection System®. The system consists of a thermocycler, laser, charge-coupled device (CCD) camera, and computer. The system amplifies samples in a 96-well format on a thermocycler. During amplification, laser-induced fluorescent signal is collected in real-time through fiber optic cables for all 96 wells, and detected at the CCD. The system includes software for running the instrument and for analyzing the data.
[0306]5'-Nuclease assay data are initially expressed as Ct, or the threshold cycle. As discussed above, fluorescence values are recorded during every cycle and represent the amount of product amplified to that point in the amplification reaction. The point when the fluorescent signal is first recorded as statistically significant is the threshold cycle (Ct).
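A minimal Python sketch of a threshold-cycle call follows. The text above says only that Ct is the point at which the signal is first recorded as statistically significant; the specific rule used here (signal above the baseline mean plus ten standard deviations, with the baseline taken from cycles 3 to 15) is an assumption chosen for illustration and is not presented as the instrument's algorithm. The function name and parameters are hypothetical.

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def threshold_cycle(
    fluorescence: Sequence[float],
    baseline_cycles: range = range(3, 16),  # assumed baseline window (1-based)
    n_sd: float = 10.0,                     # assumed significance multiplier
) -> Optional[int]:
    """Return the first cycle (1-based) whose signal exceeds the threshold.

    The threshold is the baseline mean plus n_sd baseline standard deviations.
    Returns None if the signal never crosses the threshold.
    """
    baseline = [fluorescence[c - 1] for c in baseline_cycles]
    threshold = mean(baseline) + n_sd * stdev(baseline)
    # Only cycles after the baseline window are eligible.
    for cycle in range(baseline_cycles.stop, len(fluorescence) + 1):
        if fluorescence[cycle - 1] > threshold:
            return cycle
    return None
```

This sketch returns a whole cycle number; interpolating between cycles to obtain a fractional Ct is omitted for brevity.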
[0307]To minimize errors and the effect of sample-to-sample variation, RT-PCR is usually performed using an internal standard. The ideal internal standard is expressed at a constant level among different tissues, and is unaffected by the experimental treatment. RNAs most frequently used to normalize patterns of gene expression are mRNAs for the housekeeping genes glyceraldehyde-3-phosphate-dehydrogenase (GAPDH) and β-actin.
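Normalization against an internal standard can be sketched as follows. This is a hedged illustration rather than the normalization procedure of the invention: it assumes the target Ct is compared with the mean Ct of the reference genes, and that each cycle corresponds to a two-fold change in template (100% amplification efficiency). The function names and the Ct values are made up for the example.

```python
from statistics import mean
from typing import Mapping


def delta_ct(target_ct: float, reference_cts: Mapping[str, float]) -> float:
    """Target Ct minus the mean reference Ct (a lower value means more abundant)."""
    if not reference_cts:
        raise ValueError("at least one reference gene is required")
    return target_ct - mean(reference_cts.values())


def relative_expression(
    target_ct: float,
    reference_cts: Mapping[str, float],
    efficiency: float = 2.0,  # assumed fold amplification per cycle
) -> float:
    """Expression of the target relative to the references, on a linear scale."""
    return efficiency ** (-delta_ct(target_ct, reference_cts))


# Illustrative, made-up Ct values for two housekeeping genes.
references = {"GAPDH": 22.0, "beta-actin": 24.0}
print(delta_ct(26.0, references))
print(relative_expression(26.0, references))
```

Averaging more than one reference gene reduces the influence of any single reference whose expression is not perfectly constant across samples.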
[0308]A more recent variation of the RT-PCR technique is the real time quantitative PCR, which measures PCR product accumulation through a dual-labeled fluorogenic probe (i.e., TaqMan® probe). Real time PCR is compatible both with quantitative competitive PCR, where internal competitor for each target sequence is used for normalization, and with quantitative comparative PCR using a normalization gene contained within the sample, or a housekeeping gene for RT-PCR. For further details see, e.g. Heid et al., Genome Research 6:986-994 (1996).
[0309]3. Microarrays
[0310]Differential gene expression can also be identified, or confirmed using the microarray technique. Thus, the expression profile of breast cancer-associated genes can be measured in either fresh or paraffin-embedded tumor tissue, using microarray technology. In this method, polynucleotide sequences of interest are plated, or arrayed, on a microchip substrate. The arrayed sequences are then hybridized with specific DNA probes from cells or tissues of interest. Just as in the RT-PCR method, the source of mRNA typically is total RNA isolated from human tumors or tumor cell lines, and corresponding normal tissues or cell lines. Thus RNA can be isolated from a variety of primary tumors or tumor cell lines. If the source of mRNA is a primary tumor, mRNA can be extracted, for example, from frozen or archived paraffin-embedded and fixed (e.g. formalin-fixed) tissue samples, which are routinely prepared and preserved in everyday clinical practice.
[0311]In a specific embodiment of the microarray technique, PCR amplified inserts of cDNA clones are applied to a substrate in a dense array. Preferably at least 10,000 nucleotide sequences are applied to the substrate. The microarrayed genes, immobilized on the microchip at 10,000 elements each, are suitable for hybridization under stringent conditions. Fluorescently labeled cDNA probes may be generated through incorporation of fluorescent nucleotides by reverse transcription of RNA extracted from tissues of interest. Labeled cDNA probes applied to the chip hybridize with specificity to each spot of DNA on the array. After stringent washing to remove non-specifically bound probes, the chip is scanned by confocal laser microscopy or by another detection method, such as a CCD camera. Quantitation of hybridization of each arrayed element allows for assessment of corresponding mRNA abundance. With dual color fluorescence, separately labeled cDNA probes generated from two sources of RNA are hybridized pairwise to the array. The relative abundance of the transcripts from the two sources corresponding to each specified gene is thus determined simultaneously. The miniaturized scale of the hybridization affords a convenient and rapid evaluation of the expression pattern for large numbers of genes. Such methods have been shown to have the sensitivity required to detect rare transcripts, which are expressed at a few copies per cell, and to reproducibly detect at least approximately two-fold differences in the expression levels (Schena et al., Proc. Natl. Acad. Sci. USA 93(20):10614-10619 (1996)). Microarray analysis can be performed by commercially available equipment, following manufacturer's protocols, such as by using the Affymetrix GeneChip technology, or Incyte's microarray technology.
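The dual color comparison described above can be sketched in Python as a per-element log ratio of the two channel intensities. This is a simplified illustration under stated assumptions: background subtraction and dye normalization are omitted, the element names and intensities are invented, and the function names are hypothetical.

```python
import math
from typing import Dict, Tuple


def log2_ratios(intensities: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Map each array element to log2(channel 1 intensity / channel 2 intensity)."""
    return {
        name: math.log2(channel_1 / channel_2)
        for name, (channel_1, channel_2) in intensities.items()
    }


def at_least_two_fold(ratios: Dict[str, float]) -> Dict[str, float]:
    """Keep elements whose two sources differ by two-fold or more (|log2| >= 1)."""
    return {name: value for name, value in ratios.items() if abs(value) >= 1.0}


# Invented intensities for three array elements: (source 1, source 2).
spots = {
    "element_A": (800.0, 200.0),
    "element_B": (300.0, 250.0),
    "element_C": (100.0, 400.0),
}
print(at_least_two_fold(log2_ratios(spots)))
```

Working on the log2 scale makes a two-fold increase and a two-fold decrease symmetric around zero, which is why the filter uses the absolute value.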
[0312]The development of microarray methods for large-scale analysis of gene expression makes it possible to search systematically for molecular markers of cancer classification and outcome prediction in a variety of tumor types.
[0313]4. Serial Analysis of Gene Expression (SAGE)
[0314]Serial analysis of gene expression (SAGE) is a method that allows the simultaneous and quantitative analysis of a large number of gene transcripts, without the need of providing an individual hybridization probe for each transcript. First, a short sequence tag (about 10-14 bp) is generated that contains sufficient information to uniquely identify a transcript, provided that the tag is obtained from a unique position within each transcript. Then, many transcripts are linked together to form long serial molecules, that can be sequenced, revealing the identity of the multiple tags simultaneously. The expression pattern of any population of transcripts can be quantitatively evaluated by determining the abundance of individual tags, and identifying the gene corresponding to each tag. For more details see, e.g. Velculescu et al., Science 270:484-487 (1995); and Velculescu et al., Cell 88:243-51 (1997).
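The tag-counting idea can be illustrated with a short Python sketch. It assumes that each tag is the 10 bases immediately 3' of the last CATG site in a transcript; this anchoring rule, the toy sequences, and the function names are assumptions made for the example. The ligation of tags into long serial molecules and their sequencing are not modeled.

```python
from collections import Counter
from typing import Iterable, Optional

ANCHOR = "CATG"  # assumed anchoring site that fixes the tag position
TAG_LEN = 10     # tag length, within the 10-14 bp range mentioned above


def extract_tag(transcript: str) -> Optional[str]:
    """Return the TAG_LEN bases after the 3'-most anchor site, or None."""
    position = transcript.rfind(ANCHOR)
    if position == -1:
        return None  # no anchor site, so no tag can be defined
    start = position + len(ANCHOR)
    tag = transcript[start:start + TAG_LEN]
    return tag if len(tag) == TAG_LEN else None


def count_tags(transcripts: Iterable[str]) -> Counter:
    """Count how often each tag occurs across a population of transcripts."""
    tags = (extract_tag(t) for t in transcripts)
    return Counter(tag for tag in tags if tag is not None)


# Toy transcript population: two copies of one transcript, one of another.
population = [
    "GGCATGAAAAACCCCCTTT",
    "GGCATGAAAAACCCCCTTT",
    "TTCATGGGGGGTTTTTAA",
    "ACGTACGT",
]
print(count_tags(population))
```

The tag counts stand in for transcript abundance; mapping each tag back to its gene would require a reference sequence database, which is outside this sketch.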
[0315]5. Gene Expression Analysis by Massively Parallel Signature Sequencing (MPSS)
[0316]This method, described by Brenner et al., Nature Biotechnology 18:630-634 (2000), is a sequencing approach that combines non-gel-based signature sequencing with in vitro cloning of millions of templates on separate 5 μm diameter microbeads. First, a microbead library of DNA templates is constructed by in vitro cloning. This is followed by the assembly of a planar array of the template-containing microbeads in a flow cell at a high density (typically greater than 3×10⁶ microbeads/cm²). The free ends of the cloned templates on each microbead are analyzed simultaneously, using a fluorescence-based signature sequencing method that does not require DNA fragment separation. This method has been shown to simultaneously and accurately provide, in a single operation, hundreds of thousands of gene signature sequences from a yeast cDNA library.
[0317]6. General Description of the mRNA Isolation, Purification and Amplification Methods of the Invention
[0318]The steps of a representative protocol of the invention, including mRNA isolation, purification, primer extension and amplification are illustrated in FIG. 1. As shown in FIG. 1, this representative process starts with cutting about 10 μm thick sections of paraffin-embedded tumor tissue samples. The RNA is then extracted, and protein and DNA are removed, following the method of the invention described below. After analysis of the RNA concentration, RNA repair and/or amplification steps may be included, if necessary, and RNA is reverse transcribed using gene specific primers followed by RT-PCR. Finally, the data are analyzed to identify the best treatment option(s) available to the patient on the basis of the characteristic gene expression pattern identified in the tumor sample examined. The individual steps of this protocol will be discussed in greater detail below.
[0319]7. Improved Method for Isolation of Nucleic Acid from Archived Tissue Specimens
[0320]As discussed above, in the first step of the method of the invention, total RNA is extracted from the source material of interest, including fixed, paraffin-embedded tissue specimens, and purified sufficiently to act as a substrate in an enzyme assay. Despite the availability of commercial products, and the extensive knowledge available concerning the isolation of nucleic acid, such as RNA, from tissues, isolation of nucleic acid (RNA) from fixed, paraffin-embedded tissue specimens (FPET) is not without difficulty.
[0321]In one aspect, the present invention concerns an improved method for the isolation of nucleic acid from archived, e.g. FPET, specimens. Measured levels of mRNA species are useful for defining the physiological or pathological status of cells and tissues. RT-PCR (which is discussed above) is one of the most sensitive, reproducible and quantitative methods for this "gene expression profiling". Paraffin-embedded, formalin-fixed tissue is the most widely available material for such studies. Several laboratories have demonstrated that it is possible to successfully use fixed-paraffin-embedded tissue (FPET) as a source of RNA for RT-PCR (Stanta et al., Biotechniques 11:304-308 (1991); Stanta et al., Methods Mol. Biol. 86:23-26 (1998); Jackson et al., Lancet 1:1391 (1989); Jackson et al., J. Clin. Pathol. 43:499-504 (1990); Finke et al., Biotechniques 14:448-453 (1993); Goldsworthy et al., Mol. Carcinog. 25:86-91 (1999); Stanta and Bonin, Biotechniques 24:271-276 (1998); Godfrey et al., J. Mol. Diagnostics 2:84 (2000); Specht et al., J. Mol. Med. 78:B27 (2000); Specht et al., Am. J. Pathol. 158:419-429 (2001)). This allows gene expression profiling to be carried out on the most commonly available source of human biopsy specimens, and therefore potentially to create new valuable diagnostic and therapeutic information.
[0322]The most widely used protocols utilize hazardous organic solvents, such as xylene or octane (Finke et al., supra), to dewax the tissue in the paraffin blocks before nucleic acid (RNA and/or DNA) extraction. Obligatory organic solvent removal (e.g. with ethanol) and rehydration steps follow; these necessitate multiple manipulations and add substantially to the total time of the protocol, which can take up to several days. Commercial kits and protocols for RNA extraction from FPET [MasterPure® Complete DNA and RNA Purification Kit (EPICENTRE®, Madison, Wis.); Paraffin Block RNA Isolation Kit (Ambion, Inc.) and RNeasy® Mini kit (Qiagen, Chatsworth, Calif.)] use xylene for deparaffinization, in procedures that typically require multiple centrifugations, ethanol buffer changes, and further incubations after the incubation with xylene.
[0323]The present invention provides an improved nucleic acid extraction protocol that produces nucleic acid, in particular RNA, sufficiently intact for gene expression measurements. The key step in the nucleic acid extraction protocol herein is the performance of dewaxing without the use of any organic solvent, thereby eliminating the need for multiple manipulations associated with the removal of the organic solvent, and substantially reducing the total time of the protocol. According to the invention, wax, e.g. paraffin, is removed from wax-embedded tissue samples by incubation at 65-75° C. in a lysis buffer that solubilizes the tissue and hydrolyzes the protein, followed by cooling to solidify the wax.
[0324]FIG. 2 shows a flow chart of an RNA extraction protocol of the present invention in comparison with a representative commercial method, using xylene to remove wax. The times required for individual steps in the processes and for the overall processes are shown in the chart. As shown, the commercial process requires approximately 50% more time than the process of the invention.
[0325]The lysis buffer can be any buffer known for cell lysis. It is, however, preferred that oligo-dT-based methods of selectively purifying polyadenylated mRNA not be used to isolate RNA for the present invention, since the bulk of the mRNA molecules are expected to be fragmented and therefore will not have an intact polyadenylated tail, and will not be recovered or available for subsequent analytical assays. Otherwise, any number of standard nucleic acid purification schemes can be used. These include chaotrope and organic solvent extractions, extraction using glass beads or filters, salting out and precipitation based methods, or any of the purification methods known in the art to recover total RNA or total nucleic acids from a biological source.
[0326]Lysis buffers are commercially available, such as, for example, from Qiagen, Epicentre, or Ambion. A preferred group of lysis buffers typically contains urea, and Proteinase K or other protease. Proteinase K is very useful in the isolation of high quality, undamaged DNA or RNA, since most mammalian DNases and RNases are rapidly inactivated by this enzyme, especially in the presence of 0.5-1% sodium dodecyl sulfate (SDS). This is particularly important in the case of RNA, which is more susceptible to degradation than DNA. While DNases require metal ions for activity, and can therefore be easily inactivated by chelating agents, such as EDTA, there is no similar co-factor requirement for RNases.
[0327]Cooling and resultant solidification of the wax permits easy separation of the wax from the total nucleic acid, which can be conveniently precipitated, e.g. by isopropanol. Further processing depends on the intended purpose. If the proposed method of RNA analysis is subject to bias by contaminating DNA in an extract, the RNA extract can be further treated, e.g. by DNase, post purification to specifically remove DNA while preserving RNA. For example, if the goal is to isolate high quality RNA for subsequent RT-PCR amplification, nucleic acid precipitation is followed by the removal of DNA, usually by DNase treatment. However, DNA can be removed at various stages of nucleic acid isolation, by DNase or other techniques well known in the art.
[0328]While the advantages of the nucleic acid extraction protocol of the invention are most apparent for the isolation of RNA from archived, paraffin embedded tissue samples, the wax removal step of the present invention, which does not involve the use of an organic solvent, can also be included in any conventional protocol for the extraction of total nucleic acid (RNA and DNA) or DNA only. All of these aspects are specifically within the scope of the invention.
[0329]By using heat followed by cooling to remove paraffin, the process of the present invention saves valuable processing time and eliminates a series of manipulations, thereby potentially increasing the yield of nucleic acid. Indeed, experimental evidence presented in the examples below demonstrates that the method of the present invention does not compromise RNA yield.
[0330]8. 5'-Multiplexed Gene Specific Priming of Reverse Transcription
[0331]RT-PCR requires reverse transcription of the test RNA population as a first step. The most commonly used primer for reverse transcription is oligo-dT, which works well when RNA is intact. However, this primer will not be effective when RNA is highly fragmented as is the case in FPE tissues.
[0332]The present invention includes the use of gene specific primers, which are roughly 20 bases in length with a Tm optimum between about 58° C. and 60° C. These primers will also serve as the reverse primers that drive PCR DNA amplification.
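As a rough illustration of screening candidate primers against the length and Tm targets stated above, the sketch below estimates Tm with a simple GC-content approximation. The formula, the function names and the length tolerance of 18 to 22 bases are assumptions made for this illustration only; primer design software typically uses more detailed models, and its Tm values will differ from this estimate.

```python
def estimate_tm(primer):
    """Approximate Tm in deg C from GC content (illustrative formula only)."""
    seq = primer.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def in_target_window(primer, tm_low=58.0, tm_high=60.0, min_len=18, max_len=22):
    """True if the primer is roughly 20 bases long and its estimated Tm
    falls between tm_low and tm_high (window taken from the text)."""
    tm = estimate_tm(primer)
    return min_len <= len(primer) <= max_len and tm_low <= tm <= tm_high


# Hypothetical 20-base sequences, not primers from Table 2.
gc_rich = "GCGCGCGCGCGCGCATATAT"   # 14 of 20 bases are G or C
at_only = "ATATATATATATATATATAT"   # no G or C
```

Under this approximation the first sequence falls inside the 58-60° C. window and the second does not.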
[0333]Another aspect of the invention is the inclusion of multiple gene-specific primers in the same reaction mixture. The number of such different primers can vary greatly and can be as low as two and as high as 40,000 or more. Table 2 displays examples of reverse primers that can be successfully used in carrying out the methods of the invention. FIG. 9 shows expression data obtained using this multiplexed gene-specific priming strategy. Specifically, FIG. 9 is a representation of the expression of 92 genes (a subset of genes listed in Table 1) across 70 FPE breast cancer specimens. The y-axis shows expression as cycle threshold times.
[0334]An alternative approach is based on the use of random hexamers as primers for cDNA synthesis. However, we have experimentally demonstrated that the method of using a multiplicity of gene-specific primers is superior to the known approach using random hexamers.
[0335]9. Preparation of Fragmented mRNA for Expression Profiling Assays
[0336]It is of interest to analyze the abundance of specific mRNA species in biological samples, since this expression profile provides an index of the physiological state of that sample. mRNA is notoriously difficult to extract and maintain in its native state; consequently, mRNA recovered from biological sources is often fragmented or somewhat degraded. This is especially true of human tissue specimens that have been chemically fixed and stored for extended periods of time.
[0337]In one aspect, the present invention provides a means of preparing the mRNA extracted from various sources, including archived tissue specimens, for expression profiling in a way that its relative abundance is preserved and the mRNAs of interest can be successfully measured. This method is useful as a means of preparing mRNA for analysis by any of the known expression profiling methods, including RT-PCR coupled with 5' exonuclease cleavage of reporter probes (TaqMan® type assays), as discussed above, flap endonuclease assays (Cleavase® and Invader® type assays), oligonucleotide hybridization arrays, cDNA hybridization arrays, oligonucleotide ligation assays, 3' single nucleotide extension assays and other assays designed to assess the abundance of specific mRNA sequences in a biological sample.
[0338]According to the method of the invention, total RNA is extracted from the source material and sufficiently purified to act as a substrate in an enzyme assay. The extraction procedure, including a new and improved way of removing the wax (e.g. paraffin) used for embedding the tissue samples, has been discussed above. It has also been noted that it is preferred that oligo-dT based methods of selectively purifying polyadenylated mRNA not be used to isolate RNA for this invention since the bulk of the mRNA is expected to be fragmented, will not be polyadenylated and, therefore, will not be recovered and available for subsequent analytical assays if an oligo-dT based method is used.
[0339]A diagram of an improved method for repairing fragmented RNA is shown in FIG. 3. The fragmented RNA purified from the tissue sample is mixed with universal or gene-specific, single-stranded DNA templates for each mRNA species of interest. These templates may be full length DNA copies of the mRNA derived from cloned gene sources, fragments of the gene representing only the segment of the gene to be assayed, or a series of long oligonucleotides representing either the full length gene or the specific segment(s) of interest. The template can represent either a single consensus sequence or be a mixture of polymorphic variants of the gene. This DNA template, or scaffold, will preferably include one or more dUTP or rNTP sites in its length. This will provide a means of removing the template prior to carrying out subsequent analytical steps to avoid its acting as a substrate or target in later analysis assays. This removal is accomplished by treating the sample with uracil-DNA glycosylase (UDG) and heating it to cause strand breaks where UDG has generated abasic sites. In the case of rNTP's, the sample can be heated in the presence of a basic buffer (pH˜10) to induce strand breaks where rNTP's are located in the template.
[0340]The single stranded DNA template is mixed with the purified RNA, the mixture is denatured and annealed so that the RNA fragments complementary to the DNA template effectively become primers that can be extended along the single stranded DNA templates. DNA polymerase I requires a primer for extension but will efficiently use either a DNA or an RNA primer. Therefore in the presence of DNA polymerase I and dNTP's, the fragmented RNA can be extended along the complementary DNA templates. In order to increase the efficiency of the extension, this reaction can be thermally cycled, allowing overlapping templates and extension products to hybridize and extend until the overall population of fragmented RNA becomes represented as double stranded DNA extended from RNA fragment primers.
[0341]Following the generation of this "repaired" RNA, the sample should be treated with UDG or heat-treated in a mildly basic solution to fragment the DNA template (scaffold) and prevent it from participating in subsequent analytical reactions.
[0342]The product resulting from this enzyme extension can then be used as a template in a standard enzyme profiling assay that includes amplification and detectable signal generation such as fluorescent, chemiluminescent, colorimetric or other common read outs from enzyme based assays. For example, for TaqMan® type assays, this double stranded DNA product is added as the template in a standard assay; and, for array hybridization, this product acts as the cDNA template for the cRNA labeling reaction typically used to generate single-stranded, labeled RNA for array hybridization.
[0343]This method of preparing template has the advantage of recovering information from mRNA fragments too short to effectively act as templates in standard cDNA generation schemes. In addition, this method acts to preserve the specific locations in mRNA sequences targeted by specific analysis assays. For example, TaqMan® assays rely on a single contiguous sequence in a cDNA copy of mRNA to act as a PCR amplification template targeted by a labeled reporter probe. If mRNA strand breaks occur in this sequence, the assay will not detect that template and will underestimate the quantity of that mRNA in the original sample. This target preparation method minimizes that effect from RNA fragmentation.
[0344]The extension product formed in the RNA primer extension assay can be controlled by controlling the input quantity of the single stranded DNA template and by doing limited cycling of the extension reaction. This is important in preserving the relative abundance of the mRNA sequences targeted for analysis.
[0345]This method has the added advantage of not requiring parallel preparation for each target sequence since it is easily multiplexed. It is also possible to use large pools of random sequence long oligonucleotides or full libraries of cloned sequences to extend the entire population of mRNA sequences in the sample extract for whole expressed genome analysis rather than targeted gene specific analysis.
[0346]10. Amplification of mRNA Species Prior to RT-PCR
[0347]Due to the limited amount and poor quality of mRNA that can be isolated from FPET, a new procedure that could accurately amplify mRNAs of interest would be very useful, particularly for real time quantitation of gene expression (TaqMan®) and especially for quantitating a large number of genes (>50 to 10,000).
[0348]Current protocols (e.g. Eberwine, Biotechniques 20:584-91 (1996)) are optimized for mRNA amplification from small amounts of total or poly A+ RNA, mainly for microarray analysis. The present invention provides a protocol optimized for amplification of small amounts of fragmented total RNA (average size about 60-150 bps), utilizing gene-specific sequences as primers, as illustrated in FIG. 4.
[0349]The amplification procedure of the invention uses a very large number, typically as many as 100-190,000 gene specific primers (GSP's) in one reverse transcription run. Each GSP contains an RNA polymerase promoter, e.g. a T7 DNA-dependent RNA polymerase promoter, at the 5' end for subsequent RNA amplification. GSP's are preferred as primers because of the small size of the RNA. Current protocols utilize dT primers, which would not adequately represent all reverse transcripts of mRNAs due to the small size of the FPET RNA. GSP's can be designed by optimizing usual parameters, such as length, Tm, etc. For example, GSP's can be designed using the Primer Express® (Applied Biosystems), or Primer 3 (MIT) software program. Typically at least 3 sets per gene are designed, and the ones giving the lowest Ct on FPET RNA (best performers) are selected.
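The selection step described at the end of this paragraph (keeping, for each gene, the candidate primer set that gives the lowest Ct on FPET RNA) can be sketched as follows. The data layout, the function name, the set labels and the Ct values are all hypothetical and serve only to illustrate the comparison.

```python
def select_best_primer_sets(ct_by_gene):
    """For each gene, return the label of the candidate set with the lowest Ct.

    ct_by_gene maps gene name -> {primer set label: Ct measured on FPET RNA}.
    """
    return {gene: min(cts, key=cts.get) for gene, cts in ct_by_gene.items()}


# Invented Ct values for three candidate sets per gene.
trial_cts = {
    "GAPDH": {"set1": 24.1, "set2": 22.8, "set3": 23.5},
    "ESR1": {"set1": 29.0, "set2": 30.2, "set3": 28.4},
}
best = select_best_primer_sets(trial_cts)
```

In practice a tie-breaking rule and replicate measurements would probably be needed; neither is specified in the text, so they are left out here.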
[0350]Second strand cDNA synthesis is performed by standard procedures (see FIG. 4, Method 1), or by GSPf primers and Taq pol under PCR conditions (e.g., 95° C., 10 min (Taq activation) then 60° C., 45 sec). The advantages of the latter method are that the second gene specific primer, GSPf, adds additional specificity (and potentially more efficient second strand synthesis), and that it offers the option of performing several cycles of PCR if more starting DNA is necessary for RNA amplification by T7 RNA polymerase. RNA amplification is then performed under standard conditions to generate multiple copies of cRNA, which is then used in a standard TaqMan® reaction.
[0351]Although this process is illustrated by using T7-based RNA amplification, a person skilled in the art will understand that other RNA polymerase promoters that do not require a primer, such as T3 or Sp6 can also be used, and are within the scope of the invention.
[0352]11. A Method of Elongation of Fragmented RNA and Subsequent Amplification
[0353]This method, which combines and modifies the inventions described in sections 9 and 10 above, is illustrated in FIG. 5. The procedure begins with elongation of fragmented mRNA. This occurs as described above except that the scaffold DNAs are tagged with the T7 RNA polymerase promoter sequence at their 5' ends, leading to double-stranded DNA extended from RNA fragments. The template sequences need to be removed after in vitro transcription. These templates can include dUTP or rNTP nucleotides, enabling enzymatic removal of the templates as described in section 9, or the templates can be removed by DNaseI treatment.
[0354]The template DNA can be a population representing different mRNAs of any number. A high sequence complexity source of DNA templates (scaffolds) can be generated by pooling RNA from a variety of cells or tissues. In one embodiment, these RNAs are converted into double stranded DNA and cloned into phagemids. Single stranded DNA can then be rescued by phagemid growth and single stranded DNA isolation from purified phagemids.
[0355]This invention is useful because it increases gene expression profile signals in two different ways: both by increasing test mRNA polynucleotide sequence length and by in vitro transcription amplification. An additional advantage is that it eliminates the need to carry out reverse transcription optimization with gene specific primers tagged with the T7 RNA polymerase promoter sequence, and thus, is comparatively fast and economical.
[0356]This invention can be used with a variety of different methods to profile gene expression, e.g., RT-PCR or a variety of DNA array methods. Just as in the previous protocol, this approach is illustrated by using a T7 promoter, but the invention is not so limited. A person skilled in the art will appreciate that other RNA polymerase promoters, such as T3 or Sp6, can also be used.
[0357]12. Breast Cancer Gene Set, Assayed Gene Subsequences, and Clinical Application of Gene Expression Data
[0358]An important aspect of the present invention is to use the measured expression of certain genes by breast cancer tissue to match patients to the best drugs or drug combinations, and to provide prognostic information. For this purpose it is necessary to correct for (normalize away) both differences in the amount of RNA assayed and variability in the quality of the RNA used. Therefore, the assay measures and incorporates the expression of certain normalizing genes, including well known housekeeping genes, such as GAPDH and Cyp1. Alternatively, normalization can be based on the mean or median signal (Ct) of all of the assayed genes or a large subset thereof (global normalization approach). On a gene-by-gene basis, the measured normalized amount of a patient tumor mRNA is compared to the amount found in a breast cancer tissue reference set. The number (N) of breast cancer tissues in this reference set should be sufficiently high to ensure that different reference sets (as a whole) behave essentially the same way. If this condition is met, the identity of the individual breast cancer tissues present in a particular set will have no significant impact on the relative amounts of the genes assayed. Usually, the breast cancer tissue reference set consists of at least about 30, preferably at least about 40 different FPE breast cancer tissue specimens. Unless noted otherwise, normalized expression levels for each mRNA/tested tumor/patient will be expressed as a percentage of the expression level measured in the reference set. More specifically, the reference set of a sufficiently high number (e.g. 40) of tumors yields a distribution of normalized levels of each mRNA species. The level measured in a particular tumor sample to be analyzed falls at some percentile within this range, which can be determined by methods well known in the art. Below, unless noted otherwise, references to expression levels of a gene assume normalized expression relative to the reference set, although this is not always explicitly stated.
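The normalization and percentile comparison described above can be sketched as follows. This is a minimal illustration rather than the assay's actual computation: the function names, the delta-Ct convention (mean Ct of the normalizing genes minus the target Ct) and the percentile definition (share of reference-set values at or below the sample value) are assumptions made here, and all Ct values are invented.

```python
from statistics import mean


def normalized_expression(target_ct, reference_gene_cts):
    """Reference-normalized expression on a log2 scale (assumed convention).

    A lower Ct means more transcript, so the target Ct is subtracted from
    the mean Ct of the normalizing genes (e.g. GAPDH and Cyp1).
    """
    return mean(reference_gene_cts) - target_ct


def percentile_in_reference_set(sample_value, reference_values):
    """Percent of reference-set values at or below the sample value."""
    at_or_below = sum(1 for v in reference_values if v <= sample_value)
    return 100.0 * at_or_below / len(reference_values)


# Invented example: one tumor compared with a tiny four-specimen reference
# set (the text calls for at least about 30 to 40 specimens in practice).
tumor = normalized_expression(25.0, [20.0, 22.0])
reference = [-6.0, -5.0, -4.0, -3.0]
tumor_percentile = percentile_in_reference_set(tumor, reference)
```

For the global normalization approach mentioned in the text, the list passed as `reference_gene_cts` would simply hold the Ct values of all assayed genes (or a large subset) instead of the housekeeping genes.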
[0359]The breast cancer gene set is shown in Table 1. The gene Accession Numbers, and the SEQ ID NOs for the forward primer, reverse primer and amplicon sequences that can be used for gene amplification, are listed in Table 2. The basis for inclusion of markers, as well as the clinical significance of mRNA level variations with respect to the reference set, is indicated below. Genes are grouped into subsets based on the type of clinical significance indicated by their expression levels: A. Prediction of patient response to drugs used in breast cancer treatment, or to drugs that are approved for other indications and could be used off-label in the treatment of breast cancer. B. Prognostic for survival or recurrence of cancer.
C. Prediction of Patient Response to Therapeutic Drugs
[0360]1. Molecules that Specifically Influence Cellular Sensitivity to Drugs
[0361]Table 1 lists 74 genes (shown in italics) that specifically influence cellular sensitivity to potent drugs, which are also listed. Most of the drugs shown are approved and already used to treat breast cancer (e.g., anthracyclines; cyclophosphamide; methotrexate; 5-FU and analogues). Several of the drugs are used to treat breast cancer off-label or are in clinical development phase (e.g., bisphosphonates and anti-VEGF mAb). Several of the drugs have not been widely used to treat breast cancer but are used in other cancers in which the indicated target is expressed (e.g., Celebrex is used to treat familial colon cancer; cisplatin is used to treat ovarian and other cancers.)
[0362]Patient response to 5 FU is indicated if normalized thymidylate synthase mRNA amount is at or below the 15th percentile, or the sum of expression of thymidylate synthase plus dihydropyrimidine dehydrogenase is at or below the 25th percentile, or the sum of expression of these mRNAs plus thymidine phosphorylase is at or below the 20th percentile. Patients with dihydropyrimidine dehydrogenase below the 5th percentile are at risk of adverse response to 5 FU, or analogs such as Xeloda.
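The percentile cut-offs in this paragraph can be written out as a small rule check. The sketch below assumes that each percentile has already been determined against the reference set; the function name, argument names and return format are hypothetical.

```python
def assess_5fu(ts_pct, two_gene_sum_pct, three_gene_sum_pct, dpd_pct):
    """Apply the percentile cut-offs stated in the text.

    ts_pct             -- percentile of thymidylate synthase
    two_gene_sum_pct   -- percentile of the two-gene sum named in the text
    three_gene_sum_pct -- percentile of that sum plus thymidine phosphorylase
    dpd_pct            -- percentile of dihydropyrimidine dehydrogenase
    """
    response_indicated = (
        ts_pct <= 15 or two_gene_sum_pct <= 25 or three_gene_sum_pct <= 20
    )
    adverse_risk = dpd_pct < 5  # "below 5th percentile" in the text
    return {"response_indicated": response_indicated, "adverse_risk": adverse_risk}
```

The two outputs are kept separate because the text treats likelihood of response and risk of adverse response as distinct findings.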
[0363]When levels of thymidylate synthase and dihydropyrimidine dehydrogenase are within the acceptable range as defined in the preceding paragraph, amplification of c-myc mRNA in the upper 15%, against a background of wild-type p53 [as defined below], predicts a beneficial response to 5 FU (see D. Arango et al., Cancer Res. 61:4910-4915 (2001)). In the presence of normal levels of thymidylate synthase and dihydropyrimidine dehydrogenase, levels of NFκB and cIAP2 in the upper 10% indicate resistance of breast tumors to the chemotherapeutic drug 5 FU.
[0364]Patient resistance to anthracyclines is indicated if the normalized mRNA level of topoisomerase IIα is below the 10th percentile, or if the topoisomerase IIβ normalized mRNA level is below the 10th percentile or if the combined normalized topoisomerase IIα and β signals are below the 10th percentile.
[0365]Patient sensitivity to methotrexate is compromised if DHFR levels are more than tenfold higher than the average reference set level for this mRNA species, or if reduced folate carrier levels are below 10th percentile.
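Where expression is tracked on a log2 (Ct-based) scale, the tenfold DHFR comparison above can be sketched as below. The sketch assumes that one PCR cycle corresponds to a twofold change (100% amplification efficiency) and that the reference-set average is taken on the same log2 scale; both are assumptions of this illustration, as are the function names.

```python
def fold_over_reference(sample_log2, reference_mean_log2):
    """Fold difference implied by a log2-scale difference (assumes 2x per cycle)."""
    return 2.0 ** (sample_log2 - reference_mean_log2)


def dhfr_exceeds_tenfold(sample_log2, reference_mean_log2, fold_cutoff=10.0):
    """True if DHFR is more than tenfold above the reference-set average."""
    return fold_over_reference(sample_log2, reference_mean_log2) > fold_cutoff
```

Under these assumptions a tenfold difference corresponds to roughly 3.3 cycles, so a sample three cycles above the reference average (eightfold) would not meet the cut-off, while one four cycles above (sixteenfold) would.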
[0366]Patients whose tumors express CYP1B1 in the upper 10% have reduced likelihood of responding to docetaxel.
[0367]The sum of signals for aldehyde dehydrogenase 1A1 and 1A3, when more than tenfold higher than the reference set average, indicates reduced likelihood of response to cyclophosphamide.
[0368]Currently, estrogen and progesterone receptor expression as measured by immunohistochemistry is used to select patients for anti-estrogen therapy. We have demonstrated RT-PCR assays for estrogen and progesterone receptor mRNA levels that predict levels of these proteins as determined by standard clinical diagnostic tests, with a high degree of concordance (FIGS. 6 and 7).
[0369]Patients whose tumors express ERα or PR mRNA in the upper 70% are likely to respond to tamoxifen or other anti-estrogens (thus, operationally, lower levels of ERα than this are defined as ERα-negative). However, when the signal for microsomal epoxide hydrolase is in the upper 10% or when mRNAs for pS2/trefoil factor, GATA3 or human chorionic gonadotropin are at or below average levels found in ERα-negative tumors, anti-estrogen therapy will not be beneficial.
[0370]Absence of XIST signal compromises the likelihood of response to taxanes, as does elevation of the GST-π or prolyl endopeptidase [PREP] signal in the upper 10%. Elevation of PLAG1 in the upper 10% decreases sensitivity to taxanes.
[0371]Expression of ERCC1 mRNA in the upper 10% indicates significant risk of resistance to cisplatin or analogs.
[0372]An RT-PCR assay of Her2 mRNA expression predicts Her2 overexpression as measured by a standard diagnostic test, with a high degree of concordance (data not shown). Patients whose tumors express Her2 (normalized to Cyp1) in the upper 10% have increased likelihood of beneficial response to treatment with Herceptin or other ErbB2 antagonists. Measurement of expression of Grb7 mRNA serves as a test for HER2 gene amplification, because the Grb7 gene is closely linked to Her2. When Her2 expression is high as defined above in this paragraph, similarly elevated Grb7 indicates Her2 gene amplification. Overexpression of IGF1R and/or IGF1 or IGF2 decreases likelihood of beneficial response to Herceptin and also to EGFR antagonists.
[0373]Patients whose tumors express mutant Ha-Ras, and also express farnesyl pyrophosphate synthetase or geranyl pyrophosphate synthetase mRNAs at levels above the tenth percentile, comprise a group that is especially likely to exhibit a beneficial response to bisphosphonate drugs.
[0374]Cox2 is a key control enzyme in the synthesis of prostaglandins. It is frequently expressed at elevated levels in subsets of various types of carcinomas including carcinoma of the breast. Expression of this gene is controlled at the transcriptional level, so RT-PCR serves as a valid indicator of the cellular enzyme activity. Nonclinical research has shown that cox2 promotes tumor angiogenesis, suggesting that this enzyme is a promising drug target in solid tumors. Several Cox2 antagonists are marketed products for use in anti-inflammatory conditions. Treatment of familial adenomatous polyposis patients with the cox2 inhibitor Celebrex significantly decreased the number and size of neoplastic polyps. No cox2 inhibitor has yet been approved for treatment of breast cancer, but generally this class of drugs is safe and could be prescribed off-label in breast cancers in which cox2 is over-expressed. Tumors expressing COX2 at levels in the upper ten percentile have increased chance of beneficial response to Celebrex or other cyclooxygenase 2 inhibitors.
[0375]The tyrosine kinases ErbB1 [EGFR], ErbB3 [Her3] and ErbB4 [Her4]; also the ligands TGFalpha, amphiregulin, heparin-binding EGF-like growth factor, and epiregulin; also BRK, a non-receptor kinase. Several drugs in clinical development block the EGF receptor. ErbB2-4, the indicated ligands, and BRK also increase the activity of the EGFR pathway. Breast cancer patients whose tumors express high levels of EGFR or EGFR and abnormally high levels of the other indicated activators of the EGFR pathway are potential candidates for treatment with an EGFR antagonist.
[0376]Patients whose tumors express less than 10% of the average level of EGFR mRNA observed in the reference panel are relatively less likely to respond to EGFR antagonists [such as Iressa, or ImClone 225]. In cases in which the EGFR is above this low range, the additional presence of epiregulin, TGFα, amphiregulin, or ErbB3, or BRK, CD9, MMP9, or Lot1 at levels above the 90th percentile predisposes to response to EGFR antagonists. Epiregulin gene expression, in particular, is a good surrogate marker for EGFR activation, and can be used not only to predict response to EGFR antagonists, but also to monitor response to EGFR antagonists [taking fine needle biopsies to provide tumor tissue during treatment]. Levels of CD82 above the 90th percentile suggest poorer efficacy from EGFR antagonists.
[0377]The tyrosine kinases abl, c-kit, PDGFRalpha, PDGFRbeta, and ARG; also, the signal transmitting ligands c-kit ligand, PDGFA, B, C and D. The listed tyrosine kinases are all targets of the drug Gleevec® (imatinib mesylate, Novartis), and the listed ligands stimulate one or more of the listed tyrosine kinases. In the two indications for which Gleevec® is approved, tyrosine kinase targets (bcr-abl and ckit) are overexpressed and also contain activating mutations. A finding that one of the Gleevec® target tyrosine kinases is expressed in breast cancer tissue will prompt a second stage of analysis wherein the gene will be sequenced to determine whether it is mutated. That a mutation found is an activating mutation can be proved by methods known in the art, such as, for example, by measuring kinase enzyme activity or by measuring phosphorylation status of the particular kinase, relative to the corresponding wild-type kinase. Breast cancer patients whose tumors express high levels of mRNAs encoding Gleevec® target tyrosine kinases, specifically, in the upper ten percentile, or mRNAs for Gleevec® target tyrosine kinases in the average range and mRNAs for their cognate growth stimulating ligands in the upper ten percentile, are particularly good candidates for treatment with Gleevec®.
[0378]VEGF is a potent and pathologically important angiogenic factor. (See below under Prognostic Indicators.) When VEGF mRNA levels are in the upper ten percentile, aggressive treatment is warranted. Such levels particularly suggest the value of treatment with anti-angiogenic drugs, including VEGF antagonists, such as anti-VEGF antibodies. Additionally, KDR or CD31 mRNA level in the upper 20 percentile further increases likelihood of benefit from VEGF antagonists.
[0379]Farnesyl pyrophosphate synthetase and geranyl geranyl pyrophosphate synthetase. These enzymes are targets of commercialized bisphosphonate drugs, which were developed originally for treatment of osteoporosis but recently have begun to be prescribed off-label in breast cancer. Elevated levels of mRNAs encoding these enzymes in breast cancer tissue, above the 90th percentile, suggest use of bisphosphonates as a treatment option.
[0380]2. Multidrug Resistance Factors
[0381]These factors include 10 genes: gamma glutamyl cysteine synthetase [GCS]; GST-α; GST-π; MDR-1; MRP1-4; breast cancer resistance protein [BCRP]; lung resistance protein [MVP]; SXR; YB-1.
[0382]GCS and both GST-α and GST-π regulate glutathione levels, which decrease cellular sensitivity to chemotherapeutic drugs and other toxins by reductive derivatization. Glutathione is a necessary cofactor for multi-drug resistant pumps, MDR-1 and the MRPs. MDR1 and MRPs function to actively transport out of cells several important chemotherapeutic drugs used in breast cancer.
[0383]GSTs, MDR-1, and MRP-1 have all been studied extensively to determine whether they have prognostic or predictive significance in human cancer. However, a great deal of disagreement exists in the literature with respect to these questions. Recently, new members of the MRP family have been identified: MRP-2, MRP-3, MRP-4, BCRP, and lung resistance protein [major vault protein]. These have substrate specificities that overlap with those of MDR-1 and MRP-1. The incorporation of all of these relevant ABC family members as well as glutathione synthetic enzymes into the present invention captures the contribution of this family to drug resistance, in a way that single or double analyte assays cannot.
[0384]MRP-1, the gene coding for the multidrug resistance protein P-glycoprotein, is not regulated primarily at the transcriptional level. However, p-glycoprotein stimulates the transcription of PTP1b. An embodiment of the present invention is the use of the level of the mRNA for the phosphatase PTP1b as a surrogate measure of MRP-1/p-glycoprotein activity.
[0386]The gene SXR is also an activator of multidrug resistance, as it stimulates transcription of certain multidrug resistance factors.
[0387]The impact of multidrug resistance factors with respect to chemotherapeutic agents used in breast cancer is as follows. Beneficial response to doxorubicin is compromised when the mRNA levels of either MDR1, GSTα, GSTπ, SXR, BCRP, YB-1, or LRP/MVP are in the upper four percentile. Beneficial response to methotrexate is inhibited if mRNA levels of any of MRP1, MRP2, MRP3, or MRP4 or gamma-glutamyl cysteine synthetase are in the upper four percentile.
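The two marker lists in this paragraph lend themselves to a simple screen that reports which resistance factors exceed the cut-off. In the sketch below, "upper four percentile" is read as at or above the 96th percentile of the reference set, which is an interpretation rather than a definition given in the text; the gene labels, the function name, and the treatment of unmeasured genes as not flagged are likewise assumptions.

```python
DOXORUBICIN_MARKERS = ("MDR1", "GSTalpha", "GSTpi", "SXR", "BCRP", "YB-1", "LRP/MVP")
METHOTREXATE_MARKERS = ("MRP1", "MRP2", "MRP3", "MRP4", "GCS")


def flagged_markers(percentiles, markers, cutoff=96.0):
    """Return the markers whose reference-set percentile is at or above cutoff.

    percentiles maps gene label -> percentile; genes that were not measured
    are treated as not flagged (an assumption of this sketch).
    """
    return [gene for gene in markers if percentiles.get(gene, 0.0) >= cutoff]


# Invented percentiles for one tumor.
tumor_percentiles = {"MDR1": 97.5, "SXR": 50.0, "MRP2": 96.0, "GCS": 95.9}
doxorubicin_flags = flagged_markers(tumor_percentiles, DOXORUBICIN_MARKERS)
methotrexate_flags = flagged_markers(tumor_percentiles, METHOTREXATE_MARKERS)
```

A non-empty list for a drug would correspond to the "compromised" or "inhibited" response described in the text for that drug.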
[0388]3. Eukaryotic Translation Initiation Factor 4E [EIF4E]
[0389]EIF4E mRNA levels provide evidence of protein expression and so expand the capability of RT-PCR to indicate variation in gene expression. Thus, one claim of the present invention is the use of EIF4E as an added indicator of gene expression of certain genes [e.g., cyclinD1, mdm2, VEGF, and others]. For example, in two tissue specimens containing the same amount of normalized VEGF mRNA, it is likely that the tissue containing the higher normalized level of EIF4E exhibits the greater level of VEGF gene expression.
[0390]The background is as follows. A key point in the regulation of mRNA translation is selection of mRNAs by the EIF4G complex to bind to the 43S ribosomal subunit. The protein EIF4E [the m7G CAP-binding protein] is often limiting because more mRNAs than EIF4E copies exist in cells. Highly structured 5'UTRs or highly GC-rich ones are inefficiently translated, and these often code for genes that carry out functions relevant to cancer [e.g., cyclinD1, mdm2, and VEGF]. EIF4E is itself regulated at the transcriptional/mRNA level. Thus, expression of EIF4E provides added indication of increased activity of a number of proteins.
[0391]It is also noteworthy that overexpression of EIF4E transforms cultured cells, and hence EIF4E is an oncogene. Overexpression of EIF4E occurs in several different types of carcinomas but is particularly significant in breast cancer. EIF4E is typically expressed at very low levels in normal breast tissue.
D. Prognostic Indicators
[0392]1. DNA Repair Enzymes
[0393]Loss of BRCA1 or BRCA2 activity via mutation represents the critical oncogenic step in the most common type[s] of familial breast cancer. The levels of mRNAs of these important enzymes are abnormal in subsets of sporadic breast cancer as well. Loss of signals from either [to within the lower ten percentile] heightens risk of short survival.
[0394]2. Cell Cycle Regulators
[0395]Cell cycle regulators include 14 genes: c-MYC; c-Src; Cyclin D1; Ha-Ras; mdm2; p14ARF; p21WAF1/CIP; p16INK4a/p14; p23; p27; p53; PI3K; PKC-epsilon; PKC-delta.
[0396]The gene for p53 [TP53] is mutated in a large fraction of breast cancers. Frequently p53 levels are elevated when loss of function mutation occurs. When the mutation is dominant-negative, it creates survival value for the cancer cell because growth is promoted and apoptosis is inhibited. Thousands of different p53 mutations have been found in human cancer, and the functional consequences of many of them are not clear. A large body of academic literature addresses the prognostic and predictive significance of mutated p53 and the results are highly conflicting. The present invention provides a functional genomic measure of p53 activity, as follows. The activated wild type p53 molecule triggers transcription of the cell cycle inhibitor p21. Thus, the ratio of p53 to p21 should be low when p53 is wild-type and activated. When p53 is detectable and the ratio of p53 to p21 is elevated in tumors relative to normal breast, it signifies nonfunctional or dominant negative p53. The cancer literature provides evidence for this as borne out by poor prognosis.
[0397]Mdm2 is an important p53 regulator. Activated wildtype p53 stimulates transcription of mdm2. The mdm2 protein binds p53 and promotes its proteolytic destruction. Thus, abnormally low levels of mdm2 in the presence of normal or higher levels of p53 indicate that p53 is mutated and inactivated.
[0398]One aspect of the present invention is the use of ratios of mRNA levels p53:p21 and p53:mdm2 to provide a picture of p53 status. Evidence for dominant negative mutation of p53 (as indicated by high p53:p21 and/or high p53:mdm2 mRNA ratios, specifically in the upper ten percentile) presages higher risk of recurrence in breast cancer and therefore weighs toward a decision to use chemotherapy in node negative post surgery breast cancer.
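By way of non-limiting illustration, the ratio-and-percentile logic described above may be sketched as follows. The sketch assumes that mRNA levels are reported as threshold cycle (CT) values, that each PCR cycle corresponds to an approximately two-fold change in template, and that the upper ten percentile is evaluated against a reference set of tumors. The function names and the example CT values are hypothetical and are not taken from the present description.

```python
def log2_ratio_from_ct(ct_numerator, ct_denominator):
    """Return log2(numerator mRNA / denominator mRNA) from two CT values.

    Assumption: roughly two-fold amplification per cycle, so a lower CT
    means more template and the log2 ratio is a simple CT difference.
    """
    return ct_denominator - ct_numerator


def upper_fraction_flags(values, fraction=0.10):
    """Flag the values that fall in the top `fraction` of a reference set."""
    if not values:
        return []
    n_top = max(1, round(len(values) * fraction))
    cutoff = sorted(values)[-n_top]
    return [v >= cutoff for v in values]


# Hypothetical (p53 CT, p21 CT) pairs for ten reference tumors.
tumors = [(28.0, 28.5), (27.5, 27.0), (29.0, 29.2), (26.0, 31.0), (28.2, 28.0),
          (27.9, 28.4), (28.8, 28.1), (29.5, 29.9), (27.0, 27.3), (28.4, 28.6)]
ratios = [log2_ratio_from_ct(p53, p21) for p53, p21 in tumors]
flags = upper_fraction_flags(ratios)  # True marks a high p53:p21 ratio
```

The same helper can be applied to the p53:mdm2 ratio by substituting the mdm2 CT for the p21 CT.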
[0399]Another important cell cycle regulator is p27, which in the activated form blocks cell cycle progression at the level of cdk4. The protein is regulated primarily via phosphorylation/dephosphorylation, rather than at the transcriptional level. However, levels of p27 mRNAs do vary. Therefore a level of p27 mRNA in the upper ten percentile indicates reduced risk of recurrence of breast cancer post surgery.
[0400]Cyclin D1 is a principal positive regulator of entry into S phase of the cell cycle. The gene for cyclin D1 is amplified in about 20% of breast cancer patients, and therefore promotes tumor growth in those cases. One aspect of the present invention is use of cyclin D1 mRNA levels for diagnostic purposes in breast cancer. A level of cyclin D1 mRNA in the upper ten percentile suggests high risk of recurrence in breast cancer following surgery and suggests particular benefit of adjuvant chemotherapy.
[0401]3. Other Tumor Suppressors and Related Proteins
[0402]These include APC and E-cadherin. It has long been known that the tumor suppressor APC is lost in about 50% of colon cancers, with concomitant transcriptional upregulation of E-cadherin, an important cell adhesion molecule and growth suppressor. Recently, it has been found that the APC gene is silenced in 15-40% of breast cancers. Likewise, the E-cadherin gene is silenced [via CpG island methylation] in about 30% of breast cancers. An abnormally low level of APC and/or E-cadherin mRNA in the lower 5 percentile suggests high risk of recurrence in breast cancer following surgery and heightened risk of shortened survival.
[0403]4. Regulators of Apoptosis
[0404]These include Bcl/Bax family members Bcl2, Bcl-xL, Bak, Bax and related factors, NFκ-B and related factors, and also p53BP1/ASPP1 and p53BP2/ASPP2.
[0405]Bax and Bak are pro-apoptotic and Bcl2 and Bcl-xL are anti-apoptotic. Therefore, the ratios of these factors influence the resistance or sensitivity of a cell to toxic (pro-apoptotic) drugs. In breast cancer, unlike other cancers, elevated level of Bcl2 (in the upper ten percentile) correlates with good outcome. This reflects the fact that Bcl2 has growth inhibitory activity as well as anti-apoptotic activity, and in breast cancer the significance of the former activity outweighs the significance of the latter. The impact of Bcl2 is in turn dependent on the status of the growth stimulating transcription factor c-MYC. The gene for c-MYC is amplified in about 20% of breast cancers. When c-MYC message levels are abnormally elevated relative to Bcl2 (such that this ratio is in the upper ten percentile), then elevated level of Bcl2 mRNA is no longer a positive indicator.
[0406]NFκ-B is another important anti-apoptotic factor. Originally recognized as a pro-inflammatory transcription factor, it is now clear that it prevents programmed cell death in response to several extracellular toxic factors [such as tumor necrosis factor]. The activity of this transcription factor is regulated principally via phosphorylation/dephosphorylation events. However, levels of NFκ-B nevertheless do vary from cell to cell, and elevated levels should correlate with increased resistance to apoptosis. Importantly for present purposes, NFκ-B exerts its anti-apoptotic activity largely through its stimulation of transcription of mRNAs encoding certain members of the IAP [inhibitor of apoptosis] family of proteins, specifically cIAP1, cIAP2, XIAP, and Survivin. Thus, abnormally elevated levels of mRNAs for these IAPs and for NFκ-B [any in the upper 5 percentile] signify activation of the NFκ-B anti-apoptotic pathway. This suggests high risk of recurrence in breast cancer following chemotherapy and therefore poor prognosis. One embodiment of the present invention is the inclusion in the gene set of the above apoptotic regulators, and the above-outlined use of combinations and ratios of the levels of their mRNAs for prognosis in breast cancer.
[0407]The proteins p53BP1 and 2 bind to p53 and promote transcriptional activation of pro-apoptotic genes. The levels of p53BP1 and 2 are suppressed in a significant fraction of breast cancers, correlating with poor prognosis. When either is expressed in the lower tenth percentile poor prognosis is indicated.
[0408]5. Factors that Control Cell Invasion and Angiogenesis
[0409]These include uPA, PAI1, cathepsins B, G and L, scatter factor [HGF], c-met, KDR, VEGF, and CD31. The plasminogen activator uPA and its serpin regulator PAI1 promote breakdown of extracellular matrices and tumor cell invasion. Abnormally elevated levels of both mRNAs in malignant breast tumors (in the upper twenty percentile) signify an increased risk of shortened survival, increased recurrence in breast cancer patients post surgery, and increased importance of receiving adjuvant chemotherapy. On the other hand, node negative patients whose tumors do not express elevated levels of these mRNA species are less likely to have recurrence of this cancer and could more seriously consider whether the benefits of standard chemotherapy justify the associated toxicity.
[0410]Cathepsins B or L, when expressed in the upper ten percentile, predict poor disease-free and overall survival. In particular, cathepsin L predicts short survival in node positive patients.
[0411]Scatter factor and its cognate receptor c-met promote cell motility and invasion, cell growth, and angiogenesis. In breast cancer elevated levels of mRNAs encoding these factors should prompt aggressive treatment with chemotherapeutic drugs, when expression of either, or the combination, is above the 90th percentile.
[0412]VEGF is a central positive regulator of angiogenesis, and elevated levels in solid tumors predict short survival [note many references showing that elevated level of VEGF predicts short survival]. Inhibitors of VEGF therefore slow the growth of solid tumors in animals and humans. VEGF activity is controlled at the level of transcription. VEGF mRNA levels in the upper ten percentile indicate significantly worse than average prognosis. Other markers of vascularization, CD31 [PECAM], and KDR indicate high vessel density in tumors and that the tumor will be particularly malignant and aggressive, and hence that an aggressive therapeutic strategy is warranted.
[0413]6. Markers for Immune and Inflammatory Cells and Processes
[0414]These markers include the genes for Immunoglobulin light chain λ, CD18, CD3, CD68, Fas [CD95], and Fas Ligand.
[0415]Several lines of evidence suggest that the mechanisms of action of certain drugs used in breast cancer entail activation of the host immune/inflammatory response (For example, Herceptin®). One aspect of the present invention is the inclusion in the gene set of markers for inflammatory and immune cells, and markers that predict tumor resistance to immune surveillance. Immunoglobulin light chain lambda is a marker for immunoglobulin producing cells. CD18 is a marker for all white cells. CD3 is a marker for T-cells. CD68 is a marker for macrophages.
[0416]CD95 and Fas ligand are a receptor:ligand pair that mediates one of two major pathways by which cytotoxic T cells and NK cells kill targeted cells. Decreased expression of CD95 and increased expression of Fas Ligand indicate poor prognosis in breast cancer. Both CD95 and Fas Ligand are transmembrane proteins, and need to be membrane anchored to trigger cell death. Certain tumor cells produce a truncated soluble variant of CD95, created as a result of alternative splicing of the CD95 mRNA. This blocks NK cell and cytotoxic T cell Fas Ligand-mediated killing of the tumor cells. Presence of soluble CD95 correlates with poor survival in breast cancer. The gene set includes both soluble and full-length variants of CD95.
[0417]7. Cell Proliferation Markers
[0418]The gene set includes the cell proliferation markers Ki67/MiB1, PCNA, Pin1, and thymidine kinase. High levels of expression of proliferation markers associate with high histologic grade and short survival. High levels of thymidine kinase in the upper ten percentile suggest increased risk of short survival. Pin1 is a prolyl isomerase that stimulates cell growth, in part through the transcriptional activation of the cyclin D1 gene, and levels in the upper ten percentile contribute to a negative prognostic profile.
[0419]8. Other Growth Factors and Receptors
[0420]This gene set includes IGF1, IGF2, IGFBP3, IGF1R, FGF2, FGFR1, CSF-1R/fms, CSF-1, IL6 and IL8. All of these proteins are expressed in breast cancer. Most stimulate tumor growth. However, expression of the growth factor FGF2 correlates with good outcome. Some have anti-apoptotic activity, prominently IGF1. Activation of the IGF1 axis via elevated IGF1, IGF1R, or IGFBP3 (as indicated by the sum of these signals in the upper ten percentile) inhibits tumor cell death and strongly contributes to a poor prognostic profile.
[0421]9. Gene Expression Markers that Define Subclasses of Breast Cancer
[0422]These include: GRO1 oncogene alpha, Grb7, cytokeratins 5 and 17, retinol binding protein 4, hepatocyte nuclear factor 3, integrin alpha 7, and lipoprotein lipase. These markers subset breast cancer into different cell types that are phenotypically different at the level of gene expression. Tumors expressing signals for Bcl2, hepatocyte nuclear factor 3, LIV1 and ER above the mean have the best prognosis for disease free and overall survival following surgical removal of the cancer. Another category of breast cancer tumor type, characterized by elevated expression of lipoprotein lipase, retinol binding protein 4, and integrin α7, carries intermediate prognosis. Tumors expressing either elevated levels of cytokeratins 5 and 17, GRO oncogene at levels four-fold or greater above the mean, or ErbB2 and Grb7 at levels ten-fold or more above the mean, have worst prognosis.
[0423]Although throughout the present description, including the Examples below, various aspects of the invention are explained with reference to gene expression studies, the invention can be performed in a similar manner, and similar results can be reached by applying proteomics techniques that are well known in the art. The proteome is the totality of the proteins present in a sample (e.g. tissue, organism, or cell culture) at a certain point of time. Proteomics includes, among other things, study of the global changes of protein expression in a sample (also referred to as "expression proteomics"). Proteomics typically includes the following steps: (1) separation of individual proteins in a sample by 2-D gel electrophoresis (2-D PAGE); (2) identification of the individual proteins recovered from the gel, e.g. by mass spectrometry and/or N-terminal sequencing, and (3) analysis of the data using bioinformatics. Proteomics methods are valuable supplements to other methods of gene expression profiling, and can be used, alone or in combination with other methods of the present invention, to detect the products of the gene markers of the present invention.
[0424]Further details of the invention will be described in the following non-limiting Examples.
Example 1
Isolation of RNA from Formalin-Fixed, Paraffin-Embedded (FPET) Tissue Specimens
[0425]A. Protocols
[0426]I. EPICENTRE® Xylene Protocol
[0427]RNA Isolation
[0428](1) Cut 1-6 sections (each 10 μm thick) of paraffin-embedded tissue per sample using a clean microtome blade and place into a 1.5 ml eppendorf tube.
[0429](2) To extract paraffin, add 1 ml of xylene and invert the tubes for 10 minutes by rocking on a nutator.
[0430](3) Pellet the sections by centrifugation for 10 minutes at 14,000×g in an eppendorf microcentrifuge.
[0431](4) Remove the xylene, leaving some in the bottom to avoid dislodging the pellet.
[0432](5) Repeat steps 2-4.
[0433](6) Add 1 ml of 100% ethanol and invert for 3 minutes by rocking on the nutator.
[0434](7) Pellet the debris by centrifugation for 10 minutes at 14,000×g in an eppendorf microcentrifuge.
[0435](8) Remove the ethanol, leaving some at the bottom to avoid dislodging the pellet.
[0436](9) Repeat steps 6-8 twice.
[0437](10) Remove all of the remaining ethanol.
[0438](11) For each sample, add 2 μl of 50 μg/μl Proteinase K to 300 μl of Tissue and Cell Lysis Solution.
[0439](12) Add 300 μl of Tissue and Cell Lysis Solution containing the Proteinase K to each sample and mix thoroughly.
[0440](13) Incubate at 65° C. for 90 minutes (vortex mixing every 5 minutes). Visually monitor the remaining tissue fragment. If still visible after 30 minutes, add an additional 2 μl of 50 μg/μl Proteinase K and continue incubating at 65° C. until fragment dissolves.
[0441](14) Place the samples on ice for 3-5 minutes and proceed with protein removal and total nucleic acid precipitation.
[0442]Protein Removal and Precipitation of Total Nucleic Acid
[0443](1) Add 150 μl of MPC Protein Precipitation Reagent to each lysed sample and vortex vigorously for 10 seconds.
[0444](2) Pellet the debris by centrifugation for 10 minutes at 14,000×g in an eppendorf microcentrifuge.
[0445](3) Transfer the supernatant into clean eppendorf tubes and discard the pellet.
[0446](4) Add 500 μl of isopropanol to the recovered supernatant and thoroughly mix by rocking on the nutator for 3 minutes.
[0447](5) Pellet the RNA/DNA by centrifugation at 4° C. for 10 minutes at 14,000×g in an eppendorf microcentrifuge.
[0448](6) Remove all of the isopropanol with a pipet, being careful not to dislodge the pellet.
[0449]Removal of Contaminating DNA from RNA Preparations
[0450](1) Prepare 200 μl of DNase I solution for each sample by adding 5 μl of RNase-Free DNase I (1 U/μl) to 195 μl of 1×DNase Buffer.
[0451](2) Completely resuspend the pelleted RNA in 200 μl of DNase I solution by vortexing.
[0452](3) Incubate the samples at 37° C. for 60 minutes.
[0453](4) Add 200 μl of 2× T and C Lysis Solution to each sample and vortex for 5 seconds.
[0454](5) Add 200 μl of MPC Protein Precipitation Reagent, mix by vortexing for 10 seconds and place on ice for 3-5 minutes.
[0455](6) Pellet the debris by centrifugation for 10 minutes at 14,000×g in an eppendorf microcentrifuge.
[0456](7) Transfer the supernatant containing the RNA to clean eppendorf tubes and discard the pellet. (Be careful to avoid transferring the pellet.)
[0457](8) Add 500 μl of isopropanol to each supernatant and rock samples on the nutator for 3 minutes.
[0458](9) Pellet the RNA by centrifugation at 4° C. for 10 minutes at 14,000×g in an eppendorf microcentrifuge.
[0459](10) Remove the isopropanol, leaving some at the bottom to avoid dislodging the pellet.
[0460](11) Rinse twice with 1 ml of 75% ethanol. Centrifuge briefly if the RNA pellet is dislodged.
[0461](12) Remove ethanol carefully.
[0462](13) Set under fume hood for about 3 minutes to remove residual ethanol.
[0463](14) Resuspend the RNA in 30 μl of TE Buffer and store at -30° C.
[0464]II. Hot Wax/Urea Protocol of the Invention
[0465]RNA Isolation
[0466](1) Cut 3 sections (each 10 μm thick) of paraffin-embedded tissue using a clean microtome blade and place into a 1.5 ml eppendorf tube.
[0467](2) Add 300 μl of lysis buffer (10 mM Tris 7.5, 0.5% sodium lauroyl sarcosine, 0.1 mM EDTA pH 7.5, 4M Urea) containing 330 μg/ml Proteinase K (added freshly from a 50 μg/μl stock solution) and vortex briefly.
[0468](3) Incubate at 65° C. for 90 minutes (vortex mixing every 5 minutes). Visually monitor the tissue fragment. If still visible after 30 minutes, add an additional 2 μl of 50 μg/μl Proteinase K and continue incubating at 65° C. until fragment dissolves.
[0469](4) Centrifuge for 5 minutes at 14,000×g and transfer upper aqueous phase to new tube, being careful not to disrupt the paraffin seal.
[0470](5) Place the samples on ice for 3-5 minutes and proceed with protein removal and total nucleic acid precipitation.
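By way of non-limiting illustration, the Proteinase K concentration recited in step (2) above may be checked with a short calculation. The sketch assumes that 2 μl of the 50 μg/μl stock is added per 300 μl of lysis buffer (the 2 μl volume is taken from the supplemental addition in step (3) and is an assumption for step (2)); the function name is hypothetical.

```python
def final_concentration_ug_per_ml(stock_ug_per_ul, stock_volume_ul, buffer_volume_ul):
    """Concentration after adding a stock solution to a buffer volume."""
    mass_ug = stock_ug_per_ul * stock_volume_ul
    total_volume_ml = (stock_volume_ul + buffer_volume_ul) / 1000.0
    return mass_ug / total_volume_ml


# Assumed volumes: 2 ul of 50 ug/ul Proteinase K into 300 ul of lysis buffer.
proteinase_k = final_concentration_ug_per_ml(50.0, 2.0, 300.0)
print(f"{proteinase_k:.0f} ug/ml")  # close to the 330 ug/ml stated in step (2)
```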
[0471]Protein Removal and Precipitation of Total Nucleic Acid
[0472](1) Add 150 μl of 7.5M NH4OAc to each lysed sample and vortex vigorously for 10 seconds.
[0473](2) Pellet the debris by centrifugation for 10 minutes at 14,000×g in an eppendorf microcentrifuge.
[0474](3) Transfer the supernatant into clean eppendorf tubes and discard the pellet.
[0475](4) Add 500 μl of isopropanol to the recovered supernatant and thoroughly mix by rocking on the nutator for 3 minutes.
[0476](5) Pellet the RNA/DNA by centrifugation at 4° C. for 10 minutes at 14,000×g in an eppendorf microcentrifuge.
[0477](6) Remove all of the isopropanol with a pipet, being careful not to dislodge the pellet.
[0478]Removal of Contaminating DNA from RNA Preparations
[0479](1) Add 45 μl of 1×DNase I buffer (10 mM Tris-Cl, pH 7.5, 2.5 mM MgCl2, 0.1 mM CaCl2) and 5 μl of RNase-Free DNase I (2 U/μl, Ambion) to each sample.
[0480](2) Incubate the samples at 37° C. for 60 minutes. Inactivate the DNaseI by heating at 70° C. for 5 minutes.
[0481]B. Results
[0482]Experimental evidence demonstrates that the hot RNA extraction protocol of the invention does not compromise RNA yield. Using 19 FPE breast cancer specimens, extracting RNA from three adjacent sections in the same specimens, RNA yields were measured via capillary electrophoresis with fluorescence detection (Agilent Bioanalyzer). Average RNA yields in nanograms and standard deviations with the invented and commercial methods, respectively, were: 139+/-21 versus 141+/-34.
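By way of non-limiting illustration, the similarity of the two reported yields may be examined from the summary statistics alone. The sketch assumes 19 measurements per method and treats the two groups as independent (a Welch-type statistic); because adjacent sections of the same specimens were used, a paired analysis would be more appropriate but would require the per-specimen values, which are not given here. The function name is hypothetical.

```python
import math


def welch_t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch t statistic computed from means, standard deviations and counts."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Invented method: 139 +/- 21 ng; commercial method: 141 +/- 34 ng; n = 19 assumed.
t_yield = welch_t_from_summary(139.0, 21.0, 19, 141.0, 34.0, 19)
print(round(t_yield, 2))
```

A statistic this close to zero is consistent with the statement that the protocol of the invention does not compromise RNA yield.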
[0483]Also, it was found that the urea-containing lysis buffer of the present invention can be substituted for the EPICENTRE® T&C lysis buffer, and the 7.5 M NH4OAc reagent used for protein precipitation in accordance with the present invention can be substituted for the EPICENTRE® MPC protein precipitation solution with neither significant compromise of RNA yield nor TaqMan® efficiency.
Example 2
Amplification of mRNA Species Prior to RT-PCR
[0484]The method described in section 10 above was used with RNA isolated from fixed, paraffin-embedded breast cancer tissue. TaqMan® analyses were performed with first strand cDNA generated with the T7-GSP primer (unamplified (T7-GSPr)) and with T7 amplified RNA (amplified (T7-GSPr)). RNA was amplified according to step 2 of FIG. 4. As a control, TaqMan® was also performed with cDNA generated with an unmodified GSPr (amplified (GSPr)). An equivalent amount of initial template (1 ng/well) was used in each TaqMan® reaction.
[0485]The results are shown in FIG. 8. In vitro transcription increased RT-PCR signal intensity by more than 10 fold, and for certain genes by more than 100 fold relative to controls in which the RT-PCR primers were the same primers used in method 2 for the generation of double-stranded DNA for in vitro transcription (GSP-T7r and GSPf). Also shown in FIG. 8 are RT-PCR data generated when standard optimized RT-PCR primers (i.e., lacking T7 tails) were used. As shown, compared to this control, the new method yielded substantial increases in RT-PCR signal (from 4 to 64 fold in this experiment).
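By way of non-limiting illustration, fold increases in RT-PCR signal such as those reported above may be related to differences in threshold cycle (CT). The sketch assumes an amplification efficiency of two-fold per cycle, which is an idealization, and the function names are hypothetical.

```python
import math


def fold_change_from_delta_ct(delta_ct, efficiency=2.0):
    """Fold change in template implied by a CT difference of `delta_ct` cycles."""
    return efficiency ** delta_ct


def delta_ct_from_fold_change(fold, efficiency=2.0):
    """CT difference (in cycles) implied by a given fold change."""
    return math.log(fold) / math.log(efficiency)


# Under the two-fold assumption, 4-fold and 64-fold increases correspond to
# CT shifts of 2 and 6 cycles, respectively.
low_shift = delta_ct_from_fold_change(4.0)
high_shift = delta_ct_from_fold_change(64.0)
```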
[0486]The new method requires that each T7-GSP sequence be optimized so that the increase in the RT-PCR signal is the same for each gene, relative to the standard optimized RT-PCR (with non-T7 tailed primers).
Example 3
A Study of Gene Expression in Premalignant and Malignant Breast Tumors
[0487]A gene expression study was designed and conducted with the primary goal to molecularly characterize gene expression in paraffin-embedded, fixed tissue samples of invasive breast ductal carcinoma, and to explore the correlation between such molecular profiles and disease-free survival. A further objective of the study was to compare the molecular profiles in tissue samples of invasive breast cancer with the molecular profiles obtained in ductal carcinoma in situ. The study was further designed to obtain data on the molecular profiles in lobular carcinoma in situ and in paraffin-embedded, fixed tissue samples of invasive lobular carcinoma.
[0488]Molecular assays were performed on paraffin-embedded, formalin-fixed primary breast tumor tissues obtained from 202 individual patients diagnosed with breast cancer. All patients underwent surgery with diagnosis of invasive ductal carcinoma of the breast, pure ductal carcinoma in situ (DCIS), lobular carcinoma of the breast, or pure lobular carcinoma in situ (LCIS). Patients were included in the study only if histopathologic assessment, performed as described in the Materials and Methods section, indicated adequate amounts of tumor tissue and homogeneous pathology.
[0489]The individuals participating in the study were divided into the following groups:
[0490]Group 1: Pure ductal carcinoma in situ (DCIS); n=18
[0491]Group 2: Invasive ductal carcinoma n=130
[0492]Group 3: Pure lobular carcinoma in situ (LCIS); n=7
[0493]Group 4: Invasive lobular carcinoma n=16
Materials and Methods
[0494]Each representative tumor block was characterized by standard histopathology for diagnosis, semi-quantitative assessment of amount of tumor, and tumor grade. A total of 6 sections (10 microns in thickness each) were prepared and placed in two Costar Brand Microcentrifuge Tubes (Polypropylene, 1.7 mL tubes, clear; 3 sections in each tube). If the tumor constituted less than 30% of the total specimen area, the sample may have been crudely dissected by the pathologist, using gross microdissection, putting the tumor tissue directly into the Costar tube.
[0495]If more than one tumor block was obtained as part of the surgical procedure, all tumor blocks were subjected to the same characterization, as described above, and the block most representative of the pathology was used for analysis.
Gene Expression Analysis
[0496]mRNA was extracted and purified from fixed, paraffin-embedded tissue samples, and prepared for gene expression analysis as described in chapters 7-11 above. Molecular assays of quantitative gene expression were performed by RT-PCR, using the ABI PRISM 7900® Sequence Detection System® (Perkin-Elmer-Applied Biosystems, Foster City, Calif., USA). ABI PRISM 7900® consists of a thermocycler, laser, charge-coupled device (CCD) camera, and computer. The system amplifies samples in a 384-well format on a thermocycler. During amplification, laser-induced fluorescent signal is collected in real-time through fiber optics cables for all 384 wells, and detected at the CCD. The system includes software for running the instrument and for analyzing the data.
Analysis and Results
[0497]Tumor tissue was analyzed for 185 cancer-related genes and 7 reference genes. The threshold cycle (CT) values for each patient were normalized based on the median of all genes for that particular patient. Clinical outcome data were available for all patients from a review of registry data and selected patient charts. Outcomes were classified as:
0 died due to breast cancer or to unknown cause or alive with breast cancer recurrence;
1 alive without breast cancer recurrence or died due to a cause other than breast cancer
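By way of non-limiting illustration, the per-patient normalization of CT values described above may be sketched as follows. The sketch assumes that normalization means subtracting the median CT of all genes measured for that patient; the present description does not state the exact arithmetic, and the function name and gene labels are hypothetical.

```python
from statistics import median


def normalize_ct(ct_by_gene):
    """Center one patient's CT values on the median CT of all genes measured."""
    center = median(ct_by_gene.values())
    return {gene: ct - center for gene, ct in ct_by_gene.items()}


# Hypothetical raw CT values for a single patient.
patient_ct = {"geneA": 30.0, "geneB": 28.0, "geneC": 26.0}
normalized = normalize_ct(patient_ct)
```

Centering each patient on an internal median is one way to compensate for differences in the amount and quality of RNA recovered from each specimen.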
[0498]Analysis was performed by:
1. Analysis of the relationship between normalized gene expression and the binary outcomes of 0 or 1.
2. Analysis of the relationship between normalized gene expression and the time to outcome (0 or 1 as defined above) where patients who were alive without breast cancer recurrence or who died due to a cause other than breast cancer were censored. This approach was used to evaluate the prognostic impact of individual genes and also sets of multiple genes.
Analysis of 146 Patients with Invasive Breast Carcinoma by Binary Approach
[0499]In the first (binary) approach, analysis was performed on all 146 patients with invasive breast carcinoma. A t-test was performed on the group of patients classified as 0 or 1 and the p-values for the differences between the groups for each gene were calculated.
[0500]The following Table 4 lists the 44 genes for which the p-value for the differences between the groups was <0.05.
TABLE 4
Gene/SEQ ID NO:   Mean CT Alive   Mean CT Deceased   t-value   Degrees of freedom   p
FOXM1             33.66           32.52               3.92     144                  0.0001
PRAME             35.45           33.84               3.71     144                  0.0003
Bcl2              28.52           29.32              -3.53     144                  0.0006
STK15             30.82           30.10               3.49     144                  0.0006
CEGP1             29.12           30.86              -3.39     144                  0.0009
Ki-67             30.57           29.62               3.34     144                  0.0011
GSTM1             30.62           31.63              -3.27     144                  0.0014
CA9               34.96           33.54               3.18     144                  0.0018
PR                29.56           31.22              -3.16     144                  0.0019
BBC3              31.54           32.10              -3.10     144                  0.0023
NME1              27.31           26.68               3.04     144                  0.0028
SURV              31.64           30.68               2.92     144                  0.0041
GATA3             26.06           26.99              -2.91     144                  0.0042
TFRC              28.96           28.48               2.87     144                  0.0047
YB-1              26.72           26.41               2.79     144                  0.0060
DPYD              28.51           28.84              -2.67     144                  0.0084
GSTM3             28.21           29.03              -2.63     144                  0.0095
RPS6KB1           31.18           30.61               2.61     144                  0.0099
Src               27.97           27.69               2.59     144                  0.0105
Chk1              32.63           31.99               2.57     144                  0.0113
ID1               28.73           29.13              -2.48     144                  0.0141
EstR1             24.22           25.40              -2.44     144                  0.0160
p27               27.15           27.51              -2.41     144                  0.0174
CCNB1             31.63           30.87               2.40     144                  0.0176
XIAP              30.27           30.51              -2.40     144                  0.0178
Chk2              31.48           31.11               2.39     144                  0.0179
CDC25B            29.75           29.39               2.37     144                  0.0193
IGF1R             28.85           29.44              -2.34     144                  0.0209
AK055699          33.23           34.11              -2.28     144                  0.0242
PI3KC2A           31.07           31.42              -2.25     144                  0.0257
TGFB3             28.42           28.85              -2.25     144                  0.0258
BAGI1             28.40           28.75              -2.24     144                  0.0269
CYP3A4            35.70           35.32               2.17     144                  0.0317
EpCAM             28.73           28.34               2.16     144                  0.0321
VEGFC             32.28           31.82               2.16     144                  0.0326
pS2               28.96           30.60              -2.14     144                  0.0341
hENT1             27.19           26.91               2.12     144                  0.0357
WISP1             31.20           31.64              -2.10     144                  0.0377
HNF3A             27.89           28.64              -2.09     144                  0.0384
NFKBp65           33.22           33.80              -2.08     144                  0.0396
BRCA2             33.06           32.62               2.08     144                  0.0397
EGFR              30.68           30.13               2.06     144                  0.0414
TK1               32.27           31.72               2.02     144                  0.0453
VDR               30.08           29.73               1.99     144                  0.0488
[0501]In the foregoing Table 4, lower (negative) t-values indicate higher expression (or lower CTs), associated with better outcomes, and, inversely, higher (positive) t-values indicate higher expression (lower CTs) associated with worse outcomes. Thus, for example, elevated expression of the FOXM1 gene (t-value=3.92, CT mean alive>CT mean deceased) indicates a reduced likelihood of disease free survival. Similarly, elevated expression of the CEGP1 gene (t-value=-3.39; CT mean alive<CT mean deceased) indicates an increased likelihood of disease free survival.
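By way of non-limiting illustration, the sign convention described above may be reproduced with a two-sample t-test on CT values. The sketch assumes a pooled-variance (Student) test with the statistic computed as the alive-group mean minus the deceased-group mean, which is consistent with 144 degrees of freedom for 146 patients; the exact test variant used is not stated in the present description, and the function name and toy CT values are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t_test(alive_ct, deceased_ct):
    """Return (t, degrees of freedom) for alive-minus-deceased mean CT."""
    n1, n2 = len(alive_ct), len(deceased_ct)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(alive_ct)
                  + (n2 - 1) * variance(deceased_ct)) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean(alive_ct) - mean(deceased_ct)) / standard_error, df


# Toy data: the deceased group has lower CTs (higher expression), so t > 0,
# the pattern associated in the text above with worse outcome.
t_value, df = pooled_t_test([33.0, 34.0, 35.0], [31.0, 32.0, 33.0])
```

Because a lower CT corresponds to higher expression, a positive t-value under this convention means the gene was more highly expressed in the group with the worse outcome.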
[0502]Based on the data set forth in Table 4, the overexpression of any of the following genes in breast cancer indicates a reduced likelihood of survival without cancer recurrence following surgery: FOXM1; PRAME; STK15; Ki-67; CA9; NME1; SURV; TFRC; YB-1; RPS6KB1; Src; Chk1; CCNB1; Chk2; CDC25B; CYP3A4; EpCAM; VEGFC; hENT1; BRCA2; EGFR; TK1; VDR.
[0503]Based on the data set forth in Table 4, the overexpression of any of the following genes in breast cancer indicates a better prognosis for survival without cancer recurrence following surgery: Bcl2; CEGP1; GSTM1; PR; BBC3; GATA3; DPYD; GSTM3; ID1; EstR1; p27; XIAP; IGF1R; AK055699; PI3KC2A; TGFB3; BAGI1; pS2; WISP1; HNF3A; NFKBp65.
[0504]Analysis of 108 ER Positive Patients by Binary Approach
[0505]108 patients with normalized CT for estrogen receptor (ER)<25.2 (i.e., ER positive patients) were subjected to separate analysis. A t-test was performed on the groups of patients classified as 0 or 1 and the p-values for the differences between the groups for each gene were calculated. The following Table 5 lists the 12 genes where the p-value for the differences between the groups was <0.05.
TABLE 5
Gene/SEQ ID NO:   Mean CT Alive   Mean CT Deceased   t-value   Degrees of freedom   p
PRAME             35.54           33.88               3.03     106                  0.0031
Bcl2              28.24           28.87              -2.70     106                  0.0082
FOXM1             33.82           32.85               2.66     106                  0.0089
DIABLO            30.33           30.71              -2.47     106                  0.0153
EPHX1             28.62           28.03               2.44     106                  0.0163
HIF1A             29.37           28.88               2.40     106                  0.0180
VEGFC             32.39           31.69               2.39     106                  0.0187
Ki-67             30.73           29.82               2.38     106                  0.0191
IGF1R             28.60           29.18              -2.37     106                  0.0194
VDR               30.14           29.60               2.17     106                  0.0322
NME1              27.34           26.80               2.03     106                  0.0452
GSTM3             28.08           28.92              -2.00     106                  0.0485
[0506]For each gene, a classification algorithm was utilized to identify the best threshold value (CT) for using each gene alone in predicting clinical outcome.
[0507]Based on the data set forth in Table 5, overexpression of the following genes in ER-positive cancer is indicative of a reduced likelihood of survival without cancer recurrence following surgery: PRAME; FOXM1; EPHX1; HIF1A; VEGFC; Ki-67; VDR; NME1. Some of these genes (PRAME; FOXM1; VEGFC; Ki-67; VDR; and NME1) were also identified as indicators of poor prognosis in the previous analysis, not limited to ER-positive breast cancer. The overexpression of the remaining genes (EPHX1 and HIF1A) appears to be a negative indicator of disease free survival in ER-positive breast cancer only. Based on the data set forth in Table 5, overexpression of the following genes in ER-positive cancer is indicative of a better prognosis for survival without cancer recurrence following surgery: Bcl-2; DIABLO; IGF1R; GSTM3. Of the latter genes, Bcl-2; IGF1R; and GSTM3 have also been identified as indicators of good prognosis in the previous analysis, not limited to ER-positive breast cancer. The overexpression of DIABLO appears to be a positive indicator of disease free survival in ER-positive breast cancer only.
[0508]Analysis of Multiple Genes and Indicators of Outcome
[0509]Two approaches were taken in order to determine whether using multiple genes would provide better discrimination between outcomes.
[0510]First, a discrimination analysis was performed using a forward stepwise approach. Models were generated that classified outcome with greater discrimination than was obtained with any single gene alone.
[0511]According to a second approach (time-to-event approach), for each gene a Cox Proportional Hazards model (see, e.g. Cox, D. R., and Oakes, D. (1984), Analysis of Survival Data, Chapman and Hall, London, N.Y.) was defined with time to recurrence or death as the dependent variable, and the expression level of the gene as the independent variable. The genes that have a p-value<0.05 in the Cox model were identified. For each gene, the Cox model provides the relative risk (RR) of recurrence or death for a unit change in the expression of the gene. One can choose to partition the patients into subgroups at any threshold value of the measured expression (on the CT scale), where all patients with expression values above the threshold have higher risk, and all patients with expression values below the threshold have lower risk, or vice versa, depending on whether the gene is an indicator of good (RR>1.01) or poor (RR<1.01) prognosis. Thus, any threshold value will define subgroups of patients with respectively increased or decreased risk. The results are summarized in the following Tables 6 and 7.
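The partitioning described above can be sketched as follows. The sketch assumes that the relative risk is expressed per one-unit increase on the CT scale, which is consistent with good-prognosis genes having RR above the 1.01 cut-off quoted in the text, but this reading is an assumption. The thresholds and patient CT values are hypothetical; only the RR values are taken from Table 6.

```python
def risk_group(ct, threshold, relative_risk, cutoff=1.01):
    """Assign 'higher' or 'lower' risk for one gene measured on the CT scale.

    Assumes relative_risk is the Cox RR per one-unit increase in CT, so an RR
    above the cutoff means risk rises with CT and an RR below it means risk
    falls with CT.
    """
    above_threshold = ct > threshold
    risk_rises_with_ct = relative_risk > cutoff
    return "higher" if above_threshold == risk_rises_with_ct else "lower"


def scaled_relative_risk(relative_risk, delta_ct):
    """RR for a change of delta_ct units; the Cox model is log-linear in CT."""
    return relative_risk ** delta_ct


# RR values from Table 6; thresholds and patient CT values are hypothetical.
bcl2_group = risk_group(ct=29.0, threshold=28.5, relative_risk=1.66)
foxm1_group = risk_group(ct=33.5, threshold=33.0, relative_risk=0.58)
bcl2_two_units = scaled_relative_risk(1.66, 2)
```

Because the model is log-linear, moving the threshold changes which patients fall in each subgroup but not the direction of the risk difference between them.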
TABLE 6
Cox Model Results for 146 Patients with Invasive Breast Cancer
Gene          Relative Risk (RR)   SE Relative Risk   p value
FOXM1         0.58                 0.15               0.0002
STK15         0.51                 0.20               0.0006
PRAME         0.78                 0.07               0.0007
Bcl2          1.66                 0.15               0.0009
CEGP1         1.25                 0.07               0.0014
GSTM1         1.40                 0.11               0.0014
Ki67          0.62                 0.15               0.0016
PR            1.23                 0.07               0.0017
Contig51037   0.81                 0.07               0.0022
NME1          0.64                 0.15               0.0023
YB-1          0.39                 0.32               0.0033
TFRC          0.53                 0.21               0.0035
BBC3          1.72                 0.19               0.0036
GATA3         1.32                 0.10               0.0039
CA9           0.81                 0.07               0.0049
SURV          0.69                 0.13               0.0049
DPYD          2.58                 0.34               0.0052
RPS6KB1       0.60                 0.18               0.0055
GSTM3         1.36                 0.12               0.0078
Src.2         0.39                 0.36               0.0094
TGFB3         1.61                 0.19               0.0109
CDC25B        0.54                 0.25               0.0122
XIAP          3.20                 0.47               0.0126
CCNB1         0.68                 0.16               0.0151
IGF1R         1.42                 0.15               0.0153
Chk1          0.68                 0.16               0.0155
ID1           1.80                 0.25               0.0164
p27           1.69                 0.22               0.0168
Chk2          0.52                 0.27               0.0175
EstR1         1.17                 0.07               0.0196
HNF3A         1.21                 0.08               0.0206
pS2           1.12                 0.05               0.0230
BAG1          1.88                 0.29               0.0266
AK055699      1.24                 0.10               0.0276
hENT1         0.51                 0.31               0.0293
EpCAM         0.62                 0.22               0.0310
WISP1         1.39                 0.16               0.0338
VEGFC         0.62                 0.23               0.0364
TK1           0.73                 0.15               0.0382
NFKBp65       1.32                 0.14               0.0384
BRCA2         0.66                 0.20               0.0404
CYP3A4        0.60                 0.25               0.0417
EGFR          0.72                 0.16               0.0436
TABLE 7
Cox Model Results for 108 Patients with ER+ Invasive Breast Cancer
Gene          Relative Risk (RR)   SE Relative Risk   p-value
PRAME         0.75                 0.10               0.0045
Contig51037   0.75                 0.11               0.0060
Bcl2          2.11                 0.28               0.0075
HIF1A         0.42                 0.34               0.0117
IGF1R         1.92                 0.26               0.0117
FOXM1         0.54                 0.24               0.0119
EPHX1         0.43                 0.33               0.0120
Ki67          0.60                 0.21               0.0160
CDC25B        0.41                 0.38               0.0200
VEGFC         0.45                 0.37               0.0288
CTSB          0.32                 0.53               0.0328
DIABLO        2.91                 0.50               0.0328
p27           1.83                 0.28               0.0341
CDH1          0.57                 0.27               0.0352
IGFBP3        0.45                 0.40               0.0499
[0512]The binary and time-to-event analyses, with few exceptions, identified the same genes as prognostic markers. For example, comparison of Tables 4 and 6 shows that, with the exception of a single gene, the two analyses generated the same list of top 15 markers (as defined by the smallest p values). Furthermore, when both analyses identified the same gene, they were concordant with respect to the direction (positive or negative sign) of the correlation with survival/recurrence. Overall, these results strengthen the conclusion that the identified markers have significant prognostic value.
[0513]For Cox models comprising more than two genes (multivariate models), stepwise entry of each individual gene into the model is performed, where the first gene entered is pre-selected from among those genes having significant univariate p-values, and the gene selected for entry into the model at each subsequent step is the gene that best improves the fit of the model to the data. This analysis can be performed with any total number of genes. In the analysis the results of which are shown below, stepwise entry was performed for up to 10 genes.
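The stepwise entry procedure can be sketched as a greedy forward selection. In the sketch below, `fit_score` is a hypothetical stand-in for whatever measure of Cox model fit was actually used (for example a partial log-likelihood), and the per-gene weights in the toy scoring function are invented purely so the example runs. Fitting a real Cox model is outside the scope of this illustration.

```python
def forward_stepwise(candidates, first_gene, fit_score, max_genes=10):
    """Greedy forward selection of genes.

    first_gene is pre-selected (e.g. from the significant univariate genes);
    at each later step the gene that gives the best fit_score when added to
    the current set is entered. fit_score(genes) must return a number where
    larger means a better fit.
    """
    selected = [first_gene]
    remaining = [gene for gene in candidates if gene != first_gene]
    while remaining and len(selected) < max_genes:
        best_gene = max(remaining, key=lambda gene: fit_score(selected + [gene]))
        selected.append(best_gene)
        remaining.remove(best_gene)
    return selected


# Toy scoring function with invented weights (not derived from the study).
toy_weights = {"Bcl2": 3.0, "cyclinG1": 2.5, "NFKBp65": 2.0, "NME1": 1.0}


def toy_score(genes):
    return sum(toy_weights[gene] for gene in genes)


gene_set = forward_stepwise(list(toy_weights), "Bcl2", toy_score, max_genes=3)
```

With a real fit measure the entry order depends on the genes already in the model, which is why a gene with a weak univariate p-value can still be selected.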
[0514]Multivariate analysis is performed using the following equation:
RR = exp[coef(geneA) × Ct(geneA) + coef(geneB) × Ct(geneB) + coef(geneC) × Ct(geneC) + . . . ].
[0515]In this equation, coefficients for genes that are predictors of beneficial outcome are positive numbers and coefficients for genes that are predictors of unfavorable outcome are negative numbers. The "Ct" values in the equation are ΔCts, i.e. reflect the difference between the average normalized Ct value for a population and the normalized Ct measured for the patient in question. The convention used in the present analysis has been that ΔCts below and above the population average have positive signs and negative signs, respectively (reflecting greater or lesser mRNA abundance). The relative risk (RR) calculated by solving this equation will indicate if the patient has an enhanced or reduced chance of long-term survival without cancer recurrence.
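The multivariate equation and its ΔCt sign convention can be written out as a short sketch. The gene names, coefficients, and Ct values below are hypothetical placeholders; only the form of the calculation follows the text.

```python
from math import exp


def delta_ct(population_mean_ct, patient_ct):
    """ΔCt with the sign convention of the text: a patient Ct below the
    population average (greater mRNA abundance) gives a positive value."""
    return population_mean_ct - patient_ct


def relative_risk(coefficients, population_mean_ct, patient_ct):
    """RR = exp[sum over genes of coef(gene) × ΔCt(gene)]."""
    total = sum(
        coef * delta_ct(population_mean_ct[gene], patient_ct[gene])
        for gene, coef in coefficients.items()
    )
    return exp(total)


# Hypothetical coefficients and Ct values (illustration only).
coefficients = {"geneA": 0.5, "geneB": -0.3}
population_means = {"geneA": 28.0, "geneB": 30.0}
patient = {"geneA": 27.0, "geneB": 31.0}
rr = relative_risk(coefficients, population_means, patient)
```

In this example geneA has a ΔCt of +1.0 and geneB a ΔCt of -1.0, so the exponent is 0.5 × 1.0 + (-0.3) × (-1.0) = 0.8.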
[0516]Multivariate Gene Analysis of 147 Patients with Invasive Breast Carcinoma
[0517](a) A multivariate stepwise analysis, using the Cox Proportional Hazards Model, was performed on the gene expression data obtained for all 147 patients with invasive breast carcinoma. Genes CEGP1, FOXM1, STK15 and PRAME were excluded from this analysis. The following ten-gene sets have been identified by this analysis as having particularly strong predictive value of patient survival without cancer recurrence following surgical removal of primary tumor.
[0518]1. Bcl2, cyclinG1, NFKBp65, NME1, EPHX1, TOP2B, DR5, TERC, Src, DIABLO;
[0519]2. Ki67, XIAP, hENT1, TS, CD9, p27, cyclinG1, pS2, NFKBp65, CYP3A4;
[0520]3. GSTM1, XIAP, Ki67, TS, cyclinG1, p27, CYP3A4, pS2, NFKBp65, ErbB3;
[0521]4. PR, NME1, XIAP, upa, cyclinG1, Contig51037, TERC, EPHX1, ALDH1A3, CTSL;
[0522]5. CA9, NME1, TERC, cyclinG1, EPHX1, DPYD, Src, TOP2B, NFKBp65, VEGFC;
[0523]6. TFRC, XIAP, Ki67, TS, cyclinG1, p27, CYP3A4, pS2, ErbB3, NFKBp65.
[0524](b) A multivariate stepwise analysis, using the Cox Proportional Hazards Model, was performed on the gene expression data obtained for all 147 patients with invasive breast carcinoma, using an interrogation set including a reduced number of genes. The following ten-gene sets have been identified by this analysis as having particularly strong predictive value of patient survival without cancer recurrence following surgical removal of primary tumor.
[0525]1. Bcl2, PRAME, cyclinG1, FOXM1, NFKBp65, TS, XIAP, Ki67, CYP3A4, p27;
[0526]2. FOXM1, cyclinG1, XIAP, Contig51037, PRAME, TS, Ki67, PDGFRa, p27, NFKBp65;
[0527]3. PRAME, FOXM1, cyclinG1, XIAP, Contig51037, TS, Ki67, PDGFRa, p27, NFKBp65;
[0528]4. Ki67, XIAP, PRAME, hENT1, Contig51037, TS, CD9, p27, ErbB3, cyclinG1;
[0529]5. STK15, XIAP, PRAME, PLAUR, p27, CTSL, CD18, PREP, p53, RPS6KB1;
[0530]6. GSTM1, XIAP, PRAME, p27, Contig51037, ErbB3, GSTp, EREG, ID1, PLAUR;
[0531]7. PR, PRAME, NME1, XIAP, PLAUR, cyclinG1, Contig51037, TERC, EPHX1, DR5;
[0532]8. CA9, FOXM1, cyclinG1, XIAP, TS, Ki67, NFKBp65, CYP3A4, GSTM3, p27;
[0533]9. TFRC, XIAP, PRAME, p27, Contig51037, ErbB3, DPYD, TERC, NME1, VEGFC;
[0534]10. CEGP1, PRAME, hENT1, XIAP, Contig51037, ErbB3, DPYD, NFKBp65, ID1, TS.
[0535]Multivariate Analysis of Patients with ER Positive Invasive Breast Carcinoma
[0536]A multivariate stepwise analysis, using the Cox Proportional Hazards Model, was performed on the gene expression data obtained for patients with ER positive invasive breast carcinoma. The following ten-gene sets have been identified by this analysis as having particularly strong predictive value of patient survival without cancer recurrence following surgical removal of primary tumor.
[0537]1. PRAME, p27, IGFBP2, HIF1A, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;
[0538]2. Contig51037, EPHX1, Ki67, TIMP2, cyclinG1, DPYD, CYP3A4, TP, AIB1, CYP2C8;
[0539]3. Bcl2, hENT1, FOXM1, Contig51037, cyclinG1, Contig46653, PTEN, CYP3A4, TIMP2, AREG;
[0540]4. HIF1A, PRAME, p27, IGFBP2, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;
[0541]5. IGF1R, PRAME, EPHX1, Contig51037, cyclinG1, Bcl2, NME1, PTEN, TBP, TIMP2;
[0542]6. FOXM1, Contig51037, VEGFC, TBP, HIF1A, DPYD, RAD51C, DCR3, cyclinG1, BAG1;
[0543]7. EPHX1, Contig51037, Ki67, TIMP2, cyclinG1, DPYD, CYP3A4, TP, AIB1, CYP2C8;
[0544]8. Ki67, VEGFC, VDR, GSTM3, p27, upa, ITGA7, rhoC, TERC, Pin1;
[0545]9. CDC25B, Contig51037, hENT1, Bcl2, HLAG, TERC, NME1, upa, ID1, CYP;
[0546]10. VEGFC, Ki67, VDR, GSTM3, p27, upa, ITGA7, rhoC, TERC, Pin1;
[0547]11. CTSB, PRAME, p27, IGFBP2, EPHX1, CTSL, BAD, DR5, DCR3, XIAP;
[0548]12. DIABLO, Ki67, hENT1, TIMP2, ILT2, p27, KRT19, IGFBP2, TS, PDGFB;
[0549]13. p27, PRAME, IGFBP2, HIF1A, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;
[0550]14. CDH1, PRAME, VEGFC, HIF1A, DPYD, TIMP2, CYP3A4, EstR1, RBP4, p27;
[0551]15. IGFBP3, PRAME, p27, Bcl2, XIAP, EstR1, Ki67, TS, Src, VEGF;
[0552]16. GSTM3, PRAME, p27, IGFBP3, XIAP, FGF2, hENT1, PTEN, EstR1, APC;
[0553]17. hENT1, Bcl2, FOXM1, Contig51037, cyclinG1, Contig46653, PTEN, CYP3A4, TIMP2, AREG;
[0554]18. STK15, VEGFC, PRAME, p27, GCLC, hENT1, ID1, TIMP2, EstR1, MCP1;
[0555]19. NME1, PRAME, p27, IGFBP3, XIAP, PTEN, hENT1, Bcl2, CYP3A4, HLAG;
[0556]20. VDR, Bcl2, p27, hENT1, p53, PI3KC2A, EIF4E, TFRC, MCM3, ID1;
[0557]21. EIF4E, Contig51037, EPHX1, cyclinG1, Bcl2, DR5, TBP, PTEN, NME1, HER2;
[0558]22. CCNB1, PRAME, VEGFC, HIF1A, hENT1, GCLC, TIMP2, ID1, p27, upa;
[0559]23. ID1, PRAME, DIABLO, hENT1, p27, PDGFRa, NME1, BIN1, BRCA1, TP;
[0560]24. FBXO5, PRAME, IGFBP3, p27, GSTM3, hENT1, XIAP, FGF2, TS, PTEN;
[0561]25. GUS, HIF1A, VEGFC, GSTM3, DPYD, hENT1, FBXO5, CA9, CYP, KRT18;
[0562]26. Bclx, Bcl2, hENT1, Contig51037, HLAG, CD9, ID1, BRCA1, BIN1, HBEGF.
[0563]It is noteworthy that many of the foregoing gene sets include genes that alone did not have sufficient predictive value to qualify as prognostic markers under the standards discussed above, but in combination with other genes, their presence provides valuable information about the likelihood of long-term patient survival without cancer recurrence.
[0564]All references cited throughout the disclosure are hereby expressly incorporated by reference.
[0565]While the present invention has been described with reference to what are considered to be the specific embodiments, it is to be understood that the invention is not limited to such embodiments. To the contrary, the invention is intended to cover various modifications and equivalents included within the spirit and scope of the appended claims. For example, while the disclosure focuses on the identification of various breast cancer associated genes and gene sets, and on the diagnosis and treatment of breast cancer, similar genes, gene sets and methods concerning other types of cancer are specifically within the scope herein.
TABLE 1
1. ADD3 (adducin 3 gamma)*
2. AKT1/Protein Kinase B
3. AKT 2
4. AKT 3
5. Aldehyde dehydrogenase 1A1
6. Aldehyde dehydrogenase 1A3
7. amphiregulin
8. APC
9. ARG
10. ATM
11. Bak
12. Bax
13. Bcl2
14. Bcl-xl
15. BRK
16. BCRP
17. BRCA-1
18. BRCA-2
19. Caspase-3
20. Cathepsin B
21. Cathepsin G
22. Cathepsin L
23. CD3
24. CD9
25. CD18
26. CD31
27. CD44^
28. CD68
29. CD82/KAI-1
30. Cdc25A
31. Cdc25B
32. CGA
33. COX2
34. CSF-1
35. CSF-1R/fms
36. cIAP1
37. cIAP2
38. c-abl
39. c-kit
40. c-kit L
41. c-met
42. c-myc
43. cN-1
44. cryptochrome1*
45. c-Src
46. Cyclin D1
47. CYP1B1
48. CYP2C9*
49. Cytokeratin 5^
50. Cytokeratin 17^
51. Cytokeratin 18^
52. DAP-Kinase-1
53. DHFR
54. DIABLO
55. Dihydropyrimidine dehydrogenase
56. EGF
57. ECadherin/CDH1^
58. ELF 3*
59. Endothelin
60. Epiregulin
61. ER-alpha^
62. ErbB-1
63. ErbB-2^
64. ErbB-3
65. ErbB-4
66. ER-Beta
67. Eukaryotic Translation Initiation Factor 4B* (EIF4B)
68. EIF4E
69. farnesyl pyrophosphate synthetase
70. FAS (CD95)
71. FasL
72. FGF R 1*
73. FGF2 [bFGF]
74. 53BP1
75. 53BP2
76. GALC (galactosylceramidase)*
77. Gamma-GCS (glutamyl cysteine synthetase)
78. GATA3^
79. geranyl geranyl pyrophosphate synthetase
80. G-CSF
81. GPC3
82. gravin* [AK AP258]
83. GRO1 oncogene alpha^
84. Grb7^
85. GST-alpha
86. GST-pi^
87. Ha-Ras
88. HB-EGF
89. HE4-extracellular Proteinase Inhibitor Homologue*
90. hepatocyte nuclear factor 3^
91. HER-2
92. HGF/Scatter factor
93. hIAP1
94. hIAP2
95. HIF-1
96. human kallikrein 10
97. MLH1
98. hsp 27
99. human chorionic gonadotropin/CGA
100. Human Extracellular Protein S1-5
101. Id-1
102. Id-2
103. Id-3
104. IGF-1
105. IGF2
106. IGF1R
107. IGFBP3
108. interstitial integrin alpha 7
109. IL6
110. IL8
111. IRF-2*
112. IRF9 Protein
113. Kallikrein 5
114. Kallikrein 6
115. KDR
116. Ki-67/MiB1
117. lipoprotein lipase^
118. LIV1
119. Lung Resistance Protein/MVP
120. Lot1
121. Maspin
122. MCM2
123. MCM3
124. MCM7
125. MCP-1
126. microtubule-associated protein 4
127. MCJ
128. mdm2
129. MDR-1
130. microsomal epoxide hydrolase
131. MMP9
132. MRP1
133. MRP2
134. MRP3
135. MRP4
136. MSN (Moesin)*
137. mTOR
138. Muc1/CA 15-3
139. NF-kB
140. P14ARF
141. P16INK4a/p14
142. p21WAF1/CIP1
143. p23
144. p27
145. p311*
146. p53
147. PAI1
148. PCNA
149. PDGF-A
150. PDGF-B
151. PDGF-C
152. PDGF-D
153. PDGFR-α
154. PDGFR-β
155. PI3K
156. Pin1
157. PKC-ε
158. Pkc-δ
159. PLAG1 (pleiomorphic aden 1)*
160. PREP prolyl endopeptidase
161. Progesterone receptor
162. pS2/trefoil factor 1
163. PTEN
164. PTP1b
165. RAR-alpha
166. RAR-beta2
167. RCP
168. Reduced Folate Carrier
169. Retinol binding protein 4^
170. STK15/BTAK
171. Survivin
172. SXR
173. Syk
174. TDG (thymine-DNA glycosylase)*
175. TGFalpha
176. Thymidine Kinase
177. Thymidine phosphorylase
178. Thymidylate Synthase
179. Topoisomerase II-α
180. Topoisomerase II-β
181. TRAMP
182. UPA
183. VEGF
184. Vimentin
185. WTH3
186. XAF1
187. XIAP
188. XIST
189. XPA
190. YB-1
*NCI 60 drug Sens./Resist Marker
^In Cluster Defining tumor subclass
Jan. 19, 2002
indicates data missing or illegible when filed
TABLE 2
Gene       Accession No.   Forward Primer SEQ ID NO.   Reverse Primer SEQ ID NO.   Amplicon SEQ ID NO.
ABCB1      NM_000927       1                           2                           3
ABCC1      NM_004996       4                           5                           6
ABCC2      NM_000392       7                           8                           9
ABCC3      NM_003786       10                          11                          12
ABCC4      NM_005845       13                          14                          15
ABL1       NM_005157       16                          17                          18
ABL2       NM_005158       19                          20                          21
ACTB       NM_001101       22                          23                          24
AKT1       NM_005163       25                          26                          27
AKT3       NM_005465       28                          29                          30
ALDH1      NM_000689       31                          32                          33
ALDH1A3    NM_000693       34                          35                          36
APC        NM_000038       37                          38                          39
AREG       NM_001657       40                          41                          42
B2M        NM_004048       43                          44                          45
BAK1       NM_001188       46                          47                          48
BAX        NM_004324       49                          50                          51
BCL2       NM_000633       52                          53                          54
BCL2L1     NM_001191       55                          56                          57
BIRC3      NM_001165       58                          59                          60
BIRC4      NM_001167       61                          62                          63
BIRC5      NM_001168       64                          65                          66
BRCA1      NM_007295       67                          68                          69
BRCA2      NM_000059       70                          71                          72
CCND1      NM_001758       73                          74                          75
CD3Z       NM_000734       76                          77                          78
CD68       NM_001251       79                          80                          81
CDC25A     NM_001789       82                          83                          84
CDH1       NM_004360       85                          86                          87
CDKN1A     NM_000389       88                          89                          90
CDKN1B     NM_004064       91                          92                          93
CDKN2A     NM_000077       94                          95                          96
CYP1B1     NM_000104       97                          98                          99
DHFR       NM_000791       100                         101                         102
DPYD       NM_000110       103                         104                         105
ECGF1      NM_001953       106                         107                         108
EGFR       NM_005228       109                         110                         111
EIF4E      NM_001968       112                         113                         114
ERBB2      NM_004448       115                         116                         117
ERBB3      NM_001982       118                         119                         120
ESR1       NM_000125       121                         122                         123
ESR2       NM_001437       124                         125                         126
GAPD       NM_002046       127                         128                         129
GATA3      NM_002051       130                         131                         132
GRB7       NM_005310       133                         134                         135
GRO1       NM_001511       136                         137                         138
GSTP1      NM_000852       139                         140                         141
GUSB       NM_000181       142                         143                         144
hHGF       M29145          145                         146                         147
HNF3A      NM_004496       148                         149                         150
ID2        NM_002166       151                         152                         153
IGF1       NM_000618       154                         155                         156
IGFBP3     NM_000598       157                         158                         159
ITGA7      NM_002206       160                         161                         162
ITGB2      NM_000211       163                         164                         165
KDR        NM_002253       166                         167                         168
KIT        NM_000222       169                         170                         171
KITLG      NM_000899       172                         173                         174
KRT17      NM_000422       175                         176                         177
KRT5       NM_000424       178                         179                         180
LPL        NM_000237       181                         182                         183
MET        NM_000245       184                         185                         186
MKI67      NM_002417       187                         188                         189
MVP        NM_017458       190                         191                         192
MYC        NM_002467       193                         194                         195
PDGFA      NM_002607       196                         197                         198
PDGFB      NM_002608       199                         200                         201
PDGFC      NM_016205       202                         203                         204
PDGFRA     NM_006206       205                         206                         207
PDGFRB     NM_002609       208                         209                         210
PGK1       NM_000291       211                         212                         213
PGR        NM_000926       214                         215                         216
PIN1       NM_006221       217                         218                         219
PLAU       NM_002658       220                         221                         222
PPIH       NM_006347       223                         224                         225
PTEN       NM_000314       226                         227                         228
PTGS2      NM_000963       229                         230                         231
RBP4       NM_006744       232                         233                         234
RELA       NM_021975       235                         236                         237
RPL19      NM_000981       238                         239                         240
RPLP0      NM_001002       241                         242                         243
SCDGF-B    NM_025208       244                         245                         246
SERPINE1   NM_000602       247                         248                         249
SLC19A1    NM_003056       250                         251                         252
TBP        NM_003194       253                         254                         255
TFF1       NM_003225       256                         257                         258
TFRC       NM_003234       259                         260                         261
TK1        NM_003258       262                         263                         264
TNFRSF6    NM_000043       265                         266                         267
TNFSF6     NM_000639       268                         269                         270
TOP2A      NM_001067       271                         272                         273
TOP2B      NM_001068       274                         275                         276
TP53       NM_000546       277                         278                         279
TYMS       NM_001071       280                         281                         282
VEGF       NM_003376       283                         284                         285
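The relationship between the three SEQ ID NO. columns of Table 2 can be illustrated with a small consistency check. It assumes the usual PCR convention that an amplicon begins with the forward primer and ends with the reverse complement of the reverse primer. The sequences are SEQ ID NOs: 1, 2 and 3 (ABCB1) as given in the sequence listing; the function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    complement = {"a": "t", "c": "g", "g": "c", "t": "a"}
    return "".join(complement[base] for base in reversed(seq))


def primers_flank_amplicon(forward, reverse, amplicon):
    """True if the amplicon starts with the forward primer and ends with the
    reverse complement of the reverse primer."""
    return amplicon.startswith(forward) and amplicon.endswith(
        reverse_complement(reverse)
    )


# SEQ ID NOs: 1-3 (ABCB1), with the spacing of the sequence listing removed.
forward_primer = "gtcccaggagcccatcct"
reverse_primer = "cccggctgttgtctccata"
amplicon = (
    "gtcccaggagcccatcctgtttgactgcagcattgctgag"
    "aacattgcctatggagacaacagccggg"
)
consistent = primers_flank_amplicon(forward_primer, reverse_primer, amplicon)
```

The same check could be applied to any other row of Table 2 by substituting the corresponding three sequences.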
TABLE 3
GENE          ACCESSION NO.   SEQ ID NO:
AK055699      AK055699        286
BAG1          NM_004323       287
BBC3          NM_014417       288
Bcl2          NM_000633       289
BRCA2         NM_000059       290
CA9           NM_001216       291
CCNB1         NM_031966       292
CDC25B        NM_021874       293
CEGP1         NM_020974       294
Chk1          NM_001274       295
Chk2          NM_007194       296
CYP3A4        NM_017460       297
DIABLO        NM_019887       298
DPYD          NM_000110       299
EGFR          NM_005228       300
EpCAM         NM_002354       301
EPHX1         NM_000120       302
EstR1         NM_000125       303
FOXM1         NM_021953       304
GATA3         NM_002051       305
GSTM1         NM_000561       306
GSTM3         NM_000849       307
hENT1         NM_004955       308
HIF1A         NM_001530       309
HNF3A         NM_004496       310
ID1           NM_002165       311
IGF1R         NM_000875       312
Ki-67         NM_002417       313
NFKBp65       NM_021975       314
NME1          NM_000269       315
p27           NM_004064       316
PI3KC2A       NM_002645       317
PR            NM_000926       318
PRAME         NM_006115       319
pS2           NM_003225       320
RPS6KB1       NM_003161       321
Src           NM_004383       322
STK15         NM_003600       323
SURV          NM_001168       324
TFRC          NM_003234       325
TGFB3         NM_003239       326
TK1           NM_003258       327
VDR           NM_000376       328
VEGFC         NM_005429       329
WISP1         NM_003882       330
XIAP          NM_001167       331
YB-1          NM_004559       332
ITGA7         NM_002206       333
PDGFB         NM_002608       334
Upa           NM_002658       335
TBP           NM_003194       336
PDGFRa        NM_006206       337
Pin1          NM_006221       338
CYP           NM_006347       339
RBP4          NM_006744       340
BRCA1         NM_007295       341
APC           NM_000038       342
GUS           NM_000181       343
CD18          NM_000211       344
PTEN          NM_000314       345
P53           NM_000546       346
ALDH1A3       NM_000693       347
GSTp          NM_000852       348
TOP2B         NM_001068       349
TS            NM_001071       350
Bclx          NM_001191       351
AREG          NM_001657       352
TP            NM_001953       353
EIF4E         NM_001968       354
ErbB3         NM_001982       355
EREG          NM_001432       356
GCLC          NM_001498       357
CD9           NM_001769       358
HB-EGF        NM_001945       359
IGFBP2        NM_000597       360
CTSL          NM_001912       361
PREP          NM_002726       362
CYP3A4        NM_017460       363
ILT-2         NM_006669       364
MCM3          NM_002388       365
KRT19         NM_002276       366
KRT18         NM_000224       367
TIMP2         NM_003255       368
BAD           NM_004322       369
CYP2C8        NM_030878       370
DCR3          NM_016434       371
PLAUR         NM_002659       372
PI3KC2A       NM_002645       373
FGF2          NM_002006       374
HLA-G         NM_002127       375
AIB1          NM_006534       376
MCP1          NM_002982       377
Contig46653   Contig46653     378
RhoC          NM_005167       379
DR5           NM_003842       380
RAD51C        NM_058216       381
BIN1          NM_004305       382
VDR           NM_000376       383
TERC          U86046          384
Sequence CWU
1
384118DNAHomo sapiens 1gtcccaggag cccatcct
18219DNAHomo sapiens 2cccggctgtt gtctccata
19368DNAHomo sapiens 3gtcccaggag
cccatcctgt ttgactgcag cattgctgag aacattgcct atggagacaa 60cagccggg
68418DNAHomo
sapiens 4tcatggtgcc cgtcaatg
18523DNAHomo sapiens 5cgattgtctt tgctcttcat gtg
23679DNAHomo sapiens 6tcatggtgcc cgtcaatgct
gtgatggcga tgaagaccaa gacgtatcag gtggcccaca 60tgaagagcaa agacaatcg
79720DNAHomo sapiens
7aggggatgac ttggacacat
20820DNAHomo sapiens 8aaaactgcat ggctttgtca
20965DNAHomo sapiens 9aggggatgac ttggacacat ctgccattcg
acatgactgc aattttgaca aagccatgca 60gtttt
651022DNAHomo sapiens 10tcatcctggc
gatctacttc ct 221120DNAHomo
sapiens 11ccgttgagtg gaatcagcaa
201291DNAHomo sapiens 12tcatcctggc gatctacttc ctctggcaga acctaggtcc
ctctgtcctg gctggagtcg 60ctttcatggt cttgctgatt ccactcaacg g
911320DNAHomo sapiens 13agcgcctgga atctacaact
201420DNAHomo sapiens
14agagcccctg gagagaagat
201566DNAHomo sapiens 15agcgcctgga atctacaact cggagtccag tgttttccca
cttgtcatct tctctccagg 60ggctct
661624DNAHomo sapiens 16gcccagagaa ggtctatgaa
ctca 241722DNAHomo sapiens
17gtttcaaagg cttggtggat tt
221894DNAHomo sapiens 18gcccagagaa ggtctatgaa ctcatgcgag catgttggca
gtggaatccc tctgaccggc 60cctcctttgc tgaaatccac caagcctttg aaac
941921DNAHomo sapiens 19cgcagtgcag ctgagtatct g
212021DNAHomo sapiens
20tgcccagggc tactctcact t
212180DNAHomo sapiens 21cgcagtgcag ctgagtatct gctcagcagt ctaatcaatg
gcagcttcct ggtgcgagaa 60agtgagagta gccctgggca
802221DNAHomo sapiens 22cagcagatgt ggatcagcaa g
212318DNAHomo sapiens
23gcatttgcgg tggacgat
182466DNAHomo sapiens 24cagcagatgt ggatcagcaa gcaggagtat gacgagtccg
gcccctccat cgtccaccgc 60aaatgc
662520DNAHomo sapiens 25cgcttctatg gcgctgagat
202620DNAHomo sapiens
26tcccggtaca ccacgttctt
202771DNAHomo sapiens 27cgcttctatg gcgctgagat tgtgtcagcc ctggactacc
tgcactcgga gaagaacgtg 60gtgtaccggg a
712825DNAHomo sapiens 28ttgtctctgc cttggactat
ctaca 252924DNAHomo sapiens
29ccagcattag attctccaac ttga
243075DNAHomo sapiens 30ttgtctctgc cttggactat ctacattccg gaaagattgt
gtaccgtgat ctcaagttgg 60agaatctaat gctgg
753125DNAHomo sapiens 31gaaggagata aggaggatgt
tgaca 253218DNAHomo sapiens
32cgccacggag atccaatc
183374DNAHomo sapiens 33gaaggagata aggaggatgt tgacaaggca gtgaaggccg
caagacaggc ttttcagatt 60ggatctccgt ggcg
743421DNAHomo sapiens 34tggtgaacat tgtgccagga t
213522DNAHomo sapiens
35gaaggcgatc ttgttgatct ga
223680DNAHomo sapiens 36tggtgaacat tgtgccagga ttcgggccca cagtgggagc
agcaatttct tctcaccctc 60agatcaacaa gatcgccttc
803720DNAHomo sapiens 37ggacagcagg aatgtgtttc
203820DNAHomo sapiens
38acccactcga tttgtttctg
203969DNAHomo sapiens 39ggacagcagg aatgtgtttc tccatacagg tcacggggag
ccaatggttc agaaacaaat 60cgagtgggt
694027DNAHomo sapiens 40tgtgagtgaa atgccttcta
gtagtga 274127DNAHomo sapiens
41ttgtggttcg ttatcatact cttctga
274282DNAHomo sapiens 42tgtgagtgaa atgccttcta gtagtgaacc gtcctcggga
gccgactatg actactcaga 60agagtatgat aacgaaccac aa
824319DNAHomo sapiens 43gtctcgctcc gtggcctta
194424DNAHomo sapiens
44cgtgagtaaa cctgaatctt tgga
244593DNAHomo sapiens 45gtctcgctcc gtggccttag ctgtgctcgc gctactctct
ctttctggcc tggaggctat 60ccagcgtact ccaaagattc aggtttactc acg
934620DNAHomo sapiens 46ccattcccac cattctacct
204720DNAHomo sapiens
47gggaacatag acccaccaat
204866DNAHomo sapiens 48ccattcccac cattctacct gaggccagga cgtctggggt
gtggggattg gtgggtctat 60gttccc
664918DNAHomo sapiens 49ccgccgtgga cacagact
185021DNAHomo sapiens
50ttgccgtcag aaaacatgtc a
215170DNAHomo sapiens 51ccgccgtgga cacagactcc ccccgagagg tctttttccg
agtggcagct gacatgtttt 60ctgacggcaa
705225DNAHomo sapiens 52cagatggacc tagtacccac
tgaga 255324DNAHomo sapiens
53cctatgattt aagggcattt ttcc
245473DNAHomo sapiens 54cagatggacc tagtacccac tgagatttcc acgccgaagg
acagcgatgg gaaaaatgcc 60cttaaatcat agg
735524DNAHomo sapiens 55cttttgtgga actctatggg
aaca 245619DNAHomo sapiens
56cagcggttga agcgttcct
195770DNAHomo sapiens 57cttttgtgga actctatggg aacaatgcag cagccgagag
ccgaaagggc caggaacgct 60tcaaccgctg
705824DNAHomo sapiens 58ggatatttcc gtggctctta
ttca 245925DNAHomo sapiens
59cttctcatca aggcagaaaa atctt
256086DNAHomo sapiens 60ggatatttcc gtggctctta ttcaaactct ccatcaaatc
ctgtaaactc cagagcaaat 60caagattttt ctgccttgat gagaag
866123DNAHomo sapiens 61gcagttggaa gacacaggaa agt
236221DNAHomo sapiens
62tgcgtggcac tattttcaag a
216377DNAHomo sapiens 63gcagttggaa gacacaggaa agtatcccca aattgcagat
ttatcaacgg cttttatctt 60gaaaatagtg ccacgca
776420DNAHomo sapiens 64tgttttgatt cccgggctta
206524DNAHomo sapiens
65caaagctgtc agctctagca aaag
246680DNAHomo sapiens 66tgttttgatt cccgggctta ccaggtgaga agtgagggag
gaagaaggca gtgtcccttt 60tgctagagct gacagctttg
806720DNAHomo sapiens 67tcagggggct agaaatctgt
206820DNAHomo sapiens
68ccattccagt tgatctgtgg
206965DNAHomo sapiens 69tcagggggct agaaatctgt tgctatgggc ccttcaccaa
catgcccaca gatcaactgg 60aatgg
657020DNAHomo sapiens 70agttcgtgct ttgcaagatg
207120DNAHomo sapiens
71aaggtaagct gggtctgctg
207270DNAHomo sapiens 72agttcgtgct ttgcaagatg gtgcagagct ttatgaagca
gtgaagaatg cagcagaccc 60agcttacctt
707321DNAHomo sapiens 73gcatgttcgt ggcctctaag a
217422DNAHomo sapiens
74cggtgtagat gcacagcttc tc
227569DNAHomo sapiens 75gcatgttcgt ggcctctaag atgaaggaga ccatccccct
gacggccgag aagctgtgca 60tctacaccg
697620DNAHomo sapiens 76agatgaagtg gaaggcgctt
207721DNAHomo sapiens
77tgcctctgta atcggcaact g
217865DNAHomo sapiens 78agatgaagtg gaaggcgctt ttcaccgcgg ccatcctgca
ggcacagttg ccgattacag 60aggca
657918DNAHomo sapiens 79tggttcccag ccctgtgt
188019DNAHomo sapiens
80ctcctccacc ctgggttgt
198174DNAHomo sapiens 81tggttcccag ccctgtgtcc acctccaagc ccagattcag
attcgagtca tgtacacaac 60ccagggtgga ggag
748220DNAHomo sapiens 82tcttgctggc tacgcctctt
208321DNAHomo sapiens
83ctgcattgtg gcacagttct g
218471DNAHomo sapiens 84tcttgctggc tacgcctctt ctgtccctgt tagacgtcct
ccgtccatat cagaactgtg 60ccacaatgca g
718521DNAHomo sapiens 85tgagtgtccc ccggtatctt c
218621DNAHomo sapiens
86cagccgcttt cagattttca t
218781DNAHomo sapiens 87tgagtgtccc ccggtatctt ccccgccctg ccaatcccga
tgaaattgga aattttattg 60atgaaaatct gaaagcggct g
818821DNAHomo sapiens 88tggagactct cagggtcgaa a
218922DNAHomo sapiens
89ggcgtttgga gtggtagaaa tc
229065DNAHomo sapiens 90tggagactct cagggtcgaa aacggcggca gaccagcatg
acagatttct accactccaa 60acgcc
659121DNAHomo sapiens 91cggtggacca cgaagagtta a
219219DNAHomo sapiens
92ggctcgcctc ttccatgtc
199366DNAHomo sapiens 93cggtggacca cgaagagtta acccgggact tggagaagca
ctgcagagac atggaagagg 60cgagcc
669419DNAHomo sapiens 94gcggaaggtc cctcagaca
199523DNAHomo sapiens
95tctaagtttc ccgaggtttc tca
239670DNAHomo sapiens 96gcggaaggtc cctcagacat ccccgattga aagaaccaga
gaggctctga gaaacctcgg 60gaaacttaga
709722DNAHomo sapiens 97ccagctttgt gcctgtcact at
229820DNAHomo sapiens
98gggaatgtgg tagcccaaga
209971DNAHomo sapiens 99ccagctttgt gcctgtcact attcctcatg ccaccactgc
caacacctct gtcttgggct 60accacattcc c
7110027DNAHomo sapiens 100ttgctataac taagtgcttc
tccaaga 2710122DNAHomo sapiens
101gtggaatggc agctcactgt ag
2210273DNAHomo sapiens 102ttgctataac taagtgcttc tccaagaccc caactgagtc
cccagcacct gctacagtga 60gctgccattc cac
7310319DNAHomo sapiens 103aggacgcaag gagggtttg
1910421DNAHomo sapiens
104gatgtccgcc gagtccttac t
2110587DNAHomo sapiens 105aggacgcaag gagggtttgt cactggcaga ctcgagactg
taggcactgc catggcccct 60gtgctcagta aggactcggc ggacatc
8710624DNAHomo sapiens 106ctatatgcag ccagagatgt
gaca 2410724DNAHomo sapiens
107ccacgagttt cttactgaga atgg
2410882DNAHomo sapiens 108ctatatgcag ccagagatgt gacagccacc gtggacagcc
tgccactcat cacagcctcc 60attctcagta agaaactcgt gg
8210920DNAHomo sapiens 109tgtcgatgga cttccagaac
2011019DNAHomo sapiens
110attgggacag cttggatca
1911162DNAHomo sapiens 111tgtcgatgga cttccagaac cacctgggca gctgccaaaa
gtgtgatcca agctgtccca 60at
6211223DNAHomo sapiens 112gatctaagat ggcgactgtc
gaa 2311325DNAHomo sapiens
113ttagattccg ttttctcctc ttctg
2511482DNAHomo sapiens 114gatctaagat ggcgactgtc gaaccggaaa ccacccctac
tcctaatccc ccgactacag 60aagaggagaa aacggaatct aa
8211520DNAHomo sapiens 115cggtgtgaga agtgcagcaa
2011619DNAHomo sapiens
116cctctcgcaa gtgctccat
1911770DNAHomo sapiens 117cggtgtgaga agtgcagcaa gccctgtgcc cgagtgtgct
atggtctggg catggagcac 60ttgcgagagg
7011823DNAHomo sapiens 118cggttatgtc atgccagata
cac 2311924DNAHomo sapiens
119gaactgagac ccactgaaga aagg
2412081DNAHomo sapiens 120cggttatgtc atgccagata cacacctcaa aggtactccc
tcctcccggg aaggcaccct 60ttcttcagtg ggtctcagtt c
8112119DNAHomo sapiens 121cgtggtgccc ctctatgac
1912219DNAHomo sapiens
122ggctagtggg cgcatgtag
1912368DNAHomo sapiens 123cgtggtgccc ctctatgacc tgctgctgga gatgctggac
gcccaccgcc tacatgcgcc 60cactagcc
6812420DNAHomo sapiens 124tggtccatcg ccagttatca
2012523DNAHomo sapiens
125tgttctagcg atcttgcttc aca
2312676DNAHomo sapiens 126tggtccatcg ccagttatca catctgtatg cggaacctca
aaagagtccc tggtgtgaag 60caagatcgct agaaca
7612724DNAHomo sapiens 127catccatgac aactttggta
tcgt 2412821DNAHomo sapiens
128cagtcttctg ggtggcagtg a
2112974DNAHomo sapiens 129catccatgac aactttggta tcgtggaagg actcatgacc
acagtccatg ccatcactgc 60cacccagaag actg
7413023DNAHomo sapiens 130caaaggagct cactgtggtg
tct 2313126DNAHomo sapiens
131gagtcagaat ggcttattca cagatg
2613275DNAHomo sapiens 132caaaggagct cactgtggtg tctgtgttcc aaccactgaa
tctggacccc atctgtgaat 60aagccattct gactc
7513320DNAHomo sapiens 133ccatctgcat ccatcttgtt
2013420DNAHomo sapiens
134ggccaccagg gtattatctg
2013567DNAHomo sapiens 135ccatctgcat ccatcttgtt tgggctcccc acccttgaga
agtgcctcag ataataccct 60ggtggcc
6713623DNAHomo sapiens 136cgaaaagatg ctgaacagtg
aca 2313720DNAHomo sapiens
137tcaggaacag ccaccagtga
2013873DNAHomo sapiens 138cgaaaagatg ctgaacagtg acaaatccaa ctgaccagaa
gggaggagga agctcactgg 60tggctgttcc tga
7313920DNAHomo sapiens 139gagaccctgc tgtcccagaa
2014023DNAHomo sapiens
140ggttgtagtc agcgaaggag atc
2314176DNAHomo sapiens 141gagaccctgc tgtcccagaa ccagggaggc aagaccttca
ttgtgggaga ccagatctcc 60ttcgctgact acaacc
7614220DNAHomo sapiens 142cccactcagt agccaagtca
2014320DNAHomo sapiens
143cacgcaggtg gtatcagtct
2014473DNAHomo sapiens 144cccactcagt agccaagtca caatgtttgg aaaacagccc
gtttacttga gcaagactga 60taccacctgc gtg
7314524DNAHomo sapiens 145catcaaatgt cagccctgga
gttc 2414626DNAHomo sapiens
146ttcctgtagg tctttacccc gatagc
2614785DNAHomo sapiens 147catcaaatgt cagccctgga gttccatgat accacacgaa
cacagctttt tgccttcgag 60ctatcggggt aaagacctac aggaa
8514824DNAHomo sapiens 148tccaggatgt taggaactgt
gaag 2414922DNAHomo sapiens
149gcgtgtctgc gtagtagctg tt
2215073DNAHomo sapiens 150tccaggatgt taggaactgt gaagatggaa gggcatgaaa
ccagcgactg gaacagctac 60tacgcagaca cgc
7315123DNAHomo sapiens 151aacgactgct actccaagct
caa 2315222DNAHomo sapiens
152ggatttccat cttgctcacc tt
2215376DNAHomo sapiens 153aacgactgct actccaagct caaggagctg gtgcccagca
tcccccagaa caagaaggtg 60agcaagatgg aaatcc
7615421DNAHomo sapiens 154tccggagctg tgatctaagg a
2115520DNAHomo sapiens
155cggacagagc gagctgactt
2015676DNAHomo sapiens 156tccggagctg tgatctaagg aggctggaga tgtattgcgc
acccctcaag cctgccaagt 60cagctcgctc tgtccg
7615717DNAHomo sapiens 157acgcaccggg tgtctga
1715824DNAHomo sapiens
158tgccctttct tgatgatgat tatc
2415968DNAHomo sapiens 159acgcaccggg tgtctgatcc caagttccac cccctccatt
caaagataat catcatcaag 60aaagggca
6816022DNAHomo sapiens 160ccattcaccc tgtgtaacag
ga 2216121DNAHomo sapiens
161ccgaccctct aggttaaggc a
2116268DNAHomo sapiens 162ccattcaccc tgtgtaacag gaccccaagg acctgcctcc
ccggaagtgc cttaacctag 60agggtcgg
6816320DNAHomo sapiens 163cgtcaggacc caccatgtct
2016424DNAHomo sapiens
164ggttaattgg tgacatcctc aaga
2416581DNAHomo sapiens 165cgtcaggacc caccatgtct gccccatcac gcggccgaga
catggcttgg ccacagctct 60tgaggatgtc accaattaac c
8116623DNAHomo sapiens 166caaacgctga catgtacggt
cta 2316718DNAHomo sapiens
167gctcgttggc gcactctt
1816888DNAHomo sapiens 168caaacgctga catgtacggt ctatgccatt cctcccccgc
atcacatcca ctggtattgg 60cagttggagg aagagtgcgc caacgagc
8816925DNAHomo sapiens 169gaggcaactg cttatggctt
aatta 2517018DNAHomo sapiens
170ggcactcggc ttgagcat
1817175DNAHomo sapiens 171gaggcaactg cttatggctt aattaagtca gatgcggcca
tgactgtcgc tgtaaagatg 60ctcaagccga gtgcc
7517218DNAHomo sapiens 172gtccccggga tggatgtt
1817325DNAHomo sapiens
173gatcagtcaa gctgtctgac aattg
2517479DNAHomo sapiens 174gtccccggga tggatgtttt gccaagtcat tgttggataa
gcgagatggt agtacaattg 60tcagacagct tgactgatc
7917521DNAHomo sapiens 175cgaggattgg ttcttcagca a
2117622DNAHomo sapiens
176actctgcacc agctcactgt tg
2217773DNAHomo sapiens 177cgaggattgg ttcttcagca agacagagga actgaaccgc
gaggtggcca ccaacagtga 60gctggtgcag agt
7317820DNAHomo sapiens 178tcagtggaga aggagttgga
2017920DNAHomo sapiens
179tgccatatcc agaggaaaca
2018069DNAHomo sapiens 180tcagtggaga aggagttgga ccagtcaaca tctctgttgt
cacaagcagt gtttcctctg 60gatatggca
6918126DNAHomo sapiens 181gtacaagaga gaaccagact
ccaatg 2618218DNAHomo sapiens
182gtgtagcccg cggacact
1818387DNAHomo sapiens 183gtacaagaga gaaccagact ccaatgtcat tgtggtggac
tggctgtcac gggctcagga 60gcattaccca gtgtccgcgg gctacac
8718422DNAHomo sapiens 184gacatttcca gtcctgcagt
ca 2218520DNAHomo sapiens
185ctccgatcgc acacatttgt
2018686DNAHomo sapiens 186gacatttcca gtcctgcagt caatgcctct ctgccccacc
ctttgttcag tgtggctggt 60gccacgacaa atgtgtgcga tcggag
8618724DNAHomo sapiens 187gttttggagg aaatgtgttc
ttca 2418826DNAHomo sapiens
188ttctctaata cactgccgtc ttaagg
26189101DNAHomo sapiens 189gttttggagg aaatgtgttc ttcagtgcac agaatgcagc
aaaacagcca tctgataaat 60gctctgcaag ccctccctta agacggcagt gtattagaga a
10119022DNAHomo sapiens 190acgagaacga gggcatctat
gt 2219122DNAHomo sapiens
191gcatgtaggt gcttccaatc ac
2219275DNAHomo sapiens 192acgagaacga gggcatctat gtgcaggatg tcaagaccgg
aaaggtgcgc gctgtgattg 60gaagcaccta catgc
7519321DNAHomo sapiens 193tccctccact cggaaggact a
2119422DNAHomo sapiens
194cggttgttgc tgatctgtct ca
2219584DNAHomo sapiens 195tccctccact cggaaggact atcctgctgc caagagggtc
aagttggaca gtgtcagagt 60cctgagacag atcagcaaca accg
8419619DNAHomo sapiens 196ttgttggtgt gccctggtg
1919721DNAHomo sapiens
197tgggttctgt ccaaacactg g
2119867DNAHomo sapiens 198ttgttggtgt gccctggtgc cgtggtggcg gtcactccct
ctgctgccag tgtttggaca 60gaaccca
6719920DNAHomo sapiens 199actgaaggag acccttggag
2020020DNAHomo sapiens
200taaataaccc tgcccacaca
2020162DNAHomo sapiens 201actgaaggag acccttggag cctaggggca tcggcaggag
agtgtgtggg cagggttatt 60ta
6220228DNAHomo sapiens 202agttactaaa aaataccacg
aggtcctt 2820321DNAHomo sapiens
203gtcggtgagt gatttgtgca a
2120479DNAHomo sapiens 204agttactaaa aaataccacg aggtccttca gttgagacca
aagaccggtg tcaggggatt 60gcacaaatca ctcaccgac
7920520DNAHomo sapiens 205gggagtttcc aagagatgga
2020620DNAHomo sapiens
206cttcaaccac cttcccaaac
2020772DNAHomo sapiens 207gggagtttcc aagagatgga ctagtgcttg gtcgggtctt
ggggtctgga gcgtttggga 60aggtggttga ag
7220823DNAHomo sapiens 208aggtgtcatc catcaacgtc
tct 2320920DNAHomo sapiens
209tcccgatcac aatgcacatg
2021090DNAHomo sapiens 210aggtgtcatc catcaacgtc tctgtgaacg cagtgcagac
tgtggtccgc cagggtgaga 60acatcaccct catgtgcatt gtgatcggga
9021124DNAHomo sapiens 211agagccagtt gctgtagaac
tcaa 2421221DNAHomo sapiens
212ctgggcctac acagtccttc a
2121374DNAHomo sapiens 213agagccagtt gctgtagaac tcaaatctct gctgggcaag
gatgttctgt tcttgaagga 60ctgtgtaggc ccag
7421426DNAHomo sapiens 214gaaatgactg catcgttgat
aaaatc 2621519DNAHomo sapiens
215tgccagcctg acagcactt
1921678DNAHomo sapiens 216gaaatgactg catcgttgat aaaatccgca gaaaaaactg
cccagcatgt cgccttagaa 60agtgctgtca ggctggca
7821720DNAHomo sapiens 217gatcaacggc tacatccaga
2021820DNAHomo sapiens
218tgaactgtga ggccagagac
2021968DNAHomo sapiens 219gatcaacggc tacatccaga agatcaagtc gggagaggag
gactttgagt ctctggcctc 60acagttca
6822019DNAHomo sapiens 220gtggatgtgc cctgaagga
1922120DNAHomo sapiens
221ctgcggatcc agggtaagaa
2022270DNAHomo sapiens 222gtggatgtgc cctgaaggac aagccaggcg tctacacgag
agtctcacac ttcttaccct 60ggatccgcag
7022327DNAHomo sapiens 223tggacttcta gtgatgagaa
agattga 2722422DNAHomo sapiens
224cactgcgaga tcaccacagg ta
2222584DNAHomo sapiens 225tggacttcta gtgatgagaa agattgagaa tgttcccaca
ggccccaaca ataagcccaa 60gctacctgtg gtgatctcgc agtg
8422625DNAHomo sapiens 226tggctaagtg aagatgacaa
tcatg 2522725DNAHomo sapiens
227tgcacatatc attacaccag ttcgt
2522881DNAHomo sapiens 228tggctaagtg aagatgacaa tcatgttgca gcaattcact
gtaaagctgg aaagggacga 60actggtgtaa tgatatgtgc a
8122923DNAHomo sapiens 229tctgcagagt tggaagcact
cta 2323021DNAHomo sapiens
230gccgaggctt ttctaccaga a
2123179DNAHomo sapiens 231tctgcagagt tggaagcact ctatggtgac atcgatgctg
tggagctgta tcctgccctt 60ctggtagaaa agcctcggc
7923224DNAHomo sapiens 232acgacacgta tgccgtacag
tact 2423318DNAHomo sapiens
233ccgggaaaac acgaagga
1823486DNAHomo sapiens 234acgacacgta tgccgtacag tactcctgcc gcctcctgaa
cctcgatggc acctgtgctg 60acagctactc cttcgtgttt tcccgg
8623519DNAHomo sapiens 235ctgccgggat ggcttctat
1923622DNAHomo sapiens
236ccaggttctg gaaactgtgg at
2223768DNAHomo sapiens 237ctgccgggat ggcttctatg aggctgagct ctgcccggac
cgctgcatcc acagtttcca 60gaacctgg
6823820DNAHomo sapiens 238ccacaagctg aaggcagaca
2023921DNAHomo sapiens
239gcgtgcttcc ttggtcttag a
2124085DNAHomo sapiens 240ccacaagctg aaggcagaca aggcccgcaa gaagctcctg
gctgaccagg ctgaggcccg 60caggtctaag accaaggaag cacgc
8524124DNAHomo sapiens 241ccattctatc atcaacgggt
acaa 2424223DNAHomo sapiens
242tcagcaagtg ggaaggtgta atc
2324375DNAHomo sapiens 243ccattctatc atcaacgggt acaaacgagt cctggccttg
tctgtggaga cggattacac 60cttcccactt gctga
7524420DNAHomo sapiens 244tatcgaggca ggtcatacca
2024520DNAHomo sapiens
245taacgcttgg catcatcatt
2024674DNAHomo sapiens 246tatcgaggca ggtcatacca tgaccggaag tcaaaagttg
acctggatag gctcaatgat 60gatgccaagc gtta
7424719DNAHomo sapiens 247ccgcaacgtg gttttctca
1924821DNAHomo sapiens
248tgctgggttt ctcctcctgt t
2124981DNAHomo sapiens 249ccgcaacgtg gttttctcac cctatggggt ggcctcggtg
ttggccatgc tccagctgac 60aacaggagga gaaacccagc a
8125025DNAHomo sapiens 250tcaagaccat catcactttc
attgt 2525127DNAHomo sapiens
251ggatcaggaa gtacacggag tataact
2725296DNAHomo sapiens 252tcaagaccat catcactttc attgtctcgg acgtgcgggg
cctgggcctc ccggtccgca 60agcagttcca gttatactcc gtgtacttcc tgatcc
9625319DNAHomo sapiens 253gcccgaaacg ccgaatata
1925423DNAHomo sapiens
254cgtggctctc ttatcctcat gat
2325565DNAHomo sapiens 255gcccgaaacg ccgaatataa tcccaagcgg tttgctgcgg
taatcatgag gataagagag 60ccacg
6525619DNAHomo sapiens 256gccctcccag tgtgcaaat
1925725DNAHomo sapiens
257cgtcgatggt attaggatag aagca
2525886DNAHomo sapiens 258gccctcccag tgtgcaaata agggctgctg tttcgacgac
accgttcgtg gggtcccctg 60gtgcttctat cctaatacca tcgacg
8625927DNAHomo sapiens 259caagctagat cagcattctc
taacttg 2726025DNAHomo sapiens
260cacatgactg ttatcgccat ctact
2526199DNAHomo sapiens 261caagctagat cagcattctc taacttgttt ggtggagaac
cattgtcata tacccggttc 60agcctggctc ggcaagtaga tggcgataac agtcatgtg
9926222DNAHomo sapiens 262cacaggaaca acagcatctt
tc 2226320DNAHomo sapiens
263agataagccc ctgggatcca
2026475DNAHomo sapiens 264cacaggaaca acagcatctt tcaccaagat gggtggcacc
aaccttgctg ggacttggat 60cccaggggct tatct
7526521DNAHomo sapiens 265ggattgctca acaaccatgc t
2126624DNAHomo sapiens
266ggcattaaca cttttggacg ataa
2426791DNAHomo sapiens 267ggattgctca acaaccatgc tgggcatctg gaccctccta
cctctggttc ttacgtctgt 60tgctagatta tcgtccaaaa gtgttaatgc c
9126824DNAHomo sapiens 268gcactttggg attctttcca
ttat 2426924DNAHomo sapiens
269gcatgtaaga agaccctcac tgaa
2427080DNAHomo sapiens 270gcactttggg attctttcca ttatgattct ttgttacagg
caccgagaat gttgtattca 60gtgagggtct tcttacatgc
8027120DNAHomo sapiens 271aatccaaggg ggagagtgat
2027220DNAHomo sapiens
272gtacagattt tgcccgagga
2027372DNAHomo sapiens 273aatccaaggg ggagagtgat gacttccata tggactttga
ctcagctgtg gctcctcggg 60caaaatctgt ac
7227421DNAHomo sapiens 274tgtggacatc ttcccctcag a
2127518DNAHomo sapiens
275ctagcccgac cggttcgt
1827666DNAHomo sapiens 276tgtggacatc ttcccctcag acttccctac tgagccacct
tctctgccac gaaccggtcg 60ggctag
6627720DNAHomo sapiens 277ctttgaaccc ttgcttgcaa
2027818DNAHomo sapiens
278cccgggacaa agcaaatg
1827968DNAHomo sapiens 279ctttgaaccc ttgcttgcaa taggtgtgcg tcagaagcac
ccaggacttc catttgcttt 60gtcccggg
6828018DNAHomo sapiens 280gcctcggtgt gcctttca
1828119DNAHomo sapiens
281cgtgatgtgc gcaatcatg
1928265DNAHomo sapiens 282gcctcggtgt gcctttcaac atcgccagct acgccctgct
cacgtacatg attgcgcaca 60tcacg
6528320DNAHomo sapiens 283ctgctgtctt gggtgcattg
2028418DNAHomo sapiens
284gcagcctggg accacttg
1828571DNAHomo sapiens 285ctgctgtctt gggtgcattg gagccttgcc ttgctgctct
acctccacca tgccaagtgg 60tcccaggctg c
712861947DNAHomo sapiens 286ttttccccag atatggggtt
ctattcagcc atagataatc tagacagagg atttcagaat 60gaaaggaaaa atgtgtggag
attagtccta gttcattctg agggccgact aagtggctca 120gccagcttct tactccatct
gcagttcata ctgccaaaga gctcccactt ccaaatcccc 180agtgacttta tggagaagat
tctgcattaa attgtctttc gaatgatggg gaagcaaggc 240ataatatgcg atgatgagga
gaaagtagac cagtgaggtg attgcaagac taacaaggag 300actcaatggg aagtttttct
ttcttttaga tattgctttt gaagtagatg gtaaaatttt 360tgtcatcctt cttgtatttt
ttgtacccca agttacaatt tttcttcttc cttgtaaata 420atttaaacag tatttatttt
tgtaaggcat aactagaaac taaaatatat tctaaaaaat 480tcattattct gaacaaagtg
atcaaattag aatacatatt tttcaacagt ggtagagctt 540ttaatatatg tttattgaaa
gttatctata atacttgcac cagtgttgaa aaaagttaac 600atgtaggcaa gagcaatatg
tttgtctcaa ggatttttcc atggtttcct cagtgatggt 660gtcctggaat tattcaggtg
gtgaccatca ctggtctaag tttgtgtgca gggttttcag 720acgtgttttt gtgaaacttg
gtagaaccat ggctaataaa gaggacagtg ttgtcagggt 780ccatctgccc tccatagaaa
aatgtctctg gctcataaaa tgagactccc tcagggacta 840aatatgaact gacagcagta
actctgatac agaataatct aaattgcatc aaatggcctt 900aattcagagt ttgttaggct
tatcagtatg ttgcttttaa ttggggtggg aaagtagagg 960gagagaaagc aagacattta
ttaagcacct cgtatgtgcc aggcactatg ctaagcactt 1020tacataagtt aggattaatc
cctgcaagaa tcctataaag aatgttacta gcatttacac 1080ttcccaaatg aaggtaccaa
agctcaaacg caatgttgtg aagctgtttc cttcagattt 1140aggttatgtg ggatgatgtg
ggattgaaga ggaaagaaag gtgggattat ccccctagga 1200agactttcag gcctgacttc
ataggaattc atccatctta tcatgtggag tttatctcac 1260cctgctgttg caggatgcta
tttgcatgtg tccccaggtg atgttttttc tttggggagt 1320aggggtttgg cttcctcatt
catccctctt gctaaaagag gagatagttg atgttgcatc 1380taaagatgct ataagacaat
gaaagtttga tgttgtacat acctacaagt accatttttg 1440tgcatgatta cactccactg
acatcttcca agtactgcat gtgattgaat aagaaacaag 1500aaagtgacca caccaaagcc
tccctggctg gtgtacaggg atcaggtcca cagtggtaca 1560gattcaacca ccacccaggg
agtgcttgca gactctgcat agatgttgct gcatgcgtcc 1620catgtgcctg tcagaatggc
agtgtttaat tctcttgaaa gaaagttatt tgctcactat 1680ccccagcctc aaggagccaa
ggaagagtca ttcacatgga aggtccgggt ctggtcagcc 1740actctgactt ttctaccaca
ttaaattctc cattacatct cactattggt aatggcttaa 1800gtgtaaagag ccatgatgtg
tatattaagc tatgtgccac atatttattt ttagactctc 1860cacagcattc atgtcaatat
gggattaatg cctaaacttt gtaaatattg tacagtttgt 1920aaatcaatga ataaaggttt
tgagtgt 19472871311DNAHomo sapiens
287tagtcgggcg gggttgtgag acgccgcgct cagcttccat cgctgggcgg tcaacaagtg
60cgggcctggc tcagcgcggg ggggcgcgga gaccgcgagg cgaccgggag cggctgggtt
120cccggctgcg cgcccttcgg ccaggccggg agccgcgcca gtcggagccc ccggcccagc
180gtggtccgcc tccctctcgg cgtccacctg cccggagtac tgccagcggg catgaccgac
240ccaccagggg cgccgccgcc ggcgctcgca ggccgcggat gaagaagaaa acccggcgcc
300gctcgacccg gagcgaggag ttgacccgga gcgaggagtt gaccctgagt gaggaagcga
360cctggagtga agaggcgacc cagagtgagg aggcgaccca gggcgaagag atgaatcgga
420gccaggaggt gacccgggac gaggagtcga cccggagcga ggaggtgacc agggaggaaa
480tggcggcagc tgggctcacc gtgactgtca cccacagcaa tgagaagcac gaccttcatg
540ttacctccca gcagggcagc agtgaaccag ttgtccaaga cctggcccag gttgttgaag
600aggtcatagg ggttccacag tcttttcaga aactcatatt taagggaaaa tctctgaagg
660aaatggaaac accgttgtca gcacttggaa tacaagatgg ttgccgggtc atgttaattg
720ggaaaaagaa cagtccacag gaagaggttg aactaaagaa gttgaaacat ttggagaagt
780ctgtggagaa gatagctgac cagctggaag agttgaataa agagcttact ggaatccagc
840agggttttct gcccaaggat ttgcaagctg aagctctctg caaacttgat aggagagtaa
900aagccacaat agagcagttt atgaagatct tggaggagat tgacacactg atcctgccag
960aaaatttcaa agacagtaga ttgaaaagga aaggcttggt aaaaaaggtt caggcattcc
1020tagccgagtg tgacacagtg gagcagaaca tctgccagga gactgagcgg ctgcagtcta
1080caaactttgc cctggccgag tgaggtgtag cagaaaaagg ctgtgctgcc ctgaagaatg
1140gcgccaccag ctctgccgtc tctggatcgg aatttacctg atttcttcag ggctgctggg
1200ggcaactggc catttgccaa ttttcctact ctcacactgg ttctcaatga aaaatagtgt
1260ctttgtgatt tgagtaaagc tcctattctg tttttcacaa aaaaaaaaaa a
1311288582DNAHomo sapiens 288atggcccgcg cacgccagga gggcagctcc ccggagcccg
tagagggcct ggcccgcgac 60ggcccgcgcc ccttcccgct cggccgcctg gtgccctcgg
cagtgtcctg cggcctctgc 120gagcccggcc tggctgccgc ccccgccgcc cccaccctgc
tgcccgctgc ctacctctgc 180gcccccaccg ccccacccgc cgtcaccgcc gccctggggg
gttcccgctg gcctgggggt 240ccccgcagcc ggccccgagg cccgcgcccg gacggtcctc
agccctcgct ctcgctggcg 300gagcagcacc tggagtcgcc cgtgcccagc gccccggggg
ctctggcggg cggtcccacc 360caggcggccc cgggagtccg cggggaggag gaacagtggg
cccgggagat cggggcccag 420ctgcggcgga tggcggacga cctcaacgca cagtacgagc
ggcggagaca agaggagcag 480cagcggcacc gcccctcacc ctggagggtc ctgtacaatc
tcatcatggg actcctgccc 540ttacccaggg gccacagagc ccccgagatg gagcccaatt
ag 5822896030DNAHomo sapiens 289gttggccccc
gttacttttc ctctgggaaa tatggcgcac gctgggagaa cagggtacga 60taaccgggag
atagtgatga agtacatcca ttataagctg tcgcagaggg gctacgagtg 120ggatgcggga
gatgtgggcg ccgcgccccc gggggccgcc cccgcgccgg gcatcttctc 180ctcgcagccc
gggcacacgc cccatacagc cgcatcccgg gacccggtcg ccaggacctc 240gccgctgcag
accccggctg cccccggcgc cgccgcgggg cctgcgctca gcccggtgcc 300acctgtggtc
cacctgaccc tccgccaggc cggcgacgac ttctcccgcc gctaccgccg 360cgacttcgcc
gagatgtcca ggcagctgca cctgacgccc ttcaccgcgc ggggacgctt 420tgccacggtg
gtggaggagc tcttcaggga cggggtgaac tgggggagga ttgtggcctt 480ctttgagttc
ggtggggtca tgtgtgtgga gagcgtcaac cgggagatgt cgcccctggt 540ggacaacatc
gccctgtgga tgactgagta cctgaaccgg cacctgcaca cctggatcca 600ggataacgga
ggctgggatg cctttgtgga actgtacggc cccagcatgc ggcctctgtt 660tgatttctcc
tggctgtctc tgaagactct gctcagtttg gccctggtgg gagcttgcat 720caccctgggt
gcctatctgg gccacaagtg aagtcaacat gcctgcccca aacaaatatg 780caaaaggttc
actaaagcag tagaaataat atgcattgtc agtgatgttc catgaaacaa 840agctgcaggc
tgtttaagaa aaaataacac acatataaac atcacacaca cagacagaca 900cacacacaca
caacaattaa cagtcttcag gcaaaacgtc gaatcagcta tttactgcca 960aagggaaata
tcatttattt tttacattat taagaaaaaa agatttattt atttaagaca 1020gtcccatcaa
aactcctgtc tttggaaatc cgaccactaa ttgccaagca ccgcttcgtg 1080tggctccacc
tggatgttct gtgcctgtaa acatagattc gctttccatg ttgttggccg 1140gatcaccatc
tgaagagcag acggatggaa aaaggacctg atcattgggg aagctggctt 1200tctggctgct
ggaggctggg gagaaggtgt tcattcactt gcatttcttt gccctggggg 1260ctgtgatatt
aacagaggga gggttcctgt ggggggaagt ccatgcctcc ctggcctgaa 1320gaagagactc
tttgcatatg actcacatga tgcatacctg gtgggaggaa aagagttggg 1380aacttcagat
ggacctagta cccactgaga tttccacgcc gaaggacagc gatgggaaaa 1440atgcccttaa
atcataggaa agtatttttt taagctacca attgtgccga gaaaagcatt 1500ttagcaattt
atacaatatc atccagtacc ttaagccctg attgtgtata ttcatatatt 1560ttggatacgc
accccccaac tcccaatact ggctctgtct gagtaagaaa cagaatcctc 1620tggaacttga
ggaagtgaac atttcggtga cttccgcatc aggaaggcta gagttaccca 1680gagcatcagg
ccgccacaag tgcctgcttt taggagaccg aagtccgcag aacctgcctg 1740tgtcccagct
tggaggcctg gtcctggaac tgagccgggg ccctcactgg cctcctccag 1800ggatgatcaa
cagggcagtg tggtctccga atgtctggaa gctgatggag ctcagaattc 1860cactgtcaag
aaagagcagt agaggggtgt ggctgggcct gtcaccctgg ggccctccag 1920gtaggcccgt
tttcacgtgg agcatgggag ccacgaccct tcttaagaca tgtatcactg 1980tagagggaag
gaacagaggc cctgggccct tcctatcaga aggacatggt gaaggctggg 2040aacgtgagga
gaggcaatgg ccacggccca ttttggctgt agcacatggc acgttggctg 2100tgtggccttg
gcccacctgt gagtttaaag caaggcttta aatgactttg gagagggtca 2160caaatcctaa
aagaagcatt gaagtgaggt gtcatggatt aattgacccc tgtctatgga 2220attacatgta
aaacattatc ttgtcactgt agtttggttt tatttgaaaa cctgacaaaa 2280aaaaagttcc
aggtgtggaa tatgggggtt atctgtacat cctggggcat taaaaaaaaa 2340atcaatggtg
gggaactata aagaagtaac aaaagaagtg acatcttcag caaataaact 2400aggaaatttt
tttttcttcc agtttagaat cagccttgaa acattgatgg aataactctg 2460tggcattatt
gcattatata ccatttatct gtattaactt tggaatgtac tctgttcaat 2520gtttaatgct
gtggttgata tttcgaaagc tgctttaaaa aaatacatgc atctcagcgt 2580ttttttgttt
ttaattgtat ttagttatgg cctatacact atttgtgagc aaaggtgatc 2640gttttctgtt
tgagattttt atctcttgat tcttcaaaag cattctgaga aggtgagata 2700agccctgagt
ctcagctacc taagaaaaac ctggatgtca ctggccactg aggagctttg 2760tttcaaccaa
gtcatgtgca tttccacgtc aacagaattg tttattgtga cagttatatc 2820tgttgtccct
ttgaccttgt ttcttgaagg tttcctcgtc cctgggcaat tccgcattta 2880attcatggta
ttcaggatta catgcatgtt tggttaaacc catgagattc attcagttaa 2940aaatccagat
ggcaaatgac cagcagattc aaatctatgg tggtttgacc tttagagagt 3000tgctttacgt
ggcctgtttc aacacagacc cacccagagc cctcctgccc tccttccgcg 3060ggggctttct
catggctgtc cttcagggtc ttcctgaaat gcagtggtgc ttacgctcca 3120ccaagaaagc
aggaaacctg tggtatgaag ccagacctcc ccggcgggcc tcagggaaca 3180gaatgatcag
acctttgaat gattctaatt tttaagcaaa atattatttt atgaaaggtt 3240tacattgtca
aagtgatgaa tatggaatat ccaatcctgt gctgctatcc tgccaaaatc 3300attttaatgg
agtcagtttg cagtatgctc cacgtggtaa gatcctccaa gctgctttag 3360aagtaacaat
gaagaacgtg gacgctttta atataaagcc tgttttgtct tctgttgttg 3420ttcaaacggg
attcacagag tatttgaaaa atgtatatat attaagaggt cacgggggct 3480aattgctggc
tggctgcctt ttgctgtggg gttttgttac ctggttttaa taacagtaaa 3540tgtgcccagc
ctcttggccc cagaactgta cagtattgtg gctgcacttg ctctaagagt 3600agttgatgtt
gcattttcct tattgttaaa aacatgttag aagcaatgaa tgtatataaa 3660agcctcaact
agtcattttt ttctcctctt cttttttttc attatatcta attattttgc 3720agttgggcaa
cagagaacca tccctatttt gtattgaaga gggattcaca tctgcatctt 3780aactgctctt
tatgaatgaa aaaacagtcc tctgtatgta ctcctcttta cactggccag 3840ggtcagagtt
aaatagagta tatgcacttt ccaaattggg gacaagggct ctaaaaaaag 3900ccccaaaagg
agaagaacat ctgagaacct cctcggccct cccagtccct cgctgcacaa 3960atactccgca
agagaggcca gaatgacagc tgacagggtc tatggccatc gggtcgtctc 4020cgaagatttg
gcaggggcag aaaactctgg caggcttaag atttggaata aagtcacaga 4080atcaaggaag
cacctcaatt tagttcaaac aagacgccaa cattctctcc acagctcact 4140tacctctctg
tgttcagatg tggccttcca tttatatgtg atctttgttt tattagtaaa 4200tgcttatcat
ctaaagatgt agctctggcc cagtgggaaa aattaggaag tgattataaa 4260tcgagaggag
ttataataat caagattaaa tgtaaataat cagggcaatc ccaacacatg 4320tctagctttc
acctccagga tctattgagt gaacagaatt gcaaatagtc tctatttgta 4380attgaactta
tcctaaaaca aatagtttat aaatgtgaac ttaaactcta attaattcca 4440actgtacttt
taaggcagtg gctgttttta gactttctta tcacttatag ttagtaatgt 4500acacctactc
tatcagagaa aaacaggaaa ggctcgaaat acaagccatt ctaaggaaat 4560tagggagtca
gttgaaattc tattctgatc ttattctgtg gtgtcttttg cagcccagac 4620aaatgtggtt
acacactttt taagaaatac aattctacat tgtcaagctt atgaaggttc 4680caatcagatc
tttattgtta ttcaatttgg atctttcagg gatttttttt ttaaattatt 4740atgggacaaa
ggacatttgt tggaggggtg ggagggagga acaattttta aatataaaac 4800attcccaagt
ttggatcagg gagttggaag ttttcagaat aaccagaact aagggtatga 4860aggacctgta
ttggggtcga tgtgatgcct ctgcgaagaa ccttgtgtga caaatgagaa 4920acattttgaa
gtttgtggta cgacctttag attccagaga catcagcatg gctcaaagtg 4980cagctccgtt
tggcagtgca atggtataaa tttcaagctg gatatgtcta atgggtattt 5040aaacaataaa
tgtgcagttt taactaacag gatatttaat gacaaccttc tggttggtag 5100ggacatctgt
ttctaaatgt ttattatgta caatacagaa aaaaatttta taaaattaag 5160caatgtgaaa
ctgaattgga gagtgataat acaagtcctt tagtcttacc cagtgaatca 5220ttctgttcca
tgtctttgga caaccatgac cttggacaat catgaaatat gcatctcact 5280ggatgcaaag
aaaatcagat ggagcatgaa tggtactgta ccggttcatc tggactgccc 5340cagaaaaata
acttcaagca aacatcctat caacaacaag gttgttctgc ataccaagct 5400gagcacagaa
gatgggaaca ctggtggagg atggaaaggc tcgctcaatc aagaaaattc 5460tgagactatt
aataaataag actgtagtgt agatactgag taaatccatg cacctaaacc 5520ttttggaaaa
tctgccgtgg gccctccaga tagctcattt cattaagttt ttccctccaa 5580ggtagaattt
gcaagagtga cagtggattg catttctttt ggggaagctt tcttttggtg 5640gttttgttta
ttataccttc ttaagttttc aaccaaggtt tgcttttgtt ttgagttact 5700ggggttattt
ttgttttaaa taaaaataag tgtacaataa gtgtttttgt attgaaagct 5760tttgttatca
agattttcat acttttacct tccatggctc tttttaagat tgatactttt 5820aagaggtggc
tgatattctg caacactgta cacataaaaa atacggtaag gatactttac 5880atggttaagg
taaagtaagt ctccagttgg ccaccattag ctataatggc actttgtttg 5940tgttgttgga
aaaagtcaca ttgccattaa actttccttg tctgtctagt taatattgtg 6000aagaaaaata
aagtacagtg tgagatactg
603029010987DNAHomo sapiens 290ggtggcgcga gcttctgaaa ctaggcggca
gaggcggagc cgctgtggca ctgctgcgcc 60tctgctgcgc ctcgggtgtc ttttgcggcg
gtgggtcgcc gccgggagaa gcgtgagggg 120acagatttgt gaccggcgcg gtttttgtca
gcttactccg gccaaaaaag aactgcacct 180ctggagcgga cttatttacc aagcattgga
ggaatatcgt aggtaaaaat gcctattgga 240tccaaagaga ggccaacatt ttttgaaatt
tttaagacac gctgcaacaa agcagattta 300ggaccaataa gtcttaattg gtttgaagaa
ctttcttcag aagctccacc ctataattct 360gaacctgcag aagaatctga acataaaaac
aacaattacg aaccaaacct atttaaaact 420ccacaaagga aaccatctta taatcagctg
gcttcaactc caataatatt caaagagcaa 480gggctgactc tgccgctgta ccaatctcct
gtaaaagaat tagataaatt caaattagac 540ttaggaagga atgttcccaa tagtagacat
aaaagtcttc gcacagtgaa aactaaaatg 600gatcaagcag atgatgtttc ctgtccactt
ctaaattctt gtcttagtga aagtcctgtt 660gttctacaat gtacacatgt aacaccacaa
agagataagt cagtggtatg tgggagtttg 720tttcatacac caaagtttgt gaagggtcgt
cagacaccaa aacatatttc tgaaagtcta 780ggagctgagg tggatcctga tatgtcttgg
tcaagttctt tagctacacc acccaccctt 840agttctactg tgctcatagt cagaaatgaa
gaagcatctg aaactgtatt tcctcatgat 900actactgcta atgtgaaaag ctatttttcc
aatcatgatg aaagtctgaa gaaaaatgat 960agatttatcg cttctgtgac agacagtgaa
aacacaaatc aaagagaagc tgcaagtcat 1020ggatttggaa aaacatcagg gaattcattt
aaagtaaata gctgcaaaga ccacattgga 1080aagtcaatgc caaatgtcct agaagatgaa
gtatatgaaa cagttgtaga tacctctgaa 1140gaagatagtt tttcattatg tttttctaaa
tgtagaacaa aaaatctaca aaaagtaaga 1200actagcaaga ctaggaaaaa aattttccat
gaagcaaacg ctgatgaatg tgaaaaatct 1260aaaaaccaag tgaaagaaaa atactcattt
gtatctgaag tggaaccaaa tgatactgat 1320ccattagatt caaatgtagc acatcagaag
ccctttgaga gtggaagtga caaaatctcc 1380aaggaagttg taccgtcttt ggcctgtgaa
tggtctcaac taaccctttc aggtctaaat 1440ggagcccaga tggagaaaat acccctattg
catatttctt catgtgacca aaatatttca 1500gaaaaagacc tattagacac agagaacaaa
agaaagaaag attttcttac ttcagagaat 1560tctttgccac gtatttctag cctaccaaaa
tcagagaagc cattaaatga ggaaacagtg 1620gtaaataaga gagatgaaga gcagcatctt
gaatctcata cagactgcat tcttgcagta 1680aagcaggcaa tatctggaac ttctccagtg
gcttcttcat ttcagggtat caaaaagtct 1740atattcagaa taagagaatc acctaaagag
actttcaatg caagtttttc aggtcatatg 1800actgatccaa actttaaaaa agaaactgaa
gcctctgaaa gtggactgga aatacatact 1860gtttgctcac agaaggagga ctccttatgt
ccaaatttaa ttgataatgg aagctggcca 1920gccaccacca cacagaattc tgtagctttg
aagaatgcag gtttaatatc cactttgaaa 1980aagaaaacaa ataagtttat ttatgctata
catgatgaaa cattttataa aggaaaaaaa 2040ataccgaaag accaaaaatc agaactaatt
aactgttcag cccagtttga agcaaatgct 2100tttgaagcac cacttacatt tgcaaatgct
gattcaggtt tattgcattc ttctgtgaaa 2160agaagctgtt cacagaatga ttctgaagaa
ccaactttgt ccttaactag ctcttttggg 2220acaattctga ggaaatgttc tagaaatgaa
acatgttcta ataatacagt aatctctcag 2280gatcttgatt ataaagaagc aaaatgtaat
aaggaaaaac tacagttatt tattacccca 2340gaagctgatt ctctgtcatg cctgcaggaa
ggacagtgtg aaaatgatcc aaaaagcaaa 2400aaagtttcag atataaaaga agaggtcttg
gctgcagcat gtcacccagt acaacattca 2460aaagtggaat acagtgatac tgactttcaa
tcccagaaaa gtcttttata tgatcatgaa 2520aatgccagca ctcttatttt aactcctact
tccaaggatg ttctgtcaaa cctagtcatg 2580atttctagag gcaaagaatc atacaaaatg
tcagacaagc tcaaaggtaa caattatgaa 2640tctgatgttg aattaaccaa aaatattccc
atggaaaaga atcaagatgt atgtgcttta 2700aatgaaaatt ataaaaacgt tgagctgttg
ccacctgaaa aatacatgag agtagcatca 2760ccttcaagaa aggtacaatt caaccaaaac
acaaatctaa gagtaatcca aaaaaatcaa 2820gaagaaacta cttcaatttc aaaaataact
gtcaatccag actctgaaga acttttctca 2880gacaatgaga ataattttgt cttccaagta
gctaatgaaa ggaataatct tgctttagga 2940aatactaagg aacttcatga aacagacttg
acttgtgtaa acgaacccat tttcaagaac 3000tctaccatgg ttttatatgg agacacaggt
gataaacaag caacccaagt gtcaattaaa 3060aaagatttgg tttatgttct tgcagaggag
aacaaaaata gtgtaaagca gcatataaaa 3120atgactctag gtcaagattt aaaatcggac
atctccttga atatagataa aataccagaa 3180aaaaataatg attacatgaa caaatgggca
ggactcttag gtccaatttc aaatcacagt 3240tttggaggta gcttcagaac agcttcaaat
aaggaaatca agctctctga acataacatt 3300aagaagagca aaatgttctt caaagatatt
gaagaacaat atcctactag tttagcttgt 3360gttgaaattg taaatacctt ggcattagat
aatcaaaaga aactgagcaa gcctcagtca 3420attaatactg tatctgcaca tttacagagt
agtgtagttg tttctgattg taaaaatagt 3480catataaccc ctcagatgtt attttccaag
caggatttta attcaaacca taatttaaca 3540cctagccaaa aggcagaaat tacagaactt
tctactatat tagaagaatc aggaagtcag 3600tttgaattta ctcagtttag aaaaccaagc
tacatattgc agaagagtac atttgaagtg 3660cctgaaaacc agatgactat cttaaagacc
acttctgagg aatgcagaga tgctgatctt 3720catgtcataa tgaatgcccc atcgattggt
caggtagaca gcagcaagca atttgaaggt 3780acagttgaaa ttaaacggaa gtttgctggc
ctgttgaaaa atgactgtaa caaaagtgct 3840tctggttatt taacagatga aaatgaagtg
gggtttaggg gcttttattc tgctcatggc 3900acaaaactga atgtttctac tgaagctctg
caaaaagctg tgaaactgtt tagtgatatt 3960gagaatatta gtgaggaaac ttctgcagag
gtacatccaa taagtttatc ttcaagtaaa 4020tgtcatgatt ctgttgtttc aatgtttaag
atagaaaatc ataatgataa aactgtaagt 4080gaaaaaaata ataaatgcca actgatatta
caaaataata ttgaaatgac tactggcact 4140tttgttgaag aaattactga aaattacaag
agaaatactg aaaatgaaga taacaaatat 4200actgctgcca gtagaaattc tcataactta
gaatttgatg gcagtgattc aagtaaaaat 4260gatactgttt gtattcataa agatgaaacg
gacttgctat ttactgatca gcacaacata 4320tgtcttaaat tatctggcca gtttatgaag
gagggaaaca ctcagattaa agaagatttg 4380tcagatttaa cttttttgga agttgcgaaa
gctcaagaag catgtcatgg taatacttca 4440aataaagaac agttaactgc tactaaaacg
gagcaaaata taaaagattt tgagacttct 4500gatacatttt ttcagactgc aagtgggaaa
aatattagtg tcgccaaaga gtcatttaat 4560aaaattgtaa atttctttga tcagaaacca
gaagaattgc ataacttttc cttaaattct 4620gaattacatt ctgacataag aaagaacaaa
atggacattc taagttatga ggaaacagac 4680atagttaaac acaaaatact gaaagaaagt
gtcccagttg gtactggaaa tcaactagtg 4740accttccagg gacaacccga acgtgatgaa
aagatcaaag aacctactct gttgggtttt 4800catacagcta gcgggaaaaa agttaaaatt
gcaaaggaat ctttggacaa agtgaaaaac 4860ctttttgatg aaaaagagca aggtactagt
gaaatcacca gttttagcca tcaatgggca 4920aagaccctaa agtacagaga ggcctgtaaa
gaccttgaat tagcatgtga gaccattgag 4980atcacagctg ccccaaagtg taaagaaatg
cagaattctc tcaataatga taaaaacctt 5040gtttctattg agactgtggt gccacctaag
ctcttaagtg ataatttatg tagacaaact 5100gaaaatctca aaacatcaaa aagtatcttt
ttgaaagtta aagtacatga aaatgtagaa 5160aaagaaacag caaaaagtcc tgcaacttgt
tacacaaatc agtcccctta ttcagtcatt 5220gaaaattcag ccttagcttt ttacacaagt
tgtagtagaa aaacttctgt gagtcagact 5280tcattacttg aagcaaaaaa atggcttaga
gaaggaatat ttgatggtca accagaaaga 5340ataaatactg cagattatgt aggaaattat
ttgtatgaaa ataattcaaa cagtactata 5400gctgaaaatg acaaaaatca tctctccgaa
aaacaagata cttatttaag taacagtagc 5460atgtctaaca gctattccta ccattctgat
gaggtatata atgattcagg atatctctca 5520aaaaataaac ttgattctgg tattgagcca
gtattgaaga atgttgaaga tcaaaaaaac 5580actagttttt ccaaagtaat atccaatgta
aaagatgcaa atgcataccc acaaactgta 5640aatgaagata tttgcgttga ggaacttgtg
actagctctt caccctgcaa aaataaaaat 5700gcagccatta aattgtccat atctaatagt
aataattttg aggtagggcc acctgcattt 5760aggatagcca gtggtaaaat cgtttgtgtt
tcacatgaaa caattaaaaa agtgaaagac 5820atatttacag acagtttcag taaagtaatt
aaggaaaaca acgagaataa atcaaaaatt 5880tgccaaacga aaattatggc aggttgttac
gaggcattgg atgattcaga ggatattctt 5940cataactctc tagataatga tgaatgtagc
acgcattcac ataaggtttt tgctgacatt 6000cagagtgaag aaattttaca acataaccaa
aatatgtctg gattggagaa agtttctaaa 6060atatcacctt gtgatgttag tttggaaact
tcagatatat gtaaatgtag tatagggaag 6120cttcataagt cagtctcatc tgcaaatact
tgtgggattt ttagcacagc aagtggaaaa 6180tctgtccagg tatcagatgc ttcattacaa
aacgcaagac aagtgttttc tgaaatagaa 6240gatagtacca agcaagtctt ttccaaagta
ttgtttaaaa gtaacgaaca ttcagaccag 6300ctcacaagag aagaaaatac tgctatacgt
actccagaac atttaatatc ccaaaaaggc 6360ttttcatata atgtggtaaa ttcatctgct
ttctctggat ttagtacagc aagtggaaag 6420caagtttcca ttttagaaag ttccttacac
aaagttaagg gagtgttaga ggaatttgat 6480ttaatcagaa ctgagcatag tcttcactat
tcacctacgt ctagacaaaa tgtatcaaaa 6540atacttcctc gtgttgataa gagaaaccca
gagcactgtg taaactcaga aatggaaaaa 6600acctgcagta aagaatttaa attatcaaat
aacttaaatg ttgaaggtgg ttcttcagaa 6660aataatcact ctattaaagt ttctccatat
ctctctcaat ttcaacaaga caaacaacag 6720ttggtattag gaaccaaagt ctcacttgtt
gagaacattc atgttttggg aaaagaacag 6780gcttcaccta aaaacgtaaa aatggaaatt
ggtaaaactg aaactttttc tgatgttcct 6840gtgaaaacaa atatagaagt ttgttctact
tactccaaag attcagaaaa ctactttgaa 6900acagaagcag tagaaattgc taaagctttt
atggaagatg atgaactgac agattctaaa 6960ctgccaagtc atgccacaca ttctcttttt
acatgtcccg aaaatgagga aatggttttg 7020tcaaattcaa gaattggaaa aagaagagga
gagcccctta tcttagtggg agaaccctca 7080atcaaaagaa acttattaaa tgaatttgac
aggataatag aaaatcaaga aaaatcctta 7140aaggcttcaa aaagcactcc agatggcaca
ataaaagatc gaagattgtt tatgcatcat 7200gtttctttag agccgattac ctgtgtaccc
tttcgcacaa ctaaggaacg tcaagagata 7260cagaatccaa attttaccgc acctggtcaa
gaatttctgt ctaaatctca tttgtatgaa 7320catctgactt tggaaaaatc ttcaagcaat
ttagcagttt caggacatcc attttatcaa 7380gtttctgcta caagaaatga aaaaatgaga
cacttgatta ctacaggcag accaaccaaa 7440gtctttgttc caccttttaa aactaaatca
cattttcaca gagttgaaca gtgtgttagg 7500aatattaact tggaggaaaa cagacaaaag
caaaacattg atggacatgg ctctgatgat 7560agtaaaaata agattaatga caatgagatt
catcagttta acaaaaacaa ctccaatcaa 7620gcagcagctg taactttcac aaagtgtgaa
gaagaacctt tagatttaat tacaagtctt 7680cagaatgcca gagatataca ggatatgcga
attaagaaga aacaaaggca acgcgtcttt 7740ccacagccag gcagtctgta tcttgcaaaa
acatccactc tgcctcgaat ctctctgaaa 7800gcagcagtag gaggccaagt tccctctgcg
tgttctcata aacagctgta tacgtatggc 7860gtttctaaac attgcataaa aattaacagc
aaaaatgcag agtcttttca gtttcacact 7920gaagattatt ttggtaagga aagtttatgg
actggaaaag gaatacagtt ggctgatggt 7980ggatggctca taccctccaa tgatggaaag
gctggaaaag aagaatttta tagggctctg 8040tgtgacactc caggtgtgga tccaaagctt
atttctagaa tttgggttta taatcactat 8100agatggatca tatggaaact ggcagctatg
gaatgtgcct ttcctaagga atttgctaat 8160agatgcctaa gcccagaaag ggtgcttctt
caactaaaat acagatatga tacggaaatt 8220gatagaagca gaagatcggc tataaaaaag
ataatggaaa gggatgacac agctgcaaaa 8280acacttgttc tctgtgtttc tgacataatt
tcattgagcg caaatatatc tgaaacttct 8340agcaataaaa ctagtagtgc agatacccaa
aaagtggcca ttattgaact tacagatggg 8400tggtatgctg ttaaggccca gttagatcct
cccctcttag ctgtcttaaa gaatggcaga 8460ctgacagttg gtcagaagat tattcttcat
ggagcagaac tggtgggctc tcctgatgcc 8520tgtacacctc ttgaagcccc agaatctctt
atgttaaaga tttctgctaa cagtactcgg 8580cctgctcgct ggtataccaa acttggattc
tttcctgacc ctagaccttt tcctctgccc 8640ttatcatcgc ttttcagtga tggaggaaat
gttggttgtg ttgatgtaat tattcaaaga 8700gcatacccta tacagtggat ggagaagaca
tcatctggat tatacatatt tcgcaatgaa 8760agagaggaag aaaaggaagc agcaaaatat
gtggaggccc aacaaaagag actagaagcc 8820ttattcacta aaattcagga ggaatttgaa
gaacatgaag aaaacacaac aaaaccatat 8880ttaccatcac gtgcactaac aagacagcaa
gttcgtgctt tgcaagatgg tgcagagctt 8940tatgaagcag tgaagaatgc agcagaccca
gcttaccttg agggttattt cagtgaagag 9000cagttaagag ccttgaataa tcacaggcaa
atgttgaatg ataagaaaca agctcagatc 9060cagttggaaa ttaggaaggc catggaatct
gctgaacaaa aggaacaagg tttatcaagg 9120gatgtcacaa ccgtgtggaa gttgcgtatt
gtaagctatt caaaaaaaga aaaagattca 9180gttatactga gtatttggcg tccatcatca
gatttatatt ctctgttaac agaaggaaag 9240agatacagaa tttatcatct tgcaacttca
aaatctaaaa gtaaatctga aagagctaac 9300atacagttag cagcgacaaa aaaaactcag
tatcaacaac taccggtttc agatgaaatt 9360ttatttcaga tttaccagcc acgggagccc
cttcacttca gcaaattttt agatccagac 9420tttcagccat cttgttctga ggtggaccta
ataggatttg tcgtttctgt tgtgaaaaaa 9480acaggacttg cccctttcgt ctatttgtca
gacgaatgtt acaatttact ggcaataaag 9540ttttggatag accttaatga ggacattatt
aagcctcata tgttaattgc tgcaagcaac 9600ctccagtggc gaccagaatc caaatcaggc
cttcttactt tatttgctgg agatttttct 9660gtgttttctg ctagtccaaa agagggccac
tttcaagaga cattcaacaa aatgaaaaat 9720actgttgaga atattgacat actttgcaat
gaagcagaaa acaagcttat gcatatactg 9780catgcaaatg atcccaagtg gtccacccca
actaaagact gtacttcagg gccgtacact 9840gctcaaatca ttcctggtac aggaaacaag
cttctgatgt cttctcctaa ttgtgagata 9900tattatcaaa gtcctttatc actttgtatg
gccaaaagga agtctgtttc cacacctgtc 9960tcagcccaga tgacttcaaa gtcttgtaaa
ggggagaaag agattgatga ccaaaagaac 10020tgcaaaaaga gaagagcctt ggatttcttg
agtagactgc ctttacctcc acctgttagt 10080cccatttgta catttgtttc tccggctgca
cagaaggcat ttcagccacc aaggagttgt 10140ggcaccaaat acgaaacacc cataaagaaa
aaagaactga attctcctca gatgactcca 10200tttaaaaaat tcaatgaaat ttctcttttg
gaaagtaatt caatagctga cgaagaactt 10260gcattgataa atacccaagc tcttttgtct
ggttcaacag gagaaaaaca atttatatct 10320gtcagtgaat ccactaggac tgctcccacc
agttcagaag attatctcag actgaaacga 10380cgttgtacta catctctgat caaagaacag
gagagttccc aggccagtac ggaagaatgt 10440gagaaaaata agcaggacac aattacaact
aaaaaatata tctaagcatt tgcaaaggcg 10500acaataaatt attgacgctt aacctttcca
gtttataaga ctggaatata atttcaaacc 10560acacattagt acttatgttg cacaatgaga
aaagaaatta gtttcaaatt tacctcagcg 10620tttgtgtatc gggcaaaaat cgttttgccc
gattccgtat tggtatactt ttgcttcagt 10680tgcatatctt aaaactaaat gtaatttatt
aactaatcaa gaaaaacatc tttggctgag 10740ctcggtggct catgcctgta atcccaacac
tttgagaagc tgaggtggga ggagtgcttg 10800aggccaggag ttcaagacca gcctgggcaa
catagggaga cccccatctt tacgaagaaa 10860aaaaaaaagg ggaaaagaaa atcttttaaa
tctttggatt tgatcactac aagtattatt 10920ttacaatcaa caaaatggtc atccaaactc
aaacttgaga aaatatcttg ctttcaaatt 10980gacacta
109872911552DNAHomo sapiens 291gcccgtacac
accgtgtgct gggacacccc acagtcagcc gcatggctcc cctgtgcccc 60agcccctggc
tccctctgtt gatcccggcc cctgctccag gcctcactgt gcaactgctg 120ctgtcactgc
tgcttctgat gcctgtccat ccccagaggt tgccccggat gcaggaggat 180tcccccttgg
gaggaggctc ttctggggaa gatgacccac tgggcgagga ggatctgccc 240agtgaagagg
attcacccag agaggaggat ccacccggag aggaggatct acctggagag 300gaggatctac
ctggagagga ggatctacct gaagttaagc ctaaatcaga agaagagggc 360tccctgaagt
tagaggatct acctactgtt gaggctcctg gagatcctca agaaccccag 420aataatgccc
acagggacaa agaaggggat gaccagagtc attggcgcta tggaggcgac 480ccgccctggc
cccgggtgtc cccagcctgc gcgggccgct tccagtcccc ggtggatatc 540cgcccccagc
tcgccgcctt ctgcccggcc ctgcgccccc tggaactcct gggcttccag 600ctcccgccgc
tcccagaact gcgcctgcgc aacaatggcc acagtgtgca actgaccctg 660cctcctgggc
tagagatggc tctgggtccc gggcgggagt accgggctct gcagctgcat 720ctgcactggg
gggctgcagg tcgtccgggc tcggagcaca ctgtggaagg ccaccgtttc 780cctgccgaga
tccacgtggt tcacctcagc accgcctttg ccagagttga cgaggccttg 840gggcgcccgg
gaggcctggc cgtgttggcc gcctttctgg aggagggccc ggaagaaaac 900agtgcctatg
agcagttgct gtctcgcttg gaagaaatcg ctgaggaagg ctcagagact 960caggtcccag
gactggacat atctgcactc ctgccctctg acttcagccg ctacttccaa 1020tatgaggggt
ctctgactac accgccctgt gcccagggtg tcatctggac tgtgtttaac 1080cagacagtga
tgctgagtgc taagcagctc cacaccctct ctgacaccct gtggggacct 1140ggtgactctc
ggctacagct gaacttccga gcgacgcagc ctttgaatgg gcgagtgatt 1200gaggcctcct
tccctgctgg agtggacagc agtcctcggg ctgctgagcc agtccagctg 1260aattcctgcc
tggctgctgg tgacatccta gccctggttt ttggcctcct ttttgctgtc 1320accagcgtcg
cgttccttgt gcagatgaga aggcagcaca gaaggggaac caaagggggt 1380gtgagctacc
gcccagcaga ggtagccgag actggagcct agaggctgga tcttggagaa 1440tgtgagaagc
cagccagagg catctgaggg ggagccggta actgtcctgt cctgctcatt 1500atgccacttc
cttttaactg ccaagaaatt ttttaaaata aatatttata at
1552
292 1578 DNA Homo sapiens
292 acgaacaggc caataaggag ggagcagtgc ggggtttaaa
tctgaggcta ggctggctct 60tctcggcgtg ctgcggcgga acggctgttg gtttctgctg
gttgtaggtc cttggctggt 120cgggcctccg gtgttctgct tctccccgct gagctgctgc
ctggtgaaga ggaagccatg 180gcgctccgag tcaccaggaa ctcgaaaatt aatgctgaaa
ataaggcgaa gatcaacatg 240gcaggcgcaa agcgcgttcc tacggcccct gctgcaacct
ccaagcccgg actgaggcca 300agaacagctc ttggggacat tggtaacaaa gtcagtgaac
aactgcaggc caaaatgcct 360atgaagaagg aagcaaaacc ttcagctact ggaaaagtca
ttgataaaaa actaccaaaa 420cctcttgaaa aggtacctat gctggtgcca gtgccagtgt
ctgagccagt gccagagcca 480gaacctgagc cagaacctga gcctgttaaa gaagaaaaac
tttcgcctga gcctattttg 540gttgatactg cctctccaag cccaatggaa acatctggat
gtgcccctgc agaagaagac 600ctgtgtcagg ctttctctga tgtaattctt gcagtaaatg
atgtggatgc agaagatgga 660gctgatccaa acctttgtag tgaatatgtg aaagatattt
atgcttatct gagacaactt 720gaggaagagc aagcagtcag accaaaatac ctactgggtc
gggaagtcac tggaaacatg 780agagccatcc taattgactg gctagtacag gttcaaatga
aattcaggtt gttgcaggag 840accatgtaca tgactgtctc cattattgat cggttcatgc
agaataattg tgtgcccaag 900aagatgctgc agctggttgg tgtcactgcc atgtttattg
caagcaaata tgaagaaatg 960taccctccag aaattggtga ctttgctttt gtgactgaca
acacttatac taagcaccaa 1020atcagacaga tggaaatgaa gattctaaga gctttaaact
ttggtctggg tcggcctcta 1080cctttgcact tccttcggag agcatctaag attggagagg
ttgatgtcga gcaacatact 1140ttggccaaat acctgatgga actaactatg ttggactatg
acatggtgca ctttcctcct 1200tctcaaattg cagcaggagc tttttgctta gcactgaaaa
ttctggataa tggtgaatgg 1260acaccaactc tacaacatta cctgtcatat actgaagaat
ctcttcttcc agttatgcag 1320cacctggcta agaatgtagt catggtaaat caaggactta
caaagcacat gactgtcaag 1380aacaagtatg ccacatcgaa gcatgctaag atcagcactc
taccacagct gaattctgca 1440ctagttcaag atttagccaa ggctgtggca aaggtgtaac
ttgtaaactt gagttggagt 1500actatattta caaataaaat tggcaccatg tgccatctgt
aaaaaaaaaa aaaaaaaaaa 1560aaaaaaaaaa aaaaaaaa
1578
293 3195 DNA Homo sapiens
293 agaggcttcc ctggctggtg
cctgagcccg gcgtccctcg ccccccgccc tccccgcatc 60cctctcctcc ctcgcgcctg
gccctgtggc tcttcctccc tccctccttc cccccccccc 120cacccctcgc ccgctgcctc
cctcggccca gccagctgtg ccggcgtttg ttggctgccc 180tgcgcccggc cctccagcca
gccttctgcc ggccccgccg cgatggaggt gccccagccg 240gagcccgcgc caggctcggc
tctcagtcca gcaggcgtgt gcggtggcgc ccagcgtccg 300ggccacctcc cgggcctcct
gctgggatct catggcctcc tggggtcccc ggtgcgggcg 360gccgcttcct cgccggtcac
caccctcacc cagaccatgc acgacctcgc cgggctcggc 420agccgcagcc gcctgacgca
cctatccctg tctcgacggg catccgaatc ctccctgtcg 480tctgaatcct ccgaatcttc
tgatgcaggt ctctgcatgg attcccccag ccctatggac 540ccccacatgg cggagcagac
gtttgaacag gccatccagg cagccagccg gatcattcga 600aacgagcagt ttgccatcag
acgcttccag tctatgccgg tgaggctgct gggccacagc 660cccgtgcttc ggaacatcac
caactcccag gcgcccgacg gccggaggaa gagcgaggcg 720ggcagtggag ctgccagcag
ctctggggaa gacaaggaga atgtgcgctt ctggaaggcc 780ggggtgggag ctctccggga
agaggagggg gcatgctggg gtggttccct ggcatgtgag 840gaccctcctc tcccatcttg
gctgcaggat ggatttgtct tcaagatgcc atggaagccc 900acacatccca gctccaccca
tgctctggca gagtgggcca gccgcaggga agcctttgcc 960cagagaccca gctcggcccc
cgacctgatg tgtctcagtc ctgaccggaa gatggaagtg 1020gaggagctca gccccctggc
cctaggtcgc ttctctctga cccctgcaga gggggatact 1080gaggaagatg atggatttgt
ggacatccta gagagtgact taaaggatga tgatgcagtt 1140cccccaggca tggagagtct
cattagtgcc ccactggtca agaccttgga aaaggaagag 1200gaaaaggacc tcgtcatgta
cagcaagtgc cagcggctct tccgctctcc gtccatgccc 1260tgcagcgtga tccggcccat
cctcaagagg ctggagcggc cccaggacag ggacacgccc 1320gtgcagaata agcggaggcg
gagcgtgacc cctcctgagg agcagcagga ggctgaggaa 1380cctaaagccc gcgtcctccg
ctcaaaatca ctgtgtcacg atgagatcga gaacctcctg 1440gacagtgacc accgagagct
gattggagat tactctaagg ccttcctcct acagacagta 1500gacggaaagc accaagacct
caagtacatc tcaccagaaa cgatggtggc cctattgacg 1560ggcaagttca gcaacatcgt
ggataagttt gtgattgtag actgcagata cccctatgaa 1620tatgaaggcg ggcacatcaa
gactgcggtg aacttgcccc tggaacgcga cgccgagagc 1680ttcctactga agagccccat
cgcgccctgt agcctggaca agagagtcat cctcattttc 1740cactgtgaat tctcatctga
gcgtgggccc cgcatgtgcc gtttcatcag ggaacgagac 1800cgtgctgtca acgactaccc
cagcctctac taccctgaga tgtatatcct gaaaggcggc 1860tacaaggagt tcttccctca
gcacccgaac ttctgtgaac cccaggacta ccggcccatg 1920aaccacgagg ccttcaagga
tgagctaaag accttccgcc tcaagactcg cagctgggct 1980ggggagcgga gccggcggga
gctctgtagc cggctgcagg accagtgagg ggcctgcgcc 2040agtcctgcta cctcccttgc
ctttcgaggc ctgaagccag ctgccctatg ggcctgccgg 2100gctgagggcc tgctggaggc
ctcaggtgct gtccatggga aagatggtgt ggtgtcctgc 2160ctgtctgccc cagcccagat
tcccctgtgt catcccatca ttttccatat cctggtgccc 2220cccacccctg gaagagccca
gtctgttgag ttagttaagt tgggttaata ccagcttaaa 2280ggcagtattt tgtgtcctcc
aggagcttct tgtttccttg ttagggttaa cccttcatct 2340tcctgtgtcc tgaaacgctc
ctttgtgtgt gtgtcagctg aggctgggga gagccgtggt 2400ccctgaggat gggtcagagc
taaactcctt cctggcctga gagtcagctc tctgccctgt 2460gtacttcccg ggccagggct
gcccctaatc tctgtaggaa ccgtggtatg tctgccatgt 2520tgcccctttc tcttttcccc
tttcctgtcc caccatacga gcacctccag cctgaacaga 2580agctcttact ctttcctatt
tcagtgttac ctgtgtgctt ggtctgtttg actttacgcc 2640catctcagga cacttccgta
gactgtttag gttcccctgt caaatatcag ttacccactc 2700ggtcccagtt ttgttgcccc
agaaagggat gttattatcc ttgggggctc ccagggcaag 2760ggttaaggcc tgaatcatga
gcctgctgga agcccagccc ctactgctgt gaaccctggg 2820gcctgactgc tcagaacttg
ctgctgtctt gttgcggatg gatggaaggt tggatggatg 2880ggtggatggc cgtggatggc
cgtggatgcg cagtgccttg catacccaaa ccaggtggga 2940gcgttttgtt gagcatgaca
cctgcagcag gaatatatgt gtgcctattt gtgtggacaa 3000aaatatttac acttagggtt
tggagctatt caagaggaaa tgtcacagaa gcagctaaac 3060caaggactga gcaccctctg
gattctgaat ctcaagatgg gggcagggct gtgcttgaag 3120gccctgctga gtcatctgtt
agggccttgg ttcaataaag cactgagcaa gttgagaaaa 3180aaaaaaaaaa aaaaa
3195
294 3737 DNA Homo sapiens
294 ggcgtccgcg cacacctccc cgcgccgccg ccgccaccgc ccgcactccg ccgcctctgc
60ccgcaaccgc tgagccatcc atgggggtcg cgggccgcaa ccgtcccggg gcggcctggg
120cggtgctgct gctgctgctg ctgctgccgc cactgctgct gctggcgggg gccgtcccgc
180cgggtcgggg ccgtgccgcg gggccgcagg aggatgtaga tgagtgtgcc caagggctag
240atgactgcca tgccgacgcc ctgtgtcaga acacacccac ctcctacaag tgctcctgca
300agcctggcta ccaaggggaa ggcaggcagt gtgaggacat cgatgaatgt ggaaatgagc
360tcaatggagg ctgtgtccat gactgtttga atattccagg caattatcgt tgcacttgtt
420ttgatggctt catgttggct catgacggtc ataattgtct tgatgtggac gagtgcctgg
480agaacaatgg cggctgccag catacctgtg tcaacgtcat ggggagctat gagtgctgct
540gcaaggaggg gtttttcctg agtgacaatc agcacacctg cattcaccgc tcggaagagg
600gcctgagctg catgaataag gatcacggct gtagtcacat ctgcaaggag gccccaaggg
660gcagcgtcgc ctgtgagtgc aggcctggtt ttgagctggc caagaaccag agagactgca
720tcttgacctg taaccatggg aacggtgggt gccagcactc ctgtgacgat acagccgatg
780gcccagagtg cagctgccat ccacagtaca agatgcacac agatgggagg agctgccttg
840agcgagagga cactgtcctg gaggtgacag agagcaacac cacatcagtg gtggatgggg
900ataaacgggt gaaacggcgg ctgctcatgg aaacgtgtgc tgtcaacaat ggaggctgtg
960accgcacctg taaggatact tcgacaggtg tccactgcag ttgtcctgtt ggattcactc
1020tccagttgga tgggaagaca tgtaaagata ttgatgagtg ccagacccgc aatggaggtt
1080gtgatcattt ctgcaaaaac atcgtgggca gttttgactg cggctgcaag aaaggattta
1140aattattaac agatgagaag tcttgccaag atgtggatga gtgctctttg gataggacct
1200gtgaccacag ctgcatcaac caccctggca catttgcttg tgcttgcaac cgagggtaca
1260ccctgtatgg cttcacccac tgtggagaca ccaatgagtg cagcatcaac aacggaggct
1320gtcagcaggt ctgtgtgaac acagtgggca gctatgaatg ccagtgccac cctgggtaca
1380agctccactg gaataaaaaa gactgtgtgg aagtgaaggg gctcctgccc acaagtgtgt
1440caccccgtgt gtccctgcac tgcggtaaga gtggtggagg agacgggtgc ttcctcagat
1500gtcactctgg cattcacctc tcttcagatg tcaccaccat caggacaagt gtaaccttta
1560agctaaatga aggcaagtgt agtttgaaaa atgctgagct gtttcccgag ggtctgcgac
1620cagcactacc agagaagcac agctcagtaa aagagagctt ccgctacgta aaccttacat
1680gcagctctgg caagcaagtc ccaggagccc ctggccgacc aagcacccct aaggaaatgt
1740ttatcactgt tgagtttgag cttgaaacta accaaaagga ggtgacagct tcttgtgacc
1800tgagctgcat cgtaaagcga accgagaagc ggctccgtaa agccatccgc acgctcagaa
1860aggccgtcca cagggagcag tttcacctcc agctctcagg catgaacctc gacgtggcta
1920aaaagcctcc cagaacatct gaacgccagg cagagtcctg tggagtgggc cagggtcatg
1980cagaaaacca atgtgtcagt tgcagggctg ggacctatta tgatggagca cgagaacgct
2040gcattttatg tccaaatgga accttccaaa atgaggaagg acaaatgact tgtgaaccat
2100gcccaagacc aggaaattct ggggccctga agaccccaga agcttggaat atgtctgaat
2160gtggaggtct gtgtcaacct ggtgaatatt ctgcagatgg ctttgcacct tgccagctct
2220gtgccctggg cacgttccag cctgaagctg gtcgaacttc ctgcttcccc tgtggaggag
2280gccttgccac caaacatcag ggagctactt cctttcagga ctgtgaaacc agagttcaat
2340gttcacctgg acatttctac aacaccacca ctcaccgatg tattcgttgc ccagtgggaa
2400cataccagcc tgaatttgga aaaaataatt gtgtttcttg cccaggaaat actacgactg
2460actttgatgg ctccacaaac ataacccagt gtaaaaacag aagatgtgga ggggagctgg
2520gagatttcac tgggtacatt gaatccccaa actacccagg caattaccca gccaacaccg
2580agtgtacgtg gaccatcaac ccacccccca agcgccgcat cctgatcgtg gtccctgaga
2640tcttcctgcc catagaggac gactgtgggg actatctggt gatgcggaaa acctcttcat
2700ccaattctgt gacaacatat gaaacctgcc agacctacga acgccccatc gccttcacct
2760ccaggtcaaa gaagctgtgg attcagttca agtccaatga agggaacagc gctagagggt
2820tccaggtccc atacgtgaca tatgatgagg actaccagga actcattgaa gacatagttc
2880gagatggcag gctctatgca tctgagaacc atcaggaaat acttaaggat aagaaactta
2940tcaaggctct gtttgatgtc ctggcccatc cccagaacta tttcaagtac acagcccagg
3000agtcccgaga gatgtttcca agatcgttca tccgattgct acgttccaaa gtgtccaggt
3060ttttgagacc ttacaaatga ctcagcccac gtgccactca atacaaatgt tctgctatag
3120ggttggtggg acagagctgt cttccttctg catgtcagca cagtcgggta ttgctgcctc
3180ccgtatcagt gactcattag agttcaattt ttatagataa tacagatatt ttggtaaatt
3240gaacttggtt tttctttccc agcatcgtgg atgtagactg agaatggctt tgagtggcat
3300cagcttctca ctgctgtggg cggatgtctt ggatagatca cgggctggct gagctggact
3360ttggtcagcc taggtgagac tcacctgtcc ttctggggtc ttactcctcc tcaaggagtc
3420tgtagtggaa aggaggccac agaataagct gcttattctg aaacttcagc ttcctctagc
3480ccggccctct ctaagggagc cctctgcact cgtgtgcagg ctctgaccag gcagaacagg
3540caagagggga gggaaggaga cccctgcagg ctccctccac ccaccttgag acctgggagg
3600actcagtttc tccacagcct tctccagcct gtgtgataca agtttgatcc caggaacttg
3660agttctaagc agtgctcgtg aaaaaaaaaa gcagaaagaa ttagaaataa ataaaaacta
3720agcacttctg gagacat
3737
295 2042 DNA Homo sapiens
295 ggggccagtc gttcgccgga aagcatttgt ctcccacctc
atcataacaa caattaattt 60cctctggggc ctgaggaggg cagaatttca accttcggtg
tgcttgggag tggcgattgt 120gatttacacg acaaaatgcc gaggtgctcg gtggagtcat
ggcagtgccc tttgtggaag 180actgggactt ggtgcaaacc ctgggagaag gtgcctatgg
agaagttcaa cttgctgtga 240atagagtaac tgaagaagca gtcgcagtga agattgtaga
tatgaagcgt gccgtagact 300gtccagaaaa tattaagaaa gagatctgta tcaataaaat
gctaaatcat gaaaatgtag 360taaaattcta tggtcacagg agagaaggca atatccaata
tttatttctg gagtactgta 420gtggaggaga gctttttgac agaatagagc cagacatagg
catgcctgaa ccagatgctc 480agagattctt ccatcaactc atggcagggg tggtttatct
gcatggtatt ggaataactc 540acagggatat taaaccagaa aatcttctgt tggatgaaag
ggataacctc aaaatctcag 600actttggctt ggcaacagta tttcggtata ataatcgtga
gcgtttgttg aacaagatgt 660gtggtacttt accatatgtt gctccagaac ttctgaagag
aagagaattt catgcagaac 720cagttgatgt ttggtcctgt ggaatagtac ttactgcaat
gctcgctgga gaattgccat 780gggaccaacc cagtgacagc tgtcaggagt attctgactg
gaaagaaaaa aaaacatacc 840tcaacccttg gaaaaaaatc gattctgctc ctctagctct
gctgcataaa atcttagttg 900agaatccatc agcaagaatt accattccag acatcaaaaa
agatagatgg tacaacaaac 960ccctcaagaa aggggcaaaa aggccccgag tcacttcagg
tggtgtgtca gagtctccca 1020gtggattttc taagcacatt caatccaatt tggacttctc
tccagtaaac agtgcttcta 1080gtgaagaaaa tgtgaagtac tccagttctc agccagaacc
ccgcacaggt ctttccttat 1140gggataccag cccctcatac attgataaat tggtacaagg
gatcagcttt tcccagccca 1200catgtcctga tcatatgctt ttgaatagtc agttacttgg
caccccagga tcctcacaga 1260acccctggca gcggttggtc aaaagaatga cacgattctt
taccaaattg gatgcagaca 1320aatcttatca atgcctgaaa gagacttgtg agaagttggg
ctatcaatgg aagaaaagtt 1380gtatgaatca ggttactata tcaacaactg ataggagaaa
caataaactc attttcaaag 1440tgaatttgtt agaaatggat gataaaatat tggttgactt
ccggctttct aagggtgatg 1500gattggagtt caagagacac ttcctgaaga ttaaagggaa
gctgattgat attgtgagca 1560gccagaaggt ttggcttcct gccacatgat cggaccatcg
gctctgggga atcctggtga 1620atatagtgct gctatgttga cattattctt cctagagaag
attatcctgt cctgcaaact 1680gcaaatagta gttcctgaag tgttcacttc cctgtttatc
caaacatctt ccaatttatt 1740ttgtttgttc ggcatacaaa taatacctat atcttaattg
taagcaaaac tttggggaaa 1800ggatgaatag aattcatttg attatttctt catgtgtgtt
tagtatctga atttgaaact 1860catctggtgg aaaccaagtt tcaggggaca tgagttttcc
agcttttata cacacgtatc 1920tcatttttat caaaacattt tgtttaattc aaaaagtaca
tatttcttcc atgttgattt 1980aattctaaga tgaaccaata aagacataat tcttgcaaaa
aaaaaaaaaa aaaaaaaaaa 2040aa
2042
296 2547 DNA Homo sapiens
296 cttacaaggt acagtcctct
gctcaggggg gccaggaggg tcttataggc atcattcacc 60agggtcgaat gcttctctga
gaagtccttt tcagtctgag acctctggct gaagaaatct 120gggtggacaa gacgctgcag
ttgctggtac ctgtgctgga gcttcgctgt atcaactctg 180aaggaacggt tgcagtccat
aaggctgaag tagtctcgag tggggtcagg tgcctgcagc 240gctcggcact gtgggcagaa
gaacctgtcc tcccgcccgg ggccccatgg gccgccgcag 300ttccaacagc ggggataatt
gcttcccgcc tgcgacgcag catcgcagct tagcggtctc 360cttctgggaa cccctgtcgg
ccaaaacccc cacacccgga gcaaagcccc ggctctcccc 420cgccacatct ggccggcggc
ctatctagcc gtggtcactc gtggggaaaa gcaaagagag 480cgtctaacca gactaatgtt
gctgattggc tggggagtcg agggggcggg atcacccgag 540gggaacccgg gttctaagtt
ccgctctccc ttctaaacta caactcccag gaggcattga 600ggcggcgcct gacggccaca
tctgctgctc ctcattggtc cggcggcagg ggagggggtt 660ttgattggct gagggtggag
tttgtatctg caggtttagc gccactctgc tggctgaggc 720tgcggagagt gtgcggctcc
aggtgggctc acgcggtcgt gatgtctcgg gagtcggatg 780ttgaggctca gcagtctcat
ggcagcagtg cctgttcaca gccccatggc agcgttaccc 840agtcccaagg ctcctcctca
cagtcccagg gcatatccag ctcctctacc agcacgatgc 900caaactccag ccagtcctct
cactccagct ctgggacact gagctcctta gagacagtgt 960ccactcagga actctattct
attcctgagg accaagaacc tgaggaccaa gaacctgagg 1020agcctacccc tgccccctgg
gctcgattat gggcccttca ggatggattt gccaatcttg 1080aatgtgtgaa tgacaactac
tggtttggga gggacaaaag ctgtgaatat tgctttgatg 1140aaccactgct gaaaagaaca
gataaatacc gaacatacag caagaaacac tttcggattt 1200tcagggaagt gggtcctaaa
aactcttaca ttgcatacat agaagatcac agtggcaatg 1260gaacctttgt aaatacagag
cttgtaggga aaggaaaacg ccgtcctttg aataacaatt 1320ctgaaattgc actgtcacta
agcagaaata aagtttttgt cttttttgat ctgactgtag 1380atgatcagtc agtttatcct
aaggcattaa gagatgaata catcatgtca aaaactcttg 1440gaagtggtgc ctgtggagag
gtaaagctgg ctttcgagag gaaaacatgt aagaaagtag 1500ccataaagat catcagcaaa
aggaagtttg ctattggttc agcaagagag gcagacccag 1560ctctcaatgt tgaaacagaa
atagaaattt tgaaaaagct aaatcatcct tgcatcatca 1620agattaaaaa cttttttgat
gcagaagatt attatattgt tttggaattg atggaagggg 1680gagagctgtt tgacaaagtg
gtggggaata aacgcctgaa agaagctacc tgcaagctct 1740atttttacca gatgctcttg
gctgtgcagt accttcatga aaacggtatt atacaccgtg 1800acttaaagcc agagaatgtt
ttactgtcat ctcaagaaga ggactgtctt ataaagatta 1860ctgattttgg gcactccaag
attttgggag agacctctct catgagaacc ttatgtggaa 1920cccccaccta cttggcgcct
gaagttcttg tttctgttgg gactgctggg tataaccgtg 1980ctgtggactg ctggagttta
ggagttattc tttttatctg ccttagtggg tatccacctt 2040tctctgagca taggactcaa
gtgtcactga aggatcagat caccagtgga aaatacaact 2100tcattcctga agtctgggca
gaagtctcag agaaagctct ggaccttgtc aagaagttgt 2160tggtagtgga tccaaaggca
cgttttacga cagaagaagc cttaagacac ccgtggcttc 2220aggatgaaga catgaagaga
aagtttcaag atcttctgtc tgaggaaaat gaatccacag 2280ctctacccca ggttctagcc
cagccttcta ctagtcgaaa gcggccccgt gaaggggaag 2340ccgagggtgc cgagaccaca
aagcgcccag ctgtgtgtgc tgctgtgttg tgaactccgt 2400ggtttgaaca cgaaagaaat
gtaccttctt tcactctgtc atctttcttt tctttgagtc 2460tgttttttta tagtttgtat
tttaattatg ggaataattg ctttttcaca gtcactgatg 2520tacaattaaa aacctgatgg
aacctgg 2547
297 2768 DNA Homo sapiens
297 cactgctgtg cagggcagga aagctccatg cacatagccc agcaaagagc aacacagagc
60tgaaaggaag actcagagga gagagataag taaggaaagt agtgatggct ctcatcccag
120acttggccat ggaaacctgg cttctcctgg ctgtcagcct ggtgctcctc tatctatatg
180gaacccattc acatggactt tttaagaagc ttggaattcc agggcccaca cctctgcctt
240ttttgggaaa tattttgtcc taccataagg gcttttgtat gtttgacatg gaatgtcata
300aaaagtatgg aaaagtgtgg ggcttttatg atggtcaaca gcctgtgctg gctatcacag
360atcctgacat gatcaaaaca gtgctagtga aagaatgtta ttctgtcttc acaaaccgga
420ggccttttgg tccagtggga tttatgaaaa gtgccatctc tatagctgag gatgaagaat
480ggaagagatt acgatcattg ctgtctccaa ccttcaccag tggaaaactc aaggagatgg
540tccctatcat tgcccagtat ggagatgtgt tggtgagaaa tctgaggcgg gaagcagaga
600caggcaagcc tgtcaccttg aaagacgtct ttggggccta cagcatggat gtgatcacta
660gcacatcatt tggagtgaac atcgactctc tcaacaatcc acaagacccc tttgtggaaa
720acaccaagaa gcttttaaga tttgattttt tggatccatt ctttctctca ataacagtct
780ttccattcct catcccaatt cttgaagtat taaatatctg tgtgtttcca agagaagtta
840caaatttttt aagaaaatct gtaaaaagga tgaaagaaag tcgcctcgaa gatacacaaa
900agcaccgagt ggatttcctt cagctgatga ttgactctca gaattcaaaa gaaactgagt
960cccacaaagc tctgtccgat ctggagctcg tggcccaatc aattatcttt atttttgctg
1020gctatgaaac cacgagcagt gttctctcct tcattatgta tgaactggcc actcaccctg
1080atgtccagca gaaactgcag gaggaaattg atgcagtttt acccaataag gcaccaccca
1140cctatgatac tgtgctacag atggagtatc ttgacatggt ggtgaatgaa acgctcagat
1200tattcccaat tgctatgaga cttgagaggg tctgcaaaaa agatgttgag atcaatggga
1260tgttcattcc caaaggggtg gtggtgatga ttccaagcta tgctcttcac cgtgacccaa
1320agtactggac agagcctgag aagttcctcc ctgaaagatt cagcaagaag aacaaggaca
1380acatagatcc ttacatatac acaccctttg gaagtggacc cagaaactgc attggcatga
1440ggtttgctct catgaacatg aaacttgctc taatcagagt ccttcagaac ttctccttca
1500aaccttgtaa agaaacacag atccccctga aattaagctt aggaggactt cttcaaccag
1560aaaaacccgt tgttctaaag gttgagtcaa gggatggcac cgtaagtgga gcctgaattt
1620tcctaaggac ttctgctttg ctcttcaaga aatctgtgcc tgagaacacc agagacctca
1680aattactttg tgaatagaac tctgaaatga agatgggctt catccaatgg actgcataaa
1740taaccgggga ttctgtacat gcattgagct ctctcattgt ctgtgtagag tgttatactt
1800gggaatataa aggaggtgac caaatcagtg tgaggaggta gatttggctc ctctgcttct
1860cacgggacta tttccaccac ccccagttag caccattaac tcctcctgag ctctgataag
1920agaatcaaca tttctcaata atttcctcca caaattatta atgaaaataa gaattatttt
1980gatggctcta acaatgacat ttatatcaca tgttttctct ggagtattct ataagtttta
2040tgttaaatca ataaagacca ctttacaaaa gtattatcag atgctttcct gcacattaag
2100gagaaatcta tagaactgaa tgagaaccaa caagtaaata tttttggtca ttgtaatcac
2160tgttggcgtg gggcctttgt cagaactaga atttgattat taacataggt gaaagttaat
2220ccactgtgac tttgcccatt gtttagaaag aatattcata gtttaattat gccttttttg
2280atcaggcaca gtggctcacg cctgtaatcc tagcagtttg ggaggctgag ccgggtggat
2340cgcctgaggt caggagttca agacaagcct ggcctacatg gttgaaaccc catctctact
2400aaaaatacac aaattagcta ggcatggtgg actcgcctgt aatctcacta cacaggaggc
2460tgaggcagga gaatcacttg aacctgggag gcggatgttg aagtgagctg agattgcacc
2520actgcactcc agtctgggtg agagtgagac tcagtcttaa aaaaatatgc ctttttgaag
2580cacgtacatt ttgtaacaaa gaactgaagc tcttattata ttattagttt tgatttaatg
2640ttttcagccc atctcctttc atatttctgg gagacagaaa acatgtttcc ctacacctct
2700tgcattccat cctcaacacc caactgtctc gatgcaatga acacttaata aaaaacagtc
2760gattggtc
2768
298 1358 DNA Homo sapiens
298 ggcgtccgcg cgctgcacaa tggcggctct gaagagttgg
ctgtcgcgca gcgtaacttc 60attcttcagg tacagacagt gtttgtgtgt tcctgttgtg
gctaacttta agaagcggtg 120tttctcagaa ttgataagac catggcacaa aactgtgacg
attggctttg gagtaaccct 180gtgtgcggtt cctattgcac agaaatcaga gcctcattcc
cttagtagtg aagcattgat 240gaggagagca gtgtctttgg taacagatag cacctctacc
tttctctctc agaccacata 300tgcgttgatt gaagctatta ctgaatatac taaggctgtt
tataccttaa cttctcttta 360ccgacaatat acaagtttac ttgggaaaat gaattcagag
gaggaagatg aagtgtggca 420ggtgatcata ggagccagag ctgagatgac ttcaaaacac
caagagtact tgaagctgga 480aaccacttgg atgactgcag ttggtctttc agagatggca
gcagaagctg catatcaaac 540tggcgcagat caggcctcta taaccgccag gaatcacatt
cagctggtga aactgcaggt 600ggaagaggtg caccagctct cccggaaagc agaaaccaag
ctggcagaag cacagataga 660agagctccgt cagaaaacac aggaggaagg ggaggagcgg
gctgagtcgg agcaggaggc 720ctacctgcgt gaggattgag ggcctgagca cactgccctg
tctccccact cagtggggaa 780agcaggggca gatgccaccc tgcccagggt tggcatgact
gtctgtgcac cgagaagagg 840cggcaggtcc tgccctggcc aatcaggcga gacgcctttg
tgagctgtga gtgcctcctg 900tggtctcagg cttgcgctgg acctggttct tagcccttgg
gcactgcacc ctgtttaaca 960tttcacccca ctctgtacag ctgctcttac ccattttttt
tacctcacac ccaaagcatt 1020ttgcctacct gggtcagaga gaggagtcct ttttgtcatg
cccttaagtt cagcaactgt 1080ttaacctgtt ttcagtctta tttacgtcgt caaaaatgat
ttagtacttg ttccctctgt 1140tgggatgcca gttgtggcag ggggagggga acctgtccag
tttgtacgat ttctttgtat 1200gtatttctga tgtgttctct gatctgcccc cactgtcctg
tgaggacagc tgaggccaag 1260gagtgaaaaa cctattacta ctaagagaag gggtgcagag
tgtttacctg gtgctctcaa 1320caggacttaa catcaacagg acttaacaca gaaaaaaa
1358
299 4407 DNA Homo sapiens
299 tttcgactcg cgctccggct
gctgtcactt ggctctctgg ctggagcttg aggacgcaag 60gagggtttgt cactggcaga
ctcgagactg taggcactgc catggcccct gtgctcagta 120aggactcggc ggacatcgag
agtatcctgg ctttaaatcc tcgaacacaa actcatgcaa 180ctctgtgttc cacttcggcc
aagaaattag acaagaaaca ttggaaaaga aatcctgata 240agaactgctt taattgtgag
aagctggaga ataattttga tgacatcaag cacacgactc 300ttggtgagcg aggagctctc
cgagaagcaa tgagatgcct gaaatgtgca gatgccccgt 360gtcagaagag ctgtccaact
aatcttgata ttaaatcatt catcacaagt attgcaaaca 420agaactatta tggagctgct
aagatgatat tttctgacaa cccacttggt ctgacttgtg 480gaatggtatg tccaacctct
gatctatgtg taggtggatg caatttatat gccactgaag 540agggacccat taatattggt
ggattgcagc aatttgctac tgaggtattc aaagcaatga 600gtatcccaca gatcagaaat
ccttcgctgc ctcccccaga aaaaatgtct gaagcctatt 660ctgcaaagat tgctcttttt
ggtgctgggc ctgcaagtat aagttgtgct tcctttttgg 720ctcgattggg gtactctgac
atcactatat ttgaaaaaca agaatatgtt ggtggtttaa 780gtacttctga aattcctcag
ttccggctgc cgtatgatgt agtgaatttt gagattgagc 840taatgaagga ccttggtgta
aagataattt gcggtaaaag cctttcagtg aatgaaatga 900ctcttagcac tttgaaagaa
aaaggctaca aagctgcttt cattggaata ggtttgccag 960aacccaataa agatgccatc
ttccaaggcc tgacgcagga ccaggggttt tatacatcca 1020aagacttttt gccacttgta
gccaaaggca gtaaagcagg aatgtgcgcc tgtcactctc 1080cattgccatc gatacgggga
gtcgtgattg tacttggagc tggagacact gccttcgact 1140gtgcaacatc tgctctacgt
tgtggagctc gccgagtgtt catcgtcttc agaaaaggct 1200ttgttaatat aagagctgtc
cctgaggaga tggagcttgc taaggaagaa aagtgtgaat 1260ttctgccatt cctgtcccca
cggaaggtta tagtaaaagg tgggagaatt gttgctatgc 1320agtttgttcg gacagagcaa
gatgaaactg gaaaatggaa tgaagatgaa gatcagatgg 1380tccatctgaa agccgatgtg
gtcatcagtg cctttggttc agttctgagt gatcctaaag 1440taaaagaagc cttgagccct
ataaaattta acagatgggg tctcccagaa gtagatccag 1500aaactatgca aactagtgaa
gcatgggtat ttgcaggtgg tgatgtcgtt ggtttggcta 1560acactacagt ggaatcggtg
aatgatggaa agcaagcttc ttggtacatt cacaaatacg 1620tacagtcaca atatggagct
tccgtttctg ccaagcctga actacccctc ttttacactc 1680ctattgatct ggtggacatt
agtgtagaaa tggccggatt gaagtttata aatccttttg 1740gtcttgctag cgcaactcca
gccaccagca catcaatgat tcgaagagct tttgaagctg 1800gatggggttt tgccctcacc
aaaactttct ctcttgataa ggacattgtg acaaatgttt 1860cccccagaat catccgggga
accacctctg gccccatgta tggccctgga caaagctcct 1920ttctgaatat tgagctcatc
agtgagaaaa cggctgcata ttggtgtcaa agtgtcactg 1980aactaaaggc tgacttccca
gacaacattg tgattgctag cattatgtgc agttacaata 2040aaaatgactg gacggaactt
gccaagaagt ctgaggattc tggagcagat gccctggagt 2100taaatttatc atgtccacat
ggcatgggag aaagaggaat gggcctggcc tgtgggcagg 2160atccagagct ggtgcggaac
atctgccgct gggttaggca agctgttcag attccttttt 2220ttgccaagct gaccccaaat
gtcactgata ttgtgagcat cgcaagagct gcaaaggaag 2280gtggtgccaa tggcgttaca
gccaccaaca ctgtctcagg tctgatggga ttaaaatctg 2340atggcacacc ttggccagca
gtggggattg caaagcgaac tacatatgga ggagtgtctg 2400ggacagcaat cagacctatt
gctttgagag ctgtgacctc cattgctcgt gctctgcctg 2460gatttcccat tttggctact
ggtggaattg actctgctga aagtggtctt cagtttctcc 2520atagtggtgc ttccgtcctc
caggtatgca gtgccattca gaatcaggat ttcactgtga 2580tcgaagacta ctgcactggc
ctcaaagccc tgctttatct gaaaagcatt gaagaactac 2640aagactggga tggacagagt
ccagctactg tgagtcacca gaaagggaaa ccagttccac 2700gtatagctga actcatggac
aagaaactgc caagttttgg accttatctg gaacagcgca 2760agaaaatcat agcagaaaac
aagattagac tgaaagaaca aaatgtagct ttttcaccac 2820ttaagagaag ctgttttatc
cccaaaaggc ctattcctac catcaaggat gtaataggaa 2880aagcactgca gtaccttgga
acatttggtg aattgagcaa cgtagagcaa gttgtggcta 2940tgattgatga agaaatgtgt
atcaactgtg gtaaatgcta catgacctgt aatgattctg 3000gctaccaggc tatacagttt
gatccagaaa cccacctgcc caccataacc gacacttgta 3060caggctgtac tctgtgtctc
agtgtttgcc ctattgtcga ctgcatcaaa atggtttcca 3120ggacaacacc ttatgaacca
aagagaggcg tacccttatc tgtgaatccg gtgtgttaag 3180gtgatttgtg aaacagttgc
tgtgaacttt catgtcacct acatatgctg atctcttaaa 3240atcatgatcc ttgtgttcag
ctctttccaa attaaaacaa atatacattt tctaaataaa 3300aatatgtaat ttcaaaatac
atttgtaagt gtaaaaaatg tctcatgtca atgaccattc 3360aattagtggc ataaaataga
ataattcttt tctgaggata gtagttaaat aactgtgtgg 3420cagttaattg gatgttcact
gccagttgtc ttatgtgaaa aattaacttt ttgtgtggca 3480attagtgtga cagtttccaa
attgccctat gctgtgctcc atatttgatt tctaattgta 3540agtgaaatta agcattttga
aacaaagtac tctttaacat acaagaaaat gtatccaagg 3600aaacatttta tcaataaaaa
ttacctttaa ttttaatgct gtttctaaga aaatgtagtt 3660agctccataa agtacaaatg
aagaaagtca aaaattattt gctatggcag gataagaaag 3720cctaaaattg agtttgtgga
ctttattaag taaaatcccc ttcgctgaaa ttgcttattt 3780ttggtgttgg atagaggata
gggagaatat ttactaacta aataccattc actactcatg 3840cgtgagatgg gtgtacaaac
tcatcctctt ttaatggcat ttctctttaa actatgttcc 3900taaccaaatg agatgatagg
atagatcctg gttaccactc ttttactgtg cacatatggg 3960ccccggaatt ctttaatagt
caccttcatg attatagcaa ctaatgtttg aacaaagctc 4020aaagtatgca atgcttcatt
attcaagaat gaaaaatata atgttgataa tatatattaa 4080gtgtgccaaa tcagtttgac
tactctctgt tttagtgttt atgtttaaaa gaaatatatt 4140ttttgttatt attagataat
atttttgtat ttctctattt tcataatcag taaatagtgt 4200catataaact catttatctc
ctcttcatgg catcttcaat atgaatctat aagtagtaaa 4260tcagaaagta acaatctatg
gcttatttct atgacaaatt caagagctag aaaaataaaa 4320tgtttcatta tgcactttta
gaaatgcata tttgccacaa aacctgtatt actgaataat 4380atcaaataaa atatcataaa
gcatttt 4407
300 5532 DNA Homo sapiens
300 gccgcgctgc gccggagtcc cgagctagcc ccggcgccgc cgccgcccag accggacgac
60aggccacctc gtcggcgtcc gcccgagtcc ccgcctcgcc gccaacgcca caaccaccgc
120gcacggcccc ctgactccgt ccagtattga tcgggagagc cggagcgagc tcttcgggga
180gcagcgatgc gaccctccgg gacggccggg gcagcgctcc tggcgctgct ggctgcgctc
240tgcccggcga gtcgggctct ggaggaaaag aaagtttgcc aaggcacgag taacaagctc
300acgcagttgg gcacttttga agatcatttt ctcagcctcc agaggatgtt caataactgt
360gaggtggtcc ttgggaattt ggaaattacc tatgtgcaga ggaattatga tctttccttc
420ttaaagacca tccaggaggt ggctggttat gtcctcattg ccctcaacac agtggagcga
480attcctttgg aaaacctgca gatcatcaga ggaaatatgt actacgaaaa ttcctatgcc
540ttagcagtct tatctaacta tgatgcaaat aaaaccggac tgaaggagct gcccatgaga
600aatttacagg aaatcctgca tggcgccgtg cggttcagca acaaccctgc cctgtgcaac
660gtggagagca tccagtggcg ggacatagtc agcagtgact ttctcagcaa catgtcgatg
720gacttccaga accacctggg cagctgccaa aagtgtgatc caagctgtcc caatgggagc
780tgctggggtg caggagagga gaactgccag aaactgacca aaatcatctg tgcccagcag
840tgctccgggc gctgccgtgg caagtccccc agtgactgct gccacaacca gtgtgctgca
900ggctgcacag gcccccggga gagcgactgc ctggtctgcc gcaaattccg agacgaagcc
960acgtgcaagg acacctgccc cccactcatg ctctacaacc ccaccacgta ccagatggat
1020gtgaaccccg agggcaaata cagctttggt gccacctgcg tgaagaagtg tccccgtaat
1080tatgtggtga cagatcacgg ctcgtgcgtc cgagcctgtg gggccgacag ctatgagatg
1140gaggaagacg gcgtccgcaa gtgtaagaag tgcgaagggc cttgccgcaa agtgtgtaac
1200ggaataggta ttggtgaatt taaagactca ctctccataa atgctacgaa tattaaacac
1260ttcaaaaact gcacctccat cagtggcgat ctccacatcc tgccggtggc atttaggggt
1320gactccttca cacatactcc tcctctggat ccacaggaac tggatattct gaaaaccgta
1380aaggaaatca cagggttttt gctgattcag gcttggcctg aaaacaggac ggacctccat
1440gcctttgaga acctagaaat catacgcggc aggaccaagc aacatggtca gttttctctt
1500gcagtcgtca gcctgaacat aacatccttg ggattacgct ccctcaagga gataagtgat
1560ggagatgtga taatttcagg aaacaaaaat ttgtgctatg caaatacaat aaactggaaa
1620aaactgtttg ggacctccgg tcagaaaacc aaaattataa gcaacagagg tgaaaacagc
1680tgcaaggcca caggccaggt ctgccatgcc ttgtgctccc ccgagggctg ctggggcccg
1740gagcccaggg actgcgtctc ttgccggaat gtcagccgag gcagggaatg cgtggacaag
1800tgcaagcttc tggagggtga gccaagggag tttgtggaga actctgagtg catacagtgc
1860cacccagagt gcctgcctca ggccatgaac atcacctgca caggacgggg accagacaac
1920tgtatccagt gtgcccacta cattgacggc ccccactgcg tcaagacctg cccggcagga
1980gtcatgggag aaaacaacac cctggtctgg aagtacgcag acgccggcca tgtgtgccac
2040ctgtgccatc caaactgcac ctacggatgc actgggccag gtcttgaagg ctgtccaacg
2100aatgggccta agatcccgtc catcgccact gggatggtgg gggccctcct cttgctgctg
2160gtggtggccc tggggatcgg cctcttcatg cgaaggcgcc acatcgttcg gaagcgcacg
2220ctgcggaggc tgctgcagga gagggagctt gtggagcctc ttacacccag tggagaagct
2280cccaaccaag ctctcttgag gatcttgaag gaaactgaat tcaaaaagat caaagtgctg
2340ggctccggtg cgttcggcac ggtgtataag ggactctgga tcccagaagg tgagaaagtt
2400aaaattcccg tcgctatcaa ggaattaaga gaagcaacat ctccgaaagc caacaaggaa
2460atcctcgatg aagcctacgt gatggccagc gtggacaacc cccacgtgtg ccgcctgctg
2520ggcatctgcc tcacctccac cgtgcaactc atcacgcagc tcatgccctt cggctgcctc
2580ctggactatg tccgggaaca caaagacaat attggctccc agtacctgct caactggtgt
2640gtgcagatcg caaagggcat gaactacttg gaggaccgtc gcttggtgca ccgcgacctg
2700gcagccagga acgtactggt gaaaacaccg cagcatgtca agatcacaga ttttgggctg
2760gccaaactgc tgggtgcgga agagaaagaa taccatgcag aaggaggcaa agtgcctatc
2820aagtggatgg cattggaatc aattttacac agaatctata cccaccagag tgatgtctgg
2880agctacgggg tgaccgtttg ggagttgatg acctttggat ccaagccata tgacggaatc
2940cctgccagcg agatctcctc catcctggag aaaggagaac gcctccctca gccacccata
3000tgtaccatcg atgtctacat gatcatggtc aagtgctgga tgatagacgc agatagtcgc
3060ccaaagttcc gtgagttgat catcgaattc tccaaaatgg cccgagaccc ccagcgctac
3120cttgtcattc agggggatga aagaatgcat ttgccaagtc ctacagactc caacttctac
3180cgtgccctga tggatgaaga agacatggac gacgtggtgg atgccgacga gtacctcatc
3240ccacagcagg gcttcttcag cagcccctcc acgtcacgga ctcccctcct gagctctctg
3300agtgcaacca gcaacaattc caccgtggct tgcattgata gaaatgggct gcaaagctgt
3360cccatcaagg aagacagctt cttgcagcga tacagctcag accccacagg cgccttgact
3420gaggacagca tagacgacac cttcctccca gtgcctgaat acataaacca gtccgttccc
3480aaaaggcccg ctggctctgt gcagaatcct gtctatcaca atcagcctct gaaccccgcg
3540cccagcagag acccacacta ccaggacccc cacagcactg cagtgggcaa ccccgagtat
3600ctcaacactg tccagcccac ctgtgtcaac agcacattcg acagccctgc ccactgggcc
3660cagaaaggca gccaccaaat tagcctggac aaccctgact accagcagga cttctttccc
3720aaggaagcca agccaaatgg catctttaag ggctccacag ctgaaaatgc agaataccta
3780agggtcgcgc cacaaagcag tgaatttatt ggagcatgac cacggaggat agtatgagcc
3840ctaaaaatcc agactctttc gatacccagg accaagccac agcaggtcct ccatcccaac
3900agccatgccc gcattagctc ttagacccac agactggttt tgcaacgttt acaccgacta
3960gccaggaagt acttccacct cgggcacatt ttgggaagtt gcattccttt gtcttcaaac
4020tgtgaagcat ttacagaaac gcatccagca agaatattgt ccctttgagc agaaatttat
4080ctttcaaaga ggtatatttg aaaaaaaaaa aaaaagtata tgtgaggatt tttattgatt
4140ggggatcttg gagtttttca ttgtcgctat tgatttttac ttcaatgggc tcttccaaca
4200aggaagaagc ttgctggtag cacttgctac cctgagttca tccaggccca actgtgagca
4260aggagcacaa gccacaagtc ttccagagga tgcttgattc cagtggttct gcttcaaggc
4320ttccactgca aaacactaaa gatccaagaa ggccttcatg gccccagcag gccggatcgg
4380tactgtatca agtcatggca ggtacagtag gataagccac tctgtccctt cctgggcaaa
4440gaagaaacgg aggggatgaa ttcttcctta gacttacttt tgtaaaaatg tccccacggt
4500acttactccc cactgatgga ccagtggttt ccagtcatga gcgttagact gacttgtttg
4560tcttccattc cattgttttg aaactcagta tgccgcccct gtcttgctgt catgaaatca
4620gcaagagagg atgacacatc aaataataac tcggattcca gcccacattg gattcatcag
4680catttggacc aatagcccac agctgagaat gtggaatacc taaggataac accgcttttg
4740ttctcgcaaa aacgtatctc ctaatttgag gctcagatga aatgcatcag gtcctttggg
4800gcatagatca gaagactaca aaaatgaagc tgctctgaaa tctcctttag ccatcacccc
4860aaccccccaa aattagtttg tgttacttat ggaagatagt tttctccttt tacttcactt
4920caaaagcttt ttactcaaag agtatatgtt ccctccaggt cagctgcccc caaaccccct
4980ccttacgctt tgtcacacaa aaagtgtctc tgccttgagt catctattca agcacttaca
5040gctctggcca caacagggca ttttacaggt gcgaatgaca gtagcattat gagtagtgtg
5100aattcaggta gtaaatatga aactagggtt tgaaattgat aatgctttca caacatttgc
5160agatgtttta gaaggaaaaa agttccttcc taaaataatt tctctacaat tggaagattg
5220gaagattcag ctagttagga gcccattttt tcctaatctg tgtgtgccct gtaacctgac
5280tggttaacag cagtcctttg taaacagtgt tttaaactct cctagtcaat atccacccca
5340tccaatttat caaggaagaa atggttcaga aaatattttc agcctacagt tatgttcagt
5400cacacacaca tacaaaatgt tccttttgct tttaaagtaa tttttgactc ccagatcagt
5460cagagcccct acagcattgt taagaaagta tttgattttt gtctcaatga aaataaaact
5520atattcattt cc
5532
301 1528 DNA Homo sapiens
301 cggcgagcga gcaccttcga cgcggtccgg ggaccccctc
gtcgctgtcc tcccgacgcg 60gacccgcgtg ccccaggcct cgcgctgccc ggccggctcc
tcgtgtccca ctcccggcgc 120acgccctccc gcgagtcccg ggcccctccc gcgcccctct
tctcggcgcg cgcgcagcat 180ggcgcccccg caggtcctcg cgttcgggct tctgcttgcc
gcggcgacgg cgacttttgc 240cgcagctcag gaagaatgtg tctgtgaaaa ctacaagctg
gccgtaaact gctttgtgaa 300taataatcgt caatgccagt gtacttcagt tggtgcacaa
aatactgtca tttgctcaaa 360gctggctgcc aaatgtttgg tgatgaaggc agaaatgaat
ggctcaaaac ttgggagaag 420agcaaaacct gaaggggccc tccagaacaa tgatgggctt
tatgatcctg actgcgatga 480gagcgggctc tttaaggcca agcagtgcaa cggcacctcc
acgtgctggt gtgtgaacac 540tgctggggtc agaagaacag acaaggacac tgaaataacc
tgctctgagc gagtgagaac 600ctactggatc atcattgaac taaaacacaa agcaagagaa
aaaccttatg atagtaaaag 660tttgcggact gcacttcaga aggagatcac aacgcgttat
caactggatc caaaatttat 720cacgagtatt ttgtatgaga ataatgttat cactattgat
ctggttcaaa attcttctca 780aaaaactcag aatgatgtgg acatagctga tgtggcttat
tattttgaaa aagatgttaa 840aggtgaatcc ttgtttcatt ctaagaaaat ggacctgaca
gtaaatgggg aacaactgga 900tctggatcct ggtcaaactt taatttatta tgttgatgaa
aaagcacctg aattctcaat 960gcagggtcta aaagctggtg ttattgctgt tattgtggtt
gtggtgatag cagttgttgc 1020tggaattgtt gtgctggtta tttccagaaa gaagagaatg
gcaaagtatg agaaggctga 1080gataaaggag atgggtgaga tgcataggga actcaatgca
taactatata atttgaagat 1140tatagaagaa gggaaatagc aaatggacac aaattacaaa
tgtgtgtgcg tgggacgaag 1200acatctttga aggtcatgag tttgttagtt taacatcata
tatttgtaat agtgaaacct 1260gtactcaaaa tataagcagc ttgaaactgg ctttaccaat
cttgaaattt gaccacaagt 1320gtcttatata tgcagatcta atgtaaaatc cagaacttgg
actccatcgt taaaattatt 1380tatgtgtaac attcaaatgt gtgcattaaa tatgcttcca
cagtaaaatc tgaaaaactg 1440atttgtgatt gaaagctgcc tttctattta cttgagtctt
gtacatacat acttttttat 1500gagctatgaa ataaaacatt ttaaactg
1528
302 1856 DNA Homo sapiens
302 ctgacttggc aggactgtgc
aattgtcaga aggccgtggg gagtgggggc cagtgcctgc 60agcctgccct gcctctctca
caggccctta gagcatcgcc aggtgcagag ctccacagct 120ctctttccca aggagtaatc
agagggtgag aacgtggagc ctggtggaca ggtgaaagca 180ctgggatctt tctgcccaga
aaggggaaag ttgcacattt atatcctaga gggaagcgac 240agcagtgctt ctccctgtgc
tgaggtacag gagccatgtg gctagaaatc ctcctcactt 300cagtgctggg ctttgccatc
tactggttca tctcccggga caaagaggaa actttgccac 360ttgaagatgg gtggtggggg
ccaggcacga ggtccgcagc cagggaggac gacagcatcc 420gccctttcaa ggtggaaacg
tcagatgagg agatccacga cttacaccag aggatcgata 480agttccgttt caccccacct
ttggaggaca gctgcttcca ctatggcttc aactccaact 540acctgaagaa agtcatctcc
tactggcgga atgaatttga ctggaagaag caggtggaga 600ttctcaacag ataccctcac
ttcaagacta agattgaagg gctggacatc cacttcatcc 660acgtgaagcc cccccagctg
cccgcaggcc ataccccgaa gcccttgctg atggtgcacg 720gctggcccgg ctctttctac
gagttttata agatcatccc actcctgact gaccccaaga 780accatggcct gagcgatgag
cacgtttttg aagtcatctg cccttccatc cctggctatg 840gcttctcaga ggcatcctcc
aagaaggggt tcaactcggt ggccaccgcc aggatctttt 900acaagctgat gctgcggctg
ggcttccagg aattctacat tcaaggaggg gactgggggt 960ccctgatctg cactaatatg
gcccagctgg tgcccagcca cgtgaaaggc ctgcacttga 1020acatggcttt ggttttaagc
aacttctcta ccctgaccct cctcctggga cagcgtttcg 1080ggaggtttct tggcctcact
gagagggatg tggagctgct gtaccccgtc aaggagaagg 1140tattctacag cctgatgagg
gagagcggct acatgcacat ccagtgcacc aagcctgaca 1200ccgtaggctc tgctctgaat
gactctcctg tgggtctggc tgcctatatt ctagagaagt 1260tttccacctg gaccaatacg
gaattccgat acctggagga tggaggcctg gaaaggaagt 1320tctccctgga cgacctgctg
accaacgtca tgctctactg gacaacaggc accatcatct 1380cctcccagcg cttctacaag
gagaacctgg gacagggctg gatgacccag aagcatgagc 1440ggatgaaggt ctatgtgccc
actggcttct ctgccttccc ttttgagcta ttgcacacgc 1500ctgaaaagtg ggtgaggttc
aagtacccaa agctcatctc ctattcctac atggttcgtg 1560ggggccactt tgcggccttt
gaggagccgg agctgctcgc ccaggacatc cgcaagttcc 1620tgtcggtgct ggagcggcaa
tgacccaccc ctctcccccc gcctgccacc tccccccaca 1680agtgccctcc aggcttttct
tggggaagat accccttttc tgaggaatga gtttgcctcc 1740gtcccctgcc catgctggga
gcccacgctc accccctcac ccctccaagc tcactcccca 1800acccccaact ccgtgtggta
agcaacatgg ctttgatgat aaacgacttt actcta 1856
303 6450 DNA Homo sapiens
303 gagttgtgcc tggagtgatg tttaagccaa tgtcagggca aggcaacagt ccctggccgt
60cctccagcac ctttgtaatg catatgagct cgggagacca gtacttaaag ttggaggccc
120gggagcccag gagctggcgg agggcgttcg tcctgggagc tgcacttgct ccgtcgggtc
180gccggcttca ccggaccgca ggctcccggg gcagggccgg ggccagagct cgcgtgtcgg
240cgggacatgc gctgcgtcgc ctctaacctc gggctgtgct ctttttccag gtggcccgcc
300ggtttctgag ccttctgccc tgcggggaca cggtctgcac cctgcccgcg gccacggacc
360atgaccatga ccctccacac caaagcatct gggatggccc tactgcatca gatccaaggg
420aacgagctgg agcccctgaa ccgtccgcag ctcaagatcc ccctggagcg gcccctgggc
480gaggtgtacc tggacagcag caagcccgcc gtgtacaact accccgaggg cgccgcctac
540gagttcaacg ccgcggccgc cgccaacgcg caggtctacg gtcagaccgg cctcccctac
600ggccccgggt ctgaggctgc ggcgttcggc tccaacggcc tggggggttt ccccccactc
660aacagcgtgt ctccgagccc gctgatgcta ctgcacccgc cgccgcagct gtcgcctttc
720ctgcagcccc acggccagca ggtgccctac tacctggaga acgagcccag cggctacacg
780gtgcgcgagg ccggcccgcc ggcattctac aggccaaatt cagataatcg acgccagggt
840ggcagagaaa gattggccag taccaatgac aagggaagta tggctatgga atctgccaag
900gagactcgct actgtgcagt gtgcaatgac tatgcttcag gctaccatta tggagtctgg
960tcctgtgagg gctgcaaggc cttcttcaag agaagtattc aaggacataa cgactatatg
1020tgtccagcca ccaaccagtg caccattgat aaaaacagga ggaagagctg ccaggcctgc
1080cggctccgca aatgctacga agtgggaatg atgaaaggtg ggatacgaaa agaccgaaga
1140ggagggagaa tgttgaaaca caagcgccag agagatgatg gggagggcag gggtgaagtg
1200gggtctgctg gagacatgag agctgccaac ctttggccaa gcccgctcat gatcaaacgc
1260tctaagaaga acagcctggc cttgtccctg acggccgacc agatggtcag tgccttgttg
1320gatgctgagc cccccatact ctattccgag tatgatccta ccagaccctt cagtgaagct
1380tcgatgatgg gcttactgac caacctggca gacagggagc tggttcacat gatcaactgg
1440gcgaagaggg tgccaggctt tgtggatttg accctccatg atcaggtcca ccttctagaa
1500tgtgcctggc tagagatcct gatgattggt ctcgtctggc gctccatgga gcacccagtg
1560aagctactgt ttgctcctaa cttgctcttg gacaggaacc agggaaaatg tgtagagggc
1620atggtggaga tcttcgacat gctgctggct acatcatctc ggttccgcat gatgaatctg
1680cagggagagg agtttgtgtg cctcaaatct attattttgc ttaattctgg agtgtacaca
1740tttctgtcca gcaccctgaa gtctctggaa gagaaggacc atatccaccg agtcctggac
1800aagatcacag acactttgat ccacctgatg gccaaggcag gcctgaccct gcagcagcag
1860caccagcggc tggcccagct cctcctcatc ctctcccaca tcaggcacat gagtaacaaa
1920ggcatggagc atctgtacag catgaagtgc aagaacgtgg tgcccctcta tgacctgctg
1980ctggagatgc tggacgccca ccgcctacat gcgcccacta gccgtggagg ggcatccgtg
2040gaggagacgg accaaagcca cttggccact gcgggctcta cttcatcgca ttccttgcaa
2100aagtattaca tcacggggga ggcagagggt ttccctgcca cagtctgaga gctccctggc
2160tcccacacgg ttcagataat ccctgctgca ttttaccctc atcatgcacc actttagcca
2220aattctgtct cctgcataca ctccggcatg catccaacac caatggcttt ctagatgagt
2280ggccattcat ttgcttgctc agttcttagt ggcacatctt ctgtcttctg ttgggaacag
2340ccaaagggat tccaaggcta aatctttgta acagctctct ttcccccttg ctatgttact
2400aagcgtgagg attcccgtag ctcttcacag ctgaactcag tctatgggtt ggggctcaga
2460taactctgtg catttaagct acttgtagag acccaggcct ggagagtaga cattttgcct
2520ctgataagca ctttttaaat ggctctaaga ataagccaca gcaaagaatt taaagtggct
2580cctttaattg gtgacttgga gaaagctagg tcaagggttt attatagcac cctcttgtat
2640tcctatggca atgcatcctt ttatgaaagt ggtacacctt aaagctttta tatgactgta
2700gcagagtatc tggtgattgt caattcactt ccccctatag gaatacaagg ggccacacag
2760ggaaggcaga tcccctagtt ggccaagact tattttaact tgatacactg cagattcaga
2820gtgtcctgaa gctctgcctc tggctttccg gtcatgggtt ccagttaatt catgcctccc
2880atggacctat ggagagcaac aagttgatct tagttaagtc tccctatatg agggataagt
2940tcctgatttt tgtttttatt tttgtgttac aaaagaaagc cctccctccc tgaacttgca
3000gtaaggtcag cttcaggacc tgttccagtg ggcactgtac ttggatcttc ccggcgtgtg
3060tgtgccttac acaggggtga actgttcact gtggtgatgc atgatgaggg taaatggtag
3120ttgaaaggag caggggccct ggtgttgcat ttagccctgg ggcatggagc tgaacagtac
3180ttgtgcagga ttgttgtggc tactagagaa caagagggaa agtagggcag aaactggata
3240cagttctgag cacagccaga cttgctcagg tggccctgca caggctgcag ctacctagga
3300acattccttg cagaccccgc attgcctttg ggggtgccct gggatccctg gggtagtcca
3360gctcttattc atttcccagc gtggccctgg ttggaagaag cagctgtcaa gttgtagaca
3420gctgtgttcc tacaattggc ccagcaccct ggggcacggg agaagggtgg ggaccgttgc
3480tgtcactact caggctgact ggggcctggt cagattacgt atgcccttgg tggtttagag
3540ataatccaaa atcagggttt ggtttgggga agaaaatcct cccccttcct cccccgcccc
3600gttccctacc gcctccactc ctgccagctc atttccttca atttcctttg acctataggc
3660taaaaaagaa aggctcattc cagccacagg gcagccttcc ctgggccttt gcttctctag
3720cacaattatg ggttacttcc tttttcttaa caaaaaagaa tgtttgattt cctctgggtg
3780accttattgt ctgtaattga aaccctattg agaggtgatg tctgtgttag ccaatgaccc
3840aggtagctgc tcgggcttct cttggtatgt cttgtttgga aaagtggatt tcattcattt
3900ctgattgtcc agttaagtga tcaccaaagg actgagaatc tgggagggca aaaaaaaaaa
3960aaaaagtttt tatgtgcact taaatttggg gacaatttta tgtatctgtg ttaaggatat
4020gcttaagaac ataattcttt tgttgctgtt tgtttaagaa gcaccttagt ttgtttaaga
4080agcaccttat atagtataat atatattttt ttgaaattac attgcttgtt tatcagacaa
4140ttgaatgtag taattctgtt ctggatttaa tttgactggg ttaacatgca aaaaccaagg
4200aaaaatattt agtttttttt tttttttttg tatacttttc aagctacctt gtcatgtata
4260cagtcattta tgcctaaagc ctggtgatta ttcatttaaa tgaagatcac atttcatatc
4320aacttttgta tccacagtag acaaaatagc actaatccag atgcctattg ttggatattg
4380aatgacagac aatcttatgt agcaaagatt atgcctgaaa aggaaaatta ttcagggcag
4440ctaattttgc ttttaccaaa atatcagtag taatattttt ggacagtagc taatgggtca
4500gtgggttctt tttaatgttt atacttagat tttcttttaa aaaaattaaa ataaaacaaa
4560aaaaatttct aggactagac gatgtaatac cagctaaagc caaacaatta tacagtggaa
4620ggttttacat tattcatcca atgtgtttct attcatgtta agatactact acatttgaag
4680tgggcagaga acatcagatg attgaaatgt tcgcccaggg gtctccagca actttggaaa
4740tctctttgta tttttacttg aagtgccact aatggacagc agatattttc tggctgatgt
4800tggtattggg tgtaggaaca tgatttaaaa aaaaaactct tgcctctgct ttcccccact
4860ctgaggcaag ttaaaatgta aaagatgtga tttatctggg gggctcaggt atggtgggga
4920agtggattca ggaatctggg gaatggcaaa tatattaaga agagtattga aagtatttgg
4980aggaaaatgg ttaattctgg gtgtgcacca aggttcagta gagtccactt ctgccctgga
5040gaccacaaat caactagctc catttacagc catttctaaa atggcagctt cagttctaga
5100gaagaaagaa caacatcagc agtaaagtcc atggaatagc tagtggtctg tgtttctttt
5160cgccattgcc tagcttgccg taatgattct ataatgccat catgcagcaa ttatgagagg
5220ctaggtcatc caaagagaag accctatcaa tgtaggttgc aaaatctaac ccctaaggaa
5280gtgcagtctt tgatttgatt tccctagtaa ccttgcagat atgtttaacc aagccatagc
5340ccatgccttt tgagggctga acaaataagg gacttactga taatttactt ttgatcacat
5400taaggtgttc tcaccttgaa atcttataca ctgaaatggc cattgattta ggccactggc
5460ttagagtact ccttcccctg catgacactg attacaaata ctttcctatt catactttcc
5520aattatgaga tggactgtgg gtactgggag tgatcactaa caccatagta atgtctaata
5580ttcacaggca gatctgcttg gggaagctag ttatgtgaaa ggcaaataaa gtcatacagt
5640agctcaaaag gcaaccataa ttctctttgg tgcaagtctt gggagcgtga tctagattac
5700actgcaccat tcccaagtta atcccctgaa aacttactct caactggagc aaatgaactt
5760tggtcccaaa tatccatctt ttcagtagcg ttaattatgc tctgtttcca actgcatttc
5820ctttccaatt gaattaaagt gtggcctcgt ttttagtcat ttaaaattgt tttctaagta
5880attgctgcct ctattatggc acttcaattt tgcactgtct tttgagattc aagaaaaatt
5940tctattcatt tttttgcatc caattgtgcc tgaactttta aaatatgtaa atgctgccat
6000gttccaaacc catcgtcagt gtgtgtgttt agagctgtgc accctagaaa caacatactt
6060gtcccatgag caggtgcctg agacacagac ccctttgcat tcacagagag gtcattggtt
6120atagagactt gaattaataa gtgacattat gccagtttct gttctctcac aggtgataaa
6180caatgctttt tgtgcactac atactcttca gtgtagagct cttgttttat gggaaaaggc
6240tcaaatgcca aattgtgttt gatggattaa tatgcccttt tgccgatgca tactattact
6300gatgtgactc ggttttgtcg cagctttgct ttgtttaatg aaacacactt gtaaacctct
6360tttgcacttt gaaaaagaat ccagcgggat gctcgagcac ctgtaaacaa ttttctcaac
6420ctatttgatg ttcaaataaa gaattaaact
6450
304 3336 DNA Homo sapiens
unsure (0)...(0) n = A, T, C or G
304 cggcggcgac
tgcagtctgg agggtccaca cttgtgattc tcaatggaga gtgaaaacgc 60agattcataa
tgaaagctag cccccgtcgg ccactgattc tcaaaagacg gaggctgccc 120cttcctgttc
aaaatgcccc aagtgaaaca tcagaggagg aacctaagag atcccctgcc 180caacaggagt
ctaatcaagc agaggcctcc aaggaagtgg cggagtccaa ctcttgcaag 240tttccagctg
ggatcaagat tattaaccac cccaccatgc ccaacacgca agtagtggcc 300atccccaaca
atgctaatat tcacagcatc atcacagcac tgactgccaa gggaaaagag 360agtggcagta
gtgggcccaa caaattcatc ctcatcagct gtgggggagc cccaactcag 420cctccaggac
tccggcctca aacccaaacc agctatgatg ccaaaaggac agaagtgacc 480ctggagacct
tgggaccaaa acctgcagct agggatgtga atcttcctag accacctgga 540gccctttgcg
agcagaaacg ggagacctgt gcagatggtg aggcagcagg ctgcactatc 600aacaatagcc
tatccaacat ccagtggctt cgaaagatga gttctgatgg actgggctcc 660cgcagcatca
agcaagagat ggaggaaaag gagaattgtc acctggagca gcgacaggtt 720aaggttgagg
agccttcgag accatcagcg tcctggcaga actctgtgtc tgagcggcca 780ccctactctt
acatggccat gatacaattc gccatcaaca gcactgagag gaagcgcatg 840actttgaaag
acatctatac gtggattgag gaccactttc cctactttaa gcacattgcc 900aagccaggct
ggaagaactc catccgccac aacctttccc tgcacgacat gtttgtccgg 960gagacgtctg
ccaatggcaa ggtctccttc tggaccattc accccagtgc caaccgctac 1020ttgacattgg
accaggtgtt taagccactg gacccagggt ctccacaatt gcccgagcac 1080ttggaatcac
agcagaaacg accgaatcca gagctccgcc ggaacatgac catcaaaacc 1140gaactccccc
tgggcgcacg gcggaagatg aagccactgc taccacgggt cagctcatac 1200ctggtaccta
tccagttccc ggtgaaccag tcactggtgt tgcagccctc ggtgaaggtg 1260ccattgcccc
tggcggcttc cctcatgagc tcagagcttg cccgccatag caagcgagtc 1320cgcattgccc
ccaaggtgct gctagctgag gaggggatag ctcctctttc ttctgcagga 1380ccagggaaag
aggagaaact cctgtttgga gaagggtttt ctcctttgct tccagttcag 1440actatcaagg
aggaagaaat ccagcctggg gaggaaatgc cacacttagc gagacccatc 1500aaagtggaga
gccctccctt ggaagagtgg ccctccccgg ccccatcttt caaagaggaa 1560tcatctcact
cctgggagga ttcgtcccaa tctcccaccc caagacccaa gaagtcctac 1620agtgggctta
ggtccccaac ccggtgtgtc tcggaaatgc ttgtgattca acacagggag 1680aggagggaga
ggagccggtc tcggaggaaa cagcatctac tgcctccctg tgtggatgag 1740ccggagctgc
tcttctcaga ggggcccagt acttcccgct gggccgcaga gctcccgttc 1800ccagcagact
cctctgaccc tgcctcccag ctcagctact cccaggaagt gggaggacct 1860tttaagacac
ccattaagga aacgctgccc atctcctcca ccccgagcaa atctgtcctc 1920cccagaaccc
ctgaatcctg gaggctcacg cccccagcca aagtaggggg actggatttc 1980agcccagtac
aaacctccca gggtgcctct gaccccttgc ctgaccccct ggggctgatg 2040gatctcagca
ccactccctt gcaaagtgct cccccccttg aatcaccgca aaggctcctc 2100agttcagaac
ccttagacct catctccgtc ccctttggca actcttctcc ctcagatata 2160gacgtcccca
agccaggctc cccggagcca caggtttctg gccttgcagc caatcgttct 2220ctgacagaag
gcctggtcct ggacacaatg aatgacagcc tcagcaagat cctgctggac 2280atcagctttc
ctggcctgga cgaggaccca ctgggccctg acaacatcaa ctggtcccag 2340tttattcctg
agctacagta gagccctgcc cttgcccctg tgctcaagct gtccaccatc 2400ccgggcactc
caaggctcag tgcaccccaa gcctctgagt gaggacagca ggcagggact 2460gttctgctcc
tcatagctcc ctgctgcctg attatgcaaa agtagcagtc acaccctagc 2520cactgctggg
accttgtgtt ccccaagagt atctgattcc tctgctgtcc ctgccaggag 2580ctgaagggtg
ggaacaacaa aggcaatggt gaaaagagat taggaacccc ccagcctgtt 2640tccattctct
gcccagcagt ctcttacctt ccctgatctt tgcagggtgg tccgtgtaaa 2700tagtataaat
tctccaaatt atcctctaat tataaatgta agcttatttc cttagatcat 2760tatccagaga
ctgccagaag gtgggtagga tgacctgggg tttcaattga cttctgttcc 2820ttgcttttag
ttttgataga agggaagacc tgcagtgcac ggtttcttcc aggctgaggt 2880acctggatct
tgggttcttc actgcaggga cccagacaag tggatctgct tgccagagtc 2940ctttttgccc
ctccctgcca cctccccgtg tttccaagtc agctttcctg caagaagaaa 3000tcctggttaa
aaaagtcttt tgtattgggt caggagttga atttggggtg ggaggatgga 3060tgcaactgaa
gcagagtgtg ggtgcccaga tgtgcgctat tagatgtttc tctgataatg 3120tccccaatca
taccagggag actggcattg acgagaactc aggtggaggc ttgagaaggc 3180cgaaagggcc
cctgacctgc ctggcttcct tagcttgccc ctcagctttg caaagagcca 3240ccctaggccc
cagctgaccg catgggtgtg agccagcttg agaacactaa ctactcaata 3300aaagcgaagg
tggaccnaaa aaaaaaaaaa aaaaaa
3336
305 2365 DNA Homo sapiens
305 tcccagcctt cccatccccc caccgaaagc aaatcattca
acgacccccg accctccgac 60ggcaggagcc ccccgacctc ccaggcggac cgcccttccc
tccccgcgcg ggttccgggc 120ccggcgagag ggcgcgacga cagccgaggc catggaggtg
acggcggacc agccgcgctg 180ggtgagccac caccaccccg ccgtgctcaa cgggcagcac
ccggacacgc accacccggg 240cctcagccac tcctacatgg acgcggcgca gtacccgctg
ccggaggagg tggatgtgct 300ttttaacatc gacggtcaag gcaaccacgt cccgccctac
tacggaaact cggtcagggc 360cacggtgcag aggtaccctc cgacccacca cgggagccag
gtgtgccgcc cgcctctgct 420tcatggatcc ctaccctggc tggacggcgg caaagccctg
ggcagccacc acaccgcctc 480cccctggaat ctcagcccct tctccaagac gtccatccac
cacggctccc cggggcccct 540ctccgtctac cccccggcct cgtcctcctc cttgtcgggg
ggccacgcca gcccgcacct 600cttcaccttc ccgcccaccc cgccgaagga cgtctccccg
gacccatcgc tgtccacccc 660aggctcggcc ggctcggccc ggcaggacga gaaagagtgc
ctcaagtacc aggtgcccct 720gcccgacagc atgaagctgg agtcgtccca ctcccgtggc
agcatgaccg ccctgggtgg 780agcctcctcg tcgacccacc accccatcac cacctacccg
ccctacgtgc ccgagtacag 840ctccggactc ttccccccca gcagcctgct gggcggctcc
cccaccggct tcggatgcaa 900gtccaggccc aaggcccggt ccagcacagg cagggagtgt
gtgaactgtg gggcaacctc 960gaccccactg tggcggcgag atggcacggg acactacctg
tgcaacgcct gcgggctcta 1020tcacaaaatg aacggacaga accggcccct cattaagccc
aagcgaaggc tgtctgcagc 1080caggagagca gggacgtcct gtgcgaactg tcagaccacc
acaaccacac tctggaggag 1140gaatgccaat ggggaccctg tctgcaatgc ctgtgggctc
tactacaagc ttcacaatat 1200taacagaccc ctgactatga agaaggaagg catccagacc
agaaaccgaa aaatgtctag 1260caaatccaaa aagtgcaaaa aagtgcatga ctcactggag
gacttcccca agaacagctc 1320gtttaacccg gccgccctct ccagacacat gtcctccctg
agccacatct cgcccttcag 1380ccactccagc cacatgctga ccacgcccac gccgatgcac
ccgccatcca gcctgtcctt 1440tggaccacac cacccctcca gcatggtcac cgccatgggt
tagagccctg ctcgatgctc 1500acagggcccc cagcgagagt ccctgcagtc cctttcgact
tgcatttttg caggagcagt 1560atcatgaagc ctaaacgcga tggatatatg tttttgaagg
cagaaagcaa aattatgttt 1620gccactttgc aaaggagctc actgtggtgt ctgtgttcca
accactgaat ctggacccca 1680tctgtgaata agccattctg actcatatcc cctatttaac
agggtctcta gtgctgtgaa 1740aaaaaaaaat cctgaacatt gcatataact tatattgtaa
gaaatactgt acaatgactt 1800tattgcatct gggtagctgt aaggcatgaa ggatgccaag
aagtttaagg aatatgggag 1860aaatagtgtg gaaattaaga agaaactagg tctgatattc
aaatggacaa actgccagtt 1920ttgtttcctt tcactggcca cagttgtttg atgcattaaa
agaaaataaa aaaaagaaaa 1980aagagaaaag aaaaaaaaag aaaaaagttg taggcgaatc
atttgttcaa agctgttggc 2040cctctgcaaa ggaaatacca gttctgggca atcagtgtta
ccgttcacca gttgccattg 2100agggtttcag agagcctttt tctaggccta catgctttgt
gaacaagtcc ctgtaattgt 2160tgtttgtatg tataattcaa agcaccaaaa taagaaaaga
tgtagattta tttcatcata 2220ttatacagac cgaactgttg tataaattta tttactgcta
gtcttaagaa ctgctttctt 2280tcgtttgttt gtttcaatat tttccttctc tctcaatttt
cggttgaata aactagatta 2340cattcagttg gcaaaaaaaa aaaaa
2365
306 1117 DNA Homo sapiens
306 gcaccaacca gcaccatgcc
catgatactg gggtactggg acatccgcgg gctggcccac 60gccatccgcc tgctcctgga
atacacagac tcaagctatg aggaaaagaa gtacacgatg 120ggggacgctc ctgattatga
cagaagccag tggctgaatg aaaaattcaa gctgggcctg 180gactttccca atctgcccta
cttgattgat ggggctcaca agatcaccca gagcaacgcc 240atcttgtgct acattgcccg
caagcacaac ctgtgtgggg agacagaaga ggagaagatt 300cgtgtggaca ttttggagaa
ccagaccatg gacaaccata tgcagctggg catgatctgc 360tacaatccag aatttgagaa
actgaagcca aagtacttgg aggaactccc tgaaaagcta 420aagctctact cagagtttct
ggggaagcgg ccatggtttg caggaaacaa gatcactttt 480gtagattttc tcgtctatga
tgtccttgac ctccaccgta tatttgagcc caactgcttg 540gacgccttcc caaatctgaa
ggacttcatc tcccgctttg agggcttgga gaagatctct 600gcctacatga agtccagccg
cttcctccca agacctgtgt tctcaaagat ggctgtctgg 660ggcaacaagt agggccttga
aggcaggagg tgggagtgag gagcccatac tcagcctgct 720gcccaggctg tgcagcgcag
ctggactctg catcccagca cctgcctcct cgttcctttc 780tcctgtttat tcccatcttt
actcccaaga cttcattgtc cctcttcact ccccctaaac 840ccctgtccca tgcaggccct
ttgaagcctc agctacccac tatccttcgt gaacatcccc 900tcccatcatt acccttccct
gcactaaagc cagcctgacc ttccttcctg ttagtggttg 960tgtctgcttt aaagcctgcc
tggcccctcg cctgtggagc tcagccccga gctgtccccg 1020tgttgcatga aggagcagca
ttgactggtt tacaggccct gctcctgcag catggtccct 1080gcctaggcct acctgatgga
agtaaagcct caaccac 1117
307 1266 DNA Homo sapiens
307 ctcggaagcc cgtcaccatg tcgtgcgagt cgtctatggt tctcgggtac tgggatattc
60gtgggctggc gcacgccatc cgcctgctcc tggagttcac ggatacctct tatgaggaga
120aacggtacac gtgcggggaa gctcctgact atgatcgaag ccaatggctg gatgtgaaat
180tcaagctaga cctggacttt cctaatctgc cctacctcct ggatgggaag aacaagatca
240cccagagcaa tgccatcttg cgctacatcg ctcgcaagca caacatgtgt ggtgagactg
300aagaagaaaa gattcgagtg gacatcatag agaaccaagt aatggatttc cgcacacaac
360tgataaggct ctgttacagc tctgaccacg aaaaactgaa gcctcagtac ttggaagagc
420tacctggaca actgaaacaa ttctccatgt ttctgtggaa attctcatgg tttgccgggg
480aaaagctcac ctttgtggat tttctcacct atgatatctt ggatcagaac cgtatatttg
540accccaagtg cctggatgag ttcccaaacc tgaaggcttt catgtgccgt tttgaggctt
600tggagaaaat cgctgcctac ttacagtctg atcagttctg caagatgccc atcaacaaca
660agatggccca gtggggcaac aagcctgtat gctgagcagg aggcagactt gcagagcttg
720ttttgtttca tcctgtccgt aaggggtcag cgctcttgct ttgctctttt caatgaatag
780cacttatgtt actggtgtcc agctgagttt ctcttgggta taaaggctaa aagggaaaaa
840ggatatgtgg agaatcatca agatatgaat tgaatcgctg cgatactgtg gcatttccct
900actccccaac tgagttcaag ggctgtaggt tcatgcccaa gccctgagag tgggtactag
960aaaaaacgag attgcacagt tggagagagc aggtgtgtta aatggactgg agtccctgtg
1020aagactgggt gaggataaca caagtaaaac tgtggtactg atggacttaa ccggagttcg
1080gaaaccgtcc tgtgtacaca tgggagttta gtgtgataaa ggcagtattt cagactggtg
1140ggctagccaa tagagttggc aattgcttat tgaaactcat taaaaataat agagccccac
1200ttgacactat tcactaaaat taatctggaa tttaaggccc aacattaaac acaaagctgt
1260attgat
1266
308 2162 DNA Homo sapiens
308 gggctgcgct gtccagctgt ggctatggcc ccagccccga
gatgaggagg gagagaacta 60ggggcccgca ggcctgggaa tttccgtccc ccaccaagtc
cggatgctca ctccaaagtc 120tcagcaggcc cctgagggag ggagctgtca gccagggaaa
accgagaaca ccatcaccat 180gacaaccagt caccagcctc aggacagata caaagctgtc
tggcttatct tcttcatgct 240gggtctggga acgctgctcc cgtggaattt tttcatgacg
gccactcagt atttcacaaa 300ccgcctggac atgtcccaga atgtgtcctt ggtcactgct
gaactgagca aggacgccca 360ggcgtcagcc gcccctgcag cacccttgcc tgagcggaac
tctctcagtg ccatcttcaa 420caatgtcatg accctatgtg ccatgctgcc cctgctgtta
ttcacctacc tcaactcctt 480cctgcatcag aggatccccc agtccgtacg gatcctgggc
agcctggtgg ccatcctgct 540ggtgtttctg atcactgcca tcctggtgaa ggtgcagctg
gatgctctgc ccttctttgt 600catcaccatg atcaagatcg tgctcattaa ttcatttggt
gccatcctgc agggcagcct 660gtttggtctg gctggccttc tgcctgccag ctacacggcc
cccatcatga gtggccaggg 720cctagcaggc ttctttgcct ccgtggccat gatctgcgct
attgccagtg gctcggaact 780atcagaaagt gccttcggct actttatcac agcctgtgct
gttatcattt tgaccatcat 840ctgttacctg ggcctgcccc gcctggaatt ctaccgctac
taccagcagc tcaagcttga 900aggacccggg gagcaggaga ccaagttgga cctcattagc
aaaggagagg agccaagagc 960aggcaaagag gaatctggag tttcagtctc caactctcag
cccaccaatg aaagccactc 1020tatcaaagcc atcctgaaaa atatctcagt cctggctttc
tctgtctgct tcatcttcac 1080tatcaccatt gggatgtttc cagccgtgac tgttgaggtc
aagtccagca tcgcaggcag 1140cagcacctgg gaacgttact tcattcctgt gtcctgtttc
ttgactttca atatctttga 1200ctggttgggc cggagcctca cagctgtatt catgtggcct
gggaaggaca gccgctggct 1260gccaagcctg gtgctggccc ggctggtgtt tgtgccactg
ctgctgctgt gcaacattaa 1320gccccgccgc tacctgactg tggtcttcga gcacgatgcc
tggttcatct tcttcatggc 1380tgcctttgcc ttctccaacg gctacctcgc cagcctctgc
atgtgcttcg ggcccaagaa 1440agtgaagcca gctgaggcag agaccgcagg agccatcatg
gccttcttcc tgtgtctggg 1500tctggcactg ggggctgttt tctccttcct gttccgggca
attgtgtgac aaaggatgga 1560cagaaggact gcctgcctcc ctccctgtct gcctcctgcc
ccttccttct gccaggggtg 1620atcctgagtg gtctggcggt tttttcttct aactgacttc
tgctttccac ggcgtgtgct 1680gggcccggat ctccaggccc tggggaggga gcctctggac
ggacagtggg gacattgtgg 1740gtttggggct cagagtcgag ggacggggtg tagcctcggc
atttgcttga gtttctccac 1800tcttggctct gactgatccc tgcttgtgca ggccagtgga
ggctcttggg cttggagaac 1860acgtgtgtct ctgtgtatgt gtctgtgtgt ctgcgtccgt
gtctgtcaga ctgtctgcct 1920gtcctggggt ggctaggagc tgggtctgac cgttgtatgg
tttgacctga tatactccat 1980tctcccctgc gcctcctcct ctgtgttttt tccatgtccc
cctcccaact ccccatgccc 2040agtttttacc catcatgcac cctgtacagt tgccacgtta
ctgccttttt taaaaatata 2100tttgacagaa accaggtgcc ttcagaggct ctctgattta
aataaacctt tcttgttttt 2160tt
2162
309 3933 DNA Homo sapiens
309 cacgaggcag cactctcttc
gtcgcttcgg ccagtgtgtc gggctgggcc ctgacaagcc 60acctgaggag aggctcggag
ccgggcccgg accccggcga ttgccgcccg cttctctcta 120gtctcacgag gggtttcccg
cctcgcaccc ccacctctgg acttgccttt ccttctcttc 180tccgcgtgtg gagggagcca
gcgcttaggc cggagcgagc ctgggggccg cccgccgtga 240agacatcgcg gggaccgatt
caccatggag ggcgccggcg gcgcgaacga caagaaaaag 300ataagttctg aacgtcgaaa
agaaaagtct cgagatgcag ccagatctcg gcgaagtaaa 360gaatctgaag ttttttatga
gcttgctcat cagttgccac ttccacataa tgtgagttcg 420catcttgata aggcctctgt
gatgaggctt accatcagct atttgcgtgt gaggaaactt 480ctggatgctg gtgatttgga
tattgaagat gacatgaaag cacagatgaa ttgcttttat 540ttgaaagcct tggatggttt
tgttatggtt ctcacagatg atggtgacat gatttacatt 600tctgataatg tgaacaaata
catgggatta actcagtttg aactaactgg acacagtgtg 660tttgatttta ctcatccatg
tgaccatgag gaaatgagag aaatgcttac acacagaaat 720ggccttgtga aaaagggtaa
agaacaaaac acacagcgaa gcttttttct cagaatgaag 780tgtaccctaa ctagccgagg
aagaactatg aacataaagt ctgcaacatg gaaggtattg 840cactgcacag gccacattca
cgtatatgat accaacagta accaacctca gtgtgggtat 900aagaaaccac ctatgacctg
cttggtgctg atttgtgaac ccattcctca cccatcaaat 960attgaaattc ctttagatag
caagactttc ctcagtcgac acagcctgga tatgaaattt 1020tcttattgtg atgaaagaat
taccgaattg atgggatatg agccagaaga acttttaggc 1080cgctcaattt atgaatatta
tcatgctttg gactctgatc atctgaccaa aactcatcat 1140gatatgttta ctaaaggaca
agtcaccaca ggacagtaca ggatgcttgc caaaagaggt 1200ggatatgtct gggttgaaac
tcaagcaact gtcatatata acaccaagaa ttctcaacca 1260cagtgcattg tatgtgtgaa
ttacgttgtg agtggtatta ttcagcacga cttgattttc 1320tcccttcaac aaacagaatg
tgtccttaaa ccggttgaat cttcagatat gaaaatgact 1380cagctattca ccaaagttga
atcagaagat acaagtagcc tctttgacaa acttaagaag 1440gaacctgatg ctttaacttt
gctggcccca gccgctggag acacaatcat atctttagat 1500tttggcagca acgacacaga
aactgatgac cagcaacttg aggaagtacc attatataat 1560gatgtaatgc tcccctcacc
caacgaaaaa ttacagaata taaatttggc aatgtctcca 1620ttacccaccg ctgaaacgcc
aaagccactt cgaagtagtg ctgaccctgc actcaatcaa 1680gaagttgcat taaaattaga
accaaatcca gagtcactgg aactttcttt taccatgccc 1740cagattcagg atcagacacc
tagtccttcc gatggaagca ctagacaaag ttcacctgag 1800cctaatagtc ccagtgaata
ttgtttttat gtggatagtg atatggtcaa tgaattcaag 1860ttggaattgg tagaaaaact
ttttgctgaa gacacagaag caaagaaccc attttctact 1920caggacacag atttagactt
ggagatgtta gctccctata tcccaatgga tgatgacttc 1980cagttacgtt ccttcgatca
gttgtcacca ttagaaagca gttccgcaag ccctgaaagc 2040gcaagtcctc aaagcacagt
tacagtattc cagcagactc aaatacaaga acctactgct 2100aatgccacca ctaccactgc
caccactgat gaattaaaaa cagtgacaaa agaccgtatg 2160gaagacatta aaatattgat
tgcatctcca tctcctaccc acatacataa agaaactact 2220agtgccacat catcaccata
tagagatact caaagtcgga cagcctcacc aaacagagca 2280ggaaaaggag tcatagaaca
gacagaaaaa tctcatccaa gaagccctaa cgtgttatct 2340gtcgctttga gtcaaagaac
tacagttcct gaggaagaac taaatccaaa gatactagct 2400ttgcagaatg ctcagagaaa
gcgaaaaatg gaacatgatg gttcactttt tcaagcagta 2460ggaattggaa cattattaca
gcagccagac gatcatgcag ctactacatc actttcttgg 2520aaacgtgtaa aaggatgcaa
atctagtgaa cagaatggaa tggagcaaaa gacaattatt 2580ttaataccct ctgatttagc
atgtagactg ctggggcaat caatggatga aagtggatta 2640ccacagctga ccagttatga
ttgtgaagtt aatgctccta tacaaggcag cagaaaccta 2700ctgcagggtg aagaattact
cagagctttg gatcaagtta actgagcttt ttcttaattt 2760cattcctttt tttggacact
ggtggctcac tacctaaagc agtctattta tattttctac 2820atctaatttt agaagcctgg
ctacaatact gcacaaactt ggttagttca atttttgatc 2880ccctttctac ttaatttaca
ttaatgctct tttttagtat gttctttaat gctggatcac 2940agacagctca ttttctcagt
tttttggtat ttaaaccatt gcattgcagt agcatcattt 3000taaaaaatgc acctttttat
ttatttattt ttggctaggg agtttatccc tttttcgaat 3060tatttttaag aagatgccaa
tataattttt gtaagaaggc agtaaccttt catcatgatc 3120ataggcagtt gaaaaatttt
tacacctttt ttttcacatt ttacataaat aataatgctt 3180tgccagcagt acgtggtagc
cacaattgca caatatattt tcttaaaaaa taccagcagt 3240tactcatgga atatattctg
cgtttataaa actagttttt aagaagaaat tttttttggc 3300ctatgaaatt gttaaacctg
gaacatgaca ttgttaatca tataataatg attcttaaat 3360gctgtatggt ttattattta
aatgggtaaa gccatttaca taatatagaa agatatgcat 3420atatctagaa ggtatgtggc
atttatttgg ataaaattct caattcagag aaatcatctg 3480atgtttctat agtcactttg
ccagctcaaa agaaaacaat accctatgta gttgtggaag 3540tttatgctaa tattgtgtaa
ctgatattaa acctaaatgt tctgcctacc ctgttggtat 3600aaagatattt tgagcagact
gtaaacaaga aaaaaaaaat catgcattct tagcaaaatt 3660gcctagtatg ttaatttgct
caaaatacaa tgtttgattt tatgcacttt gtcgctatta 3720acatcctttt tttcatgtag
atttcaataa ttgagtaatt ttagaagcat tattttagga 3780atatatagtt gtcacagtaa
atatcttgtt ttttctatgt acattgtaca aatttttcat 3840tccttttgct ctttgtggtt
ggatctaaca ctaactgtat tgttttgtta catcaaataa 3900acatcttctg tggaaaaaaa
aaaaaaaaaa aaa 3933
310 2872 DNA Homo sapiens
310 tccaggaatc gatagtgcat tcgtgcgcgc ggccgcccgt cgcttcgcac agggctggat
60ggttgtattg ggcagggtgg ctccaggatg ttaggaactg tgaagatgga agggcatgaa
120accagcgact ggaacagcta ctacgcagac acgcaggagg cctactcctc ggtcccggtc
180agcaacatga actcaggcct gggctccatg aactccatga acacctacat gaccatgaac
240accatgacta cgagcggcaa catgaccccg gcgtccttca acatgtccta tgccaacccg
300gccttagggg ccggcctgag tcccggcgca gtagccggca tgccgggggg ctcggcgggc
360gccatgaaca gcatgactgc ggccggcgtg acggccatgg gtacggcgct gagcccgagc
420ggcatgggcg ccatgggtgc gcagcaggcg gcctccatga tgaatggcct gggcccctac
480gcggccgcca tgaacccgtg catgagcccc atggcgtacg cgccgtccaa cctgggccgc
540agccgcgcgg gcggcggcgg cgacgccaag acgttcaagc gcagttaccc gcacgccaag
600ccgccctact cgtacatctc gctcatcacc atggccatcc agcgggcgcc cagcaagatg
660ctcacgctga gcgagatcta ccagtggatc atggacctct tcccctatta ccggcagaac
720cagcagcgct ggcagaactc catccgccac tcgctgtcct tcaatgactg cttcgtcaag
780gtggcacgct ccccggacaa gccgggcaag ggctcctact ggacgctgca cccggactcc
840ggcaacatgt tcgagaacgg ctgctacttg cgccgccaga agcgcttcaa gtgcgagaag
900cagccggggg ccggcggcgg gggcgggagc ggaagcgggg gcagcggcgc caagggcggc
960cctgagagcc gcaaggaccc ctctggcgcc tctaacccca gcgccgactc gcccctccat
1020cggggtgtgc acgggaagac cggccagcta gagggcgcgc cggccccggg cccggccgcc
1080agcccccaga ctctggacca cagtggggcg acggcgacag ggggcgcctc ggagttgaag
1140actccagcct cctcaactgc gccccccata agctccgggc ccggggcgct ggcctctgtg
1200cccgcctctc acccggcaca cggcttggca ccccacgagt cccagctgca cctgaaaggg
1260gacccccact actccttcaa ccacccgttc tccatcaaca acctcatgtc ctcctcggag
1320cagcagcata agctggactt caaggcatac gaacaggcac tgcaatactc gccttacggc
1380tctacgttgc ccgccagcct gcctctaggc agcgcctcgg tgaccaccag gagccccatc
1440gagccctcag ccctggagcc ggcgtactac caaggtgtgt attccagacc cgtcctaaac
1500acttcctagc tcccgggact ggggggtttg tctggcatag ccatgctggt agcaagagag
1560aaaaaatcaa cagcaaacaa aaccacacaa accaaaccgt caacagcata ataaaatcca
1620acaactattt ttatttcatt tttcatgcac aaccttgccc ccagtgcaaa agactgttac
1680tttattattg tattcaaaat tcattgtgta tattactaca aagacggccc caaaccaatt
1740tttttcctgc gaagtttaat gatccacaag tgtatatatg aaattctcct ccttccttgc
1800ccccctctct ttcttccctc ttggccctcc agacattcta gtttgtggag ggttatttaa
1860aaaacaaaaa ggaagatggt caagtttgta aaatatttgt ttgtgctttt cccccctcct
1920tacctgaccc cctacgagtt tacaggcttg tggcaatact cttaaccata agaattgaaa
1980tggtgaagaa acaagtatac actagaggct cttaaaagta ttgaaaagac aatactgctg
2040ttatatagca agacataaac agattataaa catcagagcc atttgcttct cagtttacat
2100ttctgataca tgcagatagc agatgtcttt aaatgaaata catgtatatt gtgtatggac
2160ttaattatgc acatgctcag atgtgtagac atcctccgta tatttacata acatatagag
2220gtaatagata ggtgatatac gtgatacgtt ctcaagagtt gcttgaccga aagttacaag
2280gaccccaacc cctttgctct ctacccacag atggccctgg gaacaatcct caggaattgc
2340cctcaagaac tcgcttcttt gctttgagag tgccatggtc atgtcattct gaggtacata
2400acacataaat tagtttctat gagtgtatac catttaaaga ttttttcagt aaagggaata
2460ttacatgttg ggaggaggag ataagttata gggagctgga tttcaaacgg tggtccaaga
2520ttcaaaaatc ctattgatag tggccatttt aatcattgcc atcgtgtgct tgtttcatcc
2580agtgttatgc actttccaca gttggtgtta gtatagccag agggtttcat tattatttct
2640ctttgctttc tcaatgttaa tttattgcat ggtttattct ttttctttac agctgaaatt
2700gctttaaatg atggttaaaa ttacaaatta aattgggaat ttttatcaat gtgattgtaa
2760ttaaaaatat tttgatttaa ataacaaaaa taataccaga ttttaagccg cggaaaatgt
2820tcttgatcat ttgcagttaa ggactttaaa taaatcaaat gttaacaaaa aa
2872
311 926 DNA Homo sapiens
311 ggggcccatt ctgtttcagc cagtcgccaa gaatcatgaa
agtcgccagt ggcagcaccg 60ccaccgccgc cgcgggcccc agctgcgcgc tgaaggccgg
caagacagcg agcggtgcgg 120gcgaggtggt gcgctgtctg tctgagcaga gcgtggccat
ctcgcgctgc cggggcgccg 180gggcgcgcct gcctgccctg ctggacgagc agcaggtaaa
cgtgctgctc tacgacatga 240acggctgtta ctcacgcctc aaggagctgg tgcccaccct
gccccagaac cgcaaggtga 300gcaaggtgga gattctccag cacgtcatcg actacatcag
ggaccttcag ttggagctga 360actcggaatc cgaagttggg acccccgggg gccgagggct
gccggtccgg gctccgctca 420gcaccctcaa cggcgagatc agcgccctga cggccgaggc
ggcatgcgtt cctgcggacg 480atcgcatctt gtgtcgctga agcgcctccc ccagggaccg
gcggacccca gccatccagg 540gggcaagagg aattacgtgc tctgtgggtc tcccccaacg
cgcctcgccg gatctgaggg 600agaacaagac cgatcggcgg ccactgcgcc cttaactgca
tccagcctgg ggctgaggct 660gaggcactgg cgaggagagg gcgctcctct ctgcacacct
actagtcacc agagacttta 720gggggtggga ttccactcgt gtgtttctat tttttgaaaa
gcagacattt taaaaaatgg 780tcacgtttgg tgcttctcag atttctgagg aaattgcttt
gtattgtata ttacaatgat 840caccgactga gaatattgtt ttacaatagt tctgtggggc
tgtttttttg ttattaaaca 900aataatttag atggtgaaaa aaaaaa
926
312 4989 DNA Homo sapiens
312 tttttttttt ttttgagaaa
gggaatttca tcccaaataa aaggaatgaa gtctggctcc 60ggaggagggt ccccgacctc
gctgtggggg ctcctgtttc tctccgccgc gctctcgctc 120tggccgacga gtggagaaat
ctgcgggcca ggcatcgaca tccgcaacga ctatcagcag 180ctgaagcgcc tggagaactg
cacggtgatc gagggctacc tccacatcct gctcatctcc 240aaggccgagg actaccgcag
ctaccgcttc cccaagctca cggtcattac cgagtacttg 300ctgctgttcc gagtggctgg
cctcgagagc ctcggagacc tcttccccaa cctcacggtc 360atccgcggct ggaaactctt
ctacaactac gccctggtca tcttcgagat gaccaatctc 420aaggatattg ggctttacaa
cctgaggaac attactcggg gggccatcag gattgagaaa 480aatgctgacc tctgttacct
ctccactgtg gactggtccc tgatcctgga tgcggtgtcc 540aataactaca ttgtggggaa
taagccccca aaggaatgtg gggacctgtg tccagggacc 600atggaggaga agccgatgtg
tgagaagacc accatcaaca atgagtacaa ctaccgctgc 660tggaccacaa accgctgcca
gaaaatgtgc ccaagcacgt gtgggaagcg ggcgtgcacc 720gagaacaatg agtgctgcca
ccccgagtgc ctgggcagct gcagcgcgcc tgacaacgac 780acggcctgtg tagcttgccg
ccactactac tatgccggtg tctgtgtgcc tgcctgcccg 840cccaacacct acaggtttga
gggctggcgc tgtgtggacc gtgacttctg cgccaacatc 900ctcagcgccg agagcagcga
ctccgagggg tttgtgatcc acgacggcga gtgcatgcag 960gagtgcccct cgggcttcat
ccgcaacggc agccagagca tgtactgcat cccttgtgaa 1020ggtccttgcc cgaaggtctg
tgaggaagaa aagaaaacaa agaccattga ttctgttact 1080tctgctcaga tgctccaagg
atgcaccatc ttcaagggca atttgctcat taacatccga 1140cgggggaata acattgcttc
agagctggag aacttcatgg ggctcatcga ggtggtgacg 1200ggctacgtga agatccgcca
ttctcatgcc ttggtctcct tgtccttcct aaaaaacctt 1260cgcctcatcc taggagagga
gcagctagaa gggaattact ccttctacgt cctcgacaac 1320cagaacttgc agcaactgtg
ggactgggac caccgcaacc tgaccatcaa agcagggaaa 1380atgtactttg ctttcaatcc
caaattatgt gtttccgaaa tttaccgcat ggaggaagtg 1440acggggacta aagggcgcca
aagcaaaggg gacataaaca ccaggaacaa cggggagaga 1500gcctcctgtg aaagtgacgt
cctgcatttc acctccacca ccacgtcgaa gaatcgcatc 1560atcataacct ggcaccggta
ccggccccct gactacaggg atctcatcag cttcaccgtt 1620tactacaagg aagcaccctt
taagaatgtc acagagtatg atgggcagga tgcctgcggc 1680tccaacagct ggaacatggt
ggacgtggac ctcccgccca acaaggacgt ggagcccggc 1740atcttactac atgggctgaa
gccctggact cagtacgccg tttacgtcaa ggctgtgacc 1800ctcaccatgg tggagaacga
ccatatccgt ggggccaaga gtgagatctt gtacattcgc 1860accaatgctt cagttccttc
cattcccttg gacgttcttt cagcatcgaa ctcctcttct 1920cagttaatcg tgaagtggaa
ccctccctct ctgcccaacg gcaacctgag ttactacatt 1980gtgcgctggc agcggcagcc
tcaggacggc tacctttacc ggcacaatta ctgctccaaa 2040gacaaaatcc ccatcaggaa
gtatgccgac ggcaccatcg acattgagga ggtcacagag 2100aaccccaaga ctgaggtgtg
tggtggggag aaagggcctt gctgcgcctg ccccaaaact 2160gaagccgaga agcaggccga
gaaggaggag gctgaatacc gcaaagtctt tgagaatttc 2220ctgcacaact ccatcttcgt
gcccagacct gaaaggaagc ggagagatgt catgcaagtg 2280gccaacacca ccatgtccag
ccgaagcagg aacaccacgg ccgcagacac ctacaacatc 2340accgacccgg aagagctgga
gacagagtac cctttctttg agagcagagt ggataacaag 2400gagagaactg tcatttctaa
ccttcggcct ttcacattgt accgcatcga tatccacagc 2460tgcaaccacg aggctgagaa
gctgggctgc agcgcctcca acttcgtctt tgcaaggact 2520atgcccgcag aaggagcaga
tgacattcct gggccagtga cctgggagcc aaggcctgaa 2580aactccatct ttttaaagtg
gccggaacct gagaatccca atggattgat tctaatgtat 2640gaaataaaat acggatcaca
agttgaggat cagcgagaat gtgtgtccag acaggaatac 2700aggaagtatg gaggggccaa
gctaaaccgg ctaaacccgg ggaactacac agcccggatt 2760caggccacat ctctctctgg
gaatgggtcg tggacagatc ctgtgttctt ctatgtccag 2820gccaaaacag gatatgaaaa
cttcatccat ctgatcatcg ctctgcccgt cgctgtcctg 2880ttgatcgtgg gagggttggt
gattatgctg tacgtcttcc atagaaagag aaataacagc 2940aggctgggga atggagtgct
gtatgcctct gtgaacccgg agtacttcag cgctgctgat 3000gtgtacgttc ctgatgagtg
ggaggtggct cgggagaaga tcaccatgag ccgggaactt 3060gggcaggggt cgtttgggat
ggtctatgaa ggagttgcca agggtgtggt gaaagatgaa 3120cctgaaacca gagtggccat
taaaacagtg aacgaggccg caagcatgcg tgagaggatt 3180gagtttctca acgaagcttc
tgtgatgaag gagttcaatt gtcaccatgt ggtgcgattg 3240ctgggtgtgg tgtcccaagg
ccagccaaca ctggtcatca tggaactgat gacacggggc 3300gatctcaaaa gttatctccg
gtctctgagg ccagaaatgg agaataatcc agtcctagca 3360cctccaagcc tgagcaagat
gattcagatg gccggagaga ttgcagacgg catggcatac 3420ctcaacgcca ataagttcgt
ccacagagac cttgctgccc ggaattgcat ggtagccgaa 3480gatttcacag tcaaaatcgg
agattttggt atgacgcgag atatctatga gacagactat 3540taccggaaag gaggcaaagg
gctgctgccc gtgcgctgga tgtctcctga gtccctcaag 3600gatggagtct tcaccactta
ctcggacgtc tggtccttcg gggtcgtcct ctgggagatc 3660gccacactgg ccgagcagcc
ctaccagggc ttgtccaacg agcaagtcct tcgcttcgtc 3720atggagggcg gccttctgga
caagccagac aactgtcctg acatgctgtt tgaactgatg 3780cgcatgtgct ggcagtataa
ccccaagatg aggccttcct tcctggagat catcagcagc 3840atcaaagagg agatggagcc
tggcttccgg gaggtctcct tctactacag cgaggagaac 3900aagctgcccg agccggagga
gctggacctg gagccagaga acatggagag cgtccccctg 3960gacccctcgg cctcctcgtc
ctccctgcca ctgcccgaca gacactcagg acacaaggcc 4020gagaacggcc ccggccctgg
ggtgctggtc ctccgcgcca gcttcgacga gagacagcct 4080tacgcccaca tgaacggggg
ccgcaagaac gagcgggcct tgccgctgcc ccagtcttcg 4140acctgctgat ccttggatcc
tgaatctgtg caaacagtaa cgtgtgcgca cgcgcagcgg 4200ggtggggggg gagagagagt
tttaacaatc cattcacaag cctcctgtac ctcagtggat 4260cttcagttct gcccttgctg
cccgcgggag acagcttctc tgcagtaaaa cacatttggg 4320atgttccttt tttcaatatg
caagcagctt tttattccct gcccaaaccc ttaactgaca 4380tgggccttta agaaccttaa
tgacaacact taatagcaac agagcacttg agaaccagtc 4440tcctcactct gtccctgtcc
ttccctgttc tccctttctc tctcctctct gcttcataac 4500ggaaaaataa ttgccacaag
tccagctggg aagccctttt tatcagtttg aggaagtggc 4560tgtccctgtg gccccatcca
accactgtac acacccgcct gacaccgtgg gtcattacaa 4620aaaaacacgt ggagatggaa
atttttacct ttatctttca cctttctagg gacatgaaat 4680ttacaaaggg ccatcgttca
tccaaggctg ttaccatttt aacgctgcct aattttgcca 4740aaatcctgaa ctttctccct
catcggcccg gcgctgattc ctcgtgtccg gaggcatggg 4800tgagcatggc agctggttgc
tccatttgag agacacgctg gcgacacact ccgtccatcc 4860gactgcccct gctgtgctgc
tcaaggccac aggcacacag gtctcattgc ttctgactag 4920attattattt gggggaactg
gacacaatag gtctttctct cagtgaaggt ggggagaagc 4980tgaaccggc
4989
313 12515 DNA Homo sapiens
313 ctaccgggcg gaggtgagcg cggcgccggc tcctcctgcg gcggactttg ggtgcgactt
60gacgagcggt ggttcgacaa gtggccttgc gggccggatc gtcccagtgg aagagttgta
120aatttgcttc tggccttccc ctacggatta tacctggcct tcccctacgg attatactca
180acttactgtt tagaaaatgt ggcccacgag acgcctggtt actatcaaaa ggagcggggt
240cgacggtccc cactttcccc tgagcctcag cacctgcttg tttggaaggg gtattgaatg
300tgacatccgt atccagcttc ctgttgtgtc aaaacaacat tgcaaaattg aaatccatga
360gcaggaggca atattacata atttcagttc cacaaatcca acacaagtaa atgggtctgt
420tattgatgag cctgtacggc taaaacatgg agatgtaata actattattg atcgttcctt
480caggtatgaa aatgaaagtc ttcagaatgg aaggaagtca actgaatttc caagaaaaat
540acgtgaacag gagccagcac gtcgtgtctc aagatctagc ttctcttctg accctgatga
600gaaagctcaa gattccaagg cctattcaaa aatcactgaa ggaaaagttt caggaaatcc
660tcaggtacat atcaagaatg tcaaagaaga cagtaccgca gatgactcaa aagacagtgt
720tgctcaggga acaactaatg ttcattcctc agaacatgct ggacgtaatg gcagaaatgc
780agctgatccc atttctgggg attttaaaga aatttccagc gttaaattag tgagccgtta
840tggagaattg aagtctgttc ccactacaca atgtcttgac aatagcaaaa aaaatgaatc
900tcccttttgg aagctttatg agtcagtgaa gaaagagttg gatgtaaaat cacaaaaaga
960aaatgtccta cagtattgta gaaaatctgg attacaaact gattacgcaa cagagaaaga
1020aagtgctgat ggtttacagg gggagaccca actgttggtc tcgcgtaagt caagaccaaa
1080atctggtggg agcggccacg ctgtggcaga gcctgcttca cctgaacaag agcttgacca
1140gaacaagggg aagggaagag acgtggagtc tgttcagact cccagcaagg ctgtgggcgc
1200cagctttcct ctctatgagc cggctaaaat gaagacccct gtacaatatt cacagcaaca
1260aaattctcca caaaaacata agaacaaaga cctgtatact actggtagaa gagaatctgt
1320gaatctgggt aaaagtgaag gcttcaaggc tggtgataaa actcttactc ccaggaagct
1380ttcaactaga aatcgaacac cagctaaagt tgaagatgca gctgactctg ccactaagcc
1440agaaaatctc tcttccaaaa ccagaggaag tattcctaca gatgtggaag ttctgcctac
1500ggaaactgaa attcacaatg agccattttt aactctgtgg ctcactcaag ttgagaggaa
1560gatccaaaag gattccctca gcaagcctga gaaattgggc actacagctg gacagatgtg
1620ctctgggtta cctggtctta gttcagttga tatcaacaac tttggtgatt ccattaatga
1680gagtgaggga atacctttga aaagaaggcg tgtgtccttt ggtgggcacc taagacctga
1740actatttgat gaaaacttgc ctcctaatac gcctctcaaa aggggagaag ccccaaccaa
1800aagaaagtct ctggtaatgc acactccacc tgtcctgaag aaaatcatca aggaacagcc
1860tcaaccatca ggaaaacaag agtcaggttc agaaatccat gtggaagtga aggcacaaag
1920cttggttata agccctccag ctcctagtcc taggaaaact ccagttgcca gtgatcaacg
1980ccgtaggtcc tgcaaaacag cccctgcttc cagcagcaaa tctcagacag aggttcctaa
2040gagaggagga gaaagagtgg caacctgcct tcaaaagaga gtgtctatca gccgaagtca
2100acatgatatt ttacagatga tatgttccaa aagaagaagt ggtgcttcgg aagcaaatct
2160gattgttgca aaatcatggg cagatgtagt aaaacttggt gcaaaacaaa cacaaactaa
2220agtcataaaa catggtcctc aaaggtcaat gaacaaaagg caaagaagac ctgctactcc
2280aaagaagcct gtgggcgaag ttcacagtca atttagtaca ggccacgcaa actctccttg
2340taccataata atagggaaag ctcatactga aaaagtacat gtgcctgctc gaccctacag
2400agtgctcaac aacttcattt ccaaccaaaa aatggacttt aaggaagatc tttcaggaat
2460agctgaaatg ttcaagaccc cagtgaagga gcaaccgcag ttgacaagca catgtcacat
2520cgctatttca aattcagaga atttgcttgg aaaacagttt caaggaactg attcaggaga
2580agaacctctg ctccccacct cagagagttt tggaggaaat gtgttcttca gtgcacagaa
2640tgcagcaaaa cagccatctg ataaatgctc tgcaagccct cccttaagac ggcagtgtat
2700tagagaaaat ggaaacgtag caaaaacgcc caggaacacc tacaaaatga cttctctgga
2760gacaaaaact tcagatactg agacagagcc ttcaaaaaca gtatccactg taaacaggtc
2820aggaaggtct acagagttca ggaatataca gaagctacct gtggaaagta agagtgaaga
2880aacaaataca gaaattgttg agtgcatcct aaaaagaggt cagaaggcaa cactactaca
2940acaaaggaga gaaggagaga tgaaggaaat agaaagacct tttgagacat ataaggaaaa
3000tattgaatta aaagaaaacg atgaaaagat gaaagcaatg aagagatcaa gaacttgggg
3060gcagaaatgt gcaccaatgt ctgacctgac agacctcaag agcttgcctg atacagaact
3120catgaaagac acggcacgtg gccagaatct cctccaaacc caagatcatg ccaaggcacc
3180aaagagtgag aaaggcaaaa tcactaaaat gccctgccag tcattacaac cagaaccaat
3240aaacacccca acacacacaa aacaacagtt gaaggcatcc ctggggaaag taggtgtgaa
3300agaagagctc ctagcagtcg gcaagttcac acggacgtca ggggagacca cgcacacgca
3360cagagagcca gcaggagatg gcaagagcat cagaacgttt aaggagtctc caaagcagat
3420cctggaccca gcagcccgtg taactggaat gaagaagtgg ccaagaacgc ctaaggaaga
3480ggcccagtca ctagaagacc tggctggctt caaagagctc ttccagacac caggtccctc
3540tgaggaatca atgactgatg agaaaactac caaaatagcc tgcaaatctc caccaccaga
3600atcagtggac actccaacaa gcacaaagca atggcctaag agaagtctca ggaaagcaga
3660tgtagaggaa gaattcttag cactcaggaa actaacacca tcagcaggga aagccatgct
3720tacgcccaaa ccagcaggag gtgatgagaa agacattaaa gcatttatgg gaactccagt
3780gcagaaactg gacctggcag gaactttacc tggcagcaaa agacagctac agactcctaa
3840ggaaaaggcc caggctctag aagacctggc tggctttaaa gagctcttcc agactcctgg
3900tcacaccgag gaattagtgg ctgctggtaa aaccactaaa ataccctgcg actctccaca
3960gtcagaccca gtggacaccc caacaagcac aaagcaacga cccaagagaa gtatcaggaa
4020agcagatgta gagggagaac tcttagcgtg caggaatcta atgccatcag caggcaaagc
4080catgcacacg cctaaaccat cagtaggtga agagaaagac atcatcatat ttgtgggaac
4140tccagtgcag aaactggacc tgacagagaa cttaaccggc agcaagagac ggccacaaac
4200tcctaaggaa gaggcccagg ctctggaaga cctgactggc tttaaagagc tcttccagac
4260ccctggtcat actgaagaag cagtggctgc tggcaaaact actaaaatgc cctgcgaatc
4320ttctccacca gaatcagcag acaccccaac aagcacaaga aggcagccca agacaccttt
4380ggagaaaagg gacgtacaga aggagctctc agccctgaag aagctcacac agacatcagg
4440ggaaaccaca cacacagata aagtaccagg aggtgaggat aaaagcatca acgcgtttag
4500ggaaactgca aaacagaaac tggacccagc agcaagtgta actggtagca agaggcaccc
4560aaaaactaag gaaaaggccc aacccctaga agacctggct ggctggaaag agctcttcca
4620gacaccagta tgcactgaca agcccacgac tcacgagaaa actaccaaaa tagcctgcag
4680atcacaacca gacccagtgg acacaccaac aagctccaag ccacagtcca agagaagtct
4740caggaaagtg gacgtagaag aagaattctt cgcactcagg aaacgaacac catcagcagg
4800caaagccatg cacacaccca aaccagcagt aagtggtgag aaaaacatct acgcatttat
4860gggaactcca gtgcagaaac tggacctgac agagaactta actggcagca agagacggct
4920acaaactcct aaggaaaagg cccaggctct agaagacctg gctggcttta aagagctctt
4980ccagacacga ggtcacactg aggaatcaat gactaacgat aaaactgcca aagtagcctg
5040caaatcttca caaccagacc tagacaaaaa cccagcaagc tccaagcgac ggctcaagac
5100atccctgggg aaagtgggcg tgaaagaaga gctcctagca gttggcaagc tcacacagac
5160atcaggagag actacacaca cacacacaga gccaacagga gatggtaaga gcatgaaagc
5220atttatggag tctccaaagc agatcttaga ctcagcagca agtctaactg gcagcaagag
5280gcagctgaga actcctaagg gaaagtctga agtccctgaa gacctggccg gcttcatcga
5340gctcttccag acaccaagtc acactaagga atcaatgact aatgaaaaaa ctaccaaagt
5400atcctacaga gcttcacagc cagacctagt ggacacccca acaagctcca agccacagcc
5460caagagaagt ctcaggaaag cagacactga agaagaattt ttagcattta ggaaacaaac
5520gccatcagca ggcaaagcca tgcacacacc caaaccagca gtaggtgaag agaaagacat
5580caacacgttt ttgggaactc cagtgcagaa actggaccag ccaggaaatt tacctggcag
5640caatagacgg ctacaaactc gtaaggaaaa ggcccaggct ctagaagaac tgactggctt
5700cagagagctt ttccagacac catgcactga taaccccaca gctgatgaga aaactaccaa
5760aaaaatactc tgcaaatctc cgcaatcaga cccagcggac accccaacaa acacaaagca
5820acggcccaag agaagcctca agaaagcaga cgtagaggaa gaatttttag cattcaggaa
5880actaacacca tcagcaggca aagccatgca cacgcctaaa gcagcagtag gtgaagagaa
5940agacatcaac acatttgtgg ggactccagt ggagaaactg gacctgctag gaaatttacc
6000tggcagcaag agacggccac aaactcctaa agaaaaggcc aaggctctag aagatctggc
6060tggcttcaaa gagctcttcc agacaccagg tcacactgag gaatcaatga ccgatgacaa
6120aatcacagaa gtatcctgca aatctccaca accagaccca gtcaaaaccc caacaagctc
6180caagcaacga ctcaagatat ccttggggaa agtaggtgtg aaagaagagg tcctaccagt
6240cggcaagctc acacagacgt cagggaagac cacacagaca cacagagaga cagcaggaga
6300tggaaagagc atcaaagcgt ttaaggaatc tgcaaagcag atgctggacc cagcaaacta
6360tggaactggg atggagaggt ggccaagaac acctaaggaa gaggcccaat cactagaaga
6420cctggccggc ttcaaagagc tcttccagac accagaccac actgaggaat caacaactga
6480tgacaaaact accaaaatag cctgcaaatc tccaccacca gaatcaatgg acactccaac
6540aagcacaagg aggcggccca aaacaccttt ggggaaaagg gatatagtgg aagagctctc
6600agccctgaag cagctcacac agaccacaca cacagacaaa gtaccaggag atgaggataa
6660aggcatcaac gtgttcaggg aaactgcaaa acagaaactg gacccagcag caagtgtaac
6720tggtagcaag aggcagccaa gaactcctaa gggaaaagcc caacccctag aagacttggc
6780tggcttgaaa gagctcttcc agacaccagt atgcactgac aagcccacga ctcacgagaa
6840aactaccaaa atagcctgca gatctccaca accagaccca gtgggtaccc caacaatctt
6900caagccacag tccaagagaa gtctcaggaa agcagacgta gaggaagaat ccttagcact
6960caggaaacga acaccatcag tagggaaagc tatggacaca cccaaaccag caggaggtga
7020tgagaaagac atgaaagcat ttatgggaac tccagtgcag aaattggacc tgccaggaaa
7080tttacctggc agcaaaagat ggccacaaac tcctaaggaa aaggcccagg ctctagaaga
7140cctggctggc ttcaaagagc tcttccagac accaggcact gacaagccca cgactgatga
7200gaaaactacc aaaatagcct gcaaatctcc acaaccagac ccagtggaca ccccagcaag
7260cacaaagcaa cggcccaaga gaaacctcag gaaagcagac gtagaggaag aatttttagc
7320actcaggaaa cgaacaccat cagcaggcaa agccatggac accccaaaac cagcagtaag
7380tgatgagaaa aatatcaaca catttgtgga aactccagtg cagaaactgg acctgctagg
7440aaatttacct ggcagcaaga gacagccaca gactcctaag gaaaaggctg aggctctaga
7500ggacctggtt ggcttcaaag aactcttcca gacaccaggt cacactgagg aatcaatgac
7560tgatgacaaa atcacagaag tatcctgtaa atctccacag ccagagtcat tcaaaacctc
7620aagaagctcc aagcaaaggc tcaagatacc cctggtgaaa gtggacatga aagaagagcc
7680cctagcagtc agcaagctca cacggacatc aggggagact acgcaaacac acacagagcc
7740aacaggagat agtaagagca tcaaagcgtt taaggagtct ccaaagcaga tcctggaccc
7800agcagcaagt gtaactggta gcaggaggca gctgagaact cgtaaggaaa aggcccgtgc
7860tctagaagac ctggttgact tcaaagagct cttctcagca ccaggtcaca ctgaagagtc
7920aatgactatt gacaaaaaca caaaaattcc ctgcaaatct cccccaccag aactaacaga
7980cactgccacg agcacaaaga gatgccccaa gacacgtccc aggaaagaag taaaagagga
8040gctctcagca gttgagaggc tcacgcaaac atcagggcaa agcacacaca cacacaaaga
8100accagcaagc ggtgatgagg gcatcaaagt attgaagcaa cgtgcaaaga agaaaccaaa
8160cccagtagaa gaggaaccca gcaggagaag gccaagagca cctaaggaaa aggcccaacc
8220cctggaagac ctggccggct tcacagagct ctctgaaaca tcaggtcaca ctcaggaatc
8280actgactgct ggcaaagcca ctaaaatacc ctgcgaatct cccccactag aagtggtaga
8340caccacagca agcacaaaga ggcatctcag gacacgtgtg cagaaggtac aagtaaaaga
8400agagccttca gcagtcaagt tcacacaaac atcaggggaa accacggatg cagacaaaga
8460accagcaggt gaagataaag gcatcaaagc attgaaggaa tctgcaaaac agacaccggc
8520tccagcagca agtgtaactg gcagcaggag acggccaaga gcacccaggg aaagtgccca
8580agccatagaa gacctagctg gcttcaaaga cccagcagca ggtcacactg aagaatcaat
8640gactgatgac aaaaccacta aaataccctg caaatcatca ccagaactag aagacaccgc
8700aacaagctca aagagacggc ccaggacacg tgcccagaaa gtagaagtga aggaggagct
8760gttagcagtt ggcaagctca cacaaacctc aggggagacc acgcacaccg acaaagagcc
8820ggtaggtgag ggcaaaggca cgaaagcatt taagcaacct gcaaagcgga acgtggacgc
8880agaagatgta attggcagca ggagacagcc aagagcacct aaggaaaagg cccaacccct
8940ggaagacctg gccagcttcc aagagctctc tcaaacacca ggccacactg aggaactggc
9000aaatggtgct gctgatagct ttacaagcgc tccaaagcaa acacctgaca gtggaaaacc
9060tctaaaaata tccagaagag ttcttcgggc ccctaaagta gaacccgtgg gagacgtggt
9120aagcaccaga gaccctgtaa aatcacaaag caaaagcaac acttccctgc ccccactgcc
9180cttcaagagg ggaggtggca aagatggaag cgtcacggga accaagaggc tgcgctgcat
9240gccagcacca gaggaaattg tggaggagct gccagccagc aagaagcaga gggttgctcc
9300cagggcaaga ggcaaatcat ccgaacccgt ggtcatcatg aagagaagtt tgaggacttc
9360tgcaaaaaga attgaacctg cggaagagct gaacagcaac gacatgaaaa ccaacaaaga
9420ggaacacaaa ttacaagact cggtccctga aaataaggga atatccctgc gctccagacg
9480ccaagataag actgaggcag aacagcaaat aactgaggtc tttgtattag cagaaagaat
9540agaaataaac agaaatgaaa agaagcccat gaagacctcc ccagagatgg acattcagaa
9600tccagatgat ggagcccgga aacccatacc tagagacaaa gtcactgaga acaaaaggtg
9660cttgaggtct gctagacaga atgagagctc ccagcctaag gtggcagagg agagcggagg
9720gcagaagagt gcgaaggttc tcatgcagaa tcagaaaggg aaaggagaag caggaaattc
9780agactccatg tgcctgagat caagaaagac aaaaagccag cctgcagcaa gcactttgga
9840gagcaaatct gtgcagagag taacgcggag tgtcaagagg tgtgcagaaa atccaaagaa
9900ggctgaggac aatgtgtgtg tcaagaaaat aacaaccaga agtcataggg acagtgaaga
9960tatttgacag aaaaatcgaa ctgggaaaaa tataataaag ttagttttgt gataagttct
10020agtgcagttt ttgtcataaa ttacaagtga attctgtaag taaggctgtc agtctgctta
10080agggaagaaa actttggatt tgctgggtct gaatcggctt cataaactcc actgggagca
10140ctgctgggct cctggactga gaatagttga acaccggggg ctttgtgaag gagtctgggc
10200caaggtttgc cctcagcttt gcagaatgaa gccttgaggt ctgtcaccac ccacagccac
10260cctacagcag ccttaactgt gacacttgcc acactgtgtc gtcgtttgtt tgcctatgtt
10320ctccagggca cggtggcagg aacaactatc ctcgtctgtc ccaacactga gcaggcactc
10380ggtaaacacg aatgaatgga taagcgcacg gatgaatgga gcttacaaga tctgtctttc
10440caatggccgg gggcatttgg tccccaaatt aaggctattg gacatctgca caggacagtc
10500ctatttttga tgtcctttcc tttctgaaaa taaagttttg tgctttggag aatgactcgt
10560gagcacatct ttagggacca agagtgactt tctgtaagga gtgactcgtg gcttgccttg
10620gtctcttggg aatacttttc taactagggt tgctctcacc tgagacattc tccacccgcg
10680gaatctcagg gtcccaggct gtgggccatc acgacctcaa actggctcct aatctccagc
10740tttcctgtca ttgaaagctt cggaagttta ctggctctgc tcccgcctgt tttctttctg
10800actctatctg gcagcccgat gccacccagt acaggaagtg acaccagtac tctgtaaagc
10860atcatcatcc ttggagagac tgagcactca gcaccttcag ccacgatttc aggatcgctt
10920ccttgtgagc cgctgcctcc gaaatctcct ttgaagccca gacatctttc tccagcttca
10980gacttgtaga tataactcgt tcatcttcat ttactttcca ctttgccccc tgtcctctct
11040gtgttcccca aatcagagaa tagcccgcca tcccccagat cacctgtctg gattcctccc
11100cattcaccca ccttgccagg tgcaggtgag gatggtgcac cagacagggt agctgtcccc
11160caaaatgtgc cctgtgcggg cagtgccctg tctccacgtt tgtttcccca gtgtctggcg
11220gggagccagg tgacatcata aatacttgct gaatgaatgc agaaatcagc ggtactgact
11280tgtactatat tggctgccat gatagggttc tcacagcgtc atccatgatc gtaagggaga
11340atgacattct gcttgaggga gggaatagaa aggggcaggg aggggacatc tgagggcttc
11400acagggctgc aaagggtaca gggattgcac cagggcagaa caggggaggg tgttcaagga
11460agagtggctc ttagcagagg cactttggaa ggtgtgaggc ataaatgctt ccttctacgt
11520aggccaacct caaaactttc agtaggaatg ttgctatgat caagttgttc taacacttta
11580gacttagtag taattatgaa cctcacatag aaaaatttca tccagccata tgcctgtgga
11640gtggaatatt ctgtttagta gaaaaatcct ttagagttca gctctaacca gaaatcttgc
11700tgaagtatgt cagcaccttt tctcaccctg gtaagtacag tatttcaaga gcacgctaag
11760ggtggttttc attttacagg gctgttgatg atgggttaaa aatgttcatt taagggctac
11820ccccgtgttt aatagatgaa caccacttct acacaaccct ccttggtact gggggaggga
11880gagatctgac aaatactgcc cattccccta ggctgactgg atttgagaac aaatacccac
11940ccatttccac catggtatgg taacttctct gagcttcagt ttccaagtga atttccatgt
12000aataggacat tcccattaaa tacaagctgt ttttactttt tcgcctccca gggcctgtgc
12060gatctggtcc cccagcctct cttgggcttt cttacactaa ctctgtacct accatctcct
12120gcctccctta ggcaggcacc tccaaccacc acacactccc tgctgttttc cctgcctgga
12180actttcccac cagccccacc aagatcattt catccagtcc tgagctcagc ttaagggagg
12240cttcttgcct gtgggttccc tcacccccat gcctgtcctc caggctgggg caggttctta
12300gtttgcctgg aattgttctg tacctctttg tagcacgtag tgttgtgaaa ctaagccact
12360aattgagttt ctggctcccc tcctggggtt gtaagttttg ttcattcatg agggccgact
12420gtatttcctg gttactgtat cccagtgacc agccacagga gatgtccaat aaagtatgtg
12480atgaaatggt cttaaaaaaa aaaaaaaaaa aaaaa
12515
314 2444 DNA Homo sapiens
314 ggcacgaggc ggggccgggt cgcagctggg
cccgcggcat ggacgaactg ttccccctca 60tcttcccggc agagcagccc aagcagcggg
gcatgcgctt ccgctacaag tgcgaggggc 120gctccgcggg cagcatccca ggcgagagga
gcacagatac caccaagacc caccccacca 180tcaagatcaa tggctacaca ggaccaggga
cagtgcgcat ctccctggtc accaaggacc 240ctcctcaccg gcctcacccc cacgagcttg
taggaaagga ctgccgggat ggcttctatg 300aggctgagct ctgcccggac cgctgcatcc
acagtttcca gaacctggga atccagtgtg 360tgaagaagcg ggacctggag caggctatca
gtcagcgcat ccagaccaac aacaacccct 420tccaagttcc tatagaagag cagcgtgggg
actacgacct gaatgctgtg cggctctgct 480tccaggtgac agtgcgggac ccatcaggca
ggcccctccg cctgccgcct gtcctttctc 540atcccatctt tgacaatcgt gcccccaaca
ctgccgagct caagatctgc cgagtgaacc 600gaaactctgg cagctgcctc ggtggggatg
agatcttcct actgtgtgac aaggtgcaga 660aagaggacat tgaggtgtat ttcacgggac
caggctggga ggcccgaggc tccttttcgc 720aagctgatgt gcaccgacaa gtggccattg
tgttccggac ccctccctac gcagacccca 780gcctgcaggc tcctgtgcgt gtctccatgc
agctgcggcg gccttccgac cgggagctca 840gtgagcccat ggaattccag tacctgccag
atacagacga tcgtcaccgg attgaggaga 900aacgtaaaag gacatatgag accttcaaga
gcatcatgaa gaagagtcct ttcagcggac 960ccaccgaccc ccggcctcca cctcgacgca
ttgctgtgcc ttcccgcagc tcagcttctg 1020tccccaagcc agcaccccag ccctatccct
ttacgtcatc cctgagcacc atcaactatg 1080atgagtttcc caccatggtg tttccttctg
ggcagatcag ccaggcctcg gccttggccc 1140cggcccctcc ccaagtcctg ccccaggctc
cagcccctgc ccctgctcca gccatggtat 1200cagctctggc ccaggcccca gcccctgtcc
cagtcctagc cccaggccct cctcaggctg 1260tggccccacc tgcccccaag cccacccagg
ctggggaagg aacgctgtca gaggccctgc 1320tgcagctgca gtttgatgat gaagacctgg
gggccttgct tggcaacagc acagacccag 1380ctgtgttcac agacctggca tccgtcgaca
actccgagtt tcagcagctg ctgaaccagg 1440gcatacctgt ggccccccac acaactgagc
ccatgctgat ggagtaccct gaggctataa 1500ctcgcctagt gacagcccag aggccccccg
acccagctcc tgctccactg ggggccccgg 1560ggctccccaa tggcctcctt tcaggagatg
aagacttctc ctccattgcg gacatggact 1620tctcagccct gctgagtcag atcagctcct
aagggggtga cgcctgccct ccccagagca 1680ctggttgcag gggattgaag ccctccaaaa
gcacttacgg attctggtgg ggtgtgttcc 1740aactgccccc aactttgtgg atgtcttcct
tggagggggg agccatattt tattctttta 1800ttgtcagtat ctgtatctct ctctcttttt
ggaggtgctt aagcagaagc attaacttct 1860ctggaaaggg gggagctggg gaaactcaaa
cttttcccct gtcctgatgg tcagctccct 1920tctctgtagg gaactgtggg gtcccccatc
cccatcctcc agcttctggt actctcctag 1980agacagaagc aggctggagg taaggccttt
gagcccacaa agccttatca agtgtcttcc 2040atcatggatt cattacagct taatcaaaat
aacgccccag ataccagccc ctgtatggca 2100ctggcattgt ccctgtgcct aacaccagcg
tttgaggggc tgccttcctg ccctacagag 2160gtctctgccg gctctttcct tgctcaacca
tggctgaagg aaacagtgca acagcactgg 2220ctctctccag gatccagaag gggtttggtc
tggacttcct tgctctcccc tcttctcaag 2280tgccttaata gtagggtaag ttgttaagag
tgggggagag caggctggca gctctccagt 2340caggaggcat agtttttagt gaacaatcaa
agcacttgga ctcttgctct ttctactctg 2400aactaataaa gctgttgcca agctggacgg
cacgagctcg tgcc 2444
315 732 DNA Homo sapiens
315 tgctgcgaac cacgtgggtc ccgggcgcgt ttcgggtgct ggcggctgca gccggagttc
60aaacctaagc agctggaagg aaccatggcc aactgtgagc gtaccttcat tgcgatcaaa
120ccagatgggg tccagcgggg tcttgtggga gagattatca agcgttttga gcagaaagga
180ttccgccttg ttggtctgaa attcatgcaa gcttccgaag atcttctcaa ggaacactac
240gttgacctga aggaccgtcc attctttgcc ggcctggtga aatacatgca ctcagggccg
300gtagttgcca tggtctggga ggggctgaat gtggtgaaga cgggccgagt catgctcggg
360gagaccaacc ctgcagactc caagcctggg accatccgtg gagacttctg catacaagtt
420ggcaggaaca ttatacatgg cagtgattct gtggagagtg cagagaagga gatcggcttg
480tggtttcacc ctgaggaact ggtagattac acgagctgtg ctcagaactg gatctatgaa
540tgacaggagg gcagaccaca ttgcttttca catccatttc ccctccttcc catgggcaga
600ggaccaggct gtaggaaatc tagttattta caggaacttc atcataattt ggagggaagc
660tcttggagct gtgagttctc cctgtacagt gttaccatcc ccgaccatct gattaaaatg
720cttcctccca gc
732
316 2422 DNA Homo sapiens
316 gtcagcctcc cttccaccgc catattgggc cactaaaaaa
agggggctcg tcttttcggg 60gtgtttttct ccccctcccc tgtccccgct tgctcacggc
tctgcgactc cgacgccggc 120aaggtttgga gagcggctgg gttcgcggga cccgcgggct
tgcacccgcc cagactcgga 180cgggctttgc caccctctcc gcttgcctgg tcccctctcc
tctccgccct cccgctcgcc 240agtccatttg atcagcggag actcggcggc cgggccgggg
cttccccgca gcccctgcgc 300gctcctagag ctcgggccgt ggctcgtcgg ggtctgtgtc
ttttggctcc gagggcagtc 360gctgggcttc cgagaggggt tcgggccgcg taggggcgct
ttgttttgtt cggttttgtt 420tttttgagag tgcgagagag gcggtcgtgc agacccggga
gaaagatgtc aaacgtgcga 480gtgtctaacg ggagccctag cctggagcgg atggacgcca
ggcaggcgga gcaccccaag 540ccctcggcct gcaggaacct cttcggcccg gtggaccacg
aagagttaac ccgggacttg 600gagaagcact gcagagacat ggaagaggcg agccagcgca
agtggaattt cgattttcag 660aatcacaaac ccctagaggg caagtacgag tggcaagagg
tggagaaggg cagcttgccc 720gagttctact acagaccccc gcggcccccc aaaggtgcct
gcaaggtgcc ggcgcaggag 780agccaggatg tcagcgggag ccgcccggcg gcgcctttaa
ttggggctcc ggctaactct 840gaggacacgc atttggtgga cccaaagact gatccgtcgg
acagccagac ggggttagcg 900gagcaatgcg caggaataag gaagcgacct gcaaccgacg
attcttctac tcaaaacaaa 960agagccaaca gaacagaaga aaatgtttca gacggttccc
caaatgccgg ttctgtggag 1020cagacgccca agaagcctgg cctcagaaga cgtcaaacgt
aaacagctcg aattaagaat 1080atgtttcctt gtttatcaga tacatcactg cttgatgaag
caaggaagat atacatgaaa 1140attttaaaaa tacatatcgc tgacttcatg gaatggacat
cctgtataag cactgaaaaa 1200caacaacaca ataacactaa aattttaggc actcttaaat
gatctgcctc taaaagcgtt 1260ggatgtagca ttatgcaatt aggtttttcc ttatttgctt
cattgtacta cctgtgtata 1320tagtttttac cttttatgta gcacataaac tttggggaag
ggagggcagg gtggggctga 1380ggaactgacg tggagcgggg tatgaagagc ttgctttgat
ttacagcaag tagataaata 1440tttgacttgc atgaagagaa gcaattttgg ggaagggttt
gaattgtttt ctttaaagat 1500gtaatgtccc tttcagagac agctgatact tcatttaaaa
aaatcacaaa aatttgaaca 1560ctggctaaag ataattgcta tttattttta caagaagttt
attctcattt gggagatctg 1620gtgatctccc aagctatcta aagtttgtta gatagctgca
tgtggctttt ttaaaaaagc 1680aacagaaacc tatcctcact gccctcccca gtctctctta
aagttggaat ttaccagtta 1740attactcagc agaatggtga tcactccagg tagtttgggg
caaaaatccg aggtgcttgg 1800gagttttgaa tgttaagaat tgaccatctg cttttattaa
atttgttgac aaaattttct 1860cattttcttt tcacttcggg ctgtgtaaac acagtcaaaa
taattctaaa tccctcgata 1920tttttaaaga tctgtaagta acttcacatt aaaaaatgaa
atatttttta atttaaagct 1980tactctgtcc atttatccac aggaaagtgt tatttttaaa
ggaaggttca tgtagagaaa 2040agcacacttg taggataagt gaaatggata ctacatcttt
aaacagtatt tcattgcctg 2100tgtatggaaa aaccatttga agtgtacctg tgtacataac
tctgtaaaaa cactgaaaaa 2160ttatactaac ttatttatgt taaaagattt tttttaatct
agacaatata caagccaaag 2220tggcatgttt tgtgcatttg taaatgctgt gttgggtaga
ataggttttc ccctcttttg 2280ttaaataata tggctatgct taaaaggttg catactgagc
caagtataat tttttgtaat 2340gtgtgaaaaa gatgccaatt attgttacac attaagtaat
caataaagaa aacttccata 2400gctaaaaaaa aaaaaaaaaa aa
2422
317 5061 DNA Homo sapiens
317 atggctcaga tatttagcaa
cagcggattt aaagaatgtc cattttcaca tccggaacca 60acaagagcaa aagatgtgga
caaagaagaa gcattacaga tggaagcaga ggctttagca 120aaactgcaaa aggatagaca
agtgactgac aatcagagag gctttgagtt gtcaagcagc 180accagaaaaa aagcacaggt
ttataacaag caggattatg atctcatggt gtttcctgaa 240tcagattccc aaaaaagagc
attagatatt gatgtagaaa agctcaccca agctgaactt 300gagaaactat tgctggatga
cagtttcgag actaaaaaaa cacctgtatt accagttact 360cctattctga gcccttcctt
ttcagcacag ctctatttta gacctactat tcagagagga 420cagtggccac ctggattacc
tgggccttcc acttatgctt taccttctat ttatccttct 480acttacagta aacaggctgc
attccaaaat ggcttcaatc caagaatgcc cacttttcca 540tctacagaac ctatatattt
aagtcttccg ggacaatctc catatttctc atatcctttg 600acacctgcca caccctttca
tccacaagga agcttaccta tctatcgtcc agtagtcagt 660actgacatgg caaaactatt
tgacaaaata gctagtacat cagaattttt aaaaaatggg 720aaagcaagga ctgatttgga
gataacagat tcaaaagtca gcaatctaca ggtatctcca 780aagtctgagg atatcagtaa
atttgactgg ttagacttgg atcctctaag taagcctaag 840gtggataatg tggaggtatt
agaccatgag gaagagaaaa atgtttcaag tttgctagca 900aaggatcctt gggatgctgt
tcttcttgaa gagagatcga cagcaaattg tcatcttgaa 960agaaaggtga atggaaaatc
cctttctgtg gcaactgtta caagaagcca gtctttaaat 1020attcgaacaa ctcagcttgc
aaaagcccag ggccatatat ctcagaaaga cccaaatggg 1080accagtagtt tgccaactgg
aagttctctt cttcaagaag ttgaagtaca gaatgaggag 1140atggcagctt tttgtcgatc
cattacaaaa ttgaagacca aatttccata taccaatcac 1200cgcacaaacc caggctattt
gttaagtcca gtcacagcgc aaagaaacat atgcggagaa 1260aatgctagtg tgaaggtctc
cattgacatt gaaggatttc agctaccagt tacttttacg 1320tgtgatgtga gttctactgt
agaaatcatt ataatgcaag ccctttgctg ggtacatgat 1380gacttgaatc aagtagatgt
tggcagctat gttctaaaag tttgtggtca agaggaagtg 1440ctgcagaata atcattgcct
tggaagtcat gagcatattc aaaactgtcg aaaatgggac 1500acagaaatta gactacaact
cttgaccttc agtgcaatgt gtcaaaatct ggcccgaaca 1560gcagaagatg atgaaacacc
cgtggattta aacaaacacc tgtatcaaat agaaaaacct 1620tgcaaagaag ccatgacgag
acaccctgtt gaagaactct tagattctta tcacaaccaa 1680gtagaactgg ctcttcaaat
tgaaaaccaa caccgagcag tagatcaagt aattaaagct 1740gtaagaaaaa tctgtagtgc
tttagatggt gtcgagactc ttgccattac agaatcagta 1800aagaagctaa agagagcagt
taatcttcca aggagtaaaa ctgctgatgt gacttctttg 1860tttggaggag aagacactag
caggagttca actaggggct cacttaatcc tgaaaatcct 1920gttcaagtaa gcataaacca
attaactgca gcaatttatg atcttctcag actccatgca 1980aattctggta ggagtcctac
agactgtgcc caaagtagca agagtgtcaa ggaagcatgg 2040actacaacag agcagctcca
gtttactatt tttgctgctc atggaatttc aagtaattgg 2100gtatcaaatt atgaaaaata
ctacttgata tgttcactgt ctcacaatgg aaaggatctt 2160tttaaaccta ttcaatcaaa
gaaggttggc acttacaaga atttcttcta tcttattaaa 2220tgggatgaac taatcatttt
tcctatccag atatcacaat tgccattaga atcagttctt 2280caccttactc tttttggaat
tttaaatcag agcagtggaa gttcccctga ttctaataag 2340cagagaaagg gaccagaagc
tttgggcaaa gtttctttac ctctttgtga ctttagacgg 2400tttttaacat gtggaactaa
acttctatat ctttggactt catcacatac aaattctgtt 2460cctggaacag ttaccaaaaa
aggatatgtc atggaaagaa tagtgctaca ggttgatttt 2520ccttctcctg catttgatat
tatttataca actcctcaag ttgacagaag cattatacag 2580caacataact tagaaacact
agagaatgat ataaaaggga aacttcttga tattcttcat 2640aaagactcat cacttggact
ttctaaagaa gataaagctt ttttatggga gaaacgttat 2700tattgcttca aacacccaaa
ttgtcttcct aaaatattag caagcgcccc aaactggaaa 2760tggggtaatc ttgccaaaac
ttactcattg cttcaccagt ggcctgcatt gtacccacta 2820attgcattgg aacttcttga
ttcaaaattt gctgatcagg aagtaagatc cctagctgtg 2880acctggattg aggccattag
tgatgatgag ctaacagatc ttcttccaca gtttgtacaa 2940gctttgaaat atgaaattta
cttgaatagt tcattagtgc aattcctttt gtccagggca 3000ttgggaaata tccagatagc
acacaattta tattggcttc tcaaagatgc cctgcatgat 3060gtacagttta gtacccgata
cgaacatgtt ttgggtgctc tcctgtcagt aggaggaaaa 3120cgacttagag aagaacttct
aaaacagacg aaacttgtac agcttttagg aggagtagca 3180gaaaaagtaa ggcaggctag
tggatcagcc agacaggttg ttctccaaag aagtatggaa 3240cgagtacagt ccttttttca
gaaaaataaa tgccgtctcc ctctcaagcc aagtctagtg 3300gcaaaagaat taaatattaa
gtcgtgttcc ttcttcagtt ctaatgctgt ccccctaaaa 3360gtcacaatgg tgaatgctga
ccctctggga gaagaaatta atgtcatgtt taaggttggt 3420gaagatcttc ggcaagatat
gttagcttta cagatgataa agattatgga taagatctgg 3480cttaaagaag gactagatct
gaggatggta attttcaaat gtctctcaac tggcagagat 3540cgaggcatgg tggagctggt
tcctgcttcc gataccctca ggaaaatcca agtggaatat 3600ggtgtgacag gatcctttaa
agataaacca cttgcagagt ggctaaggaa atacaatccc 3660tctgaagaag aatatgaaaa
ggcttcagag aactttatct attcctgtgc tggatgctgt 3720gtagccacct atgttttagg
catctgtgat cgacacaatg acaatataat gcttcgaagc 3780acgggacaca tgtttcacat
tgactttgga aagtttttgg gacatgcaca gatgtttggc 3840agcttcaaaa gggatcgggc
tccttttgtg ctgacctctg atatggcata tgtcattaat 3900gggggtgaaa agcccaccat
tcgttttcag ttgtttgtgg acctctgctg tcaggcctac 3960aacttgataa gaaagcagac
aaaccttttt cttaacctcc tttcactgat gattccttca 4020gggttaccag aacttacaag
tattcaagat ttgaaatacg ttagagatgc acttcaaccc 4080caaactacag acgcagaagc
tacaattttc tttactaggc ttattgaatc aagtttggga 4140agcattgcca caaagtttaa
cttcttcatt cacaaccttg ctcagcttcg tttttctggt 4200cttccttcta atgatgagcc
catcctttca ttttcaccta aaacatactc ctttagacaa 4260gatggtcgaa tcaaggaagt
ctctgttttt acatatcata agaaatacaa cccagataaa 4320cattatattt atgtagtccg
aattttgtgg gaaggacaga ttgaaccatc atttgtcttc 4380cgaacatttg tcgaatttca
ggaacttcac aataagctca gtattatttt tccactttgg 4440aagttaccag gctttcctaa
taggatggtt ctaggaagaa cacacataaa agatgtagca 4500gccaaaagga aaattgagtt
aaacagttac ttacagagtt tgatgaatgc ttcaacggat 4560gtagcagagt gtgatcttgt
ttgtactttc ttccaccctt tacttcgtga tgagaaagct 4620gaagggatag ctaggtctgc
agatgcaggt tccttcagtc ctactccagg ccaaatagga 4680ggagctgtga aattatccat
ctcttaccga aatggtactc ttttcatcat ggtgatgcat 4740atcaaagatc ttgttactga
agatggagct gacccaaatc catatgtcaa aacataccta 4800cttccagata accacaaaac
atccaaacgt aaaaccaaaa tttcacgaaa aacgaggaat 4860ccgacattca atgaaatgct
tgtatacagt ggatatagca aagaaaccct aagacagcga 4920gaacttcaac taagtgtact
cagtgcagaa tctctgcggg agaatttttt cttgggtgga 4980gtaaccctgc ctttgaaaga
tttcaacttg agcaaagaga cggttaaatg gtatcagctg 5040actgcggcaa catacttgta a
5061
318 3014 DNA Homo sapiens
318 ctgaccagcg ccgccctccc ccgcccccga cccaggaggt ggagatccct ccggtccagc
60cacattcaac acccactttc tcctccctct gcccctatat tcccgaaacc ccctcctcct
120tcccttttcc ctcctccctg gagacggggg aggagaaaag gggagtccag tcgtcatgac
180tgagctgaag gcaaagggtc cccgggctcc ccacgtggcg ggcggcccgc cctcccccga
240ggtcggatcc ccactgctgt gtcgcccagc cgcaggtccg ttcccgggga gccagacctc
300ggacaccttg cctgaagttt cggccatacc tatctccctg gacgggctac tcttccctcg
360gccctgccag ggacaggacc cctccgacga aaagacgcag gaccagcagt cgctgtcgga
420cgtggagggc gcatattcca gagctgaagc tacaaggggt gctggaggca gcagttctag
480tcccccagaa aaggacagcg gactgctgga cagtgtcttg gacactctgt tggcgccctc
540aggtcccggg cagagccaac ccagccctcc cgcctgcgag gtcaccagct cttggtgcct
600gtttggcccc gaacttcccg aagatccacc ggctgccccc gccacccagc gggtgttgtc
660cccgctcatg agccggtccg ggtgcaaggt tggagacagc tccgggacgg cagctgccca
720taaagtgctg ccccggggcc tgtcaccagc ccggcagctg ctgctcccgg cctctgagag
780ccctcactgg tccggggccc cagtgaagcc gtctccgcag gccgctgcgg tggaggttga
840ggaggaggat ggctctgagt ccgaggagtc tgcgggtccg cttctgaagg gcaaacctcg
900ggctctgggt ggcgcggcgg ctggaggagg agccgcggct gtcccgccgg gggcggcagc
960aggaggcgtc gccctggtcc ccaaggaaga ttcccgcttc tcagcgccca gggtcgccct
1020ggtggagcag gacgcgccga tggcgcccgg gcgctccccg ctggccacca cggtgatgga
1080tttcatccac gtgcctatcc tgcctctcaa tcacgcctta ttggcagccc gcactcggca
1140gctgctggaa gacgaaagtt acgacggcgg ggccggggct gccagcgcct ttgccccgcc
1200gcggagttca ccctgtgcct cgtccacccc ggtcgctgta ggcgacttcc ccgactgcgc
1260gtacccgccc gacgccgagc ccaaggacga cgcgtaccct ctctatagcg acttccagcc
1320gcccgctcta aagataaagg aggaggagga aggcgcggag gcctccgcgc gctccccgcg
1380ttcctacctt gtggccggtg ccaaccccgc agccttcccg gatttcccgt tggggccacc
1440gcccccgctg ccgccgcgag cgaccccatc cagacccggg gaagcggcgg tgacggccgc
1500acccgccagt gcctcagtct cgtctgcgtc ctcctcgggg tcgaccctgg agtgcatcct
1560gtacaaagcg gagggcgcgc cgccccagca gggcccgttc gcgccgccgc cctgcaaggc
1620gccgggcgcg agcggctgcc tgctcccgcg ggacggcctg ccctccacct ccgcctctgc
1680cgccgccgcc ggggcggccc ccgcgctcta ccctgcactc ggcctcaacg ggctcccgca
1740gctcggctac caggccgccg tgctcaagga gggcctgccg caggtctacc cgccctatct
1800caactacctg aggccggatt cagaagccag ccagagccca caatacagct tcgagtcatt
1860acctcagaag atttgtttaa tctgtgggga tgaagcatca ggctgtcatt atggtgtcct
1920tacctgtggg agctgtaagg tcttctttaa gagggcaatg gaagggcagc acaactactt
1980atgtgctgga agaaatgact gcatcgttga taaaatccgc agaaaaaact gcccagcatg
2040tcgccttaga aagtgctgtc aggctggcat ggtccttgga ggtcgaaaat ttaaaaagtt
2100caataaagtc agagttgtga gagcactgga tgctgttgct ctcccacagc cagtgggcgt
2160tccaaatgaa agccaagccc taagccagag attcactttt tcaccaggtc aagacataca
2220gttgattcca ccactgatca acctgttaat gagcattgaa ccagatgtga tctatgcagg
2280acatgacaac acaaaacctg acacctccag ttctttgctg acaagtctta atcaactagg
2340cgagaggcaa cttctttcag tagtcaagtg gtctaaatca ttgccaggtt ttcgaaactt
2400acatattgat gaccagataa ctctcattca gtattcttgg atgagcttaa tggtgtttgg
2460tctaggatgg agatcctaca aacacgtcag tgggcagatg ctgtattttg cacctgatct
2520aatactaaat gaacagcgga tgaaagaatc atcattctat tcattatgcc ttaccatgtg
2580gcagatccca caggagtttg tcaagcttca agttagccaa gaagagttcc tctgtatgaa
2640agtattgtta cttcttaata caattccttt ggaagggcta cgaagtcaaa cccagtttga
2700ggagatgagg tcaagctaca ttagagagct catcaaggca attggtttga ggcaaaaagg
2760agttgtgtcg agctcacagc gtttctatca acttacaaaa cttcttgata acttgcatga
2820tcttgtcaaa caacttcatc tgtactgctt gaatacattt atccagtccc gggcactgag
2880tgttgaattt ccagaaatga tgtctgaagt tattgctgca caattaccca agatattggc
2940agggatggtg aaaccccttc tctttcataa aaagtgaatg tcatcttttt cttttaaaga
3000attaaatttt gtgg
3014
319 2148 DNA Homo sapiens
319 gcttcagggt acagctcccc cgcagccaga agccgggcct
gcagcgcctc agcaccgctc 60cgggacaccc cacccgcttc ccaggcgtga cctgtcaaca
gcaacttcgc ggtgtggtga 120actctctgag gaaaaaccat tttgattatt actctcagac
gtgcgtggca acaagtgact 180gagacctaga aatccaagcg ttggaggtcc tgaggccagc
ctaagtcgct tcaaaatgga 240acgaaggcgt ttgtggggtt ccattcagag ccgatacatc
agcatgagtg tgtggacaag 300cccacggaga cttgtggagc tggcagggca gagcctgctg
aaggatgagg ccctggccat 360tgccgccctg gagttgctgc ccagggagct cttcccgcca
ctcttcatgg cagcctttga 420cgggagacac agccagaccc tgaaggcaat ggtgcaggcc
tggcccttca cctgcctccc 480tctgggagtg ctgatgaagg gacaacatct tcacctggag
accttcaaag ctgtgcttga 540tggacttgat gtgctccttg cccaggaggt tcgccccagg
aggtggaaac ttcaagtgct 600ggatttacgg aagaactctc atcaggactt ctggactgta
tggtctggaa acagggccag 660tctgtactca tttccagagc cagaagcagc tcagcccatg
acaaagaagc gaaaagtaga 720tggtttgagc acagaggcag agcagccctt cattccagta
gaggtgctcg tagacctgtt 780cctcaaggaa ggtgcctgtg atgaattgtt ctcctacctc
attgagaaag tgaagcgaaa 840gaaaaatgta ctacgcctgt gctgtaagaa gctgaagatt
tttgcaatgc ccatgcagga 900tatcaagatg atcctgaaaa tggtgcagct ggactctatt
gaagatttgg aagtgacttg 960tacctggaag ctacccacct tggcgaaatt ttctccttac
ctgggccaga tgattaatct 1020gcgtagactc ctcctctccc acatccatgc atcttcctac
atttccccgg agaaggaaga 1080gcagtatatc gcccagttca cctctcagtt cctcagtctg
cagtgcctgc aggctctcta 1140tgtggactct ttatttttcc ttagaggccg cctggatcag
ttgctcaggc acgtgatgaa 1200ccccttggaa accctctcaa taactaactg ccggctttcg
gaaggggatg tgatgcatct 1260gtcccagagt cccagcgtca gtcagctaag tgtcctgagt
ctaagtgggg tcatgctgac 1320cgatgtaagt cccgagcccc tccaagctct gctggagaga
gcctctgcca ccctccagga 1380cctggtcttt gatgagtgtg ggatcacgga tgatcagctc
cttgccctcc tgccttccct 1440gagccactgc tcccagctta caaccttaag cttctacggg
aattccatct ccatatctgc 1500cttgcagagt ctcctgcagc acctcatcgg gctgagcaat
ctgacccacg tgctgtatcc 1560tgtccccctg gagagttatg aggacatcca tggtaccctc
cacctggaga ggcttgccta 1620tctgcatgcc aggctcaggg agttgctgtg tgagttgggg
cggcccagca tggtctggct 1680tagtgccaac ccctgtcctc actgtgggga cagaaccttc
tatgacccgg agcccatcct 1740gtgcccctgt ttcatgccta actagctggg tgcacatatc
aaatgcttca ttctgcatac 1800ttggacacta aagccaggat gtgcatgcat cttgaagcaa
caaagcagcc acagtttcag 1860acaaatgttc agtgtgagtg aggaaaacat gttcagtgag
gaaaaaacat tcagacaaat 1920gttcagtgag gaaaaaaagg ggaagttggg gataggcaga
tgttgacttg aggagttaat 1980gtgatctttg gggagataca tcttatagag ttagaaatag
aatctgaatt tctaaaggga 2040gattctggct tgggaagtac atgtaggagt taatccctgt
gtagactgtt gtaaagaaac 2100tgttgaaaat aaagagaagc aatgtgaagc aaaaaaaaaa
aaaaaaaa 2148
320 540 DNA Homo sapiens
320 atccctgact
cggggtcgcc tttggagcag agaggaggca atggccacca tggagaacaa 60ggtgatctgc
gccctggtcc tggtgtccat gctggccctc ggcaccctgg ccgaggccca 120gacagagacg
tgtacagtgg ccccccgtga aagacagaat tgtggttttc ctggtgtcac 180gccctcccag
tgtgcaaata agggctgctg tttcgacgac accgttcgtg gggtcccctg 240gtgcttctat
cctaatacca tcgacgtccc tccagaagag gagtgtgaat tttagacact 300tctgcaggga
tctgcctgca tcctgacggg gtgccgtccc cagcacggtg attagtccca 360gagctcggct
gccacctcca ccggacacct cagacacgct tctgcagctg tgcctcggct 420cacaacacag
attgactgct ctgactttga ctactcaaaa ttggcctaaa aattaaaaga 480gatcgatatt
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
540
321 2346 DNA Homo sapiens
321 gcacgaggct gcggcgggtc cgggcccatg aggcgacgaa
ggaggcggga cggcttttac 60ccagccccgg acttccgaga cagggaagct gaggacatgg
caggagtgtt tgacatagac 120ctggaccagc cagaggacgc gggctctgag gatgagctgg
aggagggggg tcagttaaat 180gaaagcatgg accatggggg agttggacca tatgaacttg
gcatggaaca ttgtgagaaa 240tttgaaatct cagaaactag tgtgaacaga gggccagaaa
aaatcagacc agaatgtttt 300gagctacttc gggtacttgg taaagggggc tatggaaagg
tttttcaagt acgaaaagta 360acaggagcaa atactgggaa aatatttgcc atgaaggtgc
ttaaaaaggc aatgatagta 420agaaatgcta aagatacagc tcatacaaaa gcagaacgga
atattctgga ggaagtaaag 480catcccttca tcgtggattt aatttatgcc tttcagactg
gtggaaaact ctacctcatc 540cttgagtatc tcagtggagg agaactattt atgcagttag
aaagagaggg aatatttatg 600gaagacactg cctgctttta cttggcagaa atctccatgg
ctttggggca tttacatcaa 660aaggggatca tctacagaga cctgaagccg gagaatatca
tgcttaatca ccaaggtcat 720gtgaaactaa cagactttgg actatgcaaa gaatctattc
atgatggaac agtcacacac 780acattttgtg gaacaataga atacatggcc cctgaaatct
tgatgagaag tggccacaat 840cgtgctgtgg attggtggag tttgggagca ttaatgtatg
acatgctgac tggagcaccc 900ccattcactg gggagaatag aaagaaaaca attgacaaaa
tcctcaaatg taaactcaat 960ttgcctccct acctcacaca agaagccaga gatctgctta
aaaagctgct gaaaagaaat 1020gctgcttctc gtctgggagc tggtcctggg gacgctggag
aagttcaagc tcatccattc 1080tttagacaca ttaactggga agaacttctg gctcgaaagg
tggagccccc ctttaaacct 1140ctgttgcaat ctgaagagga tgtaagtcag tttgattcca
agtttacacg tcagacacct 1200gtcgacagcc cagatgactc aactctcagt gaaagtgcca
atcaggtctt tctgggtttt 1260acatatgtgg ctccatctgt acttgaaagt gtgaaagaaa
agttttcctt tgaaccaaaa 1320atccgatcac ctcgaagatt tattggcagc ccacgaacac
ctgtcagccc agtcaaattt 1380tctcctgggg atttctgggg aagaggtgct tcggccagca
cagcaaatcc tcagacacct 1440gtggaatacc caatggaaac aagtggcata gagcagatgg
atgtgacaat gagtggggaa 1500gcatcggcac cacttccaat acgacagccg aactctgggc
catacaaaaa acaagctttt 1560cccatgatct ccaaacggcc agagcacctg cgtatgaatc
tatgacagag caatgctttt 1620aatgaattta aggcaaaaag gtggagaggg agatgtgtga
gcatcctgca aggtgaaaca 1680agactcaaaa tgacagtttc agagagtcaa tgtcattaca
tagaacactt cggacacagg 1740aaaaataaac gtggatttta aaaaatcaat caatggtgca
aaaaaaaact taaagcaaaa 1800tagtattgct gaactcttag gcacatcaat taattgattc
ctcgcgacat ctttctcaac 1860cttatcaagg attttcatgt tgatgactcg aaactgacag
tattaagggt aggatgttgc 1920tctgaatcac tgtgagtctg atgtgtgaag aagggtatcc
tttcattagg caagtacaaa 1980ttgcctataa tacttgcaac taaggacaaa ttagcatgca
agcttggtca aacttttccc 2040aggcaaaatg ggaaggcaaa gacaaaagaa acttaccaat
tgatgtttta cgtgcaaaca 2100acctgaatct tttttttata taaatatata tttttcaaat
agatttttga ttcagctcat 2160tatgaaaaac atcccaaact ttaaaatgcg aaattattgg
ttggtgtgaa gaaagccaga 2220caacttctgt ttcttctctt ggtgaaataa taaaatgcaa
atgaatcatt gttaacacag 2280ctgtggctcg tttgagggat tggggtggac ctggggttta
ttttcagtaa cccagctgcg 2340gagcct
2346
322 2420 DNA Homo sapiens
322 tccggggcgg cccccggcag
ccagcgcgac gttccaaaat cgaacctcag tggcggcgct 60cggaagcgga actctgccgg
ggccgcgccg gctacattgt ttcctccccc cgactccctc 120ccgccccctt cccccgcctt
tcttccctcc gcgacccggg ccgtgcgtcc gtccccctgc 180ctctgcctgg cggtccctcc
tcccctctcc ttgcacccat acctctttgt accgcacccc 240ctggggaccc ctgcgcccct
cccctccccc ctgaccgcat ggaccgtccc gcaggccgct 300gatgccgccc gcggcgaggt
ggcccggacc gcagtgcccc aagagagctc taatggtacc 360aagtgacagg ttggctttac
tgtgactcgg ggacgccaga gctcctgaga agatgtcagc 420aatacaggcc gcctggccat
ccggtacaga atgtattgcc aagtacaact tccacggcac 480tgccgagcag gacctgccct
tctgcaaagg agacgtgctc accattgtgg ccgtcaccaa 540ggaccccaac tggtacaaag
ccaaaaacaa ggtgggccgt gagggcatca tcccagccaa 600ctacgtccag aagcgggagg
gcgtgaaggc gggtaccaaa ctcagcctca tgccttggtt 660ccacggcaag atcacacggg
agcaggctga gcggcttctg tacccgccgg agacaggcct 720gttcctggtg cgggagagca
ccaactaccc cggagactac acgctgtgcg tgagctgcga 780cggcaaggtg gagcactacc
gcatcatgta ccatgccagc aagctcagca tcgacgagga 840ggtgtacttt gagaacctca
tgcagctggt ggagcactac acctcagacg cagatggact 900ctgtacgcgc ctcattaaac
caaaggtcat ggagggcaca gtggcggccc aggatgagtt 960ctaccgcagc ggctgggccc
tgaacatgaa ggagctgaag ctgctgcaga ccatcgggaa 1020gggggagttc ggagacgtga
tgctgggcga ttaccgaggg aacaaagtcg ccgtcaagtg 1080cattaagaac gacgccactg
cccaggcctt cctggctgaa gcctcagtca tgacgcaact 1140gcggcatagc aacctggtgc
agctcctggg cgtgatcgtg gaggagaagg gcgggctcta 1200catcgtcact gagtacatgg
ccaaggggag ccttgtggac tacctgcggt ctaggggtcg 1260gtcagtgctg ggcggagact
gtctcctcaa gttctcgcta gatgtctgcg aggccatgga 1320atacctggag ggcaacaatt
tcgtgcatcg agacctggct gcccgcaatg tgctggtgtc 1380tgaggacaac gtggccaagg
tcagcgactt tggtctcacc aaggaggcgt ccagcaccca 1440ggacacgggc aagctgccag
tcaagtggac agcccctgag gccctgagag agaagaaatt 1500ctccactaag tctgacgtgt
ggagtttcgg aatccttctc tgggaaatct actcctttgg 1560gcgagtgcct tatccaagaa
ttcccctgaa ggacgtcgtc cctcgggtgg agaagggcta 1620caagatggat gcccccgacg
gctgcccgcc cgcagtctat gaagtcatga agaactgctg 1680gcacctggac gccgccatgc
ggccctcctt cctacagctc cgagagcagc ttgagcacat 1740caaaacccac gagctgcacc
tgtgacggct ggcctccgcc tgggtcatgg gcctgtgggg 1800actgaacctg gaagatcatg
gacctggtgc ccctgctcac tgggcccgag cctgaactga 1860gccccagcgg gctggcgggc
ctttttcctg cgtcccagcc tgcacccctc cggccccgtc 1920tctcttggac ccacctgtgg
ggcctgggga gcccactgag gggccaggga ggaaggaggc 1980cacggagcgg gcggcagcgc
cccaccacgt cgggcttccc tggcctcccg ccactcgcct 2040tcttagagtt ttattccttt
ccttttttga gatttttttt ccgtgtgttt attttttatt 2100atttttcaag ataaggagaa
agaaagtacc cagcaaatgg gcattttaca agaagtacga 2160atcttatttt tcctgtcctg
cccgtgaggt gggggggacc gggcccctct ctagggaccc 2220ctcgccccag cctcattccc
cattctgtgt cccatgtccc gtgtctcctc ggtcgccccg 2280tgtttgcgct tgaccatgtt
gcactgtttg catgcgcccg aggcagacgt ctgtcagggg 2340cttggatttc gtgtgccgct
gccacccgcc cacccgcctt gtgagatgga atcgtaataa 2400accacgccat gaggaaaaaa
2420
323 2253 DNA Homo sapiens
323 ggaagacttg ggtccttggg tcgcaggtgg gagccgacgg gtgggtagac cgtgggggat
60atctcagtgg cggacgagga cggcggggac aaggggcggc tggtcggagt ggcggagcgt
120caagtcccct gtcggttcct ccgtccctga gtgtccttgg cgctgccttg tgcccgccca
180gcgcctttgc atccgctcct gggcaccgag gcgccctgta ggatactgct tgttacttat
240tacagctaga ggcatcatgg accgatctaa agaaaactgc atttcaggac ctgttaaggc
300tacagctcca gttggaggtc caaaacgtgt tctcgtgact cagcaaattc cttgtcagaa
360tccattacct gtaaatagtg gccaggctca gcgggtcttg tgtccttcaa attcttccca
420gcgcgttcct ttgcaagcac aaaagcttgt ctccagtcac aagccggttc agaatcagaa
480gcagaagcaa ttgcaggcaa ccagtgtacc tcatcctgtc tccaggccac tgaataacac
540ccaaaagagc aagcagcccc tgccatcggc acctgaaaat aatcctgagg aggaactggc
600atcaaaacag aaaaatgaag aatcaaaaaa gaggcagtgg gctttggaag actttgaaat
660tggtcgccct ctgggtaaag gaaagtttgg taatgtttat ttggcaagag aaaagcaaag
720caagtttatt ctggctctta aagtgttatt taaagctcag ctggagaaag ccggagtgga
780gcatcagctc agaagagaag tagaaataca gtcccacctt cggcatccta atattcttag
840actgtatggt tatttccatg atgctaccag agtctaccta attctggaat atgcaccact
900tggaacagtt tatagagaac ttcagaaact ttcaaagttt gatgagcaga gaactgctac
960ttatataaca gaattggcaa atgccctgtc ttactgtcat tcgaagagag ttattcatag
1020agacattaag ccagagaact tacttcttgg atcagctgga gagcttaaaa ttgcagattt
1080tgggtggtca gtacatgctc catcttccag gaggaccact ctctgtggca ccctggacta
1140cctgccccct gaaatgattg aaggtcggat gcatgatgag aaggtggatc tctggagcct
1200tggagttctt tgctatgaat ttttagttgg gaagcctcct tttgaggcaa acacatacca
1260agagacctac aaaagaatat cacgggttga attcacattc cctgactttg taacagaggg
1320agccagggac ctcatttcaa gactgttgaa gcataatccc agccagaggc caatgctcag
1380agaagtactt gaacacccct ggatcacagc aaattcatca aaaccatcaa attgccaaaa
1440caaagaatca gctagcaaac agtcttagga atcgtgcagg gggagaaatc cttgagccag
1500ggctgccata taacctgaca ggaacatgct actgaagttt attttaccat tgactgctgc
1560cctcaatcta gaacgctaca caagaaatat ttgttttact cagcaggtgt gccttaacct
1620ccctattcag aaagctccac atcaataaac atgacactct gaagtgaaag tagccacgag
1680aattgtgcta cttatactgg ttcataatct ggaggcaagg ttcgactgca gccgccccgt
1740cagcctgtgc taggcatggt gtcttcacag gaggcaaatc cagagcctgg ctgtggggaa
1800agtgaccact ctgccctgac cccgatcagt taaggagctg tgcaataacc ttcctagtac
1860ctgagtgagt gtgtaactta ttgggttggc gaagcctggt aaagctgttg gaatgagtat
1920gtgattcttt ttaagtatga aaataaagat atatgtacag acttgtattt tttctctggt
1980ggcattcctt taggaatgct gtgtgtctgt ccggcacccc ggtaggcctg attgggtttc
2040tagtcctcct taaccactta tctcccatat gagagtgtga aaaataggaa cacgtgctct
2100acctccattt agggatttgc ttgggataca gaagaggcca tgtgtctcag agctgttaag
2160ggcttatttt tttaaaacat tggagtcata gcatgtgtgt aaactttaaa tatgcaaata
2220aataagtatc tatgtctaaa aaaaaaaaaa aaa
2253
324 1619 DNA Homo sapiens
324 ccgccagatt tgaatcgcgg gacccgttgg cagaggtggc
ggcggcggca tgggtgcccc 60gacgttgccc cctgcctggc agccctttct caaggaccac
cgcatctcta cattcaagaa 120ctggcccttc ttggagggct gcgcctgcac cccggagcgg
atggccgagg ctggcttcat 180ccactgcccc actgagaacg agccagactt ggcccagtgt
ttcttctgct tcaaggagct 240ggaaggctgg gagccagatg acgaccccat agaggaacat
aaaaagcatt cgtccggttg 300cgctttcctt tctgtcaaga agcagtttga agaattaacc
cttggtgaat ttttgaaact 360ggacagagaa agagccaaga acaaaattgc aaaggaaacc
aacaataaga agaaagaatt 420tgaggaaact gcgaagaaag tgcgccgtgc catcgagcag
ctggctgcca tggattgagg 480cctctggccg gagctgcctg gtcccagagt ggctgcacca
cttccagggt ttattccctg 540gtgccaccag ccttcctgtg ggccccttag caatgtctta
ggaaaggaga tcaacatttt 600caaattagat gtttcaactg tgctcctgtt ttgtcttgaa
agtggcacca gaggtgcttc 660tgcctgtgca gcgggtgctg ctggtaacag tggctgcttc
tctctctctc tctctttttt 720gggggctcat ttttgctgtt ttgattcccg ggcttaccag
gtgagaagtg agggaggaag 780aaggcagtgt cccttttgct agagctgaca gctttgttcg
cgtgggcaga gccttccaca 840gtgaatgtgt ctggacctca tgttgttgag gctgtcacag
tcctgagtgt ggacttggca 900ggtgcctgtt gaatctgagc tgcaggttcc ttatctgtca
cacctgtgcc tcctcagagg 960acagtttttt tgttgttgtg tttttttgtt tttttttttt
ggtagatgca tgacttgtgt 1020gtgatgagag aatggagaca gagtccctgg ctcctctact
gtttaacaac atggctttct 1080tattttgttt gaattgttaa ttcacagaat agcacaaact
acaattaaaa ctaagcacaa 1140agccattcta agtcattggg gaaacggggt gaacttcagg
tggatgagga gacagaatag 1200agtgatagga agcgtctggc agatactcct tttgccactg
ctgtgtgatt agacaggccc 1260agtgagccgc ggggcacatg ctggccgctc ctccctcaga
aaaaggcagt ggcctaaatc 1320ctttttaaat gacttggctc gatgctgtgg gggactggct
gggctgctgc aggccgtgtg 1380tctgtcagcc caaccttcac atctgtcacg ttctccacac
gggggagaga cgcagtccgc 1440ccaggtcccc gctttctttg gaggcagcag ctcccgcagg
gctgaagtct ggcgtaagat 1500gatggatttg attcgccctc ctccctgtca tagagctgca
gggtggattg ttacagcttc 1560gctggaaacc tctggaggtc atctcggctg ttcctgagaa
ataaaaagcc tgtcatttc 1619
325 5010 DNA Homo sapiens
325 ggcggctcgg
gacggaggac gcgctagtgt gagtgcgggc ttctagaact acaccgaccc 60tcgtgtcctc
ccttcatcct gcggggctgg ctggagcggc cgctccggtg ctgtccagca 120gccataggga
gccgcacggg gagcgggaaa gcggtcgcgg ccccaggcgg ggcggccggg 180atggagcggg
gccgcgagcc tgtggggaag gggctgtggc ggcgcctcga gcggctgcag 240gttcttctgt
gtggcagttc agaatgatgg atcaagctag atcagcattc tctaacttgt 300ttggtggaga
accattgtca tatacccggt tcagcctggc tcggcaagta gatggcgata 360acagtcatgt
ggagatgaaa cttgctgtag atgaagaaga aaatgctgac aataacacaa 420aggccaatgt
cacaaaacca aaaaggtgta gtggaagtat ctgctatggg actattgctg 480tgatcgtctt
tttcttgatt ggatttatga ttggctactt gggctattgt aaaggggtag 540aaccaaaaac
tgagtgtgag agactggcag gaaccgagtc tccagtgagg gaggagccag 600gagaggactt
ccctgcagca cgtcgcttat attgggatga cctgaagaga aagttgtcgg 660agaaactgga
cagcacagac ttcaccagca ccatcaagct gctgaatgaa aattcatatg 720tccctcgtga
ggctggatct caaaaagatg aaaatcttgc gttgtatgtt gaaaatcaat 780ttcgtgaatt
taaactcagc aaagtctggc gtgatcaaca ttttgttaag attcaggtca 840aagacagcgc
tcaaaactcg gtgatcatag ttgataagaa cggtagactt gtttacctgg 900tggagaatcc
tgggggttat gtggcgtata gtaaggctgc aacagttact ggtaaactgg 960tccatgctaa
ttttggtact aaaaaagatt ttgaggattt atacactcct gtgaatggat 1020ctatagtgat
tgtcagagca gggaaaatca cctttgcaga aaaggttgca aatgctgaaa 1080gcttaaatgc
aattggtgtg ttgatataca tggaccagac taaatttccc attgttaacg 1140cagaactttc
attctttgga catgctcatc tggggacagg tgacccttac acacctggat 1200tcccttcctt
caatcacact cagtttccac catctcggtc atcaggattg cctaatatac 1260ctgtccagac
aatctccaga gctgctgcag aaaagctgtt tgggaatatg gaaggagact 1320gtccctctga
ctggaaaaca gactctacat gtaggatggt aacctcagaa agcaagaatg 1380tgaagctcac
tgtgagcaat gtgctgaaag agataaaaat tcttaacatc tttggagtta 1440ttaaaggctt
tgtagaacca gatcactatg ttgtagttgg ggcccagaga gatgcatggg 1500gccctggagc
tgcaaaatcc ggtgtaggca cagctctcct attgaaactt gcccagatgt 1560tctcagatat
ggtcttaaaa gatgggtttc agcccagcag aagcattatc tttgccagtt 1620ggagtgctgg
agactttgga tcggttggtg ccactgaatg gctagaggga tacctttcgt 1680ccctgcattt
aaaggctttc acttatatta atctggataa agcggttctt ggtaccagca 1740acttcaaggt
ttctgccagc ccactgttgt atacgcttat tgagaaaaca atgcaaaatg 1800tgaagcatcc
ggttactggg caatttctat atcaggacag caactgggcc agcaaagttg 1860agaaactcac
tttagacaat gctgctttcc ctttccttgc atattctgga atcccagcag 1920tttctttctg
tttttgcgag gacacagatt atccttattt gggtaccacc atggacacct 1980ataaggaact
gattgagagg attcctgagt tgaacaaagt ggcacgagca gctgcagagg 2040tcgctggtca
gttcgtgatt aaactaaccc atgatgttga attgaacctg gactatgaga 2100ggtacaacag
ccaactgctt tcatttgtga gggatctgaa ccaatacaga gcagacataa 2160aggaaatggg
cctgagttta cagtggctgt attctgctcg tggagacttc ttccgtgcta 2220cttccagact
aacaacagat ttcgggaatg ctgagaaaac agacagattt gtcatgaaga 2280aactcaatga
tcgtgtcatg agagtggagt atcacttcct ctctccctac gtatctccaa 2340aagagtctcc
tttccgacat gtcttctggg gctccggctc tcacacgctg ccagctttac 2400tggagaactt
gaaactgcgt aaacaaaata acggtgcttt taatgaaacg ctgttcagaa 2460accagttggc
tctagctact tggactattc agggagctgc aaatgccctc tctggtgacg 2520tttgggacat
tgacaatgag ttttaaatgt gatacccata gcttccatga gaacagcagg 2580gtagtctggt
ttctagactt gtgctgatcg tgctaaattt tcagtagggc tacaaaacct 2640gatgttaaaa
ttccatccca tcatcttggt actactagat gtctttaggc agcagctttt 2700aatacagggt
agataacctg tacttcaagt taaagtgaat aaccacttaa aaaatgtcca 2760tgatggaata
ttcccctatc tctagaattt taagtgcttt gtaatgggaa ctgcctcttt 2820cctgttgttg
ttaatgaaaa tgtcagaaac cagttatgtg aatgatctct ctgaatccta 2880agggctggtc
tctgctgaag gttgtaagtg gttcgcttac tttgagtgat cctccaactt 2940catttgatgc
taaataggag ataccaggtt gaaagacctc tccaaatgag atctaagcct 3000ttccataagg
aatgtagcag gtttcctcat tcctgaaaga aacagttaac tttcagaaga 3060gatgggcttg
ttttcttgcc aatgaggtct gaaatggagg tccttctgct ggataaaatg 3120aggttcaact
gttgattgca ggaataaggc cttaatatgt taacctcagt gtcatttatg 3180aaaagagggg
accagaagcc aaagacttag tatattttct tttcctctgt cccttccccc 3240ataagcctcc
atttagttct ttgttatttt tgtttcttcc aaagcacatt gaaagagaac 3300cagtttcagg
tgtttagttg cagactcagt ttgtcagact ttaaagaata atatgctgcc 3360aaattttggc
caaagtgtta atcttagggg agagctttct gtccttttgg cactgagata 3420tttattgttt
atttatcagt gacagagttc actataaatg gtgttttttt aatagaatat 3480aattatcgga
agcagtgcct tccataatta tgacagttat actgtcggtt ttttttaaat 3540aaaagcagca
tctgctaata aaacccaaca gatactggaa gttttgcatt tatggtcaac 3600acttaagggt
tttagaaaac agccgtcagc caaatgtaat tgaataaagt tgaagctaag 3660atttagagat
gaattaaatt taattagggg ttgctaagaa gcgagcactg accagataag 3720aatgctggtt
ttcctaaatg cagtgaattg tgaccaagtt ataaatcaat gtcacttaaa 3780ggctgtggta
gtactcctgc aaaattttat agctcagttt atccaaggtg taactctaat 3840tcccatttgc
aaaatttcca gtacctttgt cacaatccta acacattatc gggagcagtg 3900tcttccataa
tgtataaaga acaaggtagt ttttacctac cacagtgtct gtatcggaga 3960cagtgatctc
catatgttac actaagggtg taagtaatta tcgggaacag tgtttcccat 4020aattttcttc
atgcaatgac atcttcaaag cttgaagatc gttagtatct aacatgtatc 4080ccaactccta
taattcccta tcttttagtt ttagttgcag aaacattttg tggtcattaa 4140gcattgggtg
ggtaaattca accactgtaa aatgaaatta ctacaaaatt tgaaatttag 4200cttgggtttt
tgttaccttt atggtttctc caggtcctct acttaatgag atagcagcat 4260acatttataa
tgtttgctat tgacaagtca ttttaattta tcacattatt tgcatgttac 4320ctcctataaa
cttagtgcgg acaagtttta atccagaatt gaccttttga cttaaagcag 4380agggactttg
tatagaaggt ttgggggctg tggggaagga gagtcccctg aaggtctgac 4440acgtctgcct
acccattcgt ggtgatcaat taaatgtagg tatgaataag ttcgaagctc 4500cgtgagtgaa
ccatcatata aacgtgtagt acagctgttt gtcatagggc agttggaaac 4560ggcctcctag
ggaaaagttc atagggtctc ttcaggttct tagtgtcact tacctagatt 4620tacagcctca
cttgaatgtg tcactactca cagtctcttt aatcttcagt tttatcttta 4680atctcctctt
ttatcttgga ctgacattta gcgtagctaa gtgaaaaggt catagctgag 4740attcctggtt
cgggtgttac gcacacgtac ttaaatgaaa gcatgtggca tgttcatcgt 4800ataacacaat
atgaatacag ggcatgcatt ttgcagcagt gagtctcttc agaaaaccct 4860tttctacagt
tagggttgag ttacttccta tcaagccagt acgtgctaac aggctcaata 4920ttcctgaatg
aaatatcaga ctagtgacaa gctcctggtc ttgagatgtc ttctcgttaa 4980ggagtagggc
cttttggagg taaaggtata
5010
326 2574 DNA Homo sapiens
326 cctgtttaga cacatggaca acaatcccag cgctacaagg
cacacagtcc gcttcttcgt 60cctcagggtt gccagcgctt cctggaagtc ctgaagctct
cgcagtgcag tgagttcatg 120caccttcttg ccaagcctca gtctttggga tctggggagg
ccgcctggtt ttcctccctc 180cttctgcacg tctgctgggg tctcttcctc tccaggcctt
gccgtccccc tggcctctct 240tcccagctca cacatgaaga tgcacttgca aagggctctg
gtggtcctgg ccctgctgaa 300ctttgccacg gtcagcctct ctctgtccac ttgcaccacc
ttggacttcg gccacatcaa 360gaagaagagg gtggaagcca ttaggggaca gatcttgagc
aagctcaggc tcaccagccc 420ccctgagcca acggtgatga cccacgtccc ctatcaggtc
ctggcccttt acaacagcac 480ccgggagctg ctggaggaga tgcatgggga gagggaggaa
ggctgcaccc aggaaaacac 540cgagtcggaa tactatgcca aagaaatcca taaattcgac
atgatccagg ggctggcgga 600gcacaacgaa ctggctgtct gccctaaagg aattacctcc
aaggttttcc gcttcaatgt 660gtcctcagtg gagaaaaata gaaccaacct attccgagca
gaattccggg tcttgcgggt 720gcccaacccc agctctaagc ggaatgagca gaggatcgag
ctcttccaga tccttcggcc 780agatgagcac attgccaaac agcgctatat cggtggcaag
aatctgccca cacggggcac 840tgccgagtgg ctgtcctttg atgtcactga cactgtgcgt
gagtggctgt tgagaagaga 900gtccaactta ggtctagaaa tcagcattca ctgtccatgt
cacacctttc agcccaatgg 960agatatcctg gaaaacattc acgaggtgat ggaaatcaaa
ttcaaaggcg tggacaatga 1020ggatgaccat ggccgtggag atctggggcg cctcaagaag
cagaaggatc accacaaccc 1080tcatctaatc ctcatgatga ttcccccaca ccggctcgac
aacccgggcc aggggggtca 1140gaggaagaag cgggctttgg acaccaatta ctgcttccgc
aacttggagg agaactgctg 1200tgtgcgcccc ctctacattg acttccgaca ggatctgggc
tggaagtggg tccatgaacc 1260taagggctac tatgccaact tctgctcagg cccttgccca
tacctccgca gtgcagacac 1320aacccacagc acggtgctgg gactgtacaa cactctgaac
cctgaagcat ctgcctcgcc 1380ttgctgcgtg ccccaggacc tggagcccct gaccatcctg
tactatgttg ggaggacccc 1440caaagtggag cagctctcca acatggtggt gaagtcttgt
aaatgtagct gagaccccac 1500gtgcgacaga gagaggggag agagaaccac cactgcctga
ctgcccgctc ctcgggaaac 1560acacaagcaa caaacctcac tgagaggcct ggagcccaca
accttcggct ccgggcaaat 1620ggctgagatg gaggtttcct tttggaacat ttctttcttg
ctggctctga gaatcacggt 1680ggtaaagaaa gtgtgggttt ggttagagga aggctgaact
cttcagaaca cacagacttt 1740ctgtgacgca gacagagggg atggggatag aggaaaggga
tggtaagttg agatgttgtg 1800tggcaatggg atttgggcta ccctaaaggg agaaggaagg
gcagagaatg gctgggtcag 1860ggccagactg gaagacactt cagatctgag gttggatttg
ctcattgctg taccacatct 1920gctctaggga atctggatta tgttatacaa ggcaagcatt
ttttttttta aagacaggtt 1980acgaagacaa agtcccagaa ttgtatctca tactgtctgg
gattaagggc aaatctatta 2040cttttgcaaa ctgtcctcta catcaattaa catcgtgggt
cactacaggg agaaaatcca 2100ggtcatgcag ttcctggccc atcaactgta ttgggccttt
tggatatgct gaacgcagaa 2160gaaagggtgg aaatcaaccc tctcctgtct gccctctggg
tccctcctct cacctctccc 2220tcgatcatat ttccccttgg acacttggtt agacgccttc
caggtcagga tgcacatttc 2280tggattgtgg ttccatgcag ccttggggca ttatgggtct
tcccccactt cccctccaag 2340accctgtgtt catttggtgt tcctggaagc aggtgctaca
acatgtgagg cattcgggga 2400agctgcacat gtgccacaca gtgacttggc cccagacgca
tagactgagg tataaagaca 2460agtatgaata ttactctcaa aatctttgta taaataaata
tttttggggc atcctggatg 2520atttcatctt ctggaatatt gtttctagaa cagtaaaagc
cttattctaa ggtg 2574
327 1421 DNA Homo sapiens
327 acttactgcg
ggacggcctt ggagagtact cgggttcgtg aacttcccgg aggcgcaatg 60agctgcatta
acctgcccac tgtgctgccc ggctccccca gcaagacccg ggggcagatc 120caggtgattc
tcgggccgat gttctcagga aaaagcacag agttgatgag acgcgtccgt 180cgcttccaga
ttgctcagta caagtgcctg gtgatcaagt atgccaaaga cactcgctac 240agcagcagct
tctgcacaca tgaccggaac accatggagg cgctgcccgc ctgcctgctc 300cgagacgtgg
cccaggaggc cctgggcgtg gctgtcatag gcatcgacga ggggcagttt 360ttccctgaca
tcatggagtt ctgcgaggcc atggccaacg ccgggaagac cgtaattgtg 420gctgcactgg
atgggacctt ccagaggaag ccatttgggg ccatcctgaa cctggtgccg 480ctggccgaga
gcgtggtgaa gctgacggcg gtgtgcatgg agtgcttccg ggaagccgcc 540tataccaaga
ggctcggcac agagaaggag gtcgaggtga ttgggggagc agacaagtac 600cactccgtgt
gtcggctctg ctacttcaag aaggcctcag gccagcctgc cgggccggac 660aacaaagaga
actgcccagt gccaggaaag ccaggggaag ccgtggctgc caggaagctc 720tttgccccac
agcagattct gcaatgcagc cctgccaact gagggacctg caagggccgc 780ccgctccctt
cctgccactg ccgcctactg gacgctgccc tgcatgctgc ccagccactc 840caggaggaag
tcgggaggcg tggagggtga ccacaccttg gccttctggg aactctcctt 900tgtgtggctg
ccccacctgc cgcatgctcc ctcctctcct acccactggt ctgcttaaag 960cttccctctc
agctgctggg acgatcgccc aggctggagc tggccccgct tggtggcctg 1020ggatctggca
cactccctct ccttggggtg agggacagag ccccacgctg ttgacatcag 1080cctgcttctt
cccctctgcg gctttcactg ctgagtttct gttctccctg ggaagcctgt 1140gccagcacct
ttgagccttg gcccacactg aggcttaggc ctctctgcct gggatgggct 1200cccaccctcc
cctgaggatg gcctggattc acgccctctt gtttcctttt gggctcaaag 1260cccttcctac
ctctggtgat ggtttccaca ggaacaacag catctttcac caagatgggt 1320ggcaccaacc
ttgctgggac ttggatccca ggggcttatc tcttcaagtg tggagagggc 1380agggtccacg
cctctgctgt agcttatgaa attaactaat t
1421
328 4604 DNA Homo sapiens
328 ggaacagctt gtccacccgc cggccggacc agaagccttt
gggtctgaag tgtctgtgag 60acctcacaga agagcacccc tgggctccac ttacctgccc
cctgctcctt cagggatgga 120ggcaatggcg gccagcactt ccctgcctga ccctggagac
tttgaccgga acgtgccccg 180gatctgtggg gtgtgtggag accgagccac tggctttcac
ttcaatgcta tgacctgtga 240aggctgcaaa ggcttcttca ggcgaagcat gaagcggaag
gcactattca cctgcccctt 300caacggggac tgccgcatca ccaaggacaa ccgacgccac
tgccaggcct gccggctcaa 360acgctgtgtg gacatcggca tgatgaagga gttcattctg
acagatgagg aagtgcagag 420gaagcgggag atgatcctga agcggaagga ggaggaggcc
ttgaaggaca gtctgcggcc 480caagctgtct gaggagcagc agcgcatcat tgccatactg
ctggacgccc accataagac 540ctacgacccc acctactccg acttctgcca gttccggcct
ccagttcgtg tgaatgatgg 600tggagggagc catccttcca ggcccaactc cagacacact
cccagcttct ctggggactc 660ctcctcctcc tgctcagatc actgtatcac ctcttcagac
atgatggact cgtccagctt 720ctccaatctg gatctgagtg aagaagattc agatgaccct
tctgtgaccc tagagctgtc 780ccagctctcc atgctgcccc acctggctga cctggtcagt
tacagcatcc aaaaggtcat 840tggctttgct aagatgatac caggattcag agacctcacc
tctgaggacc agatcgtact 900gctgaagtca agtgccattg aggtcatcat gttgcgctcc
aatgagtcct tcaccatgga 960cgacatgtcc tggacctgtg gcaaccaaga ctacaagtac
cgcgtcagtg acgtgaccaa 1020agccggacac agcctggagc tgattgagcc cctcatcaag
ttccaggtgg gactgaagaa 1080gctgaacttg catgaggagg agcatgtcct gctcatggcc
atctgcatcg tctccccaga 1140tcgtcctggg gtgcaggacg ccgcgctgat tgaggccatc
caggaccgcc tgtccaacac 1200actgcagacg tacatccgct gccgccaccc gcccccgggc
agccacctgc tctatgccaa 1260gatgatccag aagctagccg acctgcgcag cctcaatgag
gagcactcca agcagtaccg 1320ctgcctctcc ttccagcctg agtgcagcat gaagctaacg
ccccttgtgc tcgaagtgtt 1380tggcaatgag atctcctgac taggacagcc tgtgcggtgc
ctgggtgggg ctgctcctcc 1440agggccacgt gccaggcccg gggctggcgg ctactcagca
gccctcctca cccgtctggg 1500gttcagcccc tcctctgcca cctcccctat ccacccagcc
cattctctct cctgtccaac 1560ctaacccctt tcctgcgggc ttttccccgg tcccttgaga
cctcagccat gaggagttgc 1620tgtttgtttg acaaagaaac ccaagtgggg gcagagggca
gaggctggag gcaggccttg 1680cccagagatg cctccaccgc tgcctaagtg gctgctgact
gatgttgagg gaacagacag 1740gagaaatgca tccattcctc agggacagag acacctgcac
ctccccccac tgcaggcccc 1800gcttgtccag cgcctagtgg ggtctccctc tcctgcctta
ctcacgataa ataatcggcc 1860cacagctccc accccacccc cttcagtgcc caccaacatc
ccattgccct ggttatattc 1920tcacgggcag tagctgtggt gaggtgggtt ttcttcccat
cactggagca ccaggcacga 1980acccacctgc tgagagaccc aaggaggaaa aacagacaaa
aacagcctca cagaagaata 2040tgacagctgt ccctgtcacc aagctcacag ttcctcgccc
tgggtctaag gggttggttg 2100aggtggaagc cctccttcca cggatccatg tagcaggact
gaattgtccc cagtttgcag 2160aaaagcacct gccgacctcg tcctccccct gccagtgcct
tacctcctgc ccaggagagc 2220cagccctccc tgtcctcctc ggatcaccga gagtagccga
gagcctgctc ccccaccccc 2280tccccagggg agagggtctg gagaagcagt gagccgcatc
ttctccatct ggcagggtgg 2340gatggaggag aagaattttc agaccccagc ggctgagtca
tgatctccct gccgcctcaa 2400tgtggttgca aggccgctgt tcaccacagg gctaagagct
aggctgccgc accccagagt 2460gtgggaaggg agagcggggc agtctcgggt ggctagtcag
agagagtgtt tgggggttcc 2520gtgatgtagg gtaaggtgcc ttcttattct cactccacca
cccaaaagtc aaaaggtgcc 2580tgtgaggcag gggcggagtg atacaacttc aagtgcatgc
tctctgcagg tcgagcccag 2640cccagctggt gggaagcgtc tgtccgttta ctccaaggtg
ggtctttgtg agagtgagct 2700gtaggtgtgc gggaccggta cagaaaggcg ttcttcgagg
tggatcacag aggcttcttc 2760agatcaatgc ttgagtttgg aatcggccgc attccctgag
tcaccaggaa tgttaaagtc 2820agtgggaacg tgactgcccc aactcctgga agctgtgtcc
ttgcacctgc atccgtagtt 2880ccctgaaaac ccagagagga atcagacttc acactgcaag
agccttggtg tccacctggc 2940cccatgtctc tcagaattct tcaggtggaa aaacatctga
aagccacgtt ccttactgca 3000gaatagcata tatatcgctt aatcttaaat ttattagata
tgagttgttt tcagactcag 3060actccatttg tattatagtc taatatacag ggtagcaggt
accactgatt tggagatatt 3120tatgggggga gaacttacat tgtgaaactt ctgtacatta
attattattg ctgttgttat 3180tttacaaggg tctagggaga gacccttgtt tgattttagc
tgcagaactg tattggtcca 3240gcttgctctt cagtgggaga aaaacacttg taagttgcta
aacgagtcaa tcccctcatt 3300caggaaaact gacagaggag ggcgtgactc acccaagcca
tatataacta gctagaagtg 3360ggccaggaca ggccgggcgc ggtggctcac gcctgtaatc
ccagcagttt gggaggtcga 3420ggtaggtgga tcacctgagg tcgggagttc gagaccaacc
tgaccaacat ggagaaaccc 3480tgtctctatt aaaaatacaa aaaaaaaaaa aaaaaaaaat
agccgggcat ggtggcgcaa 3540gcctgtaatc ccagctactc aggaggctga ggcagaagaa
ttgaacccag gaggtggagg 3600ttgcagtgag ctgagatcgt gccgttactc tccaacctgg
acaacaagag cgaaactccg 3660tcttagaagt ggaccaggac aggaccagat tttggagtca
tggtccggtg tccttttcac 3720tacaccatgt ttgagctcag acccccactc tcattcccca
ggtggctgac ccagtccctg 3780ggggaagccc tggatttcag aaagagccaa gtctggatct
gggacccttt ccttccttcc 3840ctggcttgta actccaccaa gcccatcaga aggagaagga
aggagactca cctctgcctc 3900aatgtgaatc agaccctacc ccaccacgat gtgccctggc
tgctgggctc tccacctcag 3960gccttggata atgctgttgc ctcatctata acatgcattt
gtctttgtaa tgtcaccacc 4020ttcccagctc tccctctggc cctgcttctt cggggaactc
ctgaaatatc agttactcag 4080ccctgggccc caccacctag gccactcctc caaaggaagt
ctaggagctg ggaggaaaag 4140aaaagagggg aaaatgagtt tttatggggc tgaacgggga
gaaaaggtca tcatcgattc 4200tactttagaa tgagagtgtg aaatagacat ttgtaaatgt
aaaactttta aggtatatca 4260ttataactga aggagaaggt gccccaaaat gcaagatttt
ccacaagatt cccagagaca 4320ggaaaatcct ctggctggct aactggaagc atgtaggaga
atccaagcga ggtcaacaga 4380gaaggcagga atgtgtggca gatttagtga aagctagaga
tatggcagcg aaaggatgta 4440aacagtgcct gctgaatgat ttccaaagag aaaaaaagtt
tgccagaagt ttgtcaagtc 4500aaccaatgta gaaagctttg cttatggtaa taaaaatggc
tcatacttat atagcactta 4560ctttgtttgc aagtactgct gtaaataaat gctttatgca
aacc 4604
329 2076 DNA Homo sapiens
329 cggggaaggg
gagggaggag ggggacgagg gctctggcgg gtttggaggg gctgaacatc 60gcggggtgtt
ctggtgtccc ccgccccgcc tctccaaaaa gctacaccga cgcggaccgc 120ggcggcgtcc
tccctcgccc tcgcttcacc tcgcgggctc cgaatgcggg gagctcggat 180gtccggtttc
ctgtgaggct tttacctgac acccgccgcc tttccccggc actggctggg 240agggcgccct
gcaaagttgg gaacgcggag ccccggaccc gctcccgccg cctccggctc 300gcccaggggg
ggtcgccggg aggagcccgg gggagaggga ccaggagggg cccgcggcct 360cgcaggggcg
cccgcgcccc cacccctgcc cccgccagcg gaccggtccc ccacccccgg 420tccttccacc
atgcacttgc tgggcttctt ctctgtggcg tgttctctgc tcgccgctgc 480gctgctcccg
ggtcctcgcg aggcgcccgc cgccgccgcc gccttcgagt ccggactcga 540cctctcggac
gcggagcccg acgcgggcga ggccacggct tatgcaagca aagatctgga 600ggagcagtta
cggtctgtgt ccagtgtaga tgaactcatg actgtactct acccagaata 660ttggaaaatg
tacaagtgtc agctaaggaa aggaggctgg caacataaca gagaacaggc 720caacctcaac
tcaaggacag aagagactat aaaatttgct gcagcacatt ataatacaga 780gatcttgaaa
agtattgata atgagtggag aaagactcaa tgcatgccac gggaggtgtg 840tatagatgtg
gggaaggagt ttggagtcgc gacaaacacc ttctttaaac ctccatgtgt 900gtccgtctac
agatgtgggg gttgctgcaa tagtgagggg ctgcagtgca tgaacaccag 960cacgagctac
ctcagcaaga cgttatttga aattacagtg cctctctctc aaggccccaa 1020accagtaaca
atcagttttg ccaatcacac ttcctgccga tgcatgtcta aactggatgt 1080ttacagacaa
gttcattcca ttattagacg ttccctgcca gcaacactac cacagtgtca 1140ggcagcgaac
aagacctgcc ccaccaatta catgtggaat aatcacatct gcagatgcct 1200ggctcaggaa
gattttatgt tttcctcgga tgctggagat gactcaacag atggattcca 1260tgacatctgt
ggaccaaaca aggagctgga tgaagagacc tgtcagtgtg tctgcagagc 1320ggggcttcgg
cctgccagct gtggacccca caaagaacta gacagaaact catgccagtg 1380tgtctgtaaa
aacaaactct tccccagcca atgtggggcc aaccgagaat ttgatgaaaa 1440cacatgccag
tgtgtatgta aaagaacctg ccccagaaat caacccctaa atcctggaaa 1500atgtgcctgt
gaatgtacag aaagtccaca gaaatgcttg ttaaaaggaa agaagttcca 1560ccaccaaaca
tgcagctgtt acagacggcc atgtacgaac cgccagaagg cttgtgagcc 1620aggattttca
tatagtgaag aagtgtgtcg ttgtgtccct tcatattgga aaagaccaca 1680aatgagctaa
gattgtactg ttttccagtt catcgatttt ctattatgga aaactgtgtt 1740gccacagtag
aactgtctgt gaacagagag acccttgtgg gtccatgcta acaaagacaa 1800aagtctgtct
ttcctgaacc atgtggataa ctttacagaa atggactgga gctcatctgc 1860aaaaggcctc
ttgtaaagac tggttttctg ccaatgacca aacagccaag attttcctct 1920tgtgatttct
ttaaaagaat gactatataa tttatttcca ctaaaaatat tgtttctgca 1980ttcattttta
tagcaacaac aattggtaaa actcactgtg atcaatattt ttatatcatg 2040caaaatatgt
ttaaaataaa atgaaaattg tattat
2076
330 2819 DNA Homo sapiens
330 ctgggcccag ctcccccgag aggtggtcgg atcctctggg
ctgctcggtc gatgcctgtg 60ccactgacgt ccaggcatga ggtggttcct gccctggacg
ctggcagcag tgacagcagc 120agccgccagc accgtcctgg ccacggccct ctctccagcc
cctacgacca tggactttac 180cccagctcca ctggaggaca cctcctcacg cccccaattc
tgcaagtggc catgtgagtg 240cccgccatcc ccaccccgct gcccgctggg ggtcagcctc
atcacagatg gctgtgagtg 300ctgtaagatg tgcgctcagc agcttgggga caactgcacg
gaggctgcca tctgtgaccc 360ccaccggggc ctctactgtg actacagcgg ggaccgcccg
aggtacgcaa taggagtgtg 420tgcacaggtg gtcggtgtgg gctgcgtcct ggatggggtg
cgctacaaca acggccagtc 480cttccagcct aactgcaagt acaactgcac gtgcatcgac
ggcgcggtgg gctgcacacc 540actgtgcctc cgagtgcgcc ccccgcgtct ctggtgcccc
cacccgcggc gcgtgagcat 600acctggccac tgctgtgagc agtgggtatg tgaggacgac
gccaagaggc cacgcaagac 660cgcaccccgt gacacaggag ccttcgatgc tgtgggtgag
gtggaggcat ggcacaggaa 720ctgcatagcc tacacaagcc cctggagccc ttgctccacc
agctgcggcc tgggggtctc 780cactcggatc tccaatgtta acgcccagtg ctggcctgag
caagagagcc gcctctgcaa 840cttgcggcca tgcgatgtgg acatccatac actcattaag
gcagggaaga agtgtctggc 900tgtgtaccag ccagaggcat ccatgaactt cacacttgcg
ggctgcatca gcacacgctc 960ctatcaaccc aagtactgtg gagtttgcat ggacaatagg
tgctgcatcc cctacaagtc 1020taagactatc gacgtgtcct tccagtgtcc tgatgggctt
ggcttctccc gccaggtcct 1080atggattaat gcctgcttct gtaacctgag ctgtaggaat
cccaatgaca tctttgctga 1140cttggaatcc taccctgact tctcagaaat tgccaactag
gcaggcacaa atcttgggtc 1200ttggggacta acccaatgcc tgtgaagcag tcagccctta
tggccaataa cttttcacca 1260atgagcctta gttaccctga tctggaccct tggcctccat
ttctgtctct aaccattcaa 1320atgacgcctg atggtgctgc tcaggcccat gctatgagtt
ttctccttga tatcattcag 1380catctactct aaagaaaaat gcctgtctct agctgttctg
gactacaccc aagcctgatc 1440cagcctttcc aagtcactag aagtcctgct ggatcttgcc
taaatcccaa gaaatggaat 1500caggtagact tttaatatca ctaatttctt ctttagatgc
caaaccacaa gactctttgg 1560gtccattcag atgaatagat ggaatttgga acaatagaat
aatctattat ttggagcctg 1620ccaagaggta ctgtaatggg taattctgac gtcagcgcac
caaaactatc ctgattccaa 1680atatgtatgc acctcaaggt catcaaacat ttgccaagtg
agttgaatag ttgcttaatt 1740ttgattttta atggaaagtt gtatccatta acctgggcat
tgttgaggtt aagtttctct 1800tcacccctac actgtgaagg gtacagatta ggtttgtccc
agtcagaaat aaaatttgat 1860aaacattcct gttgatggga aaagccccca gttaatactc
cagagacagg gaaaggtcag 1920cccgtttcag aaggaccaat tgactctcac actgaatcag
ctgctgactg gcagggcttt 1980gggcagttgg ccaggctctt ccttgaatct tctcccttgt
cctgcttggg gttcatagga 2040attggtaagg cctctggact ggcctgtctg gcccctgaga
gtggtgccct ggaacactcc 2100tctactctta cagagccttg agagacccag ctgcagacca
tgccagaccc actgaaatga 2160ccaagacagg ttcaggtagg ggtgtgggtc aaaccaagaa
gtgggtgccc ttggtagcag 2220cctggggtga cctctagagc tggaggctgt gggactccag
gggcccccgt gttcaggaca 2280catctattgc agagactcat ttcacagcct ttcgttctgc
tgaccaaatg gccagttttc 2340tggtaggaag atggaggttt accggttgtt tagaaacaga
aatagactta ataaaggttt 2400aaagctgaag aggttgaagc taaaaggaaa aggttgttgt
taatgaatat caggctatta 2460tttattgtat taggaaaata taatatttac tgttagaatt
cttttattta gggccttttc 2520tgtgccagac attgctctca gtgctttgca tgtattagct
cactgaatct tcacgacaat 2580gttgagaagt tcccattatt atttctgttc ttacaaatgt
gaaacggaag ctcatagagg 2640tgagaaaact caaccagagt cacccagttg gtgactggga
aagttaggat tcagatcgaa 2700attggactgt ctttataacc catattttcc ccctgttttt
agagcttcca aatgtgtcag 2760aataggaaaa cattgcaata aatggcttga ttttttaaaa
aaaaaaaaaa aaaaaaaaa 2819
331 2540 DNA Homo sapiens
331 gaaaaggtgg
acaagtccta ttttcaagag aagatgactt ttaacagttt tgaaggatct 60aaaacttgtg
tacctgcaga catcaataag gaagaagaat ttgtagaaga gtttaataga 120ttaaaaactt
ttgctaattt tccaagtggt agtcctgttt cagcatcaac actggcacga 180gcagggtttc
tttatactgg tgaaggagat accgtgcggt gctttagttg tcatgcagct 240gtagatagat
ggcaatatgg agactcagca gttggaagac acaggaaagt atccccaaat 300tgcagattta
tcaacggctt ttatcttgaa aatagtgcca cgcagtctac aaattctggt 360atccagaatg
gtcagtacaa agttgaaaac tatctgggaa gcagagatca ttttgcctta 420gacaggccat
ctgagacaca tgcagactat cttttgagaa ctgggcaggt tgtagatata 480tcagacacca
tatacccgag gaaccctgcc atgtattgtg aagaagctag attaaagtcc 540tttcagaact
ggccagacta tgctcaccta accccaagag agttagcaag tgctggactc 600tactacacag
gtattggtga ccaagtgcag tgcttttgtt gtggtggaaa actgaaaaat 660tgggaacctt
gtgatcgtgc ctggtcagaa cacaggcgac actttcctaa ttgcttcttt 720gttttgggcc
ggaatcttaa tattcgaagt gaatctgatg ctgtgagttc tgataggaat 780ttcccaaatt
caacaaatct tccaagaaat ccatccatgg cagattatga agcacggatc 840tttacttttg
ggacatggat atactcagtt aacaaggagc agcttgcaag agctggattt 900tatgctttag
gtgaaggtga taaagtaaag tgctttcact gtggaggagg gctaactgat 960tggaagccca
gtgaagaccc ttgggaacaa catgctaaat ggtatccagg gtgcaaatat 1020ctgttagaac
agaagggaca agaatatata aacaatattc atttaactca ttcacttgag 1080gagtgtctgg
taagaactac tgagaaaaca ccatcactaa ctagaagaat tgatgatacc 1140atcttccaaa
atcctatggt acaagaagct atacgaatgg ggttcagttt caaggacatt 1200aagaaaataa
tggaggaaaa aattcagata tctgggagca actataaatc acttgaggtt 1260ctggttgcag
atctagtgaa tgctcagaaa gacagtatgc aagatgagtc aagtcagact 1320tcattacaga
aagagattag tactgaagag cagctaaggc gcctgcaaga ggagaagctt 1380tgcaaaatct
gtatggatag aaatattgct atcgtttttg ttccttgtgg acatctagtc 1440acttgtaaac
aatgtgctga agcagttgac aagtgtccca tgtgctacac agtcattact 1500ttcaagcaaa
aaatttttat gtcttaatct aactctatag taggcatgtt atgttgttct 1560tattaccctg
attgaatgtg tgatgtgaac tgactttaag taatcaggat tgaattccat 1620tagcatttgc
taccaagtag gaaaaaaaat gtacatggca gtgttttagt tggcaatata 1680atctttgaat
ttcttgattt ttcagggtat tagctgtatt atccattttt tttactgtta 1740tttaattgaa
accatagact aagaataaga agcatcatac tataactgaa cacaatgtgt 1800attcatagta
tactgattta atttctaagt gtaagtgaat taatcatctg gattttttat 1860tcttttcaga
taggcttaac aaatggagct ttctgtatat aaatgtggag attagagtta 1920atctccccaa
tcacataatt tgttttgtgt gaaaaaggaa taaattgttc catgctggtg 1980gaaagataga
gattgttttt agaggttggt tgttgtgttt taggattctg tccattttct 2040tgtaaaggga
taaacacgga cgtgtgcgaa atatgtttgt aaagtgattt gccattgttg 2100aaagcgtatt
taatgataga atactatcga gccaacatgt actgacatgg aaagatgtca 2160gagatatgtt
aagtgtaaaa tgcaagtggc gggacactat gtatagtctg agccagatca 2220aagtatgtat
gttgttaata tgcatagaac gagagatttg gaaagatata caccaaactg 2280ttaaatgtgg
tttctcttcg gggagggggg gattggggga ggggccccag aggggtttta 2340gaggggcctt
ttcactttcg acttttttca ttttgttctg ttcggatttt ttataagtat 2400gtagaccccg
aagggtttta tgggaactaa catcagtaac ctaacccccg tgactatcct 2460gtgctcttcc
tagggagctg tgttgtttcc cacccaccac ccttccctct gaacaaatgc 2520ctgagtgctg
gggcactttg
2540
332 1474 DNA Homo sapiens
332 aaaaagaaat caagaatgca attttattta caatagtcac
gccggaaata cctagaaata 60aatttaactg aggatgtaaa agacctctac aaggagagtt
caatgcgtag cgggagcgga 120gagctgaccc cagagagccc tgggcagccc cacctccgcc
gccggcctag ttaccatcac 180accccggaga gcccgcagct gccgcagccg gccccagtca
ccatcaccgc aaccatgagc 240agcgaggccg agacccagca gccgcccgcc gccccccccg
ccgcccccgc cctcagcgcc 300gccgacacca agcccggcac taccggagcg gcgcagggag
cggtggcccg ggcggctcac 360atcggcggcg ctggcgcggg cgacaagaag gtcatcgcaa
cgaaggtttt gggaacagta 420aaatggttca atgtaaggaa cggatatggt ttcatcaaca
ggaatgacac caaggaagat 480gtatttgtac accagactgc cataaagaag aataacccca
ggaagtacct tcgcagtgta 540ggagatggag agactgtgga gtttgatgtt gttgaaggag
aaaagggtgc ggaggcagca 600aatgttacag gtcctggtgg tgttccagtt caaggcagta
aatatgcagc agaccgtaac 660cattatagac gctatccacg tcgtaggggt cctccacgca
attaccagca aaattaccag 720aatagtgaga gtggggaaaa gaacgaggga tcggagagtg
ctcccgaagc caggcccaac 780aacgccggcc ctacgcaggc gaaggttccc accttactac
atgcggagac ctatgggcgt 840cgaccacagt attccaaccc tcctgtgcag ggagaagtga
tggagggtgc tgacaaccag 900ggtgcaggag aacaaggtag accagtgagg cagatatgta
tcggggatat agaccacgat 960tccgcagggg ccctcctcgc caaaagacag cctagagagg
acggcaatga agaagataaa 1020gaaaatcaag gagatgagac ccaaggtcag cagccacctc
aagctcggta ccgccgcaac 1080ttcaattacc gacgcagacg cccagaaaac cctaaaccac
aagatggcaa agagacaaaa 1140gcagccgatc caccagctga gaattcgtcc gctcccgagg
ctgagcaggg cggggctgag 1200taaatgccgg cttaccatct ctaccatcat ccggtttagt
catccaacaa gaagaaatat 1260gaaattccag caataagaaa tgaacaaaag attggagctg
aagacctaaa gtgcttgctt 1320tttgcccgtt gaccagataa atagaactat ctgcattatc
tatgcagcat ggggttttta 1380ttatgtttta cctaaagacg tctctttttg gtaataacaa
accgtgtttt ttaaaaaagc 1440ctggtttttc tcaatacgcc tttaaaggaa ttcc
1474
333 4079 DNA Homo sapiens
333 ggagcggcgg gcgggcggga
gggctggcgg ggcgaacgtc tgggagacgt ctgaaagacc 60aacgagactt tggagaccag
agacgcgcct ggggggacct ggggcttggg gcgtgcgaga 120tttcccttgc attcgctggg
agctcgcgca gggatcgtcc catggccggg gctcggagcc 180gcgacccttg gggggcctcc
gggatttgct acctttttgg ctccctgctc gtcgaactgc 240tcttctcacg ggctgtcgcc
ttcaatctgg acgtgatggg tgccttgcgc aaggagggcg 300agccaggcag cctcttcggc
ttctctgtgg ccctgcaccg gcagttgcag ccccgacccc 360agagctggct gctggtgggt
gctccccagg ccctggctct tcctgggcag caggcgaatc 420gcactggagg cctcttcgct
tgcccgttga gcctggagga gactgactgc tacagagtgg 480acatcgacca gggagctgat
atgcaaaagg aaagcaagga gaaccagtgg ttgggagtca 540gtgttcggag ccaggggcct
gggggcaaga ttgttacctg tgcacaccga tatgaggcaa 600ggcagcgagt ggaccagatc
ctggagacgc gggatatgat tggtcgctgc tttgtgctca 660gccaggacct ggccatccgg
gatgagttgg atggtgggga atggaagttc tgtgagggac 720gcccccaagg ccatgaacaa
tttgggttct gccagcaggg cacagctgcc gccttctccc 780ctgatagcca ctacctcctc
tttggggccc caggaaccta taattggaag gggttgcttt 840ttgtgaccaa cattgatagc
tcagaccccg accagctggt gtataaaact ttggaccctg 900ctgaccggct cccaggacca
gccggagact tggccctcaa tagctactta ggcttctcta 960ttgactcggg gaaaggtctg
gtgcgtgcag aagagctgag ctttgtggct ggagcccccc 1020gcgccaacca caagggtgct
gtggttatcc tgcgcaagga cagcgccagt cgcctggtgc 1080ccgaggttat gctgtctggg
gagcgcctga cctccggctt tggctactca ctggctgtgg 1140ctgacctcaa cagtgatggc
tggccagacc tgatagtggg tgccccctac ttctttgagc 1200gccaagaaga gctggggggt
gctgtgtatg tgtacttgaa ccaggggggt cactgggctg 1260ggatctcccc tctccggctc
tgcggctccc ctgactccat gttcgggatc agcctggctg 1320tcctggggga cctcaaccaa
gatggctttc cagatattgc agtgggtgcc ccctttgatg 1380gtgatgggaa agtcttcatc
taccatggga gcagcctggg ggttgtcgcc aaaccttcac 1440aggtgctgga gggcgaggct
gtgggcatca agagcttcgg ctactccctg tcaggcagct 1500tggatatgga tgggaaccaa
taccctgacc tgctggtggg ctccctggct gacaccgcag 1560tgctcttcag ggccagaccc
atcctccatg tctcccatga ggtctctatt gctccacgaa 1620gcatcgacct ggagcagccc
aactgtgctg gcggccactc ggtctgtgtg gacctaaggg 1680tctgtttcag ctacattgca
gtccccagca gctatagccc tactgtggcc ctggactatg 1740tgttagatgc ggacacagac
cggaggctcc ggggccaggt tccccgtgtg acgttcctga 1800gccgtaacct ggaagaaccc
aagcaccagg cctcgggcac cgtgtggctg aagcaccagc 1860atgaccgagt ctgtggagac
gccatgttcc agctccagga aaatgtcaaa gacaagcttc 1920gggccattgt agtgaccttg
tcctacagtc tccagacccc tcggctccgg cgacaggctc 1980ctggccaggg gctgcctcca
gtggccccca tcctcaatgc ccaccagccc agcacccagc 2040gggcagagat ccacttcctg
aagcaaggct gtggtgaaga caagatctgc cagagcaatc 2100tgcagctggt ccacgcccgc
ttctgtaccc gggtcagcga cacggaattc caacctctgc 2160ccatggatgt ggatggaaca
acagccctgt ttgcactgag tgggcagcca gtcattggcc 2220tggagctgat ggtcaccaac
ctgccatcgg acccagccca gccccaggct gatggggatg 2280atgcccatga agcccagctc
ctggtcatgc ttcctgactc actgcactac tcaggggtcc 2340gggccctgga ccctgcggag
aagccactct gcctgtccaa tgagaatgcc tcccatgttg 2400agtgtgagct ggggaacccc
atgaagagag gtgcccaggt caccttctac ctcatcctta 2460gcacctccgg gatcagcatt
gagaccacgg aactggaggt agagctgctg ttggccacga 2520tcagtgagca ggagctgcat
ccagtctctg cacgagcccg tgtcttcatt gagctgccac 2580tgtccattgc aggaatggcc
attccccagc aactcttctt ctctggtgtg gtgaggggcg 2640agagagccat gcagtctgag
cgggatgtgg gcagcaaggt caagtatgag gtcacggttt 2700ccaaccaagg ccagtcgctc
agaaccctgg gctctgcctt cctcaacatc atgtggcctc 2760atgagattgc caatgggaag
tggttgctgt acccaatgca ggttgagctg gagggcgggc 2820aggggcctgg gcagaaaggg
ctttgctctc ccaggcccaa catcctccac ctggatgtgg 2880acagtaggga taggaggcgg
cgggagctgg agccacctga gcagcaggag cctggtgagc 2940ggcaggagcc cagcatgtcc
tggtggccag tgtcctctgc tgagaagaag aaaaacatca 3000ccctggactg cgcccggggc
acggccaact gtgtggtgtt cagctgccca ctctacagct 3060ttgaccgcgc ggctgtgctg
catgtctggg gccgtctctg gaacagcacc tttctggagg 3120agtactcagc tgtgaagtcc
ctggaagtga ttgtccgggc caacatcaca gtgaagtcct 3180ccataaagaa cttgatgctc
cgagatgcct ccacagtgat cccagtgatg gtatacttgg 3240accccatggc tgtggtggca
gaaggagtgc cctggtgggt catcctcctg gctgtactgg 3300ctgggctgct ggtgctagca
ctgctggtgc tgctcctgtg gaagatggga ttcttcaaac 3360gggcgaagca ccccgaggcc
accgtgcccc agtaccatgc ggtgaagatt cctcgggaag 3420accgacagca gttcaaggag
gagaagacgg gcaccatcct gaggaacaac tggggcagcc 3480cccggcggga gggcccggat
gcacacccca tcctggctgc tgacgggcat cccgagctgg 3540gccccgatgg gcatccaggg
ccaggcaccg cctaggttcc catgtcccag cctggcctgt 3600ggctgccctc catcccttcc
ccagagatgg ctccttggga tgaagagggt agagtgggct 3660gctggtgtcg catcaagatt
tggcaggatc ggcttcctca ggggcacaga cctctcccac 3720ccacaagaac tcctcccacc
caacttcccc ttagagtgct gtgagatgag agtgggtaaa 3780tcagggacag ggccatgggg
tagggtgaga agggcagggg tgtcctgatg caaaggtggg 3840gagaagggat cctaatccct
tcctctccca ttcaccctgt gtaacaggac cccaaggacc 3900tgcctccccg gaagtgcctt
aacctagagg gtcggggagg aggttgtgtc actgactcag 3960gctgctcctt ctctagtttc
ccctctcatc tgaccttagt ttgctgccat cagtctagtg 4020gtttcgtggt ttcgtctatt
tattaaaaaa tatttgagaa caaaaaaaaa aaaaaaaaa 4079
334 3373 DNA Homo sapiens
334 ggtggcaact tctcctcctg cggccgggag cggcctgcct gcctccctgc gcacccgcag
60cctcccccgc tgcctcccta gggctcccct ccggccgcca gcgcccattt ttcattccct
120agatagagat actttgcgcg cacacacata catacgcgcg caaaaaggaa aaaaaaaaaa
180aaaagcccac cctccagcct cgctgcaaag agaaaaccgg agcagccgca gctcgcagct
240cgcagctcgc agcccgcagc ccgcagagga cgcccagagc ggcgagcagg cgggcagacg
300gaccgacgga ctcgcgccgc gtccacctgt cggccgggcc cagccgagcg cgcagcgggc
360acgccgcgcg cgcggagcag ccgtgcccgc cgcccgggcc cgccgccagg gcgcacacgc
420tcccgccccc ctacccggcc cgggcgggag tttgcacctc tccctgcccg ggtgctcgag
480ctgccgttgc aaagccaact ttggaaaaag ttttttgggg gagacttggg ccttgaggtg
540cccagctccg cgctttccga ttttgggggc ctttccagaa aatgttgcaa aaaagctaag
600ccggcgggca gaggaaaacg cctgtagccg gcgagtgaag acgaaccatc gactgccgtg
660ttccttttcc tcttggaggt tggagtcccc tgggcgcccc cacacggcta gacgcctcgg
720ctggttcgcg acgcagcccc ccggccgtgg atgctgcact cgggctcggg atccgcccag
780gtagccggcc tcggacccag gtcctgcgcc caggtcctcc cctgcccccc agcgacggag
840ccggggccgg gggcggcggc gccgggggca tgcgggtgag ccgcggctgc agaggcctga
900gcgcctgatc gccgcggacc tgagccgagc ccacccccct ccccagcccc ccaccctggc
960cgcgggggcg gcgcgctcga tctacgcgtc cggggccccg cggggccggg cccggagtcg
1020gcatgaatcg ctgctgggcg ctcttcctgt ctctctgctg ctacctgcgt ctggtcagcg
1080ccgaggggga ccccattccc gaggagcttt atgagatgct gagtgaccac tcgatccgct
1140cctttgatga tctccaacgc ctgctgcacg gagaccccgg agaggaagat ggggccgagt
1200tggacctgaa catgacccgc tcccactctg gaggcgagct ggagagcttg gctcgtggaa
1260gaaggagcct gggttccctg accattgctg agccggccat gatcgccgag tgcaagacgc
1320gcaccgaggt gttcgagatc tcccggcgcc tcatagaccg caccaacgcc aacttcctgg
1380tgtggccgcc ctgtgtggag gtgcagcgct gctccggctg ctgcaacaac cgcaacgtgc
1440agtgccgccc cacccaggtg cagctgcgac ctgtccaggt gagaaagatc gagattgtgc
1500ggaagaagcc aatctttaag aaggccacgg tgacgctgga agaccacctg gcatgcaagt
1560gtgagacagt ggcagctgca cggcctgtga cccgaagccc ggggggttcc caggagcagc
1620gagccaaaac gccccaaact cgggtgacca ttcggacggt gcgagtccgc cggcccccca
1680agggcaagca ccggaaattc aagcacacgc atgacaagac ggcactgaag gagacccttg
1740gagcctaggg gcatcggcag gagagtgtgt gggcagggtt atttaatatg gtatttgctg
1800tattgccccc atggggtcct tggagtgata atattgtttc cctcgtccgt ctgtctcgat
1860gcctgattcg gacggccaat ggtgcttccc ccacccctcc acgtgtccgt ccacccttcc
1920atcagcgggt ctcctcccag cggcctccgg tcttgcccag cagctcaaag aagaaaaaga
1980aggactgaac tccatcgcca tcttcttccc ttaactccaa gaacttggga taagagtgtg
2040agagagactg atggggtcgc tctttggggg aaacgggttc cttcccctgc acctggcctg
2100ggccacacct gagcgctgtg gactgtcctg aggagccctg aggacctctc agcatagcct
2160gcctgatccc tgaacccctg gccagctctg aggggaggca cctccaggca ggccaggctg
2220cctcggactc catggctaag accacagacg ggcacacaga ctggagaaaa cccctcccac
2280ggtgcccaaa caccagtcac ctcgtctccc tggtgcctct gtgcacagtg gcttcttttc
2340gttttcgttt tgaagacgtg gactcctctt ggtgggtgtg gccagcacac caagtggctg
2400ggtgccctct caggtgggtt agagatggag tttgctgttg aggtggtgta gatggtgacc
2460tgggtatccc ctgcctcctg ccaccccttc ctccccatac tccactctga ttcacctctt
2520cctctggttc ctttcatctc tctacctcca ccctgcattt tcctcttgtc ctggcccttc
2580agtctgctcc accaaggggc tcttgaaccc cttattaagg ccccagatga ccccagtcac
2640tcctctctag ggcagaagac tagaggccag ggcagcaagg gacctgctca tcatattcca
2700acccagccac gactgccatg taaggttgtg cagggtgtgt actgcacaag gacattgtat
2760gcagggagca ctgttcacat catagataaa gctgatttgt atatttatta tgacaatttc
2820tggcagatgt aggtaaagag gaaaaggatc cttttcctaa ttcacacaaa gactccttgt
2880ggactggctg tgcccctgat gcagcctgtg gctggagtgg ccaaatagga gggagactgt
2940ggtaggggca gggaggcaac actgctgtcc acatgacctc catttcccaa agtcctctgc
3000tccagcaact gcccttccag gtgggtgtgg gacacctggg agaaggtctc caagggaggg
3060tgcagccctc ttgcccgcac ccctccctgc ttgcacactt ccccatcttt gatccttctg
3120agctccacct ctggtggctc ctcctaggaa accagctcgt gggctgggaa tgggggagag
3180aagggaaaag atccccaaga ccccctgggg tgggatctga gctcccacct cccttcccac
3240ctactgcact ttcccccttc ccgccttcca aaacctgctt ccttcagttt gtaaagtcgg
3300tgattatatt tttgggggct ttccttttat tttttaaatg taaaatttat ttatattccg
3360tatttaaagt tgt
3373
335 2304 DNA Homo sapiens
335 gtccccgcag cgccgtcgcg ccctcctgcc gcaggccacc
gaggccgccg ccgtctagcg 60ccccgacctc gccaccatga gagccctgct ggcgcgcctg
cttctctgcg tcctggtcgt 120gagcgactcc aaaggcagca atgaacttca tcaagttcca
tcgaactgtg actgtctaaa 180tggaggaaca tgtgtgtcca acaagtactt ctccaacatt
cactggtgca actgcccaaa 240gaaattcgga gggcagcact gtgaaataga taagtcaaaa
acctgctatg aggggaatgg 300tcacttttac cgaggaaagg ccagcactga caccatgggc
cggccctgcc tgccctggaa 360ctctgccact gtccttcagc aaacgtacca tgcccacaga
tctgatgctc ttcagctggg 420cctggggaaa cataattact gcaggaaccc agacaaccgg
aggcgaccct ggtgctatgt 480gcaggtgggc ctaaagccgc ttgtccaaga gtgcatggtg
catgactgcg cagatggaaa 540aaagccctcc tctcctccag aagaattaaa atttcagtgt
ggccaaaaga ctctgaggcc 600ccgctttaag attattgggg gagaattcac caccatcgag
aaccagccct ggtttgcggc 660catctacagg aggcaccggg ggggctctgt cacctacgtg
tgtggaggca gcctcatcag 720cccttgctgg gtgatcagcg ccacacactg cttcattgat
tacccaaaga aggaggacta 780catcgtctac ctgggtcgct caaggcttaa ctccaacacg
caaggggaga tgaagtttga 840ggtggaaaac ctcatcctac acaaggacta cagcgctgac
acgcttgctc accacaacga 900cattgccttg ctgaagatcc gttccaagga gggcaggtgt
gcgcagccat cccggactat 960acagaccatc tgcctgccct cgatgtataa cgatccccag
tttggcacaa gctgtgagat 1020cactggcttt ggaaaagaga attctaccga ctatctctat
ccggagcagc tgaaaatgac 1080tgttgtgaag ctgatttccc accgggagtg tcagcagccc
cactactacg gctctgaagt 1140caccaccaaa atgctatgtg ctgctgaccc ccaatggaaa
acagattcct gccagggaga 1200ctcaggggga cccctcgtct gttccctcca aggccgcatg
actttgactg gaattgtgag 1260ctggggccgt ggatgtgccc tgaaggacaa gccaggcgtc
tacacgagag tctcacactt 1320cttaccctgg atccgcagtc acaccaagga agagaatggc
ctggccctct gagggtcccc 1380agggaggaaa cgggcaccac ccgctttctt gctggttgtc
atttttgcag tagagtcatc 1440tccatcagct gtaagaagag actgggaaga taggctctgc
acagatggat ttgcctgtgg 1500caccaccagg gtgaacgaca atagctttac cctcacggat
aggcctgggt gctggctgcc 1560cagaccctct ggccaggatg gaggggtggt cctgactcaa
catgttactg accagcaact 1620tgtctttttc tggactgaag cctgcaggag ttaaaaaggg
cagggcatct cctgtgcatg 1680ggctcgaagg gagagccagc tcccccgacc ggtgggcatt
tgtgaggccc atggttgaga 1740aatgaataat ttcccaatta ggaagtgtaa gcagctgagg
tctcttgagg gagcttagcc 1800aatgtgggag cagcggtttg gggagcagag acactaacga
cttcagggca gggctctgat 1860attccatgaa tgtatcagga aatatatatg tgtgtgtatg
tttgcacact tgttgtgtgg 1920gctgtgagtg taagtgtgag taagagctgg tgtctgattg
ttaagtctaa atatttcctt 1980aaactgtgtg gactgtgatg ccacacagag tggtctttct
ggagaggtta taggtcactc 2040ctggggcctc ttgggtcccc cacgtgacag tgcctgggaa
tgtacttatt ctgcagcatg 2100acctgtgacc agcactgtct cagtttcact ttcacataga
tgtccctttc ttggccagtt 2160atcccttcct tttagcctag ttcatccaat cctcactggg
tggggtgagg accactcctt 2220acactgaata tttatatttc actattttta tttatatttt
tgtaatttta aataaaagtg 2280atcaataaaa tgtgattttt ctga
2304
336 1876 DNA Homo sapiens
336 cgcggccgcg gttcgctgtg
gcgggcgcct gggccgccgg ctgtttaact tcgcttccgc 60tggcccatag tgatctttgc
agtgacccag cagcatcact gtttcttggc gtgtgaagat 120aacccaagga attgaggaag
ttgctgagaa gagtgtgctg gagatgctct aggaaaaaat 180tgaatagtga gacgagttcc
agcgcaaggg tttctggttt gccaagaaga aagtgaacat 240catggatcag aacaacagcc
tgccacctta cgctcagggc ttggcctccc ctcagggtgc 300catgactccc ggaatcccta
tctttagtcc aatgatgcct tatggcactg gactgacccc 360acagcctatt cagaacacca
atagtctgtc tattttggaa gagcaacaaa ggcagcagca 420gcaacaacaa cagcagcagc
agcagcagca gcagcagcaa cagcaacagc agcagcagca 480gcagcagcag cagcagcagc
agcagcagca gcagcagcag caacaggcag tggcagctgc 540agccgttcag cagtcaacgt
cccagcaggc aacacaggga acctcaggcc aggcaccaca 600gctcttccac tcacagactc
tcacaactgc acccttgccg ggcaccactc cactgtatcc 660ctcccccatg actcccatga
cccccatcac tcctgccacg ccagcttcgg agagttctgg 720gattgtaccg cagctgcaaa
atattgtatc cacagtgaat cttggttgta aacttgacct 780aaagaccatt gcacttcgtg
cccgaaacgc cgaatataat cccaagcggt ttgctgcggt 840aatcatgagg ataagagagc
cacgaaccac ggcactgatt ttcagttctg ggaaaatggt 900gtgcacagga gccaagagtg
aagaacagtc cagactggca gcaagaaaat atgctagagt 960tgtacagaag ttgggttttc
cagctaagtt cttggacttc aagattcaga acatggtggg 1020gagctgtgat gtgaagtttc
ctataaggtt agaaggcctt gtgctcaccc accaacaatt 1080tagtagttat gagccagagt
tatttcctgg tttaatctac agaatgatca aacccagaat 1140tgttctcctt atttttgttt
ctggaaaagt tgtattaaca ggtgctaaag tcagagcaga 1200aatttatgaa gcatttgaaa
acatctaccc tattctaaag ggattcagga agacgacgta 1260atggctctca tgtacccttg
cctcccccac ccccttcttt tttttttttt aaacaaatca 1320gtttgttttg gtacctttaa
atggtggtgt tgtgagaaga tggatgttga gttgcagggt 1380gtggcaccag gtgatgccct
tctgtaagtg cccaccgcgg gatgccggga aggggcatta 1440tttgtgcact gagaacaccg
cgcagcgtga ctgtgagttg ctcataccgt gctgctatct 1500gggcagcgct gcccatttat
ttatatgtag attttaaaca ctgctgttga caagttggtt 1560tgagggagaa aactttaagt
gttaaagcca cctctataat tgattggact ttttaatttt 1620aatgtttttc cccatgaacc
acagttttta tatttctacc agaaaagtaa aaatcttttt 1680taaaagtgtt gtttttctaa
tttataactc ctaggggtta tttctgtgcc agacacattc 1740cacctctcca gtattgcagg
acggaatata tgtgttaatg aaaatgaatg gctgtacata 1800tttttttctt tcttcagagt
actctgtaca ataaatgcag tttataaaag tgttaaaaaa 1860aaaaaaaaaa aaaaaa
1876
337 6633 DNA Homo sapiens
337 ttctccccgc cccccagttg ttgtcgaagt ctgggggttg ggactggacc ccctgattgc
60gtaagagcaa aaagcgaagg cgcaatctgg acactgggag attcggagcg cagggagttt
120gagagaaact tttattttga agagaccaag gttgaggggg ggcttatttc ctgacagcta
180tttacttaga gcaaatgatt agttttagaa ggatggacta taacattgaa tcaattacaa
240aacgcggttt ttgagcccat tactgttgga gctacaggga gagaaacagg aggagactgc
300aagagatcat ttgggaaggc cgtgggcacg ctctttactc catgtgtggg acattcattg
360cggaataaca tcggaggaga agtttcccag agctatgggg acttcccatc cggcgttcct
420ggtcttaggc tgtcttctca cagggctgag cctaatcctc tgccagcttt cattaccctc
480tatccttcca aatgaaaatg aaaaggttgt gcagctgaat tcatcctttt ctctgagatg
540ctttggggag agtgaagtga gctggcagta ccccatgtct gaagaagaga gctccgatgt
600ggaaatcaga aatgaagaaa acaacagcgg cctttttgtg acggtcttgg aagtgagcag
660tgcctcggcg gcccacacag ggttgtacac ttgctattac aaccacactc agacagaaga
720gaatgagctt gaaggcaggc acatttacat ctatgtgcca gacccagatg tagcctttgt
780acctctagga atgacggatt atttagtcat cgtggaggat gatgattctg ccattatacc
840ttgtcgcaca actgatcccg agactcctgt aaccttacac aacagtgagg gggtggtacc
900tgcctcctac gacagcagac agggctttaa tgggaccttc actgtagggc cctatatctg
960tgaggccacc gtcaaaggaa agaagttcca gaccatccca tttaatgttt atgctttaaa
1020agcaacatca gagctggatc tagaaatgga agctcttaaa accgtgtata agtcagggga
1080aacgattgtg gtcacctgtg ctgtttttaa caatgaggtg gttgaccttc aatggactta
1140ccctggagaa gtgaaaggca aaggcatcac aatgctggaa gaaatcaaag tcccatccat
1200caaattggtg tacactttga cggtccccga ggccacggtg aaagacagtg gagattacga
1260atgtgctgcc cgccaggcta ccagggaggt caaagaaatg aagaaagtca ctatttctgt
1320ccatgagaaa ggtttcattg aaatcaaacc caccttcagc cagttggaag ctgtcaacct
1380gcatgaagtc aaacattttg ttgtagaggt gcgggcctac ccacctccca ggatatcctg
1440gctgaaaaac aatctgactc tgattgaaaa tctcactgag atcaccactg atgtggaaaa
1500gattcaggaa ataaggtatc gaagcaaatt aaagctgatc cgtgctaagg aagaagacag
1560tggccattat actattgtag ctcaaaatga agatgctgtg aagagctata cttttgaact
1620gttaactcaa gttccttcat ccattctgga cttggtcgat gatcaccatg gctcaactgg
1680gggacagacg gtgaggtgca cagctgaagg cacgccgctt cctgatattg agtggatgat
1740atgcaaagat attaagaaat gtaataatga aacttcctgg actattttgg ccaacaatgt
1800ctcaaacatc atcacggaga tccactcccg agacaggagt accgtggagg gccgtgtgac
1860tttcgccaaa gtggaggaga ccatcgccgt gcgatgcctg gctaagaatc tccttggagc
1920tgagaaccga gagctgaagc tggtggctcc caccctgcgt tctgaactca cggtggctgc
1980tgcagtcctg gtgctgttgg tgattgtgat catctcactt attgtcctgg ttgtcatttg
2040gaaacagaaa ccgaggtatg aaattcgctg gagggtcatt gaatcaatca gcccggatgg
2100acatgaatat atttatgtgg acccgatgca gctgccttat gactcaagat gggagtttcc
2160aagagatgga ctagtgcttg gtcgggtctt ggggtctgga gcgtttggga aggtggttga
2220aggaacagcc tatggattaa gccggtccca acctgtcatg aaagttgcag tgaagatgct
2280aaaacccacg gccagatcca gtgaaaaaca agctctcatg tctgaactga agataatgac
2340tcacctgggg ccacatttga acattgtaaa cttgctggga gcctgcacca agtcaggccc
2400catttacatc atcacagagt attgcttcta tggagatttg gtcaactatt tgcataagaa
2460tagggatagc ttcctgagcc accacccaga gaagccaaag aaagagctgg atatctttgg
2520attgaaccct gctgatgaaa gcacacggag ctatgttatt ttatcttttg aaaacaatgg
2580tgactacatg gacatgaagc aggctgatac tacacagtat gtccccatgc tagaaaggaa
2640agaggtttct aaatattccg acatccagag atcactctat gatcgtccag cctcatataa
2700gaagaaatct atgttagact cagaagtcaa aaacctcctt tcagatgata actcagaagg
2760ccttacttta ttggatttgt tgagcttcac ctatcaagtt gcccgaggaa tggagttttt
2820ggcttcaaaa aattgtgtcc accgtgatct ggctgctcgc aacgtcctcc tggcacaagg
2880aaaaattgtg aagatctgtg actttggcct ggccagagac atcatgcatg attcgaacta
2940tgtgtcgaaa ggcagtacct ttctgcccgt gaagtggatg gctcctgaga gcatctttga
3000caacctctac accacactga gtgatgtctg gtcttatggc attctgctct gggagatctt
3060ttcccttggt ggcacccctt accccggcat gatggtggat tctactttct acaataagat
3120caagagtggg taccggatgg ccaagcctga ccacgctacc agtgaagtct acgagatcat
3180ggtgaaatgc tggaacagtg agccggagaa gagaccctcc ttttaccacc tgagtgagat
3240tgtggagaat ctgctgcctg gacaatataa aaagagttat gaaaaaattc acctggactt
3300cctgaagagt gaccatcctg ctgtggcacg catgcgtgtg gactcagaca atgcatacat
3360tggtgtcacc tacaaaaacg aggaagacaa gctgaaggac tgggagggtg gtctggatga
3420gcagagactg agcgctgaca gtggctacat cattcctctg cctgacattg accctgtccc
3480tgaggaggag gacctgggca agaggaacag acacagctcg cagacctctg aagagagtgc
3540cattgagacg ggttccagca gttccacctt catcaagaga gaggacgaga ccattgaaga
3600catcgacatg atggacgaca tcggcataga ctcttcagac ctggtggaag acagcttcct
3660gtaactggcg gattcgaggg gttccttcca cttctggggc cacctctgga tcccgttcag
3720aaaaccactt tattgcaatg cggaggttga gaggaggact tggttgatgt ttaaagagaa
3780gttcccagcc aagggcctcg gggagcgttc taaatatgaa tgaatgggat attttgaaat
3840gaactttgtc agtgttgcct ctcgcaatgc ctcagtagca tctcagtggt gtgtgaagtt
3900tggagataga tggataaggg aataataggc cacagaaggt gaactttgtg cttcaaggac
3960attggtgaga gtccaacaga cacaatttat actgcgacag aacttcagca ttgtaattat
4020gtaaataact ctaaccaagg ctgtgtttag attgtattaa ctatcttctt tggacttctg
4080aagagaccac tcaatccatc catgtacttc cctcttgaaa cctgatgtca gctgctgttg
4140aactttttaa agaagtgcat gaaaaaccat ttttgaacct taaaaggtac tggtactata
4200gcattttgct atctttttta gtgttaagag ataaagaata ataattaacc aaccttgttt
4260aatagatttg ggtcatttag aagcctgaca actcattttc atattgtaat ctatgtttat
4320aatactacta ctgttatcag taatgctaaa tgtgtaataa tgtaacatga tttccctcca
4380gagaaagcac aatttaaaac aatccttact aagtaggtga tgagtttgac agtttttgac
4440atttatatta aataacatgt ttctctataa agtatggtaa tagctttagt gaattaaatt
4500tagttgagca tagagaacaa agtaaaagta gtgttgtcca ggaagtcaga atttttaact
4560gtactgaata ggttccccaa tccatcgtat taaaaaacaa ttaactgccc tctgaaataa
4620tgggattaga aacaaacaaa actcttaagt cctaaaagtt ctcaatgtag aggcataaac
4680ctgtgctgaa cataacttct catgtatatt acccaatgga aaatataatg atcagcaaaa
4740agactggatt tgcagaagtt tttttttttt ttcttcatgc ctgatgaaag ctttggcaac
4800cccaatatat gtattttttg aatctatgaa cctgaaaagg gtcagaagga tgcccagaca
4860tcagcctcct tctttcaccc cttaccccaa agagaaagag tttgaaactc gagaccataa
4920agatattctt tagtggaggc tggatgtgca ttagcctgga tcctcagttc tcaaatgtgt
4980gtggcagcca ggatgactag atcctgggtt tccatccttg agattctgaa gtatgaagtc
5040tgagggaaac cagagtctgt atttttctaa actccctggc tgttctgatc ggccagtttt
5100cggaaacact gacttaggtt tcaggaagtt gccatgggaa acaaataatt tgaactttgg
5160aacagggttg gaattcaacc acgcaggaag cctactattt aaatccttgg cttcaggtta
5220gtgacattta atgccatcta gctagcaatt gcgaccttaa tttaactttc cagtcttagc
5280tgaggctgag aaagctaaag tttggttttg acaggttttc caaaagtaaa gatgctactt
5340cccactgtat gggggagatt gaactttccc cgtctcccgt cttctgcctc ccactccata
5400ccccgccaag gaaaggcatg tacaaaaatt atgcaattca gtgttccaag tctctgtgta
5460accagctcag tgttttggtg gaaaaaacat tttaagtttt actgataatt tgaggttaga
5520tgggaggatg aattgtcaca tctatccaca ctgtcaaaca ggttggtgtg ggttcattgg
5580cattctttgc aatactgctt aattgctgat accatatgaa tgaaacatgg gctgtgatta
5640ctgcaatcac tgtgctatcg gcagatgatg ctttggaaga tgcagaagca ataataaagt
5700acttgactac ctactggtgt aatctcaatg caagccccaa ctttcttatc caactttttc
5760atagtaagtg cgaagactga gccagattgg ccaattaaaa acgaaaacct gactaggttc
5820tgtagagcca attagacttg aaatacgttt gtgtttctag aatcacagct caagcattct
5880gtttatcgct cactctccct tgtacagcct tattttgttg gtgctttgca ttttgatatt
5940gctgtgagcc ttgcatgaca tcatgaggcc ggatgaaact tctcagtcca gcagtttcca
6000gtcctaacaa atgctcccac ctgaatttgt atatgactgc atttgtgggt gtgtgtgtgt
6060tttcagcaaa ttccagattt gtttcctttt ggcctcctgc aaagtctcca gaagaaaatt
6120tgccaatctt tcctactttc tatttttatg atgacaatca aagccggcct gagaaacact
6180atttgtgact ttttaaacga ttagtgatgt ccttaaaatg tggtctgcca atctgtacaa
6240aatggtccta tttttgtgaa gagggacata agataaaatg atgttataca tcaatatgta
6300tatatgtatt tctatataga cttggagaat actgccaaaa catttatgac aagctgtatc
6360actgccttcg tttatatttt tttaactgtg ataatcccca caggcacatt aactgttgca
6420cttttgaatg tccaaaattt atattttaga aataataaaa agaaagatac ttacatgttc
6480ccaaaacaat ggtgtggtga atgtgtgaga aaaactaact tgatagggtc taccaataca
6540aaatgtatta cgaatgcccc tgttcatgtt tttgttttaa aacgtgtaaa tgaagatctt
6600tatatttcaa taaatgatat ataatttaaa gtt
6633
338 994 DNA Homo sapiens
338 tgctggccag cacctcgagg gaagatggcg gacgaggaga
agctgccgcc cggctgggag 60aagcgcatga gccgcagctc aggccgagtg tactacttca
accacatcac taacgccagc 120cagtgggagc ggcccagcgg caacagcagc agtggtggca
aaaacgggca gggggagcct 180gccagggtcc gctgctcgca cctgctggtg aagcacagcc
agtcacggcg gccctcgtcc 240tggcggcagg agaagatcac ccggaccaag gaggaggccc
tggagctgat caacggctac 300atccagaaga tcaagtcggg agaggaggac tttgagtctc
tggcctcaca gttcagcgac 360tgcagctcag ccaaggccag gggagacctg ggtgccttca
gcagaggtca gatgcagaag 420ccatttgaag acgcctcgtt tgcgctgcgg acgggggaga
tgagcgggcc cgtgttcacg 480gattccggca tccacatcat cctccgcact gagtgagggt
ggggagccca ggcctggcct 540cggggcaggg cagggcggct aggccggcca gctccccctt
gcccgccagc cagtggccga 600accccccact ccctgccacc gtcacacagt atttattgtt
cccacaatgg ctgggagggg 660gcccttccag attgggggcc ctggggtccc cactccctgt
ccatccccag ttggggctgc 720gaccgccaga ttctccctta aggaattgac ttcagcaggg
gtgggaggct cccagaccca 780gggcagtgtg gtgggagggg tgttccaaag agaaggcctg
gtcagcagag ccgccccgtg 840tccccccagg tgctggaggc agactcgagg gccgaattgt
ttctagttag gccacgctcc 900tctgttcagt cgcaaaggtg aacactcatg cggcagccat
gggccctctg agcaactgtg 960cagacccttt cacccccaat taaacccaga acca
994
339 772 DNA Homo sapiens
339 agctcgtgcc gaattcggca
cgagccgggt cggagccatg gcggtggcaa attcaagtcc 60tgttaacccc gtggtgttct
ttgatgtcag tattggcggt caggaagttg gccgcatgaa 120gatcgagctc tttgcagacg
ttgtgcctaa gacggccgag aactttaggc agttctgcac 180cggagaattc aggaaagatg
gggttccaat aggatacaaa ggaagcacct tccacagggt 240cataaaggat ttcatgattc
agggtggaga ttttgttaat ggagatggta ctggagtcgc 300cagtatttac cgggggccat
ttgcagatga aaattttaaa cttagacact cagctccagg 360cctgctttcc atggcgaaca
gtggtccaag tacaaatggc tgtcagttct ttatcacctg 420ctctaagtgc gattggctgg
atgggaagca tgtggtgttt ggaaaaatca tcgatggact 480tctagtgatg agaaagattg
agaatgttcc cacaggcccc aacaataagc ccaagctacc 540tgtggtgatc tcgcagtgtg
gggagatgta gtccagacaa agactgaatc aggccttccc 600ttcttcttgg tggtgttctt
gagtaagata atctggactg gcccccgtct ttgcttccct 660gcctgctgct gccccatttg
atcaagagac catggaagtg tcagagattc agaatccaag 720attgtcttta agttttcaac
tgtaaataaa gtttttttgt atgcgtaaaa aa 772
340 919 DNA Homo sapiens
340 cgctcgcctc cctcgctcca cgcgcgcccg gacgcggcgg ccaggcttgc gcgtggttcc
60cctcccggtg ggcggattcc tgggcaagat gaagtgggtg tgggcgctct tgctgttggc
120ggcgtgggca gcggccgagc gcgactgccg agtgagcagc ttccgagtca aggagaactt
180cgacaaggct cgcttctctg ggacctggta cgccatggcc aagaaggacc ccgagggcct
240ctttctgcag gacaacatcg tcgcggagtt ctcggtggac gagaccggcc agatgagcgc
300cacagccaag ggccgagtcc gtcttttgaa taactgggac gtgtgcgcag acatggtggg
360caccttcaca gacaccgagg accctgccaa gttcaagatg aagtactggg gcgtagcctc
420ctttctgcag aaaggaaatg atgaccactg gatcgtcgac acagactacg acacgtatgc
480cgtacagtac tcctgccgcc tcctgaacct cgatggcacc tgtgctgaca gctactcctt
540cgtgttttcc cgggacccca acggcctgcc cccagaagcg cagaagattg taaggcagcg
600gcaggaggag ctgtgcctgg ccaggcagta caggctgatc gtccacaacg gttactgcga
660tggcagatca gaaagaaacc ttttgtagca atatcaagaa tctagtttca tctgagaact
720tctgattagc tctcagtctt cagctctatt tatcttagga gtttaatttg cccttctctc
780cccatcttcc ctcagttccc ataaaacctt cattacacat aaagatacac gtgggggtca
840gtgaatctgc ttgcctttcc tgaaagtttc tggggcttaa gattccagac tctgattcat
900taaactatag tcacccgtg
919
341 7365 DNA Homo sapiens
341 ggcagtttgt aggtcgcgag ggaagcgctg aggatcagga
agggggcact gagtgtccgt 60gggggaatcc tcgtgatagg aactggaata tgccttgagg
gggacactat gtctttaaaa 120acgtcggctg gtcatgaggt caggagttcc agaccagcct
gaccaacgtg gtgaaactcc 180gtctctacta aaaatacaaa aattagccgg gcgtggtgcc
gctccagcta ctcaggaggc 240tgaggcagga gaatcgctag aacccgggag gcggaggttg
cagtgagccg agatcgcgcc 300attgcactcc agcctgggcg acagagcgag actgtctcaa
aacaaaacaa aacaaaacaa 360aacaaaaaac accggctgtt cattggaaca gaaagaaatg
gatttatctg ctcttcgcgt 420tgaagaagta caaaatgtca ttaatgctat gcagaaaatc
ttagagtgtc ccatctgtct 480ggagttgatc aaggaacctg tctccacaaa gtgtgaccac
atattttgca aattttgcat 540gctgaaactt ctcaaccaga agaaagggcc ttcacagtgt
cctttatgta agaatgatat 600aaccaaaagg agcctacaag aaagtacgag atttagtcaa
cttgttgaag agctattgaa 660aatcatttgt gcttttcagc ttgacacagg tttggagtat
gcaaacagct ataattttgc 720aaaaaaggaa aataactctc ctgaacatct aaaagatgaa
gtttctatca tccaaagtat 780gggctacaga aaccgtgcca aaagacttct acagagtgaa
cccgaaaatc cttccttgca 840ggaaaccagt ctcagtgtcc aactctctaa ccttggaact
gtgagaactc tgaggacaaa 900gcagcggata caacctcaaa agacgtctgt ctacattgaa
ttgggatctg attcttctga 960agataccgtt aataaggcaa cttattgcag tgtgggagat
caagaattgt tacaaatcac 1020ccctcaagga accagggatg aaatcagttt ggattctgca
aaaaaggctg cttgtgaatt 1080ttctgagacg gatgtaacaa atactgaaca tcatcaaccc
agtaataatg atttgaacac 1140cactgagaag cgtgcagctg agaggcatcc agaaaagtat
cagggtagtt ctgtttcaaa 1200cttgcatgtg gagccatgtg gcacaaatac tcatgccagc
tcattacagc atgagaacag 1260cagtttatta ctcactaaag acagaatgaa tgtagaaaag
gctgaattct gtaataaaag 1320caaacagcct ggcttagcaa ggagccaaca taacagatgg
gctggaagta aggaaacatg 1380taatgatagg cggactccca gcacagaaaa aaaggtagat
ctgaatgctg atcccctgtg 1440tgagagaaaa gaatggaata agcagaaact gccatgctca
gagaatccta gagatactga 1500agatgttcct tggataacac taaatagcag cattcagaaa
gttaatgagt ggttttccag 1560aagtgatgaa ctgttaggtt ctgatgactc acatgatggg
gagtctgaat caaatgccaa 1620agtagctgat gtattggacg ttctaaatga ggtagatgaa
tattctggtt cttcagagaa 1680aatagactta ctggccagtg atcctcatga ggctttaata
tgtaaaagtg aaagagttca 1740ctccaaatca gtagagagta atattgaaga caaaatattt
gggaaaacct atcggaagaa 1800ggcaagcctc cccaacttaa gccatgtaac tgaaaatcta
attataggag catttgttac 1860tgagccacag ataatacaag agcgtcccct cacaaataaa
ttaaagcgta aaaggagacc 1920tacatcaggc cttcatcctg aggattttat caagaaagca
gatttggcag ttcaaaagac 1980tcctgaaatg ataaatcagg gaactaacca aacggagcag
aatggtcaag tgatgaatat 2040tactaatagt ggtcatgaga ataaaacaaa aggtgattct
attcagaatg agaaaaatcc 2100taacccaata gaatcactcg aaaaagaatc tgctttcaaa
acgaaagctg aacctataag 2160cagcagtata agcaatatgg aactcgaatt aaatatccac
aattcaaaag cacctaaaaa 2220gaataggctg aggaggaagt cttctaccag gcatattcat
gcgcttgaac tagtagtcag 2280tagaaatcta agcccaccta attgtactga attgcaaatt
gatagttgtt ctagcagtga 2340agagataaag aaaaaaaagt acaaccaaat gccagtcagg
cacagcagaa acctacaact 2400catggaaggt aaagaacctg caactggagc caagaagagt
aacaagccaa atgaacagac 2460aagtaaaaga catgacagcg atactttccc agagctgaag
ttaacaaatg cacctggttc 2520ttttactaag tgttcaaata ccagtgaact taaagaattt
gtcaatccta gccttccaag 2580agaagaaaaa gaagagaaac tagaaacagt taaagtgtct
aataatgctg aagaccccaa 2640agatctcatg ttaagtggag aaagggtttt gcaaactgaa
agatctgtag agagtagcag 2700tatttcattg gtacctggta ctgattatgg cactcaggaa
agtatctcgt tactggaagt 2760tagcactcta gggaaggcaa aaacagaacc aaataaatgt
gtgagtcagt gtgcagcatt 2820tgaaaacccc aagggactaa ttcatggttg ttccaaagat
aatagaaatg acacagaagg 2880ctttaagtat ccattgggac atgaagttaa ccacagtcgg
gaaacaagca tagaaatgga 2940agaaagtgaa cttgatgctc agtatttgca gaatacattc
aaggtttcaa agcgccagtc 3000atttgctccg ttttcaaatc caggaaatgc agaagaggaa
tgtgcaacat tctctgccca 3060ctctgggtcc ttaaagaaac aaagtccaaa agtcactttt
gaatgtgaac aaaaggaaga 3120aaatcaagga aagaatgagt ctaatatcaa gcctgtacag
acagttaata tcactgcagg 3180ctttcctgtg gttggtcaga aagataagcc agttgataat
gccaaatgta gtatcaaagg 3240aggctctagg ttttgtctat catctcagtt cagaggcaac
gaaactggac tcattactcc 3300aaataaacat ggacttttac aaaacccata tcgtatacca
ccactttttc ccatcaagtc 3360atttgttaaa actaaatgta agaaaaatct gctagaggaa
aactttgagg aacattcaat 3420gtcacctgaa agagaaatgg gaaatgagaa cattccaagt
acagtgagca caattagccg 3480taataacatt agagaaaatg tttttaaaga agccagctca
agcaatatta atgaagtagg 3540ttccagtact aatgaagtgg gctccagtat taatgaaata
ggttccagtg atgaaaacat 3600tcaagcagaa ctaggtagaa acagagggcc aaaattgaat
gctatgctta gattaggggt 3660tttgcaacct gaggtctata aacaaagtct tcctggaagt
aattgtaagc atcctgaaat 3720aaaaaagcaa gaatatgaag aagtagttca gactgttaat
acagatttct ctccatatct 3780gatttcagat aacttagaac agcctatggg aagtagtcat
gcatctcagg tttgttctga 3840gacacctgat gacctgttag atgatggtga aataaaggaa
gatactagtt ttgctgaaaa 3900tgacattaag gaaagttctg ctgtttttag caaaagcgtc
cagaaaggag agcttagcag 3960gagtcctagc cctttcaccc atacacattt ggctcagggt
taccgaagag gggccaagaa 4020attagagtcc tcagaagaga acttatctag tgaggatgaa
gagcttccct gcttccaaca 4080cttgttattt ggtaaagtaa acaatatacc ttctcagtct
actaggcata gcaccgttgc 4140taccgagtgt ctgtctaaga acacagagga gaatttatta
tcattgaaga atagcttaaa 4200tgactgcagt aaccaggtaa tattggcaaa ggcatctcag
gaacatcacc ttagtgagga 4260aacaaaatgt tctgctagct tgttttcttc acagtgcagt
gaattggaag acttgactgc 4320aaatacaaac acccaggatc ctttcttgat tggttcttcc
aaacaaatga ggcatcagtc 4380tgaaagccag ggagttggtc tgagtgacaa ggaattggtt
tcagatgatg aagaaagagg 4440aacgggcttg gaagaaaata atcaagaaga gcaaagcatg
gattcaaact taggtgaagc 4500agcatctggg tgtgagagtg aaacaagcgt ctctgaagac
tgctcagggc tatcctctca 4560gagtgacatt ttaaccactc agcagaggga taccatgcaa
cataacctga taaagctcca 4620gcaggaaatg gctgaactag aagctgtgtt agaacagcat
gggagccagc cttctaacag 4680ctacccttcc atcataagtg actcttctgc ccttgaggac
ctgcgaaatc cagaacaaag 4740cacatcagaa aaagcagtat taacttcaca gaaaagtagt
gaatacccta taagccagaa 4800tccagaaggc ctttctgctg acaagtttga ggtgtctgca
gatagttcta ccagtaaaaa 4860taaagaacca ggagtggaaa ggtcatcccc ttctaaatgc
ccatcattag atgataggtg 4920gtacatgcac agttgctctg ggagtcttca gaatagaaac
tacccatctc aagaggagct 4980cattaaggtt gttgatgtgg aggagcaaca gctggaagag
tctgggccac acgatttgac 5040ggaaacatct tacttgccaa ggcaagatct agagggaacc
ccttacctgg aatctggaat 5100cagcctcttc tctgatgacc ctgaatctga tccttctgaa
gacagagccc cagagtcagc 5160tcgtgttggc aacataccat cttcaacctc tgcattgaaa
gttccccaat tgaaagttgc 5220agaatctgcc cagagtccag ctgctgctca tactactgat
actgctgggt ataatgcaat 5280ggaagaaagt gtgagcaggg agaagccaga attgacagct
tcaacagaaa gggtcaacaa 5340aagaatgtcc atggtggtgt ctggcctgac cccagaagaa
tttatgctcg tgtacaagtt 5400tgccagaaaa caccacatca ctttaactaa tctaattact
gaagagacta ctcatgttgt 5460tatgaaaaca gatgctgagt ttgtgtgtga acggacactg
aaatattttc taggaattgc 5520gggaggaaaa tgggtagtta gctatttctg ggtgacccag
tctattaaag aaagaaaaat 5580gctgaatgag catgattttg aagtcagagg agatgtggtc
aatggaagaa accaccaagg 5640tccaaagcga gcaagagaat cccaggacag aaagatcttc
agggggctag aaatctgttg 5700ctatgggccc ttcaccaaca tgcccacaga tcaactggaa
tggatggtac agctgtgtgg 5760tgcttctgtg gtgaaggagc tttcatcatt cacccttggc
acaggtgtcc acccaattgt 5820ggttgtgcag ccagatgcct ggacagagga caatggcttc
catgcaattg ggcagatgtg 5880tgaggcacct gtggtgaccc gagagtgggt gttggacagt
gtagcactct accagtgcca 5940ggagctggac acctacctga taccccagat cccccacagc
cactactgac tgcagccagc 6000cacaggtaca gagccacagg accccaagaa tgagcttaca
aagtggcctt tccaggccct 6060gggagctcct ctcactcttc agtccttcta ctgtcctggc
tactaaatat tttatgtaca 6120tcagcctgaa aaggacttct ggctatgcaa gggtccctta
aagattttct gcttgaagtc 6180tcccttggaa atctgccatg agcacaaaat tatggtaatt
tttcacctga gaagatttta 6240aaaccattta aacgccacca attgagcaag atgctgattc
attatttatc agccctattc 6300tttctattca ggctgttgtt ggcttagggc tggaagcaca
gagtggcttg gcctcaagag 6360aatagctggt ttccctaagt ttacttctct aaaaccctgt
gttcacaaag gcagagagtc 6420agacccttca atggaaggag agtgcttggg atcgattatg
tgacttaaag tcagaatagt 6480ccttgggcag ttctcaaatg ttggagtgga acattgggga
ggaaattctg aggcaggtat 6540tagaaatgaa aaggaaactt gaaacctggg catggtggct
cacgcctgta atcccagcac 6600tttgggaggc caaggtgggc agatcactgg aggtcaggag
ttcgaaacca gcctggccaa 6660catggtgaaa ccccatctct actaaaaata cagaaattag
ccggtcatgg tggtggacac 6720ctgtaatccc agctactcag gtggctaagg caggagaatc
acttcagccc gggaggtgga 6780ggttgcagtg agccaagatc ataccacggc actccagcct
gggtgacagt gagactgtgg 6840ctcaaaaaaa aaaaaaaaaa aggaaaatga aactaggaaa
ggtttcttaa agtctgagat 6900atatttgcta gatttctaaa gaatgtgttc taaaacagca
gaagattttc aagaaccggt 6960ttccaaagac agtcttctaa ttcctcatta gtaataagta
aaatgtttat tgttgtagct 7020ctggtatata atccattcct cttaaaatat aagacctctg
gcatgaatat ttcatatcta 7080taaaatgaca gatcccacca ggaaggaagc tgttgctttc
tttgaggtga tttttttcct 7140ttgctccctg ttgctgaaac catacagctt cataaataat
tttgcttgct gaaggaagaa 7200aaagtgtttt tcataaaccc attatccagg actgtttata
gctgttggaa ggactaggtc 7260ttccctagcc cccccagtgt gcaagggcag tgaagacttg
attgtacaaa atacgttttg 7320taaatgttgt gctgttaaca ctgcaaataa acttggtagc
aaaca 7365
342 10386 DNA Homo sapiens unsure (0)...(0) n = a, t, c or g
342 attgaggact cggaaatgag gtccaagggt agccaaggat ggctgcagct
tcatatgatc 60agttgttaaa gcaagttgag gcactgaaga tggagaactc aaatcttcga
caagagctag 120aagataattc caatcatctt acaaaactgg aaactgaggc atctaatatg
aaggaagtac 180ttaaacaact acaaggaagt attgaagatg aagctatggc ttcttctgga
cagattgatt 240tattagagcg tcttaaagag cttaacttag atagcagtaa tttccctgga
gtaaaactgc 300ggtcaaaaat gtccctccgt tcttatggaa gccgggaagg atctgtatca
agccgttctg 360gagagtgcag tcctgttcct atgggttcat ttccaagaag agggtttgta
aatggaagca 420gagaaagtac tggatattta gaagaacttg agaaagagag gtcattgctt
cttgctgatc 480ttgacaaaga agaaaaggaa aaagactggt attacgctca acttcagaat
ctcactaaaa 540gaatagatag tcttccttta actgaaaatt tttccttaca aacagatatg
accagaaggc 600aattggaata tgaagcaagg caaatcagag ttgcgatgga agaacaacta
ggtacctgcc 660aggatatgga aaaacgagca cagcgaagaa tagccagaat tcagcaaatc
gaaaaggaca 720tacttcgtat acgacagctt ttacagtccc aagcaacaga agcagagagg
tcatctcaga 780acaagcatga aaccggctca catgatgctg agcggcagaa tgaaggtcaa
ggagtgggag 840aaatcaacat ggcaacttct ggtaatggtc agggttcaac tacacgaatg
gaccatgaaa 900cagccagtgt tttgagttct agtagcacac actctgcacc tcgaaggctg
acaagtcatc 960tgggaaccaa ggtggaaatg gtgtattcat tgttgtcaat gcttggtact
catgataagg 1020atgatatgtc gcgaactttg ctagctatgt ctagctccca agacagctgt
atatccatgc 1080gacagtctgg atgtcttcct ctcctcatcc agcttttaca tggcaatgac
aaagactctg 1140tattgttggg aaattcccgg ggcagtaaag aggctcgggc cagggccagt
gcagcactcc 1200acaacatcat tcactcacag cctgatgaca agagaggcag gcgtgaaatc
cgagtccttc 1260atcttttgga acagatacgc gcttactgtg aaacctgttg ggagtggcag
gaagctcatg 1320aaccaggcat ggaccaggac aaaaatccaa tgccagctcc tgttgaacat
cagatctgtc 1380ctgctgtgtg tgttctaatg aaactttcat ttgatgaaga gcatagacat
gcaatgaatg 1440aactaggggg actacaggcc attgcagaat tattgcaagt ggactgtgaa
atgtacgggc 1500ttactaatga ccactacagt attacactaa gacgatatgc tggaatggct
ttgacaaact 1560tgacttttgg agatgtagcc aacaaggcta cgctatgctc tatgaaaggc
tgcatgagag 1620cacttgtggc ccaactaaaa tctgaaagtg aagacttaca gcaggttatt
gcaagtgttt 1680tgaggaattt gtcttggcga gcagatgtaa atagtaaaaa gacgttgcga
gaagttggaa 1740gtgtgaaagc attgatggaa tgtgctttag aagttaaaaa ggaatcaacc
ctcaaaagcg 1800tattgagtgc cttatggaat ttgtcagcac attgcactga gaataaagct
gatatatgtg 1860ctgtagatgg tgcacttgca tttttggttg gcactcttac ttaccggagc
cagacaaaca 1920ctttagccat tattgaaagt ggaggtggga tattacggaa tgtgtccagc
ttgatagcta 1980caaatgagga ccacaggcaa atcctaagag agaacaactg tctacaaact
ttattacaac 2040acttaaaatc tcatagtttg acaatagtca gtaatgcatg tggaactttg
tggaatctct 2100cagcaagaaa tcctaaagac caggaagcat tatgggacat gggggcagtt
agcatgctca 2160agaacctcat tcattcaaag cacaaaatga ttgctatggg aagtgctgca
gctttaagga 2220atctcatggc aaataggcct gcgaagtaca aggatgccaa tattatgtct
cctggctcaa 2280gcttgccatc tcttcatgtt aggaaacaaa aagccctaga agcagaatta
gatgctcagc 2340acttatcaga aacttttgac aatatagaca atttaagtcc caaggcatct
catcgtagta 2400agcagagaca caagcaaagt ctctatggtg attatgtttt tgacaccaat
cgacatgatg 2460ataataggtc agacaatttt aatactggca acatgactgt cctttcacca
tatttgaata 2520ctacagtgtt acccagctcc tcttcatcaa gaggaagctt agatagttct
cgttctgaaa 2580aagatagaag tttggagaga gaacgcggaa ttggtctagg caactaccat
ccagcaacag 2640aaaatccagg aacttcttca aagcgaggtt tgcagatctc caccactgca
gcccagattg 2700ccaaagtcat ggaagaagtg tcagccattc atacctctca ggaagacaga
agttctgggt 2760ctaccactga attacattgt gtgacagatg agagaaatgc acttagaaga
agctctgctg 2820cccatacaca ttcaaacact tacaatttca ctaagtcgga aaattcaaat
aggacatgtt 2880ctatgcctta tgccaaatta gaatacaaga gatcttcaaa tgatagttta
aatagtgtca 2940gtagtagtga tggttatggt aaaagaggtc aaatgaaacc ctcgattgaa
tcctattctg 3000aagatgatga aagtaagttt tgcagttatg gtcaataccc agccgaccta
gcccataaaa 3060tacatagtgc aaatcatatg gatgataatg atggagaact agatacacca
ataaattata 3120gtcttaaata ttcagatgag cagttgaact ctggaaggca aagtccttca
cagaatgaaa 3180gatgggcaag acccaaacac ataatagaag atgaaataaa acaaagtgag
caaagacaat 3240caaggaatca aagtacaact tatcctgttt atactgagag cactgatgat
aaacacctca 3300agttccaacc acattttgga cagcaggaat gtgtttctcc atacaggtca
cggggagcca 3360atggttcaga aacaaatcga gtgggttcta atcatggaat taatcaaaat
gtaagccagt 3420ctttgtgtca agaagatgac tatgaagatg ataagcctac caattatagt
gaacgttact 3480ctgaagaaga acagcatgaa gaagaagaga gaccaacaaa ttatagcata
aaatataatg 3540aagagaaacg tcatgtggat cagcctattg attatagttt aaaatatgcc
acagatattc 3600cttcatcaca gaaacagtca ttttcattct caaagagttc atctggacaa
agcagtaaaa 3660ccgaacatat gtcttcaagc agtgagaata cgtccacacc ttcatctaat
gccaagaggc 3720agaatcagct ccatccaagt tctgcacaga gtagaagtgg tcagcctcaa
aaggctgcca 3780cttgcaaagt ttcttctatt aaccaagaaa caatacagac ttattgtgta
gaagatactc 3840caatatgttt ttcaagatgt agttcattat catctttgtc atcagctgaa
gatgaaatag 3900gatgtaatca gacgacacag gaagcagatt ctgctaatac cctgcaaata
gcagaaataa 3960aagaaaagat tggaactagg tcagctgaag atcctgtgag cgaagttcca
gcagtgtcac 4020agcaccctag aaccaaatcc agcagactgc agggttctag tttatcttca
gaatcagcca 4080ggcacaaagc tgttgaattt tcttcaggag cgaaatctcc ctccaaaagt
ggtgctcaga 4140cacccaaaag tccacctgaa cactatgttc aggagacccc actcatgttt
agcagatgta 4200cttctgtcag ttcacttgat agttttgaga gtcgttcgat tgccagctcc
gttcagagtg 4260aaccatgcag tggaatggta agtggcatta taagccccag tgatcttcca
gatagccctg 4320gacaaaccat gccaccaagc agaagtaaaa cacctccacc acctcctcaa
acagctcaaa 4380ccaagcgaga agtacctaaa aataaagcac ctactgctga aaagagagag
agtggaccta 4440agcaagctgc agtaaatgct gcagttcaga gggtccaggt tcttccagat
gctgatactt 4500tattacattt tgccacggaa agtactccag atggattttc ttgttcatcc
agcctgagtg 4560ctctgagcct cgatgagcca tttatacaga aagatgtgga attaagaata
atgcctccag 4620ttcaggaaaa tgacaatggg aatgaaacag aatcagagca gcctaaagaa
tcaaatgaaa 4680accaagagaa agaggcagaa aaaactattg attctgaaaa ggacctatta
gatgattcag 4740atgatgatga tattgaaata ctagaagaat gtattatttc tgccatgcca
acaaagtcat 4800cacgtaaagc aaaaaagcca gcccagactg cttcaaaatt acctccacct
gtggcaagga 4860aaccaagtca gctgcctgtg tacaaacttc taccatcaca aaacaggttg
caaccccaaa 4920agcatgttag ttttacaccg ggggatgata tgccacgggt gtattgtgtt
gaagggacac 4980ctataaactt ttccacagct acatctctaa gtgatctaac aatcgaatcc
cctccaaatg 5040agttagctgc tggagaagga gttagaggag gagcacagtc aggtgaattt
gaaaaacgag 5100ataccattcc tacagaaggc agaagtacag atgaggctca aggaggaaaa
acctcatctg 5160taaccatacc tgaattggat gacaataaag cagaggaagg tgatattctt
gcagaatgca 5220ttaattctgc tatgcccaaa gggaaaagtc acaagccttt ccgtgtgaaa
aagataatgg 5280accaggtcca gcaagcatct gcgtcgtctt ctgcacccaa caaaaatcag
ttagatggta 5340agaaaaagaa accaacttca ccagtaaaac ctataccaca aaatactgaa
tataggacac 5400gtgtaagaaa aaatgcagac tcaaaaaata atttaaatgc tgagagagtt
ttctcagaca 5460acaaagattc aaagaaacag aatttgaaaa ataattccaa ggacttcaat
gataagctcc 5520caaataatga agatagagtc agaggaagtt ttgcttttga ttcacctcat
cattacacgc 5580ctattgaagg aactccttac tgtttttcac gaaatgattc tttgagttct
ctagattttg 5640atgatgatga tgttgacctt tccagggaaa aggctgaatt aagaaaggca
aaagaaaata 5700aggaatcaga ggctaaagtt accagccaca cagaactaac ctccaaccaa
caatcagcta 5760ataagacaca agctattgca aagcagccaa taaatcgagg tcagcctaaa
cccatacttc 5820agaaacaatc cacttttccc cagtcatcca aagacatacc agacagaggg
gcagcaactg 5880atgaaaagtt acagaatttt gctattgaaa atactccagt ttgcttttct
cataattcct 5940ctctgagttc tctcagtgac attgaccaag aaaacaacaa taaagaaaat
gaacctatca 6000aagagactga gccccctgac tcacagggag aaccaagtaa acctcaagca
tcaggctatg 6060ctcctaaatc atttcatgtt gaagataccc cagtttgttt ctcaagaaac
agttctctca 6120gttctcttag tattgactct gaagatgacc tgttgcagga atgtataagc
tccgcaatgc 6180caaaaaagaa aaagccttca agactcaagg gtgataatga aaaacatagt
cccagaaata 6240tgggtggcat attaggtgaa gatctgacac ttgatttgaa agatatacag
agaccagatt 6300cagaacatgg tctatcccct gattcagaaa attttgattg gaaagctatt
caggaaggtg 6360caaattccat agtaagtagt ttacatcaag ctgctgctgc tgcatgttta
tctagacaag 6420cttcgtctga ttcagattcc atcctttccc tgaaatcagg aatctctctg
ggatcaccat 6480ttcatcttac acctgatcaa gaagaaaaac cctttacaag taataaaggc
ccacgaattc 6540taaaaccagg ggagaaaagt acattggaaa ctaaaaagat agaatctgaa
agtaaaggaa 6600tcaaaggagg aaaaaaagtt tataaaagtt tgattactgg aaaagttcga
tctaattcag 6660aaatttcagg ccaaatgaaa cagccccttc aagcaaacat gccttcaatc
tctcgaggca 6720ggacaatgat tcatattcca ggagttcgaa atagctcctc aagtacaagt
cctgtttcta 6780aaaaaggccc accccttaag actccagcct ccaaaagccc tagtgaaggt
caaacagcca 6840ccacttctcc tagaggagcc aagccatctg tgaaatcaga attaagccct
gttgccaggc 6900agacatccca aataggtggg tcaagtaaag caccttctag atcaggatct
agagattcga 6960ccccttcaag acctgcccag caaccattaa gtagacctat acagtctcct
ggccgaaact 7020caatttcccc tggtagaaat ggaataagtc ctcctaacaa attatctcaa
cttccaagga 7080catcatcccc tagtactgct tcaactaagt cctcaggttc tggaaaaatg
tcatatacat 7140ctccaggtag acagatgagc caacagaacc ttaccaaaca aacaggttta
tccaagaatg 7200ccagtagtat tccaagaagt gagtctgcct ccaaaggact aaatcagatg
aataatggta 7260atggagccaa taaaaaggta gaactttcta gaatgtcttc aactaaatca
agtggaagtg 7320aatctgatag atcagaaaga cctgtattag tacgccagtc aactttcatc
aaagaagctc 7380caagcccaac cttaagaaga aaattggagg aatctgcttc atttgaatct
ctttctccat 7440catctagacc agcttctccc actaggtccc aggcacaaac tccagtttta
agtccttccc 7500ttcctgatat gtctctatcc acacattcgt ctgttcaggc tggtggatgg
cgaaaactcc 7560cacctaatct cagtcccact atagagtata atgatggaag accagcaaag
cgccatgata 7620ttgcacggtc tcattctgaa agtccttcta gacttccaat caataggtca
ggaacctgga 7680aacgtgagca cagcaaacat tcatcatccc ttcctcgagt aagcacttgg
agaagaactg 7740gaagttcatc ttcaattctt tctgcttcat cagaatccag tgaaaaagca
aaaagtgagg 7800atgaaaaaca tgtgaactct atttcaggaa ccaaacaaag taaagaaaac
caagtatccg 7860caaaaggaac atggagaaaa ataaaagaaa atgaattttc tcccacaaat
agtacttctc 7920agaccgtttc ctcaggtgct acaaatggtg ctgaatcaaa gactctaatt
tatcaaatgg 7980cacctgctgt ttctaaaaca gaggatgttt gggtgagaat tgaggactgt
cccattaaca 8040atcctagatc tggaagatct cccacaggta atactccccc ggtgattgac
agtgtttcag 8100aaaaggcaaa tccaaacatt aaagattcaa aagataatca ggcaaaacaa
aatgtgggta 8160atggcagtgt tcccatgcgt accgtgggtt tggaaaatcg cctgaactcc
tttattcagg 8220tggatgcccc tgaccaaaaa ggaactgaga taaaaccagg acaaaataat
cctgtccctg 8280tatcagagac taatgaaagt tctatagtgg aacgtacccc attcagttct
agcagctcaa 8340gcaaacacag ttcacctagt gggactgttg ctgccagagt gactcctttt
aattacaacc 8400caagccctag gaaaagcagc gcagatagca cttcagctcg gccatctcag
atcccaactc 8460cagtgaataa caacacaaag aagcgagatt ccaaaactga cagcacagaa
tccagtggaa 8520cccaaagtcc taagcgccat tctgggtctt accttgtgac atctgtttaa
aagagaggaa 8580gaatgaaact aagaaaattc tatgttaatt acaactgcta tatagacatt
ttgtttcaaa 8640tgaaacttta aaagactgaa aaattttgta aataggtttg attcttgtta
gagggttttt 8700gttctggaag ccatatttga tagtatactt tgtcttcact ggtcttattt
tgggaggcac 8760tcttgatggt taggaaaaaa atagtaaagc caagtatgtt tgtacagtat
gttttacatg 8820tatttaaagt agcatcccat cccaacttcc tttaattatt gcttgtctta
aaataatgaa 8880cactacagat agaaaatatg atatattgct gttatcaatc atttctagat
tataaactga 8940ctaaacttac atcagggaaa aattggtatt tatgcaaaaa aaaatgtttt
tgtccttgtg 9000agtccatcta acatcataat taatcatgtg gctgtgaaat tcacagtaat
atggttcccg 9060atgaacaagc tttacccagc ctgtttgctt tactgcatga atgaaactga
tggttcaatt 9120tcagaagtaa tgattaacag ttatgtggtc acatgatgtg catagagata
gctacagtgt 9180aataatttac actattttgt gctccaaaca aaacaaaaat ctgtgtaact
gtaaaacatt 9240gaatgaaact attttacctg aactagattt tatctgaaag taggtagaat
ttttgctatg 9300ctgtaatttg ttgtatattc tggtatttga ggtgagatgg ctgctctttt
attaatgaga 9360catgaattgt gtctcaacag aaactaaatg aacatttcag aataaattat
tgctgtatgt 9420aaactgttac tgaaattggt atttgtttga agggtcttgt ttcacatttg
tattaataat 9480tgtttaaaat gcctctttta aaagcttata taaatttttt ncttcagctt
ctatgcatta 9540agagtaaaat tcctcttact gtaataaaaa caattgaaga agactgttgc
cacttaacca 9600ttccatgcgt tggcacttat ctattcctga aattctttta tgtgattagc
tcatcttgat 9660ttttaacatt tttccactta aacttttttt tcttactcca ctggagctca
gtaaaagtaa 9720attcatgtaa tagcaatgca agcagcctag cacagactaa gcattgagca
taataggccc 9780acataatttc ctctttctta atattataga aattctgtac ttgaaattga
ttcttagaca 9840ttgcagtctc ttcgaggctt tacagtgtaa actgtcttgc cccttcatct
tcttgttgca 9900actgggtctg acatgaacac tttttatcac cctgtatgtt agggcaagat
ctcagcagtg 9960aagtataatc agcactttgc catgctcaga aaattcaaat cacatggaac
tttagaggta 10020gatttaatac gattaagata ttcagaagta tattttagaa tccctgcctg
ttaaggaaac 10080tttatttgtg gtaggtacag ttctggggta catgttaagt gtccccttat
acagtggagg 10140gaagtcttcc ttcctgaagg aaaataaact gacacttatt aactaagata
atttacttaa 10200tatatcttcc ctgatttgtt ttaaaagatc agagggtgac tgatgataca
tgcatacata 10260tttgttgaat aaatgaaaat ttatttttag tgataagatt catacactct
gtatttgggg 10320agagaaaacc tttttaagca tggtggggca ctcagatagg agtgaataca
cctacctggt 10380ggtcat
10386
343 2191 DNA Homo sapiens
343 ggtggccgag cgggggaccg ggaagcatgg
cccgggggtc ggcggttgcc tgggcggcgc 60tcgggccgtt gttgtggggc tgcgcgctgg
ggctgcaggg cgggatgctg tacccccagg 120agagcccgtc gcgggagtgc aaggagctgg
acggcctctg gagcttccgc gccgacttct 180ctgacaaccg acgccggggc ttcgaggagc
agtggtaccg gcggccgctg tgggagtcag 240gccccaccgt ggacatgcca gttccctcca
gcttcaatga catcagccag gactggcgtc 300tgcggcattt tgtcggctgg gtgtggtacg
aacgggaggt gatcctgccg gagcgatgga 360cccaggacct gcgcacaaga gtggtgctga
ggattggcag tgcccattcc tatgccatcg 420tgtgggtgaa tggggtcgac acgctagagc
atgagggggg ctacctcccc ttcgaggccg 480acatcagcaa cctggtccag gtggggcccc
tgccctcccg gctccgaatc actatcgcca 540tcaacaacac actcaccccc accaccctgc
caccagggac catccaatac ctgactgaca 600cctccaagta tcccaagggt tactttgtcc
agaacacata ttttgacttt ttcaactacg 660ctggactgca gcggtctgta cttctgtaca
cgacacccac cacctacatc gatgacatca 720ccgtcaccac cagcgtggag caagacagtg
ggctggtgaa ttaccagatc tctgtcaagg 780gcagtaacct gttcaagttg gaagtgcgtc
ttttggatgc agaaaacaaa gtcgtggcga 840atgggactgg gacccagggc caacttaagg
tgccaggtgt cagcctctgg tggccgtacc 900tgatgcacga acgccctgcc tatctgtatt
cattggaggt gcagctgact gcacagacgt 960cactggggcc tgtgtctgac ttctacacac
tccctgtggg gatccgcact gtggctgtca 1020ccaagagcca gttcctcatc aatgggaaac
ctttctattt ccacggtgtc aacaagcatg 1080aggatgcgga catccgaggg aagggcttcg
actggccgct gctggtgaag gacttcaacc 1140tgcttcgctg gcttggtgcc aacgctttcc
gtaccagcca ctacccctat gcagaggaag 1200tgatgcagat gtgtgaccgc tatgggattg
tggtcatcga tgagtgtccc ggcgtgggcc 1260tggcgctgcc gcagttcttc aacaacgttt
ctctgcatca ccacatgcag gtgatggaag 1320aagtggtgcg tagggacaag aaccaccccg
cggtcgtgat gtggtctgtg gccaacgagc 1380ctgcgtccca cctagaatct gctggctact
acttgaagat ggtgatcgct cacaccaaat 1440ccttggaccc ctcccggcct gtgacctttg
tgagcaactc taactatgca gcagacaagg 1500gggctccgta tgtggatgtg atctgtttga
acagctacta ctcttggtat cacgactacg 1560ggcacctgga gttgattcag ctgcagctgg
ccacccagtt tgagaactgg tataagaagt 1620atcagaagcc cattattcag agcgagtatg
gagcagaaac gattgcaggg tttcaccagg 1680atccacctct gatgttcact gaagagtacc
agaaaagtct gctagagcag taccatctgg 1740gtctggatca aaaacgcaga aaatatgtgg
ttggagagct catttggaat tttgccgatt 1800tcatgactga acagtcaccg acgagagtgc
tggggaataa aaaggggatc ttcactcggc 1860agagacaacc aaaaagtgca gcgttccttt
tgcgagagag atactggaag attgccaatg 1920aaaccaggta tccccactca gtagccaagt
cacaatgttt ggaaaacagc ccgtttactt 1980gagcaagact gataccacct gcgtgtccct
tcctccccga gtcagggcga cttccacagc 2040agcagaacaa gtgcctcctg gactgttcac
ggcagaccag aacgtttctg gcctgggttt 2100tgtggtcatc tattctagca gggaacacta
aaggtggaaa taaaagattt tctattatgg 2160aaataaagag ttggcatgaa agtcgctact g
2191
344 2776 DNA Homo sapiens
344 cagggcagac
tggtagcaaa gcccccacgc ccagccagga gcaccgccgc ggactccagc 60acaccgaggg
acatgctggg cctgcgcccc ccactgctcg ccctggtggg gctgctctcc 120ctcgggtgcg
tcctctctca ggagtgcacg aagttcaagg tcagcagctg ccgggaatgc 180atcgagtcgg
ggcccggctg cacctggtgc cagaagctga acttcacagg gccgggggat 240cctgactcca
ttcgctgcga cacccggcca cagctgctca tgaggggctg tgcggctgac 300gacatcatgg
accccacaag cctcgctgaa acccaggaag accacaatgg gggccagaag 360cagctgtccc
cacaaaaagt gacgctttac ctgcgaccag gccaggcagc agcgttcaac 420gtgaccttcc
ggcgggccaa gggctacccc atcgacctgt actatctgat ggacctctcc 480tactccatgc
ttgatgacct caggaatgtc aagaagctag gtggcgacct gctccgggcc 540ctcaacgaga
tcaccgagtc cggccgcatt ggcttcgggt ccttcgtgga caagaccgtg 600ctgccgttcg
tgaacacgca ccctgataag ctgcgaaacc catgccccaa caaggagaaa 660gagtgccagc
ccccgtttgc cttcaggcac gtgctgaagc tgaccaacaa ctccaaccag 720tttcagaccg
aggtcgggaa gcagctgatt tccggaaacc tggatgcacc cgagggtggg 780ctggacgcca
tgatgcaggt cgccgcctgc ccggaggaaa tcggctggcg caacgtcacg 840cggctgctgg
tgtttgccac tgatgacggc ttccatttcg cgggcgacgg aaagctgggc 900gccatcctga
cccccaacga cggccgctgt cacctggagg acaacttgta caagaggagc 960aacgaattcg
actacccatc ggtgggccag ctggcgcaca agctggctga aaacaacatc 1020cagcccatct
tcgcggtgac cagtaggatg gtgaagacct acgagaaact caccgagatc 1080atccccaagt
cagccgtggg ggagctgtct gaggactcca gcaatgtggt ccatctcatt 1140aagaatgctt
acaataaact ctcctccagg gtcttcctgg atcacaacgc cctccccgac 1200accctgaaag
tcacctacga ctccttctgc agcaatggag tgacgcacag gaaccagccc 1260agaggtgact
gtgatggcgt gcagatcaat gtcccgatca ccttccaggt gaaggtcacg 1320gccacagagt
gcatccagga gcagtcgttt gtcatccggg cgctgggctt cacggacata 1380gtgaccgtgc
aggttcttcc ccagtgtgag tgccggtgcc gggaccagag cagagaccgc 1440agcctctgcc
atggcaaggg cttcttggag tgcggcatct gcaggtgtga cactggctac 1500attgggaaaa
actgtgagtg ccagacacag ggccggagca gccaggagct ggaaggaagc 1560tgccggaagg
acaacaactc catcatctgc tcagggctgg gggactgtgt ctgcgggcag 1620tgcctgtgcc
acaccagcga cgtccccggc aagctgatat acgggcagta ctgcgagtgt 1680gacaccatca
actgtgagcg ctacaacggc caggtctgcg gcggcccggg gagggggctc 1740tgcttctgcg
ggaagtgccg ctgccacccg ggctttgagg gctcagcgtg ccagtgcgag 1800aggaccactg
agggctgcct gaacccgcgg cgtgttgagt gtagtggtcg tggccggtgc 1860cgctgcaacg
tatgcgagtg ccattcaggc taccagctgc ctctgtgcca ggagtgcccc 1920ggctgcccct
caccctgtgg caagtacatc tcctgcgccg agtgcctgaa gttcgaaaag 1980ggcccctttg
ggaagaactg cagcgcggcg tgtccgggcc tgcagctgtc gaacaacccc 2040gtgaagggca
ggacctgcaa ggagagggac tcagagggct gctgggtggc ctacacgctg 2100gagcagcagg
acgggatgga ccgctacctc atctatgtgg atgagagccg agagtgtgtg 2160gcaggcccca
acatcgccgc catcgtcggg ggcaccgtgg caggcatcgt gctgatcggc 2220attctcctgc
tggtcatctg gaaggctctg atccacctga gcgacctccg ggagtacagg 2280cgctttgaga
aggagaagct caagtcccag tggaacaatg ataatcccct tttcaagagc 2340gccaccacga
cggtcatgaa ccccaagttt gctgagagtt aggagcactt ggtgaagaca 2400aggccgtcag
gacccaccat gtctgcccca tcacgcggcc gagacatggc ttggccacag 2460ctcttgagga
tgtcaccaat taaccagaaa tccagttatt ttccgccctc aaaatgacag 2520ccatggccgg
ccggtgcttc tgggggctcg tcggggggac agctccactc tgactggcac 2580agtctttgca
tggagacttg aggagggctt gaggttggtg aggttaggtg cgtgtttcct 2640gtgcaagtca
ggacatcagt ctgattaaag gtggtgccaa tttatttaca tttaaacttg 2700tcagggtata
aaatgacatc ccattaatta tattgttaat caatcacgtg tatagaaaaa 2760aaaataaaac
ttcaat 2776
345 3160 DNA Homo sapiens 345
cctcccctcg cccggcgcgg tcccgtccgc ctctcgctcg
cctcccgcct cccctcggtc 60ttccgaggcg cccgggctcc cggcgcggcg gcggaggggg
cgggcaggcc ggcgggcggt 120gatgtggcag gactctttat gcgctgcggc aggatacgcg
ctcggcgctg ggacgcgact 180gcgctcagtt ctctcctctc ggaagctgca gccatgatgg
aagtttgaga gttgagccgc 240tgtgaggcga ggccgggctc aggcgaggga gatgagagac
ggcggcggcc gcggcccgga 300gcccctctca gcgcctgtga gcagccgcgg gggcagcgcc
ctcggggagc cggccggcct 360gcggcggcgg cagcggcggc gtttctcgcc tcctcttcgt
cttttctaac cgtgcagcct 420cttcctcggc ttctcctgaa agggaaggtg gaagccgtgg
gctcgggcgg gagccggctg 480aggcgcggcg gcggcggcgg cggcacctcc cgctcctgga
gcggggggga gaagcggcgg 540cggcggcggc cgcggcggct gcagctccag ggagggggtc
tgagtcgcct gtcaccattt 600ccagggctgg gaacgccgga gagttggtct ctccccttct
actgcctcca acacggcggc 660ggcggcggcg gcacatccag ggacccgggc cggttttaaa
cctcccgtcc gccgccgccg 720caccccccgt ggcccgggct ccggaggccg ccggcggagg
cagccgttcg gaggattatt 780cgtcttctcc ccattccgct gccgccgctg ccaggcctct
ggctgctgag gagaagcagg 840cccagtcgct gcaaccatcc agcagccgcc gcagcagcca
ttacccggct gcggtccaga 900gccaagcggc ggcagagcga ggggcatcag ctaccgccaa
gtccagagcc atttccatcc 960tgcagaagaa gccccgccac cagcagcttc tgccatctct
ctcctccttt ttcttcagcc 1020acaggctccc agacatgaca gccatcatca aagagatcgt
tagcagaaac aaaaggagat 1080atcaagagga tggattcgac ttagacttga cctatattta
tccaaacatt attgctatgg 1140gatttcctgc agaaagactt gaaggcgtat acaggaacaa
tattgatgat gtagtaaggt 1200ttttggattc aaagcataaa aaccattaca agatatacaa
tctttgtgct gaaagacatt 1260atgacaccgc caaatttaat tgcagagttg cacaatatcc
ttttgaagac cataacccac 1320cacagctaga acttatcaaa cccttttgtg aagatcttga
ccaatggcta agtgaagatg 1380acaatcatgt tgcagcaatt cactgtaaag ctggaaaggg
acgaactggt gtaatgatat 1440gtgcatattt attacatcgg ggcaaatttt taaaggcaca
agaggcccta gatttctatg 1500gggaagtaag gaccagagac aaaaagggag taactattcc
cagtcagagg cgctatgtgt 1560attattatag ctacctgtta aagaatcatc tggattatag
accagtggca ctgttgtttc 1620acaagatgat gtttgaaact attccaatgt tcagtggcgg
aacttgcaat cctcagtttg 1680tggtctgcca gctaaaggtg aagatatatt cctccaattc
aggacccaca cgacgggaag 1740acaagttcat gtactttgag ttccctcagc cgttacctgt
gtgtggtgat atcaaagtag 1800agttcttcca caaacagaac aagatgctaa aaaaggacaa
aatgtttcac ttttgggtaa 1860atacattctt cataccagga ccagaggaaa cctcagaaaa
agtagaaaat ggaagtctat 1920gtgatcaaga aatcgatagc atttgcagta tagagcgtgc
agataatgac aaggaatatc 1980tagtacttac tttaacaaaa aatgatcttg acaaagcaaa
taaagacaaa gccaaccgat 2040acttttctcc aaattttaag gtgaagctgt acttcacaaa
aacagtagag gagccgtcaa 2100atccagaggc tagcagttca acttctgtaa caccagatgt
tagtgacaat gaacctgatc 2160attatagata ttctgacacc actgactctg atccagagaa
tgaacctttt gatgaagatc 2220agcatacaca aattacaaaa gtctgaattt ttttttatca
agagggataa aacaccatga 2280aaataaactt gaataaactg aaaatggacc tttttttttt
taatggcaat aggacattgt 2340gtcagattac cagttatagg aacaattctc ttttcctgac
caatcttgtt ttaccctata 2400catccacagg gttttgacac ttgttgtcca gttgaaaaaa
ggttgtgtag ctgtgtcatg 2460tatatacctt tttgtgtcaa aaggacattt aaaattcaat
taggattaat aaagatggca 2520ctttcccgtt ttattccagt tttataaaaa gtggagacag
actgatgtgt atacgtagga 2580attttttcct tttgtgttct gtcaccaact gaagtggcta
aagagctttg tgatatactg 2640gttcacatcc tacccctttg cacttgtggc aacagataag
tttgcagttg gctaagagag 2700gtttccgaaa ggttttgcta ccattctaat gcatgtattc
gggttagggc aatggagggg 2760aatgctcaga aaggaaataa ttttatgctg gactctggac
catataccat ctccagctat 2820ttacacacac ctttctttag catgctacag ttattaatct
ggacattcga ggaattggcc 2880gctgtcactg cttgttgttt gcgcattttt ttttaaagca
tattggtgct agaaaaggca 2940gctaaaggaa gtgaatctgt attggggtac aggaatgaac
cttctgcaac atcttaagat 3000ccacaaatga agggatataa aaataatgtc ataggtaaga
aacacagcaa caatgactta 3060accatataaa tgtggaggct atcaacaaag aatgggcttg
aaacattata aaaattgaca 3120atgatttatt aaatatgttt tctcaattgt aaaaaaaaaa 3160
346 2629 DNA Homo sapiens 346
acttgtcatg gcgactgtcc
agctttgtgc caggagcctc gcaggggttg atgggattgg 60ggttttcccc tcccatgtgc
tcaagactgg cgctaaaagt tttgagcttc tcaaaagtct 120agagccaccg tccagggagc
aggtagctgc tgggctccgg ggacactttg cgttcgggct 180gggagcgtgc tttccacgac
ggtgacacgc ttccctggat tggcagccag actgccttcc 240gggtcactgc catggaggag
ccgcagtcag atcctagcgt cgagccccct ctgagtcagg 300aaacattttc agacctatgg
aaactacttc ctgaaaacaa cgttctgtcc cccttgccgt 360cccaagcaat ggatgatttg
atgctgtccc cggacgatat tgaacaatgg ttcactgaag 420acccaggtcc agatgaagct
cccagaatgc cagaggctgc tccccgcgtg gcccctgcac 480cagcagctcc tacaccggcg
gcccctgcac cagccccctc ctggcccctg tcatcttctg 540tcccttccca gaaaacctac
cagggcagct acggtttccg tctgggcttc ttgcattctg 600ggacagccaa gtctgtgact
tgcacgtact cccctgccct caacaagatg ttttgccaac 660tggccaagac ctgccctgtg
cagctgtggg ttgattccac acccccgccc ggcacccgcg 720tccgcgccat ggccatctac
aagcagtcac agcacatgac ggaggttgtg aggcgctgcc 780cccaccatga gcgctgctca
gatagcgatg gtctggcccc tcctcagcat cttatccgag 840tggaaggaaa tttgcgtgtg
gagtatttgg atgacagaaa cacttttcga catagtgtgg 900tggtgcccta tgagccgcct
gaggttggct ctgactgtac caccatccac tacaactaca 960tgtgtaacag ttcctgcatg
ggcggcatga accggaggcc catcctcacc atcatcacac 1020tggaagactc cagtggtaat
ctactgggac ggaacagctt tgaggtgcgt gtttgtgcct 1080gtcctgggag agaccggcgc
acagaggaag agaatctccg caagaaaggg gagcctcacc 1140acgagctgcc cccagggagc
actaagcgag cactgcccaa caacaccagc tcctctcccc 1200agccaaagaa gaaaccactg
gatggagaat atttcaccct tcagatccgt gggcgtgagc 1260gcttcgagat gttccgagag
ctgaatgagg ccttggaact caaggatgcc caggctggga 1320aggagccagg ggggagcagg
gctcactcca gccacctgaa gtccaaaaag ggtcagtcta 1380cctcccgcca taaaaaactc
atgttcaaga cagaagggcc tgactcagac tgacattctc 1440cacttcttgt tccccactga
cagcctccca cccccatctc tccctcccct gccattttgg 1500gttttgggtc tttgaaccct
tgcttgcaat aggtgtgcgt cagaagcacc caggacttcc 1560atttgctttg tcccggggct
ccactgaaca agttggcctg cactggtgtt ttgttgtggg 1620gaggaggatg gggagtagga
cataccagct tagattttaa ggtttttact gtgagggatg 1680tttgggagat gtaagaaatg
ttcttgcagt taagggttag tttacaatca gccacattct 1740aggtaggtag gggcccactt
caccgtacta accagggaag ctgtccctca tgttgaattt 1800tctctaactt caaggcccat
atctgtgaaa tgctggcatt tgcacctacc tcacagagtg 1860cattgtgagg gttaatgaaa
taatgtacat ctggccttga aaccaccttt tattacatgg 1920ggtctaaaac ttgaccccct
tgagggtgcc tgttccctct ccctctccct gttggctggt 1980gggttggtag tttctacagt
tgggcagctg gttaggtaga gggagttgtc aagtcttgct 2040ggcccagcca aaccctgtct
gacaacctct tggtcgacct tagtacctaa aaggaaatct 2100caccccatcc cacaccctgg
aggatttcat ctcttgtata tgatgatctg gatccaccaa 2160gacttgtttt atgctcaggg
tcaatttctt ttttcttttt tttttttttt tttctttttc 2220tttgagactg ggtctcgctt
tgttgcccag gctggagtgg agtggcgtga tcttggctta 2280ctgcagcctt tgcctccccg
gctcgagcag tcctgcctca gcctccggag tagctgggac 2340cacaggttca tgccaccatg
gccagccaac ttttgcatgt tttgtagaga tggggtctca 2400cagtgttgcc caggctggtc
tcaaactcct gggctcaggc gatccacctg tctcagcctc 2460ccagagtgct gggattacaa
ttgtgagcca ccacgtggag ctggaagggt caacatcttt 2520tacattctgc aagcacatct
gcattttcac cccacccttc ccctccttct ccctttttat 2580atcccatttt tatatcgatc
tcttatttta caataaaact ttgctgcca 2629
347 3442 DNA Homo sapiens 347
agccggtgcg ccgcagacta gggcgcctcg ggccagggag cgcggaggag ccatggccac
60cgctaacggg gccgtggaaa acgggcagcc ggacgggaag ccgccggccc tgccgcgccc
120catccgcaac ctggaggtca agttcaccaa gatatttatc aacaatgaat ggcacgaatc
180caagagtggg aaaaagtttg ctacatgtaa cccttcaact cgggagcaaa tatgtgaagt
240ggaagaagga gataagcccg acgtggacaa ggctgtggag gctgcacagg ttgccttcca
300gaggggctcg ccatggcgcc ggctggatgc cctgagtcgt gggcggctgc tgcaccagct
360ggctgacctg gtggagaggg accgcgccac cttggccgcc ctggagacga tggatacagg
420gaagccattt cttcatgctt ttttcatcga cctggagggc tgtattagaa ccctcagata
480ctttgcaggg tgggcagaca aaatccaggg caagaccatc cccacagatg acaacgtcgt
540atgcttcacc aggcatgagc ccattggtgt ctgtggggcc atcactccat ggaacttccc
600cctgctgatg ctggtgtgga agctggcacc cgccctctgc tgtgggaaca ccatggtcct
660gaagcctgcg gagcagacac ctctcaccgc cctttatctc ggctctctga tcaaagaggc
720cgggttccct ccaggagtgg tgaacattgt gccaggattc gggcccacag tgggagcagc
780aatttcttct caccctcaga tcaacaagat cgccttcacc ggctccacag aggttggaaa
840actggttaaa gaagctgcgt cccggagcaa tctgaagcgg gtgacgctgg agctgggggg
900gaagaacccc tgcatcgtgt gtgcggacgc tgacttggac ttggcagtgg agtgtgccca
960tcagggagtg ttcttcaacc aaggccagtg ttgcacggca gcctccaggg tgttcgtgga
1020ggagcaggtc tactctgagt ttgtcaggcg gagcgtggag tatgccaaga aacggcccgt
1080gggagacccc ttcgatgtca aaacagaaca ggggcctcag attgatcaaa agcagttcga
1140caaaatctta gagctgatcg agagtgggaa gaaggaaggg gccaagctgg aatgcggggg
1200ctcagccatg gaagacaagg ggctcttcat caaacccact gtcttctcag aagtcacaga
1260caacatgcgg attgccaaag aggagatttt cgggccagtg caaccaatac tgaagttcaa
1320aagtatcgaa gaagtgataa aaagagcgaa tagcaccgac tatggactca cagcagccgt
1380gttcacaaaa aatctcgaca aagccctgaa gttggcttct gccttagagt ctggaacggt
1440ctggatcaac tgctacaacg ccctctatgc acaggctcca tttggtggct ttaaaatgtc
1500aggaaatggc agagaactag gtgaatacgc tttggccgaa tacacagaag tgaaaactgt
1560caccatcaaa cttggcgaca agaacccctg aaggaaaggc ggggctcctt cctcaaacat
1620cggacggcgg aatgtggcag atgaaatgtg ctggaggaaa aaaatgacat ttctgacctt
1680cccgggacac attcttctgg aggctttaca tctactggag ttgaatgatt gctgttttcc
1740tctcactctc ctgtttattc accagactgg ggatgcctat aggttgtctg tgaaatcgca
1800gtcctgcctg gggagggagc tgttggccat ttctgtgttt ccctttaaac cagatcctgg
1860agacagtgag atactcaggg cgttgttaac agggagtggt atttgaagtg tccagcagtt
1920gcttgaaatg ctttgccgaa tctgactcca gtaagaatgt gggaaaaccc cctgtgtgtt
1980ctgcaagcag ggctcttgca ccagcggtct cctcagggtg gacctgctta cagagcaagc
2040cacgcctctt tccgaggtga aggtgggacc attccttggg aaaggattca cagtaaggtt
2100ttttggtttt tgttttttgt tttcttgttt ttaaaaaaag gatttcacag tgagaaagtt
2160ttggttagtg cataccgtgg aagggcgcca gggtctttgt ggattgcatg ttgacattga
2220ccgtgagatt cggcttcaaa ccaatactgc ctttggaata tgacagaatc aatagcccag
2280agagcttagt caaagacgat atcacggtct accttaacca aggcactttc ttaagcagaa
2340aatattgttg aggttacctt tgctgctaaa gatccaatct tctaacgcca caacagcata
2400gcaaatccta ggataattca cctcctcatt tgacaaatca gagctgtaat tcactttaac
2460aaattacgca tttctatcac gttcactaac agcttatgat aagtctgtgt agtcttcctt
2520ttctccagtt ctgttaccca atttagatta gtaaagcgta cacaactgga aagactgctg
2580taataacaca gccttgttat ttttaagtcc tattttgata ttaatttctg attagttagt
2640aaataacacc tggattctat ggaggacctc ggtcttcatc caagtggcct gagtatttca
2700ctggcaggtt gtgaattttt cttttcctct ttgggaatcc aaatgatgat gtgcaatttc
2760atgttttaac ttgggaaact gaaagtgttc ccatatagct tcaaaaacaa aaacaaatgt
2820gttatccgac ggatactttt atggttacta actagtactt tcctaattgg gaaagtagtg
2880cttaagtttg caaattaagt tggggagggc aataataaaa tgagggcccg taacagaacc
2940agtgtgtgta taacgaaaac catgtataaa atgggcctat cacccttgtc agagatataa
3000attaccacat ttggcttccc ttcatcagct aacacttatc acttatacta ccaataactt
3060gttaaatcag gatttggctt catacactga attttcagta ttttatctca agtagatata
3120gacactaacc ttgatagtga tacgttagag ggttcctatt cttccattgt acgataatgt
3180ctttaatatg aaatgctaca ttatttataa ttggtagagt tattgtatct ttttatagtt
3240gtaagtacac agaggtggta tatttaaact tctgtaatat actgtattta gaaatggaaa
3300tatatatagt gttaggtttc acttctttta aggtttaccc ctgtggtgtg gtttaaaaat
3360ctataggcct gggaattccg atcctagctg cagatcgcat cccacaatgc gagaatgata
3420aaataaaatt ggatatttga ga 3442
348 737 DNA Homo sapiens 348
ggagtttcgc cgccgcagtc ttcgccacca tgccgcccta
caccgtggtc tatttcccag 60ttcgaggccg ctgcgcggcc ctgcgcatgc tgctggcaga
tcagggccag agctggaagg 120aggaggtggt gaccgtggag acgtggcagg agggctcact
caaagcctcc tgcctatacg 180ggcagctccc caagttccag gacggagacc tcaccctgta
ccagtccaat accatcctgc 240gtcacctggg ccgcaccctt gggctctatg ggaaggacca
gcaggaggca gccctggtgg 300acatggtgaa tgacggcgtg gaggacctcc gctgcaaata
catctccctc atctacacca 360actatgaggc gggcaaggat gactatgtga aggcactgcc
cgggcaactg aagccttttg 420agaccctgct gtcccagaac cagggaggca agaccttcat
tgtgggagac cagatctcct 480tcgctgacta caacctgctg gacttgctgc tgatccatga
ggtcctagcc cctggctgcc 540tggatgcgtt ccccctgctc tcagcatatg tggggcgcct
cagcgcccgg cccaagctca 600aggccttcct ggcctcccct gagtacgtga acctccccat
caatggcaac gggaaacagt 660gagggttggg gggactctga gcgggaggca gagtttgcct
tcctttctcc aggaccaata 720aaatttctaa gagagct 737
349 5189 DNA Homo sapiens 349
atggccaagt cgggtggctg
cggcgcggga gccggcgtgg gcggcggcaa cggggcactg 60acctgggtga acaatgctgc
aaaaaaagaa gagtcagaaa ctgccaacaa aaatgattct 120tcaaagaagt tgtctgttga
gagagtgtat cagaagaaga cacaacttga acacattctt 180cttcgtcctg atacatatat
tgggtcagtg gagccattga cgcagttcat gtgggtgtat 240gatgaagatg taggaatgaa
ttgcagggag gttacctttg tgccaggttt atacaagatc 300tttgatgaaa ttttggttaa
tgctgctgac aataaacaga gggataagaa catgacttgt 360attaaagttt ctattgatcc
tgaatctaac attataagca tttggaataa tgggaaaggc 420attccagtag tagaacacaa
ggtagagaaa gtttatgttc ctgctttaat ttttggacag 480cttttaacat ccagtaacta
tgatgatgat gagaaaaaag ttacaggtgg tcgtaatggt 540tatggtgcaa aactttgtaa
tattttcagt acaaagttta cagtagaaac agcttgcaaa 600gaatacaaac acagttttaa
gcagacatgg atgaataata tgatgaagac ttctgaagcc 660aaaattaaac attttgatgg
tgaagattac acatgcataa cattccaacc agatctgtcc 720aaatttaaga tggaaaaact
tgacaaggat attgtggccc tcatgactag aagggcatat 780gatttggctg gttcgtgtag
aggggtcaag gtcatgttta atggaaagaa attgcctgta 840aatggatttc gcagttatgt
agatctttat gtgaaagaca aattggatga aactggggtg 900gccctgaaag ttattcatga
gcttgcaaat gaaagatggg atgtttgtct cacattgagt 960gaaaaaggat tccagcaaat
cagctttgta aatagtattg caactacaaa aggtggacgg 1020cacgtggatt atgtggtaga
tcaagttgtt ggtaaactga ttgaagtagt taagaaaaag 1080aacaaagctg gtgtatcagt
gaaaccattt caagtaaaaa accatatatg ggtttttatt 1140aattgcctta ttgaaaatcc
aacttttgat tctcagacta aggaaaacat gactctgcag 1200cccaaaagtt ttgggtctaa
atgccagctg tcagaaaaat tttttaaagc agcctctaat 1260tgtggcattg tagaaagtat
cctgaactgg gtgaaattta aggctcagac tcagctgaat 1320aagaagtgtt catcagtaaa
atacagtaaa atcaaaggta ttcccaaact ggatgatgct 1380aatgatgctg gtggtaaaca
ttccctggag tgtacactga tattaacaga gggagactct 1440gccaaatcac tggctgtgtc
tggattaggt gtgattggac gagacagata cggagttttt 1500ccactcaggg gcaaaattct
taatgtacgg gaagcttctc ataaacagat catggaaaat 1560gctgaaataa ataatattat
taaaatagtt ggtctacaat ataagaaaag ttacgatgat 1620gcagaatctc tgaaaacctt
acgctatgga aagattatga ttatgaccga tcaggatcaa 1680gatggttctc acataaaagg
cctgcttatt aatttcatcc atcacaattg gccatcactt 1740ttgaagcatg gttttcttga
agagttcatt actcctattg taaaggcaag caaaaataag 1800caggaacttt ccttctacag
tattcctgaa tttgacgaat ggaaaaaaca tatagaaaac 1860cagaaagcct ggaaaataaa
gtactataaa ggattgggta ctagtacagc taaagaagca 1920aaggaatatt ttgctgatat
ggaaaggcat cgcatcttgt ttagatatgc tggtcctgaa 1980gatgatgctg ccattacctt
ggcatttagt aagaagaaga ttgatgacag aaaagaatgg 2040ttaacaaatt ttatggaaga
ccggagacag cgtaggctac atggcttacc agagcaattt 2100ttatatggta ctgcaacaaa
gcatttgact tataatgatt tcatcaacaa ggaattgatt 2160ctcttctcaa actcagacaa
tgaaagatct ataccatctc ttgttgatgg ctttaaacct 2220ggccagcgga aagttttatt
tacctgtttc aagaggaatg ataaacgtga agtaaaagtt 2280gcccagttgg ctggctctgt
tgctgagatg tcggcttatc atcatggaga acaagcattg 2340atgatgacta ttgtgaattt
ggctcagaac tttgtgggaa gtaacaacat taacttgctt 2400cagcctattg gtcagtttgg
aactcggctt catggtggca aagatgctgc aagccctcgt 2460tatattttca caatgttaag
cactttagca aggctacttt ttcctgctgt ggatgacaac 2520ctccttaagt tcctttatga
tgataatcaa cgtgtagagc ctgagtggta tattcctata 2580attcccatgg ttttaataaa
tggtgctgag ggcattggta ctggatgggc ttgtaaacta 2640cccaactatg atgctaggga
aattgtgaac aatgtcagac gaatgctaga tggcctggat 2700cctcatccca tgcttccaaa
ctacaaaaac tttaaaggca cgattcaaga acttggtcaa 2760aaccagtatg cagtcagtgg
tgaaatattt gtagtggaca gaaacacagt agaaattaca 2820gagcttccag ttagaacttg
gacacaggta tataaagaac aggttttaga acctatgcta 2880aatggaacag ataaaacacc
agcattaatt tctgattata aagaatatca tactgacaca 2940actgtgaaat ttgtggtgaa
aatgactgaa gagaaactag cacaagcaga agctgctgga 3000ctgcataaag tttttaaact
tcaaactact cttacttgta attccatggt actttttgat 3060catatgggat gtctgaagaa
atatgaaact gtgcaagaca ttctgaaaga attctttgat 3120ttacgattaa gttattacgg
tttacgtaag gagtggcttg tgggaatgtt gggagcagaa 3180tctacaaagc ttaacaatca
agcccgtttc attttagaga agatacaagg gaaaattact 3240atagagaata ggtcaaagaa
agatttgatt caaatgttag tccagagagg ttatgaatct 3300gacccagtga aagcctggaa
agaagcacaa gaaaaggcag cagaagagga tgaaacacaa 3360aaccagcatg atgatagttc
ctccgattca ggaactcctt caggcccaga ttttaattat 3420attttaaata tgtctctgtg
gtctcttact aaagaaaaag ttgaagaact gattaaacag 3480agagatgcaa aagggcgaga
ggtcaatgat cttaaaagaa aatctccttc agatctttgg 3540aaagaggatt tagcggcatt
tgttgaagaa ctggataaag tggaatctca agaacgagaa 3600gatgttctgg ctggaatgtc
tggaaaagca attaaaggta aagttggcaa acctaaggtg 3660aagaaactcc agttggaaga
gacaatgccc tcaccttatg gcagaagaat aattcctgaa 3720attacagcta tgaaggcaga
tgccagcaaa aagttgctga agaagaagaa gggtgatctt 3780gatactgcag cagtaaaagt
ggaatttgat gaagaattca gtggagcacc agtagaaggt 3840gcaggagaag aggcattgac
tccatcagtt cctataaata aaggtcccaa acctaagagg 3900gagaagaagg agcctggtac
cagagtgaga aaaacaccta catcatctgg taaacctagt 3960gcaaagaaag tgaagaaacg
gaatccttgg tcagatgatg aatccaagtc agaaagtgat 4020ttggaagaaa cagaacctgt
ggttattcca agagattctt tgcttaggag agcagcagcc 4080gaaagaccta aatacacatt
tgatttctca gaagaagagg atgatgatgc tgatgatgat 4140gatgatgaca ataatgattt
agaggaattg aaagttaaag catctcccat aacaaatgat 4200ggggaagatg aatttgttcc
ttcagatggg ttagataaag atgaatatac attttcacca 4260ggcaaatcaa aagccactcc
agaaaaatct ttgcatgaca aaaaaagtca ggattttgga 4320aatctcttct catttccttc
atattctcag aagtcagaag atgattcagc taaatttgac 4380agtaatgaag aagattctgc
ttctgttttt tcaccatcat ttggtctgaa acagacagat 4440aaagttccaa gtaaaacggt
agctgctaaa aagggaaaac cgtcttcaga tacagtccct 4500aagcccaaga gagccccaaa
acagaagaaa gtagtagagg ctgtaaactc tgactcggat 4560tcagaatttg gcattccaaa
gaagactaca acaccaaaag gtaaaggccg aggggcaaag 4620aaaaggaaag catctggctc
tgaaaatgaa ggcgattata accctggcag gaaaacatcc 4680aaaacaacaa gcaagaaacc
gaagaagaca tcttttgatc aggattcaga tgtggacatc 4740ttcccctcag acttccctac
tgagccacct tctctgccac gaaccggtcg ggctaggaaa 4800gaagtaaaat attttgcaga
gtctgatgaa gaagaagatg atgttgattt tgcaatgttt 4860aattaagtgc ccaaagagca
caaacatttt tcaacaaata tcttgtgttg tccttttgtc 4920ttctctgtct cagacttttg
tacatctggc ttattttaat gtgatgatgt aattgacggt 4980tttttattat tgtggtaggc
cttttaacat tttgttctta cacatacagt tttatgctct 5040tttttactca ttgaaatgtc
acgtactgtc tgattggctt gtagaattgt tatagactgc 5100cgtgcattag cacagatttt
aattgtcatg gttacaaact acagacctgc tttttgaaat 5160gaaatttaaa cattaaaaat
ggaactgtg 5189
350 1536 DNA Homo sapiens 350
gggggggggg ggaccacttg gcctgcctcc gtcccgccgc gccacttggc ctgcctccgt
60cccgccgcgc cacttcgcct gcctccgtcc cccgcccgcc gcgccatgcc tgtggccggc
120tcggagctgc cgcgccggcc cttgcccccc gccgcacagg agcgggacgc cgagccgcgt
180ccgccgcacg gggagctgca gtacctgggg cagatccaac acatcctccg ctgcggcgtc
240aggaaggacg accgcacggg caccggcacc ctgtcggtat tcggcatgca ggcgcgctac
300agcctgagag atgaattccc tctgctgaca accaaacgtg tgttctggaa gggtgttttg
360gaggagttgc tgtggtttat caagggatcc acaaatgcta aagagctgtc ttccaaggga
420gtgaaaatct gggatgccaa tggatcccga gactttttgg acagcctggg attctccacc
480agagaagaag gggacttggg cccagtttat ggcttccagt ggaggcattt tggggcagaa
540tacagagata tggaatcaga ttattcagga cagggagttg accaactgca aagagtgatt
600gacaccatca aaaccaaccc tgacgacaga agaatcatca tgtgcgcttg gaatccaaga
660gatcttcctc tgatggcgct gcctccatgc catgccctct gccagttcta tgtggtgaac
720agtgagctgt cctgccagct gtaccagaga tcgggagaca tgggcctcgg tgtgcctttc
780aacatcgcca gctacgccct gctcacgtac atgattgcgc acatcacggg cctgaagcca
840ggtgacttta tacacacttt gggagatgca catatttacc tgaatcacat cgagccactg
900aaaattcagc ttcagcgaga acccagacct ttcccaaagc tcaggattct tcgaaaagtt
960gagaaaattg atgacttcaa agctgaagac tttcagattg aagggtacaa tccgcatcca
1020actattaaaa tggaaatggc tgtttagggt gctttcaaag gagcttgaag gatattgtca
1080gtctttaggg gttgggctgg atgccgaggt aaaagttctt tttgctctaa aagaaaaagg
1140aactaggtca aaaatctgtc cgtgacctat cagttattaa tttttaagga tgttgccact
1200ggcaaatgta actgtgccag ttctttccat aataaaaggc tttgagttaa ctcactgagg
1260gtatctgaca atgctgaggt tatgaacaaa gtgaggagaa tgaaatgtat gtgctcttag
1320caaaaacatg tatgtgcatt tcaatcccac gtacttataa agaaggttgg tgaatttcac
1380aagctatttt tggaatattt ttagaatatt ttaagaattt cacaagctat tccctcaaat
1440ctgagggagc tgagtaacac catcgatcat gatgtagagt gtggttatga actttatagt
1500tgttttatat gttgctataa taaagaagtg ttctgc 1536
351 2386 DNA Homo sapiens 351
ggaggaggaa gcaagcgagg gggctggttc ctgagcttcg
caattcctgt gtcgccttct 60gggctcccag cctgccgggt cgcatgatcc ctccggccgg
agctggtttt tttgccagcc 120accgcgaggc cggctgagtt accggcatcc ccgcagccac
ctcctctccc gacctgtgat 180acaaaagatc ttccgggggc tgcacctgcc tgcctttgcc
taaggcggat ttgaatctct 240ttctctccct tcagaatctt atcttggctt tggatcttag
aagagaatca ctaaccagag 300acgagactca gtgagtgagc aggtgttttg gacaatggac
tggttgagcc catccctatt 360ataaaaatgt ctcagagcaa ccgggagctg gtggttgact
ttctctccta caagctttcc 420cagaaaggat acagctggag tcagtttagt gatgtggaag
agaacaggac tgaggcccca 480gaagggactg aatcggagat ggagaccccc agtgccatca
atggcaaccc atcctggcac 540ctggcagaca gccccgcggt gaatggagcc actggccaca
gcagcagttt ggatgcccgg 600gaggtgatcc ccatggcagc agtaaagcaa gcgctgaggg
aggcaggcga cgagtttgaa 660ctgcggtacc ggcgggcatt cagtgacctg acatcccagc
tccacatcac cccagggaca 720gcatatcaga gctttgaaca ggatactttt gtggaactct
atgggaacaa tgcagcagcc 780gagagccgaa agggccagga acgcttcaac cgctggttcc
tgacgggcat gactgtggcc 840ggcgtggttc tgctgggctc actcttcagt cggaaatgac
cagacactga ccatccactc 900taccctccca cccccttctc tgctccacca catcctccgt
ccagccgcca ttgccaccag 960gagaaccact acatgcagcc catgcccacc tgcccatcac
agggttgggc ccagatctgg 1020tcccttgcag ctagttttct agaatttatc acacttctgt
gagaccccca cacctcagtt 1080cccttggcct cagaattcac aaaatttcca caaaatctgt
ccaaaggagg ctggcaggta 1140tggaagggtt tgtggctggg ggcaggaggg ccctacctga
ttggtgcaac ccttacccct 1200tagcctccct gaaaatgttt ttctgccagg gagcttgaaa
gttttcagaa cctcttcccc 1260agaaaggaga ctagattgcc tttgttttga tgtttgtggc
ctcagaattg atcattttcc 1320ccccactctc cccacactaa cctgggttcc ctttccttcc
atccctaccc cctaagagcc 1380atttaggggc cacttttgac tagggattca ggctgcttgg
gataaagatg caaggaccag 1440gactccctcc tcacctctgg actggctaga gtcctcactc
ccagtccaaa tgtcctccag 1500aagcctctgg ctagaggcca gccccaccca ggagggaggg
ggctatagct acaggaagca 1560ccccatgcca aagctagggt ggcccttgca gttcagcacc
accctagtcc cttcccctcc 1620ctggctccca tgaccatact gagggaccaa ctgggcccaa
gacagatgcc ccagagctgt 1680ttatggcctc agctgcctca cttcctacaa gagcagcctg
tggcatcttt gccttgggct 1740gctcctcatg gtgggttcag gggactcagc cctgaggtga
aagggagcta tcaggaacag 1800ctatgggagc cccagggtct tccctacctc aggcaggaag
ggcaggaagg agagcctgct 1860gcatggggtg gggtagggct gactagaagg gccagtcctg
cctggccagg cagatctgtg 1920ccccatgcct gtccagcctg ggcagccagg ctgccaaggc
cagagtggcc tggccaggag 1980ctcttcaggc ctccctctct cttctgctcc acccttggcc
tgtctcatcc ccaggggtcc 2040cagccacccc gggctctctg ctgtacatat ttgagactag
tttttattcc ttgtgaagat 2100gatatactat ttttgttaag cgtgtctgta tttatgtgtg
aggagctgct ggcttgcagt 2160gcgcgtgcac gtggagagct ggtgcccgga gattggacgg
cctgatgctc cctcccctgc 2220cctggtccag ggaagctggc cgagggtcct ggctcctgag
gggcatctgc ccctccccca 2280acccccaccc cacacttgtt ccagctcttt gaaatagtct
gtgtgaaggt gaaagtgcag 2340ttcagtaata aactgtgttt actcagtgaa aaaaaaaaaa
aaaaaa 2386
352 1270 DNA Homo sapiens 352
agacgttcgc
acacctgggt gccagcgccc cagaggtccc gggacagccc gaggcgccgc 60gcccgccgcc
ccgagctccc caagccttcg agagcggcgc acactcccgg tctccactcg 120ctcttccaac
acccgctcgt tttggcggca gctcgtgtcc cagagaccga gttgccccag 180agaccgagac
gccgccgctg cgaaggacca atgagagccc cgctgctacc gccggcgccg 240gtggtgctgt
cgctcttgat actcggctca ggccattatg ctgctggatt ggacctcaat 300gacacctact
ctgggaagcg tgaaccattt tctggggacc acagtgctga tggatttgag 360gttacctcaa
gaagtgagat gtcttcaggg agtgagattt cccctgtgag tgaaatgcct 420tctagtagtg
aaccgtcctc gggagccgac tatgactact cagaagagta tgataacgaa 480ccacaaatac
ctggctatat tgtcgatgat tcagtcagag ttgaacaggt agttaagccc 540ccccaaaaca
agacggaaag tgaaaatact tcagataaac ccaaaagaaa gaaaaaggga 600ggcaaaaatg
gaaaaaatag aagaaacaga aagaagaaaa atccatgtaa tgcagaattt 660caaaatttct
gcattcacgg agaatgcaaa tatatagagc acctggaagc agtaacatgc 720aaatgtcagc
aagaatattt cggtgaacgg tgtggggaaa agtccatgaa aactcacagc 780atgattgaca
gtagtttatc aaaaattgca ttagcagcca tagctgcctt tatgtctgct 840gtgatcctca
cagctgttgc tgttattaca gtccagctta gaagacaata cgtcaggaaa 900tatgaaggag
aagctgagga acgaaagaaa cttcgacaag agaatggaaa tgtacatgct 960atagcataac
tgaagataaa attacaggat atcacattgg agtcactgcc aagtcatagc 1020cataaatgat
gagtcggtcc tctttccagt ggatcataag acaatggacc ctttttgtta 1080tgatggtttt
aaactttcaa ttgtcacttt ttatgctatt tctgtatata aaggtgcacg 1140aaggtaaaaa
gtattttttc aagttgtaaa taatttattt aatatttaat ggaagtgtat 1200ttattttaca
gctcattaaa cttttttaac caaacagaaa aaaaaaaaaa aaaaaaaaaa 1260aaaaaaaaaa 1270
353 1600 DNA Homo sapiens 353
gccccgccgc cggcagtgga ccgctgtgcg cgaaccctga accctacggt
cccgacccgc 60gggcgaggcc gggtacctgg gctgggatcc ggagcaagcg ggcgagggca
gcgccctaag 120caggcccgga gcgatggcag ccttgatgac cccgggaacc ggggccccac
ccgcgcctgg 180tgacttctcc ggggaaggga gccagggact tcccgaccct tcgccagagc
ccaagcagct 240cccggagctg atccgcatga agcgagacgg aggccgcctg agcgaagcgg
acatcagggg 300cttcgtggcc gctgtggtga atgggagcgc gcagggcgca cagatcgggg
ccatgctgat 360ggccatccga cttcggggca tggatctgga ggagacctcg gtgctgaccc
aggccctggc 420tcagtcggga cagcagctgg agtggccaga ggcctggcgc cagcagcttg
tggacaagca 480ttccacaggg ggtgtgggtg acaaggtcag cctggtcctc gcacctgccc
tggcggcatg 540tggctgcaag gtgccaatga tcagcggacg tggtctgggg cacacaggag
gcaccttgga 600taagctggag tctattcctg gattcaatgt catccagagc ccagagcaga
tgcaagtgct 660gctggaccag gcgggctgct gtatcgtggg tcagagtgag cagctggttc
ctgcggacgg 720aatcctatat gcagccagag atgtgacagc caccgtggac agcctgccac
tcatcacagc 780ctccattctc agtaagaaac tcgtggaggg gctgtccgct ctggtggtgg
acgttaagtt 840cggaggggcc gccgtcttcc ccaaccagga gcaggcccgg gagctggcaa
agacgctggt 900tggcgtggga gccagcctag ggcttcgggt cgcggcagcg ctgaccgcca
tggacaagcc 960cctgggtcgc tgcgtgggcc acgccctgga ggtggaggag gcgctgctct
gcatggacgg 1020cgcaggcccg ccagacttaa gggacctggt caccacgctc gggggcgccc
tgctctggct 1080cagcggacac gcggggactc aggctcaggg cgctgcccgg gtggccgcgg
cgctggacga 1140cggctcggcc cttggccgct tcgagcggat gctggcggcg cagggcgtgg
atcccggtct 1200ggcccgagcc ctgtgctcgg gaagtcccgc agaacgccgg cagctgctgc
ctcgcgcccg 1260ggagcaggag gagctgctgg cgcccgcaga tggcaccgtg gagctggtcc
gggcgctgcc 1320gctggcgctg gtgctgcacg agctcggggc cgggcgcagc cgcgctgggg
agccgctccg 1380cctgggggtg ggcgcagagc tgctggtcga cgtgggtcag aggctgcgcc
gtgggacccc 1440ctggctccgc gtgcaccggg acggccccgc gctcagcggc ccgcagagcc
gcgccctgca 1500ggaggcgctc gtactctccg accgcgcgcc attcgccgcc ccctcgccct
tcgcagagct 1560cgttctgccg ccgcagcaat aaagctcctt tgccgcgaaa 1600
354 1842 DNA Homo sapiens 354
cgatcagatc gatctaagat ggcgactgtc
gaaccggaaa ccacccctac tcctaatccc 60ccgactacag aagaggagaa aacggaatct
aatcaggagg ttgctaaccc agaacactat 120attaaacatc ccctacagaa cagatgggca
ctctggtttt ttaaaaatga taaaagcaaa 180acttggcaag caaacctgcg gctgatctcc
aagtttgata ctgttgaaga cttttgggct 240ctgtacaacc atatccagtt gtctagtaat
ttaatgcctg gctgtgacta ctcacttttt 300aaggatggta ttgagcctat gtgggaagat
gagaaaaaca aacggggagg acgatggcta 360attacattga acaaacagca gagacgaagt
gacctcgatc gcttttggct agagacactt 420ctgtgcctta ttggagaatc ttttgatgac
tacagtgatg atgtatgtgg cgctgttgtt 480aatgttagag ctaaaggtga taagatagca
atatggacta ctgaatgtga aaacagagaa 540gctgttacac atatagggag ggtatacaag
gaaaggttag gacttcctcc aaagatagtg 600attggttatc agtcccacgc agacacagct
actaagagcg gctccaccac taaaaatagg 660tttgttgttt aagaagacac cttctgagta
ttctcatagg agactgcgtc aagcaatcga 720gatttgggag ctgaaccaaa gcctcttcaa
aaagcagagt ggactgcatt taaatttgat 780ttccatctta atgttactca gatataagag
aagtctcatt cgcctttgtc ttgtacttct 840gtgttcattt tttttttttt tttttggcta
gagtttccac tatcccaatc aaagaattac 900agtacacatc cccagaatcc ataaatgtgt
tcctggccca ctctgtaata gttcagtaga 960attaccatta attacataca gattttacct
atccacaata gtcagaaaac aacttggcat 1020ttctatactt tacaggaaaa aaaattctgt
tgttccattt tatgcagaag catattttgc 1080tggtttgaaa gattatgatg catacagttt
tctagcaatt ttctttgttt ctttttacag 1140cattgtcttt gctgtactct tgctgatggc
tgctagattt taatttattt gtttccctac 1200ttgataatat tagtgattct gatttcagtt
tttcatttgt tttgcttaaa tttttttttt 1260ttttttcctc atgtaacatt ggtgaaggat
ccaggaatat gacacaaagg tggaataaac 1320attaattttg tgcattcttt ggtaattttt
tttgtttttt gtaactacaa agctttgcta 1380caaatttatg catttcattc aaatcagtga
tctatgtttg tgtgatttcc taaacataat 1440tgtggattat aaaaaatgta acatcataat
tacattccta actagaatta gtatgtctgt 1500ttttgtatct ttatgctgta ttttaacact
ttgtattact taggttattt tgctttggtt 1560aaaaatggct caagtagaaa agcagtccca
ttcatattaa gacagtgtac aaaactgtaa 1620ataaaatgtg tacagtgaat tgtcttttag
acaactagat ttgtcctttt atttctccat 1680ctttatagaa ggaatttgta cttcttattg
caggcaagtc tctatattat gtcctctttt 1740gtggtgtctt ccatgtgaac agcataagtt
tggagcacta gtttgattat tatgtttatt 1800acaattttta ataaattgaa taggtagtat
catatatatg ga 1842
355 4975 DNA Homo sapiens 355
ctctcacaca cacacacccc tcccctgcca tccctccccg gactccggct ccggctccga
60ttgcaatttg caacctccgc tgccgtcgcc gcagcagcca ccaattcgcc agcggttcag
120gtggctcttg cctcgatgtc ctagcctagg ggcccccggg ccggacttgg ctgggctccc
180ttcaccctct gcggagtcat gagggcgaac gacgctctgc aggtgctggg cttgcttttc
240agcctggccc ggggctccga ggtgggcaac tctcaggcag tgtgtcctgg gactctgaat
300ggcctgagtg tgaccggcga tgctgagaac caataccaga cactgtacaa gctctacgag
360aggtgtgagg tggtgatggg gaaccttgag attgtgctca cgggacacaa tgccgacctc
420tccttcctgc agtggattcg agaagtgaca ggctatgtcc tcgtggccat gaatgaattc
480tctactctac cattgcccaa cctccgcgtg gtgcgaggga cccaggtcta cgatgggaag
540tttgccatct tcgtcatgtt gaactataac accaactcca gccacgctct gcgccagctc
600cgcttgactc agctcaccga gattctgtca gggggtgttt atattgagaa gaacgataag
660ctttgtcaca tggacacaat tgactggagg gacatcgtga gggaccgaga tgctgagata
720gtggtgaagg acaatggcag aagctgtccc ccctgtcatg aggtttgcaa ggggcgatgc
780tggggtcctg gatcagaaga ctgccagaca ttgaccaaga ccatctgtgc tcctcagtgt
840aatggtcact gctttgggcc caaccccaac cagtgctgcc atgatgagtg tgccgggggc
900tgctcaggcc ctcaggacac agactgcttt gcctgccggc acttcaatga cagtggagcc
960tgtgtacctc gctgtccaca gcctcttgtc tacaacaagc taactttcca gctggaaccc
1020aatccccaca ccaagtatca gtatggagga gtttgtgtag ccagctgtcc ccataacttt
1080gtggtggatc aaacatcctg tgtcagggcc tgtcctcctg acaagatgga agtagataaa
1140aatgggctca agatgtgtga gccttgtggg ggactatgtc ccaaagcctg tgagggaaca
1200ggctctggga gccgcttcca gactgtggac tcgagcaaca ttgatggatt tgtgaactgc
1260accaagatcc tgggcaacct ggactttctg atcaccggcc tcaatggaga cccctggcac
1320aagatccctg ccctggaccc agagaagctc aatgtcttcc ggacagtacg ggagatcaca
1380ggttacctga acatccagtc ctggccgccc cacatgcaca acttcagtgt tttttccaat
1440ttgacaacca ttggaggcag aagcctctac aaccggggct tctcattgtt gatcatgaag
1500aacttgaatg tcacatctct gggcttccga tccctgaagg aaattagtgc tgggcgtatc
1560tatataagtg ccaataggca gctctgctac caccactctt tgaactggac caaggtgctt
1620cgggggccta cggaagagcg actagacatc aagcataatc ggccgcgcag agactgcgtg
1680gcagagggca aagtgtgtga cccactgtgc tcctctgggg gatgctgggg cccaggccct
1740ggtcagtgct tgtcctgtcg aaattatagc cgaggaggtg tctgtgtgac ccactgcaac
1800tttctgaatg gggagcctcg agaatttgcc catgaggccg aatgcttctc ctgccacccg
1860gaatgccaac ccatgggggg cactgccaca tgcaatggct cgggctctga tacttgtgct
1920caatgtgccc attttcgaga tgggccccac tgtgtgagca gctgccccca tggagtccta
1980ggtgccaagg gcccaatcta caagtaccca gatgttcaga atgaatgtcg gccctgccat
2040gagaactgca cccaggggtg taaaggacca gagcttcaag actgtttagg acaaacactg
2100gtgctgatcg gcaaaaccca tctgacaatg gctttgacag tgatagcagg attggtagtg
2160attttcatga tgctgggcgg cacttttctc tactggcgtg ggcgccggat tcagaataaa
2220agggctatga ggcgatactt ggaacggggt gagagcatag agcctctgga ccccagtgag
2280aaggctaaca aagtcttggc cagaatcttc aaagagacag agctaaggaa gcttaaagtg
2340cttggctcgg gtgtctttgg aactgtgcac aaaggagtgt ggatccctga gggtgaatca
2400atcaagattc cagtctgcat taaagtcatt gaggacaaga gtggacggca gagttttcaa
2460gctgtgacag atcatatgct ggccattggc agcctggacc atgcccacat tgtaaggctg
2520ctgggactat gcccagggtc atctctgcag cttgtcactc aatatttgcc tctgggttct
2580ctgctggatc atgtgagaca acaccggggg gcactggggc cacagctgct gctcaactgg
2640ggagtacaaa ttgccaaggg aatgtactac cttgaggaac atggtatggt gcatagaaac
2700ctggctgccc gaaacgtgct actcaagtca cccagtcagg ttcaggtggc agattttggt
2760gtggctgacc tgctgcctcc tgatgataag cagctgctat acagtgaggc caagactcca
2820attaagtgga tggcccttga gagtatccac tttgggaaat acacacacca gagtgatgtc
2880tggagctatg gtgtgacagt ttgggagttg atgaccttcg gggcagagcc ctatgcaggg
2940ctacgattgg ctgaagtacc agacctgcta gagaaggggg agcggttggc acagccccag
3000atctgcacaa ttgatgtcta catggtgatg gtcaagtgtt ggatgattga tgagaacatt
3060cgcccaacct ttaaagaact agccaatgag ttcaccagga tggcccgaga cccaccacgg
3120tatctggtca taaagagaga gagtgggcct ggaatagccc ctgggccaga gccccatggt
3180ctgacaaaca agaagctaga ggaagtagag ctggagccag aactagacct agacctagac
3240ttggaagcag aggaggacaa cctggcaacc accacactgg gctccgccct cagcctacca
3300gttggaacac ttaatcggcc acgtgggagc cagagccttt taagtccatc atctggatac
3360atgcccatga accagggtaa tcttgggggg tcttgccagg agtctgcagt ttctgggagc
3420agtgaacggt gcccccgtcc agtctctcta cacccaatgc cacggggatg cctggcatca
3480gagtcatcag aggggcatgt aacaggctct gaggctgagc tccaggagaa agtgtcaatg
3540tgtagaagcc ggagcaggag ccggagccca cggccacgcg gagatagcgc ctaccattcc
3600cagcgccaca gtctgctgac tcctgttacc ccactctccc cacccgggtt agaggaagag
3660gatgtcaacg gttatgtcat gccagataca cacctcaaag gtactccctc ctcccgggaa
3720ggcacccttt cttcagtggg tctcagttct gtcctgggta ctgaagaaga agatgaagat
3780gaggagtatg aatacatgaa ccggaggaga aggcacagtc cacctcatcc ccctaggcca
3840agttcccttg aggagctggg ttatgagtac atggatgtgg ggtcagacct cagtgcctct
3900ctgggcagca cacagagttg cccactccac cctgtaccca tcatgcccac tgcaggcaca
3960actccagatg aagactatga atatatgaat cggcaacgag atggaggtgg tcctgggggt
4020gattatgcag ccatgggggc ctgcccagca tctgagcaag ggtatgaaga gatgagagct
4080tttcaggggc ctggacatca ggccccccat gtccattatg cccgcctaaa aactctacgt
4140agcttagagg ctacagactc tgcctttgat aaccctgatt actggcatag caggcttttc
4200cccaaggcta atgcccagag aacgtaactc ctgctccctg tggcactcag ggagcattta
4260atggcagcta gtgcctttag agggtaccgt cttctcccta ttccctctct ctcccaggtc
4320ccagcccctt ttccccagtc ccagacaatt ccattcaatc tttggaggct tttaaacatt
4380ttgacacaaa attcttatgg tatgtagcca gctgtgcact ttcttctctt tcccaacccc
4440aggaaaggtt ttccttattt tgtgtgcttt cccagtccca ttcctcagct tcttcacagg
4500cactcctgga gatatgaagg attactctcc atatcccttc ctctcaggct cttgactact
4560tggaactagg ctcttatgtg tgcctttgtt tcccatcaga ctgtcaagaa gaggaaaggg
4620aggaaaccta gcagaggaaa gtgtaatttt ggtttatgac tcttaacccc ctagaaagac
4680agaagcttaa aatctgtgaa gaaagaggtt aggagtagat attgattact atcataattc
4740agcacttaac tatgagccag gcatcatact aaacttcacc tacattatct cacttagtcc
4800tttatcatcc ttaaaacaat tctgtgacat acatattatc tcattttaca caaagggaag
4860tcgggcatgg tggctcatgc ctgtaatctc agcactttgg gaggctgagg cagaaggatt
4920acctgaggca aggagtttga gaccagctta gccaacatag taagaccccc atctc
4975
356 4627 DNA Homo sapiens 356
tcacttgcct gatatttcca gtgtcagagg gacacagcca
acgtggggtc ccttctaggc 60tgacagccgc tctccagcca ctgccgcgag cccgtctgct
cccgccctgc ccgtgcactc 120tccgcagccg ccctccgcca agccccagcg cccgctccca
tcgccgatga ccgcggggag 180gaggatggag atgctctgtg ccggcagggt ccctgcgctg
ctgctctgcc tgggtttcca 240tcttctacag gcagtcctca gtacaactgt gattccatca
tgtatcccag gagagtccag 300tgataactgc acagctttag ttcagacaga agacaatcca
cgtgtggctc aagtgtcaat 360aacaaagtgt agctctgaca tgaatggcta ttgtttgcat
ggacagtgca tctatctggt 420ggacatgagt caaaactact gcaggtgtga agtgggttat
actggtgtcc gatgtgaaca 480cttcttttta accgtccacc aacctttaag caaagagtat
gtggctttga ccgtgattct 540tattattttg tttcttatca cagtcgtcgg ttccacatat
tatttctgca gatggtacag 600aaatcgaaaa agtaaagaac caaagaagga atatgagaga
gttacctcag gggatccaga 660gttgccgcaa gtctgaatgg cgccatcaaa cttatgggca
gggataacag tgtgcctggt 720taatattaat attccatttt attaataata tttatgttgg
gtcaagtgtt aggtcaataa 780cactgtattt taatgtactt gaaaaatgtt tttatttttg
ttttattttt gacagactat 840ttgctaatgt ataatgtgca gaaaatattt aatatcaaaa
gaaaattgat atttttatac 900aagtaatttc ctgagctaaa tgcttcattg aaagcttcaa
agtttatatg cctggtgcac 960agtgcttaga agtaagcaat tcccaggtca tagctcaaga
attgttagca aatgacagat 1020ttctgtaagc ctatatatat agtcaaatcg atttagtaag
tatgtttttt atgttcctca 1080aatcagtgat aattggtttg actgtaccat ggtttgatat
gtagttggca ccatggtatc 1140atatattaaa acaataatgc aattagaatt tgggagaagc
aaatataggt cctgtgttaa 1200acactacaca tttgaaacaa gctaaccctg gggagtctat
ggtctcttca ctcaggtctc 1260agctataatt ctgttatatg aggggcagtg gacagttccc
tatgccaact cacgactcct 1320acaggtacta gtcactcatc taccagattc tgcctatgta
aaatgaattg aaaaacaatt 1380ttctgtaatc ttttatttaa gtagtgggca tttcatagct
tcacaatgtt ccttttttgt 1440atattacaac atttatgtga ggtaattatt gctcaacaga
caattagaaa aaagtccaca 1500cttgaagcct aaatttgtgc tttttaagaa tatttttaga
ctatttcttt ttataggggc 1560tttgctgaat tctaacatta aatcacagcc caaaatttga
tggactaatt attattttaa 1620aatatatgaa gacaataatt ctacatgttg tcttaagatg
gaaatacagt tatttcatct 1680tttattcaag gaagttttaa ctttaataca gctcagtaaa
tggcttcttc tagaatgtaa 1740agttatgtat ttaaagttgt atcttgacac aggaaatggg
aaaaaactta aaaattaata 1800tggtgtattt ttccaaatga aaaatctcaa ttgaaagctt
ttaaaatgta gaaacttaaa 1860cacaccttcc tgtggaggct gagatgaaaa ctagggctca
ttttcctgac atttgtttat 1920tttttggaag agacaaagat ttcttctgca ctctgagccc
ataggtctca gagagttaat 1980aggagtattt ttgggctatt gcataaggag ccactgctgc
caccactttt ggattttatg 2040ggaggctcct tcatcgaatg ctaaaccttt gagtagagtc
tccctggatc acataccagg 2100tcagggagga tctgttcttc ctctacgttt atcctggcat
gtgctagggt aaacgaaggc 2160ataataagcc atggctgacc tctggagcac caggtgccag
gacttgtctc catgtgtatc 2220catgcattat ataccctggt gcaatcacac gactgtcatc
taaagtcctg gccctggccc 2280ttactattag gaaaataaac agacaaaaac aagtaaatat
atatggtcct atacatattg 2340tatatatatt catatacaaa catgtatgta tacatgacct
taatggatca tagaattgca 2400gtcatttggt gctctgctaa ccatttatat aaaacttaaa
aacaagagaa aagaaaaatc 2460aattagatct aaacagttat ttctgtttcc tatttaatat
agctgaagtc aaaatatgta 2520agaacacatt ttaaatactc tacttacagt tggccctctg
tggttagttc cacatctgtg 2580gattcaacca accaaggacg gaaaatgctt aaaaaataat
acaacaacaa caaaaaatac 2640attataacaa ctatttactt tttttttttt ctttttgaga
tggagtctcg ctctgttgcc 2700caggttggag tgcagtggca cgatctcggc tcactgcaac
ctcacctccc gggttcaaga 2760gatcctcctg cctcagcctc ctgagcagct gggactacag
gcgcatgcca ccatgcccag 2820ctaatttttg tatttttagt agaggcgggg tttcaccatg
ttggccagga tggtctcaat 2880ctcctaacct tgagatccac cctccacagc ctcccaaact
gctgggatta caggcgtgag 2940ccaccgcacg tagcatttac attaggtatt acaagtaatg
taaagatgat ttaagtatac 3000aggaggatgt gaataggtta tatgcaagca ctatgccctt
ttatataagt gacttgaaca 3060tctgtgcccg attttagtat gtgcaggggg gcgatctggg
aatcagtccc ctgtggatac 3120caaggtacaa ctgtatttat taacgcttac tagatgtgag
gagagtctga atattttcag 3180tgatcttggc tgtttcaaaa aaatctattg acttttcaat
aaatcagctg caatccattt 3240atttcattta caaaagattt attgtaagcc tctcaatctt
ggtttttcag ttgatcttaa 3300gcatgtcaat tcataaaaac aagtcatttt tgtatttttc
atctttaaga atgcttaaaa 3360aagctaatcc ctaaaatagt tagatctttg taaatgcata
ttaaataata aagtatgacc 3420cacattactt tttatgggtg aaaataagac aaaaataata
gttttagtga ggatggtgct 3480gagtaaacat aaaaactgat ttgctctcag ctgatgtgtc
ctgtacacag tgggaagatt 3540ttagttcaca cttagtctaa ctcccccatt ttacagattt
ctcactatat atatttctag 3600aaggggctat gcatattcaa tgtattgaga accaaagcaa
ccacaaatgc ataaatgcat 3660aatttatggt cttcaaccaa ggccacataa taacccagtt
aacttactct ttaaccagga 3720atattaagtt ctataactag tactcaaggt ttaaccttaa
aattaagatt tccttaacct 3780taaccttaaa attgatatta tattaaacat acataataca
atgtaactcc actgttctcc 3840tgaatatttt ttgctctaat ctctctgccg aaagtcaaag
tgatgggaga attggtatac 3900tggtatgact acgtcttaag tcagattttt atttatgagt
ctttgagact aaattcaatc 3960accaccaggt atcaaatcaa cttttatgca gcaaatatat
gattctagtg tctgactttt 4020gttaaattca gtaatgcagt ttttaaaaac ctgtatctga
cccactttgt aatttttgct 4080ccaatatcca ttctgtagac ttttgaaaaa aaagttttta
atttgatgcc caatatattc 4140tgaccgttaa aaaattcttg ttcatatggg agaaggggga
gtaatgactt gtacaaacag 4200tatttctggt gtatatttta atgtttttaa aaagagtaat
ttcatttaaa tatctgttat 4260tcaaatttga tgatgttaaa tgtaatataa tgtattttct
ttttattttg cactctgtaa 4320ttgcactttt taagtttgaa gagccatttt ggtaaacggt
ttttattaaa gatgctatgg 4380aacataaagt tgtattgcat gcaatttaaa gtaacttatt
tgactatgaa tattatcgga 4440ttactgaatt gtatcaattt gtttgtgttc aatatcagct
ttgataattg tgtaccttaa 4500gatattgaag gagaaaatag ataatttaca agatattatt
aatttttatt tatttttctt 4560gggaattgaa aaaaattgaa ataaataaaa atgcattgaa
catcttgcat tcaaaatctt 4620cactgac
4627
357 2634 DNA Homo sapiens 357
ggcacgaggc tgagtgtccg
tctcgcgccc ggaagcgggc gaccgccgtc agcccggagg 60aggaggagga ggaggaggag
gagggggcgg ccatggggct gctgtcccag ggctcgccgc 120tgagctggga ggaaaccaag
cgccatgccg accacgtgcg gcggcacggg atcctccagt 180tcctgcacat ctaccacgcc
gtcaaggacc ggcacaagga cgttctcaag tggggcgatg 240aggtggaata catgttggta
tcttttgatc atgaaaataa aaaagtccgg ttggtcctgt 300ctggggagaa agttcttgaa
actctgcaag agaaggggga aaggacaaac ccaaaccatc 360ctaccctttg gagaccagag
tatgggagtt acatgattga agggacacca ggacagccct 420acggaggaac aatgtccgag
ttcaatacag ttgaggccaa catgcgaaaa cgccggaagg 480aggctacttc tatattagaa
gaaaatcagg ctctttgcac aataacttca tttcccagat 540taggctgtcc tgggttcaca
ctgcccgagg tcaaacccaa cccagtggaa ggaggagctt 600ccaagtccct cttctttcca
gatgaagcaa taaacaagca ccctcgcttc agtaccttaa 660caagaaatat ccgacatagg
agaggagaaa aggttgtcat caatgtacca atatttaagg 720acaagaatac accatctcca
tttatagaaa catttactga ggatgatgaa gcttcaaggg 780cttctaagcc ggatcatatt
tacatggatg ccatgggatt tggaatgggc aattgctgtc 840tccaggtgac attccaagcc
tgcagtatat ctgaggccag atacctttat gatcagttgg 900ctactatctg tccaattgtt
atggctttga gtgctgcatc tcccttttac cgaggctatg 960tgtcagacat tgattgtcgc
tggggagtga tttctgcatc tgtagatgat agaactcggg 1020aggagcgagg actggagcca
ttgaagaaca ataactatag gatcagtaaa tcccgatatg 1080actcaataga cagctattta
tctaagtgtg gtgagaaata taatgacatc gacttgacga 1140tagataaaga gatctacgaa
cagctgttgc aggaaggcat tgatcatctc ctggcccagc 1200atgttgctca tctctttatt
agagacccac tgacactgtt tgaagagaaa atacacctgg 1260atgatgctaa tgagtctgac
cattttgaga atattcagtc cacaaattgg cagacaatga 1320gatttaagcc ccctcctcca
aactcagaca ttggatggag agtagaattt cgacccatgg 1380aggtgcaatt aacagacttt
gagaactctg cctatgtggt gtttgtggta ctgctcacca 1440gagtgatcct ttcctacaaa
ttggattttc tcattccact gtcaaaggtt gatgagaaca 1500tgaaggtagc acagaaaaga
gatgctgtct tgcagggaat gttttatttc aggaaagata 1560tttgcaaagg tggcaatgca
gtggtggatg gttgtggcaa ggcccagaac agcacggagc 1620tcgctgcaga ggagtacacc
ctcatgagca tagacaccat catcaatggg aaggaaggtg 1680tgtttcctgg actgatccca
attctgaact cttaccttga aaacatggaa gtggatgtgg 1740acaccagatg tagtattctg
aactacctaa agctaattaa gaagagagca tctggagaac 1800taatgacagt tgccagatgg
atgagggagt ttatcgcaaa ccatcctgac tacaagcaag 1860acagtgtcat aactgatgaa
atgaattata gccttatttt gaagtgtaac caaattgcaa 1920atgaattatg tgaatgccca
gagttacttg gatcagcatt taggaaagta aaatatagtg 1980gaagtaaaac tgactcatcc
aactagacat tctacagaaa gaaaaatgca ttattgacga 2040actggctaca gtaccatgcc
tctcagcccg tgtgtataat atgaagacca aatgatagaa 2100ctgtactgtt ttctgggcca
gtgagccaga aattgattaa ggctttcttt ggtaggtaaa 2160tctagagttt atacagtgta
catgtacata gtaaagtatt tttgattaac aatgtatttt 2220aataacatat ctaaagtcat
catgaactgg cttgtacatt tttaaattct tactctggag 2280caacctactg tctaagcagt
tttgtaaatg tactggtaat tgtacaatac ttgcattcca 2340gagttaaaat gtttactgta
aatttttgtt cttttaaaga ctacctggga cctgatttat 2400tgaaattttt ctctttaaaa
acattttctc tcgttaattt tcctttgtca tttcctttgt 2460tgtctacatt aaatcacttg
aatccattga aagtgcttca agggtaatct tgggtttcta 2520gcaccttatc tatgatgttt
cttttgcaat tggaataatc acttggtcac cttgccccaa 2580gctttcccct ctgaataaat
acccattgaa ctctgaaaaa aaaaaaaaaa aaaa 2634
358 1246 DNA Homo sapiens 358
gaccagccta cagccgcctg catctgtatc cagcgccagg tcccgccagt cccagctgcg
60cgcgcccccc agtcccgcac ccgttcggcc caggctaagt tagccctcac catgccggtc
120aaaggaggca ccaagtgcat caaatacctg ctgttcggat ttaacttcat cttctggctt
180gccgggattg ctgtccttgc cattggacta tggctccgat tcgactctca gaccaagagc
240atcttcgagc aagaaactaa taataataat tccagcttct acacaggagt ctatattctg
300atcggagccg gcgccctcat gatgctggtg ggcttcctgg gctgctgcgg ggctgtgcag
360gagtcccagt gcatgctggg actgttcttc ggcttcctct tggtgatatt cgccattgaa
420atagctgcgg ccatctgggg atattcccac aaggatgagg tgattaagga agtccaggag
480ttttacaagg acacctacaa caagctgaaa accaaggatg agccccagcg ggaaacgctg
540aaagccatcc actatgcgtt gaactgctgt ggtttggctg ggggcgtgga acagtttatc
600tcagacatct gccccaagaa ggacgtactc gaaaccttca ccgtgaagtc ctgtcctgat
660gccatcaaag aggtcttcga caataaattc cacatcatcg gcgcagtggg catcggcatt
720gccgtggtca tgatatttgg catgatcttc agtatgatct tgtgctgtgc tatccgcagg
780aaccgcgaga tggtctagag tcagcttaca tccctgagca ggaaagttta cccatgaaga
840ttggtgggat tttttgtttg tttgttttgt tttgtttgtt gtttgttgtt tgtttttttg
900ccactaattt tagtattcat tctgcattgc tagataaaag ctgaagttac tttatgtttg
960tcttttaatg cttcattcaa tattgacatt tgtagttgag cggggggttt ggtttgcttt
1020ggtttatatt ttttcagttg tttgtttttg cttgttatat taagcagaaa tcctgcaatg
1080aaaggtacta tatttgctag actctagaca agatattgta cataaaagaa tttttttgtc
1140tttaaataga tacaaatgtc tatcaacttt aatcaagttg taacttatat tgaagacaat
1200ttgatacata ataaaaaatt atgacaatgt caaaaaaaaa aaaaaa
1246
359 2360 DNA Homo sapiens 359
gctacgcggg ccacgctgct ggctggcctg acctaggcgc
gcggggtcgg gcggccgcgc 60gggcgggctg agtgagcaag acaagacact caagaagagc
gagctgcgcc tgggtcccgg 120ccaggcttgc acgcagaggc gggcggcaga cggtgcccgg
cggaatctcc tgagctccgc 180cgcccagctc tggtgccagc gcccagtggc cgccgcttcg
aaagtgactg gtgcctcgcc 240gcctcctctc ggtgcgggac catgaagctg ctgccgtcgg
tggtgctgaa gctctttctg 300gctgcagttc tctcggcact ggtgactggc gagagcctgg
agcggcttcg gagagggcta 360gctgctggaa ccagcaaccc ggaccctccc actgtatcca
cggaccagct gctaccccta 420ggaggcggcc gggaccggaa agtccgtgac ttgcaagagg
cagatctgga ccttttgaga 480gtcactttat cctccaagcc acaagcactg gccacaccaa
acaaggagga gcacgggaaa 540agaaagaaga aaggcaaggg gctagggaag aagagggacc
catgtcttcg gaaatacaag 600gacttctgca tccatggaga atgcaaatat gtgaaggagc
tccgggctcc ctcctgcatc 660tgccacccgg gttaccatgg agagaggtgt catgggctga
gcctcccagt ggaaaatcgc 720ttatatacct atgaccacac aaccatcctg gccgtggtgg
ctgtggtgct gtcatctgtc 780tgtctgctgg tcatcgtggg gcttctcatg tttaggtacc
ataggagagg aggttatgat 840gtggaaaatg aagagaaagt gaagttgggc atgactaatt
cccactgaga gagacttgtg 900ctcaaggaat cggctgggga ctgctacctc tgagaagaca
caaggtgatt tcagactgca 960gaggggaaag acttccatct agtcacaaag actccttcgt
ccccagttgc cgtctaggat 1020tgggcctccc ataattgctt tgccaaaata ccagagcctt
caagtgccaa acagagtatg 1080tccgatggta tctgggtaag aagaaagcaa aagcaaggga
ccttcatgcc cttctgattc 1140ccctccacca aaccccactt cccctcataa gtttgtttaa
acacttatct tctggattag 1200aatgccggtt aaattccata tgctccagga tctttgactg
aaaaaaaaaa agaagaagaa 1260gaaggagagc aagaaggaaa gatttgtgaa ctggaagaaa
gcaacaaaga ttgagaagcc 1320atgtactcaa gtaccaccaa gggatctgcc attgggaccc
tccagtgctg gatttgatga 1380gttaactgtg aaataccaca agcctgagaa ctgaattttg
ggacttctac ccagatggaa 1440aaataacaac tatttttgtt gttgttgttt gtaaatgcct
cttaaattat atatttattt 1500tattctatgt atgttaattt atttagtttt taacaatcta
acaataatat ttcaagtgcc 1560tagactgtta ctttggcaat ttcctggccc tccactcctc
atccccacaa tctggcttag 1620tgccacccac ctttgccaca aagctaggat ggttctgtga
cccatctgta gtaatttatt 1680gtctgtctac atttctgcag atcttccgtg gtcagagtgc
cactgcggga gctctgtatg 1740gtcaggatgt aggggttaac ttggtcagag ccactctatg
agttggactt cagtcttgcc 1800taggcgattt tgtctaccat ttgtgttttg aaagcccaag
gtgctgatgt caaagtgtaa 1860cagatatcag tgtctccccg tgtcctctcc ctgccaagtc
tcagaagagg ttgggcttcc 1920atgcctgtag ctttcctggt ccctcacccc catggcccca
ggccacagcg tgggaactca 1980ctttcccttg tgtcaagaca tttctctaac tcctgccatt
cttctggtgc tactccatgc 2040aggggtcagt gcagcagagg acagtctgga gaaggtatta
gcaaagcaaa aggctgagaa 2100ggaacaggga acattggagc tgactgttct tggtaactga
ttacctgcca attgctaccg 2160agaaggttgg aggtggggaa ggctttgtat aatcccaccc
acctcaccaa aacgatgaag 2220gtatgctgtc atggtccttt ctggaagttt ctggtgccat
ttctgaactg ttacaacttg 2280tatttccaaa cctggttcat atttatactt tgcaatccaa
ataaagataa cccttattcc 2340ataaaaaaaa aaaaaaaaaa
2360
360 1433 DNA Homo sapiens 360
attcggggcg agggaggagg
aagaagcgga ggaggcggct cccgctcgca gggccgtgca 60cctgcccgcc cgcccgctcg
ctcgctcgcc cgccgcgccg cgctgccgac cgccagcatg 120ctgccgagag tgggctgccc
cgcgctgccg ctgccgccgc cgccgctgct gccgctgctg 180ccgctgctgc tgctgctact
gggcgcgagt ggcggcggcg gcggggcgcg cgcggaggtg 240ctgttccgct gcccgccctg
cacacccgag cgcctggccg cctgcgggcc cccgccggtt 300gcgccgcccg ccgcggtggc
cgcagtggcc ggaggcgccc gcatgccatg cgcggagctc 360gtccgggagc cgggctgcgg
ctgctgctcg gtgtgcgccc ggctggaggg cgaggcgtgc 420ggcgtctaca ccccgcgctg
cggccagggg ctgcgctgct atccccaccc gggctccgag 480ctgcccctgc aggcgctggt
catgggcgag ggcacttgtg agaagcgccg ggacgccgag 540tatggcgcca gcccggagca
ggttgcagac aatggcgatg accactcaga aggaggcctg 600gtggagaacc acgtggacag
caccatgaac atgttgggcg ggggaggcag tgctggccgg 660aagcccctca agtcgggtat
gaaggagctg gccgtgttcc gggagaaggt cactgagcag 720caccggcaga tgggcaaggg
tggcaagcat caccttggcc tggaggagcc caagaagctg 780cgaccacccc ctgccaggac
tccctgccaa caggaactgg accaggtcct ggagcggatc 840tccaccatgc gccttccgga
tgagcggggc cctctggagc acctctactc cctgcacatc 900cccaactgtg acaagcatgg
cctgtacaac ctcaaacagt gcaagatgtc tctgaacggg 960cagcgtgggg agtgctggtg
tgtgaacccc aacaccggga agctgatcca gggagccccc 1020accatccggg gggaccccga
gtgtcatctc ttctacaatg agcagcagga ggcttgcggg 1080gtgcacaccc agcggatgca
gtagaccgca gccagccggt gcctggcgcc cctgcccccc 1140gcccctctcc aaacaccggc
agaaaacgga gagtgcttgg gtggtgggtg ctggaggatt 1200ttccagttct gacacacgta
tttatatttg gaaagagacc agcaccgagc tcggcacctc 1260cccggcctct ctcttcccag
ctgcagatgc cacacctgct ccttcttgct ttccccgggg 1320gaggaagggg gttgtggtcg
gggagctggg gtacaggttt ggggaggggg aagagaaatt 1380tttatttttg aacccctgtg
tcccttttgc ataagattaa aggaaggaaa agt 1433
361 1632 DNA Homo sapiens 361
gccggccgaa cccagacccg aggttttaga agcagagtca ggcgaagctg ggccagaacc
60gcgacctccg caaccttgag cggcatccgt ggagtgcgcc tgcgcagcta cgaccgcagc
120aggaaagcgc cgccggccag gcccagctgt ggccggacag ggactggaag agaggacgcg
180gtcgagtagg tgtgcaccag ccctggcaac gagagcgtct accccgaact ctgctggcct
240tgaggtgggg aagccgggga gggcagttga ggaccccgcg gaggcgcgtg actggttgag
300cgggcaggcc agcctccgag ccgggtggac acaggtttta aaacatgaat cctacactca
360tccttgctgc cttttgcctg ggaattgcct cagctactct aacatttgat cacagtttag
420aggcacagtg gaccaagtgg aaggcgatgc acaacagatt atacggcatg aatgaagaag
480gatggaggag agcagtgtgg gagaagaaca tgaagatgat tgaactgcac aatcaggaat
540acagggaagg gaaacacagc ttcacaatgg ccatgaacgc ctttggagac atgaccagtg
600aagaattcag gcaggtgatg aatggctttc aaaaccgtaa gcccaggaag gggaaagtgt
660tccaggaacc tctgttttat gaggccccca gatctgtgga ttggagagag aaaggctacg
720tgactcctgt gaagaatcag ggtcagtgtg gttcttgttg ggcttttagt gctactggtg
780ctcttgaagg acagatgttc cggaaaactg ggaggcttat ctcactgagt gagcagaatc
840tggtagactg ctctgggcct caaggcaatg aaggctgcaa tggtggccta atggattatg
900ctttccagta tgttcaggat aatggaggcc tggactctga ggaatcctat ccatatgagg
960caacagaaga atcctgtaag tacaatccca agtattctgt tgctaatgac accggctttg
1020tggacatccc taagcaggag aaggccctga tgaaggcagt tgcaactgtg gggcccattt
1080ctgttgctat tgatgcaggt catgagtcct tcctgttcta taaagaaggc atttattttg
1140agccagactg tagcagtgaa gacatggatc atggtgtgct ggtggttggc tacggatttg
1200aaagcacaga atcagataac aataaatatt ggctggtgaa gaacagctgg ggtgaagaat
1260ggggcatggg tggctacgta aagatggcca aagaccggag aaaccattgt ggaattgcct
1320cagcagccag ctaccccact gtgtgagctg gtggacggtg atgaggaagg acttgactgg
1380ggatggcgca tgcatgggag gaattcatct tcagtctacc agcccccgct gtgtcggata
1440cacactcgaa tcattgaaga tccgagtgtg atttgaattc tgtgatattt tcacactggt
1500aaatgttacc tctattttaa ttactgctat aaataggttt atattattga ttcacttact
1560gactttgcat tttcgttttt aaaaggatgt ataaattttt acctgtttaa ataaaattta
1620atttcaaatg ta
1632
362 2756 DNA Homo sapiens 362
atgctgtcct tccagtaccc cgacgtgtac cgcgacgaga
ccgccgtaca ggattatcat 60ggtcataaaa tttgtgaccc ttacgcctgg cttgaagacc
ccgacagtga acagactaag 120gcctttgtgg aggcccagaa taagattact gtgccatttc
ttgagcagtg tcccatcaga 180ggtttataca aagagagaat gactgaacta tatgattatc
ccaagtatag ttgccacttc 240aagaaaggaa aacggtattt ttatttttac aatacaggtt
tgcagaacca gcgagtatta 300tatgtacagg attccttaga gggtgaggcc agagtgttcc
tggaccccaa catactgtct 360gacgatggca cagtggcact ccgaggttat gcgttcagcg
aagatggtga atattttgcc 420tatggtctga gtgccagtgg ctcagactgg gtgacaatca
agttcatgaa agttgatggt 480gccaaagagc ttccagatgt gcttgaaaga gtcaagttca
gctgtatggc ctggacccat 540gatgggaagg gaatgttcta caactcatac cctcaacagg
atggaaaaag tgatggcaca 600gagacatcta ccaatctcca ccaaaagctc tactaccatg
tcttgggaac cgatcagtca 660gaagatattt tgtgtgctga gtttcctgat gaacctaaat
ggatgggtgg agctgagtta 720tctgatgatg gccgctatgt cttgttatca ataagggaag
gatgtgatcc agtaaaccga 780ctctggtact gtgacctaca gcaggaatcc agtggcatcg
cgggaatcct gaagtgggta 840aaactgattg acaactttga aggggaatat gactacgtga
ccaatgaggg ggcggtgttc 900acattcaaga cgaatcgcca gtctcccaac tatcgcgtga
tcaacattga cttcagggat 960cctgaagagt ctaagtggaa agtacttgtt cctgagcatg
agaaagatgt cttagaatgg 1020atagcttgtg tcaggtccaa cttcttggtc ttatgctacc
tccatgacgt caagaacatt 1080ctgcagctcc atgacctgac tactggtgct ctccttaaga
ccttcccgct cgatgtcggc 1140agcattgtag ggtacagcgg tcagaagaag gacactgaaa
tcttctatca gtttacttcc 1200tttttatctc caggtatcat ttatcactgt gatcttacca
aagaggagct ggagccaaga 1260gttttccgag aggtgaccgt aaaaggaatt gatgcttctg
attaccagac agtccagatt 1320ttctacccta gcaaggatgg tacgaagatt ccaatgttca
ttgtgcataa aaaaagcata 1380aaattggatg gctctcatcc agctttctta tatggctatg
gcggcttcaa catatccatc 1440acacccaact acagtgtttc caggcttatt tttgtgagac
acatgggtgg tatcctggca 1500gtggccaaca tcagaggagg tggcgaatat ggagagacgt
ggcataaagg tggtatcttg 1560gccaacaaac aaaactgctt tgatgacttt cagtgtgctg
ctgagtatct gatcaaggaa 1620ggttacacat ctcccaagag gctgactatt aatggaggtt
caaatggagg cctcttagtg 1680gctgcttgtg caaatcagag acctgacctc tttggttgtg
ttattgccca agttggagta 1740atggacatgc tgaagtttca taaatatacc atcggccatg
cttggaccac tgattatggg 1800tgctcggaca gcaaacaaca ctttgaatgg cttgtcaaat
actctccatt gcataatgtg 1860aagttaccag aagcagatga catccagtac ccgtccatgc
tgctcctcac tgctgaccat 1920gatgaccgcg tggtcccgct tcactccctg aagttcattg
ccacccttca gtacatcgtg 1980ggccgcagca ggaagcaaag caaccccctg cttatccacg
tggacaccaa ggcgggccac 2040ggggcgggga agcccacagc caaagtgata gaggaagtct
cagacatgtt tgcgttcatc 2100gcgcggtgcc tgaacgtcga ctggattcca taaacagttt
tcgtgcttcc tcctgacagc 2160gacagaaaac ctcaagggct ttcccacgtt gacaccaaga
aaccactggg cataatgctt 2220ccccacggga acattattcc tggactgaca ggctacagtt
gaacagaact gccgtgggaa 2280ttttatcttt tttaggcttc tcctttttag caaggccttg
gtgtttcttt ttccaccctg 2340tctaggcaca tgtggttttt tggtgttttt tttaagggca
tgttgggata aatagctaaa 2400tggcaacaaa cacattgtga atattagatt gctgaattaa
ggatcatagt cgggcatact 2460tatctatatc cataacctct atatctttaa ataaatgtga
gaactgttct catggagaag 2520acttctttgc aacaataata aatgttattt aagaatgaca
gggatttact tccggtttct 2580tcatattgag gggcaactcc agaagtggag ttttctgtga
gaataaagca tttcaccttt 2640ctgcaacaag ttagttttca agcagttaag tcatagaatg
tttgttagct gtgaaaataa 2700gttgttcatc caaaaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaag gaattc 2756
363 2768 DNA Homo sapiens 363
cactgctgtg
cagggcagga aagctccatg cacatagccc agcaaagagc aacacagagc 60tgaaaggaag
actcagagga gagagataag taaggaaagt agtgatggct ctcatcccag 120acttggccat
ggaaacctgg cttctcctgg ctgtcagcct ggtgctcctc tatctatatg 180gaacccattc
acatggactt tttaagaagc ttggaattcc agggcccaca cctctgcctt 240ttttgggaaa
tattttgtcc taccataagg gcttttgtat gtttgacatg gaatgtcata 300aaaagtatgg
aaaagtgtgg ggcttttatg atggtcaaca gcctgtgctg gctatcacag 360atcctgacat
gatcaaaaca gtgctagtga aagaatgtta ttctgtcttc acaaaccgga 420ggccttttgg
tccagtggga tttatgaaaa gtgccatctc tatagctgag gatgaagaat 480ggaagagatt
acgatcattg ctgtctccaa ccttcaccag tggaaaactc aaggagatgg 540tccctatcat
tgcccagtat ggagatgtgt tggtgagaaa tctgaggcgg gaagcagaga 600caggcaagcc
tgtcaccttg aaagacgtct ttggggccta cagcatggat gtgatcacta 660gcacatcatt
tggagtgaac atcgactctc tcaacaatcc acaagacccc tttgtggaaa 720acaccaagaa
gcttttaaga tttgattttt tggatccatt ctttctctca ataacagtct 780ttccattcct
catcccaatt cttgaagtat taaatatctg tgtgtttcca agagaagtta 840caaatttttt
aagaaaatct gtaaaaagga tgaaagaaag tcgcctcgaa gatacacaaa 900agcaccgagt
ggatttcctt cagctgatga ttgactctca gaattcaaaa gaaactgagt 960cccacaaagc
tctgtccgat ctggagctcg tggcccaatc aattatcttt atttttgctg 1020gctatgaaac
cacgagcagt gttctctcct tcattatgta tgaactggcc actcaccctg 1080atgtccagca
gaaactgcag gaggaaattg atgcagtttt acccaataag gcaccaccca 1140cctatgatac
tgtgctacag atggagtatc ttgacatggt ggtgaatgaa acgctcagat 1200tattcccaat
tgctatgaga cttgagaggg tctgcaaaaa agatgttgag atcaatggga 1260tgttcattcc
caaaggggtg gtggtgatga ttccaagcta tgctcttcac cgtgacccaa 1320agtactggac
agagcctgag aagttcctcc ctgaaagatt cagcaagaag aacaaggaca 1380acatagatcc
ttacatatac acaccctttg gaagtggacc cagaaactgc attggcatga 1440ggtttgctct
catgaacatg aaacttgctc taatcagagt ccttcagaac ttctccttca 1500aaccttgtaa
agaaacacag atccccctga aattaagctt aggaggactt cttcaaccag 1560aaaaacccgt
tgttctaaag gttgagtcaa gggatggcac cgtaagtgga gcctgaattt 1620tcctaaggac
ttctgctttg ctcttcaaga aatctgtgcc tgagaacacc agagacctca 1680aattactttg
tgaatagaac tctgaaatga agatgggctt catccaatgg actgcataaa 1740taaccgggga
ttctgtacat gcattgagct ctctcattgt ctgtgtagag tgttatactt 1800gggaatataa
aggaggtgac caaatcagtg tgaggaggta gatttggctc ctctgcttct 1860cacgggacta
tttccaccac ccccagttag caccattaac tcctcctgag ctctgataag 1920agaatcaaca
tttctcaata atttcctcca caaattatta atgaaaataa gaattatttt 1980gatggctcta
acaatgacat ttatatcaca tgttttctct ggagtattct ataagtttta 2040tgttaaatca
ataaagacca ctttacaaaa gtattatcag atgctttcct gcacattaag 2100gagaaatcta
tagaactgaa tgagaaccaa caagtaaata tttttggtca ttgtaatcac 2160tgttggcgtg
gggcctttgt cagaactaga atttgattat taacataggt gaaagttaat 2220ccactgtgac
tttgcccatt gtttagaaag aatattcata gtttaattat gccttttttg 2280atcaggcaca
gtggctcacg cctgtaatcc tagcagtttg ggaggctgag ccgggtggat 2340cgcctgaggt
caggagttca agacaagcct ggcctacatg gttgaaaccc catctctact 2400aaaaatacac
aaattagcta ggcatggtgg actcgcctgt aatctcacta cacaggaggc 2460tgaggcagga
gaatcacttg aacctgggag gcggatgttg aagtgagctg agattgcacc 2520actgcactcc
agtctgggtg agagtgagac tcagtcttaa aaaaatatgc ctttttgaag 2580cacgtacatt
ttgtaacaaa gaactgaagc tcttattata ttattagttt tgatttaatg 2640ttttcagccc
atctcctttc atatttctgg gagacagaaa acatgtttcc ctacacctct 2700tgcattccat
cctcaacacc caactgtctc gatgcaatga acacttaata aaaaacagtc 2760gattggtc
2768
364 2984 DNA Homo sapiens 364
gaggaggaac agaaaagaaa agaaaagaaa aagtgggaaa caaataatct
aagaatgagg 60agaaagcaag aagagtgacc cccttgtggg cactccattg gttttatggc
gcctctactt 120tctggagttt gtgtaaaaca aaaatattat ggtctttgtg cacatttaca
tcaagctcag 180cctgggcggc acagccagat gcgagatgcg tctctgctga tctgagtctg
cctgcagcat 240ggacctgggt cttccctgaa gcatctccag ggctggaggg acgactgcca
tgcaccgagg 300gctcatccat ccacagagca gggcagtggg aggagacgcc atgaccccca
tcctcacggt 360cctgatctgt ctcgggctga gtctgggccc ccggacccac gtgcaggcag
ggcacctccc 420caagcccacc ctctgggctg aaccaggctc tgtgatcacc caggggagtc
ctgtgaccct 480caggtgtcag gggggccagg agacccagga gtaccgtcta tatagagaaa
agaaaacagc 540accctggatt acacggatcc cacaggagct tgtgaagaag ggccagttcc
ccatcccatc 600catcacctgg gaacatgcag ggcggtatcg ctgttactat ggtagcgaca
ctgcaggccg 660ctcagagagc agtgaccccc tggagctggt ggtgacagga gcctacatca
aacccaccct 720ctcagcccag cccagccccg tggtgaactc aggagggaat gtaaccctcc
agtgtgactc 780acaggtggca tttgatggct tcattctgtg taaggaagga gaagatgaac
acccacaatg 840cctgaactcc cagccccatg cccgtgggtc gtcccgcgcc atcttctccg
tgggccccgt 900gagcccgagt cgcaggtggt ggtacaggtg ctatgcttat gactcgaact
ctccctatga 960gtggtctcta cccagtgatc tcctggagct cctggtccta ggtgtttcta
agaagccatc 1020actctcagtg cagccaggtc ctatcgtggc ccctgaggag accctgactc
tgcagtgtgg 1080ctctgatgct ggctacaaca gatttgttct gtataaggac ggggaacgtg
acttccttca 1140gctcgctggc gcacagcccc aggctgggct ctcccaggcc aacttcaccc
tgggccctgt 1200gagccgctcc tacgggggcc agtacagatg ctacggtgca cacaacctct
cctccgagtg 1260gtcggccccc agcgaccccc tggacatcct gatcgcagga cagttctatg
acagagtctc 1320cctctcggtg cagccgggcc ccacggtggc ctcaggagag aacgtgaccc
tgctgtgtca 1380gtcacaggga tggatgcaaa ctttccttct gaccaaggag ggggcagctg
atgacccatg 1440gcgtctaaga tcaacgtacc aatctcaaaa ataccaggct gaattcccca
tgggtcctgt 1500gacctcagcc catgcgggga cctacaggtg ctacggctca cagagctcca
aaccctacct 1560gctgactcac cccagtgacc ccctggagct cgtggtctca ggaccgtctg
ggggccccag 1620ctccccgaca acaggcccca cctccacatc tggccctgag gaccagcccc
tcacccccac 1680cgggtcggat ccccagagtg gtctgggaag gcacctgggg gttgtgatcg
gcatcttggt 1740ggccgtcatc ctactgctcc tcctcctcct cctcctcttc ctcatcctcc
gacatcgacg 1800tcagggcaaa cactggacat cgacccagag aaaggctgat ttccaacatc
ctgcaggggc 1860tgtggggcca gagcccacag acagaggcct gcagtggagg tccagcccag
ctgccgatgc 1920ccaggaagaa aacctctatg ctgccgtgaa gcacacacag cctgaggatg
gggtggagat 1980ggacactcgg agcccacacg atgaagaccc ccaggcagtg acgtatgccg
aggtgaaaca 2040ctccagacct aggagagaaa tggcctctcc tccttcccca ctgtctgggg
aattcctgga 2100cacaaaggac agacaggcgg aagaggacag gcagatggac actgaggctg
ctgcatctga 2160agccccccag gatgtgacct acgcccagct gcacagcttg acccttagac
ggaaggcaac 2220tgagcctcct ccatcccagg aagggccctc tccagctgtg cccagcatct
acgccactct 2280ggccatccac tagcccaggg ggggacgcag accccacact ccatggagtc
tggaatgcat 2340gggagctgcc cccccagtgg acaccattgg accccaccca gcctggatct
accccaggag 2400actctgggaa cttttagggg tcactcaatt ctgcagtata aataactaat
gtctctacaa 2460ttttgaaata aagcaacaga cttctcaata atcaatgaag tagctgagaa
aactaagtca 2520gaaagtgcat taaactgaat cacaatgtaa atattacaca tcaagcgatg
aaactggaaa 2580actacaagcc acgaatgaat gaattaggaa agaaaaaaag taggaaatga
atgatcttgg 2640ctttcctata agaaatttag ggcagggcac ggtggctcac gcctgtaatt
ccagcacttt 2700gggaggccga ggcgggcaga tcacgagttc aggagatcga gaccatcttg
gccaacatgg 2760tgaaaccctg tctctcctaa aaatacaaaa attagctgga tgtggtggca
gtgcctgtaa 2820tcccagctat ttgggaggct gaggcaggag aatcgcttga accagggagt
cagaggtttc 2880agtgagccaa gatcgcacca ctgctctcca gcctggcgac agagggagac
tccatctcaa 2940attaaaaaaa aaaaaaaaaa agaaagaaaa aaaaaaaaaa aaaa
2984
365 3061 DNA Homo sapiens 365
cggcacgagg cgactttggt ggaggtagtt
ctttggcagc gggcatggcg ggtaccgtgg 60tgctggacga tgtggagctg cgggaggctc
agagagatta cctggacttc ctggacgacg 120aggaagacca gggaatttat cagagcaaag
ttcgggagct gatcagtgac aaccaatacc 180ggctgattgt caatgtgaat gacctgcgca
ggaaaaacga gaagagggct aaccggcttc 240tgaacaatgc ctttgaggag ctggttgcct
tccagcgggc cttaaaggat tttgtggcct 300ccattgatgc tacctatgcc aagcagtatg
aggagttcta cgtaggactg gaaggcagct 360ttggctccaa gcacgtctcc ccgcggactc
ttacctcctg cttcctcagc tgtgtggtct 420gtgtggaggg cattgtcact aaatgttctc
tagttcgtcc caaagtcgtc cgcagtgtcc 480actactgtcc tgctactaag aagaccatag
agcgacgtta ttctgatctc accaccctgg 540tggcctttcc ctccagctct gtctatccta
ccaaggatga ggagaacaat ccccttgaga 600cagaatatgg cctttctgtc tacaaggatc
accagaccat caccatccag gagatgccgg 660agaaggcccc agccggccag ctcccccgct
ctgtggacgt cattctggat gatgacttgg 720tggataaagc gaagcctggt gaccgggttc
aggtggtggg aacctaccgt tgccttcctg 780gaaagaaggg aggctacacc tctgggacct
tcaggactgt cctgattgcc tgtaatgtta 840agcagatgag caaggatgct cagccctctt
tctctgctga ggatatagcc aagatcaaga 900agttcagtaa aacccgatcc aaggatatct
ttgaccagct ggccaagtca ttggccccaa 960gtatccatgg gcatgactat gtcaagaaag
caatcctctg cttgctcttg ggaggggtgg 1020aacgagacct agaaaatggc agccacatcc
gtggggacat caatattctt ctaataggag 1080acccatccgt tgccaagtct cagcttctgc
ggtatgtgct ttgcactgca ccccgagcta 1140tccccaccac tggccggggc tcctctggag
tgggtctgac ggctgctgtc accacagacc 1200aggaaacagg agagcgccgt ctggaagcag
gggccatggt cctggctgac cgaggcgtgg 1260tttgcattga tgaatttgac aaaatgtctg
acatggatcg cacagccatc catgaagtga 1320tggagcaggg tcgagtgacc attgccaagg
ctggcatcca tgctcggctg aatgcccgct 1380gcagtgtttt ggcagctgcc aaccctgtct
acggcaggta tgaccagtat aagactccaa 1440tggagaacat tgggctacag gactcactgc
tgtcacgatt tgacttgctc ttcatcatgc 1500tggatcagat ggatcctgag caggatcggg
agatctcaga ccatgtcctt cggatgcacc 1560gttacagagc acctggggag caggatggcg
atgctatgcc cttgggtagt gctgtggata 1620tcctggccac agatgatccc aactttagcc
aggaagatca gcaggacacc cagatttatg 1680agaagcatga caaccttcta catgggacca
agaagaaaaa ggagaagatg gtgagtgcag 1740cattcatgaa gaagtacatc catgtggcca
aaatcatcaa gcctgtcctg acacaggagt 1800cggccaccta cattgcagaa gagtattcac
gcctgcgcag ccaggatagc atgagctcag 1860acaccgccag gacatctcca gttacagccc
gaacactgga aactctgatt cgactggcca 1920cagcccatgc gaaggcccgc atgagcaaga
ctgtggacct gcaggatgca gaggaagctg 1980tggagttggt ccagtatgct tactttaaga
aggttctgga gaaggagaag aaacgtaaga 2040agcgaagtga ggatgaatca gagacagaag
atgaagagga gaaaagccaa gaggaccagg 2100agcagaagag gaagagaagg aagactcgcc
agccagatgc caaagatggg gattcatacg 2160acccctatga cttcagtgac acagaggagg
aaatgcctca agtacacact ccaaagacgg 2220cagactcaca ggagaccaag gaatcccaga
aagtggagtt gagtgaatcc aggttgaagg 2280cattcaaggt ggccctcttg gatgtgttcc
gggaagctca tgcgcagtca atcggcatga 2340atcgcctcac agaatccatc aaccgggaca
gcgaagagcc cttctcttca gttgagatcc 2400aggctgctct gagcaagatg caggatgaca
atcaggtcat ggtgtctgag ggcatcatct 2460tcctcatctg aggaggcctc gtctctgaac
ttgggttgtg ccgagagagt ttgttctgtg 2520tttcccaccc tctccctgac ccaagtcttt
gcctctactc ccttaacagt gttgaattca 2580actgaaggcg aggaatgttg gtgatgaagc
tgagttcagg actcggtgga ccctttggga 2640atgggtcatg aaagctgcca tggggtgagg
aaagaggaga cagtgggaga ggacaatgac 2700tattgcatct tcattgcaaa agcactggct
catccgccct acttcccatc ccacacaaac 2760ccaattgtaa ataacatatg acttctgagt
acttttgggg gcacaactgt tttctgtttg 2820ctgttttttt gttttgtttt ttttctccag
agcactttgg tctagactag gctttgggtg 2880gttccaattg gtggagagaa gctctgaggc
acgtcatgca ggtcaagaaa gctttctttg 2940cagtagcacc agttaaggtg aatatgtatt
gtatcacaaa acaaacccaa tatccagatg 3000aatatccgag atgttgaata aacttagcca
tttcgtacaa aaaaaggggg gcccggtaaa 3060c
3061
366 1360 DNA Homo sapiens 366
cgggggttgc
tccgtccgtg ctccgcctcg ccatgacttc ctacagctat cgccagtcgt 60cggccacgtc
gtccttcgga ggcctgggcg gcggctccgt gcgttttggg ccgggggtcg 120cttttcgcgc
gcccagcatt cacgggggct ccggcggccg cggcgtatcc gtgtcctccg 180cccgctttgt
gtcctcgtcc tcctcggggg gctacggcgg cggctacggc ggcgtcctga 240ccgcgtccga
cgggctgctg gcgggcaacg agaagctaac catgcagaac ctcaacgacc 300gcctggcctc
ctacctggac aaggtgcgcg ccctggaggc ggccaacggc gagctagagg 360tgaagatccg
cgactggtac cagaagcagg ggcctgggcc ctcccgcgac tacagccact 420actacacgac
catccaggac ctgcgggaca agattcttgg tgccaccatt gagaactcca 480ggattgtcct
gcagatcgac aacgcccgtc tggctgcaga tgacttccga accaagtttg 540agacggaaca
ggctctgcgc atgagcgtgg aggccgacat caacggcctg cgcagggtgc 600tggatgagct
gaccctggcc aggaccgacc tggagatgca gatcgaaggc ctgaaggaag 660agctggccta
cctgaagaag aaccatgagg aggaaatcag tacgctgagg ggccaagtgg 720gaggccaggt
cagtgtggag gtggattccg ctccgggcac cgatctcgcc aagatcctga 780gtgacatgcg
aagccaatat gaggtcatgg ccgagcagaa ccggaaggat gctgaagcct 840ggttcaccag
ccggactgaa gaattgaacc gggaggtcgc tggccacacg gagcagctcc 900agatgagcag
gtccgaggtt actgacctgc ggcgcaccct tcagggtctt gagattgagc 960tgcagtcaca
gctgagcatg aaagctgcct tggaagacac actggcagaa acggaggcgc 1020gctttggagc
ccagctggcg catatccagg cgctgatcag cggtattgaa gcccagctgg 1080cggatgtgcg
agctgatagt gagcggcaga atcaggagta ccagcggctc atggacatca 1140agtcgcggct
ggagcaggag attgccacct accgcagcct gctcgaggga caggaagatc 1200actacaacaa
tttgtctgcc tccaaggtcc tctgaggcag caggctctgg ggcttctgct 1260gtcctttgga
gggtgtcttc tgggtagagg gatgggaagg aagggaccct tacccccggc 1320tcttctcctg
acctgccaat aaaaatttat ggtccaaggg
1360
367 1412 DNA Homo sapiens 367
cggggtcgtc cgcaaagcct gagtcctgtc ctttctctct
ccccggacag catgagcttc 60accactcgct ccaccttctc caccaactac cggtccctgg
gctctgtcca ggcgcccagc 120tacggcgccc ggccggtcag cagcgcggcc agcgtctatg
caggcgctgg gggctctggt 180tcccggatct ccgtgtcccg ctccaccagc ttcaggggcg
gcatggggtc cgggggcctg 240gccaccggga tagccggggg tctggcagga atgggaggca
tccagaacga gaaggagacc 300atgcaaagcc tgaacgaccg cctggcctct tacctggaca
gagtgaggag cctggagacc 360gagaaccgga ggctggagag caaaatccgg gagcacttgg
agaagaaggg accccaggtc 420agagactgga gccattactt caagatcatc gaggacctga
gggctcagat cttcgcaaat 480actgtggaca atgcccgcat cgttctgcag attgacaatg
cccgtcttgc tgctgatgac 540tttagagtca agtatgagac agagctggcc atgcgccagt
ctgtggagaa cgacatccat 600gggctccgca aggtcattga tgacaccaat atcacacgac
tgcagctgga gacagagatc 660gaggctctca aggaggagct gctcttcatg aagaagaacc
acgaagagga agtaaaaggc 720ctacaagccc agattgccag ctctgggttg accgtggagg
tagatgcccc caaatctcag 780gacctcgcca agatcatggc agacatccgg gcccaatatg
acgagctggc tcggaagaac 840cgagaggagc tagacaagta ctggtctcag cagattgagg
agagcaccac agtggtcacc 900acacagtctg ctgaggttgg agctgctgag acgacgctca
cagagctgag acgtacagtc 960cagtccttgg agatcgacct ggactccatg agaaatctga
aggccagctt ggagaacagc 1020ctgagggagg tggaggcccg ctacgcccta cagatggagc
agctcaacgg gatcctgctg 1080caccttgagt cagagctggc acagacccgg gcagagggac
agcgccaggc ccaggagtat 1140gaggccctgc tgaacatcaa ggtcaagctg gaggctgaga
tcgccaccta ccgccgcctg 1200ctggaagatg gcgaggactt taatcttggt gatgccttgg
acagcagcaa ctccatgcaa 1260accatccaaa agaccaccac ccgccggata gtggatggca
aagtggtgtc tgagaccaat 1320gacaccaaag ttctgaggca ttaagccagc agaagcaggg
taccctttgg ggagcaggag 1380gccaataaaa agttcagagt tcattggatg tc
1412
368 1075 DNA Homo sapiens 368
cgcagcaaac acatccgtag
aaggcagcgc ggccgccgag agccgcagcg ccgctcgccc 60gccgcccccc accccgccgc
cccgcccggc gaattgcgcc ccgcgcccct cccctcgcgc 120ccccgagaca aagaggagag
aaagtttgcg cggccgagcg gggcaggtga ggagggtgag 180ccgcgcggga ggggcccgcc
tcggccccgg ctcagccccc gcccgcgccc ccagcccgcc 240gccgcgagca gcgcccggac
cccccagcgg cggcccccgc ccgcccagcc ccccggcccg 300ccatgggcgc cgcggcccgc
accctgcggc tggcgctcgg cctcctgctg ctggcgacgc 360tgcttcgccc ggccgacgcc
tgcagctgct ccccggtgca cccgcaacag gcgttttgca 420atgcagatgt agtgatcagg
gccaaagcgg tcagtgagaa ggaagtggac tctggaaacg 480acatttatgg caaccctatc
aagaggatcc agtatgagat caagcagata aagatgttca 540aagggcctga gaaggatata
gagtttatct acacggcccc ctcctcggca gtgtgtgggg 600tctcgctgga cgttggagga
aagaaggaat atctcattgc aggaaaggcc gagggggacg 660gcaagatgca catcaccctc
tgtgacttca tcgtgccctg ggacaccctg agcaccaccc 720agaagaagag cctgaaccac
aggtaccaga tgggctgcga gtgcaagatc acgcgctgcc 780ccatgatccc gtgctacatc
tcctccccgg acgagtgcct ctggatggac tgggtcacag 840agaagaacat caacgggcac
caggccaagt tcttcgcctg catcaagaga agtgacggct 900cctgtgcgtg gtaccgcggc
gcggcgcccc ccaagcagga gtttctcgac atcgaggacc 960cataagcagg cctccaacgc
ccctgtggcc aactgcaaaa aaagcctcca agggtttcga 1020ctggtccagc tctgacatcc
cttcctggaa acagcatgaa taaaacactc atccc 1075
369 1127 DNA Homo sapiens 369
cacgggcggg gcggggcctg ggtccaccgg ggttctgagg ggagactgag gtcctgagcc
60gacagcctca gctccctgcc aggccagacc cggcagacag atgagggccc aggaggcctg
120gcgggcctgg gggcgctacg gtgggagagg aagccagggg tacctgcctc tgccttccag
180ggccaccgtt ggccccagct gtgccttgac tacgtaacat cttgtcctca cagcccagag
240catgttccag atcccagagt ttgagccgag tgagcaggaa gactccagct ctgcagagag
300gggcctgggc cccagccccg caggggacgg gccctcaggc tccggcaagc atcatcgcca
360ggccccaggc ctcctgtggg acgccagtca ccagcaggag cagccaacca gcagcagcca
420tcatggaggc gctggggctg tggagatccg gagtcgccac agctcctacc ccgcggggac
480ggaggacgac gaagggatgg gggaggagcc cagccccttt cggggccgct cgcgctcggc
540gccccccaac ctctgggcag cacagcgcta tggccgcgag ctccggagga tgagtgacga
600gtttgtggac tcctttaaga agggacttcc tcgcccgaag agcgcgggca cagcaacgca
660gatgcggcaa agctccagct ggacgcgagt cttccagtcc tggtgggatc ggaacttggg
720caggggaagc tccgccccct cccagtgacc ttcgctccac atcccgaaac tccacccgtt
780cccactgccc tgggcagcca tcttgaatat gggcggaagt acttccctca ggcctatgca
840aaaagaggat ccgtgctgtc tcctttggag ggagggctga cccagattcc cttccggtgc
900gtgtgaagcc acggaaggct tggtcccatc ggaagttttg ggttttccgc ccacagccgc
960cggaagtggc tccgtggccc cgccctcagg ctccgggctt tcccccaggc gcctgcgcta
1020agtcgcgagc caggtttaac cgttgcgtca ccgggacccg agcccccgcg atgccctggg
1080ggccgtgctc actaccaaat gttaataaag cccgcgtctg tgccgcc
1127
370 1890 DNA Homo sapiens 370
cttaataaga agagaaggct tcaatggaac cttttgtggt
cctggtgctg tgtctctctt 60ttatgcttct cttttcactc tggagacaga gctgtaggag
aaggaagctc cctcctggcc 120ccactcctct tcctattatt ggaaatatgc tacagataga
tgttaaggac atctgcaaat 180ctttcaccaa tttctcaaaa gtctatggtc ctgtgttcac
cgtgtatttt ggcatgaatc 240ccatagtggt gtttcatgga tatgaggcag tgaaggaagc
cctgattgat aatggagagg 300agttttctgg aagaggcaat tccccaatat ctcaaagaat
tactaaagga cttggaatca 360tttccagcaa tggaaagaga tggaaggaga tccggcgttt
ctccctcaca aacttgcgga 420attttgggat ggggaagagg agcattgagg accgtgttca
agaggaagct cactgccttg 480tggaggagtt gagaaaaacc aaggcttcac cctgtgatcc
cactttcatc ctgggctgtg 540ctccctgcaa tgtgatctgc tccgttgttt tccagaaacg
atttgattat aaagatcaga 600attttctcac cctgatgaaa agattcaatg aaaacttcag
gattctgaac tccccatgga 660tccaggtctg caataatttc cctctactca ttgattgttt
cccaggaact cacaacaaag 720tgcttaaaaa tgttgctctt acacgaagtt acattaggga
gaaagtaaaa gaacaccaag 780catcactgga tgttaacaat cctcgggact ttatggattg
cttcctgatc aaaatggagc 840aggaaaagga caaccaaaag tcagaattca atattgaaaa
cttggttggc actgtagctg 900atctatttgt tgctggaaca gagacaacaa gcaccactct
gagatatgga ctcctgctcc 960tgctgaagca cccagaggtc acagctaaag tccaggaaga
gattgatcat gtaattggca 1020gacacaggag cccctgcatg caggatagga gccacatgcc
ttacactgat gctgtagtgc 1080acgagatcca gagatacagt gaccttgtcc ccaccggtgt
gccccatgca gtgaccactg 1140atactaagtt cagaaactac ctcatcccca agagctttga
taacaagata atgctggctg 1200cataaaacta gggcacaacc ataatggcat tactgacttc
cgtgctacat gatgacaaag 1260aatttcctaa tccaaatatc tttgaccctg gccactttct
agataagaat ggcaacttta 1320agaaaagtga ctacttcatg cctttctcag caggaaaacg
aatttgtgca ggagaaggac 1380ttgcccgcat ggagctattt ttatttctaa ccacaatttt
acagaacttt aacctgaaat 1440ctgttgatga tttaaagaac ctcaatacta ctgcagttac
caaagggatt gtttctctgc 1500caccctcata ccagatctgc ttcatccctg tctgaagaat
gctagcccat ctggctgctg 1560atctgctatc acctgcaact ctttttttat caaggacatt
cccactatta tgtcttctct 1620gacctctcat caaatcttcc cattcactca atatcccata
agcatccaaa ctccattaag 1680gagagttgtt caggtcactg cacaaatata tctgcaatta
ttcatactct gtaacacttg 1740tattaattgc tgcatatgct aatacttttc taatgctgac
tttttaatat gttatcactg 1800taaaacacag aaaagtgatt aatgaatgat aatttagtcc
atttcttttg tgaatgtgct 1860aaataaaaag tgttattaat tgctggttca
1890
371 4946 DNA Homo sapiens 371
agtcagccct gctgccagcc
agtgccgggt gctggggact cagggaggcc cgccgggacc 60actgcgggac agtgagccga
gcagaagctg gaacgcagga gaggaaggag agggggcggt 120cagggctctc aggagccggg
tcctgggcaa ggcgcagccg ttttcaaatt ttcaggaaag 180cggtcggctc acactcgagc
agtaaaaaga tgcctctggg gaggaggccc gtgcagctct 240ccgggcaatg gtggtggctc
ggcctagaga ggcggtagtg gaacgcagac cctggtgggg 300gaatgacatc aagggaggag
acgggcggga ccccagattt ctgcctgtgg gcgatggaag 360tgaggttcac tggccagcgg
agccggacac agaacgcgca aaacgccgtg taggcctgga 420ggagccgaag agcaggcgga
ccccctccgc gggggaacag tttccgccgg gagcacaaag 480caacggaccg gaagtggggg
gcggaagtgc agtgggctca gcgccgactg cgcgcctctg 540cccgcgaaaa ctctgagctg
gctgacagct ggggacgggt ggcggccctc gactggagtc 600ggttgagttc ctgagggacc
ccggttctgg aaggttcgcc gcggagacaa gtgagcagtc 660tgtgccatag ggattctcga
agagaacagc gttgtgtccc agtgcacatg ctcgcatcgc 720ttaccaggag tgcccgagac
cctaagatgt tcggagtggt tttttcgcac agacccgaat 780agcctgcccc tcagccacgc
tctgtgccct tctgagaaca ggctgatatg cccaagatag 840tcctgaatgg tgtgaccgta
gacttccctt tccagcccta caaatgccaa caggagtaca 900tgaccaaggt cctggaatgt
ctgcagcaga aggtgaatgg catcctggag agccctacgg 960gtacagggaa gacgctgtgc
ctgctgtgca ccacgctggc ctggcgagaa cacctccgag 1020acggcatctc tgcccgcaag
attgccgaga gggcgcaagg agagcttttc ccggatcggg 1080ccttgtcatc ctggggcaac
gctgctgctg ctgctggaga ccccatagct tgctacacgg 1140acatcccaaa gattatttac
gcctccagga cccactcgca actcacacag gtcatcaacg 1200agcttcggaa cacctcctac
cggcctaagg tgtgtgtgct gggctcccgg gagcagctgt 1260gcatccatcc tgaggtgaag
aaacaagaga gtaaccatct acagatccac ttgtgccgta 1320agaaggtggc aagtcgctcc
tgtcatttct acaacaacgt agaagaaaaa agcctggagc 1380aggagctggc cagccccatc
ctggacattg aggacttggt caagagcgga agcaagcaca 1440gggtgtgccc ttactacctg
tcccggaacc tgaagcagca agccgacatc atattcatgc 1500cgtacaatta cttgttggat
gccaagagcc gcagagcaca caacattgac ctgaagggga 1560cagtcgtgat ctttgacgaa
gctcacaacg tggagaagat gtgtgaagaa tcggcatcct 1620ttgacctgac tccccatgac
ctggcttcag gactggacgt catagaccag gtgctggagg 1680agcagaccaa ggcagcgcag
cagggtgagc cccacccgga gttcagcgcg gactccccca 1740gcccagggct gaacatggag
ctggaagaca ttgcaaagct gaagatgatc ctgctgcgcc 1800tggagggggc catcgatgct
gttgagctgc ctggagacga cagcggtgtc accaagccag 1860ggagctacat ctttgagctg
tttgctgaag cccagatcac gtttcagacc aagggctgca 1920tcctggactc gctggaccag
atcatccagc acctggcagg acgtgctgga gtgttcacca 1980acacggccgg actgcagaag
ctggcggaca ttatccagat tgtgttcagt gtggacccct 2040ccgagggcag ccctggttcc
ccagcagggc tgggggcctt acagtcctat aaggtgcaca 2100tccatcctga tgctggtcac
cggaggacgg ctcagcggtc tgatgcctgg agcaccactg 2160cagccagaaa gcgagggaag
gtgctgagct actggtgctt cagtcccggc cacagcatgc 2220acgagctggt ccgccagggc
gtccgctccc tcatccttac cagcggcacg ctggccccgg 2280tgtcctcctt tgctctggag
atgcagatcc ctttcccagt ctgcctggag aacccacaca 2340tcatcgacaa gcaccagatc
tgggtggggg tcgtccccag aggccccgat ggagcccagt 2400tgagctccgc gtttgacaga
cggttttccg aggagtgctt atcctccctg gggaaggctc 2460tgggcaacat cgcccgcgtg
gtgccctatg ggctcctgat cttcttccct tcctatcctg 2520tcatggagaa gagcctggag
ttctggcggg cccgcgactt ggccaggaag atggaggcgc 2580tgaagccgct gtttgtggag
cccaggagca aaggcagctt ctccgagacc atcagtgctt 2640actatgcaag ggttgccgcc
cctgggtcca ccggcgccac cttcctggcg gtctgccggg 2700gcaaggccag cgaggggctg
gacttctcag acacgaatgg ccgtggtgtg attgtcacgg 2760gcctcccgta ccccccacgc
atggaccccc gggttgtcct caagatgcag ttcctggatg 2820agatgaaggg ccagggtggg
gctgggggcc agttcctctc tgggcaggag tggtaccggc 2880agcaggcgtc cagggctgtg
aaccaggcca tcgggcgagt gatccggcac cgccaggact 2940acggagctgt cttcctctgt
gaccacaggt tcgcctttgc cgacgcaaga gcccaactgc 3000cctcctgggt gcgtccccac
gtcagggtgt atgacaactt tggccatgtc atccgagacg 3060tggcccagtt cttccgtgtt
gccgagcgaa ctatgccagc gccggccccc cgggctacag 3120cacccagtgt gcgtggagaa
gatgctgtca gcgaggccaa gtcgcctggc cccttcttct 3180ccaccaggaa agctaagagt
ctggacctgc atgtccccag cctgaagcag aggtcctcag 3240ggtcaccagc tgccggggac
cccgagagta gcctgtgtgt ggagtatgag caggagccag 3300ttcctgcccg gcagaggccc
agggggctgc tggccgccct ggagcacagc gaacagcggg 3360cggggagccc tggcgaggag
caggcccaca gctgctccac cctgtccctc ctgtctgaga 3420agaggccggc agaagaaccg
cgaggaggga ggaagaagat ccggctggtc agccacccgg 3480aggagcccgt ggctggtgca
cagacggaca gggccaagct cttcatggtg gccgtgaagc 3540aggagttgag ccaagccaac
tttgccacct tcacccaggc cctgcaggac tacaagggtt 3600ccgatgactt cgccgccctg
gccgcctgtc tcggccccct ctttgctgag gaccccaaga 3660agcacaacct gctccaaggc
ttctaccagt ttgtgcggcc ccaccataag cagcagtttg 3720aggaggtctg tatccagctg
acaggacgag gctgtggcta tcggcctgag cacagcattc 3780cccgaaggca gcgggcacag
ccggtcctgg accccactgg aagaacggcg ccggatccca 3840agctgaccgt gtccacggct
gcagcccagc agctggaccc ccaagagcac ctgaaccagg 3900gcaggcccca cctgtcgccc
aggccacccc caacaggaga ccctggcagc caaccacagt 3960gggggtctgg agtgcccaga
gcagggaagc agggccagca cgccgtgagc gcctacctgg 4020ctgatgcccg cagggccctg
gggtccgcgg gctgtagcca actcttggca gcgctgacag 4080cctataagca agacgacgac
ctcgacaagg tgctggctgt gttggccgcc ctgaccactg 4140caaagccaga ggacttcccc
ctgctgcaca ggttcagcat gtttgtgcgt ccacaccaca 4200agcagcgctt ctcacagacg
tgcacagacc tgaccggccg gccctacccg ggcatggagc 4260caccgggacc ccaggaggag
aggcttgccg tgcctcctgt gcttacccac agggctcccc 4320aaccaggccc ctcacggtcc
gagaagaccg ggaagaccca gagcaagatc tcgtccttcc 4380ttagacagag gccagcaggg
actgtggggg cgggcggtga ggatgcaggt cccagccagt 4440cctcaggacc tccccacggg
cctgcagcat ctgagtgggg cctctaggat gtgcccagcc 4500tgccacaccg cctccaggaa
gcagagcgtc atgcaggtct tctggccaga gccccagtga 4560gtgcccacgg aggcccccag
cacacccaac gtggcttgat cacctgcctg tccagctctg 4620gtgggccaag aacccaccca
acagaatagg ccagcccatg ccagccggct tggcccgctg 4680caggcctcag gcaggcgggg
cccatggttg gtccctgcgg tgggaccgga tctgggcctg 4740cctctgagaa gccctgagct
accttggggt ctggggtggg tttctgggaa agtgcttccc 4800cagaacttcc ctggctcctg
gcctgtgagt ggtgccacag gggcacccca gctgagcccc 4860tcaccgggaa ggaggagacc
cccgtgggca cgtgtccact tttaatcagg ggacagggct 4920ctctaataaa gctgctggca
gtgccc 4946
372 1743 DNA Homo sapiens 372
cagtatccct cctgacaaaa ctaacaaaaa tcctgttagc caaataatca gccacattca
60tatttaccgt caaagttttt atcctcattt tacagcagtg gagagcgatt gccccgggtc
120ccacgttagg aagagagaga actgggattt gcacccaggc aatctgggga cagagctgtg
180atcacaactc catgagtcag ggccgagcca gccccttcac caccagccgg ccgcgccccg
240ggaaggaagt ttgtggcgga ggaggttcgt acgggaggag ggggaggcgc ccacgcatct
300ggggctgact cgctctttcg caaaacgtct gggaggagtc cctggggcca caaaactgcc
360tccttcctga ggccagaagg agagaagacg tgcagggacc ccgcgcacag gagctgccct
420cgcgacatgg gtcacccgcc gctgctgccg ctgctgctgc tgctccacac ctgcgtccca
480gcctcttggg gcctgcggtg catgcagtgt aagaccaacg gggattgccg tgtggaagag
540tgcgccctgg gacaggacct ctgcaggacc acgatcgtgc gcttgtggga agaaggagaa
600gagctggagc tggtggagaa aagctgtacc cactcagaga agaccaacag gaccctgagc
660tatcggactg gcttgaagat caccagcctt accgaggttg tgtgtgggtt agacttgtgc
720aaccagggca actctggccg ggctgtcacc tattcccgaa gccgttacct cgaatgcatt
780tcctgtggct catcagacat gagctgtgag aggggccggc accagagcct gcagtgccgc
840agccctgaag aacagtgcct ggatgtggtg acccactgga tccaggaagg tgaagaaggg
900cgtccaaagg atgaccgcca cctccgtggc tgtggctacc ttcccggctg cccgggctcc
960aatggtttcc acaacaacga caccttccac ttcctgaaat gctgcaacac caccaaatgc
1020aacgagggcc caatcctgga gcttgaaaat ctgccgcaga atggccgcca gtgttacagc
1080tgcaagggga acagcaccca tggatgctcc tctgaagaga ctttcctcat tgactgccga
1140ggccccatga atcaatgtct ggtagccacc ggcactcacg aaccgaaaaa ccaaagctat
1200atggtaagag gctgtgcaac cgcctcaatg tgccaacatg cccacctggg tgacgccttc
1260agcatgaacc acattgatgt ctcctgctgt actaaaagtg gctgtaacca cccagacctg
1320gatgtccagt accgcagtgg ggctgctcct cagcctggcc ctgcccatct cagcctcacc
1380atcaccctgc taatgactgc cagactgtgg ggaggcactc tcctctggac ctaaacctga
1440aatccccctc tctgccctgg ctggatccgg gggacccctt tgcccttccc tcggctccca
1500gccctacaga cttgctgtgt gacctcaggc cagtgtgccg acctctctgg gcctcagttt
1560tcccagctat gaaaacagct atctcacaaa gttgtgtgaa gcagaagaga aaagctggag
1620gaaggccgtg ggcaatggga gagctcttgt tattattaat attgttgccg ctgttgtgtt
1680gttgttatta attaatattc atattattta ttttatactt acataaagat tttgtaccag
1740tgg
1743
373 5061 DNA Homo sapiens 373
atggctcaga tatttagcaa cagcggattt aaagaatgtc
cattttcaca tccggaacca 60acaagagcaa aagatgtgga caaagaagaa gcattacaga
tggaagcaga ggctttagca 120aaactgcaaa aggatagaca agtgactgac aatcagagag
gctttgagtt gtcaagcagc 180accagaaaaa aagcacaggt ttataacaag caggattatg
atctcatggt gtttcctgaa 240tcagattccc aaaaaagagc attagatatt gatgtagaaa
agctcaccca agctgaactt 300gagaaactat tgctggatga cagtttcgag actaaaaaaa
cacctgtatt accagttact 360cctattctga gcccttcctt ttcagcacag ctctatttta
gacctactat tcagagagga 420cagtggccac ctggattacc tgggccttcc acttatgctt
taccttctat ttatccttct 480acttacagta aacaggctgc attccaaaat ggcttcaatc
caagaatgcc cacttttcca 540tctacagaac ctatatattt aagtcttccg ggacaatctc
catatttctc atatcctttg 600acacctgcca caccctttca tccacaagga agcttaccta
tctatcgtcc agtagtcagt 660actgacatgg caaaactatt tgacaaaata gctagtacat
cagaattttt aaaaaatggg 720aaagcaagga ctgatttgga gataacagat tcaaaagtca
gcaatctaca ggtatctcca 780aagtctgagg atatcagtaa atttgactgg ttagacttgg
atcctctaag taagcctaag 840gtggataatg tggaggtatt agaccatgag gaagagaaaa
atgtttcaag tttgctagca 900aaggatcctt gggatgctgt tcttcttgaa gagagatcga
cagcaaattg tcatcttgaa 960agaaaggtga atggaaaatc cctttctgtg gcaactgtta
caagaagcca gtctttaaat 1020attcgaacaa ctcagcttgc aaaagcccag ggccatatat
ctcagaaaga cccaaatggg 1080accagtagtt tgccaactgg aagttctctt cttcaagaag
ttgaagtaca gaatgaggag 1140atggcagctt tttgtcgatc cattacaaaa ttgaagacca
aatttccata taccaatcac 1200cgcacaaacc caggctattt gttaagtcca gtcacagcgc
aaagaaacat atgcggagaa 1260aatgctagtg tgaaggtctc cattgacatt gaaggatttc
agctaccagt tacttttacg 1320tgtgatgtga gttctactgt agaaatcatt ataatgcaag
ccctttgctg ggtacatgat 1380gacttgaatc aagtagatgt tggcagctat gttctaaaag
tttgtggtca agaggaagtg 1440ctgcagaata atcattgcct tggaagtcat gagcatattc
aaaactgtcg aaaatgggac 1500acagaaatta gactacaact cttgaccttc agtgcaatgt
gtcaaaatct ggcccgaaca 1560gcagaagatg atgaaacacc cgtggattta aacaaacacc
tgtatcaaat agaaaaacct 1620tgcaaagaag ccatgacgag acaccctgtt gaagaactct
tagattctta tcacaaccaa 1680gtagaactgg ctcttcaaat tgaaaaccaa caccgagcag
tagatcaagt aattaaagct 1740gtaagaaaaa tctgtagtgc tttagatggt gtcgagactc
ttgccattac agaatcagta 1800aagaagctaa agagagcagt taatcttcca aggagtaaaa
ctgctgatgt gacttctttg 1860tttggaggag aagacactag caggagttca actaggggct
cacttaatcc tgaaaatcct 1920gttcaagtaa gcataaacca attaactgca gcaatttatg
atcttctcag actccatgca 1980aattctggta ggagtcctac agactgtgcc caaagtagca
agagtgtcaa ggaagcatgg 2040actacaacag agcagctcca gtttactatt tttgctgctc
atggaatttc aagtaattgg 2100gtatcaaatt atgaaaaata ctacttgata tgttcactgt
ctcacaatgg aaaggatctt 2160tttaaaccta ttcaatcaaa gaaggttggc acttacaaga
atttcttcta tcttattaaa 2220tgggatgaac taatcatttt tcctatccag atatcacaat
tgccattaga atcagttctt 2280caccttactc tttttggaat tttaaatcag agcagtggaa
gttcccctga ttctaataag 2340cagagaaagg gaccagaagc tttgggcaaa gtttctttac
ctctttgtga ctttagacgg 2400tttttaacat gtggaactaa acttctatat ctttggactt
catcacatac aaattctgtt 2460cctggaacag ttaccaaaaa aggatatgtc atggaaagaa
tagtgctaca ggttgatttt 2520ccttctcctg catttgatat tatttataca actcctcaag
ttgacagaag cattatacag 2580caacataact tagaaacact agagaatgat ataaaaggga
aacttcttga tattcttcat 2640aaagactcat cacttggact ttctaaagaa gataaagctt
ttttatggga gaaacgttat 2700tattgcttca aacacccaaa ttgtcttcct aaaatattag
caagcgcccc aaactggaaa 2760tggggtaatc ttgccaaaac ttactcattg cttcaccagt
ggcctgcatt gtacccacta 2820attgcattgg aacttcttga ttcaaaattt gctgatcagg
aagtaagatc cctagctgtg 2880acctggattg aggccattag tgatgatgag ctaacagatc
ttcttccaca gtttgtacaa 2940gctttgaaat atgaaattta cttgaatagt tcattagtgc
aattcctttt gtccagggca 3000ttgggaaata tccagatagc acacaattta tattggcttc
tcaaagatgc cctgcatgat 3060gtacagttta gtacccgata cgaacatgtt ttgggtgctc
tcctgtcagt aggaggaaaa 3120cgacttagag aagaacttct aaaacagacg aaacttgtac
agcttttagg aggagtagca 3180gaaaaagtaa ggcaggctag tggatcagcc agacaggttg
ttctccaaag aagtatggaa 3240cgagtacagt ccttttttca gaaaaataaa tgccgtctcc
ctctcaagcc aagtctagtg 3300gcaaaagaat taaatattaa gtcgtgttcc ttcttcagtt
ctaatgctgt ccccctaaaa 3360gtcacaatgg tgaatgctga ccctctggga gaagaaatta
atgtcatgtt taaggttggt 3420gaagatcttc ggcaagatat gttagcttta cagatgataa
agattatgga taagatctgg 3480cttaaagaag gactagatct gaggatggta attttcaaat
gtctctcaac tggcagagat 3540cgaggcatgg tggagctggt tcctgcttcc gataccctca
ggaaaatcca agtggaatat 3600ggtgtgacag gatcctttaa agataaacca cttgcagagt
ggctaaggaa atacaatccc 3660tctgaagaag aatatgaaaa ggcttcagag aactttatct
attcctgtgc tggatgctgt 3720gtagccacct atgttttagg catctgtgat cgacacaatg
acaatataat gcttcgaagc 3780acgggacaca tgtttcacat tgactttgga aagtttttgg
gacatgcaca gatgtttggc 3840agcttcaaaa gggatcgggc tccttttgtg ctgacctctg
atatggcata tgtcattaat 3900gggggtgaaa agcccaccat tcgttttcag ttgtttgtgg
acctctgctg tcaggcctac 3960aacttgataa gaaagcagac aaaccttttt cttaacctcc
tttcactgat gattccttca 4020gggttaccag aacttacaag tattcaagat ttgaaatacg
ttagagatgc acttcaaccc 4080caaactacag acgcagaagc tacaattttc tttactaggc
ttattgaatc aagtttggga 4140agcattgcca caaagtttaa cttcttcatt cacaaccttg
ctcagcttcg tttttctggt 4200cttccttcta atgatgagcc catcctttca ttttcaccta
aaacatactc ctttagacaa 4260gatggtcgaa tcaaggaagt ctctgttttt acatatcata
agaaatacaa cccagataaa 4320cattatattt atgtagtccg aattttgtgg gaaggacaga
ttgaaccatc atttgtcttc 4380cgaacatttg tcgaatttca ggaacttcac aataagctca
gtattatttt tccactttgg 4440aagttaccag gctttcctaa taggatggtt ctaggaagaa
cacacataaa agatgtagca 4500gccaaaagga aaattgagtt aaacagttac ttacagagtt
tgatgaatgc ttcaacggat 4560gtagcagagt gtgatcttgt ttgtactttc ttccaccctt
tacttcgtga tgagaaagct 4620gaagggatag ctaggtctgc agatgcaggt tccttcagtc
ctactccagg ccaaatagga 4680ggagctgtga aattatccat ctcttaccga aatggtactc
ttttcatcat ggtgatgcat 4740atcaaagatc ttgttactga agatggagct gacccaaatc
catatgtcaa aacataccta 4800cttccagata accacaaaac atccaaacgt aaaaccaaaa
tttcacgaaa aacgaggaat 4860ccgacattca atgaaatgct tgtatacagt ggatatagca
aagaaaccct aagacagcga 4920gaacttcaac taagtgtact cagtgcagaa tctctgcggg
agaatttttt cttgggtgga 4980gtaaccctgc ctttgaaaga tttcaacttg agcaaagaga
cggttaaatg gtatcagctg 5040actgcggcaa catacttgta a
5061
374 6802 DNA Homo sapiens 374
cggccccaga aaacccgagc
gagtaggggg cggcgcgcag gagggaggag aactgggggc 60gcgggaggct ggtgggtgtc
gggggtggag atgtagaaga tgtgacgccg cggcccggcg 120ggtgccagat tagcggacgg
ctgcccgcgg ttgcaacggg atcccgggcg ctgcagcttg 180ggaggcggct ctccccaggc
ggcgtccgcg gagacaccca tccgtgaacc ccaggtcccg 240ggccgccggc tcgccgcgca
ccaggggccg gcggacagaa gagcggccga gcggctcgag 300gctgggggac cgcgggcgcg
gccgcgcgct gccgggcggg aggctggggg gccggggccg 360gggccgtgcc ccggagcggg
tcggaggccg gggccggggc cgggggacgg cggctccccg 420cgcggctcca gcggctcggg
gatcccggcc gggccccgca gggaccatgg cagccgggag 480catcaccacg ctgcccgcct
tgcccgagga tggcggcagc ggcgccttcc cgcccggcca 540cttcaaggac cccaagcggc
tgtactgcaa aaacgggggc ttcttcctgc gcatccaccc 600cgacggccga gttgacgggg
tccgggagaa gagcgaccct cacatcaagc tacaacttca 660agcagaagag agaggagttg
tgtctatcaa aggagtgtgt gctaaccgtt acctggctat 720gaaggaagat ggaagattac
tggcttctaa atgtgttacg gatgagtgtt tcttttttga 780acgattggaa tctaataact
acaatactta ccggtcaagg aaatacacca gttggtatgt 840ggcactgaaa cgaactgggc
agtataaact tggatccaaa acaggacctg ggcagaaagc 900tatacttttt cttccaatgt
ctgctaagag ctgattttaa tggccacatc taatctcatt 960tcacatgaaa gaagaagtat
attttagaaa tttgttaatg agagtaaaag aaaataaatg 1020tgtatagctc agtttggata
attggtcaaa caatttttta tccagtagta aaatatgtaa 1080ccattgtccc agtaaagaaa
aataacaaaa gttgtaaaat gtatattctc ccttttatat 1140tgcatctgct gttacccagt
gaagcttacc tagagcaatg atctttttca cgcatttgct 1200ttattcgaaa agaggctttt
aaaatgtgca tgtttagaaa caaaatttct tcatggaaat 1260catatacatt agaaaatcac
agtcagatgt ttaatcaatc caaaatgtcc actatttctt 1320atgtcattcg ttagtctaca
tgtttctaaa catataaatg tgaatttaat caattccttt 1380catagtttta taattctctg
gcagttcctt atgatagagt ttataaaaca gtcctgtgta 1440aactgctgga agttcttcca
cagtcaggtc aattttgtca aacccttctc tgtacccata 1500cagcagcagc ctagcaactc
tgctggtgat gggagttgta ttttcagtct tcgccaggtc 1560attgagatcc atccactcac
atcttaagca ttcttcctgg caaaaattta tggtgaatga 1620atatggcttt aggcggcaga
tgatatacat atctgacttc ccaaaagctc caggatttgt 1680gtgctgttgc cgaatactca
ggacggacct gaattctgat tttataccag tctcttcaaa 1740aacttctcga accgctgtgt
ctcctacgta aaaaaagaga tgtacaaatc aataataatt 1800acacttttag aaactgtatc
atcaaagatt ttcagttaaa gtagcattat gtaaaggctc 1860aaaacattac cctaacaaag
taaagttttc aatacaaatt ctttgccttg tggatatcaa 1920gaaatcccaa aatattttct
taccactgta aattcaagaa gcttttgaaa tgctgaatat 1980ttctttggct gctacttgga
ggcttatcta cctgtacatt tttggggtca gctcttttta 2040acttcttgct gctctttttc
ccaaaaggta aaaatataga ttgaaaagtt aaaacatttt 2100gcatggctgc agttcctttg
tttcttgaga taagattcca aagaacttag attcatttct 2160tcaacaccga aatgctggag
gtgtttgatc agttttcaag aaacttggaa tataaataat 2220tttataattc aacaaaggtt
ttcacatttt ataaggttga tttttcaatt aaatgcaaat 2280ttgtgtggca ggatttttat
tgccattaac atatttttgt ggctgctttt tctacacatc 2340cagatggtcc ctctaactgg
gctttctcta attttgtgat gttctgtcat tgtctcccaa 2400agtatttagg agaagccctt
taaaaagctg ccttcctcta ccactttgct ggaaagcttc 2460acaattgtca cagacaaaga
tttttgttcc aatactcgtt ttgcctctat ttttcttgtt 2520tgtcaaatag taaatgatat
ttgcccttgc agtaattcta ctggtgaaaa acatgcaaag 2580aagaggaagt cacagaaaca
tgtctcaatt cccatgtgct gtgactgtag actgtcttac 2640catagactgt cttacccatc
ccctggatat gctcttgttt tttccctcta atagctatgg 2700aaagatgcat agaaagagta
taatgtttta aaacataagg cattcatctg ccatttttca 2760attacatgct gacttccctt
acaattgaga tttgcccata ggttaaacat ggttagaaac 2820aactgaaagc ataaaagaaa
aatctaggcc gggtgcagtg gctcatgcct atattccctg 2880cactttggga ggccaaagca
ggaggatcgc ttgagcccag gagttcaaga ccaacctggt 2940gaaaccccgt ctctacaaaa
aaacacaaaa aatagccagg catggtggcg tgtacatgtg 3000gtctcagata cttgggaggc
tgaggtggga gggttgatca cttgaggctg agaggtcaag 3060gttgcagtga gccataatcg
tgccactgca gtccagccta ggcaacagag tgagactttg 3120tctcaaaaaa agagaaattt
tccttaataa gaaaagtaat ttttactctg atgtgcaata 3180catttgttat taaatttatt
atttaagatg gtagcactag tcttaaattg tataaaatat 3240cccctaacat gtttaaatgt
ccatttttat tcattatgct ttgaaaaata attatgggga 3300aatacatgtt tgttattaaa
tttattatta aagatagtag cactagtctt aaatttgata 3360taacatctcc taacttgttt
aaatgtccat ttttattctt tatgcttgaa aataaattat 3420ggggatccta tttagctctt
agtaccacta atcaaaagtt cggcatgtag ctcatgatct 3480atgctgtttc tatgtcgtgg
aagcaccgga tgggggtagt gagcaaatct gccctgctca 3540gcagtcacca tagcagctga
ctgaaaatca gcactgcctg agtagttttg atcagtttaa 3600cttgaatcac taactgactg
aaaattgaat gggcaaataa gtgcttttgt ctccagagta 3660tgcgggagac ccttccacct
caagatggat atttcttccc caaggatttc aagatgaatt 3720gaaattttta atcaagatag
tgtgctttat tctgttgtat tttttattat tttaatatac 3780tgtaagccaa actgaaataa
catttgctgt tttataggtt tgaagaacat aggaaaaact 3840aagaggtttt gtttttattt
ttgctgatga agagatatgt ttaaatatgt tgtattgttt 3900tgtttagtta caggacaata
atgaaatgga gtttatattt gttatttcta ttttgttata 3960tttaataata gaattagatt
gaaataaaat ataatgggaa ataatctgca gaatgtgggt 4020ttcctggtgt ttcctctgac
tctagtgcac tgatgatctc tgataaggct cagctgcttt 4080atagttctct ggctaatgca
gcagatactc ttcctgccag tggtaatacg attttttaag 4140aaggcagttt gtcaatttta
atcttgtgga tacctttata ctcttagggt attattttat 4200acaaaagcct tgaggattgc
attctatttt ctatatgacc ctcttgatat ttaaaaaaca 4260ctatggataa caattcttca
tttacctagt attatgaaag aatgaaggag ttcaaacaaa 4320tgtgtttccc agttaactag
ggtttactgt ttgagccaat ataaatgttt aactgtttgt 4380gatggcagta ttcctaaagt
acattgcatg ttttcctaaa tacagagttt aaataatttc 4440agtaattctt agatgattca
gcttcatcat taagaatatc ttttgtttta tgttgagtta 4500gaaatgcctt catatagaca
tagtctttca gacctctact gtcagttttc atttctagct 4560gctttcaggg ttttatgaat
tttcaggcaa agctttaatt tatactaagc ttaggaagta 4620tggctaatgc caacggcagt
ttttttcttc ttaattccac atgactgagg catatatgat 4680ctctgggtag gtgagttgtt
gtgacaacca caagcacttt tttttttttt aaagaaaaaa 4740aggtagtgaa tttttaatca
tctggacttt aagaaggatt ctggagtata cttaggcctg 4800aaattatata tatttggctt
ggaaatgtgt ttttcttcaa ttacatctac aagtaagtac 4860agctgaaatt cagaggaccc
ataagagttc acatgaaaaa aatcaattca tttgaaaagg 4920caagatgcag gagagaggaa
gccttgcaaa cctgcagact gctttttgcc caatatagat 4980tgggtaaggc tgcaaaacat
aagcttaatt agctcacatg ctctgctctc acgtggcacc 5040agtggatagt gtgagagaat
taggctgtag aacaaatggc cttctctttc agcattcaca 5100ccactacaaa atcatctttt
atatcaacag aagaataagc ataaactaag caaaaggtca 5160ataagtacct gaaaccaaga
ttggctagag atatatctta atgcaatcca ttttctgatg 5220gattgttacg agttggctat
ataatgtatg tatggtattt tgatttgtgt aaaagtttta 5280aaaatcaagc tttaagtaca
tggacatttt taaataaaat atttaaagac aatttagaaa 5340attgccttaa tatcattgtt
ggctaaatag aataggggac atgcatatta aggaaaaggt 5400catggagaaa taatattggt
atcaaacaaa tacattgatt tgtcatgata cacattgaat 5460ttgatccaat agtttaagga
ataggtagga aaatttggtt tctatttttc gatttcctgt 5520aaatcagtga cataaataat
tcttagctta ttttatattt ccttgtctta aatactgagc 5580tcagtaagtt gtgttagggg
attatttctc agttgagact ttcttatatg acattttact 5640atgttttgac ttcctgacta
ttaaaaataa atagtagaaa caattttcat aaagtgaaga 5700attatataat cactgcttta
taactgactt tattatattt atttcaaagt tcatttaaag 5760gctactattc atcctctgtg
atggaatggt caggaatttg ttttctcata gtttaattcc 5820aacaacaata ttagtcgtat
ccaaaataac ctttaatgct aaactttact gatgtatatc 5880caaagcttct ccttttcaga
cagattaatc cagaagcagt cataaacaga agaataggtg 5940gtatgttcct aatgatatta
tttctactaa tggaataaac tgtaatatta gaaattatgc 6000tgctaattat atcagctctg
aggtaatttc tgaaatgttc agactcagtc ggaacaaatt 6060ggaaaattta aatttttatt
cttagctata aagcaagaaa gtaaacacat taatttcctc 6120aacattttta agccaattaa
aaatataaaa gatacacacc aatatcttct tcaggctctg 6180acaggcctcc tggaaacttc
cacatatttt tcaactgcag tataaagtca gaaaataaag 6240ttaacataac tttcactaac
acacacatat gtagatttca caaaatccac ctataattgg 6300tcaaagtggt tgagaatata
ttttttagta attgcatgca aaatttttct agcttccatc 6360ctttctccct cgtttcttct
ttttttgggg gagctggtaa ctgatgaaat cttttcccac 6420cttttctctt caggaaatat
aagtggtttt gtttggttaa cgtgatacat tctgtatgaa 6480tgaaacattg gagggaaaca
tctactgaat ttctgtaatt taaaatattt tgctgctagt 6540taactatgaa cagatagaag
aatcttacag atgctgctat aaataagtag aaaatataaa 6600tttcatcact aaaatatgct
attttaaaat ctatttccta tattgtattt ctaatcagat 6660gtattactct tattatttct
attgtatgtg ttaatgattt tatgtaaaaa tgtaattgct 6720tttcatgagt agtatgaata
aaattgatta gtttgtgttt tcttgtctcc cgaaaaaaaa 6780aaaaaaaaaa aaaaaaaaaa
aa 6802
375 1840 DNA Homo sapiens 375
cccattaggt gacaggtttt tagagaagcc aatcacgtcg ccgcggtcct ggttctaaag
60tcctcgctca cccacccgga ctcattctcc ccagacgcca aggatggtgg tcatggcgcc
120ccgaaccctc ttcctgctgc tctcgggggc cctgaccctg accgagacct gggcgggctc
180ccactccatg aggtatttca gcgccgccgt gtcccggccc ggccgcgggg agccccgctt
240catcgccatg ggctacgtgg acgacacgca gttcgtgcgg ttcgacagcg actcggcgtg
300tccgaggatg gagccgcggg cgccgtgggt ggagcaggag gggccggagt attgggaaga
360ggagacacgg aacaccaagg cccacgcaca gactgacaga atgaacctgc agaccctgcg
420cggctactac aaccagagcg aggccagttc tcacaccctc cagtggatga ttggctgcga
480cctggggtcc gacggacgcc tcctccgcgg gtatgaacag tatgcctacg atggcaagga
540ttacctcgcc ctgaacgagg acctgcgctc ctggaccgca gcggacactg cggctcagat
600ctccaagcgc aagtgtgagg cggccaatgt ggctgaacaa aggagagcct acctggaggg
660cacgtgcgtg gagtggctcc acagatacct ggagaacggg aaggagatgc tgcagcgcgc
720ggaccccccc aagacacacg tgacccacca ccctgtcttt gactatgagg ccaccctgag
780gtgctgggcc ctgggcttct accctgcgga gatcatactg acctggcagc gggatgggga
840ggaccagacc caggacgtgg agctcgtgga gaccaggcct gcaggggatg gaaccttcca
900gaagtgggca gctgtggtgg tgccttctgg agaggagcag agatacacgt gccatgtgca
960gcatgagggg ctgccggagc ccctcatgct gagatggaag cagtcttccc tgcccaccat
1020ccccatcatg ggtatcgttg ctggcctggt tgtccttgca gctgtagtca ctggagctgc
1080ggtcgctgct gtgctgtgga gaaagaagag ctcagattga aaaggaggga gctactctca
1140ggctgcaagt aagtatgaag gaggctgatc cctgagatcc ttgggatctt gtgtttggga
1200gccatggggg agctcaccca ccccacaatt cctcctctgg ccacatctcc tgtggtctct
1260gaccaggtgc tgtttttgtt ctactctagg cagtgacagt gcccagggct ctaatgtgtc
1320tctcacggct tgtaaatgtg acaccccggg gggcctgatg tgtgtgggtt gttgagggga
1380acaggggaca tagctgtgct atgaggtttc tttgacttca atgtattgag catgtgatgg
1440gctgtttaaa gtgtcacccc tcactgtgac tgatatgaat ttgttcatga atatttttct
1500gtagtgtgaa acagctgccc tgtgtgggac tgagtggcaa gtccctttgt gacttcaaga
1560accctgactt ctctttgtgc agagaccagc ccacccctgt gcccaccatg accctcttcc
1620tcatgctgaa ctgcattcct tccccaatca cctttcctgt tccagaaaag gggctgggat
1680gtctccgtct ctgtctcaaa tttgtggtcc actgagctat aacttacttc tgtattaaaa
1740ttagaatctg agtgtaaatt tactttttca aattatttcc aagagagatt gatgggttaa
1800ttaaaggaga agattcctga aatttgagag acaaaataaa
1840
376 6754 DNA Homo sapiens 376
gtcgacgtgg cggccggcgg cggctgcggg ctgagcggcg
agtttccgat ttaaagctga 60gctgcgagga aaatggcggc gggaggatca aaatacttgc
tggatggtgg actcagagac 120caataaaaat aaactgcttg aacatccttt gactggttag
ccagttgctg atgtatattc 180aagatgagtg gattaggaga aaacttggat ccactggcca
gtgattcacg aaaacgcaaa 240ttgccatgtg atactccagg acaaggtctt acctgcagtg
gtgaaaaacg gagacgggag 300caggaaagta aatatattga agaattggct gagctgatat
ctgccaatct tagtgatatt 360gacaatttca atgtcaaacc agataaatgt gcgattttaa
aggaaacagt aagacagata 420cgtcaaataa aagagcaagg aaaaactatt tccaatgatg
atgatgttca aaaagccgat 480gtatcttcta cagggcaggg agttattgat aaagactcct
taggaccgct tttacttcag 540gcattggatg gtttcctatt tgtggtgaat cgagacggaa
acattgtatt tgtatcagaa 600aatgtcacac aatacctgca atataagcaa gaggacctgg
ttaacacaag tgtttacaat 660atcttacatg aagaagacag aaaggatttt cttaagaatt
taccaaaatc tacagttaat 720ggagtttcct ggacaaatga gacccaaaga caaaaaagcc
atacatttaa ttgccgtatg 780ttgatgaaaa caccacatga tattctggaa gacataaacg
ccagtcctga aatgcgccag 840agatatgaaa caatgcagtg ctttgccctg tctcagccac
gagctatgat ggaggaaggg 900gaagatttgc aatcttgtat gatctgtgtg gcacgccgca
ttactacagg agaaagaaca 960tttccatcaa accctgagag ctttattacc agacatgatc
tttcaggaaa ggttgtcaat 1020atagatacaa attcactgag atcctccatg aggcctggct
ttgaagatat aatccgaagg 1080tgtattcaga gattttttag tctaaatgat gggcagtcat
ggtcccagaa acgtcactat 1140caagaagtta ccagtgatgg gatattttcc ccaacagctt
atcttaatgg ccatgcagaa 1200accccagtat atcgattctc gttggctgat ggaactatag
tgactgcaca gacaaaaagc 1260aaactcttcc gaaatcctgt aacaaatgat cgacatggct
ttgtctcaac ccacttcctt 1320cagagagaac agaatggata tagaccaaac ccaaatcctg
ttggacaagg gattagacca 1380cctatggctg gatgcaacag ttcggtaggc ggcatgagta
tgtcgccaaa ccaaggctta 1440cagatgccga gcagcagggc ctatggcttg gcagacccta
gcaccacagg gcagatgagt 1500ggagctaggt atgggggttc cagtaacata gcttcattga
cccctgggcc aggcatgcaa 1560tcaccatctt cctaccagaa caacaactat aggctcaaca
tgagtagccc cccacatggg 1620agtcctggtc ttgccccaaa ccagcagaat atcatgattt
ctcctcgtaa tcgtgggagt 1680ccaaagatag cctcacatca gttttctcct gttgcaggtg
tgcactctcc catggcatct 1740tctggcaata ctgggaacca cagcttttcc agcagctctc
tcagtgccct gcaagccatc 1800agtgaaggtg tggggacttc ccttttatct actctgtcat
caccaggccc caaattggat 1860aactctccca atatgaatat tacccaacca agtaaagtaa
gcaatcagga ttccaagagt 1920cctctgggct tttattgcga ccaaaatcca gtggagagtt
caatgtgtca gtcaaatagc 1980agagatcacc tcagtgacaa agaaagtaag gagagcagtg
ttgagggggc agagaatcaa 2040aggggtcctt tggaaagcaa aggtcataaa aaattactgc
agttacttac ctgttcttct 2100gatgaccggg gtcattcctc cttgaccaac tcccccctag
attcaagttg taaagaatct 2160tctgttagtg tcaccagccc ctctggagtc tcctcctcta
catctggagg agtatcctct 2220acatccaata tgcatgggtc actgttacaa gagaagcacc
ggattttgca caagttgctg 2280cagaatggga attcaccagc tgaggtagcc aagattactg
cagaagccac tgggaaagac 2340accagcagta taacttcttg tggggacgga aatgttgtca
agcaggagca gctaagtcct 2400aagaagaagg agaataatgc acttcttaga tacctgctgg
acagggatga tcctagtgat 2460gcactctcta aagaactaca gccccaagtg gaaggagtgg
ataataaaat gagtcagtgc 2520accagctcca ccattcctag ctcaagtcaa gagaaagacc
ctaaaattaa gacagagaca 2580agtgaagagg gatctggaga cttggataat ctagatgcta
ttcttggtga tctgactagt 2640tctgactttt acaataattc catatcctca aatggtagtc
atctggggac taagcaacag 2700gtgtttcaag gaactaattc tctgggtttg aaaagttcac
agtctgtgca gtctattcgt 2760cctccatata accgagcagt gtctctggat agccctgttt
ctgttggctc aagtcctcca 2820gtaaaaaata tcagtgcttt ccccatgtta ccaaagcaac
ccatgttggg tgggaatcca 2880agaatgatgg atagtcagga aaattatggc tcaagtatgg
gagactgggg cttaccaaac 2940tcaaaggccg gcagaatgga acctatgaat tcaaactcca
tgggaagacc aggaggagat 3000tataatactt ctttacccag acctgcactg ggtggctcta
ttcccacatt gcctcttcgg 3060tctaatagca taccaggtgc gagaccagta ttgcaacagc
agcagcagat gcttcaaatg 3120aggcctggtg aaatccccat gggaatgggg gctaatccct
atggccaagc agcagcatct 3180aaccaactgg gttcctggcc cgatggcatg ttgtccatgg
aacaagtttc tcatggcact 3240caaaataggc ctcttcttag gaattccctg gatgatcttg
ttgggccacc ttccaacctg 3300gaaggccaga gtgacgaaag agcattattg gaccagctgc
acactcttct cagcaacaca 3360gatgccacag gcctggaaga aattgacaga gctttgggca
ttcctgaact tgtcaatcag 3420ggacaggcat tagagcccaa acaggatgct ttccaaggcc
aagaagcagc agtaatgatg 3480gatcagaagg caggattata tggacagaca tacccagcac
aggggcctcc aatgcaagga 3540ggctttcatc ttcagggaca atcaccatct tttaactcta
tgatgaatca gatgaaccag 3600caaggcaatt ttcctctcca aggaatgcac ccacgagcca
acatcatgag accccggaca 3660aacaccccca agcaacttag aatgcagctt cagcagaggc
tgcagggcca gcagtttttg 3720aatcagagcc gacaggcact tgaattgaaa atggaaaacc
ctactgctgg tggtgctgcg 3780gtgatgaggc ctatgatgca gccccagcag ggttttctta
atgctcaaat ggtcgcccaa 3840cgcagcagag agctgctaag tcatcacttc cgacaacaga
gggtggctat gatgatgcag 3900cagcagcaac agcagcagca gcagcagcag cagcagcaac
agcaacagca acagcaacag 3960cagcaacagc agcaaaccca ggccttcagc ccacctccta
atgtgactgc ttcccccagc 4020atggatgggc ttttggcagg acccacaatg ccacaagctc
ctccgcaaca gtttccatat 4080caaccaaatt atggaatggg acaacaacca gatccagcct
ttggtcgagt gtctagtcct 4140cccaatgcaa tgatgtcgtc aagaatgggt ccctcccaga
atcccatgat gcaacacccg 4200caggctgcat ccatctatca gtcctcagaa atgaagggct
ggccatcagg aaatttggcc 4260aggaacagct ccttttccca gcagcagttt gcccaccagg
ggaatcctgc agtgtatagt 4320atggtgcaca tgaatggcag cagtggtcac atgggacaga
tgaacatgaa ccccatgccc 4380atgtctggca tgcctatggg tcctgatcag aaatactgct
gacatctctg caccaggacc 4440tcttaaggaa accactgtac aaatgacact gcactaggat
tattgggaag gaatcattgt 4500tccaggcatc catcttggaa gaaaggacca gctttgagct
ccatcaaggg tattttaagt 4560gatgtcattt gagcaggact ggattttaag ccgaagggca
atatctacgt gtttttcccc 4620cctccttctg ctgtgtatca tggtgttcaa aacagaaatg
ttttttggca ttccacctcc 4680tagggatata attctggaga catggagtgt tactgatcat
aaaacttttg tgtcactttt 4740ttctgccttg ctagccaaaa tctcttaaat acacgtaggt
gggccagaga acattggaag 4800aatcaagaga gattagaata tctggtttct ctagttgcag
tattggacaa agagcatagt 4860cccagccttc aggtgtagta gttctgtgtt gaccctttgt
ccagtggaat tggtgattct 4920gaattgtcct ttactaatgg tgttgagttg ctctgtccct
attatttgcc ctaggctttc 4980tcctaatgaa ggttttcatt tgccattcat gtcctgtaat
acttcacctc caggaactgt 5040catggatgtc caaatggctt tgcagaaagg aaatgagatg
acagtattta atcgcagcag 5100tagcaaactt ttcacatgct aatgtgcagc tgagtgcact
ttatttaaaa agaatggata 5160aatgcaatat tcttgaggtc ttgagggaat agtgaaacac
attcctggtt tttgcctaca 5220cttacgtgtt agacaagaac tatgattttt ttttttaaag
tactggtgtc accctttgcc 5280tatatggtag agcaataatg ctttttaaaa ataaacttct
gaaaacccaa ggccaggtac 5340tgcattctga atcagaatct cgcagtgttt ctgtgaatag
atttttttgt aaatatgacc 5400tttaagatat tgtattatgt aaaatatgta tatacctttt
tttgtaggtc acaacaactc 5460atttttacag agtttgtgaa gctaaatatt taacattgtt
gatttcagta agctgtgtgg 5520tgaggctacc agtggaagag acatcccttg acttttgtgg
cctgggggag gggtagtgca 5580ccacagcttt tccttcccca ccccccagcc ttagatgcct
cgctcttttc aatctcttaa 5640tctaaatgct ttttaaagag attatttgtt tagatgtagg
cattttaatt ttttaaaaat 5700tcctctacca gaactaagca ctttgttaat ttggggggaa
agaatagata tggggaaata 5760aacttaaaaa aaaatcagga atttaaaaaa aacgagcaat
ttgaagagaa tcttttggat 5820tttaagcagt ccgaaataat agcaattcat gggctgtgtg
tgtgtgtgta tgtgtgtgtg 5880tgtgtgtgta tgtttaatta tgttaccttt tcatcccctt
taggagcgtt ttcagatttt 5940ggttcgtaag acctgaatcc catattgaga tctcgagtag
aatccttggt gtggtttctg 6000gtgtctgctc agctgtcccc tcattctact aatgtgatgc
tttcattatg tccctgtgga 6060ttagaatagt gtcagttatt tcttaagtaa ctcagtaccc
agaacagcca gttttactgt 6120gattcagagc cacagtctaa ctgagcacct tttaaacccc
tccctcttct gccccctacc 6180acttttctgc tgttgcctct ctttgacacc tgttttagtc
agttgggagg aagggaaaaa 6240tcaagtttaa ttccctttat ctgggttaat tcatttggtt
caaatagttg acggaattgg 6300gtttctgaat gtctgtgaat ttcagaggtc tctgctagcc
ttggtatcat tttctagcaa 6360taactgagag ccagttaatt ttaagaattt cacacattta
gccaatcttt ctagatgtct 6420ctgaaggtaa gatcatttaa tatctttgat atgcttacga
gtaagtgaat cctgattatt 6480tccagaccca ccaccagagt ggatcttatt ttcaaagcag
tatagacaat tatgagtttg 6540ccctctttcc cctaccaagt tcaaaatata tctaagaaag
attgtaaatc cgaaaacttc 6600cattgtagtg gcctgtgctt ttcagatagt atactctcct
gtttggagac agaggaagaa 6660ccaggtcagt ctgtctcttt ttcagctcaa ttgtatctga
cccttcttta agttatgtgt 6720gtggggagaa atagaatggt gctcttatgt cgac
6754
<210> 377 <211> 757 <212> DNA <213> Homo sapiens <400> 377
ggaaccgaga ggctgagact
aacccagaaa catccaattc tcaaactgaa gctcgcactc 60tcgcctccag catgaaagtc
tctgccgccc ttctgtgcct gctgctcata gcagccacct 120tcattcccca agggctcgct
cagccagatg caatcaatgc cccagtcacc tgctgttata 180acttcaccaa taggaagatc
tcagtgcaga ggctcgcgag ctatagaaga atcaccagca 240gcaagtgtcc caaagaagct
gtgatcttca agaccattgt ggccaaggag atctgtgctg 300accccaagca gaagtgggtt
caggattcca tggaccacct ggacaagcaa acccaaactc 360cgaagacttg aacactcact
ccacaaccca agaatctgca gctaacttat tttcccctag 420ctttccccag acaccctgtt
ttattttatt ataatgaatt ttgtttgttg atgtgaaaca 480ttatgcctta agtaatgtta
attcttattt aagttattga tgttttaagt ttatctttca 540tggtactagt gttttttaga
tacagagact tggggaaatt gcttttcctc ttgaaccaca 600gttctacccc tgggatgttt
tgagggtctt tgcaagaatc attaatacaa agaatttttt 660ttaacattcc aatgcattgc
taaaatatta ttgtggaaat gaatattttg taactattac 720accaaataaa tatatttttg
tacaaaaaaa aaaaaaa 757
<210> 378 <211> 476 <212> DNA <213> Homo sapiens <400> 378
taaaggcaaa gaaggttttt atttaagtga caacatttga gagctaaaaa ccagctcaca
60tcaaaatcaa gacccagttg taaaaatctt ttaactccat aatgctgttt ttgtcttgtt
120agaaatctga tatcttacat tagcgtttct aacggatttt gtacaaggca gccataagga
180atataataaa cctttttcac cacagaacca tctgtcacag ataatactga aagttacaca
240cttaggaaca gtcagaccac agacaaggtc agactggctg ccaccaccaa gtaaacaact
300agaaaaggac agcggggtcc aagggtgggg gtccctgtgc acgagtcgcc ctcctctggc
360ctgccccccc tcgggtcacc tgtttctcct ttgccccaaa gagggtggag tcaaatgcag
420attttcctcc caactgcctg ttagtgtctc aacaaggaga gcagagccca ggtcag
476
<210> 379 <211> 2518 <212> DNA <213> Homo sapiens <400> 379
gggtgcgctc ggccgtggcg cacctggtga gctccggggg
cgctccgcct ccgcgcccca 60aatccccgga cctgcccaac gccgcctcgg cgccgcccgc
cgccgctcca gaagcgccca 120ggagccctcc cgcgaaggct gggagcggga gcgcgacgcc
cgcgaaggct gttgaggctc 180gagcgagctt ctccagaccg acctttctgc agctgagccc
cggggggctg cgacgcgccg 240atgaccacgc gggccgggct gtgcaaagcc ccccggacac
gggccgccgc ctgccctgga 300gcacaggcta cgccgagtga gcgccccctg gggcacccaa
accaggatgg ggctcccacc 360cctctcccca gctccgcatc cccggcgcta ggacgcgttc
cccacgccgc gtccgggcca 420ggagctccct tttccgtgga cctttgctat cctctggtct
tcgggccgca ccccctccca 480acccattttc cagtgggggg cagcctgtgt caccttcttc
acgtccttcc cgctcattga 540ctgccctcgc ccacgccgcc tcaggaccct gttctgcccc
agagcccgga gggcggagag 600cccggcgaag gatgagttgg ccagttcccc gtcgcggccc
ggcagcttaa aggctaaggg 660aaaaggggtt tcacgaagga gcggggttct ttttaatagg
ggacatagcg gttgggaaga 720ctcgctcacc cgcttcccgg ctccagcgcc ccagttccct
gtccctctta ccgtagttcc 780cctccccctc cacacccaga aatagcccgc gacaccagga
ggccgccagc ttccccagga 840gcggggaggg ggacgcccgg ggtagaggag ggtcccattt
agatgccctt cagcctgcca 900actcgtgctg gcctggcaaa gaagcggacc ccctgcccgg
agcggccggc tggcccccgg 960gctgtgtgta ttttaaatgc atctgccggg aacgcagagc
accgagggag atgggggcgc 1020tcagttcgct gaggaaggtg gctggtggcc catggaccca
ccaccacctc ccttagcctc 1080ctgtgtggga ggagtttatg ggtatgtggc tcctgcccag
tccaggtggg ctttcacttc 1140tactctattt cagttcctct ttcccgatct gggctggaga
gcttcctcat tgttaaggca 1200gcagaaactt tcgctggatg gttttaggat aaggggtcat
caatgctggc aagagtcggc 1260acaatgagga ccaggcttgc tgtgaagtgg tgtatgtgga
aggtcggagg agtgttacag 1320gagtacctag ggagcctagc cgaggccagg gactctgctt
ctactactgg ggcctatttg 1380atgggcatgc agggggcgga gctgctgaaa tggcctcacg
gctcctgcat cgccatatcc 1440gagagcagct aaaggacctg aaggaagtga gccacgagag
cctggtagtg ggggccattg 1500agaatgcctt ccagctcatg gatgagcaga tggcccggga
gcggcgtggc caccaagtgg 1560aggggggctg ctgtgcactg gttgtgatct acctgctagg
caaggtgtac gtggccaatg 1620caggcgatag cagggccatc attgtccgga atggtgaaat
cattccaatg tcccgggagt 1680ttaccccgga gactgagcgc cagcgtcttc agctgcttgg
cttcctgaaa ccagagctgc 1740taggcagtga attcacccac cttgagttcc cccgcagagt
tctgcccaag gagctggggc 1800agaggatgtt gtaccgggac cagaacatga ccggctgggc
ctacaaaaag atcgagctgg 1860aggatctcag gtttcctctg gtctgtgggg agggcaaaaa
ggctcgggtg atggccacca 1920ttggggtgac ccgaggcttg ggagaccaca gccttaaggt
ctgcagttcc accctgccca 1980tcaagccctt tctctcctgc ttccctgagg tacgagtgta
tgacctgaca caatatgagc 2040actgcccaga tgatgtgcta gtcctgggaa cagatggcct
gtgggatgtc actactgact 2100gtgaggtagc tgccactgtg gacagggtgc tgtcggccta
tgagcctaat gaccacagca 2160ggtatacagc tctggcccaa gctctggtcc tgggggcccg
gggtaccccc cgagaccgtg 2220gctggcgtct ccccaacaac aagctgggtt ccggggatga
catctctgtc ttcgtcatcc 2280ccctgggagg gccaggcagt tactcctgag gggctgaaca
ccatccctcc cactagcctc 2340tccatactta ctcctctcac agcccaaatt ctgaagttgt
ctccctgacc cttctttagt 2400ggcaacttaa ctgaagaagg gatgtccgct atatccaaaa
ttacagctat tggcaaataa 2460acgagatgga taaaaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaa 2518
<210> 380 <211> 4160 <212> DNA <213> Homo sapiens <400> 380
gcgcttgcgg
aggattgcgt tgacgagact cttatttatt gtcaccaacc tgtggtggaa 60tttgcagttg
cacattggat ctgattcgcc ccgccccgaa tgacgcctgc ccggaggcag 120tgaaagtaca
gccgcgccgc cccaagtcag cctggacaca taaatcagca cgcggccgga 180gaaccccgca
atctctgcgc ccacaaaata caccgacgat gcccgatcta ctttaagggc 240tgaaacccac
gggcctgaga gactataaga gcgttcccta ccgccatgga acaacgggga 300cagaacgccc
cggccgcttc gggggcccgg aaaaggcacg gcccaggacc cagggaggcg 360cggggagcca
ggcctgggct ccgggtcccc aagacccttg tgctcgttgt cgccgcggtc 420ctgctgttgg
tctcagctga gtctgctctg atcacccaac aagacctagc tccccagcag 480agagcggccc
cacaacaaaa gaggtccagc ccctcagagg gattgtgtcc acctggacac 540catatctcag
aagacggtag agattgcatc tcctgcaaat atggacagga ctatagcact 600cactggaatg
acctcctttt ctgcttgcgc tgcaccaggt gtgattcagg tgaagtggag 660ctaagtccct
gcaccacgac cagaaacaca gtgtgtcagt gcgaagaagg caccttccgg 720gaagaagatt
ctcctgagat gtgccggaag tgccgcacag ggtgtcccag agggatggtc 780aaggtcggtg
attgtacacc ctggagtgac atcgaatgtg tccacaaaga atcaggtaca 840aagcacagtg
gggaagcccc agctgtggag gagacggtga cctccagccc agggactcct 900gcctctccct
gttctctctc aggcatcatc ataggagtca cagttgcagc cgtagtcttg 960attgtggctg
tgtttgtttg caagtcttta ctgtggaaga aagtccttcc ttacctgaaa 1020ggcatctgct
caggtggtgg tggggaccct gagcgtgtgg acagaagctc acaacgacct 1080ggggctgagg
acaatgtcct caatgagatc gtgagtatct tgcagcccac ccaggtccct 1140gagcaggaaa
tggaagtcca ggagccagca gagccaacag gtgtcaacat gttgtccccc 1200ggggagtcag
agcatctgct ggaaccggca gaagctgaaa ggtctcagag gaggaggctg 1260ctggttccag
caaatgaagg tgatcccact gagactctga gacagtgctt cgatgacttt 1320gcagacttgg
tgccctttga ctcctgggag ccgctcatga ggaagttggg cctcatggac 1380aatgagataa
aggtggctaa agctgaggca gcgggccaca gggacacctt gtacacgatg 1440ctgataaagt
gggtcaacaa aaccgggcga gatgcctctg tccacaccct gctggatgcc 1500ttggagacgc
tgggagagag acttgccaag cagaagattg aggaccactt gttgagctct 1560ggaaagttca
tgtatctaga aggtaatgca gactctgcca tgtcctaagt gtgattctct 1620tcaggaagtc
agaccttccc tggtttacct tttttctgga aaaagcccaa ctggactcca 1680gtcagtagga
aagtgccaca attgtcacat gaccggtact ggaagaaact ctcccatcca 1740acatcaccca
gtggatggaa catcctgtaa cttttcactg cacttggcat tatttttata 1800agctgaatgt
gataataagg acactatgga aatgtctgga tcattccgtt tgtgcgtact 1860ttgagatttg
gtttgggatg tcattgtttt cacagcactt ttttatccta atgtaaatgc 1920tttatttatt
tatttgggct acattgtaag atccatctac acagtcgttg tccgacttca 1980cttgatacta
tatgatatga accttttttg ggtggggggt gcggggcagt tcactctgtc 2040tcccaggctg
gagtgcaatg gtgcaatctt ggctcactat agccttgacc tctcaggctc 2100aagcgattct
cccacctcag ccatccaaat agctgggacc acaggtgtgc accaccacgc 2160ccggctaatt
ttttgtattt tgtctagata taggggctct ctatgttgct cagggtggtc 2220tcgaattcct
ggactcaagc agtctgccca cctcagactc ccaaagcggt ggaattagag 2280gcgtgagccc
ccatgcttgg ccttaccttt ctacttttat aattctgtat gttattattt 2340tatgaacatg
aagaaacttt agtaaatgta cttgtttaca tagttatgtg aatagattag 2400ataaacataa
aaggaggaga catacaatgg gggaagaaga agaagtcccc tgtaagatgt 2460cactgtctgg
gttccagccc tccctcagat gtactttggc ttcaatgatt ggcaacttct 2520acaggggcca
gtcttttgaa ctggacaacc ttacaagtat atgagtatta tttataggta 2580gttgtttaca
tatgagtcgg gaccaaagag aactggatcc acgtgaagtc ctgtgtgtgg 2640ctggtcccta
cctgggcagt ctcatttgca cccatagccc ccatctatgg acaggctggg 2700acagaggcag
atgggttaga tcacacataa caatagggtc tatgtcatat cccaagtgaa 2760cttgagccct
gtttgggctc aggagataga agacaaaatc tgtctcccac gtctgccatg 2820gcatcaaggg
ggaagagtag atggtgcttg agaatggtgt gaaatggttg ccatctcagg 2880agtagatggc
ccggctcact tctggttatc tgtcaccctg agcccatgag ctgcctttta 2940gggtacagat
tgcctacttg aggaccttgg ccgctctgta agcatctgac tcatctcaga 3000aatgtcaatt
cttaaacact gtggcaacag gacctagaat ggctgacgca ttaaggtttt 3060cttcttgtgt
cctgttctat tattgtttta agacctcagt aaccatttca gcctctttcc 3120agcaaaccct
tctccatagt atttcagtca tggaaggatc atttatgcag gtagtcattc 3180caggagtttt
tggtcttttc tgtctcaagg cattgtgtgt tttgttccgg gactggtttg 3240ggtgggacaa
agttagaatt gcctgaagat cacacattca gactgttgtg tctgtggagt 3300tttaggagtg
gggggtgacc tttctggtct ttgcacttcc atcctctccc acttccatct 3360ggcatcccac
gcgttgtccc ctgcacttct ggaaggcaca gggtgctgct gcctcctggt 3420ctttgccttt
gctgggcctt ctgtgcagga cgctcagcct cagggctcag aaggtgccag 3480tccggtccca
ggtcccttgt cccttccaca gaggccttcc tagaagatgc atctagagtg 3540tcagccttat
cagtgtttaa gatttgtctt ttatttttaa tttttttgag acagaatctc 3600actctctcgc
ccaggctgga gtgcaacggt acgatcttgg ctcagtgcaa cctccgcctc 3660ctgggttcaa
gcgattctcg tgcctcagcc tccggagtag ctgggattgc aggcacccgc 3720caccacgcct
ggctaatttt tgtattttta gtagagacgg ggtttcacca tgttggtcag 3780gctggtctcg
aactcctgac ctcaggtgat ccaccttggc ctccgaaagt gctgggatta 3840caggcgtgag
ccaccagcca ggccaagcta ttcttttaaa gtaagcttcc tgacgacatg 3900aaataattgg
gggttttgtt gtttagttac attaggcttt gctatatccc caggccaaat 3960agcatgtgac
acaggacagc catagtatag tgtgtcactc gtggttggtg tcctttcatg 4020cttctgccct
gtcaaaggtc cctatttgaa atgtgttata atacaaacaa ggaagcacat 4080tgtgtacaaa
atacttatgt atttatgaat ccatgaccaa attaaatatg aaaccttata 4140taaaaaaaaa
aaaaaaaaaa
4160
<210> 381 <211> 1295 <212> DNA <213> Homo sapiens <400> 381
gtgcggagtt tggctgctcc ggggttagca ggtgagcctg
cgatgcgcgg gaagacgttc 60cgctttgaaa tgcagcggga tttggtgagt ttcccgctgt
ctccagcggt gcgggtgaag 120ctggtgtctg cggggttcca gactgctgag gaactcctag
aggtgaaacc ctccgagctt 180agcaaagaag ttgggatatc taaagcagaa gccttagaaa
ctctgcaaat tatcagaaga 240gaatgtctca caaataaacc aagatatgct ggtacatctg
agtcacacaa gaagtgtaca 300gcactggaac ttcttgagca ggagcatacc cagggcttca
taatcacctt ctgttcagca 360ctagatgata ttcttggggg tggagtgccc ttaatgaaaa
caacagaaat ttgtggtgca 420ccaggtgttg gaaaaacaca attatgtatg cagttggcag
tagatgtgca gataccagaa 480tgttttggag gagtggcagg tgaagcagtt tttattgata
cagagggaag ttttatggtt 540gatagagtgg tagaccttgc tactgcctgc attcagcacc
ttcagcttat agcagaaaaa 600cacaagggag aggaacaccg aaaagctttg gaggatttca
ctcttgataa tattctttct 660catatttatt attttcgctg tcgtgactac acagagttac
tggcacaagt ttatcttctt 720ccagatttcc tttcagaaca ctcaaaggtt cgactagtga
tagtggatgg tattgctttt 780ccatttcgtc atgacctaga tgacctgtct cttcgtactc
ggttattaaa tggcctagcc 840cagcaaatga tcagccttgc aaataatcac agattagctg
taattttaac caatcagatg 900acaacaaaga ttgatagaaa tcaggccttg cttgttcctg
cattagggga aagttgggga 960catgctgcta caatacggct aatctttcat tgggaccgaa
agcaaaggtt ggcaacattg 1020tacaagtcac ccagccagaa ggaatgcaca gtactgtttc
aaatcaaacc tcagggattt 1080agagatactg ttgttacttc tgcatgttca ttgcaaacag
aaggttcctt gagcacccgg 1140aaacggtcac gagacccaga ggaagaatta taacccagaa
acaaatctca aagtgtacaa 1200atttattgat gttgtgaaat caatgtgtac aagtggactt
gttaccttaa agtataaata 1260aacacactat ggcatgaatg aaaaaaaaaa aaaaa
1295
<210> 382 <211> 2210 <212> DNA <213> Homo sapiens <400> 382
cgcgcccctc cctcctcgcg
gacctggcgg tgccggcgcc cggagtggcc ctttaaaagg 60cagcttattg tccggagggg
gcgggcgggg ggcgccgacc gcggcctgag gcccggcccc 120tcccctctcc ctccctctgt
ccccgcgtcg ctcgctggct agctcgctgg ctcgctcgcc 180cgtccggcgc acgctccgcc
tccgtcagtt ggctccgctg tcgggtgcgc ggcgtggagc 240ggcagccggt ctggacgcgc
ggccggggct gggggctggg agcgcggcgc gcaagatctc 300cccgcgcgag agcggcccct
gccaccgggc gaggcctgcg ccgcgatggc agagatgggc 360agtaaagggg tgacggcggg
aaagatcgcc agcaacgtgc agaagaagct cacccgcgcg 420caggagaagg ttctccagaa
gctggggaag gcagatgaga ccaaggatga gcagtttgag 480cagtgcgtcc agaatttcaa
caagcagctg acggagggca cccggctgca gaaggatctc 540cggacctacc tggcctccgt
caaagccatg cacgaggctt ccaagaagct gaatgagtgt 600ctgcaggagg tgtatgagcc
cgattggccc ggcagggatg aggcaaacaa gatcgcagag 660aacaacgacc tgctgtggat
ggattaccac cagaagctgg tggaccaggc gctgctgacc 720atggacacgt acctgggcca
gttccccgac atcaagtcac gcattgccaa gcgggggcgc 780aagctggtgg actacgacag
tgcccggcac cactacgagt cccttcaaac tgccaaaaag 840aaggatgaag ccaaaattgc
caaggccgag gaggagctca tcaaagccca gaaggtgttt 900gaggagatga atgtggatct
gcaggaggag ctgccgtccc tgtggaacag ccgcgtaggt 960ttctacgtca acacgttcca
gagcatcgcg ggcctggagg aaaacttcca caaggagatg 1020agcaagctca accagaacct
caatgatgtg ctggtcggcc tggagaagca acacgggagc 1080aacaccttca cggtcaaggc
ccagcccaga aagaaaagta aactgttttc gcggctgcgc 1140agaaagaaga acagtgacaa
cgcgcctgca aaagggaaca agagcccttc gcctccagat 1200ggctcccctg ccgccacccc
cgagatcaga gtcaaccacg agccagagcc ggccggcggg 1260gccacgcccg gggccaccct
ccccaagtcc ccatctcagc cagcagaggc ctcggaggtg 1320gcgggtggga cccaacctgc
ggctggagcc caggagccag gggagacggc ggcaagtgaa 1380gcagcctcca gctctcttcc
tgctgtcgtg gtggagacct tcccagcaac tgtgaatggc 1440accgtggagg gcggcagtgg
ggccgggcgc ttggacctgc ccccaggttt catgttcaag 1500gtacaggccc agcacgacta
cacggccact gacacagacg agctgcagct caaggctggt 1560gatgtggtgc tggtgatccc
cttccagaac cctgaagagc aggatgaagg ctggctcatg 1620ggcgtgaagg agagcgactg
gaaccagcac aaggagctgg agaagtgccg tggcgtcttc 1680cccgagaact tcactgagag
ggtcccatga cggcggggcc caggcagcct ccgggcgtgt 1740gaagaacacc tcctcccgaa
aaatgtgtgg ttcttttttt tgttttgttt tcgtttttca 1800tcttttgaag agcaaaggga
aatcaagagg agacccccag gcagaggggc gttctcccaa 1860agattaggtc gttttccaaa
gagccgcgtc ccggcaagtc cggcggaatt caccagtgtt 1920cctgaagctg ctgtgtcctc
tagttgagtt tctggcgccc ctgcctgtgc ccgcatgtgt 1980gcctggccgc agggcggggc
tgggggctgc cgagccacca tgcttgcctg aagcttcggc 2040cgcgccaccc gggcaagggt
cctcttttcc tggcagctgc tgtgggtggg gcccagacac 2100cagcctagcc tggctctgcc
ccgcagacgg tctgtgtgct gtttgaaaat aaatcttagt 2160gttcaaaaca aaatgaaaca
aaaaaaaaat gataaaaact ctcaaaaaaa 2210
<210> 383 <211> 4604 <212> DNA <213> Homo sapiens <400> 383
ggaacagctt gtccacccgc cggccggacc agaagccttt gggtctgaag tgtctgtgag
60acctcacaga agagcacccc tgggctccac ttacctgccc cctgctcctt cagggatgga
120ggcaatggcg gccagcactt ccctgcctga ccctggagac tttgaccgga acgtgccccg
180gatctgtggg gtgtgtggag accgagccac tggctttcac ttcaatgcta tgacctgtga
240aggctgcaaa ggcttcttca ggcgaagcat gaagcggaag gcactattca cctgcccctt
300caacggggac tgccgcatca ccaaggacaa ccgacgccac tgccaggcct gccggctcaa
360acgctgtgtg gacatcggca tgatgaagga gttcattctg acagatgagg aagtgcagag
420gaagcgggag atgatcctga agcggaagga ggaggaggcc ttgaaggaca gtctgcggcc
480caagctgtct gaggagcagc agcgcatcat tgccatactg ctggacgccc accataagac
540ctacgacccc acctactccg acttctgcca gttccggcct ccagttcgtg tgaatgatgg
600tggagggagc catccttcca ggcccaactc cagacacact cccagcttct ctggggactc
660ctcctcctcc tgctcagatc actgtatcac ctcttcagac atgatggact cgtccagctt
720ctccaatctg gatctgagtg aagaagattc agatgaccct tctgtgaccc tagagctgtc
780ccagctctcc atgctgcccc acctggctga cctggtcagt tacagcatcc aaaaggtcat
840tggctttgct aagatgatac caggattcag agacctcacc tctgaggacc agatcgtact
900gctgaagtca agtgccattg aggtcatcat gttgcgctcc aatgagtcct tcaccatgga
960cgacatgtcc tggacctgtg gcaaccaaga ctacaagtac cgcgtcagtg acgtgaccaa
1020agccggacac agcctggagc tgattgagcc cctcatcaag ttccaggtgg gactgaagaa
1080gctgaacttg catgaggagg agcatgtcct gctcatggcc atctgcatcg tctccccaga
1140tcgtcctggg gtgcaggacg ccgcgctgat tgaggccatc caggaccgcc tgtccaacac
1200actgcagacg tacatccgct gccgccaccc gcccccgggc agccacctgc tctatgccaa
1260gatgatccag aagctagccg acctgcgcag cctcaatgag gagcactcca agcagtaccg
1320ctgcctctcc ttccagcctg agtgcagcat gaagctaacg ccccttgtgc tcgaagtgtt
1380tggcaatgag atctcctgac taggacagcc tgtgcggtgc ctgggtgggg ctgctcctcc
1440agggccacgt gccaggcccg gggctggcgg ctactcagca gccctcctca cccgtctggg
1500gttcagcccc tcctctgcca cctcccctat ccacccagcc cattctctct cctgtccaac
1560ctaacccctt tcctgcgggc ttttccccgg tcccttgaga cctcagccat gaggagttgc
1620tgtttgtttg acaaagaaac ccaagtgggg gcagagggca gaggctggag gcaggccttg
1680cccagagatg cctccaccgc tgcctaagtg gctgctgact gatgttgagg gaacagacag
1740gagaaatgca tccattcctc agggacagag acacctgcac ctccccccac tgcaggcccc
1800gcttgtccag cgcctagtgg ggtctccctc tcctgcctta ctcacgataa ataatcggcc
1860cacagctccc accccacccc cttcagtgcc caccaacatc ccattgccct ggttatattc
1920tcacgggcag tagctgtggt gaggtgggtt ttcttcccat cactggagca ccaggcacga
1980acccacctgc tgagagaccc aaggaggaaa aacagacaaa aacagcctca cagaagaata
2040tgacagctgt ccctgtcacc aagctcacag ttcctcgccc tgggtctaag gggttggttg
2100aggtggaagc cctccttcca cggatccatg tagcaggact gaattgtccc cagtttgcag
2160aaaagcacct gccgacctcg tcctccccct gccagtgcct tacctcctgc ccaggagagc
2220cagccctccc tgtcctcctc ggatcaccga gagtagccga gagcctgctc ccccaccccc
2280tccccagggg agagggtctg gagaagcagt gagccgcatc ttctccatct ggcagggtgg
2340gatggaggag aagaattttc agaccccagc ggctgagtca tgatctccct gccgcctcaa
2400tgtggttgca aggccgctgt tcaccacagg gctaagagct aggctgccgc accccagagt
2460gtgggaaggg agagcggggc agtctcgggt ggctagtcag agagagtgtt tgggggttcc
2520gtgatgtagg gtaaggtgcc ttcttattct cactccacca cccaaaagtc aaaaggtgcc
2580tgtgaggcag gggcggagtg atacaacttc aagtgcatgc tctctgcagg tcgagcccag
2640cccagctggt gggaagcgtc tgtccgttta ctccaaggtg ggtctttgtg agagtgagct
2700gtaggtgtgc gggaccggta cagaaaggcg ttcttcgagg tggatcacag aggcttcttc
2760agatcaatgc ttgagtttgg aatcggccgc attccctgag tcaccaggaa tgttaaagtc
2820agtgggaacg tgactgcccc aactcctgga agctgtgtcc ttgcacctgc atccgtagtt
2880ccctgaaaac ccagagagga atcagacttc acactgcaag agccttggtg tccacctggc
2940cccatgtctc tcagaattct tcaggtggaa aaacatctga aagccacgtt ccttactgca
3000gaatagcata tatatcgctt aatcttaaat ttattagata tgagttgttt tcagactcag
3060actccatttg tattatagtc taatatacag ggtagcaggt accactgatt tggagatatt
3120tatgggggga gaacttacat tgtgaaactt ctgtacatta attattattg ctgttgttat
3180tttacaaggg tctagggaga gacccttgtt tgattttagc tgcagaactg tattggtcca
3240gcttgctctt cagtgggaga aaaacacttg taagttgcta aacgagtcaa tcccctcatt
3300caggaaaact gacagaggag ggcgtgactc acccaagcca tatataacta gctagaagtg
3360ggccaggaca ggccgggcgc ggtggctcac gcctgtaatc ccagcagttt gggaggtcga
3420ggtaggtgga tcacctgagg tcgggagttc gagaccaacc tgaccaacat ggagaaaccc
3480tgtctctatt aaaaatacaa aaaaaaaaaa aaaaaaaaat agccgggcat ggtggcgcaa
3540gcctgtaatc ccagctactc aggaggctga ggcagaagaa ttgaacccag gaggtggagg
3600ttgcagtgag ctgagatcgt gccgttactc tccaacctgg acaacaagag cgaaactccg
3660tcttagaagt ggaccaggac aggaccagat tttggagtca tggtccggtg tccttttcac
3720tacaccatgt ttgagctcag acccccactc tcattcccca ggtggctgac ccagtccctg
3780ggggaagccc tggatttcag aaagagccaa gtctggatct gggacccttt ccttccttcc
3840ctggcttgta actccaccaa gcccatcaga aggagaagga aggagactca cctctgcctc
3900aatgtgaatc agaccctacc ccaccacgat gtgccctggc tgctgggctc tccacctcag
3960gccttggata atgctgttgc ctcatctata acatgcattt gtctttgtaa tgtcaccacc
4020ttcccagctc tccctctggc cctgcttctt cggggaactc ctgaaatatc agttactcag
4080ccctgggccc caccacctag gccactcctc caaaggaagt ctaggagctg ggaggaaaag
4140aaaagagggg aaaatgagtt tttatggggc tgaacgggga gaaaaggtca tcatcgattc
4200tactttagaa tgagagtgtg aaatagacat ttgtaaatgt aaaactttta aggtatatca
4260ttataactga aggagaaggt gccccaaaat gcaagatttt ccacaagatt cccagagaca
4320ggaaaatcct ctggctggct aactggaagc atgtaggaga atccaagcga ggtcaacaga
4380gaaggcagga atgtgtggca gatttagtga aagctagaga tatggcagcg aaaggatgta
4440aacagtgcct gctgaatgat ttccaaagag aaaaaaagtt tgccagaagt ttgtcaagtc
4500aaccaatgta gaaagctttg cttatggtaa taaaaatggc tcatacttat atagcactta
4560ctttgtttgc aagtactgct gtaaataaat gctttatgca aacc
4604
<210> 384 <211> 545 <212> DNA <213> Homo sapiens <400> 384
gagtgactct cacgagagcc gcgagagtca gcttggccaa
tccgtgcggt cggcggccgc 60tccctttata agccgactcg cccggcagcg caccgggttg
cggagggtgg gcctgggagg 120ggtggtggcc attttttgtc taaccctaac tgagaagggc
gtaggcgccg tgcttttgct 180ccccgcgcgc tgtttttctc gctgactttc agcgggcgga
aaagcctcgg cctgccgcct 240tccaccgttc attctagagc aaacaaaaaa tgtcagctgc
tggcccgttc gcccctcccg 300gggacctgcg gcgggtcgcc tgcccagccc ccgaaccccg
cctggaggcc gcggtcggcc 360cggggcttct ccggaggcac ccactgccac cgcgaagagt
tgggctctgt cagccgcggg 420tctctcgggg gcgagggcga ggttcaggcc tttcaggccg
caggaagagg aacggagcga 480gtccccgcgc gcggcgcgat tccctgagct gtgggacgtg
cacccaggac tcggctcaca 540catgc
545