Patent application title: Gene for Identifying Individuals with Familial Dysautonomia
Inventors:
Susan Slaugenhaupt (Quincy, MA, US)
James F. Gusella (Framingham, MA, US)
Assignees:
The General Hospital Corporation
IPC8 Class: AC12Q168FI
USPC Class:
435 6
Class name: Chemistry: molecular biology and microbiology measuring or testing process involving enzymes or micro-organisms; composition or test strip therefore; processes of forming such composition or test strip involving nucleic acid
Publication date: 2009-07-09
Patent application number: 20090176222
Claims:
1-43. (canceled)
44. A kit for assaying for the presence of a mutation associated with Familial Dysautonomia in an individual comprising primers consisting of at least 16 contiguous nucleotides of the human IKBKAP gene as defined according to its cDNA sequence in SEQ ID NO: 2 or the complement thereof, said primers being complementary to sequences flanking the mutation site and capable of amplifying a region of the IKBKAP gene of sufficient size to detect the G-C mutation at position 73 of exon 19 of the human IKBKAP gene, wherein said amplified region comprises a G-C mutation at position 73 of exon 19 of the human IKBKAP gene.
45. An isolated oligonucleotide probe suitable for the detection of a mutation associated with Familial Dysautonomia in an individual, the oligonucleotide probe consisting of at least 16 contiguous nucleotides of the human IKBKAP gene as defined according to its cDNA sequence in SEQ ID NO: 2 or the complement thereof, said oligonucleotide probe being complementary to either the coding or non-coding strand and being suitable for detection of the G-C mutation at position 73 of exon 19 of the human IKBKAP gene.
46. A method for detecting a mutation associated with Familial Dysautonomia in a sample, comprising DNA isolated from an individual, by amplifying the DNA sequence in the region flanking the portion of the human IKBKAP gene as defined according to its cDNA sequence in SEQ ID NO: 2 using primers consisting of at least 16 contiguous nucleotides of the human IKBKAP or the complement thereof, said primers being complementary to sequences flanking the mutation site and being capable of amplifying a region of the IKBKAP gene of sufficient size to detect the G-C mutation at position 73 of exon 19 of the human IKBKAP gene, wherein said amplified region comprises a G-C mutation at position 73 of exon 19 of the human IKBKAP gene, and sequencing the amplified region.
Description:
[0001]This application claims priority to provisional application Ser. No. 60/260,080, the entirety of which is incorporated herein by reference.
FIELD OF THE INVENTION
[0003]This invention relates generally to the gene, and mutations thereto, that are responsible for the disease familial dysautonomia (FD). More particularly, the invention relates to the identification, isolation and cloning of the DNA sequence corresponding to the normal and mutant FD genes, as well as characterization of their transcripts and gene products. This invention also relates to genetic screening methods and kits for identifying FD mutant and wild-type alleles, and further relates to FD diagnosis, prenatal screening and diagnosis, and therapies of FD, including gene therapeutics and protein/antibody based therapeutics.
BACKGROUND OF THE INVENTION
[0004]Familial Dysautonomia (FD, Riley-Day Syndrome, Hereditary Sensory and Autonomic Neuropathy Type III) [OMIM 223900] is an autosomal recessive disorder present in 1 in 3,600 live births in the Ashkenazi Jewish population. This debilitating disorder is due to the poor development, survival, and progressive degeneration of the sensory and autonomic nervous system (Axelrod et al., 1974). FD was first described in 1949 based on five children who presented with defective lacrimation, excessive sweating, skin blotching, and hypertension (Riley et al., 1949). The following cardinal criteria have evolved for diagnosis of FD: absence of fungiform papillae on the tongue, absence of flare after injection of intradermal histamine, decreased or absent deep tendon reflexes, absence of overflow emotional tears, and Ashkenazi Jewish descent (Axelrod and Pearson, 1984, Axelrod 1984).
[0005]The loss of neuronal function in FD has many repercussions, with patients displaying gastrointestinal dysfunction, abnormal respiratory responses to hypoxic and hypercarbic states, scoliosis, gastroesophageal reflux, vomiting crises, lack of overflow tears, inappropriate sweating, and postural hypotension (Riley et al. 1949; Axelrod et al. 1974, Axelrod 1996). Despite recent advances in the management of FD, the disorder is inevitably fatal with only 50% of patients reaching 30 years of age. The clinical features of FD are due to a genetic defect that causes a striking, progressive depletion of unmyelinated sensory and autonomic neurons (Pearson and Pytel 1978a; Pearson and Pytel 1978b; Pearson et al. 1978; Axelrod 1995). This neuronal deficiency begins during development, as extensive pathology is evident even in the youngest subjects. Fetal development and postnatal maintenance of dorsal root ganglion (DRG) neurons is abnormal, significantly decreasing their numbers and resulting in DRG of grossly reduced size. Slow progressive degeneration is evidenced by continued neuronal depletion with increasing age. In the autonomic nervous system, superior cervical sympathetic ganglia are also reduced in size due to a severe decrease in the neuronal population.
[0006]Previously, the FD gene, DYS, was mapped to an 11-cM region of chromosome 9q31 (Blumenfeld et al. 1993) which was then narrowed by haplotype analysis to <0.5 cM or 471 kb (Blumenfeld et al. 1999). There is a single major haplotype that accounts for >99.5% of all FD chromosomes in the Ashkenazi Jewish (AJ) population. The recent identification of several single nucleotide polymorphisms (SNPs) in the candidate interval has allowed for further reduction of the candidate region to 177 kb by revealing a common core haplotype shared by the major and one previously described minor haplotype (Blumenfeld et al. 1999).
SUMMARY OF THE INVENTION
[0007]This invention relates to mutations in the IKBKAP gene which the inventors of this invention discovered and found to be associated with Familial Dysautonomia. The mutation associated with the major haplotype of FD is a base pair mutation, wherein the thymine nucleotide located at bp 6 of intron 20 in the IKBKAP gene is replaced with a cytosine nucleotide (T-C) (hereinafter "FD1 mutation"). The mutation associated with the minor haplotype is a base pair mutation wherein the guanine nucleotide at bp 2397 (bp 73 of exon 19) is replaced with a cytosine nucleotide (G-C) (hereinafter "FD2 mutation"). This base pair mutation causes an arginine to proline missense mutation (R696P) in the amino acid sequence encoded by the IKBKAP gene that is predicted to disrupt a potential phosphorylation site.
[0008]In accordance with one aspect of the present invention, there is provided an isolated nucleic acid comprising a nucleic acid sequence selected from the group consisting of:
[0009]nucleic acid sequences corresponding to the genomic sequence of the FD gene including introns and exons as shown in FIG. 6;
[0010]nucleic acid sequences corresponding to the nucleic acid sequence of the FD gene as shown in FIG. 6, wherein the thymine nucleotide at position 34,201 is replaced by a cytosine nucleotide;
[0011]nucleic acid sequences corresponding to the nucleic acid sequence of the FD gene as shown in FIG. 6, wherein the guanine nucleotide at position 33,714 is replaced by a cytosine nucleotide;
[0012]nucleic acid sequences corresponding to the nucleic acid sequence of the FD gene as shown in FIG. 6, wherein the thymine nucleotide at position 34,201 is replaced by a cytosine nucleotide and the guanine nucleotide at position 33,714 is replaced by a cytosine nucleotide;
[0013]nucleic acid sequences corresponding to the cDNA sequence including the coding sequence of the FD gene as shown in FIG. 7;
[0014]nucleic acid sequences corresponding to the cDNA sequence shown in FIG. 7, wherein the arginine at position 696 is replaced by a proline;
[0015]In accordance with another aspect of the present invention, there is provided a nucleic acid probe, comprising a nucleotide sequence corresponding to a portion of a nucleic acid as set forth in any one of the foregoing nucleic acid sequences.
[0016]In accordance with another aspect of the present invention, there is provided a cloning vector comprising a coding sequence of a nucleic acid as set forth above and a replicon operative in a host cell for the vector.
[0017]In accordance with another aspect of the present invention, there is provided an expression vector comprising a coding sequence of a nucleic acid set forth above operably linked with a promoter sequence capable of directing expression of the coding sequence in host cells for the vector.
[0018]In accordance with another aspect of the present invention, there is provided host cells transformed with a vector as set forth above.
[0019]In accordance with another aspect of the present invention, there is provided a method of producing a mutant FD polypeptide comprising: transforming host cells with a vector capable of expressing a polypeptide from a nucleic acid sequence as set forth above; culturing the cells under conditions suitable for production of the polypeptide; and recovering the polypeptide.
[0020]In accordance with another aspect of the present invention, there is provided a peptide product selected from the group consisting of: a polypeptide having an amino acid sequence corresponding to the amino acid sequence shown in FIG. 8; a polypeptide containing a mutation in the amino acid sequence shown in FIG. 8, wherein the arginine at position 696 is replaced with a proline; a peptide comprising at least 6 amino acid residues corresponding to the amino acid sequence shown in FIG. 8, and a peptide comprising at least 6 amino acid residues corresponding to a mutated form of the amino acid sequence shown in FIG. 8. In one embodiment, the peptide is labeled. In another embodiment, the peptide is a fusion protein.
[0021]In accordance with another aspect of the present invention, there is provided a use of a peptide as set forth above as an immunogen for the production of antibodies. In one embodiment, there is provided an antibody produced in such application. In one embodiment, the antibody is labeled. In another embodiment, the antibody is bound to a solid support. In accordance with another aspect of the present invention, there is provided a method to determine the presence or absence of the familial dysautonomia (FD) gene mutation in an individual, comprising: isolating genomic DNA, cDNA, or RNA from a potential FD disease carrier or patient; and assessing the DNA for the presence or absence of an FD-associated allele, wherein said FD-associated allele is the FD1 and/or FD2 mutation, and wherein the absence of either FD-associated allele indicates the absence of the FD gene mutation in the genome of the individual and the presence of the allele indicates that the individual is either affected with FD or a heterozygote carrier.
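The interpretation rule stated above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the allele-count inputs and the returned wording are hypothetical and are not part of the disclosed method, and the sketch does not address whether two detected mutant alleles lie on the same or on different chromosomes.

```python
def interpret_fd_alleles(fd1_copies: int, fd2_copies: int) -> str:
    """Map detected FD1/FD2 allele counts to the interpretation given in the text.

    Hypothetical helper: counts are the number of chromosomes (0-2) on which
    each mutation was detected in a diploid sample.
    """
    for copies in (fd1_copies, fd2_copies):
        if copies not in (0, 1, 2):
            raise ValueError("allele counts must be 0, 1 or 2")
    total = fd1_copies + fd2_copies
    if total > 2:
        raise ValueError("a diploid sample carries at most two alleles")
    if total == 0:
        # Absence of either FD-associated allele
        return "no FD1 or FD2 mutation detected"
    if total == 1:
        # One FD-associated allele present
        return "heterozygous carrier"
    # Two FD-associated alleles present
    return "two FD-associated alleles detected"


print(interpret_fd_alleles(1, 0))
```

A sample with one FD1 allele and one FD2 allele would be reported as carrying two FD-associated alleles, matching the compound heterozygous patients described for the minor haplotype.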
[0022]In one embodiment, the assessing step is performed by a process which comprises subjecting the DNA to amplification using oligonucleotide primers flanking the FD1 mutation and the FD2 mutation. In another embodiment, the assessing step further comprises an allele-specific oligonucleotide hybridization assay.
[0023]In another embodiment, DNA is amplified using the following oligonucleotide primers: 5'-GCCAGTGTTTTTGCCTGAG-3'; 5'-CGGATTGTCACTGTTGTGC-3'; 5'-GACTGCTCTCATAGCATCGC-3'. In another embodiment, the allele-specific oligonucleotide hybridization assay is accomplished using the following oligonucleotides: 5'-AAGTAAG(T/C)GCCATTG-3' and 5'-GGTTCAC(G/C)GATTGTC-3'. In yet another embodiment, neuronal tissue from an individual is screened for the presence of truncated IKBKAP mRNA or peptides, wherein the presence of said truncated mRNA or peptides indicates that said individual possesses the FD1 and/or FD2 mutation in the IKBKAP gene.
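The allele-specific oligonucleotides above are written with the variable base in parentheses. As a minimal sketch, the following Python code expands that notation into the individual probe sequences and derives their reverse complements. The function names are hypothetical, and the sketch assumes that the base listed first in each "(X/Y)" pair is the normal allele, which is consistent with the T-C and G-C changes described for the FD1 and FD2 mutations.

```python
import re
from itertools import product

# Allele-specific oligonucleotides as given in the text (5' to 3').
ASO_FD1 = "AAGTAAG(T/C)GCCATTG"
ASO_FD2 = "GGTTCAC(G/C)GATTGTC"


def expand_aso(pattern: str) -> list[str]:
    """Return one concrete sequence per alternative at each "(X/Y)" site."""
    # With a capture group, re.split alternates literal text and captured choices.
    parts = re.split(r"\(([ACGT](?:/[ACGT])+)\)", pattern)
    choices = [p.split("/") if i % 2 else [p] for i, p in enumerate(parts)]
    return ["".join(combo) for combo in product(*choices)]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


for name, pattern in (("FD1", ASO_FD1), ("FD2", ASO_FD2)):
    normal, mutant = expand_aso(pattern)  # assumed order: normal base first
    print(name, "normal:", normal, "mutant:", mutant)
    print(name, "mutant, opposite strand:", reverse_complement(mutant))
```

Each expanded probe is 15 nucleotides long, with the variable base at the eighth position.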
[0024]In accordance with another aspect of the present invention, there is provided an animal model for familial dysautonomia (FD), comprising a mammal possessing a mutant or knock-out or knock-in FD gene. In another embodiment, there is provided a method of producing a transgenic animal expressing a mutant IKAP mRNA comprising:
[0025](a) introducing into an embryonal cell of an animal a promoter operably linked to the nucleotide sequence containing a mutation associated with FD;
[0026](b) transplanting the transgenic embryonal target cell formed thereby into a recipient female parent; and
[0027](c) identifying at least one offspring containing said nucleotide sequence in said offspring's genome.
[0028]In accordance with another aspect of the present invention, there is provided a method for screening potential therapeutic agents for activity, in connection with FD, comprising: providing a screening tool selected from the group consisting of a cell line, and a mammal containing or expressing a defective FD gene or gene product; contacting the screening tool with the potential therapeutic agent; and assaying the screening tool for an activity.
[0029]In accordance with another aspect of the present invention, there is provided a method for treating familial dysautonomia (FD) by gene therapy using recombinant DNA technology to deliver the normal form of the FD gene into patient cells or vectors which will supply the patient with gene product in vivo.
[0030]In another embodiment, there is provided a method for treating familial dysautonomia (FD), comprising: providing an antibody directed against an FD protein sequence or peptide product; and delivering the antibody to affected tissues or cells in a patient having FD.
[0031]In accordance with another aspect of the present invention, there is provided kits for carrying out the methods of the invention. These kits include nucleic acids, polypeptides and antibodies of the present invention. In another embodiment the kit for detecting FD mutations will also contain genetic tests for diagnosing additional genetic diseases, such as Canavan's disease, Tay-Sachs disease, Gaucher disease, Cystic Fibrosis, Fanconi anemia, and Bloom syndrome.
[0032]It will be appreciated by a skilled worker in the art that the identification of the genetic defect in a genetic disease, coupled with the provision of the DNA sequences of both normal and disease-causing alleles, provides the full scope of diagnostic and therapeutic aspects of such an invention as can be envisaged using current technology.
BRIEF DESCRIPTION OF THE DRAWINGS
[0033]FIG. 1. Genomic structure of IKBKAP. The figure illustrates the orientation and placement of the 37 exons within a 68 kb genomic region of chromosome 9q31. The primers used for analysis of the splice defect are indicated as 18F (exon 18), 19F (exon 19) and 23R (exon 23). Asterisks indicate the locations of the two mutations identified; the mutation associated with the major AJ haplotype is located at bp 6 of intron 20, whereas the mutation associated with the minor AJ haplotype is located at bp 73 of exon 19. The 4.8 and 5.9 designations at exon 37 indicate the lengths of the two IKBKAP messages that differ only in the length of their 3' UTRs.
[0034]FIGS. 2A-2C. Demonstration of mutations in IKBKAP. FIG. 2A shows the antisense sequence of the T-C mutation (shown by arrows adjacent to the G and A lanes) at bp 6 of intron 20 that is associated with the major FD haplotype. Lanes 1 and 2 are FD patients homozygous for the major haplotype (homozygous GG), lane 3 is an FD patient heterozygous for the major haplotype and minor haplotype 2 (heterozygous GA), lane 4 is an FD patient heterozygous for the major haplotype and minor haplotype 3 (heterozygous GA), and lanes 5 and 6 are non-FD controls (homozygous AA). FIG. 2B shows heterozygosity for the G-C mutation (shown by arrows adjacent to the G and C lanes) at bp 73 of exon 19. Lane 1 is an FD patient homozygous for the major haplotype (homozygous GG), lanes 2-4 are three patients heterozygous for the major haplotype and minor haplotype 2 (heterozygous GC), lane 5 is a patient heterozygous for the major haplotype and minor haplotype 3 (homozygous GG), and lane 6 is a non-FD control (homozygous GG). FIG. 2C shows the sequence of the cDNA generated from the RT-PCR of a patient heterozygous for the major and minor 2 haplotypes. The arrow points to the heterozygous G-C mutation in exon 19. The boundary of exons 19 and 20 is also indicated, illustrating that this patient expresses wild-type message that includes exon 20, despite the presence of the major mutation on one allele.
[0035]FIGS. 3A-3B. Northern blot analysis of IKBKAP. FIG. 3A is a human multiple tissue northern blot that was hybridized with IKBKAP exon 2 and shows the presence of two messages of 4.8 and 5.9 kb (northern blots hybridized with other IKBKAP probes yielded similar patterns). FIG. 3B is a northern blot generated using mRNA isolated from lymphoblast cell lines: lanes 1, 2, and 5, FD patients homozygous for the major haplotype; lane 3, individual carrying two definitively non-FD chromosomes; lane 4, FD patient heterozygous for the major haplotype and minor haplotype 2; lane 6, control brain RNA (Clontech). The level of expression of IKBKAP mRNA relative to β-actin mRNA is quite variable in lymphoblasts. We observed no consistent increase or decrease in mRNA levels between FD patients homozygous for the major haplotype, those heterozygous for the major haplotype and minor haplotype 2, and non-FD individuals.
[0036]FIGS. 4A-4B: RT-PCR analysis of the exon 20 region of IKBKAP showing expression of the wild-type message and protein in patients. FIG. 4A was generated using primers 18F (exon 18) and 23R (exon 23). Lanes 1 and 2 are FD patients homozygous for the major haplotype, lane 3 is an FD patient heterozygous for the major haplotype and minor haplotype 2, lanes 4 and 5 are non-FD controls, lane 6 is a water control. FIG. 4B is a western blot generated using cytoplasmic protein isolated from patient lymphoblast cell lines and detected with a carboxyl-terminal antibody. Lanes 2, 4, 6, and 8 are patients homozygous for the major haplotype, lanes 3, 5, 7, and 9 are non-FD controls, lane 1 is a patient heterozygous for the major and minor haplotype 3, and lane 10 is a patient heterozygous for the major and minor haplotype 2 and lane 11 is a Hela cell line sample.
[0037]FIG. 5. RT-PCR analysis of the exon 20 region of IKBKAP showing variable expression of the mutant message in FD patients. The analysis was done using primers 19F (exon 19) and 23R (exon 23). Lanes 1 and 2, control fibroblasts; lanes 3, 4, and 5, FD fibroblasts homozygous for the major mutation; lanes 6 and 7, FD lymphoblasts homozygous for the major mutation; lanes 8 and 9, non-FD lymphoblasts; lane 10, FD patient brain stem; lane 11, FD patient temporal lobe (showing a faint 319 bp band and no 393 bp band); lane 12, water control. RT-PCR of control brain RNA (Clontech) showed only the 393 bp band (data not shown).
[0038]FIG. 6. The genomic sequence for IKBKAP.
[0039]FIG. 7--The cDNA sequence for IKBKAP
[0040]FIG. 8--the amino acid sequence of the IKBKAP gene
[0041]FIG. 9--Comparison of the amino acid sequence of Ikap across several species. Alignment of the amino acid sequence of Ikap (M--musculus) with that of Homo sapiens (H--sapiens), Drosophila melanogaster (D--melanogaster), Saccharomyces cerevisiae (S--cerevisiae), Arabidopsis thaliana (A--thaliana), and Caenorhabditis elegans (C--elegans).
[0042]FIG. 10--Comparison of the Novel Mouse Ikbkap Gene with Multiple Species Homologs
[0043]FIG. 11--Mouse Ikbkap Exon and Intron Boundaries
[0044]FIG. 12--Comparison of the syntenic regions of mouse chromosome 4 (MMU4) and human chromosome 9 (HSA9q31). The diagram on the left shows the location of Ikbkap in relation to mapped and genetic markers (boldface). Distances are given in centimorgans. The positions of the homologous genes that map to human chromosome 9q31 are shown on the right.
DETAILED DESCRIPTION OF THE INVENTION
[0045]This invention relates to mutations in the IKBKAP gene, which the inventors of the instant application discovered are associated with Familial Dysautonomia. More specifically, the mutation associated with the major haplotype of FD is a T-C change located at bp 6 of intron 20 in the IKBKAP gene as shown in FIG. 1. This mutation can result in skipping of exon 20 in the mRNA from FD patients, although they continue to express varying levels of wild-type message in a tissue specific manner. The mutation associated with the minor haplotype is a single G-C change at bp 2397 (bp 73 of exon 19) that causes an arginine to proline missense mutation (R696P) that is predicted to disrupt a potential phosphorylation site.
[0046]These findings have direct implications for understanding the clinical manifestations of FD, for preventing it and potentially for treating it. The IKAP protein produced from the IKBKAP gene was originally isolated as part of a large interleukin-1-inducible IKK complex and described as a regulator of kinases involved in pro-inflammatory cytokine signaling (Cohen et al. 1998). However, a recent report questioned this conclusion, finding on the basis of various protein-protein interaction and functional assays that cellular IKK complexes do not contain IKAP. Rather, IKAP appears to be a member of a novel complex containing additional unidentified proteins of 100, 70, 45, and 39 kDa (Krappmann et al. 2000).
[0047]IKAP is homologous to the Elp1 protein of S. cerevisiae, which is encoded by the IKI3 locus and is required for sensitivity to pGKL killer toxin. The human and yeast proteins exhibit 29% identity and 46% similarity over their entire lengths. Yeast Elp1 protein is part of the RNA polymerase II-associated elongator complex, which also contains Elp2, a WD-40 repeat protein, and Elp3, a histone acetyltransferase (Otero et al. 1999). The human ELP3 gene encodes a 60 kDa histone acetyltransferase that shows more than 75% identity with yeast Elp3 protein, but no 60 kDa protein has been found in the human IKAP-containing protein complex. Consequently, it is considered unlikely that IKAP is a member of a functionally conserved mammalian elongator complex (Krappmann et al. 2000). Instead, it has been reported that the protein may play a role in general gene activation mechanisms, as overexpression of IKAP interferes with the activity of both NF-κB-dependent and independent reporter genes (Krappmann et al. 2000). Therefore, the FD phenotype may be caused by aberrant expression of genes crucial to the development of the sensory and autonomic nervous systems, secondary to the loss of a functional IKAP protein in specific tissues.
[0048]FD is unique among Ashkenazi Jewish disorders in that one mutation accounts for >99.5% of the disease chromosomes. As in other autosomal recessive diseases with no phenotype in heterozygous carriers, one might have expected to find several different types of mutations producing complete inactivation of the DYS gene in the AJ population. The fact that the major FD mutation does not produce complete inactivation, but rather allows variable tissue-specific expression of IKAP, may explain this lack of mutational diversity. Mutations causing complete inactivation of IKAP in all tissues might cause a more severe or even lethal phenotype. Indeed, CG10535, the apparent Drosophila melanogaster homologue of IKBKAP, maps coincident with a larval recessive lethal mutation (l(3)04629) supporting the essential nature of the protein (FlyBase). Thus, the array of mutations that can produce the FD phenotype may be limited if they must also allow expression of functional or partially functional IKAP in some tissues to permit survival. With the identification of IKBKAP as DYS, it will now be possible to test this inactivation hypothesis in a mammalian model system.
[0049]Despite the overwhelming predominance of a single mutation in FD patients, the disease phenotype is remarkably variable both within and between families. The nature of the major FD mutation makes it tempting to consider that this phenotypic variability might relate to the frequency of exon 20 skipping in specific tissues and at specific developmental stages, which may be governed by variations in many factors involved in RNA splicing. Even a small amount of normal IKAP protein expressed in critical tissues might permit sufficient neuronal survival to alleviate the most severe phenotypes. This possibility is supported by the relatively mild phenotype associated with the presence of the R696P mutation, which is predicted to permit expression of an altered full-length IKAP protein that may retain some functional capacity. To date, this minor FD mutation has only been seen in four patients heterozygous for the major mutation. Consequently, it is uncertain whether homozygotes for the R696P mutation would display any phenotypic abnormality characteristic of FD. The single patient with minor haplotype 3 and mixed ancestry, whose mutation has yet to be found, is also a compound heterozygote with the major haplotype. The existence of minor haplotype 3 indicates that IKBKAP mutations will be found outside the AJ population, but like the R696P mutation, it is difficult to predict the severity of phenotype that would result from homozygosity.
[0050]Since FD affects the development and maintenance of the sensory and autonomic nervous systems, the identification of IKBKAP as the DYS gene allows for further investigation of the role of IKAP and associated proteins in the sensory and autonomic nervous systems. Of more immediate practical importance, however, the discovery of the single base mutation that characterizes >99.5% of FD chromosomes will permit efficient, inexpensive carrier testing in the AJ population, to guide reproductive choices and reduce the incidence of FD. The nature of the major mutation also offers some hope for new approaches to treatment of FD. Despite the presence of this mutation, lymphoblastoid cells from patients are capable of producing full-length wild-type mRNA and normal IKAP protein, while in neuronal tissue exon 20 is skipped, presumably leading to a truncated product. Investigation of the mechanism that permits lymphoblasts to be relatively insensitive to the potential effect of the mutation on splicing may suggest strategies to prevent skipping of exon 20 in other cell types. An effective treatment to prevent the progressive neuronal loss of FD may be one aimed at facilitating the production of wild-type mRNA from the mutant gene rather than exogenous administration of the missing IKAP protein via gene therapy.
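The tissue-specific splicing described above is read out by RT-PCR, and the description of FIG. 5 gives a 393 bp product for control RNA and a 319 bp product in FD brain tissue. The following Python sketch classifies a set of observed band sizes on that basis. It assumes that the 319 bp band is the exon 20-skipped product, so that the 74 bp difference reflects the missing exon; the function name and the sizing tolerance are hypothetical.

```python
# Band sizes taken from the FIG. 5 description (primers in exon 19 and exon 23).
WILD_TYPE_BP = 393  # product that includes exon 20
SKIPPED_BP = 319    # product assumed to lack exon 20


def classify_rt_pcr(band_sizes_bp, tolerance_bp=5):
    """Classify observed RT-PCR bands; tolerance_bp is an illustrative sizing margin."""
    has_wild_type = any(abs(b - WILD_TYPE_BP) <= tolerance_bp for b in band_sizes_bp)
    has_skipped = any(abs(b - SKIPPED_BP) <= tolerance_bp for b in band_sizes_bp)
    if has_wild_type and has_skipped:
        return "wild-type and exon 20-skipped message"
    if has_skipped:
        return "exon 20-skipped message only"
    if has_wild_type:
        return "wild-type message only"
    return "no expected product"


print(WILD_TYPE_BP - SKIPPED_BP)      # size difference between the two products
print(classify_rt_pcr([393, 319]))    # both messages present
```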
FD Screening
[0051]With knowledge of the primary mutation and secondary mutation of the FD gene as disclosed herein, screening for presymptomatic homozygotes, including prenatal diagnosis, and screening for heterozygous carriers can be readily carried out.
1. Nucleic Acid Based Screening
[0052]Individuals carrying mutations in the FD gene may be detected at either the DNA or RNA level using a variety of techniques that are well known in the art. The genomic DNA used for the diagnosis may be obtained from an individual's cells, such as those present in peripheral blood, urine, saliva, buccal cells, surgical specimens, and autopsy specimens. The DNA may be used directly or may be amplified enzymatically in vitro prior to mutation analysis through use of PCR (Saiki et al. Science 239:487-491 (1988)) or other in vitro amplification methods such as the ligase chain reaction (LCR) (Wu and Wallace Genomics 4:560-569 (1989)), strand displacement amplification (SDA) (Walker et al. PNAS USA 89:392-396 (1992)), or self-sustained sequence replication (3SR) (Fahy et al. PCR Methods Appl. 1:25-33 (1992)). In situ hybridization may also be used to detect the FD gene.
[0053]The methodology for preparing nucleic acids in a form that is suitable for mutation detection is well known in the art. For example, suitable probes for detecting a given mutation include the nucleotide sequence at the mutation site and encompass a sufficient number of nucleotides to provide a means of differentiating a normal from a mutant allele. Any probe or combination of probes capable of detecting any one of the FD mutations herein described is suitable for use in this invention. Examples of suitable probes include those complementary to either the coding or noncoding strand of the DNA. Similarly, suitable PCR primers are complementary to sequences flanking the mutation site. Production of these primers and probes can be carried out in accordance with any one of the many routine methods, e.g., as disclosed in Sambrook et al. (ref. 45), and those disclosed in WO 93/06244 for assays for Gaucher disease.
[0054]Probes for use with this invention should be long enough to specifically identify or amplify the relevant FD mutations with sufficient accuracy to be useful in evaluating the risk of an individual to be a carrier or having the FD disorder. In general, suitable probes and primers will comprise, preferably at a minimum, an oligomer of at least 16 nucleotides in length. Calculations for mammalian genomes indicate that, for an oligonucleotide 16 nucleotides in length, there is only one chance in ten that a typical cDNA library will fortuitously contain a sequence that exactly matches the sequence of the oligonucleotide. Suitable probes and primers are therefore preferably 18 nucleotides long, which is the next larger oligonucleotide fully encoding an amino acid sequence (i.e., 6 amino acids in length).
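The length argument above can be illustrated with a short calculation. The sketch below uses a deliberately simplified model that is an assumption of this example, not of the cited calculations: all four bases are equally likely and independent, so a specific oligonucleotide of length k matches a given position with probability 4^-k. The function name and the complexity value are illustrative only.

```python
def expected_chance_matches(oligo_length: int, target_size_nt: float) -> float:
    """Expected number of exact chance matches to one specific oligonucleotide.

    Simplified model (an assumption): each base is A, C, G or T with equal
    probability, independently, so one k-mer matches a position with
    probability 4**-k.
    """
    return target_size_nt / 4 ** oligo_length


# Number of distinct sequences of each length.
print(4 ** 16)  # distinct 16-mers
print(4 ** 18)  # distinct 18-mers

# Target complexity at which a 16-mer would be expected to match by chance
# about once in ten searches under this model.
complexity_for_one_in_ten = 0.1 * 4 ** 16
print(f"{complexity_for_one_in_ten:.2e} nucleotides")

# Extending the oligonucleotide from 16 to 18 nucleotides lowers the expected
# number of chance matches by a factor of 4**2.
print(expected_chance_matches(16, 1e8) / expected_chance_matches(18, 1e8))
```

Under this model each added nucleotide reduces the chance of a fortuitous match fourfold, which is why moving from 16 to 18 nucleotides gives a sixteenfold margin.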
[0055]By use of nucleotide and polypeptide sequences provided by this invention, safe, effective and accurate testing procedures are also made available to identify carriers of mutant alleles of IKBKAP, as well as pre- and postnatal diagnosis of fetuses and live born patients carrying either one or two mutant alleles. This affords potential parents the opportunity to make reproductive decisions prior to pregnancy, as well as afterwards, e.g., if chorionic villi sampling or amniocentesis is performed early in pregnancy. Thus, prospective parents who know that they are both carriers may wish to determine if their fetus will have the disease, and may wish to terminate such a pregnancy, or to provide the physician with the opportunity to begin treatment as soon as possible, including prenatally. In the case where such screening has not been performed, and therefore the carrier status of the patient is not known, and where FD disease is part of the differential diagnosis, the present invention also provides a method for making the diagnosis genetically.
[0056]Many versions of conventional genetic screening tests are known in the art. Several are disclosed in detail in WO 91/02796 for cystic fibrosis, in U.S. Pat. No. 5,217,865 for Tay-Sachs disease, in U.S. Pat. No. 5,227,292 for neurofibromatosis and in WO 93/06244 for Gaucher disease. Thus, in accordance with the state of the art regarding assays for such genetic disorders, several types of assays are conventionally prepared using the nucleotides, polypeptides and antibodies of the present invention. For example: the detection of mutations in specific DNA sequences, such as the FD gene, can be accomplished by a variety of methods including, but not limited to, restriction-fragment-length-polymorphism detection based on allele-specific restriction-endonuclease cleavage (Kan and Dozy Lancet ii:910-912 (1978)), hybridization with allele-specific oligonucleotide probes (Wallace et al. Nucl Acids Res 6:3543-3557 (1978)), including immobilized oligonucleotides (Saiki et al. PNAS USA 86:6230-6234 (1989)) or oligonucleotide arrays (Maskos and Southern Nucl Acids Res 21:2269-2270 (1993)), allele-specific PCR (Newton et al. Nucl Acids Res 17:2503-2516 (1989)), mismatch-repair detection (MRD) (Faham and Cox Genome Res 5:474-482 (1995)), binding of MutS protein (Wagner et al. Nucl Acids Res 23:3944-3948 (1995)), denaturing-gradient gel electrophoresis (DGGE) (Fischer and Lerman PNAS USA 80:1579-1583 (1983)), single-strand-conformation-polymorphism detection (Orita et al. Genomics 5:874-879 (1989)), RNAase cleavage at mismatched base-pairs (Myers et al. Science 230:1242 (1985)), chemical (Cotton et al. PNAS USA 85:4397-4401 (1988)) or enzymatic (Youil et al. PNAS USA 92:87-91 (1995)) cleavage of heteroduplex DNA, methods based on allele specific primer extension (Syvanen et al. Genomics 8:684-692 (1990)), genetic bit analysis (GBA) (Nikiforov et al. Nucl Acids Res 22:4167-4175 (1994)), the oligonucleotide-ligation assay (OLA) (Landegren et al. Science 241:1077 (1988)), the allele-specific ligation chain reaction (LCR) (Barany PNAS USA 88:189-193 (1991)), gap-LCR (Abravaya et al. Nucl Acids Res 23:675-682 (1995)), and radioactive and/or fluorescent DNA sequencing using standard procedures well known in the art.
[0057]As will be appreciated, the mutation analysis may also be performed on samples of RNA by reverse transcription into cDNA therefrom. Furthermore, mutations may also be detected at the protein level using, for example, antibodies specific for the mutant and normal FD protein, respectively. It may also be possible to base an FD mutation assay on altered cellular or subcellular localization of the mutant form of the FD protein.
2. Antibodies
[0058]Antibodies can also be used for the screening of the presence of the FD gene, the mutant FD gene, and the protein products therefrom. In addition, antibodies are useful in a variety of other contexts in accordance with this invention. As will be appreciated, antibodies can be raised against various epitopes of the FD protein. Such antibodies can be utilized for the diagnosis of FD and, in certain applications, targeting of affected tissues.
[0059]For example, antibodies can be used to detect truncated FD protein in neuronal cells, the detection of which indicates that an individual possesses a mutation in the IKBKAP gene.
[0060]Thus, in accordance with another aspect of the present invention a kit is provided that is suitable for use in screening and assaying for the presence of the FD gene by an immunoassay through use of an antibody which specifically binds to a gene product of the FD gene in combination with a reagent for detecting the binding of the antibody to the gene product.
[0061]Antibodies raised in accordance with the invention can also be utilized to provide extensive information on the characteristics of the protein and of the disease process and other valuable information which includes but is not limited to:
[0062]1. Antibodies can be used for the immunostaining of cells and tissues to determine the precise localization of the FD protein. Immunofluorescence and immuno-electron microscopy techniques which are well known in the art can be used for this purpose. Defects in the FD gene or in other genes which cause an altered localization of the FD protein are expected to be localizable by this method.
[0063]2. Antibodies to distinct isoforms of the FD protein (i.e., wild-type or mutant-specific antibodies) can be raised and used to detect the presence or absence of the wild-type or mutant gene products by immunoblotting (Western blotting) or other immunostaining methods. Such antibodies can also be utilized for therapeutic applications where, for example, binding to a mutant form of the FD protein reduces the consequences of the mutation.
[0064]3. Antibodies can also be used as tools for affinity purification of FD protein. Methods such as immunoprecipitation or column chromatography using immobilized antibodies are well known in the art and are further described in Section (II)(B)(3), entitled "Protein Purification" herein.
[0065]4. Immunoprecipitation with specific antibodies is useful in characterizing the biochemical properties of the FD protein. Modifications of the FD protein (i.e., phosphorylation, glycosylation, ubiquitination, and the like) can be detected through use of this method. Immunoprecipitation and Western blotting are also useful for the identification of associating molecules that may be involved in the mammalian elongation complex.
[0066]5. Antibodies can also be utilized in connection with the isolation and characterization of tissues and cells which express FD protein. For example, FD protein expressing cells can be isolated from peripheral blood, bone marrow, liver, and other tissues, or from cultured cells by fluorescence activated cell sorting (FACS) (Harlow et al., eds., Antibodies: A Laboratory Manual, pp. 394-395, Cold Spring Harbor Press, N.Y. (1988)). Cells can be mixed with antibodies (primary antibodies) with or without conjugated dyes. If nonconjugated antibodies are used, a second dye-conjugated antibody (secondary antibody) which binds to the primary antibody can be added. This process allows the specific staining of cells or tissues which express the FD protein.
[0067]Antibodies against the FD protein are prepared by several methods which include, but are not limited to:
[0068]1. The potentially immunogenic domains of the protein are predicted from hydropathy and surface probability profiles. Then oligopeptides which span the predicted immunogenic sites are chemically synthesized. These oligopeptides can also be designed to contain the specific mutant amino acids to allow the detection of and discrimination between the mutant versus wild-type gene products. Rabbits or other animals are immunized with the synthesized oligopeptides coupled to a carrier such as KLH to produce anti-FD protein polyclonal antibodies. Alternatively, monoclonal antibodies can be produced against the synthesized oligopeptides using conventional techniques that are well known in the art (Harlow et al., eds., Antibodies: A Laboratory Manual, pp. 151-154, Cold Spring Harbor Press, N.Y. (1988)). Both in vivo and in vitro immunization techniques can be used. For therapeutic applications, "humanized" monoclonal antibodies having human constant and variable regions are often preferred so as to minimize the immune response of a patient against the antibody. Such antibodies can be generated by immunizing transgenic animals which contain human immunoglobulin genes. See Jakobovits et al. Ann NY Acad Sci 764:525-535 (1995).
[0069]2. Antibodies can also be raised against expressed FD protein products from cells. Such expression products can include the full length expression product or parts or fragments thereof. Expression can be accomplished using conventional expression systems, such as bacterial, baculovirus, yeast, mammalian, and other overexpression systems using conventional recombinant DNA techniques. The proteins can be expressed as fusion proteins with a histidine tag, glutathione-S-transferase, or other moieties, or as nonfused proteins. Expressed proteins can be purified using conventional protein purification methods or affinity purification methods that are well known in the art. Purified proteins are used as immunogens to generate polyclonal or monoclonal antibodies using methods similar to those described above for the generation of antipeptide antibodies.
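As an illustration of the hydropathy-based prediction described in method 1 above, the following is a minimal sketch and not the procedure used by the inventors. It assumes the Kyte-Doolittle hydropathy scale and a fixed-width sliding window; the function names, the window width, and the demonstration peptide are hypothetical and are not taken from the FD protein sequence. Surface probability profiles are not modeled here.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
# The choice of this scale is an assumption; the text does not name a scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Return the mean hydropathy of every full-length window along seq."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(seq, window=7):
    """Return (start index, peptide, mean score) of the most hydrophilic window.

    Strongly hydrophilic stretches are more likely to be surface exposed and
    are therefore candidate sites for synthetic oligopeptide immunogens.
    """
    seq = seq.upper()
    profile = hydropathy_profile(seq, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, seq[start:start + window], profile[start]


if __name__ == "__main__":
    # Hypothetical demonstration peptide, NOT part of the FD protein.
    demo = "AAAAKDEERKAAAA"
    print(most_hydrophilic_window(demo, window=6))
```

In practice the candidate windows would be chosen from the actual FD protein sequence, and a peptide spanning a mutant residue could be scored in the same way.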
[0070]In each of the techniques described above, once hybridoma cell lines are prepared, monoclonal antibodies can be made through conventional techniques of, for example, priming mice with pristane and intraperitoneally injecting such mice with the hybrid cells to enable harvesting of the monoclonal antibodies from ascites fluid.
[0071]In connection with synthetic and semi-synthetic antibodies, such terms are intended to cover antibody fragments, isotype switched antibodies, humanized antibodies (mouse-human, human-mouse, and the like), hybrids, antibodies having plural specificities, fully synthetic antibody-like molecules, and the like.
3. Expression Systems
[0072]Expression systems for the FD gene product allow for the study of the function of the FD gene product, in either normal or wild-type form and/or mutated form. Such analyses are useful in providing insight into the disease causing process that is derived from mutations in the gene.
[0073]"Expression systems" refer to DNA sequences containing a desired coding sequence and control sequences in operable linkage, so that hosts transformed with these sequences are capable of producing the encoded proteins. In order to effect transformation, the expression system may be included on a vector; however, the relevant DNA may then also be integrated into the host chromosome.
[0074]In general terms, the production of a recombinant form of FD gene product typically involves the following:
[0075]First, a DNA encoding the mature (used here to include all normal and mutant forms of the proteins) protein, the preprotein, or a fusion of the FD protein to an additional sequence cleavable under controlled conditions such as treatment with peptidase to give an active protein, is obtained. If the sequence is uninterrupted by introns it is suitable for expression in any host. If there are introns, expression is obtainable in mammalian or other eukaryotic systems capable of processing them. This sequence should be in excisable and recoverable form. The excised or recovered coding sequence is then placed in operable linkage with suitable control sequences in an expression vector. The construct is used to transform a suitable host, and the transformed host is cultured under selective conditions to effect the production of the recombinant FD protein. Optionally the FD protein is isolated from the medium or from the cells and purified as described in the Section entitled "Protein Purification".
[0076]Each of the foregoing steps can be done in a variety of ways. For example, the desired coding sequences can be obtained by preparing suitable cDNA from cellular mRNA and manipulating the cDNA to obtain the complete sequence. Alternatively, genomic fragments may be obtained and used directly in appropriate hosts. Expression vectors operable in a variety of hosts are constructed using appropriate replicons and control sequences, as set forth below. Suitable restriction sites can, if not normally available, be added to the ends of the coding sequence so as to provide an excisable gene to insert into these vectors.
[0077]The control sequences, expression vectors, and transformation methods are dependent on the type of host cell used to express the gene. Generally, prokaryotic, yeast, insect, or mammalian cells are presently useful as hosts. Prokaryotic hosts are in general the most efficient and convenient for the production of recombinant proteins. However, eukaryotic cells, and, in particular, yeast and mammalian cells, are often preferable because of their processing capacity and post-translational processing of human proteins.
[0078]Prokaryotes most frequently are represented by various strains of E. coli. However, other microbial strains may also be used, such as Bacillus subtilis and various species of Pseudomonas or other bacterial strains. In such prokaryotic systems, plasmid or bacteriophage vectors which contain origins of replication and control sequences compatible with the host are used. A wide variety of vectors for many prokaryotes are known (Maniatis et al. Molecular Cloning: A Laboratory Manual pp. 1.3-1.11, 2.3-2.125, 3.2-3.48, 2-4.64 (Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1982)); Sambrook et al. Molecular Cloning: A Laboratory Manual pp. 1-54 (Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1989)); Meth. Enzymology 68: 357-375 (1979); 101: 307-325 (1983); 152: 673-864 (1987) (Academic Press, Orlando, Fla.); Pouwells et al. Cloning Vectors: A Laboratory Manual (Elsevier, Amsterdam (1987))). Commonly used prokaryotic control sequences, which are defined herein to include promoters for transcription initiation, optionally with an operator, along with ribosome binding site sequences, include such promoters as the beta-lactamase (penicillinase) and lactose (lac) promoter systems, the tryptophan (trp) promoter system and the lambda derived PL promoter and N-gene ribosome binding site, which has become useful as a portable control cassette (U.S. Pat. No. 4,711,845). However, any available promoter system compatible with prokaryotes can be used (Sambrook et al. supra. (1989); Meth. Enzymology supra. (1979, 1983, 1987); John et al. Gene 61: 207-215 (1987)).
[0079]In addition to bacteria, eukaryotic microbes, such as yeast, may also be used as hosts. Laboratory strains of Saccharomyces cerevisiae, or baker's yeast, are most often used, although other strains are commonly available.
[0080]Vectors employing the 2 micron origin of replication and other plasmid vectors suitable for yeast expression are known (Sambrook et al. supra. (1989); Meth. Enzymology supra (1979, 1983, 1987); John et al. supra (1987)). Control sequences for yeast vectors include promoters for the synthesis of glycolytic enzymes. Additional promoters known in the art include the promoters for 3-phosphoglycerate kinase, and those for other glycolytic enzymes, such as glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose isomerase, and glucokinase. Other promoters, which have the additional advantage of transcription controlled by growth conditions, are the promoter regions for alcohol dehydrogenase 2, isocytochrome C, acid phosphatase, degradative enzymes associated with nitrogen metabolism, and enzymes responsible for maltose and galactose utilization. See Sambrook et al. supra (1989); Meth. Enzymology supra; John et al. supra (1987). It is also believed that terminator sequences at the 3' end of the coding sequences are desirable. Such terminators are found in the 3' untranslated region following the coding sequences in yeast-derived genes. Many of the useful vectors contain control sequences derived from the enolase gene containing plasmid peno46 or the LEU2 gene obtained from YEp13; however, any vector containing a yeast compatible promoter, origin of replication, and other control sequences is suitable (Sambrook et al. supra. (1989); Meth. Enzymology supra. (1979, 1983, 1987); John et al. supra).
[0081]It is also, of course, possible to express genes encoding polypeptides in eukaryotic host cell cultures derived from multicellular organisms (Kruse and Patterson Tissue Culture pp. 475-500 (Academic Press, Orlando (1973)); Meth. Enzymology 68: 357-375 (1979); Freshney Culture of Animal Cells: A Manual of Basic Techniques pp. 329-334 (2d ed., Alan R. Liss, N.Y. (1987))). Useful host cell lines include murine myelomas N51, VERO and HeLa cells, SF9 or other insect cell lines, and Chinese hamster ovary (CHO) cells. Expression vectors for such cells ordinarily include promoters and control sequences compatible with mammalian cells such as, for example, the commonly used early and late promoters from Simian Virus 40 (SV 40), or other viral promoters such as those from polyoma, adenovirus 2, bovine papilloma virus, or avian sarcoma viruses, herpes virus family (such as cytomegalovirus, herpes simplex virus, or Epstein-Barr virus), or immunoglobulin promoters and heat shock promoters (Sambrook et al. supra. pp. 16.3-16.74 (1989); Meth. Enzymology 152: 684-704 (1987); John et al. supra). In addition, regulated promoters, such as metallothionein (i.e., MT-1 and MT-2), glucocorticoid, or antibiotic gene "switches" can be used.
[0082]General aspects of mammalian cell host system transformations have been described by Axel (U.S. Pat. No. 4,399,216). Plant cells are also now available as hosts, and control sequences compatible with plant cells such as the nopaline synthase promoter and polyadenylation signal sequences are available (Pouwells et al. supra. (1987); Meth Enzymology 118: 627-639 (Academic Press, Orlando (1986)); Gelvin et al. J. Bact. 172: 1600-1608 (1990)).
[0083]Depending on the host cell used, transformation is done using standard techniques appropriate to such cells (Sambrook et al. supra. pp. 16.30-16.5 (1989); Meth. Enzymology supra 68:357-375 (1979); 101: 307-325 (1983); 152: 673-864 (1987); U.S. Pat. No. 4,399,216; Meth Enzymology supra 118: 627-639 (1986); Gelvin et al. J. Bact. 172: 1600-1608 (1990)). Such techniques include, without limitation, calcium treatment employing calcium chloride for prokaryotes or other cells which contain substantial cell wall barriers; infection with Agrobacterium tumefaciens for certain plant cells; calcium phosphate precipitation, DEAE, lipid transfection systems (such as LIPOFECTIN® and LIPOFECTAMINE®), and electroporation methods for mammalian cells without cell walls; and microprojectile bombardment for many cells, including plant cells. In addition, DNA may be delivered by viral delivery systems such as retroviruses or the herpes family, adenoviruses, baculoviruses, or Semliki Forest virus, as appropriate for the species of cell line chosen.
C. Therapeutics
[0084]Identification of the FD gene and its gene product also has therapeutic implications. Indeed, one of the major aims of this invention is the development of therapies to circumvent or overcome the defect leading to FD disease. Envisioned are pharmacological, protein replacement, antibody therapy, and gene therapy approaches. In addition, the development of animal models useful for developing therapies and for understanding the molecular mechanisms of FD disease is envisioned.
[0085]1. Pharmacological
[0086]In the pharmacological approach, drugs which circumvent or overcome the defective FD gene function are sought. In this approach, modulation of FD gene function can be accomplished by agents or drugs which are designed to interact with different aspects of the FD protein structure or function.
[0087]Efficacy of a drug or agent can be identified in a screening program in which modulation is monitored in in vitro cell systems. Indeed, the present invention provides for host cell systems which express various mutant FD proteins (especially the T-C and G-C mutations noted in this application) and are suited for use as primary screening systems.
[0088]In vivo testing of FD disease-modifying compounds is also required as a confirmation of activity observed in the in vitro assays. Animal models of FD disease are envisioned and discussed in the section entitled "Animal Models", below, in the present application.
[0089]Drugs can be designed to modulate FD gene and FD protein activity from knowledge of the structure and function correlations of FD protein and from knowledge of the specific defect in various FD mutant proteins. For this, rational drug design by use of X-ray crystallography, computer-aided molecular modeling (CAMM), quantitative or qualitative structure-activity relationship (QSAR), and similar technologies can further focus drug discovery efforts. Rational design allows prediction of protein or synthetic structures which can interact with and modify the FD protein activity. Such structures may be synthesized chemically or expressed in biological systems. This approach has been reviewed in Copsey et al., Genetically Engineered Human Therapeutic Drugs, Stockton Press, New York (1988). Further, combinatorial libraries can be designed, synthesized and used in screening programs.
[0090]The present invention also envisions that the treatment of FD disease can take the form of modulation of another protein or step in the pathway in which the FD gene or its protein product participates in order to correct the physiological abnormality.
[0091]In order to administer therapeutic agents based on, or derived from, the present invention, it will be appreciated that suitable carriers, excipients, and other agents may be incorporated into the formulations to provide improved transfer, delivery, tolerance, and the like.
[0092]A multitude of appropriate formulations can be found in the formulary known to all pharmaceutical chemists: Remington's Pharmaceutical Sciences, (15th Edition, Mack Publishing Company, Easton, Pa. (1975)), particularly Chapter 87, by Blaug, Seymour, therein. These formulations include for example, powders, pastes, ointments, jelly, waxes, oils, lipids, anhydrous absorption bases, oil-in-water or water-in-oil emulsions, emulsions carbowax (polyethylene glycols of a variety of molecular weights), semi-solid gels, and semi-solid mixtures containing carbowax.
[0093]Any of the foregoing formulations may be appropriate in treatments and therapies in accordance with the present invention, provided that the active agent in the formulation is not inactivated by the formulation and the formulation is physiologically compatible.
2. Protein Replacement Therapy
[0094]The present invention also relates to the use of polypeptide or protein replacement therapy for those individuals determined to have a defective FD gene. Treatment of FD disease could be performed by replacing the defective FD protein with normal protein or its functional equivalent in therapeutic amounts.
[0095]FD polypeptide can be prepared for therapy by any of several conventional procedures. First, FD protein can be produced by cloning the FD cDNA into an appropriate expression vector, expressing the FD gene product from this vector in an in vitro expression system (cell-free or cell-based) and isolating the FD protein from the medium or cells of the expression system. General expression vectors and systems are well known in the art. In addition, the invention envisions the potential need to express a stable form of the FD protein in order to obtain high yields and obtain a form readily amenable to intravenous administration. Stable high yield expression of proteins has been achieved through systems utilizing lipid-linked forms of proteins as described in Wettstein et al. J Exp Med 174:219-228 (1991) and Lin et al. Science 249:677-679 (1990).
[0096]FD protein can be prepared synthetically. Alternatively, the FD protein can be prepared from total protein samples by affinity chromatography. Sources would include tissues expressing normal FD protein, in vitro systems (outlined above), or synthetic materials. The affinity matrix would consist of antibodies (polyclonal or monoclonal) coupled to an inert matrix. In addition, various ligands which specifically interact with the FD protein could be immobilized on an inert matrix. General methods for preparation and use of affinity matrices are well known in the art.
[0097]Protein replacement therapy requires that FD protein be administered in an appropriate formulation. The FD protein can be formulated in conventional ways standard to the art for the administration of protein substances. Delivery may require packaging in lipid-containing vesicles (such as LIPOFECTIN® or other cationic or anionic lipid or certain surfactant proteins) that facilitate incorporation into the cell membrane. The FD protein formulations can be delivered to affected tissues by different methods depending on the affected tissue.
3. Gene Therapy
[0098]Gene therapy utilizing recombinant DNA technology to deliver the normal form of the FD gene into patient cells or vectors which will supply the patient with gene product in vivo is also contemplated within the scope of the present invention. In gene therapy of FD disease, a normal version of the FD gene is delivered to affected tissue(s) in a form and amount such that the correct gene is expressed and will produce sufficient quantities of FD protein to reverse the effects of the mutated FD gene. Current approaches to gene therapy include viral vectors, cell-based delivery systems and delivery agents. Further, ex vivo gene therapy could also be useful. In ex vivo gene therapy, cells (either autologous or otherwise) are transfected with the normal FD gene or a portion thereof and implanted or otherwise delivered into the patient. Such cells thereafter express the normal FD gene product in vivo and would be expected to assist a patient with FD disease in alleviating the symptoms normally associated with FD disease. Ex vivo gene therapy is described in U.S. Pat. No. 5,399,346 to Anderson et al., the disclosure of which is hereby incorporated by reference in its entirety. Approaches to gene therapy are discussed below:
a. Viral Vectors
[0099]Retroviruses are often considered the preferred vector for somatic gene therapy. They provide high efficiency infection, stable integration and stable expression (Friedmann, T. Progress Toward Human Gene Therapy. Science 244:1275 (1989)). The full length FD gene cDNA can be cloned into a retroviral vector and driven by its endogenous promoter or from the retroviral LTR. Delivery of the virus could be accomplished by direct implantation of virus into the affected tissue.
[0100]Other delivery systems which can be utilized include adenovirus, adeno-associated virus (AAV), vaccinia virus, bovine papilloma virus or members of the herpes virus group such as Epstein-Barr virus. Viruses can be, and preferably are, replication deficient.
b. Non-Viral Gene Transfer
[0101]Other methods of inserting the FD gene into the appropriate tissues may also be productive. Many of these agents, however, are of lower efficiency than viral vectors and would potentially require infection in vitro, selection of transfectants, and reimplantation. This would include calcium phosphate, DEAE dextran, electroporation, and protoplast fusion. A particularly attractive idea is the use of liposomes (e.g., LIPOFECTIN®), which it might be possible to carry out in vivo. Synthetic cationic lipids and DNA conjugates also appear to show some promise and may increase the efficiency and ease of carrying out this approach.
4. Animal Models
[0102]The generation of a mouse or other animal model of FD disease is important both for understanding the biology of the disease and for testing potential therapies.
[0103]The present invention envisions the creation of an animal model of FD disease by introduction of the FD disease causing mutations in a number of species including mice, rats, pigs, and primates.
[0104]Techniques for specifically inactivating or mutating genes by homologous recombination in embryonic stem cells (ES cells) have been described (Capecchi Science 244:1288 (1989)). Animals with the inactivated homologous FD gene can then be used to introduce the mutant or normal human FD gene or for introduction of the homologous gene to that species and containing the T-C, G-C or other FD disease-causing mutations. Methods for these transgenic procedures are well known to those versed in the art and have been described by Murphy and Carter, Curr. Opin. Cell Biol. 4:273-279 (1992).
ILLUSTRATIVE EXAMPLES
[0105]The following examples are provided to illustrate certain aspects of the present invention and are not intended as limiting the subject matter thereof.
Example 1
[0106]The IKBKAP gene and the mutations associated with FD were identified as follows:
Patient Samples
[0107]Blood samples were collected from two major sources, the Dysautonomia Diagnostic and Treatment Center at New York University Medical Center and the Israeli Center for Familial Dysautonomia at Hadassah University Hospital, with approval from the institutional review boards at these institutions, Massachusetts General Hospital and Harvard Medical School. Either F.A. or C.M. diagnosed all patients using established criteria. Lymphoblast lines were established by Epstein-Barr virus transformation using standard conditions. Fibroblast cell lines were obtained from the Coriell Cell Repositories, Camden, N.J. RNA isolated from post-mortem FD brain was obtained from the Dysautonomia Diagnostic and Treatment Center at NYU. Genomic DNA, total RNA, and mRNA were prepared using commercial kits (Invitrogen and Molecular Research Center, Inc.). Cytoplasmic protein was extracted from lymphoblasts as previously described (Krappmann et al. 2000).
Identification of IKBKAP and Mutation Analysis
[0108]Exon trapping experiments of cosmids from a physical map of the candidate region yielded 5 exons that were used to screen a human frontal cortex cDNA library. Several cDNA clones were isolated and assembled into a novel transcript encoding a 1332 AA protein that was later identified as IKBKAP (Cohen et al. 1998). The complete 5.9 kb cDNA sequence of IKBKAP has been submitted to GenBank under accession number AF153419. In order to screen for mutations in FD patients, total lymphoblast RNA was reverse transcribed and overlapping sections of IKBKAP were amplified by PCR and sequenced. Evaluation of the splicing defect was performed using the following primers: 18F: GCCAGTGTTTTTTGCCTGAG; 19F: CGGATTGTCACTGTTGTGC; 23R: GACTGCTCTCATAGCATCGC (FIG. 1).
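The three primers listed above can be checked for basic properties with a short script. The following is a minimal sketch for illustration only: the primer sequences are copied from the text, while the helper names and the interpretation of "F" and "R" as forward and reverse primers are assumptions. The 16-nucleotide threshold reflects the minimum primer length recited in the claims.

```python
# Primer sequences as given in the text (written 5' to 3').
PRIMERS = {
    "18F": "GCCAGTGTTTTTTGCCTGAG",
    "19F": "CGGATTGTCACTGTTGTGC",
    "23R": "GACTGCTCTCATAGCATCGC",
}

MIN_PRIMER_LENGTH = 16  # minimum contiguous nucleotides recited in the claims

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize(primers):
    """Return a per-primer summary of length, GC fraction, and reverse complement."""
    return {
        name: {
            "length": len(seq),
            "meets_minimum": len(seq) >= MIN_PRIMER_LENGTH,
            "gc_fraction": round(gc_fraction(seq), 2),
            "reverse_complement": reverse_complement(seq),
        }
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, info in summarize(PRIMERS).items():
        print(name, info)
```

If 23R is indeed a reverse primer, its reverse complement is the form that would be searched for in the sense strand of the cDNA when locating the amplified region.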
DNA Sequencing
[0109]Sequencing was performed using the AmpliCycle sequencing kit (Applied Biosystems) or on an ABI 377 automated DNA sequencer using the BigDye terminator cycle sequencing kit (Applied Biosystems). The control sequence of the candidate region was obtained by constructing subclone libraries from BACs and sequencing using vector specific primers. The FD sequence was generated by sequencing cosmids from a patient homozygous for the major FD haplotype using sequence specific primers.
Expression Studies
[0110]Several human multiple tissue northern blots (Clontech) were hybridized using the following radioactively labeled probes: IKBKAP exon 2, IKBKAP exons 18/19/20, IKBKAP exon 23, and a 400 bp fragment of the IKBKAP 3'UTR immediately following the stop codon. Poly(A)+ RNA was isolated from patient and control lymphoblast lines, northern blotted, and hybridized using a probe representing the full coding sequence of IKBKAP. Cytoplasmic protein extracted from lymphoblast cell lines was western blotted and detected using ECL (Amersham) with an antibody raised against a peptide comprising the extreme carboxyl terminus (AA 1313-1332) of human IKAP, the protein encoded by IKBKAP (Krappmann et al. 2000).
[0111]To identify DYS, exon trapping and cDNA selection were used to clone and characterize all of the genes in the 471 kb candidate region: EPB41L8 (unpublished data) or EHM2 (Shimizu et al. 2000), C9ORF4 (Chadwick et al. 1999a), C9ORF5 (Chadwick et al. 2000), CTNNAL1 (Zhang et al. 1998), a novel gene with homology to the glycine cleavage system H proteins (CG-8) (unpublished data), IKBKAP (Cohen et al. 1998), and ACTL7A and ACTL7B (Chadwick et al. 1999b). As FD is a recessive disorder, the a priori expectation for the mutation was inactivation of one of these genes. Consequently, each of these was screened for mutations by RT-PCR of patient lymphoblast RNA and direct sequencing of all coding regions. Although many SNPs were identified, there was no evidence for a homozygous inactivating mutation. Thus, it was concluded that the mutation would be found in non-coding sequence and the control genomic sequence of the entire 471 kb candidate region was generated using BACs from a physical map. Direct sequence prediction using GENSCAN and comprehensive searches of the public databases did not reveal any additional genes in the candidate region beyond those found by cloning methods. However, SNPs identified during sequence analysis enabled us to refine the haplotype analysis and narrow the candidate interval to 177 kb shared by the major haplotype and the previously described minor haplotype 1 (Blumenfeld et al. 1999). This reduced interval contains 5 genes, CTNNAL1, CG-8, IKBKAP, ACTL7A and ACTL7B, all previously screened by RT-PCR without yielding a coding sequence mutation. A cosmid library was constructed from a patient homozygous for the major haplotype, the minimal coverage contig for the now reduced candidate interval was assembled, and the sequence of the mutant chromosome was generated.
[0112]Comparison of the FD and control sequences revealed 152 differences (excluding simple sequence repeat markers), which include 26 variations in the length of dT tracts, 1 VNTR, and 125 base pair changes. Each of the 125 base pair changes was tested in a panel of 50 individuals known to carry two non-FD chromosomes by segregation in FD families. Of the 125 changes tested, only 1 was unique to patients carrying the major FD haplotype. This T-C change is located at bp 6 of intron 20 in the IKBKAP gene depicted in FIG. 1, and is demonstrated in FIG. 2A. IKAP was originally identified as an IκB kinase (IKK) complex-associated protein that can bind both NF-κB inducing kinase (NIK) and IKKs through separate domains and assemble them into an active kinase complex (Cohen et al. 1998). Recent work, however, has shown that IKAP is not associated with IKKs and plays no specific role in cytokine-induced NF-κB signaling (Krappmann et al. 2000). Rather, IKAP was shown to be part of a novel multi-protein complex hypothesized to play a role in general transcriptional regulation.
[0113]The IKBKAP gene contains 37 exons and encodes a 1332 amino acid protein. The full-length 5.9 kb cDNA (GenBank accession number AF153419) covers 68 kb of genomic sequence, with the start methionine encoded in exon 2. IKBKAP was previously assigned to chromosome 9q34 (GenBank accession number AF044195), but it clearly maps within the FD candidate region of 9q31. Northern analysis of IKBKAP revealed two mRNAs of 4.8 and 5.9 kb (FIGS. 3a and b). The wild-type 4.8 kb mRNA has been reported previously (Cohen et al. 1998), while the second 5.9 kb message differs only in the length of the 3' UTR and is predicted to encode an identical 150 kDa protein. As seen in FIG. 3b, the putative FD mutation does not eliminate expression of the IKBKAP mRNA in patient lymphoblasts.
[0114]A base pair change at position 6 of the splice donor site might be expected to result in skipping of exon 20 (74 bp), causing a frameshift and therefore producing a truncated protein. However, initial inspection of our RT-PCR experiments in patient lymphoblast RNA using primers located in exons 18 and 23 (FIG. 1) showed a normal length 500 bp fragment that contained exon 20 (FIG. 4A), indicating that patient lymphoblasts express normal IKBKAP message. The Western blot shown in FIG. 4B demonstrates that full-length IKAP protein is expressed in these patient lymphoblasts. However, as the antibody used was directed against the carboxyl-terminus of IKAP, it would not be expected to detect any truncated protein should it be present. The presence of apparently normal IKAP in patient cells is at odds with the expectation of an inactivating mutation in this recessive disease.
[0115]In the absence of any evidence for a functional consequence of the intron 20 sequence change, the only alteration unique to FD chromosomes, additional genetic evidence was sought to support the view that it represents the FD mutation. The 658 FD chromosomes that carry the major haplotype all show the T-C change. In toto, 887 chromosomes have been tested that are definitively non-FD due to their failure to cause the disorder when present in individuals heterozygous for the major FD haplotype. None of these non-FD chromosomes exhibits the T-C mutation, strongly indicating that it is not a rare polymorphism. The frequency of the mutation in random AJ chromosomes was 14/1012 (gene frequency 1/72; carrier frequency 1/36), close to the expected carrier frequency of 1/32 (Maayan et al. 1987).
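The frequency figures in the preceding paragraph can be checked with a short calculation. The following is a minimal sketch, assuming Hardy-Weinberg proportions in the sampled population; the function name is hypothetical and not part of the disclosure. It reports both the simple approximation 2q, which reproduces the stated 1/36, and the exact heterozygote term 2pq.

```python
def allele_and_carrier_frequency(mutant_count: int, chromosomes_tested: int):
    """Return (q, 2q, 2pq) for a mutation seen mutant_count times.

    q   : mutant allele (gene) frequency
    2q  : carrier frequency by the common rare-allele approximation
    2pq : carrier frequency under Hardy-Weinberg proportions (an assumption)
    """
    q = mutant_count / chromosomes_tested
    p = 1.0 - q
    return q, 2.0 * q, 2.0 * p * q


# Counts taken from the text: 14 mutant chromosomes among 1012 random AJ chromosomes.
q, approx_carriers, hw_carriers = allele_and_carrier_frequency(14, 1012)
print(f"gene frequency      ~ 1 in {1 / q:.1f}")
print(f"carrier freq (2q)   ~ 1 in {1 / approx_carriers:.1f}")
print(f"carrier freq (2pq)  ~ 1 in {1 / hw_carriers:.1f}")
```

The two carrier estimates differ only slightly because the mutant allele is rare.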
[0116]In view of the strong genetic evidence that this mutation must be pathogenic, it was postulated that its effect might be tissue-specific. RNA extracted from the brain stem and temporal lobe of a post-mortem FD brain sample was therefore examined. In contrast to FD lymphoblasts, RT-PCR of the FD brain tissue RNA using primers in exons 19 and 23 (expected to produce a normal product of 393 bp) revealed a 319 bp mutant product, indicating virtually complete absence of exon 20 from the IKBKAP mRNA (FIG. 5, lanes 10-11). As additional FD autopsy material could not be obtained, intensive analyses of additional lymphoblast and fibroblast cell lines were performed to determine whether exon-skipping could be detected. Fibroblast lines from homozygous FD patients yielded variable results. Some primary fibroblast lines displayed approximately equal expression of the mutant and wild-type mRNAs while others displayed primarily wild-type mRNA. In addition, extensive examination of additional patient lymphoblast lines indicated that the mutant message could sometimes be detected at low levels. An example of the variability seen in FD fibroblasts and the presence of the mutant message in some FD lymphoblasts is shown in FIG. 5. In fact, close re-examination of FIG. 4a shows a trace of the mutant band in 2 (lanes 1 and 2) of the 3 FD samples. The absence of exon 20 in the FD brain RNA and the preponderance of wild-type mRNA in fibroblasts and lymphoblasts indicate that the major FD mutation acts by altering splicing of IKBKAP in a tissue-specific manner.
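The product sizes quoted above follow directly from the length of exon 20. As an illustrative sketch only (the constant and function names are hypothetical), the arithmetic and the frameshift consequence of skipping a 74 bp exon can be expressed as:

```python
EXON_20_BP = 74            # length of exon 20, from the text
NORMAL_PRODUCT_BP = 393    # expected exon 19 to exon 23 RT-PCR product, from the text


def skipped_product_size(normal_bp: int, exon_bp: int) -> int:
    """Size of the RT-PCR product when one internal exon is absent."""
    return normal_bp - exon_bp


def causes_frameshift(exon_bp: int) -> bool:
    """An exon whose length is not a multiple of 3 shifts the reading frame when skipped."""
    return exon_bp % 3 != 0


print(skipped_product_size(NORMAL_PRODUCT_BP, EXON_20_BP))
print(causes_frameshift(EXON_20_BP))
```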
[0117]To identify the mutations associated with minor haplotypes 2 and 3 (Blumenfeld et al. 1999), we amplified each IKBKAP exon, including adjacent intron sequence, from genomic DNA. A single G-C change at bp 2397 (bp 73 of exon 19) that causes an arginine to proline missense mutation (R696P) was identified in all 4 patients with minor haplotype 2 (FIG. 2b). This was subsequently confirmed by RT-PCR in lymphoblast RNA as shown in FIG. 2c for a region that crosses the exon 19-20 border. The PCR product, generated from an FD patient who is a compound heterozygote with minor haplotype 2 and the major haplotype, clearly shows that RNA is being expressed equally from both alleles based on heterozygosity of the G-C point mutation in exon 19. However, the RNA from the major haplotype allele shows no evidence for skipping of exon 20, which would be expected to produce a mixture of exon 20 and 21 sequence beginning at the end of exon 19. This confirms our previous observation that lymphoblasts with the major FD mutation produce a predominance of normal IKBKAP transcript.
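The arginine to proline substitution is consistent with the standard genetic code: a G to C change at the second position of any CGN arginine codon yields a CCN proline codon. The sketch below illustrates this general point only. The actual codon at residue 696 is not stated here, so all four CGN codons are shown, and the names are hypothetical.

```python
# Standard genetic code entries relevant to the example (DNA alphabet).
ARGININE_CGN = {"CGT", "CGC", "CGA", "CGG"}
PROLINE = {"CCT", "CCC", "CCA", "CCG"}


def substitute(codon: str, position: int, base: str) -> str:
    """Return the codon with the base at a 0-based position replaced."""
    return codon[:position] + base + codon[position + 1:]


for codon in sorted(ARGININE_CGN):
    mutant = substitute(codon, 1, "C")   # G -> C at the second codon position
    outcome = "Pro" if mutant in PROLINE else "other"
    print(f"{codon} (Arg) -> {mutant} ({outcome})")
```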
[0118]The R696P mutation is absent from 500 non-FD chromosomes, and it has been seen only once in 706 random AJ chromosomes in an individual who also carries the minor haplotype. This mutation is predicted to disrupt a potential threonine phosphorylation site at residue 699 identified by Netphos 2.0 (Blom et al. 1999), suggesting that it may affect regulation of IKAP. Interestingly, the presence of this minor mutation is associated with a relatively mild disease phenotype, suggesting that a partially functional IKAP protein may be expressed from this allele. No mutation has been identified for minor haplotype 3, which represents the only non-AJ putative FD chromosome.
Example 2
FD Diagnostic Assays
[0119]As discussed above, the allele-specific oligonucleotide (ASO) hybridization assay is highly effective for detecting single nucleotide changes in DNA and RNA, such as the T-C or G-C mutations or sequence variations, especially when used in conjunction with allele-specific PCR amplification. Thus, in accordance with the present invention, there is provided an assay kit to detect mutations in the FD gene through use of a PCR/ASO hybridization assay.
PCR Amplification
[0120]Genomic DNA samples are placed into a reaction vessel(s) with appropriate primers, nucleotides, buffers, and salts and subjected to PCR amplification.
[0121]Suitable genomic DNA-containing samples from patients can be readily obtained and the DNA extracted therefrom using conventional techniques. For example, DNA can be isolated and prepared in accordance with the method described in Dracopoli, N. et al. eds. Current Protocols in Human Genetics pp. 7.1.1-7.1.7 (J. Wiley & Sons, New York (1994)), the disclosure of which is hereby incorporated by reference in its entirety. Most typically, a blood sample, a buccal swab, a hair follicle preparation, or a nasal aspirate is used as a source of cells to provide the DNA.
[0122]Alternatively, RNA from an individual (i.e., freshly transcribed or messenger RNA) can be easily utilized in accordance with the present invention for the detection of the FD2 mutation. Total RNA from an individual can be isolated according to the procedure outlined in Sambrook, J. et al. Molecular Cloning--A Laboratory Manual pp. 7.3-7.76 (2nd Ed., Cold Spring Harbor Laboratory Press, New York (1989)) the disclosure of which is hereby incorporated by reference.
[0123]In a preferred embodiment, the DNA-containing sample is a blood sample from a patient being screened for FD.
[0124]In amplification, a solution containing the DNA sample (obtained either directly or through reverse transcription of RNA) is mixed with an aliquot of each of dATP, dCTP, dGTP and dTTP (e.g., Pharmacia LKB Biotechnology, N.J.), an aliquot of each of the DNA specific PCR primers, an aliquot of Taq polymerase (e.g., Promega, Wis.), and an aliquot of PCR buffer, including MgCl2 (e.g., Promega) to a final volume. Following pre-denaturation (e.g., at 95° C. for 7 minutes), PCR is carried out in a DNA thermal cycler (e.g., Perkin-Elmer Cetus, Conn.) with repetitive cycles of annealing, extension, and denaturation. As will be appreciated, such steps can be modified to optimize the PCR amplification for any particular reaction; however, exemplary conditions utilized include denaturation at 95° C. for 1 minute, annealing at 55° C. for 1 minute, and extension at 72° C. for 4 minutes, for 30 cycles. Further details of the PCR technique can be found in Erlich, "PCR Technology," Stockton Press (1989) and U.S. Pat. No. 4,683,202, the disclosure of which is incorporated herein by reference.
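By way of illustration, the exemplary cycling conditions can be written down as a small data structure, for instance when programming a thermal cycler. This is a sketch under stated assumptions: the class and variable names are hypothetical, and ramp times between temperatures are ignored, so the total is a lower bound on run time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    minutes: float


# Values taken from the exemplary conditions in the text.
PRE_DENATURATION = Step("pre-denaturation", 95, 7)
CYCLE = (
    Step("denaturation", 95, 1),
    Step("annealing", 55, 1),
    Step("extension", 72, 4),
)
N_CYCLES = 30


def total_hold_minutes(pre: Step, cycle: tuple, n_cycles: int) -> float:
    """Sum of programmed hold times, excluding temperature ramping."""
    return pre.minutes + n_cycles * sum(step.minutes for step in cycle)


print(total_hold_minutes(PRE_DENATURATION, CYCLE, N_CYCLES))
```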
[0125]In a preferred embodiment, the amplification primers used for detecting the T-C mutation and the G-C mutation in the FD gene are 5'-GCCAGTGTTTTTGCCTGAG-3'/5'-GACTGCTCTCATAGCATCGC-3' and 5'-CGGATTGTCACTGTTGTGC-3'/5'-GACTGCTCTCATAGCATCGC-3', respectively.
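Basic properties of the recited primers can be tabulated with a few lines of code. The following is a minimal sketch: the dictionary labels ("forward", "shared reverse") are assumptions made for readability, and the 16-nucleotide minimum is the figure recited in the claims.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
MIN_PRIMER_LENGTH = 16  # "at least 16 contiguous nucleotides", per the claims

# Primer sequences copied from the text; the labels are illustrative assumptions.
PRIMERS = {
    "T-C forward": "GCCAGTGTTTTTGCCTGAG",
    "G-C forward": "CGGATTGTCACTGTTGTGC",
    "shared reverse": "GACTGCTCTCATAGCATCGC",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    long_enough = len(seq) >= MIN_PRIMER_LENGTH
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.2f}, meets minimum: {long_enough}")
```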
Hybridization
[0126]Following PCR amplification, the PCR products are subjected to a hybridization assay using allele-specific oligonucleotides. In a preferred embodiment, the allele-specific oligonucleotides used to detect the mutations in the FD gene are as follows:
TABLE-US-00001 5'-AAGTAAG(T/C)GCCATTG-3' and 5'-GGTTCAC(G/C)GATTGTC-3'.
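Each allele-specific oligonucleotide above is written with the variable base in parentheses. As an illustrative sketch, the notation can be expanded into the two probe sequences it represents. The function name is hypothetical, and the reading that the first listed base is the normal allele and the second the mutant is an assumption based on the "T-C" and "G-C" naming used in this description.

```python
import re

ASO_PATTERN = re.compile(r"([ACGT]+)\(([ACGT])/([ACGT])\)([ACGT]+)")


def expand_aso(notation: str) -> tuple:
    """Expand 'LEFT(N/M)RIGHT' into (normal_probe, mutant_probe)."""
    match = ASO_PATTERN.fullmatch(notation)
    if match is None:
        raise ValueError(f"unrecognized ASO notation: {notation!r}")
    left, normal, mutant, right = match.groups()
    return left + normal + right, left + mutant + right


for aso in ("AAGTAAG(T/C)GCCATTG", "GGTTCAC(G/C)GATTGTC"):
    normal_probe, mutant_probe = expand_aso(aso)
    print(f"normal 5'-{normal_probe}-3'   mutant 5'-{mutant_probe}-3'")
```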
[0127]In the ASO assay, when carried out in microtiter plates, for example, one well is used for the determination of the presence of the normal allele and a second well is used for the determination of the presence of the mutated allele. Thus, the results for an individual who is heterozygous for the T-C mutation (i.e. a carrier of FD) will show a signal in each of the wells, an individual who is homozygous for the T-C allele (i.e., affected with FD) will show a signal in only the C well, and an individual who does not have the FD mutation will show only one signal in the T well.
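The two-well readout described above maps onto a simple decision rule. The sketch below is illustrative only: the function name and the returned labels are hypothetical, and the "no call" branch for the case where neither well gives a signal is an added assumption, since that case is not addressed in the text.

```python
def call_genotype(normal_well_signal: bool, mutant_well_signal: bool) -> str:
    """Interpret a two-well ASO result for the T-C mutation."""
    if normal_well_signal and mutant_well_signal:
        return "heterozygous (carrier)"
    if mutant_well_signal:
        return "homozygous for the C allele (affected)"
    if normal_well_signal:
        return "homozygous for the T allele (no FD mutation)"
    return "no call (no signal in either well)"


for wells in [(True, True), (False, True), (True, False), (False, False)]:
    print(wells, "->", call_genotype(*wells))
```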
[0128]In another embodiment, a kit for detecting the FD mutation by ASO assay is provided. In the kit, amplification primers for DNA or RNA (or generally primers for amplifying a sequence of genomic DNA, reverse transcription products, complementary products) including the T-C mutated and normal alleles are provided. Allele-specific oligonucleotides are also preferably provided. The kit further includes separate reaction wells and reagents for detecting the presence of homozygosity or heterozygosity for the T-C mutation.
[0129]Within the same kit, or in separate kits, oligonucleotides for amplification and detection of other differences (such as the G-C mutation) can also be provided. If in the same kit as that used for detection of the T-C mutation, separate wells and reagents are provided, and homozygosity and heterozygosity can similarly be determined.
[0130]In another embodiment a kit combining other diseases (i.e., Canavan's)
Example 3
FD Diagnostic: Other Nucleotide Based Assays
[0131]As will be appreciated, a variety of other nucleotide based detection techniques are available for the detection of mutations in samples of RNA or DNA from patients. See, for example, the section, above, entitled "Nucleic Acid Based Screening." Any one or any combination of such techniques can be used in accordance with the invention for the design of a diagnostic device and method for the screening of samples of DNA or RNA for FD gene mutations in accordance with the invention, such as the mutations and sequence variants identified herein. Further, other techniques, currently available, or developed in the future, which allow for the specific detection of mutations and sequence variants in the FD gene are contemplated in accordance with the invention.
[0132]Through use of any such techniques, it will be appreciated that devices and methods can be readily developed by those of ordinary skill in art to rapidly and accurately screen for mutations and sequence variants in the FD gene in accordance with the invention.
[0133]Thus, in accordance with the invention, there is provided a nucleic acid based test for FD gene mutations and sequence variants which comprises providing a sample of a patient's DNA or RNA and assessing the DNA or RNA for the presence of one or more FD gene mutations or sequence variants. Samples of patient DNA or RNA (or genomic, transcribed, reverse transcribed, and/or complementary sequences to the FD gene) can be readily obtained as described in Example 2. Through the identification and characterization of the FD gene as taught and disclosed in the present invention, one of ordinary skill in the art can readily identify the genomic, transcribed, reverse transcribed, and/or complementary sequences to the FD gene sequence in a sample and readily detect differences therein. Such differences in accordance with the present invention can be the T-C or G-C mutations or sequence variations identified and characterized in accordance herewith. Alternatively, other differences might similarly be detectable.
[0134]Kits for conducting and/or substantially automating the process of identification and detection of selected changes, as well as reagents utilized in connection therewith, are therefore envisioned in accordance with the present invention.
As discussed above, through knowledge of the gene-associated mutations responsible for FD disease, it is now possible to prepare transgenic animals as models of the FD disease. Such animals are useful in both understanding the mechanisms of FD disease as well as use in drug discovery efforts. The animals can be used in combination with cell-based or cell-free assays for drug screening programs.
Example 4
Creating Animal Models of FD
[0135]The first step in creating an animal model of FD is the identification and cloning of homologs of the IKBKAP gene in other species.
Isolation of Mouse cDNA Clones
[0136]The human IKBKAP sequence (GenBank Accession No. AF153419) was used to search the mouse expressed sequence tag database (dbEST) using the BLAST program (www.ncbi.nlm.nih.gov/BLAST). A single 5' EST from a mouse brain library (GenBank Accession No. AU079160) was identified that showed marked similarity to the 5' end of IKBKAP. The corresponding cDNA clone, MNCB-3931, was obtained from the Japanese Collection of Research Bioresources/National Institute of Infectious Diseases. In addition, eight ESTs that were similar to the 3' end of the ORF were found to belong to UniGene cluster Mm.46573 (www.ncbi.nlm.nih.gov/Unigene). Examination of this cluster yielded several poly(A+)-containing clones, and we obtained the clone UI-M-CG0p-bhb-g-07-0-U1 (GenBank Accession No. BE994893) from Research Genetics.
RT-PCR Analysis
[0137]RNA (1 ug/ml) from BALB/c mouse brain was obtained commercially (Clontech). Oligo-dT 15 and random hexamer primers were annealed to the template at 65° C. for 10 min in the presence of 1× first-strand buffer, 2 mM dNTP mix, and 4 mM DTT. The reaction mixture was incubated at 42° C. for 90 min after addition of SuperScript™ II RT (200 U/ul) and RNase inhibitor (80 U/ul) (GIBCO).
DNA Sequencing and Analysis
[0138]DNA sequencing was performed using the AmpliCycle sequencing kit (Applied Biosystems) for the [33P]-labeled dideoxynucleotide chain termination reaction, using the following conditions: 30 sec at 94° C., 30 sec at 60° C., and 30 sec at 72° C. for 30 cycles. The radioactively labeled sequence reaction product was denatured at 95° C. for 10 min and run on a denaturing 6% polyacrylamide gel for autoradiography. Basic sequencing manipulations and alignments were carried out using a program from Genetics Computer Group (GCG; Madison, Wis.). The cDNA sequences generated throughout the experiments were aligned and assembled into a 4799-bp cDNA named Ikbkap.
Isolation of Full-Length cDNA
[0139]To obtain the full-length cDNA sequence, PCR was performed on the mouse cDNA template using primers designed from the sequence of the 5' and 3' cDNA clones. The PCR conditions were as follows: 15 sec at 95° C., 30 sec at 54° C. to 60° C., and 3 min at 68° C. for 9 cycles; then 15 sec at 95° C., 30 sec at 54° C. to 60° C., and 3 min with an increment of 5 sec for each succeeding cycle at 68° C. for 19 cycles, followed by 7 min at 72° C. The PCR products were electrophoresed on a 1% agarose gel stained with ethidium bromide and were cleaned using a Qiaquick PCR cleaning kit (Qiagen) in preparation for cycle sequencing. Successive primers were designed in order to obtain the full-length Ikbkap sequence, which was deposited in GenBank under Accession No. AF367244.
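The incrementing extension step in the conditions above can be made explicit as a per-cycle schedule. This is a sketch under an assumption that is not stated in the text: the first of the 19 incremented cycles is taken to extend for 3 min plus 5 sec, with each later cycle adding a further 5 sec. The function name is hypothetical.

```python
def extension_schedule(base_s: int = 180, fixed_cycles: int = 9,
                       incremented_cycles: int = 19, increment_s: int = 5) -> list:
    """Extension time in seconds for every cycle of the two-phase program."""
    schedule = [base_s] * fixed_cycles
    # Assumption: the increment applies starting with the first cycle of phase two.
    schedule += [base_s + increment_s * i for i in range(1, incremented_cycles + 1)]
    return schedule


schedule = extension_schedule()
print(len(schedule), "cycles; final extension step lasts", schedule[-1], "sec")
print("total time at 68 C:", sum(schedule), "sec")
```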
Northern Blot Analysis
[0140]Expression of Ikbkap was examined using both mouse embryo and adult mouse multiple tissue Northern blots (Clontech). The blots were probed with a 1045-bp PCR fragment that contains exons 2 through 11, which was generated using primer 1 (5'-GGCGTCGTAGAAATTGC-3') and primer 2 (5'-GTGGTGCTGAAGGGGCAGGC-3'). The probe was radiolabeled (Sambrook et al., 1989) and was hybridized according to the manufacturer's instructions.
Chromosome Mapping of the Mouse Ikbkap Gene
[0141]Several of the mouse Ikbkap ESTs belonged to the UniGene cluster Mm.46573, which has been mapped to chromosome 4 (UniSTS entry: 253051) between D4Mit287 and D4Mit197. To assess synteny between mouse chromosome 4 and human chromosome 9, we used several resources available at NCBI (www.ncbi.nlm.nih.gov/Homology).
Determination of Genomic Structure of the Mouse Ikbkap
[0142]The 37 human IKBKAP exons were searched against the Celera database to obtain homologous mouse sequences. Approximately 130 mouse genomic fragments (500-700 bp) were obtained using the Celera Discovery System and Celera's associated database, and these fragments were assembled into seven contigs. In order to assemble the complete genomic sequence, we obtained six mouse bacterial artificial chromosomes (BACs) from Research Genetics after they screened an RPCI-23 mouse library using 4300 bp human probe that contained exon 2. To verify that these BAC clones contained the entire Ikbkap gene, we amplified fragments from the 5' and 3' ends of the gene, as well as a fragment from the 3' flanking gene Actl7b (Slaugenhaupt et al., 2001). We designed primers at the ends of each of the seven contigs constructed from the Celera data and generated PCR products from the BACs. Subsequently, we sequenced and closed five of the gaps, with the resulting two contigs assembled and deposited to Celera (Accession No. CSN009).
Creating a Targeting Vector
[0143]After cloning and sequencing the mouse homolog of the human IKBKAP gene, a targeting vector can then be constructed from the mouse genomic DNA. The targeting vector would consist of two approximately 3 kb genomic fragments from the mouse FD gene as 5' and 3' homologous arms. These arms would be chosen to flank a region critical to the function of the FD gene product (for example, exon 20).
[0144]In place of exon 20, negative and positive selectable markers can be placed, for example, to abolish the activity of the FD gene. As a positive selectable marker, a neo gene under control of the phosphoglycerate kinase (pgk-1) promoter may be used, and as a negative selectable marker, a pgk-1 promoted herpes simplex thymidine kinase (HSV-TK) gene flanking the 5' arm of the vector can be used.
[0145]The vector is then transfected into R1 ES cells and the transfectants are subjected to positive and negative selection (e.g., G418 and ganciclovir, respectively, where neo and HSV-TK are used). PCR is then used to screen surviving colonies for the desired homologous recombination events. These are confirmed by Southern blot analysis.
[0146]Subsequently, several mutant clones are picked and injected into C57BL/6 blastocysts to produce high-percentage chimeric animals. The animals are then mated to C57BL/6 females. Heterozygous offspring are then mated to produce homozygous mutants. Such mutant offspring can then be tested for the FD gene mutation by Southern blot analysis. In addition, these animals are tested by RT-PCR to assess whether the targeted homologous recombination results in the ablation of the FD gene mRNA. These results are confirmed by Northern blot analysis and RNase protection assays.
[0147]Once established, the FD gene-/- mice can be studied for the development of FD-like disease and can also be utilized to examine which cells and tissue-types are involved in the FD disease process. The animals can also be used to introduce the mutant or normal FD gene or for the introduction of the homologous gene to that species (i.e., mouse) and containing the T-C or G-C mutations, or other disease causing mutations. Methods for the above-described transgenic procedures are well known to those versed in the art and are described in detail by Murphy and Carter supra (1993).
[0148]The techniques described above can also be used to introduce the T-C or G-C mutations, or other homologous mutations in the animal, into the homologous animal gene. As will be appreciated, techniques similar to those described above can be utilized for the creation of many transgenic animal lines.
[0149]To the extent that any reference (including books, articles, papers, patents, and patent applications) cited herein is not already incorporated by reference, they are hereby expressly incorporated by reference in their entirety.
[0150]While the invention has been described in connection with specific embodiments thereof, it will be understood that it is capable of further modification, and this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice in the art to which the invention pertains and as may be applied to the essential features hereinbefore set forth, and as fall within the scope of the invention and the limits of the appended claims.
Sequence CWU
1
89 1 66479 DNA Homo sapiens 1 ccagtgctgc ggctgcctag ttgacgcacc cattgagtcg
ctggcttctt tgcagcgctt 60cagcgttttc ccctggaggg cgcctccatc cttggaggcc
tagtgccgtc ggagagagag 120cgggagccgc ggacagagac gcgtgcgcaa ttcggagccg
actctgggtg cggactgtgg 180gagctgactc tgggtagccg gctgcgcgtg gctggggagg
cgaggccgga cgcacctctg 240tttgggggtc ctcaggtaag cgatccatcc agggtagggg
cacgggagtg gacctctccg 300ccggcggtgt ccgggtgaag gagacccgga gcctcctctg
cctgctgcgg gccggggact 360ggagtgcggg ctgcaccacc tctttcctag agccttaaat
tctttttgca gccttgccac 420ctgctccatc gggggcgctg ggaggcgcga cagcccaggg
atgcctgctg cccctccagc 480cggacttaac ccagcctctt gattgcttgc agggggttga
taataacgct gaaagcgaga 540gtattaattc acgatggaag gcggcggtta atagaggctc
gggtgctgtg gtgcgggtcc 600tttctcgcgt gtgagacttt ttcgtggagg tggtgtcctc
tgtgcttctc catctaacgt 660ggtgttttac gtggctttct ctcccgttaa cgatgatctc
cgtggagaca gtggctgagt 720aatcttcaga tcccagtact tagcaagtgc tcagtcggtg
ttggatgtag gccacaaacc 780ggatcgtaaa gaattcaact gtatattgac agccacggaa
ctaatcaatg aatagatccg 840tatgaagagt aagcaaaaag gcagcaaaga cagtttttca
gcttggggac atagagtaga 900aatggtctgt ccccaaatag tgggaactgt catttggggg
aagaatagca agttctttgc 960tttccaggtc gcatttgatg tgcatgtgag acatgcttgt
gattctatca ggaggttgaa 1020aatgtgggtt tagtggtaag tttgggctaa ttcagtcagg
gctaggcatt taggcctaat 1080cagcgtattg gtgatctacc tggtatatgt aatcatgcat
gtgatgtcta gccaagaggt 1140ggatagtcga aggagcaagg gaagaaaatg aagcagttat
caggaaatta agagagaatc 1200cacgattgac ctttggtgtg gagggatctt tagcacattt
aagaactgcg aagagtttga 1260atcagtggag gcaggaaggt tggaggttgc agatgtccaa
gaaagagtac taataggcct 1320aggtcctgtg gcaatatgga ggatattcct ttcctagcct
ggaaagaagt ggagggaagt 1380cttcctccga gaagataagg gaataaggct gatgggtgtg
aaatttcaga gaaactagtt 1440ttgaggcgtt tttatgatgt ttaaagatga aaaacgagca
ggcacggtgg ctcaggcctg 1500taatcccagc actttgggag gcagaggcgg gtggatcact
tgaggttagg agttcaagaa 1560cagcctgggc aacatggtga aaccctgtct ctactaaaaa
tacaaaaatt aactgggcat 1620ggtgccgggc gcctgtaatc ccagctactc cggaggctga
ggcaggagaa tcgcttgaac 1680ccggaaggca gatgttgcgg tgagccgaga tcgcgccatt
gcaccccagc ctgggcaata 1740agagcgaaac tccgtctcaa aaaacaaaaa aacctgcatg
atatgttaga ggttcaagta 1800atttctagca gttcttgaat ataattgtca ccaaaactta
ctaaaatcat tgtcttcctc 1860acttccatca tatataaact tacctttctc ttatcccaca
ttatatatta tataattcct 1920atgacacttg acattatctt ctgtgtacta ttaggattga
ttcatcttta ttctttctat 1980gtcatacata tgtggggtgc caagatgaga gaagtctcct
tggattaaag tgacaataag 2040accggtgtgg tccttgtaat tgctacccct aacataagtt
agggacttac aatcataagc 2100cttaaaggga tctgaatata aataactagc acagtaacat
ttttttcccc tacttaggta 2160atgttatgca tttaagcaag cctgattttg ccagaccaaa
gtagatgtct tgtttagcac 2220tcttttctca cgttttatat tgtcctggga aaagcctggc
cagaagaaca aagttactgg 2280aagtagttat gtcaggtcat cagggtcctt gaaatgttgg
tcatcatttt gaagtaaatt 2340gttgtcatgt cccagtattt tctcttcccc tttagaacag
taaatgcttt tctatctttg 2400atttcagttt ttttatgaat gtataaaacc agtttataaa
tgaatagacc tggtgaatat 2460taaagtcatt tcagattctc ttcaactgcc agtatataaa
aatggatttt caaatagtgc 2520taatcagtgg gatacccttt tgtttttcct catgatttta
taaagatgtc ctaatatgca 2580aaaataaaat gtttccccat tcatttgttc tttcaacttt
cccaaaggaa taactgatat 2640tacatctttt ttgaagaaaa cattctaaag ttgagaatct
tgcctctcct aaaaagaaca 2700taaaataggt ttcagaattc ctaatttgta gaccataact
gtatagagtg ggtcaggttg 2760ctgctataat ccatacatgg gtgtgtactc agagaggtaa
gttttttctt ttcttggtta 2820ttctgattct gactaccact tcttcacccc ctgaatcatt
tcatttaaat aaatatggtc 2880atttatcact attaagctat ttatttttct cttagagatt
aatgattcat caagggatag 2940ttgtacttgt ctcgtgggaa tcacttcatc atgcgaaatc
tgaaattatt tcggaccctg 3000gagttcaggg atattcaagg tccagggaat cctcagtgct
tctctctccg aactgaacag 3060gggacggtgc tcattggttc agaacatggc ctgatagaag
tagaccctgt ctcaagagaa 3120gtaagttact gatgtagaat gccagcatgt gggtatgacc
cttgatttct cttcttccaa 3180atttctttcc ccacatggtc tttctttata tcttattgaa
tttatatcct cccaaataaa 3240catcttttgc ttcatatata tgccatgtta gacatagctt
aaatcgtaat ccttctttaa 3300ctctgctgct attttaacct aagtcagtag aactctgacc
ttactttttg agtgtgtgcc 3360gtacttttta ccctctttgt catgcaaatt ctgtttataa
gagtggtttt tttttttttt 3420tttttgagac ggagtctcgc tctgtcaccc aggctggagt
gcagtggtgt gatcgtggct 3480cactgcaagc tccgcctccc cgggttcaca ccattctcct
gcctcagcct cccgagaagc 3540tgggactaca ggcgcccgcc accgcgcccg gctaattttt
tgtattttta gtagatgtgc 3600ggtttcaccg tgttagccag gatggtcttg atctcctgac
ctcgtgatcc gcctgcctca 3660gcgcccggcc aagagtggtt tttaattggg aatgaacacg
aaagttgccc atggagcttt 3720ctaaaagttt gagcccacat ctcatgtcaa ctaaatcaga
atctttagtg ttggctccta 3780actatatgta ctttaaaaac ctctgtgggt tggttttgat
atggtccctt gattatgttc 3840ttctactaat acattttagg cagttacatc ctttagtgcc
ttttccccat actatagaaa 3900tcttagaaaa gcatagctat tagcatcata ttttagtgga
caattttaaa gagaccaggc 3960ttattgtttt tgtttttgtg tttgtttggc aaaaaggtca
cattacctat ttttcttgtt 4020agagatgaca gagtagtgat atttctcaaa tgaaagtttg
gattttcatc tagaaaaaat 4080atttttgaaa gcttttatgt aataaaagaa gcattaaaaa
gtatttctgg aaatgttatc 4140aattattctt gaaagtagac tgggttaatt tgcttgtgtt
tactttggtg aaaggtgaaa 4200aatgaagttt ctttggtggc agaaggcttt ctcccagagg
atggaagtgg ccgcattgtt 4260ggtgttcagg acttgctgga tcaggagtct gtgtgtgtgg
ccacagcctc tggagacgtc 4320atactctgca gtctcagcac acaacaggta agtggaagac
tccagtgagg ggggagtctc 4380aagcatcctc aaataggtta cttgctattt gtggaagttt
tcaaatcagt agccataata 4440gttacacttt tgctaattaa tttttgcatt atatatttct
ttatttaaaa aattgttaac 4500atggctttat ctatatgtta agattcttct aaaactgagt
tttgtctgct gcatctatta 4560atcagagtga tcagaatgtt ccaaatgaga atatattttt
ttaaaagtta aaactggcta 4620ttcttatgtg gtgtagatca cctcttatca gaccctcatc
ttgagttgca acctttgttt 4680ctcaatttag gaagtctttg tttatctgac ttagattttc
tgttatgaat gttgattggc 4740taaatttaga gtccctgaag tctaggcact aaagtaaata
cattgtcatt acctgcacat 4800gtgatgactg ccagtagagc tagacttcaa gcaattgctt
ctttctctac tttagtgtat 4860agttgagttt ctgatttcta tcctcacctt cttaacagca
agggtttcaa attacacttg 4920gctgattctt taaatcttct tccattactt cattagttgt
gatctcctta acattgatta 4980tgtcacagaa gttagagtat tactaatagt aggataatga
tagcagctta catttattaa 5040ctatcatgtg cctggcactt tttaaagtgc ttttcatgca
aatttattta atcttcacca 5100tgaccttatg cagtaggttg ttgtttccta ttcttcagaa
gaggcagtta aggcacagag 5160tgcttaagta attagaccag ggtcacacag taatcaaatg
gggtttgacc ctagcagtct 5220aaatctggca cctctgctct taaccattcc atttagtaca
atcataaacc tttacttgca 5280gttcatggtg ggaaatatca aacttgtcat atacagcttg
tttttttttc gtatttgaaa 5340gatagatgct tttactttcc aaacattttg tagcattgtt
tcctggttac tgagctcttc 5400cagtctattt atcttcattt aatggtgctg attctgccct
ttagtggctt ctcaattgtc 5460tgaaaggtag agcccactat tgtgccttat aagccccttt
cactatctgt tccccacatt 5520cctttttagc ctcatccccc cattgttcct gtgtgtacgt
aaaccttatg ttttagttgc 5580agctgatttt taactgctct tttttctggc tttgtgcctc
tacactgtgt tttcttcctg 5640gtctctcttt cctgtcctta ttaccactct ttgaaacacg
tcagaaaaac tttttctgga 5700ctttgggcca cttgtcattc cctgtgctga gacgcatttt
gctttccaga gatcttggtc 5760attgctgtta tcctctgtag ggtcttcttt tatctccctc
gtgagacagc tctgggaaga 5820aaaagatatt tatttctaat ccctgtgcct aataacaggt
ctattctctt gatatccatt 5880actgaagaaa tgtttgttga gtaagttctt gttttaattt
ttaaatataa atttttaatt 5940tttatgagta catagtaggt acatatattt atgggctaca
tgagatgttc tgatacaggc 6000atgcagtgca aaataaccac atcatggaga ataggatatc
catcccatca agcgtttatc 6060ctttgtgtta caaacaatcc aattacagtc ttttagttat
tttaaaatgt gcaattactg 6120ttgactgtag ttaccttgtt gtgctatcaa atagcaggtc
ttatttattc tatttttttt 6180tgtacctatt aaccatccca acttccctca gcccctcact
acccttccca gcctctggta 6240accatccttg tactctctgt gtccgtgagt tcaattgttt
tgatttttag atcgcacaaa 6300taagtgagaa catgtgatgt ttgtcttttt gtgtctggtt
tatttcactt aatgtaatca 6360tctccagttc catctatgtt gttgcagatg acaggatctt
attctttttt tatggctgaa 6420tagtactcca ttgtgtatag taccacaatt tctttatcca
gtcatccatt gatggacact 6480taggttgctt ccaaatctta gctattgtga acagagctgc
aacaaacatg agagtgcaga 6540tatctcttcc atatactgat gtcttttcgt tttgtttttt
taattgtttt gattgaagtt 6600gcagtcagtt tttactgaga tgctagtgtt tgaatctctc
ttttcaattt tctctgtctc 6660agctggagtg tgttgggagt gtagccagtg gtatctctgt
tatgagttgg agtcctgacc 6720aagagctggt gcttcttgcc acaggtaagc ttgttactgg
tgcctcactg gcttttttaa 6780aacattattc cagatgtctt acaggcttca tcagctttag
gctgcttgaa tttcaaaaaa 6840tttctttgaa ccagtataat accaattatg aaccagtata
ataccaatta tgtatgtgtg 6900tgtgtatata tatataaaac gtagagtgat ttttttttgg
tgactgaagt tttgcctctt 6960agtctatcat tataaaaagt tgtttcatgt aactttttaa
gtctttggga gtaagaaaca 7020aagtcataaa acttggggag gctgctaagt ccccagttag
agttaaaaat gtcagcaata 7080tgtattttaa cttattctaa gagttgctgt atggacacat
tctaaaagcc cttcttgggt 7140tctgttgctg tttttcccct ttaagtctca tcattccaga
tgagtttagt aaaccagctc 7200cactgatgac atttatattt agaggtatct tggggacaag
gagtgttgaa gttagtggag 7260gagggctttg tggactttta agttcaactg tacacacatt
aatagctgag cataagcacc 7320aggtgactta tctagggaaa gctttttggg gttttttgtc
attgttgttt ttttaagtca 7380aagcattttg gatgaattct gtctgctctg ttcagactaa
ctccagctcc ttagcttaca 7440gtgccatagg tacttaggaa tggcaaattt gttacatgaa
aacaaaatca tttttgtttg 7500tgtttctcta aggtcaacag accctgatta tgatgacaaa
agattttgag ccaatcctgg 7560agcagcagat ccatcaggat gattttggtg aaagtaagta
tagctttgtg caatattttg 7620tgacctacgt ttcttcccat ttttgaccat ttccttgtgc
actaatagcc atgtcattag 7680gccaaagaac tgtgaaagtt aaacccccag ctattaaatg
tctattagcc cagttccttc 7740agcccatccc aaatcttaaa aggcctactg atgcctctcc
aggtctgagg gtttaaggtc 7800acttagatag ttattaccca aaccctagga aagtcttagg
ctgggctttc agtgaaaggg 7860actgtacaag gtagtatttc tgggatacag ttttagggag
aagaaaagaa gaaagatgga 7920atagaaggct ggtttttgtt actacgatta gatccaatct
gcatttccat gggaacaatc 7980agattatttt cttgctaaaa tctagccaag gtcatctggg
cattaaggct gtgggggtat 8040tgaagggcag tgcaggagaa gagagacgct tattaagcat
aagctttggc catcttgaag 8100tcacaaagta gctggcctga ttgaagaggg atggggaaga
agatgttcca acttctgtta 8160tggtctaact tcctgccttc ttgctccatc aactctgaga
aatcatttag acaacttcta 8220cccatttatt tacaaataat gtatttgttc agaaataatt
ttggagggct gggcacagtg 8280gctcatgcct gtaatcccag cacttttgga ggatgaggca
ggaggattgc ttgagcccag 8340gagtttgata ctagcctagg caacgtaggg agacccagca
tctacaaaga atttaaaaat 8400tagctgggct tggtggtatc agcacagtaa tgacatgatg
tgcaggtact ggggtagcat 8460aagggaagga aacgagtaac tagagaggga tgatttattt
cccctaggag gccaacttga 8520gctgagtctc agctgaattg gtgttgggta ggtgagggat
aagggtgggg agtagtcagc 8580tgaattggta ttgggcaggt gagggataag ggtttggagt
agtcagctga attggtattg 8640ggtaggtgag ggataagggt ggggaacagt ccaagcaagt
gaatgtgtcc atttcaagtg 8700tccatttcaa gggagggtta tttcatagaa acattgtggg
ttactcaggg aactgtgagt 8760aattcagcat tgctgaagtg gcagaatgtg agtgtagaat
gaaataaatg gaacagattt 8820gattgagttt gtagtaggga atatggacat tgagttatag
ttgatcagcc attacaagtt 8880ttgatgataa gaggtttaaa gagatttatt taatagaaag
atggctcgtg atggcatatt 8940tttgttgttt ttgtgtgtgg agagggaaga gatgagaggc
agggtgatca ggtaggaggt 9000tgctacagga atccagatga aagataagga aggtttgtgt
ggggctagaa gcaggaatca 9060ttcaggaaaa aacttgattc acaatgagga tgggagtaca
ttttttagaa ttagctggga 9120aactttttta gaatatatgt gcatgattcc ccttctgccc
taggccagtt tgagaaatac 9180caatttagaa agtgaaataa ataggctttg cgtatgtaag
gtgaataaga aaaagttgag 9240caggactcca gccagaacct caggtgttgg gaataaagat
gccagtaaca gggaagatgg 9300agaagtgctg gtctgtaagg ggtgggtggt gagatctgtt
ttggatttgt tgaaggacca 9360tatgtgattg ccatgtggag tatgcaaata taaggctgaa
gctcaggaga ggccagagct 9420atggactgag agtagtgggt atgtaggaaa ttctgacagt
tttgggaaca gatggactgt 9480ctcagggagc agatgctgta caggaagagt ctagaatcca
gggtggaact ctggggcatc 9540cagctttgag gacagtcaga gagagagtaa cagcacacag
tatactttgg gatgggaaag 9600tgctctgggc ctggtgtttc ccactgactt tttcacacaa
atcctaatgc agtaaatcaa 9660aggaaatgta ggccaagtta agatcttagg tctcagaaat
gtgtttctca gtacaaaaaa 9720aaaaaaatca ttctatggag tgatgaatat ttttcctcta
tcctggggtc agtagacttg 9780ttctgaaaag ggctaggtca tgaatatgtt cagctttgca
ggctgtatga tctgtgttgc 9840agctgctcaa ttctaatgtt gaggtgtgaa agttatacat
gatacataag cacatctatg 9900ttccagtaaa cgtttgtttg taaaagcaga tgtaggctgt
agttttgcaa atccctgctg 9960taaccccatc atttcttgtc ttccattgga aaagttctct
ttcttcattc cttggtcctt 10020aatctttctg tggaaacttg cagatagaag cctgggggtt
tgcaccagga tagtcactac 10080catttgtacg cagcagcaat tgaggtactg tagcacttgg
atgtgagcag acaggaaatg 10140gtcatatgga cccataattt ataggaattg caaacagccc
tgcttcatca gaatcagaat 10200caatggcagg aggaaagtat tgggtcctgg attaggtgat
gttttcagga ccatctttat 10260tgtgcttctt gcaaatggat cctacctcca ggaacagaag
ggttgtgttg tttcagcaac 10320tctgcctaat agtttatata agagaagtgt tacgatctag
aaagaacccc agtcagcctg 10380gaaggcagaa gacctgtgtt ctactttttg gctccaccat
tagggagggt ctcaatctct 10440aagtctatgt gaggagctgt tttgtgacct gcagcccctc
tatcaccagt gagagcttgc 10500aatcagaatt ttattcccag ttctcatctt ggggttttat
gttccggaca tattttgtaa 10560actctttatg tttcattctt cttacttata aggtgagggt
gagatcgctg acttgtgtca 10620tcaaagaaac ttggaatatg taagatggca gtaaaatgct
ttccaaaata aggaagggca 10680tttcaaattc ttcaaagtca ctgctgcata taatatgaaa
tgggttttgt ttgtttgttt 10740tgagatgggg gtctcgctgt gttacccagg ctagagagtg
cagtagtaca atcagggctc 10800actgcagcct tgaactcctg ggttcaagtg atcctcctac
tttagtctct tgagtagctg 10860ggaccacagg tgtgtgccat catgtccagc ttattttgta
tactttttgt agagatgggt 10920gtctccctat gttgcccagg ctggtctcga actcctggac
tcaagtgatc ctcctgcctc 10980agcctcccaa agtgttggga ctataggcat gagccaccat
gcccagcctg aaacataggt 11040ttctcaaata ttgactgctg gtcaatttat tgagaggcgt
tagaggacct gagtaattgc 11100caatgactaa cttcatgaag aatagcagtg aaactgtttt
tgtttcattt catgtggctt 11160attagttgtc ttgccaattg ttctgtaggc aagtttatca
ctgttggatg gggtaggaag 11220gagacacagt tccatggatc agaaggcaga caagcagctt
ttcagatgca aatggtaagt 11280ttggtttgat ggataaaaag ccttgactgg aacaaatgta
agtttgccac ccaccaggaa 11340ctctttggtg tccacttaga tgccagtaat gaacagttct
cttctgcttt agtaaaactg 11400cctagaacct tcaggaaatg aatccctcta gaaagatcct
ttttttcctt gttattgcca 11460agttgctttg tgatttattt tcatagtagc aaataattat
aaccaatatt catcacccag 11520tttaaaaaat aaaacatcac agacaaagga aaccccctgt
gtatcccgtc ccgatgtccc 11580tccccttcct ctccagagag agctgccatc cttcattcac
atgcatgttc tcatactttt 11640cccatatatg tgtatattag atatttttct ttttctgttg
gatgaaactc tttgttttcc 11700ttacttctgg attggaaaat tctgaagacc atataatgat
gtcttgatga ctcaaggcag 11760gactttttaa tcttctaatg taggcggggc ggcccctgaa
ggcagaggtg tgtggacaca 11820agaagagtgc agactcttgg ggcacctggg gaagtagtgt
ccgtgtcaca ttaaattcat 11880ttaaactctt atattttatt ttaatttata caatatgaat
attttttaaa actatgaatt 11940gaaaagtatt acccttgagt aaaattaatg ccccaagaag
atgtgccata tttaccctct 12000ggcacactac caagtacccc caggggcatt acagatctct
gttagaaaag tacagattac 12060attatcctca taacatttag aagctatgag accttggcag
ggaagtttcc taatgtttct 12120gagcctcagt attctctgta aagtggacaa cataatgtct
ccttacaagg gttgagatgg 12180gcaggtaata gcatatataa aacagctatc atagcatcag
cacagtgtag gcactcaaat 12240ggtagttgct gcttttgttt tagtagacaa ataatttttg
aaacttttta aagcgtagtt 12300tttatttcaa aacaacttta ttgtgagtaa aatatgcata
gtgggtctaa tttaacattc 12360tgaaagctat tgacttatta gaacagtaaa ggattattag
agggcagaaa catggagtaa 12420gtactctgag acacaacctt gcttctttgg gggtgatcca
ctacaactgc ccagctttgg 12480acaagtggtt ttcatgttcc cctgattttt aagtgatttt
tttttttttt ggcaggactt 12540aaaaggtatc cttgactaaa caggaacttg accaagtaaa
tagttggtgc aatttgaata 12600ttctttcttg ctataagcaa caagtaaatt atggtacagc
tttctaagac catatctttt 12660cgatttaaaa atagcacttt actcatacat gttatgacat
gggtaaacct cataaagatt 12720atgctaagtg aaagaagcca gtcataaaag atcacatata
atatgatccc atttgtatga 12780agtgcccaga aggggcaaat ccacagaggc agaaagtaga
gtagtggttg ggtagggctg 12840tggggtgggg tggggaaggg gtgactgcta atggatatgg
ggtttctttt ggggatgatg 12900aaaatgctca aaatttagat tatggtgatg gctattcaac
tttgtaaata tactttaaaa 12960acattgattc ttaccactga gtttaaacaa ccaaaaaaaa
atcccaaggt gcattgaatt 13020gtgtacttca aatgggtgaa ccttaataat atgtaaatta
tatcccagta aaggtgttaa 13080aaaatagtac tttaaaggaa tctatggtag ttttgaaaat
aaggcagttt tccatacttt 13140gttaaactct ggagaagatg acactttact actggtacct
gctagagtaa gacttatcta 13200gtattaacaa aattagggtt tattaatggt ataggatgat
ccaggtaatg ggggaaaaaa 13260accgagcatc ctgttatcta atgtactatc cagtaaacta
ctctagcttt ttttcatgaa 13320ctttttctaa aggctttcta gggcctcgtc ttggtttgaa
agttcacagc tacccttcag 13380aaaagaaaac aaaaatccat ggagtaggca gatacaagta
ctcatgtgag cataatttac 13440tttgattttt taagttgtgt tattctagcc ctcagcctgt
tccctgcctg ggctctccta 13500gtgcccagta acactgattc aagaggttgc atttagctgg
gcacagtggc tgatgcctgc 13560aatcccagca ctttgggagg ccaagttggg cagatcacct
gaggtcagga gttcaagacc 13620agcatgtcca acatggtgaa atcctatctc tactaaaaat
acaaaaatta gccaggcatg 13680gtggcagatg cctgtaatct cagctacttg agaagctaag
gtagtagaat cacttgtacc 13740tgggaggcag aggttgcggt gagccaagat tgtgccactg
cactccagcc tgggccataa 13800agcaagactc cgtctcaaaa aaaaaaaaaa aaaaattggg
tgagagggag gaattgagga 13860ggataccaag ggttgggcct gaacaaatgg aagcataatt
atatgtagaa atttctatga 13920gctactcttc tagaatagat gactcaataa taccctgctt
gccatctacg ttttctgtcc 13980ttaattattt ccagttctat ttcatataat gcctatttca
ggccttaacc cttcagtaaa 14040ggaggtttgg tttctatacc ctaggacagt ttcattgaga
ataaattttg ttaggctacc 14100tatgtattcc ctactgtgca gactacagta cagtactagc
agaattctta ggctgttact 14160agaatatgat gatgaatgcc cgggtggtca tctgtctccc
acccggtaga gttggcttca 14220ggattgagat acacgtggcc ctggaggaga cgtttcttcc
cgtcatgctg cagaatgaga 14280acatttccat gttttcgtca ttgtctgctg ctgcctttac
cacctctgtg gctcctccct 14340attcaccttg ttcacatctt aactcatctg tgccctgttg
tgaagcttac acaatatgta 14400aacaaaactc taccctgttg gacaaatgga acacttgttt
ccttgttgta gttacctgat 14460aggttcctta gctcattata ttcaggatct agatctgtag
ctcttttcct cttttgctgt 14520tctcagaggc cacttttttt ttttttaatg ccgaaaggag
gattttgttt gttttacatt 14580tttttcttct ttttgatgat ttctgcgttc taagaaccaa
cccttggatg gtttctgatt 14640ctagaggcag gctttcaaag tagcttaaac ctcttaaaaa
acatctgtat ctagtggtct 14700gaggcttgtt tgattctggg atacttaagg tcccccagta
atattggtgt ttgttcccct 14760ttttagcatg agtctgcttt gccctgggat gaccatagac
cacaagttac ctggcggggg 14820gatggacagt tttttgctgt gagtgttgtt tgcccagaaa
caggtatgga aatatattgc 14880agttaaacaa caataaaaaa tttttatctt attaaaatta
aggaaaattt tctttctttt 14940gctttgagta gggtattaat tatacatatg aggcaaggat
gtgctgcttt aaatgtgaaa 15000tgaggttaga gttaagaatt agaagagtcc tttgaggcca
tttggtccat cctcctacct 15060ggtggacaca aatttgtaac aaaattaatc taattggcta
tgtaaaacca tggcagtttt 15120tatttgtaag gaaggtgttt gaatagttct gaattgacaa
cttttatcat aatgttttaa 15180gtgtgtatgt gtgtttgact ccactcccgc acaggggctc
ggaaggtcag agtgtggaac 15240cgagagtttg ctttgcagtc aaccagtgag cctgtggcag
gactgggacc agccctggct 15300tggaagtgag tgggagaaga aaccttagag aaattcttgg
aaccagagta gaggtggtgg 15360tacacatgga tacagatgat acagatgttt gtgtaacaca
aaaggatttt tacgtttctt 15420catttggtta taaggctgta tctatctttg tttcttcttt
tttttttttc ttattccctg 15480aagtctgaat tcaactcgaa tagtagattt tacgcttctt
cacagatttc attgttccaa 15540ggccgcatat attttgcatt cctaactctt aaaaggctgt
ggttttaagg cagggtatat 15600atgaagccat tgtacagagc agaaaatggt gtttagaagg
gaaggcccag tttgcaaggc 15660tctgtggggc aaatggtgct tttgtggaaa ttagggaaag
agcctccttc cttggcacaa 15720aattcctaca gcagaggatc tgcttgccaa ggagcatgca
ggctggattc agaccctgct 15780ctttccttcc attctcctcc ttggcccagt acccttgtgc
aggttacaat ttgcctgtca 15840tatgtggctg cctgatttta gatagaagat gtatctcctc
tgtttcggtg atatctgttg 15900tatgtagacc tcttgtttcc caccagtatc tgaatggtat
tatatgatag agcagaagag 15960aaatgtattt gaattaaaac cctagagaca aatatgaata
agatgaggca attaagatgt 16020tttcaacatt tggtgaagtc ttaaaaaaga cctactggag
catagaatat ttgctgaagt 16080tgtataatgg aaggagaaat agattttgat ttttaggaca
ttatacctgg aatggtttag 16140ataacttatt atttttaaag tcatccaaat gcaatgtaaa
tatgtaaggt tttgtgggca 16200aatggagcct ctgtgtaaaa caggaaaagg cactctttcc
tctgggcaag tacagtccca 16260cagtgggatg aaccgctcgc cgagagacaa gggacacatg
ggatttaaaa cttccttgga 16320taaagatatt cattaattcg ttcattcatt cattcatgtt
tgctggaaaa aaaactcttc 16380tggattttat ctattcttta gttaggtgag ctttcgatat
tgtaacactc tgagtttgct 16440ttaagaccct caggcagttt gattgcatct acacaagata
aacccaacca gcaggatatt 16500gtgttttttg agaaaaatgg actccttcat ggacacttta
cacttccctt ccttaaagat 16560gaggttaagg taagtgcctg agtttgtttc accctcgaat
gtagaggact ttccatagct 16620atagagggaa tttttttttt ttttttttga gatggagttt
cattcttgtt gcccaggttg 16680gagtgcgata gtgcaatctc ggttcactgc aacctccgcc
tcctaggttc aagtgattct 16740cctgcctcag cctcccgagt agctgggatt acaggcttgc
gccaccacag ccagctaatt 16800ttgtattttt agtagagacg gggtttctcc gtgttggtca
ggctggtctc aaacccctga 16860cctcaggtga tccacccgcc tctgcctccc aaagtgctgg
gattacaggc gtgagccacc 16920acgcctggcc tatagagggg atttatattt gatatggata
tataaatagt agctttagag 16980taaatagtaa taaaaatggt ggcttcctag aactgatttt
tatttaataa aatattgttt 17040ttccagtgat tttgcaaata atagcatttg tcccccacct
tagataaaac agaagtagga 17100aataaaaatg ctagttttta ttgtttattt tgacaaaagc
ataatttttc cagtaatgaa 17160gatgtttttc atttataaca tttaaatctt aagtggtttg
tataccatta agattcttgc 17220tgaagtgaga acacatcaaa tggtatctct gtgtaaaatt
ttaaacatcc taagttgaga 17280gacgagttta atgaactccc atgtaactat tactcacttt
cagtagatac caacattttg 17340caaaactatt ttcatcggtc cgcaactctt tggcctatac
atatatatac ttacatatat 17400ttttatttcc tggagtttta attctagaaa tcatattttc
aatatttatt tataacagtt 17460aaggacattt ttctttacat aaccataatt ctattattac
atcttatctc tgtgttgtct 17520aacacccagt ccatattcca gtttctctga ttgtctaaaa
atgtcacctt gtatttggtt 17580aagtttctta agtctctttt aatctttaag cataatgtat
ttcttttttt taagtcctct 17640acataataat gacatatttt acagatttgt ttaatgcctc
tgtaggttag tgatttacag 17700ctagggatga gctcaggtag tgggattatt tgatttgaga
gaggaaatac agctattata 17760aagatttgga agtaaatcca taactgaaag ccaatgacag
atcttttttc ccttctaggt 17820aaatgacttg ctctggaatg cagattcctc tgtgcttgca
gtctggctgg aagaccttca 17880gagagaagaa agctccattc cgaaaacctg tggtaagaca
gctgtagtac cccagccttc 17940tgccccataa aacgtagttg aaagtagaca ggtatgggat
ttccttcatc ccttctactt 18000agtcccttag tagaatcaaa gatgctgaag tgggtaggtg
gaaatggggg tggttaggtt 18060ttgattgatt gtggatttca gtcatgtatt ggttggggtt
ctctagagaa acaaataata 18120catatatata attcgtccct cagtattctc gggggattag
ttctaggatt gcccatggac 18180gccaaaatcc acacatggtc aagtcctgca gtcaaccctg
cagaacactc agatatgaaa 18240agtcagcctt ttgtatactt gggttttgca ttcctcaagt
accatatttt tgatgtgcgt 18300ttggttgcgg gtatagaatc cacaatatga agggccgact
gtattcattg aaaaaaatac 18360gaatataaat ggacctgtgt agttcaagcc tgtgttgttc
aagggtcagc tgtacttaca 18420tagagagacg gtgagagagg gaatagggtg gggcgggagg
gagagagagt aatagagtgt 18480ggatagattt actttaaaag attagctaat gtaggggatg
gcaagtttga aatttgtggg 18540ggcaggttgg caggctggaa attcaggtaa gaattgatgt
tgctgtcttg agtatgaaat 18600ctgtagggca ggctggaaac ttagggagga tttctgttac
agccttaagg cagaatttct 18660tcttttctgc gaagcctcag tttttgcttt taaggtcttc
agctgaatga atgggacctt 18720cccacattat ggggaataat ctgctttcct tatagtcagc
cgattataaa tattaatcac 18780atctacagaa taccttcaca gcaacatctg gagtttagca
gatagctggg tgccatagcc 18840tagccaactt gacacaataa aattaactgt tgtaagtcat
cacgtgcttt ccctagtgca 18900tggtattacc acagaaaaaa cactaaccaa aggaattctg
tggacgtgaa agaagattta 18960gattaagcgt aaaagtaaga atatttttat agcttttaaa
atgtataagt gtgtggtttt 19020aagtattaaa taatacttga aaatgttaga aaataagatg
agaaaaaaat ctcatagttc 19080taccacttcg taataatcac tattcaaatt ttcttgtctt
ctaggttttt catgtatata 19140tctcagtata gctatcatct tgtttttgtt aaaagtgtag
taggtatggg ccaggtgcgg 19200tggctcatgc actttggggg cccagcactt tgggaggccg
aggcgggcgg atcacgaggt 19260caggagatcg agaccgtcct ggctaacacg gtgtaacccc
atctctacta aaaatacaaa 19320aaattagctg ggcgtggtgg caggcgcctg tagtcccagc
tactcaggag gctgaggcag 19380gagaatggtg tgaacctgga ggaggcggag cttgcagtga
atggagatcg tgccactgca 19440ctccagcctt ggcgacagag tgagactgtc tcaaaacaaa
acaaaaaaaa gtgtaggtgt 19500gatacatctg catcatttta aattgctgta taatactcgt
ttattctcgt tcattaaatc 19560tcatgctgtt agacatttac agttttgtca tttctcatta
ttgtaaacag caatgcatgg 19620tacatttttg ttcataaatc tttttacttg attattttct
aagtagcttt caaactcttt 19680aatcagtaga accccccccc tttttttttt tttttggaga
cggagtctct ctctttcccc 19740caggctggag tgcagtggcc cgatctcggt cactgcaagc
tctgcctccc gggttcactc 19800cattttcctg cctcagcttc ccgagtagct gggtctacag
gcgcccgcca ccaagcctgg 19860ctaatttttt gtatttttgg tagaggcagg gtttcaccgc
gttagccagg atggtctcga 19920tctccatctc gtgatctgcc cgtctcggcc tcccaaagtg
ctgggattac aggcgtgagc 19980caccgtgccc ggcctcagta gaaccctttt aactgcaatg
ttaagaaact cattattcat 20040tcaacacaat agttcttaac cctggccaca cctttagaaa
aaaaatgata ttcaggcttc 20100atctaagagt tcagttcagt gtgttggaat ggagattata
cgtaagtatt taattaaaaa 20160ccaaaagccc ccaagtgatt ttaaacagcc gcagttgaga
accaccgatt aaccagtgtg 20220tcaagggatg gcactgtgat atgctgagca taaaaatatt
gcacaggatg aaaccctgtc 20280tctactaaaa atgcaaaaat tagtccggcg tggtggtgcg
cgcctgtagt cctagctact 20340cgggaggctg agacaaggga atcgcttgaa ctgggaggca
gaggttgccg tgagccgaga 20400ttgagccact gcactccagc atgggtgaca gagtgagact
ccatctcaaa aacatgtata 20460tatatatata cacacacaca cacattgcac aagaacagcc
acaacatctg tgctcacaga 20520acatcagcat gtggtctaac ttcaaagtgt tgtaataatg
cggtttgaga ctaggttatg 20580tttgctgtga tcactaagtt aagcattagt gagcaaggag
attgagaaaa tccttaatat 20640aaataatatt tcttaatata actataattc ctaatataac
taaggtctta atttatatgt 20700catctgttta gtaaaggttg gttttggcat gattaagtct
tgcttgctta atagatgttg 20760gaaggataat ttcatgctta tcttctttgg acagctgaat
caggattaat acccagatag 20820ccttgaacat aagtgcttgc aaagcacctg aaagaaaata
agcatcttaa gcccaataca 20880acacaatgat gctagtctag atcttggatt aagtgtttta
atacttttac tctaattgcc 20940aagttatctt cttcctaaat cttcatgaga aaacccacta
aaagaatgct ttttcctggt 21000agccttccat tgtgatcata aagtttggaa gtaaagttga
aaataaacat gtgggccagg 21060cacggtggct caggcctgta atctcagcac tttgggaggc
cgaggcaggc ggatcacaag 21120gtcaggagat caagaccatc ctggctaaca cggtgaaacc
atgtttctac taaaaataca 21180aaaaaaaaaa attagccggg tgtggtggtg ggcgcctgta
gtcctagcta ctcgagaggc 21240tgaggcagga gaatggcatg aacccgggag atggagcttg
cagtgagccg agattgcgcc 21300actgcactcc agcctggccg gcagagcgag actctgtgtc
aataaaaaaa aaaaaaaaac 21360gaaaataaac atatgaataa aagttaaaaa tagaaaaaaa
acaagaaaat aaacatatat 21420ttctgacctt attgattctt gatattttat ctgcatggaa
agctattttt tggcagttat 21480tattgttctt attttagaga cgaggctgag caggaagggt
cctttgaaaa agaaaagatt 21540gcccttgaac ccctctggca agtgggatga agtctgcttc
ccagcctcta acggccttct 21600tttcattttc ccttgcagtt cagctctgga ctgttggaaa
ctatcactgg tatctcaagc 21660aaagtttatc cttcagcacc tgtgggaaga gcaagattgt
gtctctgatg tgggaccctg 21720tgaccccata ccggctgcat gttctctgtc agggctggca
ttacctcgcc tatgattggc 21780actggacgac tgaccggagc gtgggagata attcaagtga
cttgtccaat gtggctgtca 21840ttgatggaag taagctcctg ggaagtgtgt ccatgagcct
gcaaggggtc ctgagcctag 21900ggcctgcaga tgtggtggtt tgactggaac agtggggaat
ctttatttgt tttggctgtt 21960tgggttactt gtttttttat tgaatgggat ataaggtggg
gtatgttctc tcctgagaac 22020cattgtcccc cctcccccac cagtttcctg ttatactgca
tctgtggcct tcacacgttt 22080acttgcctgg cctttgaaga cactgaaaac tttgactcta
ggtagagagg atgacaacag 22140tacagtcttg tgggattggg tgtgttagct ttatctgttt
gccctgacac agatttataa 22200ttgaccctta taccacccca cttgtgttgc tttgtttcct
gatacaaatg cttgctgata 22260tatacctctc cagtatgttc agttcatgca taaacgtttg
cctaatatga agattaggtt 22320tatattttat aatgaggtag aaggtttttt tagggggtgg
ggtgggaagg gcaagactga 22380agagtgaagt agtcacctta atgaatagtt tcattgctga
tatgaaaggg agcactggct 22440tctaagattg taatgtgagg tggatattaa ttcatattct
gtgtaatatt ctacataata 22500ctgattttat agtcatgtat tctatataga gaacttaatc
agatctgcgt tattaccaaa 22560tccacacata ggaaagtgct ttaaggattt tgaaagtatt
aattcccttg gtttagtgtg 22620gcttggttgc aggcccaggt ttaaagctag aggtctgacc
tcttggcctt tttgccttag 22680tccctggcac ctgaaactcc aggtactgag atggactccc
ctaggcctag aggtgacaat 22740agccaattat ggacagaacc catgacattt ccccatccca
cactgttttt agacttgttc 22800ctgagaaaaa cattgaaagt tatttttttg tgaattgcca
ttattgttta gatatactgt 22860gatgttcaga tggcttatct tacaaattga atatccctag
gtctaatcct cttctttctt 22920tttcactgca gacagggtgt tggtgacagt cttccggcag
actgtggttc cgcctcccat 22980gtgcacctac caactgctgt tcccacaccc tgtgaatcaa
gtcacattct tagcacaccc 23040tcaaaagagt aatgaccttg ctgttctaga tgccagtaac
cagatttctg tttataaatg 23100tggtatgtta taaaactttt gccaagatgt tctgaatcaa
gtcccttcta ctcctacata 23160aaagcaaatt atagtttggt gttgccatag gtctagtgtt
tctcaaaatt tttaagtctg 23220cagttgatat cattatcatt atgatattta attgccttgg
gtttttgttt tttttttttt 23280taatcctata ctggtttgta cgagccattc cttttccctt
actgacttga agagtcagtt 23340atttaagaat aacattggac tctggaaata acatagtatg
ttatacattg ttaacatgtt 23400ttactctttt catagccttt acacatattt tcagttgatc
tcatccctcc taggagctgt 23460gtcagagatg gggttttcct cttttgtaga tgagggaaca
cagtgtcaga ggttttgtaa 23520tttgtttgaa caagaatgga caaggacctc aacacaggtg
ttctagctcc taatccactt 23580gtcctgccac agccccattg ctgtcagttc ttcattactt
tcctgatgtg ctggagaatc 23640tgaaatttgt ttttacttgt gagttctgtg gttatgtcat
aaattctgct ggcatatggc 23700agtgttagcc ttgttttcaa atatcttttg aattctcaga
aaaagcctag atagttgcca 23760agagagaata atcaaaatta attaatttaa atgggaagtc
cttactttca tatcagcttt 23820tctgttaagt cagcagccca ctgtgtacat ggatcctatc
tggatgtatc accagtttct 23880ctgattatag tttcagtgtg taaaatgctg ttacagtcct
ccttaaactt ttcaaaatag 23940ctttaaaaaa aagtgcaaat atgttcattg tcaaggcaaa
aagaatcaga tgtaagcttt 24000tgtgggactt aactgtatga tgctaatgag tttatatgtc
actttatgat gtatggtatg 24060ttttgttctg cattcactta aaaaatagct ttatatcatt
catctattta aagtgtacaa 24120ttcaatggtt tatatgtgtg tgtatgaata tatatacata
tgtatatgta tatatatgta 24180tattcacaga gttgtacagc catcaccacg atcaatttta
ggacgttttt atctcctcag 24240aatgaaaccc tgtaccaccc tgcattcatt ttacttgaga
gaaaactccc tgtgatgaga 24300taggacaggt tgagagctcc acttttgaaa gattgttcgg
catcaatatg tggggttggc 24360cataggtcag gggcacctgg aggcagagat tctagttagg
agaagctgtt gtcaagtgtc 24420caggcaggag ctagcaagag cttgagccag agcagtgttc
atagaaatgg aaagaagaga 24480aagatcataa caaatccatg aagtaaaaac cctgagaagt
taaagaaccc actggggaga 24540gtttggatat aagagaatct ggaaaaagag atcttggact
ggaacaggtc agggctccgt 24600gcccaagtgg aagggaaatt aagaacttgg agtcaagtgg
tagacatttg agtggtgtgg 24660agacaagttc gttgccaaag ttttcaaaga tggtgtttga
tgcatcctga gtatcactcc 24720tttttccccc tcattgcttc ttgattgttt attatatgcc
aggctttttt ctagtacttg 24780gcttgttgta ctagaaaact agttgtactt tgtctacaac
ttgttgttct aggtgtagac 24840aaaagatatc aattaaatat gatctatcag atggcaagtg
ctgtggagaa aaattaagca 24900aaataagggg tagggagagc ttaaggataa gggtttacag
ggggaaggtg tctttcctat 24960ttagtgtgat cccaaaggcc tctctgtgaa ggtgacattg
aagcagagac ctggtgagaa 25020tcacagtggg agccacgcag acatctgggg taagagcgtc
ccaagcattc tatgcttgaa 25080ggcaaagaag aaaaaagaaa gagcgttcca agcagagtaa
aaagcaacca ccgaagtgcc 25140tgttgtgttt aggaaatagc caggaggcca gggtggctgc
agcagagcaa aggaggggaa 25200ggtggtgggt gagttcagag tggtgatggg aatctgctct
tgtagggcct tgcggctttt 25260actccgagtg agataggagc caccagaggg cttagaacag
aggagtgcag tgttctggct 25320gaatttttta aaggcttgca ttggctgctg tgcagtgaat
aaactggatg aagaatagaa 25380agaaaatgtc ttttaagcag gtgcttagga ctttggagaa
tttgaggata ttgagaggtg 25440gttgaagaca gtggaggaaa ttgtccacag cactgggctg
agagggtagc cccttcacct 25500ggtcttgctg agatgtggcc tttgtcaggg aagattatga
ctgatgtgtt cttaagagga 25560aagcagagat tttaaggagg ttgagatgtg attattttct
agattgctgt ttgccttcta 25620gaactcatta attgcagaca ccatcccctt agtattaggt
gaaatcttat aatttacgat 25680gataatattt gcatttttgt tttccaggtg attgtccaag
tgctgaccct acagtgaaac 25740tgggagctgt gggtggaagt ggatttaaag tttgccttag
aactcctcat ttggaaaaga 25800gatacaagta ggttcttaat tatcttgggc ttctgggaac
agaatcagcc agcatgcagt 25860cctaaattca gccatctgat aacagttcta tgcctgttgc
tgagtggaac aagaaataaa 25920gacaacaccc aggccctgac tttcggatct gattggagaa
gccagtcatg tagtttgtct 25980gaatgccata taatttgata ggtagcagga gagcatgagt
tgtaagccag cctaggacct 26040actcccaata gcgcttggtt ctccaggaaa aatcatgtgg
gaaagatgga gatgacaatg 26100ataaggcgga gctgcattct cttacataaa tggggatgta
tgggttgtta acatggatga 26160cctaatgcag cctctgtctt tgctccatcc cagaatctag
aacttctggg tgctgtgctt 26220tgaggctcct gggatggaaa tcagaatgca ttcttccatt
gaaacagtat tgtaaacaat 26280tggatgttat tgaatacctc aggtacacta taggcatttg
caaaatgacc tagaaaccaa 26340attataatgc cacatctgtg agagaacttt tttaaaaagt
accacttatt gagtacttac 26400agattaaaaa aacaaagtgt agaggttagg taacttaccc
aaggtcatgg acctggtaac 26460tagagaattt agggtttgat tctattctgt ttgataagtc
catgttcttc attactaaac 26520tactctgcct ccagggaaca tttattgtta gattaataga
aataattaac tgagtacaac 26580aaatagcaga atttaataaa taatgtttct taaatatatg
tgatatattt aataaataca 26640gcagaagtgt tcaacctctg tatgattttg aggctgcctg
tataatgctt agtagttttt 26700aaagagcatt tacatgcatt atttcacttc atagacttga
aaccactaga gtagagatag 26760aggacaaatt agaaagtatg aggcagttta gaatatagtt
tcatttaaaa aaaattgatg 26820gggataatgc caattcgtct gagatttcac agaagacatg
agtactcatc gtgatcttgg 26880ggaagggata ggtttggggt tggcaaagaa ttgggaacat
tgggtctggt ggggaagaaa 26940gtgtcagtga aaaccagagg tgggactgat cctccatggg
atactctatg tgaatgcaat 27000ggagagcctg agtccgggga gagatgtttg aggaggaaga
tcaggctagt gaccaacttc 27060ttcagtggga gctgcggatt tgccacctga tctaaaaggc
aggaagtagc cattgtcggt 27120tcctacgtga ggtgacaaga acagtgcgct ggtcaggtgt
ataaatgcta ccaaagaatg 27180cattagagac atggagacca tctctcaagc tagtcagtca
gtttaatgtg aggtgcttag 27240gaaaggaccc attctactgc aagtgacata cctgccagag
cctggtttga atgctggtaa 27300gtcatggcag tggaaaagct ctggggttca ttagtgtagg
gactagggct ggtaattttc 27360ttgtgtagtc agtttcctca agtgttctct tcaaatttaa
agatttcagg gtatgagaaa 27420tttagggaaa atataaaaac gtattcttaa gccagacaaa
gattaatttt agattttgta 27480gtatttggta gtatctcagg ttttgtccct ccaaataatt
aggagtggac tgtatacaag 27540atgcttcagt cttccttcat ccaggaacgt ctcagtggtt
tttaagtttt attcatgtct 27600tggatattct tcaatattta caatagaatc cagtttgaga
ataatgaaga tcaagatgta 27660aacccgctga aactaggcct tctcacttgg attgaagaag
acgtcttcct ggctgtaagc 27720cacagtgagt tcagcccccg gtctgtcatt caccatttga
ctgcagcttc ttctgagatg 27780gatgaagagc atggacagct caatgtcagg tattgcagtt
tttccctgta ctccacatgt 27840taagcaaatg gagttaggtt tttgtctttt atgagcatac
aacttttgac ttctattgat 27900caaggttgag gagcagtagc tttcttgtta gacacactta
acaagaaggt taagtctagt 27960tatgagccat gtcaaaataa cagaccaaaa atatatcaaa
aagtggtgaa aaataggata 28020aatattagta gatgaagcaa ctttttaaag atatgttaaa
tattttaatt tagcatctac 28080ccacattttt ccagcgtgat tgttatatgt tataattgat
tttaataact gtcaagcata 28140attagagtgg ctaattctca tgggctaatg tgatgggaag
aaattttgta taaatgcagt 28200catgcgcata tatgtgtgtg tgtgtgtgtg tgtgtgtgtg
tatacatacc ttttctatgt 28260ttagatacac aaatacttga catggtatta caattgcctg
tagtattctg taaagtaaca 28320tgctgtccag gtttgtagcc tggtagcaat aggccatacc
ccataggcta ggggtgtagt 28380aggctacacc acctaggttt gtgtaagtac tctatgatgt
ttgcacaatg atgaaatcac 28440ctaacaacac atttctcaga cgtatcccca tcgttaaatg
atgcataatt gcacatatat 28500gctttgtttt gatgtggtga cttcaaaatg cttcttccag
cctcctcttc tatatatcct 28560attttgtacc tgactacatt taccattaga aagtctctat
tcttctttgc tgaaatttca 28620ctgttctctg ggcctgagtt ttgttttgat tcctgactat
atcttcatta tgtaacaggt 28680ttcagttaat gaatgctctt ctgtgtaatg taagccctgt
tgtatagttg atagcatttt 28740ctagccagtt cccagaactc cttgtttcca gtgtcaatac
ttggcacctt tgtccactga 28800cactaatccc cagattaatt tgtaattaaa gccctactgg
tgagatttct gagaaacgtt 28860gttgcaaaat taggaacctt tcctttatat atatacatta
cataaattta tagacataaa 28920acattttaat gcagtcattt gctgctactc tttgactcat
agtctttcgt gatattttga 28980aaaagccttt tgttaacatg tctaaatgca gaatatgttc
tagaaatatg tagcacttaa 29040agtaagccat tagattacct tttgaaaagc ggagcaattt
actaagtttc tacttcttca 29100gatttgaaat tcttcatcat tagcttgtag aggcaaaagc
ttgatgcagt catctcattt 29160gctgtaaagg aaatgagaag tcatttacag tatatttcta
ctgctttgac ttttatttct 29220caaaaagact gttttgttca tataaaatat taatgctttt
gaggactaca aagtccctcg 29280atttagttta catttacttt agcttatact ttgtaaaaaa
tactcttcta aatgctttgt 29340ctgttttagc ttacttattt ctcataatac ctctgtaaag
tatatgccat ttgcaccatc 29400attttacaga tgagacaact aagacatgga gcagttaggt
aacttgcctg agatcatgca 29460ggtggagcca ggatcaaatc ccagcgagtc tagctccaga
gtttgttctc ttcttgacag 29520ataatttatc ctcacaaaat ttgaagcatt tgtagaggaa
ttccctattg ttataatgtt 29580tagttttttt gtagattggt taaaaacttt gaattaaatg
ttagcattaa catcatttgc 29640ttttatcact acttctttgt ctcttttttc tttttttaat
cactacctct tcctcctctt 29700ttgagaaatt ctgcttccgt ggctatggtc caagctactt
gagaaggtga ggtgggagga 29760tcacttgagc ctaggaggtt gagattgcgg tgagctgtga
ttgtgtcaac tgcatttcaa 29820cctgggcaac agagcaagac actgtccaaa aaaaaaaaaa
aaaatagtga aattttactt 29880cgctccattg actcagggaa aaaatgtaat ggtgataaca
aattcccttc atctcattag 29940tgaaaatcca caattttcca tcaatcgata tgatagtgat
agagatattg agtgtgctca 30000ttttcctaca gaccagctgc tttaactatt ttaagcagac
agaaatgata ttggtaccat 30060ccatgtctaa tgaaggcaat actttgtaat aagttgcagt
aagttgtggc cagaagagga 30120atgatgactt cacagtgtaa acaactacct tattgggttt
gtggaaaatg gtgtcatgta 30180gcagatgtgg ctttatctgg gctttggttt ggagtagttt
tatctattca tctaaccgtc 30240tgtctctaag tgtataagtg tgtgtgtgtg tgtgtgtata
gtattgggtg tgtatatatg 30300tattttgtct acattgtatt gaagtaggta gtgcagcatc
aaaaggaaat tgttgatttt 30360caaaatcagt gaaatgtcac tatttttgag aaaaatggtc
tgtttacact cccttctcct 30420tttttttgtc agttcatctg cagcggtgga tggggtcata
atcagtctat gttgcaattc 30480caagaccaag tcagtagtat tacagctggc tgatggccag
atatttaagt acctttgggg 30540tgagtatcaa ggtgttagga aagcatgtta tgacttacat
agatgcttag ttcttaagaa 30600catgtacttg tatcttgtca gttcaatatt gattgtcagg
tcttttaact accctggaaa 30660accctaagct ttagagtgga attggcaagt gtattctact
cctgtttcct cttttaatga 30720actaacgtac tcttaaaaaa gtgattgatg actatcgcag
ggacaaaaaa cgaaacaccg 30780catgttctca ctcataggtg ggaactgaac agtgagaaca
cttggacaca ggaaggagaa 30840catcacacac ttgggcctgt cgtggggtgg gggagggtgg
agggatagca ttaggagata 30900tacctaatgt aaatgacaag ttaatgggtg cagcacacca
acatggcaca tgtatacata 30960tgtaacaaac ctgcacattg tgcacatgta ccctagaact
taaagtataa taaaaatata 31020tatataaata aataatgcca gcattagaga aaaaaagtga
ttgaaattgc atgttaagtg 31080ttttagcaaa tgttgatgtt gatggttttt tgcaaagagc
gcatcagcta tttgtgaact 31140agatctgtga atcttgcaga gtcaccttct ctggctatta
aaccatggaa gaactctggt 31200ggatttcctg ttcggtttcc ttatccatgc acccagaccg
aattggccat gattggagaa 31260gaggtaggtg aacacggagc aggaaattta cttaaagtag
ttacccaggg actgatggca 31320ttaagtagaa agagcgtggg ctttggaggt ggacttgggt
ctccactaaa tgcctagaca 31380atagtgggaa atgatctcac tttcataagc cacaccttat
tcatctataa aatgggaaaa 31440tcagtatctg tctatcaggg ttcagaagac taaatgagat
aatatatgtg attagcaacc 31500ttttatccct agttgtacaa atcattcaaa gttaatttta
tttagtaggg gaaacagaaa 31560tgtgatcttg agaatagttt tagtagattt ttattcaaca
catactagaa tgcctataat 31620tgtggtggat ggtagaatgc agtggctgga aaacaaaacc
gcttgactaa ttcctgctct 31680tctggaactt gtgatctatt aatttcaatg taatgattcc
ctttgttggg agtgtgatgg 31740aaatggacag agtatactgg tagagaatac tgagatgttt
gaggggtaat ttgaggatgg 31800tggctatgag aatgggagtc ctgcatctgg tggtccagga
aggcctctcg gaggcagtga 31860tgtgtgtgct gagatgtgaa gaaaaagaag gctctgtctc
caggcagaag gaacaacaaa 31920ctccttgagc ttagcaagag ctcatcttat tcaagggact
ggatggaagt attgtggctg 31980gagctcagtg acagtcatag gagggaattt gggttcttta
attgaacaaa gattagaaac 32040ttcttgtgat ttttaataac agagtaatgt gttctgcttc
atggtttgga cagtgattct 32100ggctgcccag aagagacttg attggagagt gacgagactg
gaatatggga tcaacaccgg 32160ttgagtggag ttagtgaggg gaaaaaggag atgggtttga
gatatgtgta ggagatggag 32220atgtcagggc tcactgatgg attggatggc ttcacattcc
gttttgcact ggaccagcca 32280cgtcttaggt atctatcttt agtcctgatt acaggaactt
aggtgtgaaa tcatagggtg 32340gtagaactat gtgatagaaa aggtaggttt aactgatttg
agatagaatt gcttgtgatt 32400tcagttttat ttctttgcag gaatgtgtcc ttggtctgac
tgacaggtgt cgctttttca 32460tcaatgacat tgaggtatca aggcttggtt tggtgttgga
tccttttcac agtgttagct 32520ccgagtaatc tagctagctt tcacccatgc ctctctggcc
ttctcttgca ggttgcgtca 32580aatatcacgt catttgcagt atatgatgag tttttattgt
tgacaaccca ttcccatacc 32640tgccagtgtt tttgcctgag ggatgcttca tttaaaagta
agttttcaat gtataaaaca 32700gaaatggtcc cttctccaat gtcttttgga gtcttgatga
ctttttgaat tcttcattta 32760ttttggcttt ttatcaagga gtcctaggct ggagaaaatc
tttagagtta ttttacttag 32820accctaatct caacataata tctcagttaa atcattctgc
actttagtaa agacatccaa 32880ggaagggagt tccttcctta agcagcacat tctaaagtta
aaaacttttc aggaaatttt 32940attatgtaac tgatctaata ttttatttgg aattactatg
tagatcccca atgttttacc 33000ttctgtgtag tcttttccca ctgtgcccac cctccactgt
acatctgcgc tccatctagt 33060ggtttgtagg atattggctg cattttgtct tctgttccat
gccctatcta tctctgtgtg 33120tgtggcgtgt atgtgtgtgt ggcgtgtatg tgtgtgtggc
gtgtatgtgt gtgtggcgtg 33180tatgtgtgtg tggcgtgtat gtgtgtgtgg cgtgtatgtg
tgtgtggcgt gtatgtgtgt 33240gtggcgtgta tgtgtgtgtg tgttccttat tctaaaaagc
caacttattt tctttgcttc 33300caacttggaa atagggaatc tttctttcat tgatatgatt
atagtacact gataatgcta 33360agaaatagag aagttgcccc aattcttaac tgtgtttctc
cacatcattt gagaagctgt 33420gtatgtgaat gtgcatgagg gctctgtaag agagagggca
agttccaggg atgagcgtgt 33480tcatcagcag ggctgatagt cttgaggttc agtgggagag
ctaaggcaca tggttgttat 33540ttgttctctt ctatttcaca taatgtgtgc ggtttcaatt
gcagttaatg gagagtggct 33600tgttgtgata attaaggctt attagttaat ggtgtgttta
gcattacagg ccggcctgag 33660cagcaatcat gtgtcccatg gggaagttct gcggaaagtg
gagaggggtt cacggattgt 33720cactgttgtg ccccaggaca caaagcttgt attacaggta
agctggtttt tcagacaaga 33780tagatagtct gattgtcatt cagccaagta ccaagcataa
ttcttgcagg ttgtatttta 33840ggctttctta ttctttgtat cgtttattgt aaacctttcc
ttgatagttt tctgttagct 33900ttattcaaag gagtgttgat acaggctgtg accataaggc
tcaaagcgaa acttttcttg 33960aaagtcaaga taaatataga gaacaacaag attctgctaa
aagtgtgctg attttagaga 34020gttgtggtaa ttctctgtga agagttaggt aaaatggtgt
atcctggcta tttaaatgtt 34080ttctacttaa ttaaaaatgt tactgcttta atttatttaa
gatgccaagg ggaaacttag 34140aagttgttca tcatcgagcc ctggttttag ctcagattcg
gaagtggttg gacaagtaag 34200tgccattgta ctgtttgcga ctagttagct tgtgatttat
gtgtgaagac aataagtatt 34260ttattacaat ttcgagaact taaaattatg aaaagccctc
attacctata tcatcaatca 34320gattcttaga ggctcttttt ttttttttta acttttttac
tttaatgcag tattttgtag 34380tggagattcc tagcagaaag aatcgtgaca ctcatcatat
aaaggagggc ttctcttaac 34440ctgagggaac acatgtgggt tttaggtggc ctgtgaaccc
agggagattg tacacaccaa 34500accttgtctt tgtgtattta ttcaagtaga aagcccacag
ctttcaatag atttacagcg 34560gggcctatga cccagaaaag cctgagctac tcttgtgaag
gaaatgactg attttctgaa 34620cctatttgga ggaaactttg tattggaaag atctatacta
atgttttgtt taaaaagtag 34680acctgaattc catgatgatt ttctttgttt tttttttgag
acagagtctt gctctgtcac 34740ccaggctgga gtacagtggc gcaatctcgg cttactgcaa
cctctgcctt ctgggttcaa 34800gcaatcctcc cacttcagcc tcccgcatag ctaggattac
aggtgtgcac cacgcctggc 34860taattttttt ttttgtattt tcagtagaga cagggtttca
ccatgttggc caggctggtc 34920tcaaactcct gacctcaagt gttctgccca cctcggcctc
ccaaagtgct aggattacag 34980gtgtgaacca ccgtgcccgg gcttctgtaa tgattttctg
ttgtatgtat gtgaagatgt 35040agttctcaga cagtcatgat gactaaatta caccttttaa
gaaggtaaat gaatgtggta 35100cctgattttt ttattctgta atttcagagt agaaatccag
tgatagtagc ttggcattgg 35160ggctgtaatc tgattataac tggtttgtat cataatgaaa
atatgctggg cccatggagc 35220tcagtttttg tgaatatctt ttctattctt tctctgtctt
ctcacagact tatgtttaaa 35280gaggcatttg aatgcatgag aaagctgaga atcaatctca
atctgattta tgatcataac 35340cctaaggtaa ctttctaagc tgtcatttac tctagcttac
tttgtactta aactaatatg 35400atctgaacga agatgttttg tccttttttt ggtaggtgtt
tcttggaaat gtggaaacct 35460tcattaaaca gatagattct gtgaatcata ttaacttgtt
ttttacagaa ttgaagtaag 35520tattttgaat aattcatgtg tatcttttcc atagttttct
ctcttcttgt taaggaaatc 35580aagcataaat agctagagaa gaaaaattcc ttactgttca
tttttaaaaa ttgctataac 35640tcttagatgc cagttggttt tttgctcttt tccgttcttt
ttaaaacagc ctgtttaaaa 35700ctatgtcctt aaaacatgtc attcagaatt attatttcac
ttgattttta ggtatacata 35760taaaactact tgtttttcct aggagactga aatcaaatgg
catctttctc tctgatgatc 35820tttcccctca actttttaat gaaacacttt caaaatagag
aaaagttgag agaattgtcc 35880agtaagcaac ctatatatac cccacctgga ttcgccagtt
tatatttttc tgtatacaca 35940ttctcattct ctataatctg tccatccatc attcatcttg
tttgtagaca aattgctaag 36000tgagttgtag acatcagtcc actctaccac ctgtacttct
ccttgtatat cattaactag 36060agggcattct ttgtgtatgg gttggttttg ttgtgttttt
tcaggtcata tttatctaca 36120gtgaaatgtc caaatcttaa gtgtgccact tagtgagttt
tggcaaatgt acacttcatg 36180taacctgaac ctctgtcaag ttagagggca tttactcctt
ttcagaaagc tgcttcagat 36240tcctttcaat cagtccctgt cccattcccc aggcaactac
tcttctgaat tttttaccat 36300aaatcagttt tgcctgttca agaacttcac ctaaatggaa
gcatacagta ttactcttct 36360gcataaagct gttttcattc agcatattgt cttgagattc
atctgtgttt ttatatgtat 36420cactagttca ttcttttttt attggtcagt agtatgccgt
tgtgtaaata caccactatt 36480tgcttattca ttcccctgtt gctggacatg tggattgtac
taccctgttt ggggctaatg 36540tgactaaaac atctacaaac atttgtataa gtcttttgtg
gacatgtttt atttctcaat 36600atttttataa ttcaactctt ttccaaaagt catttttatt
tatcatcatc agcatgccag 36660gtgtatgtta gtaatttgat cgctgggcta catgttctgt
tgatgaccat tccatacaca 36720cctgttctta gagaagaaga tgtcacgaag accatgtacc
ctgcaccagt taccagcagt 36780gtctacctgt ccagggatcc tgacgggaat aaaatagacc
ttgtctgcga tgctatgaga 36840gcagtcatgg agagcataaa tcctcataag tatgtatgct
gtcaccaggt ggcatccttt 36900gaaaaaccga agtgtgtagt tgtccttgtc cagcctactt
acctttctca ttctggtgtt 36960cttcacttat tacctcagat actgcctatc catacttaca
tctcatgtaa agaagacaac 37020cccagaactg gaaattgtac tgcaaaaagt acacgagctt
caaggtagag atccgctcac 37080agagaaagtg cttaaggtgg ccgtgactgc tactagtctt
ctgcaggtga caatcaccat 37140gtcattgcca caccacagat ttaacatgtg actttttagt
tgccatttta agacccttgt 37200cagttttttt cagtgctgcc ctctaaagca tatataaaag
tatcagaagt atatattctt 37260ctgatgtcca gttctattga gaaaaattta ttgtcttttt
ggttatgttg ttaggtctgt 37320ggattttttc cccaaatgat tgtgttctgt tttgttttct
aaacactgtt aggaaatgct 37380ccctctgatc ctgatgctgt gagtgctgaa gaggccttga
aatatttgct gcatctggta 37440gatgttaatg aattatatga tcattctctt ggcacctatg
actttgattt ggtcctcatg 37500gtagctgaga agtcacagaa ggtatgtgga gttcttactt
ttatgccatt tggttcttgt 37560ttatataatg atagtgtgaa accctgcttc tggtagtgca
gtagcttttc tgctatcact 37620ctgtgagtgc agggctggag acagatctgt gagtttctag
ggcccacatt cctaagcccc 37680tgtgcttatg aaagtgtttt gattgtgagg ttgaagaagt
gaagtaaaat tgcatggctt 37740ttttttgttt cttttttttt gagacggagt ctcactcagt
cgcccaggct ggagtgcagt 37800ggtgcgatct cggcttactg caagttccac ctcccgtgtt
cacgccattc tcctgcctca 37860gcctctctag tagctgggac tacaggtgcc catcaccacg
cccggctaat tttttgtatt 37920tttagtagag acagggtttc actgtgttag ccaggatggt
ctccatctcc tgacctcgtg 37980atccgcctac ctcagtctcc caaagtgctg gaattacagg
tgtgggccac catgtgcggc 38040ctaaaattac atggttattt ttaagatgat gggcatatgt
gtgagctaat ttcttctctt 38100ataaaggaaa tgtaacaagt ggttcatgtt ccactccggt
tctttctcac atggctcttt 38160tttctagtgg agggtgggca catggagcac agaaggctca
tggcctcctt tcctatgttg 38220gtacatttgc tatgatcaaa aactttgaac accactggta
tgcatatttt ttatttattt 38280ttttgcagcc tcagtctctt ccccatgacc tctccaaaaa
tgaaaatcgg atccttcatc 38340tctctgctta aaatacttca tgagctccca ttgttccgag
gatataattc agaagccata 38400atactgctta aaaacccttc cttgacctgg cctctgtgta
tctttccatt ctcacttctt 38460ggtattgtct ttttttcctc tgcccatgga ggaaagacaa
tgcttttgtc ccccttccct 38520tgcccctcac caccacatgc cttggtgggc agcattactt
ctgccatcca tgggctttga 38580ctgcttccac cctcaccatt cccctggcta attctcacta
atctaggtta aaggatgcca 38640aggtggcctc ttcccagtaa gccattcatg cttccctcca
gggactgggt gaggtgaccc 38700tcctatatgc ttctgttgca cacagtgcct acccctgcag
actacagtgt gtctttatct 38760agagtgcggt atttatttat ttatttttga gacaaggtcg
ggctctatca cccgggctgg 38820agtgcagtgg caccatcttg gctcactgca acctacgcct
cctaggctca agcaatctca 38880cctcagctta caggcgtgca ccaccatgcc tggctaagtt
ttgaattttt tttgttgaga 38940cggggtttcg ccatgttgcc caggctggtc tcaaacttgt
gagctgaagc aatccatctg 39000cctcggcctc ccagagtgct gggaatgagc acttaattat
ttgttgtctt gggttttctt 39060cctatgttgt tcttacatgt atttatcctg tcagcccagg
gaaattgcat taaaaacagg 39120aaacacctct ccattaggaa gaaaaacaat ttgcttacag
ggcatggcat agagctggag 39180atgatagtgc caataaatac taggttggca gggtctcaga
gttttgtgtc caactcagta 39240taattttatg tttgttttaa tgtgatcatt tcaggagagc
atggaatgtc atgaaaacag 39300caccaagagc aatgtcttag acttttagga gaaacttaga
tgcatttgtt gaatatcttc 39360tagactgaaa ccttatttcc cttattagcc tatgaaataa
atgatactgt gagacttagt 39420taaggaagtt actattattc caagtgtaac ttattaatat
ccgtatgtga aagcattttt 39480gccaaagctt gtttgatgtt cagctgaccc ttgcacaacg
tgagtttcaa ctgtgcgagt 39540ttgaactgtg tgggtttatc taaatgtgga tctctctcaa
acacagttgg ccctttgtgt 39600ccacggcttc tgcatccaca atcagtgtgg atcaaaagta
caatatttgc aggatttgaa 39660acttgcagat acagagggcc aacattttgt gtatccaggc
tccatggggt caaatgtagg 39720actggggtat gcttggattt tggtatcctt ggggtgtcct
ggaaccaatt ccccatagat 39780actgggggac aactgtagtt tgattttata tattatataa
tatgcagtta atatataata 39840cacatttaaa aattatgtag ctttgggttt attgctatat
gtaaatgcta gtttctattc 39900ctatatatga atatcacaag taataaagtt ctcattaatc
atttttttag gatcccaaag 39960aatatcttcc atttcttaat acacttaaga aaatggaaac
taattatcag cggtttacta 40020tagacaaata cttgaaacga tatgaaaaag ccattggcca
cctcagcaaa tgtggtaagt 40080gtggggatta gtatgtttat ctctacttca gatcttcttt
ggaactaggc aaggtataaa 40140ttaaactgtt agtttagaca gtgactgatt tcacttccca
ctcctgaaaa ctctaacaat 40200tatgtatgct cacgttattt tgtcctgtgt tctgaaaagc
tgaaggtaat cacttttaat 40260gaactggagg agctccctag gtaagaacgt caagtagatc
cttttttggt taagaatgag 40320cacctgtgaa gttaacttca gtgtctcaga atcaaaattg
gttgacagtt cttccttctc 40380atgctgtttg cagacatgtc agggaaactc tgcttgtctg
gagagagtga tgaggccacc 40440tccccgtgcc ctgcaagacg cagttttaat tgacagtgat
ggggtgccag ttgttcttcc 40500catgctggaa cagttgtgat tctttactga ggactgatgg
gggaaaggaa gaatcacctg 40560gggtgcatgt taagccttca gctgctggca tccttggaga
atctgattca ggtggtctgg 40620gataggactg aggcgtgcat gtgtctaata agcttcccag
gtgatgtctt ttcaaggagg 40680ctgagaaaac actgggctgg aaagctggga ctcttaagta
ggatgctgat cccaatcagt 40740gctgctcttg cctcagaatc tgcagtggtg ctcattaaaa
attcaaattc caggatccca 40800ttcttcagat tctctgatta tttaggtctt aaaaagttcc
tcatttattt tgtttggtga 40860ccattggtat aaatgaagtc cattatgctt cccatgtctt
aagcctgtct ttgtgtgaat 40920ctttttcctg caggacctga gtacttccca gaatgcttaa
acttgataaa agataaaaac 40980ttgtataacg aagctctgaa gttatattca ccaagctcac
aacagtacca ggtatgtggt 41040atgtgaaaat gaggctctcc tggttttgct ttttgcttta
gtaggaaagg agtgaggatc 41100ctaagttcat aacaccatcc ttggcttcaa aatttatctt
aaaactaatt agcctcaatt 41160tgaacttctt atctgggaga atggtcctga cctgttctct
gattcctcat ctggaatacc 41220acagcacctt cctcgtgggg ttccctgctt ctttcccacc
cctcctctag cccaacctta 41280ctgctgtaag tctgattatc ctaacaagta cagatctttc
ccatatattt cagcataaag 41340ggaaattttt gtttgcttga aaaagcatcc ctttagcttt
ttttatatac cacacacttt 41400gcttctaagt taaatgtgtt atatgatcct cttaacagcc
tcatagggtg ctgtacacaa 41460tttgtagatg aggaagcaac ttgcctgagg atccagagct
acaaagtgct ggacctggga 41520tacagagccc aggctgcctg accaccctgc ccatgccatt
aaccaccact ctaccatgcc 41580accagcatca ccattttcag tttgtcctca gacaatatac
acatctttct ttgatcaagc 41640ccctgccagc ttctttagca ccagcttctg ccactgtcca
cattcccagt tacttgtagg 41700tagttctaca gatgtcacat cgtgtgattc ctctgtcatt
tctctaccca ccagccttcc 41760tttagcccca tttgtccatc agaacccttg ggttactcct
gaatgccatt cctggaccag 41820gcgccaaaca ctgagccccc agagcagcct gccctcgcct
tggtgattgc atttgtcaaa 41880ctgctgatta gctggtttgt cacctccacc aggctgtggg
ctccttaagg gcagggactc 41940catgttgtat tcctctctga atctctggct aacatccagc
ctggagaatc gaggatttgg 42000ccagtggata cctctttgcc cttgttttct gttctcttcc
acactctctc tgctctagtc 42060acactggccg tcctgttact cctcagacct gctatacaca
ttcctgctgc atggccatgg 42120tgccttctgt gccctctgcc tggtgccccc tatctcatca
cgtggtttat tctcctgaca 42180gccattagag ctcacactcc ctgagagctg caaggagact
gtcctctgtc cctttactca 42240cgtttgccat tatgctatag actatatttt gtccctaagt
ccatcctctg ttactataag 42300agcagcaact tggtggtggt tcttatatgg tttttcattt
gtttggtttt attttttgcc 42360ttgctgtagt atccatactg cccagaatgg tgcatatgta
gttaagagta attatttgtt 42420gagtgaataa atggcacatc ctcagtaagg ttttgaatga
aaaaatgact gtactaactg 42480atcaactgta agattttccc aggtaattct ttcaagggag
ttccaagtat aggaactaag 42540gcagctacac tggagcttta gagaaatgat tgtcatattt
cctcctcagt cctaaatctc 42600ctcttgtcac aggatatcag cattgcttat ggggagcacc
tgatgcagga gcacatgtat 42660gagccagcgg ggctcatgtt tgcccgttgc ggtgcccacg
agaaagctct ctcagccttt 42720ctgacatgtg gcaactggaa gcaagccctc tgtgtggcag
cccagcttaa ctttaccaaa 42780gaccagctgg tgggcctcgg cagaactctg gcaggtaagt
acaatcattt atatgtttac 42840atctacaaag gttttaaaaa atttatttct tttgtttggt
aattttgcaa ataaatttag 42900ggcagaatac tctgagacag tcttgttctc actgataaaa
attaatttag aatgctttaa 42960aggataagct actacagcaa gagtcccaga atgcagtggc
ccaatatgga aagaagttta 43020tttctctctc ccatagggat ttataggccc ttccgttgtg
tggctctgca accttttagg 43080cagatggttg tagctgggtt atctccacag ctgtggggaa
ggaaggagag tggggagaag 43140ttagaatcat ggtaaaacat ttacctttaa gttggaaatg
acctggatgg aagttaaact 43200atcaccttct attccatctc ggccacgcca tgtagctgga
tgggctgtgc cctgtaagaa 43260ggtaaagatg aatttttgga tgggtccatt ctgttataga
cagtaggttg ttggaatagc 43320caggaatgag gtggggaaaa taaaaggcca aatgtcgaag
cattctgaaa gcaaaggcag 43380tttagctgcg tcagggacaa gggttgcccg aaccagaggc
gaggctggta ccaggggctc 43440tagtaccaga gtggaggaaa gggtaaggac acctatgaaa
agagatgagc agaagctctg 43500gtcatctcag cagtgcttga agtaaagcaa tgactggtat
atttttttcc ctaacttgta 43560aatattgttg agatctcaaa gaaaaaaata aaaagcagtc
ctaaaaaaat tccaaactct 43620atcctgttaa attttgttaa atttatgtac cagtccttct
ttgtcatttg cagtattctt 43680tttttcttgg gattatacca gtgtatggga ttatcacttt
tctttttctg gttattagcc 43740tttcccaaat ccctccgttt ccatgctggc ctctttttac
aaatgtcgag aattccttat 43800ttcaggcctt ttagttattc gttcggtctc cattgttcct
ttctgcttta gaaatttatg 43860atattggttg tttatacctt ctatctctgt tcttggatct
cttctattct ttacagctct 43920tagcttgcta tttcccatgt cttatgaggg agtatttcta
gtttttctca gatgtttagc 43980aaaagtaggt ggggagggca gtggtcaaag atgtttgaga
aatgttacac actggagtca 44040ctctgtgtgt acatttaacg taggcagttt acacaagaga
gcaaaagaaa ggtaactatt 44100taaatagtgg aggtgatttt acctactttt tttagtgata
tatgcactgg agtgagcatg 44160caatgagaga ccggaatcta ccagctcctt cgaaagcctt
gggttctctg tgcctctcat 44220tgtggtttat ctcaattggg ctgagagtga ttctaggatc
taaagacact gcatgactca 44280aacataagtc agctacctcc atctagtgct caaccaaaga
aatagtggtc tcttactgtt 44340aagggacgaa gtggtttagt gagagatacc aggtcatttt
cccatataca tgctttggaa 44400gcatctttca aggctaattt tggctgtata tgattttcaa
ttcctgtgct aaatttagat 44460tctagctgcc atttaagata ggactctgtg gtgtatatac
ctattccctc acagaaattc 44520agaaagtaca tagtttcata cataataaag acatattaaa
gaagcacttg agctaaagta 44580tctgtttaac tttgtagtca actgctgctt attgtctcta
caggaaagct ggttgagcag 44640aggaagcaca ttgatgcggc catggttttg gaagagtgtg
cccaggtaaa ctcaattcct 44700cccttctaaa ccccccagtc agcaagaaag gtcttctcaa
ttgtatctta gtgatcatga 44760aagttaaagg aactgtgcat aattgttaag tccagagata
gtgtttgccc cagaggtctt 44820atcttgctgg cttgacttgg aaatctaaat ttagtacatc
tctaagtttg gtgaggtaga 44880atatgaaggt gctctacttt aacataccac tggtttgacc
ttggtagaaa gtacttaatt 44940acatctcaag gtagctgtgc tttttaaaat tgagtttgcc
aaagtagaaa caatgagaaa 45000ggaccattat aaaacaggat cattgaaggc tacatactct
tggcttttac tctcattctc 45060cctattggaa atgtctcttt tacctcaggg acctggaggt
acagcagatt ataaggataa 45120gtacccatat gagcatttgg tagtattata ggatttatta
tgaaaataat aaaactgcag 45180taacactggc cacagactaa cagtacacag gtgcacagtt
gacaccaggg attattgcct 45240tgtagagttt tgacctttga tgagagagtg ttttttacag
ttgttactga tagcacattt 45300atgtaactta attgtgcttt aaaaatattt aattgtctct
tgtgtaataa cagtaagtga 45360aagacgataa ctaaaatttt atataattag atcctggaga
gaatatttgt tgggtgattg 45420aattgaaaat accagtgaat gaaacatacc taaaagggta
gataggttgg gttggaaaga 45480tataccacat cgagggttaa ttaaatggat aagatgtcat
tatctttttt tctttgtaaa 45540ggaagattaa tgcataaaat tattttgtgt aatttacata
caataaaatt atgtgttgta 45600cagttgtata atttacatat aataaagcta attcaccaat
tttagatgaa gaattcagta 45660catttggaca tatgtttgta gctgtgtaac caccattgca
ctcatgatct agaacatttc 45720taacaccccc aaaagttccc tacttcccct tttgcagtca
gccttctccc tccactgcca 45780gcctttggca aactgatcag tcagtaaagt ttcacattat
ctagaatttc atataaacag 45840aaccatatgg tatgtagtct ttttaatctg gctcctttca
ctcacatagt gcattggaga 45900tgcatccatg ttgtagttta ttcctttgta ttgctgaata
gtatcccatt atatgtatat 45960gtcagaattt gttgatttac cagttgatgt acatttggat
tgttttcagt ttggggttat 46020tatgaataac gcagccatga acattctagt gcaggtcttt
atggggacag gagtaggaat 46080gccacatccc gtggtaagtg gatgtttaac tttttaggaa
gctgcagaac taatctgcag 46140tggccgtatc attttgcatt cccctcagtg atatgtgaga
gtgcttcagt gactcctata 46200ctcaccaaca ctgggtgtat tactgtgaca ctagatgtat
tatctattgc tacgtaacaa 46260cttaccttaa aagctggcag cttaaaacaa cagaccctat
tatcccactt tttcaatggg 46320ccaagaatct tggctgggct tagctggggc ctctggctca
gggtccttta caaggctgca 46380attaaggtat tggccagggc tagagtcatc tcaaggcttg
actagttttt aatttcattt 46440tctaatgttt tattactagt atatagaaat atagctgaag
tgttttgcag ggaggctgta 46500taattgacct tgtatcctgc aaccttgcta aactcattta
ttagttctag aagctcttgg 46560gtgtattctc taggattttc tacatcaaca aacatggttt
ctataaatat agttttatgt 46620ctttcttaca atcaatactt ttttctatct gtattgcatt
ttctagggct tccagtgtgg 46680tgttgaatag aagtgttaag agtgaacatc cttgcctttt
tcctgatatt ggagaaaatt 46740cacttgtctt ttagcattaa gtgtcatgtt tgctttttta
aaattttatt ctatattatt 46800ttatttttga gacagagtct tgctctgtca cccaggctgg
agtgcagtgg tgtgatctca 46860gctcactaca accttgacct cctaggctca agcgatcctc
ccacctcagc ctcctgagta 46920gctgggactg caggaacatg ccaccatgcc tggctaattt
ttgtattttt tgtagggatg 46980gggttttgcc atgttgccca ggctggtctt gaactgttgg
attcaagcaa ttcgcctgtc 47040tcagcctccc aaagtgctgg gattacaggc atgagcctcc
gtgcctggcc tgatatttgc 47100tttttttttt ttttttaatg ctctctattg cagagttggc
aaactacaac ctgtgacaaa 47160tccagcatgc cacctgtttt tgtaaataaa gctttattgg
agcatagcca tgctcattag 47220tttacatctt gtgtatggct gctttaacac tacagcagca
gagttagagt tgtgacacag 47280atagtttggc ccataaggcc tatatttact gtctaatctt
ttacaggaaa aatttgccaa 47340ttcctgccct cttggtttga ggaaattccc ttctgttcct
tgttctgaga gtttgtatca 47400tgaatgggtg ttaaattttg tcaaatgcat tttcaactat
gaagggtttt gtttttagac 47460gagtgatatg ggggactagg tgattgattt tctactgtta
aaccaacctt gcatctctgg 47520gttcaacccc acttggtatt atagatttat tacccttttt
ctcttgtggc agattagatc 47580tactaaaatt ttcttgagga tttttgtgtt tgtgttcatg
agggatattg tagttttttc 47640gtgtctttgc catgttttgg gtatcaggat aatgctgctg
tcattgaggg gtgacaaaaa 47700tgaggggtgg tgtcctttac acttctgttt tctggaggat
ttcatgtaga attggtatga 47760gagtctagct tatggttaaa aacctatgtg tgatgtttca
gacctgacca taaacaatta 47820cagactttac ctaggaggcc acatggggaa aagctgccct
ccctacacca gacttggcgt 47880actgccaatg cattacagtt tctaaaggga gttgcagtca
aggactcagg gccccctgtt 47940agtcatgctc ttgtaacagt atttgcattg agagtcctgg
cactttcatt cttaggtctc 48000tctatctgag gacatgggcc aaggtcttct tcaggcacct
ctgccaaggc ctgtttatgc 48060aagaaggagt ggaaaaacct tgacattttt ttccactgtg
actcactacc cagtactttt 48120ccacccttag cccccttcct ttgcacccat acccccaaga
tccatcaaac tgctaaagcc 48180tttttttcca agctccttca acagtgaacc aaccctcatg
tctgtgtgga tccagctgac 48240tcttgactag tgagttgttc cttgggaaaa aatggaacag
agagagttgg tgctttccct 48300ggttttagcc tcttgcttat accaatgcaa tgcctgaagg
cttaattcat ttttgacttg 48360ttgctttgat cagctactcc aacacctgac agctcagctc
tttctcccag ctcttgggag 48420atattttttt ctttaaatgt ttagtagaat ataccagtaa
ggccatctcg gccaggagtt 48480ttctttaatg aaagtttttc actattagtt cagttacttt
agtagacatt aacctattca 48540agtttatctg tgtcttctgg aatgagcatt ggtagtttat
gtctttcaag taatttgttc 48600atttcatcta aattgtcaga tttattggta tgaagtgttt
atagtattct cttattttac 48660tgtccgtagg gtctatggtg atgtcctgtc tttcattgta
gatattgatg tgtcttcttt 48720tttctgatta ttctggccag aggtttatca attttattga
tcttattaaa gaatgaactg 48780tttcattgtt tttctctatg atttttctgt attctatatc
attctttttt tattatttta 48840ttattttatt tgctctttat ttttctagtt tcttaaggtg
atggcttact tttatttttt 48900tcttattttt ttcttttgtt gttgttgttt ttttaaagaa
acagggtccc actcttgctc 48960aggctggagt gcagtggcac gatcatggtt cactgcagtc
tcaaactcct acattcaagc 49020tgtcctcccc cctcagcctc cagagtagtt gggattacag
gtgcatgcca ccatgcctgg 49080ctaattttta attttttttg tagagatggg gtgttactag
ttgcccacgc tggtctgaaa 49140ctcctggcct caagtgatcc ctccacctct gcctcccaaa
gtgctgggat tccatgtgta 49200agccactgtg cctggccaag gtgatggctt aaagctattg
atttgagatg attccttact 49260ttatagttta agcatataat gccataattt tcctcaagca
ccgttttagt tacgttatac 49320aaattttgaa atgttttgtt ttcatttcct aatttccctt
gtgatttctt tattgaacct 49380tggcttattt agaagtatgt ttaacttgca gatattggag
atttgccagc catctttttg 49440ttattaattt ctactttaat tttgttgtga ttagagaaca
tacattttat taatttaaat 49500ttataattta ttttaattta taatatggtc tgttttacag
aatgttgtgt gtgtatttga 49560aaataatatg aaagctacta ttattggatg gagtgttcta
taaatgtcag ttagattagg 49620ttgatcatgc tgttctagct ttttatatcc ttattgattt
cctcactact tgctctatca 49680atgactggga aagtgttgaa gtctcccagt atttgtctat
ttctcctttg attctaccag 49740tgtttgctta atgtattttg aagctctgtt ataggtgcat
acatgtttat gagtatgtta 49800tagatgtatt cattttgata tccttctttc tctgttacta
ttcctaattc tgaatttgac 49860tttaatgtta ttaatataat tcttccagcc ttctcttggt
tagtcttttc attgcatatc 49920tttttctatc cttttacttt taatctagct gaatgtagtc
tttattttga aagtgcgttc 49980cttgttgata gcattattgg ttcttttttt tttttaaatc
taatttgaca atctctgtct 50040tttaattgga gggtttagac atttgcattg aatgtgatta
ccaatatagt tagatttaaa 50100cctacagtct tgctgtttgc tttttgtttg tttcattgat
cctttgtttc ttgttttttt 50160ctttttttgc tttcctttgg atttagtatt tttcataatt
ccattttacc tccactgttg 50220gcttattagc tatacttctt catttcagta ttttagtggt
tgctgtagga tttataataa 50280atatcattaa ctgaccatat cttcagataa tcgtatacta
cttcatatat agtgtaaaaa 50340ccttacaaga gtattcactc cataatactt tgttattgct
tttgctttaa gtgatcaatg 50400attgtttaag gaaatttttt aatgaccttt catgtttatt
cttttttttt tttttccaaa 50460agattcagta ttttccgagt tttcaaaaac tgctggccac
tcaaagtgga tcaacaaaaa 50520tttaagagct aaaactgtaa aactcttgaa ggctgggcac
agaggttcat gcctgtgatt 50580ccagcacttt gagaagctga ggtgggacaa tcacttgagc
ccaggggttt gagaccagcc 50640tgggtaacat agaaagacct tgtttctaca aaaaataaaa
acacaattag ccaggcatgg 50700cggtgtgcac ctgtagtccc aacttcttgg gaggccaagg
tggcaggatt tcctgagcct 50760gtaagtttga gactgcagtg agctgagttc acgccactgc
acttcagcct ggacaacaga 50820acaagaccct gtctcaaaac cagaacgaaa ctataaaact
cttagaagaa aacagggcta 50880aatcttcatg actttggatt tggcaatgga tggttagaat
taataccaaa aacacaatca 50940ataaattgat aaattggatt taataaaaat taagaacttt
tgtgtatcaa ggacattgtc 51000aagaatgtga aaagacagca tatagaatgg aagaagatat
ttgcaaatcc tatatctgat 51060aaaggtttaa tatccagaat atgtaaggaa ctcctgcagc
tcaacaacag aaagccagtt 51120aaatcaattt tgaaatgagc aaacgcctgt aaacccagct
gcttggcaga ttgagacagg 51180aggattgctt gaggctagga gttcaagacc aacctggaca
acatagtgag accctgtcta 51240aaaacatttt tttaattagc tgggtgtggt ggcatattcc
tgtagtccca gctacatggg 51300agaccgaggc aggaggatca cttggggcca ggcagtcaag
gctgccgtga gctgtgatta 51360tgccactgca tcccagcctg ggcgacagag tgagaccctg
tctgagaaaa aaaaaaaaaa 51420aagaacaaaa aaaaatttag aagattgcta ttctagtcta
ctattttttc aaagggtggt 51480cttgttaaca attctggagc ccacctaaac ctgctaaatc
aaacttggta gtaaagctgg 51540ggagatgggc atgtctaaca gacgtttctg gtggttttga
tgtccaggcg tgcagagaga 51600tgatgcttac cttgtgtttt gtcattattt tcaggattta
caccccttcc ttgtcttttg 51660tatcaatatt tatggagtca tgaactctag gataggcatg
atgttgagaa ctaggagttc 51720tcccctggcc agggagatag aggcaggtct gtggttagtt
ttgtagttgg ctgtgatgac 51780atctgacatg ctctcttcac ttgttgtctt cttcctgttc
ccttgtcagg attatgaaga 51840agctgtgctc ttgctgttag aaggagctgc ctgggaagaa
gctttgaggc tggtaagaat 51900cttgtaaatc ctctggatgt tgggtgctaa gcagagagag
caagcaaggg attccaggtc 51960agttggaatc tcttgtcttc tgaggttcat gaaataagta
gaaataggtc aggttcctgg 52020cttaaggaaa agcggtgtta ctaaaatcat ttttatcatt
cttgataata atttgaaata 52080ttactgtctt ttactgaaat gaattgaatt tccttggctg
ccttgtagga ggcctgtttt 52140tcaggaaaat attctgatta cctctgaaag taatccatgt
ctttctaagt atcttaactc 52200tccagtgact agaagttttc cttcctaaaa tatcgtgttt
ttccttctag gtatgcaaat 52260ataacagact ggatattata gaaaccaacg taaagccttc
cattttagaa ggtgagggtt 52320ccattttaga tagaattcct catttggaag aaggtgagga
gagagagatg agagagtctc 52380ctcctattta ctgtgttttc ttaataatat gtcatgtaga
ctcaatcaaa attaccacct 52440ggatataata tttaattctc actagaattt ttaaatatgc
tgaactatta aatggtaaca 52500aaatatttaa atgttagaaa cctgtgatca aatatgatta
agaatctttg tatttggaaa 52560tagtaaactt gaatatgaac tatattagat aataatataa
cactgataaa tttctggcat 52620ttaataatca tgttgtggtt atataagata atatcctatt
attctcaaga gataaatgct 52680gaaatattta ggaatgaagg atcatatctc tgccttactc
ttaaaaggtt ccacaaaagt 52740attaatgaat gtgtgtatgc atgcagagaa acaggaagca
aaaaaatgtc aaaatgttag 52800taattggtaa atcaaagtga agggtatatg tgtgttcatt
gaactcttac aacttttatg 52860taggtttcaa cgtttcaaag tattttttaa aagttacctt
ttcaaatgaa gtttgtggtt 52920cttagagaac atatgaatat taccagttct agaatactca
gatggtcact gtgacctctt 52980aaaagcaaag tggagaagga catcagtttg acttatagaa
accttaggga gtggttgatt 53040ttaagttctg catttttatg cacatctacc ctgtaagtaa
cgtctggcct ttctgacatt 53100tacatgtatg cacattctta ccttgtctgc acccccttcc
tccatcctaa ttaaaacgtt 53160gctggggtac tttttatgtc attcacttta ggtacctcta
actgggtact gaaaacatca 53220ttcctcatct ataataatct aaccagctct tacttagatt
ttcaccacta atgagaacct 53280ttcttagata aatgccgata attcatctac ataggcccaa
aacctattaa taaaatgcat 53340ccttggatag tagtattttg cttttttaaa atgtattcta
ctagtgttat ttttctcttg 53400tgtatttttc cattggacaa tatttattag atacattttt
tccacatcca tgggcatttt 53460gatggatgtt tagccagaaa catttaggta attttcttct
tatttttgtt aactgagctc 53520ccctccccta cccccccttt ttttgtttgt ttgttttgtt
tgtttgtttg ttttgccaat 53580cctcccttgc tttaggtatc aagtcttcgt tcaggtgatt
ttacaagttc agtggtagcg 53640catattctgg gataatgttg atgaactcta agatctggaa
tctcagtctc taatttgtta 53700atgcttatta aggaaaaaga gctcgcttgg aaaacctagt
aacctctttc tttttgctga 53760attttaaccc tccttcactg ctccccgcct ttagtttttt
ctctttgctt aaacctcatg 53820ctcaaactat tttccattct gcatctccag cccagaaaaa
ttatatggca tttctggact 53880ctcagacagc cacattcagt cgccacaaga aacgtttatt
ggtagttcga gagctcaagg 53940agcaagccca gcaggcaggt ctgggtgagt atctgcgtga
aggccatcga cgtgcggggg 54000cagtggggtt gggtaacgcc acacattgtc tagattgctt
ggtgatccgc ctgcaatctg 54060attactgtgc catgggcaag tgtgaggctt ctgtggagcc
ccttcagggc cctctgtgtc 54120tgtgtttgtg tgttggtgaa gggcaggacc aagcatgaat
ggggagagct ctgccagaca 54180ttcccaccta cccccattca cccagagcag ctgaccactt
ccgtgtctaa caaaatgagt 54240ttcctcattt ccagaaaaaa gttcaggaaa ctactgattt
acattagtaa ttactgtatt 54300taatattatc tcattcattt tgagatcaac tttgcaatca
ttttcatcca tcctttgata 54360tgcaccagtt gactctagtt agttcattta ccgccctgaa
agtaaaccca cacattagca 54420ggcagtgttt tcatcggctt ctggttcttc ttttctagat
gatgaggtac cccacgggca 54480agagtcagac ctcttctctg aaactagcag tgtcgtgagt
ggcagtgaga tgagtggcaa 54540atactcccat agtaactcca ggatatcagc gtacgtatca
cattgattca gcacattgac 54600tatatcctgg gcatataggg aaagtggaag caaatagatt
ggttttctac tgggacggtg 54660tagtgggagt ggggagaata ttcttcagcg ctgtgtggaa
gttgttcaga cactttccca 54720gcatatctga gacattaaac ttggcattgg aaggttttct
tcctcagctt tgtggcttgt 54780gtgttttccc attccccacg aggcagttcc tcccctgaat
gctcagttta tattaacatc 54840tgattttatt ttttgaacaa atgttgtgac taaattatag
gcactgaaaa aatgaaaaga 54900taagcttctt caattcaaaa tcaggattgg aagagaccat
aaatgtaaaa taagtcataa 54960cacttttacc aaatatagta atttgtcaga aatatttatt
cagcactcat atggtaggtg 55020cagtagatgt taccaaaaac ttataaggag atatgagtta
taagagttta tagtcttgct 55080tgggatgtgt aaagcaatgc aagattatat attcaaactg
aattttgctt taggaattta 55140aaatggagat ctgtgaagtt gtgtggggtc atcagcaact
gcaagaaagt agccaggcaa 55200ggtagcacat gcctgtagtc ctagctactc aggaggctta
aaaatatctg tgtaatttct 55260aacaggagat catccaagaa tcgccgaaaa gcggagcgga
agaagcacag cctcaaagaa 55320ggcagtccgc tggaggacct ggccctcctg gaggcactga
gtgaagtggt gcagaacact 55380gaaaacctga aaggtatatt ctcagtcctg atgatgattc
ctgaccacaa acaatagtga 55440ataggcagta cagacaggca gagttcagta ggtgattaag
ctaccatttt cccaatttga 55500ggaaagatga gaacttttag caggaagggt catgtctgca
cacattcctg aagcagccct 55560tcttagctgg taactgagaa gccttcctcc atttggcatc
cccctaactg aactgggaga 55620gatgcttaag ccaggataaa gaattgtggg acactgcttt
ctgcgtaggc cccccagcgt 55680gcttgatttt ctttttgtag tacatgtgtt taattattcc
agcatttggg aagaaaaaag 55740ataatgtggg agaaaggacc tgcagtggga tcatagaaat
ttttggcttt ggatagaagc 55800tatgtatgat tctgtcaatg gagctgggaa tataacttac
cactctttca aatttcttct 55860ctctagatga agtataccat attttaaagg tactctttct
ctttgagttt gatgaacaag 55920gaagggaatt acagaaggcc tttgaagata cgctgcagtt
gatggaaagg tcacttccag 55980aaatttggac tcttacttac cagcagaatt cagctacccc
ggtaagtttt ctcagagacg 56040gtgtgcattt ttttcatcat tttcatgggt tattgtattc
acacaatctc caagtcaaaa 56100agttttcctg ttcttaaaac ataagatgcc atagttaaat
tatcttagca tttatgtgta 56160agctgtcagt aagatttgat atttgcctgt agagtgacta
gtataccttg gcataggtta 56220aatggactgt cattttcctt tctggatgaa gtagctgtca
tggagaaaat gggaaagtca 56280catgattgct cctggccttc aatgaggttg gagtggggag
agatggggga agatggggtc 56340agagacggcc tctcactttc ctttcagaac tcagggatgg
gatcaggctt taaagggacc 56400ccaggcaatt gcttttcctt ttgttttatg aaaaatttga
cttgtcactt ctatgttgtt 56460atgatggact ttgcgggttg tgtttaaggc tgaatcagct
ttgtatcgca gaattctagt 56520atattgtcat ctgtttatta tttatacctc tgttcactct
cttatacttc aagtctattg 56580ttaagagttt ttatttggat tcaaaaaggc tggtgtatca
gtcaagatct agaaaggaaa 56640acaaaagcct atctattatt ttatcacaga atttaatata
tggatttgtt aaataagtat 56700tagaggacta aacaaggcaa aagggaaata cagaggaagg
acattgagat agtaactgta 56760ggaagcagct ttaccctcta gctgagggaa caggaggagt
tgttgggaat tattagaatt 56820tagaagcctg gaagtggggc cctgtagagc tggctcttga
acctctgaga ggagggtgcc 56880agccagctaa tcctggcatt tctgagggag ctggttccaa
gcgtacagaa gtaaatggaa 56940actggaagga acagctgctg ctgggggaaa agccagccgg
tcgggccagg tgtggtggtg 57000gctcacgcct gtaatcccag cactttggga ggccaaggca
ggcggatcac ctgaagtcag 57060gagttcgtga ctaatgtggc caacatggag aagccccgtc
tctactaaaa atacaaaatt 57120acccgggcat ggtggcgcat gcctgtaatc ccagctactc
aggaggctga ggcaagagaa 57180tcgcttgaac ctgggagaca gaggttgtga tgagccaaga
tcgtgccatt gtactccaac 57240ctgggcagca agagcgaatc tccgtttaaa aaaaaaaaaa
aaaaagccag ccaatcacgg 57300aagaaatcta gaaatctttt gttcatcctc cagctttgta
ctccccctct ggtgttcact 57360gtaggcagga catgatggga agccagcagc aaggaagaat
atctttcagg tgcccagccc 57420cagcaccaca agcagtggat agaagggtgg gttggagctg
agagattaca aatcagctca 57480gtgtttagaa acacatacgc ttatcatgtc ttgatttcct
catttagaaa tgggcataag 57540acttctctgt gtgcttcaat agaatgcttt gaaggttaaa
taagagggtg tgtgtaaaag 57600cactttacaa accgttgaaa taaaagcaac taggaatcag
ggccccagaa cttcttgaat 57660ttattataat aggtatttct tagaagaaat gtgatcatca
tcttcaaaac tgtagtactt 57720ttgaagataa ttgtttttgt tttttgagac agggtctcac
tctgttgctc aggctggagt 57780gcagtgatca ccgctcactg cagcatccac cgccccgggc
tcaggtgatc ctcccacctc 57840agcctcttga gtagctggga ctacaggcgc atgccacaac
acctggttaa ttttcaaatt 57900ttctgtagag acagggtgtc accaagttgt ccccgctggt
cttgaacaac tcctgggctc 57960aagtggtctg cccacctcac ctctccaaag tgctgggact
ataggcatca gccaccatgc 58020ccggcttgaa gataataatt tataatacca ctcccatgag
tgatcttctc ttctgatcac 58080atattcacat taaggtctat tttattttat ttttttcttg
ctctgtcacc caggctagag 58140tgcagtgaca gtatgatcaa tcatggcttg gtgcagcctc
gaatgcctgg gctaaagcag 58200tcctcccacc gcagtctcct gagtaattgg gaccacaggt
gcacaccacc atgcccagct 58260aattttaaaa ttttttccta gacatgggga gagggagtct
tgctgtgttg cccaagctgg 58320tcttgaactc ctggcctcaa gtgatcctcc tgccttggcc
tcccaaagtg ctgagattac 58380aggtgtaagc caccatgcct cccacattaa gttctaagac
atcaatttta tgattgtggt 58440tttgattggt gaagtatggt tgtggtatgt gcaggatacc
gtgagtgact tctcatggca 58500ttgctcttga gagtgtgcca ccaagggtct gcactaacca
ggggtgtgcc cagaggctcg 58560ctgcaggctt gaaattcctg cggagtcttg tgttttacct
ggagcacatg tgcacagttt 58620ccattctgct ccatagtatg cacatgtttg tatttatttc
aacctaaaaa tgtttgtttc 58680ccataactct ttgcgtataa ttgatactct acgtatttgt
agcctctttt actcttttcc 58740ctttcctcag ggagtggttt gctcatttag aaaaggccaa
gatatatcac tgtagagttt 58800cgtttctttt cttttcctcc accccccatc tttaccttgt
tctgggagaa aggagaatta 58860gaagtctgag ttgcagctgg agaaactggc aaattaaaat
cacattggga aagagaatta 58920ctgtgtttca caccatacca gtagaaatga caggctgttt
tctgctggta gggatttggc 58980ctttggtatt ggcagtcttg agaagtatta gataatcttt
gctgatacag tctattttct 59040cctcaggttc taggtcccaa ttctactgca aatagtatca
tggcatctta tcagcaacag 59100aagacttcgg ttcctgttct tggttagtat tttttctcat
ttaatattac aatactaagc 59160agaaggacta tctttctgta agtattgaga agatcagcag
tataaggaga gattggatac 59220aatttttcac tacaaaaaat tgactacaat tcttcctcaa
ttctaagacc gcatctttag 59280tatgatcagt ttcatgcttc tagcggtggg ggacctggtg
caggaaaatc cagcatgacc 59340attgtatgtg taatttttaa aaatatttat gtggcatatg
cttgttcata aaggcacacc 59400acagttccag tttcagtcta aactgtctac atttacatat
acatcaaaag attcttctga 59460agcatcatta ctggctattg gcagttatgc tttgcatctt
gggggcattt tcataaacct 59520tgcttatgag tgggaccttt ttattatgtt taggattgac
aatataattt gaaggcaaat 59580ccaaagaata ttagcatttt atacatattt cctgtttagt
tatgcatgaa gtgttttatt 59640tgttgagggg agatgattct caattagatt acttatttcc
ctaaaaatta aaaaccctaa 59700gcgctttctt ttgaaagttg gttagaaaca tttgatgagt
cagcttggga ctttcagtat 59760ttgcccttac ttatagttgg atcaatgaag catcttagct
ttgaaaagtg aatgatagtt 59820tctaaaataa ttggcagttt taactgctat tatttgcatt
tctagcatgt gacaagcaac 59880tttctgaaat tttttttcac cgaagtgcta cactgtaata
gcattttgat gacatttgaa 59940gtagcctgtg gggattcaaa ttaagtttga ctttaacagc
ttatgttgct accaggaaga 60000acagctacct tccatcccag ctaaactcat acatccagac
tgtaactact gtattcctag 60060ctcctcttct gtctagagaa tggcaaggtt cttttggtat
gcagtttcga catatccact 60120tattcctttt tttttcttaa gttttttcat ttagaaaaaa
aaacagatgg ggtcttaata 60180tgttgcccag gctggtctca gcctcctggt ctcaagtgat
cctcctgcct cggcctccca 60240aagtgctggg attacaggcg tctgcccctg tgcccagccc
acttatttcc cagatgctag 60300gaacttacat tagacctgag gccatttggt cattgtttat
tttgtgctgt agtccaatcc 60360agttgtgatt tctgcctcct gtgttcctcg ttgctggcct
gatgctgacc ttcaggttag 60420gtcagtccca tcattcccca gggtattcta gatggctttc
ccacttcaaa gagcactttc 60480ttgttttcca gctgagcctt aaagacactc tgtaatattt
gagagcccct cattatctga 60540gtgtttatta tcattaccct tgtggtttca aggatgtata
ggaaaaggta agttcctata 60600attcaaaaat tgccactgat gaactaatca caaaattagt
gccactcaaa tattactcag 60660ctgcccctcc ccagctaaca atagttaagt atattggcac
atccccacaa gtgaaatcaa 60720tgacttgatg ggtcatttct gattgtttcc tgctttgatg
caatacaata tcatgcagat 60780caattgcaag tcttgcaaaa atttagtatt acataaaata
gattaaaatg atattggaaa 60840agtacttgaa tcacagctgg gttggacttg ttgcaattga
tgacaaaata agtgcttcaa 60900atgattttga ctatcaaagg attgagagag gtccttagaa
aaattgaaaa gccctcaagt 60960tatttttata aaaatggcct tttttgtgtg ctgtgaaatc
cacatatgga aatgtgaaat 61020atgtcatgtc ctgctgtcat ataatttgtc agaataatta
ctttcttgcc caaaagtctg 61080tactttgtgt ttatttcaag ttaagtctag aatcaaatat
agttgtagtt atgcctaatt 61140ttaaaaaatg agatagagca cattattttt gtaactagtt
tttttttttt tttttcagac 61200agagtcttgc tctgtggccc aggcgggagt gcagtggcgc
aatctcggct cactgcaagc 61260tccgcctccc gggttcacgc cattctcctg cctcaccctc
ctgagtagct gggactacag 61320gcgcccgcca tcacgcccgg ctaatttttt tgtattttta
gtagagacgg ggtttcaccg 61380tgttagccag gatggtctcg atctcctgac ctcgtgatcc
acccgcctcg gcctcccaaa 61440gtgctgggat tacaagcgtg agccaccgcg cccggcctgt
aaatagtttt tttaagataa 61500agtcttattc caactttaat tggaatttat gaaatacctt
gttgatagtg aatttattta 61560agtagccttt tttcagtatt gatattctta tatctttatg
gcaccattta gtggagagaa 61620atgtaaacaa acataaagat gtagtattaa atcataactg
cataaaatta actgtagtat 61680gtactgcact actgtaataa ttttgtagct acctcctgtt
gctattgtgg tgagtgagct 61740caagtgttac caatatctgc ttaaaatgcc atgtgccgct
aaccatctcc acatgagcag 61800cacatgagag tctccattaa ttgcatatgg cagcgaaaag
tgatctcttg cattgtcgtg 61860tattttttat cacgtttaat gtaatatcgt aaaccttaaa
taacaccatg agacctatag 61920gaagtaccac aagtgttgct cccaggaagc agagaaaagt
cataacatta caagaaaaag 61980ttgacttgct cgatatgtac tatagattga ggtctgcagc
tgtagttgcc caccacttca 62040agataaatga acccagtgca aggactatta taaaagaaaa
ggaaatttat gaagctgtca 62100ctgcagttat gccagcaggc atgaaaacct tgtacttttt
gcaaaatacc tttttatgtt 62160gtattgaaga tgcagctttt atgtgggtgc aggattgcta
tgagaaaggc atacctatac 62220aactattatg atttgagaaa aagcacagtc attgtatgag
aacttaaagc aaaaagatga 62280aggatcaaag ctggagaatt taatgccagc aaaggatggt
ttgataattt tagaaagagg 62340tttggctttg taaatgtctg gataatagga aaagcagctc
ctgccatcca ggaggcagca 62400gcaaaggcag tcaggtttat gatcaggact gcccttatct
gtaaagctgc taacccccga 62460gcctggaagg gaaaagatta acaccagctg ccaggctttt
ggttgtacca tacaacaaga 62520aggcttggac aaggagaaca ctttttctgg attggttcca
ttgtcgattt gtccctgaag 62580ttaagtagta tcttgccagt aaggggactg ccttttaaag
ttcttttgat actggagaat 62640gcccgaggcc accccaaact ccatgagttc aacaccgaag
acattgaagt gatctacttg 62700cccccaaaca cacatctcta attcagcctc tagatcaggg
tgtcataagg acctttaagg 62760ctcgttacaa acagtactct atagaaagga ttgtcaaatg
tatggaaaag aaccttgaca 62820gaacatgaaa gtctgaaaga attacaccat caatgatgcc
atcattgtta tagaaaaagc 62880tgtgaaagcc atcaagccca ggacaataaa ttcctgctag
agaaaactgt gtccagatgt 62940gcatgacttc acaggcttta cgacagccaa tcaaggaaat
catgaaaaag attgtggatc 63000tggcacaaaa aaaaaaaaaa aaaaaaaaaa tggtgcatga
aggatttcaa gataggaatc 63060ttggagaaat tcaagaggtg atagacatca caccggagga
attaacagaa gatgacttga 63120tggagatgag tacttccaaa ccagcgccag acaatgagga
agattacata aaagaagcag 63180tgccagaaaa taaattgaca tttgttccaa aggttccaat
tattcaagac tgcctttggc 63240ttcttttaca acatggatga ttctatgtta tgggcactga
aactaaaaga aactgtggaa 63300ggattggtac cttagagaaa tgaaaaagca aaaacatcag
aaattatggt gtatttctgt 63360aaagttagtg acactgagtg tgcccacctc tcttgcctcc
tctttaacct cccctacctg 63420tttcatctct accacccctg agacagcaag accaacccct
ccacttcctc ctctacttca 63480gcctactcaa cgtggagatg acaaagatga agacctttat
gatgatccac ttccatttaa 63540tgaatagtaa atattgtttt ctttatgatt ttcttaatat
tttcttttct ctagcttact 63600ttattgtagg aatgtagtat ataatacata taacatacaa
aacatttgtt aactgacttt 63660ttatgctgcc aatacactgc cgaacaacag taagctattg
gtacttgagt tttggagatt 63720cagaagttaa acatggggcc aggtgtggtg gctcacacct
gtaatcccag cactttggga 63780ggctgaggtg ggtggaacga gaccaggagt tttgagagta
gcctgggcag catggtgaaa 63840ccttgtctct acagaaatta gccaggtatg gtggtgtaca
cttgtagtcc cagctacttg 63900ggaggctgag gcaggagaat cgcttgaacc cagggggtcg
aggctgcagt gagtcatgat 63960cgtgccactg cactccaacc tgggcaacaa aatgagaccc
tgtctcaaaa aaagaaaaaa 64020aaaaggtata tgcagatttt tgactgtgca ggggggtccg
cacccataac cctacattca 64080aggatcaact gtaatttttc atgcctgcat ggctcatatg
tacagattta ctgctggaag 64140tttatcataa ataatgctga aaaagaaaat ccttatatat
acatattttc tcctatctct 64200gcttgcagta tatgattcct ggttagaaaa gaaacttaac
aaatctaagt gaaagagtgc 64260ctgggagttt taggttacaa tgacagaatc ttttcctaac
cctctctctc cattcacttt 64320ttttaaagca ggggcatctt tattgatcaa catgtttgtc
gaagtttcat cataaagtag 64380ttcctgtcca ttaacttcac ttactgaata tgtgctatca
cattttgcta ttccttaaaa 64440attgagctag actttacata tagtgaaatg cagagatttc
aggtgtacaa tttgatgagt 64500tttaataaat gtatacagcc atgtgactgc tgccaccacc
cctcccacca gtttgaaata 64560cagaacattc ttccactttg aatcactggg tgagcatgcc
tgaggttgaa atgcagtccc 64620tcctctcagg gcggggcctc caggttgtgt ttgctctgac
ctggaggttg caggggtagc 64680agacacatga actctggctc tgatggtctt attgctgcaa
actccacctg cctagtttgt 64740ttagtttaga gttactgcct cagcgccctc caacaagagt
atgtctgtca caatttccct 64800tcctttcttg cttttagatg ctgagctttt tataccacca
aagatcaaca gaagaaccca 64860gtggaagctg agcctgctag actgagtgac tgcagttagg
agggatccga cagagaagac 64920catttccact cattcctgtt gtcctaccac cccttgctct
ttgagggctg gctattgaga 64980actggaaaga gtaaaatgat aacttacctt agcattgcca
agaacttcag cagacaacaa 65040gcaattctat ttattttatg ttgtgtatac atcttgatca
ttagcaagac attaagcttt 65100aaccattatg gcaccatttt gtgagaatga ttgttctttc
acttgggctg tttgagagca 65160taattatggt aatcatgaga ttaatgtttc atgatttcta
cctccaaagt gtgaagacaa 65220gtaaaacaat gtttctaaat tgtcttattt tgttggcgga
gaagattaca atggctatta 65280gtgctacatt tggtcaaatg taatcactta aatagcttct
tgtcacctta aactaaagca 65340gaataaaaag tatcctttga aattataagc cctcctttgc
tgacagctat tattttgtaa 65400catcttacca ggtcatgtgc tttcagttat aactgggctg
agcctcctat aattacaatg 65460tctataggga ctgttttact gcctgtgtat tttctgctag
agagttagca atgttagagc 65520tagaacagat tagaatttct aaacagtatc atgcacagtt
ggtgtgagtg atcagtgtgc 65580attgtatggc atgcatggtt gtgaattatt ctctgttctc
caaatactgt ttctttaact 65640cagatatttt tgttagtgtc taggccactt catttatttt
tcgtcatggt actttactga 65700cttctcttta ttcaattctc cacgccctca ccaaaaaaaa
ctgtctcaaa atgagaatat 65760tttattttca tggtgagtct agaaaacgcc cacttcattc
tgattaaaaa ttcttccatg 65820ttttaaatat cagaaccaga cctttcttac tgtgtatctt
agcccatttg tgtctctata 65880acaacaacca gctttcaaag gaactaatag agtgaaaact
cactcattac cacgaggatg 65940gcacaagcga ttcacgtagg atctgcccct gtgaccaaaa
cacctcccat tgggccccac 66000ttccaacact ggtgatcaca tttcaacatg aggtttaggg
aaacaaatgc ctaaactaca 66060gcactgtaca taaactaaca ggaaatgctg cttttgatcc
tcaaagaagt gatatagcca 66120aaattgtaat ttaagaagcc tttcccagta tagcaagatg
ttaactatag aatcaatcta 66180ggagtattca ctgtaaaatt caacttttct gtatgtttga
acattttcac aatctcatag 66240gagtttttaa aaagaagaga aagaagatat actttgcttt
ggagaaatct actttttgac 66300ttacatgggt ttgctgtaat taagtgccca atattgaaag
gctgcaagta ctttgtaatc 66360actctttggc atgggtaaat aagcatggta acttatattg
aaatatagtg ctcttgcttt 66420ggataactgt aaagggaccc atgctgatag actggaaata
gaagtaaatg tgtttattg 66479
SEQ ID NO: 2, length 5924, type DNA, organism Homo sapiens
ccagtgctgg ggctgcctag
ttgacgcacc cattgagtcg ctggcttctt tgcagcgctt 60cagcgttttc ccctggaggg
cgcctccatc cttggaggcc tagtgccgtc ggagagagag 120cgggagccgc ggacagagac
gcgtgcgcaa ttcggagccg actctgggtg cggactgtgg 180gagctgactc tgggtagccg
gctgcgcgtg gctggggagg cgaggccgga cgcacctctg 240tttgggggtc ctcagagatt
aatgattcat caagggatag ttgtactgtt ctcgtgggaa 300tcacttcatc atgcgaaatc
tgaaattatt tcggaccctg gagttcaggg atattcaagg 360tccagggaat cctcagtgct
tctctctccg aactgaacag gggacggtgc tcattggttc 420agaacatggc ctgatagaag
tagaccctgt ctcaagagaa gtgaaaaatg aagtttcttt 480ggtggcagaa ggctttctcc
cagaggatgg aagtggccgc attgttggtg ttcaggactt 540gctggatcag gagtctgtgt
gtgtggccac agcctctgga gacgtcatac tctgcagtct 600cagcacacaa cagctggagt
gtgttgggag tgtagccagt ggtatctctg ttatgagttg 660gagtcctgac caagagctgg
tgcttcttgc cacaggtcaa cagaccctga ttatgatgac 720aaaagatttt gagccaatcc
tggagcagca gatccatcag gatgattttg gtgaaagcaa 780gtttatcact gttggatggg
gtaggaagga gacacagttc catggatcag aaggcagaca 840agcagctttt cagatgcaaa
tgcatgagtc tgctttgccc tgggatgacc atagaccaca 900agttacctgg cggggggatg
gacagttttt tgctgtgagt gttgtttgcc cagaaacagg 960ggctcggaag gtcagagtgt
ggaaccgaga gtttgctttg cagtcaacca gtgagcctgt 1020ggcaggactg ggaccagccc
tggcttggaa accctcaggc agtttgattg catctacaca 1080agataaaccc aaccagcagg
atattgtgtt ttttgagaaa aatggactcc ttcatggaca 1140ctttacactt cccttcctta
aagatgaggt taaggtaaat gacttgctct ggaatgcaga 1200ttcctctgtg cttgcagtct
ggctggaaga ccttcagaga gaagaaagct ccattccgaa 1260aacctgtgtt cagctctgga
ctgttggaaa ctatcactgg tatctcaagc aaagtttatc 1320cttcagcacc tgtgggaaga
gcaagattgt gtctctgatg tgggaccctg tgaccccata 1380ccggctgcat gttctctgtc
agggctggca ttacctcgcc tatgattggc actggacgac 1440tgaccggagc gtgggagata
attcaagtga cttgtccaat gtggctgtca ttgatggaaa 1500cagggtgttg gtgacagtct
tccggcagac tgtggttccg cctcccatgt gcacctacca 1560actgctgttc ccacaccctg
tgaatcaagt cacattctta gcacaccctc aaaagagtaa 1620tgaccttgct gttctagatg
ccagtaacca gatttctgtt tataaatgtg gtgattgtcc 1680aagtgctgac cctacagtga
aactgggagc tgtgggtgga agtggattta aagtttgcct 1740tagaactcct catttggaaa
agagatacaa aatccagttt gagaataatg aagatcaaga 1800tgtaaacccg ctgaaactag
gccttctcac ttggattgaa gaagacgtct tcctggctgt 1860aagccacagt gagttcagcc
cccggtctgt cattcaccat ttgactgcag cttcttctga 1920gatggatgaa gagcatggac
agctcaatgt cagttcatct gcagcggtgg atggggtcat 1980aatcagtcta tgttgcaatt
ccaagaccaa gtcagtagta ttacagctgg ctgatggcca 2040gatatttaag tacctttggg
agtcaccttc tctggctatt aaaccatgga agaactctgg 2100tggatttcct gttcggtttc
cttatccatg cacccagacc gaattggcca tgattggaga 2160agaggaatgt gtccttggtc
tgactgacag gtgtcgcttt ttcatcaatg acattgaggt 2220tgcgtcaaat atcacgtcat
ttgcagtata tgatgagttt ttattgttga caacccattc 2280ccatacctgc cagtgttttt
gcctgaggga tgcttcattt aaaacattac aggccggcct 2340gagcagcaat catgtgtccc
atggggaagt tctgcggaaa gtggagaggg gttcacggat 2400tgtcactgtt gtgccccagg
acacaaagct tgtattacag atgccaaggg gaaacttaga 2460agttgttcat catcgagccc
tggttttagc tcagattcgg aagtggttgg acaaacttat 2520gtttaaagag gcatttgaat
gcatgagaaa gctgagaatc aatctcaatc tgatttatga 2580tcataaccct aaggtgtttc
ttggaaatgt ggaaaccttc attaaacaga tagattctgt 2640gaatcatatt aacttgtttt
ttacagaatt gaaagaagaa gatgtcacga agaccatgta 2700ccctgcacca gttaccagca
gtgtctacct gtccagggat cctgacggga ataaaataga 2760ccttgtctgc gatgctatga
gagcagtcat ggagagcata aatcctcata aatactgcct 2820atccatactt acatctcatg
taaagaagac aaccccagaa ctggaaattg tactgcaaaa 2880agtacacgag cttcaaggaa
atgctccctc tgatcctgat gctgtgagtg ctgaagaggc 2940cttgaaatat ttgctgcatc
tggtagatgt taatgaatta tatgatcatt ctcttggcac 3000ctatgacttt gatttggtcc
tcatggtagc tgagaagtca cagaaggatc ccaaagaata 3060tcttccattt cttaatacac
ttaagaaaat ggaaactaat tatcagcggt ttactataga 3120caaatacttg aaacgatatg
aaaaagccat tggccacctc agcaaatgtg gacctgagta 3180cttcccagaa tgcttaaact
tgataaaaga taaaaacttg tataacgaag ctctgaagtt 3240atattcacca agctcacaac
agtaccagga tatcagcatt gcttatgggg agcacctgat 3300gcaggagcac atgtatgagc
cagcggggct catgtttgcc cgttgcggtg cccacgagaa 3360agctctctca gcctttctca
catgtggcaa ctggaagcaa gccctctgtg tggcagccca 3420gcttaacttt accaaagacc
agctggtggg cctcggcaga actctggcag gaaagctggt 3480tgagcagagg aagcacattg
atgcggccat ggttttggaa gagagtgccc aggattatga 3540agaagctgtg ctcttgctgt
tagaaggagc tgcctgggaa gaagctttga ggctggtata 3600caaatataac agactggata
ttatagaaac caacgtaaag ccttccattt tagaagccca 3660gaaaaattat atggcatttc
tggactctca gacagccaca ttcagtcgcc acaagaaacg 3720tttattggta gttcgagagc
tcaaggagca agcccagcag gcaggtctgg atgatgaggt 3780accccacggg caagagtcag
acctcttctc tgaaactagc agtgtcgtga gtggcagtga 3840gatgagtggc aaatactccc
atagtaactc caggatatca gcgagatcat ccaagaatcg 3900ccgaaaagcg gagcggaaga
agcacagcct caaagaaggc agtccgctgg aggacctggc 3960cctcctggag gcactgagtg
aagtggtgca gaacactgaa aacctgaaag atgaagtata 4020ccatatttta aaggtactct
ttctctttga gtttgatgaa caaggaaggg aattacagaa 4080ggcctttgaa gatacgctgc
agttgatgga aaggtcactt ccagaaattt ggactcttac 4140ttaccagcag aattcagcta
ccccggttct aggtcccaat tctactgcaa atagtatcat 4200ggcatcttat cagcaacaga
agacttcggt tcctgttctt gatgctgagc tttttatacc 4260accaaagatc aacagaagaa
cccagtggaa gctgagcctg ctagactgag tgactgcagt 4320taggagggat ccgacagaga
agaccatttc cactcattcc tgttgtccta ccaccccttg 4380ctctttgagg gctggctatt
gagaactgga aagagtaaaa tgataactta ccttagcatt 4440gccaagaact tcagcagaca
acaagcaatt ctatttattt tatgttgtgt atacatcttg 4500atcattagca agacattaag
ctttaaccat tatggcacca ttttgtgaga atgattgttc 4560tttcacttgg gctgtttgag
agcataatta tggtaatcat gagattaatg tttcatgatt 4620tctacctcca aagtgtgaag
acaagtaaaa caatgtttct aaattgtctt attttgttgg 4680cggagaagat tacaatggct
attagtgcta catttggtca aatgtaatca cttaaatagc 4740ttcttgtcac cttaaactaa
agcagaataa aaagtatcct ttgaaattat aagccctcct 4800ttgctgacag ctattatttt
gtaacatctt accaggtcat gtgctttcag ttataactgg 4860gctgagcctc ctataattac
aatgtctata gggactgttt tactgcctgt gtattttctg 4920ctagagagtt agcaatgtta
gagctagaac agattagaat ttctaaacag tatcatgcac 4980agttggtgtg agtgatcagt
gtgcattgta tggcatgcat ggttgtgaat tattctctgt 5040tctccaaata ctgtttcttt
aactcagata tttttgttag tgtctaggcc acttcattta 5100tttttcgtca tggtacttta
ctgacttctc tttattcaat tctccacgcc ctcaccaaaa 5160aaaactgtct caaaatgaga
atatttttat tcttcatggt gagtctagaa aacgccccac 5220ttcattctga ttaaaaaatt
cttccatgtt tttaaatatc agaaccagac ctttcttact 5280gtgtatctta gcccatttgt
gtctctataa caacaaccag ctttcaaagg aactaataga 5340gtgaaaactc actcattacc
acgaggatgg cacaagcgat tcacgtagga tctgcccctg 5400tgaccaaaac acctcccatt
gggccccact tccaacactg gtgatcacat ttcaacatga 5460ggtttaggga aacaaatgcc
taaactacag cactgtacat aaactaacag gaaatgctgc 5520ttttgatcct caaagaagtg
atatagccaa aattgtaatt taagaagcct ttgtcagtat 5580agcaagatgt taactataga
atcaatctag gagtattcac tgtaaaattc aacttttctg 5640tatgtttgaa cattttcaca
atctcatagg agtttttaaa aagaagagaa agaagatata 5700ctttgctttg gagaaatcta
ctttttgact tacatgggtt tgctgtaatt aagtgcccaa 5760tattgaaagg ctgcaagtac
tttgtaatca ctctttggca tgggtaaata agcatggtaa 5820cttatattga aatatagtgc
tcttgctttg gataactgta aagggaccca tgctgataga 5880ctggaaatag aagtaaatgt
gtttattgaa aaaaaaaaaa aaaa 5924
SEQ ID NO: 3, length 1332, type PRT, organism Homo sapiens
Met Arg Asn Leu Lys Leu Phe Arg Thr Leu Glu Phe Arg Asp Ile Gln1
5 10 15Gly Pro Gly Asn Pro Gln
Cys Phe Ser Leu Arg Thr Glu Gln Gly Thr20 25
30Val Leu Ile Gly Ser Glu His Gly Leu Ile Glu Val Asp Pro Val Ser35
40 45Arg Glu Val Lys Asn Glu Val Ser Leu
Val Ala Glu Gly Phe Leu Pro50 55 60Glu
Asp Gly Ser Gly Arg Ile Val Gly Val Gln Asp Leu Leu Asp Gln65
70 75 80Glu Ser Val Cys Val Ala
Thr Ala Ser Gly Asp Val Ile Leu Cys Ser85 90
95Leu Ser Thr Gln Gln Leu Glu Cys Val Gly Ser Val Ala Ser Gly Ile100
105 110Ser Val Met Ser Trp Ser Pro Asp
Gln Glu Leu Val Leu Leu Ala Thr115 120
125Gly Gln Gln Thr Leu Ile Met Met Thr Lys Asp Phe Glu Pro Ile Leu130
135 140Glu Gln Gln Ile His Gln Asp Asp Phe
Gly Glu Ser Lys Phe Ile Thr145 150 155
160Val Gly Trp Gly Arg Lys Glu Thr Gln Phe His Gly Ser Glu
Gly Arg165 170 175Gln Ala Ala Phe Gln Met
Gln Met His Glu Ser Ala Leu Pro Trp Asp180 185
190Asp His Arg Pro Gln Val Thr Trp Arg Gly Asp Gly Gln Phe Phe
Ala195 200 205Val Ser Val Val Cys Pro Glu
Thr Gly Ala Arg Lys Val Arg Val Trp210 215
220Asn Arg Glu Phe Ala Leu Gln Ser Thr Ser Glu Pro Val Ala Gly Leu225
230 235 240Gly Pro Ala Leu
Ala Trp Lys Pro Ser Gly Ser Leu Ile Ala Ser Thr245 250
255Gln Asp Lys Pro Asn Gln Gln Asp Ile Val Phe Phe Glu Lys
Asn Gly260 265 270Leu Leu His Gly His Phe
Thr Leu Pro Phe Leu Lys Asp Glu Val Lys275 280
285Val Asn Asp Leu Leu Trp Asn Ala Asp Ser Ser Val Leu Ala Val
Arg290 295 300Leu Glu Asp Leu Gln Arg Glu
Lys Ser Ser Ile Pro Lys Thr Cys Val305 310
315 320Gln Leu Trp Thr Val Gly Asn Tyr His Trp Tyr Leu
Lys Gln Ser Leu325 330 335Ser Phe Ser Thr
Cys Gly Lys Ser Lys Ile Val Ser Leu Met Trp Asp340 345
350Pro Val Thr Pro Tyr Arg Leu His Val Leu Cys Gln Gly Trp
His Tyr355 360 365Leu Ala Tyr Asp Trp His
Trp Thr Thr Asp Arg Ser Val Gly Asp Asn370 375
380Ser Ser Asp Leu Ser Asn Val Ala Val Ile Asp Gly Asn Arg Val
Leu385 390 395 400Val Thr
Val Phe Arg Gln Thr Val Val Pro Pro Pro Met Cys Thr Tyr405
410 415Gln Leu Leu Phe Pro His Pro Val Asn Gln Val Thr
Phe Leu Ala His420 425 430Pro Gln Lys Ser
Asn Asp Leu Ala Val Leu Asp Ala Ser Asn Gln Ile435 440
445Ser Val Tyr Lys Cys Gly Asp Cys Pro Ser Ala Asp Pro Thr
Val Lys450 455 460Leu Gly Ala Val Gly Gly
Ser Gly Phe Lys Val Cys Leu Arg Thr Pro465 470
475 480His Leu Glu Lys Arg Tyr Lys Ile Gln Phe Glu
Asn Asn Glu Asp Gln485 490 495Asp Val Asn
Pro Leu Lys Leu Gly Leu Leu Thr Trp Ile Glu Glu Asp500
505 510Val Phe Leu Ala Val Ser His Ser Glu Phe Ser Pro
Arg Ser Val Ile515 520 525His His Leu Thr
Ala Ala Ser Ser Glu Met Asp Glu Glu His Gly Gln530 535
540Leu Asn Val Ser Ser Ser Ala Ala Val Asp Gly Val Ile Ile
Ser Leu545 550 555 560Cys
Cys Asn Ser Lys Thr Lys Ser Val Val Leu Gln Leu Ala Asp Gly565
570 575Gln Ile Phe Lys Tyr Leu Trp Glu Ser Pro Ser
Leu Ala Ile Lys Pro580 585 590Trp Lys Asn
Ser Gly Gly Phe Pro Val Arg Phe Pro Tyr Pro Cys Thr595
600 605Gln Thr Glu Leu Ala Met Ile Gly Glu Glu Glu Cys
Val Leu Gly Leu610 615 620Thr Asp Arg Cys
Arg Phe Phe Ile Asn Asp Ile Glu Val Ala Ser Asn625 630
635 640Ile Thr Ser Phe Ala Val Tyr Asp Glu
Phe Leu Leu Leu Thr Thr His645 650 655Ser
His Thr Cys Gln Cys Phe Cys Leu Arg Asp Ala Ser Phe Lys Thr660
665 670Leu Gln Ala Gly Leu Ser Ser Asn His Val Ser
His Gly Glu Val Leu675 680 685Arg Lys Val
Glu Arg Gly Ser Arg Ile Val Thr Val Val Pro Gln Asp690
695 700Thr Lys Leu Val Leu Gln Met Pro Arg Gly Asn Leu
Glu Val Val His705 710 715
720His Arg Ala Leu Val Leu Ala Gln Ile Arg Lys Trp Leu Asp Lys Leu725
730 735Met Phe Lys Glu Ala Phe Glu Cys Met
Arg Lys Leu Arg Ile Asn Leu740 745 750Asn
Pro Ile Tyr Asp His Asn Pro Lys Val Phe Leu Gly Asn Val Glu755
760 765Thr Phe Ile Lys Gln Ile Asp Ser Val Asn His
Ile Asn Leu Phe Phe770 775 780Thr Glu Leu
Lys Glu Glu Asp Val Thr Lys Thr Met Tyr Pro Ala Pro785
790 795 800Val Thr Ser Ser Val Tyr Leu
Ser Arg Asp Pro Asp Gly Asn Lys Ile805 810
815Asp Leu Val Cys Asp Ala Met Arg Ala Val Met Glu Ser Ile Asn Pro820
825 830His Lys Tyr Cys Leu Ser Ile Leu Thr
Ser His Val Lys Lys Thr Thr835 840 845Pro
Glu Leu Glu Ile Val Leu Gln Lys Val His Glu Leu Gln Gly Asn850
855 860Ala Pro Ser Asp Pro Asp Ala Val Ser Ala Glu
Glu Ala Leu Lys Tyr865 870 875
880Leu Leu His Leu Val Asp Val Asn Glu Leu Tyr Asp His Ser Leu
Gly885 890 895Thr Tyr Asp Phe Asp Leu Val
Leu Met Val Ala Glu Lys Ser Gln Lys900 905
910Asp Pro Lys Glu Tyr Leu Pro Phe Leu Asn Thr Leu Lys Lys Met Glu915
920 925Thr Asn Tyr Gln Arg Phe Thr Ile Asp
Lys Tyr Leu Lys Arg Tyr Glu930 935 940Lys
Ala Ile Gly His Leu Ser Lys Cys Gly Pro Glu Tyr Phe Pro Glu945
950 955 960Cys Leu Asn Leu Ile Lys
Asp Lys Asn Leu Tyr Asn Glu Ala Leu Lys965 970
975Leu Tyr Ser Pro Ser Ser Gln Gln Tyr Gln Asp Ile Ser Ile Ala
Tyr980 985 990Gly Glu His Leu Met Gln Glu
His Met Tyr Glu Pro Ala Gly Leu Met995 1000
1005Phe Ala Arg Cys Gly Ala His Glu Lys Ala Leu Ser Ala Phe Leu Thr1010
1015 1020Cys Gly Asn Trp Lys Gln Ala Leu Cys
Val Ala Ala Gln Leu Asn Phe1025 1030 1035
1040Thr Lys Asp Gln Leu Val Gly Leu Gly Arg Thr Leu Ala Gly
Lys Leu1045 1050 1055Val Glu Gln Arg Lys
His Ile Asp Ala Ala Met Val Leu Glu Glu Ser1060 1065
1070Ala Gln Asp Tyr Glu Glu Ala Val Leu Leu Leu Leu Glu Gly Ala
Ala1075 1080 1085Trp Glu Glu Ala Leu Arg
Leu Val Tyr Lys Tyr Asn Arg Leu Asp Ile1090 1095
1100Ile Glu Thr Asn Val Lys Pro Ser Ile Leu Glu Ala Gln Lys Asn
Tyr1105 1110 1115 1120Met Ala
Phe Leu Asp Ser Gln Thr Ala Thr Phe Ser Arg His Lys Lys1125
1130 1135Arg Leu Leu Val Val Arg Glu Leu Lys Glu Gln Ala
Gln Gln Ala Gly1140 1145 1150Leu Asp Asp
Glu Val Pro His Gly Gln Glu Ser Asp Leu Phe Ser Glu1155
1160 1165Thr Ser Ser Val Val Ser Gly Ser Glu Met Ser Gly
Lys Tyr Ser His1170 1175 1180Ser Asn Ser
Arg Ile Ser Ala Arg Ser Ser Lys Asn Arg Arg Lys Ala1185
1190 1195 1200Glu Arg Lys Lys His Ser Leu
Lys Glu Gly Ser Pro Leu Glu Asp Leu1205 1210
1215Ala Leu Leu Glu Ala Leu Ser Glu Val Val Gln Asn Thr Glu Asn Leu1220
1225 1230Lys Asp Glu Val Tyr His Ile Leu Lys
Val Leu Phe Leu Phe Glu Phe1235 1240
1245Asp Glu Gln Gly Arg Glu Leu Gln Lys Ala Phe Glu Asp Thr Leu Gln1250
1255 1260Leu Met Glu Arg Ser Leu Pro Glu Ile
Trp Thr Leu Thr Tyr Gln Gln1265 1270 1275
1280Asn Ser Ala Thr Pro Val Leu Gly Pro Asn Ser Thr Ala Asn
Ser Ile1285 1290 1295Met Ala Ser Tyr Gln
Gln Gln Lys Thr Ser Val Pro Val Leu Asp Ala1300 1305
1310Glu Leu Phe Ile Pro Pro Lys Ile Asn Arg Arg Thr Gln Trp Lys
Leu1315 1320 1325Ser Leu Leu
Asp1330
SEQ ID NO: 4, length 1332, type PRT, organism Mus musculus
Met Arg Asn Leu Lys Leu His Arg Thr Leu Glu
Phe Arg Asp Ile Gln1 5 10
15Ala Pro Gly Lys Pro Gln Cys Phe Cys Leu Arg Ala Glu Gln Gly Thr20
25 30Val Leu Ile Gly Ser Glu Arg Gly Leu Thr
Glu Val Asp Pro Val Arg35 40 45Arg Glu
Val Lys Thr Glu Ile Ser Leu Val Ala Glu Gly Phe Leu Pro50
55 60Glu Asp Gly Ser Gly Cys Ile Val Gly Ile Gln Asp
Leu Leu Asp Gln65 70 75
80Glu Ser Val Cys Val Ala Thr Ala Ser Gly Asp Val Ile Val Cys Asn85
90 95Leu Ser Thr Gln Gln Leu Glu Cys Val Gly
Ser Val Ala Ser Gly Ile100 105 110Ser Val
Met Ser Trp Ser Pro Asp Gln Glu Leu Leu Leu Leu Ala Thr115
120 125Ala Gln Gln Thr Leu Ile Met Met Thr Lys Asp Phe
Glu Val Ile Ala130 135 140Glu Glu Gln Ile
His Gln Asp Asp Phe Gly Glu Gly Lys Phe Val Thr145 150
155 160Val Gly Trp Gly Ser Lys Gln Thr Gln
Phe His Gly Ser Glu Gly Arg165 170 175Pro
Thr Ala Phe Pro Val Gln Leu Pro Glu Asn Ala Leu Pro Trp Asp180
185 190Asp Arg Arg Pro His Ile Thr Trp Arg Gly Asp
Gly Gln Tyr Phe Ala195 200 205Val Ser Val
Val Cys Arg Gln Thr Glu Ala Arg Lys Ile Arg Val Trp210
215 220Asn Arg Glu Phe Ala Leu Gln Ser Thr Ser Glu Ser
Val Pro Gly Leu225 230 235
240Gly Pro Ala Leu Ala Trp Lys Pro Ser Gly Ser Leu Ile Ala Ser Thr245
250 255Gln Asp Lys Pro Asn Gln Gln Asp Val
Val Phe Phe Glu Lys Asn Gly260 265 270Leu
Leu His Gly His Phe Thr Leu Pro Phe Leu Lys Asp Glu Val Lys275
280 285Val Asn Asp Leu Leu Trp Asn Ala Asp Ser Ser
Val Leu Ala Ile Trp290 295 300Leu Glu Asp
Leu Pro Lys Glu Asp Ser Ser Thr Leu Lys Ser Tyr Val305
310 315 320Gln Leu Trp Thr Val Gly Asn
Tyr His Trp Tyr Leu Lys Gln Ser Leu325 330
335Pro Phe Ser Thr Thr Gly Lys Asn Gln Ile Val Ser Leu Leu Trp Asp340
345 350Pro Val Thr Pro Cys Arg Leu His Val
Leu Cys Thr Gly Trp Arg Tyr355 360 365Leu
Cys Cys Asp Trp His Trp Thr Thr Asp Arg Ser Ser Gly Asn Ser370
375 380Ala Asn Asp Leu Ala Asn Val Ala Val Ile Asp
Gly Asn Arg Val Leu385 390 395
400Val Thr Val Phe Arg Gln Thr Val Val Pro Pro Pro Met Cys Thr
Tyr405 410 415Arg Leu Leu Ile Pro His Pro
Val Asn Gln Val Ile Phe Ser Ala His420 425
430Leu Gly Asn Asp Leu Ala Val Leu Asp Ala Ser Asn Gln Ile Ser Val435
440 445Tyr Lys Cys Gly Asp Lys Pro Asn Met
Asp Ser Thr Val Lys Leu Gly450 455 460Ala
Val Gly Gly Asn Gly Phe Lys Val Pro Leu Thr Thr Pro His Leu465
470 475 480Glu Lys Arg Tyr Ser Ile
Gln Phe Gly Asn Asn Glu Glu Glu Glu Glu485 490
495Glu Asp Phe Ala Leu Gln Leu Ser Phe Leu Thr Trp Val Glu Asp
Asp500 505 510Thr Phe Leu Ala Ile Ser Tyr
Ser His Ser Ser Ser Gln Ser Ile Ile515 520
525His His Leu Thr Val Thr His Ser Glu Val Asp Glu Glu Gln Gly Gln530
535 540Leu Asp Val Ser Ser Ser Val Thr Val
Asp Gly Val Val Ile Gly Leu545 550 555
560Cys Cys Cys Ser Lys Thr Lys Ser Leu Ala Val Gln Leu Ala
Asp Gly565 570 575Gln Val Leu Lys Ile Leu
Trp Glu Ser Pro Ser Leu Ala Val Glu Pro580 585
590Trp Lys Asn Ser Glu Gly Ile Pro Val Arg Phe Val His Pro Cys
Thr595 600 605Gln Met Glu Val Ala Thr Ile
Gly Gly Glu Glu Cys Val Leu Gly Leu610 615
620Thr Asp Arg Cys Arg Phe Phe Ile Leu Val Thr Glu Val Ala Ser Asn625
630 635 640Ile Thr Ser Phe
Ala Val Cys Asp Asp Phe Leu Leu Val Thr Thr His645 650
655Ser His Thr Cys Gln Gly Phe Ser Leu Ser Gly Ala Ser Leu
Lys Met660 665 670Leu Gln Ala Ala Leu Ser
Gly Ser His Glu Ala Ser Gly Glu Ile Leu675 680
685Arg Lys Val Val Trp Gly Ser Arg Ile Val Thr Val Val Pro Gln
Asp690 695 700Thr Lys Leu Ile Leu Gln Met
Pro Arg Gly Asn Leu Glu Val Val His705 710
715 720His Arg Ala Leu Val Leu Ala Gln Ile Arg Lys Trp
Leu Asp Lys Leu725 730 735Met Phe Lys Glu
Ala Phe Glu Cys Met Arg Lys Leu Arg Ile Asn Leu740 745
750Asn Leu Ile His Asp His Asn Pro Lys Val Phe Leu Glu Asn
Val Glu755 760 765Thr Phe Val Phe Gln Ile
Asp Ser Val Asn His Ile Asn Leu Phe Phe770 775
780Thr Glu Leu Arg Glu Glu Asp Val Thr Lys Thr Met Tyr Pro Pro
Pro785 790 795 800Ile Thr
Lys Ser Val Gln Val Ser Thr His Pro Asp Gly Lys Lys Leu805
810 815Asp Leu Ile Cys Asp Ala Met Arg Ala Ala Met Glu
Ala Ile Asn Pro820 825 830Arg Lys Phe Cys
Leu Ser Ile Leu Thr Ser His Val Lys Lys Thr Thr835 840
845Pro Glu Leu Glu Ile Val Leu Gln Lys Val Gln Glu Leu Gln
Gly Asn850 855 860Leu Pro Phe Asp Pro Glu
Ser Val Ser Val Glu Glu Ala Leu Lys Tyr865 870
875 880Leu Leu Leu Leu Val Asp Val Asn Glu Leu Phe
Asn His Ser Leu Gly885 890 895Thr Tyr Asp
Phe Asn Leu Val Leu Met Val Ala Glu Lys Ser Gln Lys900
905 910Asp Pro Lys Glu Tyr Leu Pro Phe Leu Asn Thr Leu
Lys Lys Met Glu915 920 925Thr Asn Tyr Gln
Arg Phe Thr Ile Asp Lys Tyr Leu Lys Arg Tyr Glu930 935
940Lys Ala Leu Gly His Leu Ser Lys Cys Gly Pro Glu Tyr Phe
Thr Glu945 950 955 960Cys
Leu Asn Leu Ile Lys Asp Lys Asn Leu Tyr Lys Glu Ala Leu Lys965
970 975Leu Tyr Arg Pro Asp Ser Pro Gln Tyr Gln Ala
Val Ser Met Ala Tyr980 985 990Gly Glu His
Leu Met Gln Glu His Leu Tyr Glu Pro Ala Gly Leu Val995
1000 1005Phe Ala Arg Cys Gly Ala Gln Glu Lys Ala Leu Glu
Ala Phe Leu Ala1010 1015 1020Cys Gly Ser
Trp Gln Gln Ala Leu Cys Val Ala Ala Gln Leu Gln Met1025
1030 1035 1040Ser Lys Asp Lys Val Ala Gly
Leu Ala Arg Thr Leu Ala Gly Lys Leu1045 1050
1055Val Glu Gln Arg Lys His Ser Glu Ala Ala Thr Val Leu Glu Gln Tyr1060
1065 1070Ala Gln Asp Tyr Glu Glu Ala Val Leu
Leu Leu Leu Glu Gly Ser Ala1075 1080
1085Trp Glu Glu Ala Leu Arg Leu Val Tyr Lys Tyr Asp Arg Val Asp Ile1090
1095 1100Ile Glu Thr Ser Ile Lys Pro Ser Ile
Leu Glu Ala Gln Lys Asn Tyr1105 1110 1115
1120Met Asp Phe Leu Asp Ser Glu Thr Ala Thr Phe Ile Arg His
Lys Asn1125 1130 1135Arg Leu Gln Val Val
Arg Ala Leu Arg Arg Gln Ala Pro Gln Val His1140 1145
1150Val Asp His Glu Val Ala His Gly Pro Glu Ser Asp Leu Phe Ser
Glu1155 1160 1165Thr Ser Ser Ile Met Ser
Gly Ser Glu Met Ser Gly Arg Tyr Ser His1170 1175
1180Ser Asn Ser Arg Ile Ser Ala Arg Ser Ser Lys Asn Arg Arg Lys
Ala1185 1190 1195 1200Glu Arg
Lys Lys His Ser Leu Lys Glu Gly Ser Pro Leu Glu Gly Leu1205
1210 1215Ala Leu Leu Glu Ala Leu Ser Glu Val Val Gln Ser
Val Glu Lys Leu1220 1225 1230Lys Asp Glu
Val Arg Ala Ile Leu Lys Val Leu Phe Leu Phe Glu Phe1235
1240 1245Glu Glu Gln Ala Lys Glu Leu Gln Arg Ala Phe Glu
Ser Thr Leu Gln1250 1255 1260Leu Met Glu
Arg Ala Val Pro Glu Ile Trp Thr Pro Ala Gly Gln Gln1265
1270 1275 1280Ser Ser Thr Thr Pro Val Leu
Gly Pro Ser Ser Thr Ala Asn Ser Ile1285 1290
1295Thr Ala Ser Tyr Gln Gln Gln Lys Thr Cys Val Pro Ala Leu Asp Ala1300
1305 1310Gly Val Tyr Met Pro Pro Lys Met Asp
Pro Arg Ser Gln Trp Lys Leu1315 1320
1325Ser Leu Leu Glu1330
SEQ ID NO: 5, length 1332, type PRT, organism Homo sapiens
Met Arg Asn Leu Lys Leu Phe
Arg Thr Leu Glu Phe Arg Asp Ile Gln1 5 10
15Gly Pro Gly Asn Pro Gln Cys Phe Ser Leu Arg Thr Glu
Gln Gly Thr20 25 30Val Leu Ile Gly Ser
Glu His Gly Leu Ile Glu Val Asp Pro Val Ser35 40
45Arg Glu Val Lys Asn Glu Val Ser Leu Val Ala Glu Gly Phe Leu
Pro50 55 60Glu Asp Gly Ser Gly Arg Ile
Val Gly Val Gln Asp Leu Leu Asp Gln65 70
75 80Glu Ser Val Cys Val Ala Thr Ala Ser Gly Asp Val
Ile Leu Cys Ser85 90 95Leu Ser Thr Gln
Gln Leu Glu Cys Val Gly Ser Val Ala Ser Gly Ile100 105
110Ser Val Met Ser Trp Ser Pro Asp Gln Glu Leu Val Leu Leu
Ala Thr115 120 125Gly Gln Gln Thr Leu Ile
Met Met Thr Lys Asp Phe Glu Pro Ile Leu130 135
140Glu Gln Gln Ile His Gln Asp Asp Phe Gly Glu Ser Lys Phe Ile
Thr145 150 155 160Val Gly
Trp Gly Arg Lys Glu Thr Gln Phe His Gly Ser Glu Gly Arg165
170 175Gln Ala Ala Phe Gln Met Gln Met His Glu Ser Ala
Leu Pro Trp Asp180 185 190Asp His Arg Pro
Gln Val Thr Trp Arg Gly Asp Gly Gln Phe Phe Ala195 200
205Val Ser Val Val Cys Pro Glu Thr Gly Ala Arg Lys Val Arg
Val Trp210 215 220Asn Arg Glu Phe Ala Leu
Gln Ser Thr Ser Glu Pro Val Ala Gly Leu225 230
235 240Gly Pro Ala Leu Ala Trp Lys Pro Ser Gly Ser
Leu Ile Ala Ser Thr245 250 255Gln Asp Lys
Pro Asn Gln Gln Asp Ile Val Phe Phe Glu Lys Asn Gly260
265 270Leu Leu His Gly His Phe Thr Leu Pro Phe Leu Lys
Asp Glu Val Lys275 280 285Val Asn Asp Leu
Leu Trp Asn Ala Asp Ser Ser Val Leu Ala Val Trp290 295
300Leu Glu Asp Leu Gln Arg Glu Glu Ser Ser Ile Pro Lys Thr
Cys Val305 310 315 320Gln
Leu Trp Thr Val Gly Asn Tyr His Trp Tyr Leu Lys Gln Ser Leu325
330 335Ser Phe Ser Thr Cys Gly Lys Ser Lys Ile Val
Ser Leu Met Trp Asp340 345 350Pro Val Thr
Pro Tyr Arg Leu His Val Leu Cys Gln Gly Trp His Tyr355
360 365Leu Ala Tyr Asp Trp His Trp Thr Thr Asp Arg Ser
Val Gly Asp Asn370 375 380Ser Ser Asp Leu
Ser Asn Val Ala Val Ile Asp Gly Asn Arg Val Leu385 390
395 400Val Thr Val Phe Arg Gln Thr Val Val
Pro Pro Pro Met Cys Thr Tyr405 410 415Gln
Leu Leu Phe Pro His Pro Val Asn Gln Val Thr Phe Leu Ala His420
425 430Pro Gln Lys Ser Asn Asp Leu Ala Val Leu Asp
Ala Ser Asn Gln Ile435 440 445Ser Val Tyr
Lys Cys Gly Asp Cys Pro Ser Ala Asp Pro Thr Val Lys450
455 460Leu Gly Ala Val Gly Gly Ser Gly Phe Lys Val Cys
Leu Arg Thr Pro465 470 475
480His Leu Glu Lys Arg Tyr Lys Ile Gln Phe Glu Asn Asn Glu Asp Gln485
490 495Asp Val Asn Pro Leu Lys Leu Gly Leu
Leu Thr Trp Ile Glu Glu Asp500 505 510Val
Phe Leu Ala Val Ser His Ser Glu Phe Ser Pro Arg Ser Val Ile515
520 525His His Leu Thr Ala Ala Ser Ser Glu Met Asp
Glu Glu His Gly Gln530 535 540Leu Asn Val
Ser Ser Ser Ala Ala Val Asp Gly Val Ile Ile Ser Leu545
550 555 560Cys Cys Asn Ser Lys Thr Lys
Ser Val Val Leu Gln Leu Ala Asp Gly565 570
575Gln Ile Phe Lys Tyr Leu Trp Glu Ser Pro Ser Leu Ala Ile Lys Pro580
585 590Trp Lys Asn Ser Gly Gly Phe Pro Val
Arg Phe Pro Tyr Pro Cys Thr595 600 605Gln
Thr Glu Leu Ala Met Ile Gly Glu Glu Glu Cys Val Leu Gly Leu610
615 620Thr Asp Arg Cys Arg Phe Phe Ile Asn Asp Ile
Glu Val Ala Ser Asn625 630 635
640Ile Thr Ser Phe Ala Val Tyr Asp Glu Phe Leu Leu Leu Thr Thr
His645 650 655Ser His Thr Cys Gln Cys Phe
Cys Leu Arg Asp Ala Ser Phe Lys Thr660 665
670Leu Gln Ala Gly Leu Ser Ser Asn His Val Ser His Gly Glu Val Leu675
680 685Arg Lys Val Glu Arg Gly Ser Arg Ile
Val Thr Val Val Pro Gln Asp690 695 700Thr
Lys Leu Val Leu Gln Met Pro Arg Gly Asn Leu Glu Val Val His705
710 715 720His Arg Ala Leu Val Leu
Ala Gln Ile Arg Lys Trp Leu Asp Lys Leu725 730
735Met Phe Lys Glu Ala Phe Glu Cys Met Arg Lys Leu Arg Ile Asn
Leu740 745 750Asn Leu Ile Tyr Asp His Asn
Pro Lys Val Phe Leu Gly Asn Val Glu755 760
765Thr Phe Ile Lys Gln Ile Asp Ser Val Asn His Ile Asn Leu Phe Phe770
775 780Thr Glu Leu Lys Glu Glu Asp Val Thr
Lys Thr Met Tyr Pro Ala Pro785 790 795
800Val Thr Ser Ser Val Tyr Leu Ser Arg Asp Pro Asp Gly Asn
Lys Ile805 810 815Asp Leu Val Cys Asp Ala
Met Arg Ala Val Met Glu Ser Ile Asn Pro820 825
830His Lys Tyr Cys Leu Ser Ile Leu Thr Ser His Val Lys Lys Thr
Thr835 840 845Pro Glu Leu Glu Ile Val Leu
Gln Lys Val His Glu Leu Gln Gly Asn850 855
860Ala Pro Ser Asp Pro Asp Ala Val Ser Ala Glu Glu Ala Leu Lys Tyr865
870 875 880Leu Leu His Leu
Val Asp Val Asn Glu Leu Tyr Asp His Ser Leu Gly885 890
895Thr Tyr Asp Phe Asp Leu Val Leu Met Val Ala Glu Lys Ser
Gln Lys900 905 910Asp Pro Lys Glu Tyr Leu
Pro Phe Leu Asn Thr Leu Lys Lys Met Glu915 920
925Thr Asn Tyr Gln Arg Phe Thr Ile Asp Lys Tyr Leu Lys Arg Tyr
Glu930 935 940Lys Ala Ile Gly His Leu Ser
Lys Cys Gly Pro Glu Tyr Phe Pro Glu945 950
955 960Cys Leu Asn Leu Ile Lys Asp Lys Asn Leu Tyr Asn
Glu Ala Leu Lys965 970 975Leu Tyr Ser Pro
Ser Ser Gln Gln Tyr Gln Asp Ile Ser Ile Ala Tyr980 985
990Gly Glu His Leu Met Gln Glu His Met Tyr Glu Pro Ala Gly
Leu Met995 1000 1005Phe Ala Arg Cys Gly Ala
His Glu Lys Ala Leu Ser Ala Phe Leu Thr1010 1015
1020Cys Gly Asn Trp Lys Gln Ala Leu Cys Val Ala Ala Gln Leu Asn
Phe1025 1030 1035 1040Thr Lys
Asp Gln Leu Val Gly Leu Gly Arg Thr Leu Ala Gly Lys Leu1045
1050 1055Val Glu Gln Arg Lys His Ile Asp Ala Ala Met Val
Leu Glu Glu Ser1060 1065 1070Ala Gln Asp
Tyr Glu Glu Ala Val Leu Leu Leu Leu Glu Gly Ala Ala1075
1080 1085Trp Glu Glu Ala Leu Arg Leu Val Tyr Lys Tyr Asn
Arg Leu Asp Ile1090 1095 1100Ile Glu Thr
Asn Val Lys Pro Ser Ile Leu Glu Ala Gln Lys Asn Tyr1105
1110 1115 1120Met Ala Phe Leu Asp Ser Gln
Thr Ala Thr Phe Ser Arg His Lys Lys1125 1130
1135Arg Leu Leu Val Val Arg Glu Leu Lys Glu Gln Ala Gln Gln Ala Gly1140
1145 1150Leu Asp Asp Glu Val Pro His Gly Gln
Glu Ser Asp Leu Phe Ser Glu1155 1160
1165Thr Ser Ser Val Val Ser Gly Ser Glu Met Ser Gly Lys Tyr Ser His1170
1175 1180Ser Asn Ser Arg Ile Ser Ala Arg Ser
Ser Lys Asn Arg Arg Lys Ala1185 1190 1195
1200Glu Arg Lys Lys His Ser Leu Lys Glu Gly Ser Pro Leu Glu
Asp Leu1205 1210 1215Ala Leu Leu Glu Ala
Leu Ser Glu Val Val Gln Asn Thr Glu Asn Leu1220 1225
1230Lys Asp Glu Val Tyr His Ile Leu Lys Val Leu Phe Leu Phe Glu
Phe1235 1240 1245Asp Glu Gln Gly Arg Glu
Leu Gln Lys Ala Phe Glu Asp Thr Leu Gln1250 1255
1260Leu Met Glu Arg Ser Leu Pro Glu Ile Trp Thr Leu Thr Tyr Gln
Gln1265 1270 1275 1280Asn Ser
Ala Thr Pro Val Leu Gly Pro Asn Ser Thr Ala Asn Ser Ile1285
1290 1295Met Ala Ser Tyr Gln Gln Gln Lys Thr Ser Val Pro
Val Leu Asp Ala1300 1305 1310Glu Leu Phe
Ile Pro Pro Lys Ile Asn Arg Arg Thr Gln Trp Lys Leu1315
1320 1325Ser Leu Leu Asp1330
SEQ ID NO: 6, length 1213, type PRT, organism Drosophila melanogaster
Met Arg Asn Leu Lys Leu Arg Tyr Cys Lys Glu Leu Asn Ala Val
Ala1 5 10 15His Pro Gln
His Leu Leu Leu Gln Pro Glu Leu Asn Gly Gly Ala Ser20 25
30Asp Ile Tyr Phe Val Val Ala Asp Asn Lys Thr Tyr Ala
Val Gln Glu35 40 45Ser Gly Asp Val Arg
Leu Lys Val Ile Ala Asp Leu Pro Asp Ile Val50 55
60Gly Val Glu Phe Leu Gln Leu Asp Asn Ala Ile Cys Val Ala Ser
Gly65 70 75 80Ala Gly
Glu Val Ile Leu Val Asp Pro Gln Thr Gly Ala Thr Ser Glu85
90 95Gly Thr Phe Cys Asp Val Gly Ile Glu Ser Met Ala
Trp Ser Pro Asn100 105 110Gln Glu Val Val
Ala Phe Val Thr Arg Thr His Asn Val Val Leu Met115 120
125Thr Ser Thr Phe Asp Val Ile Ala Glu Gln Pro Leu Asp Ala
Glu Leu130 135 140Asp Pro Asp Gln Gln Phe
Val Asn Val Gly Trp Gly Lys Lys Glu Thr145 150
155 160Gln Phe His Gly Ser Glu Gly Lys Gln Ala Ala
Lys Gln Lys Glu Ser165 170 175Asp Ser Thr
Phe Thr Arg Asp Glu Gln Glu Leu Asn Gln Asp Val Ser180
185 190Ile Ser Trp Arg Gly Asp Gly Glu Phe Phe Val Val
Ser Tyr Val Ala195 200 205Ala Gln Leu Gly
Arg Thr Phe Lys Val Tyr Asp Ser Glu Gly Lys Leu210 215
220Asn His Thr Ala Glu Lys Ser Ala Asn Leu Lys Asp Ser Val
Val Trp225 230 235 240Arg
Pro Thr Gly Asn Trp Ile Ala Val Pro Gln Gln Phe Pro Asn Lys245
250 255Ser Thr Ile Ala Leu Phe Glu Lys Asn Gly Leu
Arg His Arg Glu Leu260 265 270Val Leu Pro
Phe Asp Leu Gln Glu Glu Pro Val Val Gln Leu Arg Trp275
280 285Ser Glu Asp Ser Asp Ile Leu Ala Ile Arg Thr Cys
Ala Lys Glu Glu290 295 300Gln Arg Val Tyr
Leu Tyr Thr Ile Gly Asn Tyr His Trp Tyr Leu Lys305 310
315 320Gln Val Leu Ile Phe Glu Gln Ala Asp
Pro Leu Ala Leu Leu His Trp325 330 335Asp
Thr Arg Cys Gly Ala Glu His Thr Leu His Val Leu Lys Glu Ser340
345 350Gly Lys His Leu Val Tyr Arg Trp Ala Phe Ala
Val Asp Arg Asn Asn355 360 365Ser Ile Val
Gly Val Ile Asp Gly Lys Arg Leu Leu Leu Thr Asp Phe370
375 380Asp Glu Ala Ile Val Pro Pro Pro Met Ser Lys Glu
Leu Gln Lys Pro385 390 395
400Ile Met Leu Met Pro Asp Ala Glu Leu Ser Gly Leu His Leu Ala Asn405
410 415Leu Thr His Phe Ser Pro His Tyr Leu
Leu Ala Thr His Ser Ser Ala420 425 430Gly
Ser Thr Arg Leu Leu Leu Leu Ser Tyr Lys Asp Asn Asp Asn Lys435
440 445Pro Gly Glu Trp Phe Tyr Arg Val His Ser Ser
Val Arg Ile Asn Gly450 455 460Leu Val Asn
Ala Val Ala Val Ala Pro Tyr Ala Met Asn Glu Phe Tyr465
470 475 480Val Gln Thr Val Asn Asn Gly
His Thr Tyr Glu Val Ser Leu Lys Ala485 490
495Asp Lys Thr Leu Lys Val Glu Arg Ser Tyr Val Gln Leu His Glu Pro500
505 510Ala Asp Gln Ile Asp Trp Val Ile Val
Lys Gly Cys Ile Trp Asp Gly515 520 525Tyr
Thr Gly Ala Leu Val Thr Leu Arg Asn Gln His Leu Leu His Ile530
535 540Asp Gly Tyr Arg Ile Gly Glu Asp Val Thr Ser
Phe Cys Val Val Thr545 550 555
560Asn Tyr Leu Val Tyr Thr Gln Leu Asn Ala Met His Phe Val Gln
Leu565 570 575Asp Asp Arg Arg Gln Val Ala
Ser Arg Asn Ile Glu Arg Gly Ala Lys580 585
590Ile Val Thr Ala Val Ala Arg Lys Ala Arg Val Val Leu Gln Leu Pro595
600 605Arg Gly Asn Leu Glu Ala Ile Cys Pro
Arg Val Leu Val Leu Glu Leu610 615 620Val
Gly Asp Leu Leu Glu Arg Gly Lys Tyr Gln Lys Ala Ile Glu Met625
630 635 640Ser Arg Lys Gln Arg Ile
Asn Leu Asn Ile Ile Phe Asp His Asp Val645 650
655Lys Arg Phe Val Ser Ser Val Gly Ala Phe Leu Asn Asp Ile Asn
Glu660 665 670Pro Gln Trp Leu Cys Leu Phe
Leu Ser Glu Leu Gln Asn Glu Asp Phe675 680
685Thr Lys Gly Met Tyr Ser Ser Asn Tyr Asp Ala Ser Lys Gln Thr Tyr690
695 700Pro Ser Asp Tyr Arg Val Asp Gln Lys
Val Phe Tyr Val Cys Arg Leu705 710 715
720Leu Glu Gln Gln Met Asn Arg Phe Val Ser Arg Phe Arg Leu
Pro Leu725 730 735Ile Thr Ala Tyr Val Lys
Leu Gly Cys Leu Glu Met Ala Leu Gln Val740 745
750Ile Trp Lys Glu Gln Gln Glu Asp Ala Ser Leu Ala Asp Gln Leu
Leu755 760 765Gln His Leu Leu Tyr Leu Val
Asp Val Asn Asp Leu Tyr Asn Val Ala770 775
780Leu Gly Thr Tyr Asp Phe Gly Leu Val Leu Phe Val Ala Gln Lys Ser785
790 795 800Gln Lys Asp Pro
Lys Glu Phe Leu Pro Tyr Leu Asn Asp Leu Lys Ala805 810
815Leu Pro Ile Asp Tyr Arg Lys Phe Arg Ile Asp Asp His Leu
Lys Arg820 825 830Tyr Thr Ser Ala Leu Ser
His Leu Ala Ala Cys Gly Glu Gln His Tyr835 840
845Glu Glu Ala Leu Glu Tyr Ile Arg Lys His Gly Leu Tyr Thr Asp
Gly850 855 860Leu Ala Phe Tyr Arg Glu His
Ile Glu Phe Gln Lys Asn Ile Tyr Val865 870
875 880Ala Tyr Ala Asp His Leu Arg Ala Ile Ala Lys Leu
Asp Asn Ala Ser885 890 895Leu Met Tyr Glu
Arg Gly Gly Gln Leu Gln Gln Ala Leu Leu Ser Ala900 905
910Lys His Thr Leu Asp Trp Gln Arg Val Leu Val Leu Ala Lys
Lys Leu915 920 925Ser Glu Pro Leu Asp Gln
Val Ala Gln Ser Leu Val Gly Pro Leu Gln930 935
940Gln Gln Gly Arg His Met Glu Ala Tyr Glu Leu Val Lys Glu His
Cys945 950 955 960Gln Asp
Arg Lys Arg Gln Phe Asp Val Leu Leu Glu Gly His Leu Tyr965
970 975Ser Arg Ala Ile Tyr Glu Ala Gly Leu Glu Asp Asp
Asp Val Ser Glu980 985 990Lys Ile Ala Pro
Ala Leu Leu Ala Tyr Gly Val Gln Leu Glu Ser Ser995 1000
1005Leu Gln Ala Asp Leu Gln Leu Phe Leu Asp Tyr Lys Gln Arg
Leu Leu1010 1015 1020Asp Ile Arg Arg Asn
Gln Ala Lys Ser Gly Glu Gly Tyr Ile Asp Thr1025 1030
1035 1040Asp Val Asn Leu Lys Glu Val Asp Leu Leu
Ser Asp Thr Thr Ser Leu1045 1050 1055His
Ser Ser Gln Tyr Ser Gly Thr Ser Arg Arg Thr Gly Lys Thr Phe1060
1065 1070Arg Ser Ser Lys Asn Arg Arg Lys His Glu Arg
Lys Leu Phe Ser Leu1075 1080 1085Lys Pro
Gly Asn Pro Phe Glu Asp Ile Ala Leu Ile Asp Ala Leu His1090
1095 1100Asn His Val Thr Lys Ile Ala Gln Gln Gln Gln Pro
Val Arg Asp Thr1105 1110 1115
1120Cys Lys Ala Leu Leu Gln Leu Ala Asn Ala Ala Asp Ala Asp Pro Leu1125
1130 1135Ala Ala Ala Leu Gln Arg Glu Phe Lys
Thr Leu Leu Gln Ala Val Asp1140 1145
1150Ala Ala Leu Asp Glu Ile Trp Thr Pro Glu Leu Arg Gly Asn Gly Leu1155
1160 1165Met Ala Asp His Leu Thr Gly Pro Asn
Val Asp Tyr Leu Ala Leu Gln1170 1175
1180Lys Glu Gln Arg Tyr Ala Leu Leu Ser Pro Leu Lys Arg Phe Lys Pro1185
1190 1195 1200Gln Leu Ile Met
Met Asp Trp Gln His Glu Ile Leu Gln1205
1210
SEQ ID NO: 7 (length 1349, PRT, Saccharomyces cerevisiae):
Met Val Glu His Asp Lys Ser Gly Ser
Lys Arg Gln Glu Leu Arg Ser1 5 10
15Asn Met Arg Asn Leu Ile Thr Leu Asn Lys Gly Lys Phe Lys Pro
Thr20 25 30Ala Ser Thr Ala Glu Gly Asp
Glu Asp Asp Leu Ser Phe Thr Leu Leu35 40
45Asp Ser Val Phe Asp Thr Leu Ser Asp Ser Ile Thr Cys Val Leu Gly50
55 60Ser Thr Asp Ile Gly Ala Ile Glu Val Gln
Gln Phe Met Lys Asp Gly65 70 75
80Ser Arg Asn Val Leu Ala Ser Phe Asn Ile Gln Thr Phe Asp Asp
Lys85 90 95Leu Leu Ser Phe Val His Phe
Ala Asp Ile Asn Gln Leu Val Phe Val100 105
110Phe Glu Gln Gly Asp Ile Ile Thr Ala Thr Tyr Asp Pro Val Ser Leu115
120 125Asp Pro Ala Glu Thr Leu Ile Glu Ile
Met Gly Thr Ile Asp Asn Gly130 135 140Ile
Ala Ala Ala Gln Trp Ser Tyr Asp Glu Glu Thr Leu Ala Met Val145
150 155 160Thr Lys Asp Arg Asn Val
Val Val Leu Ser Lys Leu Phe Glu Pro Ile165 170
175Ser Glu Tyr His Leu Glu Val Asp Asp Leu Lys Ile Ser Lys His
Val180 185 190Thr Val Gly Trp Gly Lys Lys
Glu Thr Gln Phe Arg Gly Lys Gly Ala195 200
205Arg Ala Met Glu Arg Glu Ala Leu Ala Ser Leu Lys Ala Ser Gly Leu210
215 220Val Gly Asn Gln Leu Arg Asp Pro Thr
Met Pro Tyr Met Val Asp Thr225 230 235
240Gly Asp Val Thr Ala Leu Asp Ser His Glu Ile Thr Ile Ser
Trp Arg245 250 255Gly Asp Cys Asp Tyr Phe
Ala Val Ser Ser Val Glu Glu Val Pro Asp260 265
270Glu Asp Asp Glu Thr Lys Ser Ile Lys Arg Arg Ala Phe Arg Val
Phe275 280 285Ser Arg Glu Gly Gln Leu Asp
Ser Ala Ser Glu Pro Val Thr Gly Met290 295
300Glu His Gln Leu Ser Trp Lys Pro Gln Gly Ser Leu Ile Ala Ser Ile305
310 315 320Gln Arg Lys Thr
Asp Leu Gly Glu Glu Asp Ser Val Asp Val Ile Phe325 330
335Phe Glu Arg Asn Gly Leu Arg His Gly Glu Phe Asp Thr Arg
Leu Pro340 345 350Leu Asp Glu Lys Val Glu
Ser Val Cys Trp Asn Ser Asn Ser Glu Ala355 360
365Leu Ala Val Val Leu Ala Asn Arg Ile Gln Leu Trp Thr Ser Lys
Asn370 375 380Tyr His Trp Tyr Leu Lys Gln
Glu Leu Tyr Ala Ser Asp Ile Ser Tyr385 390
395 400Val Lys Trp His Pro Glu Lys Asp Phe Thr Leu Met
Phe Ser Asp Ala405 410 415Gly Phe Ile Asn
Ile Val Asp Phe Ala Tyr Lys Met Ala Gln Gly Pro420 425
430Thr Leu Glu Pro Phe Asp Asn Gly Thr Ser Leu Val Val Asp
Gly Arg435 440 445Thr Val Asn Ile Thr Pro
Leu Ala Leu Ala Asn Val Pro Pro Pro Met450 455
460Tyr Tyr Arg Asp Phe Glu Thr Pro Gly Asn Val Leu Asp Val Ala
Cys465 470 475 480Ser Phe
Ser Asn Glu Ile Tyr Ala Ala Ile Asn Lys Asp Val Leu Ile485
490 495Phe Ala Ala Val Pro Ser Ile Glu Glu Met Lys Lys
Gly Lys His Pro500 505 510Ser Ile Val Cys
Glu Phe Pro Lys Ser Glu Phe Thr Ser Glu Val Asp515 520
525Ser Leu Arg Gln Val Ala Phe Ile Asn Asp Ser Ile Val Gly
Val Leu530 535 540Leu Asp Thr Asp Asn Leu
Ser Arg Ile Ala Leu Leu Asp Ile Gln Asp545 550
555 560Ile Thr Gln Pro Thr Leu Ile Thr Ile Val Glu
Val Tyr Asp Lys Ile565 570 575Val Leu Leu
Ser Ser Asp Phe Asp Tyr Asn His Leu Val Tyr Glu Thr580
585 590Arg Asp Gly Thr Val Cys Gln Leu Asp Ala Glu Gly
Gln Leu Met Glu595 600 605Ile Thr Lys Phe
Pro Gln Leu Val Arg Asp Phe Arg Val Lys Arg Val610 615
620His Asn Thr Ser Ala Glu Asp Asp Asp Asn Trp Ser Ala Glu
Ser Ser625 630 635 640Glu
Leu Val Ala Phe Gly Ile Thr Asn Asn Gly Lys Leu Phe Ala Asn645
650 655Gln Val Leu Leu Ala Ser Ala Val Thr Ser Leu
Glu Ile Thr Asp Ser660 665 670Phe Leu Leu
Phe Thr Thr Ala Gln His Asn Leu Gln Phe Val His Leu675
680 685Asn Ser Thr Asp Phe Lys Pro Leu Pro Leu Val Glu
Glu Gly Val Glu690 695 700Asp Glu Arg Val
Arg Ala Ile Glu Arg Gly Ser Ile Leu Val Ser Val705 710
715 720Ile Pro Ser Lys Arg Ser Val Val Leu
Gln Ala Thr Arg Gly Asn Leu725 730 735Glu
Thr Ile Tyr Pro Arg Ile Met Val Leu Ala Glu Val Arg Lys Asn740
745 750Ile Met Ala Lys Arg Tyr Lys Glu Ala Phe Ile
Val Cys Arg Thr His755 760 765Arg Ile Asn
Leu Asp Ile Leu His Asp Tyr Ala Pro Glu Leu Phe Ile770
775 780Glu Asn Leu Glu Val Phe Ile Asn Gln Ile Gly Arg
Val Asp Tyr Leu785 790 795
800Asn Leu Phe Ile Ser Cys Leu Ser Glu Asp Asp Val Thr Lys Thr Lys805
810 815Tyr Lys Glu Thr Leu Tyr Ser Gly Ile
Ser Lys Ser Phe Gly Met Glu820 825 830Pro
Ala Pro Leu Thr Glu Met Gln Ile Tyr Met Lys Lys Lys Met Phe835
840 845Asp Pro Lys Thr Ser Lys Val Asn Lys Ile Cys
Asp Ala Val Leu Asn850 855 860Val Leu Leu
Ser Asn Pro Glu Tyr Lys Lys Lys Tyr Leu Gln Thr Ile865
870 875 880Ile Thr Ala Tyr Ala Ser Gln
Asn Pro Gln Asn Leu Ser Ala Ala Leu885 890
895Lys Leu Ile Ser Glu Leu Glu Asn Ser Glu Glu Lys Asp Ser Cys Val900
905 910Thr Tyr Leu Cys Phe Leu Gln Asp Val
Asn Val Val Tyr Lys Ser Ala915 920 925Leu
Ser Leu Tyr Asp Val Ser Leu Ala Leu Leu Val Ala Gln Lys Ser930
935 940Gln Met Asp Pro Arg Glu Tyr Leu Pro Phe Leu
Gln Glu Leu Gln Asp945 950 955
960Asn Glu Pro Leu Arg Arg Lys Phe Leu Ile Asp Asp Tyr Leu Gly
Asn965 970 975Tyr Glu Lys Ala Leu Glu His
Leu Ser Glu Ile Asp Lys Asp Gly Asn980 985
990Val Ser Glu Glu Val Ile Asp Tyr Val Glu Ser His Asp Leu Tyr Lys995
1000 1005His Gly Leu Ala Leu Tyr Arg Tyr Asp
Ser Glu Lys Gln Asn Val Ile1010 1015
1020Tyr Asn Ile Tyr Ala Lys His Leu Ser Ser Asn Gln Met Tyr Thr Asp1025
1030 1035 1040Ala Ala Val Ala
Tyr Glu Met Leu Gly Lys Leu Lys Glu Ala Met Gly1045 1050
1055Ala Tyr Gln Ser Ala Lys Arg Trp Arg Glu Ala Met Ser Ile
Ala Val1060 1065 1070Gln Lys Phe Pro Glu
Glu Val Glu Ser Val Ala Glu Glu Leu Ile Ser1075 1080
1085Ser Leu Thr Phe Glu His Arg Tyr Val Asp Ala Ala Asp Ile Gln
Leu1090 1095 1100Glu Tyr Leu Asp Asn Val
Lys Glu Ala Val Ala Leu Tyr Cys Lys Ala1105 1110
1115 1120Tyr Arg Tyr Asp Ile Ala Ser Leu Val Ala Ile
Lys Ala Lys Lys Asp1125 1130 1135Glu Leu
Leu Glu Glu Val Val Asp Pro Gly Leu Gly Glu Gly Phe Gly1140
1145 1150Ile Ile Ala Glu Leu Leu Ala Asp Cys Lys Gly Gln
Ile Asn Ser Gln1155 1160 1165Leu Arg Arg
Leu Arg Glu Leu Arg Ala Lys Lys Glu Glu Asn Pro Tyr1170
1175 1180Ala Phe Tyr Gly Gln Glu Thr Glu Gln Ala Asp Asp
Val Ser Val Ala1185 1190 1195
1200Pro Ser Glu Thr Ser Thr Gln Glu Ser Phe Phe Thr Arg Tyr Thr Gly1205
1210 1215Lys Thr Gly Gly Thr Ala Lys Thr Gly
Ala Ser Arg Arg Thr Ala Lys1220 1225
1230Asn Lys Arg Arg Glu Glu Arg Lys Arg Ala Arg Gly Lys Lys Gly Thr1235
1240 1245Ile Tyr Glu Glu Glu Tyr Leu Val Gln
Ser Val Gly Arg Leu Ile Glu1250 1255
1260Arg Leu Asn Gln Thr Lys Pro Asp Ala Val Arg Val Val Glu Gly Leu1265
1270 1275 1280Cys Arg Arg Asn
Met Arg Glu Gln Ala His Gln Ile Gln Lys Asn Phe1285 1290
1295Val Glu Val Leu Asp Leu Leu Lys Ala Asn Val Lys Glu Ile
Tyr Ser1300 1305 1310Ile Ser Glu Lys Asp
Arg Glu Arg Val Asn Glu Asn Gly Glu Val Tyr1315 1320
1325Tyr Ile Pro Glu Ile Pro Val Pro Glu Ile His Asp Phe Pro Lys
Ser1330 1335 1340His Ile Val Asp
Phe1345
SEQ ID NO: 8 (length 1319, PRT, Arabidopsis thaliana):
Met Lys Asn Leu Lys Leu Phe Ser Glu
Val Pro Gln Asn Ile Gln Leu1 5 10
15His Ser Thr Glu Glu Val Val Gln Phe Ala Ala Thr Asp Ile Asp
Gln20 25 30Ser Arg Leu Phe Phe Ala Ser
Ser Ala Asn Phe Val Tyr Ala Leu Gln35 40
45Leu Ser Ser Phe Gln Asn Glu Ser Ala Gly Ala Lys Ser Ala Met Pro50
55 60Val Glu Val Cys Ser Ile Asp Ile Glu Pro
Gly Asp Phe Ile Thr Ala65 70 75
80Phe Asp Tyr Leu Ala Glu Lys Glu Ser Leu Leu Ile Gly Thr Ser
His85 90 95Gly Leu Leu Leu Val His Asn
Val Glu Ser Asp Val Thr Glu Leu Val100 105
110Gly Asn Ile Glu Gly Gly Val Lys Cys Ile Ser Pro Asn Pro Thr Gly115
120 125Asp Leu Leu Gly Leu Ile Thr Gly Leu
Gly Gln Leu Ile Val Met Thr130 135 140Tyr
Asp Trp Ala Leu Met Tyr Glu Lys Ala Leu Gly Glu Val Pro Glu145
150 155 160Gly Gly Tyr Val Arg Glu
Thr Asn Asp Leu Ser Val Asn Cys Gly Gly165 170
175Ile Ser Ile Ser Trp Arg Gly Asp Gly Lys Tyr Phe Ala Thr Met
Gly180 185 190Glu Val Tyr Glu Ser Gly Cys
Met Ser Lys Lys Ile Lys Ile Trp Glu195 200
205Ser Asp Ser Gly Ala Leu Gln Ser Ser Ser Glu Thr Lys Glu Phe Thr210
215 220Gln Gly Ile Leu Glu Trp Met Pro Ser
Gly Ala Lys Ile Ala Ala Val225 230 235
240Tyr Lys Arg Lys Ser Asp Asp Ser Ser Pro Ser Ile Ala Phe
Phe Glu245 250 255Arg Asn Gly Leu Glu Arg
Ser Ser Phe Arg Ile Gly Glu Pro Glu Asp260 265
270Ala Thr Glu Ser Cys Glu Asn Leu Lys Trp Asn Ser Ala Ser Asp
Leu275 280 285Leu Ala Gly Val Val Ser Cys
Lys Thr Tyr Asp Ala Ile Arg Val Trp290 295
300Phe Phe Ser Asn Asn His Trp Tyr Leu Lys Gln Glu Ile Arg Tyr Pro305
310 315 320Arg Glu Ala Gly
Val Thr Val Met Trp Asp Pro Thr Lys Pro Leu Gln325 330
335Leu Ile Cys Trp Thr Leu Ser Gly Gln Val Ser Val Arg His
Phe Met340 345 350Trp Val Thr Ala Val Met
Glu Asp Ser Thr Ala Phe Val Ile Asp Asn355 360
365Ser Lys Ile Leu Val Thr Pro Leu Ser Leu Ser Leu Met Pro Pro
Pro370 375 380Met Tyr Leu Phe Ser Leu Ser
Phe Ser Ser Ala Val Arg Asp Ile Ala385 390
395 400Tyr Tyr Ser Arg Asn Ser Lys Asn Cys Leu Ala Val
Phe Leu Ser Asp405 410 415Gly Asn Leu Ser
Phe Val Glu Phe Pro Ala Pro Asn Thr Trp Glu Asp420 425
430Leu Glu Gly Lys Asp Phe Ser Val Glu Ile Ser Asp Cys Lys
Thr Ala435 440 445Leu Gly Ser Phe Val His
Leu Leu Trp Leu Asp Val His Ser Leu Leu450 455
460Cys Val Ser Ala Tyr Gly Ser Ser His Asn Lys Cys Leu Ser Ser
Gly465 470 475 480Gly Tyr
Asp Thr Glu Leu His Gly Ser Tyr Leu Gln Glu Val Glu Val485
490 495Val Cys His Glu Asp His Val Pro Asp Gln Val Thr
Cys Ser Gly Phe500 505 510Lys Ala Ser Ile
Thr Phe Gln Thr Leu Leu Glu Ser Pro Val Leu Ala515 520
525Leu Ala Trp Asn Pro Ser Lys Arg Asp Ser Ala Phe Val Glu
Phe Glu530 535 540Gly Gly Lys Val Leu Gly
Tyr Ala Ser Arg Ser Glu Ile Met Glu Thr545 550
555 560Arg Ser Ser Asp Asp Ser Val Cys Phe Pro Ser
Thr Cys Pro Trp Val565 570 575Arg Val Ala
Gln Val Asp Ala Ser Gly Val His Lys Pro Leu Ile Cys580
585 590Gly Leu Asp Asp Met Gly Arg Leu Ser Ile Asn Gly
Lys Asn Leu Cys595 600 605Asn Asn Cys Ser
Ser Phe Ser Phe Tyr Ser Glu Leu Ala Asn Glu Val610 615
620Val Thr His Leu Ile Ile Leu Thr Lys Gln Asp Phe Leu Phe
Ile Val625 630 635 640Asp
Thr Lys Asp Val Leu Asn Gly Asp Val Ala Leu Gly Asn Val Phe645
650 655Phe Val Ile Asp Gly Arg Arg Arg Asp Glu Glu
Asn Met Ser Tyr Val660 665 670Asn Ile Trp
Glu Arg Gly Ala Lys Val Ile Gly Val Leu Asn Gly Asp675
680 685Glu Ala Ala Val Ile Leu Gln Thr Met Arg Gly Asn
Leu Glu Cys Ile690 695 700Tyr Pro Arg Lys
Leu Val Leu Ser Ser Ile Thr Asn Ala Leu Ala Gln705 710
715 720Gln Arg Phe Lys Asp Ala Phe Asn Leu
Val Arg Arg His Arg Ile Asp725 730 735Phe
Asn Val Ile Val Asp Leu Tyr Gly Trp Gln Ala Phe Leu Gln Ser740
745 750Ala Val Ala Phe Val Glu Gln Val Asn Asn Leu
Asn His Val Thr Glu755 760 765Phe Val Cys
Ala Met Lys Asn Glu Asp Val Thr Glu Thr Leu Tyr Lys770
775 780Lys Phe Ser Phe Ser Lys Lys Gly Asp Glu Val Phe
Arg Val Lys Asp785 790 795
800Ser Cys Ser Asn Lys Val Ser Ser Val Leu Gln Ala Ile Arg Lys Ala805
810 815Leu Glu Glu His Ile Pro Glu Ser Pro
Ser Arg Glu Leu Cys Ile Leu820 825 830Thr
Thr Leu Ala Arg Ser Asp Pro Pro Ala Ile Glu Glu Ser Leu Leu835
840 845Arg Ile Lys Ser Val Arg Glu Met Glu Leu Leu
Asn Ser Ser Asp Asp850 855 860Ile Arg Lys
Lys Ser Cys Pro Ser Ala Glu Glu Ala Leu Lys His Leu865
870 875 880Leu Trp Leu Leu Asp Ser Glu
Ala Val Phe Glu Ala Ala Leu Gly Leu885 890
895Tyr Asp Leu Asn Leu Ala Ala Ile Val Ala Leu Asn Ser Gln Arg Asp900
905 910Pro Lys Glu Phe Leu Pro Tyr Leu Gln
Glu Leu Glu Lys Met Pro Glu915 920 925Ser
Leu Met His Phe Lys Ile Asp Ile Lys Leu Gln Arg Phe Asp Ser930
935 940Ala Leu Arg Asn Ile Val Ser Ala Gly Val Gly
Tyr Phe Pro Asp Cys945 950 955
960Met Asn Leu Ile Lys Lys Asn Pro Gln Leu Phe Pro Leu Gly Leu
Leu965 970 975Leu Ile Thr Asp Pro Glu Lys
Lys Leu Val Val Leu Glu Ala Trp Ala980 985
990Asp His Leu Ile Asp Glu Lys Arg Phe Glu Asp Ala Ala Thr Thr Tyr995
1000 1005Leu Cys Cys Cys Lys Leu Glu Lys Ala
Ser Lys Ala Tyr Arg Glu Cys1010 1015
1020Gly Asp Trp Ser Gly Val Leu Arg Val Gly Ala Leu Met Lys Leu Gly1025
1030 1035 1040Lys Asp Glu Ile
Leu Lys Leu Ala Tyr Glu Leu Cys Glu Glu Val Asn1045 1050
1055Ala Leu Gly Lys Pro Ala Glu Ala Ala Lys Ile Ala Leu Glu
Tyr Cys1060 1065 1070Ser Asp Ile Ser Gly
Gly Ile Ser Leu Leu Ile Asn Ala Arg Glu Trp1075 1080
1085Glu Glu Ala Leu Arg Val Ala Phe Leu His Thr Ala Asp Asp Arg
Ile1090 1095 1100Ser Val Val Lys Ser Ser
Ala Leu Glu Cys Ala Ser Gly Leu Val Ser1105 1110
1115 1120Glu Phe Lys Glu Ser Ile Glu Lys Val Gly Lys
Tyr Leu Thr Arg Tyr1125 1130 1135Leu Ala
Val Arg Gln Arg Arg Leu Leu Leu Ala Ala Lys Leu Lys Ser1140
1145 1150Glu Glu Arg Ser Val Val Asp Leu Asp Asp Asp Thr
Ala Ser Glu Ala1155 1160 1165Ser Ser Asn
Leu Ser Gly Met Ser Ala Tyr Thr Leu Gly Thr Arg Arg1170
1175 1180Gly Ser Ala Ala Ser Val Ser Ser Ser Asn Ala Thr
Ser Arg Ala Arg1185 1190 1195
1200Asp Leu Arg Arg Gln Arg Lys Ser Gly Lys Ile Arg Ala Gly Ser Ala1205
1210 1215Gly Glu Glu Met Ala Leu Val Asp His
Leu Lys Gly Met Arg Met Thr1220 1225
1230Asp Gly Gly Lys Arg Glu Leu Lys Ser Leu Leu Ile Cys Leu Val Thr1235
1240 1245Leu Gly Glu Met Glu Ser Ala Gln Lys
Leu Gln Gln Thr Ala Glu Asn1250 1255
1260Phe Gln Val Ser Gln Val Ala Ala Val Glu Leu Ala His Asp Thr Val1265
1270 1275 1280Ser Ser Glu Ser
Val Asp Glu Glu Val Tyr Cys Phe Glu Arg Tyr Ala1285 1290
1295Gln Lys Thr Arg Ser Thr Ala Arg Asp Ser Asp Ala Phe Ser
Trp Met1300 1305 1310Leu Lys Val Phe Ile
Ser Pro1315
SEQ ID NO: 9 (length 1178, PRT, Caenorhabditis elegans):
Met Lys Asn Leu Gln Ile Gly
Ser Val Lys Thr Phe Glu Asn Pro Glu1 5 10
15Ile Ala Gly Ala Asp Asp Phe Ala Val His Pro Ile Leu
Gln Thr Ile20 25 30Ala Val Ser Thr Lys
Asn Glu Leu Leu Leu Leu Glu Asn Asn Leu Ile35 40
45Ser Ser Thr Ile Lys Trp Ala Glu Gln Arg Arg Glu Leu Glu Val
Ile50 55 60Ser Leu Ser Phe Arg Thr Asp
Gly Asn Gln Ile Val Val Ile Leu Ala65 70
75 80Asp Gly Arg Ala Leu Ile Val Glu Asp Gly Glu Val
Met Asp Leu Glu85 90 95Ile Ala Glu Leu
Thr Asp Thr Thr Val Ser Ala Ala Glu Trp Thr Ala100 105
110Asp Glu Gln Thr Leu Ala Leu Ala Asp Asn Gln Thr Leu Tyr
Leu Ala115 120 125Asp Ser Ser Leu Val Pro
Phe Ala Glu Arg Pro Leu Ile Phe Ser Glu130 135
140Asn Glu Arg Lys Ser Ala Pro Val Asn Val Gly Trp Gly Ser Glu
Ser145 150 155 160Thr Gln
Phe Arg Gly Ser Ala Gly Lys Leu Lys Pro Gly Glu Lys Ile165
170 175Glu Lys Glu Lys Glu Gln Ile Glu Gln His Ser Arg
Lys Thr Ser Val180 185 190His Trp Arg Trp
Asp Gly Glu Ile Val Ala Val Ser Phe Tyr Ser Ser195 200
205Gln Asn Asp Thr Arg Asn Leu Thr Val Phe Asp Arg Asn Gly
Glu Ile210 215 220Leu Asn Asn Met Asn Ile
Arg Asn Ile Tyr Leu Ser His Cys Phe Ala225 230
235 240His Lys Pro Asn Ala Asn Leu Leu Cys Ser Ala
Ile Gln Glu Asn Gly245 250 255Ser Asp Asp
Arg Ile Val Ile Tyr Glu Arg Asn Gly Glu Thr Arg Asn260
265 270Ser Tyr Val Val Lys Trp Pro Ala Asn Gln Ile Glu
Asp Arg Arg Ile275 280 285Ile Glu Lys Ile
Glu Trp Asn Ser Thr Gly Thr Ile Leu Ser Met Gln290 295
300Thr Ser Leu Gly Lys Lys His Gln Leu Glu Phe Trp His Leu
Ser Asn305 310 315 320Tyr
Glu Phe Thr Arg Lys Cys Tyr Trp Lys Phe Ser Glu Ser Ile Ile325
330 335Trp Lys Trp Ser Thr Val Glu Cys Gln Asn Ile
Glu Val Leu Leu Glu340 345 350Ser Gly Gln
Phe Phe Ser Val His Ile Thr Pro Thr Ala Ser Phe Ser355
360 365Asp Val Ile Ser Gln Asn Val Val Val Ala Thr Asp
Glu Leu Arg Met370 375 380Tyr Ser Leu Cys
Arg Arg Val Val Pro Pro Pro Met Cys Asp Tyr Ser385 390
395 400Ile Gln Cys Leu Ser Asp Ile Val Ala
Tyr Thr Thr Ser Thr His His405 410 415Val
His Val Ile Thr Ser Asp Trp Lys Ile Ile Ser Cys Met Leu Phe420
425 430Phe Lys Lys Lys Lys Arg Asn Tyr Ser Asn Pro
Phe Phe Arg Lys Lys435 440 445Tyr Ile Leu
Glu Ile Leu Lys Val Pro Ser His Lys Thr Tyr Phe Ala450
455 460Cys Phe Ala Val Ser Gln Asp Thr Asp Gly Tyr Lys
Phe Asn Ser Asp465 470 475
480Arg Ala Ser Ile Asp Glu Val Leu His Thr Glu Val Thr Glu Gly Ile485
490 495Ile Cys Gly Phe Val Tyr Asp Glu Pro
Ser Glu Ser Tyr Ile Ile Trp500 505 510Asn
Val Ser His Gly Lys His Gln Ile Ser Arg Val Gly Ala Asn Pro515
520 525Glu Lys Ile Phe Glu Gly Glu Asn Ile Gly Trp
Ile Gly Val Asn Pro530 535 540Ser Asn Lys
His Val Glu Ile Ala Ser Asn Asp Gly Lys Phe Ile Asp545
550 555 560Leu Asn Thr Lys Glu Glu Leu
Phe Lys Ile Asp Lys Phe Glu Ser Thr565 570
575Glu Val His Phe Ile Gln Val Cys His Gly Ile Leu Asn His His Val580
585 590Ile Gln Val Asp Asn Ser Met Leu Phe
Leu Asp Ser Glu Arg Val Ser595 600 605Gln
Asp Ala Ile Ser Ile Leu Thr Arg Gly Ser Asp Ile Leu Leu Ile610
615 620Asp Phe Asp Asn Lys Leu Arg Phe Ile Asp Ala
Glu Ser Gly Lys Thr625 630 635
640Leu Glu Asp Val Arg Asn Val Glu Ala Gly Cys Glu Leu Val Ala
Cys645 650 655Asp Ser Gln Ser Ala Asn Val
Ile Leu Gln Ala Ala Arg Gly Asn Leu660 665
670Glu Thr Ile Gln Pro Arg Arg Tyr Val Met Ala His Thr Arg Asp Leu675
680 685Leu Asp Arg Lys Glu Tyr Ile Ala Ser
Phe Lys Trp Met Lys Lys His690 695 700Arg
Val Asp Met Ser Phe Ala Met Lys Tyr Lys Gly Asp Asp Leu Glu705
710 715 720Asp Asp Ile Pro Ile Trp
Leu Lys Thr Ser Asn Asp Ser Gln Phe Leu725 730
735Glu Gln Leu Leu Ile Ser Cys Thr Glu Val Phe Glu Asp Ala Gly
Ser740 745 750Ser Leu Cys Met Thr Val Ala
Arg Tyr Val Arg Asp Leu Ser Asp Ala755 760
765Glu Lys Thr Lys Met Phe Pro Leu Leu Leu Thr Ala Leu Leu Ser Ala770
775 780Arg Ser Lys Pro Ser Lys Val Asn Asp
Cys Leu Lys Glu Val Gln Glu785 790 795
800His Val Glu Lys Ile Ala Asp Arg Lys Asp Val Phe Thr Arg
Asn Ser805 810 815Leu His His Ile Ser Phe
Phe Val Pro Ala Lys Glu Leu Phe Asn Cys820 825
830Ala Leu Ser Thr Tyr Asp Leu Lys Leu Ala Gln Gln Val Ala Glu
Ala835 840 845Ser Asn Tyr Asp Pro Lys Glu
Tyr Leu Pro Val Leu Asn Lys Leu Asn850 855
860Arg Val Met Cys Thr Leu Glu Arg Gln Tyr Arg Ile Asn Val Val Arg865
870 875 880Glu Ala Trp Ile
Asp Ala Val Ser Ser Leu Phe Leu Leu Asp Ser Ser885 890
895Lys Glu Arg Gly Ser Glu Glu Thr Trp Trp Asn Asp Ile Glu
Asp Ile900 905 910Ile Ile Gln Arg Glu Lys
Leu Tyr Gln Asp Ala Leu Thr Leu Val Lys915 920
925Pro Gly Asp Arg Arg Tyr Lys Gln Cys Cys Glu Leu Tyr Ala Glu
Leu930 935 940Glu Arg Lys Val His Trp Arg
Glu Ala Ala Leu Phe Tyr Glu Leu Ser945 950
955 960Gly Asn Ser Glu Lys Thr Leu Lys Cys Trp Glu Met
Ser Arg Asp Val965 970 975Asp Gly Leu Ala
Ala Ser Ala Arg Arg Leu Ala Val Asp Ala Gly Lys980 985
990Leu Lys Ile His Ala Ile Lys Met Ser Thr Thr Leu Arg Glu
Ala Arg995 1000 1005Gln Pro Lys Glu Leu Ala
Lys Ala Leu Lys Leu Ala Gly Ser Ser Ser1010 1015
1020Thr Gln Ile Val His Val Leu Cys Asp Ala Phe Glu Trp Leu Asp
Ala1025 1030 1035 1040Ser Arg
Glu Val Glu Val Gly Lys Glu Glu Ala Leu Lys Lys Ala Ala1045
1050 1055Leu Ser Arg Asn Asp Gln Val Leu Met Asp Leu Glu
Arg Arg Lys Thr1060 1065 1070Glu Phe Glu
Asn Tyr Lys Lys Arg Leu Ala Val Val Arg Glu Asn Lys1075
1080 1085Leu Lys Arg Val Glu Gln Phe Ala Ala Gly Glu Val
Asp Asp Leu Arg1090 1095 1100Asp Asp Ile
Ser Val Ile Ser Ser Ile Ser Ser Arg Ser Gly Ser Ser1105
1110 1115 1120Lys Val Ser Met Ala Ser Thr
Val Arg Arg Lys Gln Ile Glu Lys Lys1125 1130
1135Lys Ser Ser Leu Lys Glu Gly Gly Glu Tyr Glu Asp Ser Ala Leu Leu1140
1145 1150Asn Val Leu Ser Glu Asn Tyr Arg Trp
Leu Glu Asn Ile Gly Ser Glu1155 1160
1165Phe Cys Phe Pro Trp Asn Phe Asn Ile Leu1170
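1175
SEQ ID NO: 10 (length 17, DNA, Mus sp.): ttttttttcc ctcagaa
SEQ ID NO: 11 (length 17, DNA, Mus sp.): tatgctttgt gaaaggt
SEQ ID NO: 12 (length 17, DNA, Mus sp.): ttttctctga tgcagct
SEQ ID NO: 13 (length 17, DNA, Mus sp.): acatgaactc ctaagct
SEQ ID NO: 14 (length 17, DNA, Mus sp.): cttgaaaaac tgtaggc
SEQ ID NO: 15 (length 17, DNA, Mus sp.): ggtgtctctc ttcagcc
SEQ ID NO: 16 (length 17, DNA, Mus sp.): ctacctcctt tgcagag
SEQ ID NO: 17 (length 17, DNA, Mus sp.): aggttctgct ttcagac
SEQ ID NO: 18 (length 17, DNA, Mus sp.): ttttgtccct accaggt
SEQ ID NO: 19 (length 17, DNA, Mus sp.): tccctccaca cacagtc
SEQ ID NO: 20 (length 17, DNA, Mus sp.): cttttcattg tgtagac
SEQ ID NO: 21 (length 17, DNA, Mus sp.): ttttttgttt tctaggt
SEQ ID NO: 22 (length 17, DNA, Mus sp.): ctaatatttg aacagga
SEQ ID NO: 23 (length 17, DNA, Mus sp.): ttttttttgc tttagtt
SEQ ID NO: 24 (length 17, DNA, Mus sp.): ttaatcttac aacagag
SEQ ID NO: 25 (length 17, DNA, Mus sp.): ttcatttctt tgcagga
SEQ ID NO: 26 (length 17, DNA, Mus sp.): tcttgcctgt tgcaggt
SEQ ID NO: 27 (length 17, DNA, Mus sp.): cactggtatt tttagtg
SEQ ID NO: 28 (length 17, DNA, Mus sp.): gggttttatt ttgagat
SEQ ID NO: 29 (length 17, DNA, Mus sp.): ttcctgtcct cacagac
SEQ ID NO: 30 (length 17, DNA, Mus sp.): tactttcttt gataggt
SEQ ID NO: 31 (length 17, DNA, Mus sp.): tactgtggtt cttaggg
SEQ ID NO: 32 (length 17, DNA, Mus sp.): cacttactac ctcaggt
SEQ ID NO: 33 (length 17, DNA, Mus sp.): cttaaactcc aacagga
SEQ ID NO: 34 (length 17, DNA, Mus sp.): aacttttttc ctaggga
SEQ ID NO: 35 (length 17, DNA, Mus sp.): tttttttttt ttcagga
SEQ ID NO: 36 (length 17, DNA, Mus sp.): cgtctcttgt cacaggc
SEQ ID NO: 37 (length 17, DNA, Mus sp.): ttgctgtctt ttcagga
SEQ ID NO: 38 (length 17, DNA, Mus sp.): ctcttccctt gtcagga
SEQ ID NO: 39 (length 17, DNA, Mus sp.): tttcttccct cttaggt
SEQ ID NO: 40 (length 17, DNA, Mus sp.): attatgcatc ctcagcc
SEQ ID NO: 41 (length 17, DNA, Mus sp.): gttcatcttc tctagat
SEQ ID NO: 42 (length 17, DNA, Mus sp.): tgtaatttct gacagga
SEQ ID NO: 43 (length 17, DNA, Mus sp.): ccatttcttc tctagat
SEQ ID NO: 44 (length 17, DNA, Mus sp.): ctgttttctg cttaggt
SEQ ID NO: 45 (length 17, DNA, Mus sp.): cattcttgct tccagat
SEQ ID NO: 46 (length 17, DNA, Mus sp.): aggtgagcat tcgcccg
SEQ ID NO: 47 (length 17, DNA, Mus sp.): aagtaggtca ctgatgc
SEQ ID NO: 48 (length 17, DNA, Mus sp.): aggtaggtgt aaggcct
SEQ ID NO: 49 (length 17, DNA, Mus sp.): aggtaagctt tgcactg
SEQ ID NO: 50 (length 17, DNA, Mus sp.): aggtaagcgt ttcttgg
SEQ ID NO: 51 (length 17, DNA, Mus sp.): tggtaaggcg ggatgat
SEQ ID NO: 52 (length 17, DNA, Mus sp.): tggtgtctct cttcagc
SEQ ID NO: 53 (length 17, DNA, Mus sp.): aagtgagtga gcataaa
SEQ ID NO: 54 (length 17, DNA, Mus sp.): aggtaggggt cagagtt
SEQ ID NO: 55 (length 17, DNA, Mus sp.): tggtatgaca gcttgtg
SEQ ID NO: 56 (length 17, DNA, Mus sp.): aagtaagttg ctgcgaa
SEQ ID NO: 57 (length 17, DNA, Mus sp.): tggtaagtgg aagcagg
SEQ ID NO: 58 (length 17, DNA, Mus sp.): tcgtaagttc ctaaata
SEQ ID NO: 59 (length 17, DNA, Mus sp.): aggtatcatg gttcatc
SEQ ID NO: 60 (length 17, DNA, Mus sp.): gggtgaggat cagagtt
SEQ ID NO: 61 (length 17, DNA, Mus sp.): aggtgaatag acacggc
SEQ ID NO: 62 (length 17, DNA, Mus sp.): aggtatgtag gcttggt
SEQ ID NO: 63 (length 17, DNA, Mus sp.): aagtaagctc tcctata
SEQ ID NO: 64 (length 17, DNA, Mus sp.): aggtaagctg actcttc
SEQ ID NO: 65 (length 17, DNA, Mus sp.): aagtaagtat ttattct
SEQ ID NO: 66 (length 17, DNA, Mus sp.): aggtacactt tgcgtct
SEQ ID NO: 67 (length 17, DNA, Mus sp.): aggtaagtat tttgata
SEQ ID NO: 68 (length 17, DNA, Mus sp.): aagtgggtgc tgtgtgt
SEQ ID NO: 69 (length 17, DNA, Mus sp.): aggtagagac ctgcgcg
SEQ ID NO: 70 (length 17, DNA, Mus sp.): aggtatgtgg agttgag
SEQ ID NO: 71 (length 17, DNA, Mus sp.): tggtaagggt ttttttt
SEQ ID NO: 72 (length 17, DNA, Mus sp.): aggtatgtgg tgggtta
SEQ ID NO: 73 (length 17, DNA, Mus sp.): aggtaagcag ggccatt
SEQ ID NO: 74 (length 17, DNA, Mus sp.): aggtgagctc ctccccg
SEQ ID NO: 75 (length 17, DNA, Mus sp.): tggtaaggaa gctctga
SEQ ID NO: 76 (length 17, DNA, Mus sp.): aggtgaggat tacattt
SEQ ID NO: 77 (length 17, DNA, Mus sp.): gggtgagtgc ctccaaa
SEQ ID NO: 78 (length 17, DNA, Mus sp.): gcgtacgtac gagacct
SEQ ID NO: 79 (length 17, DNA, Mus sp.): aggtatggct tcagtgc
SEQ ID NO: 80 (length 17, DNA, Mus sp.): cggtaagctt cctcaga
SEQ ID NO: 81 (length 17, DNA, Mus sp.): cggtgtactg ctcgttc
SEQ ID NO: 82 (length 19, DNA, Artificial Sequence; Description of Artificial Sequence: Primer): gccagtgttt ttgcctgag
SEQ ID NO: 83 (length 19, DNA, Artificial Sequence; Description of Artificial Sequence: Primer): cggattgtca ctgttgtgc
SEQ ID NO: 84 (length 20, DNA, Artificial Sequence; Description of Artificial Sequence: Primer): gactgctctc atagcatcgc
SEQ ID NO: 85 (length 15, DNA, Artificial Sequence; Description of Artificial Sequence: Primer): aagtaagygc cattg
SEQ ID NO: 86 (length 15, DNA, Artificial Sequence; Description of Artificial Sequence: Primer): ggttcacsga ttgtc
SEQ ID NO: 87 (length 17, DNA, Artificial Sequence; Description of Artificial Sequence: Primer): ggcgtcgtag aaattgc
SEQ ID NO: 88 (length 20, DNA, Artificial Sequence; Description of Artificial Sequence: Primer): gtggtgctga aggggcaggc
SEQ ID NO: 89 (length 10, DNA, Artificial Sequence; Description of Artificial Sequence: Primer): tacagactta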