Patent application title: Human Growth Gene and Short Stature Gene Region
Inventors:
Gudrun Rappold-Hoerbrand (Heidelberg, DE)
Ercole Rao (Riedstadt, DE)
IPC8 Class: AA61K3827FI
USPC Class:
514 12
Class name: Designated organic active ingredient containing (doai) peptide containing (e.g., protein, peptones, fibrinogen, etc.) doai 25 or more peptide repeating units in known peptide chain structure
Publication date: 2009-04-30
Patent application number: 20090111744
Claims:
1. A method for the treatment of short stature in patients identified as having a genetic defect in the human growth gene SHOX, comprising a) identifying a genetic defect in a human subject suspected of having a genetic mutation in the SHOX gene having the nucleotide sequence according to SEQ. ID NO. 14, and b) administering to said patient a therapeutically active amount of human growth hormone.
2. The method according to claim 1, wherein said genetic defect is identified by obtaining a biological sample molecule to be examined, and amplifying said biological sample molecule in the presence of two nucleotide probes completely or in part complementary to any of the DNA sequences of SEQ. ID. NO: 2 to SEQ ID NO. 7.
3. The method according to claim 1, wherein the genetic mutation is caused by a hot spot of mutation in the nucleic acid sequence encoding a protein truncation at amino acid position 195 in the SHOX gene.
4. The method according to claim 1, wherein said patient is not suffering from Turner's Syndrome.
5. A method for the treatment of short stature in patients identified as having a genetic defect in the human growth gene SHOX, said SHOX gene having the nucleotide sequence according to SEQ. ID NO. 14, comprising a) identifying a genetic defect in a human subject suspected of having a genetic mutation in the SHOX gene, comprising i) obtaining a biological sample containing a polynucleotide from a human subject, ii) amplifying the polynucleotide of i) in the presence of a primer, wherein said primer is an exon flanking primer or a primer to an exon nucleotide sequence of the SHOX gene according to SEQ ID NO: 14 and wherein the oligonucleotide primer has a length of 18-26 nucleotides and the oligonucleotide sequence of said primer is identical to a partial sequence of an exon nucleotide sequence of the SHOX gene according to SEQ ID NO: 14, and iii) sequencing any amplification product of the polynucleotide of i) to determine the presence of a genetic mutation in the SHOX gene of said human subject, and b) administering to said human subject a therapeutically active amount of human growth hormone.
6. The method according to claim 5, wherein the exon nucleotide sequence in step a) ii) is a polynucleotide sequence selected from the group consisting of SHOX ET93 (SEQ ID NO: 2), SHOX G310 (SEQ ID NO: 3), SHOX ET45 (SEQ ID NO: 4), SHOX G108 (SEQ ID NO: 5), SHOX Va (SEQ ID NO: 6) and SHOX Vb (SEQ ID NO: 7).
Description:
[0001]This application is a divisional of U.S. Ser. No. 10/158,160 filed
May 31, 2002, which is a continuation of U.S. Ser. No. 09/147,699, filed
Jun. 24, 1999, now abandoned, which is a 371 of PCT/EP97/05355, filed on
Sep. 29, 1997, which claims the benefit of U.S. provisional application
60/027,633, filed on Oct. 1, 1996.
[0002]The present invention relates to the isolation, identification and characterization of newly identified human genes responsible for disorders relating to human growth, especially for short stature or Turner syndrome, as well as the diagnosis and therapy of such disorders.
[0003]The isolated genomic DNA or fragments thereof can be used for pharmaceutical purposes or as diagnostic tools or reagents for identification or characterization of the genetic defect involved in such disorders. Subject of the present invention are further human growth proteins (transcription factors A, B and C) which are expressed after transcription of said DNA into RNA or mRNA and which can be used in the therapeutic treatment of disorders related to mutations in said genes. The invention further relates to appropriate cDNA sequences which can be used for the preparation of recombinant proteins suitable for the treatment of such disorders. Subject of the invention are further plasmid vectors for the expression of the DNA of these genes and appropriate cells containing such DNAs. It is a further subject of the present invention to provide means and methods for the genetic treatment of such disorders in the area of molecular medicine using an expression plasmid prepared by incorporating the DNA of this invention downstream from an expression promotor which effects expression in a mammalian host cell.
[0004]Growth is one of the fundamental aspects in the development of an organism, regulated by a highly organised and complex system. Height is a multifactorial trait, influenced by both environmental and genetic factors. Developmental malformations concerning body height are common phenomena among humans of all races. With an incidence of 3 in 100, growth retardation resulting in short stature accounts for the large majority of inborn deficiencies seen in humans.
[0005]With an incidence of 1:2500 live-born phenotypic females, Turner syndrome is a common chromosomal disorder (Rosenfeld et al., 1996). It has been estimated that 1-2% of all human conceptions are 45,X and that as many as 99% of such fetuses do not come to term (Hall and Gilchrist, 1990; Robins, 1990). Significant clinical variability exists in the phenotype of persons with Turner syndrome (or Ullrich-Turner syndrome) (Ullrich, 1930; Turner, 1938). Short stature, however, is a consistent finding and, together with gonadal dysgenesis, is considered one of the lead symptoms of this disorder. Turner syndrome is a true multifactorial disorder. The embryonic lethality, the short stature, the gonadal dysgenesis and the characteristic somatic features are all thought to be due to monosomy of genes common to the X and Y chromosomes. The diploid dosage of those X-Y homologous genes is suggested to be required for normal human development. Turner genes (or anti-Turner genes) are expected to be expressed in females from both the active and inactive X chromosomes or Y chromosome to ensure correct dosage of gene product. Haploinsufficiency (deficiency due to only one active copy) would consequently be the suggested genetic mechanism underlying the disease.
[0006]A variety of mechanisms underlying short stature have been elucidated so far. Growth hormone and growth hormone receptor deficiencies as well as skeletal disorders have been described as causes for the short stature phenotype (Martial et al., 1979; Phillips et al., 1981; Leung et al., 1987; Goddard et al., 1995). Recently, mutations in three human fibroblast growth factor receptor-encoding genes (FGFR 1-3) were identified as the cause of various skeletal disorders, including the most common form of dwarfism, achondroplasia (Shiang et al., 1994; Rousseau et al., 1994; Muenke and Schell, 1995). A well-known and frequent (1:2500 females) chromosomal disorder, Turner Syndrome (45,X), is also consistently associated with short stature. Taken together, however, all these different known causes account for only a small fraction of all short patients, leaving the vast majority of short stature cases unexplained to date.
[0007]The sex chromosomes X and Y are believed to harbor genes influencing height (Ogata and Matsuo, 1993). This could be deduced from genotype-phenotype correlations in patients with sex chromosome abnormalities. Cytogenetic studies have provided evidence that terminal deletions of the short arms of either the X or the Y chromosome consistently lead to short stature in the respective individuals (Zuffardi et al., 1982; Curry et al., 1984). More than 20 chromosomal rearrangements associated with terminal deletions of chromosome Xp and Yp have been reported that localize the gene(s) responsible for short stature to the pseudoautosomal region (PAR1) (Ballabio et al., 1989, Schaefer et al., 1993). This localisation has been narrowed down to the most distal 700 kb of DNA of the PAR1 region, with DXYS 15 as the flanking marker (Ogata et al., 1992; 1995).
[0008]Mammalian growth regulation is organized as a complex system. It is conceivable that multiple growth promoting genes (proteins) interact with one another in a highly organized way. One of those genes controlling height has tentatively been mapped to the pseudoautosomal region PAR1 (Ballabio et al., 1989), a region known to be freely exchanged between the X and Y chromosomes (for a review see Rappold, 1993). The entire PAR1 region is approximately 2,700 kb.
[0009]The critical region for short stature has been defined with deletion patients. Short stature is the consequence when an entire 700 kb region is deleted or when a specific gene within this critical region is present in haploid state, is interrupted or mutated (as is the case with idiopathic short stature or Turner syndrome). The frequency of Turner's syndrome is 1 in 2500 females worldwide; the frequency of this kind of idiopathic short stature can be estimated to be 1 in 4,000-5,000 persons. Turner females and some short stature individuals usually receive an unspecific treatment with growth hormone (GH) for many years to over a decade although it is well known that they have normal GH levels and GH deficiency is not the problem. The treatment of such patients is very expensive (estimated costs approximately 30,000 USD p.a.). Therefore, the problem existed to provide a method and means for distinguishing between short stature patients who have a genetic defect in the respective gene and patients who do not have any genetic defect in this gene. Patients with a genetic defect in the respective gene--either a complete gene deletion (as in Turner syndrome) or a point mutation (as in idiopathic short stature)--should be susceptible to an alternative treatment without human GH, which now can be devised.
[0010]Genotype/phenotype correlations have supported the existence of a growth gene in the proximal part of Yq and in the distal part of Yp. Short stature is also consistently found in individuals with terminal deletions of Xp. Recently, an extensive search for male and female patients with partial monosomies of the pseudoautosomal region has been undertaken. On the basis of genotype-phenotype correlations, a minimal common region of deletion of 700 kb DNA adjacent to the telomere was determined (Ogata et al., 1992; Ogata et al., 1995). The region of interest was shown to lie between genetic markers DXYS20 (3cosPP) and DXYS15 (113D) and all candidate genes for growth control from within the PAR1 region (e.g., the hemopoietic growth factor receptor α; CSF2RA) (Gough et al., 1990) were excluded based on their physical location (Rappold et al., 1992). That is, the genes were within the 700 kb deletion region of the 2,700 kb PAR1 region.
[0011]Deletions of the pseudoautosomal region (PAR1) of the sex chromosomes were recently discovered in individuals with short stature and subsequently a minimal common deletion region of 700 kb within PAR1 was defined. Southern blot analysis on DNA of patients AK and SS using different pseudoautosomal markers has identified an Xp terminal deletion of about 700 kb distal to DXYS15 (113D) (Ogata et al., 1992; Ogata et al., 1995).
[0012]The gene region corresponding to short stature has been identified as a region of approximately 500 kb, preferably approximately 170 kb in the PAR1 region of the X and Y chromosomes. Three genes in this region have been identified as candidates for the short stature gene. These genes were designated SHOX (short stature homeobox-containing gene; also referred to as SHOX93 or HOX93), pET92 and SHOT (SHOX-like homeobox gene on chromosome three). The gene SHOX, which has two separate splicing sites resulting in two variations (SHOX a and b), is of particular importance. In preliminary investigations, essential parts of the nucleotide sequence of the short stature gene could be analysed (SEQ ID No. 8). Respective exons or parts thereof could be predicted and identified (e.g. exon I [G310]; exon II [ET93]; exon IV [G108]; pET92). The obtained sequence information could then be used for designing appropriate primers or nucleotide probes which hybridize to parts of the SHOX gene or fragments thereof. By conventional methods, the SHOX gene can then be isolated. By further analysis of the DNA sequence of the genes responsible for short stature, the nucleotide sequence of exons I-V could be refined (see FIGS. 1-3). The gene SHOX contains a homeobox sequence (SEQ ID NO: 1) of approximately 180 bp (see FIG. 2 and FIG. 3), starting from the nucleotide coding for amino acid position 117 (Q) to the nucleotide coding for amino acid position 176 (E), i.e. from CAG (440) to GAG (619). The homeobox sequence is identified as the homeobox-pET93 (SHOX) sequence. Two point mutations have been found in individuals with short stature, in a German (A1) and a Japanese patient, by screening to date 250 individuals with idiopathic short stature. Both point mutations were found at the identical position and lead to a protein truncation at amino acid position 195, suggesting that there may exist a hot spot of mutation.
Due to the fact that both mutations found, which lead to a protein truncation, are at the identical position, it is possible that a putative hot spot of recombination exists within exon 4 (G108). Exon specific primers can therefore be used as indicated below, e.g. GCA CAG CCA ACC ACC TAG (for) or TGG AAA GGC ATC ATC CGT AAG (rev).
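The homeobox coordinates quoted above can be checked with simple codon arithmetic. The following is a minimal sketch, assuming 1-based numbering and a translation start at nucleotide 92 (the value given in the description of FIG. 2); the function names are hypothetical and are not part of the application.

```python
# Sketch: map amino acid positions to nucleotide coordinates in the SHOXa cDNA.
# Assumption: 1-based numbering, with the first base of the start codon at
# nucleotide 92 (as stated in the description of FIG. 2).

TRANSLATION_START = 92


def codon_span(aa_position, start=TRANSLATION_START):
    """Return (first, last) nucleotide of the codon for a 1-based amino acid position."""
    first = start + 3 * (aa_position - 1)
    return first, first + 2


def region_span(first_aa, last_aa, start=TRANSLATION_START):
    """Return (first nucleotide, last nucleotide, length in bp) for an amino acid range."""
    first, _ = codon_span(first_aa, start)
    _, last = codon_span(last_aa, start)
    return first, last, last - first + 1


if __name__ == "__main__":
    print("codon of Q117:", codon_span(117))
    print("codon of E176:", codon_span(176))
    print("homeobox (first nt, last nt, bp):", region_span(117, 176))
```

Under these assumptions the codon for position 117 begins at nucleotide 440 and the codon for position 176 ends at nucleotide 619, which gives the 180 bp (60 codon) homeobox stated in the text.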
[0013]The above-mentioned novel homeobox-containing gene, SHOX, which is located within the 170 kb interval, is alternatively spliced generating two proteins with diverse function. Mutation analysis and DNA sequencing were used to demonstrate that short stature can be caused by mutations in SHOX.
[0014]The identification and cloning of the short stature critical region according to the present invention was performed as follows: Extensive physical mapping studies on 15 individuals with partial monosomy in the pseudoautosomal region (PAR1) were performed. By correlating the height of those individuals with their deletion breakpoints a short stature (SS) critical region of approximately 700 kb was defined. This region was subsequently cloned as an overlapping cosmid contig using yeast artificial chromosomes (YACs) from PAR1 (Ried et al., 1996) and by cosmid walking. To search for candidate genes for SS within this interval, a variety of techniques were applied to an approximately 600 kb region between the distal end of cosmid 56G10 and the proximal end of 51D11. Using cDNA selection, exon trapping, and CpG island cloning, two novel genes were identified.
[0015]The position of the short stature critical interval could be refined to a smaller interval of 170 kb of DNA by characterizing three further specific individuals (GA, AT and RY), who were consistently short. To precisely localize the rearrangement breakpoints of those individuals, fluorescence in situ hybridization (FISH) on metaphase chromosomes was carried out using cosmids from the contig. Patient GA, with a terminal deletion and normal height, defined the distal boundary of the critical region (with the breakpoint on cosmid 110E3), and patient AT, with an X chromosome inversion and normal height, the proximal boundary (with the breakpoint on cosmid 34F5). The Y-chromosomal breakpoint of patient RY, with a terminal deletion and short stature, was also found to be contained on cosmid 34F5, suggesting that this region contains sequences predisposing to chromosome rearrangements.
[0016]The entire region, bounded by the Xp/Yp telomere, has been cloned as a set of overlapping cosmids. Fluorescence in situ hybridization (FISH) with cosmids from this region was used to study six patients with X chromosomal rearrangements, three with normal height and three with short stature. Genotype-phenotype correlations narrowed down the critical short stature interval to 270 kb of DNA or even as little as 170 kb, containing the gene or genes with an important role in human growth. A minimal tiling path of six to eight cosmids bridging this interval is now available for interphase and metaphase FISH, providing a valuable tool for diagnostic investigations on patients with idiopathic short stature.
BRIEF DESCRIPTION OF THE DRAWINGS
[0017]FIG. 1 is a gene map of the SHOX gene including five exons which are identified as follows: exon I: G310, exon II: ET93, exon III: ET45, exon IV: G108 and exons Va and Vb, whereby exons Va and Vb result from two different splicing sites of the SHOX gene. Exons II and III contain the homeobox sequence of 180 nucleotides.
[0018]FIGS. 2 and 3 are the nucleotide and predicted amino acid sequences of SHOXa and SHOXb:
[0019]SHOX a: The predicted start of translation begins at nucleotide 92 with the first in-frame stop codon (TGA) at nucleotides 968-970, yielding an open reading frame of 876 bp that encodes a predicted protein of 292 amino acids (designated as transcription factor A or SHOXa protein, respectively). An in-frame, 5' stop codon at nucleotide 4, the start codon and the predicted termination stop codon are in bold. The homeobox is boxed (starting from amino acid position 117 (Q) to 176 (E), i.e. CAG thru GAG in the nucleotide sequence). The locations of introns are indicated with arrows. Two putative polyadenylation signals in the 3' untranslated region are underlined.
[0020]SHOX b: An open reading frame exists from the A of the first methionine codon at nucleotide 92 to the in-frame stop codon at nucleotides 767-769, yielding an open reading frame of 675 bp that encodes a predicted protein of 225 amino acids (transcription factor B or SHOXb protein, respectively). The locations of introns are indicated with arrows. Exons I-IV are identical with SHOXa, exon V is specific for SHOX b. A putative polyadenylation signal in the 3' untranslated region is underlined.
[0021]FIG. 4 shows the nucleotide (SEQ ID NO:43) and predicted amino acid (SEQ ID NO: 16) sequences of SHOT. The predicted start of translation begins at nucleotide 43 with the first in-frame stop codon (TGA) at nucleotides 613-615, yielding an open reading frame of 573 bp that encodes a predicted protein of 190 amino acids (designated as transcription factor C or SHOT protein, respectively). The homeobox is boxed (starting from amino acid position 11 (Q) to 70 (E), i.e. CAG thru GAG in the nucleotide sequence). The locations of introns are indicated with arrows. Two putative polyadenylation signals in the 3' untranslated region are underlined.
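The open reading frame sizes given for FIGS. 2 to 4 follow from the stated start and stop coordinates. The sketch below is illustrative only: it assumes 1-based coordinates, takes the start and stop positions from the figure descriptions, and uses a hypothetical helper `orf_length` that is not part of the application.

```python
# Sketch: derive ORF lengths from the start/stop coordinates quoted for FIGS. 2-4.
# Assumption: 1-based coordinates; stop_first is the first base of the stop codon.


def orf_length(start, stop_first, include_stop=False):
    """Length in bp from the first base of the start codon to the ORF end.

    With include_stop=False the stop codon is excluded (coding codons only);
    with include_stop=True the three bases of the stop codon are counted.
    """
    last = stop_first + 2 if include_stop else stop_first - 1
    return last - start + 1


# name: (first base of start codon, first base of stop codon)
ORFS = {
    "SHOXa": (92, 968),
    "SHOXb": (92, 767),
    "SHOT": (43, 613),
}

if __name__ == "__main__":
    for name, (start, stop_first) in ORFS.items():
        coding = orf_length(start, stop_first)
        with_stop = orf_length(start, stop_first, include_stop=True)
        print(f"{name}: {coding} bp coding ({coding // 3} codons), "
              f"{with_stop} bp including the stop codon")
```

With the stop codon excluded, the SHOXa and SHOXb coordinates give 876 bp (292 codons) and 675 bp (225 codons). For SHOT the same convention gives 570 bp (190 codons), so the stated 573 bp appears to count the stop codon as well.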
[0022]FIG. 5 gives the exon/intron organization of the human SHOX gene and the respective positions in the nucleotide sequence (Intron/Exon sequences (SEQ ID NOS 44-49, respectively in order of appearance) and Exon/Intron sequences (SEQ ID NOS 50-55, respectively in order of appearance)).
BRIEF DESCRIPTION OF THE SEQ ID:
[0023]SEQ ID NO. 1: translated amino acid sequence of the homeobox domain (180 bp)
SEQ ID NO. 2: exon II (ET93) of the SHOX gene
SEQ ID NO. 3: exon I (G310) of the SHOX gene
SEQ ID NO. 4: exon III (ET45) of the SHOX gene
SEQ ID NO. 5: exon IV (G108) of the SHOX gene
SEQ ID NO. 6: exon Va of the SHOX gene
SEQ ID NO. 7: exon Vb of the SHOX gene
SEQ ID NO. 8: preliminary nucleotide sequence of the SHOX gene
SEQ ID NO. 9: ET92 gene
SEQ ID NO. 10: SHOXa sequence (see also FIG. 2)
SEQ ID NO. 11: transcription factor A (see also FIG. 2)
SEQ ID NO. 12: SHOXb sequence (see also FIG. 3)
SEQ ID NO. 13: transcription factor B (see also FIG. 3)
SEQ ID NO. 14: SHOX gene
SEQ ID NO. 15: SHOT sequence
SEQ ID NO. 16: transcription factor C (see also FIG. 4)
[0024]Since the target gene leading to disorders in human growth (e.g. short stature region) was unknown prior to the present invention, the biological and clinical association of patients with this deletion could give insights to the function of this gene. In the present study, fluorescence in situ hybridization (FISH) was used to examine metaphase and interphase lymphocyte nuclei of six patients. The aim was to test all cosmids of the overlapping set for their utility as FISH probes and to determine the breakpoint regions in all four cases, thereby determining the minimal critical region for the short stature gene.
[0025]Duplication and deletion of genomic DNA can be technically assessed by carefully controlled quantitative PCR or dose estimation on Southern blots or by using RFLPs. However, a particularly reliable method for the accurate distinction between single and double dose of markers is FISH, the clinical application of which is presently routine. Whereas in interphase FISH, the pure absence or presence of a molecular marker can be evaluated, FISH on metaphase chromosomes may provide a semi-quantitative measurement of inter-cosmid deletions. The present inventor has determined that deletions of about 10 kb (25% of signal reduction) can still be detected. This is of importance, as practically all disease genes on the human X chromosome have been associated with smaller and larger deletions in the range from a few kilobases to several megabases of DNA (Nelson et al., 1995).
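One way to read the figure of about 10 kb corresponding to a 25% signal reduction is as the deleted fraction of a cosmid probe. The sketch below rests on two assumptions that the text does not state: that the FISH signal scales linearly with the length of target DNA still present under the probe, and that the cosmid insert is about 40 kb (a typical size). The function name is hypothetical.

```python
# Sketch: expected fractional loss of FISH signal for a deletion inside a probe.
# Assumptions (not stated in the text): signal is proportional to the length of
# target DNA remaining under the probe, and a cosmid insert is roughly 40 kb.

ASSUMED_COSMID_INSERT_KB = 40.0  # assumption, typical cosmid insert size


def signal_reduction(deleted_kb, probe_kb=ASSUMED_COSMID_INSERT_KB):
    """Return the fraction of probe signal lost (0.0 to 1.0) for a given deletion."""
    if probe_kb <= 0:
        raise ValueError("probe length must be positive")
    # A deletion cannot remove more target than the probe covers.
    deleted = min(max(deleted_kb, 0.0), probe_kb)
    return deleted / probe_kb


if __name__ == "__main__":
    for deleted in (5, 10, 20, 40):
        print(f"{deleted} kb deleted: {signal_reduction(deleted):.0%} signal reduction")
```

Under these assumptions a 10 kb deletion within a 40 kb cosmid corresponds to the 25% reduction quoted above.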
[0026]Subject of the present invention are therefore DNA sequences or fragments thereof which are part of the genes responsible for human growth (or for short stature, respectively, in case of genetic defects in these genes). Three genes responsible for human growth were identified: SHOX, pET92 and SHOT. DNA sequences or fragments of these genes, as well as the respective full length DNA sequences of these genes can be transformed in an appropriate vector and transfected into cells. When such vectors are introduced into cells in an appropriate way as they are present in healthy humans, it is devisable to treat diseases involved with short stature, e.g. Turner syndrome, by modern means of gene therapy. For example, short stature can be treated by removing the respective mutated growth genes responsible for short stature. It is also possible to stimulate the respective genes which compensate the action of the genes responsible for short stature, i.e. by inserting DNA sequences before, after or within the growth/short stature genes in order to increase the expression of the healthy alleles. By such modifications of the genes, the growth/short stature genes become activated or silent, respectively. This can be accomplished by inserting DNA sequences at appropriate sites within or adjacent to the gene, so that these inserted DNA sequences interfere with the growth/short stature genes and thereby activate or prevent their transcription. It is also devisable to insert a regulatory element (e.g. a promotor sequence) before said growth genes to stimulate the genes to become active. It is further devisable to stimulate the respective promotor sequence in order to overexpress--in the case of Turner syndrome--the healthy functional allele and to compensate for the missing allele. The modification of genes can be generally achieved by inserting exogenous DNA sequences into the growth gene/short stature gene via homologous recombination.
[0027]The DNA sequences according to the present invention can also be used for transformation of said sequences into animals, such as mammals, via an appropriate vector system. These transgenic animals can then be used for in vivo investigations for screening or identifying pharmaceutical agents which are useful in the treatment of diseases involved with short stature. If the animals positively respond to the administration of a candidate compound or agent, such agent or compound or derivatives thereof would be devisable as pharmaceutical agents. By appropriate means, the DNA sequences of the present invention can also be used in genetic experiments aiming at finding methods in order to compensate for the loss of genes responsible for short stature (knock-out animals).
[0028]In a further object of this invention, the DNA sequences can also be used to be transformed into cells. These cells can be used for identifying pharmaceutical agents useful for the treatment of diseases involved with short stature, or for screening of such compounds or library of compounds. In an appropriate test system, variations in the phenotype or in the expression pattern of these cells can be determined, thereby allowing the identification of interesting candidate agents in the development of pharmaceutical drugs.
[0029]The DNA sequences of the present invention can also be used for the design of appropriate primers which hybridize with segments of the short stature genes or fragments thereof under stringent conditions. Appropriate primer sequences can be constructed which are useful in the diagnosis of people who have a genetic defect causing short stature. In this respect it is noteworthy that the two mutations found occur at the identical position, suggesting that a mutational hot spot exists.
[0030]In general, DNA sequences according to the present invention are understood to embrace also such DNA sequences which are degenerate to the specific sequences shown, based on the degeneracy of the genetic code, or which hybridize under stringent conditions with the specifically shown DNA sequences.
[0031]The present invention encompasses especially the following aspects:
[0032]a) An isolated human nucleic acid molecule encoding polypeptides containing a homeobox domain of sixty amino acids having the amino acid sequence of SEQ ID NO: 1 and having regulating activity on human growth.
[0033]b) An isolated DNA molecule comprising the nucleotide sequence essentially as indicated in FIG. 2, FIG. 3 or FIG. 4, and especially as shown in SEQ ID NO: 10, SEQ ID NO: 12 or SEQ ID NO: 15.
[0034]c) DNA molecules capable of hybridizing to the DNA molecules of item b).
[0035]d) DNA molecules of item c) above which are capable of hybridization with the DNA molecules of item b) under a temperature of 60-70° C. and in the presence of a standard buffer solution.
[0036]e) DNA molecules comprising a nucleotide sequence having a homology of seventy percent or higher with the nucleotide sequence of SEQ ID NO: 10, SEQ ID NO: 12 or SEQ ID NO: 15 and encoding a polypeptide having regulating activity on human growth.
[0037]f) Human growth proteins having the amino acid sequence of SEQ ID NO: 11, 13 or 16 or a functional fragment thereof.
[0038]g) Antibodies obtained from immunization of animals with human growth proteins of item f) or antigenic variants thereof.
[0039]h) Pharmaceutical compositions comprising human growth proteins or functional fragments thereof for treating disorders caused by genetic mutations of the human growth gene.
[0040]i) A method of screening for a substance effective for the treatment of disorders mentioned above under item h) comprising detecting messenger RNA hybridizing to any of the DNA molecules described in a)-e) so as to measure any enhancement in the expression levels of the DNA molecule in response to treatment of the host cell with that substance.
[0041]j) An expression vector or plasmid containing any of the nucleic acid molecules described in a)-e) above which enables the DNA molecules to be expressed in mammalian cells.
[0042]k) A method for the determination of the gene or genes responsible for short stature in a biological sample of body tissues or body fluids.
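Item e) above refers to a homology of seventy percent or higher. One common reading of such a threshold is percent identity over a pairwise alignment. The sketch below assumes the two sequences have already been aligned to equal length and simply counts matching positions; the function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
# Sketch: percent identity between two pre-aligned nucleotide sequences.
# Assumption: the sequences are already aligned and of equal length; no gap
# handling or alignment step is performed here.


def percent_identity(seq_a, seq_b):
    """Return the percentage of positions at which two aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    reference = "ACGTACGTAC"
    candidate = "ACGTTCGTAA"
    identity = percent_identity(reference, candidate)
    print(f"identity: {identity:.1f}%  meets 70% threshold: {identity >= 70.0}")
```

In practice the alignment itself would be produced by a dedicated alignment tool, and the way gaps are counted affects the resulting percentage.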
[0043]In the method k) above, preferably nucleotide amplification techniques, e.g. PCR, are used for detecting specific nucleotide sequences known to persons skilled in the art, and described, for example, by Mullis et al. 1986, Cold Spring Harbor Symposium Quant. Biol. 51, 263-273, and Saiki et al., 1988, Science 239, 487-491, which are incorporated herein by reference. The short stature nucleotide sequences to be determined are mainly those represented by sequences SEQ ID No. 2 to SEQ ID No. 7.
[0044]In principle, all oligonucleotide primers and probes for amplifying and detecting a genetic defect responsible for diminished human growth in a biological sample are suitable for amplifying a target short stature associated sequence. Especially suitable exon specific primer pairs according to the invention are provided in Table 1. Subsequently, a suitable detection, e.g. with a radioactive or non-radioactive label, is carried out.
TABLE-US-00001 TABLE 1
Exon          Sense primer   Antisense primer   Product (bp)   Ta (° C.)
5'-I (G310)   SP1            ASP1               194            58
3'-I (G310)   SP2            ASP2               295            58
II (ET93)     SP3            ASP3               262            76/72/68
III (ET45)    SP4            ASP4               120            65
IV (G108)     SP5            ASP5               154            62
Va (SHOXa)    SP6            ASP6               265            61
explanation of the abbreviations for the primers:
TABLE-US-00002
SP1    ATTTCCAATGGAAAGGCGTAAATAAC
SP2    ACGGCTTTTGTATCCAAGTCTTTTG
SP3    GCCCTGTGCCCTCCGCTCCC
SP4    GGCTCTTCACATCTCTCTCTGCTTC
SP5    CCACACTGACACCTGCTCCCTTTG
SP6    CCCGCAGGTCCAGGCTCAGCTG
ASP1   CGCCTCCGCCGTTACCGTCCTTG
ASP2   CCCTGGAGCCGGCGCGCAAAG
ASP3   CCCCGCCCCCGCCCCCGG
ASP4   CTTCAGGTCCCCCCAGTCCCG
ASP5   CTAGGGATCTTCAGAGGAAGAAAAAG
ASP6   GCTGCGCGGCGGGTCAGAGCCCCAG
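The primer sequences listed above can be tabulated programmatically, for example to compare their lengths with the 18-26 nucleotide range recited in claim 5. The sketch below is illustrative only: the sequences are copied from the list, while the helper names and the choice to report GC content are assumptions made for this example.

```python
# Sketch: basic properties of the exon specific primers (sequences from the list above).
# The helper functions are hypothetical; GC content is reported only as a
# convenient descriptive statistic.

PRIMERS = {
    "SP1": "ATTTCCAATGGAAAGGCGTAAATAAC",
    "SP2": "ACGGCTTTTGTATCCAAGTCTTTTG",
    "SP3": "GCCCTGTGCCCTCCGCTCCC",
    "SP4": "GGCTCTTCACATCTCTCTCTGCTTC",
    "SP5": "CCACACTGACACCTGCTCCCTTTG",
    "SP6": "CCCGCAGGTCCAGGCTCAGCTG",
    "ASP1": "CGCCTCCGCCGTTACCGTCCTTG",
    "ASP2": "CCCTGGAGCCGGCGCGCAAAG",
    "ASP3": "CCCCGCCCCCGCCCCCGG",
    "ASP4": "CTTCAGGTCCCCCCAGTCCCG",
    "ASP5": "CTAGGGATCTTCAGAGGAAGAAAAAG",
    "ASP6": "GCTGCGCGGCGGGTCAGAGCCCCAG",
}

CLAIMED_MIN_LEN, CLAIMED_MAX_LEN = 18, 26  # range recited in claim 5


def gc_fraction(seq):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def within_claimed_range(seq):
    """Return True if the primer length lies in the 18-26 nucleotide range."""
    return CLAIMED_MIN_LEN <= len(seq) <= CLAIMED_MAX_LEN


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name:5s} length={len(seq):2d} GC={gc_fraction(seq):.0%} "
              f"in range={within_claimed_range(seq)}")
```

A report of this kind makes it easy to see which primers sit at the ends of the claimed length range and which are unusually GC-rich, which is relevant when choosing the annealing temperatures given in Table 1.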
[0045]Also, a single stranded RNA can be used as target. Methods for reverse transcribing RNA into cDNA are also well known and described in Sambrook et al., Molecular Cloning: A Laboratory Manual, New York, Cold Spring Harbor Laboratory 1989. Alternatively, preferred methods for reverse transcription utilize thermostable DNA polymerases having RT activity.
[0046]Further, the technique described above can be used to select, from a group of persons of short stature, those persons whose short stature is characterized by a genetic defect, which as a consequence allows a more specific medical treatment.
[0047]In another subject of the present invention, the transcription factors A, B and C can be used as pharmaceutical agents. These transcription factors initiate a still unknown cascade of biological effects on a molecular level involved with human growth. These proteins or functional fragments thereof have a mitogenic effect on various cells. In particular, they have an osteogenic effect. They can be used in the treatment of bone diseases, such as osteoporosis, and especially all those diseases involved with disturbance in the bone calcium regulation.
[0048]As used herein, the term "isolated" refers to the original derivation of the DNA molecule by cloning. It is to be understood, however, that this term is not intended to be so limiting and, in fact, the present invention relates to both naturally occurring and synthetically prepared sequences, as will be understood by the skilled person in the art.
[0049]The DNA molecules of this invention may be used in forms of gene therapy involving the use of an expression plasmid prepared by incorporating an appropriate DNA sequence of this invention downstream from an expression promotor that effects expression in a mammalian host cell. Suitable host cells are procaryotic or eucaryotic cells. Procaryotic host cells are, for example, E. coli, Bacillus subtilis, and the like. By transfecting host cells with replicons originating from species adaptable to the host, that is, plasmid vectors containing replication starting point and regulator sequences, these host cells can be transfected with the desired gene or cDNA. Such vectors are preferably those having a sequence that provides the transfected cells with a property (phenotype) by which they can be selected. For example, for E. coli hosts the strain E. coli K12 is typically used, and for the vector either pBR322 or pUC plasmids can be generally employed. Examples of suitable promotors for E. coli hosts are the trp promotor, lac promotor or lpp promotor. If desired, secretion of the expression product through the cell membrane can be effected by connecting a DNA sequence coding for a signal peptide sequence at the 5' upstream side of the gene. Eucaryotic host cells include cells derived from vertebrates or yeast etc. As a vertebrate host cell, COS cells can be used (Cell, 1981, 23: 175-182), or CHO cells. Preferably, promotors can be used which are positioned 5' upstream of the gene to be expressed and having RNA splicing positions, polyadenylation and transcription termination sequences.
[0050]The transcription factors A, B and C of the present invention can be used to treat disorders caused by mutations in the human growth genes and can be used as growth promoting agents. Due to the polymorphism known in the case of eukaryotic genes, one or more amino acids may be substituted. Also, one or more amino acids in the polypeptides can be deleted or inserted at one or more sites in the amino acid sequence of the polypeptides of SEQ ID NO: 11, 13 or 16. Such polypeptides are generally referred to as equivalent polypeptides as long as the underlying biological activity of the unmodified polypeptide remains essentially unchanged.
[0051]The present invention is illustrated by the following examples.
EXAMPLE 1
Patients
[0052]All six patients studied had de novo sex chromosome aberrations.
[0053]CC is a girl with a karyotype 45,X/46,X psu dic (X) (Xqter→Xp22.3::Xp22.3→Xqter). At the last examination at 6 1/2 years of age, her height was 114 cm (25th-50th percentile). Her mother's height was 155 cm; the father was not available for analysis. For details, see Henke et al., 1991.
[0054]GA is a girl with a karyotype 46,X der X (3pter→3p23::Xp22.3→Xqter). At the last examination at 17 years, normal stature (159 cm) was observed. Her mother's height is 160 cm and her father's height 182 cm. For details, see Kulharya et al., 1995.
[0055]SS is a girl with a karyotype 46,X rea (X) (Xqter→Xq26::Xp22.3→Xq26:). At 11 years her height remained below the 3rd percentile growth curve for Japanese girls; her predicted adult height (148.5 cm) was below her target height (163 cm) and target range (155 to 191 cm). For details, see Ogata et al., 1992.
[0056]AK is a girl with a karyotype 46,X rea (X) (Xqter→Xp22.3::Xp22.3→Xp21.3:). At 13 years her height remained below the 2nd percentile growth curve for Japanese girls; her predicted adult height (142.8 cm) was below her target height (155.5 cm) and target range (147.5-163.5 cm). For details, see Ogata et al., 1995.
[0057]RY: the karyotype of the ring Y patient is 46,X,r(Y)/46,Xdic r(Y)/45,X[95:3:2], as examined on 100 lymphocytes; at 16 years of age his final height was 148 cm; the heights of his three brothers are all in the normal range with 170 cm (16 years, brother 1), 164 cm (14 years, brother 2) and 128 cm (9 years, brother 3), respectively. Growth retardation of this patient is so severe that it would also be compatible with an additional deletion of the GCY locus on Yq.
[0058]AT: boy with ataxia and inv(X); normal height of 116 cm at age 7, parents' heights are 156 cm and 190 cm, respectively.
Patients for Mutation Analysis:
[0059]250 individuals with idiopathic short stature were tested for mutations in SHOXa. The patients were selected on the following criteria: height for chronological age was below the 3rd centile of national height standards, minus 2 standard deviations (SDS); no causative disease was known, in particular: normal weight (length) for gestational age, normal body proportions, no chronic organic disorder, normal food intake, no psychiatric disorder, no skeletal dysplasia disorder, no thyroid or growth hormone deficiency.
Family A:
[0060]Cases 1 and 2 are short statured children of a German non-consanguineous family. The boy (case 1) was born at the 38th week of gestation by cesarean section. Birth weight was 2660 g, birth length 47 cm. He developed normally except for subnormal growth. On examination at the age of 6.4 years, he was proportionately small (106.8 cm, -2.6 SDS) and obese (22.7 kg), but otherwise normal. His bone age was not retarded (6 yrs) and bone dysplasia was excluded by X-ray analysis. IGF-I and IGFBP-3 levels as well as thyroid parameters in serum rendered GH or thyroid hormone deficiency unlikely. The girl (case 2) was born at term by cesarean section. Birth weight was 2920 g, birth length 47 cm. Her developmental milestones were normal, but by the age of 12 months poor growth was apparent (length: 67 cm, -3.0 SDS). At 4 years her height was 89.6 cm (-3.6 SDS). No dysmorphic features or dysproportions were apparent. She was not obese (13 kg). Her bone age was 3.5 years and bone dysplasia was excluded. Hormone parameters were normal. It is interesting to note that both the girl and the boy grow on the 50th percentile growth curve for females with Turner syndrome. The mother is the smallest of the family and has a mild rhizomelic dysproportion (142.3 cm, -3.8 SDS). One of her two sisters (150 cm, -2.5 SDS) and the maternal grandmother (153 cm, -2.0 SDS) are both short without any dysproportion. The other sister has normal stature (167 cm, +0.4 SDS). The father's height is 166 cm (-1.8 SDS) and the maternal grandfather's height is 165 cm (-1.9 SDS). The other patient was of Japanese origin and showed the identical mutation.
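The height standard deviation scores (SDS) quoted for the adult female relatives of Family A can be cross-checked against each other. The following is a minimal sketch, assuming that SDS is defined as (height - reference mean) / reference SD and that all four women were scored against the same adult female reference. The reference mean and SD are not stated in the text; they are back-calculated here from two of the reported pairs and are therefore illustrative only.

```python
# Heights (cm) and SDS values as reported for Family A (adult female relatives).
RELATIVES = {
    "mother": (142.3, -3.8),
    "short sister": (150.0, -2.5),
    "maternal grandmother": (153.0, -2.0),
    "tall sister": (167.0, 0.4),
}


def fit_reference(pair_a, pair_b):
    """Solve height = mean + sds * sd for (mean, sd) from two (height, sds) pairs."""
    (h1, z1), (h2, z2) = pair_a, pair_b
    sd = (h2 - h1) / (z2 - z1)
    mean = h1 - z1 * sd
    return mean, sd


# Assumption: mother and tall sister share one reference population.
REF_MEAN, REF_SD = fit_reference(RELATIVES["mother"], RELATIVES["tall sister"])


def sds(height_cm, mean=REF_MEAN, sd=REF_SD):
    """Standard deviation score of a height relative to the fitted reference."""
    return (height_cm - mean) / sd


if __name__ == "__main__":
    for name, (height, reported) in RELATIVES.items():
        print(f"{name}: reported {reported:+.1f}, recomputed {sds(height):+.2f}")
```

Under these assumptions the two relatives that were not used for the fit (the short sister and the maternal grandmother) come out within rounding of their reported scores, which suggests the values in the paragraph are internally consistent.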
EXAMPLE 2
Identification of the Short Stature Gene
[0061]A. In situ Hybridization
a) Fluorescence in situ Hybridization (FISH)
[0062]Fluorescence in situ hybridization (FISH) using cosmids residing in the Xp/Yp pseudoautosomal region (PAR1) was carried out. FISH studies using cosmids 64/75cos (LLNLc 110H032), E22cos (2e2), F1/14cos (110A7), M1/70cos (110E3), P99F2cos (43C11), P99cos (LLNLc110P2410), B6cosb (ICRFc104H0425), F20cos (34F5), F21cos (ICRFc104G0411), F3cos2 (9E3), F3cos1 (11E6), P117cos (29B11), P6cos1 (ICRFc104P0117), P6cos2 (LLNLc110E0625) and E4cos (15G7) were carried out according to published methods (Lichter and Cremer, 1992). In short, one microgram of the respective cosmid clone was labeled with biotin and hybridized to human metaphase chromosomes under conditions that suppress signals from repetitive DNA sequences. The hybridization signal was detected via FITC-conjugated avidin. FITC images were taken using a cooled charge-coupled device camera system (Photometrics, Tucson, Ariz.).
b) Physical Mapping
[0063]Cosmids were derived from the Lawrence Livermore National Laboratory X- and Y-chromosome libraries and the Imperial Cancer Research Fund London (now Max Planck Institute for Molecular Genetics Berlin) X chromosome library. Using cosmids distal to DXYS15, namely E4cos, P6cos2, P6cos1, P117cos and F3cos1, one can determine that two copies of E4cos, P6cos2 and P6cos1 and one copy of P117cos and F3cos1 are still present. The breakpoints of both patients AK and SS map on cosmid P6cos1, with a maximum physical distance of 10 kb from each other. It was concluded that the abnormal X chromosomes of AK and SS have deleted about 630 kb of DNA. Further cosmids were derived from the ICRF X chromosome specific cosmid library (ICRFc104), the Lawrence Livermore X chromosome specific cosmid library (LLNLc110) and the Y chromosome specific library (LLCO3'M'), as well as from a self-made cosmid library covering the entire genome. Cosmids were identified by hybridisation with all known probes mapping to this region and by using entire YACs as probes. Where overlaps could not be proven using known probes, end probes from several cosmids were used to verify them.
c) Southern Blot Hybridisation
[0064]Southern blot analysis using different pseudoautosomal markers has provided evidence that the breakpoint on the X chromosome of patient CC resides between DXYS20 (3cosPP) and DXYS60 (U7A) (Henke et al., 1991). In order to confirm this finding and to refine the breakpoint location, cosmids 64/75cos, E22cos, F1/14cos, M1/70cos, F2cos, P99F2cos and P99cos were used as FISH probes. The breakpoint on the abnormal X of patient CC could be located between cosmids 64/75cos (one copy) and F1/14cos (two copies) on the E22PAC. Patient CC, who has normal stature, has consequently lost approximately 260-290 kb of DNA.
[0065]Southern blot hybridisations were carried out at high stringency conditions in Church buffer (0.5 M NaPi pH 7.2, 7% SDS, 1 mM EDTA) at 65° C. and washed in 40 mM NaPi, 1% SDS at 65° C.
d) FISH Analysis
[0066]Biotinylated cosmid DNA (insert size 32-45 kb) or cosmid fragments (10-16 kb) were hybridised to metaphase chromosomes from stimulated lymphocytes of patients under conditions as described previously (Lichter and Cremer, 1992). The hybridised probe was detected via avidin-conjugated FITC.
e) PCR Amplification
[0067]All PCRs were performed in 50 μl volumes containing 100 pg-200 ng template, 20 pmol of each primer, 200 μM dNTPs (Pharmacia), 1.5 mM MgCl2, 75 mM Tris/HCl pH 9, 20 mM (NH4)2SO4, 0.01% (w/v) Tween 20 and 2 U of Goldstar DNA Polymerase (Eurogentec). Thermal cycling was carried out in a Thermocycler GeneE (Techne).
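The reaction composition mixes absolute amounts (pmol, units) with concentrations (μM, mM). As a minimal sketch of the unit arithmetic, the following converts the stated quantities for the 50 μl reaction. The helper names are illustrative and not part of the described method; only the numbers come from the paragraph above.

```python
REACTION_VOLUME_UL = 50.0  # stated reaction volume


def amount_to_micromolar(pmol, volume_ul):
    """pmol per microlitre is numerically identical to micromolar."""
    return pmol / volume_ul


def micromolar_to_nmol(conc_um, volume_ul):
    """Concentration (uM) times volume (ul) gives pmol; divide by 1000 for nmol."""
    return conc_um * volume_ul / 1000.0


# 20 pmol of each primer in 50 ul.
primer_um = amount_to_micromolar(20.0, REACTION_VOLUME_UL)

# 200 uM dNTPs and 1.5 mM (1500 uM) MgCl2 in 50 ul.
dntp_nmol = micromolar_to_nmol(200.0, REACTION_VOLUME_UL)
mgcl2_nmol = micromolar_to_nmol(1500.0, REACTION_VOLUME_UL)

# 2 U polymerase in 50 ul.
polymerase_u_per_ul = 2.0 / REACTION_VOLUME_UL

if __name__ == "__main__":
    print(f"primer: {primer_um} uM each")
    print(f"dNTPs: {dntp_nmol} nmol, MgCl2: {mgcl2_nmol} nmol")
    print(f"polymerase: {polymerase_u_per_ul} U/ul")
```

Expressed this way, each primer is present at 0.4 μM, which makes the recipe easy to scale to other reaction volumes.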
f) Exon Amplification
[0068]Four cosmid pools, each consisting of four to five clones from the cosmid contigs, were used for exon amplification experiments. The cosmids in each cosmid pool were partially digested with Sau3A. Gel purified fractions in the size range of 4-10 kb were cloned in the BamHI digested pSPL3B vector (Burn et al., 1995) and used for the exon amplification experiments as previously described (Church et al., 1994).
g) Genomic Sequencing
[0069]Sonicated fragments of the two cosmids LLOYNCO3'M'15D10 and LLOYNCO3'M'34F5 were subcloned separately into M13mp18 vectors. From each cosmid library at least 1000 plaques were picked, and M13 DNA was prepared and sequenced using dye-terminators, ThermoSequenase (Amersham) and universal M13-primer (MWG-BioTech). The gels were run on ABI-377 sequencers and data were assembled and edited with the GAP4 program (Staden).
[0070]Of all six patients, GA had the least well characterized chromosomal breakpoint. The most distal markers previously tested for their presence or absence on the X were DXS1060 and DXS996, which map approximately 6 Mb from the telomere (Nelson et al., 1995). Several cosmids containing different gene sequences from within PAR1 (MIC2, ANT3, CSF2RA, and XE7) were tested and all were present on the translocation chromosome. Cosmids from within the short stature critical region e.g., chromosome, thereby placing the translocation breakpoint on cosmid M1/70cos. A quantitative comparison of the signal intensities of M1/70cos between the normal and the rearranged X indicates that approximately 70% of this cosmid is deleted.
TABLE 2
This table summarizes the FISH data for the 16 cosmids tested on four patients.
Cosmid: CC GA AK SS
64/75cos: - -
E22cos: - -
F1/14cos: + -
M1/70cos: + (+)
F2cos: + +
P99F2cos: + +
P99cos: + +
B6cos: +
F20cos:
F21cos:
F3cos2:
F3cos1: - -
P117cos: - -
P6cos1: + +
P6cos2: + +
E4cos: + +
[-] one copy; indicates that the respective cosmid was deleted on the rearranged X, but present on the normal X chromosome
[+] two copies; indicates that the respective cosmid is present on the rearranged and on the normal X chromosome
[(+)] breakpoint region; indicates that the breakpoint occurs within the cosmid as shown by FISH
[0071]In summary, the molecular analysis of six patients with X chromosomal rearrangements using fluorescence-labeled cosmid probes and in situ hybridization indicates that the short stature critical region can be narrowed down to a 270 kb interval, bounded by the breakpoint of patient GA on its centromere distal side and by those of patients AK and SS on its centromere proximal side.
[0072]Genotype-phenotype correlations may be informative and have been chosen to delineate the short stature critical interval on the human X and Y chromosome. In the present study FISH analysis was used to study metaphase spreads and interphase nuclei of lymphocytes from patients carrying deletions and translocations on the X chromosome and breakpoints within Xp22.3. These breakpoints appear to be clustered in two of the four patients (AK and SS), presumably due to the presence of sequences predisposing to chromosome rearrangements. One additional patient, the ring Y patient RY, has been found with an interruption in the 270 kb critical region, thereby reducing the critical interval to a 170 kb region.
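The way a breakpoint is read off from FISH copy-number calls can be sketched as follows. This is an illustration only, under two labeled assumptions: that the cosmids are listed in chromosomal order with the deleted side first, and that the calls for the first seven cosmids of Table 2 belong to patients CC and GA (as suggested by the descriptions of these two patients in the text). The function name and data layout are hypothetical.

```python
# Calls: "-" one copy (deleted on the rearranged X), "+" two copies,
# "(+)" breakpoint within the cosmid.
# Assumption: column assignment to CC and GA is inferred from the text.
CALLS = {
    "CC": [("64/75cos", "-"), ("E22cos", "-"), ("F1/14cos", "+"),
           ("M1/70cos", "+"), ("F2cos", "+"), ("P99F2cos", "+"), ("P99cos", "+")],
    "GA": [("64/75cos", "-"), ("E22cos", "-"), ("F1/14cos", "-"),
           ("M1/70cos", "(+)"), ("F2cos", "+"), ("P99F2cos", "+"), ("P99cos", "+")],
}


def locate_breakpoint(calls):
    """Return ('within', cosmid) or ('between', last_deleted, first_retained).

    Returns None when the ordered calls contain no transition.
    """
    for i, (cosmid, call) in enumerate(calls):
        if call == "(+)":
            return ("within", cosmid)
        if call == "+" and i > 0 and calls[i - 1][1] == "-":
            return ("between", calls[i - 1][0], cosmid)
    return None


if __name__ == "__main__":
    for patient, calls in CALLS.items():
        print(patient, locate_breakpoint(calls))
```

With these inputs the sketch places the CC breakpoint between the last one-copy and the first two-copy cosmid and the GA breakpoint within M1/70cos, in line with the description of both patients.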
[0073]By correlating the height of all six individuals with their deletion breakpoint, an interval of 170 kb was mapped to within the pseudoautosomal region, presence or absence of which has a significant effect on stature. This interval is bounded by the X chromosomal breakpoint of patient GA at 340 kb from the telomere (Xptel) distally and by the breakpoints of patients AT and RY at 510/520 kb Xptel proximally. This assignment constitutes a considerable reduction of the critical interval to almost one fourth of its previous size (Ogata et al., 1992; Ogata et al., 1995). A small set of six to eight cosmids are now available for FISH experiments to test for the prevalence and significance of this genomic locus on a large series of patients with idiopathic short stature.
B. Identification of the Candidate Short Stature Gene
[0074]To search for transcription units within the smallest 170 kb critical region, exon trapping and cDNA selection on six cosmids (110E3, F2cos, 43C11, P2410, 15D10, 34F5) were carried out. Three different positive clones (ET93, ET45 and G108) were isolated by exon trapping, all of which mapped back to cosmid 34F5. Previous studies using cDNA selection protocols and more than 25 different cDNA libraries had proven unsuccessful, suggesting that genes in this interval are expressed at very low abundance.
[0075]To find out whether any gene in this interval was missed, the nucleotide sequence of about 140 kb from this region of the PAR1 was determined, using the random M13 method and dye terminator chemistry. The cosmids for sequence analysis were chosen to minimally overlap with each other and to collectively span the critical interval. DNA sequence analysis and subsequent protein prediction by the "X Grail" program, version 1.3c as well as by the exon-trapping program FEXHB were carried out and confirmed all 3 previously cloned exons. No protein-coding genes other than the previously isolated one could be detected.
C. Isolation of the Short Stature Candidate Gene SHOX
[0076]Assuming that all three exon clones ET93, ET45 and G108 are part of the same gene, they were used collectively as probes to screen 14 different cDNA libraries from 12 different fetal (lung, liver, brain 1 and 2) and adult tissues (ovary, placenta 1 and 2, fibroblast, skeletal muscle, bone marrow, brain, brain stem, hypothalamus, pituitary). Not a single clone was detected among approximately 14 million plated clones. To isolate the full-length transcript, 3' and 5'RACE were carried out. For 3'RACE, primers from exon G108 were used on RNA from placenta, skeletal muscle and bone marrow fibroblasts, tissues in which G108 was shown to be expressed. Two different 3'RACE clones of 1173 and 652 bp were derived from all three tissues, suggesting that two different 3' exons a and b exist. The two different forms were termed SHOXa and SHOXb.
[0077]To increase the chances of isolating the complete 5' portion of a gene known to be expressed at low abundance, a HeLa cell line was treated with retinoic acid and the phorbol ester PMA. RNA from this induced cell line and RNA from placenta and skeletal muscle were used for the construction of a `Marathon cDNA library`. Identical 5'RACE cDNA clones were isolated from all three tissues.
Experimental Procedure:
[0078]RT-PCR and cDNA Library Construction
[0079]Human polyA+ RNA of heart, pancreas, placenta, skeletal muscle, fetal kidney and liver was purchased from Clontech. Total RNA was isolated from a bone marrow fibroblast cell line with TRIZOL reagent (Gibco-BRL) as described by the manufacturer. First strand cDNA synthesis was performed with the Superscript first strand cDNA synthesis kit (Gibco-BRL) starting with 100 ng polyA+ RNA or 10 μg total RNA using the oligo(dT)-adapter primer (GGCCACGCGTCGACTAGTAC[dT]20N). After first strand cDNA synthesis the reaction mix was diluted 1/10. For further PCR experiments 5 μl of this dilution were used.
[0080]A `Marathon cDNA library` was constructed from skeletal muscle and placenta polyA+ RNA with the Marathon cDNA amplification kit (Clontech) as described by the manufacturer.
[0081]Fetal brain (catalog # HL5015b), fetal lung (HL3022a), ovary (HL1098a), pituitary gland (HL1097v) and hypothalamus (HL1172b) cDNA libraries were purchased from Clontech. Brain, kidney, liver and lung cDNA libraries were part of the quick screen human cDNA library panel (Clontech). Fetal muscle cDNA library was obtained from the UK Human Genome Mapping Project Resource Center.
D. Sequence Analysis and Structure of SHOX Gene
[0082]A consensus sequence of SHOXa and SHOXb (1870 and 1349 bp, respectively) was assembled by analysis of sequences from the 5' and 3'RACE derived clones. A single open reading frame was identified in each transcript (1870 bp for SHOXa and 1349 bp for SHOXb), resulting in two proteins of 292 (SHOXa) and 225 amino acids (SHOXb). Both transcripts a and b share a common 5' end, but have a different last 3' exon, a finding suggestive of the use of alternative splicing signals. A complete alignment between the two cDNAs and the sequenced genomic DNA from cosmids LLOYNCO3'M'15D10 and LLOYNCO3'M'34F5 was achieved, allowing establishment of the exon-intron structure (FIG. 4). The gene is composed of 6 exons ranging in size from 58 bp (exon III) to 1146 bp (exon Va). Exon I contains a CpG-island, the start codon and the 5' region. A stop codon as well as the 3'-noncoding region is located in each of the alternatively spliced exons Va and Vb.
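The lengths quoted for the two transcripts can be checked against the 3'RACE clone sizes given earlier (1173 and 652 bp). The sketch below assumes that the longer 3'RACE clone belongs to SHOXa (1870 bp) and the shorter one to SHOXb (1349 bp), and that the two transcripts differ only in their last exon; the pairing is an assumption drawn from the matching length differences, not a statement of the source.

```python
# Lengths in bp as given in the text.
SHOXA_CDNA = 1870
SHOXB_CDNA = 1349
RACE3_LONG = 1173
RACE3_SHORT = 652

# If both transcripts share the same 5' part and differ only in the 3' exon,
# the two differences should be identical.
cdna_difference = SHOXA_CDNA - SHOXB_CDNA
race_difference = RACE3_LONG - RACE3_SHORT


def coding_length_nt(protein_length_aa):
    """Nucleotides needed to encode a protein, including the stop codon."""
    return 3 * (protein_length_aa + 1)


shoxa_coding_nt = coding_length_nt(292)
shoxb_coding_nt = coding_length_nt(225)

if __name__ == "__main__":
    print("cDNA difference:", cdna_difference, "3'RACE difference:", race_difference)
    print("coding nt SHOXa:", shoxa_coding_nt, "SHOXb:", shoxb_coding_nt)
```

Both differences equal 521 bp under this pairing, and the coding regions (879 and 678 nucleotides including the stop codon) occupy only part of each transcript, the remainder being untranslated sequence.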
EXAMPLE 3
[0083]Two cDNAs have been identified which map to the 160 kb region identified as critical for short stature. These cDNAs correspond to the genes SHOX and pET92. The cDNAs were identified by the hybridization of subclones of the cosmids to cDNA libraries.
[0084]Employing the set of cosmid clones with complete coverage of the critical region has now provided the genetic material to identify the causative gene. Positional cloning projects aimed at the isolation of the genes from this region are done by exon trapping and cDNA selection techniques. By virtue of their location within the pseudoautosomal region, these genes can be assumed to escape X-inactivation and to exert a dosage effect.
[0085]The cloning of the gene leading to short stature when absent (haploid) or deficient, represents a further step forward in diagnostic accuracy, providing the basis for mutational analysis within the gene by e.g. single strand conformation polymorphism (SSCP). In addition, cloning of this gene and its subsequent biochemical characterization has opened the way to a deeper understanding of biological processes involved in growth control.
[0086]The DNA sequences of the present invention provide a first molecular test to identify individuals with a specific genetic disorder within the complex heterogeneous group of patients with idiopathic short stature.
EXAMPLE 4
Expression Pattern of SHOXa and SHOXb
[0087]Northern blot analysis using single exons as hybridisation probes revealed a different expression profile for every exon, strongly suggesting that the bands of different size and intensities represent cross-hybridisation products to other G,C rich gene sequences. To achieve a more realistic expression profile of both transcripts SHOXa and b, RT-PCR experiments on RNA from different tissues were carried out. Whereas expression of SHOXa was observed in skeletal muscle, placenta, pancreas, heart and bone marrow fibroblasts, expression of SHOXb was restricted to fetal kidney, skeletal muscle and bone marrow fibroblasts, with by far the highest expression in bone marrow fibroblasts.
[0088]The expression of SHOXa in several cDNA libraries made of fetal brain, lung and muscle, of adult brain, lung and pituitary and of SHOXb in none of the tested libraries gives additional evidence that one spliced form (SHOXa) is more broadly expressed and the other (SHOXb) expressed in a predominantly tissue-specific manner.
[0089]To assess the transcriptional activity of SHOXa and SHOXb on the X and Y chromosome we used RT-PCR of RNA extracted from various cell lines containing the active X, the inactive X or the Y chromosome as the only human chromosomes. All cell lines revealed an amplification product of the expected length of 119 bp (SHOXa) and 541 bp (SHOXb), providing clear evidence that both SHOXa and b escape X-inactivation.
[0090]SHOXa and SHOXb encode novel homeodomain proteins. SHOX is highly conserved across species from mammalian to fish and flies. The very 5' end and the very 3' end--besides the homeodomain--are likely conserved regions between man and mouse, indicating a functional significance. Differences in those amino acid regions have not been allowed to accumulate during evolution between man and mouse.
Experimental Procedures:
a) 5' and 3'RACE
[0091]To clone the 5' end of the SHOXa and b transcripts, 5'RACE was performed using the constructed `Marathon cDNA libraries`. The following oligonucleotide primers were used: SHOX B rev, GAAAGGCATCCGTAAGGCTCCC (position 697-718, reverse strand [r]) and the adaptor primer AP1. PCR was carried out using touchdown parameters: 94° C. for 2 min, 94° C. for 30 sec, 70° C. for 30 sec, 72° C. for 2 min for 5 cycles. 94° C. for 30 sec, 66° C. for 30 sec, 72° C. for 2 min for 5 cycles. 94° C. for 30 sec, 62° C. for 30 sec, 72° C. for 2 min for 25 cycles. A second round of amplification was performed using 1/100 of the PCR product and the following nested oligonucleotide primers: SHOX A rev, GACGCCTTTATGCATCTGATTCTC (position 617-640 r) and the adaptor primer AP2. PCR was carried out for 35 cycles with an annealing temperature of 60° C.
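The first-round touchdown program for 5'RACE can be written down as data, which makes the cycle count and minimum run time explicit. This is a minimal sketch; the `Stage` class and helper functions are illustrative, and the run time ignores ramping between temperatures, which depends on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One block of identical cycles: 94 C denature, anneal, 72 C extend."""
    anneal_c: int
    cycles: int
    denature_s: int = 30
    anneal_s: int = 30
    extend_s: int = 120


INITIAL_DENATURE_S = 120  # 94 C for 2 min

# First-round 5'RACE program: annealing steps down 70 -> 66 -> 62 C.
RACE5_ROUND1 = [Stage(70, 5), Stage(66, 5), Stage(62, 25)]


def total_cycles(program):
    return sum(stage.cycles for stage in program)


def hold_time_s(program, initial_s=INITIAL_DENATURE_S):
    """Sum of programmed hold times, excluding temperature ramps."""
    per_stage = (
        stage.cycles * (stage.denature_s + stage.anneal_s + stage.extend_s)
        for stage in program
    )
    return initial_s + sum(per_stage)


if __name__ == "__main__":
    print("cycles:", total_cycles(RACE5_ROUND1))
    print("hold time (min):", hold_time_s(RACE5_ROUND1) / 60)
```

The same structure can hold the other touchdown programs in this example by changing the annealing temperatures, extension times and cycle counts.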
[0092]To clone the 3' end of the SHOXa and b transcripts, 3'RACE was performed as previously described (Frohman et al., 1988) using oligo(dT)adaptor primed first strand cDNA. The following oligonucleotide primers were used: SHOX A for, GAATCAGATGCATAAAGGCGTC (position 619-640) and the oligo(dT)adaptor. PCR was carried out using the following parameters: 94° C. for 2 min, 94° C. for 30 sec, 62° C. for 30 sec, 72° C. for 2 min for 35 cycles. A second round of amplification was performed using 1/100 of the PCR product and the following nested oligonucleotide primers: SHOX B for, GGGAGCCTTACGGATGCCTTTC (position 697-718) and the oligo(dT)adaptor. PCR was carried out for 35 cycles with an annealing temperature of 62° C.
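The forward primers used for 3'RACE and the reverse primers used for 5'RACE cover the same cDNA positions on opposite strands, so each pair should be related by reverse complementation. The following short check uses only the primer sequences and positions quoted in the 5' and 3'RACE paragraphs; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def span_length(start, end):
    """Length of an inclusive, 1-based coordinate span."""
    return end - start + 1


SHOX_A_FOR = "GAATCAGATGCATAAAGGCGTC"    # position 619-640
SHOX_A_REV = "GACGCCTTTATGCATCTGATTCTC"  # position 617-640, reverse strand
SHOX_B_FOR = "GGGAGCCTTACGGATGCCTTTC"    # position 697-718
SHOX_B_REV = "GAAAGGCATCCGTAAGGCTCCC"    # position 697-718, reverse strand

if __name__ == "__main__":
    # SHOX B for and SHOX B rev cover the same 22 positions.
    print(reverse_complement(SHOX_B_FOR) == SHOX_B_REV)
    # SHOX A rev extends two bases further (617-618) than SHOX A for.
    print(SHOX_A_REV.startswith(reverse_complement(SHOX_A_FOR)))
```

The primer lengths also agree with the stated coordinate spans (22 nucleotides for 697-718 and 619-640, 24 nucleotides for 617-640).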
[0093]To validate the sequences of SHOXa and SHOXb transcripts, PCR was performed with a 5' oligonucleotide primer and a 3' oligonucleotide primer. For SHOXa the following primers were used: G310 for, AGCCCCGGCTGCTCGCCAGC (position 59-78) and SHOX D rev, CTGCGCGGCGGGTCAGAGCCCCAG (position 959-982 r). For SHOXb the following primers were used: G310 for, AGCCCCGGCTGCTCGCCAGC and SHOX2A rev, GCCTCAGCAGCAAAGCAAGATCCC (position 1215-1238 r). Both PCRs were carried out using touchdown parameters: 94° C. for 2 min, 94° C. for 30 sec, 70° C. for 30 sec, 72° C. for 2 min for 5 cycles. 94° C. for 30 sec, 68° C. for 30 sec, 72° C. for 2 min for 5 cycles. 94° C. for 30 sec, 65° C. for 30 sec, 72° C. for 2 min for 35 cycles. Products were gel-purified and cloned for sequencing analysis.
b) SSCP Analysis
[0094]SSCP analysis was performed on genomic amplified DNA from patients according to a previously described method (Orita et al., 1989). One to five μl of the PCR products were mixed with 5 μl of denaturation solution containing 95% formamide and 10 mM EDTA pH 8 and denatured at 95° C. for 10 min. Samples were immediately chilled on ice and loaded on a 10% polyacrylamide gel (acrylamide:bisacrylamide=37.5:1 and 29:1; Multislotgel, TGGE base, Qiagen) containing 2% glycerol and 1×TBE. Gels were run at 15° C. with 500 V for 3 to 5 hours and silver stained as described in the TGGE handbook (Qiagen, 1993).
c) Cloning and Sequencing of PCR Products
[0095]PCR products were cloned into pMOSBlue using the pMOSBlue T-Vector Kit from Amersham. Overnight cultures of single colonies were lysed in 100 μl H2O by boiling for 10 min. The lysates were used as templates for PCRs with specific primers for the cloned PCR product. SSCP of PCR products allowed the identification of clones containing different alleles. The clones were sequenced with CY5 labelled vector primers Uni and T7 by the cycle sequencing method described by the manufacturer (ThermoSequenase Kit (Amersham)) on an ALF express automated sequencer (Pharmacia).
d) PCR Screening of cDNA Libraries
[0096]To detect expression of SHOXa and b, a PCR screening of several cDNA libraries and first strand cDNAs was carried out with SHOXa and b specific primers. For the cDNA libraries a DNA equivalent of 5×10^8 pfu was used. For SHOXa, primers SHOX E rev, GCTGAGCCTGGACCTGTTGGAAAGG (position 713-737 r) and SHOX A for were used. For SHOXb, the following primers were used: SHOX B for and SHOX2A rev. Both PCRs were carried out using touchdown parameters: 94° C. for 2 min; 94° C. for 30 sec, 68° C. for 30 sec, 72° C. for 40 sec for 5 cycles. 94° C. for 30 sec, 65° C. for 30 sec, 72° C. for 40 sec for 5 cycles. 94° C. for 30 sec, 62° C. for 30 sec, 72° C. for 40 sec for 35 cycles.
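The expected product sizes follow directly from the primer coordinates given in this example. The sketch below assumes that the SHOXa forward primer used for screening is the SHOX A for primer described for 3'RACE (position 619-640) and that all positions are 1-based and inclusive on the respective cDNA; the function name is illustrative.

```python
def amplicon_length(forward_start, reverse_end):
    """Product length from the 5' position of the forward primer to the
    last position covered by the reverse primer (1-based, inclusive)."""
    return reverse_end - forward_start + 1


# SHOXa screening: SHOX A for (619-640) with SHOX E rev (713-737 r).
shoxa_screen = amplicon_length(619, 737)

# SHOXa validation: G310 for (59-78) with SHOX D rev (959-982 r).
shoxa_validation = amplicon_length(59, 982)

# SHOXb screening: SHOX B for (697-718) with SHOX2A rev (1215-1238 r).
# Note: this arithmetic gives one base more than the 541 bp product quoted
# for SHOXb in the text; the text does not explain the difference.
shoxb_screen = amplicon_length(697, 1238)

if __name__ == "__main__":
    print("SHOXa screening product:", shoxa_screen, "bp")
    print("SHOXa validation product:", shoxa_validation, "bp")
    print("SHOXb screening product:", shoxb_screen, "bp")
```

For SHOXa the computed screening product matches the 119 bp amplification product reported for the RT-PCR experiments.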
EXAMPLE 5
Expression Pattern of OG12, the Putative Mouse Homolog of Both SHOX and SHOT
[0098]In situ hybridisation on mouse embryos ranging from day 5 p.c. to day 18.5 p.c., as well as on fetal and newborn animals, was carried out to establish the expression pattern. Expression was seen in the developing limb buds, in the mesoderm of nasal processes which contribute to the formation of the nose and palate, in the eyelid, in the aorta, in the developing female gonads, in the developing spinal cord (restricted to differentiating motor neurons) and brain. Based on this expression pattern and on the mapping position of its human homolog SHOT, SHOT represents a likely candidate for the Cornelia de Lange syndrome, which includes short stature.
EXAMPLE 6
Isolation of a Novel SHOX-Like Homeobox Gene on Chromosome Three, SHOT, Being Related to Human Growth/Short Stature
[0099]A new gene called SHOT (for SHOX-homolog on chromosome three) was isolated in human, sharing the most homology with the murine OG12 gene and the human SHOX gene. The human SHOT gene and the murine OG12 gene are highly homologous, with 99% identity at the protein level. Although this is not yet proven, the striking homology between SHOT and SHOX ( identity within the homeodomain only) makes it likely that SHOT is also involved in short stature or human growth.
[0100]SHOT was isolated using primers from two new human ESTs (HS 1224703 and HS 126759) from the EMBL database to amplify reverse-transcribed RNA from a bone marrow fibroblast line (Rao et al., 1997). The 5' and 3' ends of SHOT were generated by RACE-PCR from a bone marrow fibroblast library that was constructed according to Rao et al., 1997. SHOT was mapped by FISH analysis to chromosome 3q25/q26 and the murine homolog to the syntenic region on mouse chromosome 3. Based on the expression pattern of OG12, its mouse homolog, SHOT represents a candidate for the Cornelia de Lange syndrome (which shows short stature and other features, including craniofacial abnormalities) mapped to this chromosomal interval on 3q25/26.
EXAMPLE 7
Searching for Mutations in Patients with Idiopathic Short Stature
[0101]The DNA sequences of the present invention are used in PCR, LCR, and other known technologies to determine if such individuals with short stature have small deletions or point mutations in the short stature gene.
[0102]Initially 91, and in total 250, unrelated male and female patients with idiopathic short stature (which has an estimated incidence of 2-2.5% in the general population) were tested for small rearrangements or point mutations in the SHOXa gene. Six sets of PCR primers were designed to amplify not only single exons but also sequences flanking the exons and a small part of the 5'UTR. For the largest exon, exon one, two additional internal-exon primers were generated. Primers used for PCR are shown in table 2.
[0103]Single strand conformation polymorphism (SSCP) analysis of all amplified exons, ranging from 120 to 295 bp in size, was carried out. Band mobility shifts were identified in only 2 individuals with short stature (Y91 and A1). Fragments that gave altered SSCP patterns (unique SSCP conformers) were cloned and sequenced. To avoid PCR and sequencing artifacts, sequencing was performed on both strands using two independent PCR reactions. The mutation in patient Y91 resides 28 bp 5' of the start codon in the 5'UTR and involves a cytidine-to-guanine substitution. To find out whether this mutation represents a rare polymorphism or is responsible for the phenotype by regulating gene expression, e.g. through a weaker binding of translation initiation factors, his parents and a sister were tested. As both the sister and the father, who have normal height, also show the same SSCP variant (data not shown), this base substitution represents a rare polymorphism unrelated to the phenotype.
[0104]Cloning and sequencing of a unique SSCP conformer for patient A1 revealed a cytidine-to-thymidine base transition (nucleotide 674) which introduces a termination codon at amino-acid position 195 of the predicted 225 and 292 amino-acid sequences, respectively. To determine whether this nonsense mutation is genetically associated with the short stature in the family, pedigree analysis was carried out. It was found that all six short individuals (defined as height below 2 standard deviations) showed an aberrant SSCP shift and the cytidine-to-thymidine transition. Neither the father nor the aunt and the maternal grandfather with normal height showed this mutation, indicating that the grandmother has transmitted the mutated allele to two of her daughters and her two grandchildren. Thus, there is concordance between the presence of the mutant allele and the short stature phenotype in this family.
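A single cytidine-to-thymidine transition can create a termination codon only from a small set of sense codons. The sketch below enumerates them using the standard stop codons (TAA, TAG, TGA) and derives how much of each predicted protein would remain if translation ends at position 195. The text does not state which codon is affected in patient A1, so no specific codon is assumed here.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def c_to_t_nonsense_codons():
    """Map each sense codon to the stop codon produced by one C->T change."""
    hits = {}
    for bases in product("ACGT", repeat=3):
        codon = "".join(bases)
        if codon in STOP_CODONS:
            continue
        for i, base in enumerate(codon):
            if base != "C":
                continue
            mutant = codon[:i] + "T" + codon[i + 1:]
            if mutant in STOP_CODONS:
                hits[codon] = mutant
    return hits


def retained_residues(stop_position):
    """Residues translated before a stop codon at the given position."""
    return stop_position - 1


if __name__ == "__main__":
    print(c_to_t_nonsense_codons())
    kept = retained_residues(195)
    for name, full_length in (("SHOXa", 292), ("SHOXb", 225)):
        print(f"{name}: {kept} of {full_length} residues retained")
```

Only CAA and CAG (glutamine) and CGA (arginine) qualify, and a stop at position 195 leaves 194 residues of either predicted protein.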
[0105]The identical situation as indicated above was found in another short stature patient of Japanese origin.
EXAMPLE 8
[0106]The DNA sequences of the present invention are used to characterize the function of the gene or genes. The DNA sequences can be used as search queries for data base searching of nucleic acid or amino acid databases to identify related genes or gene products. The partial amino acid sequence of SHOX93 has been used as a search query of amino acid databases. The search showed very high homology to many known homeobox proteins. The cDNA sequences of the present invention can be used to recombinantly produce the peptide. Various expression systems known to those skilled in the art can be used for recombinant protein production.
[0107]By conventional peptide synthesis (protein synthesis according to the Merrifield method), a peptide having the sequence CSKSFDQKSKDGNGG (SEQ ID NO: 42) was synthesized and polyclonal antibodies were derived in both rabbits and chicken according to standard protocols.
REFERENCES
[0108]The following references are herein incorporated by reference.
[0109]Ashworth A, Rastan S, Lovell-Badge R, Kay G (1991): X-chromosome inactivation may explain the difference in viability of X0 humans and mice. Nature 351: 406-408.
[0110]Ballabio A, Bardoni A, Carrozzo R, Andria G, Bick D, Campbell L, Hamel B, Ferguson-Smith MA, Gimelli G, Fraccaro M, Maraschio P, Zuffardi O, Guilo S, Camerino G (1989): Contiguous gene syndromes due to deletions in the distal short arm of the human X chromosome. Proc Natl Acad Sci USA 86:10001-10005.
[0111]Blagowidow N, Page D C, Huff D, Mennuti M T (1989): Ullrich-Turner syndrome in an XY female fetus with deletion of the sex-determining portion of the Y chromosome. Am. J. med. Genet. 34:159-162.
[0112]Cantrell M A, Bicknell J N, Pagon R A et al. (1989): Molecular analysis of 46,XY females and regional assignment of a new Y-chromosome-specific probe. Hum. Genet. 83: 88-92.
[0113]Connor J M, Loughlin S A R (1989): Molecular genetics of Turner's syndrome. Acta Pediatr. Scand. (Suppl.) 356: 77-80.
[0114]Disteche C M, Casanova M, Saal H, Friedmen C, Sybert V, Graham J, Thuline H, Page D C, Fellous M (1986): Small deletions of the short arm of the Y-chromosome in 46,XY females. Proc Natl Acad Sci USA 83:7841-7844.
[0115]Ferguson-Smith M A (1965): Karyotype-phenotype correlations in gonadal dysgenesis and their bearing on the pathogenesis of malformations. J. med. Genet. 2: 142-155.
[0116]Ferrari D, Kosher R A, Dealy C N (1994): Limb mesenchymal cells inhibited from undergoing cartilage differentiation by a tumor promoting phorbol ester maintain expression of the homeobox-containing gene MSX1 and fail to exhibit gap junctional communication. Biochemical and Biophysical Research Communications. 205(1): 429-434.
[0117]Fischer M, Bur-Romero P, Brown L G et al. (1990): Homologous ribosomal protein genes in the human X- and Y-chromosomes escape from X-inactivation and possible implications for Turner syndrome. Cell 63: 1205-1218.
[0118]Freund C, Horsford D J, McInnes R R (1996): Transcription factor genes and the developing eye: a genetic perspective. Hum Mol Genet 5: 1471-1488.
[0119]Gehring W J, Qian Y Q, Billeter M, Furukubo-Tokunaga K, Schier A F, Resendez-Perez D, Affolter M, Otting G, Wuthrich K (1994): Homeodomain-DNA recognition. Cell 78: 211-223.
[0120]Gough N M, Gearing D P, Nicola N A, Baker E, Pritchard M, Callen D F, Sutherland G R (1990): Localization of the human GM-CSF receptor gene to the X-Y pseudoautosomal region. Nature 345: 734-736.
[0121]Grumbach M M, Conte F A (1992): Disorders of sexual differentiation. In: Williams textbook of endocrinology, 8th edn., edited by Wilson J D, Foster D W, pp. 853-952, Philadelphia, W B Saunders.
[0122]Hall J G, Gilchrist D M (1990): Turner syndrome and its variants. Pediatr. Clin. North Am. 37: 1421-1436.
[0123]Henke A, Wapenaar M, van Ommen G-J, Maraschio P, Camerino G, Rappold G A (1991): Deletions within the pseudoautosomal region help map three new markers and indicate a possible role of this region in linear growth. Am J Hum Genet 49:811-819.
[0124]Hernandez D, Fisher E M C (1996): Down syndrome genetics: unravelling a multifactorial disorder. Hum Mol Genet 5:1411-1416.
[0125]Kenyon C (1994): If birds can fly, why can't we? Homeotic genes and evolution. Cell 78: 175-180.
[0126]Krumlauf R (1994): Hox genes in vertebrate development. Cell 78: 191-201.
[0127]Kulharya A S, Roop H, Kukolich M K, Nachtman R G, Belmont J W, Garcia-Heras J (1995): Mild phenotypic effects of a de novo deletion Xpter→Xp22.3 and duplication 3pter→3p23. Am J Med Genet 56:16-21.
[0128]Lawrence P A, Morata G (1994): Homeobox genes: their function in Drosophila segmentation and pattern formation. Cell 78: 181-189.
[0129]Lehrach H, Drmanac R, Hoheisel J D, Larin Z, Lennon G, Monaco A P, Nizetic D, et al.: Hybridization fingerprinting in genome mapping and sequencing. In: Davies K E, Tilghman S, Eds. Genome Analysis 1990: 39-81, Cold Spring Harbor, N.Y.
[0130]Levilliers J, Quack B, Weissenbach J, Petit C (1989): Exchange of terminal portions of X- and Y-chromosomal short arms in human XY females. Proc Natl Acad Sci USA 86:2296-2300.
[0131]Lichter P, Cremer T, Human Cytogenetics: A practical Approach, IRL Press 1992, Oxford, New York, Tokyo
[0132]Lippe B M (1991): Turner Syndrome. Endocrinol Metab Clin North Am 20: 121-152.
Magenis R E, Tochen M L, Holahan K P, Carey T, Allen L, Brown M G (1984): Turner syndrome resulting from partial deletion of Y-chromosome short arm: localization of male determinants. J Pediatr 105: 916-919.
[0133]Nelson D L, Ballabio A, Cremers F, Monaco A P, Schlessinger D (1995): Report of the sixth international workshop on the X chromosome mapping. Cytogenet. Cell Genet. 71: 308-342.
[0134]Ogata T, Goodfellow P, Petit C, Aya M, Matsuo N (1992): Short stature in a girl with a terminal Xp deletion distal to DXYS15: localization of a growth gene(s) in the pseudoautosomal region. J Med Genet 29:455-459.
[0135]Ogata T, Tyler-Smith C, Purvis-Smith S, Turner G (1993): Chromosomal localisation of a gene(s) for Turner stigmata on Yp. J. Med. Genet. 30: 918-922.
[0136]Ogata T, Yoshizawa A, Muroya K, Matsuo N, Fukushima Y, Rappold G A, Yokoya S (1995): Short stature in a girl with partial monosomy of the pseudoautosomal region distal to DXYS15: further evidence for the assignment of the critical region for a pseudoautosomal growth gene(s). J Med Genet 32:831-834.
[0137]Ogata T, Matsuo N (1995): Turner syndrome and female sex chromosome aberrations: deduction of the principal factors involved in the development of clinical features. Hum. Genet. 95: 607-629.
[0138]Orita M, Suzuki Y, Sekiya T and Hayashi K (1989): Rapid and sensitive detection of point mutations and polymorphisms using the polymerase chain reaction. Genomics 5:874-879.
[0139]Pohlschmidt M, Rappold G A, Krause M, Ahlert D, Hosenfeld D, Weissenbach J, Gal A (1991): Ring Y chromosome: Molecular characterization by DNA probes. Cytogenet Cell Genet 56:65-68.
[0140]Qiagen (1993) TGGE Handbook, Diagen GmbH, TGMA 4112 3/93.
[0141]Rao E, Weiss B, Mertz A et al. (1995): Construction of a cosmid contig spanning the short stature candidate region in the pseudoautosomal region PAR 1. In: Turner syndrome in a life span perspective: Research and clinical aspects. Proceedings of the 4th International Symposium on Turner Syndrome, Gothenburg, Sweden, 18-21 May, 1995, edited by Albertsson-Wikland K, Ranke M B, pp. 19-24, Elsevier.
[0142]Rao E, Weiss B, Fukami M, Rump A, Niesler B, Mertz A, Muroya K, Binder G, Kirsch S, Winkelmann M, Nordsiek G, Heinrich U, Breuning M H, Ranke M B, Rosenthal A, Ogata T, Rappold G A (1997): Pseudoautosomal deletions encompassing a novel homeobox gene cause growth failure in idiopathic short stature and Turner syndrome. Nature Genet 15:54-62.
[0143]Rappold G A (1993): The pseudoautosomal region of the human sex chromosomes. Hum Genet 92:315-324.
[0144]Rappold G A, Willson T A, Henke A, Gough N M (1992): Arrangement and localization of the human GM-CSF receptor α chain gene CSF2RA within the X-Y pseudoautosomal region. Genomics 14:455-461.
[0145]Ried K, Mertz A, Nagaraja R, Trusnich M, Riley J, Anand R, Page D, Lehrach H, Ellison J, Rappold G A (1995): Characterization of a yeast artificial chromosome contig spanning the pseudoautosomal region. Genomics 29:787-792.
[0146]Robinson A (1990): Demography and prevalence of Turner syndrome. In: Turner Syndrome, edited by Rosenfeld R G, Grumbach M M, pp. 93-100, New York, Marcel Dekker.
[0147]Rosenfeld R G (1992): Turner syndrome: a guide for physicians. Second edition. The Turner's Syndrome Society.
[0148]Rosenfeld R G, Tesch L-G, Rodriguez-Rigau L J, McCauley E, Albertsson-Wikland K, Asch R, Cara J, Conte F, Hall J G, Lippe B, Nagel T C, Neely E K, Page D C, Ranke M, Saenger P, Watkins J M, Wilson D M (1994): Recommendations for diagnosis, treatment, and management of individuals with Turner syndrome. The Endocrinologist 4(5): 351-358.
[0149]Rovescalli A C, Asoh S, Nirenberg M (1996): Cloning and characterization of four murine homeobox genes. Proc Natl Acad Sci USA 93:10691-10696.
[0150]Schaefer L, Ferrero G B, Grillo A, Bassi M T, Roth E J, Wapenaar M C, van Ommen G-J B, Mohandas T K, Rocchi M, Zoghbi H Y, Ballabio A (1993): A high resolution deletion map of human chromosome Xp22. Nature genetics 4: 272-279.
[0151]Shalet S M (1993): Leukemia in children treated with growth hormone. Journal of Pediatric Endocrinology 6: 109-11
[0152]Vimpani G V, Vimpani A F, Lidgard G P, Cameron E H D, Farquhar J W (1977): Prevalence of severe growth hormone deficiency. Br Med J. 2: 427-430.
[0153]Zinn A R, Page D C, Fisher E M C (1993): Turner syndrome: the case of the missing sex chromosome. TIG 9 (3): 90-93.
Sequence CWU
1
55160PRTHomo sapiens 1Gln Arg Arg Ser Arg Thr Asn Phe Thr Leu Glu Gln Leu
Asn Glu Leu 1 5 10 15Glu
Arg Leu Phe Asp Glu Thr His Tyr Pro Asp Ala Phe Met Arg Glu
20 25 30Glu Leu Ser Gln Arg Leu Gly Leu
Ser Glu Ala Arg Val Gln Val Trp 35 40
45Phe Gln Asn Arg Arg Ala Lys Cys Arg Lys Gln Glu 50
55 602209DNAHomo sapiens 2ggatttatga atgcaaagag
aagcgcgagg acgtgaagtc ggaggacgag gacgggcaga 60ccaagctgaa acagaggcgc
agccgcacca acttcacgct ggagcagctg aacgagctcg 120agcgactctt cgacgagacc
cattaccccg acgccttcat gcgcgaggag ctcagccagc 180gcctggggct ctccgaggcg
cgcgtgcag 2093368DNAHomo sapiens
3gtgatccacc cgcgcgcacg ggccgtcctc tccgcgcggg gagacgcgcg catccaccag
60ccccggctgc tcgccagccc cggccccagc catggaagag ctcacggctt ttgtatccaa
120gtcttttgac cagaaaagca aggacggtaa cggcggaggc ggaggcggcg gaggtaagaa
180ggattccatt acgtaccggg aagttttgga gagcggactg gcgcgctccc gggagctggg
240gacgtcggat tccagcctcc aggacatcac ggagggcggc ggccactgcc cggtgcattt
300gttcaaggac cacgtagaca atgacaagga gaaactgaaa gaattcggca ccgcgagagt
360ggcagaag
368458DNAHomo sapiens 4gtttggttcc agaaccggag agccaagtgc cgcaaacaag
agaatcagat gcataaag 58589DNAHomo sapiens 5gcgtcatctt gggcacagcc
aaccacctag acgcctgccg agtggcaccc tacgtcaaca 60tgggagcctt acggatgcct
ttccaacag 8961166DNAHomo sapiens
6gtccaggctc agctgcagct ggaaggcgtg gcccacgcgc acccgcacct gcacccgcac
60ctggcggcgc acgcgcccta cctgatgttc cccccgccgc ccttcgggct gcccatcgcg
120tcgctggccg agtccgcctc ggccgccgcc gtggtcgccg ccgccgccaa aagcaacagc
180aagaattcca gcatcgccga cctgcggctc aaggcgcgga agcacgcgga ggccctgggg
240ctctgacccg ccgcgcagcc ccccgcgcgc ccggactccc gggctccgcg caccccgcct
300gcaccgcgcg tcctgcactc aaccccgcct ggagctcctt ccgcggccac cgtgctccgg
360gcaccccggg agctcctgca agaggcctga ggagggaggc tcccgggacc gtccacgcac
420gacccagcca gaccctcgcg gagatggtgc agaaggcgga gcgggtgagc ggccgtgcgt
480ccagcccggg cctctccaag gctgcccgtg cgtcctggga ccctggagaa gggtaaaccc
540ccgcctggct gcgtcttcct ctgctatacc ctatgcatgc ggttaactac acacgtttgg
600aagatcctta gagtctattg aaactgcaaa gatcccggag ctggtctccg atgaaaatgc
660catttcttcg ttgccaacga ttttctttac taccatgctc cttccttcat cccgagaggc
720tgcggaacgg gtgtggattt gaatgtggac ttcggaatcc caggaggcag gggccgggct
780ctcctccacc gctcccccgg agcctcccag gcagcaataa ggaaatagtt ctctggctga
840ggctgaggac gtgaaccgcg ggctttggaa agggagggga gggagacccg aacctcccac
900gttgggactc ccacgttccg gggacctgaa tgaggaccga ctttataact tttccagtgt
960ttgattccca aattgggtct ggttttgttt tggattggta tttttttttt tttttttttt
1020tgctgtgtta caggattcag acgcaaaaga cttgcataag agacggacgc gtggttgcaa
1080ggtgtcatac tgatatgcag cattaacttt actgacatgg agtgaagtgc aatattataa
1140atattataga ttaaaaaaaa aatagc
11667625DNAHomo sapiens 7atggagtttt gctcttgtcg cccaggctgg agtataatgg
catgatctcg actcactgca 60acctccgcct cccgagttca agcgattctc ctgcctcagc
ctcccgagta gctgggatta 120caggtgccca ccaccatgtc aagataatgt ttgtattttc
agtagagatg gggtttgacc 180atgttggcca ggctggtctc gaactcctga cctcaggtga
tccacccgcc ttagcctccc 240aaagtgctgg gatgacaggc gtgagcccct gcgcccggcc
tttgtaactt tatttttaat 300tttttttttt ttttaagaaa gacagagtct tgctctgtca
cccaggctgg agcacactgg 360tgcgatcata gctcactgca gcctcaaact cctgggctca
agcaatcctc ccacctcagc 420ctcctgagta gctgggacta caggcaccca ccaccacacc
cagctaattt ttttgatttt 480tactagagac gggatcttgc tttgctgctg aggctggtct
tgagctcctg agctccaaag 540atcctctcac ctccacctcc caaagtgtta gaattacaag
catgaaccac tgcccgtggt 600ctccaaaaaa aggactgtta cgtgg
625815577DNAHomo
sapiensmisc_feature(3844)..(4068)pET92 region (first part) 8ctctccctgt
tgtgtctctc tttctctctc tccatctctc tccgtctttc cccctctgtc 60tctttctctg
tctccatccc tctgtctctc cctttctctc tgtctttcct tgtctctctc 120tttctctctc
tctctccatc tctctctctc ccggtctctc tctctccatc tccccgtctc 180tccgtttctc
tctctgcctc tccctgtctg tctctctctt tgtgtgtgtt acacacaccc 240caacccaccg
tcactcatgt ccccccactg ctgtgccatc tcacacaagt tcacagctca 300gctgtcatcc
tgggtcccca ggccccgccg gggaggaaga tgcgccgtgg ggttacggga 360ggaaggggac
tccgggcctc ctggtgcccc actttatttg cagaaggtcc ttggcaggaa 420ccgtgacgcg
tttggtttcc aggacttgga aaacgaattt caggtcgcga tggcgagcac 480cggcttcccc
tgaagcacat tcaatagcga gaggcgggag ggagcgagca ggagcatccc 540accatgaaaa
ccaaaaacac aagtattttt ttcacccggt aaatacccca gacgccaggg 600tgacagcgcg
gcgctaaggg aggaggcctc gcgccggggt ccgccgggat ctggcgcggg 660cggaaagaat
atagatcttt acgaaccgga tctcccgggg acctgggctt ctttctgcgg 720gcgctggaaa
cccgggaggc ggccccgggg atcctcggcc tccgccgccg ccgcctccca 780agcgcccgcg
tcccggtttg gggacacccg gccccttctt ctcactttcg gggattctcc 840agccgcgttc
catctcacca actctccatc caagggcgcg ccgccaccaa cttggagctc 900atcttctccc
aaaatcgtgc gtccccgggg cgcccgggtc ccccccctcg ccatctcaac 960cccggcgcga
cccgggcgct tcctggaaag atccaggcgc cgggctctgc gctcctcccg 1020ggagcgaggg
cggccggaca actgggaccc tcctctctcc agccgtgaac tccttgtctc 1080tctgtctctc
tctgcaggaa aactggagtt tgcttttcct ccggccacgg aaagaacgcg 1140ggtaacctgt
gtggggggct cgggcgcctg cgcccccctc ctgcgcgcgc gctctccctt 1200ccaaaaatgg
gatctttccc ccttcgcacc aaggtgtacg gacgccaaac agtgatgaaa 1260tgagaagaaa
gccaattgcc ggcctggggg gtgggggaga cacagcgtct ctgcgtgcgt 1320ccgccgcgga
gcccggagac cagtaattgc accagacagg cagcgcatgg ggggctgggc 1380gaggtcgccg
cgtataaata gtgagatttc caatggaaag gcgtaaataa cagcgctggt 1440gatccacccg
cgcgcacggg ccgtcctctc cgcgcgggga gacgcgcgca tccaccagcc 1500ccggctgctc
gccagccccg gccccagcca tggaagagct cacggctttt gtatccaagt 1560cttttgacca
gaaaagcaag gacggtaacg gcggaggcgg aggcggcgga ggtaagaagg 1620attccattac
gtaccgggaa gttttggaga gcggactggc gcgctcccgg gagctgggga 1680cgtcggattc
cagcctccag gacatcacgg agggcggcgg ccactgcccg gtgcatttgt 1740tcaaggacca
cgtagacaat gacaaggaga aactgaaaga attcggcacc gcgagagtgg 1800cagaaggtaa
gttcctttgc gcgccggctc caggggggcc ctcctggggt tcggcgcctc 1860ctcgccacgg
agtcggcccc gcgcgcccct cgctgtgcac atttgcagct cccgtctcgc 1920cagggtaagg
cccgggccgt caggctttgc ctaagaaagg aaggaaggca ggagtggacc 1980cgaccggaga
cgcgggtggt gggtagcggg gtgcgggggg acccagggag ggtcgcagcg 2040ggggccgcgc
gcgtgggcac cgacacggga aggtcccggg ctggggtgga tccgggtggc 2100tgtgcctgaa
gccgtagggc ctgagatgtc tttttcattt tctttttctt tcctttcctt 2160tttttgtttg
tttgtttgtt tgtttgagac agagtctcgc tctgtccccc aggctggagt 2220gcagtggtgc
gatctcggct cactgcaacc tccgcctcct gggttcaagc gattctcctg 2280cctcagcctc
cccagtagct gggattacag gcatgcacca ccacgcctgg ctaatttttg 2340tgcttttagt
aaagacgggg attcaccatg ttggccaggc tggtctcgaa ctcctgacct 2400caggtgatcc
acccgcctcg gcctcccaaa gtgctgggat gacaggcgtg aggcaccgcg 2460cccggcctgg
gtcctgacgg cttaggatgt gtgtttctgt ctctgcctgt ctgccttgta 2520tttacggtca
cccagacgca cagaggagcc gtctccacgc gccttcccag cgctcagcgc 2580ctgccgggcc
cccggagatc acgggaagac tcgaggctgc gtggtaggag acgggaaggc 2640cccgggtcag
ctcggttctg tttcncttta aggaaccctt cattattatt tcattgtttt 2700cctttgaacg
tcgaggcttg atcttggcga aagctgttgg gtccataaaa accactcccg 2760tgagcggagg
tggccgggat ctggatgggg cgcgaggggc cccggggaag ctggcggctt 2820cgcgggcgcg
tcctaagtca aggttgtcag agcgcagccg gttgtgcgcg gcccgggggn 2880agctcccctc
tggcccttcc tcctgagacc tcagtggtgg gtcgtcccgt ggtggaaatc 2940ggggagtaag
aggctcagag agaggggctg gccccgggga tctctgtgca cacacgacaa 3000ctgggcggca
tacatcttaa gaataaaatg ggctggctgt gtcggggcac agctggagac 3060ggctatggac
gcctgttatg ttttcattac aaagacgcag agaatctagc ctcggctttt 3120gctgattcgc
aaagttgagg tgcgagggtg aatgccccaa aggtaattct tcctaagact 3180ctggggctac
ctgctctccg gggccctgca tttggggtgt ggagtggccc cgggaaatag 3240cccttgtatt
cgtaggaggc accaggcagc ttcccaaggc cctgactttg tcgaagcaga 3300aagctgtggc
tacggtttac aaagcagtcc ccggtttctg accgtctaag aggcaggagc 3360ccagcctgcc
tttgacagtg agaggagttc ctccctacac actgctgcgg gcacccggca 3420ctgtaattca
tacacagaga gttggccttc ctggacgcaa ggctgggagc cgcttgaggg 3480cctgcgtgta
atttaagagg gttcgcangc gcccggcggc cgcttctgnt ggggttgctt 3540tttggttgtc
cttcngcaaa caccgttttg ctcctctngn aactctctct tnctcccccn 3600tggccngtng
gacccgggna ngagcaaagt gtcctccaga ccnttttgaa angtgagagg 3660aaaataaaga
ccaggccaaa nngacccagg gccacaggag aggagacaga gagtccccgt 3720tacattttnc
cccttggctg ggtgcagaaa gacccccggg ccaggactgc cacccaggct 3780actatttatt
catcagatcc aagttaaatc gaggttggag ggcaggggag agtctgaggt 3840taccgtggaa
gcctggagtt tttgggnaac agcgtgtccc cgccgagcct gggagcccgt 3900gggttctgca
aagcctgcgg gtgtttgagg actttgaaga ccagtttgtc agttgggctc 3960aattncctgg
ggttcagact tagagaaatg aaggagggag agctggggtc gtctccagga 4020aacgattcac
ttggggggaa ggaatggagt gttcttgcag gcacatgtct gttaggaggt 4080gaaacagaat
gtgaaatcca cgttggagta agcgtccagc gctgaatgta gctcggggtg 4140gggtgggagg
gccctggtgt ggatcgtgga aggnaagaaa gacagaacag ggtgctagta 4200tttaccccgt
tnccctgtag acaccctgga tttgtcagct ttgcaagctt cttggttgca 4260gcggccttgc
ctgtgcccct ttgagactgt ttccagacta aacttccaaa tgtcagcccc 4320ttacccttga
cagcaaggga catctcatta gggcatcgcg tgcttctcat ctgtgnctca 4380gcaggcccng
agataggaan cangaggggc ngttggnaga tgcncacttc caccagccct 4440gggnttgaag
gggangcgan gggangacna ccttttanct taaacccctn gagcttggtn 4500cagagaggnc
tgaatgtcta aaatgaggaa gaaaaggttt ttcacctgga aacgcttgag 4560ggctgagtct
tctgcccntt ctgacntccc ccagcaaata cagacaggtc accaanctac 4620tggagatgag
aaagtgccat ttttggcaca ctctggtggg gtaggtgccc gaccgcgtgt 4680gaaaaangtg
ggaannggag agatttctgn cgcacgcggt tcagccccca ggcgcggntg 4740gcngcattcn
aggntactca gacgcggttc tgctgttctg ctgagaaaca ggcttcgggt 4800aggggctcct
agctccgcca gatcgcggag ggacccccag ccctcctgcg ctgcagcggt 4860ggggatagcg
tctctccgta ggcctagaat ctgcaacccg ccccgggtcc tccccgtgtc 4920cttcccgggc
gtcccgccgg ggatcccaca gttggcagct cttcctcaaa ttctttccct 4980taaaaatagg
atttgacacc ccactctcct taaaaaaaaa aaataagaaa aaaaggttag 5040gttatgtcaa
cagaggtgaa gtggataatt gaggaaacga ttctgagatg aggccaagaa 5100aacaacgctc
gtgcaaagcc caggtttttg ggaaagcagc gagtatcctc ctcggctttt 5160gcgttatgga
ccccacgcag tttttgcgtc aaagcgcatt ggttttcgag ggcccccttt 5220ccaccgcggg
atgcacgaag gggttcgcca cgttgcgcaa aacctccccg gcctcagccc 5280tgtgccctcc
gctccccacg cagggattta tgaatgcaaa gagaagcgcg aggacgtgaa 5340gtcggaggac
gaggacgggc agaccaagct gaaacagagg cgcagccgca ccaacttcac 5400gctggagcag
ctgaacgagc tcgagcgact ttttgacgag acccattacc ccgacgcctt 5460catgcgcgag
gagctcagcc agcgcctggg gctttccgag gcgcgcgtgc aggtaggaac 5520ccgggggcgg
gggcgggggg cccggagcca tcgcctggtc ctcgggagcg cacagcacgc 5580gtacagccac
ctgcgcccgg gccgccgccg tccccttccc ggagcgcggg gaggttgggt 5640gagggacggg
ctggggttcc tggacttttg gagacgcctg aggcctgtag gatgggttca 5700ttgcgtttgt
ttttcaccaa cagcaaacaa atatatatac atatatatta tacaaataac 5760aaataaatat
atatgttata cagatgggta tattgtatat attatagata tttgttcgtc 5820cttggtgcaa
agacacccgg tgaacccata tattggctcc tgactgcctt cggttcccct 5880gggattggtt
ataggggcaa cacatgcaaa caaaactttc cctggattat acttaggaga 5940cgaagctaca
gatgcgtttg atccagagtg ttttacaaga tttttcattt aaaaaaaaat 6000gtgtcttttg
gcccctgatt cccctccgtc ttcccgtgtg gctgcattga aaaggtttcc 6060ttaggatgaa
aggagagggg tgtcctctgt ccctaggtgg agagaaacag ggtcttctct 6120ttcctccgtt
ttttcaccta ccgtttctat ctccctcctc ccctctccag ccctgtcctc 6180tgctacaaac
caccccctcc tccctccggc tgtggggagc gcaggagcac gttgggcatc 6240tggatgagcg
gnagactatt agcggggcac gggggctccc cgaggagcgc gcgaattcac 6300gctgccccat
gagaccaggc accggggggc ggaggggcct tgggtgtccg cagagggacg 6360ggcgggcaga
gccttcctcc gcattctaaa cattcactta aaggtatgag tttantttca 6420ggggtgctgc
tgggagagcc tccaaatggc ttcttccagc ccctgcctga cagttcagct 6480cccctggaag
gtcaactcct ctagtccttt ctcctggttc tgggcaggac agaagtgggg 6540ggagggagag
agagagagag agagagagag acggtcagga tccccggacc ctggggaacc 6600cgtcaaaaat
aaatgaaatt aagattgccg accagagaga gaaccgtgac aaagcaaacg 6660gcgttcaaag
caaagagacg aactgaaagc ccgttcccgt aggactggtt atgaggtcaa 6720cacattcaaa
cacagcttgc tctggatttt gctgagcaga ggaagataca gatgcatttg 6780atccaaagtg
tgttacatct ttcattatat gtgtgtctat atatataaac atatataaat 6840atataaacat
acataaatgt atgtaaatat atataatcta tatacatata taaatatata 6900aacacatata
taatatataa atctataaac atatataata tataaacata aatatataaa 6960catatataat
atataaatat attaacatat ataaaatatg tataaatata tataaacata 7020taaacatata
taaatatata aacatataaa tatataaaca tatataaata tatacaaaca 7080tattgtatat
atataaatat atataaaaac atatatatac atataaaaat atatataaac 7140atatatacat
ataaagaaat atatataaac atatatacat ataaatatac atatataaac 7200atatatatac
ataaaatata tataaacata tatacatata aaaatatata tatattaaca 7260tatatataca
tataaaaata tatatattaa catatatata catataaaaa tatatatata 7320tttttggccc
ctgattccct tcggttcctg tgggatgggt gattgagtca acacattcaa 7380acacaacttt
tccatcgatg ttgcttagga gatgaggata cagatgcgtt tgatggagag 7440ggttttacaa
gctctttcat ttaaatatat atatatatat atatatattt tttggctcct 7500gattctcttc
cgtcttccca tgtggctgca ttttaaaagg cttccctaag atcgttacga 7560ttaaatcaac
cctccccagg catctttacc gagggctgtg gtccccaaag cgatacagcc 7620caggagggag
agaggctttg gtgacttgga ggaaggactg tgtccctcct tagggcgtct 7680gtggcctcag
tgagggaagg aagctgcatc agacaggggt ttcctcgctg tccacccctc 7740tggcagaaga
tggattgggc tgccccgnta taaattaatg aaaagattaa agtttcgcta 7800aaggggacat
cgagtttatg tgtcatctcc tggtgntctg tgtgccntgg gatnctgcaa 7860tatatcccan
ngcccttgat gnnntactgt ttnctataaa aanntaaatn tacttgtnna 7920atttaanttc
cnnnacacta tttnctttcc nngtnagtct nattanccga ncgagagcan 7980cgnttagttn
cagctngcgg aaaattggtt gtggggtgtg tgcggacccc ngagnaacgc 8040ccnntaaaat
naaagacaaa ntcnggggac aagnctnggg ggttatcgnn attgcnnagg 8100ggtcgncatg
aaaantttaa cgacggtaaa taataataaa aanncaaaca tgggaatgnc 8160aataaaagac
ataattctcc nnatcgccgc ggggggaaag gatcctatag taaaggcgag 8220tgcgctttga
ggggtcataa aaatcaatta gttccaacac ccacgtcccg cgttgagggg 8280acggggacga
gcagggacag aaaaagaaac catatttgaa tcccatctct ctgtgaattc 8340ttgggtcaca
tgcgtctcag tacagcccgt cccgtgctgt gaccggatag agtttcaatt 8400tactgtggaa
atttgctgta aataaattga gcatccgata gaagctgttg ctgattaacc 8460ttttattttt
agcgtggccc tgcaaagtcg tatcacccag ctgtcaggct tctaatcgaa 8520agttatgaga
ccacggtgag gggcaggcgg taatttaatt acaacaaata tctttgggtt 8580tatggcgcag
agctaaatta aatgtcatta ttcactgtct gtnaatggna aatcaaaann 8640ggaaatcgca
nttacggnca tttgggnnaa angaaagcgg ggnagtgctc tttaatngaa 8700nngaaataac
tgtcttaagc agtgtcacac acttcactta ccatattcgn ggcctnaatt 8760ggaanntgga
tcgtnngaat cactccnaag actngattta ttangcgctt cacgncagcn 8820nggcntaatt
catcnacttn ngtattcttc atcnnnnatt tttttttttc ctctcnngcc 8880gtgttnngaa
gggagagtga atgaggcttt ccacgtttca ggaggatttt cttttttgaa 8940aaatgccctt
ccagaggctt ttgggtggct ggcttgcttt ctgggccctg gaggangaca 9000ggcggangag
tccaggtggg catggagagg cacagtggca ggtcacctgg atggtcagtg 9060gaggtggagg
tctgaaggcg ccagctttgg aaattattgg tgaatttcga tgtcagcacc 9120aggncagggg
cctttttggc gggggtgtga ggganggatg anctttgctg ggaaanncag 9180gatcaggttc
tccaggcgca ctgcagcccg gtaggaccca ctttggaaat gaaaagccag 9240ttnccgaaag
ctgggctgga agcttccgtg ttgggttcaa gagcaagttc acgttgcgct 9300gtgtagactc
ctggctgctc ccaaactctg agggttttct gaggttccct tcataggggc 9360accggccctg
ggccatgcac agtgcgtaag ggtggctgtg ggccgaggga cccagcacgt 9420gttttgccca
caacagccgg agtgactggt tcactcaccg ccttggcgga ggacgcctgt 9480tctctggacg
aatcatttct cttgggtggt gactgccttg tgggtcaagg tgcaggtttt 9540ctgccacaga
aaacctgtta ggaggaatta agcgactaag actgtcaggg aggtggtggt 9600gggggangag
gnagggggtg gtgtccagat taccaggcat aggctaaact gcctgcactc 9660tccagctggt
ctgtctgtgg aggaggggat tgtcaatact gggagagcag aggaggctcg 9720taggaggtga
gagggggtgg aatttgcatg caaatcttca catgaggcct gtgtgaattt 9780ctccagcctc
ctgagggtcc cctgcgctat tgcactcaac ttcttgatag tttaccccaa 9840gactcagaag
tccttagagg ggcagaatgc ccccaccaca aagcctgcta tccttgggcg 9900tcctcaggac
ccttggtcat gaatgggacc ctttcatgta tggggaccct tggtaatatg 9960aatgggacgc
cttcagctcc ccagggcttc cgaggaggcc gagaagggca aagacacttc 10020cgaggaggcc
gagaagggca aagacatttt ctgggcttgg tgtgtcaaga gctagattgg 10080agaaggggct
ggatttggaa ctctttagcc atcagctcac cctctccgtt tgtggctaaa 10140gtctgaaggt
ggaaacttcg gttctcctac agggtctaca ggagttgggg ggcggggcgc 10200ccacacagaa
cgctggaaag ttcgacagtc cacttccact ggctcggaac tcactttttc 10260accttaagtt
catcagcggt aacgcatagg tctcacttag gcagggcacg gatgatttaa 10320caatttctac
ttctaggtca ggtgcggtgg ctcacacctc taatcccagc actttgggag 10380gcccaggagg
gtggatcgct tgaggtcagg agtttgagac cagcctggcc aacatggtga 10440aaccccgtct
ctactaaaat acgaaaatta gccaggcatg gtggtgagca cctgtaattc 10500cagctactcg
ggaggctgag gcaggagaat cgcttgaacc tgggaggtgg acgttgcagt 10560gaggtgagat
cacaccactg cactccagcc tggatgagag agcaagactc tgtctcaaaa 10620acaaaataaa
acaaaaacaa aacaaaaatc aaaaaagaaa acccaatttc cagttctagg 10680ccaggtgcag
tggctcacgc ctgtcatccc agcactttgg gaggcccagg agggtggatc 10740gcttgaggtc
aggagttcga gaccagcctg gccaacatgg tgaaacccca tctttactaa 10800aaatacaaac
gttagctggg tgtggtggtg tgcgcctgta atcccagcta ctcgggaagc 10860tgaggctgga
gaattgcttg aatctgggag gtggaggttg cagggaggcg agatagtgcc 10920actgcagtcc
agcctggacc agagagcaag actccgtctc aaaaacaaaa gaaagcaaaa 10980acaaaaaaca
agagaccagc ctggccaaca tggtgaaacc gcgtctttac taaaatacaa 11040aattagccgg
gcatggtggt gggcacctgt agtcccagct actcgggagg ctgaggcagg 11100agaatggctt
gaacctggga ggtggagctt gcagtgagcc gagatagtgc cactgcactc 11160cagcctgggc
gacagagcga gacttgattt cagaaccacc accaccacaa caaaacaaaa 11220caaaaaatcc
aaaaaaaccc caatttccag tactaggtag tcagtgatgc agggctggag 11280acagaggggc
ggtaagtgtc tgggcgccca ccatcagtca cctcccagct cccangaggt 11340gcaaagtgct
tggttcagcc tcatgggaag gatgctccct ggggaggctg ggctgggttc 11400acagggctct
tcacatctct ctctgcttct nccccaaggt ttggttncca gaaccggaga 11460gccaagtgcc
gncaaacaag agaatcagat gcataaaggt gggtgtcggg actgggggga 11520cctgaagctg
ggggatcctg ctccaggagg gatggggtcg acaaggtgct ggctacaccc 11580aggaccacca
cactgacacc tgctcccttt ggacacaggc gtcatcttgg gcacagccaa 11640ccacctagac
gcctgccnga gtggcaccct acgtcaacat gggagcctta cggatgcctt 11700tccaacaggt
agctcacttt ttcttcctct gnaagatccc tagggacctg ctgctccctt 11760cccctttccc
ctatttgctg ccgcatcctg acactcctag tccctccctg cccctgcaga 11820cttctcagct
ggcccttaga aaaaaagcct cttttccgag gaggcattta caggcacctt 11880ggcacctatg
aaatcaggct gggccaggcg gggtggctca cacctgtcat cccagcactt 11940tgggaggcca
aggttaggag tttgagacca gcctggacaa catagcaaaa gcctgtctct 12000actaaaaata
caaaaaaaaa ttaacaggga gtggtggtgg gcacctgtaa tcccagctac 12060ttgggaggct
gaggcaggag aatcacttga acccgggagg ccgaggttgc ggtgagccga 12120gatcgtgcca
ttgcactcca ggctgggcga cagagtgaga ctctgtctca aaaaataaat 12180aaataaataa
atgtaaaaaa ataaaaatag gtcgggcacg gtggctcacg tctgtaatcc 12240cagcactttg
gaaggccgag gtgggtggat gacagggtca agagattgag accatcctgg 12300ccaacatggc
aaaatgccgt ctctactaaa aaatacaaaa attaggcggg cgtggtggcg 12360ggtgcctgta
atcccagcta ctcgggaggc tgaggcagga gaatcggttg aacccgggat 12420gcggaggttg
cagtgagcgg agatcacatc actgcactcc aggctgggca acaagagcga 12480aactgcgtct
tacaataaat aaatagataa ataaataaac aaataaactt tactttagaa 12540acaaatccct
gtccgtgttt gtcttttcac ctgtcctgca gggaaaacaa aacataaaat 12600gtcaaggcaa
atagtagtga tttcattccg ggaaaaagaa agtggatgtt tgccttcacc 12660ctttctcgtc
cttcctctgg tgctcctcan ggcccanggg nagagggtgg aaagtncaga 12720ggaagaaaga
cggggctggg ggggggggtc cgtggggacc caggcaggca tgttcccnat 12780ttccntgtct
tcacnttcaa agnaggggcc cctcgnctct ggaatgaggc ctacggtttc 12840ctttcccnga
agagttnccc ctttgtgagc ttacggcttc ggagtgaacc tcggtgcaac 12900ctgttattaa
aacacacaga ggctaatgcc agcaaaaaca cgccccccgc tcctggtttc 12960agagggaaga
aaaaaattca taagcacggc catgcttttc taataaaaat tcattaaata 13020atcgttataa
gggatgaagc cgggagggga gaggagagga acacaatcaa gagactttct 13080ttgaactttt
tctccctgct tcaaatacaa agcaatcttc tgtgggcctg ggcctggggg 13140gtttccccct
ttctctgcag cccattggga ggaagaaaat gcttccctga angttgctgc 13200aaaattgttt
ctgtttttct tttctttttc tttttttttt ttttttgaga cggagtctcg 13260ctctgtcacc
aggctggagt gcaatggtat gatctcagct cactgcaacc tccacgttcc 13320tgtttcaagt
cattctcctg cctcagcctc ctgagtagct gggactacag gcgcccgcca 13380ccacgcccgg
ctagtgtttg tatttttaga aaagacaggg tttccccatg ttggccaggc 13440tggtcttgaa
ctcctgtcct caagtgatct gcctgcctcg gcctcccaaa gtgctgtgtt 13500tctgtttttc
tttccccgct ttcttaggag gccatcggga agaataaaat gctttccttg 13560aagttgatgc
aaaattgttt ctgtttttct tttctctttt ctttcttttt gagatggagt 13620ctcgctcttt
cacccaggct ggagggcagt ggcgcgacct cggctcactg caacctccgc 13680ctcccgggtt
caagcgattc tcctgcctca gcctccggag tagctgggat tacaggcacc 13740tgccactatg
cctggctaat tttattattt ttagtagaga cggggtttca ccatgttggc 13800caggctggtc
tcaaactcct gacctcaggt gatccgcccg cctcgcctcc caaagtgatg 13860ggatgancag
gncatngagc ncaccgtgcc cggccctcta actctttacc agacataaag 13920tctccnnttc
ccctttctaa atgtatatat tgtgttttta aaagttaaca gcagggatcc 13980cacctcattn
ccccgctnct ctccccaaga cctgtcctgc acgttgcaca cagcaggtgt 14040gccctggaca
tatcccaaac ccacgctgaa agaaagaggg tctcactaca cgtatgatat 14100ctgtgnatcc
tttaaacatc tccgtggctt ccaggcaaca cagccataaa taggaatctc 14160atgtctgaca
tgataccggg accatgtatg ggnaaattct gggtgtgaag ttccagctac 14220ccccgcagag
gcanccattg cataccctcc agaaactccc ctgccgttnc aagccaaaga 14280cacaacacaa
acagcntccg agagagggtg tcattgaaaa tcaataccat cataagagca 14340cacagcaccg
tctttctctt ctgcccgttg atacacaatt atgagcaatt tgctaacact 14400gacaactcgt
ggcaagaaca ggtcgtgttg atacggttgc ctcgtgagga cccatctgtc 14460ttctggggtc
ttgcctggaa cggagatcgg agttcagggt ggctaataga atcattactc 14520acctagggac
acagaatnat gagggttacc cccagttaag tgcatacagt caaacggacg 14580gctgctctgg
aaggtacagt gacgtgaaca gcttttatga aatgcctaga tctggacctt 14640ccatacctga
gccaccgttc caaagcactg ggcgtttttc agatactttc atgagaaatg 14700ttgtcaacac
cgcaagtttg cagtacacag tctgaaagat attcttgtat atgtagatgt 14760ctgtagatgc
cctgaaggtg tgtagacttt agacacccag aaggtgtgta gatgtctgta 14820gacaccttct
atgtgtgtag atgtctgtag acgccctgca ggtgtgtaga tatatctaga 14880tggtctgcct
gtgtatgata caggctaaaa agacatttgt ggtggacact agttgattat 14940ttaggactat
gagatgggaa aggaagnagc aaccagcagt gaaaggcatg tggtgggtgg 15000ggggttggca
ttgcagtggg gtcctcntga ngcaggtgac acccactata gggctgccct 15060tggnatggac
gctttgtnga agctgtttga tttcaccaca ccaagcctgg aggcacggac 15120attccaggat
ggtgaggagt ctgcaaagga ggagattgga ggaggtgcaa tatccctaga 15180gtacgagaga
tgagatagga gagctgtata aatagcacta ccagccggat gcggtggctc 15240acgcctgtca
tcccagcact ttaggaggct gaggcaggcg gatcacctga ggtcaggagt 15300tccagaacag
cctggccaac acaatgaaac cccatcttta ctaaaaatac aagattagct 15360gggcacggtg
tctcacgcct gtcatccctg cactttggga ggtcgaggtg cgcagatcat 15420gaggtcagtt
tggccaacgc ggcgaaaccc cgtctctact aaaaatacaa aaaagtagcc 15480gggcgtggtg
gtgggcacct gtagtcccag ctactaggga ggctgaggca ggagaatcgc 15540ttgaacccgg
atgcggacat tgcagtgagc cgagatc 155779753DNAHomo
sapiens 9cgtggaagcc tggagttttt gggaacagcg tgtccccgcc gagcctggga
gcccgtgggt 60tctgcaaagc ctgcgggtgt ttgaggactt tgaagaccag tttgtcagtt
gggctcaatt 120cctggggttc agacttagag aaatgaagga gggagagctg gggtcgtctc
caggaaacga 180ttcacttggg gggaaggaat ggagtgttct tgcaggcaca tgtctgttag
gaggtgaaac 240agaatgtgaa atccacgttg gagtaagcgt ccagcgctga atgtagctcg
gggtggggtg 300ggagggccct ggtgtggatc gtggaaggaa gaaagacaga acagggtgct
agtatttacc 360ccgttccctg tagacaccct ggatttgtca gctttgcaag cttcttggtt
gcagcggcct 420tgcctgtgcc cctttgagac tgtttccaga ctaaacttcc aaatgtcagc
cccttaccct 480tgacagcaag ggacatctca ttagggcatc gcgtgcttct catctgtgct
cagcaggccc 540gagataggaa cagaggggcg ttggagatgc cacttccacc agccctgggt
tgaaggggag 600cgagggagac accttttact taaacccctg agcttggtca gagaggctga
atgtctaaaa 660tgaggaagaa aaggtttttc acctggaaac gcttgagggc tgagtcttct
gcccttctga 720ctcccccagc aaatacagac aggtcaccaa cta
753101895DNAHomo sapiensCDS(91)..(966) 10gtgatccacc
cgccgcacgg gccgtcctct ccgcgcgggg agacgcgcgc atccaccagc 60cccggctgct
cgccagcccc ggccccagcc atg gaa gag ctc acg gct ttt gta 114
Met Glu Glu Leu Thr Ala Phe Val
1 5tcc aag tct ttt gac cag aaa agc aag gac ggt
aac ggc gga ggc gga 162Ser Lys Ser Phe Asp Gln Lys Ser Lys Asp Gly
Asn Gly Gly Gly Gly 10 15 20ggc ggc
gga ggt aag aag gat tcc att acg tac cgg gaa gtt ttg gag 210Gly Gly
Gly Gly Lys Lys Asp Ser Ile Thr Tyr Arg Glu Val Leu Glu 25
30 35 40agc gga ctg gcg cgc tcc cgg
gag ctg ggg acg tcg gat tcc agc ctc 258Ser Gly Leu Ala Arg Ser Arg
Glu Leu Gly Thr Ser Asp Ser Ser Leu 45
50 55cag gac atc acg gag ggc ggc ggc cac tgc ccg gtg cat
ttg ttc aag 306Gln Asp Ile Thr Glu Gly Gly Gly His Cys Pro Val His
Leu Phe Lys 60 65 70gac cac
gta gac aat gac aag gag aaa ctg aaa gaa ttc ggc acc gcg 354Asp His
Val Asp Asn Asp Lys Glu Lys Leu Lys Glu Phe Gly Thr Ala 75
80 85aga gtg gca gaa ggg att tat gaa tgc aaa
gag aag cgc gag gac gtg 402Arg Val Ala Glu Gly Ile Tyr Glu Cys Lys
Glu Lys Arg Glu Asp Val 90 95 100aag
tcg gag gac gag gac ggg cag acc aag ctg aaa cag agg cgc agc 450Lys
Ser Glu Asp Glu Asp Gly Gln Thr Lys Leu Lys Gln Arg Arg Ser105
110 115 120cgc acc aac ttc acg ctg
gag cag ctg aac gag ctc gag cga ctc ttc 498Arg Thr Asn Phe Thr Leu
Glu Gln Leu Asn Glu Leu Glu Arg Leu Phe 125
130 135gac gag acc cat tac ccc gac gcc ttc atg cgc gag
gag ctc agc cag 546Asp Glu Thr His Tyr Pro Asp Ala Phe Met Arg Glu
Glu Leu Ser Gln 140 145 150cgc
ctg ggg ctc tcc gag gcg cgc gtg cag gtt tgg ttc cag aac cgg 594Arg
Leu Gly Leu Ser Glu Ala Arg Val Gln Val Trp Phe Gln Asn Arg 155
160 165aga gcc aag tgc cgc aaa caa gag aat
cag atg cat aaa ggc gtc atc 642Arg Ala Lys Cys Arg Lys Gln Glu Asn
Gln Met His Lys Gly Val Ile 170 175
180ttg ggc aca gcc aac cac cta gac gcc tgc cga gtg gca ccc tac gtc
690Leu Gly Thr Ala Asn His Leu Asp Ala Cys Arg Val Ala Pro Tyr Val185
190 195 200aac atg gga gcc
tta cgg atg cct ttc caa cag gtc cag gct cag ctg 738Asn Met Gly Ala
Leu Arg Met Pro Phe Gln Gln Val Gln Ala Gln Leu 205
210 215cag ctg gaa ggc gtg gcc cac gcg cac ccg
cac ctg cac ccg cac ctg 786Gln Leu Glu Gly Val Ala His Ala His Pro
His Leu His Pro His Leu 220 225
230gcg gcg cac gcg ccc tac ctg atg ttc ccc ccg ccg ccc ttc ggg ctg
834Ala Ala His Ala Pro Tyr Leu Met Phe Pro Pro Pro Pro Phe Gly Leu
235 240 245ccc atc gcg tcg ctg gcc gag
tcc gcc tcg gcc gcc gcc gtg gtc gcc 882Pro Ile Ala Ser Leu Ala Glu
Ser Ala Ser Ala Ala Ala Val Val Ala 250 255
260gcc gcc gcc aaa agc aac agc aag aat tcc agc atc gcc gac ctg cgg
930Ala Ala Ala Lys Ser Asn Ser Lys Asn Ser Ser Ile Ala Asp Leu Arg265
270 275 280ctc aag gcg cgg
aag cac gcg gag gcc ctg ggg ctc tgacccgccg 976Leu Lys Ala Arg
Lys His Ala Glu Ala Leu Gly Leu 285
290cgcagccccc cgcgcgcccg gactcccggg ctccgcgcac cccgcctgca ccgcgcgtcc
1036tgcactcaac cccgcctgga gctccttccg cggccaccgt gctccgggca ccccgggagc
1096tcctgcaaga ggcctgagga gggaggctcc cgggaccgtc cacgcacgac ccagccagac
1156cctcgcggag atggtgcaga aggcggagcg ggtgagcggc cgtgcgtcca gcccgggcct
1216ctccaaggct gcccgtgcgt cctgggaccc tggagaaggg taaacccccg cctggctgcg
1276tcttcctctg ctatacccta tgcatgcggt taactacaca cgtttggaag atccttagag
1336tctattgaaa ctgcaaagat cccggagctg gtctccgatg aaaatgccat ttcttcgttg
1396ccaacgattt tctttactac catgctcctt ccttcatccc gagaggctgc ggaacgggtg
1456tggatttgaa tgtggacttc ggaatcccag gaggcagggg ccgggctctc ctccaccgct
1516cccccggagc ctcccaggca gcaataagga aatagttctc tggctgaggc tgaggacgtg
1576aaccgcgggc tttggaaagg gaggggaggg agacccgaac ctcccacgtt gggactccca
1636cgttccgggg acctgaatga ggaccgactt tataactttt ccagtgtttg attcccaaat
1696tgggtctggt tttgttttgg attggtattt tttttttttt ttttttttgc tgtgttacag
1756gattcagacg caaaagactt gcataagaga cggacgcgtg gttgcaaggt gtcatactga
1816tatgcagcat taactttact gacatggagt gaagtgcaat attataaata ttatagatta
1876aaaaaaaaat agcaaaaaa
1895

SEQ ID NO: 11. Length: 292. Type: PRT. Organism: Homo sapiens.
Met Glu Glu Leu Thr Ala Phe Val Ser Lys Ser
Phe Asp Gln Lys Ser 1 5 10
15Lys Asp Gly Asn Gly Gly Gly Gly Gly Gly Gly Gly Lys Lys Asp Ser
20 25 30Ile Thr Tyr Arg Glu Val
Leu Glu Ser Gly Leu Ala Arg Ser Arg Glu 35 40
45Leu Gly Thr Ser Asp Ser Ser Leu Gln Asp Ile Thr Glu Gly
Gly Gly 50 55 60His Cys Pro Val His
Leu Phe Lys Asp His Val Asp Asn Asp Lys Glu 65 70
75 80Lys Leu Lys Glu Phe Gly Thr Ala Arg Val
Ala Glu Gly Ile Tyr Glu 85 90
95Cys Lys Glu Lys Arg Glu Asp Val Lys Ser Glu Asp Glu Asp Gly Gln
100 105 110Thr Lys Leu Lys Gln
Arg Arg Ser Arg Thr Asn Phe Thr Leu Glu Gln 115
120 125Leu Asn Glu Leu Glu Arg Leu Phe Asp Glu Thr His
Tyr Pro Asp Ala 130 135 140Phe Met Arg
Glu Glu Leu Ser Gln Arg Leu Gly Leu Ser Glu Ala Arg145
150 155 160Val Gln Val Trp Phe Gln Asn
Arg Arg Ala Lys Cys Arg Lys Gln Glu 165
170 175Asn Gln Met His Lys Gly Val Ile Leu Gly Thr Ala
Asn His Leu Asp 180 185 190Ala
Cys Arg Val Ala Pro Tyr Val Asn Met Gly Ala Leu Arg Met Pro 195
200 205Phe Gln Gln Val Gln Ala Gln Leu Gln
Leu Glu Gly Val Ala His Ala 210 215
220His Pro His Leu His Pro His Leu Ala Ala His Ala Pro Tyr Leu Met225
230 235 240Phe Pro Pro Pro
Pro Phe Gly Leu Pro Ile Ala Ser Leu Ala Glu Ser 245
250 255Ala Ser Ala Ala Ala Val Val Ala Ala Ala
Ala Lys Ser Asn Ser Lys 260 265
270Asn Ser Ser Ile Ala Asp Leu Arg Leu Lys Ala Arg Lys His Ala Glu
275 280 285Ala Leu Gly Leu
290

SEQ ID NO: 12. Length: 1354. Type: DNA. Organism: Homo sapiens. Feature: CDS (91)..(765).
gtgatccacc cgccgcacgg gccgtcctct
ccgcgcgggg agacgcgcgc atccaccagc 60cccggctgct cgccagcccc ggccccagcc
atg gaa gag ctc acg gct ttt gta 114
Met Glu Glu Leu Thr Ala Phe Val 1
5tcc aag tct ttt gac cag aaa agc aag gac ggt aac ggc gga ggc gga
162Ser Lys Ser Phe Asp Gln Lys Ser Lys Asp Gly Asn Gly Gly Gly Gly
10 15 20ggc ggc gga ggt aag aag gat
tcc att acg tac cgg gaa gtt ttg gag 210Gly Gly Gly Gly Lys Lys Asp
Ser Ile Thr Tyr Arg Glu Val Leu Glu 25 30
35 40agc gga ctg gcg cgc tcc cgg gag ctg ggg acg tcg
gat tcc agc ctc 258Ser Gly Leu Ala Arg Ser Arg Glu Leu Gly Thr Ser
Asp Ser Ser Leu 45 50
55cag gac atc acg gag ggc ggc ggc cac tgc ccg gtg cat ttg ttc aag
306Gln Asp Ile Thr Glu Gly Gly Gly His Cys Pro Val His Leu Phe Lys
60 65 70gac cac gta gac aat gac
aag gag aaa ctg aaa gaa ttc ggc acc gcg 354Asp His Val Asp Asn Asp
Lys Glu Lys Leu Lys Glu Phe Gly Thr Ala 75 80
85aga gtg gca gaa ggg att tat gaa tgc aaa gag aag cgc gag
gac gtg 402Arg Val Ala Glu Gly Ile Tyr Glu Cys Lys Glu Lys Arg Glu
Asp Val 90 95 100aag tcg gag gac gag
gac ggg cag acc aag ctg aaa cag agg cgc agc 450Lys Ser Glu Asp Glu
Asp Gly Gln Thr Lys Leu Lys Gln Arg Arg Ser105 110
115 120cgc acc aac ttc acg ctg gag cag ctg aac
gag ctc gag cga ctc ttc 498Arg Thr Asn Phe Thr Leu Glu Gln Leu Asn
Glu Leu Glu Arg Leu Phe 125 130
135gac gag acc cat tac ccc gac gcc ttc atg cgc gag gag ctc agc cag
546Asp Glu Thr His Tyr Pro Asp Ala Phe Met Arg Glu Glu Leu Ser Gln
140 145 150cgc ctg ggg ctc tcc gag
gcg cgc gtg cag gtt tgg ttc cag aac cgg 594Arg Leu Gly Leu Ser Glu
Ala Arg Val Gln Val Trp Phe Gln Asn Arg 155 160
165aga gcc aag tgc cgc aaa caa gag aat cag atg cat aaa ggc
gtc atc 642Arg Ala Lys Cys Arg Lys Gln Glu Asn Gln Met His Lys Gly
Val Ile 170 175 180ttg ggc aca gcc aac
cac cta gac gcc tgc cga gtg gca ccc tac gtc 690Leu Gly Thr Ala Asn
His Leu Asp Ala Cys Arg Val Ala Pro Tyr Val185 190
195 200aac atg gga gcc tta cgg atg cct ttc caa
cag atg gag ttt tgc tct 738Asn Met Gly Ala Leu Arg Met Pro Phe Gln
Gln Met Glu Phe Cys Ser 205 210
215tgt cgc cca ggc tgg agt ata atg gca tgatctcgac tcactgcaac
785Cys Arg Pro Gly Trp Ser Ile Met Ala 220
225ctccgcctcc cgagttcaag cgattctcct gcctcagcct cccgagtagc tgggattaca
845ggtgcccacc accatgtcaa gataatgttt gtattttcag tagagatggg gtttgaccat
905gttggccagg ctggtctcga actcctgacc tcaggtgatc cacccgcctt agcctcccaa
965agtgctggga tgacaggcgt gagcccctgc gcccggcctt tgtaacttta tttttaattt
1025tttttttttt ttaagaaaga cagagtcttg ctctgtcacc caggctggag cacactggtg
1085cgatcatagc tcactgcagc ctcaaactcc tgggctcaag caatcctccc acctcagcct
1145cctgagtagc tgggactaca ggcacccacc accacaccca gctaattttt ttgattttta
1205ctagagacgg gatcttgctt tgctgctgag gctggtcttg agctcctgag ctccaaagat
1265cctctcacct ccacctccca aagtgttaga attacaagca tgaaccactg cccgtggtct
1325ccaaaaaaag gactgttacg tggaaaaaa
1354

SEQ ID NO: 13. Length: 225. Type: PRT. Organism: Homo sapiens.
Met Glu Glu Leu Thr Ala Phe Val Ser Lys Ser
Phe Asp Gln Lys Ser 1 5 10
15Lys Asp Gly Asn Gly Gly Gly Gly Gly Gly Gly Gly Lys Lys Asp Ser
20 25 30Ile Thr Tyr Arg Glu Val
Leu Glu Ser Gly Leu Ala Arg Ser Arg Glu 35 40
45Leu Gly Thr Ser Asp Ser Ser Leu Gln Asp Ile Thr Glu Gly
Gly Gly 50 55 60His Cys Pro Val His
Leu Phe Lys Asp His Val Asp Asn Asp Lys Glu 65 70
75 80Lys Leu Lys Glu Phe Gly Thr Ala Arg Val
Ala Glu Gly Ile Tyr Glu 85 90
95Cys Lys Glu Lys Arg Glu Asp Val Lys Ser Glu Asp Glu Asp Gly Gln
100 105 110Thr Lys Leu Lys Gln
Arg Arg Ser Arg Thr Asn Phe Thr Leu Glu Gln 115
120 125Leu Asn Glu Leu Glu Arg Leu Phe Asp Glu Thr His
Tyr Pro Asp Ala 130 135 140Phe Met Arg
Glu Glu Leu Ser Gln Arg Leu Gly Leu Ser Glu Ala Arg145
150 155 160Val Gln Val Trp Phe Gln Asn
Arg Arg Ala Lys Cys Arg Lys Gln Glu 165
170 175Asn Gln Met His Lys Gly Val Ile Leu Gly Thr Ala
Asn His Leu Asp 180 185 190Ala
Cys Arg Val Ala Pro Tyr Val Asn Met Gly Ala Leu Arg Met Pro 195
200 205Phe Gln Gln Met Glu Phe Cys Ser Cys
Arg Pro Gly Trp Ser Ile Met 210 215
220Ala
225

SEQ ID NO: 14. Length: 32367. Type: DNA. Organism: Homo sapiens.
tttctctgtc tccatccctc tgtctctccc
tttctctctg tctttccttg tctctctctt 60tctctctctc tctccatctc tctctctccc
tgtctctctc tctccatctc cccgtctctc 120cgtttctctc tctgcctctc cctgtctgtc
tctctctttc tgtgtcttac acacacccca 180acccaccgtc actcatgtcc ccccactgct
gtgccatctc acacaagttc acagctcagc 240tgtcatcctg ggtccccagg ccccgccggg
gaggaagatg cgccgtgggg ttacgggagg 300aaggggactc cgggcctcct ggtgccccac
tttatttgca gaaggtcctt ggcaggaacc 360gtgacgcgtt tggtttccag gacttggaaa
acgaatttca ggtcgcgatg gcgagcaccg 420gcttcccctg aagcacattc aatagcgaga
ggcgggaggg agcgagcagg agcatcccac 480catgaaaacc aaaaacacaa gtattttttt
cacccggtaa ataccccaga cgccagggtg 540acagcgcggc gctaagggag gaggcctcgc
gccggggtcc gccgggatct ggcgcgggcg 600gaaagaatat agatctttac gaaccggatc
tcccggggac ctgggcttct ttctgcgggc 660gctggagacc cgggaggcgg ccccggggat
cctcggcctc cgccgccgcc gcctcccaag 720cgcccgcgtc ccggtttggg gacacccggc
cccttcttct cactttcggg gattctccag 780ccgcgttcca tctcaccaac tctccatcca
agggcgcgcc gccaccaact tggagctcat 840cttctcccaa gatcgtgcgt ccccggggcg
cccgggtccc ccccctcgcc atctcaaccc 900cggcgcgacc cgggcgcttc ctggaaagat
ccaggcgccg ggctctgcgc tcctcccggg 960agcgagggcg gccggacgac tgggaccctc
ctctctccag ccgtgaactc cttgtctctc 1020tgtctctctc tgcaggaaaa ctggagtttg
cttttcctcc ggccacggag agaacgcggg 1080taacctgtgt ggggggctcg ggcgcctgcg
cccccctcct gcgcgcgcgc tctcccttcc 1140aaaaatggga tctttccccc ttcgcaccaa
ggtgtacgga cgccaaacag tgatgaaatg 1200agaagaaagc caattgccgg cctggggggt
gggggagaca cagcgtctct gcgtgcgtcc 1260gccgcggagc ccggagacca gtaattgcac
cagacaggca gcgcatgggg ggctgggcga 1320ggtcgccgcg tataaatagt gagatttcca
atggaaaggc gtaaataaca gcgctggtga 1380tccacccgcg cgcacgggcc gtcctctccg
cgcggggaga cgcgcgcatc caccagcccc 1440ggctgctcgc cagccccggc cccagccatg
gaagagctca cggcttttgt atccaagtct 1500tttgaccaga aaagcaagga cggtaacggc
ggaggcggag gcggcggagg taagaaggat 1560tccattacgt accgggaagt tttggagagc
ggactggcgc gctcccggga gctggggacg 1620tcggattcca gcctccagga catcacggag
ggcggcggcc actgcccggt gcatttgttc 1680aaggaccacg tagacaatga caaggagaaa
ctgaaagaat tcggcaccgc gagagtggca 1740gaaggtaagt tcctttgcgc gccggctcca
ggggggccct cctggggttc ggcgcctcct 1800cgccacggag tcggccccgc gcgcccctcg
ctgtgcacat ttgcagctcc cgtctcgcca 1860gggtaaggcc cgggccgtca ggctttgcct
aagaaaggaa ggaaggcagg agtggacccg 1920accggagacg cgggtggtgg gtagcggggt
gcggggggac ccagggaggg tcgcagcggg 1980ggccgcgcgc gtgggcaccg acacgggaag
gtcccgggct ggggtggatc cgggtggctg 2040tgcctgaagc cgtagggcct gagatgtctt
tttcattttc tttttctttc ctttcctttt 2100tttgtttgtt tgtttgtttg tttgagacag
agtctcgctc tgtcccccag gctggagtgc 2160agtggtgcga tctcggctca ctgcaacctc
cgcctcctgg gttcaagcga ttctcctgcc 2220tcagcctccc cagtagctgg gattacaggc
atgcaccacc acgcctggct aatttttgtg 2280cttttagtaa agacggggat tcaccatgtt
ggccaggctg gtctcgaact cctgacctca 2340ggtgatccac ccgcctcggc ctcccaaagt
gctgggatga caggcgtgag gcaccgcgcc 2400cggcctgggt cctgacggct taggatgtgt
gtttctgtct ctgcctgtct gccttgtatt 2460tacggtcacc cagacgcaca gaggagccgt
ctccacgcgc cttcccagcg ctcagcgcct 2520gccgggcccc cggagatcac gggaagactc
gaggctgcgt ggtaggagac gggaaggccc 2580cgggtcagct cggttctgtt tcctttaagg
aacccttcat tattatttca ttgttttcct 2640ttgaacgtcg aggcttgatc ttggcgaaag
ctgttgggtc cataaaaacc actcccgtga 2700gcggaggtgg ccgggatctg gatggggcgc
gaggggcccc ggggaagctg gcggcttcgc 2760gggcgcgtcc taagtcaagg ttgtcagagc
gcagccggtt gtgcgcggcc cgggggagct 2820cccctctggc ccttcctcct gagacctcag
tggtgggtcg tcccgtggtg gaaatcgggg 2880agtaagaggc tcagagagag gggctggccc
cggggatctc tgtgcacaca cgacaactgg 2940gcggcataca tcttaagaat aaaatgggct
ggctgtgtcg gggcacagct ggagacggct 3000atggacgcct gttatgtttt cattacaaag
acgcagagaa tctagcctcg gcttttgctg 3060attcgcagag ttgaggtgcg agggtgaatg
ccccaaaggt aattcttcct aagactctgg 3120ggctacctgc tctccggggc cctgcatttg
gggtgtggag tggccccggg aaatagccct 3180tgtattcgta ggaggcacca ggcagcttcc
caaggccctg actttgtcga agcagaaagc 3240tgtggctacg gtttacaaag cagtccccgg
tttctgaccg tctaagaggc aggagcccag 3300cctgcctttg acagtgagag gagttcctcc
ctacacactg ctgcgggcac ccggcactgt 3360aattcataca cagagagttg gccttcctgg
acgcaaggct gggagccgct tgagggcctg 3420cgtgtaattt aagagggttc gcagcgcccg
gcggccgctt ctgtggggtt gctttttggt 3480tgtccttcgc agacaccgtt ttgctcctct
gaactctctc ttctccccct ggccgtggac 3540ccgggagagc aaagtgtcct ccagaccttt
tgaaagtgag aggaaaataa agaccaggcc 3600aaagacccag ggccacagga gaggagacag
agagtccccg ttacattttc cccttggctg 3660ggtgcagaaa gacccccggg ccaggactgc
cacccaggct actatttatt catcagatcc 3720aagttaaatc gaggttggag ggcaggggag
agtctgaggt taccgtggaa gcctggagtt 3780tttgggaaca gcgtgtcccc gccgagcctg
ggagcccgtg ggttctgcaa agcctgcggg 3840tgtttgagga ctttgaagac cagtttgtca
gttgggctca attcctgggg ttcagactta 3900gagaaatgaa ggagggagag ctggggtcgt
ctccaggaaa cgattcactt ggggggaagg 3960aatggagtgt tcttgcaggc acatgtctgt
taggaggtga aacagaatgt gaaatccacg 4020ttggagtaag cgtccagcgc tgaatgtagc
tcggggtggg gtgggagggc cctggtgtgg 4080atcgtggaag gaagaaagac agaacagggt
gctagtattt accccgttcc ctgtagacac 4140cctggatttg tcagctttgc aagcttcttg
gttgcagcgg ccttgcctgt gcccctttga 4200gactgtttcc agactaaact tccaaatgtc
agccccttac ccttgacagc aagggacatc 4260tcattagggc atcgcgtgct tctcatctgt
gctcagcagg cccgagatag gaacagaggg 4320gcgttggaga tgccacttcc accagccctg
ggttgaaggg gagcgaggga gacacctttt 4380acttaaaccc ctgagcttgg tcagagaggc
tgaatgtcta aaatgaggaa gaaaaggttt 4440ttcacctgga aacgcttgag ggctgagtct
tctgcccttc tgactccccc agcaaataca 4500gacaggtcac caactactgg agatgagaaa
gtgccatttt tggcacactc tggtggggta 4560ggtgcccgac cgcgtgtgaa aaagtgggaa
ggagagattt ctgcgcacgc ggttcagccc 4620ccaggcgcgg tggcgcattc aggtactcag
acgcggttct gctgttctgc tgagaaacag 4680gcttcgggta ggggctccta gctccgccag
atcgcggagg gacccccagc cctcctgcgc 4740tgcagcggtg gggatagcgt ctctccgtag
gcctagaatc tgcaacccgc cccgggtcct 4800ccccgtgtcc ttcccgggcg tcccgccggg
gatcccacag ttggcagctc ttcctcaaat 4860tctttccctt aaaaatagga tttgacaccc
cactctcctt aaaaaaaaaa aataagaaaa 4920aaaggttagg ttatgtcaac agaggtgaag
tggataattg aggaaacgat tctgagatga 4980ggccaagaaa acaacgctcg tgcaaagccc
aggtttttgg gaaagcagcg agtatcctcc 5040tcggcttttg cgttatggac cccacgcagt
ttttgcgtca aagcgcattg gttttcgagg 5100gccccctttc caccgcggga tgcacgaagg
ggttcgccac gttgcgcaaa acctccccgg 5160cctcagccct gtgccctccg ctccccacgc
agggatttat gaatgcaaag agaagcgcga 5220ggacgtgaag tcggaggacg aggacgggca
gaccaagctg aaacagaggc gcagccgcac 5280caacttcacg ctggagcagc tgaacgagct
cgagcgactc ttcgacgaga cccattaccc 5340cgacgccttc atgcgcgagg agctcagcca
gcgcctgggg ctctccgagg cgcgcgtgca 5400ggtaggaacc cgggggcggg ggcggggggc
ccggagccat cgcctggtcc tcgggagcgc 5460acagcacgcg tacagccacc tgcgcccggg
ccgccgccgt ccccttcccg gagcgcgggg 5520aggttgggtg agggacgggc tggggttcct
ggacttttgg agacgcctga ggcctgtagg 5580atgggttcat tgcgtttgtt tttcaccaac
agcaaacaaa tatatataca tatatattat 5640acaaataaca aataaatata tatgttatac
agatgggtat attgtatata ttatagatat 5700ttgttcgtcc ttggtgcaaa gacacccggt
gaacccatat attggctcct gactgccttc 5760ggttcccctg ggattggtta taggggcaac
acatgcaaac aaaactttcc ctggattata 5820cttaggagac gaagctacag atgcgtttga
tccagagtgt tttacaagat ttttcattta 5880aaaaaaaatg tgtcttttgg cccctgattc
ccctccgtct tcccgtgtgg ctgcattgaa 5940aaggtttcct taggatgaaa ggagaggggt
gtcctctgtc cctaggtgga gagaaacagg 6000gtcttctctt tcctccgttt tttcacctac
cgtttctatc tccctcctcc cctctccagc 6060cctgtcctct gctacaaacc accccctcct
ccctccggct gtggggagcg caggagcacg 6120ttgggcatct ggatgagcgg agactattag
cggggcacgg gggctccccg aggagcgcgc 6180gaattcacgc tgccccatga gaccaggcac
cggggggcgg aggggccttg ggtgtccgca 6240gagggacggg cgggcagagc cttcctccgc
attctaaaca ttcacttaaa ggtatgagtt 6300tatttcaggg gtgctgctgg gagagcctcc
aaatggcttc ttccagcccc tgcctgacag 6360ttcagctccc ctggaaggtc aactcctcta
gtcctttctc ctggttctgg gcaggacaga 6420agtgggggga gggagagaga gagagagaga
gagagagacg gtcaggatcc ccggaccctg 6480gggaacccgt caaaaataaa tgaaattaag
attgccgacc agagagagaa ccgtgacaaa 6540gcaaacggcg ttcaaagcaa agagacgaac
tgaaagcccg ttcccgtagg actggttatg 6600aggtcaacac attcaaacac agcttgctct
ggattttgct gagcagagga agatacagat 6660gcatttgatc caaagtgtgt tacatctttc
attatatgtg tgtctatata tataaacata 6720tataaatata taaacataca taaatgtatg
taaatatata taatctatat acatatataa 6780atatataaac acatatataa tatataaatc
tataaacata tataatatat aaacataaat 6840atataaacat atataatata taaatatatt
aacatatata aaatatgtat aaatatatat 6900aaacatataa acatatataa atatataaac
atataaatat ataaacatat ataaatatat 6960acaaacatat tgtatatata taaatatata
taaaaacata tatatacata taaaaatata 7020tataaacata tatacatata aagaaatata
tataaacata tatacatata aatatacata 7080tataaacata tatatacata aaatatatat
aaacatatat acatataaaa atatatatat 7140attaacatat atatacatat aaaaatatat
atattaacat atatatacat ataaaaatat 7200atatatattt ttggcccctg attcccttcg
gttcctgtgg gatgggtgat tgagtcaaca 7260cattcaaaca caacttttcc atcgatgttg
cttaggagat gaggatacag atgcgtttga 7320tggagagggt tttacaagct ctttcattta
aatatatata tatatatata tatatttttt 7380ggctcctgat tctcttccgt cttcccatgt
ggctgcattt taaaaggctt ccctaagatc 7440gttacgatta aatcaaccct ccccaggcat
ctttaccgag ggctgtggtc cccaaagcga 7500tacagcccag gagggagaga ggctttggtg
acttggagga aggactgtgt ccctccttag 7560ggcgtctgtg gcctcagtga gggaaggaag
ctgcatcaga caggggtttc ctcgctgtcc 7620acccctctgg cagaagatgg attgggctgc
cccgtataaa ttaatgaaaa gattaaagtt 7680tcgctaaagg ggacatcgag tttatgtgtc
atctcctggt gtctgtgtgc ctgggatctg 7740caatatatcc cagcccttga tgtactgttt
ctataaaaat aaattacttg taatttaatt 7800ccacactatt tctttccgta gtctattacc
gacgagagca cgttagttca gctgcggaaa 7860attggttgtg gggtgtgtgc ggaccccgag
aacgccctaa aataaagaca aatcggggac 7920aagctggggg ttatcgattg caggggtcgc
atgaaaattt aacgacggta aataataata 7980aaaacaaaca tgggaatgca ataaaagaca
taattctcca tcgccgcggg gggaaaggat 8040cctatagtaa aggcgagtgc gctttgaggg
gtcataaaaa tcaattagtt ccaacaccca 8100cgtcccgcgt tgaggggacg gggacgagca
gggacagaaa aagaaaccat atttgaatcc 8160catctctctg tgaattcttg ggtcacatgc
gtctcagtac agcccgtccc gtgctgtgac 8220cggatagagt ttcaatttac tgtggaaatt
tgctgtaaat aaattgagca tccgatagaa 8280gctgttgctg attaaccttt tatttttagc
gtggccctgc aaagtcgtat cacccagctg 8340tcaggcttct aatcgaaagt tatgagacca
cggtgagggg caggcggtaa tttaattaca 8400acaaatatct ttgggtttat ggcgcagagc
taaattaaat gtcattattc actgtctgta 8460atggaaatca aaaggaaatc gcattacggc
atttgggaaa gaaagcgggg agtgctcttt 8520aatgaagaaa taactgtctt aagcagtgtc
acacacttca cttaccatat tcgggcctaa 8580ttggaatgga tcgtgaatca ctccaagact
gatttattag cgcttcacgc agcggctaat 8640tcatcacttg tattcttcat catttttttt
tttcctctcg ccgtgttgaa gggagagtga 8700atgaggcttt ccacgtttca ggaggatttt
cttttttgaa aaatgccctt ccagaggctt 8760ttgggtggct ggcttgcttt ctgggccctg
gaggagacag gcggagagtc caggtgggca 8820tggagaggca cagtggcagg tcacctggat
ggtcagtgga ggtggaggtc tgaaggcgcc 8880agctttggaa attattggtg aatttcgatg
tcagcaccag gcaggggcct ttttggcggg 8940ggtgtgaggg aggatgactt tgctgggaaa
caggatcagg ttctccaggc gcactgcagc 9000ccggtaggac ccactttgga aatgaaaagc
cagttccgaa agctgggctg gaagcttccg 9060tgttgggttc aagagcaagt tcacgttgcg
ctgtgtagac tcctggctgc tcccaaactc 9120tgagggtttt ctgaggttcc cttcataggg
gcaccggccc tgggccatgc acagtgcgta 9180agggtggctg tgggccgagg gacccagcac
gtgttttgcc cacaacagcc ggagtgactg 9240gttcactcac cgccttggcg gaggacgcct
gttctctgga cgaatcattt ctcttgggtg 9300gtgactgcct tgtgggtcaa ggtgcaggtt
ttctgccaca gaaaacctgt taggaggaat 9360taagcgacta agactgtcag ggaggtggtg
gtgggggaga ggagggggtg gtgtccagat 9420taccaggcat aggctaaact gcctgcactc
tccagctggt ctgtctgtgg aggaggggat 9480tgtcaatact gggagagcag aggaggctcg
taggaggtga gagggggtgg aatttgcatg 9540caaatcttca catgaggcct gtgtgaattt
ctccagcctc ctgagggtcc cctgcgctat 9600tgcactcaac ttcttgatag tttaccccaa
gactcagaag tccttagagg ggcagaatgc 9660ccccaccaca aagcctgcta tccttgggcg
tcctcaggac ccttggtcat gaatgggacc 9720ctttcatgta tggggaccct tggtaatatg
aatgggacgc cttcagctcc ccagggcttc 9780cgaggaggcc gagaagggca aagacacttc
cgaggaggcc gagaagggca aagacatttt 9840ctgggcttgg tgtgtcaaga gctagattgg
agaaggggct ggatttggaa ctctttagcc 9900atcagctcac cctctccgtt tgtggctaaa
gtctgaaggt ggaaacttcg gttctcctac 9960agggtctaca ggagttgggg ggcggggcgc
ccacacagaa cgctggaaag ttcgacagtc 10020cacttccact ggctcggaac tcactttttc
accttaagtt catcagcggt aacgcatagg 10080tctcacttag gcagggcacg gatgatttaa
caatttctac ttctaggtca ggtgcggtgg 10140ctcacacctc taatcccagc actttgggag
gcccaggagg gtggatcgct tgaggtcagg 10200agtttgagac cagcctggcc aacatggtga
aaccccgtct ctactaaaat acgaaaatta 10260gccaggcatg gtggtgagca cctgtaattc
cagctactcg ggaggctgag gcaggagaat 10320cgcttgaacc tgggaggtgg acgttgcagt
gaggtgagat cacaccactg cactccagcc 10380tggatgagag agcaagactc tgtctcaaaa
acaaaataaa acaaaaacaa aacaaaaatc 10440aaaaaagaaa acccaatttc cagttctagg
ccaggtgcag tggctcacgc ctgtcatccc 10500agcactttgg gaggcccagg agggtggatc
gcttgaggtc aggagttcga gaccagcctg 10560gccaacatgg tgaaacccca tctctactaa
aaatacaaac gttagctggg tgtggtggtg 10620tgcgcctgta atcccagcta ctcgggaagc
tgaggctgga gaattgcttg aatctgggag 10680gtggaggttg cagggaggcg agatagtgcc
actgcagtcc agcctggacc agagagcaag 10740actccgtctc aaaaacaaaa gaaagcaaaa
acaaaaaaca agagaccagc ctggccaaca 10800tggtgaaacc gcgtctctac taaaatacaa
aattagccgg gcatggtggt gggcacctgt 10860agtcccagct actcgggagg ctgaggcagg
agaatggctt gaacctggga ggtggagctt 10920gcagtgagcc gagatagtgc cactgcactc
cagcctgggc gacagagcga gacttgattt 10980cagaaccacc accaccacaa caaaacaaaa
caaaaaatcc aaaaaaaccc caatttccag 11040tactaggtag tcagtgatgc agggctggag
acagaggggc ggtaagtgtc tgggcgccca 11100ccatcagtca cctcccagct cccagaggtg
caaagtgctt ggttcagcct catgggaagg 11160atgctccctg gggaggctgg gctgggttca
cagggctctt cacatctctc tctgcttctc 11220cccaaggttt ggttccagaa ccggagagcc
aagtgccgca aacaagagaa tcagatgcat 11280aaaggtgggt gtcgggactg gggggacctg
aagctggggg atcctgctcc aggagggatg 11340gggtcgacga ggtgctggct acacccagga
ccaccacact gacacctgct ccctttggac 11400acaggcgtca tcttgggcac agccaaccac
ctagacgcct gccgagtggc accctacgtc 11460aacatgggag ccttacggat gcctttccaa
caggtagctc actttttctt cctctgaaga 11520tccctaggga cctgctgctc ccttcccctt
tcccctattt gctgccgcat cctgacactc 11580ctagtccctc cctgcccctg cagacttctc
agctggccct tagaaaaaaa gcctcttttc 11640cgaggaggca tttacaggca ccttggcacc
tatgaaatca ggctgggcca ggcggggtgg 11700ctcacacctg tcatcccagc actttgggag
gctgaggagg gtgcatcacc tgagatcagg 11760agttcaagac cagcctggcc aacttaacga
aaccccgtct attaaaaata caaaatgggt 11820gtggtggctc acgcctgtca tcccagcact
ttgggaggcc gaggcaggtg gatcacctga 11880ggtcaggaat tcgagaccag cctgaccaac
atgctgaaac cccgtctcta ctgaaaacac 11940aaagcttagc cgggcgtggt ggtgcacacc
tgtgatccca ggtacttggg agggagaatc 12000acttgaacct gggaggtgga ggttgccgtg
agccaatatc gcgccactgc actccactct 12060gggtgacaga gtgagactcc aagactccat
ctcaaaaaaa aaaaaaaaaa tcaggctgta 12120aaaatccact tttgggaagg tgaacacaca
caagcccaaa cagaaatctg acaaaaacca 12180gaggggtgaa aagtccacac agtcaggcac
ccccacctgg cttgctgcct ggttaagaag 12240ggcgcagatg cctgtgcctg gataccagag
atgggacaga cacccattcc cttttcatca 12300ccacccccga gtgcccgagg gcctggggcg
tctgcctggc ccctggcccc tggcttgggc 12360tctgcacctc tgaactggag acaccctact
cagctcccca cttactttgg agtgagcagc 12420gcttgggtgc ccagcgtgga tttggggctt
ccagggagtc ggggttcggt cgcggagccc 12480aagcttccca agggcgcccc cgccctgccc
tggcttagtg gtggggatgg gatgggggga 12540aacggggagc tgcgtggaag gaggtgaagg
gtcacaggag gagagagcgc agcgcccacg 12600tgcgccctgc ctgaacgcgc agcgcagcgc
ccggctgcgg tgccccttgc cccttcggtc 12660cctaatttgg ggatcgggag tgcatgcgcg
ggcggaacgg gcttgggggg ggggctctgg 12720cagggcggac gcgtggcctc ccttcttcac
cgttttattc caaggggaca ggctggggat 12780tgtatttggg cgcgtgtttg gctgagggtg
cagggacttg gggggtggcg gtggggagcg 12840cggaaggtat aaacgtataa atcataagta
aacaactcag aaatggaccc cgagcgctgg 12900tcgccgctag ctctccagct ctccctggcc
caggcccgaa ggagaggggt ccgcatccct 12960ccgcggttct cctctcctgg gtacctggcc
ttgaggtggg ggaacgagcc tacttcttgt 13020accgtctttt gccgacggcg ggacccagtg
aaattaggcc gttggagccc gcaggcctgc 13080ctggctttgc gcaccggagt cttggggacc
tggtgtcccc gggaaaaact tggggacctg 13140gtatccccgg gagaggcttg gggacctggt
gtcccgggag aggcttgggt acctggtttc 13200tctggaagag gcttggacac ctggtgtcct
gggagggcct ttgggacctg gtgtcctggg 13260agaggcttgg agatctgttg tcctgggaga
ggcttgggga cctggtgtcc ctggagaggc 13320ttggggacct ggtgaccttg gagaggcttg
gagacctggt gttctgggag aggcttgggg 13380acctggtgtt ctgggagagg cttggggacc
tggtgtctct ggaagaggct tggacacctg 13440gtgacccggg agggccttgg ggatctggtg
tcccgggaga gccttgggga cctggtgtcc 13500tgggagaggc ttggggacct ggtgaccttg
gagaggcttg gggacctggt gtcctgagag 13560agccttgggg atctggtgtc ccaggagagg
cttggggacc tggtgtctct ggaagaggct 13620tggacacctg gtgtcctggg gagaggcttg
gggacctggt gtcctgggag aggcttgggg 13680acctggtgtc ctgggagagg cttggagatc
tggtgagccg ggagaggctt ggggacctgg 13740tgtcccggga gaggcttggg gacttggtgt
cccgggagag gcttgaacac ctggtgtccc 13800aggagaggct tggggacctg gtgaccttgg
agaggcctgg ggacctggtg acccgggaga 13860gccttgggga cctggtgtcc tggggagagc
cttggggacc tggtgacctt ggagaggctt 13920ggggacctgg tgtctcggga gtgccttggg
gacctagtga cccgggagag gcttggggac 13980ctggtgtccc gggagaggct tggggacctg
gtgtcctggg agagccttgg ggatctggtg 14040tcctggggag aggctggggg acctggtgtc
tcgggagaga gccttgggga cctggtgacc 14100cgggagaggc ttggacacct ggtgtcccgg
gagaggcttg gggacctggt gacccgggag 14160agccttgggg acctggtgtc ctggggagag
gctgggggac ctggtgtctc gggagagagc 14220cttggggacc tggtgacccg ggagaggctt
ggacacctgg tgtcccggga gaggcttggg 14280agcctggtgt cccgggagag ccttggggac
caggtgacct tggagaggct tggggacctg 14340gtgatcttgg agaggcttgg ggacctggtg
tctcgggaga ggttacgggg gctggttggg 14400ggagagaacg ttgtgagcca aagtccctga
atccctgcga aaagagcgca tcgggagctc 14460cccctgaggg cgttccattt gtggaccccc
ctcccatgcg ctttgcaggg agctgttcgg 14520attcccctgg cccggctccc gcggatgcat
ccagtggcag cgccaattct gggccagggg 14580gaaggaggaa aggcgggtgt ggggtggtct
ccacggctgg agaaggggcg acgctcccta 14640ggggagaaga ggcacgttgg gggtttccgg
gggcgcgggg cggagcaggc cccccagtcc 14700ccatcctgcg ccctcacccc gccgggtccg
ctcccgcagg tccaggctca gctgcagctg 14760gaaggcgtgg cccacgcgca cccgcacctg
cacccgcacc tggcggcgca cgcgccctac 14820ctgatgttcc ccccgccgcc cttcgggctg
cccatcgcgt cgctggccga gtccgcctcg 14880gccgccgccg tggtcgccgc cgccgccaaa
agcaacagca agaattccag catcgccgac 14940ctgcggctca aggcgcggaa gcacgcggag
gccctggggc tctgacccgc cgcgcagccc 15000cccgcgcgcc cggactcccg ggctccgcgc
accccgcctg caccgcgcgt cctgcactca 15060accccgcctg gagctccttc cgcggccacc
gtgctccggg caccccggga gctcctgcaa 15120gaggcctgag gagggaggct cccgggaccg
tccacgcacg acccagccag accctcgcgg 15180agatggtgca gaaggcggag cgggtgagcg
gccgtgcgtc cagcccgggc ctctccaagg 15240ctgcccgtgc gtcctgggac cctggagaag
ggtaaacccc cgcctggctg cgtcttcctc 15300tgctataccc tatgcatgcg gttaactaca
cacgtttgga agatccttag agtctattga 15360aactgcaaag atcccggagc tggtctccga
tgaaaatgcc atttcttcgt tgccaacgat 15420tttctttact accatgctcc ttccttcatc
ccgagaggct gcggaacggg tgtggatttg 15480aatgtggact tcggaatccc aggaggcagg
ggccgggctc tcctccaccg ctcccccgga 15540gcctcccagg cagcaataag gaaatagttc
tctggctgag gctgaggacg tgaaccgcgg 15600gctttggaaa gggaggggag ggagacccga
acctcccacg ttgggactcc cacgttccgg 15660ggacctgaat gaggaccgac tttataactt
ttccagtgtt tgattcccaa attgggtctg 15720gttttgtttt ggattggtat tttttttttt
tttttttttt gctgtgttac aggattcaga 15780cgcaaaagac ttgcataaga gacggacgcg
tggttgcaag gtgtcatact gatatgcagc 15840attaacttta ctgacatgga gtgaagtgca
atattataaa tattatagat taaaaaaaaa 15900atagccgtgc actcttgacc ccgtcaacgt
ccaacgtgga aaaggcgtta cctcttctcc 15960cagcgctggc cgcctggcca ctgagggccc
tttgcaaaaa tcacgggtgt agagatggcc 16020ctgggcgcgc tgggagtgtg gttgtgtttc
tgaaggggat aaaagagggc acggtggtgc 16080caagatatca gtttggtacc tgagctgttt
ctggttggga agcgtaaaag ccagggagag 16140atccagagag ttttcaagtt tttgcagatg
taggtggttc cagcttttct ttctccccta 16200ctccatcttc tgcgttcccc cagttctttt
atttctttgt tttttatttt tgagacagag 16260acttgctttg tcgcccaggc tggagtgcag
tggcgcaatg tcagctcact gccacctcca 16320cctcccgggt tcaagcgatg ctcctgcctc
agcctcccga gtagctggga ctacaggcac 16380ctgccaccac ccccggctaa ttttttgtat
ttatagtaga gacggggttt caccgtgttg 16440gccaggctcg tctcgaactc ctgacctcag
gtgatctgcc cgcctcggcc tcccaacgtg 16500cccccagttt tataaacagc agatagcaac
ttgtcgtcac agctggcatg ggctggacag 16560ttgcttgaaa tgacctaacc aaaaacattc
aagggttctg cccccagatt tcgggagatc 16620cacgttccat gttctgattg gttttctggg
aacacagcaa ggggtttggt gacctccgag 16680aagatccatc tgcatgattg gcattagtta
ccacagcctg cccagagaga aactatcttc 16740tcccaacatt tactaacatc cactggtcaa
ctctcttatt tccataacac atttgcatct 16800ttctggattc aagcttggtg gttttctttc
ctaacttctg atttagatac ttctccctga 16860ggtggggata aaagaaaaaa aaaaaacaac
ttcttttttt cttccgcata acactttcta 16920tcttgtcact gagctgaact gtagatccat
ttggacccgt ctcatttgta tcttctgata 16980ttctttatac aaaccaaaag tccccttcaa
cattttttat gtcaaaatgt tacaaccgct 17040gtaaaatgac ggagagagag agaaagaatc
ccagacatta acggtattag agagtttgcc 17100tcattcatcc atttttctta aaagctggaa
attaaaaaaa aaaaagagag agagaggctt 17160taatagttaa gctgaaattt ttatcgaaaa
gaagaattgc attttgaatc tttgggaagt 17220aggttcattc atcagagtat gtaacccttt
ggaaaagtgg ttggtaagat atgtacagcc 17280ctagattttt ttttttttaa ccaaaaaggc
tgagtaattt tgaaaaatcg aaacataaca 17340gtgtgtcatc atttcctccc aagaaaaagc
tcactccacg tgagtagaaa gacatctacc 17400tggtccctgt agaatctgaa cgtttctctt
tagagacgga atttcaatct tgttgcccag 17460gctggagtgc agtggcacaa tctcggctca
ccgcaacctc cgcctcccgg gttcaagcca 17520ttctcctgcc tcagtctccc gagtagctgg
gattacaggc acctgccacc aggcctgggt 17580aactttctgg tatttttagt agagacaggg
tttcagcctc ccgagtagct gggattacag 17640gcacctgcca ccaggcctgg gtaactttct
ggtattttta gtagagacag ggtttcagcc 17700tcccgagtag ctgggattac aggcacctgc
caccaggcct gggtaacttt ctggtagttt 17760tagtagagac agggtttcgg cctcccgagt
agctgggatt acaggcacct gccaccaggc 17820ctgggtaact ttctggtatt tttagtagag
acagggtttc ggcctcccga gtagctggga 17880ttacaggcac ctgccaccag gcctgggtaa
ctttctggta tttttagtac agacagggtt 17940tcggcctcct gagtagctgg gattacaggc
acctgccacc aggcctgggt aactttctgg 18000tagttttagt agagacaggg tttcagcctc
ccgagtagct gggattacag gcacctgcca 18060ccaggcctgg gtaatttttt tgcatttttg
gtagagacag gtttttgccg tgttggcccg 18120gctggtctca aactcctgac ctcaggttga
cctgcccgct ttgtccctcg caaagtgctg 18180ggattacagg cgtgagccac cacacctggc
ctgaatctga acttttaaaa gggagttact 18240gactctcaac tgtgcgggga cggtttcact
ttgatttaat atggaaagag ggccaagtgt 18300catcctcaca aatgggtccc cgaagcagat
caaacgcaga gaactgtgag ggtgggacac 18360gagtgtctgt ggacactggc tgcctttggc
ttttctcctg cgagagaagt tgggtgactt 18420tctgtaggtg gatgagtgat ccctgaatga
gtgtggggta cgtgtatgct agctgcttct 18480ttctccctga aactctcgga tggaaggaag
taagaaattc agcttgggct gtgaccagtt 18540ctcaccacca acgccctctt ctctctccct
tctccttcct tccttccttc cttcctttct 18600ttctttttct ttctttctct ctttctttct
tttctttctt tctgtttctt tcctttttat 18660ctttctctct ttttctttct cttttccttt
tttgtttctt tctttctttt tctttctttc 18720tttttctttc ttctttcttt cttcgatgaa
gtctcactct gtcacccagg ctggagtgca 18780gtggtgcaat cccagctcac tgcatcctct
acctcctggc ttcaagaaat tctcctgcct 18840cagcctccca agtagctggg atgacaggca
cccaccacca ttcccggata atttttgtat 18900tttttagtag agactgggtt tcgccatgtt
ggccaggctg gtcttgaact cctgacctca 18960catgatccac ccgcctcagc ctcccagagt
gctgggatta cggggtgagg caccgcgccc 19020ggcctcctct ctctttttct gagatgttta
ggaaggactg ggctgatggg gaccctctgt 19080atgtgatgtg cgtgggtttg gtttcccgga
aggccctcca gagacacgtt tgcgtgaaca 19140ttcagcatgg aaacaacata cgtctctcca
caggaggtga gaaattgaat ttatggggtg 19200ggtgtacgct ggcgattctt ggtgcttttt
gctcaaaaca aggttctttt gaaagtcacg 19260ttcctgcttt ccctgtggct tcccggtgag
ctcgctcgca gagcaaggaa taccacccag 19320agagcaacgt gggctgtgtt ccgttgtaac
gccgttgcag agagaggatt tggtgtgtga 19380gatccgtacc agctccagca cactgatagg
aacacgttgc tggccgaact gaacgatgct 19440gggttgggtc ctgattgata cgtattttct
tccctcctct ccccaaaact tggccaaata 19500gtccgtggag ggttgtcagt cgccgcagtt
gagcaaaaaa cacttcttcc tttgagtggc 19560tgttctggtg aaatctgttt ctgacatatc
cacttttctc tctcttttct ctctctctga 19620ctgcgaagca cccacaggga gaaggaattg
gatgtatcgg atgttggtat tagattttct 19680ttctccgttc gagtctctga ctggtgcata
ctttgcaaag gtgtgttcct ggcaattgcc 19740aagagttaga aaaatgcacc ttctctggtg
gccgttgggg tgttgtttca caggcagtgg 19800tgacagggcc ccttggctgt ggctgtcttc
tccagcgccg tggataaaga gacgggacag 19860attctgtgcc tctgtacgat ttagagcgta
actgaccgcg tccaacaccc gtttttccac 19920ttacaaagct ggtggtgcga cgggcttggt
gtctcccgta cgggaaggag gcctttgggc 19980cgctccaaag acgccctgtc gtaggaatgg
cctctccatc ccgccaaagt ccagccaggc 20040ccccgaaatg gtcccatttc cttggaagcc
tgagtttctg ttctggtctt gctgctgtcc 20100ttggccacgt cagcacgtgg gagcatctgt
ggataccgca gagtctgggg acagctgggc 20160gtttaaccga aatgaagccg agacgggttt
caggttttgg tgccaagctc tggtcaggat 20220gaaagggaaa taccagagtc ctctgtcctc
gcctctgggt ttcatgctga cctttctaac 20280atttgttttc ccctaagaac aagcagaagc
ctccagctcc ctttagctcc acagttttcc 20340cggggacata gcgaggatgg cacacggcag
ccactcccac gacacacatt tcggaggcac 20400tttgctggaa gccgcttgtc tcctccagct
ttgggaggtc tggggaggag agaggctttc 20460ggtggacacg tttgacatta aaaaaaaaaa
aaaaaaaaaa aaaaaaactg gtgcctaatt 20520tattaaagag aattagctta gcgagtatat
gctgatattc ttcgacacac gtgggtaagt 20580tgatgccatt tataaatgtt ttattgaaat
ttgatattta atgagaagcc ggttaaggaa 20640tgtagacaat atcccgtttc aaagctatga
aatgtgctat ttattgaaag gggatgtggc 20700ttcacgagtt cagcccattg tacgtgcagg
tcccgtggga aggaggcaaa agcccctgct 20760tcttactttg tgatgtatgt gcatttgtta
tttatttttt tttccttggt cggacgttca 20820taaatatgta ctattttaat tatgtcgagt
gtaaatttga catcgcgttg catttatttt 20880tatatttctg aaaactgttg ctttttcttt
ttccctcccc cattgacgac atagcggccc 20940ccgcgtccgg gttacaaata catctacaga
tattttcagg gattgcttca gatgaaaaca 21000aatcacacac cgtttcccaa accaacagtc
ttcacatttc tatccctctg ttattgtcgg 21060caggcggtga ggggtagaaa aaaaacaaac
aaacaaacag aaaaaaaaac caaaaaaaac 21120caccctgagt ttctctggtg acgccctcat
tctcctaacg ttcaataatc tcaatgttga 21180gttgcagcaa cagactgtat ttttgtgacg
ccccgtagta tgaatgtaca tcttgtaaaa 21240ctgagatata aataaactta taaatatttg
tattcaagtg ttaaaaaaaa aaaaattctc 21300aacctctccc ctgaggacag gcttattgga
aaaaaaaaaa aaaaaaaaaa atcctgagtc 21360ggccgtggct gaacacagag tgttgttctg
ctccgtgcat ttccagggtg ggtacccagt 21420gttgcccccc agccttagat cgggaggtac
cattgacttt tgcttgtatc ccatcccctt 21480cctttactga aacctacctc cccgcttctc
agccaacgtc cccccagaag gtggcaaaaa 21540aaacagagga aaaagccctg atttgaatca
agtcagagct gctaattctc cactttcttt 21600aattaattaa tttatttttt tttttgagac
tgagtctcgc tctgtcgccc aggccggagg 21660agtgcagggg cgcgatctcg gctcaccgcg
acctccgcct cccgggttca agcgactctc 21720ctgcctcagc ctcccgagta gctgggatga
cagtcacctg caccaccgcg cccggctcat 21780ttttgtattt ttagtagcaa tggggtttca
ccgtgttggt caggctggtc tcgaactcct 21840gacctcgtga tccacccgcg tctgggcccg
gccggtgatg tgtgtgcttt taacttttat 21900tttgttccag ttttcgacag tggcacggat
tttccagcac ggtcttgcaa ggatgattga 21960gtcatttttg agacaaaaaa tataataata
ataaatggaa aaagaaatcg acttttaaaa 22020atgacaaatt tttttttttt ttttttgcat
agatttttct ctctttatgt aaaggaaagt 22080tcatgattgg atttggccgg cctgactgct
tcccggctgt gataaaaaac acatgtgagc 22140tgggagggaa gtgggggagg gacacagctg
cccacacagg gttcccaccg cggttacagg 22200gtgggcagtg ctgggggagc tttctctgtg
gggggctcag agcctgagga caggtgagcc 22260tctccgacac ctccccagtt gcctggagtc
taaaccgtcc gttgtctgta ccgtccgttc 22320ttcctgctga ctcctggtag ttcctgaaag
cttctcttgg ccagagaagg ggtttcagag 22380gccgtgtgtc caggccattc tgcaaagtgc
aacttgaccg ttcctttcct tttctggcct 22440gcgtggtctg aagctcagag ccctctcttc
acccagcctg tgtgtgtctt gccggacaga 22500agaaaaatgg tgctttttgc gtgttagcag
aggtgctttt catggctgac ctcaacgcgt 22560ccatctccag ccttgaccaa gctgtttttt
aggggcaaac gcaggcaagt tctgaatgca 22620cacagttatt tcatggttaa actattcagc
tttggccggg cgcagtgtgg ctctcacgcc 22680tgtcatccca gcactttggg aggccgaggc
gggtggatca cctgaggtca ggagttcgag 22740accagcctgg ccaacacggt gaaactctat
ctctactaaa aatacaaaaa ttagccgggc 22800gtggtggtgt gtatctgtaa tcccagctac
tcaggaagct gaggcaggag aatcgcttgg 22860acccaggagg cggaggttgc actgagccga
gatcgcgcca ttgcactcca gcctgggcga 22920cagagccaga cgctgtctca aaaaaatgaa
taataaaata aaataacagg aactaaataa 22980aataaaacgt tcagctttgt tctgcaaatc
cactcctatt gttttacgtg gtttgagaga 23040ctctgtccct tagaaataga tgtttgttgc
caattgtaat gaatctgttt caaaaatgaa 23100cagaatattc aaatggtttg agagatcttt
tcccttagaa atagcttgtt gccaatcaca 23160aagaatgttt ttcaaaaatg aatggaatct
tcctggatat cgcttccaga tcttcatttt 23220ttttgcatag ttcaacctga aaagtaagtg
tctcagccct gaatttcttt ctgatttttc 23280catgggttgt cttgcagact tctctggact
tgaccacatt taaaaaaaaa aaaattaact 23340ttttcacacg gacacggttt caataggaat
gagatctttg agtttttatg taacagattc 23400ttaccatcag ttctcagatt cccaaattac
acacaaaaag ccacggactt cgcctcctgc 23460taacatgtcc ttctgtttct gaggcttctg
ttggtgttag actttcatgt ttaatagcag 23520acaatgtagg gatttaaaga aaaatgcaga
gaaagcaaaa acactgacca aacacacgga 23580gataagcttt ctaaagcctt tgttcttgga
gttgtcgtta aaaaaaaaaa gttgttttaa 23640actttgcaag catgcctata ttgaactcat
aagcaagaga gccaagaaaa atagtgtcgg 23700tcgtctactc tacacgtttt cccaaaacag
acgtatttta atttcttttg tttgaactca 23760cagatgctga gagttaaaag ttaaattttt
gtcatgaaca atagtggcca aaaccacagt 23820tacttttgca ctatagcata ataagaaaaa
tacaggctgg gctcggtggc tcacacctgt 23880aatcaaagca cttttggagg cgaaacagcc
agatcccttg agcccaggag attgagacca 23940gcctgggcaa catagcgaga ccctcatctc
tacaaaaaag gtttgttaca tatgtaacaa 24000acctgcacat tgtgcacatg taccctaaaa
cttaaagtat aataataaaa aaattaaaaa 24060aaaattcacc aatcaactgc ctgctggtgc
cttcaagaga ctcacctaac acataaggac 24120ttgcataaac ttataaaaca attcaatgga
agaatccttg aaagtattct gagaagacag 24180tataataaac tgatttctaa aaaggctata
aaaaattgaa taaatcattg ttgggcatcc 24240tgtgctgaaa tataatgcag ccaataaaaa
ttacaaaatg aataaacatt ttataacaat 24300aaaaaaaagt caaataatta ggcaggcatg
gtggtgctct cctacggttg aagctattca 24360gcaggcaaga ggatactttg tttttgtttt
ttaatttttt ttgagacaga gtctcgctct 24420gttgccaggc tggagtgcag tggcgtgatc
tcagctcact gtaatttctg cctcccgggt 24480tcaagcgatt ttcctgcccc agcctcccga
gtagctggga ttacaggtgc ccgccaccac 24540acctggctaa tttcttttgt atttttagta
gagacgaggt ttccccatgt tggccaggct 24600ggttttgagc tcccgacctc gggtgatcca
cccgcctcag cctcccaaag tgctgggatg 24660acaggcgtga gccaccgcgc ctggcccagg
aggattattt gatcccagga ggtggaggct 24720gcaggaagcc atgattgcac cactgcactc
cagcctggct gacagagtga gaccacatct 24780ctaaataaat gaataaatac aggcagaaac
tttttttgtt ttgttttgat ggagtcttgc 24840tctgtcacca ggcaggagtg cagtggtgcc
atctcagctc actgcaacct ccacctcctg 24900ggttcaagca atcctcctgc ctcagcctcc
cgagtagctg ggattacagg tgcccgccac 24960cacgcccggc taattttttg tatgtttagt
agagacggga tttcaccgtg ttagccagga 25020tggtcttgat ctcttgactt tgtgatctgc
ctgcctcagc ctcccaaagt gctgggatta 25080caggcatgag cccaggagtt caagaccagc
ctcagcaaca aagtgagacc ttttctctcc 25140aaaaaatcaa aaatttagcc agctgtggtg
gctcctgccc gtgatcccag tactgtggga 25200ggctgaggca gaattgcttg agcccaggag
ttcgagacca acctcagcaa aaaggactct 25260ctctctctct ctctctctct ctctctctct
ctctctatat atatatatat atatatatat 25320gagtttcaaa aattgctggg tgaccagctc
atctactggt tttccccttg ggaaagtgaa 25380attgtcatgt attgaagatt tccaaggaag
ttgtattgaa tgagaaacaa actcaatctg 25440ttcgtgttta aagagctgca gtgcgtttgc
tgtgtttccc ataaaactgc acttccaaaa 25500gacacgctga gaaaggagac caggatttgt
aattcagaaa ttggaaagca agttaggctg 25560gacgtggtag ctcatgcttg ttgtaatctc
agcactctgg gaggctgagg caggaggatc 25620acttgagccc aggagttcaa gaccagcccg
tgccacatgg tgaaaccctg tctctccaaa 25680aaataaaaca tttagccaga tgtggtgact
catgcctgta atcccggtat tctgggaggc 25740tgaggcagag ttgcttgagc ccaggagttc
aagaccagcc tcggcaacaa agtgagaccc 25800tgtctctcca aaaaataaaa catttagcca
gctgtggtga ctcatgcctg taatctcagt 25860actctgggag gctggggcag aatggcttga
gcccaggagt tcgagaccaa cctcagcaac 25920aaagtgagat cttgtttctc caaaaaatca
aaaatttagc cagctgtgct ggctcatgcc 25980tgtaatcccg gtactctggg aggctgaggc
agaatcgttt gagcccagga gttcgagacc 26040aacctcagca acaaagtgag atcttgtttc
tccaaaaaaa tcaaaaattt agccagctgt 26100gctggctggt gcctgtaatc ccggtactct
gggaggctga ggcggaattg cttgagccca 26160ggagttcaag accagcctca gcaacaaagt
gagatcttgt ttctccaaaa aataaaacat 26220ttagtcagct gtggtggctc aagcctgtga
tcccagcatt ttgggaggcc gaggcgggcg 26280gatcacgagg tcatgagatc gagaccatcc
tggctaacac ggtgaaaccc cgtctctact 26340aaaaatacaa agaaaattag ccgggcgtgg
tggcgggcgc ctgtagtccc agctactcag 26400gaggctgagg caggagaatg ccgtgagcct
gggaggcgga ccatgcagtg agtcaagatc 26460gcgccactgc cctccagcct gggccacaga
gcaagactcc gtctcaaaaa aaaaaaaaaa 26520aaaactgctg cccaacctgt gtttgcacca
ctgccctcca gcctgggcaa cagagcaaga 26580ctccgtctca aaaaaaaaaa aatgctgccc
aagctgtgtt tgcaccactg ccctccagcc 26640tgggcaacag agcaagactc cgtctcaaaa
aaaaaaaaaa aaaatgctgc ccaagctgtg 26700tttgcaccac tgccctccag cctgggcaac
agagcaagac tctgtctcaa aaaaaaaaaa 26760aatgctgccc aagctgtgtt tgcaccactg
ccctccggcc tgggcaacag agcaagactc 26820cgtctcaaaa aaaaaaaaaa aatgctgccc
aagctgtgtt tgcaccactg ccctccagcc 26880tgggcaacaa agcaagcctc agctttctgc
catctccaca accaagaaag caattcacac 26940agaaatcagt gcatcgtgca gtgacctctt
cagaaaacca atgagttttc cacctgagga 27000actgtttctg agccccattc agaaaaacac
atccctgtaa ctgcagggca gatttactca 27060ctgtatgcct gtttaaataa agcttccagc
ctctgcatgg ggtctgtctg gaagctcctg 27120tatctgtccc acattcttgg aatcacaatg
cacccttggg aggaagatat gtatttaaag 27180ggagtggatg ttatggtgag aaaatgctgc
ccatccttct agaagacaaa agccacacaa 27240aatacatcac aagaaccagt ttttttcaga
gaagaacctg cacaaagaac ctgctccccc 27300cacaccccca cacacaggtg aattaacagg
atgtatgttt tatcataaaa gcacaggttt 27360gtttcctatg cactctctga ggatttggcc
atatgcaaag atgtacaaaa accttctctt 27420tccccaggga accgtaaccc gtctgaaaag
atgcccttct cagaagcgag ttgaacgatt 27480gttggaaaag ataaaatacg acgtgcacac
acacagtaga gaaatgtcac ccatgcaaat 27540tatgtgtttg aatggaacac attcaggaag
ctaaatgggg tatgaccaca catttgggtt 27600gatttatttg acgagtggaa ggggcagatg
gaaatgaata ctgctgtttt cctttggaag 27660gccatatatg ggaataccaa gaggattact
ttggaagttt agcttctcca ggtggtctct 27720ctctctctct ctttttttga gacagagtct
cactctgtca cccaggctgc agtgcaatgg 27780cgtgctctcg gctcactgca acctcagcct
cccaggtaca agcgattctc ctgcctcagc 27840ctcccgagta gctgggatca caggtgtgca
ccaccacgcc tggctaatgt ttgtattttc 27900agtagagatg aggttttacc atgttggcca
ggctggtctt gaactcctga cctcaggtga 27960tccgcctgcc tcggcctccc aaagtgctgg
gatgacagac atgagctagc acgcccggcc 28020ccaggtggtc tttttagcgg gtattaaagc
agctttctct ctgagcctta aaccatgaag 28080atagacagac tcagtgtatg ggttttagag
ttgtaatttt ataaaaataa gaaaaagtcg 28140acctatcatt gatggttagt attttttgta
gcagttgcat gcaatattag gataaggcat 28200gttctcaaaa agaactcttt tttttttttt
tttgagacgg agtctcgctc tgtcacccag 28260gctggagtgc agtggcacga tctccgctca
ctgcaagctc ctcttcccgg gttcacgcca 28320ttctcctgcc tcagcctccc cagtagctgg
gactacaggc gcccgccacc acgcccggct 28380aattttttgt atttttagta gagacggggt
ttcaccatgt tagccaggaa ggtctcgatc 28440tcctgacctc atgatccgtc cgcctcagcc
tcccaaagtg ctgggactac aggcgtgagc 28500cactgcactt ggcctttttt tttttttaga
tggagttttg ctcttgtcgc ccaggctgga 28560gtataatggc atgatctcga ctcactgcaa
cctccgcctc ccgagttcaa gcgattctcc 28620tgcctcagcc tcccgagtag ctgggattac
aggtgcccac caccatgtca agataatgtt 28680tgtattttca gtagagatgg ggtttgacca
tgttggccag gctggtctcg aactcctgac 28740ctcaggtgat ccacccgcct tagcctccca
aagtgctggg atgacaggcg tgagcccctg 28800cgcccggcct ttgtaacttt atttttaatt
tttttttttt tttaagaaag acagagtctt 28860gctctgtcac ccaggctgga gcacactggt
gcgatcatag ctcactgcag cctcaaactc 28920ctgggctcaa gcaatcctcc cacctcagcc
tcctgagtag ctgggactac aggcacccac 28980caccacaccc agctaatttt tttgattttt
actagagacg ggatcttgct ttgctgctga 29040ggctggtctt gagctcctga gctccaaaga
tcctctcacc tccacctccc aaagtgttag 29100aattacaagc atgaaccact gcccgtggtc
tccaaaaaaa ggactgttac gtggatgttc 29160tagcttcctg ttctcgtctt ttctttgtta
attgtacagt ttgagggtgt gtgtgcgtgt 29220gcgcacgtgt gtgtgtgcag tctcctgatt
tcatgtattt aattgttatt accaccacct 29280ccatctctca ttccttctta ccctcactgt
gtaaagatac atgttgtttt taaattttat 29340gtatttatat ttatttattt gtatttctga
gacagagtct cactctgttg cccaggctag 29400tggcatgatc tcagctcaca gcaacctttg
cctcctgggt tcaagcgatt ctcctgcctc 29460agcctcccga gtagctgaga ttacaggcac
acaccaccac acccggctag ttttgttttg 29520agacggagtc tcgctctgtt gcaggctgca
gtgcagtggc gtgatcctgg ctcactgcaa 29580cctctgcctc ctggattcaa gcgattctcc
tgcctcagcc tcccaagtag ctgggattac 29640aggcgcccac cgccacacct ggctaatttt
ttattggtag tagagacggg gtttctccat 29700gttgaccaga ctggtcttga actcccaacc
tcgggtgatc cacccacctg ggcctcccaa 29760agtgctggga tgacaggcga gggccaccgc
gtccagcctt cttcttcttc ttcttttttt 29820tttttttaag atggagtttc actctgttgc
ccaggctgga gtgcagtggt gcaatctcgg 29880ctccctgcaa cctccacctc ccaggttcaa
gaaattcttt tgcctcagcc tcccgagtag 29940ctgggactac aggtgcccgc caccacaccc
acctaatgtt tgtatttttt tggtagagac 30000ggggcttcac cacattggcc aggctggtct
tgaactcctg acttcagatg atcctcctgc 30060ctcagcctcc cagagtgttg ggattacagg
cgtgagccac ggtgcccggc cagacgtcat 30120gtcttaggaa atcagaaagt gggtagtttc
cgcactctga ggagaaaaag agacgtccgg 30180cgaagagaaa ggagagtgaa aggatgtctc
ctcttgtctg tagcctgttc tcaatcgtga 30240gtgagccaat tgccagaaac tgagggtgct
tcatttggcc aggcaagctt ctcaacagaa 30300tgtctaagta cttgttaatg ctgagaagct
ctccaagcta ctgcactcca gcctgggtga 30360cagagcacga ccttgtctga aaacaattaa
ttaatcaatt aattaatata atgaaatcat 30420actgaactca ggagaccatt ggggtgggca
gggctggggt tggaaaggaa cataaaatat 30480ggtgcaatgg actttgctcc agtctccctc
cccatctctt ctcgccaaga gtctctggag 30540ggagcatggg gaagatgctt tgggaatctg
taacttcttg tcttgtaaac agaatatcta 30600agtaattgtt aatgctgaga agttatagat
ttccaaagcc tttctccagg ctacggacaa 30660gggtcatggg ttactcagtg ttacagaaag
aatgacatgg agatgtttgt tacatcttaa 30720ggaaccatga ggggccagag tattttactc
taagtgtaga tggtacattg gccacgcctg 30780tcccaacacc accaatggtg gcacctaact
tttgtgtttg tgccccacat ttcttcttct 30840tttctgacgt aaatgcaagt gatattcctt
ggaaaccatg ctgcagcaag aggccatctg 30900actactagtg ataccctgta gctcacctac
agcagctcac ttgaagcagc tcacccatag 30960ctcaggtata gctcacctgc agcggctcac
ctgtagctca cgtgtagctc acttgtagca 31020gctcactggt agctcacctg cagcagctca
cctgtacctc acctgtacct cacctgcagc 31080agctcacctg tagctcacct gtacgtgagc
caccgtaccc ggccagcaag accccatttc 31140taaaataaat acacaaaaat tagccggacg
cggtggcgcg tgtctgtagt tgtagctact 31200caggaggctg aggtgggagg attgctggag
gctgggaggt agaggctgca gtgaaccgtg 31260atccagccac tgtactctag cctggatgac
atagcaaaac cttgtctcaa aaaacaaaaa 31320caaaaaacaa aacaaagaaa caaacaaaaa
acccacacac accggaaaac aaaacaaaaa 31380gcaaaaagga aagaaaagag agccaggtcc
caaatatata tttccttgga gaaccatttg 31440caaagagcac acttaaggcc gggcgcggtg
gctcacgcct gtcatcccgg cactttggga 31500ggccgaggtg ggtggatcac gaggttggga
gatcgagacc atcctggcca acatggcgaa 31560accccatctc tactaaaaat acaaaaaatc
agccaggtgc tgaggcaggt gcctgtagtc 31620ccagccactc aggaggctga ggcaggagaa
tggcatgaac ctgggaggtg gaggttgcag 31680tgagccgaga tcgcgcccct gcactccagc
ctgggcgaca gagcgagact ccttctcaaa 31740taaataaata aataaataac aaagagcaaa
cttaaaattg tctcagaaat cccacgggat 31800attggatctc cctcatgcct atctgatgac
actttgagtg tctggggccc cgtgcctatt 31860ttctggggtt cccagaagct gccgttctga
aagtgtggct ctcggggacg tggcacaggt 31920gtggatgtct gttttaaatg tcaggcgttt
ggacgttgag gaacgtgagg ctgaaggtcg 31980ccttcgccga ccccctgagt ttagggtcct
gccttttaaa atcttcccag cactctgttg 32040ttcacgcaag cgtcccatct gtttgggtgg
ccgtgccgtc tgcatctgtc tcgaaccttc 32100acagctttgc agaatatcct gtttctcaat
acggatggag aaacacgaga cgcgttttct 32160gggttatttt agccgtcacg gagaacccca
gactcatgtg tgctaatgac ctcattaatg 32220atactctgag gcagacagcc ctgcctgatc
ttaacaacat tttttaaatt tctttttttg 32280ttgttgttgt tacagcatca ttcatataac
gtaggaaacc gtgatcagta gcttttagga 32340tatttgcaac agggtgtaac adaaabd
32367
SEQ ID NO: 15 (806 nt, DNA, Homo sapiens; CDS (43)..(612); misc_feature (667): a, c, t, g, other or unknown):
gtgtccccgg agctgaaaga tcgcaaagag gatgcgaaag gg atg gag gac gaa
54 Met Glu Asp Glu
1ggc cag acc aaa atc aag cag agg
cga agt cgg acc aat ttc acc ctg 102Gly Gln Thr Lys Ile Lys Gln Arg
Arg Ser Arg Thr Asn Phe Thr Leu 5 10
15 20gaa caa ctc aat gag ctg gag agg ctt ttt gac gag acc
cac tat ccc 150Glu Gln Leu Asn Glu Leu Glu Arg Leu Phe Asp Glu Thr
His Tyr Pro 25 30 35gac
gcc ttc atg cga gag gaa ctg agc cag cga ctg ggc ctg tcg gag 198Asp
Ala Phe Met Arg Glu Glu Leu Ser Gln Arg Leu Gly Leu Ser Glu
40 45 50gcc cga gtg cag gtt tgg ttt caa
aat cga aga gct aaa tgt aga aaa 246Ala Arg Val Gln Val Trp Phe Gln
Asn Arg Arg Ala Lys Cys Arg Lys 55 60
65caa gaa aat caa ctc cat aaa ggt gtt ctc ata ggg gcc gcc agc cag
294Gln Glu Asn Gln Leu His Lys Gly Val Leu Ile Gly Ala Ala Ser Gln
70 75 80ttt gaa gct tgt aga gtc gca cct
tat gtc aac gta ggt gct tta agg 342Phe Glu Ala Cys Arg Val Ala Pro
Tyr Val Asn Val Gly Ala Leu Arg 85 90
95 100atg cca ttt cag cag gtt cag gcg cag ctg cag ctg gac
agc gct gtg 390Met Pro Phe Gln Gln Val Gln Ala Gln Leu Gln Leu Asp
Ser Ala Val 105 110 115gcg
cac gcg cac cac cac ctg cat ccg cac ctg gcc gcg cac gcg ccc 438Ala
His Ala His His His Leu His Pro His Leu Ala Ala His Ala Pro
120 125 130tac atg atg ttc cca gca ccg
ccc ttc gga ctg ccg ctc gcc acg ctg 486Tyr Met Met Phe Pro Ala Pro
Pro Phe Gly Leu Pro Leu Ala Thr Leu 135 140
145gcc gcg gat tcg gct tcc gcc gcc tcg gta gtg gcg gcc gca gca
gcc 534Ala Ala Asp Ser Ala Ser Ala Ala Ser Val Val Ala Ala Ala Ala
Ala 150 155 160gcc aag acc acc agc aag
gac tcc agc atc gcc gat ctc aga ctg aaa 582Ala Lys Thr Thr Ser Lys
Asp Ser Ser Ile Ala Asp Leu Arg Leu Lys165 170
175 180gcc aaa aag cac gcc gca gcc ctg ggt ctg
tgacvccaac gccagcacca 632Ala Lys Lys His Ala Ala Ala Leu Gly Leu
185 190atgtcgcgcc tgtcccgcgg cactcagcct
gcasnccctn ddkanmcgtt rctyhtcmat 692tacactttgg gaccycgggd bagvcctttt
nnagacttyv atkggscwcs ctggbccctb 752rkgavvactt gsghycgrga accgakhtgc
ccabaygagg accrgtttgg akdg 806
SEQ ID NO: 16 (190 aa, PRT, Homo sapiens):
Met Glu Asp Glu Gly Gln Thr Lys Ile Lys Gln Arg Arg Ser Arg Thr 16
Asn Phe Thr Leu Glu Gln Leu Asn Glu Leu Glu Arg Leu Phe Asp Glu 32
Thr His Tyr Pro Asp Ala Phe Met Arg Glu Glu Leu Ser Gln Arg Leu 48
Gly Leu Ser Glu Ala Arg Val Gln Val Trp Phe Gln Asn Arg Arg Ala 64
Lys Cys Arg Lys Gln Glu Asn Gln Leu His Lys Gly Val Leu Ile Gly 80
Ala Ala Ser Gln Phe Glu Ala Cys Arg Val Ala Pro Tyr Val Asn Val 96
Gly Ala Leu Arg Met Pro Phe Gln Gln Val Gln Ala Gln Leu Gln Leu 112
Asp Ser Ala Val Ala His Ala His His His Leu His Pro His Leu Ala 128
Ala His Ala Pro Tyr Met Met Phe Pro Ala Pro Pro Phe Gly Leu Pro 144
Leu Ala Thr Leu Ala Ala Asp Ser Ala Ser Ala Ala Ser Val Val Ala 160
Ala Ala Ala Ala Ala Lys Thr Thr Ser Lys Asp Ser Ser Ile Ala Asp 176
Leu Arg Leu Lys Ala Lys Lys His Ala Ala Ala Leu Gly Leu 190
SEQ ID NO: 17 (18 nt, DNA, artificial sequence; synthetic primer): gcacagccaa ccacctag
SEQ ID NO: 18 (21 nt, DNA, artificial sequence; synthetic primer): tggaaaggca tcatccgtaa g
SEQ ID NO: 19 (26 nt, DNA, artificial sequence; synthetic primer): atttccaatg gaaaggcgta aataac
SEQ ID NO: 20 (25 nt, DNA, artificial sequence; synthetic primer): acggcttttg tatccaagtc ttttg
SEQ ID NO: 21 (20 nt, DNA, artificial sequence; synthetic primer): gccctgtgcc ctccgctccc
SEQ ID NO: 22 (25 nt, DNA, artificial sequence; synthetic primer): ggctcttcac atctctctct gcttc
SEQ ID NO: 23 (24 nt, DNA, artificial sequence; synthetic primer): ccacactgac acctgctccc tttg
SEQ ID NO: 24 (22 nt, DNA, artificial sequence; synthetic primer): cccgcaggtc caggctcagc tg
SEQ ID NO: 25 (23 nt, DNA, artificial sequence; synthetic primer): cgcctccgcc gttaccgtcc ttg
SEQ ID NO: 26 (21 nt, DNA, artificial sequence; synthetic primer): ccctggagcc ggcgcgcaaa g
SEQ ID NO: 27 (18 nt, DNA, artificial sequence; synthetic primer): ccccgccccc gcccccgg
SEQ ID NO: 28 (21 nt, DNA, artificial sequence; synthetic primer): cttcaggtcc ccccagtccc g
SEQ ID NO: 29 (26 nt, DNA, artificial sequence; synthetic primer): ctagggatct tcagaggaag aaaaag
SEQ ID NO: 30 (25 nt, DNA, artificial sequence; synthetic primer): gctgcgcggc gggtcagagc cccag
SEQ ID NO: 31 (41 nt, DNA, artificial sequence; synthetic primer): ggccacgcgt cgactagtac tttttttttt tttttttttt n
SEQ ID NO: 32 (22 nt, DNA, artificial sequence; synthetic primer): gaaaggcatc cgtaaggctc cc
SEQ ID NO: 33 (24 nt, DNA, artificial sequence; synthetic primer): gacgccttta tgcatctgat tctc
SEQ ID NO: 34 (22 nt, DNA, artificial sequence; synthetic primer): gaatcagatg cataaaggcg tc
SEQ ID NO: 35 (22 nt, DNA, artificial sequence; synthetic primer): gggagcctta cggatgcctt tc
SEQ ID NO: 36 (20 nt, DNA, artificial sequence; synthetic primer): agccccggct gctcgccagc
SEQ ID NO: 37 (24 nt, DNA, artificial sequence; synthetic primer): ctgcgcggcg ggtcagagcc ccag
SEQ ID NO: 38 (20 nt, DNA, artificial sequence; synthetic primer): agccccggct gctcgccagc
SEQ ID NO: 39 (24 nt, DNA, artificial sequence; synthetic primer): gcctcagcag caaagcaaga tccc
SEQ ID NO: 40 (25 nt, DNA, artificial sequence; synthetic primer): gctgagcctg gacctgttgg aaagg
SEQ ID NO: 41 (25 nt, DNA, artificial sequence; synthetic primer): gctgagcctg gacctgttgg aaagg
SEQ ID NO: 42 (15 aa, PRT, artificial sequence; synthetic peptide): Cys Ser Lys Ser Phe Asp Gln Lys Ser Lys Asp Gly Asn Gly Gly
SEQ ID NO: 43 (1541 nt, DNA, Homo sapiens; CDS (43)..(612)):
gtgtccccgg agctgaaaga tcgcaaagag
gatgcgaaag gg atg gag gac gaa 54
Met Glu Asp Glu
1ggc cag acc aaa atc aag cag agg cga agt cgg acc aat ttc acc ctg
102Gly Gln Thr Lys Ile Lys Gln Arg Arg Ser Arg Thr Asn Phe Thr Leu 5
10 15 20gaa caa ctc aat
gag ctg gag agg ctt ttt gac gag acc cac tat ccc 150Glu Gln Leu Asn
Glu Leu Glu Arg Leu Phe Asp Glu Thr His Tyr Pro 25
30 35gac gcc ttc atg cga gag gaa ctg agc cag
cga ctg ggc ctg tcg gag 198Asp Ala Phe Met Arg Glu Glu Leu Ser Gln
Arg Leu Gly Leu Ser Glu 40 45
50gcc cga gtg cag gtt tgg ttt caa aat cga aga gct aaa tgt aga aaa
246Ala Arg Val Gln Val Trp Phe Gln Asn Arg Arg Ala Lys Cys Arg Lys
55 60 65caa gaa aat caa ctc cat aaa
ggt gtt ctc ata ggg gcc gcc agc cag 294Gln Glu Asn Gln Leu His Lys
Gly Val Leu Ile Gly Ala Ala Ser Gln 70 75
80ttt gaa gct tgt aga gtc gca cct tat gtc aac gta ggt gct tta agg
342Phe Glu Ala Cys Arg Val Ala Pro Tyr Val Asn Val Gly Ala Leu Arg 85
90 95 100atg cca ttt cag
cag gtt cag gcg cag ctg cag ctg gac agc gct gtg 390Met Pro Phe Gln
Gln Val Gln Ala Gln Leu Gln Leu Asp Ser Ala Val 105
110 115gcg cac gcg cac cac cac ctg cat ccg cac
ctg gcc gcg cac gcg ccc 438Ala His Ala His His His Leu His Pro His
Leu Ala Ala His Ala Pro 120 125
130tac atg atg ttc cca gca ccg ccc ttc gga ctg ccg ctc gcc acg ctg
486Tyr Met Met Phe Pro Ala Pro Pro Phe Gly Leu Pro Leu Ala Thr Leu
135 140 145gcc gcg gat tcg gct tcc gcc
gcc tcg gta gtg gcg gcc gca gca gcc 534Ala Ala Asp Ser Ala Ser Ala
Ala Ser Val Val Ala Ala Ala Ala Ala 150 155
160gcc aag acc acc agc aag gac tcc agc atc gcc gat ctc aga ctg aaa
582Ala Lys Thr Thr Ser Lys Asp Ser Ser Ile Ala Asp Leu Arg Leu Lys165
170 175 180gcc aaa aag cac
gcc gca gcc ctg ggt ctg tgacgccaac gccagcacca 632Ala Lys Lys His
Ala Ala Ala Leu Gly Leu 185 190atgtcgcgcc
tgtcccgcgg cactcagcct gcacgccctc cgcgccccgc tgcttctccg 692ttaccccttt
gagacctcgg gagccggccc tcttcccgcc tcactgacca tccctcgtcc 752cctatcgcat
cttggactcg gaaagccaga ctccacgcag gaccagggat ctcacgaggc 812acgcaggctc
cgtggctcct gcccgttttc ctactcgagg gcctagaatt gggttttgta 872ggagcgggtt
tgggggagtc tggagagaga ctggacaggg tagtgctgga accgcggagt 932ttggctcacc
gcaaagctac aacgatggac tcttgcatag aaaaaaaaaa tcttgttaac 992aatgaaaaaa
tgagcaaaca aaaaaatcga aagacaaacg ggagagaaaa agaggaaggc 1052aacttatttc
ttaactgcta tttggcagaa gctgaaattg gagaaccaag gagcaaaaac 1112aaattttaaa
attaaagtat tttatacatt taaaaatatg gaaaaacaac ccagacgatt 1172ctcgagagac
tggggggagt taccaactta aatgtgtgtt ttaaaaaatg cgctaagaag 1232gcaaagcaga
aagaagaggt atacttattt aaaaaactaa gatgaaaaaa gtgcgcaggt 1292gggaagttca
caggttttga aactgacctt tttctgcgaa gttcacgtta atacgagaaa 1352tttgatgaga
gaggcgggcc tccttttacg ttgaatcaga tgctttgagt ttaaacccac 1412catgtatgga
agagcaagaa aagagaaaat attaaaacga ggagagagaa aaataatggc 1472aaaactgtct
ggactgctga cagtaaattc cggtttgcat ggaaaaaaaa aaaaaaaaaa 1532aaaaaaaaa
1541
SEQ ID NO: 44 (10 nt, DNA, Homo sapiens): gtgatccacc
SEQ ID NO: 45 (20 nt, DNA, Homo sapiens): ccccacgcag ggatttatga
SEQ ID NO: 46 (20 nt, DNA, Homo sapiens): tctccccaag gtttggttcc
SEQ ID NO: 47 (20 nt, DNA, Homo sapiens): ttggacacag gcgtcatctt
SEQ ID NO: 48 (20 nt, DNA, Homo sapiens): gctcccgcag gtccaggctc
SEQ ID NO: 49 (20 nt, DNA, Homo sapiens): ttttttttag atggagtttt
SEQ ID NO: 50 (20 nt, DNA, Homo sapiens): gtggcagaag gtaagttcct
SEQ ID NO: 51 (20 nt, DNA, Homo sapiens): gcgcgtgcag gtaggaaccc
SEQ ID NO: 52 (20 nt, DNA, Homo sapiens): atgcataaag gtgggtgtcg
SEQ ID NO: 53 (20 nt, DNA, Homo sapiens): tttccaacag gtagctcact
SEQ ID NO: 54 (10 nt, DNA, Homo sapiens): aaaaaatagc
SEQ ID NO: 55 (10 nt, DNA, Homo sapiens): tgttacgtgg