Patent application title: GENES ASSOCIATED WITH MECHANICAL STRESS, EXPRESSION PRODUCTS THEREFROM, AND USES THEREOF
Inventors:
Paz Einat (Ness Ziona, IL)
Orit Segev (Rehovot, IL)
Rami Skaliter (Ness Ziona, IL)
Elena Feinstein (Rehovot, IL)
Alexander Faerman (Bnei Aiish, IL)
Aviva Samach (D.n. Emek Soreq, IL)
Assignees:
Quark Biotech, Inc.
IPC8 Class: AA61K39395FI
USPC Class:
4241391
Class name: Drug, bio-affecting and body treating compositions immunoglobulin, antiserum, antibody, or antibody fragment, except conjugate or complex of the same with nonimmunoglobulin material binds antigen or epitope whose amino acid sequence is disclosed in whole or in part (e.g., binds specifically-identified amino acid sequence, etc.)
Publication date: 2009-10-15
Patent application number: 20090258021
Agents:
BROWDY AND NEIMARK, P.L.L.C.;624 NINTH STREET, NW
Origin: WASHINGTON, DC US
Abstract:
The disclosure relates to human and mechanical stress induced genes, in
particular gene 608, and functional equivalents, probes therefor, tests
to identify such genes, polypeptide expression products of such genes,
antibodies to the polypeptides, uses for such genes, expression products
and antibodies, e.g., in diagnosis (for instance risk determination),
treatment, prevention, or control, of osteoporosis or fractures; and to
diagnostic, treatment, prevention, or control methods or processes, as
well as compositions therefor and methods or processes for making and
using such compositions, and receptors therefor and methods or processes
for obtaining and using such receptors.
Claims:
1. An isolated polypeptide encoded by a nucleic acid molecule comprising
consecutive nucleotides having a sequence set forth in SEQ ID NO:6, SEQ
ID NO: 20, SEQ ID NO: 23, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ
ID NO:29 or SEQ ID NO:31 or comprising nucleotides having a sequence
incorporated in plasmids deposited under ATCC Accession Nos. PTA-3638,
PTA-3876, PTA-3877, PTA-3878 complements thereof or a polynucleotide
having a sequence that differs due to the degeneracy of the genetic code
from SEQ ID NO:6, SEQ ID NO: 20, SEQ ID NO: 23, SEQ ID NO:26, SEQ ID
NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:31 or a sequence
incorporated in plasmids deposited under ATCC Accession Nos. PTA-3638,
PTA-3876, PTA-3877 or PTA-3878, or a sequence which hybridizes under
stringent conditions to SEQ ID NO:6, SEQ ID NO: 20, SEQ ID NO: 23, SEQ ID
NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29 or SEQ ID NO:31 or to a
sequence incorporated in plasmids deposited under ATCC Accession Nos.
PTA-3638, PTA-3876, PTA-3877 or PTA-3878 or a functional portion thereof
or a polynucleotide which is at least substantially homologous thereto.
2. The isolated polypeptide of claim 1, wherein the polypeptide is identified as human protein 608, or a functional portion of protein 608 or Adlican-2, or a polypeptide which is at least substantially homologous thereto.
3. The isolated polypeptide of claim 2, wherein the polypeptide comprises about 663 to about 1634 amino acids.
4. An antibody which specifically binds to a polypeptide of claim 2 or a functional portion thereof.
5. An isolated polypeptide wherein the functional portion comprises consecutive amino acids having a sequence set forth in SEQ ID NO:16, SEQ ID NO: 24, SEQ ID NO: 30 or SEQ ID NO: 32.
6. The isolated polypeptide of claim 5 wherein the sequence comprises about the first 663 amino acids of the sequence set forth in SEQ ID NO:16, SEQ ID NO: 24, SEQ ID NO: 30 or SEQ ID NO: 32.
7. The isolated polypeptide of claim 6 wherein the sequence comprises about the first 741 amino acids of the sequence set forth in SEQ ID NO:16, SEQ ID NO: 24, SEQ ID NO: 30 or SEQ ID NO: 32.
8. The isolated polypeptide of claim 5, wherein the polypeptide is identified as human 608 protein or human Adlican-2 protein or a functional portion thereof or a polypeptide which is at least substantially homologous thereto.
9. An isolated polypeptide of claim 5 comprising consecutive amino acids having a sequence set forth in SEQ ID NO:32, designated human 608 protein.
10. An isolated polypeptide of claim 5 comprising consecutive amino acids having a sequence set forth in SEQ ID NO: 30 designated Adlican-2 polypeptide.
11. An isolated polypeptide of claim 10 comprising consecutive amino acids having a sequence set forth in SEQ ID NO: 30 deleted of amino acids 6-215.
12. An antibody which binds specifically to a polypeptide of claim 9.
13. An antibody which binds specifically to a polypeptide of claim 10.
14. An antibody of claim 12, which does not bind specifically to a rat 608 polypeptide.
15. An antibody of claim 14, wherein the rat polypeptide has a sequence set forth in SEQ ID NO: 34.
16. A composition comprising the antibody of claim 14.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001]The present application is a divisional of U.S. patent application Ser. No. 10/454,351, filed Jun. 4, 2003, which, in turn, was a continuation-in-part of International Application No. PCT/US01/46400, filed Dec. 4, 2001, the entire contents of which are hereby incorporated by reference, and also a continuation-in-part of U.S. patent application Ser. No. 09/312,216, filed May 14, 1999, the entire contents of which are also hereby incorporated by reference. Each document or reference cited in those applications is hereby expressly incorporated herein by reference. Documents or references are also cited in the following text, and these documents or references ("herein-cited documents or references"), as well as each document or reference cited in each of the herein-cited documents or references, are hereby expressly incorporated herein by reference.
FIELD OF THE INVENTION
[0002]This invention relates to mechanical stress induced genes and their functional equivalents, probes therefor, tests to identify such genes, expression products of such genes, uses for such genes and expression products, e.g., in diagnosis (for instance risk determination), treatment, prevention, or control, of osteoporosis or factors or processes which lead to osteoporosis, osteopenia, osteopetrosis, osteosclerosis, osteoarthritis, periodontosis and bone fractures; and, to diagnosis, treatment, prevention, or control methods or processes, as well as compositions therefor and methods or processes for making and using such compositions, and receptors for such expression products and methods or processes for obtaining and using such receptors.
BACKGROUND OF THE INVENTION
[0003]Bone is composed of a collagen-rich organic matrix impregnated with mineral, largely calcium and phosphate. Two major forms of bone exist: compact cortical bone forms the external envelopes of the skeleton, and trabecular or medullary bone forms plates that traverse the internal cavities of the skeleton. The responses of these two forms to metabolic influences and their susceptibility to fracture differ.
[0004]Bone undergoes continuous remodeling (turnover, renewal) throughout life. Mechanical and electrical forces, hormones and local regulatory factors influence remodeling. Bone is renewed by two opposing activities that are coupled in time and space. Parfitt (1979) Calcif. Tis. Int. 28:1-5. These activities, resorption and formation, are contained within a temporary anatomic structure known as a bone-remodeling unit. Parfitt (1981) Res. Staff Physic. Dec.: 60-72. Within a given bone-remodeling unit, old bone is resorbed by osteoclasts. The resorbed cavity created by osteoclasts is subsequently filled with new bone by osteoblasts, synthesizing bone organic matrix.
[0005]Peak bone mass is mainly genetically determined, though dietary factors and physical activity can have positive effects. Peak bone mass is attained at the point when skeletal growth ceases, after which time bone loss starts.
[0006]In contrast to the positive balance that occurs during growth, in osteoporosis, the resorbed cavity is not completely refilled by bone. Parfitt (1988), Osteoporosis: Etiology, Diagnosis, and Management (Riggs and Melton, eds.) Raven Press, New York, pp. 74-93. Osteoporosis, or porous bone, is a progressive and chronic disease characterized by low bone mass and structural deterioration of bone tissue, leading to bone fragility and an increased susceptibility to fractures of the hip, spine, and wrist (diminishing bone strength).
[0007]Bone loss occurs without symptoms. The Consensus Development Conference ((1993) Am. J. Med. 94:646-650) defined osteoporosis as "a systemic skeletal disease characterized by low bone mass and microarchitectural deterioration of bone tissue, with a consequent increase in bone fragility and susceptibility to fracture."
[0008]Common types of osteoporosis include postmenopausal osteoporosis; and senile osteoporosis, which generally occurs in later life, e.g., 70+ years. See, e.g., U.S. Pat. No. 5,691,153. Osteoporosis is estimated to affect more than 25 million people in the United States (Rosen (1997) Calcif. Tis. Int. 60:225-228); and, at least one estimate asserts that osteoporosis affects 1 in 3 women. Keen et al. (1997) Drugs Aging 11:333-337. Moreover, life expectancy has increased, and in the western world, 17% of women are now over 50 years of age: a woman can expect to live one third of her life after menopause. Thus, some estimate that 1 out of every 2 women and 1 out of 5 men will eventually develop osteoporosis; and, that 75 million people in the U.S., Japan and Europe have osteoporosis. The World Summit of Osteoporosis Societies estimates that more than 200 million people worldwide are afflicted with the disease. The actual incidence of the disease is difficult to estimate since the condition is often asymptomatic until a bone fracture occurs. It is believed that there are over 1.5 million osteoporosis-associated bone fractures per year in the U.S. Of these, 300,000 are hip fractures that usually require hospitalization and surgery and may result in lengthy or permanent disability or even death. See a minireview by Spangler et al. "The Genetic Component of Osteoporosis" (1997) Cambridge Scientific Abstracts.
[0009]Osteoporosis is also a major health problem in virtually all societies. Eisman (1996); Wark (1996) Maturitas 23:193-207; and U.S. Pat. No. 5,834,200. There is a 20-30% mortality rate related to hip fractures in elderly women (U.S. Pat. No. 5,691,153); and, such a patient with a hip fracture has a 10-15% greater chance of dying than others of the same age. Further, although men suffer fewer hip injuries than women, men are 25% more likely than women to die within one year of the injury. See Spangler et al., supra. Also, about 20% of the patients who lived independently before a hip fracture remain confined in a long-term health care facility one year later. The treatment of osteoporosis and related fractures costs over $10 billion annually.
[0010]Osteoporosis treatment helps stop further bone loss and fractures. Common therapeutics include HRT (hormone replacement therapy), bisphosphonates, e.g., alendronate (Fosamax), estrogen and estrogen receptor modulators, progestin, calcitonin, and vitamin D. While there may be numerous factors that determine whether any particular person will develop osteoporosis, a step towards prevention, control or treatment of osteoporosis is determining whether one is at risk for osteoporosis. Genetic factors also play an important role in the pathogenesis of osteoporosis. Ralston (1997); see also Keen et al. (1997); Eisman (1996); Rosen (1997); Cole (1998); Johnston et al. (1995) Bone 17(2 Suppl)19S-22S; Gong et al. (1996) Am. J. Hum. Genet. 59:146-151; and Wasnich (1996) Bone 18(3 Suppl): 179S-183S. Some attribute 50-60% of total bone variation (bone mineral density: "BMD"), depending upon the bone area, to genetic effects. Livshits et al. (1996) Hum. Biol. 68:540-554. However, up to 85%-90% of the variance in bone mineral density may be genetically determined.
[0011]Studies have shown from family histories, twin studies, and racial factors, that there may be a predisposition for osteoporosis. Jouanny et al. (1995) Arthritis Rheum. 38:61-67; Garnero et al. (1996) J. Clin. Endocrinol. Metab. 81:140-146; Cummings (1996) Bone 18(3 Suppl): 165S-167S; and Lonzer et al. (1996) Clin. Pediatr. 35:185-189. Several candidate genes may be involved in this, most probably multigenic, process.
[0012]Cytokines are powerful regulators of bone resorption and formation under control of estrogen/testosterone, parathyroid hormone and 1,25(OH)2D3. Some cytokines primarily enhance osteoclastic bone resorption e.g. IL-1 (interleukin-1), TNF (tumor necrosis factor) and IL-6 (interleukin-6); while others primarily stimulate bone formation e.g. TGF-β (transforming growth factor-β), IGF (insulin-like growth factor) and PDGF (platelet derived growth factor).
[0013]There is a need for clinical and epidemiological research on the prevention and treatment of osteoporosis, in order to gain greater knowledge of the factors controlling bone cell activity and the regulation of bone mineral and matrix formation and remodeling.
[0014]Bone develops via a number of processes. Mesenchymal cells can differentiate directly into bone, as occurs in the flat bones of the craniofacial skeleton; this process is termed intramembranous ossification. Alternatively, cartilage provides a template for bone morphogenesis, as occurs in the majority of human bones. The cartilage template is replaced by bone in a process known as endochondral ossification. Reddi (1981) Collagen Rel. Res. 1:209-226. Bone is also continuously modeled during growth and development and remodeled throughout the life of the organism in response to physical and chemical signals. Development and maintenance of cartilage and bone tissue during embryogenesis and throughout the lifetime of vertebrates is very complex. It is widely accepted that a multitude of factors, from systemic hormones to local regulatory factors such as the members of the TGF-β superfamily, cytokines and prostaglandins, act in concert to regulate the continuous processes of bone formation and bone resorption. Disturbance of the balance between osteoblastic bone deposition and osteoclastic bone resorption is responsible for many skeletal diseases.
[0015]Diseases of bone loss are a major public health problem especially for women in all Western communities. The most common cause of osteopenia is osteoporosis; other causes include osteomalacia and bone disease related to hyperparathyroidism. Osteopenia has been defined as the appearance of decreased bone mineral content on radiography, but the term more appropriately refers to a phase in the continuum from decreased bone mass to fractures and infirmity.
[0016]It is estimated that 30 million Americans are at risk for osteoporosis, the most common among these diseases, and there are probably 100 million people similarly at risk worldwide. Melton (1995) Bone Min. Res. 10:175. These numbers are growing as the proportion of the elderly in the world population increases. Despite recent successes with drugs that inhibit bone resorption, there is a clear need for specific anabolic agents that will considerably increase bone formation in people who have already suffered substantial bone loss. There are no such drugs currently approved.
[0017]Mechanical stimulation induces new bone formation in vivo and increases osteoblastic differentiation and metabolic activity in culture. Mechanotransduction in bone tissue involves several steps: 1) mechanochemical transduction of the signal; 2) cell-to-cell signaling; and 3) increased number and activity of osteoblasts. Cell-to-cell signaling after mechanical stimulus involves prostaglandins, especially those produced by COX-2, and nitric oxide. Prostaglandins induce new bone formation by promoting both proliferation and differentiation of osteoprogenitor cells.
OBJECTS AND SUMMARY OF THE INVENTION
[0018]In a search for agents that enhance osteoblast proliferation/differentiation and bone formation, mechanical force was employed as an osteogenesis inducer and a proprietary gene discovery methodology was carried out to detect genes that are specifically expressed in very early osteo-, chondro-progenitor cells.
[0019]The present invention provides human mechanical stress induced genes and their functional equivalents, expression products of such genes, uses for such genes and expression products for treatment, prevention, control, of osteoporosis or factors or processes which are involved in bone diseases including, but not limited to, osteoporosis, osteopenia, osteopetrosis, osteosclerosis, osteoarthritis, periodontosis and bone fracture. The invention further provides diagnostic, treatment, prevention, control methods or processes as well as compositions.
[0020]The invention additionally provides an isolated nucleic acid molecule, and the complement thereof, encoding the protein 608 or a functional portion thereof or a polypeptide, which is at least substantially homologous thereto. The invention encompasses an isolated nucleic acid molecule encoding human protein 608 (or "OCP") or a functional portion thereof.
[0021]The invention further encompasses a method for preventing, treating or controlling osteoporosis or low bone density or other factors associated with, causing or contributing to bone diseases including, but not limited to, osteopenia, osteopetrosis, osteosclerosis, osteoarthritis, periodontosis or symptoms thereof, or other conditions involving mechanical stress or a lack thereof, by administering to a subject in need thereof, a polypeptide or portion thereof provided herein; and accordingly, the invention comprehends uses of polypeptides in preparing a medicament or therapy for such prevention, treatment or control.
[0022]The invention also comprehends a method for preventing, treating or controlling osteoporosis or low bone density or other factors causing or contributing to osteoporosis or symptoms thereof or other conditions involving mechanical stress or a lack thereof, by administering a composition comprising a gene or functional portion thereof, the expression product of that gene or a functional portion thereof, an antibody or portion thereof elicited by such an expression product or portion thereof, and, the invention thus further comprehends uses of such genes, expression products, antibodies, portions thereof, in the preparation of a medicament or therapy for such control, prevention or treatment.
[0023]Analogously with the OCP-related description above, the invention further encompasses methods of use of Adlican and a novel polypeptide Adlican-2 as described herein for any use of OCP. The Adlican gene, or Adlican-2 gene, or functional portions thereof, can likewise be used for any purpose described herein for an OCP gene. The invention further encompasses compositions comprising a physiologically acceptable excipient and at least one of Adlican, the Adlican gene and antibodies specific to Adlican, and at least one of Adlican-2, the Adlican-2 gene and antibodies specific to Adlican-2.
[0024]The invention additionally provides receptors for expression products of human mechanical stress induced genes and their functional equivalents, such as OCP and Adlican, and methods or processes for obtaining and using such receptors. The invention also provides methods of using such receptors in assays, for instance for identifying proteins or polypeptides that bind to, associate with or block the receptors, and for testing the effects of such polypeptides. These and other embodiments are disclosed or are obvious from and encompassed by, the Detailed Description which follows the Brief Description of the Figures below.
BRIEF DESCRIPTION OF THE FIGURES
[0025]The following Detailed Description, given by way of example, but not intended to limit the invention to specific embodiments described, may be understood in conjunction with the accompanying Figures, in which
[0026]FIG. 1 shows the rat 608 cDNA sequence (SEQ ID NO:1).
[0027]FIG. 2 shows the pcDNA3.1-608 construct.
[0028]FIG. 3 shows the OCP rat protein amino acid sequence (SEQ ID NO:2).
[0029]FIG. 4 shows the mouse OCP exon and intron map.
[0030]FIG. 5 shows the OCP map of exon-intron borders.
[0031]FIG. 6 shows the human OCP exon and intron list.
[0032]FIG. 7 shows the OCP human cDNA sequence (predicted coding region) (nucleotides 1-7796 of SEQ ID NO:6).
[0033]FIGS. 8A-8D show the percent identity between FIG. 8A. rat protein/human protein; FIG. 8B. rat protein/mouse protein; FIG. 8C. rat cDNA/human cDNA; and FIG. 8D. rat cDNA/mouse cDNA, based on the OCP human cDNA sequence of FIG. 7.
[0034]FIG. 9 shows the partial mouse OCP protein amino acid sequence (236 aa) (SEQ ID NO:15).
[0035]FIG. 10 shows the OCP human protein amino acid sequence (2587 aa) (SEQ ID NO:16), based on the OCP human cDNA sequence of FIG. 7.
[0036]FIGS. 11A-11B show a list of expression patterns of OCP in primary cells and various other cell lines. A. Northern blot of poly A+ RNA RT-PCR from rat primary calvaria cells and MC3T3 cells is shown. The main 8.9 kb transcript is present only in calvaria cells. RT-PCR assays with specific OCP primers were performed on total RNA from various lines as indicated on the right side of the figure. In all assays similar amounts of GapDH RT-PCR products were detected in all RNA samples. In addition, B. no GapDH products were detected in any RNA samples, when RT was omitted. (-) represents no expression of OCP, while (+) represents expression. When (-+) are indicated, the expression of OCP is induced only upon specific conditions.
[0037]FIG. 12 shows responsiveness of CMF608 expression to mechanical stimulation by Northern blot analysis using polyA RNA from primary rat calvaria cells before and after mechanical stress (m.s.)--see left of Figure. In these cells, CMF608 is transcribed as a single RNA species of approximately 9 Kb. On tissue blot, CMF608-specific 9 Kb mRNA transcript was hardly detectable in any other tissue type except for the bone (B)--see right of Figure.
[0038]FIG. 13 shows that OCP is an early marker of endochondral ossification in P7 rat femoral epiphysis.
[0039]FIG. 14 shows that OCP is induced during osteoblastic differentiation of bone marrow stroma cells and is a specific marker of early osteoblastic differentiation in bone marrow.
[0040]FIG. 15 shows in vivo regulation of OCP expression in bone marrow formation by various treatments. The results shown are representative of three experiments using total cellular RNA from treated two-month old mice. The different treatments are indicated. The RT-PCR products are marked. Control mice did not undergo any treatment. In each treatment group the left lane represents negative control without the addition of RT, the central lane represents the OCP RT-PCR product and the right lane represents the GapDH RT-PCR product. Bone formation is shown with blood loss and estrogen administration; bone loss is shown with sciatic neurotomy models.
[0041]FIG. 16 shows a low power photomicrograph of fractured bone one week after the operation. Note that well-developed woven bone and fibrocartilaginous callus formed at the fracture site. Bone marrow tissue was mainly destroyed by insertion of the wire used for the fracture immobilization. Marked areas are presented at higher magnification in the following figures.
[0042]FIGS. 17A-17B show photomicrographs of the central part of callus, FIG. 17A. brightfield and FIG. 17B. darkfield. Cells expressing the OCP gene can be seen in the fibrous part of the callus. There was no hybridization signal from chondrocytes.
[0043]FIGS. 18A-18B show photomicrographs of the callus area marked by 2 in FIG. 16, FIG. 18A. brightfield and FIG. 18B. darkfield. Cells expressing the OCP gene can be seen in a highly vascularized subperiosteal area bordering the cartilaginous part of the callus.
[0044]FIGS. 19A-B show photomicrographs of the highly vascularized endosteal tissue. This was developed in reaction to the wire insertion (area 3 on FIG. 16), FIG. 19A. brightfield and FIG. 19B. darkfield. This tissue contains many cells expressing the OCP gene.
[0045]FIG. 20 shows a high power photomicrograph of perivascular cells. The perivascular cells express the 608 gene within lacuna of woven bone (arrowheads).
[0046]FIG. 21 shows a high power photomicrograph of periosteum covering the woven bone. Multiple cells display expression of the 608 gene in periosteum. Arrowheads point to two 608 expressing cells within the woven bone.
[0047]FIGS. 22A-22B show FIG. 22A. brightfield and FIG. 22B. darkfield photomicrographs of a section of fractured bone healed for 4 weeks. Multiple cells in periosteal tissue area of active remodeling of the cancellous bone covering the callus show a hybridization signal.
[0048]FIG. 23 shows the boxed area of FIG. 22 presented at higher magnification. Several OCP-expressing cells are concentrated in vascular tissue that fills the cavities resulting from osteoclast activity (marked by asterisks).
[0049]FIG. 24 shows increased osteoblast differentiation in OCP-transfected ROS cells. RT-PCR assays were with OCP, Cbfa1, ALP, BSP and GapDH specific primers as indicated above. The results shown are representative of two experiments using total cellular RNA from: (1) the stable OCP-expressed ROS cell line; and (2) the control ROS cell line (stable transfection with pcDNA). The OCP RT-PCR product is 1020 bp, the Cbfa1 product is 289 bp, the ALP product is 226 bp, the BSP product is 1048 bp and the GapDH (control) product is 450 bp long. M represents protein markers.
[0050]FIG. 25 shows increased osteoblast proliferation in OCP-transfected ROS cells.
[0051]FIG. 26 shows the sequences of the primer (SEQ ID NO:19) and QB3 (CMF608) (SEQ ID NO:20).
[0052]FIG. 27 shows the Adlican amino acid sequence (SEQ ID NO: 21).
[0053]FIG. 28 shows the Adlican DNA sequence (SEQ ID NO: 22).
[0054]FIG. 29 shows the predicted DNA sequence of the coding region-ORF of human OCP (SEQ ID NO: 23).
[0055]FIG. 30 shows the predicted amino acid sequence corresponding to the predicted coding region-ORF of human OCP (SEQ ID NO: 24).
[0056]FIG. 31 shows the sequence of the N-terminal 663 amino acid fragment derived from the OCP rat protein (SEQ ID NO: 25).
[0057]FIG. 32 shows the pCM-H-608-663-N-term construct map.
[0058]FIG. 33 shows the structure of the pKS H608 5'-2.4 Kb bAc#1 construct (deposited on Nov. 21, 2001 under the terms of the Budapest Treaty with the American Type Culture Collection (ATCC), P.O. Box 1549, Manassas, Va. 20108, USA, under ATCC accession number PTA-3878).
[0059]FIG. 34 shows the physical sequence of the 5' fragment (A) cloned into pBluescript KS to NotI (5') and HindIII (3') sites. Fragment A is comprised of the 5' region (2440 bp) of the complete human OCP sequence and includes, in addition, at the 5' end, 21 nucleotides of the β-actin "Kozak" region (SEQ ID NO:26).
[0060]FIG. 35 shows the structure of the pKS H608 m.FRG.3.5 Kb#34 construct (deposited on Nov. 21, 2001 under the terms of the Budapest Treaty with the American Type Culture Collection (ATCC), P.O. Box 1549, Manassas, Va. 20108, USA, under ATCC accession number PTA-3876).
[0061]FIG. 36 shows the physical sequence of the middle fragment (B) cloned into pBluescript KS to HindIII (5') and SalI (3') sites. Fragment B is comprised of the central region (3518 bp) of the complete human OCP sequence (SEQ ID NO:27).
[0062]FIG. 37 shows the structure of the pM H608 3'-1.9 Kb HSTG#3.3 construct (deposited on Nov. 21, 2001 under the terms of the Budapest Treaty with the American Type Culture Collection (ATCC), P.O. Box 1549, Manassas, Va. 20108, USA, under ATCC accession number PTA-3877).
[0063]FIG. 38 shows the physical sequence of the 3' fragment (C) cloned into pMCS SV(A) to SalI (5') and SpeI (3') sites. Fragment C is comprised of the 3' region (1923 bp, not including the 3 bp stop codon) of the complete human OCP sequence and includes, at the 3' end, 18 nucleotides coding for 6 Histidine residues (SEQ ID NO:28). In addition, cloned fragment C contains a silent mutation (C>T transition) compared to the predicted sequence of the human OCP ORF. This transition does not change the identity of the encoded amino acid residue.
[0064]FIG. 39 shows the predicted DNA sequence of Adlican-2 (SEQ ID NO:29). Bases 1555 and 5638 are presented as "g" but could be any other base.
[0065]FIG. 40 shows the predicted amino acid sequence of human Adlican-2 (SEQ ID NO:30).
[0066]FIG. 41 shows the amino acid sequence alignment of (i) human Adlican (SEQ ID NO: 21), (ii) human Adlican-2 full amino acid predicted sequence, as determined by the inventors (SEQ ID NO 30), (iii) deduced sequence (hLOC96359) of human Adlican-2 fragment of 539 amino acid residues as found in the database (residues 2036-2652 of SEQ ID NO 30), and (iv) deduced sequence (hLOC90792) of human Adlican-2-fragment of 617 amino acid residues as found in the database (residues 2114-2652 of SEQ ID NO 30).
[0067]FIG. 42 shows the complete physical DNA sequence of the coding region (ORF) of human OCP (SEQ ID NO: 31).
[0068]FIG. 43 shows the predicted amino acid sequence corresponding to the complete physical DNA sequence of the coding region (ORF) of human OCP (SEQ ID NO:32).
[0069]FIG. 44 shows the full rat 608 cDNA sequence (SEQ ID NO:33). This sequence is virtually identical to SEQ ID NO:1, but five unknown nucleotides (designated "n" in SEQ ID NO:1) have been identified. The ORF is from position 575 to 8368.
[0070]FIG. 45 shows the OCP rat protein amino acid sequence corresponding to the above ORF sequence (SEQ ID NO:34). Three previously unknown amino acids have been identified, as compared to SEQ ID NO:2, where these amino acids are designated "Xaa".
[0071]FIG. 46 shows that ALP, which is a biochemical serum marker of bone formation, is significantly increased in 3 month old 608 KO mice.
DETAILED DESCRIPTION OF THE INVENTION
[0072]The present invention is related to the discovery of a novel gene, 608 ("OCP"), the expression of which is upregulated by mechanical stress on primary calvaria cells. Several functional features identify OCP as a most specific early marker of osteo- or chondro-progenitor cells as well as an inducer of osteoblast proliferation and differentiation.
[0073]As used herein, the same gene of the invention may be referred to either as "608" or "OCP." RNA refers to RNA isolated from cell cultures, cultured tissues or cells or tissues isolated from organisms which are stimulated, differentiated, exposed to a chemical compound, infected with a pathogen, or otherwise stimulated. As used herein, translation is defined as the synthesis of protein encoded by an mRNA template.
[0074]As used herein, stimulation of translation, transcription, stability or transportation of unknown target mRNA or stimulating element, includes chemically, pathogenically, physically, or otherwise inducing or repressing an mRNA population encoded by genes derived from native tissues and/or cells under pathological and/or stress conditions. In other words, stimulating the expression of an mRNA with a stress inducing element or "stressor" includes, but is not limited to, the application of an external cue, stimulus, or stimuli that stimulates or initiates translation of an mRNA stored as untranslated mRNA in the cells from the sample. The stressor may cause an increase in stability of certain mRNAs, or induce the transport of specific mRNAs from the nucleus to the cytoplasm. The stressor may also induce specific gene transcription. In addition to stimulating translation of mRNA from genes in native cells/tissues, stimulation can include induction and/or repression of genes under pathological and/or stress conditions. The method utilizes a stimulus or stressor to identify unknown target genes regulated at the various possible levels by the stress inducing element or stressor.
[0075]More in particular, with respect to nucleic acid molecules (rat 608 and human 608 genes) and polypeptides expressed from them, the invention further comprehends isolated and/or purified nucleic acid molecules and isolated and/or purified polypeptides having at least about 70%, preferably at least about 75% or about 77% homology ("substantially homologous"); advantageously at least about 80% or about 83%, such as at least about 85% or about 87% homology ("significantly homologous"); for instance at least about 90% or about 93% homology ("highly homologous"); more advantageously at least about 95%, e.g., at least about 97%, about 98%, about 99% or even about 100% homology ("very highly homologous" to "100% homologous"); or from about 84-100% homology considered "highly conserved". The invention also comprehends that these nucleic acid molecules and polypeptides can be used in the same fashion as the herein or aforementioned nucleic acid molecules and polypeptides.
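By way of illustration only, the homology labels above can be expressed as a simple lookup. The following is a minimal sketch, assuming that the quoted lower bounds ("at least about 70%", "at least about 80%", and so on) are treated as hard cut-offs; the function name homology_tier is hypothetical and is not part of the disclosure.

```python
def homology_tier(percent: float) -> str:
    """Map a percent homology to the descriptive label quoted in the text.

    Assumption of this sketch: the "at least about" lower bounds are
    applied as exact thresholds, checked from the highest tier down.
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    if percent >= 95.0:
        return "very highly homologous"
    if percent >= 90.0:
        return "highly homologous"
    if percent >= 80.0:
        return "significantly homologous"
    if percent >= 70.0:
        return "substantially homologous"
    return "below the stated thresholds"
```

The separate "highly conserved" range (about 84-100%) overlaps several of these labels and is deliberately left out of the sketch.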
[0076]Nucleotide sequence homology can be determined using the "Align" program of Myers and Miller ((1988) CABIOS 4:11-17), available at NCBI. Alternatively or additionally, the term "homology" for instance, with respect to a nucleotide or amino acid sequence, can indicate a quantitative measure of homology between two sequences. The percent sequence homology can be calculated as (Nref-Ndif)*100/Nref, wherein Ndif is the total number of non-identical residues in the two sequences when aligned and wherein Nref is the number of residues in one of the sequences. Hence, AGTCAGTC has a sequence similarity of 75% to AATCAATC (Nref=8; Ndif=2).
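The calculation (Nref-Ndif)*100/Nref can be sketched as follows. This is an illustrative example only, assuming the two sequences are already aligned, ungapped and of equal length, so that Nref is simply the length of the first sequence; the function name percent_homology is hypothetical.

```python
def percent_homology(ref: str, other: str) -> float:
    """Return (Nref - Ndif) * 100 / Nref for two pre-aligned sequences.

    Assumption of this sketch: both sequences are ungapped and have the
    same length, so Nref is the length of `ref`.
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    if not ref:
        raise ValueError("sequences must not be empty")
    n_ref = len(ref)
    # Ndif: number of aligned positions whose residues differ.
    n_dif = sum(1 for a, b in zip(ref, other) if a != b)
    return (n_ref - n_dif) * 100 / n_ref


# The worked example from the text: Nref=8, Ndif=2.
print(percent_homology("AGTCAGTC", "AATCAATC"))  # 75.0
```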
[0077]Alternatively or additionally, "homology" with respect to sequences can refer to the number of positions with identical nucleotides or amino acid residues divided by the number of nucleotides or amino acid residues in the shorter of the two sequences wherein alignment of the two sequences can be determined in accordance with the Wilbur and Lipman algorithm ((1983) Proc. Natl. Acad. Sci. USA 80:726), for instance, using a window size of 20 nucleotides, a word length of 4 nucleotides, and a gap penalty of 4, and computer-assisted analysis and interpretation of the sequence data including alignment can be conveniently performed using commercially available programs (e.g., Intelligenetics® Suite, Intelligenetics Inc., CA). When RNA sequences are said to be similar, or have a degree of sequence identity or homology with DNA sequences, thymidine (T) in the DNA sequence is considered equal to uracil (U) in the RNA sequence. RNA sequences within the scope of the invention can be derived from DNA sequences or their complements, by substituting thymidine (T) in the DNA sequence with uracil (U).
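As an illustration of this alternative measure, the sketch below counts identical positions and divides by the length of the shorter sequence, treating T and U as equal. It assumes the alignment has already been produced (the Wilbur and Lipman algorithm itself is not implemented here) and that gaps are written as "-"; the names dna_to_rna and identity_over_shorter are hypothetical.

```python
GAP = "-"


def dna_to_rna(dna: str) -> str:
    """Derive an RNA sequence from DNA by substituting T with U."""
    return dna.upper().replace("T", "U")


def identity_over_shorter(aligned_a: str, aligned_b: str) -> float:
    """Identical positions divided by the residue count of the shorter sequence.

    Assumptions of this sketch: the inputs are already aligned to equal
    length with '-' marking gaps, and T in DNA is considered equal to U in RNA.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    a, b = dna_to_rna(aligned_a), dna_to_rna(aligned_b)
    # A gap never counts as an identical position.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != GAP)
    shorter = min(len(a) - a.count(GAP), len(b) - b.count(GAP))
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return matches / shorter
```

For example, identity_over_shorter("AGTCAGTC", "AGUCAG--") compares a DNA sequence with a shorter RNA sequence; all six residues of the shorter sequence match, so the result is 1.0.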
[0078]Additionally or alternatively, amino acid sequence similarity or identity or homology can be determined, for instance, using the BlastP program (Altschul et al. (1997) Nucl. Acids Res. 25:3389-3402), available at NCBI. The following references provide algorithms for comparing the relative homology of amino acid residues of two proteins, and additionally, or alternatively, with respect to the foregoing, the teachings in these references can be used for determining percent homology. Smith et al. (1981) Adv. Appl. Math. 2:482-489; Smith et al. (1983) Nucl. Acids Res. 11:2205-2220; Devereux et al. (1984) Nucl. Acids Res. 12:387-395; Feng et al. (1987) J. Molec. Evol. 25:351-360; Higgins et al. (1989) CABIOS 5:151-153; and Thompson et al. (1994) Nucl. Acids Res. 22:4673-4680.
[0079]As to uses, the inventive genes and expression products as well as genes identified by the herein disclosed methods and expression products thereof and the compositions comprising Adlican or the Adlican gene (including "functional" variations of such expression products, and truncated portions of herein defined genes such as portions of herein defined genes which encode a functional portion of an expression product) are useful in treating, preventing or controlling or diagnosing mechanical stress conditions or absence or reduced mechanical stress conditions.
[0080]As described herein, Adlican, including functional portions thereof, can be used in all methods suitable for OCP. The sequence homology between Adlican and human OCP provides this novel use of the Adlican protein. Adlican is provided, for instance, in AF245505.1:1.8487. Adlican is named for "Adhesion protein with Leucine-rich repeats has immunoglobulin domains related to perleCAN"; and shows elevated expression in cartilage from osteoarthritis patients. The Adlican gene, or functional portions thereof, can likewise be used for any purpose described herein for an OCP gene. The invention further encompasses compositions comprising a physiologically acceptable excipient and at least one of Adlican, the Adlican gene and antibodies specific to Adlican.
[0081]OCP expression is related to proliferation and differentiation of osteoblasts and chondrocytes. The expression product of OCP, or cells or vectors expressing OCP may cause cells to selectively proliferate and differentiate and thereby increase or alter bone density. Detecting levels of OCP mRNA or expression and comparing it to "normal" non-osteopathic levels may allow one to detect subjects at risk for osteoporosis or lower levels of osteoblasts and chondrocytes.
[0082]The medicament or treatment can be any conventional medicament or treatment for osteoporosis. Alternatively, or additionally, the medicament or treatment can be the particular protein of the gene detected in the inventive methods, or that which inhibits that protein, e.g., binds to it. Similarly, additionally, or alternatively, the medicament or treatment can be a vector which expresses the protein of the gene detected in the inventive methods or that which inhibits expression of that gene; again, for instance, that which can bind to it and/or otherwise prevents its transcription or translation. The selection of administering a protein or that which expresses it, or of administering that which inhibits the protein or the gene expression, can be done without undue experimentation, e.g., based on down-regulation or up-regulation as determined by inventive methods (e.g., in the osteoporosis model).
[0083]In the practice of the invention, one can employ general methods in molecular biology. Standard molecular biology techniques known in the art and not specifically described are generally followed as in Sambrook et al. (1989, 1992) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York; and Ausubel et al. (1989) Current Protocols in Molecular Biology, John Wiley and Sons, Baltimore, Md.
[0084]PCR used in the methods of the invention is performed in a reaction mixture comprising template nucleic acid, typically in an amount from less than 10 ng up to 200 ng; 50-100 pmoles of each oligonucleotide primer; 1-1.25 mM of each deoxynucleotide triphosphate; a buffer solution appropriate for the polymerase used to catalyze the amplification reaction; and 0.5-2 Units of a polymerase, most preferably a thermostable polymerase (e.g., Taq polymerase or Tth polymerase).
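For illustration, the typical component ranges listed above can be captured in a small checker. This is a sketch only: the class PcrMix, the table TYPICAL_RANGES and the function out_of_range are hypothetical names, and reading "<10 ng" as "any positive amount up to 200 ng" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PcrMix:
    template_ng: float        # template nucleic acid, ng
    primer_pmol_each: float   # each oligonucleotide primer, pmoles
    dntp_mm_each: float       # each deoxynucleotide triphosphate, mM
    polymerase_units: float   # polymerase, Units


# Typical ranges quoted in the text, as inclusive (low, high) bounds.
# Assumption: "<10 ng" is read as any positive amount, so the lower
# bound for the template is checked separately as "greater than zero".
TYPICAL_RANGES = {
    "template_ng": (0.0, 200.0),
    "primer_pmol_each": (50.0, 100.0),
    "dntp_mm_each": (1.0, 1.25),
    "polymerase_units": (0.5, 2.0),
}


def out_of_range(mix: PcrMix) -> list[str]:
    """Return the names of components outside the typical ranges."""
    problems = []
    for name, (low, high) in TYPICAL_RANGES.items():
        value = getattr(mix, name)
        if name == "template_ng" and value <= 0:
            problems.append(name)
        elif not low <= value <= high:
            problems.append(name)
    return problems
```

A mixture such as PcrMix(50, 75, 1.0, 1.0) yields an empty list, while raising the template to 500 ng yields ["template_ng"].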
[0085]Antibodies may be used in various aspects of the invention, e.g., in detection or treatment or prevention methods. Antibodies can be monoclonal, polyclonal or recombinant for use in the immunoassays or other methods of analysis necessary for the practice of the invention. By the term "antibody" as used in the present invention is meant both poly- and mono-clonal complete antibodies as well as fragments thereof, such as Fab, F(ab')2, and Fv, which are capable of binding the epitopic determinant. These antibody fragments retain the ability to selectively bind with their antigen or receptor and are exemplified as follows, inter alia:
[0086](1) Fab, the fragment which contains a monovalent antigen-binding fragment of an antibody molecule, can be produced by digestion of whole antibody with the enzyme papain to yield a light chain and a portion of the heavy chain;
[0087](2) F(ab')2, the fragment of the antibody that can be obtained by treating whole antibody with the enzyme pepsin without subsequent reduction; F(ab')2 is a dimer of two Fab fragments held together by two disulfide bonds;
[0088](3) Fv, defined as a genetically engineered fragment containing the variable region of the light chain and the variable region of the heavy chain expressed as two chains; and
[0089](4) Single chain antibody (SCA), defined as a genetically engineered molecule containing the variable region of the light chain and the variable region of the heavy chain linked by a suitable polypeptide linker as a genetically fused single chain molecule.
[0090]Conveniently, the antibodies may be prepared against the immunogen or antigenic portion thereof for example a synthetic peptide based on the sequence, or prepared recombinantly by cloning techniques or the natural gene product and/or portions thereof may be isolated and used as the immunogen. The genes are identified as set forth in the present invention and the gene product identified. Immunogens can be used to produce antibodies by standard antibody production technology well known to those skilled in the art as described generally in Harlow and Lane (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.; and Borrebaeck (1992) Antibody Engineering--A Practical Guide, W.H. Freeman and Co. Antibody fragments, as mentioned above, include Fab, F(ab')2, Fv and scFv prepared by methods known to those skilled in the art. Bird et al. (1988) Science 242:423-426. Any peptide having sufficient flexibility and length can be used as an scFv linker. Usually the linker is selected to have little to no immunogenicity. Linker sequences can also provide additional functions, such as a means for attaching a drug or a solid support.
[0091]For producing polyclonal antibodies, a host, such as a rabbit or goat, is immunized with the immunogen or an immunogenic fragment thereof, generally with an adjuvant and, if necessary, coupled to a carrier; and antibodies to the immunogen are collected from the sera of the immunized animal. The sera can be adsorbed against related immunogens so that no cross-reactive antibodies remain in the sera rendering the polyclonal antibody monospecific.
[0092]For producing monoclonal antibodies (mAbs), an appropriate donor, generally a mouse, is hyperimmunized with the immunogen and splenic antibody producing cells are isolated. These cells are fused to an immortal cell, such as a myeloma cell, to provide an immortal fused cell hybrid that secretes the antibody. The cells are then cultured, in bulk, and the mAbs are harvested from the culture media for use. Hybridoma cell lines provide a constant, inexpensive source of chemically identical antibodies and preparations of such antibodies can be easily standardized. Methods for producing mAbs are well known to those of ordinary skill in the art. See, e.g. U.S. Pat. No. 4,196,265.
[0093]For producing recombinant antibodies, mRNAs from antibody producing B lymphocytes of animals, or hybridomas are reverse-transcribed to obtain cDNAs. See generally, Huston et al. (1991) Met. Enzymol. 203:46-88; Johnson and Bird (1991) Met. Enzymol. 203:88-99; and Mernaugh and Mernaugh (1995) In, Molecular Methods in Plant Pathology (Singh and Singh eds.) CRC Press Inc. Boca Raton, Fla., pp. 359-365. Antibody cDNA, which can be full or partial length, is amplified and cloned into a phage or a plasmid. The cDNA can be a partial length of heavy and light chain cDNA, separated or connected by a linker. The antibody, or antibody fragment, is expressed using a suitable expression system to obtain recombinant antibody. Antibody cDNA can also be obtained by screening pertinent expression libraries.
[0094]Antibodies can be bound to a solid support substrate or conjugated with a detectable moiety or be both bound and conjugated as is well known in the art. For a general discussion of conjugation of fluorescent or enzymatic moieties see, Johnston and Thorpe (1982) Immunochemistry in Practice, Blackwell Scientific Publications, Oxford. The binding of antibodies to a solid support substrate is also well known in the art. See for a general discussion, Harlow and Lane (1988); and Borrebaeck (1992). The detectable moieties contemplated with the present invention include, but are not limited to, fluorescent, metallic, enzymatic and radioactive markers such as biotin, gold, ferritin, alkaline phosphatase, β-galactosidase, peroxidase, urease, fluorescein, rhodamine, tritium, 13C and iodination.
[0095]Antibodies can also be used as an active agent in a therapeutic composition and such antibodies can be humanized, for instance, to enhance their effects. See, Huls et al. Nature Biotech. 17:1999. "Humanized" antibodies are antibodies in which at least part of the sequence has been altered from its initial form to render it more like human immunoglobulins. In one version, the H chain and L chain C regions are replaced with human sequence. In another version, the CDR regions comprise amino acid sequences from the antibody of interest, while the V framework regions have also been converted to human sequences. See, for example, EP 0329400. In a third version, V regions are humanized by designing consensus sequences of human and mouse V regions, and converting residues outside the CDRs that are different between the consensus sequences. The invention encompasses humanized mAbs.
[0096]The expression product from the gene or portions thereof can be useful for generating antibodies such as monoclonal or polyclonal antibodies which are useful for diagnostic purposes or to block activity of expression products or portions thereof or of genes or a portion thereof, e.g., as therapeutics.
[0097]Note that some antibodies to the mouse or rat 608 polypeptide may also bind the human 608 polypeptide. A preferred set of antibodies encompassed by this invention are antibodies which bind human 608 polypeptide but which do not bind rat 608 polypeptide. Another preferred set of antibodies encompassed by this invention are antibodies which bind human 608 polypeptide but which do not bind mouse 608 polypeptide.
[0098]The genes of the present invention or portions thereof, e.g., a portion thereof which expresses a protein which functions the same as or analogously to the full length protein, or genes identified by the methods herein can be expressed recombinantly, e.g., in Escherichia coli or in another vector or plasmid for either in vivo expression or in vitro expression. The methods for making and/or administering a vector or recombinant or plasmid for expression of gene products of genes of the invention or identified by the invention or a portion thereof either in vivo or in vitro can be any desired method, e.g., a method which is by or analogous to the methods disclosed in: U.S. Pat. Nos. 4,603,112; 4,769,330; 5,174,993; 5,505,941; 5,338,683; 5,494,807; 4,394,448; 4,722,848; 4,745,051; 4,769,331; 5,591,639; 5,589,466; 4,945,050; 5,677,178; 5,591,439; 5,552,143; and 5,580,859; U.S. patent application Serial No. 920,197, filed Oct. 16, 1986; WO 94/16716; WO 96/39491; WO 91/11525; WO 98/33510; WO 90/01543; EP 0 370 573; EP 265785; Paoletti (1996) Proc. Natl. Acad. Sci. USA 93:11349-11353; Moss (1996) Proc. Natl. Acad. Sci. USA 93:11341-11348; Richardson (Ed) (1995) Methods in Molecular Biology 39, "Baculovirus Expression Protocols," Humana Press Inc.; Smith et al. (1983) Mol. Cell. Biol. 3:2156-2165; Pennock et al. (1984) Mol. Cell. Biol. 4:399-406; Roizman Proc. Natl. Acad. Sci. USA 93:11307-11312; Andreansky et al. Proc. Natl. Acad. Sci. USA 93:11313-11318; Robertson et al. Proc. Natl. Acad. Sci. USA 93:11334-11340; Frolov et al. Proc. Natl. Acad. Sci. USA 93:11371-11377; Kitson et al. (1991) J. Virol. 65:3068-3075; Grunhaus et al. (1992) Sem. Virol. 3:237-52; Ballay et al. (1993) EMBO J. 4:3861-65; Graham (1990) Tibtech 8:85-87; Prevec et al. J. Gen. Virol. 70:429-434; Felgner et al. (1994) J. Biol. Chem. 269:2550-2561; (1993) Science 259:1745-49; McClements et al. (1996) Proc. Natl. Acad. Sci. USA 93:11414-11420; Ju et al. (1998) Diabetologia 41:736-739; and Robinson et al. (1997) Sem. Immunol. 9:271-283.
[0099]The expression product generated by vectors or recombinants can also be isolated and/or purified from infected or transfected cells; for instance, to prepare compositions for administration to patients. However, in certain instances, it may be advantageous to not isolate and/or purify an expression product from a cell; for instance, when the cell or portions thereof enhance the effect of the polypeptide.
[0100]As used herein, "treatment" refers to clinical intervention in an attempt to alter the natural course of the individual or cell being treated, and may be performed either for prophylaxis or during the course of clinical pathology. Desirable effects of the treatment include preventing occurrence or recurrence of disease, alleviation of symptoms, diminishment of any direct or indirect pathological consequences of the disease, preventing metastases, decreasing the rate of disease progression, amelioration or palliation of the disease state, and remission or improved prognosis.
[0101]An inventive vector or recombinant expressing a gene or a portion thereof identified herein or from a method herein can be administered in any suitable amount to achieve expression at a suitable dosage level, e.g., a dosage level analogous to the herein mentioned dosage levels (wherein the gene product is directly present). The inventive vector or recombinant nucleotide can be administered to a patient or infected or transfected into cells in an amount of about at least 10³ pfu; more preferably about 10⁴ pfu to about 10¹⁰ pfu, e.g., about 10⁵ pfu to about 10⁹ pfu, for instance about 10⁶ pfu to about 10⁸ pfu. In plasmid compositions, the dosage should be a sufficient amount of plasmid to elicit a response analogous to compositions wherein gene product or a portion thereof is directly present; or to have expression analogous to dosages in such compositions; or to have expression analogous to expression obtained in vivo by recombinant compositions. For instance, suitable quantities of plasmid DNA in plasmid compositions can be 1 μg to 100 mg, preferably 0.1 to 10 mg, e.g., 500 μg, but lower levels such as 0.1 to 2 mg or preferably 1-10 μg may be employed. Documents cited herein regarding DNA plasmid vectors can be consulted for the skilled artisan to ascertain other suitable dosages for DNA plasmid vector compositions of the invention, without undue experimentation.
[0102]Compositions for administering vectors can be as in or analogous to such compositions in documents cited herein or as in or analogous to compositions herein described, e.g., pharmaceutical or therapeutic compositions and the like.
[0103]Thus, the invention comprehends in vivo gene expression which is sometimes termed "gene therapy." Gene therapy can refer to the transfer of genetic material (e.g. DNA or RNA) of interest into a host subject or patient to treat or prevent a genetic or acquired disease, condition or phenotype. The particular gene that is to be used or which has been identified as the target gene is identified as set forth herein. The genetic material of interest encodes a product (e.g. a protein, polypeptide, peptide or functional RNA) the production in vivo of which is desired. For example, the genetic material of interest can encode a hormone, receptor, enzyme, polypeptide or peptide of therapeutic value. For a review see, in general, the text "Gene Therapy" (Advances in Pharmacology 40, Academic Press, 1997).
[0104]Two basic approaches to gene therapy have evolved: (1) ex vivo; and (2) in vivo gene therapy. In ex vivo gene therapy, cells are removed from a patient and, while being cultured, are treated in vitro. Generally, a functional replacement gene is introduced into the cell via an appropriate gene delivery vehicle/method (transfection, homologous recombination, etc.) and an expression system as needed, and then the modified cells are expanded in culture and returned to the host/patient. These genetically reimplanted cells have been shown to produce the transfected gene product in situ. In in vivo gene therapy, target cells are not removed from the subject; rather, the gene to be transferred is introduced into the cells of the recipient organism in situ, that is within the recipient. Alternatively, if the host gene is defective, the gene is repaired in situ. Culver (1998) Antisense DNA & RNA Based Therapeutics, February, 1998, Coronado, Calif. These genetically altered cells have been shown to produce the transfected gene product in situ.
[0105]The gene expression vehicle is capable of delivery/transfer of heterologous nucleic acid into a host cell. The expression vehicle may include elements to control targeting, expression and transcription of the nucleic acid in a cell-selective manner as is known in the art. It should be noted that often the 5'UTR and/or 3'UTR of the gene may be replaced by the 5' UTR and/or 3'UTR of the expression vehicle. Therefore, as used herein, the expression vehicle may, as needed, not include the 5'UTR and/or 3'UTR shown in sequences herein and only include the specific amino acid coding region.
[0106]The expression vehicle can include a promoter for controlling transcription of the heterologous material and can be either a constitutive or inducible promoter to allow selective transcription. Enhancers that may be required to obtain necessary transcription levels can optionally be included. Enhancers are generally any non-translated DNA sequence that works contiguously with the coding sequence (in cis) to change the basal transcription level dictated by the promoter. The expression vehicle can also include a selection gene as described herein.
[0107]Vectors can be introduced into cells or tissues by any one of a variety of known methods within the art. Such methods can be found generally described in Sambrook et al. (1989, 1992); Ausubel et al. (1989); Chang et al. (1995) Somatic Gene Therapy, CRC Press, Ann Arbor, Mich.; Vega et al. (1995) Gene Targeting, CRC Press, Ann Arbor, Mich.; Vectors: A Survey of Molecular Cloning Vectors and Their Uses, Butterworths, Boston Mass. (1988); and Gilboa et al. (1986) BioTech. 4:504-512, as well as other documents cited herein and include, for example, stable or transient transfection, lipofection, electroporation and infection with recombinant viral vectors. In addition, see U.S. Pat. No. 4,866,042 for vectors involving the central nervous system; and also U.S. Pat. Nos. 5,464,764 and 5,487,992 for positive-negative selection methods.
[0108]Introduction of nucleic acids by infection offers advantages over the other listed methods. Higher efficiency can be obtained due to their infectious nature. Moreover, viruses are very specialized and typically infect and propagate in specific cell types. Thus, their natural specificity can be used to target the vectors to specific cell types in vivo or within a tissue or mixed cell culture. Viral vectors can also be modified with specific receptors or ligands to alter target specificity through receptor-mediated events.
[0109]Additional features can be added to the vector to ensure its safety and/or enhance its therapeutic efficacy. Such features include, for example, markers that can be used to negatively select against cells infected with the recombinant virus. An example of such a negative selection marker is the TK gene described above that confers sensitivity to the antiviral drug ganciclovir. Negative selection is therefore a means by which infection can be controlled because it provides inducible suicide through the addition of the drug. Such protection ensures that if, for example, mutations arise that produce altered forms of the viral vector or recombinant sequence, cellular transformation will not occur. Features that limit expression to particular cell types can also be included. Such features include, for example, promoter and regulatory elements that are specific for the desired cell type.
[0110]In addition, recombinant viral vectors are useful for in vivo expression of a desired nucleic acid because they offer advantages such as lateral infection and targeting specificity. Lateral infection is inherent in the life cycle of, for example, retrovirus and is the process by which a single infected cell produces many progeny virions that bud off and infect neighboring cells. The result is that a large area becomes rapidly infected, most of which was not initially infected by the original viral particles. This is in contrast to vertical-type of infection in which the infectious agent spreads only through daughter progeny. Viral vectors can also be produced that are unable to spread laterally. This characteristic can be useful if the desired purpose is to introduce a specified gene into only a localized number of targeted cells.
[0111]In particular, use of the 608 gene (or a functional fragment thereof) for treatment of osteoporosis, and/or osteoarthritis, and/or osteopetrosis, and/or osteosarcoma, and/or fracture healing is envisaged using gene therapy methods. As described above, a plasmid or DNA vector expressing the gene could be injected directly into the target tissue; alternatively, a virus bearing a plasmid or DNA vector expressing the gene could be injected directly into the target tissue. In another embodiment, cells transfected with a plasmid or DNA vector expressing the gene could be injected directly into the target tissue. These transfected cells should preferably be the patient's own cells, for example, mesenchymal stem cells drawn from the bone marrow.
[0112]Delivery of gene products (products from herein defined genes: genes identified herein or by inventive methods or portions thereof) and/or antibodies or portions thereof and/or agonists or antagonists (collectively or individually "therapeutics"), and compositions comprising the same, as well as of compositions comprising a vector expressing gene products, can be done without undue experimentation from this disclosure and the knowledge in the art.
[0113]The pharmaceutically "effective amount" for purposes herein is thus determined by such considerations as are known in the art. The amount must be effective to achieve improvement including but not limited to improved survival rate or more rapid recovery, or improvement or amelioration or elimination of symptoms and other indicators, e.g., of osteoporosis, for instance, improvement in bone density, as are selected as appropriate measures by those skilled in the art.
[0114]It is noted that humans are treated generally longer than the mice or other experimental animals exemplified herein. Human treatment has a length proportional to the length of the disease process and drug effectiveness. The doses may be single doses or multiple doses over a period of several days, but single doses are preferred. Thus, one can scale up from animal experiments, e.g., rats, mice, and the like, to humans, by techniques from this disclosure and the knowledge in the art, without undue experimentation.
[0115]The present invention provides an isolated nucleic acid molecule containing nucleotides having a sequence set forth in at least one of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO: 6, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 29 or SEQ ID NO: 31 or as inserted in a plasmid designated pCm-H-608-663-N-term, deposited under ATCC Accession No. PTA-3638, supplements thereof and a polynucleotide having a sequence that differs from SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO: 6, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29 or SEQ ID NO: 31 or as inserted in a plasmid designated pCm-H-608-663-N-term, deposited under ATCC Accession No. PTA-3638 due to the degeneracy of the genetic code or a sequence which hybridizes under stringent conditions to a sequence in a plasmid designated pCm-H608-663-N-term or a functional portion thereof or a polynucleotide which is at least substantially homologous thereto. In a preferred embodiment, the nucleic acid molecule comprises a polynucleotide having at least 15 nucleotides from SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO: 6, SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 29 or SEQ ID NO: 31 or as inserted in a plasmid designated pCm-H-608-663-N-term, deposited under ATCC Accession No. PTA-3638, preferably at least 50 nucleotides and more preferably at least 100 nucleotides.
[0116]The present invention further provides an isolated nucleic acid molecule containing nucleotides having a sequence set forth in at least one of SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO: 28, or SEQ ID NO:26 and SEQ ID NO:27 or SEQ ID NO:26 and SEQ ID NO:27 and SEQ ID NO: 28 or as inserted in a plasmid designated pKS H608 5'-2.4 Kb bAc#1 (deposited on Nov. 21, 2001 under the terms of the Budapest Treaty with the American Type Culture Collection (ATCC), P.O. Box 1549, Manassas, Va. 20108, USA, under ATCC accession number PTA-3878), pKS H608 m.FRG.3.5 Kb#34 (deposited on Nov. 21, 2001 under the terms of the Budapest Treaty with the American Type Culture Collection (ATCC), P.O. Box 1549, Manassas, Va. 20108, USA, under ATCC accession number PTA-3876) or pM H608 3'-1.9 Kb HSTG#3.3 (deposited on Nov. 21, 2001 under the terms of the Budapest Treaty with the American Type Culture Collection (ATCC), P.O. Box 1549, Manassas, Va. 20108, USA, under ATCC accession number PTA-3877), supplements thereof and a polynucleotide having a sequence that differs from SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO: 28, or SEQ ID NO:26 and SEQ ID NO:27 or SEQ ID NO:26 and SEQ ID NO:27 and SEQ ID NO: 28 or as inserted in a plasmid designated pKS H608 5'-2.4 Kb bAc#1, pKS H608 m.FRG.3.5 Kb#34 or pM H608 3'-1.9 Kb HSTG#3.3 due to the degeneracy of the genetic code or a sequence which hybridizes under stringent conditions to a sequence in a plasmid designated pKS H608 5'-2.4 Kb bAc#1, pKS H608 m.FRG.3.5 Kb#34 or pM H608 3'-1.9 Kb HSTG#3.3 or a functional portion thereof or a polynucleotide which is at least substantially homologous thereto. 
In a preferred embodiment, the nucleic acid molecule comprises a polynucleotide having at least 15 nucleotides from SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO: 28, or SEQ ID NO:26 and SEQ ID NO:27 or SEQ ID NO:26 and SEQ ID NO:27 and SEQ ID NO: 28 or as inserted in a plasmid designated pKS H608 5'-2.4 Kb bAc#1, pKS H608 m.FRG.3.5 Kb#34 or pM H608 3'-1.9 Kb HSTG#3.3, preferably at least 50 nucleotides and more preferably at least 100 nucleotides.
[0117]The present invention also provides a composition of the isolated nucleic acid molecule, a vector comprising the isolated nucleic acid molecule, a composition containing said vector and a method for preventing, treating or controlling bone diseases including, but not limited to, osteoporosis, osteopenia, osteopetrosis, osteosclerosis, osteoarthritis, periodontosis, bone fractures or low bone density or other conditions involving mechanical stress or a lack thereof in a subject, comprising administering the inventive composition, or the inventive vector, and a method for preparing a polypeptide comprising expressing the isolated nucleic acid molecule or comprising expressing the polypeptide from the vector.
[0118]The present invention further provides a method for preventing, treating or controlling osteoporosis, osteopenia, osteopetrosis, osteosclerosis, osteoarthritis, periodontosis, bone fractures or low bone density or other factors causing or contributing to osteoporosis or symptoms thereof or other conditions involving mechanical stress or a lack thereof in a subject, comprising administering an isolated nucleic acid molecule or functional portion thereof or a polypeptide comprising an expression product of the gene or functional portion of the polypeptide or an antibody to the polypeptide or a functional portion of the antibody. In one embodiment of the invention, the isolated nucleic acid molecule encodes a 10 kD to 100 kD N-terminal cleavage product of the OCP protein. Preferably, the N-terminal cleavage product comprises a polypeptide of about 25 kD. More preferably the N-terminal cleavage product comprises a polypeptide of about 70-80 kD, most preferably about 1-663 amino acids or about 1-741 amino acids of the OCP protein. The present invention provides an isolated polypeptide encoded by the inventive polynucleotide. In one embodiment of the invention, the polypeptide is identified as human 608 protein, rat 608 protein, human Adlican-2 protein or a functional portion thereof or a polypeptide which is at least substantially homologous thereto. More particularly this invention is directed to an isolated polypeptide wherein the functional portion comprises consecutive amino acids having a sequence set forth in SEQ ID NO:2, SEQ ID NO:16, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 30, SEQ ID NO: 32 or SEQ ID NO: 34. Particular fragments of the polypeptide are about the first 663 amino acids or about the first 741 amino acids of the sequence set forth in SEQ ID NO:2, SEQ ID NO:16, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 30, SEQ ID NO: 32 or SEQ ID NO: 34. 
Other particular fragments of the human 608 protein include amino acids 1-500, 501-1000, 1001-1500, 1501-2000, 2001-2500, and 2051-2623 of the sequence set forth in SEQ ID NO: 32. Further particular fragments of the human 608 protein include amino acids 250-749, 750-1249, 1250-1749, 1750-2249 and 2250-2623 of the sequence set forth in SEQ ID NO: 32. Nucleic acid molecules (polynucleotides) encoding these particular fragments are also envisaged as aspects of the invention. Similar particular polypeptide fragments of the Adlican-2 protein (SEQ ID NO: 30), and similar particular polynucleotide fragments of the Adlican-2 nucleic acid (SEQ ID NO: 29) are also envisaged as aspects of the invention.
[0119]The present invention also provides a composition comprising one or more isolated polypeptides, an antibody specific for the polypeptide or a functional portion thereof, a composition comprising the antibody or a functional portion thereof, and a method for treating or preventing osteoporosis, or fracture healing, bone elongation, or periodontosis in a subject, comprising administering to the subject an N-terminal polypeptide having a molecular weight of between 10 kD and 100 kD, preferably about 25 kD to about 70-80 kD.
[0120]The present invention provides for a method of treating or preventing osteoarthritis, osteopetrosis, or osteosclerosis, comprising administering to a subject an effective amount of a chemical or neutralizing mAbs that inhibit the activity of the N-terminal polypeptide having a molecular weight of between 10 kD and 30 kD, preferably about 25 kD.
[0121]As used herein, the terms "subject," "patient," and "host" include, but are not limited to, human, bovine, pig, mouse, rat, goat, sheep and horse.
[0122]Those skilled in the art will recognize that the components of the compositions should be selected to be chemically inert with respect to the gene product and optional adjuvant or additive. This will present no problem to those skilled in chemical and pharmaceutical principles, or problems can be readily avoided by reference to standard texts or by simple experiments (not involving undue experimentation), from this disclosure and the documents cited herein.
[0123]The present invention provides receptors of the expression products of human mechanical stress induced genes and their functional equivalents, such as OCP and Adlican, and methods or processes for obtaining and using such receptors. The receptors of the present invention are those to which the expression products of mechanical stress induced genes and their functional equivalents bind or associate as determined by conventional assays, as well as in vivo. For example, binding of the polypeptides of the instant invention to receptors can be determined in vitro, using candidate receptor molecules that are associated with lipid membranes. See, e.g., Watson, J. et al., Development of FlashPlate® technology to measure (35S) GTP gamma S binding to Chinese hamster ovary cell membranes expressing the cloned human 5-HT1B receptor, Journal of Biomolecular Screening. Summer, 1998; 3 (2) 101-105; Komesli-Sylviane et al., Chimeric extracellular domain of type II transforming growth factor (TGF)-beta receptor fused to the Fc region of human immunoglobulin as a TGF-beta antagonist, European Journal of Biochemistry. June, 1998; 254 (3) 505-513. See, generally, Darnell et al., Molecular Cell Biology, 644-646, Scientific American Books, New York (1986). Scanning electron microscopy ("SEM"), x-ray crystallography and reactions using labelled polypeptides are examples of conventional means for determining whether polypeptides have bound or associated with a receptor molecule. For instance, X-ray crystallography can provide detailed structural information to determine whether and to what extent binding or association has occurred. See, e.g., U.S. Pat. No. 6,037,117; U.S. Pat. No. 6,128,582 and U.S. Pat. No. 6,153,579. Further, crystallography, including X-ray crystallography, provides three-dimensional structures that show whether a candidate polypeptide ligand can or would bind or associate with a target molecule, such as a receptor. See, e.g., WO 99/45379; U.S. Pat. Nos. 6,087,478 and 6,110,672. Such binding or association shows that the receptor molecule is the receptor for the candidate polypeptide.
[0124]With the disclosures in the present specification of the inventive genes, expression products and uses thereof, those skilled in the art can obtain by conventional methods the receptors for the inventive expression products. The conventional means for obtaining the receptors include raising monoclonal antibodies (Mabs) to candidate receptors, purifying the receptors from a tissue sample by use of an affinity column, treatment with a buffer, and collection of the eluate receptor molecules. Other means of isolating and purifying the receptors are conventional in the art, for instance isolation and purification by dialysis, salting out, and electrophoretic (e.g. SDS-PAGE) and chromatographic (e.g. ion-exchange and gel-filtration, in addition to affinity) techniques. Such methods can be found generally described in Stryer, Biochemistry, 44-50, W.H. Freeman & Co., New York (3d ed. 1988); Darnell et al., Molecular Cell Biology, 77-80 (1986); Alberts et al., Molecular Biology of the Cell, 167-172, 193 Garland Publishing, New York (2d ed. 1989).
[0125]Sequencing of the isolated receptor involves methods known in the art, for instance directly sequencing a short N-terminal sequence of the receptor, constructing a nucleic-acid probe, isolating the receptor gene, and determining the entire amino-acid sequence of the receptor from the nucleic-acid sequence. Alternatively, the entire receptor protein can be sequenced directly. Automated Edman degradation is one conventional method used to partially or entirely sequence a receptor protein, facilitated by chemical or enzymatic cleavage. Automated sequenators, such as an ABI-494 Procise Sequencer (Applied Biosystems) can be used. See, generally, Stryer, Biochemistry, 50-58 (3d ed. 1988).
[0126]The invention provides methods for using such receptors in assays, for instance for identifying proteins or polypeptides that bind to, associate with or block the inventive receptors, determining binding constants and degree of binding, and for testing the effects of such polypeptides, for instance utilising membrane receptor preparations. See Watson (1998); Komesli-Sylviane (1998). For instance, FlashPlate® (Perkin-Elmer, Massachusetts, USA) technology can be used with the present invention to determine whether and to what degree candidate polypeptides bind to and are functional with respect to a receptor of the invention.
Diagnostics:
[0127]The gene and polypeptides of the invention can be employed as a diagnostic in several ways as follows:
1. Diagnosis of osteoarthritis by detection of 608 protein or parts of it, or detection of 608 RNA in synovial fluid.
2. Diagnosis of osteoporosis by detection of 608 protein or fragments thereof, or detection of 608 RNA, preferably in a blood sample.
3. Diagnosis of a fracture by detection of 608 protein or fragments thereof, or detection of 608 RNA, preferably in a blood sample.
4. Diagnosis of susceptibility to osteoporosis, and/or osteoarthritis, and/or osteopetrosis, and/or osteosarcoma associated with mutated 608 by PCR or RT-PCR of DNA or RNA respectively. DNA and/or RNA from bodily fluids or from a tissue, preferably DNA from blood, is tested.
5. Diagnosis of a disease associated with mutated 608 by PCR or RT-PCR of DNA or RNA respectively. DNA and/or RNA from bodily fluids or from a tissue, preferably DNA from blood, is tested.
[0128]The diagnostic methods to be utilized are described in more detail as follows.
[0129]In diagnosis, the sample is taken from a bodily fluid or from a tissue, preferably bone or cartilage tissue; the bodily fluid is selected from the group consisting of blood, lymph fluid, ascites, serous fluid, pleural effusion, sputum, cerebrospinal fluid, lacrimal fluid, synovial fluid, saliva, stool, sperm and urine, preferably blood or urine. The level of the 608 polypeptide may be determined by a method selected from the group consisting of immunohistochemistry, western blotting, ELISA, antibody microarray hybridization and targeted molecular imaging; antibodies have been described above. Such methods are well-known in the art, for example for immunohistochemistry: M. A. Hayat (2002) Microscopy, Immunohistochemistry and Antigen Retrieval Methods: For Light and Electron Microscopy, Kluwer Academic Publishers; Brown C (1998) "Antigen retrieval methods for immunohistochemistry", Toxicol Pathol; 26(6): 830-1; for western blotting: Laemmli UK (1970) "Cleavage of structural proteins during the assembly of the head of bacteriophage T4", Nature; 227: 680-685; and Egger & Bienz (1994) "Protein (western) blotting", Mol Biotechnol; 1(3): 289-305; for ELISA: Onorato et al. (1998) "Immunohistochemical and ELISA assays for biomarkers of oxidative stress in aging and disease", Ann NY Acad Sci 20; 854: 277-90; for antibody microarray hybridization: Huang (2001) "Detection of multiple proteins in an antibody-based protein microarray system", J Immunol Methods 1; 255 (1-2): 1-13; and for targeted molecular imaging: Thomas (2001) Targeted Molecular Imaging in Oncology, Kim et al. (Eds.), Springer Verlag, inter alia.
[0130]Measurement of level of 608 polynucleotide may be determined by a method selected from: RT-PCR analysis, in-situ hybridization, polynucleotide microarray and Northern blotting. Such methods are well-known in the art, for example for in-situ hybridization Andreeff & Pinkel (Editors) (1999), "Introduction to Fluorescence In Situ Hybridization: Principles and Clinical Applications", John Wiley & Sons Inc.; and for Northern blotting Trayhurn (1996) "Northern blotting", Proc Nutr Soc; 55(1B): 583-9 and Shifman & Stein(1995) "A reliable and sensitive method for non-radioactive Northern blot analysis of nerve growth factor mRNA from brain tissues", Journal of Neuroscience Methods; 59: 205-208 inter alia.
[0131]A better understanding of the present invention and of its many advantages will be had from the following examples, given by way of illustration and as a further description of the invention.
EXPERIMENTAL DETAILS
Example 1
608 Gene Expression by In Situ Hybridization
[0132]The 608 gene expression pattern was studied by in situ hybridization on sections of bones from ovariectomized and sham-operated rats. Female Wistar rats weighing 300-350 g were subjected to ovariectomy under general anesthesia. Control rats underwent a sham operation performed in the same way, except that the ovaries were not excised.
[0133]Three weeks after the operation, rats were sacrificed and tibiae were excised together with the knee joint. Bones were fixed for three days in 4% paraformaldehyde and then decalcified for four days in a solution containing 5% formic acid and 10% formalin. Decalcified bones were postfixed in 10% formalin for three days and embedded into paraffin.
[0134]The ectopic bone formation model was employed to study the 608 gene expression pattern during bone development. Rat bone marrow cells were seeded into cylinders of demineralized bone matrix prepared from rat tibiae. Cylinders were implanted subcutaneously into adult rats. After three weeks, rats were sacrificed and implants were decalcified and embedded into paraffin as described above for tibial bones.
[0135]The 6 μm sections were prepared and hybridized in situ. After hybridization, sections were dipped into nuclear track emulsion and exposed for three weeks at 4° C. Autoradiographs were developed, stained with hematoxylin-eosin and studied under microscopy using brightfield and darkfield illumination.
[0136]For further assessment of cell and tissue specificity of 608 gene expression, an in situ hybridization study was performed on sections of multitissue block containing multiple samples of adult rat tissues. The 608 expression developmental pattern was studied on sagittal sections of mouse embryos of 12.5, 14.5 and 16.5 days postconception (dpc) stages.
[0137]Microscopic study of hybridized sections of long bones revealed a peculiar pattern of 608 probe hybridization. The hybridization signal can be seen mainly in fibroblast-like cells found in several locations throughout the sections. Prominent accumulations of these cells can be seen in the area of periosteal modeling in metaphysis, and also in regions of active remodeling of compact bone in diaphysis: at the boundary between bone marrow and endosteal osteoblasts and in periosteum; also in close contact with osteoblasts. Perivascular connective tissue filling Volkmann's canals in compact bone in diaphysis and epiphysis also contains 608-expressing cells. No hybridization was found within cancellous bone and in bone marrow. This hybridization pattern suggests that cells expressing 608 are associated with areas of remodeling of preexisting bone and are not involved in primary endochondral ossification.
[0138]At the growth plate level, 608 expressing cells can be seen in the perichondral fibrous ring of LaCroix. Some investigators regard this fibrous tissue as the aggregation of residual mesenchymal cells able to differentiate into both osteoblasts and chondrocytes. In this respect it is noteworthy that single cells expressing 608 can be seen in epiphyseal cartilage. These 608-expressing cells are rounded cells within the lateral segment of epiphysis (sometimes in close vicinity to the LaCroix ring) and flattened cells covering the articular surface. Most cells in articular cartilage and all chondrocytes on the growth plate do not show 608 expression. Ovariectomy did not alter the intensity and pattern of 608 expression in bone tissue.
[0139]In ectopic bone sections, 608 hybridization signal can be seen in some fibroblast-like cells either scattered within unmineralized connective tissue matrix or concentrated at the boundary between this tissue and osteoblasts of immature bone. 608 gene expression patterns revealed by in situ hybridization in bone and cartilage indicate that its expression marks some skeletal tissue elements able to differentiate into two skeletal cell types--osteoblasts and chondrocytes. The terminal differentiation of these cells appears to be accompanied by down-regulation of 608 expression. The latter observation is supported by the peculiar temporal pattern of 608 expression in primary cultures of osteogenic cells isolated from calvaria bones of rat fetuses. In these cultures, expression was revealed by in situ hybridization in the vast majority of cells after one and two weeks of incubation in vitro. Three and four week old cultures showing signs of ossification contained no 608 expressing cells. Significantly, no hybridization signal was found on sections of multitissue block hybridized to the 608 probe, suggesting high specificity of this gene expression for the skeletal tissue in adult organisms.
[0140]An in situ hybridization study of embryonic sections demonstrated that, at 12.5 dpc, a weak hybridization signal can be discerned in some mesenchymal cells in several locations throughout the embryonic body. The most prominent signal is found in the head in loose mesenchymal tissue surrounding the olfactory epithelium and underlying the surface epithelium of nose tip. Other mesenchymal cells in the head also show a hybridization signal: non-cartilaginous part of basisphenoid bone primordium and mesenchyme surrounding the dental laminae (tooth primordia) in the mandible.
[0141]In the trunk, expression can be detected in less developed vertebrae primordia in the thoraco-lumbar region. The hybridization signal here marks the condensed portion of sclerotomes. Another area of the trunk showing a hybridization signal is comprised of a thin layer of mesenchymal cells in the anterior part of the thoracic body wall.
[0142]At later stages of development, 14.5 and 16.5 dpc, probe 608 gave no hybridization signal. Thus, it appears that during embryonic development the 608 gene is transiently expressed by at least some mesenchymal and skeleton-forming cells. This expression is down-regulated at later stages of development. More detailed study of late embryonic and postnatal stages of development reveals the timing of appearance of cells expressing 608 in bone tissue.
[0143]Further experiments to study the expression of the OCP gene in embryonic development were performed as follows. The targeting vector used to produce OCP knockout mice included a "knock-in" of the β-galactosidase (LacZ) reporter gene into the OCP gene. The LacZ gene was fused to the first exon of the OCP gene--a non-coding exon. Thus, expression of LacZ is expected to depend on the OCP regulatory elements and to mark the cells expressing OCP.
[0144]Analysis of LacZ staining was performed during embryonic development on OCP knockout mice. The expression pattern revealed by this analysis reflects the activation of the OCP gene promoter, which results in expression of the knocked-in LacZ gene. This data in general supports the pattern detected by the in-situ hybridization described above.
[0145]At 10.5 dpc expression is seen in the apical ectodermal ridge (AER), in the forelimbs only. This specialized region, together with the zone of polarizing activity (ZPA), directs and coordinates the development of the limb bud.
[0146]At 12.5 dpc expression in AER is maintained in the forelimbs and appears also in hindlimbs. In addition, it appears in precartilaginous condensations of the ribs, in mesenchymal tissue in the face, in mesenchymal tissue rostral to the forelimb, a region of future muscle development, and in the tip of the genital tubercle.
[0147]At 14.5 dpc there is broader expression, in the head region, limbs, ribs, and back. Although no expression at 14.5 dpc was detected in the in situ RNA hybridization experiments described above, expression was detected in this experiment, probably because the lacZ staining is a more sensitive detection method.
[0148]To summarize:
[0149]1. There is an interesting pattern of expression of the OCP gene in embryonic development: in tissues originating from different germ layers (ectoderm and mesoderm), in critical regions of pattern formation (AER), and an apparently regulated pattern during cartilage and bone development.
[0150]2. In mesodermal tissues, the gene is expressed mainly in skeletal lineages, but also in some myoblasts and some dermal cells.
[0151]3. Ectodermal expression appears in the head mesenchyme, derived from neural crest cells, and cells in the apical ectodermal ridge.
[0152]The 608 expression pattern during embryonic development is closely coupled with regions of bone and cartilage development. This expression pattern strongly suggests a role for 608 in bone metabolism.
Example 2
Isolation of Rat OCP
[0153]Primary rat calvaria cells grown on elastic membranes that were stretched for 20 minutes provided a model system for a stimulator of bone formation following mechanical force. Gene expression patterns were compared before and after the application of mechanical force.
[0154]OCP expression was upregulated approximately 3-fold by mechanical force. This was detected both by microarray analysis and by Northern blot analysis using poly (A)+ RNA from rat calvaria cells before and after the mechanical stress. In rat calvaria primary cells and in rat bone extract this gene was expressed as a main RNA species of approximately 8.9 kb and a minor RNA transcript of approximately 9 kb. The hybridization signal was not detected in any other rat RNA from various tissue sources, including testis, colon, intestine, kidney, stomach, thymus, lung, uterus, heart, brain, liver, eye, and lymph node.
[0155]The partial OCP rat cDNA clone (4007 bp long) isolated from a rat calvaria cDNA phage library was found to contain a 3356 bp open reading frame closed at the 3' end. Comparison to public mouse databases revealed no sequence homologues. A complete OCP rat cDNA clone was isolated from the rat calvaria cDNA library by a combination of 5' RACE technique (Clontech), RT-PCR of 5' cDNA fragments, and ligation of the latter products to the original 3' clone. The full rat cDNA clone that was generated (shown in FIG. 1-SEQ ID NO:1) was sequenced, and no mutations were found. The full sequence stretch is 8883 bp long and contains an ORF (nt 575-8366) for a 2597 amino acid residue protein (FIG. 3-SEQ ID NO:2). The cDNA does not contain a polyadenylation site, but contains a 3' poly A stretch.
[0156]608 encodes a large protein that appears to be a part of the extra-cellular matrix. The gene may be actively involved in supporting osteoblast differentiation. Another option is that it is expressed in regions where remodeling takes place. Such a hypothesis is also compatible with a role in directing osteoclast action and thus it may be a target for inhibition by small molecules.
[0157]In normal bone formation, activation of osteoblasts leads to secretion of various factors that attract osteoclast precursors or mature osteoclasts to the sites of bone formation to initiate the process of bone resorption. In normal bone formation both functions are balanced. Imbalance to any side causes either osteitis deformans (osteoblast function overwhelms) or osteoporosis (osteoclast function overwhelms).
[0158]One of the known osteoblast activators, mechanical force stimulation, is actually applied in the present model. As proof of principle, increased expression of several genes known to respond to mechanical stress by transcriptional upregulation was found. They include tenascin, endothelin and possibly thrombospondin.
Example 3
Full-Length OCP cDNA Construction and Expression
[0159]TNT (transcription-translation) assays were performed according to the manufacturer's instructions (Promega--TNT coupled reticulocyte lysate systems), using specific fragments taken from various regions of the gene. In all assays a clear translation product was observed. The following fragments were tested:
TABLE-US-00001
TNT products
Frag.  Location   Fragment size (bp)  Translation product size (kD)  Promoter
1      134-2147   2013                73                             T7
2      3912-5014  1102                40                             T7
3      574-1513   939                 34                             T7
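The translation product sizes in the table are close to what a common rule of thumb predicts, namely three nucleotides per codon and roughly 110 Da per amino acid residue. The following minimal sketch applies that rule to the listed fragment sizes. The 110 Da average and the function name estimate_product_kd are assumptions made for illustration and are not part of the assay described above.

```python
# Rule-of-thumb average residue mass; an assumption, not a value from the disclosure.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_product_kd(fragment_bp: int) -> float:
    """Estimate the mass in kD of a polypeptide translated from fragment_bp nucleotides."""
    codons = fragment_bp // 3  # count whole codons only
    return codons * AVG_RESIDUE_MASS_DA / 1000.0


# Fragment number -> (fragment size in bp, reported product size in kD), from the TNT table.
fragments = {1: (2013, 73), 2: (1102, 40), 3: (939, 34)}

for frag, (size_bp, reported_kd) in fragments.items():
    estimate = estimate_product_kd(size_bp)
    print(f"fragment {frag}: estimated {estimate:.1f} kD, reported {reported_kd} kD")
```

Under this assumption each estimate falls within 1 kD of the reported size, which is consistent with the fragments being translated over their full length.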
Example 4
The Mouse OCP Gene
[0160]Two mouse genomic BAC clones containing the mouse OCP gene promoter region and part of the coding region were identified, based on their partial homology to the 5'UTR region of the rat-608 cDNA. These clones (23-261L4 and 23-241H7 with ˜200 Kb average insert length) were bought from TIGR.
[0161]Specific primers for the amplification of a part of the mouse-OCP promoter region were designed and used for PCR screening of a mouse genomic phage library (performed by Lexicon Genetics Inc. for the Applicants). One phage clone containing part of the genomic region of the mouse 608 gene was detected and completely sequenced. The length of this clone was reported to be 11,963 bp. Parts of the physical "Lexicon" clone were re-sequenced by the inventors and corrections were made. The resequenced clone is 11,967 bp long. Exon-location prediction (FIG. 4) was performed based on the alignment of the mouse genomic and the rat cDNA sequences.
Example 5
The Human OCP Gene
[0162]On the nucleotide level, the rat OCP cDNA sequence is homologous to the human genomic DNA sequence located on chromosome 3. Based on the homology and bioinformatic analysis (FIG. 6), a putative cDNA sequence was generated (FIG. 7). The highest similarity is evident between nt 1-1965 (1-655 a.a); 2179-2337 (727-779 a.a); and 4894-7833 (1635 a.a.-end) as presented in the table shown in FIG. 8. On the protein level, no homologues were found in the data bank.
Example 6
The Deduced OCP Protein
[0163]The deduced OCP protein was generated following the alignment of the rat, mouse and human cDNA sequences and the equivalent rat, mouse and human amino acid sequences, respectively. The following alignments were made: (a) alignment of rat, human, and mouse OCP cDNA coding regions (rat cDNA: SEQ ID NO:7; human 5+3 corrected: SEQ ID NO:8; and mus cDNA 5: SEQ ID NO:9); (b) alignment of rat, human and mouse OCP proteins (rat: SEQ ID NO:10; human 5+3 corrected: SEQ ID NO:11; and mouse 5 corrected: SEQ ID NO: 12); and (c) alignment of rat and human OCP proteins (rat: SEQ ID NO:13; and human 5+3 corrected: SEQ ID NO:14).
[0164]The deduced OCP protein (FIG. 10) contains the following features:
[0165]a. a cleavable, well-defined N-terminal signal peptide (aa 1-28);
[0166]b. a leucine-rich repeat region (aa 28-280). This region can be divided into N-terminal and C-terminal domains of leucine-rich repeats (aa 28-61 and 219-280, respectively). Between them, there are six leucine-rich repeat outliers (aa 74-96, 98-120, 122-144, 146-168, 178-200, 202-224). Leucine rich repeats are usually found in extracellular portions of a number of proteins with diverse functions. These repeats are thought to be involved in protein-protein interactions. Each leucine-rich repeat is composed of β-sheet and α-helix. Such units form elongated non-globular structures;
[0167]c. twelve immunoglobulin C-2 type repeats at amino acid positions 488-558, 586-652, 1635-1704, 1732-1801, 1829-1898, 1928-1997, 2025-2100, 2128-2194, 2233-2294, 2324-2392, 2419-2487, 2515-2586. Thus, two Ig-like repeats are found immediately downstream of a leucine-rich region, while the remaining 10 repeats are clustered at the protein's C-terminus. Immunoglobulin C-2 type repeats are involved in protein-protein interaction and are usually found in extracellular protein portions;
[0168]d. no transmembrane domain; and
[0169]e. five nuclear localization domains (NLS) at positions 724, 747, 1026, 1346 and 1618.
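The feature coordinates listed above can be held in a simple table and checked for internal consistency. The sketch below is illustrative only: it copies the amino acid positions from the list, while the variable names and the position-1000 cutoff used to separate the two Ig-like repeat clusters are assumptions introduced here.

```python
# Amino acid coordinates copied from the feature list of the deduced OCP protein.
LRR_REGION_END = 280  # end of the leucine-rich repeat region (aa 28-280)
LRR_OUTLIERS = [(74, 96), (98, 120), (122, 144), (146, 168), (178, 200), (202, 224)]
IG_C2_REPEATS = [
    (488, 558), (586, 652), (1635, 1704), (1732, 1801), (1829, 1898), (1928, 1997),
    (2025, 2100), (2128, 2194), (2233, 2294), (2324, 2392), (2419, 2487), (2515, 2586),
]


def span_length(span):
    """Inclusive length of a (start, end) amino acid span."""
    start, end = span
    return end - start + 1


# Hypothetical cutoff: repeats starting before position 1000 are treated as the
# pair lying immediately downstream of the leucine-rich region.
CLUSTER_CUTOFF = 1000
n_terminal_ig = [s for s in IG_C2_REPEATS if s[0] < CLUSTER_CUTOFF]
c_terminal_ig = [s for s in IG_C2_REPEATS if s[0] >= CLUSTER_CUTOFF]

print("Ig C-2 repeats:", len(IG_C2_REPEATS))
print("near leucine-rich region:", len(n_terminal_ig), "C-terminal cluster:", len(c_terminal_ig))
print("LRR outlier lengths:", [span_length(s) for s in LRR_OUTLIERS])
```

With these coordinates the split is two repeats followed by ten, as stated in the text, and each of the six leucine-rich repeat outliers spans 23 residues.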
[0170]These observations indicate that OCP belongs to the Ig superfamily. OCP is a serine-rich protein (10.3% versus av. 6.3%), with a central nuclear prediction domain and an N-terminal extracellular prediction domain.
Example 7
Bone Fracture Healing
[0171]Expression of 608 RNA is bone-specific. Moreover, it seems to be specific to bone progenitors (as judged by their location in bone and involvement in normal bone modeling and remodeling processes) that do not yet express the known bone-specific markers. To further prove the relevance of 608-expressing cells to osteogenic lineage, the patterns of 608 expression in the animal model of bone fracture healing that imply the activation of bone formation processes were studied.
[0172]The sequence of physiological events following bone fracture is now relatively well understood. Healing takes place in three phases: inflammatory, reparative and remodeling. In each phase certain cells predominate and specific histological and biochemical events are observed. Although these phases are referred to separately, it is well known that events described in one phase persist into the next and events apparent in a subsequent phase begin before this particular phase predominates. These events have been described over the years in investigative reports and review articles. Ham (1969) In, Histology, 6th ed. Philadelphia, Lippincott, p. 441; and Urist and Johnson (1943) J. Bone Joint Surg. 25:375.
[0173]During the first phase immediately following fracture (the inflammatory phase), wide-spread vasodilatation and exudation of plasma lead to the acute edema visible in the region of a fresh fracture. Acute inflammatory cells migrate to the region, as do polymorphonuclear leukocytes and then macrophages. The cells that participate directly in fracture repair during the second phase (the reparative phase), are of mesenchymal origin and are pluripotent. These cells form collagen, cartilage and bone. Some cells are derived from the cambium layer of the periosteum and form the earliest bone. Endosteal cells also participate. However, the majority of cells directly taking part in fracture healing enter the fracture site with the granulation tissue that invades the region from surrounding vessels. Trueta (1963) J. Bone Joint Surg. 45:402. Note that the entire vascular bed of an extremity enlarges shortly after the fracture has occurred but the osteogenic response is limited largely to the zones surrounding the fracture itself. Wray (1963) Angiol. 14:134.
[0174]The invading cells produce tissue known as "callus" (made up of fibrous tissue, cartilage, and young, immature fibrous bone), rapidly enveloping the ends of the bone, with a resulting gradual increase in stability of the fracture fragments. Cartilage thus formed will eventually be resorbed by a process that is indistinguishable except for its lack of organization from endochondral bone formation. Bone will be formed by those cells having an adequate oxygen supply and subjected to the relevant mechanical stimuli.
[0175]Early in the repair process, cartilage formation predominates and glycosaminoglycans are found in high concentrations. Later, bone formation is more obvious. As this phase of repair takes place, the bone ends gradually become enveloped in a mass of callus containing increasing amounts of bone. In the middle of the reparative phase the remodeling phase begins, with resorption of portions of the callus and the laying down of trabecular bone along lines of stress. Finally, exercise increases the rate of bone repair. Heikkinen et al. Scand J. Clin. Lab. Invest. 25(suppl 113):32. In situ hybridization results have shown that OCP expression is confined to very specific regions where bone and cartilage formation is initiated.
[0176]In order to find out if OCP expression is induced in an animal model of bone fracture healing, a standard midshaft fracture was created in rat femur by means of a blunt guillotine, driven by a dropped weight. Bonnarens et al. (1984) Orthop. Res. 2:97-101. Bones were excised one, two, three and four weeks after fracture, fixed in buffered formalin, decalcified in EDTA solution and embedded in paraffin. All sections were hybridized with the OCP probe. The in-situ hybridization results show that a strong hybridization signal was apparent during the first and second weeks of fracture healing in the highly vascularized areas of the connective tissue within the callus (FIGS. 16-18), the endosteum (FIG. 19), the woven bone (FIG. 20) and the periosteum (FIG. 21). The periosteum is regarded as a source of undifferentiated progenitors participating in callus formation at the site of bone fracture. The hybridization signal disappeared slowly during further differentiation stages of fracture healing (three and four weeks) and was retained only in the vascularized connective tissue. FIG. 22 displays brightfield (left) and darkfield (right) photomicrographs of a section of fractured bone healed for 4 weeks. In these later healing stages, the mature callus tissue was found to be composed mainly of cancellous bone undergoing remodeling into compact bone, with little if any cartilage or woven bone present. The volume of the vascularized periosteal tissue is decreased, but multiple cells in the periosteal tissue area of active remodeling of the cancellous bone covering the callus show hybridization signal. This tissue covers the center of the callus and is also entrapped within the bone. See FIGS. 22 and 23. The box in FIG. 22 is enlarged in FIG. 23. As in the earlier stages, no hybridization signal was found in chondrocytes and osteoblasts. FIGS. 17 and 23.
Several OCP expressing cells are concentrated in the vascular tissue that fills the cavities resulting from osteoclast activity (marked by asterisks).
[0177]Fractures in the young heal rapidly, while adult bone fractures heal slowly. The cause is a slower recruitment of specific chondro-/osteo-progenitors for the reparative process in the adult bone. Denervation retards fracture healing by diminishing the stress across the fracture site, while mechanical stress increases the rate of repair probably by increasing the proliferation and differentiation of specific bone progenitor cells and as a result, accelerates the rate of bone formation. The above results confirm our conclusions (see also hereunder) that OCP is most probably involved in induction of cortical and trabecular bone formation and remodeling, endochondral bone growth during development, and bone repair processes. In addition, there is strong evidence that OCP expression is tightly regulated, and induced during the earliest stages of bone fracture repair when osteo-/chondro-progenitor cells are recruited. This observation suggests that OCP plays a role in this process.
[0178]Taking into account the pattern of 608 expression during the process of bone fracture healing, it seems a reasonable hypothesis that 608-positive precursor cells are involved not only in remodeling of intact bone but also in the repair processes of fractured bone.
Example 8
OCP Transcriptional Regulation
[0179]In order to clone the longest possible fragment containing the OCP regulatory region(s), BACs L4 and H7 were restricted with three different enzymes: BamHI, Bgl II and SauIIIA. The resulting fragments were cloned into the BamHI site of pKS. Ligation mixes were transformed into bacteria (E. coli-DH5α) and 1720 colonies were plated onto nitrocellulose filters, which were screened with a 32P-labeled PCR fragment spanning mouse OCP exon 1. Positive colonies were isolated.
[0180]Two identical clones, 14C10 and 15E11, contained the largest inserts (BamHI-derived ˜13 Kb inserts). The 14C10 clone is longer than the OCP "Lexicon" clone by ˜8 Kb at the 5' end.
[0181]a. Cloning of Mouse OCP Promoter and UTR Upstream to the Reporter Gene-EGFP
[0182]The 1.4 Kb genomic region of the mouse OCP gene, flanked by the BamHI site (nuc 5098 of the "Lexicon" clone, which is the start site of clone p14C10) and the first ATG codon (first nucleotide of exon 2), was synthesized by genomic PCR using the "Lexicon" clone as template and pre-designed primers: a 5' primer (For1) located upstream to the BamHI site (nucleotides 4587-4611 of the Lexicon clone) and a 3' primer (Rev 2) located immediately upstream to the first ATG (nucleotides 6560-6540 of the Lexicon clone) and tailed by a NotI site. The PCR product was cut by BamHI and NotI and the resulting 1.4 Kb fragment was ligated into the BamHI/NotI sites of pMCSIE, upstream to the EGFP reporter gene. The resulting clone was designated pMCSIEm608prm1.4.
[0183]Clone p14C10 was cut by XbaI and BamHI and the excised 4.088 Kb fragment was ligated into the BamHI and XbaI sites of pMCSIEm608prm1.4, upstream to the 1.4 Kb insert. The resulting clone was designated pMCSIEm608prm5.5 and contains 5552 nucleotides of the mouse 608 promoter and UTR upstream to EGFP. The insert of pMCSIEm608prm5.5 clone was completely sequenced.
[0184]The whole 13 Kb insert of p14C10 was excised by BamHI and ligated upstream to the 1.4 Kb insert of pMCSIEm608prm1.4 into the BamHI site. The resulting construct, pMCSIEm608prm14.5 contains a 14.5 Kb fragment of the mouse-OCP promoter and UTR upstream to EGFP.
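The insert sizes quoted for the three EGFP constructs can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming that the fragments are joined at the shared BamHI site without gain or loss of nucleotides; the variable names are illustrative and do not appear in the original disclosure.

```python
# Sizes taken from the text, in nucleotides.
xbai_bamhi_fragment = 4088   # XbaI/BamHI fragment excised from p14C10
insert_5_5 = 5552            # sequenced insert of pMCSIEm608prm5.5

# Size implied for the "1.4 Kb" BamHI/NotI PCR fragment (assumes a clean join).
implied_1_4_fragment = insert_5_5 - xbai_bamhi_fragment
print(implied_1_4_fragment)

# The largest construct adds the same fragment to the ~13 Kb BamHI insert.
approx_13kb_insert = 13_000  # approximate; the text gives only "~13 Kb"
approx_14_5_insert = approx_13kb_insert + implied_1_4_fragment
print(round(approx_14_5_insert / 1000, 1))
```

Under these assumptions the 1.4 Kb fragment comes to 1464 nucleotides, and the largest insert rounds to the 14.5 Kb quoted in the text.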
[0185]b. Cloning Mouse OCP Promoter and UTR Upstream to the Reporter Gene-Luciferase
[0186]Both inserts of pMCSIEm608prm5.5 and of pMCSIEm608prm14.5 were also cloned upstream to luciferase, in Promega's pGL3-Basic vector. The 5.5 Kb insert of pMCSIEm608prm5.5 was excised by EcoRV and XbaI and ligated to SmaI and NheI sites of pGL3-Basic vector. The resulting clone is designated pGL3basicm608prm5.5.
[0187]Plasmid pMCSIEm608prm14.5 was restricted by NotI and the cohesive ends of the linearized plasmid were filled and turned into blunt ends. The 14.5 Kb insert was then excised by cutting the linear plasmid by SalI. The purified 14.5 Kb fragment was ligated to the XhoI and HindIII (filled in) sites of pGL3-basic upstream to the luciferase gene to create the construct designated pGL3basicm608prm14.5. SEQ ID NO:18 depicts 4610 bp that have been sequenced.
c. Analysis of TF Binding DNA Elements Common to Mouse and Human OCP
[0188]Known transcription factor (TF) binding DNA elements were analyzed for similarity upstream of the human and mouse OCP ATG using the DiAlign program of Genomatix GmbH. The genomic pieces used are the proprietary mouse genomic OCP and the reverse complement of AC024886 92001 to 111090. The locations of the ATG in these DNA pieces are:
[0189]575 on rat cDNA
[0190]*6521 on mouse genomic
[0191]*3381 on the piece extracted from human genomic DNA AC024886
14 elements were extracted in this procedure and analyzed for transcription factor binding motifs using MatInspector.
[0192]Some of the main "master gene" binding sites are the osteoblast-/chondrocyte-specific Cbfa1 factor; the chondrocyte-specific SOX 9 factor; the myoblast-specific Myo-D and Myo-F factors; the brain- and bone-specific WT1; Egr 3 and Egr 2 factors (Egr superfamily); the vitamin D-responsive (VDR) factor; the adipocyte-specific PPAR factor; and the ubiquitous activator SP1.
Example 9
Expression Pattern and Regulation of Gene 608: Expression of Gene 608 in Regard to Other Osteogenic Lineage Markers
[0193]Expression of gene 608 was tested in primary cells and in cell lines with regard to expression of various markers of osteogenic and chondrogenic lineages. The results of this analysis are summarized in the following table and show that expression of 608 is restricted to committed early osteoprogenitor cells.
TABLE-US-00002
Cells | 608 | Collagen I | Collagen II | Alk. Phos. | Osteocalcin | Cbfa1 | Osteopontin
STO (fibroblasts) | - | - | + | - | + | + | +
ROS (osteosarcoma) | - | - | - | + | + | +/- | +
MC3T3 (pre-osteoblasts) | + | - | - | + | + | + | +
C2C12 (pre-myoblasts) | - | - | - | - | + | - | +
C6 (glioma) | - | -
Calvaria mouse | + | +
Calvaria rat | + | +
C3H10T1/2 (mesenchymal stem cells) | - | - | + | - | + | - | +
Example 10
OCP Expression is Mechanically Induced in MC3T3 E1 Cells
[0194]OCP transcription was detected by RT-PCR in mouse calvaria cells, U2OS cells (human osteosarcoma cell line), and human embryonal bone. FIG. 14. OCP was initially discovered as being upregulated during mechanical stress in calvaria cells. In the present invention, we demonstrate that the influence of mechanical stress on OCP expression can be reproduced in another cell system using a different type of mechanical stimulation. In serum-deprived MC3T3-E1 pre-osteoblastic cells, mechanical stimulation caused by mild (287×g) centrifugation markedly induced OCP mRNA accumulation. FIG. 15. Other osteoblastic marker genes (osteopontin, ALP (staining--not shown) and Cbfa1) were also transcriptionally augmented by this procedure. FIG. 15. The RT-PCR product of a non-osteoblastic marker gene (GAP-DH) was used as a control to compare RNA levels between samples. No increased expression was noticed when the latter primers were used. No expression was detected in non-osteoblastic cells (FIG. 14), suggesting that OCP expression is specifically induced in osteogenesis. Responsiveness of CMF608 expression to mechanical stimulation was confirmed by Northern blot analysis using polyA RNA from primary rat calvaria cells before and after mechanical stress (FIG. 12).
Example 11
OCP Induction During Endochondral Growth--In Situ Hybridization Analysis
[0195]Our previous results demonstrated that OCP is expressed during adult mice bone modeling and remodeling. The expression was restricted to the following regions:
[0196]1 perichondrium
[0197]2 periosteum
[0198]3 active remodeling and modeling regions
[0199]4 perivascular connective tissue
[0200]5 articular cartilage covering cells
[0201]6 embryo-condensed mesenchymal cells-head, vertebrae and trunk
[0202]7 ectopic bone formation
[0203]No previous observations suggest any role for OCP in bone development or initiation of endochondral ossification (longitudinal growth of long bones). Thus, the expression pattern of OCP was analyzed by in situ hybridization on sections of bones from 1 week old mice. At this stage of bone development, osteogenesis starts within the epiphysis (secondary ossification center). The hind limb skeleton of 1 week old rat pups (femur together with tibia) was fixed in buffered formalin, and longitudinal sections of decalcified tissue were processed for in situ hybridization according to the standard in-house protocol. Autoradiographs were developed, stained with hematoxylin-eosin and studied under the microscope using brightfield and darkfield illumination.
[0204]A strong fluorescence signal was observed all over the secondary ossification center using OCP probes. FIG. 17. In addition, the hybridization signal delineates periosteal and perichondrial tissue in a way similar to that found earlier in adult bones. Surrounding mature chondrocytes displayed no signal. A very faint signal was observed using the osteocalcin probe, which is a marker of mature osteoblasts.
[0205]In conclusion, OCP is expressed in osteoprogenitor cells that initiate endochondral ossification during bone development.
Example 12
In vivo Regulation by Stimuli Either Promoting or Suppressing Bone Formation: Estrogen Administration, Blood Loss and Sciatic Neurotomy
[0206]Osteogenic cells are believed to derive from precursor cells present in the marrow stroma and along the bone surface. Blood loss, a condition that stimulates hemopoietic stem cells, activates osteoprogenitor cells in the bone marrow and initiates a systemic osteogenic response. High-dose estrogen administration also increases de novo medullary bone formation possibly via stimulation of generation of osteoblasts from bone marrow osteoprogenitor cells. In contrast, skeletal unweighting, whether due to space-flight, prolonged bed-rest, paralysis or cast immobilization leads to bone loss in humans and laboratory animal models. To detect alteration in OCP expression pattern following the above procedures, the following experiments were performed on two month old mice:
[0207]estrogen administration (500 μg/animal/week),
[0208]bleeding (withdrawing approximately 1.6% body weight),
[0209]unilateral (right limb) sciatic neurotomy,
[0210]control groups for each treatment
[0211]Total RNA was extracted from long bones after two-day treatment and RT-PCR using OCP-specific primers was performed. The results demonstrate that OCP expression was highly enhanced following blood loss and estrogen administration, while down-regulation was observed following sciatic neurotomy. FIG. 19.
[0212]By having a unique cell marker (OCP) we can show that the above procedures induce or reduce bone formation by increasing or decreasing the number of osteoprogenitor cells. The above results suggest once more that OCP is a major member of a group of "bone specific genes" that regulate the accumulation of bone specific precursor cells.
Example 13
OCP Induction During Osteoblastic Differentiation of Bone Marrow Stroma Cells
[0213]Bone formation should be augmented in trabecular bone and cortical bone in osteoporotic patients. We have previously detected OCP expression in periosteum and endosteum (surrounding the cortical bone) but no signal was apparent in bone marrow cells. The latter cells normally differentiate to mature osteoblasts embedded in the trabecular and cortical bone matrix.
[0214]To further assess OCP expression in bone marrow progenitor cells, the inventors extracted total RNA from mouse and rat bone marrow immediately after obtaining it and after up to 15 days in culture. No OCP-specific RT-PCR product was detected with RNA from freshly obtained bone marrow cells (both adherent and non-adherent). However, a faint signal was found after 5 days in culture, and it was further enhanced when RNA from cells grown for 15 days in culture was used. ALP (alkaline phosphatase) expression (an osteoblastic marker) was also found to be enhanced after 15 days. At both time points, adherent and non-adherent cells were reseeded, and RNA extractions were prepared 5 and 15 days later. A stronger RT-PCR product was observed with RNA extracted from originally adherent cells, suggesting the existence of less mature progenitors in the non-adherent population of bone marrow cells. The RT-PCR product of a non-osteoblastic marker gene (GAP-DH) was used as a control to compare RNA levels between samples.
[0215]In conclusion, bone marrow progenitor cells do not express OCP, but differentiate to more committed cells that do express this gene.
Example 14
OCP Role in Osteogenesis In Vitro
[0216]The ultimate test for the role of OCP as a crucial factor that induces osteoblast-related genes is its ability to up-regulate these genes in pre-osteoblastic and osteoblastic cells. Stable transfection of OCP into ROS 17/2.8 cells (a differentiating osteoblast cell line) upregulated ALP and BSP expression. FIG. 24. In addition, a marked increase in osteoblastic proliferation was observed; see FIG. 25.
[0217]C3H10T1/2 cells were transfected with the following constructs containing the CMV promoter:
1. 608-663 a.a--Construct containing the 5' untranslated region of β-actin, the OCP coding region from ATG at position 1 to the amino acid at position 663 of FIG. 3 (SEQ ID NO:2) and a 3' Flag Tag. The functional portion of the mammalian OCP expressed using this construct contains the first 663 amino acids of the OCP polypeptide sequence, plus several additional amino acids of the 3' Flag tag.
[0218]An additional construct was made, designated pCm-H608-663Nterm, which has the 5' untranslated region of β-actin and the human OCP coding region, which encodes the polypeptide from the ATG at position 1 to the amino acid at position 663 of FIG. 30 (SEQ ID NO:24), but no Flag Tag; this construct was deposited in the ATCC on Aug. 14, 2001 under ATCC Number PTA-3638.
2. pCMV-neo--as negative control. This is the empty plasmid into which the 608-663aa was cloned to create vector #1 above. It serves as negative control to show that the effects are not caused by any other part of the #1 construct but by expression of the 608-663aa.
Example 15
Creation of a Readout System
[0219]A readout system is created to identify small molecules that can either activate or inactivate the OCP bone-precursor-specific promoter.
Example 16
Bioinformatic Analysis of Human 608
[0220]A DNA sequence encoding a fragment of human OCP, named AC024886, is found in the htgs database but not in nt. There is no genomic DNA corresponding to the rat cDNA. Alignment of AC024886 against the rat cDNA using BLAST shows two areas of long alignment (and several shorter areas):
[0221]1. cDNA: 6462-8186 [0222]Genomic: 89228-90952 [0223]plus/plus orientation: 81% identity
[0224]2. cDNA: 5581-6451 [0225]Genomic: 107710-106840 [0226]Plus/minus orientation: 80% identity
[0227]Thus AC024886 was wrongly assembled in the region upstream of position 6462 (according to the rat cDNA): that region was in the incorrect orientation. Using the incorrect orientation gives an incorrect coding sequence that does not yield the human OCP protein.
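The orientation of each alignment can be read directly from its coordinates. The following is a minimal sketch of that check, using the two long BLAST alignments listed above; the `Hit` type and the helper names are illustrative assumptions and are not part of BLAST or of any program named in this disclosure.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    cdna_start: int
    cdna_end: int
    genomic_start: int
    genomic_end: int


def strand(hit: Hit) -> str:
    """'plus/plus' if genomic coordinates ascend along the cDNA, else 'plus/minus'."""
    return "plus/plus" if hit.genomic_end >= hit.genomic_start else "plus/minus"


def span(start: int, end: int) -> int:
    """Inclusive length of a coordinate range given in either direction."""
    return abs(end - start) + 1


# Coordinates of the two long alignments reported in the text.
hits = [
    Hit(6462, 8186, 89228, 90952),
    Hit(5581, 6451, 107710, 106840),
]

for h in hits:
    print(strand(h),
          span(h.cdna_start, h.cdna_end),
          span(h.genomic_start, h.genomic_end))
```

Both alignments cover equal lengths on the cDNA and on the genomic sequence (1725 and 871 bases respectively), and only the second runs in the reverse direction on the genomic record.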
[0228]The Genbank report on AC024886 was as follows:
LOCUS AC024886 175319 bp DNA HTG 06-SEP-2000
[0229]DEFINITION Homo sapiens chromosome 3 clone RP11-25K24, WORKING DRAFT SEQUENCE, 9 unordered pieces.
ACCESSION AC024886
VERSION AC024886.10 GI:9438330
KEYWORDS HTG; HTGS_PHASE1; HTGS_DRAFT.
[0230]SOURCE human. NOTE: This was a `working draft` sequence. It consisted of 9 contigs. The true order of the pieces was not known and their order in this sequence record was arbitrary. Gaps between the contigs are represented as runs of N, but the exact sizes of the gaps were unknown.
[0231]1 62523: contig of 62523 bp in length
[0232]62524 62623: gap of unknown length
[0233]62624 85445: contig of 22822 bp in length
[0234]85446 85545: gap of unknown length
[0235]85546 106059: contig of 20514 bp in length
[0236]106060 106159: gap of unknown length
[0237]106160 127908: contig of 21749 bp in length
[0238]127909 128008: gap of unknown length
[0239]128009 143068: contig of 15060 bp in length
[0240]143069 143168: gap of unknown length
[0241]143169 158734: contig of 15566 bp in length
[0242]158735 158834: gap of unknown length
[0243]158835 170042: contig of 11208 bp in length
[0244]170043 170142: gap of unknown length
[0245]170143 173715: contig of 3573 bp in length
[0246]173716 173815: gap of unknown length
[0247]173816 175319: contig of 1504 bp in length.
a. Mapping Human Genomic 608 Exons
[0248]Ten exons were mapped on the rat cDNA sequence from base 107 to 6451. Thus the first exon on the human genomic piece may be lacking. The human genomic piece (AC024886) upstream (19090 bases) of base 6462 of the cDNA (reverse complement of AC024886 from base 92001 to base 111090) was compared with the rat cDNA using the program ExonMapper of Genomatix. In the Table, base 1 is actually 1131 in the genomic piece used so that the actual genomic location starts at 91870.
[0249]Two additional exons were mapped on the rat cDNA sequence from base 6462 to 8883. Thus bases 6452-6461 are lacking. The human genomic piece used is from base 165,337 to 175,667 (10,331 bases). The same type of program was used to compare this sequence to the genomic mouse 608 sequence deduced as described above.
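The contig and gap coordinates in the GenBank record quoted above, and the 19090-base region used for the first exon mapping, can be cross-checked programmatically. The following is a minimal sketch; the coordinates are transcribed from the text, while the function names and the tuple layout are illustrative assumptions.

```python
# (start, end, kind) rows transcribed from the GenBank record for AC024886.
LAYOUT = [
    (1, 62523, "contig"), (62524, 62623, "gap"),
    (62624, 85445, "contig"), (85446, 85545, "gap"),
    (85546, 106059, "contig"), (106060, 106159, "gap"),
    (106160, 127908, "contig"), (127909, 128008, "gap"),
    (128009, 143068, "contig"), (143069, 143168, "gap"),
    (143169, 158734, "contig"), (158735, 158834, "gap"),
    (158835, 170042, "contig"), (170043, 170142, "gap"),
    (170143, 173715, "contig"), (173716, 173815, "gap"),
    (173816, 175319, "contig"),
]
RECORD_LENGTH = 175319  # from the LOCUS line


def tiles_record(rows, total):
    """True if the rows cover 1..total with no overlap and no hole."""
    expected_start = 1
    for start, end, _kind in rows:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start - 1 == total


def rows_overlapping(rows, start, end):
    """Rows that overlap the inclusive range start..end."""
    return [row for row in rows if row[0] <= end and row[1] >= start]


print(tiles_record(LAYOUT, RECORD_LENGTH))
print(sum(1 for row in LAYOUT if row[2] == "contig"))
print(111090 - 92001 + 1)  # inclusive length of the region 92001..111090
for row in rows_overlapping(LAYOUT, 92001, 111090):
    print(row)
```

With these coordinates the record is fully tiled by 9 contigs and 8 gaps, and the region from 92001 to 111090 is 19090 bases long and extends across two contigs and the gap between them.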
[0250]Connecting the exon/intron borders from the genomic sequences yielded the predicted human and mouse cDNAs. The mouse and human predicted cDNAs were modified in order to allow frame shifts that allow a good multiple alignment of the human, mouse and rat proteins. Alignment was done using CLUSTALX and Pretty.
[0251]The cDNA modifications after the alignment of human cDNA to rat cDNA by GeneWise were as follows. In the following two tables, -x indicates a deletion of nucleotide x in the cDNA sequence; +x indicates an insertion of nucleotide x in the cDNA sequence; and all changed positions are in relation to the original sequence.
TABLE-US-00003
Position | Change
1111 | -g
4154 | -c
4538 | +g
4730 | -a
4744-5 | -aa
4830 | +c
4852 | -g
4902 | +t
4942 | +c
5370 | +t
5387 | -a
5395 | +c
[0252]The corrections of frame-shifts in mouse 608 were as follows:
TABLE-US-00004
Position | Change
678 | -c
1106 | -a
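Because all positions in the two tables above refer to the original sequence, the corrections are easiest to apply from the highest position downward, so that one edit never shifts the coordinates of another. The following is a minimal sketch of that approach. It assumes that an insertion is placed immediately before the stated position, which the text does not specify, and the function names and the toy sequence are hypothetical.

```python
def apply_edits(seq: str, edits: list[tuple[int, str, str]]) -> str:
    """Apply (position, op, bases) edits given in original 1-based coordinates.

    op '-' deletes `bases` starting at position; op '+' inserts `bases`
    immediately before position (an assumed convention).
    """
    out = list(seq)
    # Highest position first, so earlier coordinates stay valid.
    for pos, op, bases in sorted(edits, key=lambda e: e[0], reverse=True):
        i = pos - 1
        if op == "-":
            found = "".join(out[i:i + len(bases)])
            if found.lower() != bases.lower():
                raise ValueError(f"expected {bases!r} at {pos}, found {found!r}")
            del out[i:i + len(bases)]
        elif op == "+":
            out[i:i] = list(bases)
        else:
            raise ValueError(f"unknown op {op!r}")
    return "".join(out)


def net_length_change(edits: list[tuple[int, str, str]]) -> int:
    """Inserted bases minus deleted bases."""
    return sum(len(b) if op == "+" else -len(b) for _pos, op, b in edits)


# Edits transcribed from the two tables above.
human_edits = [
    (1111, "-", "g"), (4154, "-", "c"), (4538, "+", "g"), (4730, "-", "a"),
    (4744, "-", "aa"), (4830, "+", "c"), (4852, "-", "g"), (4902, "+", "t"),
    (4942, "+", "c"), (5370, "+", "t"), (5387, "-", "a"), (5395, "+", "c"),
]
mouse_edits = [(678, "-", "c"), (1106, "-", "a")]

# Toy demonstration on a hypothetical 12-base sequence (not the 608 cDNA).
toy = "atgcccgggtaa"
print(apply_edits(toy, [(4, "-", "c"), (10, "+", "t")]))
print(net_length_change(human_edits), net_length_change(mouse_edits))
```

The deletion branch also verifies that the base being removed matches the one named in the table, which guards against an off-by-one error in the coordinates.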
Chromosomal Location on the Human Chromosome:
[0253]Two different types of data exist.
[0254]a. Genomic piece AC024886 has identity to the fragment identified as ACCESSION D14436 as described by Fukui et al. (1994) Biochem. Biophys. Res. Commun. 201:894-901.
[0255]Alignment information:
[0256]Identities=315/335 (94%),
[0257]hrh1: 4-338
[0258]AC024886: 41662-41328
[0259]Hrh1 is mapped to chromosome 3 and to 3p25; and
[0260]b. Identity to STS at 3q. STS: 20-432 is identified as ACCESSION G54370 and described by Joensuu et al. (2000) Genomics 63:409-416.
Example 17
Polyclonal Antibody Preparation
[0261]Polyclonal antibodies specific to the whole 608 putative protein are prepared by methods well-known in the art (the structure of 608 resembles that of growth factor precursors). Polyclonal antibodies are identified and the recombinant active form of 608 is prepared. The activities of the polyclonal antibodies are tested in vivo in mice. The antibodies can be used for the identification of the active form of this protein which is likely to constitute a fraction of the 608 protein.
Example 18
Stretch of Basic Amino Acids Found at the Boundary of the Rat and Human 608 Proteins, and its Implications
[0262]The homology between the rat and human N-terminal portions of the 608 protein is especially significant within the first 250 amino acids. At the boundary of this conserved region there is a completely conserved stretch of basic amino acids: KCKKDR (aa 242-247 and 240-245, in rat and human proteins, respectively). Stretches of basic amino acids frequently serve as protease cleavage sites. The fact that such a stretch is found on the boundary of more or less conserved sequences and the fact that it occurs within the C-terminal LRR, a generally conserved domain, together suggest an underlying biological significance.
[0263]Accordingly, the 608 protein may undergo post-translational processing through the cleavage of its highly conserved N-terminal portion and this portion may be an active part of the 608 protein or possess at least part of its biological activities. Since the resulting 25 kD protein preserves the signal peptide, it would be secreted.
[0264]The biologically active 25 kD N-terminal cleavage product of 608 can thus be used for treatment and/or prevention of osteoporosis, fracture healing, bone elongation and periodontosis. As an indirect product (inhibition by either chemicals or by neutralizing mAbs), the fragment can be used for treatment and/or prevention of osteoarthritis, osteopetrosis, and osteosclerosis.
Example 19
The Adlican Protein and Gene
[0265]Adlican is a recently described protein. Crowl and Luk (2000) Arthritis Biol. Res. Adlican, a proteoglycan, was derived from placenta. The full amino acid sequence of Adlican is disclosed and identified as AF245505.1:1.8487, and is hereby incorporated by reference into this application; see FIG. 27.
[0266]The structure of Adlican was analyzed using methods described herein and found to have leucine-rich repeats and immunoglobulin regions similar to those of the OCP protein. The overall homology found between the amino acid residues of the indicated regions in the two proteins, is as follows:
TABLE-US-00005
OCP | Adlican | %
1-661 | 1-669 | 38.4
662-1629 | 670-1865 | 19.7
1630-2587 | 1866-2828 | 46.5
1-2587 | 1-2828 | 33.2
[0267]The invention therefore encompasses the use of Adlican in any manner described herein for the OCP protein. These functions and uses have not been disclosed previously for Adlican. They include use of Adlican, or a functional portion thereof, for preventing, treating or controlling osteoporosis, or for fracture healing, bone elongation or treatment of osteopenia, periodontosis, bone fractures or low bone density or other factors causing or contributing to osteoporosis or symptoms thereof or other conditions involving mechanical stress or lack thereof in a subject.
[0268]The Adlican gene, or functional portions thereof, can likewise be used for any purpose described herein for an OCP gene. Compositions comprising the Adlican gene, Adlican or antibodies specific for Adlican and physiologically acceptable excipients are likewise encompassed by the invention. Such excipients are known in the art and include saline, phosphate buffered saline and Ringer's solutions.
Example 20
Sequencing of the N-Terminal of the OCP Gene
[0269]Sequencing of the N-terminal fragment of the OCP gene using the 663 amino acid human construct (Example 14) added six additional nucleotides to the DNA sequence as shown in FIG. 29 (SEQ ID NO:23), where these 6 additional nucleotides are underlined.
[0270]The corresponding amino acid sequence of the encoded OCP protein thus has an additional two amino acids, as shown in FIG. 30 (SEQ ID NO:24), where these 2 additional amino acids are underlined.
Example 21
Preparation of a Recombinant Functional Portion of OCP
[0271]The 663 amino acid construct described in Example 14 was expressed in 293T cells. Western blot analysis of the medium, using antibody to the Flag tag, showed the presence of the 663 amino acid polypeptide. This polypeptide was purified from the medium, using a column of anti-Flag tag antibodies.
[0272]Our objective was to determine if the 1-663 amino acid polypeptide fragment of the 608 protein could induce proliferation in bone-related cell lines. Proliferation activity was tested by [3H]-thymidine incorporation assay on 4 bone-related cell lines, with IGF1 or PTH as standards. (Pre-osteoblastic and osteoblastic proliferation is an activity that characterizes bone formation inducing factors such as IGF1 and PTH.)
[0273]In this key series of experiments, the purified 663 polypeptide showed a proliferative effect on W-20-17, a mouse bone marrow stromal cell line. This effect was reproduced with two 663-polypeptide batches in 5 independent experiments.
[0274]The activity of proliferation of bone marrow stromal cells demonstrated in the above experiments could be indicative of pre-osteoblastic proliferation activity induced by the 663 amino acid polypeptide. The 663 polypeptide activity could be mimicking the complete 608 protein in vivo activity. Alternatively, the 663 polypeptide activity could have a dominant negative effect, i.e. an effect that inhibits the whole 608 protein in vivo activity. Regardless of the mechanism, the 663 polypeptide could be used to induce proliferation of pre-osteoblastic stromal cells. This activity could help restore the pre-osteoblastic cell population that is known to be depleted in old-age or senile osteoporosis.
Example 22
Identification of RGD and Subtilisin-Like Proprotein Convertase (SPC) Motifs in Rat OCP
[0275]SEQ ID NO: 2 and SEQ ID NO: 34 depict the amino acid sequence of the rat 608 polypeptide. There is an RGD sequence at positions 729-731, and there is a putative subtilisin-like proprotein convertase (SPC) cleavage consensus sequence at positions 735-741.
[0276]The 608 protein was partially cleaved by SPC, in 293HEK cells. This putative peptide also contained the RGD sequence. Many adhesive proteins, present in extracellular matrices and in the blood, contain this tripeptide as their cell recognition site. Therefore, the 608 peptide comprising 1-741 amino acids, or a shorter fragment of the 608 protein containing the RGD sequence, may be a much more effective drug than the 663 amino acid fragment. The RGD and RxxRxxR (viz. R-aa1-aa2-R-aa3-aa4-R, i.e., SPC cleavage site) sequences are present in the human 608 protein sequence but are not present in Adlican or in Adlican-2.
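A scan for the two motifs named above can be written with regular expressions. The following is a minimal sketch that reports 1-based positions, matching the numbering convention used in the text. The function name and the toy peptide are hypothetical; the peptide is not the 608 sequence.

```python
import re

RGD = re.compile(r"RGD")
# R-x-x-R-x-x-R, wrapped in a lookahead so that overlapping sites are reported.
SPC = re.compile(r"(?=(R..R..R))")


def find_motifs(protein: str) -> dict[str, list[tuple[int, int]]]:
    """Return 1-based inclusive (start, end) positions of RGD and RxxRxxR."""
    rgd = [(m.start() + 1, m.end()) for m in RGD.finditer(protein)]
    spc = [(m.start(1) + 1, m.end(1)) for m in SPC.finditer(protein)]
    return {"RGD": rgd, "RxxRxxR": spc}


# Hypothetical toy peptide for demonstration only.
toy = "MKTLRGDSAARSLRAPRKE"
print(find_motifs(toy))
```

Applied to a full-length sequence, the same function would be expected to report any RGD tripeptide and any seven-residue RxxRxxR site, such as those described at positions 729-731 and 735-741 of the rat polypeptide.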
Example 23
Natural Cleavage of Rat OCP
[0277]A polyclonal antibody against the rat 608 fragment comprising amino acid residues 1-312 was prepared by methods well-known in the art. This antibody was used to identify 608 peptides on Western blots. Several 608 sequences were expressed in cells derived from the transiently transfected 293T kidney cell line. The sequences were rat full length 608 polypeptide, rat 608 polypeptide fragment comprising amino acid residues 1-1634, and rat 608 polypeptide fragment comprising amino acid residues 1-663. The antibody identified a peptide of about 90 kDa in all three constructs produced. This peptide was detected by the anti-608 antibody in the conditioned medium of the cells, and not in cell extracts.
[0278]Table: Western blot analysis using polyclonal antibody to rat 608 fragment comprising 1-312 amino acid residues:
TABLE-US-00006
608 amino acid sequence | Expected size | Detected size
1-663 | 80 kDa | 75-100 kDa
1-1634 | 196 kDa | 75-100 kDa (larger than 1-663 aa peptide)
1-2597 (full length) | 311 kDa | 75-100 kDa (larger than 1-663 aa peptide)
1-741 | 89 kDa |
[0279]The 608 full length and the 608 1-1634 aa proteins produced in 293T cells were cleaved and secreted into the medium. The cleaved products appeared to be of identical size. The 608 1-663 aa protein was also secreted into the medium, but appeared to be slightly smaller than the cleaved full length and 1-1634 aa proteins. The expected size of the 608 fragment from 1-741 aa, that is, the putative SPC cleavage product, was approximately 89 kDa.
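The expected sizes in the table above can be related to the number of residues in each construct. The following is a minimal sketch that computes the mass per residue implied by each table entry. How the expected sizes were originally calculated is not stated in the text, so the calculation below is only a consistency check, and the names used are illustrative.

```python
# (construct, number of residues, expected size in kDa) from the table above.
EXPECTED = [
    ("1-663", 663, 80),
    ("1-1634", 1634, 196),
    ("1-2597", 2597, 311),
    ("1-741", 741, 89),
]


def da_per_residue(residues: int, kda: float) -> float:
    """Average mass per residue, in daltons, implied by an expected size."""
    return kda * 1000 / residues


for name, residues, kda in EXPECTED:
    print(name, round(da_per_residue(residues, kda), 1))
```

All four entries come out close to 120 Da per residue, so the expected sizes appear to scale linearly with polypeptide length.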
[0280]In further experiments, mouse calvaria cells cultured in vitro were analyzed by Western blotting with the antibody to the 608 fragment 1-312 aa. No 608-specific band was detected in cell extracts. In the conditioned medium from the cells, a band of approximately 350 kDa was detected by the anti-608 1-312 aa antibody. The size of this band correlates with the protein size expected from the full length 608 protein. This analysis probably indicates that the 608 full-length protein is secreted.
[0281]To summarize, in human embryonic kidney cells, which do not normally express the 608 gene, overexpression of 608 protein results in secretion of a cleaved part of the 608 protein. In mouse calvaria cells, which normally express the 608 gene, the naturally expressed 608 protein is probably secreted uncleaved. One possible explanation of this data is that 608 activity is regulated by proteases that are selectively expressed.
Example 24
Human Adlican as a Candidate for Osteoblast Proliferation and Differentiation
[0282]As discussed in Example 19, Adlican is a recently described protein. The Adlican protein has LRR (Leucine-rich-repeats) and immunoglobulin regions highly similar to those of the OCP protein. The overall homology found between the amino acid residues of the indicated regions in the two human proteins is as shown in Example 19.
[0283]The deduced Adlican protein comprises the following features:
[0284]a. A cleavable, well-defined N-terminal signal peptide at 1-26 aa,
[0285]b. A LRR region (26-205 aa). This region can be divided into N-terminal and C-terminal domains of LRR (aa 26-59 and 217-276, respectively). Between them, there are six LRR (aa 55-77, 78-101, 102-125, 126-149, 150-173, 182-205).
[0286]c. Twelve immunoglobulin C-2 type repeats at amino acid positions 492-562, 590-658, 1866-1935, 1963-2032, 2060-2129, 2159-2228, 2256-2331, 2359-2425, 2457-2525, 2555-2623, 2650-2718, 2746-2817. Thus, two Ig-like repeats are found immediately downstream of a LRR region, while the remaining 10 repeats are clustered at the protein's C-terminus, as in OCP.
[0287]d. Four putative nuclear localization domains (NLS) at amino acids 676-682, 1146-1165, 1230-1236, and 1747-1763.
[0288]Therefore, we have determined that Adlican is a good candidate as an inducer of osteoblast proliferation and differentiation.
[0289]In order to determine if human Adlican expression causes proliferation and differentiation of osteoblasts and chondrocytes, the expression product of Adlican, or cells or vectors expressing Adlican, are monitored to determine if they cause cells to selectively proliferate and differentiate and thereby increase or alter bone density. Detecting levels of Adlican mRNA or expression and comparing them to "normal" non-osteopathic levels will allow screening and detection of individuals who may be at risk for developing osteoporosis or lower levels of osteoblasts and chondrocytes.
Example 25
The Deduced Adlican-2 Protein
[0290]The deduced Adlican-2 protein (Genomic location: Yq11.21) was generated following the alignment (shown in FIG. 41) comparing Adlican-2 predicted sequences (FIGS. 39 and 40) and the equivalent human Adlican amino acid sequences (FIG. 27). This DNA molecule and the encoded polypeptide are novel molecules and constitute an integral part of this invention.
[0291]A Y chromosome BAC clone (gi 8748884) shows 93% homology to human Adlican. Two mRNA sequences 100% homologous to this BAC clone were submitted to GenBank (gi 14719942, and gi 14719940). However, the sequence of these clones is not based on cDNA sequences, but on human genomic data, and they cover a short stretch of the nucleotide sequence in the C-terminal Ig region. We performed upstream nucleotide and deduced amino acid sequencing. The sequence alignment of Adlican and Adlican-2 exists along the entire Adlican sequence with one possible exception. Alignment along aa 66-215 of Adlican may be missing from the Adlican-2 molecule. This is the area of the 6 LRR (leucine-rich repeats). Although the nucleotides encoding the 6 LRR region have not yet been observed, their existence has not been definitely ruled out.
[0292]The invention therefore encompasses the use of Adlican-2 in any manner described herein for the OCP protein. No functions or uses have been disclosed previously for Adlican-2. The proposed uses include use of Adlican-2, or a functional portion thereof, for preventing, treating or controlling osteoporosis, or for fracture healing, bone elongation or treatment of osteopenia, periodontosis, bone fractures or low bone density or other factors causing or contributing to osteoporosis or symptoms thereof or other conditions involving mechanical stress or lack thereof in a subject. As an indirect product (inhibition by either chemicals or by neutralizing mAbs), Adlican-2 can be used for treatment and/or prevention of osteoarthritis, osteopetrosis, and osteosclerosis. The Adlican-2 gene, or functional portions thereof, can likewise be used for any purpose described herein for an OCP gene. Compositions comprising the Adlican-2 gene, Adlican-2 or antibodies specific for Adlican-2 and physiologically acceptable excipients are likewise encompassed by the invention. Such excipients are known in the art and include saline, phosphate buffered saline and Ringer's solutions.
Example 26
The Physical Sequence of the Human OCP
[0293]Obtaining the sequence of human OCP was a difficult task. Initially, several attempts were made to obtain human OCP cDNA by RT-PCR amplification using primers from the rat OCP coding sequence, but these efforts failed. Thereafter, a predicted sequence was created as described in Examples 5 and 16 by bioinformatic analysis of the human genome. Then primers specifically designed according to the predicted sequence were used to amplify human cDNA. This proved difficult, due to the large size of the gene and also to the low abundance of OCP mRNA in human tissue and the unavailability of such tissue. Eventually, cell line U2OS (human osteosarcoma cell line) was found to be a suitable source for OCP mRNA. It was also decided to clone the DNA in fragments.
[0294]The first section of the gene to be cloned was a small fragment corresponding to the first 663 amino acids, creating the plasmid described in Example 14 (ATCC Number PTA-3638), and giving a corrected predicted sequence.
[0295]Since the complete human OCP gene could not be amplified and cloned as one entity, three large overlapping fragments were amplified, spanning the complete ORF. These PCR fragments were sequenced and the physical sequence of the human OCP was determined accordingly. The physical sequence was found to contain inserts relative to the predicted sequence. The overlapping PCR fragments were subsequently cloned in three separate plasmids (described below) as continuous clones (overlapping regions were removed).
[0296]FIG. 42 shows the physical DNA sequence of the coding region-ORF of human OCP (SEQ ID NO: 31) having 7872 base pairs, including the stop codon. The sequence contains a silent mutation (C>T transition) at position 6729 compared to the predicted sequence of human OCP ORF. This transition does not change the identity of the encoded amino acid residue.
[0297]FIG. 43 shows the predicted amino acid sequence corresponding to the physical DNA sequence of the coding region-ORF of human OCP (SEQ ID NO:32), having 2623 amino acids.
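The two lengths quoted for SEQ ID NO:31 (7872 base pairs, including the stop codon) and SEQ ID NO:32 (2623 amino acids) can be cross-checked with simple arithmetic. The following Python sketch is an illustration only and is not part of the deposited material; the function name is hypothetical.

```python
def orf_to_protein_length(orf_nt: int, stop_codon_included: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_nt nucleotides."""
    # The stop codon occupies 3 nucleotides but encodes no amino acid.
    coding_nt = orf_nt - 3 if stop_codon_included else orf_nt
    if coding_nt % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    return coding_nt // 3


# Human OCP ORF length as stated for SEQ ID NO:31.
human_ocp_aa = orf_to_protein_length(7872)
print(human_ocp_aa)  # 2623, the length stated for SEQ ID NO:32
```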
[0298]The three plasmids harboring the 5' fragment (A), middle fragment (B) and 3' fragment (C) are depicted in FIGS. 34, 36 and 38 respectively, and were deposited on Nov. 21, 2001 under the terms of the Budapest Treaty with the American Type Culture Collection (ATCC), P.O. Box 1549, Manassas, Va. 20108, USA, under ATCC accession numbers PTA-3878, PTA-3876 and PTA-3877 respectively.
[0299]FIG. 34 shows the physical sequence of the 5' fragment (A) cloned into pBluescript KS to NotI (5') and HindIII (3') sites. Fragment A is comprised of the 5' region (2440 bp) of the complete human OCP sequence and includes, in addition, at the 5' end, 21 nucleotides of the β-actin "Kozak" region (nucleotides 9-29) followed by the ATG initiation codon; the NotI (5') and HindIII (3') sites are located at nucleotides 1-8 and 2464-2469 respectively (SEQ ID NO:26).
[0300]FIG. 36 shows the physical sequence of the middle fragment (B) cloned into pBluescript KS to HindIII (5') and SalI (3') sites. Fragment B is comprised of the central region (3518 bp) of the complete human OCP sequence; the HindIII (5') and SalI (3') sites are located at nucleotides 1-6 and 3513-3518 respectively (SEQ ID NO:27).
[0301]FIG. 38 shows the physical sequence of the 3' fragment (C) cloned into pMCS SV(A) to SalI (5') and SpeI (3') sites. Fragment C is comprised of the 3' region (1923 bp, not including the 3 bp stop codon) of the complete human OCP sequence and includes at the 3' end, 18 nucleotides coding for 6 Histidine residues, nucleotides 1924-1941, followed by the TGA stop codon; the SalI (5') and SpeI (3') sites are located at nucleotides 1-6 and 1945-1950 respectively (SEQ ID NO:28).
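The nucleotide coordinates quoted for the three constructs are mutually consistent, which can be shown by laying the features end to end. The Python sketch below is illustrative only. It assumes that the 3' HindIII site of fragment A, both sites of fragment B and the 5' SalI site of fragment C lie inside the OCP-derived sequence (as the stated lengths imply), and the feature names and helper functions are hypothetical.

```python
# Features of each construct, 5' to 3', as (name, length in nucleotides).
# "OCP" lengths are the values stated in the text for each fragment.
CONSTRUCTS = {
    "A": [("NotI", 8), ("Kozak", 21), ("OCP", 2440)],
    "B": [("OCP", 3518)],
    "C": [("OCP", 1923), ("6xHis", 18), ("stop", 3), ("SpeI", 6)],
}


def feature_spans(features):
    """Turn consecutive feature lengths into 1-based inclusive (start, end) spans."""
    spans = {}
    pos = 1
    for name, length in features:
        spans[name] = (pos, pos + length - 1)
        pos += length
    return spans


def first_n(span, n):
    """Span covering the first n nucleotides of a feature."""
    start, _ = span
    return (start, start + n - 1)


def last_n(span, n):
    """Span covering the last n nucleotides of a feature."""
    _, end = span
    return (end - n + 1, end)


spans = {name: feature_spans(features) for name, features in CONSTRUCTS.items()}

print(spans["A"]["Kozak"])             # (9, 29)
print(last_n(spans["A"]["OCP"], 6))    # (2464, 2469): HindIII site in fragment A
print(last_n(spans["B"]["OCP"], 6))    # (3513, 3518): SalI site in fragment B
print(spans["C"]["6xHis"])             # (1924, 1941)
print(spans["C"]["SpeI"])              # (1945, 1950)
```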
[0302]Additionally, as discussed above, cloned fragment C contains a silent mutation (C>T transition) at nucleotide 783 compared to the predicted sequence of human OCP ORF; this transition does not change the identity of the encoded amino acid residue.
[0303]Note that if the number of OCP-encoding nucleotides in the three separate clones is added (viz., 2440+3518+1923), nine (9) more nucleotides are obtained than in the single complete sequence (7881 nucleotides vs 7872 nucleotides). This discrepancy is due to three reasons:
[0304]1. The restriction site that appears at the 3' end of fragment A and at the 5' end of fragment B is counted twice, once in each fragment, giving an extra 6 nucleotides.
[0305]2. The restriction site that appears at the 3' end of fragment B and at the 5' end of fragment C is counted twice, once in each fragment, giving an extra 6 nucleotides.
[0306]3. The sequence of fragment C does not include the 3 nucleotide stop codon at the 3' end, since it is interrupted by 18 nucleotides coding for 6 Histidine residues.
[0307]Therefore the difference is 6+6-3=9, which exactly explains the discrepancy mentioned above.
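The bookkeeping above can be reproduced in a few lines, and the same overlap model also links the two coordinates given for the silent mutation (position 783 in fragment C and position 6729 in the complete ORF). The Python sketch below is illustrative only; the variable and function names are hypothetical, and it assumes that adjacent fragments overlap by exactly the 6-nucleotide restriction site, as described in the three points above.

```python
FRAGMENT_OCP_NT = {"A": 2440, "B": 3518, "C": 1923}  # OCP nucleotides per clone
SHARED_SITE_NT = 6   # HindIII (A/B junction) and SalI (B/C junction), 6 nt each
STOP_CODON_NT = 3    # not counted in fragment C's 1923 nt

# Naive sum counts each shared restriction site twice and omits the stop codon.
naive_sum = sum(FRAGMENT_OCP_NT.values())
orf_length = naive_sum - 2 * SHARED_SITE_NT + STOP_CODON_NT
discrepancy = naive_sum - orf_length

# ORF coordinate (1-based) of the first nucleotide of fragments B and C.
start_b = FRAGMENT_OCP_NT["A"] - SHARED_SITE_NT + 1
start_c = start_b + FRAGMENT_OCP_NT["B"] - SHARED_SITE_NT


def fragment_c_to_orf(pos_in_c: int) -> int:
    """Map a 1-based position in fragment C to a 1-based position in the ORF."""
    return start_c + pos_in_c - 1


print(naive_sum, orf_length, discrepancy)  # 7881 7872 9
print(fragment_c_to_orf(783))              # 6729
```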
Example 27
608 Knockout Bone Phenotypes in Females with and without Ovariectomy
[0308]Introduction
[0309]Knockout (KO) mice deleted of the 608 gene were prepared by the method of Wattler et al., BioTechniques 26:1150-1160, 1999. 608 KO mice were compared to age-, sex- and treatment-matched wild type (WT) mice to test the effect of 608 absence on bone parameters. Bone parameters of KO and WT were compared in untreated 3- and 4-month-old females. KO and WT bone parameters were also compared in 3-month-old female mice 5 weeks post ovariectomy (post-menopausal osteoporosis model).
[0310]The bone-related phenotypes were evaluated using two analyses: peripheral quantitative computed tomography (pQCT) of femur and tibia (Rosen H N et al., Calcif. Tissue Int. 57:35-39, 1995) and serum alkaline phosphatase (ALP) (Farley J R et al., J Bone Miner Res 9:497-508, 1994). pQCT scanning is a 2D X-ray analysis that measures bone mineral density (BMD), bone mineral content (BMC), and bone geometric parameters. Serum ALP is a biochemical marker of bone formation.
[0311]Results
[0312]A. Untreated KO Females
[0313]It was found that the serum marker (serum ALP) of bone formation was significantly increased in 3-month-old 608 KO mice. These results are depicted in FIG. 46.
[0314]pQCT scanning was performed for two groups of mice. pQCT of 3-month-old untreated (sham operated) female mice gave parameters that were significantly different between WT and KO by two-way ANOVA (p value < 0.05), as shown in the Table below.
TABLE-US-00007
pQCT Parameter                       WT Average   KO Average   P Value    % Increase
Femur Metaphysis Total Area          3.572        4.070        P < 0.05   14
Tibia Diaphysis Cortical Thickness   0.296        0.314        P < 0.05   6.1
Femur Diaphysis Cortical Area        0.990        1.114        P < 0.05   12.5
[0315]Similarly, pQCT of 4-month-old untreated female mice also gave parameters that were significantly different between WT and KO by one-way ANOVA (p value < 0.05), as shown in the following Table.
TABLE-US-00008
pQCT Parameter                  WT Average   KO Average   P Value    % Increase
Tibia Metaphysis Cortical BMD   833.4        911.6        0.001803   9.4
Femur Metaphysis Total Area     3.38         3.8          0.004296   12.4
Tibia Diaphysis Cortical BMD    1150.4       1192         0.015211   3.6
Femur Diaphysis Cortical BMD    1202.4       1241.5       0.015951   3.3
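The "% Increase" column in the two pQCT tables is consistent with the relative difference between the KO and WT averages, i.e. (KO - WT) / WT x 100. The Python sketch below recomputes it from the tabulated averages; it is an illustration only, the names are hypothetical, and the 0.1 percentage-point tolerance is an assumption chosen to allow for rounding in the reported values.

```python
# (parameter, WT average, KO average, reported % increase) from the two tables.
ROWS = [
    ("3 mo femur metaphysis total area",        3.572,  4.070,  14.0),
    ("3 mo tibia diaphysis cortical thickness", 0.296,  0.314,  6.1),
    ("3 mo femur diaphysis cortical area",      0.990,  1.114,  12.5),
    ("4 mo tibia metaphysis cortical BMD",      833.4,  911.6,  9.4),
    ("4 mo femur metaphysis total area",        3.38,   3.8,    12.4),
    ("4 mo tibia diaphysis cortical BMD",       1150.4, 1192.0, 3.6),
    ("4 mo femur diaphysis cortical BMD",       1202.4, 1241.5, 3.3),
]


def percent_increase(wt: float, ko: float) -> float:
    """Relative increase of the KO average over the WT average, in percent."""
    return (ko - wt) / wt * 100.0


for label, wt, ko, reported in ROWS:
    computed = percent_increase(wt, ko)
    agrees = abs(computed - reported) < 0.1  # tolerance for rounding (assumption)
    print(f"{label}: computed {computed:.2f}%, reported {reported}%, agrees={agrees}")
```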
[0316]In summary, the bone related phenotype of untreated 608 KO females is as follows:
[0317]At 3 months old, serum ALP is significantly increased in 608 KO mice. This may indicate that bone metabolism is different due to lack of the 608 gene. At this age, the significant increases are in bone geometric parameters. Slightly larger bone diameter and increased cortical thickness could affect bone strength.
[0318]At 4 months old, there is also a significant increase in cortical BMD of both femur and tibia. The incidence of fracture is closely related to BMD. Patients who sustain fractures have significantly decreased BMD.
[0319]In all parameters that showed a significant difference, the 608 KO values were higher than the WT values. This may implicate an inhibitory role for the 608 gene in bone metabolism.
B. Ovariectomized KO Females
[0320]None of the parameters that were significantly increased in untreated females showed differences in ovariectomized females. Loss of tibia metaphysis total BMD due to ovariectomy may be smaller in 608 KO mice. However, this difference was not found to be significant. Increasing the number of animals in each group could improve the statistical results.
CONCLUSIONS
[0321]The bones of KO mice appeared to have some basic anatomical differences compared to the bones of WT mice. This observation is based on trends seen in parameters reflecting bone geometry, such as total slice area, periosteal circumference, cortical area and thickness. Compared to untreated WT mice, an increase in the femur metaphysis area was observed in KO animals, both in 4 month-old and 3 month-old mice. Distal femur total BMD was notably unaffected by genotype despite the differences in bone size. Similar increases in geometric parameters were noted in KO mice compared to WT at the femur diaphysis (cortical area) and at the tibia diaphysis (cortical thickness) in 3 month-old mice. At 4 months old there is also a significant increase in cortical BMD of both femur and tibia.
[0322]Consistent with these effects on bone mass, biochemical markers of bone turnover were increased in KO mice, relative to the WT controls, suggesting that bone metabolism is different. Geometric parameters such as a slightly larger bone diameter and increased cortical thickness could contribute to bone strength.
[0323]The effects on bone mass and biochemical markers of bone turnover noted in the KO mice appear to be indicative of a protective effect of the KO phenotype on bone loss following ovariectomy, although the effects were small. A trend to a genotype-related prevention of bone loss in the distal femur metaphysis relative to the ovariectomized controls was observed in KO animals. A slight partial prevention of bone loss relative to the ovariectomized control group was observed in KO mice at the endocortical surface of femur and tibia metaphysis, although the effects were not marked.
[0324]In conclusion, the effects on bone mass and biochemical markers of bone turnover noted in the KO mice appear to be indicative of a protective effect of the KO genotype on bone loss following ovariectomy. Bone metabolism and bone geometry could be different in KO mice.
[0325]The following hypothesis may correlate the in vitro data in the previous Examples with this in vivo data from the KO mice analyses:
[0326]The function of the 608 protein could be to promote proliferation of the undifferentiated osteoprogenitor cell population. This hypothesis is based on the proliferative effect of the 608 1-663aa polypeptide on a mouse bone marrow cell line, as shown in Example 21. In the absence of this protein the balance between proliferation and differentiation of osteoprogenitors is changed in favor of differentiation and therefore the increased bone parameters are obtained at a young age. It could be that in aged mice this change in balance causes a decrease in bone parameters due to the normal decrease in osteoprogenitors that occurs with aging. If this hypothesis is correct, an intermittent administration of the 608 protein or fragments of it could be used as a treatment for osteoporosis. Administration of the 608 polypeptide would cause proliferation of osteoprogenitors. When the 608 level is allowed to decrease to normal levels, differentiation could take place.
[0327]An example of intermittent treatment is daily administration, preferably by injection, preferably subcutaneous, as opposed to continuous administration, e.g. by infusion. Other examples may be administration every other day, or every few days, or even once a week or once a month. In the case of parathyroid hormone (1-34 amino acid), daily subcutaneous injections of 20-40 μg were considered intermittent administration as opposed to continuous infusions; see Neer R. M. et al. 2001, The New England Journal of Medicine 344:1434-1441, "Effect of parathyroid hormone (1-34) on fractures and bone mineral density in postmenopausal women with osteoporosis".
[0328]Alternatively the 663aa fragment may act as an inhibitor of 608 function, as discussed in Example 21.
Sequence CWU
1
3718883DNARattus speciesmisc_feature(1)..(8883)'n' can be any nucleotide
'a', 'c', 'g' or 't'. 1cgagagacga cagaaggtta cggctgcgag aagacgacag
aagggtccag aaaaaggaaa 60gtgctggagg ggagtgggga caaaagcagc gaccaagtga
atgtcacttc agtgactgag 120gccaggcaaa acgcgcggga aggattttgt gtagcttggg
accctttcat agacactgat 180gacacgttta cgcaaaatag aaatttgagg agaaacgcct
gggccttcgg aaaggagtga 240ttgattagta cttgcaagtt taggtgactt taaggagaac
taactaatgt atactattga 300gggaggagga agagcattac agagtttcca gcagcagcag
gaaagctttg gttaatttgg 360aaatggatga tagcattaaa ataacagaag cgcctccagg
tctctgaagc ttcagtcccc 420cagctgaaag ccagaaaaga ctaagcccac taagcctttt
gatccctttg gaagcaaaga 480actttccttc cctggggtga agactctcct cagaagattt
cctgtctctg cctatgttac 540aagaggaatc aaaaccaaga cagaagagct caggatgcag
gtgagaggca gggaagtcag 600cggcttgttg atctccctca ctgctgtctg cctggtggtc
acccctggga gcagggcctg 660tcctcgccgc tgtgcctgct atgtgcccac agaggtgcac
tgtacatttc ggtacctgac 720ctccatccca gatggcatcc cggccaatgt ggaacgaata
aatttaggat ataacagcct 780tactagattg acagaaaacg actttgatgg cctgagcaaa
ctggagttac tcatgctgca 840cagtaatggc attcacagag tcagtgacaa gaccttctcg
ggcttgcagt ccttgcaggt 900cttaaaaatg agctataaca aagtccaaat cattcggaag
gatactttct acggactcgg 960gagcttggtc cggttgcacc tggatcacaa caacattgaa
ttcatcaacc ctgaggcctt 1020ttatggactt acctcgctcc gcttggtaca tttagaagga
aaccggctca caaagctcca 1080tccagacaca tttgtctcat taagctatct ccagatattt
aaaacctctt tcattaagta 1140cctgttcttg tctgataact tcctgacctc cctcccaaaa
gaaatggtct cctacatgcc 1200aaacctagaa agcctgtatt tgcatggaaa cccatggacc
tgtgactgcc atttaaagtg 1260gttgtctgag tggatgcagg gaaacccaga tataataaaa
tgcaagaaag acagaagctc 1320ttccagtcct cagcaatgtc ccctttgcat gaaccccagg
atctctaaag gcagaccctt 1380tgctatggta ccatctggag ctttcctatg tacaaagcca
accattgatc catcactgaa 1440gtcaaagagc ctggttactc aggaggacaa tggatctgcc
tccacctcac ctcaagattt 1500catagaaccc tttggctcct tgtctttgaa catgacanan
ntntctggaa ataaggccga 1560catggtctgt agtatccaaa agccatcaag gacatcacca
actgcattca ctgaagaaaa 1620tgactacatc atgctaaatg cgtcattttc cacaaatctt
gtgtgcagtg tagattataa 1680tcacatccag ccagtgtggc aacttctggc tttatacagt
gactctcctc tgatactaga 1740aaggaagccc cagcttaccg agactccttc actgtcttct
agatataaac aggtggctct 1800taggcctgaa gacattttta ccagcataga ggctgatgtc
agagcagacc ctttttggtt 1860ccaacaagaa aaaattgtct tgcagctgaa cagaactgcc
accacactta gcacattaca 1920gatccagttt tccactgatg ctcaaatcgc tttaccaagg
gcggagatga gagcggagag 1980actcaaatgg accatgatcc tgatgatgaa caatcccaaa
ctggaacgca ctgtcctggt 2040tggcggcact attgccctga gctgtccagg caaaggcgac
ccttcacctc acttggaatg 2100gcttctagct gatgggagta aagtgagagc cccttacgtt
agcgaggatg ggcgaatcct 2160aatagacaaa aatgggaagt tggaactgca gatggctgac
agctttgatg caggtcttta 2220ccactgcata agcaccaatg atgcagatgc ggatgttctc
acatacagga taactgtggt 2280agagccctat ggagaaagca cacatgacag tggagtccag
cacacagtgg ttacgggtga 2340gacgctcgac cttccatgcc tttccacggg tgttccagat
gcttctatta gctggattct 2400tccagggaac actgtgttct ctcagccatc aagagacagg
caaattctta acaatgggac 2460cttaagaata ttacaggtta cgccaaaaga tcaaggtcat
taccaatgtg tggctgccaa 2520cccatcaggg gccgactttt ccagttttaa agtttcagtt
caaaagaaag gccaaaggat 2580ggttgagcat gacagggagg caggtggatc tggacttgga
gaacccaact ccagtgtttc 2640ccttaagcag ccagcatctt tgaaactctc tgcatcagct
ttgacagggt cagaggctgg 2700aaaacaagtc tccggtgtac ataggaagaa caaacataga
gacttaatac atcggcggcg 2760tggggattcc acgctccggc gattcaggga gcataggagg
cagctccctc tctctgctcg 2820gagaattgac ccgcaacgct gggcagcact tctagaaaaa
gccaaaaaga attctgtgcc 2880aaaaaagcaa gaaaatacca cagtaaagcc agtgccactg
gctgttcccc tcgtggaact 2940cactgacgag gaaaaggatg cctctggcat gattcctcca
gatgaagaat tcatggttct 3000gaaaactaag gcttctggtg tcccaggaag gtcaccaact
gctgactctg gaccagtaaa 3060tcatggtttt atgacgagta tagcttctgg cacagaagtc
tcaactgtga atccacaaac 3120actacaatct gagcaccttc ctgatttcaa attatttagt
gtaacaaacg gtacagctgt 3180gacaaagagt atgaacccat ccatagcaag caaaatagaa
gatacaacca accaaaaccc 3240aatcattatc tttccatcag tagctgaaat tcgagattct
gctcaggcag gaagagcatc 3300ttcccaaagt gcacaccctg taacaggggg aaacatggct
acctatggcc ataccaacac 3360atatagtagc tttaccagca aagccagtac agtcttgcag
ccaataaatc caacagaaag 3420ttatggacct cagataccta ttacaggagt cagcagacct
agcagtagtg acatctcttc 3480tcacactact gcagacccta gcttctccag tcacccttca
ggttcacaca ccactgcctc 3540gtctttattt cacattccta gaaacaacaa tacaggtaac
ttccccttgt ccaggcactt 3600gggaagagag aggacaattt ggagcagagg gagagttaaa
aacccacata gaaccccagt 3660tctccgacgg catagacaca ggactgtgag gccagcaatc
aagggacctg ctaacaaaaa 3720tgtgagccaa gttccagcca cagagtaccc tgggatgtgc
cacacatgtc cttccgcaga 3780ggggctcaca gtggctactg cagcactgtc agttccaagt
tcatcccaca gtgccctccc 3840caaaactaat aatgttgggg tcatagcaga agagtctacc
actgtggtca agaaaccact 3900gttactattt aaggacaaac aaaatgtaga tattgagata
ataacaacca ctacaaaata 3960ttccggaggg gaaagtaacc acgtgattcc tacggaagca
agcatgactt ctgctccaac 4020atctgtatcc ctggggaaat ctcctgtaga caatagtggt
cacctgagca tgcctgggac 4080catccaaact gggaaagatt cagtggaaac aacaccactt
cccagccccc tcagcacacc 4140ctcaatacca acaagcacaa aattctcaaa gaggaaaact
cccttgcacc agatctttgt 4200aaataaccag aagaaggagg ggatgttaaa gaatccatat
caattcggtt tacaaaagaa 4260cccagccgca aagcttccca aaatagctcc tcttttaccc
acaggtcaga gttccccctc 4320agattctaca actctcttga caagtccgcc accagctctg
tctacaacaa tggctgccac 4380tcagaacaag ggcactgaag tagtatcagg tgccagaagt
ctctcagcag ggaagaagca 4440gcccttcacc aactcctctc cagtgcttcc tagcaccata
agcaagagat ctaatacatt 4500aaacttcttg tcaacggaaa cccccacagt gacaagtcct
actgctactg catctgtcat 4560tatgtctgaa acccaacgaa caagatccaa agaagcaaaa
gaccaaataa aggggcctcg 4620gaagaacaga aacaacgcaa acaccacccc caggcaggtt
tctggctata gtgcatactc 4680agctctaaca acagctgata cccccttggc tttcagtcat
tccccacgac aagatgatgg 4740tggaaatgta agtgcagttg cttatcactc aacaacctct
cttctggcca taactgaact 4800gtttgagaag tacacccaga ctttgggaaa tacaacagct
ttggaaacaa cgttgttgag 4860caaatcacag gagagtacca cagtgaaaag agcctcagac
acaccaccac cactcctcag 4920cagtggggcg cccccagtgc ccactccttc cccacctcct
tttactaagg gtgtggttac 4980agacagcaaa gtcacatcag ctttccagat gacgtcaaat
agagtggtca ccatatatga 5040atcttcaagg cacaatacag atctgcagca accctcagca
gaggctagcc ccaatcctga 5100gatcataact ggaaccactg actctccctc taatctgttt
ccatccactt ctgtgccagc 5160actaagggta gataaaccac agaattctaa atggaagccc
tctccctggc cagaacacaa 5220atatcagctc aagtcatact ccgaaaccat tgagaagggc
aaaaggccag cagtaagcat 5280gtccccccac ctcagccttc cagaggccag cactcatgcc
tcacactgga atacacagaa 5340gcatgcagaa aagagtgttt ttgataagaa acctggtcaa
aacccaactt ccaaacatct 5400gccttacgtc tctctaccta agactctatt gaaaaagcca
agaataattg gaggaaaggc 5460tgcaagcttt acagttccag ctaattcaga cgtttttctt
ccttgtgagg ctgttggaga 5520cccactgccc atcatccact ggaccagagt ttcatcagga
nttgaaatat cccaagggac 5580acagaaaagc cggttccacg tgcttcccaa tggcaccttg
tccatccaga gggtcagtat 5640tcaggaccgt ggacagtacc tgtgctctgc atttaatcca
ctgggcgtag accattttca 5700tgtctctttg tctgtggttt tttacccggc aaggattttg
gacagacatg tcaaggagat 5760cacagttcac tttggaagta ctgtggaact aaagtgcaga
gtggagggta tgccgaggcc 5820tacggtttcc tggatacttg caaaccaaac ggtggtctca
gaaacggcca agggaagcag 5880aaaggtctgg gtaacacctg atggaacatt gatcatctat
aatctgagtc tttatgatcg 5940tggtttttac aagtgtgtgg ccagcaaccc atctggccag
gattcactgt tggttaagat 6000acaagtcatc acagctcccc ctgtcattat agagcaaaag
aggcaagcca tcgttggggt 6060tttaggtgga agtttgaaac tgccctgcac tgcaaaagga
actccccagc ctagtgttca 6120ctgggtcctt tatgatggga ctgaactaaa accattgcag
ttgactcatt ccagattttt 6180cttgtatcca aatggaactc tgtatataag aagcatcgct
ccttcagtga ggggcactta 6240tgagtgcatt gccaccagct cctcaggctc agagagaagg
gtagtgattc ttactgtgga 6300agagggagag acaatcccca ggatagaaac tgcctctcag
aaatggactg aggtgaattt 6360gggtgagaaa ttactactga actgctcagc tactggggat
ccaaagccta gaataatctg 6420gaggctgcca tccaaggctg tcatcgacca gtggcacaga
atgggcagcc gaatccacgt 6480ctacccaaat ggatccttgg tggttgggtc agtgacggaa
aaagacgctg gtgactactt 6540atgtgtggca agaaacaaaa tgggagatga cctagtcctg
atgcatgtcc gcctgagatt 6600gacacctgcc aaaattgaac agaagcagta ttttaagaag
caagtgctcc atgggaaaga 6660tttccaagtt gactgcaagg cctctggctc ccctgtgcct
gaggtatcct ggagtttgcc 6720tgatgggaca gtgctcaaca atgtagccca agctgatgac
agtggctata ggaccaagag 6780gtacaccctt ttccacaatg gaaccttgta tttcaacaac
gttgggatgg cagaggaagg 6840agattatatc tgctctgccc agaacacctt agggaaagat
gagatgaaag tccacctaac 6900agttctaaca gccatcccac ggataaggca aagctacaag
accaccatga ggctcagggc 6960tggagaaaca gctgtccttg actgcgaggt cactggggaa
ccgaagccca atgtattttg 7020gttgctgcct tccaacaatg tcatttcatt ctccaatgac
aggttcacat ttcatgccaa 7080tagaactttg tccatccata aagtgaaacc acttgactct
ggggactatg tgtgcgtagc 7140tcagaatcct agtggggatg acactaagac atacaaactg
gacattgtct ctaaacctcc 7200attaatcaat ggcctgtatg caaacaagac tgttattaaa
gccacagcca ttcggcactc 7260caaaaaatac tttgactgca gagcagatgg gatcccatct
tcccaggtca cgtggattat 7320gccaggcaat attttcctcc cagctccata ctttggaagc
agagtcacgg tccatccaaa 7380tggaaccttg gagatgagga acatccggct ttctgactct
gcggacttca cctgtgtggt 7440tcggagcgag ggaggagaga gtgtgttggt agtgcagtta
gaagtcctag aaatgctgag 7500aagaccaaca ttcagaaacc cattcaacga aaaagtcatc
gcccaagctg gcaagcccgt 7560agcactgaac tgctctgtgg atgggaaccc cccacctgaa
attacctgga tcttacctga 7620cggcacacag tttgctaaca gaccacacaa ttccccgtat
ctgatggcag gcaatggctc 7680tctcatcctt tacaaagcaa ctcggaacaa gtcagggaag
tatcgctgtg cagccaggaa 7740taaggttggc tacatcgaga aactcatcct gttagagatt
gggcagaagc cagtcattct 7800gacatacgaa ccagggatgg tgaagagcgt cagtggggaa
ccgttatcac tgcattgtgt 7860gtctgatggg atccccaagc caaatgtcaa gtggactaca
ccgggtggcc atgtaatcga 7920caggcctcaa gtggatggaa aatacatact gcatgaaaat
ggcacgctgg tcatcaaagc 7980aacaacagct cacgaccaag gaaattatat ctgtagggct
caaaacagtg ttggccaggc 8040agttattagc gtgtcagtga tggttgtggc ctaccctccc
cgaatcataa actacctacc 8100caggaacatg ctcaggagga caggggaagc catgcagctc
cactgtgtgg ccttgggaat 8160ccccaagcca aaagtcacct gggagacgcc aagacactcc
ctgctctcaa aagcaacagc 8220aagaaaaccc catagaagtg agatgcttca cccacaaggt
acgctggtca ttcagaatct 8280ccaaacctcg gattccggag tctataagtg cagagctcag
aacctacttg ggactgatta 8340cgcaacaact tacatccagg tactctgaca ggaaggggga
gactaaaatt caacagaagt 8400ccacatccac agggtttatt ttttggaaga agtttaatca
aaggcagcca taggcatgta 8460aatgagtctg aatacattta cagtattaaa tttacaatgg
acatgcgatg agacttgtaa 8520atgaaagcat tgtgaactga aaccgagtct ctgtggatct
caaagcaaac tcttaactta 8580aggcactttg attttgccaa caaataataa caaacattaa
gagaaaaaaa tgatccacta 8640cgaaataaca aacggctaat gcacctgaat tctcagtaaa
aagacctttc tctcgctaac 8700agttgccagc tgcctcgtgt ctgtttccta ccaatgtcac
aaacatcgca cacagggtga 8760atggagtcaa cgggaaagat taagtttgcg gtctgtgtaa
atctcaatgt acaaatattc 8820tgtcnctggt ttataaacat tttgataaaa ccgaaaaaaa
aaaaaaaaaa aaaaaaaaaa 8880aaa
888322597PRTRattus
speciesmisc_feature(1)..(2597)'x' can be any amino acid 2Met Gln Val Arg
Gly Arg Glu Val Ser Gly Leu Leu Ile Ser Leu Thr1 5
10 15Ala Val Cys Leu Val Val Thr Pro Gly Ser
Arg Ala Cys Pro Arg Arg 20 25
30Cys Ala Cys Tyr Val Pro Thr Glu Val His Cys Thr Phe Arg Tyr Leu
35 40 45Thr Ser Ile Pro Asp Gly Ile Pro
Ala Asn Val Glu Arg Ile Asn Leu 50 55
60Gly Tyr Asn Ser Leu Thr Arg Leu Thr Glu Asn Asp Phe Asp Gly Leu65
70 75 80Ser Lys Leu Glu Leu
Leu Met Leu His Ser Asn Gly Ile His Arg Val 85
90 95Ser Asp Lys Thr Phe Ser Gly Leu Gln Ser Leu
Gln Val Leu Lys Met 100 105
110Ser Tyr Asn Lys Val Gln Ile Ile Arg Lys Asp Thr Phe Tyr Gly Leu
115 120 125Gly Ser Leu Val Arg Leu His
Leu Asp His Asn Asn Ile Glu Phe Ile 130 135
140Asn Pro Glu Ala Phe Tyr Gly Leu Thr Ser Leu Arg Leu Val His
Leu145 150 155 160Glu Gly
Asn Arg Leu Thr Lys Leu His Pro Asp Thr Phe Val Ser Leu
165 170 175Ser Tyr Leu Gln Ile Phe Lys
Thr Ser Phe Ile Lys Tyr Leu Phe Leu 180 185
190Ser Asp Asn Phe Leu Thr Ser Leu Pro Lys Glu Met Val Ser
Tyr Met 195 200 205Pro Asn Leu Glu
Ser Leu Tyr Leu His Gly Asn Pro Trp Thr Cys Asp 210
215 220Cys His Leu Lys Trp Leu Ser Glu Trp Met Gln Gly
Asn Pro Asp Ile225 230 235
240Ile Lys Cys Lys Lys Asp Arg Ser Ser Ser Ser Pro Gln Gln Cys Pro
245 250 255Leu Cys Met Asn Pro
Arg Ile Ser Lys Gly Arg Pro Phe Ala Met Val 260
265 270Pro Ser Gly Ala Phe Leu Cys Thr Lys Pro Thr Ile
Asp Pro Ser Leu 275 280 285Lys Ser
Lys Ser Leu Val Thr Gln Glu Asp Asn Gly Ser Ala Ser Thr 290
295 300Ser Pro Gln Asp Phe Ile Glu Pro Phe Gly Ser
Leu Ser Leu Asn Met305 310 315
320Thr Xaa Xaa Ser Gly Asn Lys Ala Asp Met Val Cys Ser Ile Gln Lys
325 330 335Pro Ser Arg Thr
Ser Pro Thr Ala Phe Thr Glu Glu Asn Asp Tyr Ile 340
345 350Met Leu Asn Ala Ser Phe Ser Thr Asn Leu Val
Cys Ser Val Asp Tyr 355 360 365Asn
His Ile Gln Pro Val Trp Gln Leu Leu Ala Leu Tyr Ser Asp Ser 370
375 380Pro Leu Ile Leu Glu Arg Lys Pro Gln Leu
Thr Glu Thr Pro Ser Leu385 390 395
400Ser Ser Arg Tyr Lys Gln Val Ala Leu Arg Pro Glu Asp Ile Phe
Thr 405 410 415Ser Ile Glu
Ala Asp Val Arg Ala Asp Pro Phe Trp Phe Gln Gln Glu 420
425 430Lys Ile Val Leu Gln Leu Asn Arg Thr Ala
Thr Thr Leu Ser Thr Leu 435 440
445Gln Ile Gln Phe Ser Thr Asp Ala Gln Ile Ala Leu Pro Arg Ala Glu 450
455 460Met Arg Ala Glu Arg Leu Lys Trp
Thr Met Ile Leu Met Met Asn Asn465 470
475 480Pro Lys Leu Glu Arg Thr Val Leu Val Gly Gly Thr
Ile Ala Leu Ser 485 490
495Cys Pro Gly Lys Gly Asp Pro Ser Pro His Leu Glu Trp Leu Leu Ala
500 505 510Asp Gly Ser Lys Val Arg
Ala Pro Tyr Val Ser Glu Asp Gly Arg Ile 515 520
525Leu Ile Asp Lys Asn Gly Lys Leu Glu Leu Gln Met Ala Asp
Ser Phe 530 535 540Asp Ala Gly Leu Tyr
His Cys Ile Ser Thr Asn Asp Ala Asp Ala Asp545 550
555 560Val Leu Thr Tyr Arg Ile Thr Val Val Glu
Pro Tyr Gly Glu Ser Thr 565 570
575His Asp Ser Gly Val Gln His Thr Val Val Thr Gly Glu Thr Leu Asp
580 585 590Leu Pro Cys Leu Ser
Thr Gly Val Pro Asp Ala Ser Ile Ser Trp Ile 595
600 605Leu Pro Gly Asn Thr Val Phe Ser Gln Pro Ser Arg
Asp Arg Gln Ile 610 615 620Leu Asn Asn
Gly Thr Leu Arg Ile Leu Gln Val Thr Pro Lys Asp Gln625
630 635 640Gly His Tyr Gln Cys Val Ala
Ala Asn Pro Ser Gly Ala Asp Phe Ser 645
650 655Ser Phe Lys Val Ser Val Gln Lys Lys Gly Gln Arg
Met Val Glu His 660 665 670Asp
Arg Glu Ala Gly Gly Ser Gly Leu Gly Glu Pro Asn Ser Ser Val 675
680 685Ser Leu Lys Gln Pro Ala Ser Leu Lys
Leu Ser Ala Ser Ala Leu Thr 690 695
700Gly Ser Glu Ala Gly Lys Gln Val Ser Gly Val His Arg Lys Asn Lys705
710 715 720His Arg Asp Leu
Ile His Arg Arg Arg Gly Asp Ser Thr Leu Arg Arg 725
730 735Phe Arg Glu His Arg Arg Gln Leu Pro Leu
Ser Ala Arg Arg Ile Asp 740 745
750Pro Gln Arg Trp Ala Ala Leu Leu Glu Lys Ala Lys Lys Asn Ser Val
755 760 765Pro Lys Lys Gln Glu Asn Thr
Thr Val Lys Pro Val Pro Leu Ala Val 770 775
780Pro Leu Val Glu Leu Thr Asp Glu Glu Lys Asp Ala Ser Gly Met
Ile785 790 795 800Pro Pro
Asp Glu Glu Phe Met Val Leu Lys Thr Lys Ala Ser Gly Val
805 810 815Pro Gly Arg Ser Pro Thr Ala
Asp Ser Gly Pro Val Asn His Gly Phe 820 825
830Met Thr Ser Ile Ala Ser Gly Thr Glu Val Ser Thr Val Asn
Pro Gln 835 840 845Thr Leu Gln Ser
Glu His Leu Pro Asp Phe Lys Leu Phe Ser Val Thr 850
855 860Asn Gly Thr Ala Val Thr Lys Ser Met Asn Pro Ser
Ile Ala Ser Lys865 870 875
880Ile Glu Asp Thr Thr Asn Gln Asn Pro Ile Ile Ile Phe Pro Ser Val
885 890 895Ala Glu Ile Arg Asp
Ser Ala Gln Ala Gly Arg Ala Ser Ser Gln Ser 900
905 910Ala His Pro Val Thr Gly Gly Asn Met Ala Thr Tyr
Gly His Thr Asn 915 920 925Thr Tyr
Ser Ser Phe Thr Ser Lys Ala Ser Thr Val Leu Gln Pro Ile 930
935 940Asn Pro Thr Glu Ser Tyr Gly Pro Gln Ile Pro
Ile Thr Gly Val Ser945 950 955
960Arg Pro Ser Ser Ser Asp Ile Ser Ser His Thr Thr Ala Asp Pro Ser
965 970 975Phe Ser Ser His
Pro Ser Gly Ser His Thr Thr Ala Ser Ser Leu Phe 980
985 990His Ile Pro Arg Asn Asn Asn Thr Gly Asn Phe
Pro Leu Ser Arg His 995 1000
1005Leu Gly Arg Glu Arg Thr Ile Trp Ser Arg Gly Arg Val Lys Asn
1010 1015 1020Pro His Arg Thr Pro Val
Leu Arg Arg His Arg His Arg Thr Val 1025 1030
1035Arg Pro Ala Ile Lys Gly Pro Ala Asn Lys Asn Val Ser Gln
Val 1040 1045 1050Pro Ala Thr Glu Tyr
Pro Gly Met Cys His Thr Cys Pro Ser Ala 1055 1060
1065Glu Gly Leu Thr Val Ala Thr Ala Ala Leu Ser Val Pro
Ser Ser 1070 1075 1080Ser His Ser Ala
Leu Pro Lys Thr Asn Asn Val Gly Val Ile Ala 1085
1090 1095Glu Glu Ser Thr Thr Val Val Lys Lys Pro Leu
Leu Leu Phe Lys 1100 1105 1110Asp Lys
Gln Asn Val Asp Ile Glu Ile Ile Thr Thr Thr Thr Lys 1115
1120 1125Tyr Ser Gly Gly Glu Ser Asn His Val Ile
Pro Thr Glu Ala Ser 1130 1135 1140Met
Thr Ser Ala Pro Thr Ser Val Ser Leu Gly Lys Ser Pro Val 1145
1150 1155Asp Asn Ser Gly His Leu Ser Met Pro
Gly Thr Ile Gln Thr Gly 1160 1165
1170Lys Asp Ser Val Glu Thr Thr Pro Leu Pro Ser Pro Leu Ser Thr
1175 1180 1185Pro Ser Ile Pro Thr Ser
Thr Lys Phe Ser Lys Arg Lys Thr Pro 1190 1195
1200Leu His Gln Ile Phe Val Asn Asn Gln Lys Lys Glu Gly Met
Leu 1205 1210 1215Lys Asn Pro Tyr Gln
Phe Gly Leu Gln Lys Asn Pro Ala Ala Lys 1220 1225
1230Leu Pro Lys Ile Ala Pro Leu Leu Pro Thr Gly Gln Ser
Ser Pro 1235 1240 1245Ser Asp Ser Thr
Thr Leu Leu Thr Ser Pro Pro Pro Ala Leu Ser 1250
1255 1260Thr Thr Met Ala Ala Thr Gln Asn Lys Gly Thr
Glu Val Val Ser 1265 1270 1275Gly Ala
Arg Ser Leu Ser Ala Gly Lys Lys Gln Pro Phe Thr Asn 1280
1285 1290Ser Ser Pro Val Leu Pro Ser Thr Ile Ser
Lys Arg Ser Asn Thr 1295 1300 1305Leu
Asn Phe Leu Ser Thr Glu Thr Pro Thr Val Thr Ser Pro Thr 1310
1315 1320Ala Thr Ala Ser Val Ile Met Ser Glu
Thr Gln Arg Thr Arg Ser 1325 1330
1335Lys Glu Ala Lys Asp Gln Ile Lys Gly Pro Arg Lys Asn Arg Asn
1340 1345 1350Asn Ala Asn Thr Thr Pro
Arg Gln Val Ser Gly Tyr Ser Ala Tyr 1355 1360
1365Ser Ala Leu Thr Thr Ala Asp Thr Pro Leu Ala Phe Ser His
Ser 1370 1375 1380Pro Arg Gln Asp Asp
Gly Gly Asn Val Ser Ala Val Ala Tyr His 1385 1390
1395Ser Thr Thr Ser Leu Leu Ala Ile Thr Glu Leu Phe Glu
Lys Tyr 1400 1405 1410Thr Gln Thr Leu
Gly Asn Thr Thr Ala Leu Glu Thr Thr Leu Leu 1415
1420 1425Ser Lys Ser Gln Glu Ser Thr Thr Val Lys Arg
Ala Ser Asp Thr 1430 1435 1440Pro Pro
Pro Leu Leu Ser Ser Gly Ala Pro Pro Val Pro Thr Pro 1445
1450 1455Ser Pro Pro Pro Phe Thr Lys Gly Val Val
Thr Asp Ser Lys Val 1460 1465 1470Thr
Ser Ala Phe Gln Met Thr Ser Asn Arg Val Val Thr Ile Tyr 1475
1480 1485Glu Ser Ser Arg His Asn Thr Asp Leu
Gln Gln Pro Ser Ala Glu 1490 1495
1500Ala Ser Pro Asn Pro Glu Ile Ile Thr Gly Thr Thr Asp Ser Pro
1505 1510 1515Ser Asn Leu Phe Pro Ser
Thr Ser Val Pro Ala Leu Arg Val Asp 1520 1525
1530Lys Pro Gln Asn Ser Lys Trp Lys Pro Ser Pro Trp Pro Glu
His 1535 1540 1545Lys Tyr Gln Leu Lys
Ser Tyr Ser Glu Thr Ile Glu Lys Gly Lys 1550 1555
1560Arg Pro Ala Val Ser Met Ser Pro His Leu Ser Leu Pro
Glu Ala 1565 1570 1575Ser Thr His Ala
Ser His Trp Asn Thr Gln Lys His Ala Glu Lys 1580
1585 1590Ser Val Phe Asp Lys Lys Pro Gly Gln Asn Pro
Thr Ser Lys His 1595 1600 1605Leu Pro
Tyr Val Ser Leu Pro Lys Thr Leu Leu Lys Lys Pro Arg 1610
1615 1620Ile Ile Gly Gly Lys Ala Ala Ser Phe Thr
Val Pro Ala Asn Ser 1625 1630 1635Asp
Val Phe Leu Pro Cys Glu Ala Val Gly Asp Pro Leu Pro Ile 1640
1645 1650Ile His Trp Thr Arg Val Ser Ser Gly
Xaa Glu Ile Ser Gln Gly 1655 1660
1665Thr Gln Lys Ser Arg Phe His Val Leu Pro Asn Gly Thr Leu Ser
1670 1675 1680Ile Gln Arg Val Ser Ile
Gln Asp Arg Gly Gln Tyr Leu Cys Ser 1685 1690
1695Ala Phe Asn Pro Leu Gly Val Asp His Phe His Val Ser Leu
Ser 1700 1705 1710Val Val Phe Tyr Pro
Ala Arg Ile Leu Asp Arg His Val Lys Glu 1715 1720
1725Ile Thr Val His Phe Gly Ser Thr Val Glu Leu Lys Cys
Arg Val 1730 1735 1740Glu Gly Met Pro
Arg Pro Thr Val Ser Trp Ile Leu Ala Asn Gln 1745
1750 1755Thr Val Val Ser Glu Thr Ala Lys Gly Ser Arg
Lys Val Trp Val 1760 1765 1770Thr Pro
Asp Gly Thr Leu Ile Ile Tyr Asn Leu Ser Leu Tyr Asp 1775
1780 1785Arg Gly Phe Tyr Lys Cys Val Ala Ser Asn
Pro Ser Gly Gln Asp 1790 1795 1800Ser
Leu Leu Val Lys Ile Gln Val Ile Thr Ala Pro Pro Val Ile 1805
1810 1815Ile Glu Gln Lys Arg Gln Ala Ile Val
Gly Val Leu Gly Gly Ser 1820 1825
1830Leu Lys Leu Pro Cys Thr Ala Lys Gly Thr Pro Gln Pro Ser Val
1835 1840 1845His Trp Val Leu Tyr Asp
Gly Thr Glu Leu Lys Pro Leu Gln Leu 1850 1855
1860Thr His Ser Arg Phe Phe Leu Tyr Pro Asn Gly Thr Leu Tyr
Ile 1865 1870 1875Arg Ser Ile Ala Pro
Ser Val Arg Gly Thr Tyr Glu Cys Ile Ala 1880 1885
1890Thr Ser Ser Ser Gly Ser Glu Arg Arg Val Val Ile Leu
Thr Val 1895 1900 1905Glu Glu Gly Glu
Thr Ile Pro Arg Ile Glu Thr Ala Ser Gln Lys 1910
1915 1920Trp Thr Glu Val Asn Leu Gly Glu Lys Leu Leu
Leu Asn Cys Ser 1925 1930 1935Ala Thr
Gly Asp Pro Lys Pro Arg Ile Ile Trp Arg Leu Pro Ser 1940
1945 1950Lys Ala Val Ile Asp Gln Trp His Arg Met
Gly Ser Arg Ile His 1955 1960 1965Val
Tyr Pro Asn Gly Ser Leu Val Val Gly Ser Val Thr Glu Lys 1970
1975 1980Asp Ala Gly Asp Tyr Leu Cys Val Ala
Arg Asn Lys Met Gly Asp 1985 1990
1995Asp Leu Val Leu Met His Val Arg Leu Arg Leu Thr Pro Ala Lys
2000 2005 2010Ile Glu Gln Lys Gln Tyr
Phe Lys Lys Gln Val Leu His Gly Lys 2015 2020
2025Asp Phe Gln Val Asp Cys Lys Ala Ser Gly Ser Pro Val Pro
Glu 2030 2035 2040Val Ser Trp Ser Leu
Pro Asp Gly Thr Val Leu Asn Asn Val Ala 2045 2050
2055Gln Ala Asp Asp Ser Gly Tyr Arg Thr Lys Arg Tyr Thr
Leu Phe 2060 2065 2070His Asn Gly Thr
Leu Tyr Phe Asn Asn Val Gly Met Ala Glu Glu 2075
2080 2085Gly Asp Tyr Ile Cys Ser Ala Gln Asn Thr Leu
Gly Lys Asp Glu 2090 2095 2100Met Lys
Val His Leu Thr Val Leu Thr Ala Ile Pro Arg Ile Arg 2105
2110 2115Gln Ser Tyr Lys Thr Thr Met Arg Leu Arg
Ala Gly Glu Thr Ala 2120 2125 2130Val
Leu Asp Cys Glu Val Thr Gly Glu Pro Lys Pro Asn Val Phe 2135
2140 2145Trp Leu Leu Pro Ser Asn Asn Val Ile
Ser Phe Ser Asn Asp Arg 2150 2155
2160Phe Thr Phe His Ala Asn Arg Thr Leu Ser Ile His Lys Val Lys
2165 2170 2175Pro Leu Asp Ser Gly Asp
Tyr Val Cys Val Ala Gln Asn Pro Ser 2180 2185
2190Gly Asp Asp Thr Lys Thr Tyr Lys Leu Asp Ile Val Ser Lys
Pro 2195 2200 2205Pro Leu Ile Asn Gly
Leu Tyr Ala Asn Lys Thr Val Ile Lys Ala 2210 2215
2220Thr Ala Ile Arg His Ser Lys Lys Tyr Phe Asp Cys Arg
Ala Asp 2225 2230 2235Gly Ile Pro Ser
Ser Gln Val Thr Trp Ile Met Pro Gly Asn Ile 2240
2245 2250Phe Leu Pro Ala Pro Tyr Phe Gly Ser Arg Val
Thr Val His Pro 2255 2260 2265Asn Gly
Thr Leu Glu Met Arg Asn Ile Arg Leu Ser Asp Ser Ala 2270
2275 2280Asp Phe Thr Cys Val Val Arg Ser Glu Gly
Gly Glu Ser Val Leu 2285 2290 2295Val
Val Gln Leu Glu Val Leu Glu Met Leu Arg Arg Pro Thr Phe 2300
2305 2310Arg Asn Pro Phe Asn Glu Lys Val Ile
Ala Gln Ala Gly Lys Pro 2315 2320
2325Val Ala Leu Asn Cys Ser Val Asp Gly Asn Pro Pro Pro Glu Ile
2330 2335 2340Thr Trp Ile Leu Pro Asp
Gly Thr Gln Phe Ala Asn Arg Pro His 2345 2350
2355Asn Ser Pro Tyr Leu Met Ala Gly Asn Gly Ser Leu Ile Leu
Tyr 2360 2365 2370Lys Ala Thr Arg Asn
Lys Ser Gly Lys Tyr Arg Cys Ala Ala Arg 2375 2380
2385Asn Lys Val Gly Tyr Ile Glu Lys Leu Ile Leu Leu Glu
Ile Gly 2390 2395 2400Gln Lys Pro Val
Ile Leu Thr Tyr Glu Pro Gly Met Val Lys Ser 2405
2410 2415Val Ser Gly Glu Pro Leu Ser Leu His Cys Val
Ser Asp Gly Ile 2420 2425 2430Pro Lys
Pro Asn Val Lys Trp Thr Thr Pro Gly Gly His Val Ile 2435
2440 2445Asp Arg Pro Gln Val Asp Gly Lys Tyr Ile
Leu His Glu Asn Gly 2450 2455 2460Thr
Leu Val Ile Lys Ala Thr Thr Ala His Asp Gln Gly Asn Tyr 2465
2470 2475Ile Cys Arg Ala Gln Asn Ser Val Gly
Gln Ala Val Ile Ser Val 2480 2485
2490Ser Val Met Val Val Ala Tyr Pro Pro Arg Ile Ile Asn Tyr Leu
2495 2500 2505Pro Arg Asn Met Leu Arg
Arg Thr Gly Glu Ala Met Gln Leu His 2510 2515
2520Cys Val Ala Leu Gly Ile Pro Lys Pro Lys Val Thr Trp Glu
Thr 2525 2530 2535Pro Arg His Ser Leu
Leu Ser Lys Ala Thr Ala Arg Lys Pro His 2540 2545
2550Arg Ser Glu Met Leu His Pro Gln Gly Thr Leu Val Ile
Gln Asn 2555 2560 2565Leu Gln Thr Ser
Asp Ser Gly Val Tyr Lys Cys Arg Ala Gln Asn 2570
2575 2580Leu Leu Gly Thr Asp Tyr Ala Thr Thr Tyr Ile
Gln Val Leu 2585 2590
2595
SEQ ID NO: 3; length: 11967; type: DNA; organism: mouse sp; misc_feature (1)..(11967): 'n' can be any nucleotide 'a', 'c', 'g' or 't'.
3 tttggaacca acccagatgc ccctcaacag agaaatgggc
cagaaaatgt ggtccattta 60tccaatggaa tactactcaa cttattaaaa acaacgactt
tcataaaatt tttaggcaaa 120tgnatggtct gnaggatctt gagtgaggta acccaatcac
aaaagaacac tcatggtatg 180cactcactga taagtggcta tttgtctatg gagtgattta
aaagggaaga agacacatag 240ctttttgtgt gtataatatt aagatggaaa tttgccagtg
ctgtttggct tatgagtgaa 300tcttgtttca gtggattacc ggaagaaaat aataagtgaa
ctgtaggaag aagtagttaa 360tcaaggtgac aaagtatcct gacacattgg gaaaagacca
cagtccagga aactgagtct 420taaggattca tattaactcc agttccccat gtgcagctct
gagactttgg cagatcagac 480acttaacttc accagcttcc tacacagagc agttactatc
cttgcacttc acacatggag 540tgtgaccatt aagctgcact gaaacatgag tctgacttgt
taataatctt aaaatacaaa 600ttgtgttgta aagtatgtga ccaaagagca tggtcatgct
attaaccttt gatgttctat 660ggactcttaa ttttatggta gaaatgtcaa caagcttgtg
gaggctggaa gatacaaggc 720ttaagaggat ggcctttcag ttttgaaagt aattcagtat
gtgttctggc atcccttttc 780ctaaagcaat ttaacccccc aagtaggcat aattttaatg
cttacttcat cagaatatgt 840ctaattgact cttctaaaaa gactttggta tgcataggat
ctaaatgtaa atgtgattta 900ctgacataat aaataggaga aactgagcta gaataggtat
aaaatatgtg ctggctttct 960aataggtctt ataggttata taagaggtgg gaaaggaata
tttgaaacat ctagaagtaa 1020aatgatcctg agtagcgatc ctgggaaaat acgtactcta
acacactgca atcatctctc 1080tgtggtttgc tggagctgag gtctggaagg ctcgaccttg
gttagaaata acctaccgaa 1140tacagagcta tgacgttagt ctggaaggag ctttggaaga
atgacaagct gtagctgccc 1200agaacatact agatgccata tttccaaggc aagtgtccac
atgcggacat cttaagaata 1260tggttgtctc tgcagtgcta aggaccttgt tcgtgccaca
caggtctcca gggttagtgc 1320taactctgac tgcttgactc tttaattcta ccttgatcat
taatgactag aaatcacttg 1380gtgattagca actggatatg gaatattact aatttgtacc
caagccaggc cacctcagct 1440ttggcagctc cattcattct gtggagccca gtcacgtggg
tttgaatcaa ctgtactgtt 1500tctacttaca agacgcatta cctgagatga gtcatttttc
ttcacaagtc tttttagaag 1560agtcaattag acatattctg atgaagtaag catataaagt
gagagcagca tgaatgtgtt 1620ccatgtatgc tcatggatgc tattataatg tggaaataaa
ctgactttaa aaaaaaaagc 1680ttatgatact tgtcacagag taaatcttcc ataaatatca
tctgcattta taaattattt 1740tcataatcca tcaattaaaa acctttagaa attttgttaa
cacaaagatc cctaggcccc 1800tgccctagga tggtctgtat ggtgggcctg agagatggag
cttaagaact tacttgctcc 1860aggagcacat cttcagaaca tctgcctcaa aacatttatc
ccaaatgctc atcaaaggct 1920cactcacatg tgcttcaacc acagggatta aacagtcatt
ttagtcacat ttctcaaacg 1980gtggaagcct gctagaggaa caggatgtat caggataaca
tccaacctta caaaaggatg 2040tcataaccct caccacaaca aacaacaacg acaacaaacc
cataaaaatt atcacggcaa 2100atgaactaag ccatatgcag aaaaagtatt atatgttctc
attgtggggt gtttttcctt 2160aatagtcaaa tatgcagaat atagacaaag atggtttatg
caagtgggga tggcgaagga 2220tacttgtaga ttagaggaca caaagcaaca actacagagt
gaagtaatcc agagacttaa 2280tgtataatat gaggactgta tttaataatt ctatttaaga
tacacagcaa acgagtgtat 2340cttactaaca cacacactta catagagaga ataaagtgat
agatacgttt gttttatctt 2400catgtagctg ataatttcat attgtacacc tcaaacatag
ataaccaaca aagaggaaga 2460ggataggtgc ctctcccagg gcggaagagt acattcgaaa
gtcagacacc attgtgtaga 2520tgtaccacat ggaggagcta gagaaagtag ccaaggagct
aaagggatct gcaaccctat 2580aggtggaaca acattatgag ctaaccagta ccccggagct
cttgactcta gctgcatata 2640tatcaaaaga tggcctaatc ggccatcact ggaaagagag
gcccattgga cttgcaaact 2700ttatatgccc cagtacaggg gaataccagg gccaaaaagg
gggagtgggt gggcagggga 2760gtgggggtgg gtggatatgg gggacttttg gtatagcatt
ggaaatgtaa atgagttaaa 2820tacctaataa aaaatggaaa aaaaaaaaaa aaaaaaaaaa
aaggaaggtc agacacctca 2880cttcactgct atctcaactt gcaaacagaa ggggagtcac
aaacccagga caaaccacag 2940tgattgaagc gtctttgaat gttattgctg ttgttgttac
caccatcatt agcatatatt 3000cattgtgaaa acttacgggg tctatgacat gtttttttat
tcaagtatat cacatgctgt 3060cagcatattt ggcaccacta ccagccccag ccccctttgc
cccgccccca acacacacac 3120acacacacac acacacacac acacacacac acacacacac
acacacacct ttaccttctc 3180ctgggcatca tctgctcact cacccaccca agcttaatcc
ttttccttcc ctgcaatagt 3240acctctccta tttttatgtc taggttcccc ctccccctgt
taggagatgg gagaggtcac 3300gaaaggaaag aatttgtagc ccctgagcca gcccgggcca
cagagcctgc caccagacag 3360gaaaagccca gggcttacca gcacaggagg agcaaactcg
caggcgagcc tgggttggcg 3420ctggtggtcc cgggtcgatg gcccgcccat tcccagaagc
cgaggctata gctgcgtcac 3480ctgccccgcc ctcctcccga gtgaagaccc ctagaggctg
agcagacccc aaaggcggtg 3540caattccatt ggcccaaggc agaggtgagc ggctgctaat
cccctcggga agtgaaggga 3600cccagagagt ctggtagatg tgggagctgg ggttcagggc
gagacagagg gtgggatggg 3660cagaagggtc caggaaaagg aaagtactgg aggggagttg
ggacaaaagc agcgaccaag 3720ggaacatcgc ttcagtgact gaagccaggc aaaaggagcg
ggaaggatta tatgtagcct 3780gggacgcttt cataaacact gatgacgtgt ttgtgcaaag
caagcaattt gaggagaaac 3840gcctgggacg tcggaaagaa ggagtgatcg attagtactt
gtaagtttag gtgagtttga 3900gaactaacta acctatacta ttgagggaga aggaagagca
ttccagcagc agcagcagca 3960gcagcaatca gataaaggaa agctttggtt agtttggaaa
tgtatgatac cattaaaata 4020acagaagcgc ctccagttct ctgaagagtc agtcccccag
ctagtgaaga ctaagcctac 4080taagcctttt gctcccgttg gaagcaaaga acgttccttc
aatcaggtga aggctctcct 4140cagaagattt cctgtctctg cttatgttac aagaggattc
aaaagcaaga cagaagagct 4200caggtattgc caactctttt gttaaataca gtttgaggct
taagtgtacg ggaactcatg 4260tggtattcat ttacggctct cttctcttat aactaactct
taaggtgcat atagtctctt 4320ctgtttccca gctaccttgt accatctttg tttatctaat
aatagcaagc tcatctgctt 4380tttaatcatc acgcagagag tattcaaaaa tattcagtga
tgtaacagtg acagtgtagg 4440catagaagta atcattagta aatcttaatt tgggttaaac
tcattcataa cagctccagg 4500ttgggaggga tcactgagcc ttcgccacgt gcgggttaaa
gatattttct aacaagagaa 4560gcagaattct tccttggcca tgctccccat cactgtgtca
gtaagcagag gggtgtttcc 4620aagcagagaa agagcagaca gtgttatgcc tgcaaagtca
gagactcagc cctcccagct 4680ggtcagttta ctgtcctccc ggtcattagt tggctctgaa
aaggcccatg tgtccttatt 4740ggcaaggact tgcagacatg ctagaaagaa atttgacctt
tttttctagt gggttattac 4800agctgtaaaa gtattttgga aggttaagcc aaataaataa
aacacatatt aaataataca 4860atgttacaaa aattgatcat ataaagaagt acattcataa
atgcaatgtg aaaaatatat 4920ataattttta tctatttact ggtgcaaagt tttctaaatt
gcacatgtac tatttttata 4980tttataaaaa tatttttaaa atgtatataa aagtgtaaaa
ggctcttggt caaacaagag 5040agttaaattt acaaacttta attgtcccga taacattatt
atgatctcta atgacaggga 5100tcctgctttt cattgggaaa tgagaagcta tgaagatatg
tttacaataa taagcccatt 5160tagtgataaa gtccaatggg aagctagcac acactggttt
ataaagagaa cagtttcctg 5220agtctatgca agtttacact ctagggaata agagttcctc
tttctccaga tttcactagc 5280atttgttgtc atcatttatc ttcttgatga tgagcattat
aagtggaata agataggatc 5340tcaaaggaat gtcaatttgg atgccctgaa caatctttca
ggtctttctt tcagttcact 5400agtctattca tttattggat aattggggga tggtgttaat
ttttttgcag ttcttatgga 5460attccaaaaa acaaaaaaca aacaaacaaa caaaaaacct
ctgaaactag aactaccaat 5520ccattactgg gtatgtaaca aagagaaatc tgcacagaat
ttattgctac attgttcatt 5580attcacgaca gccaagaatg tggaaccaac ttacgtagcc
gtcaaaatat gaacggataa 5640agaaaatgtg gaaatgtgta caacagagtc ccatgtggcc
ataaaagagt gaaatcatga 5700catatgcagg aaatggatgc aactggaaat caattgggct
aatcaaaaca agacagactc 5760aaaaaggaaa caccgtgtag cttctctgac aaacagaagc
tagatttaca cttgtacgtg 5820cgcatgtgtg tttagaattt tatttagtta tacactattc
taatctgtga gtgtgtataa 5880aggcatgcat gtaaagcaaa aacaagctag ctggggtggg
taggagagaa agcaatgaga 5940ggagttaata agaacgaagc atagtaacat aggtgccagg
atgaaatgca ttaatttgta 6000tgctaactaa accacagaca ggaggcacac gttcaaacca
gggtgaaatc ccagcacaga 6060gaaggggaag tagacacaaa gtttcgccac taaccaagaa
gccatttgca gttgctgcct 6120gctgggaggg gcgttccagt tttctccagt ctgacactgt
gtataacaac cagttgacaa 6180tacaaagttg gcatgatgga tggtttttgt gctatttttc
attttttttc ttactgtttt 6240gttgttgtgg tggttgttgt ggtggtggct gtggttttca
tttgtttctt ttgagagaga 6300gaaggaacat gaaattgggt gggtaggaag ctggaaacga
tctggaagaa gttggggaaa 6360gagaaaaatt gtatggagca tatttaaaca aacaaacaaa
caaacaaaag gttcattttg 6420ccacaaaaag gtgtgaatta aattaaccag ttacgactct
taaagaaaat attcccaatt 6480attcccagag ttgctatgta tgctgtgcct aggactttgc
ttgaactggc cctataactc 6540tggtgtggtg tcttttcagg atgcagaaga gaggcaggga
agtcagctgc ttgctgatct 6600ccctcactgc catctgcctg gtggtcaccc ctgggagcag
ggtctgtcct cgccgatgtg 6660cctgctatgt gcccacagag gtgcactgta catttcggga
cctgacctcc atcccagacg 6720ggcatcccag ccaatgtgga acgagtcaat ttagggtgtg
tggaccttgc ctgatctcct 6780tctcagagag ggaccactga ttttcctggt actttgcccc
ccaaacacct gtgattactt 6840ttaatagttt tcttctaaaa tgggttcata caaaccttat
attgtggaga caatgaacat 6900tttatcccaa tagtctttta ctagaacttg aagcccctct
tagttgtttg ggagcctcat 6960aattatgggg cagctttatt ctgaatgaat tttaaatgaa
aaagatacag tttctgttaa 7020caatcattat gataccaagg aagaggaatt gtcattgaat
attttaaaaa agcatttctt 7080ttgcaattta taaataccca ttacaaaatg gcttacttaa
aatacttgcc ttactaaatc 7140tgacaaatta tggtgatatt ttgaaggttt atgaaaattt
gtttatgtgt ataaatgcac 7200aagaaatggg atatgccatc acctatgtgc cattagtgag
catgtacagt atgccaaaca 7260ctattgttca cgtttggagg aagtaatggg ggtgggggag
caacaagggt tataaccgta 7320tacccagtgc cttggaagcg attgcaaaca gtaaagactg
acattgtgtt ctccctatga 7380gggaggggcc ttgggctgag cactttgcaa tgagcatttg
ctcattgtgc tggcaggttt 7440tatgataact tgacccaagc tagagtcact ggagaggaag
gaacttcaac tgagaacatg 7500cctgaagaag atcagattat aggcaggcct gtggggcatt
ttcttaatta gtgattcatg 7560gggcagggcc cagtccattg ttcgtggtac catttctcag
gcactattaa aaaaaaaaaa 7620acaggctgag caagtgtcaa ggagcaagtc agtgagcagc
agccctaatg atctctgcat 7680cagctcctgc ctccaggttc ctaccctatt tgagttcctg
tcctagctcc ctacagtgat 7740gaacaatgat gtggaagtat aagccaaata aatcctttct
tccccaactt gctgttggtc 7800atgatgtttc atcacagtga taatagtcct catgaagatg
ctggtgttta taacaccttt 7860ggactaaatt ctgttatcta tagctgagga aaatggagca
tagaaagtct ccagactaca 7920ccagagtgta atctgggcct gagcttagaa tcacacccac
gtgcactcca ctgccggggc 7980ttcttaaccg gaacacagtt gtaaaaggga attttctgtt
tgtttccatt ttgacatgtg 8040gactttaatt gacgattcat ctgaagctga aaatgatttt
ttttccaggt ataacagcct 8100cactagattg acagaaaatg acttttctgg cctgagcaga
ctggagttac tcatgctgca 8160cagcaatggc attcacagag tcagtgacaa gaccttctcg
ggcttgcagt ccttgcaggt 8220gagataggta gagggtgatg gaggctgaga agagaggtgc
aactgtgggt tatacccaaa 8280agctgctgat tcccgtggga gacattctat aagcattcta
taaactagag gcagatatca 8340aggaaggatt tcaattgtaa tgcaatttta tgagaaaatt
tgaatattaa gaaaatgctg 8400gggaaaatgc ttacacaatt gcgaggacct aatttaggat
ctccaatagc cacataaaaa 8460gcacagcatg gcggcagaca cctgcaattc ctgtccctgg
aagcacctgt tcagaatccc 8520agagactcat tggccaaaca ctctattcaa tcaatgaagt
ccatattcag tgacaaaact 8580tgactcagaa actaatgtgg aaagcatcag gaagacagcc
aacatctggt ctctactcat 8640gcatgaataa gggatcccag agagaaggga agaaaaagga
aggaaggaag gaaggaagga 8700aggaaggaag gaaggaagga aggaaggaag agagggagga
aaggagggag ggaaggaagg 8760aaagggaaag gaaaaaagag atggggaggg agggaaggaa
aggaaagggg gagaaagaag 8820agaagaaagg aaaataaata aattttcagg gattattaca
cctttaaatt ttatccataa 8880aaggtcattt ccacctgttt gtctggaagt agagtgggat
cccttatata agggcagtct 8940ttaacatagt agcattttat aaaccattac aaattttgag
ttttctctac tttttatcct 9000ctaccatctt caaactgaaa ctacaattat tcccacaaat
gaagaaaatg ctgtaagagt 9060tttcacacac cgaagtggga aacttaagga ttagacaagt
ctaacaatga gaatggggag 9120aacaaaaaga gactgcacag ggagcccttt ctctgcttat
aatcttgaca cttgagaagc 9180taattgacgc tgcatgacta ctcaactctt taagcaaaca
atgctgttgt tcatgaaaag 9240cacaataaag tacatatgtc ccataatatt catcaaaatt
tgcatgcagc acataatagc 9300aatcaaagca ataacaccca ctgttcacag agactttaaa
catgaaactg gaactatgtc 9360tagtgttttg acttagggta catagtatgc tgtgtctgta
tgtaccaatg ttgatttagg 9420tcatcagaca gcatttggaa catgtatctt caggaggaat
cattcatgta tcctgcatga 9480aattctccac ctatgtttat tctcttagcc aggtttttct
ctgatggaga aacattgggt 9540ttgaggtttt actcccaggt aacatttagg gaaaagctgt
ctatgttctc agtttggctt 9600ttatttatga gggatgttgg tattccagaa aattctcttt
tgaagagatt acaatttagg 9660tcaaaacaga aaaatatgta aaaagttatt gtttttatta
gtatttcatg ttcttttctt 9720ttttaaaaat ggtatgctta gaactaatta agattagatt
agattagatt agaaaataat 9780cagagaggga tttgatgaat gctaaagcat catgaaaaat
tcaaaatttt ttgcttctaa 9840ttcagaatca attaaattca tattactata aaagacagca
cgccagatgt gtgccagctg 9900aggagtggat aaactgtgta acgtgagtgc tatgtagaaa
cagaaaggag tgaagggttg 9960atgtgcgctg caacatcttg aaaacattcg gctacatgat
ggaagccagg cacaaaaagc 10020cacatattgc atggttatgt ttatatgaaa tgtttaaaat
acatggattc ttagcaaaca 10080gagtaagatg ttacttaggg tcaggaaaag attaaaaaaa
aaaaaactat tgatgtggaa 10140tgatcttaat ttggggaaaa gacaatttcc taagacgaaa
tagttgaggt agatatagtt 10200atatccctgt ggatattgta ataaaccagc atgctgtgct
ctgagaaggg cctaatgaag 10260gggcaggagg aagtgaaatg agatggtaga aaggaaagtc
atataccatg gcttctctcg 10320tgggtggaat ctagatatgt taatatattg acataaagga
aggaattgtt tagggaagga 10380tcaaaaccaa caggagtgag ggagacaata ggaaccaatg
agaggcaaag ttcatggtca 10440atgtgtgtgg agacaccata ataaaactcc ttttttgttt
gctaactaaa accactaaaa 10500tctaaaaaca aaacattttt gcacaagaat tatttattat
tcaataaaga tgtttaaatg 10560ggggaagttg aagttcattg atagtctcat aaatcttaaa
tgtatttaaa ctgcttttta 10620cgttttttat tattaattac tcttgctgtc attattatca
tcatcattat cgtcatcatc 10680atcactaatg cttttcacca tacacaaatg taggcagaag
agtgtaatcc acttagtgag 10740gcaatcttgg agagggaaag gaagcggatg cggggcagag
gcacacagga ggacagtgag 10800agggaaatga acaagaaaaa atgtggacac atgcacaaaa
attccatagt ccactacatt 10860actttgtatt ctaatattaa gaaaataata aacccatttc
tgtgcactta tcacccaggc 10920tcaacagtta tcttggccac agatcctgtc tcactgcatc
ctgtccacct gagtccactt 10980agcgttctga atccaatcca gggcatgatg cttactccta
cacagaacta aagattaaag 11040agagtttaaa agtaaccatg acatctctct gttcctttag
cgataagttc ttaatattta 11100tggctgcttg tgtatgttct aatttctcta atattgtcac
atttagttgg caactacttt 11160gtttgaattg agttggagtt aaggtcccat aggattaatc
tcaacatatt tctatattta 11220taaacttttc tctctttgtg aaagttcctt tgagaaaaca
aatatgccca tatctttctt 11280tacaggtctt aaaaatgagc tataacaaag tccaaataat
tgagaaggat actttgtatg 11340gactcaggag cttgacccgg ttgcacctgg atcacaacaa
cattgagttt atcaaccccg 11400aggcgtttta cggactcacc ttgctccgct tggtacatct
agaaggaaac cggctgacaa 11460agctccatcc agacacattt gtctctttga gctatctcca
gatatttaaa acctccttca 11520ttaagnacct gtacttgtat gataacttca ttgacctccc
tcccaaaaga aatggtctcc 11580tctatgccaa acctagaaag cctttacttg catggaaacc
catggacctg tgactgccat 11640ttaaagtggt tgtccgagtg gatgcaggga aacccaggta
actatcttgt ttgtttgttt 11700ctttttttat arkacgtatt ttcctcaatt tcatttagaa
tgatatccca aaagtccccc 11760ataacctccc ccccacttcc ctacctaccc attcccattt
tttggccctg gcattcccct 11820gtactggggc atataaagtt tgcgtgtcca atggacctct
ctttccagtg atggccaact 11880aggccatctt ttgatacata tgcagctaga gtcaagagct
ctggggtact ggttagttca 11940taatgttgtt gcacctacag ggttgaa
11967
SEQ ID NO: 4; length: 2404; type: DNA; organism: homo sapiens
4 tgggcagctg gatccacgtc
taccctaatg gatccctgtt tattggatca gtaacagaaa 60aagacagtgg tgtctacttg
tgtgtggcaa gaaacaaaat gggggatgat ctgatactga 120tgcatgttag cctaagactg
aaacctgcca aaattgacca caagcagtat tttagaaagc 180aagtgctcca tgggaaagat
ttccaagtag attgcaaagc ttccggctcc ccagtgccag 240agatatcttg gagtttgcct
gatggaacca tgatcaacaa tgcaatgcaa gccgatgaca 300gtggccacag gactaggaga
tatacccttt tcaacaatgg aactttatac ttcaacaaag 360ttggggtagc ggaggaagga
gattatactt gctatgccca gaacacccta gggaaagatg 420aaatgaaggt ccacttaaca
gttataacag ctgctccccg gataaggcag agtaacaaaa 480ccaacaagag aatcaaagct
ggagacacag ctgtccttga ctgtgaggtc actggggatc 540ccaaaccaaa aatattttgg
ttgctgcctt ccaatgacat gatttccttc tccattgata 600ggtacacatt tcatgccaat
gggtctttga ccatcaacaa agtgaaactg ctcgattctg 660gagagtacgt atgtgtagcc
cgaaatccca gtggggatga caccaaaatg tacaaactgg 720atgtggtctc taaacctcca
ttaatcaatg gtctgtatac aaacagaact gttattaaag 780ccacagctgt gagacattcc
aaaaaacact ttgactgcag agctgaaggg acaccatctc 840ctgaagtcat gtggatcatg
ccagacaata ttttcctcac agccccatac tatggaagca 900gaatcacagt ccataaaaat
ggaaccttgg aaattaggaa tgtgaggctt tcagattcag 960ccgactttat ctgtgtggcc
cgaaatgaag gtggagagag cgtgttggta gtacagttag 1020aagtactgga aatgctgaga
agaccgacat ttagaaatcc atttaatgaa aaaatagttg 1080cccagctggg aaagtccaca
gcattgaatt gctctgttga tggtaaccca ccacctgaaa 1140taatctggat tttaccaaat
ggcacacgat tttccaatgg accacaaagt tatcagtatc 1200tgatagcaag caatggttct
tttatcattt ctaaaacaac tcgggaggat gcaggaaaat 1260atcgctgtgc agctaggaat
aaagttggct atattgagaa attagtcata ttagaaattg 1320gccagaagcc agttattctt
acctatgcac cagggacagt aaaaggcatc agtggagaat 1380ctctatcact gcattgtgtg
tctgatggaa tccctaagcc aaatatcaaa tggactatgc 1440caagtggtta tgtagtagac
aggcctcaaa ttaatgggaa atacatattg catgacaatg 1500gcaccttagt cattaaagaa
gcaacagctt atgacagagg aaactatatc tgtaaggctc 1560aaaatagtgt tggtcataca
ctgattactg ttccagtaat gattgtagcc taccctcccc 1620gaattacaaa tcgtccaccc
aggagtattg tcaccaggac aggggcagcc tttcagctcc 1680actgtgtggc cttgggagtt
cccaagccag aaatcacatg ggagatgcct gaccactccc 1740ttctctcaac ggcaagtaaa
gagaggacac atggaagtga gcagcttcac ttacaaggta 1800ccctagtcat tcagaatccc
caaacctccg attctgggat atacaaatgc acagcaaaga 1860acccacttgg tagtgattat
gcagcaacgt atattcaagt aatctgacat gaaataataa 1920agtcaacaac atctgggcag
aatttatttt ttggaagaag tttaatcaaa ggcagccata 1980ggcatgtaaa tgaatttgaa
tacatttaca gtattaaatt tacaatgaac atgcaaaata 2040aaaggacttg taaataaatg
cattatgaac tgatgataag tctctgtgga tctcaaagca 2100aactcttaac ttaaggcact
ttgctgattt atttaatgga tctcaaaaca aacttttaac 2160ttaaggcact tttattttgc
caacaaataa caataaacaa acattgaaac ggttcactat 2220aaaataacaa atggctaatg
tacctgaatt tttcagtaaa aaaatgaact tctaatacca 2280gttgcctagt gtccacctcc
tatcaatgtt acaagcatgg cactcagaac agagacaatg 2340gaaaatatta aatctgcaat
ctttatgatg taaatttacc atcctgatgt ataaatattt 2400tgtg
2404
SEQ ID NO: 5; length: 8883; type: DNA; organism: Rattus species; misc_feature (1)..(8916): n can be any nucleotide
5 cgagagacga
cagaaggtta cggctgcgag aagacgacag aagggtccag aaaaaggaaa 60gtgctggagg
ggagtgggga caaaagcagc gaccaagtga atgtcacttc agtgactgag 120gccaggcaaa
acgcgcggga aggattttgt gtagcttggg accctttcat agacactgat 180gacacgttta
cgcaaaatag aaatttgagg agaaacgcct gggccttcgg aaaggagtga 240ttgattagta
cttgcaagtt taggtgactt taaggagaac taactaatgt atactattga 300gggaggagga
agagcattac agagtttcca gcagcagcag gaaagctttg gttaatttgg 360aaatggatga
tagcattaaa ataacagaag cgcctccagg tctctgaagc ttcagtcccc 420cagctgaaag
ccagaaaaga ctaagcccac taagcctttt gatccctttg gaagcaaaga 480actttccttc
cctggggtga agactctcct cagaagattt cctgtctctg cctatgttac 540aagaggaatc
aaaaccaaga cagaagagct caggatgcag gtgagaggca gggaagtcag 600cggcttgttg
atctccctca ctgctgtctg cctggtggtc acccctggga gcagggcctg 660tcctcgccgc
tgtgcctgct atgtgcccac agaggtgcac tgtacatttc ggtacctgac 720ctccatccca
gatggcatcc cggccaatgt ggaacgaata aatttaggat ataacagcct 780tactagattg
acagaaaacg actttgatgg cctgagcaaa ctggagttac tcatgctgca 840cagtaatggc
attcacagag tcagtgacaa gaccttctcg ggcttgcagt ccttgcaggt 900cttaaaaatg
agctataaca aagtccaaat cattcggaag gatactttct acggactcgg 960gagcttggtc
cggttgcacc tggatcacaa caacattgaa ttcatcaacc ctgaggcctt 1020ttatggactt
acctcgctcc gcttggtaca tttagaagga aaccggctca caaagctcca 1080tccagacaca
tttgtctcat taagctatct ccagatattt aaaacctctt tcattaagta 1140cctgttcttg
tctgataact tcctgacctc cctcccaaaa gaaatggtct cctacatgcc 1200aaacctagaa
agcctgtatt tgcatggaaa cccatggacc tgtgactgcc atttaaagtg 1260gttgtctgag
tggatgcagg gaaacccaga tataataaaa tgcaagaaag acagaagctc 1320ttccagtcct
cagcaatgtc ccctttgcat gaaccccagg atctctaaag gcagaccctt 1380tgctatggta
ccatctggag ctttcctatg tacaaagcca accattgatc catcactgaa 1440gtcaaagagc
ctggttactc aggaggacaa tggatctgcc tccacctcac ctcaagattt 1500catagaaccc
tttggctcct tgtctttgaa catgacanan ntntctggaa ataaggccga 1560catggtctgt
agtatccaaa agccatcaag gacatcacca actgcattca ctgaagaaaa 1620tgactacatc
atgctaaatg cgtcattttc cacaaatctt gtgtgcagtg tagattataa 1680tcacatccag
ccagtgtggc aacttctggc tttatacagt gactctcctc tgatactaga 1740aaggaagccc
cagcttaccg agactccttc actgtcttct agatataaac aggtggctct 1800taggcctgaa
gacattttta ccagcataga ggctgatgtc agagcagacc ctttttggtt 1860ccaacaagaa
aaaattgtct tgcagctgaa cagaactgcc accacactta gcacattaca 1920gatccagttt
tccactgatg ctcaaatcgc tttaccaagg gcggagatga gagcggagag 1980actcaaatgg
accatgatcc tgatgatgaa caatcccaaa ctggaacgca ctgtcctggt 2040tggcggcact
attgccctga gctgtccagg caaaggcgac ccttcacctc acttggaatg 2100gcttctagct
gatgggagta aagtgagagc cccttacgtt agcgaggatg ggcgaatcct 2160aatagacaaa
aatgggaagt tggaactgca gatggctgac agctttgatg caggtcttta 2220ccactgcata
agcaccaatg atgcagatgc ggatgttctc acatacagga taactgtggt 2280agagccctat
ggagaaagca cacatgacag tggagtccag cacacagtgg ttacgggtga 2340gacgctcgac
cttccatgcc tttccacggg tgttccagat gcttctatta gctggattct 2400tccagggaac
actgtgttct ctcagccatc aagagacagg caaattctta acaatgggac 2460cttaagaata
ttacaggtta cgccaaaaga tcaaggtcat taccaatgtg tggctgccaa 2520cccatcaggg
gccgactttt ccagttttaa agtttcagtt caaaagaaag gccaaaggat 2580ggttgagcat
gacagggagg caggtggatc tggacttgga gaacccaact ccagtgtttc 2640ccttaagcag
ccagcatctt tgaaactctc tgcatcagct ttgacagggt cagaggctgg 2700aaaacaagtc
tccggtgtac ataggaagaa caaacataga gacttaatac atcggcggcg 2760tggggattcc
acgctccggc gattcaggga gcataggagg cagctccctc tctctgctcg 2820gagaattgac
ccgcaacgct gggcagcact tctagaaaaa gccaaaaaga attctgtgcc 2880aaaaaagcaa
gaaaatacca cagtaaagcc agtgccactg gctgttcccc tcgtggaact 2940cactgacgag
gaaaaggatg cctctggcat gattcctcca gatgaagaat tcatggttct 3000gaaaactaag
gcttctggtg tcccaggaag gtcaccaact gctgactctg gaccagtaaa 3060tcatggtttt
atgacgagta tagcttctgg cacagaagtc tcaactgtga atccacaaac 3120actacaatct
gagcaccttc ctgatttcaa attatttagt gtaacaaacg gtacagctgt 3180gacaaagagt
atgaacccat ccatagcaag caaaatagaa gatacaacca accaaaaccc 3240aatcattatc
tttccatcag tagctgaaat tcgagattct gctcaggcag gaagagcatc 3300ttcccaaagt
gcacaccctg taacaggggg aaacatggct acctatggcc ataccaacac 3360atatagtagc
tttaccagca aagccagtac agtcttgcag ccaataaatc caacagaaag 3420ttatggacct
cagataccta ttacaggagt cagcagacct agcagtagtg acatctcttc 3480tcacactact
gcagacccta gcttctccag tcacccttca ggttcacaca ccactgcctc 3540gtctttattt
cacattccta gaaacaacaa tacaggtaac ttccccttgt ccaggcactt 3600gggaagagag
aggacaattt ggagcagagg gagagttaaa aacccacata gaaccccagt 3660tctccgacgg
catagacaca ggactgtgag gccagcaatc aagggacctg ctaacaaaaa 3720tgtgagccaa
gttccagcca cagagtaccc tgggatgtgc cacacatgtc cttccgcaga 3780ggggctcaca
gtggctactg cagcactgtc agttccaagt tcatcccaca gtgccctccc 3840caaaactaat
aatgttgggg tcatagcaga agagtctacc actgtggtca agaaaccact 3900gttactattt
aaggacaaac aaaatgtaga tattgagata ataacaacca ctacaaaata 3960ttccggaggg
gaaagtaacc acgtgattcc tacggaagca agcatgactt ctgctccaac 4020atctgtatcc
ctggggaaat ctcctgtaga caatagtggt cacctgagca tgcctgggac 4080catccaaact
gggaaagatt cagtggaaac aacaccactt cccagccccc tcagcacacc 4140ctcaatacca
acaagcacaa aattctcaaa gaggaaaact cccttgcacc agatctttgt 4200aaataaccag
aagaaggagg ggatgttaaa gaatccatat caattcggtt tacaaaagaa 4260cccagccgca
aagcttccca aaatagctcc tcttttaccc acaggtcaga gttccccctc 4320agattctaca
actctcttga caagtccgcc accagctctg tctacaacaa tggctgccac 4380tcagaacaag
ggcactgaag tagtatcagg tgccagaagt ctctcagcag ggaagaagca 4440gcccttcacc
aactcctctc cagtgcttcc tagcaccata agcaagagat ctaatacatt 4500aaacttcttg
tcaacggaaa cccccacagt gacaagtcct actgctactg catctgtcat 4560tatgtctgaa
acccaacgaa caagatccaa agaagcaaaa gaccaaataa aggggcctcg 4620gaagaacaga
aacaacgcaa acaccacccc caggcaggtt tctggctata gtgcatactc 4680agctctaaca
acagctgata cccccttggc tttcagtcat tccccacgac aagatgatgg 4740tggaaatgta
agtgcagttg cttatcactc aacaacctct cttctggcca taactgaact 4800gtttgagaag
tacacccaga ctttgggaaa tacaacagct ttggaaacaa cgttgttgag 4860caaatcacag
gagagtacca cagtgaaaag agcctcagac acaccaccac cactcctcag 4920cagtggggcg
cccccagtgc ccactccttc cccacctcct tttactaagg gtgtggttac 4980agacagcaaa
gtcacatcag ctttccagat gacgtcaaat agagtggtca ccatatatga 5040atcttcaagg
cacaatacag atctgcagca accctcagca gaggctagcc ccaatcctga 5100gatcataact
ggaaccactg actctccctc taatctgttt ccatccactt ctgtgccagc 5160actaagggta
gataaaccac agaattctaa atggaagccc tctccctggc cagaacacaa 5220atatcagctc
aagtcatact ccgaaaccat tgagaagggc aaaaggccag cagtaagcat 5280gtccccccac
ctcagccttc cagaggccag cactcatgcc tcacactgga atacacagaa 5340gcatgcagaa
aagagtgttt ttgataagaa acctggtcaa aacccaactt ccaaacatct 5400gccttacgtc
tctctaccta agactctatt gaaaaagcca agaataattg gaggaaaggc 5460tgcaagcttt
acagttccag ctaattcaga cgtttttctt ccttgtgagg ctgttggaga 5520cccactgccc
atcatccact ggaccagagt ttcatcagga nttgaaatat cccaagggac 5580acagaaaagc
cggttccacg tgcttcccaa tggcaccttg tccatccaga gggtcagtat 5640tcaggaccgt
ggacagtacc tgtgctctgc atttaatcca ctgggcgtag accattttca 5700tgtctctttg
tctgtggttt tttacccggc aaggattttg gacagacatg tcaaggagat 5760cacagttcac
tttggaagta ctgtggaact aaagtgcaga gtggagggta tgccgaggcc 5820tacggtttcc
tggatacttg caaaccaaac ggtggtctca gaaacggcca agggaagcag 5880aaaggtctgg
gtaacacctg atggaacatt gatcatctat aatctgagtc tttatgatcg 5940tggtttttac
aagtgtgtgg ccagcaaccc atctggccag gattcactgt tggttaagat 6000acaagtcatc
acagctcccc ctgtcattat agagcaaaag aggcaagcca tcgttggggt 6060tttaggtgga
agtttgaaac tgccctgcac tgcaaaagga actccccagc ctagtgttca 6120ctgggtcctt
tatgatggga ctgaactaaa accattgcag ttgactcatt ccagattttt 6180cttgtatcca
aatggaactc tgtatataag aagcatcgct ccttcagtga ggggcactta 6240tgagtgcatt
gccaccagct cctcaggctc agagagaagg gtagtgattc ttactgtgga 6300agagggagag
acaatcccca ggatagaaac tgcctctcag aaatggactg aggtgaattt 6360gggtgagaaa
ttactactga actgctcagc tactggggat ccaaagccta gaataatctg 6420gaggctgcca
tccaaggctg tcatcgacca gtggcacaga atgggcagcc gaatccacgt 6480ctacccaaat
ggatccttgg tggttgggtc agtgacggaa aaagacgctg gtgactactt 6540atgtgtggca
agaaacaaaa tgggagatga cctagtcctg atgcatgtcc gcctgagatt 6600gacacctgcc
aaaattgaac agaagcagta ttttaagaag caagtgctcc atgggaaaga 6660tttccaagtt
gactgcaagg cctctggctc ccctgtgcct gaggtatcct ggagtttgcc 6720tgatgggaca
gtgctcaaca atgtagccca agctgatgac agtggctata ggaccaagag 6780gtacaccctt
ttccacaatg gaaccttgta tttcaacaac gttgggatgg cagaggaagg 6840agattatatc
tgctctgccc agaacacctt agggaaagat gagatgaaag tccacctaac 6900agttctaaca
gccatcccac ggataaggca aagctacaag accaccatga ggctcagggc 6960tggagaaaca
gctgtccttg actgcgaggt cactggggaa ccgaagccca atgtattttg 7020gttgctgcct
tccaacaatg tcatttcatt ctccaatgac aggttcacat ttcatgccaa 7080tagaactttg
tccatccata aagtgaaacc acttgactct ggggactatg tgtgcgtagc 7140tcagaatcct
agtggggatg acactaagac atacaaactg gacattgtct ctaaacctcc 7200attaatcaat
ggcctgtatg caaacaagac tgttattaaa gccacagcca ttcggcactc 7260caaaaaatac
tttgactgca gagcagatgg gatcccatct tcccaggtca cgtggattat 7320gccaggcaat
attttcctcc cagctccata ctttggaagc agagtcacgg tccatccaaa 7380tggaaccttg
gagatgagga acatccggct ttctgactct gcggacttca cctgtgtggt 7440tcggagcgag
ggaggagaga gtgtgttggt agtgcagtta gaagtcctag aaatgctgag 7500aagaccaaca
ttcagaaacc cattcaacga aaaagtcatc gcccaagctg gcaagcccgt 7560agcactgaac
tgctctgtgg atgggaaccc cccacctgaa attacctgga tcttacctga 7620cggcacacag
tttgctaaca gaccacacaa ttccccgtat ctgatggcag gcaatggctc 7680tctcatcctt
tacaaagcaa ctcggaacaa gtcagggaag tatcgctgtg cagccaggaa 7740taaggttggc
tacatcgaga aactcatcct gttagagatt gggcagaagc cagtcattct 7800gacatacgaa
ccagggatgg tgaagagcgt cagtggggaa ccgttatcac tgcattgtgt 7860gtctgatggg
atccccaagc caaatgtcaa gtggactaca ccgggtggcc atgtaatcga 7920caggcctcaa
gtggatggaa aatacatact gcatgaaaat ggcacgctgg tcatcaaagc 7980aacaacagct
cacgaccaag gaaattatat ctgtagggct caaaacagtg ttggccaggc 8040agttattagc
gtgtcagtga tggttgtggc ctaccctccc cgaatcataa actacctacc 8100caggaacatg
ctcaggagga caggggaagc catgcagctc cactgtgtgg ccttgggaat 8160ccccaagcca
aaagtcacct gggagacgcc aagacactcc ctgctctcaa aagcaacagc 8220aagaaaaccc
catagaagtg agatgcttca cccacaaggt acgctggtca ttcagaatct 8280ccaaacctcg
gattccggag tctataagtg cagagctcag aacctacttg ggactgatta 8340cgcaacaact
tacatccagg tactctgaca ggaaggggga gactaaaatt caacagaagt 8400ccacatccac
agggtttatt ttttggaaga agtttaatca aaggcagcca taggcatgta 8460aatgagtctg
aatacattta cagtattaaa tttacaatgg acatgcgatg agacttgtaa 8520atgaaagcat
tgtgaactga aaccgagtct ctgtggatct caaagcaaac tcttaactta 8580aggcactttg
attttgccaa caaataataa caaacattaa gagaaaaaaa tgatccacta 8640cgaaataaca
aacggctaat gcacctgaat tctcagtaaa aagacctttc tctcgctaac 8700agttgccagc
tgcctcgtgt ctgtttccta ccaatgtcac aaacatcgca cacagggtga 8760atggagtcaa
cgggaaagat taagtttgcg gtctgtgtaa atctcaatgt acaaatattc 8820tgtcnctggt
ttataaacat tttgataaaa ccgaaaaaaa aaaaaaaaaa aaaaaaaaaa 8880aaa
8883
SEQ ID NO: 6; length: 8262; type: DNA; organism: homo sapiens; misc_feature (1)..(8262): 'n' can be any nucleotide 'a', 'c', 'g' or 't'.
6 atgaaggtaa aaggcagagg aatcacctgc ttgctggtct cctttgctgt gatctgcctg
60gtcgccaccc ctgggggcaa ggcctgtcct cgccgctgtg cctgttatat gcctacggag
120gtacactgca catttcggta cctgacttcc atcccagaca gcatcccgcc caatgtggaa
180cgcatcaatt taggatacaa cagcttggtt agattgatgg aaacagattt ttctggcctg
240accaaactgg agttactcat gcttcacagc aatggcattc acacaatccc tgacaagacc
300ttctcagatt tgcaggcctt gcaggtctta aaaatgagct ataataaagt ccgaaaactt
360cagaaagata ctttttatgg cctcaggagc ttgacacgat tgcacatgga ccacaacaat
420attgagttta taaacccaga ggttttttat gggctcaact ttctccgcct ggtgcacttg
480gaaggaaatc agctcactaa gctccaccca gatacatttg tctctttgag ctacctccag
540atatttaaaa tctctttcat taagttccta tacttgtctg ataacttcct gacctccctc
600cctcaagaga tggtctccta tatgcctgac ctagacagcc tttacctgca tggaaaccca
660tggacctgtg attgccattt aaagtggttg tctgactgga tacaggnnnn nccagatgta
720ataaaatgca aaaaagatag aagtccctct agtgctcagc agtgtccact ttgcatgaac
780cctaggactt ctaaaggcaa gccgttagct atggtctcag ctgcagcttt ccagtgtgcc
840aagccaacca ttgactcatc cctgaaatca aagagcctga ctattctgga agacagtagt
900tctgctttca tctctcccca aggtttcatg gcaccctttg gctccctcac tttgaatatg
960acagatcagt ctggaaatga agctaacatg gtctgcagta ttcaaaagcc ctcaaggaca
1020tcacccattg cattcactga agaaaatgac tacatcgtgc taaatacttc attttcaaca
1080tttttggtgt gcaacataga ttacggtcac attcagccag tgtggcaaat tttggctttg
1140tacagtgatt ctcctctgat actagaaagg agccacttgc ttagtgaaac accgcagctc
1200tattacaaat ataaacaggt ggctcctaag cctgaagaca tttttaccaa catagaggca
1260gatctcagag cagatccctc ttggttaatg caagaccaaa tttccttgca gctgaacaga
1320actgccacca cattcagtac attacagatc cagtactcca gtgatgctca aatcacttta
1380ccaagagcag agatgaggcc agtgaaacac aaatggacta tgatttcaag ggataacaat
1440actaagctgg aacatactgt cttggtaggt ggaaccgttg gcctgaactg cccaggccaa
1500ggagacccca ccccacacgt ggattggctt ctagctgatg gaagtaaagt gagagcccct
1560tatgtcagtg aggatggacg gatcctaata gacaaaagtg gaaaattgga actccagatg
1620gctgatagtt ttgacacagg cgtatatcac tgtataagca gcaattatga tgatgcagat
1680attctcacct ataggataac tgtggtagaa cctttggtcg aagcctatca ggaaaatggg
1740attcatcaca cagttttcat tggtgaaaca cttgatcttc catgccattc tactggtatc
1800ccagatgcct ctattagctg ggttattcca ggaaacaatg tgctctatca gtcatcaaga
1860gacaagaaag ttctaaacaa tggcacatta agaatattac aggtcacccc gaaagaccaa
1920ggttattatc gctgtgtggc agccaaccca tcaggggttg attttttgat tttccaagtt
1980tcagtcaaga tgaaaggaca aaggcccttg gagcatgatg gagaaacaga gggatctgga
2040cttgatgagt ccaatcctat tgctcatctt aaggagccac caggtgcaca actccgtaca
2100tctgctctga tggaggctga ggttggaaaa cacacctcaa gcacaagtaa gaggcacaac
2160tatcgggaat taacactcca gcgacgtgga gattcaacac atcgacgttt tagggagaat
2220aggaggcatt tccctccctc tgctaggaga attgacccac aacattgggc ggcactgttg
2280gagaaagcta aaaagaatgc tatgccagac aagcgagaaa ataccacagt gagcccaccc
2340ccagtggtca cccaactccc aaacatacct ggtgaagaag acgattcctc aggcatgctc
2400gctctacatg aggaatttat ggtcccggcc actaaagctt tgaaccttcc agcaaggaca
2460gtgactgctg actccagaac aatatctgat agtcctatga caaacataaa ttatggcaca
2520gaactctccg ttgtgaattc acaaatacta ccacctgaag aacccacaga tttcaaactg
2580tctactgcta ttaaaactac agccatgtca aagaatataa acccaaccat gtcaagccaa
2640atacaaggca caaccaatca acattcatcc actgtctttc cactgctact tggagcaact
2700gaatttcagg actctgacag agggaagagg aagagagcat ttccagtaac ccccaataac
2760agtaaggact atgatcaaag atgntcaatg tcaaanatgc ttagtagcac caccaacaaa
2820ctattattag agtcagtaaa taccacaaat agtcatcaga catctgtaag agaagtgagt
2880gaacccaggc acaatcactt ctattctcac actactcaaa tacttagcac ctccacgttc
2940ccttcagatc cacacacagc tgctcattct cagtttccga tccctagann naatagtaca
3000gttaacatcc cgctgttcag acgctttggg aggcagagga aaattggcgg aagggggcgg
3060attatcagcc catatagaac tccagttctg cgacggcata gatacagcat tttcaggtca
3120acaaccagag gttcttctga aaaaagcact actgcattct cagccacagt gctcaatgtg
3180acatgtctgt cctgtcttcc cagggagagg ctcaccactg ccacagcagc attgtctttt
3240ccaagtgctg ctcccatcac cttccccaaa gctgacattg ctagagtccc atcagaagag
3300tctacaactc tagtccagaa tccactatta ctacttgaga acaaacccag tgtagannnn
3360gaaannacaa cacccacaat aaaatattca ggactngaaa tttcccaagt gactccaact
3420ggtgcagtca tgacatatgc tccaacatcc atacccatgg aaaaaactca caaagtaaac
3480gccagttacc cacgtgtgtc tagcaccaat gaagctaaaa gagattcagt gattacatcg
3540tcactttcag gtgctatcac caagccacca atgactatta tagccattac aaggttttca
3600agaaggaaaa ttccctggca acagaacttt gtaaataacc ataacccaaa aggcagatta
3660aggaatcaac ataaagttag tttacaaaaa agcacagctg tgatgcttcc taaaacatct
3720cctgctttac cacagagaca aagttcccct ttccatttca ccacactttc aacaagtgtg
3780atgcaaattc catctaatac cttgactacc gctcaccaca ctacgaccaa aacacacaat
3840cctggaagtc ttccaacaaa gaaggagctt cccttcccac cccttaaccc tatgcttcct
3900agtattataa gcaaagactc aagtacaaaa agcatcatat caacgcaaac agcaaccgca
3960acaactccta ccttccctgc atctgtcatc acttatgaaa cccaaacaga gagatctaga
4020gcacaaacaa tacaaagaga aggacctcaa aagaagaaca ggactgaccc aaacatctct
4080ccagaccaga gttctggctt cactacaccc actgctatga cnacctcctn ngctctnnnn
4140gcattcactc attccccacc agaaaacaca actgggattt caagcacaat cagttttcat
4200tcaagaactc ttaatctgac agatgtgatt gaagaactag cccaagcaag tactcagact
4260ttgaagagca caattgcttc tgaaacaact ttgtccagca aatcacacca gagtaccaca
4320actaggaaag catcattaga cactcaacca ccaccattct tgagcagcag tgctactcta
4380atgccagttc ccatctcccc tccctttact cagagagcag ttactgacaa cgtggcgact
4440cccatttccg ggcttatgac aaatacagtg gtcaagctgc acgaatcctc aaggcacaat
4500ccnnnnnnnc aaatgccaag ttcacnnaat tgngaaccnn nnactcnnnn nacttcatct
4560acntctaatc tgttacattc tactcccatg ccagcactaa caacagttaa atcacagaat
4620tccaaattaa ctccatctcc ctgggcagaa taccaatttt ggcacaaacc atactcagac
4680attgctgaaa aaggcaaaaa gccagaagta agcatgttgg ctactacagg cctgtccgag
4740gccaccactc ttgtttcaga ttgggatgga cagaagaaca caaagaagag tgactttgat
4800aagaaaccag ttcaagaagc aacaacttcc aaactccttc cctttgactc tttgtctagg
4860tatatatttg aaaagcccag gatagttgga ggaaaagctg caagttttac tattccagct
4920aactcagatg cctttcttcc ctgtgaagct gttggaaatc ccctgcccac cattcattgg
4980accagagtnn nntcaggact tgatttatct aagaggaaac agaatagcag ggtccaggtt
5040ctccccaatg gtaccctgtc catccagagg gtggaaattc aggaccgcgg acagtacttg
5100tgttccgcat ccaatctgtt tggcacagac caccttcatg tcaccttgtc tgtggtttcc
5160tatcctccca ggatcctgga gagacgtacc aaagagatca cagttcattc cggaagcact
5220gtggaactga agtgcagagc agaaggtagg ccaagcccta cagttacctg gattcttgca
5280aaccaaacag ttgtctcaga atcatcccag ggaagtaggc aggctgtggt gacggttgac
5340ggaacattgg tcctccacaa tctcagtatt tatgaccgtg gcttttacaa atgtgtggcc
5400agcaacccag gtggccagga ttcactgctg gttaaaatac aagtcattgc agcaccacct
5460gttattctag agcaaaggag gcaagtcatt gtaggcactt ggggtgaaag tttaaaactg
5520ccctgtactg caaaaggaac tcctcagccc agcgtttact gggtcctctc tgatggcact
5580gaagtgaaac cattacagtt taccaattcc aagttgttct tattttcaaa tgggactttg
5640tatataagaa acctagcctc ttcagacagg ggcacttatg aatgcattgc taccagttcc
5700actggttcgg agcgaagagt agtaatgctt acaatggaag agcgagtgac cagccccagg
5760atagaagctg catcccagaa aaggactgaa gtgaattttg gggacaaatt actactgaac
5820tgctcagcca ctggggagcc caaaccccaa ataatgtgga ggttaccatc caaggctgtg
5880gtcgaccagt gggcagctgg atccacgtct accctaatgg atccctgttt attggatcag
5940taacagaaaa agacagtggt gtctacttgt gtgtggcaag aaacaaaatg ggggatgatc
6000tgatactgat gcatgttagc ctaagactga aacctgccaa aattgaccac aagcagtatt
6060ttagaaagca agtgctccat gggaaagatt tccaagtaga ttgcaaagct tccggctccc
6120cagtgccaga gatatcttgg agtttgcctg atggaaccat gatcaacaat gcaatgcaag
6180ccgatgacag tggccacagg actaggagat ataccctttt caacaatgga actttatact
6240tcaacaaagt tggggtagcg gaggaaggag attatacttg ctatgcccag aacaccctag
6300ggaaagatga aatgaaggtc cacttaacag ttataacagc tgctccccgg ataaggcaga
6360gtaacaaaac caacaagaga atcaaagctg gagacacagc tgtccttgac tgtgaggtca
6420ctggggatcc caaaccaaaa atattttggt tgctgccttc caatgacatg atttccttct
6480ccattgatag gtacacattt catgccaatg ggtctttgac catcaacaaa gtgaaactgc
6540tcgattctgg agagtacgta tgtgtagccc gaaatcccag tggggatgac accaaaatgt
6600acaaactgga tgtggtctct aaacctccat taatcaatgg tctgtataca aacagaactg
6660ttattaaagc cacagctgtg agacattcca aaaaacactt tgactgcaga gctgaaggga
6720caccatctcc tgaagtcatg tggatcatgc cagacaatat tttcctcaca gccccatact
6780atggaagcag aatcacagtc cataaaaatg gaaccttgga aattaggaat gtgaggcttt
6840cagattcagc cgactttatc tgtgtggccc gaaatgaagg tggagagagc gtgttggtag
6900tacagttaga agtactggaa atgctgagaa gaccgacatt tagaaatcca tttaatgaaa
6960aaatagttgc ccagctggga aagtccacag cattgaattg ctctgttgat ggtaacccac
7020cacctgaaat aatctggatt ttaccaaatg gcacacgatt ttccaatgga ccacaaagtt
7080atcagtatct gatagcaagc aatggttctt ttatcatttc taaaacaact cgggaggatg
7140caggaaaata tcgctgtgca gctaggaata aagttggcta tattgagaaa ttagtcatat
7200tagaaattgg ccagaagcca gttattctta cctatgcacc agggacagta aaaggcatca
7260gtggagaatc tctatcactg cattgtgtgt ctgatggaat ccctaagcca aatatcaaat
7320ggactatgcc aagtggttat gtagtagaca ggcctcaaat taatgggaaa tacatattgc
7380atgacaatgg caccttagtc attaaagaag caacagctta tgacagagga aactatatct
7440gtaaggctca aaatagtgtt ggtcatacac tgattactgt tccagtaatg attgtagcct
7500accctccccg aattacaaat cgtccaccca ggagtattgt caccaggaca ggggcagcct
7560ttcagctcca ctgtgtggcc ttgggagttc ccaagccaga aatcacatgg gagatgcctg
7620accactccct tctctcaacg gcaagtaaag agaggacaca tggaagtgag cagcttcact
7680tacaaggtac cctagtcatt cagaatcccc aaacctccga ttctgggata tacaaatgca
7740cagcaaagaa cccacttggt agtgattatg cagcaacgta tattcaagta atctgacatg
7800aaataataaa gtcaacaaca tctgggcaga atttattttt tggaagaagt ttaatcaaag
7860gcagccatag gcatgtaaat gaatttgaat acatttacag tattaaattt acaatgaaca
7920tgcaaaataa aaggacttgt aaataaatgc attatgaact gatgatactg atttatttaa
7980tggatctcaa aacaaacttt taacttaagg cacttttatt ttgccaacaa ataacaataa
8040acaaacattg aaacggttca ctataaaata acaaatggct aatgtacctg aatttttcag
8100taaaaaaatg aacttctaat accagttgcc tagtgtccac ctcctatcaa tgttacaagc
8160atggcactca gaacagagac aatggaaaat attaaatctg caatctatgt ataaatattt
8220tgtggtttat aaattttttt gctaaaacct acagaaaata ag
8262
7 8883 DNA Rattus species misc_feature (1)..(8916) 'n' can be any nucleotide 'a', 'c', 'g' or 't'.
7 cgagagacga cagaaggtta cggctgcgag
aagacgacag aagggtccag aaaaaggaaa 60gtgctggagg ggagtgggga caaaagcagc
gaccaagtga atgtcacttc agtgactgag 120gccaggcaaa acgcgcggga aggattttgt
gtagcttggg accctttcat agacactgat 180gacacgttta cgcaaaatag aaatttgagg
agaaacgcct gggccttcgg aaaggagtga 240ttgattagta cttgcaagtt taggtgactt
taaggagaac taactaatgt atactattga 300gggaggagga agagcattac agagtttcca
gcagcagcag gaaagctttg gttaatttgg 360aaatggatga tagcattaaa ataacagaag
cgcctccagg tctctgaagc ttcagtcccc 420cagctgaaag ccagaaaaga ctaagcccac
taagcctttt gatccctttg gaagcaaaga 480actttccttc cctggggtga agactctcct
cagaagattt cctgtctctg cctatgttac 540aagaggaatc aaaaccaaga cagaagagct
caggatgcag gtgagaggca gggaagtcag 600cggcttgttg atctccctca ctgctgtctg
cctggtggtc acccctggga gcagggcctg 660tcctcgccgc tgtgcctgct atgtgcccac
agaggtgcac tgtacatttc ggtacctgac 720ctccatccca gatggcatcc cggccaatgt
ggaacgaata aatttaggat ataacagcct 780tactagattg acagaaaacg actttgatgg
cctgagcaaa ctggagttac tcatgctgca 840cagtaatggc attcacagag tcagtgacaa
gaccttctcg ggcttgcagt ccttgcaggt 900cttaaaaatg agctataaca aagtccaaat
cattcggaag gatactttct acggactcgg 960gagcttggtc cggttgcacc tggatcacaa
caacattgaa ttcatcaacc ctgaggcctt 1020ttatggactt acctcgctcc gcttggtaca
tttagaagga aaccggctca caaagctcca 1080tccagacaca tttgtctcat taagctatct
ccagatattt aaaacctctt tcattaagta 1140cctgttcttg tctgataact tcctgacctc
cctcccaaaa gaaatggtct cctacatgcc 1200aaacctagaa agcctgtatt tgcatggaaa
cccatggacc tgtgactgcc atttaaagtg 1260gttgtctgag tggatgcagg gaaacccaga
tataataaaa tgcaagaaag acagaagctc 1320ttccagtcct cagcaatgtc ccctttgcat
gaaccccagg atctctaaag gcagaccctt 1380tgctatggta ccatctggag ctttcctatg
tacaaagcca accattgatc catcactgaa 1440gtcaaagagc ctggttactc aggaggacaa
tggatctgcc tccacctcac ctcaagattt 1500catagaaccc tttggctcct tgtctttgaa
catgacanan ntntctggaa ataaggccga 1560catggtctgt agtatccaaa agccatcaag
gacatcacca actgcattca ctgaagaaaa 1620tgactacatc atgctaaatg cgtcattttc
cacaaatctt gtgtgcagtg tagattataa 1680tcacatccag ccagtgtggc aacttctggc
tttatacagt gactctcctc tgatactaga 1740aaggaagccc cagcttaccg agactccttc
actgtcttct agatataaac aggtggctct 1800taggcctgaa gacattttta ccagcataga
ggctgatgtc agagcagacc ctttttggtt 1860ccaacaagaa aaaattgtct tgcagctgaa
cagaactgcc accacactta gcacattaca 1920gatccagttt tccactgatg ctcaaatcgc
tttaccaagg gcggagatga gagcggagag 1980actcaaatgg accatgatcc tgatgatgaa
caatcccaaa ctggaacgca ctgtcctggt 2040tggcggcact attgccctga gctgtccagg
caaaggcgac ccttcacctc acttggaatg 2100gcttctagct gatgggagta aagtgagagc
cccttacgtt agcgaggatg ggcgaatcct 2160aatagacaaa aatgggaagt tggaactgca
gatggctgac agctttgatg caggtcttta 2220ccactgcata agcaccaatg atgcagatgc
ggatgttctc acatacagga taactgtggt 2280agagccctat ggagaaagca cacatgacag
tggagtccag cacacagtgg ttacgggtga 2340gacgctcgac cttccatgcc tttccacggg
tgttccagat gcttctatta gctggattct 2400tccagggaac actgtgttct ctcagccatc
aagagacagg caaattctta acaatgggac 2460cttaagaata ttacaggtta cgccaaaaga
tcaaggtcat taccaatgtg tggctgccaa 2520cccatcaggg gccgactttt ccagttttaa
agtttcagtt caaaagaaag gccaaaggat 2580ggttgagcat gacagggagg caggtggatc
tggacttgga gaacccaact ccagtgtttc 2640ccttaagcag ccagcatctt tgaaactctc
tgcatcagct ttgacagggt cagaggctgg 2700aaaacaagtc tccggtgtac ataggaagaa
caaacataga gacttaatac atcggcggcg 2760tggggattcc acgctccggc gattcaggga
gcataggagg cagctccctc tctctgctcg 2820gagaattgac ccgcaacgct gggcagcact
tctagaaaaa gccaaaaaga attctgtgcc 2880aaaaaagcaa gaaaatacca cagtaaagcc
agtgccactg gctgttcccc tcgtggaact 2940cactgacgag gaaaaggatg cctctggcat
gattcctcca gatgaagaat tcatggttct 3000gaaaactaag gcttctggtg tcccaggaag
gtcaccaact gctgactctg gaccagtaaa 3060tcatggtttt atgacgagta tagcttctgg
cacagaagtc tcaactgtga atccacaaac 3120actacaatct gagcaccttc ctgatttcaa
attatttagt gtaacaaacg gtacagctgt 3180gacaaagagt atgaacccat ccatagcaag
caaaatagaa gatacaacca accaaaaccc 3240aatcattatc tttccatcag tagctgaaat
tcgagattct gctcaggcag gaagagcatc 3300ttcccaaagt gcacaccctg taacaggggg
aaacatggct acctatggcc ataccaacac 3360atatagtagc tttaccagca aagccagtac
agtcttgcag ccaataaatc caacagaaag 3420ttatggacct cagataccta ttacaggagt
cagcagacct agcagtagtg acatctcttc 3480tcacactact gcagacccta gcttctccag
tcacccttca ggttcacaca ccactgcctc 3540gtctttattt cacattccta gaaacaacaa
tacaggtaac ttccccttgt ccaggcactt 3600gggaagagag aggacaattt ggagcagagg
gagagttaaa aacccacata gaaccccagt 3660tctccgacgg catagacaca ggactgtgag
gccagcaatc aagggacctg ctaacaaaaa 3720tgtgagccaa gttccagcca cagagtaccc
tgggatgtgc cacacatgtc cttccgcaga 3780ggggctcaca gtggctactg cagcactgtc
agttccaagt tcatcccaca gtgccctccc 3840caaaactaat aatgttgggg tcatagcaga
agagtctacc actgtggtca agaaaccact 3900gttactattt aaggacaaac aaaatgtaga
tattgagata ataacaacca ctacaaaata 3960ttccggaggg gaaagtaacc acgtgattcc
tacggaagca agcatgactt ctgctccaac 4020atctgtatcc ctggggaaat ctcctgtaga
caatagtggt cacctgagca tgcctgggac 4080catccaaact gggaaagatt cagtggaaac
aacaccactt cccagccccc tcagcacacc 4140ctcaatacca acaagcacaa aattctcaaa
gaggaaaact cccttgcacc agatctttgt 4200aaataaccag aagaaggagg ggatgttaaa
gaatccatat caattcggtt tacaaaagaa 4260cccagccgca aagcttccca aaatagctcc
tcttttaccc acaggtcaga gttccccctc 4320agattctaca actctcttga caagtccgcc
accagctctg tctacaacaa tggctgccac 4380tcagaacaag ggcactgaag tagtatcagg
tgccagaagt ctctcagcag ggaagaagca 4440gcccttcacc aactcctctc cagtgcttcc
tagcaccata agcaagagat ctaatacatt 4500aaacttcttg tcaacggaaa cccccacagt
gacaagtcct actgctactg catctgtcat 4560tatgtctgaa acccaacgaa caagatccaa
agaagcaaaa gaccaaataa aggggcctcg 4620gaagaacaga aacaacgcaa acaccacccc
caggcaggtt tctggctata gtgcatactc 4680agctctaaca acagctgata cccccttggc
tttcagtcat tccccacgac aagatgatgg 4740tggaaatgta agtgcagttg cttatcactc
aacaacctct cttctggcca taactgaact 4800gtttgagaag tacacccaga ctttgggaaa
tacaacagct ttggaaacaa cgttgttgag 4860caaatcacag gagagtacca cagtgaaaag
agcctcagac acaccaccac cactcctcag 4920cagtggggcg cccccagtgc ccactccttc
cccacctcct tttactaagg gtgtggttac 4980agacagcaaa gtcacatcag ctttccagat
gacgtcaaat agagtggtca ccatatatga 5040atcttcaagg cacaatacag atctgcagca
accctcagca gaggctagcc ccaatcctga 5100gatcataact ggaaccactg actctccctc
taatctgttt ccatccactt ctgtgccagc 5160actaagggta gataaaccac agaattctaa
atggaagccc tctccctggc cagaacacaa 5220atatcagctc aagtcatact ccgaaaccat
tgagaagggc aaaaggccag cagtaagcat 5280gtccccccac ctcagccttc cagaggccag
cactcatgcc tcacactgga atacacagaa 5340gcatgcagaa aagagtgttt ttgataagaa
acctggtcaa aacccaactt ccaaacatct 5400gccttacgtc tctctaccta agactctatt
gaaaaagcca agaataattg gaggaaaggc 5460tgcaagcttt acagttccag ctaattcaga
cgtttttctt ccttgtgagg ctgttggaga 5520cccactgccc atcatccact ggaccagagt
ttcatcagga nttgaaatat cccaagggac 5580acagaaaagc cggttccacg tgcttcccaa
tggcaccttg tccatccaga gggtcagtat 5640tcaggaccgt ggacagtacc tgtgctctgc
atttaatcca ctgggcgtag accattttca 5700tgtctctttg tctgtggttt tttacccggc
aaggattttg gacagacatg tcaaggagat 5760cacagttcac tttggaagta ctgtggaact
aaagtgcaga gtggagggta tgccgaggcc 5820tacggtttcc tggatacttg caaaccaaac
ggtggtctca gaaacggcca agggaagcag 5880aaaggtctgg gtaacacctg atggaacatt
gatcatctat aatctgagtc tttatgatcg 5940tggtttttac aagtgtgtgg ccagcaaccc
atctggccag gattcactgt tggttaagat 6000acaagtcatc acagctcccc ctgtcattat
agagcaaaag aggcaagcca tcgttggggt 6060tttaggtgga agtttgaaac tgccctgcac
tgcaaaagga actccccagc ctagtgttca 6120ctgggtcctt tatgatggga ctgaactaaa
accattgcag ttgactcatt ccagattttt 6180cttgtatcca aatggaactc tgtatataag
aagcatcgct ccttcagtga ggggcactta 6240tgagtgcatt gccaccagct cctcaggctc
agagagaagg gtagtgattc ttactgtgga 6300agagggagag acaatcccca ggatagaaac
tgcctctcag aaatggactg aggtgaattt 6360gggtgagaaa ttactactga actgctcagc
tactggggat ccaaagccta gaataatctg 6420gaggctgcca tccaaggctg tcatcgacca
gtggcacaga atgggcagcc gaatccacgt 6480ctacccaaat ggatccttgg tggttgggtc
agtgacggaa aaagacgctg gtgactactt 6540atgtgtggca agaaacaaaa tgggagatga
cctagtcctg atgcatgtcc gcctgagatt 6600gacacctgcc aaaattgaac agaagcagta
ttttaagaag caagtgctcc atgggaaaga 6660tttccaagtt gactgcaagg cctctggctc
ccctgtgcct gaggtatcct ggagtttgcc 6720tgatgggaca gtgctcaaca atgtagccca
agctgatgac agtggctata ggaccaagag 6780gtacaccctt ttccacaatg gaaccttgta
tttcaacaac gttgggatgg cagaggaagg 6840agattatatc tgctctgccc agaacacctt
agggaaagat gagatgaaag tccacctaac 6900agttctaaca gccatcccac ggataaggca
aagctacaag accaccatga ggctcagggc 6960tggagaaaca gctgtccttg actgcgaggt
cactggggaa ccgaagccca atgtattttg 7020gttgctgcct tccaacaatg tcatttcatt
ctccaatgac aggttcacat ttcatgccaa 7080tagaactttg tccatccata aagtgaaacc
acttgactct ggggactatg tgtgcgtagc 7140tcagaatcct agtggggatg acactaagac
atacaaactg gacattgtct ctaaacctcc 7200attaatcaat ggcctgtatg caaacaagac
tgttattaaa gccacagcca ttcggcactc 7260caaaaaatac tttgactgca gagcagatgg
gatcccatct tcccaggtca cgtggattat 7320gccaggcaat attttcctcc cagctccata
ctttggaagc agagtcacgg tccatccaaa 7380tggaaccttg gagatgagga acatccggct
ttctgactct gcggacttca cctgtgtggt 7440tcggagcgag ggaggagaga gtgtgttggt
agtgcagtta gaagtcctag aaatgctgag 7500aagaccaaca ttcagaaacc cattcaacga
aaaagtcatc gcccaagctg gcaagcccgt 7560agcactgaac tgctctgtgg atgggaaccc
cccacctgaa attacctgga tcttacctga 7620cggcacacag tttgctaaca gaccacacaa
ttccccgtat ctgatggcag gcaatggctc 7680tctcatcctt tacaaagcaa ctcggaacaa
gtcagggaag tatcgctgtg cagccaggaa 7740taaggttggc tacatcgaga aactcatcct
gttagagatt gggcagaagc cagtcattct 7800gacatacgaa ccagggatgg tgaagagcgt
cagtggggaa ccgttatcac tgcattgtgt 7860gtctgatggg atccccaagc caaatgtcaa
gtggactaca ccgggtggcc atgtaatcga 7920caggcctcaa gtggatggaa aatacatact
gcatgaaaat ggcacgctgg tcatcaaagc 7980aacaacagct cacgaccaag gaaattatat
ctgtagggct caaaacagtg ttggccaggc 8040agttattagc gtgtcagtga tggttgtggc
ctaccctccc cgaatcataa actacctacc 8100caggaacatg ctcaggagga caggggaagc
catgcagctc cactgtgtgg ccttgggaat 8160ccccaagcca aaagtcacct gggagacgcc
aagacactcc ctgctctcaa aagcaacagc 8220aagaaaaccc catagaagtg agatgcttca
cccacaaggt acgctggtca ttcagaatct 8280ccaaacctcg gattccggag tctataagtg
cagagctcag aacctacttg ggactgatta 8340cgcaacaact tacatccagg tactctgaca
ggaaggggga gactaaaatt caacagaagt 8400ccacatccac agggtttatt ttttggaaga
agtttaatca aaggcagcca taggcatgta 8460aatgagtctg aatacattta cagtattaaa
tttacaatgg acatgcgatg agacttgtaa 8520atgaaagcat tgtgaactga aaccgagtct
ctgtggatct caaagcaaac tcttaactta 8580aggcactttg attttgccaa caaataataa
caaacattaa gagaaaaaaa tgatccacta 8640cgaaataaca aacggctaat gcacctgaat
tctcagtaaa aagacctttc tctcgctaac 8700agttgccagc tgcctcgtgt ctgtttccta
ccaatgtcac aaacatcgca cacagggtga 8760atggagtcaa cgggaaagat taagtttgcg
gtctgtgtaa atctcaatgt acaaatattc 8820tgtcnctggt ttataaacat tttgataaaa
ccgaaaaaaa aaaaaaaaaa aaaaaaaaaa 8880aaa
8883
8 8180 DNA Homo sapiens misc_feature (1)..(8180) 'n' can be any nucleotide 'a', 'c', 'g' or 't'.
8 tcacctgctt gctggtctcc tttgctgtga tctgcctggt cgccacccct gggggcaagg
60cctgtcctcg ccgctgtgcc tgttatatgc ctacggaggt acactgcaca tttcggtacc
120tgacttccat cccagacagc atcccgccca atgtggaacg catcaattta ggatacaaca
180gcttggttag attgatggaa acagattttt ctggcctgac caaactggag ttactcatgc
240ttcacagcaa tggcattcac acaatccctg acaagacctt ctcagatttg caggccttgc
300aggtcttaaa aatgagctat aataaagtcc gaaaacttca gaaagatact ttttatggcc
360tcaggagctt gacacgattg cacatggacc acaacaatat tgagtttata aacccagagg
420ttttttatgg gctcaacttt ctccgcctgg tgcacttgga aggaaatcag ctcactaagc
480tccacccaga tacatttgtc tctttgagct acctccagat atttaaaatc tctttcatta
540agttcctata cttgtctgat aacttcctga cctccctccc tcaagagatg gtctcctata
600tgcctgacct agacagcctt tacctgcatg gaaacccatg gacctgtgat tgccatttaa
660agtggttgtc tgactggata caggnnnnnc cagatgtaat aaaatgcaaa aaagatagaa
720gtccctctag tgctcagcag tgtccacttt gcatgaaccc taggacttct aaaggcaagc
780cgttagctat ggtctcagct gcagctttcc agtgtgccaa gccaaccatt gactcatccc
840tgaaatcaaa gagcctgact attctggaag acagtagttc tgctttcatc tctccccaag
900gtttcatggc accctttggc tccctcactt tgaatatgac agatcagtct ggaaatgaag
960ctaacatggt ctgcagtatt caaaagccct caaggacatc acccattgca ttcactgaag
1020aaaatgacta catcgtgcta aatacttcat tttcaacatt tttggtgtgc aacatagatt
1080acggtcacat tcagccagtg tggcaaattt tggctttgta cagtgattct cctctgatac
1140tagaaaggag ccacttgctt agtgaaacac cgcagctcta ttacaaatat aaacaggctt
1200ggttaatgca agaccaaatt tccttgcagc tgaacagaac tgccaccaca ttcagtacat
1260tacagatcca gtactccagt gatgctcaaa tcactttacc aagagcagag atgaggccag
1320tgaaacacaa atggactatg atttcaaggg ataacaatac taagctggaa catactgtct
1380tggtaggtgg aaccgttggc ctgaactgcc caggccaagg agaccccacc ccacacgtgg
1440attggcttct agctgatgga agtaaagtga gagcccctta tgtcagtgag gatggacgga
1500tcctaataga caaaagtgga aaattggaac tccagatggc tgatagtttt gacacaggcg
1560tatatcactg tataagcagc aattatgatg atgcagatat tctcacctat aggataactg
1620tggtagaacc tttggtcgaa gcctatcagg aaaatgggat tcatcacaca gttttcattg
1680gtgaaacact tgatcttcca tgccattcta ctggtatccc agatgcctct attagctggg
1740ttattccagg aaacaatgtg ctctatcagt catcaagaga caagaaagtt ctaaacaatg
1800gcacattaag aatattacag gtcaccccga aagaccaagg ttattatcgc tgtgtggcag
1860ccaacccatc aggggttgat tttttgattt tccaagtttc agtcaagatg aaaggacaaa
1920ggcccttgga gcatgatgga gaaacagagg gatctggact tgatgagtcc aatcctattg
1980ctcatcttaa ggagccacca ggtgcacaac tccgtacatc tgctctgatg gaggctgagg
2040ttggaaaaca cacctcaagc acaagtaaga ggcacaacta tcgggaatta acactccagc
2100gacgtggaga ttcaacacat cgacgtttta gggagaatag gaggcatttc cctccctctg
2160ctaggagaat tgacccacaa cattgggcgg cactgttgga gaaagctaaa aagaatgcta
2220tgccagacaa gcgagaaaat accacagtga gcccaccccc agtggtcacc caactcccaa
2280acatacctgg tgaagaagac gattcctcag gcatgctcgc tctacatgag gaatttatgg
2340tcccggccac taaagctttg aaccttccag caaggacagt gactgctgac tccagaacaa
2400tatctgatag tcctatgaca aacataaatt atggcacaga actctccgtt gtgaattcac
2460aaatactacc acctgaagaa cccacagatt tcaaactgtc tactgctatt aaaactacag
2520ccatgtcaaa gaatataaac ccaaccatgt caagccaaat acaaggcaca accaatcaac
2580attcatccac tgtctttcca ctgctacttg gagcaactga atttcaggac tctgacagag
2640ggaagaggaa gagagcattt ccagtaaccc ccaataacag taaggactat gatcaaagat
2700gntcaatgtc aaanatgctt agtagcacca ccaacaaact attattagag tcagtaaata
2760ccacaaatag tcatcagaca tctgtaagag aagtgagtga acccaggcac aatcacttct
2820attctcacac tactcaaata cttagcacct ccacgttccc ttcagatcca cacacagctg
2880ctcattctca gtttccgatc cctagannna atagtacagt taacatcccg ctgttcagac
2940gctttgggag gcagaggaaa attggcggaa gggggcggat tatcagccca tatagaactc
3000cagttctgcg acggcataga tacagcattt tcaggtcaac aaccagaggt tcttctgaaa
3060aaagcactac tgcattctca gccacagtgc tcaatgtgac atgtctgtcc tgtcttccca
3120gggagaggct caccactgcc acagcagcat tgtcttttcc aagtgctgct cccatcacct
3180tccccaaagc tgacattgct agagtcccat cagaagagtc tacaactcta gtccagaatc
3240cactattact acttgagaac aaacccagtg tagannnnga aannacaaca cccacaataa
3300aatattcagg actngaaatt tcccaagtga ctccaactgg tgcagtcatg acatatgctc
3360caacatccat acccatggaa aaaactcaca aagtaaacgc cagttaccca cgtgtgtcta
3420gcaccaatga agctaaaaga gattcagtga ttacatcgtc actttcaggt gctatcacca
3480agccaccaat gactattata gccattacaa ggttttcaag aaggaaaatt ccctggcaac
3540agaactttgt aaataaccat aacccaaaag gcagattaag gaatcaacat aaagttagtt
3600tacaaaaaag cacagctgtg atgcttccta aaacatctcc tgctttacca cagagacaaa
3660gttccccttt ccatttcacc acactttcaa caagtgtgat gcaaattcca tctaatacct
3720tgactaccgc tcaccacact acgaccaaaa cacacaatcc tggaagtctt ccaacaaaga
3780aggagcttcc cttcccaccc cttaacccta tgcttcctag tattataagc aaagactcaa
3840gtacaaaaag catcatatca acgcaaacag caaccgcaac aactcctacc ttccctgcat
3900ctgtcatcac ttatgaaacc caaacagaga gatctagagc acaaacaata caaagagaag
3960gacctcaaaa gaagaacagg actgacccaa acatctctcc agaccagagt tctggcttca
4020ctacacccac tgctatgacn acctcctnng ctctnnnngc attcactcat tccccaccag
4080aaaacacaac tgggatttca agcacaatca gttttcattc aagaactctt aatctgacag
4140atgtgattga agaactagcc caagcaagta ctcagacttt gaagagcaca attgcttctg
4200aaacaacttt gtccagcaaa tcacaccaga gtaccacaac taggaaagca tcattagaca
4260ctcaaccacc accattcttg agcagcagtg ctactctaat gccagttccc atctcccctc
4320cctttactca gagagcagtt actgacaacg tggcgactcc catttccggg cttatgacaa
4380atacagtggt caagctgcac gaatcctcaa ggcacaatcc nnnnnnncaa atgccaagtt
4440cacnnaattg ngaaccnnnn actcnnnnna cttcatctac ntctaatctg ttacattcta
4500ctcccatgcc agcactaaca acagttaaat cacagaattc caaattaact ccatctccct
4560gggcagaata ccaattttgg cacaaaccat actcagacat tgctgaaaaa ggcaaaaagc
4620cagaagtaag catgttggct actacaggcc tgtccgaggc caccactctt gtttcagatt
4680gggatggaca gaagaacaca aagaagagtg actttgataa gaaaccagtt caagaagcaa
4740caacttccaa actccttccc tttgactctt tgtctaggta tatatttgaa aagcccagga
4800tagttggagg aaaagctgca agttttacta ttccagctaa ctcagatgcc tttcttccct
4860gtgaagctgt tggaaatccc ctgcccacca ttcattggac cagagtnnnn tcaggacttg
4920atttatctaa gaggaaacag aatagcaggg tccaggttct ccccaatggt accctgtcca
4980tccagagggt ggaaattcag gaccgcggac agtacttgtg ttccgcatcc aatctgtttg
5040gcacagacca ccttcatgtc accttgtctg tggtttccta tcctcccagg atcctggaga
5100gacgtaccaa agagatcaca gttcattccg gaagcactgt ggaactgaag tgcagagcag
5160aaggtaggcc aagccctaca gttacctgga ttcttgcaaa ccaaacagtt gtctcagaat
5220catcccaggg aagtaggcag gctgtggtga cggttgacgg aacattggtc ctccacaatc
5280tcagtattta tgaccgtggc ttttacaaat gtgtggccag caacccaggt ggccaggatt
5340cactgctggt taaaatacaa gtcattgcag caccacctgt tattctagag caaaggaggc
5400aagtcattgt aggcacttgg ggtgaaagtt taaaactgcc ctgtactgca aaaggaactc
5460ctcagcccag cgtttactgg gtcctctctg atggcactga agtgaaacca ttacagttta
5520ccaattccaa gttgttctta ttttcaaatg ggactttgta tataagaaac ctagcctctt
5580cagacagggg cacttatgaa tgcattgcta ccagttccac tggttcggag cgaagagtag
5640taatgcttac aatggaagag cgagtgacca gccccaggat agaagctgca tcccagaaaa
5700ggactgaagt gaattttggg gacaaattac tactgaactg ctcagccact ggggagccca
5760aaccccaaat aatgtggagg ttaccatcca aggctgtggt cgaccagtgg gcagctggat
5820ccacgtctac cctaatggat ccctgtttat tggatcagta acagaaaaag acagtggtgt
5880ctacttgtgt gtggcaagaa acaaaatggg ggatgatctg atactgatgc atgttagcct
5940aagactgaaa cctgccaaaa ttgaccacaa gcagtatttt agaaagcaag tgctccatgg
6000gaaagatttc caagtagatt gcaaagcttc cggctcccca gtgccagaga tatcttggag
6060tttgcctgat ggaaccatga tcaacaatgc aatgcaagcc gatgacagtg gccacaggac
6120taggagatat acccttttca acaatggaac tttatacttc aacaaagttg gggtagcgga
6180ggaaggagat tatacttgct atgcccagaa caccctaggg aaagatgaaa tgaaggtcca
6240cttaacagtt ataacagctg ctccccggat aaggcagagt aacaaaacca acaagagaat
6300caaagctgga gacacagctg tccttgactg tgaggtcact ggggatccca aaccaaaaat
6360attttggttg ctgccttcca atgacatgat ttccttctcc attgataggt acacatttca
6420tgccaatggg tctttgacca tcaacaaagt gaaactgctc gattctggag agtacgtatg
6480tgtagcccga aatcccagtg gggatgacac caaaatgtac aaactggatg tggtctctaa
6540acctccatta atcaatggtc tgtatacaaa cagaactgtt attaaagcca cagctgtgag
6600acattccaaa aaacactttg actgcagagc tgaagggaca ccatctcctg aagtcatgtg
6660gatcatgcca gacaatattt tcctcacagc cccatactat ggaagcagaa tcacagtcca
6720taaaaatgga accttggaaa ttaggaatgt gaggctttca gattcagccg actttatctg
6780tgtggcccga aatgaaggtg gagagagcgt gttggtagta cagttagaag tactggaaat
6840gctgagaaga ccgacattta gaaatccatt taatgaaaaa atagttgccc agctgggaaa
6900gtccacagca ttgaattgct ctgttgatgg taacccacca cctgaaataa tctggatttt
6960accaaatggc acacgatttt ccaatggacc acaaagttat cagtatctga tagcaagcaa
7020tggttctttt atcatttcta aaacaactcg ggaggatgca ggaaaatatc gctgtgcagc
7080taggaataaa gttggctata ttgagaaatt agtcatatta gaaattggcc agaagccagt
7140tattcttacc tatgcaccag ggacagtaaa aggcatcagt ggagaatctc tatcactgca
7200ttgtgtgtct gatggaatcc ctaagccaaa tatcaaatgg actatgccaa gtggttatgt
7260agtagacagg cctcaaatta atgggaaata catattgcat gacaatggca ccttagtcat
7320taaagaagca acagcttatg acagaggaaa ctatatctgt aaggctcaaa atagtgttgg
7380tcatacactg attactgttc cagtaatgat tgtagcctac cctccccgaa ttacaaatcg
7440tccacccagg agtattgtca ccaggacagg ggcagccttt cagctccact gtgtggcctt
7500gggagttccc aagccagaaa tcacatggga gatgcctgac cactcccttc tctcaacggc
7560aagtaaagag aggacacatg gaagtgagca gcttcactta caaggtaccc tagtcattca
7620gaatccccaa acctccgatt ctgggatata caaatgcaca gcaaagaacc cacttggtag
7680tgattatgca gcaacgtata ttcaagtaat ctgacatgaa ataataaagt caacaacatc
7740tgggcagaat ttattttttg gaagaagttt aatcaaaggc agccataggc atgtaaatga
7800atttgaatac atttacagta ttaaatttac aatgaacatg caaaataaaa ggacttgtaa
7860ataaatgcat tatgaactga tgatactgat ttatttaatg gatctcaaaa caaactttta
7920acttaaggca cttttatttt gccaacaaat aacaataaac aaacattgaa acggttcact
7980ataaaataac aaatggctaa tgtacctgaa tttttcagta aaaaaatgaa cttctaatac
8040cagttgccta gtgtccacct cctatcaatg ttacaagcat ggcactcaga acagagacaa
8100tggaaaatat taaatctgca atctatgtat aaatattttg tggtttataa atttttttgc
8160taaaacctac agaaaataag
8180
9 897 DNA Mus musculus misc_feature (1)..(897) 'n' can be any nucleotide 'a', 'c', 'g' or 't'.
9 aagaacgttc cttcaatcag gtgaaggctc tcctcagaag
atttcctgtc tctgcttatg 60tcagctgctt gctgatctcc ctcactgcca tctgcctggt
ggtcacccct gggagcaggg 120tctgtcctcg ccgatgtgcc tgctatgtgc ccacagaggt
gcactgtaca tttcgggacc 180tgacctccat cccagacggg catcccagcc aatgtggaac
gagtcaattt agggtataac 240agcctcacta gattgacaga aaatgacttt tctggcctga
gcagactgga gttactcatg 300ctgcacagca atggcattca cagagtcagt gacaagacct
tctcgggctt gcagtccttg 360caggtcttaa aaatgagcta taacaaagtc caaataattg
agaaggatac tttgtatgga 420ctcaggagct tgacccggtt gcacctggat cacaacaaca
ttgagtttat caaccccgag 480gcgttttacg gactcacctt gctccgcttg gtacatctag
aaggaaaccg gctgacaaag 540ctccatccag acacatttgt ctctttgagc tatctccaga
tatttaaaac ctccttcatt 600aagnacctgt acttgtatga taacttcatt gacctccctc
ccaaaagaaa tggtctcctc 660tatgccaaac ctagaaagcc tttacttgca tggaaaccca
tggacctgtg actgccattt 720aaagtggttg tccgagtgga tgcagggaaa cccaggtaac
tatcttgttt gtttgtttct 780ttttttatar kacgtatttt cctcaatttc atttagaatg
atatcccaaa agtcccccat 840aacctccccc ccacttccct acctacccat tcccattttt
tggccctggc attcccc 897
10 2597 PRT Rattus species misc_feature (1)..(2597) 'x' can be any amino acid
10 Met Gln Val Arg
Gly Arg Glu Val Ser Gly Leu Leu Ile Ser Leu Thr1 5
10 15Ala Val Cys Leu Val Val Thr Pro Gly Ser
Arg Ala Cys Pro Arg Arg 20 25
30Cys Ala Cys Tyr Val Pro Thr Glu Val His Cys Thr Phe Arg Tyr Leu
35 40 45Thr Ser Ile Pro Asp Gly Ile Pro
Ala Asn Val Glu Arg Ile Asn Leu 50 55
60Gly Tyr Asn Ser Leu Thr Arg Leu Thr Glu Asn Asp Phe Asp Gly Leu65
70 75 80Ser Lys Leu Glu Leu
Leu Met Leu His Ser Asn Gly Ile His Arg Val 85
90 95Ser Asp Lys Thr Phe Ser Gly Leu Gln Ser Leu
Gln Val Leu Lys Met 100 105
110Ser Tyr Asn Lys Val Gln Ile Ile Arg Lys Asp Thr Phe Tyr Gly Leu
115 120 125Gly Ser Leu Val Arg Leu His
Leu Asp His Asn Asn Ile Glu Phe Ile 130 135
140Asn Pro Glu Ala Phe Tyr Gly Leu Thr Ser Leu Arg Leu Val His
Leu145 150 155 160Glu Gly
Asn Arg Leu Thr Lys Leu His Pro Asp Thr Phe Val Ser Leu
165 170 175Ser Tyr Leu Gln Ile Phe Lys
Thr Ser Phe Ile Lys Tyr Leu Phe Leu 180 185
190Ser Asp Asn Phe Leu Thr Ser Leu Pro Lys Glu Met Val Ser
Tyr Met 195 200 205Pro Asn Leu Glu
Ser Leu Tyr Leu His Gly Asn Pro Trp Thr Cys Asp 210
215 220Cys His Leu Lys Trp Leu Ser Glu Trp Met Gln Gly
Asn Pro Asp Ile225 230 235
240Ile Lys Cys Lys Lys Asp Arg Ser Ser Ser Ser Pro Gln Gln Cys Pro
245 250 255Leu Cys Met Asn Pro
Arg Ile Ser Lys Gly Arg Pro Phe Ala Met Val 260
265 270Pro Ser Gly Ala Phe Leu Cys Thr Lys Pro Thr Ile
Asp Pro Ser Leu 275 280 285Lys Ser
Lys Ser Leu Val Thr Gln Glu Asp Asn Gly Ser Ala Ser Thr 290
295 300Ser Pro Gln Asp Phe Ile Glu Pro Phe Gly Ser
Leu Ser Leu Asn Met305 310 315
320Thr Xaa Xaa Ser Gly Asn Lys Ala Asp Met Val Cys Ser Ile Gln Lys
325 330 335Pro Ser Arg Thr
Ser Pro Thr Ala Phe Thr Glu Glu Asn Asp Tyr Ile 340
345 350Met Leu Asn Ala Ser Phe Ser Thr Asn Leu Val
Cys Ser Val Asp Tyr 355 360 365Asn
His Ile Gln Pro Val Trp Gln Leu Leu Ala Leu Tyr Ser Asp Ser 370
375 380Pro Leu Ile Leu Glu Arg Lys Pro Gln Leu
Thr Glu Thr Pro Ser Leu385 390 395
400Ser Ser Arg Tyr Lys Gln Val Ala Leu Arg Pro Glu Asp Ile Phe
Thr 405 410 415Ser Ile Glu
Ala Asp Val Arg Ala Asp Pro Phe Trp Phe Gln Gln Glu 420
425 430Lys Ile Val Leu Gln Leu Asn Arg Thr Ala
Thr Thr Leu Ser Thr Leu 435 440
445Gln Ile Gln Phe Ser Thr Asp Ala Gln Ile Ala Leu Pro Arg Ala Glu 450
455 460Met Arg Ala Glu Arg Leu Lys Trp
Thr Met Ile Leu Met Met Asn Asn465 470
475 480Pro Lys Leu Glu Arg Thr Val Leu Val Gly Gly Thr
Ile Ala Leu Ser 485 490
495Cys Pro Gly Lys Gly Asp Pro Ser Pro His Leu Glu Trp Leu Leu Ala
500 505 510Asp Gly Ser Lys Val Arg
Ala Pro Tyr Val Ser Glu Asp Gly Arg Ile 515 520
525Leu Ile Asp Lys Asn Gly Lys Leu Glu Leu Gln Met Ala Asp
Ser Phe 530 535 540Asp Ala Gly Leu Tyr
His Cys Ile Ser Thr Asn Asp Ala Asp Ala Asp545 550
555 560Val Leu Thr Tyr Arg Ile Thr Val Val Glu
Pro Tyr Gly Glu Ser Thr 565 570
575His Asp Ser Gly Val Gln His Thr Val Val Thr Gly Glu Thr Leu Asp
580 585 590Leu Pro Cys Leu Ser
Thr Gly Val Pro Asp Ala Ser Ile Ser Trp Ile 595
600 605Leu Pro Gly Asn Thr Val Phe Ser Gln Pro Ser Arg
Asp Arg Gln Ile 610 615 620Leu Asn Asn
Gly Thr Leu Arg Ile Leu Gln Val Thr Pro Lys Asp Gln625
630 635 640Gly His Tyr Gln Cys Val Ala
Ala Asn Pro Ser Gly Ala Asp Phe Ser 645
650 655Ser Phe Lys Val Ser Val Gln Lys Lys Gly Gln Arg
Met Val Glu His 660 665 670Asp
Arg Glu Ala Gly Gly Ser Gly Leu Gly Glu Pro Asn Ser Ser Val 675
680 685Ser Leu Lys Gln Pro Ala Ser Leu Lys
Leu Ser Ala Ser Ala Leu Thr 690 695
700Gly Ser Glu Ala Gly Lys Gln Val Ser Gly Val His Arg Lys Asn Lys705
710 715 720His Arg Asp Leu
Ile His Arg Arg Arg Gly Asp Ser Thr Leu Arg Arg 725
730 735Phe Arg Glu His Arg Arg Gln Leu Pro Leu
Ser Ala Arg Arg Ile Asp 740 745
750Pro Gln Arg Trp Ala Ala Leu Leu Glu Lys Ala Lys Lys Asn Ser Val
755 760 765Pro Lys Lys Gln Glu Asn Thr
Thr Val Lys Pro Val Pro Leu Ala Val 770 775
780Pro Leu Val Glu Leu Thr Asp Glu Glu Lys Asp Ala Ser Gly Met
Ile785 790 795 800Pro Pro
Asp Glu Glu Phe Met Val Leu Lys Thr Lys Ala Ser Gly Val
805 810 815Pro Gly Arg Ser Pro Thr Ala
Asp Ser Gly Pro Val Asn His Gly Phe 820 825
830Met Thr Ser Ile Ala Ser Gly Thr Glu Val Ser Thr Val Asn
Pro Gln 835 840 845Thr Leu Gln Ser
Glu His Leu Pro Asp Phe Lys Leu Phe Ser Val Thr 850
855 860Asn Gly Thr Ala Val Thr Lys Ser Met Asn Pro Ser
Ile Ala Ser Lys865 870 875
880Ile Glu Asp Thr Thr Asn Gln Asn Pro Ile Ile Ile Phe Pro Ser Val
885 890 895Ala Glu Ile Arg Asp
Ser Ala Gln Ala Gly Arg Ala Ser Ser Gln Ser 900
905 910Ala His Pro Val Thr Gly Gly Asn Met Ala Thr Tyr
Gly His Thr Asn 915 920 925Thr Tyr
Ser Ser Phe Thr Ser Lys Ala Ser Thr Val Leu Gln Pro Ile 930
935 940Asn Pro Thr Glu Ser Tyr Gly Pro Gln Ile Pro
Ile Thr Gly Val Ser945 950 955
960Arg Pro Ser Ser Ser Asp Ile Ser Ser His Thr Thr Ala Asp Pro Ser
965 970 975Phe Ser Ser His
Pro Ser Gly Ser His Thr Thr Ala Ser Ser Leu Phe 980
985 990His Ile Pro Arg Asn Asn Asn Thr Gly Asn Phe
Pro Leu Ser Arg His 995 1000
1005Leu Gly Arg Glu Arg Thr Ile Trp Ser Arg Gly Arg Val Lys Asn
1010 1015 1020Pro His Arg Thr Pro Val
Leu Arg Arg His Arg His Arg Thr Val 1025 1030
1035Arg Pro Ala Ile Lys Gly Pro Ala Asn Lys Asn Val Ser Gln
Val 1040 1045 1050Pro Ala Thr Glu Tyr
Pro Gly Met Cys His Thr Cys Pro Ser Ala 1055 1060
1065Glu Gly Leu Thr Val Ala Thr Ala Ala Leu Ser Val Pro
Ser Ser 1070 1075 1080Ser His Ser Ala
Leu Pro Lys Thr Asn Asn Val Gly Val Ile Ala 1085
1090 1095Glu Glu Ser Thr Thr Val Val Lys Lys Pro Leu
Leu Leu Phe Lys 1100 1105 1110Asp Lys
Gln Asn Val Asp Ile Glu Ile Ile Thr Thr Thr Thr Lys 1115
1120 1125Tyr Ser Gly Gly Glu Ser Asn His Val Ile
Pro Thr Glu Ala Ser 1130 1135 1140Met
Thr Ser Ala Pro Thr Ser Val Ser Leu Gly Lys Ser Pro Val 1145
1150 1155Asp Asn Ser Gly His Leu Ser Met Pro
Gly Thr Ile Gln Thr Gly 1160 1165
1170Lys Asp Ser Val Glu Thr Thr Pro Leu Pro Ser Pro Leu Ser Thr
1175 1180 1185Pro Ser Ile Pro Thr Ser
Thr Lys Phe Ser Lys Arg Lys Thr Pro 1190 1195
1200Leu His Gln Ile Phe Val Asn Asn Gln Lys Lys Glu Gly Met
Leu 1205 1210 1215Lys Asn Pro Tyr Gln
Phe Gly Leu Gln Lys Asn Pro Ala Ala Lys 1220 1225
1230Leu Pro Lys Ile Ala Pro Leu Leu Pro Thr Gly Gln Ser
Ser Pro 1235 1240 1245Ser Asp Ser Thr
Thr Leu Leu Thr Ser Pro Pro Pro Ala Leu Ser 1250
1255 1260Thr Thr Met Ala Ala Thr Gln Asn Lys Gly Thr
Glu Val Val Ser 1265 1270 1275Gly Ala
Arg Ser Leu Ser Ala Gly Lys Lys Gln Pro Phe Thr Asn 1280
1285 1290Ser Ser Pro Val Leu Pro Ser Thr Ile Ser
Lys Arg Ser Asn Thr 1295 1300 1305Leu
Asn Phe Leu Ser Thr Glu Thr Pro Thr Val Thr Ser Pro Thr 1310
1315 1320Ala Thr Ala Ser Val Ile Met Ser Glu
Thr Gln Arg Thr Arg Ser 1325 1330
1335Lys Glu Ala Lys Asp Gln Ile Lys Gly Pro Arg Lys Asn Arg Asn
1340 1345 1350Asn Ala Asn Thr Thr Pro
Arg Gln Val Ser Gly Tyr Ser Ala Tyr 1355 1360
1365Ser Ala Leu Thr Thr Ala Asp Thr Pro Leu Ala Phe Ser His
Ser 1370 1375 1380Pro Arg Gln Asp Asp
Gly Gly Asn Val Ser Ala Val Ala Tyr His 1385 1390
1395Ser Thr Thr Ser Leu Leu Ala Ile Thr Glu Leu Phe Glu
Lys Tyr 1400 1405 1410Thr Gln Thr Leu
Gly Asn Thr Thr Ala Leu Glu Thr Thr Leu Leu 1415
1420 1425Ser Lys Ser Gln Glu Ser Thr Thr Val Lys Arg
Ala Ser Asp Thr 1430 1435 1440Pro Pro
Pro Leu Leu Ser Ser Gly Ala Pro Pro Val Pro Thr Pro 1445
1450 1455Ser Pro Pro Pro Phe Thr Lys Gly Val Val
Thr Asp Ser Lys Val 1460 1465 1470Thr
Ser Ala Phe Gln Met Thr Ser Asn Arg Val Val Thr Ile Tyr 1475
1480 1485Glu Ser Ser Arg His Asn Thr Asp Leu
Gln Gln Pro Ser Ala Glu 1490 1495
1500Ala Ser Pro Asn Pro Glu Ile Ile Thr Gly Thr Thr Asp Ser Pro
1505 1510 1515Ser Asn Leu Phe Pro Ser
Thr Ser Val Pro Ala Leu Arg Val Asp 1520 1525
1530Lys Pro Gln Asn Ser Lys Trp Lys Pro Ser Pro Trp Pro Glu
His 1535 1540 1545Lys Tyr Gln Leu Lys
Ser Tyr Ser Glu Thr Ile Glu Lys Gly Lys 1550 1555
1560Arg Pro Ala Val Ser Met Ser Pro His Leu Ser Leu Pro
Glu Ala 1565 1570 1575Ser Thr His Ala
Ser His Trp Asn Thr Gln Lys His Ala Glu Lys 1580
1585 1590Ser Val Phe Asp Lys Lys Pro Gly Gln Asn Pro
Thr Ser Lys His 1595 1600 1605Leu Pro
Tyr Val Ser Leu Pro Lys Thr Leu Leu Lys Lys Pro Arg 1610
1615 1620Ile Ile Gly Gly Lys Ala Ala Ser Phe Thr
Val Pro Ala Asn Ser 1625 1630 1635Asp
Val Phe Leu Pro Cys Glu Ala Val Gly Asp Pro Leu Pro Ile 1640
1645 1650Ile His Trp Thr Arg Val Ser Ser Gly
Xaa Glu Ile Ser Gln Gly 1655 1660
1665Thr Gln Lys Ser Arg Phe His Val Leu Pro Asn Gly Thr Leu Ser
1670 1675 1680Ile Gln Arg Val Ser Ile
Gln Asp Arg Gly Gln Tyr Leu Cys Ser 1685 1690
1695Ala Phe Asn Pro Leu Gly Val Asp His Phe His Val Ser Leu
Ser 1700 1705 1710Val Val Phe Tyr Pro
Ala Arg Ile Leu Asp Arg His Val Lys Glu 1715 1720
1725Ile Thr Val His Phe Gly Ser Thr Val Glu Leu Lys Cys
Arg Val 1730 1735 1740Glu Gly Met Pro
Arg Pro Thr Val Ser Trp Ile Leu Ala Asn Gln 1745
1750 1755Thr Val Val Ser Glu Thr Ala Lys Gly Ser Arg
Lys Val Trp Val 1760 1765 1770Thr Pro
Asp Gly Thr Leu Ile Ile Tyr Asn Leu Ser Leu Tyr Asp 1775
1780 1785Arg Gly Phe Tyr Lys Cys Val Ala Ser Asn
Pro Ser Gly Gln Asp 1790 1795 1800Ser
Leu Leu Val Lys Ile Gln Val Ile Thr Ala Pro Pro Val Ile 1805
1810 1815Ile Glu Gln Lys Arg Gln Ala Ile Val
Gly Val Leu Gly Gly Ser 1820 1825
1830Leu Lys Leu Pro Cys Thr Ala Lys Gly Thr Pro Gln Pro Ser Val
1835 1840 1845His Trp Val Leu Tyr Asp
Gly Thr Glu Leu Lys Pro Leu Gln Leu 1850 1855
1860Thr His Ser Arg Phe Phe Leu Tyr Pro Asn Gly Thr Leu Tyr
Ile 1865 1870 1875Arg Ser Ile Ala Pro
Ser Val Arg Gly Thr Tyr Glu Cys Ile Ala 1880 1885
1890Thr Ser Ser Ser Gly Ser Glu Arg Arg Val Val Ile Leu
Thr Val 1895 1900 1905Glu Glu Gly Glu
Thr Ile Pro Arg Ile Glu Thr Ala Ser Gln Lys 1910
1915 1920Trp Thr Glu Val Asn Leu Gly Glu Lys Leu Leu
Leu Asn Cys Ser 1925 1930 1935Ala Thr
Gly Asp Pro Lys Pro Arg Ile Ile Trp Arg Leu Pro Ser 1940
1945 1950Lys Ala Val Ile Asp Gln Trp His Arg Met
Gly Ser Arg Ile His 1955 1960 1965Val
Tyr Pro Asn Gly Ser Leu Val Val Gly Ser Val Thr Glu Lys 1970
1975 1980Asp Ala Gly Asp Tyr Leu Cys Val Ala
Arg Asn Lys Met Gly Asp 1985 1990
1995Asp Leu Val Leu Met His Val Arg Leu Arg Leu Thr Pro Ala Lys
2000 2005 2010Ile Glu Gln Lys Gln Tyr
Phe Lys Lys Gln Val Leu His Gly Lys 2015 2020
2025Asp Phe Gln Val Asp Cys Lys Ala Ser Gly Ser Pro Val Pro
Glu 2030 2035 2040Val Ser Trp Ser Leu
Pro Asp Gly Thr Val Leu Asn Asn Val Ala 2045 2050
2055Gln Ala Asp Asp Ser Gly Tyr Arg Thr Lys Arg Tyr Thr
Leu Phe 2060 2065 2070His Asn Gly Thr
Leu Tyr Phe Asn Asn Val Gly Met Ala Glu Glu 2075
2080 2085Gly Asp Tyr Ile Cys Ser Ala Gln Asn Thr Leu
Gly Lys Asp Glu 2090 2095 2100Met Lys
Val His Leu Thr Val Leu Thr Ala Ile Pro Arg Ile Arg 2105
2110 2115Gln Ser Tyr Lys Thr Thr Met Arg Leu Arg
Ala Gly Glu Thr Ala 2120 2125 2130Val
Leu Asp Cys Glu Val Thr Gly Glu Pro Lys Pro Asn Val Phe 2135
2140 2145Trp Leu Leu Pro Ser Asn Asn Val Ile
Ser Phe Ser Asn Asp Arg 2150 2155
2160Phe Thr Phe His Ala Asn Arg Thr Leu Ser Ile His Lys Val Lys
2165 2170 2175Pro Leu Asp Ser Gly Asp
Tyr Val Cys Val Ala Gln Asn Pro Ser 2180 2185
2190Gly Asp Asp Thr Lys Thr Tyr Lys Leu Asp Ile Val Ser Lys
Pro 2195 2200 2205Pro Leu Ile Asn Gly
Leu Tyr Ala Asn Lys Thr Val Ile Lys Ala 2210 2215
2220Thr Ala Ile Arg His Ser Lys Lys Tyr Phe Asp Cys Arg
Ala Asp 2225 2230 2235Gly Ile Pro Ser
Ser Gln Val Thr Trp Ile Met Pro Gly Asn Ile 2240
2245 2250Phe Leu Pro Ala Pro Tyr Phe Gly Ser Arg Val
Thr Val His Pro 2255 2260 2265Asn Gly
Thr Leu Glu Met Arg Asn Ile Arg Leu Ser Asp Ser Ala 2270
2275 2280Asp Phe Thr Cys Val Val Arg Ser Glu Gly
Gly Glu Ser Val Leu 2285 2290 2295Val
Val Gln Leu Glu Val Leu Glu Met Leu Arg Arg Pro Thr Phe 2300
2305 2310Arg Asn Pro Phe Asn Glu Lys Val Ile
Ala Gln Ala Gly Lys Pro 2315 2320
2325Val Ala Leu Asn Cys Ser Val Asp Gly Asn Pro Pro Pro Glu Ile
2330 2335 2340Thr Trp Ile Leu Pro Asp
Gly Thr Gln Phe Ala Asn Arg Pro His 2345 2350
2355Asn Ser Pro Tyr Leu Met Ala Gly Asn Gly Ser Leu Ile Leu
Tyr 2360 2365 2370Lys Ala Thr Arg Asn
Lys Ser Gly Lys Tyr Arg Cys Ala Ala Arg 2375 2380
2385Asn Lys Val Gly Tyr Ile Glu Lys Leu Ile Leu Leu Glu
Ile Gly 2390 2395 2400Gln Lys Pro Val
Ile Leu Thr Tyr Glu Pro Gly Met Val Lys Ser 2405
2410 2415Val Ser Gly Glu Pro Leu Ser Leu His Cys Val
Ser Asp Gly Ile 2420 2425 2430Pro Lys
Pro Asn Val Lys Trp Thr Thr Pro Gly Gly His Val Ile 2435
2440 2445Asp Arg Pro Gln Val Asp Gly Lys Tyr Ile
Leu His Glu Asn Gly 2450 2455 2460Thr
Leu Val Ile Lys Ala Thr Thr Ala His Asp Gln Gly Asn Tyr 2465
2470 2475Ile Cys Arg Ala Gln Asn Ser Val Gly
Gln Ala Val Ile Ser Val 2480 2485
2490Ser Val Met Val Val Ala Tyr Pro Pro Arg Ile Ile Asn Tyr Leu
2495 2500 2505Pro Arg Asn Met Leu Arg
Arg Thr Gly Glu Ala Met Gln Leu His 2510 2515
2520Cys Val Ala Leu Gly Ile Pro Lys Pro Lys Val Thr Trp Glu
Thr 2525 2530 2535Pro Arg His Ser Leu
Leu Ser Lys Ala Thr Ala Arg Lys Pro His 2540 2545
2550Arg Ser Glu Met Leu His Pro Gln Gly Thr Leu Val Ile
Gln Asn 2555 2560 2565Leu Gln Thr Ser
Asp Ser Gly Val Tyr Lys Cys Arg Ala Gln Asn 2570
2575 2580Leu Leu Gly Thr Asp Tyr Ala Thr Thr Tyr Ile
Gln Val Leu 2585 2590
2595
11 2586 PRT Homo sapiens
misc_feature (1)..(2586) 'x' can be any amino acid
11
Met Lys Val Lys Gly Arg Gly Ile Thr Cys Leu Leu Val Ser Phe Ala 1
5 10 15Val Ile Cys Leu Val Ala
Thr Pro Gly Gly Lys Ala Cys Pro Arg Arg 20 25
30Cys Ala Cys Tyr Met Pro Thr Glu Val His Cys Thr Phe
Arg Tyr Leu 35 40 45Thr Ser Ile
Pro Asp Ser Ile Pro Pro Asn Val Glu Arg Ile Asn Leu 50
55 60Gly Tyr Asn Ser Leu Val Arg Leu Met Glu Thr Asp
Phe Ser Gly Leu65 70 75
80Thr Lys Leu Glu Leu Leu Met Leu His Ser Asn Gly Ile His Thr Ile
85 90 95Pro Asp Lys Thr Phe Ser
Asp Leu Gln Ala Leu Gln Val Leu Lys Met 100
105 110Ser Tyr Asn Lys Val Arg Lys Leu Gln Lys Asp Thr
Phe Tyr Gly Leu 115 120 125Arg Ser
Leu Thr Arg Leu His Met Asp His Asn Asn Ile Glu Phe Ile 130
135 140Asn Pro Glu Val Phe Tyr Gly Leu Asn Phe Leu
Arg Leu Val His Leu145 150 155
160Glu Gly Asn Gln Leu Thr Lys Leu His Pro Asp Thr Phe Val Ser Leu
165 170 175Ser Tyr Leu Gln
Ile Phe Lys Ile Ser Phe Ile Lys Phe Leu Tyr Leu 180
185 190Ser Asp Asn Phe Leu Thr Ser Leu Pro Gln Glu
Met Val Ser Tyr Met 195 200 205Pro
Asp Leu Asp Ser Leu Tyr Leu His Gly Asn Pro Trp Thr Cys Asp 210
215 220Cys His Leu Lys Trp Leu Ser Asp Trp Ile
Gln Pro Asp Val Ile Lys225 230 235
240Cys Lys Lys Asp Arg Ser Pro Ser Ser Ala Gln Gln Cys Pro Leu
Cys 245 250 255Met Asn Pro
Arg Thr Ser Lys Gly Lys Pro Leu Ala Met Val Ser Ala 260
265 270Ala Ala Phe Gln Cys Ala Lys Pro Thr Ile
Asp Ser Ser Leu Lys Ser 275 280
285Lys Ser Leu Thr Ile Leu Glu Asp Ser Ser Ser Ala Phe Ile Ser Pro 290
295 300Gln Gly Phe Met Ala Pro Phe Gly
Ser Leu Thr Leu Asn Met Thr Asp305 310
315 320Gln Ser Gly Asn Glu Ala Asn Met Val Cys Ser Ile
Gln Lys Pro Ser 325 330
335Arg Thr Ser Pro Ile Ala Phe Thr Glu Glu Asn Asp Tyr Ile Val Leu
340 345 350Asn Thr Ser Phe Ser Thr
Phe Leu Val Cys Asn Ile Asp Tyr Gly His 355 360
365Ile Gln Pro Val Trp Gln Ile Leu Ala Leu Tyr Ser Asp Ser
Pro Leu 370 375 380Ile Leu Glu Arg Ser
His Leu Leu Ser Glu Thr Pro Gln Leu Tyr Tyr385 390
395 400Lys Tyr Lys Gln Val Ala Pro Lys Pro Glu
Asp Ile Phe Thr Asn Ile 405 410
415Glu Ala Asp Leu Arg Ala Asp Pro Ser Trp Leu Met Gln Asp Gln Ile
420 425 430Ser Leu Gln Leu Asn
Arg Thr Ala Thr Thr Phe Ser Thr Leu Gln Ile 435
440 445Gln Tyr Ser Ser Asp Ala Gln Ile Thr Leu Pro Arg
Ala Glu Met Arg 450 455 460Pro Val Lys
His Lys Trp Thr Met Ile Ser Arg Asp Asn Asn Thr Lys465
470 475 480Leu Glu His Thr Val Leu Val
Gly Gly Thr Val Gly Leu Asn Cys Pro 485
490 495Gly Gln Gly Asp Pro Thr Pro His Val Asp Trp Leu
Leu Ala Asp Gly 500 505 510Ser
Lys Val Arg Ala Pro Tyr Val Ser Glu Asp Gly Arg Ile Leu Ile 515
520 525Asp Lys Ser Gly Lys Leu Glu Leu Gln
Met Ala Asp Ser Phe Asp Thr 530 535
540Gly Val Tyr His Cys Ile Ser Ser Asn Tyr Asp Asp Ala Asp Ile Leu545
550 555 560Thr Tyr Arg Ile
Thr Val Val Glu Pro Leu Val Glu Ala Tyr Gln Glu 565
570 575Asn Gly Ile His His Thr Val Phe Ile Gly
Glu Thr Leu Asp Leu Pro 580 585
590Cys His Ser Thr Gly Ile Pro Asp Ala Ser Ile Ser Trp Val Ile Pro
595 600 605Gly Asn Asn Val Leu Tyr Gln
Ser Ser Arg Asp Lys Lys Val Leu Asn 610 615
620Asn Gly Thr Leu Arg Ile Leu Gln Val Thr Pro Lys Asp Gln Gly
Tyr625 630 635 640Tyr Arg
Cys Val Ala Ala Asn Pro Ser Gly Val Asp Phe Leu Ile Phe
645 650 655Gln Val Ser Val Lys Met Lys
Gly Gln Arg Pro Leu Glu His Asp Gly 660 665
670Glu Thr Glu Gly Ser Gly Leu Asp Glu Ser Asn Pro Ile Ala
His Leu 675 680 685Lys Glu Pro Pro
Gly Ala Gln Leu Arg Thr Ser Ala Leu Met Glu Ala 690
695 700Glu Val Gly Lys His Thr Ser Ser Thr Ser Lys Arg
His Asn Tyr Arg705 710 715
720Glu Leu Thr Leu Gln Arg Arg Gly Asp Ser Thr His Arg Arg Phe Arg
725 730 735Glu Asn Arg Arg His
Phe Pro Pro Ser Ala Arg Arg Ile Asp Pro Gln 740
745 750His Trp Ala Ala Leu Leu Glu Lys Ala Lys Lys Asn
Ala Met Pro Asp 755 760 765Lys Arg
Glu Asn Thr Thr Val Ser Pro Pro Pro Val Val Thr Gln Leu 770
775 780Pro Asn Ile Pro Gly Glu Glu Asp Asp Ser Ser
Gly Met Leu Ala Leu785 790 795
800His Glu Glu Phe Met Val Pro Ala Thr Lys Ala Leu Asn Leu Pro Ala
805 810 815Arg Thr Val Thr
Ala Asp Ser Arg Thr Ile Ser Asp Ser Pro Met Thr 820
825 830Asn Ile Asn Tyr Gly Thr Glu Phe Ser Pro Val
Val Asn Ser Gln Ile 835 840 845Leu
Pro Pro Glu Glu Pro Thr Asp Phe Lys Leu Ser Thr Ala Ile Lys 850
855 860Thr Thr Ala Met Ser Lys Asn Ile Asn Pro
Thr Met Ser Ser Gln Ile865 870 875
880Gln Gly Thr Thr Asn Gln His Ser Ser Thr Val Phe Pro Leu Leu
Leu 885 890 895Gly Ala Thr
Glu Phe Gln Asp Ser Asp Gln Met Gly Arg Gly Arg Glu 900
905 910His Phe Gln Ser Arg Pro Pro Ile Thr Val
Arg Thr Met Ile Lys Asp 915 920
925Val Asn Val Lys Met Leu Ser Ser Thr Thr Asn Lys Leu Leu Leu Glu 930
935 940Ser Val Asn Thr Thr Asn Ser His
Gln Thr Ser Val Arg Glu Val Ser945 950
955 960Glu Pro Arg His Asn His Phe Tyr Ser His Thr Thr
Gln Ile Leu Ser 965 970
975Thr Ser Thr Phe Pro Ser Asp Pro His Thr Ala Ala His Ser Gln Phe
980 985 990Pro Ile Pro Arg Asn Ser
Thr Val Asn Ile Pro Leu Phe Arg Arg Phe 995 1000
1005Gly Arg Gln Arg Lys Ile Gly Gly Arg Gly Arg Ile
Ile Ser Pro 1010 1015 1020Tyr Arg Thr
Pro Val Leu Arg Arg His Arg Tyr Ser Ile Phe Arg 1025
1030 1035Ser Thr Thr Arg Gly Ser Ser Glu Lys Ser Thr
Thr Ala Phe Ser 1040 1045 1050Ala Thr
Val Leu Asn Val Thr Cys Leu Ser Cys Leu Pro Arg Glu 1055
1060 1065Arg Leu Thr Thr Ala Thr Ala Ala Leu Ser
Phe Pro Ser Ala Ala 1070 1075 1080Pro
Ile Thr Phe Pro Lys Ala Asp Ile Ala Arg Val Pro Ser Glu 1085
1090 1095Glu Ser Thr Thr Leu Val Gln Asn Pro
Leu Leu Leu Leu Glu Asn 1100 1105
1110Lys Pro Ser Val Glu Lys Thr Thr Pro Thr Ile Lys Tyr Phe Arg
1115 1120 1125Thr Glu Ile Ser Gln Val
Thr Pro Thr Gly Ala Val Met Thr Tyr 1130 1135
1140Ala Pro Thr Ser Ile Pro Met Glu Lys Thr His Lys Val Asn
Ala 1145 1150 1155Ser Tyr Pro Arg Val
Ser Ser Thr Asn Glu Ala Lys Arg Asp Ser 1160 1165
1170Val Ile Thr Ser Ser Leu Ser Gly Ala Ile Thr Lys Pro
Pro Met 1175 1180 1185Thr Ile Ile Ala
Ile Thr Arg Phe Ser Arg Arg Lys Ile Pro Trp 1190
1195 1200Gln Gln Asn Phe Val Asn Asn His Asn Pro Lys
Gly Arg Leu Arg 1205 1210 1215Asn Gln
His Lys Val Ser Leu Gln Lys Ser Thr Ala Val Met Leu 1220
1225 1230Pro Lys Thr Ser Pro Ala Leu Pro Gln Arg
Gln Ser Ser Pro Phe 1235 1240 1245His
Phe Thr Thr Leu Ser Thr Ser Val Met Gln Ile Pro Ser Asn 1250
1255 1260Thr Leu Thr Thr Ala His His Thr Thr
Thr Lys Thr His Asn Pro 1265 1270
1275Gly Ser Leu Pro Thr Lys Lys Glu Leu Pro Phe Pro Pro Leu Asn
1280 1285 1290Pro Met Leu Pro Ser Ile
Ile Ser Lys Asp Ser Ser Thr Lys Ser 1295 1300
1305Ile Ile Ser Thr Gln Thr Ala Ile Pro Ala Thr Thr Pro Thr
Phe 1310 1315 1320Pro Ala Ser Val Ile
Thr Tyr Glu Thr Gln Thr Glu Arg Ser Arg 1325 1330
1335Ala Gln Thr Ile Gln Arg Glu Gln Glu Pro Gln Lys Lys
Asn Arg 1340 1345 1350Thr Asp Pro Asn
Ile Ser Pro Asp Gln Ser Ser Gly Phe Thr Thr 1355
1360 1365Pro Thr Ala Met Thr Pro Pro Ala Leu Ala Phe
Thr His Ser Pro 1370 1375 1380Pro Glu
Asn Thr Thr Gly Ile Ser Ser Thr Ile Ser Phe His Ser 1385
1390 1395Arg Thr Leu Asn Leu Thr Asp Val Ile Glu
Glu Leu Ala Gln Ala 1400 1405 1410Ser
Thr Gln Thr Leu Lys Ser Thr Ile Ala Ser Glu Thr Thr Leu 1415
1420 1425Ser Ser Lys Ser His Gln Ser Thr Thr
Thr Arg Lys Ala Ser Leu 1430 1435
1440Asp Thr Pro Ile Pro Pro Phe Leu Ser Ser Ser Ala Thr Leu Met
1445 1450 1455Pro Val Pro Ile Ser Pro
Pro Phe Thr Gln Arg Ala Val Thr Asp 1460 1465
1470Thr Arg Gly Asp Ser His Phe Arg Leu Met Thr Asn Thr Val
Val 1475 1480 1485Lys Leu His Glu Ser
Ser Arg His Asn Leu Gln Met Pro Ser Ser 1490 1495
1500Gln Leu Glu Pro Leu Thr Ser Ser Thr Ser Asn Leu Leu
His Ser 1505 1510 1515Thr Pro Met Pro
Ala Leu Thr Thr Val Lys Ser Gln Asn Ser Lys 1520
1525 1530Leu Thr Pro Ser Pro Trp Ala Glu Tyr Gln Phe
Trp His Lys Pro 1535 1540 1545Tyr Ser
Asp Ile Ala Glu Lys Gly Lys Lys Pro Glu Val Ser Met 1550
1555 1560Leu Ala Thr Thr Gly Leu Ser Glu Ala Thr
Thr Leu Val Ser Asp 1565 1570 1575Trp
Asp Gly Gln Lys Asn Thr Lys Lys Ser Asp Phe Asp Lys Lys 1580
1585 1590Pro Val Gln Glu Ala Thr Thr Ser Lys
Leu Leu Pro Phe Asp Ser 1595 1600
1605Leu Ser Arg Tyr Ile Phe Glu Lys Pro Arg Ile Val Gly Gly Lys
1610 1615 1620Ala Ala Ser Phe Thr Ile
Pro Ala Asn Ser Asp Ala Phe Leu Pro 1625 1630
1635Cys Glu Ala Val Gly Asn Pro Leu Pro Thr Ile His Trp Thr
Arg 1640 1645 1650Val Ser Gly Leu Asp
Leu Ser Arg Gly Asn Gln Asn Ser Arg Val 1655 1660
1665Gln Val Leu Pro Asn Gly Thr Leu Ser Ile Gln Arg Val
Glu Ile 1670 1675 1680Gln Asp Arg Gly
Gln Tyr Leu Cys Ser Ala Ser Asn Leu Phe Gly 1685
1690 1695Thr Asp His Leu His Val Thr Leu Ser Val Val
Ser Tyr Pro Pro 1700 1705 1710Arg Ile
Leu Glu Arg Arg Thr Lys Glu Ile Thr Val His Ser Gly 1715
1720 1725Ser Thr Val Glu Leu Lys Cys Arg Ala Glu
Gly Arg Pro Ser Pro 1730 1735 1740Thr
Val Thr Trp Ile Leu Ala Asn Gln Thr Val Val Ser Glu Ser 1745
1750 1755Ser Gln Gly Ser Arg Gln Ala Val Val
Thr Val Asp Gly Thr Leu 1760 1765
1770Val Leu His Asn Leu Ser Ile Tyr Asp Arg Gly Phe Tyr Lys Cys
1775 1780 1785Val Ala Ser Asn Pro Gly
Gly Gln Asp Ser Leu Leu Val Lys Ile 1790 1795
1800Gln Val Ile Ala Ala Pro Pro Val Ile Leu Glu Gln Arg Arg
Gln 1805 1810 1815Val Ile Val Gly Thr
Trp Gly Glu Ser Leu Lys Leu Pro Cys Thr 1820 1825
1830Ala Lys Gly Thr Pro Gln Pro Ser Val Tyr Trp Val Leu
Ser Asp 1835 1840 1845Gly Thr Glu Val
Lys Pro Leu Gln Phe Thr Asn Ser Lys Leu Phe 1850
1855 1860Leu Phe Ser Asn Gly Thr Leu Tyr Ile Arg Asn
Leu Ala Ser Ser 1865 1870 1875Asp Arg
Gly Thr Tyr Glu Cys Ile Ala Thr Ser Ser Thr Gly Ser 1880
1885 1890Glu Arg Arg Val Val Met Leu Thr Met Glu
Glu Arg Val Thr Ser 1895 1900 1905Pro
Arg Ile Glu Ala Ala Ser Gln Lys Arg Thr Glu Val Asn Phe 1910
1915 1920Gly Asp Lys Leu Leu Leu Asn Cys Ser
Ala Thr Gly Glu Pro Lys 1925 1930
1935Pro Gln Ile Met Trp Arg Leu Pro Ser Lys Ala Val Val Asp Gln
1940 1945 1950Gly Ser Trp Ile His Val
Tyr Pro Asn Gly Ser Leu Phe Ile Gly 1955 1960
1965Ser Val Thr Glu Lys Asp Ser Gly Val Tyr Leu Cys Val Ala
Arg 1970 1975 1980Asn Lys Met Gly Asp
Asp Leu Ile Leu Met His Val Ser Leu Arg 1985 1990
1995Leu Lys Pro Ala Lys Ile Asp His Lys Gln Tyr Phe Arg
Lys Gln 2000 2005 2010Val Leu His Gly
Lys Asp Phe Gln Val Asp Cys Lys Ala Ser Gly 2015
2020 2025Ser Pro Val Pro Glu Ile Ser Trp Ser Leu Pro
Asp Gly Thr Met 2030 2035 2040Ile Asn
Asn Ala Met Gln Ala Asp Asp Ser Gly His Arg Thr Arg 2045
2050 2055Arg Tyr Thr Leu Phe Asn Asn Gly Thr Leu
Tyr Phe Asn Lys Val 2060 2065 2070Gly
Val Ala Glu Glu Gly Asp Tyr Thr Cys Tyr Ala Gln Asn Thr 2075
2080 2085Leu Gly Lys Asp Glu Met Lys Val His
Leu Thr Val Ile Thr Ala 2090 2095
2100Ala Pro Arg Ile Arg Gln Ser Asn Lys Thr Asn Lys Arg Ile Lys
2105 2110 2115Ala Gly Asp Thr Ala Val
Leu Asp Cys Glu Val Thr Gly Asp Pro 2120 2125
2130Lys Pro Lys Ile Phe Trp Leu Leu Pro Ser Asn Asp Met Ile
Ser 2135 2140 2145Phe Ser Ile Asp Arg
Tyr Thr Phe His Ala Asn Gly Ser Leu Thr 2150 2155
2160Ile Asn Lys Val Lys Leu Leu Asp Ser Gly Glu Tyr Val
Cys Val 2165 2170 2175Ala Arg Asn Pro
Ser Gly Asp Asp Thr Lys Met Tyr Lys Leu Asp 2180
2185 2190Val Val Ser Lys Pro Pro Leu Ile Asn Gly Leu
Tyr Thr Asn Arg 2195 2200 2205Thr Val
Ile Lys Ala Thr Ala Val Arg His Ser Lys Lys His Phe 2210
2215 2220Asp Cys Arg Ala Glu Gly Thr Pro Ser Pro
Glu Val Met Trp Ile 2225 2230 2235Met
Pro Asp Asn Ile Phe Leu Thr Ala Pro Tyr Tyr Gly Ser Arg 2240
2245 2250Ile Thr Val His Lys Asn Gly Thr Leu
Glu Ile Arg Asn Val Arg 2255 2260
2265Leu Ser Asp Ser Ala Asp Phe Ile Cys Val Ala Arg Asn Glu Gly
2270 2275 2280Gly Glu Ser Val Leu Val
Val Gln Leu Glu Val Leu Glu Met Leu 2285 2290
2295Arg Arg Pro Thr Phe Arg Asn Pro Phe Asn Glu Lys Ile Val
Ala 2300 2305 2310Gln Leu Gly Lys Ser
Thr Ala Leu Asn Cys Ser Val Asp Gly Asn 2315 2320
2325Pro Pro Pro Glu Ile Ile Trp Ile Leu Pro Asn Gly Thr
Arg Phe 2330 2335 2340Ser Asn Gly Pro
Gln Ser Tyr Gln Tyr Leu Ile Ala Ser Asn Gly 2345
2350 2355Ser Phe Ile Ile Ser Lys Thr Thr Arg Glu Asp
Ala Gly Lys Tyr 2360 2365 2370Arg Cys
Ala Ala Arg Asn Lys Val Gly Tyr Ile Glu Lys Leu Val 2375
2380 2385Ile Leu Glu Ile Gly Gln Lys Pro Val Ile
Leu Thr Tyr Ala Pro 2390 2395 2400Gly
Thr Val Lys Gly Ile Ser Gly Glu Ser Leu Ser Leu His Cys 2405
2410 2415Val Ser Asp Gly Ile Pro Lys Pro Asn
Ile Lys Trp Thr Met Pro 2420 2425
2430Ser Gly Tyr Val Val Asp Arg Pro Gln Ile Asn Gly Lys Tyr Ile
2435 2440 2445Leu His Asp Asn Gly Thr
Leu Val Ile Lys Glu Ala Thr Ala Tyr 2450 2455
2460Asp Arg Gly Asn Tyr Ile Cys Lys Ala Gln Asn Ser Val Gly
His 2465 2470 2475Thr Leu Ile Thr Val
Pro Val Met Ile Val Ala Tyr Pro Pro Arg 2480 2485
2490Ile Thr Asn Arg Pro Pro Arg Ser Ile Val Thr Arg Thr
Gly Ala 2495 2500 2505Ala Phe Gln Leu
His Cys Val Ala Leu Gly Val Pro Lys Pro Glu 2510
2515 2520Ile Thr Trp Glu Met Pro Asp His Ser Leu Leu
Ser Thr Ala Ser 2525 2530 2535Lys Glu
Arg Thr His Gly Ser Glu Gln Leu His Leu Gln Gly Thr 2540
2545 2550Leu Val Ile Gln Asn Pro Gln Thr Ser Asp
Ser Gly Ile Tyr Lys 2555 2560 2565Cys
Thr Ala Lys Asn Pro Leu Gly Ser Asp Tyr Ala Ala Thr Tyr 2570
Ile Gln Val 2585
12 236 PRT Mus musculus
misc_feature (1)..(236) 'x' can be any amino acid
12
Met Gln Lys Arg Gly Arg Glu Val Ser Cys Leu Leu Ile Ser Leu Thr 1 5
10 15Ala Ile Cys Leu Val Val Thr Pro Gly Ser
Arg Val Cys Pro Arg Arg 20 25
30Cys Ala Cys Tyr Val Pro Thr Glu Val His Cys Thr Phe Arg Asp Leu
35 40 45Thr Ser Ile Pro Asp Gly Pro Ala
Asn Val Glu Arg Val Asn Leu Gly 50 55
60Tyr Asn Ser Leu Thr Arg Leu Thr Glu Asn Asp Phe Ser Gly Leu Ser65
70 75 80Arg Leu Glu Leu Leu
Met Leu His Ser Asn Gly Ile His Arg Val Ser 85
90 95Asp Lys Thr Phe Ser Gly Leu Gln Ser Leu Gln
Val Leu Lys Met Ser 100 105
110Tyr Asn Lys Val Gln Ile Ile Glu Lys Asp Thr Leu Tyr Gly Leu Arg
115 120 125Ser Leu Thr Arg Leu His Leu
Asp His Asn Asn Ile Glu Phe Ile Asn 130 135
140Pro Glu Ala Phe Tyr Gly Leu Thr Leu Leu Arg Leu Val His Leu
Glu145 150 155 160Gly Asn
Arg Leu Thr Lys Leu His Pro Asp Thr Phe Val Ser Leu Ser
165 170 175Tyr Leu Gln Ile Phe Lys Thr
Ser Phe Ile Lys Xaa Leu Tyr Leu Tyr 180 185
190Asp Asn Phe Thr Ser Leu Pro Lys Glu Met Val Ser Ser Met
Pro Asn 195 200 205Leu Glu Ser Leu
Tyr Leu His Gly Asn Pro Trp Thr Cys Asp Cys His 210
215 220Leu Lys Trp Leu Ser Glu Trp Met Gln Gly Asn Pro225
230 235
13 2597 PRT Rattus species
misc_feature (1)..(2597) 'x' can be any amino acid
13
Met Gln Val Arg Gly Arg Glu Val Ser Gly Leu Leu Ile Ser Leu Thr 1 5
10 15Ala Val Cys Leu Val Val Thr Pro Gly Ser
Arg Ala Cys Pro Arg Arg 20 25
30Cys Ala Cys Tyr Val Pro Thr Glu Val His Cys Thr Phe Arg Tyr Leu
35 40 45Thr Ser Ile Pro Asp Gly Ile Pro
Ala Asn Val Glu Arg Ile Asn Leu 50 55
60Gly Tyr Asn Ser Leu Thr Arg Leu Thr Glu Asn Asp Phe Asp Gly Leu65
70 75 80Ser Lys Leu Glu Leu
Leu Met Leu His Ser Asn Gly Ile His Arg Val 85
90 95Ser Asp Lys Thr Phe Ser Gly Leu Gln Ser Leu
Gln Val Leu Lys Met 100 105
110Ser Tyr Asn Lys Val Gln Ile Ile Arg Lys Asp Thr Phe Tyr Gly Leu
115 120 125Gly Ser Leu Val Arg Leu His
Leu Asp His Asn Asn Ile Glu Phe Ile 130 135
140Asn Pro Glu Ala Phe Tyr Gly Leu Thr Ser Leu Arg Leu Val His
Leu145 150 155 160Glu Gly
Asn Arg Leu Thr Lys Leu His Pro Asp Thr Phe Val Ser Leu
165 170 175Ser Tyr Leu Gln Ile Phe Lys
Thr Ser Phe Ile Lys Tyr Leu Phe Leu 180 185
190Ser Asp Asn Phe Leu Thr Ser Leu Pro Lys Glu Met Val Ser
Tyr Met 195 200 205Pro Asn Leu Glu
Ser Leu Tyr Leu His Gly Asn Pro Trp Thr Cys Asp 210
215 220Cys His Leu Lys Trp Leu Ser Glu Trp Met Gln Gly
Asn Pro Asp Ile225 230 235
240Ile Lys Cys Lys Lys Asp Arg Ser Ser Ser Ser Pro Gln Gln Cys Pro
245 250 255Leu Cys Met Asn Pro
Arg Ile Ser Lys Gly Arg Pro Phe Ala Met Val 260
265 270Pro Ser Gly Ala Phe Leu Cys Thr Lys Pro Thr Ile
Asp Pro Ser Leu 275 280 285Lys Ser
Lys Ser Leu Val Thr Gln Glu Asp Asn Gly Ser Ala Ser Thr 290
295 300Ser Pro Gln Asp Phe Ile Glu Pro Phe Gly Ser
Leu Ser Leu Asn Met305 310 315
320Thr Xaa Xaa Ser Gly Asn Lys Ala Asp Met Val Cys Ser Ile Gln Lys
325 330 335Pro Ser Arg Thr
Ser Pro Thr Ala Phe Thr Glu Glu Asn Asp Tyr Ile 340
345 350Met Leu Asn Ala Ser Phe Ser Thr Asn Leu Val
Cys Ser Val Asp Tyr 355 360 365Asn
His Ile Gln Pro Val Trp Gln Leu Leu Ala Leu Tyr Ser Asp Ser 370
375 380Pro Leu Ile Leu Glu Arg Lys Pro Gln Leu
Thr Glu Thr Pro Ser Leu385 390 395
400Ser Ser Arg Tyr Lys Gln Val Ala Leu Arg Pro Glu Asp Ile Phe
Thr 405 410 415Ser Ile Glu
Ala Asp Val Arg Ala Asp Pro Phe Trp Phe Gln Gln Glu 420
425 430Lys Ile Val Leu Gln Leu Asn Arg Thr Ala
Thr Thr Leu Ser Thr Leu 435 440
445Gln Ile Gln Phe Ser Thr Asp Ala Gln Ile Ala Leu Pro Arg Ala Glu 450
455 460Met Arg Ala Glu Arg Leu Lys Trp
Thr Met Ile Leu Met Met Asn Asn465 470
475 480Pro Lys Leu Glu Arg Thr Val Leu Val Gly Gly Thr
Ile Ala Leu Ser 485 490
495Cys Pro Gly Lys Gly Asp Pro Ser Pro His Leu Glu Trp Leu Leu Ala
500 505 510Asp Gly Ser Lys Val Arg
Ala Pro Tyr Val Ser Glu Asp Gly Arg Ile 515 520
525Leu Ile Asp Lys Asn Gly Lys Leu Glu Leu Gln Met Ala Asp
Ser Phe 530 535 540Asp Ala Gly Leu Tyr
His Cys Ile Ser Thr Asn Asp Ala Asp Ala Asp545 550
555 560Val Leu Thr Tyr Arg Ile Thr Val Val Glu
Pro Tyr Gly Glu Ser Thr 565 570
575His Asp Ser Gly Val Gln His Thr Val Val Thr Gly Glu Thr Leu Asp
580 585 590Leu Pro Cys Leu Ser
Thr Gly Val Pro Asp Ala Ser Ile Ser Trp Ile 595
600 605Leu Pro Gly Asn Thr Val Phe Ser Gln Pro Ser Arg
Asp Arg Gln Ile 610 615 620Leu Asn Asn
Gly Thr Leu Arg Ile Leu Gln Val Thr Pro Lys Asp Gln625
630 635 640Gly His Tyr Gln Cys Val Ala
Ala Asn Pro Ser Gly Ala Asp Phe Ser 645
650 655Ser Phe Lys Val Ser Val Gln Lys Lys Gly Gln Arg
Met Val Glu His 660 665 670Asp
Arg Glu Ala Gly Gly Ser Gly Leu Gly Glu Pro Asn Ser Ser Val 675
680 685Ser Leu Lys Gln Pro Ala Ser Leu Lys
Leu Ser Ala Ser Ala Leu Thr 690 695
700Gly Ser Glu Ala Gly Lys Gln Val Ser Gly Val His Arg Lys Asn Lys705
710 715 720His Arg Asp Leu
Ile His Arg Arg Arg Gly Asp Ser Thr Leu Arg Arg 725
730 735Phe Arg Glu His Arg Arg Gln Leu Pro Leu
Ser Ala Arg Arg Ile Asp 740 745
750Pro Gln Arg Trp Ala Ala Leu Leu Glu Lys Ala Lys Lys Asn Ser Val
755 760 765Pro Lys Lys Gln Glu Asn Thr
Thr Val Lys Pro Val Pro Leu Ala Val 770 775
780Pro Leu Val Glu Leu Thr Asp Glu Glu Lys Asp Ala Ser Gly Met
Ile785 790 795 800Pro Pro
Asp Glu Glu Phe Met Val Leu Lys Thr Lys Ala Ser Gly Val
805 810 815Pro Gly Arg Ser Pro Thr Ala
Asp Ser Gly Pro Val Asn His Gly Phe 820 825
830Met Thr Ser Ile Ala Ser Gly Thr Glu Val Ser Thr Val Asn
Pro Gln 835 840 845Thr Leu Gln Ser
Glu His Leu Pro Asp Phe Lys Leu Phe Ser Val Thr 850
855 860Asn Gly Thr Ala Val Thr Lys Ser Met Asn Pro Ser
Ile Ala Ser Lys865 870 875
880Ile Glu Asp Thr Thr Asn Gln Asn Pro Ile Ile Ile Phe Pro Ser Val
885 890 895Ala Glu Ile Arg Asp
Ser Ala Gln Ala Gly Arg Ala Ser Ser Gln Ser 900
905 910Ala His Pro Val Thr Gly Gly Asn Met Ala Thr Tyr
Gly His Thr Asn 915 920 925Thr Tyr
Ser Ser Phe Thr Ser Lys Ala Ser Thr Val Leu Gln Pro Ile 930
935 940Asn Pro Thr Glu Ser Tyr Gly Pro Gln Ile Pro
Ile Thr Gly Val Ser945 950 955
960Arg Pro Ser Ser Ser Asp Ile Ser Ser His Thr Thr Ala Asp Pro Ser
965 970 975Phe Ser Ser His
Pro Ser Gly Ser His Thr Thr Ala Ser Ser Leu Phe 980
985 990His Ile Pro Arg Asn Asn Asn Thr Gly Asn Phe
Pro Leu Ser Arg His 995 1000
1005Leu Gly Arg Glu Arg Thr Ile Trp Ser Arg Gly Arg Val Lys Asn
1010 1015 1020Pro His Arg Thr Pro Val
Leu Arg Arg His Arg His Arg Thr Val 1025 1030
1035Arg Pro Ala Ile Lys Gly Pro Ala Asn Lys Asn Val Ser Gln
Val 1040 1045 1050Pro Ala Thr Glu Tyr
Pro Gly Met Cys His Thr Cys Pro Ser Ala 1055 1060
1065Glu Gly Leu Thr Val Ala Thr Ala Ala Leu Ser Val Pro
Ser Ser 1070 1075 1080Ser His Ser Ala
Leu Pro Lys Thr Asn Asn Val Gly Val Ile Ala 1085
1090 1095Glu Glu Ser Thr Thr Val Val Lys Lys Pro Leu
Leu Leu Phe Lys 1100 1105 1110Asp Lys
Gln Asn Val Asp Ile Glu Ile Ile Thr Thr Thr Thr Lys 1115
1120 1125Tyr Ser Gly Gly Glu Ser Asn His Val Ile
Pro Thr Glu Ala Ser 1130 1135 1140Met
Thr Ser Ala Pro Thr Ser Val Ser Leu Gly Lys Ser Pro Val 1145
1150 1155Asp Asn Ser Gly His Leu Ser Met Pro
Gly Thr Ile Gln Thr Gly 1160 1165
1170Lys Asp Ser Val Glu Thr Thr Pro Leu Pro Ser Pro Leu Ser Thr
1175 1180 1185Pro Ser Ile Pro Thr Ser
Thr Lys Phe Ser Lys Arg Lys Thr Pro 1190 1195
1200Leu His Gln Ile Phe Val Asn Asn Gln Lys Lys Glu Gly Met
Leu 1205 1210 1215Lys Asn Pro Tyr Gln
Phe Gly Leu Gln Lys Asn Pro Ala Ala Lys 1220 1225
1230Leu Pro Lys Ile Ala Pro Leu Leu Pro Thr Gly Gln Ser
Ser Pro 1235 1240 1245Ser Asp Ser Thr
Thr Leu Leu Thr Ser Pro Pro Pro Ala Leu Ser 1250
1255 1260Thr Thr Met Ala Ala Thr Gln Asn Lys Gly Thr
Glu Val Val Ser 1265 1270 1275Gly Ala
Arg Ser Leu Ser Ala Gly Lys Lys Gln Pro Phe Thr Asn 1280
1285 1290Ser Ser Pro Val Leu Pro Ser Thr Ile Ser
Lys Arg Ser Asn Thr 1295 1300 1305Leu
Asn Phe Leu Ser Thr Glu Thr Pro Thr Val Thr Ser Pro Thr 1310
1315 1320Ala Thr Ala Ser Val Ile Met Ser Glu
Thr Gln Arg Thr Arg Ser 1325 1330
1335Lys Glu Ala Lys Asp Gln Ile Lys Gly Pro Arg Lys Asn Arg Asn
1340 1345 1350Asn Ala Asn Thr Thr Pro
Arg Gln Val Ser Gly Tyr Ser Ala Tyr 1355 1360
1365Ser Ala Leu Thr Thr Ala Asp Thr Pro Leu Ala Phe Ser His
Ser 1370 1375 1380Pro Arg Gln Asp Asp
Gly Gly Asn Val Ser Ala Val Ala Tyr His 1385 1390
1395Ser Thr Thr Ser Leu Leu Ala Ile Thr Glu Leu Phe Glu
Lys Tyr 1400 1405 1410Thr Gln Thr Leu
Gly Asn Thr Thr Ala Leu Glu Thr Thr Leu Leu 1415
1420 1425Ser Lys Ser Gln Glu Ser Thr Thr Val Lys Arg
Ala Ser Asp Thr 1430 1435 1440Pro Pro
Pro Leu Leu Ser Ser Gly Ala Pro Pro Val Pro Thr Pro 1445
1450 1455Ser Pro Pro Pro Phe Thr Lys Gly Val Val
Thr Asp Ser Lys Val 1460 1465 1470Thr
Ser Ala Phe Gln Met Thr Ser Asn Arg Val Val Thr Ile Tyr 1475
1480 1485Glu Ser Ser Arg His Asn Thr Asp Leu
Gln Gln Pro Ser Ala Glu 1490 1495
1500Ala Ser Pro Asn Pro Glu Ile Ile Thr Gly Thr Thr Asp Ser Pro
1505 1510 1515Ser Asn Leu Phe Pro Ser
Thr Ser Val Pro Ala Leu Arg Val Asp 1520 1525
1530Lys Pro Gln Asn Ser Lys Trp Lys Pro Ser Pro Trp Pro Glu
His 1535 1540 1545Lys Tyr Gln Leu Lys
Ser Tyr Ser Glu Thr Ile Glu Lys Gly Lys 1550 1555
1560Arg Pro Ala Val Ser Met Ser Pro His Leu Ser Leu Pro
Glu Ala 1565 1570 1575Ser Thr His Ala
Ser His Trp Asn Thr Gln Lys His Ala Glu Lys 1580
1585 1590Ser Val Phe Asp Lys Lys Pro Gly Gln Asn Pro
Thr Ser Lys His 1595 1600 1605Leu Pro
Tyr Val Ser Leu Pro Lys Thr Leu Leu Lys Lys Pro Arg 1610
1615 1620Ile Ile Gly Gly Lys Ala Ala Ser Phe Thr
Val Pro Ala Asn Ser 1625 1630 1635Asp
Val Phe Leu Pro Cys Glu Ala Val Gly Asp Pro Leu Pro Ile 1640
1645 1650Ile His Trp Thr Arg Val Ser Ser Gly
Xaa Glu Ile Ser Gln Gly 1655 1660
1665Thr Gln Lys Ser Arg Phe His Val Leu Pro Asn Gly Thr Leu Ser
1670 1675 1680Ile Gln Arg Val Ser Ile
Gln Asp Arg Gly Gln Tyr Leu Cys Ser 1685 1690
1695Ala Phe Asn Pro Leu Gly Val Asp His Phe His Val Ser Leu
Ser 1700 1705 1710Val Val Phe Tyr Pro
Ala Arg Ile Leu Asp Arg His Val Lys Glu 1715 1720
1725Ile Thr Val His Phe Gly Ser Thr Val Glu Leu Lys Cys
Arg Val 1730 1735 1740Glu Gly Met Pro
Arg Pro Thr Val Ser Trp Ile Leu Ala Asn Gln 1745
1750 1755Thr Val Val Ser Glu Thr Ala Lys Gly Ser Arg
Lys Val Trp Val 1760 1765 1770Thr Pro
Asp Gly Thr Leu Ile Ile Tyr Asn Leu Ser Leu Tyr Asp 1775
1780 1785Arg Gly Phe Tyr Lys Cys Val Ala Ser Asn
Pro Ser Gly Gln Asp 1790 1795 1800Ser
Leu Leu Val Lys Ile Gln Val Ile Thr Ala Pro Pro Val Ile 1805
1810 1815Ile Glu Gln Lys Arg Gln Ala Ile Val
Gly Val Leu Gly Gly Ser 1820 1825
1830Leu Lys Leu Pro Cys Thr Ala Lys Gly Thr Pro Gln Pro Ser Val
1835 1840 1845His Trp Val Leu Tyr Asp
Gly Thr Glu Leu Lys Pro Leu Gln Leu 1850 1855
1860Thr His Ser Arg Phe Phe Leu Tyr Pro Asn Gly Thr Leu Tyr
Ile 1865 1870 1875Arg Ser Ile Ala Pro
Ser Val Arg Gly Thr Tyr Glu Cys Ile Ala 1880 1885
1890Thr Ser Ser Ser Gly Ser Glu Arg Arg Val Val Ile Leu
Thr Val 1895 1900 1905Glu Glu Gly Glu
Thr Ile Pro Arg Ile Glu Thr Ala Ser Gln Lys 1910
1915 1920Trp Thr Glu Val Asn Leu Gly Glu Lys Leu Leu
Leu Asn Cys Ser 1925 1930 1935Ala Thr
Gly Asp Pro Lys Pro Arg Ile Ile Trp Arg Leu Pro Ser 1940
1945 1950Lys Ala Val Ile Asp Gln Trp His Arg Met
Gly Ser Arg Ile His 1955 1960 1965Val
Tyr Pro Asn Gly Ser Leu Val Val Gly Ser Val Thr Glu Lys 1970
1975 1980Asp Ala Gly Asp Tyr Leu Cys Val Ala
Arg Asn Lys Met Gly Asp 1985 1990
1995Asp Leu Val Leu Met His Val Arg Leu Arg Leu Thr Pro Ala Lys
2000 2005 2010Ile Glu Gln Lys Gln Tyr
Phe Lys Lys Gln Val Leu His Gly Lys 2015 2020
2025Asp Phe Gln Val Asp Cys Lys Ala Ser Gly Ser Pro Val Pro
Glu 2030 2035 2040Val Ser Trp Ser Leu
Pro Asp Gly Thr Val Leu Asn Asn Val Ala 2045 2050
2055Gln Ala Asp Asp Ser Gly Tyr Arg Thr Lys Arg Tyr Thr
Leu Phe 2060 2065 2070His Asn Gly Thr
Leu Tyr Phe Asn Asn Val Gly Met Ala Glu Glu 2075
2080 2085Gly Asp Tyr Ile Cys Ser Ala Gln Asn Thr Leu
Gly Lys Asp Glu 2090 2095 2100Met Lys
Val His Leu Thr Val Leu Thr Ala Ile Pro Arg Ile Arg 2105
2110 2115Gln Ser Tyr Lys Thr Thr Met Arg Leu Arg
Ala Gly Glu Thr Ala 2120 2125 2130Val
Leu Asp Cys Glu Val Thr Gly Glu Pro Lys Pro Asn Val Phe 2135
2140 2145Trp Leu Leu Pro Ser Asn Asn Val Ile
Ser Phe Ser Asn Asp Arg 2150 2155
2160Phe Thr Phe His Ala Asn Arg Thr Leu Ser Ile His Lys Val Lys
2165 2170 2175Pro Leu Asp Ser Gly Asp
Tyr Val Cys Val Ala Gln Asn Pro Ser 2180 2185
2190Gly Asp Asp Thr Lys Thr Tyr Lys Leu Asp Ile Val Ser Lys
Pro 2195 2200 2205Pro Leu Ile Asn Gly
Leu Tyr Ala Asn Lys Thr Val Ile Lys Ala 2210 2215
2220Thr Ala Ile Arg His Ser Lys Lys Tyr Phe Asp Cys Arg
Ala Asp 2225 2230 2235Gly Ile Pro Ser
Ser Gln Val Thr Trp Ile Met Pro Gly Asn Ile 2240
2245 2250Phe Leu Pro Ala Pro Tyr Phe Gly Ser Arg Val
Thr Val His Pro 2255 2260 2265Asn Gly
Thr Leu Glu Met Arg Asn Ile Arg Leu Ser Asp Ser Ala 2270
2275 2280Asp Phe Thr Cys Val Val Arg Ser Glu Gly
Gly Glu Ser Val Leu 2285 2290 2295Val
Val Gln Leu Glu Val Leu Glu Met Leu Arg Arg Pro Thr Phe 2300
2305 2310Arg Asn Pro Phe Asn Glu Lys Val Ile
Ala Gln Ala Gly Lys Pro 2315 2320
2325Val Ala Leu Asn Cys Ser Val Asp Gly Asn Pro Pro Pro Glu Ile
2330 2335 2340Thr Trp Ile Leu Pro Asp
Gly Thr Gln Phe Ala Asn Arg Pro His 2345 2350
2355Asn Ser Pro Tyr Leu Met Ala Gly Asn Gly Ser Leu Ile Leu
Tyr 2360 2365 2370Lys Ala Thr Arg Asn
Lys Ser Gly Lys Tyr Arg Cys Ala Ala Arg 2375 2380
2385Asn Lys Val Gly Tyr Ile Glu Lys Leu Ile Leu Leu Glu
Ile Gly 2390 2395 2400Gln Lys Pro Val
Ile Leu Thr Tyr Glu Pro Gly Met Val Lys Ser 2405
2410 2415Val Ser Gly Glu Pro Leu Ser Leu His Cys Val
Ser Asp Gly Ile 2420 2425 2430Pro Lys
Pro Asn Val Lys Trp Thr Thr Pro Gly Gly His Val Ile 2435
2440 2445Asp Arg Pro Gln Val Asp Gly Lys Tyr Ile
Leu His Glu Asn Gly 2450 2455 2460Thr
Leu Val Ile Lys Ala Thr Thr Ala His Asp Gln Gly Asn Tyr 2465
2470 2475Ile Cys Arg Ala Gln Asn Ser Val Gly
Gln Ala Val Ile Ser Val 2480 2485
2490Ser Val Met Val Val Ala Tyr Pro Pro Arg Ile Ile Asn Tyr Leu
2495 2500 2505Pro Arg Asn Met Leu Arg
Arg Thr Gly Glu Ala Met Gln Leu His 2510 2515
2520Cys Val Ala Leu Gly Ile Pro Lys Pro Lys Val Thr Trp Glu
Thr 2525 2530 2535Pro Arg His Ser Leu
Leu Ser Lys Ala Thr Ala Arg Lys Pro His 2540 2545
2550Arg Ser Glu Met Leu His Pro Gln Gly Thr Leu Val Ile
Gln Asn 2555 2560 2565Leu Gln Thr Ser
Asp Ser Gly Val Tyr Lys Cys Arg Ala Gln Asn 2570
2575 2580Leu Leu Gly Thr Asp Tyr Ala Thr Thr Tyr Ile
Gln Val Leu 2585 2590
2595
SEQ ID NO: 14; length: 2586; type: PRT; organism: Homo sapiens
Sequence 14:
Met Lys Val Lys Gly Arg Gly Ile Thr Cys Leu
Leu Val Ser Phe Ala1 5 10
15Val Ile Cys Leu Val Ala Thr Pro Gly Gly Lys Ala Cys Pro Arg Arg
20 25 30Cys Ala Cys Tyr Met Pro Thr
Glu Val His Cys Thr Phe Arg Tyr Leu 35 40
45Thr Ser Ile Pro Asp Ser Ile Pro Pro Asn Val Glu Arg Ile Asn
Leu 50 55 60Gly Tyr Asn Ser Leu Val
Arg Leu Met Glu Thr Asp Phe Ser Gly Leu65 70
75 80Thr Lys Leu Glu Leu Leu Met Leu His Ser Asn
Gly Ile His Thr Ile 85 90
95Pro Asp Lys Thr Phe Ser Asp Leu Gln Ala Leu Gln Val Leu Lys Met
100 105 110Ser Tyr Asn Lys Val Arg
Lys Leu Gln Lys Asp Thr Phe Tyr Gly Leu 115 120
125Arg Ser Leu Thr Arg Leu His Met Asp His Asn Asn Ile Glu
Phe Ile 130 135 140Asn Pro Glu Val Phe
Tyr Gly Leu Asn Phe Leu Arg Leu Val His Leu145 150
155 160Glu Gly Asn Gln Leu Thr Lys Leu His Pro
Asp Thr Phe Val Ser Leu 165 170
175Ser Tyr Leu Gln Ile Phe Lys Ile Ser Phe Ile Lys Phe Leu Tyr Leu
180 185 190Ser Asp Asn Phe Leu
Thr Ser Leu Pro Gln Glu Met Val Ser Tyr Met 195
200 205Pro Asp Leu Asp Ser Leu Tyr Leu His Gly Asn Pro
Trp Thr Cys Asp 210 215 220Cys His Leu
Lys Trp Leu Ser Asp Trp Ile Gln Pro Asp Val Ile Lys225
230 235 240Cys Lys Lys Asp Arg Ser Pro
Ser Ser Ala Gln Gln Cys Pro Leu Cys 245
250 255Met Asn Pro Arg Thr Ser Lys Gly Lys Pro Leu Ala
Met Val Ser Ala 260 265 270Ala
Ala Phe Gln Cys Ala Lys Pro Thr Ile Asp Ser Ser Leu Lys Ser 275
280 285Lys Ser Leu Thr Ile Leu Glu Asp Ser
Ser Ser Ala Phe Ile Ser Pro 290 295
300Gln Gly Phe Met Ala Pro Phe Gly Ser Leu Thr Leu Asn Met Thr Asp305
310 315 320Gln Ser Gly Asn
Glu Ala Asn Met Val Cys Ser Ile Gln Lys Pro Ser 325
330 335Arg Thr Ser Pro Ile Ala Phe Thr Glu Glu
Asn Asp Tyr Ile Val Leu 340 345
350Asn Thr Ser Phe Ser Thr Phe Leu Val Cys Asn Ile Asp Tyr Gly His
355 360 365Ile Gln Pro Val Trp Gln Ile
Leu Ala Leu Tyr Ser Asp Ser Pro Leu 370 375
380Ile Leu Glu Arg Ser His Leu Leu Ser Glu Thr Pro Gln Leu Tyr
Tyr385 390 395 400Lys Tyr
Lys Gln Val Ala Pro Lys Pro Glu Asp Ile Phe Thr Asn Ile
405 410 415Glu Ala Asp Leu Arg Ala Asp
Pro Ser Trp Leu Met Gln Asp Gln Ile 420 425
430Ser Leu Gln Leu Asn Arg Thr Ala Thr Thr Phe Ser Thr Leu
Gln Ile 435 440 445Gln Tyr Ser Ser
Asp Ala Gln Ile Thr Leu Pro Arg Ala Glu Met Arg 450
455 460Pro Val Lys His Lys Trp Thr Met Ile Ser Arg Asp
Asn Asn Thr Lys465 470 475
480Leu Glu His Thr Val Leu Val Gly Gly Thr Val Gly Leu Asn Cys Pro
485 490 495Gly Gln Gly Asp Pro
Thr Pro His Val Asp Trp Leu Leu Ala Asp Gly 500
505 510Ser Lys Val Arg Ala Pro Tyr Val Ser Glu Asp Gly
Arg Ile Leu Ile 515 520 525Asp Lys
Ser Gly Lys Leu Glu Leu Gln Met Ala Asp Ser Phe Asp Thr 530
535 540Gly Val Tyr His Cys Ile Ser Ser Asn Tyr Asp
Asp Ala Asp Ile Leu545 550 555
560Thr Tyr Arg Ile Thr Val Val Glu Pro Leu Val Glu Ala Tyr Gln Glu
565 570 575Asn Gly Ile His
His Thr Val Phe Ile Gly Glu Thr Leu Asp Leu Pro 580
585 590Cys His Ser Thr Gly Ile Pro Asp Ala Ser Ile
Ser Trp Val Ile Pro 595 600 605Gly
Asn Asn Val Leu Tyr Gln Ser Ser Arg Asp Lys Lys Val Leu Asn 610
615 620Asn Gly Thr Leu Arg Ile Leu Gln Val Thr
Pro Lys Asp Gln Gly Tyr625 630 635
640Tyr Arg Cys Val Ala Ala Asn Pro Ser Gly Val Asp Phe Leu Ile
Phe 645 650 655Gln Val Ser
Val Lys Met Lys Gly Gln Arg Pro Leu Glu His Asp Gly 660
665 670Glu Thr Glu Gly Ser Gly Leu Asp Glu Ser
Asn Pro Ile Ala His Leu 675 680
685Lys Glu Pro Pro Gly Ala Gln Leu Arg Thr Ser Ala Leu Met Glu Ala 690
695 700Glu Val Gly Lys His Thr Ser Ser
Thr Ser Lys Arg His Asn Tyr Arg705 710
715 720Glu Leu Thr Leu Gln Arg Arg Gly Asp Ser Thr His
Arg Arg Phe Arg 725 730
735Glu Asn Arg Arg His Phe Pro Pro Ser Ala Arg Arg Ile Asp Pro Gln
740 745 750His Trp Ala Ala Leu Leu
Glu Lys Ala Lys Lys Asn Ala Met Pro Asp 755 760
765Lys Arg Glu Asn Thr Thr Val Ser Pro Pro Pro Val Val Thr
Gln Leu 770 775 780Pro Asn Ile Pro Gly
Glu Glu Asp Asp Ser Ser Gly Met Leu Ala Leu785 790
795 800His Glu Glu Phe Met Val Pro Ala Thr Lys
Ala Leu Asn Leu Pro Ala 805 810
815Arg Thr Val Thr Ala Asp Ser Arg Thr Ile Ser Asp Ser Pro Met Thr
820 825 830Asn Ile Asn Tyr Gly
Thr Glu Phe Ser Pro Val Val Asn Ser Gln Ile 835
840 845Leu Pro Pro Glu Glu Pro Thr Asp Phe Lys Leu Ser
Thr Ala Ile Lys 850 855 860Thr Thr Ala
Met Ser Lys Asn Ile Asn Pro Thr Met Ser Ser Gln Ile865
870 875 880Gln Gly Thr Thr Asn Gln His
Ser Ser Thr Val Phe Pro Leu Leu Leu 885
890 895Gly Ala Thr Glu Phe Gln Asp Ser Asp Gln Met Gly
Arg Gly Arg Glu 900 905 910His
Phe Gln Ser Arg Pro Pro Ile Thr Val Arg Thr Met Ile Lys Asp 915
920 925Val Asn Val Lys Met Leu Ser Ser Thr
Thr Asn Lys Leu Leu Leu Glu 930 935
940Ser Val Asn Thr Thr Asn Ser His Gln Thr Ser Val Arg Glu Val Ser945
950 955 960Glu Pro Arg His
Asn His Phe Tyr Ser His Thr Thr Gln Ile Leu Ser 965
970 975Thr Ser Thr Phe Pro Ser Asp Pro His Thr
Ala Ala His Ser Gln Phe 980 985
990Pro Ile Pro Arg Asn Ser Thr Val Asn Ile Pro Leu Phe Arg Arg Phe
995 1000 1005Gly Arg Gln Arg Lys Ile
Gly Gly Arg Gly Arg Ile Ile Ser Pro 1010 1015
1020Tyr Arg Thr Pro Val Leu Arg Arg His Arg Tyr Ser Ile Phe
Arg 1025 1030 1035Ser Thr Thr Arg Gly
Ser Ser Glu Lys Ser Thr Thr Ala Phe Ser 1040 1045
1050Ala Thr Val Leu Asn Val Thr Cys Leu Ser Cys Leu Pro
Arg Glu 1055 1060 1065Arg Leu Thr Thr
Ala Thr Ala Ala Leu Ser Phe Pro Ser Ala Ala 1070
1075 1080Pro Ile Thr Phe Pro Lys Ala Asp Ile Ala Arg
Val Pro Ser Glu 1085 1090 1095Glu Ser
Thr Thr Leu Val Gln Asn Pro Leu Leu Leu Leu Glu Asn 1100
1105 1110Lys Pro Ser Val Glu Lys Thr Thr Pro Thr
Ile Lys Tyr Phe Arg 1115 1120 1125Thr
Glu Ile Ser Gln Val Thr Pro Thr Gly Ala Val Met Thr Tyr 1130
1135 1140Ala Pro Thr Ser Ile Pro Met Glu Lys
Thr His Lys Val Asn Ala 1145 1150
1155Ser Tyr Pro Arg Val Ser Ser Thr Asn Glu Ala Lys Arg Asp Ser
1160 1165 1170Val Ile Thr Ser Ser Leu
Ser Gly Ala Ile Thr Lys Pro Pro Met 1175 1180
1185Thr Ile Ile Ala Ile Thr Arg Phe Ser Arg Arg Lys Ile Pro
Trp 1190 1195 1200Gln Gln Asn Phe Val
Asn Asn His Asn Pro Lys Gly Arg Leu Arg 1205 1210
1215Asn Gln His Lys Val Ser Leu Gln Lys Ser Thr Ala Val
Met Leu 1220 1225 1230Pro Lys Thr Ser
Pro Ala Leu Pro Gln Arg Gln Ser Ser Pro Phe 1235
1240 1245His Phe Thr Thr Leu Ser Thr Ser Val Met Gln
Ile Pro Ser Asn 1250 1255 1260Thr Leu
Thr Thr Ala His His Thr Thr Thr Lys Thr His Asn Pro 1265
1270 1275Gly Ser Leu Pro Thr Lys Lys Glu Leu Pro
Phe Pro Pro Leu Asn 1280 1285 1290Pro
Met Leu Pro Ser Ile Ile Ser Lys Asp Ser Ser Thr Lys Ser 1295
1300 1305Ile Ile Ser Thr Gln Thr Ala Ile Pro
Ala Thr Thr Pro Thr Phe 1310 1315
1320Pro Ala Ser Val Ile Thr Tyr Glu Thr Gln Thr Glu Arg Ser Arg
1325 1330 1335Ala Gln Thr Ile Gln Arg
Glu Gln Glu Pro Gln Lys Lys Asn Arg 1340 1345
1350Thr Asp Pro Asn Ile Ser Pro Asp Gln Ser Ser Gly Phe Thr
Thr 1355 1360 1365Pro Thr Ala Met Thr
Pro Pro Ala Leu Ala Phe Thr His Ser Pro 1370 1375
1380Pro Glu Asn Thr Thr Gly Ile Ser Ser Thr Ile Ser Phe
His Ser 1385 1390 1395Arg Thr Leu Asn
Leu Thr Asp Val Ile Glu Glu Leu Ala Gln Ala 1400
1405 1410Ser Thr Gln Thr Leu Lys Ser Thr Ile Ala Ser
Glu Thr Thr Leu 1415 1420 1425Ser Ser
Lys Ser His Gln Ser Thr Thr Thr Arg Lys Ala Ser Leu 1430
1435 1440Asp Thr Pro Ile Pro Pro Phe Leu Ser Ser
Ser Ala Thr Leu Met 1445 1450 1455Pro
Val Pro Ile Ser Pro Pro Phe Thr Gln Arg Ala Val Thr Asp 1460
1465 1470Thr Arg Gly Asp Ser His Phe Arg Leu
Met Thr Asn Thr Val Val 1475 1480
1485Lys Leu His Glu Ser Ser Arg His Asn Leu Gln Met Pro Ser Ser
1490 1495 1500Gln Leu Glu Pro Leu Thr
Ser Ser Thr Ser Asn Leu Leu His Ser 1505 1510
1515Thr Pro Met Pro Ala Leu Thr Thr Val Lys Ser Gln Asn Ser
Lys 1520 1525 1530Leu Thr Pro Ser Pro
Trp Ala Glu Tyr Gln Phe Trp His Lys Pro 1535 1540
1545Tyr Ser Asp Ile Ala Glu Lys Gly Lys Lys Pro Glu Val
Ser Met 1550 1555 1560Leu Ala Thr Thr
Gly Leu Ser Glu Ala Thr Thr Leu Val Ser Asp 1565
1570 1575Trp Asp Gly Gln Lys Asn Thr Lys Lys Ser Asp
Phe Asp Lys Lys 1580 1585 1590Pro Val
Gln Glu Ala Thr Thr Ser Lys Leu Leu Pro Phe Asp Ser 1595
1600 1605Leu Ser Arg Tyr Ile Phe Glu Lys Pro Arg
Ile Val Gly Gly Lys 1610 1615 1620Ala
Ala Ser Phe Thr Ile Pro Ala Asn Ser Asp Ala Phe Leu Pro 1625
1630 1635Cys Glu Ala Val Gly Asn Pro Leu Pro
Thr Ile His Trp Thr Arg 1640 1645
1650Val Ser Gly Leu Asp Leu Ser Arg Gly Asn Gln Asn Ser Arg Val
1655 1660 1665Gln Val Leu Pro Asn Gly
Thr Leu Ser Ile Gln Arg Val Glu Ile 1670 1675
1680Gln Asp Arg Gly Gln Tyr Leu Cys Ser Ala Ser Asn Leu Phe
Gly 1685 1690 1695Thr Asp His Leu His
Val Thr Leu Ser Val Val Ser Tyr Pro Pro 1700 1705
1710Arg Ile Leu Glu Arg Arg Thr Lys Glu Ile Thr Val His
Ser Gly 1715 1720 1725Ser Thr Val Glu
Leu Lys Cys Arg Ala Glu Gly Arg Pro Ser Pro 1730
1735 1740Thr Val Thr Trp Ile Leu Ala Asn Gln Thr Val
Val Ser Glu Ser 1745 1750 1755Ser Gln
Gly Ser Arg Gln Ala Val Val Thr Val Asp Gly Thr Leu 1760
1765 1770Val Leu His Asn Leu Ser Ile Tyr Asp Arg
Gly Phe Tyr Lys Cys 1775 1780 1785Val
Ala Ser Asn Pro Gly Gly Gln Asp Ser Leu Leu Val Lys Ile 1790
1795 1800Gln Val Ile Ala Ala Pro Pro Val Ile
Leu Glu Gln Arg Arg Gln 1805 1810
1815Val Ile Val Gly Thr Trp Gly Glu Ser Leu Lys Leu Pro Cys Thr
1820 1825 1830Ala Lys Gly Thr Pro Gln
Pro Ser Val Tyr Trp Val Leu Ser Asp 1835 1840
1845Gly Thr Glu Val Lys Pro Leu Gln Phe Thr Asn Ser Lys Leu
Phe 1850 1855 1860Leu Phe Ser Asn Gly
Thr Leu Tyr Ile Arg Asn Leu Ala Ser Ser 1865 1870
1875Asp Arg Gly Thr Tyr Glu Cys Ile Ala Thr Ser Ser Thr
Gly Ser 1880 1885 1890Glu Arg Arg Val
Val Met Leu Thr Met Glu Glu Arg Val Thr Ser 1895
1900 1905Pro Arg Ile Glu Ala Ala Ser Gln Lys Arg Thr
Glu Val Asn Phe 1910 1915 1920Gly Asp
Lys Leu Leu Leu Asn Cys Ser Ala Thr Gly Glu Pro Lys 1925
1930 1935Pro Gln Ile Met Trp Arg Leu Pro Ser Lys
Ala Val Val Asp Gln 1940 1945 1950Gly
Ser Trp Ile His Val Tyr Pro Asn Gly Ser Leu Phe Ile Gly 1955
1960 1965Ser Val Thr Glu Lys Asp Ser Gly Val
Tyr Leu Cys Val Ala Arg 1970 1975
1980Asn Lys Met Gly Asp Asp Leu Ile Leu Met His Val Ser Leu Arg
1985 1990 1995Leu Lys Pro Ala Lys Ile
Asp His Lys Gln Tyr Phe Arg Lys Gln 2000 2005
2010Val Leu His Gly Lys Asp Phe Gln Val Asp Cys Lys Ala Ser
Gly 2015 2020 2025Ser Pro Val Pro Glu
Ile Ser Trp Ser Leu Pro Asp Gly Thr Met 2030 2035
2040Ile Asn Asn Ala Met Gln Ala Asp Asp Ser Gly His Arg
Thr Arg 2045 2050 2055Arg Tyr Thr Leu
Phe Asn Asn Gly Thr Leu Tyr Phe Asn Lys Val 2060
2065 2070Gly Val Ala Glu Glu Gly Asp Tyr Thr Cys Tyr
Ala Gln Asn Thr 2075 2080 2085Leu Gly
Lys Asp Glu Met Lys Val His Leu Thr Val Ile Thr Ala 2090
2095 2100Ala Pro Arg Ile Arg Gln Ser Asn Lys Thr
Asn Lys Arg Ile Lys 2105 2110 2115Ala
Gly Asp Thr Ala Val Leu Asp Cys Glu Val Thr Gly Asp Pro 2120
2125 2130Lys Pro Lys Ile Phe Trp Leu Leu Pro
Ser Asn Asp Met Ile Ser 2135 2140
2145Phe Ser Ile Asp Arg Tyr Thr Phe His Ala Asn Gly Ser Leu Thr
2150 2155 2160Ile Asn Lys Val Lys Leu
Leu Asp Ser Gly Glu Tyr Val Cys Val 2165 2170
2175Ala Arg Asn Pro Ser Gly Asp Asp Thr Lys Met Tyr Lys Leu
Asp 2180 2185 2190Val Val Ser Lys Pro
Pro Leu Ile Asn Gly Leu Tyr Thr Asn Arg 2195 2200
2205Thr Val Ile Lys Ala Thr Ala Val Arg His Ser Lys Lys
His Phe 2210 2215 2220Asp Cys Arg Ala
Glu Gly Thr Pro Ser Pro Glu Val Met Trp Ile 2225
2230 2235Met Pro Asp Asn Ile Phe Leu Thr Ala Pro Tyr
Tyr Gly Ser Arg 2240 2245 2250Ile Thr
Val His Lys Asn Gly Thr Leu Glu Ile Arg Asn Val Arg 2255
2260 2265Leu Ser Asp Ser Ala Asp Phe Ile Cys Val
Ala Arg Asn Glu Gly 2270 2275 2280Gly
Glu Ser Val Leu Val Val Gln Leu Glu Val Leu Glu Met Leu 2285
2290 2295Arg Arg Pro Thr Phe Arg Asn Pro Phe
Asn Glu Lys Ile Val Ala 2300 2305
2310Gln Leu Gly Lys Ser Thr Ala Leu Asn Cys Ser Val Asp Gly Asn
2315 2320 2325Pro Pro Pro Glu Ile Ile
Trp Ile Leu Pro Asn Gly Thr Arg Phe 2330 2335
2340Ser Asn Gly Pro Gln Ser Tyr Gln Tyr Leu Ile Ala Ser Asn
Gly 2345 2350 2355Ser Phe Ile Ile Ser
Lys Thr Thr Arg Glu Asp Ala Gly Lys Tyr 2360 2365
2370Arg Cys Ala Ala Arg Asn Lys Val Gly Tyr Ile Glu Lys
Leu Val 2375 2380 2385Ile Leu Glu Ile
Gly Gln Lys Pro Val Ile Leu Thr Tyr Ala Pro 2390
2395 2400Gly Thr Val Lys Gly Ile Ser Gly Glu Ser Leu
Ser Leu His Cys 2405 2410 2415Val Ser
Asp Gly Ile Pro Lys Pro Asn Ile Lys Trp Thr Met Pro 2420
2425 2430Ser Gly Tyr Val Val Asp Arg Pro Gln Ile
Asn Gly Lys Tyr Ile 2435 2440 2445Leu
His Asp Asn Gly Thr Leu Val Ile Lys Glu Ala Thr Ala Tyr 2450
2455 2460Asp Arg Gly Asn Tyr Ile Cys Lys Ala
Gln Asn Ser Val Gly His 2465 2470
2475Thr Leu Ile Thr Val Pro Val Met Ile Val Ala Tyr Pro Pro Arg
2480 2485 2490Ile Thr Asn Arg Pro Pro
Arg Ser Ile Val Thr Arg Thr Gly Ala 2495 2500
2505Ala Phe Gln Leu His Cys Val Ala Leu Gly Val Pro Lys Pro
Glu 2510 2515 2520Ile Thr Trp Glu Met
Pro Asp His Ser Leu Leu Ser Thr Ala Ser 2525 2530
2535Lys Glu Arg Thr His Gly Ser Glu Gln Leu His Leu Gln
Gly Thr 2540 2545 2550Leu Val Ile Gln
Asn Pro Gln Thr Ser Asp Ser Gly Ile Tyr Lys 2555
2560 2565Cys Thr Ala Lys Asn Pro Leu Gly Ser Asp Tyr
Ala Ala Thr Tyr 2570 2575 2580Ile Gln
Val 2585
SEQ ID NO: 15; length: 236; type: PRT; organism: Mus musculus
Feature: misc_feature; location: (1)..(236); other information: 'x' can be any amino acid
Sequence 15:
Met Gln Lys Arg Gly Arg Glu Val Ser Cys Leu Leu Ile Ser Leu
Thr1 5 10 15Ala Ile Cys
Leu Val Val Thr Pro Gly Ser Arg Val Cys Pro Arg Arg 20
25 30Cys Ala Cys Tyr Val Pro Thr Glu Val His
Cys Thr Phe Arg Asp Leu 35 40
45Thr Ser Ile Pro Asp Gly Pro Ala Asn Val Glu Arg Val Asn Leu Gly 50
55 60Tyr Asn Ser Leu Thr Arg Leu Thr Glu
Asn Asp Phe Ser Gly Leu Ser65 70 75
80Arg Leu Glu Leu Leu Met Leu His Ser Asn Gly Ile His Arg
Val Ser 85 90 95Asp Lys
Thr Phe Ser Gly Leu Gln Ser Leu Gln Val Leu Lys Met Ser 100
105 110Tyr Asn Lys Val Gln Ile Ile Glu Lys
Asp Thr Leu Tyr Gly Leu Arg 115 120
125Ser Leu Thr Arg Leu His Leu Asp His Asn Asn Ile Glu Phe Ile Asn
130 135 140Pro Glu Ala Phe Tyr Gly Leu
Thr Leu Leu Arg Leu Val His Leu Glu145 150
155 160Gly Asn Arg Leu Thr Lys Leu His Pro Asp Thr Phe
Val Ser Leu Ser 165 170
175Tyr Leu Gln Ile Phe Lys Thr Ser Phe Ile Lys Xaa Leu Tyr Leu Tyr
180 185 190Asp Asn Phe Thr Ser Leu
Pro Lys Glu Met Val Ser Ser Met Pro Asn 195 200
205Leu Glu Ser Leu Tyr Leu His Gly Asn Pro Trp Thr Cys Asp
Cys His 210 215 220Leu Lys Trp Leu Ser
Glu Trp Met Gln Gly Asn Pro225 230
235
SEQ ID NO: 16; length: 2587; type: PRT; organism: Homo sapiens
Sequence 16:
Met Lys Val Lys Gly Arg Gly Ile Thr Cys Leu
Leu Val Ser Phe Ala1 5 10
15Val Ile Cys Leu Val Ala Thr Pro Gly Gly Lys Ala Cys Pro Arg Arg
20 25 30Cys Ala Cys Tyr Met Pro Thr
Glu Val His Cys Thr Phe Arg Tyr Leu 35 40
45Thr Ser Ile Pro Asp Ser Ile Pro Pro Asn Val Glu Arg Ile Asn
Leu 50 55 60Gly Tyr Asn Ser Leu Val
Arg Leu Met Glu Thr Asp Phe Ser Gly Leu65 70
75 80Thr Lys Leu Glu Leu Leu Met Leu His Ser Asn
Gly Ile His Thr Ile 85 90
95Pro Asp Lys Thr Phe Ser Asp Leu Gln Ala Leu Gln Val Leu Lys Met
100 105 110Ser Tyr Asn Lys Val Arg
Lys Leu Gln Lys Asp Thr Phe Tyr Gly Leu 115 120
125Arg Ser Leu Thr Arg Leu His Met Asp His Asn Asn Ile Glu
Phe Ile 130 135 140Asn Pro Glu Val Phe
Tyr Gly Leu Asn Phe Leu Arg Leu Val His Leu145 150
155 160Glu Gly Asn Gln Leu Thr Lys Leu His Pro
Asp Thr Phe Val Ser Leu 165 170
175Ser Tyr Leu Gln Ile Phe Lys Ile Ser Phe Ile Lys Phe Leu Tyr Leu
180 185 190Ser Asp Asn Phe Leu
Thr Ser Leu Pro Gln Glu Met Val Ser Tyr Met 195
200 205Pro Asp Leu Asp Ser Leu Tyr Leu His Gly Asn Pro
Trp Thr Cys Asp 210 215 220Cys His Leu
Lys Trp Leu Ser Asp Trp Ile Gln Pro Asp Val Ile Lys225
230 235 240Cys Lys Lys Asp Arg Ser Pro
Ser Ser Ala Gln Gln Cys Pro Leu Cys 245
250 255Met Asn Pro Arg Thr Ser Lys Gly Lys Pro Leu Ala
Met Val Ser Ala 260 265 270Ala
Ala Phe Gln Cys Ala Lys Pro Thr Ile Asp Ser Ser Leu Lys Ser 275
280 285Lys Ser Leu Thr Ile Leu Glu Asp Ser
Ser Ser Ala Phe Ile Ser Pro 290 295
300Gln Gly Phe Met Ala Pro Phe Gly Ser Leu Thr Leu Asn Met Thr Asp305
310 315 320Gln Ser Gly Asn
Glu Ala Asn Met Val Cys Ser Ile Gln Lys Pro Ser 325
330 335Arg Thr Ser Pro Ile Ala Phe Thr Glu Glu
Asn Asp Tyr Ile Val Leu 340 345
350Asn Thr Ser Phe Ser Thr Phe Leu Val Cys Asn Ile Asp Tyr Gly His
355 360 365Ile Gln Pro Val Trp Gln Ile
Leu Ala Leu Tyr Ser Asp Ser Pro Leu 370 375
380Ile Leu Glu Arg Ser His Leu Leu Ser Glu Thr Pro Gln Leu Tyr
Tyr385 390 395 400Lys Tyr
Lys Gln Val Ala Pro Lys Pro Glu Asp Ile Phe Thr Asn Ile
405 410 415Glu Ala Asp Leu Arg Ala Asp
Pro Ser Trp Leu Met Gln Asp Gln Ile 420 425
430Ser Leu Gln Leu Asn Arg Thr Ala Thr Thr Phe Ser Thr Leu
Gln Ile 435 440 445Gln Tyr Ser Ser
Asp Ala Gln Ile Thr Leu Pro Arg Ala Glu Met Arg 450
455 460Pro Val Lys His Lys Trp Thr Met Ile Ser Arg Asp
Asn Asn Thr Lys465 470 475
480Leu Glu His Thr Val Leu Val Gly Gly Thr Val Gly Leu Asn Cys Pro
485 490 495Gly Gln Gly Asp Pro
Thr Pro His Val Asp Trp Leu Leu Ala Asp Gly 500
505 510Ser Lys Val Arg Ala Pro Tyr Val Ser Glu Asp Gly
Arg Ile Leu Ile 515 520 525Asp Lys
Ser Gly Lys Leu Glu Leu Gln Met Ala Asp Ser Phe Asp Thr 530
535 540Gly Val Tyr His Cys Ile Ser Ser Asn Tyr Asp
Asp Ala Asp Ile Leu545 550 555
560Thr Tyr Arg Ile Thr Val Val Glu Pro Leu Val Glu Ala Tyr Gln Glu
565 570 575Asn Gly Ile His
His Thr Val Phe Ile Gly Glu Thr Leu Asp Leu Pro 580
585 590Cys His Ser Thr Gly Ile Pro Asp Ala Ser Ile
Ser Trp Val Ile Pro 595 600 605Gly
Asn Asn Val Leu Tyr Gln Ser Ser Arg Asp Lys Lys Val Leu Asn 610
615 620Asn Gly Thr Leu Arg Ile Leu Gln Val Thr
Pro Lys Asp Gln Gly Tyr625 630 635
640Tyr Arg Cys Val Ala Ala Asn Pro Ser Gly Val Asp Phe Leu Ile
Phe 645 650 655Gln Val Ser
Val Lys Met Lys Gly Gln Arg Pro Leu Glu His Asp Gly 660
665 670Glu Thr Glu Gly Ser Gly Leu Asp Glu Ser
Asn Pro Ile Ala His Leu 675 680
685Lys Glu Pro Pro Gly Ala Gln Leu Arg Thr Ser Ala Leu Met Glu Ala 690
695 700Glu Val Gly Lys His Thr Ser Ser
Thr Ser Lys Arg His Asn Tyr Arg705 710
715 720Glu Leu Thr Leu Gln Arg Arg Gly Asp Ser Thr His
Arg Arg Phe Arg 725 730
735Glu Asn Arg Arg His Phe Pro Pro Ser Ala Arg Arg Ile Asp Pro Gln
740 745 750His Trp Ala Ala Leu Leu
Glu Lys Ala Lys Lys Asn Ala Met Pro Asp 755 760
765Lys Arg Glu Asn Thr Thr Val Ser Pro Pro Pro Val Val Thr
Gln Leu 770 775 780Pro Asn Ile Pro Gly
Glu Glu Asp Asp Ser Ser Gly Met Leu Ala Leu785 790
795 800His Glu Glu Phe Met Val Pro Ala Thr Lys
Ala Leu Asn Leu Pro Ala 805 810
815Arg Thr Val Thr Ala Asp Ser Arg Thr Ile Ser Asp Ser Pro Met Thr
820 825 830Asn Ile Asn Tyr Gly
Thr Glu Phe Ser Pro Val Val Asn Ser Gln Ile 835
840 845Leu Pro Pro Glu Glu Pro Thr Asp Phe Lys Leu Ser
Thr Ala Ile Lys 850 855 860Thr Thr Ala
Met Ser Lys Asn Ile Asn Pro Thr Met Ser Ser Gln Ile865
870 875 880Gln Gly Thr Thr Asn Gln His
Ser Ser Thr Val Phe Pro Leu Leu Leu 885
890 895Gly Ala Thr Glu Phe Gln Asp Ser Asp Gln Met Gly
Arg Gly Arg Glu 900 905 910His
Phe Gln Ser Arg Pro Pro Ile Thr Val Arg Thr Met Ile Lys Asp 915
920 925Val Asn Val Lys Met Leu Ser Ser Thr
Thr Asn Lys Leu Leu Leu Glu 930 935
940Ser Val Asn Thr Thr Asn Ser His Gln Thr Ser Val Arg Glu Val Ser945
950 955 960Glu Pro Arg His
Asn His Phe Tyr Ser His Thr Thr Gln Ile Leu Ser 965
970 975Thr Ser Thr Phe Pro Ser Asp Pro His Thr
Ala Ala His Ser Gln Phe 980 985
990Pro Ile Pro Arg Asn Ser Thr Val Asn Ile Pro Leu Phe Arg Arg Phe
995 1000 1005Gly Arg Gln Arg Lys Ile
Gly Gly Arg Gly Arg Ile Ile Ser Pro 1010 1015
1020Tyr Arg Thr Pro Val Leu Arg Arg His Arg Tyr Ser Ile Phe
Arg 1025 1030 1035Ser Thr Thr Arg Gly
Ser Ser Glu Lys Ser Thr Thr Ala Phe Ser 1040 1045
1050Ala Thr Val Leu Asn Val Thr Cys Leu Ser Cys Leu Pro
Arg Glu 1055 1060 1065Arg Leu Thr Thr
Ala Thr Ala Ala Leu Ser Phe Pro Ser Ala Ala 1070
1075 1080Pro Ile Thr Phe Pro Lys Ala Asp Ile Ala Arg
Val Pro Ser Glu 1085 1090 1095Glu Ser
Thr Thr Leu Val Gln Asn Pro Leu Leu Leu Leu Glu Asn 1100
1105 1110Lys Pro Ser Val Glu Lys Thr Thr Pro Thr
Ile Lys Tyr Phe Arg 1115 1120 1125Thr
Glu Ile Ser Gln Val Thr Pro Thr Gly Ala Val Met Thr Tyr 1130
1135 1140Ala Pro Thr Ser Ile Pro Met Glu Lys
Thr His Lys Val Asn Ala 1145 1150
1155Ser Tyr Pro Arg Val Ser Ser Thr Asn Glu Ala Lys Arg Asp Ser
1160 1165 1170Val Ile Thr Ser Ser Leu
Ser Gly Ala Ile Thr Lys Pro Pro Met 1175 1180
1185Thr Ile Ile Ala Ile Thr Arg Phe Ser Arg Arg Lys Ile Pro
Trp 1190 1195 1200Gln Gln Asn Phe Val
Asn Asn His Asn Pro Lys Gly Arg Leu Arg 1205 1210
1215Asn Gln His Lys Val Ser Leu Gln Lys Ser Thr Ala Val
Met Leu 1220 1225 1230Pro Lys Thr Ser
Pro Ala Leu Pro Gln Arg Gln Ser Ser Pro Phe 1235
1240 1245His Phe Thr Thr Leu Ser Thr Ser Val Met Gln
Ile Pro Ser Asn 1250 1255 1260Thr Leu
Thr Thr Ala His His Thr Thr Thr Lys Thr His Asn Pro 1265
1270 1275Gly Ser Leu Pro Thr Lys Lys Glu Leu Pro
Phe Pro Pro Leu Asn 1280 1285 1290Pro
Met Leu Pro Ser Ile Ile Ser Lys Asp Ser Ser Thr Lys Ser 1295
1300 1305Ile Ile Ser Thr Gln Thr Ala Ile Pro
Ala Thr Thr Pro Thr Phe 1310 1315
1320Pro Ala Ser Val Ile Thr Tyr Glu Thr Gln Thr Glu Arg Ser Arg
1325 1330 1335Ala Gln Thr Ile Gln Arg
Glu Gln Glu Pro Gln Lys Lys Asn Arg 1340 1345
1350Thr Asp Pro Asn Ile Ser Pro Asp Gln Ser Ser Gly Phe Thr
Thr 1355 1360 1365Pro Thr Ala Met Thr
Pro Pro Ala Leu Ala Phe Thr His Ser Pro 1370 1375
1380Pro Glu Asn Thr Thr Gly Ile Ser Ser Thr Ile Ser Phe
His Ser 1385 1390 1395Arg Thr Leu Asn
Leu Thr Asp Val Ile Glu Glu Leu Ala Gln Ala 1400
1405 1410Ser Thr Gln Thr Leu Lys Ser Thr Ile Ala Ser
Glu Thr Thr Leu 1415 1420 1425Ser Ser
Lys Ser His Gln Ser Thr Thr Thr Arg Lys Ala Ser Leu 1430
1435 1440Asp Thr Pro Ile Pro Pro Phe Leu Ser Ser
Ser Ala Thr Leu Met 1445 1450 1455Pro
Val Pro Ile Ser Pro Pro Phe Thr Gln Arg Ala Val Thr Asp 1460
1465 1470Thr Arg Gly Asp Ser His Phe Arg Leu
Met Thr Asn Thr Val Val 1475 1480
1485Lys Leu His Glu Ser Ser Arg His Asn Leu Gln Met Pro Ser Ser
1490 1495 1500Gln Leu Glu Pro Leu Thr
Ser Ser Thr Ser Asn Leu Leu His Ser 1505 1510
1515Thr Pro Met Pro Ala Leu Thr Thr Val Lys Ser Gln Asn Ser
Lys 1520 1525 1530Leu Thr Pro Ser Pro
Trp Ala Glu Tyr Gln Phe Trp His Lys Pro 1535 1540
1545Tyr Ser Asp Ile Ala Glu Lys Gly Lys Lys Pro Glu Val
Ser Met 1550 1555 1560Leu Ala Thr Thr
Gly Leu Ser Glu Ala Thr Thr Leu Val Ser Asp 1565
1570 1575Trp Asp Gly Gln Lys Asn Thr Lys Lys Ser Asp
Phe Asp Lys Lys 1580 1585 1590Pro Val
Gln Glu Ala Thr Thr Ser Lys Leu Leu Pro Phe Asp Ser 1595
1600 1605Leu Ser Arg Tyr Ile Phe Glu Lys Pro Arg
Ile Val Gly Gly Lys 1610 1615 1620Ala
Ala Ser Phe Thr Ile Pro Ala Asn Ser Asp Ala Phe Leu Pro 1625
1630 1635Cys Glu Ala Val Gly Asn Pro Leu Pro
Thr Ile His Trp Thr Arg 1640 1645
1650Val Ser Gly Leu Asp Leu Ser Arg Gly Asn Gln Asn Ser Arg Val
1655 1660 1665Gln Val Leu Pro Asn Gly
Thr Leu Ser Ile Gln Arg Val Glu Ile 1670 1675
1680Gln Asp Arg Gly Gln Tyr Leu Cys Ser Ala Ser Asn Leu Phe
Gly 1685 1690 1695Thr Asp His Leu His
Val Thr Leu Ser Val Val Ser Tyr Pro Pro 1700 1705
1710Arg Ile Leu Glu Arg Arg Thr Lys Glu Ile Thr Val His
Ser Gly 1715 1720 1725Ser Thr Val Glu
Leu Lys Cys Arg Ala Glu Gly Arg Pro Ser Pro 1730
1735 1740Thr Val Thr Trp Ile Leu Ala Asn Gln Thr Val
Val Ser Glu Ser 1745 1750 1755Ser Gln
Gly Ser Arg Gln Ala Val Val Thr Val Asp Gly Thr Leu 1760
1765 1770Val Leu His Asn Leu Ser Ile Tyr Asp Arg
Gly Phe Tyr Lys Cys 1775 1780 1785Val
Ala Ser Asn Pro Gly Gly Gln Asp Ser Leu Leu Val Lys Ile 1790
1795 1800Gln Val Ile Ala Ala Pro Pro Val Ile
Leu Glu Gln Arg Arg Gln 1805 1810
1815Val Ile Val Gly Thr Trp Gly Glu Ser Leu Lys Leu Pro Cys Thr
1820 1825 1830Ala Lys Gly Thr Pro Gln
Pro Ser Val Tyr Trp Val Leu Ser Asp 1835 1840
1845Gly Thr Glu Val Lys Pro Leu Gln Phe Thr Asn Ser Lys Leu
Phe 1850 1855 1860Leu Phe Ser Asn Gly
Thr Leu Tyr Ile Arg Asn Leu Ala Ser Ser 1865 1870
1875Asp Arg Gly Thr Tyr Glu Cys Ile Ala Thr Ser Ser Thr
Gly Ser 1880 1885 1890Glu Arg Arg Val
Val Met Leu Thr Met Glu Glu Arg Val Thr Ser 1895
1900 1905Pro Arg Ile Glu Ala Ala Ser Gln Lys Arg Thr
Glu Val Asn Phe 1910 1915 1920Gly Asp
Lys Leu Leu Leu Asn Cys Ser Ala Thr Gly Glu Pro Lys 1925
1930 1935Pro Gln Ile Met Trp Arg Leu Pro Ser Lys
Ala Val Val Asp Gln 1940 1945 1950Gly
Ser Trp Ile His Val Tyr Pro Asn Gly Ser Leu Phe Ile Gly 1955
1960 1965Ser Val Thr Glu Lys Asp Ser Gly Val
Tyr Leu Cys Val Ala Arg 1970 1975
1980Asn Lys Met Gly Asp Asp Leu Ile Leu Met His Val Ser Leu Arg
1985 1990 1995Leu Lys Pro Ala Lys Ile
Asp His Lys Gln Tyr Phe Arg Lys Gln 2000 2005
2010Val Leu His Gly Lys Asp Phe Gln Val Asp Cys Lys Ala Ser
Gly 2015 2020 2025Ser Pro Val Pro Glu
Ile Ser Trp Ser Leu Pro Asp Gly Thr Met 2030 2035
2040Ile Asn Asn Ala Met Gln Ala Asp Asp Ser Gly His Arg
Thr Arg 2045 2050 2055Arg Tyr Thr Leu
Phe Asn Asn Gly Thr Leu Tyr Phe Asn Lys Val 2060
2065 2070Gly Val Ala Glu Glu Gly Asp Tyr Thr Cys Tyr
Ala Gln Asn Thr 2075 2080 2085Leu Gly
Lys Asp Glu Met Lys Val His Leu Thr Val Ile Thr Ala 2090
2095 2100Ala Pro Arg Ile Arg Gln Ser Asn Lys Thr
Asn Lys Arg Ile Lys 2105 2110 2115Ala
Gly Asp Thr Ala Val Leu Asp Cys Glu Val Thr Gly Asp Pro 2120
2125 2130Lys Pro Lys Ile Phe Trp Leu Leu Pro
Ser Asn Asp Met Ile Ser 2135 2140
2145Phe Ser Ile Asp Arg Tyr Thr Phe His Ala Asn Gly Ser Leu Thr
2150 2155 2160Ile Asn Lys Val Lys Leu
Leu Asp Ser Gly Glu Tyr Val Cys Val 2165 2170
2175Ala Arg Asn Pro Ser Gly Asp Asp Thr Lys Met Tyr Lys Leu
Asp 2180 2185 2190Val Val Ser Lys Pro
Pro Leu Ile Asn Gly Leu Tyr Thr Asn Arg 2195 2200
2205Thr Val Ile Lys Ala Thr Ala Val Arg His Ser Lys Lys
His Phe 2210 2215 2220Asp Cys Arg Ala
Glu Gly Thr Pro Ser Pro Glu Val Met Trp Ile 2225
2230 2235Met Pro Asp Asn Ile Phe Leu Thr Ala Pro Tyr
Tyr Gly Ser Arg 2240 2245 2250Ile Thr
Val His Lys Asn Gly Thr Leu Glu Ile Arg Asn Val Arg 2255
2260 2265Leu Ser Asp Ser Ala Asp Phe Ile Cys Val
Ala Arg Asn Glu Gly 2270 2275 2280Gly
Glu Ser Val Leu Val Val Gln Leu Glu Val Leu Glu Met Leu 2285
2290 2295Arg Arg Pro Thr Phe Arg Asn Pro Phe
Asn Glu Lys Ile Val Ala 2300 2305
2310Gln Leu Gly Lys Ser Thr Ala Leu Asn Cys Ser Val Asp Gly Asn
2315 2320 2325Pro Pro Pro Glu Ile Ile
Trp Ile Leu Pro Asn Gly Thr Arg Phe 2330 2335
2340Ser Asn Gly Pro Gln Ser Tyr Gln Tyr Leu Ile Ala Ser Asn
Gly 2345 2350 2355Ser Phe Ile Ile Ser
Lys Thr Thr Arg Glu Asp Ala Gly Lys Tyr 2360 2365
2370Arg Cys Ala Ala Arg Asn Lys Val Gly Tyr Ile Glu Lys
Leu Val 2375 2380 2385Ile Leu Glu Ile
Gly Gln Lys Pro Val Ile Leu Thr Tyr Ala Pro 2390
2395 2400Gly Thr Val Lys Gly Ile Ser Gly Glu Ser Leu
Ser Leu His Cys 2405 2410 2415Val Ser
Asp Gly Ile Pro Lys Pro Asn Ile Lys Trp Thr Met Pro 2420
2425 2430Ser Gly Tyr Val Val Asp Arg Pro Gln Ile
Asn Gly Lys Tyr Ile 2435 2440 2445Leu
His Asp Asn Gly Thr Leu Val Ile Lys Glu Ala Thr Ala Tyr 2450
2455 2460Asp Arg Gly Asn Tyr Ile Cys Lys Ala
Gln Asn Ser Val Gly His 2465 2470
2475Thr Leu Ile Thr Val Pro Val Met Ile Val Ala Tyr Pro Pro Arg
2480 2485 2490Ile Thr Asn Arg Pro Pro
Arg Ser Ile Val Thr Arg Thr Gly Ala 2495 2500
2505Ala Phe Gln Leu His Cys Val Ala Leu Gly Val Pro Lys Pro
Glu 2510 2515 2520Ile Thr Trp Glu Met
Pro Asp His Ser Leu Leu Ser Thr Ala Ser 2525 2530
2535Lys Glu Arg Thr His Gly Ser Glu Gln Leu His Leu Gln
Gly Thr 2540 2545 2550Leu Val Ile Gln
Asn Pro Gln Thr Ser Asp Ser Gly Ile Tyr Lys 2555
2560 2565Cys Thr Ala Lys Asn Pro Leu Gly Ser Asp Tyr
Ala Ala Thr Tyr 2570 2575 2580Ile Gln
Val Ile 2585
SEQ ID NO: 17; length: 5551; type: DNA; organism: Mus musculus
Sequence 17:
tctagaagta aaatgatcct gagtagcgat
cctgggaaaa tacgtactct aacacactgc 60aatcatctct ctgtggtttg ctggagctga
ggtctggaag gctcgacctt ggttagaaat 120aacctaccga atacagagct atgacgttag
tctggaagga gctttggaag aatgacaagc 180tgtagctgcc cagaacatac tagatgccat
atttccaagg caagtgtcca catgcggaca 240tcttaagaat atggttgtct ctgcagtgct
aaggaccttg ttcgtgccac acaggtctcc 300agggttagtg ctaactctga ctgcttgact
ctttaattct accttgatca ttaatgacta 360gaaatcactt ggtgattagc aactggatat
ggaatattac taatttgtac ccaagccagg 420ccacctcagc tttggcagct ccattcattc
tgtggagccc agtcacgtgg gtttgaatca 480actgtactgt ttctacttac aagacgcatt
acctgagatg agtcattttt cttcacaagt 540ctttttagaa gagtcaatta gacatattct
gatgaagtaa gcatataaag tgagagcagc 600atgaatgtgt tccatgtatg ctcatggatg
ctattataat gtggaaataa actgacttta 660aaaaaaaaag cttatgatac ttgtcacaga
gtaaatcttc cataaatatc atctgcattt 720ataaattatt ttcataatcc atcaattaaa
aacctttaga aattttgtta acacaaagat 780ccctaggccc ctgccctagg atggtctgta
tggtgggcct gagagatgga gcttaagaac 840ttacttgctc caggagcaca tcttcagaac
atctgcctca aaacatttat cccaaatgct 900catcaaaggc tcactcacat gtgcttcaac
cacagggatt aaacagtcat tttagtcaca 960tttctcaaac ggtggaagcc tgctagagga
acaggatgta tcaggataac atccaacctt 1020acaaaaggat gtcataaccc tcaccacaac
aaacaacaac gacaacaaac ccataaaaat 1080tatcacggca aatgaactaa gccatatgca
gaaaaagtat tatatgttct cattgtgggg 1140tgtttttcct taatagtcaa atatgcagaa
tatagacaaa gatggtttat gcaagtgggg 1200atggcgaagg atacttgtag attagaggac
acaaagcaac aactacagag tgaagtaatc 1260cagagactta atgtataata tgaggactgt
atttaataat tctatttaag atacacagca 1320aacgagtgta tcttactaac acacacactt
acatagagag aataaagtga tagatacgtt 1380tgttttatct tcatgtagct gataatttca
tattgtacac ctcaaacata gataaccaac 1440aaagaggaag aggataggtg cctctcccag
ggcggaagag tacattcgaa agtcagacac 1500cattgtgtag atgtaccaca tggaggagct
agagaaagta gccaaggagc taaagggatc 1560tgcaacccta taggtggaac aacattatga
gctaaccagt accccggagc tcttgactct 1620agctgcatat atatcaaaag atggcctaat
cggccatcac tggaaagaga ggcccattgg 1680acttgcaaac tttatatgcc ccagtacagg
ggaataccag ggccaaaaag ggggagtggg 1740tgggcagggg agtgggggtg ggtggatatg
ggggactttt ggtatagcat tggaaatgta 1800aatgagttaa atacctaata aaaaatggaa
aaaaaaaaaa aaaaaaaaaa aaaggaaggt 1860cagacacctc acttcactgc tatctcaact
tgcaaacaga aggggagtca caaacccagg 1920acaaaccaca gtgattgaag cgtctttgaa
tgttattgct gttgttgtta ccaccatcat 1980tagcatatat tcattgtgaa aacttacggg
gtctatgaca tgttttttta ttcaagtata 2040tcacatgctg tcagcatatt tggcaccact
accagcccca gccccctttg ccccgccccc 2100aacacacaca cacacacaca cacacacaca
cacacacaca cacacacaca cacacacacc 2160tttaccttct cctgggcatc atctgctcac
tcacccaccc aagcttaatc cttttccttc 2220cctgcaatag tacctctcct atttttatgt
ctaggttccc cctccccctg ttaggagatg 2280ggagaggtca cgaaaggaaa gaatttgtag
cccctgagcc agcccgggcc acagagcctg 2340ccaccagaca ggaaaagccc agggcttacc
agcacaggag gagcaaactc gcaggcgagc 2400ctgggttggc gctggtggtc ccgggtcgat
ggcccgccca ttcccagaag ccgaggctat 2460agctgcgtca cctgccccgc cctcctcccg
agtgaagacc cctagaggct gagcagaccc 2520caaaggcggt gcaattccat tggcccaagg
cagaggtgag cggctgctaa tcccctcggg 2580aagtgaaggg acccagagag tctggtagat
gtgggagctg gggttcaggg cgagacagag 2640ggtgggatgg gcagaagggt ccaggaaaag
gaaagtactg gaggggagtt gggacaaaag 2700cagcgaccaa gggaacatcg cttcagtgac
tgaagccagg caaaaggagc gggaaggatt 2760atatgtagcc tgggacgctt tcataaacac
tgatgacgtg tttgtgcaaa gcaagcaatt 2820tgaggagaaa cgcctgggac gtcggaaaga
aggagtgatc gattagtact tgtaagttta 2880ggtgagtttg agaactaact aacctatact
attgagggag aaggaagagc attccagcag 2940cagcagcagc agcagcaatc agataaagga
aagctttggt tagtttggaa atgtatgata 3000ccattaaaat aacagaagcg cctccagttc
tctgaagagt cagtccccca gctagtgaag 3060actaagccta ctaagccttt tgctcccgtt
ggaagcaaag aacgttcctt caatcaggtg 3120aaggctctcc tcagaagatt tcctgtctct
gcttatgtta caagaggatt caaaagcaag 3180acagaagagc tcaggtattg ccaactcttt
tgttaaatac agtttgaggc ttaagtgtac 3240gggaactcat gtggtattca tttacggctc
tcttctctta taactaactc ttaaggtgca 3300tatagtctct tctgtttccc agctaccttg
taccatcttt gtttatctaa taatagcaag 3360ctcatctgct ttttaatcat cacgcagaga
gtattcaaaa atattcagtg atgtaacagt 3420gacagtgtag gcatagaagt aatcattagt
aaatcttaat ttgggttaaa ctcattcata 3480acagctccag gttgggaggg atcactgagc
cttcgccacg tgcgggttaa agatattttc 3540taacaagaga agcagaattc ttccttggcc
atgctcccca tcactgtgtc agtaagcaga 3600ggggtgtttc caagcagaga aagagcagac
agtgttatgc ctgcaaagtc agagactcag 3660ccctcccagc tggtcagttt actgtcctcc
cggtcattag ttggctctga aaaggcccat 3720gtgtccttat tggcaaggac ttgcagacat
gctagaaaga aatttgacct ttttttctag 3780tgggttatta cagctgtaaa agtattttgg
aaggttaagc caaataaata aaacacatat 3840taaataatac aatgttacaa aaattgatca
tataaagaag tacattcata aatgcaatgt 3900gaaaaatata tataattttt atctatttac
tggtgcaaag ttttctaaat tgcacatgta 3960ctatttttat atttataaaa atatttttaa
aatgtatata aaagtgtaaa aggctcttgg 4020tcaaacaaga gagttaaatt tacaaacttt
aattgtcccg ataacattat tatgatctct 4080aatgacaggg atcctgcttt tcattgggaa
atgagaagct atgaagatat gtttacaata 4140ataagcccat ttagtgataa agtccaatgg
gaagctagca cacactggtt tataaagaga 4200acagtttcct gagtctatgc aagtttacac
tctagggaat aagagttcct ctttctccag 4260atttcactag catttgttgt catcatttat
cttcttgatg atgagcatta taagtggaat 4320aagataggat ctcaaaggaa tgtcaatttg
gatgccctga acaatctttc aggtctttct 4380ttcagttcac tagtctattc atttattgga
taattggggg atggtgttaa tttttttgca 4440gttcttatgg aattccaaaa aacaaaaaac
aaacaaacaa acaaaaaacc tctgaaacta 4500gaactaccaa tccattactg ggtatgtaac
aaagagaaat ctgcacagaa tttattgcta 4560cattgttcat tattcacgac agccaagaat
gtggaaccaa cttacgtagc cgtcaaaata 4620tgaacggata aagaaaatgt ggaaatgtgt
acaacagagt cccatgtggc cataaaagag 4680tgaaatcatg acatatgcag gaaatggatg
caactggaaa tcaattgggc taatcaaaac 4740aagacagact caaaaaggaa acaccgtgta
gcttctctga caaacagaag ctagatttac 4800acttgtacgt gcgcatgtgt gtttagaatt
ttatttagtt atacactatt ctaatctgtg 4860agtgtgtata aaggcatgca tgtaaagcaa
aaacaagcta gctggggtgg gtaggagaga 4920aagcaatgag aggagttaat aagaacgaag
catagtaaca taggtgccag gatgaaatgc 4980attaatttgt atgctaacta aaccacagac
aggaggcaca cgttcaaacc agggtgaaat 5040cccagcacag agaaggggaa gtagacacaa
agtttcgcca ctaaccaaga agccatttgc 5100agttgctgcc tgctgggagg ggcgttccag
ttttctccag tctgacactg tgtataacaa 5160ccagttgaca atacaaagtt ggcatgatgg
atggtttttg tgctattttt catttttttt 5220cttactgttt tgttgttgtg gtggttgttg
tggtggtggc tgtggttttc atttgtttct 5280tttgagagag agaaggaaca tgaaattggg
tgggtaggaa gctggaaacg atctggaaga 5340agttggggaa agagaaaaat tgtatggagc
atatttaaac aaacaaacaa acaaacaaaa 5400ggttcatttt gccacaaaaa ggtgtgaatt
aaattaacca gttacgactc ttaaagaaaa 5460tattcccaat tattcccaga gttgctatgt
atgctgtgcc taggactttg cttgaactgg 5520ccctataact ctggtgtggt gtcttttcag g
5551
18 4610 DNA Mus musculus
18 cacagacctt
cctcttctaa cctctctccc ccatcttgtt gcttcatccc agacttcaac 60accagcaagc
acactctgct aatgcaaggg ctgctcctgt caggacaaca aggaggctga 120aggcagaccc
acacgtttcc aactgctcct gagagtcaat ccccctagac tcatctatag 180caggaaacct
gctgtgatct ccatttcttc tctgaccaca tccccaagtt atcacaagga 240gtttttcctc
aaacctttcc tctccagcaa accccttcag ctccttgggt actttctcta 300gccccttcat
tgggaaccct gtgctccatc caatggatgg ctgtgagcat ccacttctgt 360atagaatctt
ggtcagtgca gtcttttgta tcctcaagaa cactgggtct gaaaatttta 420acccaaagaa
ctgttttttg ttatgattgc tgcaatctct ttcaattcca ataaagagta 480agcatctcat
tcctttgtct cctcctttca gtaccaccct gcctttgctg cctttctcaa 540agaatcaata
aaaccaaagt gatatagatt catggcattc ctctaactgc tacatccact 600ccagtagtat
ctcacttggc aggtgtaaaa gcctggaagc agtcacgagg cagtttcaca 660gaaacttagc
ctcctggaac cttggcattc ccatagctag aatgccccag atttgtccct 720gagatattgt
ggtgggtctt gcatgctttc ttgcagtatt ttactggata agagttagaa 780atctcagggc
gagcttagca aaagtatacc tagaatcttc atgacagtca ggtattgcaa 840actacattgc
atattagaag aaagttggta aattcttctg acaaatggag attccctaca 900gataacttaa
aagaacagct aagtcacact catatgcaag aatttaccaa ggcctaggaa 960aggggggggg
ggtactgctt tattcatgat aaggtctgct agagcagaac cccctggtgc 1020tagctttcac
aaggttcaaa ggtgtagcat aaattgtgac tagagtgtga aatctttacc 1080tgtcattagc
tgactctagg cagagctgtt ttatctttac tgtaaacatt acctggttcc 1140tgtcagtcct
ttgaaggcat tcctctgttt tgtgacagat acttctatgt acctcgcctg 1200ctgtgacacc
ctactccttt gttttctgta ttatataagc ctggtgttcc ctttgtgaaa 1260aattacatcc
agatacagca ctcccttgtg tctgtgtcct tttgtcattt ctggccaact 1320ccatgcccac
ctgccagaac ccctagtctt ttccacagat tgagggaggc cgactgagcc 1380tggtccatgg
catctaacca ctgtcagctc actgttggtg actacctcaa ggtacaagct 1440ccattactaa
tgaaacaaaa ttagataagt gtgggtccag gaagcaggtt gtacaccctg 1500tctgaatgaa
cattatgaaa tgactgaaat aagttaaccc atctcttcct cgtttgctaa 1560tatagcaaat
aaaccgagtt tctgagctgc tgctggtgtg tctccatcag agggcagagc 1620cagtctgatc
ctagctttcc tgtatgtgtg tccattgttt cttcagttcc tgttgcccca 1680ttaggaaatc
ctaagccatg aaagccatga atctgggaat gacttttcta agaaatgcca 1740cgtgaacctt
gcgtttcaac gttttgcctg taaacaagat atatggtgcg cagtttataa 1800tcataataag
ctttgaaata atatataact ccattctcat tctgcttcca cgctgagcat 1860cctgtttccc
cagggaccac aagagcattt gaaaagtagt gatttatgac ctgctttgtt 1920ctgttactat
aaaagcttca tgaaagggca gccatgttga aacatggaac ttggggtgac 1980ctgtatctgt
gttcctgggt cgtgctcact catatttgtc tccagaataa atgagtttat 2040caacttcgag
gaaaaagttg tgtgtttgta tagcacgccc gtggagtccc accattctac 2100ttcctgtaat
ctgtatatgg tagaaaaagt taatttatgt gattcttcca actccaaata 2160tttcaaatct
tttagcccct cagcctggga tttctttgac taagtctatt gatttggaag 2220atctcagtgg
ttaggatttg cagtcatgat gttcatacgt caggctaagc tgaaaaatat 2280gacaaatgaa
atgtcaaatg tcatgtgcct gggaatgtga gtgttagggg gttttaaaga 2340aacaaatacc
tactctaaat agttaataag tcccatggtt ctattctagt tttgaataat 2400gttccctagt
atacagcaat ttaatttgaa atgaatagct tcttatcttg accaatctca 2460gtgacttcat
ccgtcccaag tcatgttttc atattcataa ggataggtct cattcaacca 2520catgtttatc
atttgggatc tgcatttttc tgatgcaaaa tgatttattc ttccagagca 2580ctggaattgg
gttgaatcat cttataacgg ccaaaactaa atgcttctgt gctaaacaga 2640gagttacaag
acctttttat gtggatggca gcattttagt catccttatg acagaatgtc 2700agagtggagc
tcccactggg ggaggggctg gtccttggca ggattctctt ggaacatcac 2760acaaagaaat
tccaaattat gaaatgcaca tgatccatcc agaatgtgac ttttgactct 2820tgaacatgag
cttttaaagt acgtttggct gttcagacct tgactttgag gtgaaggaaa 2880gctcgccaac
tcctttttat atgtaacaca atatatcaag atctaatgtg agacagtatg 2940ccagtcccaa
gatctgtcaa tatgactgaa gacacattgc gatgttatca ctaaggcagg 3000agaaggcaag
ctacagtgaa gcccagttca ctataaagct ttatgagaaa ttagataaga 3060agggtttcta
atttttaaat tttttttatt agatattttc ttcatttaca tttcaaatgc 3120tatcccaaaa
gtcccctata ctcccccctc accctgctcc cctacccacc cactcccatt 3180tcttggccct
ggttacttgt gatagtggtc atatgatcca ccaagcttta catgctcact 3240atctggtcta
ttgcaagaat ggctgccgag ctgatgcagt cagatacaga cacctacagc 3300caaacagtgg
aaggaacttg gggactctta tggaagaaaa ggaggaaggg ttatgggccc 3360cggatgggga
aaggaactcc acaggaagac caacatactt ggtcaactaa cctggaccct 3420tggggctctc
agagtctgaa ccaccaacca tagaacattc atgggctgta cccaggcctc 3480tccactcata
tgtaacagat atgtggcttg gccttcatct gggtcctgaa caactagatg 3540ggggttaggg
gtggggatgg gggttatctc aaaagctgtt gcctgtatgt gggatatgtt 3600cttactgagc
tgcctagtct ggcctcagtg ggagaggaag cacctagctt tagccttgta 3660aagacttgaa
gttctgaggt gtcggtggag ggtatactca gggaggccct cacctgctag 3720gaagagaaga
ggagggggaa gacttggggg agggggcagt gagcaggttg gtaaaatgaa 3780taagaaaaaa
aaaataaaaa taattaaaaa aaaaaaagaa tggctgctga gccctactct 3840aaaaccattg
catccccccc ccccaatcat tcagtgacta cgaattaaaa tcattgatac 3900taacaataga
tgtaggaaac tattgttaac ttctttgtga ccacgagtgg tatttggaac 3960cttttttatt
gaagctttca cacagagcct tgttctttca tttccctgta catgcatgta 4020gcttaatgat
gttcagtgaa ttaaaaataa caatgaagaa taaagacaac tgtattttaa 4080ggattcttcg
tatatattta aaaatctaag gtggtcacct ggaagaaatg tcttcagttt 4140ttctatatat
gtttactcta tcgtatgtta attaattata tgcaataatt cataaaatct 4200acaacatagt
atgtaactta taagaaagta aaacattcat gaaattgtga aggttacttt 4260tccttaccct
cagaaacact gggtttgaat aattcttatt ttggtatcag tgaagaattt 4320gaaagaatgt
aataacctac taaggcaaac atagaagttg aaattaaaaa gagtagacag 4380gagaagtaat
aaggcaaata atgaatattt gctttaaata gttcttaatg tatcatctaa 4440ctagggtgtg
attctccaga cttgactcca tccaaaatat ccaaaatgac tctaaccaca 4500gtcattgaaa
caatgtgttg aaaataataa acatttccta cttgaaaatt cagatttctc 4560ctactttgct
ttttattgct gtgataagca ccatgaccaa agcagcttat
4610
19 424 DNA Artificial Sequence primer
19 taagcctttt gctcccgttg gaagcaaaga
acgttccttc aatcaggtga aggctctcct 60cagaagattt catgtctcag cttatgttac
aagaggattc aaaagcaaga cagaagagct 120caggtatagc caactctttt gttaaataca
gtatgaggct taagtgtacg gcaactcatg 180tggtattcat ttgcggctct cttctcttat
aactaactct taaggtgcat atagtctctt 240ctgtttccca gctaccttgc accatctttg
tttatctaat aatagcaagc tcatctgctt 300tttaatcatc acgcagagag tattcaaaaa
tattcagtga tgtaacagtg acagtgtagg 360catagaagta atcattagta aatcttaata
tgggttaaac tcattcataa cagctccagg 420ttgg
424
20 11962 DNA Mus musculus
misc_feature (1)..(11962) 'n' can be any nucleotide 'a', 'c', 'g' or 't'.
20 tttggaacca acccagatgc ccctcaacag agaaatgggc cagaaaatgt
ggtccattta 60tccaatggaa tactactcaa cttattaaaa acaacgactt tcataaaatt
tttaggcaaa 120tgnatggtct gnaggatctt gagtgaggta acccaatcac aaaagaacac
tcatggtatg 180cactcactga taagtggcta tttgtctatg gagtgattta aaagggaaga
agacacatag 240ctttttgtgt gtataatatt aagatggaaa tttgccagtg ctgtttggct
tatgagtgaa 300tcttgtttca gtggattacc ggaagaaaat aataagtgaa ctgtaggaag
aagtagttaa 360tcaaggtgac aaagtatcct gacacattgg gaaaagacca cagtccagga
aactgagtct 420taaggattca tattaactcc agttccccat gtgcagctct gagactttgg
cagatcagac 480acttaacttc accagcttcc tacacagagc agttactatc cttgccttca
cacatggagt 540gtgccattaa gtgcctgaac atgagtctga cttgttaata atctttaaaa
tccaattgtg 600tgtaaagtat gtgaccaaag agcatggtca tgctattaac ctttgatgtt
ctatggactc 660ttaattttat ggtagaaatg tcaacaagct tgtggaggct ggaagataca
aggcttaaga 720ggatggcctt tcagttttga aagtaattca gtatgtgttc tggcatccct
tttcctaaag 780caatttaacc ccccaagtag gcataatttt aatgcttact tcatcagaat
atatctaatt 840gactcttcta aaaagacttt ggtatgcata ggatctaaat gtaaatgtga
tttactgaca 900taataaatag gagaaactga gctagaatag gtataaaata tgtgctggct
ttctaatagg 960tcttataggt tatataagag gtgggaaagg aatatttgaa acatctagaa
gtaaaatgat 1020cctgagtagc gatcctggga aaatacgtac tctaacacac tgcaatcatc
tctctgtggt 1080ttgctggagc tgaggtctgg aaggctcgac cttggttaga aataacctac
cgaatacaga 1140gctatgacgt tagtctggaa ggagctttgg aagaatgaca agctgtagct
gcccagaaca 1200tactagatgc catatttcca aggcaagtgt ccacatgcgg acatcttaag
aatatggttg 1260tctctgcagt gctaaggacc ttgttcgtgc cacacaggtc tccagggtta
gtgctaactc 1320tgactgcttg actctttaat tctcccttga tcattaatga ctagaaatca
cttggtgatt 1380agcaactgga tatggaatat tacttaattt gtacccaagc caggccacct
cagctttggc 1440agctccattc attctgtgga gcccagtcac gtgggtttga atcaactgta
ctgtttctac 1500ttacaagacg cattacctga gatgagtcat ttttcttcac aagtcttttt
agaagagtca 1560attagacata ttctgatgaa gtaagcatat aaagtgagag cagcatgaat
gtgttccatg 1620tatgctcatg gatgctatta taatgtggaa ataaactgac tttaaaaaaa
aaagcttatg 1680atacttgtca cagagtaaat cttccataaa tatcatctgc atttataaat
tattttcata 1740atccatcaat taaaaacctt tagaaatttt gttaacacaa agatccctag
gcccctgccc 1800taggatggtc tgtatggtgg gcctgagaga tggagcttaa gaacttactt
gctccaggag 1860cacatcttca gaacatctgc ctcaaaacat ttatcccaaa tgctcatcaa
aggctcactc 1920acatgtgctt caaccacagg gattaaacag tcattttagt cacatttctc
aaacggtgga 1980agcctgctag aggaacagga tgtatcagga taacatccaa ccttacaaaa
ggatgtcata 2040accctcacca caacaaacaa caacgacaac aaacccataa aaattatcac
ggcaaatgaa 2100ctaagccata tgcagaaaaa gtattatatg ttctcattgt ggggtgtttt
tccttaatag 2160tcaaatatgc agaatataga caaagatggt ttatgcaagt ggggatggcg
aaggatactt 2220gtagattaga ggacacaaag caacaactac agagtgaagt aatccagaga
cttaatgtat 2280aatatgagga ctgtatttaa taattctatt taagatacac agcaaacgag
tgtatcttac 2340taacacacac acttacatag agagaataaa gtgatagata cgtttgtttt
atcttcatgt 2400agctgataat ttcatattgt acacctcaaa catagataac caacaaagag
gaagaggata 2460ggtgcctctc ccagggcgga agagtacatt cgaaagtcag acaccattgt
gtagatgtac 2520cacatggagg agctagagaa agtagccaag gagctaaagg gatctgcaac
cctataggtg 2580gaacaacatt atgagctaac cagtaccccg gagctcttga ctctagctgc
atatatatca 2640aaagatggcc taatcggcca tcactggaaa gagaggccat tggacttgca
aactttatat 2700gccccagtac aggggaatac cagggccaaa aagggggagt gggtgggcag
gggagtgggg 2760gtgggtggat atgggggact tttggtatag cattggaaat gtaaatgagt
taaataccta 2820ataaaaaatg gaaaaaaaaa aaaaaaaaaa aaaaaaggaa ggtcagacac
ctcacttcac 2880tgctatctca acttgcaaac agaaggggag tcacaaaccc aggacaaacc
acagtgattg 2940aagcgtcttt gaatgttatt gctgttgttg ttaccaccat cattagcata
tattcattgt 3000gaaaacttac ggggtctatg acatgttttt ttattcaagt atatcacatg
ctgtcagcat 3060atttggcacc actaccagcc ccagccccct ttgccccgcc cccaacacac
acacacacac 3120acacacacac acacacacac acacacacac acacacacac acctttacct
tctcctgggc 3180atcatctgct cactcaccca cccaagctta atccttttcc ttccctgcaa
tagtacctct 3240cctattttta tgtctaggtt ccccctcccc ctgttaggag atgggagagg
tcacgaaaga 3300aaggaatttg tagcccttga gccagcccgg gccacagagc ctgccaccag
acaggaaaag 3360cccagggctt accagcacag gaggagcaaa ctcgcaggcg agcctgggtt
ggcgctggtg 3420gtcccgggtc gatggcccgc ccattcccag aagccgaggc tatagctgcg
tcacctgccc 3480cgccctcctc ccgagtgaag acccctagag gctgagcaga ccccaaaggc
ggtgcaattc 3540cattggccca aggcagaggt gagcggctgc taatcccctc gggaagtgaa
gggacccaga 3600gagtctggta gatgtgggag ctggggttca gggcgagaca gagggtggga
tgggcagaag 3660ggtccaggaa aggaaagtac tggaggggag ttgggacaaa agcagcgacc
aagggaacat 3720cgcttcagtg actgaagcca ggcaaaagga gcgggaagga ttatatgtag
cctgggacgc 3780tttcataaac actgatgacg tgtttgtgca aagcaagcaa tttgaggaga
aacgcctggg 3840acgtcggaaa gaaggagtga tcgattagta cttgtaagtt taggtgagtt
tgagaactaa 3900ctaacctata ctattgaggg agaaggaaga gcattccagc agcagcagca
gcagcagcaa 3960tcagataaag gaaagctttg gttagtttgg aaatgtatga taccattaaa
ataacagaag 4020cgcctccagt tctctgaaga gtcagtcccc cagctagtga agactaagcc
tactaagcct 4080tttgctcccg ttggaagcaa agaacgttcc ttcaatcagg tgaaggctct
cctcagaaga 4140tttcctgtct ctgcttatgt tacaagagga ttcaaaagca agacagaaga
gctcaggtat 4200tgccaactct tttgttaaat acagtttgag gcttaagtgt acgggaactc
atgtggtatt 4260catttacggc tctcttctct tataactaac tcttaaggtg catatagtct
cttctgtttc 4320ccagctacct tgtaccatct ttgtttatct aataatagca agctcatctg
ctttttaatc 4380atcacgcaga gagtattcaa aaatattcag tgatgtaaca gtgacagtgt
aggcatagaa 4440gtaatcatta gtaaatctta atttgggtta aactcattca taacagctcc
aggttgggag 4500ggatcactga gccttcgcca cgtgcgggtt aaagatattt tctaacaaga
gaagcagaat 4560tcttccttgg ccatgctccc catcactgtg tcagtaagca gaggggtgtt
tccaagcaga 4620gaaagagcag acagtgttat gcctgcaaag tcagagactc agccctccca
gctggtcagt 4680ttactgtcct cccggtcatt agttggctct gaaaaggccc atgtgtcctt
attggcaagg 4740acttgcagac atgctagaaa gaaatttgac ctttttttct agtgggttat
tacagctgta 4800aaagtatttt ggaaggttaa gccaaataaa taaaacacat attaaataat
acaatgttac 4860aaaaattgat catataaaga agtacattca taaatgcaat gtgaaaaata
tatataattt 4920ttatctattt actggtgcaa agttttctaa attgcacatg tactattttt
atatttataa 4980aaatattttt aaaatgtata taaaagtgta aaaggctctt ggtcaaacaa
gagagttaaa 5040tttacaaact ttaattgtcc cgataacatt attatgatct ctaatgacag
ggatcctgct 5100tttcattggg aaatgagaag ctatgaagat atgtttacaa taataagccc
atttagtgat 5160aaagtccaat gggaagctag cacacactgg tttataaaga gaacagtttc
ctgagtctat 5220gcaagtttac actctaggga ataagagttc ctctttctcc agatttcact
agcatttgtt 5280gtcatcattt atcttcttga tgatgagcat tataagtgga ataagatagg
atctcaaagg 5340aatgtcaatt tggatgccct gaacaatctt tcaggtcttt ctttcagttc
actagtctat 5400tcatttattg gataattggg gggatggtgg taattttttt gcagttctta
tggaattcca 5460aaaaacaaaa aacaaaccaa ccaaccaaaa acctctgaaa ctagaactac
caatccatta 5520ctgggtatgt aacaaagaga aatctgcaca gaatttattg ctacattgtt
cattattcac 5580gacagccaag aatgtggaac caacttacgt agccgtcaaa atatgaacgg
ataaagaaaa 5640tgtggaaatg tgtacaacag agtcccatgt ggccataaaa gagtgaaatc
atgacatatg 5700caggaaatgg atgcaactgg aaatcaattg ggctaatcaa aacaagacag
actcaaaaag 5760gaaacaccgt gtagcttctc tgacaaacag aagctagatt tacacttgta
cgtgcgcatg 5820tgtgtttaga attttattta gttatacact attctaatct gtgagtgtgt
ataaaggcat 5880gcatgtaaag caaaaacaag ctagctgggg tgggtaggag agaaagcaat
gagaggagtt 5940aataagaacg aagcatagta acataggtgc caggatgaaa tgcattaatt
tgtatgctaa 6000ctaaaccaca gacaggaggc acacgttcaa accagggtga aatcccagca
cagagaaggg 6060gaagtagaca caaagtttcg ccactaacca agaagccatt tgcagttgct
gcctgctggg 6120aaggggcgtt ccagttttct ccagtctgac actgtgtata acaaccagtt
gacaatacaa 6180agttggcatg atggatggtt tttgtgctat ttttcatttt ttttcttact
gttttgttgt 6240tgtggtggtt gttgtggtgg tggctgtggt tttcatttgt ttcttttgag
agagagaagg 6300aacatgaaat tgggtgggta ggaagctgga aacgatctgg aagaagttgg
ggaaagagaa 6360aaattgtatg gagcatattt aaacaaacaa acaaacaaac aaaaggttca
ttttgccaca 6420aaaaggtgtg aattaaatta accagttacg actcttaaag aaaatattcc
caattattcc 6480cagagttgct atgtatgctg tgcctaggac tttgcttgaa ctggccctat
aactctggtg 6540tggtgtcttt tcaggatgca gaagagaggc agggaagtca gctgcttgct
gatctccctc 6600actgccatct gcctggtggt cacccctggg agcagggtct gtcctcgccg
atgtgcctgc 6660tatgtgccca cagaggtgca ctgtacattt cgggacctga cctccatccc
agacgggcat 6720cccagccaat gtggaacgag tcaatttagg gtgtgtggac cttgcctgat
ctccttctca 6780gagagggacc actgattttc ctggtacttt gccccccaaa cacctgtgat
tacttttaat 6840agttttcttc taaaatgggt tcatacaaac cttatattgt ggagacaatg
aacattttat 6900cccaatagtc ttttactaga acttgaagcc cctcttagtt gtttgggagc
ctcataatta 6960tggggcagct ttattctgaa tgaattttaa atgaaaaaga tacagtttct
gttaacaatc 7020attatgatac caaggaagag gaattgtcat tgaatatttt aaaaaagcat
ttcttttgca 7080atttataaat acccattaca aaatggctta cttaaaatac ttgccttact
aaatctgaca 7140aattatggtg atattttgaa ggtttatgaa aatttgttta tgtgtataaa
tgcacaagaa 7200atgggatatg ccatcaccta tgtgccatta gtgagcatgt acagtatgcc
aaacactatt 7260gttcacgttt ggaggaagta atgggggtgg gggagcaaca agggttataa
ccgtataccc 7320agtgccttgg aagcgattgc aaacagtaaa gactgacatt gtgttctccc
tatgagggag 7380gggccttggg ctgagcactt tgcaatgagc atttgctcat tgtgctggca
ggttttatga 7440taacttgacc caagctagag tcactggaga ggaaggaact tcaactgaga
acatgcctga 7500agaagatcag attataggca ggcctgtggg gcattttctt aattagtgat
tcatggggca 7560gggcccagtc cattgttcgt ggtaccattt ctcaggcact attaaaaaaa
aaaaaacagg 7620ctgagcaagt gtcaaggagc aagtcagtga gcagcagccc taatgatctc
tgcatcagct 7680cctgcctcca ggttcctacc ctatttgagt tcctgtccta gctccctaca
gtgatgaaca 7740atgatgtgga agtataagcc aaataaatcc tttcttcccc aacttgctgt
tggtcatgat 7800gtttcatcac agtgataata gtcctcatga agatgctggt gtttataaca
cctttggact 7860aaattctgtt atctatagct gaggaaaatg gagcatagaa agtctccaga
ctacaccaga 7920gtgtaatctg ggcctgagct tagaatcaca cccacgtgca ctccactgcc
ggggcttctt 7980aaccggaaca cagttgtaaa agggaatttt ctgtttgttt ccattttgac
atgtggactt 8040taattgacga ttcatctgaa gctgaaaatg attttttttc caggtataac
agcctcacta 8100gattgacaga aaatgacttt tctggcctga gcagactgga gttactcatg
ctgcacagca 8160atggcattca cagagtcagt gacaagacct tctcgggctt gcagtccttg
caggtgagat 8220aggtagaggg tgatggaggc tgagaagaga ggtgcaactg tgggttatac
ccaaaagctg 8280ctgattcccg tgggagacat tctataagca ttctataaac tagaggcaga
tatcaaggaa 8340ggatttcaat tgtaatgcaa ttttatgaga aaatttgaat attaagaaaa
tgctggggaa 8400aatgcttaca caattgcgag gacctaattt aggatctcca atagccacat
aaaaagcaca 8460gcatggcggc agacacctgc aattcctgtc cctggaagca cctgttcaga
atcccagaga 8520ctcattggcc aaacactcta ttcaatcaat gaagtccata ttcagtgaca
aaacttgact 8580cagaaactaa tgtggaaagc atcaggaaga cagccaacat ctggtctcta
ctcatgcatg 8640aataagggat cccagagaga agggaagaaa aaggaaggaa ggaaggaagg
aaggaaggaa 8700ggaaggaagg aaggaaggaa ggaagagagg gaggaaagga gggagggaag
gaaggaaagg 8760gaaaggaaaa aagagatggg gagggaggga aggaaaggaa agggggagaa
agaagagaag 8820aaaggaaaat aaataaattt tcagggatta ttacaccttt aaattttatc
cataaaaggt 8880catttccacc tgtttgtctg gaagtagagt gggatccctt atataagggc
agtctttaac 8940atagtagcat tttataaacc attacaaatt ttgagttttc tctacttttt
atcctctacc 9000atcttcaaac tgaaactaca attattccca caaatgaaga aaatgctgta
agagttttca 9060cacaccgaag tgggaaactt aaggattaga caagtctaac aatgagaatg
gggagaacaa 9120aaagagactg cacagggagc cctttctctg cttataatct tgacacttga
gaagctaatt 9180gacgctgcat gactactcaa ctctttaagc aaacaatgct gttgttcatg
aaaagcacaa 9240taaagtacat atgtcccata atattcatca aaatttgcat gcagcacata
atagcaatca 9300aagcaataac acccactgtt cacagagact ttaaacatga aactggaact
atgtctagtg 9360ttttgactta gggtacatag tatgctgtgt ctgtatgtac caatgttgat
ttaggtcatc 9420agacagcatt tggaacatgt atcttcagga ggaatcattc atgtatcctg
catgaaattc 9480tccacctatg tttattctct tagccaggtt tttctctgat ggagaaacat
tgggtttgag 9540gttttactcc caggtaacat ttagggaaaa gctgtctatg ttctcagttt
ggcttttatt 9600tatgagggat gttggtattc cagaaaattc tcttttgaag agattacaat
ttaggtcaaa 9660acagaaaaat atgtaaaaag ttattgtttt tattagtatt tcatgttctt
ttctttttta 9720aaaatggtat gcttagaact aattaagatt agattagatt agattagaaa
ataatcagag 9780agggatttga tgaatgctaa agcatcatga aaaattcaaa attttttgct
tctaattcag 9840aatcaattaa attcatatta ctataaaaga cagcacgcca gatgtgtgcc
agctgaggag 9900tggataaact gtgtaacgtg agtgctatgt agaaacagaa aggagtgaag
ggttgatgtg 9960cgctgcaaca tcttgaaaac attcggctac atgatggaag ccaggcacaa
aaagccacat 10020attgcatggt tatgtttata tgaaatgttt aaaatacatg gattcttagc
aaacagagta 10080agatgttact tagggtcagg aaaagattaa aaaaaaaaaa actattgatg
tggaatgatc 10140ttaatttggg gaaaagacaa tttcctaaga cgaaatagtt gaggtagata
tagttatatc 10200cctgtggata ttgtaataaa ccagcatgct gtgctctgag aagggcctaa
tgaaggggca 10260ggaggaagtg aaatgagatg gtagaaagga aagtcatata ccatggcttc
tctcgtgggt 10320ggaatctaga tatgttaata tattgacata aaggaaggaa ttgtttaggg
aaggatcaaa 10380accaacagga gtgagggaga caataggaac caatgagagg caaagttcat
ggtcaatgtg 10440tgtggagaca ccataataaa actccttttt tgtttgctaa ctaaaaccac
taaaatctaa 10500aaacaaaaca tttttgcaca agaattattt attattcaat aaagatgttt
aaatggggga 10560agttgaagtt cattgatagt ctcataaatc ttaaatgtat ttaaactgct
ttttacgttt 10620tttattatta attactcttg ctgtcattat tatcatcatc attatcgtca
tcatcatcac 10680taatgctttt caccatacac aaatgtaggc agaagagtgt aatccactta
gtgaggcaat 10740cttggagagg gaaaggaagc ggatgcgggg cagaggcaca caggaggaca
gtgagaggga 10800aatgaacaag aaaaaatgtg gacacatgca caaaaattcc atagtccact
acattacttt 10860gtattctaat attaagaaaa taataaaccc atttctgtgc acttatcacc
caggctcaac 10920agttatcttg gccacagatc ctgtctcact gcatcctgtc cacctgagtc
cacttagcgt 10980tctgaatcca atccagggca tgatgcttac tcctacacag aactaaagat
taaagagagt 11040ttaaaagtaa ccatgacatc tctctgttcc tttagcgata agttcttaat
atttatggct 11100gcttgtgtat gttctaattt ctctaatatt gtcacattta gttggcaact
actttgtttg 11160aattgagttg gagttaaggt cccataggat taatctcaac atatttctat
atttataaac 11220ttttctctct ttgtgaaagt tcctttgaga aaacaaatat gcccatatct
ttctttacag 11280gtcttaaaaa tgagctataa caaagtccaa ataattgaga aggatacttt
gtatggactc 11340aggagcttga cccggttgca cctggatcac aacaacattg agtttatcaa
ccccgaggcg 11400ttttacggac tcaccttgct ccgcttggta catctagaag gaaaccggct
gacaaagctc 11460catccagaca catttgtctc tttgagctat ctccagatat ttaaaacctc
cttcattaag 11520nacctgtact tgtatgataa cttcattgac ctccctccca aaagaaatgg
tctcctctat 11580gccaaaccta gaaagccttt acttgcatgg aaacccatgg acctgtgact
gccatttaaa 11640gtggttgtcc gagtggatgc agggaaaccc aggtaactat cttgtttgtt
tgtttctttt 11700tttatarkac gtattttcct caatttcatt tagaatgata tcccaaaagt
cccccataac 11760ctccccccca cttccctacc tacccattcc cattttttgg ccctggcatt
cccctgtact 11820ggggcatata aagtttgcgt gtccaatgga cctctctttc cagtgatggc
caactaggcc 11880atcttttgat acatatgcag ctagagtcaa gagctctggg gtactggtta
gttcataatg 11940ttgttgcacc tacagggttg aa
11962
21 2828 PRT Homo sapiens
21 Met Pro Lys Arg Ala His Trp Gly
Ala Leu Ser Val Val Leu Ile Leu1 5 10
15Leu Trp Gly His Pro Arg Val Ala Leu Ala Cys Pro His Pro
Cys Ala 20 25 30Cys Tyr Val
Pro Ser Glu Val His Cys Thr Phe Arg Ser Leu Ala Ser 35
40 45Val Pro Ala Gly Ile Ala Arg His Val Glu Arg
Ile Asn Leu Gly Phe 50 55 60Asn Ser
Ile Gln Ala Leu Ser Glu Thr Ser Phe Ala Gly Leu Thr Lys65
70 75 80Leu Glu Leu Leu Met Ile His
Gly Asn Glu Ile Pro Ser Ile Pro Asp 85 90
95Gly Ala Leu Arg Asp Leu Ser Ser Leu Gln Val Phe Lys
Phe Ser Tyr 100 105 110Asn Lys
Leu Arg Val Ile Thr Gly Gln Thr Leu Gln Gly Leu Ser Asn 115
120 125Leu Met Arg Leu His Ile Asp His Asn Lys
Ile Glu Phe Ile His Pro 130 135 140Gln
Ala Phe Asn Gly Leu Thr Ser Leu Arg Leu Leu His Leu Glu Gly145
150 155 160Asn Leu Leu His Gln Leu
His Pro Ser Thr Phe Ser Thr Phe Thr Phe 165
170 175Leu Asp Tyr Phe Arg Leu Ser Thr Ile Arg His Leu
Tyr Leu Ala Glu 180 185 190Asn
Met Val Arg Thr Leu Pro Ala Ser Met Leu Arg Asn Met Pro Leu 195
200 205Leu Glu Asn Leu Tyr Leu Gln Gly Asn
Pro Trp Thr Cys Asp Cys Glu 210 215
220Met Arg Trp Phe Leu Glu Trp Asp Ala Lys Ser Arg Gly Ile Leu Lys225
230 235 240Cys Lys Lys Asp
Lys Ala Tyr Glu Gly Gly Gln Leu Cys Ala Met Cys 245
250 255Phe Ser Pro Lys Lys Leu Tyr Lys His Glu
Ile His Lys Leu Lys Asp 260 265
270Met Thr Cys Leu Lys Pro Ser Ile Glu Ser Pro Leu Arg Gln Asn Arg
275 280 285Ser Arg Ser Ile Glu Glu Glu
Gln Glu Gln Glu Glu Asp Gly Gly Ser 290 295
300Gln Leu Ile Leu Glu Lys Phe Gln Leu Pro Gln Trp Ser Ile Ser
Leu305 310 315 320Asn Met
Thr Asp Glu His Gly Asn Met Val Asn Leu Val Cys Asp Ile
325 330 335Lys Lys Pro Met Asp Val Tyr
Lys Ile His Leu Asn Gln Thr Asp Pro 340 345
350Pro Asp Ile Asp Ile Asn Ala Thr Val Ala Leu Asp Phe Glu
Cys Pro 355 360 365Met Thr Arg Glu
Asn Tyr Glu Lys Leu Trp Lys Leu Ile Ala Tyr Tyr 370
375 380Ser Glu Val Pro Val Lys Leu His Arg Glu Leu Met
Leu Ser Lys Asp385 390 395
400Pro Arg Val Ser Tyr Gln Tyr Arg Gln Asp Ala Asp Glu Glu Ala Leu
405 410 415Tyr Tyr Thr Gly Val
Arg Ala Gln Ile Leu Ala Glu Pro Glu Trp Val 420
425 430Met Gln Pro Ser Ile Asp Ile Gln Leu Asn Arg Arg
Gln Ser Thr Ala 435 440 445Lys Lys
Val Leu Leu Ser Tyr Tyr Thr Gln Tyr Ser Gln Thr Ile Ser 450
455 460Thr Lys Asp Thr Arg Gln Ala Arg Gly Arg Ser
Trp Val Met Ile Glu465 470 475
480Pro Ser Gly Ala Val Gln Arg Asp Gln Thr Val Leu Glu Gly Gly Pro
485 490 495Cys Gln Leu Ser
Cys Asn Val Lys Ala Ser Glu Ser Pro Ser Ile Phe 500
505 510Trp Val Leu Pro Asp Gly Ser Ile Leu Lys Ala
Pro Met Asp Asp Pro 515 520 525Asp
Ser Lys Phe Ser Ile Leu Ser Ser Gly Trp Leu Arg Ile Lys Ser 530
535 540Met Glu Pro Ser Asp Ser Gly Leu Tyr Gln
Cys Ile Ala Gln Val Arg545 550 555
560Asp Glu Met Asp Arg Met Val Tyr Arg Val Leu Val Gln Ser Pro
Ser 565 570 575Thr Gln Pro
Ala Glu Lys Asp Thr Val Thr Ile Gly Lys Asn Pro Gly 580
585 590Glu Ser Val Thr Leu Pro Cys Asn Ala Leu
Ala Ile Pro Glu Ala His 595 600
605Leu Ser Trp Ile Leu Pro Asn Arg Arg Ile Ile Asn Asp Leu Ala Asn 610
615 620Thr Ser His Val Tyr Met Leu Pro
Asn Gly Thr Leu Ser Ile Pro Lys625 630
635 640Val Gln Val Ser Asp Ser Gly Tyr Tyr Arg Cys Val
Ala Val Asn Gln 645 650
655Gln Gly Ala Asp His Phe Thr Val Gly Ile Thr Val Thr Lys Lys Gly
660 665 670Ser Gly Leu Pro Ser Lys
Arg Gly Arg Arg Pro Gly Ala Lys Ala Leu 675 680
685Ser Arg Val Arg Glu Asp Ile Val Glu Asp Glu Gly Gly Ser
Gly Met 690 695 700Gly Asp Glu Glu Asn
Thr Ser Arg Arg Leu Leu His Pro Lys Asp Gln705 710
715 720Glu Val Phe Leu Lys Thr Lys Asp Asp Ala
Ile Asn Gly Asp Lys Lys 725 730
735Ala Lys Lys Gly Arg Arg Lys Leu Lys Leu Trp Lys His Ser Glu Lys
740 745 750Glu Pro Glu Thr Asn
Val Ala Glu Gly Arg Arg Val Phe Glu Ser Arg 755
760 765Arg Arg Ile Asn Met Ala Asn Lys Gln Ile Asn Pro
Glu Arg Trp Ala 770 775 780Asp Ile Leu
Ala Lys Val Arg Gly Lys Asn Leu Pro Lys Gly Thr Glu785
790 795 800Val Pro Pro Leu Ile Lys Thr
Thr Ser Pro Pro Ser Leu Ser Leu Glu 805
810 815Val Thr Pro Pro Phe Pro Ala Val Ser Pro Pro Ser
Ala Ser Pro Val 820 825 830Gln
Thr Val Thr Ser Ala Glu Glu Ser Ser Ala Asp Val Pro Leu Leu 835
840 845Gly Glu Glu Glu His Val Leu Gly Thr
Ile Ser Ser Ala Ser Met Gly 850 855
860Leu Glu His Asn His Asn Gly Val Ile Leu Val Glu Pro Glu Val Thr865
870 875 880Ser Thr Pro Leu
Glu Glu Val Val Asp Asp Leu Ser Glu Lys Thr Glu 885
890 895Glu Ile Thr Ser Thr Glu Gly Asp Leu Lys
Gly Thr Ala Ala Pro Thr 900 905
910Leu Ile Ser Glu Pro Tyr Glu Pro Ser Pro Thr Leu His Thr Leu Asp
915 920 925Thr Val Tyr Glu Lys Pro Thr
His Glu Glu Thr Ala Thr Glu Gly Trp 930 935
940Ser Ala Ala Asp Val Gly Ser Ser Pro Glu Pro Thr Ser Ser Glu
Tyr945 950 955 960Glu Pro
Pro Leu Asp Ala Val Ser Leu Ala Glu Ser Glu Pro Met Gln
965 970 975Tyr Phe Asp Pro Asp Leu Glu
Thr Lys Ser Gln Pro Asp Glu Asp Lys 980 985
990Met Lys Glu Asp Thr Phe Ala His Leu Thr Pro Thr Pro Thr
Ile Trp 995 1000 1005Val Asn Asp
Ser Ser Thr Ser Gln Leu Phe Glu Asp Ser Thr Ile 1010
1015 1020Gly Glu Pro Gly Val Pro Gly Gln Ser His Leu
Gln Gly Leu Thr 1025 1030 1035Asp Asn
Ile His Leu Val Lys Ser Ser Leu Ser Thr Gln Asp Thr 1040
1045 1050Leu Leu Ile Lys Lys Gly Met Lys Glu Met
Ser Gln Thr Leu Gln 1055 1060 1065Gly
Gly Asn Met Leu Glu Gly Asp Pro Thr His Ser Arg Ser Ser 1070
1075 1080Glu Ser Glu Gly Gln Glu Ser Lys Ser
Ile Thr Leu Pro Asp Ser 1085 1090
1095Thr Leu Gly Ile Met Ser Ser Met Ser Pro Val Lys Lys Pro Ala
1100 1105 1110Glu Thr Thr Val Gly Thr
Leu Leu Asp Lys Asp Thr Thr Thr Val 1115 1120
1125Thr Thr Thr Pro Arg Gln Lys Val Ala Pro Ser Ser Thr Met
Ser 1130 1135 1140Thr His Pro Ser Arg
Arg Arg Pro Asn Gly Arg Arg Arg Leu Arg 1145 1150
1155Pro Asn Lys Phe Arg His Arg His Lys Gln Thr Pro Pro
Thr Thr 1160 1165 1170Phe Ala Pro Ser
Glu Thr Phe Ser Thr Gln Pro Thr Gln Ala Pro 1175
1180 1185Asp Ile Lys Ile Ser Ser Gln Val Glu Ser Ser
Leu Val Pro Thr 1190 1195 1200Ala Trp
Val Asp Asn Thr Val Asn Thr Pro Lys Gln Leu Glu Met 1205
1210 1215Glu Lys Asn Ala Glu Pro Thr Ser Lys Gly
Thr Pro Arg Arg Lys 1220 1225 1230His
Gly Lys Arg Pro Asn Lys His Arg Tyr Thr Pro Ser Thr Val 1235
1240 1245Ser Ser Arg Ala Ser Gly Ser Lys Pro
Ser Pro Ser Pro Glu Asn 1250 1255
1260Lys His Arg Asn Ile Val Thr Pro Ser Ser Glu Thr Ile Leu Leu
1265 1270 1275Pro Arg Thr Val Ser Leu
Lys Thr Glu Gly Pro Tyr Asp Ser Leu 1280 1285
1290Asp Tyr Met Thr Thr Thr Arg Lys Ile Tyr Ser Ser Tyr Pro
Lys 1295 1300 1305Val Gln Glu Thr Leu
Pro Val Thr Tyr Lys Pro Thr Ser Asp Gly 1310 1315
1320Lys Glu Ile Lys Asp Asp Val Ala Thr Asn Val Asp Lys
His Lys 1325 1330 1335Ser Asp Ile Leu
Val Thr Gly Glu Ser Ile Thr Asn Ala Ile Pro 1340
1345 1350Thr Ser Arg Ser Leu Val Ser Thr Met Gly Glu
Phe Lys Glu Glu 1355 1360 1365Ser Ser
Pro Val Gly Phe Pro Gly Thr Pro Thr Trp Asn Pro Ser 1370
1375 1380Arg Thr Ala Gln Pro Gly Arg Leu Gln Thr
Asp Ile Pro Val Thr 1385 1390 1395Thr
Ser Gly Glu Asn Leu Thr Asp Pro Pro Leu Leu Lys Glu Leu 1400
1405 1410Glu Asp Val Asp Phe Thr Ser Glu Phe
Leu Ser Ser Leu Thr Val 1415 1420
1425Ser Thr Pro Phe His Gln Glu Glu Ala Gly Ser Ser Thr Thr Leu
1430 1435 1440Ser Ser Ile Lys Val Glu
Val Ala Ser Ser Gln Ala Glu Thr Thr 1445 1450
1455Thr Leu Asp Gln Asp His Leu Glu Thr Thr Val Ala Ile Leu
Leu 1460 1465 1470Ser Glu Thr Arg Pro
Gln Asn His Thr Pro Thr Ala Ala Arg Met 1475 1480
1485Lys Glu Pro Ala Ser Ser Ser Pro Ser Thr Ile Leu Met
Ser Leu 1490 1495 1500Gly Gln Thr Thr
Thr Thr Lys Pro Ala Leu Pro Ser Pro Arg Ile 1505
1510 1515Ser Gln Ala Ser Arg Asp Ser Lys Glu Asn Val
Phe Leu Asn Tyr 1520 1525 1530Val Gly
Asn Pro Glu Thr Glu Ala Thr Pro Val Asn Asn Glu Gly 1535
1540 1545Thr Gln His Met Ser Gly Pro Asn Glu Leu
Ser Thr Pro Ser Ser 1550 1555 1560Asp
Arg Asp Ala Phe Asn Leu Ser Thr Lys Leu Glu Leu Glu Lys 1565
1570 1575Gln Val Phe Gly Ser Arg Ser Leu Pro
Arg Gly Pro Asp Ser Gln 1580 1585
1590Arg Gln Asp Gly Arg Val His Ala Ser His Gln Leu Thr Arg Val
1595 1600 1605Pro Ala Lys Pro Ile Leu
Pro Thr Ala Thr Val Arg Leu Pro Glu 1610 1615
1620Met Ser Thr Gln Ser Ala Ser Arg Tyr Phe Val Thr Ser Gln
Ser 1625 1630 1635Pro Arg His Trp Thr
Asn Lys Pro Glu Ile Thr Thr Tyr Pro Ser 1640 1645
1650Gly Ala Leu Pro Glu Asn Lys Gln Phe Thr Thr Pro Arg
Leu Ser 1655 1660 1665Ser Thr Thr Ile
Pro Leu Pro Leu His Met Ser Lys Pro Ser Ile 1670
1675 1680Pro Ser Lys Phe Thr Asp Arg Arg Thr Asp Gln
Phe Asn Gly Tyr 1685 1690 1695Ser Lys
Val Phe Gly Asn Asn Asn Ile Pro Glu Ala Arg Asn Pro 1700
1705 1710Val Gly Lys Pro Pro Ser Pro Arg Ile Pro
His Tyr Ser Asn Gly 1715 1720 1725Arg
Leu Pro Phe Phe Thr Asn Lys Thr Leu Ser Phe Pro Gln Leu 1730
1735 1740Gly Val Thr Arg Arg Pro Gln Ile Pro
Thr Ser Pro Ala Pro Val 1745 1750
1755Met Arg Glu Arg Lys Val Ile Pro Gly Ser Tyr Asn Arg Ile His
1760 1765 1770Ser His Ser Thr Phe His
Leu Asp Phe Gly Pro Pro Ala Pro Pro 1775 1780
1785Leu Leu His Thr Pro Gln Thr Thr Gly Ser Pro Ser Thr Asn
Leu 1790 1795 1800Gln Asn Ile Pro Met
Val Ser Ser Thr Gln Ser Ser Ile Ser Phe 1805 1810
1815Ile Thr Ser Ser Val Gln Ser Ser Gly Ser Phe His Gln
Ser Ser 1820 1825 1830Ser Lys Phe Phe
Ala Gly Gly Pro Pro Ala Ser Lys Phe Trp Ser 1835
1840 1845Leu Gly Glu Lys Pro Gln Ile Leu Thr Lys Ser
Pro Gln Thr Val 1850 1855 1860Ser Val
Thr Ala Glu Thr Asp Thr Val Phe Pro Cys Glu Ala Thr 1865
1870 1875Gly Lys Pro Lys Pro Phe Val Thr Trp Thr
Lys Val Ser Thr Gly 1880 1885 1890Ala
Leu Met Thr Pro Asn Thr Arg Ile Gln Arg Phe Glu Val Leu 1895
1900 1905Lys Asn Gly Thr Leu Val Ile Arg Lys
Val Gln Val Gln Asp Arg 1910 1915
1920Gly Gln Tyr Met Cys Thr Ala Ser Asn Leu His Gly Leu Asp Arg
1925 1930 1935Met Val Val Leu Leu Ser
Val Thr Val Gln Gln Pro Gln Ile Leu 1940 1945
1950Ala Ser His Tyr Gln Asp Val Thr Val Tyr Leu Gly Asp Thr
Ile 1955 1960 1965Ala Met Glu Cys Leu
Ala Lys Gly Thr Pro Ala Pro Gln Ile Ser 1970 1975
1980Trp Ile Phe Pro Asp Arg Arg Val Trp Gln Thr Val Ser
Pro Val 1985 1990 1995Glu Ser Arg Ile
Thr Leu His Glu Asn Arg Thr Leu Ser Ile Lys 2000
2005 2010Glu Ala Ser Phe Ser Asp Arg Gly Val Tyr Lys
Cys Val Ala Ser 2015 2020 2025Asn Ala
Ala Gly Ala Asp Ser Leu Ala Ile Arg Leu His Val Ala 2030
2035 2040Ala Leu Pro Pro Val Ile His Gln Glu Lys
Leu Glu Asn Ile Ser 2045 2050 2055Leu
Pro Pro Gly Leu Ser Ile His Ile His Cys Thr Ala Lys Ala 2060
2065 2070Ala Pro Leu Pro Ser Val Arg Trp Val
Leu Gly Asp Gly Thr Gln 2075 2080
2085Ile Arg Pro Ser Gln Phe Leu His Gly Asn Leu Phe Val Phe Pro
2090 2095 2100Asn Gly Thr Leu Tyr Ile
Arg Asn Leu Ala Pro Lys Asp Ser Gly 2105 2110
2115Arg Tyr Glu Cys Val Ala Ala Asn Leu Val Gly Ser Ala Arg
Arg 2120 2125 2130Thr Val Gln Leu Asn
Val Gln Arg Ala Ala Ala Asn Ala Arg Ile 2135 2140
2145Thr Gly Thr Ser Pro Arg Arg Thr Asp Val Arg Tyr Gly
Gly Thr 2150 2155 2160Leu Lys Leu Asp
Cys Ser Ala Ser Gly Asp Pro Trp Pro Arg Ile 2165
2170 2175Leu Trp Arg Leu Pro Ser Lys Arg Met Ile Asp
Ala Leu Phe Ser 2180 2185 2190Phe Asp
Ser Arg Ile Lys Val Phe Ala Asn Gly Thr Leu Val Val 2195
2200 2205Lys Ser Val Thr Asp Lys Asp Ala Gly Asp
Tyr Leu Cys Val Ala 2210 2215 2220Arg
Asn Lys Val Gly Asp Asp Tyr Val Val Leu Lys Val Asp Val 2225
2230 2235Val Met Lys Pro Ala Lys Ile Glu His
Lys Glu Glu Asn Asp His 2240 2245
2250Lys Val Phe Tyr Gly Gly Asp Leu Lys Val Asp Cys Val Ala Thr
2255 2260 2265Gly Leu Pro Asn Pro Glu
Ile Ser Trp Ser Leu Pro Asp Gly Ser 2270 2275
2280Leu Val Asn Ser Phe Met Gln Ser Asp Asp Ser Gly Gly Arg
Thr 2285 2290 2295Lys Arg Tyr Val Val
Phe Asn Asn Gly Thr Leu Tyr Phe Asn Glu 2300 2305
2310Val Gly Met Arg Glu Glu Gly Asp Tyr Thr Cys Phe Ala
Glu Asn 2315 2320 2325Gln Val Gly Lys
Asp Glu Met Arg Val Arg Val Lys Val Val Thr 2330
2335 2340Ala Pro Ala Thr Ile Arg Asn Lys Thr Tyr Leu
Ala Val Gln Val 2345 2350 2355Pro Tyr
Gly Asp Val Val Thr Val Ala Cys Glu Ala Lys Gly Glu 2360
2365 2370Pro Met Pro Lys Val Thr Trp Leu Ser Pro
Thr Asn Lys Val Ile 2375 2380 2385Pro
Thr Ser Ser Glu Lys Tyr Gln Ile Tyr Gln Asp Gly Thr Leu 2390
2395 2400Leu Ile Gln Lys Ala Gln Arg Ser Asp
Ser Gly Asn Tyr Thr Cys 2405 2410
2415Leu Val Arg Asn Ser Ala Gly Glu Asp Arg Lys Thr Val Trp Ile
2420 2425 2430His Val Asn Val Gln Pro
Pro Lys Ile Asn Gly Asn Pro Asn Pro 2435 2440
2445Ile Thr Thr Val Arg Glu Ile Ala Ala Gly Gly Ser Arg Lys
Leu 2450 2455 2460Ile Asp Cys Lys Ala
Glu Gly Ile Pro Thr Pro Arg Val Leu Trp 2465 2470
2475Ala Phe Pro Glu Gly Val Val Leu Pro Ala Pro Tyr Tyr
Gly Asn 2480 2485 2490Arg Ile Thr Val
His Gly Asn Gly Ser Leu Asp Ile Arg Ser Leu 2495
2500 2505Arg Lys Ser Asp Ser Val Gln Leu Val Cys Met
Ala Arg Asn Glu 2510 2515 2520Gly Gly
Glu Ala Arg Leu Ile Val Gln Leu Thr Val Leu Glu Pro 2525
2530 2535Met Glu Lys Pro Ile Phe His Asp Pro Ile
Ser Glu Lys Ile Thr 2540 2545 2550Ala
Met Ala Gly His Thr Ile Ser Leu Asn Cys Ser Ala Ala Gly 2555
2560 2565Thr Pro Thr Pro Ser Leu Val Trp Val
Leu Pro Asn Gly Thr Asp 2570 2575
2580Leu Gln Ser Gly Gln Gln Leu Gln Arg Phe Tyr His Lys Ala Asp
2585 2590 2595Gly Met Leu His Ile Ser
Gly Leu Ser Ser Val Asp Ala Gly Ala 2600 2605
2610Tyr Arg Cys Val Ala Arg Asn Ala Ala Gly His Thr Glu Arg
Leu 2615 2620 2625Val Ser Leu Lys Val
Gly Leu Lys Pro Glu Ala Asn Lys Gln Tyr 2630 2635
2640His Asn Leu Val Ser Ile Ile Asn Gly Glu Thr Leu Lys
Leu Pro 2645 2650 2655Cys Thr Pro Pro
Gly Ala Gly Gln Gly Arg Phe Ser Trp Thr Leu 2660
2665 2670Pro Asn Gly Met His Leu Glu Gly Pro Gln Thr
Leu Gly Arg Val 2675 2680 2685Ser Leu
Leu Asp Asn Gly Thr Leu Thr Val Arg Glu Ala Ser Val 2690
2695 2700Phe Asp Arg Gly Thr Tyr Val Cys Arg Met
Glu Thr Glu Tyr Gly 2705 2710 2715Pro
Ser Val Thr Ser Ile Pro Val Ile Val Ile Ala Tyr Pro Pro 2720
2725 2730Arg Ile Thr Ser Glu Pro Thr Pro Val
Ile Tyr Thr Arg Pro Gly 2735 2740
2745Asn Thr Val Lys Leu Asn Cys Met Ala Met Gly Ile Pro Lys Ala
2750 2755 2760Asp Ile Thr Trp Glu Leu
Pro Asp Lys Ser His Leu Lys Ala Gly 2765 2770
2775Val Gln Ala Arg Leu Tyr Gly Asn Arg Phe Leu His Pro Gln
Gly 2780 2785 2790Ser Leu Thr Ile Gln
His Ala Thr Gln Arg Asp Ala Gly Phe Tyr 2795 2800
2805Lys Cys Met Ala Lys Asn Ile Leu Gly Ser Asp Ser Lys
Thr Thr 2810 2815 2820Tyr Ile His Val
Phe 2825
22 9645 DNA homo sapiens misc_feature (1)..(9645) 'n' can be any
nucleotide 'a', 'c', 'g' or 't'.
22 atgcccaagc gcgcgcactg gggggccctc
tccgtggtgc tgatcctgct ttggggccat 60ccgcgagtgg cgctggcctg cccgcatcct
tgtgcctgct acgtccccag cgaggtccac 120tgcacgttcc gatccctggc ttccgtgccc
gctggcattg ctagacacgt ggaaagaatc 180aatttggggt ttaatagcat acaggccctg
tcagaaacct catttgcagg actgaccaag 240ttggagctac ttatgattca cggcaatgag
atcccaagca tccccgatgg agctttaaga 300gacctcagct ctcttcaggt tttcaagttc
agctacaaca agctgagagt gatcacagga 360cagaccctcc agggtctctc taacttaatg
aggctgcaca ttgaccacaa caagatcgag 420tttatccacc ctcaagcttt caacggctta
acgtctctga ggctactcca tttggaagga 480aatctcctcc accagctgca ccccagcacc
ttctccacgt tcacattttt ggattatttc 540agactctcca ccataaggca cctctactta
gcagagaaca tggttagaac tcttcctgcc 600agcatgcttc ggaacatgcc gcttctggag
aatctttact tgcagggaaa tccgtggacc 660tgcgattgtg agatgagatg gtttttggaa
tgggatgcaa aatccagagg aattctgaag 720tgtaaaaagg acaaagctta tgaaggcggt
cagttgtgtg caatgtgctt cagtccaaag 780aagttgtaca aacatgagat acacaagctg
aaggacatga cttgtctgaa gccttcaata 840gagtcccctc tgagacagaa caggagcagg
agtattgagg aggagcaaga acaggaagag 900gatggtggca gccagctcat cctggagaaa
ttccaactgc cccagtggag catctctttg 960aatatgaccg acgagcacgg gaacatggtg
aacttggtct gtgacatcaa gaaaccaatg 1020gatgtgtaca agattcactt gaaccaaacg
gatcctccag atattgacat aaatgcaaca 1080gttgccttgg actttgagtg tccaatgacc
cgagaaaact atgaaaagct atggaaattg 1140atagcatact acagtgaagt tcccgtgaag
ctacacagag agctcatgct cagcaaagac 1200cccagagtca gctaccagta caggcaggat
gctgatgagg aagctcttta ctacacaggt 1260gtgagagccc agattcttgc agaaccagaa
tgggtcatgc agccatccat agatatccag 1320ctgaaccgac gtcagagtac ggccaagaag
gtgctacttt cctactacac ccagtattct 1380caaacaatat ccaccaaaga tacaaggcag
gctcggggca gaagctgggt aatgattgag 1440cctagtggag ctgtgcaaag agatcagact
gtcctggaag ggggtccatg ccagttgagc 1500tgcaacgtga aagcttctga gagtccatct
atcttctggg tgcttccaga tggctccatc 1560ctgaaagcgc ccatggatga cccagacagc
aagttctcca ttctcagcag tggctggctg 1620aggatcaagt ccatggagcc atctgactca
ggcttgtacc agtgcattgc tcaagtgagg 1680gatgaaatgg accgcatggt atatagggta
cttgtgcagt ctccctccac tcagccagcc 1740gagaaagaca cagtgacaat tggcaagaac
ccaggggagt cggtgacatt gccttgcaat 1800gctttagcaa tacccgaagc ccaccttagc
tggattcttc caaacagaag gataattaat 1860gatttggcta acacatcaca tgtatacatg
ttgccaaatg gaactctttc catcccaaag 1920gtccaagtca gtgatagtgg ttactacaga
tgtgtggctg tcaaccagca aggggcagac 1980cattttacgg tgggaatcac agtgaccaag
aaagggtctg gcttgccatc caaaagaggc 2040agacgcccag gtgcaaaggc tctttccaga
gtcagagaag acatcgtgga ggatgaaggg 2100ggctcgggca tgggagatga agagaacact
tcaaggagac ttctgcatcc aaaggaccaa 2160gaggtgttcc tcaaaacaaa ggatgatgcc
atcaatggag acaagaaagc caagaaaggg 2220agaagaaagc tgaaactctg gaagcattcg
gaaaaagaac cagagaccaa tgttgcagaa 2280ggtcgcagag tgtttgaatc tagacgaagg
ataaacatgg caaacaaaca gattaatccg 2340gagcgctggg ctgatatttt agccaaagtc
cgtgggaaaa atctccctaa gggcacagaa 2400gtacccccgt tgattaaaac cacaagtcct
ccatccttga gcctagaagt cacaccacct 2460tttcctgctg tttctccccc ctcagcatct
cctgtgcaga cagtaaccag tgctgaagaa 2520tcctcagcag atgtacctct acttggtgaa
gaagagcacg ttttgggtac catttcctca 2580gccagcatgg ggctagaaca caaccacaat
ggagttattc ttgttgaacc tgaagtaaca 2640agcacacctc tggaggaagt tgttgatgac
ctttctgaga agactgagga gataacttcc 2700actgaaggag acctgaaggg gacagcagcc
cctacactta tatctgagcc ttatgaacca 2760tctcctactc tgcacacatt agacacagtc
tatgaaaagc ccacccatga agagacggca 2820acagagggtt ggtctgcagc agatgttgga
tcgtcaccag agcccacatc cagtgagtat 2880gagcctccat tggatgctgt ctccttggct
gagtctgagc ccatgcaata ctttgaccca 2940gatttggaga ctaagtcaca accagatgag
gataagatga aagaagacac ctttgcacac 3000cttactccaa cccccaccat ctgggttaat
gactccagta catcacagtt atttgaggat 3060tctactatag gggaaccagg tgtcccaggc
caatcacatc tacaaggact gacagacaac 3120atccaccttg tgaaaagtag tctaagcact
caagacacct tactgattaa aaagggtatg 3180aaagagatgt ctcagacact acagggagga
aatatgctag agggagaccc cacacactcc 3240agaagttctg agagtgaggg ccaagagagc
aaatccatca ctttgcctga ctccacactg 3300ggtataatga gcagtatgtc tccagttaag
aagcctgcgg aaaccacagt tggtaccctc 3360ctagacaaag acaccacaac agtaacaaca
acaccaaggc aaaaagttgc tccgtcatcc 3420accatgagca ctcacccttc tcgaaggaga
cccaacggga gaaggagatt acgccccaac 3480aaattccgcc accggcacaa gcaaacccca
cccacaactt ttgccccatc agagactttt 3540tctactcaac caactcaagc acctgacatt
aagatttcaa gtcaagtgga gagttctctg 3600gttcctacag cttgggtgga taacacagtt
aataccccca aacagttgga aatggagaag 3660aatgcagaac ccacatccaa gggaacacca
cggagaaaac acgggaagag gccaaacaaa 3720catcgatata ccccttctac agtgagctca
agagcgtccg gatccaagcc cagcccttct 3780ccagaaaata aacatagaaa cattgttact
cccagttcag aaactatact tttgcctaga 3840actgtttctc tgaaaactga gggcccttat
gattccttag attacatgac aaccaccaga 3900aaaatatatt catcttaccc taaagtccaa
gagacacttc cagtcacata taaacccaca 3960tcagatggaa aagaaattaa ggatgatgtt
gccacaaatg ttgacaaaca taaaagtgac 4020attttagtca ctggtgaatc aattactaat
gccataccaa cttctcgctc cttggtctcc 4080actatgggag aatttaagga agaatcctct
cctgtaggct ttccaggaac tccaacctgg 4140aatccctcaa ggacggccca gcctgggagg
ctacagacag acatacctgt taccacttct 4200ggggaaaatc ttacagaccc tccccttctt
aaagagcttg aggatgtgga tttcacttcc 4260gagtttttgt cctctttgac agtctccaca
ccatttcacc aggaagaagc tggttcttcc 4320acaactctct caagcataaa agtggaggtg
gcttcaagtc aggcagaaac caccaccctt 4380gatcaagatc atcttgaaac cactgtggct
attctccttt ctgaaactag accacagaat 4440cacaccccta ctgctgcccg gatgaaggag
ccagcatcct cgtccccatc cacaattctc 4500atgtctttgg gacaaaccac caccactaag
ccagcacttc ccagtccaag aatatctcaa 4560gcatctagag attccaagga aaatgttttc
ttgaattatg tggggaatcc agaaacagaa 4620gcaaccccag tcaacaatga aggaacacag
catatgtcag ggccaaatga attatcaaca 4680ccctcttccg accgggatgc atttaacttg
tctacaaagc tggaattgga aaagcaagta 4740tttggtagta ggagtctacc acgtggccca
gatagccaac gccaggatgg aagagttcat 4800gcttctcatc aactaaccag agtccctgcc
aaacccatcc taccaacagc aacagtgagg 4860ctacctgaaa tgtccacaca aagcgcttcc
agatactttg taacttccca gtcacctcgt 4920cactggacca acaaaccgga aataactaca
tatccttctg gggctttgcc agagaacaaa 4980cagtttacaa ctccaagatt atcaagtaca
acaattcctc tcccattgca catgtccaaa 5040cccagcattc ctagtaagtt tactgaccga
agaactgacc aattcaatgg ttactccaaa 5100gtgtttggaa ataacaacat ccctgaggca
agaaacccag ttggaaagcc tcccagtcca 5160agaattcctc attattccaa tggaagactc
cctttcttta ccaacaagac tctttctttt 5220ccacagttgg gagtcacccg gagaccccag
atacccactt ctcctgcccc agtaatgaga 5280gagagaaaag ttattccagg ttcctacaac
aggatacatt cccatagcac cttccatctg 5340gactttggcc ctccggcacc tccgttgttg
cacactccgc agaccacggg atcaccctca 5400actaacttac agaatatccc tatggtctct
tccacccaga gttctatctc ctttataaca 5460tcttctgtcc agtcctcagg aagcttccac
cagagcagct caaagttctt tgcaggagga 5520cctcctgcat ccaaattctg gtctcttggg
gaaaagcccc aaatcctcac caagtcccca 5580cagactgtgt ccgtcaccgc tgagacagac
actgtgttcc cctgtgaggc aacaggaaaa 5640ccaaagcctt tcgttacttg gacaaaggtt
tccacaggag ctcttatgac tccgaatacc 5700aggatacaac ggtttgaggt tctcaagaac
ggtaccttag tgatacggaa ggttcaagta 5760caagatcgag gccagtatat gtgcaccgcc
agcaacctgc acggcctgga caggatggtg 5820gtcttgcttt cggtcaccgt gcagcaacct
caaatcctag cctcccacta ccaggacgtc 5880actgtctacc tgggagacac cattgcaatg
gagtgtctgg ccaaagggac cccagccccc 5940caaatttcct ggatcttccc tgacaggagg
gtgtggcaaa ctgtgtcccc cgtggagagc 6000cgcatcaccc tgcacgaaaa ccggaccctt
tccatcaagg aggcgtcctt ctcagacaga 6060ggcgtctata agtgcgtggc cagcaatgca
gccggggcgg acagcctggc catccgcctg 6120cacgtggcgg cactgccccc cgttatccac
caggagaagc tggagaacat ctcgctgccc 6180ccggggctca gcattcacat tcactgcact
gccaaggctg cgcccctgcc cagcgtgcgc 6240tgggtgctcg gggacggtac ccagatccgc
ccctcgcagt tcctccacgg gaacttgttt 6300gttttcccca acgggacgct ctacatccgc
aacctcgcgc ccaaggacag cgggcgctat 6360gagtgcgtgg ccgccaacct ggtaggctcc
gcgcgcagga cggtgcagct gaacgtgcag 6420cgtgcagcag ccaacgcgcg catcacgggc
acctccccgc ggaggacgga cgtcaggtac 6480ggaggaaccc tcaagctgga ctgcagcgcc
tcgggggacc cctggccgcg catcctctgg 6540aggctgccgt ccaagaggat gatcgacgcg
ctcttcagtt ttgatagcag aatcaaggtg 6600tttgccaatg ggaccctggt ggtgaaatca
gtgacggaca aagatgccgg agattacctg 6660tgcgtagctc gaaataaggt tggtgatgac
tacgtggtgc tcaaagtgga tgtggtgatg 6720aaaccggcca agattgaaca caaggaggag
aacgaccaca aagtcttcta cgggggtgac 6780ctgaaagtgg actgtgtggc caccgggctt
cccaatcccg agatctcctg gagcctccca 6840gacgggagtc tggtgaactc cttcatgcag
tcggatgaca gcggtggacg caccaagcgc 6900tatgtcgtct tcaacaatgg gacactctac
tttaacgaag tggggatgag ggaggaagga 6960gactacacct gctttgctga aaatcaggtc
gggaaggacg agatgagagt cagagtcaag 7020gtggtgacag cgcccgccac catccggaac
aagacttact tggcggttca ggtgccctat 7080ggagacgtgg tcactgtagc ctgtgaggcc
aaaggagaac ccatgcccaa ggtgacttgg 7140ttgtccccaa ccaacaaggt gatccccacc
tcctctgaga agtatcagat ataccaagat 7200ggcactctcc ttattcagaa agcccagcgt
tctgacagcg gcaactacac ctgcctggtc 7260aggaacagcg cgggagagga taggaagacg
gtgtggattc acgtcaacgt ccagccaccc 7320aagatcaacg gtaaccccaa ccccatcacc
accgtgcggg agatagcagc cgggggcagt 7380cggaaactga ttgactgcaa agctgaaggc
atccccaccc cgagggtgtt atgggctttt 7440cccgagggtg tggttctgcc agctccatac
tatggaaacc ggatcactgt ccatggcaac 7500ggttccctgg acatcaggag tttgaggaag
agcgactccg tccagctggt atgcatggca 7560cgcaacgagg gaggggaggc gaggttgatc
gtgcagctca ctgtcctgga gcccatggag 7620aaacccatct tccacgaccc gatcagcgag
aagatcacgg ccatggcggg ccacaccatc 7680agcctcaact gctctgccgc ggggaccccg
acacccagcc tggtgtgggt ccttcccaat 7740ggcaccgatc tgcagagtgg acagcagctg
cagcgcttct accacaaggc tgacggcatg 7800ctacacatta gcggtctctc ctcggtggac
gctggggcct accgctgcgt ggcccgcaat 7860gccgctggcc acacggagag gctggtctcc
ctgaaggtgg gactgaagcc agaagcaaac 7920aagcagtatc ataacctggt cagcatcatc
aatggtgaga ccctgaagct cccctgcacc 7980cctcccgggg ctgggcaggg acgtttctcc
tggacgctcc ccaatggcat gcatctggag 8040ggcccccaaa ccctgggacg cgtttctctt
ctggacaatg gcaccctcac ggttcgtgag 8100gcctcggtgt ttgacagggg tacctatgta
tgcaggatgg agacggagta cggcccttcg 8160gtcaccagca tccccgtgat tgtgatcgcc
tatcctcccc ggatcaccag cgagcccacc 8220ccggtcatct acacccggcc cgggaacacc
gtgaaactga actgcatggc tatggggatt 8280cccaaagctg acatcacgtg ggagttaccg
gataagtcgc atctgaaggc aggggttcag 8340gctcgtctgt atggaaacag atttcttcac
ccccagggat cactgaccat ccagcatgcc 8400acacagagag atgccggctt ctacaagtgc
atggcaaaaa acattctcgg cagtgactcc 8460aaaacaactt acatccacgt cttctgaaat
gtggattcca gaatgattgc ttaggaactg 8520acaacaaagc ggggtttgta agggaagcca
ggttggggaa taggagctct taaataatgt 8580gtcacagtgc atggtggcct ctggtgggtt
tcaagttgag gttgatcttg atctacaatt 8640gttgggaaaa ggaagcaatg cagacacgag
aaggagggct cagccttgct gagacacttt 8700cttttgtgtt tacatcatgc caggggcttc
attcagggtg tctgtgctct gactgcaatt 8760tttcttcttt tgcaaatgcc actcgactgc
cttcataagc gtccatagga tatctgagga 8820acattcatca aaaataagcc atagacatga
acaacacctc actaccccat tgaagacgca 8880tcacctagtt aacctgctgc agtttttaca
tgatagactt tgttccagat tgacaagtca 8940tctttcagtt atttcctctg tcacttcaaa
actccagctt gcccaataag gatttagaac 9000cagagtgact gatatatata tatatatttt
aattcagagt tacatacata cagctaccat 9060tttatatgaa aaaagaaaaa catttcttcc
tggaactcac tttttatata atgttttata 9120tatatatttt ttcctttcaa atcagacgat
gagactagaa ggagaaatac tttctgtctt 9180attaaaatta ataaattatt ggtctttaca
agacttggat acattacagc agacatggaa 9240atataatttt aaaaaatttc tctccaacct
ccttcaaatt cagtcaccac tgttatatta 9300ccttctccag gaaccctcca gtggggaagg
ctgcgatatt agatttcctt gtatgcaaag 9360tttttgttga aagctgtgct cagaggaggt
gagaggagag gaaggagaaa actgcatcat 9420aactttacag aattgaatct agagtcttcc
ccgaaaagcc cagaaacttc tctgcagtat 9480ctggcttgtc catctggtct aaggtggctg
cttcttcccc agccatgagt cagtttgtgc 9540ccatgaataa tacacgacct gttatttcca
tgactgcttt actgtatttt taaggtcaat 9600atactgtaca tttgataata aaataatatt
ctcccaaaaa aaaaa 9645
23 7770 DNA homo sapiens
23 atgaaggtaa
aaggcagagg aatcacctgc ttgctggtct cctttgctgt gatctgcctg 60gtcgccaccc
ctgggggcaa ggcctgtcct cgccgctgtg cctgttatat gcctacggag 120gtacactgca
catttcggta cctgacttcc atcccagaca gcatcccgcc caatgtggaa 180cgcatcaatt
taggatacaa cagcttggtt agattgatgg aaacagattt ttctggcctg 240accaaactgg
agttactcat gcttcacagc aatggcattc acacaatccc tgacaagacc 300ttctcagatt
tgcaggcctt gcaggtctta aaaatgagct ataataaagt ccgaaaactt 360cagaaagata
ctttttatgg cctcaggagc ttgacacgat tgcacatgga ccacaacaat 420attgagttta
taaacccaga ggttttttat gggctcaact ttctccgcct ggtgcacttg 480gaaggaaatc
agctcactaa gctccaccca gatacatttg tctctttgag ctacctccag 540atatttaaaa
tctctttcat taagttccta tacttgtctg ataacttcct gacctccctc 600cctcaagaga
tggtctccta tatgcctgac ctagacagcc tttacctgca tggaaaccca 660tggacctgtg
attgccattt aaagtggttg tctgactgga tacaggagaa gccagatgta 720ataaaatgca
aaaaagatag aagtccctct agtgctcagc agtgtccact ttgcatgaac 780cctaggactt
ctaaaggcaa gccgttagct atggtctcag ctgcagcttt ccagtgtgcc 840aagccaacca
ttgactcatc cctgaaatca aagagcctga ctattctgga agacagtagt 900tctgctttca
tctctcccca aggtttcatg gcaccctttg gctccctcac tttgaatatg 960acagatcagt
ctggaaatga agctaacatg gtctgcagta ttcaaaagcc ctcaaggaca 1020tcacccattg
cattcactga agaaaatgac tacatcgtgc taaatacttc attttcaaca 1080tttttggtgt
gcaacataga ttacggtcac attcagccag tgtggcaaat tttggctttg 1140tacagtgatt
ctcctctgat actagaaagg agccacttgc ttagtgaaac accgcagctc 1200tattacaaat
ataaacaggt ggctcctaag cctgaagaca tttttaccaa catagaggca 1260gatctcagag
cagatccctc ttggttaatg caagaccaaa tttccttgca gctgaacaga 1320actgccacca
cattcagtac attacagatc cagtactcca gtgatgctca aatcacttta 1380ccaagagcag
agatgaggcc agtgaaacac aaatggacta tgatttcaag ggataacaat 1440actaagctgg
aacatactgt cttggtaggt ggaaccgttg gcctgaactg cccaggccaa 1500ggagacccca
ccccacacgt ggattggctt ctagctgatg gaagtaaagt gagagcccct 1560tatgtcagtg
aggatggacg gatcctaata gacaaaagtg gaaaattgga actccagatg 1620gctgatagtt
ttgacacagg cgtatatcac tgtataagca gcaattatga tgatgcagat 1680attctcacct
ataggataac tgtggtagaa cctttggtcg aagcctatca ggaaaatggg 1740attcatcaca
cagttttcat tggtgaaaca cttgatcttc catgccattc tactggtatc 1800ccagatgcct
ctattagctg ggttattcca ggaaacaatg tgctctatca gtcatcaaga 1860gacaagaaag
ttctaaacaa tggcacatta agaatattac aggtcacccc gaaagaccaa 1920ggttattatc
gctgtgtggc agccaaccca tcaggggttg attttttgat tttccaagtt 1980tcagtcaaga
tgaaaggaca aaggcccttg gagcatgatg gagaaacaga gggatctgga 2040cttgatgagt
ccaatcctat tgctcatctt aaggagccac caggtgcaca actccgtaca 2100tctgctctga
tggaggctga ggttggaaaa cacacctcaa gcacaagtaa gaggcacaac 2160tatcgggaat
taacactcca gcgacgtgga gattcaacac atcgacgttt tagggagaat 2220aggaggcatt
tccctccctc tgctaggaga attgacccac aacattgggc ggcactgttg 2280gagaaagcta
aaaagaatgc tatgccagac aagcgagaaa ataccacagt gagcccaccc 2340ccagtggtca
cccaactccc aaacatacct ggtgaagaag acgattcctc aggcatgctc 2400gctctacatg
aggaatttat ggtcccggcc actaaagctt tgaaccttcc agcaaggaca 2460gtgactgctg
actccagaac aatatctgat agtcctatga caaacataaa ttatggcaca 2520gaattctctc
ctgttgtgaa ttcacaaata ctaccacctg aagaacccac agatttcaaa 2580ctgtctactg
ctattaaaac tacagccatg tcaaagaata taaacccaac catgtcaagc 2640caaatacaag
gcacaaccaa tcaacattca tccactgtct ttccactgct acttggagca 2700actgaatttc
aggactctga ccagatggga agaggaagag agcatttcca aagtagaccc 2760ccaataacag
taaggactat gatcaaagat gtcaatgtca aaatgcttag tagcaccacc 2820aacaaactat
tattagagtc agtaaatacc acaaatagtc atcagacatc tgtaagagaa 2880gtgagtgaac
ccaggcacaa tcacttctat tctcacacta ctcaaatact tagcacctcc 2940acgttccctt
cagatccaca cacagctgct cattctcagt ttccgatccc tagaaatagt 3000acagttaaca
tcccgctgtt cagacgcttt gggaggcaga ggaaaattgg cggaaggggg 3060cggattatca
gcccatatag aactccagtt ctgcgacggc atagatacag cattttcagg 3120tcaacaacca
gaggttcttc tgaaaaaagc actactgcat tctcagccac agtgctcaat 3180gtgacatgtc
tgtcctgtct tcccagggag aggctcacca ctgccacagc agcattgtct 3240tttccaagtg
ctgctcccat caccttcccc aaagctgaca ttgctagagt cccatcagaa 3300gagtctacaa
ctctagtcca gaatccacta ttactacttg agaacaaacc cagtgtagag 3360aaaacaacac
ccacaataaa atatttcagg actgaaattt cccaagtgac tccaactggt 3420gcagtcatga
catatgctcc aacatccata cccatggaaa aaactcacaa agtaaacgcc 3480agttacccac
gtgtgtctag caccaatgaa gctaaaagag attcagtgat tacatcgtca 3540ctttcaggtg
ctatcaccaa gccaccaatg actattatag ccattacaag gttttcaaga 3600aggaaaattc
cctggcaaca gaactttgta aataaccata acccaaaagg cagattaagg 3660aatcaacata
aagttagttt acaaaaaagc acagctgtga tgcttcctaa aacatctcct 3720gctttaccac
agagacaaag ttcccctttc catttcacca cactttcaac aagtgtgatg 3780caaattccat
ctaatacctt gactaccgct caccacacta cgaccaaaac acacaatcct 3840ggaagtcttc
caacaaagaa ggagcttccc ttcccacccc ttaaccctat gcttcctagt 3900attataagca
aagactcaag tacaaaaagc atcatatcaa cgcaaacagc aataccagca 3960acaactccta
ccttccctgc atctgtcatc acttatgaaa cccaaacaga gagatctaga 4020gcacaaacaa
tacaaagaga acaggagcct caaaagaaga acaggactga cccaaacatc 4080tctccagacc
agagttctgg cttcactaca cccactgcta tgacacctcc tgctctggca 4140ttcactcatt
ccccaccaga aaacacaact gggatttcaa gcacaatcag ttttcattca 4200agaactctta
atctgacaga tgtgattgaa gaactagccc aagcaagtac tcagactttg 4260aagagcacaa
ttgcttctga aacaactttg tccagcaaat cacaccagag taccacaact 4320aggaaagcat
cattagacac tcccatacca ccattcttga gcagcagtgc tactctaatg 4380ccagttccca
tctcccctcc ctttactcag agagcagtta ctgacacacg tggcgactcc 4440catttccggc
ttatgacaaa tacagtggtc aagctgcacg aatcctcaag gcacaatctc 4500caaatgccaa
gttcacaatt ggaaccactc acttcatcta cctctaatct gttacattct 4560actcccatgc
cagcactaac aacagttaaa tcacagaatt ccaaattaac tccatctccc 4620tgggcagaat
accaattttg gcacaaacca tactcagaca ttgctgaaaa aggcaaaaag 4680ccagaagtaa
gcatgttggc tactacaggc ctgtccgagg ccaccactct tgtttcagat 4740tgggatggac
agaagaacac aaagaagagt gactttgata agaaaccagt tcaagaagca 4800acaacttcca
aactccttcc ctttgactct ttgtctaggt atatatttga aaagcccagg 4860atagttggag
gaaaagctgc aagttttact attccagcta actcagatgc ctttcttccc 4920tgtgaagctg
ttggaaatcc cctgcccacc attcattgga ccagagtttc aggacttgat 4980ttatctagag
gaaaccagaa tagcagggtc caggttctcc ccaatggtac cctgtccatc 5040cagagggtgg
aaattcagga ccgcggacag tacttgtgtt ccgcatccaa tctgtttggc 5100acagaccacc
ttcatgtcac cttgtctgtg gtttcctatc ctcccaggat cctggagaga 5160cgtaccaaag
agatcacagt tcattccgga agcactgtgg aactgaagtg cagagcagaa 5220ggtaggccaa
gccctacagt tacctggatt cttgcaaacc aaacagttgt ctcagaatca 5280tcccagggaa
gtaggcaggc tgtggtgacg gttgacggaa cattggtcct ccacaatctc 5340agtatttatg
accgtggctt ttacaaatgt gtggccagca acccaggtgg ccaggattca 5400ctgctggtta
aaatacaagt cattgcagca ccacctgtta ttctagagca aaggaggcaa 5460gtcattgtag
gcacttgggg tgaaagttta aaactgccct gtactgcaaa aggaactcct 5520cagcccagcg
tttactgggt cctctctgat ggcactgaag tgaaaccatt acagtttacc 5580aattccaagt
tgttcttatt ttcaaatggg actttgtata taagaaacct agcctcttca 5640gacaggggca
cttatgaatg cattgctacc agttccactg gttcggagcg aagagtagta 5700atgcttacaa
tggaagagcg agtgaccagc cccaggatag aagctgcatc ccagaaaagg 5760actgaagtga
attttgggga caaattacta ctgaactgct cagccactgg ggagcccaaa 5820ccccaaataa
tgtggaggtt accatccaag gctgtggtcg accagtggag ctggatccac 5880gtctacccta
atggatccct gtttattgga tcagtaacag aaaaagacag tggtgtctac 5940ttgtgtgtgg
caagaaacaa aatgggggat gatctgatac tgatgcatgt tagcctaaga 6000ctgaaacctg
ccaaaattga ccacaagcag tattttagaa agcaagtgct ccatgggaaa 6060gatttccaag
tagattgcaa agcttccggc tccccagtgc cagagatatc ttggagtttg 6120cctgatggaa
ccatgatcaa caatgcaatg caagccgatg acagtggcca caggactagg 6180agatataccc
ttttcaacaa tggaacttta tacttcaaca aagttggggt agcggaggaa 6240ggagattata
cttgctatgc ccagaacacc ctagggaaag atgaaatgaa ggtccactta 6300acagttataa
cagctgctcc ccggataagg cagagtaaca aaaccaacaa gagaatcaaa 6360gctggagaca
cagctgtcct tgactgtgag gtcactgggg atcccaaacc aaaaatattt 6420tggttgctgc
cttccaatga catgatttcc ttctccattg ataggtacac atttcatgcc 6480aatgggtctt
tgaccatcaa caaagtgaaa ctgctcgatt ctggagagta cgtatgtgta 6540gcccgaaatc
ccagtgggga tgacaccaaa atgtacaaac tggatgtggt ctctaaacct 6600ccattaatca
atggtctgta tacaaacaga actgttatta aagccacagc tgtgagacat 6660tccaaaaaac
actttgactg cagagctgaa gggacaccat ctcctgaagt catgtggatc 6720atgccagaca
atattttcct cacagcccca tactatggaa gcagaatcac agtccataaa 6780aatggaacct
tggaaattag gaatgtgagg ctttcagatt cagccgactt tatctgtgtg 6840gcccgaaatg
aaggtggaga gagcgtgttg gtagtacagt tagaagtact ggaaatgctg 6900agaagaccga
catttagaaa tccatttaat gaaaaaatag ttgcccagct gggaaagtcc 6960acagcattga
attgctctgt tgatggtaac ccaccacctg aaataatctg gattttacca 7020aatggcacac
gattttccaa tggaccacaa agttatcagt atctgatagc aagcaatggt 7080tcttttatca
tttctaaaac aactcgggag gatgcaggaa aatatcgctg tgcagctagg 7140aataaagttg
gctatattga gaaattagtc atattagaaa ttggccagaa gccagttatt 7200cttacctatg
caccagggac agtaaaaggc atcagtggag aatctctatc actgcattgt 7260gtgtctgatg
gaatccctaa gccaaatatc aaatggacta tgccaagtgg ttatgtagta 7320gacaggcctc
aaattaatgg gaaatacata ttgcatgaca atggcacctt agtcattaaa 7380gaagcaacag
cttatgacag aggaaactat atctgtaagg ctcaaaatag tgttggtcat 7440acactgatta
ctgttccagt aatgattgta gcctaccctc cccgaattac aaatcgtcca 7500cccaggagta
ttgtcaccag gacaggggca gcctttcagc tccactgtgt ggccttggga 7560gttcccaagc
cagaaatcac atgggagatg cctgaccact cccttctctc aacggcaagt 7620aaagagagga
cacatggaag tgagcagctt cacttacaag gtaccctagt cattcagaat 7680ccccaaacct
ccgattctgg gatatacaaa tgcacagcaa agaacccact tggtagtgat 7740tatgcagcaa
cgtatattca agtaatctga
7770
24 2589 PRT homo sapiens
24 Met Lys Val Lys Gly Arg Gly Ile Thr Cys Leu
Leu Val Ser Phe Ala1 5 10
15Val Ile Cys Leu Val Ala Thr Pro Gly Gly Lys Ala Cys Pro Arg Arg
20 25 30Cys Ala Cys Tyr Met Pro Thr
Glu Val His Cys Thr Phe Arg Tyr Leu 35 40
45Thr Ser Ile Pro Asp Ser Ile Pro Pro Asn Val Glu Arg Ile Asn
Leu 50 55 60Gly Tyr Asn Ser Leu Val
Arg Leu Met Glu Thr Asp Phe Ser Gly Leu65 70
75 80Thr Lys Leu Glu Leu Leu Met Leu His Ser Asn
Gly Ile His Thr Ile 85 90
95Pro Asp Lys Thr Phe Ser Asp Leu Gln Ala Leu Gln Val Leu Lys Met
100 105 110Ser Tyr Asn Lys Val Arg
Lys Leu Gln Lys Asp Thr Phe Tyr Gly Leu 115 120
125Arg Ser Leu Thr Arg Leu His Met Asp His Asn Asn Ile Glu
Phe Ile 130 135 140Asn Pro Glu Val Phe
Tyr Gly Leu Asn Phe Leu Arg Leu Val His Leu145 150
155 160Glu Gly Asn Gln Leu Thr Lys Leu His Pro
Asp Thr Phe Val Ser Leu 165 170
175Ser Tyr Leu Gln Ile Phe Lys Ile Ser Phe Ile Lys Phe Leu Tyr Leu
180 185 190Ser Asp Asn Phe Leu
Thr Ser Leu Pro Gln Glu Met Val Ser Tyr Met 195
200 205Pro Asp Leu Asp Ser Leu Tyr Leu His Gly Asn Pro
Trp Thr Cys Asp 210 215 220Cys His Leu
Lys Trp Leu Ser Asp Trp Ile Gln Glu Lys Pro Asp Val225
230 235 240Ile Lys Cys Lys Lys Asp Arg
Ser Pro Ser Ser Ala Gln Gln Cys Pro 245
250 255Leu Cys Met Asn Pro Arg Thr Ser Lys Gly Lys Pro
Leu Ala Met Val 260 265 270Ser
Ala Ala Ala Phe Gln Cys Ala Lys Pro Thr Ile Asp Ser Ser Leu 275
280 285Lys Ser Lys Ser Leu Thr Ile Leu Glu
Asp Ser Ser Ser Ala Phe Ile 290 295
300Ser Pro Gln Gly Phe Met Ala Pro Phe Gly Ser Leu Thr Leu Asn Met305
310 315 320Thr Asp Gln Ser
Gly Asn Glu Ala Asn Met Val Cys Ser Ile Gln Lys 325
330 335Pro Ser Arg Thr Ser Pro Ile Ala Phe Thr
Glu Glu Asn Asp Tyr Ile 340 345
350Val Leu Asn Thr Ser Phe Ser Thr Phe Leu Val Cys Asn Ile Asp Tyr
355 360 365Gly His Ile Gln Pro Val Trp
Gln Ile Leu Ala Leu Tyr Ser Asp Ser 370 375
380Pro Leu Ile Leu Glu Arg Ser His Leu Leu Ser Glu Thr Pro Gln
Leu385 390 395 400Tyr Tyr
Lys Tyr Lys Gln Val Ala Pro Lys Pro Glu Asp Ile Phe Thr
405 410 415Asn Ile Glu Ala Asp Leu Arg
Ala Asp Pro Ser Trp Leu Met Gln Asp 420 425
430Gln Ile Ser Leu Gln Leu Asn Arg Thr Ala Thr Thr Phe Ser
Thr Leu 435 440 445Gln Ile Gln Tyr
Ser Ser Asp Ala Gln Ile Thr Leu Pro Arg Ala Glu 450
455 460Met Arg Pro Val Lys His Lys Trp Thr Met Ile Ser
Arg Asp Asn Asn465 470 475
480Thr Lys Leu Glu His Thr Val Leu Val Gly Gly Thr Val Gly Leu Asn
485 490 495Cys Pro Gly Gln Gly
Asp Pro Thr Pro His Val Asp Trp Leu Leu Ala 500
505 510Asp Gly Ser Lys Val Arg Ala Pro Tyr Val Ser Glu
Asp Gly Arg Ile 515 520 525Leu Ile
Asp Lys Ser Gly Lys Leu Glu Leu Gln Met Ala Asp Ser Phe 530
535 540Asp Thr Gly Val Tyr His Cys Ile Ser Ser Asn
Tyr Asp Asp Ala Asp545 550 555
560Ile Leu Thr Tyr Arg Ile Thr Val Val Glu Pro Leu Val Glu Ala Tyr
565 570 575Gln Glu Asn Gly
Ile His His Thr Val Phe Ile Gly Glu Thr Leu Asp 580
585 590Leu Pro Cys His Ser Thr Gly Ile Pro Asp Ala
Ser Ile Ser Trp Val 595 600 605Ile
Pro Gly Asn Asn Val Leu Tyr Gln Ser Ser Arg Asp Lys Lys Val 610
615 620Leu Asn Asn Gly Thr Leu Arg Ile Leu Gln
Val Thr Pro Lys Asp Gln625 630 635
640Gly Tyr Tyr Arg Cys Val Ala Ala Asn Pro Ser Gly Val Asp Phe
Leu 645 650 655Ile Phe Gln
Val Ser Val Lys Met Lys Gly Gln Arg Pro Leu Glu His 660
665 670Asp Gly Glu Thr Glu Gly Ser Gly Leu Asp
Glu Ser Asn Pro Ile Ala 675 680
685His Leu Lys Glu Pro Pro Gly Ala Gln Leu Arg Thr Ser Ala Leu Met 690
695 700Glu Ala Glu Val Gly Lys His Thr
Ser Ser Thr Ser Lys Arg His Asn705 710
715 720Tyr Arg Glu Leu Thr Leu Gln Arg Arg Gly Asp Ser
Thr His Arg Arg 725 730
735Phe Arg Glu Asn Arg Arg His Phe Pro Pro Ser Ala Arg Arg Ile Asp
740 745 750Pro Gln His Trp Ala Ala
Leu Leu Glu Lys Ala Lys Lys Asn Ala Met 755 760
765Pro Asp Lys Arg Glu Asn Thr Thr Val Ser Pro Pro Pro Val
Val Thr 770 775 780Gln Leu Pro Asn Ile
Pro Gly Glu Glu Asp Asp Ser Ser Gly Met Leu785 790
795 800Ala Leu His Glu Glu Phe Met Val Pro Ala
Thr Lys Ala Leu Asn Leu 805 810
815Pro Ala Arg Thr Val Thr Ala Asp Ser Arg Thr Ile Ser Asp Ser Pro
820 825 830Met Thr Asn Ile Asn
Tyr Gly Thr Glu Phe Ser Pro Val Val Asn Ser 835
840 845Gln Ile Leu Pro Pro Glu Glu Pro Thr Asp Phe Lys
Leu Ser Thr Ala 850 855 860Ile Lys Thr
Thr Ala Met Ser Lys Asn Ile Asn Pro Thr Met Ser Ser865
870 875 880Gln Ile Gln Gly Thr Thr Asn
Gln His Ser Ser Thr Val Phe Pro Leu 885
890 895Leu Leu Gly Ala Thr Glu Phe Gln Asp Ser Asp Gln
Met Gly Arg Gly 900 905 910Arg
Glu His Phe Gln Ser Arg Pro Pro Ile Thr Val Arg Thr Met Ile 915
920 925Lys Asp Val Asn Val Lys Met Leu Ser
Ser Thr Thr Asn Lys Leu Leu 930 935
940Leu Glu Ser Val Asn Thr Thr Asn Ser His Gln Thr Ser Val Arg Glu945
950 955 960Val Ser Glu Pro
Arg His Asn His Phe Tyr Ser His Thr Thr Gln Ile 965
970 975Leu Ser Thr Ser Thr Phe Pro Ser Asp Pro
His Thr Ala Ala His Ser 980 985
990Gln Phe Pro Ile Pro Arg Asn Ser Thr Val Asn Ile Pro Leu Phe Arg
995 1000 1005Arg Phe Gly Arg Gln Arg
Lys Ile Gly Gly Arg Gly Arg Ile Ile 1010 1015
1020Ser Pro Tyr Arg Thr Pro Val Leu Arg Arg His Arg Tyr Ser
Ile 1025 1030 1035Phe Arg Ser Thr Thr
Arg Gly Ser Ser Glu Lys Ser Thr Thr Ala 1040 1045
1050Phe Ser Ala Thr Val Leu Asn Val Thr Cys Leu Ser Cys
Leu Pro 1055 1060 1065Arg Glu Arg Leu
Thr Thr Ala Thr Ala Ala Leu Ser Phe Pro Ser 1070
1075 1080Ala Ala Pro Ile Thr Phe Pro Lys Ala Asp Ile
Ala Arg Val Pro 1085 1090 1095Ser Glu
Glu Ser Thr Thr Leu Val Gln Asn Pro Leu Leu Leu Leu 1100
1105 1110Glu Asn Lys Pro Ser Val Glu Lys Thr Thr
Pro Thr Ile Lys Tyr 1115 1120 1125Phe
Arg Thr Glu Ile Ser Gln Val Thr Pro Thr Gly Ala Val Met 1130
1135 1140Thr Tyr Ala Pro Thr Ser Ile Pro Met
Glu Lys Thr His Lys Val 1145 1150
1155Asn Ala Ser Tyr Pro Arg Val Ser Ser Thr Asn Glu Ala Lys Arg
1160 1165 1170Asp Ser Val Ile Thr Ser
Ser Leu Ser Gly Ala Ile Thr Lys Pro 1175 1180
1185Pro Met Thr Ile Ile Ala Ile Thr Arg Phe Ser Arg Arg Lys
Ile 1190 1195 1200Pro Trp Gln Gln Asn
Phe Val Asn Asn His Asn Pro Lys Gly Arg 1205 1210
1215Leu Arg Asn Gln His Lys Val Ser Leu Gln Lys Ser Thr
Ala Val 1220 1225 1230Met Leu Pro Lys
Thr Ser Pro Ala Leu Pro Gln Arg Gln Ser Ser 1235
1240 1245Pro Phe His Phe Thr Thr Leu Ser Thr Ser Val
Met Gln Ile Pro 1250 1255 1260Ser Asn
Thr Leu Thr Thr Ala His His Thr Thr Thr Lys Thr His 1265
1270 1275Asn Pro Gly Ser Leu Pro Thr Lys Lys Glu
Leu Pro Phe Pro Pro 1280 1285 1290Leu
Asn Pro Met Leu Pro Ser Ile Ile Ser Lys Asp Ser Ser Thr 1295
1300 1305Lys Ser Ile Ile Ser Thr Gln Thr Ala
Ile Pro Ala Thr Thr Pro 1310 1315
1320Thr Phe Pro Ala Ser Val Ile Thr Tyr Glu Thr Gln Thr Glu Arg
1325 1330 1335Ser Arg Ala Gln Thr Ile
Gln Arg Glu Gln Glu Pro Gln Lys Lys 1340 1345
1350Asn Arg Thr Asp Pro Asn Ile Ser Pro Asp Gln Ser Ser Gly
Phe 1355 1360 1365Thr Thr Pro Thr Ala
Met Thr Pro Pro Ala Leu Ala Phe Thr His 1370 1375
1380Ser Pro Pro Glu Asn Thr Thr Gly Ile Ser Ser Thr Ile
Ser Phe 1385 1390 1395His Ser Arg Thr
Leu Asn Leu Thr Asp Val Ile Glu Glu Leu Ala 1400
1405 1410Gln Ala Ser Thr Gln Thr Leu Lys Ser Thr Ile
Ala Ser Glu Thr 1415 1420 1425Thr Leu
Ser Ser Lys Ser His Gln Ser Thr Thr Thr Arg Lys Ala 1430
1435 1440Ser Leu Asp Thr Pro Ile Pro Pro Phe Leu
Ser Ser Ser Ala Thr 1445 1450 1455Leu
Met Pro Val Pro Ile Ser Pro Pro Phe Thr Gln Arg Ala Val 1460
1465 1470Thr Asp Thr Arg Gly Asp Ser His Phe
Arg Leu Met Thr Asn Thr 1475 1480
1485Val Val Lys Leu His Glu Ser Ser Arg His Asn Leu Gln Met Pro
1490 1495 1500Ser Ser Gln Leu Glu Pro
Leu Thr Ser Ser Thr Ser Asn Leu Leu 1505 1510
1515His Ser Thr Pro Met Pro Ala Leu Thr Thr Val Lys Ser Gln
Asn 1520 1525 1530Ser Lys Leu Thr Pro
Ser Pro Trp Ala Glu Tyr Gln Phe Trp His 1535 1540
1545Lys Pro Tyr Ser Asp Ile Ala Glu Lys Gly Lys Lys Pro
Glu Val 1550 1555 1560Ser Met Leu Ala
Thr Thr Gly Leu Ser Glu Ala Thr Thr Leu Val 1565
1570 1575Ser Asp Trp Asp Gly Gln Lys Asn Thr Lys Lys
Ser Asp Phe Asp 1580 1585 1590Lys Lys
Pro Val Gln Glu Ala Thr Thr Ser Lys Leu Leu Pro Phe 1595
1600 1605Asp Ser Leu Ser Arg Tyr Ile Phe Glu Lys
Pro Arg Ile Val Gly 1610 1615 1620Gly
Lys Ala Ala Ser Phe Thr Ile Pro Ala Asn Ser Asp Ala Phe 1625
1630 1635Leu Pro Cys Glu Ala Val Gly Asn Pro
Leu Pro Thr Ile His Trp 1640 1645
1650Thr Arg Val Ser Gly Leu Asp Leu Ser Arg Gly Asn Gln Asn Ser
1655 1660 1665Arg Val Gln Val Leu Pro
Asn Gly Thr Leu Ser Ile Gln Arg Val 1670 1675
1680Glu Ile Gln Asp Arg Gly Gln Tyr Leu Cys Ser Ala Ser Asn
Leu 1685 1690 1695Phe Gly Thr Asp His
Leu His Val Thr Leu Ser Val Val Ser Tyr 1700 1705
1710Pro Pro Arg Ile Leu Glu Arg Arg Thr Lys Glu Ile Thr
Val His 1715 1720 1725Ser Gly Ser Thr
Val Glu Leu Lys Cys Arg Ala Glu Gly Arg Pro 1730
1735 1740Ser Pro Thr Val Thr Trp Ile Leu Ala Asn Gln
Thr Val Val Ser 1745 1750 1755Glu Ser
Ser Gln Gly Ser Arg Gln Ala Val Val Thr Val Asp Gly 1760
1765 1770Thr Leu Val Leu His Asn Leu Ser Ile Tyr
Asp Arg Gly Phe Tyr 1775 1780 1785Lys
Cys Val Ala Ser Asn Pro Gly Gly Gln Asp Ser Leu Leu Val 1790
1795 1800Lys Ile Gln Val Ile Ala Ala Pro Pro
Val Ile Leu Glu Gln Arg 1805 1810
1815Arg Gln Val Ile Val Gly Thr Trp Gly Glu Ser Leu Lys Leu Pro
1820 1825 1830Cys Thr Ala Lys Gly Thr
Pro Gln Pro Ser Val Tyr Trp Val Leu 1835 1840
1845Ser Asp Gly Thr Glu Val Lys Pro Leu Gln Phe Thr Asn Ser
Lys 1850 1855 1860Leu Phe Leu Phe Ser
Asn Gly Thr Leu Tyr Ile Arg Asn Leu Ala 1865 1870
1875Ser Ser Asp Arg Gly Thr Tyr Glu Cys Ile Ala Thr Ser
Ser Thr 1880 1885 1890Gly Ser Glu Arg
Arg Val Val Met Leu Thr Met Glu Glu Arg Val 1895
1900 1905Thr Ser Pro Arg Ile Glu Ala Ala Ser Gln Lys
Arg Thr Glu Val 1910 1915 1920Asn Phe
Gly Asp Lys Leu Leu Leu Asn Cys Ser Ala Thr Gly Glu 1925
1930 1935Pro Lys Pro Gln Ile Met Trp Arg Leu Pro
Ser Lys Ala Val Val 1940 1945 1950Asp
Gln Trp Ser Trp Ile His Val Tyr Pro Asn Gly Ser Leu Phe 1955
1960 1965Ile Gly Ser Val Thr Glu Lys Asp Ser
Gly Val Tyr Leu Cys Val 1970 1975
1980Ala Arg Asn Lys Met Gly Asp Asp Leu Ile Leu Met His Val Ser
1985 1990 1995Leu Arg Leu Lys Pro Ala
Lys Ile Asp His Lys Gln Tyr Phe Arg 2000 2005
2010Lys Gln Val Leu His Gly Lys Asp Phe Gln Val Asp Cys Lys
Ala 2015 2020 2025Ser Gly Ser Pro Val
Pro Glu Ile Ser Trp Ser Leu Pro Asp Gly 2030 2035
2040Thr Met Ile Asn Asn Ala Met Gln Ala Asp Asp Ser Gly
His Arg 2045 2050 2055Thr Arg Arg Tyr
Thr Leu Phe Asn Asn Gly Thr Leu Tyr Phe Asn 2060
2065 2070Lys Val Gly Val Ala Glu Glu Gly Asp Tyr Thr
Cys Tyr Ala Gln 2075 2080 2085Asn Thr
Leu Gly Lys Asp Glu Met Lys Val His Leu Thr Val Ile 2090
2095 2100Thr Ala Ala Pro Arg Ile Arg Gln Ser Asn
Lys Thr Asn Lys Arg 2105 2110 2115Ile
Lys Ala Gly Asp Thr Ala Val Leu Asp Cys Glu Val Thr Gly 2120
2125 2130Asp Pro Lys Pro Lys Ile Phe Trp Leu
Leu Pro Ser Asn Asp Met 2135 2140
2145Ile Ser Phe Ser Ile Asp Arg Tyr Thr Phe His Ala Asn Gly Ser
2150 2155 2160Leu Thr Ile Asn Lys Val
Lys Leu Leu Asp Ser Gly Glu Tyr Val 2165 2170
2175Cys Val Ala Arg Asn Pro Ser Gly Asp Asp Thr Lys Met Tyr
Lys 2180 2185 2190Leu Asp Val Val Ser
Lys Pro Pro Leu Ile Asn Gly Leu Tyr Thr 2195 2200
2205Asn Arg Thr Val Ile Lys Ala Thr Ala Val Arg His Ser
Lys Lys 2210 2215 2220His Phe Asp Cys
Arg Ala Glu Gly Thr Pro Ser Pro Glu Val Met 2225
2230 2235Trp Ile Met Pro Asp Asn Ile Phe Leu Thr Ala
Pro Tyr Tyr Gly 2240 2245 2250Ser Arg
Ile Thr Val His Lys Asn Gly Thr Leu Glu Ile Arg Asn 2255
2260 2265Val Arg Leu Ser Asp Ser Ala Asp Phe Ile
Cys Val Ala Arg Asn 2270 2275 2280Glu
Gly Gly Glu Ser Val Leu Val Val Gln Leu Glu Val Leu Glu 2285
2290 2295Met Leu Arg Arg Pro Thr Phe Arg Asn
Pro Phe Asn Glu Lys Ile 2300 2305
2310Val Ala Gln Leu Gly Lys Ser Thr Ala Leu Asn Cys Ser Val Asp
2315 2320 2325Gly Asn Pro Pro Pro Glu
Ile Ile Trp Ile Leu Pro Val Gly Thr 2330 2335
2340Arg Phe Ser Asn Gly Pro Gln Ser Tyr Gln Tyr Leu Ile Ala
Ser 2345 2350 2355Asn Gly Ser Phe Ile
Ile Ser Lys Thr Thr Arg Glu Asp Ala Gly 2360 2365
2370Lys Tyr Arg Cys Ala Ala Arg Asn Lys Val Gly Tyr Ile
Glu Lys 2375 2380 2385Leu Val Ile Leu
Glu Ile Gly Gln Lys Pro Val Ile Leu Thr Tyr 2390
2395 2400Ala Pro Gly Thr Val Lys Gly Ile Ser Gly Glu
Ser Leu Ser Leu 2405 2410 2415His Cys
Val Ser Asp Gly Ile Pro Lys Pro Asn Ile Lys Trp Thr 2420
2425 2430Met Pro Ser Gly Tyr Val Val Asp Arg Pro
Gln Ile Asn Gly Lys 2435 2440 2445Tyr
Ile Leu His Asp Asn Gly Thr Leu Val Ile Lys Glu Ala Thr 2450
2455 2460Ala Tyr Asp Arg Gly Asn Tyr Ile Cys
Lys Ala Gln Asn Ser Val 2465 2470
2475Gly His Thr Leu Ile Thr Val Pro Val Met Ile Val Ala Tyr Pro
2480 2485 2490Pro Arg Ile Thr Asn Arg
Pro Pro Arg Ser Ile Val Thr Arg Thr 2495 2500
2505Gly Ala Ala Phe Gln Leu His Cys Val Ala Leu Gly Val Pro
Lys 2510 2515 2520Pro Glu Ile Thr Trp
Glu Met Pro Asp His Ser Leu Leu Ser Thr 2525 2530
2535Ala Ser Lys Glu Arg Thr His Gly Ser Glu Gln Leu His
Leu Gln 2540 2545 2550Gly Thr Leu Val
Ile Gln Asn Pro Gln Thr Ser Asp Ser Gly Ile 2555
2560 2565Tyr Lys Cys Thr Ala Lys Asn Pro Leu Gly Ser
Asp Tyr Ala Ala 2570 2575 2580Thr Tyr
Ile Gln Val Ile 2585
SEQ ID NO 25, length 663, type PRT, organism Rattus species
misc_feature (322)..(322): "x" can be any amino acid
Sequence 25:
Met Gln Val Arg Gly Arg Glu Val Ser Gly Leu Leu
Ile Ser Leu Thr1 5 10
15Ala Val Cys Leu Val Val Thr Pro Gly Ser Arg Ala Cys Pro Arg Arg
20 25 30Cys Ala Cys Tyr Val Pro Thr
Glu Val His Cys Thr Phe Arg Tyr Leu 35 40
45Thr Ser Ile Pro Asp Gly Ile Pro Ala Asn Val Glu Arg Ile Asn
Leu 50 55 60Gly Tyr Asn Ser Leu Thr
Arg Leu Thr Glu Asn Asp Phe Asp Gly Leu65 70
75 80Ser Lys Leu Glu Leu Leu Met Leu His Ser Asn
Gly Ile His Arg Val 85 90
95Ser Asp Lys Thr Phe Ser Gly Leu Gln Ser Leu Gln Val Leu Lys Met
100 105 110Ser Tyr Asn Lys Val Gln
Ile Ile Arg Lys Asp Thr Phe Tyr Gly Leu 115 120
125Gly Ser Leu Val Arg Leu His Leu Asp His Asn Asn Ile Glu
Phe Ile 130 135 140Asn Pro Glu Ala Phe
Tyr Gly Leu Thr Ser Leu Arg Leu Val His Leu145 150
155 160Glu Gly Asn Arg Leu Thr Lys Leu His Pro
Asp Thr Phe Val Ser Leu 165 170
175Ser Tyr Leu Gln Ile Phe Lys Thr Ser Phe Ile Lys Tyr Leu Phe Leu
180 185 190Ser Asp Asn Phe Leu
Thr Ser Leu Pro Lys Glu Met Val Ser Tyr Met 195
200 205Pro Asn Leu Glu Ser Leu Tyr Leu His Gly Asn Pro
Trp Thr Cys Asp 210 215 220Cys His Leu
Lys Trp Leu Ser Glu Trp Met Gln Gly Asn Pro Asp Ile225
230 235 240Ile Lys Cys Lys Lys Asp Arg
Ser Ser Ser Ser Pro Gln Gln Cys Pro 245
250 255Leu Cys Met Asn Pro Arg Ile Ser Lys Gly Arg Pro
Phe Ala Met Val 260 265 270Pro
Ser Gly Ala Phe Leu Cys Thr Lys Pro Thr Ile Asp Pro Ser Leu 275
280 285Lys Ser Lys Ser Leu Val Thr Gln Glu
Asp Asn Gly Ser Ala Ser Thr 290 295
300Ser Pro Gln Asp Phe Ile Glu Pro Phe Gly Ser Leu Ser Leu Asn Met305
310 315 320Thr Xaa Xaa Ser
Gly Asn Lys Ala Asp Met Val Cys Ser Ile Gln Lys 325
330 335Pro Ser Arg Thr Ser Pro Thr Ala Phe Thr
Glu Glu Asn Asp Tyr Ile 340 345
350Met Leu Asn Ala Ser Phe Ser Thr Asn Leu Val Cys Ser Val Asp Tyr
355 360 365Asn His Ile Gln Pro Val Trp
Gln Leu Leu Ala Leu Tyr Ser Asp Ser 370 375
380Pro Leu Ile Leu Glu Arg Lys Pro Gln Leu Thr Glu Thr Pro Ser
Leu385 390 395 400Ser Ser
Arg Tyr Lys Gln Val Ala Leu Arg Pro Glu Asp Ile Phe Thr
405 410 415Ser Ile Glu Ala Asp Val Arg
Ala Asp Pro Phe Trp Phe Gln Gln Glu 420 425
430Lys Ile Val Leu Gln Leu Asn Arg Thr Ala Thr Thr Leu Ser
Thr Leu 435 440 445Gln Ile Gln Phe
Ser Thr Asp Ala Gln Ile Ala Leu Pro Arg Ala Glu 450
455 460Met Arg Ala Glu Arg Leu Lys Trp Thr Met Ile Leu
Met Met Asn Asn465 470 475
480Pro Lys Leu Glu Arg Thr Val Leu Val Gly Gly Thr Ile Ala Leu Ser
485 490 495Cys Pro Gly Lys Gly
Asp Pro Ser Pro His Leu Glu Trp Leu Leu Ala 500
505 510Asp Gly Ser Lys Val Arg Ala Pro Tyr Val Ser Glu
Asp Gly Arg Ile 515 520 525Leu Ile
Asp Lys Asn Gly Lys Leu Glu Leu Gln Met Ala Asp Ser Phe 530
535 540Asp Ala Gly Leu Tyr His Cys Ile Ser Thr Asn
Asp Ala Asp Ala Asp545 550 555
560Val Leu Thr Tyr Arg Ile Thr Val Val Glu Pro Tyr Gly Glu Ser Thr
565 570 575His Asp Ser Gly
Val Gln His Thr Val Val Thr Gly Glu Thr Leu Asp 580
585 590Leu Pro Cys Leu Ser Thr Gly Val Pro Asp Ala
Ser Ile Ser Trp Ile 595 600 605Leu
Pro Gly Asn Thr Val Phe Ser Gln Pro Ser Arg Asp Arg Gln Ile 610
615 620Leu Asn Asn Gly Thr Leu Arg Ile Leu Gln
Val Thr Pro Lys Asp Gln625 630 635
640Gly His Tyr Gln Cys Val Ala Ala Asn Pro Ser Gly Ala Asp Phe
Ser 645 650 655Ser Phe Lys
Val Ser Val Gln 660
SEQ ID NO 26, length 2469, type DNA, organism Homo sapiens
Sequence 26:
gcggccgcca
cacccgccac cagttcgcca tgaaggtaaa aggcagagga atcacctgct 60tgctggtctc
ctttgctgtg atctgcctgg tcgccacccc tgggggcaag gcctgtcctc 120gccgctgtgc
ctgttatatg cctacggagg tacactgcac atttcggtac ctgacttcca 180tcccagacag
catcccgccc aatgtggaac gcatcaattt aggatacaac agcttggtta 240gattgatgga
aacagatttt tctggcctga ccaaactgga gttactcatg cttcacagca 300atggcattca
cacaatccct gacaagacct tctcagattt gcaggccttg caggtcttaa 360aaatgagcta
taataaagtc cgaaaacttc agaaagatac tttttatggc ctcaggagct 420tgacacgatt
gcacatggac cacaacaata ttgagtttat aaacccagag gttttttatg 480ggctcaactt
tctccgcctg gtgcacttgg aaggaaatca gctcactaag ctccacccag 540atacatttgt
ctctttgagc tacctccaga tatttaaaat ctctttcatt aagttcctat 600acttgtctga
taacttcctg acctccctcc ctcaagagat ggtctcctat atgcctgacc 660tagacagcct
ttacctgcat ggaaacccat ggacctgtga ttgccattta aagtggttgt 720ctgactggat
acaggagaag ccagatgtaa taaaatgcaa aaaagataga agtccctcta 780gtgctcagca
gtgtccactt tgcatgaacc ctaggacttc taaaggcaag ccgttagcta 840tggtctcagc
tgcagctttc cagtgtgcca agccaaccat tgactcatcc ctgaaatcaa 900agagcctgac
tattctggaa gacagtagtt ctgctttcat ctctccccaa ggtttcatgg 960caccctttgg
ctccctcact ttgaatatga cagatcagtc tggaaatgaa gctaacatgg 1020tctgcagtat
tcaaaagccc tcaaggacat cacccattgc attcactgaa gaaaatgact 1080acatcgtgct
aaatacttca ttttcaacat ttttggtgtg caacatagat tacggtcaca 1140ttcagccagt
gtggcaaatt ttggctttgt acagtgattc tcctctgata ctagaaagga 1200gccacttgct
tagtgaaaca ccgcagctct attacaaata taaacaggtg gctcctaagc 1260ctgaagacat
ttttaccaac atagaggcag atctcagagc agatccctct tggttaatgc 1320aagaccaaat
ttccttgcag ctgaacagaa ctgccaccac attcagtaca ttacagatcc 1380agtactccag
tgatgctcaa atcactttac caagagcaga gatgaggcca gtgaaacaca 1440aatggactat
gatttcaagg gataacaata ctaagctgga acatactgtc ttggtaggtg 1500gaaccgttgg
cctgaactgc ccaggccaag gagaccccac cccacacgtg gattggcttc 1560tagctgatgg
aagtaaagtg agagcccctt atgtcagtga ggatggacgg atcctaatag 1620acaaaagtgg
aaaattggaa ctccagatgg ctgatagttt tgacacaggc gtatatcact 1680gtataagcag
caattatgat gatgcagata ttctcaccta taggataact gtggtagaac 1740ctttggtcga
agcctatcag gaaaatggga ttcatcacac agttttcatt ggtgaaacac 1800ttgatcttcc
atgccattct actggtatcc cagatgcctc tattagctgg gttattccag 1860gaaacaatgt
gctctatcag tcatcaagag acaagaaagt tctaaacaat ggcacattaa 1920gaatattaca
ggtcaccccg aaagaccaag gttattatcg ctgtgtggca gccaacccat 1980caggggttga
ttttttgatt ttccaagttt cagtcaagat gaaaggacaa aggcccttgg 2040agcatgatgg
agaaacagag ggatctggac ttgatgagtc caatcctatt gctcatctta 2100aggagccacc
aggtgcacaa ctccgtacat ctgctctgat ggaggctgag gttggaaaac 2160acacctcaag
cacaagtaag aggcacaact atcgggaatt aacactccag cgacgtggag 2220attcaacaca
tcgacgtttt agggagaata ggaggcattt ccctccctct gctaggagaa 2280ttgacccaca
acattgggcg gcactgttgg agaaagctaa aaagaatgct atgccagaca 2340agcgagaaaa
taccacagtg agcccacccc cagtggtcac ccaactccca aacatacctg 2400gtgaagaaga
cgattcctca ggcatgctcg ctctacatga ggaatttatg gtcccggcca 2460ctaaagctt
2469
SEQ ID NO 27, length 3518, type DNA, organism Homo sapiens
Sequence 27:
aagctttgaa ccttccagca aggacagtga ctgctgactc cagaacaata
tctgatagtc 60ctatgacaaa cataaattat ggcacagaat tctctcctgt tgtgaattca
caaatactac 120cacctgaaga acccacagat ttcaaactgt ctactgctat taaaactaca
gccatgtcaa 180agaatataaa cccaaccatg tcaagccaaa tacaaggcac aaccaatcaa
cattcatcca 240ctgtctttcc actgctactt ggagcaactg aatttcagga ctctgaccag
atgggaagag 300gaagagagca tttccaaagt agacccccaa taacagtaag gactatgatc
aaagatgtca 360atgtcaaaat gcttagtagc accaccaaca aactattatt agagtcagta
aataccacaa 420atagtcatca gacatctgta agagaagtga gtgaacccag gcacaatcac
ttctattctc 480acactactca aatacttagc acctccacgt tcccttcaga tccacacaca
gctgctcatt 540ctcagtttcc gatccctaga aatagtacag ttaacatccc gctgttcaga
cgctttggga 600ggcagaggaa aattggcgga agggggcgga ttatcagccc atatagaact
ccagttctgc 660gacggcatag atacagcatt ttcaggtcaa caaccagagg ttcttctgaa
aaaagcacta 720ctgcattctc agccacagtg ctcaatgtga catgtctgtc ctgtcttccc
agggagaggc 780tcaccactgc cacagcagca ttgtcttttc caagtgctgc tcccatcacc
ttccccaaag 840ctgacattgc tagagtccca tcagaagagt ctacaactct agtccagaat
ccactattac 900tacttgagaa caaacccagt gtagagaaaa caacacccac aataaaatat
ttcaggactg 960aaatttccca agtgactcca actggtgcag tcatgacata tgctccaaca
tccataccca 1020tggaaaaaac tcacaaagta aacgccagtt acccacgtgt gtctagcacc
aatgaagcta 1080aaagagattc agtgattaca tcgtcacttt caggtgctat caccaagcca
ccaatgacta 1140ttatagccat tacaaggttt tcaagaagga aaattccctg gcaacagaac
tttgtaaata 1200accataaccc aaaaggcaga ttaaggaatc aacataaagt tagtttacaa
aaaagcacag 1260ctgtgatgct tcctaaaaca tctcctgctt tacccagaga caaagtctcc
cctttccatt 1320tcaccacact ttcaacaagt gtgatgcaaa ttccatctaa taccttgact
accgctcacc 1380acactacgac caaaacacac aatcctggaa gtcttccaac aaagaaggag
cttcccttcc 1440caccccttaa ccctatgctt cctagtatta taagcaaaga ctcaagtaca
aaaagcatca 1500tatcaacgca aacagcaata ccagcaacaa ctcctacctt ccctgcatct
gtcatcactt 1560atgaaaccca aacagagaga tctagagcac aaacaataca aagagaacag
gagcctcaaa 1620agaagaacag gactgaccca aacatctctc cagaccagag ttctggcttc
actacaccca 1680ctgctatgac acctcctgtt ctaaccacag ccgaaacttc agtcaagccc
agtgtctctg 1740cattcactca ttccccacca gaaaacacaa ctgggatttc aagcacaatc
agttttcatt 1800caagaactct taatctgaca gatgtgattg aagaactagc ccaagcaagt
actcagactt 1860tgaagagcac aattgcttct gaaacaactt tgtccagcaa atcacaccag
agtaccacaa 1920ctaggaaagc aatcattaga cactcaacca taccaccatt cttgagcagc
agtgctactc 1980taatgccagt tcccatctcc cctcccttta ctcagagagc agttactgac
aacgtggcga 2040ctcccatttc cgggcttatg acaaatacag tggtcaagct gcacgaatcc
tcaaggcaca 2100atgctaaacc acagcaatta gtagcagagg ttgcaacatc ccccaaggtt
cacccaaatg 2160ccaagttcac aattggaacc actcacttca tctactctaa tctgttacat
tctactccca 2220tgccagcact aacaacagtt aaatcacaga attctaaatt aactccatct
ccctgggcag 2280aaaaccaatt ttggcacaaa ccatactcag aaattgctga aaaaggcaaa
aagccagaag 2340taagcatgtt ggctactaca ggcctgtccg aggccaccac tcttgtttca
gattgggatg 2400gacagaagaa cacaaagaag agtgactttg ataagaaacc agttcaagaa
gcaacaactt 2460ccaaactcct tccctttgac tctttgtcta ggtatatatt tgaaaagccc
aggatagttg 2520gaggaaaagc tgcaagtttt actattccag ctaactcaga tgcctttctt
ccctgtgaag 2580ctgttggaaa tcccctgccc accattcatt ggaccagagt cccatcagga
cttgatttat 2640ctaagaggaa acagaatagc agggtccagg ttctccccaa tggtaccctg
tccatccaga 2700gggtggaaat tcaggaccgc ggacagtact tgtgttccgc atccaatctg
tttggcacag 2760accaccttca tgtcaccttg tctgtggttt cctatcctcc caggatcctg
gagagacgta 2820ccaaagagat cacagttcat tccggaagca ctgtggaact gaagtgcaga
gcagaaggta 2880ggccaagccc tacagttacc tggattcttg caaaccaaac agttgtctca
gaatcatccc 2940agggaagtag gcaggctgtg gtgacggttg acggaacatt ggtcctccac
aatctcagta 3000tttatgaccg tggcttttac aaatgtgtgg ccagcaaccc aggtggccag
gattcactgc 3060tggttaaaat acaagtcatt gcagcaccac ctgttattct agagcaaagg
aggcaagtca 3120ttgtaggcac ttggggtgaa agtttaaaac tgccctgtac tgcaaaagga
actcctcagc 3180ccagcgttta ctgggtcctc tctgatggca ctgaagtgaa accattacag
tttaccaatt 3240ccaagttgtt cttattttca aatgggactt tgtatataag aaacctagcc
tcttcagaca 3300ggggcactta tgaatgcatt gctaccagtt ccactggttc ggagcgaaga
gtagtaatgc 3360ttacaatgga agagcgagtg accagcccca ggatagaagc tgcatcccag
aaaaggactg 3420aagtgaattt tggggacaaa ttactactga actgctcagc cactggggag
cccaaacccc 3480aaataatgtg gaggttacca tccaaggctg tggtcgac
3518
SEQ ID NO 28, length 1950, type DNA, organism Homo sapiens
Sequence 28:
gtcgaccagc agcatagagt gggcagctgg
atccacgtct accctaatgg atccctgttt 60attggatcag taacagaaaa agacagtggt
gtctacttgt gtgtggcaag aaacaaaatg 120ggggatgatc tgatactgat gcatgttagc
ctaagactga aacctgccaa aattgaccac 180aagcagtatt ttagaaagca agtgctccat
gggaaagatt tccaagtaga ttgcaaagct 240tccggctccc cagtgccaga gatatcttgg
agtttgcctg atggaaccat gatcaacaat 300gcaatgcaag ccgatgacag tggccacagg
actaggagat ataccctttt caacaatgga 360actttatact tcaacaaagt tggggtagcg
gaggaaggag attatacttg ctatgcccag 420aacaccctag ggaaagatga aatgaaggtc
cacttaacag ttataacagc tgctccccgg 480ataaggcaga gtaacaaaac caacaagaga
atcaaagctg gagacacagc tgtccttgac 540tgtgaggtca ctggggatcc caaaccaaaa
atattttggt tgctgccttc caatgacatg 600atttccttct ccattgatag gtacacattt
catgccaatg ggtctttgac catcaacaaa 660gtgaaactgc tcgattctgg agagtacgta
tgtgtagccc gaaatcccag tggggatgac 720accaaaatgt acaaactgga tgtggtctct
aaacctccat taatcaatgg tctgtataca 780aatagaactg ttattaaagc cacagctgtg
agacattcca aaaaacactt tgactgcaga 840gctgaaggga caccatctcc tgaagtcatg
tggatcatgc cagacaatat tttcctcaca 900gccccatact atggaagcag aatcacagtc
cataaaaatg gaaccttgga aattaggaat 960gtgaggcttt cagattcagc cgactttatc
tgtgtggccc gaaatgaagg tggagagagc 1020gtgttggtag tacagttaga agtactggaa
atgctgagaa gaccgacatt tagaaatcca 1080tttaatgaaa aaatagttgc ccagctggga
aagtccacag cattgaattg ctctgttgat 1140ggtaacccac cacctgaaat aatctggatt
ttaccaaatg gcacacgatt ttccaatgga 1200ccacaaagtt atcagtatct gatagcaagc
aatggttctt ttatcatttc taaaacaact 1260cgggaggatg caggaaaata tcgctgtgca
gctaggaata aagttggcta tattgagaaa 1320ttagtcatat tagaaattgg ccagaagcca
gttattctta cctatgcacc agggacagta 1380aaaggcatca gtggagaatc tctatcactg
cattgtgtgt ctgatggaat ccctaagcca 1440aatatcaaat ggactatgcc aagtggttat
gtagtagaca ggcctcaaat taatgggaaa 1500tacatattgc atgacaatgg caccttagtc
attaaagaag caacagctta tgacagagga 1560aactatatct gtaaggctca aaatagtgtt
ggtcatacac tgattactgt tccagtaatg 1620attgtagcct accctccccg aattacaaat
cgtccaccca ggagtattgt caccaggaca 1680ggggcagcct ttcagctcca ctgtgtggcc
ttgggagttc ccaagccaga aatcacatgg 1740gagatgcctg accactccct tctctcaacg
gcaagtaaag agaggacaca tggaagtgag 1800cagcttcact tacaaggtac cctagtcatt
cagaatcccc aaacctccga ttctgggata 1860tacaaatgca cagcaaagaa cccacttggt
agtgattatg cagcaacgta tattcaagta 1920atccaccacc accaccacca ttgaactagt
1950
SEQ ID NO 29, length 9109, type DNA, organism Homo sapiens
Sequence 29:
atgcccaagc
gcgcgcactg gggggccctc tctgtggtgc tgatcctgct ttggggtcat 60ccgcgagtgg
cgctggcctg ccctcatcct tgtgcctgct acgtccccag cgaggtccac 120tgcacgttcc
gatccctggc ttctgtgccc gctggcattg ctaaacatgt ggaaagaatc 180aatttggggt
ttggaattct gaagtgtaaa aaggacaaag cttatgaagg cggtcagttg 240tgtgcaatgt
gcttcagtcc aaagaagttg tacaaacatg agattcacaa gctgaaggac 300ctgacttgtc
tgaagccttc catagagtct cctctgagac agaacaggag caggagtatt 360gaggaggagc
aaaaacaaga agagaatggt gacagccagc tcatcctgga gaaaatccaa 420cttccccagt
ggagcatctc tttgaatatg actgatgagc acgggaacct ggtgaacttg 480gtgtgtgaca
tcaagaaacc aatggatgtg tacaaaattc acttgaacca aacagatcct 540ccagatattg
acataaatgc aatggttgcc ttggactttg agtatccaat gacccaggaa 600aactatgaaa
atctatggaa attgatagca tactacagtg aagttcccat gaagctacac 660agagagctca
tgctcagcaa acaccccaga gtcagctacc agtacaggca agatgccgat 720gaagaagctc
tttactacac aggtgtgaga gcccagattc ttgcagaacc agaatggatc 780atgcagccat
ccatagatat ccagctgaac cgacctcaga gtacggccaa gaaggtgcta 840ctttcctact
acaaccagta ttctcaaaca atagccacca aagatacaag gcaggctcgg 900ggcagaagct
gggtaatgat tgagcctagt agagctgtgc aaaaagatca gactgtcctg 960gaagggggtc
gatgccagtt gagctgcaat gtgaaagctt ctgagagtcc atctatcttc 1020tgggtgcttc
cagatggctc catcctgaaa gtgcctgtgg atgacccaga cagcaagttc 1080tccattctca
gcagtggctg gctgaggatc aagtccatgg agccatctga ctcgggcttg 1140taccagtgca
ttgctcaagt gagggatgaa atggaccgca tggtatatag ggtacttgtg 1200cagtctccct
ccactcagcc agccgagaaa gacacagtga caattggcaa gaacccaggg 1260gagccagtga
tgttgccttg caatgcttta gctatacccg aagcccacct tagctggatt 1320cttccaaaca
gaaggataat taatgatttg gctaacacat cacatgtata catgctgcca 1380aatggaactc
tttccatccc aaaggtccaa gtcagtgaca gtggttacca cagatgtgtg 1440gctgtcaacc
agcatggggc agaccatatc acggtgggaa tcacagtgac caagaaaggt 1500tctggctcgc
catccaaaag aggcagatgg ccaggtccaa aggctctttc cagagtgaga 1560gaagacatcg
tggaggatga aggggtctca ggcacgggag atgaagagaa cacttcaagg 1620agacttctac
atccaaagca ccaagaggcg ttcctcaaaa caaaggatga tgccatcaat 1680ggagataaga
aagccaagaa agggagaaga aagctgaaac tctggaagca ttcagaaaaa 1740gaaccagaga
ccagtgttgc agaagatctc agagtgtttg aatcaagacg aaggataaac 1800gtggcaaaca
aacagattaa tccggagcac tgggctgata ttttagccaa agtctttggg 1860aaaaatctcc
ctacaggcac agaagtatcc ccaattatta aaaccacaag ttctccattc 1920ttgagcctag
tagtcacacc acctttgcct gctgtttctc cccccttggc atctccaata 1980cagacagcaa
caagtgctga agaatcctca gcagatgtac ctctactcag cgaaggaaag 2040cacattttga
gtaccatttc ctcagccagc atgggactag aacaccacaa caatggagtt 2100attcttgttg
aacctgaagt aacaagcaca cctctggaag aagttgttga tgagtattcc 2160aagaagactg
aggagatgac ttccactgaa ggcgacctga aggggactgc agcctctaca 2220cttatatctg
agccttatga acaatctcct actctacaca ccttagacac agtctatgaa 2280gagcccaccc
atgaagagac ggaaacagag ggttggtctg cagcagatgt tggatcctca 2340ccagatccca
catccagtga gtatgagctt ccattggttg ttgtctcctt ggctgagtct 2400aagcctgtgc
aatactttga cccagatttg gagactaatt cacaaccaca tgaggataac 2460ataaaagaat
acagttttgc acaccttact ccaaccgcca tcatctggtt taatgactct 2520agtacatcac
tgtcatttga ggattctact gtaggggaac aaggtgtccc aggcaaatca 2580catctacaag
gaccgacaga gaacatccag cttgtgaaaa gtagttttag cactcaagac 2640accttattga
ttaaaaaagg tatgaaagag atgtctcaga cactacaggg aggaaatatg 2700ctagagggag
accctacaca ctccagaagt tctgagaatg agggccaaga gagcaaatcc 2760atcactttac
ctgactccac actgggtata acgagcagta cgtctccagt taagaagcct 2820gcggaaacca
cagttgtcac cctgctacac aaagacacca caacagaaac aactccaagg 2880caaaaagtgg
cttcatcatc caccatgagc actcaccctt ctcgaaggag acccaatggg 2940agaaaattac
accctcacaa attccaccac cggcacaagc aaaccccacc cacaactttt 3000gctccattag
agactttttc tactcaacca actcaagcaa ctgacattaa gatttcaaat 3060caaatggaga
gttctctggt tcctacatct tgggagatta acacagttaa tacccccaaa 3120cagctggaaa
tggagaagaa tgtagagctc atatcaaagg gaactccacg gagaaaacac 3180gggaagaggc
caaacaaaca tcgatatacc ccttctacag tgagttcaag agcatctgca 3240tccaagccca
gcccttctcc agaaaataaa catagaaaca ttgttactcc cagttcagaa 3300actacacttt
tgcctagaaa tgtttctctg aaaactgagg gcgtttatga ttccttagat 3360tacacgacaa
ccaccagaaa aatacattca tctcaccata aagtccaaga cacacttcca 3420gtcatgtata
aacccacatc agatggaaaa gaaattcagg atgatgttgc cacaaatgtt 3480gacaaacata
aaagtgacat tttagtccct ggtgagtcaa ttacaaatgt cacacaaact 3540tctcgctcct
tggtctccac tatgggagaa tttaaggaag aatcctctcc tgtgggcttt 3600ccaggaattc
caacctggaa tccctcaagg aaagctcagc ctgggaggct acagacagac 3660atacatgtta
ccacttctgg ggaaacccct acagaccctc cccttgttaa cgagcttgag 3720gatgtggatt
ttacttctga gtttttgtcc tctgtgacag tctccacacc atttcaccag 3780gaagaagctg
gtttttccac aattctctca agcataaaag tggagatggc ttcaagtcag 3840gtagaaacta
ccacccttgg tcaagatcat catgaaacca ctgtggctat tctccactct 3900gaaactagac
cacagaatca catccttact gctgcctgga tgaaggagcc agcatctttg 3960tcccctccca
tgattctcct gtctttggga caaaccacca ccactaagcc agaacttctc 4020agtccaagaa
catctcaaat atgtaaagat tccaaggaaa atgttttctt gaattacatg 4080gggaatccag
aaacagaagc aaccccagtg aaaaatgaag gaacacagcg tatgtcaggg 4140ccaaatgaat
tatcaacacc atcttctgac cacgatgcat ttaacttgtc tacaaagcta 4200gaattggaaa
agcaagtatt tgatagtagg agtctaacac gtggcccaga tagccaccac 4260caggatggaa
gagttcatgc ttctcatcaa ctaaccagaa tccctgccaa acccatccta 4320ccaacaggaa
cagtgaggct gcctgaaatg tccacacaaa gcacttccag atactttgta 4380actttccagc
cacctcatca cgggaccaac aaaccagaaa taactacata tccttctagg 4440gctttgccag
agagcaaaca gtttacaact ccaagagtag caagtacaac tcctctccta 4500tcacacatgt
ccaaacccag catttctagt aagtttgctg acctaagaac tgaccaatcc 4560aatggctcct
acaaagtgtt tggaaatagc aacatccctg aggcaagaaa ctcagttgga 4620aagcctctca
gtccaagaat ttatcattat tccaatggaa gactcccttt ctttaccaac 4680aggactcttt
ctttttcaca gttgggagtc acccggagac cccagatacc ctcttctcct 4740gtcccagtaa
tgagagagag aaaagttaat ccaggttcct acaataggat atattcccat 4800agcaccttcc
atctggactt tggccttcca gcacctccac tgttgcacac tccatggacc 4860atggtatcac
ccccaactaa cttacagaat atccctatgg tctcatccac ccagagttct 4920gtctccttta
taacatcttc tgtccagtcc tcaggaagca tccaccaaag cggctcaaag 4980ttctttgcag
gaggaccgcc tgcatccaaa ttctggcctc ttggggaaaa gccccaaatc 5040ctcaccaagt
ccccacagac tgtgtctgtc actgctgaaa cggacgctgt gttcccgtgt 5100gaggcaatag
gaaaaccaaa gcctttcgtt acttggacaa aagtttccac aggagttctt 5160atgactccga
ataccaggat acaacggttt gaggttctca agaacggtac cttagtgata 5220aggaagtttc
aagtgcaaga tcgaggccag tatatgtgca ccgccagcaa cctgtacggc 5280ctggacagga
tggtggtctt tctctgggtc accgtgcagc aacctcaaat cctagcctcc 5340cactaccagg
acgtcaccgt ctacctggga gacaccatta caatggagtg tctggcgaaa 5400gggaccccag
ccccccaaat ttcctggatc ttccgtgaca ggagggtgtg gcaaactctg 5460tcctccgtgg
agggccggat caccctgcac caaaaccgga ccctttccat caaggaggcg 5520tccttctcag
acagaggcgt ctataagtgc gtggccagca acgcaacccg ggcggacagc 5580gtgtccatcc
gcctacacgt ggcggcactg ccccccatta tccaccagga gaagctggag 5640aacatctcgc
tgcccccggg gctcagcatt cacattcact gcactgccaa agctgcgccc 5700ctgcccagcg
tgctctgggt gctcggggat ggtacccaaa tccgcccctc gcatttcctc 5760caccggaact
tgtttgtttt ccccaacggg acgctctaca tctgcaacct cgcgcccaag 5820gacagcgggc
gctatgagtg cgtggccgcc aacctgatcg gctccgcgcg cagtacggtg 5880cagctgaacg
tgcagcgcgc agcagcgaac gcgcgcatca cgggcacctc ctcgcagagg 5940acggacgtca
ggtacggagg gaccctcaag ctggactgca gcgcctcggg ggatccctgg 6000ccgcgcatcc
tctggaggct gccgtccaag aggacgatcg acgcgctttt cagttttgat 6060agtagaatca
aggtgtttgc caacaggacc ctggtggtga aatcaatgac agacaaagac 6120gccggagatt
acctgtgtgt agctcgaaat aaggttggtg atgactgcgt ggtgctcaag 6180gtggatgtga
tgatgaaacc ggccaagatt gaacacaagg aggagaacga ccacaaagtc 6240ttctacaggg
gtgacctgaa agtggactgt gtggccactg gacttcccaa tcccgagatc 6300tcctggagcc
tcctggatgg gagtctggtg aactccttca tgcagtcaga tgacagtggt 6360ggacgcacca
agcactatgt ggtcttcaac aatgggacac tctacttcag tgaagtgggg 6420atgagggagg
aaggagacta cacctgcttt gctgaaaatc aggttgggaa ggatgagatg 6480agagtcagag
tcaagatggt gacacctgcc accatctgga acaagactta cttggcagtt 6540caggtaccct
atggagatgt ggtcactgta acctgtgagg ccaaaggaga acccatgccc 6600aaggtgactt
ggttgtcccc agccaacagg gtgatcccca cctcctctga gaagtatcag 6660atataccaat
atggcactct ccttattcag aaagcccagt gctctgacag cggcaactac 6720acctgcctgg
tcaggaacag tgccggagag gataggaaga cagtgtggat tcacgtcaac 6780ctccagccac
ccaagatcaa tggtaacccc aaccccatca ccaccgtgtg ggagatagca 6840gccgggggca
gtcggaaact gattgactgc aaagctgaag gcatccccac cccgagggtg 6900ttatgggctt
ttcccgaggg tgtggttctg ccagatccat actatggaaa ccggatcact 6960gtccatggca
acggttccct ggacatcagg agtttgagga agagcgactc cgtccagctg 7020gtatgcatgg
cacgcaacga gggaggggag gcgaggttga tcgtgcagct cactgtcctg 7080gagcccatgg
agaaacccat cttccacgac ccgatcagcg agaagatcac ggccatggcg 7140ggccacacca
tcagcctcaa ctgctctgcc gcggggaccc tgacacccag cctggtgtgg 7200gtccttccca
atggcaccga tctgcagagt ggacagcagc tgcagcgctt ctaccacaag 7260gctgacggca
tgctacacat tagcggtctc tcctcggtgg acgccggggc ctaccgctgc 7320gtggcccgca
atgccgcggg ccacacggag aggctggtct ccctgaaggt gggactgaag 7380ccagaagcaa
acaagcagta tcataacctg gtcagcatca tcaatggtga gaccctgaag 7440ctcccctgca
cccctcctgc agctgggcag ggacatttct cctggacact ccccaatggc 7500atgcatctgg
agggccccca aaccctggga cgcgtttctc ttctggacaa tggcaccctc 7560acggttcgtg
aggcctcggt gtttgacagg ggtacctatg tatgcaggat ggagacggcg 7620tacggccctt
cggtcaccag catccccgtg attgtgatcg cctatcctcc ccggatcacc 7680agcgagccta
ccccagtcat ctacacccgt cccgggaaca ccgtgaaact gaactgcatg 7740gctatgggga
ttcccaaagg tgacatcacg tgggagttac cggataagtt gcatctgaag 7800gcaggggttc
aggctcgtct gtatggaaac agatttcttc acccccaggg atcactgacc 7860atccagcagg
ccagacggag agacgctggc ttctacaagt gcacggcaaa aaacattctc 7920agcagtgact
ccaaaacaac ttatatccat gtcttctgaa atgtggattc cagaatgatt 7980gctcaggaac
tgacaacaaa gcggggtttg taagggaagc caggctgggg aatcagagct 8040cttaaataat
gtgtcacagt gcatggtggc ccccggtggg attcaagttg aggttgatct 8100tgatctacaa
ttgttgggaa aaggaagcaa tacagacatg agtaaaaggg ctcagcctca 8160ctgagaactt
tcttttgtgt ttacatcatg ccaggggctt cattcagggt gtctgtgctc 8220tgactgtaat
ttttattttt ttgcaaatgt cattcgactg cctgcgtaag tgtccatagg 8280atatctgagg
aacattcacc gaaaataagc catagacatg aacaacacct cactccccca 8340ttgaagatgc
atcgtctagt taacctgctg cagtttttac atgatagact ttgttccaga 8400ttgacaagtc
atctttcagt tatttcctct atcacttcaa aactccagct tgcccaataa 8460ggatttagaa
ctagagtgat tgttatatat ataatatata tattttaatt cagagttaca 8520tacatacagc
taccatttta tatgaaaaaa acatttcttc ctggaaccca ctttttatgt 8580aattttttta
tataaatatt tttcctttca aatcagatga tgagactaga aggagaaata 8640ctttctgtct
cattaaaatt aataaatgat tggtctttac aagacttgga tacattacag 8700cagacatgga
aatagaattt taaacaattc ctctccaacc tccttcaaat tcagtcgcta 8760ctgttatgtt
actttctcca gcaaccctgc actggggaag gctgtgatat tagatttcct 8820tgtatgcaaa
gtttttgttg aaagctgtgc tcagcggagg tgagaggaga ggaggagaaa 8880actgcatcat
atctttccag aattgaatct agagtcttcc ctggaaagcc cagaaacttc 8940tctgcagtat
ctgacttgtc catctggtct aaggtggctg cttcttccgc aaccatgagt 9000tagtctgtgt
ccatgaataa tacaagatct gttatttcca tgactgcttt actgtaattt 9060tagggtcaat
atactgtaca tttgataata aaatatattc tcccaaaaa
9109
SEQ ID NO 30, length 2652, type PRT, organism Homo sapiens
Sequence 30:
Met Pro Lys Arg Ala His Trp Gly Ala Leu Ser
Val Val Leu Ile Leu1 5 10
15Leu Trp Gly His Pro Arg Val Ala Leu Ala Cys Pro His Pro Cys Ala
20 25 30Cys Tyr Val Pro Ser Glu Val
His Cys Thr Phe Arg Ser Leu Ala Ser 35 40
45Val Pro Ala Gly Ile Ala Lys His Val Glu Arg Ile Asn Leu Gly
Phe 50 55 60Gly Ile Leu Lys Cys Lys
Lys Asp Lys Ala Tyr Glu Gly Gly Gln Leu65 70
75 80Cys Ala Met Cys Phe Ser Pro Lys Lys Leu Tyr
Lys His Glu Ile His 85 90
95Lys Leu Lys Asp Leu Thr Cys Leu Lys Pro Ser Ile Glu Ser Pro Leu
100 105 110Arg Gln Asn Arg Ser Arg
Ser Ile Glu Glu Glu Gln Lys Gln Glu Glu 115 120
125Asn Gly Asp Ser Gln Leu Ile Leu Glu Lys Ile Gln Leu Pro
Gln Trp 130 135 140Ser Ile Ser Leu Asn
Met Thr Asp Glu His Gly Asn Leu Val Asn Leu145 150
155 160Val Cys Asp Ile Lys Lys Pro Met Asp Val
Tyr Lys Ile His Leu Asn 165 170
175Gln Thr Asp Pro Pro Asp Ile Asp Ile Asn Ala Met Val Ala Leu Asp
180 185 190Phe Glu Tyr Pro Met
Thr Gln Glu Asn Tyr Glu Asn Leu Trp Lys Leu 195
200 205Ile Ala Tyr Tyr Ser Glu Val Pro Met Lys Leu His
Arg Glu Leu Met 210 215 220Leu Ser Lys
His Pro Arg Val Ser Tyr Gln Tyr Arg Gln Asp Ala Asp225
230 235 240Glu Glu Ala Leu Tyr Tyr Thr
Gly Val Arg Ala Gln Ile Leu Ala Glu 245
250 255Pro Glu Trp Ile Met Gln Pro Ser Ile Asp Ile Gln
Leu Asn Arg Pro 260 265 270Gln
Ser Thr Ala Lys Lys Val Leu Leu Ser Tyr Tyr Asn Gln Tyr Ser 275
280 285Gln Thr Ile Ala Thr Lys Asp Thr Arg
Gln Ala Arg Gly Arg Ser Trp 290 295
300Val Met Ile Glu Pro Ser Arg Ala Val Gln Lys Asp Gln Thr Val Leu305
310 315 320Glu Gly Gly Arg
Cys Gln Leu Ser Cys Asn Val Lys Ala Ser Glu Ser 325
330 335Pro Ser Ile Phe Trp Val Leu Pro Asp Gly
Ser Ile Leu Lys Val Pro 340 345
350Val Asp Asp Pro Asp Ser Lys Phe Ser Ile Leu Ser Ser Gly Trp Leu
355 360 365Arg Ile Lys Ser Met Glu Pro
Ser Asp Ser Gly Leu Tyr Gln Cys Ile 370 375
380Ala Gln Val Arg Asp Glu Met Asp Arg Met Val Tyr Arg Val Leu
Val385 390 395 400Gln Ser
Pro Ser Thr Gln Pro Ala Glu Lys Asp Thr Val Thr Ile Gly
405 410 415Lys Asn Pro Gly Glu Pro Val
Met Leu Pro Cys Asn Ala Leu Ala Ile 420 425
430Pro Glu Ala His Leu Ser Trp Ile Leu Pro Asn Arg Arg Ile
Ile Asn 435 440 445Asp Leu Ala Asn
Thr Ser His Val Tyr Met Leu Pro Asn Gly Thr Leu 450
455 460Ser Ile Pro Lys Val Gln Val Ser Asp Ser Gly Tyr
His Arg Cys Val465 470 475
480Ala Val Asn Gln His Gly Ala Asp His Ile Thr Val Gly Ile Thr Val
485 490 495Thr Lys Lys Gly Ser
Gly Ser Pro Ser Lys Arg Gly Arg Trp Pro Gly 500
505 510Pro Lys Ala Leu Ser Arg Val Arg Glu Asp Ile Val
Glu Asp Glu Gly 515 520 525Val Ser
Gly Thr Gly Asp Glu Glu Asn Thr Ser Arg Arg Leu Leu His 530
535 540Pro Lys His Gln Glu Ala Phe Leu Lys Thr Lys
Asp Asp Ala Ile Asn545 550 555
560Gly Asp Lys Lys Ala Lys Lys Gly Arg Arg Lys Leu Lys Leu Trp Lys
565 570 575His Ser Glu Lys
Glu Pro Glu Thr Ser Val Ala Glu Asp Leu Arg Val 580
585 590Phe Glu Ser Arg Arg Arg Ile Asn Val Ala Asn
Lys Gln Ile Asn Pro 595 600 605Glu
His Trp Ala Asp Ile Leu Ala Lys Val Phe Gly Lys Asn Leu Pro 610
615 620Thr Gly Thr Glu Val Ser Pro Ile Ile Lys
Thr Thr Ser Ser Pro Phe625 630 635
640Leu Ser Leu Val Val Thr Pro Pro Leu Pro Ala Val Ser Pro Pro
Leu 645 650 655Ala Ser Pro
Ile Gln Thr Ala Thr Ser Ala Glu Glu Ser Ser Ala Asp 660
665 670Val Pro Leu Leu Ser Glu Gly Lys His Ile
Leu Ser Thr Ile Ser Ser 675 680
685Ala Ser Met Gly Leu Glu His His Asn Asn Gly Val Ile Leu Val Glu 690
695 700Pro Glu Val Thr Ser Thr Pro Leu
Glu Glu Val Val Asp Glu Tyr Ser705 710
715 720Lys Lys Thr Glu Glu Met Thr Ser Thr Glu Gly Asp
Leu Lys Gly Thr 725 730
735Ala Ala Ser Thr Leu Ile Ser Glu Pro Tyr Glu Gln Ser Pro Thr Leu
740 745 750His Thr Leu Asp Thr Val
Tyr Glu Glu Pro Thr His Glu Glu Thr Glu 755 760
765Thr Glu Gly Trp Ser Ala Ala Asp Val Gly Ser Ser Pro Asp
Pro Thr 770 775 780Ser Ser Glu Tyr Glu
Leu Pro Leu Val Val Val Ser Leu Ala Glu Ser785 790
795 800Lys Pro Val Gln Tyr Phe Asp Pro Asp Leu
Glu Thr Asn Ser Gln Pro 805 810
815His Glu Asp Asn Ile Lys Glu Tyr Ser Phe Ala His Leu Thr Pro Thr
820 825 830Ala Ile Ile Trp Phe
Asn Asp Ser Ser Thr Ser Leu Ser Phe Glu Asp 835
840 845Ser Thr Val Gly Glu Gln Gly Val Pro Gly Lys Ser
His Leu Gln Gly 850 855 860Pro Thr Glu
Asn Ile Gln Leu Val Lys Ser Ser Phe Ser Thr Gln Asp865
870 875 880Thr Leu Leu Ile Lys Lys Gly
Met Lys Glu Met Ser Gln Thr Leu Gln 885
890 895Gly Gly Asn Met Leu Glu Gly Asp Pro Thr His Ser
Arg Ser Ser Glu 900 905 910Asn
Glu Gly Gln Glu Ser Lys Ser Ile Thr Leu Pro Asp Ser Thr Leu 915
920 925Gly Ile Thr Ser Ser Thr Ser Pro Val
Lys Lys Pro Ala Glu Thr Thr 930 935
940Val Val Thr Leu Leu His Lys Asp Thr Thr Thr Glu Thr Thr Pro Arg945
950 955 960Gln Lys Val Ala
Ser Ser Ser Thr Met Ser Thr His Pro Ser Arg Arg 965
970 975Arg Pro Asn Gly Arg Lys Leu His Pro His
Lys Phe His His Arg His 980 985
990Lys Gln Thr Pro Pro Thr Thr Phe Ala Pro Leu Glu Thr Phe Ser Thr
995 1000 1005Gln Pro Thr Gln Ala Thr
Asp Ile Lys Ile Ser Asn Gln Met Glu 1010 1015
1020Ser Ser Leu Val Pro Thr Ser Trp Glu Ile Asn Thr Val Asn
Thr 1025 1030 1035Pro Lys Gln Leu Glu
Met Glu Lys Asn Val Glu Leu Ile Ser Lys 1040 1045
1050Gly Thr Pro Arg Arg Lys His Gly Lys Arg Pro Asn Lys
His Arg 1055 1060 1065Tyr Thr Pro Ser
Thr Val Ser Ser Arg Ala Ser Ala Ser Lys Pro 1070
1075 1080Ser Pro Ser Pro Glu Asn Lys His Arg Asn Ile
Val Thr Pro Ser 1085 1090 1095Ser Glu
Thr Thr Leu Leu Pro Arg Asn Val Ser Leu Lys Thr Glu 1100
1105 1110Gly Val Tyr Asp Ser Leu Asp Tyr Thr Thr
Thr Thr Arg Lys Ile 1115 1120 1125His
Ser Ser His His Lys Val Gln Asp Thr Leu Pro Val Met Tyr 1130
1135 1140Lys Pro Thr Ser Asp Gly Lys Glu Ile
Gln Asp Asp Val Ala Thr 1145 1150
1155Asn Val Asp Lys His Lys Ser Asp Ile Leu Val Pro Gly Glu Ser
1160 1165 1170Ile Thr Asn Val Thr Gln
Thr Ser Arg Ser Leu Val Ser Thr Met 1175 1180
1185Gly Glu Phe Lys Glu Glu Ser Ser Pro Val Gly Phe Pro Gly
Ile 1190 1195 1200Pro Thr Trp Asn Pro
Ser Arg Lys Ala Gln Pro Gly Arg Leu Gln 1205 1210
1215Thr Asp Ile His Val Thr Thr Ser Gly Glu Thr Pro Thr
Asp Pro 1220 1225 1230Pro Leu Val Asn
Glu Leu Glu Asp Val Asp Phe Thr Ser Glu Phe 1235
1240 1245Leu Ser Ser Val Thr Val Ser Thr Pro Phe His
Gln Glu Glu Ala 1250 1255 1260Gly Phe
Ser Thr Ile Leu Ser Ser Ile Lys Val Glu Met Ala Ser 1265
1270 1275Ser Gln Val Glu Thr Thr Thr Leu Gly Gln
Asp His His Glu Thr 1280 1285 1290Thr
Val Ala Ile Leu His Ser Glu Thr Arg Pro Gln Asn His Ile 1295
1300 1305Leu Thr Ala Ala Trp Met Lys Glu Pro
Ala Ser Leu Ser Pro Pro 1310 1315
1320Met Ile Leu Leu Ser Leu Gly Gln Thr Thr Thr Thr Lys Pro Glu
1325 1330 1335Leu Leu Ser Pro Arg Thr
Ser Gln Ile Cys Lys Asp Ser Lys Glu 1340 1345
1350Asn Val Phe Leu Asn Tyr Met Gly Asn Pro Glu Thr Glu Ala
Thr 1355 1360 1365Pro Val Lys Asn Glu
Gly Thr Gln Arg Met Ser Gly Pro Asn Glu 1370 1375
1380Leu Ser Thr Pro Ser Ser Asp His Asp Ala Phe Asn Leu
Ser Thr 1385 1390 1395Lys Leu Glu Leu
Glu Lys Gln Val Phe Asp Ser Arg Ser Leu Thr 1400
1405 1410Arg Gly Pro Asp Ser His His Gln Asp Gly Arg
Val His Ala Ser 1415 1420 1425His Gln
Leu Thr Arg Ile Pro Ala Lys Pro Ile Leu Pro Thr Gly 1430
1435 1440Thr Val Arg Leu Pro Glu Met Ser Thr Gln
Ser Thr Ser Arg Tyr 1445 1450 1455Phe
Val Thr Phe Gln Pro Pro His His Gly Thr Asn Lys Pro Glu 1460
1465 1470Ile Thr Thr Tyr Pro Ser Arg Ala Leu
Pro Glu Ser Lys Gln Phe 1475 1480
1485Thr Thr Pro Arg Val Ala Ser Thr Thr Pro Leu Leu Ser His Met
1490 1495 1500Ser Lys Pro Ser Ile Ser
Ser Lys Phe Ala Asp Leu Arg Thr Asp 1505 1510
1515Gln Ser Asn Gly Ser Tyr Lys Val Phe Gly Asn Ser Asn Ile
Pro 1520 1525 1530Glu Ala Arg Asn Ser
Val Gly Lys Pro Leu Ser Pro Arg Ile Tyr 1535 1540
1545His Tyr Ser Asn Gly Arg Leu Pro Phe Phe Thr Asn Arg
Thr Leu 1550 1555 1560Ser Phe Ser Gln
Leu Gly Val Thr Arg Arg Pro Gln Ile Pro Ser 1565
1570 1575Ser Pro Val Pro Val Met Arg Glu Arg Lys Val
Asn Pro Gly Ser 1580 1585 1590Tyr Asn
Arg Ile Tyr Ser His Ser Thr Phe His Leu Asp Phe Gly 1595
1600 1605Leu Pro Ala Pro Pro Leu Leu His Thr Pro
Trp Thr Met Val Ser 1610 1615 1620Pro
Pro Thr Asn Leu Gln Asn Ile Pro Met Val Ser Ser Thr Gln 1625
1630 1635Ser Ser Val Ser Phe Ile Thr Ser Ser
Val Gln Ser Ser Gly Ser 1640 1645
1650Ile His Gln Ser Gly Ser Lys Phe Phe Ala Gly Gly Pro Pro Ala
1655 1660 1665Ser Lys Phe Trp Pro Leu
Gly Glu Lys Pro Gln Ile Leu Thr Lys 1670 1675
1680Ser Pro Gln Thr Val Ser Val Thr Ala Glu Thr Asp Ala Val
Phe 1685 1690 1695Pro Cys Glu Ala Ile
Gly Lys Pro Lys Pro Phe Val Thr Trp Thr 1700 1705
1710Lys Val Ser Thr Gly Val Leu Met Thr Pro Asn Thr Arg
Ile Gln 1715 1720 1725Arg Phe Glu Val
Leu Lys Asn Gly Thr Leu Val Ile Arg Lys Phe 1730
1735 1740Gln Val Gln Asp Arg Gly Gln Tyr Met Cys Thr
Ala Ser Asn Leu 1745 1750 1755Tyr Gly
Leu Asp Arg Met Val Val Phe Leu Trp Val Thr Val Gln 1760
1765 1770Gln Pro Gln Ile Leu Ala Ser His Tyr Gln
Asp Val Thr Val Tyr 1775 1780 1785Leu
Gly Asp Thr Ile Thr Met Glu Cys Leu Ala Lys Gly Thr Pro 1790
1795 1800Ala Pro Gln Ile Ser Trp Ile Phe Arg
Asp Arg Arg Val Trp Gln 1805 1810
1815Thr Leu Ser Ser Val Glu Gly Arg Ile Thr Leu His Gln Asn Arg
1820 1825 1830Thr Leu Ser Ile Lys Glu
Ala Ser Phe Ser Asp Arg Gly Val Tyr 1835 1840
1845Lys Cys Val Ala Ser Asn Ala Thr Arg Ala Asp Ser Val Ser
Ile 1850 1855 1860Arg Leu His Val Ala
Ala Leu Pro Pro Ile Ile His Gln Glu Lys 1865 1870
1875Leu Glu Asn Ile Ser Leu Pro Pro Gly Leu Ser Ile His
Ile His 1880 1885 1890Cys Thr Ala Lys
Ala Ala Pro Leu Pro Ser Val Leu Trp Val Leu 1895
1900 1905Gly Asp Gly Thr Gln Ile Arg Pro Ser His Phe
Leu His Arg Asn 1910 1915 1920Leu Phe
Val Phe Pro Asn Gly Thr Leu Tyr Ile Cys Asn Leu Ala 1925
1930 1935Pro Lys Asp Ser Gly Arg Tyr Glu Cys Val
Ala Ala Asn Leu Ile 1940 1945 1950Gly
Ser Ala Arg Ser Thr Val Gln Leu Asn Val Gln Arg Ala Ala 1955
1960 1965Ala Asn Ala Arg Ile Thr Gly Thr Ser
Ser Gln Arg Thr Asp Val 1970 1975
1980Arg Tyr Gly Gly Thr Leu Lys Leu Asp Cys Ser Ala Ser Gly Asp
1985 1990 1995Pro Trp Pro Arg Ile Leu
Trp Arg Leu Pro Ser Lys Arg Thr Ile 2000 2005
2010Asp Ala Leu Phe Ser Phe Asp Ser Arg Ile Lys Val Phe Ala
Asn 2015 2020 2025Arg Thr Leu Val Val
Lys Ser Met Thr Asp Lys Asp Ala Gly Asp 2030 2035
2040Tyr Leu Cys Val Ala Arg Asn Lys Val Gly Asp Asp Cys
Val Val 2045 2050 2055Leu Lys Val Asp
Val Met Met Lys Pro Ala Lys Ile Glu His Lys 2060
2065 2070Glu Glu Asn Asp His Lys Val Phe Tyr Arg Gly
Asp Leu Lys Val 2075 2080 2085Asp Cys
Val Ala Thr Gly Leu Pro Asn Pro Glu Ile Ser Trp Ser 2090
2095 2100Leu Leu Asp Gly Ser Leu Val Asn Ser Phe
Met Gln Ser Asp Asp 2105 2110 2115Ser
Gly Gly Arg Thr Lys His Tyr Val Val Phe Asn Asn Gly Thr 2120
2125 2130Leu Tyr Phe Ser Glu Val Gly Met Arg
Glu Glu Gly Asp Tyr Thr 2135 2140
2145Cys Phe Ala Glu Asn Gln Val Gly Lys Asp Glu Met Arg Val Arg
2150 2155 2160Val Lys Met Val Thr Pro
Ala Thr Ile Trp Asn Lys Thr Tyr Leu 2165 2170
2175Ala Val Gln Val Pro Tyr Gly Asp Val Val Thr Val Thr Cys
Glu 2180 2185 2190Ala Lys Gly Glu Pro
Met Pro Lys Val Thr Trp Leu Ser Pro Ala 2195 2200
2205Asn Arg Val Ile Pro Thr Ser Ser Glu Lys Tyr Gln Ile
Tyr Gln 2210 2215 2220Tyr Gly Thr Leu
Leu Ile Gln Lys Ala Gln Cys Ser Asp Ser Gly 2225
2230 2235Asn Tyr Thr Cys Leu Val Arg Asn Ser Ala Gly
Glu Asp Arg Lys 2240 2245 2250Thr Val
Trp Ile His Val Asn Leu Gln Pro Pro Lys Ile Asn Gly 2255
2260 2265Asn Pro Asn Pro Ile Thr Thr Val Trp Glu
Ile Ala Ala Gly Gly 2270 2275 2280Ser
Arg Lys Leu Ile Asp Cys Lys Ala Glu Gly Ile Pro Thr Pro 2285
2290 2295Arg Val Leu Trp Ala Phe Pro Glu Gly
Val Val Leu Pro Asp Pro 2300 2305
2310Tyr Tyr Gly Asn Arg Ile Thr Val His Gly Asn Gly Ser Leu Asp
2315 2320 2325Ile Arg Ser Leu Arg Lys
Ser Asp Ser Val Gln Leu Val Cys Met 2330 2335
2340Ala Arg Asn Glu Gly Gly Glu Ala Arg Leu Ile Val Gln Leu
Thr 2345 2350 2355Val Leu Glu Pro Met
Glu Lys Pro Ile Phe His Asp Pro Ile Ser 2360 2365
2370Glu Lys Ile Thr Ala Met Ala Gly His Thr Ile Ser Leu
Asn Cys 2375 2380 2385Ser Ala Ala Gly
Thr Leu Thr Pro Ser Leu Val Trp Val Leu Pro 2390
2395 2400Asn Gly Thr Asp Leu Gln Ser Gly Gln Gln Leu
Gln Arg Phe Tyr 2405 2410 2415His Lys
Ala Asp Gly Met Leu His Ile Ser Gly Leu Ser Ser Val 2420
2425 2430Asp Ala Gly Ala Tyr Arg Cys Val Ala Arg
Asn Ala Ala Gly His 2435 2440 2445Thr
Glu Arg Leu Val Ser Leu Lys Val Gly Leu Lys Pro Glu Ala 2450
2455 2460Asn Lys Gln Tyr His Asn Leu Val Ser
Ile Ile Asn Gly Glu Thr 2465 2470
2475Leu Lys Leu Pro Cys Thr Pro Pro Ala Ala Gly Gln Gly His Phe
2480 2485 2490Ser Trp Thr Leu Pro Asn
Gly Met His Leu Glu Gly Pro Gln Thr 2495 2500
2505Leu Gly Arg Val Ser Leu Leu Asp Asn Gly Thr Leu Thr Val
Arg 2510 2515 2520Glu Ala Ser Val Phe
Asp Arg Gly Thr Tyr Val Cys Arg Met Glu 2525 2530
2535Thr Ala Tyr Gly Pro Ser Val Thr Ser Ile Pro Val Ile
Val Ile 2540 2545 2550Ala Tyr Pro Pro
Arg Ile Thr Ser Glu Pro Thr Pro Val Ile Tyr 2555
2560 2565Thr Arg Pro Gly Asn Thr Val Lys Leu Asn Cys
Met Ala Met Gly 2570 2575 2580Ile Pro
Lys Gly Asp Ile Thr Trp Glu Leu Pro Asp Lys Leu His 2585
2590 2595Leu Lys Ala Gly Val Gln Ala Arg Leu Tyr
Gly Asn Arg Phe Leu 2600 2605 2610His
Pro Gln Gly Ser Leu Thr Ile Gln Gln Ala Arg Arg Arg Asp 2615
2620 2625Ala Gly Phe Tyr Lys Cys Thr Ala Lys
Asn Ile Leu Ser Ser Asp 2630 2635
2640Ser Lys Thr Thr Tyr Ile His Val Phe 2645 2650
<210> 31
<211> 7872
<212> DNA
<213> Homo sapiens
<400> 31
atgaaggtaa aaggcagagg aatcacctgc ttgctggtct
cctttgctgt gatctgcctg 60gtcgccaccc ctgggggcaa ggcctgtcct cgccgctgtg
cctgttatat gcctacggag 120gtacactgca catttcggta cctgacttcc atcccagaca
gcatcccgcc caatgtggaa 180cgcatcaatt taggatacaa cagcttggtt agattgatgg
aaacagattt ttctggcctg 240accaaactgg agttactcat gcttcacagc aatggcattc
acacaatccc tgacaagacc 300ttctcagatt tgcaggcctt gcaggtctta aaaatgagct
ataataaagt ccgaaaactt 360cagaaagata ctttttatgg cctcaggagc ttgacacgat
tgcacatgga ccacaacaat 420attgagttta taaacccaga ggttttttat gggctcaact
ttctccgcct ggtgcacttg 480gaaggaaatc agctcactaa gctccaccca gatacatttg
tctctttgag ctacctccag 540atatttaaaa tctctttcat taagttccta tacttgtctg
ataacttcct gacctccctc 600cctcaagaga tggtctccta tatgcctgac ctagacagcc
tttacctgca tggaaaccca 660tggacctgtg attgccattt aaagtggttg tctgactgga
tacaggagaa gccagatgta 720ataaaatgca aaaaagatag aagtccctct agtgctcagc
agtgtccact ttgcatgaac 780cctaggactt ctaaaggcaa gccgttagct atggtctcag
ctgcagcttt ccagtgtgcc 840aagccaacca ttgactcatc cctgaaatca aagagcctga
ctattctgga agacagtagt 900tctgctttca tctctcccca aggtttcatg gcaccctttg
gctccctcac tttgaatatg 960acagatcagt ctggaaatga agctaacatg gtctgcagta
ttcaaaagcc ctcaaggaca 1020tcacccattg cattcactga agaaaatgac tacatcgtgc
taaatacttc attttcaaca 1080tttttggtgt gcaacataga ttacggtcac attcagccag
tgtggcaaat tttggctttg 1140tacagtgatt ctcctctgat actagaaagg agccacttgc
ttagtgaaac accgcagctc 1200tattacaaat ataaacaggt ggctcctaag cctgaagaca
tttttaccaa catagaggca 1260gatctcagag cagatccctc ttggttaatg caagaccaaa
tttccttgca gctgaacaga 1320actgccacca cattcagtac attacagatc cagtactcca
gtgatgctca aatcacttta 1380ccaagagcag agatgaggcc agtgaaacac aaatggacta
tgatttcaag ggataacaat 1440actaagctgg aacatactgt cttggtaggt ggaaccgttg
gcctgaactg cccaggccaa 1500ggagacccca ccccacacgt ggattggctt ctagctgatg
gaagtaaagt gagagcccct 1560tatgtcagtg aggatggacg gatcctaata gacaaaagtg
gaaaattgga actccagatg 1620gctgatagtt ttgacacagg cgtatatcac tgtataagca
gcaattatga tgatgcagat 1680attctcacct ataggataac tgtggtagaa cctttggtcg
aagcctatca ggaaaatggg 1740attcatcaca cagttttcat tggtgaaaca cttgatcttc
catgccattc tactggtatc 1800ccagatgcct ctattagctg ggttattcca ggaaacaatg
tgctctatca gtcatcaaga 1860gacaagaaag ttctaaacaa tggcacatta agaatattac
aggtcacccc gaaagaccaa 1920ggttattatc gctgtgtggc agccaaccca tcaggggttg
attttttgat tttccaagtt 1980tcagtcaaga tgaaaggaca aaggcccttg gagcatgatg
gagaaacaga gggatctgga 2040cttgatgagt ccaatcctat tgctcatctt aaggagccac
caggtgcaca actccgtaca 2100tctgctctga tggaggctga ggttggaaaa cacacctcaa
gcacaagtaa gaggcacaac 2160tatcgggaat taacactcca gcgacgtgga gattcaacac
atcgacgttt tagggagaat 2220aggaggcatt tccctccctc tgctaggaga attgacccac
aacattgggc ggcactgttg 2280gagaaagcta aaaagaatgc tatgccagac aagcgagaaa
ataccacagt gagcccaccc 2340ccagtggtca cccaactccc aaacatacct ggtgaagaag
acgattcctc aggcatgctc 2400gctctacatg aggaatttat ggtcccggcc actaaagctt
tgaaccttcc agcaaggaca 2460gtgactgctg actccagaac aatatctgat agtcctatga
caaacataaa ttatggcaca 2520gaattctctc ctgttgtgaa ttcacaaata ctaccacctg
aagaacccac agatttcaaa 2580ctgtctactg ctattaaaac tacagccatg tcaaagaata
taaacccaac catgtcaagc 2640caaatacaag gcacaaccaa tcaacattca tccactgtct
ttccactgct acttggagca 2700actgaatttc aggactctga ccagatggga agaggaagag
agcatttcca aagtagaccc 2760ccaataacag taaggactat gatcaaagat gtcaatgtca
aaatgcttag tagcaccacc 2820aacaaactat tattagagtc agtaaatacc acaaatagtc
atcagacatc tgtaagagaa 2880gtgagtgaac ccaggcacaa tcacttctat tctcacacta
ctcaaatact tagcacctcc 2940acgttccctt cagatccaca cacagctgct cattctcagt
ttccgatccc tagaaatagt 3000acagttaaca tcccgctgtt cagacgcttt gggaggcaga
ggaaaattgg cggaaggggg 3060cggattatca gcccatatag aactccagtt ctgcgacggc
atagatacag cattttcagg 3120tcaacaacca gaggttcttc tgaaaaaagc actactgcat
tctcagccac agtgctcaat 3180gtgacatgtc tgtcctgtct tcccagggag aggctcacca
ctgccacagc agcattgtct 3240tttccaagtg ctgctcccat caccttcccc aaagctgaca
ttgctagagt cccatcagaa 3300gagtctacaa ctctagtcca gaatccacta ttactacttg
agaacaaacc cagtgtagag 3360aaaacaacac ccacaataaa atatttcagg actgaaattt
cccaagtgac tccaactggt 3420gcagtcatga catatgctcc aacatccata cccatggaaa
aaactcacaa agtaaacgcc 3480agttacccac gtgtgtctag caccaatgaa gctaaaagag
attcagtgat tacatcgtca 3540ctttcaggtg ctatcaccaa gccaccaatg actattatag
ccattacaag gttttcaaga 3600aggaaaattc cctggcaaca gaactttgta aataaccata
acccaaaagg cagattaagg 3660aatcaacata aagttagttt acaaaaaagc acagctgtga
tgcttcctaa aacatctcct 3720gctttaccca gagacaaagt ctcccctttc catttcacca
cactttcaac aagtgtgatg 3780caaattccat ctaatacctt gactaccgct caccacacta
cgaccaaaac acacaatcct 3840ggaagtcttc caacaaagaa ggagcttccc ttcccacccc
ttaaccctat gcttcctagt 3900attataagca aagactcaag tacaaaaagc atcatatcaa
cgcaaacagc aataccagca 3960acaactccta ccttccctgc atctgtcatc acttatgaaa
cccaaacaga gagatctaga 4020gcacaaacaa tacaaagaga acaggagcct caaaagaaga
acaggactga cccaaacatc 4080tctccagacc agagttctgg cttcactaca cccactgcta
tgacacctcc tgttctaacc 4140acagccgaaa cttcagtcaa gcccagtgtc tctgcattca
ctcattcccc accagaaaac 4200acaactggga tttcaagcac aatcagtttt cattcaagaa
ctcttaatct gacagatgtg 4260attgaagaac tagcccaagc aagtactcag actttgaaga
gcacaattgc ttctgaaaca 4320actttgtcca gcaaatcaca ccagagtacc acaactagga
aagcaatcat tagacactca 4380accataccac cattcttgag cagcagtgct actctaatgc
cagttcccat ctcccctccc 4440tttactcaga gagcagttac tgacaacgtg gcgactccca
tttccgggct tatgacaaat 4500acagtggtca agctgcacga atcctcaagg cacaatgcta
aaccacagca attagtagca 4560gaggttgcaa catcccccaa ggttcaccca aatgccaagt
tcacaattgg aaccactcac 4620ttcatctact ctaatctgtt acattctact cccatgccag
cactaacaac agttaaatca 4680cagaattcta aattaactcc atctccctgg gcagaaaacc
aattttggca caaaccatac 4740tcagaaattg ctgaaaaagg caaaaagcca gaagtaagca
tgttggctac tacaggcctg 4800tccgaggcca ccactcttgt ttcagattgg gatggacaga
agaacacaaa gaagagtgac 4860tttgataaga aaccagttca agaagcaaca acttccaaac
tccttccctt tgactctttg 4920tctaggtata tatttgaaaa gcccaggata gttggaggaa
aagctgcaag ttttactatt 4980ccagctaact cagatgcctt tcttccctgt gaagctgttg
gaaatcccct gcccaccatt 5040cattggacca gagtcccatc aggacttgat ttatctaaga
ggaaacagaa tagcagggtc 5100caggttctcc ccaatggtac cctgtccatc cagagggtgg
aaattcagga ccgcggacag 5160tacttgtgtt ccgcatccaa tctgtttggc acagaccacc
ttcatgtcac cttgtctgtg 5220gtttcctatc ctcccaggat cctggagaga cgtaccaaag
agatcacagt tcattccgga 5280agcactgtgg aactgaagtg cagagcagaa ggtaggccaa
gccctacagt tacctggatt 5340cttgcaaacc aaacagttgt ctcagaatca tcccagggaa
gtaggcaggc tgtggtgacg 5400gttgacggaa cattggtcct ccacaatctc agtatttatg
accgtggctt ttacaaatgt 5460gtggccagca acccaggtgg ccaggattca ctgctggtta
aaatacaagt cattgcagca 5520ccacctgtta ttctagagca aaggaggcaa gtcattgtag
gcacttgggg tgaaagttta 5580aaactgccct gtactgcaaa aggaactcct cagcccagcg
tttactgggt cctctctgat 5640ggcactgaag tgaaaccatt acagtttacc aattccaagt
tgttcttatt ttcaaatggg 5700actttgtata taagaaacct agcctcttca gacaggggca
cttatgaatg cattgctacc 5760agttccactg gttcggagcg aagagtagta atgcttacaa
tggaagagcg agtgaccagc 5820cccaggatag aagctgcatc ccagaaaagg actgaagtga
attttgggga caaattacta 5880ctgaactgct cagccactgg ggagcccaaa ccccaaataa
tgtggaggtt accatccaag 5940gctgtggtcg accagcagca tagagtgggc agctggatcc
acgtctaccc taatggatcc 6000ctgtttattg gatcagtaac agaaaaagac agtggtgtct
acttgtgtgt ggcaagaaac 6060aaaatggggg atgatctgat actgatgcat gttagcctaa
gactgaaacc tgccaaaatt 6120gaccacaagc agtattttag aaagcaagtg ctccatggga
aagatttcca agtagattgc 6180aaagcttccg gctccccagt gccagagata tcttggagtt
tgcctgatgg aaccatgatc 6240aacaatgcaa tgcaagccga tgacagtggc cacaggacta
ggagatatac ccttttcaac 6300aatggaactt tatacttcaa caaagttggg gtagcggagg
aaggagatta tacttgctat 6360gcccagaaca ccctagggaa agatgaaatg aaggtccact
taacagttat aacagctgct 6420ccccggataa ggcagagtaa caaaaccaac aagagaatca
aagctggaga cacagctgtc 6480cttgactgtg aggtcactgg ggatcccaaa ccaaaaatat
tttggttgct gccttccaat 6540gacatgattt ccttctccat tgataggtac acatttcatg
ccaatgggtc tttgaccatc 6600aacaaagtga aactgctcga ttctggagag tacgtatgtg
tagcccgaaa tcccagtggg 6660gatgacacca aaatgtacaa actggatgtg gtctctaaac
ctccattaat caatggtctg 6720tatacaaaca gaactgttat taaagccaca gctgtgagac
attccaaaaa acactttgac 6780tgcagagctg aagggacacc atctcctgaa gtcatgtgga
tcatgccaga caatattttc 6840ctcacagccc catactatgg aagcagaatc acagtccata
aaaatggaac cttggaaatt 6900aggaatgtga ggctttcaga ttcagccgac tttatctgtg
tggcccgaaa tgaaggtgga 6960gagagcgtgt tggtagtaca gttagaagta ctggaaatgc
tgagaagacc gacatttaga 7020aatccattta atgaaaaaat agttgcccag ctgggaaagt
ccacagcatt gaattgctct 7080gttgatggta acccaccacc tgaaataatc tggattttac
caaatggcac acgattttcc 7140aatggaccac aaagttatca gtatctgata gcaagcaatg
gttcttttat catttctaaa 7200acaactcggg aggatgcagg aaaatatcgc tgtgcagcta
ggaataaagt tggctatatt 7260gagaaattag tcatattaga aattggccag aagccagtta
ttcttaccta tgcaccaggg 7320acagtaaaag gcatcagtgg agaatctcta tcactgcatt
gtgtgtctga tggaatccct 7380aagccaaata tcaaatggac tatgccaagt ggttatgtag
tagacaggcc tcaaattaat 7440gggaaataca tattgcatga caatggcacc ttagtcatta
aagaagcaac agcttatgac 7500agaggaaact atatctgtaa ggctcaaaat agtgttggtc
atacactgat tactgttcca 7560gtaatgattg tagcctaccc tccccgaatt acaaatcgtc
cacccaggag tattgtcacc 7620aggacagggg cagcctttca gctccactgt gtggccttgg
gagttcccaa gccagaaatc 7680acatgggaga tgcctgacca ctcccttctc tcaacggcaa
gtaaagagag gacacatgga 7740agtgagcagc ttcacttaca aggtacccta gtcattcaga
atccccaaac ctccgattct 7800gggatataca aatgcacagc aaagaaccca cttggtagtg
attatgcagc aacgtatatt 7860caagtaatct ga
7872322623PRTHomo sapiens 32Met Lys Val Lys Gly Arg
Gly Ile Thr Cys Leu Leu Val Ser Phe Ala1 5
10 15Val Ile Cys Leu Val Ala Thr Pro Gly Gly Lys Ala
Cys Pro Arg Arg 20 25 30Cys
Ala Cys Tyr Met Pro Thr Glu Val His Cys Thr Phe Arg Tyr Leu 35
40 45Thr Ser Ile Pro Asp Ser Ile Pro Pro
Asn Val Glu Arg Ile Asn Leu 50 55
60Gly Tyr Asn Ser Leu Val Arg Leu Met Glu Thr Asp Phe Ser Gly Leu65
70 75 80Thr Lys Leu Glu Leu
Leu Met Leu His Ser Asn Gly Ile His Thr Ile 85
90 95Pro Asp Lys Thr Phe Ser Asp Leu Gln Ala Leu
Gln Val Leu Lys Met 100 105
110Ser Tyr Asn Lys Val Arg Lys Leu Gln Lys Asp Thr Phe Tyr Gly Leu
115 120 125Arg Ser Leu Thr Arg Leu His
Met Asp His Asn Asn Ile Glu Phe Ile 130 135
140Asn Pro Glu Val Phe Tyr Gly Leu Asn Phe Leu Arg Leu Val His
Leu145 150 155 160Glu Gly
Asn Gln Leu Thr Lys Leu His Pro Asp Thr Phe Val Ser Leu
165 170 175Ser Tyr Leu Gln Ile Phe Lys
Ile Ser Phe Ile Lys Phe Leu Tyr Leu 180 185
190Ser Asp Asn Phe Leu Thr Ser Leu Pro Gln Glu Met Val Ser
Tyr Met 195 200 205Pro Asp Leu Asp
Ser Leu Tyr Leu His Gly Asn Pro Trp Thr Cys Asp 210
215 220Cys His Leu Lys Trp Leu Ser Asp Trp Ile Gln Glu
Lys Pro Asp Val225 230 235
240Ile Lys Cys Lys Lys Asp Arg Ser Pro Ser Ser Ala Gln Gln Cys Pro
245 250 255Leu Cys Met Asn Pro
Arg Thr Ser Lys Gly Lys Pro Leu Ala Met Val 260
265 270Ser Ala Ala Ala Phe Gln Cys Ala Lys Pro Thr Ile
Asp Ser Ser Leu 275 280 285Lys Ser
Lys Ser Leu Thr Ile Leu Glu Asp Ser Ser Ser Ala Phe Ile 290
295 300Ser Pro Gln Gly Phe Met Ala Pro Phe Gly Ser
Leu Thr Leu Asn Met305 310 315
320Thr Asp Gln Ser Gly Asn Glu Ala Asn Met Val Cys Ser Ile Gln Lys
325 330 335Pro Ser Arg Thr
Ser Pro Ile Ala Phe Thr Glu Glu Asn Asp Tyr Ile 340
345 350Val Leu Asn Thr Ser Phe Ser Thr Phe Leu Val
Cys Asn Ile Asp Tyr 355 360 365Gly
His Ile Gln Pro Val Trp Gln Ile Leu Ala Leu Tyr Ser Asp Ser 370
375 380Pro Leu Ile Leu Glu Arg Ser His Leu Leu
Ser Glu Thr Pro Gln Leu385 390 395
400Tyr Tyr Lys Tyr Lys Gln Val Ala Pro Lys Pro Glu Asp Ile Phe
Thr 405 410 415Asn Ile Glu
Ala Asp Leu Arg Ala Asp Pro Ser Trp Leu Met Gln Asp 420
425 430Gln Ile Ser Leu Gln Leu Asn Arg Thr Ala
Thr Thr Phe Ser Thr Leu 435 440
445Gln Ile Gln Tyr Ser Ser Asp Ala Gln Ile Thr Leu Pro Arg Ala Glu 450
455 460Met Arg Pro Val Lys His Lys Trp
Thr Met Ile Ser Arg Asp Asn Asn465 470
475 480Thr Lys Leu Glu His Thr Val Leu Val Gly Gly Thr
Val Gly Leu Asn 485 490
495Cys Pro Gly Gln Gly Asp Pro Thr Pro His Val Asp Trp Leu Leu Ala
500 505 510Asp Gly Ser Lys Val Arg
Ala Pro Tyr Val Ser Glu Asp Gly Arg Ile 515 520
525Leu Ile Asp Lys Ser Gly Lys Leu Glu Leu Gln Met Ala Asp
Ser Phe 530 535 540Asp Thr Gly Val Tyr
His Cys Ile Ser Ser Asn Tyr Asp Asp Ala Asp545 550
555 560Ile Leu Thr Tyr Arg Ile Thr Val Val Glu
Pro Leu Val Glu Ala Tyr 565 570
575Gln Glu Asn Gly Ile His His Thr Val Phe Ile Gly Glu Thr Leu Asp
580 585 590Leu Pro Cys His Ser
Thr Gly Ile Pro Asp Ala Ser Ile Ser Trp Val 595
600 605Ile Pro Gly Asn Asn Val Leu Tyr Gln Ser Ser Arg
Asp Lys Lys Val 610 615 620Leu Asn Asn
Gly Thr Leu Arg Ile Leu Gln Val Thr Pro Lys Asp Gln625
630 635 640Gly Tyr Tyr Arg Cys Val Ala
Ala Asn Pro Ser Gly Val Asp Phe Leu 645
650 655Ile Phe Gln Val Ser Val Lys Met Lys Gly Gln Arg
Pro Leu Glu His 660 665 670Asp
Gly Glu Thr Glu Gly Ser Gly Leu Asp Glu Ser Asn Pro Ile Ala 675
680 685His Leu Lys Glu Pro Pro Gly Ala Gln
Leu Arg Thr Ser Ala Leu Met 690 695
700Glu Ala Glu Val Gly Lys His Thr Ser Ser Thr Ser Lys Arg His Asn705
710 715 720Tyr Arg Glu Leu
Thr Leu Gln Arg Arg Gly Asp Ser Thr His Arg Arg 725
730 735Phe Arg Glu Asn Arg Arg His Phe Pro Pro
Ser Ala Arg Arg Ile Asp 740 745
750Pro Gln His Trp Ala Ala Leu Leu Glu Lys Ala Lys Lys Asn Ala Met
755 760 765Pro Asp Lys Arg Glu Asn Thr
Thr Val Ser Pro Pro Pro Val Val Thr 770 775
780Gln Leu Pro Asn Ile Pro Gly Glu Glu Asp Asp Ser Ser Gly Met
Leu785 790 795 800Ala Leu
His Glu Glu Phe Met Val Pro Ala Thr Lys Ala Leu Asn Leu
805 810 815Pro Ala Arg Thr Val Thr Ala
Asp Ser Arg Thr Ile Ser Asp Ser Pro 820 825
830Met Thr Asn Ile Asn Tyr Gly Thr Glu Phe Ser Pro Val Val
Asn Ser 835 840 845Gln Ile Leu Pro
Pro Glu Glu Pro Thr Asp Phe Lys Leu Ser Thr Ala 850
855 860Ile Lys Thr Thr Ala Met Ser Lys Asn Ile Asn Pro
Thr Met Ser Ser865 870 875
880Gln Ile Gln Gly Thr Thr Asn Gln His Ser Ser Thr Val Phe Pro Leu
885 890 895Leu Leu Gly Ala Thr
Glu Phe Gln Asp Ser Asp Gln Met Gly Arg Gly 900
905 910Arg Glu His Phe Gln Ser Arg Pro Pro Ile Thr Val
Arg Thr Met Ile 915 920 925Lys Asp
Val Asn Val Lys Met Leu Ser Ser Thr Thr Asn Lys Leu Leu 930
935 940Leu Glu Ser Val Asn Thr Thr Asn Ser His Gln
Thr Ser Val Arg Glu945 950 955
960Val Ser Glu Pro Arg His Asn His Phe Tyr Ser His Thr Thr Gln Ile
965 970 975Leu Ser Thr Ser
Thr Phe Pro Ser Asp Pro His Thr Ala Ala His Ser 980
985 990Gln Phe Pro Ile Pro Arg Asn Ser Thr Val Asn
Ile Pro Leu Phe Arg 995 1000
1005Arg Phe Gly Arg Gln Arg Lys Ile Gly Gly Arg Gly Arg Ile Ile
1010 1015 1020Ser Pro Tyr Arg Thr Pro
Val Leu Arg Arg His Arg Tyr Ser Ile 1025 1030
1035Phe Arg Ser Thr Thr Arg Gly Ser Ser Glu Lys Ser Thr Thr
Ala 1040 1045 1050Phe Ser Ala Thr Val
Leu Asn Val Thr Cys Leu Ser Cys Leu Pro 1055 1060
1065Arg Glu Arg Leu Thr Thr Ala Thr Ala Ala Leu Ser Phe
Pro Ser 1070 1075 1080Ala Ala Pro Ile
Thr Phe Pro Lys Ala Asp Ile Ala Arg Val Pro 1085
1090 1095Ser Glu Glu Ser Thr Thr Leu Val Gln Asn Pro
Leu Leu Leu Leu 1100 1105 1110Glu Asn
Lys Pro Ser Val Glu Lys Thr Thr Pro Thr Ile Lys Tyr 1115
1120 1125Phe Arg Thr Glu Ile Ser Gln Val Thr Pro
Thr Gly Ala Val Met 1130 1135 1140Thr
Tyr Ala Pro Thr Ser Ile Pro Met Glu Lys Thr His Lys Val 1145
1150 1155Asn Ala Ser Tyr Pro Arg Val Ser Ser
Thr Asn Glu Ala Lys Arg 1160 1165
1170Asp Ser Val Ile Thr Ser Ser Leu Ser Gly Ala Ile Thr Lys Pro
1175 1180 1185Pro Met Thr Ile Ile Ala
Ile Thr Arg Phe Ser Arg Arg Lys Ile 1190 1195
1200Pro Trp Gln Gln Asn Phe Val Asn Asn His Asn Pro Lys Gly
Arg 1205 1210 1215Leu Arg Asn Gln His
Lys Val Ser Leu Gln Lys Ser Thr Ala Val 1220 1225
1230Met Leu Pro Lys Thr Ser Pro Ala Leu Pro Arg Asp Lys
Val Ser 1235 1240 1245Pro Phe His Phe
Thr Thr Leu Ser Thr Ser Val Met Gln Ile Pro 1250
1255 1260Ser Asn Thr Leu Thr Thr Ala His His Thr Thr
Thr Lys Thr His 1265 1270 1275Asn Pro
Gly Ser Leu Pro Thr Lys Lys Glu Leu Pro Phe Pro Pro 1280
1285 1290Leu Asn Pro Met Leu Pro Ser Ile Ile Ser
Lys Asp Ser Ser Thr 1295 1300 1305Lys
Ser Ile Ile Ser Thr Gln Thr Ala Ile Pro Ala Thr Thr Pro 1310
1315 1320Thr Phe Pro Ala Ser Val Ile Thr Tyr
Glu Thr Gln Thr Glu Arg 1325 1330
1335Ser Arg Ala Gln Thr Ile Gln Arg Glu Gln Glu Pro Gln Lys Lys
1340 1345 1350Asn Arg Thr Asp Pro Asn
Ile Ser Pro Asp Gln Ser Ser Gly Phe 1355 1360
1365Thr Thr Pro Thr Ala Met Thr Pro Pro Val Leu Thr Thr Ala
Glu 1370 1375 1380Thr Ser Val Lys Pro
Ser Val Ser Ala Phe Thr His Ser Pro Pro 1385 1390
1395Glu Asn Thr Thr Gly Ile Ser Ser Thr Ile Ser Phe His
Ser Arg 1400 1405 1410Thr Leu Asn Leu
Thr Asp Val Ile Glu Glu Leu Ala Gln Ala Ser 1415
1420 1425Thr Gln Thr Leu Lys Ser Thr Ile Ala Ser Glu
Thr Thr Leu Ser 1430 1435 1440Ser Lys
Ser His Gln Ser Thr Thr Thr Arg Lys Ala Ile Ile Arg 1445
1450 1455His Ser Thr Ile Pro Pro Phe Leu Ser Ser
Ser Ala Thr Leu Met 1460 1465 1470Pro
Val Pro Ile Ser Pro Pro Phe Thr Gln Arg Ala Val Thr Asp 1475
1480 1485Asn Val Ala Thr Pro Ile Ser Gly Leu
Met Thr Asn Thr Val Val 1490 1495
1500Lys Leu His Glu Ser Ser Arg His Asn Ala Lys Pro Gln Gln Leu
1505 1510 1515Val Ala Glu Val Ala Thr
Ser Pro Lys Val His Pro Asn Ala Lys 1520 1525
1530Phe Thr Ile Gly Thr Thr His Phe Ile Tyr Ser Asn Leu Leu
His 1535 1540 1545Ser Thr Pro Met Pro
Ala Leu Thr Thr Val Lys Ser Gln Asn Ser 1550 1555
1560Lys Leu Thr Pro Ser Pro Trp Ala Glu Asn Gln Phe Trp
His Lys 1565 1570 1575Pro Tyr Ser Glu
Ile Ala Glu Lys Gly Lys Lys Pro Glu Val Ser 1580
1585 1590Met Leu Ala Thr Thr Gly Leu Ser Glu Ala Thr
Thr Leu Val Ser 1595 1600 1605Asp Trp
Asp Gly Gln Lys Asn Thr Lys Lys Ser Asp Phe Asp Lys 1610
1615 1620Lys Pro Val Gln Glu Ala Thr Thr Ser Lys
Leu Leu Pro Phe Asp 1625 1630 1635Ser
Leu Ser Arg Tyr Ile Phe Glu Lys Pro Arg Ile Val Gly Gly 1640
1645 1650Lys Ala Ala Ser Phe Thr Ile Pro Ala
Asn Ser Asp Ala Phe Leu 1655 1660
1665Pro Cys Glu Ala Val Gly Asn Pro Leu Pro Thr Ile His Trp Thr
1670 1675 1680Arg Val Pro Ser Gly Leu
Asp Leu Ser Lys Arg Lys Gln Asn Ser 1685 1690
1695Arg Val Gln Val Leu Pro Asn Gly Thr Leu Ser Ile Gln Arg
Val 1700 1705 1710Glu Ile Gln Asp Arg
Gly Gln Tyr Leu Cys Ser Ala Ser Asn Leu 1715 1720
1725Phe Gly Thr Asp His Leu His Val Thr Leu Ser Val Val
Ser Tyr 1730 1735 1740Pro Pro Arg Ile
Leu Glu Arg Arg Thr Lys Glu Ile Thr Val His 1745
1750 1755Ser Gly Ser Thr Val Glu Leu Lys Cys Arg Ala
Glu Gly Arg Pro 1760 1765 1770Ser Pro
Thr Val Thr Trp Ile Leu Ala Asn Gln Thr Val Val Ser 1775
1780 1785Glu Ser Ser Gln Gly Ser Arg Gln Ala Val
Val Thr Val Asp Gly 1790 1795 1800Thr
Leu Val Leu His Asn Leu Ser Ile Tyr Asp Arg Gly Phe Tyr 1805
1810 1815Lys Cys Val Ala Ser Asn Pro Gly Gly
Gln Asp Ser Leu Leu Val 1820 1825
1830Lys Ile Gln Val Ile Ala Ala Pro Pro Val Ile Leu Glu Gln Arg
1835 1840 1845Arg Gln Val Ile Val Gly
Thr Trp Gly Glu Ser Leu Lys Leu Pro 1850 1855
1860Cys Thr Ala Lys Gly Thr Pro Gln Pro Ser Val Tyr Trp Val
Leu 1865 1870 1875Ser Asp Gly Thr Glu
Val Lys Pro Leu Gln Phe Thr Asn Ser Lys 1880 1885
1890Leu Phe Leu Phe Ser Asn Gly Thr Leu Tyr Ile Arg Asn
Leu Ala 1895 1900 1905Ser Ser Asp Arg
Gly Thr Tyr Glu Cys Ile Ala Thr Ser Ser Thr 1910
1915 1920Gly Ser Glu Arg Arg Val Val Met Leu Thr Met
Glu Glu Arg Val 1925 1930 1935Thr Ser
Pro Arg Ile Glu Ala Ala Ser Gln Lys Arg Thr Glu Val 1940
1945 1950Asn Phe Gly Asp Lys Leu Leu Leu Asn Cys
Ser Ala Thr Gly Glu 1955 1960 1965Pro
Lys Pro Gln Ile Met Trp Arg Leu Pro Ser Lys Ala Val Val 1970
1975 1980Asp Gln Gln His Arg Val Gly Ser Trp
Ile His Val Tyr Pro Asn 1985 1990
1995Gly Ser Leu Phe Ile Gly Ser Val Thr Glu Lys Asp Ser Gly Val
2000 2005 2010Tyr Leu Cys Val Ala Arg
Asn Lys Met Gly Asp Asp Leu Ile Leu 2015 2020
2025Met His Val Ser Leu Arg Leu Lys Pro Ala Lys Ile Asp His
Lys 2030 2035 2040Gln Tyr Phe Arg Lys
Gln Val Leu His Gly Lys Asp Phe Gln Val 2045 2050
2055Asp Cys Lys Ala Ser Gly Ser Pro Val Pro Glu Ile Ser
Trp Ser 2060 2065 2070Leu Pro Asp Gly
Thr Met Ile Asn Asn Ala Met Gln Ala Asp Asp 2075
2080 2085Ser Gly His Arg Thr Arg Arg Tyr Thr Leu Phe
Asn Asn Gly Thr 2090 2095 2100Leu Tyr
Phe Asn Lys Val Gly Val Ala Glu Glu Gly Asp Tyr Thr 2105
2110 2115Cys Tyr Ala Gln Asn Thr Leu Gly Lys Asp
Glu Met Lys Val His 2120 2125 2130Leu
Thr Val Ile Thr Ala Ala Pro Arg Ile Arg Gln Ser Asn Lys 2135
2140 2145Thr Asn Lys Arg Ile Lys Ala Gly Asp
Thr Ala Val Leu Asp Cys 2150 2155
2160Glu Val Thr Gly Asp Pro Lys Pro Lys Ile Phe Trp Leu Leu Pro
2165 2170 2175Ser Asn Asp Met Ile Ser
Phe Ser Ile Asp Arg Tyr Thr Phe His 2180 2185
2190Ala Asn Gly Ser Leu Thr Ile Asn Lys Val Lys Leu Leu Asp
Ser 2195 2200 2205Gly Glu Tyr Val Cys
Val Ala Arg Asn Pro Ser Gly Asp Asp Thr 2210 2215
2220Lys Met Tyr Lys Leu Asp Val Val Ser Lys Pro Pro Leu
Ile Asn 2225 2230 2235Gly Leu Tyr Thr
Asn Arg Thr Val Ile Lys Ala Thr Ala Val Arg 2240
2245 2250His Ser Lys Lys His Phe Asp Cys Arg Ala Glu
Gly Thr Pro Ser 2255 2260 2265Pro Glu
Val Met Trp Ile Met Pro Asp Asn Ile Phe Leu Thr Ala 2270
2275 2280Pro Tyr Tyr Gly Ser Arg Ile Thr Val His
Lys Asn Gly Thr Leu 2285 2290 2295Glu
Ile Arg Asn Val Arg Leu Ser Asp Ser Ala Asp Phe Ile Cys 2300
2305 2310Val Ala Arg Asn Glu Gly Gly Glu Ser
Val Leu Val Val Gln Leu 2315 2320
2325Glu Val Leu Glu Met Leu Arg Arg Pro Thr Phe Arg Asn Pro Phe
2330 2335 2340Asn Glu Lys Ile Val Ala
Gln Leu Gly Lys Ser Thr Ala Leu Asn 2345 2350
2355Cys Ser Val Asp Gly Asn Pro Pro Pro Glu Ile Ile Trp Ile
Leu 2360 2365 2370Pro Asn Gly Thr Arg
Phe Ser Asn Gly Pro Gln Ser Tyr Gln Tyr 2375 2380
2385Leu Ile Ala Ser Asn Gly Ser Phe Ile Ile Ser Lys Thr
Thr Arg 2390 2395 2400Glu Asp Ala Gly
Lys Tyr Arg Cys Ala Ala Arg Asn Lys Val Gly 2405
2410 2415Tyr Ile Glu Lys Leu Val Ile Leu Glu Ile Gly
Gln Lys Pro Val 2420 2425 2430Ile Leu
Thr Tyr Ala Pro Gly Thr Val Lys Gly Ile Ser Gly Glu 2435
2440 2445Ser Leu Ser Leu His Cys Val Ser Asp Gly
Ile Pro Lys Pro Asn 2450 2455 2460Ile
Lys Trp Thr Met Pro Ser Gly Tyr Val Val Asp Arg Pro Gln 2465
2470 2475Ile Asn Gly Lys Tyr Ile Leu His Asp
Asn Gly Thr Leu Val Ile 2480 2485
2490Lys Glu Ala Thr Ala Tyr Asp Arg Gly Asn Tyr Ile Cys Lys Ala
2495 2500 2505Gln Asn Ser Val Gly His
Thr Leu Ile Thr Val Pro Val Met Ile 2510 2515
2520Val Ala Tyr Pro Pro Arg Ile Thr Asn Arg Pro Pro Arg Ser
Ile 2525 2530 2535Val Thr Arg Thr Gly
Ala Ala Phe Gln Leu His Cys Val Ala Leu 2540 2545
2550Gly Val Pro Lys Pro Glu Ile Thr Trp Glu Met Pro Asp
His Ser 2555 2560 2565Leu Leu Ser Thr
Ala Ser Lys Glu Arg Thr His Gly Ser Glu Gln 2570
2575 2580Leu His Leu Gln Gly Thr Leu Val Ile Gln Asn
Pro Gln Thr Ser 2585 2590 2595Asp Ser
Gly Ile Tyr Lys Cys Thr Ala Lys Asn Pro Leu Gly Ser 2600
2605 2610Asp Tyr Ala Ala Thr Tyr Ile Gln Val Ile
2615 2620
SEQ ID NO: 33; length 8883; DNA; Rattus rattus; misc_feature (8825)..(8825): n is a, c, g, or t
cgagagacga
cagaaggtta cggctgcgag aagacgacag aagggtccag aaaaaggaaa 60gtgctggagg
ggagtgggga caaaagcagc gaccaagtga atgtcacttc agtgactgag 120gccaggcaaa
acgcgcggga aggattttgt gtagcttggg accctttcat agacactgat 180gacacgttta
cgcaaaatag aaatttgagg agaaacgcct gggccttcgg aaaggagtga 240ttgattagta
cttgcaagtt taggtgactt taaggagaac taactaatgt atactattga 300gggaggagga
agagcattac agagtttcca gcagcagcag gaaagctttg gttaatttgg 360aaatggatga
tagcattaaa ataacagaag cgcctccagg tctctgaagc ttcagtcccc 420cagctgaaag
ccagaaaaga ctaagcccac taagcctttt gatccctttg gaagcaaaga 480actttccttc
cctggggtga agactctcct cagaagattt cctgtctctg cctatgttac 540aagaggaatc
aaaaccaaga cagaagagct caggatgcag gtgagaggca gggaagtcag 600cggcttgttg
atctccctca ctgctgtctg cctggtggtc acccctggga gcagggcctg 660tcctcgccgc
tgtgcctgct atgtgcccac agaggtgcac tgtacatttc ggtacctgac 720ctccatccca
gatggcatcc cggccaatgt ggaacgaata aatttaggat ataacagcct 780tactagattg
acagaaaacg actttgatgg cctgagcaaa ctggagttac tcatgctgca 840cagtaatggc
attcacagag tcagtgacaa gaccttctcg ggcttgcagt ccttgcaggt 900cttaaaaatg
agctataaca aagtccaaat cattcggaag gatactttct acggactcgg 960gagcttggtc
cggttgcacc tggatcacaa caacattgaa ttcatcaacc ctgaggcctt 1020ttatggactt
acctcgctcc gcttggtaca tttagaagga aaccggctca caaagctcca 1080tccagacaca
tttgtctcat taagctatct ccagatattt aaaacctctt tcattaagta 1140cctgttcttg
tctgataact tcctgacctc cctcccaaaa gaaatggtct cctacatgcc 1200aaacctagaa
agcctgtatt tgcatggaaa cccatggacc tgtgactgcc atttaaagtg 1260gttgtctgag
tggatgcagg gaaacccaga tataataaaa tgcaagaaag acagaagctc 1320ttccagtcct
cagcaatgtc ccctttgcat gaaccccagg atctctaaag gcagaccctt 1380tgctatggta
ccatctggag ctttcctatg tacaaagcca accattgatc catcactgaa 1440gtcaaagagc
ctggttactc aggaggacaa tggatctgcc tccacctcac ctcaagattt 1500catagaaccc
tttggctcct tgtctttgaa catgacagac ctgtctggaa ataaggccga 1560catggtctgt
agtatccaaa agccatcaag gacatcacca actgcattca ctgaagaaaa 1620tgactacatc
atgctaaatg cgtcattttc cacaaatctt gtgtgcagtg tagattataa 1680tcacatccag
ccagtgtggc aacttctggc tttatacagt gactctcctc tgatactaga 1740aaggaagccc
cagcttaccg agactccttc actgtcttct agatataaac aggtggctct 1800taggcctgaa
gacattttta ccagcataga ggctgatgtc agagcagacc ctttttggtt 1860ccaacaagaa
aaaattgtct tgcagctgaa cagaactgcc accacactta gcacattaca 1920gatccagttt
tccactgatg ctcaaatcgc tttaccaagg gcggagatga gagcggagag 1980actcaaatgg
accatgatcc tgatgatgaa caatcccaaa ctggaacgca ctgtcctggt 2040tggcggcact
attgccctga gctgtccagg caaaggcgac ccttcacctc acttggaatg 2100gcttctagct
gatgggagta aagtgagagc cccttacgtt agcgaggatg ggcgaatcct 2160aatagacaaa
aatgggaagt tggaactgca gatggctgac agctttgatg caggtcttta 2220ccactgcata
agcaccaatg atgcagatgc ggatgttctc acatacagga taactgtggt 2280agagccctat
ggagaaagca cacatgacag tggagtccag cacacagtgg ttacgggtga 2340gacgctcgac
cttccatgcc tttccacggg tgttccagat gcttctatta gctggattct 2400tccagggaac
actgtgttct ctcagccatc aagagacagg caaattctta acaatgggac 2460cttaagaata
ttacaggtta cgccaaaaga tcaaggtcat taccaatgtg tggctgccaa 2520cccatcaggg
gccgactttt ccagttttaa agtttcagtt caaaagaaag gccaaaggat 2580ggttgagcat
gacagggagg caggtggatc tggacttgga gaacccaact ccagtgtttc 2640ccttaagcag
ccagcatctt tgaaactctc tgcatcagct ttgacagggt cagaggctgg 2700aaaacaagtc
tccggtgtac ataggaagaa caaacataga gacttaatac atcggcggcg 2760tggggattcc
acgctccggc gattcaggga gcataggagg cagctccctc tctctgctcg 2820gagaattgac
ccgcaacgct gggcagcact tctagaaaaa gccaaaaaga attctgtgcc 2880aaaaaagcaa
gaaaatacca cagtaaagcc agtgccactg gctgttcccc tcgtggaact 2940cactgacgag
gaaaaggatg cctctggcat gattcctcca gatgaagaat tcatggttct 3000gaaaactaag
gcttctggtg tcccaggaag gtcaccaact gctgactctg gaccagtaaa 3060tcatggtttt
atgacgagta tagcttctgg cacagaagtc tcaactgtga atccacaaac 3120actacaatct
gagcaccttc ctgatttcaa attatttagt gtaacaaacg gtacagctgt 3180gacaaagagt
atgaacccat ccatagcaag caaaatagaa gatacaacca accaaaaccc 3240aatcattatc
tttccatcag tagctgaaat tcgagattct gctcaggcag gaagagcatc 3300ttcccaaagt
gcacaccctg taacaggggg aaacatggct acctatggcc ataccaacac 3360atatagtagc
tttaccagca aagccagtac agtcttgcag ccaataaatc caacagaaag 3420ttatggacct
cagataccta ttacaggagt cagcagacct agcagtagtg acatctcttc 3480tcacactact
gcagacccta gcttctccag tcacccttca ggttcacaca ccactgcctc 3540gtctttattt
cacattccta gaaacaacaa tacaggtaac ttccccttgt ccaggcactt 3600gggaagagag
aggacaattt ggagcagagg gagagttaaa aacccacata gaaccccagt 3660tctccgacgg
catagacaca ggactgtgag gccagcaatc aagggacctg ctaacaaaaa 3720tgtgagccaa
gttccagcca cagagtaccc tgggatgtgc cacacatgtc cttccgcaga 3780ggggctcaca
gtggctactg cagcactgtc agttccaagt tcatcccaca gtgccctccc 3840caaaactaat
aatgttgggg tcatagcaga agagtctacc actgtggtca agaaaccact 3900gttactattt
aaggacaaac aaaatgtaga tattgagata ataacaacca ctacaaaata 3960ttccggaggg
gaaagtaacc acgtgattcc tacggaagca agcatgactt ctgctccaac 4020atctgtatcc
ctggggaaat ctcctgtaga caatagtggt cacctgagca tgcctgggac 4080catccaaact
gggaaagatt cagtggaaac aacaccactt cccagccccc tcagcacacc 4140ctcaatacca
acaagcacaa aattctcaaa gaggaaaact cccttgcacc agatctttgt 4200aaataaccag
aagaaggagg ggatgttaaa gaatccatat caattcggtt tacaaaagaa 4260cccagccgca
aagcttccca aaatagctcc tcttttaccc acaggtcaga gttccccctc 4320agattctaca
actctcttga caagtccgcc accagctctg tctacaacaa tggctgccac 4380tcagaacaag
ggcactgaag tagtatcagg tgccagaagt ctctcagcag ggaagaagca 4440gcccttcacc
aactcctctc cagtgcttcc tagcaccata agcaagagat ctaatacatt 4500aaacttcttg
tcaacggaaa cccccacagt gacaagtcct actgctactg catctgtcat 4560tatgtctgaa
acccaacgaa caagatccaa agaagcaaaa gaccaaataa aggggcctcg 4620gaagaacaga
aacaacgcaa acaccacccc caggcaggtt tctggctata gtgcatactc 4680agctctaaca
acagctgata cccccttggc tttcagtcat tccccacgac aagatgatgg 4740tggaaatgta
agtgcagttg cttatcactc aacaacctct cttctggcca taactgaact 4800gtttgagaag
tacacccaga ctttgggaaa tacaacagct ttggaaacaa cgttgttgag 4860caaatcacag
gagagtacca cagtgaaaag agcctcagac acaccaccac cactcctcag 4920cagtggggcg
cccccagtgc ccactccttc cccacctcct tttactaagg gtgtggttac 4980agacagcaaa
gtcacatcag ctttccagat gacgtcaaat agagtggtca ccatatatga 5040atcttcaagg
cacaatacag atctgcagca accctcagca gaggctagcc ccaatcctga 5100gatcataact
ggaaccactg actctccctc taatctgttt ccatccactt ctgtgccagc 5160actaagggta
gataaaccac agaattctaa atggaagccc tctccctggc cagaacacaa 5220atatcagctc
aagtcatact ccgaaaccat tgagaagggc aaaaggccag cagtaagcat 5280gtccccccac
ctcagccttc cagaggccag cactcatgcc tcacactgga atacacagaa 5340gcatgcagaa
aagagtgttt ttgataagaa acctggtcaa aacccaactt ccaaacatct 5400gccttacgtc
tctctaccta agactctatt gaaaaagcca agaataattg gaggaaaggc 5460tgcaagcttt
acagttccag ctaattcaga cgtttttctt ccttgtgagg ctgttggaga 5520cccactgccc
atcatccact ggaccagagt ttcatcagga cttgaaatat cccaagggac 5580acagaaaagc
cggttccacg tgcttcccaa tggcaccttg tccatccaga gggtcagtat 5640tcaggaccgt
ggacagtacc tgtgctctgc atttaatcca ctgggcgtag accattttca 5700tgtctctttg
tctgtggttt tttacccggc aaggattttg gacagacatg tcaaggagat 5760cacagttcac
tttggaagta ctgtggaact aaagtgcaga gtggagggta tgccgaggcc 5820tacggtttcc
tggatacttg caaaccaaac ggtggtctca gaaacggcca agggaagcag 5880aaaggtctgg
gtaacacctg atggaacatt gatcatctat aatctgagtc tttatgatcg 5940tggtttttac
aagtgtgtgg ccagcaaccc atctggccag gattcactgt tggttaagat 6000acaagtcatc
acagctcccc ctgtcattat agagcaaaag aggcaagcca tcgttggggt 6060tttaggtgga
agtttgaaac tgccctgcac tgcaaaagga actccccagc ctagtgttca 6120ctgggtcctt
tatgatggga ctgaactaaa accattgcag ttgactcatt ccagattttt 6180cttgtatcca
aatggaactc tgtatataag aagcatcgct ccttcagtga ggggcactta 6240tgagtgcatt
gccaccagct cctcaggctc agagagaagg gtagtgattc ttactgtgga 6300agagggagag
acaatcccca ggatagaaac tgcctctcag aaatggactg aggtgaattt 6360gggtgagaaa
ttactactga actgctcagc tactggggat ccaaagccta gaataatctg 6420gaggctgcca
tccaaggctg tcatcgacca gtggcacaga atgggcagcc gaatccacgt 6480ctacccaaat
ggatccttgg tggttgggtc agtgacggaa aaagacgctg gtgactactt 6540atgtgtggca
agaaacaaaa tgggagatga cctagtcctg atgcatgtcc gcctgagatt 6600gacacctgcc
aaaattgaac agaagcagta ttttaagaag caagtgctcc atgggaaaga 6660tttccaagtt
gactgcaagg cctctggctc ccctgtgcct gaggtatcct ggagtttgcc 6720tgatgggaca
gtgctcaaca atgtagccca agctgatgac agtggctata ggaccaagag 6780gtacaccctt
ttccacaatg gaaccttgta tttcaacaac gttgggatgg cagaggaagg 6840agattatatc
tgctctgccc agaacacctt agggaaagat gagatgaaag tccacctaac 6900agttctaaca
gccatcccac ggataaggca aagctacaag accaccatga ggctcagggc 6960tggagaaaca
gctgtccttg actgcgaggt cactggggaa ccgaagccca atgtattttg 7020gttgctgcct
tccaacaatg tcatttcatt ctccaatgac aggttcacat ttcatgccaa 7080tagaactttg
tccatccata aagtgaaacc acttgactct ggggactatg tgtgcgtagc 7140tcagaatcct
agtggggatg acactaagac atacaaactg gacattgtct ctaaacctcc 7200attaatcaat
ggcctgtatg caaacaagac tgttattaaa gccacagcca ttcggcactc 7260caaaaaatac
tttgactgca gagcagatgg gatcccatct tcccaggtca cgtggattat 7320gccaggcaat
attttcctcc cagctccata ctttggaagc agagtcacgg tccatccaaa 7380tggaaccttg
gagatgagga acatccggct ttctgactct gcggacttca cctgtgtggt 7440tcggagcgag
ggaggagaga gtgtgttggt agtgcagtta gaagtcctag aaatgctgag 7500aagaccaaca
ttcagaaacc cattcaacga aaaagtcatc gcccaagctg gcaagcccgt 7560agcactgaac
tgctctgtgg atgggaaccc cccacctgaa attacctgga tcttacctga 7620cggcacacag
tttgctaaca gaccacacaa ttccccgtat ctgatggcag gcaatggctc 7680tctcatcctt
tacaaagcaa ctcggaacaa gtcagggaag tatcgctgtg cagccaggaa 7740taaggttggc
tacatcgaga aactcatcct gttagagatt gggcagaagc cagtcattct 7800gacatacgaa
ccagggatgg tgaagagcgt cagtggggaa ccgttatcac tgcattgtgt 7860gtctgatggg
atccccaagc caaatgtcaa gtggactaca ccgggtggcc atgtaatcga 7920caggcctcaa
gtggatggaa aatacatact gcatgaaaat ggcacgctgg tcatcaaagc 7980aacaacagct
cacgaccaag gaaattatat ctgtagggct caaaacagtg ttggccaggc 8040agttattagc
gtgtcagtga tggttgtggc ctaccctccc cgaatcataa actacctacc 8100caggaacatg
ctcaggagga caggggaagc catgcagctc cactgtgtgg ccttgggaat 8160ccccaagcca
aaagtcacct gggagacgcc aagacactcc ctgctctcaa aagcaacagc 8220aagaaaaccc
catagaagtg agatgcttca cccacaaggt acgctggtca ttcagaatct 8280ccaaacctcg
gattccggag tctataagtg cagagctcag aacctacttg ggactgatta 8340cgcaacaact
tacatccagg tactctgaca ggaaggggga gactaaaatt caacagaagt 8400ccacatccac
agggtttatt ttttggaaga agtttaatca aaggcagcca taggcatgta 8460aatgagtctg
aatacattta cagtattaaa tttacaatgg acatgcgatg agacttgtaa 8520atgaaagcat
tgtgaactga aaccgagtct ctgtggatct caaagcaaac tcttaactta 8580aggcactttg
attttgccaa caaataataa caaacattaa gagaaaaaaa tgatccacta 8640cgaaataaca
aacggctaat gcacctgaat tctcagtaaa aagacctttc tctcgctaac 8700agttgccagc
tgcctcgtgt ctgtttccta ccaatgtcac aaacatcgca cacagggtga 8760atggagtcaa
cgggaaagat taagtttgcg gtctgtgtaa atctcaatgt acaaatattc 8820tgtcnctggt
ttataaacat tttgataaaa ccgaaaaaaa aaaaaaaaaa aaaaaaaaaa 8880aaa
8883
SEQ ID NO: 34; length 2597; PRT; Rattus rattus
Met Gln Val Arg Gly Arg Glu Val Ser Gly Leu
Leu Ile Ser Leu Thr1 5 10
15Ala Val Cys Leu Val Val Thr Pro Gly Ser Arg Ala Cys Pro Arg Arg
20 25 30Cys Ala Cys Tyr Val Pro Thr
Glu Val His Cys Thr Phe Arg Tyr Leu 35 40
45Thr Ser Ile Pro Asp Gly Ile Pro Ala Asn Val Glu Arg Ile Asn
Leu 50 55 60Gly Tyr Asn Ser Leu Thr
Arg Leu Thr Glu Asn Asp Phe Asp Gly Leu65 70
75 80Ser Lys Leu Glu Leu Leu Met Leu His Ser Asn
Gly Ile His Arg Val 85 90
95Ser Asp Lys Thr Phe Ser Gly Leu Gln Ser Leu Gln Val Leu Lys Met
100 105 110Ser Tyr Asn Lys Val Gln
Ile Ile Arg Lys Asp Thr Phe Tyr Gly Leu 115 120
125Gly Ser Leu Val Arg Leu His Leu Asp His Asn Asn Ile Glu
Phe Ile 130 135 140Asn Pro Glu Ala Phe
Tyr Gly Leu Thr Ser Leu Arg Leu Val His Leu145 150
155 160Glu Gly Asn Arg Leu Thr Lys Leu His Pro
Asp Thr Phe Val Ser Leu 165 170
175Ser Tyr Leu Gln Ile Phe Lys Thr Ser Phe Ile Lys Tyr Leu Phe Leu
180 185 190Ser Asp Asn Phe Leu
Thr Ser Leu Pro Lys Glu Met Val Ser Tyr Met 195
200 205Pro Asn Leu Glu Ser Leu Tyr Leu His Gly Asn Pro
Trp Thr Cys Asp 210 215 220Cys His Leu
Lys Trp Leu Ser Glu Trp Met Gln Gly Asn Pro Asp Ile225
230 235 240Ile Lys Cys Lys Lys Asp Arg
Ser Ser Ser Ser Pro Gln Gln Cys Pro 245
250 255Leu Cys Met Asn Pro Arg Ile Ser Lys Gly Arg Pro
Phe Ala Met Val 260 265 270Pro
Ser Gly Ala Phe Leu Cys Thr Lys Pro Thr Ile Asp Pro Ser Leu 275
280 285Lys Ser Lys Ser Leu Val Thr Gln Glu
Asp Asn Gly Ser Ala Ser Thr 290 295
300Ser Pro Gln Asp Phe Ile Glu Pro Phe Gly Ser Leu Ser Leu Asn Met305
310 315 320Thr Asp Leu Ser
Gly Asn Lys Ala Asp Met Val Cys Ser Ile Gln Lys 325
330 335Pro Ser Arg Thr Ser Pro Thr Ala Phe Thr
Glu Glu Asn Asp Tyr Ile 340 345
350Met Leu Asn Ala Ser Phe Ser Thr Asn Leu Val Cys Ser Val Asp Tyr
355 360 365Asn His Ile Gln Pro Val Trp
Gln Leu Leu Ala Leu Tyr Ser Asp Ser 370 375
380Pro Leu Ile Leu Glu Arg Lys Pro Gln Leu Thr Glu Thr Pro Ser
Leu385 390 395 400Ser Ser
Arg Tyr Lys Gln Val Ala Leu Arg Pro Glu Asp Ile Phe Thr
405 410 415Ser Ile Glu Ala Asp Val Arg
Ala Asp Pro Phe Trp Phe Gln Gln Glu 420 425
430Lys Ile Val Leu Gln Leu Asn Arg Thr Ala Thr Thr Leu Ser
Thr Leu 435 440 445Gln Ile Gln Phe
Ser Thr Asp Ala Gln Ile Ala Leu Pro Arg Ala Glu 450
455 460Met Arg Ala Glu Arg Leu Lys Trp Thr Met Ile Leu
Met Met Asn Asn465 470 475
480Pro Lys Leu Glu Arg Thr Val Leu Val Gly Gly Thr Ile Ala Leu Ser
485 490 495Cys Pro Gly Lys Gly
Asp Pro Ser Pro His Leu Glu Trp Leu Leu Ala 500
505 510Asp Gly Ser Lys Val Arg Ala Pro Tyr Val Ser Glu
Asp Gly Arg Ile 515 520 525Leu Ile
Asp Lys Asn Gly Lys Leu Glu Leu Gln Met Ala Asp Ser Phe 530
535 540Asp Ala Gly Leu Tyr His Cys Ile Ser Thr Asn
Asp Ala Asp Ala Asp545 550 555
560Val Leu Thr Tyr Arg Ile Thr Val Val Glu Pro Tyr Gly Glu Ser Thr
565 570 575His Asp Ser Gly
Val Gln His Thr Val Val Thr Gly Glu Thr Leu Asp 580
585 590Leu Pro Cys Leu Ser Thr Gly Val Pro Asp Ala
Ser Ile Ser Trp Ile 595 600 605Leu
Pro Gly Asn Thr Val Phe Ser Gln Pro Ser Arg Asp Arg Gln Ile 610
615 620Leu Asn Asn Gly Thr Leu Arg Ile Leu Gln
Val Thr Pro Lys Asp Gln625 630 635
640Gly His Tyr Gln Cys Val Ala Ala Asn Pro Ser Gly Ala Asp Phe
Ser 645 650 655Ser Phe Lys
Val Ser Val Gln Lys Lys Gly Gln Arg Met Val Glu His 660
665 670Asp Arg Glu Ala Gly Gly Ser Gly Leu Gly
Glu Pro Asn Ser Ser Val 675 680
685Ser Leu Lys Gln Pro Ala Ser Leu Lys Leu Ser Ala Ser Ala Leu Thr 690
695 700Gly Ser Glu Ala Gly Lys Gln Val
Ser Gly Val His Arg Lys Asn Lys705 710
715 720His Arg Asp Leu Ile His Arg Arg Arg Gly Asp Ser
Thr Leu Arg Arg 725 730
735Phe Arg Glu His Arg Arg Gln Leu Pro Leu Ser Ala Arg Arg Ile Asp
740 745 750Pro Gln Arg Trp Ala Ala
Leu Leu Glu Lys Ala Lys Lys Asn Ser Val 755 760
765Pro Lys Lys Gln Glu Asn Thr Thr Val Lys Pro Val Pro Leu
Ala Val 770 775 780Pro Leu Val Glu Leu
Thr Asp Glu Glu Lys Asp Ala Ser Gly Met Ile785 790
795 800Pro Pro Asp Glu Glu Phe Met Val Leu Lys
Thr Lys Ala Ser Gly Val 805 810
815Pro Gly Arg Ser Pro Thr Ala Asp Ser Gly Pro Val Asn His Gly Phe
820 825 830Met Thr Ser Ile Ala
Ser Gly Thr Glu Val Ser Thr Val Asn Pro Gln 835
840 845Thr Leu Gln Ser Glu His Leu Pro Asp Phe Lys Leu
Phe Ser Val Thr 850 855 860Asn Gly Thr
Ala Val Thr Lys Ser Met Asn Pro Ser Ile Ala Ser Lys865
870 875 880Ile Glu Asp Thr Thr Asn Gln
Asn Pro Ile Ile Ile Phe Pro Ser Val 885
890 895Ala Glu Ile Arg Asp Ser Ala Gln Ala Gly Arg Ala
Ser Ser Gln Ser 900 905 910Ala
His Pro Val Thr Gly Gly Asn Met Ala Thr Tyr Gly His Thr Asn 915
920 925Thr Tyr Ser Ser Phe Thr Ser Lys Ala
Ser Thr Val Leu Gln Pro Ile 930 935
940Asn Pro Thr Glu Ser Tyr Gly Pro Gln Ile Pro Ile Thr Gly Val Ser945
950 955 960Arg Pro Ser Ser
Ser Asp Ile Ser Ser His Thr Thr Ala Asp Pro Ser 965
970 975Phe Ser Ser His Pro Ser Gly Ser His Thr
Thr Ala Ser Ser Leu Phe 980 985
990His Ile Pro Arg Asn Asn Asn Thr Gly Asn Phe Pro Leu Ser Arg His
995 1000 1005Leu Gly Arg Glu Arg Thr
Ile Trp Ser Arg Gly Arg Val Lys Asn 1010 1015
1020Pro His Arg Thr Pro Val Leu Arg Arg His Arg His Arg Thr
Val 1025 1030 1035Arg Pro Ala Ile Lys
Gly Pro Ala Asn Lys Asn Val Ser Gln Val 1040 1045
1050Pro Ala Thr Glu Tyr Pro Gly Met Cys His Thr Cys Pro
Ser Ala 1055 1060 1065Glu Gly Leu Thr
Val Ala Thr Ala Ala Leu Ser Val Pro Ser Ser 1070
1075 1080Ser His Ser Ala Leu Pro Lys Thr Asn Asn Val
Gly Val Ile Ala 1085 1090 1095Glu Glu
Ser Thr Thr Val Val Lys Lys Pro Leu Leu Leu Phe Lys 1100
1105 1110Asp Lys Gln Asn Val Asp Ile Glu Ile Ile
Thr Thr Thr Thr Lys 1115 1120 1125Tyr
Ser Gly Gly Glu Ser Asn His Val Ile Pro Thr Glu Ala Ser 1130
1135 1140Met Thr Ser Ala Pro Thr Ser Val Ser
Leu Gly Lys Ser Pro Val 1145 1150
1155Asp Asn Ser Gly His Leu Ser Met Pro Gly Thr Ile Gln Thr Gly
1160 1165 1170Lys Asp Ser Val Glu Thr
Thr Pro Leu Pro Ser Pro Leu Ser Thr 1175 1180
1185Pro Ser Ile Pro Thr Ser Thr Lys Phe Ser Lys Arg Lys Thr
Pro 1190 1195 1200Leu His Gln Ile Phe
Val Asn Asn Gln Lys Lys Glu Gly Met Leu 1205 1210
1215Lys Asn Pro Tyr Gln Phe Gly Leu Gln Lys Asn Pro Ala
Ala Lys 1220 1225 1230Leu Pro Lys Ile
Ala Pro Leu Leu Pro Thr Gly Gln Ser Ser Pro 1235
1240 1245Ser Asp Ser Thr Thr Leu Leu Thr Ser Pro Pro
Pro Ala Leu Ser 1250 1255 1260Thr Thr
Met Ala Ala Thr Gln Asn Lys Gly Thr Glu Val Val Ser 1265
1270 1275Gly Ala Arg Ser Leu Ser Ala Gly Lys Lys
Gln Pro Phe Thr Asn 1280 1285 1290Ser
Ser Pro Val Leu Pro Ser Thr Ile Ser Lys Arg Ser Asn Thr 1295
1300 1305Leu Asn Phe Leu Ser Thr Glu Thr Pro
Thr Val Thr Ser Pro Thr 1310 1315
1320Ala Thr Ala Ser Val Ile Met Ser Glu Thr Gln Arg Thr Arg Ser
1325 1330 1335Lys Glu Ala Lys Asp Gln
Ile Lys Gly Pro Arg Lys Asn Arg Asn 1340 1345
1350Asn Ala Asn Thr Thr Pro Arg Gln Val Ser Gly Tyr Ser Ala
Tyr 1355 1360 1365Ser Ala Leu Thr Thr
Ala Asp Thr Pro Leu Ala Phe Ser His Ser 1370 1375
1380Pro Arg Gln Asp Asp Gly Gly Asn Val Ser Ala Val Ala
Tyr His 1385 1390 1395Ser Thr Thr Ser
Leu Leu Ala Ile Thr Glu Leu Phe Glu Lys Tyr 1400
1405 1410Thr Gln Thr Leu Gly Asn Thr Thr Ala Leu Glu
Thr Thr Leu Leu 1415 1420 1425Ser Lys
Ser Gln Glu Ser Thr Thr Val Lys Arg Ala Ser Asp Thr 1430
1435 1440Pro Pro Pro Leu Leu Ser Ser Gly Ala Pro
Pro Val Pro Thr Pro 1445 1450 1455Ser
Pro Pro Pro Phe Thr Lys Gly Val Val Thr Asp Ser Lys Val 1460
1465 1470Thr Ser Ala Phe Gln Met Thr Ser Asn
Arg Val Val Thr Ile Tyr 1475 1480
1485Glu Ser Ser Arg His Asn Thr Asp Leu Gln Gln Pro Ser Ala Glu
1490 1495 1500Ala Ser Pro Asn Pro Glu
Ile Ile Thr Gly Thr Thr Asp Ser Pro 1505 1510
1515Ser Asn Leu Phe Pro Ser Thr Ser Val Pro Ala Leu Arg Val
Asp 1520 1525 1530Lys Pro Gln Asn Ser
Lys Trp Lys Pro Ser Pro Trp Pro Glu His 1535 1540
1545Lys Tyr Gln Leu Lys Ser Tyr Ser Glu Thr Ile Glu Lys
Gly Lys 1550 1555 1560Arg Pro Ala Val
Ser Met Ser Pro His Leu Ser Leu Pro Glu Ala 1565
1570 1575Ser Thr His Ala Ser His Trp Asn Thr Gln Lys
His Ala Glu Lys 1580 1585 1590Ser Val
Phe Asp Lys Lys Pro Gly Gln Asn Pro Thr Ser Lys His 1595
1600 1605Leu Pro Tyr Val Ser Leu Pro Lys Thr Leu
Leu Lys Lys Pro Arg 1610 1615 1620Ile
Ile Gly Gly Lys Ala Ala Ser Phe Thr Val Pro Ala Asn Ser 1625
1630 1635Asp Val Phe Leu Pro Cys Glu Ala Val
Gly Asp Pro Leu Pro Ile 1640 1645
1650Ile His Trp Thr Arg Val Ser Ser Gly Leu Glu Ile Ser Gln Gly
1655 1660 1665Thr Gln Lys Ser Arg Phe
His Val Leu Pro Asn Gly Thr Leu Ser 1670 1675
1680Ile Gln Arg Val Ser Ile Gln Asp Arg Gly Gln Tyr Leu Cys
Ser 1685 1690 1695Ala Phe Asn Pro Leu
Gly Val Asp His Phe His Val Ser Leu Ser 1700 1705
1710Val Val Phe Tyr Pro Ala Arg Ile Leu Asp Arg His Val
Lys Glu 1715 1720 1725Ile Thr Val His
Phe Gly Ser Thr Val Glu Leu Lys Cys Arg Val 1730
1735 1740Glu Gly Met Pro Arg Pro Thr Val Ser Trp Ile
Leu Ala Asn Gln 1745 1750 1755Thr Val
Val Ser Glu Thr Ala Lys Gly Ser Arg Lys Val Trp Val 1760
1765 1770Thr Pro Asp Gly Thr Leu Ile Ile Tyr Asn
Leu Ser Leu Tyr Asp 1775 1780 1785Arg
Gly Phe Tyr Lys Cys Val Ala Ser Asn Pro Ser Gly Gln Asp 1790
1795 1800Ser Leu Leu Val Lys Ile Gln Val Ile
Thr Ala Pro Pro Val Ile 1805 1810
1815Ile Glu Gln Lys Arg Gln Ala Ile Val Gly Val Leu Gly Gly Ser
1820 1825 1830Leu Lys Leu Pro Cys Thr
Ala Lys Gly Thr Pro Gln Pro Ser Val 1835 1840
1845His Trp Val Leu Tyr Asp Gly Thr Glu Leu Lys Pro Leu Gln
Leu 1850 1855 1860Thr His Ser Arg Phe
Phe Leu Tyr Pro Asn Gly Thr Leu Tyr Ile 1865 1870
1875Arg Ser Ile Ala Pro Ser Val Arg Gly Thr Tyr Glu Cys
Ile Ala 1880 1885 1890Thr Ser Ser Ser
Gly Ser Glu Arg Arg Val Val Ile Leu Thr Val 1895
1900 1905Glu Glu Gly Glu Thr Ile Pro Arg Ile Glu Thr
Ala Ser Gln Lys 1910 1915 1920Trp Thr
Glu Val Asn Leu Gly Glu Lys Leu Leu Leu Asn Cys Ser 1925
1930 1935Ala Thr Gly Asp Pro Lys Pro Arg Ile Ile
Trp Arg Leu Pro Ser 1940 1945 1950Lys
Ala Val Ile Asp Gln Trp His Arg Met Gly Ser Arg Ile His 1955
1960 1965Val Tyr Pro Asn Gly Ser Leu Val Val
Gly Ser Val Thr Glu Lys 1970 1975
1980Asp Ala Gly Asp Tyr Leu Cys Val Ala Arg Asn Lys Met Gly Asp
1985 1990 1995Asp Leu Val Leu Met His
Val Arg Leu Arg Leu Thr Pro Ala Lys 2000 2005
2010Ile Glu Gln Lys Gln Tyr Phe Lys Lys Gln Val Leu His Gly
Lys 2015 2020 2025Asp Phe Gln Val Asp
Cys Lys Ala Ser Gly Ser Pro Val Pro Glu 2030 2035
2040Val Ser Trp Ser Leu Pro Asp Gly Thr Val Leu Asn Asn
Val Ala 2045 2050 2055Gln Ala Asp Asp
Ser Gly Tyr Arg Thr Lys Arg Tyr Thr Leu Phe 2060
2065 2070His Asn Gly Thr Leu Tyr Phe Asn Asn Val Gly
Met Ala Glu Glu 2075 2080 2085Gly Asp
Tyr Ile Cys Ser Ala Gln Asn Thr Leu Gly Lys Asp Glu 2090
2095 2100Met Lys Val His Leu Thr Val Leu Thr Ala
Ile Pro Arg Ile Arg 2105 2110 2115Gln
Ser Tyr Lys Thr Thr Met Arg Leu Arg Ala Gly Glu Thr Ala 2120
2125 2130Val Leu Asp Cys Glu Val Thr Gly Glu
Pro Lys Pro Asn Val Phe 2135 2140
2145Trp Leu Leu Pro Ser Asn Asn Val Ile Ser Phe Ser Asn Asp Arg
2150 2155 2160Phe Thr Phe His Ala Asn
Arg Thr Leu Ser Ile His Lys Val Lys 2165 2170
2175Pro Leu Asp Ser Gly Asp Tyr Val Cys Val Ala Gln Asn Pro
Ser 2180 2185 2190Gly Asp Asp Thr Lys
Thr Tyr Lys Leu Asp Ile Val Ser Lys Pro 2195 2200
2205Pro Leu Ile Asn Gly Leu Tyr Ala Asn Lys Thr Val Ile
Lys Ala 2210 2215 2220Thr Ala Ile Arg
His Ser Lys Lys Tyr Phe Asp Cys Arg Ala Asp 2225
2230 2235Gly Ile Pro Ser Ser Gln Val Thr Trp Ile Met
Pro Gly Asn Ile 2240 2245 2250Phe Leu
Pro Ala Pro Tyr Phe Gly Ser Arg Val Thr Val His Pro 2255
2260 2265Asn Gly Thr Leu Glu Met Arg Asn Ile Arg
Leu Ser Asp Ser Ala 2270 2275 2280Asp
Phe Thr Cys Val Val Arg Ser Glu Gly Gly Glu Ser Val Leu 2285
2290 2295Val Val Gln Leu Glu Val Leu Glu Met
Leu Arg Arg Pro Thr Phe 2300 2305
2310Arg Asn Pro Phe Asn Glu Lys Val Ile Ala Gln Ala Gly Lys Pro
2315 2320 2325Val Ala Leu Asn Cys Ser
Val Asp Gly Asn Pro Pro Pro Glu Ile 2330 2335
2340Thr Trp Ile Leu Pro Asp Gly Thr Gln Phe Ala Asn Arg Pro
His 2345 2350 2355Asn Ser Pro Tyr Leu
Met Ala Gly Asn Gly Ser Leu Ile Leu Tyr 2360 2365
2370Lys Ala Thr Arg Asn Lys Ser Gly Lys Tyr Arg Cys Ala
Ala Arg 2375 2380 2385Asn Lys Val Gly
Tyr Ile Glu Lys Leu Ile Leu Leu Glu Ile Gly 2390
2395 2400Gln Lys Pro Val Ile Leu Thr Tyr Glu Pro Gly
Met Val Lys Ser 2405 2410 2415Val Ser
Gly Glu Pro Leu Ser Leu His Cys Val Ser Asp Gly Ile 2420
2425 2430Pro Lys Pro Asn Val Lys Trp Thr Thr Pro
Gly Gly His Val Ile 2435 2440 2445Asp
Arg Pro Gln Val Asp Gly Lys Tyr Ile Leu His Glu Asn Gly 2450
2455 2460Thr Leu Val Ile Lys Ala Thr Thr Ala
His Asp Gln Gly Asn Tyr 2465 2470
2475Ile Cys Arg Ala Gln Asn Ser Val Gly Gln Ala Val Ile Ser Val
2480 2485 2490Ser Val Met Val Val Ala
Tyr Pro Pro Arg Ile Ile Asn Tyr Leu 2495 2500
2505Pro Arg Asn Met Leu Arg Arg Thr Gly Glu Ala Met Gln Leu
His 2510 2515 2520Cys Val Ala Leu Gly
Ile Pro Lys Pro Lys Val Thr Trp Glu Thr 2525 2530
2535Pro Arg His Ser Leu Leu Ser Lys Ala Thr Ala Arg Lys
Pro His 2540 2545 2550Arg Ser Glu Met
Leu His Pro Gln Gly Thr Leu Val Ile Gln Asn 2555
2560 2565Leu Gln Thr Ser Asp Ser Gly Val Tyr Lys Cys
Arg Ala Gln Asn 2570 2575 2580Leu Leu
Gly Thr Asp Tyr Ala Thr Thr Tyr Ile Gln Val Leu 2585
2590 2595
SEQ ID NO: 35; length 21; DNA; Artificial Sequence; primer
gcactgaact gctctgtgga t 21
SEQ ID NO: 36; length 23; DNA; Artificial Sequence; primer
ccacagaagt aaggttcctt cac 23
SEQ ID NO: 37; length 6; PRT; Homo sapiens
Lys Cys Lys Lys Asp Arg
1 5