Patent application title: HUMAN SKELETAL MUSCLE-SPECIFIC UBIQUITIN-CONJUGATING ENZYME
Inventors:
Tsutomu Fujiwara (Naruto-Shi, JP)
Takeshi Wantanabe (Itano-Gun, JP)
Masato Horie (Tokushima-Shi, JP)
Assignees:
OTSUKA PHARMACEUTICAL CO., LTD.
IPC8 Class: AC07K1618FI
USPC Class:
5303879
Class name: Globulins immunoglobulin, antibody, or fragment thereof, other than immunoglobulin antibody, or fragment thereof that is conjugated or adsorbed binds specifically-identified amino acid sequence
Publication date: 2010-06-24
Patent application number: 20100160611
Claims:
1. An isolated antibody which binds to a polypeptide comprising the amino acid sequence of SEQ ID No.37.
2. The isolated antibody according to claim 1, wherein the antibody is a polyclonal antibody or a monoclonal antibody.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001]This application is a Divisional of U.S. application Ser. No. 11/593,525 filed Nov. 7, 2006 (now allowed), which is a Divisional of U.S. application Ser. No. 10/781,841 filed Feb. 20, 2004 (now abandoned), which is a Divisional of U.S. application Ser. No. 10/342,276 filed Jan. 15, 2003 (now U.S. Pat. No. 7,420,048), which is a Divisional of U.S. application Ser. No. 09/976,165 filed Oct. 15, 2001 (now U.S. Pat. No. 6,562,947), which is a Divisional of U.S. application Ser. No. 09/565,538 filed May 5, 2000 (now U.S. Pat. No. 6,333,404), which is a Divisional of U.S. application Ser. No. 09/273,565 filed Mar. 22, 1999 (now U.S. Pat. No. 6,166,190), which is a Divisional of U.S. application Ser. No. 09/055,699 filed Apr. 7, 1998 (now U.S. Pat. No. 6,005,088), which in turn is a Divisional of U.S. application Ser. No. 08/820,170, filed Mar. 19, 1997 (now U.S. Pat. No. 5,831,058), the disclosure of each of which is incorporated herein by reference.
TECHNICAL FIELD
[0002]The present invention relates to a gene useful as an indicator in the prophylaxis, diagnosis and treatment of diseases in humans. More particularly, it relates to a novel human gene analogous to rat, mouse, yeast, nematode and known human genes, among others, and utilizable, after cDNA analysis thereof, chromosome mapping of cDNA and function analysis of cDNA, in gene diagnosis using said gene and in developing a novel therapeutic method.
BACKGROUND ART
[0003]The genetic information of a living thing has been accumulated as sequences (DNA) of four bases, namely A, C, G and T, which exist in cell nuclei. Said genetic information has been preserved for line preservation and ontogeny of each individual living thing.
[0004]In the case of human beings, the number of said bases is said to be about 3 billion (3×10⁹) and supposedly there are 50 to 100 thousand genes therein. Such genetic information serves to maintain biological phenomena in that regulatory proteins, structural proteins and enzymes are produced via a route in which mRNA is transcribed from a gene (DNA) and then translated into a protein. Abnormalities in said route from gene to protein translation are considered to be causative of abnormalities of life supporting systems, for example in cell proliferation and differentiation, hence causative of various diseases.
[0005]As a result of gene analyses so far made, a number of genes which may be expected to serve as useful materials in drug development, have been found, for example genes for various receptors such as insulin receptor and LDL receptor, genes involved in cell proliferation and differentiation and genes for metabolic enzymes such as proteases, ATPase and superoxide dismutases.
[0006]However, analysis of human genes and studies of the functions of the genes analyzed and of the relations between the genes analyzed and various diseases have only just begun and many points remain unknown. Further analysis of novel genes, analysis of the functions thereof, studies of the relations between the genes analyzed and diseases, and studies for applying the genes analyzed to gene diagnosis or for medicinal purposes, for instance, are therefore desired in the relevant art.
[0007]If such a novel human gene as mentioned above can be provided, it will be possible to analyze the level of expression thereof in each cell and the structure and function thereof and, through expression product analysis and other studies, it may become possible to reveal the pathogenesis of a disease associated therewith, for example a genopathy or cancer, or diagnose and treat said disease, for instance. It is an object of the present invention to provide such a novel human gene.
[0008]For attaining the above object, the present inventors made intensive investigations and obtained the findings mentioned below. Based thereon, the present invention has now been completed.
DISCLOSURE OF THE INVENTION
[0009]Thus, the present inventors synthesized cDNAs based on mRNAs extracted from various tissues, inclusive of human fetal brain, adult blood vessels and placenta, constructed libraries by inserting them into vectors, allowed colonies of Escherichia coli transformed with said libraries to form on agar medium, picked up colonies at random, transferred them to 96-well micro plates and registered a large number of human gene-containing E. coli clones.
[0010]Each clone thus registered was cultivated on a small scale, DNA was extracted and purified, the four base-specifically terminating extension reactions were carried out by the dideoxy chain terminator method using the DNA extracted as a template, and the base sequence of the gene was determined over about 400 bases from the 5' terminus thereof using an automatic DNA sequencer. Based on the thus-obtained base sequence information, a novel family gene analogous to known genes of various organisms such as bacteria, yeasts, nematodes, mice and humans was searched for.
[0011]The method of the above-mentioned cDNA analysis is described in detail in the literature by Fujiwara, one of the present inventors (Fujiwara, Tsutomu, Saibo Kogaku (Cell Engineering), 14:645-654 (1995)).
[0012]Among this group, there are novel receptors, DNA binding domain-containing transcription regulating factors, signal transmission system factors, metabolic enzymes and so forth. Based on the homology of the novel gene of the present invention as obtained by gene analysis to the genes analogous thereto, the product of the gene, hence the function of the protein, can approximately be estimated by analogy. Furthermore, such functions as enzyme activity and binding ability can be investigated by inserting the candidate gene into an expression vector to give a recombinant.
[0013]According to the present invention, there are provided a novel human gene characterized by containing a nucleotide sequence coding for an amino acid sequence defined by SEQ ID NO:1, SEQ ID NO:4, SEQ ID NO:7, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:16, SEQ ID NO:19, SEQ ID NO:22, SEQ ID NO:25, SEQ ID NO:28, SEQ ID NO:31, SEQ ID NO:34, SEQ ID NO:37 or SEQ ID NO:40, a human gene characterized by containing the nucleotide sequence defined by SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:14, SEQ ID NO:17, SEQ ID NO:20, SEQ ID NO:23, SEQ ID NO:26, SEQ ID NO:29, SEQ ID NO:32, SEQ ID NO:35, SEQ ID NO:38 or SEQ ID NO:41, respectively coding for the amino acid sequence mentioned above, and a novel human gene characterized by the nucleotide sequence defined by SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:9, SEQ ID NO:12, SEQ ID NO:15, SEQ ID NO:18, SEQ ID NO:21, SEQ ID NO:24, SEQ ID NO:27, SEQ ID NO:30, SEQ ID NO:33, SEQ ID NO:36, SEQ ID NO:39 or SEQ ID NO:42.
[0014]The symbols used herein for indicating amino acids, peptides, nucleotides, nucleotide sequences and so on are those recommended by IUPAC and IUB or in "Guideline for drafting specifications etc., including nucleotide sequences or amino acid sequences" (edited by the Japanese Patent Office), or those in conventional use in the relevant field of art.
[0015]As specific examples of such gene of the present invention, there may be mentioned genes deducible from the DNA sequences of the clones designated as "GEN-501D08", "GEN-080G01", "GEN-025F07", "GEN-076C09", "GEN-331G07", "GEN-163D09", "GEN-078D05TA13", "GEN-423A12", "GEN-092E10", "GEN-428B12", "GEN-073E07", "GEN-093E05" and "GEN-077A09" shown later herein in Examples 1 to 11. The respective nucleotide sequences are as shown in the sequence listing.
[0016]These clones each have an open reading frame comprising nucleotides (nucleic acid) coding for the respective amino acids shown in the sequence listing. Their molecular weights were calculated at the values shown later herein in the respective examples. Hereinafter, these human genes of the present invention are sometimes referred to by the designations used in Examples 1 to 11.
[0017]In the following, the human gene of the present invention is described in further detail.
[0018]As mentioned above, each human gene of the present invention is analogous to rat, mouse, yeast, nematode and known human genes, among others, and can be utilized in human gene analysis based on the information about the genes analogous thereto and in studying the function of the gene analyzed and the relation between the gene analyzed and a disease. It is possible to use said gene in gene diagnosis of the disease associated therewith and in exploitation studies of said gene for medicinal purposes.
[0019]The gene of the present invention is represented in terms of a single-stranded DNA sequence, as shown under SEQ ID NO:2. It is to be noted, however, that the present invention also includes a DNA sequence complementary to such a single-stranded DNA sequence and a component comprising both. The sequence of the gene of the present invention as shown under SEQ ID NO:3n-1 (where n is an integer of 1 to 14) is merely an example of the codon combination encoding the respective amino acid residues. The gene of the present invention is not limited thereto but can of course have a DNA sequence in which the codons are arbitrarily selected and combined for the respective amino acid residues. The codon selection can be made in the conventional manner, for example taking into consideration the codon utilization frequencies in the host to be used (Nucl. Acids Res., 9:43-74 (1981)).
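The numbering convention can be made explicit. The following is a minimal sketch, assuming only the pattern visible in the SEQ ID lists of paragraph [0013]: for the n-th gene, the amino acid sequence is SEQ ID NO:3n-2, the coding region is SEQ ID NO:3n-1 and the full cDNA sequence is SEQ ID NO:3n. The function name seq_ids and the dictionary keys are hypothetical and serve for illustration only.

```python
def seq_ids(n: int) -> dict:
    """Return the SEQ ID numbers associated with the n-th gene (1 <= n <= 14).

    Assumed pattern, read off the lists in paragraph [0013]:
      amino acid sequence -> 3n - 2
      coding region       -> 3n - 1
      full cDNA sequence  -> 3n
    """
    if not 1 <= n <= 14:
        raise ValueError("n must be an integer from 1 to 14")
    return {
        "amino_acid": 3 * n - 2,
        "coding_region": 3 * n - 1,
        "cdna": 3 * n,
    }


if __name__ == "__main__":
    for n in (1, 13, 14):
        print(n, seq_ids(n))
```

Under this pattern, n = 13 gives SEQ ID NO:37 for the amino acid sequence, which is the sequence recited in claim 1.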
[0020]The gene of the present invention further includes DNA sequences coding for functional equivalents derived from the amino acid sequence mentioned above by partial amino acid or amino acid sequence substitution, deletion or addition. These polypeptides may be produced by spontaneous modification (mutation) or may be obtained by posttranslational modification or by modifying the natural gene (of the present invention) by a technique of genetic engineering, for example by site-specific mutagenesis (Methods in Enzymology, 154:350, 367-382 (1987); ibid., 100:468 (1983); Nucleic Acids Research, 12:9441 (1984); Zoku Seikagaku Jikken Koza (Sequel to Experiments in Biochemistry) 1, "Idenshi Kenkyu-ho (Methods in Gene Research) II", edited by the Japan Biochemical Society, p. 105 (1986)) or synthesizing mutant DNAs by a chemical synthetic technique such as the phosphotriester method or phosphoramidite method (J. Am. Chem. Soc., 89:4801 (1967); ibid., 91:3350 (1969); Science, 150:178 (1968); Tetrahedron Lett., 22:1859 (1981); ibid., 24:245 (1983)), or by utilizing the techniques mentioned above in combination.
[0021]The protein encoded by the gene of the present invention can be expressed readily and stably by utilizing said gene, for example inserting it into a vector for use with a microorganism and cultivating the microorganism thus transformed.
[0022]The protein obtained by utilizing the gene of the present invention can be used in specific antibody production. In this case, the protein producible in large quantities by the genetic engineering technique mentioned above can be used as the component to serve as an antigen. The antibody obtained may be polyclonal or monoclonal and can be advantageously used in the purification, assay, discrimination or identification of the corresponding protein.
[0023]The gene of the present invention can be readily produced based on the sequence information thereof disclosed herein by using general genetic engineering techniques (cf., e.g., Molecular Cloning, 2nd Ed., Cold Spring Harbor Laboratory Press (1989); Zoku Seikagaku Jikken Koza, "Idenshi Kenkyu-ho I, II and III", edited by the Japan Biochemical Society (1986)).
[0024]This can be achieved, for example, by selecting a desired clone from a human cDNA library (prepared in the conventional manner from appropriate cells of origin in which the gene is expressed) using a probe or antibody specific to the gene of the present invention (e.g., Proc. Natl. Acad. Sci. USA, 78:6613 (1981); Science, 222:778 (1983)).
[0025]The cells of origin to be used in the above method are, for example, cells or tissues in which the gene in question is expressed, or cultured cells derived therefrom. Separation of total RNA, separation and purification of mRNA, conversion to (synthesis of) cDNA, cloning thereof and so on can be carried out by conventional methods. cDNA libraries are also commercially available and such cDNA libraries, for example various cDNA libraries available from Clontech Lab. Inc. can also be used in the above method.
[0026]Screening of the gene of the present invention from these cDNA libraries can be carried out by the conventional method mentioned above. These screening methods include, for example, the method comprising selecting a cDNA clone by immunological screening using an antibody specific to the protein produced by the corresponding cDNA, the technique of plaque or colony hybridization using probes selectively binding to the desired DNA sequence, or a combination of these. As regards the probe to be used here, a DNA sequence chemically synthesized based on the information about the DNA sequence of the present invention is generally used. It is of course possible to use the gene of the present invention or fragments thereof as the probe.
[0027]Furthermore, a sense primer and an antisense primer designed based on the information about the partial amino acid sequence of a natural extract isolated and purified from cells or a tissue can be used as probes for screening.
[0028]For obtaining the gene of the present invention, the technique of DNA/RNA amplification by the PCR method (Science, 230:1350-1354 (1985)) can suitably be employed. Particularly when the full-length cDNA can hardly be obtained from the library, the RACE method (rapid amplification of cDNA ends; Jikken Igaku (Experimental Medicine), 12(6):35-38 (1994)), in particular the 5'RACE method (Frohman, M. A., et al., Proc. Natl. Acad. Sci. USA, 85:8998-9002 (1988)), is preferably employed. The primers to be used in such PCR method can be appropriately designed based on the sequence information of the gene of the present invention as disclosed herein and can be synthesized by a conventional method.
[0029]The amplified DNA/RNA fragment can be isolated and purified by a conventional method as mentioned above, for example by gel electrophoresis.
[0030]The nucleotide sequence of the thus-obtained gene of the present invention or any of various DNA fragments can be determined by a conventional method, for example the dideoxy method (Proc. Natl. Acad. Sci. USA, 74:5463-5467 (1977)) or the Maxam-Gilbert method (Methods in Enzymology, 65:499 (1980)). Such nucleotide sequence determination can be readily performed using a commercially available sequence kit as well.
[0031]When the gene of the present invention is used and conventional techniques of recombinant DNA technology (see, e.g., Science, 224:1431 (1984); Biochem. Biophys. Res. Comm., 130:692 (1985); Proc. Natl. Acad. Sci. USA, 80:5990 (1983) and the references cited above) are followed, a recombinant protein can be obtained. More specifically, said protein can be produced by constructing a recombinant DNA enabling the gene of the present invention to be expressed in host cells, introducing it into host cells for transformation thereof and cultivating the resulting transformant.
[0032]In that case, the host cells may be eukaryotic or prokaryotic. The eukaryotic cells include vertebrate cells, yeast cells and so on, and the vertebrate cells include, but are not limited to, simian cells named COS cells (Cell, 23:175-182 (1981)), Chinese hamster ovary cells and a dihydrofolate reductase-deficient cell line derived therefrom (Proc. Natl. Acad. Sci. USA, 77:4216-4220 (1980)) and the like, which are frequently used.
[0033]As regards the expression vector to be used with vertebrate cells, an expression vector having a promoter located upstream of the gene to be expressed, RNA splicing sites, a polyadenylation site and a transcription termination sequence can be generally used. This may further have an origin of replication as necessary. As an example of said expression vector, there may be mentioned pSV2dhfr (Mol. Cell. Biol., 1:854 (1981)), which has the SV40 early promoter. As for the eukaryotic microorganisms, yeasts are generally and frequently used and, among them, yeasts of the genus Saccharomyces can be used with advantage. As regards the expression vector for use with said yeasts and other eukaryotic microorganisms, pAM82 (Proc. Natl. Acad. Sci. USA, 80:1-5 (1983)), which has the acid phosphatase gene promoter, for instance, can be used.
[0034]Furthermore, a prokaryotic gene fused vector can be preferably used as the expression vector for the gene of the present invention. As specific examples of said vector, there may be mentioned pGEX-2TK and pGEX-4T-2 which have a GST domain (derived from S. japonicum) with a molecular weight of 26,000.
[0035]Escherichia coli and Bacillus subtilis are generally and preferably used as prokaryotic hosts. When these are used as hosts in the practice of the present invention, an expression plasmid derived from a plasmid vector capable of replicating in said host organisms and provided in this vector with a promoter and the SD (Shine and Dalgarno) sequence upstream of said gene for enabling the expression of the gene of the present invention and further provided with an initiation codon (e.g., ATG) necessary for the initiation of protein synthesis is preferably used. The Escherichia coli strain K12, among others, is preferably used as the host Escherichia coli, and pBR322 and modified vectors derived therefrom are generally and preferably used as the vector, while various known strains and vectors can also be used. Examples of the promoter which can be used are the tryptophan (trp) promoter, lpp promoter, lac promoter and PL/PR promoter.
[0036]The thus-obtained desired recombinant DNA can be introduced into host cells for transformation by using various general methods. The transformant obtained can be cultured by a conventional method and the culture leads to expression and production of the desired protein encoded by the gene of the present invention. The medium to be used in said culture can suitably be selected from among various media in conventional use according to the host cells employed. The host cells can be cultured under conditions suited for the growth thereof.
[0037]In the above manner, the desired recombinant protein is expressed and produced and accumulated or secreted within the transformant cells or extracellularly or on the cell membrane.
[0038]The recombinant protein can be separated and purified as desired by various separation procedures utilizing the physical, chemical and other properties thereof (cf., e.g., "Seikagaku (Biochemistry) Data Book II", pages 1175-1259, 1st Edition, 1st Printing, published Jun. 23, 1980, by Tokyo Kagaku Dojin; Biochemistry, 25(25):8274-8277 (1986); Eur. J. Biochem., 163:313-321 (1987)). Specifically, said procedures include, among others, ordinary reconstitution treatment, treatment with a protein precipitating agent (salting out), centrifugation, osmotic shock treatment, sonication, ultrafiltration, various liquid chromatography techniques such as molecular sieve chromatography (gel filtration), adsorption chromatography, ion exchange chromatography, affinity chromatography and high-performance liquid chromatography (HPLC), dialysis and combinations thereof. Among them, affinity chromatography utilizing a column with the desired protein bound thereto is particularly preferred.
[0039]Furthermore, on the basis of the sequence information about the gene of the present invention as revealed by the present invention, for example by utilizing part or the whole of said gene, it is possible to detect the expression of the gene of the present invention in various human tissues. This can be performed by a conventional method, for example by RNA amplification by RT-PCR (reverse transcription-polymerase chain reaction) (Kawasaki, E. S., et al., Amplification of RNA, in PCR Protocol, A guide to methods and applications, Academic Press, Inc., San Diego, 21-27 (1991)), or by northern blotting analysis (Molecular Cloning, Cold Spring Harbor Laboratory (1989)), with good results.
[0040]The primers to be used in employing the above-mentioned PCR method are not limited to any particular ones provided that they are specific to the gene of the present invention and enable the gene of the present invention alone to be specifically amplified. They can be designed or selected appropriately based on the gene information provided by the present invention. They can have a partial sequence comprising about 20 to 30 nucleotides according to the established practice. Suitable examples are as shown in Examples 1 to 11.
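As an illustration of the kind of primer selection described above, the following minimal sketch takes a sense primer from the start of a target region and an antisense primer as the reverse complement of its end. The template sequence, the function names and the default length of 20 nucleotides are assumptions made for the example. They are not the primers of Examples 1 to 11, and practical primer design would additionally check specificity, melting temperature and secondary structure.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_pair(template: str, length: int = 20) -> tuple:
    """Pick a sense/antisense primer pair flanking the whole template.

    The 20-30 nt range follows the text; everything else is a simplification.
    """
    if not 20 <= length <= 30:
        raise ValueError("primer length should be about 20 to 30 nucleotides")
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    sense = template[:length].upper()
    antisense = reverse_complement(template[-length:])
    return sense, antisense


if __name__ == "__main__":
    # Hypothetical template, not taken from the sequence listing.
    template = "ATGGCTAGCTAGGATCCGATCGATCGGCTAGCTAAGCTTGGCATGCATGCCGTAGCTAGCTGA"
    sense, antisense = primer_pair(template)
    print("sense    :", sense)
    print("antisense:", antisense)
```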
[0041]Thus, the present invention also provides primers and/or probes useful in specifically detecting such novel gene.
[0042]By using the novel gene provided by the present invention, it is possible to detect the expression of said gene in various tissues, analyze the structure and function thereof and, further, produce the human protein encoded by said gene in the manner of genetic engineering. These make it possible to analyze the expression product, reveal the pathology of a disease associated therewith, for example a genopathy or cancer, and diagnose and treat the disease.
[0043]The following drawings are referred to in the examples.
[0044]FIG. 1 shows the result obtained by testing the PI4 kinase activity of NPIK in Example 9.
[0045]FIG. 2 shows the effect of Triton X-100® and adenosine on NPIK activity.
EXAMPLES
[0046]The following examples illustrate the present invention in further detail.
Example 1
GDP Dissociation Stimulator Gene
(1) Cloning and DNA Sequencing of GDP Dissociation Stimulator Gene
[0047]mRNAs extracted from the tissues of human fetal brain, adult blood vessels and placenta were purchased from Clontech and used as starting materials.
[0048]cDNA was synthesized from each mRNA and inserted into the vector λZAPII (Stratagene) to thereby construct a cDNA library (Otsuka GEN Research Institute, Otsuka Pharmaceutical Co., Ltd.).
[0049]Human gene-containing Escherichia coli colonies were allowed to form on agar medium by the in vivo excision technique (Short, J. M., et al., Nucleic Acids Res., 16:7583-7600 (1988)). Colonies were picked up at random and human gene-containing Escherichia coli clones were registered on 96-well micro plates. The clones registered were stored at -80° C.
[0050]Each of the clones registered was cultured overnight in 1.5 ml of LB medium, and DNA was extracted and purified using a model PI-100 automatic plasmid extractor (Kurabo). Contaminant Escherichia coli RNA was decomposed and removed by RNase treatment. The DNA was dissolved to a final volume of 30 μl. A 2-μl portion was used for roughly checking the DNA size and quantity using a minigel, 7 μl was used for sequencing reactions and the remaining portion (21 μl) was stored as plasmid DNA at 4° C.
[0051]This method, after slight changes in the program, enables extraction of the cosmid, which is useful also as a probe for FISH (fluorescence in situ hybridization) shown later in the examples.
[0052]Then, the dideoxy terminator method of Sanger et al. (Sanger, F., et al., Proc. Natl. Acad. Sci. USA, 74:5463-5467 (1977)) using T3, T7 or a synthetic oligonucleotide primer or the cycle sequence method (Carothers, A. M., et al., Bio. Techniques, 7:494-499 (1989)) comprising the dideoxy chain terminator method plus PCR method was carried out. These are methods of terminating the extension reaction specifically to the four bases using a small amount of plasmid DNA (about 0.1 to 0.5 μg) as a template.
[0053]The sequence primers used were FITC (fluorescein isothiocyanate)-labeled ones. Generally, about 25 cycles of reaction were performed using Taq polymerase. The PCR products were separated on a polyacrylamide urea gel and the fluorescence-labeled DNA fragments were submitted to an automatic DNA sequencer (ALF® DNA Sequencer; Pharmacia) for determining the sequence of about 400 bases from the 5' terminus side of cDNA.
[0054]Since the 3' nontranslational region is high in heterogeneity for each gene and therefore suited for discriminating individual genes from one another, sequencing was performed on the 3' side as well depending on the situation.
[0055]The vast sum of nucleotide sequence information obtained from the DNA sequencer was transferred to a 64-bit DEC 3400 computer for homology analysis by the computer. In the homology analysis, a data base (GenBank, EMBL) was used for searching according to the UWGCG FASTA program (Pearson, W. R. and Lipman, D. J., Proc. Natl. Acad. Sci. USA, 85:2444-2448 (1988)).
[0056]As a result of arbitrary selection by the above method and of cDNA sequence analysis, a clone designated as GEN-501D08 and having a 0.8 kilobase insert was found to show a high level of homology to the C terminal region of the human Ral guanine nucleotide dissociation stimulator (RalGDS) gene. Since RalGDS is considered to play a certain role in signal transmission pathways, the whole nucleotide sequence of the cDNA insert portion providing the human homolog was further determined.
[0057]Low-molecular GTPases play an important role in transmitting signals for a number of cell functions including cell proliferation, differentiation and transformation (Bourne, H. R. et al., Nature, 348:125-132 (1990); Bourne et al., Nature, 349:117-127 (1991)).
[0058]It is well known that, among them, the proteins encoded by the ras gene family function as molecular switches; in other words, the functions of the ras gene family are regulated by two different nucleotide-binding states, namely the biologically inactive GDP-bound state and the active GTP-bound state, and these two states are induced by GTPase activating proteins (GAPs) and GDS, respectively. The former enzymes induce GDP binding by stimulating the hydrolysis of bound GTP, and the latter enzyme induces GTP binding by releasing bound GDP (Boguski, M. S. and McCormick, F., Nature, 366:643-654 (1993)).
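The switch behavior described above can be pictured with a toy two-state model. This is a deliberately simplified sketch: the class and method names are hypothetical, and kinetics, nucleotide concentrations and effector binding are ignored. It only encodes the two transitions named in the text, namely GAP-stimulated hydrolysis of bound GTP and GDS-stimulated release of bound GDP followed by GTP binding.

```python
from enum import Enum


class Nucleotide(Enum):
    GDP = "GDP"
    GTP = "GTP"


class SmallGTPase:
    """Toy model of a ras-family molecular switch."""

    def __init__(self) -> None:
        # Start in the inactive, GDP-bound state.
        self.bound = Nucleotide.GDP

    @property
    def active(self) -> bool:
        # The GTP-bound form is the biologically active one.
        return self.bound is Nucleotide.GTP

    def apply_gds(self) -> None:
        # GDS releases bound GDP, which allows GTP to bind.
        if self.bound is Nucleotide.GDP:
            self.bound = Nucleotide.GTP

    def apply_gap(self) -> None:
        # GAP stimulates hydrolysis of bound GTP to GDP.
        if self.bound is Nucleotide.GTP:
            self.bound = Nucleotide.GDP


if __name__ == "__main__":
    protein = SmallGTPase()
    print("initial:", protein.bound.value, protein.active)
    protein.apply_gds()
    print("after GDS:", protein.bound.value, protein.active)
    protein.apply_gap()
    print("after GAP:", protein.bound.value, protein.active)
```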
[0059]RalGDS was first discovered as a GDP dissociation stimulator specific to Ral, a member of the ras gene family lacking in transforming activity (Chardin, P. and Tavitian, A., EMBO J., 5:2203-2208 (1986); Albright, C. F., et al., EMBO J., 12:339-347 (1993)).
[0060]In addition to Ral, RalGDS was found to function, through interaction with these proteins, as an effector molecule for N-ras, H-ras, K-ras and Rap (Spaargaren, M. and Bischoff, J. R., Proc. Natl. Acad. Sci. USA, 91:12609-12613 (1994)).
[0061]The nucleotide sequence of the cDNA clone designated as GEN-501D08 is shown under SEQ ID NO:3, the nucleotide sequence of the coding region of said clone under SEQ ID NO:2, and the amino acid sequence encoded by said nucleotide sequence under SEQ ID NO:1.
[0062]This cDNA comprises 842 nucleotides, including an open reading frame comprising 366 nucleotides and coding for 122 amino acids. The translation initiation codon was found to be located at the 28th nucleotide residue.
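The figures in the preceding paragraph can be cross-checked with simple arithmetic. The sketch below assumes that positions are 1-based and that the 366-nucleotide count covers the 122 sense codons only, without the stop codon; the text does not say whether the stop codon is included, so the end position and the 3' flank computed here would shift by three nucleotides under the other reading. The function and key names are hypothetical.

```python
CDNA_LENGTH = 842      # total nucleotides in the cDNA, from the text
ORF_LENGTH = 366       # nucleotides in the open reading frame, from the text
START_POSITION = 28    # 1-based position of the initiation codon, from the text


def orf_layout(cdna_length: int, orf_length: int, start: int) -> dict:
    """Derive codon count and flanking lengths for an ORF within a cDNA."""
    if orf_length % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    end = start + orf_length - 1          # last nucleotide of the ORF (1-based)
    if start < 1 or end > cdna_length:
        raise ValueError("ORF does not fit inside the cDNA")
    return {
        "codons": orf_length // 3,
        "five_prime_flank": start - 1,    # nucleotides upstream of the ATG
        "orf_end": end,
        "three_prime_flank": cdna_length - end,
    }


if __name__ == "__main__":
    print(orf_layout(CDNA_LENGTH, ORF_LENGTH, START_POSITION))
```

Under these assumptions, 366 nucleotides correspond to 366 / 3 = 122 codons, which agrees with the stated 122 amino acids.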
[0063]Comparison between the RalGDS protein known among conventional databases and the amino acid sequence deduced from said cDNA revealed that the protein encoded by this cDNA is homologous to the C terminal domain of human RalGDS. The amino acid sequence encoded by this novel gene was found to be 39.5% identical with the C terminal domain of RalGDS which is thought to be necessary for binding to ras.
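A percent identity figure of this kind is computed from an alignment of the two amino acid sequences. The following is a minimal sketch of one common definition (identical residues divided by aligned, non-gap positions). It is an assumption that this matches the convention of the FASTA program used in the example; other programs use a different denominator. The function name and the toy sequences in the usage lines are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # gap columns are skipped in this definition
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment for illustration only.
    print(percent_identity("MKT-AYIAK", "MKTLAYLAK"))
```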
[0064]Therefore, it is presumable, as mentioned above, that this gene product might interact with the ras family proteins or have influence on the ras-mediated signal transduction pathways. However, this novel gene is lacking in the region coding for the GDS activity domain and the corresponding protein seems to be different in function from the GDS protein. This gene was named human RalGDS by the present inventors.
(2) Northern Blot Analysis
[0065]The expression of the RalGDS protein mRNA in normal human tissues was evaluated by Northern blotting using, as a probe, the human cDNA clone labeled by the random oligonucleotide priming method.
[0066]The Northern blot analysis was carried out with a human MTN blot (Human Multiple Tissue Northern blot; Clontech, Palo Alto, Calif., USA) according to the manufacturer's protocol.
[0067]Thus, the PCR amplification product from the above GEN-501D08 clone was labeled with [32P]-dCTP (random-primed DNA labeling kit, Boehringer-Mannheim) for use as a probe.
[0068]For blotting, hybridization was performed overnight at 42° C. in a solution comprising 50% formamide/5×SSC/50×Denhardt's solution/0.1% SDS (containing 100 μg/ml denatured salmon sperm DNA). After washing with two portions of 2×SSC/0.01% SDS at room temperature, the membrane filter was further washed three times with 0.1×SSC/0.05% SDS at 50° C. for 40 minutes. An X-ray film (Kodak) was exposed to the filter at -70° C. for 18 hours.
[0069]As a result, it was revealed that a 900-bp transcript had been expressed in all the human tissues tested. In addition, a 3.2-kb transcript was observed specifically in the heart and skeletal muscle. The expression of these transcripts differing in size may be due either to alternative splicing or to cross hybridization with homologous genes.
(3) Cosmid Clone and Chromosome Localization by FISH
[0070]FISH was performed by screening a library of human chromosomes cloned in the cosmid vector pWE15 using, as a probe, the 0.8-kb insert of the cDNA clone (Sambrook, J., et al., Molecular Cloning, 2nd Ed., pp. 3.1-3.58, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989)).
[0071]FISH for chromosome assignment was carried out by the method of Inazawa et al. which comprises G-banding pattern comparison for confirmation (Inazawa, J., et al., Genomics, 17:153-162 (1993)).
[0072]For use as a probe, the cosmid DNA (0.5 μg) obtained from chromosome screening and corresponding to GEN-501D08 was labeled with biotin-16-dUTP by nick translation.
[0073]To eliminate the background noise due to repetitive sequences, 0.5 μl of sonicated human placenta DNA (10 mg/ml) was added to 9.5 μl of the probe solution. The mixture was denatured at 80° C. for 5 minutes and admixed with an equal volume of 4×SSC containing 20% dextran sulfate. Then, the hybridization mixture was applied to a denatured slide and, after covering with paraffin, incubated in a wet chamber at 37° C. for 16 to 18 hours. After washing with 50% formamide/2×SSC at 37° C. for 15 minutes, the slide was washed with 2×SSC for 15 minutes and further with 1×SSC for 15 minutes.
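The dilution arithmetic for the hybridization mixture described above can be worked through as follows. This is a sketch under a stated assumption: the probe solution itself is taken to contribute no SSC and no dextran sulfate, since its composition is not given in the text. The function and key names are hypothetical.

```python
def hybridization_mix() -> dict:
    """Final composition after mixing probe, competitor DNA and 4xSSC buffer."""
    competitor_volume_ul = 0.5          # sonicated placenta DNA, from the text
    competitor_stock_ug_per_ul = 10.0   # 10 mg/ml is the same as 10 ug/ul
    probe_volume_ul = 9.5               # probe solution, from the text

    premix_ul = competitor_volume_ul + probe_volume_ul
    buffer_ul = premix_ul               # "an equal volume" of 4xSSC/20% dextran sulfate
    total_ul = premix_ul + buffer_ul

    competitor_ug = competitor_volume_ul * competitor_stock_ug_per_ul
    dilution = buffer_ul / total_ul     # fraction of the final volume that is buffer

    return {
        "total_volume_ul": total_ul,
        "competitor_dna_ug": competitor_ug,
        "competitor_dna_ug_per_ul": competitor_ug / total_ul,
        # Assumption: only the buffer contributes SSC and dextran sulfate.
        "ssc_fold": 4.0 * dilution,
        "dextran_sulfate_percent": 20.0 * dilution,
    }


if __name__ == "__main__":
    for key, value in hybridization_mix().items():
        print(f"{key}: {value}")
```

Under this assumption the final 20 μl mixture is 2×SSC with 10% dextran sulfate and contains 5 μg of competitor DNA.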
[0074]The slide was then incubated in 4×SSC supplemented with "1% Block Ace" (trademark; Dainippon Pharmaceutical) containing avidin-FITC (5 μg/ml) at 37° C. for 40 minutes. Then, the slide was washed with 4×SSC for 10 minutes and with 4×SSC containing 0.05% Triton X-100® for 10 minutes and immersed in an antifading PPD solution (prepared by adjusting 100 mg of PPD (Wako Catalog No. 164-015321) and 10 ml of PBS(-) (pH 7.4) to pH 8.0 with 0.5 M Na2CO3/0.5 M NaHCO3 (9:1, v/v) buffer (pH 9.0) and adding glycerol to make a total volume of 100 ml) containing 1% DABCO (1% DABCO (Sigma) in PBS(-):glycerol 1:9 (v:v)), followed by counterstaining with DAPI (4',6-diamidino-2-phenylindole; Sigma).
[0075]In more than 100 metaphase cells examined, a specific hybridization signal was observed on the chromosome band 6p21.3, without any signal on other chromosomes. It was thus confirmed that the RalGDS gene is located on chromosome 6p21.3.
[0076]By using the novel human RalGDS-associated gene of the present invention as obtained in this example, the expression of said gene in various tissues can be detected and the human RalGDS protein can be produced in the manner of genetic engineering. These are expected to enable studies on the roles of the expression product protein and ras-mediated signals in transduction pathways as well as pathological investigations of diseases in which these are involved, for example cancer, and the diagnosis and treatment of such diseases. Furthermore, it becomes possible to study the development and progress of diseases involving the same chromosomal translocation of the RalGDS protein gene of the present invention, for example tonic spondylitis, atrial septal defect, pigmentary retinopathy, aphasia and the like.
Example 2
Cytoskeleton-Associated Protein 2 Gene (CKAP2 Gene)
(1) Cytoskeleton-Associated Protein 2 Gene Cloning and DNA Sequencing
[0077]cDNA clones arbitrarily chosen from a human fetal brain cDNA library in the same manner as in Example 1 were subjected to sequence analysis and, as a result, a clone having a base sequence containing the CAP-glycine domain of the human cytoskeleton-associated protein (CAP) gene and highly homologous to several CAP family genes was found and named GEN-080G01.
[0078]Meanwhile, the cytoskeleton occurs in the cytoplasm and just inside the cell membrane of eukaryotic cells and is a network structure comprising complicatedly entangled filaments. Said cytoskeleton is constituted of microtubules composed of tubulin, microfilaments composed of actin, intermediate filaments composed of desmin and vimentin, and so on. The cytoskeleton not only acts as supportive cellular elements but also isokinetically functions to induce morphological changes of cells by polymerization and depolymerization in the fibrous system. The cytoskeleton binds to intracellular organelles, cell membrane receptors and ion channels and thus plays an important role in intracellular movement and locality maintenance thereof and, in addition, is said to have functions in activity regulation and mutual information transmission. Thus it supposedly occupies a very important position in physiological activity regulation of the whole cell. In particular, the relation between canceration of cells and qualitative changes of the cytoskeleton attracts attention since cancer cells differ in morphology and recognition response from normal cells.
[0079]The activity of this cytoskeleton is modulated by a number of cytoskeleton-associated proteins (CAPs). One group of CAPs is characterized by a glycine motif highly conserved and supposedly contributing to association with microtubules (CAP-GLY domain; Riehemann, K. and Song, C., Trends Biochem. Sci., 18:82-83 (1993)).
[0080]Among the members of this group of CAPs, there are CLIP-170, 150 kDa DAP (dynein-associated protein, or dynactin), D. melanogaster GLUED, S. cerevisiae BIK1, restin (Bilbe, G., et al., EMBO J., 11:2103-2113 (1992); Hilliker, C., et al., Cytogenet. Cell Genet., 65:172-176 (1994)) and C. elegans 13.5 kDa protein (Wilson, R., et al., Nature, 368:32-38 (1994)). Except for the last two proteins, direct or indirect evidence has suggested that they could interact with microtubules.
[0081]The above-mentioned CLIP-170 is essential for the in vitro binding of endocytic vesicles to microtubules and colocalizes with endocytic organelles (Rickard, J. E. and Kreis, T. E., J. Biol. Chem., 18:82-83 (1990); Pierre, P., et al., Cell, 70:887-900 (1992)).
[0082]The above-mentioned dynactin is one of the factors constituting the cytoplasmic dynein motor, which functions in retrograde vesicle transport (Schroer, T. A. and Sheetz, M. P., J. Cell Biol., 115:1309-1318 (1991)) or probably in the movement of chromosomes during mitosis (Pfarr, C. M., et al., Nature, 345:263-265 (1990); Steuer, E. R., et al., Nature, 345:266-268 (1990); Wordeman, L., et al., J. Cell Biol., 114:285-294 (1991)).
[0083]GLUED, the Drosophila homolog of mammalian dynactin, is essential for the viability of almost all cells and for the proper organization of some neurons (Swaroop, A., et al., Proc. Natl. Acad. Sci. USA, 84:6501-6505 (1987); Holzbaur, E. L. P., et al., Nature, 351:579-583 (1991)).
[0084]BIK1 interacts with microtubules and plays an important role in spindle formation during mitosis in yeasts (Trueheart, J., et al., Mol. Cell. Biol., 7:2316-2326 (1987); Berlin, V., et al., J. Cell Biol., 111:2573-2586 (1990)).
[0085]At present, these genes are classified under the term CAP family (CAPs).
[0086]As a result of database searching, the above-mentioned cDNA clone of 463-bp (excluding the poly-A signal) showed significant homology in nucleotide sequence with the restin and CLIP-170 encoding genes. However, said clone was lacking in the 5' region as compared with the restin gene and, therefore, the technique of 5' RACE (Frohman, M. A., et al., Proc. Natl. Acad. Sci. USA, 85:8998-9002 (1988)) was used to isolate this missing segment.
(2) 5' RACE (5' Rapid Amplification of cDNA Ends)
[0087]A cDNA clone containing the 5' portion of the gene of the present invention was isolated for analysis by the 5' RACE technique using a commercial kit (5'-Rapid AmpliFinder RACE kit, Clontech) according to the manufacturer's protocol with minor modifications, as follows.
[0088]The gene-specific primer P1 and primer P2 used here were synthesized by the conventional method and their nucleotide sequences are as shown below in Table 1. The anchor primer used was the one attached to the commercial kit.
TABLE 1
Primer       Nucleotide sequence
Primer P1    5'-ACACCAATCCAGTAGCCAGGCTTG-3' (SEQ ID NO: 43)
Primer P2    5'-CACTCGAGAATCTGTGAGACCTACATACATGACG-3' (SEQ ID NO: 44)
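Basic properties of the two gene-specific primers can be checked directly from the sequences in Table 1. The sketch below is illustrative only: the helper functions are hypothetical names, and the reverse complement is shown on the assumption, usual for 5' RACE, that the gene-specific primers are antisense to the mRNA.

```python
# Primer sequences are copied from Table 1; the helpers are illustrative.
PRIMERS = {
    "P1": "ACACCAATCCAGTAGCCAGGCTTG",
    "P2": "CACTCGAGAATCTGTGAGACCTACATACATGACG",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def gc_fraction(seq):
    """Fraction of G and C bases in seq."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def reverse_complement(seq):
    """Sequence of the opposite strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

for name, seq in PRIMERS.items():
    print(name, len(seq), f"GC={gc_fraction(seq):.2f}", reverse_complement(seq))
```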
[0089]cDNA was obtained by reverse transcription of 0.1 μg of human fetal brain poly(A)+RNA by the random hexamer technique using reverse transcriptase (Superscript® II, Life Technologies) and the cDNA was amplified by the first PCR using the P1 primer and anchor primer according to Watanabe et al. (Watanabe, T., et al., Cell Genet., in press).
[0090]Thus, to 0.1 μg of the above-mentioned cDNA were added 2.5 mM dNTP/1× Taq buffer (Takara Shuzo)/0.2 μM P1 primer, 0.2 μM adaptor primer/0.25 unit ExTaq enzyme (Takara Shuzo) to make a total volume of 50 μl, followed by addition of the anchor primer. The mixture was subjected to PCR. Thus, 35 cycles of amplification were performed under the conditions: 94° C. for 45 seconds, 60° C. for 45 seconds, and 72° C. for 2 minutes. Finally, the mixture was heated at 72° C. for 5 minutes.
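The cycling conditions above can be written down as a small data structure, which also gives the minimum run time of the first PCR. This is a sketch only: the class and function names are hypothetical, and ramp times between temperatures are ignored.

```python
# Cycling parameters from paragraph [0090]; ramp times between steps
# are ignored (assumption), so the result is a lower bound.
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int

CYCLE = [
    Step("denaturation", 94, 45),
    Step("annealing", 60, 45),
    Step("extension", 72, 120),
]
N_CYCLES = 35
FINAL_EXTENSION = Step("final extension", 72, 5 * 60)

def total_hold_seconds(cycle, n_cycles, final_step):
    """Sum of all hold times: n_cycles repeats of the cycle plus the final step."""
    return n_cycles * sum(step.seconds for step in cycle) + final_step.seconds

total = total_hold_seconds(CYCLE, N_CYCLES, FINAL_EXTENSION)
print(f"{total} s = {total / 60:.1f} min")
```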
[0091]Then, 1 μl of the 50-μl first PCR product was subjected to amplification by the second PCR using the specific nested P2 primer and anchor primer. The second PCR product was analyzed by 1.5% agarose gel electrophoresis.
[0092]Upon agarose gel electrophoresis, a single band, about 650 nucleotides in size, was detected. The product from this band was inserted into a vector (pT7Blue®T-Vector, Novagen) and a plurality of clones with an insert having an appropriate size were selected.
[0093]Six of the 5' RACE clones obtained from the PCR product had the same sequence but had different lengths. By sequencing two overlapping cDNA clones, GEN-080G01 and GEN-080G0149, the protein-encoding sequence and 5' and 3' flanking sequences, 1015 nucleotides in total length, were determined. Said gene was named cytoskeleton-associated protein 2 gene (CKAP2 gene).
[0094]The nucleotide sequence obtained from the above-mentioned two overlapping cDNA clones GEN-080G01 and GEN-080G0149 is shown under SEQ ID NO:6, the nucleotide sequence of the coding region of said clone under SEQ ID NO:5, and the amino acid sequence encoded by said nucleotide sequence under SEQ ID NO:4.
[0095]As shown under SEQ ID NO:6, the CKAP2 gene had a relatively GC-rich 5' noncoding region, with incomplete triplet repeats, (CAG)4(CGG)4(CTG)(CGG), occurring at nucleotides 40-69.
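The repeat notation can be expanded to confirm that it accounts for the stated coordinates. The following sketch, with a hypothetical helper name, expands the notation and compares its length with the span of nucleotides 40-69.

```python
import re

def expand_repeats(notation):
    """Expand a string such as '(CAG)4(CGG)4(CTG)(CGG)' into plain nucleotides."""
    units = re.findall(r"\(([ACGT]+)\)(\d*)", notation)
    # A unit without a trailing number occurs once.
    return "".join(unit * int(count or 1) for unit, count in units)

repeat = expand_repeats("(CAG)4(CGG)4(CTG)(CGG)")
start, end = 40, 69  # 1-based, inclusive coordinates from the text
print(repeat)
print(len(repeat), end - start + 1)
```

Both lengths come to 30 nucleotides, so the notation and the coordinates agree.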
[0096]ATG located at nucleotides 274-276 is the presumable start codon. A stop codon (TGA) was situated at nucleotides 853-855. A polyadenylation signal (ATTAAA) was followed by 16 nucleotides before the poly(A) start. The estimated open reading frame comprises 579 nucleotides coding for 193 amino acid residues with a calculated molecular weight of 21,800 daltons.
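The open reading frame length and residue count follow from the codon coordinates given above. A minimal sketch of that arithmetic is shown below; the function name is hypothetical, and the stop codon is assumed not to be counted in the open reading frame, which is the convention that reproduces the figures in the text.

```python
def orf_summary(start_codon_first, stop_codon_first, mol_weight=None):
    """Return (ORF length in nt, residues, mean residue mass).

    Coordinates are 1-based first positions of the start and stop codons;
    the stop codon is not counted in the ORF length (assumption).
    """
    orf_nt = stop_codon_first - start_codon_first
    if orf_nt % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    residues = orf_nt // 3
    mean_mass = mol_weight / residues if mol_weight else None
    return orf_nt, residues, mean_mass

orf_nt, residues, mean_mass = orf_summary(274, 853, mol_weight=21_800)
print(orf_nt, residues, round(mean_mass, 1))
```

The calculated molecular weight corresponds to a mean of about 113 daltons per residue, close to the roughly 110 daltons often used as a rule of thumb.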
[0097]The coding region was further amplified by RT-PCR, to eliminate the possibility of the synthetic sequence obtained being a cDNA chimera.
(2) Similarity of CKAP2 to Other CAPs
[0098]While sequencing of CKAP2 revealed homology with the sequences of restin and CLIP-170, the homologous region was limited to a short sequence corresponding to the CAP-GLY domain. On the amino acid level, the deduced CKAP2 was highly homologous to five other CAPs in this domain.
[0099]CKAP2 lacked other motifs characteristic of some CAPs, such as the alpha helical rod and zinc finger motif. The alpha helical rod is thought to contribute to dimerization and to increase the microtubule binding capacity (Pierre, P., et al., Cell, 70:887-900 (1992)). The lack of the alpha helical domain might mean that CKAP2 is incapable of homo- or heterodimer formation.
[0100]Alignment of the CAP-GLY domains of these proteins revealed that conserved residues other than glycine residues are also found in CKAP2. CAPs having a CAP-GLY domain are thought to be associated with the activities of cellular organelles and the interactions thereof with microtubules. Since it contains a CAP-GLY domain, as mentioned above, CKAP2 is placed in the family of CAPs.
[0101]Studies with mutants of Glued have revealed that the Glued product plays an important role in almost all cells (Swaroop, A., et al., Proc. Natl. Acad. Sci. USA, 84:6501-6505 (1987)) and that it has other neuron-specific functions in neuronal cells (Meyerowitz, E. M. and Kankel, D. R., Dev. Biol., 62:112-142 (1978)). These microtubule-associated proteins are thought to function in vesicle transport and mitosis. Because of the importance of the vesicle transport system in neuronal cells, defects in these components might lead to aberrant neuronal systems.
[0102]In view of the above, CKAP2 might be involved in specific neuronal functions as well as in fundamental cellular functions.
(3) Northern Blot Analysis
[0103]The expression of human CKAP2 mRNA in normal human tissues was examined by Northern blotting in the same manner as in Example 1 (2) using the GEN-080G01 clone (corresponding to nucleotides 553-1015) as a probe.
[0104]As a result, in all the eight tissues tested, namely human heart, brain, placenta, lung, liver, skeletal muscle, kidney and pancreas, a 1.0 kb transcript agreeing in size with the CKAP2 cDNA was detected. Said 1.0 kb transcript was expressed at significantly higher levels in heart and brain than in the other tissues examined. Two weak bands, 3.4 kb and 4.6 kb, were also detected in all the tissues examined.
[0105]According to the Northern blot analysis, the 3.4 kb and 4.6 kb transcripts might possibly be derived from the same gene coding for the 1.0 kb CKAP2 by alternative splicing or transcribed from other related genes. These characteristics of the transcripts may indicate that CKAP2 might also code for a protein having a CAP-GLY domain as well as an alpha helix.
(4) Cosmid Cloning and Chromosomal Localization by Direct R-Banding FISH
[0106]Two cosmids corresponding to the CKAP2 cDNA were obtained. These two cosmid clones were subjected to direct R-banding FISH in the same manner as in Example 1 (3) for chromosomal locus mapping of CKAP2.
[0107]For suppressing the background due to repetitive sequences, a 20-fold excessive amount of human Cot-I DNA (BRL) was added as described by Lichter et al. (Lichter, P., et al., Proc. Natl. Acad. Sci. USA, 87:6634-6638 (1990)). A Provia 100 film (Fuji ISO 100; Fuji Photo Film) was used for photomicrography.
[0108]As a result, CKAP2 was mapped on chromosome bands 19q13.11-q13.12.
[0109]Two autosomal dominant neurological diseases have been localized to this region by linkage analysis: CADASIL (cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy) between the DNA markers D19S221 and D19S222, and FHM (familial hemiplegic migraine) between D19S215 and D19S216. These two diseases may be allelic disorders in which the same gene is involved (Tournier-Lasserve, E., et al., Nature Genet., 3:256-259 (1993); Joutel, A., et al., Nature Genet., 5:40-45 (1993)).
[0110]Although no evidence is available to support CKAP2 as a candidate gene for FHM or CADASIL, it is conceivable that its mutation might lead to some neurological disease.
[0111]By using the novel human CKAP2 gene of the present invention as obtained in this example, it is possible to detect the expression of said gene in various tissues or produce the human CKAP2 protein in the manner of genetic engineering. Through these, it becomes possible to analyze the functions of the human CKAP2 system or human CKAP2, which is involved in diverse activities essential to cells, as mentioned above, to diagnose various neurological diseases in which said system or gene is involved, for example familial migraine, and to screen out and evaluate a therapeutic or prophylactic drug therefor.
Example 3
OTK27 Gene
(1) OTK27 Gene Cloning and DNA Sequencing
[0112]As a result of sequence analysis of cDNA clones arbitrarily selected from a human fetal brain cDNA library in the same manner as in Example 1 (1) and database searching, a cDNA clone, GEN-025F07, coding for a protein highly homologous to NHP2, a yeast nucleoprotein (Saccharomyces cerevisiae; Kolodrubetz, D. and Burgum, A., YEAST, 7:79-90 (1991)), was found and named OTK27.
[0113]Nucleoproteins are fundamental cellular constituents of chromosomes, ribosomes and so forth and are thought to play an essential role in cell multiplication and viability. The yeast nucleoprotein NHP2, a high-mobility group (HMG)-like protein, like HMG, has reportedly a function essential for cell viability (Kolodrubetz, D. and Burgum, A., YEAST, 7:79-90 (1991)).
[0114]The novel human gene, OTK27 gene, of the present invention, which is highly homologous to the above-mentioned yeast NHP2 gene, is supposed to be similar in function.
[0115]The nucleotide sequence of said GEN-025F07 clone was found to comprise 1493 nucleotides, as shown under SEQ ID NO:9, and contain an open reading frame comprising 384 nucleotides, as shown under SEQ ID NO:8, coding for an amino acid sequence comprising 128 amino acid residues, as shown under SEQ ID NO:7. The initiation codon was located at nucleotides 95-97 of the sequence shown under SEQ ID NO:9, and the termination codon at nucleotides 479-481.
[0116]At the amino acid level, the OTK27 protein was highly homologous (38%) to NHP2. It was 83% identical with the protein deduced from a cDNA from Arabidopsis thaliana (Newman, T., unpublished; GENEMBL Accession No. T14197).
(2) Northern Blot Analysis
[0117]For examining the expression of human OTK27 mRNA in normal human tissues, the insert in the OTK27 cDNA was amplified by PCR, the PCR product was purified and labeled with [32P]-dCTP (random-primed DNA labeling kit, Boehringer Mannheim), and Northern blotting was performed using the labeled product as a probe in the same manner as in Example 1 (2).
[0118]As a result of the Northern blot analysis, two bands corresponding to possible transcripts from this gene were detected at approximately 1.6 kb and 0.7 kb. Both sizes of transcript were expressed in all normal adult tissues examined. However, the expression of the 0.7 kb transcript was significantly reduced in brain and was higher in heart, skeletal muscle and testis than in the other tissues examined.
[0119]For further examination of these two transcripts, eleven cDNA clones were isolated from a testis cDNA library and their DNA sequences were determined in the same manner as in Example 1 (1).
[0120]As a result, in six clones, the sequences were found to be in agreement with that of the 0.7 kb transcript, with a poly(A) sequence starting at around the 600th nucleotide, namely at the 598th nucleotide in two of the six clones, at the 606th nucleotide in three clones, and at the 613th nucleotide in one clone.
[0121]In these six clones, the "TATAAA" sequence was recognized at nucleotides 583-588 as a probable poly(A) signal. This upstream poly(A) signal "TATAAA" appeared to have little effect in brain and to be more effective in the three tissues mentioned above (heart, skeletal muscle and testis) than in the other tissues. The possibility that the stability of each transcript varies from tissue to tissue was also considered.
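The poly(A) start sites of the six clones can be tabulated against the position of the probable signal. The sketch below is illustrative only: the helper name is hypothetical, and the spacer is defined here as the nucleotides lying strictly between the last base of "TATAAA" and the first poly(A) residue.

```python
from collections import Counter

SIGNAL = (583, 588)  # "TATAAA", 1-based inclusive, from the text
# poly(A) start position -> number of clones, from paragraph [0120]
polya_starts = Counter({598: 2, 606: 3, 613: 1})

def spacer_length(polya_start, signal_end=SIGNAL[1]):
    """Nucleotides strictly between the signal and the first poly(A) residue."""
    return polya_start - signal_end - 1

for pos, n_clones in sorted(polya_starts.items()):
    print(pos, n_clones, spacer_length(pos))
print("clones:", sum(polya_starts.values()))
```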
[0122]Results of zoo blot analysis indicated that this gene is well conserved also in other vertebrates. Since this gene is expressed ubiquitously in normal adult tissues and conserved among a wide range of species, the gene product is likely to play an important physiological role. The evidence that yeasts lacking in NHP2 are nonviable suggests that the human homolog may also be essential to cell viability.
(3) Chromosomal Localization of OTK27 by Direct R-Banding FISH
[0123]One cosmid clone corresponding to the cDNA OTK27 was isolated from a total human genomic cosmid library (5-genome equivalent) using the OTK27 cDNA insert as a probe and subjected to FISH in the same manner as in Example 1 (3) for chromosomal localization of OTK27.
[0124]As a result, two distinct spots were observed on the chromosome band 12q24.3.
[0125]The OTK27 gene of the present invention can be used in causing expression thereof and detecting the OTK27 protein, a human nucleoprotein, and thus can be utilized in the diagnosis and pathologic studies of various diseases in which said protein is involved and, because of its involvement in cell proliferation and differentiation, in screening out and evaluating therapeutic and preventive drugs for cancer.
Example 4
OTK18 Gene
(1) OTK18 Gene Cloning and DNA Sequencing
[0126]Zinc finger proteins are defined as constituting a large family of transcription-regulating proteins in eukaryotes and carry evolutionally conserved structural motifs (Kadonaga, J. T., et al., Cell, 51:1079-1090 (1987); Klug, A. and Rhodes, D., Trends Biochem. Sci., 12:464-469 (1987); Evans, R. M. and Hollenberg, S. M., Cell, 52:1-3 (1988)).
[0127]The zinc finger, a loop-like motif formed by the interaction between the zinc ion and two types of residues, cysteine and histidine residues, is involved in the sequence-specific binding of a protein to RNA or DNA. The zinc finger motif was first identified within the amino acid sequence of the Xenopus transcription factor IIIA (Miller, J., et al., EMBO J., 4:1609-1614 (1985)).
[0128]The C2H2 finger motif is in general tandemly repeated and contains an evolutionally conserved intervening sequence of 7 or 8 amino acids. This intervening stretch was first identified in the Kruppel segmentation gene of Drosophila (Rosenberg, U. B., et al., Nature, 319:336-339 (1986)). Since then, hundreds of C2H2 zinc finger protein-encoding genes have been found in vertebrate genomes.
[0129]As a result of sequence analysis of cDNA clones arbitrarily selected from a human fetal brain cDNA library in the same manner as in Example 1 (1) and database searching, several zinc finger structure-containing clones were identified and, further, a clone having a zinc finger structure of the Kruppel type was found.
[0130]Since this clone lacked the 5' portion of the transcript, plaque hybridization was performed with a fetal brain cDNA library using, as a probe, an approximately 1.8 kb insert in the cDNA clone, whereby three clones were isolated. The nucleotide sequences of these were determined in the same manner as in Example 1 (1).
[0131]Among the three clones, the one having the largest insert spans 3,754 nucleotides including an open reading frame of 2,133 nucleotides coding for 711 amino acids. It was found that said clone contains a novel human gene coding for a peptide highly homologous in the zinc finger domain to those encoded by human ZNF41 and the Drosophila Kruppel gene. This gene was named OTK18 gene (derived from the clone GEN-076C09).
[0132]The nucleotide sequence of the cDNA clone of the OTK18 gene is shown under SEQ ID NO:12, the coding region-containing nucleotide sequence under SEQ ID NO:11, and the predicted amino acid sequence encoded by said OTK18 gene under SEQ ID NO:10.
[0133]It was found that the amino acid sequence of OTK18 as deduced from SEQ ID NO:12 contains 13 finger motifs on its carboxy side.
(2) Comparison with Other Zinc Finger Motif-Containing Genes
[0134]Comparison among OTK18, human ZNF41 and the Drosophila Kruppel gene revealed that each finger motif is for the most part conserved in the consensus sequence CXECGKAFXQKSXLX2HQRXH (SEQ ID NO:45).
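A consensus written in this notation can be turned into a search pattern. The following is a minimal sketch, assuming that X stands for any single residue and X2 for any two residues; the function name and the test peptide are hypothetical, and the peptide is not taken from OTK18.

```python
import re

CONSENSUS = "CXECGKAFXQKSXLX2HQRXH"  # from the text (SEQ ID NO:45)

def consensus_to_regex(consensus):
    """Translate 'X' to '.', and 'X<n>' to '.{n}'; other letters match literally."""
    def repl(match):
        n = match.group(1)
        return "." if not n else ".{%s}" % n
    return re.compile(re.sub(r"X(\d*)", repl, consensus))

pattern = consensus_to_regex(CONSENSUS)
print(pattern.pattern)

# Hypothetical finger-like peptide, built only to exercise the pattern.
toy = "MSGEKPYECEECGKAFSQKSNLTRHQRIHTGEKP"
m = pattern.search(toy)
print(m.start(), m.group(0))
```

The pattern spans 21 residues per finger and could be scanned along a full protein sequence with `pattern.finditer` to count candidate motifs.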
[0135]Comparison of the consensus sequence of the zinc finger motifs of OTK18 with those of human ZNF41 and the Drosophila Kruppel gene revealed that the Kruppel type motif is well conserved in the OTK18-encoded protein. However, the sequence similarities were limited to zinc finger domains and no significant homologies were found with regard to other regions.
[0136]The zinc finger domain interacts specifically with the target DNA, recognizing a sequence of about 5 bp to thereby bind to the DNA helix (Rhodes, D. and Klug, A., Cell, 46:123-132 (1986)).
[0137]Based on the idea that, in view of the above, the multiple module (tandem repetitions of zinc finger) can interact with long stretches of DNA, it is presumable that the target DNA of this gene product containing 13 repeated zinc finger units would be a DNA fragment with a length of approximately 65 bp.
(3) Northern Blot Analysis
[0138]Northern blot analysis was performed as described in Example 1 (2) for checking normal human tissues for expression of the human OTK18 mRNA therein by amplifying the insert of the OTK18 cDNA by PCR, purifying the PCR product, labeling the same with [32P]-dCTP (random-primed DNA labeling kit, Boehringer Mannheim) and using an MTN blot with the labeled product as a probe.
[0139]The results of Northern blot analysis revealed that the transcript of OTK18 is approximately 4.3 kb long and is expressed ubiquitously in various normal adult tissues. However, the expression level in the liver and in peripheral blood lymphocytes seemed to be lower than in other organs tested.
(4) Cosmid Cloning and Chromosomal Localization by Direct R-Banding FISH
[0140]Chromosomal localization of OTK18 was carried out as described in Example 1 (3).
[0141]As a result, complete twin spots were identified with 8 samples while 23 samples showed an incomplete signal or twin spots on either or both homologs. All signals appeared at the q13.4 band of chromosome 19. No twin spots were observed on any other chromosomes.
[0142]The results of FISH thus revealed that this gene is localized on chromosomal band 19q13.4. This region is known to contain many DNA segments that hybridize with oligonucleotides corresponding to zinc finger domains (Hoovers, J. M. N., et al., Genomics, 12:254-263 (1992)). In addition, at least one other gene coding for a zinc finger domain has been identified in this region (Marine, J.-C., et al., Genomics, 21:285-286 (1994)).
[0143]Hence, the chromosome 19q13 is presumably a site of grouping of multiple genes coding for transcription-regulating proteins.
[0144]When the novel human OTK18 gene provided by this example is used, it becomes possible to detect expression of said gene in various tissues and produce the human OTK18 protein in the manner of genetic engineering. Through these, it is possible to analyze the functions of the human transcription regulating protein gene system or human transcription regulating proteins, which are deeply involved in diverse activities fundamental to cells, as mentioned above, to diagnose various diseases with which said gene is associated, for example malformation or cancer resulting from a developmental or differentiation anomaly, and mental or nervous disorder resulting from a developmental anomaly in the nervous system, and further to screen out and evaluate therapeutic or prophylactic drugs for these diseases.
Example 5
Genes Encoding Human 26S Proteasome Constituent P42 Protein and P27 Protein
(1) Cloning and DNA Sequencing of Genes Respectively Encoding Human 26S Proteasome Constituent P42 Protein and P27 Protein
[0145]Proteasome, which is a multifunctional protease, is an enzyme occurring widely in eukaryotes from yeasts to humans and decomposing ubiquitin-binding proteins in cells in an energy-dependent manner. Structurally, said proteasome is constituted of 20S proteasome composed of various constituents with a molecular weight of 21 to 31 kilodaltons and a group of PA700 regulatory proteins composed of various constituents with a molecular weight of 30 to 112 kilodaltons and showing a sedimentation coefficient of 22S and, as a whole, occurs as a macromolecule with a molecular weight of about 2 million daltons and a sedimentation coefficient of 26S (Rechsteiner, M., et al., J. Biol. Chem., 268:6065-6068 (1993); Yoshimura, T., et al., J. Struct. Biol., 111:200-211 (1993); Tanaka, K., et al., New Biologist, 4:173-187 (1992)).
[0146]Despite structural and mechanical analyses thereof, the whole picture of proteasome is not yet fully clear. However, according to studies using yeasts and mice in the main, it reportedly has the functions mentioned below and its functions are becoming more and more elucidated.
[0147]The mechanism of energy-dependent proteolysis in cells starts with selection of proteins by ubiquitin binding. It is not 20S proteasome but 26S proteasome that has ubiquitin-conjugated protein decomposing activity which is ATP-dependent (Chu-Ping et al., J. Biol. Chem., 269:3539-3547 (1994)). Hence, human 26S proteasome is considered to be useful in elucidating the mechanism of energy-dependent proteolysis.
[0148]Factors involved in the cell cycle regulation are generally short in half-life and in many cases they are subject to strict quantitative control. In fact, it has been made clear that the oncogene products Mos, Myc, Fos and so forth can be decomposed by 26S proteasome in an energy- and ubiquitin-dependent manner (Ishida, N., et al., FEBS Lett., 324:345-348 (1993); Hershko, A. and Ciechanover, A., Annu. Rev. Biochem., 61:761-807 (1992)) and the importance of proteasome in cell cycle control is being recognized.
[0149]Its importance in the immune system has also been pointed out. It is suggested that proteasome is positively involved in class I major histocompatibility complex antigen presentation (Michalek, M. T., et al., Nature, 363:552-554 (1993)), and it is further suggested that proteasome may be involved in Alzheimer disease, since abnormal accumulation of ubiquitin-conjugated proteins has been observed in the brain of patients with Alzheimer disease (Kitaguchi, N., et al., Nature, 331:530-532 (1988)). Because of its diverse functions such as those mentioned above, proteasome attracts attention from the viewpoint of its utility in the diagnosis and treatment of various diseases.
[0150]A main function of 26S proteasome is ubiquitin-conjugated protein decomposing activity. In particular, it is known that cell cycle-related gene products such as oncogene products and cyclins, typically c-Myc, are degraded via ubiquitin-dependent pathways. It has also been observed that the proteasome gene is expressed abnormally in liver cancer cells, renal cancer cells, leukemia cells and the like as compared with normal cells (Kanayama, H., et al., Cancer Res., 51:6677-6685 (1991)) and that proteasome is abnormally accumulated in tumor cell nuclei. Hence, constituents of proteasome are expected to be useful in studying the mechanism of such canceration and in the diagnosis or treatment of cancer.
[0151]Also, it is known that the expression of proteasome is induced by interferon γ and so on and is deeply involved in antigen presentation in cells (Aki, M., et al., J. Biochem., 115:257-269 (1994)). Hence, constituents of human proteasome are expected to be useful in studying the mechanism of antigen presentation in the immune system and in developing immunoregulating drugs.
[0152]Furthermore, proteasome is considered to be deeply associated with ubiquitin abnormally accumulated in the brain of patients with Alzheimer disease. Hence, it is suggested that constituents of human proteasome should be useful in studying the cause of Alzheimer disease and in the treatment of said disease.
[0153]In addition to the utilization of expectedly multifunctional proteasome as such in the above manner, it is probably possible to produce antibodies using constituents of proteasome as antigens and use such antibodies in diagnosing various diseases by immunoassay. Its utility in this field of diagnosis is thus also a focus of interest.
[0154]Meanwhile, a protein having the characteristics of human 26S proteasome is disclosed, for example in Japanese Unexamined Patent Publication No. 292964/1993 and rat proteasome constituents are disclosed in Japanese Unexamined Patent Publication Nos. 268957/1993 and 317059/1993. However, no human 26S proteasome constituents are known. Therefore, the present inventors made a further search for human 26S proteasome constituents and successfully obtained two novel human 26S proteasome constituents, namely human 26S proteasome constituent P42 protein and human 26S proteasome constituent P27 protein, and performed cloning and DNA sequencing of the corresponding genes in the following manner.
(1) Purification of Human 26S Proteasome Constituents P42 Protein and P27 Protein
[0155]Human proteasome was purified using about 100 g of fresh human kidney and following the method of purifying human proteasome as described in Japanese Unexamined Patent Publication No. 292964/1993, namely by column chromatography using BioGel A-1.5 m (5×90 cm, Bio-Rad), hydroxyapatite (1.5×15 cm, Bio-Rad) and Q-Sepharose (1.5×15 cm, Pharmacia) and glycerol density gradient centrifugation.
[0156]The thus-obtained human proteasome was subjected to reversed phase high performance liquid chromatography (HPLC) using a Hitachi model L6200 HPLC system. A Shodex RS Pak D4-613 (0.6×15 cm, Showa Denko) was used and gradient elution was performed with the following two solutions:
First solution: 0.06% trifluoroacetic acid; and
Second solution: 0.05% trifluoroacetic acid, 70% acetonitrile.
[0157]An aliquot of each eluate fraction was subjected to 8.5% SDS-polyacrylamide electrophoresis under conditions of reduction with dithiothreitol. The P42 protein and P27 protein thus detected were isolated and purified.
[0158]The purified P42 and P27 proteins were respectively digested with 1 μg of trypsin in 0.1 M Tris buffer (pH 7.8) containing 2 M urea at 37° C. for 8 hours and the partial peptide fragments obtained were separated by reversed phase HPLC and their sequences were determined by Edman degradation. The results obtained are as shown below in Table 2.
TABLE 2
Partial protein    Amino acid sequence
P42     (1) VLNISLW (SEQ ID NO: 46)
        (2) TLMELLNQMDGFDTLHR (SEQ ID NO: 47)
        (3) AVSDFVVSEYXMAXA (SEQ ID NO: 48)
        (4) EVDPLVYNX (SEQ ID NO: 49)
        (5) HGEIDYEAIVK (SEQ ID NO: 50)
        (6) LSXGFNGADLRNVXTEAGMFAIXAD (SEQ ID NO: 51)
        (7) MIMATNRPDTLDPALLRPGXL (SEQ ID NO: 52)
        (8) IHIDLPNEQARLDILK (SEQ ID NO: 53)
        (9) ATNGPRYVVVG (SEQ ID NO: 54)
        (10) EIDGRLK (SEQ ID NO: 55)
        (11) ALQSVGQIVGEVLK (SEQ ID NO: 56)
        (12) ILAGPITK (SEQ ID NO: 57)
        (13) XXVIELPLTNPELFQG (SEQ ID NO: 58)
        (14) VVSSSLVDK (SEQ ID NO: 59)
        (15) ALQDYRK (SEQ ID NO: 60)
        (16) EHREQLK (SEQ ID NO: 61)
        (17) KLESKLDYKPVR (SEQ ID NO: 62)
P27     (1) LVPTR (SEQ ID NO: 63)
        (2) AKEEEIEAQIK (SEQ ID NO: 64)
        (3) ANYEVLESQK (SEQ ID NO: 65)
        (4) VEDALHQLHAR (SEQ ID NO: 66)
        (5) DVDLYQVR (SEQ ID NO: 67)
        (6) QSQGLSPAQAFAK (SEQ ID NO: 68)
        (7) AGSQSGGSPEASGVTVSDVQE (SEQ ID NO: 69)
        (8) GLLGXNIIPLQR (SEQ ID NO: 70)
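Most of the fragments in Table 2 end in lysine (K) or arginine (R), as expected for tryptic peptides. The sketch below illustrates an in silico digest under the commonly used simplification that trypsin cleaves after K or R unless the next residue is proline. The function name is hypothetical, and the test sequence is an invented concatenation of three P27 fragments, not the real P27 sequence.

```python
def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (a common simplification)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Trailing peptide that does not end in K or R.
        peptides.append(protein[start:])
    return peptides

# Hypothetical concatenation of three P27 fragments from Table 2; the order
# is invented for illustration and is not the real P27 sequence.
toy = "LVPTR" + "DVDLYQVR" + "ANYEVLESQK"
print(tryptic_digest(toy))
```

Real digests also yield missed cleavages, which this rule does not model; fragments with internal K or R, such as P27 (2), would be consistent with that.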
(2) cDNA Library Screening, Clone Isolation and cDNA Nucleotide Sequence Determination
[0159]As mentioned in Example 1 (1), the present inventors have a database comprising about 30,000 cDNA data as constructed based on large-scale DNA sequencing using human fetal brain, arterial blood vessel and placenta cDNA libraries.
[0160]Based on the amino acid sequences obtained as mentioned above in (1), computer searching was performed with the FASTA program (search for homology between said amino acid sequences and the amino acid sequences estimated from the database). As regards P42, a clone (GEN-331G07) showing identity with regard to two amino acid sequences ((2) and (7) shown in Table 2) was screened out and, as regards P27, a clone (GEN-163D09) showing identity with regard to two amino acid sequences ((1) and (8) shown in Table 2) was found.
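The idea of locating sequenced peptide fragments within a translated clone can be sketched as follows. This is exact matching with X treated as an unidentified residue, not the FASTA algorithm named above; the function name and the toy protein sequence are invented for illustration and do not reproduce any sequence of the present clones.

```python
import re

def find_peptide(peptide: str, protein: str) -> list[int]:
    """Return 1-based start positions where the peptide matches; X matches any residue."""
    pattern = "".join("." if aa == "X" else re.escape(aa) for aa in peptide)
    # A lookahead is used so that overlapping hits would also be reported.
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", protein)]

# Hypothetical translated clone sequence, invented for illustration only.
toy_protein = "MSTGLLGANIIPLQRAALVPTRE"
print(find_peptide("GLLGXNIIPLQR", toy_protein))  # P27 peptide (8) of Table 2
print(find_peptide("LVPTR", toy_protein))         # P27 peptide (1) of Table 2
```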
[0161]For each of these clones, the 5' side sequence was determined by 5' RACE and the whole sequence was determined, in the same manner as in Example 2 (2).
[0162]As a result, it was revealed that the above-mentioned P42 clone GEN-331G07 comprises a 1,566-nucleotide sequence as shown under SEQ ID NO:15, inclusive of a 1,167-nucleotide open reading frame as shown under SEQ ID NO:14, and that the amino acid sequence encoded thereby is the one shown under SEQ ID NO:13 and comprises 389 amino acid residues.
[0163]The results of computer homology search revealed that the P42 protein is significantly homologous to the AAA (ATPase associated with a variety of cellular activities) protein family (e.g., P45, TBP1, TBP7, S4, MSS1, etc.). It was thus suggested that it is a new member of the AAA protein family.
[0164]As for the P27 clone GEN-163D09, it was revealed that it comprises a 1,128-nucleotide sequence as shown under SEQ ID NO:18, including a 669-nucleotide open reading frame as shown under SEQ ID NO:17 and that the amino acid sequence encoded thereby is the one shown under SEQ ID NO:16 and comprises 223 amino acid residues.
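The open reading frame lengths and residue counts reported for P42 and P27 can be cross-checked with simple arithmetic. The sketch below assumes that the stated open reading frame lengths do not include the termination codon, which is consistent with the figures given; the function name is hypothetical.

```python
def residues_from_orf(orf_nt: int) -> int:
    """Number of encoded residues for an ORF length that excludes the stop codon."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_nt // 3

print(residues_from_orf(1167))  # P42 ORF (SEQ ID NO:14), reported as 389 residues
print(residues_from_orf(669))   # P27 ORF (SEQ ID NO:17), reported as 223 residues
```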
[0165]As regards the P27 protein, homology search using a computer failed to reveal any homologous gene among public databases. Thus, the gene in question is presumably a novel gene having an unknown function.
[0166]Originally, the above-mentioned P42 and P27 gene products were both purified as regulatory subunit components of proteasome complex. Therefore, these are expected to play an important role in various biological functions through proteolysis, for example a role in energy supply through decomposition of ATP and, hence, they are presumably useful not only in studying the function of human 26S proteasome but also in the diagnosis and treatment of various diseases caused by lowering of said biological functions, among others.
Example 6
BNAP Gene
(1) BNAP Gene Cloning and DNA Sequencing
[0167]The nucleosome, composed of DNA and histones, is a fundamental structure constituting chromosomes in eukaryotic cells and is well conserved across species. This structure is closely associated with the processes of replication and transcription of DNA. However, nucleosome formation is not yet fully understood. Only certain specific factors involved in nucleosome assembly (NAPs) have been identified. Thus, two acidic proteins, nucleoplasmin and N1, are already known to facilitate nucleosome construction (Kleinschmidt, J. A., et al., J. Biol. Chem., 260:1166-1176 (1985); Dilworth, S. M., et al., Cell, 51:1009-1018 (1987)).
[0168]A yeast gene, NAP-I, was isolated using a monoclonal antibody and recombinant proteins derived therefrom were tested as to whether they have nucleosome assembling activity in vivo.
[0169]More recently, a mouse NAP-I gene, which is a mammalian homolog of the yeast NAP-I gene was cloned (Okuda, A.; registered in database under the accession number D12618). Also cloned were a mouse gene, DN38 (Kato, K., Eur. J. Neurosci., 2:704-711 (1990)) and a human nucleosome assembly protein (hNRP) (Simon, H. U., et al., Biochem. J., 297:389-397 (1994)). It was shown that the hNRP gene is expressed in many tissues and is associated with T lymphocyte proliferation.
[0170]The present inventors performed sequence analysis of cDNA clones arbitrarily chosen from a human fetal brain cDNA library in the same manner as in Example 1 (1), followed by searches among databases and, as a result, made it clear that a 1,125-nucleotide cDNA clone (free of poly(A)), GEN-078D05, is significantly homologous to the mouse NAP-I gene, which is a gene for a nucleosome assembly protein (NAP) involved in nucleosome construction, a mouse partial cDNA clone, DN38, and hNRP.
[0171]Since said clone GEN-078D05 was lacking in the 5' region, 5' RACE was performed in the same manner as in Example 2 (2) to obtain the whole coding region. For this 5' RACE, primers P1 and P2, respectively having the nucleotide sequences shown below in Table 3, were used.
TABLE 3
Primer       Nucleotide sequence
Primer P1    5'-TTGAAGAATGATGCATTAGGAACCAC-3' (SEQ ID NO: 71)
Primer P2    5'-CACTCGAGTGGCTGGATTTCAATTTCTCCAGTAG-3' (SEQ ID NO: 72)
[0172]After the first 5' RACE, a single band corresponding to a sequence length of 1,300 nucleotides was obtained. This product was inserted into pT7Blue® T-Vector and several clones appropriate in insert size were selected.
[0173]Ten 5' RACE clones obtained from two independent PCR reactions were sequenced and the longest clone GEN-078D05TA13 (about 1,300 nucleotides long) was further analyzed.
[0174]Both strands of the two overlapping cDNA clones GEN-078D05 and GEN-078D05TA13 were sequenced, whereby it was confirmed that the two clones did not yet cover the whole coding region. Therefore, a further second 5' RACE was carried out. For the second 5' RACE, two primers, P3 and P4, respectively having the sequences shown below in Table 4 were used.
TABLE 4
Primer       Nucleotide sequence
Primer P3    5'-GTCGAGCTAGCCATCTCCTCTTCG-3' (SEQ ID NO: 73)
Primer P4    5'-CATGGGCGACAGGTTCCGAGACC-3' (SEQ ID NO: 74)
[0175]A clone, GEN-078D0508, obtained by the second 5' RACE was 300 nucleotides long. This clone contained an estimable initiation codon and three preceding in-frame termination codons. From these three overlapping clones, it became clear that the whole coding region comprises 2,636 nucleotides. This gene was named brain-specific nucleosome assembly protein (BNAP) gene.
[0176]The BNAP gene contains a 1,518-nucleotide open reading frame shown under SEQ ID NO:20. The amino acid sequence encoded thereby comprises 506 amino acid residues, as shown under SEQ ID NO:19, and the nucleotide sequence of the whole cDNA clone of BNAP is as shown under SEQ ID NO:21.
[0177]As shown under SEQ ID NO:21, the 5' noncoding region of said gene was found to be generally rich in GC. Candidate initiation codon sequences were found at nucleotides Nos. 266-268, 287-289 and 329-331. These three sequences all had well conserved sequences in the vicinity of the initiation codons (Kozak, M., J. Biol. Chem., 266:19867-19870 (1991)).
[0178]According to the scanning model, the first ATG (nucleotides Nos. 266-268) of the cDNA clone may be the initiation codon. The termination codon was located at nucleotides Nos. 1784-1786.
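The initiation and termination codon positions given for the BNAP cDNA can be cross-checked against the reported open reading frame of 1,518 nucleotides. The following is a minimal sketch under the assumption that the open reading frame runs from the first base of the ATG to the base immediately preceding the termination codon; the function name is hypothetical.

```python
def orf_length(atg_first_nt: int, stop_first_nt: int) -> int:
    """ORF length in nucleotides, from the A of ATG up to the base before the stop codon."""
    return stop_first_nt - atg_first_nt

# BNAP: first ATG at nucleotides 266-268, termination codon at 1784-1786.
bnap_nt = orf_length(266, 1784)
print(bnap_nt, bnap_nt // 3)  # nucleotides, encoded residues
```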
[0179]The 3' noncoding region was generally rich in AT and two polyadenylation signals (AATAAA) were located at nucleotides Nos. 2606-2611 and 2610-2615, respectively.
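The two polyadenylation signals reported here overlap by two nucleotides, so a plain left-to-right search that skips past each hit would report only the first of them. A minimal sketch of an overlap-aware motif search is given below; the function name and the toy sequence are hypothetical and are not taken from SEQ ID NO:21.

```python
import re

def find_overlapping(motif: str, seq: str) -> list[tuple[int, int]]:
    """Return 1-based (start, end) of every occurrence, including overlapping ones."""
    hits = []
    # A zero-width lookahead lets the scan restart one base after each hit.
    for m in re.finditer(f"(?={re.escape(motif)})", seq):
        start = m.start() + 1
        hits.append((start, start + len(motif) - 1))
    return hits

# Hypothetical 3' fragment containing two AATAAA signals offset by four bases.
toy_utr = "GGAATAAATAAACC"
print(find_overlapping("AATAAA", toy_utr))
```

In the toy fragment the two hits start four bases apart, the same offset as between positions 2606 and 2610 in the text.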
[0180]The longest open reading frame comprised 1,518 nucleotides coding for 506 amino acid residues and the calculated molecular weight of the BNAP gene product was 57,600 daltons.
[0181]Hydrophilic plots indicated that BNAP is very hydrophilic, like other NAPs.
[0182]For recombinant BNAP expression and purification and for eliminating the possibility that the BNAP gene sequence might give three chimera clones in the step of 5' RACE, RT-PCR was performed using a sequence comprising nucleotides Nos. 326-356 as a sense primer and a sequence comprising nucleotides Nos. 1758-1786 as an antisense primer.
[0183]As a result, a single product of about 1,500 bp was obtained and it was thus confirmed that said sequence is not a chimera but a single transcript.
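The size of the RT-PCR product expected from the stated primer coordinates can be estimated as follows. This sketch assumes that the product spans from the first base of the sense primer to the last base covered by the antisense primer, with no added primer tails; the function name is hypothetical.

```python
def amplicon_length(sense_first_nt: int, antisense_last_nt: int) -> int:
    """Length of a PCR product spanning both primer sites, inclusive."""
    return antisense_last_nt - sense_first_nt + 1

# Sense primer: nucleotides 326-356; antisense primer: nucleotides 1758-1786.
expected_bp = amplicon_length(326, 1786)
print(expected_bp)
```

The value obtained is in line with the single product of about 1,500 bp described above.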
(2) Comparison Between BNAP and NAPs
[0184]The amino acid sequence deduced from BNAP showed 46% identity and 65% similarity to hNRP.
[0185]The deduced BNAP gene product had motifs characteristic of the NAPs already reported and of BNAP. In general, half of the C terminus was well conserved in humans and yeasts.
[0186]The first motif (domain I) is KGIPDYWLI (corresponding to amino acid residues Nos. 309-317 (of SEQ ID NO:19)). This was observed also in hNRP (KGIPSFWLT (SEQ ID NO:75)) and in yeast NAP-I (KGIPEFWLT (SEQ ID NO:76)).
[0187]The second motif (domain II) is ASFFNFFSPP (corresponding to amino acid residues Nos. 437-446 (of SEQ ID NO:19)) and this was expressed as DSFFNFFAPP (SEQ ID NO:77) in hNRP and as ESFFNFFSP (SEQ ID NO:78) in yeast NAP-I.
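The degree of conservation of the two motifs can be expressed as a position-by-position identity. The sketch below is an ungapped comparison of equal-length strings only and is not the alignment-based calculation behind the 46% identity and 65% similarity figures; the function name is hypothetical, and the yeast domain II sequence is left out because it is given with nine rather than ten residues.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

# Domain I motifs as given in the text.
domain_i = {"BNAP": "KGIPDYWLI", "hNRP": "KGIPSFWLT", "yeast NAP-I": "KGIPEFWLT"}
for name in ("hNRP", "yeast NAP-I"):
    print(name, round(percent_identity(domain_i["BNAP"], domain_i[name]), 1))

# Domain II motifs of BNAP and hNRP (both ten residues long).
print("hNRP domain II", percent_identity("ASFFNFFSPP", "DSFFNFFAPP"))
```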
[0188]These two motifs were also conserved in the deduced mouse NAP-I and DN38 peptides. Both conserved motifs were each a hydrophilic cluster, and the Cys in position 402 was also found conserved.
[0189]Half of the N terminus had no motifs strictly conserved from yeasts to mammalian species, while motifs conserved among mammalian species were found.
[0190]For instance, HDLERKYA (corresponding to amino acid residues Nos. 130 to 137 (of SEQ ID NO:19)) and IINAEYEPTEEECEW (corresponding to amino acid residues Nos. 150-164 (of SEQ ID NO:19)), which may be associated with mammal-specific functions, were found strictly conserved.
[0191]NAPs had acidic stretches, which are believed to be readily capable of binding to histone or other basic proteins. All NAPs had three acidic stretches but the locations thereof were not conserved.
[0192]BNAP does not have three such acidic stretches but, instead, has three repeated sequences (corresponding to amino acid residues Nos. 194-207, 208-221 and 222-235) with a long acidic cluster, inclusive of 41 amino acid residues out of 98 amino acid residues, the consensus sequence being ExxKExPEVKxEEK (SEQ ID NO:79) (each x being a nonconserved, mostly hydrophobic, residue).
[0193]Furthermore, it was revealed that the BNAP sequence had several BNAP-specific motifs. Thus, an extremely serine-rich domain (corresponding to amino acid residues Nos. 24-72) with 33 (67%) of 49 amino acid residues being serine residues was found in the N-terminus portion. On the nucleic acid level, they were reflected as incomplete repetitions of AGC.
[0194]Following this serine-rich region, there appeared a basic domain (corresponding to amino acid residues Nos. 71-89) comprising 10 basic amino acid residues among 19 residues.
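The domain lengths and residue proportions stated for the serine-rich domain and the basic domain follow from the residue numbers, as the short calculation below shows. The helper names are hypothetical and the counts (33 serine residues, 10 basic residues) are those given in the text.

```python
def domain_length(first_residue: int, last_residue: int) -> int:
    """Number of residues in a domain given inclusive residue numbers."""
    return last_residue - first_residue + 1

def percent(count: int, total: int) -> float:
    """Proportion of a residue class within a domain, in percent."""
    return 100.0 * count / total

serine_len = domain_length(24, 72)  # serine-rich domain
basic_len = domain_length(71, 89)   # basic domain
print(serine_len, round(percent(33, serine_len)))  # 33 serine residues
print(basic_len, round(percent(10, basic_len)))    # 10 basic residues
```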
[0195]BNAP is supposed to be localized in the nucleus. Two possible nuclear localization signals (NLSs) were observed. The first signal was found in the basic domain of BNAP and its sequence YRKKR (SEQ ID NO:96) (corresponding to amino acid residues Nos. 75-79) was similar to the NLS (GRKKR (SEQ ID NO:80)) of Tat of HIV-1. The second signal was located at the C terminus and its sequence KKYRK (corresponding to amino acid residues Nos. 502-506 (of SEQ ID NO:19)) was similar to the NLS (KKKRK (SEQ ID NO:81)) of the large T antigen of SV40. The presence of these two presumable NLSs suggested the localization of BNAP in the nucleus. However, the possibility that other basic clusters might act as NLSs could not be excluded.
[0196]BNAP has several phosphorylation sites and the activity of BNAP may be controlled through phosphorylation thereof.
(3) Northern Blot Analysis
[0197]Northern blot analysis was performed as described in Example 1 (2). Thus, the clone GEN-078D05TA13 (corresponding to nucleotides Nos. 323 to 1558 in the BNAP gene sequence) was amplified by PCR, the PCR product was purified and labeled with [32P]-dCTP (random-primed DNA labeling kit, Boehringer Mannheim), and the expression of BNAP mRNA in normal human tissues was examined using an MTN blot with the labeled product as a probe.
[0198]As a result of Northern blot analysis, a 3.0 kb transcript of BNAP was detected (8-hour exposure) in the brain among eight human adult tissues tested, namely heart, brain, placenta, lung, liver, skeletal muscle, kidney and pancreas and, after longer exposure (24 hours), a dim band of the same size was detected in the heart.
[0199]BNAP was found equally expressed in several sites of brain tested whereas, in other tissues, no signal was detected at all even after 72 hours of exposure. hNRP mRNA was found expressed everywhere in the human tissues tested whereas the expression of BNAP mRNA was tissue-specific.
(4) Radiation Hybrid Mapping
[0200]Chromosomal mapping of the BNAP clone was performed by means of radiation hybrid mapping (Cox, D. R., et al., Science, 250:245-250 (1990)).
[0201]Thus, a total human genome radiation hybrid clone (G3RH) panel was purchased from Research Genetics, Inc., AL, USA and PCR was carried out for chromosomal mapping analysis according to the product manual using two primers, A1 and A2, respectively having the nucleotide sequences shown in Table 5.
TABLE 5
Primer       Nucleotide sequence
A1 primer    5'-CCTAAAAAGTGTCTAAGTGCCAGTT-3' (SEQ ID NO: 82)
A2 primer    5'-TCAGTGAAAGGGAAGGTAGAACAC-3' (SEQ ID NO: 83)
[0202]The results obtained were analyzed utilizing software available on the Internet (Boehnke, M., et al., Am. J. Hum. Genet., 46:581-586 (1991)).
[0203]As a result, the BNAP gene was found strongly linked to the marker DXS990 (LOD=1000, cR8000=-0.00). Since DXS990 is a marker localized on the chromosome Xq21.3-q22, it was established that BNAP is localized to the chromosomal locus Xq21.3-q22 where genes involved in several signs or symptoms of X-chromosome-associated mental retardation are localized.
[0204]The nucleosome is not only a fundamental chromosomal structural unit characteristic of eukaryotes but also a gene expression regulating unit. Several results indicate that genes with high transcription activity are sensitive to nuclease treatment, suggesting that the chromosome structure changes with the transcription activity (Elgin, S. C. R., J. Biol. Chem., 263:19259-19262 (1988)).
[0205]NAP-I has been cloned in yeast, mouse and human and is one of the factors capable of promoting nucleosome construction in vivo. In a study performed on their sequences, NAPs containing the epitope of the specific antibody 4A8 were detected in human, mouse, frog, Drosophila and yeast (Saccharomyces cerevisiae) (Ishimi, Y., et al., Eur. J. Biochem., 162:19-24 (1987)).
[0206]In these experiments, NAPs, upon SDS-PAGE analysis, electrophoretically migrated to positions corresponding to a molecular weight between 50 and 60 kDa, whereas the recombinant BNAP slowly migrated to a position of about 80 kDa. The epitope of 4A8 was shown to be localized in the second, well-conserved, hydrophobic motif. And, it was simultaneously shown that the triplet FNF is important as a part of the epitope (Fujii-Nakata, T., et al., J. Biol. Chem., 267:20980-20986 (1992)).
[0207]BNAP also contained this consensus motif in domain II. The fact that domain II is markedly hydrophobic and the fact that domain II can be recognized by the immune system suggest that it is probably presented on the BNAP surface and is possibly involved in protein-protein interactions.
[0208]Domain I, too, may be involved in protein-protein interactions. Considering that these are conserved generally among NAPs, though to a relatively low extent, it is conceivable that they must be essential for nucleosome construction, although the functional meaning of the conserved domains is still unknown.
[0209]The hNRP gene is expressed in thyroid gland, stomach, kidney, intestine, leukemia, lung cancer, mammary cancer and so on (Simon, H. U., et al., Biochem. J., 297:389-397 (1994)). Like that, NAPs are expressed everywhere and are thought to be playing an important role in fundamental nucleosome formation.
[0210]BNAP may be involved in brain-specific nucleosome formation and an insufficiency thereof may cause neurological diseases or mental retardation as a result of deviated functions of neurons.
[0211]BNAP was found strongly linked to a marker on the X-chromosome q21.3-q22 where sequences involved in several symptoms of X-chromosome-associated mental retardation are localized. This center-surrounding region of X-chromosome was rich in genes responsible for α-thalassemia, mental retardation (ATR-X) or some other forms of mental retardation (Gibbons, R. J., et al., Cell, 80:837-845 (1995)). Like the analysis of the ATR-X gene which seems to regulate the nucleosome structure, the present inventors suppose that BNAP may be involved in a certain type of X-chromosome-linked mental retardation.
[0212]According to this example, the novel BNAP gene is provided and, when said gene is used, it is possible to detect the expression of said gene in various tissues and to produce the BNAP protein by the technology of genetic engineering. Through these, it is possible to study the brain nucleosome formation deeply involved, as mentioned above, in variegated activities essential to cells as well as the functions of cranial nerve cells and to diagnose various neurological diseases or mental retardation in which these are involved and screen out and evaluate drugs for the treatment or prevention of such diseases.
Example 7
Human Skeletal Muscle-Specific Ubiquitin-Conjugating Enzyme Gene (UBE2G Gene)
[0213]The ubiquitin system is a group of enzymes essential for cellular processes and is conserved from yeast to human. Said system is composed of ubiquitin-activating enzymes (UBAs), ubiquitin-conjugating enzymes (UBCs), ubiquitin protein ligases (UBRs) and 26S proteasome particles.
[0214]Ubiquitin is transferred from the above-mentioned UBAs to several UBCs, whereby it is activated. UBCs transfer ubiquitins to target proteins with or without the participation of UBRs. These ubiquitin-conjugated target proteins are said to induce a number of cellular responses, such as protein degradation, protein modification, protein translocation, DNA repair, cell cycle control, transcription control, stress responses, etc. and immunological responses (Jentsch, S., et al., Biochim. Biophys. Acta, 1089:127-139 (1991); Hershko, A. and Ciechanover, A., Annu. Rev. Biochem., 61:761-807 (1992); Jentsch, S., Annu. Rev. Genet., 26:179-207 (1992); Ciechanover, A., Cell, 79:13-21 (1994)).
[0215]UBCs are key components of this system and seem to have distinct substrate specificities and modulate different functions. For example, Saccharomyces cerevisiae UBC7 is induced by cadmium and involved in resistance to cadmium poisoning (Jungmann, J., et al., Nature, 361:369-371 (1993)). Degradation of MAT-α2 is also executed by UBC7 and UBC6 (Chen, P., et al., Cell, 74:357-369 (1993)).
[0216]The novel gene obtained in this example is a UBC7-like gene strongly expressed in human skeletal muscle. In the following, cloning and DNA sequencing thereof are described.
(1) Cloning and DNA Sequencing of Human Skeletal Muscle-Specific Ubiquitin-Conjugating Enzyme Gene (UBE2G Gene)
[0217]Following the same procedure as in Example 1 (1), cDNA clones were arbitrarily selected from a human fetal brain cDNA library and subjected to sequence analysis, and database searches were performed. As a result, a cDNA clone, GEN-423A12, was found to have a significantly high level of homology to the genes coding for ubiquitin-conjugating enzymes (UBCs) in various species.
[0218]Since said GEN-423A12 clone was lacking in the 5' side, 5' RACE was performed in the same manner as in Example 2 (2) to obtain an entire coding region.
[0219]For said 5' RACE, two primers, P1 and P2, respectively having the nucleotide sequences shown in Table 6 were used.
TABLE 6
Primer       Nucleotide sequence
P1 primer    5'-TAATGAATTTCATTTTAGGAGGTCGG-3' (SEQ ID NO: 84)
P2 primer    5'-ATCTTTTGGGAAAGTAAGATGAGCC-3' (SEQ ID NO: 85)
[0220]The 5' RACE product was inserted into pT7Blue® T-Vector and clones with an insert proper in size were selected.
[0221]Four of the 5' RACE clones obtained from two independent PCR reactions contained the same sequence but were different in length.
[0222]By sequencing the above clones, the coding sequence and adjacent 5'- and 3'-flanking sequences of the novel gene were determined.
[0223]As a result, it was revealed that the novel gene has a total length of 617 nucleotides. This gene was named human skeletal muscle-specific ubiquitin-conjugating enzyme gene (UBE2G gene).
[0224]To exclude the conceivable possibility that this sequence was a chimera clone, RT-PCR was performed in the same manner as in Example 6 (1) using the sense primer to amplify said sequence from the human fetal brain cDNA library. As a result, a single PCR product was obtained, whereby it was confirmed that said sequence is not a chimeric one.
[0225]The UBE2G gene contains an open reading frame of 510 nucleotides, which is shown under SEQ ID NO:23, the amino acid sequence encoded thereby comprises 170 amino acid residues, as shown under SEQ ID NO:22, and the nucleotide sequence of the entire UBE2G cDNA is as shown under SEQ ID NO:24.
[0226]As shown under SEQ ID NO:24, the estimable initiation codon was located at nucleotides Nos. 19-21, corresponding to the first ATG triplet of the cDNA clone. Since no preceding in-frame termination codon was found, it was deduced that this clone contains the entire open reading frame on the following grounds.
[0227]Thus, (a) the amino acid sequence is highly homologous to S. cerevisiae UBC7 and said initiation codon agrees with that of yeast UBC7, supporting said ATG as such. (b) The sequence AGGATGA (nucleotide Nos. 18-24 in the sequence shown under SEQ ID NO:24) is similar to the consensus sequence (A/G)CCATGG around the initiation codon (Kozak, M., J. Biol. Chem., 266:19867-19870 (1991)).
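The similarity between the sequence AGGATGA and the consensus (A/G)CCATGG can be made explicit by counting matching positions. The following is a minimal sketch in which the consensus is encoded position by position as written above; the names KOZAK and kozak_matches are hypothetical.

```python
# (A/G)CCATGG encoded as the bases allowed at each of the seven positions.
KOZAK = ["AG", "C", "C", "A", "T", "G", "G"]

def kozak_matches(site: str) -> int:
    """Count positions of a 7-nucleotide site that agree with the consensus."""
    if len(site) != len(KOZAK):
        raise ValueError("site must be 7 nucleotides long")
    return sum(base in allowed for base, allowed in zip(site.upper(), KOZAK))

site = "AGGATGA"  # nucleotides 18-24 of SEQ ID NO:24
print(kozak_matches(site), "of", len(KOZAK))
print("purine three bases upstream of ATG:", site[0] in "AG")
```

The matching positions are the ATG itself and the purine three bases upstream of it; the two bases immediately preceding the ATG and the base following it differ from the consensus.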
(2) Comparison in Amino Acid Sequence Between UBE2G and UBCs
[0228]Comparison in amino acid sequence between UBE2G and UBCs suggested that the active site cysteine capable of binding to ubiquitin should be the cysteine at the 90th residue. The peptides encoded by these genes seem to belong to the same family.
(3) Northern Blot Analysis
[0229]Northern blot analysis was carried out as described in Example 1 (2). Thus, the entire sequence of UBE2G was amplified by PCR, the PCR product was purified and labeled with [32P]-dCTP (random-primed DNA labeling kit, Boehringer Mannheim) and the expression of UBE2G mRNA in normal human tissues was examined using the labeled product as a probe. The membrane used was an MTN blot.
[0230]As a result of the Northern blot analysis, 4.4 kb, 2.4 kb and 1.6 kb transcripts could be detected in all 16 human adult tissues, namely heart, brain, placenta, lung, liver, skeletal muscle, kidney, pancreas, spleen, thyroid gland, urinary bladder, testis, ovary, small intestine, large intestine and peripheral blood leukocyte, after 18 hours of exposure. Strong expression of these transcripts was observed in skeletal muscle.
(4) Radiation Hybrid Mapping
[0231]Chromosomal mapping of the UBE2G clone was performed by radiation hybrid mapping in the same manner as in Example 6 (4).
[0232]The primers C1 and C4 used in PCR for chromosomal mapping analysis respectively correspond to nucleotides Nos. 415-435 and nucleotides Nos. 509-528 in the sequence shown under SEQ ID NO:24 and their nucleotide sequences are as shown below in Table 7.
TABLE 7
Primer       Nucleotide sequence
C1 primer    5'-GGAGACTCACCTGCTAATGTT-3' (SEQ ID NO: 86)
C4 primer    5'-CTCAAAAGCAGTCTCTTGGC-3' (SEQ ID NO: 87)
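The primer lengths in Table 7 can be checked against the nucleotide ranges stated for them, and the downstream primer can be converted to the sense-strand orientation. The sketch below assumes that C4 is the antisense (reverse) primer, as is usual for the downstream primer of a PCR pair; this is not stated explicitly in the text, and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

# Primer sequence and the first and last nucleotide numbers in SEQ ID NO:24.
primers = {
    "C1": ("GGAGACTCACCTGCTAATGTT", 415, 435),
    "C4": ("CTCAAAAGCAGTCTCTTGGC", 509, 528),
}
for name, (seq, first, last) in primers.items():
    print(name, "length", len(seq), "range length", last - first + 1)

# Sense-strand sequence under the assumption that C4 is the antisense primer.
print(reverse_complement(primers["C4"][0]))

# Expected product size from the first base of C1 to the last base of C4.
product_bp = 528 - 415 + 1
print(product_bp)
```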
[0233]As a result, the UBE2G gene was found linked to the markers D1S446 (LOD=12.52, cR8000=8.60) and D1S235 (LOD=9.14, cR8000=22.46). These markers are localized to the chromosome bands 1q42.13-q42.3.
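The LOD scores reported for the two markers can be read as odds in favor of linkage, since a LOD score is the base-10 logarithm of the likelihood ratio of linkage to no linkage. A minimal conversion sketch follows; the function name is hypothetical.

```python
def lod_to_odds(lod: float) -> float:
    """Convert a LOD score to the corresponding odds (likelihood ratio) for linkage."""
    return 10.0 ** lod

for marker, lod in (("D1S446", 12.52), ("D1S235", 9.14)):
    print(marker, f"odds of about {lod_to_odds(lod):.2e} to 1")
```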
[0234]UBE2G was expressed strongly in skeletal muscle and very weakly in all other tissues examined. All other UBCs are involved in essential cellular functions, such as cell cycle control, and those UBCs are expressed ubiquitously. However, the expression pattern of UBE2G might suggest a muscle-specific role thereof.
[0235]While the three transcripts differing in size were detected, attempts failed to identify which corresponds to the cDNA clone. The primary structure of the UBE2G product showed an extreme homology to yeast UBC7. On the other hand, nematode UBC7 showed strong homology to yeast UBC7. It is involved in degradation of the repressor and further confers resistance to cadmium in yeasts. The similarities among these proteins suggest that they belong to the same family.
[0236]It is speculated that UBE2G is involved in degradation of muscle-specific proteins and that a defect in said gene could lead to such diseases as muscular dystrophy. Recently, another proteolytic enzyme, calpain 3, was found to be responsible for limb-girdle muscular dystrophy type 2A (Richard, I., et al., Cell, 81:27-40 (1995)). At the present, the chromosomal location of UBE2G suggests no significant relationship with any hereditary muscular disease but it is likely that a relation to the gene will be unearthed by linkage analysis in future.
[0237]In accordance with this example, the novel UBE2G gene is provided and the use of said gene enables detection of its expression in various tissues and production of the UBE2G protein by the technology of genetic engineering. Through these, it becomes possible to study the degradation of muscle-specific proteins deeply involved in basic activities variegated and essential to cells, as mentioned above, and the functions of skeletal muscle, to diagnose various muscular diseases in which these are involved and further to screen out and evaluate drugs for the treatment and prevention of such diseases.
Example 8
TMP-2 Gene
(1) TMP-2 Gene Cloning and DNA Sequencing
[0238]Following the procedure of Example 1 (1), cDNA clones were arbitrarily selected from a human fetal brain cDNA library and subjected to sequence analysis, and database searches were performed. As a result, a clone (GEN-092E10) having a cDNA sequence highly homologous to a transmembrane protein gene (accession No.: U19878) was found.
[0239]Membrane protein genes have so far been cloned in frog (Xenopus laevis) and human. These are considered to be a gene for a transmembrane type protein having a follistatin module and an epidermal growth factor (EGF) domain (accession No.: U19878).
[0240]The sequence information of the above protein gene indicated that the GEN-092E10 clone was lacking in the 5' region, so that the λgt10 cDNA library (human fetal brain 5'-STRETCH PLUS cDNA; Clontech) was screened using the GEN-092E10 clone as a probe, whereby a cDNA clone containing a further 5' upstream region was isolated.
[0241]Both strands of this cDNA clone were sequenced, whereby the sequence covering the entire coding region became clear. This gene was named TMP-2 gene.
[0242]The TMP-2 gene was found to contain an open reading frame of 1,122 nucleotides, as shown under SEQ ID NO:26, encoding an amino acid sequence of 374 residues, as shown under SEQ ID NO:25. The nucleotide sequence of the entire TMP-2 cDNA clone comprises 1,721 nucleotides, as shown under SEQ ID NO:27.
[0243]As shown under SEQ ID NO:27, the 5' noncoding region was generally rich in GC. Several candidates for the initiation codon were found but, according to the scanning model, the 5th ATG of the cDNA clone (bases Nos. 368-370) was estimated as the initiation codon. The termination codon was located at nucleotides Nos. 1490-1492. The polyadenylation signal (AATAAA) was located at nucleotides Nos. 1703-1708. The calculated molecular weight of the TMP-2 gene product was 41,400 daltons.
[0244]As mentioned above, the transmembrane genes have a follistatin module and an EGF domain. These motifs were also found conserved in the novel human gene of the present invention.
[0245]The TMP-2 gene of the present invention presumably plays an important role in cell proliferation or intercellular communication, since, on the amino acid level, said gene shows homology, across the EGF domain, to TGF-α (transforming growth factor-α; Derynck, R., et al., Cell, 38:287-297 (1984)), beta-cellulin (Igarashi, K. and Folkman, J., Science, 259:1604-1607 (1993)), heparin-binding EGF-like growth factor (Higashiyama, S., et al., Science, 251:936-939 (1991)) and schwannoma-derived growth factor (Kimura, H., et al., Nature, 348:257-260 (1990)).
(2) Northern Blot Analysis
[0246]Northern blot analysis was carried out as described in Example 1 (2). Thus, the clone GEN-092E10 was amplified by PCR, the PCR product was purified and labeled with [32P]-dCTP (random-primed DNA labeling kit, Boehringer Mannheim), and the expression of TMP-2 mRNA in normal human tissues was examined using an MTN blot with the labeled product as a probe.
[0247]As a result, high levels of expression were detected in brain and prostate gland. Said TMP-2 gene mRNA was about 2 kb in size.
[0248]According to the present invention, the novel human TMP-2 gene is provided and the use of said gene makes it possible to detect the expression of said gene in various tissues or produce the human TMP-2 protein by the technology of genetic engineering and, through these, it becomes possible to study brain tumor and prostatic cancer, which are closely associated with cell proliferation or intercellular communication, as mentioned above, to diagnose these diseases and to screen out and evaluate drugs for the treatment and prevention of such diseases.
Example 9
Human NPIK Gene
(1) Human NPIK Gene Cloning and DNA Sequencing
[0249]Following the procedures of Example 1 and Example 2, cDNA clones were arbitrarily selected from a human fetal brain cDNA library and subjected to sequence analysis, and database searches were performed. As a result, two cDNA clones highly homologous to the gene coding for an amino acid sequence conserved in phosphatidylinositol 3 and 4 kinases (Kunz, J., et al., Cell, 73:585-596 (1993)) were obtained. These were named GEN-428B12c1 and GEN-428B12c2 and the entire sequences of these were determined as in the foregoing examples.
[0250]As a result, the GEN-428B12c1 cDNA clone and the GEN-428B12c2 clone were found to have coding sequences differing by 12 amino acid residues at the 5' terminus, the GEN-428B12c1 cDNA clone being longer by 12 amino acid residues.
[0251]The GEN-428B12c1 cDNA sequence of the human NPIK gene contained an open reading frame of 2,487 nucleotides, as shown under SEQ ID NO:32, encoding an amino acid sequence comprising 829 amino acid residues, as shown under SEQ ID NO:31. The nucleotide sequence of the full-length cDNA clone comprised 3,324 nucleotides as shown under SEQ ID NO:33.
[0252]The estimated initiation codon was located, as shown under SEQ ID NO:33, at nucleotides Nos. 115-117 corresponding to the second ATG triplet of the cDNA clone. The termination codon was located at nucleotides Nos. 2602-2604 and the polyadenylation signal (AATAAA) at Nos. 3305-3310.
[0253]On the other hand, the GEN-428B12c2 cDNA sequence of the human NPIK gene contained an open reading frame of 2,451 nucleotides, as shown under SEQ ID NO:29. The amino acid sequence encoded thereby comprised 817 amino acid residues, as shown under SEQ ID NO:28. The nucleotide sequence of the full-length cDNA clone comprised 3,602 nucleotides, as shown under SEQ ID NO:30.
[0254]The estimated initiation codon was located, as shown under SEQ ID NO:30, at nucleotides Nos. 429-431 corresponding to the 7th ATG triplet of the cDNA clone. The termination codon was located at nucleotides Nos. 2880-2882 and the polyadenylation signal (AATAAA) at Nos. 3583-3588.
(2) Northern Blot Analysis
[0255]Northern blot analysis was carried out as described in Example 1 (2). Thus, the entire sequence of human NPIK was amplified by PCR, the PCR product was purified and labeled with [32P]-dCTP (random-primed DNA labeling kit, Boehringer Mannheim), and normal human tissues were examined for expression of the human NPIK mRNA using the MTN blot membrane with the labeled product as a probe.
[0256]As a result, the expression of the human NPIK gene was observed in 16 various human adult tissues examined and an about 3.8 kb transcript and an about 5 kb one could be detected.
[0257]Using primer A having the nucleotide sequence shown below in Table 8 and containing the initiation codon of the GEN-428B12c2 cDNA and primer B shown in Table 8 and containing the termination codon, PCR was performed with Human Fetal Brain Marathon-Ready cDNA (Clontech) as a template, and the nucleotide sequence of the PCR product was determined.
TABLE-US-00008 TABLE 8 Primer Nucleotide sequence Primer A 5'-ATGGGAGATACAGTAGTGGAGC-3' (SEQ ID NO: 88) Primer B 5'-TCACATGATGCCGTTGGTGAG-3' (SEQ ID NO: 89)
[0258]As a result, it was found that the human NPIK mRNA expressed included one lacking in nucleotides Nos. 1060-1104 of the GEN-428B12c1 cDNA sequence (SEQ ID NO:33) (amino acids Nos. 316-330 of the amino acid sequence under SEQ ID NO:31) and one lacking in nucleotides Nos. 1897-1911 of the GEN-428B12c1 cDNA sequence (SEQ ID NO:33) (amino acids Nos. 595-599 of the amino acid sequence under SEQ ID NO:31).
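The correspondence between the deleted nucleotide ranges and the amino acid numbers given above follows from the position of the initiation codon of GEN-428B12c1 (nucleotide 115 of SEQ ID NO:33). The following is a minimal sketch of that mapping. The helper name `residues_for_range` is hypothetical, and 1-based numbering with the initiating methionine counted as residue 1 is assumed.

```python
def residues_for_range(nt_first, nt_last, atg_pos):
    """Map a 1-based nucleotide range to the 1-based residues whose codons it touches."""
    first = (nt_first - atg_pos) // 3 + 1
    last = (nt_last - atg_pos) // 3 + 1
    return first, last


ATG_C1 = 115  # first base of the initiation codon in SEQ ID NO:33

deletion_1 = residues_for_range(1060, 1104, ATG_C1)
deletion_2 = residues_for_range(1897, 1911, ATG_C1)
print(deletion_1)  # (316, 330)
print(deletion_2)  # (595, 599)

# Both deleted stretches are whole numbers of codons, so the reading frame is kept.
print((1104 - 1060 + 1) % 3, (1911 - 1897 + 1) % 3)  # 0 0
```

Both deleted ranges start on a codon boundary and span a multiple of three nucleotides (45 and 15), which is consistent with in-frame deletions of 15 and 5 residues respectively.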
[0259]It was further revealed that polymorphism existed in this gene (428B12c1.fasta), as shown below in Table 9, in the region of bases Nos. 1941-1966 of the GEN-428B12c1 cDNA sequence shown under SEQ ID NO:33, whereby a mutant protein was encoded which resulted from the mutation of IQDSCEITT (amino acid residues Nos. 610-618 in the amino acid sequence (SEQ ID NO:31) encoded by GEN-428B12c1) into YKILVISA.
TABLE-US-00009 TABLE 9 ##STR00001##
(3) Chromosomal Mapping of Human NPIK Gene by FISH
[0260]Chromosomal mapping of the human NPIK gene was carried out by FISH as described in Example 1 (3).
[0261]As a result, it was found that the locus of the human NPIK gene is in the chromosomal position 1q21.1-q21.3.
[0262]The human NPIK gene, a novel human gene, of the present invention included two cDNAs differing in the 5' region and capable of encoding 829 and 817 amino acid residues, as mentioned above. In view of this and further in view of the findings that the mRNA corresponding to this gene includes two deletable sites and there occurs polymorphism in a specific region corresponding to amino acid residues Nos. 610-618 of the GEN-428B12c1 amino acid sequence (SEQ ID NO:31), whereby a mutant protein is encoded, it is conceivable that human NPIK includes species resulting from a certain number of combinations, namely human NPIK, deletion-containing human NPIK, human NPIK mutant and/or deletion-containing human NPIK mutant.
[0263]Recently, several proteins belonging to the family that includes the above-mentioned PI3 and PI4 kinases have been shown to have protein kinase activity (Dhand, R., et al., EMBO J., 13:522-533 (1994); Stack, J. H. and Emr, S. D., J. Biol. Chem., 269:31552-31562 (1994); Hartley, K. O., et al., Cell, 82:849-856 (1995)).
[0264]It was also revealed that a protein belonging to this family is involved in DNA repair (Hartley, K. O., et al., Cell, 82:849-856 (1995)) and that a member of this family is the product of a causative gene of ataxia (Savitsky, K., et al., Science, 268:1749-1753 (1995)).
[0265]It can be anticipated that the human NPIK gene-encoded protein highly homologous to the family of these PI kinases is a novel enzyme phosphorylating lipids or proteins.
[0266]According to this example, the novel human NPIK gene is provided. The use of said gene makes it possible to detect the expression of said gene in various tissues and to manufacture the human NPIK protein by genetic engineering technology. Through these, it becomes possible to study lipid- or protein-phosphorylating enzymes such as those mentioned above, to study DNA repair, to study or diagnose diseases in which these are involved, for example cancer, and to screen and evaluate drugs for the treatment or prevention thereof.
(4) Construction of an Expression Vector for Fusion Protein
[0267]To subclone the coding region for a human NPIK gene (GEN-428B12c2), first of all, two primers, C1 and C2, having the sequences shown below in Table 10 were formed based on the information on the DNA sequences obtained above in (1).
TABLE-US-00010 TABLE 10 Primer Nucleotide sequence Primer C1 5'-CTCAGATCTATGGGAGATACAGTAGTGGAGC-3' (SEQ ID NO: 92) Primer C2 5'-TCGAGATCTTCACATGATGCCGTTGGTGAG-3' (SEQ ID NO: 93)
[0268]Both of the primers C1 and C2 have a BglII site, and primer C2 is an antisense primer.
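The statements about primers C1 and C2 can be checked directly against the sequences in Table 8 and Table 10. The following is a minimal sketch: it looks for the BglII recognition sequence (AGATCT) in each primer, confirms that C1 and C2 end with the sequences of primers A and B, and reverse-complements primer B to show the termination codon on the sense strand. The constant and function names are hypothetical and chosen for illustration only.

```python
BGLII_SITE = "AGATCT"  # BglII recognition sequence

PRIMER_A = "ATGGGAGATACAGTAGTGGAGC"            # Table 8, SEQ ID NO: 88
PRIMER_B = "TCACATGATGCCGTTGGTGAG"             # Table 8, SEQ ID NO: 89
PRIMER_C1 = "CTCAGATCTATGGGAGATACAGTAGTGGAGC"  # Table 10, SEQ ID NO: 92
PRIMER_C2 = "TCGAGATCTTCACATGATGCCGTTGGTGAG"   # Table 10, SEQ ID NO: 93


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


for name, primer in (("C1", PRIMER_C1), ("C2", PRIMER_C2)):
    # str.find returns -1 when the site is absent
    print(name, "BglII site at 0-based index", primer.find(BGLII_SITE))

# C1 and C2 are primers A and B with a 5' extension carrying the BglII site.
print(PRIMER_C1.endswith(PRIMER_A), PRIMER_C2.endswith(PRIMER_B))

# Read on the sense strand, the antisense primer B ends with the stop codon TGA.
sense_of_b = reverse_complement(PRIMER_B)
print(sense_of_b[-3:])
```

In this sketch the 5' extensions differ only in the three bases preceding the restriction site, while the gene-specific portions are identical to primers A and B.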
[0269]Using these two primers, cDNA derived from human fetal brain mRNA was amplified by PCR to provide a product having a length of about 2500 bases. The amplified cDNA was precipitated with ethanol and inserted into pT7BlueT-Vector (product of Novagen) to complete the subcloning. The entire sequence was determined in the same manner as in the above Examples. As a result, it was revealed that this gene had the polymorphism shown above in Table 9.
[0270]The above cDNA was cleaved by BglII and subjected to agarose gel electrophoresis. The cDNA was then excised from agarose gel and collected using GENECLEAN II KIT (product of Bio 101). The cDNA was inserted into pBlueBacHis2B-Vector (product of Invitrogen) at the BglII cleavage site and subcloning was completed.
[0271]The fusion vector thus obtained had a BglII cleavage site and was an expression vector for a fusion protein of the contemplated gene product (about 91 kd) and 38 amino acids derived from pBlueBacHis2B-Vector and containing a polyhistidine region and an epitope recognizing Anti-Xpress® antibody (product of Invitrogen).
(5) Transfection into Insect Cell Sf-9
[0272]The human NPIK gene was expressed using the Baculovirus expression system. Baculovirus is an insect-pathogenic virus with a circular double-stranded DNA genome and can produce large amounts of inclusion bodies composed of polyhedrin in the cells of insects. The Baculovirus expression was carried out using the Bac-N-Blue® Transfection Kit, which was developed by Invitrogen and utilizes this characteristic of Baculovirus.
[0273]Stated more specifically, 4 μg of pBlueBacHis2B containing the region of the human NPIK gene and 1 μg of Bac-N-Blue® DNA (product of Invitrogen) were co-transfected into Sf-9 cells in the presence of Insectin® liposomes (product of Invitrogen).
[0274]Prior to co-transfection, LacZ gene was incorporated into Bac-N-Blue® DNA, so that LacZ would be expressed only when homologous recombination took place between the Bac-N-Blue® DNA and pBlueBacHis2B. Thus when the co-transfected Sf-9 cells were incubated on agar medium, the plaques of the virus expressing the contemplated gene were easily detected as blue plaques.
[0275]The blue plaques were excised from each agar and suspended in 400 μl of medium to disperse the virus therein. The suspension was centrifuged to give a supernatant containing the virus. Sf-9 cells were infected with the virus again to increase the titre and to obtain a large amount of infective virus solution.
(6) Preparation of Human NPIK
[0276]The expression of the contemplated human NPIK gene was confirmed three days after infection with the virus as follows.
[0277]Sf-9 cells were collected and washed with PBS. The cells were boiled with a SDS-PAGE loading buffer for 5 minutes and SDS-PAGE was performed. According to the western blot technique using Anti-Xpress® as an antibody, the contemplated protein was detected at the position of its presumed molecular weight. By contrast, in the case of control cells uninfected with the virus, no band corresponding to human NPIK was observed in the same test.
[0278]Stated more specifically, three days after the infection of flasks (175-cm2, FALCON) of semi-confluent Sf-9 cells, the cells were harvested and washed with PBS, followed by resuspension in a buffer (20 mM Tris/HCl (pH 7.5), 1 mM EDTA and 1 mM DTT). The suspended cells were lysed by four 30-second sonications at 4° C. with 30-second intervals. The sonicated cells were centrifuged and the supernatant was collected. The protein in the supernatant was immunoprecipitated using an Anti-Xpress® antibody and obtained as a slurry of protein A-Sepharose beads. The slurry was boiled with an SDS-PAGE loading buffer for 5 minutes, and SDS-PAGE was performed for identification and quantification of NPIK. The slurry itself was used in the following assay.
(7) Confirmation of PI4 Kinase Activity
[0279]NPIK was expected to have the activity of incorporating phosphoric acid at the 4-position of the inositol ring of phosphatidylinositol (PI), namely, PI4 Kinase activity.
[0280]PI4 Kinase activity of NPIK was assayed according to the method of Takenawa, et al. (Yamakawa, A. and Takenawa, T., J. Biol. Chem., 263:17555-17560 (1988)) as shown below.
[0281]First prepared was a mixture of 10 μl of a NPIK slurry (20 mM Tris/HCl (pH 7.5), 1 mM EDTA, 1 mM DTT and 50% protein A beads), 10 μl of a PI solution (prepared by drying 5 mg of a PI-containing commercial chloroform solution in a stream of nitrogen onto a glass tube wall, adding 1 ml of 20 mM Tris/HCl (pH 7.5) buffer and forming micelles by sonication), 10 μl of an applied buffer (210 mM Tris/HCl (pH 7.5), 5 mM EGTA and 100 mM MgCl2) and 10 μl of distilled water. Thereto was added 10 μl of an ATP solution (5 μl of 500 μM ATP, 4.9 μl of distilled water and 0.1 μl of γ-32P ATP (6000 Ci/mmol, product of NEN Co., Ltd.)). The reaction was started at 30° C. and continued for 2, 5, 10 and minutes. The time 10 minutes was set as incubation time because a straight-line increase was observed around 10 minutes in incorporation of phosphoric acid into PI in the assaying process described below.
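The final concentrations in the assay mixture follow from the stated volumes by simple dilution. The following is a minimal sketch, assuming that the five 10 μl components listed above make up the complete reaction (50 μl in total) and ignoring the small amount of ATP contributed by the radiolabeled tracer. The helper name `final_conc` and the dictionary keys are hypothetical.

```python
def final_conc(stock_conc, stock_vol_ul, total_vol_ul):
    """Concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * stock_vol_ul / total_vol_ul


# Component volumes as stated in the text (microlitres)
component_volumes_ul = {
    "NPIK slurry": 10,
    "PI solution": 10,
    "applied buffer": 10,
    "distilled water": 10,
    "ATP solution": 10,
}
total_ul = sum(component_volumes_ul.values())

# 5 ul of 500 uM unlabeled ATP is contained in the 10 ul ATP solution
atp_uM = final_conc(500, 5, total_ul)
# The applied buffer supplies 100 mM MgCl2 and 5 mM EGTA in 10 ul
mgcl2_mM = final_conc(100, 10, total_ul)
egta_mM = final_conc(5, 10, total_ul)

print(total_ul, atp_uM, mgcl2_mM, egta_mM)  # 50 50.0 20.0 1.0
```

Under these assumptions the reaction contains about 50 μM ATP, 20 mM MgCl2 and 1 mM EGTA. The Tris concentration is not computed here because several components contribute Tris buffer.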
[0282]After completion of the reaction, PI was fractionated by the solvent extraction method and finally re-suspended in chloroform. The suspension was developed by thin layer chromatography (TLC) and the radioactivity of the reaction product at the PI4P-position was assayed using an analyzer (trade name: Bio-Image; product of Fuji Photo Film Co., Ltd.).
[0283]FIG. 1 shows the results. FIG. 1 is an analytical diagram of the results of assaying the radioactivity based on TLC as mentioned above. The right lane (2) is the fraction of Sf-9 cell cytoplasm infected with the NPIK-containing virus, whereas the left lane (1) is the fraction of uninfected Sf-9 cell cytoplasm.
[0284]Also, predetermined amounts of Triton X-100® and adenosine were added to the above reaction system to check how such addition would affect the PI4 Kinase activity. The PI4 Kinase activity was assayed in the same manner as above.
[0285]FIG. 2 shows the results. The results confirmed that NPIK had a typical PI4 Kinase activity accelerated by Triton X-100® and inhibited by adenosine.
Example 10
Nel-Related Protein Type 1 (NRP1) Gene and Nel-Related Protein Type 2 (NRP2) Gene
(1) Cloning and DNA Sequencing of NRP1 Gene and NRP2 Gene
[0286]EGF-like repeats have been found in many membrane proteins and in proteins related to growth regulation and differentiation. This motif seems to be involved in protein-protein interactions.
[0287]Recently, a gene encoding nel, a novel peptide containing five EGF-like repeats, was cloned from a chick embryonic cDNA library (Matsuhashi, S., et al., Dev. Dynamics, 203:212-222 (1995)). This product is considered to be a transmembrane molecule with its EGF-like repeats in the extracellular domain. A 4.5 kb transcript (nel mRNA) is expressed in various tissues at the embryonic stage and exclusively in brain and retina after hatching.
[0288]Following the procedure of Example 1 (1), cDNA clones were randomly selected from a human fetal brain cDNA library and subjected to sequence analysis, followed by database searching. As a result, two cDNA clones with significantly high homology to the above-mentioned nel were found and named GEN-073E07 and GEN-093E05, respectively.
[0289]Since both clones were lacking in the 5' portion, 5' RACE was performed in the same manner as in Example 2 (2) to obtain the entire coding regions.
[0290]As for the primers for 5' RACE, primers having an arbitrary sequence obtained from the cDNA sequences of the above clones were synthesized while the anchor primer attached to a commercial kit was used as such.
[0291]5' RACE clones obtained from the PCR were sequenced and the sequences seemingly covering the entire coding regions of both genes were obtained. These genes were respectively named nel-related protein type 1 (NRP1) gene and nel-related protein type 2 (NRP2) gene.
[0292]The NRP1 gene contains an open reading frame of 2,430 nucleotides, as shown under SEQ ID NO:35, the amino acid sequence deduced therefrom comprises 810 amino acid residues, as shown under SEQ ID NO:34, and the nucleotide sequence of the entire cDNA clone of said NRP1 gene comprises 2,977 nucleotides, as shown under SEQ ID NO:36.
[0293]On the other hand, the NRP2 gene contains an open reading frame of 2,448 nucleotides, as shown under SEQ ID NO:38, the amino acid sequence deduced therefrom comprises 816 amino acid residues, as shown under SEQ ID NO:37, and the nucleotide sequence of the entire cDNA clone of said NRP2 gene comprises 3,198 nucleotides, as shown under SEQ ID NO:39.
[0294]Furthermore, the coding regions were amplified by RT-PCR to exclude the possibility that either of the sequences obtained was a chimeric cDNA.
[0295]The deduced NRP1 and NRP2 gene products both showed highly hydrophobic N termini capable of functioning as signal peptides for membrane insertion. Unlike chick embryonic nel, they both appeared to have no hydrophobic transmembrane domain. Comparison among NRP1, NRP2 and nel with respect to the deduced peptide sequences revealed that NRP2, with 80% homology at the amino acid level, is more closely related to nel than NRP1, which has 50% homology. The cysteine residues in the cysteine-rich domains and EGF-like repeats were found to be completely conserved.
[0296]The most remarkable difference between the NRPs and the chick protein was that the human homologs lack the putative transmembrane domain of nel. However, even in this lacking region, the nucleotide sequences of NRPs were very similar to that of nel. Furthermore, the two NRPs each possessed six EGF-like repeats, whereas nel has only five.
[0297]Other unique motifs of nel as reported by Matsuhashi et al. (Matsuhashi, S., et al., Dev. Dynamics, 203:212-222 (1995)) were also found in the NRPs at equivalent positions. Since as mentioned above, it was shown that the two deduced NRP peptides are not transmembrane proteins, the NRPs might be secretory proteins or proteins anchored to membranes as a result of posttranslational modification.
[0298]The present inventors speculate that NRPs might function as ligands by stimulating other molecules such as EGF receptors. The present inventors further found that an extra EGF-like repeat could be encoded in nel upon frame shifting of the membrane domain region of nel.
[0299]When paralleled and compared with NRP2 and nel, the frame-shifted amino acid sequence showed similarities over the whole range of NRP2 and of nel, suggesting that NRP2 might be a human counterpart of nel. In contrast, NRP1 is considered to be not a human counterpart of nel but a homologous gene.
(2) Northern Blot Analysis
[0300]Northern blot analysis was carried out as described in Example 1 (2). Thus, the entire sequences of the cDNAs of both clones were amplified by PCR, the PCR products were purified and labeled with [32P]-dCTP (random-primed DNA labeling kit, Boehringer Mannheim) and normal human tissues were examined for NRP mRNA expression using an MTN blot with the labeled products as two probes.
[0301]Sixteen adult tissues and four human fetal tissues were examined for the expression pattern of two NRPs.
[0302]As a result of the Northern blot analysis, it was found that a 3.5 kb transcript of NRP1 was weakly expressed in fetal and adult brain and kidney. A 3.6 kb transcript of NRP2 was strongly expressed in adult and fetal brain alone, with weak expression thereof in fetal kidney as well.
[0303]This suggests that NRPs might play a brain-specific role, for example as signal molecules for growth regulation. In addition, these genes might have a particular function in kidney.
(3) Chromosomal Mapping of NRP1 Gene and NRP2 Gene by FISH
[0304]Chromosomal mapping of the NRP1 gene and NRP2 gene was performed by FISH as described in Example 1 (3).
[0305]As a result, it was revealed that the chromosomal locus of the NRP1 gene is localized to 11p15.1-p15.2 and the chromosomal locus of the NRP2 gene to 12q13.11-q13.12.
[0306]According to the present invention, the novel human NRP1 gene and NRP2 gene are provided and the use of said genes makes it possible to detect the expression of said genes in various tissues and produce the human NRP1 and NRP2 proteins by the technology of genetic engineering. They can further be used in the study of the brain neurotransmission system, diagnosis of various diseases related to neurotransmission in the brain, and the screening and evaluation of drugs for the treatment and prevention of such diseases. Furthermore, the possibility is suggested that these EGF domain-containing NRPs act as growth factors in brain, hence they may be useful in the diagnosis and treatment of various kinds of intracerebral tumor and effective in nerve regeneration in cases of degenerative nervous diseases.
Example 11
GSPT1-Related Protein (GSPT1-TK) Gene
(1) GSPT1-TK Gene Cloning and DNA Sequencing
[0307]The human GSPT1 gene is one of the human homologous genes of the yeast GST1 gene that encodes the GTP-binding protein essential for the G1 to S phase transition in the cell cycle. The yeast GST1 gene, first identified as a gene capable of complementing a temperature-sensitive gst1 (G1-to-S transition) mutant of Saccharomyces cerevisiae, was isolated from a yeast genomic library (Kikuchi, Y., Shimatake, H. and Kikuchi, A., EMBO J., 7:1175-1182 (1988)) and encodes a protein with a target site of cAMP-dependent protein kinases and a GTPase domain.
[0308]The human GSPT1 gene was isolated from a KB cell cDNA library by hybridization using the yeast GST1 gene as a probe (Hoshino, S., Miyazawa, H., Enomoto, T., Hanaoka, F., Kikuchi, Y., Kikuchi, A. and Ui, M., EMBO J., 8:3807-3814 (1989)). The deduced protein of said GSPT1 gene, like yeast GST1, has a GTP-binding domain and a GTPase activity center, and plays an important role in cell proliferation.
[0309]Furthermore, a breakpoint for chromosome rearrangement has been observed in the GSPT1 gene located in the chromosomal locus 16p13.3 in patients with acute nonlymphocytic leukemia (ANLL) (Ozawa, K., Murakami, Y., Eki, T., Yokoyama, K., Soeda, E., Hoshino, S., Ui, M. and Hanaoka, F., Somatic Cell and Molecular Genet., 18:189-194 (1992)).
[0310]cDNA clones were randomly selected from a human fetal brain cDNA library and subjected to sequence analysis as described in Example 1 (1) and database searching was performed and, as a result, a clone having a 0.3 kb cDNA sequence highly homologous to the above-mentioned GSPT1 gene was found and named GEN-077A09. The GEN-077A09 clone seemed to be lacking in the 5' region, so that 5' RACE was carried out in the same manner as in Example 2 (2) to obtain the entire coding region.
[0311]The primers used for the 5' RACE were P1 and P2 primers respectively having the nucleotide sequences shown in Table 11 as designed based on the known cDNA sequence of the above-mentioned cDNA, and the anchor primer used was the one attached to the commercial kit. Thirty-five cycles of PCR were performed under the following conditions: 94° C. for 45 seconds, 58° C. for 45 seconds and 72° C. for 2 minutes. Finally, elongation reaction was carried out at 72° C. for 7 minutes.
TABLE-US-00011 TABLE 11 Primer Nucleotide sequence P1 primer 5'-GATTTGTGCTCAATAATCACTATCTGAA-3' (SEQ ID NO: 94) P2 primer 5'-GGTTACTAGGATCACAAAGTATGAATTCTGGAA-3' (SEQ ID NO: 95)
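The cycling conditions stated for the 5' RACE PCR can be written down as a small program description, which also gives a lower bound on the run time. The following is a minimal sketch. The data layout and the function name `programmed_hold_seconds` are hypothetical, and temperature ramping between steps, which depends on the instrument, is not included.

```python
# Cycling conditions as stated in the text (hold time in seconds per step)
CYCLES = 35
STEPS = [
    ("denaturation, 94 C", 45),
    ("annealing, 58 C", 45),
    ("extension, 72 C", 120),
]
FINAL_EXTENSION_S = 7 * 60  # final elongation at 72 C


def programmed_hold_seconds(cycles, steps, final_extension_s):
    """Total programmed hold time, excluding temperature ramping."""
    per_cycle = sum(duration for _, duration in steps)
    return cycles * per_cycle + final_extension_s


total_s = programmed_hold_seconds(CYCLES, STEPS, FINAL_EXTENSION_S)
print(total_s, total_s / 60)  # 7770 129.5
```

The programmed holds therefore add up to roughly 130 minutes; the actual run would take longer once ramping is accounted for.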
[0312]Several of the 5' RACE clones obtained from the above PCR were sequenced and the base sequence of that cDNA clone showing overlapping between the 5' RACE clones and the GEN-077A09 clone was determined to thereby reveal the sequence regarded as covering the entire coding region. This was named GSPT1-related protein "GSPT1-TK gene".
[0313]The GSPT1-TK gene was found to contain an open reading frame of 1,497 nucleotides, as shown under SEQ ID NO:41. The amino acid sequence deduced therefrom contained 499 amino acid residues, as shown under SEQ ID NO:40.
[0314]The nucleotide sequence of the whole cDNA clone of the GSPT1-TK gene was found to comprise 2,057 nucleotides, as shown under SEQ ID NO:42, and the molecular weight was calculated at 55,740 daltons.
[0315]The first methionine codon (ATG) in the open reading frame had no in-frame termination codon upstream of it, but this ATG was surrounded by a sequence similar to the Kozak consensus sequence for translational initiation. Therefore, it was concluded that this ATG triplet occurring in positions 144-146 of the relevant sequence is the initiation codon.
[0316]Furthermore, a polyadenylation signal, AATAAA (within SEQ ID NO:42), was observed 13 nucleotides upstream from the polyadenylation site.
[0317]Human GSPT1-TK contains a glutamic acid rich region near the N terminus, and 18 of 20 glutamic acid residues occurring in this region of human GSPT1-TK are conserved and align perfectly with those of the human GSPT1 protein. Several regions (G1, G2, G3, G4 and G5) of GTP-binding proteins that are responsible for guanine nucleotide binding and hydrolysis were found conserved in the GSPT1-TK protein just as in the human GSPT1 protein.
[0318]Thus, the DNA sequence of human GSPT1-TK was found 89.4% identical, and the amino acid sequence deduced therefrom 92.4% identical, with the corresponding sequence of human GSPT1 which supposedly plays an important role in the G1 to S phase transition in the cell cycle. Said amino acid sequence showed 50.8% identity with that of yeast GST1.
(2) Northern Blot Analysis
[0319]Northern blot analysis was carried out as described in Example 1 (2). Thus, the GEN-077A09 cDNA clone was amplified by PCR, the PCR product was purified and labeled with [32P]-dCTP (random-primed DNA labeling kit, Boehringer Mannheim), and normal human tissues were examined for the expression of GSPT1-TK mRNA therein using an MTN blot with the labeled product as a probe.
[0320]As a result of the Northern blot analysis, a 2.7 kb major transcript was detected in various tissues. The level of human GSPT1-TK expression seemed highest in brain and in testis.
(3) Chromosome Mapping of GSPT1-TK Gene by FISH
[0321]Chromosome mapping of the GSPT1-TK gene was performed by FISH as described in Example 1 (3).
[0322]As a result, it was found that the GSPT1-TK gene is localized at the chromosomal locus 19p13.3. At this chromosomal site, reciprocal translocation has been observed very frequently in cases of acute lymphocytic leukemia (ALL) and acute myeloid leukemia (AML). In addition, it is reported that acute non-lymphocytic leukemia (ANLL) is associated with rearrangements involving the human GSPT1 region (Ozawa, K., Murakami, Y., Eki, T., Yokoyama, K., Soeda, E., Hoshino, S., Ui, M. and Hanaoka, F., Somatic Cell and Molecular Genet., 18:189-194 (1992)).
[0323]In view of the above, it is suggested that this gene is the best candidate gene associated with ALL and AML.
[0324]In accordance with the present invention, the novel human GSPT1-TK gene is provided and the use of said gene makes it possible to detect the expression of said gene in various tissues and produce the human GSPT1-TK protein by the technology of genetic engineering. These can be used in the studies of cell proliferation, as mentioned above, and further make it possible to diagnose various diseases associated with the chromosomal locus of this gene, for example acute myelocytic leukemia. This is because translocation of this gene may result in disruption of the GSPT1-TK gene, and a fused protein expressed upon said translocation may cause such diseases.
[0325]Furthermore, it is expected that diagnosis and treatment of said diseases can be made possible by producing antibodies to such fused protein, revealing the intracellular localization of said protein and examining its expression specific to said diseases. Therefore, it is also expected that the use of the gene of the present invention makes it possible to screen out and evaluate drugs for the treatment and prevention of said diseases.
Sequence CWU
1
961122PRTHomo sapiens 1Met Glu Leu Gly Glu Asp Gly Ser Val Tyr Lys Ser Ile
Leu Val Thr 1 5 10 15Ser
Gln Asp Lys Ala Pro Ser Val Ile Ser Arg Val Leu Lys Lys Asn
20 25 30Asn Arg Asp Ser Ala Val Ala Ser
Glu Tyr Glu Leu Val Gln Leu Leu 35 40
45Pro Gly Glu Arg Glu Leu Thr Ile Pro Ala Ser Ala Asn Val Phe Tyr
50 55 60Pro Met Asp Gly Ala Ser His
Asp Phe Leu Leu Arg Gln Arg Arg Arg 65 70
75 80Ser Ser Thr Ala Thr Pro Gly Val Thr Ser Gly Pro
Ser Ala Ser Gly 85 90
95Thr Pro Pro Ser Glu Gly Gly Gly Gly Ser Phe Pro Arg Ile Lys Ala
100 105 110Thr Gly Arg Lys Ile Ala Arg
Ala Leu Phe 115 1202366DNAHomo sapiens 2atggagttgg
gggaagatgg cagtgtctat aagagcattt tggtgacaag ccaggacaag 60gctccaagtg
tcatcagtcg tgtccttaag aaaaacaatc gtgactctgc agtggcttca 120gagtatgagc
tggtacagct gctaccaggg gagcgagagc tgactatccc agcctcggct 180aatgtattct
accccatgga tggagcttca cacgatttcc tcctgcggca gcggcgaagg 240tcctctactg
ctacacctgg cgtcaccagt ggcccgtctg cctcaggaac tcctccgagt 300gagggaggag
ggggctcctt tcccaggatc aaggccacag ggaggaagat tgcacgggca 360ctgttc
3663842DNAHomo
sapiensCDS(28)..(393) 3cccacgagcc gtatcatccg agtccag atg gag ttg ggg gaa
gat ggc agt gtc 54 Met Glu Leu Gly Glu
Asp Gly Ser Val 1 5tat aag
agc att ttg gtg aca agc cag gac aag gct cca agt gtc atc 102Tyr Lys
Ser Ile Leu Val Thr Ser Gln Asp Lys Ala Pro Ser Val Ile 10
15 20 25agt cgt gtc ctt aag aaa aac
aat cgt gac tct gca gtg gct tca gag 150 Ser Arg Val Leu Lys Lys
Asn Asn Arg Asp Ser Ala Val Ala Ser Glu 30
35 40tat gag ctg gta cag ctg cta cca ggg gag cga gag
ctg act atc cca 198 Tyr Glu Leu Val Gln Leu Leu Pro Gly Glu Arg
Glu Leu Thr Ile Pro 45 50
55gcc tcg gct aat gta ttc tac ccc atg gat gga gct tca cac gat ttc
246Ala Ser Ala Asn Val Phe Tyr Pro Met Asp Gly Ala Ser His Asp Phe
60 65 70ctc ctg cgg cag cgg cga agg
tcc tct act gct aca cct ggc gtc acc 294Leu Leu Arg Gln Arg Arg Arg
Ser Ser Thr Ala Thr Pro Gly Val Thr 75 80
85agt ggc ccg tct gcc tca gga act cct ccg agt gag gga gga ggg ggc
342Ser Gly Pro Ser Ala Ser Gly Thr Pro Pro Ser Glu Gly Gly Gly Gly 90
95 100 105tcc ttt ccc agg
atc aag gcc aca ggg agg aag att gca cgg gca ctg 390Ser Phe Pro Arg
Ile Lys Ala Thr Gly Arg Lys Ile Ala Arg Ala Leu 110
115 120ttc tgaggaggaa gccccttttt ttacagaagt
catggtgttc ataccagatg 443Phetgggtagcca tcctgaatgg tggcaattat
atcacattga gacagaaatt cagaaaggga 503gccagccacc ctggggcagt gaagtgccac
tggtttacca gacagctgag aaatccagcc 563ctgtcggaac tggtgtctta taaccaagtt
ggatacctgt gtatagcttg ccaccttcca 623tgagtgcagc acacaggtag tgctggaaaa
acgcatcagt ttctgattct tggccatatc 683ctaacatgca agggccaagc aaaggcttca
aggctctgag ccccagggca gaggggaatg 743gcaaaatgta ggtcctggca ggagctcttc
ttcccactct gggggtttct atcactgtga 803caacactaag ataataaacc aaaacactac
ctgaattct 8424193PRTHomo sapiens 4Met Glu Leu
Glu Leu Tyr Gly Val Asp Asp Lys Phe Tyr Ser Lys Leu 1 5
10 15Asp Gln Glu Asp Ala Leu Leu Gly Ser
Tyr Pro Val Asp Asp Gly Cys 20 25
30Arg Ile His Val Ile Asp His Ser Gly Ala Arg Leu Gly Glu Tyr Glu
35 40 45Asp Val Ser Arg Val Glu
Lys Tyr Thr Ile Ser Gln Glu Ala Tyr Asp 50 55
60Gln Arg Gln Asp Thr Val Arg Ser Phe Leu Lys Arg Ser Lys Leu
Gly 65 70 75 80Arg Tyr
Asn Glu Glu Glu Arg Ala Gln Gln Glu Ala Glu Ala Ala Gln
85 90 95Arg Leu Ala Glu Glu Lys Ala Gln
Ala Ser Ser Ile Pro Val Gly Ser 100 105
110Arg Cys Glu Val Arg Ala Ala Gly Gln Ser Pro Arg Arg Gly Thr
Val 115 120 125Met Tyr Val Gly Leu
Thr Asp Phe Lys Pro Gly Tyr Trp Ile Gly Val 130 135
140Arg Tyr Asp Glu Pro Leu Gly Lys Asn Asp Gly Ser Val Asn
Gly Lys145 150 155 160Arg
Tyr Phe Glu Cys Gln Ala Lys Tyr Gly Ala Phe Val Lys Pro Ala
165 170 175Val Val Thr Val Gly Asp Phe
Pro Glu Glu Asp Tyr Gly Leu Asp Glu 180 185
190Ile5579DNAHomo sapiens 5atggaactgg agctgtatgg agttgacgac
aagttctaca gcaagctgga tcaagaggat 60gcgctcctgg gctcctaccc tgtagatgac
ggctgccgca tccacgtcat tgaccacagt 120ggcgcccgcc ttggtgagta tgaggacgtg
tcccgggtgg agaagtacac gatctcacaa 180gaagcctacg accagaggca agacacggtc
cgctctttcc tgaagcgcag caagctcggc 240cggtacaacg aggaggagcg ggctcagcag
gaggccgagg ccgcccagcg cctggccgag 300gagaaggccc aggccagctc catccccgtg
ggcagccgct gtgaggtgcg ggcggcggga 360caatcccctc gccggggcac cgtcatgtat
gtaggtctca cagatttcaa gcctggctac 420tggattggtg tccgctatga tgagccactg
gggaaaaatg atggcagtgt gaatgggaaa 480cgctacttcg aatgccaggc caagtatggc
gcctttgtca agccagcagt cgtgacggtg 540ggggacttcc cggaggagga ctacgggttg
gacgagata 57961015DNAHomo
sapiensCDS(274)..(852) 6tgattggtca ggcacggagc aggaggcggg ctgatagccc
agcagcagca gcggcggcgg 60cggctgcgga gcgggtgtga ggcggctgga ccgcgctgca
ggcatccgcg ggcgcggcaa 120gatggaggtg acgggggtgt cggcaccacg gtgaccgttt
tcatcagcag ctccctcagc 180accttccgct ccgagaagcg atacagccgc agcctcacca
tcgctgagtt caagtgtaaa 240ctggagttgc tggtgggcag ccctgcttcc tgc atg gaa
ctg gag ctg tat gga 294 Met Glu
Leu Glu Leu Tyr Gly 1
5gtt gac gac aag ttc tac agc aag ctg gat caa gag gat gcg ctc ctg
342Val Asp Asp Lys Phe Tyr Ser Lys Leu Asp Gln Glu Asp Ala Leu Leu
10 15 20ggc tcc tac cct gta gat gac
ggc tgc cgc atc cac gtc att gac cac 390Gly Ser Tyr Pro Val Asp Asp
Gly Cys Arg Ile His Val Ile Asp His 25 30
35agt ggc gcc cgc ctt ggt gag tat gag gac gtg tcc cgg gtg gag aag
438Ser Gly Ala Arg Leu Gly Glu Tyr Glu Asp Val Ser Arg Val Glu Lys 40
45 50 55tac acg atc tca
caa gaa gcc tac gac cag agg caa gac acg gtc cgc 486Tyr Thr Ile Ser
Gln Glu Ala Tyr Asp Gln Arg Gln Asp Thr Val Arg 60
65 70tct ttc ctg aag cgc agc aag ctc ggc cgg
tac aac gag gag gag cgg 534Ser Phe Leu Lys Arg Ser Lys Leu Gly Arg
Tyr Asn Glu Glu Glu Arg 75 80
85gct cag cag gag gcc gag gcc gcc cag cgc ctg gcc gag gag aag gcc
582Ala Gln Gln Glu Ala Glu Ala Ala Gln Arg Leu Ala Glu Glu Lys Ala
90 95 100cag gcc agc tcc atc ccc gtg
ggc agc cgc tgt gag gtg cgg gcg gcg 630Gln Ala Ser Ser Ile Pro Val
Gly Ser Arg Cys Glu Val Arg Ala Ala 105 110
115gga caa tcc cct cgc cgg ggc acc gtc atg tat gta ggt ctc aca gat
678Gly Gln Ser Pro Arg Arg Gly Thr Val Met Tyr Val Gly Leu Thr Asp120
125 130 135ttc aag cct ggc
tac tgg att ggt gtc cgc tat gat gag cca ctg ggg 726Phe Lys Pro Gly
Tyr Trp Ile Gly Val Arg Tyr Asp Glu Pro Leu Gly 140
145 150aaa aat gat ggc agt gtg aat ggg aaa cgc
tac ttc gaa tgc cag gcc 774Lys Asn Asp Gly Ser Val Asn Gly Lys Arg
Tyr Phe Glu Cys Gln Ala 155 160
165aag tat ggc gcc ttt gtc aag cca gca gtc gtg acg gtg ggg gac ttc
822Lys Tyr Gly Ala Phe Val Lys Pro Ala Val Val Thr Val Gly Asp Phe
170 175 180ccg gag gag gac tac ggg ttg
gac gag ata tgacacctaa ggaattcccc 872Pro Glu Glu Asp Tyr Gly Leu
Asp Glu Ile 185 190tgcttcagct cctagctcag ccactgactg
cccctcctgt gtgtgcccat ggcccttttc 932tcctgacccc attttaattt tattcatttt
ttcctttgcc attgattttt gagactcatg 992cattaaattc actagaaacc cag
10157128PRTHomo sapiens 7Met Thr Glu Ala
Asp Val Asn Pro Lys Ala Tyr Pro Leu Ala Asp Ala 1 5
10 15His Leu Thr Lys Lys Leu Leu Asp Leu Val
Gln Gln Ser Cys Asn Tyr 20 25
30Lys Gln Leu Arg Lys Gly Ala Asn Glu Ala Thr Lys Thr Leu Asn Arg
35 40 45Gly Ile Ser Glu Phe Ile Val
Met Ala Ala Asp Ala Glu Pro Leu Glu 50 55
60Ile Ile Leu His Leu Pro Leu Leu Cys Glu Asp Lys Asn Val Pro Tyr
65 70 75 80Val Phe Val
Arg Ser Lys Gln Ala Leu Gly Arg Ala Cys Gly Val Ser 85
90 95Arg Pro Val Ile Ala Cys Ser Val Thr
Ile Lys Glu Gly Ser Gln Leu 100 105
110Lys Gln Gln Ile Gln Ser Ile Gln Gln Ser Ile Glu Arg Leu Leu Val
115 120 1258384DNAHomo sapiens
8atgactgagg ctgatgtgaa tccaaaggcc tatccccttg ccgatgccca cctcaccaag
60aagctactgg acctcgttca gcagtcatgt aactataagc agcttcggaa aggagccaat
120gaggccacca aaaccctcaa caggggcatc tctgagttca tcgtgatggc tgcagacgcc
180gagccactgg agatcattct gcacctgccg ctgctgtgtg aagacaagaa tgtgccctac
240gtgtttgtgc gctccaagca ggccctgggg agagcctgtg gggtctccag gcctgtcatc
300gcctgttctg tcaccatcaa agaaggctcg cagctgaaac agcagatcca atccattcag
360cagtccattg aaaggctctt agtc
38491493DNAHomo sapiensCDS(95)..(478) 9atccgtgtcc ttgcggtgct gggcagcaga
ccgtccaaac cgacacgcgt ggtatcctcg 60cggtgtccgg caagagacta ccaagacaga
cgct atg act gag gct gat gtg aat 115
Met Thr Glu Ala Asp Val Asn 1
5cca aag gcc tat ccc ctt gcc gat gcc cac ctc acc aag aag cta
ctg 163Pro Lys Ala Tyr Pro Leu Ala Asp Ala His Leu Thr Lys Lys Leu
Leu 10 15 20gac ctc gtt cag cag
tca tgt aac tat aag cag ctt cgg aaa gga gcc 211Asp Leu Val Gln Gln
Ser Cys Asn Tyr Lys Gln Leu Arg Lys Gly Ala 25 30
35aat gag gcc acc aaa acc ctc aac agg ggc atc tct gag ttc
atc gtg 259Asn Glu Ala Thr Lys Thr Leu Asn Arg Gly Ile Ser Glu Phe
Ile Val 40 45 50 55atg
gct gca gac gcc gag cca ctg gag atc att ctg cac ctg ccg ctg 307Met
Ala Ala Asp Ala Glu Pro Leu Glu Ile Ile Leu His Leu Pro Leu
60 65 70ctg tgt gaa gac aag aat gtg
ccc tac gtg ttt gtg cgc tcc aag cag 355Leu Cys Glu Asp Lys Asn Val
Pro Tyr Val Phe Val Arg Ser Lys Gln 75 80
85gcc ctg ggg aga gcc tgt ggg gtc tcc agg cct gtc atc gcc
tgt tct 403Ala Leu Gly Arg Ala Cys Gly Val Ser Arg Pro Val Ile Ala
Cys Ser 90 95 100gtc acc atc aaa
gaa ggc tcg cag ctg aaa cag cag atc caa tcc att 451Val Thr Ile Lys
Glu Gly Ser Gln Leu Lys Gln Gln Ile Gln Ser Ile 105
110 115cag cag tcc att gaa agg ctc tta gtc taaacctgtg
gcctctgcca 498Gln Gln Ser Ile Glu Arg Leu Leu Val120
125cgtgctccct gccagcttcc cccctgaggt tgtgtatcat attatctgtg
ttagcatgta 558gtattttcag ctactctcta ttgttataaa atgtagtact aaatctggtt
tctggatttt 618tgtgttgttt ttgttctgtt ttacagggtt gctatccccc ttcctttcct
ccctccctct 678gccatccttc atccttttat cctccctttt tggaacaagt gttcagagca
gacagaagca 738gggtggtggc accgttgaaa ggcagaaaga gccaggagaa agctgatgga
gccaggacag 798agatctggtt ccagctttca gccactagct tcctgttgtg tgcggggtgt
ggtggaatta 858aacagcattc attgtgtgtc cctgtgcctg gcacacagaa tcattcatac
gtgttcaagt 918gatcaagggg tttcatttgc tcttggggga ttaggtatca tttggggagg
aagcatgtgt 978tctgtgaggt tgttcggcta tgtccaagtg tcgtttacta atgtacccct
gctgtttgct 1038tttggtaatg tgatgttgat gttctccccc tacccacaac catgcccttg
agggtagcag 1098ggcagcagca taccaaagag atgtgctgca ggactccgga ggcagcctgg
gtgggtgagc 1158catggggcag ttgacctggg tcttgaaaga gtcgggagtg acaagctcag
agagcatgaa 1218ctgatgctgg catgaaggat tccaggaaga tcatggagac ctggctggta
gctgtaacag 1278agatggtgga gtccaaggaa acagcctgtc tctggtgaat gggactttct
ttggtggaca 1338cttggcacca gctctgagag cccttcccct gtgtcctgcc accatgtggg
tcagatgtac 1398tctctgtcac atgaggagag tgctagttca tgtgttctcc attcttgtga
gcatcctaat 1458aaatctgttc cattttgaaa aaaaaaaaaa aaaaa
149310711PRTHomo sapiens 10Met Pro Ala Asp Val Asn Leu Ser Gln
Lys Pro Gln Val Leu Gly Pro 1 5 10
15Glu Lys Gln Asp Gly Ser Cys Glu Ala Ser Val Ser Phe Glu Asp
Val 20 25 30Thr Val Asp Phe
Ser Arg Glu Glu Trp Gln Gln Leu Asp Pro Ala Gln 35
40 45Arg Cys Leu Tyr Arg Asp Val Met Leu Glu Leu Tyr
Ser His Leu Phe 50 55 60Ala Val Gly
Tyr His Ile Pro Asn Pro Glu Val Ile Phe Arg Met Leu 65
70 75 80Lys Glu Lys Glu Pro Arg Val Glu
Glu Ala Glu Val Ser His Gln Arg 85 90
95Cys Gln Glu Arg Glu Phe Gly Leu Glu Ile Pro Gln Lys Glu
Ile Ser 100 105 110Lys Lys Ala
Ser Phe Gln Lys Asp Met Val Gly Glu Phe Thr Arg Asp 115
120 125Gly Ser Trp Cys Ser Ile Leu Glu Glu Leu Arg
Leu Asp Ala Asp Arg 130 135 140Thr Lys
Lys Asp Glu Gln Asn Gln Ile Gln Pro Met Ser His Ser Ala145
150 155 160Phe Phe Asn Lys Lys Thr Leu
Asn Thr Glu Ser Asn Cys Glu Tyr Lys 165
170 175Asp Pro Gly Lys Met Ile Arg Thr Arg Pro His Leu
Ala Ser Ser Gln 180 185 190Lys
Gln Pro Gln Lys Cys Cys Leu Phe Thr Glu Ser Leu Lys Leu Asn 195
200 205Leu Glu Val Asn Gly Gln Asn Glu Ser
Asn Asp Thr Glu Gln Leu Asp 210 215
220Asp Val Val Gly Ser Gly Gln Leu Phe Ser His Ser Ser Ser Asp Ala225
230 235 240Cys Ser Lys Asn
Ile His Thr Gly Glu Thr Phe Cys Lys Gly Asn Gln 245
250 255Cys Arg Lys Val Cys Gly His Lys Gln Ser
Leu Lys Gln His Gln Ile 260 265
270His Thr Gln Lys Lys Pro Asp Gly Cys Ser Glu Cys Gly Gly Ser Phe
275 280 285Thr Gln Lys Ser His Leu Phe
Ala Gln Gln Arg Ile His Ser Val Gly 290 295
300Asn Leu His Glu Cys Gly Lys Cys Gly Lys Ala Phe Met Pro Gln
Leu305 310 315 320Lys Leu
Ser Val Tyr Leu Thr Asp His Thr Gly Asp Ile Pro Cys Ile
325 330 335Cys Lys Glu Cys Gly Lys Val
Phe Ile Gln Arg Ser Glu Leu Leu Thr 340 345
350His Gln Lys Thr His Thr Arg Lys Lys Pro Tyr Lys Cys His
Asp Cys 355 360 365Gly Lys Ala Phe
Phe Gln Met Leu Ser Leu Phe Arg His Gln Arg Thr 370
375 380His Ser Arg Glu Lys Leu Tyr Glu Cys Ser Glu Cys
Gly Lys Gly Phe385 390 395
400Ser Gln Asn Ser Thr Leu Ile Ile His Gln Lys Ile His Thr Gly Glu
405 410 415Arg Gln Tyr Ala Cys
Ser Glu Cys Gly Lys Ala Phe Thr Gln Lys Ser 420
425 430Thr Leu Ser Leu His Gln Arg Ile His Ser Gly Gln
Lys Ser Tyr Val 435 440 445Cys Ile
Glu Cys Gly Gln Ala Phe Ile Gln Lys Ala His Leu Ile Val 450
455 460His Gln Arg Ser His Thr Gly Glu Lys Pro Tyr
Gln Cys His Asn Cys465 470 475
480Gly Lys Ser Phe Ile Ser Lys Ser Gln Leu Asp Ile His His Arg Ile
485 490 495His Thr Gly Glu
Lys Pro Tyr Glu Cys Ser Asp Cys Gly Lys Thr Phe 500
505 510Thr Gln Lys Ser His Leu Asn Ile His Gln Lys
Ile His Thr Gly Glu 515 520 525Arg
His His Val Cys Ser Glu Cys Gly Lys Ala Phe Asn Gln Lys Ser 530
535 540Ile Leu Ser Met His Gln Arg Ile His Thr
Gly Glu Lys Pro Tyr Lys545 550 555
560Cys Ser Glu Cys Gly Lys Ala Phe Thr Ser Lys Ser Gln Phe Lys
Glu 565 570 575His Gln Arg
Ile His Thr Gly Glu Lys Pro Tyr Val Cys Thr Glu Cys 580
585 590Gly Lys Ala Phe Asn Gly Arg Ser Asn Phe
His Lys His Gln Ile Thr 595 600
605His Thr Arg Glu Arg Pro Phe Val Cys Tyr Lys Cys Gly Lys Ala Phe 610
615 620Val Gln Lys Ser Glu Leu Ile Thr
His Gln Arg Thr His Met Gly Glu625 630
635 640Lys Pro Tyr Glu Cys Leu Asp Cys Gly Lys Ser Phe
Ser Lys Lys Pro 645 650
655Gln Leu Lys Val His Gln Arg Ile His Thr Gly Glu Arg Pro Tyr Val
660 665 670Cys Ser Glu Cys Gly Lys
Ala Phe Asn Asn Arg Ser Asn Phe Asn Lys 675 680
685His Gln Thr Thr His Thr Arg Asp Lys Ser Tyr Lys Cys Ser
Tyr Ser 690 695 700Val Lys Gly Phe Thr
Lys Gln
705 710
11 2133 DNA Homo sapiens
11
atgcctgctg atgtgaattt atcccagaag cctcaggtcc tgggtccaga gaagcaggat 60
ggatcttgcg aggcatcagt gtcatttgag gacgtgaccg tggacttcag cagggaggag 120
tggcagcaac tggaccctgc ccagagatgc ctgtaccggg atgtgatgct ggagctctat 180
agccatctct tcgcagtggg gtatcacatt cccaacccag aggtcatctt cagaatgcta 240
aaagaaaagg agccgcgtgt ggaggaggct gaagtctcac atcagaggtg tcaagaaagg 300
gagtttgggc ttgaaatccc acaaaaggag atttctaaga aagcttcatt tcaaaaggat 360
atggtaggtg agttcacaag agatggttca tggtgttcca ttttagaaga actgaggctg 420
gatgctgacc gcacaaagaa agatgagcaa aatcaaattc aacccatgag tcacagtgct 480
ttcttcaaca agaaaacatt gaacacagaa agcaattgtg aatataagga ccctgggaaa 540
atgattcgca cgaggcccca ccttgcttct tcacagaaac aacctcagaa atgttgctta 600
tttacagaaa gtttgaagct gaacctagaa gtgaacggtc agaatgaaag caatgacaca 660
gaacagcttg atgacgttgt tgggtctggt cagctattca gccatagctc ttctgatgcc 720
tgcagcaaga atattcatac aggagagaca ttttgcaaag gtaaccagtg tagaaaagtc 780
tgtggccata aacagtcact caagcaacat caaattcata ctcagaagaa accagatgga 840
tgttctgaat gtggggggag cttcacccag aagtcacacc tctttgccca acagagaatt 900
catagtgtag gaaacctcca tgaatgtggc aaatgtggaa aagccttcat gccacaacta 960
aaactcagtg tatatctgac agatcataca ggtgatatac cctgtatatg caaggaatgt 1020
gggaaggtct ttattcagag atcagaattg cttacgcacc agaaaacaca cactagaaag 1080
aagccctata aatgccatga ctgtggaaaa gcctttttcc agatgttatc tctcttcaga 1140
catcagagaa ctcacagtag agaaaaactc tatgaatgca gtgaatgtgg caaaggcttc 1200
tcccaaaact caaccctcat tatacatcag aaaattcata ctggtgagag acagtatgca 1260
tgcagtgaat gtgggaaagc ctttacccag aagtcaacac tcagcttgca ccagagaatc 1320
cactcagggc agaagtccta tgtgtgtatc gaatgcgggc aggccttcat ccagaaggca 1380
cacctgattg tccatcaaag aagccacaca ggagaaaaac cttatcagtg ccacaactgt 1440
gggaaatcct tcatttccaa gtcacagctt gatatacatc atcgaattca tacaggggag 1500
aaaccttatg aatgcagtga ctgtggaaaa accttcaccc aaaagtcaca cctgaatata 1560
caccagaaaa ttcatactgg agaaagacac catgtatgca gtgaatgcgg gaaagccttc 1620
aaccagaagt caatactcag catgcatcag agaattcaca ccggagagaa gccttacaaa 1680
tgcagtgaat gtgggaaagc cttcacttct aagtctcaat tcaaagagca tcagcgaatt 1740
cacacgggtg agaaacccta tgtgtgcact gaatgtggga aggccttcaa cggcaggtca 1800
aatttccata aacatcaaat aactcacact agagagaggc cttttgtctg ttacaaatgt 1860
gggaaggctt ttgtccagaa atcagagttg attacccatc aaagaactca catgggagag 1920
aaaccctatg aatgccttga ctgtgggaaa tcgttcagta agaaaccaca actcaaggtg 1980
catcagcgaa ttcacacggg agaaagacct tatgtgtgtt ctgaatgtgg aaaggccttc 2040
aacaacaggt caaacttcaa taaacaccaa acaactcata ccagagacaa atcttacaaa 2100
tgcagttatt ctgtgaaagg ctttaccaag caa 2133
12 3754 DNA Homo sapiens
CDS (346)..(2478)
12
gctaagccta tgtcgcttac
tggacgctga agtgattggg aatattagca gtgggggttc 60tgtagggtca ggaaggggcg
gctggctttg ggggagtgat gaggggcttg ttgggggtgg 120gggtgcgtga taaagggatt
tctcggctga agacgaggct gtgaggcttc tgcagaaccc 180ccaggtcagg ccacatcatt
gaggctgcag gatctctctt catagcccag tacgactctc 240cgccgtgtcc ctggttggaa
aatccaaaca cctatccagc ttctggctcc tgggaaaagt 300ggagttgtca gcaagagaga
ccgagagtag aagcccagag tggag atg cct gct gat 357
Met Pro Ala Asp
1gtg aat tta tcc cag aag cct cag gtc ctg ggt cca gag
aag cag gat 405Val Asn Leu Ser Gln Lys Pro Gln Val Leu Gly Pro Glu
Lys Gln Asp 5 10 15
20gga tct tgc gag gca tca gtg tca ttt gag gac gtg acc gtg gac ttc
453Gly Ser Cys Glu Ala Ser Val Ser Phe Glu Asp Val Thr Val Asp Phe
25 30 35agc agg gag gag tgg
cag caa ctg gac cct gcc cag aga tgc ctg tac 501Ser Arg Glu Glu Trp
Gln Gln Leu Asp Pro Ala Gln Arg Cys Leu Tyr 40
45 50cgg gat gtg atg ctg gag ctc tat agc cat ctc ttc
gca gtg ggg tat 549Arg Asp Val Met Leu Glu Leu Tyr Ser His Leu Phe
Ala Val Gly Tyr 55 60 65cac att
ccc aac cca gag gtc atc ttc aga atg cta aaa gaa aag gag 597His Ile
Pro Asn Pro Glu Val Ile Phe Arg Met Leu Lys Glu Lys Glu 70
75 80ccg cgt gtg gag gag gct gaa gtc tca cat cag
agg tgt caa gaa agg 645Pro Arg Val Glu Glu Ala Glu Val Ser His Gln
Arg Cys Gln Glu Arg 85 90 95
100gag ttt ggg ctt gaa atc cca caa aag gag att tct aag aaa gct tca
693Glu Phe Gly Leu Glu Ile Pro Gln Lys Glu Ile Ser Lys Lys Ala Ser
105 110 115ttt caa aag gat atg
gta ggt gag ttc aca aga gat ggt tca tgg tgt 741Phe Gln Lys Asp Met
Val Gly Glu Phe Thr Arg Asp Gly Ser Trp Cys 120
125 130tcc att tta gaa gaa ctg agg ctg gat gct gac cgc
aca aag aaa gat 789Ser Ile Leu Glu Glu Leu Arg Leu Asp Ala Asp Arg
Thr Lys Lys Asp 135 140 145gag caa
aat caa att caa ccc atg agt cac agt gct ttc ttc aac aag 837Glu Gln
Asn Gln Ile Gln Pro Met Ser His Ser Ala Phe Phe Asn Lys 150
155 160aaa aca ttg aac aca gaa agc aat tgt gaa tat
aag gac cct ggg aaa 885Lys Thr Leu Asn Thr Glu Ser Asn Cys Glu Tyr
Lys Asp Pro Gly Lys165 170 175
180atg att cgc acg agg ccc cac ctt gct tct tca cag aaa caa cct cag
933Met Ile Arg Thr Arg Pro His Leu Ala Ser Ser Gln Lys Gln Pro Gln
185 190 195aaa tgt tgc tta ttt
aca gaa agt ttg aag ctg aac cta gaa gtg aac 981Lys Cys Cys Leu Phe
Thr Glu Ser Leu Lys Leu Asn Leu Glu Val Asn 200
205 210ggt cag aat gaa agc aat gac aca gaa cag ctt gat
gac gtt gtt ggg 1029Gly Gln Asn Glu Ser Asn Asp Thr Glu Gln Leu Asp
Asp Val Val Gly 215 220 225tct ggt
cag cta ttc agc cat agc tct tct gat gcc tgc agc aag aat 1077Ser Gly
Gln Leu Phe Ser His Ser Ser Ser Asp Ala Cys Ser Lys Asn 230
235 240att cat aca gga gag aca ttt tgc aaa ggt aac
cag tgt aga aaa gtc 1125Ile His Thr Gly Glu Thr Phe Cys Lys Gly Asn
Gln Cys Arg Lys Val245 250 255
260tgt ggc cat aaa cag tca ctc aag caa cat caa att cat act cag aag
1173Cys Gly His Lys Gln Ser Leu Lys Gln His Gln Ile His Thr Gln Lys
265 270 275aaa cca gat gga tgt
tct gaa tgt ggg ggg agc ttc acc cag aag tca 1221Lys Pro Asp Gly Cys
Ser Glu Cys Gly Gly Ser Phe Thr Gln Lys Ser 280
285 290cac ctc ttt gcc caa cag aga att cat agt gta gga
aac ctc cat gaa 1269His Leu Phe Ala Gln Gln Arg Ile His Ser Val Gly
Asn Leu His Glu 295 300 305tgt ggc
aaa tgt gga aaa gcc ttc atg cca caa cta aaa ctc agt gta 1317Cys Gly
Lys Cys Gly Lys Ala Phe Met Pro Gln Leu Lys Leu Ser Val 310
315 320tat ctg aca gat cat aca ggt gat ata ccc tgt
ata tgc aag gaa tgt 1365Tyr Leu Thr Asp His Thr Gly Asp Ile Pro Cys
Ile Cys Lys Glu Cys325 330 335
340ggg aag gtc ttt att cag aga tca gaa ttg ctt acg cac cag aaa aca
1413Gly Lys Val Phe Ile Gln Arg Ser Glu Leu Leu Thr His Gln Lys Thr
345 350 355cac act aga aag aag
ccc tat aaa tgc cat gac tgt gga aaa gcc ttt 1461His Thr Arg Lys Lys
Pro Tyr Lys Cys His Asp Cys Gly Lys Ala Phe 360
365 370ttc cag atg tta tct ctc ttc aga cat cag aga act
cac agt aga gaa 1509Phe Gln Met Leu Ser Leu Phe Arg His Gln Arg Thr
His Ser Arg Glu 375 380 385aaa ctc
tat gaa tgc agt gaa tgt ggc aaa ggc ttc tcc caa aac tca 1557Lys Leu
Tyr Glu Cys Ser Glu Cys Gly Lys Gly Phe Ser Gln Asn Ser 390
395 400acc ctc att ata cat cag aaa att cat act ggt
gag aga cag tat gca 1605Thr Leu Ile Ile His Gln Lys Ile His Thr Gly
Glu Arg Gln Tyr Ala405 410 415
420tgc agt gaa tgt ggg aaa gcc ttt acc cag aag tca aca ctc agc ttg
1653Cys Ser Glu Cys Gly Lys Ala Phe Thr Gln Lys Ser Thr Leu Ser Leu
425 430 435cac cag aga atc cac
tca ggg cag aag tcc tat gtg tgt atc gaa tgc 1701His Gln Arg Ile His
Ser Gly Gln Lys Ser Tyr Val Cys Ile Glu Cys 440
445 450ggg cag gcc ttc atc cag aag gca cac ctg att gtc
cat caa aga agc 1749Gly Gln Ala Phe Ile Gln Lys Ala His Leu Ile Val
His Gln Arg Ser 455 460 465cac aca
gga gaa aaa cct tat cag tgc cac aac tgt ggg aaa tcc ttc 1797His Thr
Gly Glu Lys Pro Tyr Gln Cys His Asn Cys Gly Lys Ser Phe 470
475 480att tcc aag tca cag ctt gat ata cat cat cga
att cat aca ggg gag 1845Ile Ser Lys Ser Gln Leu Asp Ile His His Arg
Ile His Thr Gly Glu485 490 495
500aaa cct tat gaa tgc agt gac tgt gga aaa acc ttc acc caa aag tca
1893Lys Pro Tyr Glu Cys Ser Asp Cys Gly Lys Thr Phe Thr Gln Lys Ser
505 510 515cac ctg aat ata cac
cag aaa att cat act gga gaa aga cac cat gta 1941His Leu Asn Ile His
Gln Lys Ile His Thr Gly Glu Arg His His Val 520
525 530tgc agt gaa tgc ggg aaa gcc ttc aac cag aag tca
ata ctc agc atg 1989Cys Ser Glu Cys Gly Lys Ala Phe Asn Gln Lys Ser
Ile Leu Ser Met 535 540 545cat cag
aga att cac acc gga gag aag cct tac aaa tgc agt gaa tgt 2037His Gln
Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Ser Glu Cys 550
555 560ggg aaa gcc ttc act tct aag tct caa ttc aaa
gag cat cag cga att 2085Gly Lys Ala Phe Thr Ser Lys Ser Gln Phe Lys
Glu His Gln Arg Ile565 570 575
580cac acg ggt gag aaa ccc tat gtg tgc act gaa tgt ggg aag gcc ttc
2133His Thr Gly Glu Lys Pro Tyr Val Cys Thr Glu Cys Gly Lys Ala Phe
585 590 595aac ggc agg tca aat
ttc cat aaa cat caa ata act cac act aga gag 2181Asn Gly Arg Ser Asn
Phe His Lys His Gln Ile Thr His Thr Arg Glu 600
605 610agg cct ttt gtc tgt tac aaa tgt ggg aag gct ttt
gtc cag aaa tca 2229Arg Pro Phe Val Cys Tyr Lys Cys Gly Lys Ala Phe
Val Gln Lys Ser 615 620 625gag ttg
att acc cat caa aga act cac atg gga gag aaa ccc tat gaa 2277Glu Leu
Ile Thr His Gln Arg Thr His Met Gly Glu Lys Pro Tyr Glu 630
635 640tgc ctt gac tgt ggg aaa tcg ttc agt aag aaa
cca caa ctc aag gtg 2325Cys Leu Asp Cys Gly Lys Ser Phe Ser Lys Lys
Pro Gln Leu Lys Val645 650 655
660cat cag cga att cac acg gga gaa aga cct tat gtg tgt tct gaa tgt
2373His Gln Arg Ile His Thr Gly Glu Arg Pro Tyr Val Cys Ser Glu Cys
665 670 675gga aag gcc ttc aac
aac agg tca aac ttc aat aaa cac caa aca act 2421Gly Lys Ala Phe Asn
Asn Arg Ser Asn Phe Asn Lys His Gln Thr Thr 680
685 690cat acc aga gac aaa tct tac aaa tgc agt tat tct
gtg aaa ggc ttt 2469His Thr Arg Asp Lys Ser Tyr Lys Cys Ser Tyr Ser
Val Lys Gly Phe 695 700 705acc aag
caa tgaattccta gtgcatcagc atattcataa atgaaatata 2518Thr Lys
Gln 710ctccgagttt cttgaagaag agaacatctt ctcagaatca ggtctaatta
tatgttattg 2578aattcatgct tcagaaaaac tctagggatg cactgcatgt gtgaacacat
gataaaaaag 2638tcatgcttta ttttagtgag ggcaattaca gagaaaagag taagcagaaa
tgtccttctg 2698agtactggcc tcattaagga ttataaattt tctccccggg aagaaaccct
gactaacgca 2758ttgagaaaag cctttctgta aagaatggta caagacaggt tgttactcga
ttatttatag 2818taaaatatgt gggaaattat atcaatgata accctgttta ttgtgggata
tcaatatttt 2878taaagtgcca acacagtcat gataggacaa tattttatgt gtgtgtgtgc
gccttatgta 2938tataagcata tatataatat ataagcatat tattatatac aggttgagta
tcccttctcc 2998aaaatgcctg ggatcagaag cattttggat ttcagatact tacagatttt
ggaatatttg 3058cattatattt attggttgag catccctaat ctgaaaatcc aagattaaat
gctccaatta 3118gcatttcctt tgagcgtcat gttagagttc aaaaagtttc agattttggg
ttttcagatt 3178aggaataccc aacctgtatg tacgtatatt tctgtatcta tgtatgtata
tatatgcata 3238tgcagacata tgtatatggt ctggtcagca tatgtgtatg tatgcgtatg
tatgtatgta 3298tgtatgccct cagtgcagtg gggtttgctg cagaattcac tgcatagcag
gagatgtaag 3358cagatgagtt attttttaag agaatctaat ctaattgttt ttataaaaat
tattccctat 3418tgaatattta tataatgagg ttgtatcaac aatgattaac tcctttatta
tacatacaca 3478tgaatgtgca tttttggtaa atgcataaat gagattctat aatgtttact
gatctttata 3538ttacagattt tctcttcttt taggattagc tcagcttgcc ccccctttcc
atctccacca 3598tctatagtga gcctctccat aattagtgcc aaccattagt ctcgttcata
tttttacacc 3658aggagtcaac aaactgtgcc attggccaaa tatggcctcc caactgtttt
tttaaaataa 3718agttttattg gaacacaaaa aaaaaaaaaa aaaaaa
375413389PRTHomo sapiens 13Met Ala Asp Pro Arg Asp Lys Ala Leu
Gln Asp Tyr Arg Lys Lys Leu 1 5 10
15Leu Glu His Lys Glu Ile Asp Gly Arg Leu Lys Glu Leu Arg Glu
Gln 20 25 30Leu Lys Glu Leu
Thr Lys Gln Tyr Glu Lys Ser Glu Asn Asp Leu Lys 35
40 45Ala Leu Gln Ser Val Gly Gln Ile Val Gly Glu Val
Leu Lys Gln Leu 50 55 60Thr Glu Glu
Lys Phe Ile Val Lys Ala Thr Asn Gly Pro Arg Tyr Val 65
70 75 80Val Gly Cys Arg Arg Gln Leu Asp
Lys Ser Lys Leu Lys Pro Gly Thr 85 90
95Arg Val Ala Leu Asp Met Thr Thr Leu Thr Ile Met Arg Tyr
Leu Pro 100 105 110Arg Glu Val
Asp Pro Leu Val Tyr Asn Met Ser His Glu Asp Pro Gly 115
120 125Asn Val Ser Tyr Ser Glu Ile Gly Gly Leu Ser
Glu Gln Ile Arg Glu 130 135 140Leu Arg
Glu Val Ile Glu Leu Pro Leu Thr Asn Pro Glu Leu Phe Gln145
150 155 160Arg Val Gly Ile Ile Pro Pro
Lys Gly Cys Leu Leu Tyr Gly Pro Pro 165
170 175Gly Thr Gly Lys Thr Leu Leu Ala Arg Ala Val Ala
Ser Gln Leu Asp 180 185 190Cys
Asn Phe Leu Lys Val Val Ser Ser Ser Ile Val Asp Lys Tyr Ile 195
200 205Gly Glu Ser Ala Arg Leu Ile Arg Glu
Met Phe Asn Tyr Ala Arg Asp 210 215
220His Gln Pro Cys Ile Ile Phe Met Asp Glu Ile Asp Ala Ile Gly Gly225
230 235 240Arg Arg Phe Ser
Glu Gly Thr Ser Ala Asp Arg Glu Ile Gln Arg Thr 245
250 255Leu Met Glu Leu Leu Asn Gln Met Asp Gly
Phe Asp Thr Leu His Arg 260 265
270Val Lys Met Thr Met Ala Thr Asn Arg Pro Asp Thr Leu Asp Pro Ala
275 280 285Leu Leu Arg Pro Gly Arg Leu
Asp Arg Lys Ile His Ile Asp Leu Pro 290 295
300Asn Glu Gln Ala Arg Leu Asp Ile Leu Lys Ile His Ala Gly Pro
Ile305 310 315 320Thr Lys
His Gly Glu Ile Asp Tyr Glu Ala Ile Val Lys Leu Ser Asp
325 330 335Gly Phe Asn Gly Ala Asp Leu
Arg Asn Val Cys Thr Glu Ala Gly Met 340 345
350Phe Ala Ile Arg Ala Asp His Asp Phe Val Val Gln Glu Asp
Phe Met 355 360 365Lys Ala Val Arg
Lys Val Ala Asp Ser Lys Lys Leu Glu Ser Lys Leu 370
375 380
Asp Tyr Lys Pro Val
385
14 1167 DNA Homo sapiens
14
atggcggacc ctagagataa ggcgcttcag gactaccgca agaagttgct tgaacacaag 60
gagatcgacg gccgtcttaa ggagttaagg gaacaattaa aagaacttac caagcagtat 120
gaaaagtctg aaaatgatct gaaggcccta cagagtgttg ggcagatcgt gggtgaagtg 180
cttaaacagt taactgaaga aaaattcatt gttaaagcta ccaatggacc aagatatgtt 240
gtgggttgtc gtcgacagct tgacaaaagt aagctgaagc caggaacaag agttgctttg 300
gatatgacta cactaactat catgagatat ttgccgagag aggtggatcc actggtttat 360
aacatgtctc atgaggaccc tgggaatgtt tcttattctg agattggagg gctatcagaa 420
cagatccggg aattaagaga ggtgatagaa ttacctctta caaacccaga gttatttcag 480
cgtgtaggaa taatacctcc aaaaggctgt ttgttatatg gaccaccagg tacgggaaaa 540
acactcttgg cacgagccgt tgctagccag ctggactgca atttcttaaa ggttgtatct 600
agttctattg tagacaagta cattggtgaa agtgctcgtt tgatcagaga aatgtttaat 660
tatgctagag atcatcaacc atgcatcatt tttatggatg aaatagatgc tattggtggt 720
cgtcggtttt ctgagggtac ttcagctgac agagagattc agagaacgtt aatggagtta 780
ctgaatcaaa tggatggatt tgatactctg catagagtta aaatgaccat ggctacaaac 840
agaccagata cactggatcc tgctttgctg cgtccaggaa gattagatag aaaaatacat 900
attgatttgc caaatgaaca agcaagatta gacatactga aaatccatgc aggtcccatt 960
acaaagcatg gtgaaataga ttatgaagca attgtgaagc tttcggatgg ctttaatgga 1020
gcagatctga gaaatgtttg tactgaagca ggtatgttcg caattcgtgc tgatcatgat 1080
tttgtagtac aggaagactt catgaaagca gtcagaaaag tggctgattc taagaagctg 1140
gagtctaaat tggactacaa acctgtg 1167
15 1566 DNA Homo sapiens
CDS (17)..(1183)
15
gagacggctt ctcatc atg gcg gac
cct aga gat aag gcg ctt cag gac tac 52 Met Ala Asp
Pro Arg Asp Lys Ala Leu Gln Asp Tyr 1 5
10cgc aag aag ttg ctt gaa cac aag gag atc gac ggc cgt ctt
aag gag 100Arg Lys Lys Leu Leu Glu His Lys Glu Ile Asp Gly Arg Leu
Lys Glu 15 20 25tta agg gaa caa
tta aaa gaa ctt acc aag cag tat gaa aag tct gaa 148Leu Arg Glu Gln
Leu Lys Glu Leu Thr Lys Gln Tyr Glu Lys Ser Glu 30
35 40aat gat ctg aag gcc cta cag agt gtt ggg cag atc gtg
ggt gaa gtg 196Asn Asp Leu Lys Ala Leu Gln Ser Val Gly Gln Ile Val
Gly Glu Val 45 50 55
60ctt aaa cag tta act gaa gaa aaa ttc att gtt aaa gct acc aat gga
244Leu Lys Gln Leu Thr Glu Glu Lys Phe Ile Val Lys Ala Thr Asn Gly
65 70 75cca aga tat gtt gtg
ggt tgt cgt cga cag ctt gac aaa agt aag ctg 292Pro Arg Tyr Val Val
Gly Cys Arg Arg Gln Leu Asp Lys Ser Lys Leu 80
85 90aag cca gga aca aga gtt gct ttg gat atg act aca
cta act atc atg 340Lys Pro Gly Thr Arg Val Ala Leu Asp Met Thr Thr
Leu Thr Ile Met 95 100 105aga tat
ttg ccg aga gag gtg gat cca ctg gtt tat aac atg tct cat 388Arg Tyr
Leu Pro Arg Glu Val Asp Pro Leu Val Tyr Asn Met Ser His 110
115 120gag gac cct ggg aat gtt tct tat tct gag att
gga ggg cta tca gaa 436Glu Asp Pro Gly Asn Val Ser Tyr Ser Glu Ile
Gly Gly Leu Ser Glu125 130 135
140cag atc cgg gaa tta aga gag gtg ata gaa tta cct ctt aca aac cca
484Gln Ile Arg Glu Leu Arg Glu Val Ile Glu Leu Pro Leu Thr Asn Pro
145 150 155gag tta ttt cag cgt
gta gga ata ata cct cca aaa ggc tgt ttg tta 532Glu Leu Phe Gln Arg
Val Gly Ile Ile Pro Pro Lys Gly Cys Leu Leu 160
165 170tat gga cca cca ggt acg gga aaa aca ctc ttg gca
cga gcc gtt gct 580Tyr Gly Pro Pro Gly Thr Gly Lys Thr Leu Leu Ala
Arg Ala Val Ala 175 180 185agc cag
ctg gac tgc aat ttc tta aag gtt gta tct agt tct att gta 628Ser Gln
Leu Asp Cys Asn Phe Leu Lys Val Val Ser Ser Ser Ile Val 190
195 200gac aag tac att ggt gaa agt gct cgt ttg atc
aga gaa atg ttt aat 676Asp Lys Tyr Ile Gly Glu Ser Ala Arg Leu Ile
Arg Glu Met Phe Asn205 210 215
220tat gct aga gat cat caa cca tgc atc att ttt atg gat gaa ata gat
724Tyr Ala Arg Asp His Gln Pro Cys Ile Ile Phe Met Asp Glu Ile Asp
225 230 235gct att ggt ggt cgt
cgg ttt tct gag ggt act tca gct gac aga gag 772Ala Ile Gly Gly Arg
Arg Phe Ser Glu Gly Thr Ser Ala Asp Arg Glu 240
245 250att cag aga acg tta atg gag tta ctg aat caa atg
gat gga ttt gat 820Ile Gln Arg Thr Leu Met Glu Leu Leu Asn Gln Met
Asp Gly Phe Asp 255 260 265act ctg
cat aga gtt aaa atg acc atg gct aca aac aga cca gat aca 868Thr Leu
His Arg Val Lys Met Thr Met Ala Thr Asn Arg Pro Asp Thr 270
275 280ctg gat cct gct ttg ctg cgt cca gga aga tta
gat aga aaa ata cat 916Leu Asp Pro Ala Leu Leu Arg Pro Gly Arg Leu
Asp Arg Lys Ile His285 290 295
300att gat ttg cca aat gaa caa gca aga tta gac ata ctg aaa atc cat
964Ile Asp Leu Pro Asn Glu Gln Ala Arg Leu Asp Ile Leu Lys Ile His
305 310 315gca ggt ccc att aca
aag cat ggt gaa ata gat tat gaa gca att gtg 1012Ala Gly Pro Ile Thr
Lys His Gly Glu Ile Asp Tyr Glu Ala Ile Val 320
325 330aag ctt tcg gat ggc ttt aat gga gca gat ctg aga
aat gtt tgt act 1060Lys Leu Ser Asp Gly Phe Asn Gly Ala Asp Leu Arg
Asn Val Cys Thr 335 340 345gaa gca
ggt atg ttc gca att cgt gct gat cat gat ttt gta gta cag 1108Glu Ala
Gly Met Phe Ala Ile Arg Ala Asp His Asp Phe Val Val Gln 350
355 360gaa gac ttc atg aaa gca gtc aga aaa gtg gct
gat tct aag aag ctg 1156Glu Asp Phe Met Lys Ala Val Arg Lys Val Ala
Asp Ser Lys Lys Leu365 370 375
380gag tct aaa ttg gac tac aaa cct gtg taatttactg taagattttt
1203Glu Ser Lys Leu Asp Tyr Lys Pro Val 385gatggctgca
tgacagatgt tggcttattg taaaaataaa gttaaagaaa ataatgtatg 1263tattggcaat
gatgtcatta aaagtatatg aataaaaata tgagtaacat cataaaaatt 1323agtaattcaa
cttttaagat acagaagaaa tttgtatgtt tgttaaagtt gcatttattg 1383cagcaagtta
caaagggaaa gtgttgaagc ttttcatatt tgctgcgtga gcattttgta 1443aaatattgaa
agtggtttga gatagtggta taagaaagca tttcttatga cttattttgt 1503atcatttgtt
ttcctcatct aaaaagttga ataaaatctg tttgattcag ttctcctaaa 1563aaa
156616223PRTHomo
sapiens 16Met Ser Asp Glu Glu Ala Arg Gln Ser Gly Gly Ser Ser Gln Ala Gly
1 5 10 15Val Val Thr Val
Ser Asp Val Gln Glu Leu Met Arg Arg Lys Glu Glu 20
25 30Ile Glu Ala Gln Ile Lys Ala Asn Tyr Asp Val
Leu Glu Ser Gln Lys 35 40 45Gly
Ile Gly Met Asn Glu Pro Leu Val Asp Cys Glu Gly Tyr Pro Arg 50
55 60Ser Asp Val Asp Leu Tyr Gln Val Arg Thr
Ala Arg His Asn Ile Ile 65 70 75
80Cys Leu Gln Asn Asp His Lys Ala Val Met Lys Gln Val Glu Glu
Ala 85 90 95Leu His Gln
Leu His Ala Arg Asp Lys Glu Lys Gln Ala Arg Asp Met 100
105 110Ala Glu Ala His Lys Glu Ala Met Ser Arg
Lys Leu Gly Gln Ser Glu 115 120
125Ser Gln Gly Pro Pro Arg Ala Phe Ala Lys Val Asn Ser Ile Ser Pro 130
135 140Gly Ser Pro Ala Ser Ile Ala Gly
Leu Gln Val Asp Asp Glu Ile Val145 150
155 160Glu Phe Gly Ser Val Asn Thr Gln Asn Phe Gln Ser
Leu His Asn Ile 165 170
175Gly Ser Val Val Gln His Ser Glu Gly Lys Pro Leu Asn Val Thr Val
180 185 190Ile Arg Arg Gly Glu Lys
His Gln Leu Arg Leu Val Pro Thr Arg Trp 195 200
205Ala Gly Lys Gly Leu Leu Gly Cys Asn Ile Ile Pro Leu Gln
Arg 210 215 220
17 669 DNA Homo sapiens
17
atgtccgacg aggaagcgag gcagagcgga ggctcctcgc aggccggcgt cgtgactgtc 60
agcgacgtcc aggagctgat gcggcgcaag gaggagatag aagcgcagat caaggccaac 120
tatgacgtgc tggaaagcca aaaaggcatt gggatgaacg agccgctggt ggactgtgag 180
ggctaccccc ggtcagacgt ggacctgtac caagtccgca ccgccaggca caacatcata 240
tgcctgcaga atgatcacaa ggcagtgatg aagcaggtgg aggaggccct gcaccagctg 300
cacgctcgcg acaaggagaa gcaggcccgg gacatggctg aggcccacaa agaggccatg 360
agccgcaaac tgggtcagag tgagagccag ggccctccac gggccttcgc caaagtgaac 420
agcatcagcc ccggctcccc agccagcatc gcgggtctgc aagtggatga tgagattgtg 480
gagttcggct ctgtgaacac ccagaacttc cagtcactgc ataacattgg cagtgtggtg 540
cagcacagtg aggggaagcc cctgaatgtg acagtgatcc gcagggggga aaaacaccag 600
cttagacttg ttccaacacg ctgggcagga aaaggactgc tgggctgcaa cattattcct 660
ctgcaaaga 669
18 1128 DNA Homo sapiens
CDS (125)..(793)
18
actgttctcg cgttcgcgga
cggctgtggt gttttggcgc atgggcggag cgtagttacg 60gtcgactggg gcgtcgtccc
tagcccggga gccgggtctc tggagtcgcg gcccggggtt 120cacg atg tcc gac gag
gaa gcg agg cag agc gga ggc tcc tcg cag gcc 169 Met Ser Asp Glu
Glu Ala Arg Gln Ser Gly Gly Ser Ser Gln Ala 1 5
10 15ggc gtc gtg act gtc agc gac gtc cag gag
ctg atg cgg cgc aag gag 217Gly Val Val Thr Val Ser Asp Val Gln Glu
Leu Met Arg Arg Lys Glu 20 25
30gag ata gaa gcg cag atc aag gcc aac tat gac gtg ctg gaa agc caa
265Glu Ile Glu Ala Gln Ile Lys Ala Asn Tyr Asp Val Leu Glu Ser Gln
35 40 45aaa ggc att ggg atg aac
gag ccg ctg gtg gac tgt gag ggc tac ccc 313Lys Gly Ile Gly Met Asn
Glu Pro Leu Val Asp Cys Glu Gly Tyr Pro 50 55
60cgg tca gac gtg gac ctg tac caa gtc cgc acc gcc agg cac
aac atc 361Arg Ser Asp Val Asp Leu Tyr Gln Val Arg Thr Ala Arg His
Asn Ile 65 70 75ata tgc ctg cag aat
gat cac aag gca gtg atg aag cag gtg gag gag 409Ile Cys Leu Gln Asn
Asp His Lys Ala Val Met Lys Gln Val Glu Glu 80 85
90 95gcc ctg cac cag ctg cac gct cgc gac aag
gag aag cag gcc cgg gac 457Ala Leu His Gln Leu His Ala Arg Asp Lys
Glu Lys Gln Ala Arg Asp 100 105
110atg gct gag gcc cac aaa gag gcc atg agc cgc aaa ctg ggt cag agt
505Met Ala Glu Ala His Lys Glu Ala Met Ser Arg Lys Leu Gly Gln Ser
115 120 125gag agc cag ggc cct cca
cgg gcc ttc gcc aaa gtg aac agc atc agc 553Glu Ser Gln Gly Pro Pro
Arg Ala Phe Ala Lys Val Asn Ser Ile Ser 130 135
140ccc ggc tcc cca gcc agc atc gcg ggt ctg caa gtg gat gat
gag att 601Pro Gly Ser Pro Ala Ser Ile Ala Gly Leu Gln Val Asp Asp
Glu Ile 145 150 155gtg gag ttc ggc tct
gtg aac acc cag aac ttc cag tca ctg cat aac 649Val Glu Phe Gly Ser
Val Asn Thr Gln Asn Phe Gln Ser Leu His Asn160 165
170 175att ggc agt gtg gtg cag cac agt gag ggg
aag ccc ctg aat gtg aca 697Ile Gly Ser Val Val Gln His Ser Glu Gly
Lys Pro Leu Asn Val Thr 180 185
190gtg atc cgc agg ggg gaa aaa cac cag ctt aga ctt gtt cca aca cgc
745Val Ile Arg Arg Gly Glu Lys His Gln Leu Arg Leu Val Pro Thr Arg
195 200 205tgg gca gga aaa gga ctg
ctg ggc tgc aac att att cct ctg caa aga 793Trp Ala Gly Lys Gly Leu
Leu Gly Cys Asn Ile Ile Pro Leu Gln Arg 210 215
220tgattgtccc tggggaacag taacaggaaa gcatcttccc ttgccctgga
cttgggtcta 853gggatttcca acttgtcttc tctccctgaa gcataaggat ctggaagagg
cttgtaacct 913gaacttctgt gtggtggcag tactgtggcc caccagtgta atctccctgg
attaaggcat 973tcttaaaaac ttaggcttgg cctctttcac aaattaggcc acggccctaa
ataggaattc 1033cctggattgt gggcaagtgg gcggaagtta ttctggcagg tactggtgtg
attattatta 1093ttatttttaa taaagagttt tacagtgctg atatg
112819506PRTHomo sapiens 19Met Ala Glu Ala Asp Phe Lys Met Val
Ser Glu Pro Val Ala His Gly 1 5 10
15Val Ala Glu Glu Glu Met Ala Ser Ser Thr Ser Asp Ser Gly Glu
Glu 20 25 30Ser Asp Ser Ser
Ser Ser Ser Ser Ser Thr Ser Asp Ser Ser Ser Ser 35
40 45Ser Ser Thr Ser Gly Ser Ser Ser Gly Ser Gly Ser
Ser Ser Ser Ser 50 55 60Ser Gly Ser
Thr Ser Ser Arg Ser Arg Leu Tyr Arg Lys Lys Arg Val 65
70 75 80Pro Glu Pro Ser Arg Arg Ala Arg
Arg Ala Pro Leu Gly Thr Asn Phe 85 90
95Val Asp Arg Leu Pro Gln Ala Val Arg Asn Arg Val Gln Ala
Leu Arg 100 105 110Asn Ile Gln
Asp Glu Cys Asp Lys Val Asp Thr Leu Phe Leu Lys Ala 115
120 125Ile His Asp Leu Glu Arg Lys Tyr Ala Glu Leu
Asn Lys Pro Leu Tyr 130 135 140Asp Arg
Arg Phe Gln Ile Ile Asn Ala Glu Tyr Glu Pro Thr Glu Glu145
150 155 160Glu Cys Glu Trp Asn Ser Glu
Asp Glu Glu Phe Ser Ser Asp Glu Glu 165
170 175Val Gln Asp Asn Thr Pro Ser Glu Met Pro Pro Leu
Glu Gly Glu Glu 180 185 190Glu
Glu Asn Pro Lys Glu Asn Pro Glu Val Lys Ala Glu Glu Lys Glu 195
200 205Val Pro Lys Glu Ile Pro Glu Val Lys
Asp Glu Glu Lys Glu Val Ala 210 215
220Lys Glu Ile Pro Glu Val Lys Ala Glu Glu Lys Ala Asp Ser Lys Asp225
230 235 240Cys Met Glu Ala
Thr Pro Glu Val Lys Glu Asp Pro Lys Glu Val Pro 245
250 255Gln Val Lys Ala Asp Asp Lys Glu Gln Pro
Lys Ala Thr Glu Ala Lys 260 265
270Ala Arg Ala Ala Val Arg Glu Thr His Lys Arg Val Pro Glu Glu Arg
275 280 285Leu Arg Asp Ser Val Asp Leu
Lys Arg Ala Arg Lys Gly Lys Pro Lys 290 295
300Arg Glu Asp Pro Lys Gly Ile Pro Asp Tyr Trp Leu Ile Val Leu
Lys305 310 315 320Asn Val
Asp Lys Leu Gly Pro Met Ile Gln Lys Tyr Asp Glu Pro Ile
325 330 335Leu Lys Phe Leu Ser Asp Val
Ser Leu Lys Phe Ser Lys Pro Gly Gln 340 345
350Pro Val Ser Tyr Thr Phe Glu Phe His Phe Leu Pro Asn Pro
Tyr Phe 355 360 365Arg Asn Glu Val
Leu Val Lys Thr Tyr Ile Ile Lys Ala Lys Pro Asp 370
375 380His Asn Asp Pro Phe Phe Ser Trp Gly Trp Glu Ile
Glu Asp Cys Lys385 390 395
400Gly Cys Lys Ile Asp Arg Arg Arg Gly Lys Asp Val Thr Val Thr Thr
405 410 415Thr Gln Ser Arg Thr
Thr Ala Thr Gly Glu Ile Glu Ile Gln Pro Arg 420
425 430Val Val Pro Asn Ala Ser Phe Phe Asn Phe Phe Ser
Pro Pro Glu Ile 435 440 445Pro Met
Ile Gly Lys Leu Glu Pro Arg Glu Asp Ala Ile Leu Asp Glu 450
455 460Asp Phe Glu Ile Gly Gln Ile Leu His Asp Asn
Val Ile Leu Lys Ser465 470 475
480Ile Tyr Tyr Tyr Thr Gly Glu Val Asn Gly Thr Tyr Tyr Gln Phe Gly
485 490 495Lys His Tyr Gly
Asn Lys Lys Tyr Arg Lys 500 505201518DNAHomo
sapiens 20atggcagaag cagattttaa aatggtctcg gaacctgtcg cccatggggt
tgccgaagag 60gagatggcta gctcgactag tgattctggg gaagaatctg acagcagtag
ctctagcagc 120agcactagtg acagcagcag cagcagcagc actagtggca gcagcagcgg
cagcggcagc 180agcagcagca gcagcggcag cactagcagc cgcagccgct tgtatagaaa
gaagagggta 240cctgagcctt ccagaagggc gcggcgggcc ccgttgggaa caaatttcgt
ggataggctg 300cctcaggcag ttagaaatcg tgtgcaagcg cttagaaaca ttcaagatga
atgtgacaag 360gtagataccc tgttcttaaa agcaattcat gatcttgaaa gaaaatatgc
tgaactcaac 420aagcctctgt atgataggcg gtttcaaatc atcaatgcag aatacgagcc
tacagaagaa 480gaatgtgaat ggaattcaga ggatgaggag ttcagcagtg atgaggaggt
gcaggataac 540acccctagtg aaatgcctcc cttagagggt gaggaagaag aaaaccctaa
agaaaaccca 600gaggtgaaag ctgaagagaa ggaagttcct aaagaaattc ctgaggtgaa
ggatgaagaa 660aaggaagttg ctaaagaaat tcctgaggta aaggctgaag aaaaagcaga
ttctaaagac 720tgtatggagg caacccctga agtaaaagaa gatcctaaag aagtccccca
ggtaaaggca 780gatgataaag aacagcctaa agcaacagag gctaaggcaa gggctgcagt
aagagagact 840cataaaagag ttcctgagga aaggcttcgg gacagtgtag atcttaaaag
agctaggaag 900ggaaagccta aaagagaaga ccctaaaggc attcctgact attggctgat
tgttttaaag 960aatgttgaca agctcgggcc tatgattcag aagtatgatg agcccattct
gaagttcttg 1020tcggatgtta gcctgaagtt ctcaaaacct ggccagcctg taagttacac
ctttgaattt 1080cattttctac ccaacccata cttcagaaat gaggtgctgg tgaagacata
tataataaag 1140gcaaaaccag atcacaatga tcccttcttt tcttggggat gggaaattga
agattgcaaa 1200ggctgcaaga tagaccggag aagaggaaaa gatgttactg tgacaactac
ccagagtcgc 1260acaactgcta ctggagaaat tgaaatccag ccaagagtgg ttcctaatgc
atcattcttc 1320aacttcttta gtcctcctga gattcctatg attgggaagc tggaaccacg
agaagatgct 1380atcctggatg aggactttga aattgggcag attttacatg ataatgtcat
cctgaaatca 1440atctattact atactggaga agtcaatggt acctactatc aatttggcaa
acattatgga 1500aacaagaaat acagaaaa
1518
<210> 21 <211> 2636 <212> DNA <213> Homo sapiens
<220> <221> CDS <222> (266)..(1783)
<400> 21
gattcggctg
cggtacatct cggcactcta gctgcagccg ggagaggcct tgccgccacc 60gctgtcgccc
aagcctccac tgccgctgcc acctcagcgc cggcctctgc atccccagct 120ccagctccgc
tctgcgccgc tgctgccatc gccgctgcca cctccgcagc ccgggcctcc 180gccgccgcca
cccaagcatc cgtgagtcat tttctgccca tctctggtcg cgcggtctcc 240ctggtagagt
ttgtaggctt gcaag atg gca gaa gca gat ttt aaa atg gtc 292
Met Ala Glu Ala Asp Phe Lys Met Val
1 5tcg gaa cct gtc gcc cat ggg gtt gcc gaa gag gag
atg gct agc tcg 340Ser Glu Pro Val Ala His Gly Val Ala Glu Glu Glu
Met Ala Ser Ser 10 15 20
25act agt gat tct ggg gaa gaa tct gac agc agt agc tct agc agc agc
388Thr Ser Asp Ser Gly Glu Glu Ser Asp Ser Ser Ser Ser Ser Ser Ser
30 35 40act agt gac agc agc
agc agc agc agc act agt ggc agc agc agc ggc 436Thr Ser Asp Ser Ser
Ser Ser Ser Ser Thr Ser Gly Ser Ser Ser Gly 45
50 55agc ggc agc agc agc agc agc agc ggc agc act agc
agc cgc agc cgc 484Ser Gly Ser Ser Ser Ser Ser Ser Gly Ser Thr Ser
Ser Arg Ser Arg 60 65 70ttg tat
aga aag aag agg gta cct gag cct tcc aga agg gcg cgg cgg 532Leu Tyr
Arg Lys Lys Arg Val Pro Glu Pro Ser Arg Arg Ala Arg Arg 75
80 85gcc ccg ttg gga aca aat ttc gtg gat agg ctg
cct cag gca gtt aga 580Ala Pro Leu Gly Thr Asn Phe Val Asp Arg Leu
Pro Gln Ala Val Arg 90 95 100
105aat cgt gtg caa gcg ctt aga aac att caa gat gaa tgt gac aag gta
628Asn Arg Val Gln Ala Leu Arg Asn Ile Gln Asp Glu Cys Asp Lys Val
110 115 120gat acc ctg ttc tta
aaa gca att cat gat ctt gaa aga aaa tat gct 676Asp Thr Leu Phe Leu
Lys Ala Ile His Asp Leu Glu Arg Lys Tyr Ala 125
130 135gaa ctc aac aag cct ctg tat gat agg cgg ttt caa
atc atc aat gca 724Glu Leu Asn Lys Pro Leu Tyr Asp Arg Arg Phe Gln
Ile Ile Asn Ala 140 145 150gaa tac
gag cct aca gaa gaa gaa tgt gaa tgg aat tca gag gat gag 772Glu Tyr
Glu Pro Thr Glu Glu Glu Cys Glu Trp Asn Ser Glu Asp Glu 155
160 165gag ttc agc agt gat gag gag gtg cag gat aac
acc cct agt gaa atg 820Glu Phe Ser Ser Asp Glu Glu Val Gln Asp Asn
Thr Pro Ser Glu Met170 175 180
185cct ccc tta gag ggt gag gaa gaa gaa aac cct aaa gaa aac cca gag
868Pro Pro Leu Glu Gly Glu Glu Glu Glu Asn Pro Lys Glu Asn Pro Glu
190 195 200gtg aaa gct gaa gag
aag gaa gtt cct aaa gaa att cct gag gtg aag 916Val Lys Ala Glu Glu
Lys Glu Val Pro Lys Glu Ile Pro Glu Val Lys 205
210 215gat gaa gaa aag gaa gtt gct aaa gaa att cct gag
gta aag gct gaa 964Asp Glu Glu Lys Glu Val Ala Lys Glu Ile Pro Glu
Val Lys Ala Glu 220 225 230gaa aaa
gca gat tct aaa gac tgt atg gag gca acc cct gaa gta aaa 1012Glu Lys
Ala Asp Ser Lys Asp Cys Met Glu Ala Thr Pro Glu Val Lys 235
240 245gaa gat cct aaa gaa gtc ccc cag gta aag gca
gat gat aaa gaa cag 1060Glu Asp Pro Lys Glu Val Pro Gln Val Lys Ala
Asp Asp Lys Glu Gln250 255 260
265cct aaa gca aca gag gct aag gca agg gct gca gta aga gag act cat
1108Pro Lys Ala Thr Glu Ala Lys Ala Arg Ala Ala Val Arg Glu Thr His
270 275 280aaa aga gtt cct gag
gaa agg ctt cgg gac agt gta gat ctt aaa aga 1156Lys Arg Val Pro Glu
Glu Arg Leu Arg Asp Ser Val Asp Leu Lys Arg 285
290 295gct agg aag gga aag cct aaa aga gaa gac cct aaa
ggc att cct gac 1204Ala Arg Lys Gly Lys Pro Lys Arg Glu Asp Pro Lys
Gly Ile Pro Asp 300 305 310tat tgg
ctg att gtt tta aag aat gtt gac aag ctc ggg cct atg att 1252Tyr Trp
Leu Ile Val Leu Lys Asn Val Asp Lys Leu Gly Pro Met Ile 315
320 325cag aag tat gat gag ccc att ctg aag ttc ttg
tcg gat gtt agc ctg 1300Gln Lys Tyr Asp Glu Pro Ile Leu Lys Phe Leu
Ser Asp Val Ser Leu330 335 340
345aag ttc tca aaa cct ggc cag cct gta agt tac acc ttt gaa ttt cat
1348Lys Phe Ser Lys Pro Gly Gln Pro Val Ser Tyr Thr Phe Glu Phe His
350 355 360ttt cta ccc aac cca
tac ttc aga aat gag gtg ctg gtg aag aca tat 1396Phe Leu Pro Asn Pro
Tyr Phe Arg Asn Glu Val Leu Val Lys Thr Tyr 365
370 375ata ata aag gca aaa cca gat cac aat gat ccc ttc
ttt tct tgg gga 1444Ile Ile Lys Ala Lys Pro Asp His Asn Asp Pro Phe
Phe Ser Trp Gly 380 385 390tgg gaa
att gaa gat tgc aaa ggc tgc aag ata gac cgg aga aga gga 1492Trp Glu
Ile Glu Asp Cys Lys Gly Cys Lys Ile Asp Arg Arg Arg Gly 395
400 405aaa gat gtt act gtg aca act acc cag agt cgc
aca act gct act gga 1540Lys Asp Val Thr Val Thr Thr Thr Gln Ser Arg
Thr Thr Ala Thr Gly410 415 420
425gaa att gaa atc cag cca aga gtg gtt cct aat gca tca ttc ttc aac
1588Glu Ile Glu Ile Gln Pro Arg Val Val Pro Asn Ala Ser Phe Phe Asn
430 435 440ttc ttt agt cct cct
gag att cct atg att ggg aag ctg gaa cca cga 1636Phe Phe Ser Pro Pro
Glu Ile Pro Met Ile Gly Lys Leu Glu Pro Arg 445
450 455gaa gat gct atc ctg gat gag gac ttt gaa att ggg
cag att tta cat 1684Glu Asp Ala Ile Leu Asp Glu Asp Phe Glu Ile Gly
Gln Ile Leu His 460 465 470gat aat
gtc atc ctg aaa tca atc tat tac tat act gga gaa gtc aat 1732Asp Asn
Val Ile Leu Lys Ser Ile Tyr Tyr Tyr Thr Gly Glu Val Asn 475
480 485ggt acc tac tat caa ttt ggc aaa cat tat gga
aac aag aaa tac aga 1780Gly Thr Tyr Tyr Gln Phe Gly Lys His Tyr Gly
Asn Lys Lys Tyr Arg490 495 500
505aaa taagtcaatc tgaaagattt ttcaagaatc ttaaaatctc aagaagtgaa
1833Lysgcagattcat acagccttga aaaaagtaaa accctgacct gtaacctgaa
cactattatt 1893ccttatagtc aagtttttgt ggtttcttgg tagtctatat tttaaaaata
gtcctaaaaa 1953gtgtctaagt gccagtttat tctatctagg ctgttgtagt ataatattct
tcaaaatatg 2013taagctgttg tcaattatct aaagcatgtt agtttggtgc tacacagtgt
tgatttttgt 2073gatgtccttt ggtcatgttt ctgttagact gtagctgtga aactgtcaga
attgttaact 2133gaaacaaata tttgcttgaa aaaaaaagtt catgaagtac caatgcaagt
gttttatttt 2193ttttcttttt tccagcccat aagactaagg gtttaaatct gcttgcacta
gctgtgcctt 2253cattagtttg ctatagaaat ccagtactta tagtaaataa aacagtgtat
tttgaagttt 2313gactgcttga aaaagattag catacatcta atgtgaaaag accacatttg
attcaactga 2373gaccttgtgt atgtgacata tagtggccta taaatttaat cataatgatg
ttattgttta 2433ccactgaggt gttaatataa catagtattt ttgaaaaagt ttcttcatct
tatattgtgt 2493aattgtaaac taaagatacc gtgttttctt tgtattgtgt tctaccttcc
ctttcactga 2553aaatgatcac ttcatttgat actgtttttc atgttcttgt attgcaacct
aaaataaata 2613aatattaaag tgtgttatac tat
2636
<210> 22 <211> 170 <212> PRT <213> Homo sapiens
<400> 22
Met Thr Glu Leu Gln Ser Ala Leu Leu
Leu Arg Arg Gln Leu Ala Glu 1 5 10
15Leu Asn Lys Asn Pro Val Glu Gly Phe Ser Ala Gly Leu Ile Asp
Asp 20 25 30Asn Asp Leu Tyr
Arg Trp Glu Val Leu Ile Ile Gly Pro Pro Asp Thr 35
40 45Leu Tyr Glu Gly Gly Val Phe Lys Ala His Leu Thr
Phe Pro Lys Asp 50 55 60Tyr Pro Leu
Arg Pro Pro Lys Met Lys Phe Ile Thr Glu Ile Trp His 65
70 75 80Pro Asn Val Asp Lys Asn Gly Asp
Val Cys Ile Ser Ile Leu His Glu 85 90
95Pro Gly Glu Asp Lys Tyr Gly Tyr Glu Lys Pro Glu Glu Arg
Trp Leu 100 105 110Pro Ile His
Thr Val Glu Thr Ile Met Ile Ser Val Ile Ser Met Leu 115
120 125Ala Asp Pro Asn Gly Asp Ser Pro Ala Asn Val
Asp Ala Ala Lys Glu 130 135 140Trp Arg
Glu Asp Arg Asn Gly Glu Phe Lys Arg Lys Val Ala Arg Cys145
150 155 160Val Arg Lys Ser Gln Glu Thr
Ala Phe Glu 165 170
<210> 23 <211> 510 <212> DNA <213> Homo sapiens
<400> 23
atgacggagc tgcagtcggc actgctactg cgaagacagc tggcagaact caacaaaaat 60
ccagtggaag gcttttctgc aggtttaata gatgacaatg atctctaccg atgggaagtc 120
cttattattg gccctccaga tacactttat gaaggtggtg tttttaaggc tcatcttact 180
ttcccaaaag attatcccct ccgacctcct aaaatgaaat tcattacaga aatctggcac 240
ccaaatgttg ataaaaatgg tgatgtgtgc atttctattc ttcatgagcc tggggaagat 300
aagtatggtt atgaaaagcc agaggaacgc tggctcccta tccacactgt ggaaaccatc 360
atgattagtg tcatttctat gctggcagac cctaatggag actcacctgc taatgttgat 420
gctgcgaaag aatggaggga agatagaaat ggagaattta aaagaaaagt tgcccgctgt 480
gtaagaaaaa gccaagagac tgcttttgag 510
<210> 24 <211> 617 <212> DNA <213> Homo sapiens
<220> <221> CDS <222> (19)..(528)
<400> 24
gggccctcgg cagggagg atg acg gag
ctg cag tcg gca ctg cta ctg cga 51 Met Thr Glu
Leu Gln Ser Ala Leu Leu Leu Arg 1 5
10aga cag ctg gca gaa ctc aac aaa aat cca gtg gaa ggc ttt
tct gca 99Arg Gln Leu Ala Glu Leu Asn Lys Asn Pro Val Glu Gly Phe
Ser Ala 15 20 25ggt tta ata
gat gac aat gat ctc tac cga tgg gaa gtc ctt att att 147Gly Leu Ile
Asp Asp Asn Asp Leu Tyr Arg Trp Glu Val Leu Ile Ile 30
35 40ggc cct cca gat aca ctt tat gaa ggt ggt gtt
ttt aag gct cat ctt 195Gly Pro Pro Asp Thr Leu Tyr Glu Gly Gly Val
Phe Lys Ala His Leu 45 50 55act ttc
cca aaa gat tat ccc ctc cga cct cct aaa atg aaa ttc att 243Thr Phe
Pro Lys Asp Tyr Pro Leu Arg Pro Pro Lys Met Lys Phe Ile 60
65 70 75aca gaa atc tgg cac cca aat
gtt gat aaa aat ggt gat gtg tgc att 291Thr Glu Ile Trp His Pro Asn
Val Asp Lys Asn Gly Asp Val Cys Ile 80
85 90tct att ctt cat gag cct ggg gaa gat aag tat ggt tat
gaa aag cca 339Ser Ile Leu His Glu Pro Gly Glu Asp Lys Tyr Gly Tyr
Glu Lys Pro 95 100 105gag gaa
cgc tgg ctc cct atc cac act gtg gaa acc atc atg att agt 387Glu Glu
Arg Trp Leu Pro Ile His Thr Val Glu Thr Ile Met Ile Ser 110
115 120gtc att tct atg ctg gca gac cct aat gga
gac tca cct gct aat gtt 435Val Ile Ser Met Leu Ala Asp Pro Asn Gly
Asp Ser Pro Ala Asn Val 125 130 135gat
gct gcg aaa gaa tgg agg gaa gat aga aat gga gaa ttt aaa aga 483Asp
Ala Ala Lys Glu Trp Arg Glu Asp Arg Asn Gly Glu Phe Lys Arg140
145 150 155aaa gtt gcc cgc tgt gta
aga aaa agc caa gag act gct ttt gag 528Lys Val Ala Arg Cys Val
Arg Lys Ser Gln Glu Thr Ala Phe Glu 160
165 170tgacatttat ttagcagcta gtaacttcac ttatttcagg
gtctccaatt gagaaacatg 588gcactgtttt tcctgcactc tacccaccg
617
<210> 25 <211> 374 <212> PRT <213> Homo sapiens
<400> 25
Met Val Leu Trp Glu Ser
Pro Arg Gln Cys Ser Ser Trp Thr Leu Cys 1 5
10 15Glu Gly Phe Cys Trp Leu Leu Leu Leu Pro Val Met
Leu Leu Ile Val 20 25 30Ala
Arg Pro Val Lys Leu Ala Ala Phe Pro Thr Ser Leu Ser Asp Cys 35
40 45Gln Thr Pro Thr Gly Trp Asn Cys Ser
Gly Tyr Asp Asp Arg Glu Asn 50 55
60Asp Leu Phe Leu Cys Asp Thr Asn Thr Cys Lys Phe Asp Gly Glu Cys 65
70 75 80Leu Arg Ile Gly Asp
Thr Val Thr Cys Val Cys Gln Phe Lys Cys Asn 85
90 95Asn Asp Tyr Val Pro Val Cys Gly Ser Asn Gly
Glu Ser Tyr Gln Asn 100 105
110Glu Cys Tyr Leu Arg Gln Ala Ala Cys Lys Gln Gln Ser Glu Ile Leu
115 120 125Val Val Ser Glu Gly Ser Cys
Ala Thr Asp Ala Gly Ser Gly Ser Gly 130 135
140Asp Gly Val His Glu Gly Ser Gly Glu Thr Ser Gln Lys Glu Thr
Ser145 150 155 160Thr Cys
Asp Ile Cys Gln Phe Gly Ala Glu Cys Asp Glu Asp Ala Glu
165 170 175Asp Val Trp Cys Val Cys Asn
Ile Asp Cys Ser Gln Thr Asn Phe Asn 180 185
190Pro Leu Cys Ala Ser Asp Gly Lys Ser Tyr Asp Asn Ala Cys
Gln Ile 195 200 205Lys Glu Ala Ser
Cys Gln Lys Gln Glu Lys Ile Glu Val Met Ser Leu 210
215 220Gly Arg Cys Gln Asp Asn Thr Thr Thr Thr Thr Lys
Ser Glu Asp Gly225 230 235
240His Tyr Ala Arg Thr Asp Tyr Ala Glu Asn Ala Asn Lys Leu Glu Glu
245 250 255Ser Ala Arg Glu His
His Ile Pro Cys Pro Glu His Tyr Asn Gly Phe 260
265 270Cys Met His Gly Lys Cys Glu His Ser Ile Asn Met
Gln Glu Pro Ser 275 280 285Cys Arg
Cys Asp Ala Gly Tyr Thr Gly Gln His Cys Glu Lys Lys Asp 290
295 300Tyr Ser Val Leu Tyr Val Val Pro Gly Pro Val
Arg Phe Gln Tyr Val305 310 315
320Leu Ile Ala Ala Val Ile Gly Thr Ile Gln Ile Ala Val Ile Cys Val
325 330 335Val Val Leu Cys
Ile Thr Arg Lys Cys Pro Arg Ser Asn Arg Ile His 340
345 350Arg Gln Lys Gln Asn Thr Gly His Tyr Ser Ser
Asp Asn Thr Thr Arg 355 360 365Ala
Ser Thr Arg Leu Ile 370
<210> 26 <211> 1122 <212> DNA <213> Homo sapiens
<400> 26
atggtgctgt gggagtcccc
gcggcagtgc agcagctgga cactttgcga gggcttttgc 60tggctgctgc tgctgcccgt
catgctactc atcgtagccc gcccggtgaa gctcgctgct 120ttccctacct ccttaagtga
ctgccaaacg cccaccggct ggaattgctc tggttatgat 180gacagagaaa atgatctctt
cctctgtgac accaacacct gtaaatttga tggggaatgt 240ttaagaattg gagacactgt
gacttgcgtc tgtcagttca agtgcaacaa tgactatgtg 300cctgtgtgtg gctccaatgg
ggagagctac cagaatgagt gttacctgcg acaggctgca 360tgcaaacagc agagtgagat
acttgtggtg tcagaaggat catgtgccac agatgcagga 420tcaggatctg gagatggagt
ccatgaaggc tctggagaaa ctagtcaaaa ggagacatcc 480acctgtgata tttgccagtt
tggtgcagaa tgtgacgaag atgccgagga tgtctggtgt 540gtgtgtaata ttgactgttc
tcaaaccaac ttcaatcccc tctgcgcttc tgatgggaaa 600tcttatgata atgcatgcca
aatcaaagaa gcatcgtgtc agaaacagga gaaaattgaa 660gtcatgtctt tgggtcgatg
tcaagataac acaactacaa ctactaagtc tgaagatggg 720cattatgcaa gaacagatta
tgcagagaat gctaacaaat tagaagaaag tgccagagaa 780caccacatac cttgtccgga
acattacaat ggcttctgca tgcatgggaa gtgtgagcat 840tctatcaata tgcaggagcc
atcttgcagg tgtgatgctg gttatactgg acaacactgt 900gaaaaaaagg actacagtgt
tctatacgtt gttcccggtc ctgtacgatt tcagtatgtc 960ttaatcgcag ctgtgattgg
aacaattcag attgctgtca tctgtgtggt ggtcctctgc 1020atcacaagga aatgccccag
aagcaacaga attcacagac agaagcaaaa tacagggcac 1080tacagttcag acaatacaac
aagagcgtcc acgaggttaa tc 1122
<210> 27 <211> 1721 <212> DNA <213> Homo sapiens
<220> <221> CDS <222> (368)..(1489)
<400> 27
ctgcggggcg ccttgactct ccctccaccc tgcctcctcg
ggctccactc gtctgcccct 60ggactcccgt ctcctcctgt cctccggctt cccagagctc
cctccttatg gcagcagctt 120cccgcgtctc cggcgcagct tctcagcgga cgaccctctc
gctccggggc tgagccagtc 180cctggatgtt gctgaaactc tcgagatcat gcgcgggttt
ggctgctgct tccccgccgg 240gtgccactgc caccgccgcc gcctctgctg ccgccgtccg
cgggatgctc agtagcccgc 300tgcccggccc ccgcgatcct gtgttcctcg gaagccgttt
gctgctgcag agttgcacga 360actagtc atg gtg ctg tgg gag tcc ccg cgg cag
tgc agc agc tgg aca 409 Met Val Leu Trp Glu Ser Pro Arg Gln
Cys Ser Ser Trp Thr 1 5 10ctt tgc
gag ggc ttt tgc tgg ctg ctg ctg ctg ccc gtc atg cta ctc 457Leu Cys
Glu Gly Phe Cys Trp Leu Leu Leu Leu Pro Val Met Leu Leu 15
20 25 30atc gta gcc cgc ccg gtg aag
ctc gct gct ttc cct acc tcc tta agt 505Ile Val Ala Arg Pro Val Lys
Leu Ala Ala Phe Pro Thr Ser Leu Ser 35
40 45gac tgc caa acg ccc acc ggc tgg aat tgc tct ggt tat
gat gac aga 553Asp Cys Gln Thr Pro Thr Gly Trp Asn Cys Ser Gly Tyr
Asp Asp Arg 50 55 60gaa aat
gat ctc ttc ctc tgt gac acc aac acc tgt aaa ttt gat ggg 601Glu Asn
Asp Leu Phe Leu Cys Asp Thr Asn Thr Cys Lys Phe Asp Gly 65
70 75gaa tgt tta aga att gga gac act gtg act
tgc gtc tgt cag ttc aag 649Glu Cys Leu Arg Ile Gly Asp Thr Val Thr
Cys Val Cys Gln Phe Lys 80 85 90tgc
aac aat gac tat gtg cct gtg tgt ggc tcc aat ggg gag agc tac 697Cys
Asn Asn Asp Tyr Val Pro Val Cys Gly Ser Asn Gly Glu Ser Tyr 95
100 105 110cag aat gag tgt tac ctg
cga cag gct gca tgc aaa cag cag agt gag 745Gln Asn Glu Cys Tyr Leu
Arg Gln Ala Ala Cys Lys Gln Gln Ser Glu 115
120 125ata ctt gtg gtg tca gaa gga tca tgt gcc aca gat
gca gga tca gga 793Ile Leu Val Val Ser Glu Gly Ser Cys Ala Thr Asp
Ala Gly Ser Gly 130 135 140tct
gga gat gga gtc cat gaa ggc tct gga gaa act agt caa aag gag 841Ser
Gly Asp Gly Val His Glu Gly Ser Gly Glu Thr Ser Gln Lys Glu 145
150 155aca tcc acc tgt gat att tgc cag ttt
ggt gca gaa tgt gac gaa gat 889Thr Ser Thr Cys Asp Ile Cys Gln Phe
Gly Ala Glu Cys Asp Glu Asp 160 165
170gcc gag gat gtc tgg tgt gtg tgt aat att gac tgt tct caa acc aac
937Ala Glu Asp Val Trp Cys Val Cys Asn Ile Asp Cys Ser Gln Thr Asn175
180 185 190ttc aat ccc ctc
tgc gct tct gat ggg aaa tct tat gat aat gca tgc 985Phe Asn Pro Leu
Cys Ala Ser Asp Gly Lys Ser Tyr Asp Asn Ala Cys 195
200 205caa atc aaa gaa gca tcg tgt cag aaa cag
gag aaa att gaa gtc atg 1033Gln Ile Lys Glu Ala Ser Cys Gln Lys Gln
Glu Lys Ile Glu Val Met 210 215
220tct ttg ggt cga tgt caa gat aac aca act aca act act aag tct gaa
1081Ser Leu Gly Arg Cys Gln Asp Asn Thr Thr Thr Thr Thr Lys Ser Glu
225 230 235gat ggg cat tat gca aga aca
gat tat gca gag aat gct aac aaa tta 1129Asp Gly His Tyr Ala Arg Thr
Asp Tyr Ala Glu Asn Ala Asn Lys Leu 240 245
250gaa gaa agt gcc aga gaa cac cac ata cct tgt ccg gaa cat tac aat
1177Glu Glu Ser Ala Arg Glu His His Ile Pro Cys Pro Glu His Tyr Asn255
260 265 270ggc ttc tgc atg
cat ggg aag tgt gag cat tct atc aat atg cag gag 1225Gly Phe Cys Met
His Gly Lys Cys Glu His Ser Ile Asn Met Gln Glu 275
280 285cca tct tgc agg tgt gat gct ggt tat act
gga caa cac tgt gaa aaa 1273Pro Ser Cys Arg Cys Asp Ala Gly Tyr Thr
Gly Gln His Cys Glu Lys 290 295
300aag gac tac agt gtt cta tac gtt gtt ccc ggt cct gta cga ttt cag
1321Lys Asp Tyr Ser Val Leu Tyr Val Val Pro Gly Pro Val Arg Phe Gln
305 310 315tat gtc tta atc gca gct gtg
att gga aca att cag att gct gtc atc 1369Tyr Val Leu Ile Ala Ala Val
Ile Gly Thr Ile Gln Ile Ala Val Ile 320 325
330tgt gtg gtg gtc ctc tgc atc aca agg aaa tgc ccc aga agc aac aga
1417Cys Val Val Val Leu Cys Ile Thr Arg Lys Cys Pro Arg Ser Asn Arg335
340 345 350att cac aga cag
aag caa aat aca ggg cac tac agt tca gac aat aca 1465Ile His Arg Gln
Lys Gln Asn Thr Gly His Tyr Ser Ser Asp Asn Thr 355
360 365aca aga gcg tcc acg agg tta atc
taaagggagc atgtttcaca gtggctggac 1519Thr Arg Ala Ser Thr Arg Leu Ile
370taccgagagc ttggactaca caatacagta ttatagacaa aagaataaga
caagagatct 1579acacatgttg ccttgcattt gtggtaatct acaccaatga aaacatgtac
tacagctata 1639tttgattatg tatggatata tttgaaatag tatacattgt cttgatgttt
tttctgtaat 1699gtaaataaac tatttatatc ac
1721
<210> 28 <211> 817 <212> PRT <213> Homo sapiens
<400> 28
Met Gly Asp Thr Val Val Glu Pro Ala
Pro Leu Lys Pro Thr Ser Glu 1 5 10
15Pro Thr Ser Gly Pro Pro Gly Asn Asn Gly Gly Ser Leu Leu Ser
Val 20 25 30Ile Thr Glu Gly
Val Gly Glu Leu Ser Val Ile Asp Pro Glu Val Ala 35
40 45Gln Lys Ala Cys Gln Glu Val Leu Glu Lys Val Lys
Leu Leu His Gly 50 55 60Gly Val Ala
Val Ser Ser Arg Gly Thr Pro Leu Glu Leu Val Asn Gly 65
70 75 80Asp Gly Val Asp Ser Glu Ile Arg
Cys Leu Asp Asp Pro Pro Ala Gln 85 90
95Ile Arg Glu Glu Glu Asp Glu Met Gly Ala Ala Val Ala Ser
Gly Thr 100 105 110Ala Lys Gly
Ala Arg Arg Arg Arg Gln Asn Asn Ser Ala Lys Gln Ser 115
120 125Trp Leu Leu Arg Leu Phe Glu Ser Lys Leu Phe
Asp Ile Ser Met Ala 130 135 140Ile Ser
Tyr Leu Tyr Asn Ser Lys Glu Pro Gly Val Gln Ala Tyr Ile145
150 155 160Gly Asn Arg Leu Phe Cys Phe
Arg Asn Glu Asp Val Asp Phe Tyr Leu 165
170 175Pro Gln Leu Leu Asn Met Tyr Ile His Met Asp Glu
Asp Val Gly Asp 180 185 190Ala
Ile Lys Pro Tyr Ile Val His Arg Cys Arg Gln Ser Ile Asn Phe 195
200 205Ser Leu Gln Cys Ala Leu Leu Leu Gly
Ala Tyr Ser Ser Asp Met His 210 215
220Ile Ser Thr Gln Arg His Ser Arg Gly Thr Lys Leu Arg Lys Leu Ile225
230 235 240Leu Ser Asp Glu
Leu Lys Pro Ala His Arg Lys Arg Glu Leu Pro Ser 245
250 255Leu Ser Pro Ala Pro Asp Thr Gly Leu Ser
Pro Ser Lys Arg Thr His 260 265
270Gln Arg Ser Lys Ser Asp Ala Thr Ala Ser Ile Ser Leu Ser Ser Asn
275 280 285Leu Lys Arg Thr Ala Ser Asn
Pro Lys Val Glu Asn Glu Asp Glu Glu 290 295
300Leu Ser Ser Ser Thr Glu Ser Ile Asp Asn Ser Phe Ser Ser Pro
Val305 310 315 320Arg Leu
Ala Pro Glu Arg Glu Phe Ile Lys Ser Leu Met Ala Ile Gly
325 330 335Lys Arg Leu Ala Thr Leu Pro
Thr Lys Glu Gln Lys Thr Gln Arg Leu 340 345
350Ile Ser Glu Leu Ser Leu Leu Asn His Lys Leu Pro Ala Arg
Val Trp 355 360 365Leu Pro Thr Ala
Gly Phe Asp His His Val Val Arg Val Pro His Thr 370
375 380Gln Ala Val Val Leu Asn Ser Lys Asp Lys Ala Pro
Tyr Leu Ile Tyr385 390 395
400Val Glu Val Leu Glu Cys Glu Asn Phe Asp Thr Thr Ser Val Pro Ala
405 410 415Arg Ile Pro Glu Asn
Arg Ile Arg Ser Thr Arg Ser Val Glu Asn Leu 420
425 430Pro Glu Cys Gly Ile Thr His Glu Gln Arg Ala Gly
Ser Phe Ser Thr 435 440 445Val Pro
Asn Tyr Asp Asn Asp Asp Glu Ala Trp Ser Val Asp Asp Ile 450
455 460Gly Glu Leu Gln Val Glu Leu Pro Glu Val His
Thr Asn Ser Cys Asp465 470 475
480Asn Ile Ser Gln Phe Ser Val Asp Ser Ile Thr Ser Gln Glu Ser Lys
485 490 495Glu Pro Val Phe
Ile Ala Ala Gly Asp Ile Arg Arg Arg Leu Ser Glu 500
505 510Gln Leu Ala His Thr Pro Thr Ala Phe Lys Arg
Asp Pro Glu Asp Pro 515 520 525Ser
Ala Val Ala Leu Lys Glu Pro Trp Gln Glu Lys Val Arg Arg Ile 530
535 540Arg Glu Gly Ser Pro Tyr Gly His Leu Pro
Asn Trp Arg Leu Leu Ser545 550 555
560Val Ile Val Lys Cys Gly Asp Asp Leu Arg Gln Glu Leu Leu Ala
Phe 565 570 575Gln Val Leu
Lys Gln Leu Gln Ser Ile Trp Glu Gln Glu Arg Val Pro 580
585 590Leu Trp Ile Lys Pro Ile Gln Asp Ser Cys
Glu Ile Thr Thr Asp Ser 595 600
605Gly Met Ile Glu Pro Val Val Asn Ala Val Ser Ile His Gln Val Lys 610
615 620Lys Gln Ser Gln Leu Ser Leu Leu
Asp Tyr Phe Leu Gln Glu His Gly625 630
635 640Ser Tyr Thr Thr Glu Ala Phe Leu Ser Ala Gln Arg
Asn Phe Val Gln 645 650
655Ser Cys Ala Gly Tyr Cys Leu Val Cys Tyr Leu Leu Gln Val Lys Asp
660 665 670Arg His Asn Gly Asn Ile
Leu Leu Asp Ala Glu Gly His Ile Ile His 675 680
685Ile Asp Phe Gly Phe Ile Leu Ser Ser Ser Pro Arg Asn Leu
Gly Phe 690 695 700Glu Thr Ser Ala Phe
Lys Leu Thr Thr Glu Phe Val Asp Val Met Gly705 710
715 720Gly Leu Asp Gly Asp Met Phe Asn Tyr Tyr
Lys Met Leu Met Leu Gln 725 730
735Gly Leu Ile Ala Ala Arg Lys His Met Asp Lys Val Val Gln Ile Val
740 745 750Glu Ile Met Gln Gln
Gly Ser Gln Leu Pro Cys Phe His Gly Ser Ser 755
760 765Thr Ile Arg Asn Leu Lys Glu Arg Phe His Met Ser
Met Thr Glu Glu 770 775 780Gln Leu Gln
Leu Leu Val Glu Gln Met Val Asp Gly Ser Met Arg Ser785
790 795 800Ile Thr Thr Lys Leu Tyr Asp
Gly Phe Gln Tyr Leu Thr Asn Gly Ile 805
810                 815 Met
<210> 29 <211> 2451 <212> DNA <213> Homo sapiens
<400> 29
atgggagata cagtagtgga
gcctgccccc ttgaagccaa cttctgagcc cacttctggc 60ccaccaggga ataatggggg
gtccctgcta agtgtcatca cggagggggt cggggaacta 120tcagtgattg accctgaggt
ggcccagaag gcctgccagg aggtgttgga gaaagtcaag 180cttttgcatg gaggcgtggc
agtctctagc agaggcaccc cactggagtt ggtcaatggg 240gatggtgtgg acagtgagat
ccgttgccta gatgatccac ctgcccagat cagggaggag 300gaagatgaga tgggggccgc
tgtggcctca ggcacagcca aaggagcaag aagacggcgg 360cagaacaact cagctaaaca
gtcttggctg ctgaggctgt ttgagtcaaa actgtttgac 420atctccatgg ccatttcata
cctgtataac tccaaggagc ctggagtaca agcctacatt 480ggcaaccggc tcttctgctt
tcgcaacgag gacgtggact tctatctgcc ccagttgctt 540aacatgtaca tccacatgga
tgaggacgtg ggtgatgcca ttaagcccta catagtccac 600cgttgccgcc agagcattaa
cttttccctc cagtgtgccc tgttgcttgg ggcctattct 660tcagacatgc acatttccac
tcaacgacac tcccgtggga ccaagctacg gaagctgatc 720ctctcagatg agctaaagcc
agctcacagg aagagggagc tgccctcctt gagcccggcc 780cctgatacag ggctgtctcc
ctccaaaagg actcaccagc gctctaagtc agatgccact 840gccagcataa gtctcagcag
caacctgaaa cgaacagcca gcaaccctaa agtggagaat 900gaggatgagg agctctcctc
cagcaccgag agtattgata attcattcag ttcccctgtt 960cgactggctc ctgagagaga
attcatcaag tccctgatgg cgatcggcaa gcggctggcc 1020acgctcccca ccaaagagca
gaaaacacag aggctgatct cagagctctc cctgctcaac 1080cataagctcc ctgcccgagt
ctggctgccc actgctggct ttgaccacca cgtggtccgt 1140gtaccccaca cacaggctgt
tgtcctcaac tccaaggaca aggctcccta cctgatttat 1200gtggaagtcc ttgaatgtga
aaactttgac accaccagtg tccctgcccg gatccccgag 1260aaccgaattc ggagtacgag
gtccgtagaa aacttgcccg aatgtggtat tacccatgag 1320cagcgagctg gcagcttcag
cactgtgccc aactatgaca acgatgatga ggcctggtcg 1380gtggatgaca taggcgagct
gcaagtggag ctccccgaag tgcataccaa cagctgtgac 1440aacatctccc agttctctgt
ggacagcatc accagccagg agagcaagga gcctgtgttc 1500attgcagcag gggacatccg
ccggcgcctt tcggaacagc tggctcatac cccgacagcc 1560ttcaaacgag acccagaaga
tccttctgca gttgctctca aagagccctg gcaggagaaa 1620gtacggcgga tcagagaggg
ctccccctac ggccatctcc ccaattggcg gctcctgtca 1680gtcattgtca agtgtgggga
tgaccttcgg caagagcttc tggcctttca ggtgttgaag 1740caactgcagt ccatttggga
acaggagcga gtgccccttt ggatcaagcc aatacaagat 1800tcttgtgaaa ttacgactga
tagtggcatg attgaaccag tggtcaatgc tgtgtccatc 1860catcaggtga agaaacagtc
acagctctcc ttgctcgatt acttcctaca ggagcacggc 1920agttacacca ctgaggcatt
cctcagtgca cagcgcaatt ttgtgcaaag ttgtgctggg 1980tactgcttgg tctgctacct
gctgcaagtc aaggacagac acaatgggaa tatccttttg 2040gacgcagaag gccacatcat
ccacatcgac tttggcttca tcctctccag ctcaccccga 2100aatctgggct ttgagacgtc
agcctttaag ctgaccacag agtttgtgga tgtgatgggc 2160ggcctggatg gcgacatgtt
caactactat aagatgctga tgctgcaagg gctgattgcc 2220gctcggaaac acatggacaa
ggtggtgcag atcgtggaga tcatgcagca aggttctcag 2280cttccttgct tccatggctc
cagcaccatt cgaaacctca aagagaggtt ccacatgagc 2340atgactgagg agcagctgca
gctgctggtg gagcagatgg tggatggcag tatgcggtct 2400atcaccacca aactctatga
cggcttccag tacctcacca acggcatcat g 2451
<210> 30 <211> 3602 <212> DNA <213> Homo sapiens
<220> <221> CDS <222> (429)..(2879)
<400> 30
ggtggctcac gcctgtaatc ccagcacttt gggaggacaa
ggcagatccc ttgagcccag 60gaggtagagg ctgcagtgag ctgtgatggt gccactgcac
tccagcctgg gcaatgaagc 120aagaccctat ctgaaaaaaa aaatttttaa aaaaggcaaa
gatgggcctg gggcaccaaa 180tattccagag gaaagggaac gtgtgtactc cttgaggtgg
ggaacatgac ccacttgagg 240tgcagaaaga agacttgtat ggggctggtg cagcctccgc
ggccgctgtc agggaagcgc 300aggcggccaa tggaacccgg gagcggtcgc tgctgctgag
gcggcagtgt cggcagtcca 360accgcgactg cccgcacccc ctccgcgggg tcccccagag
cttggaagct cgaagtctgg 420ctgtggcc atg gga gat aca gta gtg gag cct gcc
ccc ttg aag cca act 470 Met Gly Asp Thr Val Val Glu Pro Ala
Pro Leu Lys Pro Thr 1 5 10tct
gag ccc act tct ggc cca cca ggg aat aat ggg ggg tcc ctg cta 518Ser
Glu Pro Thr Ser Gly Pro Pro Gly Asn Asn Gly Gly Ser Leu Leu 15
20 25 30agt gtc atc acg gag ggg
gtc ggg gaa cta tca gtg att gac cct gag 566Ser Val Ile Thr Glu Gly
Val Gly Glu Leu Ser Val Ile Asp Pro Glu 35
40 45gtg gcc cag aag gcc tgc cag gag gtg ttg gag aaa
gtc aag ctt ttg 614Val Ala Gln Lys Ala Cys Gln Glu Val Leu Glu Lys
Val Lys Leu Leu 50 55 60cat
gga ggc gtg gca gtc tct agc aga ggc acc cca ctg gag ttg gtc 662His
Gly Gly Val Ala Val Ser Ser Arg Gly Thr Pro Leu Glu Leu Val 65
70 75aat ggg gat ggt gtg gac agt gag atc
cgt tgc cta gat gat cca cct 710Asn Gly Asp Gly Val Asp Ser Glu Ile
Arg Cys Leu Asp Asp Pro Pro 80 85
90gcc cag atc agg gag gag gaa gat gag atg ggg gcc gct gtg gcc tca
758Ala Gln Ile Arg Glu Glu Glu Asp Glu Met Gly Ala Ala Val Ala Ser 95
100 105 110ggc aca gcc aaa
gga gca aga aga cgg cgg cag aac aac tca gct aaa 806Gly Thr Ala Lys
Gly Ala Arg Arg Arg Arg Gln Asn Asn Ser Ala Lys 115
120 125cag tct tgg ctg ctg agg ctg ttt gag tca
aaa ctg ttt gac atc tcc 854Gln Ser Trp Leu Leu Arg Leu Phe Glu Ser
Lys Leu Phe Asp Ile Ser 130 135
140atg gcc att tca tac ctg tat aac tcc aag gag cct gga gta caa gcc
902Met Ala Ile Ser Tyr Leu Tyr Asn Ser Lys Glu Pro Gly Val Gln Ala
145 150 155tac att ggc aac cgg ctc ttc
tgc ttt cgc aac gag gac gtg gac ttc 950Tyr Ile Gly Asn Arg Leu Phe
Cys Phe Arg Asn Glu Asp Val Asp Phe 160 165
170tat ctg ccc cag ttg ctt aac atg tac atc cac atg gat gag gac gtg
998Tyr Leu Pro Gln Leu Leu Asn Met Tyr Ile His Met Asp Glu Asp Val175
180 185 190ggt gat gcc att
aag ccc tac ata gtc cac cgt tgc cgc cag agc att 1046Gly Asp Ala Ile
Lys Pro Tyr Ile Val His Arg Cys Arg Gln Ser Ile 195
200 205aac ttt tcc ctc cag tgt gcc ctg ttg ctt
ggg gcc tat tct tca gac 1094Asn Phe Ser Leu Gln Cys Ala Leu Leu Leu
Gly Ala Tyr Ser Ser Asp 210 215
220atg cac att tcc act caa cga cac tcc cgt ggg acc aag cta cgg aag
1142Met His Ile Ser Thr Gln Arg His Ser Arg Gly Thr Lys Leu Arg Lys
225 230 235ctg atc ctc tca gat gag cta
aag cca gct cac agg aag agg gag ctg 1190Leu Ile Leu Ser Asp Glu Leu
Lys Pro Ala His Arg Lys Arg Glu Leu 240 245
250ccc tcc ttg agc ccg gcc cct gat aca ggg ctg tct ccc tcc aaa agg
1238Pro Ser Leu Ser Pro Ala Pro Asp Thr Gly Leu Ser Pro Ser Lys Arg255
260 265 270act cac cag cgc
tct aag tca gat gcc act gcc agc ata agt ctc agc 1286Thr His Gln Arg
Ser Lys Ser Asp Ala Thr Ala Ser Ile Ser Leu Ser 275
280 285agc aac ctg aaa cga aca gcc agc aac cct
aaa gtg gag aat gag gat 1334Ser Asn Leu Lys Arg Thr Ala Ser Asn Pro
Lys Val Glu Asn Glu Asp 290 295
300gag gag ctc tcc tcc agc acc gag agt att gat aat tca ttc agt tcc
1382Glu Glu Leu Ser Ser Ser Thr Glu Ser Ile Asp Asn Ser Phe Ser Ser
305 310 315cct gtt cga ctg gct cct gag
aga gaa ttc atc aag tcc ctg atg gcg 1430Pro Val Arg Leu Ala Pro Glu
Arg Glu Phe Ile Lys Ser Leu Met Ala 320 325
330atc ggc aag cgg ctg gcc acg ctc ccc acc aaa gag cag aaa aca cag
1478Ile Gly Lys Arg Leu Ala Thr Leu Pro Thr Lys Glu Gln Lys Thr Gln335
340 345 350agg ctg atc tca
gag ctc tcc ctg ctc aac cat aag ctc cct gcc cga 1526Arg Leu Ile Ser
Glu Leu Ser Leu Leu Asn His Lys Leu Pro Ala Arg 355
360 365gtc tgg ctg ccc act gct ggc ttt gac cac
cac gtg gtc cgt gta ccc 1574Val Trp Leu Pro Thr Ala Gly Phe Asp His
His Val Val Arg Val Pro 370 375
380cac aca cag gct gtt gtc ctc aac tcc aag gac aag gct ccc tac ctg
1622His Thr Gln Ala Val Val Leu Asn Ser Lys Asp Lys Ala Pro Tyr Leu
385 390 395att tat gtg gaa gtc ctt gaa
tgt gaa aac ttt gac acc acc agt gtc 1670Ile Tyr Val Glu Val Leu Glu
Cys Glu Asn Phe Asp Thr Thr Ser Val 400 405
410cct gcc cgg atc ccc gag aac cga att cgg agt acg agg tcc gta gaa
1718Pro Ala Arg Ile Pro Glu Asn Arg Ile Arg Ser Thr Arg Ser Val Glu415
420 425 430aac ttg ccc gaa
tgt ggt att acc cat gag cag cga gct ggc agc ttc 1766Asn Leu Pro Glu
Cys Gly Ile Thr His Glu Gln Arg Ala Gly Ser Phe 435
440 445agc act gtg ccc aac tat gac aac gat gat
gag gcc tgg tcg gtg gat 1814Ser Thr Val Pro Asn Tyr Asp Asn Asp Asp
Glu Ala Trp Ser Val Asp 450 455
460gac ata ggc gag ctg caa gtg gag ctc ccc gaa gtg cat acc aac agc
1862Asp Ile Gly Glu Leu Gln Val Glu Leu Pro Glu Val His Thr Asn Ser
465 470 475tgt gac aac atc tcc cag ttc
tct gtg gac agc atc acc agc cag gag 1910Cys Asp Asn Ile Ser Gln Phe
Ser Val Asp Ser Ile Thr Ser Gln Glu 480 485
490agc aag gag cct gtg ttc att gca gca ggg gac atc cgc cgg cgc ctt
1958Ser Lys Glu Pro Val Phe Ile Ala Ala Gly Asp Ile Arg Arg Arg Leu495
500 505 510tcg gaa cag ctg
gct cat acc ccg aca gcc ttc aaa cga gac cca gaa 2006Ser Glu Gln Leu
Ala His Thr Pro Thr Ala Phe Lys Arg Asp Pro Glu 515
520 525gat cct tct gca gtt gct ctc aaa gag ccc
tgg cag gag aaa gta cgg 2054Asp Pro Ser Ala Val Ala Leu Lys Glu Pro
Trp Gln Glu Lys Val Arg 530 535
540cgg atc aga gag ggc tcc ccc tac ggc cat ctc ccc aat tgg cgg ctc
2102Arg Ile Arg Glu Gly Ser Pro Tyr Gly His Leu Pro Asn Trp Arg Leu
545 550 555ctg tca gtc att gtc aag tgt
ggg gat gac ctt cgg caa gag ctt ctg 2150Leu Ser Val Ile Val Lys Cys
Gly Asp Asp Leu Arg Gln Glu Leu Leu 560 565
570gcc ttt cag gtg ttg aag caa ctg cag tcc att tgg gaa cag gag cga
2198Ala Phe Gln Val Leu Lys Gln Leu Gln Ser Ile Trp Glu Gln Glu Arg575
580 585 590gtg ccc ctt tgg
atc aag cca ata caa gat tct tgt gaa att acg act 2246Val Pro Leu Trp
Ile Lys Pro Ile Gln Asp Ser Cys Glu Ile Thr Thr 595
600 605gat agt ggc atg att gaa cca gtg gtc aat
gct gtg tcc atc cat cag 2294Asp Ser Gly Met Ile Glu Pro Val Val Asn
Ala Val Ser Ile His Gln 610 615
620gtg aag aaa cag tca cag ctc tcc ttg ctc gat tac ttc cta cag gag
2342Val Lys Lys Gln Ser Gln Leu Ser Leu Leu Asp Tyr Phe Leu Gln Glu
625 630 635cac ggc agt tac acc act gag
gca ttc ctc agt gca cag cgc aat ttt 2390His Gly Ser Tyr Thr Thr Glu
Ala Phe Leu Ser Ala Gln Arg Asn Phe 640 645
650gtg caa agt tgt gct ggg tac tgc ttg gtc tgc tac ctg ctg caa gtc
2438Val Gln Ser Cys Ala Gly Tyr Cys Leu Val Cys Tyr Leu Leu Gln Val655
660 665 670aag gac aga cac
aat ggg aat atc ctt ttg gac gca gaa ggc cac atc 2486Lys Asp Arg His
Asn Gly Asn Ile Leu Leu Asp Ala Glu Gly His Ile 675
680 685atc cac atc gac ttt ggc ttc atc ctc tcc
agc tca ccc cga aat ctg 2534Ile His Ile Asp Phe Gly Phe Ile Leu Ser
Ser Ser Pro Arg Asn Leu 690 695
700ggc ttt gag acg tca gcc ttt aag ctg acc aca gag ttt gtg gat gtg
2582Gly Phe Glu Thr Ser Ala Phe Lys Leu Thr Thr Glu Phe Val Asp Val
705 710 715atg ggc ggc ctg gat ggc gac
atg ttc aac tac tat aag atg ctg atg 2630Met Gly Gly Leu Asp Gly Asp
Met Phe Asn Tyr Tyr Lys Met Leu Met 720 725
730ctg caa ggg ctg att gcc gct cgg aaa cac atg gac aag gtg gtg cag
2678Leu Gln Gly Leu Ile Ala Ala Arg Lys His Met Asp Lys Val Val Gln735
740 745 750atc gtg gag atc
atg cag caa ggt tct cag ctt cct tgc ttc cat ggc 2726Ile Val Glu Ile
Met Gln Gln Gly Ser Gln Leu Pro Cys Phe His Gly 755
760 765tcc agc acc att cga aac ctc aaa gag agg
ttc cac atg agc atg act 2774Ser Ser Thr Ile Arg Asn Leu Lys Glu Arg
Phe His Met Ser Met Thr 770 775
780gag gag cag ctg cag ctg ctg gtg gag cag atg gtg gat ggc agt atg
2822Glu Glu Gln Leu Gln Leu Leu Val Glu Gln Met Val Asp Gly Ser Met
785 790 795cgg tct atc acc acc aaa ctc
tat gac ggc ttc cag tac ctc acc aac 2870Arg Ser Ile Thr Thr Lys Leu
Tyr Asp Gly Phe Gln Tyr Leu Thr Asn 800 805
810ggc atc atg tgacacgctc ctcagcccag gagtggtggg gggtccaggg
2919Gly Ile Met815caccctccct agagggccct tgtctgagaa accccaaacc
aggaaacccc acctacccaa 2979ccatccaccc aagggaaatg gaaggcaaga aacacgaagg
atcatgtggt aactgcgaga 3039gcttgctgag gggtgggaga gccagctgtg gggtccagac
ttgttggggc ttccctgccc 3099ctcctggtct gtgtcagtat taccaccaga ctgactccag
gactcactgc cctccagaaa 3159acagaggtga caaatgtgag ggacactggg gcctttcttc
tccttgtagg ggtctctcag 3219aggttctttc cacaggccat cctcttattc cgttctgggg
cccaggaagt ggggaagagt 3279aggttctcgg tacttaggac ttgatcctgt ggttgccact
ggccatgctg ctgcccagct 3339ctacccctcc cagggaccta cccctcccag ggaccgaccc
ctggcccaag ctccccttgc 3399tggcgggcgc tgcgtgggcc ctgcacttgc tgaggttccc
catcatgggc aaggcaaggg 3459aattcccaca gccctccagt gtactgaggg tactggccta
gccatgtgga attccctacc 3519ctgactcctt ccccaaaccc agggaaaaga gctctcaatt
ttttattttt aatttttgtt 3579tgaaataaag tccttagtta gcc
3602
<210> 31 <211> 829 <212> PRT <213> Homo sapiens
<400> 31
Met Arg Phe Leu Glu Ala
Arg Ser Leu Ala Val Ala Met Gly Asp Thr 1 5
10 15Val Val Glu Pro Ala Pro Leu Lys Pro Thr Ser Glu
Pro Thr Ser Gly 20 25 30Pro
Pro Gly Asn Asn Gly Gly Ser Leu Leu Ser Val Ile Thr Glu Gly 35
40 45Val Gly Glu Leu Ser Val Ile Asp Pro
Glu Val Ala Gln Lys Ala Cys 50 55
60Gln Glu Val Leu Glu Lys Val Lys Leu Leu His Gly Gly Val Ala Val 65
70 75 80Ser Ser Arg Gly Thr
Pro Leu Glu Leu Val Asn Gly Asp Gly Val Asp 85
90 95Ser Glu Ile Arg Cys Leu Asp Asp Pro Pro Ala
Gln Ile Arg Glu Glu 100 105
110Glu Asp Glu Met Gly Ala Ala Val Ala Ser Gly Thr Ala Lys Gly Ala
115 120 125Arg Arg Arg Arg Gln Asn Asn
Ser Ala Lys Gln Ser Trp Leu Leu Arg 130 135
140Leu Phe Glu Ser Lys Leu Phe Asp Ile Ser Met Ala Ile Ser Tyr
Leu145 150 155 160Tyr Asn
Ser Lys Glu Pro Gly Val Gln Ala Tyr Ile Gly Asn Arg Leu
165 170 175Phe Cys Phe Arg Asn Glu Asp
Val Asp Phe Tyr Leu Pro Gln Leu Leu 180 185
190Asn Met Tyr Ile His Met Asp Glu Asp Val Gly Asp Ala Ile
Lys Pro 195 200 205Tyr Ile Val His
Arg Cys Arg Gln Ser Ile Asn Phe Ser Leu Gln Cys 210
215 220Ala Leu Leu Leu Gly Ala Tyr Ser Ser Asp Met His
Ile Ser Thr Gln225 230 235
240Arg His Ser Arg Gly Thr Lys Leu Arg Lys Leu Ile Leu Ser Asp Glu
245 250 255Leu Lys Pro Ala His
Arg Lys Arg Glu Leu Pro Ser Leu Ser Pro Ala 260
265 270Pro Asp Thr Gly Leu Ser Pro Ser Lys Arg Thr His
Gln Arg Ser Lys 275 280 285Ser Asp
Ala Thr Ala Ser Ile Ser Leu Ser Ser Asn Leu Lys Arg Thr 290
295 300Ala Ser Asn Pro Lys Val Glu Asn Glu Asp Glu
Glu Leu Ser Ser Ser305 310 315
320Thr Glu Ser Ile Asp Asn Ser Phe Ser Ser Pro Val Arg Leu Ala Pro
325 330 335Glu Arg Glu Phe
Ile Lys Ser Leu Met Ala Ile Gly Lys Arg Leu Ala 340
345 350Thr Leu Pro Thr Lys Glu Gln Lys Thr Gln Arg
Leu Ile Ser Glu Leu 355 360 365Ser
Leu Leu Asn His Lys Leu Pro Ala Arg Val Trp Leu Pro Thr Ala 370
375 380Gly Phe Asp His His Val Val Arg Val Pro
His Thr Gln Ala Val Val385 390 395
400Leu Asn Ser Lys Asp Lys Ala Pro Tyr Leu Ile Tyr Val Glu Val
Leu 405 410 415Glu Cys Glu
Asn Phe Asp Thr Thr Ser Val Pro Ala Arg Ile Pro Glu 420
425 430Asn Arg Ile Arg Ser Thr Arg Ser Val Glu
Asn Leu Pro Glu Cys Gly 435 440
445Ile Thr His Glu Gln Arg Ala Gly Ser Phe Ser Thr Val Pro Asn Tyr 450
455 460Asp Asn Asp Asp Glu Ala Trp Ser
Val Asp Asp Ile Gly Glu Leu Gln465 470
475 480Val Glu Leu Pro Glu Val His Thr Asn Ser Cys Asp
Asn Ile Ser Gln 485 490
495Phe Ser Val Asp Ser Ile Thr Ser Gln Glu Ser Lys Glu Pro Val Phe
500 505 510Ile Ala Ala Gly Asp Ile
Arg Arg Arg Leu Ser Glu Gln Leu Ala His 515 520
525Thr Pro Thr Ala Phe Lys Arg Asp Pro Glu Asp Pro Ser Ala
Val Ala 530 535 540Leu Lys Glu Pro Trp
Gln Glu Lys Val Arg Arg Ile Arg Glu Gly Ser545 550
555 560Pro Tyr Gly His Leu Pro Asn Trp Arg Leu
Leu Ser Val Ile Val Lys 565 570
575Cys Gly Asp Asp Leu Arg Gln Glu Leu Leu Ala Phe Gln Val Leu Lys
580 585 590Gln Leu Gln Ser Ile
Trp Glu Gln Glu Arg Val Pro Leu Trp Ile Lys 595
600 605Pro Ile Gln Asp Ser Cys Glu Ile Thr Thr Asp Ser
Gly Met Ile Glu 610 615 620Pro Val Val
Asn Ala Val Ser Ile His Gln Val Lys Lys Gln Ser Gln625
630 635 640Leu Ser Leu Leu Asp Tyr Phe
Leu Gln Glu His Gly Ser Tyr Thr Thr 645
650 655Glu Ala Phe Leu Ser Ala Gln Arg Asn Phe Val Gln
Ser Cys Ala Gly 660 665 670Tyr
Cys Leu Val Cys Tyr Leu Leu Gln Val Lys Asp Arg His Asn Gly 675
680 685Asn Ile Leu Leu Asp Ala Glu Gly His
Ile Ile His Ile Asp Phe Gly 690 695
700Phe Ile Leu Ser Ser Ser Pro Arg Asn Leu Gly Phe Glu Thr Ser Ala705
710 715 720Phe Lys Leu Thr
Thr Glu Phe Val Asp Val Met Gly Gly Leu Asp Gly 725
730 735Asp Met Phe Asn Tyr Tyr Lys Met Leu Met
Leu Gln Gly Leu Ile Ala 740 745
750Ala Arg Lys His Met Asp Lys Val Val Gln Ile Val Glu Ile Met Gln
755 760 765Gln Gly Ser Gln Leu Pro Cys
Phe His Gly Ser Ser Thr Ile Arg Asn 770 775
780Leu Lys Glu Arg Phe His Met Ser Met Thr Glu Glu Gln Leu Gln
Leu785 790 795 800Leu Val
Glu Gln Met Val Asp Gly Ser Met Arg Ser Ile Thr Thr Lys
805 810 815Leu Tyr Asp Gly Phe Gln Tyr
Leu Thr Asn Gly Ile Met 820 825
SEQ ID NO: 32; LENGTH: 2487; TYPE: DNA; ORGANISM: Homo sapiens; SEQUENCE: 32
atgagattct tggaagctcg aagtctggct gtggccatgg gagatacagt
agtggagcct 60gcccccttga agccaacttc tgagcccact tctggcccac cagggaataa
tggggggtcc 120ctgctaagtg tcatcacgga gggggtcggg gaactatcag tgattgaccc
tgaggtggcc 180cagaaggcct gccaggaggt gttggagaaa gtcaagcttt tgcatggagg
cgtggcagtc 240tctagcagag gcaccccact ggagttggtc aatggggatg gtgtggacag
tgagatccgt 300tgcctagatg atccacctgc ccagatcagg gaggaggaag atgagatggg
ggccgctgtg 360gcctcaggca cagccaaagg agcaagaaga cggcggcaga acaactcagc
taaacagtct 420tggctgctga ggctgtttga gtcaaaactg tttgacatct ccatggccat
ttcatacctg 480tataactcca aggagcctgg agtacaagcc tacattggca accggctctt
ctgctttcgc 540aacgaggacg tggacttcta tctgccccag ttgcttaaca tgtacatcca
catggatgag 600gacgtgggtg atgccattaa gccctacata gtccaccgtt gccgccagag
cattaacttt 660tccctccagt gtgccctgtt gcttggggcc tattcttcag acatgcacat
ttccactcaa 720cgacactccc gtgggaccaa gctacggaag ctgatcctct cagatgagct
aaagccagct 780cacaggaaga gggagctgcc ctccttgagc ccggcccctg atacagggct
gtctccctcc 840aaaaggactc accagcgctc taagtcagat gccactgcca gcataagtct
cagcagcaac 900ctgaaacgaa cagccagcaa ccctaaagtg gagaatgagg atgaggagct
ctcctccagc 960accgagagta ttgataattc attcagttcc cctgttcgac tggctcctga
gagagaattc 1020atcaagtccc tgatggcgat cggcaagcgg ctggccacgc tccccaccaa
agagcagaaa 1080acacagaggc tgatctcaga gctctccctg ctcaaccata agctccctgc
ccgagtctgg 1140ctgcccactg ctggctttga ccaccacgtg gtccgtgtac cccacacaca
ggctgttgtc 1200ctcaactcca aggacaaggc tccctacctg atttatgtgg aagtccttga
atgtgaaaac 1260tttgacacca ccagtgtccc tgcccggatc cccgagaacc gaattcggag
tacgaggtcc 1320gtagaaaact tgcccgaatg tggtattacc catgagcagc gagctggcag
cttcagcact 1380gtgcccaact atgacaacga tgatgaggcc tggtcggtgg atgacatagg
cgagctgcaa 1440gtggagctcc ccgaagtgca taccaacagc tgtgacaaca tctcccagtt
ctctgtggac 1500agcatcacca gccaggagag caaggagcct gtgttcattg cagcagggga
catccgccgg 1560cgcctttcgg aacagctggc tcataccccg acagccttca aacgagaccc
agaagatcct 1620tctgcagttg ctctcaaaga gccctggcag gagaaagtac ggcggatcag
agagggctcc 1680ccctacggcc atctccccaa ttggcggctc ctgtcagtca ttgtcaagtg
tggggatgac 1740cttcggcaag agcttctggc ctttcaggtg ttgaagcaac tgcagtccat
ttgggaacag 1800gagcgagtgc ccctttggat caagccaata caagattctt gtgaaattac
gactgatagt 1860ggcatgattg aaccagtggt caatgctgtg tccatccatc aggtgaagaa
acagtcacag 1920ctctccttgc tcgattactt cctacaggag cacggcagtt acaccactga
ggcattcctc 1980agtgcacagc gcaattttgt gcaaagttgt gctgggtact gcttggtctg
ctacctgctg 2040caagtcaagg acagacacaa tgggaatatc cttttggacg cagaaggcca
catcatccac 2100atcgactttg gcttcatcct ctccagctca ccccgaaatc tgggctttga
gacgtcagcc 2160tttaagctga ccacagagtt tgtggatgtg atgggcggcc tggatggcga
catgttcaac 2220tactataaga tgctgatgct gcaagggctg attgccgctc ggaaacacat
ggacaaggtg 2280gtgcagatcg tggagatcat gcagcaaggt tctcagcttc cttgcttcca
tggctccagc 2340accattcgaa acctcaaaga gaggttccac atgagcatga ctgaggagca
gctgcagctg 2400ctggtggagc agatggtgga tggcagtatg cggtctatca ccaccaaact
ctatgacggc 2460ttccagtacc tcaccaacgg catcatg
2487
SEQ ID NO: 33; LENGTH: 3324; TYPE: DNA; ORGANISM: Homo sapiens; FEATURE: CDS; LOCATION: (115)..(2601); SEQUENCE: 33
ccggaattcc
gggaaggccg gagcaagttt tgaagaagtc cctatcagat tacacttggt 60tgactactcc
ggagcagcca ctaagaggga tgaacaggcc tgcgtggaaa ttga atg 117
Met
1aga ttc ttg gaa gct cga agt ctg
gct gtg gcc atg gga gat aca gta 165Arg Phe Leu Glu Ala Arg Ser Leu
Ala Val Ala Met Gly Asp Thr Val 5 10
15gtg gag cct gcc ccc ttg aag cca act tct gag ccc act tct ggc
cca 213Val Glu Pro Ala Pro Leu Lys Pro Thr Ser Glu Pro Thr Ser Gly
Pro 20 25 30cca ggg aat aat ggg
ggg tcc ctg cta agt gtc atc acg gag ggg gtc 261Pro Gly Asn Asn Gly
Gly Ser Leu Leu Ser Val Ile Thr Glu Gly Val 35 40
45ggg gaa cta tca gtg att gac cct gag gtg gcc cag aag gcc
tgc cag 309Gly Glu Leu Ser Val Ile Asp Pro Glu Val Ala Gln Lys Ala
Cys Gln 50 55 60 65gag
gtg ttg gag aaa gtc aag ctt ttg cat gga ggc gtg gca gtc tct 357Glu
Val Leu Glu Lys Val Lys Leu Leu His Gly Gly Val Ala Val Ser
70 75 80agc aga ggc acc cca ctg gag
ttg gtc aat ggg gat ggt gtg gac agt 405Ser Arg Gly Thr Pro Leu Glu
Leu Val Asn Gly Asp Gly Val Asp Ser 85 90
95gag atc cgt tgc cta gat gat cca cct gcc cag atc agg gag
gag gaa 453Glu Ile Arg Cys Leu Asp Asp Pro Pro Ala Gln Ile Arg Glu
Glu Glu 100 105 110gat gag atg ggg
gcc gct gtg gcc tca ggc aca gcc aaa gga gca aga 501Asp Glu Met Gly
Ala Ala Val Ala Ser Gly Thr Ala Lys Gly Ala Arg 115
120 125aga cgg cgg cag aac aac tca gct aaa cag tct tgg
ctg ctg agg ctg 549Arg Arg Arg Gln Asn Asn Ser Ala Lys Gln Ser Trp
Leu Leu Arg Leu130 135 140
145ttt gag tca aaa ctg ttt gac atc tcc atg gcc att tca tac ctg tat
597Phe Glu Ser Lys Leu Phe Asp Ile Ser Met Ala Ile Ser Tyr Leu Tyr
150 155 160aac tcc aag gag cct
gga gta caa gcc tac att ggc aac cgg ctc ttc 645Asn Ser Lys Glu Pro
Gly Val Gln Ala Tyr Ile Gly Asn Arg Leu Phe 165
170 175tgc ttt cgc aac gag gac gtg gac ttc tat ctg ccc
cag ttg ctt aac 693Cys Phe Arg Asn Glu Asp Val Asp Phe Tyr Leu Pro
Gln Leu Leu Asn 180 185 190atg tac
atc cac atg gat gag gac gtg ggt gat gcc att aag ccc tac 741Met Tyr
Ile His Met Asp Glu Asp Val Gly Asp Ala Ile Lys Pro Tyr 195
200 205ata gtc cac cgt tgc cgc cag agc att aac ttt
tcc ctc cag tgt gcc 789Ile Val His Arg Cys Arg Gln Ser Ile Asn Phe
Ser Leu Gln Cys Ala210 215 220
225ctg ttg ctt ggg gcc tat tct tca gac atg cac att tcc act caa cga
837Leu Leu Leu Gly Ala Tyr Ser Ser Asp Met His Ile Ser Thr Gln Arg
230 235 240cac tcc cgt ggg acc
aag cta cgg aag ctg atc ctc tca gat gag cta 885His Ser Arg Gly Thr
Lys Leu Arg Lys Leu Ile Leu Ser Asp Glu Leu 245
250 255aag cca gct cac agg aag agg gag ctg ccc tcc ttg
agc ccg gcc cct 933Lys Pro Ala His Arg Lys Arg Glu Leu Pro Ser Leu
Ser Pro Ala Pro 260 265 270gat aca
ggg ctg tct ccc tcc aaa agg act cac cag cgc tct aag tca 981Asp Thr
Gly Leu Ser Pro Ser Lys Arg Thr His Gln Arg Ser Lys Ser 275
280 285gat gcc act gcc agc ata agt ctc agc agc aac
ctg aaa cga aca gcc 1029Asp Ala Thr Ala Ser Ile Ser Leu Ser Ser Asn
Leu Lys Arg Thr Ala290 295 300
305agc aac cct aaa gtg gag aat gag gat gag gag ctc tcc tcc agc acc
1077Ser Asn Pro Lys Val Glu Asn Glu Asp Glu Glu Leu Ser Ser Ser Thr
310 315 320gag agt att gat aat
tca ttc agt tcc cct gtt cga ctg gct cct gag 1125Glu Ser Ile Asp Asn
Ser Phe Ser Ser Pro Val Arg Leu Ala Pro Glu 325
330 335aga gaa ttc atc aag tcc ctg atg gcg atc ggc aag
cgg ctg gcc acg 1173Arg Glu Phe Ile Lys Ser Leu Met Ala Ile Gly Lys
Arg Leu Ala Thr 340 345 350ctc ccc
acc aaa gag cag aaa aca cag agg ctg atc tca gag ctc tcc 1221Leu Pro
Thr Lys Glu Gln Lys Thr Gln Arg Leu Ile Ser Glu Leu Ser 355
360 365ctg ctc aac cat aag ctc cct gcc cga gtc tgg
ctg ccc act gct ggc 1269Leu Leu Asn His Lys Leu Pro Ala Arg Val Trp
Leu Pro Thr Ala Gly370 375 380
385ttt gac cac cac gtg gtc cgt gta ccc cac aca cag gct gtt gtc ctc
1317Phe Asp His His Val Val Arg Val Pro His Thr Gln Ala Val Val Leu
390 395 400aac tcc aag gac aag
gct ccc tac ctg att tat gtg gaa gtc ctt gaa 1365Asn Ser Lys Asp Lys
Ala Pro Tyr Leu Ile Tyr Val Glu Val Leu Glu 405
410 415tgt gaa aac ttt gac acc acc agt gtc cct gcc cgg
atc ccc gag aac 1413Cys Glu Asn Phe Asp Thr Thr Ser Val Pro Ala Arg
Ile Pro Glu Asn 420 425 430cga att
cgg agt acg agg tcc gta gaa aac ttg ccc gaa tgt ggt att 1461Arg Ile
Arg Ser Thr Arg Ser Val Glu Asn Leu Pro Glu Cys Gly Ile 435
440 445acc cat gag cag cga gct ggc agc ttc agc act
gtg ccc aac tat gac 1509Thr His Glu Gln Arg Ala Gly Ser Phe Ser Thr
Val Pro Asn Tyr Asp450 455 460
465aac gat gat gag gcc tgg tcg gtg gat gac ata ggc gag ctg caa gtg
1557Asn Asp Asp Glu Ala Trp Ser Val Asp Asp Ile Gly Glu Leu Gln Val
470 475 480gag ctc ccc gaa gtg
cat acc aac agc tgt gac aac atc tcc cag ttc 1605Glu Leu Pro Glu Val
His Thr Asn Ser Cys Asp Asn Ile Ser Gln Phe 485
490 495tct gtg gac agc atc acc agc cag gag agc aag gag
cct gtg ttc att 1653Ser Val Asp Ser Ile Thr Ser Gln Glu Ser Lys Glu
Pro Val Phe Ile 500 505 510gca gca
ggg gac atc cgc cgg cgc ctt tcg gaa cag ctg gct cat acc 1701Ala Ala
Gly Asp Ile Arg Arg Arg Leu Ser Glu Gln Leu Ala His Thr 515
520 525ccg aca gcc ttc aaa cga gac cca gaa gat cct
tct gca gtt gct ctc 1749Pro Thr Ala Phe Lys Arg Asp Pro Glu Asp Pro
Ser Ala Val Ala Leu530 535 540
545aaa gag ccc tgg cag gag aaa gta cgg cgg atc aga gag ggc tcc ccc
1797Lys Glu Pro Trp Gln Glu Lys Val Arg Arg Ile Arg Glu Gly Ser Pro
550 555 560tac ggc cat ctc ccc
aat tgg cgg ctc ctg tca gtc att gtc aag tgt 1845Tyr Gly His Leu Pro
Asn Trp Arg Leu Leu Ser Val Ile Val Lys Cys 565
570 575ggg gat gac ctt cgg caa gag ctt ctg gcc ttt cag
gtg ttg aag caa 1893Gly Asp Asp Leu Arg Gln Glu Leu Leu Ala Phe Gln
Val Leu Lys Gln 580 585 590ctg cag
tcc att tgg gaa cag gag cga gtg ccc ctt tgg atc aag cca 1941Leu Gln
Ser Ile Trp Glu Gln Glu Arg Val Pro Leu Trp Ile Lys Pro 595
600 605ata caa gat tct tgt gaa att acg act gat agt
ggc atg att gaa cca 1989Ile Gln Asp Ser Cys Glu Ile Thr Thr Asp Ser
Gly Met Ile Glu Pro610 615 620
625gtg gtc aat gct gtg tcc atc cat cag gtg aag aaa cag tca cag ctc
2037Val Val Asn Ala Val Ser Ile His Gln Val Lys Lys Gln Ser Gln Leu
630 635 640tcc ttg ctc gat tac
ttc cta cag gag cac ggc agt tac acc act gag 2085Ser Leu Leu Asp Tyr
Phe Leu Gln Glu His Gly Ser Tyr Thr Thr Glu 645
650 655gca ttc ctc agt gca cag cgc aat ttt gtg caa agt
tgt gct ggg tac 2133Ala Phe Leu Ser Ala Gln Arg Asn Phe Val Gln Ser
Cys Ala Gly Tyr 660 665 670tgc ttg
gtc tgc tac ctg ctg caa gtc aag gac aga cac aat ggg aat 2181Cys Leu
Val Cys Tyr Leu Leu Gln Val Lys Asp Arg His Asn Gly Asn 675
680 685atc ctt ttg gac gca gaa ggc cac atc atc cac
atc gac ttt ggc ttc 2229Ile Leu Leu Asp Ala Glu Gly His Ile Ile His
Ile Asp Phe Gly Phe690 695 700
705atc ctc tcc agc tca ccc cga aat ctg ggc ttt gag acg tca gcc ttt
2277Ile Leu Ser Ser Ser Pro Arg Asn Leu Gly Phe Glu Thr Ser Ala Phe
710 715 720aag ctg acc aca gag
ttt gtg gat gtg atg ggc ggc ctg gat ggc gac 2325Lys Leu Thr Thr Glu
Phe Val Asp Val Met Gly Gly Leu Asp Gly Asp 725
730 735atg ttc aac tac tat aag atg ctg atg ctg caa ggg
ctg att gcc gct 2373Met Phe Asn Tyr Tyr Lys Met Leu Met Leu Gln Gly
Leu Ile Ala Ala 740 745 750cgg aaa
cac atg gac aag gtg gtg cag atc gtg gag atc atg cag caa 2421Arg Lys
His Met Asp Lys Val Val Gln Ile Val Glu Ile Met Gln Gln 755
760 765ggt tct cag ctt cct tgc ttc cat ggc tcc agc
acc att cga aac ctc 2469Gly Ser Gln Leu Pro Cys Phe His Gly Ser Ser
Thr Ile Arg Asn Leu770 775 780
785aaa gag agg ttc cac atg agc atg act gag gag cag ctg cag ctg ctg
2517Lys Glu Arg Phe His Met Ser Met Thr Glu Glu Gln Leu Gln Leu Leu
790 795 800gtg gag cag atg gtg
gat ggc agt atg cgg tct atc acc acc aaa ctc 2565Val Glu Gln Met Val
Asp Gly Ser Met Arg Ser Ile Thr Thr Lys Leu 805
810 815tat gac ggc ttc cag tac ctc acc aac ggc atc atg
tgacacgctc 2611Tyr Asp Gly Phe Gln Tyr Leu Thr Asn Gly Ile Met
820 825ctcagcccag gagtggtggg gggtccaggg caccctccct
agagggccct tgtctgagaa 2671accccaaacc aggaaacccc acctacccaa ccatccaccc
aagggaaatg gaaggcaaga 2731aacacgaagg atcatgtggt aactgcgaga gcttgctgag
gggtgggaga gccagctgtg 2791gggtccagac ttgttggggc ttccctgccc ctcctggtct
gtgtcagtat taccaccaga 2851ctgactccag gactcactgc cctccagaaa acagaggtga
caaatgtgag ggacactggg 2911gcctttcttc tccttgtagg ggtctctcag aggttctttc
cacaggccat cctcttattc 2971cgttctgggg cccaggaagt ggggaagagt aggttctcgg
tacttaggac ttgatcctgt 3031ggttgccact ggccatgctg ctgcccagct ctacccctcc
cagggaccta cccctcccag 3091ggaccgaccc ctggcccaag ctccccttgc tggcgggcgc
tgcgtgggcc ctgcacttgc 3151tgaggttccc catcatgggc aaggcaaggg aattcccaca
gccctccagt gtactgaggg 3211tactggccta gccatgtgga attccctacc ctgactcctt
ccccaaaccc agggaaaaga 3271gctctcaatt ttttattttt aatttttgtt tgaaataaag
tccttagtta gcc 3324
SEQ ID NO: 34; LENGTH: 810; TYPE: PRT; ORGANISM: Homo sapiens; SEQUENCE: 34
Met Pro Met Asp Leu
Ile Leu Val Val Trp Phe Cys Val Cys Thr Ala 1 5
10 15Arg Thr Val Val Gly Phe Gly Met Asp Pro Asp
Leu Gln Met Asp Ile 20 25
30Val Thr Glu Leu Asp Leu Val Asn Thr Thr Leu Gly Val Ala Gln Val
35 40 45Ser Gly Met His Asn Ala Ser Lys
Ala Phe Leu Phe Gln Asp Ile Glu 50 55
60Arg Glu Ile His Ala Ala Pro His Val Ser Glu Lys Leu Ile Gln Leu 65
70 75 80Phe Gln Asn Lys
Ser Glu Phe Thr Ile Leu Ala Thr Val Gln Gln Lys 85
90 95Pro Ser Thr Ser Gly Val Ile Leu Ser Ile
Arg Glu Leu Glu His Ser 100 105
110Tyr Phe Glu Leu Glu Ser Ser Gly Leu Arg Asp Glu Ile Arg Tyr His
115 120 125Tyr Ile His Asn Gly Lys Pro
Arg Thr Glu Ala Leu Pro Tyr Arg Met 130 135
140Ala Asp Gly Gln Trp His Lys Val Ala Leu Ser Val Ser Ala Ser
His145 150 155 160Leu Leu
Leu His Val Asp Cys Asn Arg Ile Tyr Glu Arg Val Ile Asp
165 170 175Pro Pro Asp Thr Asn Leu Pro
Pro Gly Ile Asn Leu Trp Leu Gly Gln 180 185
190Arg Asn Gln Lys His Gly Leu Phe Lys Gly Ile Ile Gln Asp
Gly Lys 195 200 205Ile Ile Phe Met
Pro Asn Gly Tyr Ile Thr Gln Cys Pro Asn Leu Asn 210
215 220His Thr Cys Pro Thr Cys Ser Asp Phe Leu Ser Leu
Val Gln Gly Ile225 230 235
240Met Asp Leu Gln Glu Leu Leu Ala Lys Met Thr Ala Lys Leu Asn Tyr
245 250 255Ala Glu Thr Arg Leu
Ser Gln Leu Glu Asn Cys His Cys Glu Lys Thr 260
265 270Cys Gln Val Ser Gly Leu Leu Tyr Arg Asp Gln Asp
Ser Trp Val Asp 275 280 285Gly Asp
His Cys Arg Asn Cys Thr Cys Lys Ser Gly Ala Val Glu Cys 290
295 300Arg Arg Met Ser Cys Pro Pro Leu Asn Cys Ser
Pro Asp Ser Leu Pro305 310 315
320Val His Ile Ala Gly Gln Cys Cys Lys Val Cys Arg Pro Lys Cys Ile
325 330 335Tyr Gly Gly Lys
Val Leu Ala Glu Gly Gln Arg Ile Leu Thr Lys Ser 340
345 350Cys Arg Glu Cys Arg Gly Gly Val Leu Val Lys
Ile Thr Glu Met Cys 355 360 365Pro
Pro Leu Asn Cys Ser Glu Lys Asp His Ile Leu Pro Glu Asn Gln 370
375 380Cys Cys Arg Val Cys Arg Gly His Asn Phe
Cys Ala Glu Gly Pro Lys385 390 395
400Cys Gly Glu Asn Ser Glu Cys Lys Asn Trp Asn Thr Lys Ala Thr
Cys 405 410 415Glu Cys Lys
Ser Gly Tyr Ile Ser Val Gln Gly Asp Ser Ala Tyr Cys 420
425 430Glu Asp Ile Asp Glu Cys Ala Ala Lys Met
His Tyr Cys His Ala Asn 435 440
445Thr Val Cys Val Asn Leu Pro Gly Leu Tyr Arg Cys Asp Cys Val Pro 450
455 460Gly Tyr Ile Arg Val Asp Asp Phe
Ser Cys Thr Glu His Asp Glu Cys465 470
475 480Gly Ser Gly Gln His Asn Cys Asp Glu Asn Ala Ile
Cys Thr Asn Thr 485 490
495Val Gln Gly His Ser Cys Thr Cys Lys Pro Gly Tyr Val Gly Asn Gly
500 505 510Thr Ile Cys Arg Ala Phe
Cys Glu Glu Gly Cys Arg Tyr Gly Gly Thr 515 520
525Cys Val Ala Pro Asn Lys Cys Val Cys Pro Ser Gly Phe Thr
Gly Ser 530 535 540His Cys Glu Lys Asp
Ile Asp Glu Cys Ser Glu Gly Ile Ile Glu Cys545 550
555 560His Asn His Ser Arg Cys Val Asn Leu Pro
Gly Trp Tyr His Cys Glu 565 570
575Cys Arg Ser Gly Phe His Asp Asp Gly Thr Tyr Ser Leu Ser Gly Glu
580 585 590Ser Cys Ile Asp Ile
Asp Glu Cys Ala Leu Arg Thr His Thr Cys Trp 595
600 605Asn Asp Ser Ala Cys Ile Asn Leu Ala Gly Gly Phe
Asp Cys Leu Cys 610 615 620Pro Ser Gly
Pro Ser Cys Ser Gly Asp Cys Pro His Glu Gly Gly Leu625
630 635 640Lys His Asn Gly Gln Val Trp
Thr Leu Lys Glu Asp Arg Cys Ser Val 645
650 655Cys Ser Cys Lys Asp Gly Lys Ile Phe Cys Arg Arg
Thr Ala Cys Asp 660 665 670Cys
Gln Asn Pro Ser Ala Asp Leu Phe Cys Cys Pro Glu Cys Asp Thr 675
680 685Arg Val Thr Ser Gln Cys Leu Asp Gln
Asn Gly His Lys Leu Tyr Arg 690 695
700Ser Gly Asp Asn Trp Thr His Ser Cys Gln Gln Cys Arg Cys Leu Glu705
710 715 720Gly Glu Val Asp
Cys Trp Pro Leu Thr Cys Pro Asn Leu Ser Cys Glu 725
730 735Tyr Thr Ala Ile Leu Glu Gly Glu Cys Cys
Pro Arg Cys Val Ser Asp 740 745
750Pro Cys Leu Ala Asp Asn Ile Thr Tyr Asp Ile Arg Lys Thr Cys Leu
755 760 765Asp Ser Tyr Gly Val Ser Arg
Leu Ser Gly Ser Val Trp Thr Met Ala 770 775
780Gly Ser Pro Cys Thr Thr Cys Lys Cys Lys Asn Gly Arg Val Cys
Cys785 790 795 800Ser Val
Asp Phe Glu Cys Leu Gln Asn Asn 805
810
SEQ ID NO: 35; LENGTH: 2430; TYPE: DNA; ORGANISM: Homo sapiens; SEQUENCE: 35
atgccgatgg atttgatttt agttgtgtgg ttctgtgtgt
gcactgccag gacagtggtg 60ggctttggga tggaccctga ccttcagatg gatatcgtca
ccgagcttga ccttgtgaac 120accacccttg gagttgctca ggtgtctgga atgcacaatg
ccagcaaagc atttttattt 180caagacatag aaagagagat ccatgcagct cctcatgtga
gtgagaaatt aattcagctg 240ttccagaaca agagtgaatt caccattttg gccactgtac
agcagaagcc atccacttca 300ggagtgatac tgtccattcg agaactggag cacagctatt
ttgaactgga gagcagtggc 360ctgagggatg agattcggta tcactacata cacaatggga
agccaaggac agaggcactt 420ccttaccgca tggcagatgg acaatggcac aaggttgcac
tgtcagttag cgcctctcat 480ctcctgctcc atgtcgactg taacaggatt tatgagcgtg
tgatagaccc tccagatacc 540aaccttcccc caggaatcaa tttatggctt ggccagcgca
accaaaagca tggcttattc 600aaagggatca tccaagatgg gaagatcatc tttatgccga
atggatatat aacacagtgt 660ccaaatctaa atcacacttg cccaacctgc agtgatttct
taagcctggt gcaaggaata 720atggatttac aagagctttt ggccaagatg actgcaaaac
taaattatgc agagacaaga 780cttagtcaat tggaaaactg tcattgtgag aagacttgtc
aagtgagtgg actgctctat 840cgagatcaag actcttgggt agatggtgac cattgcagga
actgcacttg caaaagtggt 900gccgtggaat gccgaaggat gtcctgtccc cctctcaatt
gctccccaga ctccctccca 960gtacacattg ctggccagtg ctgtaaggtc tgccgaccaa
aatgtatcta tggaggaaaa 1020gttcttgcag aaggccagcg gattttaacc aagagctgtc
gggaatgccg aggtggagtt 1080ttagtaaaaa ttacagaaat gtgtcctcct ttgaactgct
cagaaaagga tcacattctt 1140cctgagaatc agtgctgccg tgtctgtaga ggtcataact
tttgtgcaga aggacctaaa 1200tgtggtgaaa actcagagtg caaaaactgg aatacaaaag
ctacttgtga gtgcaagagt 1260ggttacatct ctgtccaggg agactctgcc tactgtgaag
atattgatga gtgtgcagct 1320aagatgcatt actgtcatgc caatactgtg tgtgtcaacc
ttcctgggtt atatcgctgt 1380gactgtgtcc caggatacat tcgtgtggat gacttctctt
gtacagaaca cgatgaatgt 1440ggcagcggcc agcacaactg tgatgagaat gccatctgca
ccaacactgt ccagggacac 1500agctgcacct gcaaaccggg ctacgtgggg aacgggacca
tctgcagagc tttctgtgaa 1560gagggctgca gatacggtgg aacgtgtgtg gctcccaaca
aatgtgtctg tccatctgga 1620ttcacaggaa gccactgcga gaaagatatt gatgaatgtt
cagagggaat cattgagtgc 1680cacaaccatt cccgctgcgt taacctgcca gggtggtacc
actgtgagtg cagaagcggt 1740ttccatgacg atgggaccta ttcactgtcc ggggagtcct
gtattgacat tgatgaatgt 1800gccttaagaa ctcacacctg ttggaacgat tctgcctgca
tcaacctggc agggggtttt 1860gactgtctct gcccctctgg gccctcctgc tctggtgact
gtcctcatga aggggggctg 1920aagcacaatg gccaggtgtg gaccttgaaa gaagacaggt
gttctgtctg ctcctgcaag 1980gatggcaaga tattctgccg acggacagct tgtgattgcc
agaatccaag tgctgaccta 2040ttctgttgcc cagaatgtga caccagagtc acaagtcaat
gtttagacca aaatggtcac 2100aagctgtatc gaagtggaga caattggacc catagctgtc
agcagtgtcg gtgtctggaa 2160ggagaggtag attgctggcc actcacttgc cccaacttga
gctgtgagta tacagctatc 2220ttagaagggg aatgttgtcc ccgctgtgtc agtgacccct
gcctagctga taacatcacc 2280tatgacatca gaaaaacttg cctggacagc tatggtgttt
cacggcttag tggctcagtg 2340tggacgatgg ctggatctcc ctgcacaacc tgtaaatgca
agaatggaag agtctgttgt 2400tctgtggatt ttgagtgtct tcaaaataat
2430
SEQ ID NO: 36; LENGTH: 2977; TYPE: DNA; ORGANISM: Homo sapiens; FEATURE: CDS; LOCATION: (103)..(2532); SEQUENCE: 36
tagcaagttt ggcggctcca agccaggcgc gcctcaggat ccaggctcat ttgcttccac
60ctagcttcgg tgccccctgc taggcgggga ccctcgagag cg atg ccg atg gat
114 Met Pro Met Asp
1ttg att tta gtt gtg tgg ttc tgt
gtg tgc act gcc agg aca gtg gtg 162Leu Ile Leu Val Val Trp Phe Cys
Val Cys Thr Ala Arg Thr Val Val 5 10
15 20ggc ttt ggg atg gac cct gac ctt cag atg gat atc gtc
acc gag ctt 210Gly Phe Gly Met Asp Pro Asp Leu Gln Met Asp Ile Val
Thr Glu Leu 25 30 35gac
ctt gtg aac acc acc ctt gga gtt gct cag gtg tct gga atg cac 258Asp
Leu Val Asn Thr Thr Leu Gly Val Ala Gln Val Ser Gly Met His
40 45 50aat gcc agc aaa gca ttt tta ttt
caa gac ata gaa aga gag atc cat 306Asn Ala Ser Lys Ala Phe Leu Phe
Gln Asp Ile Glu Arg Glu Ile His 55 60
65gca gct cct cat gtg agt gag aaa tta att cag ctg ttc cag aac aag
354Ala Ala Pro His Val Ser Glu Lys Leu Ile Gln Leu Phe Gln Asn Lys
70 75 80agt gaa ttc acc att ttg gcc act
gta cag cag aag cca tcc act tca 402Ser Glu Phe Thr Ile Leu Ala Thr
Val Gln Gln Lys Pro Ser Thr Ser 85 90
95 100gga gtg ata ctg tcc att cga gaa ctg gag cac agc tat
ttt gaa ctg 450Gly Val Ile Leu Ser Ile Arg Glu Leu Glu His Ser Tyr
Phe Glu Leu 105 110 115gag
agc agt ggc ctg agg gat gag att cgg tat cac tac ata cac aat 498Glu
Ser Ser Gly Leu Arg Asp Glu Ile Arg Tyr His Tyr Ile His Asn
120 125 130ggg aag cca agg aca gag gca
ctt cct tac cgc atg gca gat gga caa 546Gly Lys Pro Arg Thr Glu Ala
Leu Pro Tyr Arg Met Ala Asp Gly Gln 135 140
145tgg cac aag gtt gca ctg tca gtt agc gcc tct cat ctc ctg ctc
cat 594Trp His Lys Val Ala Leu Ser Val Ser Ala Ser His Leu Leu Leu
His 150 155 160gtc gac tgt aac agg att
tat gag cgt gtg ata gac cct cca gat acc 642Val Asp Cys Asn Arg Ile
Tyr Glu Arg Val Ile Asp Pro Pro Asp Thr165 170
175 180aac ctt ccc cca gga atc aat tta tgg ctt ggc
cag cgc aac caa aag 690Asn Leu Pro Pro Gly Ile Asn Leu Trp Leu Gly
Gln Arg Asn Gln Lys 185 190
195cat ggc tta ttc aaa ggg atc atc caa gat ggg aag atc atc ttt atg
738His Gly Leu Phe Lys Gly Ile Ile Gln Asp Gly Lys Ile Ile Phe Met
200 205 210ccg aat gga tat ata aca
cag tgt cca aat cta aat cac act tgc cca 786Pro Asn Gly Tyr Ile Thr
Gln Cys Pro Asn Leu Asn His Thr Cys Pro 215 220
225acc tgc agt gat ttc tta agc ctg gtg caa gga ata atg gat
tta caa 834Thr Cys Ser Asp Phe Leu Ser Leu Val Gln Gly Ile Met Asp
Leu Gln 230 235 240gag ctt ttg gcc aag
atg act gca aaa cta aat tat gca gag aca aga 882Glu Leu Leu Ala Lys
Met Thr Ala Lys Leu Asn Tyr Ala Glu Thr Arg245 250
255 260ctt agt caa ttg gaa aac tgt cat tgt gag
aag act tgt caa gtg agt 930Leu Ser Gln Leu Glu Asn Cys His Cys Glu
Lys Thr Cys Gln Val Ser 265 270
275gga ctg ctc tat cga gat caa gac tct tgg gta gat ggt gac cat tgc
978Gly Leu Leu Tyr Arg Asp Gln Asp Ser Trp Val Asp Gly Asp His Cys
280 285 290agg aac tgc act tgc aaa
agt ggt gcc gtg gaa tgc cga agg atg tcc 1026Arg Asn Cys Thr Cys Lys
Ser Gly Ala Val Glu Cys Arg Arg Met Ser 295 300
305tgt ccc cct ctc aat tgc tcc cca gac tcc ctc cca gta cac
att gct 1074Cys Pro Pro Leu Asn Cys Ser Pro Asp Ser Leu Pro Val His
Ile Ala 310 315 320ggc cag tgc tgt aag
gtc tgc cga cca aaa tgt atc tat gga gga aaa 1122Gly Gln Cys Cys Lys
Val Cys Arg Pro Lys Cys Ile Tyr Gly Gly Lys325 330
335 340gtt ctt gca gaa ggc cag cgg att tta acc
aag agc tgt cgg gaa tgc 1170Val Leu Ala Glu Gly Gln Arg Ile Leu Thr
Lys Ser Cys Arg Glu Cys 345 350
355cga ggt gga gtt tta gta aaa att aca gaa atg tgt cct cct ttg aac
1218Arg Gly Gly Val Leu Val Lys Ile Thr Glu Met Cys Pro Pro Leu Asn
360 365 370tgc tca gaa aag gat cac
att ctt cct gag aat cag tgc tgc cgt gtc 1266Cys Ser Glu Lys Asp His
Ile Leu Pro Glu Asn Gln Cys Cys Arg Val 375 380
385tgt aga ggt cat aac ttt tgt gca gaa gga cct aaa tgt ggt
gaa aac 1314Cys Arg Gly His Asn Phe Cys Ala Glu Gly Pro Lys Cys Gly
Glu Asn 390 395 400tca gag tgc aaa aac
tgg aat aca aaa gct act tgt gag tgc aag agt 1362Ser Glu Cys Lys Asn
Trp Asn Thr Lys Ala Thr Cys Glu Cys Lys Ser405 410
415 420ggt tac atc tct gtc cag gga gac tct gcc
tac tgt gaa gat att gat 1410Gly Tyr Ile Ser Val Gln Gly Asp Ser Ala
Tyr Cys Glu Asp Ile Asp 425 430
435gag tgt gca gct aag atg cat tac tgt cat gcc aat act gtg tgt gtc
1458Glu Cys Ala Ala Lys Met His Tyr Cys His Ala Asn Thr Val Cys Val
440 445 450aac ctt cct ggg tta tat
cgc tgt gac tgt gtc cca gga tac att cgt 1506Asn Leu Pro Gly Leu Tyr
Arg Cys Asp Cys Val Pro Gly Tyr Ile Arg 455 460
465gtg gat gac ttc tct tgt aca gaa cac gat gaa tgt ggc agc
ggc cag 1554Val Asp Asp Phe Ser Cys Thr Glu His Asp Glu Cys Gly Ser
Gly Gln 470 475 480cac aac tgt gat gag
aat gcc atc tgc acc aac act gtc cag gga cac 1602His Asn Cys Asp Glu
Asn Ala Ile Cys Thr Asn Thr Val Gln Gly His485 490
495 500agc tgc acc tgc aaa ccg ggc tac gtg ggg
aac ggg acc atc tgc aga 1650Ser Cys Thr Cys Lys Pro Gly Tyr Val Gly
Asn Gly Thr Ile Cys Arg 505 510
515gct ttc tgt gaa gag ggc tgc aga tac ggt gga acg tgt gtg gct ccc
1698Ala Phe Cys Glu Glu Gly Cys Arg Tyr Gly Gly Thr Cys Val Ala Pro
520 525 530aac aaa tgt gtc tgt cca
tct gga ttc aca gga agc cac tgc gag aaa 1746Asn Lys Cys Val Cys Pro
Ser Gly Phe Thr Gly Ser His Cys Glu Lys 535 540
545gat att gat gaa tgt tca gag gga atc att gag tgc cac aac
cat tcc 1794Asp Ile Asp Glu Cys Ser Glu Gly Ile Ile Glu Cys His Asn
His Ser 550 555 560cgc tgc gtt aac ctg
cca ggg tgg tac cac tgt gag tgc aga agc ggt 1842Arg Cys Val Asn Leu
Pro Gly Trp Tyr His Cys Glu Cys Arg Ser Gly565 570
575 580ttc cat gac gat ggg acc tat tca ctg tcc
ggg gag tcc tgt att gac 1890Phe His Asp Asp Gly Thr Tyr Ser Leu Ser
Gly Glu Ser Cys Ile Asp 585 590
595att gat gaa tgt gcc tta aga act cac acc tgt tgg aac gat tct gcc
1938Ile Asp Glu Cys Ala Leu Arg Thr His Thr Cys Trp Asn Asp Ser Ala
600 605 610tgc atc aac ctg gca ggg
ggt ttt gac tgt ctc tgc ccc tct ggg ccc 1986Cys Ile Asn Leu Ala Gly
Gly Phe Asp Cys Leu Cys Pro Ser Gly Pro 615 620
625tcc tgc tct ggt gac tgt cct cat gaa ggg ggg ctg aag cac
aat ggc 2034Ser Cys Ser Gly Asp Cys Pro His Glu Gly Gly Leu Lys His
Asn Gly 630 635 640cag gtg tgg acc ttg
aaa gaa gac agg tgt tct gtc tgc tcc tgc aag 2082Gln Val Trp Thr Leu
Lys Glu Asp Arg Cys Ser Val Cys Ser Cys Lys645 650
655 660gat ggc aag ata ttc tgc cga cgg aca gct
tgt gat tgc cag aat cca 2130Asp Gly Lys Ile Phe Cys Arg Arg Thr Ala
Cys Asp Cys Gln Asn Pro 665 670
675agt gct gac cta ttc tgt tgc cca gaa tgt gac acc aga gtc aca agt
2178Ser Ala Asp Leu Phe Cys Cys Pro Glu Cys Asp Thr Arg Val Thr Ser
680 685 690caa tgt tta gac caa aat
ggt cac aag ctg tat cga agt gga gac aat 2226Gln Cys Leu Asp Gln Asn
Gly His Lys Leu Tyr Arg Ser Gly Asp Asn 695 700
705tgg acc cat agc tgt cag cag tgt cgg tgt ctg gaa gga gag
gta gat 2274Trp Thr His Ser Cys Gln Gln Cys Arg Cys Leu Glu Gly Glu
Val Asp 710 715 720tgc tgg cca ctc act
tgc ccc aac ttg agc tgt gag tat aca gct atc 2322Cys Trp Pro Leu Thr
Cys Pro Asn Leu Ser Cys Glu Tyr Thr Ala Ile725 730
735 740tta gaa ggg gaa tgt tgt ccc cgc tgt gtc
agt gac ccc tgc cta gct 2370Leu Glu Gly Glu Cys Cys Pro Arg Cys Val
Ser Asp Pro Cys Leu Ala 745 750
755gat aac atc acc tat gac atc aga aaa act tgc ctg gac agc tat ggt
2418Asp Asn Ile Thr Tyr Asp Ile Arg Lys Thr Cys Leu Asp Ser Tyr Gly
760 765 770gtt tca cgg ctt agt ggc
tca gtg tgg acg atg gct gga tct ccc tgc 2466Val Ser Arg Leu Ser Gly
Ser Val Trp Thr Met Ala Gly Ser Pro Cys 775 780
785aca acc tgt aaa tgc aag aat gga aga gtc tgt tgt tct gtg
gat ttt 2514Thr Thr Cys Lys Cys Lys Asn Gly Arg Val Cys Cys Ser Val
Asp Phe 790 795 800gag tgt ctt caa aat
aat tgaagtattt acagtggact caacgcagaa 2562Glu Cys Leu Gln Asn
Asn805 810gaatggacga aatgaccatc caacgtgatt aaggatagga
atcggtagtt tggttttttt 2622gtttgttttg tttttttaac cacagataat tgccaaagtt
tccacctgag gacggtgttt 2682cggaggttgc cttttggacc taccactttg ctcattcttg
ctaacctagt ctaggtgacc 2742tacagtgccg tgcatttaag tcaatggttg ttaaaagaag
tttcccgtgt tgtaaatcat 2802gtttccctta tcagatcatt tgcaaataca tttaaatgat
ctcatggtaa atggttgatg 2862tattttttgg gtttattttg tgtactaacc ataatagaga
gagactcagc tccttttatt 2922tattttgttg atttatggat caaattctaa aataaagttg
cctgttgtga ctttt 2977
SEQ ID NO: 37; LENGTH: 816; TYPE: PRT; ORGANISM: Homo sapiens; SEQUENCE: 37
Met Glu Ser Arg Val
Leu Leu Arg Thr Phe Cys Leu Ile Phe Gly Leu 1 5
10 15Gly Ala Val Trp Gly Leu Gly Val Asp Pro Ser
Leu Gln Ile Asp Val 20 25
30Leu Thr Glu Leu Glu Leu Gly Glu Ser Thr Thr Gly Val Arg Gln Val
35 40 45Pro Gly Leu His Asn Gly Thr Lys
Ala Phe Leu Phe Gln Asp Thr Pro 50 55
60Arg Ser Ile Lys Ala Ser Thr Ala Thr Ala Glu Gln Phe Phe Gln Lys 65
70 75 80Leu Arg Asn Lys
His Glu Phe Thr Ile Leu Val Thr Leu Lys Gln Thr 85
90 95His Leu Asn Ser Gly Val Ile Leu Ser Ile
His His Leu Asp His Arg 100 105
110Tyr Leu Glu Leu Glu Ser Ser Gly His Arg Asn Glu Val Arg Leu His
115 120 125Tyr Arg Ser Gly Ser His Arg
Pro His Thr Glu Val Phe Pro Tyr Ile 130 135
140Leu Ala Asp Asp Lys Trp His Lys Leu Ser Leu Ala Ile Ser Ala
Ser145 150 155 160His Leu
Ile Leu His Ile Asp Cys Asn Lys Ile Tyr Glu Arg Val Val
165 170 175Glu Lys Pro Ser Thr Asp Leu
Pro Leu Gly Thr Thr Phe Trp Leu Gly 180 185
190Gln Arg Asn Asn Ala His Gly Tyr Phe Lys Gly Ile Met Gln
Asp Val 195 200 205Gln Leu Leu Val
Met Pro Gln Gly Phe Ile Ala Gln Cys Pro Asp Leu 210
215 220Asn Arg Thr Cys Pro Thr Cys Asn Asp Phe His Gly
Leu Val Gln Lys225 230 235
240Ile Met Glu Leu Gln Asp Ile Leu Ala Lys Thr Ser Ala Lys Leu Ser
245 250 255Arg Ala Glu Gln Arg
Met Asn Arg Leu Asp Gln Cys Tyr Cys Glu Arg 260
265 270Thr Cys Thr Met Lys Gly Thr Thr Tyr Arg Glu Phe
Glu Ser Trp Ile 275 280 285Asp Gly
Cys Lys Asn Cys Thr Cys Leu Asn Gly Thr Ile Gln Cys Glu 290
295 300Thr Leu Ile Cys Pro Asn Pro Asp Cys Pro Leu
Lys Ser Ala Leu Ala305 310 315
320Tyr Val Asp Gly Lys Cys Cys Lys Glu Cys Lys Ser Ile Cys Gln Phe
325 330 335Gln Gly Arg Thr
Tyr Phe Glu Gly Glu Arg Asn Thr Val Tyr Ser Ser 340
345 350Ser Gly Val Cys Val Leu Tyr Glu Cys Lys Asp
Gln Thr Met Lys Leu 355 360 365Val
Glu Ser Ser Gly Cys Pro Ala Leu Asp Cys Pro Glu Ser His Gln 370
375 380Ile Thr Leu Ser His Ser Cys Cys Lys Val
Cys Lys Gly Tyr Asp Phe385 390 395
400Cys Ser Glu Arg His Asn Cys Met Glu Asn Ser Ile Cys Arg Asn
Leu 405 410 415Asn Asp Arg
Ala Val Cys Ser Cys Arg Asp Gly Phe Arg Ala Leu Arg 420
425 430Glu Asp Asn Ala Tyr Cys Glu Asp Ile Asp
Glu Cys Ala Glu Gly Arg 435 440
445His Tyr Cys Arg Glu Asn Thr Met Cys Val Asn Thr Pro Gly Ser Phe 450
455 460Met Cys Ile Cys Lys Thr Gly Tyr
Ile Arg Ile Asp Asp Tyr Ser Cys465 470
475 480Thr Glu His Asp Glu Cys Ile Thr Asn Gln His Asn
Cys Asp Glu Asn 485 490
495Ala Leu Cys Phe Asn Thr Val Gly Gly His Asn Cys Val Cys Lys Pro
500 505 510Gly Tyr Thr Gly Asn Gly
Thr Thr Cys Lys Ala Phe Cys Lys Asp Gly 515 520
525Cys Arg Asn Gly Gly Ala Cys Ile Ala Ala Asn Val Cys Ala
Cys Pro 530 535 540Gln Gly Phe Thr Gly
Pro Ser Cys Glu Thr Asp Ile Asp Glu Cys Ser545 550
555 560Asp Gly Phe Val Gln Cys Asp Ser Arg Ala
Asn Cys Ile Asn Leu Pro 565 570
575Gly Trp Tyr His Cys Glu Cys Arg Asp Gly Tyr His Asp Asn Gly Met
580 585 590Phe Ser Pro Ser Gly
Glu Ser Cys Glu Asp Ile Asp Glu Cys Gly Thr 595
600 605Gly Arg His Ser Cys Ala Asn Asp Thr Ile Cys Phe
Asn Leu Asp Gly 610 615 620Gly Tyr Asp
Cys Arg Cys Pro His Gly Lys Asn Cys Thr Gly Asp Cys625
630 635 640Ile His Asp Gly Lys Val Lys
His Asn Gly Gln Ile Trp Val Leu Glu 645
650 655Asn Asp Arg Cys Ser Val Cys Ser Cys Gln Asn Gly
Phe Val Met Cys 660 665 670Arg
Arg Met Val Cys Asp Cys Glu Asn Pro Thr Val Asp Leu Phe Cys 675
680 685Cys Pro Glu Cys Asp Pro Arg Leu Ser
Ser Gln Cys Leu His Gln Asn 690 695
700Gly Glu Thr Leu Tyr Asn Ser Gly Asp Thr Trp Val Gln Asn Cys Gln705
710 715 720Gln Cys Arg Cys
Leu Gln Gly Glu Val Asp Cys Trp Pro Leu Pro Cys 725
730 735Pro Asp Val Glu Cys Glu Phe Ser Ile Leu
Pro Glu Asn Glu Cys Cys 740 745
750Pro Arg Cys Val Thr Asp Pro Cys Gln Ala Asp Thr Ile Arg Asn Asp
755 760 765Ile Thr Lys Thr Cys Leu Asp
Glu Met Asn Val Val Arg Phe Thr Gly 770 775
780Ser Ser Trp Ile Lys His Gly Thr Glu Cys Thr Leu Cys Gln Cys
Lys785 790 795 800Asn Gly
His Ile Cys Cys Ser Val Asp Pro Gln Cys Leu Gln Glu Leu
805 810 815382448DNAHomo sapiens
38atggagtctc gggtcttact gagaacattc tgtttgatct tcggtctcgg agcagtttgg
60gggcttggtg tggacccttc cctacagatt gacgtcttaa cagagttaga acttggggag
120tccacgaccg gagtgcgtca ggtcccgggg ctgcataatg ggacgaaagc ctttctcttt
180caagatactc ccagaagcat aaaagcatcc actgctacag ctgaacagtt ttttcagaag
240ctgagaaata aacatgaatt tactattttg gtgaccctaa aacagaccca cttaaattca
300ggagttattc tctcaattca ccacttggat cacaggtacc tggaactgga aagtagtggc
360catcggaatg aagtcagact gcattaccgc tcaggcagtc accgccctca cacagaagtg
420tttccttaca ttttggctga tgacaagtgg cacaagctct ccttagccat cagtgcttcc
480catttgattt tacacattga ctgcaataaa atttatgaaa gggtagtaga aaagccctcc
540acagacttgc ctctaggcac aacattttgg ctaggacaga gaaataatgc gcatggatat
600tttaagggta taatgcaaga tgtccaatta cttgtcatgc cccagggatt tattgctcag
660tgcccagatc ttaatcgcac ctgtccaact tgcaatgact tccatggact tgtgcagaaa
720atcatggagc tacaggatat tttagccaaa acatcagcca agctgtctcg agctgaacag
780cgaatgaata gattggatca gtgctattgt gaaaggactt gcaccatgaa gggaaccacc
840taccgagaat ttgagtcctg gatagacggc tgtaagaact gcacatgcct gaatggaacc
900atccagtgtg aaactctaat ctgcccaaat cctgactgcc cacttaagtc ggctcttgcg
960tatgtggatg gcaaatgctg taaggaatgc aaatcgatat gccaatttca aggacgaacc
1020tactttgaag gagaaagaaa tacagtctat tcctcttctg gagtatgtgt tctctatgag
1080tgcaaggacc agaccatgaa acttgttgag agttcaggct gtccagcttt ggattgtcca
1140gagtctcatc agataacctt gtctcacagc tgttgcaaag tttgtaaagg ttatgacttt
1200tgttctgaaa ggcataactg catggagaat tccatctgca gaaatctgaa tgacagggct
1260gtttgtagct gtcgagatgg ttttagggct cttcgagagg ataatgccta ctgtgaagac
1320atcgatgagt gtgctgaagg gcgccattac tgtcgtgaaa atacaatgtg tgtcaacacc
1380ccgggttctt ttatgtgcat ctgcaaaact ggatacatca gaattgatga ttattcatgt
1440acagaacatg atgagtgtat cacaaatcag cacaactgtg atgaaaatgc tttatgcttc
1500aacactgttg gaggacacaa ctgtgtttgc aagccgggct atacagggaa tggaacgaca
1560tgcaaagcat tttgcaaaga tggctgtagg aatggaggag cctgtattgc cgctaatgtg
1620tgtgcctgcc cacaaggctt cactggaccc agctgtgaaa cggacattga tgaatgctct
1680gatggttttg ttcaatgtga cagtcgtgct aattgcatta acctgcctgg atggtaccac
1740tgtgagtgca gagatggcta ccatgacaat gggatgtttt caccaagtgg agaatcgtgt
1800gaagatattg atgagtgtgg gaccgggagg cacagctgtg ccaatgatac catttgcttc
1860aatttggatg gcggatatga ttgtcgatgt cctcatggaa agaattgcac aggggactgc
1920atccatgatg gaaaagttaa gcacaatggt cagatttggg tgttggaaaa tgacaggtgc
1980tctgtgtgct catgtcagaa tggattcgtt atgtgtcgac ggatggtctg tgactgtgag
2040aatcccacag ttgatctttt ttgctgccct gaatgtgacc caaggcttag tagtcagtgc
2100ctccatcaaa atggggaaac tttgtataac agtggtgaca cctgggtcca gaattgtcaa
2160cagtgccgct gcttgcaagg ggaagttgat tgttggcccc tgccttgccc agatgtggag
2220tgtgaattca gcattctccc agagaatgag tgctgcccgc gctgtgtcac agacccttgc
2280caggctgaca ccatccgcaa tgacatcacc aagacttgcc tggacgaaat gaatgtggtt
2340cgcttcaccg ggtcctcttg gatcaaacat ggcactgagt gtactctctg ccagtgcaag
2400aatggccaca tctgttgctc agtggatcca cagtgccttc aggaactg
2448393198DNAHomo sapiensCDS(97)..(2544) 39ttgggaggag cagtctctcc
gctcgtctcc cggagctttc tccattgtct ctgcctttac 60aacagaggga gacgatggac
tgagctgatc cgcacc atg gag tct cgg gtc tta 114
Met Glu Ser Arg Val Leu
1 5ctg aga aca ttc tgt ttg atc ttc ggt ctc gga gca
gtt tgg ggg ctt 162Leu Arg Thr Phe Cys Leu Ile Phe Gly Leu Gly Ala
Val Trp Gly Leu 10 15 20ggt
gtg gac cct tcc cta cag att gac gtc tta aca gag tta gaa ctt 210Gly
Val Asp Pro Ser Leu Gln Ile Asp Val Leu Thr Glu Leu Glu Leu 25
30 35ggg gag tcc acg acc gga gtg cgt cag
gtc ccg ggg ctg cat aat ggg 258Gly Glu Ser Thr Thr Gly Val Arg Gln
Val Pro Gly Leu His Asn Gly 40 45
50acg aaa gcc ttt ctc ttt caa gat act ccc aga agc ata aaa gca tcc
306Thr Lys Ala Phe Leu Phe Gln Asp Thr Pro Arg Ser Ile Lys Ala Ser 55
60 65 70act gct aca gct
gaa cag ttt ttt cag aag ctg aga aat aaa cat gaa 354Thr Ala Thr Ala
Glu Gln Phe Phe Gln Lys Leu Arg Asn Lys His Glu 75
80 85ttt act att ttg gtg acc cta aaa cag acc
cac tta aat tca gga gtt 402Phe Thr Ile Leu Val Thr Leu Lys Gln Thr
His Leu Asn Ser Gly Val 90 95
100att ctc tca att cac cac ttg gat cac agg tac ctg gaa ctg gaa agt
450Ile Leu Ser Ile His His Leu Asp His Arg Tyr Leu Glu Leu Glu Ser
105 110 115agt ggc cat cgg aat gaa gtc
aga ctg cat tac cgc tca ggc agt cac 498Ser Gly His Arg Asn Glu Val
Arg Leu His Tyr Arg Ser Gly Ser His 120 125
130cgc cct cac aca gaa gtg ttt cct tac att ttg gct gat gac aag tgg
546Arg Pro His Thr Glu Val Phe Pro Tyr Ile Leu Ala Asp Asp Lys Trp135
140 145 150cac aag ctc tcc
tta gcc atc agt gct tcc cat ttg att tta cac att 594His Lys Leu Ser
Leu Ala Ile Ser Ala Ser His Leu Ile Leu His Ile 155
160 165gac tgc aat aaa att tat gaa agg gta gta
gaa aag ccc tcc aca gac 642Asp Cys Asn Lys Ile Tyr Glu Arg Val Val
Glu Lys Pro Ser Thr Asp 170 175
180ttg cct cta ggc aca aca ttt tgg cta gga cag aga aat aat gcg cat
690Leu Pro Leu Gly Thr Thr Phe Trp Leu Gly Gln Arg Asn Asn Ala His
185 190 195gga tat ttt aag ggt ata atg
caa gat gtc caa tta ctt gtc atg ccc 738Gly Tyr Phe Lys Gly Ile Met
Gln Asp Val Gln Leu Leu Val Met Pro 200 205
210cag gga ttt att gct cag tgc cca gat ctt aat cgc acc tgt cca act
786Gln Gly Phe Ile Ala Gln Cys Pro Asp Leu Asn Arg Thr Cys Pro Thr215
220 225 230tgc aat gac ttc
cat gga ctt gtg cag aaa atc atg gag cta cag gat 834Cys Asn Asp Phe
His Gly Leu Val Gln Lys Ile Met Glu Leu Gln Asp 235
240 245att tta gcc aaa aca tca gcc aag ctg tct
cga gct gaa cag cga atg 882Ile Leu Ala Lys Thr Ser Ala Lys Leu Ser
Arg Ala Glu Gln Arg Met 250 255
260aat aga ttg gat cag tgc tat tgt gaa agg act tgc acc atg aag gga
930Asn Arg Leu Asp Gln Cys Tyr Cys Glu Arg Thr Cys Thr Met Lys Gly
265 270 275acc acc tac cga gaa ttt gag
tcc tgg ata gac ggc tgt aag aac tgc 978Thr Thr Tyr Arg Glu Phe Glu
Ser Trp Ile Asp Gly Cys Lys Asn Cys 280 285
290aca tgc ctg aat gga acc atc cag tgt gaa act cta atc tgc cca aat
1026Thr Cys Leu Asn Gly Thr Ile Gln Cys Glu Thr Leu Ile Cys Pro Asn295
300 305 310cct gac tgc cca
ctt aag tcg gct ctt gcg tat gtg gat ggc aaa tgc 1074Pro Asp Cys Pro
Leu Lys Ser Ala Leu Ala Tyr Val Asp Gly Lys Cys 315
320 325tgt aag gaa tgc aaa tcg ata tgc caa ttt
caa gga cga acc tac ttt 1122Cys Lys Glu Cys Lys Ser Ile Cys Gln Phe
Gln Gly Arg Thr Tyr Phe 330 335
340gaa gga gaa aga aat aca gtc tat tcc tct tct gga gta tgt gtt ctc
1170Glu Gly Glu Arg Asn Thr Val Tyr Ser Ser Ser Gly Val Cys Val Leu
345 350 355tat gag tgc aag gac cag acc
atg aaa ctt gtt gag agt tca ggc tgt 1218Tyr Glu Cys Lys Asp Gln Thr
Met Lys Leu Val Glu Ser Ser Gly Cys 360 365
370cca gct ttg gat tgt cca gag tct cat cag ata acc ttg tct cac agc
1266Pro Ala Leu Asp Cys Pro Glu Ser His Gln Ile Thr Leu Ser His Ser375
380 385 390tgt tgc aaa gtt
tgt aaa ggt tat gac ttt tgt tct gaa agg cat aac 1314Cys Cys Lys Val
Cys Lys Gly Tyr Asp Phe Cys Ser Glu Arg His Asn 395
400 405tgc atg gag aat tcc atc tgc aga aat ctg
aat gac agg gct gtt tgt 1362Cys Met Glu Asn Ser Ile Cys Arg Asn Leu
Asn Asp Arg Ala Val Cys 410 415
420agc tgt cga gat ggt ttt agg gct ctt cga gag gat aat gcc tac tgt
1410Ser Cys Arg Asp Gly Phe Arg Ala Leu Arg Glu Asp Asn Ala Tyr Cys
425 430 435gaa gac atc gat gag tgt gct
gaa ggg cgc cat tac tgt cgt gaa aat 1458Glu Asp Ile Asp Glu Cys Ala
Glu Gly Arg His Tyr Cys Arg Glu Asn 440 445
450aca atg tgt gtc aac acc ccg ggt tct ttt atg tgc atc tgc aaa act
1506Thr Met Cys Val Asn Thr Pro Gly Ser Phe Met Cys Ile Cys Lys Thr455
460 465 470gga tac atc aga
att gat gat tat tca tgt aca gaa cat gat gag tgt 1554Gly Tyr Ile Arg
Ile Asp Asp Tyr Ser Cys Thr Glu His Asp Glu Cys 475
480 485atc aca aat cag cac aac tgt gat gaa aat
gct tta tgc ttc aac act 1602Ile Thr Asn Gln His Asn Cys Asp Glu Asn
Ala Leu Cys Phe Asn Thr 490 495
500gtt gga gga cac aac tgt gtt tgc aag ccg ggc tat aca ggg aat gga
1650Val Gly Gly His Asn Cys Val Cys Lys Pro Gly Tyr Thr Gly Asn Gly
505 510 515acg aca tgc aaa gca ttt tgc
aaa gat ggc tgt agg aat gga gga gcc 1698Thr Thr Cys Lys Ala Phe Cys
Lys Asp Gly Cys Arg Asn Gly Gly Ala 520 525
530tgt att gcc gct aat gtg tgt gcc tgc cca caa ggc ttc act gga ccc
1746Cys Ile Ala Ala Asn Val Cys Ala Cys Pro Gln Gly Phe Thr Gly Pro535
540 545 550agc tgt gaa acg
gac att gat gaa tgc tct gat ggt ttt gtt caa tgt 1794Ser Cys Glu Thr
Asp Ile Asp Glu Cys Ser Asp Gly Phe Val Gln Cys 555
560 565gac agt cgt gct aat tgc att aac ctg cct
gga tgg tac cac tgt gag 1842Asp Ser Arg Ala Asn Cys Ile Asn Leu Pro
Gly Trp Tyr His Cys Glu 570 575
580tgc aga gat ggc tac cat gac aat ggg atg ttt tca cca agt gga gaa
1890Cys Arg Asp Gly Tyr His Asp Asn Gly Met Phe Ser Pro Ser Gly Glu
585 590 595tcg tgt gaa gat att gat gag
tgt ggg acc ggg agg cac agc tgt gcc 1938Ser Cys Glu Asp Ile Asp Glu
Cys Gly Thr Gly Arg His Ser Cys Ala 600 605
610aat gat acc att tgc ttc aat ttg gat ggc gga tat gat tgt cga tgt
1986Asn Asp Thr Ile Cys Phe Asn Leu Asp Gly Gly Tyr Asp Cys Arg Cys615
620 625 630cct cat gga aag
aat tgc aca ggg gac tgc atc cat gat gga aaa gtt 2034Pro His Gly Lys
Asn Cys Thr Gly Asp Cys Ile His Asp Gly Lys Val 635
640 645aag cac aat ggt cag att tgg gtg ttg gaa
aat gac agg tgc tct gtg 2082Lys His Asn Gly Gln Ile Trp Val Leu Glu
Asn Asp Arg Cys Ser Val 650 655
660tgc tca tgt cag aat gga ttc gtt atg tgt cga cgg atg gtc tgt gac
2130Cys Ser Cys Gln Asn Gly Phe Val Met Cys Arg Arg Met Val Cys Asp
665 670 675tgt gag aat ccc aca gtt gat
ctt ttt tgc tgc cct gaa tgt gac cca 2178Cys Glu Asn Pro Thr Val Asp
Leu Phe Cys Cys Pro Glu Cys Asp Pro 680 685
690agg ctt agt agt cag tgc ctc cat caa aat ggg gaa act ttg tat aac
2226Arg Leu Ser Ser Gln Cys Leu His Gln Asn Gly Glu Thr Leu Tyr Asn695
700 705 710agt ggt gac acc
tgg gtc cag aat tgt caa cag tgc cgc tgc ttg caa 2274Ser Gly Asp Thr
Trp Val Gln Asn Cys Gln Gln Cys Arg Cys Leu Gln 715
720 725ggg gaa gtt gat tgt tgg ccc ctg cct tgc
cca gat gtg gag tgt gaa 2322Gly Glu Val Asp Cys Trp Pro Leu Pro Cys
Pro Asp Val Glu Cys Glu 730 735
740ttc agc att ctc cca gag aat gag tgc tgc ccg cgc tgt gtc aca gac
2370Phe Ser Ile Leu Pro Glu Asn Glu Cys Cys Pro Arg Cys Val Thr Asp
745 750 755cct tgc cag gct gac acc atc
cgc aat gac atc acc aag act tgc ctg 2418Pro Cys Gln Ala Asp Thr Ile
Arg Asn Asp Ile Thr Lys Thr Cys Leu 760 765
770gac gaa atg aat gtg gtt cgc ttc acc ggg tcc tct tgg atc aaa cat
2466Asp Glu Met Asn Val Val Arg Phe Thr Gly Ser Ser Trp Ile Lys His775
780 785 790ggc act gag tgt
act ctc tgc cag tgc aag aat ggc cac atc tgt tgc 2514Gly Thr Glu Cys
Thr Leu Cys Gln Cys Lys Asn Gly His Ile Cys Cys 795
800 805tca gtg gat cca cag tgc ctt cag gaa ctg
tgaagttaac tgtctcatgg 2564Ser Val Asp Pro Gln Cys Leu Gln Glu Leu
810 815gagatttctg ttaaaagaat gttctttcat
taaaagacca aaaagaagtt aaaacttaaa 2624ttgggtgatt tgtgggcagc taaatgcagc
tttgttaata gctgagtgaa ctttcaatta 2684tgaaatttgt ggagcttgac aaaatcacaa
aaggaaaatt actggggcaa aattagacct 2744caagtctgcc tctactgtgt ctcacatcac
catgtagaag aatgggcgta cagtatatac 2804cgtgacatcc tgaaccctgg atagaaagcc
tgagcccatt ggatctgtga aagcctctag 2864cttcactggt gcagaaaatt ttcctctaga
tcagaatctt cagaatcagt taggttcctc 2924actgcaagaa ataaaatgtc aggcagtgaa
tgaattatat tttcagaagt aaagcaaaga 2984agctataaca tgttatgtac agtacactct
gaaaagaaat ctgaaacaag ttattgtaat 3044gataaaaata atgcacaggc atggttactt
aatattttct aacaggaaaa gtcatcccta 3104tttccttgtt ttactgcact taatattatt
tggttgaatt tgttcagtat aagctcgttc 3164ttgtgcaaaa ttaaataaat atttctctta
cctt 319840499PRTHomo sapiens 40Met Glu Leu
Ser Glu Pro Val Val Glu Asn Gly Glu Val Glu Met Ala 1 5
10 15Leu Glu Glu Ser Trp Glu His Ser Lys
Glu Val Ser Glu Ala Glu Pro 20 25
30Gly Gly Gly Ser Ser Gly Asp Ser Gly Pro Pro Glu Glu Ser Gly Gln
35 40 45Glu Met Met Glu Glu Lys
Glu Glu Ile Arg Lys Ser Lys Ser Val Ile 50 55
60Val Pro Ser Gly Ala Pro Lys Lys Glu His Val Asn Val Val Phe
Ile 65 70 75 80Gly His
Val Asp Ala Gly Lys Ser Thr Ile Gly Gly Gln Ile Met Phe
85 90 95Leu Thr Gly Met Ala Asp Lys Arg
Thr Leu Glu Lys Tyr Glu Arg Glu 100 105
110Ala Glu Glu Lys Asn Arg Glu Thr Trp Tyr Leu Ser Trp Ala Leu
Asp 115 120 125Thr Asn Gln Glu Glu
Arg Asp Lys Gly Lys Thr Val Glu Val Gly Arg 130 135
140Ala Tyr Phe Glu Thr Glu Arg Lys His Phe Thr Ile Leu Asp
Ala Pro145 150 155 160Gly
His Lys Ser Phe Val Pro Asn Met Ile Gly Gly Ala Ser Gln Ala
165 170 175Asp Leu Ala Val Leu Val Ile
Ser Ala Arg Lys Gly Glu Phe Glu Thr 180 185
190Gly Phe Glu Lys Gly Gly Gln Thr Arg Glu His Ala Met Phe
Gly Lys 195 200 205Thr Ala Gly Val
Lys His Leu Ile Val Leu Ile Asn Lys Met Asp Asp 210
215 220Pro Thr Val Asn Trp Gly Ile Glu Arg Tyr Glu Glu
Cys Lys Glu Lys225 230 235
240Leu Val Pro Phe Leu Lys Lys Val Gly Phe Ser Pro Lys Lys Asp Ile
245 250 255His Phe Met Pro Cys
Ser Gly Leu Thr Gly Ala Asn Ile Lys Glu Gln 260
265 270Ser Asp Phe Cys Pro Trp Tyr Thr Gly Leu Pro Phe
Ile Pro Tyr Leu 275 280 285Asn Asn
Leu Pro Asn Phe Asn Arg Ser Ile Asp Gly Pro Ile Arg Leu 290
295 300Pro Ile Val Asp Lys Tyr Lys Asp Met Gly Thr
Val Val Leu Gly Lys305 310 315
320Leu Glu Ser Gly Ser Ile Phe Lys Gly Gln Gln Leu Val Met Met Pro
325 330 335Asn Lys His Asn
Val Glu Val Leu Gly Ile Leu Ser Asp Asp Thr Glu 340
345 350Thr Asp Phe Val Ala Pro Gly Glu Asn Leu Lys
Ile Arg Leu Lys Gly 355 360 365Ile
Glu Glu Glu Glu Ile Leu Pro Glu Phe Ile Leu Cys Asp Pro Ser 370
375 380Asn Leu Cys His Ser Gly Arg Thr Phe Asp
Val Gln Ile Val Ile Ile385 390 395
400Glu His Lys Ser Ile Ile Cys Pro Gly Tyr Asn Ala Val Leu His
Ile 405 410 415His Thr Cys
Ile Glu Glu Val Glu Ile Thr Ala Leu Ile Ser Leu Val 420
425 430Asp Lys Lys Ser Gly Glu Lys Ser Lys Thr
Arg Pro Arg Phe Val Lys 435 440
445Gln Asp Gln Val Cys Ile Ala Arg Leu Arg Thr Ala Gly Thr Ile Cys 450
455 460Leu Glu Thr Phe Lys Asp Phe Pro
Gln Met Gly Arg Phe Thr Leu Arg465 470
475 480Asp Glu Gly Lys Thr Ile Ala Ile Gly Lys Val Leu
Lys Leu Val Pro 485 490
495Glu Lys Asp411497DNAHomo sapiens 41atggaacttt cagaacctgt tgtagaaaat
ggagaggtgg aaatggccct agaagaatca 60tgggagcaca gtaaagaagt aagtgaagcc
gagcctgggg gtggttcctc gggagattca 120gggcccccag aagaaagtgg ccaggaaatg
atggaggaaa aagaggaaat aagaaaatcc 180aaatctgtga tcgtaccctc aggtgcacct
aagaaagaac acgtaaatgt agtattcatt 240ggccatgtag acgctggcaa gtcaaccatc
ggaggacaga taatgttttt gactggaatg 300gctgacaaaa gaacactgga gaaatatgaa
agagaagctg aggaaaaaaa cagagaaacc 360tggtatttgt cctgggcctt agatacaaat
caggaggaac gagacaaggg taaaacagtc 420gaagtgggtc gtgcctattt tgaaacagaa
aggaaacatt tcacaatttt agatgcccct 480ggccacaaga gttttgtccc aaatatgatt
ggtggtgctt ctcaagctga tttggctgtg 540ctggtcatct ctgccaggaa aggagagttt
gaaactggat ttgaaaaagg tggacagaca 600agagaacatg cgatgtttgg caaaacggca
ggagtaaaac atttaatagt gcttattaat 660aagatggatg atcccacagt aaattggggc
atcgagagat atgaagaatg taaagaaaaa 720ctggtgccct ttttgaaaaa agtaggcttt
agtccaaaaa aggacattca ctttatgccc 780tgctcaggac tgaccggagc aaatattaaa
gagcagtcag atttctgccc ttggtacact 840ggattaccat ttattccgta tttgaataac
ttgccaaact tcaacagatc aattgatgga 900ccaataagac tgccaattgt ggataagtac
aaagatatgg gcactgtggt cctgggaaag 960ctggaatccg ggtccatttt taaaggccag
cagctcgtga tgatgccaaa caagcacaat 1020gtagaagttc ttggaatact ttctgatgat
actgaaactg attttgtagc cccaggtgaa 1080aacctcaaaa tcagactgaa gggaattgaa
gaagaagaga ttcttccaga attcatactt 1140tgtgatccta gtaacctctg ccattctgga
cgcacgtttg atgttcagat agtgattatt 1200gagcacaaat ccatcatctg cccaggttat
aatgcggtgc tgcacattca tacttgtatt 1260gaggaagttg agataacagc gttaatctcc
ttggtagaca aaaaatcagg ggaaaaaagt 1320aagacacgac cccgcttcgt gaaacaagat
caagtatgca ttgctcgttt aaggacagca 1380ggaaccatct gcctcgagac gttcaaagat
tttcctcaga tgggtcgttt tactttaaga 1440gatgagggta agaccattgc aattggaaaa
gttctgaaat tggtcccaga gaaggac 1497422057DNAHomo
sapiensCDS(144)..(1640) 42tcccggccgg ctccggcagc aacgatgaag cctgcaccgg
cgcgggatac cctcaaggta 60aaaggatggg acggggggca cctgtggaac cttcccgaga
ggaaccgtta gtgtcgcttg 120aaggttccaa ttcagccgtt acc atg gaa ctt tca gaa
cct gtt gta gaa aat 173 Met Glu Leu Ser Glu
Pro Val Val Glu Asn 1 5
10gga gag gtg gaa atg gcc cta gaa gaa tca tgg gag cac agt aaa gaa
221Gly Glu Val Glu Met Ala Leu Glu Glu Ser Trp Glu His Ser Lys Glu
15 20 25gta agt gaa gcc
gag cct ggg ggt ggt tcc tcg gga gat tca ggg ccc 269Val Ser Glu Ala
Glu Pro Gly Gly Gly Ser Ser Gly Asp Ser Gly Pro 30
35 40cca gaa gaa agt ggc cag gaa atg atg gag gaa
aaa gag gaa ata aga 317Pro Glu Glu Ser Gly Gln Glu Met Met Glu Glu
Lys Glu Glu Ile Arg 45 50 55aaa
tcc aaa tct gtg atc gta ccc tca ggt gca cct aag aaa gaa cac 365Lys
Ser Lys Ser Val Ile Val Pro Ser Gly Ala Pro Lys Lys Glu His 60
65 70gta aat gta gta ttc att ggc cat gta gac
gct ggc aag tca acc atc 413Val Asn Val Val Phe Ile Gly His Val Asp
Ala Gly Lys Ser Thr Ile 75 80 85
90gga gga cag ata atg ttt ttg act gga atg gct gac aaa aga aca
ctg 461Gly Gly Gln Ile Met Phe Leu Thr Gly Met Ala Asp Lys Arg Thr
Leu 95 100 105gag aaa tat
gaa aga gaa gct gag gaa aaa aac aga gaa acc tgg tat 509Glu Lys Tyr
Glu Arg Glu Ala Glu Glu Lys Asn Arg Glu Thr Trp Tyr 110
115 120ttg tcc tgg gcc tta gat aca aat cag gag
gaa cga gac aag ggt aaa 557Leu Ser Trp Ala Leu Asp Thr Asn Gln Glu
Glu Arg Asp Lys Gly Lys 125 130
135aca gtc gaa gtg ggt cgt gcc tat ttt gaa aca gaa agg aaa cat ttc
605Thr Val Glu Val Gly Arg Ala Tyr Phe Glu Thr Glu Arg Lys His Phe 140
145 150aca att tta gat gcc cct ggc cac
aag agt ttt gtc cca aat atg att 653Thr Ile Leu Asp Ala Pro Gly His
Lys Ser Phe Val Pro Asn Met Ile155 160
165 170ggt ggt gct tct caa gct gat ttg gct gtg ctg gtc
atc tct gcc agg 701Gly Gly Ala Ser Gln Ala Asp Leu Ala Val Leu Val
Ile Ser Ala Arg 175 180
185aaa gga gag ttt gaa act gga ttt gaa aaa ggt gga cag aca aga gaa
749Lys Gly Glu Phe Glu Thr Gly Phe Glu Lys Gly Gly Gln Thr Arg Glu
190 195 200cat gcg atg ttt ggc aaa
acg gca gga gta aaa cat tta ata gtg ctt 797His Ala Met Phe Gly Lys
Thr Ala Gly Val Lys His Leu Ile Val Leu 205 210
215att aat aag atg gat gat ccc aca gta aat tgg ggc atc gag
aga tat 845Ile Asn Lys Met Asp Asp Pro Thr Val Asn Trp Gly Ile Glu
Arg Tyr 220 225 230gaa gaa tgt aaa gaa
aaa ctg gtg ccc ttt ttg aaa aaa gta ggc ttt 893Glu Glu Cys Lys Glu
Lys Leu Val Pro Phe Leu Lys Lys Val Gly Phe235 240
245 250agt cca aaa aag gac att cac ttt atg ccc
tgc tca gga ctg acc gga 941Ser Pro Lys Lys Asp Ile His Phe Met Pro
Cys Ser Gly Leu Thr Gly 255 260
265gca aat att aaa gag cag tca gat ttc tgc cct tgg tac act gga tta
989Ala Asn Ile Lys Glu Gln Ser Asp Phe Cys Pro Trp Tyr Thr Gly Leu
270 275 280cca ttt att ccg tat ttg
aat aac ttg cca aac ttc aac aga tca att 1037Pro Phe Ile Pro Tyr Leu
Asn Asn Leu Pro Asn Phe Asn Arg Ser Ile 285 290
295gat gga cca ata aga ctg cca att gtg gat aag tac aaa gat
atg ggc 1085Asp Gly Pro Ile Arg Leu Pro Ile Val Asp Lys Tyr Lys Asp
Met Gly 300 305 310act gtg gtc ctg gga
aag ctg gaa tcc ggg tcc att ttt aaa ggc cag 1133Thr Val Val Leu Gly
Lys Leu Glu Ser Gly Ser Ile Phe Lys Gly Gln315 320
325 330cag ctc gtg atg atg cca aac aag cac aat
gta gaa gtt ctt gga ata 1181Gln Leu Val Met Met Pro Asn Lys His Asn
Val Glu Val Leu Gly Ile 335 340
345ctt tct gat gat act gaa act gat ttt gta gcc cca ggt gaa aac ctc
1229Leu Ser Asp Asp Thr Glu Thr Asp Phe Val Ala Pro Gly Glu Asn Leu
350 355 360aaa atc aga ctg aag gga
att gaa gaa gaa gag att ctt cca gaa ttc 1277Lys Ile Arg Leu Lys Gly
Ile Glu Glu Glu Glu Ile Leu Pro Glu Phe 365 370
375ata ctt tgt gat cct agt aac ctc tgc cat tct gga cgc acg
ttt gat 1325Ile Leu Cys Asp Pro Ser Asn Leu Cys His Ser Gly Arg Thr
Phe Asp 380 385 390gtt cag ata gtg att
att gag cac aaa tcc atc atc tgc cca ggt tat 1373Val Gln Ile Val Ile
Ile Glu His Lys Ser Ile Ile Cys Pro Gly Tyr395 400
405 410aat gcg gtg ctg cac att cat act tgt att
gag gaa gtt gag ata aca 1421Asn Ala Val Leu His Ile His Thr Cys Ile
Glu Glu Val Glu Ile Thr 415 420
425gcg tta atc tcc ttg gta gac aaa aaa tca ggg gaa aaa agt aag aca
1469Ala Leu Ile Ser Leu Val Asp Lys Lys Ser Gly Glu Lys Ser Lys Thr
430 435 440cga ccc cgc ttc gtg aaa
caa gat caa gta tgc att gct cgt tta agg 1517Arg Pro Arg Phe Val Lys
Gln Asp Gln Val Cys Ile Ala Arg Leu Arg 445 450
455aca gca gga acc atc tgc ctc gag acg ttc aaa gat ttt cct
cag atg 1565Thr Ala Gly Thr Ile Cys Leu Glu Thr Phe Lys Asp Phe Pro
Gln Met 460 465 470ggt cgt ttt act tta
aga gat gag ggt aag acc att gca att gga aaa 1613Gly Arg Phe Thr Leu
Arg Asp Glu Gly Lys Thr Ile Ala Ile Gly Lys475 480
485 490gtt ctg aaa ttg gtc cca gag aag gac
taagcaattt tcttgatgcc 1660Val Leu Lys Leu Val Pro Glu Lys Asp
495tctgcaagat actgtgagga gaattgacag caaaagttca ccacctactc
ttatttactg 1720cccattgatt gacttttctt catattttgc aaagagaaat ttcacagcaa
aaattcatgt 1780tttgtcagct ttctcatgtt gagatctgtt atgtcactga tgaatttacc
ctcaagtttc 1840cttcctctgt accactctgc ttccttggac aatatcagta atagctttgt
aagtgatgtg 1900gacgtaattg cctacagtaa taaaaaaata atgtacttta atttttcatt
ttcttttagg 1960atatttagac cacccttgtt ccacgcaaac cagagtgtgt cagtgtttgt
gtgtgtgtta 2020aaatgataac taacatgtga ataaaatact ccatttg
2057

SEQ ID NO: 43; length: 24; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence Primer P1
acaccaatcc agtagccagg cttg 24

SEQ ID NO: 44; length: 34; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence Primer P2
cactcgagaa tctgtgagac ctacatacat gacg 34

SEQ ID NO: 45; length: 21; type: PRT; organism: Artificial Sequence
Feature: misc_feature (2)..(20); Xaa can be any naturally occurring amino acid
Cys Xaa Glu Cys Gly Lys Ala Phe Xaa Gln Lys Ser Xaa Leu Xaa Xaa
1               5                   10                  15
His Gln Arg Xaa His
            20

SEQ ID NO: 46; length: 7; type: PRT; organism: Bovine sp.
Val Leu Asn Ile Ser Leu Trp
1               5

SEQ ID NO: 47; length: 17; type: PRT; organism: Bovine sp.
Thr Leu Met Glu Leu Leu Asn Gln Met Asp Gly Phe Asp Thr Leu His
1               5                   10                  15
Arg

SEQ ID NO: 48; length: 14; type: PRT; organism: Bovine sp.
Feature: misc_feature (11)..(13); Xaa can be any naturally occurring amino acid
Ala Val Ser Asp Phe Val Val Ser Glu Tyr Xaa Met Xaa Ala
1               5                   10

SEQ ID NO: 49; length: 9; type: PRT; organism: Bovine sp.
Feature: misc_feature (9)..(9); Xaa can be any naturally occurring amino acid
Glu Val Asp Pro Leu Val Tyr Asn Xaa
1               5

SEQ ID NO: 50; length: 11; type: PRT; organism: Bovine sp.
His Gly Glu Ile Asp Tyr Glu Ala Ile Val Lys
1               5                   10

SEQ ID NO: 51; length: 25; type: PRT; organism: Bovine sp.
Feature: misc_feature (3)..(23); Xaa can be any naturally occurring amino acid
Leu Ser Xaa Gly Phe Asn Gly Ala Asp Leu Arg Asn Val Xaa Thr Glu
1               5                   10                  15
Ala Gly Met Phe Ala Ile Xaa Ala Asp
            20                  25

SEQ ID NO: 52; length: 21; type: PRT; organism: Bovine sp.
Feature: misc_feature (20)..(20); Xaa can be any naturally occurring amino acid
Met Ile Met Ala Thr Asn Arg Pro Asp Thr Leu Asp Pro Ala Leu Leu
1               5                   10                  15
Arg Pro Gly Xaa Leu
            20

SEQ ID NO: 53; length: 16; type: PRT; organism: Bovine sp.
Ile His Ile Asp Leu Pro Asn Glu Gln Ala Arg Leu Asp Ile Leu Lys
1               5                   10                  15

SEQ ID NO: 54; length: 11; type: PRT; organism: Bovine sp.
Ala Thr Asn Gly Pro Arg Tyr Val Val Val Gly
1               5                   10

SEQ ID NO: 55; length: 7; type: PRT; organism: Bovine sp.
Glu Ile Asp Gly Arg Leu Lys
1               5

SEQ ID NO: 56; length: 14; type: PRT; organism: Bovine sp.
Ala Leu Gln Ser Val Gly Gln Ile Val Gly Glu Val Leu Lys
1               5                   10

SEQ ID NO: 57; length: 8; type: PRT; organism: Bovine sp.
Ile Leu Ala Gly Pro Ile Thr Lys
1               5

SEQ ID NO: 58; length: 16; type: PRT; organism: Bovine sp.
Feature: misc_feature (1)..(2); Xaa can be any naturally occurring amino acid
Xaa Xaa Val Ile Glu Leu Pro Leu Thr Asn Pro Glu Leu Phe Gln Gly
1               5                   10                  15

SEQ ID NO: 59; length: 9; type: PRT; organism: Bovine sp.
Val Val Ser Ser Ser Leu Val Asp Lys
1               5

SEQ ID NO: 60; length: 7; type: PRT; organism: Bovine sp.
Ala Leu Gln Asp Tyr Arg Lys
1               5

SEQ ID NO: 61; length: 7; type: PRT; organism: Bovine sp.
Glu His Arg Glu Gln Leu Lys
1               5

SEQ ID NO: 62; length: 12; type: PRT; organism: Bovine sp.
Lys Leu Glu Ser Lys Leu Asp Tyr Lys Pro Val Arg
1               5                   10

SEQ ID NO: 63; length: 5; type: PRT; organism: Bovine sp.
Leu Val Pro Thr Arg
1               5

SEQ ID NO: 64; length: 11; type: PRT; organism: Bovine sp.
Ala Lys Glu Glu Glu Ile Glu Ala Gln Ile Lys
1               5                   10

SEQ ID NO: 65; length: 10; type: PRT; organism: Bovine sp.
Ala Asn Tyr Glu Val Leu Glu Ser Gln Lys
1               5                   10

SEQ ID NO: 66; length: 11; type: PRT; organism: Bovine sp.
Val Glu Asp Ala Leu His Gln Leu His Ala Arg
1               5                   10

SEQ ID NO: 67; length: 8; type: PRT; organism: Bovine sp.
Asp Val Asp Leu Tyr Gln Val Arg
1               5

SEQ ID NO: 68; length: 13; type: PRT; organism: Bovine sp.
Gln Ser Gln Gly Leu Ser Pro Ala Gln Ala Phe Ala Lys
1               5                   10

SEQ ID NO: 69; length: 21; type: PRT; organism: Bovine sp.
Ala Gly Ser Gln Ser Gly Gly Ser Pro Glu Ala Ser Gly Val Thr Val
1               5                   10                  15
Ser Asp Val Gln Glu
            20

SEQ ID NO: 70; length: 12; type: PRT; organism: Bovine sp.
Feature: misc_feature (5)..(5); Xaa can be any naturally occurring amino acid
Gly Leu Leu Gly Xaa Asn Ile Ile Pro Leu Gln Arg
1               5                   10

SEQ ID NO: 71; length: 26; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence Primer P1
ttgaagaatg atgcattagg aaccac 26

SEQ ID NO: 72; length: 34; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence Primer P2
cactcgagtg gctggatttc aatttctcca gtag 34

SEQ ID NO: 73; length: 24; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence Primer P3
gtcgagctag ccatctcctc ttcg 24

SEQ ID NO: 74; length: 23; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence Primer P4
catgggcgac aggttccgag acc 23

SEQ ID NO: 75; length: 9; type: PRT; organism: Homo sapiens
Lys Gly Ile Pro Ser Phe Trp Leu Thr
1               5

SEQ ID NO: 76; length: 9; type: PRT; organism: Saccharomyces sp.
Lys Gly Ile Pro Glu Phe Trp Leu Thr
1               5

SEQ ID NO: 77; length: 10; type: PRT; organism: Homo sapiens
Asp Ser Phe Phe Asn Phe Phe Ala Pro Pro
1               5                   10

SEQ ID NO: 78; length: 9; type: PRT; organism: Saccharomyces sp.
Glu Ser Phe Phe Asn Phe Phe Ser Pro
1               5

SEQ ID NO: 79; length: 14; type: PRT; organism: Artificial Sequence
Feature: misc_feature (2)..(11); Xaa can be any naturally occurring amino acid
Glu Xaa Xaa Lys Glu Xaa Pro Glu Val Lys Xaa Glu Glu Lys
1               5                   10

SEQ ID NO: 80; length: 5; type: PRT; organism: HIV-1
Gly Arg Lys Lys Arg
1               5

SEQ ID NO: 81; length: 5; type: PRT; organism: Homo sapiens
Lys Lys Lys Arg Lys
1               5

SEQ ID NO: 82; length: 25; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence A1 Primer
cctaaaaagt gtctaagtgc cagtt 25

SEQ ID NO: 83; length: 24; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence A2 Primer
tcagtgaaag ggaaggtaga acac 24

SEQ ID NO: 84; length: 26; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence P1 Primer
taatgaattt cattttagga ggtcgg 26

SEQ ID NO: 85; length: 25; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence P2 Primer
atcttttggg aaagtaagat gagcc 25

SEQ ID NO: 86; length: 21; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence C1 Primer
ggagactcac ctgctaatgt t 21

SEQ ID NO: 87; length: 20; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence C4 Primer
ctcaaaagca gtctcttggc 20

SEQ ID NO: 88; length: 22; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence Primer A
atgggagata cagtagtgga gc 22

SEQ ID NO: 89; length: 21; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence Primer B
tcacatgatg ccgttggtga g 21

SEQ ID NO: 90; length: 51; type: DNA; organism: Homo sapiens
tggatcaagc caatacaaga ttcttgtgaa attacgactg atagtggcat g 51

SEQ ID NO: 91; length: 117; type: DNA; organism: Homo sapiens
tccatttggg aacaggagcg agtgcccctt tggatcaagc catacaagat tcttgtgatt 60
tcggctgata gtggcatgat tgaaccagtg gtcaatgctg tgtccatcca tcaggtg 117

SEQ ID NO: 92; length: 31; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence Primer C1
ctcagatcta tgggagatac agtagtggag c 31

SEQ ID NO: 93; length: 30; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence Primer C2
tcgagatctt cacatgatgc cgttggtgag 30

SEQ ID NO: 94; length: 28; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence P1 Primer
gatttgtgct caataatcac tatctgaa 28

SEQ ID NO: 95; length: 33; type: DNA; organism: Artificial Sequence
Feature: Description of Artificial Sequence P2 Primer
ggttactagg atcacaaagt atgaattctg gaa 33

SEQ ID NO: 96; length: 5; type: PRT; organism: HIV-1
Tyr Arg Lys Lys Arg
1               5