Patent application title: CHIMERIC GENE CONSTRUCTS FOR GENERATION OF FLUORESCENT TRANSGENIC ORNAMENTAL FISH
Inventors:
Zhiyuan Gong (Singapore, SG)
Jiangyan He (Cleveland, OH, US)
Bensheng Ju (Singapore, SG)
Toong Jin Lam (Singapore, SG)
Yanfei Xu (Chicago, IL, US)
Tie Yan (Singapore, SG)
Assignees:
NATIONAL UNIVERSITY OF SINGAPORE
IPC8 Class: AA01K67027FI
USPC Class:
Class name:
Publication date: 2022-08-18
Patent application number: 20220256819
Abstract:
Four zebrafish gene promoters, which are skin specific, muscle specific,
skeletal muscle specific and ubiquitously expressed respectively, were
isolated and ligated to the 5' end of the EGFP gene. When the resulting
chimeric gene constructs were introduced into zebrafish, the transgenic
zebrafish emit green fluorescence under a blue light or ultraviolet light
according to the specificity of the promoters used. Thus, new varieties
of ornamental fish of different fluorescence patterns, e.g., skin
fluorescence, muscle fluorescence, skeletal muscle-specific and/or
ubiquitous fluorescence, are developed.
Claims:
1-41. (canceled)
42. A transgenic fish comprising a chimeric gene comprising a cytokeratin (CK) gene promoter operably linked to a fluorescence gene encoding a fluorescent protein, wherein the transgenic fish expresses the fluorescent protein at a level sufficient that the transgenic fish fluoresces upon exposure to sunlight.
43. The transgenic fish of claim 42, wherein the CK gene promoter is expressed in skin epithelia of the fish.
44. The transgenic fish of claim 42, wherein the CK gene promoter comprises the gene sequence set forth in SEQ ID NO: 7.
45. The transgenic fish of claim 42, wherein the fluorescence gene is under control of the CK gene promoter.
46. The transgenic fish of claim 42, wherein the transgenic fish contains the promoter in germ cells and/or in somatic cells and which is capable of breeding with either a transgenic fish or a non-transgenic fish to produce viable and fertile transgenic progeny.
47. The transgenic fish of claim 42, wherein the transgenic fish or a progeny thereof emits green fluorescence when the whole transgenic fish or the progeny thereof is exposed to a blue or ultraviolet light.
48. The transgenic fish of claim 42, wherein the transgenic fish or a progeny thereof expresses a Green Fluorescent Protein.
49. The transgenic fish of claim 42, wherein the transgenic fish or a progeny thereof expresses an Enhanced Green Fluorescent Protein.
50. The transgenic fish of claim 42, wherein the transgenic fish or a progeny thereof expresses a Blue Fluorescent Protein.
51. The transgenic fish of claim 42, wherein the transgenic fish or a progeny thereof expresses an Enhanced Blue Fluorescent Protein.
52. The transgenic fish of claim 42, wherein the transgenic fish or a progeny thereof expresses an Enhanced Yellow Fluorescent Protein.
53. The transgenic fish of claim 42, wherein the transgenic fish or a progeny thereof expresses a Cyan Fluorescent Protein.
54. The transgenic fish of claim 42, wherein the transgenic fish or a progeny thereof expresses an Enhanced Cyan Fluorescent Protein.
55. The transgenic fish of claim 42, wherein the transgenic fish or a progeny thereof expresses more than one fluorescent protein.
56. The transgenic fish of claim 42, wherein the transgenic fish is selected from a group consisting of medaka, goldfish, carp, koi, loach, tilapia, glassfish, catfish, angel fish, discus, eel, tetra, goby, gourami, guppy, Xiphophorus (swordtail), hatchet fish, Molly fish and Pangasius.
57. The transgenic fish of claim 42, wherein the transgenic fish is a zebrafish.
58. A transgenic fish whose genome comprises a transgene comprising a promoter consisting of the nucleic acid sequence set forth by SEQ ID NO: 7, wherein the promoter is operably linked to a fluorescence gene encoding a fluorescent protein, wherein the transgenic fish is capable of breeding to produce viable and fertile transgenic progeny that express the fluorescent protein when exposed to sunlight.
Description:
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This is a continuation of co-pending application Ser. No. 09/913,898, filed Oct. 3, 2001, which is a nationalization of PCT application WO 00/49150 filed Jul. 16, 1999, claiming priority over a Singapore application filed Jul. 14, 1999, and an earlier Singapore application, Serial No. 9900811-2, filed Feb. 18, 1999, all of which are incorporated herein by reference in their entirety.
BACKGROUND OF INVENTION
[0002] This invention relates to fish gene promoters and chimeric gene constructs with these promoters for generation of transgenic fish, particularly fluorescent transgenic ornamental fish.
[0003] Transgenic technology involves the transfer of a foreign gene into a host organism, enabling the host to acquire a new and inheritable trait. The technique was first developed in mice by Gordon et al. (1980). They injected foreign DNA into fertilized eggs and found that some of the mice developed from the injected eggs retained the foreign DNA. Applying the same technique, Palmiter et al. (1982) introduced a chimeric gene containing a rat growth hormone gene under a mouse heavy metal-inducible gene promoter and generated the first batch of genetically engineered supermice, which were almost twice as large as their non-transgenic siblings. This work opened a promising avenue for using the transgenic approach to provide animals with new and beneficial traits for livestock husbandry and aquaculture.
[0004] In addition to the stimulation of somatic growth for increasing the gross production of animal husbandry and aquaculture, transgenic technology also has many other potential applications. First of all, transgenic animals can be used as bioreactors to produce commercially useful compounds by expression of a useful foreign gene in milk or in blood. Many pharmaceutically useful protein factors have been expressed in this way. For example, human α1-antitrypsin, which is commonly used to treat emphysema, has been expressed at a concentration as high as 35 mg/ml (10% of milk proteins) in the milk of transgenic sheep (Wright et al., 1991). Similarly, the transgenic technique can also be used to improve the nutritional value of milk by selectively increasing the levels of certain valuable proteins such as caseins and by supplementing certain new and useful proteins such as lysozyme for antimicrobial activity (Maga and Murray, 1995). Second, transgenic mice have been widely used in medical research, particularly in the generation of transgenic animal models for human disease studies (Lathe and Mullins, 1993). More recently, it has been proposed to use transgenic pigs as organ donors for xenotransplantation by expressing human regulators of complement activation to prevent hyperacute rejection during organ transplantation (Cozzi and White, 1995). The development of disease resistant animals has also been tested in transgenic mice (e.g. Chen et al., 1988).
[0005] Fish are also an intensive research subject of transgenic studies. There are many ways of introducing a foreign gene into fish, including: microinjection (e.g. Zhu et al., 1985; Du et al., 1992), electroporation (Powers et al., 1992), sperm-mediated gene transfer (Khoo et al., 1992; Sin et al., 1993), gene bombardment or gene gun (Zelenin et al., 1991), liposome-mediated gene transfer (Szelei et al., 1994), and the direct injection of DNA into muscle tissue (Xu et al., 1999). The first transgenic fish report was published by Zhu et al. (1985) using a chimeric gene construct consisting of a mouse metallothionein gene promoter and a human growth hormone gene. Most of the early transgenic fish studies concentrated on growth hormone gene transfer with the aim of generating fast growing "superfish". A majority of early attempts used heterologous growth hormone genes and promoters and failed to produce gigantic superfish (e.g. Chourrout et al., 1986; Penman et al., 1990; Brem et al., 1988; Gross et al., 1992). However, enhanced growth of transgenic fish has been demonstrated in several fish species including Atlantic salmon, several species of Pacific salmon, and loach (e.g. Du et al., 1992; Devlin et al., 1994, 1995; Tsai et al., 1995).
[0006] The zebrafish, Danio rerio, is a new model organism for vertebrate developmental biology. As an experimental model, the zebrafish offers several major advantages such as easy availability of eggs and embryos, tissue clarity throughout embryogenesis, external development, short generation time and easy maintenance of both the adult and the young. Transgenic zebrafish have been used as an experimental tool in zebrafish developmental biology. However, despite the fact that the first transgenic zebrafish was reported a decade ago (Stuart et al., 1988), most transgenic zebrafish work conducted so far used heterologous gene promoters or viral gene promoters: e.g. viral promoters from SV40 (simian virus 40) and RSV (Rous sarcoma virus) (Stuart et al., 1988, 1990; Bayer and Campos-Ortega, 1992), a carp actin promoter (Liu et al., 1990), and mouse homeobox gene promoters (Westerfield et al., 1992). As a result, the expression pattern of a transgene in many cases is variable and unpredictable.
[0007] GFP (green fluorescent protein) was isolated from a jellyfish, Aequorea victoria. The wild type GFP emits green fluorescence at a wavelength of 508 nm upon stimulation with ultraviolet light (395 nm). The primary structure of GFP has been elucidated by cloning of its cDNA and genomic DNA (Prasher et al., 1992). A modified GFP, also called EGFP (Enhanced Green Fluorescent Protein), has been generated artificially. It contains mutations that allow the protein to emit a stronger green light, and its coding sequence has been optimized for higher expression in mammalian cells based on preferred human codons. As a result, EGFP fluorescence is about 40 times stronger than the wild type GFP in mammalian cells (Yang et al., 1996). GFP (including EGFP) has become a popular tool in cell biology and transgenic research. By fusing GFP with a tested protein, the GFP fusion protein can be used as an indicator of the subcellular location of the tested protein (Wang and Hazelrigg, 1994). By transformation of cells with a functional GFP gene, the GFP can be used as a marker to identify expressing cells (Chalfie et al., 1994). Thus, the GFP gene has become an increasingly popular reporter gene for transgenic research as GFP can be easily detected by a non-invasive approach.
[0008] The GFP gene (including EGFP gene) has also been introduced into zebrafish in several previous reports by using various gene promoters, including Xenopus elongation factor 1α enhancer-promoter (Amsterdam et al., 1995, 1996), rat myosin light-chain enhancer (Moss et al., 1996), zebrafish GATA-1 and GATA-3 promoters (Meng et al., 1997; Long et al., 1997), zebrafish α- and β-actin promoters (Higashijima et al., 1997), and tilapia insulin-like growth factor I promoter (Chen et al., 1998). All of these transgenic experiments aim at either developing a GFP transgenic system for gene expression analysis or at testing regulatory DNA elements in gene promoters.
SUMMARY OF INVENTION
[0009] It is a primary objective of the invention to clone fish gene promoters that are constitutive (ubiquitous), or that have tissue specificity such as skin specificity or muscle specificity or that are inducible by a chemical substance, and to use these promoters to develop effective gene constructs for production of transgenic fish.
[0010] It is another objective of the invention to develop fluorescent transgenic ornamental fish using these gene constructs. By applying different gene promoters, tissue-specific, inducible under different environmental conditions, or ubiquitous, to drive the GFP gene, GFP could be expressed in different tissues or ubiquitously. Thus, these transgenic fish may be skin fluorescent, muscle fluorescent, ubiquitously fluorescent, or inducibly fluorescent. These transgenic fish may be used for ornamental purposes, for monitoring environmental pollution, and for basic studies such as recapitulation of gene expression programs or monitoring cell lineage and cell migration. These transgenic fish may be used for cell transplantation and nuclear transplantation or fish cloning.
[0011] Other objectives, features and advantages of the present invention will become apparent from the detailed description which follows, or may be learned by practice of the invention.
[0012] Four zebrafish gene promoters of different characteristics were isolated and four chimeric gene constructs containing a zebrafish gene promoter and EGFP DNA were made: pCK-EGFP, pMCK-EGFP, pMLC2f-EGFP and pARP-EGFP. The first chimeric gene construct, pCK-EGFP, contains a 2.2 kbp polynucleotide comprising a zebrafish cytokeratin (CK) gene promoter which is specifically or predominantly expressed in skin epithelia. The second one, pMCK-EGFP, contains a 1.5 kbp polynucleotide comprising a muscle-specific promoter from a zebrafish muscle creatine kinase (MCK) gene and the gene is only expressed in the muscle tissue. The third construct, pMLC2f-EGFP, contains a 2.2 kbp polynucleotide comprising a strong skeletal muscle-specific promoter from the fast skeletal muscle isoform of the myosin light chain 2 (MLC2f) gene and is expressed specifically or predominantly in skeletal muscle. The fourth chimeric gene construct, pARP-EGFP, contains a strong and ubiquitously expressed promoter from a zebrafish acidic ribosomal protein (ARP) gene. These four chimeric gene constructs have been introduced into zebrafish at the one cell stage by microinjection. In all cases, the GFP expression patterns were consistent with the specificities of the promoters. GFP was predominantly expressed in skin epithelia with pCK-EGFP, specifically expressed in muscles with pMCK-EGFP, specifically expressed in skeletal muscles with pMLC2f-EGFP and ubiquitously expressed in all tissues with pARP-EGFP.
[0013] These chimeric gene constructs are useful to generate green fluorescent transgenic fish. The GFP transgenic fish emit green fluorescent light under a blue or ultraviolet light, and this feature makes the genetically engineered fish unique and attractive in the ornamental fish market. The fluorescent transgenic fish are also useful for the development of a biosensor system and as research models for embryonic studies such as cell lineage, cell migration, and cell and nuclear transplantation.
BRIEF DESCRIPTION OF DRAWINGS
[0014] FIGS. 1A-1I are photographs showing expression of CK (FIGS. 1A-1C), MCK (FIGS. 1D-1E), ARP (FIGS. 1F-1G) and MLC2f (FIGS. 1H-1I) mRNAs in zebrafish embryos as revealed by whole mount in situ hybridization (detailed description of the procedure can be found in Thisse et al., 1993). (FIG. 1A) A 28 hpf (hour postfertilization) embryo hybridized with a CK antisense riboprobe. (FIG. 1B) Enlargement of the mid-part of the embryo shown in FIG. 1A. (FIG. 1C) Cross-section of the embryo in FIG. 1A. (FIG. 1D) A 30 hpf embryo hybridized with an MCK antisense riboprobe. (FIG. 1E) Cross-section of the embryo in FIG. 1D. (FIG. 1F) A 28 hpf embryo hybridized with an ARP antisense riboprobe. (FIG. 1G) Cross-section of the embryo in FIG. 1F. Arrows indicate the planes for cross-sections and the box in panel A indicates the enlarged region shown in panel B. (FIG. 1H) Side view of a 22-hpf embryo hybridized with the MLC2f probe. (FIG. 1I) Transverse section through the trunk of a stained 24-hpf embryo. SC, spinal cord; N, notochord.
[0015] FIG. 2A is a digitized image showing distribution of CK, MCK and ARP mRNAs in adult tissues. Total RNAs were prepared from selected adult tissues as indicated at the top of each lane and analyzed by Northern blot hybridization (detailed description of the procedure can be found in Gong et al., 1992). Three identical blots were made from the same set of RNAs and hybridized with the CK, MCK and ARP probes, respectively.
[0016] FIG. 2B is a digitized image showing distribution of MLC2f mRNA in adult tissues. Total RNAs were prepared from selected adult tissues as indicated at the top of each lane and analyzed by Northern blot hybridization (detailed description of the procedure can be found in Gong et al., 1992). Two identical blots were made from the same set of RNAs and hybridized with the MLC2f probe and a ubiquitously expressed β-actin probe, respectively.
[0017] FIG. 3 is a schematic representation of the strategy of promoter cloning. Restriction enzyme digested genomic DNA was ligated with a short linker DNA which consists of Oligo 1 and Oligo 2. Nested PCR reactions were then performed: the first round PCR used linker specific primer L1 and gene specific primer G1, where G1 is CK1, MCK1, M1 or ARP1 in the described embodiments, and the second round PCR used linker specific primer L2 and gene specific primer G2, where G2 is CK2, MCK2, M2 or ARP2, respectively, in the described embodiments.
[0018] FIG. 4 is a schematic map of the chimeric gene construct, pCK-EGFP. The 2.2 kb zebrafish DNA fragment comprising the CK promoter region is inserted into pEGFP-1 (Clontech) at the EcoRI and BamHI sites as indicated. In the resulting chimeric DNA construct, the EGFP gene is under control of the zebrafish CK promoter. Also shown is the kanamycin/neomycin resistance gene (Kanʳ/Neoʳ) in the backbone of the original pEGFP-1 plasmid. The total length of the recombinant plasmid pCK-EGFP is 6.4 kb.
[0019] FIG. 5 is a schematic map of the chimeric gene construct, pMCK-EGFP. The 1.5 kb zebrafish DNA fragment comprising the MCK promoter region is inserted into pEGFP-1 (Clontech) at the EcoRI and BamHI sites as indicated. In the resulting chimeric DNA construct, the EGFP gene is under control of the zebrafish MCK promoter. Also shown is the kanamycin/neomycin resistance gene (Kanʳ/Neoʳ) in the backbone of the original pEGFP-1 plasmid. The total length of the recombinant plasmid pMCK-EGFP is 5.7 kb.
[0020] FIG. 6 is a schematic map of the chimeric gene construct, pARP-EGFP. The 2.2 kb zebrafish DNA fragment comprising the ARP promoter/1st intron region is inserted into pEGFP-1 (Clontech) at the EcoRI and BamHI sites as indicated. In the resulting chimeric DNA construct, the EGFP gene is under control of the zebrafish ARP promoter. Also shown is the kanamycin/neomycin resistance gene (Kanʳ/Neoʳ) in the backbone of the original pEGFP-1 plasmid. The total length of the recombinant plasmid pARP-EGFP is 6.4 kb.
[0021] FIG. 7 is a schematic map of the chimeric gene construct, pMLC2f-EGFP. The 2.0 kb zebrafish DNA fragment comprising the MLC2f promoter region is inserted into pEGFP-1 (Clontech) at the HindIII and BamHI sites as indicated. In the resulting chimeric DNA construct, the EGFP gene is under control of the zebrafish MLC2f promoter. Also shown is the kanamycin/neomycin resistance gene (Kanʳ/Neoʳ) in the backbone of the original pEGFP-1 plasmid. The total length of the recombinant plasmid pMLC2f-EGFP is 6.2 kb.
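As a consistency check on the plasmid maps of FIGS. 4-7, the insert and total lengths stated above can be subtracted to give the vector backbone length implied for each construct. The following is only an illustrative sketch; the dictionary layout and the function name implied_backbone_kb are assumptions introduced here and are not part of the described embodiments.

```python
# Insert and total plasmid lengths (kb) exactly as stated for FIGS. 4-7.
CONSTRUCTS = {
    "pCK-EGFP": {"insert_kb": 2.2, "total_kb": 6.4},
    "pMCK-EGFP": {"insert_kb": 1.5, "total_kb": 5.7},
    "pARP-EGFP": {"insert_kb": 2.2, "total_kb": 6.4},
    "pMLC2f-EGFP": {"insert_kb": 2.0, "total_kb": 6.2},
}


def implied_backbone_kb(construct):
    """Return total plasmid length minus promoter insert length (0.1 kb precision)."""
    return round(construct["total_kb"] - construct["insert_kb"], 1)


if __name__ == "__main__":
    for name, lengths in CONSTRUCTS.items():
        print(name, implied_backbone_kb(lengths))
```

All four constructs give the same implied backbone of 4.2 kb, which is what would be expected if each promoter fragment was cloned into the same pEGFP-1 vector.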
[0022] FIG. 8 is a photograph of a typical transgenic zebrafish fry (4 days old) with pCK-EGFP, which emits green fluorescence from skin epithelia under a blue light.
[0023] FIG. 9 is a photograph of a typical transgenic zebrafish fry (3 days old) with pMCK-EGFP, which emits green fluorescence from skeletal muscles under a blue light.
[0024] FIG. 10 is a photograph of a typical transgenic zebrafish fry (2 days old) with pARP-EGFP, which emits green fluorescence under a blue light from a variety of cell types such as skin epithelia, muscle cells, lens, neural tissues, notochord, circulating blood cells and yolk cells.
[0025] FIGS. 11A-11B. Photographs of a typical transgenic zebrafish founder with pMLC2f-EGFP (FIG. 11A) and an F1 stable transgenic offspring (FIG. 11B). Both pictures were taken under an ultraviolet light (365 nm). The green fluorescence can be better observed under a blue light with an optimal wavelength of 488 nm.
[0026] FIGS. 12A-12C. Examples of high, moderate and low expression of GFP in transiently transgenic embryos at 72 hpf. (FIG. 12A) High expression, GFP expression was detected in essentially 100% of the muscle fibers in the trunk. (FIG. 12B) Moderate expression, GFP expression was detected in several bundles of muscle fibers, usually in the mid-trunk region. (FIG. 12C) Low expression, GFP expression occurred in dispersed muscle fibers and the number of GFP positive fibers is usually less than 20 per embryo.
[0027] FIG. 13. Deletion analysis of the MLC2f promoter in transient transgenic zebrafish embryos. A series of 5' deletions of MLC2f-EGFP constructs containing 2011-bp (2-kb), 1338-bp, 873-bp, 283-bp, 77-bp and 3-bp of the MLC2f promoter were generated by unidirectional deletion using the double-stranded Nested Deletion Kit from Pharmacia based on the manufacturer's instruction manual. Each construct was injected into approximately 100 embryos and GFP expression was monitored in the first 72 hours of embryonic development. The level of GFP expression was classified based on the examples shown in FIGS. 12A-12C. Potential E-boxes and MEF2 binding sites, which are important for muscle-specific transcription (Schwarz et al., 1993; Olson et al., 1995), are indicated on the 2011-bp construct.
DETAILED DESCRIPTION
[0028] Gene Constructs. To develop successful transgenic fish with a predictable pattern of transgene expression, the first step is to make a gene construct suitable for transgenic studies. The gene construct generally comprises three portions: a gene promoter, a structural gene and transcriptional termination signals. The gene promoter would determine where, when and under what conditions the structural gene is turned on. The structural gene contains protein coding portions that determine the protein to be synthesized and thus the biological function. The structural gene might also contain intron sequences which can affect mRNA stability or which might contain transcription regulatory elements. The transcription termination signals consist of two parts: a polyadenylation signal and a transcriptional termination signal after the polyadenylation signal. Both are important to terminate the transcription of the gene. Among the three portions, selection of a promoter is very important for successful transgenic study, and it is preferable to use a homologous promoter (homologous to the host fish) to ensure accurate gene activation in the transgenic host.
[0029] A promoter drives expression "predominantly" in a tissue if expression is at least 2-fold, preferably at least 5-fold higher in that tissue compared to a reference tissue. A promoter drives expression "specifically" in a tissue if the level of expression is at least 5-fold, preferably at least 10-fold higher, more preferably at least 50-fold higher in that tissue than in any other tissue.
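The two definitions above can be illustrated by a short sketch. It assumes that expression levels are available as positive numbers in arbitrary but common units, and that "at least N-fold higher" means a ratio of at least N; the function names and default thresholds (the minimum values of 2-fold and 5-fold given above) are chosen here for illustration only.

```python
def is_predominant(target_level, reference_level, min_fold=2.0):
    """True if expression in the target tissue is at least min_fold times
    the expression in a single reference tissue."""
    return target_level >= min_fold * reference_level


def is_specific(target_level, other_levels, min_fold=5.0):
    """True if expression in the target tissue is at least min_fold times
    the expression in every other tissue measured."""
    return all(target_level >= min_fold * level for level in other_levels)


if __name__ == "__main__":
    # Hypothetical measurements: skin versus two other tissues.
    skin, muscle, liver = 100.0, 10.0, 20.0
    print(is_predominant(skin, muscle))          # compares against one reference tissue
    print(is_specific(skin, [muscle, liver]))    # compares against all other tissues
```

The preferred thresholds (5-fold for "predominantly"; 10-fold or 50-fold for "specifically") can be applied by passing a different min_fold value.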
[0030] Recombinant DNA Constructs. Recombinant DNA constructs comprising one or more of the DNA or RNA sequences described herein and an additional DNA and/or RNA sequence are also included within the scope of this invention. These recombinant DNA constructs usually have sequences which do not occur in nature or exist in a form that does not occur in nature or exist in association with other materials that do not occur in nature. The DNA and/or RNA sequences described as constructs or in vectors above are "operably linked" with other DNA and/or RNA sequences. DNA regions are operably linked when they are functionally related to each other. For example, DNA for a presequence or secretory leader is operably linked to DNA for a polypeptide if it is expressed as part of a preprotein which participates in the secretion of the polypeptide; a promoter is operably linked to a coding sequence if it controls the transcription of the coding sequence; a ribosome binding site is operably linked to a coding sequence if it is positioned so as to permit translation. Generally, operably linked means contiguous (or in close proximity to) and, in the case of secretory leaders, contiguous and in reading phase.
[0031] The sequences of some of the DNAs, and the corresponding proteins encoded by the DNA, which are useful in the invention are set forth in the attached Sequence Listing.
[0032] The complete cytokeratin (CK) cDNA sequence is shown in SEQ ID NO:1, and its deduced amino acid sequence is shown in SEQ ID NO:2. The binding sites of the gene specific primers for promoter amplification, CK1 and CK2, are indicated. The extra nucleotides introduced into CK2 for generation of a restriction site are shown as a misc_feature in the primer sequence SEQ ID NO:11. A potential polyadenylation signal, AATAAA, is indicated in SEQ ID NO:1.
[0033] The complete muscle creatine kinase (MCK) cDNA sequence is shown in SEQ ID NO:3, and its deduced amino acid sequence is shown in SEQ ID NO:4. The binding sites of the gene specific primers for promoter amplification, MCK1 and MCK2, are indicated. The extra nucleotides introduced into MCK1 and MCK2 for generation of restriction sites are shown as a misc_feature in the primer sequences SEQ ID NOS:12 and 13, respectively. A potential polyadenylation signal, AATAAA, is indicated in SEQ ID NO:3.
[0034] The complete fast skeletal muscle isoform of myosin light chain 2 (MLC2f) cDNA sequence is shown in SEQ ID NO:20, and its deduced amino acid sequence is shown in SEQ ID NO:21. The binding sites of the gene-specific primers for promoter amplification, M1 and M2, are indicated. Two potential polyadenylation signals, AATAAA, are shown as a misc_feature in SEQ ID NO:20.
[0035] The complete acidic ribosomal protein P0 (ARP) cDNA sequence is shown in SEQ ID NO:5, and its deduced amino acid sequence is shown in SEQ ID NO:6. The binding sites of the gene specific primers for promoter amplification, ARP1 and ARP2, are indicated. The extra nucleotides introduced into ARP2 for generation of a restriction site are shown as a misc_feature in the primer sequence SEQ ID NO:15. A potential polyadenylation signal, AATAAA, is indicated in SEQ ID NO:5.
[0036] SEQ ID NO:7 shows the complete sequence of the CK promoter region. A putative TATA box is shown, and the 3' nucleotides identical to the 5' CK cDNA sequence are shown as a misc_feature. The binding site of the second gene specific primer, CK2, is shown. The introduced BamHI site is indicated as a misc_feature in the primer sequence SEQ ID NO:11.
[0037] SEQ ID NO:8 shows the complete sequence of the MCK promoter region. A putative TATA box is shown, and the 3' nucleotides identical to the 5' MCK cDNA sequence are shown as a misc_feature in SEQ ID NO:8. The binding site of the second gene specific primer, MCK2, is shown. The introduced BamHI site is indicated as a misc_feature in the primer sequence SEQ ID NO:13.
[0038] SEQ ID NO:22 shows the complete sequence of the MLC2f promoter region. A putative TATA box is shown, and the 3' nucleotides identical to the 5' MLC2f cDNA sequence are shown as a misc_feature. The binding site of the second gene-specific primer, M2, is shown. Potential muscle-specific cis-elements, E-boxes and MEF2 binding sites, are also shown. The proximal 1-kb region of the MLC2f promoter was recently published (Xu et al., 1999).
[0039] SEQ ID NO:9 shows the complete sequence of the ARP promoter region including the first intron. The first intron is shown, and the 3' nucleotides identical to the 5' ARP cDNA sequence are shown as misc_features. No typical TATA box is found. The binding site of the second gene specific primer, ARP2, is shown. The introduced BamHI site is indicated as a misc_feature in the primer sequence SEQ ID NO:15.
[0040] Specifically Exemplified Polypeptides/DNA. The present invention contemplates use of DNA that codes for various polypeptides and other types of DNA to prepare the gene constructs of the present invention. DNA that codes for structural proteins, such as fluorescent peptides including GFP, EGFP, BFP, EBFP, YFP, EYFP, CFP, ECFP and enzymes (such as luciferase, β-galactosidase, chloramphenicol acetyltransferase, etc.), and hormones (such as growth hormone etc.), is useful in the present invention. More particularly, the DNA may code for polypeptides comprising the sequences exemplified in SEQ ID NOS:2, 4, 6 and 21. The present invention also contemplates use of particular DNA sequences, including regulatory sequences, such as promoter sequences shown in SEQ ID NOS: 7, 8, 9 and 22 or portions thereof effective as promoters. Finally, the present invention also contemplates the use of additional DNA sequences, described generally herein or described in the references cited herein, for various purposes.
[0041] Chimeric Genes. The present invention also encompasses chimeric genes comprising a promoter described herein operatively linked to a heterologous gene. Thus, a chimeric gene can comprise a promoter of a zebrafish operatively linked to a zebrafish structural gene other than that normally found linked to the promoter in the genome. Alternatively, the promoter can be operatively linked to a gene that is exogenous to a zebrafish, as exemplified by the GFP and other genes specifically exemplified herein. Furthermore, a chimeric gene can comprise an exogenous promoter linked to any structural gene not normally linked to that promoter in the genome of an organism.
[0042] Variants of Specifically Exemplified Polypeptide. DNA that codes for variants of the specifically exemplified polypeptides is also encompassed by the present invention. Possible variants include allelic variants and corresponding polypeptides from other organisms, particularly other organisms of the same species, genus or family. The variants may have substantially the same characteristics as the natural polypeptides. The variant polypeptide will possess the primary property of concern for the polypeptide. For example, the polypeptide will possess one or more or all of the primary physical (e.g., solubility) and/or biological (e.g., enzymatic activity, physiologic activity or fluorescence excitation or emission spectrum) properties of the reference polypeptide. DNA of the structural genes of the present invention will encode a protein that produces a fluorescent or chemiluminescent light under conditions appropriate to the particular polypeptide in one or more tissues of a fish. Preferred tissues for expression are skin, muscle, eye and bone.
[0043] Substitutions, Additions and Deletions. As possible variants of the above specifically exemplified polypeptides, the polypeptide may have additional individual amino acids or amino acid sequences inserted into the polypeptide in the middle thereof and/or at the N-terminal and/or C-terminal ends thereof so long as the polypeptide possesses the desired physical and/or biological characteristics. Likewise, some of the amino acids or amino acid sequences may be deleted from the polypeptide so long as the polypeptide possesses the desired physical and/or biochemical characteristics. Amino acid substitutions may also be made in the sequences so long as the polypeptide possesses the desired physical and biochemical characteristics. DNA coding for these variants can be used to prepare gene constructs of the present invention.
[0044] Sequence Identity. The variants of polypeptides or polynucleotides contemplated herein should possess more than 75% sequence identity (sometimes referred to as homology), preferably more than 85% identity, more preferably more than 95% identity, most preferably more than 98% identity to the naturally occurring and/or specifically exemplified sequences or fragments thereof described herein. To determine this homology, two sequences are aligned so as to obtain a maximum match using gaps and inserts.
[0045] Two sequences are said to be "identical" if the sequence of residues is the same when aligned for maximum correspondence as described below. The term "complementary" applies to nucleic acid sequences and is used herein to mean that the sequence is complementary to all or a portion of a reference polynucleotide sequence.
[0046] Optimal alignment of sequences for comparison can be conducted by the local homology algorithm of Smith and Waterman (1981), by the homology alignment method of Needleman and Wunsch (1970), by the search for similarity method of Pearson and Lippman (1988), or the like. Computer implementations of the above algorithms are available as part of the Genetics Computer Group (GCG) Wisconsin Genetics Software Package (GAP, BESTFIT, BLAST, FASTA and TFASTA), 575 Science Drive, Madison, Wis. These programs are preferably run using default values for all parameters.
[0047] "Percentage of sequence identity" is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the sequence in the comparison window may comprise additions or deletions (i.e. "gaps") as compared to the reference sequence for optimal alignment of the two sequences being compared. The percentage identity is calculated by determining the number of positions at which the identical residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window and multiplying the result by 100 to yield the percentage of sequence identity. Total identity is then determined as the average identity over all of the windows that cover the complete query sequence.
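The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the GCG package: the function names, the use of "-" as the gap character and the example strings are assumptions made for the illustration, and the sequences are taken to be already aligned.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over one comparison window of two pre-aligned sequences.

    Gaps are written as '-' (an assumption of this sketch). Every column of
    the window counts toward the total number of positions; a column is a
    matched position only if both residues are identical and not a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matched / len(aligned_a)


def total_identity(windows) -> float:
    """Average of the per-window identities over all windows of the query."""
    scores = [percent_identity(a, b) for a, b in windows]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical 10-column window with one gap and one mismatch.
    print(percent_identity("ACGT-ACGTA", "ACGTTACGAA"))
```

A column containing a gap is counted in the denominator but never as a match, which follows the wording "dividing the number of matched positions by the total number of positions in the window".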
[0048] Fragments of Polypeptide. Genes which code for fragments of the full length polypeptides such as proteolytic cleavage fragments which contain at least one, and preferably all, of the above physical and/or biological properties are also encompassed by the present invention.
[0049] DNA and RNA. The invention encompasses DNA that codes for any one of the above polypeptides including, but not limited to, those shown in SEQ ID NOS:2, 4, 6 and 21 including fusion polypeptides, variants and fragments thereof. The sequence of certain particularly useful cDNAs which encode polypeptides are shown in SEQ ID NOS:1, 3, 5 and 20. The present invention also includes cDNA as well as genomic DNA containing or comprising the requisite nucleotide sequences as well as corresponding RNA and antisense sequences.
[0050] Cloned DNA within the scope of the invention also includes allelic variants of the specific sequences presented in the attached Sequence Listing. An "allelic variant" is a sequence that is a variant from that of the exemplified nucleotide sequence, but represents the same chromosomal locus in the organism. In addition to those which occur by normal genetic variation in a population and perhaps fixed in the population by standard breeding methods, allelic variants can be produced by genetic engineering methods. A preferred allelic variant is one that is found in a naturally occurring organism, including a laboratory strain. Allelic variants are either silent or expressed. A silent allele is one that does not affect the phenotype of the organism. An expressed allele results in a detectable change in the phenotype of the trait represented by the locus.
[0051] A nucleic acid sequence "encodes" or "codes for" a polypeptide if it directs the expression of the polypeptide referred to. The nucleic acid can be DNA or RNA. Unless otherwise specified, a nucleic acid sequence that encodes a polypeptide includes the transcribed strand, the hnRNA and the spliced RNA or the DNA representative of the mRNA. An "antisense" nucleic acid is one that is complementary to all or part of a strand representative of mRNA, including untranslated portions thereof.
[0052] Degenerate Sequences. In accordance with degeneracy of genetic code, it is possible to substitute at least one base of the base sequence of a gene by another kind of base without causing the amino acid sequence of the polypeptide produced from the gene to be changed. Hence, the DNA of the present invention may also have any base sequence that has been changed by substitution in accordance with degeneracy of genetic code.
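As an illustration of degeneracy, the following Python sketch translates the first eight codons of the coding region shown for SEQ ID NO:1 in the Sequence Listing and a variant in which several codons are replaced by synonymous ones. The variant sequence, the partial codon table (which covers only the codons used here) and the function name are assumptions introduced for the illustration.

```python
# Partial standard genetic code: only the codons needed for this example.
CODON_TABLE = {
    "atg": "Met",
    "tca": "Ser", "tcc": "Ser", "tct": "Ser",
    "acc": "Thr", "aca": "Thr",
    "agg": "Arg", "cgt": "Arg",
    "atc": "Ile", "att": "Ile",
    "tac": "Tyr", "tat": "Tyr",
}


def translate(codons: str) -> list:
    """Translate a space-separated string of codons into three-letter residues."""
    return [CODON_TABLE[c.lower()] for c in codons.split()]


# First eight codons of the CDS as printed in the Sequence Listing.
original = "atg tca acc agg tct atc tct tac"
# Hypothetical variant built by swapping in synonymous codons.
variant = "atg tcc aca cgt tca att tcc tat"

if __name__ == "__main__":
    changed = sum(a != b for a, b in zip(original, variant))
    print(translate(original))
    print(translate(variant))
    print("bases changed:", changed)
```

Both strings translate to the same residues although their base sequences differ, which is the situation the paragraph above describes.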
[0053] DNA Modification. The DNA is readily modified by substitution, deletion or insertion of nucleotides, thereby resulting in novel DNA sequences encoding the polypeptide or its derivatives. These modified sequences are used to produce mutant polypeptide and to directly express the polypeptide. Methods for saturating a particular DNA sequence with random mutations and also for making specific site-directed mutations are known in the art; see e.g. Sambrook et al. (1989).
[0054] Hybridizable Variants. The DNA molecules useful in accordance with the present invention can comprise a nucleotide sequence selected from the group consisting of SEQ ID NOS:1, 3, 5, 7-20 and 22-24 or can comprise a nucleotide sequence that hybridizes to a DNA molecule comprising the nucleotide sequence of SEQ ID NOS:1, 3, 5 or 20 under salt and temperature conditions providing stringency at least as high as that equivalent to 5×SSC and 42° C. and that codes on expression for a polypeptide that has one or more or all of the above physical and/or biological properties. The present invention also includes polypeptides coded for by these hybridizable variants. The relationship of stringency to hybridization and wash conditions and other considerations of hybridization can be found in Chapters 11 and 12 of Sambrook et al. (1989). The present invention also encompasses functional promoters which hybridize to SEQ ID NOS:7, 8, 9 or 22 under the above-described conditions. DNA molecules of the invention will preferably hybridize to reference sequences under more stringent conditions allowing the degree of mismatch represented by the degrees of sequence identity enumerated above. The present invention also encompasses functional primers or linker oligonucleotides set forth in SEQ ID NOS:10-19 and 23-24 or larger primers comprising these sequences, or sequences which hybridize with these sequences under the above-described conditions. The primers usually have a length of 10-50 nucleotides, preferably 15-35 nucleotides, more preferably 18-30 nucleotides.
[0055] Vectors. The invention is further directed to a replicable vector containing cDNA that codes for the polypeptide and that is capable of expressing the polypeptide.
[0056] The present invention is also directed to a vector comprising a replicable vector and a DNA sequence corresponding to the above described gene inserted into said vector. The vector may be an integrating or non-integrating vector depending on its intended use and is conveniently a plasmid.
[0057] Transformed Cells. The invention further relates to a transformed cell or microorganism containing cDNA or a vector which codes for the polypeptide or a fragment or variant thereof and that is capable of expressing the polypeptide.
[0058] Expression Systems Using Vertebrate Cells. Interest has been great in vertebrate cells, and propagation of vertebrate cells in culture (tissue culture) has become a routine procedure. Examples of vertebrate host cell lines useful in the present invention preferably include cells from any of the fish described herein. Expression vectors for such cells ordinarily include (if necessary) an origin of replication, a promoter located upstream from the gene to be expressed, along with a ribosome-binding site, RNA splice site (if intron-containing genomic DNA is used or if an intron is necessary to optimize expression of a cDNA), a polyadenylation site, and a transcription termination sequence.
EXAMPLES
[0059] The following examples are provided by way of illustration only and not by way of limitation. Those of skill will readily recognize a variety of noncritical parameters which can be changed or modified to yield essentially similar results.
[0060] Example I: Isolation of skin-specific, muscle-specific and ubiquitously expressed zebrafish cDNA clones. cDNA clones were isolated and sequenced as described by Gong et al. (1997). Basically, random cDNA clones were selected from zebrafish embryonic and adult cDNA libraries and each clone was partially sequenced by a single sequencing reaction. The partial sequences were then used to identify the sequenced clones for potential function and tissue specificity. Of the distinct clones identified by this approach, four of them were selected: for skin specificity (clone A39 encoding cytokeratin, CK), for muscle specificity (clone E146 encoding muscle creatine kinase, MCK), for skeletal muscle specificity (clone A113 encoding the fast skeletal muscle isoform of the myosin light chain 2, MLC2f) and for ubiquitous expression (clone A150 encoding acidic ribosomal protein P0, ARP), respectively.
[0061] The four cDNA clones were sequenced, and their complete cDNA sequences with deduced amino acid sequences are shown in SEQ ID NOS:1, 3, 5, and 20 respectively. A39 encodes a type II basic cytokeratin and its closest homolog in mammals is cytokeratin 8 (65-68% amino acid identity). E146 codes for the zebrafish MCK and its amino acid sequence shares approximately 87% identity with mammalian MCKs. A113 encodes the fast skeletal muscle isoform of the myosin light chain 2. The deduced amino acid sequence of this gene is highly homologous to other vertebrate fast skeletal muscle MLC2f proteins (over 80% amino acid identity). The amino acid sequence of zebrafish ARP deduced from the A150 clone is 87-89% identical to those of mammalian ARPs.
[0062] To demonstrate their expression patterns, whole mount in situ hybridization (Thisse et al., 1993) was performed for developing embryos and Northern blot analyses (Gong et al., 1992) were carried out for selected adult tissues and for developing embryos.
[0063] As indicated by whole mount in situ hybridization, cytokeratin mRNA was specifically expressed in the embryonic surface (FIGS. 1A-1C) and cross section of in situ hybridized embryos confirmed that the expression was only in skin epithelia (FIG. 1C). Ontogenetically, the cytokeratin mRNA appeared before 4 hours post-fertilization (hpf) and it is likely that the transcription of the cytokeratin gene starts at mid-blastula transition when the zygotic genome is activated. By in situ hybridization, a clear cytokeratin mRNA signal was detected in highly flattened cells of the superficial layer in blastula and the expression remained in the superficial layer which eventually developed into skin epithelia including the yolk sac. In adult tissues, cytokeratin mRNA was predominantly detected in the skin and also weakly in several other tissues including the eye, gill, intestine and muscle, but not in the liver and ovary (FIG. 2). Therefore, the cytokeratin mRNA is predominantly, if not specifically, expressed in skin cells.
[0064] MCK mRNA was first detected in the first few anterior somites in 10 somite stage embryos (14 hpf) and at later stages the expression was specifically in skeletal muscle (FIG. 1D) and in heart (data not shown). When the stained embryos were cross-sectioned, the MCK mRNA signal was found exclusively in the trunk skeletal muscles (FIG. 1E). In adult tissues, MCK mRNA was detected exclusively in the skeletal muscle (FIG. 2).
[0065] MLC2f mRNA was specifically expressed in fast skeletal muscle in developing zebrafish embryos (FIGS. 1H-1I). To examine the tissue distribution of MLC2f mRNA, total RNAs were prepared from several adult tissues including heart, brain, eyes, gills, intestine, liver, skeletal muscle, ovary, skin, and testis. MLC2f mRNA was only detected in the skeletal muscle by Northern analysis, while α-actin mRNA was detected ubiquitously in the same set of RNAs, confirming the validity of the assay (FIG. 2B).
[0066] ARP mRNA was expressed ubiquitously and it is presumably a maternal mRNA since it is present in the ovary as well as in embryos at one cell stage. In in situ hybridization experiments, an intense hybridization signal was detected in most tissues. An example of a hybridized embryo at 28 hpf is shown in FIG. 1F. In adults, ARP mRNA was abundantly expressed in all tissues examined except for the brain where a relatively weak signal was detected (FIG. 2A). These observations confirmed that the ARP mRNA is expressed ubiquitously.
[0067] Example II: Isolation of zebrafish gene promoters. Four zebrafish gene promoters were isolated by a linker-mediated PCR method as described by Liao et al. (1997) and as exemplified by the diagrams in FIG. 3. The whole procedure includes the following steps: 1) designing of gene specific primers; 2) isolation of zebrafish genomic DNA; 3) digestion of genomic DNA by a restriction enzyme; 4) ligation of a short linker DNA to the digested genomic DNA; 5) PCR amplification of the promoter region; and 6) DNA sequencing to confirm the cloned DNA fragment. The following is the detailed description of these steps.
[0068] 1. Designing of gene specific primers. Gene specific PCR primers were designed based on the 5' end of the four cDNA sequences and the regions used for designing the primers are shown in SEQ ID NOS: 1, 3, 5 and 20.
[0069] The two cytokeratin gene specific primers are: CK1 (SEQ ID NO:10) and CK2 (SEQ ID NO:11), where the first six nucleotides are for creation of an EcoRI site to facilitate cloning.
[0070] The two muscle creatine kinase gene specific primers are: MCK1 (SEQ ID NO:12), where the first five nucleotides are for creation of an EcoRI site to facilitate cloning.
[0071] MCK2 (SEQ ID NO:13), where the first three nucleotides are for creation of an EcoRI site to facilitate cloning.
[0072] The two fast skeletal muscle isoform of myosin light chain 2 gene specific primers are: M1 (SEQ ID NO:23) and M2 (SEQ ID NO:24). The two acidic ribosomal protein P0 gene specific primers are: ARP1 (SEQ ID NO:14) and ARP2 (SEQ ID NO:15), where the first six nucleotides are for creation of an EcoRI site to facilitate cloning.
[0073] 2. Isolation of zebrafish genomic DNA. Genomic DNA was isolated from a single individual fish by a standard method (Sambrook et al., 1989). Generally, an adult fish was quickly frozen in liquid nitrogen and ground into powder. The ground tissue was then transferred to an extraction buffer (10 mM Tris, pH 8, 0.1 M EDTA, 20 µg/ml RNase A and 0.5% SDS) and incubated at 37° C. for 1 hour. Proteinase K was added to a final concentration of 100 µg/ml and gently mixed until the mixture appeared viscous, followed by incubation at 50° C. for 3 hours with periodic swirling. The genomic DNA was gently extracted three times with phenol equilibrated with Tris-HCl (pH 8), precipitated by adding 0.1 volume of 3 M NaOAc and 2.5 volumes of ethanol, and collected by swirling on a glass rod, then rinsed in 70% ethanol.
[0074] 3. Digestion of genomic DNA by a restriction enzyme. Genomic DNA was digested with the selected restriction enzymes. Generally, 500 units of restriction enzyme were used to digest 50 µg of genomic DNA overnight at the optimal enzyme reaction temperature (usually at 37° C.).
[0075] 4. Ligation of a short linker DNA to the digested genomic DNA. The linker DNA was assembled by annealing equimolar amounts of the two linker oligonucleotides, Oligo 1 (SEQ ID NO:16) and Oligo 2 (SEQ ID NO:17). Oligo 2 was phosphorylated by T4 polynucleotide kinase prior to annealing. Restriction enzyme digested genomic DNA was filled-in or trimmed with T4 DNA polymerase, if necessary, and ligated with the linker DNA. Ligation was performed with 1 µg of digested genomic DNA and 0.5 µg of linker DNA in a 20 µl reaction containing 10 units of T4 DNA ligase at 4° C. overnight.
[0076] 5. PCR amplification of promoter region. PCR was performed with Advantage Tth Polymerase Mix (Clontech). The first round of PCR was performed using a linker specific primer L1 (SEQ ID NO:18) and a gene specific primer G1 (CK1, MCK1, M1 or ARP1). Each reaction (50 µl) contained 5 µl of 10× Tth PCR reaction buffer (1× = 15 mM KOAc, 40 mM Tris, pH 9.3), 2.2 µl of 25 mM Mg(OAc)2, 5 µl of 2 mM dNTP, 1 µl of L1 (0.2 µg/µl), 1 µl of G1 (0.2 µg/µl), 33.8 µl of H2O, 1 µl (50 ng) of linker ligated genomic DNA and 1 µl of 50× Tth polymerase mix (Clontech). The cycling conditions were as follows: 94° C./1 min, 35 cycles of 94° C./30 sec and 68° C./6 min, and finally 68° C./8 min. After the primary round of PCR was completed, the products were diluted 100 fold. One µl of diluted PCR product was used as template for the second round of PCR (nested PCR) with a second linker specific primer L2 (SEQ ID NO:19) and a second gene specific primer G2 (CK2, MCK2, M2 or ARP2), as described for the primary PCR but with the following modification: 94° C./1 min, 25 cycles of 94° C./30 sec and 68° C./6 min, and finally 68° C./8 min. Both the primary and secondary PCR products were analyzed on a 1% agarose gel.
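The composition of the primary PCR given above can be checked with a short Python sketch that sums the component volumes and derives the final concentrations by simple dilution. The dictionary keys and function names are assumptions introduced for the illustration, and the cycling time counts programmed hold times only, ignoring temperature ramping.

```python
# Component volumes (in microliters) of the 50 µl primary PCR, as listed in the text.
components_ul = {
    "10x Tth PCR reaction buffer": 5.0,
    "25 mM Mg(OAc)2": 2.2,
    "2 mM dNTP": 5.0,
    "primer L1 (0.2 ug/ul)": 1.0,
    "primer G1 (0.2 ug/ul)": 1.0,
    "H2O": 33.8,
    "linker ligated genomic DNA (50 ng)": 1.0,
    "50x Tth polymerase mix": 1.0,
}


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Dilution rule: stock concentration times the volume fraction added."""
    return stock * volume_ul / total_ul


def cycling_minutes(n_cycles: int) -> float:
    """Programmed hold time: 1 min, then n cycles of (0.5 + 6) min, then 8 min."""
    return 1.0 + n_cycles * (0.5 + 6.0) + 8.0


total_ul = sum(components_ul.values())
mg_mM = final_concentration(25.0, components_ul["25 mM Mg(OAc)2"], total_ul)
dntp_mM = final_concentration(2.0, components_ul["2 mM dNTP"], total_ul)
buffer_x = final_concentration(10.0, components_ul["10x Tth PCR reaction buffer"], total_ul)

if __name__ == "__main__":
    print(f"total volume: {total_ul:.1f} ul")
    print(f"Mg(OAc)2: {mg_mM:.2f} mM, dNTP: {dntp_mM:.2f} mM, buffer: {buffer_x:.1f}x")
    print(f"primary PCR: {cycling_minutes(35)} min, nested PCR: {cycling_minutes(25)} min")
```

The listed volumes add up to the stated 50 µl reaction, which gives a quick consistency check when the recipe is scaled to a different reaction size.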
[0077] 6. DNA sequencing to confirm the cloned DNA fragment. PCR products were purified from the agarose gel following electrophoresis and cloned into a TA vector, pT7Blue™ (Novagen). DNA sequencing was performed by the dideoxynucleotide chain termination method using a T7 Sequencing Kit purchased from Pharmacia. Complete sequences of these promoter regions were obtained by automatic sequencing using a dRhodamine Terminator Cycle Sequencing Ready Reaction Kit (Perkin-Elmer) and an ABI 377 automatic sequencing machine.
[0078] The isolated cytokeratin DNA fragment comprising the gene promoter is 2.2 kb. In the 3' proximal region, immediately upstream of a portion identical to the 5' part of the CK cDNA sequence, there is a putative TATA box perfectly matching a consensus TATA box sequence. The 164 bp of the 3' region is identical to the 5' UTR (untranslated region) of the cytokeratin cDNA. Thus, the isolated fragment was indeed derived from the same gene as the cytokeratin cDNA clone (SEQ ID NO:7). Similarly, a 1.5 kb 5' flanking region was isolated from the muscle creatine kinase gene, a putative TATA box was also found in its 3' proximal region and the 3' region is identical to the 5' portion of the MCK cDNA clone (SEQ ID NO:8). For MLC2f, a 2 kb region was isolated from the fast skeletal muscle isoform of myosin light chain 2 gene and sequenced completely. The promoter sequence for MLC2f is shown in SEQ ID NO:22. The sequence immediately upstream of the gene specific primer M2 is identical to the 5' UTR of the MLC2f cDNA clone; thus, the amplified DNA fragments are indeed derived from the MLC2f gene. A perfect TATA box was found 30 nucleotides upstream of the transcription start site, which was defined by a primer extension experiment based on Sambrook et al. (1989). In the 2-kb region comprising the promoter, six E-boxes (CANNTG) and six potential MEF2 binding sites [(C/T)TA(T/A)4TA(A/G)] were found and are indicated in SEQ ID NO:22. Both of these cis-element classes are important for muscle specific gene transcription (Schwarz et al., 1993; Olson et al., 1995). A 2.2 kb fragment was amplified for the ARP gene. By alignment of its sequence with the ARP cDNA clone, a 1.3 kb intron was found in the 5' UTR (SEQ ID NO:9). As a result, the isolated ARP promoter is within a DNA fragment about 0.8 kb long.
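The two consensus motifs named above can be located in a sequence with regular expressions, as in the following Python sketch. The test sequence is a short made-up string assembled for the illustration and is not SEQ ID NO:22; the pattern and function names are likewise assumptions. Only the given strand is scanned (the E-box consensus is palindromic, the MEF2 consensus is not), and a lookahead is used so that overlapping sites are all reported.

```python
import re

# E-box consensus CANNTG and MEF2 consensus (C/T)TA(T/A)4TA(A/G).
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")
MEF2 = re.compile(r"(?=([CT]TA[TA]{4}TA[AG]))")


def find_sites(pattern, sequence: str) -> list:
    """Return (1-based start position, matched site) for every occurrence."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical demonstration sequence: two E-boxes flanking one MEF2-like site.
demo = "gg" + "cagctg" + "aa" + "ctaaaaatag" + "cc" + "catatg" + "tt"

if __name__ == "__main__":
    print("E-boxes:", find_sites(EBOX, demo))
    print("MEF2 sites:", find_sites(MEF2, demo))
```

Applied to a real promoter sequence, the same two patterns would give candidate positions only; whether a given site is functional is a separate experimental question.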
[0079] Example III: Generation of green fluorescent transgenic fish. The isolated zebrafish gene promoters were inserted into the plasmid pEGFP-1 (Clontech), which contains an EGFP structural gene whose codons have been optimized according to preferable human codons. Three promoter fragments were inserted into pEGFP-1 at the EcoRI and BamHI sites and the resulting recombinant plasmids were named pCK-EGFP (FIG. 4), pMCK-EGFP (FIG. 5), and pARP-EGFP (FIG. 6), respectively. The promoter fragment for the MLC2f gene was inserted into the HindIII and BamHI sites of the plasmid pEGFP-1 and the resulting chimeric DNA construct, pMLC2f-EGFP, is diagramed in FIG. 7.
[0080] Linearized plasmid DNAs at concentrations of 500 µg/ml (for pCK-EGFP and pMCK-EGFP) and 100 µg/ml (for pMLC2f-EGFP) in 0.1 M Tris-HCl (pH 7.6)/0.25% phenol red were injected into the cytoplasm of 1- or 2-cell stage embryos. Because of a high mortality rate, pARP-EGFP was injected at a lower concentration (50 µg/ml). Each embryo received 300-500 pl of DNA solution. The injected embryos were reared in autoclaved Holtfreter's solution (0.35% NaCl, 0.01% KCl and 0.01% CaCl2) supplemented with 1 µg/ml of methylene blue. Expression of GFP was observed and photographed under a ZEISS Axiovert 25 fluorescence microscope.
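The amount of DNA delivered per embryo follows from the stated concentrations and injection volumes. The Python sketch below performs the unit conversion (1 µg/ml equals 0.001 pg/pl); the function name and the dictionary are assumptions introduced for the illustration, and the results are nominal values that ignore any variation in the injected volume.

```python
# DNA concentrations used for injection, in µg/ml, as stated in the text.
concentrations_ug_per_ml = {
    "pCK-EGFP": 500.0,
    "pMCK-EGFP": 500.0,
    "pMLC2f-EGFP": 100.0,
    "pARP-EGFP": 50.0,
}


def injected_mass_pg(conc_ug_per_ml: float, volume_pl: float) -> float:
    """Mass of DNA in picograms: 1 µg/ml is 0.001 pg/pl, so divide by 1000."""
    return conc_ug_per_ml * volume_pl / 1000.0


if __name__ == "__main__":
    for name, conc in concentrations_ug_per_ml.items():
        low = injected_mass_pg(conc, 300.0)   # smallest stated volume, 300 pl
        high = injected_mass_pg(conc, 500.0)  # largest stated volume, 500 pl
        print(f"{name}: {low:g}-{high:g} pg per embryo")
```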
[0081] When zebrafish embryos received pCK-EGFP, GFP expression started about 4 hours after injection, which corresponds to the stage of approximately 30% epiboly. About 55% of the injected embryos expressed GFP at this stage. The early expression was always in the superficial layer of cells, mimicking endogenous expression of the CK gene as observed by in situ hybridization. At later stages, in all GFP-expressing fish, GFP was found predominantly in skin epithelia. A typical pCK-EGFP transgenic zebrafish fry at 4 days old is shown in FIG. 8.
[0082] Under the MCK promoter, no GFP expression was observed in early embryos before muscle cells become differentiated. By 24 hpf, about 12% of surviving embryos expressed GFP strongly in muscle cells and these GFP-positive embryos remain GFP-positive after hatching. The GFP expression was always found in many bundles of muscle fibers, mainly in the mid-trunk region and no expression was ever found in other types of cells. A typical pMCK-EGFP transgenic zebrafish fry (3 days old) is shown in FIG. 9.
[0083] Expression of pARP-EGFP was first observed 4 hours after injection at the 30% epiboly stage. The timing of expression is similar to that of pCK-EGFP-injected embryos. However, unlike the pCK-EGFP transgenic embryos, the GFP expression under the ARP promoter occurred not only in the superficial layer of cells but also in deep layers of cells. In some batches of injected embryos, almost 100% of the injected embryos expressed GFP initially. At later stages when some embryonic cells become overtly differentiated, it was found that the GFP expression occurred in essentially all different types of cells such as skin epithelia, muscle cells, lens, neural tissues, notochord, circulating blood cells and yolk cells (FIG. 10).
[0084] Under the MLC2f promoter, nearly 60% of the embryos expressed GFP. The earliest GFP expression started in trunk skeletal muscles about 19 hours after injection, which corresponds to the stage of 20-somite. Later, the GFP expression also occurred in head skeletal muscles including eye muscles, jaw muscles, gill muscles etc.
[0085] Transgenic founder zebrafish containing pMLC2f-EGFP emit a strong green fluorescent light under a blue or ultraviolet light (FIG. 11A). When the transgenic founders were crossed with wild-type fish, transgenic offspring were obtained that also displayed strong green fluorescence (FIG. 11B). The level of GFP expression is so high in the transgenic founders and offspring that green fluorescence can be observed when the fish are exposed to sunlight.
[0086] To identify the DNA elements conferring the strong promoter activity in skeletal muscles, deletion analysis of the 2-kb DNA fragment comprising the promoter was performed. Several deletion constructs, which contain 5' deletions of the MLC2f promoter upstream of the EGFP gene, were injected into the zebrafish embryos and the transient expression of GFP in early embryos (19-72 hpf) was compared. To facilitate the quantitative analysis of GFP expression, we define the level of expression as follows (FIGS. 12A-12C): Strong expression: GFP expression was detected in essentially 100% of the muscle fibers in the trunk.
[0087] Moderate expression: GFP expression was detected in several bundles of muscle fibers, usually in the mid-trunk region.
[0088] Weak expression: GFP expression occurred in dispersed muscle fibers and the number of GFP positive fibers is usually less than 20 per embryo.
[0089] As shown in FIG. 13, deletion up to 283 bp maintained the GFP expression in skeletal muscles in 100% of the expressing embryos; however, the level of GFP expression from these deletion constructs varies greatly. Strong expression drops from 23% to 0% from the 2-kb (-2011 bp) promoter to the 283-bp promoter. Thus, only two constructs (2011 bp and 1338 bp) are capable of maintaining the high level of expression and the highest expression was obtained only with the 2-kb promoter, indicating the importance of the promoter region of 1338 bp to 2011 bp for conferring the highest promoter activity.
[0090] The expression of GFP using pMLC2f-EGFP is much higher than that obtained using the pMCK-EGFP that contains a 1.5 kb of zebrafish MCK promoter (Singapore Patent Application 9900811-2). By the same assay in transient transgenic zebrafish embryos, only about 12% of the embryos injected with pMCK-EGFP expressed GFP. Among the expressing embryos, no strong expression was observed, and 70% and 30% showed moderate and weak expression, respectively. In comparison, about 60% of the embryos injected with pMLC2f-EGFP expressed GFP and 23%, 37% and 40% showed strong, moderate and weak expression, respectively.
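The two constructs can also be compared on the basis of all injected embryos by multiplying the fraction of embryos that expressed GFP by the fraction showing each expression level. The Python sketch below does this with the percentages stated above; because the source gives the expressing fractions only as "about" 12% and 60%, the results are approximate, and the data structure and function name are assumptions introduced for the illustration.

```python
# Fractions taken from the text: share of injected embryos expressing GFP,
# and the distribution of expression levels among the expressing embryos.
constructs = {
    "pMCK-EGFP": {"expressing": 0.12, "strong": 0.00, "moderate": 0.70, "weak": 0.30},
    "pMLC2f-EGFP": {"expressing": 0.60, "strong": 0.23, "moderate": 0.37, "weak": 0.40},
}


def percent_of_injected(construct: str, level: str) -> float:
    """Percent of all injected embryos showing the given expression level."""
    data = constructs[construct]
    return 100.0 * data["expressing"] * data[level]


if __name__ == "__main__":
    for name in constructs:
        summary = ", ".join(
            f"{level} {percent_of_injected(name, level):.1f}%"
            for level in ("strong", "moderate", "weak")
        )
        print(f"{name}: {summary}")
```

On this basis, roughly 14% of all embryos injected with pMLC2f-EGFP show strong expression, against none for pMCK-EGFP.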
[0091] Example IV: Potential applications of fluorescent transgenic fish. The fluorescent transgenic fish have use as ornamental fish in the market. Stably transgenic lines can be developed by breeding a GFP transgenic individual with a wild type fish or another transgenic fish. By isolation of more zebrafish gene promoters, such as eye-specific, bone-specific, tail-specific etc., and/or by classical breeding of these transgenic zebrafish, more varieties of fluorescent transgenic zebrafish can be produced. Previously, we have reported isolation of over 200 distinct zebrafish cDNA clones homologous to known genes (Gong et al., 1997). These isolated clones code for proteins in a variety of tissues and some of them are inducible by heat-shock, heavy metals, or hormones such as estrogens. By using the method of PCR amplification using gene-specific primers designed from the nucleotide sequences of these cDNAs, and the linker-specific primers described herein, the promoters of the genes represented by the cDNAs of Gong et al. can be used in the present invention. Thus, hormone-inducible promoters, heavy-metal inducible promoters and the like from zebrafish can be isolated and used to make fluorescent zebrafish (or other fish species) that express a GFP or variant thereof, in response to the relevant compound.
[0092] Multiple color fluorescent fish may be generated by the same technique, as blue fluorescent protein (BFP) gene, yellow fluorescent protein (YFP) gene and cyan fluorescent protein (CFP) gene are available from Clontech. For example, a transgenic fish with GFP under an eye-specific promoter, BFP under a skin-specific promoter, and YFP under a muscle-specific promoter will show the following multiple fluorescent colors: green eyes, blue skin and yellow muscle. By recombining different tissue specific promoters and fluorescent protein genes, more varieties of transgenic fish of different fluorescent color patterns will be created. By expression of two or more different fluorescent proteins in the same tissue, an intermediate color may be created. For example, by expression of both GFP and BFP under a skin-specific promoter, a dark-green skin color may be created.
[0093] By using a promoter inducible by a heavy metal (such as cadmium, cobalt or chromium) or by a hormone (such as estrogen, androgen or another steroid hormone), a biosensor system may be developed for monitoring environmental pollution and for evaluating water quality for human consumption and aquacultural uses. In such a biosensor system, the transgenic fish will glow with a green fluorescence (or other color depending on the fluorescence protein gene used) when pollutants such as heavy metals and estrogens (or their derivatives) reach a threshold concentration in an aquatic environment. Such a biosensor system has advantages over classical analytical methods because it is rapid, visualizable, and capable of identifying specific compounds directly in complex mixtures found in an aquatic environment, and is portable or less instrument dependent. Moreover, the biosensor system also provides direct information on biotoxicity and it is biodegradable and regenerative.
[0094] Environmental monitoring of several substances can be accomplished by creating one transgenic fish having genes encoding different colored fluorescent proteins driven by promoters responsive to each substance. The particular colors exhibited by the fish in an environment can then be observed. Alternatively, a number of fish can be transformed with individual vectors, then the fish can be combined into a population for monitoring an environment and the colors expressed by each fish observed.
[0095] In addition, the fluorescent transgenic fish should also be valuable in the market for scientific research tools because they can be used for embryonic studies such as tracing cell lineage and cell migration. Cells from transgenic fish expressing GFP can also be used as cellular and genetic markers in cell transplantation and nuclear transplantation experiments.
[0096] The chimeric gene constructs demonstrated successfully in zebrafish in the present invention should also be applicable to other fish species such as medaka, goldfish, carp including koi, loach, tilapia, glassfish, catfish, angel fish, discus, eel, tetra, goby, gourami, guppy, Xiphophorus (swordtail), hatchet fish, Molly fish, Pangasius, etc. The promoters described herein can be used directly in these fish species. Alternatively, the homologous gene promoters from other fish species can be isolated by the method described in this invention. For example, the isolated and characterized zebrafish cDNA clones and promoters described in this invention can be used as molecular probes to screen for homologous promoters in other fish species by molecular hybridization or by PCR. Alternatively, one can first isolate the zebrafish cDNA and promoters based on the sequences presented in SEQ ID NOS:1, 3, 5, 7, 8, 9, 20 and 22 or using data from other sequences of cDNAs disclosed by Gong et al. 1997, by PCR and then use the zebrafish gene fragments to obtain homologous genes from other fish species by the methods mentioned above.
[0097] In addition, a strong muscle-specific promoter such as MLC2f is valuable to direct a gene to be expressed in muscle tissues for generation of other beneficial transgenic fish. For example, transgenic expression of a growth hormone gene under the muscle-specific promoter may stimulate somatic growth of transgenic fish. Such DNA can be introduced either by microinjection, electroporation, or sperm carrier to generate germ-line transgenic fish, or by direct injection of naked DNA into skeletal muscles (Xu et al., 1999) or into other tissues or cavities, or by a biolistic method (gene bombardment or gene gun) (Gomez-Chiarri et al., 1996).
Sequence Listing
Number of SEQ ID NOS: 24
SEQ ID NO: 1
Length: 2480
Type: DNA
Organism: Danio rerio
Feature: primer bind, location (66)..(85), CK2
Feature: CDS, location (90)..(1586)
Feature: primer bind, location (97)..(120), CK1
Feature: polyA signal, location (2463)..(2480)
Sequence: 1
ctctcctttg tgagcaacct cctccactca ctcctctctc agagagcact ctcgtacctc   60
cttctcagca actcaaagac acaggcatc atg tca acc agg tct atc tct tac   113
                                Met Ser Thr Arg Ser Ile Ser Tyr
                                1               5
tcc agc ggt ggc tcc atc agg agg ggc tac acc agc cag tca gcc tat   161
Ser Ser Gly Gly Ser Ile Arg Arg Gly Tyr Thr Ser Gln Ser Ala Tyr
    10                  15                  20
gca gta cct gcc ggc tct acc agg atg agc tca gtg acc agt gtc agg   209
Ala Val Pro Ala Gly Ser Thr Arg Met Ser Ser Val Thr Ser Val Arg
25                  30                  35                  40
aga tct ggt gtg ggt gcc agc cca ggc ttc ggt gcc ggt ggc agc tac   257
Arg Ser Gly Val Gly Ala Ser Pro Gly Phe Gly Ala Gly Gly Ser Tyr
                45                  50                  55
agc ttt agc agc agc agc atg ggt gga ggc tat gga agt ggt ctt ggt   305
Ser Phe Ser Ser Ser Ser Met Gly Gly Gly Tyr Gly Ser Gly Leu Gly
            60                  65                  70
gga ggt ctc ggt ggg ggc atg ggc ttt cgt tgc ggg ctt cct atc aca   353
Gly Gly Leu Gly Gly Gly Met Gly Phe Arg Cys Gly Leu Pro Ile Thr
        75                  80                  85
gct gta act gtc aac cag aac ctg ttg gcc ccc tta aac ctg gaa atc   401
Ala Val Thr Val Asn Gln Asn Leu Leu Ala Pro Leu Asn Leu Glu Ile
    90                  95                  100
gac ccc aca att caa gct gtc cgc act tca gag aaa gag cag att aag   449
Asp Pro Thr Ile Gln Ala Val Arg Thr Ser Glu Lys Glu Gln Ile Lys
105                 110                 115                 120
acc ttc aac aac cgc ttc gct ttc ctc atc gac aaa gtg cgc ttc ctg   497
Thr Phe Asn Asn Arg Phe Ala Phe Leu Ile Asp Lys Val Arg Phe Leu
                125                 130                 135
gaa cag cag aac aag atg ctt gag acc aaa tgg agt ctt ctc caa gaa   545
Glu Gln Gln Asn Lys Met Leu Glu Thr Lys Trp Ser Leu Leu Gln Glu
            140                 145                 150
cag aca acc aca cgt tcc aac atc gat gcc atg ttt gag gca tac atc   593
Gln Thr Thr Thr Arg Ser Asn Ile Asp Ala Met Phe Glu Ala Tyr Ile
        155                 160                 165
tct aac ctg cgc aga cag ctc gat gga ctg gga aat gag aag atg aag   641
Ser Asn Leu Arg Arg Gln Leu Asp Gly Leu Gly Asn Glu Lys Met Lys
    170                 175                 180
ctg gag gga gag ctg aag aac atg cag ggc ctg gtt gag gac ttc aag   689
Leu Glu Gly Glu Leu Lys Asn Met Gln Gly Leu Val Glu Asp Phe Lys
185                 190                 195                 200
aac aag tac gag gat gag atc aac aag cgt gct tcc gta gag aat gag   737
Asn Lys Tyr Glu Asp Glu Ile Asn Lys Arg Ala Ser Val Glu Asn Glu
                205                 210                 215
ttt gtc ctg ctc aag aag gat gtt gat gca gcc tac atg aac aag gtt   785
Phe Val Leu Leu Lys Lys Asp Val Asp Ala Ala Tyr Met Asn Lys Val
            220                 225                 230
gag ctt gaa gcc aag gtt gat gct ctt cag gat gag atc aac ttc ctc   833
Glu Leu Glu Ala Lys Val Asp Ala Leu Gln Asp Glu Ile Asn Phe Leu
        235                 240                 245
agg gca gtc tac gag gct gaa ctc cgg gag ctc cag tct cag atc aag   881
Arg Ala Val Tyr Glu Ala Glu Leu Arg Glu Leu Gln Ser Gln Ile Lys
    250                 255                 260
gac aca tct gtt gtt gta gaa atg gac aac agc aga aac ctg gat atg   929
Asp Thr Ser Val Val Val Glu Met Asp Asn Ser Arg Asn Leu Asp Met
265                 270                 275                 280
gac tcc atc gtg gct gaa gtt cgc gct cag tat gaa gac atc gcc aac   977
Asp Ser Ile Val Ala Glu Val Arg Ala Gln Tyr Glu Asp Ile Ala Asn
                285                 290                 295
cgc agc cgt gcc gag gca gag agc tgg tac aaa cag aag ttt gag gag   1025
Arg Ser Arg Ala Glu Ala Glu Ser Trp Tyr Lys Gln Lys Phe Glu Glu
            300                 305                 310
atg cag agc acc gct ggt cag tat ggt gat gac ctc cgc tca aca aag   1073
Met Gln Ser Thr Ala Gly
Gln Tyr Gly Asp Asp Leu Arg Ser Thr Lys 315 320
325gct gag att gct gaa ctc aac cgc atg atc gcc cgc ctg cag
aac gag 1121Ala Glu Ile Ala Glu Leu Asn Arg Met Ile Ala Arg Leu Gln
Asn Glu 330 335 340atc gat gct gtc aag
gca cag cgt gcc aac ttg gag gct cag att gct 1169Ile Asp Ala Val Lys
Ala Gln Arg Ala Asn Leu Glu Ala Gln Ile Ala345 350
355 360gag gct gaa gag cgt gga gag ctg gca gtg
aag gat gcc aag ctc cgc 1217Glu Ala Glu Glu Arg Gly Glu Leu Ala Val
Lys Asp Ala Lys Leu Arg 365 370
375atc agg gag ctg gag gaa gct ctt cag agg gcc aag caa gac atg gcc
1265Ile Arg Glu Leu Glu Glu Ala Leu Gln Arg Ala Lys Gln Asp Met Ala
380 385 390cgc cag gtc cgc gag tac
cag gag ctc atg aac gtc aaa ttg gct ctg 1313Arg Gln Val Arg Glu Tyr
Gln Glu Leu Met Asn Val Lys Leu Ala Leu 395 400
405gac att gag atc gcc acc tac agg aaa ctg ttg gaa gga gag
gag agc 1361Asp Ile Glu Ile Ala Thr Tyr Arg Lys Leu Leu Glu Gly Glu
Glu Ser 410 415 420aga ctg tcc agc ggt
gga gct caa gct acc att cat gtt cag cag acc 1409Arg Leu Ser Ser Gly
Gly Ala Gln Ala Thr Ile His Val Gln Gln Thr425 430
435 440tcc gga ggt gtt tca tct ggt tat ggt ggt
agc ggc tct ggt ttc ggc 1457Ser Gly Gly Val Ser Ser Gly Tyr Gly Gly
Ser Gly Ser Gly Phe Gly 445 450
455tac agc agt ggc ttc agc agt ggt ggg tca gga tac ggt agt gga tca
1505Tyr Ser Ser Gly Phe Ser Ser Gly Gly Ser Gly Tyr Gly Ser Gly Ser
460 465 470gga ttc ggt tct gga tca
ggg tat ggt gga ggc tcc atc agc aaa acc 1553Gly Phe Gly Ser Gly Ser
Gly Tyr Gly Gly Gly Ser Ile Ser Lys Thr 475 480
485agt gtc acc acc gtc agc agt aaa cgc tat taa ggagaagccc
gcccaaaccc 1606Ser Val Thr Thr Val Ser Ser Lys Arg Tyr 490
495ccagccgaca cagtttccaa ccttccttac ctgcaactag atcccttctg
aaccttctta 1666cgactcaaac catctatggt gctatatttt agccagacag ctgtcccctg
ttaatgagga 1726gatgtggacg atgattttta aagtacaaaa taagttttag attgttctgt
gtgttgatgg 1786tagttacccg tatcatgcat ctcctgtctg gtggtgtcac tgccatttta
aatcatcaac 1846ccaactacac taaaacgata ccaggaagaa tcgtgctcca agccactgaa
tagtcttatt 1906tctgcactga tatgtacagg gaaagtgaga cacatagaaa ccactgtaac
ctacgtagta 1966ctatggtttc actggatcag gggtgtgcta tacaagttcc tgaatgtctt
gtttgaatgt 2026tttgtgctgt tacaagctcc ctgctgtagt tttgctgact aatctgactt
ttgtcatttt 2086gctatggctg tcagagttgg tttacctatt ttctataaaa tgtatatggc
agtcagccaa 2146taactgatga caattgcttg tgggctacta atgtccagtt acctcacatt
caagggagat 2206ctgttacagc aaaaaacagg cacaatggga tttatgtgga ccatccctcc
ttaaccttgt 2266gtactttccg tgttggaagt ggtgactgta ctgccttaca cattcccctg
tattcaactg 2326gcttccagag catattttac atccccggtt ataaatggaa aatgcaagaa
aactgaaaca 2386atgttcaacc agatttattt ggtattgatt gacgagacac caacttgaaa
tttgaataca 2446ataaatctga gaccacaaaa aaaaaaaaaa aaaa
2480
SEQ ID NO: 2 (length 498, PRT, Danio rerio)
Met Ser Thr Arg Ser Ile Ser Tyr Ser
Ser Gly Gly Ser Ile Arg Arg1 5 10
15Gly Tyr Thr Ser Gln Ser Ala Tyr Ala Val Pro Ala Gly Ser Thr
Arg 20 25 30Met Ser Ser Val
Thr Ser Val Arg Arg Ser Gly Val Gly Ala Ser Pro 35
40 45Gly Phe Gly Ala Gly Gly Ser Tyr Ser Phe Ser Ser
Ser Ser Met Gly 50 55 60Gly Gly Tyr
Gly Ser Gly Leu Gly Gly Gly Leu Gly Gly Gly Met Gly65 70
75 80Phe Arg Cys Gly Leu Pro Ile Thr
Ala Val Thr Val Asn Gln Asn Leu 85 90
95Leu Ala Pro Leu Asn Leu Glu Ile Asp Pro Thr Ile Gln Ala
Val Arg 100 105 110Thr Ser Glu
Lys Glu Gln Ile Lys Thr Phe Asn Asn Arg Phe Ala Phe 115
120 125Leu Ile Asp Lys Val Arg Phe Leu Glu Gln Gln
Asn Lys Met Leu Glu 130 135 140Thr Lys
Trp Ser Leu Leu Gln Glu Gln Thr Thr Thr Arg Ser Asn Ile145
150 155 160Asp Ala Met Phe Glu Ala Tyr
Ile Ser Asn Leu Arg Arg Gln Leu Asp 165
170 175Gly Leu Gly Asn Glu Lys Met Lys Leu Glu Gly Glu
Leu Lys Asn Met 180 185 190Gln
Gly Leu Val Glu Asp Phe Lys Asn Lys Tyr Glu Asp Glu Ile Asn 195
200 205Lys Arg Ala Ser Val Glu Asn Glu Phe
Val Leu Leu Lys Lys Asp Val 210 215
220Asp Ala Ala Tyr Met Asn Lys Val Glu Leu Glu Ala Lys Val Asp Ala225
230 235 240Leu Gln Asp Glu
Ile Asn Phe Leu Arg Ala Val Tyr Glu Ala Glu Leu 245
250 255Arg Glu Leu Gln Ser Gln Ile Lys Asp Thr
Ser Val Val Val Glu Met 260 265
270Asp Asn Ser Arg Asn Leu Asp Met Asp Ser Ile Val Ala Glu Val Arg
275 280 285Ala Gln Tyr Glu Asp Ile Ala
Asn Arg Ser Arg Ala Glu Ala Glu Ser 290 295
300Trp Tyr Lys Gln Lys Phe Glu Glu Met Gln Ser Thr Ala Gly Gln
Tyr305 310 315 320Gly Asp
Asp Leu Arg Ser Thr Lys Ala Glu Ile Ala Glu Leu Asn Arg
325 330 335Met Ile Ala Arg Leu Gln Asn
Glu Ile Asp Ala Val Lys Ala Gln Arg 340 345
350Ala Asn Leu Glu Ala Gln Ile Ala Glu Ala Glu Glu Arg Gly
Glu Leu 355 360 365Ala Val Lys Asp
Ala Lys Leu Arg Ile Arg Glu Leu Glu Glu Ala Leu 370
375 380Gln Arg Ala Lys Gln Asp Met Ala Arg Gln Val Arg
Glu Tyr Gln Glu385 390 395
400Leu Met Asn Val Lys Leu Ala Leu Asp Ile Glu Ile Ala Thr Tyr Arg
405 410 415Lys Leu Leu Glu Gly
Glu Glu Ser Arg Leu Ser Ser Gly Gly Ala Gln 420
425 430Ala Thr Ile His Val Gln Gln Thr Ser Gly Gly Val
Ser Ser Gly Tyr 435 440 445Gly Gly
Ser Gly Ser Gly Phe Gly Tyr Ser Ser Gly Phe Ser Ser Gly 450
455 460Gly Ser Gly Tyr Gly Ser Gly Ser Gly Phe Gly
Ser Gly Ser Gly Tyr465 470 475
480Gly Gly Gly Ser Ile Ser Lys Thr Ser Val Thr Thr Val Ser Ser Lys
485 490 495Arg
Tyr
SEQ ID NO: 3 (length 1589, DNA, Danio rerio)
Features: primer bind (6)..(26), MCK2; primer bind (20)..(38), MCK1; CDS (86)..(1231); polyA signal (1534)..(1539)
cctatttcgg
cttggtgaac aggatctgat cccaaggact gttaccactt ttgttgtctt 60ttgtgcagtg
ttagaaaccg caatc atg cct ttc gga aac acc cac aac aac 112
Met Pro Phe Gly Asn Thr His Asn Asn
1 5ttc aag ctg aac tac tca gtt gat gag gag tat cca gac
ctt agc aag 160Phe Lys Leu Asn Tyr Ser Val Asp Glu Glu Tyr Pro Asp
Leu Ser Lys10 15 20
25cac aac aac cac atg gcc aag gtg ctg act aag gaa atg tat ggc aag
208His Asn Asn His Met Ala Lys Val Leu Thr Lys Glu Met Tyr Gly Lys
30 35 40ctt agg gac aag cag acc
cca cct gga ttc act gtg gat gat gtc atc 256Leu Arg Asp Lys Gln Thr
Pro Pro Gly Phe Thr Val Asp Asp Val Ile 45 50
55cag act ggt gtt gac aat cca ggc cac ccc ttc atc atg
acc gtc ggc 304Gln Thr Gly Val Asp Asn Pro Gly His Pro Phe Ile Met
Thr Val Gly 60 65 70tgt gtt gct
ggt gat gag gag tcc tac gat gtt ttc aag gac ctg ttc 352Cys Val Ala
Gly Asp Glu Glu Ser Tyr Asp Val Phe Lys Asp Leu Phe 75
80 85gac ccc gtc att tcc gac cgt cac ggt gga tac aag
gca act gac aag 400Asp Pro Val Ile Ser Asp Arg His Gly Gly Tyr Lys
Ala Thr Asp Lys90 95 100
105cac aag acc gac ctc aac ttt gag aac ctg aag ggt ggt gat gac ctg
448His Lys Thr Asp Leu Asn Phe Glu Asn Leu Lys Gly Gly Asp Asp Leu
110 115 120gac ccc aac tac ttc
ctg agc agc cgt gtg cgt acc gga cgc agc atc 496Asp Pro Asn Tyr Phe
Leu Ser Ser Arg Val Arg Thr Gly Arg Ser Ile 125
130 135aag gga tac ccc ctg ccc ccc cac aac agc cgt gga
gag cgc aga gct 544Lys Gly Tyr Pro Leu Pro Pro His Asn Ser Arg Gly
Glu Arg Arg Ala 140 145 150gtg gag
aag ctg tct gtt gaa gct ctg agt agc ttg gat gga gag ttc 592Val Glu
Lys Leu Ser Val Glu Ala Leu Ser Ser Leu Asp Gly Glu Phe 155
160 165aag ggc aag tac tac ccc ctg aag tcc atg act
gat gac gag cag gag 640Lys Gly Lys Tyr Tyr Pro Leu Lys Ser Met Thr
Asp Asp Glu Gln Glu170 175 180
185cag ctg atc gct gac cac ttc ctc ttt gac aaa ccc gtc tcc ccc ctg
688Gln Leu Ile Ala Asp His Phe Leu Phe Asp Lys Pro Val Ser Pro Leu
190 195 200ctg ctg gct gct ggt
atg gcc cgt gac tgg ccc gat gcc aga ggc att 736Leu Leu Ala Ala Gly
Met Ala Arg Asp Trp Pro Asp Ala Arg Gly Ile 205
210 215tgg cac aat gag aac aaa gcc ttc ctg gtc tgg gtg
aaa cag gag gat 784Trp His Asn Glu Asn Lys Ala Phe Leu Val Trp Val
Lys Gln Glu Asp 220 225 230cac ctg
cgt gtc att tcc atg cag aag ggt ggc aac atg aag gaa gtg 832His Leu
Arg Val Ile Ser Met Gln Lys Gly Gly Asn Met Lys Glu Val 235
240 245ttc aag cgc ttc tgc gtt ggt ctt cag agg att
gag gaa att ttc aag 880Phe Lys Arg Phe Cys Val Gly Leu Gln Arg Ile
Glu Glu Ile Phe Lys250 255 260
265aag cac aac cat ggg ttc atg tgg aac gag cat ctt ggt ttc gtc ctg
928Lys His Asn His Gly Phe Met Trp Asn Glu His Leu Gly Phe Val Leu
270 275 280acc tgc ccc tcc aac
ctg ggc aca ggc ctg cgc ggt gga gtc cac gtc 976Thr Cys Pro Ser Asn
Leu Gly Thr Gly Leu Arg Gly Gly Val His Val 285
290 295aag ctg ccc aag ctc agc aca cat gcc aag ttt gag
gag atc ctg acc 1024Lys Leu Pro Lys Leu Ser Thr His Ala Lys Phe Glu
Glu Ile Leu Thr 300 305 310aga ctg
cgc ctg cag aag cgt ggc aca ggg ggt gtg gac acc gct tcc 1072Arg Leu
Arg Leu Gln Lys Arg Gly Thr Gly Gly Val Asp Thr Ala Ser 315
320 325gtt ggt gga gtg ttt gac att tcc aac gct gac
cgt atc ggc tct tca 1120Val Gly Gly Val Phe Asp Ile Ser Asn Ala Asp
Arg Ile Gly Ser Ser330 335 340
345gag gtt gag cag gtg cag tgt gtg gtt gat ggt gtc aag ctg atg gtg
1168Glu Val Glu Gln Val Gln Cys Val Val Asp Gly Val Lys Leu Met Val
350 355 360gag atg gag aag aag
ctg gga gaa ggc cag tcc atc gac agc atg atc 1216Glu Met Glu Lys Lys
Leu Gly Glu Gly Gln Ser Ile Asp Ser Met Ile 365
370 375cct gcc cag aag taa agcgggaggc ccttccattt
ttttcttcgt ctttgtctgt 1271Pro Ala Gln Lys 380ttttttacag
tccaacagca acgsagagga aaactgctgc tcaaaaagac agtctcacct 1331ttgcacctgt
cttctttcct ttttttccct tcttctctaa tttccatgtc atttcgccat 1391ctttttttcc
actttgtttc ctattaagtc ggtaacatct tgggatcaga tacccggsgc 1451aggagtgagt
gcttgttgct gaggcttcac ctcaatttca gccttggttg taaaaagtga 1511atcaatcaaa
gttgtatttc aaaataaaaa tccccaataa aaaaaaaaaa aaaaaaaaaa 1571aaaaaaaaaa
aaaaaaaa 1589
SEQ ID NO: 4 (length 381, PRT, Danio rerio)
Met Pro Phe Gly Asn Thr His Asn Asn Phe Lys Leu Asn Tyr Ser Val1
5 10 15Asp Glu Glu Tyr Pro
Asp Leu Ser Lys His Asn Asn His Met Ala Lys 20
25 30Val Leu Thr Lys Glu Met Tyr Gly Lys Leu Arg Asp
Lys Gln Thr Pro 35 40 45Pro Gly
Phe Thr Val Asp Asp Val Ile Gln Thr Gly Val Asp Asn Pro 50
55 60Gly His Pro Phe Ile Met Thr Val Gly Cys Val
Ala Gly Asp Glu Glu65 70 75
80Ser Tyr Asp Val Phe Lys Asp Leu Phe Asp Pro Val Ile Ser Asp Arg
85 90 95His Gly Gly Tyr Lys
Ala Thr Asp Lys His Lys Thr Asp Leu Asn Phe 100
105 110Glu Asn Leu Lys Gly Gly Asp Asp Leu Asp Pro Asn
Tyr Phe Leu Ser 115 120 125Ser Arg
Val Arg Thr Gly Arg Ser Ile Lys Gly Tyr Pro Leu Pro Pro 130
135 140His Asn Ser Arg Gly Glu Arg Arg Ala Val Glu
Lys Leu Ser Val Glu145 150 155
160Ala Leu Ser Ser Leu Asp Gly Glu Phe Lys Gly Lys Tyr Tyr Pro Leu
165 170 175Lys Ser Met Thr
Asp Asp Glu Gln Glu Gln Leu Ile Ala Asp His Phe 180
185 190Leu Phe Asp Lys Pro Val Ser Pro Leu Leu Leu
Ala Ala Gly Met Ala 195 200 205Arg
Asp Trp Pro Asp Ala Arg Gly Ile Trp His Asn Glu Asn Lys Ala 210
215 220Phe Leu Val Trp Val Lys Gln Glu Asp His
Leu Arg Val Ile Ser Met225 230 235
240Gln Lys Gly Gly Asn Met Lys Glu Val Phe Lys Arg Phe Cys Val
Gly 245 250 255Leu Gln Arg
Ile Glu Glu Ile Phe Lys Lys His Asn His Gly Phe Met 260
265 270Trp Asn Glu His Leu Gly Phe Val Leu Thr
Cys Pro Ser Asn Leu Gly 275 280
285Thr Gly Leu Arg Gly Gly Val His Val Lys Leu Pro Lys Leu Ser Thr 290
295 300His Ala Lys Phe Glu Glu Ile Leu
Thr Arg Leu Arg Leu Gln Lys Arg305 310
315 320Gly Thr Gly Gly Val Asp Thr Ala Ser Val Gly Gly
Val Phe Asp Ile 325 330
335Ser Asn Ala Asp Arg Ile Gly Ser Ser Glu Val Glu Gln Val Gln Cys
340 345 350Val Val Asp Gly Val Lys
Leu Met Val Glu Met Glu Lys Lys Leu Gly 355 360
365Glu Gly Gln Ser Ile Asp Ser Met Ile Pro Ala Gln Lys
370 375 380
SEQ ID NO: 5 (length 1104, DNA, Danio rerio)
Features: primer bind (45)..(64), ARP2; CDS (75)..(1034); primer bind (87)..(112), ARK; polyA signal (1069)..(1074)
cgcgtcccta ccgtgagatt ttacaacctt gtctttaaac
cggctgttca ccgatccttg 60gaagcactgc aaag atg ccc agg gaa gac agg gcc
acg tgg aag tcc aac 110 Met Pro Arg Glu Asp Arg Ala
Thr Trp Lys Ser Asn 1 5
10tat ttt ctg aaa atc atc caa ctg ctg gat gac ttc ccc aag tgt ttc
158Tyr Phe Leu Lys Ile Ile Gln Leu Leu Asp Asp Phe Pro Lys Cys Phe
15 20 25atc gtg ggc gca gac aat gtc ggc
tcc aag cag atg cag acc atc cgt 206Ile Val Gly Ala Asp Asn Val Gly
Ser Lys Gln Met Gln Thr Ile Arg 30 35
40ctg tcc ctg cgg ggc aag gcc gtc gtg ctc atg ggg aaa aac acc atg
254Leu Ser Leu Arg Gly Lys Ala Val Val Leu Met Gly Lys Asn Thr Met45
50 55 60atg agg aag gcc att
cgt ggc cac ctg gaa aac aac cca gct ctg gag 302Met Arg Lys Ala Ile
Arg Gly His Leu Glu Asn Asn Pro Ala Leu Glu 65
70 75agg ctg ctt ccc cac atc cgc ggg aac gtg ggc
ttc gtc ttc acc aag 350Arg Leu Leu Pro His Ile Arg Gly Asn Val Gly
Phe Val Phe Thr Lys 80 85
90gag gat ctg act gag gtc cga gac ctg ctg ctg gca aac aaa gtg ccc
398Glu Asp Leu Thr Glu Val Arg Asp Leu Leu Leu Ala Asn Lys Val Pro
95 100 105gct gct gcc cgt gct ggt gcc
atc gcc ccc tgt gag gtg act gtg ccg 446Ala Ala Ala Arg Ala Gly Ala
Ile Ala Pro Cys Glu Val Thr Val Pro 110 115
120gcc cag aac acc ggg ctc ggt cct gag aag acc tct ttc ttc cag gct
494Ala Gln Asn Thr Gly Leu Gly Pro Glu Lys Thr Ser Phe Phe Gln Ala125
130 135 140ttg gga atc acc
acc aag atc tcc aga gga acc att gaa atc ttg agt 542Leu Gly Ile Thr
Thr Lys Ile Ser Arg Gly Thr Ile Glu Ile Leu Ser 145
150 155gac gtt cag ctt atc aaa cct gga gac aag
gtg ggc gcc agc gag gcc 590Asp Val Gln Leu Ile Lys Pro Gly Asp Lys
Val Gly Ala Ser Glu Ala 160 165
170acg ctg ctg aac atg ctg aac atg ctg aac atc tcg ccc ttc tcc tac
638Thr Leu Leu Asn Met Leu Asn Met Leu Asn Ile Ser Pro Phe Ser Tyr
175 180 185ggg ctg atc atc cag cag gtg
tat gat aac ggc agt gtc tac agc ccc 686Gly Leu Ile Ile Gln Gln Val
Tyr Asp Asn Gly Ser Val Tyr Ser Pro 190 195
200gag gtg ctg gac atc act gag gac gcc ctg cac aag agg ttc ctg aag
734Glu Val Leu Asp Ile Thr Glu Asp Ala Leu His Lys Arg Phe Leu Lys205
210 215 220ggt gtg agg aac
atc gcc agt gtg tgt ctg cag atc ggc tac cca act 782Gly Val Arg Asn
Ile Ala Ser Val Cys Leu Gln Ile Gly Tyr Pro Thr 225
230 235ctt gct tcc atc cct cac act atc atc aat
gga tac aag agg gtc ctg 830Leu Ala Ser Ile Pro His Thr Ile Ile Asn
Gly Tyr Lys Arg Val Leu 240 245
250gct gtc act gtc gaa aca gac tac aca ttc ccc ttg gct gag aag gtg
878Ala Val Thr Val Glu Thr Asp Tyr Thr Phe Pro Leu Ala Glu Lys Val
255 260 265aag gcc tac ctg gct gat ccc
acc gct ttc gct gtt gca gcc cct gtt 926Lys Ala Tyr Leu Ala Asp Pro
Thr Ala Phe Ala Val Ala Ala Pro Val 270 275
280gcg gca gct aca gag cag aaa tcc gct gct cct gcg gct aaa gag gag
974Ala Ala Ala Thr Glu Gln Lys Ser Ala Ala Pro Ala Ala Lys Glu Glu285
290 295 300gca ccc aag gag
gat tct gag gag tct gat gaa gac atg ggc ttc ggc 1022Ala Pro Lys Glu
Asp Ser Glu Glu Ser Asp Glu Asp Met Gly Phe Gly 305
310 315ctg ttt gat taa accagacacc gaatatccat
gtctgtttaa catcaataaa 1074Leu Phe Aspacatctggaa aaaaaaaaaa
aaaaaaaaaa 1104
SEQ ID NO: 6 (length 319, PRT, Danio rerio)
Met
Pro Arg Glu Asp Arg Ala Thr Trp Lys Ser Asn Tyr Phe Leu Lys1
5 10 15Ile Ile Gln Leu Leu Asp Asp
Phe Pro Lys Cys Phe Ile Val Gly Ala 20 25
30Asp Asn Val Gly Ser Lys Gln Met Gln Thr Ile Arg Leu Ser
Leu Arg 35 40 45Gly Lys Ala Val
Val Leu Met Gly Lys Asn Thr Met Met Arg Lys Ala 50 55
60Ile Arg Gly His Leu Glu Asn Asn Pro Ala Leu Glu Arg
Leu Leu Pro65 70 75
80His Ile Arg Gly Asn Val Gly Phe Val Phe Thr Lys Glu Asp Leu Thr
85 90 95Glu Val Arg Asp Leu Leu
Leu Ala Asn Lys Val Pro Ala Ala Ala Arg 100
105 110Ala Gly Ala Ile Ala Pro Cys Glu Val Thr Val Pro
Ala Gln Asn Thr 115 120 125Gly Leu
Gly Pro Glu Lys Thr Ser Phe Phe Gln Ala Leu Gly Ile Thr 130
135 140Thr Lys Ile Ser Arg Gly Thr Ile Glu Ile Leu
Ser Asp Val Gln Leu145 150 155
160Ile Lys Pro Gly Asp Lys Val Gly Ala Ser Glu Ala Thr Leu Leu Asn
165 170 175Met Leu Asn Met
Leu Asn Ile Ser Pro Phe Ser Tyr Gly Leu Ile Ile 180
185 190Gln Gln Val Tyr Asp Asn Gly Ser Val Tyr Ser
Pro Glu Val Leu Asp 195 200 205Ile
Thr Glu Asp Ala Leu His Lys Arg Phe Leu Lys Gly Val Arg Asn 210
215 220Ile Ala Ser Val Cys Leu Gln Ile Gly Tyr
Pro Thr Leu Ala Ser Ile225 230 235
240Pro His Thr Ile Ile Asn Gly Tyr Lys Arg Val Leu Ala Val Thr
Val 245 250 255Glu Thr Asp
Tyr Thr Phe Pro Leu Ala Glu Lys Val Lys Ala Tyr Leu 260
265 270Ala Asp Pro Thr Ala Phe Ala Val Ala Ala
Pro Val Ala Ala Ala Thr 275 280
285Glu Gln Lys Ser Ala Ala Pro Ala Ala Lys Glu Glu Ala Pro Lys Glu 290
295 300Asp Ser Glu Glu Ser Asp Glu Asp
Met Gly Phe Gly Leu Phe Asp305 310
315
SEQ ID NO: 7 (length 2241, DNA, Danio rerio)
Features: TATA signal (2103)..(2108); misc feature (2142)..(2235), identical to the 5' CK cDNA; primer bind (2221)..(2241), CK2
ccttcccttc tacttttgac gtccttttaa gattactcat
ctcaaacacc catacaaagg 60tcacacctgg tttatactat gatagttgta cagtgctggc
tgtgacaccc aactgctgcc 120aattgtctga ctatgcaggg tgtctatgcg tatagtttac
agttagacca aagtgtgctg 180gtgtgtgaag taacaaatga caaatactca aattgtaatt
tactaagtag tttaaaaatg 240tagtgcagtg ttggtacttt tatttcactt ttattcttgt
ctatgtggat tagacaaatc 300acatagaagg taaatcacat cataatgaac agcaaactgt
ttgccagcat taaaagaaga 360agactgctta gatgcatgtc actgatgaga aaataacttt
aaacgcacac aagacggcac 420gtaccccaac gcagtgggga cgttgcattt gaactcaacg
tcaggtcgat gtcaatgttc 480ctaatgatgt tacagcttga tgttatgcgg ggattatggt
tgccatacct gatgaataaa 540ggttcgacat tggattttgg tcgctttcca cctatgacat
cgttattgga cgtcaaaata 600aatttaggtc accacaacct atatttaacc tgctgggcaa
taactaaatg cactacagaa 660taaatgcatc agcttttcac agcataatac aaaagctact
tttcactcat actttgagta 720acatttttag gcatgtattg atatttttac cagccctccc
catacataat cgtatgttta 780acattagctt tgttagccgc tagcattact gagcttgtgc
atgaaagcag atttggagct 840gatgattgcc gtaccatgat ctcacacctt gacgattgcg
taatgctatt aaatgcccat 900atttcgtgtt gacttgcacg agaaatgaga tgggaacatt
tatcagtggt cattaaatac 960tatttttgtg ttagcttagc tgcagttttt aactattgta
attaagtagt ttttctcaga 1020tgtactttta ctttcccttg agtacatttt ccttccttca
acctgcagtc actactttat 1080agtcctgtga ttcctgtcca atcaaattgc taccttaaga
catgggccat ttataattgc 1140tgtcaaaaat atttacacgc attaacccag agatgatgga
tgtttactgt atgatgaccg 1200aagacgtcaa catggcgtta ggttgacgtt tgtttagaaa
tgaaaattag gttgacgtca 1260aacatccaat ctaaaatcat atatcaatgt atgttacccc
tatgacgtct atcagacgtt 1320tgtcattatt tgacgttggt ttaagatgtt acacaaccta
aatccaccaa atattaactt 1380acaatatcct tagatgctgg ctagactttg taatattaac
atcttatgat gttgtgtgcc 1440tgttacgttt acacacatgt aaattacatg tcactactta
ctactcttga gtacttttaa 1500atatttacaa ctgatacttt tactcgcact tatgattttt
cagtactctt tccactactg 1560cacatatggt ggagtttaga gccataatct gtgcagaatt
gtgtgtgtgc acattttcca 1620atatcaatac agaaggaaac tgtgttccct gttcccttgt
aaatctcaac aatgcaactg 1680ttcagctcag ggggaaaaat gccctgccag atccaaacgg
ctggcaaaag tgaatggaaa 1740aaagcctttc attaatgtga aagttgctgc gcgccccacc
cagataaaaa gagcagaggt 1800taacatgctc tctacggctg tccagccaac cagatactga
ggcagaaaca cacccgctgg 1860cagatggtga gagctacact gtcttttcca gagtttctac
tggaatgcct gtcctcaagt 1920ctcaagcctc tccttgcatt ctctcattcc acctggggca
aagccccagg ctgggtgtga 1980caacatttat cttaccactt tctctctgta cctgtctaac
aggtagggtg tgtgtgagag 2040tgcgtatgtg tgcaagtgcg tgtgtgtgtg agagcagtca
gctccaccct ctcaagagtg 2100tgtataaaat tggtcagcca gctgctgaga gacacgcaga
gggactttga ctctcctttg 2160tgagcaacct cctccactca ctcctctctc agagagcact
ctcgtacctc cttctcagca 2220actcaaagac acaggatccg g
2241
SEQ ID NO: 8 (length 1456, DNA, Danio rerio)
Features: TATA signal (1389)..(1394); misc feature (1428)..(1453), identical to the 5' MCK cDNA; primer bind (1433)..(1456), MCK2
gaattgcaaa gtcagagtaa taaaatgaaa
ccaaaaaaca tttttaaata tacttgtctc 60tgtggcttaa tcttggctga tgtgtgtgtg
tgtgtgtgtg tacttgacag ctgctagtga 120gcatgtgcac catgacaggc ctgttattca
cacttggtgc catgttggag actgttcggc 180cagctatagt tttcttcaca gagtcctggg
tcacctaatg tcacaaggaa gaaacatgtt 240acatgttaaa atgtgacatt caaattgtag
tgcattactt aacgaaacgc attacacaag 300ttacagctta aaagattgct agacagaaaa
accagggagg ggttttccca taatatccag 360tgagactcta ggagcgggaa cactaacagg
cctccctgag tgagaacatt gcatgtgcgc 420gtgacagaaa accagagatg gaaatacctt
cttttgaatt gcataattgc ttaaaagaag 480acacaacagg gatagttcac ccaaaaaaca
gaccattctt tttttctgtt gaacaaaaat 540taagatattt tgaagaatgc ttaccgaata
acttccatat ttggaaacta attacagtga 600aagtcaatgg gtcttccagc attttttcaa
tataccttac tttgagttca aaagaaaaac 660acatctcaaa taggtttgag gttgaataaa
catttttcat tttggggtgg actatcccta 720attatttgac acttaagatt tatagtaaat
cattttatag actttctccc cttattaaac 780atggttgaat ttatcttcat gtttatgtct
gggttgtgct tttttgaaaa gatttccctg 840tcaaatgttt ttgtgtatgg ttggcgcaca
atagactgaa ctggcctatc acacagactt 900tcataacaac tccagttgat gccctttcac
cctcagtgta taaatatggc gtctgacatg 960agcagattaa acacgacact gcaacaactt
tacctgtaaa aatacaaatt gagtttgcac 1020ccagaatcat gtggtgaacg aagcctacca
agagattttt gaaagccatc ggcctgacac 1080gcgcacttct gatatctgtg gtatgtttgg
caaaagtgct gctcagcctt tttagcatgg 1140cagatcctcc acatcccatc acccctcctt
caacctattc cctcctggaa agctatgtat 1200ggggcgggaa gtgtaaatgg atatgggaag
gaaggggggc accacccaca gctgccacct 1260catctaggat gcctggggcc taaattgaag
cctttcttac actaaacagg gcataagaga 1320ccagcgccag ccaatcataa ttcagtgagc
tctaaaatgg gccagccaat ggctgcaggg 1380gctagaggta tatatatcca aatcaaactc
ttcttgcttg ggtgacccct atttcggctt 1440ggtgaacagg atccgg
1456
SEQ ID NO: 9 (length 2205, DNA, Danio rerio)
Features: misc feature (775)..(791), identical to the 5' ARP cDNA; intron (792)..(2152); misc feature (2153)..(2199), identical to the 5' ARP cDNA; primer bind (2179)..(2205), ARP2
atctgtatta agaaacactt aaaatatata tgcgttacga
attaaaaaca aaacacgatc 60attttaattt gtgttgtata attttacatt ttgtaagtat
tatttttata aaaaatatat 120agaaataata caaatttgtt tacagtattc ttagttattg
caataaacga attttatata 180gaaagagaaa gagttttatt ataagatgtt caatttaaaa
aatggcagaa aatagaaaaa 240tgattgtcaa gatgataaaa gtcagtttag acaaaaaaat
aagatgaaaa acatcaaaat 300agataataaa gtgacttttt tgggcggacc aaatttccct
attaatggtc aattcattaa 360aatacattca ttaaaataaa ggtattgcga tgaatttaga
tgcacagtga ttttggttct 420gtgcagattt ttggctgttg ttagaaggga tacatctgcg
gccgaaagtt aacgggaact 480atttacattc tttgctatta aattatccat tatttgtatt
ttattacccc aaccgtaaac 540tcaaccctca cagtaatgta aaaatattat ttattgtttt
atagcgtcac agaatgatgc 600tatattgacc gcagctgtat cctttctaag tgcgactgta
caaatacgca ctgaccgtga 660cagacacgtg cattgaccaa tcagcgcaca gatacgcatt
ttccgcgcga ttctgattgg 720atgatcgact gatactaata ttgtgccgct tcctttcgcg
gcctctttct ttcacgcgtc 780cctaccgtga ggtaaggctg acgccgctct tgtggcggtt
tcttaaaatg tgttaataaa 840taacatcata agaggtcacg agaaggtcta cgtgtgttta
atatcagcgg cggttattat 900tatgcgttta aagcttgtgt aatgattttt acagtaaaag
ttagcactag cctgttagca 960caggcctcgt gcgccatgtg tgacgcgacg ttttaatagc
atcttatttg attttgatga 1020tccgattctg atattaatca tatttatgcg taaaatgtgt
gatgggtctg ctagtggaca 1080ttacatgcta gtacttgtgc tagtcggtcg atccacattg
agatgttgcg ctatttgcca 1140ttttaaaacc agttactctc attttagtga aatattctta
agccactaag ttaaaatttg 1200tcaatcacat ataattgtgt ttatgtttta tttgagtcat
cataccaggt aatagtttta 1260tttatattag tatgtacaat ttggcataaa ctgccttcgg
ttttgattga catctacttt 1320gtaaaggtaa tcttaaaggg gtaaaggctc acccaaaaga
caattcaccg tcaagtgttt 1380tcaaatctta tgagtttctt aatgaacatg gtatgttttg
gagaaaactg gaaaccaact 1440accataatac aaatacagga aaaatatact atagaagtcg
atggttacag gttttctgca 1500ttcaaaatat ctacacaagt gtttaatgga aggaactcaa
gtgatttgaa aagttaaggg 1560tgcataaatc agttttcatt tgggtgagct gtctctaaac
atttgattta gacacctcag 1620gcagtggtca ccaagcttgt tcctgaaggg ccagtgtcct
acagatttta gctccaaccc 1680taattaaaca cacctgaaca agctaatcaa ggtcttacta
ggtatgtttg aaacatccag 1740gcaggtgtgt tgatgcaaga tagagctaaa ccctgcaggg
acaatggccc aacaggattg 1800gtgacccctg cctcaagcca tcacaaatgc attatggtat
taagaaatgt gcaggttcag 1860ttatggacag gctgttgcag tgcttgttcg tcgttcccac
tgcacaaatg aacatgattc 1920cttctatccc tgtctgtctg catctcatga cttgcaggga
cgctggtctc agacacgttt 1980atagcagtaa atcaaataca atagtgctct gattatcttt
aaatatttga aagcttataa 2040taggcaacca aattacctgg aaacagttta caaacagtaa
ttcatatttt gtcatttaat 2100aagatgcaca caaggcaggt gtaaaagtat tgcttgtgtt
tgtaatcctc agattttaca 2160accttgtctt taaaccggct gttcaccgat ccttggaagg
gatcc 2205
SEQ ID NO: 10 (length 24, DNA, Artificial Sequence)
Description of Artificial Sequence: Cytokeratin gene specific primer
cgctggagta agagatagac ctgg 24
SEQ ID NO: 11 (length 26, DNA, Artificial Sequence)
Description of Artificial Sequence: Cytokeratin gene specific primer
Features: misc feature (1)..(6), introduced for restriction site; misc feature (3)..(8), BamHI site
ccggatcctg tgtctttgag ttgctg 26
SEQ ID NO: 12 (length 24, DNA, Artificial Sequence)
Description of Artificial Sequence: Muscle creatine kinase gene specific primer
Features: misc feature (3)..(8), BamHI site
ccggatcctt gggatcagat cctg 24
SEQ ID NO: 13 (length 24, DNA, Artificial Sequence)
Description of Artificial Sequence: Muscle creatine kinase gene specific primer
Features: misc feature (1)..(3), introduced for restriction site; misc feature (3)..(3), BamHI site
ccggatcctg ttcaccaagc cgaa 24
SEQ ID NO: 14 (length 25, DNA, Artificial Sequence)
Description of Artificial Sequence: Acidic ribosomal protein PO gene specific primer
tagttggact tccacgtgcc ctgtc 25
SEQ ID NO: 15 (length 26, DNA, Artificial Sequence)
Description of Artificial Sequence: Acidic ribosomal protein PO gene specific primer
Features: misc feature (1)..(7), introduced for restriction site; misc feature (1)..(6), BamHI site
ggatcccttc caaggatcgg tgaaca 26
SEQ ID NO: 16 (length 51, DNA, Artificial Sequence)
Description of Artificial Sequence: Oligonucleotide for linker used in linker-mediated PCR
gttcatcttt acaagctagc gctgaacaat gctgtggaca agcttgaatt c 51
SEQ ID NO: 17 (length 10, DNA, Artificial Sequence)
Description of Artificial Sequence: Oligonucleotide for linker used in linker-mediated PCR
Features: misc feature (10)..(10), n is a dideoxycytidine; misc_feature (10)..(10), n is a, c, g, or t
gaattcaagn 10
SEQ ID NO: 18 (length 21, DNA, Artificial Sequence)
Description of Artificial Sequence: linker specific primer
gttcatcttt acaagctagc g 21
SEQ ID NO: 19 (length 20, DNA, Artificial Sequence)
Description of Artificial Sequence: linker specific primer
tcctgaacaa tgctgtggac 20
SEQ ID NO: 20 (length 1392, DNA, Danio rerio)
Features: primer bind (6)..(28), M2; primer bind (23)..(45), M1; CDS (42)..(551); polyA signal (797)..(802); polyA signal (1351)..(1357)
ctcttcttga tcttcttaga cttcacacat accgtctcga c atg
gca ccc aag aag 56 Met
Ala Pro Lys Lys 1
5gcc aag agg agg gca gca gga gga gag ggt tcc tcc aac gtc ttc tcc
104Ala Lys Arg Arg Ala Ala Gly Gly Glu Gly Ser Ser Asn Val Phe Ser
10 15 20atg ttt gag cag agc cag
att cag gag tac aaa gag gct ttc aca atc 152Met Phe Glu Gln Ser Gln
Ile Gln Glu Tyr Lys Glu Ala Phe Thr Ile 25 30
35att gac cag aac aga gac ggt atc atc agc aaa gac gac
ctt agg gac 200Ile Asp Gln Asn Arg Asp Gly Ile Ile Ser Lys Asp Asp
Leu Arg Asp 40 45 50gtg ttg gcc
tca atg ggc cag ctg aat gtg aag aat gag gag ctg gag 248Val Leu Ala
Ser Met Gly Gln Leu Asn Val Lys Asn Glu Glu Leu Glu 55
60 65gcc atg atc aag gaa gcc agc ggc cca atc aac ttc
acc gtt ttc ctc 296Ala Met Ile Lys Glu Ala Ser Gly Pro Ile Asn Phe
Thr Val Phe Leu70 75 80
85acc atg ttc gga gag aag ttg aag ggt gct gac ccc gaa gac gtc atc
344Thr Met Phe Gly Glu Lys Leu Lys Gly Ala Asp Pro Glu Asp Val Ile
90 95 100gtg tct gcc ttc aag
gtg ctg gac cct gag ggc act gga tcc atc aag 392Val Ser Ala Phe Lys
Val Leu Asp Pro Glu Gly Thr Gly Ser Ile Lys 105
110 115aag gaa ttc ctt gag gag ctt ttg acc act cag tgc
gac agg ttc acc 440Lys Glu Phe Leu Glu Glu Leu Leu Thr Thr Gln Cys
Asp Arg Phe Thr 120 125 130gca gag
gag atg aag aat ctg tgg gcc gcc ttc ccc cca gat gtg gct 488Ala Glu
Glu Met Lys Asn Leu Trp Ala Ala Phe Pro Pro Asp Val Ala 135
140 145ggc aat gtt gac tac aag aac atc tgc tac gtc
atc aca cac gga gag 536Gly Asn Val Asp Tyr Lys Asn Ile Cys Tyr Val
Ile Thr His Gly Glu150 155 160
165gag aag gag gag taa acaaccttgg aatagaggaa acgaagagaa gaacatgcat
591Glu Lys Glu Glucctcacagct taatctccag tctgttgtct ggccttctct
aacttttgtt tttccttcct 651ccctttcttg ctttctacca tcgttgttac tccaagcact
tacactctcc atcttaccaa 711agacttgtct cgctgggact gaattgggag ggtggagagg
aacacgacca cagtgtctgt 771cgagtgggga catgggattg ttttcaataa aatgaacatc
atttctgtat ctctcacatt 831ctctctttct ctctgtttct cactcattac ccacaacccc
tctctttcat ttcagtcaag 891cttgcatgta agtcgctgct tcttctgctg cagtcttagg
agttgaaacg aaggcatcta 951tagtttgggg ctgaaacatc tctctagatc aatgtggaag
agtgctcact ctgaggggga 1011aagaagcacg atggagtgat ctcactctat aatagaggaa
ccagtcatca ttctcatttc 1071ctcctctggt ggttgactaa aaagagaaag agaaaatgag
ggttttgtgc tgagtgagtt 1131tagcctccta aaagcgatgc cgagctcatc acagagggag
tgagagggac agaccatcct 1191aggaagagag gagagcaggg actgaaagaa aacataacct
cttcactccc cctctcccct 1251cctcttctct atttctctgt ccatcttttc ttttttcttt
tttctttttt gctttctgca 1311tctgggcctg ctttgctctg ccaaacctct cctgtaacca
ataaaaagac acaaactgtg 1371aataaaaaaa aaaaaaaaaa a
1392
SEQ ID NO: 21 (length 169, PRT, Danio rerio)
Met Ala Pro Lys Lys Ala
Lys Arg Arg Ala Ala Gly Gly Glu Gly Ser1 5
10 15Ser Asn Val Phe Ser Met Phe Glu Gln Ser Gln Ile
Gln Glu Tyr Lys 20 25 30Glu
Ala Phe Thr Ile Ile Asp Gln Asn Arg Asp Gly Ile Ile Ser Lys 35
40 45Asp Asp Leu Arg Asp Val Leu Ala Ser
Met Gly Gln Leu Asn Val Lys 50 55
60Asn Glu Glu Leu Glu Ala Met Ile Lys Glu Ala Ser Gly Pro Ile Asn65
70 75 80Phe Thr Val Phe Leu
Thr Met Phe Gly Glu Lys Leu Lys Gly Ala Asp 85
90 95Pro Glu Asp Val Ile Val Ser Ala Phe Lys Val
Leu Asp Pro Glu Gly 100 105
110Thr Gly Ser Ile Lys Lys Glu Phe Leu Glu Glu Leu Leu Thr Thr Gln
115 120 125Cys Asp Arg Phe Thr Ala Glu
Glu Met Lys Asn Leu Trp Ala Ala Phe 130 135
140Pro Pro Asp Val Ala Gly Asn Val Asp Tyr Lys Asn Ile Cys Tyr
Val145 150 155 160Ile Thr
His Gly Glu Glu Lys Glu Glu 165
SEQ ID NO: 22 (length 2054, DNA, Danio rerio)
Features:
enhancer (142)..(148), E-box, canntg
enhancer (452)..(457), E-box, canntg
enhancer (523)..(532), potential MEF2 binding site, yta(w)4tar
enhancer (606)..(615), potential MEF2 binding site, yta(w)4tar
enhancer (697)..(706), potential MEF2 binding site, yta(w)4tar
enhancer (1095)..(1100), E-box, canntg
enhancer (1278)..(1283), E-box, canntg
enhancer (1362)..(1367), E-box, canntg
enhancer (1385)..(1390), E-box, canntg
enhancer (1490)..(1499), potential MEF2 binding site, yta(w)4tar
enhancer (1640)..(1649), potential MEF2 binding site, yta(w)4tar
enhancer (1956)..(1965), potential MEF2 binding site, yta(w)4tar
TATA signal (1983)..(1989)
Transcription start site (2012)..(2012)
misc feature (2027)..(2054), identical to the 5' MLC2f cDNA
primer bind (2032)..(2054), M2
tgcatgcctg gcaggtccac tctagaggac
tactagtcat atgcgattct gaacaatgct 60ggaatgagcc accaactcat ccagtgtatt
accctacact gggaaacacc caaatctgtc 120tgttatattt gtgcatatac attagattag
aagctgtcac tgcggtggta ccttttcaaa 180ttgatacctc aaaagtatat attagtgcct
tttaggtact aatatatacc cttgaggttt 240tcatttggaa aggtaccacc ccagtgacag
aaatctggag cttatttaac aaaataactt 300tatttatatg ttattgaaaa atattaaata
agcaaaacaa tggaaaaaaa ttagttcaaa 360atttagcttt atttaaattg ttttatcttt
aatatagctg tttaataaat ctgttttgtt 420actgagagat ggagaaaaat attcattttc
ctgtaattat ctgtttttct aggtactgta 480caagcaggag caaaacaagc cgacagactc
gggaatgcac aacaaactca aggggggcaa 540gagagcaagg agcgctcaag attgtttagc
ctgccttccc aaaaaaaaac tgtcttaagc 600caaccactca gagggctgta gtgtgctgac
cgtgcttgtc cacagggcag cttcccacaa 660gtgaggtcat aggtcgatcg gcagagagag
atgggcatgg ccatgtggac gggtgtggtg 720actatactag gaaaagcatt aaaacctatt
aagacaccag aacgtcctct tatatatcag 780tcattggctc aaaaatctct ggattgaaat
atccaacaag taatcctgca agataagcca 840ggagggagtt gcgtcccctt tagactcagt
atgtgattgt atgaagctca aacagtccct 900gtggacagct tgaattcaat tcgccacaga
ttttatgcag cggatgccca tccagttgca 960ttttaaatta atatttttaa taggaagcta
tcagtacact ctcagaaata aatggtccgc 1020aggtacatat ttgtacttaa agggtccata
aaaaatttta agagaaacac ttttgtactt 1080tattatggac ctttaaggta caaattttta
ctcacgccct ttatttctga gagtgaagct 1140atgataacgg tccaaaaact actacaccca
caaatttata aacaggggaa aatcaagaga 1200atttgtaggt tgtaattttt ttgttgcaat
caattttgtg actaaaatat tattttaata 1260taaatgcacc aaaatacatt gcctatattc
aaaatgggct gtactcaatt actctaagca 1320aaataatgct aatcttaaac aattttggaa
acaggatatc aaattagtct aaagaaagaa 1380aacagtgact gatgaattag acaagaaaaa
tattttggtc accacagctg ttccttatgc 1440ctcaaatttc tcttcatgag ggtccaacat
catctaaaaa ctgggaaaaa ggggtaatta 1500atggcacctc acagtcactg aagtgaccgg
agagagagag agagagagag agtgctgaat 1560ggggcacttg aaccgaaatc ttacagcatc
ttcgattagg gctgatttga aataagggtt 1620ccagggcgtg aacaaatatg aacaacataa
ccatcaggat ctatcactgc aaccctcccc 1680gtattgatct gctgctaatc taactttagg
ggctacagct cattcatttc aaattgagtt 1740tacgtcccca tgtccttatt agacaacgcg
agacatgcag gccgctgcca tcagtatcag 1800attcatccca ttccaagact ccaatagcta
tttctgagca ctgtaagatg atagtacatc 1860ccagccggtg tccctccatc actttccccc
tacctcatag tttttcctct ttctctctcg 1920gtctgctatt tcccaaacct cacttaaggt
tgggtctata attagcaagg ggccttcgtc 1980agtatataag cccctcaagt acaggacact
acgcggcttc agacttctct tcttgatctt 2040cttagacttc acac
2054
SEQ ID NO: 23 (length 23, DNA, Artificial Sequence)
Description of Artificial Sequence: MLC2F gene specific primer M1
ccatgtcgag acggtatgtg tga 23
SEQ ID NO: 24 (length 23, DNA, Artificial Sequence)
Description of Artificial Sequence: MLC2F gene specific primer M2
gtgtgaagtc taagaagatc aag 23
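The CDS annotations in the listing can be cross-checked against the lengths of the corresponding protein sequences. The minimal Python sketch below does this arithmetic for the four cDNAs, using coordinates and lengths copied from the listing. It assumes that each annotated CDS range includes the stop codon, as appears to be the case here; the function and variable names are hypothetical.

```python
# (label, CDS start, CDS end, annotated protein length), as given in the listing.
CDS_ANNOTATIONS = [
    ("CK (SEQ ID NOS: 1 and 2)", 90, 1586, 498),
    ("MCK (SEQ ID NOS: 3 and 4)", 86, 1231, 381),
    ("ARP (SEQ ID NOS: 5 and 6)", 75, 1034, 319),
    ("MLC2f (SEQ ID NOS: 20 and 21)", 42, 551, 169),
]


def encoded_residues(cds_start: int, cds_end: int) -> int:
    """Number of amino acids encoded by a CDS with 1-based inclusive
    coordinates, assuming the range ends with a stop codon."""
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3 - 1  # subtract the stop codon


def check_annotations(annotations):
    """Map each label to True if the CDS range matches the protein length."""
    return {
        label: encoded_residues(start, end) == protein_length
        for label, start, end, protein_length in annotations
    }


if __name__ == "__main__":
    for label, consistent in check_annotations(CDS_ANNOTATIONS).items():
        print(label, "consistent" if consistent else "INCONSISTENT")
```

Under that assumption, all four CDS ranges are consistent with the listed protein lengths.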