Patent application title: RNA-Guided Targeting of Genetic and Epigenomic Regulatory Proteins to Specific Genomic Loci
Inventors:
IPC8 Class: AC12N922FI
USPC Class:
Class name:
Publication date: 2020-11-12
Patent application number: 20200354704
Abstract:
Methods and constructs for RNA-guided targeting of transcriptional
activators to specific genomic loci.
Claims:
1. A fusion protein comprising catalytically inactive CRISPR associated 9
(Cas9) protein linked to a heterologous functional domain.
2. The fusion protein of claim 1, wherein the heterologous functional domain is a transcriptional activation domain.
3. The fusion protein of claim 2, wherein the transcriptional activation domain is from VP64 or NF-κB p65.
4. The fusion protein of claim 1, wherein the catalytically inactive Cas9 protein is from S. pyogenes.
5. The fusion protein of claim 1, wherein the catalytically inactive Cas9 protein comprises mutations at D10A and H840A.
6. The fusion protein of claim 1, wherein the heterologous functional domain is linked to the N terminus or C terminus of the catalytically inactive Cas9 protein, with an optional intervening linker, wherein the linker does not interfere with activity of the fusion protein.
7. The fusion protein of claim 1, further comprising one or both of a nuclear localization sequence and one or more epitope tags on the N-terminus, C-terminus, or in between the catalytically inactive CRISPR associated 9 (Cas9) protein and the heterologous functional domain, optionally with one or more intervening linkers.
8. The fusion protein of claim 7, wherein the one or more epitope tags is selected from the group consisting of c-myc, 6His, and FLAG tags.
9. A nucleic acid encoding the fusion protein of claim 1.
10. A nucleic acid encoding the fusion protein of claim 2.
11. A nucleic acid encoding the fusion protein of claim 3.
12. A nucleic acid encoding the fusion protein of claim 4.
13. An expression vector comprising the nucleic acid of claim 9.
14. An expression vector comprising the nucleic acid of claim 10.
15. An expression vector comprising the nucleic acid of claim 11.
16. An expression vector comprising the nucleic acid of claim 12.
Description:
CLAIM OF PRIORITY
[0001] This application is a continuation of U.S. patent application Ser. No. 14/211,117, filed Mar. 14, 2014, which claims priority under 35 USC § 119(e) to U.S. Patent Application Ser. No. 61/799,647, filed on Mar. 15, 2013. The entire contents of the foregoing are hereby incorporated by reference.
TECHNICAL FIELD
[0003] This invention relates to methods and constructs for RNA-guided targeting of transcriptional activators to specific genomic loci.
BACKGROUND
[0004] Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR), and CRISPR-associated (cas) genes, referred to as CRISPR/Cas systems, are used by various bacteria and archaea to mediate defense against viruses and other foreign nucleic acid. These systems use small RNAs to detect and silence foreign nucleic acids in a sequence-specific manner.
[0005] Three types of CRISPR/Cas systems have been described (Makarova et al., Nat. Rev. Microbiol. 9, 467 (2011); Makarova et al., Biol. Direct 1, 7 (2006); Makarova et al., Biol. Direct 6, 38 (2011)). Recent work has shown that Type II CRISPR/Cas systems can be engineered to direct targeted double-stranded DNA breaks in vitro to specific sequences by using a single "guide RNA" with complementarity to the DNA target site and a Cas9 nuclease (Jinek et al., Science 2012; 337:816-821). This targetable Cas9-based system also works efficiently in cultured human cells (Mali et al., Science. 2013 Feb. 15; 339(6121):823-6; Cong et al., Science. 2013 Feb. 15; 339(6121):819-23) and in vivo in zebrafish (Hwang and Fu et al., Nat Biotechnol. 2013 March; 31(3):227-9) for inducing targeted alterations into endogenous genes.
SUMMARY
[0006] At least in part, the present invention is based on the development of a fusion protein including a heterologous functional domain (a transcriptional activation domain) fused to a Cas9 nuclease that has had its nuclease activity inactivated by mutations. While published studies have used guide RNAs to target the Cas9 nuclease to specific genomic loci, no work has yet adapted this system to recruit additional effector domains. This work also provides the first demonstration of an RNA-guided process that results in an increase (rather than a decrease) in the level of expression of a target gene.
[0007] In addition, the present disclosure provides the first demonstration that multiplex gRNAs can be used to mediate synergistic activation of transcription.
[0008] Thus, in a first aspect, the invention provides fusion proteins comprising a catalytically inactive CRISPR associated 9 (Cas9) protein linked to a heterologous functional domain that modifies DNA, e.g., a transcriptional activation domain, transcriptional repressors, enzymes that modify the methylation state of DNA (e.g., DNA methyltransferase (DNMT) or TET proteins), or enzymes that modify histone subunits (e.g., histone acetyltransferases (HAT), histone deacetylases (HDAC), or histone demethylases). In preferred embodiments, the heterologous functional domain is a transcriptional activation domain, e.g., a transcriptional activation domain from VP64 or NF-κB p65.
[0009] In some embodiments, the catalytically inactive Cas9 protein is from S. pyogenes.
[0010] In some embodiments, the catalytically inactive Cas9 protein comprises mutations at D10A and H840A.
[0011] In some embodiments, the heterologous functional domain is linked to the N terminus or C terminus of the catalytically inactive Cas9 protein, with an optional intervening linker, wherein the linker does not interfere with activity of the fusion protein.
[0012] In some embodiments, the fusion protein includes one or both of a nuclear localization sequence and one or more epitope tags, e.g., c-myc, 6His, or FLAG tags, on the N-terminus, C-terminus, or in between the catalytically inactive CRISPR associated 9 (Cas9) protein and the heterologous functional domain, optionally with one or more intervening linkers.
[0013] In a further aspect, the invention provides nucleic acids encoding the fusion proteins described herein, as well as expression vectors including the nucleic acids, and host cells expressing the fusion proteins.
[0014] In an additional aspect, the invention provides methods for increasing expression of a target gene in a cell. The methods include expressing a Cas9-activator fusion protein as described herein in the cell, e.g., by contacting the cell with an expression vector including a sequence encoding the fusion protein, and also expressing in the cell one or more guide RNAs directed to the target gene, e.g., by contacting the cell with one or more expression vectors comprising nucleic acid sequences encoding one or more guide RNAs.
[0015] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Methods and materials are described herein for use in the present invention; other, suitable methods and materials known in the art can also be used. The materials, methods, and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, sequences, database entries, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control.
[0016] Other features and advantages of the invention will be apparent from the following detailed description and figures, and from the claims.
DESCRIPTION OF DRAWINGS
[0017] FIG. 1A is a schematic illustration showing a single guide RNA (sgRNA) recruiting Cas9 nuclease to a specific DNA sequence.
[0018] FIG. 1B is a schematic illustration showing a longer version of the sgRNA used to introduce targeted alterations.
[0019] FIG. 1C is a schematic illustration showing a Cas9 protein containing D10A and H840A mutations to render the nuclease portion of the protein catalytically inactive fused to a transcriptional activation domain.
[0020] FIG. 2 is a bar graph showing levels of VEGFA protein expression in cells transfected with gRNA and Cas9-VP64. Fold activation was calculated relative to off-target gRNA control. Error bars represent standard error of the mean of three independent replicates. 1-18=18 guide RNAs targeted to various sites in the human VEGF-A gene; Cas9-VP64=fusion of catalytically inactive Cas9 (bearing D10A/H840A mutations) to the VP64 activation domain; eGFP gRNA=a guide RNA targeted to an off-target site located in an EGFP reporter gene.
[0021] FIG. 3A is a bar graph showing VEGFA protein expression in cells transfected with multiple gRNAs and Cas9-VP64, demonstrating synergistic activation of VEGFA. Fold activation was calculated relative to off-target gRNA control. Error bars represent standard error of the mean of three independent replicates.
[0022] FIG. 3B is a bar graph showing VEGFA protein expression in cells transfected with multiple gRNAs and Cas9-VP64. The numbers underneath each bar indicate the amount in nanograms (ng) of Cas9-activator (C) plasmid or guide RNA (g) plasmid transfected.
[0023] FIG. 4 is an exemplary sequence of a Guide RNA expression vector.
[0024] FIG. 5 is an exemplary sequence of CMV-T7-Cas9 D10A/H840A-3×FLAG-VP64.
[0025] FIG. 6 is an exemplary sequence of CMV-T7-Cas9 recoded D10A/H840A-3×FLAG-VP64.
[0026] FIG. 7 is an exemplary sequence of a Cas9-activator. An optional 3×FLAG sequence is underlined; the nuclear localization signal PKKKRKVS (SEQ ID NO:1) is in lower case; two linkers are in bold; and the VP64 transcriptional activator sequence, DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDML (SEQ ID NO:2), is boxed.
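The VP64 sequence given for FIG. 7 can be read as four copies of an 11-residue motif separated by Gly-Ser pairs. The following Python sketch checks that reading against SEQ ID NO:2. The names `MOTIF`, `SPACER` and `build_repeats` are illustrative assumptions and are not terms used in the specification.

```python
# SEQ ID NO:2 (VP64) and SEQ ID NO:1 (NLS), copied from the FIG. 7 description.
VP64 = "DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDML"
NLS = "PKKKRKVS"

# Assumed decomposition: an 11-residue repeat unit joined by a 2-residue spacer.
MOTIF = "DALDDFDLDML"
SPACER = "GS"


def build_repeats(motif: str, spacer: str, copies: int) -> str:
    """Join `copies` copies of `motif`, placing `spacer` between neighbours."""
    return spacer.join([motif] * copies)


rebuilt = build_repeats(MOTIF, SPACER, 4)

# Report whether the rebuilt string equals SEQ ID NO:2, plus both lengths.
print(rebuilt == VP64, len(VP64), len(NLS))
```

The lengths printed by the sketch can be compared with the 50-residue and 8-residue entries for these two sequences in the sequence listing.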
DETAILED DESCRIPTION
[0027] Described herein are fusion proteins of transcriptional activation domains fused to a catalytically inactivated version of the Cas9 protein for the purpose of enabling RNA-guided targeting of these functional domains to specific genomic locations in cells and living organisms.
[0028] The CRISPR/Cas system has evolved in bacteria as a defense mechanism to protect against invading plasmids and viruses. Short protospacers, derived from foreign nucleic acid, are incorporated into CRISPR loci and subsequently transcribed and processed into short CRISPR RNAs (crRNAs). These RNAs then use their sequence complementarity to the invading nucleic acid to guide Cas9-mediated cleavage, and consequent destruction of the foreign nucleic acid. Last year, Doudna and colleagues demonstrated that a single guide RNA (sgRNA) can mediate recruitment of Cas9 nuclease to specific DNA sequences in vitro (FIG. 1A; Jinek et al., Science 2012).
[0029] More recently, a longer version of the sgRNA has been used to introduce targeted alterations in human cells and zebrafish (FIG. 1B; Mali et al. Science 2013, Hwang and Fu et al., Nat Biotechnol. 2013 March; 31(3):227-9).
[0030] As described herein, in addition to guiding Cas9-mediated nuclease activity, it is possible to use CRISPR-derived RNAs to target heterologous functional domains fused to Cas9 to specific sites in the genome (FIG. 1C). As described herein, it is possible to use single guide RNAs (sgRNAs) to target Cas9-transcriptional activators (hereafter referred to as Cas9-activators) to the promoters of specific genes and thereby increase expression of the target gene. Cas9-activators can be localized to sites in the genome, with target specificity defined by sequence complementarity of the guide RNA.
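Targeting by sequence complementarity can be illustrated with a small Python sketch that looks for exact matches of a guide's target sequence on either strand of a DNA string. This is a simplified illustration only: it assumes exact matching, ignores any other requirement for Cas9 binding, and uses a made-up 20-nt target and toy sequence that do not come from the specification. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def find_target_sites(genome: str, target: str):
    """Return sorted (0-based start, strand) pairs for exact matches of target.

    A '-' hit means the reverse complement of the target was found at that
    start position on the given (top) strand.
    """
    genome = genome.upper()
    target = target.upper()
    hits = []
    for strand, query in (("+", target), ("-", reverse_complement(target))):
        start = genome.find(query)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(query, start + 1)
    return sorted(hits)


# Hypothetical 20-nt target placed once on each strand of a toy sequence.
example_target = "GACGTTCAGGATCCATGCAA"
toy_genome = "TTT" + example_target + "CCC" + reverse_complement(example_target) + "AAA"
print(find_target_sites(toy_genome, example_target))
```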
[0031] In some embodiments, the present system utilizes the Cas9 protein from S. pyogenes, either as encoded in bacteria or codon-optimized for expression in mammalian cells, containing D10A and H840A mutations to render the nuclease portion of the protein catalytically inactive (FIG. 1C). The Cas9-activators are created by fusing a transcriptional activation domain, e.g., from either VP64 or NF-κB p65, to the N-terminus or C-terminus of the catalytically inactive Cas9 protein.
[0032] The sequence of the catalytically inactive Cas9 used herein is as follows; the mutations are in bold and underlined.
TABLE-US-00001 (SEQ ID NO: 3)
        10         20         30         40         50         60
MDKKYSIGLA IGTNSVGWAV ITDEYKVPSK KFKVLGNTDR HSIKKNLIGA LLFDSGETAE
        70         80         90        100        110        120
ATRLKRTARR RYTRRKNRIC YLQEIFSNEM AKVDDSFFHR LEESFLVEED KKHERHPIFG
       130        140        150        160        170        180
NIVDEVAYHE KYPTIYHLRK KLVDSTDKAD LRLIYLALAH MIKFRGHFLI EGDLNPDNSD
       190        200        210        220        230        240
VDKLFIQLVQ TYNQLFEENP INASGVDAKA ILSARLSKSR RLENLIAQLP GEKKNGLFGN
       250        260        270        280        290        300
LIALSLGLTP NFKSNFDLAE DAKLQLSKDT YDDDLDNLLA QIGDQYADLF LAAKNLSDAI
       310        320        330        340        350        360
LLSDILRVNT EITKAPLSAS MIKRYDEHHQ DLTLLKALVR QQLPEKYKEI FFDQSKNGYA
       370        380        390        400        410        420
GYIDGGASQE EFYKFIKPIL EKMDGTEELL VKLNREDLLR KQRTFDNGSI PHQIHLGELH
       430        440        450        460        470        480
AILRRQEDFY PFLKDNREKI EKILTFRIPY YVGPLARGNS RFAWMTRKSE ETITPWNFEE
       490        500        510        520        530        540
VVDKGASAQS FIERMTNFDK NLPNEKVLPK HSLLYEYFTV YNELTKVKYV TEGMRKPAFL
       550        560        570        580        590        600
SGEQKKAIVD LLFKTNRKVT VKQLKEDYFK KIECFDSVEI SGVEDRFNAS LGTYHDLLKI
       610        620        630        640        650        660
IKDKDFLDNE ENEDILEDIV LTLTLFEDRE MIEERLKTYA HLFDDKVMKQ LKRRRYTGWG
       670        680        690        700        710        720
RLSRKLINGI RDKQSGKTIL DFLKSDGFAN RNFMQLIHDD SLTFKEDIQK AQVSGQGDSL
       730        740        750        760        770        780
HEHIANLAGS RAIKKGILQT VKVVDELVKV MGRHKPENIV IEMARENQTT QKGQKNSRER
       790        800        810        820        830        840
MKRIEEGIKE LGSQILKEHP VENTQLQNEK LYLYYLQNGR DMYVDQELDI NRLSDYDVDA
       850        860        870        880        890        900
IVPQSFLKDD SIDNKVLTRS DKNRGKSDNV PSEEVVKKMK NYWRQLLNAK LITQRKFDNL
       910        920        930        940        950        960
TKAERGGLSE LDKAGFIKRQ LVETRQITKH VAQILDSRMN TKYDENDKLI REVKVITLKS
       970        980        990       1000       1010       1020
KLVSDFRKDF QFYKVREINN YHHAHDAYLN AVVGTALIKK YPKLESEFVY GDYKVYDVRK
      1030       1040       1050       1060       1070       1080
MIAKSEQEIG KATAKYFFYS NIMNFFKTEI TLANGEIRKR PLIETNGETG EIVWDKGRDF
      1090       1100       1110       1120       1130       1140
ATVRKVLSMP QVNIVKKTEV QTGGFSKESI LPKRNSDKLI ARKKDWDPKK YGGFDSPTVA
      1150       1160       1170       1180       1190       1200
YSVLVVAKVE KGKSKKLKSV KELLGITIME RSSFEKNPID FLEAKGYKEV KKDLIIKLPK
      1210       1220       1230       1240       1250       1260
YSLFELENGR KRMLASAGEL QKGNELALPS KYVNFLYLAS HYEKLKGSPE DNEQKQLFVE
      1270       1280       1290       1300       1310       1320
QHKHYLDEII EQISEFSKRV ILADANLDKV LSAYNKHRDK PIREQAENII HLFTLTNLGA
      1330       1340       1350       1360
PAAFKYFDTT IDRKRYTSTK EVLDATLIHQ SITGLYETRI DLSQLGGD
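The substitution labels D10A and H840A name the original residue, its 1-based position, and the replacement residue. The Python sketch below parses such labels and confirms that alanine sits at positions 10 and 840 in two short windows copied from SEQ ID NO: 3 above. The helper names and the choice to check only two windows (rather than the full sequence) are assumptions made to keep the example short.

```python
import re

# Windows copied from SEQ ID NO: 3, keyed by the 1-based position of their
# first residue: residues 1-10 and residues 831-840.
WINDOWS = {1: "MDKKYSIGLA", 831: "NRLSDYDVDA"}


def parse_substitution(label: str):
    """Split a label such as 'D10A' into (original, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def residue_at(position: int) -> str:
    """Look up a 1-based position inside one of the stored windows."""
    for start, window in WINDOWS.items():
        if start <= position < start + len(window):
            return window[position - start]
    raise KeyError(f"position {position} is outside the stored windows")


def carries(label: str) -> bool:
    """True if the stored sequence has the replacement residue at that position."""
    _, position, replacement = parse_substitution(label)
    return residue_at(position) == replacement


for label in ("D10A", "H840A"):
    print(label, carries(label))
```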
[0033] The transcriptional activation domains can be fused on the N or C terminus of the Cas9. In addition, although the present description exemplifies transcriptional activation domains, other heterologous functional domains (e.g., transcriptional repressors, enzymes that modify the methylation state of DNA (e.g., DNA methyltransferase (DNMT) or TET proteins), or enzymes that modify histone subunits (e.g., histone acetyltransferases (HAT), histone deacetylases (HDAC), or histone demethylases)) as are known in the art can also be used. A number of sequences for such domains are known in the art, e.g., a domain that catalyzes hydroxylation of methylated cytosines in DNA. Exemplary proteins include the Ten-Eleven-Translocation (TET)1-3 family, enzymes that convert 5-methylcytosine (5-mC) to 5-hydroxymethylcytosine (5-hmC) in DNA.
[0034] Sequences for human TET1-3 are known in the art and are shown in the following table:
TABLE-US-00002
         GenBank Accession Nos.
Gene     Amino Acid                 Nucleic Acid
TET1     NP_085128.2                NM_030625.2
TET2*    NP_001120680.1 (var 1)     NM_001127208.2
         NP_060098.3 (var 2)        NM_017628.4
TET3     NP_659430.1                NM_144993.1
*Variant (1) represents the longer transcript and encodes the longer isoform (a). Variant (2) differs in the 5' UTR and in the 3' UTR and coding sequence compared to variant 1. The resulting isoform (b) is shorter and has a distinct C-terminus compared to isoform a.
[0035] In some embodiments, all or part of the full-length sequence of the catalytic domain can be included, e.g., a catalytic module comprising the cysteine-rich extension and the 2OGFeDO domain encoded by 7 highly conserved exons, e.g., the Tet1 catalytic domain comprising amino acids 1580-2052, Tet2 comprising amino acids 1290-1905 and Tet3 comprising amino acids 966-1678. See, e.g., FIG. 1 of Iyer et al., Cell Cycle. 2009 Jun. 1; 8(11):1698-710. Epub 2009 Jun. 27, for an alignment illustrating the key catalytic residues in all three Tet proteins, and the supplementary materials thereof (available at ftp site ftp.ncbi.nih.gov/pub/aravind/DONS/supplementary_material_DONS.html) for full length sequences (see, e.g., seq 2c); in some embodiments, the sequence includes amino acids 1418-2136 of Tet1 or the corresponding region in Tet2/3.
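The residue ranges quoted above are 1-based and inclusive, which is easy to get wrong when slicing sequences programmatically. The following Python sketch shows one way to compute the length of each quoted range and to extract such a range from a protein string. The dictionary and function names are illustrative assumptions, and the toy protein string is a placeholder rather than a real TET sequence.

```python
# 1-based, inclusive residue ranges as quoted in the paragraph above.
CATALYTIC_RANGES = {
    "Tet1": (1580, 2052),
    "Tet2": (1290, 1905),
    "Tet3": (966, 1678),
}


def domain_length(first: int, last: int) -> int:
    """Number of residues in a 1-based inclusive range."""
    return last - first + 1


def extract_domain(protein: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive) of a protein string."""
    if not 1 <= first <= last <= len(protein):
        raise ValueError("range falls outside the protein sequence")
    # Python slices are 0-based and exclude the end index.
    return protein[first - 1:last]


for name, (first, last) in CATALYTIC_RANGES.items():
    print(name, domain_length(first, last))

# Placeholder string, used only to show the slicing convention.
print(extract_domain("ABCDEFGHIJ", 3, 5))
```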
[0036] Other catalytic modules can be from the proteins identified in Iyer et al., 2009.
[0037] Methods of Use
[0038] The described Cas9-activator system is a useful and versatile tool for modifying the expression of endogenous genes. Current methods for achieving this require the generation of novel engineered DNA-binding proteins (such as engineered zinc finger or transcription activator-like effector DNA binding domains) for each site to be targeted. Because these methods demand expression of a large protein specifically engineered to bind each target site, they are limited in their capacity for multiplexing. Cas9-activators, however, require expression of only a single Cas9-activator protein, which can be targeted to multiple sites in the genome by expression of multiple short gRNAs. This system could therefore easily be used to simultaneously induce expression of a large number of genes. This capability will have broad utility, e.g., for basic biological research, where it can be used to study gene function and to manipulate the expression of multiple genes in a single pathway, and in synthetic biology, where it will enable researchers to create circuits in cells that are responsive to multiple input signals. The relative ease with which this technology can be implemented and adapted to multiplexing will make it a broadly useful technology with many wide-ranging applications.
[0039] The methods described herein include contacting cells with a nucleic acid encoding the Cas9-activators described herein, and nucleic acids encoding one or more guide RNAs directed to a selected gene, to thereby modulate expression of that gene. Guide RNAs, and methods of designing and expressing guide RNAs, are known in the art. See, e.g., Jinek et al., Science 2012; 337:816-821; Mali et al., Science. 2013 Feb. 15; 339(6121):823-6; Cong et al., Science. 2013 Feb. 15; 339(6121):819-23; and Hwang and Fu et al., Nat Biotechnol. 2013 March; 31(3):227-9. In some embodiments, the guide RNAs are directed to a region that is 100-800, e.g., about 500 bp upstream of the transcription start site. In some embodiments, vectors (e.g., plasmids) encoding more than one gRNA are used, e.g., plasmids encoding 2, 3, 4, 5, or more gRNAs directed to different sites in the same region of the target gene.
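As a rough illustration of the positional criterion above, the Python sketch below tests whether a candidate site lies 100-800 bp upstream of a transcription start site (TSS), taking the gene's strand into account. The coordinate convention (integer positions on a reference sequence), the function names, and the example coordinates are all assumptions made for illustration.

```python
def offset_from_tss(site: int, tss: int, strand: str) -> int:
    """Signed distance of a site from the TSS; negative values are upstream.

    `strand` is the strand of the target gene: '+' or '-'.
    """
    if strand == "+":
        return site - tss
    if strand == "-":
        return tss - site
    raise ValueError("strand must be '+' or '-'")


def in_upstream_window(site: int, tss: int, strand: str,
                       nearest: int = 100, farthest: int = 800) -> bool:
    """True if the site lies between `nearest` and `farthest` bp upstream."""
    offset = offset_from_tss(site, tss, strand)
    return -farthest <= offset <= -nearest


# Hypothetical coordinates: a TSS at position 10000 on the '+' strand.
print(in_upstream_window(9500, 10000, "+"))   # site 500 bp upstream
print(in_upstream_window(10500, 10000, "+"))  # site 500 bp downstream
```

For a gene on the '-' strand, upstream positions have larger coordinates than the TSS, which is why the sign is flipped in `offset_from_tss`.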
[0040] Polypeptide Expression Systems
[0041] In order to use the fusion proteins described, it may be desirable to express the engineered proteins from a nucleic acid that encodes them. This can be performed in a variety of ways. For example, the nucleic acid encoding the fusion protein can be cloned into an intermediate vector for transformation into prokaryotic or eukaryotic cells for replication and/or expression. Intermediate vectors are typically prokaryote vectors, e.g., plasmids, or shuttle vectors, or insect vectors, for storage or manipulation of the nucleic acid encoding the fusion protein or for production of the fusion protein. The nucleic acid encoding the fusion protein can also be cloned into an expression vector, for administration to a plant cell, animal cell, preferably a mammalian cell or a human cell, fungal cell, bacterial cell, or protozoan cell.
[0042] To obtain expression, the fusion protein is typically subcloned into an expression vector that contains a promoter to direct transcription. Suitable bacterial and eukaryotic promoters are well known in the art and described, e.g., in Sambrook et al., Molecular Cloning, A Laboratory Manual (3d ed. 2001); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and Current Protocols in Molecular Biology (Ausubel et al., eds., 2010). Bacterial expression systems for expressing the engineered protein are available in, e.g., E. coli, Bacillus sp., and Salmonella (Palva et al., 1983, Gene 22:229-235). Kits for such expression systems are commercially available. Eukaryotic expression systems for mammalian cells, yeast, and insect cells are well known in the art and are also commercially available.
[0043] The promoter used to direct expression of the fusion protein nucleic acid depends on the particular application. For example, a strong constitutive promoter is typically used for expression and purification of fusion proteins. In contrast, when the fusion protein is to be administered in vivo for gene regulation, either a constitutive or an inducible promoter can be used, depending on the particular use of the fusion protein. In addition, a preferred promoter for administration of the fusion protein can be a weak promoter, such as HSV TK or a promoter having similar activity. The promoter can also include elements that are responsive to transactivation, e.g., hypoxia response elements, Gal4 response elements, lac repressor response element, and small molecule control systems such as tetracycline-regulated systems and the RU-486 system (see, e.g., Gossen & Bujard, 1992, Proc. Natl. Acad. Sci. USA, 89:5547; Oligino et al., 1998, Gene Ther., 5:491-496; Wang et al., 1997, Gene Ther., 4:432-441; Neering et al., 1996, Blood, 88:1147-55; and Rendahl et al., 1998, Nat. Biotechnol., 16:757-761).
[0044] In addition to the promoter, the expression vector typically contains a transcription unit or expression cassette that contains all the additional elements required for the expression of the nucleic acid in host cells, either prokaryotic or eukaryotic. A typical expression cassette thus contains a promoter operably linked, e.g., to the nucleic acid sequence encoding the fusion protein, and any signals required, e.g., for efficient polyadenylation of the transcript, transcriptional termination, ribosome binding sites, or translation termination. Additional elements of the cassette may include, e.g., enhancers, and heterologous spliced intronic signals.
[0045] The particular expression vector used to transport the genetic information into the cell is selected with regard to the intended use of the fusion protein, e.g., expression in plants, animals, bacteria, fungus, protozoa, etc. Standard bacterial expression vectors include plasmids such as pBR322 based plasmids, pSKF, pET23D, and commercially available tag-fusion expression systems such as GST and LacZ. A preferred tag-fusion protein is the maltose binding protein (MBP). Such tag-fusion proteins can be used for purification of the engineered fusion protein. Epitope tags can also be added to recombinant proteins to provide convenient methods of isolation, for monitoring expression, and for monitoring cellular and subcellular localization, e.g., c-myc or FLAG.
[0046] Expression vectors containing regulatory elements from eukaryotic viruses are often used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma virus vectors, and vectors derived from Epstein-Barr virus. Other exemplary eukaryotic vectors include pMSG, pAV009/A+, pMTO10/A+, pMAMneo-5, baculovirus pDSVE, and any other vector allowing expression of proteins under the direction of the SV40 early promoter, SV40 late promoter, metallothionein promoter, murine mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or other promoters shown effective for expression in eukaryotic cells.
[0047] Some expression systems have markers for selection of stably transfected cell lines such as thymidine kinase, hygromycin B phosphotransferase, and dihydrofolate reductase. High yield expression systems are also suitable, such as using a baculovirus vector in insect cells, with the fusion protein encoding sequence under the direction of the polyhedrin promoter or other strong baculovirus promoters.
[0048] The elements that are typically included in expression vectors also include a replicon that functions in E. coli, a gene encoding antibiotic resistance to permit selection of bacteria that harbor recombinant plasmids, and unique restriction sites in nonessential regions of the plasmid to allow insertion of recombinant sequences.
[0049] Standard transfection methods are used to produce bacterial, mammalian, yeast or insect cell lines that express large quantities of protein, which are then purified using standard techniques (see, e.g., Colley et al., 1989, J. Biol. Chem., 264:17619-22; Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher, ed., 1990)). Transformation of eukaryotic and prokaryotic cells is performed according to standard techniques (see, e.g., Morrison, 1977, J. Bacteriol. 132:349-351; Clark-Curtiss & Curtiss, Methods in Enzymology 101:347-362 (Wu et al., eds, 1983)).
[0050] Any of the known procedures for introducing foreign nucleotide sequences into host cells may be used. These include the use of calcium phosphate transfection, polybrene, protoplast fusion, electroporation, nucleofection, liposomes, microinjection, naked DNA, plasmid vectors, viral vectors, both episomal and integrative, and any of the other well-known methods for introducing cloned genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g., Sambrook et al., supra). It is only necessary that the particular genetic engineering procedure used be capable of successfully introducing at least one gene into the host cell capable of expressing the protein of choice.
[0051] In some embodiments, the fusion protein includes a nuclear localization domain which provides for the protein to be translocated to the nucleus. Several nuclear localization sequences (NLS) are known, and any suitable NLS can be used. For example, many NLSs have a plurality of basic amino acids, referred to as bipartite basic repeats (reviewed in Garcia-Bustos et al, 1991, Biochim. Biophys. Acta, 1071:83-101). An NLS containing bipartite basic repeats can be placed in any portion of the chimeric protein and results in the chimeric protein being localized inside the nucleus. In preferred embodiments a nuclear localization domain is incorporated into the final fusion protein, as the ultimate functions of the fusion proteins described herein will typically require the proteins to be localized in the nucleus. However, it may not be necessary to add a separate nuclear localization domain in cases where the DNA-binding domain (DBD) itself, or another functional domain within the final chimeric protein, has intrinsic nuclear translocation function.
[0052] The present invention includes the vectors and cells comprising the vectors.
Examples
[0053] The invention is further described in the following examples, which do not limit the scope of the invention described in the claims.
Example 1. Engineering CRISPR/Cas Activator System
[0054] To express guide RNAs (gRNAs) in human cells, we engineered a vector that would express the full length chimeric gRNA (a fusion of crRNA and tracrRNA originally described by Jinek et al. (Science 2012)) driven by a U6 promoter. To create site-specific gRNAs, a pair of 26-nucleotide oligos is annealed and ligated into the BsmBI-digested vector backbone. See FIG. 4.
[0055] To engineer a Cas9-activator we introduced the D10A, H840A catalytic mutations (previously described in Jinek et al., Science 2012) into either the wildtype or a codon-optimized Cas9 sequence (FIG. 5). These mutations render the Cas9 catalytically inactive so that it will no longer induce double-strand breaks. In one construct, a triple FLAG tag, nuclear localization signal and the VP64 activation domain were fused to the C-terminus of the inactive Cas9 (FIG. 6). Expression of this fusion protein is driven by the CMV promoter.
[0056] Cell Culture, Transfection and ELISA Assays were performed as follows.
[0057] Flp-In T-Rex 293 cells were maintained in Advanced DMEM supplemented with 10% FBS, 1% penstrep and 1% Glutamax (Invitrogen). Cells were transfected by Lipofectamine LTX (Invitrogen) according to the manufacturer's instructions. Briefly, 160,000 293 cells were seeded in 24-well plates and transfected the following day with 250 ng gRNA plasmid, 250 ng Cas9-VP64 plasmid, 30 ng GFP, 0.5 µl Plus Reagent and 1.65 µl Lipofectamine LTX. Tissue culture media from transfected 293 cells was harvested 40 hours after transfection, and secreted VEGF-A protein was assayed using R&D Systems' Human VEGF-A ELISA kit "Human VEGF Immunoassay."
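When many wells are transfected at once, the per-well amounts above are usually scaled into a single mix. The Python sketch below does that arithmetic using the per-well quantities stated in this paragraph. The `overage` allowance for pipetting loss, the function name, and the rounding to two decimals are assumptions for illustration and are not part of the described protocol.

```python
# Per-well amounts taken from the transfection conditions described above.
PER_WELL = {
    "gRNA plasmid (ng)": 250.0,
    "Cas9-VP64 plasmid (ng)": 250.0,
    "GFP (ng)": 30.0,
    "Plus Reagent (ul)": 0.5,
    "Lipofectamine LTX (ul)": 1.65,
}


def master_mix(wells: int, overage: float = 0.1) -> dict:
    """Scale per-well amounts to `wells` wells plus a fractional overage.

    `overage` (default 10%) is an assumed allowance for pipetting loss.
    """
    if wells < 1 or overage < 0:
        raise ValueError("wells must be >= 1 and overage must be >= 0")
    factor = wells * (1 + overage)
    return {name: round(amount * factor, 2) for name, amount in PER_WELL.items()}


# Amounts for 6 wells with the default overage.
for name, amount in master_mix(6).items():
    print(f"{name}: {amount}")
```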
[0058] Seventeen gRNAs were engineered to target three different regions (-500, 0 and +500 bp relative to the start site of transcription) in the human VEGFA promoter. Each gRNA was cotransfected with Cas9-VP64 into HEK293 cells and expression levels of VEGF-A protein were measured by ELISA. Of the 17 gRNAs, nine increased expression of VEGFA by three-fold or more as compared to an off-target gRNA control (FIG. 2). The greatest increase in VEGFA was observed in cells transfected with gRNA3, which induced protein expression by 18.7-fold. Interestingly, the three best gRNAs, and 6 of the 9 gRNAs capable of inducing expression by 3-fold or more, target the -500 region (500 bp upstream of the transcription start site).
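Fold activation relative to an off-target control, with a standard error of the mean across replicates, can be computed along the following lines. This Python sketch is one plausible way to do the calculation: it assumes each replicate is divided by the mean of the control replicates, which the specification does not state. The function name and the numbers in the usage example are placeholders, not measured data.

```python
from math import sqrt
from statistics import mean, stdev


def fold_activation(sample, control):
    """Return (mean fold activation, SEM) for replicate measurements.

    Each sample replicate is divided by the mean of the control replicates
    (an assumed normalization). At least two sample replicates are needed,
    because the sample standard deviation is undefined for a single value.
    """
    control_mean = mean(control)
    folds = [value / control_mean for value in sample]
    sem = stdev(folds) / sqrt(len(folds))
    return mean(folds), sem


# Placeholder readings for three replicates (arbitrary units, not real data).
mean_fold, sem = fold_activation([20.0, 30.0, 40.0], [10.0, 10.0, 10.0])
print(round(mean_fold, 2), round(sem, 2))
```

A threshold such as the three-fold cut-off mentioned above can then be applied to the returned mean.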
[0059] Plasmids encoding one, or more, e.g., two or five, different guide RNAs targeted to the human VEGFA promoter were transfected together with a plasmid encoding the Cas9-activator and assessed for their abilities to activate transcription of the VEGFA promoter. Combinations of multiple gRNAs further increased the level of VEGFA activation (FIGS. 3A-B). Co-transfection of all 6 gRNAs targeted to the -500 region and all possible combinations of 5 of these 6 gRNAs resulted in a synergistic increase in VEGFA protein expression (FIG. 3A).
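The set of transfection conditions described above (all six gRNAs, plus every combination of five of the six) can be enumerated with a few lines of Python. The labels `gRNA1` to `gRNA6` are placeholders for the six gRNAs targeting the -500 region and are not the identifiers used in the figures.

```python
from itertools import combinations

# Placeholder labels for the six gRNAs targeted to the -500 region.
guides = [f"gRNA{i}" for i in range(1, 7)]

# Every way of choosing five of the six guides (each omits exactly one guide).
five_of_six = list(combinations(guides, 5))

# All conditions: the full set of six, followed by each five-guide subset.
conditions = [tuple(guides)] + five_of_six

for condition in conditions:
    omitted = sorted(set(guides) - set(condition))
    print(len(condition), "guides; omitted:", omitted or "none")
```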
[0060] These experiments demonstrate that co-expression of a Cas9-activator protein (harboring the VP64 transcriptional activation domain) and a gRNA with 20 nt of sequence complementarity to sites in the human VEGF-A promoter in human HEK293 cells can result in upregulation of VEGF-A expression. Increases in VEGF-A protein were measured by ELISA assay and it was found that individual gRNAs can function together with a Cas9-activator fusion protein to increase VEGF-A protein levels by up to approximately 18-fold (FIG. 2). Additionally, it was possible to achieve even greater increases in activation through transcriptional synergy by introducing multiple gRNAs targeting various sites in the same promoter together with Cas9-activator fusion proteins (FIGS. 3A-B).
Other Embodiments
[0061] It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
Sequence CWU
1
1
7
SEQ ID NO: 1; length: 8; type: PRT; organism: Artificial Sequence; other information: nuclear localization signal
Pro Lys Lys Lys Arg Lys Val Ser
1               5
SEQ ID NO: 2; length: 50; type: PRT; organism: Artificial Sequence; other information: VP64 transcriptional activator sequence
Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Ser Asp Ala Leu
1               5                   10                  15
Asp Asp Phe Asp Leu Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe
            20                  25                  30
Asp Leu Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp
        35                  40                  45
Met Leu
    50
SEQ ID NO: 3; length: 1368; type: PRT; organism: Artificial Sequence; other information: catalytically inactive Cas9
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
1               5                   10                  15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
            20                  25                  30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
        35                  40                  45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
    50                  55                  60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65                  70                  75                  80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
                85                  90                  95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
            100                 105                 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
        115                 120                 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
    130                 135                 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145                 150                 155                 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
                165                 170                 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
            180                 185                 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
        195                 200                 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
    210                 215                 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225                 230                 235                 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
                245                 250                 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
            260                 265                 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
        275                 280                 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
    290                 295                 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305                 310                 315                 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
                325                 330                 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
            340                 345                 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
        355                 360                 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
    370                 375                 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385                 390                 395                 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
                405                 410                 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
            420                 425                 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
        435                 440                 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
    450                 455                 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465                 470                 475                 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
                485
490 495Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu
Pro Lys His Ser 500 505 510Leu
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys 515
520 525Tyr Val Thr Glu Gly Met Arg Lys Pro
Ala Phe Leu Ser Gly Glu Gln 530 535
540Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr545
550 555 560Val Lys Gln Leu
Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp 565
570 575Ser Val Glu Ile Ser Gly Val Glu Asp Arg
Phe Asn Ala Ser Leu Gly 580 585
590Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605Asn Glu Glu Asn Glu Asp Ile
Leu Glu Asp Ile Val Leu Thr Leu Thr 610 615
620Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr
Ala625 630 635 640His Leu
Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655Thr Gly Trp Gly Arg Leu Ser
Arg Lys Leu Ile Asn Gly Ile Arg Asp 660 665
670Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp
Gly Phe 675 680 685Ala Asn Arg Asn
Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe 690
695 700Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln
Gly Asp Ser Leu705 710 715
720His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735Ile Leu Gln Thr Val
Lys Val Val Asp Glu Leu Val Lys Val Met Gly 740
745 750Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala
Arg Glu Asn Gln 755 760 765Thr Thr
Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile 770
775 780Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile
Leu Lys Glu His Pro785 790 795
800Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815Gln Asn Gly Arg
Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg 820
825 830Leu Ser Asp Tyr Asp Val Asp Ala Ile Val Pro
Gln Ser Phe Leu Lys 835 840 845Asp
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg 850
855 860Gly Lys Ser Asp Asn Val Pro Ser Glu Glu
Val Val Lys Lys Met Lys865 870 875
880Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg
Lys 885 890 895Phe Asp Asn
Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp 900
905 910Lys Ala Gly Phe Ile Lys Arg Gln Leu Val
Glu Thr Arg Gln Ile Thr 915 920
925Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp 930
935 940Glu Asn Asp Lys Leu Ile Arg Glu
Val Lys Val Ile Thr Leu Lys Ser945 950
955 960Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe
Tyr Lys Val Arg 965 970
975Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990Val Gly Thr Ala Leu Ile
Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe 995 1000
1005Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
Ala Lys 1010 1015 1020Ser Glu Gln Glu
Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser1025 1030
1035 1040Asn Ile Met Asn Phe Phe Lys Thr Glu
Ile Thr Leu Ala Asn Gly Glu 1045 1050
1055Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu
Ile 1060 1065 1070Val Trp Asp
Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser 1075
1080 1085Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
Val Gln Thr Gly Gly 1090 1095 1100Phe
Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile1105
1110 1115 1120Ala Arg Lys Lys Asp Trp
Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser 1125
1130 1135Pro Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys
Val Glu Lys Gly 1140 1145
1150Lys Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile
1155 1160 1165Met Glu Arg Ser Ser Phe Glu
Lys Asn Pro Ile Asp Phe Leu Glu Ala 1170 1175
1180Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro
Lys1185 1190 1195 1200Tyr Ser
Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser
1205 1210 1215Ala Gly Glu Leu Gln Lys Gly
Asn Glu Leu Ala Leu Pro Ser Lys Tyr 1220 1225
1230Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys
Gly Ser 1235 1240 1245Pro Glu Asp
Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His 1250
1255 1260Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe
Ser Lys Arg Val1265 1270 1275
1280Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys
1285 1290 1295His Arg Asp Lys Pro
Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu 1300
1305 1310Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
Lys Tyr Phe Asp 1315 1320 1325Thr
Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp 1330
1335 1340Ala Thr Leu Ile His Gln Ser Ile Thr Gly
Leu Tyr Glu Thr Arg Ile1345 1350 1355
1360Asp Leu Ser Gln Leu Gly Gly Asp
1365
SEQ ID NO: 4 (2279 nt, DNA, Artificial Sequence): guide RNA expression vector sequence; misc_feature (331)...(350), n = A, T, C or G
gacgtcgcta gctgtacaaa
aaagcaggct ttaaaggaac caattcagtc gactggatcc 60ggtaccaagg tcgggcagga
agagggccta tttcccatga ttccttcata tttgcatata 120cgatacaagg ctgttagaga
gataattaga attaatttga ctgtaaacac aaagatatta 180gtacaaaata cgtgacgtag
aaagtaataa tttcttgggt agtttgcagt tttaaaatta 240tgttttaaaa tggactatca
tatgcttacc gtaacttgaa agtatttcga tttcttggct 300ttatatatct tgtggaaagg
acgaaacacc nnnnnnnnnn nnnnnnnnnn gttttagagc 360tagaaatagc aagttaaaat
aaggctagtc cgttatcaac ttgaaaaagt ggcaccgagt 420cggtgctttt tttaagcttg
ggccgctcga ggtacctctc tacatatgac atgtgagcaa 480aaggccagca aaaggccagg
aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc 540tccgcccccc tgacgagcat
cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga 600caggactata aagataccag
gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc 660cgaccctgcc gcttaccgga
tacctgtccg cctttctccc ttcgggaagc gtggcgcttt 720ctcatagctc acgctgtagg
tatctcagtt cggtgtaggt cgttcgctcc aagctgggct 780gtgtgcacga accccccgtt
cagcccgacc gctgcgcctt atccggtaac tatcgtcttg 840agtccaaccc ggtaagacac
gacttatcgc cactggcagc agccactggt aacaggatta 900gcagagcgag gtatgtaggc
ggtgctacag agttcttgaa gtggtggcct aactacggct 960acactagaag aacagtattt
ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa 1020gagttggtag ctcttgatcc
ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt 1080gcaagcagca gattacgcgc
agaaaaaaag gatctcaaga agatcctttg atcttttcta 1140cggggtctga cgctcagtgg
aacgaaaact cacgttaagg gattttggtc atgagattat 1200caaaaaggat cttcacctag
atccttttaa attaaaaatg aagttttaaa tcaatctaaa 1260gtatatatga gtaaacttgg
tctgacagtt accaatgctt aatcagtgag gcacctatct 1320cagcgatctg tctatttcgt
tcatccatag ttgcctgact ccccgtcgtg tagataacta 1380cgatacggga gggcttacca
tctggcccca gtgctgcaat gataccgcga gacccacgct 1440caccggctcc agatttatca
gcaataaacc agccagccgg aagggccgag cgcagaagtg 1500gtcctgcaac tttatccgcc
tccatccagt ctattaattg ttgccgggaa gctagagtaa 1560gtagttcgcc agttaatagt
ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt 1620cacgctcgtc gtttggtatg
gcttcattca gctccggttc ccaacgatca aggcgagtta 1680catgatcccc catgttgtgc
aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca 1740gaagtaagtt ggccgcagtg
ttatcactca tggttatggc agcactgcat aattctctta 1800ctgtcatgcc atccgtaaga
tgcttttctg tgactggtga gtactcaacc aagtcattct 1860gagaatagtg tatgcggcga
ccgagttgct cttgcccggc gtcaatacgg gataataccg 1920cgccacatag cagaacttta
aaagtgctca tcattggaaa acgttcttcg gggcgaaaac 1980tctcaaggat cttaccgctg
ttgagatcca gttcgatgta acccactcgt gcacccaact 2040gatcttcagc atcttttact
ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa 2100atgccgcaaa aaagggaata
agggcgacac ggaaatgttg aatactcata ctcttccttt 2160ttcaatatta ttgaagcatt
tatcagggtt attgtctcat gagcggatac atatttgaat 2220gtatttagaa aaataaacaa
ataggggttc cgcgcacatt tccccgaaaa gtgccacct 2279
SEQ ID NO: 5 (7786 nt, DNA, Artificial Sequence): CMV-T7-Cas9 D10A/H840A-3xFLAG-VP64 sequence
atatgccaag
tacgccccct attgacgtca atgacggtaa atggcccgcc tggcattatg 60cccagtacat
gaccttatgg gactttccta cttggcagta catctacgta ttagtcatcg 120ctattaccat
ggtgatgcgg ttttggcagt acatcaatgg gcgtggatag cggtttgact 180cacggggatt
tccaagtctc caccccattg acgtcaatgg gagtttgttt tggcaccaaa 240atcaacggga
ctttccaaaa tgtcgtaaca actccgcccc attgacgcaa atgggcggta 300ggcgtgtacg
gtgggaggtc tatataagca gagctggttt agtgaaccgt cagatccgct 360agagatccgc
ggccgctaat acgactcact atagggagag ccgccaccat ggataagaaa 420tactcaatag
gcttagctat cggcacaaat agcgtcggat gggcggtgat cactgatgaa 480tataaggttc
cgtctaaaaa gttcaaggtt ctgggaaata cagaccgcca cagtatcaaa 540aaaaatctta
taggggctct tttatttgac agtggagaga cagcggaagc gactcgtctc 600aaacggacag
ctcgtagaag gtatacacgt cggaagaatc gtatttgtta tctacaggag 660attttttcaa
atgagatggc gaaagtagat gatagtttct ttcatcgact tgaagagtct 720tttttggtgg
aagaagacaa gaagcatgaa cgtcatccta tttttggaaa tatagtagat 780gaagttgctt
atcatgagaa atatccaact atctatcatc tgcgaaaaaa attggtagat 840tctactgata
aagcggattt gcgcttaatc tatttggcct tagcgcatat gattaagttt 900cgtggtcatt
ttttgattga gggagattta aatcctgata atagtgatgt ggacaaacta 960tttatccagt
tggtacaaac ctacaatcaa ttatttgaag aaaaccctat taacgcaagt 1020ggagtagatg
ctaaagcgat tctttctgca cgattgagta aatcaagacg attagaaaat 1080ctcattgctc
agctccccgg tgagaagaaa aatggcttat ttgggaatct cattgctttg 1140tcattgggtt
tgacccctaa ttttaaatca aattttgatt tggcagaaga tgctaaatta 1200cagctttcaa
aagatactta cgatgatgat ttagataatt tattggcgca aattggagat 1260caatatgctg
atttgttttt ggcagctaag aatttatcag atgctatttt actttcagat 1320atcctaagag
taaatactga aataactaag gctcccctat cagcttcaat gattaaacgc 1380tacgatgaac
atcatcaaga cttgactctt ttaaaagctt tagttcgaca acaacttcca 1440gaaaagtata
aagaaatctt ttttgatcaa tcaaaaaacg gatatgcagg ttatattgat 1500gggggagcta
gccaagaaga attttataaa tttatcaaac caattttaga aaaaatggat 1560ggtactgagg
aattattggt gaaactaaat cgtgaagatt tgctgcgcaa gcaacggacc 1620tttgacaacg
gctctattcc ccatcaaatt cacttgggtg agctgcatgc tattttgaga 1680agacaagaag
acttttatcc atttttaaaa gacaatcgtg agaagattga aaaaatcttg 1740acttttcgaa
ttccttatta tgttggtcca ttggcgcgtg gcaatagtcg ttttgcatgg 1800atgactcgga
agtctgaaga aacaattacc ccatggaatt ttgaagaagt tgtcgataaa 1860ggtgcttcag
ctcaatcatt tattgaacgc atgacaaact ttgataaaaa tcttccaaat 1920gaaaaagtac
taccaaaaca tagtttgctt tatgagtatt ttacggttta taacgaattg 1980acaaaggtca
aatatgttac tgaaggaatg cgaaaaccag catttctttc aggtgaacag 2040aagaaagcca
ttgttgattt actcttcaaa acaaatcgaa aagtaaccgt taagcaatta 2100aaagaagatt
atttcaaaaa aatagaatgt tttgatagtg ttgaaatttc aggagttgaa 2160gatagattta
atgcttcatt aggtacctac catgatttgc taaaaattat taaagataaa 2220gattttttgg
ataatgaaga aaatgaagat atcttagagg atattgtttt aacattgacc 2280ttatttgaag
atagggagat gattgaggaa agacttaaaa catatgctca cctctttgat 2340gataaggtga
tgaaacagct taaacgtcgc cgttatactg gttggggacg tttgtctcga 2400aaattgatta
atggtattag ggataagcaa tctggcaaaa caatattaga ttttttgaaa 2460tcagatggtt
ttgccaatcg caattttatg cagctgatcc atgatgatag tttgacattt 2520aaagaagaca
ttcaaaaagc acaagtgtct ggacaaggcg atagtttaca tgaacatatt 2580gcaaatttag
ctggtagccc tgctattaaa aaaggtattt tacagactgt aaaagttgtt 2640gatgaattgg
tcaaagtaat ggggcggcat aagccagaaa atatcgttat tgaaatggca 2700cgtgaaaatc
agacaactca aaagggccag aaaaattcgc gagagcgtat gaaacgaatc 2760gaagaaggta
tcaaagaatt aggaagtcag attcttaaag agcatcctgt tgaaaatact 2820caattgcaaa
atgaaaagct ctatctctat tatctccaaa atggaagaga catgtatgtg 2880gaccaagaat
tagatattaa tcgtttaagt gattatgatg tcgatgccat tgttccacaa 2940agtttcctta
aagacgattc aatagacaat aaggtcttaa cgcgttctga taaaaatcgt 3000ggtaaatcgg
ataacgttcc aagtgaagaa gtagtcaaaa agatgaaaaa ctattggaga 3060caacttctaa
acgccaagtt aatcactcaa cgtaagtttg ataatttaac gaaagctgaa 3120cgtggaggtt
tgagtgaact tgataaagct ggttttatca aacgccaatt ggttgaaact 3180cgccaaatca
ctaagcatgt ggcacaaatt ttggatagtc gcatgaatac taaatacgat 3240gaaaatgata
aacttattcg agaggttaaa gtgattacct taaaatctaa attagtttct 3300gacttccgaa
aagatttcca attctataaa gtacgtgaga ttaacaatta ccatcatgcc 3360catgatgcgt
atctaaatgc cgtcgttgga actgctttga ttaagaaata tccaaaactt 3420gaatcggagt
ttgtctatgg tgattataaa gtttatgatg ttcgtaaaat gattgctaag 3480tctgagcaag
aaataggcaa agcaaccgca aaatatttct tttactctaa tatcatgaac 3540ttcttcaaaa
cagaaattac acttgcaaat ggagagattc gcaaacgccc tctaatcgaa 3600actaatgggg
aaactggaga aattgtctgg gataaagggc gagattttgc cacagtgcgc 3660aaagtattgt
ccatgcccca agtcaatatt gtcaagaaaa cagaagtaca gacaggcgga 3720ttctccaagg
agtcaatttt accaaaaaga aattcggaca agcttattgc tcgtaaaaaa 3780gactgggatc
caaaaaaata tggtggtttt gatagtccaa cggtagctta ttcagtccta 3840gtggttgcta
aggtggaaaa agggaaatcg aagaagttaa aatccgttaa agagttacta 3900gggatcacaa
ttatggaaag aagttccttt gaaaaaaatc cgattgactt tttagaagct 3960aaaggatata
aggaagttaa aaaagactta atcattaaac tacctaaata tagtcttttt 4020gagttagaaa
acggtcgtaa acggatgctg gctagtgccg gagaattaca aaaaggaaat 4080gagctggctc
tgccaagcaa atatgtgaat tttttatatt tagctagtca ttatgaaaag 4140ttgaagggta
gtccagaaga taacgaacaa aaacaattgt ttgtggagca gcataagcat 4200tatttagatg
agattattga gcaaatcagt gaattttcta agcgtgttat tttagcagat 4260gccaatttag
ataaagttct tagtgcatat aacaaacata gagacaaacc aatacgtgaa 4320caagcagaaa
atattattca tttatttacg ttgacgaatc ttggagctcc cgctgctttt 4380aaatattttg
atacaacaat tgatcgtaaa cgatatacgt ctacaaaaga agttttagat 4440gccactctta
tccatcaatc catcactggt ctttatgaaa cacgcattga tttgagtcag 4500ctaggaggtg
acggttctcc caagaagaag aggaaagtct cgagcgacta caaagaccat 4560gacggtgatt
ataaagatca tgacatcgat tacaaggatg acgatgacaa ggctgcagga 4620ggcggtggaa
gcgggcgcgc cgacgcgctg gacgatttcg atctcgacat gctgggttct 4680gatgccctcg
atgactttga cctggatatg ttgggaagcg acgcattgga tgactttgat 4740ctggacatgc
tcggctccga tgctctggac gatttcgatc tcgatatgtt ataaccggtc 4800atcatcacca
tcaccattga gtttaaaccc gctgatcagc ctcgactgtg ccttctagtt 4860gccagccatc
tgttgtttgc ccctcccccg tgccttcctt gaccctggaa ggtgccactc 4920ccactgtcct
ttcctaataa aatgaggaaa ttgcatcgca ttgtctgagt aggtgtcatt 4980ctattctggg
gggtggggtg gggcaggaca gcaaggggga ggattgggaa gacaatagca 5040ggcatgctgg
ggatgcggtg ggctctatgg cttctgaggc ggaaagaacc agctggggct 5100cgataccgtc
gacctctagc tagagcttgg cgtaatcatg gtcatagctg tttcctgtgt 5160gaaattgtta
tccgctcaca attccacaca acatacgagc cggaagcata aagtgtaaag 5220cctagggtgc
ctaatgagtg agctaactca cattaattgc gttgcgctca ctgcccgctt 5280tccagtcggg
aaacctgtcg tgccagctgc attaatgaat cggccaacgc gcggggagag 5340gcggtttgcg
tattgggcgc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg 5400ttcggctgcg
gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat 5460caggggataa
cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta 5520aaaaggccgc
gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa 5580atcgacgctc
aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 5640cccctggaag
ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 5700ccgcctttct
cccttcggga agcgtggcgc tttctcaatg ctcacgctgt aggtatctca 5760gttcggtgta
ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 5820accgctgcgc
cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat 5880cgccactggc
agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 5940cagagttctt
gaagtggtgg cctaactacg gctacactag aaggacagta tttggtatct 6000gcgctctgct
gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac 6060aaaccaccgc
tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa 6120aaggatctca
agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa 6180actcacgtta
agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt 6240taaattaaaa
atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca 6300gttaccaatg
cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca 6360tagttgcctg
actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc 6420ccagtgctgc
aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa 6480accagccagc
cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc 6540agtctattaa
ttgttgccgg gaagctagag taagtagttc gccagttaat agtttgcgca 6600acgttgttgc
cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat 6660tcagctccgg
ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag 6720cggttagctc
cttcggtcct ccgatcgttg tcagaagtaa gttggccgca gtgttatcac 6780tcatggttat
ggcagcactg cataattctc ttactgtcat gccatccgta agatgctttt 6840ctgtgactgg
tgagtactca accaagtcat tctgagaata gtgtatgcgg cgaccgagtt 6900gctcttgccc
ggcgtcaata cgggataata ccgcgccaca tagcagaact ttaaaagtgc 6960tcatcattgg
aaaacgttct tcggggcgaa aactctcaag gatcttaccg ctgttgagat 7020ccagttcgat
gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca 7080gcgtttctgg
gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga ataagggcga 7140cacggaaatg
ttgaatactc atactcttcc tttttcaata ttattgaagc atttatcagg 7200gttattgtct
catgagcgga tacatatttg aatgtattta gaaaaataaa caaatagggg 7260ttccgcgcac
atttccccga aaagtgccac ctgacgtcga cggatcggga gatcgatctc 7320ccgatcccct
agggtcgact ctcagtacaa tctgctctga tgccgcatag ttaagccagt 7380atctgctccc
tgcttgtgtg ttggaggtcg ctgagtagtg cgcgagcaaa atttaagcta 7440caacaaggca
aggcttgacc gacaattgca tgaagaatct gcttagggtt aggcgttttg 7500cgctgcttcg
cgatgtacgg gccagatata cgcgttgaca ttgattattg actagttatt 7560aatagtaatc
aattacgggg tcattagttc atagcccata tatggagttc cgcgttacat 7620aacttacggt
aaatggcccg cctggctgac cgcccaacga cccccgccca ttgacgtcaa 7680taatgacgta
tgttcccata gtaacgccaa tagggacttt ccattgacgt caatgggtgg 7740actatttacg
gtaaactgcc cacttggcag tacatcaagt gtatcc
7786
SEQ ID NO: 6 (7785 nt, DNA, Artificial Sequence): CMV-T7-Cas9 recoded D10A/H840A-3xFLAG-VP64 sequence
atatgccaag tacgccccct attgacgtca atgacggtaa atggcccgcc
tggcattatg 60cccagtacat gaccttatgg gactttccta cttggcagta catctacgta
ttagtcatcg 120ctattaccat ggtgatgcgg ttttggcagt acatcaatgg gcgtggatag
cggtttgact 180cacggggatt tccaagtctc caccccattg acgtcaatgg gagtttgttt
tggcaccaaa 240atcaacggga ctttccaaaa tgtcgtaaca actccgcccc attgacgcaa
atgggcggta 300ggcgtgtacg gtgggaggtc tatataagca gagctggttt agtgaaccgt
cagatccgct 360agagatccgc ggccgctaat acgactcact atagggagag ccgccaccat
ggataaaaag 420tattctattg gtttagccat cggcactaat tccgttggat gggctgtcat
aaccgatgaa 480tacaaagtac cttcaaagaa atttaaggtg ttggggaaca cagaccgtca
ttcgattaaa 540aagaatctta tcggtgccct cctattcgat agtggcgaaa cggcagaggc
gactcgcctg 600aaacgaaccg ctcggagaag gtatacacgt cgcaagaacc gaatatgtta
cttacaagaa 660atttttagca atgagatggc caaagttgac gattctttct ttcaccgttt
ggaagagtcc 720ttccttgtcg aagaggacaa gaaacatgaa cggcacccca tctttggaaa
catagtagat 780gaggtggcat atcatgaaaa gtacccaacg atttatcacc tcagaaaaaa
gctagttgac 840tcaactgata aagcggacct gaggttaatc tacttggctc ttgcccatat
gataaagttc 900cgtgggcact ttctcattga gggtgatcta aatccggaca actcggatgt
cgacaaactg 960ttcatccagt tagtacaaac ctataatcag ttgtttgaag agaaccctat
aaatgcaagt 1020ggcgtggatg cgaaggctat tcttagcgcc cgcctctcta aatcccgacg
gctagaaaac 1080ctgatcgcac aattacccgg agagaagaaa aatgggttgt tcggtaacct
tatagcgctc 1140tcactaggcc tgacaccaaa ttttaagtcg aacttcgact tagctgaaga
tgccaaattg 1200cagcttagta aggacacgta cgatgacgat ctcgacaatc tactggcaca
aattggagat 1260cagtatgcgg acttattttt ggctgccaaa aaccttagcg atgcaatcct
cctatctgac 1320atactgagag ttaatactga gattaccaag gcgccgttat ccgcttcaat
gatcaaaagg 1380tacgatgaac atcaccaaga cttgacactt ctcaaggccc tagtccgtca
gcaactgcct 1440gagaaatata aggaaatatt ctttgatcag tcgaaaaacg ggtacgcagg
ttatattgac 1500ggcggagcga gtcaagagga attctacaag tttatcaaac ccatattaga
gaagatggat 1560gggacggaag agttgcttgt aaaactcaat cgcgaagatc tactgcgaaa
gcagcggact 1620ttcgacaacg gtagcattcc acatcaaatc cacttaggcg aattgcatgc
tatacttaga 1680aggcaggagg atttttatcc gttcctcaaa gacaatcgtg aaaagattga
gaaaatccta 1740acctttcgca taccttacta tgtgggaccc ctggcccgag ggaactctcg
gttcgcatgg 1800atgacaagaa agtccgaaga aacgattact ccatggaatt ttgaggaagt
tgtcgataaa 1860ggtgcgtcag ctcaatcgtt catcgagagg atgaccaact ttgacaagaa
tttaccgaac 1920gaaaaagtat tgcctaagca cagtttactt tacgagtatt tcacagtgta
caatgaactc 1980acgaaagtta agtatgtcac tgagggcatg cgtaaacccg cctttctaag
cggagaacag 2040aagaaagcaa tagtagatct gttattcaag accaaccgca aagtgacagt
taagcaattg 2100aaagaggact actttaagaa aattgaatgc ttcgattctg tcgagatctc
cggggtagaa 2160gatcgattta atgcgtcact tggtacgtat catgacctcc taaagataat
taaagataag 2220gacttcctgg ataacgaaga gaatgaagat atcttagaag atatagtgtt
gactcttacc 2280ctctttgaag atcgggaaat gattgaggaa agactaaaaa catacgctca
cctgttcgac 2340gataaggtta tgaaacagtt aaagaggcgt cgctatacgg gctggggacg
attgtcgcgg 2400aaacttatca acgggataag agacaagcaa agtggtaaaa ctattctcga
ttttctaaag 2460agcgacggct tcgccaatag gaactttatg cagctgatcc atgatgactc
tttaaccttc 2520aaagaggata tacaaaaggc acaggtttcc ggacaagggg actcattgca
cgaacatatt 2580gcgaatcttg ctggttcgcc agccatcaaa aagggcatac tccagacagt
caaagtagtg 2640gatgagctag ttaaggtcat gggacgtcac aaaccggaaa acattgtaat
cgagatggca 2700cgcgaaaatc aaacgactca gaaggggcaa aaaaacagtc gagagcggat
gaagagaata 2760gaagagggta ttaaagaact gggcagccag atcttaaagg agcatcctgt
ggaaaatacc 2820caattgcaga acgagaaact ttacctctat tacctacaaa atggaaggga
catgtatgtt 2880gatcaggaac tggacataaa ccgtttatct gattacgacg tcgatgccat
tgtaccccaa 2940tcctttttga aggacgattc aatcgacaat aaagtgctta cacgctcgga
taagaaccga 3000gggaaaagtg acaatgttcc aagcgaggaa gtcgtaaaga aaatgaagaa
ctattggcgg 3060cagctcctaa atgcgaaact gataacgcaa agaaagttcg ataacttaac
taaagctgag 3120aggggtggct tgtctgaact tgacaaggcc ggatttatta aacgtcagct
cgtggaaacc 3180cgccaaatca caaagcatgt tgcacagata ctagattccc gaatgaatac
gaaatacgac 3240gagaacgata agctgattcg ggaagtcaaa gtaatcactt taaagtcaaa
attggtgtcg 3300gacttcagaa aggattttca attctataaa gttagggaga taaataacta
ccaccatgcg 3360cacgacgctt atcttaatgc cgtcgtaggg accgcactca ttaagaaata
cccgaagcta 3420gaaagtgagt ttgtgtatgg tgattacaaa gtttatgacg tccgtaagat
gatcgcgaaa 3480agcgaacagg agataggcaa ggctacagcc aaatacttct tttattctaa
cattatgaat 3540ttctttaaga cggaaatcac tctggcaaac ggagagatac gcaaacgacc
tttaattgaa 3600accaatgggg agacaggtga aatcgtatgg gataagggcc gggacttcgc
gacggtgaga 3660aaagttttgt ccatgcccca agtcaacata gtaaagaaaa ctgaggtgca
gaccggaggg 3720ttttcaaagg aatcgattct tccaaaaagg aatagtgata agctcatcgc
tcgtaaaaag 3780gactgggacc cgaaaaagta cggtggcttc gatagcccta cagttgccta
ttctgtccta 3840gtagtggcaa aagttgagaa gggaaaatcc aagaaactga agtcagtcaa
agaattattg 3900gggataacga ttatggagcg ctcgtctttt gaaaagaacc ccatcgactt
ccttgaggcg 3960aaaggttaca aggaagtaaa aaaggatctc ataattaaac taccaaagta
tagtctgttt 4020gagttagaaa atggccgaaa acggatgttg gctagcgccg gagagcttca
aaaggggaac 4080gaactcgcac taccgtctaa atacgtgaat ttcctgtatt tagcgtccca
ttacgagaag 4140ttgaaaggtt cacctgaaga taacgaacag aagcaacttt ttgttgagca
gcacaaacat 4200tatctcgacg aaatcataga gcaaatttcg gaattcagta agagagtcat
cctagctgat 4260gccaatctgg acaaagtatt aagcgcatac aacaagcaca gggataaacc
catacgtgag 4320caggcggaaa atattatcca tttgtttact cttaccaacc tcggcgctcc
agccgcattc 4380aagtattttg acacaacgat agatcgcaaa cgatacactt ctaccaagga
ggtgctagac 4440gcgacactga ttcaccaatc catcacggga ttatatgaaa ctcggataga
tttgtcacag 4500cttgggggtg acggatcccc caagaagaag aggaaagtct cgagcgacta
caaagaccat 4560gacggtgatt ataaagatca tgacatcgat tacaaggatg acgatgacaa
ggctgcagga 4620ggcggtggaa gcgggcgcgc cgacgcgctg gacgatttcg atctcgacat
gctgggttct 4680gatgccctcg atgactttga cctggatatg ttgggaagcg acgcattgga
tgactttgat 4740ctggacatgc tcggctccga tgctctggac gatttcgatc tcgatatgtt
ataaccggtc 4800atcatcacca tcaccattga gtttaaaccc gctgatcagc ctcgactgtg
ccttctagtt 4860gccagccatc tgttgtttgc ccctcccccg tgccttcctt gaccctggaa
ggtgccactc 4920ccactgtcct ttcctaataa aatgaggaaa ttgcatcgca ttgtctgagt
aggtgtcatt 4980ctattctggg gggtggggtg gggcaggaca gcaaggggga ggattgggaa
gacaatagca 5040ggcatgctgg ggatgcggtg ggctctatgg cttctgaggc ggaaagaacc
agctggggct 5100cgataccgtc gacctctagc tagagcttgg cgtaatcatg gtcatagctg
tttcctgtgt 5160gaaattgtta tccgctcaca attccacaca acatacgagc cggaagcata
aagtgtaaag 5220cctagggtgc ctaatgagtg agctaactca cattaattgc gttgcgctca
ctgcccgctt 5280tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc
gcggggagag 5340gcggtttgcg tattgggcgc tcttccgctt cctcgctcac tgactcgctg
cgctcggtcg 5400ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta
tccacagaat 5460caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc
aggaaccgta 5520aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag
catcacaaaa 5580atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac
caggcgtttc 5640cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc
ggatacctgt 5700ccgcctttct cccttcggga agcgtggcgc tttctcaatg ctcacgctgt
aggtatctca 5760gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc
gttcagcccg 5820accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga
cacgacttat 5880cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta
ggcggtgcta 5940cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta
tttggtatct 6000gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga
tccggcaaac 6060aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg
cgcagaaaaa 6120aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag
tggaacgaaa 6180actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc
tagatccttt 6240taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact
tggtctgaca 6300gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt
cgttcatcca 6360tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta
ccatctggcc 6420ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta
tcagcaataa 6480accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc
gcctccatcc 6540agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat
agtttgcgca 6600acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt
atggcttcat 6660tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg
tgcaaaaaag 6720cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca
gtgttatcac 6780tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta
agatgctttt 6840ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg
cgaccgagtt 6900gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact
ttaaaagtgc 6960tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg
ctgttgagat 7020ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt
actttcacca 7080gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga
ataagggcga 7140cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc
atttatcagg 7200gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa
caaatagggg 7260ttccgcgcac atttccccga aaagtgccac ctgacgtcga cggatcggga
gatcgatctc 7320ccgatcccct agggtcgact ctcagtacaa tctgctctga tgccgcatag
ttaagccagt 7380atctgctccc tgcttgtgtg ttggaggtcg ctgagtagtg cgcgagcaaa
atttaagcta 7440caacaaggca aggcttgacc gacaattgca tgaagaatct gcttagggtt
aggcgttttg 7500cgctgcttcg cgatgtacgg gccagatata cgcgttgaca ttgattattg
actagttatt 7560aatagtaatc aattacgggg tcattagttc atagcccata tatggagttc
cgcgttacat 7620aacttacggt aaatggcccg cctggctgac cgcccaacga cccccgccca
ttgacgtcaa 7680taatgacgta tgttcccata gtaacgccaa tagggacttt ccattgacgt
caatgggtgg 7740actatttacg gtaaactgcc cacttggcag tacatcaagt gtatc
7785
SEQ ID NO: 7 (1452 aa, PRT, Artificial Sequence): Cas9-activator protein sequence
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
Gly Trp Ala Val Ile Thr
Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe 20 25
30Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys
Asn Leu Ile 35 40 45Gly Ala Leu
Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu 50
55 60Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys
Asn Arg Ile Cys65 70 75
80Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95Phe Phe His Arg Leu Glu
Glu Ser Phe Leu Val Glu Glu Asp Lys Lys 100
105 110His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp
Glu Val Ala Tyr 115 120 125His Glu
Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp 130
135 140Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr
Leu Ala Leu Ala His145 150 155
160Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175Asp Asn Ser Asp
Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr 180
185 190Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala
Ser Gly Val Asp Ala 195 200 205Lys
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn 210
215 220Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys
Asn Gly Leu Phe Gly Asn225 230 235
240Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn
Phe 245 250 255Asp Leu Ala
Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp 260
265 270Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile
Gly Asp Gln Tyr Ala Asp 275 280
285Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp 290
295 300Ile Leu Arg Val Asn Thr Glu Ile
Thr Lys Ala Pro Leu Ser Ala Ser305 310
315 320Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu
Thr Leu Leu Lys 325 330
335Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350Asp Gln Ser Lys Asn Gly
Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser 355 360
365Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys
Met Asp 370 375 380Gly Thr Glu Glu Leu
Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg385 390
395 400Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile
Pro His Gln Ile His Leu 405 410
415Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430Leu Lys Asp Asn Arg
Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile 435
440 445Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser
Arg Phe Ala Trp 450 455 460Met Thr Arg
Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu465
470 475 480Val Val Asp Lys Gly Ala Ser
Ala Gln Ser Phe Ile Glu Arg Met Thr 485
490 495Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu
Pro Lys His Ser 500 505 510Leu
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys 515
520 525Tyr Val Thr Glu Gly Met Arg Lys Pro
Ala Phe Leu Ser Gly Glu Gln 530 535
540Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr545
550 555 560Val Lys Gln Leu
Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565                 570                 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
            580                 585                 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
        595                 600                 605
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
    610                 615                 620
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625                 630                 635                 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
                645                 650                 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
            660                 665                 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
        675                 680                 685
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
    690                 695                 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705                 710                 715                 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
                725                 730                 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
            740                 745                 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
        755                 760                 765
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
    770                 775                 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785                 790                 795                 800
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
                805                 810                 815
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
            820                 825                 830
Leu Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys
        835                 840                 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
    850                 855                 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865                 870                 875                 880
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
                885                 890                 895
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
            900                 905                 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
        915                 920                 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
    930                 935                 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945                 950                 955                 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
                965                 970                 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
            980                 985                 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
        995                 1000                1005
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
    1010                1015                1020
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser
1025                1030                1035                1040
Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu
                1045                1050                1055
Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile
            1060                1065                1070
Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser
        1075                1080                1085
Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly
    1090                1095                1100
Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile
1105                1110                1115                1120
Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser
                1125                1130                1135
Pro Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly
            1140                1145                1150
Lys Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile
        1155                1160                1165
Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala
    1170                1175                1180
Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys
1185                1190                1195                1200
Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser
                1205                1210                1215
Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr
            1220                1225                1230
Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
        1235                1240                1245
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
    1250                1255                1260
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val
1265                1270                1275                1280
Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys
                1285                1290                1295
His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu
            1300                1305                1310
Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp
        1315                1320                1325
Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp
    1330                1335                1340
Ala Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile
1345                1350                1355                1360
Asp Leu Ser Gln Leu Gly Gly Asp Gly Ser Asp Tyr Lys Asp His Asp
                1365                1370                1375
Gly Asp Tyr Lys Asp His Asp Ile Asp Tyr Lys Asp Asp Asp Asp Lys
            1380                1385                1390
Ala Ala Gly Gly Gly Gly Ser Gly Arg Ala Asp Ala Leu Asp Asp Phe
        1395                1400                1405
Asp Leu Asp Met Leu Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp
    1410                1415                1420
Met Leu Gly Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly
1425                1430                1435                1440
Ser Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu
                1445                1450