Patent application title: CODON OPTIMIZED NUCLEIC ACID ENCODING A RETINITIS PIGMENTOSA GTPASE REGULATOR (RPGR)
Inventors:
Guo-Jie Ye (Gainesville, FL, US)
Jilin Liu (Gainesville, FL, US)
IPC8 Class: AC07K1447FI
Publication date: 2019-09-12
Patent application number: 20190276507
Abstract:
This invention relates generally to a codon optimized nucleic acid
encoding a retinitis pigmentosa GTPase regulator (RPGR) protein. The
nucleic acid has enhanced stability during plasmid production relative to
a wildtype cDNA encoding the RPGR protein. The invention also relates to
expression cassettes, vectors, and host cells comprising the codon
optimized nucleic acid. Methods for preparing a recombinant
adeno-associated (rAAV) expression vector comprising the codon optimized
nucleic acid sequence are also provided. The nucleic acids, expression
cassettes, vectors, and host cells provided may be useful in the large
scale production of rAAV expression vectors for gene therapy
applications.
Claims:
1. A polynucleotide comprising the nucleic acid sequence of SEQ ID NO: 1
encoding a human retinitis pigmentosa GTPase regulator (RPGR) protein.
2. An expression cassette comprising the polynucleotide of claim 1 and an expression control sequence operably linked and heterologous to the nucleic acid sequence.
3. A vector comprising the polynucleotide of claim 1.
4. The vector of claim 3, wherein the vector is a recombinant adeno-associated (rAAV) expression vector.
5. A recombinant herpes simplex virus (rHSV) comprising the polynucleotide of claim 1.
6. A host cell comprising the polynucleotide of claim 1.
7. The host cell of claim 6, wherein the host cell is a mammalian cell.
8. The host cell of claim 6, wherein the host cell is a HeLa cell, a BHK21 cell or a Vero cell.
9. The host cell of claim 6, wherein the host cell is a V27 cell.
10. The expression cassette of claim 2, wherein the expression control sequence is a human interphotoreceptor retinoid-binding protein (IRBP) promoter.
11. The expression cassette of claim 10, wherein the human IRBP promoter comprises a nucleic acid sequence having at least 95% sequence identity to the nucleic acid sequence of SEQ ID NO: 8 and directs preferential expression in rods and cones.
12. The expression cassette of claim 10, wherein the human IRBP promoter comprises the nucleic acid sequence of SEQ ID NO: 8.
13. The polynucleotide of claim 1, wherein the polynucleotide comprises the nucleic acid sequence of SEQ ID NO: 7.
14. A method of producing the rAAV expression vector of claim 4, comprising (a) infecting a host cell with a recombinant herpes simplex virus (rHSV) comprising the nucleic acid sequence of SEQ ID NO: 1; (b) incubating the host cell; and (c) following incubation, collecting rAAV from the host cell of step (b).
15. The method of claim 14, wherein the host cell is a HeLa cell, a BHK21 cell or a Vero cell.
16. The method of claim 14, wherein the rHSV further comprises a human IRBP promoter operably linked to the nucleic acid sequence of SEQ ID NO: 1.
17. The method of claim 16, wherein the human IRBP promoter comprises a nucleic acid sequence having at least 95% sequence identity to the nucleic acid sequence of SEQ ID NO: 8 and directs preferential expression in rods and cones.
18. The method of claim 16, wherein the human IRBP promoter comprises the nucleic acid sequence of SEQ ID NO: 8.
19. The method of claim 14, wherein the rHSV comprises the nucleic acid sequence of SEQ ID NO: 7.
Description:
RELATED APPLICATIONS
[0001] This application is a Continuation Application of U.S. patent application Ser. No. 15/360,362, filed on Nov. 23, 2016, which is a Continuation Application of U.S. patent application Ser. No. 14/687,227, filed on Apr. 15, 2015 and issued as U.S. Pat. No. 9,534,225 on Jan. 3, 2017, which claims the benefit of U.S. Provisional Application No. 61/979,633, filed on Apr. 15, 2014, the entire contents of which are expressly incorporated herein by reference.
SUBMISSION OF SEQUENCE LISTING
[0002] The Sequence Listing associated with this application is filed in electronic format via EFS-Web and hereby incorporated by reference into the specification in its entirety. The name of the text file containing the Sequence Listing is "Sequence Listing" and is 16 kb in size.
FIELD OF THE INVENTION
[0003] This invention relates generally to codon optimized nucleic acid sequences encoding a human retinitis pigmentosa GTPase regulator (RPGR).
BACKGROUND OF THE INVENTION
[0004] Retinitis pigmentosa (RP) is an inherited degenerative disease of the retina that affects approximately one in 3,500 individuals, with an estimated 1.5 million patients worldwide. See Churchill et al., 2013, Invest. Ophthalmol. Vis. Sci. 54(2): 1411-1416. RP is caused by progressive loss of rod and cone photoreceptors, resulting in night blindness followed by loss of visual fields. The disease may result in legal or even complete blindness. Mutations in the retinitis pigmentosa GTPase regulator (RPGR) gene account for greater than 70% of the cases of human X-linked retinitis pigmentosa (XLRP), the most severe subtype of RP. See Beltran et al., 2012, PNAS 109(6): 2132-2137 and Bader et al., 2003, Invest. Ophthalmol. Vis. Sci. 44(4): 1458-1463.
[0005] Alternative splicing of the RPGR gene results in expression of multiple isoforms of the RPGR protein. The mRNA for isoform A contains all 19 exons of the gene, while the mRNA for isoform C contains exons 1 to 15 and a large part of intron 15. Intron 15 is a purine-rich region that contains highly repetitive sequences that code for glutamate and glycine repeats (EEEGEGEGE in human and EEGEGE in mouse), see Vervoort et al., Mutational hot spot within a new RPGR exon in X-linked retinitis pigmentosa. Nat Genet 2000; 25:462-6. Isoform A is constitutively expressed in all tissues while isoform C, which is also referred to as "ORF15", is the predominant form expressed in the connecting cilium of photoreceptor, see Hong et al., Invest Ophthalmol Vis Sci 2002; 43:3373-82, and Hong et al., Invest Ophthalmol Vis Sci 2003; 44:2413-21.
[0006] A total of 55% of RPGR-related XLRP is caused by mutations in ORF15, all of which result from deletions that lead to truncated proteins. Most of the other cases are caused by mutations in exons 1-13, which can be either missense or nonsense mutations, with a small number caused by mutations in introns or large deletions. No cases have been identified due to mutations in exons 16 to 19.
[0007] Recent studies have demonstrated the potential of gene therapy approaches to treating XLRP caused by mutations in the RPGR gene. For example, Beltran et al. have shown that subretinal injections of adeno-associated virus (AAV) vectors expressing human RPGR increased rod and cone photoreceptor function in a canine model of XLRP.
[0008] However, one of the challenges in large-scale production of AAV vectors for clinical use is that nucleic acid sequences encoding a protein of interest such as RPGR may be unstable, resulting in the accumulation of several mutations and deletions. For example, the RPGR gene contains a region of 1.2 kb called ORF15 near the 3' end of the cDNA that is highly repetitive and GA rich. This region is a mutation "hot spot" in the population. This repetitive region is very unstable during cloning and vector preparation, and the clones obtained generally contain mutations and deletions. These mutations can potentially alter or eliminate RPGR protein function, limiting the use of this protein in gene therapy applications. Therefore, a need exists to identify methods of stabilizing RPGR cDNAs during large-scale production of AAV vectors.
SUMMARY OF THE INVENTION
[0009] It has been surprisingly found that the nucleic acid sequence of SEQ ID NO: 1 encoding the human RPGR protein is stable in large scale production of AAV plasmid pTR-IRBP-RPGRsyn. This nucleic acid sequence was developed through codon optimization of the wild type RPGR cDNA. In one aspect, the present invention provides a polynucleotide comprising the nucleic acid sequence of SEQ ID NO: 1 encoding a human RPGR protein.
[0010] In one aspect, the invention features a polynucleotide comprising the nucleic acid sequence of SEQ ID NO: 1 encoding a human retinitis pigmentosa GTPase regulator (RPGR) protein.
[0011] In one embodiment, the invention features an expression cassette comprising the polynucleotide of the above aspect, and an expression control sequence operably linked and heterologous to the nucleic acid sequence.
[0012] In another embodiment, the invention features a vector comprising the polynucleotide of claim 1. In a further embodiment, the vector is a recombinant adeno-associated (rAAV) expression vector.
[0013] In another embodiment, the invention features a recombinant herpes simplex virus (rHSV) comprising the polynucleotide of any one of the above aspects.
[0014] In another embodiment, the invention features a host cell comprising the polynucleotide of any one of the above aspects. In a related embodiment, the host cell is a mammalian cell. In a further related embodiment, the host cell is a HeLa cell, a BHK21 cell or a Vero cell. In another further embodiment, the host cell is a V27 cell.
[0015] In another embodiment, the expression control sequence is a human interphotoreceptor retinoid-binding protein (IRBP) promoter. In a further related embodiment, the human IRBP promoter comprises a nucleic acid sequence having at least 95% sequence identity to the nucleic acid sequence of SEQ ID NO: 8 and directs preferential expression in rods and cones. In another further embodiment, the human IRBP promoter comprises the nucleic acid sequence of SEQ ID NO: 8.
[0016] In one embodiment, the polynucleotide comprises the nucleic acid sequence of SEQ ID NO: 7.
[0017] The invention also features in another embodiment a method of producing the rAAV expression vector of the above aspect, comprising (a) infecting a host cell with a recombinant herpes simplex virus (rHSV) comprising the nucleic acid sequence of SEQ ID NO: 1; (b) incubating the host cell; and (c) following incubation, collecting rAAV from the host cell of step (b).
[0018] In one embodiment, the host cell is a HeLa cell, a BHK21 cell or a Vero cell.
[0019] In another embodiment, the rHSV further comprises a human IRBP promoter operably linked to the nucleic acid sequence of SEQ ID NO: 1. In a further embodiment, the human IRBP promoter comprises a nucleic acid sequence having at least 95% sequence identity to the nucleic acid sequence of SEQ ID NO: 8 and directs preferential expression in rods and cones. In a further related embodiment, the human IRBP promoter comprises the nucleic acid sequence of SEQ ID NO: 8.
[0020] In another embodiment, the rHSV comprises the nucleic acid sequence of SEQ ID NO: 7.
BRIEF DESCRIPTION OF THE DRAWINGS
[0021] FIGS. 1A-1B show a sequence alignment of codon optimized RPGR cDNA (RPGRsyn; SEQ ID NO: 1) and the wildtype RPGR cDNA (Genbank Accession No. NM_001034853; SEQ ID NO: 5).
[0022] FIG. 2 shows a map of plasmid pUC57-RPGRsyn.
[0023] FIG. 3 shows pUC57-RPGRsyn plasmid DNA clones N5 and N6 prepared by mini-prep and larger scale midi-prep (Midi) and digested with restriction enzymes NotI and PciI. Plasmid DNA from mini-preps was retransformed into SURE2 cells before larger scale production by midi-prep.
[0024] FIG. 4 shows pUC57-RPGRsyn plasmid DNA from mini-preps (mini_2 and mini_3) digested with restriction enzymes NotI and PciI. Plasmid DNA was not detectable in larger scale midi-preps (midi_2 and midi_3). The seeding culture was stored at 4 °C overnight and used as the inoculant for larger scale plasmid production.
[0025] FIG. 5 shows a map of AAV proviral plasmid pTR-IRBP-RPGRsyn.
[0026] FIG. 6 shows the restriction maps of pTR-IRBP-RPGRsyn plasmid DNA isolated from transformed bacteria after 4 rounds of serial overnight propagation, along with a control plasmid of pTR-IRBP-CNGB3co. Bacteria transformed with pTR-IRBP-RPGRsyn or pTR-IRBP/GNAT2-hCNGB3co plasmids were grown in medium at 37 °C overnight. The next morning, plasmid DNA was purified from 1.5 mL of overnight culture, and the remaining culture was left at room temperature until late afternoon and then used to inoculate 2 mL of culture medium (1:1000 dilution) for the 2nd round of propagation. The same procedures were followed for the 3rd and 4th rounds of propagation. Plasmid DNA purified from each round was then analyzed by restriction digestion with SmaI to confirm the integrity of the ITR sequence of the plasmid. The restriction maps remained the same for both pTR-IRBP-RPGRsyn and the control plasmid pTR-IRBP-CNGB3co through 3 rounds of propagation in bacteria. However, the yield decreased significantly after the 3rd round of propagation, and almost no plasmid restriction fragments were detected after the 4th round of propagation in bacteria.
[0027] FIG. 7 shows the sequence alignment of the consensus sequence of contigs obtained from pTR-IRBP-RPGRsyn plasmid DNA to the reference pTR-IRBP-RPGRsyn sequence.
DETAILED DESCRIPTION OF THE INVENTION
[0028] The invention provides a polynucleotide comprising the nucleic acid sequence of SEQ ID NO: 1 encoding a human retinitis pigmentosa GTPase regulator (RPGR) protein. The nucleic acid sequence has been codon optimized for enhanced stability during vector replication, and may be used, for example, for production of adeno-associated virus (AAV) vectors for gene therapy applications.
[0029] Nucleic acid sequences may be codon optimized to improve stability or heterologous expression in host cells without changing the encoded amino acid sequence. For example, codon optimization may be used to remove sequences that negatively impact gene expression, transcript stability, protein expression or protein stability, such as transcription splice sites, DNA instability motifs, polyadenylation sites, secondary structure, AU-rich RNA elements, secondary ORFs, codon tandem repeats, or long range repeats. Codon optimization may also be used to adjust the G/C content of a sequence of interest.
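Two of the sequence properties named above, G/C content and codon tandem repeats, can be measured directly. The following is a minimal sketch only: the function names and the short toy sequences are hypothetical illustrations, not part of the sequence listing, and it does not represent the optimization software actually used.

```python
from typing import List, Tuple


def gc_content(seq: str) -> float:
    """Return the fraction of G or C bases in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def longest_codon_tandem_repeat(seq: str) -> Tuple[str, int]:
    """Return (codon, run_length) for the longest in-frame run of one repeated codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons: List[str] = [seq[i:i + 3] for i in range(0, usable, 3)]
    best_codon, best_run = "", 0
    previous, run = None, 0
    for codon in codons:
        run = run + 1 if codon == previous else 1
        previous = codon
        if run > best_run:
            best_codon, best_run = codon, run
    return best_codon, best_run


# Toy repeat-rich fragment (hypothetical) and a synonymous recoding of it.
repetitive = "GAGGAGGAGGGAGAGGGA"  # GAG GAG GAG GGA GAG GGA
recoded = "GAAGAGGAAGGCGAAGGT"     # GAA GAG GAA GGC GAA GGT

print(longest_codon_tandem_repeat(repetitive), round(gc_content(repetitive), 3))
print(longest_codon_tandem_repeat(recoded), round(gc_content(recoded), 3))
```

In this toy case the recoded fragment no longer contains any codon repeated back to back, and its G/C fraction changes, which is the kind of effect the paragraph above describes.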
[0030] A codon consists of a set of three nucleotides and encodes a specific amino acid or results in the termination of translation (i.e. stop codons). The genetic code is redundant in that multiple codons specify the same amino acid, i.e., there are a total of 61 codons encoding 20 amino acids. Codon optimization replaces codons present in a DNA sequence with preferred codons encoding the same amino acid, for example, codons preferred for mammalian expression. Thus, the amino acid sequence is not altered during the process. Codon optimization can be performed using gene optimization software. The codon optimized nucleotide sequence is translated and aligned to the original protein sequence to ensure that no changes were made to the amino acid sequence. For example, the nucleotide sequence of SEQ ID NO: 1 encoding human RPGR is a codon optimized version of the wild type human RPGR nucleotide sequence (Genbank Accession No. NM_001034853, SEQ ID NO: 5). Both SEQ ID NO: 1 and SEQ ID NO: 5 encode the same RPGR protein (SEQ ID NO: 6).
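The verification step described above (translate the optimized sequence and confirm that the protein is unchanged) can be sketched as follows. This is an illustrative example under stated assumptions: it uses the standard genetic code, the helper names `translate` and `is_synonymous` are hypothetical, and the short test sequences are invented rather than taken from SEQ ID NO: 1 or SEQ ID NO: 5.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_synonymous(original: str, optimized: str) -> bool:
    """True if both coding sequences encode the same amino acid sequence."""
    return translate(original) == translate(optimized)


# 61 of the 64 codons encode amino acids; the other 3 are stop codons.
sense_codons = sum(aa != "*" for aa in CODON_TABLE.values())

original = "ATGGAGGAGGAGGGAGAGTAA"   # toy sequence, hypothetical
optimized = "ATGGAAGAGGAAGGCGAATAA"  # same protein, different codons
print(sense_codons, translate(original), is_synonymous(original, optimized))
```

A check of this kind catches any codon substitution that accidentally changes an amino acid or introduces a premature stop codon.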
[0031] Methods of codon optimization are known in the art and are described, for example, in U.S. Application Publication No. 2008/0194511 and U.S. Pat. No. 6,114,148.
[0032] The nucleic acid sequences of the present invention can be made as synthetic sequences. Techniques for constructing synthetic nucleic acid sequences are known in the art, and synthetic gene sequences may be purchased from several companies, including DNA 2.0 (Menlo Park, Calif.) and GenScript USA Inc. (Piscataway, N.J.). Alternatively, codon changes can be introduced by standard molecular biology techniques such as site-specific in vitro mutagenesis, PCR, or any other genetic engineering methods known in the art which are suitable for specifically changing a nucleic acid sequence. In vitro mutagenesis protocols are described, for example, in In Vitro Mutagenesis Protocols, Braman, ed., 2002, Humana Press, and in Sankaranarayanan, Protocols in Mutagenesis, 2001, Elsevier Science Ltd.
[0033] The human RPGR gene is located in chromosomal region Xp21.1 and spans 172 kb. Shu et al., 2012, Invest. Ophthalmol. Vis. Sci. 53(7): 3951-3958. There are multiple alternatively spliced transcripts, all of which encode an amino (N)-terminal RCC1-like (RCCL) domain. The RCCL domain is structurally similar to the RCC1 protein, a guanine nucleotide exchange factor for the small guanosine triphosphate-binding protein, Ran. The RPGR gene contains 19 exons (RPGRex1-19), encoding a predicted 90 kDa protein. Exons 2 to 11 encode the RCCL domain, whereas exons 12 to 19 encode a carboxyl (C)-terminal domain rich in acidic residues and ending in an isoprenylation anchorage signal. Mutations found in RPGRex1-19 account for 15% to 20% of XLRP patients, and subsequent studies revealed many more disease-causing mutations within one or more transcripts containing an alternatively spliced C-terminal exon called ORF15 (RPGRORF15). A high frequency of microdeletions, frameshift, and premature stop mutations are found within the ORF15.
[0034] In one embodiment, the RPGR cDNA used for codon optimization is the full-length human RPGRORF15 clone, variant C, Genbank Accession No. NM_001034853 (SEQ ID NO: 5). See Vervoort et al., 2000, Nat Genet 25: 462-466. This clone contains exons 1-ORF15 and was generated using three-way ligation by step-wise amplifying exons 1-part of 15b (nucleotides 169-1990) from human lymphocytes and 1991-3627 from human genomic DNA. See Beltran et al., 2012, PNAS 109(6): 2132-2137.
[0035] RPGR is widely expressed and shows a complex expression pattern. See Shu et al., cited above. RPGR transcripts are detected in different tissues, including brain, eye, kidney, lung, and testis in several different species. RPGR protein is detected in retina, trachea, brain, and testis. In human, mouse, and bovine retina, RPGR mainly localizes to photoreceptor connecting cilia, but expression has also been reported in outer segments in some species. RPGR is expressed in the transitional zone of motile cilia and within human and monkey cochlea.
[0036] The invention also provides an expression cassette comprising the nucleic acid sequence of SEQ ID NO: 1 and an expression control sequence operably linked and heterologous to the nucleic acid sequence. The term "expression control sequence" refers to any genetic element (e.g., polynucleotide sequence) that can exert a regulatory effect on the replication or expression (transcription or translation) of the nucleic acid sequence. Common expression control sequences include promoters, polyadenylation signals, transcription termination sequences, upstream regulatory domains, origins of replication, internal ribosome entry sites (IRES), and enhancers.
[0037] An expression control sequence is operably linked with a nucleic acid sequence when the expression control sequence is placed in a functional relationship with the second nucleic acid sequence. For example, a promoter is operably linked to a coding sequence if the promoter affects the expression of the coding sequence. The term operably linked encompasses, for example, an arrangement of an expression control sequence with the nucleic acid sequence to be expressed and optionally further expression control sequences, such as a terminator or enhancer, such that each of the expression control sequences can allow, modify, facilitate or otherwise influence expression of the nucleic acid sequence.
[0038] The term "heterologous" refers to nucleic acid or amino acid sequences that are obtained or derived from different source organisms or from different genes or proteins within the same source organism. For example, an expression control sequence that is not a native expression control sequence of the human RPGR gene is considered to be heterologous to the human RPGR gene. In certain embodiments, the expression control sequence is a promoter that is heterologous to the RPGR gene.
[0039] In a preferred embodiment, the expression control sequence is a human interphotoreceptor retinoid-binding protein (IRBP) promoter. IRBP is a large glycoprotein that is expressed only in the photoreceptor cells of the retina and to a much lesser extent in pinealocytes in the pineal gland in the brain. See Al-Ubaidi et al., 1992, J Cell Biology, 119(6) 1681-1687. The IRBP promoter region is well characterized. For example, Albini et al. (1990, Nucleic Acids Research 18(17): 5181-5187) describe a nucleotide sequence of the human IRBP promoter region (Genbank Accession No. X53044) containing 2818 bp of the 5' untranscribed region (SEQ ID NO: 2). Beltran et al. (cited above) demonstrated that a 235 bp fragment of the human IRBP promoter directed GFP expression in both rods and cones of normal canine retina in a dose- and time-dependent manner. A 1.3 kb fragment of the 5' untranslated region of the human IRBP gene (SEQ ID NO: 3) directed expression of a bacterial reporter gene (chloramphenicol acetyltransferase, CAT) specifically to photoreceptor cells in transgenic mice. See Al Ubaidi et al. 1992, J Cell Biology 119: 1681-1687. Nested deletion analysis of a 1783 bp fragment of the mouse IRBP 5' flanking region indicated that high promoter activity was maintained with a fragment consisting of 70 bp 5' to the transcription start site (SEQ ID NO: 4), but that elements upstream of this 70 bp fragment are required for complete tissue-specific regulation. See Boatright, et al., 1997, Molecular Vision 3: 15.
[0040] In a preferred embodiment, the human IRBP promoter comprises a nucleic acid sequence having at least 95%, 96%, 97%, 98% or 99% sequence identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 8. In a further preferred embodiment, the human IRBP promoter comprises SEQ ID NO: 8.
[0041] As used herein, the term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked, e.g., a plasmid. Preferred vectors are those capable of autonomous replication and/or expression of nucleic acids to which they are linked. An "rAAV vector" is a recombinant vector that includes nucleic acid sequences derived from adeno-associated virus (AAV). Recombinant AAV is produced in vitro by introduction of gene constructs into cells known as producer cells. Recombinant AAV has been studied extensively as a vehicle for gene therapy and for its potential applicability as a treatment for human diseases based on genetic defects. At the clinical level, the rAAV vector has been used in human clinical trials to deliver the cftr gene to cystic fibrosis patients and the Factor IX gene to hemophilia patients (Flotte, et al., 1998, Methods Enzymol 292:717-732; and Wagner et al., 1998, Lancet 351:1702-1703). Systems for production of rAAV employ three elements: 1) a gene cassette containing the gene of interest, 2) a gene cassette containing AAV rep and cap genes and 3) a source of "helper" virus proteins. Methods of producing rAAV are known in the art and are described, for example, in U.S. Pat. No. 7,091,029.
[0042] Production of rAAV vectors for gene therapy is carried out in vitro, using suitable producer cell lines. A preferred cell line is 293, but production of rAAV can be achieved using other cell lines, including but not limited to human or monkey cell lines such as Vero, WI 38 and HeLa, and rodent cells, such as BHK cells, e.g. BHK21.
[0043] In particular embodiments, the rAAV comprises the nucleic acid sequence of SEQ ID NO: 1 encoding the human RPGR protein. The rAAV may further comprise one or more expression control sequences operably linked to the nucleic acid sequence of SEQ ID NO: 1. In a preferred embodiment, the expression control sequence is a human IRBP promoter. In a further preferred embodiment, the human IRBP promoter comprises a nucleic acid sequence having at least 95%, 96%, 97%, 98% or 99% sequence identity to the nucleic acid sequence of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 8 and directs preferential expression in rods and cones. In a particularly preferred embodiment, the human IRBP promoter comprises SEQ ID NO: 8.
[0044] In certain embodiments, the rAAV further comprises an SV40 poly A tail, an SV40 splice donor/splice acceptor (SD/SA) sequence, and a Kozak sequence, each operably linked to the nucleic acid sequence of SEQ ID NO: 1. In a preferred embodiment, the rAAV comprises the nucleic acid sequence of SEQ ID NO: 7.
[0045] One strategy for delivering all of the required elements for rAAV production to the producer cell line involves transfecting the cells with plasmids containing gene cassettes encoding the necessary gene products, as well as infection of the cells with the helper virus Ad to provide the helper functions. This system employs plasmids with two different gene cassettes. The first is a proviral plasmid encoding the recombinant DNA to be packaged as rAAV. The second is a plasmid encoding the rep and cap genes. Other DNA viruses, such as Herpes simplex virus type 1 (HSV-1) can be used instead of Ad to provide helper virus gene products needed for rAAV production (Conway et al., 1999, Gene Ther. 6:973-985).
[0046] Another strategy for rAAV production is based on the use of two or more recombinant rHSV-1 viruses to simultaneously co-infect producer cells with all of the components necessary for producing rAAV. This strategy employs at least two different forms of rHSV, each containing a different gene cassette. In addition to supplying the necessary helper functions, each of these rHSV viruses is engineered to deliver different AAV (and other) genes to the producer cells upon infection. The two rHSV forms are referred to as the "rHSV/rc virus" and the "rHSV expression virus." The rHSV/rc virus contains a gene cassette in which the rep and cap genes from AAV are inserted into the HSV genome. The rep genes are responsible for replication and packaging of the rAAV genome in host cells infected with AAV. The cap genes encode proteins that comprise the capsid of the rAAV produced by the infected cells.
[0047] The second recombinant HSV is an "rHSV expression virus." A usual element of an rAAV production system is an expression cassette containing transgene DNA sequences encoding a gene(s) of interest, such as the RPGR gene, along with promoter elements necessary for expression of the gene. In particular embodiments, the rHSV comprises the nucleic acid sequence of SEQ ID NO: 1 encoding the human RPGR protein. Expression vectors engineered for rAAV production are generally constructed with the gene of interest inserted between two AAV-2 inverted terminal repeats (ITRs). The ITRs are responsible for the ability of native AAV to insert its DNA into the genome of host cells upon infection or otherwise persist in the infected cells. The expression cassette is incorporated into the rHSV expression virus described above. This second rHSV virus is used for simultaneous co-infection of the cells along with the rHSV-1/rc virus.
[0048] The terms "recombinant HSV," "rHSV," "rHSV vector," and "rHSV expression vector" refer to isolated, genetically modified forms of herpes simplex virus (HSV) containing heterologous genes incorporated into the viral genome. Methods for production of rHSV are known in the art and are described, for example, by Conway et al. (1999, Gene Ther. 6:973-985); Conway et al. (1997, J Virol 71: 8780-8789) and U.S. Pat. No. 7,037,723.
[0049] In particular embodiments, the rHSV comprises the nucleic acid sequence of SEQ ID NO: 1 encoding the human RPGR protein. The rHSV may further comprise one or more expression control sequences for regulating expression of the nucleic acid sequence of SEQ ID NO: 1, wherein the expression control sequence is operably linked to the nucleic acid sequence of SEQ ID NO: 1. In a preferred embodiment, the expression control sequence is a human IRBP promoter that is operably linked to the nucleic acid sequence of SEQ ID NO: 1. In a further preferred embodiment, the human IRBP promoter comprises a nucleic acid sequence having at least 95%, 96%, 97%, 98% or 99% sequence identity to the nucleic acid sequence of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 8 and directs preferential expression in rods and cones. In a particularly preferred embodiment, the human IRBP promoter comprises SEQ ID NO: 8.
[0050] In certain embodiments of the aforementioned methods, the rHSV further comprises an SV40 poly A tail, an SV40 splice donor/splice acceptor (SD/SA) sequence, and a Kozak sequence, each operably linked to the nucleic acid sequence of SEQ ID NO: 1. In a preferred embodiment, the rHSV comprises the nucleic acid sequence of SEQ ID NO: 7.
[0051] The invention also provides a method of producing an rAAV expression vector comprising a polynucleotide comprising the nucleic acid sequence of SEQ ID NO: 1 encoding a human RPGR protein. In one embodiment, the method comprises (a) infecting a host cell with a recombinant herpes simplex virus (rHSV) comprising the nucleic acid sequence of SEQ ID NO: 1; (b) incubating the host cell; and (c) following incubation, collecting rAAV from the host cell of step (b).
[0052] Methods of producing rAAV expression vectors by infecting a host cell with an rHSV are known in the art and are described for example in U.S. Pat. No. 7,091,029. For example, in one embodiment, the host cells are infected with rHSV by diluting the virus in growth medium such as DMEM and adding the virus to flasks containing the host cells. The host cells may be incubated with the virus for various intervals, for example, 22, 26, 30, 34, or 46 hours. Following the incubation interval, the virus-infected cells may be harvested by pelleting, followed by resuspension in DMEM. Cell-associated rAAV may be collected from the host cells by lysis of the cells using standard techniques involving three rounds of freezing and thawing (See Conway et al., 1999, cited above).
[0053] In particular embodiments, the host cell used for producing an rAAV expression vector in the aforementioned methods is a HeLa cell, a BHK21 cell or a Vero cell.
[0054] The rHSV used in the aforementioned method may further comprise one or more expression control sequences for regulating expression of the nucleic acid sequence of SEQ ID NO: 1 that is operably linked to the nucleic acid sequence of SEQ ID NO: 1. In a preferred embodiment, the expression control sequence is a human IRBP promoter that is operably linked to the nucleic acid sequence of SEQ ID NO: 1. In a further preferred embodiment, the human IRBP promoter comprises a nucleic acid sequence having at least 95%, 96%, 97%, 98% or 99% sequence identity to the nucleic acid sequence of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 8 and directs preferential expression in rods and cones. In a particularly preferred embodiment, the human IRBP promoter comprises SEQ ID NO: 8.
[0055] In certain embodiments of the aforementioned methods, the rHSV further comprises an SV40 poly A tail, an SV40 splice donor/splice acceptor (SD/SA) sequence, and a Kozak sequence, each operably linked to the nucleic acid sequence of SEQ ID NO: 1. In a preferred embodiment, the rHSV comprises the nucleic acid sequence of SEQ ID NO: 7.
TABLE-US-00001 Description of Sequences
SEQ ID NO:   Description
1            Codon modified RPGR cDNA
2            Human IRBP promoter, 2818 bp. Albini et al., 1990, Nuc Acid Res 18: 5181-5187. SEQ ID NO: 2 comprises SEQ ID NO: 3 and 4.
3            Human IRBP promoter, 1326 bp. Al Ubaidi et al. 1992, J Cell Biology 119: 1681-1687
4            Mouse IRBP core promoter region, 70 bp. Boatright, et al., 1997, Molecular Vision 3: 15.
5            Wildtype RPGR cDNA, Genbank Accession No. NM_001034853
6            Wildtype RPGR amino acid sequence
7            3871 bp synthesized sequence comprising SEQ ID NO: 1, an SV40 poly A tail, the SV40 SD/SA sequence, Kozak sequence, and restriction sites
8            Human IRBP promoter, 234 bp fragment used in the RPGRsyn expression cassette
[0056] The following examples serve to illustrate certain embodiments and aspects of the present invention and are not to be construed as limiting the scope of the invention.
EXAMPLES
Example 1
Codon Optimization of the RPGR Gene and Evaluation of Plasmid Stability
[0057] A wildtype RPGR cDNA in an AAV plasmid used for AAV manufacturing was found to contain several mutations and deletions in the region from nt 2461 to nt 3057. There were a total of 42 bp accumulated deletions or substitutions across this region. The plasmid clone was found to be stable during plasmid propagation in bacteria, and no sequence changes were found in the AAV vector.
[0058] A 3459 bp coding sequence of the RPGR gene, variant C (SEQ ID NO: 5), was codon-optimized at GenScript, Inc. for mammalian expression. Codon optimization was used to select codons of high frequency in mammals, to alter GC content to enhance stability, and to reduce the repetitive nature of the gene. The codon optimized version of the RPGR coding sequence (RPGRsyn; SEQ ID NO: 1) shares 72.1% sequence identity with the original gene (SEQ ID NO: 5). See FIG. 1. The codon optimized gene encodes the same polypeptide as the original gene, i.e., the polypeptide of SEQ ID NO: 6. The RPGRsyn gene was synthesized at GenScript along with an SV40 poly A tail, the SV40 SD/SA sequence, a Kozak sequence, and restriction sites for cloning purposes. The entire 3871 bp synthesized sequence is provided as SEQ ID NO: 7.
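The three properties discussed in this paragraph (an unchanged encoded polypeptide, nucleotide identity to the wildtype, and GC content) can be checked with a short script. The following is a minimal sketch, not the procedure used by the applicants. It assumes the two coding sequences have equal length and are compared position by position without alignment. The function names are illustrative, and the test strings are only the first 30 nucleotides of SEQ ID NO: 1 and SEQ ID NO: 5 as they appear in the sequence listing.

```python
# Standard genetic code, built from the conventional t/c/a/g codon ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence codon by codon (one-letter amino acid codes)."""
    cds = cds.lower()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def percent_identity(a, b):
    """Percent of positions that match between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length (no alignment is done)")
    matches = sum(x == y for x, y in zip(a.lower(), b.lower()))
    return 100.0 * matches / len(a)


def gc_fraction(seq):
    """Fraction of bases that are g or c."""
    seq = seq.lower()
    return (seq.count("g") + seq.count("c")) / len(seq)


# Illustrative inputs: first 30 nt of SEQ ID NO: 1 (RPGRsyn) and SEQ ID NO: 5 (wildtype).
RPGRSYN_30 = "atgagagagccagaggagctgatgccagat"
WILDTYPE_30 = "atgagggagccggaagagctgatgcccgat"

if __name__ == "__main__":
    print(translate(RPGRSYN_30), translate(WILDTYPE_30))
    print(round(percent_identity(RPGRSYN_30, WILDTYPE_30), 1))
    print(round(gc_fraction(RPGRSYN_30), 3), round(gc_fraction(WILDTYPE_30), 3))
```

Both 30-nucleotide fragments translate to the same ten residues, which are the first ten residues of SEQ ID NO: 6. The identity computed on such a short window is not comparable to the 72.1% reported for the full 3459 bp sequences; the same functions would have to be applied to the complete SEQ ID NO: 1 and SEQ ID NO: 5 to reproduce that figure.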
[0059] A map of the plasmid containing RPGRsyn (pUC57-RPGRsyn) is shown in FIG. 2. This plasmid propagated stably in bacteria in small scale plasmid production. It also remained stable in larger scale production after being retransformed into SURE2 cells, a bacterial strain used for cloning of the AAV plasmid. See FIG. 3. Clone N5 of the pUC57-RPGRsyn plasmid DNA produced in large scale production was confirmed to be identical to the original plasmid by DNA sequencing. The plasmid yield could range from very low to none at all if the seeding culture was stored at 4° C. overnight and used as the inoculant for large scale plasmid production. See FIG. 4. The RPGRsyn cDNA was then released from the pUC57-RPGRsyn plasmid and inserted into a pTR-containing plasmid to generate the AAV proviral plasmid pTR-IRBP-RPGRsyn (FIG. 5). pTR-IRBP-RPGRsyn contains the inverted terminal repeats (ITR) of AAV2 and the IRBP promoter. The plasmid from large scale production was confirmed to be 100% correct upon DNA sequencing (FIG. 6). To further confirm the stability of pTR-IRBP-RPGRsyn, bacteria transformed with pTR-IRBP-RPGRsyn or pTR-IRBP/GNAT2-hCNGB3co plasmids were grown in medium at 37° C. overnight. The next morning, plasmid DNA was purified from 1.5 mL of the overnight culture, and the remaining culture was left at room temperature until late afternoon and then used to inoculate 2 mL of culture medium (1:1000 dilution) for the 2nd round of propagation. The same procedure was followed for the 3rd and 4th rounds of propagation. Plasmid DNA purified from each round was then analyzed by restriction digestion with SmaI to confirm the integrity of the ITR sequence of the plasmid. As shown in FIG. 6, the yield of pTR-IRBP-RPGRsyn declined during the serial passages; however, the same pattern was observed for pTR-IRBP-CNGB3co, a plasmid that contains the stable hCNGB3 cDNA. Therefore, the decline in plasmid yield is related to the bacteria themselves or to other features such as the TR, but not to RPGRsyn. As also noted in FIG. 7, the 4.2 kb band containing RPGRsyn remained stable over the passages (it would become diffuse or smeared if unstable).
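The band pattern expected from the SmaI digestion can be predicted in silico from a plasmid sequence. The following is a minimal sketch under stated assumptions: the plasmid is circular, digestion is complete, and the enzyme cuts at every CCCGGG site between CCC and GGG (the known SmaI specificity). The function name and the toy plasmid are hypothetical and bear no relation to the actual pTR-IRBP-RPGRsyn sequence, which is not reproduced here.

```python
SMAI_SITE = "cccggg"   # SmaI recognition sequence
SMAI_CUT_OFFSET = 3    # blunt cut between ccc and ggg


def smai_fragments(plasmid):
    """Return fragment lengths (bp), largest first, for a circular plasmid cut with SmaI."""
    seq = plasmid.lower()
    n = len(seq)
    # Append the first few bases so that a site spanning the origin is also found.
    wrapped = seq + seq[:len(SMAI_SITE) - 1]
    cuts = sorted(
        (i + SMAI_CUT_OFFSET) % n
        for i in range(n)
        if wrapped[i:i + len(SMAI_SITE)] == SMAI_SITE
    )
    if not cuts:
        return [n]  # uncut: the plasmid stays a single circular molecule
    fragments = []
    for k, cut in enumerate(cuts):
        nxt = cuts[(k + 1) % len(cuts)]
        # Distance to the next cut going around the circle; a single cut
        # linearizes the plasmid and yields one full-length fragment.
        fragments.append((nxt - cut) % n or n)
    return sorted(fragments, reverse=True)


# Hypothetical 50 bp toy plasmid with two SmaI sites (illustration only).
TOY_PLASMID = "a" * 10 + "cccggg" + "t" * 20 + "cccggg" + "a" * 8

if __name__ == "__main__":
    print(smai_fragments(TOY_PLASMID))
```

Applied to a full plasmid sequence, a loss or rearrangement of a SmaI site would change the predicted fragment list, which is the reason such a digest can serve as a quick integrity check between passages.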
Example 2
Construction of AAV Plasmids and Evaluation in Bacteria
[0060] An AAV plasmid (pTR-IRBP-RPGRsyn) comprising an RPGRsyn expression cassette comprising the IRBP promoter (234 bp), the RPGRsyn cDNA (SEQ ID NO: 1), and an SV40 polyA signal sequence is constructed. This IRBP fragment is contained within the 235 bp fragment used by Beltran et al. in the canine model (see Beltran et al., 2012, PNAS 109(6): 2132-2137). After construction of pTR-IRBP-RPGRsyn, the plasmid is tested for stability in bacteria using the methods described in Example 1.
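Once a cassette has been assembled, a simple sanity check is to locate the coding sequence inside the construct and confirm that it starts at the intended ATG and ends at an in-frame stop codon. The following is a minimal sketch, assuming that the start codon directly follows a Kozak-like "gccacc" motif, as it does in SEQ ID NO: 7. The function name is hypothetical. The toy construct joins the first 30 nucleotides of SEQ ID NO: 1 to a stop codon, with flanking bases copied from around the start and stop codons of SEQ ID NO: 7; it is not a real cassette.

```python
STOP_CODONS = {"taa", "tag", "tga"}


def find_cds(construct, kozak="gccacc"):
    """Locate the first ORF whose ATG directly follows the Kozak-like motif.

    Returns (start, end) as 0-based, half-open coordinates that include the
    stop codon, or None if no motif or no in-frame stop codon is found.
    """
    seq = construct.lower()
    hit = seq.find(kozak + "atg")
    if hit == -1:
        return None
    start = hit + len(kozak)
    # Walk codon by codon from the ATG until the first in-frame stop.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return start, i + 3
    return None


# Toy construct (illustration only): upstream flank + truncated ORF + stop + downstream flank.
TOY_CONSTRUCT = (
    "gcggccgcgccacc"                    # flank ending in the Kozak-like motif
    "atgagagagccagaggagctgatgccagat"    # first 30 nt of SEQ ID NO: 1
    "taa"                               # stop codon
    "gcggccgcgc"                        # downstream flank
)

if __name__ == "__main__":
    start, end = find_cds(TOY_CONSTRUCT)
    print(start, end, end - start)
```

For the full-length cassette, the returned span would be expected to measure 3459 bp, the length given for SEQ ID NO: 1; a shorter span would indicate a premature in-frame stop codon.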
[0061] Once the stability of pTR-IRBP-RPGRsyn is confirmed, an HSV recombination plasmid comprising the IRBP-RPGRsyn expression cassette (pHSV106-IRBP-RPGRsyn) is constructed. pHSV106-IRBP-RPGRsyn is used for construction of the HSV-IRBP-RPGRsyn helper vector for large scale production of the AAV vector AAV-IRBP-RPGRsyn. The rHSV helper viruses are propagated in mammalian cells (V27, an ICP27-complementing Vero cell line). RPGRsyn cDNA is more stable in mammalian cells than in bacteria. This increased stability will eliminate the need for large-scale production of an AAV proviral plasmid containing the RPGRsyn cDNA, which is a reagent required for rAAV production by plasmid transfection methods.
SEQUENCE LISTING
Number of SEQ ID NOs: 12
SEQ ID NO: 1 (3459 bp, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic polynucleotide)
atgagagagc cagaggagct gatgccagat agcggagcag
tgtttacctt cggaaagtcc 60aagttcgcag agaataaccc aggaaagttc tggtttaaaa
acgacgtgcc cgtccacctg 120tcttgtggcg atgagcatag tgccgtggtc actgggaaca
ataagctgta tatgttcggg 180tccaacaatt ggggacagct ggggctggga tccaaatctg
ctatctctaa gccaacctgc 240gtgaaggcac tgaaacccga gaaggtcaaa ctggccgctt
gtggcagaaa ccacactctg 300gtgagcaccg agggcgggaa tgtctatgcc accggaggca
acaatgaggg acagctggga 360ctgggggaca ctgaggaaag gaataccttt cacgtgatct
ccttctttac atctgagcat 420aagatcaagc agctgagcgc cggctccaac acatctgcag
ccctgactga ggacgggcgc 480ctgttcatgt ggggagataa ttcagagggc cagattgggc
tgaaaaacgt gagcaacgtg 540tgcgtgcctc agcaggtgac catcggaaag ccagtcagtt
ggatttcatg tggctactat 600catagcgcct tcgtgaccac agatggcgag ctgtacgtct
ttggggagcc cgaaaacgga 660aaactgggcc tgcctaacca gctgctgggc aatcaccgga
caccccagct ggtgtccgag 720atccctgaaa aagtgatcca ggtcgcctgc gggggagagc
atacagtggt cctgactgag 780aatgccgtgt acaccttcgg actgggccag tttggccagc
tggggctggg aaccttcctg 840tttgagacat ccgaaccaaa agtgatcgag aacattcgcg
accagactat cagctacatt 900tcctgcggag agaatcacac cgcactgatc acagacattg
gcctgatgta tacctttggc 960gatgggcggc acgggaagct gggactgggc ctggagaact
tcactaatca cttcatcccc 1020accctgtgct ctaacttcct gcggttcatc gtgaaactgg
tcgcttgcgg cgggtgtcac 1080atggtggtct tcgctgcacc tcataggggc gtggctaagg
agatcgaatt tgacgagatt 1140aacgatacat gcctgagcgt ggcaactttc ctgccataca
gctccctgac ttctggcaat 1200gtgctgcaga gaaccctgag tgcaaggatg cggagaaggg
agagggaacg ctctcctgac 1260agtttctcaa tgcgacgaac cctgccacct atcgagggga
cactgggact gagtgcctgc 1320ttcctgccta actcagtgtt tccacgatgt agcgagcgga
atctgcagga gtctgtcctg 1380agtgagcagg atctgatgca gccagaggaa cccgactacc
tgctggatga gatgaccaag 1440gaggccgaaa tcgacaactc tagtacagtg gagtccctgg
gcgagactac cgatatcctg 1500aatatgacac acattatgtc actgaacagc aatgagaaga
gtctgaaact gtcaccagtg 1560cagaagcaga agaaacagca gactattggc gagctgactc
aggacaccgc cctgacagag 1620aacgacgata gcgatgagta tgaggaaatg tccgagatga
aggaaggcaa agcttgtaag 1680cagcatgtga gtcaggggat cttcatgaca cagccagcca
caactattga ggctttttca 1740gacgaggaag tggagatccc cgaggaaaaa gagggcgcag
aagattccaa ggggaatgga 1800attgaggaac aggaggtgga agccaacgag gaaaatgtga
aagtccacgg aggcaggaag 1860gagaaaacag aaatcctgtc tgacgatctg actgacaagg
ccgaggtgtc cgaaggcaag 1920gcaaaatctg tcggagaggc agaagacgga ccagagggac
gaggggatgg aacctgcgag 1980gaaggctcaa gcggggctga gcattggcag gacgaggaac
gagagaaggg cgaaaaggat 2040aaaggccgcg gggagatgga acgacctgga gagggcgaaa
aagagctggc agagaaggag 2100gaatggaaga aaagggacgg cgaggaacag gagcagaaag
aaagggagca gggccaccag 2160aaggagcgca accaggagat ggaagagggc ggcgaggaag
agcatggcga gggagaagag 2220gaagagggcg atagagaaga ggaagaggaa aaagaaggcg
aagggaagga ggaaggagag 2280ggcgaggaag tggaaggcga gagggaaaag gaggaaggag
aacggaagaa agaggaaaga 2340gccggcaaag aggaaaaggg cgaggaagag ggcgatcagg
gcgaaggcga ggaggaagag 2400accgagggcc gcggggaaga gaaagaggag ggaggagagg
tggagggcgg agaggtcgaa 2460gagggaaagg gcgagcgcga agaggaagag gaagagggcg
agggcgagga agaagagggc 2520gagggggaag aagaggaggg agagggcgaa gaggaagagg
gggagggaaa gggcgaagag 2580gaaggagagg aaggggaggg agaggaagag ggggaggagg
gcgaggggga aggcgaggag 2640gaagaaggag agggggaagg cgaagaggaa ggcgaggggg
aaggagagga ggaagaaggg 2700gaaggcgaag gcgaagagga gggagaagga gagggggagg
aagaggaagg agaagggaag 2760ggcgaggagg aaggcgaaga gggagagggg gaaggcgagg
aagaggaagg cgagggcgaa 2820ggagaggacg gcgagggcga gggagaagag gaggaagggg
aatgggaagg cgaagaagag 2880gaaggcgaag gcgaaggcga agaagagggc gaaggggagg
gcgaggaggg cgaaggcgaa 2940ggggaggaag aggaaggcga aggagaaggc gaggaagaag
agggagagga ggaaggcgag 3000gaggaaggag agggggagga ggagggagaa ggcgagggcg
aagaagaaga agagggagaa 3060gtggagggcg aagtcgaggg ggaggaggga gaaggggaag
gggaggaaga agagggcgaa 3120gaagaaggcg aggaaagaga aaaagaggga gaaggcgagg
aaaaccggag aaatagggaa 3180gaggaggaag aggaagaggg aaagtaccag gagacaggcg
aagaggaaaa cgagcggcag 3240gatggcgagg aatataagaa agtgagcaag atcaaaggat
ccgtcaagta cggcaagcac 3300aaaacctatc agaagaaaag cgtgaccaac acacagggga
atggaaaaga gcagcgaagt 3360aaaatgcctg tgcagtcaaa acggctgctg aagaatggcc
caagcgggtc taaaaaattc 3420tggaacaatg tcctgccaca ctatctggaa ctgaagtaa
3459
SEQ ID NO: 2 (2818 bp, DNA, Homo sapiens)
gctccttcct gtactgccca
gctccgcttg ctccctgacc atccctgcag cagccctgat 60gtgtcattgt ccccctctta
acctgcgctg cagtgctgca gggctgggct ctggagctgg 120gtctggtcat ttctccttag
atatgtagag gcccaggaaa ggtttggagc ctaagaagcc 180ctaggactcc aggtctccag
ggcagcccca gcctcttgga atgactttcc ctaataccac 240aggggtgttc taatcccagg
cagacccaag ctgcccctca ccaactccta cgtcctcaac 300ttcctttcat aacttctagg
atggaaacac ctaatcctcc agcaatactg aggcttttct 360ccttattctg ttttcccttt
tgaagaagcc aaggctcaga gcagtcgagt cacctaatca 420tggtctcatg tcgcctgatc
aaggtctcat gtcaccttat caagatctca cccactcacc 480tattcagttc tcaccagttc
agttcaggat ggcttctaag ctaccctgca cagctctgcc 540cacaggacat ttgtataagt
gagggggtgc aggccttcca gccccctcca actccaaaac 600tcagccccca agatcaagtg
gactctctga acccaccctg gccctacagt tgtcagggtc 660tggatgggaa gatgtagagc
tctcggcttt cactctgggg acttacccag aacatattct 720cctcatgagc taaggaggct
ggctgccatc ttcctacatc cccccacggc ctgggggcaa 780ggacaccctg gccccctgga
gtctggagaa ctctgaggac agaacttgct cttccacctg 840cttgggcctt acccacagga
gaagcactgc ttctctaccc atgccccatc caactcaggc 900accccaggga cttgcaacag
tctgattttt tctcacgtcc ttcttaaggc tctgggctag 960ccacacaaat caaatcccag
tgataggtcc agacaatcct atcctgaaac tacatcttag 1020taagactcca gggaatcctt
tccccaaaga cagtcttact cctgttctcc ccccaagcct 1080ttctgggcca gaagctttgc
ctggactcaa gcaatggcag acaagtgccc tctgaggaca 1140cggaagtgca tgctcagaac
tgtgattctc caagtggagg cagaggagaa ggcccaggct 1200tcccagcagg gctaaggata
tgcaaggagt gcattcatcc ggaggtgttg gcagcatccc 1260agccccaccc cattctcatc
gtaaatcagg ctcacttcca ttggctgcat acggtggagt 1320gatgtgacca tatgtcactt
gagcattaca caaatcctaa tgagctaaaa atatgtttgt 1380tttagctaat tgacctcttt
ggccttcata aagcagttgg taaacatcct cagataatga 1440tttccaaaga gcagattgtg
ggtctcagct gtgcagagaa agcccacgtc cctgagacca 1500ccttctccag ctgcctactg
aggcacacag gggcgcctgc ctgctgcccg ctcagccaag 1560gcggtgttgc tggagccagc
ttgggacagc tctcccaacg ctctgccctg gccttgcgac 1620cactctctgg gccgtagttg
tctgtctgtt aagtgaggaa agtgcccatc tccagaggca 1680ttcagcggca aagcagggct
tccaggttcc gaccccatag caggacttct tggatttcta 1740cagccagtca gttgcaagca
gcacccatat tatttctata agaagtggca ggagctggga 1800tctgaagagt tcagcagtct
acctttccct gtttcttgtg ctttatgcag tcaggaggaa 1860tgatctggat tccatgtgaa
gcctgggacc acggagaccc aagacttcct gcttgattct 1920ccctgcgaac tgcaggctgt
gggctgagcc ttcaagaagc aggagtcccc tctagccatt 1980aactctcaga gctaacctca
tttgaatggg aacactagtc ctgtgatgtc tggaaggtgg 2040gcgcctctac actccacacc
ctacatggtg gtccagacac atcattccca gcattagaaa 2100gctgtagggg gacccgttct
gttccctgga ggcattaaag ggacatagaa ataaatctca 2160agctctgagg ctgatgccag
cctcagactc agcctctgca ctgtatgggc caattgtagc 2220cccaaggact tcttcttgct
gcacccccta tctgtccaca cctaaaacga tgggcttcta 2280tttagttaca gaactctctg
gcctgttttg ttttgctttg ctttgttttg ttttgttttt 2340ttgttttttt gttttttagc
tatgaaacag aggtaatatc taatacagat aacttaccag 2400taatgagtgc ttcctactta
ctgggtactg ggaagaagtg ctttacacat attttctcat 2460ttaatctaca caataagtaa
ttaagacatt tccctgaggc cacgggagag acagtggcag 2520aacagttctc caaggaggac
ttgcaagtta ataactggac tttgcaaggc tctggtggaa 2580actgtcagct tgtaaaggat
ggagcacagt gtctggcatg tagcaggaac taaaataatg 2640gcagtgatta atgttatgat
atgcagacac aacacagcaa gataagatgc aatgtacctt 2700ctgggtcaaa ccaccctggc
cactcctccc cgatacccag ggttgatgtg cttgaattag 2760acaggattaa aggcttactg
gagctggaag ccttgcccca actcaggagt ttagcccc 2818
SEQ ID NO: 3 (1326 bp, DNA, Homo sapiens)
ctgcctactg aggcacacag gggcgcctgc ctgctgcccg ctcagccaag gcggtgttgc
60tggagccagc ttgggacagc tctcccaacg ctctgccctg gccttgcgac cactctctgg
120gccgtagttg tctgtctgtt aagtgaggaa agtgcccatc tccagaggca ttcagcggca
180aagcagggct tccaggttcc gaccccatag caggacttct tggatttcta cagccagtca
240gttgcaagca gcacccatat tatttctata agaagtggca ggagctggga tctgaagagt
300tcagcagtct acctttccct gtttcttgtg ctttatgcag tcaggaggaa tgatctggat
360tccatgtgaa gcctgggacc acggagaccc aagacttcct gcttgattct ccctgcgaac
420tgcaggctgt gggctgagcc ttcaagaagc aggagtcccc tctagccatt aactctcaga
480gctaacctca tttgaatggg aacactagtc ctgtgatgtc tggaaggtgg gcgcctctac
540actccacacc ctacatggtg gtccagacac atcattccca gcattagaaa gctgtagggg
600gacccgttct gttccctgga ggcattaaag ggacatagaa ataaatctca agctctgagg
660ctgatgccag cctcagactc agcctctgca ctgtatgggc caattgtagc cccaaggact
720tcttcttgct gcacccccta tctgtccaca cctaaaacga tgggcttcta tttagttaca
780gaactctctg gcctgttttg ttttgctttg ctttgttttg ttttgttttt ttgttttttt
840gttttttagc tatgaaacag aggtaatatc taatacagat aacttaccag taatgagtgc
900ttcctactta ctgggtactg ggaagaagtg ctttacacat attttctcat ttaatctaca
960caataagtaa ttaagacatt tccctgaggc cacgggagag acagtggcag aacagttctc
1020caaggaggac ttgcaagtta ataactggac tttgcaaggc tctggtggaa actgtcagct
1080tgtaaaggat ggagcacagt gtctggcatg tagcaggaac taaaataatg gcagtgatta
1140atgttatgat atgcagacac aacacagcaa gataagatgc aatgtacctt ctgggtcaaa
1200ccaccctggc cactcctccc cgatacccag ggttgatgtg cttgaattag acaggattaa
1260aggcttactg gagctggaag ccttgcccca actcaggagt ttagccccag accttctgtc
1320caccag
1326
SEQ ID NO: 4 (70 bp, DNA, Mus musculus)
gcttgaatta gacaggatta aaggcttact ggagctggaa
gccttgcccc aactcaggag 60tttagcccca
70
SEQ ID NO: 5 (3459 bp, DNA, Homo sapiens)
atgagggagc cggaagagct
gatgcccgat tcgggtgctg tgtttacatt tgggaaaagt 60aaatttgctg aaaataatcc
cggtaaattc tggtttaaaa atgatgtccc tgtacatctt 120tcatgtggag atgaacattc
tgctgttgtt accggaaata ataaacttta catgtttggc 180agtaacaact ggggtcagtt
aggattagga tcaaagtcag ccatcagcaa gccaacatgt 240gtcaaagctc taaaacctga
aaaagtgaaa ttagctgcct gtggaaggaa ccacaccctg 300gtgtcaacag aaggaggcaa
tgtatatgca actggtggaa ataatgaagg acagttgggg 360cttggtgaca ccgaagaaag
aaacactttt catgtaatta gcttttttac atccgagcat 420aagattaagc agctgtctgc
tggatctaat acttcagctg ccctaactga ggatggaaga 480ctttttatgt ggggtgacaa
ttccgaaggg caaattggtt taaaaaatgt aagtaatgtc 540tgtgtccctc agcaagtgac
cattgggaaa cctgtctcct ggatctcttg tggatattac 600cattcagctt ttgtaacaac
agatggtgag ctatatgtgt ttggagaacc tgagaatggg 660aagttaggtc ttcccaatca
gctcctgggc aatcacagaa caccccagct ggtgtctgaa 720attccggaga aggtgatcca
agtagcctgt ggtggagagc atactgtggt tctcacggag 780aatgctgtgt atacctttgg
gctgggacaa tttggtcagc tgggtcttgg cacttttctt 840tttgaaactt cagaacccaa
agtcattgag aatattaggg atcaaacaat aagttatatt 900tcttgtggag aaaatcacac
agctttgata acagatatcg gccttatgta tacttttgga 960gatggtcgcc acggaaaatt
aggacttgga ctggagaatt ttaccaatca cttcattcct 1020actttgtgct ctaatttttt
gaggtttata gttaaattgg ttgcttgtgg tggatgtcac 1080atggtagttt ttgctgctcc
tcatcgtggt gtggcaaaag aaattgaatt cgatgaaata 1140aatgatactt gcttatctgt
ggcgactttt ctgccgtata gcagtttaac ctcaggaaat 1200gtactgcaga ggactctatc
agcacgtatg cggcgaagag agagggagag gtctccagat 1260tctttttcaa tgaggagaac
actacctcca atagaaggga ctcttggcct ttctgcttgt 1320tttctcccca attcagtctt
tccacgatgt tctgagagaa acctccaaga gagtgtctta 1380tctgaacagg acctcatgca
gccagaggaa ccagattatt tgctagatga aatgaccaaa 1440gaagcagaga tagataattc
ttcaactgta gaaagccttg gagaaactac tgatatctta 1500aacatgacac acatcatgag
cctgaattcc aatgaaaagt cattaaaatt atcaccagtt 1560cagaaacaaa agaaacaaca
aacaattggg gaactgacgc aggatacagc tcttactgaa 1620aacgatgata gtgatgaata
tgaagaaatg tcagaaatga aagaagggaa agcatgtaaa 1680caacatgtgt cacaagggat
tttcatgacg cagccagcta cgactatcga agcattttca 1740gatgaggaag tagagatccc
agaggagaag gaaggagcag aggattcaaa aggaaatgga 1800atagaggagc aagaggtaga
agcaaatgag gaaaatgtga aggtgcatgg aggaagaaag 1860gagaaaacag agatcctatc
agatgacctt acagacaaag cagaggtgag tgaaggcaag 1920gcaaaatcag tgggagaagc
agaggatggg cctgaaggta gaggggatgg aacctgtgag 1980gaaggtagtt caggagcaga
acactggcaa gatgaggaga gggagaaggg ggagaaagac 2040aagggtagag gagaaatgga
gaggccagga gagggagaga aggaactagc agagaaggaa 2100gaatggaaga agagggatgg
ggaagagcag gagcaaaagg agagggagca gggccatcag 2160aaggaaagaa accaagagat
ggaggaggga ggggaggagg agcatggaga aggagaagaa 2220gaggagggag acagagaaga
ggaagaagag aaggagggag aagggaaaga ggaaggagaa 2280ggggaagaag tggagggaga
acgtgaaaag gaggaaggag agaggaaaaa ggaggaaaga 2340gcggggaagg aggagaaagg
agaggaagaa ggagaccaag gagaggggga agaggaggaa 2400acagagggga gaggggagga
aaaagaggag ggaggggaag tagagggagg ggaagtagag 2460gaggggaaag gagagaggga
agaggaagag gaggagggtg agggggaaga ggaggaaggg 2520gagggggaag aggaggaagg
ggagggggaa gaggaggaag gagaagggaa aggggaggaa 2580gaaggggaag aaggagaagg
ggaggaagaa ggggaggaag gagaagggga gggggaagag 2640gaggaaggag aaggggaggg
agaagaggaa ggagaagggg agggagaaga ggaggaagga 2700gaaggggagg gagaagagga
aggagaaggg gagggagaag aggaggaagg agaagggaaa 2760ggggaggagg aaggagagga
aggagaaggg gagggggaag aggaggaagg agaaggggaa 2820ggggaggatg gagaagggga
gggggaagag gaggaaggag aatgggaggg ggaagaggag 2880gaaggagaag gggaggggga
agaggaagga gaaggggaag gggaggaagg agaaggggag 2940ggggaagagg aggaaggaga
aggggagggg gaagaggagg aaggggaaga agaaggggag 3000gaagaaggag agggagagga
agaaggggag ggagaagggg aggaagaaga ggaaggggaa 3060gtggaagggg aggtggaagg
ggaggaagga gagggggaag gagaggaaga ggaaggagag 3120gaggaaggag aagaaaggga
aaaggagggg gaaggagaag aaaacaggag gaacagagaa 3180gaggaggagg aagaagaggg
gaagtatcag gagacaggcg aagaagagaa tgaaaggcag 3240gatggagagg agtacaaaaa
agtgagcaaa ataaaaggat ctgtgaaata tggcaaacat 3300aaaacatatc aaaaaaagtc
agttactaac acacagggaa atgggaaaga gcagaggtcc 3360aaaatgccag tccagtcaaa
acgactttta aaaaacgggc catcaggttc caaaaagttc 3420tggaataatg tattaccaca
ttacttggaa ttgaagtaa 3459
SEQ ID NO: 6 (1152 aa, PRT, Homo sapiens)
Met Arg Glu Pro Glu Glu Leu Met Pro Asp Ser Gly Ala Val Phe Thr1
5 10 15Phe Gly Lys Ser Lys Phe
Ala Glu Asn Asn Pro Gly Lys Phe Trp Phe 20 25
30Lys Asn Asp Val Pro Val His Leu Ser Cys Gly Asp Glu
His Ser Ala 35 40 45Val Val Thr
Gly Asn Asn Lys Leu Tyr Met Phe Gly Ser Asn Asn Trp 50
55 60Gly Gln Leu Gly Leu Gly Ser Lys Ser Ala Ile Ser
Lys Pro Thr Cys65 70 75
80Val Lys Ala Leu Lys Pro Glu Lys Val Lys Leu Ala Ala Cys Gly Arg
85 90 95Asn His Thr Leu Val Ser
Thr Glu Gly Gly Asn Val Tyr Ala Thr Gly 100
105 110Gly Asn Asn Glu Gly Gln Leu Gly Leu Gly Asp Thr
Glu Glu Arg Asn 115 120 125Thr Phe
His Val Ile Ser Phe Phe Thr Ser Glu His Lys Ile Lys Gln 130
135 140Leu Ser Ala Gly Ser Asn Thr Ser Ala Ala Leu
Thr Glu Asp Gly Arg145 150 155
160Leu Phe Met Trp Gly Asp Asn Ser Glu Gly Gln Ile Gly Leu Lys Asn
165 170 175Val Ser Asn Val
Cys Val Pro Gln Gln Val Thr Ile Gly Lys Pro Val 180
185 190Ser Trp Ile Ser Cys Gly Tyr Tyr His Ser Ala
Phe Val Thr Thr Asp 195 200 205Gly
Glu Leu Tyr Val Phe Gly Glu Pro Glu Asn Gly Lys Leu Gly Leu 210
215 220Pro Asn Gln Leu Leu Gly Asn His Arg Thr
Pro Gln Leu Val Ser Glu225 230 235
240Ile Pro Glu Lys Val Ile Gln Val Ala Cys Gly Gly Glu His Thr
Val 245 250 255Val Leu Thr
Glu Asn Ala Val Tyr Thr Phe Gly Leu Gly Gln Phe Gly 260
265 270Gln Leu Gly Leu Gly Thr Phe Leu Phe Glu
Thr Ser Glu Pro Lys Val 275 280
285Ile Glu Asn Ile Arg Asp Gln Thr Ile Ser Tyr Ile Ser Cys Gly Glu 290
295 300Asn His Thr Ala Leu Ile Thr Asp
Ile Gly Leu Met Tyr Thr Phe Gly305 310
315 320Asp Gly Arg His Gly Lys Leu Gly Leu Gly Leu Glu
Asn Phe Thr Asn 325 330
335His Phe Ile Pro Thr Leu Cys Ser Asn Phe Leu Arg Phe Ile Val Lys
340 345 350Leu Val Ala Cys Gly Gly
Cys His Met Val Val Phe Ala Ala Pro His 355 360
365Arg Gly Val Ala Lys Glu Ile Glu Phe Asp Glu Ile Asn Asp
Thr Cys 370 375 380Leu Ser Val Ala Thr
Phe Leu Pro Tyr Ser Ser Leu Thr Ser Gly Asn385 390
395 400Val Leu Gln Arg Thr Leu Ser Ala Arg Met
Arg Arg Arg Glu Arg Glu 405 410
415Arg Ser Pro Asp Ser Phe Ser Met Arg Arg Thr Leu Pro Pro Ile Glu
420 425 430Gly Thr Leu Gly Leu
Ser Ala Cys Phe Leu Pro Asn Ser Val Phe Pro 435
440 445Arg Cys Ser Glu Arg Asn Leu Gln Glu Ser Val Leu
Ser Glu Gln Asp 450 455 460Leu Met Gln
Pro Glu Glu Pro Asp Tyr Leu Leu Asp Glu Met Thr Lys465
470 475 480Glu Ala Glu Ile Asp Asn Ser
Ser Thr Val Glu Ser Leu Gly Glu Thr 485
490 495Thr Asp Ile Leu Asn Met Thr His Ile Met Ser Leu
Asn Ser Asn Glu 500 505 510Lys
Ser Leu Lys Leu Ser Pro Val Gln Lys Gln Lys Lys Gln Gln Thr 515
520 525Ile Gly Glu Leu Thr Gln Asp Thr Ala
Leu Thr Glu Asn Asp Asp Ser 530 535
540Asp Glu Tyr Glu Glu Met Ser Glu Met Lys Glu Gly Lys Ala Cys Lys545
550 555 560Gln His Val Ser
Gln Gly Ile Phe Met Thr Gln Pro Ala Thr Thr Ile 565
570 575Glu Ala Phe Ser Asp Glu Glu Val Glu Ile
Pro Glu Glu Lys Glu Gly 580 585
590Ala Glu Asp Ser Lys Gly Asn Gly Ile Glu Glu Gln Glu Val Glu Ala
595 600 605Asn Glu Glu Asn Val Lys Val
His Gly Gly Arg Lys Glu Lys Thr Glu 610 615
620Ile Leu Ser Asp Asp Leu Thr Asp Lys Ala Glu Val Ser Glu Gly
Lys625 630 635 640Ala Lys
Ser Val Gly Glu Ala Glu Asp Gly Pro Glu Gly Arg Gly Asp
645 650 655Gly Thr Cys Glu Glu Gly Ser
Ser Gly Ala Glu His Trp Gln Asp Glu 660 665
670Glu Arg Glu Lys Gly Glu Lys Asp Lys Gly Arg Gly Glu Met
Glu Arg 675 680 685Pro Gly Glu Gly
Glu Lys Glu Leu Ala Glu Lys Glu Glu Trp Lys Lys 690
695 700Arg Asp Gly Glu Glu Gln Glu Gln Lys Glu Arg Glu
Gln Gly His Gln705 710 715
720Lys Glu Arg Asn Gln Glu Met Glu Glu Gly Gly Glu Glu Glu His Gly
725 730 735Glu Gly Glu Glu Glu
Glu Gly Asp Arg Glu Glu Glu Glu Glu Lys Glu 740
745 750Gly Glu Gly Lys Glu Glu Gly Glu Gly Glu Glu Val
Glu Gly Glu Arg 755 760 765Glu Lys
Glu Glu Gly Glu Arg Lys Lys Glu Glu Arg Ala Gly Lys Glu 770
775 780Glu Lys Gly Glu Glu Glu Gly Asp Gln Gly Glu
Gly Glu Glu Glu Glu785 790 795
800Thr Glu Gly Arg Gly Glu Glu Lys Glu Glu Gly Gly Glu Val Glu Gly
805 810 815Gly Glu Val Glu
Glu Gly Lys Gly Glu Arg Glu Glu Glu Glu Glu Glu 820
825 830Gly Glu Gly Glu Glu Glu Glu Gly Glu Gly Glu
Glu Glu Glu Gly Glu 835 840 845Gly
Glu Glu Glu Glu Gly Glu Gly Lys Gly Glu Glu Glu Gly Glu Glu 850
855 860Gly Glu Gly Glu Glu Glu Gly Glu Glu Gly
Glu Gly Glu Gly Glu Glu865 870 875
880Glu Glu Gly Glu Gly Glu Gly Glu Glu Glu Gly Glu Gly Glu Gly
Glu 885 890 895Glu Glu Glu
Gly Glu Gly Glu Gly Glu Glu Glu Gly Glu Gly Glu Gly 900
905 910Glu Glu Glu Glu Gly Glu Gly Lys Gly Glu
Glu Glu Gly Glu Glu Gly 915 920
925Glu Gly Glu Gly Glu Glu Glu Glu Gly Glu Gly Glu Gly Glu Asp Gly 930
935 940Glu Gly Glu Gly Glu Glu Glu Glu
Gly Glu Trp Glu Gly Glu Glu Glu945 950
955 960Glu Gly Glu Gly Glu Gly Glu Glu Glu Gly Glu Gly
Glu Gly Glu Glu 965 970
975Gly Glu Gly Glu Gly Glu Glu Glu Glu Gly Glu Gly Glu Gly Glu Glu
980 985 990Glu Glu Gly Glu Glu Glu
Gly Glu Glu Glu Gly Glu Gly Glu Glu Glu 995 1000
1005Gly Glu Gly Glu Gly Glu Glu Glu Glu Glu Gly Glu
Val Glu Gly 1010 1015 1020Glu Val Glu
Gly Glu Glu Gly Glu Gly Glu Gly Glu Glu Glu Glu 1025
1030 1035Gly Glu Glu Glu Gly Glu Glu Arg Glu Lys Glu
Gly Glu Gly Glu 1040 1045 1050Glu Asn
Arg Arg Asn Arg Glu Glu Glu Glu Glu Glu Glu Gly Lys 1055
1060 1065Tyr Gln Glu Thr Gly Glu Glu Glu Asn Glu
Arg Gln Asp Gly Glu 1070 1075 1080Glu
Tyr Lys Lys Val Ser Lys Ile Lys Gly Ser Val Lys Tyr Gly 1085
1090 1095Lys His Lys Thr Tyr Gln Lys Lys Ser
Val Thr Asn Thr Gln Gly 1100 1105
1110Asn Gly Lys Glu Gln Arg Ser Lys Met Pro Val Gln Ser Lys Arg
1115 1120 1125Leu Leu Lys Asn Gly Pro
Ser Gly Ser Lys Lys Phe Trp Asn Asn 1130 1135
1140Val Leu Pro His Tyr Leu Glu Leu Lys 1145
1150
SEQ ID NO: 7 (3871 bp, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic polynucleotide)
tctagactcg aggaactgaa aaaccagaaa
gttaactggt aagtttagtc tttttgtctt 60ttatttcagg tcccggatcc ggtggtggtg
caaatcaaag aactgctcct cagtggatgt 120tgcctttact tctaggcctg tacggaagtg
ttacttctgc tctaaaagct gcggaattgt 180acccgcggcc gcgccaccat gagagagcca
gaggagctga tgccagatag cggagcagtg 240tttaccttcg gaaagtccaa gttcgcagag
aataacccag gaaagttctg gtttaaaaac 300gacgtgcccg tccacctgtc ttgtggcgat
gagcatagtg ccgtggtcac tgggaacaat 360aagctgtata tgttcgggtc caacaattgg
ggacagctgg ggctgggatc caaatctgct 420atctctaagc caacctgcgt gaaggcactg
aaacccgaga aggtcaaact ggccgcttgt 480ggcagaaacc acactctggt gagcaccgag
ggcgggaatg tctatgccac cggaggcaac 540aatgagggac agctgggact gggggacact
gaggaaagga atacctttca cgtgatctcc 600ttctttacat ctgagcataa gatcaagcag
ctgagcgccg gctccaacac atctgcagcc 660ctgactgagg acgggcgcct gttcatgtgg
ggagataatt cagagggcca gattgggctg 720aaaaacgtga gcaacgtgtg cgtgcctcag
caggtgacca tcggaaagcc agtcagttgg 780atttcatgtg gctactatca tagcgccttc
gtgaccacag atggcgagct gtacgtcttt 840ggggagcccg aaaacggaaa actgggcctg
cctaaccagc tgctgggcaa tcaccggaca 900ccccagctgg tgtccgagat ccctgaaaaa
gtgatccagg tcgcctgcgg gggagagcat 960acagtggtcc tgactgagaa tgccgtgtac
accttcggac tgggccagtt tggccagctg 1020gggctgggaa ccttcctgtt tgagacatcc
gaaccaaaag tgatcgagaa cattcgcgac 1080cagactatca gctacatttc ctgcggagag
aatcacaccg cactgatcac agacattggc 1140ctgatgtata cctttggcga tgggcggcac
gggaagctgg gactgggcct ggagaacttc 1200actaatcact tcatccccac cctgtgctct
aacttcctgc ggttcatcgt gaaactggtc 1260gcttgcggcg ggtgtcacat ggtggtcttc
gctgcacctc ataggggcgt ggctaaggag 1320atcgaatttg acgagattaa cgatacatgc
ctgagcgtgg caactttcct gccatacagc 1380tccctgactt ctggcaatgt gctgcagaga
accctgagtg caaggatgcg gagaagggag 1440agggaacgct ctcctgacag tttctcaatg
cgacgaaccc tgccacctat cgaggggaca 1500ctgggactga gtgcctgctt cctgcctaac
tcagtgtttc cacgatgtag cgagcggaat 1560ctgcaggagt ctgtcctgag tgagcaggat
ctgatgcagc cagaggaacc cgactacctg 1620ctggatgaga tgaccaagga ggccgaaatc
gacaactcta gtacagtgga gtccctgggc 1680gagactaccg atatcctgaa tatgacacac
attatgtcac tgaacagcaa tgagaagagt 1740ctgaaactgt caccagtgca gaagcagaag
aaacagcaga ctattggcga gctgactcag 1800gacaccgccc tgacagagaa cgacgatagc
gatgagtatg aggaaatgtc cgagatgaag 1860gaaggcaaag cttgtaagca gcatgtgagt
caggggatct tcatgacaca gccagccaca 1920actattgagg ctttttcaga cgaggaagtg
gagatccccg aggaaaaaga gggcgcagaa 1980gattccaagg ggaatggaat tgaggaacag
gaggtggaag ccaacgagga aaatgtgaaa 2040gtccacggag gcaggaagga gaaaacagaa
atcctgtctg acgatctgac tgacaaggcc 2100gaggtgtccg aaggcaaggc aaaatctgtc
ggagaggcag aagacggacc agagggacga 2160ggggatggaa cctgcgagga aggctcaagc
ggggctgagc attggcagga cgaggaacga 2220gagaagggcg aaaaggataa aggccgcggg
gagatggaac gacctggaga gggcgaaaaa 2280gagctggcag agaaggagga atggaagaaa
agggacggcg aggaacagga gcagaaagaa 2340agggagcagg gccaccagaa ggagcgcaac
caggagatgg aagagggcgg cgaggaagag 2400catggcgagg gagaagagga agagggcgat
agagaagagg aagaggaaaa agaaggcgaa 2460gggaaggagg aaggagaggg cgaggaagtg
gaaggcgaga gggaaaagga ggaaggagaa 2520cggaagaaag aggaaagagc cggcaaagag
gaaaagggcg aggaagaggg cgatcagggc 2580gaaggcgagg aggaagagac cgagggccgc
ggggaagaga aagaggaggg aggagaggtg 2640gagggcggag aggtcgaaga gggaaagggc
gagcgcgaag aggaagagga agagggcgag 2700ggcgaggaag aagagggcga gggggaagaa
gaggagggag agggcgaaga ggaagagggg 2760gagggaaagg gcgaagagga aggagaggaa
ggggagggag aggaagaggg ggaggagggc 2820gagggggaag gcgaggagga agaaggagag
ggggaaggcg aagaggaagg cgagggggaa 2880ggagaggagg aagaagggga aggcgaaggc
gaagaggagg gagaaggaga gggggaggaa 2940gaggaaggag aagggaaggg cgaggaggaa
ggcgaagagg gagaggggga aggcgaggaa 3000gaggaaggcg agggcgaagg agaggacggc
gagggcgagg gagaagagga ggaaggggaa 3060tgggaaggcg aagaagagga aggcgaaggc
gaaggcgaag aagagggcga aggggagggc 3120gaggagggcg aaggcgaagg ggaggaagag
gaaggcgaag gagaaggcga ggaagaagag 3180ggagaggagg aaggcgagga ggaaggagag
ggggaggagg agggagaagg cgagggcgaa 3240gaagaagaag agggagaagt ggagggcgaa
gtcgaggggg aggagggaga aggggaaggg 3300gaggaagaag agggcgaaga agaaggcgag
gaaagagaaa aagagggaga aggcgaggaa 3360aaccggagaa atagggaaga ggaggaagag
gaagagggaa agtaccagga gacaggcgaa 3420gaggaaaacg agcggcagga tggcgaggaa
tataagaaag tgagcaagat caaaggatcc 3480gtcaagtacg gcaagcacaa aacctatcag
aagaaaagcg tgaccaacac acaggggaat 3540ggaaaagagc agcgaagtaa aatgcctgtg
cagtcaaaac ggctgctgaa gaatggccca 3600agcgggtcta aaaaattctg gaacaatgtc
ctgccacact atctggaact gaagtaagcg 3660gccgcgcgga tccagacatg ataagataca
ttgatgagtt tggacaaacc acaactagaa 3720tgcagtgaaa aaaatgcttt atttgtgaaa
tttgtgatgc tattgcttta tttgtaacca 3780ttataagctg caataaacaa gttaacaaca
acaattgcat tcattttatg tttcaggttc 3840agggggaggt gtgggaggtt ttttagcatg c
3871
SEQ ID NO: 8 (235 bp, DNA, Homo sapiens)
agcacagtgt
ctggcatgta gcaggaacta aaataatggc agtgattaat gttatgatat 60gcagacacaa
cacagcaaga taagatgcaa tgtaccttct gggtcaaacc accctggcca 120ctcctccccg
atacccaggg ttgatgtgct tgaattagac aggattaaag gcttactgga 180gctggaagcc
ttgccccaac tcaggagttt agccccagac cttctgtcca ccagc 235
SEQ ID NO: 9 (9 aa, PRT, Homo sapiens)
Glu Glu Glu Gly Glu Gly Glu Gly Glu
1               5
SEQ ID NO: 10 (6 aa, PRT, Mus musculus)
Glu Glu Gly Glu Gly Glu
1               5
SEQ ID NO: 11 (4107 bp, DNA, Artificial Sequence; Description of Artificial Sequence: Synthetic polynucleotide)
agatctgaat tcagcacagt gtctggcatg tagcaggaac taaaataatg gcagtgatta
60atgttatgat atgcagacac aacacagcaa gataagatgc aatgtacctt ctgggtcaaa
120ccaccctggc cactcctccc cgatacccag ggttgatgtg cttgaattag acaggattaa
180aggcttactg gagctggaag ccttgcccca actcaggagt ttagccccag accttctgtc
240caccagctct agactcgagg aactgaaaaa ccagaaagtt aactggtaag tttagtcttt
300ttgtctttta tttcaggtcc cggatccggt ggtggtgcaa atcaaagaac tgctcctcag
360tggatgttgc ctttacttct aggcctgtac ggaagtgtta cttctgctct aaaagctgcg
420gaattgtacc cgcggccgcg ccaccatgag agagccagag gagctgatgc cagatagcgg
480agcagtgttt accttcggaa agtccaagtt cgcagagaat aacccaggaa agttctggtt
540taaaaacgac gtgcccgtcc acctgtcttg tggcgatgag catagtgccg tggtcactgg
600gaacaataag ctgtatatgt tcgggtccaa caattgggga cagctggggc tgggatccaa
660atctgctatc tctaagccaa cctgcgtgaa ggcactgaaa cccgagaagg tcaaactggc
720cgcttgtggc agaaaccaca ctctggtgag caccgagggc gggaatgtct atgccaccgg
780aggcaacaat gagggacagc tgggactggg ggacactgag gaaaggaata cctttcacgt
840gatctccttc tttacatctg agcataagat caagcagctg agcgccggct ccaacacatc
900tgcagccctg actgaggacg ggcgcctgtt catgtgggga gataattcag agggccagat
960tgggctgaaa aacgtgagca acgtgtgcgt gcctcagcag gtgaccatcg gaaagccagt
1020cagttggatt tcatgtggct actatcatag cgccttcgtg accacagatg gcgagctgta
1080cgtctttggg gagcccgaaa acggaaaact gggcctgcct aaccagctgc tgggcaatca
1140ccggacaccc cagctggtgt ccgagatccc tgaaaaagtg atccaggtcg cctgcggggg
1200agagcataca gtggtcctga ctgagaatgc cgtgtacacc ttcggactgg gccagtttgg
1260ccagctgggg ctgggaacct tcctgtttga gacatccgaa ccaaaagtga tcgagaacat
1320tcgcgaccag actatcagct acatttcctg cggagagaat cacaccgcac tgatcacaga
1380cattggcctg atgtatacct ttggcgatgg gcggcacggg aagctgggac tgggcctgga
1440gaacttcact aatcacttca tccccaccct gtgctctaac ttcctgcggt tcatcgtgaa
1500actggtcgct tgcggcgggt gtcacatggt ggtcttcgct gcacctcata ggggcgtggc
1560taaggagatc gaatttgacg agattaacga tacatgcctg agcgtggcaa ctttcctgcc
1620atacagctcc ctgacttctg gcaatgtgct gcagagaacc ctgagtgcaa ggatgcggag
1680aagggagagg gaacgctctc ctgacagttt ctcaatgcga cgaaccctgc cacctatcga
1740ggggacactg ggactgagtg cctgcttcct gcctaactca gtgtttccac gatgtagcga
1800gcggaatctg caggagtctg tcctgagtga gcaggatctg atgcagccag aggaacccga
1860ctacctgctg gatgagatga ccaaggaggc cgaaatcgac aactctagta cagtggagtc
1920cctgggcgag actaccgata tcctgaatat gacacacatt atgtcactga acagcaatga
1980gaagagtctg aaactgtcac cagtgcagaa gcagaagaaa cagcagacta ttggcgagct
2040gactcaggac accgccctga cagagaacga cgatagcgat gagtatgagg aaatgtccga
2100gatgaaggaa ggcaaagctt gtaagcagca tgtgagtcag gggatcttca tgacacagcc
2160agccacaact attgaggctt tttcagacga ggaagtggag atccccgagg aaaaagaggg
2220cgcagaagat tccaagggga atggaattga ggaacaggag gtggaagcca acgaggaaaa
2280tgtgaaagtc cacggaggca ggaaggagaa aacagaaatc ctgtctgacg atctgactga
2340caaggccgag gtgtccgaag gcaaggcaaa atctgtcgga gaggcagaag acggaccaga
2400gggacgaggg gatggaacct gcgaggaagg ctcaagcggg gctgagcatt ggcaggacga
2460ggaacgagag aagggcgaaa aggataaagg ccgcggggag atggaacgac ctggagaggg
2520cgaaaaagag ctggcagaga aggaggaatg gaagaaaagg gacggcgagg aacaggagca
2580gaaagaaagg gagcagggcc accagaagga gcgcaaccag gagatggaag agggcggcga
2640ggaagagcat ggcgagggag aagaggaaga gggcgataga gaagaggaag aggaaaaaga
2700aggcgaaggg aaggaggaag gagagggcga ggaagtggaa ggcgagaggg aaaaggagga
2760aggagaacgg aagaaagagg aaagagccgg caaagaggaa aagggcgagg aagagggcga
2820tcagggcgaa ggcgaggagg aagagaccga gggccgcggg gaagagaaag aggagggagg
2880agaggtggag ggcggagagg tcgaagaggg aaagggcgag cgcgaagagg aagaggaaga
2940gggcgagggc gaggaagaag agggcgaggg ggaagaagag gagggagagg gcgaagagga
3000agagggggag ggaaagggcg aagaggaagg agaggaaggg gagggagagg aagaggggga
3060ggagggcgag ggggaaggcg aggaggaaga aggagagggg gaaggcgaag aggaaggcga
3120gggggaagga gaggaggaag aaggggaagg cgaaggcgaa gaggagggag aaggagaggg
3180ggaggaagag gaaggagaag ggaagggcga ggaggaaggc gaagagggag agggggaagg
3240cgaggaagag gaaggcgagg gcgaaggaga ggacggcgag ggcgagggag aagaggagga
3300aggggaatgg gaaggcgaag aagaggaagg cgaaggcgaa ggcgaagaag agggcgaagg
3360ggagggcgag gagggcgaag gcgaagggga ggaagaggaa ggcgaaggag aaggcgagga
3420agaagaggga gaggaggaag gcgaggagga aggagagggg gaggaggagg gagaaggcga
3480gggcgaagaa gaagaagagg gagaagtgga gggcgaagtc gagggggagg agggagaagg
3540ggaaggggag gaagaagagg gcgaagaaga aggcgaggaa agagaaaaag agggagaagg
3600cgaggaaaac cggagaaata gggaagagga ggaagaggaa gagggaaagt accaggagac
3660aggcgaagag gaaaacgagc ggcaggatgg cgaggaatat aagaaagtga gcaagatcaa
3720aggatccgtc aagtacggca agcacaaaac ctatcagaag aaaagcgtga ccaacacaca
3780ggggaatgga aaagagcagc gaagtaaaat gcctgtgcag tcaaaacggc tgctgaagaa
3840tggcccaagc gggtctaaaa aattctggaa caatgtcctg ccacactatc tggaactgaa
3900gtaagcggcc gcgcggatcc agacatgata agatacattg atgagtttgg acaaaccaca
3960actagaatgc agtgaaaaaa atgctttatt tgtgaaattt gtgatgctat tgctttattt
4020gtaaccatta taagctgcaa taaacaagtt aacaacaaca attgcattca ttttatgttt
4080caggttcagg gggaggtgtg ggaggtt
4107
SEQ ID NO: 12
Length: 4142
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 12:
gcagagaggg agtggccaac ctcctagatc
tgaattcagc acagtgtctg gcatgtagca 60ggaactaaaa taatggcagt gattaatgtt
atgatatgca gacacaacac agcaagataa 120gatgcaatgt accttctggg tcaaaccacc
ctggccactc ctccccgata cccagggttg 180atgtgcttga attagacagg attaaaggct
tactggagct ggaagccttg ccccaactca 240ggagtttagc cccagacctt ctgtccacca
gctctagact cgaggaactg aaaaaccaga 300aagttaactg gtaagtttag tctttttgtc
ttttatttca ggtcccggat ccggtggtgg 360tgcaaatcaa agaactgctc ctcagtggat
gttgccttta cttctaggcc tgtacggaag 420tgttacttct gctctaaaag ctgcggaatt
gtacccgcgg ccgcgccacc atgagagagc 480cagaggagct gatgccagat agcggagcag
tgtttacctt cggaaagtcc aagttcgcag 540agaataaccc aggaaagttc tggtttaaaa
acgacgtgcc cgtccacctg tcttgtggcg 600atgagcatag tgccgtggtc actgggaaca
ataagctgta tatgttcggg tccaacaatt 660ggggacagct ggggctggga tccaaatctg
ctatctctaa gccaacctgc gtgaaggcac 720tgaaacccga gaaggtcaaa ctggccgctt
gtggcagaaa ccacactctg gtgagcaccg 780agggcgggaa tgtctatgcc accggaggca
acaatgaggg acagctggga ctgggggaca 840ctgaggaaag gaataccttt cacgtgatct
ccttctttac atctgagcat aagatcaagc 900agctgagcgc cggctccaac acatctgcag
ccctgactga ggacgggcgc ctgttcatgt 960ggggagataa ttcagagggc cagattgggc
tgaaaaacgt gagcaacgtg tgcgtgcctc 1020agcaggtgac catcggaaag ccagtcagtt
ggatttcatg tggctactat catagcgcct 1080tcgtgaccac agatggcgag ctgtacgtct
ttggggagcc cgaaaacgga aaactgggcc 1140tgcctaacca gctgctgggc aatcaccgga
caccccagct ggtgtccgag atccctgaaa 1200aagtgatcca ggtcgcctgc gggggagagc
atacagtggt cctgactgag aatgccgtgt 1260acaccttcgg actgggccag tttggccagc
tggggctggg aaccttcctg tttgagacat 1320ccgaaccaaa agtgatcgag aacattcgcg
accagactat cagctacatt tcctgcggag 1380agaatcacac cgcactgatc acagacattg
gcctgatgta tacctttggc gatgggcggc 1440acgggaagct gggactgggc ctggagaact
tcactaatca cttcatcccc accctgtgct 1500ctaacttcct gcggttcatc gtgaaactgg
tcgcttgcgg cgggtgtcac atggtggtct 1560tcgctgcacc tcataggggc gtggctaagg
agatcgaatt tgacgagatt aacgatacat 1620gcctgagcgt ggcaactttc ctgccataca
gctccctgac ttctggcaat gtgctgcaga 1680gaaccctgag tgcaaggatg cggagaaggg
agagggaacg ctctcctgac agtttctcaa 1740tgcgacgaac cctgccacct atcgagggga
cactgggact gagtgcctgc ttcctgccta 1800actcagtgtt tccacgatgt agcgagcgga
atctgcagga gtctgtcctg agtgagcagg 1860atctgatgca gccagaggaa cccgactacc
tgctggatga gatgaccaag gaggccgaaa 1920tcgacaactc tagtacagtg gagtccctgg
gcgagactac cgatatcctg aatatgacac 1980acattatgtc actgaacagc aatgagaaga
gtctgaaact gtcaccagtg cagaagcaga 2040agaaacagca gactattggc gagctgactc
aggacaccgc cctgacagag aacgacgata 2100gcgatgagta tgaggaaatg tccgagatga
aggaaggcaa agcttgtaag cagcatgtga 2160gtcaggggat cttcatgaca cagccagcca
caactattga ggctttttca gacgaggaag 2220tggagatccc cgaggaaaaa gagggcgcag
aagattccaa ggggaatgga attgaggaac 2280aggaggtgga agccaacgag gaaaatgtga
aagtccacgg aggcaggaag gagaaaacag 2340aaatcctgtc tgacgatctg actgacaagg
ccgaggtgtc cgaaggcaag gcaaaatctg 2400tcggagaggc agaagacgga ccagagggac
gaggggatgg aacctgcgag gaaggctcaa 2460gcggggctga gcattggcag gacgaggaac
gagagaaggg cgaaaaggat aaaggccgcg 2520gggagatgga acgacctgga gagggcgaaa
aagagctggc agagaaggag gaatggaaga 2580aaagggacgg cgaggaacag gagcagaaag
aaagggagca gggccaccag aaggagcgca 2640accaggagat ggaagagggc ggcgaggaag
agcatggcga gggagaagag gaagagggcg 2700atagagaaga ggaagaggaa aaagaaggcg
aagggaagga ggaaggagag ggcgaggaag 2760tggaaggcga gagggaaaag gaggaaggag
aacggaagaa agaggaaaga gccggcaaag 2820aggaaaaggg cgaggaagag ggcgatcagg
gcgaaggcga ggaggaagag accgagggcc 2880gcggggaaga gaaagaggag ggaggagagg
tggagggcgg agaggtcgaa gagggaaagg 2940gcgagcgcga agaggaagag gaagagggcg
agggcgagga agaagagggc gagggggaag 3000aagaggaggg agagggcgaa gaggaagagg
gggagggaaa gggcgaagag gaaggagagg 3060aaggggaggg agaggaagag ggggaggagg
gcgaggggga aggcgaggag gaagaaggag 3120agggggaagg cgaagaggaa ggcgaggggg
aaggagagga ggaagaaggg gaaggcgaag 3180gcgaagagga gggagaagga gagggggagg
aagaggaagg agaagggaag ggcgaggagg 3240aaggcgaaga gggagagggg gaaggcgagg
aagaggaagg cgagggcgaa ggagaggacg 3300gcgagggcga gggagaagag gaggaagggg
aatgggaagg cgaagaagag gaaggcgaag 3360gcgaaggcga agaagagggc gaaggggagg
gcgaggaggg cgaaggcgaa ggggaggaag 3420aggaaggcga aggagaaggc gaggaagaag
agggagagga ggaaggcgag gaggaaggag 3480agggggagga ggagggagaa ggcgagggcg
aagaagaaga agagggagaa gtggagggcg 3540aagtcgaggg ggaggaggga gaaggggaag
gggaggaaga agagggcgaa gaagaaggcg 3600aggaaagaga aaaagaggga gaaggcgagg
aaaaccggag aaatagggaa gaggaggaag 3660aggaagaggg aaagtaccag gagacaggcg
aagaggaaaa cgagcggcag gatggcgagg 3720aatataagaa agtgagcaag atcaaaggat
ccgtcaagta cggcaagcac aaaacctatc 3780agaagaaaag cgtgaccaac acacagggga
atggaaaaga gcagcgaagt aaaatgcctg 3840tgcagtcaaa acggctgctg aagaatggcc
caagcgggtc taaaaaattc tggaacaatg 3900tcctgccaca ctatctggaa ctgaagtaag
cggccgcgcg gatccagaca tgataagata 3960cattgatgag tttggacaaa ccacaactag
aatgcagtga aaaaaatgct ttatttgtga 4020aatttgtgat gctattgctt tatttgtaac
cattataagc tgcaataaac aagttaacaa 4080caacaattgc attcatttta tgtttcaggt
tcagggggag gtgtgggagg ttttttagca 4140tg
4142