Patent application title: SYSTEMS, METHODS, AND COMPOSITIONS FOR SITE-SPECIFIC GENETIC ENGINEERING USING PROGRAMMABLE ADDITION VIA SITE-SPECIFIC TARGETING ELEMENTS (PASTE)
Inventors:
IPC8 Class: AC12N1511FI
USPC Class:
1 1
Class name:
Publication date: 2022-05-12
Patent application number: 20220145293
Abstract:
This disclosure provides systems, methods, and compositions for
site-specific genetic engineering using Programmable Addition via
Site-Specific Targeting Elements (PASTE). PASTE comprises the addition of
an integration site into a target genome followed by the insertion of one
or more genes of interest or one or more nucleic acid sequences of
interest at the site. PASTE combines gene editing technologies and
integrase technologies to achieve unidirectional incorporation of genes
in a genome for the treatment of diseases and diagnosis of disease.
Claims:
1. A method of site-specific integration of a nucleic acid into a cell
genome or target nucleic acid, the method comprising: (a) incorporating
an integration site at a desired location in the cell genome or target
nucleic acid by introducing into a cell: i. a DNA binding nuclease domain
linked to a reverse transcriptase domain, wherein the DNA binding
nuclease domain comprises a nickase activity; and ii. a guide RNA (gRNA)
comprising a primer binding targeting sequence linked to a complement of
an integration sequence, wherein the gRNA interacts with the DNA binding
nuclease domain and targets the desired location in the cell genome
or target nucleic acid, wherein the DNA binding nuclease domain
nicks a strand of the cell genome or target nucleic acid and the reverse
transcriptase domain incorporates the integration sequence of the gRNA
into the nicked site, thereby providing the integration site at the
desired location of the cell genome or target nucleic acid; and (b)
integrating the nucleic acid into the cell genome or target nucleic acid
by introducing into the cell: i. a DNA or RNA strand comprising the
nucleic acid linked to a sequence that is complementary or associated to
the integration site; and ii. an integration enzyme, wherein the
integration enzyme incorporates the nucleic acid into the cell genome or
target nucleic acid at the integration site by integration,
recombination, or reverse transcription of the sequence that is
complementary or associated to the integration site, thereby introducing
the nucleic acid into the desired location of the cell genome or target
nucleic acid of the cell.
2. The method of claim 1, wherein the gRNA hybridizes to a complementary strand of the cell genome to the genomic strand that is nicked by the DNA binding nuclease domain.
3. The method of claim 1, wherein: the integration enzyme is introduced as a polypeptide or a nucleic acid encoding the integration enzyme; and/or the DNA binding nuclease domain is introduced as a polypeptide or a nucleic acid encoding the DNA binding nuclease.
4. (canceled)
5. The method of claim 1, wherein the DNA or RNA strand comprising the nucleic acid is introduced into the cell as a minicircle, a plasmid, mRNA or a linear DNA, optionally wherein: the DNA or RNA strand comprising the nucleic acid is between 1000 bp and 36,000 bp; the DNA or RNA strand comprising the nucleic acid is more than 36,000 bp; and/or the DNA or RNA strand comprising the nucleic acid is less than 1000 bp.
6. (canceled)
7. (canceled)
8. (canceled)
9. The method of claim 1, wherein the DNA comprising the nucleic acid is introduced into the cell as a minicircle, optionally wherein the minicircle does not comprise a sequence of a bacterial origin.
10. (canceled)
11. The method of claim 1, wherein the DNA binding nuclease linked to a reverse transcriptase domain and the integration enzyme are linked via a linker, optionally wherein: the linker is cleavable; the linker is non-cleavable; or the linker can be replaced by two associating binding domains of the DNA binding nuclease linked to a reverse transcriptase.
12. (canceled)
13. (canceled)
14. (canceled)
15. The method of claim 1, wherein: the integration enzyme is selected from the group consisting of Cre, Dre, Vika, Bxb1, .phi.C31, RDF, FLP, .phi.BT1, R1, R2, R3, R4, R5, TP901-1, A118, .phi.FC1, .phi.C1, MR11, TG1, .phi.370.1, W.beta., BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, .phi.RV, retrotransposases encoded by R2, L1, Tol2 Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos, and any mutants thereof; the integration site is an attB site, an attP site, an attL site, an attR site, a lox71 site, a Vox site, or a FRT site; the DNA binding nuclease comprising a nickase activity is selected from Cas9-D10A, Cas9-H840A, and Cas12a/b nickase; and/or the reverse transcriptase domain is selected from the group consisting of Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase domain, transcription xenopolymerase (RTX), avian myeloblastosis virus reverse transcriptase (AMV-RT), and Eubacterium rectale maturase RT (MarathonRT), optionally wherein: the reverse transcriptase domain comprises a mutation relative to the wild-type sequence; and/or the M-MLV reverse transcriptase domain comprises one or more mutations selected from the group consisting of D200N, T306K, W313F, T330P and L603W.
16. (canceled)
17. (canceled)
18. (canceled)
19. (canceled)
20. (canceled)
21. (canceled)
22. The method of claim 1, further comprising introducing a nicking guide RNA (ngRNA).
23. The method of claim 1, wherein: the gRNA, the nucleic acid encoding the DNA binding nuclease, the reverse transcriptase, the DNA comprising nucleic acid linked to a complementary or associated integration site, the integration enzyme, and optionally the ngRNA, are introduced into a cell in a single reaction; and/or the gRNA, the nucleic acid encoding the DNA binding nuclease, the reverse transcriptase, the DNA comprising nucleic acid linked to a complementary integration site, the integration enzyme, and optionally the ngRNA, are introduced using a virus, a RNP, an mRNA, a lipid, or a polymeric nanoparticle.
24. (canceled)
25. The method of claim 1, wherein: the nucleic acid is a reporter gene, optionally wherein the reporter gene is a fluorescent protein; the nucleic acid is a degradation tag for programmable knockdown of proteins in the presence of small molecules; the nucleic acid is a T-cell receptor (TCR), a chimeric antigen receptor (CAR), an interleukin, a cytokine, or an immune checkpoint gene for integration into a T-cell or natural killer (NK) cell, optionally wherein the TCR, the CAR, the interleukin, the cytokine, or the immune checkpoint gene is incorporated into the target site of the T-cell or NK cell genome using a minicircle DNA; the nucleic acid is a beta hemoglobin (HBB) gene and the cell is a hematopoietic stem cell (HSC), optionally wherein the HBB gene is incorporated into the target site in the HSC genome using a minicircle DNA and/or the nucleic acid is a gene responsible for beta thalassemia or sickle cell anemia; the nucleic acid is a metabolic gene, optionally wherein the metabolic gene is involved in alpha-1 antitrypsin deficiency or ornithine transcarbamylase (OTC) deficiency and/or the metabolic gene is a gene involved in an inherited disease; or the nucleic acid is a gene involved in an inherited disease or an inherited syndrome, optionally wherein the inherited disease is cystic fibrosis, familial hypercholesterolemia, adenosine deaminase (ADA) deficiency, X-linked SCID (X-SCID), Wiskott-Aldrich syndrome (WAS), hemochromatosis, Tay-Sachs, fragile X syndrome, Huntington's disease, Marfan syndrome, phenylketonuria, or muscular dystrophy.
26. (canceled)
27. The method of claim 1, wherein the cell is a dividing cell or a non-dividing cell, optionally wherein: the desired location in the cell genome is the locus of a mutated gene; and/or the cell is a mammalian cell, a bacterial cell or a plant cell.
28. (canceled)
29. (canceled)
30. (canceled)
31. (canceled)
32. (canceled)
33. (canceled)
34. (canceled)
35. (canceled)
36. (canceled)
37. (canceled)
38. (canceled)
39. (canceled)
40. (canceled)
41. (canceled)
42. A vector comprising a nucleic acid encoding the polypeptide of claim 63.
43. (canceled)
44. (canceled)
45. (canceled)
46. (canceled)
47. (canceled)
48. (canceled)
49. (canceled)
50. (canceled)
51. (canceled)
52. (canceled)
53. (canceled)
54. A cell comprising: (a) the vector of claim 42; (b) a gRNA comprising a primer binding sequence, an integration sequence, and a guide sequence, wherein the gRNA can interact with the encoded nuclease comprising a nickase activity; (c) a DNA minicircle comprising a nucleic acid and a sequence recognized by the encoded integrase, recombinase, or reverse transcriptase; and (d) a nicking guide RNA (ngRNA) capable of binding the encoded nuclease comprising a nickase activity, wherein the ngRNA targets a sequence away from the gRNA.
55. The cell of claim 54, wherein: the minicircle does not comprise a sequence of bacterial origin; the integration enzyme is selected from the group consisting of Cre, Dre, Vika, Bxb1, .phi.C31, RDF, FLP, .phi.BT1, R1, R2, R3, R4, R5, TP901-1, A118, .phi.FC1, .phi.C1, MR11, TG1, .phi.370.1, W.beta., BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, .phi.RV, retrotransposases encoded by R2, L1, Tol2 Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), Minos, and any mutants thereof; the DNA binding nuclease comprising a nickase activity is selected from the group consisting of Cas9-D10A, Cas9-H840A and Cas12a; the reverse transcriptase is a M-MLV reverse transcriptase, optionally wherein the reverse transcriptase is a modified M-MLV reverse transcriptase, optionally wherein the amino acid sequence of the M-MLV reverse transcriptase comprises one or more mutations selected from the group consisting of D200N, T306K, W313F, T330P, and L603W; and/or the cell further comprises a ngRNA.
56. (canceled)
57. (canceled)
58. (canceled)
59. (canceled)
60. (canceled)
61. (canceled)
62. (canceled)
63. A polypeptide comprising a DNA binding nuclease comprising a nickase activity C-terminally linked to a reverse transcriptase linked to an integration enzyme via a linker.
64. The polypeptide of claim 63, wherein: the linker is cleavable or non-cleavable; the integration enzyme is fused to an estrogen receptor; the DNA binding nuclease comprising a nickase activity is selected from the group consisting of Cas9-D10A, Cas9-H840A, and Cas12a/b/c/d/e/f/g/h/i/j; the reverse transcriptase is a M-MLV reverse transcriptase, a AMV-RT, a MarathonRT, or a XRT, optionally wherein the reverse transcriptase is a modified M-MLV relative to a wild-type M-MLV reverse transcriptase, optionally wherein the M-MLV reverse transcriptase domain comprises one or more of mutations selected from the group consisting of D200N, T306K, W313F, T330P, and L603W; the integration enzyme is selected from group consisting of Cre, Dre, Vika, Bxb1, .phi.C31, RDF, FLP, .phi.BT1, R1, R2, R3, R4, R5, TP901-1, A118, .phi.FC1, .phi.C1, MR11, TG1, .phi.370.1, W.beta., BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, .phi.RV, retrotransposases encoded by R2, L1, Tol2 Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), Minos, and any mutants thereof.
65. (canceled)
66. (canceled)
67. (canceled)
68. (canceled)
69. (canceled)
70. (canceled)
71. (canceled)
72. (canceled)
73. A gRNA that specifically binds to a DNA binding nuclease comprising nickase activity, the gRNA comprising: (a) a primer binding site, which hybridizes to a nicked DNA strand; (b) a recognition site for an integration enzyme; and (c) a target recognition sequence recognizing a target site in a cell genome and hybridizing to a genomic strand complementary to the strand that is nicked by the DNA binding nuclease.
74. The gRNA of claim 73, wherein: the DNA binding nuclease comprising a nickase activity is selected from the group consisting of Cas9-D10A, Cas9-H840A, and Cas12a/b/c/d/e/f/g/h/i/j; the primer binding site hybridizes to the 3' end of the nicked DNA strand; the recognition site for the integration enzyme is selected from an attB site, an attP site, an attL site, an attR site, a lox71 site, and a FRT site; and/or the recognition site for the integration enzyme is a Bxb1 site.
75. (canceled)
76. (canceled)
77. (canceled)
78. A method of site-specific integration of two or more nucleic acids into a cell genome, the method comprising: (a) incorporating two integration sites at desired locations in the cell genome by introducing into the cell: i. a DNA binding nuclease linked to a reverse transcriptase domain, wherein the DNA binding nuclease comprises a nickase activity; and ii. two guide RNAs (gRNAs), each comprising, a primer binding sequence, and is linked to a unique integration sequence, wherein the gRNA interacts with the DNA binding nuclease and targets the desired locations in the cell genome, wherein the DNA binding nuclease nicks a strand of the cell genome and the reverse transcriptase domain incorporates each of the integration sequence of the gRNA into the nicked site, thereby providing the integration site at the desired locations of the cell genome; and (b) integrating the nucleic acid by introducing into the cell: i. two or more DNA or RNA comprising the nucleic acids, wherein each DNA is flanked by orthogonal integration sites; and ii. an integration enzyme, wherein the integration enzyme incorporates the nucleic acids into the cell genome at the integration sites by integrase, recombinase, or reverse transcriptase of the sequence that is complementary or associated to the integration site, thereby introducing the nucleic acids into the desired locations of the cell genome of the cell.
79. The method of claim 78, wherein each of the two different integration sites inserted into the cell genome are attB and/or attP sequences comprising different palindromic or non-palindromic central dinucleotide, optionally wherein: the integration enzyme enables each of the two or more DNA or RNA comprising the nucleic acids to directionally enable integration of the nucleic acids into a genome via recombination of a pair of orthogonal attB site sequence and an attP site sequence; and/or the pair of an attB site sequence and an attP site sequence are selected from the group consisting of SEQ ID NO: 5 and SEQ ID NO: 6, SEQ ID NO: 7 and SEQ ID NO: 8, SEQ ID NO: 9 and SEQ ID NO: 10, SEQ ID NO: 11 and SEQ ID NO: 12, SEQ ID NO: 13 and SEQ ID NO: 14, SEQ ID NO: 15 and SEQ ID NO: 16, SEQ ID NO: 17 and SEQ ID NO: 18, SEQ ID NO: 19 and SEQ ID NO: 20, SEQ ID NO: 21 and SEQ ID NO: 22, SEQ ID NO: 23 and SEQ ID NO: 24, SEQ ID NO: 25 and SEQ ID NO: 26, SEQ ID NO: 27 and SEQ ID NO: 28, SEQ ID NO: 29 and SEQ ID NO: 30, SEQ ID NO: 31 and SEQ ID NO: 32, SEQ ID NO: 33 and SEQ ID NO: 34, and SEQ ID NO: 35 and SEQ ID NO: 36.
80. (canceled)
81. (canceled)
82. (canceled)
83. (canceled)
84. (canceled)
85. (canceled)
86. (canceled)
87. (canceled)
88. (canceled)
89. (canceled)
90. (canceled)
91. (canceled)
92. The method of claim 17, wherein the attB site is about 40-46 base pairs.
93. The gRNA of claim 74, wherein the attB site is about 40-46 base pairs.
Description:
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional Patent Application Ser. No. 63/222,550, filed Jul. 16, 2021 and U.S. Provisional Patent Application Ser. No. 63/094,803, filed Oct. 21, 2020. The entire contents of the above-referenced patent applications are incorporated by reference in their entirety herein.
FIELD OF DISCLOSURE
[0002] The subject matter disclosed herein is generally directed to systems, methods, and compositions for site-specific genetic engineering using Programmable Addition via Site-Specific Targeting Elements (PASTE) for the treatment of diseases and diagnostics.
BACKGROUND
[0003] Editing genomes using the RNA-guided DNA targeting principle of CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats-CRISPR associated proteins) immunity has been widely exploited and has become a powerful genome editing tool for a wide variety of applications. The main advantage of the CRISPR-Cas system lies in the minimal requirement for programmable DNA interference: an endonuclease, such as Cas9, Cas12, or any other programmable nuclease, guided by a customizable dual-RNA structure. Cas9 is a multi-domain enzyme that uses an HNH nuclease domain to cleave the target strand. The CRISPR/Cas9 protein-RNA complex is localized on the target by a guide RNA (gRNA), and the target is then cleaved to generate a DNA double-strand break (DSB). After cleavage, DNA repair mechanisms are activated to repair the cleaved strand. Repair mechanisms are generally of one of two types: non-homologous end joining (NHEJ) or homologous recombination (HR). In general, NHEJ dominates the repair and, being error prone, generates random indels (insertions or deletions) that cause frameshift mutations, among others. In contrast, HR has a more precise repair capability and is potentially capable of incorporating the exact substitution or insertion. To enhance HR, several techniques have been tried, for example: fusing Cas9 nuclease to homology-directed repair (HDR) effectors to enforce their localization at DSBs, introducing an overlapping homology arm, or suppressing NHEJ. Most of these techniques rely on the host DNA repair systems.
[0004] Recently, new guided editors have been developed, such as guided prime editors (PE) PE1, PE2, and PE3, e.g., Liu, D. et al., Nature 2019, 576, 149-157. These PEs comprise a reverse transcriptase (RT) fused to a Cas9 H840A nickase (Cas9n (H840A)), and the genome editing is achieved using a prime-editing guide RNA (pegRNA). Despite these developments, programmable gene integration is still generally dependent on cellular pathways or repair processes.
[0005] Therefore, there is a need for more effective tools for gene editing and delivery.
SUMMARY
[0006] The present disclosure provides a method of site-specific integration of a nucleic acid into a cell genome. The method comprises incorporating an integration site at a desired location in the cell genome by introducing into the cell a DNA binding nuclease linked to a reverse transcriptase domain, wherein the DNA binding nuclease comprises a nickase activity; and a guide RNA (gRNA) comprising a primer binding sequence linked to an integration sequence, wherein the gRNA interacts with the DNA binding nuclease and targets the desired location in the cell genome, wherein the DNA binding nuclease nicks a strand of the cell genome and the reverse transcriptase domain incorporates the integration sequence of the gRNA into the nicked site, thereby providing the integration site at the desired location of the cell genome. The method further comprises integrating the nucleic acid into the cell genome by introducing into the cell a DNA or RNA strand comprising the nucleic acid linked to a sequence that is complementary or associated to the integration site, and an integration enzyme, wherein the integration enzyme incorporates the nucleic acid into the cell genome at the integration site by integration, recombination, or reverse transcription of the sequence that is complementary or associated to the integration site, thereby introducing the nucleic acid into the desired location of the cell genome of the cell.
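As a purely illustrative aid, the two steps described above can be sketched as string operations on a toy genome. This is not a biological simulation and is not taken from the disclosure: the function names `place_integration_site` and `integrate_cargo`, the toy sequence, and the placeholder tokens are all hypothetical. The `[attL]` and `[attR]` labels assume a serine integrase such as Bxb1, for which recombination between an attB site and an attP site leaves attL and attR junctions flanking the inserted nucleic acid.

```python
def place_integration_site(genome: str, nick_index: int, site: str) -> str:
    """Step (a): model nicking plus reverse transcription as inserting
    `site` at `nick_index` in the toy genome."""
    if not 0 <= nick_index <= len(genome):
        raise ValueError("nick_index outside genome")
    return genome[:nick_index] + site + genome[nick_index:]


def integrate_cargo(genome: str, site: str, cargo: str,
                    left: str = "[attL]", right: str = "[attR]") -> str:
    """Step (b): model the integration enzyme as replacing the single
    placed site with left junction + cargo + right junction."""
    if genome.count(site) != 1:
        raise ValueError("expected exactly one integration site")
    return genome.replace(site, left + cargo + right)


genome = "AAAACCCCGGGGTTTT"   # toy target sequence (hypothetical)
site = "[attB]"               # placeholder token, not a real att sequence
edited = place_integration_site(genome, 8, site)
final = integrate_cargo(edited, site, "<GENE>")
print(edited)
print(final)
```

The sketch keeps the two steps as separate functions to mirror the separation between providing the integration site and integrating the nucleic acid at that site.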
[0007] In some embodiments, the gRNA can hybridize to the strand of the cell genome that is complementary to the genomic strand nicked by the DNA binding nuclease.
[0008] In some embodiments, the integration enzyme can be introduced as a peptide or a nucleic acid encoding the same.
[0009] In some embodiments, the DNA binding nuclease can be introduced as a peptide or a nucleic acid encoding the same.
[0010] In some embodiments, the DNA or RNA strand comprising the nucleic acid can be introduced into the cell as a minicircle, a plasmid, mRNA or a linear DNA.
[0011] In some embodiments, the DNA or RNA strand comprising the nucleic acid can be between 1000 bp and 10,000 bp.
[0012] In some embodiments, the DNA or RNA strand comprising the nucleic acid can be more than 10,000 bp.
[0013] In some embodiments, the DNA or RNA strand comprising the nucleic acid can be less than 1000 bp.
[0014] In some embodiments, the DNA comprising the nucleic acid can be introduced into the cell as a minicircle.
[0015] In some embodiments, the minicircle does not comprise sequences of bacterial origin.
[0016] In some embodiments, the DNA binding nuclease linked to a reverse transcriptase domain and the integration enzyme can be linked via a linker. The linker can be cleavable. The linker can be non-cleavable. The linker can be replaced by two associating binding domains of the DNA binding nuclease linked to a reverse transcriptase.
[0017] In some embodiments, the integration enzyme can be selected from the group consisting of Cre, Dre, Vika, Bxb1, .phi.C31, RDF, FLP, .phi.BT1, R1, R2, R3, R4, R5, TP901-1, A118, .phi.FC1, .phi.C1, MR11, TG1, .phi.370.1, W.beta., BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, .phi.RV, retrotransposases encoded by R2, L1, Tol2 Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos, and any mutants thereof.
[0018] In some embodiments, the integration enzyme can be Bxb1 or a mutant thereof.
[0019] In some embodiments, the integration site can be selected from an attB site, an attP site, an attL site, an attR site, a lox71 site, a Vox site, or a FRT site.
[0020] In some embodiments, the DNA binding nuclease comprising a nickase activity can be selected from Cas9-D10A, Cas9-H840A, and Cas12a/b nickase.
[0021] In some embodiments, the reverse transcriptase domain can be selected from the group consisting of Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase domain, transcription xenopolymerase (RTX), avian myeloblastosis virus reverse transcriptase (AMV-RT), and Eubacterium rectale maturase RT (MarathonRT).
[0022] In some embodiments, the reverse transcriptase domain can comprise a mutation relative to the wild-type sequence.
[0023] In some embodiments, the M-MLV reverse transcriptase domain can comprise one or more mutations selected from the group consisting of D200N, T306K, W313F, T330P and L603W.
[0024] In some embodiments, the method can further comprise introducing a second nicking guide RNA (ngRNA). The ngRNA can direct nicking at 90 bases downstream of the gRNA nick on a complementary strand.
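The placement of the second nick can be illustrated with simple coordinate arithmetic. The sketch below assumes a 0-based coordinate system along a reference strand and treats "downstream" as increasing coordinates when the gRNA targets the "+" strand and decreasing coordinates when it targets the "-" strand. The function name `ngrna_nick_position` and the `strand` parameter are hypothetical and are not part of the disclosure.

```python
def ngrna_nick_position(grna_nick: int, offset: int = 90,
                        strand: str = "+") -> int:
    """Return the coordinate of the ngRNA-directed nick, `offset` bases
    downstream of the gRNA nick (assumed convention: '+' strand means
    increasing coordinates, '-' strand means decreasing coordinates)."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    position = grna_nick + offset if strand == "+" else grna_nick - offset
    if position < 0:
        raise ValueError("second nick falls before the start of the sequence")
    return position


print(ngrna_nick_position(1000))              # gRNA nick at 1000, '+' strand
print(ngrna_nick_position(1000, strand="-"))  # same nick, '-' strand
```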
[0025] In some embodiments, the gRNA, the nucleic acid encoding the DNA binding nuclease, the reverse transcriptase, the DNA comprising nucleic acid linked to a complementary integration site, the integration enzyme, and optionally the ngRNA can be introduced into a cell in a single reaction.
[0026] In some embodiments, the gRNA, the nucleic acid encoding the DNA binding nuclease, the reverse transcriptase, the DNA comprising nucleic acid linked to a complementary integration site, the integration enzyme, and optionally the ngRNA can be introduced using a virus, a RNP, an mRNA, a lipid, or a polymeric nanoparticle.
[0027] In some embodiments, the nucleic acid can be a reporter gene. The reporter gene can be a fluorescent protein.
[0028] In some embodiments, the cell can be a dividing cell.
[0029] In some embodiments, the cell can be a non-dividing cell.
[0030] In some embodiments, the desired location in the cell genome can be the locus of a mutated gene.
[0031] In some embodiments, the nucleic acid can be a degradation tag for programmable knockdown of proteins in the presence of small molecules.
[0032] In some embodiments, the cell can be a mammalian cell, a bacterial cell or a plant cell.
[0033] In some embodiments, the nucleic acid can be a T-cell receptor (TCR), a chimeric antigen receptor (CAR), an interleukin, a cytokine, or an immune checkpoint gene for integration into a T-cell or natural killer (NK) cell. The TCR, the CAR, the interleukin, the cytokine, or the immune checkpoint gene can be incorporated into the target site of the T-cell or NK cell genome using a minicircle DNA.
[0034] In some embodiments, the nucleic acid can be a beta hemoglobin (HBB) gene and the cell can be a hematopoietic stem cell (HSC). The HBB gene can be incorporated into the target site in the HSC genome using a minicircle DNA. The nucleic acid can be a gene responsible for beta thalassemia or sickle cell anemia.
[0035] In some embodiments, the nucleic acid can be a metabolic gene. The metabolic gene can be involved in alpha-1 antitrypsin deficiency or ornithine transcarbamylase (OTC) deficiency. The metabolic gene can be a gene involved in inherited diseases.
[0036] In some embodiments, the nucleic acid can be a gene involved in an inherited disease or an inherited syndrome. The inherited disease can be cystic fibrosis, familial hypercholesterolemia, adenosine deaminase (ADA) deficiency, X-linked SCID (X-SCID), Wiskott-Aldrich syndrome (WAS), hemochromatosis, Tay-Sachs, fragile X syndrome, Huntington's disease, Marfan syndrome, phenylketonuria, or muscular dystrophy.
[0037] The present disclosure provides a vector comprising a nucleic acid encoding a DNA binding nuclease comprising a nickase activity C-terminally linked to a reverse transcriptase linked to an integration enzyme via a linker.
[0038] In some embodiments, the linker can be cleavable.
[0039] In some embodiments, the linker can be non-cleavable.
[0040] In some embodiments, the linker can comprise two associating binding domains of the DNA binding nuclease linked to a reverse transcriptase.
[0041] In some embodiments, the integration enzyme can comprise a conditional activation domain or conditional expression domain.
[0042] In some embodiments, the integration enzyme can be fused to an estrogen receptor.
[0043] In some embodiments, the DNA binding nuclease comprising a nickase activity can be selected from the group consisting of Cas9-D10A, Cas9-H840A, and Cas12a/b.
[0044] In some embodiments, the reverse transcriptase can be a M-MLV reverse transcriptase, a AMV-RT, MarathonRT, or a RTX. The reverse transcriptase can be a modified M-MLV reverse transcriptase relative to the wildtype M-MLV reverse transcriptase. The M-MLV reverse transcriptase domain can comprise one or more of the mutations selected from the group consisting of D200N, T306K, W313F, T330P and L603W.
[0045] In some embodiments, the integration enzyme can be selected from the group consisting of Cre, Dre, Vika, Bxb1, .phi.C31, RDF, FLP, .phi.BT1, R1, R2, R3, R4, R5, TP901-1, A118, .phi.FC1, .phi.C1, MR11, TG1, .phi.370.1, W.beta., BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, .phi.RV, retrotransposases encoded by R2, L1, Tol2 Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos, and any mutants thereof.
[0046] In some embodiments, the recombinase or integrase can be Bxb1 or a mutant thereof.
[0047] The present disclosure provides a cell comprising a vector comprising a nucleic acid encoding a DNA binding nuclease comprising a nickase activity C-terminally linked to a reverse transcriptase linked to an integration enzyme via a linker. The cell further comprises a gRNA comprising a primer binding sequence, an integration sequence, and a guide sequence, wherein the gRNA can interact with the encoded nuclease comprising a nickase activity. The cell further comprises a DNA minicircle comprising a nucleic acid and a sequence recognized by the encoded integrase, recombinase, or reverse transcriptase. The cell further comprises a nicking guide RNA (ngRNA) capable of binding the encoded nuclease comprising a nickase activity, wherein the ngRNA targets a sequence away from the gRNA.
[0048] In some embodiments, the minicircle does not comprise a sequence of bacterial origin.
[0049] In some embodiments, the integration enzyme can be selected from the group consisting of Cre, Dre, Vika, Bxb1, .phi.C31, RDF, FLP, .phi.BT1, R1, R2, R3, R4, R5, TP901-1, A118, .phi.FC1, .phi.C1, MR11, TG1, .phi.370.1, W.beta., BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, .phi.RV, retrotransposases encoded by R2, L1, Tol2 Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos, and any mutants thereof.
[0050] In some embodiments, the integration enzyme can be Bxb1 or a mutant thereof.
[0051] In some embodiments, the DNA binding nuclease comprising a nickase activity can be selected from the group consisting of Cas9-D10A, Cas9-H840A and Cas12a.
[0052] In some embodiments, the reverse transcriptase can be a M-MLV reverse transcriptase. The reverse transcriptase can be a modified M-MLV reverse transcriptase. The amino acid sequence of the M-MLV reverse transcriptase can comprise one or more mutations selected from the group consisting of D200N, T306K, W313F, T330P, and L603W.
[0053] In some embodiments, the cell can further comprise an ngRNA. The ngRNA can be a +90 ngRNA. The +90 ngRNA can direct nicking at 90 bases downstream of the gRNA nick on a complementary strand.
[0054] The present disclosure provides a polypeptide comprising a DNA binding nuclease comprising a nickase activity C-terminally linked to a reverse transcriptase linked to an integration enzyme via a linker.
[0055] In some embodiments, the linker can be cleavable.
[0056] In some embodiments, the linker can be non-cleavable.
[0057] In some embodiments, the integration enzyme can be fused to an estrogen receptor.
[0058] In some embodiments, the DNA binding nuclease comprising a nickase activity can be selected from the group consisting of Cas9-D10A, Cas9-H840A, and Cas12a/b.
[0059] In some embodiments, the reverse transcriptase can be a M-MLV reverse transcriptase, an AMV-RT, a MarathonRT, or a RTX. The reverse transcriptase can be a modified M-MLV relative to a wild-type M-MLV reverse transcriptase. The M-MLV reverse transcriptase domain can comprise one or more mutations selected from the group consisting of D200N, T306K, W313F, T330P, and L603W.
[0060] In some embodiments, the integration enzyme can be selected from the group consisting of Cre, Dre, Vika, Bxb1, .phi.C31, RDF, FLP, .phi.BT1, R1, R2, R3, R4, R5, TP901-1, A118, .phi.FC1, .phi.C1, MR11, TG1, .phi.370.1, W.beta., BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, .phi.RV, retrotransposases encoded by R2, L1, Tol2 Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos, and any mutants thereof.
[0061] In some embodiments, the integration enzyme can be Bxb1 or a mutant thereof.
[0062] The present disclosure provides a gRNA that specifically binds to a DNA binding nuclease comprising nickase activity, the gRNA comprising a primer binding site, which hybridizes to a nicked DNA strand, a recognition site for an integration enzyme, and a target recognition sequence recognizing a target site in a cell genome and hybridizing to a genomic strand complementary to the strand that is nicked by the DNA binding nuclease.
[0063] In some embodiments, the DNA binding nuclease comprising a nickase activity can be selected from the group consisting of Cas9-D10A, Cas9-H840A, and Cas12a/b.
[0064] In some embodiments, the primer binding site can hybridize to the 3' end of the nicked DNA strand.
[0065] In some embodiments, the recognition site for the integration enzyme can be selected from an attB site, an attP site, an attL site, an attR site, a lox71 site, and an FRT site.
[0066] In some embodiments, the recognition site for the integration enzyme can be a Bxb1 site.
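For illustration only, the relationship between the primer binding site, the integration enzyme recognition site, and the nicked strand can be sketched in Python as follows. This sketch assumes the conventional prime editing layout, in which the 3' extension of the gRNA reads, 5' to 3', as a reverse transcription template followed by the primer binding site. That layout, the function names, the default primer binding site length, and the toy sequences in the usage note are assumptions made for the example and are not taken from the disclosure. Sequences are written in the DNA alphabet for readability; in the gRNA itself, T would be U.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_extension(upstream_of_nick: str,
                    integration_site: str,
                    downstream_homology: str,
                    pbs_length: int = 13) -> str:
    """Assemble a gRNA 3' extension (5'->3'), in DNA letters.

    upstream_of_nick:    nicked-strand sequence ending at the nick (5'->3')
    integration_site:    recognition site to be written at the nick
    downstream_homology: nicked-strand sequence immediately after the nick
    pbs_length:          bases of the nicked 3' end bound by the PBS
                         (hypothetical default)
    """
    if not 1 <= pbs_length <= len(upstream_of_nick):
        raise ValueError("pbs_length must fit within upstream_of_nick")
    # The PBS hybridizes to the 3' end of the nicked strand.
    pbs = reverse_complement(upstream_of_nick[-pbs_length:])
    # The template is the complement of what the reverse transcriptase
    # writes onto the nicked strand: the site, then the homology.
    rt_template = reverse_complement(integration_site + downstream_homology)
    return rt_template + pbs
```

As a usage note, with the toy inputs `build_extension("AAAACCCC", "GGTT", "AC", pbs_length=4)` the result is `"GTAACCGGGG"`, whose reverse complement, `"CCCCGGTTAC"`, is the nicked 3' end followed by the newly written site and the downstream homology.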
[0067] The present disclosure provides a method of site-specific integration of two or more nucleic acids into a cell genome. The method comprises incorporating two integration sites at desired locations in the cell genome by introducing into the cell a DNA binding nuclease linked to a reverse transcriptase domain, wherein the DNA binding nuclease comprises a nickase activity, and two guide RNAs (gRNAs), each comprising a primer binding sequence linked to a unique integration sequence, wherein each gRNA interacts with the DNA binding nuclease and targets one of the desired locations in the cell genome, wherein the DNA binding nuclease nicks a strand of the cell genome and the reverse transcriptase domain incorporates each of the integration sequences of the gRNAs into the nicked site, thereby providing the integration sites at the desired locations of the cell genome. The method further comprises integrating the nucleic acids by introducing into the cell two or more DNA or RNA comprising the nucleic acids, wherein each DNA is flanked by orthogonal integration sites, and an integration enzyme, wherein the integration enzyme incorporates the nucleic acids into the cell genome at the integration sites by integration, recombination, or reverse transcription of the sequence that is complementary or associated to the integration site, thereby introducing the nucleic acids into the desired locations of the cell genome.
[0068] In some embodiments, each of the two different integration sites inserted into the cell genome can be attB sequences comprising different palindromic or non-palindromic central dinucleotides.
[0069] In some embodiments, each of the two different integration sites inserted into the cell genome can be attP sequences comprising different palindromic or non-palindromic central dinucleotides.
[0070] In some embodiments, the integration enzyme can enable directional integration of each of the two or more DNA or RNA comprising the nucleic acids into a genome via recombination of an orthogonal pair of an attB site sequence and an attP site sequence.
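For illustration only, the central dinucleotide variants referred to above can be enumerated and classified with a short Python sketch. A dinucleotide is treated as palindromic when it equals its own reverse complement. The pairing rule used here, that an attB site and an attP site recombine only when their central dinucleotides are identical, is an idealized assumption for the example; actual pairing preferences are determined experimentally. All names in the sketch are hypothetical.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# All 4 x 4 = 16 possible central dinucleotides.
ALL_DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]


def is_palindromic(dinucleotide: str) -> bool:
    """True when the dinucleotide reads the same on both strands."""
    return dinucleotide.upper() == reverse_complement(dinucleotide)


PALINDROMIC = [d for d in ALL_DINUCLEOTIDES if is_palindromic(d)]
NON_PALINDROMIC = [d for d in ALL_DINUCLEOTIDES if not is_palindromic(d)]


def sites_pair(attb_dinucleotide: str, attp_dinucleotide: str) -> bool:
    """Idealized orthogonality rule (an assumption): identical central
    dinucleotides pair, and any mismatch does not."""
    return attb_dinucleotide.upper() == attp_dinucleotide.upper()
```

As a usage note, `PALINDROMIC` evaluates to `["AT", "CG", "GC", "TA"]`, leaving twelve non-palindromic variants. Under the idealized rule, a GT attB pairs with a GT attP but not with a GA attP, so cargos carrying GT and GA attP sites would be directed to separate genomic attB sites.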
[0071] In some embodiments, the integration enzyme can be selected from the group consisting of Cre, Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, Wβ, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, retrotransposases encoded by R1, R2, R3, R4, R5, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos, and any mutants thereof.
[0072] In some embodiments, the integration enzyme can be Bxb1 or a mutant thereof.
[0073] In some embodiments, the DNA comprising genes can be genes involved in a cell maintenance pathway, cell-division, or a signal transduction pathway.
[0074] In some embodiments, the reverse transcriptase domain can comprise Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase domain, transcription xenopolymerase (RTX), avian myeloblastosis virus reverse transcriptase (AMV-RT), or Eubacterium rectale maturase RT (MarathonRT).
[0075] In some embodiments, the DNA binding nuclease comprising a nickase activity can be selected from the group consisting of Cas9-D10A, Cas9-H840A, and Cas12a/b.
[0076] In some embodiments, the pair of an attB site sequence and an attP site sequence can be selected from the group consisting of SEQ ID NO: 5 and SEQ ID NO: 6, SEQ ID NO: 7 and SEQ ID NO: 8, SEQ ID NO: 9 and SEQ ID NO: 10, SEQ ID NO: 11 and SEQ ID NO: 12, SEQ ID NO: 13 and SEQ ID NO: 14, SEQ ID NO: 15 and SEQ ID NO: 16, SEQ ID NO: 17 and SEQ ID NO: 18, SEQ ID NO: 19 and SEQ ID NO: 20, SEQ ID NO: 21 and SEQ ID NO: 22, SEQ ID NO: 23 and SEQ ID NO: 24, SEQ ID NO: 25 and SEQ ID NO: 26, SEQ ID NO: 27 and SEQ ID NO: 28, SEQ ID NO: 29 and SEQ ID NO: 30, SEQ ID NO: 31 and SEQ ID NO: 32, SEQ ID NO: 33 and SEQ ID NO: 34, and SEQ ID NO: 35 and SEQ ID NO: 36.
[0077] The present disclosure provides a cell comprising a vector comprising a nucleic acid encoding a DNA binding nuclease comprising a nickase activity, wherein the DNA binding nuclease is C-terminally linked to a reverse transcriptase, wherein the reverse transcriptase is linked to a recombinase or integrase via a linker. The cell further comprises two guide RNAs (gRNAs) comprising a primer binding sequence, an integration sequence and a guide sequence, wherein the gRNA can interact with the encoded DNA binding nuclease comprising a nickase activity. The cell further comprises two or more DNA or RNA strands comprising a nucleic acid and a pair of flanking attB site sequence and an attP site sequence recognized by the encoded integrase or recombinase. The cell optionally further comprises a nicking guide RNA (ngRNA) capable of binding the encoded nuclease comprising a nickase activity, and wherein the ngRNA targets a sequence away from the gRNA.
[0078] The present disclosure provides a cell comprising a modified genome, wherein the modification comprises incorporation of two orthogonal integration sites within the cell genome by introducing into the cell: a vector comprising a nucleic acid encoding a DNA binding nuclease comprising a nickase activity, wherein the DNA binding nuclease is C-terminally linked to a reverse transcriptase; two guide RNAs (gRNAs), each comprising a primer binding sequence, a genomic integration sequence, and a guide sequence, wherein the gRNA can interact with the encoded nuclease comprising a nickase activity; and optionally a nicking guide RNA (ngRNA) capable of binding the encoded nuclease comprising a nickase activity, and wherein the ngRNA targets a sequence away from the gRNA.
[0079] The present disclosure provides a method of integrating two or more nucleic acids into the cell genome of the cell of claim 90, the method comprising introducing into the cell: two or more DNA, each comprising a nucleic acid and a pair of flanking orthogonal integration site sequences; and an integration enzyme that can recognize the integration site sequences, enabling directional linking of the two or more DNA comprising the nucleic acids; and enabling incorporation of the nucleic acids into the cell genome by integrating the 5' orthogonal integration sequence of the first DNA with the first genomic integration sequence and the 3' orthogonal integration sequence of the last DNA with the last genomic integration sequence, thereby incorporating the two or more nucleic acids into the cell genome.
[0080] The present disclosure provides a cell comprising a modified genome, wherein the modification comprises incorporation of two orthogonal integration sites within the cell genome by introducing into the cell: a vector comprising a nucleic acid encoding a DNA binding nuclease comprising a nickase activity, wherein the DNA binding nuclease is C-terminally linked to a reverse transcriptase; two guide RNAs (gRNAs), each comprising a primer binding sequence, a genomic integration sequence, and a guide sequence, wherein the gRNA can interact with the encoded nuclease comprising a nickase activity; and optionally a nicking guide RNA (ngRNA) capable of binding the encoded nuclease comprising a nickase activity, and wherein the ngRNA targets a sequence away from the gRNA; two or more DNA or RNA comprising the nucleic acids, wherein each DNA is flanked by orthogonal integration sites; and an integration enzyme, wherein the integration enzyme incorporates the nucleic acids into the cell genome at the integration sites.
BRIEF DESCRIPTION OF THE DRAWINGS
[0081] Aspects, features, benefits and advantages of the embodiments described herein will be apparent with regard to the following description, appended claims, and accompanying drawings where:
[0082] FIG. 1 shows a schematic diagram of a concept of Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;
[0083] FIG. 2 shows a schematic diagram of a prime editing process according to embodiments of the present teachings;
[0084] FIG. 3 shows the percent integration of green fluorescent protein (GFP) in the lentiviral integrated lox71 site in HEK293FT cell line in the presence of various plasmids according to embodiments of the present teachings;
[0085] FIG. 4 shows the percent editing of the HEK293FT genome for incorporation of various lengths of lox71 or lox66 according to embodiments of the present teachings;
[0086] FIG. 5A shows the percent editing of lox71 site with different PE/Cre vectors according to embodiments of the present teachings;
[0087] FIG. 5B shows the percent integration of GFP at the lox71 site in HEK293FT cell genome according to embodiments of the present teachings;
[0088] FIG. 6 shows a schematic representation of using Bxb1 to integrate a nucleic acid into the genome according to embodiments of the present teachings;
[0089] FIG. 7 shows the percent integration of GFP or Gluc into the attB locus using Bxb1 Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;
[0090] FIG. 8 shows the percent editing with various HEK3-targeting pegRNAs using Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;
[0091] FIG. 9A shows a fluorescent image of cells wherein the SUPT16H marker is tagged with EGFP using PASTE according to embodiments of the present teachings;
[0092] FIG. 9B shows a fluorescent image of cells wherein the SRRM2 marker is tagged with EGFP using Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;
[0093] FIG. 9C shows a fluorescent image of cells wherein the LMNB1 marker is tagged with EGFP using Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;
[0094] FIG. 9D shows a fluorescent image of cells wherein the NOLC1 marker is tagged with EGFP using Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;
[0095] FIG. 9E shows a fluorescent image of cells wherein the NOLC1 marker is tagged with EGFP using Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;
[0096] FIG. 9F shows a fluorescent image of cells wherein the NOLC1 marker is tagged with EGFP using Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;
[0097] FIG. 9G shows a fluorescent image of cells wherein the DEPDC4 marker is tagged with EGFP using Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;
[0098] FIG. 10A shows comparisons of lipofectamine aided transfection in blue with electroporation aided transfection in red for the addition of the Bxb1 attB site at the ACTB N-terminal site in the genome using PASTE according to embodiments of the present teachings;
[0099] FIG. 10B shows comparisons of lipofectamine aided transfection in blue with electroporation aided transfection in red for EGFP integration at the ACTB N-terminal site in the genome using PASTE according to embodiments of the present teachings;
[0100] FIG. 11 shows a diagram of the integration of EGFP and Gluc with various HEK3 targeting pegRNAs according to embodiments of the present teachings;
[0101] FIG. 12 shows a schematic diagram of using φC31 as the integration enzyme according to embodiments of the present teachings;
[0102] FIG. 13 shows a schematic diagram of multiplexing involving inserting multiple genes of interest in multiple loci using unique guide RNAs that incorporated exterior flanking attB sites according to embodiments of the present teachings;
[0103] FIG. 14A shows a diagram of the orthogonal editing with the right GT-EGFP according to embodiments of the present teachings;
[0104] FIG. 14B shows a diagram of the orthogonal editing with the right GA-mCherry according to embodiments of the present teachings;
[0105] FIG. 15A shows a fluorescent image of a multiplexing of ACTB-EGFP and NOLC1-mCherry according to embodiments of the present teachings;
[0106] FIG. 15B shows a fluorescent image of a multiplexing of ACTB-EGFP and LMNB1-mCherry according to embodiments of the present teachings;
[0107] FIG. 16A shows next generation sequencing results of 9×9 attP and attB central dinucleotide variants and their edit percentage, wherein the orthogonality of attB/attP combinations for potential multiplexing applications is shown according to embodiments of the present teachings;
[0108] FIG. 16B shows a heatmap of 9×9 attP and attB central dinucleotide variants and their edit percentage according to embodiments of the present teachings;
[0109] FIG. 17 shows integration of SERPINA1 and CPS1 into the Albumin locus using Albumin guide-pegRNA in HEK293FT cells according to embodiments of the present teachings;
[0110] FIG. 18 shows schematics for different nucleic acids for engineering T-cells according to embodiments of the present teachings;
[0111] FIG. 19 shows the editing efficiency for EGFP integration at the ACTB locus in primary T-cells according to embodiments of the present teachings;
[0112] FIG. 20 shows editing in TRAC locus in HEK293FT with different pegRNA according to embodiments of the present teachings;
[0113] FIG. 21A shows the attB integration at the ACTB locus using nicking guides 1 and 2 according to embodiments of the present teachings;
[0114] FIG. 21B shows the EGFP integration at the ACTB locus using nicking guides 1 and 2 according to embodiments of the present teachings;
[0115] FIG. 21C shows the EGFP integration at an ACTB site according to embodiments of the present teachings;
[0116] FIG. 22A shows PASTE editing in liver hepatocellular carcinoma cell line HEPG2 according to embodiments of the present teachings;
[0117] FIG. 22B shows PASTE editing of chronic myelogenous leukemia cell line K562 according to embodiments of the present teachings;
[0118] FIG. 23A shows the attB addition with targeting and non-targeting guides according to embodiments of the present teachings;
[0119] FIG. 23B shows the EGFP integration with targeting and non-targeting guides according to embodiments of the present teachings;
[0120] FIG. 23C shows the EGFP integration for mutagenized Bxb1 according to embodiments of the present teachings;
[0121] FIG. 24A shows a schematic of the design parameters for the pegRNA according to embodiments of the present teachings;
[0122] FIG. 24B shows a schematic of the design parameters for nicking guide RNA according to embodiments of the present teachings;
[0123] FIG. 25A shows the integration of EGFP at the ACTB locus with different PBS and RT lengths according to embodiments of the present teachings;
[0124] FIG. 25B shows the integration of EGFP at the LMNB1 locus with different PBS and RT lengths according to embodiments of the present teachings;
[0125] FIG. 25C shows the integration of EGFP at the NOLC1 locus with different PBS and RT lengths according to embodiments of the present teachings;
[0126] FIG. 25D shows the integration of EGFP at the GRSF1 locus with different PBS and RT lengths and different nicking guides according to embodiments of the present teachings;
[0127] FIG. 25E shows EGFP integration with mutant attP sites according to embodiments of the present teachings;
[0128] FIG. 25F shows the PASTE editing of an expanded panel of genes according to embodiments of the present teachings;
[0129] FIG. 26A shows the PASTE EGFP editing at the ACTB locus according to embodiments of the present teachings;
[0130] FIG. 26B shows the HITI EGFP editing at the ACTB locus according to embodiments of the present teachings;
[0131] FIG. 26C shows the comparison between PASTE and HITI editing of a panel of 14 genes according to embodiments of the present teachings;
[0132] FIG. 26D shows PASTE Bxb1 off-target integrations according to embodiments of the present teachings;
[0133] FIG. 26E shows PASTE Cas9 off-target integrations according to embodiments of the present teachings;
[0134] FIG. 26F shows the EGFP integration for gene inserts of different sizes according to embodiments of the present teachings;
[0135] FIG. 27A shows the orthogonality between selected sets of attB and attP sites according to embodiments of the present teachings;
[0136] FIG. 27B shows the orthogonality between selected sets of attB and attP sites according to embodiments of the present teachings;
[0137] FIG. 27C shows a schematic for the orthogonal PASTE editing using engineered di-nucleotide combinations according to embodiments of the present teachings;
[0138] FIG. 28A shows fluorescent images of the GFP tagging of ACTB and SUPT16H genes with PASTE according to embodiments of the present teachings;
[0139] FIG. 28B shows fluorescent images of the GFP tagging of NOLC1 and SRRM2 genes with PASTE according to embodiments of the present teachings;
[0140] FIG. 28C shows fluorescent images of the GFP tagging of LMNB1 and DEPDC4 genes with PASTE according to embodiments of the present teachings;
[0141] FIG. 28D shows the orthogonal gene integration at three endogenous sites with PASTE according to embodiments of the present teachings;
[0142] FIG. 28E shows the multiplexed insertion via one-plex, two-plex, and three-plex gene insertion at three endogenous sites via PASTE according to embodiments of the present teachings;
[0143] FIG. 28F shows fluorescent images of two single cells with multiplexed gene tagging of ACTB (EGFP) and NOLC1 (mCherry) using PASTE according to embodiments of the present teachings;
[0144] FIG. 28G shows fluorescent images of two single cells with multiplexed gene tagging of ACTB (EGFP) and LMNB1 (mCherry) using PASTE according to embodiments of the present teachings;
[0145] FIG. 29A shows the prime editing efficiency of Bxb1 attB site insertion at the ACTB locus according to embodiments of the present teachings;
[0146] FIG. 29B shows the prime editing efficiency of inserting Bxb1 attB sites of different lengths at the ACTB locus according to embodiments of the present teachings;
[0147] FIG. 29C shows the prime editing efficiency of inserting attB sequences from different integrases, wherein both orientations of landing sites are profiled (F, forward; and R, reverse) according to embodiments of the present teachings;
[0148] FIG. 29D shows the prime editing efficiency of inserting attB sequences from Bxb1 integrase and Cre recombinase, wherein both orientations of landing sites are profiled (F, forward; and R, reverse) according to embodiments of the present teachings;
[0149] FIG. 29E shows a schematic of PASTE insertion at the ACTB locus showing guide and target sequences according to embodiments of the present teachings. FIG. 29E discloses SEQ ID NOS 428-431, respectively, in order of appearance;
[0150] FIG. 29F shows a comparison of PASTE integration efficiency of GFP with a panel of integrases targeting the 5' end of the ACTB locus, wherein both orientations of landing sites are profiled (F, forward; and R, reverse) according to embodiments of the present teachings;
[0151] FIG. 29G shows a comparison of GFP cargo integration efficiency between Bxb1 integrases and Cre recombinase according to embodiments of the present teachings;
[0152] FIG. 29H shows the dependence of PASTE editing activity on different prime and integrase components according to embodiments of the present teachings;
[0153] FIG. 29I shows a titration of a single vector PASTE system (SpCas9-RT-P2A-Bxb1) on integrase efficiency according to embodiments of the present teachings;
[0154] FIG. 29J shows the effect of cargo size on PASTE insertion efficiency at the endogenous ACTB target according to embodiments of the present teachings;
[0155] FIG. 29K shows a gel electrophoresis showing complete insertion by PASTE for multiple cargo sizes according to embodiments of the present teachings;
[0156] FIG. 30A shows a schematic of PASTE integration, including resulting attR and attL sites that are generated and PCR primers for assaying the integration junctions according to embodiments of the present teachings;
[0157] FIG. 30B shows a PCR and gel electrophoresis readout of left integration junction from PASTE insertion of GFP at the ACTB locus, wherein the insertion is analyzed for in-frame and out-of-frame GFP integration experiments as well as for a no prime control and expected sizes of the PCR fragments are shown using the primers shown in the schematic in subpanel FIG. 30A according to embodiments of the present teachings;
[0158] FIG. 30C shows a PCR and gel electrophoresis readout of right integration junction from PASTE insertion of GFP at the ACTB locus, wherein the insertion is analyzed for in-frame and out-of-frame GFP integration experiments as well as for a no prime control and the expected sizes of the PCR fragments are shown using the primers shown in the schematic in subpanel FIG. 30A according to embodiments of the present teachings;
[0159] FIG. 30D shows Sanger sequencing of the right integration junction for an in-frame fusion of GFP via PASTE to the N-terminus of ACTB according to embodiments of the present teachings;
[0160] FIG. 30E shows Sanger sequencing of the left integration junction for an in-frame fusion of GFP via PASTE to the N-terminus of ACTB according to embodiments of the present teachings;
[0161] FIG. 31A shows a schematic of various parameters that affect PASTE integration of a ~1 kb GFP insert, wherein on the pegRNA, the PBS, RT, and attB lengths can alter the efficiency of attB insertion, and nicking guide selection also affects overall gene integration efficiency according to embodiments of the present teachings;
[0162] FIG. 31B shows the impact of PBS and RT length on PASTE integration of GFP at the ACTB locus according to embodiments of the present teachings;
[0163] FIG. 31C shows the impact of PBS and RT length on PASTE integration of GFP at the LMNB1 locus according to embodiments of the present teachings;
[0164] FIG. 31D shows the impact of attB length on PASTE integration of GFP at the ACTB locus according to embodiments of the present teachings;
[0165] FIG. 31E shows the impact of attB length on PASTE integration of GFP at the LMNB1 locus according to embodiments of the present teachings;
[0166] FIG. 31F shows the impact of attB length on PASTE integration of GFP at the NOLC1 locus according to embodiments of the present teachings;
[0167] FIG. 31G shows the impact of minimal PBS, RT, and attB lengths on PASTE integration efficiency of GFP at the ACTB locus according to embodiments of the present teachings;
[0168] FIG. 31H shows the impact of minimal PBS, RT, and attB lengths on PASTE integration efficiency of GFP at the LMNB1 locus according to embodiments of the present teachings;
[0169] FIG. 31I shows the PASTE integration of GFP at the LMNB1 locus in the presence and absence of nicking guide, prime, and Bxb1 with a minimally compact pegRNA containing a 38 bp attB compared to a longer pegRNA design according to embodiments of the present teachings;
[0170] FIG. 32A shows the PASTE insertion efficiency at ACTB and LMNB1 loci with two different nicking guide designs according to embodiments of the present teachings;
[0171] FIG. 32B shows the PASTE editing efficiency at ACTB and LMNB1 with target and non-targeting spacers and matched pegRNAs with and without Bxb1 expression according to embodiments of the present teachings;
[0172] FIG. 33A shows the PASTE integration of GFP at the ACTB locus with different Bxb1 catalytic mutants according to embodiments of the present teachings;
[0173] FIG. 33B shows the PASTE integration of GFP at the ACTB locus with different RT catalytic mutants according to embodiments of the present teachings;
[0174] FIG. 34A shows the GFP integration by PASTE at a panel of endogenous genomic loci according to embodiments of the present teachings;
[0175] FIG. 34B shows the integration of a panel of different gene cargo at ACTB locus via PASTE according to embodiments of the present teachings;
[0176] FIG. 34C shows the integration efficiency of therapeutically relevant genes at the ACTB locus according to embodiments of the present teachings;
[0177] FIG. 34D shows the endogenous protein tagging with GFP via PASTE by in-frame endogenous gene tagging at the ACTB loci and SRRM2 loci according to embodiments of the present teachings;
[0178] FIG. 34E shows the endogenous protein tagging with GFP via PASTE by in-frame endogenous gene tagging at the NOLC1 loci and LMNB1 loci according to embodiments of the present teachings;
[0179] FIG. 35 shows the integration of a panel of different gene cargo at LMNB1 locus via PASTE according to embodiments of the present teachings;
[0180] FIG. 36A shows the PASTE integration efficiency for all 16 central dinucleotide attB/attP sequence pairs with a 5 kb GFP template at the ACTB locus according to embodiments of the present teachings;
[0181] FIG. 36B shows a schematic of the pooled attB/attP dinucleotide orthogonality assay, wherein each attB dinucleotide sequence is co-transfected with a barcoded pool of all 16 attP dinucleotide sequences and Bxb1 integrase, relative integration efficiencies are determined by next generation sequencing of barcodes, and all 16 attB dinucleotides are profiled in an arrayed format with attP pools according to embodiments of the present teachings;
[0182] FIG. 36C shows the relative insertion preferences for all possible attB/attP dinucleotide pairs determined by the pooled orthogonality assay according to embodiments of the present teachings;
[0183] FIG. 36D shows the orthogonality of top 4 attB/attP dinucleotide pairs evaluated for GFP integration with PASTE at the ACTB locus according to embodiments of the present teachings;
[0184] FIG. 37 shows the orthogonality of Bxb1 dinucleotides as measured by a pooled reporter assay, wherein each web logo motif shows the relative integration of different attP sequences in a pool at a denoted attB sequence with the listed dinucleotide according to embodiments of the present teachings;
[0185] FIG. 38A shows a schematic of multiplexed integration of different cargo sets at specific genomic loci, wherein three fluorescent cargos (GFP, mCherry, and YFP) are inserted orthogonally at three different loci (ACTB, LMNB1, NOLC1) for in-frame gene tagging according to embodiments of the present teachings;
[0186] FIG. 38B shows the efficiency of multiplexed PASTE insertion of combinations of fluorophores at ACTB, LMNB1, and NOLC1 loci according to embodiments of the present teachings;
[0187] FIG. 39A shows the GFP integration efficiency at a panel of genomic loci by PASTE compared to insertion rates by homology-independent targeted integration (HITI) according to embodiments of the present teachings;
[0188] FIG. 39B shows a comparison of unintended indel generation by PASTE and HITI at the ACTB and LMNB1 target sites, wherein the on-target EGFP integration rate observed compared to unintended indels is shown according to embodiments of the present teachings;
[0189] FIG. 39C shows the integration of a GFP template by PASTE at the ACTB locus compared to homology-directed repair (HDR) at the same target, wherein the quantification is by single-cell clone counting, wherein targeting and non-targeting guides were used for HDR insertion, and wherein for PASTE targeting and non-targeting refers to the presence or absence of the SpCas9-RT protein respectively according to embodiments of the present teachings;
[0190] FIG. 39D shows the comparison of unintended indel generation by PASTE and HDR based EGFP insertion at the ACTB target site, wherein the average indel rate measured across all single-cell clones generated is shown according to embodiments of the present teachings;
[0191] FIG. 39E shows a schematic for Bxb1 and Cas9 off-target identification and a detection assay according to embodiments of the present teachings;
[0192] FIG. 39F shows the GFP integration activity at predicted Bxb1 off-target sites in the human genome according to embodiments of the present teachings;
[0193] FIG. 39G shows the GFP integration activity at predicted PASTE ACTB Cas9 guide off-target sites according to embodiments of the present teachings;
[0194] FIG. 39H shows the GFP integration activity at predicted HITI ACTB Cas9 guide off-target sites according to embodiments of the present teachings;
[0195] FIG. 39I shows a schematic of next-generation sequencing method to assay genome-wide off-target integration sites by PASTE according to embodiments of the present teachings;
[0196] FIG. 39J shows the alignment of reads at the on-target ACTB site using a genome-wide integration assay, wherein expected on-target integration outcomes are shown according to embodiments of the present teachings;
[0197] FIG. 39K shows the analysis of on-target and off-target integration events across 3 single-cell clones for PASTE and 3 single-cell clones for no prime condition according to embodiments of the present teachings;
[0198] FIG. 39L shows a Manhattan plot of integration events for a representative single-cell clone with PASTE editing, wherein the on-target site is at the ACTB gene on chromosome 7 according to embodiments of the present teachings;
[0199] FIG. 40A shows a comparison of indel rates generated by PASTE and HITI mediated insertion of EGFP at the ACTB and LMNB1 loci in HepG2 cells according to embodiments of the present teachings;
[0200] FIG. 40B shows the validation of ddPCR assays for detecting editing at predicted Bxb1 off-target sites using synthetic amplicons according to embodiments of the present teachings;
[0201] FIG. 40C shows the validation of ddPCR assays for detecting editing at predicted PASTE ACTB Cas9 guide off-target sites using synthetic amplicons according to embodiments of the present teachings;
[0202] FIG. 40D shows the validation of ddPCR assays for detecting editing at predicted HITI ACTB Cas9 guide off-target sites using synthetic amplicons according to embodiments of the present teachings;
[0203] FIG. 41A shows the number of significantly differentially regulated genes in HEK293FT cells expressing Bxb1 integrase, PASTE targeting ACTB integration of EGFP, or Prime editing targeting ACTB for EGFP insertion without Bxb1 expression according to embodiments of the present teachings;
[0204] FIG. 41B shows Volcano plots depicting the fold expression change of sequenced mRNAs versus significance (p-value), wherein each dot represents a unique mRNA transcript and significant transcripts are shaded according to either upregulation (red) or downregulation (blue), and wherein fold expression change is measured against ACTB-targeting guide-only expression (including cargo) according to embodiments of the present teachings;
[0205] FIG. 41C shows top significantly upregulated and downregulated genes for Bxb1-only conditions, wherein genes are shown with their corresponding Z-scores of counts per million (cpm) for Bxb1 only expression, GFP-only expression, PASTE targeting ACTB for EGFP insertion, Prime targeting ACTB for EGFP expression without Bxb1, and guide/cargo only according to embodiments of the present teachings;
[0206] FIG. 42A shows a schematic of PASTE performance in the presence of cell cycle inhibition, wherein cells are transfected with plasmids for insertion with PASTE or Cas9-induced HDR and treated with aphidicolin to arrest cell division, and wherein the efficiency of PASTE and HDR are read out with ddPCR or amplicon sequencing respectively according to embodiments of the present teachings;
[0207] FIG. 42B shows the editing efficiency of single mutations by HDR at EMX1 locus with two Cas9 guides in the presence or absence of cell division read out with amplicon sequencing according to embodiments of the present teachings;
[0208] FIG. 42C shows the integration efficiency of various sized GFP inserts up to 13.3 kb at the ACTB locus with PASTE in the presence or absence of cell division according to embodiments of the present teachings;
[0209] FIG. 42D shows the PASTE editing efficiency with two vector (PE2 and Bxb1) and single vector (PE2-P2A-Bxb1) designs in K562 cells according to embodiments of the present teachings;
[0210] FIG. 42E shows the PASTE editing efficiency with single vector (PE2-P2A-Bxb1) designs in primary human T cells according to embodiments of the present teachings;
[0211] FIG. 42F shows the integration efficiency of therapeutically relevant genes at the ACTB locus according to embodiments of the present teachings;
[0212] FIG. 42G shows a schematic of protein production assay for PASTE-integrated transgene, wherein SERPINA1 and CPS1 transgenes are tagged with HIBIT luciferase for readout with both ddPCR and luminescence according to embodiments of the present teachings;
[0213] FIG. 42H shows the integration efficiency of SERPINA1 and CPS1 transgenes in HEK293FT cells at the ACTB locus according to embodiments of the present teachings;
[0214] FIG. 42I shows the integration efficiency of SERPINA1 and CPS1 transgenes in HepG2 cells at the ACTB locus according to embodiments of the present teachings;
[0215] FIG. 42J shows the intracellular levels of SERPINA1-HIBIT and CPS1-HIBIT in HepG2 cells according to embodiments of the present teachings;
[0216] FIG. 42K shows the secreted levels of SERPINA1-HIBIT and CPS1-HIBIT in HepG2 cells according to embodiments of the present teachings;
[0217] FIG. 43A shows the HDR mediated editing of the EMX1 locus that is significantly diminished in non-dividing HEK293FT cells blocked by 5 µM aphidicolin treatment according to embodiments of the present teachings;
[0218] FIG. 43B shows the effect of insert minicircle DNA amount on PASTE-mediated insertion at the ACTB locus in dividing and nondividing HEK293FT cells blocked by 5 µM aphidicolin treatment according to embodiments of the present teachings;
[0219] FIG. 43C shows the PASTE integration of GFP at the ACTB locus with the GFP template delivered via AAV, showing dose dependence of integration efficiency according to embodiments of the present teachings;
[0220] FIG. 44A shows the PASTE integration activity at three endogenous loci comparing the normal PASTE SV40 NLS to a c-Myc NLS/variable bi-partite SV40 NLS design according to embodiments of the present teachings;
[0221] FIG. 44B shows the PASTE integration activity at the ACTB locus with different GFP minicircle template amounts comparing the normal PASTE SV40 NLS to a c-Myc NLS/variable bi-partite SV40 NLS design according to embodiments of the present teachings;
[0222] FIG. 45 shows the improvement of the PASTE editing activity using a puromycin growth selection marker according to embodiments of the present teachings;
[0223] FIG. 46A shows the integration of SERPINA1 and CPS1 genes that are HIBIT tagged as measured by a protein expression luciferase assay according to embodiments of the present teachings;
[0224] FIG. 46B shows the integration of SERPINA1 and CPS1 genes that are HIBIT tagged as measured by a protein expression luciferase assay normalized to a standardized HIBIT ladder, enabling accurate quantification of protein levels according to embodiments of the present teachings;
[0225] FIG. 47A shows optimization of PASTE constructs with a panel of linkers and reverse transcriptase (RT) modifications for EGFP integration at the ACTB locus, according to embodiments of the present teachings;
[0226] FIG. 47B shows the effect of cargo size on PASTE insertion efficiency at the endogenous ACTB target. Cargos were transfected with fixed molar amounts, according to embodiments of the present teachings;
[0227] FIG. 48A shows prime editing efficiency for the insertion of different length BxbINT AttB sites at ACTB, according to embodiments of the present teachings;
[0228] FIG. 48B shows prime editing efficiency for the insertion of a BxbINT AttB site at ACTB with targeting and non-targeting guides, according to embodiments of the present teachings;
[0229] FIG. 48C shows prime editing efficiency for the insertion of different integrases' (Bxb1, Tp9, and Bt1) AttB sites at ACTB. Both orientations of landing sites are profiled (F, forward; R, reverse), according to embodiments of the present teachings;
[0230] FIG. 48D shows PASTE editing efficiency for the insertion of EGFP at ACTB with and without a nicking guide, according to embodiments of the present teachings; and
[0231] FIG. 49A shows optimization of PASTE editing by dosage titration and protein optimization. PASTE integration efficiency of EGFP at ACTB measured with different doses of a single-vector delivery of components.
[0232] FIG. 49B PASTE integration efficiency of EGFP at ACTB measured with different ratios of a single-vector delivery of components to the EGFP template vector.
[0233] FIG. 49C PASTE integration efficiency of EGFP at ACTB with different RT domain fusions.
[0234] FIG. 49D PASTE integration efficiency of EGFP at ACTB with different RT domain fusions and linkers.
[0235] FIG. 49E PASTE integration efficiency of EGFP at ACTB with mutant RT domains.
[0236] FIG. 49F PASTE integration efficiency of EGFP at ACTB with mutated BxbINT domains.
[0237] FIG. 50A Insertion templates delivered via AAV transduction. PASTE editing machinery was delivered via transfection, and templates were co-delivered via AAV dosing at levels indicated.
[0238] FIG. 50B Schematic of AdV delivery of the complete PASTE system with three viral vectors.
[0239] FIG. 50C Integration efficiency of AdV delivery of integrase, guides, and cargo in HEK293FT and HepG2 cells. BxbINT and guide RNAs or cargo were delivered either via plasmid transfection (P1), AdV transduction (AdV), or omitted (-). SpCas9-RT was only delivered as plasmid or omitted.
[0240] FIG. 50D AdV delivery of all PASTE components in HEK293FT and HepG2 cells.
[0241] FIG. 50E Schematic of mRNA and synthetic guide delivery of PASTE components.
[0242] FIG. 50F Delivery of PASTE system components with mRNA and synthetic guides, paired with either AdV or plasmid cargo.
[0243] FIG. 50G Delivery of circular mRNA with synthetic guides and either AdV or plasmid cargo.
[0244] FIG. 50H PASTE editing efficiency with single vector designs in primary human T cells.
[0245] FIG. 50I PASTE editing efficiency with single vector designs in primary human hepatocytes.
[0246] FIG. 51A PASTE editing efficiency at the LMNB1 locus with 130 bp and 385 bp deletions of the first exon of LMNB1 with combined insertion of an attB sequence.
[0247] FIG. 51B PASTE editing efficiency with a 130 bp deletion of the first exon of LMNB1 with a combined insertion of a 967 bp cargo using the PASTE system.
DETAILED DESCRIPTION
[0248] It will be appreciated that for clarity, the following discussion will describe various aspects of embodiments of the applicant's teachings. It should be noted that the specific embodiments are not intended as an exhaustive description or as a limitation to the broader aspects discussed herein. One aspect described in conjunction with a particular embodiment is not necessarily limited to that embodiment and can be practiced with any other embodiment(s). Reference throughout this specification to "one embodiment", "an embodiment," "an example embodiment," means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, appearances of the phrases "in one embodiment," "in an embodiment," or "an example embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment, but may. Furthermore, the particular feature, structures or characteristics may be combined in any suitable manner, as would be apparent to a person skilled in the art from this disclosure, in one or more embodiments.
General Definitions
[0249] Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. Definitions of common terms and techniques in molecular biology may be found in Molecular Cloning: A Laboratory Manual, 2nd edition (1989) (Sambrook, Fritsch, and Maniatis); Molecular Cloning: A Laboratory Manual, 4th edition (2012) (Green and Sambrook); Current Protocols in Molecular Biology (1987) (F. M. Ausubel et al. eds.); the series Methods in Enzymology (Academic Press, Inc.); PCR 2: A Practical Approach (1995) (M. J. MacPherson, B. D. Hames, and G. R. Taylor eds.); Antibodies, A Laboratory Manual (1988) (Harlow and Lane, eds.); Antibodies A Laboratory Manual, 2nd edition 2013 (E. A. Greenfield ed.); Animal Cell Culture (1987) (R. I. Freshney, ed.); Benjamin Lewin, Genes IX, published by Jones and Bartlett, 2008 (ISBN 0763752223); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0632021829); Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 9780471185710); Singleton et al., Dictionary of Microbiology and Molecular Biology 2nd ed., J. Wiley & Sons (New York, N.Y. 1994); March, Advanced Organic Chemistry Reactions, Mechanisms and Structure 4th ed., John Wiley & Sons (New York, N.Y. 1992); and Marten H. Hofker and Jan van Deursen, Transgenic Mouse Methods and Protocols, 2nd edition (2011).
[0250] As used herein, the singular forms "a", "an," and "the" include both singular and plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a cell" includes a plurality of such cells.
[0251] As used herein, the term "optional" or "optionally" means that the subsequent described event, circumstance or substituent may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not.
[0252] The recitation of numerical ranges by endpoints includes all numbers and fractions subsumed within the respective ranges, as well as the recited endpoints.
[0253] As used herein, the term "about" or "approximately," when referring to a measurable value such as a parameter, an amount, a temporal duration, and the like, is meant to encompass variations of and from the specified value, such as variations of +/-10% or less, +/-5% or less, +/-1% or less, +/-0.5% or less, and +/-0.1% or less of and from the specified value, insofar as such variations are appropriate to perform in the disclosure. It is to be understood that the value to which the modifier "about" or "approximately" refers is itself also specifically, and preferably, disclosed.
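By way of non-limiting illustration only, the tolerance tiers recited for "about" can be expressed as a simple numeric check. The following Python sketch is not part of the disclosed embodiments; the function names `within_about` and `tightest_tier` are hypothetical and are introduced here solely for illustration.

```python
from typing import Optional

# Tolerance tiers recited in the definition of "about", as fractions.
TIERS = (0.10, 0.05, 0.01, 0.005, 0.001)


def within_about(value: float, specified: float, tolerance: float = 0.10) -> bool:
    """Return True if `value` deviates from `specified` by no more than
    `tolerance` (a fraction of the specified value)."""
    return abs(value - specified) <= tolerance * abs(specified)


def tightest_tier(value: float, specified: float) -> Optional[float]:
    """Return the smallest recited tier that still covers `value`,
    or None if the value falls outside even the +/-10% tier."""
    for tier in sorted(TIERS):
        if within_about(value, specified, tier):
            return tier
    return None
```

For instance, under this sketch a measured length of 54 bp would fall within "about 50 bp" at the +/-10% tier, whereas 56 bp would not.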
[0254] It is noted that all publications and references cited herein are expressly incorporated herein by reference in their entirety. The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present disclosure is not entitled to antedate such publication. Further, the dates of publication provided may be different from the actual publication dates which may need to be independently confirmed.
Overview
[0255] The embodiments disclosed herein provide non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering using Programmable Addition via Site-Specific Targeting Elements (PASTE). A schematic diagram illustrating the concept of PASTE is shown in FIG. 1. As discussed in more detail below, PASTE comprises the addition of an integration site into a target genome followed by the insertion of one or more genes of interest or one or more nucleic acid sequences of interest at the site. This process can be done as one or more reactions in a cell. The addition of the integration site into the target genome is done using gene editing technologies that include, for example and without limitation, prime editing, recombinant adeno-associated virus (rAAV)-mediated nucleic acid integration, transcription activator-like effector nucleases (TALENs), and zinc finger nucleases (ZFNs). The integration of the transgene at the integration site is done using integrase technologies that include, for example and without limitation, integrases, recombinases and reverse transcriptases. The necessary components for the site-specific genetic engineering disclosed herein comprise at least one or more nucleases, one or more gRNA, one or more integration enzymes, and one or more sequences that are complementary or associated to the integration site and linked to the one or more genes of interest or one or more nucleic acid sequences of interest to be inserted into the cell genome.
[0256] An advantage of the non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering disclosed herein is programmable insertion of large elements without reliance on DNA damage responses.
[0257] Another advantage of the non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering disclosed herein is facile multiplexing, enabling programmable insertion at multiple sites.
[0258] Another advantage of the non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering disclosed herein is scalable production and delivery through minicircle templates.
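By way of non-limiting illustration only, the two-step process described in this Overview (addition of an integration site, followed by insertion of a cargo at that site) can be sketched as a toy string model. The sketch below is not the disclosed method and does not model any enzyme. The class `AttSite`, the functions `add_integration_site` and `integrate_cargo`, and all sequences are hypothetical placeholders. The sketch assumes the generally understood behavior of serine integrases, in which recombination between an attB site and an attP site at a shared central core yields hybrid attL and attR sites flanking the inserted cargo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttSite:
    """Toy attachment site: two arms around a central core (placeholders only)."""
    left: str
    core: str
    right: str

    @property
    def seq(self) -> str:
        return self.left + self.core + self.right


def add_integration_site(genome: str, position: int, att_b: AttSite) -> str:
    """Step 1 (toy): write the integration site into the genome at `position`."""
    return genome[:position] + att_b.seq + genome[position:]


def integrate_cargo(genome: str, att_b: AttSite, att_p: AttSite, cargo: str) -> str:
    """Step 2 (toy): recombine a circular donor (attP + cargo) into the attB site.

    The product carries hybrid attL (attB left arm + core + attP right arm)
    and attR (attP left arm + core + attB right arm) sites around the cargo.
    """
    if att_b.core != att_p.core:
        raise ValueError("attB and attP cores must match for recombination")
    if genome.count(att_b.seq) != 1:
        raise ValueError("expected exactly one attB site in the genome")
    start = genome.index(att_b.seq)
    end = start + len(att_b.seq)
    att_l = att_b.left + att_b.core + att_p.right
    att_r = att_p.left + att_p.core + att_b.right
    return genome[:start] + att_l + cargo + att_r + genome[end:]


# Usage with placeholder tokens rather than real sequences.
att_b = AttSite("[B5]", "GT", "[B3]")
att_p = AttSite("[P5]", "GT", "[P3]")
edited = add_integration_site("AAAATTTT", 4, att_b)
final = integrate_cargo(edited, att_b, att_p, "CARGO")
```

In this toy model the original attB site is consumed by the recombination, which is one way to picture why the insertion proceeds in a single direction.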
Prime Editing
[0259] The present disclosure provides non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering using gene editing technologies, such as prime editing, to add an integration site into a target genome. Prime editing will be discussed in more detail below.
[0260] Prime editing is a versatile and precise genome editing method that directly writes new genetic information into a specified DNA site. A schematic diagram illustrating the concept of prime editing is shown in FIG. 2. See, Anzalone, A. V., et al. "Search-and-replace genome editing without double-strand breaks or donor DNA," Nature 576, 149-157 (2019). Prime editing uses a catalytically-impaired Cas9 endonuclease that is fused to an engineered reverse transcriptase (RT) and programmed with a prime-editing guide RNA (pegRNA). The skilled person in the art would appreciate that the pegRNA both specifies the target site and encodes the desired edit. The catalytically-impaired Cas9 endonuclease also comprises a Cas9 nickase that is fused to the reverse transcriptase. During genetic editing, the Cas9 nickase part of the protein is guided to the DNA target site by the pegRNA. The reverse transcriptase domain then uses the pegRNA to template reverse transcription of the desired edit, directly polymerizing DNA onto the nicked target DNA strand. The edited DNA strand replaces the original DNA strand, creating a heteroduplex containing one edited strand and one unedited strand. Afterward, the prime editor (PE) guides resolution of the heteroduplex to favor copying the edit onto the unedited strand, completing the process.
[0261] The prime editors refer to a Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase (RT) fused to a Cas9 H840A nickase. Fusing the RT to the C-terminus of the Cas9 nickase may result in higher editing efficiency. Such a complex is called PE1. The Cas9(H840A) can also be linked to a non-M-MLV reverse transcriptase such as an AMV-RT or XRT (Cas9(H840A)-AMV-RT or XRT). In some embodiments, Cas9(H840A) can be replaced with Cas12a/b or Cas9(D10A). A Cas9 (wild type), Cas9(H840A), Cas9(D10A) or Cas12a/b nickase fused to a pentamutant of M-MLV RT (D200N/L603W/T330P/T306K/W313F), having up to about 45-fold higher efficiency, is called PE2. In some embodiments, the M-MLV RT comprises one or more of the mutations: Y8H, P51L, S56A, S67R, E69K, V129P, L139P, T197A, H204R, V223H, T246E, N249D, E286R, Q291I, E302K, E302R, F309N, M320L, P330E, L435G, L435R, N454K, D524A, D524G, D524N, E562Q, D583N, H594Q, E607K, D653N, and L671P. In some embodiments, the reverse transcriptase can also be a wild-type or modified reverse transcription xenopolymerase (RTX), avian myeloblastosis virus reverse transcriptase (AMV-RT), Feline Immunodeficiency Virus reverse transcriptase (FIV-RT), FeLV-RT (Feline leukemia virus reverse transcriptase), HIV-RT (Human Immunodeficiency Virus reverse transcriptase), or Eubacterium rectale maturase RT (MarathonRT). PE3 involves nicking the non-edited strand, potentially causing the cell to remake that strand using the edited strand as the template to induce HR. The nicking of the non-edited strand can involve the use of a nicking guide RNA (ngRNA).
[0262] Nicking the non-edited strand can increase editing efficiency. For example, nicking the non-edited strand can increase editing efficiency by about 1.1 fold, about 1.3 fold, about 1.5 fold, about 1.7 fold, about 1.9 fold, about 2.1 fold, about 2.3 fold, about 2.5 fold, about 2.7 fold, about 2.9 fold, about 3.1 fold, about 3.3 fold, about 3.5 fold, about 3.7 fold, about 3.9 fold, about 4.1 fold, about 4.3 fold, about 4.5 fold, about 4.7 fold, about 4.9 fold, or any range that is formed from any two of those values as endpoints.
[0263] Although the optimal nicking position varies depending on the genomic site, nicks positioned 3' of the edit about 40-90 bp from the pegRNA-induced nick can generally increase editing efficiency without excess indel formation. The prime editing practice allows starting with non-edited strand nicks about 50 bp from the pegRNA-mediated nick, and testing alternative nick locations if indel frequencies exceed acceptable levels.
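By way of non-limiting illustration only, the nick-placement practice described above (a window of about 40-90 bp from the pegRNA-induced nick, starting near about 50 bp) can be sketched as a filter-and-rank step over candidate nick offsets. The function name `rank_nick_candidates` and its parameters are hypothetical. The sketch assumes the candidate offsets have already been identified by some other means and does not model indel formation.

```python
from typing import Iterable, List, Tuple


def rank_nick_candidates(
    candidate_offsets: Iterable[int],
    window: Tuple[int, int] = (40, 90),
    preferred: int = 50,
) -> List[int]:
    """Keep candidate non-edited-strand nick offsets (bp from the
    pegRNA-induced nick) that fall inside `window`, ordered so that the
    offset closest to `preferred` is tested first."""
    low, high = window
    in_window = [d for d in candidate_offsets if low <= d <= high]
    # Ties in distance from the preferred offset are broken by the smaller offset.
    return sorted(in_window, key=lambda d: (abs(d - preferred), d))


# Usage: candidate offsets of 12 and 95 bp fall outside the window and are dropped.
order = rank_nick_candidates([12, 48, 95, 63, 52, 40])
```

Under this sketch, later entries in the returned list serve as the alternative nick locations to test if indel frequencies at the first choice exceed acceptable levels.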
[0264] As used herein, the term "guide RNA" (gRNA) and the like refer to an RNA that guides the insertion or deletion of one or more genes of interest or one or more nucleic acid sequences of interest into a target genome. The gRNA can also refer to a prime editing guide RNA (pegRNA), a nicking guide RNA (ngRNA), and a single guide RNA (sgRNA). In some embodiments, the term "gRNA molecule" refers to a nucleic acid encoding a gRNA. In some embodiments, the gRNA molecule is naturally occurring. In some embodiments, a gRNA molecule is non-naturally occurring. In some embodiments, a gRNA molecule is a synthetic gRNA molecule. A gRNA can target a nuclease or a nickase such as a Cas9, Cas12a/b, Cas9(H840A) or Cas9(D10A) molecule to a target nucleic acid or sequence in a genome. In some embodiments, the gRNA can bind to a DNA nickase bound to a reverse transcriptase domain. A "modified gRNA," as used herein, refers to a gRNA molecule that has an improved half-life after being introduced into a cell as compared to a non-modified gRNA molecule after being introduced into a cell. In some embodiments, the guide RNA can facilitate the addition of the insertion site sequence for recognition by integrases, transposases, or recombinases.
[0265] As used herein, the term "prime-editing guide RNA" (pegRNA) and the like refer to an extended single guide RNA (sgRNA) comprising a primer binding site (PBS), a reverse transcriptase (RT) template sequence, and an integration site sequence that can be recognized by recombinases, integrases, or transposases. Exemplary design parameters for pegRNA are shown in FIG. 24A. For example, the PBS can have a length of at least about 4 nt, 5 nt, 6 nt, 7 nt, 8 nt, 9 nt, 10 nt, 11 nt, 12 nt, 13 nt, 14 nt, 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, 26 nt, 27 nt, 28 nt, 29 nt, 30 nt, or more nt. For example, the PBS can have a length of about 4 nt, 5 nt, 6 nt, 7 nt, 8 nt, 9 nt, 10 nt, 11 nt, 12 nt, 13 nt, 14 nt, 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, 26 nt, 27 nt, 28 nt, 29 nt, 30 nt, or any range that is formed from any two of those values as endpoints. For example, the RT template sequence can have a length of at least about 4 nt, 5 nt, 6 nt, 7 nt, 8 nt, 9 nt, 10 nt, 11 nt, 12 nt, 13 nt, 14 nt, 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, 26 nt, 27 nt, 28 nt, 29 nt, 30 nt, 31 nt, 32 nt, 33 nt, 34 nt, 35 nt, 36 nt, 37 nt, 38 nt, 39 nt, 40 nt, 41 nt, 42 nt, 43 nt, 44 nt, 45 nt, 46 nt, 47 nt, 48 nt, 49 nt, 50 nt, or more nt. For example, the RT template sequence can have a length of about 4 nt, 5 nt, 6 nt, 7 nt, 8 nt, 9 nt, 10 nt, 11 nt, 12 nt, 13 nt, 14 nt, 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, 26 nt, 27 nt, 28 nt, 29 nt, 30 nt, 31 nt, 32 nt, 33 nt, 34 nt, 35 nt, 36 nt, 37 nt, 38 nt, 39 nt, 40 nt, 41 nt, 42 nt, 43 nt, 44 nt, 45 nt, 46 nt, 47 nt, 48 nt, 49 nt, 50 nt, or any range that is formed from any two of those values as endpoints.
[0266] During genome editing, the primer binding site allows the 3' end of the nicked DNA strand to hybridize to the pegRNA, while the RT template serves as a template for the synthesis of edited genetic information. The pegRNA is capable for instance, without limitation, of (i) identifying the target nucleotide sequence to be edited and (ii) encoding new genetic information that replaces the targeted sequence. In some embodiments, the pegRNA is capable of (i) identifying the target nucleotide sequence to be edited and (ii) encoding an integration site that replaces the targeted sequence.
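By way of non-limiting illustration only, the roles of the primer binding site and the RT template described above can be sketched as a sequence-assembly step for the 3' extension of a pegRNA. The function `design_extension`, its parameters, and the toy sequences are hypothetical. The sketch works in DNA letters on the nicked (PAM-containing) strand, assumes the nick position is already known, and does not represent an optimized design. The default lengths follow the "PBS 13 RT 29" naming in Table 1 and are otherwise arbitrary.

```python
from typing import Dict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_extension(
    pam_strand: str,
    nick_index: int,
    integration_site: str,
    pbs_len: int = 13,
    homology_len: int = 29,
) -> Dict[str, str]:
    """Assemble a toy pegRNA 3' extension (written 5'->3', DNA letters).

    pam_strand       -- nicked strand, 5'->3'; the nick lies just before nick_index
    integration_site -- sequence to be written immediately 3' of the nick
    """
    if not 4 <= pbs_len <= 30:
        raise ValueError("PBS length outside the 4-30 nt range discussed above")
    if nick_index - pbs_len < 0 or nick_index + homology_len > len(pam_strand):
        raise ValueError("not enough flanking sequence around the nick")

    primer = pam_strand[nick_index - pbs_len:nick_index]         # 3' end of nicked strand
    homology = pam_strand[nick_index:nick_index + homology_len]  # sequence 3' of the nick

    # The PBS hybridizes to the primer, so it is the primer's reverse complement.
    pbs = reverse_complement(primer)
    # The template is copied into DNA, so it encodes the reverse complement
    # of the desired product (integration site followed by homology).
    rt_template = reverse_complement(integration_site + homology)
    return {"pbs": pbs, "rt_template": rt_template, "extension": rt_template + pbs}


# Usage with toy sequences (not real genomic or attachment-site sequences).
design = design_extension("AAAACCCCGGGGTTTT", 8, "GATTACA", pbs_len=4, homology_len=4)
```

Reverse-complementing the returned extension recovers the primer, the integration site, and the downstream homology in order, which is the strand the reverse transcriptase would be expected to synthesize onto the nicked DNA in this simplified picture.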
[0267] As used herein, the term "nicking guide RNA" (ngRNA) and the like refer to an RNA sequence that can nick a strand such as an edited strand and a non-edited strand. Exemplary design parameters for ngRNA are shown in FIG. 24B. The ngRNA can induce nicks at about 1 or more nt away from the site of the gRNA-induced nick. For example, the ngRNA can nick at least at about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, or more nt away from the site of the gRNA-induced nick. In some embodiments, the ngRNA comprises SEQ ID NO: 75 with guide sequence SEQ ID NO: 74. As used herein, the terms "reverse transcriptase" and "reverse transcriptase domain" refer to an enzyme or an enzymatically active domain that can reverse transcribe an RNA into a complementary DNA. The reverse transcriptase or reverse transcriptase domain is an RNA-dependent DNA polymerase. Such reverse transcriptase domains encompass, but are not limited to, an M-MLV reverse transcriptase, or a modified reverse transcriptase such as, without limitation, Superscript® reverse transcriptase (Invitrogen; Carlsbad, Calif.), Superscript® VILO™ cDNA synthesis (Invitrogen; Carlsbad, Calif.), RTX, AMV-RT, and Quantiscript Reverse Transcriptase (Qiagen, Hilden, Germany).
[0268] The pegRNA-PE complex disclosed herein recognizes the target site in the genome, and the Cas9, for example, nicks a protospacer adjacent motif (PAM) strand. The primer binding site (PBS) in the pegRNA hybridizes to the PAM strand. The RT template operably linked to the PBS, containing the edit sequence, directs reverse transcription of the RT template into DNA at the target site. Equilibration between the edited 3' flap and the unedited 5' flap, cellular 5' flap cleavage and ligation, and DNA repair results in stably edited DNA. To optimize editing, a Cas9 nickase can be used to nick the non-edited strand, thereby directing DNA repair to that strand, using the edited strand as a template.
Integrase Technologies
[0269] The present disclosure provides non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering using integrase technologies. Integrase technologies will be discussed in more detail below.
[0270] The integrase technologies used herein comprise proteins or nucleic acids encoding the proteins that direct integration of a gene of interest or nucleic acid sequence of interest into an integration site via a nuclease such as a prime editing nuclease. The protein directing the integration can be an enzyme such as integration enzyme. The integration enzyme can be an integrase that incorporates the genome or nucleic acid of interest into the cell genome at the integration site by integration. The integration enzyme can be a recombinase that incorporates the genome or nucleic acid of interest into the cell genome at the integration site by recombination. The integration enzyme can be a reverse transcriptase that incorporates the genome or nucleic acid of interest into the cell genome at the integration site by reverse transcription. The integration enzyme can be a retrotransposase that incorporates the genome or nucleic acid of interest into the cell genome at the integration site by retrotransposition.
[0271] As used herein, the term "integration enzyme" refers to an enzyme or protein used to integrate a gene of interest or nucleic acid sequence of interest into a desired location or at the integration site, in the genome of a cell, in a single reaction or multiple reactions. Examples of integration enzymes include, without limitation, Cre, Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, Wβ, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, Benedict, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, and retrotransposases encoded by R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos. In some embodiments, the term "integration enzyme" refers to a nucleic acid (DNA or RNA) encoding the above-mentioned enzymes. In some embodiments, the Cre recombinase is expressed from a Cre recombinase expression plasmid (SEQ ID NO: 71).
[0272] Mammalian expression plasmids can be found in Table 1 below.
TABLE 1
Name | Full Description | SEQ ID NO
PE2-Bxb1 Single Vector | pCMV-PE2-P2A-Bxb1 | SEQ ID NO: 381
PE2 prime editor | pCMV-PE2/Addgene #132775 | SEQ ID NO: 382
PE2*-Bxb1 Single Vector | New NLS pCMV-PE2-P2A-Bxb1 | SEQ ID NO: 383
PASTEv3 | pCMV-SpCas9-XTEN-RT(1-478)-Sto7d-GGGGS-BxbINT | SEQ ID NO: 384
ACTB pegRNA | ACTB N-term PBS 13 RT 29 attB 46 pegRNA | SEQ ID NO: 385
ACTB Nicking +48 guide 1 | ACTB N-term Nicking +48 guide | SEQ ID NO: 386
Bxb1 integrase | pCAG-NLS-HA-Bxb1integrase/Addgene #51271 | SEQ ID NO: 387
TP901-1 Integrase | TP901-1 Integrase | SEQ ID NO: 388
PhiBT Integrase | PhiBT Integrase | SEQ ID NO: 389
HDR sgRNA guide | Minicircle U6-sgRNA EFS-SpCas9 | SEQ ID NO: 390
HDR EGFP cargo | Cas9 HDR template site with EGFP | SEQ ID NO: 391
AAV helper plasmid | PDF6 AAV helper plasmid | SEQ ID NO: 392
AAV EGFP donor | GFP AAV donor plasmid | SEQ ID NO: 393
AAV2/8 | AAV2/8 capsid protein | SEQ ID NO: 394
[0273] Minicircle cargo gene maps can be found in Table 2 below.
TABLE 2
Name | Full Description | SEQ ID NO
Cargo EGFP | Parent minicircle plasmid - Cargo EGFP with attP Bxb1 site | SEQ ID NO: 76
Cargo EGFP post cleavage | Cargo EGFP with attP Bxb1 site - post minicircle cleavage | SEQ ID NO: 395
Cargo EGFP for fusion | Parent minicircle plasmid - Cargo EGFP with attP Bxb1 site for fusion | SEQ ID NO: 396
mCherry Cargo post cleavage | Cargo mCherry with attP Bxb1 site - post minicircle cleavage | SEQ ID NO: 397
YFP Cargo post cleavage | Cargo YFP with attP Bxb1 site - post minicircle cleavage | SEQ ID NO: 398
SERPINA1 Cargo post cleavage | Cargo SERPINA1 with attP Bxb1 site - post minicircle cleavage | SEQ ID NO: 399
CPS1 Cargo post cleavage | Cargo CPS1 with attP Bxb1 site - post minicircle cleavage | SEQ ID NO: 400
CFTR Cargo | Parent minicircle plasmid - Cargo CFTR with attP Bxb1 site | SEQ ID NO: 401
NYESO TCR Cargo post cleavage | Cargo NYESO TCR with attP Bxb1 site - post minicircle cleavage | SEQ ID NO: 402
[0274] In some embodiments, the serine integrase φC31 from φC31 phage is used as the integration enzyme. The integrase φC31 in combination with a pegRNA can be used to insert the pseudo attP integration site (SEQ ID NO: 78). A DNA minicircle containing a gene or nucleic acid of interest and an attB (SEQ ID NO: 3) site can be used to integrate the gene or nucleic acid of interest into the genome of a cell. This integration can be aided by co-transfection of an expression vector having the φC31 integrase.
[0275] As used herein, the term "integrase" refers to a bacteriophage derived integrase, including wild-type integrase and any of a variety of mutant or modified integrases. As used herein, the term "integrase complex" may refer to a complex comprising integrase and integration host factor (IHF). As used herein, the term "integrase complex" and the like may also refer to a complex comprising an integrase, an integration host factor, and a bacteriophage λ-derived excisionase (Xis).
[0276] As used herein, the term "recombinase" and the like refer to a site-specific enzyme that mediates the recombination of DNA between recombinase recognition sequences, which results in the excision, integration, inversion, or exchange (e.g., translocation) of DNA fragments between the recombinase recognition sequences. Recombinases can be classified into two distinct families: serine recombinases (e.g., resolvases and invertases) and tyrosine recombinases (e.g., integrases). Examples of serine recombinases include, without limitation, Hin, Gin, Tn3, β-six, CinH, ParA, γδ, Bxb1, φC31, TP901, TG1, φBT1, R1, R2, R3, R4, R5, φRV1, φFC1, MR11, A118, U153, and gp29. Examples of serine recombinases also include, without limitation, recombinases Peaches, Veracruz, Rebeuca, Theia, Benedict, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, and BxZ2 from Mycobacterial phages. Examples of tyrosine recombinases include, without limitation, Cre, FLP, R, Lambda, HK101, HK022, and pSAM2. The serine and tyrosine recombinase names stem from the conserved nucleophilic amino acid residue that the recombinase uses to attack the DNA and which becomes covalently linked to the DNA during strand exchange.
[0277] Recombinases have numerous applications, including the creation of gene knockouts/knock-ins and gene therapy applications. See, e.g., Brown et al., "Serine recombinases as tools for genome engineering." Methods, 2011; 53(4):372-9; Hirano et al., "Site-specific recombinases as tools for heterologous gene integration." Appl. Microbiol. Biotechnol. 2011; 92(2):227-39; Chavez and Calos, "Therapeutic applications of the ΦC31 integrase system." Curr. Gene Ther. 2011; 11(5):375-81; Turan and Bode, "Site-specific recombinases: from tag-and-target- to tag-and-exchange-based genomic modifications." FASEB J. 2011; 25(12):4088-107; Venken and Bellen, "Genome-wide manipulations of Drosophila melanogaster with transposons, Flp recombinase, and ΦC31 integrase." Methods Mol. Biol. 2012; 859:203-28; Murphy, "Phage recombinases and their applications." Adv. Virus Res. 2012; 83:367-414; Zhang et al., "Conditional gene manipulation: Creating a new biological era." J. Zhejiang Univ. Sci. B. 2012; 13(7):511-24; Karpenshif and Bernstein, "From yeast to mammals: recent advances in genetic control of homologous recombination." DNA Repair (Amst). 2012; 1; 11(10):781-8; the entire contents of each are hereby incorporated by reference in their entirety.
[0278] The recombinases provided herein are not meant to be exclusive examples of recombinases that can be used in embodiments of the disclosure. The methods and compositions of the disclosure can be expanded by mining databases for new orthogonal recombinases or designing synthetic recombinases with defined DNA specificities (See, e.g., Groth et al., "Phage integrases: biology and applications." J. Mol. Biol. 2004; 335, 667-678; Gordley et al., "Synthesis of programmable integrases." Proc. Natl. Acad. Sci. USA. 2009; 106, 5053-5058; the entire contents of each are hereby incorporated by reference in their entirety).
[0279] Other examples of recombinases that are useful in the systems, methods, and compositions described herein are known to those of skill in the art, and any new recombinase that is discovered or generated is expected to be usable in the different embodiments of the disclosure.
[0280] As used herein, the term "retrotransposase" and the like refer to an enzyme, or combination of one or more enzymes, wherein at least one enzyme has a reverse transcriptase domain. Retrotransposases are capable of inserting long sequences (e.g., over 3000 nucleotides) of heterologous nucleic acid into a genome. Examples of retrotransposases include, without limitation, retrotransposases encoded by elements such as R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), Minos, and any mutants thereof.
[0281] In some embodiments, the one or more genes of interest or one or more nucleic acid sequences of interest are inserted into a desired location in a genome using an RNA fragment, such as a retrotransposon, encoding the nucleic acid linked to a complementary or associated integration site. The insertion of the nucleic acid of interest into the desired location in the genome using a retrotransposon is aided by a retrotransposase.
[0282] The gene and nucleic acid sequence of interest disclosed herein can be any gene and nucleic acid sequence that are known in the art. The gene and nucleic acid sequence of interest can be for therapeutic and/or diagnostic uses. Examples of genes of interest include, without limitation, GBA, BTK, ADA, CNGB3, CNGA3, ATF6, GNAT2, ABCA1, ABCA7, APOE, CETP, LIPC, MMP9, PLTP, VTN, ABCA4, MFSD8, TLR3, TLR4, ERCC6, HMCN1, HTRA1, MCDR4, MCDR5, ARMS2, C2, C3, CFB, CFH, JAG1, NOTCH2, CACNA1F, SERPINA1, TTR, GSN, B2M, APOA2, APOA1, OSMR, ELP4, PAX6, ARG, ASL, PITX2, FOXC1, BBS1, BBS10, BBS2, BBS9, MKKS, MKS1, BBS4, BBS7, TTC8, ARL6, BBS5, BBS12, TRIM32, CEP290, ADIPOR1, BBIP1, CEP19, IFT27, LZTFL1, DMD, BEST1, HBB, CYP4V2, AMACR, CYP7B1, HSD3B7, AKR1D1, OPN1SW, NR2F1, RLBP1, RGS9, RGS9BP, PROM1, PRPH2, GUCY2D, CACD, CHM, ALAD, ASS1, SLC25A13, OTC, ACADVL, ETFDH, TMEM67, CC2D2A, RPGRIP1L, KCNV2, CRX, GUCA1A, CERKL, CDHR1, PDE6C, TTLL5, RPGR, CEP78, C21orf2, C8ORF37, RPGRIP1, ADAMS, POC1B, PITPNM3, RAB28, CACNA2D4, AIPL1, UNC119, PDE6H, OPN1LW, RIMS1, CNNM4, IFT81, RAX2, RDH5, SEMA4A, CORD17, PDE6B, GRK1, SAG, RHO, CABP4, GNB3, SLC24A1, GNAT1, GRM6, TRPM1, LRIT3, TGFBI, TACSTD2, KRT12, OVOL2, CPS1, UGT1A1, UGT1A9, UGT1A8, UGT1A7, UGT1A6, UGT1A5, UGT1A4, CFTR, DLD, EFEMP1, ABCC2, ZNF408, LRP5, FZD4, TSPAN12, EVR3, APOB, SLC2A2, LOC106627981, GBA1, NR2E3, OAT, SLC40A1, F8, F9, UROD, CPDX, HFE, JH, LDLR, EPHX1, TJP2, BAAT, NBAS, LARS1, HAMP, HJV, RS1, ADAMTS18, LRAT, RPE65, LCAS, MERTK, GDF6, RD3, CCT2, CLUAP1, DTHD1, NMNAT1, SPATA7, IFT140, IMPDH1, OTX2, RDH12, TULP1, CRB1, MT-ND4, MT-ND1, MT-ND6, BCKDHA, BCKDHB, DBT, MMAB, ARSB, GUSB, NAGS, NPC1, NPC2, NDP, OPA1, OPA3, OPA4, OPAS, RTN4IP1, TMEM126A, OPA6, OPA8, ACO2, PAH, PRKCSH, SEC63, GAA, UROS, PPDX, HPX, HMOX1, HMBS, MIR223, CYP1B1, LTBP2, AGXT, ATP8B1, ABCB11, ABCB4, FECH, ALAS2, PRPF31, RP1, EYS, TOPORS, USH2A, CNGA1, C2ORF71, RP2, KLHL7, ORF1, RP6, RP24, RP34, ROM1, ADGRA3, AGBL5, AHR, ARHGEF18, CA4, CLCC1, DHDDS, 
EMC1, FAM161A, HGSNAT, HK1, IDH3B, KIAA1549, KIZ, MAK, NEUROD1, NRL, PDE6A, PDE6G, PRCD, PRPF3, PRPF4, PRPF6, PRPF8, RBP3, REEP6, SAMD11, SLC7A14, SNRNP200, SPP2, ZNF513, NEK2, NEK4, NXNL1, OFD1, RP1L1, RP22, RP29, RP32, RP63, RP9, RGR, POMGNT1, DHX38, ARL3, COL2A1, SLCO1B1, SLCO1B3, KCNJ13, TIMP3, ELOVL4, TFR2, FAH, HPD, MYO7A, CDH23, PCDH15, DFNB31, GPR98, USH1C, USH1G, CIB2, CLRN1, HARS, ABHD12, ADGRV1, ARSG, CEP250, IMPG1, IMPG2, VCAN, G6PC1, ATP7B and any derivatives thereof.
[0283] As used herein, the terms "retrotransposons," "jumping genes," "jumping nucleic acids," and the like refer to cellular movable genetic elements dependent on reverse transcription. The retrotransposons are of non-replication competent cellular origin, and are capable of carrying a foreign nucleic acid sequence. The retrotransposons can act as parasites of retroviruses, retaining certain classical hallmarks, such as long terminal repeats (LTR), retroviral primer binding sites, and the like. However, the naturally occurring retrotransposons usually do not contain functional retroviral structure genes, which would normally be capable of recombining to yield replication competent viruses. Some retrotransposons are examples of so-called "selfish DNA", or genetic information, which encodes nothing except the ability to replicate itself. The retrotransposon may do so by utilizing the occasional presence of a retrovirus or a retrotransposase within the host cell, efficiently packaging itself within the viral particle, which transports it to the new host genome, where it is expressed again as RNA. The information encoded within that RNA is potentially transported with the jumping gene. A retrotransposon can be a DNA transposon or a retrotransposon, including an LTR retrotransposon or a non-LTR retrotransposon.
[0284] Non-long terminal repeat (LTR) retrotransposons are a type of mobile genetic element that is widespread in eukaryotic genomes. They include two classes: the apurinic/apyrimidinic endonuclease (APE)-type and the restriction enzyme-like endonuclease (RLE)-type. The APE class retrotransposons are comprised of two functional domains: an endonuclease/DNA binding domain, and a reverse transcriptase domain. The RLE class retrotransposons are comprised of three functional domains: a DNA binding domain, a reverse transcriptase domain, and an endonuclease domain. The reverse transcriptase domain of a non-LTR retrotransposon functions by binding an RNA sequence template and reverse transcribing it into the host genome's target DNA. The RNA sequence template has a 3' untranslated region which is specifically bound to the transposase, and a variable 5' region generally having Open Reading Frame(s) ("ORF") encoding transposase proteins. The RNA sequence template may also comprise a 5' untranslated region which specifically binds the retrotransposase. In some embodiments, non-LTR retrotransposons can include a LINE retrotransposon, such as L1, and a SINE retrotransposon, such as an Alu sequence. Other examples include, without limitation, R1, R2, R3, R4, and R5 retrotransposons (Moss, W. N. et al., RNA Biol. 2011, 8(5), 714-718; and Burke, W. D. et al., Molecular Biology and Evolution 2003, 20(8), 1260-1270). The transposon can be autonomous or non-autonomous.
[0285] LTR retrotransposons, which include retroviruses, make up a significant fraction of the typical mammalian genome, comprising about 8% of the human genome and 10% of the mouse genome. Lander et al., 2001, Nature 409, 860-921; Waterston et al., 2002, Nature 420, 520-562. LTR elements include retrotransposons, endogenous retroviruses (ERVs), and repeat elements with HERV origins, such as SINE-R. LTR retrotransposons include two LTR sequences that flank a region encoding two enzymes: integrase and retrotransposase.
[0286] ERVs include human endogenous retroviruses (HERVs), the remnants of ancient germ-cell infections. While most HERV proviruses have undergone extensive deletions and mutations, some have retained ORFs coding for functional proteins, including the glycosylated env protein. The env gene confers the potential for LTR elements to spread between cells and individuals. Indeed, all three open reading frames (pol, gag, and env) have been identified in humans, and evidence suggests that ERVs are active in the germline. See, e.g., Wang et al., 2010, Genome Res. 20, 19-27. Moreover, a few families, including the HERV-K (HML-2) group, have been shown to form viral particles, and an apparently intact provirus has recently been discovered in a small fraction of the human population. See, e.g., Bannert and Kurth, 2006, Proc. Natl. Acad. Sci. USA 101, 14572-14579.
[0287] LTR retrotransposons insert into new sites in the genome using the same steps of DNA cleavage and DNA strand-transfer observed in DNA transposons. In contrast to DNA transposons, however, recombination of LTR retrotransposons involves an RNA intermediate. LTR retrotransposons make up about 8% of the human genome. See, e.g., Lander et al., 2001, Nature 409, 860-921; Hua-Van et al., 2011, Biol. Dir. 6, 19.
Integration Site
[0288] The present disclosure provides non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering via the addition of an integration site into a target genome. The integration site is discussed in more detail below.
[0289] As used herein, the term "integration site" refers to the site within the target genome where one or more genes of interest or one or more nucleic acid sequences of interest are inserted. Examples of integration sites include, without limitation, a lox71 site (SEQ ID NO: 1), attB sites (SEQ ID NO: 3 and SEQ ID NO: 43), attP sites (SEQ ID NO: 4 and SEQ ID NO: 44), an attL site (SEQ ID NO: 67), an attR site (SEQ ID NO: 68), a Vox site (SEQ ID NO: 69), an FRT site (SEQ ID NO: 70), or a pseudo attP site (SEQ ID NO: 78). The integration site can be inserted into the genome or a fragment thereof of a cell using a nuclease, a gRNA, and/or an integration enzyme. The integration site can be inserted into the genome of a cell using a prime editor such as, without limitation, PE1, PE2, and PE3, wherein the integration site is carried on a pegRNA. The pegRNA can target any site that is known in the art. Examples of sites targeted by the pegRNA include, without limitation, ACTB, SUPT16H, SRRM2, NOLC1, DEPDC4, NES, LMNB1, AAVS1 locus, CC10, CFTR, SERPINA1, ABCA4, and any derivatives thereof. The complementary integration site may be operably linked to a gene of interest or nucleic acid sequence of interest in an exogenous DNA or RNA. In some embodiments, one integration site is added to a target genome. In some embodiments, more than one integration site is added to a target genome.
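By way of non-limiting illustration only, the following sketch shows one way the 3' extension of a pegRNA carrying an integration site could be assembled for a toy target. It assumes the commonly described prime-editing layout, in which the nick lies three bases 5' of the PAM and the extension, read 5' to 3', consists of a reverse-transcription template followed by a primer binding site. The function names, the toy target sequence, and the primer binding site and homology lengths are hypothetical choices made for the example, and sequences are written with DNA letters rather than RNA letters.

```python
# Toy sketch: design the 3' extension of a pegRNA that writes an integration
# site at a nick. Names, lengths, and the target are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def peg_extension(pam_strand: str, nick_index: int, insert: str,
                  pbs_len: int = 13, homology_len: int = 10) -> str:
    """Return the 3' extension (5'->3', DNA letters) for a toy pegRNA.

    pam_strand : target strand carrying the protospacer and PAM, 5'->3'
    nick_index : index of the first base 3' of the nick
    insert     : integration site to write immediately 3' of the nick
    """
    if nick_index < pbs_len:
        raise ValueError("not enough sequence 5' of the nick for the PBS")
    if nick_index + homology_len > len(pam_strand):
        raise ValueError("not enough sequence 3' of the nick for homology")
    pbs_region = pam_strand[nick_index - pbs_len:nick_index]
    homology = pam_strand[nick_index:nick_index + homology_len]
    # The extension anneals to the nicked strand through the PBS and then
    # templates insert + homology, so it equals the reverse complement of
    # the edited strand segment (PBS region + insert + homology).
    return revcomp(pbs_region + insert + homology)


ATTB_GT = "GGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCAT"  # SEQ ID NO: 15

# Hypothetical target: 20-nt protospacer, a TGG PAM, then flanking bases.
target = "GACCTTGACGATCCGTAGCA" + "TGG" + "ACTTGCAGGATCCATGA"
nick = 17  # three bases 5' of the PAM for a 20-nt protospacer
extension = peg_extension(target, nick, ATTB_GT)
print(len(extension), extension)
```

In this sketch the spacer and scaffold portions of the pegRNA are deliberately omitted, because only the extension carries the integration site.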
[0290] To insert multiple genes or nucleic acids of interest, two or more integration sites are added to a desired location. Multiple DNAs comprising nucleic acid sequences of interest are flanked by orthogonal integration sequences, such as, without limitation, attB and attP. An integration site is "orthogonal" when it does not significantly recognize the recognition site or nucleotide sequence of a recombinase. Thus, one attB site of a recombinase can be orthogonal to an attB site of a different recombinase. In addition, one pair of attB and attP sites of a recombinase can be orthogonal to another pair of attB and attP sites recognized by the same recombinase. A pair of recombinases are considered orthogonal to each other, as defined herein, when there is no significant recognition of each other's attB or attP site sequences.
[0291] The lack of recognition of integration sites or pairs of sites by the same recombinase or a different recombinase can be less than about 30%. In some embodiments, the lack of recognition of integration sites or pairs of sites by the same recombinase or a different recombinase can be less than about 30%, less than about 28%, less than about 26%, less than about 24%, less than about 22%, less than about 20%, less than about 18%, less than about 16%, less than about 14%, less than about 12%, less than about 10%, less than about 8%, less than about 6%, less than about 4%, less than about 2%, less than about 1%, or any range that is formed from any two of those values as endpoints. The crosstalk can be less than about 30%. In some embodiments, the crosstalk is less than about 30%, less than about 28%, less than about 26%, less than about 24%, less than about 22%, less than about 20%, less than about 18%, less than about 16%, less than about 14%, less than about 12%, less than about 10%, less than about 8%, less than about 6%, less than about 4%, less than about 2%, less than about 1%, or any range that is formed from any two of those values as endpoints.
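By way of non-limiting illustration only, a crosstalk threshold of this kind could be checked as sketched below. The sketch assumes that crosstalk is expressed as the off-target signal divided by the matched attB/attP signal, which is one possible definition and is not taken from the disclosure. The efficiency values are made-up placeholders rather than measured data, and the function names are hypothetical.

```python
# Sketch: test whether a set of att-site pairs stays under a crosstalk cutoff.
# The numbers below are invented placeholders, not experimental results.

CROSSTALK_CUTOFF = 0.30  # corresponds to "less than about 30%"

# integration[attB_cd][attP_cd] = hypothetical fraction of cells integrated
integration = {
    "GT": {"GT": 0.20, "GA": 0.01, "CT": 0.02},
    "GA": {"GT": 0.00, "GA": 0.18, "CT": 0.01},
    "CT": {"GT": 0.03, "GA": 0.01, "CT": 0.15},
}


def crosstalk(table):
    """Return {(attB, attP): mismatched signal / matched signal}."""
    ratios = {}
    for attb, row in table.items():
        matched = row[attb]
        if matched <= 0:
            raise ValueError(f"no matched signal for attB-{attb}")
        for attp, value in row.items():
            if attp != attb:
                ratios[(attb, attp)] = value / matched
    return ratios


def is_orthogonal(table, cutoff=CROSSTALK_CUTOFF):
    """True when every mismatched pairing is below the cutoff."""
    return all(ratio < cutoff for ratio in crosstalk(table).values())


for pair, ratio in sorted(crosstalk(integration).items()):
    print(pair, f"{ratio:.0%}")
print("orthogonal at 30% cutoff:", is_orthogonal(integration))
```

A stricter cutoff can be passed as the second argument to `is_orthogonal` to test any of the narrower ranges recited above.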
[0292] In some embodiments, the attB and/or attP site sequences comprise a central dinucleotide sequence. It has been shown that, for example, the central dinucleotide can be changed from GT to GA and that GA-containing attB/attP sites interact only with each other and will not cross-react with GT-containing sequences. In some embodiments, the central dinucleotide is selected from the group consisting of AG, AC, TG, TC, CA, CT, GA, AA, TT, CC, GG, AT, TA, GC, CG and GT.
[0293] As used herein, the term "pair of an attB and attP site sequences" and the like refer to attB and attP site sequences that share the same central dinucleotide and can recombine. This means that in the presence of one serine integrase as many as six pairs of these orthogonal att sites can recombine (attP-TT will specifically recombine with attB-TT, attP-TC will specifically recombine with attB-TC, and so on).
[0294] In some embodiments, the central dinucleotide is nonpalindromic. In some embodiments, the central dinucleotide is palindromic. In some embodiments, a pair of an attB site sequence and an attP site sequence are used in different DNA encoding genes of interest or nucleic acid sequences of interest for inducing directional integration of two or more different nucleic acids.
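By way of non-limiting illustration only, the sixteen possible central dinucleotides can be enumerated and sorted into palindromic and nonpalindromic groups as sketched below. The grouping of each nonpalindromic dinucleotide with its reverse complement is an assumption of this sketch (the two describe the same two-base overhang read from opposite strands); under that assumption the count of distinct nonpalindromic classes comes out at six, which is one possible reading of the "six pairs" figure in paragraph [0293]. Variable names are hypothetical.

```python
# Sketch: classify the 16 central dinucleotides (CDs) by strand symmetry.
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


all_cds = ["".join(p) for p in product("ACGT", repeat=2)]

# A palindromic CD equals its own reverse complement, so it reads the same
# on both strands and cannot by itself impose an insertion direction.
palindromic = [cd for cd in all_cds if cd == revcomp(cd)]

# Assumption: a nonpalindromic CD and its reverse complement form one class.
classes = sorted({tuple(sorted((cd, revcomp(cd))))
                  for cd in all_cds if cd != revcomp(cd)})

print("palindromic:", palindromic)
print("nonpalindromic classes:", classes)
```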
[0295] Table 3 below shows examples of pairs of attB and attP site sequences with different central dinucleotides (CD).
TABLE 3
Pair  attB            attP            CD
1     SEQ ID NO: 5    SEQ ID NO: 6    TT
2     SEQ ID NO: 7    SEQ ID NO: 8    AA
3     SEQ ID NO: 9    SEQ ID NO: 10   CC
4     SEQ ID NO: 11   SEQ ID NO: 12   GG
5     SEQ ID NO: 13   SEQ ID NO: 14   TG
6     SEQ ID NO: 15   SEQ ID NO: 16   GT
7     SEQ ID NO: 17   SEQ ID NO: 18   CT
8     SEQ ID NO: 19   SEQ ID NO: 20   CA
9     SEQ ID NO: 21   SEQ ID NO: 22   TC
10    SEQ ID NO: 23   SEQ ID NO: 24   GA
11    SEQ ID NO: 25   SEQ ID NO: 26   AG
12    SEQ ID NO: 27   SEQ ID NO: 28   AC
13    SEQ ID NO: 29   SEQ ID NO: 30   AT
14    SEQ ID NO: 31   SEQ ID NO: 32   GC
15    SEQ ID NO: 33   SEQ ID NO: 34   CG
16    SEQ ID NO: 35   SEQ ID NO: 36   TA
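By way of non-limiting illustration only, the pairs in Table 3 can be viewed as a fixed pair of flanking arms with a variable central dinucleotide between them. The sketch below rebuilds an attB/attP pair from a chosen dinucleotide. The flanking arms were inferred by comparing SEQ ID NOs: 5-36 as listed in Table 4 (with sequence fragments that wrap onto a second line joined together), so they should be treated as an assumption of the example; the function name is hypothetical.

```python
# Sketch: compose an attB/attP pair from a central dinucleotide (CD).
# Flanking arms are inferred from SEQ ID NOs: 5-36 (an assumption).

ATTB_LEFT, ATTB_RIGHT = "GGCTTGTCGACGACGGCG", "CTCCGTCGTCAGGATCAT"
ATTP_LEFT = "GTGGTTTGTCTGGTCAACCACCGCG"
ATTP_RIGHT = "CTCAGTGGTGTACGGTACAAACCCA"


def att_pair(cd: str):
    """Return (attB, attP) sharing the given central dinucleotide."""
    if len(cd) != 2 or set(cd) - set("ACGT"):
        raise ValueError("central dinucleotide must be two of A, C, G, T")
    return ATTB_LEFT + cd + ATTB_RIGHT, ATTP_LEFT + cd + ATTP_RIGHT


attb_gt, attp_gt = att_pair("GT")  # pair 6 in Table 3
print(attb_gt)
print(attp_gt)
```

Because only the two central bases differ between pairs, a pair for any of the sixteen dinucleotides can be produced by the same call.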
PASTE
[0296] The present disclosure provides non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering using PASTE. PASTE is discussed in more detail below.
[0297] The site-specific genetic engineering disclosed herein is for the insertion of one or more genes of interest or one or more nucleic acid sequences of interest into a genome of a cell. In some embodiments, the gene of interest is a mutated gene implicated in a genetic disease such as, without limitation, a metabolic disease, cystic fibrosis, muscular dystrophy, hemochromatosis, Tay-Sachs disease, Huntington disease, congenital deafness, sickle cell anemia, familial hypercholesterolemia, adenosine deaminase (ADA) deficiency, X-linked SCID (X-SCID), and Wiskott-Aldrich syndrome (WAS). In some embodiments, the gene of interest or nucleic acid sequence of interest can be a reporter gene upstream or downstream of a gene for genetic analyses such as, without limitation, for determining the expression of a gene. In some embodiments, the reporter gene is a GFP template (SEQ ID NO: 76) or a Gaussia Luciferase (G-Luciferase) template (SEQ ID NO: 77). In some embodiments, the gene of interest or nucleic acid sequence of interest can be used in plant genetics to insert genes to enhance drought tolerance, weather hardiness, yield, and herbicide resistance in plants. In some embodiments, the gene of interest or nucleic acid sequence of interest can be used for site-specific insertion of a protein (e.g., a lysosomal enzyme), a blood factor (e.g., Factor I, II, V, VII, X, XI, XII or XIII), a membrane protein, an exon, an intracellular protein (e.g., a cytoplasmic protein, a nuclear protein, an organellar protein such as a mitochondrial protein or lysosomal protein), an extracellular protein, a structural protein, a signaling protein, a regulatory protein, a transport protein, a sensory protein, a motor protein, a defense protein, a storage protein, or an anti-inflammatory signaling molecule into cells for treatment of immune diseases, including but not limited to arthritis, psoriasis, lupus, coeliac disease, glomerulonephritis, hepatitis, and inflammatory bowel disease.
[0298] The size of the inserted gene or nucleic acid can vary from about 1 bp to about 50,000 bp. In some embodiments, the size of the inserted gene or nucleic acid can be about 1 bp, 10 bp, 50 bp, 100 bp, 150 bp, 200 bp, 250 bp, 300 bp, 350 bp, 400 bp, 600 bp, 800 bp, 1000 bp, 1200 bp, 1400 bp, 1600 bp, 1800 bp, 2000 bp, 2200 bp, 2400 bp, 2600 bp, 2800 bp, 3000 bp, 3200 bp, 3400 bp, 3600 bp, 3800 bp, 4000 bp, 4200 bp, 4400 bp, 4600 bp, 4800 bp, 5000 bp, 5200 bp, 5400 bp, 5600 bp, 5800 bp, 6000 bp, 6200 bp, 6400 bp, 6600 bp, 6800 bp, 7000 bp, 7200 bp, 7400 bp, 7600 bp, 7800 bp, 8000 bp, 8200 bp, 8400 bp, 8600 bp, 8800 bp, 9000 bp, 9200 bp, 9400 bp, 9600 bp, 9800 bp, 10,000 bp, 10,200 bp, 10,400 bp, 10,600 bp, 10,800 bp, 11,000 bp, 11,200 bp, 11,400 bp, 11,600 bp, 11,800 bp, 12,000 bp, 14,000 bp, 16,000 bp, 18,000 bp, 20,000 bp, 30,000 bp, 40,000 bp, 50,000 bp, or any range that is formed from any two of those values as endpoints.
[0299] In some embodiments, the site-specific engineering using the gene of interest or nucleic acid sequence of interest disclosed herein is for the engineering of T cells and natural killer (NK) cells for tumor targeting or allogeneic generation. These can involve the use of a receptor or chimeric antigen receptor (CAR) for tumor specificity, an anti-PD1 antibody, cytokines such as IFN-gamma, TNF-alpha, IL-15, IL-12, IL-18, IL-21, and IL-10, and immune escape genes.
[0300] In the present disclosure, the site-specific insertion of the gene of interest or nucleic acid of interest is performed through Programmable Addition via Site-Specific Targeting Elements (PASTE). Components for inserting a gene of interest or a nucleic acid of interest using PASTE include, without limitation, a nuclease, a gRNA adding the integration site, a DNA or RNA strand comprising the gene or nucleic acid linked to a sequence that is complementary or associated to the integration site, and an integration enzyme. In some embodiments, the components for inserting a gene of interest or a nucleic acid of interest using PASTE include, without limitation, a prime editor expression vector, a pegRNA adding the integration site, a nicking guide RNA, an integration enzyme (e.g., Cre or a serine recombinase), and a transgene vector comprising the gene of interest or nucleic acid sequence of interest together with an integration signal. The nuclease and prime editor integrate the integration site into the genome. The integration enzyme integrates the gene of interest into the integration site. In some embodiments, the transgene vector comprising the gene or nucleic acid sequence of interest together with an integration signal is a DNA minicircle devoid of bacterial DNA sequences. In some embodiments, the transgene vector is a eukaryotic or prokaryotic vector.
[0301] As used herein, the term "vector" or "transgene vector" refers to a recombinant DNA molecule containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a host organism. Nucleic acid sequences necessary for expression in prokaryotes usually include, without limitation, a promoter, an operator (optional), a ribosome binding site, and/or other sequences. Eukaryotic cells are generally known to utilize promoters (constitutive, inducible or tissue specific), enhancers, and termination and polyadenylation signals, although some elements may be deleted and other elements added without sacrificing the necessary expression. The transgene vector may encode the PE and the integration enzyme, linked to each other via a linker. The linker can be a cleavable linker. For example, a transgene vector encoding the PE and the integration enzyme linked to each other via a linker is pCMV PE2 P2A Cre, which comprises SEQ ID NO: 73. In some embodiments, the linker can be a non-cleavable linker. In some embodiments, the nuclease, prime editor, and/or integration enzyme can be encoded in different vectors.
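By way of non-limiting illustration only, the two stages described above (placing the integration site, then integrating the cargo at that site) can be mimicked with plain strings as sketched below. This is a deliberately simplified model: it treats recombination as an exchange of site halves around the shared central dinucleotide, so that the genomic attB and the cargo attP are replaced by two hybrid sites flanking the payload. The function names, the short "genome", and the payload are hypothetical, and the attB and attP sequences are taken from SEQ ID NOs: 15 and 16 as listed in Table 4.

```python
# Toy string-level model of the two PASTE stages. Function names, the 14-nt
# "genome", and the payload are hypothetical illustrations.

ATTB_GT = "GGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCAT"  # SEQ ID NO: 15
ATTP_GT = "GTGGTTTGTCTGGTCAACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCA"  # SEQ ID NO: 16
CD = "GT"  # central dinucleotide shared by the pair


def split_site(site: str, cd: str):
    """Split an att site into (left arm, right arm) around its central CD."""
    mid = len(site) // 2 - 1
    if site[mid:mid + 2] != cd:
        raise ValueError("central dinucleotide not found at the site center")
    return site[:mid], site[mid + 2:]


def install_site(genome: str, position: int, site: str) -> str:
    """Stage 1: write the integration site into the genome at `position`."""
    return genome[:position] + site + genome[position:]


def integrate(genome: str, attb: str, attp: str, payload: str, cd: str) -> str:
    """Stage 2: recombine a circular cargo (attP + payload) into attB."""
    if genome.count(attb) != 1:
        raise ValueError("expected exactly one attB in the genome")
    b_left, b_right = split_site(attb, cd)
    p_left, p_right = split_site(attp, cd)
    hybrid_left = b_left + cd + p_right   # attB left arm joined to attP right arm
    hybrid_right = p_left + cd + b_right  # attP left arm joined to attB right arm
    upstream, downstream = genome.split(attb)
    return upstream + hybrid_left + payload + hybrid_right + downstream


genome = "TTGACCAGGATTCA"    # hypothetical target locus
payload = "ATGGTGAGCAAGGGC"  # hypothetical cargo fragment
with_site = install_site(genome, 7, ATTB_GT)
edited = integrate(with_site, ATTB_GT, ATTP_GT, payload, CD)
print(edited)
```

In the toy output neither the original attB nor the original attP remains intact, which is consistent with the unidirectional incorporation described for PASTE.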
[0302] A method of inserting multiple genes or nucleic acid sequences of interest into a single site according to embodiments of the present disclosure is illustrated in FIG. 12. In some embodiments, multiplexing involves inserting multiple genes of interest in multiple loci using unique pegRNA as illustrated in FIG. 13 (Merrick, C. A. et al., ACS Synth. Biol. 2018, 7, 299-310). The insertion of multiple genes of interest or nucleic acids of interest into a cell genome, referred to herein as "multiplexing," is facilitated by incorporation of the complementary 5' integration site to the 5' end of the DNA or RNA comprising the first nucleic acid and 3' integration site to the 3' end of the DNA or RNA comprising the last nucleic acid. In some embodiments, the number of genes of interest or nucleic acid sequences of interest that are inserted into a cell genome using multiplexing can be about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, or any range that is formed from any two of those values as endpoints.
[0303] In some embodiments, multiplexing allows, for example, integration of a signaling cascade, over-expression of a protein of interest with its cofactor, insertion of multiple genes mutated in a neoplastic condition, or insertion of multiple CARs for treatment of cancer.
[0304] In some embodiments, the integration sites may be inserted into the genome using non-prime editing methods such as rAAV mediated nucleic acid integration, TALENs and ZFNs. A number of unique properties make AAV a promising vector for human gene therapy (Muzyczka, CURRENT TOPICS IN MICROBIOLOGY AND IMMUNOLOGY, 158:97-129 (1992)). Unlike other viral vectors, AAVs have not been shown to be associated with any known human disease and are generally not considered pathogenic. Wild type AAV is capable of integrating into host chromosomes in a site-specific manner (M. Kotin et al., PROC. NATL. ACAD. SCI, USA, 87:2211-2215 (1990); R. J. Samulski, EMBO 10(12):3941-3950 (1991)). Instead of creating a double-stranded DNA break, AAV stimulates endogenous homologous recombination to achieve the DNA modification. Further, transcription activator-like effector nucleases (TALENs) and zinc-finger nucleases (ZFNs) can be used for genome editing by introducing targeted double-strand breaks (DSBs). The specificity of TALENs arises from two polymorphic amino acids, the so-called repeat variable diresidues (RVDs), located at positions 12 and 13 of a repeated unit. TALENs are linked to FokI nucleases, which cleave the DNA at the desired locations. ZFNs are artificial restriction enzymes for custom site-specific genome editing. Zinc fingers themselves are transcription factors, where each finger recognizes 3-4 bases. By mixing and matching these finger modules, researchers can customize which sequence to target.
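By way of non-limiting illustration only, the modular recognition described above can be sketched as a lookup from RVDs to bases, together with a count of zinc-finger modules for a target of a given length. The RVD-to-base table below reflects commonly cited preferences and is an assumption of the example rather than part of the disclosure (NN is shown as G, although it is also reported to bind A). The function names are hypothetical, and the zinc-finger count simply applies the 3-4 bases per finger figure given above.

```python
# Sketch: modular DNA recognition by TALE repeats and zinc fingers.
import math

# Commonly cited RVD preferences (assumption of this sketch).
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def tale_target(rvds):
    """Translate a list of RVDs (positions 12 and 13 of each repeat) to DNA."""
    try:
        return "".join(RVD_TO_BASE[rvd] for rvd in rvds)
    except KeyError as missing:
        raise ValueError(f"RVD not in the lookup table: {missing}") from None


def fingers_needed(target_len: int, bases_per_finger: int = 3) -> int:
    """Number of zinc-finger modules needed to span a target of this length."""
    if bases_per_finger not in (3, 4):
        raise ValueError("each finger is taken to recognize 3 or 4 bases")
    return math.ceil(target_len / bases_per_finger)


print(tale_target(["NI", "HD", "NG", "NN"]))
print(fingers_needed(18))
```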
[0305] As used herein, the terms "administration," "introducing," or "delivery" into a cell, a tissue, or an organ of a plasmid, nucleic acids, or proteins for modification of the host genome refers to the transport for such administration, introduction, or delivery that can occur in vivo, in vitro, or ex vivo. Plasmids, DNA, or RNA for genetic modification can be introduced into cells by transfection, which is typically accomplished by chemical means (e.g., calcium phosphate transfection, polyethyleneimine (PEI), or lipofection), physical means (electroporation or microinjection), infection (this typically means the introduction of an infectious agent such as a virus (e.g., a baculovirus expressing the AAV Rep gene)), or transduction (in microbiology, this refers to the stable infection of cells by viruses, or the transfer of genetic material from one microorganism to another by viral factors (e.g., bacteriophages)). Vectors for the expression of a recombinant polypeptide, protein or oligonucleotide may be obtained by physical means (e.g., calcium phosphate transfection, electroporation, microinjection, or lipofection) in a cell, a tissue, an organ or a subject. The vector can be delivered by preparing the vector in a pharmaceutically acceptable carrier for in vitro, ex vivo, or in vivo delivery to the cell, tissue, organ, or subject.
[0306] As used herein, the term "transfection" refers to the uptake of an exogenous nucleic acid molecule by a cell. A cell is "transfected" when an exogenous nucleic acid has been introduced inside the cell membrane. The transfection can be a single transfection, co-transfection, or multiple transfection. Numerous transfection techniques are generally known in the art. See, for example, Graham et al. (1973) Virology, 52: 456. Such techniques can be used to introduce one or more exogenous nucleic acid molecules into a suitable host cell.
[0307] In some embodiments, the exogenous nucleic acid molecule and/or other components for gene editing are combined and delivered in a single transfection. In other embodiments, the exogenous nucleic acid molecule and/or other components for gene editing are not combined and delivered in a single transfection. In some embodiments, the exogenous nucleic acid molecule and/or other components for gene editing that are combined and delivered in a single transfection comprise, without limitation, a prime editing vector, a landing site such as a landing site-containing pegRNA, a nicking guide such as a nicking guide for stimulating prime editing, an expression vector such as an expression vector for a corresponding integrase or recombinase, a minicircle DNA cargo such as a minicircle DNA cargo encoding green fluorescent protein (GFP), any derivatives thereof, and any combinations thereof. In some embodiments, the gene of interest or nucleic acid sequence of interest can be introduced using liposomes. In some embodiments, the gene of interest or nucleic acid sequence of interest can be delivered using suitable vectors, for instance, without limitation, plasmids and viral vectors. Examples of viral vectors include, without limitation, adeno-associated viruses (AAV), lentiviruses, adenoviruses, other viral vectors, derivatives thereof, or combinations thereof. The proteins and one or more guide RNAs can be packaged into one or more vectors, e.g., plasmids or viral vectors. In some embodiments, the delivery is via nanoparticles or exosomes. For example, exosomes can be particularly useful in delivering RNA.
[0308] In some embodiments, the prime editing inserts the landing site with efficiencies of at least about 1%, at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, or at least about 50%. In some embodiments, the prime editing inserts the landing site(s) with efficiencies of about 1%, about 2%, about 3%, about 4%, about 5%, about 6%, about 7%, about 8%, about 9%, about 10%, about 11%, about 12%, about 13%, about 14%, about 15%, about 16%, about 17%, about 18%, about 19%, about 20%, about 21%, about 22%, about 23%, about 24%, about 25%, about 26%, about 27%, about 28%, about 29%, about 30%, about 31%, about 32%, about 33%, about 34%, about 35%, about 36%, about 37%, about 38%, about 39%, about 40%, about 41%, about 42%, about 43%, about 44%, about 45%, about 46%, about 47%, about 48%, about 49%, about 50%, or any range that is formed from any two of those values as endpoints.
Sequences
[0309] Sequences of enzymes, guides, integration sites, and plasmids can be found in Table 4 below.
TABLE 4
SEQ ID NO  DESCRIPTION (SOURCE)  SEQUENCE
SEQ ID NO: 1  Lox71 (Artificial Sequence)  ATAACTTCGTATAATGTATGCTATACGAACGGTA
SEQ ID NO: 2  Lox66 (Artificial Sequence)  TACCGTTCGTATAATGTATGCTATACGAAGTTAT
SEQ ID NO: 3  attB (Artificial Sequence)  GGCCGGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCATCCGG
SEQ ID NO: 4  attP (Artificial Sequence)  CCGGATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCC
SEQ ID NO: 5  attB-TT (Artificial Sequence)  GGCTTGTCGACGACGGCGTTCTCCGTCGTCAGGATCAT
SEQ ID NO: 6  attP-TT (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGTTCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 7  attB-AA (Artificial Sequence)  GGCTTGTCGACGACGGCGAACTCCGTCGTCAGGATCAT
SEQ ID NO: 8  attP-AA (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGAACTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 9  attB-CC (Artificial Sequence)  GGCTTGTCGACGACGGCGCCCTCCGTCGTCAGGATCAT
SEQ ID NO: 10  attP-CC (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGCCCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 11  attB-GG (Artificial Sequence)  GGCTTGTCGACGACGGCGGGCTCCGTCGTCAGGATCAT
SEQ ID NO: 12  attP-GG (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGGGCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 13  attB-TG (Artificial Sequence)  GGCTTGTCGACGACGGCGTGCTCCGTCGTCAGGATCAT
SEQ ID NO: 14  attP-TG (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGTGCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 15  attB-GT (Artificial Sequence)  GGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCAT
SEQ ID NO: 16  attP-GT (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 17  attB-CT (Artificial Sequence)  GGCTTGTCGACGACGGCGCTCTCCGTCGTCAGGATCAT
SEQ ID NO: 18  attP-CT (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGCTCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 19  attB-CA (Artificial Sequence)  GGCTTGTCGACGACGGCGCACTCCGTCGTCAGGATCAT
SEQ ID NO: 20  attP-CA (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGCACTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 21  attB-TC (Artificial Sequence)  GGCTTGTCGACGACGGCGTCCTCCGTCGTCAGGATCAT
SEQ ID NO: 22  attP-TC (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGTCCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 23  attB-GA (Artificial Sequence)  GGCTTGTCGACGACGGCGGACTCCGTCGTCAGGATCAT
SEQ ID NO: 24  attP-GA (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGGACTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 25  attB-AG (Artificial Sequence)  GGCTTGTCGACGACGGCGAGCTCCGTCGTCAGGATCAT
SEQ ID NO: 26  attP-AG (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGAGCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 27  attB-AC (Artificial Sequence)  GGCTTGTCGACGACGGCGACCTCCGTCGTCAGGATCAT
SEQ ID NO: 28  attP-AC (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGACCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 29  attB-AT (Artificial Sequence)  GGCTTGTCGACGACGGCGATCTCCGTCGTCAGGATCAT
SEQ ID NO: 30  attP-AT (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGATCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 31  attB-GC (Artificial Sequence)  GGCTTGTCGACGACGGCGGCCTCCGTCGTCAGGATCAT
SEQ ID NO: 32  attP-GC (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGGCCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 33  attB-CG (Artificial Sequence)  GGCTTGTCGACGACGGCGCGCTCCGTCGTCAGGATCAT
SEQ ID NO: 34  attP-CG (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGCGCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 35  attB-TA (Artificial Sequence)  GGCTTGTCGACGACGGCGTACTCCGTCGTCAGGATCAT
SEQ ID NO: 36  attP-TA (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGTACTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 37  C31-attB (Artificial Sequence)  TGCGGGTGCCAGGGCGTGCCCTTGGGCTCCCCGGGCGCGTACTCC
SEQ ID NO: 38  C31-attP (Artificial Sequence)  GTGCCCCAACTGGGGTAACCTTTGAGTTCTCTCAGTTGGGGG
SEQ ID NO: 39  R4-attB (Artificial Sequence)  GCGCCCAAGTTGCCCATGACCATGCCGAAGCAGTGGTAGAAGGGCACCGGCAGACAC
SEQ ID NO: 40  R4-attP (Artificial Sequence)  AGGCATGTTCCCCAAAGCGATACCACTTGAAGCAGTGGTACTGCTTGTGGGTACACTCTGCGGGTGATGA
SEQ ID NO: 41  BT1-attB (Artificial Sequence)  GTCCTTGACCAGGTTTTTGACGAAAGTGATCCAGATGATCCAGCTCCACACCCCGAACGC
SEQ ID NO: 42  BT1-attP (Artificial Sequence)  GGTGCTGGGTTGTTGTCTCTGGACAGTGATCCATGGGAAACTACTCAGCACCACCAATGTTCC
SEQ ID NO: 43  Bxb-attB (Artificial Sequence)  TCGGCCGGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCATCCGGGC
SEQ ID NO: 44 GTCGTGGTTTGTCTGGTCAACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCCGAC Bxb-attP (Artificial Sequence)
SEQ ID NO: 45 GATCAGCTCCGCGGGCAAGACCTTCTCCTTCACGGGGTGGAAGGTC TG1-attB (Artificial Sequence)
SEQ ID NO: 46 TCAACCCCGTTCCAGCCCAACAGTGTTAGTCTTTGCTCTTACCCAGTTGGGCGGGATAGCCTGCCCG TG1-attP (Artificial Sequence)
SEQ ID NO: 47 AACGATTTTCAAAGGATCACTGAATCAAAAGTATTGCTCATCCACGCGAAATTTTTC C1-attB (Artificial Sequence)
SEQ ID NO: 48 AATATTTTAGGTATATGATTTTGTTTATTAGTGTAAATAACACTATGTACCTAAAAT C1-attP (Artificial Sequence)
SEQ ID NO: 49 TGTAAAGGAGACTGATAATGGCATGTACAACTATACTCGTCGGTAAAAAGGCA C370-attB (Artificial Sequence)
SEQ ID NO: 50 TAAAAAAATACAGCGTTTTTCATGTACAACTATACTAGTTGTAGTGCCTAAA C370-attP (Artificial Sequence)
SEQ ID NO: 51 GAGCGCCGGATCAGGGAGTGGACGGCCTGGGAGCGCTACACGCTGTGGCTGCGGTC K38-attB (Artificial Sequence)
SEQ ID NO: 52 CCCTAATACGCAAGTCGATAACTCTCCTGGGAGCGTTGACAACTTGCGCACCCTGA K38-attP (Artificial Sequence)
SEQ ID NO: 53 TCTCGTGGTGGTGGAAGGTGTTGGTGCGGGGTTGGCCGTGGTCGAGGTGGGGTGGTGGTAGCCATTCG RB-attB (Artificial Sequence)
SEQ ID NO: 54 GCACAGGTGTAGTGTATCTCACAGGTCCACGGTTGGCCGTGGACTGCTGAAGAACATTCCACGCCAGGA RV-attP (Artificial Sequence)
SEQ ID NO: 55 AGTGCAGCATGTCATTAATATCAGTACAGATAAAGCTGTATCTCCTGTGAACACAATGGGTGCCA SPBC-attB (Artificial Sequence)
SEQ ID NO: 56 AAAGTAGTAAGTATCTTAAAAAACAGATAAAGCTGTATATTAAGATACTTACTAC SPBC-attP (Artificial Sequence)
SEQ ID NO: 57 TGATAATTGCCAACACAATTAACATCTCAATCAAGGTAAATGCTTTTTCGTTTT TP901-attB (Artificial Sequence)
SEQ ID NO: 58 AATTGCGAGTTTTTATTTCGTTTATTTCAATTAAGGTAACTAAAAAACTCCTTT TP901-attP (Artificial Sequence)
SEQ ID NO: 59 AAGGTAGCGTCAACGATAGGTGTAACTGTCGTGTTTGTAACGGTACTTCCAACAGCTGGCGTTTCAGT Wβ-attB (Artificial Sequence)
SEQ ID NO: 60 TAGTTTTAAAGTTGGTTATTAGTTACTGTGATATTTATCACGGTACCCAATAACCAATGAATATTTGA Wβ-attP (Artificial Sequence)
SEQ ID NO: 61 TGTAACTTTTTCGGATCAAGCTATGAAGGACGCAAAGAGGGAACTAAACACTTAATT A118-attB (Artificial Sequence)
SEQ ID NO: 62 TTGTTTAGTTCCTCGTTTTCTCTCGTTGGAAGAAGAAGAAACGAGA A118-attP AACTAAAATTA (Artificial Sequence) SEQ ID NO: 63 CAACCTGTTGACATGTTTCCACAGACAACTCACGTGGAGGTAGTC BL3-attB ACGGCTTTTACGTTAGTT (Artificial Sequence) SEQ ID NO: 64 GAGAATACTGTTGAACAATGAAAAACTAGGCATGTAGAAGTTGTT BL3-attP TGTGCACTAACTTTAA (Artificial Sequence) SEQ ID NO: 65 ACAGGTCAACACATCGCAGTTATCGAACAATCTTCGAAAATGTAT MR11-attB GGAGGCACTTGTATCAATATAGGATGTATACCTTCGAAGACACTT (Artificial Sequence) GTACATGATGGATTAGAAGGCAAATCCTTT SEQ ID NO: 66 CAAAATAAAAAACATTGATTTTTATTAACTTCTTTTGTGCGGAACT MR11-attP ACGAACAGTTCATTAATACGAAGTGTACAAACTTCCATACAAAAA (Artificial Sequence) TAACCACGACAATTAAGACGTGGTTTCTA SEQ ID NO: 67 ATTATTTCTCACCCTGA attL (Artificial Sequence) SEQ ID NO: 68 ATCATCTCCCACCCGGA attR (Artificial Sequence) SEQ ID NO: 69 AATAGGTCTG AGAACGCCCA TTCTCAGACG TATT Vox (Artificial Sequence) SEQ ID NO: 70 GAAGTTCCTATAC TTTCTAGA GAATAGGAACTTC FRT (Artificial Sequence) SEQ ID NO: 71 GGTCGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGG Cre recombinase GGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACT expression plasmid TACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCC (Artificial Sequence) ATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGG GACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCC CACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTA TTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGT ACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATT AGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTC ACTCTCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATT TATTTTTTAATTATTTTGTGCAGCGATGGGGGCGGGGGGGGGGGG GGCGCGCGCCAGGCGGGGGGGGGGGGGGGGGGGGGGGGGGGGG GGGGGGGCGGGGGGGGGCGGCGGCAGCCAATCAGAGCGGCGCGC TCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCT ATAAAAAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGC CTTCGCCCCGTGCCCCGCTCCGCCGCCGCCTCGCGCCGCCCGCCCC GGCTCTGACTGACCGCGTTACTCCCACAGGTGAGCGGGCGGGACG GCCCTTCTCCTCCGGGCTGTAATTAGCGCTTGGTTTAATGACGGCT TGTTTCTTTTCTGTGGCTGCGTGAAAGCCTTGAGGGGCTCCGGGAG GGCCCTTTGTGCGGGGGGAGCGGCTCGGGGGGTGCGTGCGTGTGT GTGTGCGTGGGGAGCGCCGCGTGCGGCTCCGCGCTGCCCGGCGGC 
TGTGAGCGCTGCGGGCGCGGCGCGGGGCTTTGTGCGCTCCGCAGT GTGCGCGAGGGGAGCGCGGCCGGGGGCGGTGCCCCGCGGTGCGG GGGGGGCTGCGAGGGGAACAAAGGCTGCGTGCGGGGTGTGTGCG TGGGGGGGTGAGCAGGGGGTGTGGGCGCGTCGGTCGGGCTGCAA CCCCCCCTGCACCCCCCTCCCCGAGTTGCTGAGCACGGCCCGGCTT CGGGTGCGGGGCTCCGTACGGGGCGTGGCGCGGGGCTCGCCGTGC CGGGCGGGGGGTGGCGGCAGGTGGGGGTGCCGGGCGGGGCGGGG CCGCCTCGGGCCGGGGAGGGCTCGGGGGAGGGGCGCGGCGGCCC CCGGAGCGCCGGCGGCTGTCGAGGCGCGGCGAGCCGCAGCCATTG CCTTTTATGGTAATCGTGCGAGAGGGCGCAGGGACTTCCTTTGTCC CAAATCTGTGCGGAGCCGAAATCTGGGAGGCGCCGCCGCACCCCC TCTAGCGGGCGCGGGGCGAAGCGGTGCGGCGCCGGCAGGAAGGA AATGGGCGGGGAGGGCCTTCGTGCGTCGCCGCGCCGCCGTCCCCT TCTCCCTCTCCAGCCTCGGGGCTGTCCGCGGGGGGACGGCTGCCTT CGGGGGGGACGGGGCAGGGCGGGGTTCGGCTTCTGGCGTGTGACC GGCGGCTCTAGAGCCTCTGCTAACCATGTTCATGCCTTCTTCTTTTT CCTACAGCTCCTGGGCAACGTGCTGGTTATTGTGCTGTCTCATCAT TTTGGCAAAGAATTCTGAGCCGCCACCATGGCCAATTTACTGACC GTACACCAAAATTTGCCTGCATTACCGGTCGATGCAACGAGTGAT GAGGTTCGCAAGAACCTGATGGACATGTTCAGGGATCGCCAGGCG TTTTCTGAGCATACCTGGAAAATGCTTCTGTCCGTTTGCCGGTCGT GGGCGGCATGGTGCAAGTTGAATAACCGGAAATGGTTTCCCGCAG AACCTGAAGATGTTCGCGATTATCTTCTATATCTTCAGGCGCGCGG TCTGGCAGTAAAAACTATCCAGCAACATTTGGGCCAGCTAAACAT GCTTCATCGTCGGTCCGGGCTGCCACGACCAAGTGACAGCAATGC TGTTTCACTGGTTATGCGGCGGATCCGAAAAGAAAACGTTGATGC CGGTGAACGTGCAAAACAGGCTCTAGCGTTCGAACGCACTGATTT CGACCAGGTTCGTTCACTCATGGAAAATAGCGATCGCTGCCAGGA TATACGTAATCTGGCATTTCTGGGGATTGCTTATAACACCCTGTTA CGTATAGCCGAAATTGCCAGGATCAGGGTTAAAGATATCTCACGT ACTGACGGTGGGAGAATGTTAATCCATATTGGCAGAACGAAAACG CTGGTTAGCACCGCAGGTGTAGAGAAGGCACTTAGCCTGGGGGTA ACTAAACTGGTCGAGCGATGGATTTCCGTCTCTGGTGTAGCTGATG ATCCGAATAACTACCTGTTTTGCCGGGTCAGAAAAAATGGTGTTG CCGCGCCATCTGCCACCAGCCAGCTATCAACTCGCGCCCTGGAAG GGATTTTTGAAGCAACTCATCGATTGATTTACGGCGCTAAGGATG ACTCTGGTCAGAGATACCTGGCCTGGTCTGGACACAGTGCCCGTG TCGGAGCCGCGCGAGATATGGCCCGCGCTGGAGTTTCAATACCGG AGATCATGCAAGCTGGTGGCTGGACCAATGTAAATATTGTCATGA ACTATATCCGTAACCTGGATAGTGAAACAGGGGCAATGGTGCGCC TGCTGGAAGATGGCGATGGACCGGTGGAACAAAAACTTATTTCTG AAGAAGATCTGTGATAGCGGCCGCACTCCTCAGGTGCAGGCTGCC TATCAGAAGGTGGTGGCTGGTGTGGCCAATGCCCTGGCTCACAAA 
TACCACTGAGATCTTTTTCCCTCTGCCAAAAATTATGGGGACATCA TGAAGCCCCTTGAGCATCTGACTTCTGGCTAATAAAGGAAATTTAT TTTCATTGCAATAGTGTGTTGGAATTTTTTGTGTCTCTCACTCGGAA GGACATATGGGAGGGCAAATCATTTAAAACATCAGAATGAGTATT TGGTTTAGAGTTTGGCAACATATGCCCATATGCTGGCTGCCATGAA CAAAGGTTGGCTATAAAGAGGTCATCAGTATATGAAACAGCCCCC TGCTGTCCATTCCTTATTCCATAGAAAAGCCTTGACTTGAGGTTAG ATTTTTTTTATATTTTGTTTTGTGTTATTTTTTTCTTTAACATCCCTA AAATTTTCCTTACATGTTTTACTAGCCAGATTTTTCCTCCTCTCCTG ACTACTCCCAGTCATAGCTGTCCCTCTTCTCTTATGGAGATCCCTC GACCTGCAGCCCAAGCTTGGCGTAATCATGGTCATAGCTGTTTCCT GTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCC GGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAA CTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAA ACCTGTCGTGCCAGCGGATCCGCATCTCAATTAGTCAGCAACCAT AGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGT TCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGC AGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTG AGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTAACTTGT TTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAA ATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTG TCCAAACTCATCAATGTATCTTATCATGTCTGGATCCGCTGCATTA ATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCG CTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGG CTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTA TCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAA AGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGG CGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCG ACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGAT ACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCC GACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGA AGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGG TGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGT TCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCC AACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGT AACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTC TTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTT GGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTT GGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGT TTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCT CAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGA ACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAA 
GGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATC AATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATG CTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCA TCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGG AGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACC CACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCG GAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCA TCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCC AGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTG GTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCC AACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAG CGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGC CGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTT ACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACT CAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCT CTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAA CTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAAC TCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCAC TCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTT CTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGA ATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTC AATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATA CATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCG CACATTTCCCCGAAAAGTGCCACCTG SEQ ID NO: 72 AGCTCTGATCAAGAGACAGGATGAGGATCGTTTCGCATGATTGAA GFP-Lox66 Cre CAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGG expression plasmid CTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGAT (Artificial Sequence) GCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTG TCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGCAAGACGAGG CAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAG CTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTAT TGGGCGAAGTGCCGGGGCAGGATCTCCATGTCATCTACACCTTGC TCCTGCCGAGAAAGTATCCATCATGGCTGATGCAATGCGGCGGCT GCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAA ACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGT CGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGC CGAACTGTTCGCCAGGCTCAAGGCGAGCATGCCCGACGGCGAGGA TCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCATGGTG GAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTG TGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTG CTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTA CGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTT 
CTTGACGAGTTCTTCTGAATTATTAACTCGAGATCCACTAGAGTGT GGCGGCCGCATTCTTATAATCAGCATCATGATGTGGTACCACATCA TGATGCTGATTACCCCCAACTGAGAGAACTCAAAGGTTACCCCAG TTGGGGCGGGCCCACAAATAAAGCAATAGCATCACAAATTTCACA AATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAAC TCATCGAGCTCGAGATCTGGCGAAGGCGATGGGGGTCTTGAAGGC GTGCTGGTACTCCACGATGCCCAGCTCGGTGTTGCTGTGCAGCTCC TCCACGCGGCGGAAGGCGAACATGGGGCCCCCGTTCTGCAGGATG CTGGGGTGGATGGCGCTCTTGAAGTGCATGTGGCTGTCCACCACG AAGCTGTAGTAGCCGCCGTCGCGCAGGCTGAAGGTGCGGGCGAAG CTGCCCACCAGCACGTTATCGCCCATGGGGTGCAGGTGCTCCACG GTGGCGTTGCTGCGGATGATCTTGTCGGTGAAGATCACGCTGTCCT CGGGGAAGCCGGTGCCCACCACCTTGAAGTCGCCGATCACGCGGC CGGCCTCGTAGCGGTAGCTGAAGCTCACGTGCAGCACGCCGCCGT CCTCGTACTTCTCGATGCGGGTGTTGGTGTAGCCGCCGTTGTTGAT GGCGTGCAGGAAGGGGTTCTCGTAGCCGCTGGGGTAGGTGCCGAA GTGGTAGAAGCCGTAGCCCATCACGTGGCTCAGCAGGTAGGGGCT GAAGGTCAGGGCGCCTTTGGTGCTCTTCATCTTGTTGGTCATGCGG CCCTGCTCGGGGGTGCCCTCTCCGCCGCCCACCAGCTCGAACTCCA CGCCGTTCAGGGTGCCGGTGATGCGGCACTCGATCTTCATGGCGG GCATGGTGGCGACCGGTAGCGCTAGCGGCTTCGGATAACTTCGTA TAGCATACATTATACGAACGGTAAGCGCTACCGCCGGCATACCCA AGTGAAGTTGCTCGCAGCTTATAGTCGCGCCCGGGGAGCCCAAGG GCACGCCCTGGCACCGCGGCCGCTGAGTCTCGACCATCATCATCA TCATCATTGAGTTTATCTGGGATAACAGGGTAATGTCATCTAGGGA TAACAGGGTATGTCATCTGGGATAACAGGGTAATGTATCTAGGGA TAACAGGGTAATGTCATCTGGGATAACAGGGTAATGTCATCTAGG GATAACAGGGTATGTCATCTGGGATAACAGGGTAATGTATCTAGG GATAACAGGGTAATGTCATCTGGGATAACAGGGTAATGTCATCTA GGGATAACAGGGTATGTCATCTGGGATAACAGGGTAATGTATCTA GGGATAACAGGGTAATGTCATCTGGGATAACAGGGTAATGTCATC TAGGGATAACAGGGTATGTCATCTGGGATAACAGGGTAATGTATC TAGGGATAACAGGGTAATGTCATCTGGGATAACAGGGTAATGTCA TCTAGGGATAACAGGGTATGTCATCTGGGATAACAGGGTAATGTA TCTAGGGATAACAGGGTAATGTCATCTGGGATAACAGGGTAATGT CATCTAGGGATAACAGGGTATGTCATCTGGGATAACAGGGTAATG TATCTAGGGATAACAGGGTAATGTCATCTGGGATAACAGGGTAAT GTCATCTAGGGATAACAGGGTAAATGTCATCTAGGGATAACAGGG TAATGTCATCTAGGGATAACAGGGTAATGTCATCTGGGATAACAG GGTAATGTCATCTAGGGATAACAGGGTAATGTATCGCCAGCGTCG CACAGCATGTTTGCTTGTCGCCGTCGCGTCTGTCACATCTTTTCCG CCAGCAGTTAGGGATTAGCGTCTTAAGCTGGCGCGAGGACCAACG TATCAGCCAGGCGAAGCTGCTTTTGAGCACCACCCGGATGCCTAT 
CGCCACCGTCGGTCGCAATGTTGGTTTTGACGATCAACTCTATTTC TCGCGGGTATTTAAAAAATGCACCGGGGCCAGCCCGAGCGAGTTC CGTGCCGGTTGTGAAGAAAAAGTGAATGATGTAGCCGTCAAGTTG TCATAATTGGTAACGAATCAGACAATTGACGGCTTGACGGAGTAG CATAGGGTTTGCAGAATCCCTGCTTCGTCCATTTGACAGGCACATT ATGCATGCCGCTTCGCCTTCGCGCGCGAATTGATCTGCTGCCTCGC GCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCC GGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACA AGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCGC AGCCATGACCCAGTCACGTAGCGATAGCGGAGTGTATACTGGCTT AACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATG CGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATC AGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCG TTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATAC GGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGA GCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTT GCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAA AATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAA AGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTG TTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCG GGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTT CGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCC CCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGA
GTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCAC TGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGA GTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGT ATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGA GTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGT GGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGA TCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGT GGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGCGGATACA TATTTGAATGTATTTAGAAAAATAAACAAAAGAGTTTGTAGAAAC GCAAAAAGGCCATCCGTCAGGATGGCCTTCTGCTTAATTTGATGCC TGGCAGTTTATGGCGGGCGTCCTGCCCGCCACCCTCCGGGCCGTTG CTTCGCAACGTTCAAATCCGCTCCCGGCGGATTTGTCCTACTCAGG AGAGCGTTCACCGACAAACAACAGATAAAACGAAAGGCCCAGTC TTTCGACTGAGCCTTTCGTTTTATTTGATGCCTGGCAGTTCCCTACT CTCGCATGGGGAGACCCCACACTACCATCGGCGCTACGGCGTTTC ACTTCTGAGTTCGGCATGGGGTCAGGTGGGACCACCGCGCTACTG CCGCCAGGCAAATTCTGTTTTATCAGACCGCTTCTGCGTTCTGATT TAATCTGTATCAGGCTGAAAATCTTCTCTCATCCGCCAAAACAGCC AAGCTGGAGACCGTTTGGCCCCCCTCGAGCACGTAGAAAGCCAGT CCGCAGAAACGGTGCTGACCCCGGATGAATGTCAGCTACTGGGCT ATCTGGACAAGGGAAAACGCAAGCGCAAAGAGAAAGCAGGTAGC TTGCAGTGGGCTTACATGGCGATAGCTAGACTGGGCGGTTTTATG GACAGCAAGCGAACCGGAATTGCCAGCTGGGGCGCCCTCTGGTAA GGTTGGGAAGCCCTGCAAAGTAAACTGGATGGCTTTCTCGCCGCC AAGGATCTGATGGCGCAGGGGATCA SEQ ID NO: 73 ACGCGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTA pCMV PE2 P2A Cre CGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATA plasmid ACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCG (Artificial Sequence) CCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATA GGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACT GCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCC CCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCC AGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGT ATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATC AATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTC CACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAAC GGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAA TGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTG GTTTAGTGAACCGTCAGATCCGCTAGAGATCCGCGGCCGCTAATA CGACTCACTATAGGGAGAGCCGCCACCATGAAACGGACAGCCGAC GGAAGCGAGTTCGAGTCACCAAAGAAGAAGCGGAAAGTCGACAA GAAGTACAGCATCGGCCTGGACATCGGCACCAACTCTGTGGGCTG 
GGCCGTGATCACCGACGAGTACAAGGTGCCCAGCAAGAAATTCAA GGTGCTGGGCAACACCGACCGGCACAGCATCAAGAAGAACCTGAT CGGAGCCCTGCTGTTCGACAGCGGCGAAACAGCCGAGGCCACCCG GCTGAAGAGAACCGCCAGAAGAAGATACACCAGACGGAAGAACC GGATCTGCTATCTGCAAGAGATCTTCAGCAACGAGATGGCCAAGG TGGACGACAGCTTCTTCCACAGACTGGAAGAGTCCTTCCTGGTGG AAGAGGATAAGAAGCACGAGCGGCACCCCATCTTCGGCAACATCG TGGACGAGGTGGCCTACCACGAGAAGTACCCCACCATCTACCACC TGAGAAAGAAACTGGTGGACAGCACCGACAAGGCCGACCTGCGG CTGATCTATCTGGCCCTGGCCCACATGATCAAGTTCCGGGGCCACT TCCTGATCGAGGGCGACCTGAACCCCGACAACAGCGACGTGGACA AGCTGTTCATCCAGCTGGTGCAGACCTACAACCAGCTGTTCGAGG AAAACCCCATCAACGCCAGCGGCGTGGACGCCAAGGCCATCCTGT CTGCCAGACTGAGCAAGAGCAGACGGCTGGAAAATCTGATCGCCC AGCTGCCCGGCGAGAAGAAGAATGGCCTGTTCGGAAACCTGATTG CCCTGAGCCTGGGCCTGACCCCCAACTTCAAGAGCAACTTCGACC TGGCCGAGGATGCCAAACTGCAGCTGAGCAAGGACACCTACGACG ACGACCTGGACAACCTGCTGGCCCAGATCGGCGACCAGTACGCCG ACCTGTTTCTGGCCGCCAAGAACCTGTCCGACGCCATCCTGCTGAG CGACATCCTGAGAGTGAACACCGAGATCACCAAGGCCCCCCTGAG CGCCTCTATGATCAAGAGATACGACGAGCACCACCAGGACCTGAC CCTGCTGAAAGCTCTCGTGCGGCAGCAGCTGCCTGAGAAGTACAA AGAGATTTTCTTCGACCAGAGCAAGAACGGCTACGCCGGCTACAT TGACGGCGGAGCCAGCCAGGAAGAGTTCTACAAGTTCATCAAGCC CATCCTGGAAAAGATGGACGGCACCGAGGAACTGCTCGTGAAGCT GAACAGAGAGGACCTGCTGCGGAAGCAGCGGACCTTCGACAACG GCAGCATCCCCCACCAGATCCACCTGGGAGAGCTGCACGCCATTC TGCGGCGGCAGGAAGATTTTTACCCATTCCTGAAGGACAACCGGG AAAAGATCGAGAAGATCCTGACCTTCCGCATCCCCTACTACGTGG GCCCTCTGGCCAGGGGAAACAGCAGATTCGCCTGGATGACCAGAA AGAGCGAGGAAACCATCACCCCCTGGAACTTCGAGGAAGTGGTGG ACAAGGGCGCTTCCGCCCAGAGCTTCATCGAGCGGATGACCAACT TCGATAAGAACCTGCCCAACGAGAAGGTGCTGCCCAAGCACAGCC TGCTGTACGAGTACTTCACCGTGTATAACGAGCTGACCAAAGTGA AATACGTGACCGAGGGAATGAGAAAGCCCGCCTTCCTGAGCGGCG AGCAGAAAAAGGCCATCGTGGACCTGCTGTTCAAGACCAACCGGA AAGTGACCGTGAAGCAGCTGAAAGAGGACTACTTCAAGAAAATC GAGTGCTTCGACTCCGTGGAAATCTCCGGCGTGGAAGATCGGTTC AACGCCTCCCTGGGCACATACCACGATCTGCTGAAAATTATCAAG GACAAGGACTTCCTGGACAATGAGGAAAACGAGGACATTCTGGA AGATATCGTGCTGACCCTGACACTGTTTGAGGACAGAGAGATGAT CGAGGAACGGCTGAAAACCTATGCCCACCTGTTCGACGACAAAGT GATGAAGCAGCTGAAGCGGCGGAGATACACCGGCTGGGGCAGGC 
TGAGCCGGAAGCTGATCAACGGCATCCGGGACAAGCAGTCCGGCA AGACAATCCTGGATTTCCTGAAGTCCGACGGCTTCGCCAACAGAA ACTTCATGCAGCTGATCCACGACGACAGCCTGACCTTTAAAGAGG ACATCCAGAAAGCCCAGGTGTCCGGCCAGGGCGATAGCCTGCACG AGCACATTGCCAATCTGGCCGGCAGCCCCGCCATTAAGAAGGGCA TCCTGCAGACAGTGAAGGTGGTGGACGAGCTCGTGAAAGTGATGG GCCGGCACAAGCCCGAGAACATCGTGATCGAAATGGCCAGAGAG AACCAGACCACCCAGAAGGGACAGAAGAACAGCCGCGAGAGAAT GAAGCGGATCGAAGAGGGCATCAAAGAGCTGGGCAGCCAGATCC TGAAAGAACACCCCGTGGAAAACACCCAGCTGCAGAACGAGAAG CTGTACCTGTACTACCTGCAGAATGGGCGGGATATGTACGTGGAC CAGGAACTGGACATCAACCGGCTGTCCGACTACGATGTGGACGCT ATCGTGCCTCAGAGCTTTCTGAAGGACGACTCCATCGACAACAAG GTGCTGACCAGAAGCGACAAGAACCGGGGCAAGAGCGACAACGT GCCCTCCGAAGAGGTCGTGAAGAAGATGAAGAACTACTGGCGGC AGCTGCTGAACGCCAAGCTGATTACCCAGAGAAAGTTCGACAATC TGACCAAGGCCGAGAGAGGCGGCCTGAGCGAACTGGATAAGGCC GGCTTCATCAAGAGACAGCTGGTGGAAACCCGGCAGATCACAAAG CACGTGGCACAGATCCTGGACTCCCGGATGAACACTAAGTACGAC GAGAATGACAAGCTGATCCGGGAAGTGAAAGTGATCACCCTGAA GTCCAAGCTGGTGTCCGATTTCCGGAAGGATTTCCAGTTTTACAAA GTGCGCGAGATCAACAACTACCACCACGCCCACGACGCCTACCTG AACGCCGTCGTGGGAACCGCCCTGATCAAAAAGTACCCTAAGCTG GAAAGCGAGTTCGTGTACGGCGACTACAAGGTGTACGACGTGCGG AAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAGGCTACCGC CAAGTACTTCTTCTACAGCAACATCATGAACTTTTTCAAGACCGAG ATTACCCTGGCCAACGGCGAGATCCGGAAGCGGCCTCTGATCGAG ACAAACGGCGAAACCGGGGAGATCGTGTGGGATAAGGGCCGGGA TTTTGCCACCGTGCGGAAAGTGCTGAGCATGCCCCAAGTGAATAT CGTGAAAAAGACCGAGGTGCAGACAGGCGGCTTCAGCAAAGAGT CTATCCTGCCCAAGAGGAACAGCGATAAGCTGATCGCCAGAAAGA AGGACTGGGACCCTAAGAAGTACGGCGGCTTCGACAGCCCCACCG TGGCCTATTCTGTGCTGGTGGTGGCCAAAGTGGAAAAGGGCAAGT CCAAGAAACTGAAGAGTGTGAAAGAGCTGCTGGGGATCACCATCA TGGAAAGAAGCAGCTTCGAGAAGAATCCCATCGACTTTCTGGAAG CCAAGGGCTACAAAGAAGTGAAAAAGGACCTGATCATCAAGCTG CCTAAGTACTCCCTGTTCGAGCTGGAAAACGGCCGGAAGAGAATG CTGGCCTCTGCCGGCGAACTGCAGAAGGGAAACGAACTGGCCCTG CCCTCCAAATATGTGAACTTCCTGTACCTGGCCAGCCACTATGAGA AGCTGAAGGGCTCCCCCGAGGATAATGAGCAGAAACAGCTGTTTG TGGAACAGCACAAGCACTACCTGGACGAGATCATCGAGCAGATCA GCGAGTTCTCCAAGAGAGTGATCCTGGCCGACGCTAATCTGGACA AAGTGCTGTCCGCCTACAACAAGCACCGGGATAAGCCCATCAGAG 
AGCAGGCCGAGAATATCATCCACCTGTTTACCCTGACCAATCTGG GAGCCCCTGCCGCCTTCAAGTACTTTGACACCACCATCGACCGGA AGAGGTACACCAGCACCAAAGAGGTGCTGGACGCCACCCTGATCC ACCAGAGCATCACCGGCCTGTACGAGACACGGATCGACCTGTCTC AGCTGGGAGGTGACTCTGGAGGATCTAGCGGAGGATCCTCTGGCA GCGAGACACCAGGAACAAGCGAGTCAGCAACACCAGAGAGCAGT GGCGGCAGCAGCGGCGGCAGCAGCACCCTAAATATAGAAGATGA GTATCGGCTACATGAGACCTCAAAAGAGCCAGATGTTTCTCTAGG GTCCACATGGCTGTCTGATTTTCCTCAGGCCTGGGCGGAAACCGG GGGCATGGGACTGGCAGTTCGCCAAGCTCCTCTGATCATACCTCTG AAAGCAACCTCTACCCCCGTGTCCATAAAACAATACCCCATGTCA CAAGAAGCCAGACTGGGGATCAAGCCCCACATACAGAGACTGTTG GACCAGGGAATACTGGTACCCTGCCAGTCCCCCTGGAACACGCCC CTGCTACCCGTTAAGAAACCAGGGACTAATGATTATAGGCCTGTC CAGGATCTGAGAGAAGTCAACAAGCGGGTGGAAGACATCCACCC CACCGTGCCCAACCCTTACAACCTCTTGAGCGGGCTCCCACCGTCC CACCAGTGGTACACTGTGCTTGATTTAAAGGATGCCTTTTTCTGCC TGAGACTCCACCCCACCAGTCAGCCTCTCTTCGCCTTTGAGTGGAG AGATCCAGAGATGGGAATCTCAGGACAATTGACCTGGACCAGACT CCCACAGGGTTTCAAAAACAGTCCCACCCTGTTTAATGAGGCACT GCACAGAGACCTAGCAGACTTCCGGATCCAGCACCCAGACTTGAT CCTGCTACAGTACGTGGATGACTTACTGCTGGCCGCCACTTCTGAG CTAGACTGCCAACAAGGTACTCGGGCCCTGTTACAAACCCTAGGG AACCTCGGGTATCGOGCCTCGGCCAAGAAAGCCCAAATTTGCCAG AAACAGGTCAAGTATCTGGGGTATCTTCTAAAAGAGGGTCAGAGA TGGCTGACTGAGGCCAGAAAAGAGACTGTGATGGGGCAGCCTACT CCGAAGACCCCTCGACAACTAAGGGAGTTCCTAGGGAAGGCAGGC TTCTGTCGCCTCTTCATCCCTGGGTTTGCAGAAATGGCAGCCCCCC TGTACCCTCTCACCAAACCGGGGACTCTGTTTAATTGGGGCCCAGA CCAACAAAAGGCCTATCAAGAAATCAAGCAAGCTCTTCTAACTGC CCCAGCCCTGGGGTTGCCAGATTTGACTAAGCCCTTTGAACTCTTT GTCGACGAGAAGCAGGGCTACGCCAAAGGTGTCCTAACGCAAAA ACTGGGACCTTGGCGTCGGCCGGTGGCCTACCTGTCCAAAAAGCT AGACCCAGTAGCAGCTGGGTGGCCCCCTTGCCTACGGATGGTAGC AGCCATTGCCGTACTGACAAAGGATGCAGGCAAGCTAACCATGGG ACAGCCACTAGTCATTCTGGCCCCCCATGCAGTAGAGGCACTAGT CAAACAACCCCCCGACCGCTGGCTTTCCAACGCCCGGATGACTCA CTATCAGGCCTTGCTTTTGGACACGGACCGGGTCCAGTTCGGACCG GTGGTAGCCCTGAACCCGGCTACGCTGCTCCCACTGCCTGAGGAA GGGCTGCAACACAACTGCCTTGATATCCTGGCCGAAGCCCACGGA ACCCGACCCGACCTAACGGACCAGCCGCTCCCAGACGCCGACCAC ACCTGGTACACGGATGGAAGCAGTCTCTTACAAGAGGGACAGCGT AAGGCGGGAGCTGCGGTGACCACCGAGACCGAGGTAATCTGGGCT 
AAAGCCCTGCCAGCCGGGACATCCGCTCAGCGGGCTGAACTGATA GCACTCACCCAGGCCCTAAAGATGGCAGAAGGTAAGAAGCTAAAT GTTTATACTGATAGCCGTTATGCTTTTGCTACTGCCCATATCCATG GAGAAATATACAGAAGGCGTGGGTGGCTCACATCAGAAGGCAAA GAGATCAAAAATAAAGACGAGATCTTGGCCCTACTAAAAGCCCTC TTTCTGCCCAAAAGACTTAGCATAATCCATTGTCCAGGACATCAAA AGGGACACAGCGCCGAGGCTAGAGGCAACCGGATGGCTGACCAA GCGGCCCGAAAGGCAGCCATCACAGAGACTCCAGACACCTCTACC CTCCTCATAGAAAATTCATCACCCTCTGGCGGCTCAAAAAGAACC GCCGACGGCAGCGAATTCGAGCCCAAGAAGAAGAGGAAAGTCGG AAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGCGACGT GGAGGAGAACCCTGGACCTAATTTACTGACCGTACACCAAAATTT GCCTGCATTACCGGTCGATGCAACGAGTGATGAGGTTCGCAAGAA CCTGATGGACATGTTCAGGGATCGCCAGGCGTTTTCTGAGCATACC TGGAAAATGCTTCTGTCCGTTTGCCGGTCGTGGGCGGCATGGTGCA AGTTGAATAACCGGAAATGGTTTCCCGCAGAACCTGAAGATGTTC GCGATTATCTTCTATATCTTCAGGCGCGCGGTCTGGCAGTAAAAAC TATCCAGCAACATTTGGGCCAGCTAAACATGCTTCATCGTCGGTCC GGGCTGCCACGACCAAGTGACAGCAATGCTGTTTCACTGGTTATG CGGCGGATCCGAAAAGAAAACGTTGATGCCGGTGAACGTGCAAA ACAGGCTCTAGCGTTCGAACGCACTGATTTCGACCAGGTTCGTTCA CTCATGGAAAATAGCGATCGCTGCCAGGATATACGTAATCTGGCA TTTCTGGGGATTGCTTATAACACCCTGTTACGTATAGCCGAAATTG CCAGGATCAGGGTTAAAGATATCTCACGTACTGACGGTGGGAGAA TGTTAATCCATATTGGCAGAACGAAAACGCTGGTTAGCACCGCAG GTGTAGAGAAGGCACTTAGCCTGGGGGTAACTAAACTGGTCGAGC GATGGATTTCCGTCTCTGGTGTAGCTGATGATCCGAATAACTACCT GTTTTGCCGGGTCAGAAAAAATGGTGTTGCCGCGCCATCTGCCAC CAGCCAGCTATCAACTCGCGCCCTGGAAGGGATTTTTGAAGCAAC TCATCGATTGATTTACGGCGCTAAGGATGACTCTGGTCAGAGATA CCTGGCCTGGTCTGGACACAGTGCCCGTGTCGGAGCCGCGCGAGA TATGGCCCGCGCTGGAGTTTCAATACCGGAGATCATGCAAGCTGG TGGCTGGACCAATGTAAATATTGTCATGAACTATATCCGTAACCTG GATAGTGAAACAGGGGCAATGGTGCGCCTGCTGGAAGATGGCGAT TAATTTAAACCCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCA GCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAA GGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGAAAATTGCAT CGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGG GCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATG CTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCGGAAAGAACCA GCTGGGGCTCGATACCGTCGACCTCTAGCTAGAGCTTGGCGTAAT CATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAAT TCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTAGGG 
TGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTG CCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAA tcggccaacgcgcggggagaggcggtttgcgtattgggcgctctt CCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCG GCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCAC AGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCC AGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTT TCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCT CAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAG GCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCC TGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGT GGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAG GTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGC CCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCC GGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAG GATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAA GTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTAT CTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAG CTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTT GTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGA AGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAA AACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATC TTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCT AAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAAT CAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATA GTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGC TTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGC TCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGG GCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGT CTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTA ATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTC ACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGA TCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTT AGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAG TGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGT CATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACC
AAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCC CGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAA AAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAA GGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGC ACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGG TGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAG GGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATAT TATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATAT TTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACAT TTCCCCGAAAAGTGCCACCTGACGTCGACGGATCGGGAGATCGAT CTCCCGATCCCCTAGGGTCGACTCTCAGTACAATCTGCTCTGATGC CGCATAGTTAAGCCAGTATCTGCTCCCTGCTTGTGTGTTGGAGGTC GCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAAGGCAAGGC TTGACCGACAATTGCATGAAGAATCTGCTTAGGGTTAGGCGTTTTG CGCTGCTTCGCGATGTACGGGCCAGATAT SEQ ID NO: 74 GTCAACCAGTATCCCGGTGC +90 ngRNA guide sequence (Artificial Sequence) SEQ ID NO: 75 GTCAACCAGTATCCCGGTGCGTTTTAGAGCTAGAAATAGCAAGTT +90 ngRNA AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT (Artificial Sequence) CGGTGC SEQ ID NO: 76 TGATCCCCTGCGCCATCAGATCCTTGGCGGCGAGAAAGCCATCCA GFP minicircle GTTTACTTTGCAGGGCTTCCCAACCTTACCAGAGGGCGCCCCAGCT template (before GGCAATTCCGGTTCGCTTGCTGTCCATAAAACCGCCCAGTCTAGCT cleavage into a ATCGCCATGTAAGCCCACTGCAAGCTACCTGCTTTCTCTTTGCGCT minicircle) TGCGTTTTCCCTTGTCCAGATAGCCCAGTAGCTGACATTCATCCGG (Artificial Sequence) GGTCAGCACCGTTTCTGCGGACTGGCTTTCTACGTGCTCGAGGGGG GCCAAACGGTCTCCAGCTTGGCTGTTTTGGCGGATGAGAGAAGAT TTTCAGCCTGATACAGATTAAATCAGAACGCAGAAGCGGTCTGAT AAAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGA CCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAG TGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCAGGCATCAAA TAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCT GTTGTTTGTCGGTGAACGCTCTCCTGAGTAGGACAAATCCGCCGG GAGCGGATTTGAACGTTGCGAAGCAACGGCCCGGAGGGTGGCGG GCAGGACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAAG GCCATCCTGACGGATGGCCTTTTTGCGTTTCTACAAACTCTTTTGTT TATTTTTCTAAATACATTCAAATATGTATCCGCTCATGACCAAAAT CCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAA AAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCT GCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTT TGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTT 
CAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTA GTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTC GCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGT CGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGG CGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCT TGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGC TATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGG TATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGA GCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTT CGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGG GGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGT TCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTA TCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTG ATACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGA GCGAGGAAGCGGAAGAGCGCCTGATGCGGTATTTTCTCCTTACGC ATCTGTGCGGTATTTCACACCGCATATGGTGCACTCTCAGTACAAT CTGCTCTGATGCCGCATAGTTAAGCCAGTATACACTCCGCTATCGC TACGTGACTGGGTCATGGCTGCGCCCCGACACCCGCCAACACCCG CTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAG ACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTC ACCGTCATCACCGAAACGCGCGAGGCAGCAGATCAATTCGCGCGC GAAGGCGAAGCGGCATGCATAATGTGCCTGTCAAATGGACGAAGC AGGGATTCTGCAAACCCTATGCTACTCCGTCAAGCCGTCAATTGTC TGATTCGTTACCAATTATGACAACTTGACGGCTACATCATTCACTT TTTCTTCACAACCGGCACGGAACTCGCTCGGGCTGGCCCCGGTGC ATTTTTTAAATACCCGCGAGAAATAGAGTTGATCGTCAAAACCAA CATTGCGACCGACGGTGGCGATAGGCATCCGGGTGGTGCTCAAAA GCAGCTTCGCCTGGCTGATACGTTGGTCCTCGCGCCAGCTTAAGAC GCTAATCCCTAACTGCTGGCGGAAAAGATGTGACAGACGCGACGG CGACAAGCAAACATGCTGTGCGACGCTGGCGATACATTACCCTGT TATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTACCCTG TTATCCCTAGATGACATTACCCTGTTATCCCTAGATGACATTTACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC 
CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATAAACTCAA TGATGATGATGATGATGGTCGAGACTCAGCGGCCGCGGTGCCAGG GCGTGCCCTTGGGCTCCCCGGGCGCGACTATAAGCTGCGAGCAAC TTCACTTGGGTATGCCGGCGGTAGCGCTTACCGTTCGTATAATGTA TGCTATACGAAGTTATCCGAAGCCGCTAGCGGTGGTTTGTCTGGTC AACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCAGCTACCGG TCGCCACCATGCCCGCCATGAAGATCGAGTGCCGCATCACCGGCA CCCTGAACGGCGTGGAGTTCGAGCTGGTGGGCGGCGGAGAGGGC ACCCCCGAGCAGGGCCGCATGACCAACAAGATGAAGAGCACCAA AGGCGCCCTGACCTTCAGCCCCTACCTGCTGAGCCACGTGATGGG CTACGGCTTCTACCACTTCGGCACCTACCCCAGCGGCTACGAGAA CCCCTTCCTGCACGCCATCAACAACGGCGGCTACACCAACACCCG CATCGAGAAGTACGAGGACGGCGGCGTGCTGCACGTGAGCTTCAG CTACCGCTACGAGGCCGGCCGCGTGATCGGCGACTTCAAGGTGGT GGGCACCGGCTTCCCCGAGGACAGCGTGATCTTCACCGACAAGAT CATCCGCAGCAACGCCACCGTGGAGCACCTGCACCCCATGGGCGA TAACGTGCTGGTGGGCAGCTTCGCCCGCACCTTCAGCCTGCGCGA CGGCGGCTACTACAGCTTCGTGGTGGACAGCCACATGCACTTCAA GAGCGCCATCCACCCCAGCATCCTGCAGAACGGGGGCCCCATGTT CGCCTTCCGCCGCGTGGAGGAGCTGCACAGCAACACCGAGCTGGG CATCGTGGAGTACCAGCACGCCTTCAAGACCCCCATCGCCTTCGCC AGATCTCGAGCTCGATGAGTTTGGACAAACCACAACTAGAATGCA GTGAAAAAAATGCTTTATTTGTGAAATTTGTGATGCTATTGCTTTA TTTGTGGGCCCGCCCCAACTGGGGTAACCTTTGAGTTCTCTCAGTT GGGGGTAATCAGCATCATGATGTGGTACCACATCATGATGCTGAT TATAAGAATGCGGCCGCCACACTCTAGTGGATCTCGAGTTAATAA TTCAGAAGAACTCGTCAAGAAGGCGATAGAAGGCGATGCGCTGCG AATCGGGAGCGGCGATACCGTAAAGCACGAGGAAGCGGTCAGCC CATTCGCCGCCAAGCTCTTCAGCAATATCACGGGTAGCCAACGCT ATGTCCTGATAGCGGTCCGCCACACCCAGCCGGCCACAGTCGATG AATCCAGAAAAGCGGCCATTTTCCACCATGATATTCGGCAAGCAG GCATCGCCATGGGTCACGACGAGATCCTCGCCGTCGGGCATGCTC GCCTTGAGCCTGGCGAACAGTTCGGCTGGCGCGAGCCCCTGATGC TCTTCGTCCAGATCATCCTGATCGACAAGACCGGCTTCCATCCGAG TACGTGCTCGCTCGATGCGATGTTTCGCTTGGTGGTCGAATGGGCA GGTAGCCGGATCAAGCGTATGCAGCCGCCGCATTGCATCAGCCAT GATGGATACTTTCTCGGCAGGAGCAAGGTGTAGATGACATGGAGA TCCTGCCCCGGCACTTCGCCCAATAGCAGCCAGTCCCTTCCCGCTT CAGTGACAACGTCGAGCACAGCTGCGCAAGGAACGCCCGTCGTGG CCAGCCACGATAGCCGCGCTGCCTCGTCTTGCAGTTCATTCAGGGC ACCGGACAGGTCGGTCTTGACAAAAAGAACCGGGCGCCCCTGCGC 
TGACAGCCGGAACACGGCGGCATCAGAGCAGCCGATTGTCTGTTG TGCCCAGTCATAGCCGAATAGCCTCTCCACCCAAGCGGCCGGAGA ACCTGCGTGCAATCCATCTTGTTCAATCATGCGAAACGATCCTCAT CCTGTCTCTTGATCAGAGCT SEQ ID NO: 77 TGATCCCCTGCGCCATCAGATCCTTGGCGGCGAGAAAGCCATCCA Gaussia Luciferase GTTTACTTTGCAGGGCTTCCCAACCTTACCAGAGGGCGCCCCAGCT minicircle template GGCAATTCCGGTTCGCTTGCTGTCCATAAAACCGCCCAGTCTAGCT (Artificial Sequence) ATCGCCATGTAAGCCCACTGCAAGCTACCTGCTTTCTCTTTGCGCT TGCGTTTTCCCTTGTCCAGATAGCCCAGTAGCTGACATTCATCCGG GGTCAGCACCGTTTCTGCGGACTGGCTTTCTACGTGCTCGAGGGGG GCCAAACGGTCTCCAGCTTGGCTGTTTTGGCGGATGAGAGAAGAT TTTCAGCCTGATACAGATTAAATCAGAACGCAGAAGCGGTCTGAT AAAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGA CCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAG TGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCAGGCATCAAA TAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCT GTTGTTTGTCGGTGAACGCTCTCCTGAGTAGGACAAATCCGCCGG GAGCGGATTTGAACGTTGCGAAGCAACGGCCCGGAGGGTGGCGG GCAGGACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAAG GCCATCCTGACGGATGGCCTTTTTGCGTTTCTACAAACTCTTTTGTT TATTTTTCTAAATACATTCAAATATGTATCCGCTCATGACCAAAAT CCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAA AAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCT GCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTT TGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTT CAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTA GTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTC GCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGT CGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGG CGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCT TGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGC TATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGG TATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGA GCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTT CGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGG GGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGT TCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTA TCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTG ATACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGA GCGAGGAAGCGGAAGAGCGCCTGATGCGGTATTTTCTCCTTACGC ATCTGTGCGGTATTTCACACCGCATATGGTGCACTCTCAGTACAAT CTGCTCTGATGCCGCATAGTTAAGCCAGTATACACTCCGCTATCGC 
TACGTGACTGGGTCATGGCTGCGCCCCGACACCCGCCAACACCCG CTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAG ACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTC ACCGTCATCACCGAAACGCGCGAGGCAGCAGATCAATTCGCGCGC GAAGGCGAAGCGGCATGCATAATGTGCCTGTCAAATGGACGAAGC AGGGATTCTGCAAACCCTATGCTACTCCGTCAAGCCGTCAATTGTC TGATTCGTTACCAATTATGACAACTTGACGGCTACATCATTCACTT TTTCTTCACAACCGGCACGGAACTCGCTCGGGCTGGCCCCGGTGC ATTTTTTAAATACCCGCGAGAAATAGAGTTGATCGTCAAAACCAA CATTGCGACCGACGGTGGCGATAGGCATCCGGGTGGTGCTCAAAA GCAGCTTCGCCTGGCTGATACGTTGGTCCTCGCGCCAGCTTAAGAC GCTAATCCCTAACTGCTGGCGGAAAAGATGTGACAGACGCGACGG CGACAAGCAAACATGCTGTGCGACGCTGGCGATACATTACCCTGT TATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTACCCTG TTATCCCTAGATGACATTACCCTGTTATCCCTAGATGACATTTACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATAAACTCAA TGATGATGATGATGATGGTCGAGACTCAGCGGCCGCGGTGCCAGG GCGTGCCCTTGGGCTCCCCGGGCGCGACTATAAGCTGCGAGCAAC TTCACTTGGGTATGCCGGCGGTAGCGCTTACCGTTCGTATAATGTA TGCTATACGAAGTTATCCGAAGCCGCTAGCGGTGGTTTGTCTGGTC AACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCACTACCGGT CGCCACCATGGGAGTCAAAGTTCTGTTTGCCCTGATCTGCATCGCT GTGGCCGAGGCCAAGCCCACCGAGAACAACGAAGACTTCAACATC GTGGCCGTGGCCAGCAACTTCGCGACCACGGATCTCGATGCTGAC CGCGGGAAGTTGCCCGGCAAGAAGCTGCCGCTGGAGGTGCTCAAA GAGATGGAAGCCAATGCCCGGAAAGCTGGCTGCACCAGGGGCTGT CTGATCTGCCTGTCCCACATCAAGTGCACGCCCAAGATGAAGAAG TTCATCCCAGGACGCTGCCACACCTACGAAGGCGACAAAGAGTCC GCACAGGGCGGCATAGGCGAGGCGATCGTCGACATTCCTGAGATT CCTGGGTTCAAGGACTTGGAGCCCATGGAGCAGTTCATCGCACAG GTCGATCTGTGTGTGGACTGCACAACTGGCTGCCTCAAAGGGCTT 
GCCAACGTGCAGTGTTCTGACCTGCTCAAGAAGTGGCTGCCGCAA CGCTGTGCGACCTTTGCCAGCAAGATCCAGGGCCAGGTGGACAAG ATCAAGGGGGCCGGTGGTGACTAAGCGGAGCTCGATGAGTTTGGA CAAACCACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGTGAA ATTTGTGATGCTATTGCTTTATTTGTGGGCCCGCCCCAACTGGGGT AACCTTTGAGTTCTCTCAGTTGGGGGTAATCAGCATCATGATGTGG TACCACATCATGATGCTGATTATAAGAATGCGGCCGCCACACTCT AGTGGATCTCGAGTTAATAATTCAGAAGAACTCGTCAAGAAGGCG ATAGAAGGCGATGCGCTGCGAATCGGGAGCGGCGATACCGTAAA GCACGAGGAAGCGGTCAGCCCATTCGCCGCCAAGCTCTTCAGCAA TATCACGGGTAGCCAACGCTATGTCCTGATAGCGGTCCGCCACAC CCAGCCGGCCACAGTCGATGAATCCAGAAAAGCGGCCATTTTCCA CCATGATATTCGGCAAGCAGGCATCGCCATGGGTCACGACGAGAT CCTCGCCGTCGGGCATGCTCGCCTTGAGCCTGGCGAACAGTTCGG CTGGCGCGAGCCCCTGATGCTCTTCGTCCAGATCATCCTGATCGAC AAGACCGGCTTCCATCCGAGTACGTGCTCGCTCGATGCGATGTTTC GCTTGGTGGTCGAATGGGCAGGTAGCCGGATCAAGCGTATGCAGC CGCCGCATTGCATCAGCCATGATGGATACTTTCTCGGCAGGAGCA AGGTGTAGATGACATGGAGATCCTGCCCCGGCACTTCGCCCAATA GCAGCCAGTCCCTTCCCGCTTCAGTGACAACGTCGAGCACAGCTG CGCAAGGAACGCCCGTCGTGGCCAGCCACGATAGCCGCGCTGCCT CGTCTTGCAGTTCATTCAGGGCACCGGACAGGTCGGTCTTGACAA AAAGAACCGGGCGCCCCTGCGCTGACAGCCGGAACACGGCGGCA TCAGAGCAGCCGATTGTCTGTTGTGCCCAGTCATAGCCGAATAGC CTCTCCACCCAAGCGGCCGGAGAACCTGCGTGCAATCCATCTTGTT CAATCATGCGAAACGATCCTCATCCTGTCTCTTGATCAGAGCT SEQ ID NO: 78 CCCCAACTGGGGTAACCTTTGAGTTCTCTCAGTTGGGG pseudo attP site (Artificial sequence) SEQ ID NO: 79 GACTGAAACTTCACAGAATAGTTTTAGAGCTAGAAATAGCAAGTT Albumin-pegRNA- AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT SERPIN CGGTGCTTGGGATAGTTATGAATTCAATCTTCAACCCTATCCGGAT
(Artificial Sequence) GATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTCTGT GAAGTTTCAGTCA SEQ ID NO: 80 GACTGAAACTTCACAGAATAGTTTTAGAGCTAGAAATAGCAAGTT Albumin-pegRNA- AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT CPS1 CGGTGCTTGGGATAGTTATGAATTCAATCTTCAACCCTATCCGGAT (Artificial Sequence) GATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTCTGT GAAGTTTC SEQ ID NO: 81 GGCCCAGACTGAGCACGTGAGTTTTAGAGCTAGAAATAGCAAGTT 34 bp lox71 pegRNA AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT (Artificial Sequence) CGGTGCTGGAGGAAGCAGGGCTTCCTTTCCTCTGCCATCATACCGT TCGTATAGCATACATTATACGAAGTTATCGTGCTCAGTCTG SEQ ID NO: 82 GGCCCAGACTGAGCACGTGAGTTTTAGAGCTAGAAATAGCAAGTT 34 bp lox66 pegRNA AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT (Artificial Sequence) CGGTGCTGGAGGAAGCAGGGCTTCCTTTCCTCTGCCATCAATAACT TCGTATAGCATACATTATACGAACGGTACGTGCTCAGTCTG SEQ ID NO: 83 GGCCCAGACTGAGCACGTGA gRNA (Artificial Sequence) SEQ ID NO: 84 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 46 GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCGGATGATCC (original length) TGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA pegRNA GAA (Artificial Sequence) SEQ ID NO: 85 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG PBS_13_RT_29_with TCGGTGCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCAT TP901-1 minimal GGCACAATTAACATCTCAATCAAGGTAAATGCTTGAGCTGCGAG attB f pegRNA AA (Artificial Sequence) SEQ ID NO: 86 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG PBS_13_RT_29_with TCGGTGCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCAT TP901-1 minimal GGAGCATTTACCTTGATTGAGATGTTAATTGTGTGAGCTGCGAGA attB rc pegRNA A (Artificial Sequence) SEQ ID NO: 87 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG PBS_13_RT_29_with TCGGTGCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCAT PhiBT1 minimal GGCAGGTTTTTGACGAAAGTGATCCAGATGATCCAGTGAGCTGC attB f pegRNA GAGAA (Artificial Sequence) SEQ ID 
NO: 88 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG PBS 13 RT_29_with TCGGTGCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCAT PhiBT1 minimal GGCTGGATCATCTGGATCACTTTCGTCAAAAACCTGTGAGCTGCG attB rc pegRNA AGAA (Artificial Sequence) SEQ ID NO: 89 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGT ACTB N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA Nicking guide 1 + 48 GTCGGTGC guide (Artificial Sequence) SEQ ID NO: 90 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGT ACTB N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA PBS_18_RT_16_with_ GTCGGTGCATATCATCATCCATGGTACCGTTCGTATAGCATACAT Lox71_Cre TATACGAAGTTATTGAGCTGCGAGAATAGCC pegRNA (Artificial Sequence) SEQ ID NO: 91 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGT ACTB N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA PBS_13_RT_29_with_ GTCGGTGCGACGAGCGCGGCGATATCATCATCCATGGTACCGTT Lox71_Cre CGTATAGCATACATTATACGAAGTTATTGAGCTGCGAGAA pegRNA (Artificial Sequence) SEQ ID NO: 92 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 34 pegRNA GGTGCTCGACGACGAGCGCGGCGATATCATCATCCATGGCCGGAT (Artificial Sequence) GATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGC TGCGAGAA SEQ ID NO: 93 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 26 pegRNA GGTGCGAGCGCGGCGATATCATCATCCATGGCCGGATGATCCTGA (Artificial Sequence) CGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAA SEQ ID NO: 94 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 23 pegRNA GGTGCCGCGGCGATATCATCATCCATGGCCGGATGATCCTGACGAC (Artificial Sequence) GGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAA SEQ ID NO: 95 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 20 pegRNA GGTGCGGCGATATCATCATCCATGGCCGGATGATCCTGACGACGG (Artificial Sequence) AGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAA SEQ ID NO: 96 
GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 16 pegRNA GGTGCATATCATCATCCATGGCCGGATGATCCTGACGACGGAGAC (Artificial Sequence) CGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAA SEQ ID NO: 97 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 18 RT 34 pegRNA GGTGCTCGACGACGAGCGCGGCGATATCATCATCCATGGCCGGAT (Artificial Sequence) GATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGC TGCGAGAATAGCC SEQ ID NO: 98 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 18 RT 29 pegRNA GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCGGATGATCC (Artificial Sequence) TGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA GAATAGCC SEQ ID NO: 99 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 18 RT 16 pegRNA GGTGCATATCATCATCCATGGCCGGATGATCCTGACGACGGAGAC (Artificial Sequence) CGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAATAGCC SEQ ID NO: 100 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 39 pegRNA TCGGTGCCTGCCCATCCGCGGCGGCACGGGGGTCGCAGTCGCCA (Artificial Sequence) TGCCGGATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCC GGCCCGGGCGGCGGAGA SEQ ID NO: 101 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 34 pegRNA TCGGTGCCATCCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCG (Artificial Sequence) GATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCC GGGCGGCGGAGA SEQ ID NO: 102 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 29 pegRNA TCGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGA (Artificial Sequence) TCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCG GCGGAGA SEQ ID NO: 103 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 24 pegRNA TCGGTGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGATCCTG (Artificial 
Sequence) ACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCGGA GA SEQ ID NO: 104 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 19 pegRNA TCGGTGCGGGGGTCGCAGTCGCCATGCCGGATGATCCTGACGAC (Artificial Sequence) GGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCGGAGA SEQ ID NO: 105 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 18 RT 39 pegRNA TCGGTGCCTGCCCATCCGCGGCGGCACGGGGGTCGCAGTCGCCA (Artificial Sequence) TGCCGGATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCC GGCCCGGGCGGCGGAGACAGCG SEQ ID NO: 106 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 18 RT 34 pegRNA TCGGTGCCATCCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCG (Artificial Sequence) GATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCC GGGCGGCGGAGACAGCG SEQ ID NO: 107 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 18 RT 29 pegRNA CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGATC (Artificial Sequence) CTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCG GAGACAGCG SEQ ID NO: 108 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 18 RT 24 pegRNA TCGGTGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGATCCTG (Artificial Sequence) ACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCGGA GACAGCG SEQ ID NO: 109 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 18 RT 19 pegRNA TCGGTGCGGGGGTCGCAGTCGCCATGCCGGATGATCCTGACGAC (Artificial Sequence) GGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCGGAGACAG CG SEQ ID NO: 110 GCGTGGTGGGGCCGCCAGCGGTTTTAGAGCTAGAAATAGCAAGT LMNB1 N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA Nicking guide 1 + 46 GTCGGTGC (Artificial Sequence) SEQ ID NO: 111 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 42 GGTGCGACGAGCGCGGCGATATCATCATCCATGGGGATGATCCTG pegRNA 
ACGACGGAGACCGCCGTCGTCGACAAGCCGGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 112 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 40 GGTGCGACGAGCGCGGCGATATCATCATCCATGGGATGATCCTGA pegRNA CGACGGAGACCGCCGTCGTCGACAAGCCGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 113 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 38 GGTGCGACGAGCGCGGCGATATCATCATCCATGGATGATCCTGAC pegRNA GACGGAGACCGCCGTCGTCGACAAGCCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 114 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 36 GGTGCGACGAGCGCGGCGATATCATCATCCATGGTGATCCTGACG pegRNA ACGGAGACCGCCGTCGTCGACAAGCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 115 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 44 CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGCGGATGATCC pegRNA v2 TGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCGGGCGGCGG (Artificial Sequence) AGA SEQ ID NO: 116 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 42 CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGGGATGATCCT pegRNA v2 GACGACGGAGACCGCCGTCGTCGACAAGCCGGCGGGCGGCGGAG (Artificial Sequence) A SEQ ID NO: 117 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 40 CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGGATGATCCTG pegRNA v2 ACGACGGAGACCGCCGTCGTCGACAAGCCGCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 118 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 38 CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGATGATCCTGA pegRNA v2 CGACGGAGACCGCCGTCGTCGACAAGCCCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 119 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGT NOLC1 N-term PBS 
TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 18 RT 29 attB 46 GTCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCCCGGATG pegRNA ATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTCCTC (Artificial Sequence) CAGGCAATACGCG SEQ ID NO: 120 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGTT NOLC1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 46 CGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCCCGGATGATC pegRNA CTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTCCTCCAGG (Artificial Sequence) CAAT SEQ ID NO: 121 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGT NOLC1 N-term PBS TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 13 RT 29 attB 44 GTCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCCGGATGA pegRNA TCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCTCCTCCA (Artificial Sequence) GGCAAT
SEQ ID NO: 122 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGT NOLC1 N-term PBS TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 13 RT 29 attB 42 GTCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCGGATGAT pegRNA CCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGTCCTCCAGG (Artificial Sequence) CAAT SEQ ID NO: 123 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGT NOLC1 N-term PBS TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 13 RT 29 attB 40 GTCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCGATGATC pegRNA CTGACGACGGAGACCGCCGTCGTCGACAAGCCGTCCTCCAGGCA (Artificial Sequence) AT SEQ ID NO: 124 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGTT NOLC1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 29 attB 38 TCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCATGATCCT pegRNA GACGACGGAGACCGCCGTCGTCGACAAGCCTCCTCCAGGCAAT (Artificial Sequence) SEQ ID NO: 125 GAGCCGAGCACGAGGGGATACGTTTTAGAGCTAGAAATAGCAAGT NOLC1 nicking TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG guide-43 TCGGTGC (Artificial Sequence) SEQ ID NO: 126 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 20 attB 38 GGTGCGGCGATATCATCATCCATGGATGATCCTGACGACGGAGAC pegRNA CGCCGTCGTCGACAAGCCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 127 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 15 attB 38 GGTGCTATCATCATCCATGGATGATCCTGACGACGGAGACCGCCG pegRNA TCGTCGACAAGCCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 128 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 10 attB 38 GGTGCTCATCCATGGATGATCCTGACGACGGAGACCGCCGTCGTC pegRNA GACAAGCCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 129 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term PBS 9 AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG RT 20 attB 38 TCGGTGCGGCGATATCATCATCCATGGATGATCCTGACGACGGAG pegRNA ACCGCCGTCGTCGACAAGCCTGAGCTGCG (Artificial Sequence) SEQ ID NO: 130 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS 9 
AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC RT 15 attB 38 GGTGCTATCATCATCCATGGATGATCCTGACGACGGAGACCGCCG pegRNA TCGTCGACAAGCCTGAGCTGCG (Artificial Sequence) SEQ ID NO: 131 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS 9 AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC RT 10 attB 38 GGTGCTCATCCATGGATGATCCTGACGACGGAGACCGCCGTCGTC pegRNA GACAAGCCTGAGCTGCG (Artificial Sequence) SEQ ID NO: 132 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 20 attB 38 TCGGTGCCGGGGGTCGCAGTCGCCATGATGATCCTGACGACGGA pegRNA GACCGCCGTCGTCGACAAGCCCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 133 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 15 attB 38 TCGGTGCGTCGCAGTCGCCATGATGATCCTGACGACGGAGACCG pegRNA CCGTCGTCGACAAGCCCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 134 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 10 attB 38 TCGGTGCAGTCGCCATGATGATCCTGACGACGGAGACCGCCGTC pegRNA GTCGACAAGCCCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 135 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 9 RT 20 attB 38 CGGTGCCGGGGGTCGCAGTCGCCATGATGATCCTGACGACGGAGA pegRNA CCGCCGTCGTCGACAAGCCCGGGCGGCG (Artificial Sequence) SEQ ID NO: 136 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 9 RT 15 attB 38 TCGGTGCGTCGCAGTCGCCATGATGATCCTGACGACGGAGACCG pegRNA CCGTCGTCGACAAGCCCGGGCGGCG (Artificial Sequence) SEQ ID NO: 137 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 9 RT 10 attB 38 CGGTGCAGTCGCCATGATGATCCTGACGACGGAGACCGCCGTCGT pegRNA CGACAAGCCCGGGCGGCG (Artificial Sequence) SEQ ID NO: 138 GAGAAGCGGCGTCCGGGGCTAGTTTTAGAGCTAGAAATAGCAAGT SUPT16H N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG PBS 13 RT 24 Bxb1- 
TCGGTGCTCTTTGTCCAGAGTCACAGCCATACCGGATGATCCTGAC GT_Initial length GACGGAGACCGCCGTCGTCGACAAGCCGGCCCCCCGGACGCCGC (Artificial Sequence) SEQ ID NO: 139 GGGCACGGGGCCATGTACAAGTTTTAGAGCTAGAAATAGCAAGT SRRM2 N-term PBS TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 13 RT 24 Bxb1 GTCGGTGCGGCGTCGGCAGCCCGATCCCGTTGCCGGATGATCCT Initial length GACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTACATGGCCC (Artificial Sequence) CGT SEQ ID NO: 140 GTGTCAGGTGGGGCGGGGCTAGTTTTAGAGCTAGAAATAGCAAG DEPDC4 N-term TTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCG PBS 18 RT 24 Bxb1 AGTCGGTGCGCTGGCTCCTCCCCTGGCACCATACCGGATGATCCT Initial length GACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCCCCGCCCCA (Artificial Sequence) CCTGACAC SEQ ID NO: 141 GAGTGGGTCAGACGAGCAGGAGTTTTAGAGCTAGAAATAGCAAGT NES N-term PBS 13 TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG RT 29 Bxb1 Initial TCGGTGCGATGGAGGGCTGCATGGGGGAGGAGTCGCCGGATGATC length CTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGCTCGTCT (Artificial Sequence) GACC SEQ ID NO: 142 GCAGCCACCCGCTCTCGGCCCGTTTTAGAGCTAGAAATAGCAAG SUPT16H nicking TTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCG guide-53 AGTCGGTGC (Artificial Sequence) SEQ ID NO: 143 GTGTAGTCAGGCCGCTCACCCGTTTTAGAGCTAGAAATAGCAAG SRRM2 N-term TTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCG nicking guide 1 + 87 AGTCGGTGC (Artificial Sequence) SEQ ID NO: 144 GCTGACAAGTCTACGGAACCTGTTTTAGAGCTAGAAATAGCAAG DEPDC4 N-term TTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCG Nicking guide 1 + 59 AGTCGGTGC (Artificial Sequence) SEQ ID NO: 145 GCTCCTCCAGCGCCTTGACCGTTTTAGAGCTAGAAATAGCAAGTTA NES N-term Nicking AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC guide 2 + 9 GGTGC (Artificial Sequence) SEQ ID NO: 146 GCTATTCTCGCAGCTCACCA HITI_ACTB_guide (Artificial Sequence) SEQ ID NO: 147 AGAAGCGGCGTCCGGGGCTA HITI_SUPT16H_guide (Artificial Sequence) SEQ ID NO: 148 GGGCACGGGGCCATGTACAA HITI_SRRM2_guide (Artificial Sequence) SEQ ID NO: 149 GCGTATTGCCTGGAGGATGG HITI_NOLC1_guide (Artificial Sequence) SEQ ID NO: 150 TGTCAGGTGGGGCGGGGCTA HITI_DEPDC4_guide (Artificial Sequence) SEQ ID 
NO: 151 AGTGGGTCAGACGAGCAGGA HITI_NES_guide (Artificial Sequence) SEQ ID NO: 152 GCTGTCTCCGCCGCCCGCCA HITI_LMNB1_guide (Artificial Sequence) SEQ ID NO: 153 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT HDR Cas9 ACTB AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG guide TCGGTGC (Artificial Sequence) SEQ ID NO: 154 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCGGATGATCC original length TGACGACGGAGXXCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA pegRNAs for GAA dinucleotides XX: CG, GC, AT, TA, GG, TT, GA, AG, CC, TC, CT, AA, TG, GT, CA, or (Artificial Sequence) AC SEQ ID NO: 155 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 pegRNA GGTGCGACGAGCGCGGCGATATCATCATCCATGCCGGATGATCCT with attB 46 GT for GACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAG fusion AA (Artificial Sequence) SEQ ID NO: 156 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 pegRNA GGTGCGACGAGCGCGGCGATATCATCATCCATGCCGGATGATCCT with attB 46 CT for GACGACGGAGAGCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA multiplexing GAA (Artificial Sequence) SEQ ID NO: 157 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGTT NOLC1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 18 RT 29 pegRNA CGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCCCGGATGATC with attB 46 GA for CTGACGACGGAGTCCGCCGTCGTCGACAAGCCGGCCTCCTCCAGG multiplexing CAATACGCG (Artificial Sequence) SEQ ID NO: 158 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 18 RT 29 pegRNA CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGATC with attB 46 AG for CTGACGACGGAGCTCGCCGTCGTCGACAAGCCGGCCCGGGCGGCG multiplexing GAGACAGCG (Artificial Sequence) SEQ ID NO: 159 GTCACCTCCAATGACTAGGG EMX1 Cas9 guide 1 (Artificial Sequence) SEQ ID NO: 160 GGGCAACCACAAACCCACGA EMX1 Cas9 guide 2 (Artificial Sequence) SEQ ID NO: 161 
GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 56 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGCTATGCCGGAT pegRNA GATCCTGACGACGGAGTCCGCCGTCGTCGACAAGCCGGCCCTAGC (Artificial Sequence) TGAGCTGCGAGAA SEQ ID NO: 162 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 51 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGTGCCGGATGAT pegRNA CCTGACGACGGAGTCCGCCGTCGTCGACAAGCCGGCCCTATGAGC (Artificial Sequence) TGCGAGAA SEQ ID NO: 163 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 46 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCGGATGATCC pegRNA TGACGACGGAGTCCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA (Artificial Sequence) GAA SEQ ID NO: 164 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 41 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGGGATGATCCTG pegRNA ACGACGGAGTCCGCCGTCGTCGACAAGCCGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 165 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 36 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGTGATCCTGACG pegRNA ACGGAGTCCGCCGTCGTCGACAAGCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 166 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 31 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGATCCTGACGAC pegRNA GGAGTCCGCCGTCGTCGACATGAGCTGCGAGAA
(Artificial Sequence) SEQ ID NO: 167 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 26 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCTGACGACGG pegRNA AGTCCGCCGTCGTCGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 168 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 21 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGTGACGACGGAG pegRNA TCCGCCGTCGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 169 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 16 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGACGACGGAGTC pegRNA CGCCGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 170 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 11 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGGACGGAGTCCG pegRNA TGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 171 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 6 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGCGGAGTTGAGC pegRNA TGCGAGAA (Artificial Sequence) SEQ ID NO: 172 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT PBS_18_RT_34_with_ CGGTGCTCGACGACGAGCGCGGCGATATCATCATCCATGGTACCG Lox71_Cre TTCGTATAGCATACATTATACGAAGTTATTGAGCTGCGAGAATAG pegRNA CC (Artificial Sequence) SEQ ID NO: 173 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT PBS_18_RT_29_with_ CGGTGCGACGAGCGCGGCGATATCATCATCCATGGTACCGTTCGT Lox71_Cre ATAGCATACATTATACGAAGTTATTGAGCTGCGAGAATAGCC pegRNA (Artificial Sequence) SEQ ID NO: 174 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT PBS_13_RT_34_with_ CGGTGCTCGACGACGAGCGCGGCGATATCATCATCCATGGTACCG Lox71_Cre TTCGTATAGCATACATTATACGAAGTTATTGAGCTGCGAGAA pegRNA 
(Artificial Sequence) SEQ ID NO: 175 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT PBS_13_RT_16_with_ CGGTGCATATCATCATCCATGGTACCGTTCGTATAGCATACATTAT Lox71_Cre ACGAAGTTATTGAGCTGCGAGAA pegRNA (Artificial Sequence) SEQ ID NO: 176 CCCCACGATGGAGGGGAAGAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT Nicking guide 2 + 93 CGGTGC guide (Artificial Sequence) SEQ ID NO: 177 CCTTCTCCTGGAGCCGCGACGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT Nicking guide 2 + 87 CGGTGC guide (Artificial Sequence)
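The description labels in the listing above (for example "PBS 13 RT 29 attB 46") give the lengths of the 3' extension segments of each pegRNA. The following is a minimal sketch, not part of the disclosure, that splits SEQ ID NO: 84 into its segments. It assumes a 20-nt spacer and a 5'-to-3' order of spacer, scaffold, RT template, attB, and PBS; that order is inferred from the label and is not stated in the listing. The function names `reverse_complement` and `split_pegrna` are hypothetical.

```python
# Sketch only: segment order (spacer, scaffold, RT template, att site, PBS)
# and the 20-nt spacer length are assumptions, not statements from the listing.

# SEQ ID NO: 84 (ACTB N-term PBS 13 RT 29 attB 46 pegRNA), written as DNA.
PEGRNA_SEQ84 = (
    "GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA"
    "AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC"
    "GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCGGATGATCC"
    "TGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA"
    "GAA"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_pegrna(seq: str, spacer_len: int, rt_len: int,
                 att_len: int, pbs_len: int) -> dict:
    """Split a pegRNA into segments using the lengths from its label.

    The scaffold is whatever lies between the spacer and the 3' extension.
    """
    ext_len = rt_len + att_len + pbs_len
    if spacer_len + ext_len > len(seq):
        raise ValueError("segment lengths exceed sequence length")
    ext_start = len(seq) - ext_len
    ext = seq[ext_start:]
    return {
        "spacer": seq[:spacer_len],
        "scaffold": seq[spacer_len:ext_start],
        "rt_template": ext[:rt_len],
        "att_site": ext[rt_len:rt_len + att_len],
        "pbs": ext[rt_len + att_len:],
    }


parts = split_pegrna(PEGRNA_SEQ84, spacer_len=20, rt_len=29,
                     att_len=46, pbs_len=13)
for name, segment in parts.items():
    print(f"{name:12s} {len(segment):3d} nt  {segment}")
```

Under these assumptions the spacer equals SEQ ID NO: 146, the att segment equals SEQ ID NO: 211 of Table 4, and the 13-nt PBS is the reverse complement of spacer positions 5 to 17, which is consistent with a nick three nucleotides upstream of the PAM.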
[0310] Sequences of insertion sites can be found in Table 4 below.
TABLE-US-00005 TABLE 4 FORWARD SEQUENCE (5'-3') REVERSE SEQUENCE (5'-3') DESCRIPTION/ SEQ ID SEQ ID SOURCE NO Sequence NO Sequence Bxb1_attP_GT_ 178 GTGGTTTGTCTGGTC 179 TGGGTTTGTACCGTA original_site AACCACCGCGGTCT CACCACTGAGACCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_C 180 GTGGTTTGTCTGGTC 181 TGGGTTTGTACCGTA G_site AACCACCGCGCGCT CACCACTGAGCGCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_G 182 GTGGTTTGTCTGGTC 183 TGGGTTTGTACCGTA C_site AACCACCGCGGCCT CACCACTGAGGCCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_AT_ 184 GTGGTTTGTCTGGTC 185 TGGGTTTGTACCGTA site AACCACCGCGATCT CACCACTGAGATCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_TA_ 186 GTGGTTTGTCTGGTC 187 TGGGTTTGTACCGTA site AACCACCGCGTACT CACCACTGAGTACG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_G 188 GTGGTTTGTCTGGTC 189 TGGGTTTGTACCGTA G_site AACCACCGCGGGCT CACCACTGAGCCCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_TT_ 190 GTGGTTTGTCTGGTC 191 TGGGTTTGTACCGTA site AACCACCGCGTTCTC CACCACTGAGAACG (Artificial AGTGGTGTACGGTA CGGTGGTTGACCAG Sequence) CAAACCCA ACAAACCAC Bxb1_attP_G 192 GTGGTTTGTCTGGTC 193 TGGGTTTGTACCGTA A_site AACCACCGCGGACT CACCACTGAGTCCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_A 194 GTGGTTTGTCTGGTC 195 TGGGTTTGTACCGTA G_site AACCACCGCGAGCT CACCACTGAGCTCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_CC_ 196 GTGGTTTGTCTGGTC 197 TGGGTTTGTACCGTA site AACCACCGCGCCCT CACCACTGAGGGCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_TC_ 198 GTGGTTTGTCTGGTC 199 TGGGTTTGTACCGTA site AACCACCGCGTCCTC CACCACTGAGGACG (Artificial AGTGGTGTACGGTA CGGTGGTTGACCAG Sequence) CAAACCCA ACAAACCAC Bxb1_attP_CT_ 200 GTGGTTTGTCTGGTC 201 TGGGTTTGTACCGTA site AACCACCGCGCTCTC 
CACCACTGAGAGCG (Artificial AGTGGTGTACGGTA CGGTGGTTGACCAG Sequence) CAAACCCA ACAAACCAC Bxb1_attP_A 202 GTGGTTTGTCTGGTC 203 TGGGTTTGTACCGTA A_site AACCACCGCGAACT CACCACTGAGTTCGC (Artificial CAGTGGTGTACGGT GGTGGTTGACCAGA Sequence) ACAAACCCA CAAACCAC Bxb1_attP_C 204 GTGGTTTGTCTGGTC 205 TGGGTTTGTACCGTA A_site AACCACCGCGCACT CACCACTGAGTGCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_A 206 GTGGTTTGTCTGGTC 207 TGGGTTTGTACCGTA C_site AACCACCGCGACCT CACCACTGAGGTCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_TG_ 208 GTGGTTTGTCTGGTC 209 TGGGTTTGTACCGTA site AACCACCGCGTGCT CACCACTGAGCACG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attB_46_ 210 GGCCGGCTTGTCGA 211 CCGGATGATCCTGA GT_ CGACGGCGGTCTCC CGACGGAGACCGCC original_site GTCGTCAGGATCATC GTCGTCGACAAGCC (Artificial CGG GGCC Sequence) Bxb1_attB_46_ 212 GGCCGGCTTGTCGA 213 CCGGATGATCCTGA AA_site CGACGGCGAACTCC CGACGGAGTTCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 214 GGCCGGCTTGTCGA 215 CCGGATGATCCTGA GA_site CGACGGCGGACTCC CGACGGAGTCCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 216 GGCCGGCTTGTCGA 217 CCGGATGATCCTGA CA_site CGACGGCGCACTCC CGACGGAGTGCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 218 GGCCGGCTTGTCGA 219 CCGGATGATCCTGA TA_site CGACGGCGTACTCC CGACGGAGTACGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 220 GGCCGGCTTGTCGA 221 CCGGATGATCCTGA AG_site CGACGGCGAGCTCC CGACGGAGCTCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 222 GGCCGGCTTGTCGA 223 CCGGATGATCCTGA GG_site CGACGGCGGGCTCC CGACGGAGCCCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 224 GGCCGGCTTGTCGA 225 CCGGATGATCCTGA CG_site CGACGGCGCGCTCC CGACGGAGCGCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 226 GGCCGGCTTGTCGA 227 
CCGGATGATCCTGA TG_site CGACGGCGTGCTCC CGACGGAGCACGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 228 GGCCGGCTTGTCGA 229 CCGGATGATCCTGA AC_site CGACGGCGACCTCC CGACGGAGGTCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 230 GGCCGGCTTGTCGA 231 CCGGATGATCCTGA GC_site CGACGGCGGCCTCC CGACGGAGGCCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 232 GGCCGGCTTGTCGA 233 CCGGATGATCCTGA CC_site CGACGGCGCCCTCC CGACGGAGGGCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 234 GGCCGGCTTGTCGA 235 CCGGATGATCCTGA TC_site CGACGGCGTCCTCC CGACGGAGGACGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 236 GGCCGGCTTGTCGA 237 CCGGATGATCCTGA AT_site CGACGGCGATCTCC CGACGGAGATCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 238 GGCCGGCTTGTCGA 239 CCGGATGATCCTGA CT_site CGACGGCGCTCTCC CGACGGAGAGCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 240 GGCCGGCTTGTCGA 241 CCGGATGATCCTGA TT_site CGACGGCGTTCTCCG CGACGGAGAACGCC (Artificial TCGTCAGGATCATCC GTCGTCGACAAGCC Sequence) GG GGCC Bxb1_attB_38_ 242 GGCTTGTCGACGAC 243 ATGATCCTGACGAC GT_site GGCGGTCTCCGTCGT GGAGACCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 244 GGCTTGTCGACGAC 245 ATGATCCTGACGAC AA_site GGCGAACTCCGTCG GGAGTTCGCCGTCGT (Artificial TCAGGATCAT CGACAAGCC Sequence) Bxb1_attB_38_ 246 GGCTTGTCGACGAC 247 ATGATCCTGACGAC GA_site GGCGGACTCCGTCG GGAGTCCGCCGTCG (Artificial TCAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 248 GGCTTGTCGACGAC 249 ATGATCCTGACGAC CA_site GGCGCACTCCGTCGT GGAGTGCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 250 GGCTTGTCGACGAC 251 ATGATCCTGACGAC TA_site GGCGTACTCCGTCGT GGAGTACGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 252 GGCTTGTCGACGAC 253 ATGATCCTGACGAC AG_site GGCGAGCTCCGTCG GGAGCTCGCCGTCG (Artificial TCAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 
254 GGCTTGTCGACGAC 255 ATGATCCTGACGAC GG_site GGCGGGCTCCGTCG GGAGCCCGCCGTCG (Artificial TCAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 256 GGCTTGTCGACGAC 257 ATGATCCTGACGAC CG_site GGCGCGCTCCGTCGT GGAGCGCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 258 GGCTTGTCGACGAC 259 ATGATCCTGACGAC TG_site GGCGTGCTCCGTCGT GGAGCACGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 260 GGCTTGTCGACGAC 261 ATGATCCTGACGAC AC_site GGCGACCTCCGTCGT GGAGGTCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 262 GGCTTGTCGACGAC 263 ATGATCCTGACGAC GC_site GGCGGCCTCCGTCGT GGAGGCCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 264 GGCTTGTCGACGAC 265 ATGATCCTGACGAC CC_site GGCGCCCTCCGTCGT GGAGGGCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 266 GGCTTGTCGACGAC 267 ATGATCCTGACGAC TC_site GGCGTCCTCCGTCGT GGAGGACGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 268 GGCTTGTCGACGAC 269 ATGATCCTGACGAC AT_site GGCGATCTCCGTCGT GGAGATCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 270 GGCTTGTCGACGAC 271 ATGATCCTGACGAC CT_site GGCGCTCTCCGTCGT GGAGAGCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 272 GGCTTGTCGACGAC 273 ATGATCCTGACGAC TT_site GGCGTTCTCCGTCGT GGAGAACGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Cre Lox 66 274 TACCGTTCGTATAAT 275 ATAACTTCGTATAGC site GTATGCTATACGAA ATACATTATACGAA (Artificial GTTAT CGGTA
Sequence) Cre Lox 71 276 ATAACTTCGTATAAT 277 TACCGTTCGTATAGC site GTATGCTATACGAA ATACATTATACGAA (Artificial CGGTA GTTAT Sequence) TP901-1 278 TTTACCTTGATTGAG 279 CACAATTAACATCTC minimal attB ATGTTAATTGTG AATCAAGGTAAA site (Artificial Sequence) TP901-1 280 GCGAGTTTTTATTTC 281 AAAGGAGTTTTTTAG minimal attP GTTTATTTCAATTAA TTACCTTAATTGAAA site GGTAACTAAAAAAC TAAACGAAATAAAA (Artificial TCCTTT ACTCGC Sequence) PhiBT1 282 CTGGATCATCTGGAT 283 CAGGTTTTTGACGAA minimal attB CACTTTCGTCAAAAA AGTGATCCAGATGA site CCTG TCCAG (Artificial Sequence) PhiBT1 284 TTCGGGTGCTGGGTT 285 TGGTGCTGAGTAGTT minimal attP GTTGTCTCTGGACAG TCCCATGGATCACTG site TGATCCATGGGAAA TCCAGAGACAACAA (Artificial CTACTCAGCACCA CCCAGCACCCGAA Sequence)
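Each entry in the table above lists a forward sequence and a reverse sequence. A minimal sketch of a consistency check, assuming the two columns of each entry are intended to be reverse complements of one another, is given below. The helper name `reverse_complement` and the dictionary `PAIRS` are illustrative and are not part of the disclosure; the sequences are copied from the table with the column wraps joined.

```python
# Illustrative helper (not from the disclosure): reverse-complement a DNA string.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, ignoring whitespace."""
    cleaned = "".join(seq.split())
    return cleaned.translate(_COMPLEMENT)[::-1]


# Forward/reverse pairs copied from the table (wrapped chunks joined).
PAIRS = {
    "Bxb1_attB_38_GT_site": (
        "GGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCAT",  # SEQ ID NO: 242
        "ATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCC",  # SEQ ID NO: 243
    ),
    "TP901-1 minimal attB site": (
        "TTTACCTTGATTGAGATGTTAATTGTG",  # SEQ ID NO: 278
        "CACAATTAACATCTCAATCAAGGTAAA",  # SEQ ID NO: 279
    ),
}


def check_pairs(pairs):
    """Map each site name to True when reverse == revcomp(forward)."""
    return {
        name: reverse_complement(fwd) == rev.upper()
        for name, (fwd, rev) in pairs.items()
    }


if __name__ == "__main__":
    for name, ok in check_pairs(PAIRS).items():
        print(name, "consistent" if ok else "MISMATCH")
```

The same check can be applied to any other row by adding its two sequences to `PAIRS`.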
[0311] Sequences of Bxb1 and RT mutants can be found in Table 6 below.
TABLE-US-00006 TABLE 6
SEQ ID NO/ DESCRIPTION/ SOURCE    FORWARD SEQUENCE (5'-3')
SEQ ID NO: 286 Bxb1_mut_V368A (Artificial Sequence)    AAAAGTGTGGGCTGCAGGATCTGA
SEQ ID NO: 287 Bxb1_mut_E379A (Artificial Sequence)    GGAGCTGGCAGCTGTCAATGCC
SEQ ID NO: 288 Bxb1_mut_E383A (Artificial Sequence)    AGTCAATGCCGCTCTCGTGGA
SEQ ID NO: 403 RT_mut_L139P (Artificial Sequence)    TTGAGCGGGCCCCCACCGT
SEQ ID NO: 289 RT_mut_E562Q (Artificial Sequence)    CAGCGGGCTCAGCTGATAGCA
SEQ ID NO: 290 RT_mut_D653N (Artificial Sequence)    CGGATGGCTAACCAAGCGGCC
SEQ ID NO: 404 RT(1-478)_Sto7d fusion    atgactcactatcaggccttgctt ttggacacggaccgggtccagttc ggaccggtggtagccctgaacccg gctacgctgctcccactgcctgag gaagggctgcaacacaactgcctt gatGGGACAGGTGGCGGTGGTGTC ACCGTCAAGTTCAAGTACAAGGGT GAGGAACTTGAAGTTGATATTAGC AAAATCAAGAAGGTTTGGCGCGTT GGTAAAATGATATCTTTTACTTAT GACGACAACGGCAAGACAGGTAGA GGGGCAGTGTCTGAGAAAGACGCC CCCAAGGAGCTGTTGCAAATGTTG GAAAAGTCTGGGAAAAAGtctggc ggctcaaaaagaaccgccgacggc agcgaattcgagcccaagaagaag aggaaagtc
[0312] Sequences of primers, probes and restriction enzymes used in ddPCR readout can be found in Table 7 below.
TABLE-US-00007 TABLE 7 SEQ Forward SEQ Reverse SEQ Restriction Locus Cargo ID NO: Primer IN NO: Primer Probe ID NO: Enzymes ACTB GFP 291 CCCGGCTTCCTTTGTCC 292 GAACTCCACGCCGTTCA /56- 405 Eco91I, (pDY0186) FAM/C HindIII C GGC TTG T/ZEN/ C GAC GAC GGC G/3IAB kFQ/ ACTB TP90-1 293 CCCGGCTTCCTTTGTCC 294 AACCACAACTAGAATGCA /56- 406 None GFP GTGA FAM/T (pDY0333) G CTA TTG C/ZEN/ T TTA TTT GTG GGC CCG/ 31ABk FQ/ ACTB TP90-1 295 CCCGGCTTCCTTTGTCC 296 GAACTCCACGCCGTTCA /56- 407 None rc GFP FAM/ (pDY0334) CC ATG AAG A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/ ACTB PhiBT1 297 CCCGGCTTCCTTTGTCC 298 AACCACAACTAGAATGCA /56- 406 None GFP GTGA FAM/T (pDY0367) G CTA TTG C/ZEN/ T TTA TTT GTG GGC CCG/ 3IABk FQ/ ACTB PhiBT1 299 CCCGGCTTCCTTTGTCC 300 GAACTCCACGCCGTTCA /56- 407 None rc GFP FAM/ (pDY0368) CC ATG AAG A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/ LMNB1 GFP 301 TCCTTATCACGGTCCCGCTCG 302 GAACTCCACGCCGTTCA /56- 407 Eco91I, (pDY0186) FAM/ HindIII CC ATG AAG A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/ NOLC1 GFP 303 CGTCGACAACGGTAGTG 304 GAACTCCACGCCGTTCA /56- 407 Eco91I, (pDY0186) FAM/ HindIII CC ATG AAG A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/ SUPT16 H GFP 305 TCGCGTGATTCTCGGAAC 306 GAACTCCACGCCGTTCA /56- 407 Eco91I, (pDY0186) FAM/C HindIII C ATG AAG A/ZEN/ T CGA GTG CCG CAT CA/3IA BkFQ/ SRRM2 GFP 307 GGGCGGTAAGTGGTTAGTTT 308 GAACTCCACGCCGTTCA /56- 407 Eco91I, (pDY0186) FAM/ HindIII CC ATG AAG A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/ DEPDC4 GFP 309 AAGAGGCGGAGCCAGTA 310 GAACTCCACGCCGTTCA /56- 407 Eco91I, (pDY0186) FAM/ HindIII CC ATG AAG A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/ NES GFP 311 CTCCCTTCTCCCGGTGCCC 312 GAACTCCACGCCGTTCA /56- 405 Eco91I, (pDY0186) FAM/C HindIII C GGC TTG T/ZEN/ C GAC GAC GGC G/3IAB kFQ/ ACTB ACTB 313 CCCGGCTTCCTTTGTCC 314 GAACTCCACGCCGTTCA /56- 407 Eco91I HITI FAM/ template CC GFP ATG (pDY0219) AAG A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/ SRRM2 SRRM2 315 GGGCGGTAAGTGGTTAGTTT 316 GAACTCCACGCCGTTCA /56- 407 Eco91I HITI FAM/ template CC GFP ATG (aRY0182_A2) AAG A/ZE N/T CGA GTG 
CCG CAT CA/3I ABkF Q/ NOLC1 NOLC1 317 CGTCGACAACGGTAGTG 318 GAACTCCACGCCGTTCA /56- 407 Eco91I HITI FAM/ template CC GFP ATG (aRY0182_A3) AAG A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/ DEPDC4 DEPDC4 HITI 319 AAGAGGCGGAGCCAGTA 320 GAACTCCACGCCGTTCA /56- 407 Eco91I template FAM/ GFP CC (aRY0182_A5) ATG AAG A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/ NES NES 321 CTCCCTTCTCCCGGTGCCC 322 GAACTCCACGCCGTTCA /56- 407 Eco91I HITI FAM/ template CC GFP ATG (aRY0182_A7) AAG A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/ LMNB1 LMNB1 323 TCCTTATCACGGTCCCGCTCG 324 GAACTCCACGCCGTTCA /56- 407 Eco91I HITI FAM/ template GFP CC (aRY0182_A4) ATG AAG A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/
ACTB SERPI 325 CCCGGCTTCCTTTGTCC 326 GGCCTGCCAGCAGGAGGA /56- 405 EcoRI, NA FAM/ XhoI, (pDY0298) CC HindIII GGC TTG T/ZE N/C GAC GAC GGC G/3I ABkF Q/ ACTB CPS1 327 CCCGGCTTCCTTTGTCC 328 GGTGTGCAGTCACATTGG /56- 408 XhoI, (pDY299) TAAAGCC FAM/ HindIII AC AGC TTT C/ZE N/A AAG TGG TGA GGA CAC T/3IA BkFQ/ ACTB CFTR 329 CCCGGCTTCCTTTGTCC 330 GATGGGTCTAGTCCAGCT /56- 409 Eco91I, (pDY0373) AAAG FAM/ HindIII TAC GGT ACA/ ZEN/ AAC CC ACC CGA GAG A/3I ABkF Q/ ACTB NYESO 331 CCCGGCTTCCTTTGTCC 332 GAGAGACAAGGCTGCACA /56- 409 Eco47III, TRAC FAM/ HindIII (pDY0318) TAC GGT ACA/ ZEN/ AAC CC ACC CGA GAG A/3I ABkF Q/ NC_00 GFP 333 CCAGGTGAGAGTCAGGGTAGT 334 GAACTCCACGCCGTTCA /56- 405 Eco91I, 00 03 (pDY0186) GTTCA FAM/ HindIII CC GGC TTG T/ZE N/C GAC GAC GGC G/3I ABkF Q/ NC_00 GFP 335 AGGGACCTTTGCCTGTGTGAG 336 GAACTCCACGCCGTTCA /56- 405 Eco91I, 00 02 (pDY0186) TC FAM/ HindIII CC GGC TTG T/ZE N/C GAC GAC GGC G/3I ABkF Q/ NC_00 GFP 337 TCAGCTCTGTGCTGAGGCGAA 338 GAACTCCACGCCGTTCA /56- 405 Eco91I, 00 09 (pDY0186) FAM/ HindIII CC GGC TTG T/ZE N/C GAC GAC GGC G/3I ABkF Q/ chr6: GFP 339 AAGCCATCTCCCAGAATATCT 340 GAACTCCACGCCGTTCA /56- 405 Eco91I, 149045959 (pDY0186) GCTTAGAAATG FAM/ HindIII CC GGC TTG T/ZE N/C GAC GAC GGC G/3I ABkF Q/ chr16: GFP 341 GAGAGGAGCAACAGTGAGCAT 342 GAACTCCACGCCGTTCA /56- 405 Eco91I, 18607730 (pDY0186) GATG FAM/ HindIII CC GGC TTG T/ZE N/C GAC GAC GGC G/3I ABkF Q/ chr6: ACTB 343 AAGCCATCTCCCAGAATATCT 344 GAACTCCACGCCGTTCA /56- 405 Eco91I 149045959 HITI GCTTAGAAATG FAM/ template CC GFP GGC (pDY0219) TTG T/ZE N/C GAC GAC GGC G/3I ABkF Q/ chr16: ACTB 345 GAGAGGAGCAACAGTGAGCAT 346 GAACTCCACGCCGTTCA /56- 405 Eco91I 18607730 HITI GATG FAM/ template CC GFP GGC (pDY0219) TTG T/ZE N/C GAC GAC GGC G/3I ABkF Q/ ACTB CAG_Kozak_bGH_ 347 CCCGGCTTCCTTTGTCC 348 GGCTATGAACTAATGACC /56- 405 Eco91I, therapeutic_genes CCGT FAM/ HindIII generic CC minicircle GGC TTG T/ZE N/C GAC GAC GGC G/3I ABkF Q/ ACTB Hibit- 349 CCCGGCTTCCTTTGTCC 350 GGCCTGCCAGCAGGAGGA /56- 405 EcoRI, SERPI FAM/ XhoI, 
NA CC HindIII (pDY0405) GGC TTG T/ZE N/C GAC GAC GGC G/3I ABkF Q/ ACTB Hibit- 351 CCCGGCTTCCTTTGTCC 352 GGTGTGCAGTCACATTGG /56- 408 XhoI, CPS1 TAAAGCC FAM/ HindIII (pDY406) AC AGC TTT C/ZE N/A AAG TGG TGA GGA CAC T/3IA BkFQ/
[0313] Sequences of primers used for NGS readout can be found in Table 8 below.
TABLE-US-00008 TABLE 8 SEQ ID NO / DESCRIPTION / SOURCE ID SEQUENCE (5'-3') SEQ ID NO: 353 PD0966 ACACTCTTTCCCTACACGACGCTCTTCCGATCTCCGAC N-term ACTB Tn5 CTCGGC TCACAGCG readout F 1 (Artificial Sequence) SEQ ID NO: 354 PD0967 ACACTCTTTCCCTACACGACGCTCTTCCGATCTACCGA N-term ACTB Tn5 CCTCGG CTCACAGCG readout F 2 (Artificial Sequence) SEQ ID NO: 355 PD0968 ACACTCTTTCCCTACACGACGCTCTTCCGATCTGACCG N-term ACTB Tn5 ACCTCG GCTCACAGCG readout F 3 (Artificial Sequence) SEQ ID NO: 356 PD0969 ACACTCTTTCCCTACACGACGCTCTTCCGATCTTGACC N-term ACTB Tn5 GACCTC GGCTCACAGCG readout F 4 (Artificial Sequence) SEQ ID NO: 357 PD0970 ACACTCTTTCCCTACACGACGCTCTTCCGATCTCTGAC N-term ACTB Tn5 CGACCT CGGCTCACAGCG readout F 5 (Artificial Sequence) SEQ ID NO: 358 PD0971 ACACTCTTTCCCTACACGACGCTCTTCCGATCTACTGA N-term ACTB Tn5 CCGACC TCGGCTCACAGCG readout F 6 (Artificial Sequence) SEQ ID NO: 359 PD0972 ACACTCTTTCCCTACACGACGCTCTTCCGATCTTACTG N-term ACTB Tn5 ACCGAC CTCGGCTCACAGCG readout F 7 (Articial Sequence) SEQ ID NO: 360 PD0973 ACACTCTTTCCCTACACGACGCTCTTCCGATCTGTACT N-term ACTB Tn5 GACCGA CCTCGGCTCACAGCG readout F 8 (Artificial Sequence) SEQ ID NO: 361 FP0952 GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTCCAC ACTB N-term NGS CCAGCC AGCTCCC R for Cas14 indels (Artificial Sequence) SEQ ID NO: 362 PD0313 ACACTCTTTCCCTACACGACGCTCTTCCGATCTCCGGT NGS EMX1 GGCGCAT TGCCAC Forward 1 (Artificial Sequence) SEQ ID NO: 363 PD0314 ACACTCTTTCCCTACACGACGCTCTTCCGATCTACCGG NGS EMX1 TGGCGCA TTGCCAC Forward 2 (Artificial Sequence) SEQ ID NO: 364 PD0315 ACACTCTTTCCCTACACGACGCTCTTCCGATCTGACCG NGS EMX1 GTGGCGC ATTGCCAC Forward 3 (Artificial Sequence) SEQ ID NO: 365 PD0316 ACACTCTTTCCCTACACGACGCTCTTCCGATCTTGACC NGS EMX1 GGTGGCG CATTGCCAC Forward 4 (Artificial Sequence) SEQ ID NO: 366 PD0317 ACACTCTTTCCCTACACGACGCTCTTCCGATCTCTGAC NGS EMX1 CGGTGGC GCATTGCCAC Forward 5 (Artificial Sequence) SEQ ID NO: 367 PD0318 ACACTCTTTCCCTACACGACGCTCTTCCGATCTACTGA NGS EMX1 CCGGTGG CGCATTGCCAC Forward 6 (Artificial Sequence) SEQ ID NO: 368 PD0319 
ACACTCTTTCCCTACACGACGCTCTTCCGATCTTACTG NGS EMX1 ACCGGTG GCGCATTGCCAC Forward 7 (Artificial Sequence) SEQ ID NO: 369 PD0320 ACACTCTTTCCCTACACGACGCTCTTCCGATCTGTACT NGS EMX1 GACCGGT GGCGCATTGCCAC Forward 8 (Artificial Sequence) SEQ ID NO: 370 PD0321 GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTCAGA NGS EMX1 Reverse GTCCAGC TTGGGCCCA (Artificial Sequence)
[0314] Sequences of off-target sites can be found in Table 9 below.
TABLE-US-00009 TABLE 9
SEQ ID NO / DESCRIPTION / SOURCE    SEQUENCE (5'-3')
SEQ ID NO: 371 Cas9_chr6: 149045959 (Artificial Sequence)    GATATTTTCCCAGCTCACCA
SEQ ID NO: 372 Cas9_chr16: 18607730 (Artificial Sequence)    TCTATTCTCCCAGCTCCCCA
SEQ ID NO: 373 Bxb1_NC_000002 (Artificial Sequence)    AGCGGCTTCTGTCTCTGTGAGTGAGCTGGCGGTCTCCGTC
SEQ ID NO: 374 Bxb1_NC_000003 (Artificial Sequence)    GACTAGCCCACGCTCCGGTTCTGAGCCGCGACGGCGGTCTCCG
SEQ ID NO: 375 Bxb1_NC_000009 (Artificial Sequence)    CCCAGGGTCCCATGCGCTCCCCGGCCCTGACGGCGGTCTCC
[0315] Linker sequences can be found in Table 10 below.
TABLE-US-00010 TABLE 10
Description / Sequence (5'-3') / Amino acid sequence
A - P2A
  Sequence (5'-3'): GGAAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGCGACGTGGAGGAGAACCCTGGACCT (SEQ ID NO: 410)
  Amino acid sequence: GSGATNFSLLKQAGDVEENPGP (SEQ ID NO: 418)
B - (GGGS)3
  Sequence (5'-3'): GGGGGAGGAGGTTCTGGAGGCGGAGGCTCCGGAGGCGGAGGGTCA (SEQ ID NO: 411)
  Amino acid sequence: GGGGSGGGGSGGGGS (SEQ ID NO: 419)
C - GGGGS
  Sequence (5'-3'): GGAGGTGGCGGGAGC (SEQ ID NO: 412)
  Amino acid sequence: GGGGS (SEQ ID NO: 420)
D - PAPAP
  Sequence (5'-3'): CCCGCACCAGCGCCT (SEQ ID NO: 413)
  Amino acid sequence: PAPAP (SEQ ID NO: 421)
E - (EAAAK)3
  Sequence (5'-3'): GAGGCAGCTGCCAAGGAAGCCGCTGCCAAGGAGGCGGCCGCAAAG (SEQ ID NO: 414)
  Amino acid sequence: EAAAKEAAAKEAAAK (SEQ ID NO: 422)
F - XTEN
  Sequence (5'-3'): AGTGGGAGCGAGACCCCTGGGACTAGCGAGTCAGCTACACCCGAAAGC (SEQ ID NO: 415)
  Amino acid sequence: SGSETPGTSESATPES (SEQ ID NO: 423)
G - (GGS)6
  Sequence (5'-3'): GGGGGGTCAGGTGGATCCGGCGGAAGTGGCGGATCCGGTGGATCTGGCGGCAGT (SEQ ID NO: 416)
  Amino acid sequence: GGSGGSGGSGGSGGSGGS (SEQ ID NO: 424)
H - EAAAK
  Sequence (5'-3'): GAAGCTGCTGCTAAG (SEQ ID NO: 417)
  Amino acid sequence: EAAAK (SEQ ID NO: 425)
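Table 10 pairs each linker's nucleotide sequence with an amino acid sequence. As a hedged illustration, the sketch below translates each nucleotide sequence with the standard genetic code and compares the result with the listed peptide. The names `translate`, `CODON_TABLE`, and `LINKERS` are illustrative only; the sequences are copied from Table 10 with the column wraps joined.

```python
from itertools import product

# Standard genetic code, built in the conventional T/C/A/G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence (whitespace ignored) to protein."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# (nucleotide sequence, listed amino acid sequence) copied from Table 10.
LINKERS = {
    "A - P2A": (
        "GGAAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGCGACGTGGAGGAGAACCCTGGACCT",
        "GSGATNFSLLKQAGDVEENPGP",
    ),
    "B - (GGGS)3": (
        "GGGGGAGGAGGTTCTGGAGGCGGAGGCTCCGGAGGCGGAGGGTCA",
        "GGGGSGGGGSGGGGS",
    ),
    "C - GGGGS": ("GGAGGTGGCGGGAGC", "GGGGS"),
    "D - PAPAP": ("CCCGCACCAGCGCCT", "PAPAP"),
    "E - (EAAAK)3": (
        "GAGGCAGCTGCCAAGGAAGCCGCTGCCAAGGAGGCGGCCGCAAAG",
        "EAAAKEAAAKEAAAK",
    ),
    "F - XTEN": (
        "AGTGGGAGCGAGACCCCTGGGACTAGCGAGTCAGCTACACCCGAAAGC",
        "SGSETPGTSESATPES",
    ),
    "G - (GGS)6": (
        "GGGGGGTCAGGTGGATCCGGCGGAAGTGGCGGATCCGGTGGATCTGGCGGCAGT",
        "GGSGGSGGSGGSGGSGGS",
    ),
    "H - EAAAK": ("GAAGCTGCTGCTAAG", "EAAAK"),
}

if __name__ == "__main__":
    for name, (dna, peptide) in LINKERS.items():
        status = "matches" if translate(dna) == peptide else "DIFFERS"
        print(f"{name}: {status}")
```

A check of this kind can catch transcription errors when a linker is copied into a fusion construct.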
[0316] Exemplary fusion sequences can be found in Table 11 below.
TABLE-US-00011 Description Sequence SpCas9-XTEN- MKRTADGSEFESPKKKRKVDKKYSIGLDTN RT(1-478)-Sto7d- SVGWAVITDEYKVPSKKFKVLGNTDRHSIK GGGGS-BxbINT KNLIGALLFDSGETAEATRLKRTARRRYTR Amino acid RKNRICYLQEIFSNEMAKVDDSFFHRLEES SEQ ID NO: 376 FLVEEDKKHERHPIFGNIVDEVAYHEKYPT IYHLRKKLVDSTDKADLRLIYLALAHMIKF RGHFLIEGDLNPDNSDVDKLFIQLVQTYNQ LFEENPINASGVDAKAILSARLSKSRRLEN LIAQLPGEKKNGLFGNLIALSLGLTPNFKS NFDLAEDAKLQLSKDTYDDDLDNLLAQIGD QYADLFLAAKNLSDAILLSDILRVNTEITK APLSASMIKRYDEHHQDLTLLKALVRQQLP EKYKEIFFDQSKNGYAGYIDGGASQEEFYK FIKPILEKMDGTEELLVKLNREDLLRKQRT FDNGSIPHQIHLGELHAILRRQEDFYPFLK DNREKIEKILTFRIPYYVGPLARGNSRFAW MTRKSEETITPWNFEEVVDKGASAQSFIER MTNFDKNLPNEKVLPKHSLLYEYFTVYNEL TKVKYVTEGMRKPAFLSGEQKKAIVDLLFK TNRKVTVKQLKEDYFKKIECFDSVEISGVE DRFNASLGTYHDLLKIIKDKDFLDNEENED ILEDIVLTLTLFEDREMIEERLKTYAHLFD DKVMKQLKRRRYTGWGRLSRKLINGIRDKQ SGKTILDFLKSDGFANRNFMQLIHDDSLTF KEDIQKAQVSGQGDSLHEHIANLAGSPAIK KGILQTVKVVDELVKVMGRHKPENIVIEMA RENQTTQKGQKNSRERMKRIEEGIKELGSQ ILKEHPVENTQLQNEKLYLYYLQNGRDMYV DQELDINRLSDYDVDAIVPQSFLKDDSIDN KVLTRSDKNRGKSDNVPSEEVVKKMKNYWR QLLNAKLITQRKFDNLTKAERGGLSELDKA GFIKRQLVETRQITKHVAQILDSRMNTKYD ENDKLIREVKVITLKSKLVSDFRKDFQFYK VREINNYHHAHDAYLNAVVGTALIKKYPKL ESEFVYGDYKVYDVRKMIAKSEQEIGKATA KYFFYSNIMNFFKTEITLANGEIRKRPLIE TNGETGEIVWDKGRDFATVRKVLSMPQVNI VKKTEVQTGGFSKESILPKRNSDKLIARKK DWDPKKYGGFDSPTVAYSVLVVAKVEKGKS KKLKSVKELLGITIMERSSFEKNPIDFLEA KGYKEVKKDLIIKLPKYSLFELENGRKRML ASAGELQKGNELALPSKYVNFLYLASHYEK LKGSPEDNEQKQLFVEQHKHYLDEIIEQIS EFSKRVILADANLDKVLSAYNKHRDKPIRE QAENIIHLFTLTNLGAPAAFKYFDTTIDRK RYTSTKEVLDATLIHQSITGLYETRIDLSQ LGGDSGGSSGGSSGSETPGTSESATPESSG SETPGTSESATPESSGSETPGTSESATPES SGGSSGGSSTLNIEDEYRLHETSKEPDVSL GSTWLSDFPQAWAETGGMGLAVRQAPLIIP LKATSTPVSIKQYPMSQEARLGIKPHIQRL LDQGILVPCQSPWNTPLLPVKKPGTNDYRP VQDLREVNKRVEDIHPTVPNPYNLLSGPPP SHQWYTVLDLKDAFFCLRLHPTSQPLFAFE WRDPEMGISGQLTWTRLPQGFKNSPTLFNE ALHRDLADFRIQHPDLILLQYVDDLLLAAT SELDCQQGTRALLQTLGNLGYRASAKKAQI CQKQVKYLGYLLKEGQRWLTEARKETVMGQ PTPKTPRQLREFLGKAGFCRLFIPGFAEMA APLYPLTKPGTLFNWGPDQQKAYQEIKQAL LTAPALGLPDLTKPFELFVDEKQGYAKGVL 
TQKLGPWRRPVAYLSKKLDPVAAGWPPCLR MVAAIAVLTKDAGKLTMGQPLVILAPHAVE ALVKQPPDRWLSNARMTHYQALLLDTDRVQ FGPVVALNPATLLPLPEEGLQHNCLDGTGG GGVTVKFKYKGEELEVDISKIKKVWRVGKM ISFTYDDNGKTGRGAVSEKDAPKELLQMLE KSGKKSGGSKRTADSEFEPKKKRKVGGGGS PKKKRKVYPYDVPDYAGSRALVVIRLSRVT DATTSPERQLESCQQLCAQRGWDVVGVAED LDVSGAVDPFDRKRRPNLARWLAFEEQPFD VIVAYRVDRLTRSIRHLQQLVHWAEDHKKL VVSATEAHFDTTTPFAAVVIALMGTVAQME LEAIKERNRSAAHFNIRAGKYRGSLPPWGY LPTRVDGEWRLVPDPVQRERILEVYHRVVD NHEPLHLVAHDLNRRGVLSPKDYFAQLQGR EPQGREWSATALKRSMISEAMLGYATLNGK TVRDDDGAPLVRAEPILTREQLEALRAELV KTSRAKPAVSTPSLLLRVLFCAVCGEPAYK FAGGGRKHPRYRCRSMGFPKHCGNGTVAMA EWDAFCEEQVLDLLGDAERLEKVWVAGSDS AVELAEVNAELVDLTSLIGSPAYRAGSPQR EALDARIAALAARQEELEGLEARPSGWEWR ETGQRFGDWWREQDTAAKNTWLRSMNVRLT FDVRGGLTRTIDFGDLQEYEQHLRLGSVVE RLHTGMS SpCas9-XTEN- ATGAAACGGACAGCCGACGGAAGCGAGTTC RT(1-478)-Sto7d- GAGTCACCAAAGAAGAAGCGGAAAGTCGAC GGGGS-BxbINT AAGAAGTACAGCATCGGCCTGGACATCGGC Nucleic acid ACCAACTCTGTGGGCTGGGCCGTGATCACC SEQ ID NO: 377 GACGAGTACAAGGTGCCCAGCAAGAAATTC AAGGTGCTGGGCAACACCGACCGGCACAGC ATCAAGAAGAACCTGATCGGAGCCCTGCTG TTCGACAGCGGCGAAACAGCCGAGGCCACC CGGCTGAAGAGAACCGCCAGAAGAAGATAC ACCAGACGGAAGAACCGGATCTGCTATCTG CAAGAGATCTTCAGCAACGAGATGGCCAAG GTGGACGACAGCTTCTTCCACAGACTGGAA GAGTCCTTCCTGGTGGAAGAGGATAAGAAG CACGAGCGGCACCCCATCTTCGGCAACATC GTGGACGAGGTGGCCTACCACGAGAAGTAC CCCACCATCTACCACCTGAGAAAGAAACTG GTGGACAGCACCGACAAGGCCGACCTGCGG CTGATCTATCTGGCCCTGGCCCACATGATC AAGTTCCGGGGCCACTTCCTGATCGAGGGC GACCTGAACCCCGACAACAGCGACGTGGAC AAGCTGTTCATCCAGCTGGTGCAGACCTAC AACCAGCTGTTCGAGGAAAACCCCATCAAC GCCAGCGGCGTGGACGCCAAGGCCATCCTG TCTGCCAGACTGAGCAAGAGCAGACGGCTG GAAAATCTGATCGCCCAGCTGCCCGGCGAG AAGAAGAATGGCCTGTTCGGAAACCTGATT GCCCTGAGCCTGGGCCTGACCCCCAACTTC AAGAGCAACTTCGACCTGGCCGAGGATGCC AAACTGCAGCTGAGCAAGGACACCTACGAC GACGACCTGGACAACCTGCTGGCCCAGATC GGCGACCAGTACGCCGACCTGTTTCTGGCC GCCAAGAACCTGTCCGACGCCATCCTGCTG AGCGACATCCTGAGAGTGAACACCGAGATC ACCAAGGCCCCCCTGAGCGCCTCTATGATC AAGAGATACGACGAGCACCACCAGGACCTG ACCCTGCTGAAAGCTCTCGTGCGGCAGCAG CTGCCTGAGAAGTACAAAGAGATTTTCTTC 
GACCAGAGCAAGAACGGCTACGCCGGCTAC ATTGACGGCGGAGCCAGCCAGGAAGAGTTC TACAAGTTCATCAAGCCCATCCTGGAAAAG ATGGACGGCACCGAGGAACTGCTCGTGAAG CTGAACAGAGAGGACCTGCTGCGGAAGCAG CGGACCTTCGACAACGGCAGCATCCCCCAC CAGATCCACCTGGGAGAGCTGCACGCCATT CTGCGGCGGCAGGAAGATTTTTACCCATTC CTGAAGGACAACCGGGAAAAGATCGAGAAG ATCCTGACCTTCCGCATCCCCTACTACGTG GGCCCTCTGGCCAGGGGAAACAGCAGATTC GCCTGGATGACCAGAAAGAGCGAGGAAACC ATCACCCCCTGGAACTTCGAGGAAGTGGTG GACAAGGGCGCTTCCGCCCAGAGCTTCATC GAGCGGATGACCAACTTCGATAAGAACCTG CCCAACGAGAAGGTGCTGCCCAAGCACAGC CTGCTGTACGAGTACTTCACCGTGTATAAC GAGCTGACCAAAGTGAAATACGTGACCGAG GGAATGAGAAAGCCCGCCTTCCTGAGCGGC GAGCAGAAAAAGGCCATCGTGGACCTGCTG TTCAAGACCAACCGGAAAGTGACCGTGAAG CAGCTGAAAGAGGACTACTTCAAGAAAATC GAGTGCTTCGACTCCGTGGAAATCTCCGGC GTGGAAGATCGGTTCAACGCCTCCCTGGGC ACATACCACGATCTGCTGAAAATTATCAAG GACAAGGACTTCCTGGACAATGAGGAAAAC GAGGACATTCTGGAAGATATCGTGCTGACC CTGACACTGTTTGAGGACAGAGAGATGATC GAGGAACGGCTGAAAACCTATGCCCACCTG TTCGACGACAAAGTGATGAAGCAGCTGAAG CGGCGGAGATACACCGGCTGGGGCAGGCTG AGCCGGAAGCTGATCAACGGCATCCGGGAC AAGCAGTCCGGCAAGACAATCCTGGATTTC CTGAAGTCCGACGGCTTCGCCAACAGAAAC TTCATGCAGCTGATCCACGACGACAGCCTG ACCTTTAAAGAGGACATCCAGAAAGCCCAG GTGTCCGGCCAGGGCGATAGCCTGCACGAG CACATTGCCAATCTGGCCGGCAGCCCCGCC ATTAAGAAGGGCATCCTGCAGACAGTGAAG GTGGTGGACGAGCTCGTGAAAGTGATGGGC CGGCACAAGCCCGAGAACATCGTGATCGAA ATGGCCAGAGAGAACCAGACCACCCAGAAG GGACAGAAGAACAGCCGCGAGAGAATGAAG CGGATCGAAGAGGGCATCAAAGAGCTGGGC AGCCAGATCCTGAAAGAACACCCCGTGGAA AACACCCAGCTGCAGAACGAGAAGCTGTAC CTGTACTACCTGCAGAATGGGCGGGATATG TACGTGGACCAGGAACTGGACATCAACCGG CTGTCCGACTACGATGTGGACGCTATCGTG CCTCAGAGCTTTCTGAAGGACGACTCCATC GACAACAAGGTGCTGACCAGAAGCGACAAG AACCGGGGCAAGAGCGACAACGTGCCCTCC GAAGAGGTCGTGAAGAAGATGAAGAACTAC TGGCGGCAGCTGCTGAACGCCAAGCTGATT ACCCAGAGAAAGTTCGACAATCTGACCAAG GCCGAGAGAGGCGGCCTGAGCGAACTGGAT AAGGCCGGCTTCATCAAGAGACAGCTGGTG GAAACCCGGCAGATCACAAAGCACGTGGCA CAGATCCTGGACTCCCGGATGAACACTAAG TACGACGAGAATGACAAGCTGATCCGGGAA GTGAAAGTGATCACCCTGAAGTCCAAGCTG GTGTCCGATTTCCGGAAGGATTTCCAGTTT TACAAAGTGCGCGAGATCAACAACTACCAC CACGCCCACGACGCCTACCTGAACGCCGTC 
GTGGGAACCGCCCTGATCAAAAAGTACCCT AAGCTGGAAAGCGAGTTCGTGTACGGCGAC TACAAGGTGTACGACGTGCGGAAGATGATC GCCAAGAGCGAGCAGGAAATCGGCAAGGCT ACCGCCAAGTACTTCTTCTACAGCAACATC ATGAACTTTTTCAAGACCGAGATTACCCTG GCCAACGGCGAGATCCGGAAGCGGCCTCTG ATCGAGACAAACGGCGAAACCGGGGAGATC GTGTGGGATAAGGGCCGGGATTTTGCCACC GTGCGGAAAGTGCTGAGCATGCCCCAAGTG AATATCGTGAAAAAGACCGAGGTGCAGACA GGCGGCTTCAGCAAAGAGTCTATCCTGCCC AAGAGGAACAGCGATAAGCTGATCGCCAGA AAGAAGGACTGGGACCCTAAGAAGTACGGC GGCTTCGACAGCCCCACCGTGGCCTATTCT GTGCTGGTGGTGGCCAAAGTGGAAAAGGGC AAGTCCAAGAAACTGAAGAGTGTGAAAGAG CTGCTGGGGATCACCATCATGGAAAGAAGC AGCTTCGAGAAGAATCCCATCGACTTTCTG GAAGCCAAGGGCTACAAAGAAGTGAAAAAG GACCTGATCATCAAGCTGCCTAAGTACTCC CTGTTCGAGCTGGAAAACGGCCGGAAGAGA ATGCTGGCCTCTGCCGGCGAACTGCAGAAG GGAAACGAACTGGCCCTGCCCTCCAAATAT GTGAACTTCCTGTACCTGGCCAGCCACTAT GAGAAGCTGAAGGGCTCCCCCGAGGATAAT GAGCAGAAACAGCTGTTTGTGGAACAGCAC AAGCACTACCTGGACGAGATCATCGAGCAG ATCAGCGAGTTCTCCAAGAGAGTGATCCTG GCCGACGCTAATCTGGACAAAGTGCTGTCC GCCTACAACAAGCACCGGGATAAGCCCATC AGAGAGCAGGCCGAGAATATCATCCACCTG TTTACCCTGACCAATCTGGGAGCCCCTGCC GCCTTCAAGTACTTTGACACCACCATCGAC CGGAAGAGGTACACCAGCACCAAAGAGGTG CTGGACGCCACCCTGATCCACCAGAGCATC ACCGGCCTGTACGAGACACGGATCGACCTG TCTCAGCTGGGAGGTGACTCTGGAGGATCT AGCGGAGGATCCTCTGGCAGCGAGACACCA GGAACAAGCGAGTCAGCAACACCAGAGAGC TCTGGTAGCGAGACACCCGGTACCAGTGAA AGCGCCACGCCAGAAAGCAGTGGGAGTGAG ACTCCGGGTACATCTGAATCAGCGACACCG GAATCAAGTGGCGGCAGCAGCGGCGGCAGC AGCACCCTAAATATAGAAGATGAGTATCGG CTACATGAGACCTCAAAAGAGCCAGATGTT TCTCTAGGGTCCACATGGCTGTCTGATTTT CCTCAGGCCTGGGCGGAAACCGGGGGCATG GGACTGGCAGTTCGCCAAGCTCCTCTGATC ATACCTCTGAAAGCAACCTCTACCCCCGTG TCCATAAAACAATACCCCATGTCACAAGAA GCCAGACTGGGGATCAAGCCCCACATACAG AGACTGTTGGACCAGGGAATACTGGTACCC TGCCAGTCCCCCTGGAACACGCCCCTGCTA CCCGTTAAGAAACCAGGGACTAATGATTAT AGGCCTGTCCAGGATCTGAGAGAAGTCAAC AAGCGGGTGGAAGACATCCACCCCACCGTG CCCAACCCTTACAACCTCTTGAGCGGGCCC CCACCGTCCCACCAGTGGTACACTGTGCTT
GATTTAAAGGATGCCTTTTTCTGCCTGAGA CTCCACCCCACCAGTCAGCCTCTCTTCGCC TTTGAGTGGAGAGATCCAGAGATGGGAATC TCAGGACAATTGACCTGGACCAGACTCCCA CAGGGTTTCAAAAACAGTCCCACCCTGTTT AATGAGGCACTGCACAGAGACCTAGCAGAC TTCCGGATCCAGCACCCAGACTTGATCCTG CTACAGTACGTGGATGACTTACTGCTGGCC GCCACTTCTGAGCTAGACTGCCAACAAGGT ACTCGGGCCCTGTTACAAACCCTAGGGAAC CTCGGGTATCGGGCCTCGGCCAAGAAAGCC CAAATTTGCCAGAAACAGGTCAAGTATCTG GGGTATCTTCTAAAAGAGGGTCAGAGATGG CTGACTGAGGCCAGAAAAGAGACTGTGATG GGGCAGCCTACTCCGAAGACCCCTCGACAA CTAAGGGAGTTCCTAGGGAAGGCAGGCTTC TGTCGCCTCTTCATCCCTGGGTTTGCAGAA ATGGCAGCCCCCCTGTACCCTCTCACCAAA CCGGGGACTCTGTTTAATTGGGGCCCAGAC CAACAAAAGGCCTATCAAGAAATCAAGCAA GCTCTTCTAACTGCCCCAGCCCTGGGGTTG CCAGATTTGACTAAGCCCTTTGAACTCTTT GTCGACGAGAAGCAGGGCTACGCCAAAGGT GTCCTAACGCAAAAACTGGGACCTTGGCGT CGGCCGGTGGCCTACCTGTCCAAAAAGCTA GACCCAGTAGCAGCTGGGTGGCCCCCTTGC CTACGGATGGTAGCAGCCATTGCCGTACTG ACAAAGGATGCAGGCAAGCTAACCATGGGA CAGCCACTAGTCATTCTGGCCCCCCATGCA GTAGAGGCACTAGTCAAACAACCCCCCGAC CGCTGGCTTTCCAACGCCCGGATGACTCAC TATCAGGCCTTGCTTTTGGACACGGACCGG GTCCAGTTCGGACCGGTGGTAGCCCTGAAC CCGGCTACGCTGCTCCCACTGCCTGAGGAA GGGCTGCAACACAACTGCCTTGATGGGACA GGTGGCGGTGGTGTCACCGTCAAGTTCAAG TACAAGGGTGAGGAACTTGAAGTTGATATT AGCAAAATCAAGAAGGTTTGGCGCGTTGGT AAAATGATATCTTTTACTTATGACGACAAC GGCAAGACAGGTAGAGGGGCAGTGTCTGAG AAAGACGCCCCCAAGGAGCTGTTGCAAATG TTGGAAAAGTCTGGGAAAAAGTCTGGCGGC TCAAAAAGAACCGCCGACGGCAGCGAATTC GAGCCCAAGAAGAAGAGGAAAGTCGGAGGT GGCGGGAGCCCAAAAAAGAAAAGAAAAGTG TATCCCTATGATGTCCCCGATTATGCCGGT TCAAGAGCCCTGGTCGTGATTAGACTGAGC CGAGTGACAGACGCCACCACAAGTCCCGAG AGACAGCTGGAATCATGCCAGCAGCTCTGT GCTCAGCGGGGTTGGGATGTGGTCGGCGTG GCAGAGGATCTGGACGTGAGCGGGGCCGTC GATCCATTCGACAGAAAGAGGAGGCCCAAC CTGGCAAGATGGCTCGCTTTCGAGGAACAG CCCTTTGATGTGATCGTCGCCTACAGAGTG GACCGGCTGACCCGCTCAATTCGACATCTC CAGCAGCTGGTGCATTGGGCTGAGGACCAC AAGAAACTGGTGGTCAGCGCAACAGAAGCC CACTTCGATACTACCACACCTTTTGCCGCT GTGGTCATCGCACTGATGGGCACTGTGGCC CAGATGGAGCTCGAAGCTATCAAGGAGCGA AACAGGAGCGCAGCCCATTTCAATATTAGG GCCGGTAAATACAGAGGCTCCCTGCCCCCT TGGGGATATCTCCCTACCAGGGTGGATGGG GAGTGGAGACTGGTGCCAGACCCCGTCCAG 
AGAGAGCGGATTCTGGAAGTGTACCACAGA GTGGTCGATAACCACGAACCACTCCATCTG GTGGCACACGACCTGAATAGACGCGGCGTG CTCTCTCCAAAGGATTATTTTGCTCAGCTG CAGGGAAGAGAGCCACAGGGAAGAGAATGG AGTGCTACTGCACTGAAGAGATCTATGATC AGTGAGGCTATGCTGGGTTACGCAACACTC AATGGCAAAACTGTCCGGGACGATGACGGA GCCCCTCTGGTGAGGGCTGAGCCTATTCTC ACCAGAGAGCAGCTCGAAGCTCTGCGGGCA GAACTGGTCAAGACTAGTCGCGCCAAACCT GCCGTGAGCACCCCAAGCCTGCTCCTGAGG GTGCTGTTCTGCGCCGTCTGTGGAGAGCCA GCATACAAGTTTGCCGGCGGAGGGCGCAAA CATCCCCGCTATCGATGCAGGAGCATGGGG TTCCCTAAGCACTGTGGAAACGGGACAGTG GCCATGGCTGAGTGGGACGCCTTTTGCGAG GAACAGGTGCTGGATCTCCTGGGTGACGCT GAGCGGCTGGAAAAAGTGTGGGTGGCAGGA TCTGACTCCGCTGTGGAGCTGGCAGAAGTC AATGCCGAGCTCGTGGATCTGACTTCCCTC ATCGGATCTCCTGCATATAGAGCTGGGTCC CCACAGAGAGAAGCTCTGGACGCACGAATT GCTGCACTCGCTGCTAGACAGGAGGAACTG GAGGGCCTGGAGGCCAGGCCCTCTGGATGG GAGTGGCGAGAAACCGGACAGAGGTTTGGG GATTGGTGGAGGGAGCAGGACACCGCAGCC AAGAACACATGGCTGAGATCCATGAATGTC CGGCTCACATTCGACGTGCGCGGTGGCCTG ACTCGAACCATCGATTTTGGCGACCTGCAG GAGTATGAACAGCACCTGAGACTGGGGTCC GTGGTCGAAAGACTGCACACTGGGATGTCC SpCas9 DKKYSIGLDIGTNSVGWAVITDEYKVPSKK Amino acid FKVLGNTDRHSIKKNLIGALLFDSGETAEA SEQ ID NO: 378 TRLKRTARRRYTRRKNRICYLQEIFSNEMA KVDDSFFHRLEESFLVEEDKKHERHPIFGN IVDEVAYHEKYPTIYHLRKKLVDSTDKADL RLIYLALAHMIKFRGHFLIEGDLNPDNSDV DKLFIQLVQTYNQLFEENPINASGVDAKAI LSARLSKSRRLENLIAQLPGEKKNGLFGNL IALSLGLTPNFKSNFDLAEDAKLQLSKDTY DDDLDNLLAQIGDQYADLFLAAKNLSDAIL LSDILRVNTEITKAPLSASMIKRYDEHHQD LTLLKALVRQQLPEKYKEIFFDQSKNGYAG YIDGGASQEEFYKFIKPILEKMDGTEELLV KLNREDLLRKQRTFDNGSIPHQIHLGELHA ILRRQEDFYPFLKDNREKIEKILTFRIPYY VGPLARGNSRFAWMTRKSEETITPWNFEEV VDKGASAQSSFIERMTNFDKNLPNEKVLPK HSLLYEYFTVYNELTKVKYVTEGMRKPAFL SGEQKKAIVDLLFKTNRKVTVKQLKEDYFK KIECFDSVEISGVEDRFNASLGTYHDLLKI IKDKDFLDNEENEDILEDIVLTLTLFEDRE MIEERLKTYAHLFDDKVMKQLKRRRYTGWG RLSRKLINGIRDKQSGKTILDFLKSDGFAN RNFMQLIHDDSLTFKEDIQKAQVSGQGDSL HEHIANLAGSPAIKKGILQTVKVVDELVKV MGRHKPENIVIEMARENQTTQKGQKNSRER MKRIEEGIKELGSQILKEHPVENTQLQNEK LYLYYLQNGRDMYVDQELDINRLSDYDVDA IVPQSFLKDDSIDNKVLTRSDKNRGKSDNV PSEEVVKKMKNYWRQLLNAKLITQRKFDNL TKAERGGLSELDKAGFIKRQLVETRQITKH 
VAQILDSRMNTKYDENDKLIREVKVITLKS KLVSDFRKDFQFYKVREINNYHHAHDAYLN AVVGTALIKKYPKLESEFVYGDYKVYDVRK MIAKSEQEIGKATAKYFFYSNIMNFFKTEI TLANGEIRKRPLIETNGETGEIVWDKGRDF ATVRKVLSMPQVNIVKKTEVQTGGFSKESI LPKRNSDKLIARKKDWDPKKYGGFDSPTVA YSVLVVAKVEKGKSKKLKSVKELLGITIME RSSFEKNPIDFLEAKGYKEVKKDLIIKLPK YSLFELENGRKRMLASAGELQKGNELALPS KYVNFLYLASHYEKLKGSPEDNEQKQLFVE QHKHYLDEIIEQISEFSKRVILADANLDKV LSAYKHRDKPIREQAENIIHLFTLTNLGAP AAFKYFDTTIDRKRYTSTKEVLDATLIHQS ITGLYETRIDLSQLGGD RT(1-478)-Sto7d LNIEDEYRLHETSKEPDVSLGSTWLSDFPQ Amino acid AWAETGGMGLAVRQAPLIIPLKATSTPVSI SEQ ID NO: 379 KQYPMSQEARLGIKPHIQRLLDQGILVPCQ SPWNTPLLPVKKPGTNDYRPVQDLREVNKR VEDIHPTVPNPYNLLSGPPPSHQWYTVLDL KDAFFCLRLHPTSQPLFAFEWRDPEMGISG QLTWTRLPQGFKNSPTLFNEALHRDLADFR IQHPDLILLQYVDDLLLAATSELDCQQGTR ALLQTLGNLGYRASAKKAQICQKQVKYLGY LLKEGQRWLTEARKETVMGQPTPKTPRQLR EFLGKAGFCRLFIPGFAEMAAPLYPLTKPG TLFNWGPDQQKAYQEIKQALLTAPALGLPD LTKPFELFVDEKQGYAKGVLTQKLGPWRRP VAYLSKKLDPVAAGWPPCLRMVAAIAVLTK DAGKLTMGQPLVILAPHAVEALVKQPPDRW LSNARMTHYQALLLDTDRVQFGPVVALNPA TLLPLPEEGLQHNCLDGTGGGGVTVKFKYK GEELEVDISKIKKVWRVGKMISFTYDDNGK TGRGAVSEKDAPKELLQMLEKSGKKSGGSK RTADGS BxbINT SRALVVIRLSRVTDATTSPERQLESCQQLC Amino acid AQRGWDVVGVAEDLDVSGAVDPFDRKRRPN SEQ ID NO: 380 LARWLAFEEQPFDVIVAYRVDRLTRSIRHL QQLVHWAEDHKKLVVSATEAHFDTTTPFAA VVIALMGTVAQMELEAIKERNRSAAHFNIR AGKYRGSLPPWGYLPTRVDGEWRLVPDPVQ RERILEVYHRVVDNHEPLHLVAHDLNRRGV LSPKDYFAQLQGREPQGREWSATALKRSMI SEAMLGYATLNGKTVRDDDGAPLVRAEPIL TREQLEALRAELVKTSRAKPAVSTPSLLLR VLFCAVCGEPAYKFAGGGKHPPYRCRSMGF PKHCGNGTVAMAEWDAFCEEQVLDLLGDAE RLEKVWVAGSDSAVELAEVNAELVDLTSLI GSPAYRAGSPQREALDARIAALAARQEELE GLEARPSGWEWRETGQRFGDWWREQDTAAK NTWLRSMNVRLTFDVRGGLTRTIDFGDLQE YEQHLRLGSVVERLHTGMS
EXAMPLES
[0317] While several experimental Examples are contemplated, these Examples are intended to be non-limiting.
Example 1
CRE Integration Efficiency
[0318] The efficiency of Cre integration was tested. To test the efficacy of PASTE with GFP using the lox71/lox66/Cre recombinase system, a clonal HEK293FT cell line with a lox71 sequence (SEQ ID NO: 1) integrated into the genome using lentivirus was developed. Integration of GFP was tested by transfecting the modified HEK293FT cell line with: (1) plus/minus SEQ ID NO: 71 comprising a Cre recombinase expression plasmid, and (2) SEQ ID NO: 72 comprising a GFP template and a lox66 Cre site of SEQ ID NO: 2. After 72 hours, the percent integration of GFP into the lox71 site was probed. FIG. 3 shows the percent integration of GFP at the lentivirally integrated lox71 site in the HEK293FT cell line in the presence of various plasmids. It was observed that pCMV PE2 P2A Cre (SEQ ID NO: 73), a mammalian expression vector with the prime editing complex and Cre recombinase linked to PE2 via a cleavable linker or a non-cleavable linker, shows integration of GFP.
Example 2
Programmable Addition Via Site-Specific Targeting Elements (PASTE) with Cre Recombinase--Addition of Lox Site
[0319] The lox71 (SEQ ID NO: 1) or lox66 (SEQ ID NO: 2) sequence was inserted into the HEK293FT cell genome using prime editing to test integration of GFP into the HEK293FT genome. To insert the lox71 or lox66 sequence into the HEK293FT cell genome, a pegRNA with a PBS length of 13 base pairs operably linked to an RT region of varying lengths was used. The HEK293FT cells were transfected with the following plasmids: (1) a prime editing construct (PE2) or PE2 with conditional Cre expression, (2) a Lox71 or Lox66 pegRNA targeting the HEK3 locus, and (3) plus/minus a +90 HEK3 nicking second guide RNA targeting the HEK3 locus (+90 ngRNA). After 72 hours, the percent editing of the HEK293FT genome at the HEK3 locus was probed for incorporation of various lengths of lox71 or lox66 (see FIG. 4). It was observed that the 34 base pair lox71 (HEK3 locus guide, SEQ ID NO: 83; and Lox71 pegRNA with RT 34 and PBS 13, SEQ ID NO: 81) with the +90 ngRNA (SEQ ID NO: 75) and the 34 base pair lox66 (HEK3 locus guide, SEQ ID NO: 83; and Lox66 pegRNA with RT 34 and PBS 13, SEQ ID NO: 82) with the +90 ngRNA (SEQ ID NO: 75) had the highest percent editing.
Example 3
PASTE with Cre Recombinase--Integration of Gene
[0320] The lox71 or lox66 pegRNAs having a PBS length of 13 base pairs and an insert length of 34 base pairs were used to probe integration of GFP in the HEK293FT genome. The PE and Cre were delivered in inducible expression vectors and induced at day 2. The HEK293FT cells were transfected with the following plasmids: (1) a prime editing construct (PE2 or PE2 with conditional Cre expression); (2) a Lox71 pegRNA; (3) plus/minus a +90 HEK3 nicking guide RNA; and (4) an EGFP template with a Lox66 site. After 72 hours, the percent editing of the lox71 site and the percent integration of GFP were probed with or without the lox66 site in the presence of various PE/Cre constructs. FIG. 5A summarizes the percent editing of the lox71 site with different PE/Cre vectors. FIG. 5B summarizes the percent integration of GFP at the lox71 site in the HEK293FT cell genome. It was observed that although the lox71 site was edited in the presence of an inducible or non-inducible PE/Cre expression system, there was no GFP integration.
Example 4
Bxb1 Integration Data Lenti Reporter
[0321] The integration system was switched to an integrase system that could integrate target genes into a genome with higher efficiency. Serine integrase Bxb1 has been shown to be more active than Cre recombinase and highly efficient in bacteria and mammalian cells for irreversible integration of target genes. FIG. 6 shows a schematic of the PASTE methodology using Bxb1 (Merrick, C. A. et al., ACS Synth. Biol. 2018, 7, 299-310).
[0322] To probe the efficiency of the Bxb1 integration system, a clonal HEK293FT cell line with an attB Bxb1 site (SEQ ID NO: 3) integrated using lentivirus was developed. The modified HEK293FT cell line was then transfected with the following plasmids: (1) plus/minus a Bxb1 expression plasmid and (2) plus/minus a GFP (SEQ ID NO: 76) or Gluc (SEQ ID NO: 77) minicircle template with an attP Bxb1 site. After 72 hours, the integration of GFP or Gluc into the attB site in the HEK293FT genome was probed. The percent integrations of GFP or Gluc into the attB locus are shown in FIG. 7. It was observed that GFP and Gluc showed efficient integration into the attB site in HEK293FT cells.
Example 5
Addition of Bxb1 Site to Human Genome Using PRIME
[0323] The maximum length of attB that can be integrated into a HEK293FT cell line with the best efficiency was probed. To probe the best length of attB (SEQ ID NO: 3) or its reverse complement attP (SEQ ID NO: 4) for prime editing, pegRNAs having a PBS length of 13 nt with varying RT homology length were used. The following plasmids were transfected into HEK293FT cells: (1) a prime expression plasmid; (2) a HEK3 targeting pegRNA design; and (3) a HEK3 +90 nicking guide. After 72 hours, the percent integration of each of the attB constructs was probed. FIG. 8 shows the percent editing for each HEK3 targeting pegRNA. It was observed that attB with 44, 34 and 26 base pairs and the attB reverse complement with 34 and 26 base pairs showed the highest percent editing.
[0324] Integration by PASTE was then tested by tagging cell-organelle marker proteins with GFP in HEK293FT cells. PASTE was used to tag SUPT16H, SRRM2, LAMNB1, NOLC1 and DEPDC4 with GFP in different cell-culture wells and to test the usefulness of PASTE in tracking protein localization within the cells using microscopy. FIGS. 9A-9G show the fluorescent microscopy results for each of the organelles. SUPT16H-GFP was observed to be enriched in the nucleus, SRRM2-GFP was observed to be enriched in the nuclear speckles, LAMNB1-GFP was observed to be enriched in the nuclear membrane, NOLC1-GFP was observed to be enriched in the fibrillar center, and DEPDC4-GFP was observed to be enriched in the aggresome.
[0325] The transfection of the plasmids can be achieved using electroporation as illustrated in FIGS. 10A-10B.
Example 6
Programmable Integration of Genes with PASTE
[0326] The efficiency of gene integration of Gluc or EGFP with PASTE was tested. To enable gene integration with PASTE, the following HEK3 targeting pegRNAs were used: (1) 44 pegRNA: PBS of 13 nt and RT homology of 44 nt; (2) 34 pegRNA: PBS of 13 nt and RT homology of 34 nt; and (3) 26 pegRNA: PBS of 13 nt and RT homology of 26 nt.
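The 44, 34 and 26 nt lengths above can be illustrated with a short sketch. It assumes, purely for illustration, that shorter attB inserts are obtained by trimming the 46 bp Bxb1 attB site symmetrically about its center; the disclosure does not state how the truncations were chosen. The one case that can be checked against Table 5 is that trimming the 46 bp GT site (SEQ ID NO: 210) to 38 bp reproduces the listed 38 bp GT site (SEQ ID NO: 242). The function name `center_trim` is hypothetical.

```python
# Sequences copied from Table 5 (column wraps joined).
ATTB_46_GT = "GGCCGGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCATCCGG"  # SEQ ID NO: 210
ATTB_38_GT = "GGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCAT"          # SEQ ID NO: 242


def center_trim(site: str, length: int) -> str:
    """Trim equal numbers of bases from both ends of `site`.

    Assumption (not from the disclosure): truncated sites keep the central
    dinucleotide in the middle, so bases are removed symmetrically.
    """
    excess = len(site) - length
    if excess < 0 or excess % 2 != 0:
        raise ValueError("length must be <= len(site) and differ by an even number")
    flank = excess // 2
    return site[flank:len(site) - flank]


if __name__ == "__main__":
    # Verifiable case: 46 bp -> 38 bp matches the table entry.
    print(center_trim(ATTB_46_GT, 38) == ATTB_38_GT)
    # Hypothetical inserts for the 44, 34 and 26 nt designs.
    for n in (44, 34, 26):
        print(n, center_trim(ATTB_46_GT, n))
```

Whether the RT homology lengths in the pegRNAs correspond exactly to these symmetric truncations would need to be confirmed against the pegRNA sequences themselves.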
[0327] A HEK293 cell line was transfected with following plasmids HEK293FT: (1) Prime expression plasmid; (2) Bxb1 expression plasmid; (3) HEK3 targeting pegRNA design; (4) HEK3+90 nicking guide; and (5) EGFP or Gluc minicircle. After 72 hours, the percent integration of Gluc or EGFP was observed. FIG. 11 shows integration of EGFP and Gluc with each of the tested HEK3 targeting pegRNAs. It was observed that EGFP and Gluc were efficiently integrated using PASTE.
Example 7
PASTE for Integration of Multiple Genes
[0328] The PASTE technique for site-specific integration of multiple genes into a cell is facilitated by the use of orthogonal attB and attP sites. The central dinucleotide can be changed from GT to GA, and GA-containing attB/attP sites interact only with each other and do not cross-react with GT-containing sequences. A screen of dinucleotide combinations to find orthogonal attB/attP pairs for multiplexed PASTE editing can be performed. It has been shown that many orthogonal dinucleotide combinations can be found using a Bxb1 reporter system.
[0329] To test this, attB.sup.GT and attB.sup.GA dinucleotides for Bxb1 were added at an ACTB site by prime editing. An EGFP--attP.sup.GT DNA minicircle and an mCherry--attP.sup.GA DNA minicircle were introduced to test the percent EGFP and mCherry editing in the presence or absence of Bxb1. The results of EGFP and mCherry editing are shown in FIGS. 14A-14B.
[0330] Orthogonal editing with the right GT-EGFP and GA-mCherry pairs was achieved, demonstrating the ability for multiplexed PASTE editing in cells.
[0331] Two genes were introduced in the same cell using multiplexed PASTE to tag two different genes in a single reaction. EGFP and mCherry were tagged into the loci of ACTB and NOLC1 in a x cell line, in a single reaction. Further, EGFP and mCherry were tagged into the loci of ACTB and LAMNB1. The cells were visualized using fluorescence microscopy. FIGS. 15A-15B show the results of fluorescent microscopy for multiplexed PASTE.
[0332] The ability to multiplex with nine different attB and attP central dinucleotides--AA, GA, CA, AG, AC, CC, GT, CT and TT (SEQ ID NOs: 7, 8, 23, 24, 19, 20, 25, 26, 27, 28, 9, 10, 15, 16, 17, 18, 5 and 6)--in a 9.times.9 cross of attB and attP was tested. The edits were probed using next-generation sequencing. The results of the 9.times.9 cross of attB and attP central dinucleotides--AA, GA, CA, AG, AC, CC, GT, CT and TT--are shown in FIG. 16A. Only orthogonal pairs of attB and attP show the highest edit percentage. This result is also shown in the heat-map of FIG. 16B.
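As a non-limiting illustration of the central dinucleotide concept, the Python sketch below swaps the central dinucleotide of the full-length attB of SEQ ID NO: 3 and lays out the idealized expectation for the 9.times.9 cross. It assumes that the central dinucleotide occupies the two midpoint positions of the 46-nt site (which holds for SEQ ID NO: 3, where those positions read GT) and that only matched attB/attP dinucleotides are productive; measured crosstalk can depart from this idealization. All function and variable names are hypothetical.

```python
from itertools import product

# Wild-type attB copied from the sequence listing (SEQ ID NO: 3).
ATTB_WT = "ggccggcttgtcgacgacggcggtctccgtcgtcaggatcatccgg"

# The nine central dinucleotides named for the 9x9 cross.
DINUCLEOTIDES = ["AA", "GA", "CA", "AG", "AC", "CC", "GT", "CT", "TT"]


def central_dinucleotide(site: str) -> str:
    """Return the two midpoint bases of an even-length site (assumption)."""
    mid = len(site) // 2
    return site[mid - 1:mid + 1].upper()


def with_central_dinucleotide(site: str, dinucleotide: str) -> str:
    """Return a copy of the site with its two midpoint bases replaced."""
    mid = len(site) // 2
    return site[:mid - 1] + dinucleotide.lower() + site[mid + 1:]


def expected_cross(dinucleotides):
    """Idealized model: an attB/attP pair is productive only when matched."""
    return {(b, p): b == p for b, p in product(dinucleotides, repeat=2)}


ATTB_GA = with_central_dinucleotide(ATTB_WT, "GA")
CROSS = expected_cross(DINUCLEOTIDES)
```

Under this idealized model, 9 of the 81 attB/attP combinations are expected to be productive, one for each matched dinucleotide.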
Example 8
Integration of Albumin and CPS1 into Albumin Locus
[0333] Twelve pegRNAs, each with an albumin guide linked to PBS and reverse transcriptase sequences of variable length, and different nicking guide RNAs were used to transfect HEK293FT cells. The percent editing at the albumin locus was probed using next-generation sequencing. The results of prime editing at the albumin locus are shown in FIG. 17. It was observed that SEQ ID NO: 79 showed the highest percent edits with SERPINA1 and SEQ ID NO: 80 showed the highest percent edits with CPS1.
Example 9
Engineering T-Cells
[0334] In order to engineer CD8+ T-cells, the efficiency of PASTE delivery and editing in T-cells can be evaluated (FIG. 18). ACTB targeting pegRNA can be used to insert an integration site with an EGFP insertion template. To deliver the PASTE components to CD8+ T-cells, electroporation can be used along with an optimized electroporation protocol for unstimulated T-cells. As multiple plasmids may reduce the efficiency of electroporation, the consolidated PASTE components that use fewer vectors can be applied.
[0335] Five-vector, three-vector, and two-vector PASTE systems show that robust T-cell editing can be achieved, with maximal editing using the three-vector approach (FIG. 19). Further, expanded sets of electroporation conditions, including the overall plasmid amounts, cell numbers, and voltage/amperage protocol, can be tested. In addition, stimulation of T-cells may influence the efficiency of transduction and PASTE efficiency. Further, CD4+/CD8+ T cell mixtures stimulated with T-Activator CD3/CD28 ligands can have higher PASTE editing efficiency versus unstimulated cells. In order to separate efficiency of PASTE from the overall delivery rate, an mCherry expression cassette on PASTE vectors can be evaluated in order to sort successfully transfected T cells. Once optimized parameters are achieved, a panel of 10 insertion sites with PASTE in T cells, including the TRAC, IL2R.alpha., and PDCD1 loci, can be evaluated, using different insertions (e.g. EGFP, BFP, and YFP), both in single and multiplexed editing contexts. A tested subset of relevant sites in HEK293FT achieved greater than 40% editing for EGFP insertion (FIG. 20). The PASTE efficiency at the TRAC locus with different TCR and CAR constructs can be evaluated. The T-cells can successfully be transfected to achieve insertion of CARs or TCRs.
Example 10
PASTE for CFTR
[0336] PASTE for the CFTR locus can be tested in HEK293FT cells to identify top performing pegRNA and nicking designs for human cells. Neuro-2A cells can also be tested to identify top performing pegRNA and nicking designs for mouse cells. The best constructs can be applied for testing in mouse air-liquid interface (ALI) organoids in vitro or for delivery in pre-clinical models of cystic fibrosis in mice. Table 12 shows the pegRNA, nicking guide and minicircle DNA characteristics for the CFTR gene modulation.
TABLE-US-00012 TABLE 12
Variables            Characteristics
pegRNA               38 bp shortened minimal attB and normal 46 bp attB sequence with:
                     a. PBS of 17, 13, and 9 nt in length, and
                     b. RT of 20, 15, and 10 nt in length
Nicking guides       Nicking guide 1: +64 bp
                     Nicking guide 2: +23 bp
                     Nicking guide 3: -60 bp
                     Nicking guide 4: -78 bp
                     (distance is calculated from cut site of pegRNA)
Minicircle template  A. CFTR coding sequence alone (~4,454 bp in size)
                     B. CFTR coding sequence plus 5' and 3' UTRs (~6,011 bp in size)
                     (Both minicircles have attP site on them for integration by Bxb1 and a bGH poly A signal)
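For illustration only, the variables of Table 12 can be enumerated as a screening grid. The Python sketch below assumes a full factorial crossing of attB length, PBS length, RT length, and nicking guide; Table 12 lists the variables but does not state that every combination is tested, so the crossing and all identifiers are assumptions.

```python
from itertools import product

# Values copied from Table 12.
ATTB_LENGTHS_BP = (38, 46)
PBS_LENGTHS_NT = (17, 13, 9)
RT_LENGTHS_NT = (20, 15, 10)
# Offsets in bp relative to the pegRNA cut site.
NICKING_OFFSETS_BP = {
    "Nicking guide 1": 64,
    "Nicking guide 2": 23,
    "Nicking guide 3": -60,
    "Nicking guide 4": -78,
}


def pegrna_designs():
    """Enumerate every attB/PBS/RT length combination (assumed full grid)."""
    return [
        {"attB_bp": attb, "PBS_nt": pbs, "RT_nt": rt}
        for attb, pbs, rt in product(ATTB_LENGTHS_BP, PBS_LENGTHS_NT, RT_LENGTHS_NT)
    ]


def screening_conditions():
    """Pair each pegRNA design with each nicking guide (assumed full grid)."""
    return [
        {**peg, "nicking_guide": name, "offset_bp": offset}
        for peg, (name, offset) in product(pegrna_designs(), NICKING_OFFSETS_BP.items())
    ]


PEGRNAS = pegrna_designs()
CONDITIONS = screening_conditions()
```

Under this assumption the grid contains 18 pegRNA designs and 72 pegRNA and nicking guide pairings per minicircle template.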
Example 11
AttB and EGFP Integration Using PASTE
[0337] The efficiency of the integration of attB and EGFP at the ACTB locus was evaluated (FIGS. 21A-21C). To investigate whether Bxb1 can add an EGFP template into this site, a delivery approach using a five-plasmid system expressing each of the following components was deployed: 1) pegRNA expression, 2) nicking guide expression, 3) Prime expression (Cas9-RT), 4) Bxb1 expression and 5) the insertion template (in this case EGFP). This approach was found to yield editing efficiency of the attB site up to 24% and integration of EGFP of .about.10% in HEK293FT cells as measured by sequencing (FIGS. 21A-21B). Optimal activity is achieved in 3-4 days, and the approach can be performed as a single-step transfection or electroporation of all components. Because the EGFP plasmid is designed as a minicircle, allowing removal of all undesired bacterial components, only the desired gene is inserted along with minimal scars from the Bxb1 recombined sites.
[0338] To make the tool simpler to use, the Bxb1 can be linked to Prime via a P2A linker to the Cas9-RT fusion, allowing for only a single plasmid to be used for PASTE protein expression rather than two. This optimization can maintain the same level of editing, making it easier to use the tool and deliver it (FIG. 21C).
Example 12
Programmable EGFP Integrations in Different Cell Types
[0339] The programmable EGFP integration in liver hepatocellular carcinoma cell line HEPG2 (FIG. 22A) and chronic myelogenous leukemia cell line K562 (FIG. 22B) was evaluated. EGFP integration at the ACTB locus in K562 and HEPG2 cells of about 15% was observed, demonstrating robustness of the platform across cell types.
Example 13
Mutagenesis of Bxb1 for Enhanced PASTE Activity
[0340] The mutagenesis of Bxb1 for enhanced PASTE activity was evaluated (FIGS. 23A-23C). Two levers for optimizing PASTE activity exist: 1) improving the activity of the integrase and 2) enhancing the Prime addition of the integration sequence. As illustrated in FIGS. 23A-23B, Bxb1 activity can be improved as only about 30% of Bxb1 attB sites that are added by PASTE are integrated into by Bxb1. This illustrates that if the Bxb1 efficiency can be improved, the PASTE can be improved. Furthermore, catalytic residues in the Bxb1 integrase were identified via conservation and structural analyses and Bxb1 mutants were generated to test as part of PASTE. As illustrated in FIG. 23B, the mutations can improve integration by about 20-30%.
Example 14
Effect of the pegRNA PBS and RT Lengths on the Prime Editing Integration Efficiency
[0341] The effect of the pegRNA PBS and RT lengths on the prime editing integration efficiency was evaluated (FIGS. 25A-25F). It was found that PASTE can be optimized by tuning the PBS and RT lengths at the ACTB locus to achieve editing rates up to about 20% (FIG. 25A). It was found that shortening the attB site can help improve PASTE function as Prime is better at inserting shorter sequences. Further optimization of PBS, RT, and attB lengths showed that optimal designs can be found for insertion upstream of the LMNB1, NOLC1, and GRSF1 loci (FIGS. 25B, 25C, and 25D). Lengths as short as 36 nt for attB were found to be still functional for integration into a reporter plasmid (FIGS. 25B and 25C). It was found that the reverse complemented version of the attB sequence was better integrated via Prime editing, suggesting that the sequence that Prime inserts matters. EGFP integrations with attP site mutants showed that certain mutants can improve integration efficiency significantly (FIG. 25E). PASTE was also performed with a large panel of genes, inserting EGFP at the N-terminus of ACTB, LMNB1, SUPT16H, SRRM2, NOLC1, KLHL15, GRSF1, DEPDC4, NES, PGM1, CLTA, BASP1, and DNAJC18 (FIG. 25F). Editing rates of about 5%-40% were found using digital droplet PCR (ddPCR).
Example 15
Comparison of PASTE and HITI On-Target and Off-Target Activities
[0342] The PASTE and HITI on-target and off-target activities were compared (FIGS. 26A-26F). PASTE and HITI were found to have about 22% and 5% integration efficiencies, respectively, when using the same guide sequence (FIGS. 26A and 26B). PASTE was found to outperform HITI at most sites when analyzing the editing of 14 genes (FIG. 26C). Using a ddPCR-based approach, it was found that PASTE was very specific, with minimal off-target activity for Bxb1 off-target integrations (FIG. 26D) and Cas9 off-target integrations (FIG. 26E). The analysis of inserts of different sizes showed that PASTE can reliably insert sequences 1 kb-10 kb in size (FIG. 26F), revealing the wide range of sequence sizes PASTE is capable of working with. A decrease in insertion efficiency at larger sizes was also observed, which was likely due to the reduction in plasmid delivery to HEK293FT cells at larger plasmid sizes.
Example 16
Multiplexing with PASTE and Orthogonal Di-Nucleotide attB and attP Sites
[0343] Multiplexing with PASTE and orthogonal di-nucleotide attB and attP sites was evaluated (FIGS. 28A-28C). Multiple orthogonal combinations were found for mutants of the central di-nucleotide motif (FIGS. 28A and 28B). As illustrated in FIG. 28C, programmable multiplexed gene insertion can be achieved by using these orthogonal combinations with PASTE, delivering only different pegRNAs and gene inserts while keeping the protein components the same.
Example 17
PASTE Multiplexed Integrations at Endogenous Sites
[0344] PASTE multiplexed integrations at endogenous sites were evaluated (FIGS. 28A-28G). A reading frame was identified for the attR scar left post-integration by Bxb1 that is ideal for a protein linker due to the enrichment of glycines, serines, and prolines in the sequence (GLSGQPPRSPSSGSSG (SEQ ID NO: 426)). PegRNAs were designed using this linker frame for the resolution of the attR for tagging a number of genes at the N-terminus with EGFP (ACTB, NOLC1, LMNB1, SUPT16H, SRRM2, and DEPDC4). As these genes all have distinct protein localization appearances, microscopy can be used for ascertaining proper gene tagging. PASTE was found to be capable of high-efficiency gene tagging with protein localizations that match the reference images and expected localization of the proteins in the cells (FIGS. 28A-28C). Genes were also tagged in multiplexed fashion to demonstrate the orthogonality of the engineered integration sites. ACTB, LMNB1, NOLC1, and GRSF1 were targeted with orthogonal pegRNAs carrying GT, TG, AC, and CA, respectively, in HEK293FT cells in groups of single, dual-plexing, and triple-plexing (FIGS. 28D-28E). These dinucleotides were paired with templates carrying EGFP, BFP, and mCherry to allow for multicolor imaging of these labeled genes. The efficiencies of integration for these multiplexing experiments were found to range from about 5%-32%, revealing efficient multiplex integration with PASTE. Using confocal microscopy of these multiplexed integration experiments, cells were found with simultaneous labeling of these different proteins (FIGS. 28F-28G).
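As a non-limiting illustration of the enrichment noted above, the fraction of glycine, serine, and proline residues in the attR-derived linker of SEQ ID NO: 426 can be computed directly from the peptide sequence. The Python sketch below uses only the sequence given in the text; the function name and default residue set are hypothetical.

```python
from collections import Counter

# attR-derived linker copied from the text (SEQ ID NO: 426).
ATTR_LINKER_SEQ426 = "GLSGQPPRSPSSGSSG"


def residue_fraction(peptide: str, residues: str = "GSP") -> float:
    """Return the fraction of positions occupied by the given residues."""
    counts = Counter(peptide.upper())
    return sum(counts[r] for r in residues) / len(peptide)


GSP_FRACTION = residue_fraction(ATTR_LINKER_SEQ426)
```

For this 16-residue linker, 13 residues are glycine, serine, or proline.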
Example 18
Combination of CRISPR-Based Genome Editing and Site-Specific Integration
[0345] The combination of CRISPR-based genome editing and site-specific integration was evaluated.
[0346] PegRNAs containing different attB length truncations were assessed (FIG. 29A). Prime editing was found to be capable of inserting sequences up to 56 bp at the beta-actin (ACTB) gene locus, with higher efficiency at lengths below 31 bp (FIGS. 29A-B). The integration of cognate landing sites was tested for multiple insertion enzymes: Bxb1, TP901, and phiBT1 phage serine integrases and Cre recombinase. Prime editing successfully inserted all landing sites tested, with efficiencies between 10-30% (FIGS. 29C-D). To test the complete system, all components were combined and delivered in a single transfection: the prime editing vector, the landing site-containing pegRNA, a nicking guide for stimulating prime editing, a mammalian expression vector for the corresponding integrase or recombinase, and a 969 bp minicircle DNA cargo encoding green fluorescent protein (GFP) (FIG. 29E). GFP integration rates among the four integrases and recombinases were compared, and Bxb1 integrase was found to have the highest integration rate (.about.20%) at the targeted ACTB locus and to require the prime editing nicking guide for optimal performance (FIGS. 29F-H). Finally, to reduce the number of transfected components, Bxb1 was co-expressed with the SpCas9-M-MLV reverse transcriptase (PE2) fusion protein via a P2A protein cleavage site. This combination maintained high GFP insertion efficiency, up to 30% (FIG. 29E). The complete system, PASTE, achieved precise integration of templates as large as 9,500 bp with greater than 10% integration efficiency (FIGS. 29J-K and 26E), with complete integration of the full-length cargo confirmed by Sanger sequencing (FIGS. 30A-E).
Example 19
Impact of Prime Editing and Integrase Parameters on PRIME Editing
[0347] The impact of prime editing and integrase parameters on the integration efficiency of PRIME editing was assessed.
[0348] Relevant pegRNA parameters for PASTE include the primer binding site (PBS), reverse transcription template (RT), and attB site lengths, as well as the relative locations and efficacy of the pegRNA spacer and nicking guide (FIG. 31A). A range of PBS and RT lengths were tested at two loci, ACTB and lamin B1 (LMNB1), and rules governing efficiency were found to vary between loci, with shorter PBS lengths and longer RT designs having higher editing at the ACTB locus (FIG. 31B) and longer PBS and shorter RT designs performing better at LMNB1 (FIG. 31C).
[0349] The length of the attB landing site must balance two conflicting factors: the higher efficiency of prime editing for smaller inserts and the reduced efficiency of Bxb1 integration at shorter attB lengths. AttB lengths were evaluated at ACTB, LMNB1, and nucleolar phosphoprotein p130 (NOLC1), and the optimal attB length was found to be locus dependent. At the ACTB locus, long attB lengths could be inserted by prime editing (FIG. 29B) and overall PASTE efficiencies for the insertion of GFP were highest for long attB lengths (FIG. 31D). In contrast, intermediate attB lengths had higher overall integration efficiencies (>20%) at LMNB1 (FIG. 31E) and NOLC1 (FIG. 31F), indicating that the increased efficiency of installing shorter attB sequences overcame the reduction of Bxb1 integration at these sites.
[0350] The PE3 version of prime editing combines PE2 and an additional nicking guide to bias resolution of the flap intermediate towards insertion. To test the importance of nicking guide selection on PASTE editing, editing at ACTB and LMNB1 loci was tested with two nicking guide positions. Suboptimal nicking guide positions were found to reduce the PASTE efficiency by up to 30% (FIG. 32A), in agreement with the 75% reduction of PASTE efficiency in the absence of a nicking guide (FIG. 29G). The pegRNA spacer sequence was found to be necessary for PASTE editing, and substitution of the spacer sequence with a non-targeting guide was found to eliminate editing (FIG. 32B).
[0351] Rational mutations were also introduced in both the Bxb1 integrase and reverse transcriptase domain of the PE2 construct to optimize PASTE further. While some of these mutations were well tolerated by PASTE (FIGS. 33A-B), none of them improved PASTE editing efficiency.
[0352] Short RT and PBS lengths can offer additional improvements for editing. A panel of shorter RT and PBS guides was tested at ACTB and LMNB1 loci, and while shorter RT and PBS sequences did not increase editing at ACTB (FIG. 31G), it was found that they improved editing at LMNB1 (FIG. 31H), with the best performing guides reaching GFP insertion rates of .about.40% (FIG. 31I).
Example 20
PASTE Tagging at Multiple Endogenous Genes
[0353] GFP insertion efficiency was measured at seven different gene loci--ACTB, SUPT16H, SRRM2, NOLC1, DEPDC4, NES, and LMNB1--to test the versatility of the PASTE programming. A range of integration rates up to 22% was found (FIG. 34A). Because PASTE does not require homology or sequence similarity on cargo plasmids, integration of diverse cargo sequences is modular and easily scaled across different loci. Six different gene cargos, varying in size from 969 bp to 4906 bp, were tested for insertion at ACTB and LMNB1 loci with PASTE. Integration frequencies between 5% and 22%, depending on the gene and insertion locus, were found (FIGS. 34B and 35). Additionally, a panel of seven common therapeutic genes, CEP290, OTC, HBB, PAH, GBA, BTK, and ADA, was evaluated for insertion at the ACTB locus, and efficient integration of these cargos, between 5% and 20%, was found (FIG. 34C).
[0354] The ability of PASTE to make precise insertions for in-frame protein tagging or for expressing cargo without disruption of endogenous gene expression was assessed. As Bxb1 leaves residual sequences in the genome (termed attL and attR) after cargo integration, these genomic scars can serve as protein linkers. The frame of the attR sequence was positioned through strategic placement of the attP on the minicircle cargo, achieving a suitable protein linker, GGLSGQPPRSPSSGSSG (SEQ ID NO: 427). Using this linker, four genes (ACTB, SRRM2, NOLC1, and LMNB1) were tagged with GFP using PASTE. To assess correct gene tagging, the subcellular location of GFP was compared with the tagged gene product by immunofluorescence. For all four targeted loci, GFP co-localized with the tagged gene product, indicating successful tagging (FIGS. 34D-E).
Example 21
Orthogonal Sequence Preferences for Bxb1 Integration
[0355] The central dinucleotide of Bxb1 is involved in the association of attB and attP sites for integration, and changing the matched central dinucleotide sequences can modify integrase activity and provide orthogonality for insertion of two genes. Expanding the set of attB/attP dinucleotides can enable multiplexed gene insertion with PASTE. The efficiency of GFP integration at the ACTB locus with PASTE across all 16 dinucleotide attB/attP sequence pairs was profiled to find optimal attB/attP dinucleotides for PASTE insertion. Several dinucleotides with integration efficiencies greater than the wild-type GT sequence were found (FIG. 36A). A majority of dinucleotides had 75% editing efficiency or greater compared to wild-type attB/attP efficiency, implying that these dinucleotides can be orthogonal channels for multiplexed gene insertion with PASTE.
[0356] The specificity of matched and unmatched attB/attP dinucleotide interactions was then assessed. The interactions between all dinucleotide combinations were profiled in a scalable fashion using a pooled assay to compare attB/attP integration (FIG. 36B). By barcoding 16 attP dinucleotide plasmids with unique identifiers, co-transfecting this attP pool with the Bxb1 integrase expression vector and a single attB dinucleotide acceptor plasmid, and sequencing the resulting integration products, the relative integration efficiencies of all possible attB/attP pairs were measured (FIG. 36C). Dinucleotide specificity was found to vary, with some dinucleotides (GG) exhibiting strong self-interaction with negligible crosstalk, and others (AA) showing minimal self-preference. Sequence logos of attP preferences (FIG. 37) revealed that dinucleotides with C or G in the first position have stronger preferences for attB dinucleotide sequences with shared first bases, while other attP dinucleotides, especially those with an A in the first position, have reduced specificity for the first attB base.
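By way of non-limiting illustration, the read-out of such a pooled assay can be reduced to relative integration fractions for a single attB acceptor by counting the attP barcodes observed among the sequenced integration products. The Python sketch below is an assumption about one possible analysis, not the analysis actually used; it does not correct for unequal representation of plasmids in the attP pool. All identifiers are hypothetical, and the example reads are invented placeholders that carry no experimental meaning.

```python
from collections import Counter


def relative_integration(barcode_calls):
    """Convert attP barcode calls for one attB acceptor into fractions.

    barcode_calls: iterable of attP dinucleotide labels, one label per
    sequenced integration product (hypothetical input format).
    """
    counts = Counter(barcode_calls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dinucleotide: n / total for dinucleotide, n in counts.items()}


# Placeholder reads, invented purely to exercise the function.
TOY_READS = ["GG"] * 8 + ["GA"] * 1 + ["GT"] * 1
TOY_FRACTIONS = relative_integration(TOY_READS)
```

Repeating the calculation for each attB acceptor plasmid would fill one row of an attB-by-attP matrix per acceptor.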
[0357] GA, AG, AC, and CT dinucleotide pegRNAs were then tested for GFP integration at ACTB, either paired with their corresponding attP cargo or mispaired with the other three dinucleotide attP sequences. All four of the tested dinucleotides were found to efficiently integrate cargo only when paired with the corresponding attB/attP pair, with no detectable integration across mispaired combinations (FIG. 36D).
Example 22
Multiplex Gene Integration with PASTE
[0358] Multiplexing in cells by using orthogonal pegRNAs that direct a matched attP cargo to a specific site in the genome was assessed (FIG. 38A). Selecting the three top dinucleotide attachment site pairs (CT, AG, and GA), pegRNAs that target ACTB (CT), LMNB1 (AG), and NOLC1 (GA) and corresponding minicircle cargo containing GFP (CT), mCherry (AG), and YFP (GA) were designed. Upon co-delivering these reagents to cells, single-plex, dual-plex, and triple-plex editing of all possible combinations of these pegRNAs and cargo was found to be achieved, with integration in the range of 5%-25% (FIG. 38B).
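As a non-limiting illustration, the single-plex, dual-plex, and triple-plex conditions described above can be enumerated from the three dinucleotide channels. In the Python sketch below, the channel assignments are taken from the text, while the data layout and function name are hypothetical.

```python
from itertools import combinations

# Dinucleotide channel -> targeted locus and minicircle cargo, per the text.
CHANNELS = {
    "CT": {"locus": "ACTB", "cargo": "GFP"},
    "AG": {"locus": "LMNB1", "cargo": "mCherry"},
    "GA": {"locus": "NOLC1", "cargo": "YFP"},
}


def multiplex_conditions(channels):
    """List every non-empty combination of channels, smallest groups first."""
    keys = sorted(channels)
    return [
        combo
        for size in range(1, len(keys) + 1)
        for combo in combinations(keys, size)
    ]


CONDITIONS = multiplex_conditions(CHANNELS)
```

With three channels this yields three single-plex, three dual-plex, and one triple-plex condition, seven in total.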
[0359] An application for multiplexed gene integration is for labeling different proteins to visualize intracellular localization and interactions within the same cell. PASTE was used to simultaneously tag ACTB (GFP) and NOLC1 (mCherry) or ACTB (GFP) and LMNB1 (mCherry) in the same cell. No overlap of GFP and mCherry fluorescence was observed and tagged genes were confirmed to be visible in their appropriate cellular compartments, based on the known subcellular localizations of the ACTB, NOLC1 and LMNB1 protein products (FIGS. 15A-B).
Example 23
PASTE Efficiencies Compared with DSB-Based Insertion Methods
[0360] PASTE efficiencies were found to exceed comparable DSB-based insertion methods.
[0361] PASTE editing was assessed alongside DSB-dependent gene integration using either NHEJ (i.e., homology-independent targeted integration, HITI) or HDR pathways. PASTE had equivalent or better gene insertion efficiencies than either HITI (FIGS. 39A-B) or HDR (FIGS. 39C-D). On a panel of 7 different endogenous targets, PASTE exceeded HITI editing at 6 out of 7 genes, with similar efficiency for the 7th gene (FIG. 39A). As DSB generation can lead to insertions or deletions (indels) as an alternative and undesired editing outcome, the indel frequency of all three methods was assessed by next-generation sequencing, finding significantly fewer indels generated with PASTE than either HDR or HITI in both HEK293FT and HepG2 cells (FIGS. 39B, 39D and 40A), showcasing the high purity of gene integration outcomes with PASTE.
Example 24
Off-Target Characterization of PASTE and HITI Gene Integration
[0362] Off-target editing is an important consideration for genome editing technologies. The specificity of PASTE at specific sites was assessed based on off-targets generated by Bxb1 integration into pseudo-attB sites in the human genome and off-targets generated via guide- and Cas9-dependent editing in the human genome (FIG. 39E). While Bxb1 lacks documented integration into the human genome at pseudo-attachment sites, potential sites with partial similarity to the natural Bxb1 attB core sequence were computationally identified. Bxb1 integration across these sites was tested by ddPCR and no off-target activity was found (FIGS. 39F and 40B-D). To assay Cas9 off-targets for the ACTB pegRNA, two potential off-target sites were identified via computational prediction; no off-target integration for PASTE was found (FIGS. 39G and 40A-D), but substantial off-target activity by HITI at one of the sites was found (FIGS. 39H and 40A-D).
[0363] Genome-wide off-targets due to either Cas9 or Bxb1 through tagging and PCR amplification of insert-genomic junctions were additionally assessed (FIG. 39I). Single cell clones were isolated for conditions with PASTE editing and negative controls missing PE2, and deep sequencing of insert genomic junctions from these clones showed all reads aligning to the on-target ACTB site, confirming no off-target genomic insertions (FIGS. 39J-L).
[0364] Expression of reverse transcriptases and integrases involved in PASTE can have detrimental effects on cellular health. The complete PASTE system, the corresponding guides and cargo with only PE2, and the corresponding guides and cargo with only Bxb1 were transfected and compared to both GFP control transfections and guides without protein expression via transcriptome-wide RNA sequencing to determine the extent of these effects. While Bxb1 expression in the absence of Prime editing was found to have several significant off targets, the complete PASTE system had only one differentially regulated gene with more than a 1.5-fold change (FIGS. 41A-B). Genes upregulated by Bxb1 overexpression included stress response genes, such as TENT5C and DDIT3, but these changes were not seen in the expression of the PASTE system (FIG. 41C), potentially due to the decreased expression of Bxb1 from the P2A linker on the PASTE construct.
Example 25
PASTE Efficiency in Non-Dividing Cells
[0365] PASTE activity in non-dividing cells was assessed. Cas9 and HDR templates or PASTE were transfected into HEK293FT cells and cell division was arrested via aphidicolin treatment (FIG. 42A). In this model of blocked cell division, PASTE was found to maintain a GFP gene integration activity greater than 20% at the ACTB locus whereas HDR-mediated integration was abolished (FIGS. 42B and 43A).
Example 26
Production and Secretion of Therapeutic Transgene
[0366] PASTE with larger transgenes and in additional cell lines was assessed.
[0367] To evaluate the size limits for therapeutic transgenes, insertion of cargos up to 13.3 kb in length in both dividing and aphidicolin-treated cells was assessed. Insertion efficiency greater than 10% was found (FIG. 42C), enabling insertion of .about.99.7% of all full-length human cDNA transgenes. To overcome the reduction of large insert delivery to cells caused by delivery inefficiencies, larger DNA amounts of insert were delivered, which was found to significantly improve gene integration efficiency (FIG. 43B). PASTE editing in additional cell types, such as the K562 lymphoblast line and primary human T cells, was also assessed. Both PE2-P2A-Bxb1 (PASTE) and separate delivery of PE2 and Bxb1 were found to result in efficient editing in both cell types (FIGS. 42D-E). Lastly, as therapeutic delivery of PASTE in vivo might require viral delivery of the DNA cargo, whether AAV could deliver an attP-containing payload that could be integrated into the genome via Bxb1 was evaluated. Targeting the ACTB locus, AAV was found to be capable of delivering the appropriate template for integrase-mediated insertion with rates up to 4% in a dose-dependent fashion (FIGS. 42F and 43C).
[0368] To improve the efficiency of PASTE, PE2* NLS was incorporated for prime editing and improved PASTE integration at multiple loci was found (FIG. 44A). Furthermore, PE2* resulted in more robust integration at lower titrations of cargo plasmid, demonstrating integration at amounts as low as 8 ng of plasmid (FIG. 44B). To combat reductions in PASTE efficiency due to incomplete plasmid delivery, a puromycin resistance gene was co-delivered and found to increase the PASTE efficiency in the presence of drug selection (FIG. 45).
[0369] Programmable gene integration provides a modality for expression of therapeutic protein products, and protein production was assessed for therapeutically relevant proteins Alpha-1 antitrypsin (encoded by SERPINA1) and Carbamoyl phosphate synthetase I (encoded by CPS1), involved in the diseases Alpha-1 antitrypsin deficiency and CPS1 deficiency, respectively. By tagging gene products with the luminescent protein subunit HiBiT, the transgene production and secretion were assessed independently in response to PASTE treatment (FIG. 42G). PASTE was transfected with SERPINA1 or CPS1 cargo in HEK293FT cells and a human hepatocellular carcinoma cell line (HepG2) and efficient integration at the ACTB locus was found (FIG. 42H-I). This integration resulted in robust protein expression, intracellular accumulation of transgene products (FIGS. 42J and 46A-B), and secretion of proteins into the media (FIG. 42K).
Example 27
Optimized PASTE Constructs
[0370] To optimize complex activity, a panel of protein modifications was screened, including alternative reverse transcriptase fusions and mutations, various linkers between the reverse transcriptase domain and integrase and between the Cas9 and reverse transcriptase domain, and reverse transcriptase and BxbINT domain mutants (FIG. 47A and FIG. 49C-FIG. 49F). A number of protein modifications, including a 48 residue XTEN linker between the Cas9 and reverse transcriptase and the fusion of MMuLV to the Sto7d DNA binding domain (Oscorbin et al. FEBS Lett. 594. 4338-4356. 2020), improved editing efficiency (FIG. 47A and FIG. 49C-FIG. 49D). When these top modifications were combined with a GGGGS linker (SEQ ID NO: 420) between the reverse transcriptase-Sto7d domain and the BxbINT, they produced .about.55% gene integration, highlighting the importance of directly recruiting the integrase to the target site (FIG. 47A). This optimized construct was referred to as SpCas9-(XTEN-48)-RT-Sto7d-(GGGGS)-BxbINT. The optimized construct achieved precise integration of templates as large as 36,000 bp with .about.20% integration efficiency (FIG. 47A), with complete integration of the full-length cargo confirmed by Sanger sequencing.
[0371] Additionally, pegRNAs containing different AttB length truncations were tested, and prime editing was found to be capable of inserting sequences up to 56 bp at the beta-actin (ACTB) gene locus, with higher efficiency at lengths below 31 bp (FIG. 48A-FIG. 48B). A panel of multiple enzymes was evaluated, including Bxb1 (i.e., BxbINT), TP901 (i.e., Tp9INT), and phiBT1 (i.e., Bt1INT) phage serine integrases. Prime editing successfully inserted all landing sites tested, with efficiencies between 10-30% (FIG. 48C-FIG. 48D).
Example 28
Viral Delivery & In Vivo Editing
[0372] In order to package the complete PASTE system in viral vectors, an AdV vector was utilized (FIG. 50B). Whether adenovirus could deliver a suitable template for BxbINT-mediated insertion was evaluated, either along with plasmids for SpCas9-RT-BxbINT and guide expression, or with AdV delivery of guides and BxbINT and plasmid delivery of SpCas9-RT; 10-20% integration of the .about.36 kb adenovirus genome carrying EGFP was achieved in HEK293FT and HepG2 cells (FIG. 50C). Upon packaging and delivering the cargo and PASTE system components across 3 AdV vectors, the complete PASTE system (Cas9-reverse transcriptase, integrase and guide RNAs, or cargo) could be substituted by adenoviral delivery, with integration of up to .about.50-60% with viral-only delivery in HEK293FT and HepG2 cells (FIG. 50D).
[0373] To further demonstrate that PASTE would be amenable to in vivo delivery, an mRNA version of the PASTE protein components was developed, as well as chemically modified synthetic atgRNA and nicking guide against the LMNB1 target (FIG. 50E). Electroporation of the mRNA and guides along with delivery of the template via adenovirus or plasmid yielded high-efficiency integration of up to ~23% (FIG. 50E-FIG. 50F). Because more sustained BxbINT expression could allow for integration into newly placed AttB sites in the genome, circular mRNA expression was tested and found to boost the efficiency of integration to ~30% (FIG. 50G-FIG. 50I).
Example 29
Simultaneous Deletion & Insertion with PASTE
[0374] The PASTE system was used to simultaneously delete one sequence and insert another. Deletions of 130 bp and 385 bp of the first exon of LMNB1, combined with insertion of an AttB nucleic acid sequence, were performed (FIG. 51A). These data show that it is possible to replace a DNA sequence using the PASTE system.
[0375] A 130 bp deletion of the first exon of LMNB1 with combined insertion of a 967 bp cargo using the PASTE system was also performed.
[0376] One of two AttP sequences was inserted using the minicircle template that has a mutated AttP, as described above. This AttP mutant shows better integration kinetics and efficiency, especially for the shorter AttBs (38-44 bp). The LMNB1 AttB used in this experiment is 38 bp (FIG. 51B).
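The combined deletion and insertion described in this example amounts to replacing a window of the target sequence with an AttB site. The sketch below shows that bookkeeping on a toy sequence. The locus string and the `replace_segment` and `net_length_change` helpers are hypothetical, and the 38-bp AttB shown (SEQ ID NO: 15, AttB-GT) is used only for illustration, since the text gives the length of the LMNB1 AttB but not which listed sequence it corresponds to.

```python
# 38-bp AttB from the sequence listing (SEQ ID NO: 15), used here for illustration only.
ATTB_38 = "ggcttgtcgacgacggcggtctccgtcgtcaggatcat"


def replace_segment(seq: str, start: int, delete_len: int, insert: str) -> str:
    """Delete `delete_len` bases beginning at `start` and put `insert` in their place."""
    if start < 0 or delete_len < 0 or start + delete_len > len(seq):
        raise ValueError("deletion window falls outside the sequence")
    return seq[:start] + insert + seq[start + delete_len:]


def net_length_change(delete_len: int, insert: str) -> int:
    """Change in locus length after the combined deletion and insertion."""
    return len(insert) - delete_len


if __name__ == "__main__":
    # Toy locus (hypothetical): 50 bp flank, 130 bp region to delete, 50 bp flank.
    locus = "a" * 50 + "c" * 130 + "t" * 50
    edited = replace_segment(locus, start=50, delete_len=130, insert=ATTB_38)
    print(len(locus), len(edited))
    for deleted in (130, 385):
        print(deleted, net_length_change(deleted, ATTB_38))
```

With a 38-bp AttB, the 130 bp and 385 bp deletions shorten the locus by 92 bp and 347 bp, respectively, before any cargo is integrated at the AttB site.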
Sequence CWU
1
1
431134DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideLox71 1ataacttcgt ataatgtatg ctatacgaac ggta
34234DNAArtificial SequenceDescription of Artificial
Sequence Synthetic oligonucleotideLox66 2taccgttcgt ataatgtatg
ctatacgaag ttat 34346DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB 3ggccggcttg tcgacgacgg cggtctccgt cgtcaggatc atccgg
46446DNAArtificial SequenceDescription of Artificial
Sequence Synthetic oligonucleotideAttP 4ccggatgatc ctgacgacgg
agaccgccgt cgtcgacaag ccggcc 46538DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-TT 5ggcttgtcga cgacggcgtt ctccgtcgtc aggatcat
38652DNAArtificial SequenceDescription of Artificial
Sequence Synthetic oligonucleotideAttP-TT 6gtggtttgtc tggtcaacca
ccgcgttctc agtggtgtac ggtacaaacc ca 52738DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-AA 7ggcttgtcga cgacggcgaa ctccgtcgtc aggatcat
38852DNAArtificial SequenceDescription of Artificial
Sequence Synthetic oligonucleotideAttP-AA 8gtggtttgtc tggtcaacca
ccgcgaactc agtggtgtac ggtacaaacc ca 52938DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-CC 9ggcttgtcga cgacggcgcc ctccgtcgtc aggatcat
381052DNAArtificial SequenceDescription of Artificial
Sequence Synthetic oligonucleotideAttP-CC 10gtggtttgtc tggtcaacca
ccgcgccctc agtggtgtac ggtacaaacc ca 521138DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-GG 11ggcttgtcga cgacggcggg ctccgtcgtc aggatcat
381252DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideAttP-GG 12gtggtttgtc
tggtcaacca ccgcgggctc agtggtgtac ggtacaaacc ca
521338DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-TG 13ggcttgtcga cgacggcgtg ctccgtcgtc aggatcat
381452DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideAttP-TG 14gtggtttgtc
tggtcaacca ccgcgtgctc agtggtgtac ggtacaaacc ca
521538DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-GT 15ggcttgtcga cgacggcggt ctccgtcgtc aggatcat
381652DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideAttP-GT 16gtggtttgtc
tggtcaacca ccgcggtctc agtggtgtac ggtacaaacc ca
521738DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-CT 17ggcttgtcga cgacggcgct ctccgtcgtc aggatcat
381852DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideAttP-CT 18gtggtttgtc
tggtcaacca ccgcgctctc agtggtgtac ggtacaaacc ca
521938DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-CA 19ggcttgtcga cgacggcgca ctccgtcgtc aggatcat
382052DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideAttP-CA 20gtggtttgtc
tggtcaacca ccgcgcactc agtggtgtac ggtacaaacc ca
522138DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-TC 21ggcttgtcga cgacggcgtc ctccgtcgtc aggatcat
382252DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideAttP-TC 22gtggtttgtc
tggtcaacca ccgcgtcctc agtggtgtac ggtacaaacc ca
522338DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-GA 23ggcttgtcga cgacggcgga ctccgtcgtc aggatcat
382452DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideAttP-GA 24gtggtttgtc
tggtcaacca ccgcggactc agtggtgtac ggtacaaacc ca
522538DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-AG 25ggcttgtcga cgacggcgag ctccgtcgtc aggatcat
382652DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideAttP-AG 26gtggtttgtc
tggtcaacca ccgcgagctc agtggtgtac ggtacaaacc ca
522738DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-AC 27ggcttgtcga cgacggcgac ctccgtcgtc aggatcat
382852DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideAttP-AC 28gtggtttgtc
tggtcaacca ccgcgacctc agtggtgtac ggtacaaacc ca
522938DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-AT 29ggcttgtcga cgacggcgat ctccgtcgtc aggatcat
383052DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideAttP-AT 30gtggtttgtc
tggtcaacca ccgcgatctc agtggtgtac ggtacaaacc ca
523138DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-GC 31ggcttgtcga cgacggcggc ctccgtcgtc aggatcat
383252DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideAttP-GC 32gtggtttgtc
tggtcaacca ccgcggcctc agtggtgtac ggtacaaacc ca
523338DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-CG 33ggcttgtcga cgacggcgcg ctccgtcgtc aggatcat
383452DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideAttP-CG 34gtggtttgtc
tggtcaacca ccgcgcgctc agtggtgtac ggtacaaacc ca
523538DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttB-TA 35ggcttgtcga cgacggcgta ctccgtcgtc aggatcat
383652DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideAttP-TA 36gtggtttgtc
tggtcaacca ccgcgtactc agtggtgtac ggtacaaacc ca
523745DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideC-31-B 37tgcgggtgcc agggcgtgcc cttgggctcc ccgggcgcgt
actcc 453842DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideC31-P 38gtgccccaac
tggggtaacc tttgagttct ctcagttggg gg
423957DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideR4-B 39gcgcccaagt tgcccatgac catgccgaag cagtggtaga
agggcaccgg cagacac 574070DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideR4-P 40aggcatgttc
cccaaagcga taccacttga agcagtggta ctgcttgtgg gtacactctg 60cgggtgatga
704160DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideBT1-B 41gtccttgacc aggtttttga cgaaagtgat ccagatgatc
cagctccaca ccccgaacgc 604263DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideBT1-P 42ggtgctgggt
tgttgtctct ggacagtgat ccatgggaaa ctactcagca ccaccaatgt 60tcc
634350DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideBxb-B 43tcggccggct tgtcgacgac ggcggtctcc gtcgtcagga
tcatccgggc 504458DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideBxb-P 44gtcgtggttt
gtctggtcaa ccaccgcggt ctcagtggtg tacggtacaa accccgac
584546DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideTG1-B 45gatcagctcc gcgggcaaga ccttctcctt cacggggtgg
aaggtc 464667DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideTG1-P 46tcaaccccgt
tccagcccaa cagtgttagt ctttgctctt acccagttgg gcgggatagc 60ctgcccg
674757DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideC1-B 47aacgattttc aaaggatcac tgaatcaaaa gtattgctca
tccacgcgaa atttttc 574857DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideC1-P 48aatattttag
gtatatgatt ttgtttatta gtgtaaataa cactatgtac ctaaaat
574953DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideC370-B 49tgtaaaggag actgataatg gcatgtacaa ctatactcgt
cggtaaaaag gca 535052DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideC370-P 50taaaaaaata
cagcgttttt catgtacaac tatactagtt gtagtgccta aa
525156DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideK38-B 51gagcgccgga tcagggagtg gacggcctgg gagcgctaca
cgctgtggct gcggtc 565256DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideK38-P 52ccctaatacg
caagtcgata actctcctgg gagcgttgac aacttgcgca ccctga
565368DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideRV-B 53tctcgtggtg gtggaaggtg ttggtgcggg gttggccgtg
gtcgaggtgg ggtggtggta 60gccattcg
685469DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideRV-P 54gcacaggtgt
agtgtatctc acaggtccac ggttggccgt ggactgctga agaacattcc 60acgccagga
695565DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideSPBC-B 55agtgcagcat gtcattaata tcagtacaga taaagctgta
tctcctgtga acacaatggg 60tgcca
655655DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideSPBC-P 56aaagtagtaa
gtatcttaaa aaacagataa agctgtatat taagatactt actac
555754DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideTP901-B 57tgataattgc caacacaatt aacatctcaa tcaaggtaaa
tgctttttcg tttt 545854DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideTP901-P 58aattgcgagt
ttttatttcg tttatttcaa ttaaggtaac taaaaaactc cttt
545968DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideWbeta-B 59aaggtagcgt caacgatagg tgtaactgtc gtgtttgtaa
cggtacttcc aacagctggc 60gtttcagt
686068DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideWbeta-P 60tagttttaaa
gttggttatt agttactgtg atatttatca cggtacccaa taaccaatga 60atatttga
686157DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideA118-B 61tgtaactttt tcggatcaag ctatgaagga cgcaaagagg
gaactaaaca cttaatt 576257DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideA118-P 62ttgtttagtt
cctcgttttc tctcgttgga agaagaagaa acgagaaact aaaatta
576363DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideBL3-B 63caacctgttg acatgtttcc acagacaact cacgtggagg
tagtcacggc ttttacgtta 60gtt
636461DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideBL3-P 64gagaatactg
ttgaacaatg aaaaactagg catgtagaag ttgtttgtgc actaacttta 60a
6165120DNAArtificial SequenceDescription of Artificial Sequence Synthetic
polynucleotideMR11-B 65acaggtcaac acatcgcagt tatcgaacaa tcttcgaaaa
tgtatggagg cacttgtatc 60aatataggat gtataccttc gaagacactt gtacatgatg
gattagaagg caaatccttt 12066120DNAArtificial SequenceDescription of
Artificial Sequence Synthetic polynucleotideMR11-P 66caaaataaaa
aacattgatt tttattaact tcttttgtgc ggaactacga acagttcatt 60aatacgaagt
gtacaaactt ccatacaaaa ataaccacga caattaagac gtggtttcta
1206717DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideAttL 67attatttctc accctga
176817DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideAttR 68atcatctccc
acccgga
176934DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotideVox 69aataggtctg agaacgccca ttctcagacg tatt
347034DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotideFRT 70gaagttccta
tactttctag agaataggaa cttc
34715881DNAArtificial SequenceDescription of Artificial Sequence
Synthetic polynucleotideCre Recombinase Expression Plasmid
71ggtcgacatt gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat
60agcccatata tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg
120cccaacgacc cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata
180gggactttcc attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta
240catcaagtgt atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc
300gcctggcatt atgcccagta catgacctta tgggactttc ctacttggca gtacatctac
360gtattagtca tcgctattac catggtcgag gtgagcccca cgttctgctt cactctcccc
420atctcccccc cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca
480gcgatggggg cggggggggg gggggcgcgc gccaggcggg gggggggggg gggggggggg
540gggggggggg gggcgggggg gggcggcggc agccaatcag agcggcgcgc tccgaaagtt
600tccttttatg gcgaggcggc ggcggcggcg gccctataaa aagcgaagcg cgcggcgggc
660gggagtcgct gcgcgctgcc ttcgccccgt gccccgctcc gccgccgcct cgcgccgccc
720gccccggctc tgactgaccg cgttactccc acaggtgagc gggcgggacg gcccttctcc
780tccgggctgt aattagcgct tggtttaatg acggcttgtt tcttttctgt ggctgcgtga
840aagccttgag gggctccggg agggcccttt gtgcgggggg agcggctcgg ggggtgcgtg
900cgtgtgtgtg tgcgtgggga gcgccgcgtg cggctccgcg ctgcccggcg gctgtgagcg
960ctgcgggcgc ggcgcggggc tttgtgcgct ccgcagtgtg cgcgagggga gcgcggccgg
1020gggcggtgcc ccgcggtgcg gggggggctg cgaggggaac aaaggctgcg tgcggggtgt
1080gtgcgtgggg gggtgagcag ggggtgtggg cgcgtcggtc gggctgcaac cccccctgca
1140cccccctccc cgagttgctg agcacggccc ggcttcgggt gcggggctcc gtacggggcg
1200tggcgcgggg ctcgccgtgc cgggcggggg gtggcggcag gtgggggtgc cgggcggggc
1260ggggccgcct cgggccgggg agggctcggg ggaggggcgc ggcggccccc ggagcgccgg
1320cggctgtcga ggcgcggcga gccgcagcca ttgcctttta tggtaatcgt gcgagagggc
1380gcagggactt cctttgtccc aaatctgtgc ggagccgaaa tctgggaggc gccgccgcac
1440cccctctagc gggcgcgggg cgaagcggtg cggcgccggc aggaaggaaa tgggcgggga
1500gggccttcgt gcgtcgccgc gccgccgtcc ccttctccct ctccagcctc ggggctgtcc
1560gcggggggac ggctgccttc gggggggacg gggcagggcg gggttcggct tctggcgtgt
1620gaccggcggc tctagagcct ctgctaacca tgttcatgcc ttcttctttt tcctacagct
1680cctgggcaac gtgctggtta ttgtgctgtc tcatcatttt ggcaaagaat tctgagccgc
1740caccatggcc aatttactga ccgtacacca aaatttgcct gcattaccgg tcgatgcaac
1800gagtgatgag gttcgcaaga acctgatgga catgttcagg gatcgccagg cgttttctga
1860gcatacctgg aaaatgcttc tgtccgtttg ccggtcgtgg gcggcatggt gcaagttgaa
1920taaccggaaa tggtttcccg cagaacctga agatgttcgc gattatcttc tatatcttca
1980ggcgcgcggt ctggcagtaa aaactatcca gcaacatttg ggccagctaa acatgcttca
2040tcgtcggtcc gggctgccac gaccaagtga cagcaatgct gtttcactgg ttatgcggcg
2100gatccgaaaa gaaaacgttg atgccggtga acgtgcaaaa caggctctag cgttcgaacg
2160cactgatttc gaccaggttc gttcactcat ggaaaatagc gatcgctgcc aggatatacg
2220taatctggca tttctgggga ttgcttataa caccctgtta cgtatagccg aaattgccag
2280gatcagggtt aaagatatct cacgtactga cggtgggaga atgttaatcc atattggcag
2340aacgaaaacg ctggttagca ccgcaggtgt agagaaggca cttagcctgg gggtaactaa
2400actggtcgag cgatggattt ccgtctctgg tgtagctgat gatccgaata actacctgtt
2460ttgccgggtc agaaaaaatg gtgttgccgc gccatctgcc accagccagc tatcaactcg
2520cgccctggaa gggatttttg aagcaactca tcgattgatt tacggcgcta aggatgactc
2580tggtcagaga tacctggcct ggtctggaca cagtgcccgt gtcggagccg cgcgagatat
2640ggcccgcgct ggagtttcaa taccggagat catgcaagct ggtggctgga ccaatgtaaa
2700tattgtcatg aactatatcc gtaacctgga tagtgaaaca ggggcaatgg tgcgcctgct
2760ggaagatggc gatggaccgg tggaacaaaa acttatttct gaagaagatc tgtgatagcg
2820gccgcactcc tcaggtgcag gctgcctatc agaaggtggt ggctggtgtg gccaatgccc
2880tggctcacaa ataccactga gatctttttc cctctgccaa aaattatggg gacatcatga
2940agccccttga gcatctgact tctggctaat aaaggaaatt tattttcatt gcaatagtgt
3000gttggaattt tttgtgtctc tcactcggaa ggacatatgg gagggcaaat catttaaaac
3060atcagaatga gtatttggtt tagagtttgg caacatatgc ccatatgctg gctgccatga
3120acaaaggttg gctataaaga ggtcatcagt atatgaaaca gccccctgct gtccattcct
3180tattccatag aaaagccttg acttgaggtt agattttttt tatattttgt tttgtgttat
3240ttttttcttt aacatcccta aaattttcct tacatgtttt actagccaga tttttcctcc
3300tctcctgact actcccagtc atagctgtcc ctcttctctt atggagatcc ctcgacctgc
3360agcccaagct tggcgtaatc atggtcatag ctgtttcctg tgtgaaattg ttatccgctc
3420acaattccac acaacatacg agccggaagc ataaagtgta aagcctgggg tgcctaatga
3480gtgagctaac tcacattaat tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg
3540tcgtgccagc ggatccgcat ctcaattagt cagcaaccat agtcccgccc ctaactccgc
3600ccatcccgcc cctaactccg cccagttccg cccattctcc gccccatggc tgactaattt
3660tttttattta tgcagaggcc gaggccgcct cggcctctga gctattccag aagtagtgag
3720gaggcttttt tggaggccta ggcttttgca aaaagctaac ttgtttattg cagcttataa
3780tggttacaaa taaagcaata gcatcacaaa tttcacaaat aaagcatttt tttcactgca
3840ttctagttgt ggtttgtcca aactcatcaa tgtatcttat catgtctgga tccgctgcat
3900taatgaatcg gccaacgcgc ggggagaggc ggtttgcgta ttgggcgctc ttccgcttcc
3960tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca
4020aaggcggtaa tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca
4080aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg
4140ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg
4200acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt
4260ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt
4320tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc
4380tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt
4440gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt
4500agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc
4560tacactagaa gaacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa
4620agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt
4680tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct
4740acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgagatta
4800tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa atcaatctaa
4860agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga ggcacctatc
4920tcagcgatct gtctatttcg ttcatccata gttgcctgac tccccgtcgt gtagataact
4980acgatacggg agggcttacc atctggcccc agtgctgcaa tgataccgcg agacccacgc
5040tcaccggctc cagatttatc agcaataaac cagccagccg gaagggccga gcgcagaagt
5100ggtcctgcaa ctttatccgc ctccatccag tctattaatt gttgccggga agctagagta
5160agtagttcgc cagttaatag tttgcgcaac gttgttgcca ttgctacagg catcgtggtg
5220tcacgctcgt cgtttggtat ggcttcattc agctccggtt cccaacgatc aaggcgagtt
5280acatgatccc ccatgttgtg caaaaaagcg gttagctcct tcggtcctcc gatcgttgtc
5340agaagtaagt tggccgcagt gttatcactc atggttatgg cagcactgca taattctctt
5400actgtcatgc catccgtaag atgcttttct gtgactggtg agtactcaac caagtcattc
5460tgagaatagt gtatgcggcg accgagttgc tcttgcccgg cgtcaatacg ggataatacc
5520gcgccacata gcagaacttt aaaagtgctc atcattggaa aacgttcttc ggggcgaaaa
5580ctctcaagga tcttaccgct gttgagatcc agttcgatgt aacccactcg tgcacccaac
5640tgatcttcag catcttttac tttcaccagc gtttctgggt gagcaaaaac aggaaggcaa
5700aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt gaatactcat actcttcctt
5760tttcaatatt attgaagcat ttatcagggt tattgtctca tgagcggata catatttgaa
5820tgtatttaga aaaataaaca aataggggtt ccgcgcacat ttccccgaaa agtgccacct
5880g
5881724915DNAArtificial SequenceDescription of Artificial Sequence
Synthetic polynucleotideGFP-Lox66-Cre expression plasmid
72agctctgatc aagagacagg atgaggatcg tttcgcatga ttgaacaaga tggattgcac
60gcaggttctc cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca
120atcggctgct ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt
180gtcaagaccg acctgtccgg tgccctgaat gaactgcaag acgaggcagc gcggctatcg
240tggctggcca cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga
300agggactggc tgctattggg cgaagtgccg gggcaggatc tccatgtcat ctacaccttg
360ctcctgccga gaaagtatcc atcatggctg atgcaatgcg gcggctgcat acgcttgatc
420cggctacctg cccattcgac caccaagcga aacatcgcat cgagcgagca cgtactcgga
480tggaagccgg tcttgtcgat caggatgatc tggacgaaga gcatcagggg ctcgcgccag
540ccgaactgtt cgccaggctc aaggcgagca tgcccgacgg cgaggatctc gtcgtgaccc
600atggcgatgc ctgcttgccg aatatcatgg tggaaaatgg ccgcttttct ggattcatcg
660actgtggccg gctgggtgtg gcggaccgct atcaggacat agcgttggct acccgtgata
720ttgctgaaga gcttggcggc gaatgggctg accgcttcct cgtgctttac ggtatcgccg
780ctcccgattc gcagcgcatc gccttctatc gccttcttga cgagttcttc tgaattatta
840actcgagatc cactagagtg tggcggccgc attcttataa tcagcatcat gatgtggtac
900cacatcatga tgctgattac ccccaactga gagaactcaa aggttacccc agttggggcg
960ggcccacaaa taaagcaata gcatcacaaa tttcacaaat aaagcatttt tttcactgca
1020ttctagttgt ggtttgtcca aactcatcga gctcgagatc tggcgaaggc gatgggggtc
1080ttgaaggcgt gctggtactc cacgatgccc agctcggtgt tgctgtgcag ctcctccacg
1140cggcggaagg cgaacatggg gcccccgttc tgcaggatgc tggggtggat ggcgctcttg
1200aagtgcatgt ggctgtccac cacgaagctg tagtagccgc cgtcgcgcag gctgaaggtg
1260cgggcgaagc tgcccaccag cacgttatcg cccatggggt gcaggtgctc cacggtggcg
1320ttgctgcgga tgatcttgtc ggtgaagatc acgctgtcct cggggaagcc ggtgcccacc
1380accttgaagt cgccgatcac gcggccggcc tcgtagcggt agctgaagct cacgtgcagc
1440acgccgccgt cctcgtactt ctcgatgcgg gtgttggtgt agccgccgtt gttgatggcg
1500tgcaggaagg ggttctcgta gccgctgggg taggtgccga agtggtagaa gccgtagccc
1560atcacgtggc tcagcaggta ggggctgaag gtcagggcgc ctttggtgct cttcatcttg
1620ttggtcatgc ggccctgctc gggggtgccc tctccgccgc ccaccagctc gaactccacg
1680ccgttcaggg tgccggtgat gcggcactcg atcttcatgg cgggcatggt ggcgaccggt
1740agcgctagcg gcttcggata acttcgtata gcatacatta tacgaacggt aagcgctacc
1800gccggcatac ccaagtgaag ttgctcgcag cttatagtcg cgcccgggga gcccaagggc
1860acgccctggc accgcggccg ctgagtctcg accatcatca tcatcatcat tgagtttatc
1920tgggataaca gggtaatgtc atctagggat aacagggtat gtcatctggg ataacagggt
1980aatgtatcta gggataacag ggtaatgtca tctgggataa cagggtaatg tcatctaggg
2040ataacagggt atgtcatctg ggataacagg gtaatgtatc tagggataac agggtaatgt
2100catctgggat aacagggtaa tgtcatctag ggataacagg gtatgtcatc tgggataaca
2160gggtaatgta tctagggata acagggtaat gtcatctggg ataacagggt aatgtcatct
2220agggataaca gggtatgtca tctgggataa cagggtaatg tatctaggga taacagggta
2280atgtcatctg ggataacagg gtaatgtcat ctagggataa cagggtatgt catctgggat
2340aacagggtaa tgtatctagg gataacaggg taatgtcatc tgggataaca gggtaatgtc
2400atctagggat aacagggtat gtcatctggg ataacagggt aatgtatcta gggataacag
2460ggtaatgtca tctgggataa cagggtaatg tcatctaggg ataacagggt aaatgtcatc
2520tagggataac agggtaatgt catctaggga taacagggta atgtcatctg ggataacagg
2580gtaatgtcat ctagggataa cagggtaatg tatcgccagc gtcgcacagc atgtttgctt
2640gtcgccgtcg cgtctgtcac atcttttccg ccagcagtta gggattagcg tcttaagctg
2700gcgcgaggac caacgtatca gccaggcgaa gctgcttttg agcaccaccc ggatgcctat
2760cgccaccgtc ggtcgcaatg ttggttttga cgatcaactc tatttctcgc gggtatttaa
2820aaaatgcacc ggggccagcc cgagcgagtt ccgtgccggt tgtgaagaaa aagtgaatga
2880tgtagccgtc aagttgtcat aattggtaac gaatcagaca attgacggct tgacggagta
2940gcatagggtt tgcagaatcc ctgcttcgtc catttgacag gcacattatg catgccgctt
3000cgccttcgcg cgcgaattga tctgctgcct cgcgcgtttc ggtgatgacg gtgaaaacct
3060ctgacacatg cagctcccgg agacggtcac agcttgtctg taagcggatg ccgggagcag
3120acaagcccgt cagggcgcgt cagcgggtgt tggcgggtgt cggggcgcag ccatgaccca
3180gtcacgtagc gatagcggag tgtatactgg cttaactatg cggcatcaga gcagattgta
3240ctgagagtgc accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc
3300atcaggcgct cttccgcttc ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg
3360cgagcggtat cagctcactc aaaggcggta atacggttat ccacagaatc aggggataac
3420gcaggaaaga acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg
3480ttgctggcgt ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgacgctca
3540agtcagaggt ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc
3600tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc
3660ccttcgggaa gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag
3720gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc
3780ttatccggta actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca
3840gcagccactg gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg
3900aagtggtggc ctaactacgg ctacactaga aggacagtat ttggtatctg cgctctgctg
3960aagccagtta ccttcggaaa aagagttggt agctcttgat ccggcaaaca aaccaccgct
4020ggtagcggtg gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa
4080gaagatcctt tgatcttttc tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa
4140gggattttgg tcatgagcgg atacatattt gaatgtattt agaaaaataa acaaaagagt
4200ttgtagaaac gcaaaaaggc catccgtcag gatggccttc tgcttaattt gatgcctggc
4260agtttatggc gggcgtcctg cccgccaccc tccgggccgt tgcttcgcaa cgttcaaatc
4320cgctcccggc ggatttgtcc tactcaggag agcgttcacc gacaaacaac agataaaacg
4380aaaggcccag tctttcgact gagcctttcg ttttatttga tgcctggcag ttccctactc
4440tcgcatgggg agaccccaca ctaccatcgg cgctacggcg tttcacttct gagttcggca
4500tggggtcagg tgggaccacc gcgctactgc cgccaggcaa attctgtttt atcagaccgc
4560ttctgcgttc tgatttaatc tgtatcaggc tgaaaatctt ctctcatccg ccaaaacagc
4620caagctggag accgtttggc ccccctcgag cacgtagaaa gccagtccgc agaaacggtg
4680ctgaccccgg atgaatgtca gctactgggc tatctggaca agggaaaacg caagcgcaaa
4740gagaaagcag gtagcttgca gtgggcttac atggcgatag ctagactggg cggttttatg
4800gacagcaagc gaaccggaat tgccagctgg ggcgccctct ggtaaggttg ggaagccctg
4860caaagtaaac tggatggctt tctcgccgcc aaggatctga tggcgcaggg gatca
49157310815DNAArtificial SequenceDescription of Artificial Sequence
Synthetic polynucleotidepCMV-PE2-P2A-Cre 73acgcgttgac attgattatt
gactagttat taatagtaat caattacggg gtcattagtt 60catagcccat atatggagtt
ccgcgttaca taacttacgg taaatggccc gcctggctga 120ccgcccaacg acccccgccc
attgacgtca ataatgacgt atgttcccat agtaacgcca 180atagggactt tccattgacg
tcaatgggtg gagtatttac ggtaaactgc ccacttggca 240gtacatcaag tgtatcatat
gccaagtacg ccccctattg acgtcaatga cggtaaatgg 300cccgcctggc attatgccca
gtacatgacc ttatgggact ttcctacttg gcagtacatc 360tacgtattag tcatcgctat
taccatggtg atgcggtttt ggcagtacat caatgggcgt 420ggatagcggt ttgactcacg
gggatttcca agtctccacc ccattgacgt caatgggagt 480ttgttttggc accaaaatca
acgggacttt ccaaaatgtc gtaacaactc cgccccattg 540acgcaaatgg gcggtaggcg
tgtacggtgg gaggtctata taagcagagc tggtttagtg 600aaccgtcaga tccgctagag
atccgcggcc gctaatacga ctcactatag ggagagccgc 660caccatgaaa cggacagccg
acggaagcga gttcgagtca ccaaagaaga agcggaaagt 720cgacaagaag tacagcatcg
gcctggacat cggcaccaac tctgtgggct gggccgtgat 780caccgacgag tacaaggtgc
ccagcaagaa attcaaggtg ctgggcaaca ccgaccggca 840cagcatcaag aagaacctga
tcggagccct gctgttcgac agcggcgaaa cagccgaggc 900cacccggctg aagagaaccg
ccagaagaag atacaccaga cggaagaacc ggatctgcta 960tctgcaagag atcttcagca
acgagatggc caaggtggac gacagcttct tccacagact 1020ggaagagtcc ttcctggtgg
aagaggataa gaagcacgag cggcacccca tcttcggcaa 1080catcgtggac gaggtggcct
accacgagaa gtaccccacc atctaccacc tgagaaagaa 1140actggtggac agcaccgaca
aggccgacct gcggctgatc tatctggccc tggcccacat 1200gatcaagttc cggggccact
tcctgatcga gggcgacctg aaccccgaca acagcgacgt 1260ggacaagctg ttcatccagc
tggtgcagac ctacaaccag ctgttcgagg aaaaccccat 1320caacgccagc ggcgtggacg
ccaaggccat cctgtctgcc agactgagca agagcagacg 1380gctggaaaat ctgatcgccc
agctgcccgg cgagaagaag aatggcctgt tcggaaacct 1440gattgccctg agcctgggcc
tgacccccaa cttcaagagc aacttcgacc tggccgagga 1500tgccaaactg cagctgagca
aggacaccta cgacgacgac ctggacaacc tgctggccca 1560gatcggcgac cagtacgccg
acctgtttct ggccgccaag aacctgtccg acgccatcct 1620gctgagcgac atcctgagag
tgaacaccga gatcaccaag gcccccctga gcgcctctat 1680gatcaagaga tacgacgagc
accaccagga cctgaccctg ctgaaagctc tcgtgcggca 1740gcagctgcct gagaagtaca
aagagatttt cttcgaccag agcaagaacg gctacgccgg 1800ctacattgac ggcggagcca
gccaggaaga gttctacaag ttcatcaagc ccatcctgga 1860aaagatggac ggcaccgagg
aactgctcgt gaagctgaac agagaggacc tgctgcggaa 1920gcagcggacc ttcgacaacg
gcagcatccc ccaccagatc cacctgggag agctgcacgc 1980cattctgcgg cggcaggaag
atttttaccc attcctgaag gacaaccggg aaaagatcga 2040gaagatcctg accttccgca
tcccctacta cgtgggccct ctggccaggg gaaacagcag 2100attcgcctgg atgaccagaa
agagcgagga aaccatcacc ccctggaact tcgaggaagt 2160ggtggacaag ggcgcttccg
cccagagctt catcgagcgg atgaccaact tcgataagaa 2220cctgcccaac gagaaggtgc
tgcccaagca cagcctgctg tacgagtact tcaccgtgta 2280taacgagctg accaaagtga
aatacgtgac cgagggaatg agaaagcccg ccttcctgag 2340cggcgagcag aaaaaggcca
tcgtggacct gctgttcaag accaaccgga aagtgaccgt 2400gaagcagctg aaagaggact
acttcaagaa aatcgagtgc ttcgactccg tggaaatctc 2460cggcgtggaa gatcggttca
acgcctccct gggcacatac cacgatctgc tgaaaattat 2520caaggacaag gacttcctgg
acaatgagga aaacgaggac attctggaag atatcgtgct 2580gaccctgaca ctgtttgagg
acagagagat gatcgaggaa cggctgaaaa cctatgccca 2640cctgttcgac gacaaagtga
tgaagcagct gaagcggcgg agatacaccg gctggggcag 2700gctgagccgg aagctgatca
acggcatccg ggacaagcag tccggcaaga caatcctgga 2760tttcctgaag tccgacggct
tcgccaacag aaacttcatg cagctgatcc acgacgacag 2820cctgaccttt aaagaggaca
tccagaaagc ccaggtgtcc ggccagggcg atagcctgca 2880cgagcacatt gccaatctgg
ccggcagccc cgccattaag aagggcatcc tgcagacagt 2940gaaggtggtg gacgagctcg
tgaaagtgat gggccggcac aagcccgaga acatcgtgat 3000cgaaatggcc agagagaacc
agaccaccca gaagggacag aagaacagcc gcgagagaat 3060gaagcggatc gaagagggca
tcaaagagct gggcagccag atcctgaaag aacaccccgt 3120ggaaaacacc cagctgcaga
acgagaagct gtacctgtac tacctgcaga atgggcggga 3180tatgtacgtg gaccaggaac
tggacatcaa ccggctgtcc gactacgatg tggacgctat 3240cgtgcctcag agctttctga
aggacgactc catcgacaac aaggtgctga ccagaagcga 3300caagaaccgg ggcaagagcg
acaacgtgcc ctccgaagag gtcgtgaaga agatgaagaa 3360ctactggcgg cagctgctga
acgccaagct gattacccag agaaagttcg acaatctgac 3420caaggccgag agaggcggcc
tgagcgaact ggataaggcc ggcttcatca agagacagct 3480ggtggaaacc cggcagatca
caaagcacgt ggcacagatc ctggactccc ggatgaacac 3540taagtacgac gagaatgaca
agctgatccg ggaagtgaaa gtgatcaccc tgaagtccaa 3600gctggtgtcc gatttccgga
aggatttcca gttttacaaa gtgcgcgaga tcaacaacta 3660ccaccacgcc cacgacgcct
acctgaacgc cgtcgtggga accgccctga tcaaaaagta 3720ccctaagctg gaaagcgagt
tcgtgtacgg cgactacaag gtgtacgacg tgcggaagat 3780gatcgccaag agcgagcagg
aaatcggcaa ggctaccgcc aagtacttct tctacagcaa 3840catcatgaac tttttcaaga
ccgagattac cctggccaac ggcgagatcc ggaagcggcc 3900tctgatcgag acaaacggcg
aaaccgggga gatcgtgtgg gataagggcc gggattttgc 3960caccgtgcgg aaagtgctga
gcatgcccca agtgaatatc gtgaaaaaga ccgaggtgca 4020gacaggcggc ttcagcaaag
agtctatcct gcccaagagg aacagcgata agctgatcgc 4080cagaaagaag gactgggacc
ctaagaagta cggcggcttc gacagcccca ccgtggccta 4140ttctgtgctg gtggtggcca
aagtggaaaa gggcaagtcc aagaaactga agagtgtgaa 4200agagctgctg gggatcacca
tcatggaaag aagcagcttc gagaagaatc ccatcgactt 4260tctggaagcc aagggctaca
aagaagtgaa aaaggacctg atcatcaagc tgcctaagta 4320ctccctgttc gagctggaaa
acggccggaa gagaatgctg gcctctgccg gcgaactgca 4380gaagggaaac gaactggccc
tgccctccaa atatgtgaac ttcctgtacc tggccagcca 4440ctatgagaag ctgaagggct
cccccgagga taatgagcag aaacagctgt ttgtggaaca 4500gcacaagcac tacctggacg
agatcatcga gcagatcagc gagttctcca agagagtgat 4560cctggccgac gctaatctgg
acaaagtgct gtccgcctac aacaagcacc gggataagcc 4620catcagagag caggccgaga
atatcatcca cctgtttacc ctgaccaatc tgggagcccc 4680tgccgccttc aagtactttg
acaccaccat cgaccggaag aggtacacca gcaccaaaga 4740ggtgctggac gccaccctga
tccaccagag catcaccggc ctgtacgaga cacggatcga 4800cctgtctcag ctgggaggtg
actctggagg atctagcgga ggatcctctg gcagcgagac 4860accaggaaca agcgagtcag
caacaccaga gagcagtggc ggcagcagcg gcggcagcag 4920caccctaaat atagaagatg
agtatcggct acatgagacc tcaaaagagc cagatgtttc 4980tctagggtcc acatggctgt
ctgattttcc tcaggcctgg gcggaaaccg ggggcatggg 5040actggcagtt cgccaagctc
ctctgatcat acctctgaaa gcaacctcta cccccgtgtc 5100cataaaacaa taccccatgt
cacaagaagc cagactgggg atcaagcccc acatacagag 5160actgttggac cagggaatac
tggtaccctg ccagtccccc tggaacacgc ccctgctacc 5220cgttaagaaa ccagggacta
atgattatag gcctgtccag gatctgagag aagtcaacaa 5280gcgggtggaa gacatccacc
ccaccgtgcc caacccttac aacctcttga gcgggctccc 5340accgtcccac cagtggtaca
ctgtgcttga tttaaaggat gcctttttct gcctgagact 5400ccaccccacc agtcagcctc
tcttcgcctt tgagtggaga gatccagaga tgggaatctc 5460aggacaattg acctggacca
gactcccaca gggtttcaaa aacagtccca ccctgtttaa 5520tgaggcactg cacagagacc
tagcagactt ccggatccag cacccagact tgatcctgct 5580acagtacgtg gatgacttac
tgctggccgc cacttctgag ctagactgcc aacaaggtac 5640tcgggccctg ttacaaaccc
tagggaacct cgggtatcgg gcctcggcca agaaagccca 5700aatttgccag aaacaggtca
agtatctggg gtatcttcta aaagagggtc agagatggct 5760gactgaggcc agaaaagaga
ctgtgatggg gcagcctact ccgaagaccc ctcgacaact 5820aagggagttc ctagggaagg
caggcttctg tcgcctcttc atccctgggt ttgcagaaat 5880ggcagccccc ctgtaccctc
tcaccaaacc ggggactctg tttaattggg gcccagacca 5940acaaaaggcc tatcaagaaa
tcaagcaagc tcttctaact gccccagccc tggggttgcc 6000agatttgact aagccctttg
aactctttgt cgacgagaag cagggctacg ccaaaggtgt 6060cctaacgcaa aaactgggac
cttggcgtcg gccggtggcc tacctgtcca aaaagctaga 6120cccagtagca gctgggtggc
ccccttgcct acggatggta gcagccattg ccgtactgac 6180aaaggatgca ggcaagctaa
ccatgggaca gccactagtc attctggccc cccatgcagt 6240agaggcacta gtcaaacaac
cccccgaccg ctggctttcc aacgcccgga tgactcacta 6300tcaggccttg cttttggaca
cggaccgggt ccagttcgga ccggtggtag ccctgaaccc 6360ggctacgctg ctcccactgc
ctgaggaagg gctgcaacac aactgccttg atatcctggc 6420cgaagcccac ggaacccgac
ccgacctaac ggaccagccg ctcccagacg ccgaccacac 6480ctggtacacg gatggaagca
gtctcttaca agagggacag cgtaaggcgg gagctgcggt 6540gaccaccgag accgaggtaa
tctgggctaa agccctgcca gccgggacat ccgctcagcg 6600ggctgaactg atagcactca
cccaggccct aaagatggca gaaggtaaga agctaaatgt 6660ttatactgat agccgttatg
cttttgctac tgcccatatc catggagaaa tatacagaag 6720gcgtgggtgg ctcacatcag
aaggcaaaga gatcaaaaat aaagacgaga tcttggccct 6780actaaaagcc ctctttctgc
ccaaaagact tagcataatc cattgtccag gacatcaaaa 6840gggacacagc gccgaggcta
gaggcaaccg gatggctgac caagcggccc gaaaggcagc 6900catcacagag actccagaca
cctctaccct cctcatagaa aattcatcac cctctggcgg 6960ctcaaaaaga accgccgacg
gcagcgaatt cgagcccaag aagaagagga aagtcggaag 7020cggagctact aacttcagcc
tgctgaagca ggctggcgac gtggaggaga accctggacc 7080taatttactg accgtacacc
aaaatttgcc tgcattaccg gtcgatgcaa cgagtgatga 7140ggttcgcaag aacctgatgg
acatgttcag ggatcgccag gcgttttctg agcatacctg 7200gaaaatgctt ctgtccgttt
gccggtcgtg ggcggcatgg tgcaagttga ataaccggaa 7260atggtttccc gcagaacctg
aagatgttcg cgattatctt ctatatcttc aggcgcgcgg 7320tctggcagta aaaactatcc
agcaacattt gggccagcta aacatgcttc atcgtcggtc 7380cgggctgcca cgaccaagtg
acagcaatgc tgtttcactg gttatgcggc ggatccgaaa 7440agaaaacgtt gatgccggtg
aacgtgcaaa acaggctcta gcgttcgaac gcactgattt 7500cgaccaggtt cgttcactca
tggaaaatag cgatcgctgc caggatatac gtaatctggc 7560atttctgggg attgcttata
acaccctgtt acgtatagcc gaaattgcca ggatcagggt 7620taaagatatc tcacgtactg
acggtgggag aatgttaatc catattggca gaacgaaaac 7680gctggttagc accgcaggtg
tagagaaggc acttagcctg ggggtaacta aactggtcga 7740gcgatggatt tccgtctctg
gtgtagctga tgatccgaat aactacctgt tttgccgggt 7800cagaaaaaat ggtgttgccg
cgccatctgc caccagccag ctatcaactc gcgccctgga 7860agggattttt gaagcaactc
atcgattgat ttacggcgct aaggatgact ctggtcagag 7920atacctggcc tggtctggac
acagtgcccg tgtcggagcc gcgcgagata tggcccgcgc 7980tggagtttca ataccggaga
tcatgcaagc tggtggctgg accaatgtaa atattgtcat 8040gaactatatc cgtaacctgg
atagtgaaac aggggcaatg gtgcgcctgc tggaagatgg 8100cgattaattt aaacccgctg
atcagcctcg actgtgcctt ctagttgcca gccatctgtt 8160gtttgcccct cccccgtgcc
ttccttgacc ctggaaggtg ccactcccac tgtcctttcc 8220taataaaatg agaaaattgc
atcgcattgt ctgagtaggt gtcattctat tctggggggt 8280ggggtggggc aggacagcaa
gggggaggat tgggaagaca atagcaggca tgctggggat 8340gcggtgggct ctatggcttc
tgaggcggaa agaaccagct ggggctcgat accgtcgacc 8400tctagctaga gcttggcgta
atcatggtca tagctgtttc ctgtgtgaaa ttgttatccg 8460ctcacaattc cacacaacat
acgagccgga agcataaagt gtaaagccta gggtgcctaa 8520tgagtgagct aactcacatt
aattgcgttg cgctcactgc ccgctttcca gtcgggaaac 8580ctgtcgtgcc agctgcatta
atgaatcggc caacgcgcgg ggagaggcgg tttgcgtatt 8640gggcgctctt ccgcttcctc
gctcactgac tcgctgcgct cggtcgttcg gctgcggcga 8700gcggtatcag ctcactcaaa
ggcggtaata cggttatcca cagaatcagg ggataacgca 8760ggaaagaaca tgtgagcaaa
aggccagcaa aaggccagga accgtaaaaa ggccgcgttg 8820ctggcgtttt tccataggct
ccgcccccct gacgagcatc acaaaaatcg acgctcaagt 8880cagaggtggc gaaacccgac
aggactataa agataccagg cgtttccccc tggaagctcc 8940ctcgtgcgct ctcctgttcc
gaccctgccg cttaccggat acctgtccgc ctttctccct 9000tcgggaagcg tggcgctttc
tcatagctca cgctgtaggt atctcagttc ggtgtaggtc 9060gttcgctcca agctgggctg
tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta 9120tccggtaact atcgtcttga
gtccaacccg gtaagacacg acttatcgcc actggcagca 9180gccactggta acaggattag
cagagcgagg tatgtaggcg gtgctacaga gttcttgaag 9240tggtggccta actacggcta
cactagaaga acagtatttg gtatctgcgc tctgctgaag 9300ccagttacct tcggaaaaag
agttggtagc tcttgatccg gcaaacaaac caccgctggt 9360agcggtggtt tttttgtttg
caagcagcag attacgcgca gaaaaaaagg atctcaagaa 9420gatcctttga tcttttctac
ggggtctgac gctcagtgga acgaaaactc acgttaaggg 9480attttggtca tgagattatc
aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga 9540agttttaaat caatctaaag
tatatatgag taaacttggt ctgacagtta ccaatgctta 9600atcagtgagg cacctatctc
agcgatctgt ctatttcgtt catccatagt tgcctgactc 9660cccgtcgtgt agataactac
gatacgggag ggcttaccat ctggccccag tgctgcaatg 9720ataccgcgag acccacgctc
accggctcca gatttatcag caataaacca gccagccgga 9780agggccgagc gcagaagtgg
tcctgcaact ttatccgcct ccatccagtc tattaattgt 9840tgccgggaag ctagagtaag
tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt 9900gctacaggca tcgtggtgtc
acgctcgtcg tttggtatgg cttcattcag ctccggttcc 9960caacgatcaa ggcgagttac
atgatccccc atgttgtgca aaaaagcggt tagctccttc 10020ggtcctccga tcgttgtcag
aagtaagttg gccgcagtgt tatcactcat ggttatggca 10080gcactgcata attctcttac
tgtcatgcca tccgtaagat gcttttctgt gactggtgag 10140tactcaacca agtcattctg
agaatagtgt atgcggcgac cgagttgctc ttgcccggcg 10200tcaatacggg ataataccgc
gccacatagc agaactttaa aagtgctcat cattggaaaa 10260cgttcttcgg ggcgaaaact
ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa 10320cccactcgtg cacccaactg
atcttcagca tcttttactt tcaccagcgt ttctgggtga 10380gcaaaaacag gaaggcaaaa
tgccgcaaaa aagggaataa gggcgacacg gaaatgttga 10440atactcatac tcttcctttt
tcaatattat tgaagcattt atcagggtta ttgtctcatg 10500agcggataca tatttgaatg
tatttagaaa aataaacaaa taggggttcc gcgcacattt 10560ccccgaaaag tgccacctga
cgtcgacgga tcgggagatc gatctcccga tcccctaggg 10620tcgactctca gtacaatctg
ctctgatgcc gcatagttaa gccagtatct gctccctgct 10680tgtgtgttgg aggtcgctga
gtagtgcgcg agcaaaattt aagctacaac aaggcaaggc 10740ttgaccgaca attgcatgaa
gaatctgctt agggttaggc gttttgcgct gcttcgcgat 10800gtacgggcca gatat
10815

SEQ ID NO: 74; Length: 20; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide; +90 ngRNA guide sequence
74 gtcaaccagt atcccggtgc
20

SEQ ID NO: 75; Length: 96; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide; +90 ngRNA
75 gtcaaccagt atcccggtgc gttttagagc tagaaatagc
aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgc
96

SEQ ID NO: 76; Length: 4968; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide; GFP minicircle template (before cleavage)
76 tgatcccctg cgccatcaga tccttggcgg cgagaaagcc
atccagttta ctttgcaggg 60cttcccaacc ttaccagagg gcgccccagc tggcaattcc
ggttcgcttg ctgtccataa 120aaccgcccag tctagctatc gccatgtaag cccactgcaa
gctacctgct ttctctttgc 180gcttgcgttt tcccttgtcc agatagccca gtagctgaca
ttcatccggg gtcagcaccg 240tttctgcgga ctggctttct acgtgctcga ggggggccaa
acggtctcca gcttggctgt 300tttggcggat gagagaagat tttcagcctg atacagatta
aatcagaacg cagaagcggt 360ctgataaaac agaatttgcc tggcggcagt agcgcggtgg
tcccacctga ccccatgccg 420aactcagaag tgaaacgccg tagcgccgat ggtagtgtgg
ggtctcccca tgcgagagta 480gggaactgcc aggcatcaaa taaaacgaaa ggctcagtcg
aaagactggg cctttcgttt 540tatctgttgt ttgtcggtga acgctctcct gagtaggaca
aatccgccgg gagcggattt 600gaacgttgcg aagcaacggc ccggagggtg gcgggcagga
cgcccgccat aaactgccag 660gcatcaaatt aagcagaagg ccatcctgac ggatggcctt
tttgcgtttc tacaaactct 720tttgtttatt tttctaaata cattcaaata tgtatccgct
catgaccaaa atcccttaac 780gtgagttttc gttccactga gcgtcagacc ccgtagaaaa
gatcaaagga tcttcttgag 840atcctttttt tctgcgcgta atctgctgct tgcaaacaaa
aaaaccaccg ctaccagcgg 900tggtttgttt gccggatcaa gagctaccaa ctctttttcc
gaaggtaact ggcttcagca 960gagcgcagat accaaatact gtccttctag tgtagccgta
gttaggccac cacttcaaga 1020actctgtagc accgcctaca tacctcgctc tgctaatcct
gttaccagtg gctgctgcca 1080gtggcgataa gtcgtgtctt accgggttgg actcaagacg
atagttaccg gataaggcgc 1140agcggtcggg ctgaacgggg ggttcgtgca cacagcccag
cttggagcga acgacctaca 1200ccgaactgag atacctacag cgtgagctat gagaaagcgc
cacgcttccc gaagggagaa 1260aggcggacag gtatccggta agcggcaggg tcggaacagg
agagcgcacg agggagcttc 1320cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt
tcgccacctc tgacttgagc 1380gtcgattttt gtgatgctcg tcaggggggc ggagcctatg
gaaaaacgcc agcaacgcgg 1440cctttttacg gttcctggcc ttttgctggc cttttgctca
catgttcttt cctgcgttat 1500cccctgattc tgtggataac cgtattaccg cctttgagtg
agctgatacc gctcgccgca 1560gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc
ggaagagcgc ctgatgcggt 1620attttctcct tacgcatctg tgcggtattt cacaccgcat
atggtgcact ctcagtacaa 1680tctgctctga tgccgcatag ttaagccagt atacactccg
ctatcgctac gtgactgggt 1740catggctgcg ccccgacacc cgccaacacc cgctgacgcg
ccctgacggg cttgtctgct 1800cccggcatcc gcttacagac aagctgtgac cgtctccggg
agctgcatgt gtcagaggtt 1860ttcaccgtca tcaccgaaac gcgcgaggca gcagatcaat
tcgcgcgcga aggcgaagcg 1920gcatgcataa tgtgcctgtc aaatggacga agcagggatt
ctgcaaaccc tatgctactc 1980cgtcaagccg tcaattgtct gattcgttac caattatgac
aacttgacgg ctacatcatt 2040cactttttct tcacaaccgg cacggaactc gctcgggctg
gccccggtgc attttttaaa 2100tacccgcgag aaatagagtt gatcgtcaaa accaacattg
cgaccgacgg tggcgatagg 2160catccgggtg gtgctcaaaa gcagcttcgc ctggctgata
cgttggtcct cgcgccagct 2220taagacgcta atccctaact gctggcggaa aagatgtgac
agacgcgacg gcgacaagca 2280aacatgctgt gcgacgctgg cgatacatta ccctgttatc
cctagatgac attaccctgt 2340tatcccagat gacattaccc tgttatccct agatgacatt
accctgttat ccctagatga 2400catttaccct gttatcccta gatgacatta ccctgttatc
ccagatgaca ttaccctgtt 2460atccctagat acattaccct gttatcccag atgacatacc
ctgttatccc tagatgacat 2520taccctgtta tcccagatga cattaccctg ttatccctag
atacattacc ctgttatccc 2580agatgacata ccctgttatc cctagatgac attaccctgt
tatcccagat gacattaccc 2640tgttatccct agatacatta ccctgttatc ccagatgaca
taccctgtta tccctagatg 2700acattaccct gttatcccag atgacattac cctgttatcc
ctagatacat taccctgtta 2760tcccagatga cataccctgt tatccctaga tgacattacc
ctgttatccc agatgacatt 2820accctgttat ccctagatac attaccctgt tatcccagat
gacataccct gttatcccta 2880gatgacatta ccctgttatc ccagatgaca ttaccctgtt
atccctagat acattaccct 2940gttatcccag atgacatacc ctgttatccc tagatgacat
taccctgtta tcccagataa 3000actcaatgat gatgatgatg atggtcgaga ctcagcggcc
gcggtgccag ggcgtgccct 3060tgggctcccc gggcgcgact ataagctgcg agcaacttca
cttgggtatg ccggcggtag 3120cgcttaccgt tcgtataatg tatgctatac gaagttatcc
gaagccgcta gcggtggttt 3180gtctggtcaa ccaccgcggt ctcagtggtg tacggtacaa
acccagctac cggtcgccac 3240catgcccgcc atgaagatcg agtgccgcat caccggcacc
ctgaacggcg tggagttcga 3300gctggtgggc ggcggagagg gcacccccga gcagggccgc
atgaccaaca agatgaagag 3360caccaaaggc gccctgacct tcagccccta cctgctgagc
cacgtgatgg gctacggctt 3420ctaccacttc ggcacctacc ccagcggcta cgagaacccc
ttcctgcacg ccatcaacaa 3480cggcggctac accaacaccc gcatcgagaa gtacgaggac
ggcggcgtgc tgcacgtgag 3540cttcagctac cgctacgagg ccggccgcgt gatcggcgac
ttcaaggtgg tgggcaccgg 3600cttccccgag gacagcgtga tcttcaccga caagatcatc
cgcagcaacg ccaccgtgga 3660gcacctgcac cccatgggcg ataacgtgct ggtgggcagc
ttcgcccgca ccttcagcct 3720gcgcgacggc ggctactaca gcttcgtggt ggacagccac
atgcacttca agagcgccat 3780ccaccccagc atcctgcaga acgggggccc catgttcgcc
ttccgccgcg tggaggagct 3840gcacagcaac accgagctgg gcatcgtgga gtaccagcac
gccttcaaga cccccatcgc 3900cttcgccaga tctcgagctc gatgagtttg gacaaaccac
aactagaatg cagtgaaaaa 3960aatgctttat ttgtgaaatt tgtgatgcta ttgctttatt
tgtgggcccg ccccaactgg 4020ggtaaccttt gagttctctc agttgggggt aatcagcatc
atgatgtggt accacatcat 4080gatgctgatt ataagaatgc ggccgccaca ctctagtgga
tctcgagtta ataattcaga 4140agaactcgtc aagaaggcga tagaaggcga tgcgctgcga
atcgggagcg gcgataccgt 4200aaagcacgag gaagcggtca gcccattcgc cgccaagctc
ttcagcaata tcacgggtag 4260ccaacgctat gtcctgatag cggtccgcca cacccagccg
gccacagtcg atgaatccag 4320aaaagcggcc attttccacc atgatattcg gcaagcaggc
atcgccatgg gtcacgacga 4380gatcctcgcc gtcgggcatg ctcgccttga gcctggcgaa
cagttcggct ggcgcgagcc 4440cctgatgctc ttcgtccaga tcatcctgat cgacaagacc
ggcttccatc cgagtacgtg 4500ctcgctcgat gcgatgtttc gcttggtggt cgaatgggca
ggtagccgga tcaagcgtat 4560gcagccgccg cattgcatca gccatgatgg atactttctc
ggcaggagca aggtgtagat 4620gacatggaga tcctgccccg gcacttcgcc caatagcagc
cagtcccttc ccgcttcagt 4680gacaacgtcg agcacagctg cgcaaggaac gcccgtcgtg
gccagccacg atagccgcgc 4740tgcctcgtct tgcagttcat tcagggcacc ggacaggtcg
gtcttgacaa aaagaaccgg 4800gcgcccctgc gctgacagcc ggaacacggc ggcatcagag
cagccgattg tctgttgtgc 4860ccagtcatag ccgaatagcc tctccaccca agcggccgga
gaacctgcgt gcaatccatc 4920ttgttcaatc atgcgaaacg atcctcatcc tgtctcttga
tcagagct 4968

SEQ ID NO: 77; Length: 4855; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide; GLuc minicircle template
77 tgatcccctg cgccatcaga tccttggcgg cgagaaagcc atccagttta ctttgcaggg
60cttcccaacc ttaccagagg gcgccccagc tggcaattcc ggttcgcttg ctgtccataa
120aaccgcccag tctagctatc gccatgtaag cccactgcaa gctacctgct ttctctttgc
180gcttgcgttt tcccttgtcc agatagccca gtagctgaca ttcatccggg gtcagcaccg
240tttctgcgga ctggctttct acgtgctcga ggggggccaa acggtctcca gcttggctgt
300tttggcggat gagagaagat tttcagcctg atacagatta aatcagaacg cagaagcggt
360ctgataaaac agaatttgcc tggcggcagt agcgcggtgg tcccacctga ccccatgccg
420aactcagaag tgaaacgccg tagcgccgat ggtagtgtgg ggtctcccca tgcgagagta
480gggaactgcc aggcatcaaa taaaacgaaa ggctcagtcg aaagactggg cctttcgttt
540tatctgttgt ttgtcggtga acgctctcct gagtaggaca aatccgccgg gagcggattt
600gaacgttgcg aagcaacggc ccggagggtg gcgggcagga cgcccgccat aaactgccag
660gcatcaaatt aagcagaagg ccatcctgac ggatggcctt tttgcgtttc tacaaactct
720tttgtttatt tttctaaata cattcaaata tgtatccgct catgaccaaa atcccttaac
780gtgagttttc gttccactga gcgtcagacc ccgtagaaaa gatcaaagga tcttcttgag
840atcctttttt tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg ctaccagcgg
900tggtttgttt gccggatcaa gagctaccaa ctctttttcc gaaggtaact ggcttcagca
960gagcgcagat accaaatact gtccttctag tgtagccgta gttaggccac cacttcaaga
1020actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg gctgctgcca
1080gtggcgataa gtcgtgtctt accgggttgg actcaagacg atagttaccg gataaggcgc
1140agcggtcggg ctgaacgggg ggttcgtgca cacagcccag cttggagcga acgacctaca
1200ccgaactgag atacctacag cgtgagctat gagaaagcgc cacgcttccc gaagggagaa
1260aggcggacag gtatccggta agcggcaggg tcggaacagg agagcgcacg agggagcttc
1320cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc tgacttgagc
1380gtcgattttt gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc agcaacgcgg
1440cctttttacg gttcctggcc ttttgctggc cttttgctca catgttcttt cctgcgttat
1500cccctgattc tgtggataac cgtattaccg cctttgagtg agctgatacc gctcgccgca
1560gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc ggaagagcgc ctgatgcggt
1620attttctcct tacgcatctg tgcggtattt cacaccgcat atggtgcact ctcagtacaa
1680tctgctctga tgccgcatag ttaagccagt atacactccg ctatcgctac gtgactgggt
1740catggctgcg ccccgacacc cgccaacacc cgctgacgcg ccctgacggg cttgtctgct
1800cccggcatcc gcttacagac aagctgtgac cgtctccggg agctgcatgt gtcagaggtt
1860ttcaccgtca tcaccgaaac gcgcgaggca gcagatcaat tcgcgcgcga aggcgaagcg
1920gcatgcataa tgtgcctgtc aaatggacga agcagggatt ctgcaaaccc tatgctactc
1980cgtcaagccg tcaattgtct gattcgttac caattatgac aacttgacgg ctacatcatt
2040cactttttct tcacaaccgg cacggaactc gctcgggctg gccccggtgc attttttaaa
2100tacccgcgag aaatagagtt gatcgtcaaa accaacattg cgaccgacgg tggcgatagg
2160catccgggtg gtgctcaaaa gcagcttcgc ctggctgata cgttggtcct cgcgccagct
2220taagacgcta atccctaact gctggcggaa aagatgtgac agacgcgacg gcgacaagca
2280aacatgctgt gcgacgctgg cgatacatta ccctgttatc cctagatgac attaccctgt
2340tatcccagat gacattaccc tgttatccct agatgacatt accctgttat ccctagatga
2400catttaccct gttatcccta gatgacatta ccctgttatc ccagatgaca ttaccctgtt
2460atccctagat acattaccct gttatcccag atgacatacc ctgttatccc tagatgacat
2520taccctgtta tcccagatga cattaccctg ttatccctag atacattacc ctgttatccc
2580agatgacata ccctgttatc cctagatgac attaccctgt tatcccagat gacattaccc
2640tgttatccct agatacatta ccctgttatc ccagatgaca taccctgtta tccctagatg
2700acattaccct gttatcccag atgacattac cctgttatcc ctagatacat taccctgtta
2760tcccagatga cataccctgt tatccctaga tgacattacc ctgttatccc agatgacatt
2820accctgttat ccctagatac attaccctgt tatcccagat gacataccct gttatcccta
2880gatgacatta ccctgttatc ccagatgaca ttaccctgtt atccctagat acattaccct
2940gttatcccag atgacatacc ctgttatccc tagatgacat taccctgtta tcccagataa
3000actcaatgat gatgatgatg atggtcgaga ctcagcggcc gcggtgccag ggcgtgccct
3060tgggctcccc gggcgcgact ataagctgcg agcaacttca cttgggtatg ccggcggtag
3120cgcttaccgt tcgtataatg tatgctatac gaagttatcc gaagccgcta gcggtggttt
3180gtctggtcaa ccaccgcggt ctcagtggtg tacggtacaa acccactacc ggtcgccacc
3240atgggagtca aagttctgtt tgccctgatc tgcatcgctg tggccgaggc caagcccacc
3300gagaacaacg aagacttcaa catcgtggcc gtggccagca acttcgcgac cacggatctc
3360gatgctgacc gcgggaagtt gcccggcaag aagctgccgc tggaggtgct caaagagatg
3420gaagccaatg cccggaaagc tggctgcacc aggggctgtc tgatctgcct gtcccacatc
3480aagtgcacgc ccaagatgaa gaagttcatc ccaggacgct gccacaccta cgaaggcgac
3540aaagagtccg cacagggcgg cataggcgag gcgatcgtcg acattcctga gattcctggg
3600ttcaaggact tggagcccat ggagcagttc atcgcacagg tcgatctgtg tgtggactgc
3660acaactggct gcctcaaagg gcttgccaac gtgcagtgtt ctgacctgct caagaagtgg
3720ctgccgcaac gctgtgcgac ctttgccagc aagatccagg gccaggtgga caagatcaag
3780ggggccggtg gtgactaagc ggagctcgat gagtttggac aaaccacaac tagaatgcag
3840tgaaaaaaat gctttatttg tgaaatttgt gatgctattg ctttatttgt gggcccgccc
3900caactggggt aacctttgag ttctctcagt tgggggtaat cagcatcatg atgtggtacc
3960acatcatgat gctgattata agaatgcggc cgccacactc tagtggatct cgagttaata
4020attcagaaga actcgtcaag aaggcgatag aaggcgatgc gctgcgaatc gggagcggcg
4080ataccgtaaa gcacgaggaa gcggtcagcc cattcgccgc caagctcttc agcaatatca
4140cgggtagcca acgctatgtc ctgatagcgg tccgccacac ccagccggcc acagtcgatg
4200aatccagaaa agcggccatt ttccaccatg atattcggca agcaggcatc gccatgggtc
4260acgacgagat cctcgccgtc gggcatgctc gccttgagcc tggcgaacag ttcggctggc
4320gcgagcccct gatgctcttc gtccagatca tcctgatcga caagaccggc ttccatccga
4380gtacgtgctc gctcgatgcg atgtttcgct tggtggtcga atgggcaggt agccggatca
4440agcgtatgca gccgccgcat tgcatcagcc atgatggata ctttctcggc aggagcaagg
4500tgtagatgac atggagatcc tgccccggca cttcgcccaa tagcagccag tcccttcccg
4560cttcagtgac aacgtcgagc acagctgcgc aaggaacgcc cgtcgtggcc agccacgata
4620gccgcgctgc ctcgtcttgc agttcattca gggcaccgga caggtcggtc ttgacaaaaa
4680gaaccgggcg cccctgcgct gacagccgga acacggcggc atcagagcag ccgattgtct
4740gttgtgccca gtcatagccg aatagcctct ccacccaagc ggccggagaa cctgcgtgca
4800atccatcttg ttcaatcatg cgaaacgatc ctcatcctgt ctcttgatca gagct
4855

SEQ ID NO: 78; Length: 38; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide; pseudo-attP
78 ccccaactgg ggtaaccttt
gagttctctc agttgggg 38

SEQ ID NO: 79; Length: 194; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide; Albumin-pegRNA-SERPIN
79 gactgaaact tcacagaata gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcttgg gatagttatg aattcaatct 120tcaaccctat ccggatgatc ctgacgacgg
agaccgccgt cgtcgacaag ccggcctctg 180tgaagtttca gtca
194

SEQ ID NO: 80; Length: 189; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide; Albumin-pegRNA-CPS1
80 gactgaaact tcacagaata gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcttgg gatagttatg aattcaatct 120tcaaccctat ccggatgatc ctgacgacgg
agaccgccgt cgtcgacaag ccggcctctg 180tgaagtttc
189

SEQ ID NO: 81; Length: 177; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide; 34bp lox71 pegRNA
81 ggcccagact gagcacgtga gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgctgga ggaagcaggg cttcctttcc 120tctgccatca taccgttcgt atagcataca
ttatacgaag ttatcgtgct cagtctg 177

SEQ ID NO: 82; Length: 177; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide; 34bp lox66 pegRNA
82 ggcccagact gagcacgtga gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgctgga ggaagcaggg cttcctttcc 120tctgccatca ataacttcgt atagcataca
ttatacgaac ggtacgtgct cagtctg 177

SEQ ID NO: 83; Length: 20; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide; gRNA2
83 ggcccagact gagcacgtga
20

SEQ ID NO: 84; Length: 184; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
84 gctattctcg cagctcacca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcgacg agcgcggcga tatcatcatc 120catggccgga tgatcctgac gacggagacc
gccgtcgtcg acaagccggc ctgagctgcg 180agaa
184

SEQ ID NO: 85; Length: 179; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
85 gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgagt cggtgcgacg agcgcggcga
120tatcatcatc catggcacaa ttaacatctc aatcaaggta aatgcttgag ctgcgagaa
179

SEQ ID NO: 86; Length: 179; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
86 gctattctcg cagctcacca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcgagt cggtgcgacg agcgcggcga 120tatcatcatc catggagcat ttaccttgat
tgagatgtta attgtgtgag ctgcgagaa 179

SEQ ID NO: 87; Length: 182; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
87 gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgagt cggtgcgacg agcgcggcga
120tatcatcatc catggcaggt ttttgacgaa agtgatccag atgatccagt gagctgcgag
180aa
182

SEQ ID NO: 88; Length: 182; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
88 gctattctcg cagctcacca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcgagt cggtgcgacg agcgcggcga 120tatcatcatc catggctgga tcatctggat
cactttcgtc aaaaacctgt gagctgcgag 180aa
182

SEQ ID NO: 89; Length: 96; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
89 gaagccggcc ttgcacatgc gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgc
96

SEQ ID NO: 90; Length: 164; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
90 gaagccggcc ttgcacatgc gttttagagc tagaaatagc
aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcatat
catcatccat ggtaccgttc 120gtatagcata cattatacga agttattgag ctgcgagaat
agcc 164

SEQ ID NO: 91; Length: 172; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
91 gaagccggcc ttgcacatgc
gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt
ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggtaccg ttcgtatagc
atacattata cgaagttatt gagctgcgag aa 172

SEQ ID NO: 92; Length: 189; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
92 gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgctcga cgacgagcgc ggcgatatca
120tcatccatgg ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcctgag
180ctgcgagaa
189

SEQ ID NO: 93; Length: 181; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
93 gctattctcg cagctcacca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcgagc gcggcgatat catcatccat 120ggccggatga tcctgacgac ggagaccgcc
gtcgtcgaca agccggcctg agctgcgaga 180a
181

SEQ ID NO: 94; Length: 178; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
94 gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgccgcg gcgatatcat catccatggc
120cggatgatcc tgacgacgga gaccgccgtc gtcgacaagc cggcctgagc tgcgagaa
178

SEQ ID NO: 95; Length: 175; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
95 gctattctcg cagctcacca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcggcg atatcatcat ccatggccgg 120atgatcctga cgacggagac cgccgtcgtc
gacaagccgg cctgagctgc gagaa 175

SEQ ID NO: 96; Length: 171; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
96 gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcatat catcatccat ggccggatga
120tcctgacgac ggagaccgcc gtcgtcgaca agccggcctg agctgcgaga a
171

SEQ ID NO: 97; Length: 194; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
97 gctattctcg cagctcacca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgctcga cgacgagcgc ggcgatatca 120tcatccatgg ccggatgatc ctgacgacgg
agaccgccgt cgtcgacaag ccggcctgag 180ctgcgagaat agcc
194

SEQ ID NO: 98; Length: 189; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
98 gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc
120catggccgga tgatcctgac gacggagacc gccgtcgtcg acaagccggc ctgagctgcg
180agaatagcc
189

SEQ ID NO: 99; Length: 176; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
99 gctattctcg cagctcacca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcatat catcatccat ggccggatga 120tcctgacgac ggagaccgcc gtcgtcgaca
agccggcctg agctgcgaga atagcc 176

SEQ ID NO: 100; Length: 194; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
100 gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcctgc ccatccgcgg cggcacgggg
120gtcgcagtcg ccatgccgga tgatcctgac gacggagacc gccgtcgtcg acaagccggc
180ccgggcggcg gaga
194

SEQ ID NO: 101; Length: 189; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
101 gctgtctccg ccgcccgcca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgccatc cgcggcggca cgggggtcgc 120agtcgccatg ccggatgatc ctgacgacgg
agaccgccgt cgtcgacaag ccggcccggg 180cggcggaga
189

SEQ ID NO: 102; Length: 184; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
102 gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgcgg cggcacgggg gtcgcagtcg
120ccatgccgga tgatcctgac gacggagacc gccgtcgtcg acaagccggc ccgggcggcg
180gaga
184

SEQ ID NO: 103; Length: 179; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
103 gctgtctccg ccgcccgcca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcggca cgggggtcgc agtcgccatg 120ccggatgatc ctgacgacgg agaccgccgt
cgtcgacaag ccggcccggg cggcggaga 179

SEQ ID NO: 104; Length: 174; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
104 gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgggg gtcgcagtcg ccatgccgga
120tgatcctgac gacggagacc gccgtcgtcg acaagccggc ccgggcggcg gaga
174

SEQ ID NO: 105; Length: 199; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
105 gctgtctccg ccgcccgcca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcctgc ccatccgcgg cggcacgggg 120gtcgcagtcg ccatgccgga tgatcctgac
gacggagacc gccgtcgtcg acaagccggc 180ccgggcggcg gagacagcg
199

SEQ ID NO: 106; Length: 194; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
106 gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgccatc cgcggcggca cgggggtcgc
120agtcgccatg ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcccggg
180cggcggagac agcg
194

SEQ ID NO: 107; Length: 189; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
107 gctgtctccg ccgcccgcca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcgcgg cggcacgggg gtcgcagtcg 120ccatgccgga tgatcctgac gacggagacc
gccgtcgtcg acaagccggc ccgggcggcg 180gagacagcg
189

SEQ ID NO: 108; Length: 184; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
108 gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcggca cgggggtcgc agtcgccatg
120ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcccggg cggcggagac
180agcg
184

SEQ ID NO: 109; Length: 179; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
109 gctgtctccg ccgcccgcca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcgggg gtcgcagtcg ccatgccgga 120tgatcctgac gacggagacc gccgtcgtcg
acaagccggc ccgggcggcg gagacagcg 179

SEQ ID NO: 110; Length: 96; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
110 gcgtggtggg gccgccagcg gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgc
96

SEQ ID NO: 111; Length: 180; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
111 gctattctcg cagctcacca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcgacg agcgcggcga tatcatcatc 120catggggatg atcctgacga cggagaccgc
cgtcgtcgac aagccggtga gctgcgagaa 180

SEQ ID NO: 112; Length: 178; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
112 gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc
120catgggatga tcctgacgac ggagaccgcc gtcgtcgaca agccgtgagc tgcgagaa
178

SEQ ID NO: 113; Length: 176; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
113 gctattctcg cagctcacca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcgacg agcgcggcga tatcatcatc 120catggatgat cctgacgacg gagaccgccg
tcgtcgacaa gcctgagctg cgagaa 176

SEQ ID NO: 114; Length: 174; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
114 gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc
120catggtgatc ctgacgacgg agaccgccgt cgtcgacaag ctgagctgcg agaa
174

SEQ ID NO: 115; Length: 182; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
115 gctgtctccg ccgcccgcca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcgcgg cggcacgggg gtcgcagtcg 120ccatgcggat gatcctgacg acggagaccg
ccgtcgtcga caagccggcc gggcggcgga 180ga
182

SEQ ID NO: 116; Length: 180; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
116 gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgcgg cggcacgggg gtcgcagtcg
120ccatgggatg atcctgacga cggagaccgc cgtcgtcgac aagccggcgg gcggcggaga
180

SEQ ID NO: 117; Length: 178; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
117 gctgtctccg ccgcccgcca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcgcgg cggcacgggg gtcgcagtcg 120ccatggatga tcctgacgac ggagaccgcc
gtcgtcgaca agccgcgggc ggcggaga 178

SEQ ID NO: 118; Length: 176; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
118 gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgcgg cggcacgggg gtcgcagtcg
120ccatgatgat cctgacgacg gagaccgccg tcgtcgacaa gcccgggcgg cggaga
176

SEQ ID NO: 119; Length: 189; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
119 gcgtattgcc tggaggatgg gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcgaac cacgcggcga atgccggcgt 120ccgccccgga tgatcctgac gacggagacc
gccgtcgtcg acaagccggc ctcctccagg 180caatacgcg
189

SEQ ID NO: 120; Length: 184; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
120 gcgtattgcc tggaggatgg gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgaac cacgcggcga atgccggcgt
120ccgccccgga tgatcctgac gacggagacc gccgtcgtcg acaagccggc ctcctccagg
180caat
184

SEQ ID NO: 121; Length: 182; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
121 gcgtattgcc tggaggatgg gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcgaac cacgcggcga atgccggcgt 120ccgcccggat gatcctgacg acggagaccg
ccgtcgtcga caagccggct cctccaggca 180at
182

SEQ ID NO: 122; Length: 180; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
122 gcgtattgcc tggaggatgg gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgaac cacgcggcga atgccggcgt
120ccgccggatg atcctgacga cggagaccgc cgtcgtcgac aagccggtcc tccaggcaat
180

SEQ ID NO: 123; Length: 178; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
123 gcgtattgcc tggaggatgg gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgcgaac cacgcggcga atgccggcgt 120ccgccgatga tcctgacgac ggagaccgcc
gtcgtcgaca agccgtcctc caggcaat 178

SEQ ID NO: 124; Length: 176; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
124 gcgtattgcc tggaggatgg gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgaac cacgcggcga atgccggcgt
120ccgccatgat cctgacgacg gagaccgccg tcgtcgacaa gcctcctcca ggcaat
176

SEQ ID NO: 125; Length: 97; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
125 gagccgagca cgaggggata cgttttagag
ctagaaatag caagttaaaa taaggctagt 60ccgttatcaa cttgaaaaag tggcaccgag
tcggtgc 97

SEQ ID NO: 126; Length: 167; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
126 gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcggcg atatcatcat ccatggatga
120tcctgacgac ggagaccgcc gtcgtcgaca agcctgagct gcgagaa
167

SEQ ID NO: 127; Length: 162; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
127 gctattctcg cagctcacca gttttagagc
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt
cggtgctatc atcatccatg gatgatcctg 120acgacggaga ccgccgtcgt cgacaagcct
gagctgcgag aa 162128157DNAArtificial
SequenceDescription of Artificial Sequence Synthetic polynucleotide
128gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgctcat ccatggatga tcctgacgac
120ggagaccgcc gtcgtcgaca agcctgagct gcgagaa
157
SEQ ID NO: 129, length 163, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcggcg atatcatcat ccatggatga 120
tcctgacgac ggagaccgcc gtcgtcgaca agcctgagct gcg 163
SEQ ID NO: 130, length 158, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgctatc atcatccatg gatgatcctg 120
acgacggaga ccgccgtcgt cgacaagcct gagctgcg 158
SEQ ID NO: 131, length 153, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgctcat ccatggatga tcctgacgac 120
ggagaccgcc gtcgtcgaca agcctgagct gcg 153
SEQ ID NO: 132, length 167, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgccggg ggtcgcagtc gccatgatga 120
tcctgacgac ggagaccgcc gtcgtcgaca agcccgggcg gcggaga
167
SEQ ID NO: 133, length 162, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgtcg cagtcgccat gatgatcctg 120
acgacggaga ccgccgtcgt cgacaagccc gggcggcgga ga 162
SEQ ID NO: 134, length 157, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcagtc gccatgatga tcctgacgac 120
ggagaccgcc gtcgtcgaca agcccgggcg gcggaga 157
SEQ ID NO: 135, length 163, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgccggg ggtcgcagtc gccatgatga 120
tcctgacgac ggagaccgcc gtcgtcgaca agcccgggcg gcg 163
SEQ ID NO: 136, length 158, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgtcg cagtcgccat gatgatcctg 120
acgacggaga ccgccgtcgt cgacaagccc gggcggcg
158
SEQ ID NO: 137, length 153, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcagtc gccatgatga tcctgacgac 120
ggagaccgcc gtcgtcgaca agcccgggcg gcg 153
SEQ ID NO: 138, length 180, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gagaagcggc gtccggggct agttttagag ctagaaatag caagttaaaa taaggctagt 60
ccgttatcaa cttgaaaaag tggcaccgag tcggtgctct ttgtccagag tcacagccat 120
accggatgat cctgacgacg gagaccgccg tcgtcgacaa gccggccccc cggacgccgc 180
SEQ ID NO: 139, length 179, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gggcacgggg ccatgtacaa gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcggcg tcggcagccc gatcccgttg 120
ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcctaca tggccccgt 179
SEQ ID NO: 140, length 185, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gtgtcaggtg gggcggggct agttttagag ctagaaatag caagttaaaa taaggctagt 60
ccgttatcaa cttgaaaaag tggcaccgag tcggtgcgct ggctcctccc ctggcaccat 120
accggatgat cctgacgacg gagaccgccg tcgtcgacaa gccggccccc cgccccacct 180
gacac
185
SEQ ID NO: 141, length 184, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gagtgggtca gacgagcagg agttttagag ctagaaatag caagttaaaa taaggctagt 60
ccgttatcaa cttgaaaaag tggcaccgag tcggtgcgat ggagggctgc atgggggagg 120
agtcgccgga tgatcctgac gacggagacc gccgtcgtcg acaagccggc ctgctcgtct 180
gacc 184
SEQ ID NO: 142, length 97, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gcagccaccc gctctcggcc cgttttagag ctagaaatag caagttaaaa taaggctagt 60
ccgttatcaa cttgaaaaag tggcaccgag tcggtgc 97
SEQ ID NO: 143, length 97, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtgtagtcag gccgctcacc cgttttagag ctagaaatag caagttaaaa taaggctagt 60
ccgttatcaa cttgaaaaag tggcaccgag tcggtgc 97
SEQ ID NO: 144, length 97, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gctgacaagt ctacggaacc tgttttagag ctagaaatag caagttaaaa taaggctagt 60
ccgttatcaa cttgaaaaag tggcaccgag tcggtgc 97
SEQ ID NO: 145, length 96, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gctcctccag cgccttgacc gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgc
96
SEQ ID NO: 146, length 20, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gctattctcg cagctcacca 20
SEQ ID NO: 147, length 20, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
agaagcggcg tccggggcta 20
SEQ ID NO: 148, length 20, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gggcacgggg ccatgtacaa 20
SEQ ID NO: 149, length 20, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gcgtattgcc tggaggatgg 20
SEQ ID NO: 150, length 20, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgtcaggtgg ggcggggcta 20
SEQ ID NO: 151, length 20, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
agtgggtcag acgagcagga 20
SEQ ID NO: 152, length 20, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gctgtctccg ccgcccgcca
20
SEQ ID NO: 153, length 96, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgc 96
SEQ ID NO: 154, length 184, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
modified_base (148)..(149): CG, GC, AT, TA, GG, TT, GA, AG, CC, TC, CT, AA, TG, GT, CA, or AC
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catggccgga tgatcctgac gacggagnnc gccgtcgtcg acaagccggc ctgagctgcg 180
agaa 184
SEQ ID NO: 155, length 183, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catgccggat gatcctgacg acggagaccg ccgtcgtcga caagccggcc tgagctgcga 180
gaa
183
SEQ ID NO: 156, length 183, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catgccggat gatcctgacg acggagagcg ccgtcgtcga caagccggcc tgagctgcga 180
gaa 183
SEQ ID NO: 157, length 189, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gcgtattgcc tggaggatgg gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgaac cacgcggcga atgccggcgt 120
ccgccccgga tgatcctgac gacggagtcc gccgtcgtcg acaagccggc ctcctccagg 180
caatacgcg 189
SEQ ID NO: 158, length 189, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgcgg cggcacgggg gtcgcagtcg 120
ccatgccgga tgatcctgac gacggagctc gccgtcgtcg acaagccggc ccgggcggcg 180
gagacagcg
189
SEQ ID NO: 159, length 20, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtcacctcca atgactaggg 20
SEQ ID NO: 160, length 20, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gggcaaccac aaacccacga 20
SEQ ID NO: 161, length 194, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catggctatg ccggatgatc ctgacgacgg agtccgccgt cgtcgacaag ccggccctag 180
ctgagctgcg agaa 194
SEQ ID NO: 162, length 189, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catggtgccg gatgatcctg acgacggagt ccgccgtcgt cgacaagccg gccctatgag 180
ctgcgagaa 189
SEQ ID NO: 163, length 184, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catggccgga tgatcctgac gacggagtcc gccgtcgtcg acaagccggc ctgagctgcg 180
agaa
184
SEQ ID NO: 164, length 179, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catggggatg atcctgacga cggagtccgc cgtcgtcgac aagccgtgag ctgcgagaa 179
SEQ ID NO: 165, length 174, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catggtgatc ctgacgacgg agtccgccgt cgtcgacaag ctgagctgcg agaa 174
SEQ ID NO: 166, length 169, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catggatcct gacgacggag tccgccgtcg tcgacatgag ctgcgagaa 169
SEQ ID NO: 167, length 164, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catggcctga cgacggagtc cgccgtcgtc gtgagctgcg agaa
164
SEQ ID NO: 168, length 159, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catggtgacg acggagtccg ccgtcgtgag ctgcgagaa 159
SEQ ID NO: 169, length 154, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catggacgac ggagtccgcc gtgagctgcg agaa 154
SEQ ID NO: 170, length 149, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catgggacgg agtccgtgag ctgcgagaa 149
SEQ ID NO: 171, length 144, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catggcggag ttgagctgcg agaa
144
SEQ ID NO: 172, length 182, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gaagccggcc ttgcacatgc gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgctcga cgacgagcgc ggcgatatca 120
tcatccatgg taccgttcgt atagcataca ttatacgaag ttattgagct gcgagaatag 180
cc 182
SEQ ID NO: 173, length 177, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gaagccggcc ttgcacatgc gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120
catggtaccg ttcgtatagc atacattata cgaagttatt gagctgcgag aatagcc 177
SEQ ID NO: 174, length 177, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gaagccggcc ttgcacatgc gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgctcga cgacgagcgc ggcgatatca 120
tcatccatgg taccgttcgt atagcataca ttatacgaag ttattgagct gcgagaa 177
SEQ ID NO: 175, length 159, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
gaagccggcc ttgcacatgc gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgcatat catcatccat ggtaccgttc 120
gtatagcata cattatacga agttattgag ctgcgagaa
159
SEQ ID NO: 176, length 96, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccccacgatg gaggggaaga gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgc 96
SEQ ID NO: 177, length 96, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccttctcctg gagccgcgac gttttagagc tagaaatagc aagttaaaat aaggctagtc 60
cgttatcaac ttgaaaaagt ggcaccgagt cggtgc
96
SEQ ID NO: 178, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcggtctc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 179, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgagaccgc ggtggttgac cagacaaacc ac 52
SEQ ID NO: 180, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcgcgctc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 181, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgagcgcgc ggtggttgac cagacaaacc ac 52
SEQ ID NO: 182, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcggcctc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 183, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgaggccgc ggtggttgac cagacaaacc ac 52
SEQ ID NO: 184, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcgatctc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 185, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgagatcgc ggtggttgac cagacaaacc ac 52
SEQ ID NO: 186, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcgtactc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 187, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgagtacgc ggtggttgac cagacaaacc ac 52
SEQ ID NO: 188, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcgggctc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 189, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgagcccgc ggtggttgac cagacaaacc ac 52
SEQ ID NO: 190, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcgttctc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 191, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgagaacgc ggtggttgac cagacaaacc ac 52
SEQ ID NO: 192, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcggactc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 193, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgagtccgc ggtggttgac cagacaaacc ac
52
SEQ ID NO: 194, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcgagctc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 195, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgagctcgc ggtggttgac cagacaaacc ac 52
SEQ ID NO: 196, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcgccctc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 197, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgagggcgc ggtggttgac cagacaaacc ac 52
SEQ ID NO: 198, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcgtcctc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 199, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgaggacgc ggtggttgac cagacaaacc ac 52
SEQ ID NO: 200, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcgctctc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 201, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgagagcgc ggtggttgac cagacaaacc ac 52
SEQ ID NO: 202, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcgaactc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 203, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgagttcgc ggtggttgac cagacaaacc ac 52
SEQ ID NO: 204, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcgcactc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 205, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgagtgcgc ggtggttgac cagacaaacc ac 52
SEQ ID NO: 206, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcgacctc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 207, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgaggtcgc ggtggttgac cagacaaacc ac 52
SEQ ID NO: 208, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gtggtttgtc tggtcaacca ccgcgtgctc agtggtgtac ggtacaaacc ca 52
SEQ ID NO: 209, length 52, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tgggtttgta ccgtacacca ctgagcacgc ggtggttgac cagacaaacc ac
52
SEQ ID NO: 210, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cggtctccgt cgtcaggatc atccgg 46
SEQ ID NO: 211, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcc 46
SEQ ID NO: 212, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cgaactccgt cgtcaggatc atccgg 46
SEQ ID NO: 213, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg agttcgccgt cgtcgacaag ccggcc 46
SEQ ID NO: 214, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cggactccgt cgtcaggatc atccgg 46
SEQ ID NO: 215, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg agtccgccgt cgtcgacaag ccggcc 46
SEQ ID NO: 216, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cgcactccgt cgtcaggatc atccgg 46
SEQ ID NO: 217, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg agtgcgccgt cgtcgacaag ccggcc 46
SEQ ID NO: 218, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cgtactccgt cgtcaggatc atccgg 46
SEQ ID NO: 219, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg agtacgccgt cgtcgacaag ccggcc 46
SEQ ID NO: 220, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cgagctccgt cgtcaggatc atccgg 46
SEQ ID NO: 221, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg agctcgccgt cgtcgacaag ccggcc 46
SEQ ID NO: 222, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cgggctccgt cgtcaggatc atccgg 46
SEQ ID NO: 223, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg agcccgccgt cgtcgacaag ccggcc 46
SEQ ID NO: 224, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cgcgctccgt cgtcaggatc atccgg 46
SEQ ID NO: 225, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg agcgcgccgt cgtcgacaag ccggcc
46
SEQ ID NO: 226, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cgtgctccgt cgtcaggatc atccgg 46
SEQ ID NO: 227, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg agcacgccgt cgtcgacaag ccggcc 46
SEQ ID NO: 228, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cgacctccgt cgtcaggatc atccgg 46
SEQ ID NO: 229, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg aggtcgccgt cgtcgacaag ccggcc 46
SEQ ID NO: 230, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cggcctccgt cgtcaggatc atccgg 46
SEQ ID NO: 231, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg aggccgccgt cgtcgacaag ccggcc 46
SEQ ID NO: 232, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cgccctccgt cgtcaggatc atccgg 46
SEQ ID NO: 233, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg agggcgccgt cgtcgacaag ccggcc 46
SEQ ID NO: 234, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cgtcctccgt cgtcaggatc atccgg 46
SEQ ID NO: 235, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg aggacgccgt cgtcgacaag ccggcc 46
SEQ ID NO: 236, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cgatctccgt cgtcaggatc atccgg 46
SEQ ID NO: 237, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg agatcgccgt cgtcgacaag ccggcc 46
SEQ ID NO: 238, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cgctctccgt cgtcaggatc atccgg 46
SEQ ID NO: 239, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg agagcgccgt cgtcgacaag ccggcc 46
SEQ ID NO: 240, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggccggcttg tcgacgacgg cgttctccgt cgtcaggatc atccgg 46
SEQ ID NO: 241, length 46, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ccggatgatc ctgacgacgg agaacgccgt cgtcgacaag ccggcc
46
SEQ ID NO: 242, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcggt ctccgtcgtc aggatcat 38
SEQ ID NO: 243, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggagac cgccgtcgtc gacaagcc 38
SEQ ID NO: 244, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcgaa ctccgtcgtc aggatcat 38
SEQ ID NO: 245, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggagtt cgccgtcgtc gacaagcc 38
SEQ ID NO: 246, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcgga ctccgtcgtc aggatcat 38
SEQ ID NO: 247, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggagtc cgccgtcgtc gacaagcc 38
SEQ ID NO: 248, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcgca ctccgtcgtc aggatcat 38
SEQ ID NO: 249, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggagtg cgccgtcgtc gacaagcc 38
SEQ ID NO: 250, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcgta ctccgtcgtc aggatcat 38
SEQ ID NO: 251, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggagta cgccgtcgtc gacaagcc 38
SEQ ID NO: 252, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcgag ctccgtcgtc aggatcat 38
SEQ ID NO: 253, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggagct cgccgtcgtc gacaagcc 38
SEQ ID NO: 254, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcggg ctccgtcgtc aggatcat 38
SEQ ID NO: 255, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggagcc cgccgtcgtc gacaagcc 38
SEQ ID NO: 256, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcgcg ctccgtcgtc aggatcat 38
SEQ ID NO: 257, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggagcg cgccgtcgtc gacaagcc
38
SEQ ID NO: 258, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcgtg ctccgtcgtc aggatcat 38
SEQ ID NO: 259, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggagca cgccgtcgtc gacaagcc 38
SEQ ID NO: 260, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcgac ctccgtcgtc aggatcat 38
SEQ ID NO: 261, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggaggt cgccgtcgtc gacaagcc 38
SEQ ID NO: 262, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcggc ctccgtcgtc aggatcat 38
SEQ ID NO: 263, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggaggc cgccgtcgtc gacaagcc 38
SEQ ID NO: 264, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcgcc ctccgtcgtc aggatcat 38
SEQ ID NO: 265, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggaggg cgccgtcgtc gacaagcc 38
SEQ ID NO: 266, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcgtc ctccgtcgtc aggatcat 38
SEQ ID NO: 267, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggagga cgccgtcgtc gacaagcc 38
SEQ ID NO: 268, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcgat ctccgtcgtc aggatcat 38
SEQ ID NO: 269, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggagat cgccgtcgtc gacaagcc 38
SEQ ID NO: 270, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcgct ctccgtcgtc aggatcat 38
SEQ ID NO: 271, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggagag cgccgtcgtc gacaagcc 38
SEQ ID NO: 272, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggcttgtcga cgacggcgtt ctccgtcgtc aggatcat 38
SEQ ID NO: 273, length 38, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
atgatcctga cgacggagaa cgccgtcgtc gacaagcc
38
SEQ ID NO: 274, length 34, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
taccgttcgt ataatgtatg ctatacgaag ttat 34
SEQ ID NO: 275, length 34, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ataacttcgt atagcataca ttatacgaac ggta 34
SEQ ID NO: 276, length 34, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ataacttcgt ataatgtatg ctatacgaac ggta 34
SEQ ID NO: 277, length 34, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
taccgttcgt atagcataca ttatacgaag ttat 34
SEQ ID NO: 278, length 27, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tttaccttga ttgagatgtt aattgtg 27
SEQ ID NO: 279, length 27, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
cacaattaac atctcaatca aggtaaa 27
SEQ ID NO: 280, length 50, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gcgagttttt atttcgttta tttcaattaa ggtaactaaa aaactccttt 50
SEQ ID NO: 281, length 50, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
aaaggagttt tttagttacc ttaattgaaa taaacgaaat aaaaactcgc 50
SEQ ID NO: 282, length 34, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ctggatcatc tggatcactt tcgtcaaaaa cctg 34
SEQ ID NO: 283, length 34, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
caggtttttg acgaaagtga tccagatgat ccag 34
SEQ ID NO: 284, length 57, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ttcgggtgct gggttgttgt ctctggacag tgatccatgg gaaactactc agcacca 57
SEQ ID NO: 285, length 57, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
tggtgctgag tagtttccca tggatcactg tccagagaca acaacccagc acccgaa 57
SEQ ID NO: 286, length 24, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
aaaagtgtgg gctgcaggat ctga 24
SEQ ID NO: 287, length 22, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggagctggca gctgtcaatg cc 22
SEQ ID NO: 288, length 21, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
agtcaatgcc gctctcgtgg a 21
SEQ ID NO: 289, length 21, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
cagcgggctc agctgatagc a 21
SEQ ID NO: 290, length 21, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
cggatggcta accaagcggc c
21
SEQ ID NO: 291, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
cccggcttcc tttgtcc 17
SEQ ID NO: 292, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 293, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
cccggcttcc tttgtcc 17
SEQ ID NO: 294, length 22, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
aaccacaact agaatgcagt ga 22
SEQ ID NO: 295, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
cccggcttcc tttgtcc 17
SEQ ID NO: 296, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 297, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
cccggcttcc tttgtcc 17
SEQ ID NO: 298, length 22, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
aaccacaact agaatgcagt ga 22
SEQ ID NO: 299, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
cccggcttcc tttgtcc 17
SEQ ID NO: 300, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 301, length 21, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
tccttatcac ggtcccgctc g 21
SEQ ID NO: 302, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 303, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
cgtcgacaac ggtagtg 17
SEQ ID NO: 304, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 305, length 18, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
tcgcgtgatt ctcggaac 18
SEQ ID NO: 306, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 307, length 20, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gggcggtaag tggttagttt 20
SEQ ID NO: 308, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca
17
SEQ ID NO: 309, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
aagaggcgga gccagta 17
SEQ ID NO: 310, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 311, length 19, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
ctcccttctc ccggtgccc 19
SEQ ID NO: 312, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 313, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
cccggcttcc tttgtcc 17
SEQ ID NO: 314, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 315, length 20, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gggcggtaag tggttagttt 20
SEQ ID NO: 316, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 317, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
cgtcgacaac ggtagtg 17
SEQ ID NO: 318, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 319, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
aagaggcgga gccagta 17
SEQ ID NO: 320, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 321, length 19, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
ctcccttctc ccggtgccc 19
SEQ ID NO: 322, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 323, length 21, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
tccttatcac ggtcccgctc g 21
SEQ ID NO: 324, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca
17
SEQ ID NO: 325, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
cccggcttcc tttgtcc 17
SEQ ID NO: 326, length 18, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
ggcctgccag caggagga 18
SEQ ID NO: 327, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
cccggcttcc tttgtcc 17
SEQ ID NO: 328, length 25, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
ggtgtgcagt cacattggta aagcc 25
SEQ ID NO: 329, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
cccggcttcc tttgtcc 17
SEQ ID NO: 330, length 22, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gatgggtcta gtccagctaa ag 22
SEQ ID NO: 331, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
cccggcttcc tttgtcc 17
SEQ ID NO: 332, length 18, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gagagacaag gctgcaca 18
SEQ ID NO: 333, length 26, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
ccaggtgaga gtcagggtag tgttca 26
SEQ ID NO: 334, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 335, length 23, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
agggaccttt gcctgtgtga gtc 23
SEQ ID NO: 336, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 337, length 21, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
tcagctctgt gctgaggcga a 21
SEQ ID NO: 338, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 339, length 32, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
aagccatctc ccagaatatc tgcttagaaa tg 32
SEQ ID NO: 340, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca 17
SEQ ID NO: 341, length 25, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gagaggagca acagtgagca tgatg 25
SEQ ID NO: 342, length 17, DNA, Artificial Sequence
Description of Artificial Sequence: Synthetic primer
gaactccacg ccgttca
1734332DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 343aagccatctc ccagaatatc
tgcttagaaa tg 3234417DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
344gaactccacg ccgttca
1734525DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 345gagaggagca acagtgagca tgatg
2534617DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 346gaactccacg ccgttca
1734717DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 347cccggcttcc tttgtcc
1734822DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
348ggctatgaac taatgacccc gt
2234917DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 349cccggcttcc tttgtcc
1735018DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 350ggcctgccag caggagga
1835117DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 351cccggcttcc tttgtcc
1735225DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
352ggtgtgcagt cacattggta aagcc
2535352DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 353acactctttc cctacacgac gctcttccga tctccgacct
cggctcacag cg 5235453DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 354acactctttc
cctacacgac gctcttccga tctaccgacc tcggctcaca gcg
5335554DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 355acactctttc cctacacgac gctcttccga tctgaccgac
ctcggctcac agcg 5435655DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 356acactctttc
cctacacgac gctcttccga tcttgaccga cctcggctca cagcg
5535756DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 357acactctttc cctacacgac gctcttccga tctctgaccg
acctcggctc acagcg 5635857DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 358acactctttc
cctacacgac gctcttccga tctactgacc gacctcggct cacagcg
5735958DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 359acactctttc cctacacgac gctcttccga tcttactgac
cgacctcggc tcacagcg 5836059DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 360acactctttc
cctacacgac gctcttccga tctgtactga ccgacctcgg ctcacagcg
5936151DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 361gtgactggag ttcagacgtg tgctcttccg atctccaccc
agccagctcc c 5136251DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 362acactctttc
cctacacgac gctcttccga tctccggtgg cgcattgcca c
5136352DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 363acactctttc cctacacgac gctcttccga tctaccggtg
gcgcattgcc ac 5236453DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 364acactctttc
cctacacgac gctcttccga tctgaccggt ggcgcattgc cac
5336554DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 365acactctttc cctacacgac gctcttccga tcttgaccgg
tggcgcattg ccac 5436655DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 366acactctttc
cctacacgac gctcttccga tctctgaccg gtggcgcatt gccac
5536756DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 367acactctttc cctacacgac gctcttccga tctactgacc
ggtggcgcat tgccac 5636857DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 368acactctttc
cctacacgac gctcttccga tcttactgac cggtggcgca ttgccac
5736958DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 369acactctttc cctacacgac gctcttccga tctgtactga
ccggtggcgc attgccac 5837054DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 370gtgactggag
ttcagacgtg tgctcttccg atctcagagt ccagcttggg ccca
5437120DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 371gatattttcc cagctcacca
2037220DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 372tctattctcc
cagctcccca
2037340DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 373agcggcttct gtctctgtga gtgagctggc ggtctccgtc
4037443DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 374gactagccca
cgctccggtt ctgagccgcg acggcggtct ccg
4337541DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 375cccagggtcc catgcgctcc ccggccctga cggcggtctc c
413762560PRTArtificial SequenceDescription of
Artificial Sequence Synthetic polypeptide 376Met Lys Arg Thr Ala Asp
Gly Ser Glu Phe Glu Ser Pro Lys Lys Lys1 5
10 15Arg Lys Val Asp Lys Lys Tyr Ser Ile Gly Leu Asp
Ile Gly Thr Asn 20 25 30Ser
Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys 35
40 45Lys Phe Lys Val Leu Gly Asn Thr Asp
Arg His Ser Ile Lys Lys Asn 50 55
60Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr65
70 75 80Arg Leu Lys Arg Thr
Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg 85
90 95Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu
Met Ala Lys Val Asp 100 105
110Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp
115 120 125Lys Lys His Glu Arg His Pro
Ile Phe Gly Asn Ile Val Asp Glu Val 130 135
140Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys
Leu145 150 155 160Val Asp
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu
165 170 175Ala His Met Ile Lys Phe Arg
Gly His Phe Leu Ile Glu Gly Asp Leu 180 185
190Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu
Val Gln 195 200 205Thr Tyr Asn Gln
Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val 210
215 220Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys
Ser Arg Arg Leu225 230 235
240Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe
245 250 255Gly Asn Leu Ile Ala
Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser 260
265 270Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu
Ser Lys Asp Thr 275 280 285Tyr Asp
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr 290
295 300Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser
Asp Ala Ile Leu Leu305 310 315
320Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser
325 330 335Ala Ser Met Ile
Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu 340
345 350Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu
Lys Tyr Lys Glu Ile 355 360 365Phe
Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly 370
375 380Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile
Lys Pro Ile Leu Glu Lys385 390 395
400Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp
Leu 405 410 415Leu Arg Lys
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile 420
425 430His Leu Gly Glu Leu His Ala Ile Leu Arg
Arg Gln Glu Asp Phe Tyr 435 440
445Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe 450
455 460Arg Ile Pro Tyr Tyr Val Gly Pro
Leu Ala Arg Gly Asn Ser Arg Phe465 470
475 480Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr
Pro Trp Asn Phe 485 490
495Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg
500 505 510Met Thr Asn Phe Asp Lys
Asn Leu Pro Asn Glu Lys Val Leu Pro Lys 515 520
525His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu
Thr Lys 530 535 540Val Lys Tyr Val Thr
Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly545 550
555 560Glu Gln Lys Lys Ala Ile Val Asp Leu Leu
Phe Lys Thr Asn Arg Lys 565 570
575Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys
580 585 590Phe Asp Ser Val Glu
Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser 595
600 605Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys
Asp Lys Asp Phe 610 615 620Leu Asp Asn
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr625
630 635 640Leu Thr Leu Phe Glu Asp Arg
Glu Met Ile Glu Glu Arg Leu Lys Thr 645
650 655Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln
Leu Lys Arg Arg 660 665 670Arg
Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile 675
680 685Arg Asp Lys Gln Ser Gly Lys Thr Ile
Leu Asp Phe Leu Lys Ser Asp 690 695
700Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu705
710 715 720Thr Phe Lys Glu
Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp 725
730 735Ser Leu His Glu His Ile Ala Asn Leu Ala
Gly Ser Pro Ala Ile Lys 740 745
750Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val
755 760 765Met Gly Arg His Lys Pro Glu
Asn Ile Val Ile Glu Met Ala Arg Glu 770 775
780Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met
Lys785 790 795 800Arg Ile
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu
805 810 815His Pro Val Glu Asn Thr Gln
Leu Gln Asn Glu Lys Leu Tyr Leu Tyr 820 825
830Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu
Asp Ile 835 840 845Asn Arg Leu Ser
Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe 850
855 860Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr
Arg Ser Asp Lys865 870 875
880Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys
885 890 895Met Lys Asn Tyr Trp
Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln 900
905 910Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly
Gly Leu Ser Glu 915 920 925Leu Asp
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln 930
935 940Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser
Arg Met Asn Thr Lys945 950 955
960Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu
965 970 975Lys Ser Lys Leu
Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys 980
985 990Val Arg Glu Ile Asn Asn Tyr His His Ala His
Asp Ala Tyr Leu Asn 995 1000
1005Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu
1010 1015 1020Ser Glu Phe Val Tyr Gly
Asp Tyr Lys Val Tyr Asp Val Arg Lys 1025 1030
1035Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala
Lys 1040 1045 1050Tyr Phe Phe Tyr Ser
Asn Ile Met Asn Phe Phe Lys Thr Glu Ile 1055 1060
1065Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile
Glu Thr 1070 1075 1080Asn Gly Glu Thr
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe 1085
1090 1095Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln
Val Asn Ile Val 1100 1105 1110Lys Lys
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile 1115
1120 1125Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile
Ala Arg Lys Lys Asp 1130 1135 1140Trp
Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala 1145
1150 1155Tyr Ser Val Leu Val Val Ala Lys Val
Glu Lys Gly Lys Ser Lys 1160 1165
1170Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu
1175 1180 1185Arg Ser Ser Phe Glu Lys
Asn Pro Ile Asp Phe Leu Glu Ala Lys 1190 1195
1200Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro
Lys 1205 1210 1215Tyr Ser Leu Phe Glu
Leu Glu Asn Gly Arg Lys Arg Met Leu Ala 1220 1225
1230Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu
Pro Ser 1235 1240 1245Lys Tyr Val Asn
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu 1250
1255 1260Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln
Leu Phe Val Glu 1265 1270 1275Gln His
Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu 1280
1285 1290Phe Ser Lys Arg Val Ile Leu Ala Asp Ala
Asn Leu Asp Lys Val 1295 1300 1305Leu
Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln 1310
1315 1320Ala Glu Asn Ile Ile His Leu Phe Thr
Leu Thr Asn Leu Gly Ala 1325 1330
1335Pro Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg
1340 1345 1350Tyr Thr Ser Thr Lys Glu
Val Leu Asp Ala Thr Leu Ile His Gln 1355 1360
1365Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln
Leu 1370 1375 1380Gly Gly Asp Ser Gly
Gly Ser Ser Gly Gly Ser Ser Gly Ser Glu 1385 1390
1395Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser
Gly Ser 1400 1405 1410Glu Thr Pro Gly
Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly 1415
1420 1425Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr
Pro Glu Ser Ser 1430 1435 1440Gly Gly
Ser Ser Gly Gly Ser Ser Thr Leu Asn Ile Glu Asp Glu 1445
1450 1455Tyr Arg Leu His Glu Thr Ser Lys Glu Pro
Asp Val Ser Leu Gly 1460 1465 1470Ser
Thr Trp Leu Ser Asp Phe Pro Gln Ala Trp Ala Glu Thr Gly 1475
1480 1485Gly Met Gly Leu Ala Val Arg Gln Ala
Pro Leu Ile Ile Pro Leu 1490 1495
1500Lys Ala Thr Ser Thr Pro Val Ser Ile Lys Gln Tyr Pro Met Ser
1505 1510 1515Gln Glu Ala Arg Leu Gly
Ile Lys Pro His Ile Gln Arg Leu Leu 1520 1525
1530Asp Gln Gly Ile Leu Val Pro Cys Gln Ser Pro Trp Asn Thr
Pro 1535 1540 1545Leu Leu Pro Val Lys
Lys Pro Gly Thr Asn Asp Tyr Arg Pro Val 1550 1555
1560Gln Asp Leu Arg Glu Val Asn Lys Arg Val Glu Asp Ile
His Pro 1565 1570 1575Thr Val Pro Asn
Pro Tyr Asn Leu Leu Ser Gly Pro Pro Pro Ser 1580
1585 1590His Gln Trp Tyr Thr Val Leu Asp Leu Lys Asp
Ala Phe Phe Cys 1595 1600 1605Leu Arg
Leu His Pro Thr Ser Gln Pro Leu Phe Ala Phe Glu Trp 1610
1615 1620Arg Asp Pro Glu Met Gly Ile Ser Gly Gln
Leu Thr Trp Thr Arg 1625 1630 1635Leu
Pro Gln Gly Phe Lys Asn Ser Pro Thr Leu Phe Asn Glu Ala 1640
1645 1650Leu His Arg Asp Leu Ala Asp Phe Arg
Ile Gln His Pro Asp Leu 1655 1660
1665Ile Leu Leu Gln Tyr Val Asp Asp Leu Leu Leu Ala Ala Thr Ser
1670 1675 1680Glu Leu Asp Cys Gln Gln
Gly Thr Arg Ala Leu Leu Gln Thr Leu 1685 1690
1695Gly Asn Leu Gly Tyr Arg Ala Ser Ala Lys Lys Ala Gln Ile
Cys 1700 1705 1710Gln Lys Gln Val Lys
Tyr Leu Gly Tyr Leu Leu Lys Glu Gly Gln 1715 1720
1725Arg Trp Leu Thr Glu Ala Arg Lys Glu Thr Val Met Gly
Gln Pro 1730 1735 1740Thr Pro Lys Thr
Pro Arg Gln Leu Arg Glu Phe Leu Gly Lys Ala 1745
1750 1755Gly Phe Cys Arg Leu Phe Ile Pro Gly Phe Ala
Glu Met Ala Ala 1760 1765 1770Pro Leu
Tyr Pro Leu Thr Lys Pro Gly Thr Leu Phe Asn Trp Gly 1775
1780 1785Pro Asp Gln Gln Lys Ala Tyr Gln Glu Ile
Lys Gln Ala Leu Leu 1790 1795 1800Thr
Ala Pro Ala Leu Gly Leu Pro Asp Leu Thr Lys Pro Phe Glu 1805
1810 1815Leu Phe Val Asp Glu Lys Gln Gly Tyr
Ala Lys Gly Val Leu Thr 1820 1825
1830Gln Lys Leu Gly Pro Trp Arg Arg Pro Val Ala Tyr Leu Ser Lys
1835 1840 1845Lys Leu Asp Pro Val Ala
Ala Gly Trp Pro Pro Cys Leu Arg Met 1850 1855
1860Val Ala Ala Ile Ala Val Leu Thr Lys Asp Ala Gly Lys Leu
Thr 1865 1870 1875Met Gly Gln Pro Leu
Val Ile Leu Ala Pro His Ala Val Glu Ala 1880 1885
1890Leu Val Lys Gln Pro Pro Asp Arg Trp Leu Ser Asn Ala
Arg Met 1895 1900 1905Thr His Tyr Gln
Ala Leu Leu Leu Asp Thr Asp Arg Val Gln Phe 1910
1915 1920Gly Pro Val Val Ala Leu Asn Pro Ala Thr Leu
Leu Pro Leu Pro 1925 1930 1935Glu Glu
Gly Leu Gln His Asn Cys Leu Asp Gly Thr Gly Gly Gly 1940
1945 1950Gly Val Thr Val Lys Phe Lys Tyr Lys Gly
Glu Glu Leu Glu Val 1955 1960 1965Asp
Ile Ser Lys Ile Lys Lys Val Trp Arg Val Gly Lys Met Ile 1970
1975 1980Ser Phe Thr Tyr Asp Asp Asn Gly Lys
Thr Gly Arg Gly Ala Val 1985 1990
1995Ser Glu Lys Asp Ala Pro Lys Glu Leu Leu Gln Met Leu Glu Lys
2000 2005 2010Ser Gly Lys Lys Ser Gly
Gly Ser Lys Arg Thr Ala Asp Gly Ser 2015 2020
2025Glu Phe Glu Pro Lys Lys Lys Arg Lys Val Gly Gly Gly Gly
Ser 2030 2035 2040Pro Lys Lys Lys Arg
Lys Val Tyr Pro Tyr Asp Val Pro Asp Tyr 2045 2050
2055Ala Gly Ser Arg Ala Leu Val Val Ile Arg Leu Ser Arg
Val Thr 2060 2065 2070Asp Ala Thr Thr
Ser Pro Glu Arg Gln Leu Glu Ser Cys Gln Gln 2075
2080 2085Leu Cys Ala Gln Arg Gly Trp Asp Val Val Gly
Val Ala Glu Asp 2090 2095 2100Leu Asp
Val Ser Gly Ala Val Asp Pro Phe Asp Arg Lys Arg Arg 2105
2110 2115Pro Asn Leu Ala Arg Trp Leu Ala Phe Glu
Glu Gln Pro Phe Asp 2120 2125 2130Val
Ile Val Ala Tyr Arg Val Asp Arg Leu Thr Arg Ser Ile Arg 2135
2140 2145His Leu Gln Gln Leu Val His Trp Ala
Glu Asp His Lys Lys Leu 2150 2155
2160Val Val Ser Ala Thr Glu Ala His Phe Asp Thr Thr Thr Pro Phe
2165 2170 2175Ala Ala Val Val Ile Ala
Leu Met Gly Thr Val Ala Gln Met Glu 2180 2185
2190Leu Glu Ala Ile Lys Glu Arg Asn Arg Ser Ala Ala His Phe
Asn 2195 2200 2205Ile Arg Ala Gly Lys
Tyr Arg Gly Ser Leu Pro Pro Trp Gly Tyr 2210 2215
2220Leu Pro Thr Arg Val Asp Gly Glu Trp Arg Leu Val Pro
Asp Pro 2225 2230 2235Val Gln Arg Glu
Arg Ile Leu Glu Val Tyr His Arg Val Val Asp 2240
2245 2250Asn His Glu Pro Leu His Leu Val Ala His Asp
Leu Asn Arg Arg 2255 2260 2265Gly Val
Leu Ser Pro Lys Asp Tyr Phe Ala Gln Leu Gln Gly Arg 2270
2275 2280Glu Pro Gln Gly Arg Glu Trp Ser Ala Thr
Ala Leu Lys Arg Ser 2285 2290 2295Met
Ile Ser Glu Ala Met Leu Gly Tyr Ala Thr Leu Asn Gly Lys 2300
2305 2310Thr Val Arg Asp Asp Asp Gly Ala Pro
Leu Val Arg Ala Glu Pro 2315 2320
2325Ile Leu Thr Arg Glu Gln Leu Glu Ala Leu Arg Ala Glu Leu Val
2330 2335 2340Lys Thr Ser Arg Ala Lys
Pro Ala Val Ser Thr Pro Ser Leu Leu 2345 2350
2355Leu Arg Val Leu Phe Cys Ala Val Cys Gly Glu Pro Ala Tyr
Lys 2360 2365 2370Phe Ala Gly Gly Gly
Arg Lys His Pro Arg Tyr Arg Cys Arg Ser 2375 2380
2385Met Gly Phe Pro Lys His Cys Gly Asn Gly Thr Val Ala
Met Ala 2390 2395 2400Glu Trp Asp Ala
Phe Cys Glu Glu Gln Val Leu Asp Leu Leu Gly 2405
2410 2415Asp Ala Glu Arg Leu Glu Lys Val Trp Val Ala
Gly Ser Asp Ser 2420 2425 2430Ala Val
Glu Leu Ala Glu Val Asn Ala Glu Leu Val Asp Leu Thr 2435
2440 2445Ser Leu Ile Gly Ser Pro Ala Tyr Arg Ala
Gly Ser Pro Gln Arg 2450 2455 2460Glu
Ala Leu Asp Ala Arg Ile Ala Ala Leu Ala Ala Arg Gln Glu 2465
2470 2475Glu Leu Glu Gly Leu Glu Ala Arg Pro
Ser Gly Trp Glu Trp Arg 2480 2485
2490Glu Thr Gly Gln Arg Phe Gly Asp Trp Trp Arg Glu Gln Asp Thr
2495 2500 2505Ala Ala Lys Asn Thr Trp
Leu Arg Ser Met Asn Val Arg Leu Thr 2510 2515
2520Phe Asp Val Arg Gly Gly Leu Thr Arg Thr Ile Asp Phe Gly
Asp 2525 2530 2535Leu Gln Glu Tyr Glu
Gln His Leu Arg Leu Gly Ser Val Val Glu 2540 2545
2550Arg Leu His Thr Gly Met Ser 2555
2560
SEQ ID NO: 377; Length: 7680; Type: DNA; Organism: Artificial Sequence; Description of Artificial Sequence: Synthetic polynucleotide
atgaaacgga cagccgacgg aagcgagttc gagtcaccaa agaagaagcg gaaagtcgac 60
aagaagtaca gcatcggcct ggacatcggc accaactctg tgggctgggc cgtgatcacc 120
gacgagtaca aggtgcccag caagaaattc aaggtgctgg gcaacaccga ccggcacagc 180
atcaagaaga acctgatcgg agccctgctg ttcgacagcg gcgaaacagc cgaggccacc 240
cggctgaaga gaaccgccag aagaagatac accagacgga agaaccggat ctgctatctg 300
caagagatct tcagcaacga gatggccaag gtggacgaca gcttcttcca cagactggaa 360
gagtccttcc tggtggaaga ggataagaag cacgagcggc accccatctt cggcaacatc 420
gtggacgagg tggcctacca cgagaagtac cccaccatct accacctgag aaagaaactg 480
gtggacagca ccgacaaggc cgacctgcgg ctgatctatc tggccctggc ccacatgatc 540
aagttccggg gccacttcct gatcgagggc gacctgaacc ccgacaacag cgacgtggac 600
aagctgttca tccagctggt gcagacctac aaccagctgt tcgaggaaaa ccccatcaac 660
gccagcggcg tggacgccaa ggccatcctg tctgccagac tgagcaagag cagacggctg 720
gaaaatctga tcgcccagct gcccggcgag aagaagaatg gcctgttcgg aaacctgatt 780
gccctgagcc tgggcctgac ccccaacttc aagagcaact tcgacctggc cgaggatgcc 840
aaactgcagc tgagcaagga cacctacgac gacgacctgg acaacctgct ggcccagatc 900
ggcgaccagt acgccgacct gtttctggcc gccaagaacc tgtccgacgc catcctgctg 960
agcgacatcc tgagagtgaa caccgagatc accaaggccc ccctgagcgc ctctatgatc 1020
aagagatacg acgagcacca ccaggacctg accctgctga aagctctcgt gcggcagcag 1080
ctgcctgaga agtacaaaga gattttcttc gaccagagca agaacggcta cgccggctac 1140
attgacggcg gagccagcca ggaagagttc tacaagttca tcaagcccat cctggaaaag 1200
atggacggca ccgaggaact gctcgtgaag ctgaacagag aggacctgct gcggaagcag 1260
cggaccttcg acaacggcag catcccccac cagatccacc tgggagagct gcacgccatt 1320
ctgcggcggc aggaagattt ttacccattc ctgaaggaca accgggaaaa gatcgagaag 1380
atcctgacct tccgcatccc ctactacgtg ggccctctgg ccaggggaaa cagcagattc 1440
gcctggatga ccagaaagag cgaggaaacc atcaccccct ggaacttcga ggaagtggtg 1500
gacaagggcg cttccgccca gagcttcatc gagcggatga ccaacttcga taagaacctg 1560
cccaacgaga aggtgctgcc caagcacagc ctgctgtacg agtacttcac cgtgtataac 1620
gagctgacca aagtgaaata cgtgaccgag ggaatgagaa agcccgcctt cctgagcggc 1680
gagcagaaaa aggccatcgt ggacctgctg ttcaagacca accggaaagt gaccgtgaag 1740
cagctgaaag aggactactt caagaaaatc gagtgcttcg actccgtgga aatctccggc 1800
gtggaagatc ggttcaacgc ctccctgggc acataccacg atctgctgaa aattatcaag 1860
gacaaggact tcctggacaa tgaggaaaac gaggacattc tggaagatat cgtgctgacc 1920
ctgacactgt ttgaggacag agagatgatc gaggaacggc tgaaaaccta tgcccacctg 1980
ttcgacgaca aagtgatgaa gcagctgaag cggcggagat acaccggctg gggcaggctg 2040
agccggaagc tgatcaacgg catccgggac aagcagtccg gcaagacaat cctggatttc 2100
ctgaagtccg acggcttcgc caacagaaac ttcatgcagc tgatccacga cgacagcctg 2160
acctttaaag aggacatcca gaaagcccag gtgtccggcc agggcgatag cctgcacgag 2220
cacattgcca atctggccgg cagccccgcc attaagaagg gcatcctgca gacagtgaag 2280
gtggtggacg agctcgtgaa agtgatgggc cggcacaagc ccgagaacat cgtgatcgaa 2340
atggccagag agaaccagac cacccagaag ggacagaaga acagccgcga gagaatgaag 2400
cggatcgaag agggcatcaa agagctgggc agccagatcc tgaaagaaca ccccgtggaa 2460
aacacccagc tgcagaacga gaagctgtac ctgtactacc tgcagaatgg gcgggatatg 2520
tacgtggacc aggaactgga catcaaccgg ctgtccgact acgatgtgga cgctatcgtg 2580
cctcagagct ttctgaagga cgactccatc gacaacaagg tgctgaccag aagcgacaag 2640
aaccggggca agagcgacaa cgtgccctcc gaagaggtcg tgaagaagat gaagaactac 2700
tggcggcagc tgctgaacgc caagctgatt acccagagaa agttcgacaa tctgaccaag 2760
gccgagagag gcggcctgag cgaactggat aaggccggct tcatcaagag acagctggtg 2820
gaaacccggc agatcacaaa gcacgtggca cagatcctgg actcccggat gaacactaag 2880
tacgacgaga atgacaagct gatccgggaa gtgaaagtga tcaccctgaa gtccaagctg 2940
gtgtccgatt tccggaagga tttccagttt tacaaagtgc gcgagatcaa caactaccac 3000
cacgcccacg acgcctacct gaacgccgtc gtgggaaccg ccctgatcaa aaagtaccct 3060
aagctggaaa gcgagttcgt gtacggcgac tacaaggtgt acgacgtgcg gaagatgatc 3120
gccaagagcg agcaggaaat cggcaaggct accgccaagt acttcttcta cagcaacatc 3180
atgaactttt tcaagaccga gattaccctg gccaacggcg agatccggaa gcggcctctg 3240
atcgagacaa acggcgaaac cggggagatc gtgtgggata agggccggga ttttgccacc 3300
gtgcggaaag tgctgagcat gccccaagtg aatatcgtga aaaagaccga ggtgcagaca 3360
ggcggcttca gcaaagagtc tatcctgccc aagaggaaca gcgataagct gatcgccaga 3420
aagaaggact gggaccctaa gaagtacggc ggcttcgaca gccccaccgt ggcctattct 3480
gtgctggtgg tggccaaagt ggaaaagggc aagtccaaga aactgaagag tgtgaaagag 3540
ctgctgggga tcaccatcat ggaaagaagc agcttcgaga agaatcccat cgactttctg 3600
gaagccaagg gctacaaaga agtgaaaaag gacctgatca tcaagctgcc taagtactcc 3660
ctgttcgagc tggaaaacgg ccggaagaga atgctggcct ctgccggcga actgcagaag 3720
ggaaacgaac tggccctgcc ctccaaatat gtgaacttcc tgtacctggc cagccactat 3780
gagaagctga agggctcccc cgaggataat gagcagaaac agctgtttgt ggaacagcac 3840
aagcactacc tggacgagat catcgagcag atcagcgagt tctccaagag agtgatcctg 3900
gccgacgcta atctggacaa agtgctgtcc gcctacaaca agcaccggga taagcccatc 3960
agagagcagg ccgagaatat catccacctg tttaccctga ccaatctggg agcccctgcc 4020
gccttcaagt actttgacac caccatcgac cggaagaggt acaccagcac caaagaggtg 4080
ctggacgcca ccctgatcca ccagagcatc accggcctgt acgagacacg gatcgacctg 4140
tctcagctgg gaggtgactc tggaggatct agcggaggat cctctggcag cgagacacca 4200
ggaacaagcg agtcagcaac accagagagc tctggtagcg agacacccgg taccagtgaa 4260
agcgccacgc cagaaagcag tgggagtgag actccgggta catctgaatc agcgacaccg 4320
gaatcaagtg gcggcagcag cggcggcagc agcaccctaa atatagaaga tgagtatcgg 4380
ctacatgaga cctcaaaaga gccagatgtt tctctagggt ccacatggct gtctgatttt 4440
cctcaggcct gggcggaaac cgggggcatg ggactggcag ttcgccaagc tcctctgatc 4500
atacctctga aagcaacctc tacccccgtg tccataaaac aataccccat gtcacaagaa 4560
gccagactgg ggatcaagcc ccacatacag agactgttgg accagggaat actggtaccc 4620
tgccagtccc cctggaacac gcccctgcta cccgttaaga aaccagggac taatgattat 4680
aggcctgtcc aggatctgag agaagtcaac aagcgggtgg aagacatcca ccccaccgtg 4740
cccaaccctt acaacctctt gagcgggccc ccaccgtccc accagtggta cactgtgctt 4800
gatttaaagg atgccttttt ctgcctgaga ctccacccca ccagtcagcc tctcttcgcc 4860
tttgagtgga gagatccaga gatgggaatc tcaggacaat tgacctggac cagactccca 4920
cagggtttca aaaacagtcc caccctgttt aatgaggcac tgcacagaga cctagcagac 4980
ttccggatcc agcacccaga cttgatcctg ctacagtacg tggatgactt actgctggcc 5040
gccacttctg agctagactg ccaacaaggt actcgggccc tgttacaaac cctagggaac 5100
ctcgggtatc gggcctcggc caagaaagcc caaatttgcc agaaacaggt caagtatctg 5160
gggtatcttc taaaagaggg tcagagatgg ctgactgagg ccagaaaaga gactgtgatg 5220
gggcagccta ctccgaagac ccctcgacaa ctaagggagt tcctagggaa ggcaggcttc 5280
tgtcgcctct tcatccctgg gtttgcagaa atggcagccc ccctgtaccc tctcaccaaa 5340
ccggggactc tgtttaattg gggcccagac caacaaaagg cctatcaaga aatcaagcaa 5400
gctcttctaa ctgccccagc cctggggttg ccagatttga ctaagccctt tgaactcttt 5460
gtcgacgaga agcagggcta cgccaaaggt gtcctaacgc aaaaactggg accttggcgt 5520
cggccggtgg cctacctgtc caaaaagcta gacccagtag cagctgggtg gcccccttgc 5580
ctacggatgg tagcagccat tgccgtactg acaaaggatg caggcaagct aaccatggga 5640
cagccactag tcattctggc cccccatgca gtagaggcac tagtcaaaca accccccgac 5700
cgctggcttt ccaacgcccg gatgactcac tatcaggcct tgcttttgga cacggaccgg 5760
gtccagttcg gaccggtggt agccctgaac ccggctacgc tgctcccact gcctgaggaa 5820
gggctgcaac acaactgcct tgatgggaca ggtggcggtg gtgtcaccgt caagttcaag 5880
tacaagggtg aggaacttga agttgatatt agcaaaatca agaaggtttg gcgcgttggt 5940
aaaatgatat cttttactta tgacgacaac ggcaagacag gtagaggggc agtgtctgag 6000
aaagacgccc ccaaggagct gttgcaaatg ttggaaaagt ctgggaaaaa gtctggcggc 6060
tcaaaaagaa ccgccgacgg cagcgaattc gagcccaaga agaagaggaa agtcggaggt 6120
ggcgggagcc caaaaaagaa aagaaaagtg tatccctatg atgtccccga ttatgccggt 6180
tcaagagccc tggtcgtgat tagactgagc cgagtgacag acgccaccac aagtcccgag 6240
agacagctgg aatcatgcca gcagctctgt gctcagcggg gttgggatgt ggtcggcgtg 6300
gcagaggatc tggacgtgag cggggccgtc gatccattcg acagaaagag gaggcccaac 6360
ctggcaagat ggctcgcttt cgaggaacag ccctttgatg tgatcgtcgc ctacagagtg 6420
gaccggctga cccgctcaat tcgacatctc cagcagctgg tgcattgggc tgaggaccac 6480
aagaaactgg tggtcagcgc aacagaagcc cacttcgata ctaccacacc ttttgccgct 6540
gtggtcatcg cactgatggg cactgtggcc cagatggagc tcgaagctat caaggagcga 6600
aacaggagcg cagcccattt caatattagg gccggtaaat acagaggctc cctgccccct 6660
tggggatatc tccctaccag ggtggatggg gagtggagac tggtgccaga ccccgtccag 6720
agagagcgga ttctggaagt gtaccacaga gtggtcgata accacgaacc actccatctg 6780
gtggcacacg acctgaatag acgcggcgtg ctctctccaa aggattattt tgctcagctg 6840
cagggaagag agccacaggg aagagaatgg agtgctactg cactgaagag atctatgatc 6900
agtgaggcta tgctgggtta cgcaacactc aatggcaaaa ctgtccggga cgatgacgga 6960
gcccctctgg tgagggctga gcctattctc accagagagc agctcgaagc tctgcgggca 7020
gaactggtca agactagtcg cgccaaacct gccgtgagca ccccaagcct gctcctgagg 7080
gtgctgttct gcgccgtctg tggagagcca gcatacaagt ttgccggcgg agggcgcaaa 7140
catccccgct atcgatgcag gagcatgggg ttccctaagc actgtggaaa cgggacagtg 7200
gccatggctg agtgggacgc cttttgcgag gaacaggtgc tggatctcct gggtgacgct 7260
gagcggctgg aaaaagtgtg ggtggcagga tctgactccg ctgtggagct ggcagaagtc 7320
aatgccgagc tcgtggatct gacttccctc atcggatctc ctgcatatag agctgggtcc 7380
ccacagagag aagctctgga cgcacgaatt gctgcactcg ctgctagaca ggaggaactg 7440
gagggcctgg aggccaggcc ctctggatgg gagtggcgag aaaccggaca gaggtttggg 7500
gattggtgga gggagcagga caccgcagcc aagaacacat ggctgagatc catgaatgtc 7560
cggctcacat tcgacgtgcg cggtggcctg actcgaacca tcgattttgg cgacctgcag 7620
gagtatgaac agcacctgag actggggtcc gtggtcgaaa gactgcacac tgggatgtcc 7680
SEQ ID NO: 378; Length: 1367; Type: PRT; Organism: Artificial Sequence; Description of Artificial Sequence: Synthetic polypeptide
Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val Gly1
5 10 15Trp Ala Val Ile Thr Asp
Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys 20 25
30Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn
Leu Ile Gly 35 40 45Ala Leu Leu
Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys 50
55 60Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn
Arg Ile Cys Tyr65 70 75
80Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95Phe His Arg Leu Glu Glu
Ser Phe Leu Val Glu Glu Asp Lys Lys His 100
105 110Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu
Val Ala Tyr His 115 120 125Glu Lys
Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser 130
135 140Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu
Ala Leu Ala His Met145 150 155
160Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175Asn Ser Asp Val
Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn 180
185 190Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser
Gly Val Asp Ala Lys 195 200 205Ala
Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu 210
215 220Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn
Gly Leu Phe Gly Asn Leu225 230 235
240Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
Asp 245 250 255Leu Ala Glu
Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp 260
265 270Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly
Asp Gln Tyr Ala Asp Leu 275 280
285Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile 290
295 300Leu Arg Val Asn Thr Glu Ile Thr
Lys Ala Pro Leu Ser Ala Ser Met305 310
315 320Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr
Leu Leu Lys Ala 325 330
335Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350Gln Ser Lys Asn Gly Tyr
Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln 355 360
365Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met
Asp Gly 370 375 380Thr Glu Glu Leu Leu
Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys385 390
395 400Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro
His Gln Ile His Leu Gly 405 410
415Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430Lys Asp Asn Arg Glu
Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro 435
440 445Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg
Phe Ala Trp Met 450 455 460Thr Arg Lys
Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val465
470 475 480Val Asp Lys Gly Ala Ser Ala
Gln Ser Phe Ile Glu Arg Met Thr Asn 485
490 495Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro
Lys His Ser Leu 500 505 510Leu
Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr 515
520 525Val Thr Glu Gly Met Arg Lys Pro Ala
Phe Leu Ser Gly Glu Gln Lys 530 535
540Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val545
550 555 560Lys Gln Leu Lys
Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser 565
570 575Val Glu Ile Ser Gly Val Glu Asp Arg Phe
Asn Ala Ser Leu Gly Thr 580 585
590Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605Glu Glu Asn Glu Asp Ile Leu
Glu Asp Ile Val Leu Thr Leu Thr Leu 610 615
620Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
His625 630 635 640Leu Phe
Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655Gly Trp Gly Arg Leu Ser Arg
Lys Leu Ile Asn Gly Ile Arg Asp Lys 660 665
670Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly
Phe Ala 675 680 685Asn Arg Asn Phe
Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys 690
695 700Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly
Asp Ser Leu His705 710 715
720Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735Leu Gln Thr Val Lys
Val Val Asp Glu Leu Val Lys Val Met Gly Arg 740
745 750His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg
Glu Asn Gln Thr 755 760 765Thr Gln
Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu 770
775 780Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu
Lys Glu His Pro Val785 790 795
800Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815Asn Gly Arg Asp
Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu 820
825 830Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln
Ser Phe Leu Lys Asp 835 840 845Asp
Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly 850
855 860Lys Ser Asp Asn Val Pro Ser Glu Glu Val
Val Lys Lys Met Lys Asn865 870 875
880Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
Phe 885 890 895Asp Asn Leu
Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys 900
905 910Ala Gly Phe Ile Lys Arg Gln Leu Val Glu
Thr Arg Gln Ile Thr Lys 915 920
925His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu 930
935 940Asn Asp Lys Leu Ile Arg Glu Val
Lys Val Ile Thr Leu Lys Ser Lys945 950
955 960Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr
Lys Val Arg Glu 965 970
975Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990Gly Thr Ala Leu Ile Lys
Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val 995 1000
1005Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met
Ile Ala Lys 1010 1015 1020Ser Glu Gln
Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr 1025
1030 1035Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile
Thr Leu Ala Asn 1040 1045 1050Gly Glu
Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr 1055
1060 1065Gly Glu Ile Val Trp Asp Lys Gly Arg Asp
Phe Ala Thr Val Arg 1070 1075 1080Lys
Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu 1085
1090 1095Val Gln Thr Gly Gly Phe Ser Lys Glu
Ser Ile Leu Pro Lys Arg 1100 1105
1110Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1115 1120 1125Lys Tyr Gly Gly Phe Asp
Ser Pro Thr Val Ala Tyr Ser Val Leu 1130 1135
1140Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
Ser 1145 1150 1155Val Lys Glu Leu Leu
Gly Ile Thr Ile Met Glu Arg Ser Ser Phe 1160 1165
1170Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr
Lys Glu 1175 1180 1185Val Lys Lys Asp
Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe 1190
1195 1200Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala
Ser Ala Gly Glu 1205 1210 1215Leu Gln
Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn 1220
1225 1230Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys
Leu Lys Gly Ser Pro 1235 1240 1245Glu
Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His 1250
1255 1260Tyr Leu Asp Glu Ile Ile Glu Gln Ile
Ser Glu Phe Ser Lys Arg 1265 1270
1275Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
1280 1285 1290Asn Lys His Arg Asp Lys
Pro Ile Arg Glu Gln Ala Glu Asn Ile 1295 1300
1305Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
Phe 1310 1315 1320Lys Tyr Phe Asp Thr
Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr 1325 1330
1335Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
Thr Gly 1340 1345 1350Leu Tyr Glu Thr
Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp 1355 1360
1365
SEQ ID NO: 379; Length: 576; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polypeptide
Sequence 379:
Leu Asn Ile Glu Asp Glu Tyr Arg
Leu His Glu Thr Ser Lys Glu Pro1 5 10
15Asp Val Ser Leu Gly Ser Thr Trp Leu Ser Asp Phe Pro Gln
Ala Trp 20 25 30Ala Glu Thr
Gly Gly Met Gly Leu Ala Val Arg Gln Ala Pro Leu Ile 35
40 45Ile Pro Leu Lys Ala Thr Ser Thr Pro Val Ser
Ile Lys Gln Tyr Pro 50 55 60Met Ser
Gln Glu Ala Arg Leu Gly Ile Lys Pro His Ile Gln Arg Leu65
70 75 80Leu Asp Gln Gly Ile Leu Val
Pro Cys Gln Ser Pro Trp Asn Thr Pro 85 90
95Leu Leu Pro Val Lys Lys Pro Gly Thr Asn Asp Tyr Arg
Pro Val Gln 100 105 110Asp Leu
Arg Glu Val Asn Lys Arg Val Glu Asp Ile His Pro Thr Val 115
120 125Pro Asn Pro Tyr Asn Leu Leu Ser Gly Pro
Pro Pro Ser His Gln Trp 130 135 140Tyr
Thr Val Leu Asp Leu Lys Asp Ala Phe Phe Cys Leu Arg Leu His145
150 155 160Pro Thr Ser Gln Pro Leu
Phe Ala Phe Glu Trp Arg Asp Pro Glu Met 165
170 175Gly Ile Ser Gly Gln Leu Thr Trp Thr Arg Leu Pro
Gln Gly Phe Lys 180 185 190Asn
Ser Pro Thr Leu Phe Asn Glu Ala Leu His Arg Asp Leu Ala Asp 195
200 205Phe Arg Ile Gln His Pro Asp Leu Ile
Leu Leu Gln Tyr Val Asp Asp 210 215
220Leu Leu Leu Ala Ala Thr Ser Glu Leu Asp Cys Gln Gln Gly Thr Arg225
230 235 240Ala Leu Leu Gln
Thr Leu Gly Asn Leu Gly Tyr Arg Ala Ser Ala Lys 245
250 255Lys Ala Gln Ile Cys Gln Lys Gln Val Lys
Tyr Leu Gly Tyr Leu Leu 260 265
270Lys Glu Gly Gln Arg Trp Leu Thr Glu Ala Arg Lys Glu Thr Val Met
275 280 285Gly Gln Pro Thr Pro Lys Thr
Pro Arg Gln Leu Arg Glu Phe Leu Gly 290 295
300Lys Ala Gly Phe Cys Arg Leu Phe Ile Pro Gly Phe Ala Glu Met
Ala305 310 315 320Ala Pro
Leu Tyr Pro Leu Thr Lys Pro Gly Thr Leu Phe Asn Trp Gly
325 330 335Pro Asp Gln Gln Lys Ala Tyr
Gln Glu Ile Lys Gln Ala Leu Leu Thr 340 345
350Ala Pro Ala Leu Gly Leu Pro Asp Leu Thr Lys Pro Phe Glu
Leu Phe 355 360 365Val Asp Glu Lys
Gln Gly Tyr Ala Lys Gly Val Leu Thr Gln Lys Leu 370
375 380Gly Pro Trp Arg Arg Pro Val Ala Tyr Leu Ser Lys
Lys Leu Asp Pro385 390 395
400Val Ala Ala Gly Trp Pro Pro Cys Leu Arg Met Val Ala Ala Ile Ala
405 410 415Val Leu Thr Lys Asp
Ala Gly Lys Leu Thr Met Gly Gln Pro Leu Val 420
425 430Ile Leu Ala Pro His Ala Val Glu Ala Leu Val Lys
Gln Pro Pro Asp 435 440 445Arg Trp
Leu Ser Asn Ala Arg Met Thr His Tyr Gln Ala Leu Leu Leu 450
455 460Asp Thr Asp Arg Val Gln Phe Gly Pro Val Val
Ala Leu Asn Pro Ala465 470 475
480Thr Leu Leu Pro Leu Pro Glu Glu Gly Leu Gln His Asn Cys Leu Asp
485 490 495Gly Thr Gly Gly
Gly Gly Val Thr Val Lys Phe Lys Tyr Lys Gly Glu 500
505 510Glu Leu Glu Val Asp Ile Ser Lys Ile Lys Lys
Val Trp Arg Val Gly 515 520 525Lys
Met Ile Ser Phe Thr Tyr Asp Asp Asn Gly Lys Thr Gly Arg Gly 530
535 540Ala Val Ser Glu Lys Asp Ala Pro Lys Glu
Leu Leu Gln Met Leu Glu545 550 555
560Lys Ser Gly Lys Lys Ser Gly Gly Ser Lys Arg Thr Ala Asp Gly
Ser 565 570
575
SEQ ID NO: 380; Length: 500; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polypeptide
Sequence 380:
Ser Arg Ala Leu Val Val Ile Arg Leu Ser Arg
Val Thr Asp Ala Thr1 5 10
15Thr Ser Pro Glu Arg Gln Leu Glu Ser Cys Gln Gln Leu Cys Ala Gln
20 25 30Arg Gly Trp Asp Val Val Gly
Val Ala Glu Asp Leu Asp Val Ser Gly 35 40
45Ala Val Asp Pro Phe Asp Arg Lys Arg Arg Pro Asn Leu Ala Arg
Trp 50 55 60Leu Ala Phe Glu Glu Gln
Pro Phe Asp Val Ile Val Ala Tyr Arg Val65 70
75 80Asp Arg Leu Thr Arg Ser Ile Arg His Leu Gln
Gln Leu Val His Trp 85 90
95Ala Glu Asp His Lys Lys Leu Val Val Ser Ala Thr Glu Ala His Phe
100 105 110Asp Thr Thr Thr Pro Phe
Ala Ala Val Val Ile Ala Leu Met Gly Thr 115 120
125Val Ala Gln Met Glu Leu Glu Ala Ile Lys Glu Arg Asn Arg
Ser Ala 130 135 140Ala His Phe Asn Ile
Arg Ala Gly Lys Tyr Arg Gly Ser Leu Pro Pro145 150
155 160Trp Gly Tyr Leu Pro Thr Arg Val Asp Gly
Glu Trp Arg Leu Val Pro 165 170
175Asp Pro Val Gln Arg Glu Arg Ile Leu Glu Val Tyr His Arg Val Val
180 185 190Asp Asn His Glu Pro
Leu His Leu Val Ala His Asp Leu Asn Arg Arg 195
200 205Gly Val Leu Ser Pro Lys Asp Tyr Phe Ala Gln Leu
Gln Gly Arg Glu 210 215 220Pro Gln Gly
Arg Glu Trp Ser Ala Thr Ala Leu Lys Arg Ser Met Ile225
230 235 240Ser Glu Ala Met Leu Gly Tyr
Ala Thr Leu Asn Gly Lys Thr Val Arg 245
250 255Asp Asp Asp Gly Ala Pro Leu Val Arg Ala Glu Pro
Ile Leu Thr Arg 260 265 270Glu
Gln Leu Glu Ala Leu Arg Ala Glu Leu Val Lys Thr Ser Arg Ala 275
280 285Lys Pro Ala Val Ser Thr Pro Ser Leu
Leu Leu Arg Val Leu Phe Cys 290 295
300Ala Val Cys Gly Glu Pro Ala Tyr Lys Phe Ala Gly Gly Gly Arg Lys305
310 315 320His Pro Arg Tyr
Arg Cys Arg Ser Met Gly Phe Pro Lys His Cys Gly 325
330 335Asn Gly Thr Val Ala Met Ala Glu Trp Asp
Ala Phe Cys Glu Glu Gln 340 345
350Val Leu Asp Leu Leu Gly Asp Ala Glu Arg Leu Glu Lys Val Trp Val
355 360 365Ala Gly Ser Asp Ser Ala Val
Glu Leu Ala Glu Val Asn Ala Glu Leu 370 375
380Val Asp Leu Thr Ser Leu Ile Gly Ser Pro Ala Tyr Arg Ala Gly
Ser385 390 395 400Pro Gln
Arg Glu Ala Leu Asp Ala Arg Ile Ala Ala Leu Ala Ala Arg
405 410 415Gln Glu Glu Leu Glu Gly Leu
Glu Ala Arg Pro Ser Gly Trp Glu Trp 420 425
430Arg Glu Thr Gly Gln Arg Phe Gly Asp Trp Trp Arg Glu Gln
Asp Thr 435 440 445Ala Ala Lys Asn
Thr Trp Leu Arg Ser Met Asn Val Arg Leu Thr Phe 450
455 460Asp Val Arg Gly Gly Leu Thr Arg Thr Ile Asp Phe
Gly Asp Leu Gln465 470 475
480Glu Tyr Glu Gln His Leu Arg Leu Gly Ser Val Val Glu Arg Leu His
485 490 495Thr Gly Met Ser
500
SEQ ID NO: 381; Length: 11344; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 381:
ccgaaaagtg ccacctgacg tcgacggatc
gggagatcga tctcccgatc ccctagggtc 60gactctcagt acaatctgct ctgatgccgc
atagttaagc cagtatctgc tccctgcttg 120tgtgttggag gtcgctgagt agtgcgcgag
caaaatttaa gctacaacaa ggcaaggctt 180gaccgacaat tgcatgaaga atctgcttag
ggttaggcgt tttgcgctgc ttcgcgatgt 240acgggccaga tatacgcgtt gacattgatt
attgactagt tattaatagt aatcaattac 300ggggtcatta gttcatagcc catatatgga
gttccgcgtt acataactta cggtaaatgg 360cccgcctggc tgaccgccca acgacccccg
cccattgacg tcaataatga cgtatgttcc 420catagtaacg ccaataggga ctttccattg
acgtcaatgg gtggagtatt tacggtaaac 480tgcccacttg gcagtacatc aagtgtatca
tatgccaagt acgcccccta ttgacgtcaa 540tgacggtaaa tggcccgcct ggcattatgc
ccagtacatg accttatggg actttcctac 600ttggcagtac atctacgtat tagtcatcgc
tattaccatg gtgatgcggt tttggcagta 660catcaatggg cgtggatagc ggtttgactc
acggggattt ccaagtctcc accccattga 720cgtcaatggg agtttgtttt ggcaccaaaa
tcaacgggac tttccaaaat gtcgtaacaa 780ctccgcccca ttgacgcaaa tgggcggtag
gcgtgtacgg tgggaggtct atataagcag 840agctggttta gtgaaccgtc agatccgcta
gagatccgcg gccgctaata cgactcacta 900tagggagagc cgccaccatg aaacggacag
ccgacggaag cgagttcgag tcaccaaaga 960agaagcggaa agtcgacaag aagtacagca
tcggcctgga catcggcacc aactctgtgg 1020gctgggccgt gatcaccgac gagtacaagg
tgcccagcaa gaaattcaag gtgctgggca 1080acaccgaccg gcacagcatc aagaagaacc
tgatcggagc cctgctgttc gacagcggcg 1140aaacagccga ggccacccgg ctgaagagaa
ccgccagaag aagatacacc agacggaaga 1200accggatctg ctatctgcaa gagatcttca
gcaacgagat ggccaaggtg gacgacagct 1260tcttccacag actggaagag tccttcctgg
tggaagagga taagaagcac gagcggcacc 1320ccatcttcgg caacatcgtg gacgaggtgg
cctaccacga gaagtacccc accatctacc 1380acctgagaaa gaaactggtg gacagcaccg
acaaggccga cctgcggctg atctatctgg 1440ccctggccca catgatcaag ttccggggcc
acttcctgat cgagggcgac ctgaaccccg 1500acaacagcga cgtggacaag ctgttcatcc
agctggtgca gacctacaac cagctgttcg 1560aggaaaaccc catcaacgcc agcggcgtgg
acgccaaggc catcctgtct gccagactga 1620gcaagagcag acggctggaa aatctgatcg
cccagctgcc cggcgagaag aagaatggcc 1680tgttcggaaa cctgattgcc ctgagcctgg
gcctgacccc caacttcaag agcaacttcg 1740acctggccga ggatgccaaa ctgcagctga
gcaaggacac ctacgacgac gacctggaca 1800acctgctggc ccagatcggc gaccagtacg
ccgacctgtt tctggccgcc aagaacctgt 1860ccgacgccat cctgctgagc gacatcctga
gagtgaacac cgagatcacc aaggcccccc 1920tgagcgcctc tatgatcaag agatacgacg
agcaccacca ggacctgacc ctgctgaaag 1980ctctcgtgcg gcagcagctg cctgagaagt
acaaagagat tttcttcgac cagagcaaga 2040acggctacgc cggctacatt gacggcggag
ccagccagga agagttctac aagttcatca 2100agcccatcct ggaaaagatg gacggcaccg
aggaactgct cgtgaagctg aacagagagg 2160acctgctgcg gaagcagcgg accttcgaca
acggcagcat cccccaccag atccacctgg 2220gagagctgca cgccattctg cggcggcagg
aagattttta cccattcctg aaggacaacc 2280gggaaaagat cgagaagatc ctgaccttcc
gcatccccta ctacgtgggc cctctggcca 2340ggggaaacag cagattcgcc tggatgacca
gaaagagcga ggaaaccatc accccctgga 2400acttcgagga agtggtggac aagggcgctt
ccgcccagag cttcatcgag cggatgacca 2460acttcgataa gaacctgccc aacgagaagg
tgctgcccaa gcacagcctg ctgtacgagt 2520acttcaccgt gtataacgag ctgaccaaag
tgaaatacgt gaccgaggga atgagaaagc 2580ccgccttcct gagcggcgag cagaaaaagg
ccatcgtgga cctgctgttc aagaccaacc 2640ggaaagtgac cgtgaagcag ctgaaagagg
actacttcaa gaaaatcgag tgcttcgact 2700ccgtggaaat ctccggcgtg gaagatcggt
tcaacgcctc cctgggcaca taccacgatc 2760tgctgaaaat tatcaaggac aaggacttcc
tggacaatga ggaaaacgag gacattctgg 2820aagatatcgt gctgaccctg acactgtttg
aggacagaga gatgatcgag gaacggctga 2880aaacctatgc ccacctgttc gacgacaaag
tgatgaagca gctgaagcgg cggagataca 2940ccggctgggg caggctgagc cggaagctga
tcaacggcat ccgggacaag cagtccggca 3000agacaatcct ggatttcctg aagtccgacg
gcttcgccaa cagaaacttc atgcagctga 3060tccacgacga cagcctgacc tttaaagagg
acatccagaa agcccaggtg tccggccagg 3120gcgatagcct gcacgagcac attgccaatc
tggccggcag ccccgccatt aagaagggca 3180tcctgcagac agtgaaggtg gtggacgagc
tcgtgaaagt gatgggccgg cacaagcccg 3240agaacatcgt gatcgaaatg gccagagaga
accagaccac ccagaaggga cagaagaaca 3300gccgcgagag aatgaagcgg atcgaagagg
gcatcaaaga gctgggcagc cagatcctga 3360aagaacaccc cgtggaaaac acccagctgc
agaacgagaa gctgtacctg tactacctgc 3420agaatgggcg ggatatgtac gtggaccagg
aactggacat caaccggctg tccgactacg 3480atgtggacgc tatcgtgcct cagagctttc
tgaaggacga ctccatcgac aacaaggtgc 3540tgaccagaag cgacaagaac cggggcaaga
gcgacaacgt gccctccgaa gaggtcgtga 3600agaagatgaa gaactactgg cggcagctgc
tgaacgccaa gctgattacc cagagaaagt 3660tcgacaatct gaccaaggcc gagagaggcg
gcctgagcga actggataag gccggcttca 3720tcaagagaca gctggtggaa acccggcaga
tcacaaagca cgtggcacag atcctggact 3780cccggatgaa cactaagtac gacgagaatg
acaagctgat ccgggaagtg aaagtgatca 3840ccctgaagtc caagctggtg tccgatttcc
ggaaggattt ccagttttac aaagtgcgcg 3900agatcaacaa ctaccaccac gcccacgacg
cctacctgaa cgccgtcgtg ggaaccgccc 3960tgatcaaaaa gtaccctaag ctggaaagcg
agttcgtgta cggcgactac aaggtgtacg 4020acgtgcggaa gatgatcgcc aagagcgagc
aggaaatcgg caaggctacc gccaagtact 4080tcttctacag caacatcatg aactttttca
agaccgagat taccctggcc aacggcgaga 4140tccggaagcg gcctctgatc gagacaaacg
gcgaaaccgg ggagatcgtg tgggataagg 4200gccgggattt tgccaccgtg cggaaagtgc
tgagcatgcc ccaagtgaat atcgtgaaaa 4260agaccgaggt gcagacaggc ggcttcagca
aagagtctat cctgcccaag aggaacagcg 4320ataagctgat cgccagaaag aaggactggg
accctaagaa gtacggcggc ttcgacagcc 4380ccaccgtggc ctattctgtg ctggtggtgg
ccaaagtgga aaagggcaag tccaagaaac 4440tgaagagtgt gaaagagctg ctggggatca
ccatcatgga aagaagcagc ttcgagaaga 4500atcccatcga ctttctggaa gccaagggct
acaaagaagt gaaaaaggac ctgatcatca 4560agctgcctaa gtactccctg ttcgagctgg
aaaacggccg gaagagaatg ctggcctctg 4620ccggcgaact gcagaaggga aacgaactgg
ccctgccctc caaatatgtg aacttcctgt 4680acctggccag ccactatgag aagctgaagg
gctcccccga ggataatgag cagaaacagc 4740tgtttgtgga acagcacaag cactacctgg
acgagatcat cgagcagatc agcgagttct 4800ccaagagagt gatcctggcc gacgctaatc
tggacaaagt gctgtccgcc tacaacaagc 4860accgggataa gcccatcaga gagcaggccg
agaatatcat ccacctgttt accctgacca 4920atctgggagc ccctgccgcc ttcaagtact
ttgacaccac catcgaccgg aagaggtaca 4980ccagcaccaa agaggtgctg gacgccaccc
tgatccacca gagcatcacc ggcctgtacg 5040agacacggat cgacctgtct cagctgggag
gtgactctgg aggatctagc ggaggatcct 5100ctggcagcga gacaccagga acaagcgagt
cagcaacacc agagagcagt ggcggcagca 5160gcggcggcag cagcacccta aatatagaag
atgagtatcg gctacatgag acctcaaaag 5220agccagatgt ttctctaggg tccacatggc
tgtctgattt tcctcaggcc tgggcggaaa 5280ccgggggcat gggactggca gttcgccaag
ctcctctgat catacctctg aaagcaacct 5340ctacccccgt gtccataaaa caatacccca
tgtcacaaga agccagactg gggatcaagc 5400cccacataca gagactgttg gaccagggaa
tactggtacc ctgccagtcc ccctggaaca 5460cgcccctgct acccgttaag aaaccaggga
ctaatgatta taggcctgtc caggatctga 5520gagaagtcaa caagcgggtg gaagacatcc
accccaccgt gcccaaccct tacaacctct 5580tgagcgggct cccaccgtcc caccagtggt
acactgtgct tgatttaaag gatgcctttt 5640tctgcctgag actccacccc accagtcagc
ctctcttcgc ctttgagtgg agagatccag 5700agatgggaat ctcaggacaa ttgacctgga
ccagactccc acagggtttc aaaaacagtc 5760ccaccctgtt taatgaggca ctgcacagag
acctagcaga cttccggatc cagcacccag 5820acttgatcct gctacagtac gtggatgact
tactgctggc cgccacttct gagctagact 5880gccaacaagg tactcgggcc ctgttacaaa
ccctagggaa cctcgggtat cgggcctcgg 5940ccaagaaagc ccaaatttgc cagaaacagg
tcaagtatct ggggtatctt ctaaaagagg 6000gtcagagatg gctgactgag gccagaaaag
agactgtgat ggggcagcct actccgaaga 6060cccctcgaca actaagggag ttcctaggga
aggcaggctt ctgtcgcctc ttcatccctg 6120ggtttgcaga aatggcagcc cccctgtacc
ctctcaccaa accggggact ctgtttaatt 6180ggggcccaga ccaacaaaag gcctatcaag
aaatcaagca agctcttcta actgccccag 6240ccctggggtt gccagatttg actaagccct
ttgaactctt tgtcgacgag aagcagggct 6300acgccaaagg tgtcctaacg caaaaactgg
gaccttggcg tcggccggtg gcctacctgt 6360ccaaaaagct agacccagta gcagctgggt
ggcccccttg cctacggatg gtagcagcca 6420ttgccgtact gacaaaggat gcaggcaagc
taaccatggg acagccacta gtcattctgg 6480ccccccatgc agtagaggca ctagtcaaac
aaccccccga ccgctggctt tccaacgccc 6540ggatgactca ctatcaggcc ttgcttttgg
acacggaccg ggtccagttc ggaccggtgg 6600tagccctgaa cccggctacg ctgctcccac
tgcctgagga agggctgcaa cacaactgcc 6660ttgatatcct ggccgaagcc cacggaaccc
gacccgacct aacggaccag ccgctcccag 6720acgccgacca cacctggtac acggatggaa
gcagtctctt acaagaggga cagcgtaagg 6780cgggagctgc ggtgaccacc gagaccgagg
taatctgggc taaagccctg ccagccggga 6840catccgctca gcgggctgaa ctgatagcac
tcacccaggc cctaaagatg gcagaaggta 6900agaagctaaa tgtttatact gatagccgtt
atgcttttgc tactgcccat atccatggag 6960aaatatacag aaggcgtggg tggctcacat
cagaaggcaa agagatcaaa aataaagacg 7020agatcttggc cctactaaaa gccctctttc
tgcccaaaag acttagcata atccattgtc 7080caggacatca aaagggacac agcgccgagg
ctagaggcaa ccggatggct gaccaagcgg 7140cccgaaaggc agccatcaca gagactccag
acacctctac cctcctcata gaaaattcat 7200caccctctgg cggctcaaaa agaaccgccg
acggcagcga attcgagccc aagaagaaga 7260ggaaagtcgg aagcggagct actaacttca
gcctgctgaa gcaggctggc gacgtggagg 7320agaaccctgg acctccaaaa aagaaaagaa
aagtgtatcc ctatgatgtc cccgattatg 7380ccggttcaag agccctggtc gtgattagac
tgagccgagt gacagacgcc accacaagtc 7440ccgagagaca gctggaatca tgccagcagc
tctgtgctca gcggggttgg gatgtggtcg 7500gcgtggcaga ggatctggac gtgagcgggg
ccgtcgatcc attcgacaga aagaggaggc 7560ccaacctggc aagatggctc gctttcgagg
aacagccctt tgatgtgatc gtcgcctaca 7620gagtggaccg gctgacccgc tcaattcgac
atctccagca gctggtgcat tgggctgagg 7680accacaagaa actggtggtc agcgcaacag
aagcccactt cgatactacc acaccttttg 7740ccgctgtggt catcgcactg atgggcactg
tggcccagat ggagctcgaa gctatcaagg 7800agcgaaacag gagcgcagcc catttcaata
ttagggccgg taaatacaga ggctccctgc 7860ccccttgggg atatctccct accagggtgg
atggggagtg gagactggtg ccagaccccg 7920tccagagaga gcggattctg gaagtgtacc
acagagtggt cgataaccac gaaccactcc 7980atctggtggc acacgacctg aatagacgcg
gcgtgctctc tccaaaggat tattttgctc 8040agctgcaggg aagagagcca cagggaagag
aatggagtgc tactgcactg aagagatcta 8100tgatcagtga ggctatgctg ggttacgcaa
cactcaatgg caaaactgtc cgggacgatg 8160acggagcccc tctggtgagg gctgagccta
ttctcaccag agagcagctc gaagctctgc 8220gggcagaact ggtcaagact agtcgcgcca
aacctgccgt gagcacccca agcctgctcc 8280tgagggtgct gttctgcgcc gtctgtggag
agccagcata caagtttgcc ggcggagggc 8340gcaaacatcc ccgctatcga tgcaggagca
tggggttccc taagcactgt ggaaacggga 8400cagtggccat ggctgagtgg gacgcctttt
gcgaggaaca ggtgctggat ctcctgggtg 8460acgctgagcg gctggaaaaa gtgtgggtgg
caggatctga ctccgctgtg gagctggcag 8520aagtcaatgc cgagctcgtg gatctgactt
ccctcatcgg atctcctgca tatagagctg 8580ggtccccaca gagagaagct ctggacgcac
gaattgctgc actcgctgct agacaggagg 8640aactggaggg cctggaggcc aggccctctg
gatgggagtg gcgagaaacc ggacagaggt 8700ttggggattg gtggagggag caggacaccg
cagccaagaa cacatggctg agatccatga 8760atgtccggct cacattcgac gtgcgcggtg
gcctgactcg aaccatcgat tttggcgacc 8820tgcaggagta tgaacagcac ctgagactgg
ggtccgtggt cgaaagactg cacactggga 8880tgtcctaggt ttaaacccgc tgatcagcct
cgactgtgcc ttctagttgc cagccatctg 8940ttgtttgccc ctcccccgtg ccttccttga
ccctggaagg tgccactccc actgtccttt 9000cctaataaaa tgagaaaatt gcatcgcatt
gtctgagtag gtgtcattct attctggggg 9060gtggggtggg gcaggacagc aagggggagg
attgggaaga caatagcagg catgctgggg 9120atgcggtggg ctctatggct tctgaggcgg
aaagaaccag ctggggctcg ataccgtcga 9180cctctagcta gagcttggcg taatcatggt
catagctgtt tcctgtgtga aattgttatc 9240cgctcacaat tccacacaac atacgagccg
gaagcataaa gtgtaaagcc tagggtgcct 9300aatgagtgag ctaactcaca ttaattgcgt
tgcgctcact gcccgctttc cagtcgggaa 9360acctgtcgtg ccagctgcat taatgaatcg
gccaacgcgc ggggagaggc ggtttgcgta 9420ttgggcgctc ttccgcttcc tcgctcactg
actcgctgcg ctcggtcgtt cggctgcggc 9480gagcggtatc agctcactca aaggcggtaa
tacggttatc cacagaatca ggggataacg 9540caggaaagaa catgtgagca aaaggccagc
aaaaggccag gaaccgtaaa aaggccgcgt 9600tgctggcgtt tttccatagg ctccgccccc
ctgacgagca tcacaaaaat cgacgctcaa 9660gtcagaggtg gcgaaacccg acaggactat
aaagatacca ggcgtttccc cctggaagct 9720ccctcgtgcg ctctcctgtt ccgaccctgc
cgcttaccgg atacctgtcc gcctttctcc 9780cttcgggaag cgtggcgctt tctcatagct
cacgctgtag gtatctcagt tcggtgtagg 9840tcgttcgctc caagctgggc tgtgtgcacg
aaccccccgt tcagcccgac cgctgcgcct 9900tatccggtaa ctatcgtctt gagtccaacc
cggtaagaca cgacttatcg ccactggcag 9960cagccactgg taacaggatt agcagagcga
ggtatgtagg cggtgctaca gagttcttga 10020agtggtggcc taactacggc tacactagaa
gaacagtatt tggtatctgc gctctgctga 10080agccagttac cttcggaaaa agagttggta
gctcttgatc cggcaaacaa accaccgctg 10140gtagcggtgg tttttttgtt tgcaagcagc
agattacgcg cagaaaaaaa ggatctcaag 10200aagatccttt gatcttttct acggggtctg
acgctcagtg gaacgaaaac tcacgttaag 10260ggattttggt catgagatta tcaaaaagga
tcttcaccta gatcctttta aattaaaaat 10320gaagttttaa atcaatctaa agtatatatg
agtaaacttg gtctgacagt taccaatgct 10380taatcagtga ggcacctatc tcagcgatct
gtctatttcg ttcatccata gttgcctgac 10440tccccgtcgt gtagataact acgatacggg
agggcttacc atctggcccc agtgctgcaa 10500tgataccgcg agacccacgc tcaccggctc
cagatttatc agcaataaac cagccagccg 10560gaagggccga gcgcagaagt ggtcctgcaa
ctttatccgc ctccatccag tctattaatt 10620gttgccggga agctagagta agtagttcgc
cagttaatag tttgcgcaac gttgttgcca 10680ttgctacagg catcgtggtg tcacgctcgt
cgtttggtat ggcttcattc agctccggtt 10740cccaacgatc aaggcgagtt acatgatccc
ccatgttgtg caaaaaagcg gttagctcct 10800tcggtcctcc gatcgttgtc agaagtaagt
tggccgcagt gttatcactc atggttatgg 10860cagcactgca taattctctt actgtcatgc
catccgtaag atgcttttct gtgactggtg 10920agtactcaac caagtcattc tgagaatagt
gtatgcggcg accgagttgc tcttgcccgg 10980cgtcaatacg ggataatacc gcgccacata
gcagaacttt aaaagtgctc atcattggaa 11040aacgttcttc ggggcgaaaa ctctcaagga
tcttaccgct gttgagatcc agttcgatgt 11100aacccactcg tgcacccaac tgatcttcag
catcttttac tttcaccagc gtttctgggt 11160gagcaaaaac aggaaggcaa aatgccgcaa
aaaagggaat aagggcgaca cggaaatgtt 11220gaatactcat actcttcctt tttcaatatt
attgaagcat ttatcagggt tattgtctca 11280tgagcggata catatttgaa tgtatttaga
aaaataaaca aataggggtt ccgcgcacat 11340ttcc
11344
SEQ ID NO: 382; Length: 9753; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 382:
ccgaaaagtg ccacctgacg tcgacggatc gggagatcga tctcccgatc ccctagggtc
60gactctcagt acaatctgct ctgatgccgc atagttaagc cagtatctgc tccctgcttg
120tgtgttggag gtcgctgagt agtgcgcgag caaaatttaa gctacaacaa ggcaaggctt
180gaccgacaat tgcatgaaga atctgcttag ggttaggcgt tttgcgctgc ttcgcgatgt
240acgggccaga tatacgcgtt gacattgatt attgactagt tattaatagt aatcaattac
300ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta cggtaaatgg
360cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcc
420catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac
480tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta ttgacgtcaa
540tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg actttcctac
600ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta
660catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga
720cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaacaa
780ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag
840agctggttta gtgaaccgtc agatccgcta gagatccgcg gccgctaata cgactcacta
900tagggagagc cgccaccatg aaacggacag ccgacggaag cgagttcgag tcaccaaaga
960agaagcggaa agtcgacaag aagtacagca tcggcctgga catcggcacc aactctgtgg
1020gctgggccgt gatcaccgac gagtacaagg tgcccagcaa gaaattcaag gtgctgggca
1080acaccgaccg gcacagcatc aagaagaacc tgatcggagc cctgctgttc gacagcggcg
1140aaacagccga ggccacccgg ctgaagagaa ccgccagaag aagatacacc agacggaaga
1200accggatctg ctatctgcaa gagatcttca gcaacgagat ggccaaggtg gacgacagct
1260tcttccacag actggaagag tccttcctgg tggaagagga taagaagcac gagcggcacc
1320ccatcttcgg caacatcgtg gacgaggtgg cctaccacga gaagtacccc accatctacc
1380acctgagaaa gaaactggtg gacagcaccg acaaggccga cctgcggctg atctatctgg
1440ccctggccca catgatcaag ttccggggcc acttcctgat cgagggcgac ctgaaccccg
1500acaacagcga cgtggacaag ctgttcatcc agctggtgca gacctacaac cagctgttcg
1560aggaaaaccc catcaacgcc agcggcgtgg acgccaaggc catcctgtct gccagactga
1620gcaagagcag acggctggaa aatctgatcg cccagctgcc cggcgagaag aagaatggcc
1680tgttcggaaa cctgattgcc ctgagcctgg gcctgacccc caacttcaag agcaacttcg
1740acctggccga ggatgccaaa ctgcagctga gcaaggacac ctacgacgac gacctggaca
1800acctgctggc ccagatcggc gaccagtacg ccgacctgtt tctggccgcc aagaacctgt
1860ccgacgccat cctgctgagc gacatcctga gagtgaacac cgagatcacc aaggcccccc
1920tgagcgcctc tatgatcaag agatacgacg agcaccacca ggacctgacc ctgctgaaag
1980ctctcgtgcg gcagcagctg cctgagaagt acaaagagat tttcttcgac cagagcaaga
2040acggctacgc cggctacatt gacggcggag ccagccagga agagttctac aagttcatca
2100agcccatcct ggaaaagatg gacggcaccg aggaactgct cgtgaagctg aacagagagg
2160acctgctgcg gaagcagcgg accttcgaca acggcagcat cccccaccag atccacctgg
2220gagagctgca cgccattctg cggcggcagg aagattttta cccattcctg aaggacaacc
2280gggaaaagat cgagaagatc ctgaccttcc gcatccccta ctacgtgggc cctctggcca
2340ggggaaacag cagattcgcc tggatgacca gaaagagcga ggaaaccatc accccctgga
2400acttcgagga agtggtggac aagggcgctt ccgcccagag cttcatcgag cggatgacca
2460acttcgataa gaacctgccc aacgagaagg tgctgcccaa gcacagcctg ctgtacgagt
2520acttcaccgt gtataacgag ctgaccaaag tgaaatacgt gaccgaggga atgagaaagc
2580ccgccttcct gagcggcgag cagaaaaagg ccatcgtgga cctgctgttc aagaccaacc
2640ggaaagtgac cgtgaagcag ctgaaagagg actacttcaa gaaaatcgag tgcttcgact
2700ccgtggaaat ctccggcgtg gaagatcggt tcaacgcctc cctgggcaca taccacgatc
2760tgctgaaaat tatcaaggac aaggacttcc tggacaatga ggaaaacgag gacattctgg
2820aagatatcgt gctgaccctg acactgtttg aggacagaga gatgatcgag gaacggctga
2880aaacctatgc ccacctgttc gacgacaaag tgatgaagca gctgaagcgg cggagataca
2940ccggctgggg caggctgagc cggaagctga tcaacggcat ccgggacaag cagtccggca
3000agacaatcct ggatttcctg aagtccgacg gcttcgccaa cagaaacttc atgcagctga
3060tccacgacga cagcctgacc tttaaagagg acatccagaa agcccaggtg tccggccagg
3120gcgatagcct gcacgagcac attgccaatc tggccggcag ccccgccatt aagaagggca
3180tcctgcagac agtgaaggtg gtggacgagc tcgtgaaagt gatgggccgg cacaagcccg
3240agaacatcgt gatcgaaatg gccagagaga accagaccac ccagaaggga cagaagaaca
3300gccgcgagag aatgaagcgg atcgaagagg gcatcaaaga gctgggcagc cagatcctga
3360aagaacaccc cgtggaaaac acccagctgc agaacgagaa gctgtacctg tactacctgc
3420agaatgggcg ggatatgtac gtggaccagg aactggacat caaccggctg tccgactacg
3480atgtggacgc tatcgtgcct cagagctttc tgaaggacga ctccatcgac aacaaggtgc
3540tgaccagaag cgacaagaac cggggcaaga gcgacaacgt gccctccgaa gaggtcgtga
3600agaagatgaa gaactactgg cggcagctgc tgaacgccaa gctgattacc cagagaaagt
3660tcgacaatct gaccaaggcc gagagaggcg gcctgagcga actggataag gccggcttca
3720tcaagagaca gctggtggaa acccggcaga tcacaaagca cgtggcacag atcctggact
3780cccggatgaa cactaagtac gacgagaatg acaagctgat ccgggaagtg aaagtgatca
3840ccctgaagtc caagctggtg tccgatttcc ggaaggattt ccagttttac aaagtgcgcg
3900agatcaacaa ctaccaccac gcccacgacg cctacctgaa cgccgtcgtg ggaaccgccc
3960tgatcaaaaa gtaccctaag ctggaaagcg agttcgtgta cggcgactac aaggtgtacg
4020acgtgcggaa gatgatcgcc aagagcgagc aggaaatcgg caaggctacc gccaagtact
4080tcttctacag caacatcatg aactttttca agaccgagat taccctggcc aacggcgaga
4140tccggaagcg gcctctgatc gagacaaacg gcgaaaccgg ggagatcgtg tgggataagg
4200gccgggattt tgccaccgtg cggaaagtgc tgagcatgcc ccaagtgaat atcgtgaaaa
4260agaccgaggt gcagacaggc ggcttcagca aagagtctat cctgcccaag aggaacagcg
4320ataagctgat cgccagaaag aaggactggg accctaagaa gtacggcggc ttcgacagcc
4380ccaccgtggc ctattctgtg ctggtggtgg ccaaagtgga aaagggcaag tccaagaaac
4440tgaagagtgt gaaagagctg ctggggatca ccatcatgga aagaagcagc ttcgagaaga
4500atcccatcga ctttctggaa gccaagggct acaaagaagt gaaaaaggac ctgatcatca
4560agctgcctaa gtactccctg ttcgagctgg aaaacggccg gaagagaatg ctggcctctg
4620ccggcgaact gcagaaggga aacgaactgg ccctgccctc caaatatgtg aacttcctgt
4680acctggccag ccactatgag aagctgaagg gctcccccga ggataatgag cagaaacagc
4740tgtttgtgga acagcacaag cactacctgg acgagatcat cgagcagatc agcgagttct
4800ccaagagagt gatcctggcc gacgctaatc tggacaaagt gctgtccgcc tacaacaagc
4860accgggataa gcccatcaga gagcaggccg agaatatcat ccacctgttt accctgacca
4920atctgggagc ccctgccgcc ttcaagtact ttgacaccac catcgaccgg aagaggtaca
4980ccagcaccaa agaggtgctg gacgccaccc tgatccacca gagcatcacc ggcctgtacg
5040agacacggat cgacctgtct cagctgggag gtgactctgg aggatctagc ggaggatcct
5100ctggcagcga gacaccagga acaagcgagt cagcaacacc agagagcagt ggcggcagca
5160gcggcggcag cagcacccta aatatagaag atgagtatcg gctacatgag acctcaaaag
5220agccagatgt ttctctaggg tccacatggc tgtctgattt tcctcaggcc tgggcggaaa
5280ccgggggcat gggactggca gttcgccaag ctcctctgat catacctctg aaagcaacct
5340ctacccccgt gtccataaaa caatacccca tgtcacaaga agccagactg gggatcaagc
5400cccacataca gagactgttg gaccagggaa tactggtacc ctgccagtcc ccctggaaca
5460cgcccctgct acccgttaag aaaccaggga ctaatgatta taggcctgtc caggatctga
5520gagaagtcaa caagcgggtg gaagacatcc accccaccgt gcccaaccct tacaacctct
5580tgagcgggct cccaccgtcc caccagtggt acactgtgct tgatttaaag gatgcctttt
5640tctgcctgag actccacccc accagtcagc ctctcttcgc ctttgagtgg agagatccag
5700agatgggaat ctcaggacaa ttgacctgga ccagactccc acagggtttc aaaaacagtc
5760ccaccctgtt taatgaggca ctgcacagag acctagcaga cttccggatc cagcacccag
5820acttgatcct gctacagtac gtggatgact tactgctggc cgccacttct gagctagact
5880gccaacaagg tactcgggcc ctgttacaaa ccctagggaa cctcgggtat cgggcctcgg
5940ccaagaaagc ccaaatttgc cagaaacagg tcaagtatct ggggtatctt ctaaaagagg
6000gtcagagatg gctgactgag gccagaaaag agactgtgat ggggcagcct actccgaaga
6060cccctcgaca actaagggag ttcctaggga aggcaggctt ctgtcgcctc ttcatccctg
6120ggtttgcaga aatggcagcc cccctgtacc ctctcaccaa accggggact ctgtttaatt
6180ggggcccaga ccaacaaaag gcctatcaag aaatcaagca agctcttcta actgccccag
6240ccctggggtt gccagatttg actaagccct ttgaactctt tgtcgacgag aagcagggct
6300acgccaaagg tgtcctaacg caaaaactgg gaccttggcg tcggccggtg gcctacctgt
6360ccaaaaagct agacccagta gcagctgggt ggcccccttg cctacggatg gtagcagcca
6420ttgccgtact gacaaaggat gcaggcaagc taaccatggg acagccacta gtcattctgg
6480ccccccatgc agtagaggca ctagtcaaac aaccccccga ccgctggctt tccaacgccc
6540ggatgactca ctatcaggcc ttgcttttgg acacggaccg ggtccagttc ggaccggtgg
6600tagccctgaa cccggctacg ctgctcccac tgcctgagga agggctgcaa cacaactgcc
6660ttgatatcct ggccgaagcc cacggaaccc gacccgacct aacggaccag ccgctcccag
6720acgccgacca cacctggtac acggatggaa gcagtctctt acaagaggga cagcgtaagg
6780cgggagctgc ggtgaccacc gagaccgagg taatctgggc taaagccctg ccagccggga
6840catccgctca gcgggctgaa ctgatagcac tcacccaggc cctaaagatg gcagaaggta
6900agaagctaaa tgtttatact gatagccgtt atgcttttgc tactgcccat atccatggag
6960aaatatacag aaggcgtggg tggctcacat cagaaggcaa agagatcaaa aataaagacg
7020agatcttggc cctactaaaa gccctctttc tgcccaaaag acttagcata atccattgtc
7080caggacatca aaagggacac agcgccgagg ctagaggcaa ccggatggct gaccaagcgg
7140cccgaaaggc agccatcaca gagactccag acacctctac cctcctcata gaaaattcat
7200caccctctgg cggctcaaaa agaaccgccg acggcagcga attcgagccc aagaagaaga
7260ggaaagtcta accggtcatc atcaccatca ccattgagtt taaacccgct gatcagcctc
7320gactgtgcct tctagttgcc agccatctgt tgtttgcccc tcccccgtgc cttccttgac
7380cctggaaggt gccactccca ctgtcctttc ctaataaaat gagaaaattg catcgcattg
7440tctgagtagg tgtcattcta ttctgggggg tggggtgggg caggacagca agggggagga
7500ttgggaagac aatagcaggc atgctgggga tgcggtgggc tctatggctt ctgaggcgga
7560aagaaccagc tggggctcga taccgtcgac ctctagctag agcttggcgt aatcatggtc
7620atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg
7680aagcataaag tgtaaagcct agggtgccta atgagtgagc taactcacat taattgcgtt
7740gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg
7800ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga
7860ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat
7920acggttatcc acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca
7980aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc
8040tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata
8100aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc
8160gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc
8220acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga
8280accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc
8340ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag
8400gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag
8460aacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag
8520ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca
8580gattacgcgc agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga
8640cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgagattat caaaaaggat
8700cttcacctag atccttttaa attaaaaatg aagttttaaa tcaatctaaa gtatatatga
8760gtaaacttgg tctgacagtt accaatgctt aatcagtgag gcacctatct cagcgatctg
8820tctatttcgt tcatccatag ttgcctgact ccccgtcgtg tagataacta cgatacggga
8880gggcttacca tctggcccca gtgctgcaat gataccgcga gacccacgct caccggctcc
8940agatttatca gcaataaacc agccagccgg aagggccgag cgcagaagtg gtcctgcaac
9000tttatccgcc tccatccagt ctattaattg ttgccgggaa gctagagtaa gtagttcgcc
9060agttaatagt ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc
9120gtttggtatg gcttcattca gctccggttc ccaacgatca aggcgagtta catgatcccc
9180catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca gaagtaagtt
9240ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta ctgtcatgcc
9300atccgtaaga tgcttttctg tgactggtga gtactcaacc aagtcattct gagaatagtg
9360tatgcggcga ccgagttgct cttgcccggc gtcaatacgg gataataccg cgccacatag
9420cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac tctcaaggat
9480cttaccgctg ttgagatcca gttcgatgta acccactcgt gcacccaact gatcttcagc
9540atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa
9600aaagggaata agggcgacac ggaaatgttg aatactcata ctcttccttt ttcaatatta
9660ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa
9720aaataaacaa ataggggttc cgcgcacatt tcc
9753
383 11433 DNA Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
383 ccgaaaagtg ccacctgacg tcgacggatc
gggagatcga tctcccgatc ccctagggtc 60gactctcagt acaatctgct ctgatgccgc
atagttaagc cagtatctgc tccctgcttg 120tgtgttggag gtcgctgagt agtgcgcgag
caaaatttaa gctacaacaa ggcaaggctt 180gaccgacaat tgcatgaaga atctgcttag
ggttaggcgt tttgcgctgc ttcgcgatgt 240acgggccaga tatacgcgtt gacattgatt
attgactagt tattaatagt aatcaattac 300ggggtcatta gttcatagcc catatatgga
gttccgcgtt acataactta cggtaaatgg 360cccgcctggc tgaccgccca acgacccccg
cccattgacg tcaataatga cgtatgttcc 420catagtaacg ccaataggga ctttccattg
acgtcaatgg gtggagtatt tacggtaaac 480tgcccacttg gcagtacatc aagtgtatca
tatgccaagt acgcccccta ttgacgtcaa 540tgacggtaaa tggcccgcct ggcattatgc
ccagtacatg accttatggg actttcctac 600ttggcagtac atctacgtat tagtcatcgc
tattaccatg gtgatgcggt tttggcagta 660catcaatggg cgtggatagc ggtttgactc
acggggattt ccaagtctcc accccattga 720cgtcaatggg agtttgtttt ggcaccaaaa
tcaacgggac tttccaaaat gtcgtaacaa 780ctccgcccca ttgacgcaaa tgggcggtag
gcgtgtacgg tgggaggtct atataagcag 840agctggttta gtgaaccgtc agatccgcta
gagatccgcg gccgctaata cgactcacta 900tagggagagc cgccaccatg cccgcggcta
agagggtgaa gcttgacggt ggaaaacgga 960cagccgacgg aagcgagttc gagtcaccaa
agaagaagcg gaaagtcgac aagaagtaca 1020gcatcggcct ggacatcggc accaactctg
tgggctgggc cgtgatcacc gacgagtaca 1080aggtgcccag caagaaattc aaggtgctgg
gcaacaccga ccggcacagc atcaagaaga 1140acctgatcgg agccctgctg ttcgacagcg
gcgaaacagc cgaggccacc cggctgaaga 1200gaaccgccag aagaagatac accagacgga
agaaccggat ctgctatctg caagagatct 1260tcagcaacga gatggccaag gtggacgaca
gcttcttcca cagactggaa gagtccttcc 1320tggtggaaga ggataagaag cacgagcggc
accccatctt cggcaacatc gtggacgagg 1380tggcctacca cgagaagtac cccaccatct
accacctgag aaagaaactg gtggacagca 1440ccgacaaggc cgacctgcgg ctgatctatc
tggccctggc ccacatgatc aagttccggg 1500gccacttcct gatcgagggc gacctgaacc
ccgacaacag cgacgtggac aagctgttca 1560tccagctggt gcagacctac aaccagctgt
tcgaggaaaa ccccatcaac gccagcggcg 1620tggacgccaa ggccatcctg tctgccagac
tgagcaagag cagacggctg gaaaatctga 1680tcgcccagct gcccggcgag aagaagaatg
gcctgttcgg aaacctgatt gccctgagcc 1740tgggcctgac ccccaacttc aagagcaact
tcgacctggc cgaggatgcc aaactgcagc 1800tgagcaagga cacctacgac gacgacctgg
acaacctgct ggcccagatc ggcgaccagt 1860acgccgacct gtttctggcc gccaagaacc
tgtccgacgc catcctgctg agcgacatcc 1920tgagagtgaa caccgagatc accaaggccc
ccctgagcgc ctctatgatc aagagatacg 1980acgagcacca ccaggacctg accctgctga
aagctctcgt gcggcagcag ctgcctgaga 2040agtacaaaga gattttcttc gaccagagca
agaacggcta cgccggctac attgacggcg 2100gagccagcca ggaagagttc tacaagttca
tcaagcccat cctggaaaag atggacggca 2160ccgaggaact gctcgtgaag ctgaacagag
aggacctgct gcggaagcag cggaccttcg 2220acaacggcag catcccccac cagatccacc
tgggagagct gcacgccatt ctgcggcggc 2280aggaagattt ttacccattc ctgaaggaca
accgggaaaa gatcgagaag atcctgacct 2340tccgcatccc ctactacgtg ggccctctgg
ccaggggaaa cagcagattc gcctggatga 2400ccagaaagag cgaggaaacc atcaccccct
ggaacttcga ggaagtggtg gacaagggcg 2460cttccgccca gagcttcatc gagcggatga
ccaacttcga taagaacctg cccaacgaga 2520aggtgctgcc caagcacagc ctgctgtacg
agtacttcac cgtgtataac gagctgacca 2580aagtgaaata cgtgaccgag ggaatgagaa
agcccgcctt cctgagcggc gagcagaaaa 2640aggccatcgt ggacctgctg ttcaagacca
accggaaagt gaccgtgaag cagctgaaag 2700aggactactt caagaaaatc gagtgcttcg
actccgtgga aatctccggc gtggaagatc 2760ggttcaacgc ctccctgggc acataccacg
atctgctgaa aattatcaag gacaaggact 2820tcctggacaa tgaggaaaac gaggacattc
tggaagatat cgtgctgacc ctgacactgt 2880ttgaggacag agagatgatc gaggaacggc
tgaaaaccta tgcccacctg ttcgacgaca 2940aagtgatgaa gcagctgaag cggcggagat
acaccggctg gggcaggctg agccggaagc 3000tgatcaacgg catccgggac aagcagtccg
gcaagacaat cctggatttc ctgaagtccg 3060acggcttcgc caacagaaac ttcatgcagc
tgatccacga cgacagcctg acctttaaag 3120aggacatcca gaaagcccag gtgtccggcc
agggcgatag cctgcacgag cacattgcca 3180atctggccgg cagccccgcc attaagaagg
gcatcctgca gacagtgaag gtggtggacg 3240agctcgtgaa agtgatgggc cggcacaagc
ccgagaacat cgtgatcgaa atggccagag 3300agaaccagac cacccagaag ggacagaaga
acagccgcga gagaatgaag cggatcgaag 3360agggcatcaa agagctgggc agccagatcc
tgaaagaaca ccccgtggaa aacacccagc 3420tgcagaacga gaagctgtac ctgtactacc
tgcagaatgg gcgggatatg tacgtggacc 3480aggaactgga catcaaccgg ctgtccgact
acgatgtgga cgctatcgtg cctcagagct 3540ttctgaagga cgactccatc gacaacaagg
tgctgaccag aagcgacaag aaccggggca 3600agagcgacaa cgtgccctcc gaagaggtcg
tgaagaagat gaagaactac tggcggcagc 3660tgctgaacgc caagctgatt acccagagaa
agttcgacaa tctgaccaag gccgagagag 3720gcggcctgag cgaactggat aaggccggct
tcatcaagag acagctggtg gaaacccggc 3780agatcacaaa gcacgtggca cagatcctgg
actcccggat gaacactaag tacgacgaga 3840atgacaagct gatccgggaa gtgaaagtga
tcaccctgaa gtccaagctg gtgtccgatt 3900tccggaagga tttccagttt tacaaagtgc
gcgagatcaa caactaccac cacgcccacg 3960acgcctacct gaacgccgtc gtgggaaccg
ccctgatcaa aaagtaccct aagctggaaa 4020gcgagttcgt gtacggcgac tacaaggtgt
acgacgtgcg gaagatgatc gccaagagcg 4080agcaggaaat cggcaaggct accgccaagt
acttcttcta cagcaacatc atgaactttt 4140tcaagaccga gattaccctg gccaacggcg
agatccggaa gcggcctctg atcgagacaa 4200acggcgaaac cggggagatc gtgtgggata
agggccggga ttttgccacc gtgcggaaag 4260tgctgagcat gccccaagtg aatatcgtga
aaaagaccga ggtgcagaca ggcggcttca 4320gcaaagagtc tatcctgccc aagaggaaca
gcgataagct gatcgccaga aagaaggact 4380gggaccctaa gaagtacggc ggcttcgaca
gccccaccgt ggcctattct gtgctggtgg 4440tggccaaagt ggaaaagggc aagtccaaga
aactgaagag tgtgaaagag ctgctgggga 4500tcaccatcat ggaaagaagc agcttcgaga
agaatcccat cgactttctg gaagccaagg 4560gctacaaaga agtgaaaaag gacctgatca
tcaagctgcc taagtactcc ctgttcgagc 4620tggaaaacgg ccggaagaga atgctggcct
ctgccggcga actgcagaag ggaaacgaac 4680tggccctgcc ctccaaatat gtgaacttcc
tgtacctggc cagccactat gagaagctga 4740agggctcccc cgaggataat gagcagaaac
agctgtttgt ggaacagcac aagcactacc 4800tggacgagat catcgagcag atcagcgagt
tctccaagag agtgatcctg gccgacgcta 4860atctggacaa agtgctgtcc gcctacaaca
agcaccggga taagcccatc agagagcagg 4920ccgagaatat catccacctg tttaccctga
ccaatctggg agcccctgcc gccttcaagt 4980actttgacac caccatcgac cggaagaggt
acaccagcac caaagaggtg ctggacgcca 5040ccctgatcca ccagagcatc accggcctgt
acgagacacg gatcgacctg tctcagctgg 5100gaggtgactc tggaggatct agcggaggat
cctctggcag cgagacacca ggaacaagcg 5160agtcagcaac accagagagc agtggcggca
gcagcggcgg cagcagcacc ctaaatatag 5220aagatgagta tcggctacat gagacctcaa
aagagccaga tgtttctcta gggtccacat 5280ggctgtctga ttttcctcag gcctgggcgg
aaaccggggg catgggactg gcagttcgcc 5340aagctcctct gatcatacct ctgaaagcaa
cctctacccc cgtgtccata aaacaatacc 5400ccatgtcaca agaagccaga ctggggatca
agccccacat acagagactg ttggaccagg 5460gaatatggta ccctgccagt ccccctggaa
cacgcccctg ctacccgtta agaaaccagg 5520gactaatgat tataggcctg tccaggatct
gagagaagtc aacaagcggg tggaagacat 5580ccaccccacc gtgcccaacc cttacaacct
cttgagcggg ctcccaccgt cccaccagtg 5640gtacactgtg cttgatttaa aggatgcctt
tttctgcctg agactccacc ccaccagtca 5700gcctctcttc gcctttgagt ggagagatcc
agagatggga atctcaggac aattgacctg 5760gaccagactc ccacagggtt tcaaaaacag
tcccaccctg tttaatgagg cactgcacag 5820agacctagca gacttccgga tccagcaccc
agacttgatc ctgctacagt acgtggatga 5880cttactgctg gccgccactt ctgagctaga
ctgccaacaa ggtactcggg ccctgttaca 5940aaccctaggg aacctcgggt atcgggcctc
ggccaagaaa gcccaaattt gccagaaaca 6000ggtcaagtat ctggggtatc ttctaaaaga
gggtcagaga tggctgactg aggccagaaa 6060agagactgtg atggggcagc ctactccgaa
gacccctcga caactaaggg agttcctagg 6120gaaggcaggc ttctgtcgcc tcttcatccc
tgggtttgca gaaatggcag cccccctgta 6180ccctctcacc aaaccgggga ctctgtttaa
ttggggccca gaccaacaaa aggcctatca 6240agaaatcaag caagctcttc taactgcccc
agccctgggg ttgccagatt tgactaagcc 6300ctttgaactc tttgtcgacg agaagcaggg
ctacgccaaa ggtgtcctaa cgcaaaaact 6360gggaccttgg cgtcggccgg tggcctacct
gtccaaaaag ctagacccag tagcagctgg 6420gtggccccct tgcctacgga tggtagcagc
cattgccgta ctgacaaagg atgcaggcaa 6480gctaaccatg ggacagccac tagtcattct
ggccccccat gcagtagagg cactagtcaa 6540acaacccccc gaccgctggc tttccaacgc
ccggatgact cactatcagg ccttgctttt 6600ggacacggac cgggtccagt tcggaccggt
ggtagccctg aacccggcta cgctgctccc 6660actgcctgag gaagggctgc aacacaactg
ccttgatatc ctggccgaag cccacggaac 6720ccgacccgac ctaacggacc agccgctccc
agacgccgac cacacctggt acacggatgg 6780aagcagtctc ttacaagagg gacagcgtaa
ggcgggagct gcggtgacca ccgagaccga 6840ggtaatctgg gctaaagccc tgccagccgg
gacatccgct cagcgggctg aactgatagc 6900actcacccag gccctaaaga tggcagaagg
taagaagcta aatgtttata ctgatagccg 6960ttatgctttt gctactgccc atatccatgg
agaaatatac agaaggcgtg ggtggctcac 7020atcagaaggc aaagagatca aaaataaaga
cgagatcttg gccctactaa aagccctctt 7080tctgcccaaa agacttagca taatccattg
tccaggacat caaaagggac acagcgccga 7140ggctagaggc aaccggatgg ctgaccaagc
ggcccgaaag gcagccatca cagagactcc 7200agacacctct accctcctca tagaaaattc
atcaccctct ggcggctcaa aaagaaccgc 7260cgacggcagc gaaaaaagaa ccgctgactc
tcaacattcc acacctccaa aaaccaagcg 7320aaaagtggaa ttcgagccca agaagaagag
gaaagtcgga agcggagcta ctaacttcag 7380cctgctgaag caggctggcg acgtggagga
gaaccctgga cctccaaaaa agaaaagaaa 7440agtgtatccc tatgatgtcc ccgattatgc
cggttcaaga gccctggtcg tgattagact 7500gagccgagtg acagacgcca ccacaagtcc
cgagagacag ctggaatcat gccagcagct 7560ctgtgctcag cggggttggg atgtggtcgg
cgtggcagag gatctggacg tgagcggggc 7620cgtcgatcca ttcgacagaa agaggaggcc
caacctggca agatggctcg ctttcgagga 7680acagcccttt gatgtgatcg tcgcctacag
agtggaccgg ctgacccgct caattcgaca 7740tctccagcag ctggtgcatt gggctgagga
ccacaagaaa ctggtggtca gcgcaacaga 7800agcccacttc gatactacca caccttttgc
cgctgtggtc atcgcactga tgggcactgt 7860ggcccagatg gagctcgaag ctatcaagga
gcgaaacagg agcgcagccc atttcaatat 7920tagggccggt aaatacagag gctccctgcc
cccttgggga tatctcccta ccagggtgga 7980tggggagtgg agactggtgc cagaccccgt
ccagagagag cggattctgg aagtgtacca 8040cagagtggtc gataaccacg aaccactcca
tctggtggca cacgacctga atagacgcgg 8100cgtgctctct ccaaaggatt attttgctca
gctgcaggga agagagccac agggaagaga 8160atggagtgct actgcactga agagatctat
gatcagtgag gctatgctgg gttacgcaac 8220actcaatggc aaaactgtcc gggacgatga
cggagcccct ctggtgaggg ctgagcctat 8280tctcaccaga gagcagctcg aagctctgcg
ggcagaactg gtcaagacta gtcgcgccaa 8340acctgccgtg agcaccccaa gcctgctcct
gagggtgctg ttctgcgccg tctgtggaga 8400gccagcatac aagtttgccg gcggagggcg
caaacatccc cgctatcgat gcaggagcat 8460ggggttccct aagcactgtg gaaacgggac
agtggccatg gctgagtggg acgccttttg 8520cgaggaacag gtgctggatc tcctgggtga
cgctgagcgg ctggaaaaag tgtgggtggc 8580aggatctgac tccgctgtgg agctggcaga
agtcaatgcc gagctcgtgg atctgacttc 8640cctcatcgga tctcctgcat atagagctgg
gtccccacag agagaagctc tggacgcacg 8700aattgctgca ctcgctgcta gacaggagga
actggagggc ctggaggcca ggccctctgg 8760atgggagtgg cgagaaaccg gacagaggtt
tggggattgg tggagggagc aggacaccgc 8820agccaagaac acatggctga gatccatgaa
tgtccggctc acattcgacg tgcgcggtgg 8880cctgactcga accatcgatt ttggcgacct
gcaggagtat gaacagcacc tgagactggg 8940gtccgtggtc gaaagactgc acactgggat
gtcctaggtt taaacccgct gatcagcctc 9000gactgtgcct tctagttgcc agccatctgt
tgtttgcccc tcccccgtgc cttccttgac 9060cctggaaggt gccactccca ctgtcctttc
ctaataaaat gagaaaattg catcgcattg 9120tctgagtagg tgtcattcta ttctgggggg
tggggtgggg caggacagca agggggagga 9180ttgggaagac aatagcaggc atgctgggga
tgcggtgggc tctatggctt ctgaggcgga 9240aagaaccagc tggggctcga taccgtcgac
ctctagctag agcttggcgt aatcatggtc 9300atagctgttt cctgtgtgaa attgttatcc
gctcacaatt ccacacaaca tacgagccgg 9360aagcataaag tgtaaagcct agggtgccta
atgagtgagc taactcacat taattgcgtt 9420gcgctcactg cccgctttcc agtcgggaaa
cctgtcgtgc cagctgcatt aatgaatcgg 9480ccaacgcgcg gggagaggcg gtttgcgtat
tgggcgctct tccgcttcct cgctcactga 9540ctcgctgcgc tcggtcgttc ggctgcggcg
agcggtatca gctcactcaa aggcggtaat 9600acggttatcc acagaatcag gggataacgc
aggaaagaac atgtgagcaa aaggccagca 9660aaaggccagg aaccgtaaaa aggccgcgtt
gctggcgttt ttccataggc tccgcccccc 9720tgacgagcat cacaaaaatc gacgctcaag
tcagaggtgg cgaaacccga caggactata 9780aagataccag gcgtttcccc ctggaagctc
cctcgtgcgc tctcctgttc cgaccctgcc 9840gcttaccgga tacctgtccg cctttctccc
ttcgggaagc gtggcgcttt ctcatagctc 9900acgctgtagg tatctcagtt cggtgtaggt
cgttcgctcc aagctgggct gtgtgcacga 9960accccccgtt cagcccgacc gctgcgcctt
atccggtaac tatcgtcttg agtccaaccc 10020ggtaagacac gacttatcgc cactggcagc
agccactggt aacaggatta gcagagcgag 10080gtatgtaggc ggtgctacag agttcttgaa
gtggtggcct aactacggct acactagaag 10140aacagtattt ggtatctgcg ctctgctgaa
gccagttacc ttcggaaaaa gagttggtag 10200ctcttgatcc ggcaaacaaa ccaccgctgg
tagcggtggt ttttttgttt gcaagcagca 10260gattacgcgc agaaaaaaag gatctcaaga
agatcctttg atcttttcta cggggtctga 10320cgctcagtgg aacgaaaact cacgttaagg
gattttggtc atgagattat caaaaaggat 10380cttcacctag atccttttaa attaaaaatg
aagttttaaa tcaatctaaa gtatatatga 10440gtaaacttgg tctgacagtt accaatgctt
aatcagtgag gcacctatct cagcgatctg 10500tctatttcgt tcatccatag ttgcctgact
ccccgtcgtg tagataacta cgatacggga 10560gggcttacca tctggcccca gtgctgcaat
gataccgcga gacccacgct caccggctcc 10620agatttatca gcaataaacc agccagccgg
aagggccgag cgcagaagtg gtcctgcaac 10680tttatccgcc tccatccagt ctattaattg
ttgccgggaa gctagagtaa gtagttcgcc 10740agttaatagt ttgcgcaacg ttgttgccat
tgctacaggc atcgtggtgt cacgctcgtc 10800gtttggtatg gcttcattca gctccggttc
ccaacgatca aggcgagtta catgatcccc 10860catgttgtgc aaaaaagcgg ttagctcctt
cggtcctccg atcgttgtca gaagtaagtt 10920ggccgcagtg ttatcactca tggttatggc
agcactgcat aattctctta ctgtcatgcc 10980atccgtaaga tgcttttctg tgactggtga
gtactcaacc aagtcattct gagaatagtg 11040tatgcggcga ccgagttgct cttgcccggc
gtcaatacgg gataataccg cgccacatag 11100cagaacttta aaagtgctca tcattggaaa
acgttcttcg gggcgaaaac tctcaaggat 11160cttaccgctg ttgagatcca gttcgatgta
acccactcgt gcacccaact gatcttcagc 11220atcttttact ttcaccagcg tttctgggtg
agcaaaaaca ggaaggcaaa atgccgcaaa 11280aaagggaata agggcgacac ggaaatgttg
aatactcata ctcttccttt ttcaatatta 11340ttgaagcatt tatcagggtt attgtctcat
gagcggatac atatttgaat gtatttagaa 11400aaataaacaa ataggggttc cgcgcacatt
tcc 11433
384 11056 DNA Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
384 ccgaaaagtg ccacctgacg tcgacggatc gggagatcga tctcccgatc ccctagggtc
60gactctcagt acaatctgct ctgatgccgc atagttaagc cagtatctgc tccctgcttg
120tgtgttggag gtcgctgagt agtgcgcgag caaaatttaa gctacaacaa ggcaaggctt
180gaccgacaat tgcatgaaga atctgcttag ggttaggcgt tttgcgctgc ttcgcgatgt
240acgggccaga tatacgcgtt gacattgatt attgactagt tattaatagt aatcaattac
300ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta cggtaaatgg
360cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcc
420catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac
480tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta ttgacgtcaa
540tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg actttcctac
600ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta
660catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga
720cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaacaa
780ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag
840agctggttta gtgaaccgtc agatccgcta gagatccgcg gccgctaata cgactcacta
900tagggagagc cgccaccatg aaacggacag ccgacggaag cgagttcgag tcaccaaaga
960agaagcggaa agtcgacaag aagtacagca tcggcctgga catcggcacc aactctgtgg
1020gctgggccgt gatcaccgac gagtacaagg tgcccagcaa gaaattcaag gtgctgggca
1080acaccgaccg gcacagcatc aagaagaacc tgatcggagc cctgctgttc gacagcggcg
1140aaacagccga ggccacccgg ctgaagagaa ccgccagaag aagatacacc agacggaaga
1200accggatctg ctatctgcaa gagatcttca gcaacgagat ggccaaggtg gacgacagct
1260tcttccacag actggaagag tccttcctgg tggaagagga taagaagcac gagcggcacc
1320ccatcttcgg caacatcgtg gacgaggtgg cctaccacga gaagtacccc accatctacc
1380acctgagaaa gaaactggtg gacagcaccg acaaggccga cctgcggctg atctatctgg
1440ccctggccca catgatcaag ttccggggcc acttcctgat cgagggcgac ctgaaccccg
1500acaacagcga cgtggacaag ctgttcatcc agctggtgca gacctacaac cagctgttcg
1560aggaaaaccc catcaacgcc agcggcgtgg acgccaaggc catcctgtct gccagactga
1620gcaagagcag acggctggaa aatctgatcg cccagctgcc cggcgagaag aagaatggcc
1680tgttcggaaa cctgattgcc ctgagcctgg gcctgacccc caacttcaag agcaacttcg
1740acctggccga ggatgccaaa ctgcagctga gcaaggacac ctacgacgac gacctggaca
1800acctgctggc ccagatcggc gaccagtacg ccgacctgtt tctggccgcc aagaacctgt
1860ccgacgccat cctgctgagc gacatcctga gagtgaacac cgagatcacc aaggcccccc
1920tgagcgcctc tatgatcaag agatacgacg agcaccacca ggacctgacc ctgctgaaag
1980ctctcgtgcg gcagcagctg cctgagaagt acaaagagat tttcttcgac cagagcaaga
2040acggctacgc cggctacatt gacggcggag ccagccagga agagttctac aagttcatca
2100agcccatcct ggaaaagatg gacggcaccg aggaactgct cgtgaagctg aacagagagg
2160acctgctgcg gaagcagcgg accttcgaca acggcagcat cccccaccag atccacctgg
2220gagagctgca cgccattctg cggcggcagg aagattttta cccattcctg aaggacaacc
2280gggaaaagat cgagaagatc ctgaccttcc gcatccccta ctacgtgggc cctctggcca
2340ggggaaacag cagattcgcc tggatgacca gaaagagcga ggaaaccatc accccctgga
2400acttcgagga agtggtggac aagggcgctt ccgcccagag cttcatcgag cggatgacca
2460acttcgataa gaacctgccc aacgagaagg tgctgcccaa gcacagcctg ctgtacgagt
2520acttcaccgt gtataacgag ctgaccaaag tgaaatacgt gaccgaggga atgagaaagc
2580ccgccttcct gagcggcgag cagaaaaagg ccatcgtgga cctgctgttc aagaccaacc
2640ggaaagtgac cgtgaagcag ctgaaagagg actacttcaa gaaaatcgag tgcttcgact
2700ccgtggaaat ctccggcgtg gaagatcggt tcaacgcctc cctgggcaca taccacgatc
2760tgctgaaaat tatcaaggac aaggacttcc tggacaatga ggaaaacgag gacattctgg
2820aagatatcgt gctgaccctg acactgtttg aggacagaga gatgatcgag gaacggctga
2880aaacctatgc ccacctgttc gacgacaaag tgatgaagca gctgaagcgg cggagataca
2940ccggctgggg caggctgagc cggaagctga tcaacggcat ccgggacaag cagtccggca
3000agacaatcct ggatttcctg aagtccgacg gcttcgccaa cagaaacttc atgcagctga
3060tccacgacga cagcctgacc tttaaagagg acatccagaa agcccaggtg tccggccagg
3120gcgatagcct gcacgagcac attgccaatc tggccggcag ccccgccatt aagaagggca
3180tcctgcagac agtgaaggtg gtggacgagc tcgtgaaagt gatgggccgg cacaagcccg
3240agaacatcgt gatcgaaatg gccagagaga accagaccac ccagaaggga cagaagaaca
3300gccgcgagag aatgaagcgg atcgaagagg gcatcaaaga gctgggcagc cagatcctga
3360aagaacaccc cgtggaaaac acccagctgc agaacgagaa gctgtacctg tactacctgc
3420agaatgggcg ggatatgtac gtggaccagg aactggacat caaccggctg tccgactacg
3480atgtggacgc tatcgtgcct cagagctttc tgaaggacga ctccatcgac aacaaggtgc
3540tgaccagaag cgacaagaac cggggcaaga gcgacaacgt gccctccgaa gaggtcgtga
3600agaagatgaa gaactactgg cggcagctgc tgaacgccaa gctgattacc cagagaaagt
3660tcgacaatct gaccaaggcc gagagaggcg gcctgagcga actggataag gccggcttca
3720tcaagagaca gctggtggaa acccggcaga tcacaaagca cgtggcacag atcctggact
3780cccggatgaa cactaagtac gacgagaatg acaagctgat ccgggaagtg aaagtgatca
3840ccctgaagtc caagctggtg tccgatttcc ggaaggattt ccagttttac aaagtgcgcg
3900agatcaacaa ctaccaccac gcccacgacg cctacctgaa cgccgtcgtg ggaaccgccc
3960tgatcaaaaa gtaccctaag ctggaaagcg agttcgtgta cggcgactac aaggtgtacg
4020acgtgcggaa gatgatcgcc aagagcgagc aggaaatcgg caaggctacc gccaagtact
4080tcttctacag caacatcatg aactttttca agaccgagat taccctggcc aacggcgaga
4140tccggaagcg gcctctgatc gagacaaacg gcgaaaccgg ggagatcgtg tgggataagg
4200gccgggattt tgccaccgtg cggaaagtgc tgagcatgcc ccaagtgaat atcgtgaaaa
4260agaccgaggt gcagacaggc ggcttcagca aagagtctat cctgcccaag aggaacagcg
4320ataagctgat cgccagaaag aaggactggg accctaagaa gtacggcggc ttcgacagcc
4380ccaccgtggc ctattctgtg ctggtggtgg ccaaagtgga aaagggcaag tccaagaaac
4440tgaagagtgt gaaagagctg ctggggatca ccatcatgga aagaagcagc ttcgagaaga
4500atcccatcga ctttctggaa gccaagggct acaaagaagt gaaaaaggac ctgatcatca
4560agctgcctaa gtactccctg ttcgagctgg aaaacggccg gaagagaatg ctggcctctg
4620ccggcgaact gcagaaggga aacgaactgg ccctgccctc caaatatgtg aacttcctgt
4680acctggccag ccactatgag aagctgaagg gctcccccga ggataatgag cagaaacagc
4740tgtttgtgga acagcacaag cactacctgg acgagatcat cgagcagatc agcgagttct
4800ccaagagagt gatcctggcc gacgctaatc tggacaaagt gctgtccgcc tacaacaagc
4860accgggataa gcccatcaga gagcaggccg agaatatcat ccacctgttt accctgacca
4920atctgggagc ccctgccgcc ttcaagtact ttgacaccac catcgaccgg aagaggtaca
4980ccagcaccaa agaggtgctg gacgccaccc tgatccacca gagcatcacc ggcctgtacg
5040agacacggat cgacctgtct cagctgggag gtgactctgg aggatctagc ggaggatcct
5100ctggcagcga gacaccagga acaagcgagt cagcaacacc agagagctct ggtagcgaga
5160cacccggtac cagtgaaagc gccacgccag aaagcagtgg gagtgagact ccgggtacat
5220ctgaatcagc gacaccggaa tcaagtggcg gcagcagcgg cggcagcagc accctaaata
5280tagaagatga gtatcggcta catgagacct caaaagagcc agatgtttct ctagggtcca
5340catggctgtc tgattttcct caggcctggg cggaaaccgg gggcatggga ctggcagttc
5400gccaagctcc tctgatcata cctctgaaag caacctctac ccccgtgtcc ataaaacaat
5460accccatgtc acaagaagcc agactgggga tcaagcccca catacagaga ctgttggacc
5520agggaatact ggtaccctgc cagtccccct ggaacacgcc cctgctaccc gttaagaaac
5580cagggactaa tgattatagg cctgtccagg atctgagaga agtcaacaag cgggtggaag
5640acatccaccc caccgtgccc aacccttaca acctcttgag cgggccccca ccgtcccacc
5700agtggtacac tgtgcttgat ttaaaggatg cctttttctg cctgagactc caccccacca
5760gtcagcctct cttcgccttt gagtggagag atccagagat gggaatctca ggacaattga
5820cctggaccag actcccacag ggtttcaaaa acagtcccac cctgtttaat gaggcactgc
5880acagagacct agcagacttc cggatccagc acccagactt gatcctgcta cagtacgtgg
5940atgacttact gctggccgcc acttctgagc tagactgcca acaaggtact cgggccctgt
6000tacaaaccct agggaacctc gggtatcggg cctcggccaa gaaagcccaa atttgccaga
6060aacaggtcaa gtatctgggg tatcttctaa aagagggtca gagatggctg actgaggcca
6120gaaaagagac tgtgatgggg cagcctactc cgaagacccc tcgacaacta agggagttcc
6180tagggaaggc aggcttctgt cgcctcttca tccctgggtt tgcagaaatg gcagcccccc
6240tgtaccctct caccaaaccg gggactctgt ttaattgggg cccagaccaa caaaaggcct
6300atcaagaaat caagcaagct cttctaactg ccccagccct ggggttgcca gatttgacta
6360agccctttga actctttgtc gacgagaagc agggctacgc caaaggtgtc ctaacgcaaa
6420aactgggacc ttggcgtcgg ccggtggcct acctgtccaa aaagctagac ccagtagcag
6480ctgggtggcc cccttgccta cggatggtag cagccattgc cgtactgaca aaggatgcag
6540gcaagctaac catgggacag ccactagtca ttctggcccc ccatgcagta gaggcactag
6600tcaaacaacc ccccgaccgc tggctttcca acgcccggat gactcactat caggccttgc
6660ttttggacac ggaccgggtc cagttcggac cggtggtagc cctgaacccg gctacgctgc
6720tcccactgcc tgaggaaggg ctgcaacaca actgccttga tgggacaggt ggcggtggtg
6780tcaccgtcaa gttcaagtac aagggtgagg aacttgaagt tgatattagc aaaatcaaga
6840aggtttggcg cgttggtaaa atgatatctt ttacttatga cgacaacggc aagacaggta
6900gaggggcagt gtctgagaaa gacgccccca aggagctgtt gcaaatgttg gaaaagtctg
6960ggaaaaagtc tggcggctca aaaagaaccg ccgacggcag cgaattcgag cccaagaaga
7020agaggaaagt cggaggtggc gggagcccaa aaaagaaaag aaaagtgtat ccctatgatg
7080tccccgatta tgccggttca agagccctgg tcgtgattag actgagccga gtgacagacg
7140ccaccacaag tcccgagaga cagctggaat catgccagca gctctgtgct cagcggggtt
7200gggatgtggt cggcgtggca gaggatctgg acgtgagcgg ggccgtcgat ccattcgaca
7260gaaagaggag gcccaacctg gcaagatggc tcgctttcga ggaacagccc tttgatgtga
7320tcgtcgccta cagagtggac cggctgaccc gctcaattcg acatctccag cagctggtgc
7380attgggctga ggaccacaag aaactggtgg tcagcgcaac agaagcccac ttcgatacta
7440ccacaccttt tgccgctgtg gtcatcgcac tgatgggcac tgtggcccag atggagctcg
7500aagctatcaa ggagcgaaac aggagcgcag cccatttcaa tattagggcc ggtaaataca
7560gaggctccct gcccccttgg ggatatctcc ctaccagggt ggatggggag tggagactgg
7620tgccagaccc cgtccagaga gagcggattc tggaagtgta ccacagagtg gtcgataacc
7680acgaaccact ccatctggtg gcacacgacc tgaatagacg cggcgtgctc tctccaaagg
7740attattttgc tcagctgcag ggaagagagc cacagggaag agaatggagt gctactgcac
7800tgaagagatc tatgatcagt gaggctatgc tgggttacgc aacactcaat ggcaaaactg
7860tccgggacga tgacggagcc cctctggtga gggctgagcc tattctcacc agagagcagc
7920tcgaagctct gcgggcagaa ctggtcaaga ctagtcgcgc caaacctgcc gtgagcaccc
7980caagcctgct cctgagggtg ctgttctgcg ccgtctgtgg agagccagca tacaagtttg
8040ccggcggagg gcgcaaacat ccccgctatc gatgcaggag catggggttc cctaagcact
8100gtggaaacgg gacagtggcc atggctgagt gggacgcctt ttgcgaggaa caggtgctgg
8160atctcctggg tgacgctgag cggctggaaa aagtgtgggt ggcaggatct gactccgctg
8220tggagctggc agaagtcaat gccgagctcg tggatctgac ttccctcatc ggatctcctg
8280catatagagc tgggtcccca cagagagaag ctctggacgc acgaattgct gcactcgctg
8340ctagacagga ggaactggag ggcctggagg ccaggccctc tggatgggag tggcgagaaa
8400ccggacagag gtttggggat tggtggaggg agcaggacac cgcagccaag aacacatggc
8460tgagatccat gaatgtccgg ctcacattcg acgtgcgcgg tggcctgact cgaaccatcg
8520attttggcga cctgcaggag tatgaacagc acctgagact ggggtccgtg gtcgaaagac
8580tgcacactgg gatgtcctag gtttaaaccc gctgatcagc ctcgactgtg ccttctagtt
8640gccagccatc tgttgtttgc ccctcccccg tgccttcctt gaccctggaa ggtgccactc
8700ccactgtcct ttcctaataa aatgagaaaa ttgcatcgca ttgtctgagt aggtgtcatt
8760ctattctggg gggtggggtg gggcaggaca gcaaggggga ggattgggaa gacaatagca
8820ggcatgctgg ggatgcggtg ggctctatgg cttctgaggc ggaaagaacc agctggggct
8880cgataccgtc gacctctagc tagagcttgg cgtaatcatg gtcatagctg tttcctgtgt
8940gaaattgtta tccgctcaca attccacaca acatacgagc cggaagcata aagtgtaaag
9000cctagggtgc ctaatgagtg agctaactca cattaattgc gttgcgctca ctgcccgctt
9060tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc gcggggagag
9120gcggtttgcg tattgggcgc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg
9180ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat
9240caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta
9300aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa
9360atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc
9420cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt
9480ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca
9540gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg
9600accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat
9660cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta
9720cagagttctt gaagtggtgg cctaactacg gctacactag aagaacagta tttggtatct
9780gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac
9840aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa
9900aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa
9960actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt
10020taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca
10080gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca
10140tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc
10200ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa
10260accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc
10320agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat agtttgcgca
10380acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat
10440tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag
10500cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca gtgttatcac
10560tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta agatgctttt
10620ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg cgaccgagtt
10680gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact ttaaaagtgc
10740tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg ctgttgagat
10800ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca
10860gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga ataagggcga
10920cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc atttatcagg
10980gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa caaatagggg
11040ttccgcgcac atttcc
11056
385 2367 DNA Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
385 tgttgagatc cagttcgatg taacccactc
gtgcacccaa ctgatcttca gcatctttta 60ctttcaccag cgtttctggg tgagcaaaaa
caggaaggca aaatgccgca aaaaagggaa 120taagggcgac acggaaatgt tgaatactca
tactcttcct ttttcaatat tattgaagca 180tttatcaggg ttattgtctc atgagcggat
acatatttga atgtatttag aaaaataaac 240aaataggggt tccgcgcaca tttccccgaa
aagtgccacc tgacgtcgct agctgtacaa 300aaaagcaggc tttaaaggaa ccaattcagt
cgactggatc cggtaccaag gtcgggcagg 360aagagggcct atttcccatg attccttcat
atttgcatat acgatacaag gctgttagag 420agataattag aattaatttg actgtaaaca
caaagatatt agtacaaaat acgtgacgta 480gaaagtaata atttcttggg tagtttgcag
ttttaaaatt atgttttaaa atggactatc 540atatgcttac cgtaacttga aagtatttcg
atttcttggc tttatatatc ttgtggaaag 600gacgaaacac cgctattctc gcagctcacc
agttttagag ctagaaatag caagttaaaa 660taaggctagt ccgttatcaa cttgaaaaag
tggcaccgag tcggtgcgac gagcgcggcg 720atatcatcat ccatggccgg atgatcctga
cgacggagac cgccgtcgtc gacaagccgg 780cctgagctgc gagaattttt ttaagcttgg
gccgctcgag gtacctctct acatatgaca 840tgtgagcaaa aggccagcaa aaggccagga
accgtaaaaa ggccgcgttg ctggcgtttt 900tccataggct ccgcccccct gacgagcatc
acaaaaatcg acgctcaagt cagaggtggc 960gaaacccgac aggactataa agataccagg
cgtttccccc tggaagctcc ctcgtgcgct 1020ctcctgttcc gaccctgccg cttaccggat
acctgtccgc ctttctccct tcgggaagcg 1080tggcgctttc tcatagctca cgctgtaggt
atctcagttc ggtgtaggtc gttcgctcca 1140agctgggctg tgtgcacgaa ccccccgttc
agcccgaccg ctgcgcctta tccggtaact 1200atcgtcttga gtccaacccg gtaagacacg
acttatcgcc actggcagca gccactggta 1260acaggattag cagagcgagg tatgtaggcg
gtgctacaga gttcttgaag tggtggccta 1320actacggcta cactagaaga acagtatttg
gtatctgcgc tctgctgaag ccagttacct 1380tcggaaaaag agttggtagc tcttgatccg
gcaaacaaac caccgctggt agcggtggtt 1440tttttgtttg caagcagcag attacgcgca
gaaaaaaagg atctcaagaa gatcctttga 1500tcttttctac ggggtctgac gctcagtgga
acgaaaactc acgttaaggg attttggtca 1560tgagattatc aaaaaggatc ttcacctaga
tccttttaaa ttaaaaatga agttttaaat 1620caatctaaag tatatatgag taaacttggt
ctgacagtta ccaatgctta atcagtgagg 1680cacctatctc agcgatctgt ctatttcgtt
catccatagt tgcctgactc cccgtcgtgt 1740agataactac gatacgggag ggcttaccat
ctggccccag tgctgcaatg ataccgcgag 1800atccacgctc accggctcca gatttatcag
caataaacca gccagccgga agggccgagc 1860gcagaagtgg tcctgcaact ttatccgcct
ccatccagtc tattaattgt tgccgggaag 1920ctagagtaag tagttcgcca gttaatagtt
tgcgcaacgt tgttgccatt gctacaggca 1980tcgtggtgtc acgctcgtcg tttggtatgg
cttcattcag ctccggttcc caacgatcaa 2040ggcgagttac atgatccccc atgttgtgca
aaaaagcggt tagctccttc ggtcctccga 2100tcgttgtcag aagtaagttg gccgcagtgt
tatcactcat ggttatggca gcactgcata 2160attctcttac tgtcatgcca tccgtaagat
gcttttctgt gactggtgag tactcaacca 2220agtcattctg agaatagtgt atgcggcgac
cgagttgctc ttgcccggcg tcaatacggg 2280ataataccgc gccacatagc agaactttaa
aagtgctcat cattggaaaa cgttcttcgg 2340ggcgaaaact ctcaaggatc ttaccgc
2367
386 2280 DNA Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
386 ctccatccag tctattaatt gttgccggga agctagagta agtagttcgc cagttaatag
60tttgcgcaac gttgttgcca ttgctacagg catcgtggtg tcacgctcgt cgtttggtat
120ggcttcattc agctccggtt cccaacgatc aaggcgagtt acatgatccc ccatgttgtg
180caaaaaagcg gttagctcct tcggtcctcc gatcgttgtc agaagtaagt tggccgcagt
240gttatcactc atggttatgg cagcactgca taattctctt actgtcatgc catccgtaag
300atgcttttct gtgactggtg agtactcaac caagtcattc tgagaatagt gtatgcggcg
360accgagttgc tcttgcccgg cgtcaatacg ggataatacc gcgccacata gcagaacttt
420aaaagtgctc atcattggaa aacgttcttc ggggcgaaaa ctctcaagga tcttaccgct
480gttgagatcc agttcgatgt aacccactcg tgcacccaac tgatcttcag catcttttac
540tttcaccagc gtttctgggt gagcaaaaac aggaaggcaa aatgccgcaa aaaagggaat
600aagggcgaca cggaaatgtt gaatactcat actcttcctt tttcaatatt attgaagcat
660ttatcagggt tattgtctca tgagcggata catatttgaa tgtatttaga aaaataaaca
720aataggggtt ccgcgcacat ttccccgaaa agtgccacct gacgtcgcta gctgtacaaa
780aaagcaggct ttaaaggaac caattcagtc gactggatcc ggtaccaagg tcgggcagga
840agagggccta tttcccatga ttccttcata tttgcatata cgatacaagg ctgttagaga
900gataattaga attaatttga ctgtaaacac aaagatatta gtacaaaata cgtgacgtag
960aaagtaataa tttcttgggt agtttgcagt tttaaaatta tgttttaaaa tggactatca
1020tatgcttacc gtaacttgaa agtatttcga tttcttggct ttatatatct tgtggaaagg
1080acgaaacacc gaagccggcc ttgcacatgc gttttagagc tagaaatagc aagttaaaat
1140aaggctagtc cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgttt ttttaagctt
1200gggccgctcg aggtacctct ctacatatga catgtgagca aaaggccagc aaaaggccag
1260gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca
1320tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca
1380ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg
1440atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag
1500gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt
1560tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca
1620cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg
1680cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa gaacagtatt
1740tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta gctcttgatc
1800cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc agattacgcg
1860cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg acgctcagtg
1920gaacgaaaac tcacgttaag ggattttggt catgagatta tcaaaaagga tcttcaccta
1980gatcctttta aattaaaaat gaagttttaa atcaatctaa agtatatatg agtaaacttg
2040gtctgacagt taccaatgct taatcagtga ggcacctatc tcagcgatct gtctatttcg
2100ttcatccata gttgcctgac tccccgtcgt gtagataact acgatacggg agggcttacc
2160atctggcccc agtgctgcaa tgataccgcg agatccacgc tcaccggctc cagatttatc
2220agcaataaac cagccagccg gaagggccga gcgcagaagt ggtcctgcaa ctttatccgc
2280
SEQ ID NO: 387; Length: 6386; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 387:
tgagcgcaac gcaattaatg tgagttagct
cactcattag gcaccccagg ctttacactt 60tatgcttccg gctcgtatgt tgtgtggaat
tgtgagcgga taacaatttc acacaggaaa 120cagctatgac catgaggcgc gccggattcg
acattgatta ttgactagtt attaatagta 180atcaattacg gggtcattag ttcatagccc
atatatggag ttccgcgtta cataacttac 240ggtaaatggc ccgcctggct gaccgcccaa
cgacccccgc ccattgacgt caataatgac 300gtatgttccc atagtaacgc caatagggac
tttccattga cgtcaatggg tggagtattt 360acggtaaact gcccacttgg cagtacatca
agtgtatcat atgccaagta cgccccctat 420tgacgtcaat gacggtaaat ggcccgcctg
gcattatgcc cagtacatga ccttatggga 480ctttcctact tggcagtaca tctacgtatt
agtcatcgct attaccatgg tcgaggtgag 540ccccacgttc tgcttcactc tccccatctc
ccccccctcc ccacccccaa ttttgtattt 600atttattttt taattatttt gtgcagcgat
gggggcgggg gggggggggg ggcgcgcgcc 660rggsggggsg gggsggggsg rggggsgggg
sggggsgagg cggagaggtg cggcggcagc 720caatcagagc ggcgcgctcc gaaagtttcc
ttttatggcg aggcggcggc ggcggcggcc 780ctataaaaag cgaagcgcgc ggcgggcggg
agtcgctgcg cgctgccttc gccccgtgcc 840ccgctccgcc gccgcctcgc gccgcccgcc
ccggctctga ctgaccgcgt tactcccaca 900ggtgagcggg cgggacggcc cttctcctcc
gggctgtaat tagcgcttgg tttaatgacg 960gcttgtttct tttctgtggc tgcgtgaaag
ccttgagggg ctccgggagg gccctttgtg 1020cggggggagc ggctcggggg gtgcgtgcgt
gtgtgtgtgc gtggggagcg ccgcgtgcgg 1080ctccgcgctg cccggcggct gtgagcgctg
cgggcgcggc gcggggcttt gtgcgctccg 1140cagtgtgcgc gaggggagcg cggccggggg
cggtgccccg cggtgcgggg ggggctgcga 1200ggggaacaaa ggctgcgtgc ggggtgtgtg
cgtggggggg tgagcagggg gtgtgggcgc 1260gtcggtcggg ctgcaacccc ccctgcaccc
ccctccccga gttgctgagc acggcccggc 1320ttcgggtgcg gggctccgta cggggcgtgg
cgcggggctc gccgtgccgg gcggggggtg 1380gcggcaggtg ggggtgccgg gcggggcggg
gccgcctcgg gccggggagg gctcggggga 1440ggggcgcggc ggcccccgga gcgccggcgg
ctgtcgaggc gcggcgagcc gcagccattg 1500ccttttatgg taatcgtgcg agagggcgca
gggacttcct ttgtcccaaa tctgtgcgga 1560gccgaaatct gggaggcgcc gccgcacccc
ctctagcggg cgcggggcga agcggtgcgg 1620cgccggcagg aaggaaatgg gcggggaggg
ccttcgtgcg tcgccgcgcc gccgtcccct 1680tctccctctc cagcctcggg gctgtccgcg
gggggacggc tgccttcggg ggggacgggg 1740cagggcgggg ttcggcttct ggcgtgtgac
cggcggctct agagcctctg ctaaccatgt 1800tcatgccttc ttctttttcc tacagatcct
taattaataa tacgactcac tatagggggt 1860cgacccgcca ccatgccaaa aaagaaaaga
aaagtgtatc cctatgatgt ccccgattat 1920gccggttcaa gagccctggt cgtgattaga
ctgagccgag tgacagacgc caccacaagt 1980cccgagagac agctggaatc atgccagcag
ctctgtgctc agcggggttg ggatgtggtc 2040ggcgtggcag aggatctgga cgtgagcggg
gccgtcgatc cattcgacag aaagaggagg 2100cccaacctgg caagatggct cgctttcgag
gaacagccct ttgatgtgat cgtcgcctac 2160agagtggacc ggctgacccg ctcaattcga
catctccagc agctggtgca ttgggctgag 2220gaccacaaga aactggtggt cagcgcaaca
gaagcccact tcgatactac cacacctttt 2280gccgctgtgg tcatcgcact gatgggcact
gtggcccaga tggagctcga agctatcaag 2340gagcgaaaca ggagcgcagc ccatttcaat
attagggccg gtaaatacag aggctccctg 2400cccccttggg gatatctccc taccagggtg
gatggggagt ggagactggt gccagacccc 2460gtccagagag agcggattct ggaagtgtac
cacagagtgg tcgataacca cgaaccactc 2520catctggtgg cacacgacct gaatagacgc
ggcgtgctct ctccaaagga ttattttgct 2580cagctgcagg gaagagagcc acagggaaga
gaatggagtg ctactgcact gaagagatct 2640atgatcagtg aggctatgct gggttacgca
acactcaatg gcaaaactgt ccgggacgat 2700gacggagccc ctctggtgag ggctgagcct
attctcacca gagagcagct cgaagctctg 2760cgggcagaac tggtcaagac tagtcgcgcc
aaacctgccg tgagcacccc aagcctgctc 2820ctgagggtgc tgttctgcgc cgtctgtgga
gagccagcat acaagtttgc cggcggaggg 2880cgcaaacatc cccgctatcg atgcaggagc
atggggttcc ctaagcactg tggaaacggg 2940acagtggcca tggctgagtg ggacgccttt
tgcgaggaac aggtgctgga tctcctgggt 3000gacgctgagc ggctggaaaa agtgtgggtg
gcaggatctg actccgctgt ggagctggca 3060gaagtcaatg ccgagctcgt ggatctgact
tccctcatcg gatctcctgc atatagagct 3120gggtccccac agagagaagc tctggacgca
cgaattgctg cactcgctgc tagacaggag 3180gaactggagg gcctggaggc caggccctct
ggatgggagt ggcgagaaac cggacagagg 3240tttggggatt ggtggaggga gcaggacacc
gcagccaaga acacatggct gagatccatg 3300aatgtccggc tcacattcga cgtgcgcggt
ggcctgactc gaaccatcga ttttggcgac 3360ctgcaggagt atgaacagca cctgagactg
gggtccgtgg tcgaaagact gcacactggg 3420atgtcctagg tcagagctcg ctgatcagcc
tcgactgtgc cttctagttg ccagccatct 3480gttgtttgcc cctcccccgt gccttccttg
accctggaag gtgccactcc cactgtcctt 3540tcctaataaa atgaggaaat tgcatcgcat
tgtctgagta ggtgtcattc tattctgggg 3600ggtggggtgg ggcaggacag caagggggag
gattgggaag acaatagcag gcatgctggg 3660gatgcggtgg gctctatggc ttctgaggcg
gaaagaacca gctggggctc gagatccact 3720agttctagcc tcgaggctag agcggccgcc
actggccgtc gttttacaac gtcgtgactg 3780ggaaaaccct ggcgttaccc aacttaatcg
ccttgcagca catccccctt tcgccagctg 3840gcgtaatagc gaagaggccc gcaccgatcg
cccttcccaa cagttgcgca gcctgaatgg 3900cgaatgggac gcgccctgta gcggcgcatt
aagcgcggcg ggtgtggtgg ttacgcgcag 3960cgtgaccgct acacttgcca gcgccctagc
gcccgctcct ttcgctttct tcccttcctt 4020tctcgccacg ttcgccggct ttccccgtca
agctctaaat cgggggctcc ctttagggtt 4080ccgatttagt gctttacggc acctcgaccc
caaaaaactt gattagggtg atggttcacg 4140tagtgggcca tcgccctgat agacggtttt
tcgccctttg acgttggagt ccacgttctt 4200taatagtgga ctcttgttcc aaactggaac
aacactcaac cctatctcgg tctattcttt 4260tgatttataa gggattttgc cgatttcggc
ctattggtta aaaaatgagc tgatttaaca 4320aaaatttaac gcgaatttta acaaaatatt
aacgcttacr mktymsrtks smcwttymgg 4380sgaaatgtgc gcggaacccc tatttgttta
tttttctaaa tacattcaaa tatgtatccg 4440ctcatgagac aataaccctg ataaatgctt
caataatatt gaaaaaggaa gagtatgagt 4500attcaacatt tccgtgtcgc ccttattccc
ttttttgcgg cattttgcct tcctgttttt 4560gctcacccag aaacgctggt gaaagtaaaa
gatgctgaag atcagttggg tgcacgagtg 4620ggttacatcg aactggatct caacagcggt
aagatccttg agagttttcg ccccgaagaa 4680cgttttccaa tgatgagcac ttttaaagtt
ctgctatgtg gcgcggtatt atcccgtatt 4740gacgccgggc aagagcaact cggtcgccgc
atacactatt ctcagaatga cttggttgag 4800tactcaccag tcacagaaaa gcatcttacg
gatggcatga cagtaagaga attatgcagt 4860gctgccataa ccatgagtga taacactgcg
gccaacttac ttctgacaac gatcggagga 4920ccgaaggagc taaccgcttt tttgcacaac
atgggggatc atgtaactcg ccttgatcgt 4980tgggaaccgg agctgaatga agccatacca
aacgacgagc gtgacaccac gatgcctgta 5040gcaatggcaa caacgttgcg caaactatta
actggcgaac tacttactct agcttcccgg 5100caacaattaa tagactggat ggaggcggat
aaagttgcag gaccacttct gcgctcggcc 5160cttccggctg gctggtttat tgctgataaa
tctggagccg gtgagcgtgg gtctcgcggt 5220atcattgcag cactggggcc agatggtaag
ccctcccgta tcgtagttat ctacacgacg 5280gggagtcagg caactatgga tgaacgaaat
agacagatcg ctgagatagg tgcctcactg 5340attaagcatt ggtaactgtc agaccaagtt
tactcatata tactttagat tgatttaaaa 5400cttcattttt aatttaaaag gatctaggtg
aagatccttt ttgataatct catgaccaaa 5460atcccttaac gtgagttttc gttccactga
gcgtcagacc ccgtagaaaa gatcaaagga 5520tcttcttgag atcctttttt tctgcgcgta
atctgctgct tgcaaacaaa aaaaccaccg 5580ctaccagcgg tggtttgttt gccggatcaa
gagctaccaa ctctttttcc gaaggtaact 5640ggcttcagca gagcgcagat accaaatact
gttcttctag tgtagccgta gttaggccac 5700cacttcaaga actctgtagc accgcctaca
tacctcgctc tgctaatcct gttaccagtg 5760gctgctgcca gtggcgataa gtcgtgtctt
accgggttgg actcaagacg atagttaccg 5820gataaggcgc agcggtcggg ctgaacgggg
ggttcgtgca cacagcccag cttggagcga 5880acgacctaca ccgaactgag atacctacag
cgtgagctat gagaaagcgc cacgcttccc 5940gaagggagaa aggcggacag gtatccggta
agcggcaggg tcggaacagg agagcgcacg 6000agggagcttc cagggggaaa cgcctggtat
ctttatagtc ctgtcgggtt tcgccacctc 6060tgacttgagc gtcgattttt gtgatgctcg
tcaggggggc ggagcctatg gaaaaacgcc 6120agcaacgcgg cctttttacg gttcctggcc
ttttgctggc cttttgctca catgttcttt 6180cctgcgttat cccctgattc tgtggataac
cgtattaccg cctttgagtg agctgatacc 6240gctcgccgca gccgaacgac cgagcgcagc
gagtcagtga gcgaggaagc ggaagagcgc 6300ccaatacgca aaccgcctct ccccgcgcgt
tggccgattc attaatgcag ctggcacgac 6360aggtttcccg actggaaagc gggcag
6386
SEQ ID NO: 388; Length: 6317; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 388:
gattcgacat tgattattga ctagttatta atagtaatca attacggggt cattagttca
60tagcccatat atggagttcc gcgttacata acttacggta aatggcccgc ctggctgacc
120gcccaacgac ccccgcccat tgacgtcaat aatgacgtat gttcccatag taacgccaat
180agggactttc cattgacgtc aatgggtgga gtatttacgg taaactgccc acttggcagt
240acatcaagtg tatcatatgc caagtacgcc ccctattgac gtcaatgacg gtaaatggcc
300cgcctggcat tatgcccagt acatgacctt atgggacttt cctacttggc agtacatcta
360cgtattagtc atcgctatta ccatggtcga ggtgagcccc acgttctgct tcactctccc
420catctccccc ccctccccac ccccaatttt gtatttattt attttttaat tattttgtgc
480agcgatgggg gcgggggggg ggggggggcg cgcgccrggs ggggsggggs ggggsgrggg
540gsggggsggg gsgaggcgga gaggtgcggc ggcagccaat cagagcggcg cgctccgaaa
600gtttcctttt atggcgaggc ggcggcggcg gcggccctat aaaaagcgaa gcgcgcggcg
660ggcgggagtc gctgcgcgct gccttcgccc cgtgccccgc tccgccgccg cctcgcgccg
720cccgccccgg ctctgactga ccgcgttact cccacaggtg agcgggcggg acggcccttc
780tcctccgggc tgtaattagc gcttggttta atgacggctt gtttcttttc tgtggctgcg
840tgaaagcctt gaggggctcc gggagggccc tttgtgcggg gggagcggct cggggggtgc
900gtgcgtgtgt gtgtgcgtgg ggagcgccgc gtgcggctcc gcgctgcccg gcggctgtga
960gcgctgcggg cgcggcgcgg ggctttgtgc gctccgcagt gtgcgcgagg ggagcgcggc
1020cgggggcggt gccccgcggt gcgggggggg ctgcgagggg aacaaaggct gcgtgcgggg
1080tgtgtgcgtg ggggggtgag cagggggtgt gggcgcgtcg gtcgggctgc aaccccccct
1140gcacccccct ccccgagttg ctgagcacgg cccggcttcg ggtgcggggc tccgtacggg
1200gcgtggcgcg gggctcgccg tgccgggcgg ggggtggcgg caggtggggg tgccgggcgg
1260ggcggggccg cctcgggccg gggagggctc gggggagggg cgcggcggcc cccggagcgc
1320cggcggctgt cgaggcgcgg cgagccgcag ccattgcctt ttatggtaat cgtgcgagag
1380ggcgcaggga cttcctttgt cccaaatctg tgcggagccg aaatctggga ggcgccgccg
1440caccccctct agcgggcgcg gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg
1500ggagggcctt cgtgcgtcgc cgcgccgccg tccccttctc cctctccagc ctcggggctg
1560tccgcggggg gacggctgcc ttcggggggg acggggcagg gcggggttcg gcttctggcg
1620tgtgaccggc ggctctagag cctctgctaa ccatgttcat gccttcttct ttttcctaca
1680gatccttaat taataatacg actcactata gggggtcgac ccgccaccat gacagcgcca
1740aagaaaaaga ggaaggtcat gaccaagaaa gtggccatct atactagagt gagcacaacg
1800aatcaggccg aggaggggtt ctctattgac gagcaaatcg atcgtctgac caagtacgcg
1860gaagcaatgg gctggcaagt cagcgacact tacaccgatg ctgggttctc cggcgccaaa
1920ctggaaaggc ctgccatgca gcggctgatt aacgacattg agaacaaggc ctttgataca
1980gtgctcgtat acaagctcga caggctcagc cgatctgtgc gggacacgct ttacctcgta
2040aaggatgttt tcactaagaa taaaatcgac ttcattagcc tgaacgaatc cattgacacc
2100agctcagcta tgggctctct gttcctgacc atcctgagcg ctatcaatga gtttgagagg
2160gagaatataa aggagcgcat gacaatggga aagctgggta gagcgaagtc cgggaaatct
2220atgatgtgga ccaagaccgc ttttggatac taccacaata ggaagacggg cattctggag
2280atcgtgccct tgcaggcaac catcgttgag cagatcttca ccgactacct gagcggaata
2340tctctcacga agttgcgaga taagctgaat gagagcggac acattggcaa ggatattcct
2400tggtcatata gaaccctccg ccaaactctg gataatccgg tgtactgcgg ttacatcaag
2460ttcaaagaca gcctcttcga gggaatgcat aaacctatca ttccatacga gacatacctg
2520aaagtccaaa aggaactcga agagcgccag caacagactt acgaacggaa taataatccc
2580aggcctttcc aggccaaata tatgctgtcc ggcatggcaa gatgcggata ctgcggggca
2640ccactcaaga ttgtgcttgg ccataaacgg aaggatggaa gcagaaccat gaaatatcac
2700tgcgcaaacc gctttccaag gaaaacgaag gggattaccg tgtacaatga caacaaaaaa
2760tgtgatagcg gaacctacga tctgtccaac ttggaaaaca ccgtcattga caatttaatt
2820ggatttcagg aaaataatga cagccttctg aagattatca acgggaacaa tcagccgatt
2880ctggacactt catctttcaa aaaacagatc tctcagattg ataagaaaat tcagaaaaat
2940tccgatttat acctcaatga tttcataacg atggatgagc tgaaggaccg gaccgacagt
3000ttgcaggccg agaagaaact gctgaaagca aagatctccg agaacaagtt caatgacagt
3060accgatgtct tcgagttggt gaagacccag ctgggtagta tcccaatcaa cgagttgagc
3120tatgacaata agaagaagat tgttaataac ctggtgagca aagtggacgt gaccgctgat
3180aacgtggata ttatcttcaa gttccagctg gcctgagtca gagctcgctg atcagcctcg
3240actgtgcctt ctagttgcca gccatctgtt gtttgcccct cccccgtgcc ttccttgacc
3300ctggaaggtg ccactcccac tgtcctttcc taataaaatg aggaaattgc atcgcattgt
3360ctgagtaggt gtcattctat tctggggggt ggggtggggc aggacagcaa gggggaggat
3420tgggaagaca atagcaggca tgctggggat gcggtgggct ctatggcttc tgaggcggaa
3480agaaccagct ggggctcgag atccactagt tctagcctcg aggctagagc ggccgccact
3540ggccgtcgtt ttacaacgtc gtgactggga aaaccctggc gttacccaac ttaatcgcct
3600tgcagcacat ccccctttcg ccagctggcg taatagcgaa gaggcccgca ccgatcgccc
3660ttcccaacag ttgcgcagcc tgaatggcga atgggacgcg ccctgtagcg gcgcattaag
3720cgcggcgggt gtggtggtta cgcgcagcgt gaccgctaca cttgccagcg ccctagcgcc
3780cgctcctttc gctttcttcc cttcctttct cgccacgttc gccggctttc cccgtcaagc
3840tctaaatcgg gggctccctt tagggttccg atttagtgct ttacggcacc tcgaccccaa
3900aaaacttgat tagggtgatg gttcacgtag tgggccatcg ccctgataga cggtttttcg
3960ccctttgacg ttggagtcca cgttctttaa tagtggactc ttgttccaaa ctggaacaac
4020actcaaccct atctcggtct attcttttga tttataaggg attttgccga tttcggccta
4080ttggttaaaa aatgagctga tttaacaaaa atttaacgcg aattttaaca aaatattaac
4140gcttacrmkt ymsrtkssmc wttymggsga aatgtgcgcg gaacccctat ttgtttattt
4200ttctaaatac attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa
4260taatattgaa aaaggaagag tatgagtatt caacatttcc gtgtcgccct tattcccttt
4320tttgcggcat tttgccttcc tgtttttgct cacccagaaa cgctggtgaa agtaaaagat
4380gctgaagatc agttgggtgc acgagtgggt tacatcgaac tggatctcaa cagcggtaag
4440atccttgaga gttttcgccc cgaagaacgt tttccaatga tgagcacttt taaagttctg
4500ctatgtggcg cggtattatc ccgtattgac gccgggcaag agcaactcgg tcgccgcata
4560cactattctc agaatgactt ggttgagtac tcaccagtca cagaaaagca tcttacggat
4620ggcatgacag taagagaatt atgcagtgct gccataacca tgagtgataa cactgcggcc
4680aacttacttc tgacaacgat cggaggaccg aaggagctaa ccgctttttt gcacaacatg
4740ggggatcatg taactcgcct tgatcgttgg gaaccggagc tgaatgaagc cataccaaac
4800gacgagcgtg acaccacgat gcctgtagca atggcaacaa cgttgcgcaa actattaact
4860ggcgaactac ttactctagc ttcccggcaa caattaatag actggatgga ggcggataaa
4920gttgcaggac cacttctgcg ctcggccctt ccggctggct ggtttattgc tgataaatct
4980ggagccggtg agcgtgggtc tcgcggtatc attgcagcac tggggccaga tggtaagccc
5040tcccgtatcg tagttatcta cacgacgggg agtcaggcaa ctatggatga acgaaataga
5100cagatcgctg agataggtgc ctcactgatt aagcattggt aactgtcaga ccaagtttac
5160tcatatatac tttagattga tttaaaactt catttttaat ttaaaaggat ctaggtgaag
5220atcctttttg ataatctcat gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg
5280tcagaccccg tagaaaagat caaaggatct tcttgagatc ctttttttct gcgcgtaatc
5340tgctgcttgc aaacaaaaaa accaccgcta ccagcggtgg tttgtttgcc ggatcaagag
5400ctaccaactc tttttccgaa ggtaactggc ttcagcagag cgcagatacc aaatactgtt
5460cttctagtgt agccgtagtt aggccaccac ttcaagaact ctgtagcacc gcctacatac
5520ctcgctctgc taatcctgtt accagtggct gctgccagtg gcgataagtc gtgtcttacc
5580gggttggact caagacgata gttaccggat aaggcgcagc ggtcgggctg aacggggggt
5640tcgtgcacac agcccagctt ggagcgaacg acctacaccg aactgagata cctacagcgt
5700gagctatgag aaagcgccac gcttcccgaa gggagaaagg cggacaggta tccggtaagc
5760ggcagggtcg gaacaggaga gcgcacgagg gagcttccag ggggaaacgc ctggtatctt
5820tatagtcctg tcgggtttcg ccacctctga cttgagcgtc gatttttgtg atgctcgtca
5880ggggggcgga gcctatggaa aaacgccagc aacgcggcct ttttacggtt cctggccttt
5940tgctggcctt ttgctcacat gttctttcct gcgttatccc ctgattctgt ggataaccgt
6000attaccgcct ttgagtgagc tgataccgct cgccgcagcc gaacgaccga gcgcagcgag
6060tcagtgagcg aggaagcgga agagcgccca atacgcaaac cgcctctccc cgcgcgttgg
6120ccgattcatt aatgcagctg gcacgacagg tttcccgact ggaaagcggg cagtgagcgc
6180aacgcaatta atgtgagtta gctcactcat taggcacccc aggctttaca ctttatgctt
6240ccggctcgta tgttgtgtgg aattgtgagc ggataacaat ttcacacagg aaacagctat
6300gaccatgagg cgcgccg
6317
SEQ ID NO: 389; Length: 6638; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 389:
gattcgacat tgattattga ctagttatta
atagtaatca attacggggt cattagttca 60tagcccatat atggagttcc gcgttacata
acttacggta aatggcccgc ctggctgacc 120gcccaacgac ccccgcccat tgacgtcaat
aatgacgtat gttcccatag taacgccaat 180agggactttc cattgacgtc aatgggtgga
gtatttacgg taaactgccc acttggcagt 240acatcaagtg tatcatatgc caagtacgcc
ccctattgac gtcaatgacg gtaaatggcc 300cgcctggcat tatgcccagt acatgacctt
atgggacttt cctacttggc agtacatcta 360cgtattagtc atcgctatta ccatggtcga
ggtgagcccc acgttctgct tcactctccc 420catctccccc ccctccccac ccccaatttt
gtatttattt attttttaat tattttgtgc 480agcgatgggg gcgggggggg ggggggggcg
cgcgccrggs ggggsggggs ggggsgrggg 540gsggggsggg gsgaggcgga gaggtgcggc
ggcagccaat cagagcggcg cgctccgaaa 600gtttcctttt atggcgaggc ggcggcggcg
gcggccctat aaaaagcgaa gcgcgcggcg 660ggcgggagtc gctgcgcgct gccttcgccc
cgtgccccgc tccgccgccg cctcgcgccg 720cccgccccgg ctctgactga ccgcgttact
cccacaggtg agcgggcggg acggcccttc 780tcctccgggc tgtaattagc gcttggttta
atgacggctt gtttcttttc tgtggctgcg 840tgaaagcctt gaggggctcc gggagggccc
tttgtgcggg gggagcggct cggggggtgc 900gtgcgtgtgt gtgtgcgtgg ggagcgccgc
gtgcggctcc gcgctgcccg gcggctgtga 960gcgctgcggg cgcggcgcgg ggctttgtgc
gctccgcagt gtgcgcgagg ggagcgcggc 1020cgggggcggt gccccgcggt gcgggggggg
ctgcgagggg aacaaaggct gcgtgcgggg 1080tgtgtgcgtg ggggggtgag cagggggtgt
gggcgcgtcg gtcgggctgc aaccccccct 1140gcacccccct ccccgagttg ctgagcacgg
cccggcttcg ggtgcggggc tccgtacggg 1200gcgtggcgcg gggctcgccg tgccgggcgg
ggggtggcgg caggtggggg tgccgggcgg 1260ggcggggccg cctcgggccg gggagggctc
gggggagggg cgcggcggcc cccggagcgc 1320cggcggctgt cgaggcgcgg cgagccgcag
ccattgcctt ttatggtaat cgtgcgagag 1380ggcgcaggga cttcctttgt cccaaatctg
tgcggagccg aaatctggga ggcgccgccg 1440caccccctct agcgggcgcg gggcgaagcg
gtgcggcgcc ggcaggaagg aaatgggcgg 1500ggagggcctt cgtgcgtcgc cgcgccgccg
tccccttctc cctctccagc ctcggggctg 1560tccgcggggg gacggctgcc ttcggggggg
acggggcagg gcggggttcg gcttctggcg 1620tgtgaccggc ggctctagag cctctgctaa
ccatgttcat gccttcttct ttttcctaca 1680gatccttaat taataatacg actcactata
gggggtcgac ccgccaccat gcccaagaag 1740aaacggaaag tgatgagccc ctttatcgcc
ccggacgtgc ccgagcacct cctggacact 1800gtgcgcgtct ttctgtacgc ccgtcagagt
aaaggacggt cagatggatc tgacgtgtcc 1860accgaagcac agctcgctgc cggacgggcc
cttgttgcct caagaaacgc acaaggggga 1920gctagatggg tggtggcggg cgaattcgtg
gatgtgggca gatcagggtg ggacccgaat 1980gtgacacgcg ccgacttcga aagaatgatg
ggcgaggtgc gcgccggtga gggagacgta 2040gtggtggtta atgaactgag tcgccttacg
aggaagggcg cccacgacgc tctggagatc 2100gataacgaac tcaaaaaaca cggtgtgcgg
ttcatgagcg tgctggaacc attcctggat 2160accagcaccc caatcggtgt cgcgatcttt
gccctgattg ccgcgctcgc taaacaggat 2220tcagacctta aagctgagcg gctgaagggg
gctaaagatg agatcgctgc cttggggggt 2280gtgcacagct catctgcgcc attcggcatg
agggcggtca gaaagaaagt ggataacctg 2340gtcatatctg ttctggagcc tgatgaggac
aacccggacc acgttgagct tgtggaacgg 2400atggctaaga tgtctttcga aggcgtcagc
gataacgcaa ttgccacaac atttgagaag 2460gagaaaatcc cctctccggg gatggctgag
agacgagcca cggagaagag gcttgcttct 2520attaaggcac ggaggctcaa tggcgccgaa
aagccgatca tgtggcgggc gcagacagtt 2580agatggattc ttaaccatcc cgcgattggt
ggattcgcat tcgagcgggt gaaacacgga 2640aaagcccaca tcaacgtgat acgaagagat
cccggcggca aaccccttac ccctcacact 2700ggtatcctgt ctggatccaa gtggttggaa
ctccaggaga agagaagcgg gaaaaatctc 2760tccgaccgca aaccaggtgc cgaagtggaa
cctacgctgc tttccgggtg gagatttctg 2820ggatgtcgga tatgcggtgg gtcaatgggc
cagtcccaag ggggccgtaa gaggaatggg 2880gacttggctg agggcaatta catgtgtgca
aacccaaagg ggcacggcgg tctgagcgtc 2940aagaggtctg agcttgatga attcgtggca
tcaaaagtct gggccaggtt gcgcacggct 3000gacatggagg atgaacatga ccaagcatgg
attgcagctg cagctgaacg gtttgctttg 3060cagcacgacc tggcgggggt agctgacgag
cgacgggagc aacaagctca cctggataac 3120gttcggagat caataaaaga tctccaggcg
gataggaagg caggtctcta cgtgggacgc 3180gaagaactgg agacctggcg cagtaccgtc
ctgcaatata ggagctacga ggctgagtgt 3240actactaggt tggctgagct ggatgaaaaa
atgaatggat ccacccgggt gccttcagaa 3300tggtttagcg gcgaggaccc aaccgcggaa
ggaggcatat gggcgagctg ggatgtctat 3360gagcgccggg agtttctcag cttttttttg
gactccgtaa tggttgacag gggcagacat 3420cctgaaacca agaaatatat accattgaaa
gaccgggtga ccttaaagtg ggcggagctg 3480ttaaaggaag aggatgaagc aagcgaggcc
acagaacggg agctggcagc tctttaggtc 3540agagctcgct gatcagcctc gactgtgcct
tctagttgcc agccatctgt tgtttgcccc 3600tcccccgtgc cttccttgac cctggaaggt
gccactccca ctgtcctttc ctaataaaat 3660gaggaaattg catcgcattg tctgagtagg
tgtcattcta ttctgggggg tggggtgggg 3720caggacagca agggggagga ttgggaagac
aatagcaggc atgctgggga tgcggtgggc 3780tctatggctt ctgaggcgga aagaaccagc
tggggctcga gatccactag ttctagcctc 3840gaggctagag cggccgccac tggccgtcgt
tttacaacgt cgtgactggg aaaaccctgg 3900cgttacccaa cttaatcgcc ttgcagcaca
tccccctttc gccagctggc gtaatagcga 3960agaggcccgc accgatcgcc cttcccaaca
gttgcgcagc ctgaatggcg aatgggacgc 4020gccctgtagc ggcgcattaa gcgcggcggg
tgtggtggtt acgcgcagcg tgaccgctac 4080acttgccagc gccctagcgc ccgctccttt
cgctttcttc ccttcctttc tcgccacgtt 4140cgccggcttt ccccgtcaag ctctaaatcg
ggggctccct ttagggttcc gatttagtgc 4200tttacggcac ctcgacccca aaaaacttga
ttagggtgat ggttcacgta gtgggccatc 4260gccctgatag acggtttttc gccctttgac
gttggagtcc acgttcttta atagtggact 4320cttgttccaa actggaacaa cactcaaccc
tatctcggtc tattcttttg atttataagg 4380gattttgccg atttcggcct attggttaaa
aaatgagctg atttaacaaa aatttaacgc 4440gaattttaac aaaatattaa cgcttacrmk
tymsrtkssm cwttymggsg aaatgtgcgc 4500ggaaccccta tttgtttatt tttctaaata
cattcaaata tgtatccgct catgagacaa 4560taaccctgat aaatgcttca ataatattga
aaaaggaaga gtatgagtat tcaacatttc 4620cgtgtcgccc ttattccctt ttttgcggca
ttttgccttc ctgtttttgc tcacccagaa 4680acgctggtga aagtaaaaga tgctgaagat
cagttgggtg cacgagtggg ttacatcgaa 4740ctggatctca acagcggtaa gatccttgag
agttttcgcc ccgaagaacg ttttccaatg 4800atgagcactt ttaaagttct gctatgtggc
gcggtattat cccgtattga cgccgggcaa 4860gagcaactcg gtcgccgcat acactattct
cagaatgact tggttgagta ctcaccagtc 4920acagaaaagc atcttacgga tggcatgaca
gtaagagaat tatgcagtgc tgccataacc 4980atgagtgata acactgcggc caacttactt
ctgacaacga tcggaggacc gaaggagcta 5040accgcttttt tgcacaacat gggggatcat
gtaactcgcc ttgatcgttg ggaaccggag 5100ctgaatgaag ccataccaaa cgacgagcgt
gacaccacga tgcctgtagc aatggcaaca 5160acgttgcgca aactattaac tggcgaacta
cttactctag cttcccggca acaattaata 5220gactggatgg aggcggataa agttgcagga
ccacttctgc gctcggccct tccggctggc 5280tggtttattg ctgataaatc tggagccggt
gagcgtgggt ctcgcggtat cattgcagca 5340ctggggccag atggtaagcc ctcccgtatc
gtagttatct acacgacggg gagtcaggca 5400actatggatg aacgaaatag acagatcgct
gagataggtg cctcactgat taagcattgg 5460taactgtcag accaagttta ctcatatata
ctttagattg atttaaaact tcatttttaa 5520tttaaaagga tctaggtgaa gatccttttt
gataatctca tgaccaaaat cccttaacgt 5580gagttttcgt tccactgagc gtcagacccc
gtagaaaaga tcaaaggatc ttcttgagat 5640cctttttttc tgcgcgtaat ctgctgcttg
caaacaaaaa aaccaccgct accagcggtg 5700gtttgtttgc cggatcaaga gctaccaact
ctttttccga aggtaactgg cttcagcaga 5760gcgcagatac caaatactgt tcttctagtg
tagccgtagt taggccacca cttcaagaac 5820tctgtagcac cgcctacata cctcgctctg
ctaatcctgt taccagtggc tgctgccagt 5880ggcgataagt cgtgtcttac cgggttggac
tcaagacgat agttaccgga taaggcgcag 5940cggtcgggct gaacgggggg ttcgtgcaca
cagcccagct tggagcgaac gacctacacc 6000gaactgagat acctacagcg tgagctatga
gaaagcgcca cgcttcccga agggagaaag 6060gcggacaggt atccggtaag cggcagggtc
ggaacaggag agcgcacgag ggagcttcca 6120gggggaaacg cctggtatct ttatagtcct
gtcgggtttc gccacctctg acttgagcgt 6180cgatttttgt gatgctcgtc aggggggcgg
agcctatgga aaaacgccag caacgcggcc 6240tttttacggt tcctggcctt ttgctggcct
tttgctcaca tgttctttcc tgcgttatcc 6300cctgattctg tggataaccg tattaccgcc
tttgagtgag ctgataccgc tcgccgcagc 6360cgaacgaccg agcgcagcga gtcagtgagc
gaggaagcgg aagagcgccc aatacgcaaa 6420ccgcctctcc ccgcgcgttg gccgattcat
taatgcagct ggcacgacag gtttcccgac 6480tggaaagcgg gcagtgagcg caacgcaatt
aatgtgagtt agctcactca ttaggcaccc 6540caggctttac actttatgct tccggctcgt
atgttgtgtg gaattgtgag cggataacaa 6600tttcacacag gaaacagcta tgaccatgag
gcgcgccg 6638
SEQ ID NO: 390; Length: 9530; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 390:
taatcagcat catgatgtgg taccacatca tgatgctgat tataagaatg cggccgccac
60actctagtgg atctcgagtt aataattcag aagaactcgt caagaaggcg atagaaggcg
120atgcgctgcg aatcgggagc ggcgataccg taaagcacga ggaagcggtc agcccattcg
180ccgccaagct cttcagcaat atcacgggta gccaacgcta tgtcctgata gcggtccgcc
240acacccagcc ggccacagtc gatgaatcca gaaaagcggc cattttccac catgatattc
300ggcaagcagg catcgccatg ggtcacgacg agatcctcgc cgtcgggcat gctcgccttg
360agcctggcga acagttcggc tggcgcgagc ccctgatgct cttcgtccag atcatcctga
420tcgacaagac cggcttccat ccgagtacgt gctcgctcga tgcgatgttt cgcttggtgg
480tcgaatgggc aggtagccgg atcaagcgta tgcagccgcc gcattgcatc agccatgatg
540gatactttct cggcaggagc aaggtgtaga tgacatggag atcctgcccc ggcacttcgc
600ccaatagcag ccagtccctt cccgcttcag tgacaacgtc gagcacagct gcgcaaggaa
660cgcccgtcgt ggccagccac gatagccgcg ctgcctcgtc ttgcagttca ttcagggcac
720cggacaggtc ggtcttgaca aaaagaaccg ggcgcccctg cgctgacagc cggaacacgg
780cggcatcaga gcagccgatt gtctgttgtg cccagtcata gccgaatagc ctctccaccc
840aagcggccgg agaacctgcg tgcaatccat cttgttcaat catgcgaaac gatcctcatc
900ctgtctcttg atcagagctt gatcccctgc gccatcagat ccttggcggc gagaaagcca
960tccagtttac tttgcagggc ttcccaacct taccagaggg cgccccagct ggcaattccg
1020gttcgcttgc tgtccataaa accgcccagt ctagctatcg ccatgtaagc ccactgcaag
1080ctacctgctt tctctttgcg cttgcgtttt cccttgtcca gatagcccag tagctgacat
1140tcatccgggg tcagcaccgt ttctgcggac tggctttcta cgtgctcgag gggggccaaa
1200cggtctccag cttggctgtt ttggcggatg agagaagatt ttcagcctga tacagattaa
1260atcagaacgc agaagcggtc tgataaaaca gaatttgcct ggcggcagta gcgcggtggt
1320cccacctgac cccatgccga actcagaagt gaaacgccgt agcgccgatg gtagtgtggg
1380gtctccccat gcgagagtag ggaactgcca ggcatcaaat aaaacgaaag gctcagtcga
1440aagactgggc ctttcgtttt atctgttgtt tgtcggtgaa cgctctcctg agtaggacaa
1500atccgccggg agcggatttg aacgttgcga agcaacggcc cggagggtgg cgggcaggac
1560gcccgccata aactgccagg catcaaatta agcagaaggc catcctgacg gatggccttt
1620ttgcgtttct acaaactctt ttgtttattt ttctaaatac attcaaatat gtatccgctc
1680atgaccaaaa tcccttaacg tgagttttcg ttccactgag cgtcagaccc cgtagaaaag
1740atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt gcaaacaaaa
1800aaaccaccgc taccagcggt ggtttgtttg ccggatcaag agctaccaac tctttttccg
1860aaggtaactg gcttcagcag agcgcagata ccaaatactg tccttctagt gtagccgtag
1920ttaggccacc acttcaagaa ctctgtagca ccgcctacat acctcgctct gctaatcctg
1980ttaccagtgg ctgctgccag tggcgataag tcgtgtctta ccgggttgga ctcaagacga
2040tagttaccgg ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac acagcccagc
2100ttggagcgaa cgacctacac cgaactgaga tacctacagc gtgagctatg agaaagcgcc
2160acgcttcccg aagggagaaa ggcggacagg tatccggtaa gcggcagggt cggaacagga
2220gagcgcacga gggagcttcc agggggaaac gcctggtatc tttatagtcc tgtcgggttt
2280cgccacctct gacttgagcg tcgatttttg tgatgctcgt caggggggcg gagcctatgg
2340aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct tttgctggcc ttttgctcac
2400atgttctttc ctgcgttatc ccctgattct gtggataacc gtattaccgc ctttgagtga
2460gctgataccg ctcgccgcag ccgaacgacc gagcgcagcg agtcagtgag cgaggaagcg
2520gaagagcgcc tgatgcggta ttttctcctt acgcatctgt gcggtatttc acaccgcata
2580tggtgcactc tcagtacaat ctgctctgat gccgcatagt taagccagta tacactccgc
2640tatcgctacg tgactgggtc atggctgcgc cccgacaccc gccaacaccc gctgacgcgc
2700cctgacgggc ttgtctgctc ccggcatccg cttacagaca agctgtgacc gtctccggga
2760gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg cgcgaggcag cagatcaatt
2820cgcgcgcgaa ggcgaagcgg catgcataat gtgcctgtca aatggacgaa gcagggattc
2880tgcaaaccct atgctactcc gtcaagccgt caattgtctg attcgttacc aattatgaca
2940acttgacggc tacatcattc actttttctt cacaaccggc acggaactcg ctcgggctgg
3000ccccggtgca ttttttaaat acccgcgaga aatagagttg atcgtcaaaa ccaacattgc
3060gaccgacggt ggcgataggc atccgggtgg tgctcaaaag cagcttcgcc tggctgatac
3120gttggtcctc gcgccagctt aagacgctaa tccctaactg ctggcggaaa agatgtgaca
3180gacgcgacgg cgacaagcaa acatgctgtg cgacgctggc gatacattac cctgttatcc
3240ctagatgaca ttaccctgtt atcccagatg acattaccct gttatcccta gatgacatta
3300ccctgttatc cctagatgac atttaccctg ttatccctag atgacattac cctgttatcc
3360cagatgacat taccctgtta tccctagata cattaccctg ttatcccaga tgacataccc
3420tgttatccct agatgacatt accctgttat cccagatgac attaccctgt tatccctaga
3480tacattaccc tgttatccca gatgacatac cctgttatcc ctagatgaca ttaccctgtt
3540atcccagatg acattaccct gttatcccta gatacattac cctgttatcc cagatgacat
3600accctgttat ccctagatga cattaccctg ttatcccaga tgacattacc ctgttatccc
3660tagatacatt accctgttat cccagatgac ataccctgtt atccctagat gacattaccc
3720tgttatccca gatgacatta ccctgttatc cctagataca ttaccctgtt atcccagatg
3780acataccctg ttatccctag atgacattac cctgttatcc cagatgacat taccctgtta
3840tccctagata cattaccctg ttatcccaga tgacataccc tgttatccct agatgacatt
3900accctgttat cccagataaa ctcaatgatg atgatgatga tggtcgagac tcagcggccg
3960cggtgccagg gcgtgccctt gggctccccg ggcgcgacta taagctgcga gcaacttcac
4020ttgggtatgc cggcggtagc gctgagggcc tatttcccat gattccttca tatttgcata
4080tacgatacaa ggctgttaga gagataattg gaattaattt gactgtaaac acaaagatat
4140tagtacaaaa tacgtgacgt agaaagtaat aatttcttgg gtagtttgca gttttaaaat
4200tatgttttaa aatggactat catatgctta ccgtaacttg aaagtatttc gatttcttgg
4260ctttatatat cttgtggaaa ggacgaaaca ccgggtcttc gagaagacct gttttagagc
4320tagaaatcgt ggttcgcacc gactcggtgc cacagcaagt taaaataagg ctagtccgtt
4380atcaacttga aaaagtggca ccgagtcggt gcttttttga attcgctagc taggtcttga
4440aaggagtggg aattggctcc ggtgcccgtc agtgggcaga gcgcacatcg cccacagtcc
4500ccgagaagtt ggggggaggg gtcggcaatt gatccggtgc ctagagaagg tggcgcgggg
4560taaactggga aagtgatgtc gtgtactggc tccgcctttt tcccgagggt gggggagaac
4620cgtatataag tgcagtagtc gccgtgaacg ttctttttcg caacgggttt gccgccagaa
4680cacaggaccg gttctagagc gctgccacca tggacaagaa gtacagcatc ggcctggaca
4740tcggcaccaa ctctgtgggc tgggccgtga tcaccgacga gtacaaggtg cccagcaaga
4800aattcaaggt gctgggcaac accgaccggc acagcatcaa gaagaacctg atcggagccc
4860tgctgttcga cagcggcgaa acagccgagg ccacccggct gaagagaacc gccagaagaa
4920gatacaccag acggaagaac cggatctgct atctgcaaga gatcttcagc aacgagatgg
4980ccaaggtgga cgacagcttc ttccacagac tggaagagtc cttcctggtg gaagaggata
5040agaagcacga gcggcacccc atcttcggca acatcgtgga cgaggtggcc taccacgaga
5100agtaccccac catctaccac ctgagaaaga aactggtgga cagcaccgac aaggccgacc
5160tgcggctgat ctatctggcc ctggcccaca tgatcaagtt ccggggccac ttcctgatcg
5220agggcgacct gaaccccgac aacagcgacg tggacaagct gttcatccag ctggtgcaga
5280cctacaacca gctgttcgag gaaaacccca tcaacgccag cggcgtggac gccaaggcca
5340tcctgtctgc cagactgagc aagagcagac ggctggaaaa tctgatcgcc cagctgcccg
5400gcgagaagaa gaatggcctg ttcggaaacc tgattgccct gagcctgggc ctgaccccca
5460acttcaagag caacttcgac ctggccgagg atgccaaact gcagctgagc aaggacacct
5520acgacgacga cctggacaac ctgctggccc agatcggcga ccagtacgcc gacctgtttc
5580tggccgccaa gaacctgtcc gacgccatcc tgctgagcga catcctgaga gtgaacaccg
5640agatcaccaa ggcccccctg agcgcctcta tgatcaagag atacgacgag caccaccagg
5700acctgaccct gctgaaagct ctcgtgcggc agcagctgcc tgagaagtac aaagagattt
5760tcttcgacca gagcaagaac ggctacgccg gctacattga cggcggagcc agccaggaag
5820agttctacaa gttcatcaag cccatcctgg aaaagatgga cggcaccgag gaactgctcg
5880tgaagctgaa cagagaggac ctgctgcgga agcagcggac cttcgacaac ggcagcatcc
5940cccaccagat ccacctggga gagctgcacg ccattctgcg gcggcaggaa gatttttacc
6000cattcctgaa ggacaaccgg gaaaagatcg agaagatcct gaccttccgc atcccctact
6060acgtgggccc tctggccagg ggaaacagca gattcgcctg gatgaccaga aagagcgagg
6120aaaccatcac cccctggaac ttcgaggaag tggtggacaa gggcgcttcc gcccagagct
6180tcatcgagcg gatgaccaac ttcgataaga acctgcccaa cgagaaggtg ctgcccaagc
6240acagcctgct gtacgagtac ttcaccgtgt ataacgagct gaccaaagtg aaatacgtga
6300ccgagggaat gagaaagccc gccttcctga gcggcgagca gaaaaaggcc atcgtggacc
6360tgctgttcaa gaccaaccgg aaagtgaccg tgaagcagct gaaagaggac tacttcaaga
6420aaatcgagtg cttcgactcc gtggaaatct ccggcgtgga agatcggttc aacgcctccc
6480tgggcacata ccacgatctg ctgaaaatta tcaaggacaa ggacttcctg gacaatgagg
6540aaaacgagga cattctggaa gatatcgtgc tgaccctgac actgtttgag gacagagaga
6600tgatcgagga acggctgaaa acctatgccc acctgttcga cgacaaagtg atgaagcagc
6660tgaagcggcg gagatacacc ggctggggca ggctgagccg gaagctgatc aacggcatcc
6720gggacaagca gtccggcaag acaatcctgg atttcctgaa gtccgacggc ttcgccaaca
6780gaaacttcat gcagctgatc cacgacgaca gcctgacctt taaagaggac atccagaaag
6840cccaggtgtc cggccagggc gatagcctgc acgagcacat tgccaatctg gccggcagcc
6900ccgccattaa gaagggcatc ctgcagacag tgaaggtggt ggacgagctc gtgaaagtga
6960tgggccggca caagcccgag aacatcgtga tcgaaatggc cagagagaac cagaccaccc
7020agaagggaca gaagaacagc cgcgagagaa tgaagcggat cgaagagggc atcaaagagc
7080tgggcagcca gatcctgaaa gaacaccccg tggaaaacac ccagctgcag aacgagaagc
7140tgtacctgta ctacctgcag aatgggcggg atatgtacgt ggaccaggaa ctggacatca
7200accggctgtc cgactacgat gtggaccata tcgtgcctca gagctttctg aaggacgact
7260ccatcgacaa caaggtgctg accagaagcg acaagaaccg gggcaagagc gacaacgtgc
7320cctccgaaga ggtcgtgaag aagatgaaga actactggcg gcagctgctg aacgccaagc
7380tgattaccca gagaaagttc gacaatctga ccaaggccga gagaggcggc ctgagcgaac
7440tggataaggc cggcttcatc aagagacagc tggtggaaac ccggcagatc acaaagcacg
7500tggcacagat cctggactcc cggatgaaca ctaagtacga cgagaatgac aagctgatcc
7560gggaagtgaa agtgatcacc ctgaagtcca agctggtgtc cgatttccgg aaggatttcc
7620agttttacaa agtgcgcgag atcaacaact accaccacgc ccacgacgcc tacctgaacg
7680ccgtcgtggg aaccgccctg atcaaaaagt accctaagct ggaaagcgag ttcgtgtacg
7740gcgactacaa ggtgtacgac gtgcggaaga tgatcgccaa gagcgagcag gaaatcggca
7800aggctaccgc caagtacttc ttctacagca acatcatgaa ctttttcaag accgagatta
7860ccctggccaa cggcgagatc cggaagcggc ctctgatcga gacaaacggc gaaaccgggg
7920agatcgtgtg ggataagggc cgggattttg ccaccgtgcg gaaagtgctg agcatgcccc
7980aagtgaatat cgtgaaaaag accgaggtgc agacaggcgg cttcagcaaa gagtctatcc
8040tgcccaagag gaacagcgat aagctgatcg ccagaaagaa ggactgggac cctaagaagt
8100acggcggctt cgacagcccc accgtggcct attctgtgct ggtggtggcc aaagtggaaa
8160agggcaagtc caagaaactg aagagtgtga aagagctgct ggggatcacc atcatggaaa
8220gaagcagctt cgagaagaat cccatcgact ttctggaagc caagggctac aaagaagtga
8280aaaaggacct gatcatcaag ctgcctaagt actccctgtt cgagctggaa aacggccgga
8340agagaatgct ggcctctgcc ggcgaactgc agaagggaaa cgaactggcc ctgccctcca
8400aatatgtgaa cttcctgtac ctggccagcc actatgagaa gctgaagggc tcccccgagg
8460ataatgagca gaaacagctg tttgtggaac agcacaagca ctacctggac gagatcatcg
8520agcagatcag cgagttctcc aagagagtga tcctggccga cgctaatctg gacaaagtgc
8580tgtccgccta caacaagcac cgggataagc ccatcagaga gcaggccgag aatatcatcc
8640acctgtttac cctgaccaat ctgggagccc ctgccgcctt caagtacttt gacaccacca
8700tcgaccggaa gaggtacacc agcaccaaag aggtgctgga cgccaccctg atccaccaga
8760gcatcaccgg cctgtacgag acacggatcg acctgtctca gctgggaggc gacaagcgac
8820ctgccgccac aaagaaggct ggacaggcta agaagaagaa agattacaaa gacgatgacg
8880ataagtaact agagctcgct gatcagcctc gactgtgcct tctagttgcc agccatctgt
8940tgtttgcccc tcccccgtgc cttccttgac cctggaaggt gccactccca ctgtcctttc
9000ctaataaaat gaggaaattg catcgcattg tctgagtagg tgtcattcta ttctgggggg
9060tggggtgggg caggacagca agggggagga ttgggaagag aatagcaggc atgctgggga
9120ctgaggcgga aagaaccagc tgtggaatgt gtgtcagtta gggtgtggaa agtccccagg
9180ctccccagca ggcagaagta tgcaaagcat gcatctcaat tagtcagcaa ccaggtgtgg
9240aaagtcccca ggctccccag caggcagaag tatgcaaagc atgcatctca attagtcagc
9300aaccatagtc ccgcccctaa ctccgcccat cccgccccta actccgccca gttccgccca
9360ttctccgccc catggctgac taattttttt tatttatgca gaggccgagg ccgcctcggc
9420ctctgagcta ttccagaagt agtgaggagg cttttttgga ggcctaggct tttgcaaaaa
9480gcttgggccc gccccaactg gggtaacctt tgagttctct cagttggggg
9530
391 5722 DNA Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
391 tgatcccctg cgccatcaga tccttggcgg
cgagaaagcc atccagttta ctttgcaggg 60cttcccaacc ttaccagagg gcgccccagc
tggcaattcc ggttcgcttg ctgtccataa 120aaccgcccag tctagctatc gccatgtaag
cccactgcaa gctacctgct ttctctttgc 180gcttgcgttt tcccttgtcc agatagccca
gtagctgaca ttcatccggg gtcagcaccg 240tttctgcgga ctggctttct acgtgctcga
ggggggccaa acggtctcca gcttggctgt 300tttggcggat gagagaagat tttcagcctg
atacagatta aatcagaacg cagaagcggt 360ctgataaaac agaatttgcc tggcggcagt
agcgcggtgg tcccacctga ccccatgccg 420aactcagaag tgaaacgccg tagcgccgat
ggtagtgtgg ggtctcccca tgcgagagta 480gggaactgcc aggcatcaaa taaaacgaaa
ggctcagtcg aaagactggg cctttcgttt 540tatctgttgt ttgtcggtga acgctctcct
gagtaggaca aatccgccgg gagcggattt 600gaacgttgcg aagcaacggc ccggagggtg
gcgggcagga cgcccgccat aaactgccag 660gcatcaaatt aagcagaagg ccatcctgac
ggatggcctt tttgcgtttc tacaaactct 720tttgtttatt tttctaaata cattcaaata
tgtatccgct catgaccaaa atcccttaac 780gtgagttttc gttccactga gcgtcagacc
ccgtagaaaa gatcaaagga tcttcttgag 840atcctttttt tctgcgcgta atctgctgct
tgcaaacaaa aaaaccaccg ctaccagcgg 900tggtttgttt gccggatcaa gagctaccaa
ctctttttcc gaaggtaact ggcttcagca 960gagcgcagat accaaatact gtccttctag
tgtagccgta gttaggccac cacttcaaga 1020actctgtagc accgcctaca tacctcgctc
tgctaatcct gttaccagtg gctgctgcca 1080gtggcgataa gtcgtgtctt accgggttgg
actcaagacg atagttaccg gataaggcgc 1140agcggtcggg ctgaacgggg ggttcgtgca
cacagcccag cttggagcga acgacctaca 1200ccgaactgag atacctacag cgtgagctat
gagaaagcgc cacgcttccc gaagggagaa 1260aggcggacag gtatccggta agcggcaggg
tcggaacagg agagcgcacg agggagcttc 1320cagggggaaa cgcctggtat ctttatagtc
ctgtcgggtt tcgccacctc tgacttgagc 1380gtcgattttt gtgatgctcg tcaggggggc
ggagcctatg gaaaaacgcc agcaacgcgg 1440cctttttacg gttcctggcc ttttgctggc
cttttgctca catgttcttt cctgcgttat 1500cccctgattc tgtggataac cgtattaccg
cctttgagtg agctgatacc gctcgccgca 1560gccgaacgac cgagcgcagc gagtcagtga
gcgaggaagc ggaagagcgc ctgatgcggt 1620attttctcct tacgcatctg tgcggtattt
cacaccgcat atggtgcact ctcagtacaa 1680tctgctctga tgccgcatag ttaagccagt
atacactccg ctatcgctac gtgactgggt 1740catggctgcg ccccgacacc cgccaacacc
cgctgacgcg ccctgacggg cttgtctgct 1800cccggcatcc gcttacagac aagctgtgac
cgtctccggg agctgcatgt gtcagaggtt 1860ttcaccgtca tcaccgaaac gcgcgaggca
gcagatcaat tcgcgcgcga aggcgaagcg 1920gcatgcataa tgtgcctgtc aaatggacga
agcagggatt ctgcaaaccc tatgctactc 1980cgtcaagccg tcaattgtct gattcgttac
caattatgac aacttgacgg ctacatcatt 2040cactttttct tcacaaccgg cacggaactc
gctcgggctg gccccggtgc attttttaaa 2100tacccgcgag aaatagagtt gatcgtcaaa
accaacattg cgaccgacgg tggcgatagg 2160catccgggtg gtgctcaaaa gcagcttcgc
ctggctgata cgttggtcct cgcgccagct 2220taagacgcta atccctaact gctggcggaa
aagatgtgac agacgcgacg gcgacaagca 2280aacatgctgt gcgacgctgg cgatacatta
ccctgttatc cctagatgac attaccctgt 2340tatcccagat gacattaccc tgttatccct
agatgacatt accctgttat ccctagatga 2400catttaccct gttatcccta gatgacatta
ccctgttatc ccagatgaca ttaccctgtt 2460atccctagat acattaccct gttatcccag
atgacatacc ctgttatccc tagatgacat 2520taccctgtta tcccagatga cattaccctg
ttatccctag atacattacc ctgttatccc 2580agatgacata ccctgttatc cctagatgac
attaccctgt tatcccagat gacattaccc 2640tgttatccct agatacatta ccctgttatc
ccagatgaca taccctgtta tccctagatg 2700acattaccct gttatcccag atgacattac
cctgttatcc ctagatacat taccctgtta 2760tcccagatga cataccctgt tatccctaga
tgacattacc ctgttatccc agatgacatt 2820accctgttat ccctagatac attaccctgt
tatcccagat gacataccct gttatcccta 2880gatgacatta ccctgttatc ccagatgaca
ttaccctgtt atccctagat acattaccct 2940gttatcccag atgacatacc ctgttatccc
tagatgacat taccctgtta tcccagataa 3000actcaatgat gatgatgatg atggtcgaga
ctcagcggcc gcggtgccag ggcgtgccct 3060tgggctcccc gggcgcggtc ctttgggcgc
taactgcgtg cgcgctggga attggcgcta 3120attgcgcgtg cgcgctggga ctcaaggcgc
taactgcgcg tgcgttctgg ggcccggggt 3180gccgcggcct gggctggggc gaaggcgggc
tcggccggaa ggggtggggt cgccgcggct 3240cccgggcgct tgcgcgcact tcctgcccga
gccgctggcc gcccgagggt gtggccgctg 3300cgtgcgcgcg cgccgacccg gcgctgtttg
aaccgggcgg aggcggggct ggcgcccggt 3360tgggaggggg ttggggcctg gcttcctgcc
gcgcgccgcg gggacgcctc cgaccagtgt 3420ttgcctttta tggtaataac gcggccggcc
cggcttcctt tgtccccaat ctgggcgcgc 3480gccggcgccc cctggcggcc taaggactcg
gcgcgccgga agtggccagg gcgggggcga 3540cctcggctca cagcgcgccc ggctattctc
gcagctcgcc accatgcccg ccatgaagat 3600cgagtgccgc atcaccggca ccctgaacgg
cgtggagttc gagctggtgg gcggcggaga 3660gggcaccccc gagcagggcc gcatgaccaa
caagatgaag agcaccaaag gcgccctgac 3720cttcagcccc tacctgctga gccacgtgat
gggctacggc ttctaccact tcggcaccta 3780ccccagcggc tacgagaacc ccttcctgca
cgccatcaac aacggcggct acaccaacac 3840ccgcatcgag aagtacgagg acggcggcgt
gctgcacgtg agcttcagct accgctacga 3900ggccggccgc gtgatcggcg acttcaaggt
ggtgggcacc ggcttccccg aggacagcgt 3960gatcttcacc gacaagatca tccgcagcaa
cgccaccgtg gagcacctgc accccatggg 4020cgataacgtg ctggtgggca gcttcgcccg
caccttcagc ctgcgcgacg gcggctacta 4080cagcttcgtg gtggacagcc acatgcactt
caagagcgcc atccacccca gcatcctgca 4140gaacgggggc cccatgttcg ccttccgccg
cgtggaggag ctgcacagca acaccgagct 4200gggcatcgtg gagtaccagc acgccttcaa
gacccccatc gccttcgcca gatctcgagc 4260tcgaaccatg gatgatgata tcgccgcgct
cgtcgtcgac aacggctccg gcatgtgcaa 4320ggccggcttc gcgggcgacg atgccccccg
ggccgtcttc ccctccatcg tggggcgccc 4380caggcaccag gtaggggagc tggctgggtg
gggcagcccc gggagcgggc gggaggcaag 4440ggcgctttct ctgcacagga gcctcccggt
ttccggggtg ggggctgcgc ccgtgctcag 4500ggcttcttgt cctttccttc ccagggcgtg
atggtgggca tgggtcagaa ggattcctat 4560gtgggcgacg aggcccagag caagagaggc
atcctcaccc tgaagtaccc catcgagcac 4620ggcatcgtca ccaactggga cgacatggag
aaaatctggc accacacctt ctacaatgag 4680ctgcgtgtgg ctcccgagga gcaccccgtg
ctgctgaccg aggcccccct gaaccccaag 4740gccaaccgcg agaagatgac ccagccccaa
ctggggtaac ctttgagttc tctcagttgg 4800gggtaatcag catcatgatg tggtaccaca
tcatgatgct gattataaga atgcggccgc 4860cacactctag tggatctcga gttaataatt
cagaagaact cgtcaagaag gcgatagaag 4920gcgatgcgct gcgaatcggg agcggcgata
ccgtaaagca cgaggaagcg gtcagcccat 4980tcgccgccaa gctcttcagc aatatcacgg
gtagccaacg ctatgtcctg atagcggtcc 5040gccacaccca gccggccaca gtcgatgaat
ccagaaaagc ggccattttc caccatgata 5100ttcggcaagc aggcatcgcc atgggtcacg
acgagatcct cgccgtcggg catgctcgcc 5160ttgagcctgg cgaacagttc ggctggcgcg
agcccctgat gctcttcgtc cagatcatcc 5220tgatcgacaa gaccggcttc catccgagta
cgtgctcgct cgatgcgatg tttcgcttgg 5280tggtcgaatg ggcaggtagc cggatcaagc
gtatgcagcc gccgcattgc atcagccatg 5340atggatactt tctcggcagg agcaaggtgt
agatgacatg gagatcctgc cccggcactt 5400cgcccaatag cagccagtcc cttcccgctt
cagtgacaac gtcgagcaca gctgcgcaag 5460gaacgcccgt cgtggccagc cacgatagcc
gcgctgcctc gtcttgcagt tcattcaggg 5520caccggacag gtcggtcttg acaaaaagaa
ccgggcgccc ctgcgctgac agccggaaca 5580cggcggcatc agagcagccg attgtctgtt
gtgcccagtc atagccgaat agcctctcca 5640cccaagcggc cggagaacct gcgtgcaatc
catcttgttc aatcatgcga aacgatcctc 5700atcctgtctc ttgatcagag ct
5722
392 15424 DNA Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
392 tcgacggtat cgataagctt gatatcgaat tcctgcagcc cgggggatcc actagttcta
60gagcggccgc caccgcggtg gagctccagc ttttgttccc tttagtgagg gttaatttcg
120agcttggcgt aatcatggtc atagctgttt cctgtgtgaa attgttatcc gctcacaatt
180ccacacaaca tacgagccgg aagcataaag tgtaaagcct ggggtgccta atgagtgagc
240taactcacat taattgcgtt gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc
300cagctgcatt aatgaatcgg ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct
360tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca
420gctcactcaa aggcggtaat acggttatcc acagaatcag gggataacgc aggaaagaac
480atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt
540ttccataggc tccgcccccc tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg
600cgaaacccga caggactata aagataccag gcgtttcccc ctggaagctc cctcgtgcgc
660tctcctgttc cgaccctgcc gcttaccgga tacctgtccg cctttctccc ttcgggaagc
720gtggcgcttt ctcatagctc acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc
780aagctgggct gtgtgcacga accccccgtt cagcccgacc gctgcgcctt atccggtaac
840tatcgtcttg agtccaaccc ggtaagacac gacttatcgc cactggcagc agccactggt
900aacaggatta gcagagcgag gtatgtaggc ggtgctacag agttcttgaa gtggtggcct
960aactacggct acactagaag gacagtattt ggtatctgcg ctctgctgaa gccagttacc
1020ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt
1080ttttttgttt gcaagcagca gattacgcgc agaaaaaaag gatctcaaga agatcctttg
1140atcttttcta cggggtctga cgctcagtgg aacgaaaact cacgttaagg gattttggtc
1200atgagattat caaaaaggat cttcacctag atccttttaa attaaaaatg aagttttaaa
1260tcaatctaaa gtatatatga gtaaacttgg tctgacagtt accaatgctt aatcagtgag
1320gcacctatct cagcgatctg tctatttcgt tcatccatag ttgcctgact ccccgtcgtg
1380tagataacta cgatacggga gggcttacca tctggcccca gtgctgcaat gataccgcga
1440gacccacgct caccggctcc agatttatca gcaataaacc agccagccgg aagggccgag
1500cgcagaagtg gtcctgcaac tttatccgcc tccatccagt ctattaattg ttgccgggaa
1560gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg ttgttgccat tgctacaggc
1620atcgtggtgt cacgctcgtc gtttggtatg gcttcattca gctccggttc ccaacgatca
1680aggcgagtta catgatcccc catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg
1740atcgttgtca gaagtaagtt ggccgcagtg ttatcactca tggttatggc agcactgcat
1800aattctctta ctgtcatgcc atccgtaaga tgcttttctg tgactggtga gtactcaacc
1860aagtcattct gagaatagtg tatgcggcga ccgagttgct cttgcccggc gtcaatacgg
1920gataataccg cgccacatag cagaacttta aaagtgctca tcattggaaa acgttcttcg
1980gggcgaaaac tctcaaggat cttaccgctg ttgagatcca gttcgatgta acccactcgt
2040gcacccaact gatcttcagc atcttttact ttcaccagcg tttctgggtg agcaaaaaca
2100ggaaggcaaa atgccgcaaa aaagggaata agggcgacac ggaaatgttg aatactcata
2160ctcttccttt ttcaatatta ttgaagcatt tatcagggtt attgtctcat gagcggatac
2220atatttgaat gtatttagaa aaataaacaa ataggggttc cgcgcacatt tccccgaaaa
2280gtgccacctg acgtctaaga aaccattatt atcatgacat taacctataa aaataggcgt
2340atcacgaggc cctttcgtct tcaagaattc tcatgtttga cagcttatca tcgataagct
2400ttaatgcggt agtttatcac agttaaattg ctaacgcagt caggcaccgt gtatgaaatc
2460taacaatgcg ctcatcgtca tcctcggcac cgtcaccctg gatgctgtag gcataggctt
2520ggttatgccg gtactgccgg gcctcttgcg ggatatcgtc cattccgaca gcatcgccag
2580tcactatggc gtgctgctag cgctatatgc gttgatgcaa tttctatgcg cacccgttct
2640cggagcactg tccgaccgct ttggccgccg cccagtcctg ctcgcttcgc tacttggagc
2700cactatcgac tacgcgatca tggcgaccac acccgtcctg tggatccggc gcacaccaaa
2760aacgtcactt ttgccacatc cgtcgcttac atgtgttccg ccacacttgc aacatcacac
2820ttccgccaca ctactacgtc acccgccccg ttcccacgcc ccgcgccacg tcacaaactc
2880caccccctca ttatcatatt ggcttcaatc caaaataaat catcaataat ataccttatt
2940ttggattgaa gccaatatga taatgagggg gtggagtttg tgacgtggcg cggggcgtgg
3000gaacggggcg ggtgacgtag gttttagggc ggagtaactt gtatgtgttg ggaattgtag
3060ttttcttaaa atgggaagtt acgtaacgtg ggaaaacgga agtgacgatt tgaggaagtt
3120gtgggttttt tggctttcgt ttctgggcgt aggttcgcgt gcggttttct gggtgttttt
3180tgtggacttt aaccgttacg tcatttttta gtcctatata tactcgctct gcacttggcc
3240cttttttaca ctgtgactga ttgagctggt gccgtgtcga gtggtgtttt tttaataggt
3300tttctttttt actggtaagg ctgactgtta ggctgccgct gtgaagcgct gtatgttgtt
3360ctggagcggg agggtgctat tttgcctagg caggagggtt tttcaggtgt ttatgtgttt
3420ttctctccta ttaattttgt tatacctcct atgggggctg taatgttgtc tctacgcctg
3480cgggtatgta ttcccccggg ctatttcggt cgctttttag cactgaccga tgaatcaacc
3540tgatgtgttt accgagtctt acattatgac tccggacatg accgaggagc tgtcggtggt
3600gctttttaat cacggtgacc agttttttta cggtcacgcc ggcatggccg tagtccgtct
3660tatgcttata agggttgttt ttcctgttgt aagacaggct tctaatgttt aaatgttttt
3720ttgttatttt attttgtgtt tatgcagaaa cccgcagaca tgtttgagag aaaaatggtg
3780tctttttctg tggtggttcc ggagcttacc tgcctttatc tgcatgagca tgactacgat
3840gtgctttctt ttttgcgcga ggctttgcct gattttttga gcagcacctt gcattttata
3900tcgccgccca tgcaacaaag cttacatcgg ggctacgctg gttagcatag ctccgagtat
3960gcgtgtcata atcagtgtgg gttcttttgt caaggttcct ggcggggaag tggccgcgct
4020ggtccgtgca gacctgcacg attatgttca gctggccctg cgaagggacc tacgggatcg
4080cggtattttt gttaatgttc cgcttttgaa tcttatacag gtctgtgagg aacctgaatt
4140tttgcaatca tgattcgctg cttgaggctg aaggtggagg gcgctctgga gcagattttt
4200acaatggccg gacttaatat tcgggatttg cttagagata tattgagaag gtggcgagat
4260gagaattatt tgggcatggt tgaaggtgct ggaatgttta tagaggagat tcaccctgaa
4320gggtttagcc tttacgtcca cttggacgtg agggccgttt gccttttgga agccattgtg
4380caacatctta caaatgccat tatctgttct ttggctgtag agtttgacca cgccaccgga
4440ggggagcgcg ttcacttaat agatcttcat tttgaggttt tggataatct tttggaataa
4500aaaaaaaaac atggttcttc cagctcttcc cgctcctccc gtgtgtgact cgcagaacga
4560atgtgtaggt tggctgggtg tggcttattc tgcggtggtg gatgttatca gggcagcggc
4620gcatgaagga gtttacatag aacccgaagc cagggggcgc ctggatgctt tgagagagtg
4680gatatactac aactactaca cagagcgatc taagcggcga gaccggagac gcagatctgt
4740ttgtcacgcc cgcacctggt tttgcttcag gaaatatgac tacgtccggc gttccatttg
4800gcatgacact acgaccaaca cgatctcggt tgtctcggcg cactccgtac agtagggatc
4860gtctacctcc ttttgagaca gaaacccgcg ctaccatact ggaggatcat ccgctgctgc
4920ccgaatgtaa cactttgaca atgcacaacg tgagttacgt gcgaggtctt ccctgcagtg
4980tgggatttac gctgattcag gaatgggttg ttccctggga tatggttcta acgcgggagg
5040agcttgtaat cctgaggaag tgtatgcacg tgtgcctgtg ttgtgccaac attgatatca
5100tgacgagcat gatgatccat ggttacgagt cctgggctct ccactgtcat tgttccagtc
5160ccggttccct gcagtgtata gccggcgggc aggttttggc cagctggttt aggatggtgg
5220tggatggcgc catgtttaat cagaggttta tatggtaccg ggaggtggtg aattacaaca
5280tgccaaaaga ggtaatgttt atgtccagcg tgtttatgag gggtcgccac ttaatctacc
5340tgcgcttgtg gtatgatggc cacgtgggtt ctgtggtccc cgccatgagc tttggataca
5400gcgccttgca ctgtgggatt ttgaacaata ttgtggtgct gtgctgcagt tactgtgctg
5460atttaagtga gatcagggtg cgctgctgtg cccggaggac aaggcgcctt atgctgcggg
5520cggtgcgaat catcgctgag gagaccactg ccatgttgta ttcctgcagg acggagcggc
5580ggcggcagca gtttattcgc gcgctgctgc agcaccaccg ccctatcctg atgcacgatt
5640atgactctac ccccatgtag gcgtggactt ctccttcgcc gcccgttaag caaccgcaag
5700ttggacagca gcctgtggct cagcagctgg acagcgacat gaacttaagt gagctgcccg
5760gggagtttat taatatcact gatgagcgtt tggctcgaca ggaaaccgtg tggaatataa
5820cacctaagaa tatgtctgtt acccatgata tgatgctttt taaggccagc cggggagaaa
5880ggactgtgta ctctgtgtgt tgggagggag gtggcaggtt gaatactagg gttctgtgag
5940tttgattaag gtacggtgat ctgtataagc tatgtggtgg tggggctata ctactgaatg
6000aaaaatgact tgaaattttc tgcaattgaa aaataaacac gttgaaacat aacacaaacg
6060attctttatt cttgggcaat gtatgaaaaa gtgtaagagg atgtggcaaa tatttcatta
6120atgtagttgt ggccagacca gtcccatgaa aatgacatag agtatgcact tggagttgtg
6180tctcctgttt cctgtgtacc gtttagtgta atggttagtg ttacaggttt agttttgtct
6240ccgtttaagt aaacttgact gacaatgtta cttttggcag ttttaccgtg agattttgga
6300taagctgata ggttaggcat aaatccaaca gcgtttgtat aggctgtgcc ttcagtaaga
6360tctccatttc taaagttcca atattctggg tccaggaagg aattgtttag tagcactcca
6420ttttcgtcaa atcttataat aagatgagca ctttgaactg ttccagatat tggagccaaa
6480ctgcctttaa cagccaaaac tgaaactgta gcaagtattt gactgccaca ttttgttaag
6540accaaagtga gtttagcatc tttctctgca tttagtctac agttaggaga tggagctggt
6600gtggtccaca aagttagctt atcattattt ttgtttccta ctgtaatggc acctgtgctg
6660tcaaaactaa ggccagttcc tagtttagga accatagcct tgtttgaatc aaattctagg
6720ccatggccaa tttttgtttt gaggggattt gtgtttggtg cattaggtga accaaattca
6780agcccatctc ctgcattaat ggctatggct gtagcgtcaa acatcaaccc cttggcagtg
6840cttaggttaa cctcaagctt tttggaattg tttgaagctg taaacaagta aaggcctttg
6900ttgtagttaa tatccaagtt gtgggctgag tttataaaaa gagggccctg tcctagtctt
6960agatttagtt ggttttgagc atcaaacgga taactaacat caagtataag gcgtctgttt
7020tgagaatcaa tccttagtcc tcctgctaca ttaagttgca tattgccttg tgaatcaaaa
7080cccaaggctc cagtaacttt agtttgcaag gaagtattat taatagtcac acctggacca
7140gttgctacgg tcaaagtgtt taggtcgtct gttacatgca aaggagcccc gtactttagt
7200cctagttttc cattttgtgt ataaatgggc tctttcaagt caatgcccaa gctaccagtg
7260gcagtagtta gagggggtga ggcagtgata gtaagggtac tgctatcggt ggtggtgagg
7320gggcctgatg tttgcagggc tagctttcct tctgacactg tgaggggtcc ttgggtggca
7380atgctaagtt tggagtcgtg cacggttagc ggggcctgtg attgcatggt gagtgtgttg
7440cccgcgacca ttagaggtgc ggcggcagcc acagttaggg cttctgaggt aactgtgagg
7500ggtgcagata tttccaggtt tatgtttgac ttggtttttt tgagaggtgg gctcacagtg
7560gttacatttt gggaggtaag gttgccggcc tcgtccagag agaggccgtt gcccattttg
7620agcgcaagca tgccattgga ggtaactaga ggttcggata ggcgcaaaga gagtacccca
7680gggggactct cttgaaaccc attgggggat acaaagggag gagtaagaaa aggcacagtt
7740ggaggaccgg tttccgtgtc atatggatac acggggttga aggtatcttc agacggtctt
7800gcgcgcttca tctgcaacaa catgaagata gtgggtgcgg atggacagga acaggaggaa
7860actgacattc catttagatt gtggagaaag tttgcagcca ggaggaagct gcaataccag
7920agctgggagg agggcaagga ggtgctgctg aataaactgg acagaaattt gctaactgat
7980tttaagtaag tgatgcttta ttattttttt ttattagtta aagggaataa gatccccggg
8040tactctagtt aattaactag aggatcttga tgtaatccaa ggttaggaca gttgcaaatc
8100acagtgagaa cacagggtcc cctgtcccgc tcaactagca gggggcgctg ggtaaactcc
8160cgaatcaggc tacgggcaag ctctccctgg gcggtaagcc ggacgccgtg cgccgggccc
8220tcgatatgat cctcgggcaa ttcaaagtag caaaactcac cggagtcgcg ggcaaagcac
8280ttgtggcggc gacagtggac caggtgtttc aggcgcagtt gctctgcctc tccacttaac
8340attcagtcgt agccgtccgc cgagtccttt accgcgtcaa agttaggaat aaattgatcc
8400ggatagtggc cgggaggtcc cgagaagggg ttaaagtaga ccgatggcac aaactcctca
8460ataaattgca gagttccaat gcctccagag cgcggctcag aggacgaggt ctgcagagtt
8520aggattgcct gacgaggcgt gaatgaagga cggccggcgc cgccgatctg aaatgtcccg
8580tccggacgga gaccaagcga ggagctcacc gactcgtcgt tgagctgaat acctcgccct
8640ctgattgtca ggtgagttat accctgcccg ggcgaccgca ccctgtgacg aaagccgccc
8700gcaagctgcg cccctgagtt agtcatctga acttcggcct gggcgtctct gggaagtacc
8760acagtggtgg gagcgggact ttcctggtac accagggcag cgggccaact acggggatta
8820aggttattac gaggtgtggt ggtaatagcc gcctgttcca agagaattcg gtttcggtgg
8880gcgcggattc cgttgacccg ggatatcatg tggggtcccg cgctcatgta gtttattcgg
8940gttgagtagt cttgggcagc tccagccgca agtcccattt gtggctggta actccacatg
9000tagggcgtgg gaatttcctt gctcataatg gcgctgacga caggtgctgg cgccgggtgt
9060ggccgctgga gatgacgtag ttttcgcgct taaatttgag aaagggcgcg aaactagtcc
9120ttaagagtca gcgcgcagta tttactgaag agagcctccg cgtcttccag cgtgcgccga
9180agctgatctt cgcttttgtg atacaggcag ctgcgggtga gggatcgcag agacctgttt
9240tttattttca gctcttgttc ttggcccctg ctctgttgaa atatagcata cagagtggga
9300aaaatcctgt ttctaagctc gcgggtcgat acgggttcgt tgggcgccag acgcagcgct
9360cctcctcctg ctgctgccgc cgctgtggat ttcttgggct ttgtcagagt cttgctatcc
9420ggtcgccttt gcttctgtgt ggccgctgct gttgctgccg ctgccgctgc cgccggtgca
9480gtatgggctg tagagatgac ggtagtaatg caggatgtta cgggggaagg ccacgccgtg
9540atggtagaga agaaagcggc gggcgaagga gatgttgccc ccacagtctt gcaagcaagc
9600aactatggcg ttcttgtgcc cgcgccatga gcggtagcct tggcgctgtt gttgctcttg
9660ggctaacggc ggcggctgct tggacttacc ggccctggtt ccagtggtgt cccatctacg
9720gttgggtcgg cgaacgggca gtgccggcgg cgcctgagga gcggaggttg tagccatgct
9780ggaaccggtt gccgatttct ggggcgccgg cgaggggaat gcgaccgagg gtgacggtgt
9840ttcgtctgac acctcttcga cctcggaagc ttcctcgtct aggctctccc agtcttccat
9900catgtcctcc tcctcctcgt ccaaaacctc ctctgcctga ctgtcccagt attcctcctc
9960gtccgtgggt ggcggcggca gctgcagctt ctttttgggt gccatcctgg gaagcaaggg
10020cccgcggctg ctgctgatag ggctgcggcg gcggggggat tgggttgagc tcctcgccgg
10080actgggggtc caagtaaacc ccccgtccct ttcgtagcag aaactcttgg cgggctttgt
10140tgatggcttg caattggcca agaatgtggc cctgggtaat gacgcaggcg gtaagctccg
10200catttggcgg gcgggattgg tcttcgtaga acctaatctc gtgggcgtgg tagtcctcag
10260gtacaaattt gcgaaggtaa gccgacgtcc acagccccgg agtgagtttc aaccccggag
10320ccgcggactt ttcgtcaggc gagggaccct gcagctcaaa ggtaccgata atttgacttt
10380cgttaagcag ctgcgaattg caaaccaggg agcggtgcgg ggtgcatagg ttgcagcgac
10440agtgacactc cagtagaccg tcaccgctca cgtcttccat tatgtcagag tggtaggcaa
10500ggtagttggc tagctgcaga aggtagcagt ggccccaaag cggcggaggg cattcgcggt
10560acttaatggg cacaaagtcg ctaggaagtg cacagcaggt ggcgggcaag attcctgagc
10620gctctaggat aaagttccta aagttctgca acatgctttg actggtgaag tctggcagac
10680cctgttgcag ggttttaagc aggcgttcgg ggaaaatgat gtccgccagg tgcgcggcca
10740cggagcgctc gttgaaggcc gtccataggt ccttcaagtt ttgctttagc agtttctgca
10800gctccttgag gttgcactcc tccaagcact gctgccaaac gcccatggcc gtctgccagg
10860tgtagcatag aaataagtaa acgcagtcgc ggacgtagtc gcggcgcgcc tcgcccttga
10920gcgtggaatg aagcacgttt tgcccaaggc ggttttcgtg caaaattcca aggtaggaga
10980ccaggttgca gagctccacg ttggagatct tgcaggcctg gcgtacgtag ccctgtcgaa
11040aggtgtagtg caatgtttcc tctagcttgc gctgcatctc cgggtcagca aagaaccgct
11100gcatgcactc aagctccacg gtaacgagca ctgcggccat cattagtttg cgtcgctcct
11160ccaagtcggc aggctcgcgc gtttgaagcc agcgcgctag ctgctcgtcg ccaactgcgg
11220gtaggccctc ctctgtttgt tcttgcaaat ttgcatccct ctccaggggc tgcgcacggc
11280gcacgatcag ctcactcatg actgtgctca tgaccttggg gggtaggtta agtgccgggt
11340aggcaaagtg ggtgacctcg atgctgcgtt ttagtacggc taggcgcgcg ttgtcaccct
11400cgagttccac caacactcca gagtgacttt cattttcgct gttttcctgt tgcagagcgt
11460ttgccgcgcg cttctcgtcg cgtccaagac cctcaaagat ttttggcact tcgttgagcg
11520aggcgatatc aggtatgaca gcgccctgcc gcaaggccag ctgcttgtcc gctcggctgc
11580ggttggcacg gcaggatagg ggtatcttgc agttttggaa aaagatgtga taggtggcaa
11640gcacctctgg cacggcaaat acggggtaga agttgaggcg cgggttgggc tcgcatgtgc
11700cgttttcttg gcgtttgggg ggtacgcgcg gtgagaatag gtggcgttcg taggcaaggc
11760tgacatccgc tatggcgagg ggcacatcgc tgcgctcttg caacgcgtcg cagataatgg
11820cgcactggcg ctgcagatgc ttcaacagca cgtcgtctcc cacatctagg tagtcgccat
11880gcctttcgtc cccccgcccg acttgttcct cgtttgcctc tgcgttgtcc tggtcttgct
11940ttttatcctc tgttggtact gagcggtcct cgtcgtcttc gcttacaaaa cctgggtcct
12000gctcgataat cacttcctcc tcctcaagcg ggggtgcctc gacggggaag gtggtaggcg
12060cgttggcggc atcggtggag gcggtggtgg cgaactcaga gggggcggtt aggctgtcct
12120tcttctcgac tgactccatg atctttttct gcctatagga gaaggaaatg gccagtcggg
12180aagaggagca gcgcgaaacc acccccgagc gcggacgcgg tgcggcgcga cgtcccccaa
12240ccatggagga cgtgtcgtcc ccgtccccgt cgccgccgcc tccccgggcg cccccaaaaa
12300agcggatgag gcggcgtatc gagtccgagg acgaggaaga ctcatcacaa gacgcgctgg
12360tgccgcgcac acccagcccg cggccatcga cctcggcggc ggatttggcc attgcgccca
12420agaagaaaaa gaagcgccct tctcccaagc ccgagcgccc gccatcacca gaggtaatcg
12480tggacagcga ggaagaaaga gaagatgtgg cgctacaaat ggtgggtttc agcaacccac
12540cggtgctaat caagcatggc aaaggaggta agcgcacagt gcggcggctg aatgaagacg
12600acccagtggc gcgtggtatg cggacgcaag aggaagagga agagcccagc gaagcggaaa
12660gtgaaattac ggtgatgaac ccgctgagtg tgccgatcgt gtctgcgtgg gagaagggca
12720tggaggctgc gcgcgcgctg atggacaagt accacgtgga taacgatcta aaggcgaact
12780tcaaactact gcctgaccaa gtggaagctc tggcggccgt atgcaagacc tggctgaacg
12840aggagcaccg cgggttgcag ctgaccttca ccagcaacaa gacctttgtg acgatgatgg
12900ggcgattcct gcaggcgtac ctgcagtcgt ttgcagaggt gacctacaag catcacgagc
12960ccacgggctg cgcgttgtgg ctgcaccgct gcgctgagat cgaaggcgag cttaagtgtc
13020tacacggaag cattatgata aataaggagc acgtgattga aatggatgtg acgagcgaaa
13080acgggcagcg cgcgctgaag gagcagtcta gcaaggccaa gatcgtgaag aaccggtggg
13140gccgaaatgt ggtgcagatc tccaacaccg acgcaaggtg ctgcgtgcac gacgcggcct
13200gtccggccaa tcagttttcc ggcaagtctt gcggcatgtt cttctctgaa ggcgcaaagg
13260ctcaggtggc ttttaagcag atcaaggctt ttatgcaggc gctgtatcct aacgcccaga
13320ccgggcacgg tcaccttttg atgccactac ggtgcgagtg caactcaaag cctgggcacg
13380cgcccttttt gggaaggcag ctaccaaagt tgactccgtt cgccctgagc aacgcggagg
13440acctggacgc ggatctgatc tccgacaaga gcgtgctggc cagcgtgcac cacccggcgc
13500tgatagtgtt ccagtgctgc aaccctgtgt atcgcaactc gcgcgcgcag ggcggaggcc
13560ccaactgcga cttcaagata tcggcgcccg acctgctaaa cgcgttggtg atggtgcgca
13620gcctgtggag tgaaaacttc accgagctgc cgcggatggt tgtgcctgag tttaagtgga
13680gcactaaaca ccagtatcgc aacgtgtccc tgccagtggc gcatagcgat gcgcggcaga
13740acccctttga tttttaaacg gcgcagacgg caagggtggg ggtaaataat cacccgagag
13800tgtacaaata aaagcatttg cctttattga aagtgtctct agtacattat ttttacatgt
13860ttttcaagtg acaaaaagaa gtggcgctcc taatctgcgc actgtggctg cggaagtagg
13920gcgagtggcg ctccaggaag ctgtagagct gttcctggtt gcgacgcagg gtgggctgta
13980cctggggact gttgagcatg gagttgggta ccccggtaat aaggttcatg gtggggttgt
14040gatccatggg agtttggggc cagttggcaa aggcgtggag aaacatgcag cagaatagtc
14100cacaggcggc cgagttgggc ccctgtacgc tttgggtgga cttttccagc gttatacagc
14160ggtcggggga agaagcaatg gcgctacggc gcaggagtga ctcgtactca aactggtaaa
14220cctgcttgag tcgctggtca gaaaagccaa agggctcaaa gaggtagcat gtttttgagt
14280gcgggttcca ggcaaaggcc atccagtgta cgcccccagt ctcggtccga gactcgaacc
14340gggggtcccg cgactcaacc cttggaaaat aaccctccgg ctacagggag cgagccactt
14400aatgctttcg ctttccagcc taaccgctta cgctgcgcgc ggccagtggc caaaaaagct
14460agcgcagcag ccgccgcgcc tggaaggaag ccaaaaggag cactcccccg ttgtctgacg
14520tcgcacacct gggttcgaca cgcgggcggt aaccgcatgg atcacggcgg acggccggat
14580acggggctcg aaccccggtc gtccgccatg atacccttgc gaatttatcc accagaccac
14640ggaagagtgc ccgcttacag gctctccttt tgcacggtag agcgtcaacg attgcgcgcg
14700cctgaccggc cagagcgtcc cgaccatgga gcactttttg ccgctgcgca acatctggaa
14760ccgcgtccgc gactttccgc gcgcctccac caccgccgcc ggcatcacct ggatgtccag
14820gtacatctac ggatatcatc gccttatgtt ggaagatctc gcccccggag ccccggccac
14880cctacgctgg cccctctacc gccagccgcc gccgcacttt ttggtgggat accagtacct
14940ggtgcggact tgcaacgact acgtatttga ctcgagggct tactcgcgtc tcaggtacac
15000cgagctctcg cagccgggtc accagaccgt taactggtcc gttatggcca actgcactta
15060caccatcaac acgggcgcat accaccgctt tgtggacatg gatgacttcc agtctaccct
15120cacgcaggtg cagcaggcca tattagccga gcgcgttgtc gccgacctag ccctgcttca
15180gccgatgagg ggcttcgggg tcacacgcat gggaggaaga gggcgccacc tacggccaaa
15240ctccgccgcc gccgcagcga tagatgcaag agatgcagga caagaggaag gagaagaaga
15300agtgccggta gaaaggctca tgcaagacta ctacaaagac ctgcgccgat gtcaaaacga
15360agcctggggc atggccgacc gcctgcgcat tcagcaggcc ggacccaagg acatggtgct
15420tctg
15424
393 3849 DNA Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
393 cggcgtcaat acgggataat accgcgccac
atagcagaac tttaaaagtg ctcatcattg 60gaaaacgttc ttcggggcga aaactctcaa
ggatcttacc gctgttgaga tccagttcga 120tgtaacccac tcgtgcaccc aactgatctt
cagcatcttt tactttcacc agcgtttctg 180ggtgagcaaa aacaggaagg caaaatgccg
caaaaaaggg aataagggcg acacggaaat 240gttgaatact catactcttc ctttttcaat
attattgaag catttatcag ggttattgtc 300tcatgagcgg atacatattt gaatgtattt
agaaaaataa acaaataggg gttccgcgca 360catttccccg aaaagtgcca cctaaattgt
aagcgttaat attttgttaa aattcgcgtt 420aaatttttgt taaatcagct cattttttaa
ccaataggcc gaaatcggca aaatccctta 480taaatcaaaa gaatagaccg agatagggtt
gagtgttgtt ccagtttgga acaagagtcc 540actattaaag aacgtggact ccaacgtcaa
agggcgaaaa accgtctatc agggcgatgg 600cccactacgt gaaccatcac cctaatcaag
ttttttgggg tcgaggtgcc gtaaagcact 660aaatcggaac cctaaaggga gcccccgatt
tagagcttga cggggaaagc cggcgaacgt 720ggcgagaaag gaagggaaga aagcgaaagg
agcgggcgct agggcgctgg caagtgtagc 780ggtcacgctg cgcgtaacca ccacacccgc
cgcgcttaat gcgccgctac agggcgcgtc 840ccattcgcca ttcaggctgc gcaactgttg
ggaagggcga tcggtgcggg cctcttcgct 900attacgccag ctgcgcgctc gctcgctcac
tgaggccgcc cgggcaaagc ccgggcgtcg 960ggcgaccttt ggtcgcccgg cctcagtgag
cgagcgagcg cgcagagagg gagtggccaa 1020ctccatcact aggggttcct tgtagttaat
gattaacccg ccatgctact tatctacgta 1080gccatgctct aggaagagta ccattgacgt
caataatgac gtatgttccc atagtaacgc 1140caatagggac tttccattga cgtcaatggg
tggagtattt acggtaaact gcccacttgg 1200cagtacatca agtgtatcag tggtttgtct
ggtcaaccac cgcggtctca gtggtgtacg 1260gtacaaaccc agctaccggt cgccaccatg
cccgccatga agatcgagtg ccgcatcacc 1320ggcaccctga acggcgtgga gttcgagctg
gtgggcggcg gagagggcac ccccgagcag 1380ggccgcatga ccaacaagat gaagagcacc
aaaggcgccc tgaccttcag cccctacctg 1440ctgagccacg tgatgggcta cggcttctac
cacttcggca cctaccccag cggctacgag 1500aaccccttcc tgcacgccat caacaacggc
ggctacacca acacccgcat cgagaagtac 1560gaggacggcg gcgtgctgca cgtgagcttc
agctaccgct acgaggccgg ccgcgtgatc 1620ggcgacttca aggtggtggg caccggcttc
cccgaggaca gcgtgatctt caccgacaag 1680atcatccgca gcaacgccac cgtggagcac
ctgcacccca tgggcgataa cgtgctggtg 1740ggcagcttcg cccgcacctt cagcctgcgc
gacggcggct actacagctt cgtggtggac 1800agccacatgc acttcaagag cgccatccac
cccagcatcc tgcagaacgg gggccccatg 1860ttcgccttcc gccgcgtgga ggagctgcac
agcaacaccg agctgggcat cgtggagtac 1920cagcacgcct tcaagacccc catcgccttc
gccagatctc gagctcgatg agtttggaca 1980aaccacaact agaatgcagt gaaaaaaatg
ctttatttgt gaaatttgtg atgctattgc 2040tttatttgtg ggcccgggat cttcctagag
catggctacg tagataagta gcatggcggg 2100ttaatcatta actacaagga acccctagtg
atggagttgg ccactccctc tctgcgcgct 2160cgctcgctca ctgaggccgg gcgaccaaag
gtcgcccgac gcccgggctt tgcccgggcg 2220gcctcagtga gcgagcgagc gcgcagctgc
attaatgaat cggccaacgc gcggggagag 2280gcggtttgcg tattgggcgc tcttccgctt
cctcgctcac tgactcgctg cgctcggtcg 2340ttcggctgcg gcgagcggta tcagctcact
caaaggcggt aatacggtta tccacagaat 2400caggggataa cgcaggaaag aacatgtgag
caaaaggcca gcaaaaggcc aggaaccgta 2460aaaaggccgc gttgctggcg tttttccata
ggctccgccc ccctgacgag catcacaaaa 2520atcgacgctc aagtcagagg tggcgaaacc
cgacaggact ataaagatac caggcgtttc 2580cccctggaag ctccctcgtg cgctctcctg
ttccgaccct gccgcttacc ggatacctgt 2640ccgcctttct cccttcggga agcgtggcgc
tttctcatag ctcacgctgt aggtatctca 2700gttcggtgta ggtcgttcgc tccaagctgg
gctgtgtgca cgaacccccc gttcagcccg 2760accgctgcgc cttatccggt aactatcgtc
ttgagtccaa cccggtaaga cacgacttat 2820cgccactggc agcagccact ggtaacagga
ttagcagagc gaggtatgta ggcggtgcta 2880cagagttctt gaagtggtgg cctaactacg
gctacactag aagaacagta tttggtatct 2940gcgctctgct gaagccagtt accttcggaa
aaagagttgg tagctcttga tccggcaaac 3000aaaccaccgc tggtagcggt ggtttttttg
tttgcaagca gcagattacg cgcagaaaaa 3060aaggatctca agaagatcct ttgatctttt
ctacggggtc tgacgctcag tggaacgaaa 3120actcacgtta agggattttg gtcatgagat
tatcaaaaag gatcttcacc tagatccttt 3180taaattaaaa atgaagtttt aaatcaatct
aaagtatata tgagtaaact tggtctgaca 3240gttaccaatg cttaatcagt gaggcaccta
tctcagcgat ctgtctattt cgttcatcca 3300tagttgcctg actccccgtc gtgtagataa
ctacgatacg ggagggctta ccatctggcc 3360ccagtgctgc aatgataccg cgagacccac
gctcaccggc tccagattta tcagcaataa 3420accagccagc cggaagggcc gagcgcagaa
gtggtcctgc aactttatcc gcctccatcc 3480agtctattaa ttgttgccgg gaagctagag
taagtagttc gccagttaat agtttgcgca 3540acgttgttgc cattgctaca ggcatcgtgg
tgtcacgctc gtcgtttggt atggcttcat 3600tcagctccgg ttcccaacga tcaaggcgag
ttacatgatc ccccatgttg tgcaaaaaag 3660cggttagctc cttcggtcct ccgatcgttg
tcagaagtaa gttggccgca gtgttatcac 3720tcatggttat ggcagcactg cataattctc
ttactgtcat gccatccgta agatgctttt 3780ctgtgactgg tgagtactca accaagtcat
tctgagaata gtgtatgcgg cgaccgagtt 3840gctcttgcc
3849
394 7336 DNA Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
394 atgccggggt tttacgagat tgtgattaag gtccccagcg accttgacga gcatctgccc
60ggcatttctg acagctttgt gaactgggtg gccgagaagg aatgggagtt gccgccagat
120tctgacatgg atctgaatct gattgagcag gcacccctga ccgtggccga gaagctgcag
180cgcgactttc tgacggaatg gcgccgtgtg agtaaggccc cggaggctct tttctttgtg
240caatttgaga agggagagag ctacttccac atgcacgtgc tcgtggaaac caccggggtg
300aaatccatgg ttttgggacg tttcctgagt cagattcgcg aaaaactgat tcagagaatt
360taccgcggga tcgagccgac tttgccaaac tggttcgcgg tcacaaagac cagaaatggc
420gccggaggcg ggaacaaggt ggtggatgag tgctacatcc ccaattactt gctccccaaa
480acccagcctg agctccagtg ggcgtggact aatatggaac agtatttaag cgcctgtttg
540aatctcacgg agcgtaaacg gttggtggcg cagcatctga cgcacgtgtc gcagacgcag
600gagcagaaca aagagaatca gaatcccaat tctgatgcgc cggtgatcag atcaaaaact
660tcagccaggt acatggagct ggtcgggtgg ctcgtggaca aggggattac ctcggagaag
720cagtggatcc aggaggacca ggcctcatac atctccttca atgcggcctc caactcgcgg
780tcccaaatca aggctgcctt ggacaatgcg ggaaagatta tgagcctgac taaaaccgcc
840cccgactacc tggtgggcca gcagcccgtg gaggacattt ccagcaatcg gatttataaa
900attttggaac taaacgggta cgatccccaa tatgcggctt ccgtctttct gggatgggcc
960acgaaaaagt tcggcaagag gaacaccatc tggctgtttg ggcctgcaac taccgggaag
1020accaacatcg cggaggccat agcccacact gtgcccttct acgggtgcgt aaactggacc
1080aatgagaact ttcccttcaa cgactgtgtc gacaagatgg tgatctggtg ggaggagggg
1140aagatgaccg ccaaggtcgt ggagtcggcc aaagccattc tcggaggaag caaggtgcgc
1200gtggaccaga aatgcaagtc ctcggcccag atagacccga ctcccgtgat cgtcacctcc
1260aacaccaaca tgtgcgccgt gattgacggg aactcaacga ccttcgaaca ccagcagccg
1320ttgcaagacc ggatgttcaa atttgaactc acccgccgtc tggatcatga ctttgggaag
1380gtcaccaagc aggaagtcaa agactttttc cggtgggcaa aggatcacgt ggttgaggtg
1440gagcatgaat tctacgtcaa aaagggtgga gccaagaaaa gacccgcccc cagtgacgca
1500gatataagtg agcccaaacg ggtgcgcgag tcagttgcgc agccatcgac gtcagacgcg
1560gaagcttcga tcaactacgc agacaggtac caaaacaaat gttctcgtca cgtgggcatg
1620aatctgatgc tgtttccctg cagacaatgc gagagaatga atcagaattc aaatatctgc
1680ttcactcacg gacagaaaga ctgtttagag tgctttcccg tgtcagaatc tcaacccgtt
1740tctgtcgtca aaaaggcgta tcagaaactg tgctacattc atcatatcat gggaaaggtg
1800ccagacgctt gcactgcctg cgatctggtc aatgtggatt tggatgactg catctttgaa
1860caataaatga tttaaatcag gtatggctgc cgatggttat cttccagatt ggctcgagga
1920caacctctct gagggcattc gcgagtggtg ggcgctgaaa cctggagccc cgaagcccaa
1980agccaaccag caaaagcagg acgacggccg gggtctggtg cttcctggct acaagtacct
2040cggacccttc aacggactcg acaaggggga gcccgtcaac gcggcggacg cagcggccct
2100cgagcacgac aaggcctacg accagcagct gcaggcgggt gacaatccgt acctgcggta
2160taaccacgcc gacgccgagt ttcaggagcg tctgcaagaa gatacgtctt ttgggggcaa
2220cctcgggcga gcagtcttcc aggccaagaa gcgggttctc gaacctctcg gtctggttga
2280ggaaggcgct aagacggctc ctggaaagaa gagaccggta gagccatcac cccagcgttc
2340tccagactcc tctacgggca tcggcaagaa aggccaacag cccgccagaa aaagactcaa
2400ttttggtcag actggcgact cagagtcagt tccagaccct caacctctcg gagaacctcc
2460agcagcgccc tctggtgtgg gacctaatac aatggctgca ggcggtggcg caccaatggc
2520agacaataac gaaggcgccg acggagtggg tagttcctcg ggaaattggc attgcgattc
2580cacatggctg ggcgacagag tcatcaccac cagcacccga acctgggccc tgcccaccta
2640caacaaccac ctctacaagc aaatctccaa cgggacatcg ggaggagcca ccaacgacaa
2700cacctacttc ggctacagca ccccctgggg gtattttgac tttaacagat tccactgcca
2760cttttcacca cgtgactggc agcgactcat caacaacaac tggggattcc ggcccaagag
2820actcagcttc aagctcttca acatccaggt caaggaggtc acgcagaatg aaggcaccaa
2880gaccatcgcc aataacctca ccagcaccat ccaggtgttt acggactcgg agtaccagct
2940gccgtacgtt ctcggctctg cccaccaggg ctgcctgcct ccgttcccgg cggacgtgtt
3000catgattccc cagtacggct acctaacact caacaacggt agtcaggccg tgggacgctc
3060ctccttctac tgcctggaat actttccttc gcagatgctg agaaccggca acaacttcca
3120gtttacttac accttcgagg acgtgccttt ccacagcagc tacgcccaca gccagagctt
3180ggaccggctg atgaatcctc tgattgacca gtacctgtac tacttgtctc ggactcaaac
3240aacaggaggc acggcaaata cgcagactct gggcttcagc caaggtgggc ctaatacaat
3300ggccaatcag gcaaagaact ggctgccagg accctgttac cgccaacaac gcgtctcaac
3360gacaaccggg caaaacaaca atagcaactt tgcctggact gctgggacca aataccatct
3420gaatggaaga aattcattgg ctaatcctgg catcgctatg gcaacacaca aagacgacga
3480ggagcgtttt tttcccagta acgggatcct gatttttggc aaacaaaatg ctgccagaga
3540caatgcggat tacagcgatg tcatgctcac cagcgaggaa gaaatcaaaa ccactaaccc
3600tgtggctaca gaggaatacg gtatcgtggc agataacttg cagcagcaaa acacggctcc
3660tcaaattgga actgtcaaca gccagggggc cttacccggt atggtctggc agaaccggga
3720cgtgtacctg cagggtccca tctgggccaa gattcctcac acggacggca acttccaccc
3780gtctccgctg atgggcggct ttggcctgaa acatcctccg cctcagatcc tgatcaagaa
3840cacgcctgta cctgcggatc ctccgaccac cttcaaccag tcaaagctga actctttcat
3900cacgcaatac agcaccggac aggtcagcgt ggaaattgaa tgggagctgc agaaggaaaa
3960cagcaagcgc tggaaccccg agatccagta cacctccaac tactacaaat ctacaagtgt
4020ggactttgct gttaatacag aaggcgtgta ctctgaaccc cgccccattg gcacccgtta
4080cctcacccgt aatctgtaat tgcctgttaa tcaataaacc ggttgattcg tttcagttga
4140actttggtct ctgcgaaggg cgaattcgtt taaacctgca ggactagagg tcctgtatta
4200gaggtcacgt gagtgttttg cgacattttg cgacaccatg tggtcacgct gggtatttaa
4260gcccgagtga gcacgcaggg tctccatttt gaagcgggag gtttgaacgc gcagccgcca
4320agccgaattc tgcagatatc catcacactg gcggccgctc gactagagcg gccgccaccg
4380cggtggagct ccagcttttg ttccctttag tgagggttaa ttgcgcgctt ggcgtaatca
4440tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca caacatacga
4500gccggaagca taaagtgtaa agcctggggt gcctaatgag tgagctaact cacattaatt
4560gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt cgtgccagct gcattaatga
4620atcggccaac gcgcggggag aggcggtttg cgtattgggc gctcttccgc ttcctcgctc
4680actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg
4740gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc
4800cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc
4860ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga
4920ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc
4980ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat
5040agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg
5100cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc
5160aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga
5220gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact
5280agaagaacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt
5340ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag
5400cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg
5460tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag attatcaaaa
5520aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata
5580tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc tatctcagcg
5640atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat aactacgata
5700cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc acgctcaccg
5760gctccagatt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag aagtggtcct
5820gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag agtaagtagt
5880tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt ggtgtcacgc
5940tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg agttacatga
6000tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt tgtcagaagt
6060aagttggccg cagtgttatc actcatggtt atggcagcac tgcataattc tcttactgtc
6120atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc attctgagaa
6180tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca
6240catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg aaaactctca
6300aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc caactgatct
6360tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc
6420gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt cctttttcaa
6480tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt
6540tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctaaattg
6600taagcgttaa tattttgtta aaattcgcgt taaatttttg ttaaatcagc tcatttttta
6660accaataggc cgaaatcggc aaaatccctt ataaatcaaa agaatagacc gagatagggt
6720tgagtgttgt tccagtttgg aacaagagtc cactattaaa gaacgtggac tccaacgtca
6780aagggcgaaa aaccgtctat cagggcgatg gcccactacg tgaaccatca ccctaatcaa
6840gttttttggg gtcgaggtgc cgtaaagcac taaatcggaa ccctaaaggg agcccccgat
6900ttagagcttg acggggaaag ccggcgaacg tggcgagaaa ggaagggaag aaagcgaaag
6960gagcgggcgc tagggcgctg gcaagtgtag cggtcacgct gcgcgtaacc accacacccg
7020ccgcgcttaa tgcgccgcta cagggcgcgt cccattcgcc attcaggctg cgcaactgtt
7080gggaagggcg atcggtgcgg gcctcttcgc tattacgcca gctggcgaaa gggggatgtg
7140ctgcaaggcg attaagttgg gtaacgccag ggttttccca gtcacgacgt tgtaaaacga
7200cggccagtga gcgcgcgtaa tacgactcac tatagggcga attgggtacc gggccccccc
7260tcgatcgagg tcgacggtat cgggggagct cgcagggtct ccattttgaa gcgggaggtt
7320tgaacgcgca gccgcc
7336
SEQ ID NO: 395; length: 969; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 395:
ccccaactgg ggtaaccttt gggctccccg
ggcgcgacta taagctgcga gcaacttcac 60ttgggtatgc cggcggtagc gcttaccgtt
cgtataatgt atgctatacg aagttatccg 120aagccgctag cggtggtttg tctggtcaac
caccgcggtc tcagtggtgt acggtacaaa 180cccagctacc ggtcgccacc atgcccgcca
tgaagatcga gtgccgcatc accggcaccc 240tgaacggcgt ggagttcgag ctggtgggcg
gcggagaggg cacccccgag cagggccgca 300tgaccaacaa gatgaagagc accaaaggcg
ccctgacctt cagcccctac ctgctgagcc 360acgtgatggg ctacggcttc taccacttcg
gcacctaccc cagcggctac gagaacccct 420tcctgcacgc catcaacaac ggcggctaca
ccaacacccg catcgagaag tacgaggacg 480gcggcgtgct gcacgtgagc ttcagctacc
gctacgaggc cggccgcgtg atcggcgact 540tcaaggtggt gggcaccggc ttccccgagg
acagcgtgat cttcaccgac aagatcatcc 600gcagcaacgc caccgtggag cacctgcacc
ccatgggcga taacgtgctg gtgggcagct 660tcgcccgcac cttcagcctg cgcgacggcg
gctactacag cttcgtggtg gacagccaca 720tgcacttcaa gagcgccatc caccccagca
tcctgcagaa cgggggcccc atgttcgcct 780tccgccgcgt ggaggagctg cacagcaaca
ccgagctggg catcgtggag taccagcacg 840ccttcaagac ccccatcgcc ttcgccagat
ctcgagctcg atgagtttgg acaaaccaca 900actagaatgc agtgaaaaaa atgctttatt
tgtgaaattt gtgatgctat tgctttattt 960gtgggcccg
969
SEQ ID NO: 396; length: 4769; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 396:
tgatcccctg cgccatcaga tccttggcgg cgagaaagcc atccagttta ctttgcaggg
60cttcccaacc ttaccagagg gcgccccagc tggcaattcc ggttcgcttg ctgtccataa
120aaccgcccag tctagctatc gccatgtaag cccactgcaa gctacctgct ttctctttgc
180gcttgcgttt tcccttgtcc agatagccca gtagctgaca ttcatccggg gtcagcaccg
240tttctgcgga ctggctttct acgtgctcga ggggggccaa acggtctcca gcttggctgt
300tttggcggat gagagaagat tttcagcctg atacagatta aatcagaacg cagaagcggt
360ctgataaaac agaatttgcc tggcggcagt agcgcggtgg tcccacctga ccccatgccg
420aactcagaag tgaaacgccg tagcgccgat ggtagtgtgg ggtctcccca tgcgagagta
480gggaactgcc aggcatcaaa taaaacgaaa ggctcagtcg aaagactggg cctttcgttt
540tatctgttgt ttgtcggtga acgctctcct gagtaggaca aatccgccgg gagcggattt
600gaacgttgcg aagcaacggc ccggagggtg gcgggcagga cgcccgccat aaactgccag
660gcatcaaatt aagcagaagg ccatcctgac ggatggcctt tttgcgtttc tacaaactct
720tttgtttatt tttctaaata cattcaaata tgtatccgct catgaccaaa atcccttaac
780gtgagttttc gttccactga gcgtcagacc ccgtagaaaa gatcaaagga tcttcttgag
840atcctttttt tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg ctaccagcgg
900tggtttgttt gccggatcaa gagctaccaa ctctttttcc gaaggtaact ggcttcagca
960gagcgcagat accaaatact gtccttctag tgtagccgta gttaggccac cacttcaaga
1020actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg gctgctgcca
1080gtggcgataa gtcgtgtctt accgggttgg actcaagacg atagttaccg gataaggcgc
1140agcggtcggg ctgaacgggg ggttcgtgca cacagcccag cttggagcga acgacctaca
1200ccgaactgag atacctacag cgtgagctat gagaaagcgc cacgcttccc gaagggagaa
1260aggcggacag gtatccggta agcggcaggg tcggaacagg agagcgcacg agggagcttc
1320cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc tgacttgagc
1380gtcgattttt gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc agcaacgcgg
1440cctttttacg gttcctggcc ttttgctggc cttttgctca catgttcttt cctgcgttat
1500cccctgattc tgtggataac cgtattaccg cctttgagtg agctgatacc gctcgccgca
1560gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc ggaagagcgc ctgatgcggt
1620attttctcct tacgcatctg tgcggtattt cacaccgcat atggtgcact ctcagtacaa
1680tctgctctga tgccgcatag ttaagccagt atacactccg ctatcgctac gtgactgggt
1740catggctgcg ccccgacacc cgccaacacc cgctgacgcg ccctgacggg cttgtctgct
1800cccggcatcc gcttacagac aagctgtgac cgtctccggg agctgcatgt gtcagaggtt
1860ttcaccgtca tcaccgaaac gcgcgaggca gcagatcaat tcgcgcgcga aggcgaagcg
1920gcatgcataa tgtgcctgtc aaatggacga agcagggatt ctgcaaaccc tatgctactc
1980cgtcaagccg tcaattgtct gattcgttac caattatgac aacttgacgg ctacatcatt
2040cactttttct tcacaaccgg cacggaactc gctcgggctg gccccggtgc attttttaaa
2100tacccgcgag aaatagagtt gatcgtcaaa accaacattg cgaccgacgg tggcgatagg
2160catccgggtg gtgctcaaaa gcagcttcgc ctggctgata cgttggtcct cgcgccagct
2220taagacgcta atccctaact gctggcggaa aagatgtgac agacgcgacg gcgacaagca
2280aacatgctgt gcgacgctgg cgatacatta ccctgttatc cctagatgac attaccctgt
2340tatcccagat gacattaccc tgttatccct agatgacatt accctgttat ccctagatga
2400catttaccct gttatcccta gatgacatta ccctgttatc ccagatgaca ttaccctgtt
2460atccctagat acattaccct gttatcccag atgacatacc ctgttatccc tagatgacat
2520taccctgtta tcccagatga cattaccctg ttatccctag atacattacc ctgttatccc
2580agatgacata ccctgttatc cctagatgac attaccctgt tatcccagat gacattaccc
2640tgttatccct agatacatta ccctgttatc ccagatgaca taccctgtta tccctagatg
2700acattaccct gttatcccag atgacattac cctgttatcc ctagatacat taccctgtta
2760tcccagatga cataccctgt tatccctaga tgacattacc ctgttatccc agatgacatt
2820accctgttat ccctagatac attaccctgt tatcccagat gacataccct gttatcccta
2880gatgacatta ccctgttatc ccagatgaca ttaccctgtt atccctagat acattaccct
2940gttatcccag atgacatacc ctgttatccc tagatgacat taccctgtta tcccagataa
3000actcaatgat gatgatgatg atggtcgaga ctcagcggcc gcggtgccag ggcgtgccct
3060tgggctcccc gggcgcgatg cccgccatga agatcgagtg ccgcatcacc ggcaccctga
3120acggcgtgga gttcgagctg gtgggcggcg gagagggcac ccccgagcag ggccgcatga
3180ccaacaagat gaagagcacc aaaggcgccc tgaccttcag cccctacctg ctgagccacg
3240tgatgggcta cggcttctac cacttcggca cctaccccag cggctacgag aaccccttcc
3300tgcacgccat caacaacggc ggctacacca acacccgcat cgagaagtac gaggacggcg
3360gcgtgctgca cgtgagcttc agctaccgct acgaggccgg ccgcgtgatc ggcgacttca
3420aggtggtggg caccggcttc cccgaggaca gcgtgatctt caccgacaag atcatccgca
3480gcaacgccac cgtggagcac ctgcacccca tgggcgataa cgtgctggtg ggcagcttcg
3540cccgcacctt cagcctgcgc gacggcggct actacagctt cgtggtggac agccacatgc
3600acttcaagag cgccatccac cccagcatcc tgcagaacgg gggccccatg ttcgccttcc
3660gccgcgtgga ggagctgcac agcaacaccg agctgggcat cgtggagtac cagcacgcct
3720tcaagacccc catcgccttc gccagatctc gagctcgagg tggtttgtct ggtcaaccac
3780cgcggtctca gtggtgtacg gtacaaaccc accccaactg gggtaacctt tgagttctct
3840cagttggggg taatcagcat catgatgtgg taccacatca tgatgctgat tataagaatg
3900cggccgccac actctagtgg atctcgagtt aataattcag aagaactcgt caagaaggcg
3960atagaaggcg atgcgctgcg aatcgggagc ggcgataccg taaagcacga ggaagcggtc
4020agcccattcg ccgccaagct cttcagcaat atcacgggta gccaacgcta tgtcctgata
4080gcggtccgcc acacccagcc ggccacagtc gatgaatcca gaaaagcggc cattttccac
4140catgatattc ggcaagcagg catcgccatg ggtcacgacg agatcctcgc cgtcgggcat
4200gctcgccttg agcctggcga acagttcggc tggcgcgagc ccctgatgct cttcgtccag
4260atcatcctga tcgacaagac cggcttccat ccgagtacgt gctcgctcga tgcgatgttt
4320cgcttggtgg tcgaatgggc aggtagccgg atcaagcgta tgcagccgcc gcattgcatc
4380agccatgatg gatactttct cggcaggagc aaggtgtaga tgacatggag atcctgcccc
4440ggcacttcgc ccaatagcag ccagtccctt cccgcttcag tgacaacgtc gagcacagct
4500gcgcaaggaa cgcccgtcgt ggccagccac gatagccgcg ctgcctcgtc ttgcagttca
4560ttcagggcac cggacaggtc ggtcttgaca aaaagaaccg ggcgcccctg cgctgacagc
4620cggaacacgg cggcatcaga gcagccgatt gtctgttgtg cccagtcata gccgaatagc
4680ctctccaccc aagcggccgg agaacctgcg tgcaatccat cttgttcaat catgcgaaac
4740gatcctcatc ctgtctcttg atcagagct
4769
SEQ ID NO: 397; length: 797; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 397:
ccccaactgg ggtaaccttt gggctccccg
ggcgcgatgg tgagcaaggg cgaggaggat 60aacatggcca tcatcaagga gttcatgcgc
ttcaaggtgc acatggaggg ctccgtgaac 120ggccacgagt tcgagatcga gggcgagggc
gagggccgcc cctacgaggg cacccagacc 180gccaagctga aggtgaccaa gggtggcccc
ctgcccttcg cctgggacat cctgtcccct 240cagttcatgt acggctccaa ggcctacgtg
aagcaccccg ccgacatccc cgactacttg 300aagctgtcct tccccgaggg cttcaagtgg
gagcgcgtga tgaacttcga ggacggcggc 360gtggtgaccg tgacccagga ctcctccctg
caggacggcg agttcatcta caaggtgaag 420ctgcgcggca ccaacttccc ctccgacggc
cccgtaatgc agaagaagac catgggctgg 480gaggcctcct ccgagcggat gtaccccgag
gacggcgccc tgaagggcga gatcaagcag 540aggctgaagc tgaaggacgg cggccactac
gacgctgagg tcaagaccac ctacaaggcc 600aagaagcccg tgcagctgcc cggcgcctac
aacgtcaaca tcaagttgga catcacctcc 660cacaacgagg actacaccat cgtggaacag
tacgaacgcg ccgagggccg ccactccacc 720ggcggcatgg acgagctgta caagggtggt
ttgtctggtc aaccaccgcg agctcagtgg 780tgtacggtac aaaccca
797
SEQ ID NO: 398; length: 815; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 398:
ccccaactgg ggtaaccttt gggctccccg ggcgcggccg ccaccatggt gtccaagggt
60gaggaacttt ttaccggagt ggtgccgata ctggtagagc tggatggcga cgtaaacggg
120cacaagttca gtgtacgggg agagggcgag ggcgacgcta cgaatgggaa attgactttg
180aaatttattt gcaccacggg caaattgccg gtcccgtggc caactttggt tacgaccttg
240acctatggcg ttcagtgttt ctcacggtac ccagaccaca tgaaacagca tgactttttt
300aagtcagcga tgccggaggg atatgtgcaa gaacggacta tctcatttaa agatgatggc
360acatataaga caagagcgga agtcaaattc gaaggggaca ccctcgtcaa tcgaatagaa
420ctcaagggaa tagacttcaa agaagatggt aatatactgg ggcacaaact cgaatacaat
480ttcaacagtc ataacgtcta catcactgcc gacaaacaaa aaaatgggat caaagcgaac
540ttcaaaatcc gacataatgt cgaggatggg agcgtccaac tggcagacca ttaccagcaa
600aatactccaa taggtgatgg tccagtgctt ttgccagata atcattatct tagctatcag
660agcaagttga gtaaggatcc gaatgaaaag cgagatcaca tggtcttgct ggagtttgtt
720acggcggctg gtatcacact tggtatggat gaattgtaca agggtggttt gtctggtcaa
780ccaccgcgga ctcagtggtg tacggtacaa accca
815
SEQ ID NO: 399; length: 1660; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 399:
ccccaactgg ggtaaccttt gggctccccg
ggcgcggcca ccatgaaatg ggttactttc 60atatctctgt tgtttttgtt ttcctctagt
tccagggcca tgccgtcttc tgtctcgtgg 120ggcatcctcc tgctggcagg cctgtgctgc
ctggtccctg tctccctggc tgaggatccc 180cagggagatg ctgcccagaa gacagataca
tcccaccatg atcaggatca cccaaccttc 240aacaagatca cccccaacct ggctgagttc
gccttcagcc tataccgcca gctggcacac 300cagtccaaca gcaccaatat cttcttctcc
ccagtgagca tcgctacagc ctttgcaatg 360ctctccctgg ggaccaaggc tgacactcac
gatgaaatcc tggagggcct gaatttcaac 420ctcacggaga ttccggaggc tcagatccat
gaaggcttcc aggaactcct ccgtaccctc 480aaccagccag acagccagct ccagctgacc
accggcaatg gcctgttcct cagcgagggc 540ctgaagctag tggataagtt tttggaggat
gttaaaaagt tgtaccactc agaagccttc 600actgtcaact tcggggacac cgaagaggcc
aagaaacaga tcaacgatta cgtggagaag 660ggtactcaag ggaaaattgt ggatttggtc
aaggagcttg acagagacac agtttttgct 720ctggtgaatt acatcttctt taaaggcaaa
tgggagagac cctttgaagt caaggacacc 780gaggaagagg acttccacgt ggaccaggtg
accaccgtga aggtgcctat gatgaagcgt 840ttaggcatgt ttaacatcca gcactgtaag
aagctgtcca gctgggtgct gctgatgaaa 900tacctgggca atgccaccgc catcttcttc
ctgcctgatg aggggaaact acagcacctg 960gaaaatgaac tcacccacga tatcatcacc
aagttcctgg aaaatgaaga cagaaggtct 1020gccagcttac atttacccaa actgtccatt
actggaacct atgatctgaa gagcgtcctg 1080ggtcaactgg gcatcactaa ggtcttcagc
aatggggctg acctctccgg ggtcacagag 1140gaggcacccc tgaagctctc caaggccgtg
cataaggctg tgctgaccat cgacgagaaa 1200gggactgaag ctgctggggc catgttttta
gaggccatac ccatgtctat cccccccgag 1260gtcaagttca acaaaccctt tgtcttctta
atgattgaac aaaataccaa gtctcccctc 1320ttcatgggaa aagtggtgaa tcccacccaa
aaataagaat tctaactaga gctcgctgat 1380cagcctcgac tgtgccttct agttgccagc
catctgttgt ttgcccctcc cccgtgcctt 1440ccttgaccct ggaaggtgcc actcccactg
tcctttccta ataaaatgag gaaattgcat 1500cgcattgtct gagtaggtgt cattctattc
tggggggtgg ggtggggcag gacagcaagg 1560gggaggattg ggaagagaat agcaggcatg
ctggggagcg agctcgaggt ggtttgtctg 1620gtcaaccacc gcggtctcag tggtgtacgg
tacaaaccca 1660
SEQ ID NO: 400; length: 4906; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 400:
ccccaactgg ggtaaccttt gggctccccg ggcgcggcca ccatgaaatg ggttactttc
60atatctctgt tgtttttgtt ttcctctagt tccagggcca tgacgaggat tttgacagct
120ttcaaagtgg tgaggacact gaagactggt tttggcttta ccaatgtgac tgcacaccaa
180aaatggaaat tttcaagacc tggcatcagg ctcctttctg tcaaggcaca gacagcacac
240attgtcctgg aagatggaac taagatgaaa ggttactcct ttggccatcc atcctctgtt
300gctggtgaag tggtttttaa tactggcctg ggagggtacc cagaagctat tactgaccct
360gcctacaaag gacagattct cacaatggcc aaccctatta ttgggaatgg tggagctcct
420gatactactg ctctggatga actgggactt agcaaatatt tggagtctaa tggaatcaag
480gtttcaggtt tgctggtgct ggattatagt aaagactaca accactggct ggctaccaag
540agtttagggc aatggctaca ggaagaaaag gttcctgcaa tttatggagt ggacacaaga
600atgctgacta aaataattcg ggataagggt accatgcttg ggaagattga atttgaaggt
660cagcctgtgg attttgtgga tccaaataaa cagaatttga ttgctgaggt ttcaaccaag
720gatgtcaaag tgtacggcaa aggaaacccc acaaaagtgg tagctgtaga ctgtgggatt
780aaaaacaatg taatccgcct gctagtaaag cgaggagctg aagtgcactt agttccctgg
840aaccatgatt tcaccaagat ggagtatgat gggattttga tcgcgggagg accggggaac
900ccagctcttg cagaaccact aattcagaat gtcagaaaga ttttggagag tgatcgcaag
960gagccattgt ttggaatcag tacaggaaac ttaataacag gattggctgc tggtgccaaa
1020acctacaaga tgtccatggc caacagaggg cagaatcagc ctgttttgaa tatcacaaac
1080aaacaggctt tcattactgc tcagaatcat ggctatgcct tggacaacac cctccctgct
1140ggctggaaac cactttttgt gaatgtcaac gatcaaacaa atgaggggat tatgcatgag
1200agcaaaccct tcttcgctgt gcagttccac ccagaggtca ccccggggcc aatagacact
1260gagtacctgt ttgattcctt tttctcactg ataaagaaag gaaaagctac caccattaca
1320tcagtcttac cgaagccagc actagttgca tctcgggttg aggtttccaa agtccttatt
1380ctaggatcag gaggtctgtc cattggtcag gctggagaat ttgattactc aggatctcaa
1440gctgtaaaag ccatgaagga agaaaatgtc aaaactgttc tgatgaaccc aaacattgca
1500tcagtccaga ccaatgaggt gggcttaaag caagcggata ctgtctactt tcttcccatc
1560acccctcagt ttgtcacaga ggtcatcaag gcagaacagc cagatgggtt aattctgggc
1620atgggtggcc agacagctct gaactgtgga gtggaactat tcaagagagg tgtgctcaag
1680gaatatggtg tgaaagtcct gggaacttca gttgagtcca ttatggctac ggaagacagg
1740cagctgtttt cagataaact aaatgagatc aatgaaaaga ttgctccaag ttttgcagtg
1800gaatcgattg aggatgcact gaaggcagca gacaccattg gctacccagt gatgatccgt
1860tccgcctatg cactgggtgg gttaggctca ggcatctgtc ccaacagaga gactttgatg
1920gacctcagca caaaggcctt tgctatgacc aaccaaattc tggtggagaa gtcagtgaca
1980ggttggaaag aaatagaata tgaagtggtt cgagatgctg atgacaattg tgtcactgtc
2040tgtaacatgg aaaatgttga tgccatgggt gttcacacag gtgactcagt tgttgtggct
2100cctgcccaga cactctccaa tgccgagttt cagatgttga gacgtacttc aatcaatgtt
2160gttcgccact tgggcattgt gggtgaatgc aacattcagt ttgcccttca tcctacctca
2220atggaatact gcatcattga agtgaatgcc agactgtccc gaagctctgc tctggcctca
2280aaagccactg gctacccatt ggcattcatt gctgcaaaga ttgccctagg aatcccactt
2340ccagaaatta agaacgtcgt atccgggaag acatcagcct gttttgaacc tagcctggat
2400tacatggtca ccaagattcc ccgctgggat cttgaccgtt ttcatggaac atctagccga
2460attggtagct ctatgaaaag tgtaggagag gtcatggcta ttggtcgtac ctttgaggag
2520agtttccaga aagctttacg gatgtgccac ccatctatag aaggtttcac tccccgtctc
2580ccaatgaaca aagaatggcc atctaattta gatcttagaa aagagttgtc tgaaccaagc
2640agcacgcgta tctatgccat tgccaaggcc attgatgaca acatgtccct tgatgagatt
2700gagaagctca catacattga caagtggttt ttgtataaga tgcgtgatat tttaaacatg
2760gaaaagacac tgaaaggcct caacagtgag tccatgacag aagaaaccct gaaaagggca
2820aaggagattg ggttctcaga taagcagatt tcaaaatgcc ttgggctcac tgaggcccag
2880acaagggagc tgaggttaaa gaaaaacatc cacccttggg ttaaacagat tgatacactg
2940gctgcagaat acccatcagt aacaaactat ctctatgtta cctacaatgg tcaggagcat
3000gatgtcaatt ttgatgacca tggaatgatg gtgctaggct gtggtccata tcacattggc
3060agcagtgtgg aatttgattg gtgtgctgtc tctagtatcc gcacactgcg tcaacttggc
3120aagaagacgg tggtggtgaa ttgcaatcct gagactgtga gcacagactt tgatgagtgt
3180gacaaactgt actttgaaga gttgtccttg gagagaatcc tagacatcta ccatcaggag
3240gcatgtggtg gctgcatcat atcagttgga ggccagattc caaacaacct ggcagttcct
3300ctatacaaga atggtgtcaa gatcatgggc acaagccccc tgcagatcga cagggctgag
3360gatcgctcca tcttctcagc tgtcttggat gagctgaagg tggctcaggc accttggaaa
3420gctgttaata ctttgaatga agcactggaa tttgcaaagt ctgtggacta cccctgcttg
3480ttgaggcctt cctatgtttt gagtgggtct gctatgaatg tggtattctc tgaggatgag
3540atgaaaaaat tcctagaaga ggcgactaga gtttctcagg agcacccagt ggtgctgaca
3600aaatttgttg aaggggcccg agaagtagaa atggacgctg ttggcaaaga tggaagggtt
3660atctctcatg ccatctctga acatgttgaa gatgcaggtg tccactcggg agatgccact
3720ctgatgctgc ccacacaaac catcagccaa ggggccattg aaaaggtgaa ggatgctacc
3780cggaagattg caaaggcttt tgccatctct ggtccattca acgtccaatt tcttgtcaaa
3840ggaaatgatg tcttggtgat tgagtgtaac ttgagagctt ctcgatcctt cccctttgtt
3900tccaagactc ttggggttga cttcattgat gtggccacca aggtgatgat tggagagaat
3960gttgatgaga aacatcttcc aacattggac catcccataa ttcctgctga ctatgttgca
4020attaaggctc ccatgttttc ctggccccgg ttgagggatg ctgaccccat tctgagatgt
4080gagatggctt ccactggaga ggtggcttgc tttggtgaag gtattcatac agccttccta
4140aaggcaatgc tttccacagg atttaagata ccccagaaag gcatcctgat aggcatccag
4200caatcattcc ggccaagatt ccttggtgtg gctgaacaat tacacaatga aggtttcaag
4260ctgtttgcca cggaagccac atcagactgg ctcaacgcca acaatgtccc tgccacccca
4320gtggcatggc cgtctcaaga aggacagaat cccagcctct cttccatcag aaaattgatt
4380agagatggca gcattgacct agtgattaac cttcccaaca acaacactaa atttgtccat
4440gataattatg tgattcggag gacagctgtt gatagtggaa tccctctcct cactaatttt
4500caggtgacca aactttttgc tgaagctgtg cagaaatctc gcaaggtgga ctccaagagt
4560cttttccact acaggcagta cagtgctgga aaagcagcat aggaattcta actagagctc
4620gctgatcagc ctcgactgtg ccttctagtt gccagccatc tgttgtttgc ccctcccccg
4680tgccttcctt gaccctggaa ggtgccactc ccactgtcct ttcctaataa aatgaggaaa
4740ttgcatcgca ttgtctgagt aggtgtcatt ctattctggg gggtggggtg gggcaggaca
4800gcaaggggga ggattgggaa gagaatagca ggcatgctgg ggagcgagct cgaggtggtt
4860tgtctggtca accaccgcgg tctcagtggt gtacggtaca aaccca
4906
SEQ ID NO: 401; length: 4882; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 401:
ccccaactgg ggtaaccttt gggctccccg
ggcgcgacta taagctgcga gcaacttcac 60ttgggtatgc cggcggtagc gcttaccgtt
cgtataatgt atgctatacg aagttatccg 120aagccgctag cggtggtttg tctggtcaac
caccgcggtc tcagtggtgt acggtacaaa 180cccacccgag agaccatgca gaggtcgcct
ctggaaaagg ccagcgttgt ctccaaactt 240ttctttagct ggactagacc catccttcgt
aaaggataca gacagcgcct ggaattgtca 300gacatatacc aaatcccttc tgttgattct
gctgacaatc tatctgaaaa attggaaaga 360gaatgggata gagagctggc ttcaaagaaa
aatcctaaac tcattaatgc ccttcggcga 420tgttttttct ggagatttat gttctatgga
atctttttat atttagggga agtcaccaaa 480gcagtacagc ctctcttact gggaagaatc
atagcttcct atgacccgga taacaaggag 540gaacgctcta tcgcgattta tctaggcata
ggcttatgcc ttctctttat tgtgaggaca 600ctgctcctac acccagccat ttttggcctt
catcacattg gaatgcagat gagaatagct 660atgtttagtt tgatttataa gaagacttta
aagctgtcaa gccgtgttct agataaaata 720agtattggac aacttgttag tctcctttcc
aacaacctga acaaatttga tgaaggactt 780gcattggcac atttcgtgtg gatcgctcct
ttgcaagtgg cactcctcat ggggctaatc 840tgggagttgt tacaggcgtc tgccttctgt
ggacttggtt tcctgatagt ccttgccctt 900tttcaggctg ggctagggag aatgatgatg
aagtacagag atcagagagc tgggaagatc 960agtgaaagac ttgtgattac ctcagaaatg
attgaaaata tccaatctgt taaggcatac 1020tgctgggaag aagcaatgga aaaaatgatt
gaaaacttaa gacaaacaga actgaaactg 1080actcggaagg cagcctatgt gagatacttc
aatagctcag ccttcttctt ctcagggttc 1140tttgtggtgt ttttatctgt gcttccctat
gcactaatca aaggaatcat cctccggaaa 1200atattcacca ccatctcatt ctgcattgtt
ctgcgcatgg cggtcactcg gcaatttccc 1260tgggctgtac aaacatggta tgactctctt
ggagcaataa acaaaataca ggatttctta 1320caaaagcaag aatataagac attggaatat
aacttaacga ctacagaagt agtgatggag 1380aatgtaacag ccttctggga ggagggattt
ggggaattat ttgagaaagc aaaacaaaac 1440aataacaata gaaaaacttc taatggtgat
gacagcctct tcttcagtaa tttctcactt 1500cttggtactc ctgtcctgaa agatattaat
ttcaagatag aaagaggaca gttgttggcg 1560gttgctggat ccactggagc aggcaagact
tcacttctaa tggtgattat gggagaactg 1620gagccttcag agggtaaaat taagcacagt
ggaagaattt cattctgttc tcagttttcc 1680tggattatgc ctggcaccat taaagaaaat
atcatctttg gtgtttccta tgatgaatat 1740agatacagaa gcgtcatcaa agcatgccaa
ctagaagagg acatctccaa gtttgcagag 1800aaagacaata tagttcttgg agaaggtgga
atcacactga gtggaggtca acgagcaaga 1860atttctttag caagagcagt atacaaagat
gctgatttgt atttattaga ctctcctttt 1920ggatacctag acgtattgac tgagaaggag
atcttcgagt cctgcgtttg caagcttatg 1980gccaataaga caagaatcct ggttacaagt
aagatggagc acctgaagaa ggccgataag 2040attctgatcc tgcacgaggg atcttcatac
ttctacggca ctttcagcga gcttcagaac 2100ttgcaacctg atttctctag caagcttatg
ggctgcgact cctttgatca gttctctgcc 2160gagcgtcgca actccattct gaccgaaaca
ctgcataggt tttccctcga gggcgacgca 2220ccagtgtctt ggactgagac taagaagcag
agcttcaagc aaaccggcga attcggtgag 2280aagagaaaga acagtatcct gaaccccatt
aattcaattc ggaagttcag tatcgttcag 2340aaaacgcctc ttcagatgaa cgggattgag
gaagactcag acgaaccgct tgaaaggcga 2400ctctcattgg ttcctgacag tgaacaaggg
gaagctattc tcccccggat ttcagtaatt 2460tccacaggtc cgactctgca agcccggaga
agacaatccg tgttgaatct tatgacccat 2520tccgtgaatc aggggcaaaa tatccataga
aagactactg cctctacgag gaaggtatcc 2580cttgcacccc aagccaatct gacggagctc
gacatctact ctcgccgcct gtcccaggag 2640acaggactgg agattagcga ggagatcaat
gaagaggatc tgaaagaatg tttcttcgac 2700gacatggaat ccatccctgc cgtcacgacg
tggaatacct atttgcgtta catcacggta 2760cataaaagtc tgatattcgt cctgatctgg
tgtcttgtga tcttcctcgc tgaagtcgca 2820gccagcctgg tcgttctttg gctgctcggg
aataccccct tgcaggataa gggaaactcc 2880acccactctc ggaacaatag ttacgccgtc
atcattactt ccacttcctc atactacgta 2940ttctatatat atgtcggggt cgctgataca
ctgctggcca tgggcttctt tcgcggcctg 3000ccgctcgtcc acacgctgat aactgtctcc
aagatcttgc atcataagat gctgcactca 3060gtgctgcagg ctccaatgag tacactgaat
actcttaagg ctggcggcat cctgaaccgc 3120tttagtaagg acatcgccat acttgacgat
ctcttgcccc tgacaatctt cgattttatt 3180caactccttt tgatcgttat cggggcgatc
gctgtggttg ctgtgttgca gccatatata 3240ttcgtagcta ctgttcccgt catcgtcgcg
ttcatcatgc tccgtgccta ctttctgcag 3300acgtcccaac agctgaagca gctcgagagc
gagggacggt cccccatatt tacgcacttg 3360gtaactagtc tgaaggggct gtggactctg
agagcatttg gtcgacaacc atatttcgag 3420accctctttc ataaggccct caacctgcac
accgcgaatt ggtttctgta tttgagtacg 3480ttgcggtggt ttcagatgcg catcgagatg
atattcgtga tattctttat cgcagtcaca 3540tttatcagca tcctgactac gggcgaggga
gagggtcgcg tgggcatcat actcacgctc 3600gctatgaaca ttatgagcac cctgcaatgg
gccgtgaata gctctatcga cgttgacagt 3660cttatgcgat ctgtgagccg agtctttaag
ttcattgaca tgccaacaga aggtaaacct 3720accaagtcaa ccaaaccata caagaatggc
caactctcga aagttatgat tattgagaat 3780tcacacgtga agaaagatga catctggccc
tcagggggcc aaatgactgt caaagatctc 3840acagcaaaat acacagaagg tggaaatgcc
atattagaga acatttcctt ctcaataagt 3900cctggccaga gggtgggcct cttgggaaga
actggatcag ggaagagtac tttgttatca 3960gcttttttga gactactgaa cactgaagga
gaaatccaga tcgatggtgt gtcttgggat 4020tcaataactt tgcaacagtg gaggaaagcc
tttggagtga taccacagaa agtatttatt 4080ttttctggaa catttagaaa aaacttggat
ccctatgaac agtggagtga tcaagaaata 4140tggaaagttg cagatgaggt tgggctcaga
tctgtgatag aacagtttcc tgggaagctt 4200gactttgtcc ttgtggatgg gggctgtgtc
ctaagccatg gccacaagca gttgatgtgc 4260ttggctagat ctgttctcag taaggcgaag
atcttgctgc ttgatgaacc cagtgctcat 4320ttggatccag taacatacca aataattaga
agaactctaa aacaagcatt tgctgattgc 4380acagtaattc tctgtgaaca caggatagaa
gcaatgctgg aatgccaaca atttttggtc 4440atagaagaga acaaagtgcg gcagtacgat
tccatccaga aactgctgaa cgagaggagc 4500ctcttccggc aagccatcag cccctccgac
agggtgaagc tctttcccca ccggaactca 4560agcaagtgca agtctaagcc ccagattgct
gctctgaaag aggagacaga agaagaggtg 4620caagatacaa ggctttagac ccgctgatca
gcctcgactg tgccttctag ttgccagcca 4680tctgttgttt gcccctcccc cgtgccttcc
ttgaccctgg aaggtgccac tcccactgtc 4740ctttcctaat aaaatgagaa aattgcatcg
cattgtctga gtaggtgtca ttctattctg 4800gggggtgggg tggggcagga cagcaagggg
gaggattggg aagacaatag caggcatgct 4860ggggatgcgg tgggctctat gg
4882
SEQ ID NO: 402; length: 1594; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
Sequence 402:
ccccaactgg ggtaaccttt gggctccccg ggcgcggttc cggatccgga gagggcaggg
60gatctctcct tacttgtggc gacgtggagg agaaccccgg ccccatgagc atcggcctcc
120tgtgctgtgc agccttgtct ctcctgtggg caggtccagt gaatgctggt gtcactcaga
180ccccaaaatt ccaggtcctg aagacaggac agagcatgac actgcagtgt gcccaggata
240tgaaccatga atacatgtcc tggtatcgac aagacccagg catggggctg aggctgattc
300attactcagt tggtgctggt atcactgacc aaggagaagt ccccaatggc tacaatgtct
360ccagatcaac cacagaggat ttcccgctca ggctgctgtc ggctgctccc tcccagacat
420ctgtgtactt ctgtgccagc agttacgtcg ggaacaccgg ggagctgttt tttggagaag
480gctctaggct gaccgtactg gaggacctga aaaacgtgtt cccacccgag gtcgctgtgt
540ttgagccatc agaagcagag atctcccaca cccaaaaggc cacactggta tgcctggcca
600caggcttcta ccccgaccac gtggagctga gctggtgggt gaatgggaag gaggtgcaca
660gtggggtcag cacagacccg cagcccctca aggagcagcc cgccctcaat gactccagat
720actgcctgag cagccgcctg agggtctcgg ccaccttctg gcagaacccc cgcaaccact
780tccgctgtca agtccagttc tacgggctct cggagaatga cgagtggacc caggataggg
840ccaaacccgt cacccagatc gtcagcgccg aggcctgggg tagagcagac tgtggcttca
900cctccgagtc ttaccagcaa ggggtcctgt ctgccaccat cctctatgag atcttgctag
960ggaaggccac cttgtatgcc gtgctggtca gtgccctcgt gctgatggct atggtcaaga
1020gaaaggattc cagaggccgg gccaagcggt ccggatccgg agccaccaac ttcagcctgc
1080tgaagcaggc cggcgacgtg gaggagaacc ccggccccat ggagaccctc ttgggcctgc
1140ttatcctttg gctgcagctg caatgggtga gcagcaaaca ggaggtgacg cagattcctg
1200cagctctgag tgtcccagaa ggagaaaact tggttctcaa ctgcagtttc actgatagcg
1260ctatttacaa cctccagtgg tttaggcagg accctgggaa aggtctcaca tctctgttgc
1320ttattcagtc aagtcagaga gagcaaacaa gtggaagact taatgcctcg ctggataaat
1380catcaggacg tagtacttta tacattgcag cttctcagcc tggtgactca gccacctacc
1440tctgtgctgt gaggcccctg tacggaggaa gctacatacc tacatttgga agaggaacca
1500gccttattgt tcatccgtat atccagaacc ctgaccctgc gggtggtttg tctggtcaac
1560caccgcggtc tcagtggtgt acggtacaaa ccca
1594

SEQ ID NO: 403; length: 19; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ttgagcgggc ccccaccgt 19

SEQ ID NO: 404; length: 393; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
atgactcact atcaggcctt gcttttggac acggaccggg tccagttcgg accggtggta 60
gccctgaacc cggctacgct gctcccactg cctgaggaag ggctgcaaca caactgcctt 120
gatgggacag gtggcggtgg tgtcaccgtc aagttcaagt acaagggtga ggaacttgaa 180
gttgatatta gcaaaatcaa gaaggtttgg cgcgttggta aaatgatatc ttttacttat 240
gacgacaacg gcaagacagg tagaggggca gtgtctgaga aagacgcccc caaggagctg 300
ttgcaaatgt tggaaaagtc tgggaaaaag tctggcggct caaaaagaac cgccgacggc 360
agcgaattcg agcccaagaa gaagaggaaa gtc 393

SEQ ID NO: 405; length: 11; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic probe
cgacgacggc g 11

SEQ ID NO: 406; length: 16; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic probe
tttatttgtg ggcccg 16

SEQ ID NO: 407; length: 15; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic probe
tcgagtgccg catca 15

SEQ ID NO: 408; length: 17; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic probe
aaagtggtga ggacact 17

SEQ ID NO: 409; length: 15; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic probe
aacccacccg agaga 15

SEQ ID NO: 410; length: 66; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggaagcggag ctactaactt cagcctgctg aagcaggctg gcgacgtgga ggagaaccct 60
ggacct 66

SEQ ID NO: 411; length: 45; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gggggaggag gttctggagg cggaggctcc ggaggcggag ggtca 45

SEQ ID NO: 412; length: 15; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggaggtggcg ggagc 15

SEQ ID NO: 413; length: 15; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
cccgcaccag cgcct 15

SEQ ID NO: 414; length: 45; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gaggcagctg ccaaggaagc cgctgccaag gaggcggccg caaag 45

SEQ ID NO: 415; length: 48; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
agtgggagcg agacccctgg gactagcgag tcagctacac ccgaaagc 48

SEQ ID NO: 416; length: 54; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
ggggggtcag gtggatccgg cggaagtggc ggatccggtg gatctggcgg cagt 54

SEQ ID NO: 417; length: 15; type: DNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gaagctgctg ctaag 15

SEQ ID NO: 418; length: 22; type: PRT; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Gly Ser Gly Ala Thr Asn Phe Ser Leu Leu Lys Gln Ala Gly Asp Val
1               5                   10                  15
Glu Glu Asn Pro Gly Pro
            20

SEQ ID NO: 419; length: 15; type: PRT; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
1               5                   10                  15

SEQ ID NO: 420; length: 5; type: PRT; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Gly Gly Gly Gly Ser
1               5

SEQ ID NO: 421; length: 5; type: PRT; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Pro Ala Pro Ala Pro
1               5

SEQ ID NO: 422; length: 15; type: PRT; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys
1               5                   10                  15

SEQ ID NO: 423; length: 16; type: PRT; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser
1               5                   10                  15

SEQ ID NO: 424; length: 18; type: PRT; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly
1               5                   10                  15
Gly Ser

SEQ ID NO: 425; length: 5; type: PRT; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Glu Ala Ala Ala Lys
1               5

SEQ ID NO: 426; length: 16; type: PRT; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Gly Leu Ser Gly Gln Pro Pro Arg Ser Pro Ser Ser Gly Ser Ser Gly
1               5                   10                  15

SEQ ID NO: 427; length: 17; type: PRT; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Gly Gly Leu Ser Gly Gln Pro Pro Arg Ser Pro Ser Ser Gly Ser Ser
1               5                   10                  15
Gly

SEQ ID NO: 428; length: 88; type: RNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gacgagcgcg gcgauaucau cauccauggc cggaugaucc ugacgacgga gaccgccguc 60
gucgacaagc cggccugagc ugcgagaa 88

SEQ ID NO: 429; length: 20; type: RNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
gaagccggcc uugcacaugc 20

SEQ ID NO: 430; length: 95; type: DNA; organism: Homo sapiens
gcgcgcccgg ctattctcgc agctcaccat ggatgatgat atcgccgcgc tcgtcgtcga 60
caacggctcc ggcatgtgca aggccggctt cgcgg 95

SEQ ID NO: 431; length: 20; type: RNA; organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
accacucgac gcucuuaucg 20