Patent application title: CHROMOSOME-BASED PLATFORMS
Inventors:
Edward Perkins (Burnaby, CA)
Carl Perez (Richmond, CA)
Michael Lindenbaum (Coquitlam, CA)
Amy Greene (Burnaby, CA)
Josephine Leung (Coquitlam, CA)
Elena Fleming (North Vancouver, CA)
Sandra Stewart (Vancouver, CA)
Joan Shellard (Vancouver, CA)
IPC8 Class: AC12P1934FI
USPC Class:
435 9141
Class name: Polynucleotide (e.g., nucleic acid, oligonucleotide, etc.) modification or preparation of a recombinant dna vector by insertion or addition of one or more nucleotides
Publication date: 2012-03-15
Patent application number: 20120064578
Abstract:
Artificial chromosomes, including ACes, that have been engineered to
contain available sites for site-specific, recombination-directed
integration of DNA of interest are provided. These artificial chromosomes
provide tractable, efficient and rational engineering of the chromosome
for a variety of applications.
Claims:
1. A method of introducing one or more heterologous nucleic acid(s) into
an artificial chromosome, wherein the artificial chromosome is a
mammalian artificial chromosome and is predominantly heterochromatin, the
method comprising: (a) introducing one or a plurality of recombination
site(s) comprising a heterologous att site into the artificial chromosome
(b) mixing the artificial chromosome containing the one or a plurality of
recombination site(s) comprising a heterologous att site with a first
vector comprising the heterologous nucleic acid and one or a plurality of
cognate recombination site(s), wherein the cognate recombination site(s)
is a site that participates in recombinase catalyzed recombination with
att site(s) in the artificial chromosome; (c) incubating the resulting
mixture in the presence of at least one lambda-intR mutein comprising a
glutamic acid to arginine change at position 174 of wild-type lambda-intR
under conditions whereby recombination between the att site and cognate
recombination site is effected, thereby introducing the heterologous
nucleic acid into the artificial chromosome.
2. The method of claim 1, wherein the artificial chromosome contains a plurality of att sites.
3.-13. (canceled)
14. The method of claim 1, wherein the att sites are selected from the group consisting of attP with attB and attL with attR.
15.-19. (canceled)
20. The method of claim 1, wherein the lambda-intR mutein comprising a glutamic acid to arginine change at position 174 of wild-type lambda-intR is encoded by a nucleic acid molecule, wherein the nucleic acid molecule is provided on a second vector or on the first vector, or on the artificial chromosome and its expression is under the control of an inducible promoter.
21. The method of claim 20, wherein the second vector is the plasmid pCXLamIntR.
22. The method of claim 20, wherein the first vector is the plasmid pDsRedN1-attB.
23.-27. (canceled)
28. The method of claim 1 wherein the one or a plurality of recombination site(s) is introduced into the artificial chromosome together with a selectable marker.
29. The method of claim 28 wherein the selectable marker is selected from the group consisting of a gene that provides a selective growth advantage, an antibiotic resistance gene, and a gene encoding a detectable protein, wherein the detectable protein is chromogenic, fluorescent, or capable of being bound by an antibody and FACs sorted.
30. The method of claim 29 wherein the selectable marker is a puromycin-resistance gene.
31. The method of claim 28 wherein the one or a plurality of recombination site(s) and a selectable marker are comprised in pSV40-193attPsensePur (FIG. 4, SEQ ID NO:113).
Description:
RELATED APPLICATIONS
[0001] This application is a continuation of and claims priority under 35 U.S.C. §120 to copending U.S. application Ser. No. 10/161,403, filed May 30, 2002, to EDWARD PERKINS, CARL PEREZ, MICHAEL LINDENBAUM, AMY GREENE, JOSEPHINE LEUNG, ELENA FLEMING, SANDRA STEWART and JOAN SHELLARD, who are the inventors as originally filed, entitled "CHROMOSOME-BASED PLATFORMS," which claims benefit of priority under 35 U.S.C. §119(e) to U.S. provisional application Ser. No. 60/294,758, filed May 30, 2001, to EDWARD PERKINS, CARL PEREZ, MICHAEL LINDENBAUM, AMY GREENE, and JOSEPHINE LEUNG, entitled "CHROMOSOME-BASED PLATFORMS" and benefit of priority to U.S. provisional application Ser. No. 60/366,891, filed Mar. 21, 2002, to EDWARD PERKINS, CARL PEREZ, MICHAEL LINDENBAUM, AMY GREENE, JOSEPHINE LEUNG, ELENA FLEMING, and SANDRA STEWART, entitled "CHROMOSOME-BASED PLATFORMS."
[0002] This application is related to U.S. application Ser. No. 11/006,076, filed Dec. 6, 2004 by EDWARD PERKINS, CARL PEREZ, MICHAEL LINDENBAUM, AMY GREENE, JOSEPHINE LEUNG, ELENA FLEMING, SANDRA STEWART and JOAN SHELLARD, entitled CHROMOSOME-BASED PLATFORMS.
[0003] This application is related to Provisional Application No. 60/294,687, filed May 30, 2001, by CARL PEREZ AND STEVEN FABIJANSKI entitled PLANT ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING PLANT ARTIFICIAL CHROMOSOMES and to U.S. Provisional Application No. 60/296,329, filed Jun. 4, 2001, by CARL PEREZ AND STEVEN FABIJANSKI entitled PLANT ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING PLANT ARTIFICIAL CHROMOSOMES. This application also is related to U.S. Provisional Application No. 60/294,758, filed May 30, 2001, by EDWARD PERKINS et al. entitled CHROMOSOME-BASED PLATFORMS and to U.S. Provisional Application No. 60/366,891, filed Mar. 21, 2002, by EDWARD PERKINS et al. entitled CHROMOSOME-BASED PLATFORMS. This application is also related to U.S. application Ser. No. 10/161,408 and to International PCT application No. PCT/US02/17451, published as WO2002/096923, each filed on the same day herewith, entitled PLANT ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS OF PREPARING PLANT ARTIFICIAL CHROMOSOMES to Perez et al.
[0004] This application is related to U.S. application Ser. No. 08/695,191, filed Aug. 7, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES, now U.S. Pat. No. 6,025,155. This application is also related to U.S. application Ser. No. 08/682,080, filed Jul. 15, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES, now U.S. Pat. No. 6,077,697. This application is also related to U.S. application Ser. No. 08/629,822, filed Apr. 10, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES (now abandoned), and is also related to copending U.S. application Ser. No. 09/096,648, filed Jun. 12, 1998, by GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES and to U.S. application Ser. No. 09/835,682, filed Apr. 10, 1997 by GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES (now abandoned). This application is also related to copending U.S. application Ser. No. 09/724,726, filed Nov. 28, 2000, U.S. application Ser. No. 09/724,872, filed Nov. 28, 2000, U.S. application Ser. No. 09/724,693, filed Nov. 28, 2000, U.S. application Ser. No. 09/799,462, filed Mar. 5, 2001, U.S. application Ser. No. 09/836,911, filed Apr. 17, 2001, and U.S. application Ser. No. 10/125,767, filed Apr. 17, 2002, each of which is by GYULA HADLACZKY and ALADAR SZALAY, and is entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES. This application is also related to International PCT application No. WO 97/40183.
[0005] The subject matter of all of the above provisional applications, international applications, and applications is herein incorporated by reference in its entirety.
INCORPORATION BY REFERENCE OF SEQUENCE LISTING PROVIDED ON COMPACT DISCS
[0006] An electronic version on compact disc (CD) ROM of a computer-readable form of the Sequence Listing is filed herewith in duplicate, the contents of which are incorporated by reference in their entirety. The computer-readable file on each of the aforementioned duplicate compact discs created on Jun. 29, 2006, is identical, 396 kilobytes in size, and entitled 420DSEQ.001.txt.
FIELD OF THE INVENTION
[0007] Artificial chromosomes, including ACes, that have been engineered to contain available sites for site-specific, recombination-directed integration of DNA of interest are provided. These artificial chromosomes permit tractable, efficient, rational engineering of the chromosome.
BACKGROUND
[0008] Artificial Chromosomes
[0009] A variety of artificial chromosomes for use in plants and animals, particularly higher plants and animals, are available. In particular, U.S. Pat. Nos. 6,025,155 and 6,077,697 provide heterochromatic artificial chromosomes designated therein as satellite artificial chromosomes (SATACs) and now designated artificial chromosome expression systems (ACes). These chromosomes are prepared by introducing heterologous DNA into a selected plant or animal cell under conditions that result in integration into a region of the chromosome that leads to an amplification event resulting in production of a dicentric chromosome. Subsequent treatment and growth of cells with dicentric chromosomes, including further amplifications, ultimately results in the artificial chromosomes provided therein. In order to introduce a desired heterologous gene (or a plurality of heterologous genes) into the artificial chromosome, the process is repeated, introducing the desired heterologous genes and nucleic acids in the initial targeting step. This process is time-consuming and tedious. Hence, more tractable and efficient methods for introducing heterologous nucleic acid molecules into artificial chromosomes, particularly ACes, are needed.
[0010] Therefore, it is an object herein to provide engineered artificial chromosomes that permit tractable, efficient and rational engineering of artificial chromosomes.
SUMMARY OF THE INVENTION
[0011] Provided herein are artificial chromosomes that permit tractable, efficient and rational engineering thereof. In particular, the artificial chromosomes provided herein contain one or a plurality of loci (sites) for site-specific, recombination-directed integration of DNA. Thus, provided herein are platform artificial chromosome expression systems ("platform ACes") containing single or multiple site-specific, recombination sites. The artificial chromosomes and ACes artificial chromosomes include plant and animal chromosomes. Any recombinase system that effects site-specific recombination is contemplated for use herein.
[0012] In one embodiment, chromosomes, including platform ACes, are provided that contain one or more lambda att sites designed for recombination-directed integration in the presence of lambda integrase, and that are mutated so that they do not require additional factors. Methods for preparing such chromosomes, vectors for use in the methods, and uses of the resulting chromosomes are also provided.
[0013] Platform ACes containing the recombination site(s) and methods for introducing heterologous nucleic acid into such sites and vectors therefor, are provided.
[0014] Also provided herein is a bacteriophage lambda (λ) integrase site-specific recombination system.
[0015] Methods using recombinase mediated recombination target gene expression vectors and/or genes for insertion thereof into platform chromosomes and the resulting chromosomes are provided.
[0016] Combinations and kits containing the combinations of vectors encoding a recombinase and integrase and primers for introduction of the site recognized thereby are also provided. The kits optionally include instructions for performing site-directed integration or preparation of ACes containing such sites.
[0017] Also provided herein are mammalian and plant cells comprising the artificial chromosomes and ACes described herein. The cells can be nuclear donor cells, stem cells, such as a mesenchymal stem cell, a hematopoietic stem cell, an adult stem cell or an embryonic stem cell.
[0018] Also provided is a lambda-intR mutein comprising a glutamic acid to arginine change at position 174 of wild-type lambda-integrase. Also provided are transgenic animals and methods for producing a transgenic animal, comprising introducing an ACes into an embryonic cell, such as a stem cell or embryo. The ACes can comprise heterologous nucleic acid that encodes a therapeutic product. The transgenic animal can be a fish, insect, reptile, amphibian, arachnid or mammal. In certain embodiments, the ACes is introduced by cell fusion, lipid-mediated transfection by a carrier system, microinjection, microcell fusion, electroporation, microprojectile bombardment or direct DNA transfer.
[0019] The platform ACes, including plant and animal ACes, such as MACs, provided herein can be introduced into cells, such as, but not limited to, animal cells, including mammalian cells, and into plant cells. Hence plant cells that contain platform MACs, animal cells that contain platform PACs and other combinations of cells and platform ACes are provided.
DESCRIPTION OF THE FIGURES
[0020] FIG. 1 provides a diagram depicting creation of an exemplary ACes artificial chromosome prepared using methods detailed in U.S. Pat. Nos. 6,025,155 and 6,077,697 and International PCT application No. WO 97/40183. In this exemplified embodiment, the nucleic acid is targeted to an acrocentric chromosome in an animal or plant, and the heterologous nucleic acid includes a sequence-specific recombination site and marker genes.
[0021] FIG. 2 provides a map of pWEPuro9K, which is a targeting vector derived from the vector pWE15 (GenBank Accession # X65279; SEQ ID No. 31). Plasmid pWE15 was modified by replacing the SalI (Klenow filled)/SmaI neomycin resistance encoding fragment with the PvuII/BamHI (Klenow filled) puromycin resistance-encoding fragment (isolated from plasmid pPUR, Clontech Laboratories, Inc., Palo Alto, Calif.; GenBank Accession no. U07648; SEQ ID No. 30) resulting in plasmid pWEPuro. Subsequently a 9 Kb NotI fragment from the plasmid pFK161 (see Example 1; see, also, Csonka et al. (2000) Journal of Cell Science 113:3207-3216; and SEQ ID NO: 118), containing a portion of the mouse rDNA region, was cloned into the NotI site of pWEPuro resulting in plasmid pWEPuro9K.
[0022] FIG. 3 depicts construction of an ACes platform chromosome with a single recombination site, such as loxP sites or an attP or attB site. This platform ACes chromosome is an exemplary artificial chromosome with a single recombination site.
[0023] FIG. 4 provides a map of plasmid pSV40-193attPsensePur.
[0024] FIG. 5 depicts a method for formation of a chromosome platform with multiple recombination integration sites, such as attP sites.
[0025] FIG. 6 sets forth the sequences of the core region of attP, attB, attL and attR (SEQ ID Nos. 33-36).
[0026] FIG. 7 depicts insertional recombination of a vector encoding a marker gene, DsRed and an attB site with an artificial chromosome containing an attP site.
[0027] FIG. 8 provides a map of plasmid pCXLamIntR (SEQ ID NO: 112), which includes the Lambda integrase (E174R)-encoding nucleic acid.
[0028] FIG. 9 diagrammatically summarizes the platform technology; marker 1 permits selection of the artificial chromosomes containing the integration site; marker 2, which is promoterless in the target gene expression vector, permits selection of recombinants. Upon recombination with the platform marker 2 is expressed under the control of a promoter resident on the platform.
[0029] FIG. 10 provides the vector map for the plasmid p18attBZEO-5'6XHS4eGFP (SEQ ID NO: 116).
[0030] FIG. 11 provides the vector map for the plasmid p18attBZEO-3'6XHS4eGFP (SEQ ID NO: 115).
[0031] FIG. 12 provides the vector map for the plasmid p18attBZEO-(6XHS4)2eGFP (SEQ ID NO: 110).
[0032] FIGS. 13 AND 14 depict the integration of a PCR product by site-specific recombination as set forth in Example 8.
[0033] FIG. 15 provides the vector map for the plasmid pPACrDNA as set forth in Example 9.A.
DETAILED DESCRIPTION OF THE INVENTION
A. Definitions
[0034] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of skill in the art to which the invention(s) belong. All patents, patent applications, published applications and publications, Genbank sequences, websites and other published materials referred to throughout the entire disclosure herein, unless noted otherwise, are incorporated by reference in their entirety. Where reference is made to a URL or other such identifier or address, it is understood that such identifiers can change and particular information on the internet can come and go, but equivalent information can be found by searching the internet. Reference thereto evidences the availability and public dissemination of such information.
[0035] As used herein, nucleic acid refers to single-stranded and/or double-stranded polynucleotides, such as deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), as well as analogs or derivatives of either RNA or DNA. Also included in the term "nucleic acid" are analogs of nucleic acids such as peptide nucleic acid (PNA), phosphorothioate DNA, and other such analogs and derivatives. When referring to probes or primers, optionally labeled with a detectable label, such as a fluorescent or radiolabel, single-stranded molecules are contemplated. Such molecules are typically of a length such that they are statistically unique and of low copy number (typically less than 5, preferably less than 3) for probing or priming a library. Generally a probe or primer contains at least 14, 16 or 30 contiguous nucleotides of sequence complementary to or identical to a gene of interest. Probes and primers can be 10, 20, 30, 50, 100 or more nucleotides long.
[0036] As used herein, DNA is meant to include all types and sizes of DNA molecules including cDNA, plasmids and DNA including modified nucleotides and nucleotide analogs.
[0037] As used herein, nucleotides include nucleoside mono-, di-, and triphosphates. Nucleotides also include modified nucleotides, such as, but not limited to, phosphorothioate nucleotides and deazapurine nucleotides and other nucleotide analogs.
[0038] As used herein, heterologous or foreign DNA and RNA are used interchangeably and refer to DNA or RNA that does not occur naturally as part of the genome in which it is present or which is found in a location or locations and/or in amounts in a genome or cell that differ from that in which it occurs in nature. Heterologous nucleic acid is generally not endogenous to the cell into which it is introduced, but has been obtained from another cell or prepared synthetically. Generally, although not necessarily, such nucleic acid encodes RNA and proteins that are not normally produced by the cell in which it is expressed. Any DNA or RNA that one of skill in the art would recognize or consider as heterologous or foreign to the cell in which it is expressed is herein encompassed by heterologous DNA. Heterologous DNA and RNA may also encode RNA or proteins that mediate or alter expression of endogenous DNA by affecting transcription, translation, or other regulatable biochemical processes.
[0039] Examples of heterologous DNA include, but are not limited to, DNA that encodes a gene product or gene product(s) of interest, introduced for purposes of modification of the endogenous genes or for production of an encoded protein. For example, a heterologous or foreign gene may be isolated from a different species than that of the host genome, or alternatively, may be isolated from the host genome but operably linked to one or more regulatory regions which differ from those found in the unaltered, native gene. Other examples of heterologous DNA include, but are not limited to, DNA that encodes traceable marker proteins, such as a protein that confers traits including, but not limited to, herbicide, insect, or disease resistance; traits, including, but not limited to, oil quality or carbohydrate composition. Antibodies that are encoded by heterologous DNA may be secreted or expressed on the surface of the cell in which the heterologous DNA has been introduced.
[0040] As used herein, operative linkage or operative association, or grammatical variations thereof, of heterologous DNA to regulatory and effector sequences of nucleotides, such as promoters, enhancers, transcriptional and translational stop sites, and other signal sequences refers to the relationship between such DNA and such sequences of nucleotides. For example, operative linkage of heterologous DNA to a promoter refers to the physical relationship between the DNA and the promoter such that the transcription of such DNA is initiated from the promoter by an RNA polymerase that specifically recognizes, binds to and transcribes the DNA.
[0041] In order to optimize expression and/or in vitro transcription, it may be necessary to remove, add or alter 5' untranslated portions of the clones to eliminate extra, potentially inappropriate alternative translation initiation (i.e., start) codons or other sequences that may interfere with or reduce expression, either at the level of transcription or translation. Alternatively, consensus ribosome binding sites (see, e.g., Kozak (1991) J. Biol. Chem. 266:19867-19870) can be inserted immediately 5' of the start codon and may enhance expression.
[0042] As used herein, a sequence complementary to at least a portion of an RNA, with reference to antisense oligonucleotides, means a sequence having sufficient complementarity to be able to hybridize with the RNA, preferably under moderate or high stringency conditions, forming a stable duplex. The ability to hybridize depends on the degree of complementarity and the length of the antisense nucleic acid. The longer the hybridizing nucleic acid, the more base mismatches it can contain and still form a stable duplex (or triplex, as the case may be). One skilled in the art can ascertain a tolerable degree of mismatch by use of standard procedures to determine the melting point of the hybridized complex.
[0043] As used herein, regulatory molecule refers to a polymer of deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) or a polypeptide that is capable of enhancing or inhibiting expression of a gene.
[0044] As used herein, recognition sequences are particular sequences of nucleotides that a protein, DNA, or RNA molecule, or combinations thereof, (such as, but not limited to, a restriction endonuclease, a modification methylase and a recombinase) recognizes and binds. For example, a recognition sequence for Cre recombinase (see, e.g., SEQ ID NO:58) is a 34 base pair sequence containing two 13 base pair inverted repeats (serving as the recombinase binding sites) flanking an 8 base pair core and designated loxP (see, e.g., Sauer (1994) Current Opinion in Biotechnology 5:521-527). Other examples of recognition sequences, include, but are not limited to, attB and attP, attR and attL and others (see, e.g., SEQ ID Nos. 8, 41-56 and 72), that are recognized by the recombinase enzyme Integrase (see, SEQ ID Nos. 37 and 38 for the nucleotide and encoded amino acid sequences of an exemplary lambda phage integrase).
[0045] The recombination site designated attB is an approximately 33 base pair sequence containing two 9 base pair core-type Int binding sites and a 7 base pair overlap region; attP (SEQ ID No. 72) is an approximately 240 base pair sequence containing core-type Int binding sites and arm-type Int binding sites as well as sites for auxiliary proteins IHF, FIS, and Xis (see, e.g., Landy (1993) Current Opinion in Biotechnology 3:699-707; see, e.g., SEQ ID Nos. 8 and 72).
[0046] As used herein, a recombinase is an enzyme that catalyzes the exchange of DNA segments at specific recombination sites. An integrase herein refers to a recombinase that is a member of the lambda (λ) integrase family.
[0047] As used herein, recombination proteins include excisive proteins, integrative proteins, enzymes, co-factors and associated proteins that are involved in recombination reactions using one or more recombination sites (see, Landy (1993) Current Opinion in Biotechnology 3:699-707). The recombination proteins used herein can be delivered to a cell via an expression cassette on an appropriate vector, such as a plasmid, and the like. In other embodiments, the recombination proteins can be delivered to a cell in protein form in the same reaction mixture used to deliver the desired nucleic acid, such as a platform ACes, donor target vectors, and the like.
[0048] As used herein the expression "lox site" means a sequence of nucleotides at which the gene product of the cre gene, referred to herein as Cre, can catalyze a site-specific recombination event. A LoxP site is a 34 base pair nucleotide sequence from bacteriophage P1 (see, e.g., Hoess et al. (1982) Proc. Natl. Acad. Sci. U.S.A. 79:3398-3402). The LoxP site contains two 13 base pair inverted repeats separated by an 8 base pair spacer region as follows: (SEQ ID NO. 57):
ATAACTTCGTATA ATGTATGC TATACGAAGTTAT
E. coli DH5Δlac and yeast strain BSY23 transformed with plasmid pBS44 carrying two loxP sites connected with a LEU2 gene are available from the American Type Culture Collection (ATCC) under accession numbers ATCC 53254 and ATCC 20773, respectively. The lox sites can be isolated from plasmid pBS44 with restriction enzymes EcoRI and SalI, or XhoI and BamHI. In addition, a preselected DNA segment can be inserted into pBS44 at either the SalI or BamHI restriction enzyme sites. Other lox sites include, but are not limited to, LoxB, LoxL, LoxC2 and LoxR sites, which are nucleotide sequences isolated from E. coli (see, e.g., Hoess et al. (1982) Proc. Natl. Acad. Sci. U.S.A. 79:3398). Lox sites can also be produced by a variety of synthetic techniques (see, e.g., Ito et al. (1982) Nuc. Acid Res. 10:1755 and Ogilvie et al. (1981) Science 270:270).
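The structure of the LoxP site described above (two 13 base pair inverted repeats separated by an 8 base pair spacer) can be checked directly from the sequence given in the text. The following is a minimal illustrative sketch in Python; the variable and function names are hypothetical and are not part of the disclosure.

```python
# LoxP sequence exactly as printed in the text, split into its three parts.
LEFT_REPEAT = "ATAACTTCGTATA"   # 13 bp
SPACER = "ATGTATGC"             # 8 bp
RIGHT_REPEAT = "TATACGAAGTTAT"  # 13 bp

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


loxp = LEFT_REPEAT + SPACER + RIGHT_REPEAT

# The two 13 bp arms are inverted repeats: each is the reverse
# complement of the other.
arms_are_inverted_repeats = reverse_complement(LEFT_REPEAT) == RIGHT_REPEAT

# The spacer is not its own reverse complement, so the site as a whole
# is asymmetric and therefore has an orientation.
spacer_is_asymmetric = reverse_complement(SPACER) != SPACER

print(len(loxp), arms_are_inverted_repeats, spacer_is_asymmetric)
```

Because the spacer is asymmetric, the full site differs from its own reverse complement, which is what allows two lox sites to be described as being in the same or in opposite orientations.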
[0049] As used herein, the expression "cre gene" means a sequence of nucleotides that encodes a gene product that effects site-specific recombination of DNA in eukaryotic cells at lox sites. One cre gene can be isolated from bacteriophage P1 (see, e.g., Abremski et al. (1983) Cell 32:1301-1311). E. coli DH1 and yeast strain BSY90 transformed with plasmid pBS39 carrying a cre gene isolated from bacteriophage P1 and a GAL1 regulatory nucleotide sequence are available from the American Type Culture Collection (ATCC) under accession numbers ATCC 53255 and ATCC 20772, respectively. The cre gene can be isolated from plasmid pBS39 with restriction enzymes XhoI and SalI.
[0050] As used herein, site-specific recombination refers to recombination that is effected between two specific sites on a single nucleic acid molecule or between two different molecules and that requires the presence of an exogenous protein, such as an integrase or recombinase.
[0051] For example, Cre-lox site-specific recombination can include the following three events: [0052] a. deletion of a pre-selected DNA segment flanked by lox sites; [0053] b. inversion of the nucleotide sequence of a pre-selected DNA segment flanked by lox sites; and [0054] c. reciprocal exchange of DNA segments proximate to lox sites located on different DNA molecules.
[0055] This reciprocal exchange of DNA segments can result in an integration event if one or both of the DNA molecules are circular. DNA segment refers to a linear fragment of single- or double-stranded deoxyribonucleic acid (DNA), which can be derived from any source. Since the lox site is an asymmetrical nucleotide sequence, two lox sites on the same DNA molecule can have the same or opposite orientations with respect to each other. Recombination between lox sites in the same orientation results in a deletion of the DNA segment located between the two lox sites and a connection between the resulting ends of the original DNA molecule. The deleted DNA segment forms a circular molecule of DNA. The original DNA molecule and the resulting circular molecule each contain a single lox site. Recombination between lox sites in opposite orientations on the same DNA molecule results in an inversion of the nucleotide sequence of the DNA segment located between the two lox sites. In addition, reciprocal exchange of DNA segments proximate to lox sites located on two different DNA molecules can occur. All of these recombination events are catalyzed by the gene product of the cre gene. Thus, the Cre-lox system can be used to specifically delete, invert, or insert DNA. The precise event is controlled by the orientation of the lox DNA sequences. In cis, the lox sequences direct the Cre recombinase to either delete (lox sequences in direct orientation) or invert (lox sequences in inverted orientation) DNA flanked by the sequences, while in trans the lox sequences can direct a homologous recombination event resulting in the insertion of a recombinant DNA.
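The dependence of the outcome on lox site orientation can be illustrated with a toy string model. The sketch below is an illustration only, under simplifying assumptions: DNA is treated as a top-strand string, each molecule carries intact copies of the LoxP sequence given above, strand exchange is modeled as a cut at the end of each site, and all function names are hypothetical. It is not a model of the Cre reaction mechanism.

```python
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"
_COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMP)[::-1]


def find_sites(seq):
    """Return sorted (start, orientation) pairs for lox sites on the top strand."""
    sites = []
    for pattern, orientation in ((LOXP, "+"), (revcomp(LOXP), "-")):
        start = seq.find(pattern)
        while start != -1:
            sites.append((start, orientation))
            start = seq.find(pattern, start + 1)
    return sorted(sites)


def recombine_in_cis(seq):
    """Recombine two lox sites on one linear molecule (toy model)."""
    sites = find_sites(seq)
    if len(sites) != 2:
        raise ValueError("this toy model expects exactly two lox sites")
    (a, ori_a), (b, ori_b) = sites
    n = len(LOXP)
    if ori_a == ori_b:
        # Same orientation: the flanked segment is excised as a circle.
        remaining = seq[:a + n] + seq[b + n:]
        circle = seq[a + n:b + n]  # circle written linearly; holds one lox site
        return "deletion", remaining, circle
    # Opposite orientation: the flanked segment is inverted in place.
    inverted = seq[:a + n] + revcomp(seq[a + n:b]) + seq[b:]
    return "inversion", inverted, None


def integrate_in_trans(chromosome, plasmid):
    """Insert a circular plasmid (one lox site) at a lox site on a linear molecule."""
    c = chromosome.find(LOXP)
    p = plasmid.find(LOXP)
    if c == -1 or p == -1:
        raise ValueError("both molecules need a lox site in the '+' orientation")
    n = len(LOXP)
    # Open the circle just after its lox site, then splice it in.
    opened = plasmid[p + n:] + plasmid[:p] + LOXP
    return chromosome[:c + n] + opened + chromosome[c + n:]


direct = "GGGG" + LOXP + "CCCCAAAA" + LOXP + "TTTT"
opposite = "GGGG" + LOXP + "CCCCAAAA" + revcomp(LOXP) + "TTTT"
print(recombine_in_cis(direct)[0])    # deletion
print(recombine_in_cis(opposite)[0])  # inversion
```

In this model, integration in trans leaves the insert flanked by two lox sites in the same orientation, so applying `recombine_in_cis` to the integration product excises the insert again and restores the starting molecule, consistent with the reciprocal nature of the exchange.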
[0056] As used herein, a chromosome is a nucleic acid molecule, and associated proteins, that is capable of replication and segregation within a cell upon cell division. Typically, a chromosome contains a centromeric region, replication origins, telomeric regions and a region of nucleic acid between the centromeric and telomeric regions.
[0057] As used herein, a centromere is any nucleic acid sequence that confers an ability to segregate to daughter cells through cell division. A centromere may confer stable segregation of a nucleic acid sequence, including an artificial chromosome containing the centromere, through mitotic or meiotic divisions, including through both mitotic and meiotic divisions. A particular centromere is not necessarily derived from the same species in which it is introduced, but has the ability to promote DNA segregation in cells of that species.
[0058] As used herein, euchromatin and heterochromatin have their recognized meanings. Euchromatin refers to chromatin that stains diffusely and that typically contains genes, and heterochromatin refers to chromatin that remains unusually condensed and that has been thought to be transcriptionally inactive. Highly repetitive DNA sequences (satellite DNA) are usually located in regions of the heterochromatin surrounding the centromere (pericentric or pericentromeric heterochromatin). Constitutive heterochromatin refers to heterochromatin that contains the highly repetitive DNA which is constitutively condensed and genetically inactive.
[0059] As used herein, an acrocentric chromosome refers to a chromosome with arms of unequal length.
[0060] As used herein, endogenous chromosomes refer to genomic chromosomes as found in a cell prior to generation or introduction of an artificial chromosome.
[0061] As used herein, artificial chromosomes are nucleic acid molecules, typically DNA, that stably replicate and segregate alongside endogenous chromosomes in cells and have the capacity to accommodate and express heterologous genes contained therein. They have the capacity to act as gene delivery vehicles by accommodating and expressing foreign genes contained therein. A mammalian artificial chromosome (MAC) refers to chromosomes that have an active mammalian centromere(s). Plant artificial chromosomes, insect artificial chromosomes and avian artificial chromosomes refer to chromosomes that include centromeres that function in plant, insect and avian cells, respectively. A human artificial chromosome (HAC) refers to chromosomes that include centromeres that function in human cells. For exemplary artificial chromosomes, see, e.g., U.S. Pat. Nos. 6,025,155; 6,077,697; 5,288,625; 5,712,134; 5,695,967; 5,869,294; 5,891,691 and 5,721,118 and published International PCT application Nos. WO 97/40183 and WO 98/08964. Artificial chromosomes include those that are predominantly heterochromatic (formerly referred to as satellite artificial chromosomes (SATACs); see, e.g., U.S. Pat. Nos. 6,077,697 and 6,025,155 and published International PCT application No. WO 97/40183), minichromosomes that contain a de novo centromere (see, U.S. Pat. Nos. 5,712,134, 5,891,691 and 5,288,625), artificial chromosomes predominantly made up of repeating nucleic acid units and that contain substantially equivalent amounts of euchromatic and heterochromatic DNA and in vitro assembled artificial chromosomes (see, copending U.S. provisional application Ser. No. 60/294,687, filed on May 30, 2001).
[0062] As used herein, the term "satellite DNA-based artificial chromosome (SATAC)" is interchangeable with the term "artificial chromosome expression system (ACes)". These artificial chromosomes (ACes) include those that are substantially all neutral non-coding sequences (heterochromatin) except for foreign heterologous, typically gene-encoding nucleic acid, that is interspersed within the heterochromatin for the expression therein (see U.S. Pat. Nos. 6,025,155 and 6,077,697 and International PCT application No. WO 97/40183), or that is in a single locus as provided herein. Also included are ACes that may include euchromatin and that result from the process described in U.S. Pat. Nos. 6,025,155 and 6,077,697 and International PCT application No. WO 97/40183 and outlined herein. The delineating structural feature is the presence of repeating units that are generally predominantly heterochromatin. The precise structure of the ACes will depend upon the structure of the chromosome in which the initial amplification event occurs; all share the common feature of including a defined pattern of repeating units. Generally ACes have more heterochromatin than euchromatin. Foreign nucleic acid molecules (heterologous genes) contained in these artificial chromosome expression systems can include any nucleic acid whose expression is of interest in a particular host cell. Such foreign nucleic acid molecules include, but are not limited to, nucleic acid that encodes traceable marker proteins (reporter genes), such as fluorescent proteins (e.g., green, blue or red fluorescent proteins; GFP, BFP and RFP, respectively), other reporter genes, such as β-galactosidase, and proteins that confer drug resistance, such as a gene encoding hygromycin-resistance.
Other examples of heterologous nucleic acid molecules include, but are not limited to, DNA that encodes therapeutically effective substances, such as anti-cancer agents, enzymes and hormones, DNA that encodes other types of proteins, such as antibodies, and DNA that encodes RNA molecules (such as antisense or siRNA molecules) that are not translated into proteins.
[0063] As used herein, an artificial chromosome platform, also referred to herein as a "platform ACes" or "ACes platform", refers to an artificial chromosome that has been engineered to include one or more sites for site-specific, recombination-directed integration. In particular, ACes that are so-engineered are provided. Any sites, including but not limited to any described herein, that are suitable for such integration are contemplated. Plant and animal platform ACes are provided. Among the ACes contemplated herein are those that are predominantly heterochromatic (formerly referred to as satellite artificial chromosomes (SATACs); see, e.g., U.S. Pat. Nos. 6,077,697 and 6,025,155 and published International PCT application No. WO 97/40183), artificial chromosomes predominantly made up of repeating nucleic acid units and that contain substantially equivalent amounts of euchromatic and heterochromatic DNA resulting from an amplification event depicted in the referenced patent and herein. Included among the ACes for use in generating platforms, are artificial chromosomes that introduce and express heterologous nucleic acids in plants (see, copending U.S. provisional application Ser. No. 60/294,687, filed on May 30, 2001). These include artificial chromosomes that have a centromere derived from a plant, and, also, artificial chromosomes that have centromeres that may be derived from other organisms but that function in plants.
[0064] As used herein, a "reporter ACes" refers to an ACes that comprises one or a plurality of reporter constructs, where the reporter construct comprises a reporter gene in operative linkage with a regulatory region responsive to test or known compounds.
[0065] As used herein, amplification, with reference to DNA, is a process in which segments of DNA are duplicated to yield two or multiple copies of substantially similar or identical or nearly identical DNA segments that are typically joined as substantially tandem or successive repeats or inverted repeats.
[0066] As used herein, amplification-based artificial chromosomes are artificial chromosomes derived from natural or endogenous chromosomes by virtue of an amplification event, such as one initiated by introduction of heterologous nucleic acid into rDNA in a chromosome. As a result of such an event, chromosomes and fragments thereof exhibiting segmented or repeating patterns arise. Artificial chromosomes can be formed from these chromosomes and fragments. Hence, amplification-based artificial chromosomes refer to engineered chromosomes that exhibit an ordered segmentation that is not observed in naturally occurring chromosomes and that distinguishes them from naturally occurring chromosomes. The segmentation, which can be visualized using a variety of chromosome analysis techniques known to those of skill in the art, correlates with the structure of these artificial chromosomes. In addition to containing one or more centromeres, the amplification-based artificial chromosomes, throughout the region or regions of segmentation are predominantly made up of nucleic acid units also referred to as "amplicons", that is (are) repeated in the region and that have a similar gross structure. Repeats of an amplicon tend to be of similar size and share some common nucleic acid sequences. For example, each repeat of an amplicon may contain a replication site involved in amplification of chromosome segments and/or some heterologous nucleic acid that was utilized in the initial production of the artificial chromosome. Typically, the repeating units are substantially similar in nucleic acid composition and may be nearly identical.
[0067] The amplification-based artificial chromosomes differ depending on the chromosomal region that has undergone amplification in the process of artificial chromosome formation. The structures of the resulting chromosomes can vary depending upon the initiating event and/or the conditions under which the heterologous nucleic acid is introduced, including modification to the endogenous chromosomes. For example, in some of the artificial chromosomes provided herein, the region or regions of segmentation may be made up predominantly of heterochromatic DNA. In other artificial chromosomes provided herein, the region or regions of segmentation may be made up predominantly of euchromatic DNA or may be made up of similar amounts of heterochromatic and euchromatic DNA.
[0068] As used herein an amplicon is a repeated nucleic acid unit. In some of the artificial chromosomes described herein, an amplicon may contain a set of inverted repeats of a megareplicon. A megareplicon represents a higher order replication unit. For example, with reference to some of the predominantly heterochromatic artificial chromosomes, the megareplicon can contain a set of tandem DNA blocks (e.g., ˜7.5 Mb DNA blocks) each containing satellite DNA flanked by non-satellite DNA or may be made up of substantially rDNA. Contained within the megareplicon is a primary replication site, referred to as the megareplicator, which may be involved in organizing and facilitating replication of the pericentric heterochromatin and possibly the centromeres. Within the megareplicon there may be smaller (e.g., 50-300 kb) secondary replicons.
[0069] In artificial chromosomes, such as those provided in U.S. Pat. Nos. 6,025,155 and 6,077,697 and International PCT application No. WO 97/40183, the megareplicon is defined by two tandem blocks (˜7.5 Mb DNA blocks in the chromosomes provided therein). Within each artificial chromosome or among a population thereof, each amplicon has the same gross structure but may contain sequence variations. Such variations will arise as a result of movement of mobile genetic elements, deletions or insertions or mutations that arise, particularly in culture. Such variation does not affect the use of the artificial chromosomes or their overall structure as described herein.
[0070] As used herein, amplifiable, when used in reference to a chromosome, particularly the method of generating artificial chromosomes provided herein, refers to a region of a chromosome that is prone to amplification. Amplification typically occurs during replication and other cellular events involving recombination (e.g., DNA repair). Such regions include regions of the chromosome that contain tandem repeats, such as satellite DNA, rDNA, and other such sequences.
[0071] As used herein, a dicentric chromosome is a chromosome that contains two centromeres. A multicentric chromosome contains more than two centromeres.
[0072] As used herein, a formerly dicentric chromosome is a chromosome that is produced when a dicentric chromosome fragments and acquires new telomeres so that two chromosomes, each having one of the centromeres, are produced. Each of the fragments is a replicable chromosome. If one of the chromosomes undergoes amplification of primarily euchromatic DNA to produce a fully functional chromosome that is predominantly (at least more than 50%) euchromatin, it is a minichromosome. The remaining chromosome is a formerly dicentric chromosome. If one of the chromosomes undergoes amplification, whereby heterochromatin (such as, for example, satellite DNA) is amplified and a euchromatic portion (such as, for example, an arm) remains, it is referred to as a sausage chromosome. A chromosome that is substantially all heterochromatin, except for portions of heterologous DNA, is called a predominantly heterochromatic artificial chromosome. Predominantly heterochromatic artificial chromosomes can be produced from other partially heterochromatic artificial chromosomes by culturing the cell containing such chromosomes under conditions such as BrdU treatment that destabilize the chromosome and/or growth under selective conditions so that a predominantly heterochromatic artificial chromosome is produced. For purposes herein, it is understood that the artificial chromosomes may not necessarily be produced in multiple steps, but may appear after the initial introduction of the heterologous DNA. Typically, artificial chromosomes appear after about 5 to about 60, or about 5 to about 55, or about 10 to about 55 or about 25 to about 55 or about 35 to about 55 cell doublings after initiation of artificial chromosome generation, or they may appear after several cycles of growth under selective conditions and BrdU treatment.
[0073] As used herein, an artificial chromosome that is predominantly heterochromatic (i.e., containing more heterochromatin than euchromatin, typically more than about 50%, more than about 70%, or more than about 90% heterochromatin) may be produced by introducing nucleic acid molecules into cells, such as, for example, animal or plant cells, and selecting cells that contain a predominantly heterochromatic artificial chromosome. Any nucleic acid may be introduced into cells in such methods of producing the artificial chromosomes. For example, the nucleic acid may contain a selectable marker and/or optionally a sequence that targets nucleic acid to the pericentric, heterochromatic region of a chromosome, such as in the short arm of acrocentric chromosomes and nucleolar organizing regions. Targeting sequences include, but are not limited to, lambda phage DNA and rDNA for production of predominantly heterochromatic artificial chromosomes in eukaryotic cells.
[0074] After introducing the nucleic acid into cells, a cell containing a predominantly heterochromatic artificial chromosome is selected. Such cells may be identified using a variety of procedures. For example, repeating units of heterochromatic DNA of these chromosomes may be discerned by G-banding and/or fluorescence in situ hybridization (FISH) techniques. Prior to such analyses, the cells to be analyzed may be enriched with artificial chromosome-containing cells by sorting the cells on the basis of the presence of a selectable marker, such as a reporter protein, or by growing (culturing) the cells under selective conditions. It is also possible, after introduction of nucleic acids into cells, to select cells that have a multicentric, typically dicentric, chromosome, a formerly multicentric (typically dicentric) chromosome and/or various heterochromatic structures, such as a megachromosome and a sausage chromosome, that contain a centromere and are predominantly heterochromatic and to treat them such that desired artificial chromosomes are produced. Cells containing a new chromosome are selected. Conditions for generation of a desired structure include, but are not limited to, further growth under selective conditions, introduction of additional nucleic acid molecules and/or growth under selective conditions and treatment with destabilizing agents, and other such methods (see International PCT application No. WO 97/40183 and U.S. Pat. Nos. 6,025,155 and 6,077,697).
[0075] As used herein, a "selectable marker" is a nucleic acid segment, generally DNA, that allows one to select for or against a molecule or a cell that contains it, often under particular conditions. These markers can encode an activity, such as, but not limited to, production of RNA, peptide, or protein, or can provide a binding site for RNA, peptides, proteins, inorganic and organic compounds and compositions. Examples of selectable markers include but are not limited to: (1) nucleic acid segments that encode products that provide resistance against otherwise toxic compounds (e.g., antibiotics); (2) nucleic acid segments that encode products that are otherwise lacking in the recipient cell (e.g., tRNA genes, auxotrophic markers); (3) nucleic acid segments that encode products that suppress the activity of a gene product; (4) nucleic acid segments that encode products that can be identified, such as phenotypic markers, including β-galactosidase, red, blue and/or green fluorescent proteins (FPs), and cell surface proteins; (5) nucleic acid segments that bind products that are otherwise detrimental to cell survival and/or function; (6) nucleic acid segments that otherwise inhibit the activity of any of the nucleic acid segments described in Nos. 1-5 above (e.g., antisense oligonucleotides or siRNA molecules for use in RNA interference); (7) nucleic acid segments that bind products that modify a substrate (e.g. restriction endonucleases); (8) nucleic acid segments that can be used to isolate a desired molecule (e.g. specific protein binding sites); (9) nucleic acid segments that encode a specific nucleotide sequence that can be otherwise non-functional, such as for PCR amplification of subpopulations of molecules; and/or (10) nucleic acid segments, which when absent, directly or indirectly confer sensitivity to particular compounds. 
Thus, for example, selectable markers include nucleic acids encoding fluorescent proteins, such as green fluorescent proteins, β-galactosidase and other readily detectable proteins, such as chromogenic proteins or proteins capable of being bound by an antibody and sorted by FACS. Selectable markers such as these, which are not required for cell survival and/or proliferation in the presence of a selection agent, are also referred to herein as reporter molecules. Other selectable markers, e.g., the neomycin phosphotransferase gene, provide for isolation and identification of cells containing them by conferring properties on the cells that make them resistant to an agent, e.g., a drug such as an antibiotic, that inhibits proliferation of cells that do not contain the marker.
[0076] As another example, interference of gene expression by double stranded RNA has been shown in Caenorhabditis elegans, plants, Drosophila, protozoans and mammals. This method is known as RNA interference (RNAi) and utilizes short, double-stranded RNA molecules (siRNAs). The siRNAs are generally composed of a 19-22 bp double-stranded RNA stem, a loop region and a 1-4 bp overhang on the 3' end. The reduction of gene expression has been accomplished by direct introduction of the siRNAs into the cell (Harborth J et al., 2001, J Cell Sci 114(pt 24):4557-65) as well as by the introduction of DNA encoding and expressing the siRNA molecule. The encoded siRNA molecules are under the regulation of an RNA polymerase III promoter (see, e.g., Yu et al., 2002, Proc Natl Acad Sci USA 99(9):6047-52; Brummelkamp et al., 2002, Science 296(5567):550-3; Miyagishi et al., 2002, Nat Biotechnol 20(5):497-500; and the like). In certain embodiments, RNAi in mammalian cells may have advantages over other therapeutic methods. For example, producing siRNA molecules that block viral genetic activities in infected cells may reduce the effects of the virus. Platform ACes provided herein encoding siRNA molecule(s) are an additional utilization of the platform ACes technology. The platform ACes could be engineered to encode one or more siRNA molecules to create gene "knockdowns". In one embodiment, a platform ACes can be engineered to encode both the siRNA molecule and a replacement gene. For example, a mouse model or cell culture system could be generated using a platform ACes that has a knockdown of the endogenous mouse gene, by siRNA, and the human gene homolog expressed in place of the mouse gene. The placement of siRNA encoding sequences under the regulation of a regulatable or inducible promoter would allow one to temporally and/or spatially control the knockdown effect of the corresponding gene.
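The stem, loop and overhang arrangement described above can be illustrated with a minimal Python sketch that assembles a DNA template for a hairpin siRNA. This is an illustration under stated assumptions, not a design procedure taken from this application: the function names, the placeholder loop sequence and the placeholder overhang are hypothetical, and only the 19-22 length of the stem and the 1-4 length of the 3' overhang come from the text.

```python
# Illustrative sketch only. The loop and overhang defaults are placeholders
# chosen for the example; they are NOT specified in the text above.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def hairpin_template(sense: str, loop: str = "TTCAAGAGA", overhang: str = "TT") -> str:
    """Assemble sense stem + loop + antisense stem + 3' overhang as DNA.

    The transcribed RNA would carry U in place of T. Length limits follow
    the text: a 19-22 stem and a 1-4 overhang on the 3' end.
    """
    sense = sense.upper()
    if set(sense) - set("ACGT"):
        raise ValueError("sense strand must contain only A, C, G and T")
    if not 19 <= len(sense) <= 22:
        raise ValueError("stem must be 19-22 nucleotides long")
    if not 1 <= len(overhang) <= 4:
        raise ValueError("3' overhang must be 1-4 nucleotides long")
    return sense + loop.upper() + reverse_complement(sense) + overhang.upper()


if __name__ == "__main__":
    # Hypothetical 19-nt target sequence used purely as an example.
    print(hairpin_template("GCAAGCTGACCCTGAAGTT"))
```

The antisense arm is derived from the sense arm, so the two arms can base-pair to form the stem. Promoter and terminator elements are deliberately left out of the sketch.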
[0077] As used herein, a reporter gene includes any gene that expresses a detectable gene product, which may be RNA or protein. Generally reporter genes are readily detectable. Examples of reporter genes include, but are not limited to, nucleic acid encoding a fluorescent protein; CAT (chloramphenicol acetyl transferase) (Alton et al. (1979) Nature 282:864-869); luciferase and other enzyme detection systems, such as beta-galactosidase; firefly luciferase (deWet et al. (1987) Mol. Cell. Biol. 7:725-737); bacterial luciferase (Engebrecht and Silverman (1984) Proc. Natl. Acad. Sci. U.S.A. 81:4154-4158; Baldwin et al. (1984) Biochemistry 23:3663-3667); and alkaline phosphatase (Toh et al. (1989) Eur. J. Biochem. 182:231-238; Hall et al. (1983) J. Mol. Appl. Gen. 2:101).
[0078] As used herein, growth under selective conditions means growth of a cell under conditions that require expression of a selectable marker for survival.
[0079] As used herein, an agent that destabilizes a chromosome is any agent known by those skilled in the art to enhance amplification events and/or mutations. Such agents, which include BrdU, are well known to those skilled in the art.
[0080] In order to generate an artificial chromosome containing a particular heterologous nucleic acid of interest, it is possible to include the nucleic acid in the nucleic acid that is being introduced into cells to initiate production of the artificial chromosome. Thus, for example, a nucleic acid can be introduced into a cell along with nucleic acid encoding a selectable marker and/or a nucleic acid that targets to a heterochromatic region of a chromosome. For introducing a heterologous nucleic acid into the cell, it can be included in a fragment that includes a selectable marker or as part of a separate nucleic acid fragment and introduced into the cell with a selectable marker during the process of generating the artificial chromosomes. Alternatively, heterologous nucleic acid can be introduced into an artificial chromosome at a later time after the initial generation of the artificial chromosome.
[0081] As used herein, the minichromosome refers to a chromosome derived from a multicentric, typically dicentric, chromosome that contains more euchromatic than heterochromatic DNA. For purposes herein, the minichromosome contains a de novo centromere (e.g., a neocentromere). In some embodiments, for example, the minichromosome contains a centromere that replicates in animals, e.g., a mammalian centromere or in plants, e.g., a plant centromere.
[0082] As used herein, in vitro assembled artificial chromosomes or synthetic chromosomes can be either more euchromatic than heterochromatic or more heterochromatic than euchromatic and are produced by joining essential components of a chromosome in vitro. These components include at least a centromere, a megareplicator, a telomere and optionally secondary origins of replication.
[0083] As used herein, in vitro assembled plant or animal artificial chromosomes are produced by joining essential components (at least the centromere, telomere(s), megareplicator and optional secondary origins of replication) that function in plants or animals. In particular embodiments, the megareplicator contains sequences of rDNA, particularly plant or animal rDNA.
[0084] As used herein, a plant is a eukaryotic organism that contains, in addition to a nucleus and mitochondria, chloroplasts capable of carrying out photosynthesis. A plant can be unicellular or multicellular and can contain multiple tissues and/or organs. Plants can reproduce sexually or asexually and can be perennial or annual in growth. Plants can also be terrestrial or aquatic. The term "plant" includes a whole plant, plant cell, plant protoplast, plant calli, plant seed, plant organ, plant tissue, and other parts of a whole plant.
[0085] As used herein, stable maintenance of chromosomes occurs when at least about 85%, preferably 90%, more preferably 95%, of the cells retain the chromosome. Stability is measured in the presence of a selective agent. Preferably these chromosomes are also maintained in the absence of a selective agent. Stable chromosomes also retain their structure during cell culturing, suffering no unintended intrachromosomal or interchromosomal rearrangements.
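The retention thresholds in this definition can be expressed as a short Python sketch. It is a simplification under stated assumptions: the function name and tier labels are hypothetical, and the word "about" in the definition is treated as a hard cutoff for illustration only. Structural integrity (absence of rearrangements) is not modeled.

```python
def maintenance_tier(cells_retaining: int, cells_scored: int) -> str:
    """Label chromosome retention against the 85% / 90% / 95% thresholds.

    Integer arithmetic is used so that values exactly on a threshold
    (for example 90 of 100 cells) are not affected by rounding.
    """
    if cells_scored <= 0:
        raise ValueError("cells_scored must be positive")
    if not 0 <= cells_retaining <= cells_scored:
        raise ValueError("cells_retaining must be between 0 and cells_scored")
    percent_times_total = cells_retaining * 100
    if percent_times_total >= 95 * cells_scored:
        return "stable (more preferred, >=95%)"
    if percent_times_total >= 90 * cells_scored:
        return "stable (preferred, >=90%)"
    if percent_times_total >= 85 * cells_scored:
        return "stable (>=85%)"
    return "not stably maintained"


if __name__ == "__main__":
    # Hypothetical counts, e.g. from scoring metaphase spreads.
    print(maintenance_tier(46, 50))
```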
[0086] As used herein, de novo with reference to a centromere, refers to generation of an excess centromere in a chromosome as a result of incorporation of a heterologous nucleic acid fragment using the methods herein.
[0087] As used herein, BrdU refers to 5-bromodeoxyuridine, which during replication is inserted in place of thymidine. BrdU is used as a mutagen; it also inhibits condensation of metaphase chromosomes during cell division.
[0088] As used herein, ribosomal RNA (rRNA) is the specialized RNA that forms part of the structure of a ribosome and participates in the synthesis of proteins. Ribosomal RNA is produced by transcription of genes which, in eukaryotic cells, are present in multiple copies. In human cells, the approximately 250 copies of rRNA genes (i.e., genes which encode rRNA) per haploid genome are spread out in clusters on at least five different chromosomes (chromosomes 13, 14, 15, 21 and 22). In mouse cells, the presence of ribosomal DNA (rDNA, which is DNA containing sequences that encode rRNA) has been verified on at least 11 pairs out of 20 mouse chromosomes (chromosomes 5, 6, 7, 9, 11, 12, 15, 16, 17, 18, and 19) (see e.g., Rowe et al. (1996) Mamm. Genome 7:886-889 and Johnson et al. (1993) Mamm. Genome 4:49-52). In Arabidopsis thaliana the presence of rDNA has been verified on chromosomes 2 and 4 (18S, 5.8S, and 25S rDNA) and on chromosomes 3, 4, and 5 (5S rDNA)(see The Arabidopsis Genome Initiative (2000) Nature 408:796-815). In eukaryotic cells, the multiple copies of the highly conserved rRNA genes are located in a tandemly arranged series of rDNA units, which are generally about 40-45 kb in length and contain a transcribed region and a nontranscribed region known as spacer (i.e., intergenic spacer) DNA which can vary in length and sequence. In the human and mouse, these tandem arrays of rDNA units are located adjacent to the pericentric satellite DNA sequences (heterochromatin). The regions of these chromosomes in which the rDNA is located are referred to as nucleolar organizing regions (NOR) which loop into the nucleolus, the site of ribosome production within the cell nucleus.
[0089] As used herein, a megachromosome refers to a chromosome that, except for introduced heterologous DNA, is substantially composed of heterochromatin. Megachromosomes are made up of an array of repeated amplicons that contain two inverted megareplicons bordered by introduced heterologous DNA (see, e.g., FIG. 3 of U.S. Pat. No. 6,077,697 for a schematic drawing of a megachromosome). For purposes herein, a megachromosome is about 50 to 400 Mb, generally about 250-400 Mb. Shorter variants are also referred to as truncated megachromosomes (about 90 to 120 or 150 Mb), dwarf megachromosomes (˜150-200 Mb), and a micro-megachromosome (˜50-90 Mb, typically 50-60 Mb). For purposes herein, the term megachromosome refers to the overall repeated structure based on an array of repeated chromosomal segments (amplicons) that contain two inverted megareplicons bordered by any inserted heterologous DNA. The size will be specified.
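The size ranges named in this definition can be collected into a small lookup, sketched below in Python. The ranges are copied from the text and treated as inclusive. The label "full-size megachromosome" for the 250-400 Mb range and the function name are assumptions made for the example. Because the stated ranges share endpoints and leave 200-250 Mb unnamed, the sketch returns every matching label rather than forcing a single class.

```python
from typing import Dict, List, Tuple

# Approximate size ranges in Mb, taken from the definition above.
# "full-size megachromosome" is a label invented for this sketch.
SIZE_CLASSES_MB: Dict[str, Tuple[float, float]] = {
    "micro-megachromosome": (50, 90),
    "truncated megachromosome": (90, 150),
    "dwarf megachromosome": (150, 200),
    "full-size megachromosome": (250, 400),
}


def size_classes(size_mb: float) -> List[str]:
    """Return all named size variants whose inclusive range contains size_mb."""
    return [
        name
        for name, (low, high) in SIZE_CLASSES_MB.items()
        if low <= size_mb <= high
    ]


if __name__ == "__main__":
    for size in (60, 150, 220, 300):
        print(size, size_classes(size))
```

An empty list for a size within 50-400 Mb simply means the text gives no variant name for it, which is consistent with the statement that the size will be specified.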
[0090] As used herein, gene therapy involves the transfer or insertion of nucleic acid molecules into certain cells, which are also referred to as target cells, to produce specific products that are involved in preventing, curing, correcting, controlling or modulating diseases, disorders and deleterious conditions. The nucleic acid is introduced into the selected target cells in a manner such that the nucleic acid is expressed and a product encoded thereby is produced. Alternatively, the nucleic acid may in some manner mediate expression of DNA that encodes a therapeutic product. This product may be a therapeutic compound, which is produced in therapeutically effective amounts or at a therapeutically useful time. It may also encode a product, such as a peptide or RNA, that in some manner mediates, directly or indirectly, expression of a therapeutic product. Expression of the nucleic acid by the target cells within an organism afflicted with a disease or disorder thereby provides for modulation of the disease or disorder. The nucleic acid encoding the therapeutic product may be modified prior to introduction into the cells of the afflicted host in order to enhance or otherwise alter the product or expression thereof.
[0091] For use in gene therapy, cells can be transfected in vitro, followed by introduction of the transfected cells into an organism. This is often referred to as ex vivo gene therapy. Alternatively, the cells can be transfected directly in vivo within an organism.
[0092] As used herein, therapeutic agents include, but are not limited to, growth factors, antibodies, cytokines, such as tumor necrosis factors and interleukins, and cytotoxic agents and other agents disclosed herein and known to those of skill in the art. Such agents include, but are not limited to, tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or biological response modifiers such as, for example, lymphokines, interleukin-1 (IL-1), interleukin-2 (IL-2), interleukin-6 (IL-6), granulocyte macrophage colony stimulating factor (GMCSF), granulocyte colony stimulating factor (G-CSF), erythropoietin (EPO), pro-coagulants such as tissue factor and tissue factor variants, pro-apoptotic agents such as FAS-ligand, fibroblast growth factors (FGF), nerve growth factor and other growth factors.
[0093] As used herein, a therapeutically effective product is a product that is encoded by heterologous DNA and that, upon introduction of the DNA into a host, is expressed and effectively ameliorates or eliminates the symptoms or manifestations of an inherited or acquired disease, or cures the disease.
[0094] As used herein, transgenic plants and animals refer to plants and animals in which heterologous or foreign nucleic acid is expressed or in which the expression of a gene naturally present in the plant or animal has been altered by virtue of introduction of heterologous or foreign nucleic acid.
[0095] As used herein, IRES (internal ribosome entry site; see, e.g., SEQ ID No. 27 and nucleotides 2736-3308 of SEQ ID No. 28) refers to a region of a nucleic acid molecule, such as an mRNA molecule, that allows internal ribosome entry sufficient to initiate translation, which initiation can be detected in an assay for cap-independent translation (see, e.g., U.S. Pat. No. 6,171,821). The presence of an IRES within an mRNA molecule allows cap-independent translation of a linked protein-encoding sequence that otherwise would not be translated.
[0096] Internal ribosome entry site (IRES) elements were first identified in picornaviruses, which elements are considered the paradigm for cap-independent translation. The 5' UTRs of all picornaviruses are long and mediate translational initiation by directly recruiting and binding ribosomes, thereby circumventing the initial cap-binding step. IRES elements are frequently found in viral mRNA but are rare in non-viral mRNA. Among non-viral mRNA molecules that contain functional IRES elements in their respective 5' UTRs are those encoding immunoglobulin heavy chain binding protein (BiP) (Macejak et al. (1991) Nature 353:90-94); Drosophila Antennapedia (Oh et al. (1992) Genes Dev. 6:1643-1653); D. Ultrabithorax (Ye et al. (1997) Mol. Cell. Biol. 17:1714-21); fibroblast growth factor 2 (Vagner et al. (1995) Mol. Cell. Biol. 15:35-44); initiation factor eIF4G (Gan et al. (1998) J. Biol. Chem. 273:5006-5012); proto-oncogene c-myc (Nanbru et al. (1995) J. Biol. Chem. 272:32061-32066; Stoneley (1998) Oncogene 16:423-428); IRESH; from the 5'UTR of NRF1 gene (Oumard et al. (2000) Mol. and Cell Biol., 20(8):2755-2759); and vascular endothelial growth factor (VEGF) (Stein et al. (1998) Mol. Cell. Biol. 18:3112-9).
[0097] As used herein, a promoter, with respect to a region of DNA, refers to a sequence of DNA that contains a sequence of bases that signals RNA polymerase to associate with the DNA and initiate transcription of RNA (such as pol II for mRNA) from a template strand of the DNA. A promoter thus generally regulates transcription of DNA into mRNA. A particular promoter provided herein is the Ferritin heavy chain promoter (excluding the Iron Response Element, located in the 5'UTR), which was joined to the 37 bp Fer-1 enhancer element. This promoter is set forth as SEQ ID NO:128. The endogenous Fer-1 enhancer element is located upstream of the Fer-1 promoter (e.g., a Fer-1 oligo was cloned proximal to the core promoter).
[0098] As used herein, isolated, substantially pure nucleic acid, such as, for example, DNA, refers to nucleic acid fragments purified according to standard techniques employed by those skilled in the art, such as that found in Sambrook et al. ((2001) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 3rd edition).
[0099] As used herein, expression refers to the transcription and/or translation of nucleic acid. For example, expression can be the transcription of a gene into an RNA molecule, such as a messenger RNA (mRNA) molecule. Expression may further include translation of an RNA molecule into peptides, polypeptides, or proteins. If the nucleic acid is derived from genomic DNA, expression may, if an appropriate eukaryotic host cell or organism is selected, include splicing of the mRNA. With respect to an antisense construct, expression may refer to the transcription of the antisense DNA. As used herein, vector or plasmid refers to discrete elements that are used to introduce heterologous nucleic acids into cells for either expression of the heterologous nucleic acid or for replication of the heterologous nucleic acid. Selection and use of such vectors and plasmids are well within the level of skill of the art.
[0100] As used herein, transformation/transfection refers to the process by which nucleic acid is introduced into cells. The terms transfection and transformation refer to the taking up of exogenous nucleic acid, e.g., an expression vector, by a host cell whether or not any coding sequences are in fact expressed. Numerous methods of transfection are known to the ordinarily skilled artisan, for example, Agrobacterium-mediated transformation, protoplast transformation (including polyethylene glycol (PEG)-mediated transformation, electroporation, protoplast fusion, and microcell fusion), lipid-mediated delivery, liposomes, electroporation, sonoporation, microinjection, particle bombardment and silicon carbide whisker-mediated transformation and combinations thereof (see, e.g., Paszkowski et al. (1984) EMBO J. 3:2717-2722; Potrykus et al. (1985) Mol. Gen. Genet. 199:169-177; Reich et al. (1986) Biotechnology 4:1001-1004; Klein et al. (1987) Nature 327:70-73; U.S. Pat. No. 6,143,949; Paszkowski et al. (1989) in Cell Culture and Somatic Cell Genetics of Plants, Vol. 6, Molecular Biology of Plant Nuclear Genes, eds. Schell, J and Vasil, L.K. Academic Publishers, San Diego, Calif., p. 52-68; and Frame et al. (1994) Plant J. 6:941-948), direct uptake using calcium phosphate (CaPO4; see, e.g., Wigler et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1373-1376), polyethylene glycol (PEG)-mediated DNA uptake, lipofection (see, e.g., Strauss (1996) Meth. Mol. Biol. 54:307-327), microcell fusion (see, EXAMPLES, see, also Lambert (1991) Proc. Natl. Acad. Sci. U.S.A. 88:5907-5911; U.S. Pat. No. 5,396,767, Sawford et al. (1987) Somatic Cell Mol. Genet. 13:279-284; Dhar et al. (1984) Somatic Cell Mol. Genet. 10:547-559; and McNeill-Killary et al. (1995) Meth. Enzymol. 254:133-152), lipid-mediated carrier systems (see, e.g., Teifel et al. (1995) Biotechniques 19:79-80; Albrecht et al. (1996) Ann. Hematol. 72:73-79; Holmen et al. (1995) In Vitro Cell Dev. Biol. Anim. 31:347-351; Remy et al. (1994) Bioconjug. Chem. 5:647-654; Le Bolch et al. (1995) Tetrahedron Lett. 36:6681-6684; Loeffler et al. (1993) Meth. Enzymol. 217:599-618) or other suitable method. Methods for delivery of ACes are described in copending U.S. application Ser. No. 09/815,979. Successful transfection is generally recognized by detection of the presence of the heterologous nucleic acid within the transfected cell, such as, for example, any visualization of the heterologous nucleic acid or any indication of the operation of a vector within the host cell.
[0101] As used herein, "delivery," which is used interchangeably with "transfection," refers to the process by which exogenous nucleic acid molecules are transferred into a cell such that they are located inside the cell. Delivery of nucleic acids is a distinct process from expression of nucleic acids.
[0102] As used herein, injected refers to the microinjection, such as by use of a small syringe, needle, or pipette, for injection of nucleic acid into a cell.
[0103] As used herein, substantially homologous DNA refers to DNA that includes a sequence of nucleotides that is sufficiently similar to another such sequence to form stable hybrids, with each other or a reference sequence, under specified conditions.
[0104] It is well known to those of skill in this art that nucleic acid fragments with different sequences may, under the same conditions, hybridize detectably to the same "target" nucleic acid. Two nucleic acid fragments hybridize detectably, under stringent conditions over a sufficiently long hybridization period, because one fragment contains a segment of at least about 10, 14 or 16 or more nucleotides in a sequence that is complementary (or nearly complementary) to a substantially contiguous sequence of at least one segment in the other nucleic acid fragment. If the time during which hybridization is allowed to occur is held constant, at a value during which, under preselected stringency conditions, two nucleic acid fragments with complementary base-pairing segments hybridize detectably to each other, departures from exact complementarity can be introduced into the base-pairing segments, and base-pairing will nonetheless occur to an extent sufficient to make hybridization detectable. As the departure from complementarity between the base-pairing segments of two nucleic acids becomes larger, and as conditions of the hybridization become more stringent, the probability decreases that the two segments will hybridize detectably to each other.
[0105] Two single-stranded nucleic acid segments have "substantially the same sequence", if (a) both form a base-paired duplex with the same segment, and (b) the melting temperatures of the two duplexes in a solution of 0.5×SSPE differ by less than 10° C. If the segments being compared have the same number of bases, then to have "substantially the same sequence", they will typically differ in their sequences at fewer than 1 base in 10. Methods for determining melting temperatures of nucleic acid duplexes are well known (see, e.g., Meinkoth et al. (1984) Anal. Biochem. 138:267-284 and references cited therein).
[0106] As used herein, a nucleic acid probe is a DNA or RNA fragment that includes a sufficient number of nucleotides to specifically hybridize to DNA or RNA that includes complementary or substantially complementary sequences of nucleotides. A probe may contain any number of nucleotides, from as few as about 10 and as many as hundreds of thousands of nucleotides. The conditions and protocols for such hybridization reactions are well known to those of skill in the art as are the effects of probe size, temperature, degree of mismatch, salt concentration and other parameters on the hybridization reaction. For example, the lower the temperature and higher the salt concentration at which the hybridization reaction is carried out, the greater the degree of mismatch that may be present in the hybrid molecules.
[0107] To be used as a hybridization probe, the nucleic acid is generally rendered detectable by labeling it with a detectable moiety or label, such as 32P, 3H and 14C, or by other means, including chemical labeling, such as by nick-translation in the presence of deoxyuridylate biotinylated at the 5'-position of the uracil moiety. The resulting probe includes the biotinylated uridylate in place of thymidylate residues and can be detected (via the biotin moieties) by any of a number of commercially available detection systems based on binding of streptavidin to the biotin. Such commercially available detection systems can be obtained, for example, from Enzo Biochemicals, Inc. (New York, N.Y.). Any other label known to those of skill in the art, including non-radioactive labels, may be used as long as it renders the probes sufficiently detectable, which is a function of the sensitivity of the assay, the time available (for culturing cells, extracting DNA, and hybridization assays), the quantity of DNA or RNA available as a source of the probe, the particular label and the means used to detect the label.
[0108] Once sequences with a sufficiently high degree of homology to the probe are identified, they can readily be isolated by standard techniques (see, e.g., Sambrook et al. (2001) Molecular Cloning: A Laboratory Manual, 3rd Edition, Cold Spring Harbor Laboratory Press).
[0109] As used herein, DNA molecules that form stable hybrids under the specified conditions are considered substantially homologous, and a DNA or nucleic acid homolog refers to a nucleic acid that includes a preselected conserved nucleotide sequence, such as a sequence encoding a polypeptide. By the term "substantially homologous" is meant having at least 75%, preferably at least 80%, more preferably at least 90%, most preferably at least 95% homology therewith, or a lesser percentage of homology or identity and conserved biological activity or function.
[0110] The terms "homology" and "identity" are often used interchangeably. In this regard, percent homology or identity may be determined, for example, by comparing sequence information using a GAP computer program. The GAP program utilizes the alignment method of Needleman and Wunsch (J. Mol. Biol. 48:443 (1970)), as revised by Smith and Waterman (Adv. Appl. Math. 2:482 (1981)). Briefly, the GAP program defines similarity as the number of aligned symbols (i.e., nucleotides or amino acids) which are similar, divided by the total number of symbols in the shorter of the two sequences. The preferred default parameters for the GAP program may include: (1) a unary comparison matrix (containing a value of 1 for identities and 0 for non-identities) and the weighted comparison matrix of Gribskov and Burgess, Nucl. Acids Res. 14:6745 (1986), as described by Schwartz and Dayhoff, eds., ATLAS OF PROTEIN SEQUENCE AND STRUCTURE, National Biomedical Research Foundation, pp. 353-358 (1979); (2) a penalty of 3.0 for each gap and an additional 0.10 penalty for each symbol in each gap; and (3) no penalty for end gaps.
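The similarity measure and gap penalties described above may be illustrated by the following minimal sketch. It assumes that the two sequences have already been aligned (the alignment method itself is not implemented here) and that gaps are written as "-". The function and constant names are illustrative only and are not part of the GAP program.

```python
import re

# Default parameters recited above; the constant names are hypothetical.
GAP_PENALTY = 3.0          # penalty for each gap
GAP_SYMBOL_PENALTY = 0.10  # additional penalty for each symbol in each gap


def gap_similarity(aligned_a: str, aligned_b: str) -> float:
    """Similar aligned symbols divided by the length of the shorter sequence.

    Uses the unary comparison matrix: 1 for identities, 0 for non-identities.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    similar = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    shorter = min(len(aligned_a.replace("-", "")), len(aligned_b.replace("-", "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one symbol")
    return similar / shorter


def total_gap_penalty(aligned: str) -> float:
    """Sum the penalties for internal gaps; end gaps are not penalized."""
    internal = aligned.strip("-")
    return sum(
        GAP_PENALTY + GAP_SYMBOL_PENALTY * len(run)
        for run in re.findall(r"-+", internal)
    )


if __name__ == "__main__":
    a = "ACGT--ACGT"
    b = "ACGTTTACGA"
    print(gap_similarity(a, b))   # 7 identities / 8 symbols in the shorter sequence
    print(total_gap_penalty(a))   # one internal gap of two symbols
```

In this sketch the similarity is normalized by the ungapped length of the shorter sequence, following the definition recited above.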
[0111] For sequence identity, the number of conserved amino acids is determined by standard alignment algorithm programs, used with the default gap penalties established by each supplier. Substantially homologous nucleic acid molecules would typically hybridize at moderate stringency or at high stringency all along the length of the nucleic acid of interest. Preferably the two molecules will hybridize under conditions of high stringency. Also contemplated are nucleic acid molecules that contain degenerate codons in place of codons in the hybridizing nucleic acid molecule.
[0112] Whether any two nucleic acid molecules have nucleotide sequences that are at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% "identical" can be determined using known computer algorithms such as the "FASTA" program, using for example, the default parameters as in Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85:2444 (1988). Alternatively the BLAST function of the National Center for Biotechnology Information database may be used to determine relative sequence identity.
[0113] In general, sequences are aligned so that the highest order match is obtained. "Identity" per se has an art-recognized meaning and can be calculated using published techniques. (See, e.g.: Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991). While there exist a number of methods to measure identity between two polynucleotide or polypeptide sequences, the term "identity" is well known to skilled artisans (Carrillo, H. & Lipman, D., SIAM J Applied Math 48:1073 (1988)). Methods commonly employed to determine identity or similarity between two sequences include, but are not limited to, those disclosed in Guide to Huge Computers, Martin J. Bishop, ed., Academic Press, San Diego, 1994, and Carrillo, H. & Lipman, D., SIAM J Applied Math 48:1073 (1988). Methods to determine identity and similarity are codified in computer programs. Preferred computer program methods to determine identity and similarity between two sequences include, but are not limited to, GCG program package (Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984)), BLASTP, BLASTN, FASTA (Altschul, S. F., et al., J Molec Biol 215:403 (1990)).
[0114] Therefore, as used herein, the term "identity" represents a comparison between a test and a reference polypeptide or polynucleotide. For example, a test polypeptide may be defined as any polypeptide that is 90% or more identical to a reference polypeptide.
[0115] As used herein, the term at least "90% identical to" refers to percent identities from 90 to 99.99 relative to the reference polypeptides. Identity at a level of 90% or more is indicative of the fact that, assuming for exemplification purposes that a test and a reference polypeptide 100 amino acids in length are compared, no more than 10% (i.e., 10 out of 100) of the amino acids in the test polypeptide differ from those of the reference polypeptide. Similar comparisons may be made between test and reference polynucleotides. Such differences may be represented as point mutations randomly distributed over the entire length of an amino acid sequence or they may be clustered in one or more locations of varying length up to the maximum allowable, e.g. 10/100 amino acid difference (approximately 90% identity). Differences are defined as nucleic acid or amino acid substitutions, or deletions.
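The 90% identity threshold exemplified above may be sketched as follows. This is a simplified illustration that assumes the test and reference sequences have the same length and are compared position by position, so that only substitutions (not deletions) are counted; the function names are hypothetical.

```python
def percent_identity(test: str, reference: str) -> float:
    """Percentage of positions at which the test matches the reference.

    Simplifying assumption: equal-length sequences compared position by
    position, so only substitutions are counted as differences.
    """
    if len(test) != len(reference):
        raise ValueError("this sketch compares equal-length sequences only")
    if not reference:
        raise ValueError("sequences must not be empty")
    same = sum(1 for a, b in zip(test, reference) if a == b)
    return 100.0 * same / len(reference)


def is_at_least_90_percent_identical(test: str, reference: str) -> bool:
    """True when no more than 10 in 100 positions differ."""
    return percent_identity(test, reference) >= 90.0


if __name__ == "__main__":
    reference = "ACDEFGHIKL" * 10        # 100 residues
    test = "W" * 10 + reference[10:]      # 10 substitutions clustered at one end
    print(percent_identity(test, reference))
    print(is_at_least_90_percent_identical(test, reference))
```

The example places the ten differences in a single cluster; the same result is obtained when they are distributed over the length of the sequence.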
[0116] As used herein: stringency of hybridization in determining percentage mismatch encompasses the following conditions or equivalent conditions thereto:
[0117] 1) high stringency: 0.1×SSPE or SSC, 0.1% SDS, 65° C.
[0118] 2) medium stringency: 0.2×SSPE or SSC, 0.1% SDS, 50° C.
[0119] 3) low stringency: 1.0×SSPE or SSC, 0.1% SDS, 50° C.
or any combination of salt and temperature and other reagents that result in selection of the same degree of mismatch or matching. Equivalent conditions refer to conditions that select for substantially the same percentage of mismatch in the resulting hybrids. Additions of ingredients, such as formamide, Ficoll, and Denhardt's solution affect parameters such as the temperature under which the hybridization should be conducted and the rate of the reaction. Thus, hybridization in 5×SSC, in 20% formamide at 42° C. is substantially the same as the conditions recited above for hybridization under conditions of low stringency. The recipes for SSPE, SSC and Denhardt's and the preparation of deionized formamide are described, for example, in Sambrook et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Chapter 8; see, Sambrook et al., vol. 3, p. B.13, see, also, numerous catalogs that describe commonly used laboratory solutions. It is understood that equivalent stringencies may be achieved using alternative buffers, salts and temperatures. As used herein, all assays and procedures, such as hybridization reactions and antibody-antigen reactions, unless otherwise specified, are conducted under conditions recognized by those of skill in the art as standard conditions.
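For convenience, the three recited stringency conditions may be represented as a small lookup table, as in the following sketch. The class, field and function names are hypothetical, and the lookup matches only the exact recited conditions; it makes no attempt to model equivalent conditions (for example, those that include formamide).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HybridizationCondition:
    salt_x: float       # concentration of SSPE or SSC, in multiples of 1x
    sds_percent: float  # SDS concentration, percent
    temp_c: float       # temperature, degrees Celsius


# Conditions as recited in the definition of stringency above.
STRINGENCY = {
    "high": HybridizationCondition(salt_x=0.1, sds_percent=0.1, temp_c=65.0),
    "medium": HybridizationCondition(salt_x=0.2, sds_percent=0.1, temp_c=50.0),
    "low": HybridizationCondition(salt_x=1.0, sds_percent=0.1, temp_c=50.0),
}


def stringency_of(salt_x: float, temp_c: float, sds_percent: float = 0.1) -> Optional[str]:
    """Return the name of the recited condition that matches exactly, else None."""
    query = HybridizationCondition(salt_x, sds_percent, temp_c)
    for name, condition in STRINGENCY.items():
        if condition == query:
            return name
    return None


if __name__ == "__main__":
    print(stringency_of(0.1, 65.0))
    print(stringency_of(5.0, 42.0))  # an equivalent condition is not recognized by this sketch
```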
[0120] As used herein, conservative amino acid substitutions, such as those set forth in Table 1, are those that do not eliminate biological activity. Suitable conservative substitutions of amino acids are known to those of skill in this art and may be made generally without altering the biological activity of the resulting molecule. Those of skill in this art recognize that, in general, single amino acid substitutions in non-essential regions of a polypeptide do not substantially alter biological activity (see, e.g., Watson et al. Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224). Conservative amino acid substitutions are made, for example, in accordance with those set forth in TABLE 1 as follows:
TABLE 1
Original residue    Conservative substitution
Ala (A)             Gly; Ser, Abu
Arg (R)             Lys, orn
Asn (N)             Gln; His
Cys (C)             Ser
Gln (Q)             Asn
Glu (E)             Asp
Gly (G)             Ala; Pro
His (H)             Asn; Gln
Ile (I)             Leu; Val; Met; Nle; Nva
Leu (L)             Ile; Val; Met; Nle; Nva
Lys (K)             Arg; Gln; Glu
Met (M)             Leu; Tyr; Ile; NLe Val
Ornithine           Lys; Arg
Phe (F)             Met; Leu; Tyr
Ser (S)             Thr
Thr (T)             Ser
Trp (W)             Tyr
Tyr (Y)             Trp; Phe
Val (V)             Ile; Leu; Met; Nle; Nva
Other substitutions are also permissible and may be determined empirically or in accord with known conservative substitutions.
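By way of illustration only, a subset of the rows of TABLE 1 may be encoded as a lookup, as sketched below. The dictionary and function names are hypothetical, and the Met and Ornithine rows are omitted from this sketch. Note that the table is directional: a listed substitution for one residue is not necessarily listed in the reverse direction.

```python
# A subset of TABLE 1, keyed by the three-letter code of the original residue.
# The Met and Ornithine rows are intentionally left out of this sketch.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Gly", "Ser", "Abu"},
    "Arg": {"Lys", "Orn"},
    "Asn": {"Gln", "His"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Ala", "Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val", "Met", "Nle", "Nva"},
    "Leu": {"Ile", "Val", "Met", "Nle", "Nva"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu", "Met", "Nle", "Nva"},
}


def is_conservative(original: str, substitute: str) -> bool:
    """True if the substitution is listed for the original residue in the subset above."""
    return substitute in CONSERVATIVE_SUBSTITUTIONS.get(original, set())


if __name__ == "__main__":
    print(is_conservative("Glu", "Asp"))  # listed
    print(is_conservative("Glu", "Lys"))  # not listed for Glu
    print(is_conservative("Lys", "Glu"))  # listed for Lys
```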
[0121] As used herein, the amino acids, which occur in the various amino acid sequences appearing herein, are identified according to their well-known, three-letter or one-letter abbreviations. The nucleotides, which occur in the various DNA fragments, are designated with the standard single-letter designations used routinely in the art.
[0122] As used herein, a splice variant refers to a variant produced by differential processing of a primary transcript of genomic DNA that results in more than one type of mRNA.
[0123] As used herein, a probe or primer based on a nucleotide sequence includes at least 10, 14, 16, 30 or 100 contiguous nucleotides from the reference nucleic acid molecule.
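The contiguity requirement above may be checked computationally, as in the following sketch, which finds the longest run of contiguous nucleotides shared by a candidate probe and a reference sequence. The function names and the example sequences are hypothetical, and only the given strand is examined (the reverse complement is not considered).

```python
def longest_shared_run(probe: str, reference: str) -> int:
    """Length of the longest contiguous stretch present in both sequences."""
    best = 0
    previous = [0] * (len(reference) + 1)
    for p in probe:
        current = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if p == r:
                # Extend the run that ended at the preceding positions.
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


def is_based_on(probe: str, reference: str, min_contiguous: int = 10) -> bool:
    """True if the probe contains at least min_contiguous nucleotides of the reference."""
    return longest_shared_run(probe, reference) >= min_contiguous


if __name__ == "__main__":
    reference = "ATGGCGTACGTTAGCCTAGGATCC"        # hypothetical reference sequence
    probe = "TTTT" + reference[5:19] + "AAAA"      # carries 14 contiguous reference nucleotides
    print(is_based_on(probe, reference, min_contiguous=14))
```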
[0124] As used herein, recombinant production by using recombinant DNA methods refers to the use of the well known methods of molecular biology for expressing proteins encoded by cloned DNA.
[0125] As used herein, biological activity refers to the in vivo activities of a compound or physiological responses that result upon in vivo administration of a compound, composition or other mixture. Biological activity, thus, encompasses therapeutic effects and pharmaceutical activity of such compounds, compositions and mixtures. Biological activities may be observed in in vitro systems designed to test or use such activities. Thus, for purposes herein the biological activity of a luciferase is its oxygenase activity whereby, upon oxidation of a substrate, light is produced.
[0126] The meaning of the terms substantially identical or similar varies with the context as understood by those skilled in the relevant art and generally means at least 40, 60, 80, 90, 95 or 98%.
[0127] As used herein, substantially identical to a product means sufficiently similar so that the property is sufficiently unchanged so that the substantially identical product can be used in place of the product.
[0128] As used herein, substantially pure means sufficiently homogeneous to appear free of readily detectable impurities as determined by standard methods of analysis, such as thin layer chromatography (TLC), gel electrophoresis and high performance liquid chromatography (HPLC), used by those of skill in the art to assess such purity, or sufficiently pure such that further purification would not detectably alter the physical and chemical properties, such as enzymatic and biological activities, of the substance. Methods for purification of the compounds to produce substantially chemically pure compounds are known to those of skill in the art. A substantially chemically pure compound may, however, be a mixture of stereoisomers or isomers. In such instances, further purification might increase the specific activity of the compound.
[0129] As used herein, vector (or plasmid) refers to discrete elements that are used to introduce heterologous DNA into cells for either expression or replication thereof. The vectors typically remain episomal, but may be designed to effect integration of a gene or portion thereof into a chromosome of the genome. Also contemplated are vectors that are artificial chromosomes, such as yeast artificial chromosomes and mammalian artificial chromosomes. Selection and use of such vehicles are well known to those of skill in the art. An expression vector includes vectors capable of expressing DNA that is operatively linked with regulatory sequences, such as promoter regions, that are capable of effecting expression of such DNA fragments. Thus, an expression vector refers to a recombinant DNA or RNA construct, such as a plasmid, a phage, recombinant virus or other vector that, upon introduction into an appropriate host cell, results in expression of the cloned DNA. Appropriate expression vectors are well known to those of skill in the art and include those that are replicable in eukaryotic cells and/or prokaryotic cells and those that remain episomal or those which integrate into the host cell genome.
[0130] As used herein, protein-binding-sequence refers to a protein or peptide sequence that is capable of specific binding to other protein or peptide sequences generally, to a set of protein or peptide sequences or to a particular protein or peptide sequence.
[0131] As used herein, a composition refers to any mixture of two or more ingredients. It may be a solution, a suspension, liquid, powder, a paste, aqueous, non-aqueous or any combination thereof.
[0132] As used herein, a combination refers to any association between two or more items.
[0133] As used herein, fluid refers to any composition that can flow. Fluids thus encompass compositions that are in the form of semi-solids, pastes, solutions, aqueous mixtures, gels, lotions, creams and other such compositions.
[0134] As used herein, a cellular extract refers to a preparation or fraction that is made from a lysed or disrupted cell.
[0135] As used herein, the term "subject" refers to animals, plants, insects, and birds and other phyla, genera and species into which nucleic acid molecules may be introduced. Included are higher organisms, such as mammals, fish, insects and birds, including humans, primates, cattle, pigs, rabbits, goats, sheep, mice, rats, guinea pigs, hamsters, cats, dogs, horses, chicken and others.
[0136] As used herein, flow cytometry refers to processes that use a laser-based instrument capable of analyzing and sorting cells and/or chromosomes based on size and fluorescence.
[0137] As used herein, the abbreviations for any protective groups, amino acids and other compounds, are, unless indicated otherwise, in accord with their common usage, recognized abbreviations, or the IUPAC-IUB Commission on Biochemical Nomenclature (see, (1972) Biochem. 11:942-944).
B. Recombination Systems
[0138] Site-specific recombination systems typically contain three elements: a pair of DNA sequences (the site-specific recombination sequences) and a specific enzyme (the site-specific recombinase). The site-specific recombinase catalyzes a recombination reaction between two site-specific recombination sequences.
[0139] A number of different site-specific recombinase systems are available and/or known to those of skill in the art, including, but not limited to: the Cre/lox recombination system using CRE recombinase (see, e.g., SEQ ID Nos. 58 and 59) from the Escherichia coli phage P1 (see, e.g., Sauer (1993) Methods in Enzymology 225:890-900; Sauer et al. (1990) The New Biologist 2:441-449; Sauer (1994) Current Opinion in Biotechnology 5:521-527; Odell et al. (1990) Mol Gen Genet. 223:369-378; Lasko et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:6232-6236; U.S. Pat. No. 5,658,772), the FLP/FRT system of yeast using the FLP recombinase (see, SEQ ID Nos. 60 and 61) from the 2μ episome of Saccharomyces cerevisiae (Cox (1983) Proc. Natl. Acad. Sci. U.S.A. 80:4223; Falco et al. (1982) Cell 29:573-584; Golic et al. (1989) Cell 59:499-509; U.S. Pat. No. 5,744,336), the resolvases, including Gin recombinase of phage Mu (Maeser et al. (1991) Mol Gen Genet. 230:170-176; Klippel, A. et al (1993) EMBO J. 12:1047-1057; see, e.g., SEQ ID Nos. 64-67), Cin, Hin, γδ and Tn3; the Pin recombinase of E. coli (see, e.g., SEQ ID Nos. 68 and 69; Enomoto et al. (1983) J. Bacteriol. 156:663-668), the R/RS system of the pSR1 plasmid of Zygosaccharomyces rouxii (Araki et al. (1992) J. Mol. Biol. 225:25-37; Matsuzaki et al. (1990) J. Bacteriol. 172:610-618) and site-specific recombinases from Kluyveromyces drosophilarum (Chen et al. (1986) Nucleic Acids Res. 14:4471-4481) and Kluyveromyces waltii (Chen et al. (1992) J. Gen. Microbiol. 138:337-345). Other systems are known to those of skill in the art (Stark et al. Trends Genet. 8:432-439; Utatsu et al. (1987) J. Bacteriol. 169:5537-5545; see, also, U.S. Pat. No. 6,171,861).
[0140] Members of the highly related family of site-specific recombinases, the resolvase family, such as γδ, Tn3 resolvase, Hin, Gin, and Cin are also available. Members of this family of recombinases are typically constrained to intramolecular reactions (e.g., inversions and excisions) and can require host-encoded factors. Mutants have been isolated that relieve some of the requirements for host factors (Maeser et al. (1991) Mol. Gen. Genet. 230:170-176), as well as some of the constraints of intramolecular recombination (see, U.S. Pat. No. 6,171,861).
[0141] The bacteriophage P1 Cre/lox and the yeast FLP/FRT systems are particularly useful systems for site-specific integration, inversion or excision of heterologous nucleic acid into, and out of, chromosomes, particularly ACes as provided herein. In these systems a recombinase (Cre or FLP) interacts specifically with its respective site-specific recombination sequence (lox or FRT, respectively) to invert or excise the intervening sequences. The sequence for each of these two systems is relatively short (34 bp for lox and 47 bp for FRT).
[0142] The FLP/FRT recombinase system has been demonstrated to function efficiently in plant cells (U.S. Pat. No. 5,744,386), and, thus, can be used for producing plant artificial chromosome platforms. In general, short incomplete FRT sites lead to higher accumulation of excision products than the complete full-length FRT sites. The system catalyzes intra- and intermolecular reactions, and, thus, can be used for DNA excision and integration reactions. The recombination reaction is reversible and this reversibility can compromise the efficiency of the reaction in each direction. Altering the structure of the site-specific recombination sequences is one approach to remedying this situation. The site-specific recombination sequence can be mutated in such a manner that the product of the recombination reaction is no longer recognized as a substrate for the reverse reaction, thereby stabilizing the integration or excision event.
[0143] In the Cre-lox system, discovered in bacteriophage P1, recombination between loxP sites occurs in the presence of the Cre recombinase (see, e.g., U.S. Pat. No. 5,658,772). This system can be used to insert, invert or excise nucleic acid located between two lox sites. Cre can be expressed from a vector. Since the lox site is an asymmetrical nucleotide sequence, lox sites on the same DNA molecule can have the same or opposite orientation with respect to each other. Recombination between lox sites in the same orientation results in a deletion of the DNA segment located between the two lox sites and a connection between the resulting ends of the original DNA molecule. The deleted DNA segment forms a circular molecule of DNA. The original DNA molecule and the resulting circular molecule each contain a single lox site. Recombination between lox sites in opposite orientations on the same DNA molecule results in an inversion of the nucleotide sequence of the DNA segment located between the two lox sites. In addition, reciprocal exchange of DNA segments proximate to lox sites located on two different DNA molecules can occur. All of these recombination events are catalyzed by the product of the Cre coding region.
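The orientation-dependent outcomes described above (excision for sites in the same orientation, inversion for sites in opposite orientations) may be illustrated in silico by the following minimal sketch. It assumes a linear DNA molecule, given as a single-strand string, that contains exactly two wild-type loxP sites; it models only the sequence of the products, not the enzymatic mechanism, and the function names and flanking sequences are hypothetical.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # 34 bp wild-type loxP site

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_lox_sites(dna: str):
    """Return (start, orientation) for each loxP site, sorted by position."""
    sites = []
    for orientation, motif in ((+1, LOXP), (-1, revcomp(LOXP))):
        start = dna.find(motif)
        while start != -1:
            sites.append((start, orientation))
            start = dna.find(motif, start + 1)
    return sorted(sites)


def cre_recombine(dna: str) -> dict:
    """Predict the product of recombination between two lox sites on one molecule."""
    sites = find_lox_sites(dna)
    if len(sites) != 2:
        raise ValueError("this sketch expects exactly two lox sites")
    (p1, o1), (p2, o2) = sites
    n = len(LOXP)
    if o1 == o2:
        # Same orientation: the intervening segment is deleted as a circle.
        # The linear product and the circle each retain a single lox site.
        return {
            "event": "excision",
            "linear": dna[: p1 + n] + dna[p2 + n:],
            "circle": dna[p1 + n: p2 + n],
        }
    # Opposite orientation: the intervening segment is inverted in place.
    return {
        "event": "inversion",
        "linear": dna[: p1 + n] + revcomp(dna[p1 + n: p2]) + dna[p2:],
    }


if __name__ == "__main__":
    direct = "GGGG" + LOXP + "AAACCC" + LOXP + "TTTT"
    inverted = "GGGG" + LOXP + "AAACCC" + revcomp(LOXP) + "TTTT"
    print(cre_recombine(direct)["event"])
    print(cre_recombine(inverted)["event"])
```

Intermolecular exchange between lox sites on two different molecules is not covered by this sketch.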
[0144] Any site-specific recombinase system known to those of skill in the art is contemplated for use herein. It is contemplated that one or a plurality of sites that direct the recombination by the recombinase are introduced into an artificial chromosome to produce platform ACes. The resulting platform ACes are introduced into cells with nucleic acid encoding the cognate recombinase, typically on a vector, and nucleic acid encoding heterologous nucleic acid of interest linked to the appropriate recombination site for insertion into the platform ACes. The recombinase-encoding-nucleic acid may be introduced into the cells on the same vector, or a different vector, encoding the heterologous nucleic acid.
[0145] An E. coli phage lambda integrase system for ACes platform engineering and for artificial chromosome engineering is provided (Lorbach et al. (2000) J. Mol. Biol. 296:1175-1181). The phage lambda integrase (Landy, A. (1989) Annu. Rev. Biochem. 58:913-94) is adapted herein and the cognate att sites are provided. Chromosomes, including ACes, engineered to contain one or a plurality of att sites are provided, as are vectors encoding a mutant integrase that functions in the absence of other factors. Methods using the modified chromosomes and vectors for introduction of heterologous nucleic acid are also provided.
[0146] For purposes herein, one or more of the sites (e.g., a single site or a pair of sites) required for recombination are introduced into an artificial chromosome, such as an ACes chromosome. The enzyme for catalyzing site-directed recombination is introduced with the DNA of interest, or separately, or is engineered onto the artificial chromosome under the control of a regulatable promoter.
[0147] As described herein, artificial chromosome platforms containing one or multiple recombination sites are provided. The methods and resulting products are exemplified with the lambda phage Att/Int system, but similar methods may be used for production of ACes platforms with other recombination systems.
[0148] The Att/Int system and vectors provided herein are not only intended for engineering ACes platforms, but may be used to engineer an Att/Int system into any chromosome. Introduction of att sites into a chromosome will permit engineering of natural chromosomes, such as by permitting targeted integration of genes or regulatory regions, and by controlled excision of selected regions. For example, genes encoding a particular trait may be added to a chromosome, such as a plant chromosome engineered to contain one or a plurality of att sites. Such chromosomes may be used for screening DNA to identify genes. Large pieces of DNA can be introduced into cells and the cells screened phenotypically to select those having the desired trait.
C. Platforms
[0149] Provided herein are platform artificial chromosomes (platform ACes) containing single or multiple site-specific recombination sites. Chromosome-based platform technology permits efficient and tractable engineering and subsequent expression of multiple gene targets. Methods are provided that use DNA vectors and fragments to create platform artificial chromosomes, including animal, particularly mammalian, artificial chromosomes, and plant artificial chromosomes. The artificial chromosomes contain either single or multiple sequence-specific recombination sites suitable for the placement of target gene expression vectors onto the platform chromosome. The engineered chromosome-based platform ACes technology is applicable for methods, including cellular and transgenic protein production, transgenic plant and animal production and gene therapy. The platform ACes are also useful for producing a library of ACes comprising random portions of a given genome (e.g., a mammalian, plant or prokaryotic genome) for genomic screening; as well as a library of cells comprising different and/or mutually exclusive ACes therein.
[0150] Exemplary of artificial chromosome platforms are those based on ACes. ACes artificial chromosomes are non-viral, self-replicating nucleic acid molecules that function as a natural chromosome, having all the elements required for normal chromosomal replication and maintenance within the cell nucleus. ACes artificial chromosomes do not rely on integration into the genome of the cell to be effective, and they are not limited by DNA carrying capacity; as such, the therapeutic gene(s) of interest, including regulatory sequences, can be engineered into the ACes. In addition, ACes are stable in vitro and in vivo and can provide predictable long-term gene expression. Once engineered and delivered to the appropriate cell or embryo, ACes work independently alongside host chromosomes; ACes that are predominantly heterochromatin produce only the products (proteins) of the genes they carry. As provided herein, ACes are modified by introduction of recombination site(s) to provide a platform for ready introduction of heterologous nucleic acid. The ACes platforms can be used for production of transgenic animals and plants; as vectors for genetic therapy; for use as protein production systems; for animal models to identify and target new therapeutics; in cell culture for the development and production of therapeutic proteins; and for a variety of other applications.
[0151] 1. Generation of Artificial Chromosomes
[0152] Artificial chromosomes may be generated by any method known to those of skill in the art. Of particular interest herein are the ACes artificial chromosomes, which contain a repeated unit. Methods for production of ACes are described in detail in U.S. Pat. Nos. 6,025,155 and 6,077,697, which, as with all patents, applications, publications and other disclosure, are incorporated herein in their entirety.
[0153] Generation of De Novo ACes.
[0154] ACes can be generated by cotransfecting exogenous DNA--such as a mammary tissue specific DNA cassette including the gene sequences for a therapeutic protein--with an rDNA fragment and a drug resistance marker gene into the desired eukaryotic cell, such as plant or animal cells, such as murine cells in vitro. DNA with a selectable or detectable marker is introduced, and can be allowed to integrate randomly into pericentric heterochromatin or can be targeted to pericentric heterochromatin, such as that in rDNA gene arrays that reside on acrocentric chromosomes, such as the short arms of acrocentric chromosomes. This integration event activates the "megareplicator" sequence and amplifies the pericentric heterochromatin and the exogenous DNA, and duplicates a centromere. Ensuing breakage of this "dicentric" chromosome can result in the production of daughter cells that contain the substantially-original chromosome and the new artificial chromosome. The resulting ACes contain all the essential elements needed for stability and replication in dividing cells--centromere, origins of replication, and telomeres. ACes have been produced that express marker genes (lacZ, green fluorescent protein, neomycin-resistance, puromycin-resistance, hygromycin-resistance) and genes of interest. Isolated ACes, for example, have been successfully transferred intact to rodent, human, and bovine cells by electroporation, sonoporation, microinjection, and transfection with lipids and dendrimers.
[0155] To render the creation of ACes with desired genes more tractable and efficient, "platform" ACes (platform-ACes) can be produced that contain defined DNA sequences for enzyme-mediated homologous DNA recombination, such as by Cre or FLP recombinases (Bouhassira et al. (1996) Blood 88(supplement 1):190a; Bouhassira et al. (1997) Blood 90:3332-3344; Siebler et al. (1997) Biochemistry 36:1740-1747; Siebler et al. (1998) Biochemistry 37:6229-6234; and Bethke et al. (1997) Nucl. Acids Res. 25:2828-2834), and, as exemplified herein, the lambda phage integrase. A lox site contains two 13 bp inverted repeats, to which Cre-recombinase binds, and an intervening 8 bp core region. Only pairs of sites having identity in the central 6 bp of the core region are proficient for recombination; sites having non-identical core sequences (heterospecific lox sites) do not efficiently recombine with each other (Hoess et al. (1986) Nucleic Acids Res. 14:2287-2300).
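By way of illustration only, the lox site layout and the compatibility rule described above can be sketched in a few lines of Python. The sketch is not part of the disclosed methods. It assumes a 34 bp site arranged as a 13 bp repeat, an 8 bp core and a 13 bp repeat, uses the widely published loxP sequence as sample input, and applies the stated rule (identity in the central 6 bp of the core) literally. The function names and the single-base variant site are hypothetical.

```python
# Widely published 34 bp loxP sequence: 13 bp repeat + 8 bp core + 13 bp repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def split_lox(site):
    """Split a 34 bp lox site into (left repeat, core, right repeat)."""
    site = site.upper()
    if len(site) != 34:
        raise ValueError("expected a 34 bp lox site")
    return site[:13], site[13:21], site[21:]


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def arms_are_inverted_repeats(site):
    """True if the two 13 bp arms are inverted repeats of each other."""
    left, _, right = split_lox(site)
    return left == revcomp(right)


def can_recombine(site_a, site_b):
    """Apply the rule from the text: the central 6 bp of the two cores must match."""
    core_a = split_lox(site_a)[1]
    core_b = split_lox(site_b)[1]
    return core_a[1:7] == core_b[1:7]


# Hypothetical heterospecific site: one base changed inside the central 6 bp.
variant = LOXP[:17] + "G" + LOXP[18:]

print(arms_are_inverted_repeats(LOXP))  # arms of loxP mirror each other
print(can_recombine(LOXP, LOXP))        # identical cores
print(can_recombine(LOXP, variant))     # non-identical cores
```

The check is purely a string comparison; it says nothing about recombination efficiency, which must be determined experimentally.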
[0156] Generating Acrocentric Chromosomes for Plant Artificial Chromosome Formation.
[0157] In human and mouse cells de novo formation of a satellite DNA based artificial chromosome (SATAC, also referred to as ACes) can occur in an acrocentric chromosome where the short arm contains only pericentric heterochromatin, the rDNA array, and telomere sequences. Plant species may not have any acrocentric chromosomes with the same physical structure described, but "megareplicator" DNA sequences reside in the plant rDNA arrays, also known as the nucleolar organizing regions (NOR). A structure like those seen in acrocentric mammalian chromosomes can be generated using site-specific recombination between appropriate arms of plant chromosomes.
[0158] Approach
[0159] Qin et al. ((1994) Proc. Natl. Acad. Sci. U.S.A. 91:1706-1710) describe crossing two Nicotiana tabacum transgenic plants. One plant contains a construct encoding a promoterless hygromycin-resistance gene preceded by a lox site (lox-hpt); the other plant carries a construct containing a cauliflower mosaic virus 35S promoter linked to a lox sequence and the cre DNA recombinase coding region (35S-lox-cre). The constructs were introduced separately by infecting leaf explants with Agrobacterium tumefaciens, which carries the kanamycin-resistance gene (KanR). The resultant KanR transgenic plants were crossed. Plants that carried the appropriate DNA recombination event were identified by hygromycin-resistance.
[0160] Modification of the Above for Generation of ACes
[0161] The KanR cultivars are initially screened, such as by FISH, to identify two sets of candidate transgenic plants. One set has one construct integrated in regions adjacent to the pericentric heterochromatin on the short arm of any chromosome. The second set of candidate plants has the other construct integrated in the NOR region of appropriate chromosomes. To obtain reciprocal translocation, both sites must be in the same orientation. Therefore, a series of crosses is required, KanR plants are generated, and FISH analyses are performed to identify the appropriate "acrocentric" plant chromosome for de novo plant ACes formation.
[0162] 2. Bacteriophage Lambda Integrase-Based Site-Specific Recombination System
[0163] An integral part of the platform technology includes a site-specific recombination system that allows the placement of selected gene targets or genomic fragments onto the platform chromosomes. Any such system may be used. In particular, a method is provided for insertion of additional DNA fragments into the platform chromosome residing in the cell via sequence-specific recombination using the recombinase activity of the bacteriophage lambda integrase. The lambda integrase system is exemplary of the recombination systems contemplated for ACes. Any known recombination system, including any described herein, particularly any that operates without the need for additional factors or that, by virtue of mutation, does not require additional factors, is contemplated.
[0164] As noted the lambda integrase system provided herein can be used with natural chromosomes and artificial chromosomes in addition to ACes. Single or a plurality of recombination sites, which may be the same or different, are introduced into artificial chromosomes to produce artificial chromosome platforms.
[0165] 3. Creation of Bacteriophage Lambda Integrase Site-Specific Recombination System
[0166] The lambda phage-encoded integrase (designated Int) is a prototypical member of the integrase family. Int effects integration and excision of the phage in and out of the E. coli genome via recombination between pairs of attachment sites designated attB/attP and attL/attR. Each att site contains two inverted 9 base pair core Int binding sites and a 7 base pair overlap region that is identical in wild-type att sites. Each site, except for attB, contains additional Int binding sites. In flanking regions, there are recognition sequences for accessory DNA binding proteins, such as integration host factor (IHF), factor for inversion stimulation (FIS) and the phage encoded excision protein (XIS). Int is a heterobivalent DNA-binding protein and, except at attB, binds simultaneously to core and arm sites within the same att site with assistance from the accessory proteins and negative DNA supercoiling.
[0167] Int, like Cre and FLP, executes an ordered sequential pair of strand exchanges during integrative and excisive recombination. The natural pairs of target sequences for Int, attB and attP or attL and attR, are located on the same or different DNA molecules, resulting in intra- or intermolecular recombination, respectively. For example, intramolecular recombination occurs between inversely oriented attB and attP sequences, or between inversely oriented attL and attR sequences, leading to inversion of the intervening DNA segment.
[0168] Like other recombinase systems, such as Cre and FLP, Int directs site-specific recombination. Unlike the other systems, such as Cre and FLP, Int generally requires additional protein factors for integrative and excisive recombination and negative supercoiling for integrative recombination. Hence, the Int system had not been used in eukaryotic targeting systems. Mutant Int proteins, designated Int-h (E174K) and a derivative thereof, Int-h/218 (E174K/E218K), do not require accessory proteins to perform intramolecular integrative and excisive recombination in co-transfection assays in human cells (Lorbach et al. (2000) J. Mol. Biol. 296:1175-1181); wild-type Int does not catalyze intramolecular recombination in human cells harboring target sites attB and attP. Hence it had been demonstrated that mutant Int can catalyze factor-independent recombination events in human cells.
[0169] There has been no demonstration by others that this system can be used for engineering of eukaryotic genomes or chromosomes. Provided herein are chromosomes, including artificial chromosomes, such as but not limited to ACes that contain att sites (e.g., platform ACes), and the use of such chromosomes for targeted integration of heterologous DNA into such chromosomes in eukaryotic cells, including animal, such as rodent and human, and plant cells. Mutant Int provided herein is shown to effect site-directed recombination between sites in artificial chromosomes and vectors containing cognate sites.
[0170] An additional component of the chromosome-based platform technology is the site-specific integration of target DNA sequences onto the platform. For this, the native bacteriophage lambda integrase has been modified to carry out this sequence specific DNA recombination event in eukaryotic cells. The bacteriophage lambda integrase and its cognate DNA substrate att are members of the site-specific recombinase family that also includes the bacteriophage P1 Cre/lox system as well as the Saccharomyces cerevisiae 2 micron based FLP/FRT system (see, e.g., Landy (1989) Ann. Rev. Biochem 58:913-949; Hoess et al. (1982) Proc. Natl. Acad. Sci. U.S.A. 79:3398-3402; Broach et al. (1982) Cell 29:227-234).
[0171] By combining DNA endonuclease and DNA ligase activity, these recombinases recognize and catalyze DNA exchanges between sequences flanking the recognition site. During the integration of the lambda genome into the E. coli genome (lambda recombination), the phage integrase (INT) in association with accessory proteins catalyzes the DNA exchange between the attP site of the phage genome and the attB site of the bacterial genome, resulting in the formation of attL and attR sites (FIG. 6). The engineered bacteriophage lambda integrase has been produced herein to carry out an intermolecular DNA recombination event between an incoming DNA molecule (primarily on a vector containing the bacterial attB site) and the chromosome-based platform carrying the lambda attP sequence, independent of lambda bacteriophage or bacterial accessory proteins.
[0172] In contrast to the bi-directional Cre/lox and FLP/FRT systems, the engineered lambda recombination system derived for chromosome-based platform technology is advantageously unidirectional because accessory proteins, which are absent, are required for excision of integrated nucleic acid upon further exposure to the lambda Int recombinase.
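By way of illustration only, the bookkeeping of the integration reaction described above can be sketched in Python. The sketch is not part of the disclosed methods and models no biochemistry. It assumes the conventional shorthand in which attP is written P-O-P' and attB is written B-O-B' (O being the common core), so that the hybrid sites are attL (B-O-P') and attR (P-O-B'). The class, function and element names are hypothetical, and the vector is treated as a circle opened at its attB site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttSite:
    """An att site written as left arm, common core O, right arm."""
    left: str
    right: str

    @property
    def name(self):
        names = {
            ("P", "P'"): "attP",
            ("B", "B'"): "attB",
            ("B", "P'"): "attL",
            ("P", "B'"): "attR",
        }
        return names[(self.left, self.right)]


ATTP = AttSite("P", "P'")
ATTB = AttSite("B", "B'")


def integrate(chromosome, vector):
    """Insert a circular attB vector at the first attP site of a linear chromosome.

    Both arguments are lists of element labels and AttSite objects.
    Raises ValueError if either molecule lacks its site.
    """
    i = chromosome.index(ATTP)
    j = vector.index(ATTB)
    body = vector[j + 1:] + vector[:j]            # circle opened at attB
    upstream = AttSite(ATTP.left, ATTB.right)     # P-O-B' (attR)
    downstream = AttSite(ATTB.left, ATTP.right)   # B-O-P' (attL)
    return chromosome[:i] + [upstream] + body + [downstream] + chromosome[i + 1:]


def labels(molecule):
    """Render a molecule as a list of plain strings."""
    return [x.name if isinstance(x, AttSite) else x for x in molecule]


platform = ["promoter", ATTP, "rDNA"]
vector = [ATTB, "marker2", "target_gene"]
product = integrate(platform, vector)
print(labels(product))
```

In this model the product carries only attL and attR, so a further call to `integrate` on the same single-site product finds no attP and raises `ValueError`, which mirrors the unidirectional behavior described for the engineered system in the absence of accessory proteins.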
[0173] 4. Creation of Platform Chromosome Containing Single or Multiple Sequence-Specific Recombination Sites
[0174] a. Multiple Sites
[0175] For the creation of a platform chromosome containing multiple, sequence-specific recombination sites, artificial chromosomes are produced as depicted in FIG. 5 and Example 3. As discussed above, artificial chromosomes can be produced using any suitable methodology, including those described in U.S. Pat. Nos. 5,288,625; 5,712,134; 5,891,691; 6,025,155. Briefly, to prepare artificial chromosomes containing multiple recombination (e.g., integration) sites, nucleic acid (in the form of one or more plasmids, such as the plasmid pSV40193attPsensePUR set forth in Example 3) is targeted into an amplifiable region of a chromosome, such as the pericentric region of a chromosome. Among such regions are the rDNA gene loci in acrocentric mammalian chromosomes. Hence, targeting nucleic acid for integration into the rDNA region of mammalian acrocentric chromosomes can include the mouse rDNA fragments (for targeting into rodent cell lines) or large human rDNA regions on BAC/PAC vectors (or subclones thereof in standard vectors) for targeting into human acrocentric chromosomes, such as for human gene therapy applications. The targeting nucleic acid generally includes a detectable or selectable marker, such as antibiotic resistance, such as puromycin and hygromycin, a recombination site (such as attP, attB, attL, attR or the like), and/or human selectable markers as required for gene therapy applications. Cells are grown under conditions that result in amplification and ultimately production of ACes artificial chromosomes having multiple recombination (e.g., integration) sites therein. ACes having the desired size are selected for further engineering.
[0176] b. Creation of Platform Chromosome Containing a Single Sequence-Specific Recombination Site
[0177] In this method a mammalian platform artificial chromosome is generated containing a single sequence-specific recombination site. In the Example below, this approach is demonstrated using a puromycin resistance marker for selection and a mouse rDNA fragment for targeting into the rDNA locus on mouse acrocentric chromosomes. Other selection markers and targeting DNA sequences as desired and known to those of skill in the art can be used. Additional resistance markers include genes conferring resistance to the antibiotics neomycin, blasticidin, hygromycin and zeocin. For applications, such as gene therapy in which potentially immunogenic responses are to be avoided, host, such as human, derived selectable markers or markers detectable with monoclonal antibodies (MAb) followed by fluorescent activated cell sorting (FACS) can be used. Examples in this class include, but are not limited to: human nerve growth factor receptor (detection with MAb); truncated human growth factor receptor (detection with MAb); mutant human dihydrofolate reductase (DHFR; detectable using a fluorescent methotrexate substrate); secreted alkaline phosphatase (SEAP; detectable with fluorescent substrate); thymidylate synthase (TS; confers resistance to fluorodeoxyuridine); human CAD gene (confers resistance to N-phosphonacetyl-L-aspartate (PALA)).
[0178] To construct a platform artificial chromosome with a single site, an ACes artificial chromosome (or other artificial chromosome of interest) can be produced containing a selectable marker. A single sequence specific recombination site is targeted onto ACes via homologous recombination. For this, DNA sequences containing the site-specific recombination sequence are flanked with DNA sequences homologous to a selected sequence in the chromosome. For example, when using a chromosome containing rDNA or satellite DNA, such DNA can be used as homologous sequences to target the site-specific recombination sequence onto the chromosome. A vector is designed to have these homologous sequences flanking the site-specific recombination site and, after the appropriate restriction enzyme digest to generate free ends of homology to the chromosome, the DNA is transfected into cells harboring the chromosome. After transfection and integration of the site-specific cassette, homologous recombination events onto the platform chromosome are subcloned and identified, for example by screening single cell subclones via expression of resistance or a fluorescent marker and PCR analysis. In one embodiment, a platform artificial chromosome, such as a platform ACes, that contains a single copy of the recombination site is selected. Examples 2B and 2D exemplify the process, and FIG. 3 provides a diagram depicting one method for the creation of a platform mammalian chromosome containing a single sequence-specific recombination site.
[0179] 5. Lambda Integrase Mediated Recombination of Target Gene Expression Vector onto Platform Chromosome
[0180] The third component of the chromosome-based platform technology involves the use of target gene expression vectors carrying, for example, genes for gene therapy, genes for transgenic animal or plant production, and those required for cellular protein production of interest. Using lambda integrase mediated site-specific recombination, or any other recombinase-mediated site-specific recombination, the target gene expression vectors are introduced onto the selected chromosome platform. The use of target gene expression vectors permits use of the de novo generated chromosome-based platforms for a wide range of gene targets. Furthermore, chromosome platforms containing multiple attP sites provide the opportunity to incorporate multiple gene targets onto a single platform, thereby providing for expression of multiple gene targets, including the expression of cellular and genetic regulatory genes and the expression of all or parts of metabolic pathways. In addition to expressing small target genes, such as cDNA and hybrid cDNA/artificial intron constructs, the chromosome-based platform can be used for engineering and expressing large genomic fragments carrying target genes along with their endogenous genomic promoter sequences. This is of importance, for example, where the therapy requires precise cell specific expression and in instances where expression is best achieved from genomic clones rather than cDNA clones. FIG. 9 provides a diagram summarizing one embodiment of the chromosome-based technology.
[0181] A feature of interest to include in the target gene expression vector is a promoterless marker gene, which as exemplified (see, FIG. 9) contains an upstream attB site (marker 2 on FIG. 9). The nucleic acid encoding the marker is not expressed unless it is placed downstream from a promoter sequence. Using the recombinase technology provided herein, such as the lambda integrase technology (λINT.sub.E174R on FIG. 8) provided herein, site-specific recombination between the attB site on the vector and the promoter-attP site (in the "sense" orientation) on the chromosome-based platform results in the expression of marker 2 on the target gene expression vector, thereby providing a positive selection for the lambda INT mediated site-specific recombination event. Site-specific recombination events on the chromosome-based platform versus random integrations next to a promoter in the genome (false positive) can be quickly screened by designing primers to detect the correct event by PCR. Examples of suitable marker 2 genes include, but are not limited to, genes that confer resistance to toxic compounds or antibiotics, fluorescence activated cell sorting (FACS) sortable cell surface markers and various fluorescent markers. Examples of these genes include, but are not limited to, human L26aR (human homolog of Saccharomyces cerevisiae CYH8 gene), neomycin, puromycin, blasticidin, CD24 (see, e.g., U.S. Pat. Nos. 5,804,177 and 6,074,836), truncated CD4, truncated low affinity nerve growth factor receptor (LNGFR), truncated LDL receptor, truncated human growth hormone receptor, GFP, RFP, BFP.
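By way of illustration only, the PCR screen mentioned above can be sketched as a simple in-silico check in Python: a forward primer placed in the platform promoter and a reverse primer placed in marker 2 yield a product only when the marker sits downstream of the platform promoter. The sketch is not part of the disclosed methods. All sequences, primer positions and names below are hypothetical placeholders, and exact string matching stands in for primer annealing.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template, fwd, rev):
    """Length of the product primed by fwd and rev on the top strand, or None.

    fwd must match the top strand; rev must match the bottom strand
    downstream of fwd. Only the first match of each primer is considered.
    """
    start = template.find(fwd)
    if start == -1:
        return None
    rev_site = revcomp(rev)
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return end + len(rev_site) - start


# Hypothetical placeholder sequences (not taken from the specification).
PLATFORM_PROMOTER = "TTGACAGCTAGCTCAGTCCTAGG"
JUNCTION = "ACGTTGCA"                      # stands in for the hybrid att site
MARKER2 = "ATGACCGAGTACAAGCCCACG"
GENOMIC_PROMOTER = "CCATGGTTAACCGGTTAAGCTT"

fwd_primer = PLATFORM_PROMOTER[:12]        # anneals in the platform promoter
rev_primer = revcomp(MARKER2[-12:])        # anneals at the end of marker 2

correct_event = PLATFORM_PROMOTER + JUNCTION + MARKER2
random_integration = GENOMIC_PROMOTER + MARKER2

print(amplicon_length(correct_event, fwd_primer, rev_primer))
print(amplicon_length(random_integration, fwd_primer, rev_primer))
```

A product is predicted only for the site-specific event, because the random integration template lacks the forward primer site. In practice, primer design would also account for melting temperature, specificity and expected product size.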
[0182] The target gene expression vectors contain a gene (target gene) for expression from the chromosome platform. The target gene can be expressed using various constitutive or regulated promoter systems across various mammalian species. For the expression of multiple target genes within the same target gene expression vector, the expression of the multiple targets can be coordinately regulated via viral-based or human internal ribosome entry site (IRES) elements (see, e.g., Jackson et al. (1990) Trends Biochem Sci. 15: 477-83; Oumard et al. (2000) Mol. Cell. Biol. 20: 2755-2759). Furthermore, using IRES type elements linked to a downstream fluorescent marker, e.g., green, red or blue fluorescent proteins (GFP, RFP, BFP) allows for the identification of high expressing clones from the integrated target gene expression vector.
[0183] In certain embodiments described herein, the promoterless marker can be transcriptionally downstream of the heterologous nucleic acid, wherein the heterologous nucleic acid encodes a heterologous protein, and wherein the expression level of the selectable marker is transcriptionally linked to the expression level of the heterologous protein. In addition, the selectable marker and the heterologous nucleic acid can be transcriptionally linked by the presence of an IRES between them. As set forth herein, the selectable marker is selected from the group consisting of an antibiotic resistance gene, and a detectable protein, wherein the detectable protein is chromogenic or fluorescent. Expression from the target gene expression vector integrated onto the chromosome-based platform can be further enhanced using genomic insulator/boundary elements. The incorporation of insulator sequences into the target gene expression vector helps define boundaries in chromatin structure and thus minimizes influence of chromatin position effects/gene silencing on the expression of the target gene (Bell et al. (1999) Current Opinion in Genetics and Development 9:191-198; Emery et al. (2000) Proc. Natl. Acad. Sci. U.S.A. 97:9150-9155). Examples of insulator elements that can be included onto target gene expression vector in order to optimize expression include, but are not limited to:
[0184] 1) chicken β-globin HS4 element (Prioleau et al. (1999) EMBO J. 18: 4035-4048);
[0185] 2) matrix attachment regions (MAR; see, e.g., Ramakrishnan et al. (2000) Mol. Cell. Biol. 20:868-877);
[0186] 3) scaffold attachment regions (SAR; see, e.g., Auten et al. (1999) Human Gene Therapy 10:1389-1399); and
[0187] 4) universal chromatin opening elements (UCOE; WO/0005393 and WO/0224930).
[0188] The copy number of the target gene can be controlled by sequentially adding multiple target gene expression vectors containing the target gene onto multiple integration sites on the chromosome platform. Likewise, the copy number of the target gene can be controlled within an individual target gene expression vector by the addition of DNA sequences that promote gene amplification. For example, gene amplification can be induced utilizing the dihydrofolate reductase (DHFR) minigene with subsequent selection with methotrexate (see, e.g., Schimke (1984) Cell 37:705-713) or amplification promoting sequences from the rDNA locus (see, e.g., Wegner et al. (1989) Nucl. Acids Res. 17: 9909-9932).
[0189] 6. Platforms with Other Recombinase System Sites
[0190] A "double lox" targeting strategy mediated by Cre-recombinase (Bethke et al. (1997) Nucl. Acids Res. 25:2828-2834) can be used. This strategy employs a pair of heterospecific lox sites--loxA and loxB, which differ by one nucleotide in the 8 bp spacer region. Both sites are engineered into the artificial chromosome and also onto the targeting DNA vector. This allows for a direct site-specific insertion of a commercially relevant gene or genes by a Cre-catalyzed double crossover event. In essence, a platform ACes is engineered with a hygromycin-resistance gene flanked by the double lox sites, generating lox-ACes, which is maintained in the thymidine kinase deficient cell, LMtk(-). The gene of interest (for testing purposes, for example, the green fluorescent protein gene, GFP) and an HSV thymidine kinase gene (tk) marker are engineered between the appropriate lox sites of the targeting vector. The vector DNA is cotransfected with plasmid pBS185 (Life Technologies) encoding the Cre recombinase gene into mammalian cells maintaining the dual-lox artificial chromosome. Transient expression of the Cre recombinase catalyzes the site-specific insertion of the gene and the tk-gene onto the artificial chromosome. The transfected cells are grown in HAT medium that selects for only those cells that have integrated and expressed the thymidine kinase gene. The HATR colonies are screened by PCR analyses to identify artificial chromosomes with the desired insertion.
[0191] To generate the lox-ACes, Lambda-HygR-lox DNA is transfected into the LMtk(-) cell line harboring the precursor ACes. Hygromycin-resistant colonies are analyzed by FISH and Southern blotting for the presence of a single copy insert on the ACes.
[0192] To demonstrate the gene replacement technology, cell lines containing candidate lox ACes are cotransfected with pTK-GFP-lox and pBS185 (encoding the Cre recombinase gene) DNA. After transfection, transient expression of plasmid pBS185 will provide a sufficient burst of Cre recombinase activity to catalyze DNA recombination at the lox sites. Thus, a double crossover event between the ACes target and the exogenous targeting plasmid carrying the loxA and loxB sites permits the simple replacement of the hygromycin-resistance gene on the lox-ACes with the tk-GFP cassette from the targeting plasmid, with no integration of vector DNA. Transfected cells are grown in HAT-media to select for tk-expression. Correct targeting will result in the generation of HATR, hygromycin sensitive, and green fluorescent cells. The desired integration event is verified by Southern and PCR analyses. Specific PCR primer sets are used to amplify DNA sequences flanking the individual loxA and loxB sites on the lox-ACes before and after homologous recombination.
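By way of illustration only, the outcome of the double crossover described above can be sketched in Python as a list operation: the segment between loxA and loxB on the lox-ACes is replaced by the segment between the matching sites on the targeting plasmid, and nothing outside the sites is transferred. The sketch is not part of the disclosed methods. The element labels and the function name are hypothetical, and the sketch assumes that each molecule carries exactly one loxA and one loxB in the same order.

```python
def cassette_exchange(target, donor, site_a="loxA", site_b="loxB"):
    """Replace the target segment between site_a and site_b with the donor's.

    target and donor are lists of element labels. Elements outside the two
    sites on the donor (the vector backbone) are not transferred.
    """
    ta, tb = target.index(site_a), target.index(site_b)
    da, db = donor.index(site_a), donor.index(site_b)
    if not (ta < tb and da < db):
        raise ValueError("loxA must precede loxB on both molecules")
    return target[:ta + 1] + donor[da + 1:db] + target[tb:]


lox_aces = ["ACes_left", "loxA", "hygR", "loxB", "ACes_right"]
targeting_plasmid = ["backbone_5", "loxA", "tk", "GFP", "loxB", "backbone_3"]

exchanged = cassette_exchange(lox_aces, targeting_plasmid)
print(exchanged)

# Expected phenotype of a correctly targeted cell, read off the element list.
phenotype = {
    "HAT_resistant": "tk" in exchanged,
    "hygromycin_sensitive": "hygR" not in exchanged,
    "green_fluorescent": "GFP" in exchanged,
}
print(phenotype)
```

The three phenotype flags correspond to the selection and screening criteria given in the text: growth in HAT medium, loss of hygromycin resistance and green fluorescence.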
D. Exemplary Applications of the Platform ACes
[0193] Platform ACes are applicable and tractable for different/optimized cell lines. Those that include a fluorescent marker, for example, can be purified and isolated using fluorescence activated cell sorting (FACS), and subsequently delivered to a target cell. Those with selectable markers provide for efficient selection and provide a growth advantage. Platform ACes allow multiple payload delivery of donor target vectors via a positive-selection, site-specific recombination system, and they allow for the inclusion of additional genetic factors that improve protein production and protein quality.
[0194] The construction and use of the platform ACes as provided for each application may be similarly applied to other applications. Particular descriptions are for exemplification.
[0195] 1. Cellular Protein Production Platform ACes (CPP ACes)
[0196] As described herein, ACes can be produced from acrocentric chromosomes in rodent (mouse, hamster) cell lines via megareplicator induced amplification of heterochromatin/rDNA sequences. Such ACes are ideal for cellular protein production as well as other applications described herein and known to those of skill in the art. ACes platforms that contain a plurality of recombination sites are particularly suitable for engineering as cellular protein production systems.
[0197] In one embodiment, CPP ACes involve a two-component system: the platform chromosome containing multiple engineering sites and the donor target vector containing a platform-specific recombination site with designed expression cassettes (see FIG. 9).
[0198] The platform ACes can be produced from any artificial chromosome, particularly the amplification-based artificial chromosomes. For exemplification, they are produced from rodent artificial chromosomes produced from acrocentric chromosomes using the technology of U.S. Pat. Nos. 6,077,697 and 6,025,155 and published International PCT application No. WO 97/40183, in which nucleic acid is targeted to the pericentric heterochromatin, and particularly into rDNA, to initiate the replication event(s). The ACes can be produced directly in the chosen cellular protein production cell lines, such as, but not limited to, CHO cells, hybridomas, plant cells, plant tissues, plant protoplasts, stem cells and plant calli.
[0199] a. Platform Construction
[0200] In the exemplary embodiment, the initial de novo platform construction requires co-transfecting with excess targeting DNA, such as rDNA or lambda DNA without an attP region, and an engineered selectable marker. The engineered selectable marker should contain a promoter, generally a constitutive promoter, such as a human or viral (i.e., adenovirus or SV40) promoter, including the human ferritin heavy chain promoter (SEQ ID NO:128), SV40 and EF1α promoters, to control expression of a marker gene that provides a selective growth advantage to the cell. An example of such a marker gene is the E. coli hisD gene (encoding histidinol dehydrogenase), which is homologous and analogous to the S. typhimurium hisD, a dominant marker selection system for mammalian cells previously described (see, Hartman et al. (1988) Proc. Natl. Acad. Sci. U.S.A. 85:8047-8051). Since histidine is an essential amino acid in mammals and a nutritional requirement in cell culture, the E. coli hisD gene can be used to select for histidine prototrophy in defined media. Furthermore, more stringent selection can be placed on the cells by including histidinol in the medium. Histidinol is itself permeable and toxic to cells. The hisD provides a means of detoxification.
[0201] Placed between the promoter and the marker gene is the bacteriophage lambda attP site to use the bacteriophage lambda integrase dependent site-specific recombination system (described herein). The insertion of an attP site downstream of a promoter element provides forward selection of site-specific recombination events onto the platform ACes.
[0202] b. Donor Target Vector Construction
[0203] A second component of the CPP platform ACes system involves the construction of donor target vectors containing a gene product(s) of interest for the CPP platform ACes. Individual donor target vectors can be designed for each gene product to be expressed thus enabling maximum usage of a de novo constructed platform ACes, so that one or a few CPP platform ACes will be required for many gene targets.
[0204] A key feature of the donor target vector is the promoterless marker gene containing an upstream attB site (marker 2 on FIG. 9). Normally the marker would not be expressed unless it is placed downstream of a promoter sequence. As discussed above, using the lambda integrase technology (λINT.sub.E174R on FIG. 8 and FIG. 9), site-specific recombination between the attB site on the vector and the promoter-attP site on the CPP platform ACes results in the expression of the donor target vector marker, providing positive selection for the site-specific event. Site-specific recombination events on the CPP ACes versus random integrations next to a promoter in the genome (false positive) can be quickly screened by designing primers to detect the correct event by PCR. In addition, since the lambda integrase reaction is unidirectional, i.e., the excision reaction is not possible, a number of unique targets can be loaded onto the CPP platform ACes, limited only by the number of markers available.
[0205] Additional features of the donor target vector include gene target expression cassettes flanked by either chromatin insulator regions, matrix attachment regions (MAR) or scaffold attachment regions (SAR). The use of these regions will provide a more "open" chromatin environment for gene expression and help alleviate silencing. An example of such a cassette for expressing a monoclonal antibody is described. For this purpose, a strong constitutive promoter, e.g. chicken β-actin or RNA PolI, is used to drive the expression of the heavy and light chain open reading frames. The heavy and light chain sequences flank a nonattenuated human IRES (IRESH; from the 5'UTR of NRF1 gene; see Oumard et al., 2000, Mol. and Cell Biol., 20(8):2755-2759) element, thereby coordinating transcription of both heavy and light chain sequences. Distal to the light chain open reading frame resides an additional viral encoded IRES (IRESV; modified EMCV internal ribosomal entry site (IRES)) element attenuating the expression of the fluorescent marker gene hrGFP from Renilla (Stratagene). By linking the hrGFP with an attenuated IRES, the heavy and light chains along with the hrGFP are monocistronic. Thus, the identification of hrGFP fluorescing cells will provide a means to detect protein producing cells. In addition, high producing cell lines can be identified and isolated by FACS, thereby decreasing the time frame in finding high expressers. Functional monoclonal antibody will be confirmed by ELISA.
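By way of illustration only, the sequential loading of unique targets described above can be sketched as simple bookkeeping in Python. The sketch is not part of the disclosed methods. It assumes that each loading event consumes one free promoter-attP site and requires a marker not already present on the platform; the class name, marker names and target names are hypothetical.

```python
class PlatformACes:
    """Toy model of a platform chromosome with several promoter-attP sites."""

    def __init__(self, free_attp_sites):
        self.free_attp_sites = free_attp_sites
        self.loaded = {}  # marker name -> target gene name

    def load(self, marker, target_gene):
        """Record one unidirectional integration selected with `marker`."""
        if marker in self.loaded:
            raise ValueError(f"marker {marker!r} already used; selection is not possible")
        if self.free_attp_sites == 0:
            raise ValueError("no free attP site remains on the platform")
        self.free_attp_sites -= 1
        self.loaded[marker] = target_gene


platform = PlatformACes(free_attp_sites=3)
platform.load("neomycin", "heavy_chain")
platform.load("puromycin", "light_chain")
print(platform.loaded, platform.free_attp_sites)
```

Because integrated vectors are not excised in this model, the loaded set only grows; reusing a marker is rejected since a second event could not be distinguished by selection.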
[0206] c. Additional Components in Cellular Protein Production Platform ACes (CPP ACes)
[0207] In addition to the aforementioned CPP ACes system, other genetic factors can be included to enhance the yield and quality of the expressed protein. Again, to provide maximum flexibility, these additional factors can be inserted onto the CPP platform ACes by λINTE174R dependent site-specific recombination. Other factors that could be used with a CPP Platform ACes include, for example, the adenovirus E1a transactivation system, which upregulates both cellular and viral promoters (see, e.g., Svensson and Akusjarvi (1984) EMBO J. 3:789-794; and U.S. Pat. Nos. 5,866,359; 4,775,630 and 4,920,211).
[0208] d. Targets for CHO-ACes Engineering to Enhance Cell Growth, Such as CHO Cell Growth and Protein Production/Quality
[0209] If adding these additional factors onto the CPP ACes is not prudent or desired, the host cell, such as a CHO cell, can be engineered to express these factors (see, below, targets for CHO-ACes engineering to enhance CHO cell growth and protein production/quality). Additional factors to consider including are addition of insulin or IGF-1 to sustain viability; human sialyltransferases or related factors to produce more human-like glycoproteins; expression of factors to decrease ammonium accumulation during cell growth; expression of factors to inhibit apoptosis; expression of factors to improve protein secretion and protein folding; and expression of factors to permit serum-free transfection and selection.
[0210] 1) Addition of Insulin or IGF-1 to Sustain Viability
[0211] Stimulatory factors and/or their receptors are expressed to set up an autocrine loop, to improve cell growth, such as CHO cell growth. Two exemplary candidates are insulin and IGF-1 (see, Biotechnol Prog 2000 September; 16(5):693-7). Insulin is the most commonly used growth factor for sustaining cell growth and viability in serum-free Chinese hamster ovary (CHO) cell cultures. Insulin and an IGF-1 analog (LongR(3)) serve as growth and viability factors for CHO cells.
[0212] CHO cells were modified to produce higher levels of essential nutrients and factors. A serum-free (SF) medium for dihydrofolate reductase-deficient Chinese hamster ovary cells (DG44 cells) was prepared. Chinese hamster ovary cells (DG44 cells), which are normally maintained in 10% serum medium, were gradually weaned to 0.5% serum medium to increase the probability of successful growth in SF medium (see, Kim et al. (199) In Vitro Cell Dev Biol Anim 35(4):178-82). A SF medium (SF-DG44) was formulated by supplementing the basal medium with these components; basal medium was prepared by supplementing Dulbecco's modified Eagle's medium and Ham's nutrient mixture F12 with hypoxanthine (10 mg/l) and thymidine (10 mg/l). Development of a SF medium for DG44 cells was facilitated using a Plackett-Burman design technique and weaning of cells.
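By way of illustration only, the Plackett-Burman screening approach mentioned above can be sketched in Python. The sketch builds the commonly tabulated 12-run, two-level design, which can screen up to 11 candidate medium supplements, and estimates main effects from measured responses. The supplement names are hypothetical placeholders; the text above does not state which components or how many runs were actually used.

```python
# Sketch of a 12-run Plackett-Burman screening design (two levels per factor).
# Assumption: the factor names below are hypothetical placeholders; the source
# does not list the supplements that were actually screened.

GENERATOR_12 = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]


def plackett_burman_12():
    """Return 12 runs x 11 columns of +1 (high level) / -1 (low level)."""
    n = len(GENERATOR_12)
    # Eleven runs are cyclic shifts of the generator row.
    runs = [[GENERATOR_12[(j - i) % n] for j in range(n)] for i in range(n)]
    # The twelfth run sets every factor to its low level.
    runs.append([-1] * n)
    return runs


def main_effects(design, responses):
    """Estimate each column's main effect as mean(high) - mean(low)."""
    half = len(design) / 2
    return [
        sum(row[j] * y for row, y in zip(design, responses)) / half
        for j in range(len(design[0]))
    ]


if __name__ == "__main__":
    factors = ["insulin", "transferrin", "selenite"]  # hypothetical examples
    design = plackett_burman_12()
    for run_number, row in enumerate(design, start=1):
        levels = {name: row[k] for k, name in enumerate(factors)}
        print(run_number, levels)
```

Unused columns can be left unassigned; because the columns are mutually orthogonal and balanced, each main effect is estimated independently of the others under the usual assumption that interactions are negligible.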
[0213] 2) Human Sialyltransferases or Related Factors to Produce More Human-Like Glycoproteins
[0214] CHO cells have been modified by increasing their ability to process protein via addition of complex carbohydrates. This has been achieved by overexpression of relevant processing enzymes, or in some cases, reducing expression of relevant enzymes (see, Bragonzi et al. (2000) Biochim Biophys Acta 1474(3):273-282; see, also Weikert et al. (1999) Nature biotech. 17:1116-1121; Ferrari J et al. (1998) Biotechnol Bioeng 60(5):589-95). A CHO cell line expressing alpha-2,6-sialyltransferase was developed for the production of human-like sialylated recombinant glycoproteins. The sialylation defect of CHO cells can be corrected by transfecting the alpha-2,6-sialyltransferase (alpha-2,6-ST) cDNA into the cells. Glycoproteins produced by such CHO cells display alpha-2,6- and alpha-2,3-linked terminal sialic acid residues, similar to human glycoproteins.
[0215] As another example for improving the production of human-like sialylated recombinant glycoproteins, a CHO cell line has been developed that constitutively expresses sialidase antisense RNA (see, Ferrari J et al. (1998) Biotechnol Bioeng 60(5):589-95). Several antisense expression vectors were prepared using different regions of the sialidase gene. Co-transfection of the antisense constructs with a vector conferring puromycin resistance gave rise to over 40 puromycin resistant clones that were screened for sialidase activity. A 5' 474 bp coding segment of the sialidase cDNA, in the inverted orientation in an SV40-based expression vector, gave maximal reduction of the sialidase activity to about 40% wild-type values.
[0216] Oligosaccharide biosynthesis pathways in mammalian cells have been engineered for generation of recombinant glycoproteins (see, e.g., Sburlati (1998) Biotechnol Prog 14(2):189-92, which describes a Chinese hamster ovary (CHO) cell line capable of producing bisected oligosaccharides on glycoproteins). This cell line was created by overexpression of a recombinant N-acetylglucosaminyltransferase III (GnT-III) (see, also, Prati et al. (1998) Biotechnol Bioeng 59(4):445-50, which describes antisense strategies for glycosylation engineering of CHO cells).
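By way of illustration only, placing a 5' coding segment in the inverted orientation amounts to cloning its reverse complement downstream of the promoter. The following Python sketch shows that operation; the sequence is a made-up placeholder and not the sialidase cDNA, and only the 474 bp segment length is taken from the text above.

```python
# Sketch: derive the antisense-orientation insert for a 5' coding segment.
# Assumption: `cdna` is a placeholder sequence, not the real sialidase cDNA.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_insert(cdna: str, length: int = 474) -> str:
    """Take the first `length` bases of the coding strand and invert them."""
    if len(cdna) < length:
        raise ValueError("cDNA is shorter than the requested segment")
    return reverse_complement(cdna[:length])


if __name__ == "__main__":
    cdna = "ATGGCTTCA" * 60  # 540 bp placeholder coding sequence
    insert = antisense_insert(cdna)
    print(len(insert), insert[:12])
```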
[0217] 3) Expression of Factors to Decrease Ammonium Accumulation During Cell Growth
[0218] Excess ammonium, which is a by-product of CHO cell metabolism, can have detrimental effects on cell growth and protein quality (see, Yang et al. (2000) Biotechnol Bioeng 68(4):370-80). To solve this problem, ammonium levels were modified by overexpressing carbamoyl phosphate synthetase I and ornithine transcarbamoylase or glutamine synthetase in CHO cells. Such modification resulted in reduced ammonium levels and an increase in the growth rate (see Kim et al. (2000) J Biotechnol 81 (2-3): 129-40; and Enosawa et al. (1997) Cell Transplant 6(5):537-40).
[0219] 4) Expression of Factors to Improve Protein Secretion and Protein Folding
[0220] Overexpression of relevant enzymes can be engineered into the ACes to improve protein secretion and folding.
[0221] 5) Expression of Factors to Permit Serum-Free Transfection and Selection
[0222] It is advantageous to have the ability to convert CHO cells growing in suspension in serum-free medium to adherence without having to resort to serum addition. Laminin or fibronectin addition is sufficient to make cells adherent (see, e.g., Zaworski et al. (1993) Biotechniques 15(5):863-6) so that expressing either of these genes in CHO cells under an inducible promoter should allow for a reversible shift to adherence without requiring serum addition.
[0223] 2. Platform ACes and Gene Therapy
[0224] The platform ACes provided herein are contemplated for use in mammalian gene therapy, particularly human gene therapy. Human ACes can be derived from human acrocentric chromosomes from human host cells, in which the amplified sequences are heterochromatic and/or human rDNA. Different platform ACes applicable for different tissue cell types are provided. The ACes for gene therapy can contain a single copy of a therapeutic gene inserted into a defined location on platform ACes. Therapeutic genes include genomic clones, cDNA, hybrid genes and other combinations of sequences. Preferred selectable markers are those from the mammalian host, such as human derived factors so that they are non-immunogenic, non-toxic and allow for efficient selection, such as by FACS and/or drug resistance.
[0225] Platform ACes, useful for gene therapy and other applications, as noted herein, can be generated by megareplicator dependent amplification, such as by the methods in U.S. Pat. Nos. 6,077,697 and 6,025,155 and published International PCT application No. WO 97/40183. In one embodiment, human ACes are produced using human rDNA constructs that target rDNA arrays on human acrocentric chromosomes and induce the megareplicator in human cells, particularly in primary cell lines (with sufficient number of doublings to form the ACes) or stem cells (such as hematopoietic stem cells, mesenchymal stem cells, adult stem cells or embryonic stem cells) to avoid the introduction of potentially harmful rearranged DNA sequences present in many transformed cell lines. Megareplicator induced ACes formation can result in multiple copies of targeting DNA/selectable markers in each amplification block on both chromosomal arms of the platform ACes.
[0226] In view of the considerations regarding immunogenicity and toxicity, the production of human platform ACes for gene therapy applications employs a two component system analogous to the platform ACes designed for cellular protein production (CPP platform ACes). The system includes a platform chromosome of entirely human DNA origin containing multiple engineering sites and a gene target vector carrying the therapeutic gene of interest.
[0227] a. Platform Construction
[0228] The initial de novo construction of the platform chromosome employs the co-transfection of excess targeting DNA and a selectable marker. In one embodiment, the DNA is targeted to the rDNA arrays on the human acrocentric chromosomes (chromosomes 13, 14, 15, 21 and 22). For example, two large human rDNA containing PAC clones 18714 and 18720 and the human PAC clone 558F8 are used for targeting (Genome Research (ML) now Incyte, BACPAC Resources, 747 52nd Street, Oakland Calif.). The mouse rDNA clone pFK161 (SEQ ID NO: 118), which was used to make the human SATAC from the 94-3 hamster/human hybrid cell line (see, e.g., published International PCT application No. WO 97/40183 and Csonka et al., Journal of Cell Science 113:3207-3216 and Example 1 for a description of pFK161) can also be used.
[0229] For animal applications, selectable markers should be non-immunogenic in the animal, such as a human, and include, but are not limited to: human nerve growth factor receptor (detected with a MAb, such as described in U.S. Pat. No. 6,365,373); truncated human growth factor receptor (detected with MAb); mutant human dihydrofolate reductase (DHFR; fluorescent MTX substrate available); secreted alkaline phosphatase (SEAP; fluorescent substrate available); human thymidylate synthase (TS; confers resistance to anti-cancer agent fluorodeoxyuridine); human glutathione S-transferase alpha (GSTA1; conjugates glutathione to the stem cell selective alkylator busulfan; chemoprotective selectable marker in CD34+ cells); CD24 cell surface antigen in hematopoietic stem cells; human CAD gene to confer resistance to N-phosphonacetyl-L-aspartate (PALA); human multi-drug resistance-1 (MDR-1; P-glycoprotein surface protein selectable by increased drug resistance or enriched by FACS); human CD25 (IL-2a; detectable by MAb-FITC); methylguanine-DNA methyltransferase (MGMT; selectable by carmustine); and cytidine deaminase (CD; selectable by Ara-C).
[0230] Since megareplicator induced amplification generates multiple copies of the selectable marker, a second consideration for the selection of the human marker is the resulting dose of the expressed marker after ACes formation. High level of expression of certain markers may be detrimental to the cell and/or result in autoimmunity. One method to decrease the dose of the marker protein is by shortening its half-life, such as via the fusion of the well-conserved human ubiquitin tag (a 76 amino acid sequence) thus leading to increased turnover of the selectable marker. This has been used successfully for a number of reporter systems including DHFR (see, e.g., Stack et al. (2000) Nature Biotechnology 18:1298-1302 and references cited therein).
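By way of illustration only, the effect of a shortened half-life on marker dose can be sketched with a simple first-order turnover model, in which the steady-state level equals the total synthesis rate divided by the degradation rate constant (ln 2 divided by the half-life). The model and all numeric values below are assumptions chosen for illustration and are not measurements from the cited work.

```python
# Sketch: first-order turnover model for marker dose on a multi-copy ACes.
# Assumptions: constant synthesis per gene copy and first-order degradation;
# all numeric values are hypothetical and chosen only for illustration.
import math


def steady_state_level(copies: int, synthesis_per_copy: float, half_life_h: float) -> float:
    """Steady-state protein level = total synthesis rate / degradation rate."""
    degradation_rate = math.log(2) / half_life_h  # per hour
    return copies * synthesis_per_copy / degradation_rate


if __name__ == "__main__":
    untagged = steady_state_level(copies=20, synthesis_per_copy=100.0, half_life_h=20.0)
    tagged = steady_state_level(copies=20, synthesis_per_copy=100.0, half_life_h=2.0)
    print(f"fold reduction from the shorter half-life: {untagged / tagged:.1f}")
```

Under this model the steady-state level scales linearly with both copy number and half-life, so a destabilizing tag can offset the dose increase caused by amplification.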
[0231] Using the ubiquitin tagged protein, a human selectable marker system analogous to the CPP ACes described herein is constructed. Briefly, a tagged selectable marker, such as for example one of those described herein, is cloned downstream of an attP site and expressed from a human promoter. Exemplary promoters contemplated for use herein include, but are not limited to, the human ferritin heavy chain promoter (SEQ ID NO:128); RNA PolI; EF1α; TR; glyceraldehyde-3-phosphate dehydrogenase core promoter (GAP); a GAP core promoter including a proximal insulin inducible element and the intervening GAP sequence; phosphofructokinase promoter; and phosphoglycerate kinase promoter. Also contemplated herein is an aldolase A promoter H1 & H2 (representing closely spaced transcriptional start sites) along with the proximal H enhancer. There are 4 promoters (e.g., transcriptional start sites) for this gene, each having different regulatory and tissue activity. The H (most proximal 2) promoters are ubiquitously expressed off the H enhancer. This resulting marker can then be co-transfected along with excess human rDNA targeting sequence into the host cells. An important criterion for the selection of the recipient cells is a sufficient number of cell doublings for the formation and detection of ACes. Accordingly, the co-transfections should be attempted in human primary cells that can be cultured for long periods of time, such as for example, stem cells (e.g., hematopoietic, mesenchymal, adult or embryonic stem cells), or the like. Additional cell types include, but are not limited to: single gene transfected cells exhibiting increased life-span; over-expressing c-myc cells, e.g. MSU1.1 (Morgan et al., 1991, Exp. Cell Res., November; 197(1):125-136); over-expressing telomerase lines, such as TERT cells; SV40 large T-antigen transfected lines; tumor cell lines, such as HT1080; and hybrid human cell lines, such as the 94-3 hamster/human hybrid cell line.
[0232] b. Gene Target Vector
[0233] The second component of the GT platform ACes (GT ACes) system involves the use of engineered target vectors carrying the therapeutic gene of interest. These are introduced onto the GT platform ACes via site-specific recombination. As with the CPP ACes, the use of engineered target vectors maximizes the use of the de novo generated GT platform ACes for most gene targets. Furthermore, using lambda integrase technology, GT platform ACes containing multiple attP sites provide the opportunity to incorporate multiple therapeutic targets onto a single platform. This could be of value in cases where a defined therapy requires multiple gene targets, a single therapeutic target requires an additional gene regulatory factor or a GT ACes requires a "kill" switch.
[0234] Similar to the CPP ACes, a feature of the gene target vector is the promoterless marker gene containing an upstream attB site (marker 2 on FIG. 9). Normally, the marker (in this case, a cell surface antigen that can be sorted by FACS would be ideal) would not be expressed unless it is placed downstream of a promoter sequence. Using the lambda integrase technology (λINTE174R on FIG. 9), site-specific recombination between the attB site on the vector and the promoter-attP site on the GT platform ACes results in the expression of marker 2 on the gene target vector, i.e. positive selection for the site-specific event. Site-specific recombination events on the GT ACes versus random integrations next to a promoter in the genome (false positive) can be quickly screened by designing primers to detect the correct event by PCR.
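By way of illustration only, the positive-selection logic described above can be sketched in Python as an element-level model. Constructs are reduced to ordered lists of named elements; the element names are placeholders, and the junction PCR is modeled simply as requiring the platform promoter, an att site and marker 2 to be adjacent, which is an assumption made for clarity rather than a primer design.

```python
# Sketch: element-level model of attP x attB integration and junction PCR.
# Assumptions: constructs are simplified to ordered lists of named elements;
# element names (e.g. "therapeutic_gene") are illustrative placeholders.

def integrate(platform, vector):
    """Insert a circular vector at the first attP of the platform.

    attP (P-O-P') x attB (B-O-B') yields hybrid sites attR (P-O-B') and
    attL (B-O-P') flanking the integrated vector.
    """
    if "attP" not in platform or "attB" not in vector:
        raise ValueError("both an attP and an attB site are required")
    p = platform.index("attP")
    b = vector.index("attB")
    # Linearize the circular vector so it starts just after its attB site.
    payload = vector[b + 1:] + vector[:b]
    return platform[:p] + ["attR"] + payload + ["attL"] + platform[p + 1:]


def marker_expressed(construct, marker, promoter="promoter"):
    """A promoterless marker is on only if it directly follows promoter + att site."""
    for i in range(len(construct) - 2):
        if (construct[i] == promoter
                and construct[i + 1].startswith("att")
                and construct[i + 2] == marker):
            return True
    return False


def junction_pcr_positive(construct, fwd_in="promoter", rev_in="marker2"):
    """Primers in the platform promoter and in marker 2 amplify only the junction."""
    return marker_expressed(construct, rev_in, promoter=fwd_in)


if __name__ == "__main__":
    platform = ["promoter", "attP", "marker1"]
    vector = ["attB", "marker2", "therapeutic_gene"]
    product = integrate(platform, vector)
    print(product)
    print(marker_expressed(vector, "marker2"), marker_expressed(product, "marker2"))
```

In this model a random integration next to a genomic promoter can switch on marker 2 but fails the junction PCR, because the forward primer is specific to the platform promoter.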
[0235] For expression of the therapeutic gene, human specific promoters, such as a ferritin heavy chain promoter (SEQ ID NO:128); EF1α or RNA PolI, are used. These promoters are for high level expression of a cDNA encoded therapeutic protein. In addition to expressing cDNA (or even hybrid cDNA artificial intron constructs), the GT platform ACes are used for engineering and expressing large genomic fragments carrying therapeutic genes of interest expressed from native promoter sequences. This is of importance in situations where the therapy requires precise cell specific expression or in instances where expression is best achieved from genomic clones versus cDNA.
[0236] 3. Selectable Markers for Use, for Example, in Gene Therapy (GT)
[0237] The following are selectable markers that can be incorporated into human ACes and used for selection.
[0238] Dual Resistance to 4-Hydroperoxycyclophosphamide and Methotrexate by Retroviral Transfer of the Human Aldehyde Dehydrogenase Class 1 Gene and a Mutated Dihydrofolate Reductase Gene
[0239] The genetic transfer of drug resistance to hematopoietic cells is one approach to overcoming myelosuppression caused by high-dose chemotherapy. Because cyclophosphamide (CTX) and methotrexate (MTX) are commonly used non-cross-resistant drugs, generation of dual drug resistance in hematopoietic cells that allows dose intensification may increase anti-tumor effects and circumvent the emergence of drug-resistant tumors. A retroviral vector containing a human cytosolic ALDH-1-encoding DNA clone and a human doubly mutated DHFR-encoding clone (Phe22/Ser31; termed F/S in the description of constructs) was constructed to generate increased resistance to CTX and MTX (Takebe et al. (2001) Mol Ther 3(1):88-96). This construct may be useful for protecting patients from high-dose CTX- and MTX-induced myelosuppression. ACes can be similarly constructed.
[0240] Multiple Mechanisms of N-Phosphonacetyl-L-Aspartate Resistance in Human Cell Lines: Carbamyl-P Synthetase/Aspartate Transcarbamylase/Dihydro-Orotase Gene Amplification is Frequent Only when Chromosome 2 is Rearranged
[0241] Rodent cells resistant to N-phosphonacetyl-L-aspartate (PALA) invariably contain amplified carbamyl-P synthetase/aspartate transcarbamylase/dihydro-orotase (CAD) genes, usually in widely spaced tandem arrays present as extensions of the same chromosome arm that carries a single copy of CAD in normal cells (Smith et al. (1997) Proc. Natl. Acad. Sci. U.S.A. 94:1816-21). In contrast, amplification of CAD is very infrequent in several human tumor cell lines. Cell lines with minimal chromosomal rearrangement and with unrearranged copies of chromosome 2 rarely develop intrachromosomal amplifications of CAD. These cells frequently become resistant to PALA through a mechanism that increases the aspartate transcarbamylase activity with no increase in CAD copy number, or they obtain one extra copy of CAD by forming an isochromosome 2p or by retaining an extra copy of chromosome 2. In cells with multiple chromosomal aberrations and rearranged copies of chromosome 2, amplification of CAD as tandem arrays from rearranged chromosomes is the most frequent mechanism of PALA resistance. All of these different mechanisms of PALA resistance are blocked in normal human fibroblasts. Thus, ACes with multiple copies of the CAD gene would provide PALA resistance.
[0242] Retroviral Coexpression of Thymidylate Synthase and Dihydrofolate Reductase Confers Fluoropyrimidine and Antifolate Resistance
[0243] Retroviral gene transfer of dominant selectable markers into hematopoietic cells can be used to select genetically modified cells in vivo or to attenuate the toxic effects of chemotherapeutic agents. Fantz et al. ((1998) Biochem Biophys Res Comm 243(1):6-12) have shown that retroviral gene transfer of thymidylate synthase (TS) confers resistance to TS directed anticancer agents and that co-expression of TS and dihydrofolate reductase (DHFR) confers resistance to TS and DHFR cytotoxic agents. Retroviral vectors encoding Escherichia coli TS, human TS, and the Tyr-to-His at residue 33 variant of human TS (Y33HhTS) were constructed and fibroblasts transfected with these vectors conferred comparable resistance to the TS-directed agent fluorodeoxyuridine (FdUrd, approximately 4-fold). Retroviral vectors that encode dual expression of Y33HhTS and the human L22Y DHFR (L22YhDHFR) variants conferred resistance to FdUrd (3- to 5-fold) and trimetrexate (30- to 140-fold). A L22YhDHFR-Y33HhTS chimeric retroviral vector was also constructed and transduced cells were resistant to FdUrd (3-fold), AG337 (3-fold), trimetrexate (100-fold) and methotrexate (5-fold). These results show that recombinant retroviruses can be used to transfer the cDNA that encodes TS and DHFR and dual expression in transduced cells is sufficiently high to confer resistance to TS and DHFR directed anticancer agents. ACes can be similarly constructed.
[0244] Human CD34+ Cells do not Express Glutathione S-Transferases Alpha
[0245] The expression of glutathione S-transferases alpha (GST alpha) in human hematopoietic CD34+ cells and bone marrow was studied using RT-PCR and immunoblotting (Czerwinski M, Kiem et al. (1997) Gene Ther 4(3):268-70). The GSTA1 protein conjugates glutathione to the stem cell selective alkylator busulfan. This reaction is the major pathway of elimination of the compound from the human body. Human hematopoietic CD34+ cells and bone marrow do not express GSTA1 message, which was present at a high level in liver, an organ relatively resistant to busulfan toxicity in comparison to bone marrow. Similarly, baboon CD34+ cells and dog bone marrow do not express GSTA1. Thus, human GSTA1 is a chemoprotective selectable marker in human stem cell gene therapy and could be employed in ACes construction.
[0246] Selection of Retrovirally Transduced Hematopoietic Cells Using CD24 as a Marker of Gene Transfer
[0247] Pawliuk et al. ((1994) Blood 84(9):2868-2877) have investigated the use of a cell surface antigen as a dominant selectable marker to facilitate the detection and selection of retrovirally infected target cells. The small coding region of the human cell surface antigen CD24 (approximately 240 bp) was introduced into a myeloproliferative sarcoma virus (MPSV)-based retroviral vector, which was then used to infect day 4 5-fluorouracil (5-FU)-treated murine bone marrow cells. Within 48 hours of termination of the infection procedure CD24-expressing cells were selected by fluorescence-activated cell sorting (FACS) with an antibody directed against the CD24 antigen. Functional analysis of these cells showed that they included not only in vitro clonogenic progenitors and day 12 colony-forming unit-spleen but also cells capable of competitive long-term hematopoietic repopulation. Double-antibody labeling studies performed on recipients of retrovirally transduced marrow cells showed that some granulocytes, macrophages, erythrocytes, and, to a lesser extent, B and T lymphocytes still expressed the transduced CD24 gene at high levels 4 months later. No gross abnormalities in hematopoiesis were detected in mice repopulated with CD24-expressing cells. These results show that the use of the CD24 cell surface antigen as a retrovirally encoded marker permits rapid, efficient, and nontoxic selection in vitro of infected primary cells, facilitates tracking and phenotyping of their progeny, and provides a tool to identify elements that regulate the expression of transduced genes in the most primitive hematopoietic cells. ACes could be similarly constructed.
[0248] DeltahGHR, a Biosafe Cell Surface-Labeling Molecule for Analysis and Selection of Genetically Transduced Human Cells
[0249] A selectable marker for retroviral transduction and selection of human and murine cells is known (see, Garcia-Ortiz et al. (2000) Hum Gene Ther 11(2):333-46). The molecule expressed on the cell surface of the transduced population is a truncated version of human growth hormone receptor (deltahGHR), capable of ligand (hGH) binding, but devoid of the domains involved in signal triggering. The engineered molecule is stably expressed in the target cells as an inert protein unable to trigger proliferation or to rescue the cells from apoptosis after ligand binding. This new marker has a wide application spectrum, since hGHR in the human adult is highly expressed only in liver cells, and lower levels have been reported in certain lymphocyte cell populations. The deltahGHR label has high biosafety potential, as it belongs to a well-characterized hormonal system that is nonessential in adults, and there is extensive clinical experience with hGH administration in humans. The differential binding properties of several monoclonal antibodies (MAbs) are used in a cell rescue method in which the antibody used to select deltahGHR-transduced cells is eluted by competition with hGH or, alternatively, biotinylated hGH is used to capture tagged cells. In the latter system, the final purified population is recovered free of attached antibodies in hGH (a substance approved for human use)-containing medium. Such a system could be used to identify ACes-containing cells.
[0250] 4. Transgenic Models for Evaluation of Genes and Discovery of New Traits in Plants
[0251] Of interest is the use of plants and plant cells containing artificial chromosomes for the evaluation of new genetic combinations and discovery of new traits. Artificial chromosomes, because they can contain significant amounts of DNA, can encode numerous genes and accordingly a multiplicity of traits. It is contemplated here that artificial chromosomes, when formed from one plant species, can be evaluated in a second plant species. The resultant phenotypic changes observed, for example, can indicate the nature of the genes contained within the DNA of the artificial chromosome, and hence permit the identification of novel genetic activities. Artificial chromosomes containing euchromatic DNA or partially containing euchromatic DNA can serve as a valuable source of new traits when transferred to an alien plant cell environment. For example, it is contemplated that artificial chromosomes derived from dicot plant species can be introduced into monocot plant species by transferring a dicot artificial chromosome that possesses a region of euchromatic DNA containing expressed genes.
[0252] The artificial chromosomes can be designed to allow the artificial chromosome to recombine with the naturally occurring plant DNA in such a fashion that a large region of naturally occurring plant DNA becomes incorporated into the artificial chromosome. This allows the artificial chromosome to contain new genetic activities and hence carry novel traits. For example, an artificial chromosome can be introduced into a wild relative of a crop plant under conditions whereby a portion of the DNA present in the chromosomes of the wild relative is transferred to the artificial chromosome. After isolation of the artificial chromosome, this naturally occurring region of DNA from the wild relative, now located on the artificial chromosome can be introduced into the domesticated crop species and the genes encoded within the transferred DNA expressed and evaluated for utility. New traits and gene systems can be discovered in this fashion. The artificial chromosome can be modified to contain sequences that promote homologous recombination within plant cells, or be modified to contain a genetic system that functions as a site-specific recombination system.
[0253] Artificial chromosomes modified to recombine with plant DNA offer many advantages for the discovery and evaluation of traits in different plant species. When the artificial chromosome containing DNA from one plant species is introduced into a new plant species, new traits and genes can be introduced. This use of an artificial chromosome allows for the ability to overcome the sexual barrier that prevents transfer of genes from one plant species to another species. Using artificial chromosomes in this fashion allows for many potentially valuable traits to be identified including traits that are typically found in wild species. Other valuable applications for artificial chromosomes include the ability to transfer large regions of DNA from one plant species to another, such as DNA encoding potentially valuable traits such as altered oil, carbohydrate or protein composition, multiple genes encoding enzymes capable of producing valuable plant secondary metabolites, genetic systems encoding valuable agronomic traits such as disease and insect resistance, genes encoding functions that allow association with soil bacterium such as growth promoting bacteria or nitrogen fixing bacteria, or genes encoding traits that confer freezing, drought or other stress tolerances. In this fashion, artificial chromosomes can be used to discover regions of plant DNA that encode valuable traits.
[0254] The artificial chromosome can also be designed to allow the transfer and subsequent incorporation of these valuable traits now located on the artificial chromosome into the natural chromosomes of a plant species. In this fashion the artificial chromosomes can be used to transfer large regions of DNA encoding traits normally found in one plant species into another plant species. In this fashion, it is possible to derive a plant cell that no longer needs to carry an artificial chromosome to possess the novel trait. Thus, the artificial chromosome would serve as the transfer mechanism to permit the formation of plants with a greater degree of genetic diversity.
[0255] The design of an artificial chromosome to accomplish the afore-mentioned purposes can include within the artificial chromosome the presence of specific DNA sequences capable of acting as sites for homologous recombination to take place. For example, the DNA sequence of Arabidopsis is now known. To construct an artificial chromosome capable of recombining with a specific region of Arabidopsis DNA, a sequence of Arabidopsis DNA, normally located near a chromosomal location encoding genes of potential interest, can be introduced into an artificial chromosome by methods provided herein. It may be desirable to include a second region of DNA within the artificial chromosome that provides a second flanking sequence to the region encoding genes of potential interest, to promote a double recombination event which would ensure transfer of the entire chromosomal region, encoding genes of potential interest, to the artificial chromosome. The modified artificial chromosome, containing the DNA sequences capable of homologous recombination, can then be introduced into Arabidopsis cells and the homologous recombination event selected.
[0256] It is convenient to include a marker gene to allow for the selection of a homologous recombination event. The marker gene is preferably inactive unless activated by an appropriate homologous recombination event. For example, U.S. Pat. No. 5,272,071 describes a method where an inactive plant gene is activated by a recombination event such that desired homologous recombination events can be easily scored. Similarly, U.S. Pat. No. 5,501,967 describes a method for the selection of homologous recombination events by activation of a silent selection gene first introduced into the plant DNA, the gene being activated by an appropriate homologous recombination event. Both of these methods can be applied to enable a selective process to be included to select for recombination between an artificial chromosome and a plant chromosome. Once the homologous recombination event is detected, the selected artificial chromosome is isolated and introduced into a recipient cell, for example, tobacco, corn, wheat or rice, and the expression of the newly introduced DNA sequences evaluated.
[0257] Phenotypic changes in the recipient plant cells containing the artificial chromosome, or in regenerated plants containing the artificial chromosome, allow for the evaluation of the nature of the traits encoded by the Arabidopsis DNA, under conditions naturally found in plant cells, including the naturally occurring arrangement of DNA sequences responsible for the developmental control of the traits in the normal chromosomal environment.
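By way of illustration only, the double recombination event described above can be sketched in Python as an exchange of the segments lying between two shared flanking sequences. The element names ("flank_A", "gene_1" and so on) are hypothetical placeholders, and the model ignores recombination frequency and sequence-level detail.

```python
# Sketch: element-level model of a double crossover between two shared flanks.
# Assumption: chromosomes are simplified to ordered lists of named elements;
# all element names are illustrative placeholders.

def double_crossover(artificial, natural, left, right):
    """Swap the segments between `left` and `right` on the two molecules."""
    a1, a2 = artificial.index(left), artificial.index(right)
    n1, n2 = natural.index(left), natural.index(right)
    if a1 >= a2 or n1 >= n2:
        raise ValueError("flanking sequences must appear in the same order")
    # Crossing over once in each flank exchanges everything in between.
    new_artificial = artificial[:a1 + 1] + natural[n1 + 1:n2] + artificial[a2:]
    new_natural = natural[:n1 + 1] + artificial[a1 + 1:a2] + natural[n2:]
    return new_artificial, new_natural


if __name__ == "__main__":
    artificial = ["centromere", "flank_A", "placeholder", "flank_B", "telomere"]
    natural = ["flank_A", "gene_1", "gene_2", "flank_B", "other"]
    new_artificial, new_natural = double_crossover(artificial, natural, "flank_A", "flank_B")
    print(new_artificial)
    print(new_natural)
```

With only one flanking sequence a single crossover would join the two molecules rather than exchange a defined segment, which is why the second flank is described as ensuring transfer of the entire region.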
[0258] Traits such as durable fungal or bacterial disease resistance, new oil and carbohydrate compositions, valuable secondary metabolites such as phytosterols, flavonoids, efficient nitrogen fixation or mineral utilization, resistance to extremes of drought, heat or cold are all found within different populations of plant species and are often governed by multiple genes. The use of single gene transformation technologies does not permit the evaluation of the multiplicity of genes controlling many valuable traits. Thus, incorporation of these genes into artificial chromosomes allows the rapid evaluation of the utility of these genetic combinations in heterologous plant species.
[0259] The large scale order and structure of the artificial chromosome provides a number of unique advantages in screening for new utilities or novel phenotypes within heterologous plant species. The size of new DNA that can be carried by an artificial chromosome can be millions of base pairs of DNA, representing potentially numerous genes that may have novel utility in a heterologous plant cell. Because the artificial chromosome is a "natural" environment for gene expression, the problems of variable gene expression and silencing seen for genes transferred by random insertion into a genome should not be observed. Similarly, there is no need to engineer the genes for expression, and the genes inserted would not need to be recombinant genes. Thus, one expects the expression from the transferred genes to be temporal and spatial, as observed in the species from where the genes were initially isolated. A valuable feature for these utilities is the ability to isolate the artificial chromosomes and to further isolate, manipulate and introduce into other cells artificial chromosomes carrying unique genetic compositions.
[0260] Thus, the use of artificial chromosomes and homologous recombination in plant cells can be used to isolate and identify many valuable crop traits.
[0261] In addition to the use of artificial chromosomes for the isolation and testing of large regions of naturally occurring DNA, methods for the use of artificial chromosomes and cloned DNA are also contemplated. Similar to that described above, artificial chromosomes can be used to carry large regions of cloned DNA, including that derived from other plant species.
[0262] The ability to incorporate novel DNA elements into an artificial chromosome as it is being formed allows for the development of artificial chromosomes specifically engineered as a platform for testing of new genetic combinations, or "genomic" discoveries for model species such as Arabidopsis. It is known that specific "recombinase" systems can be used in plant cells to excise or re-arrange genes. These same systems can be used to derive new gene combinations contained on an artificial chromosome.
[0263] The artificial chromosomes can be engineered as platforms to accept large regions of cloned DNA, such as that contained in Bacterial Artificial Chromosomes (BACs) or Yeast Artificial Chromosomes (YACs). It is further contemplated that, as a result of the typical structure of artificial chromosomes containing tandemly repeated DNA blocks, sequences other than cloned DNA sequence can be introduced by recombination processes. In particular, recombination within a predefined region of the tandemly repeated DNA within the artificial chromosome provides a mechanism to "stack" numerous regions of cloned DNA, including large regions of DNA contained within BAC or YAC clones. Thus, multiple combinations of genes can be introduced onto artificial chromosomes and these combinations tested for functionality. In particular, it is contemplated that multiple YACs or BACs can be stacked onto an artificial chromosome, the BACs or YACs containing multiple genes of complex pathways or multiple genetic pathways. The BACs or YACs are typically selected based on genetic information available within the public domain, for example from the Arabidopsis Information Management System (http://aims.cps.msu.edu/aims/index.html) or the information related to the plant DNA sequences available from the Institute for Genomic Research (http://www.tigr.org) and other sites known to those skilled in the art. Alternatively, clones can be chosen at random and evaluated for functionality. It is contemplated that combinations providing a desired phenotype can be identified by isolation of the artificial chromosome containing the combination and analyzing the nature of the inserted cloned DNA.
[0264] In this regard, it is contemplated that the use of site-specific recombination sequences can have considerable utility in developing artificial chromosomes containing DNA sequences recognized by recombinase enzymes and capable of accepting DNA sequences containing same. The use of site-specific recombination as a means to target an introduced DNA to a specific locus has been demonstrated in the art and such methods can be employed. The recombinase systems can also be used to transfer the cloned DNA regions contained within the artificial chromosome to the naturally occurring plant or mammalian chromosomes.
[0265] As noted herein, many site-specific recombinases are known and can be identified (Kilby et al. (1993) Trends in Genetics 9:413-418). The three recombinase systems that have been extensively employed include: an activity identified as R encoded by the pSR1 plasmid of Zygosaccharomyces rouxii, FLP encoded by the 2 μm circular plasmid from Saccharomyces cerevisiae, and Cre-lox from the phage P1.
[0266] The integration function of site-specific recombinases is contemplated as a means to assist in the derivation of genetic combinations on artificial chromosomes. In order to accomplish this, it is contemplated that a first step of introducing site-specific recombinase sites into the genome of a plant cell in an essentially random manner is conducted, such that the plant cell has one or more site-specific recombinase recognition sequences on one or more of the plant chromosomes. An artificial chromosome is then introduced into the plant cell, the artificial chromosome engineered to contain a recombinase recognition site (e.g., integration site) capable of being recognized by a site-specific recombinase. Optionally, a gene encoding a recombinase enzyme is also included, preferably under the control of an inducible promoter. Expression of the site-specific recombinase enzyme in the plant cell, either by induction of an inducible recombinase gene or by transient expression of a recombinase sequence, causes a site-specific recombination event to take place, leading to the insertion of a region of the plant chromosomal DNA (containing the recombinase recognition site) into the recombinase recognition site of the artificial chromosome, and forming an artificial chromosome containing plant chromosomal DNA. The artificial chromosome can be isolated and introduced into a heterologous host, preferably a plant host, and expression of the newly introduced plant chromosomal DNA can be monitored and evaluated for desirable phenotypic changes. Accordingly, carrying out this recombination with a population of plant cells wherein the chromosomally located recombinase recognition site is randomly scattered throughout the chromosomes of the plant can lead to the formation of a population of artificial chromosomes, each with a different region of plant chromosomal DNA, and each potentially representing a novel genetic combination.
[0267] This method requires the precise site-specific insertion of chromosomal DNA into the artificial chromosome. This precision has been demonstrated in the art. For example, Fukushige and Sauer ((1992) Proc. Natl. Acad. Sci. USA, 89:7905-7909) demonstrated that the Cre-lox homologous recombination system could be successfully employed to introduce DNA into a predefined locus in a chromosome of mammalian cells. In this demonstration a promoter-less antibiotic resistance gene modified to include a lox sequence at the 5' end of the coding region was introduced into CHO cells. Cells were re-transformed by electroporation with a plasmid that contained a promoter with a lox sequence and a transiently expressed Cre recombinase gene. Under the conditions employed, the expression of the Cre enzyme catalyzed the homologous recombination between the lox site in the chromosomally located promoter-less antibiotic resistance gene and the lox site in the introduced promoter sequence, leading to the formation of a functional antibiotic resistance gene. The authors demonstrated efficient and correct targeting of the introduced sequence: 54 of 56 lines analyzed corresponded to the predicted single copy insertion of the DNA due to Cre catalyzed site-specific homologous recombination between the lox sequences.
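As a worked illustration of the targeting figures cited above, the fraction of correctly targeted lines can be computed directly. This is only an arithmetic sketch; the variable names are illustrative and do not come from the cited study.

```python
# Counts reported for the Cre-lox targeting experiment cited above.
correct_lines = 54  # lines showing the predicted single-copy insertion
total_lines = 56    # lines analyzed

# Fraction of analyzed lines that were correctly targeted.
efficiency = correct_lines / total_lines

print(f"Correctly targeted: {efficiency:.1%} ({correct_lines} of {total_lines} lines)")
```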
[0268] Accordingly a lox sequence may be first added to a genome of a plant species capable of being transformed and regenerated to a whole plant to serve as a recombinase target DNA sequence for recombination with an artificial chromosome. The lox sequence may be optimally modified to further contain a selectable marker which is inactive but can be activated by insertion of the lox recombinase recognition sequence into the artificial chromosome.
[0269] A promoterless marker gene or selectable marker gene linked to the recombinase recognition sequence, which is first inserted into the chromosomes of a plant cell, can be used to engineer a platform chromosome. A promoter is linked to a recombinase recognition site, in an orientation that allows the promoter to control the expression of the marker or selectable marker gene upon recombination within the artificial chromosome. Upon a site-specific recombination event between a recombinase recognition site in a plant chromosome and the recombinase recognition site within the introduced artificial chromosome, a cell is derived with a recombined artificial chromosome, the artificial chromosome containing an active marker or selectable marker activity that permits the identification and/or selection of the cell.
[0270] The artificial chromosomes can be transferred to other plant or animal species and the functionality of the new combinations tested. The ability to conduct such an inter-chromosomal transfer of sequences has been demonstrated in the art. For example, the use of the Cre-lox recombinase system to cause a chromosome recombination event between two chromatids of different chromosomes has been shown.
[0271] Any number of recombination systems may be employed as described herein, such as, but not limited to, bacterially derived systems such as the att/int system of phage lambda, and the Gin/gix system.
[0272] More than one recombination system may be employed, including, for example, one recombinase system for the introduction of DNA into an artificial chromosome, and a second recombinase system for the subsequent transfer of the newly introduced DNA contained within an artificial chromosome into the naturally occurring chromosome of a second plant species. The choice of the specific recombination system used will be dependent on the nature of the modification contemplated.
[0273] By having the ability to isolate an artificial chromosome, in particular, artificial chromosomes containing plant chromosomal DNA introduced via site-specific recombination, and re-introduce the chromosome into other mammalian or plant cells, particularly plant cells, these new combinations can be evaluated in different crop species without the need to first isolate and modify the genes, or carry out multiple transformations or gene transfers to achieve the same combination for isolation and testing of the genes in plants. The use of a site-specific recombinase also allows the convenient recovery of the plant chromosomal region into other recombinant DNA vectors and systems, such as mammalian or insect systems, for manipulation and study.
[0274] Also contemplated herein are ACes, cell lines and methods for use in screening new chromosomal combinations, deletions and truncations within a eukaryotic genome that take advantage of the site-specific recombination systems incorporated onto platform ACes provided herein. For example, provided herein is a cell line useful for making a library of ACes, comprising a multiplicity of heterologous recombination sites randomly integrated throughout the endogenous chromosomes. Also provided herein is a method of making a library of ACes comprising random portions of a genome, comprising introducing one or more ACes into a cell line comprising a multiplicity of heterologous recombination sites randomly integrated throughout the endogenous chromosomes, under conditions that promote the site-specific chromosomal arm exchange of the ACes into, and out of, a multiplicity of the heterologous recombination sites within the cell's chromosomal DNA; and isolating said multiplicity of ACes, thereby producing a library of ACes whereby multiple ACes have different portions of the genome within. Also provided herein is a library of cells useful for genomic screening, said library comprising a multiplicity of cells, wherein each cell comprises an ACes having a mutually exclusive portion of a chromosomal nucleic acid therein. The library of cells can be from a different species and/or cell type than the chromosomal nucleic acid within the ACes.
Also provided is a method of making one or more cell lines, comprising:
[0275] a) integrating into endogenous chromosomal DNA of a selected cell species, a multiplicity of heterologous recombination sites;
[0276] b) introducing a multiplicity of ACes under conditions that promote the site-specific chromosomal arm exchange of the ACes into, and out of, a multiplicity of the heterologous recombination sites integrated within the cell's endogenous chromosomal DNA;
[0277] c) isolating said multiplicity of ACes, thereby producing a library of ACes whereby a multiplicity of ACes have mutually exclusive portions of the endogenous chromosomal DNA therein;
[0278] d) introducing the isolated multiplicity of ACes of step c) into a multiplicity of cells, thereby creating a library of cells;
[0279] e) selecting different cells having mutually exclusive ACes therein and clonally expanding or differentiating said different cells into clonal cell cultures, thereby creating one or more cell lines.
[0280] These ACes, cell lines and methods utilize the site-specific recombination sites on platform ACes in a manner analogous to YAC manipulation related to: the methods of generating terminal deletions in normal and artificial chromosomes (e.g., ACes; as described in Vollrath et al., 1988, PNAS, USA, 85:6027-6031; and Pavan et al., PNAS, USA, 87:1300-1304); the methods of generating interstitial deletions in normal and artificial chromosomes (as described in Campbell et al., 1991, PNAS, USA, 88:5744-5748); and the methods of detecting homologous recombination between two ACes (as described in Cellini et al., 1991, Nuc. Acid Res., 19(5):997-1000).
[0281] 5. Use of Platform ACes in Pharmacogenomic/Toxicology Applications (Development of "Reporter ACes")
[0282] In addition to the placement of genes onto ACes chromosomes for therapeutic protein production or gene therapy, the platform can be engineered via the IntR lambda integrase to carry reporter-linked constructs (reporter genes) that monitor changes in cellular physiology as measured by the particular reporter gene (or a series of different reporter genes) readout. The reporter linked constructs are designed to include a gene that can be detected (by for example fluorescence, drug resistance, immunohistochemistry, or transcript production, and the like) with well-known regulatory sequences that would control the expression of the detectable gene.
[0283] Exemplary regulatory promoter sequences are well-known in the art.
[0284] a) Reporter ACes for Drug Pathway Screening
[0285] The ACes can be engineered to carry reporter-linked constructs that indicate a signal is being transduced through one or a number of pathways. For example, transcriptionally regulated promoters from genes at the end (or any other chosen point) of particular signal transduction pathways could be engineered on the ACes to express the appropriate readout (either by fluorescent protein production or drug resistance) when the pathway is activated (or down-regulated as well). In one embodiment, a number of reporters from different pathways can be placed on an ACes chromosome. Cells (and/or whole animals) containing such a Reporter ACes could be exposed to a variety of drugs or compounds and monitored for the effects of the drugs or compounds upon the selected pathway(s) by the reporter gene(s). Thus, drugs or compounds can be classified or identified by particular pathways they excite or down-regulate. Similarly, transcriptional profiles obtained from genomic array experiments can be biologically validated using the reporter ACes provided herein.
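The classification of compounds by the pathways they excite or down-regulate can be sketched as a simple thresholding of reporter readouts. The following is a minimal sketch under stated assumptions: the pathway names, the fold-change values and the 2-fold thresholds are all hypothetical placeholders chosen for illustration and are not taken from this application.

```python
from typing import Dict


def classify_compound(
    fold_changes: Dict[str, float],
    up_threshold: float = 2.0,    # hypothetical cutoff for "activated"
    down_threshold: float = 0.5,  # hypothetical cutoff for "down-regulated"
) -> Dict[str, str]:
    """Label each reporter pathway from its treated/untreated signal ratio."""
    calls = {}
    for pathway, fold in fold_changes.items():
        if fold >= up_threshold:
            calls[pathway] = "activated"
        elif fold <= down_threshold:
            calls[pathway] = "down-regulated"
        else:
            calls[pathway] = "unchanged"
    return calls


# Hypothetical readouts for one compound across three reporters on one ACes.
example_readouts = {"pathway_A": 3.0, "pathway_B": 0.4, "pathway_C": 1.1}
example_profile = classify_compound(example_readouts)
print(example_profile)
```

Compounds with the same profile across the reporters would then fall into the same class, which is one way the grouping described above could be recorded.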
[0286] b) Reporter ACes for Toxic Compound Testing
[0287] Environmental or man-made genotoxicants can be tested in cell lines carrying platform ACes that bear a number of reporter genes linked to promoters that are transcriptionally regulated in response to DNA damage, induced apoptosis or necrosis, and cell-cycle perturbations. Furthermore, new drugs and/or compounds could be tested in a similar manner with the genotoxicant ACes reporter for their cellular/genetic toxicity by such a screen. Likewise, toxic compound testing could be carried out in whole transgenic animals carrying the ACes chromosome that measures genotoxicant exposure ("canary in a coal mine"). Thus, the same or similar type ACes could be used for toxicity testing in either a cell-based or whole animal setting. An example would include ACes that carry reporter-linked genes controlled by various cytochrome P450 profiled promoters and the like.
[0288] c) Reporter ACes for Individualized Pharmacogenomics/Drug Profiling
[0289] A common disease may arise via various mechanisms. In many instances there are multiple treatments available for a given disease. However, the success of a given treatment may depend upon the mechanism by which the disease originated and/or by the genetic background of the patient. In order to establish the most effective treatment for a given patient one could utilize the ACes reporters provided herein. ACes reporters can be used in patient cell samples to determine an individualized drug regimen for the patient. In addition, potential polymorphisms affecting the transcriptional regulation of an individual's particular gene can be assessed by this approach.
[0290] d) Reporter ACes for Classification of Similar Patient Tumors
[0291] As with other diseases as described in 5(c) above, cancer cells arise via different mechanisms. Furthermore, as a cancerous cell propagates it may undergo genomic alterations. An ACes reporter transferred to cells of different patients having the same disease, i.e. similar cancers, could be used to categorize the particular cancer of each patient, thereby facilitating the identification of the most effective therapeutic regimen. Examples would include the validation of array profiling of certain classes of breast cancers. Subsequently, appropriate drug profiling could be carried out as described above.
[0292] e) Reporter ACes as a "Differentiation" Sensor
[0293] The ACes reporter can be used as a "differentiation" sensor in stem cells or other progenitor cells in order to enrich by selection (either FACS based screening, drug selection and/or use of a suicide gene) for a particular class of differentiated or undifferentiated cells. For example, in one embodiment, this assay could also be used for compound screening for small molecule modifiers of cell differentiation.
[0294] f) Whole Animal Studies with Reporter ACes
[0295] Finally, with whole-body fluorescence imaging technology (Yang et al. (2000) PNAS 97:12278) any of the above Reporter ACes methods could be used in conjunction with whole-body imaging to monitor reporter genes within whole animals without sacrificing the animal. This would allow temporal and spatial analysis of expression patterns under a given set of conditions. The conditions tested may include for example, normal differentiation of a stem cell, response to drug or compound treatment whether targeted to the diseased tissue or presented systemically, response to genotoxicants, and the like.
[0296] The following examples are included for illustrative purposes only and are not intended to limit the scope of the invention.
Example 1
[0297] pFK161
[0298] Cosmid pFK161 (SEQ ID NO: 118) was obtained from Dr. Gyula Hadlaczky and contains a 9 kb NotI insert derived from a murine rDNA repeat (see clone 161 described in PCT Application Publication No. WO97/40183 by Hadlaczky et al. for a description of this cosmid). This cosmid, referred to as clone 161, contains sequence corresponding to nucleotides 10,232-15,000 in SEQ ID NO. 26. It was produced by inserting fragments of the megachromosome (see U.S. Pat. No. 6,077,697 and International PCT application No. WO 97/40183; for example, H1D3, which was deposited at the European Collection of Animal Cell Culture (ECACC) under Accession No. 96040929, is a mouse-hamster hybrid cell line carrying this megachromosome) into plasmid pWE15 (Stratagene, La Jolla, Calif.; SEQ ID No. 31) as follows. Half of a 100 μl low melting point agarose block (mega-plug) containing isolated SATACs was digested with NotI overnight at 37° C. Plasmid pWE15 was similarly digested with NotI overnight. The mega-plug was then melted and mixed with the digested plasmid, ligation buffer and T4 DNA ligase. Ligation was conducted at 16° C. overnight. Bacterial DH5α cells were transformed with the ligation product and transformed cells were plated onto LB/Amp plates. Fifteen to twenty colonies were grown on each plate for a total of 189 colonies. Plasmid DNA was isolated from colonies that survived growth on LB/Amp medium and analyzed by Southern blot hybridization for the presence of DNA that hybridized to a pUC19 probe. This screening methodology assured that all clones, even clones lacking an insert but yet containing the pWE15 plasmid, would be detected.
[0299] Liquid cultures of all 189 transformants were used to generate cosmid minipreps for analysis of restriction sites within the insert DNA. Six of the original 189 cosmid clones contained an insert. These clones were designated as follows: 28 (˜9-kb insert), 30 (˜9-kb insert), 60 (˜4-kb insert), 113 (˜9-kb insert), 157 (˜9-kb insert) and 161 (˜9-kb insert). Restriction enzyme analysis indicated that three of the clones (113, 157 and 161) contained the same insert.
[0300] For sequence analysis the insert of cosmid clone no. 161 was subcloned as follows. To obtain the end fragments of the insert of clone no. 161, the clone was digested with NotI and BamHI and ligated with NotI/BamHI-digested pBluescript KS (Stratagene, La Jolla, Calif.). Two fragments of the insert of clone no. 161 were obtained: a 0.2-kb and a 0.7-kb insert fragment. To subclone the internal fragment of the insert of clone no. 161, the same digest was ligated with BamHI-digested pUC19. Three fragments of the insert of clone no. 161 were obtained: a 0.6-kb, a 1.8-kb and a 4.8-kb insert fragment.
[0301] The insert corresponds to an internal section of the mouse ribosomal RNA gene (rDNA) repeat unit between positions 7551-15670 as set forth in GENBANK accession no. X82564, which is provided as SEQ ID NO. 18. The sequence data obtained for the insert of clone no. 161 is set forth in SEQ ID NOS. 19-25. Specifically, the individual subclones corresponded to the following positions in GENBANK accession no. X82564 (SEQ ID NO:18) and in SEQ ID NOs. 19-25:
TABLE-US-00003
Subclone   Start in X82564   End in X82564   Site          SEQ ID No.
161k1      7579              7755            NotI, BamHI   19
161m5      7756              8494            BamHI         20
161m7      8495              10231           BamHI         21 (shows only sequence corresponding to nt. 8495-8950), 22 (shows only sequence corresponding to nt. 9851-10231)
161m12     10232             15000           BamHI         23 (shows only sequence corresponding to nt. 10232-10600), 24 (shows only sequence corresponding to nt. 14267-15000)
161k2      15001             15676           NotI, BamHI   25
[0302] The sequence set forth in SEQ ID NOs. 19-25 diverges in some positions from the sequence presented in positions 7551-15670 of GENBANK accession no. X82564. Such divergence may be attributable to random mutations between repeat units of rDNA.
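The subclone coordinates tabulated above can be checked for internal consistency with a short script. This is a minimal sketch: the coordinates are copied from the table, while the function and variable names are illustrative only. It computes each subclone's length in X82564 coordinates, the total span covered, and whether adjacent subclones abut without gaps or overlaps.

```python
# (subclone, start, end) in GENBANK X82564 coordinates, copied from the table.
SUBCLONES = [
    ("161k1", 7579, 7755),
    ("161m5", 7756, 8494),
    ("161m7", 8495, 10231),
    ("161m12", 10232, 15000),
    ("161k2", 15001, 15676),
]


def subclone_lengths(subclones):
    """Length in bp of each subclone, counting both end positions."""
    return {name: end - start + 1 for name, start, end in subclones}


def is_contiguous(subclones):
    """True if each subclone starts one base after the previous one ends."""
    return all(
        nxt[1] == cur[2] + 1 for cur, nxt in zip(subclones, subclones[1:])
    )


sizes = subclone_lengths(SUBCLONES)
total_bp = sum(sizes.values())

for name, bp in sizes.items():
    print(f"{name}: {bp} bp")
print(f"total: {total_bp} bp, contiguous: {is_contiguous(SUBCLONES)}")
```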
[0303] For use herein, the rDNA insert from the clone was prepared by digesting the cosmid with NotI and BglII and was purified as described above. Growth and maintenance of bacterial stocks and purification of plasmids were performed using standard well known methods (see, e.g., Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press), and plasmids were purified from bacterial cultures using Midi- and Maxi-preps Kits (Qiagen, Mississauga, Ontario).
pDsRed1-N1
[0304] This vector is available from Clontech (see SEQ ID No. 29) and encodes the red fluorescent protein (DsRed; Genbank accession no. AF272711; SEQ ID Nos. 39 and 40). DsRed, which has a vivid red fluorescence, was isolated from the IndoPacific sea anemone relative Discosoma species. The plasmid pDsRed1-N1 (Clontech; SEQ ID No. 29) constitutively expresses a human codon-optimized variant of the fluorescent protein under control of the CMV promoter. Unmodified, this vector expresses high levels of DsRed1 and includes sites for creating N-terminal fusions by cloning sequences encoding proteins of interest into the multiple cloning site (MCS). It confers kanamycin (Kan) and neomycin (Neo) resistance for selection in bacterial or eukaryotic cells.
Plasmid pMG
[0305] Plasmid pMG (InvivoGen, San Diego, Calif.; see SEQ. ID. NO. 27 for the nucleotide sequence of pMG) contains the hygromycin phosphotransferase gene under the control of the immediate-early human cytomegalovirus (hCMV) enhancer/promoter with intron A. Vector pMG also contains two transcriptional units allowing for the coexpression of two heterologous genes from a single vector sequence.
[0306] The first transcriptional unit of pMG contains a multiple cloning site for insertion of a gene of interest, the hygromycin phosphotransferase gene (hph) and the immediate-early human cytomegalovirus (hCMV) enhancer/promoter with intron A (see, e.g., Chapman et al. (1991) Nuc. Acids Res. 19:3979-3986) located upstream of hph and the multiple cloning site, which drives the expression of hph and any gene of interest inserted into the multiple cloning site as a polycistronic mRNA. The first transcriptional unit also contains a modified EMCV internal ribosomal entry site (IRES) upstream of the hph gene but downstream of the hCMV promoter and MCS for ribosomal entry in translation of the hph gene (see SEQ ID NO. 27, nucleotides 2736-3308). The IRES is modified by insertion of the constitutive E. coli promoter (EM7) within an intron (IM7) into the end of the IRES. In mammalian cells, the E. coli promoter is treated as an intron and is spliced out of the transcript. A polyadenylation signal from the bovine growth hormone (bGh) gene (see, e.g., Goodwin and Rottman (1992) J. Biol. Chem. 267:16330-16334) and a pause site derived from the 3' flanking region of the human α2 globin gene (see, e.g., Enriquez-Harris et al. (1991) EMBO J. 10:1833-1842) are located at the end of the first transcription unit. Efficient polyadenylation is facilitated by inserting the flanking sequence of the bGh gene 3' to the standard AAUAAA hexanucleotide sequence.
[0307] The second transcriptional unit of pMG contains another multiple cloning site for insertion of a gene of interest and an EF-1α/HTLV hybrid promoter located upstream of this multiple cloning site, which drives the expression of any gene of interest inserted into the multiple cloning site. The hybrid promoter is a modified human elongation factor-1 alpha (EF-1 alpha) gene promoter (see, e.g., Kim et al. (1990) Gene 91:217-223) that includes the R segment and part of the U5 sequence (R-U5') of the human T-cell leukemia virus (HTLV) type I long terminal repeat (see, e.g., Takebe et al. (1988) Mol. Cell. Biol. 8:466-472). The Simian Virus 40 (SV40) late polyadenylation signal (see Carswell and Alwine (1989) Mol. Cell. Biol. 9:4248-4258) is located downstream of the multiple cloning site. Vector pMG contains a synthetic polyadenylation site for the first and second transcriptional units at the end of the transcriptional unit based on the rabbit β-globin gene and containing the AATAAA hexanucleotide sequence and a GT/T-rich sequence with 22-23 nucleotides between them (see, e.g., Levitt et al. (1989) Genes Dev. 3:1019-1025). A pause site derived from the C2 complement gene (see, Moreira et al. (1995) EMBO J. 14:3809-3819) is also located at the 3' end of the second transcriptional unit.
[0308] Vector pMG also contains an ori sequence (ori pMB1) located between the SV40 polyadenylation signal and the synthetic polyadenylation site.
Example 2
[0309] A. Construction of Targeting Vector and Transfection into LMtk- Cells for the Generation of Platform Chromosomes
[0310] A targeting vector derived from the vector pWE15 (GenBank Accession # X65279) was modified by replacing the SalI (Klenow filled)/SmaI neomycin resistance containing fragment with the PvuII/BamHI (Klenow filled) puromycin resistance containing fragment (isolated from plasmid pPUR, Clontech Laboratories, Inc. Palo Alto, Calif.; SEQ ID No. 30) resulting in plasmid pWEPuro. Subsequently a 9 Kb NotI fragment from the plasmid pFK161 (SEQ ID NO: 118) containing a portion of the mouse rDNA region was cloned into the NotI site of pWEPuro resulting in plasmid pWEPuro9K (FIG. 2). The vector pWEPuro9K was linearized by digestion with SpeI and transfected into LMtk- mouse cells. Puromycin resistant colonies were isolated and subsequently tested for artificial chromosome formation via fluorescence in situ hybridization (FISH) (using mouse major and minor DNA repeat sequences, the puromycin gene and telomere sequences as probes), and fluorescence activated cell sorting (FACS). From this sort, a subclone was isolated containing an artificial chromosome, designated 5B11.12, which carries 4-8 copies of the puromycin resistance gene contained on the pWEPuro9K vector. FISH analysis of the 5B11.12 subclone demonstrated the presence of telomeres and mouse minor repeat sequences on the ACes. DOT PCR performed on the 5B11.12 ACes revealed the absence of uncharacterized euchromatic regions on the ACes. A recombination site, such as an att or loxP engineering site or a plurality thereof, was introduced onto this ACes thereby providing a platform for site-specific introduction of heterologous nucleic acid.
B. Targeting a Single Sequence Specific Recombination Site onto Platform Chromosomes
[0311] After the generation of the 5B11.12 platform, a single sequence-specific recombination site is placed onto the platform chromosome via homologous recombination. For this, DNA sequences containing the site-specific recombination sequence can be flanked with DNA sequences of homology to the platform chromosome. For example, using the platform chromosome made from the pWEPuro9K vector, mouse rDNA sequences or mouse major satellite DNA can be used as homologous sequences to target onto the platform chromosome. A vector is designed to have these homologous sequences flanking the site-specific recombination site and, after the appropriate restriction enzyme digest to generate free ends of homology to the platform chromosome, the DNA is transfected into cells harboring the platform chromosome (FIG. 3). Examples of site-specific cassettes that are targeted to the platform chromosome using either mouse rDNA or mouse major repeat DNA include the SV40-attP-hygro cassette and a red fluorescent protein (RFP) gene flanked by loxP sites (Cre/lox, see, e.g., U.S. Pat. No. 4,959,317 and description herein). After transfection and integration of the site-specific cassette, homologous recombination events onto the platform chromosome are subcloned and identified by FACS (e.g. screen and single cell subclone via expression of resistance or fluorescent marker) and PCR analysis.
[0312] For example, a vector can be constructed containing regions of the mouse rDNA locus flanking a gene cassette containing the SV40 early promoter-bacteriophage lambda attP site-hygromycin selectable marker (see FIG. 4 and described below). The use of the bacteriophage lambda attP site for lambda integrase-mediated site-specific recombination is described below. The homologous recombination event of the SV40-attP-hygro cassette onto the platform chromosome was identified using PCR primers that detect the homologous recombination and further confirmed by FISH analysis. After subcloned colonies containing the platform chromosome with a single site-specific recombination site are identified, cells carrying this platform chromosome can be engineered with site-specific recombinases (e.g. lambda INT, Cre) for integrating a target gene expression vector.
C. Targeting a Red Fluorescent Protein (RFP) Gene Flanked by loxP Sites onto 5B11.12 Platform
[0313] As another example, while loxP recombination sites could have been introduced onto the ACes during de novo biosynthesis, it was thought that this might result in multiple segments of the ACes containing a high number of loxP sites, potentially leading to instability upon Cre-mediated recombination. A gene targeting approach was therefore devised to introduce a more limited number of loxP recombination sites into a locus of the 5B11-12 ACes containing introduced and possibly co-amplified endogenous rDNA sequences. Although there are more than 200 copies of rDNA genes in the haploid mouse genome distributed amongst 5-11 chromosomes (depending on strain), rDNA sequences were chosen as the target on the ACes since they represent a less frequent target than that of the satellite repeat sequences. Moreover, having observed much stronger pWEPuro9K hybridization to the 5B11-12 ACes than to other LMtk- chromosomes and in light of the observation that the transcribed spacer sequences within the rDNA may be less conserved than the rRNA coding regions, it was contemplated that a targeting vector based on the rDNA gene segment in pWEPuro9K would have a higher probability of targeting to the ACes rather than to other LMtk- chromosomes. Accordingly, a targeting vector, pBSFKLoxDsRedLox, was designed and constructed based on the rDNA sequences contained in pWEPuro9K.
[0314] The plasmid pBSFKLoxDsRedLox was generated in 4 steps. First, the NotI rDNA insert of pWEPuro9K (FIG. 2) was inserted into pBS SK- (Stratagene) giving rise to pBSFK. Second, a loxP polylinker cassette was generated by PCR amplification of pNEB193 (SEQ ID NO:32; New England Biolabs) using primers complementary to the M13 forward and reverse priming sites at their 3' end and a 34 bp 5' extension comprising a LoxP site. This cassette was reinserted into pNEB193 generating p193LoxMCSLox. Third, the DsRed gene from pDsRed1-N1 (SEQ ID NO:29; Clontech) was then cloned into the polylinker between the loxP sites generating p193LoxDsRedLox. Fourth, a fragment consisting of the DsRed gene flanked by loxP sites was cloned into a unique NdeI within the rDNA insert of pBSFK generating pBSFKLoxDsRedLox.
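The 34 bp length of the loxP extension mentioned above follows from the structure of the site: two 13 bp inverted repeats flanking an 8 bp asymmetric spacer. The sketch below checks this using the commonly cited wild-type loxP sequence, which is supplied here from general knowledge as an assumption; the application does not list the exact primer extension used, and the names in the code are illustrative.

```python
# Commonly cited wild-type loxP site (assumed here; not quoted from this text).
LEFT_ARM = "ATAACTTCGTATA"   # 13 bp recombinase binding element
SPACER = "GCATACAT"          # 8 bp asymmetric core that sets orientation
RIGHT_ARM = "TATACGAAGTTAT"  # 13 bp inverted repeat of the left arm
LOXP = LEFT_ARM + SPACER + RIGHT_ARM

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


site_length = len(LOXP)
arms_are_inverted_repeats = reverse_complement(LEFT_ARM) == RIGHT_ARM
spacer_is_asymmetric = reverse_complement(SPACER) != SPACER

print(site_length, arms_are_inverted_repeats, spacer_is_asymmetric)
```

Because the spacer is not its own reverse complement, the site has a direction, which is why the relative orientation of the two loxP sites flanking the DsRed gene matters for the outcome of Cre-mediated recombination.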
[0315] A gel purified 11 Kb PmlI/EcoRV fragment of pBSFKLoxDsRedLox was used for transfection. To detect targeted integration, PCR primers were designed from rDNA sequences within the 5' NotI-PmlI fragment of pWEPuro9K that is not present on the targeting fragment (5' primer) and from sequence within the LoxDsRedLox cassette (3' primer). If the targeting DNA integrated correctly within the rDNA sequences, PCR amplification using these primers would give rise to a 2.3 Kb band. PCR reactions containing 1-4 of genomic DNA were carried out according to the MasterTaq protocol (Eppendorf), using murine rDNA 5' primer (5'-CGGACAATGCGGTTGTGCGT-3'; SEQ ID NO:72) and DsRed 3' primer (5'-GGCCCCGTAATGCAGAAGAA-3'; SEQ ID NO:73), and PCR products were analyzed by agarose gel electrophoresis.
[0316] 1.5×10.sup.6 5B11-12 LMTK.sup.- cells were transfected with 2 μg of the pBSFKLoxDsRedLox targeting DNA described above using Lipofectamine Plus (Invitrogen). For flow sorting, harvested cells were suspended in medium and applied to the Becton Dickinson Vantage SE cell sorter, equipped with 488 nm lasers for excitation and a 585/42 bandpass filter for optimum detection of RFP fluorescence. Cells were sorted using dPBS as sheath buffer. Negative control parental 5B11-12 cells and a positive control LMTK.sup.- cell line stably transfected with DsRed were used to establish the selection gates. The RFP positive gated populations were recovered, diluted in medium supplemented with 1× penicillin-streptomycin (Invitrogen), then plated and cultured as previously described. After 4 rounds of enrichment, the percentage of RFP positive cells reached levels of 50% or higher. DNA from populations was analyzed by PCR for evidence of targeted integration. Ultimately, single cell subclones were established from positive pools and analyzed by PCR, and PCR-positive clones were confirmed by FISH as described below. DNA was purified from pools or single cell clones using previously described methods set forth in Lahm et al., Transgenic Res., 1998; 7:131-134, or in some cases using a Wizard Genomic DNA purification kit (Promega). For FISH analysis, a biotinylated DsRed gene probe was generated by PCR using DsRed specific primers and biotin-labeled dUTP (5' RFP primer: 5'-GGTTTAAAGTGCGCTCCTCCAAGAACGTCATC-3', SEQ ID NO:74; and 3' RFP primer: 5'-AGATCTAGAGCCGCCGCTACAGGAACAGGTGGTGGCGGCC-3'; SEQ ID NO:75). To maximize the signal intensity of the DsRed probe, Tyramide amplification was carried out according to the manufacturer's protocols (NEN).
[0317] The process of testing the feasibility of a more general targeting strategy that would not rely on enrichment via drug selection of stably transfected clones can be summarized as follows. A red fluorescent protein gene (RFP; encoded by the DsRed gene) was inserted between the loxP sites of the targeting vector to form pBSFKLoxDsRedLox. After transfection with pBSFKLoxDsRedLox, sequential rounds of high speed flow sorting and expansion of sorted cells in culture could then be used to enrich for stable transformants expressing RFP. In the event of targeted integration, PCR screening with primers that amplify from a spacer region within the segment of the 45S pre-rRNA gene in pWEPuro9K to a specific anchor sequence within the DsRed gene in the targeting cassette would give rise to a diagnostic 2.3 Kb band. As rDNA clusters are found on several chromosomes, confirmation of targeting to an ACes would require fluorescence in situ hybridization (FISH) analysis. Finally, the flanking of the DsRed gene by loxP sites would allow for its removal and subsequent replacement with other genes of interest.
[0318] After transfection of the targeting sequence into 5B11-12 cells, enrichment for targeted clones was carried out using a combination of flow cytometry to detect red-fluorescing cells and PCR screening. Ultimately 17 single cell subclones were identified as potential targeted clones by PCR and of these 16 were found by FISH to contain the DsRed integration event into the ACes. These subclones are referred to herein as D11-C4, D11-C12, D11-H3, C9-C9, C9-B9, C9-F4, C9-H8, C9-F2, C9-G8, C9-B6, C9-G3, C9-E12, C9-A11, C11-E3, C11-A9 and C11-H4. PCR analysis of genomic DNA isolated from the D11-C4 subclone gave rise to a 2.3 Kb band, indicative of a targeted integration into an rDNA locus. Further analysis of the subclone by FISH with a DsRed gene probe demonstrated integration of the LoxDsRedLox targeting cassette on the ACes co-localizing with one of the regions of rDNA staining seen on the 5B11-12 ACes, consistent with a targeted integration into an rDNA locus of the ACes, while integrations on other chromosomes were not observed. Since transfected cells were maintained as heterogeneous populations through several cycles of sorting and replating, it was not possible to estimate the frequency of targeted events. In most mammalian cell lines the frequency of gene targeting via homologous recombination is roughly 10.sup.-5 to 10.sup.-7 per treated cell. Despite the low frequency of these events in mammalian cells, it is clear that an RFP expression based screening paradigm, coupled with PCR analysis, can effectively detect and enrich for such infrequent events in a large population. In instances where drug selection is not possible or not desirable, such a system may provide a useful alternative. It was also verified that the modified ACes in subclone D11-C4 could be purified by flow cytometry. The results indicate that the flow karyogram of the D11-C4 subclone was unaltered from that of the 5B11-12 cell line.
Thus, the D11-C4 ACes can be purified in high yield from native chromosomes of the host cell line.
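To illustrate why enrichment by sorting is needed at such frequencies, the following minimal sketch multiplies the number of transfected cells stated above (1.5 million) by the general targeting frequency range quoted above. It assumes, purely for illustration, that every transfected cell counts as a "treated" cell and that the general mammalian frequency range applies to this experiment; the function name is hypothetical and no measured frequency for this system is implied.

```python
def expected_targeted_cells(cells_transfected: float, frequency: float) -> float:
    """Expected number of targeted cells = cells treated x per-cell frequency."""
    return cells_transfected * frequency


# Number of cells transfected, as stated in the text.
CELLS_TRANSFECTED = 1.5e6

# Assumed frequencies spanning the quoted general range (illustrative only).
for frequency in (1e-5, 1e-6, 1e-7):
    expected = expected_targeted_cells(CELLS_TRANSFECTED, frequency)
    print(f"frequency {frequency:.0e}: about {expected:.2f} targeted cells expected")
```

Under these assumptions a single transfection would be expected to contain on the order of fifteen targeted cells at best and fewer than one at worst, which is consistent with the need for repeated rounds of sorting and expansion before PCR screening.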
D. Reduction of LoxP on ACes to a Single Site.
[0319] The strong hybridization signal detected by FISH on the ACes using the DsRed gene probe suggests that several copies of the targeting cassette may be present on the ACes in the D11-C4 line. This also suggests that multiple rDNA genes have been correctly targeted.
[0320] Accordingly, in certain embodiments where necessary, the number of loxP sites on the ACes can be reduced to a single site by in situ treatment with Cre recombinase, provided that the sites are co-linear. Such a process is described for multiple loxP-flanked integrations on a native mouse chromosome (Garrick et al., Nature Genet., 1998, January; 18(1):56-59). Reduction to a single loxP site on the D11-C4 ACes would result in the loss of the DsRed gene, forming the basis of a useful screen for this event.
[0321] For this purpose, a Cre expression plasmid pCX-Cre\GFP III has been generated by first deleting the EcoRI fragment of pCX-eGFP (SEQ ID NO:71) containing the eGFP coding sequence and replacing it with a PCR amplified Cre recombinase coding sequence (SEQ ID NO:58), generating pCX-Cre. Next, the AseI/SspI fragment of pD2eGFP-N1 (containing the CMV promoter driving the D2EGFP gene with SV40 polyA signal; Clontech; SEQ ID NO:87) was inserted into the filled HindIII site of pCX-Cre, generating pCX-Cre\GFP III. Control plasmid pCX-CreRev\GFP III was generated in similar fashion except that the Cre recombinase coding sequence was inserted in the antisense orientation. LMTK.sup.- cell line D11-C4 (containing first generation platform ACes with multiple loxP-DsRED sites) and 5B11-12 cell line (containing ACes with no loxP-DsRED sites) are maintained in culture as described above. D11-C4 cells are transfected with 2 μg of plasmid pCX-Cre\GFP III or 2 μg pCX-CreRev\GFP III using Lipofectamine (Invitrogen) as previously described.
[0322] Forty-eight to seventy-two hours after transfection, transfected D11-C4 cells are harvested and GFP positive cells are sorted by flow cytometry using a FACS Vantage cell sorter (Becton Dickinson) as follows: All D11-C4 cells transfected with pCX-Cre\GFP III or control plasmid pCX-CreRev\GFP III that exhibit GFP fluorescence higher than the gate level established by untransfected cells are collected and placed in culture for a further 7-14 days. After 7-14 days the initial D11-C4 cells are harvested and analyzed by flow cytometry as follows: Untransfected D11-C4 cells are used to establish the gate that defines the RFP positive population, while 5B11-12 cells are used to set the RFP negative gate. The GFP positive population of D11-C4 transfected with pCX-Cre\GFP III should show decreased red fluorescence compared to pCX-CreRev\GFP III transfected or untransfected control D11-C4 cells. The cells exhibiting greatly decreased or no RFP expression are collected and single cell clones subsequently established. These clones are then expanded and analyzed by fluorescence in situ hybridization and Southern blotting to confirm the removal of loxP-DsRed gene copies.
Example 3
[0323] Construction of Targeting Vector and Transfection into LMtk- Cells for the Generation of Platform Chromosomes Containing Multiple Site-Specific Recombination Sites
[0324] An example of a selectable marker system for the creation of a chromosome-based platform is shown in FIG. 4. This system includes a vector containing the SV40 early promoter immediately followed by (1) a 282 base pair (bp) sequence containing the bacteriophage lambda attP site and (2) the puromycin resistance marker. Initially a PvuII/StuI fragment containing the SV40 early promoter from plasmid pPUR (Clontech Laboratories, Inc., Palo Alto, Calif.; Seq ID No. 30) was subcloned into the EcoRI/CRI site of pNEB 193 (a PUC 19 derivative obtained from New England Biolabs, Beverly, Mass.; SEQ ID No. 32) generating the plasmid pSV40193. The only differences between pUC19 and pNEB 193 are in the polylinker region. A unique AscI site (GGCGCGCC) is located between the BamHI site and the SmaI site, a unique PacI site (TTAATTAA) is located between the BamHI site and the XbaI site and a unique PmeI site (GTTTAAAC) is located between the PstI site and the SalI site.
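The three 8 bp recognition sequences quoted above for AscI, PacI and PmeI are each palindromic, meaning that each reads the same as its own reverse complement. The short sketch below checks this property directly; the helper names are illustrative and not part of any described method.

```python
# Watson-Crick complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True if the site equals its own reverse complement."""
    return site.upper() == reverse_complement(site)


# Recognition sequences exactly as quoted in the text.
SITES = {"AscI": "GGCGCGCC", "PacI": "TTAATTAA", "PmeI": "GTTTAAAC"}

for enzyme, site in SITES.items():
    print(enzyme, site, f"{len(site)} bp", "palindromic:", is_palindromic(site))
```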
[0325] The attP site was PCR amplified from lambda genome (GenBank Accession #NC 001416) using the following primers:
TABLE-US-00004 attPUP: CCTTGCGCTAATGCTCTGTTACAGG SEQ ID No. 1 attPDWN: CAGAGGCAGGGAGTGGGACAAAATTG SEQ ID No. 2
[0326] After amplification and purification of the resulting fragment, the attP site was cloned into the SmaI site of pSV40193 and the orientation of the attP site was determined by DNA sequence analysis (plasmid pSV40193attP). The gene encoding puromycin resistance (Puro) was isolated by digesting the plasmid pPUR (Clontech Laboratories, Inc. Palo Alto, Calif.) with AgeI/BamHI, followed by filling in the overhangs with Klenow, and was subsequently cloned into the AscI site downstream of the attP site of pSV40193attP, generating the plasmid pSV40193attPsensePUR (FIG. 4; SEQ ID NO:113).
[0327] The plasmid pSV40193attPsensePUR was digested with ScaI and co-transfected with the plasmid pFK161 (SEQ ID NO: 118) into mouse LMtk- cells, and platform artificial chromosomes were identified and isolated as described above. The process for generating this exemplary platform ACes containing multiple site-specific recombination sites is summarized in FIG. 5. One platform ACes resulting from this experiment is designated B19-18. This platform ACes may subsequently be engineered to contain target gene expression nucleic acids using the lambda integrase mediated site-specific recombination system as described herein in Examples 7 and 8.
Example 4
[0328] Lambda Integrase Mediated Site-Specific Recombination of a RFP Expressing Vector onto Artificial Chromosomes
[0329] In this example, a vector expressing the red fluorescent protein (RFP) was produced and recombined into the attP site residing on an artificial chromosome within LMtk- Cells. This recombination is depicted in FIG. 7.
[0330] A. Construction of Expression Vectors Containing Wildtype and Mutant Lambda Integrase
[0331] Mutations at the glutamic acid at position 174 in the lambda integrase protein relax the requirement for the accessory protein IHF during recombination and for DNA supercoiling in vitro (see, Miller et al. (1980) Cell 20:721-729; Lange-Gustafson et al. (1984) J. Biol. Chem. 259:12724-12732). Mutations at this site promote attP, attB intramolecular recombination in mammalian cells (Lorbach et al. (2000) J. Mol. Biol. 296:1175-1181).
[0332] To construct nucleic acid encoding the mutant, lambda integrase was PCR amplified from bacteriophage lambda DNA (cI857 ind Sam 7; New England Biolabs) using the following primers:
TABLE-US-00005 Lamint1 (SEQ ID No. 3) (TTCGAATTCATGGGAAGAAGGCGAAGTCATGAGCG) Lamint2 (SEQ ID No. 4) (TTCGAATTCTTATTTGATTTCAATTTTGTCCCAC).
The resulting PCR product was digested with EcoR I and cloned into the EcoR I site of pUC19. Lambda integrase was mutated at amino acid position 174 using QuikChange Site-Directed Mutagenesis Kit (Stratagene) and the following oligos (generating a glutamic acid to arginine change at position 174):
TABLE-US-00006 LambdaINTE174R (SEQ ID No. 6) (CGCGCAGCAAAATCTAGAGTAAGGAGATCAAGACTTACGGCTGACG), LamintR174rev (SEQ ID No. 7) (CGTCAGCCGTAAGTCTTGATCTCCTTACTCTAGATTTTGCTGCGCG).
The resulting site directed mutant was confirmed by sequence analysis. The wildtype and mutant lambda genes were cloned into the EcoR I site of pCX creating pCX-LamInt (SEQ ID NO: 127) and pCXLamIntR (FIG. 8; SEQ ID NO: 112).
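Mutagenic oligonucleotide pairs of this kind are typically designed as exact reverse complements of one another so that they anneal to the same position on opposite strands of the template. The sketch below checks that property for the two oligos listed above, using the sequences exactly as printed; the helper and variable names are illustrative only.

```python
# Watson-Crick complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Oligo sequences as printed in the text (SEQ ID Nos. 6 and 7).
FORWARD = "CGCGCAGCAAAATCTAGAGTAAGGAGATCAAGACTTACGGCTGACG"  # LambdaINTE174R
REVERSE = "CGTCAGCCGTAAGTCTTGATCTCCTTACTCTAGATTTTGCTGCGCG"  # LamintR174rev

print("lengths:", len(FORWARD), len(REVERSE))
print("exact reverse complements:", reverse_complement(FORWARD) == REVERSE)
```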
[0333] The plasmid pCX (SEQ ID No. 70) was derived from plasmid pCXeGFP (SEQ ID No. 71). Excision of the EcoRI fragment containing the eGFP marker generated pCX. To generate plasmid pCXLamINTR (SEQ ID NO: 112) an EcoRI fragment containing the lambda integrase E174R (SEQ ID No. 37) mutation was cloned into the EcoRI site of pCX, and to generate plasmid pCX-LamINT, an EcoRI fragment containing the wild-type lambda integrase was cloned into the EcoRI site of pCX.
[0334] B. Construction of Integration Vector Containing attB and DsRED
[0335] The plasmid pDsRedN1 (Clontech Laboratories, Palo Alto, Calif.; SEQ ID No. 29) was digested with Hpa I and ligated to the following annealed oligos:
TABLE-US-00007 attB1 (SEQ ID No. 8) (TGAAGCCTGCTTTTTTATACTAACTTGAGCGAA) attB2 (SEQ ID No. 9) (TTCGCTCAAGTTAGTATAAAAAAGCAGGCTTCA)
The resulting vector (pDsRedN1-attB) was confirmed by PCR and sequence analysis.
[0336] C. Transfection into LMtk- Cells
[0337] LM(tk-) cells containing the Prototype A ACes (L1-18; Chromos Molecular Systems Inc., Burnaby, BC Canada) were co-transfected with pDsRedN1 or pDsRedN1-attB and either pCXLamInt (SEQ ID NO: 127) or pCXLamIntR (SEQ ID NO: 112) using Lipofectamine Plus Reagent (LifeTechnologies, Gaithersburg, Md.). The transfected cells were grown in DMEM (LifeTechnologies, Gaithersburg, Md.) with 10% FBS (CanSera) and G418 (CalBiochem) at a concentration of 1 mg/ml.
[0338] D. Enrichment by Cell Sorting
[0339] The transfected cells were sorted using a FACS Vantage SE cell sorter (Becton Dickinson) to enrich for cells expressing DsRed. The cells were excited with a 488 nm Argon laser at 200 mW and cells fluorescing in the 585/42 detection channel were collected. The sorted cells were returned to growth medium for recovery and expansion. After three successive enrichments for cells expressing DsRed, single cell sorting into 96 well plates was performed using the same parameters. Duplicate plates of the single cell clones were made for PCR analysis.
[0340] E. PCR Analysis of Single Cell Clones
[0341] Pools of cells from each row and column of the 96 well plate were used for DNA isolation. DNA was prepared using a Wizard Genomic DNA purification kit (Promega Inc, Madison, Wis.). Nested PCR analysis on the DNA pools was performed to confirm the site-specific recombination event using the following primer sets:
TABLE-US-00008 attPdwn2 (SEQ ID No. 10) (TCTTCTCGGGCATAAGTCGGACACC) CMVen (SEQ ID No. 11) (CTCACGGGGATTTCCAAGTCTCCAC) followed by: attPdwn (SEQ ID No. 12) (CAGAGGCAGGGAGTGGGACAAAATTG) CMVen2 (SEQ ID No. 13) (CAACTCCGCCCCATTGACGCAAATG).
The resulting PCR reactions were analyzed by gel electrophoresis and the potential individual clones containing the site-specific recombination event were identified by combining the PCR results of all of the pooled rows and columns for each 96 well plate. The individual clones were then further analyzed by PCR using the following primers that flank the recombination junctions. L1for and F1rev flank the attR junction whereas REDfor and L2rev flank the attL junction (see FIG. 7):
TABLE-US-00009 L1for (SEQ ID No. 14) AGTATCGCCGAACGATTAGCTCTTCA F1rev (SEQ ID No. 15) GCCGATTTCGGCCTATTGGTTAAA REDfor (SEQ ID No. 16) CCGCCGACATCCCCGACTACAAGAA L2rev (SEQ ID No. 17) TTCCTTCGAAGGGGATCCGCCTACC.
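The row and column pooling described above can be sketched as a simple intersection: a well is a candidate only if both its row pool and its column pool are PCR positive. The sketch below assumes the conventional 96 well layout (rows A-H, columns 1-12), which the text does not state explicitly, and the function name and example results are hypothetical.

```python
from itertools import product

ROWS = "ABCDEFGH"        # assumed conventional row labels
COLUMNS = range(1, 13)   # assumed conventional column numbers


def candidate_wells(positive_rows, positive_columns):
    """Wells whose row pool AND column pool were both PCR positive."""
    rows = sorted(r for r in positive_rows if r in ROWS)
    cols = sorted(c for c in positive_columns if c in COLUMNS)
    return [f"{row}{col}" for row, col in product(rows, cols)]


# Hypothetical result: one positive row pool and one positive column pool
# point to exactly one well.
print(candidate_wells({"C"}, {4}))

# Hypothetical result: two positive rows and two positive columns give four
# candidates, some of which may be false, so each must be tested individually.
print(candidate_wells({"B", "F"}, {3, 9}))
```

Pooling in this way needs 20 reactions per plate (8 rows plus 12 columns) rather than 96, at the cost of ambiguity when several pools are positive, which is why the candidate clones are subsequently analyzed individually with junction-flanking primers.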
[0342] F. Sequence Analysis of Recombination Junctions
[0343] PCR products spanning the recombination junction were Topo-cloned into pcDNA3.1D/V5-His (Invitrogen Inc., San Diego, Calif.) and then analyzed by cycle sequencing, which confirmed that the clones have the correct attR and attL junctions.
[0344] G. Fluorescent In Situ Hybridization (FISH)
[0345] The cell lines containing the correct recombination junction sequence were further analyzed by fluorescent in situ hybridization (FISH) by probing with the DsRed coding region labeled with biotin and visualizing with the Tyramide Signal Amplification system (TSA; NEN Life Science Products). The results indicate that the RFP sequence is present on the ACes.
[0346] H. Southern Analysis
[0347] Genomic DNA was harvested from the cell lines containing an ACes with the correct recombinant event and digested with EcoR I. The digested DNAs were separated on a 0.7% agarose gel, transferred and fixed to a nylon membrane and probed with RFP coding sequences. The result showed that there is an integrated copy of RFP coding sequence in each clone.
Example 5
[0348] Delivery of a Second Gene Encoding GFP onto the RFP Platform ACes
[0349] A. Construction of Integration Vector Containing attB and GFP (pD2eGFPIresPuroattB).
[0350] The plasmid pIRESpuro2 (Clontech, Palo Alto, Calif.; SEQ ID NO: 88) was digested with EcoRI and NotI then ligated to the D2eGFP EcoRI-NotI fragment from pD2eGFP-N1 (Clontech, Palo Alto, Calif.) to create pD2eGFPIresPuro2. Subsequently, oligos encoding the attB site were annealed and ligated into the NruI site of pD2eGFPIresPuro2 to create pD2eGFPIresPuroattB. The orientation of attB in the NruI site was determined by PCR.
[0351] B. Transfection of LMtk- Cells
[0352] The LMtk- Cells containing the RFP platform ACes produced in Example 4, which has multiple attP sites, were co-transfected with pCXLamIntR and pD2eGFPIresPuroattB using LipofectAMINE PLUS reagent. Five μg of each vector was placed into a tube containing 750 μl of DMEM (Dulbecco's modified Eagles Medium). Twenty μl of the Plus reagent was added to the DNA and incubated at room temperature for 15 minutes. A mixture of 30 μl of lipofectamine and 750 μl DMEM was added to the DNA mixture and incubated an additional 15 minutes at room temperature. The DNA mixture was then added dropwise to approximately 3 million cells attached to a 10 cm dish in 5 mls of DMEM. The cells were incubated 4 hours (37° C., 5% CO2) with the DNA-lipid mixture, after which DMEM with 20% fetal bovine serum was added to the dishes to bring the culture medium to 10% fetal bovine serum. The dishes were incubated at 37° C. with 5% CO2.
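The serum adjustment in this protocol is a simple dilution calculation: adding medium containing 20% serum to serum-free medium reaches 10% serum when the added volume equals the volume already in the dish. The sketch below works this through, assuming approximately 6.5 ml of serum-free medium in the dish (5 ml plus two 750 μl portions, ignoring the small reagent volumes); that total and the function name are assumptions for illustration, since the text does not state the volume added.

```python
def volume_to_add(current_volume_ml: float, stock_percent: float,
                  target_percent: float, current_percent: float = 0.0) -> float:
    """Volume of stock medium needed to bring the dish to the target serum level.

    Solves (current*Vc + stock*Va) / (Vc + Va) = target for Va.
    """
    if not current_percent <= target_percent < stock_percent:
        raise ValueError("target must lie between current and stock concentrations")
    return current_volume_ml * (target_percent - current_percent) / (
        stock_percent - target_percent)


# Assumed dish volume: 5 ml + 0.75 ml + 0.75 ml of serum-free DMEM.
dish_volume_ml = 5.0 + 0.75 + 0.75
added_ml = volume_to_add(dish_volume_ml, stock_percent=20.0, target_percent=10.0)
print(f"add about {added_ml:.2f} ml of DMEM with 20% serum")
```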
[0353] Plasmid pD2eGFPIresPuroattB has a puromycin resistance gene transcriptionally linked to the GFP gene via an IRES element. Two days after the transfection the cells were placed in medium containing puromycin at 4 μg/ml to select for cells containing the pD2eGFPIresPuroattB plasmid integrated into the genome. Twenty-three clones were isolated after 17 days of selection with puromycin. These clones were expanded and then analyzed for the presence of the GFP gene on the ACes by 2-color (RFP/biotin & GFP/digoxigenin) TSA-FISH (NEN) according to the manufacturer's protocol. Sixteen of the 23 clones produced a positive FISH signal on the ACes with a GFP probe.
Example 6
[0354] Delivery of ACes into Human Mesenchymal Stem Cells (hMSC)
[0355] A. Transfection
[0356] Transfection conditions for the most efficient delivery of the ACes into hMSCs (Cambrex BioWhittaker Product Code PT-2501, lot# F0658, East Rutherford, N.J.) were assayed using LipofectAMINE PLUS and Superfect. One million prototype B ACes, which is a murine derived 60 Mb ACes having primarily murine pericentric heterochromatin, and carrying a "payload" containing a hygromycin B selectable marker gene and a lacZ reporter gene (see, Telenius et al., 1999, Chrom. Res., 7:3-7; and Kereso et al., 1996, Chrom. Res., 4:226-239; each of which is incorporated herein by reference in its entirety), were combined with 1-12 μl of the transfection agent. In the case of LipofectAMINE PLUS, the PLUS reagent was combined with the ACes for 15 minutes followed by LipofectAMINE for a further 15 minutes. Superfect was complexed for 10 minutes at a ratio of 2 μl Superfect per 1 million ACes. The ACes/transfection agent complex was then applied to 0.5 million recipient cells and the transfection was allowed to proceed according to the manufacturer's protocol. Percent transfected cells was determined on a FACS Vantage flow cytometer with argon laser tuned to 488 nm at 200 mW and FITC fluorescence collected through a standard FITC 530/30 nm band pass filter. After 24 hours, IdUrd labeled ACes were delivered to human MSCs in the range of 30-50%, varying with transfection agent and dose. ACes delivery curves were generated from data collected in experiments that varied the dose of the transfection reagents. Dose response curves of Superfect and LipofectAMINE PLUS, showing delivery of ACes into recipient hMSCs cells, were prepared, measured by transfer of IdUrd labeled ACes and detected by flow cytometry. Superfect shows maximum delivery in the range of 30-50% at doses greater than 2 μl per million ACes. LipofectAMINE PLUS has a 42-48% delivery peak around 5-8 μl per million ACes. 
These dose curves were then correlated with toxicity data to determine the transfection conditions that will allow for highest potential transfection efficiency. Toxicity was determined by a modified plating efficiency assay (de Jong et al., 2001, Chrom. Research, 9:475-485). The population's normalized plating efficiency (at maximum % delivery doses) was in the range of 0.2-0.4 for Superfect and 0.5-0.6 with LipofectAMINE PLUS.
[0357] Due to the transfected population consisting of mixed cell types, flow cytometry allowed for the assessment of ACes delivery into each sub-population and the purification of the target population. Flow profiles showing forward scatter (cell size) and side scatter (internal cell granularity) revealed three distinct hMSC populations that were gated into three regions: R3 (small cell region), R4 (medium cell region), R5 (large cell region). Transfection conditions were further optimized by re-analyzing delivery curves and assessing the differences in delivery to each sub-population. Dose response curves of Superfect and LipofectAMINE were prepared showing % delivery to each sub-population represented by the gating on basis of cell size and granularity properties of the mixed population. Three distinct hMSC populations were gated and % delivery dose curves generated. Using Superfect and LipofectAMINE PLUS the overall % delivery increased with cell size (80-90% delivery in large cells). LipofectAMINE PLUS at high doses (8-12 μl per 1 million ACes) shows an increase in the overall proportion of chromosome transfer to the small population (10-20%). This suggests an advantage to using this transfection agent if the small-undifferentiated cell population is the desired target host cell.
[0358] B. Expression from Genes on ACes in hMSCs
Following the delivery screening process conducted in section (A) above, the most promising conditions were subjected to further analyses to monitor expression and verify the presence of structurally intact ACes. The transfection conditions employed for these experiments were exactly the same as those that had been used during the screening process. Short-term expression was monitored by transfecting hMSCs with ACes containing an RFP (red fluorescent protein) gene, set forth in Example 2C as "D11-C4". The unselected population was harvested at 72-96 hours post transfection and the percentage of fluorescence-positive cells was measured by flow cytometry. RFP expression was in the range of 1-20%.
[0359] Long-term gene expression was assayed by selecting for hygromycin B resistant cells over a period of 7-10 days. Cytogenetic analysis was done to detect the presence of intact ACes by fluorescent in situ hybridization (FISH), where metaphase chromosomes were hybridized to a mouse major satellite-DNA probe (targeting murine pericentric heterochromatin) and a lambda probe (hybridizing to the lacZ gene). The transfected human mesenchymal culture could not undergo standard sub-cloning, as diffuse colonies form with limited doublings available for expansion. Cytogenetic analysis was therefore performed on the entire population, sampling over a period of 3-10 days post-transfection. The hygromycin resistant population was then blocked in mitosis with colchicine and analyzed for the presence of intact ACes by FISH. Preliminary FISH results show that approximately 2-8% of the hMSC-transfected population had an intact ACes. This compares to rat skeletal muscle myoblast clones, which were in the range of 60-95%. To increase the percentage of intact ACes in the hMSC-transfected population, an enrichment step can be utilized as described in Example 2C.
[0360] C. Differentiation of the hMSCs
[0361] In initial experiments where transfected hMSCs have been induced to differentiate into adipocytes or osteocytes, the results indicate that the transfected cells appear to be differentiating at a rate comparable to the untransfected controls and the cultures are lineage specific as tested by microscopic examination, FISH, Oil Red O staining (adipocyte assay), and calcium secretion (osteocyte assay).
[0362] Accordingly, these results indicate that the artificial chromosomes (ACes) provided herein can be successfully transferred into hMSC target cells. Targeting MSCs (such as hMSCs) permits gene transfer into cells in an undifferentiated state where the cells are easier to expand and purify. The genetically modified cells can then be differentiated in vitro or injected into a site in vivo where the microenvironment will induce transformation into specific cell lineages.
Example 7
Delivery of a Promoterless Marker Gene to a Platform ACes
[0363] Platform ACes containing pSV40attPsensePURO (FIG. 4) were constructed as set forth in Examples 3 and 4.
A. Construction of Targeting Vectors.
[0364] The base vector p18attBZeo (3166 bp; SEQ ID NO: 114) was constructed by ligating the 1067 bp HindIII-SspI fragment containing attBZeo, obtained from pLITattBZeo (SEQ ID NO:91), into pUC18 (SEQ ID NO: 122) digested with HindIII and SspI.
[0365] 1. p18attBZEO-eGFP (6119 bp; SEQ ID NO: 126) was constructed by inserting the 2977 bp SpeI-HindIII fragment from pCXeGFP (SEQ ID NO:71; Okabe, et al. (1997) FEBS Lett 407:313-319) containing the eGFP gene into p18attBZeo (SEQ ID NO: 114) digested with HindIII and XbaI.
[0366] 2. p18attBZEO-5'6XHS4eGFP (FIG. 10; 7631 bp; SEQ ID NO: 116) was constructed by ligating the 4465 bp HindIII fragment from pCXeGFPattB(6XHS4)2 (SEQ ID NO: 123) which contains the eGFP gene, under the regulation of the chicken beta actin promoter, 6 copies of the HS4 core element located 5' of the chicken beta actin promoter and the polyadenylation signal into the HindIII site of p18attBZeo (SEQ ID NO: 114).
[0367] 3. p18attBZEO-3'6XHS4eGFP (FIG. 11; 7600 bp; SEQ ID NO: 115) was created by removing the 5'6XHS4 element from p18attBZeo-(6XHS4)2eGFP (SEQ ID NO: 110). p18attBZeo-(6XHS4)2eGFP was digested with EcoRV and SpeI, treated with Klenow and religated to form p18attBZeo3'6XHS4eGFP (SEQ ID NO: 115).
[0368] 4. p18attBZEO-(6XHS4)2eGFP (FIG. 12; 9080bp; SEQ ID NO: 110) was created in two steps. First, the EcoRI-SpeI fragment from pCXeGFPattB(6XHS4)2 (SEQ ID NO: 123) which contains 6 copies of the HS4 core element was ligated into p18attBZeo (SEQ ID NO: 114) digested with EcoRI and XbaI to create p18attBZeo6XHS4 (4615 bp; SEQ ID NO: 117). Next, p18attBZeo6XHS4 was digested with HindIII and ligated to the 4465 bp HindIII fragment from pCXeGFPattB(6XHS4)2 which contains the eGFP gene, under the regulation of the chicken beta actin promoter, 6 copies of the HS4 core element located 5' of the chicken beta actin promoter and the polyadenylation signal.
TABLE-US-00010 TABLE 2
Targeting plasmid          No. zeocin         No. clones with expected   No. clones with correct sequence
                           resistant clones   PCR product size           at recombination junction
p18attBZEO-eGFP            12                 12                         NT*
p18attBZEO-5'6XHS4eGFP     11                 11                         NT
p18attBZEO-3'6XHS4eGFP     11                 11                         NT
p18attBZEO-(6XHS4)2eGFP    9                  9                          4/4
*NT = not tested
B. Transfection and Selection with Drug.
[0369] The mouse cell line containing the 2nd generation platform ACE, B19-38 (constructed as set forth in Example 3), was plated onto four 10 cm dishes at approximately 5 million cells per dish. The cells were incubated overnight in DMEM with 10% fetal calf serum at 37° C. and 5% CO2. The following day each dish was transfected with 5 μg of one of the 4 vectors listed in Example 7.A. above and 5 μg of pCXLamIntR (SEQ ID NO: 112), for a total of 10 μg of DNA per 10 cm dish. Lipofectamine Plus reagent was used to transfect the cells according to the manufacturer's protocol. Two days post-transfection zeocin was added to the medium at 500 μg/ml. The cells were maintained in selective medium until colonies formed. The colonies were then ring-cloned (see, e.g., McFarland, 2000, Methods Cell Sci, March; 22(1):63-66).
C. Analysis of Clones (PCR, Sequencing).
[0370] Genomic DNA was isolated from each of the candidate clones with the Wizard kit (Promega) following the manufacturer's protocol. The following primer set was used to analyze the genomic DNA isolated from the zeocin resistant clones: 5'PacSV40-CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG (SEQ ID NO:76); Antisense Zeo-TGAACAGGGTCACGTCGTCC (SEQ ID NO:77). PCR amplification with the above primers and genomic DNA from the site-specific integration of any of the 4 zeocin vectors would result in a 673 bp PCR product.
[0371] As set forth in Table 2, of the 4 zeocin resistant candidate clones thus far analyzed by PCR, all 4 exhibit the correct sequence for a site-specific integration event.
Example 8
Integration of a PCR Product by Site-Specific Recombination.
[0372] In this example a gene is integrated onto the platform ACes by site-specific recombination without cloning said gene into a vector.
A. PCR Primer Design.
[0373] PCR primers are designed to contain an attB site at the 5' end of one of the primers in the primer set. The remaining primers, which could be one or more than one primer, do not contain an attB site, but are complementary to sequences flanking the gene or genes of interest and any associated regulatory sequences. In a first example, 2 primers (one containing an attB site) are used to amplify a selectable marker gene such as the puromycin resistance gene.
[0374] In a second example as shown in FIG. 13, the primer set includes primers 1 & 2 that amplify the GFP gene without amplification of an upstream promoter. Primer 1 contains the attB site at the 5' end of the oligo. Primers 3 & 4 are designed to amplify the IRES-blasticidin DNA sequences from the vector pIRESblasticidin. The 5' end of primer 3 contains sequences complementary to the 5' end of primer 2 such that annealing can occur between 5' ends of the two primers.
B. PCR Reaction and Subsequent Ligation to Create Circular Molecules from the PCR Product
[0375] In the first example set forth above in Section A, the two PCR primers are combined with a puromycin DNA template such as pPUR (Clontech) and a heat stable DNA polymerase under appropriate conditions for DNA amplification. The resulting PCR product (attB-Puromycin) is then purified and self-ligated to form a circular molecule.
[0376] In the second example set forth above in Section A, amplification of the GFP gene and IRES-blasticidin sequences is accomplished by combining primers 1 & 2 with DNA template pD2eGFP and primers 3 & 4 with template pIRESblasticidin under appropriate conditions to amplify the desired template. After initial amplification of the two products (attB-GFP & IRES-blasticidin) in separate reactions, a second round of amplification using both of the PCR products from the first round of amplification together with primers 1 and 4 amplifies the fusion product attB-GFP-IRES-blasticidin (FIG. 13). This technique of using complementary sequences in primer design to create a fusion product is employed in Saccharomyces cerevisiae for allele replacement (Erdeniz et al (1997) Gen Res 2:1174-1183). The amplified product is then purified from the PCR reaction mixture by standard methods and ligated to form a circular molecule.
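By way of illustration only, the second-round fusion step described above can be sketched in Python. The sketch models only the top strands of the two first-round products and joins them where the 3' end of the first matches the 5' end of the second, as the complementary primer tails are intended to arrange. The function name `fuse_by_overlap` and the two toy sequences are hypothetical and do not correspond to the actual GFP or IRES-blasticidin sequences.

```python
def fuse_by_overlap(left: str, right: str, min_overlap: int = 4) -> str:
    """Join two amplicons whose ends overlap (suffix of left == prefix of right)."""
    longest = min(len(left), len(right))
    # Try the longest possible overlap first, then shorter ones.
    for size in range(longest, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    raise ValueError("no terminal overlap of the required length")


# Hypothetical toy sequences standing in for the two first-round products.
ATTB_GFP = "AAACCCGGG"   # stands in for attB-GFP
IRES_BSD = "CCGGGTTT"    # stands in for IRES-blasticidin

fusion = fuse_by_overlap(ATTB_GFP, IRES_BSD)
print(fusion)
```

In this toy case the five-base overlap is counted once in the fused product, which is the behavior expected when primers 1 and 4 amplify across the annealed junction.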
C. Introduction of PCR Product onto the ACes Using a Recombinase
[0377] The circular PCR product is then introduced to the platform ACes using the bacteriophage lambda integrase E174R. The introduction can be performed in vivo by transfecting the pCXLamIntR (SEQ ID NO: 112) vector encoding the lambda integrase mutant E174R together with the circularized PCR product into a cell line containing the platform ACes.
D. Selection for Marker Gene
[0378] The marker gene (in this case either puromycin, blasticidin or GFP) is used to enrich the population for cells containing the proper integration event. A proper integration event in the second example (FIG. 14) juxtaposes a promoter residing on the platform ACes 5' to the attB-GFP-IRES-Blasticidin PCR product, allowing for transcription of both GFP and blasticidin. If enrichment is done by drug selection, blasticidin is added to the medium on the transfected cells 24-48 hours post-transfection. Selection is maintained until colonies are formed on the plates. If enrichment is done by cell sorting, cells are sorted 2-4 days post-transfection to enrich for cells expressing the fluorescent marker (GFP in this case).
E. Analysis of Clones
[0379] Clonal isolates are analyzed by PCR, FISH and sequence analysis to confirm proper integration events.
Example 9
Construction of a Human Platform ACes "ACE 0.1"
[0380] A. Construction of the targeting vector pPACrDNA
[0381] Genome Systems (IncyteGenomics) was supplied with the primers 5'HETS (GGGCCGAAACGATCTCAACCTATT; SEQ ID NO:78) and 3'HETS (CGCAGCGGCCCTCCTACTC; SEQ ID NO:79), which were used to amplify a 538 bp PCR product homologous to nt 9680-10218 of the human rDNA sequences (GenBank Accession No. U13369). This product was used as a probe to screen a human genomic P1AC (P1 Artificial Chromosome) library constructed in the vector pCYPAC2 (Ioannou et al. (1994) Nat. Genet. 6(1): 84-89). Genome Systems clone #18720 was isolated in this screen and contains three repeats of human rDNA as assessed by restriction analysis. GS clone #18720 was digested with PmeI, a restriction enzyme with a unique site in a single repeat of the human rDNA (45 Kbp), and then religated to form pPACrDNA (FIG. 15). The insert in pPACrDNA was analyzed by restriction digests and sequence analysis of the 5' and 3' termini. The rDNA sequences of pPACrDNA are homologous to GenBank Accession No. U13369; the insert of about 45 kb comprises a single repeat beginning from the end of one repeat at ˜33980 (relative to the GenBank sequence) through the beginning of the next repeat up to approximately 35120 (the repeat is offset from that listed in the GenBank file). Thus, the rDNA sequence is just over 1 copy of the repeat, extending from 33980 (+/-10 bp) to the end of the first repeat (43 Kbp) and continuing into the second repeat to bp 35120 (+/-10 bp).
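As a rough check of the coordinates given above, the approximate insert length can be worked out in a few lines of Python. This is a sketch only: the repeat length of 43,000 bp is an assumption taken from the rounded "43 Kbp" figure in the text, the endpoints carry the stated uncertainty of +/-10 bp, and the function name `insert_length` is hypothetical.

```python
def insert_length(start: int, end: int, repeat_len: int) -> int:
    """Length of an insert running from `start` to the end of one repeat
    and on into the next repeat up to `end`."""
    tail_of_first_repeat = repeat_len - start
    head_of_second_repeat = end
    return tail_of_first_repeat + head_of_second_repeat


# Assumed repeat length: 43,000 bp (rounded value from the text).
approx_bp = insert_length(start=33980, end=35120, repeat_len=43000)
print(approx_bp, round(approx_bp / 43000, 2))
```

Under these assumptions the insert is a little more than one repeat unit in length, in line with the description of "just over 1 copy of the repeat" and an insert of about 45 kb.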
B. Transfection and ACes Formation.
[0382] Five hundred thousand MSU1.1 cells (Morgan et al., 1991, Exp. Cell Res., November; 197(1):125-136; provided by Dr. Justin McCormick at Michigan State University) were plated per 6 cm plate (3 plates total) and allowed to grow overnight.
[0383] The cells were 70-80% confluent the following day. One plate was transfected with 15 μg pPACrDNA (linearized with Pme I) and 2 μg pSV40attPsensePuro (linearized with Sca I; see Example 3). The remaining plates were controls and were transfected with either 20 μg pBS (Stratagene) or 20 μg pSV40attBsensePuro (linearized with Sca I). All three plates were transfected using a CaPO4 protocol.
C. Selection of Puromycin Resistant Colonies
[0384] One day post-transfection the cells were "glycerol shocked" by the addition of PBS medium containing 10% glycerol for 30 seconds. Subsequently, the glycerol was removed and replaced with fresh DMEM. Four days post-transfection selective medium containing 1 μg/ml puromycin was added. The transfection plates were maintained at 37° C. with 5% CO2 in selective medium for 2 weeks, at which point colonies could be seen on the plate transfected with pPACrDNA and pSV40attPsensePuro. The colonies were ring-cloned from the plate on day 17 post-selection and expanded in selective medium for analysis. Only two colonies (M2-2d & M2-2b) were able to proliferate in the selective medium after cloning. No colonies were seen on the control plates after 37 days in selective medium.
D. Analysis of Clones
[0385] FISH analysis was performed on the candidate clones to detect ACes formation. Metaphase spreads from the candidate clones were probed in multiple probe combinations. In one experiment, the probes used were biotin-labeled human alphoid DNA (pPACrDNA) and digoxigenin-labeled mouse major DNA (pFK161) as a negative control. Candidate M2-2d was single cell subcloned by flow sorting and the candidate subclones were reanalyzed by FISH. Subclone 1B1 of M2-2d was determined to be a platform ACes and is also designated human Platform ACE 0.1.
Example 10
[0386] Site-Specific Integration of a Marker Gene onto a Human Platform ACE 0.1
[0387] The promoterless delivery method was used to deliver a promoterless blasticidin marker gene onto the human platform ACes. Of 38 blasticidin resistant clones analyzed from the population co-transfected with pLIT38attB-BSRpolyA10 and pCXLamIntR (FIG. 8; SEQ ID NOs. 111 and 112), 21 displayed a PCR product of the expected size. In contrast, the population transfected with pBlueScript yielded 0 blasticidin resistant colonies.
A. Construction of pLIT38attB-BSRpolyA10 & pLIT38attB-BSRpolyA2.
[0388] The vector pLITMUS 38 (New England Biolabs; U.S. Pat. No. 5,691,140; SEQ ID NO: 119) was digested with EcoRV and ligated to two annealed oligomers, which form an attB site (attB1 5'-TGAAGCCTGCTTTTTTATACTAACTTGAGCGAA-3', SEQ ID NO:8; attB2 5'-TTCGCTCAAGTTAGTATAAAAAAGCAGGCTTCA-3', SEQ ID NO:9). This ligation reaction resulted in the vector pLIT38attB (SEQ ID NO: 120). The blasticidin resistance gene and SV40 polyA site were PCR amplified with primers 5BSD (ACCATGAAAACATTTAACATTTCTCAACA; SEQ ID NO:80) and SV40polyA (TTTATTTGTGAAATTTGTGATGCTATTGC; SEQ ID NO:81) using pPAC4 (Frengen, E., et al. (2000) Genomics 68 (2), 118-126; GenBank Accession No. U75992) as template. The blasticidin-SV40polyA PCR product was then ligated into pLIT38attB at the BamHI site, which was Klenow treated following digestion with BamHI. pLIT38attB-BSDpolyA10 (SEQ ID NO: 111) and pLIT38attB-BSDpolyA2 (SEQ ID NO: 121) are the two resulting orientations of the PCR product ligated into the vector.
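The two attB oligomers are described above as annealing to form an attB site, which requires that one be the reverse complement of the other. A minimal Python sketch of that check follows; the helper name `reverse_complement` is illustrative, and the two sequences are copied from SEQ ID NOs: 8 and 9 as given in the text.

```python
# Translation table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


ATTB1 = "TGAAGCCTGCTTTTTTATACTAACTTGAGCGAA"  # SEQ ID NO:8
ATTB2 = "TTCGCTCAAGTTAGTATAAAAAAGCAGGCTTCA"  # SEQ ID NO:9

# True when the two oligomers can anneal into a fully paired, blunt duplex.
oligos_anneal = reverse_complement(ATTB1) == ATTB2
print(len(ATTB1), len(ATTB2), oligos_anneal)
```

A fully paired, blunt-ended duplex is consistent with ligation into the blunt ends left by EcoRV digestion.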
B. Transfection of MSU1.1 Cells Containing Human Platform ACE 0.1.
[0389] MSU1.1 cells containing human platform ACE 0.1 (see Example 9) were expanded and plated to five 10 cm dishes with 1.3×10^6 cells per dish. The cells were incubated overnight in DMEM with 10% fetal bovine serum, at 37° C. and 5% CO2. The following day the cells were transfected with 5 μg of each plasmid as set forth in Table 3, for a total of 10 μg of DNA per plate of cells transfected, using ExGen 500 in vitro transfection reagent (MBI Fermentas, cat. no. R0511). The transfection was performed according to the manufacturer's protocol. Cells were incubated at 37° C. with 5% CO2 in DMEM with 10% fetal bovine serum following the transfection.
TABLE 3
Plate #   Plasmid 1     Plasmid 2               No. BsdR Colonies
1         pBS           None                    0
2         pCXLamInt     pLIT38attB-BSRpolyA10   16
3         pCXLamIntR    pLIT38attB-BSRpolyA10   40
4         pCXLamInt     pLIT38attB-BSRpolyA2    28
5         pCXLamIntR    pLIT38attB-BSRpolyA2    36
C. Selection of Blasticidin Resistant Clones.
[0390] Three days following the transfection the cells were split from a 10 cm dish to two 15 cm dishes. The cells were maintained in DMEM with 10% fetal bovine serum for 4 days in the 15 cm dishes. Seven days post-transfection blasticidin was introduced into the medium. Stably transfected cells were selected with 1 μg/ml blasticidin. The number of colonies formed on each plate is listed in Table 3. These colonies were ring-cloned and expanded for PCR analysis. Upon expansion in blasticidin containing medium some clones failed to live and therefore do not have corresponding PCR data.
D. PCR Analysis
[0391] Thirty-eight of the 40 clones from plate 3 grew after ring-cloning. Genomic DNA was isolated from these clones with the Promega Wizard Genomic DNA purification kit, digested with EcoRI and used as template in a PCR reaction with the following primers: 3BSP-TTAATTTCGGGTATATTTGAGTGGA (SEQ ID NO:82); 5 PacSV40-CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG (SEQ ID NO:76). The PCR conditions were as follows. 100 ng of genomic DNA was amplified with 0.5 μl Herculase polymerase (Stratagene) in a 50 μl reaction that contained 12.5 pmole of each primer, 2.5 mM of each dNTP, and 1× Herculase buffer (Stratagene). The reactions were placed in a PerkinElmer thermocycler programmed as follows: Initial denaturation at 95° C. for 10 minutes; 35 cycles of 94° C. for 1 minute, 53° C. for 1 minute, 72° C. for 1 minute, and 72° C. for 1 minute; Final extension for 10 minutes at 72° C.; and 4° C. hold. If pLIT38attB-BSRpolyA10 integrates onto the human platform ACE 0.1 correctly, PCR amplification with the above primers should yield an 804 bp product. Twenty-one of the 38 clones from plate 3 produced a PCR product of the expected 804 bp size.
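The reaction set-up and the screening outcome reported above reduce to simple arithmetic, sketched here in Python for convenience. The function name `micromolar` is illustrative; the input values (12.5 pmole of primer, 50 μl reaction volume, 21 positive clones of 38) are taken from the text.

```python
def micromolar(pmol: float, volume_ul: float) -> float:
    """Concentration in micromolar; 1 pmol per microliter equals 1 uM."""
    return pmol / volume_ul


# 12.5 pmole of each primer in a 50 ul reaction.
primer_um = micromolar(12.5, 50.0)

# 21 of the 38 expanded clones gave the expected 804 bp product.
positive_fraction = 21 / 38

print(f"primer: {primer_um:.2f} uM each; PCR-positive clones: {positive_fraction:.1%}")
```

The fraction applies only to the 38 clones from plate 3 that survived expansion, not to all 40 colonies counted on that plate.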
Example 11
Delivery of a Vector Comprising a Promoterless Marker Gene and a Gene Encoding a Therapeutic Product to a Platform ACes
[0392] Platform ACes containing pSV40attPsensePURO (FIG. 4) were constructed as set forth in Examples 3 and 4.
A. Construction of Delivery Vectors
[0393] 1. Erythropoietin cDNA Vector, p18EPOcDNA.
[0394] The erythropoietin cDNA was PCR amplified from a human cDNA library (E. Perkins et al., 1999, Proc. Natl. Acad. Sci. USA 96(5): 2204-2209) using the following primers: EPO5XBA-TATCTAGAATGGGGGTGCACGAATGTCCTGCC (SEQ ID NO: 83); EPO3BSI-TACGTACGTCATCTGTCCCCTGTCCTGCAGGC (SEQ ID NO: 84). The cDNA was amplified through two successive rounds of PCR using the following conditions: heat denaturation at 95° C. for 3 minutes; 35 cycles of a 30 second denaturation (95° C.), 30 seconds of annealing (60° C.), and 1 minute extension (72° C.); the last cycle is followed by a 7 minute extension at 72° C. BIO-X-ACT (BIOLINE) was used to amplify the erythropoietin cDNA from 2.5 ng of the human cDNA library in the first round of amplification. Five μl of the first amplification product was used as template for the second round of amplification. Two PCR products were produced from the second amplification with Taq polymerase (Eppendorf); each product was cloned into pCR2.1-Topo (Invitrogen) and sequenced. The larger PCR product contained the expected cDNA sequence for erythropoietin. The erythropoietin cDNA was moved from pTopoEPO into p18attBZeo(6XHS4)2eGFP (SEQ ID NO: 110). pTopoEPO was digested with BsiWI and XbaI to release a 588 bp EPO cDNA. BsrGI and BsiWI create compatible ends. The eGFP gene was removed from p18attBZeo(6XHS4)2eGFP by digestion with BsiWI and XbaI, the 8.3 Kbp vector backbone was gel purified and ligated to the 588 bp EPO cDNA to create p18EPOcDNA (SEQ ID NO: 124).
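The statement that BsrGI and BsiWI create compatible ends can be illustrated with a short Python sketch. The recognition sequences and cut positions used here (BsiWI C^GTACG, BsrGI T^GTACA, XbaI T^CTAGA) are standard published values and are not recited in this example, so they should be read as assumptions of the sketch; the dictionary and function names are likewise illustrative.

```python
# (recognition site, cut position on the top strand); standard values assumed.
ENZYMES = {
    "BsiWI": ("CGTACG", 1),
    "BsrGI": ("TGTACA", 1),
    "XbaI": ("TCTAGA", 1),
}


def five_prime_overhang(site: str, cut: int) -> str:
    """Single-stranded 5' overhang left by a symmetric cut in a palindromic site."""
    return site[cut:len(site) - cut]


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """True when both enzymes leave the same 5' overhang."""
    return five_prime_overhang(*ENZYMES[enzyme_a]) == five_prime_overhang(*ENZYMES[enzyme_b])


# Primers from the text; each is expected to carry its cloning site near the 5' end.
EPO5XBA = "TATCTAGAATGGGGGTGCACGAATGTCCTGCC"  # SEQ ID NO: 83
EPO3BSI = "TACGTACGTCATCTGTCCCCTGTCCTGCAGGC"  # SEQ ID NO: 84

print(compatible("BsiWI", "BsrGI"), compatible("BsiWI", "XbaI"))
print(ENZYMES["XbaI"][0] in EPO5XBA, ENZYMES["BsiWI"][0] in EPO3BSI)
```

Because both BsiWI and BsrGI leave a GTAC overhang under these assumptions, a BsiWI-cut insert end can be ligated to a BsrGI-cut vector end, while the XbaI end provides directionality.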
[0395] 2. Genomic Erythropoietin Vector, p18genEPO.
[0396] The erythropoietin genomic clone was PCR amplified from a human genomic library (Clontech) using the following primers: GENEPO3BSI-CGTACGTCATCTGTCCCCTGTCCTGCA (SEQ ID NO: 85); GENEPO5XBA-TCTAGAATGGGGGTGCACGGTGAGTACT (SEQ ID NO: 86). The reaction conditions for the amplification were as follows: heat denaturation for 3 minutes (95° C.); 30 cycles of a 30 second denaturation (95° C.), 30 seconds annealing (from 65° C. decreasing 0.5° C. per cycle to 50° C.), and 3 minutes extension (72° C.); 15 cycles of a 30 second denaturation (95° C.), 30 seconds annealing (50° C.), and 3 minute extension (72° C.); the last cycle is followed by a 7 minute extension at 72° C. The erythropoietin genomic PCR product (2147 bp) was gel purified and cloned into pCR2.1Topo to create pTopogenEPO. Sequence analysis revealed 2 bp substitutions and insertions in the intronic sequences of the genomic clone of erythropoietin. A partial digest with XbaI and complete digest with BsiWI excised the erythropoietin genomic insert from pTopogenEPO. The resulting 2158 bp genomic erythropoietin fragment was ligated into the 8.3 Kbp fragment resulting from the digestion of p18attBZeo(6XHS4)2eGFP (SEQ ID NO: 110) with XbaI and BsrGI to create p18genEPO (SEQ ID NO: 125).
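The touchdown annealing profile described above (65° C. decreasing 0.5° C. per cycle to 50° C. over 30 cycles, followed by 15 cycles at 50° C.) can be laid out cycle by cycle with a short Python sketch. It assumes that the first cycle anneals at 65° C. and that each subsequent touchdown cycle is 0.5° C. lower; the text does not state which cycle carries the first decrement, and the function name `touchdown_annealing` is hypothetical.

```python
def touchdown_annealing(start_c, step_c, floor_c, touchdown_cycles, fixed_cycles):
    """Annealing temperature for every cycle of a touchdown PCR program."""
    # Touchdown phase: drop by step_c each cycle, never going below the floor.
    temps = [max(floor_c, start_c - step_c * i) for i in range(touchdown_cycles)]
    # Fixed phase: remaining cycles all anneal at the floor temperature.
    temps += [floor_c] * fixed_cycles
    return temps


schedule = touchdown_annealing(65.0, 0.5, 50.0, touchdown_cycles=30, fixed_cycles=15)
for cycle in (1, 2, 30, 31, 45):
    print(cycle, schedule[cycle - 1])
```

Under this reading the thirtieth cycle anneals at 50.5° C. and the fixed phase begins at 50° C. on the next cycle; if the decrement were instead applied before the first cycle, every touchdown value would shift down by 0.5° C.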
B. Transfection and Selection with Drug
[0397] The erythropoietin genomic and cDNA genes were each moved onto the platform ACes B19-38 (constructed as set forth in Example 3) by co-transfecting with pCXLamIntR. Control transfections were also performed using pCXLamInt (SEQ ID NO: 127) together with either p18EPOcDNA (SEQ ID NO: 124) or p18genEPO (SEQ ID NO: 125). Lipofectamine Plus was used to transfect the DNAs into B19-38 cells according to the manufacturer's protocol. The cells were placed in selective medium (DMEM with 10% FBS and Zeocin at 500 μg/ml) 48 hours post-transfection and maintained in selective medium for 13 days. Clones were isolated 15 days post-transfection.
C. Analysis of Clones (ELISA, PCR)
[0398] 1. ELISA Assays
[0399] Thirty clones were tested for erythropoietin production by an ELISA assay using a monoclonal anti-human erythropoietin antibody (R&D Systems, Catalogue # MAB287), a polyclonal anti-human erythropoietin antibody (R&D Systems, Catalogue # AB-286-NA) and alkaline phosphatase conjugated goat-anti-rabbit IgG (heavy and light chains) (Jackson ImmunoResearch Laboratories, Inc., Catalogue #111-055-144). The negative control was a Zeocin resistant clone isolated from B19-38 cells transfected with p18attBZeo(6XHS4) (SEQ ID NO: 117; no insert control vector) and pCXLamIntR (SEQ ID NO: 112). The preliminary ELISA assay was executed as follows:
1) Nunc-Immuno Plates (MaxiSorb 96-well, Catalogue #439454) were coated overnight at 4° C. with 75 μl of a 1/200 dilution of monoclonal anti-human erythropoietin antibody in phosphate buffered saline, pH 7.4 (PBS; Sigma Catalogue # P-3813).
2) The following day the plates were washed 3 times with 300 μl PBS containing 0.15% Tween 20 (Sigma, Catalogue # P-9416).
3) The plates were then blocked with 300 μl of 1% Bovine Serum Albumin (BSA; Sigma Catalogue # A-7030) in PBS for 1 hour at 37° C.
4) The washes of step 2 were repeated.
5) The clonal supernatants (75 μl per clone per well of a 96-well plate) were then added to the plate and incubated for 1 hour at 37° C. The clonal supernatant analyzed in the ELISA assay had been maintained on the cells for 7 days prior to analysis.
6) The washes of step 2 were repeated.
7) 75 μl of polyclonal anti-human erythropoietin antibody (1/250 dilution in dilution buffer (0.5% BSA, 0.01% Tween 20, 1×PBS, pH 7.4)) was added and incubated for 1 hour at 37° C.
8) The washes of step 2 were repeated.
9) 75 μl of goat anti-rabbit conjugated alkaline phosphatase diluted 1/4000 in dilution buffer was added and incubated for 1 hour at 37° C.
10) The washes of step 2 were repeated.
11) 75 μl of substrate, p-nitrophenyl phosphate (Sigma N2640), diluted to 1 mg/ml in substrate buffer (0.1 Ethanolamine-HCl (Sigma, Catalogue # E-6133), 5 mM MgCl2 (Sigma, Catalogue # M-2393), pH 9.8), was added. The plates were incubated in the dark for 1 hour at room temperature (22° C.).
12) The absorption was read at 405 nm (reference wavelength 495 nm) on a Universal Microplate Reader (Bio-Tek Instruments, Inc., model # ELX800 UV).
The erythropoietin standard curve was derived from readings of diluted human recombinant Erythropoietin (Roche, catalogue #1-120-166; dilution range 125-7.8 mUnits/ml). From this preliminary assay the 21 clones displaying the highest expression of erythropoietin were analyzed a second time in the same manner using medium supernatants that had been on the clones for 24 hours and a 1:3 dilution thereof.
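For illustration, the standard curve and its use to estimate sample concentrations can be sketched in Python. Several points are assumptions of the sketch and not statements of the method actually used: the standards are taken to be a two-fold serial dilution (which carries 125 mUnits/ml down to about 7.8 mUnits/ml in five points), the absorbance values are invented placeholders, and the read-out uses simple linear interpolation between neighboring standards. The function names are hypothetical.

```python
def serial_dilution(top: float, factor: float, points: int) -> list:
    """Concentrations of a serial dilution starting at `top`."""
    return [top / factor ** i for i in range(points)]


def interpolate(absorbance: float, curve: list) -> float:
    """Estimate concentration by linear interpolation between adjacent standards.

    `curve` is a list of (absorbance, concentration) pairs.
    """
    pts = sorted(curve)
    for (a0, c0), (a1, c1) in zip(pts, pts[1:]):
        if a0 <= absorbance <= a1:
            return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)
    raise ValueError("absorbance outside the range of the standard curve")


# Assumed two-fold series: 125, 62.5, 31.25, 15.625, 7.8125 mUnits/ml.
standards = serial_dilution(125.0, 2, 5)

# Placeholder absorbances (A405) for the standards; not measured data.
hypothetical_a405 = [1.6, 0.8, 0.4, 0.2, 0.1]
curve = list(zip(hypothetical_a405, standards))

print(standards)
print(interpolate(0.6, curve))
```

A sample reading that falls outside the range of the standards raises an error in this sketch, which mirrors the practical need to dilute such samples, as with the 1:3 dilution mentioned above.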
[0400] 2. PCR Analysis
[0401] Genomic DNA was isolated from the 21 clones with the best expression (as assessed by the initial ELISA assay above) as well as the B19-38 cell line and used for PCR analysis. Genomic DNA was isolated using the Wizard genomic DNA purification kit (Promega) according to the manufacturer's protocol. Amplification was performed on 100 ng of genomic DNA as template with MasterTaq DNA Polymerase (Eppendorf) and the primer set 5 PacSV40-CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG (SEQ ID NO: 76) and Antisense Zeo-TGAACAGGGTCACGTCGTCC (SEQ ID NO: 77). The amplification conditions were as follows: heat denaturation for 3 minutes (95° C.); 30 cycles of a 30 second denaturation (95° C.), 30 seconds annealing (from 65° C. decreasing 0.5° C. per cycle to 50° C.), and 1 minute extension (72° C.); 15 cycles of a 30 second denaturation (95° C.), 30 seconds annealing (50° C.), and 1 minute extension (72° C.); the last cycle is followed by a 10 minute extension at 72° C. PCR products were size separated by gel electrophoresis. Of the 21 clones analyzed, 19 produced a PCR product of 650 bp as expected for a site-specific integration event. All nineteen clones were the result of transformations with p18EPOcDNA (5) or p18genEPO (14) and pCXLamIntR (i.e. mutant integrase). The remaining two clones, both of which were the result of transformation with p18genEPO (SEQ ID NO: 125) and pCXLamInt (i.e. wildtype integrase; SEQ ID NO: 127), produced a 400 bp PCR product.
Example 12
Preparation of a Transformation Vector Useful for the Induction of Plant Artificial Chromosome Formation
[0402] Plant artificial chromosomes (PACs) can be generated by introducing nucleic acid, such as DNA, which can include a targeting DNA, for example rDNA or lambda DNA, into a plant cell, allowing the cell to grow, and then identifying from among the resulting cells those that include a chromosome with a structure that is distinct from that of any chromosome that existed in the cell prior to introduction of the nucleic acid. The structure of a PAC reflects amplification of chromosomal DNA, for example, segmented, repeat region-containing and heterochromatic structures. It is also possible to select cells that contain structures that are precursors to PACs, for example, chromosomes containing more than one centromere and/or fragments thereof, and culture and/or manipulate them to ultimately generate a PAC within the cell.
[0403] In the method of generating PACs, the nucleic acid can be introduced into a variety of plant cells. The nucleic acid can include targeting DNA and/or a plant expressible DNA encoding one or multiple selectable markers (e.g., DNA encoding bialaphos (bar) resistance) or scorable markers (e.g., DNA encoding GFP). Examples of targeting DNA include, but are not limited to, N. tabacum rDNA intergenic spacer sequence (IGS) and Arabidopsis rDNA such as the 18S, 5.8S, 26S rDNA and/or the intergenic spacer sequence. The DNA can be introduced using a variety of methods, including, but not limited to, Agrobacterium-mediated methods, PEG-mediated DNA uptake and electroporation using, for example, standard procedures according to Hartmann et al [(1998) Plant Molecular Biology 36:741]. The cell into which such DNA is introduced can be grown under selective conditions, or can initially be grown under non-selective conditions and then transferred to selective media. The cells or protoplasts can be placed on plates containing a selection agent to grow, for example, individual calli. Resistant calli can be scored for scorable marker expression. Metaphase spreads of resistant cultures can be prepared, and the metaphase chromosomes examined by FISH analysis using specific probes in order to detect amplification of regions of the chromosomes. Cells that have artificial chromosomes with functioning centromeres or artificial chromosomal intermediate structures, including, but not limited to, dicentric chromosomes, formerly dicentric chromosomes, minichromosomes, heterochromatin structures (e.g. sausage chromosomes), and stable self-replicating artificial chromosomal intermediates as described herein, are identified and cultured. In particular, the cells containing self-replicating artificial chromosomes are identified.
[0404] The DNA introduced into a plant cell for the generation of PACs can be in any form, including in the form of a vector. An exemplary vector for use in methods of generating PACs can be prepared as follows.
[0405] For the production of artificial chromosomes, plant transformation vectors, as exemplified by pAgIIa and pAgIIb, containing a selectable marker, a targeting sequence, and a scorable marker were constructed using procedures well known in the art to combine the various fragments. The vectors can be prepared using vector pAg1 as a base vector and inserting the following DNA fragments into pAg1: DNA encoding β-glucuronidase under the control of the nopaline synthase (NOS) promoter fragment and flanked at the 3' end by the NOS terminator fragment, a fragment of mouse satellite DNA and an N. tabacum rDNA intergenic spacer sequence (IGS). In constructing plant transformation vectors, vector pAg2 can also be used as the base vector.
1. Construction of pAg1
[0406] Vector pAg1 (SEQ. ID. NO: 89) is a derivative of the CAMBIA vector named pCambia 3300 (Center for the Application of Molecular Biology to International Agriculture, i.e., CAMBIA, Canberra, Australia; www.cambia.org), which is a modified version of vector pCambia 1300 to which has been added DNA from the bar gene conferring resistance to phosphinothricin. The nucleotide sequence of pCambia 3300 is provided in SEQ. ID. NO: 90. pCambia 3300 also contains a lacZ alpha sequence containing a polylinker region.
[0407] pAg1 was constructed by inserting two new functional DNA fragments into the polylinker of pCambia 3300: one sequence containing an attB site and a promoterless zeomycin resistance-encoding DNA flanked at the 3' end by a SV40 polyA signal sequence, and a second sequence containing DNA from the hygromycin resistance gene (hygromycin phosphotransferase) conferring resistance to hygromycin for selection in plants. Although the zeomycin-SV40 polyA signal fusion is not expected to function in plant cells, it can be activated in mammalian cells by insertion of a functional promoter element into the attB site by site-specific recombination catalyzed by the Lambda att integrase. Thus, the inclusion of the attB-zeomycin sequences allows for evaluation of functionality of plant artificial chromosomes in mammalian cells by activation of the zeomycin resistance-encoding DNA, and provides an att site for further insertion of new DNA sequences into plant artificial chromosomes formed as a result of using pAg1 for plant transformation. The second functional DNA fragment allows for selection of plant cells with hygromycin. Thus, pAg1 contains DNA from the bar gene conferring resistance to phosphinothricin, DNA from the hygromycin resistance gene, each resistance-encoding DNA under the control of a separate cauliflower mosaic virus (CaMV) 35S promoter, and the attB-promoterless zeomycin resistance-encoding DNA.
[0408] pAg1 is a binary vector containing Agrobacterium right and left T-DNA border sequences for use in Agrobacterium-mediated transformation of plant cells or protoplasts with the DNA located between the border sequences. pAg1 also contains the pBR322 Ori for replication in E. coli. pAg1 was constructed by ligating HindIII/PstI-digested p3300attBZeo with HindIII/PstI-digested pBSCaMV35SHyg as follows.
[0409] a. Generation of p3300attBZeo
[0410] Plasmid pCambia 3300 was digested with PstI/Ecl136 II and ligated with PstI/StuI-digested pLITattBZeo (SEQ. ID. NO: 91; containing DNA encoding the zeocin resistance gene and an attB integrase recognition sequence) to generate p3300attBZeo, which contains an attB site, a promoterless zeomycin resistance-encoding DNA flanked at the 3' end by a SV40 polyA signal, and a reconstructed PstI site.
[0411] b. Generation of pBSCaMV35SHyg
[0412] A DNA fragment containing DNA encoding hygromycin phosphotransferase flanked by the CaMV 35S promoter and the CaMV 35S polyA signal sequence was obtained by PCR amplification of plasmid pCambia 1302 (GenBank Accession No. AF234298 and SEQ. ID. NO: 92). The primers used in the amplification reaction were as follows:
CaMV35SpolyA (SEQ. ID. NO: 93): 5'-CTGAATTAACGCCGAATTAATTCGGGGGATCTG-3'
CaMV35Spr (SEQ. ID. NO: 94): 5'-CTAGAGCAGCTTGCCAACATGGTGGAGCA-3'
The 2100-bp PCR fragment was ligated with EcoRV-digested pBluescript II SK+ (Stratagene, La Jolla, Calif., U.S.A.) to generate pBSCaMV35SHyg.
[0413] c. Generation of pAg1
[0414] To generate pAg1, pBSCaMV35SHyg was digested with HindIII/PstI and ligated with HindIII/PstI-digested p3300attBZeo. Thus, pAg1 contains the pCambia 3300 backbone with DNA conferring resistance to phosphinothricin and hygromycin under the control of separate CaMV 35S promoters, an attB-promoterless zeomycin resistance-encoding DNA recombination cassette and unique sites for adding additional markers, e.g., DNA encoding GFP. The attB site can be used as described herein for the addition of new DNA sequences to plant artificial chromosomes, including PACs formed as a result of using the pAg1 vector, or derivatives thereof, in the production of PACs. The attB site provides a convenient site for recombinase-mediated insertion of DNAs containing a homologous att site.
2. pAg2
[0415] The vector pAg2 (SEQ. ID. NO: 95) is a derivative of vector pAg1 formed by adding DNA encoding a green fluorescent protein (GFP), under the control of a NOS promoter and flanked at the 3' end by a NOS polyA signal, to pAg1. pAg2 was constructed as follows. A DNA fragment containing the NOS promoter was obtained by digestion of pGEM-T-NOS, or pGEMEasyNOS (SEQ. ID. NO: 96), containing the NOS promoter in the cloning vector pGEM-T-Easy (Promega Biotech, Madison, Wis., U.S.A.), with XbaI/NcoI and was ligated to an XbaI/NcoI fragment of pCambia 1302 containing DNA encoding GFP (without the CaMV 35S promoter) to generate p1302NOS (SEQ. ID. NO: 97) containing GFP-encoding DNA in operable association with the NOS promoter. Plasmid p1302NOS was digested with SmaI/BsiWI to yield a fragment containing the NOS promoter and GFP-encoding DNA. The fragment was ligated with PmeI/BsiWI-digested pAg1 to generate pAg2. Thus, pAg2 contains DNA from the bar gene conferring resistance to phosphinothricin, DNA conferring resistance to hygromycin, both resistance-encoding DNAs under the control of a cauliflower mosaic virus 35S promoter, DNA encoding kanamycin resistance, a GFP gene under the control of a NOS promoter and the attB-zeomycin resistance-encoding DNA. One of skill in the art will appreciate that other fragments can be used to generate the pAg1 and pAg2 derivatives and that other heterologous DNA can be incorporated into pAg1 and pAg2 derivatives using methods well known in the art.
3. pAgIIa and pAgIIb Transformation Vectors
[0416] Vectors pAgIIa and pAgIIb were constructed by inserting the following DNA fragments into pAg1: DNA encoding β-glucuronidase, the nopaline synthase terminator fragment, the nopaline synthase (NOS) promoter fragment, a fragment of mouse satellite DNA and an N. tabacum rDNA intergenic spacer sequence (IGS). The construction of pAgIIa and pAgIIb was as follows.
[0417] An N. tabacum rDNA intergenic spacer (IGS) sequence (SEQ. ID. NO: 98; see also GenBank Accession No. Y08422; see also Borysyuk et al. (2000) Nature Biotechnology 18:1303-1306; Borysyuk et al. (1997) Plant Mol. Biol. 35:655-660; U.S. Pat. Nos. 6,100,092 and 6,355,860) was obtained by PCR amplification of tobacco genomic DNA. The IGS can be used as a targeting sequence by virtue of its homology to tobacco rDNA genes; the sequence is also an amplification promoter sequence in plants. This fragment was amplified using standard PCR conditions (e.g., as described by Promega Biotech, Madison, Wis., U.S.A.) from tobacco genomic DNA using the primers shown below:
NTIGS-F1 (SEQ ID No. 99): 5'-GTG CTA GCC AAT GTT TAA CAA GAT G-3'
NTIGS-R1 (SEQ ID No. 100): 5'-ATG TCT TAA AAA AAA AAA CCC AAG TGA C-3'
Following amplification, the fragment was cloned into pGEM-T Easy to give pIGS-1.
[0418] A fragment of mouse satellite DNA (Msat1 fragment; GenBank Accession No. V00846; and SEQ ID No. 101) was amplified via PCR from pSAT-1 using the following primers:
MSAT-F1 (SEQ ID No. 102): 5'-AAT ACC GCG GAA GCT TGA CCT GGA ATA TCG C-3'
MSAT-R1 (SEQ ID No. 103): 5'-ATA ACC GCG GAG TCC TTC AGT GTG CAT-3'
This amplification added a SacII and a HindIII site at the 5' end and a SacII site at the 3' end of the PCR fragment. This fragment was then cloned into the SacII site in pIGS-1 to give pMIGS-1, providing a eukaryotic centromere-specific DNA and a convenient DNA sequence for detection via FISH.
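The restriction sites said to be added by the two Msat primers can be located with a brief Python sketch. The recognition sequences used (SacII CCGCGG, HindIII AAGCTT) are standard published values and are not recited in this example, so they are assumptions of the sketch; the primer sequences are those given above with the spacing removed, and the names `SITES` and `sites_in` are illustrative.

```python
# Standard recognition sequences (assumed, not recited in the text).
SITES = {"SacII": "CCGCGG", "HindIII": "AAGCTT"}

MSAT_F1 = "AATACCGCGGAAGCTTGACCTGGAATATCGC"  # forward primer, SEQ ID No. 102
MSAT_R1 = "ATAACCGCGGAGTCCTTCAGTGTGCAT"      # reverse primer, SEQ ID No. 103


def sites_in(primer: str) -> dict:
    """Map each enzyme whose site occurs in the primer to its 0-based start position."""
    return {name: primer.find(seq) for name, seq in SITES.items() if seq in primer}


print(sites_in(MSAT_F1))
print(sites_in(MSAT_R1))
```

In both primers the sites sit immediately after a short 5' clamp, so each end of the amplified fragment can be cut with SacII for cloning into the SacII site of pIGS-1.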
[0419] A functional marker gene containing a NOS-promoter:GUS:NOS terminator fusion was then constructed containing the NOS promoter (GenBank Accession No. U09365; SEQ ID No. 104), E. coli β-glucuronidase coding sequence (from the GUS gene; GenBank Accession No. S69414; and SEQ ID No. 105), and the nopaline synthase terminator sequence (GenBank Accession No. U09365; SEQ ID No. 107). The NOS promoter in pGEM-T-NOS was added to a promoterless GUS gene in pBlueScript (Stratagene, La Jolla, Calif., U.S.A.) using NotI/SpeI to form pNGN-1, which has the NOS promoter in the opposite orientation relative to the GUS gene.
[0420] pMIGS-1 was digested with NotI/SpeI to yield a fragment containing the mouse major satellite DNA and the tobacco IGS, which was then added to NotI-digested pNGN-1 to yield pNGN-2. The NOS promoter was then re-oriented to provide a functional GUS gene, yielding pNGN-3, by digestion and religation with SpeI. Plasmid pNGN-3 was then digested with HindIII, and the HindIII fragment containing the β-glucuronidase coding sequence and the rDNA intergenic spacer, along with the Msat sequence, was added to pAg1 to form pAgIIa (SEQ ID NO: 108), using the unique HindIII site in pAg1 located near the right T-DNA border of pAg1, within the T-DNA region.
[0421] Another plasmid vector, referred to as pAgIIb, was also recovered, which contained the inserted HindIII fragment (SEQ ID NO: 108) in the opposite orientation relative to that observed in pAgIIa. Thus, pAgIIa and pAgIIb differ only in the orientation of the HindIII fragment containing the mouse major satellite sequence, the GUS DNA sequence and the IGS sequence. The nucleotide sequence of pAgIIa is provided in SEQ. ID. NO: 109.
[0422] Since modifications will be apparent to those of skill in this art, it is intended that this invention be limited only by the scope of the appended claims.
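Sequence Listing
Number of sequences: 129
SEQ ID NO: 1; length: 25; type: DNA; organism: Artificial Sequence; description: Primer attPUP
ccttgcgcta atgctctgtt acagg 25
SEQ ID NO: 2; length: 26; type: DNA; organism: Artificial Sequence; description: Primer attPDWN
cagaggcagg gagtgggaca aaattg 26
SEQ ID NO: 3; length: 35; type: DNA; organism: Artificial Sequence; description: Primer Lamint 1
ttcgaattca tgggaagaag gcgaagtcat gagcg 35
SEQ ID NO: 4; length: 34; type: DNA; organism: Artificial Sequence; description: Primer Lamint 2
ttcgaattct tatttgattt caattttgtc ccac 34
SEQ ID NO: 5; length: 20; type: DNA; organism: Artificial Sequence; description: Primer
cggacaatgc ggttgtgcgt 20
SEQ ID NO: 6; length: 46; type: DNA; organism: Artificial Sequence; description: primer
cgcgcagcaa aatctagagt aaggagatca agacttacgg ctgacg 46
SEQ ID NO: 7; length: 46; type: DNA; organism: Artificial Sequence; description: LambdaINTER174rev
cgtcagccgt aagtcttgat ctccttactc tagattttgc tgcgcg 46
SEQ ID NO: 8; length: 33; type: DNA; organism: Artificial Sequence; description: attB1
tgaagcctgc ttttttatac taacttgagc gaa 33
SEQ ID NO: 9; length: 33; type: DNA; organism: Artificial Sequence; description: attB2
ttcgctcaag ttagtataaa aaagcaggct tca 33
SEQ ID NO: 10; length: 25; type: DNA; organism: Artificial Sequence; description: Primer attPdwn2
tcttctcggg cataagtcgg acacc 25
SEQ ID NO: 11; length: 25; type: DNA; organism: Artificial Sequence; description: PrimerCMVen
ctcacgggga tttccaagtc tccac 25
SEQ ID NO: 12; length: 26; type: DNA; organism: Artificial Sequence; description: PrimerattPdwn
cagaggcagg gagtgggaca aaattg 26
SEQ ID NO: 13; length: 25; type: DNA; organism: Artificial Sequence; description: PrimerCMVEN2
caactccgcc ccattgacgc aaatg 25
SEQ ID NO: 14; length: 26; type: DNA; organism: Artificial Sequence; description: PrimerL1
agtatcgccg aacgattagc tcttca 26
SEQ ID NO: 15; length: 24; type: DNA; organism: Artificial Sequence; description: PrimerF1 rev
gccgatttcg gcctattggt taaa 24
SEQ ID NO: 16; length: 25; type: DNA; organism: Artificial Sequence; description: PrimerRED
ccgccgacat ccccgactac aagaa 25
SEQ ID NO: 17; length: 25; type: DNA; organism: Artificial Sequence; description: PrimerL2rev
ttccttcgaa ggggatccgc ctacc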
25
SEQ ID NO: 18; length: 22118; type: DNA; organism: Mus musculus; source: GenBank X82564, 1996-04-09
gaattcccct atccctaatc cagattggtg gaataacttg gtatagatgt ttgtgcatta
60aaaaccctgt aggatcttca ctctaggtca ctgttcagca ctggaacctg aattgtggcc
120ctgagtgata ggtcctggga catatgcagt tctgcacaga cagacagaca gacagacaga
180cagacagaca gacagacgtt acaaacaaac acgttgagcc gtgtgccaac acacacacaa
240acaccactct ggccataatt attgaggacg ttgatttatt attctgtgtt tgtgagtctg
300tctgtctgtc tgtctgtctg tctgtctgtc tatcaaacca aaagaaacca aacaattatg
360cctgcctgcc tgcctgcctg cctacacaga gaaatgattt cttcaatcaa tctaaaacga
420cctcctaagt ttgccttttt tctctttctt tatctttttc ttttttcttt tcttcttcct
480tccttccttc cttccttcct tccttccttt ctttctttct ttctttcttt cttactttct
540ttctttcctt cttacattta ttcttttcat acatagtttc ttagtgtaag catccctgac
600tgtcttgaag acactttgta ggcctcaatc ctgtaagagc cttcctctgc ttttcaaatg
660ctggcatgaa tgttgtacct cactatgacc agcttagtct tcaagtctga gttactggaa
720aggagttcca agaagactgg ttatattttt catttattat tgcattttaa ttaaaattta
780atttcaccaa aagaatttag actgaccaat tcagagtctg ccgtttaaaa gcataaggaa
840aaagtaggag aaaaacgtga ggctgtctgt ggatggtcga ggctgcttta gggagcctcg
900tcaccattct gcacttgcaa accgggccac tagaacccgg tgaagggaga aaccaaagcg
960acctggaaac aataggtcac atgaaggcca gccacctcca tcttgttgtg cgggagttca
1020gttagcagac aagatggctg ccatgcacat gttgtctttc agcttggtga ggtcaaagta
1080caaccgagtc acagaacaag gaagtataca cagtgagttc caggtcagcc agagtttaca
1140cagagaaacc acatcttgaa aaaaacaaaa aaataaatta aataaatata atttaaaaat
1200ttaaaaatag ccgggagtga tggcgcatgt ctttaatccc agctctcttc aggcagagat
1260gggaggattt ctgagtttga ggccagcctg gtctgcaaag tgagttccag gacagtcagg
1320gctatacaga gaaaccctgt cttgaaaact aaactaaatt aaactaaact aaactaaaaa
1380aatataaaat aaaaatttta aagaatttta aaaaactaca gaaatcaaac ataagcccac
1440gagatggcaa gtaactgcaa tcatagcaga aatattatac acacacacac acacagactc
1500tgtcataaaa tccaatgtgc cttcatgatg atcaaatttc gatagtcagt aatactagaa
1560gaatcatatg tctgaaaata aaagccagaa ccttttctgc ttttgttttc ttttgcccca
1620agatagggtt tctctcagtg tatccctggc atccctgcct ggaacttcct ttgtaggttt
1680ggtagcctca aactcagaga ggtcctctct gcctgcctgc ctgcctgcct gcctgcctgc
1740ctgcctgcct gcctgcctca cttcttctgc cacccacaca accgagtcga acctaggatc
1800tttatttctt tctctttctc tcttctttct ttctttcttt ctttctttct ttctttcttt
1860ctttctttct ttcttattca attagttttc aatgtaagtg tgtgtttgtg ctctatctgc
1920tgcctatagg cctgcttgcc aggagagggc aacagaacct aggagaaacc accatgcagc
1980tcctgagaat aagtgaaaaa acaacaaaaa aaggaaattc taatcacata gaatgtagat
2040atatgccgag gctgtcagag tgctttttaa ggcttagtgt aagtaatgaa aattgttgtg
2100tgtcttttat ccaaacacag aagagaggtg gctcggcctg catgtctgtt gtctgcatgt
2160agaccaggct ggccttgaac acattaatct gtctgcctct gcttccctaa tgctgcgatt
2220aaaggcatgt gccaccactg cccggactga tttcttcttt tttttttttt tggaaaatac
2280ctttctttct ttttctctct ctctttcttc cttccttcct ttctttctat tctttttttc
2340tttctttttt cttttttttt ttttttttaa aatttgccta aggttaaagg tgtgctccac
2400aattgcctca gctctgctct aattctcttt aaaaaaaaac aaacaaaaaa aaaaccaaaa
2460cagtatgtat gtatgtatat ttagaagaaa tactaatcca ttaataactc ttttttccta
2520aaattcatgt cattcttgtt ccacaaagtg agttccagga cttaccagag aaaccctgtg
2580ttcaaatttc tgtgttcaag gtcaccctgg cttacaaagt gagttccaag tccgataggg
2640ctacacagaa aaaccatatc tcagaaaaaa aaaaagttcc aaacacacac acacacacac
2700acacacacac acacacacac acacacacac acacacacag cgcgccgcgg cgatgagggg
2760aagtcgtgcc taaaataaat atttttctgg ccaaagtgaa agcaaatcac tatgaagagg
2820tactcctaga aaaaataaat acaaacgggc tttttaatca ttccagcact gttttaattt
2880aactctgaat ttagtcttgg aaaagggggc gggtgtgggt gagtgagggc gagcgagcag
2940acgggcgggc gggcgggtga gtggccggcg gcggtggcag cgagcaccag aaaacaacaa
3000accccaagcg gtagagtgtt ttaaaaatga gacctaaatg tggtggaacg gaggtcgccg
3060ccaccctcct cttccactgc ttagatgctc ccttcccctt actgtgctcc cttcccctaa
3120ctgtgcctaa ctgtgcctgt tccctcaccc cgctgattcg ccagcgacgt actttgactt
3180caagaacgat tttgcctgtt ttcaccgctc cctgtcatac tttcgttttt gggtgcccga
3240gtctagcccg ttcgctatgt tcgggcggga cgatggggac cgtttgtgcc actcgggaga
3300agtggtgggt gggtacgctg ctccgtcgtg cgtgcgtgag tgccggaacc tgagctcggg
3360agaccctccg gagagacaga atgagtgagt gaatgtggcg gcgcgtgacg gatctgtatt
3420ggtttgtatg gttgatcgag accattgtcg ggcgacacct agtggtgaca agtttcggga
3480acgctccagg cctctcaggt tggtgacaca ggagagggaa gtgcctgtgg tgaggcgacc
3540agggtgacag gaggccgggc aagcaggcgg gagcgtctcg gagatggtgt cgtgtttaag
3600gacggtctct aacaaggagg tcgtacaggg agatggccaa agcagaccga gttgctgtac
3660gcccttttgg gaaaaatgct agggttggtg gcaacgttac taggtcgacc agaaggctta
3720agtcctaccc ccccccccct tttttttttt tttcctccag aagccctctc ttgtccccgt
3780caccgggggc accgtacatc tgaggccgag aggacgcgat gggcccggct tccaagccgg
3840tgtggctcgg ccagctggcg cttcgggtct tttttttttt tttttttttt ttttcctcca
3900gaagccttgt ctgtcgctgt caccgggggc gctgtacttc tgaggccgag aggacgcgat
3960gggccccggc ttccaagccg gtgtggctcg gccagctgga gcttcgggtc tttttttttt
4020tttttttttt tttttttctc cagaagcctt gtctgtcgct gtcaccgggg gcgctgtact
4080tctgaggccg agaggacgcg atgggtcggc ttccaagccg atgtggcggg gccagctgga
4140gcttcgggtt tttttttttc ctccagaagc cctctcttgt ccccgtcacc gggggcgctg
4200tacttctgag gccgagagga cgtgatgggc ccgggttcca ggcggatgtc gcccggtcag
4260ctggagcttt ggatcttttt tttttttttt cctccagaag ccctctcttg tccccgtcac
4320cgggggcacc ttacatctga gggcgagagg acgtgatggg tccggcttcc aagccgatgt
4380ggcggggcca gctggagctt cgggtttttt ttttttcctc cagaagccct ctcttgtccc
4440cgtcaccggg ggcgctgtac ttctgaggcc gagaggacgt gatgggcccg ggttccaggc
4500ggatgtcgcc cggtcagctg gagctttgga tcattttttt ttttccctcc agaagccctc
4560tcttgtcccc gtcaccgggg gcaccgtaca tctgaggccg agaggacacg atgggcctgt
4620cttccaagcc gatgtggccc ggccagctgg agcttcgggt cttttttttt ttttttcctc
4680cagaagcctt gtctgtcgct gtcacccggg gcgctgtact tctgaggccg agaggacgcg
4740atgggcccgg cttccaagcc ggtgtggctc ggccagctgg agcttcgggt cttttttttt
4800tttttttttt ttcctccaga aaccttgtct gtcgctgtca cccggggcgc ttgtacttct
4860gatgccgaga ggacgcgatg ggcccgtctt ccaggccgat gtggcccggt cagctggagc
4920tttggatctt tttttttttt ttttcctcca gaagccctct cttgtccccg tcaccggggg
4980caccttacat ctgaggccta gaggacacga tgggcccggg ttccaggccg atgtggcccg
5040gtcagctgga gctttggatc tttttttttt ttttcttcca gaagccctct tgtccccgtc
5100accggtggca ctgtacatct gaggcggaga ggacattatg ggcccggctt ccaatccgat
5160gtggcccggt cagctggagc tttggatctt attttttttt taattttttc ttccagaagc
5220cctcttgtcc ctgtcaccgg tggcacggta catctgaggc cgagaggaca ttatgggccc
5280ggcttccagg ccgatgtggc ccggtcagct ggagctttgg atcttttttt ttttttttct
5340tttttcctcc agaagccctc tctgtccctg tcaccggggg ccctgtacgt ctgaggccga
5400gggaaagcta tgggcgcggt tttctttcat tgacctgtcg gtcttatcag ttctccgggt
5460tgtcagggtc gaccagttgt tcctttgagg tccggttctt ttcgttatgg ggtcattttt
5520gggccacctc cccaggtatg acttccaggc gtcgttgctc gcctgtcact ttcctccctg
5580tctcttttat gcttgtgatc ttttctatct gttcctattg gacctggaga taggtactga
5640cacgctgtcc tttccctatt aacactaaag gacactataa agagaccctt tcgatttaag
5700gctgttttgc ttgtccagcc tattcttttt actggcttgg gtctgtcgcg gtgcctgaag
5760ctgtccccga gccacgcttc ctgctttccc gggcttgctg cttgcgtgtg cttgctgtgg
5820gcagcttgtg acaactgggc gctgtgactt tgctgcgtgt cagacgtttt tcccgatttc
5880cccgaggtgt cgttgtcaca cctgtcccgg ttggaatggt ggagccagct gtggttgagg
5940gccaccttat ttcggctcac tttttttttt tttttttctc ttggagtccc gaacctccgc
6000tcttttctct tcccggtctt tcttccacat gcctcccgag tgcatttctt tttgtttttt
6060ttcttttttt tttttttttt ttggggaggt ggagagtccc gagtacttca ctcctgtctg
6120tggtgtccaa gtgttcatgc cacgtgcctc ccgagtgcac ttttttttgt ggcagtcgct
6180cgttgtgttc tcttgttctg tgtctgcccg tatcagtaac tgtcttgccc cgcgtgtaag
6240acattcctat ctcgcttgtt tctcccgatt gcgcgtcgtt gctcactctt agatcgatgt
6300ggtgctccgg agttctcttc gggccagggc caagccgcgc caggcgaggg acggacattc
6360atggcgaatg gcggccgctc ttctcgttct gccagcgggc cctcgtctct ccaccccatc
6420cgtctgccgg tggtgtgtgg aaggcagggg tgcggctctc cggcccgacg ctgccccgcg
6480cgcacttttc tcagtggttc gcgtggtcct tgtggatgtg tgaggcgccc ggttgtgccc
6540tcacgtgttt cactttggtc gtgtctcgct tgaccatgtt cccagagtcg gtggatgtgg
6600ccggtggcgt tgcataccct tcccgtctgg tgtgtgcacg cgctgtttct tgtaagcgtc
6660gaggtgctcc tggagcgttc caggtttgtc tcctaggtgc ctgcttctga gctggtggtg
6720gcgctcccca ttccctggtg tgcctccggt gctccgtctg gctgtgtgcc ttcccgtttg
6780tgtctgagaa gcccgtgaga ggggggtcga ggagagaagg aggggcaaga ccccccttct
6840tcgtcgggtg aggcgcccac cccgcgacta gtacgcctgt gcgtagggct ggtgctgagc
6900ggtcgcggct ggggttggaa agtttctcga gagactcatt gctttcccgt ggggagcttt
6960gagaggcctg gctttcgggg gggaccggtt gcagggtctc ccctgtccgc ggatgctcag
7020aatgcccttg gaagagaacc ttcctgttgc cgcagacccc cccgcgcggt cgcccgcgtg
7080ttggtcttct ggtttccctg tgtgctcgtc gcatgcatcc tctctcggtg gccggggctc
7140gtcggggttt tgggtccgtc ccgccctcag tgagaaagtt tccttctcta gctatcttcc
7200ggaaagggtg cgggcttctt acggtctcga ggggtctctc ccgaatggtc ccctggaggg
7260ctcgccccct gaccgcctcc cgcgcgcgca gcgtttgctc tctcgtctac cgcggcccgc
7320ggcctccccg ctccgagttc ggggagggat cacgcggggc agagcctgtc tgtcgtcctg
7380ccgttgctgc ggagcatgtg gctcggcttg tgtggttggt ggctggggag agggctccgt
7440gcacaccccc gcgtgcgcgt actttcctcc cctcctgagg gccgccgtgc ggacggggtg
7500tgggtaggcg acggtgggct cccgggtccc cacccgtctt cccgtgcctc acccgtgcct
7560tccgtcgcgt gcgtccctct cgctcgcgtc cacgactttg gccgctcccg cgacggcggc
7620ctgcgccgcg cgtggtgcgt gctgtgtgct tctcgggctg tgtggttgtg tcgcctcgcc
7680ccccccttcc cgcggcagcg ttcccacggc tggcgaaatc gcgggagtcc tccttcccct
7740cctcggggtc gagagggtcc gtgtctggcg ttgattgatc tcgctctcgg ggacgggacc
7800gttctgtggg agaacggctg ttggccgcgt ccggcgcgac gtcggacgtg gggacccact
7860gccgctcggg ggtcttcgtc ggtaggcatc ggtgtgtcgg catcggtctc tctctcgtgt
7920cggtgtcgcc tcctcgggct cccggggggc cgtcgtgttt cgggtcggct cggcgctgca
7980ggtgtggtgg gactgctcag gggagtggtg cagtgtgatt cccgccggtt ttgcctcgcg
8040tgccctgacc ggtccgacgc ccgagcggtc tctcggtccc ttgtgaggac ccccttccgg
8100gaggggcccg tttcggccgc ccttgccgtc gtcgccggcc ctcgttctgc tgtgtcgttc
8160ccccctcccc gctcgccgca gccggtcttt tttcctctct ccccccctct cctctgactg
8220acccgtggcc gtgctgtcgg accccccgca tgggggcggc cgggcacgta cgcgtccggg
8280cggtcaccgg ggtcttgggg gggggccgag gggtaagaaa gtcggctcgg cgggcgggag
8340gagctgtggt ttggagggcg tcccggcccc gcggccgtgg cggtgtcttg cgcggtcttg
8400gagagggctg cgtgcgaggg gaaaaggttg ccccgcgagg gcaaagggaa agaggctagc
8460agtggtcatt gtcccgacgg tgtggtggtc tgttggccga ggtgcgtctg gggggctcgt
8520ccggccctgt cgtccgtcgg gaaggcgcgt gttggggcct gccggagtgc cgaggtgggt
8580accctggcgg tgggattaac cccgcgcgcg tgtcccggtg tggcggtggg ggctccggtc
8640gatgtctacc tccctctccc cgaggtctca ggccttctcc gcgcgggctc tcggccctcc
8700cctcgttcct ccctctcgcg gggttcaagt cgctcgtcga cctcccctcc tccgtccttc
8760catctctcgc gcaatggcgc cgcccgagtt cacggtgggt tcgtcctccg cctccgcttc
8820tcgccggggg ctggccgctg tccggtctct cctgcccgac ccccgttggc gtggtcttct
8880ctcgccggct tcgcggactc ctggcttcgc ccggagggtc agggggcttc ccggttcccc
8940gacgttgcgc ctcgctgctg tgtgcttggg gggggcccgc tgcggcctcc gcccgcccgt
9000gagcccctgc cgcacccgcc ggtgtgcggt ttcgcgccgc ggtcagttgg gccctggcgt
9060tgtgtcgcgt cgggagcgtg tccgcctcgc ggcggctaga cgcgggtgtc gccgggctcc
9120gacgggtggc ctatccaggg ctcgcccccg ccgacccccg cctgcccgtc ccggtggtgg
9180tcgttggtgt ggggagtgaa tggtgctacc ggtcattccc tcccgcgtgg tttgactgtc
9240tcgccggtgt cgcgcttctc tttccgccaa cccccacgcc aacccaccac cctgctctcc
9300cggcccggtg cggtcgacgt tccggctctc ccgatgccga ggggttcggg atttgtgccg
9360gggacggagg ggagagcggg taagagaggt gtcggagagc tgtcccgggg cgacgctcgg
9420gttggctttg ccgcgtgcgt gtgctcgcgg acgggttttg tcggaccccg acggggtcgg
9480tccggccgca tgcactctcc cgttccgcgc gagcgcccgc ccggctcacc cccggtttgt
9540cctcccgcga ggctctccgc cgccgccgcc tcctcctcct ctctcgcgct ctctgtcccg
9600cctggtcctg tcccaccccc gacgctccgc tcgcgcttcc ttacctggtt gatcctgcca
9660ggtagcatat gcttgtctca aagattaagc catgcatgtc taagtacgca cggccggtac
9720agtgaaactg cgaatggctc attaaatcag ttatggttcc tttggtcgct cgctcctctc
9780ctacttggat aactgtggta attctagagc taatacatgc cgacgggcgc tgacccccct
9840tcccgggggg ggatgcgtgc atttatcaga tcaaaaccaa cccggtgagc tccctcccgg
9900ctccggccgg gggtcgggcg ccggcggctt ggtgactcta gataacctcg ggccgatcgc
9960acgccccccg tggcggcgac gacccattcg aacgtctgcc ctatcaactt tcgatggtag
10020tcgccgtgcc taccatggtg accacgggtg acggggaatc agggttcgat tccggagagg
10080gagcctgaga aacggctacc acatccaagg aaggcagcag gcgcgcaaat tacccactcc
10140cgacccgggg aggtagtgac gaaaaataac aatacaggac tctttcgagg ccctgtaatt
10200ggaatgagtc cactttaaat cctttaacga ggatccattg gagggcaagt ctggtgccag
10260cagccgcggt aattccagct ccaatagcgt atattaaagt tgctgcagtt aaaaagctcg
10320tagttggatc ttgggagcgg gcgggcggtc cgccgcgagg cgagtcaccg cccgtccccg
10380ccccttgcct ctcggcgccc cctcgatgct cttagctgag tgtcccgcgg ggcccgaagc
10440gtttactttg aaaaaattag agtgttcaaa gcaggcccga gccgcctgga taccgcagct
10500aggaataatg gaataggacc gcggttctat tttgttggtt ttcggaactg aggccatgat
10560taagagggac ggccgggggc attcgtattg cgccgctaga ggtgaaattc ttggaccggc
10620gcaagacgga ccagagcgaa agcatttgcc aagaatgttt tcattaatca agaacgaaag
10680tcggaggttc gaagacgatc agataccgtc gtagttccga ccataaacga tgccgactgg
10740cgatgcggcg gcgttattcc catgacccgc cgggcagctt ccgggaaacc aaagtctttg
10800ggttccgggg ggagtatggt tgcaaagctg aaacttaaag gaattgacgg aagggcacca
10860ccaggagtgg gcctgcggct taatttgact caacacggga aacctcaccc ggcccggaca
10920cggacaggat tgacagattg atagctcttt ctcgattccg tgggtggtgg tgcatggccg
10980ttcttagttg gtggagcgat ttgtctggtt aattccgata acgaacgaga ctctggcatg
11040ctaactagtt acgcgacccc cgagcggtcg gcgtccccca acttcttaga gggacaagtg
11100gcgttcagcc acccgagatt gagcaataac aggtctgtga tgcccttaga tgtccggggc
11160tgcacgcgcg ctacactgac tggctcagcg tgtgcctacc ctgcgccggc aggcgcgggt
11220aacccgttga accccattcg tgatggggat cggggattgc aattattccc catgaacgag
11280gaattcccag taagtgcggg tcataagctt gcgttgatta agtccctgcc ctttgtacac
11340accgcccgtc gctactaccg attggatggt ttagtgaggc cctcggatcg gccccgccgg
11400ggtcggccca cggccctggc ggagcgctga gaagacggtc gaacttgact atctagagga
11460agtaaaagtc gtaacaaggt ttccgtaggt gaacctgcgg aaggatcatt aaacgggaga
11520ctgtggagga gcggcggcgt ggcccgctct ccccgtcttg tgtgtgtcct cgccgggagg
11580cgcgtgcgtc ccgggtcccg tcgcccgcgt gtggagcgag gtgtctggag tgaggtgaga
11640gaaggggtgg gtggggtcgg tctgggtccg tctgggaccg cctccgattt cccctccccc
11700tcccctctcc ctcgtccggc tctgacctcg ccaccctacc gcggcggcgg ctgctcgcgg
11760gcgtcttgcc tctttcccgt ccggctcttc cgtgtctacg aggggcggta cgtcgttacg
11820ggtttttgac ccgtcccggg ggcgttcggt cgtcggggcg cgcgctttgc tctcccggca
11880cccatccccg ccgcggctct ggcttttcta cgttggctgg ggcggttgtc gcgtgtgggg
11940ggatgtgagt gtcgcgtgtg ggctcgcccg tcccgatgcc acgcttttct ggcctcgcgt
12000gtcctccccg ctcctgtccc gggtacctag ctgtcgcgtt ccggcgcgga ggtttaagga
12060ccccgggggg gtcgccctgc cgcccccagg gtcggggggc ggtggggccc gtagggaagt
12120cggtcgttcg ggcggctctc cctcagactc catgaccctc ctccccccgc tgccgccgtt
12180cccgaggcgg cggtcgtgtg ggggggtgga tgtctggagc cccctcgggc gccgtggggg
12240cccgacccgc gccgccggct tgcccgattt ccgcgggtcg gtcctgtcgg tgccggtcgt
12300gggttcccgt gtcgttcccg tgtttttccg ctcccgaccc tttttttttc ctccccccca
12360cacgtgtctc gtttcgttcc tgctggccgg cctgaggcta cccctcggtc catctgttct
12420cctctctctc cggggagagg agggcggtgg tcgttggggg actgtgccgt cgtcagcacc
12480cgtgagttcg ctcacacccg aaataccgat acgactctta gcggtggatc actcggctcg
12540tgcgtcgatg aagaacgcag ctagctgcga gaattaatgt gaattgcagg acacattgat
12600catcgacact tcgaacgcac ttgcggcccc gggttcctcc cggggctacg cctgtctgag
12660cgtcggttga cgatcaatcg cgtcacccgc tgcggtgggt gctgcgcggc tgggagtttg
12720ctcgcagggc caacccccca acccgggtcg ggccctccgt ctcccgaagt tcagacgtgt
12780gggcggttgt cggtgtggcg cgcgcgcccg cgtcgcggag cctggtctcc cccgcgcatc
12840cgcgctcgcg gcttcttccc gctccgccgt tcccgccctc gcccgtgcac cccggtcctg
12900gcctcgcgtc ggcgcctccc ggaccgctgc ctcaccagtc tttctcggtc ccgtgccccg
12960tgggaaccca ccgcgccccc gtggcgcccg ggggtgggcg cgtccgcatc tgctctggtc
13020gaggttggcg gttgagggtg tgcgtgcgcc gaggtggtgg tcggtcccct gcggccgcgg
13080ggttgtcggg gtggcggtcg acgagggccg gtcggtcgcc tgcggtggtt gtctgtgtgt
13140gtttgggtct tgcgctgggg gaggcggggt cgaccgctcg cggggttggc gcggtcgccc
13200ggcgccgcgc accctccggc ttgtgtggag ggagagcgag ggcgagaacg gagagaggtg
13260gtatccccgg tggcgttgcg agggagggtt tggcgtcccg cgtccgtccg tccctccctc
13320cctcggtggg cgccttcgcg ccgcacgcgg ccgctagggg cggtcggggc ccgtggcccc
13380cgtggctctt cttcgtctcc gcttctcctt cacccgggcg gtacccgctc cggcgccggc
13440ccgcgggacg ccgcggcgtc cgtgcgccga tgcgagtcac ccccgggtgt tgcgagttcg
13500gggagggaga gggcctcgct gacccgttgc gtcccggctt ccctgggggg gacccggcgt
13560ctgtgggctg tgcgtcccgg gggttgcgtg tgagtaagat cctccacccc cgccgccctc
13620ccctcccgcc ggcctctcgg ggaccccctg agacggttcg ccggctcgtc ctcccgtgcc
13680gccgggtgcc gtctctttcc cgcccgcctc ctcgctctct tcttcccgcg gctgggcgcg
13740tgtcccccct ttctgaccgc gacctcagat cagacgtggc gacccgctga atttaagcat
13800attagtcagc ggaggaaaag aaactaacca ggattccctc agtaacggcg agtgaacagg
13860gaagagccca gcgccgaatc cccgccgcgc gtcgcggcgt gggaaatgtg gcgtacggaa
13920gacccactcc ccggcgccgc tcgtgggggg cccaagtcct tctgatcgag gcccagcccg
13980tggacggtgt gaggccggta gcggccccgg cgcgccgggc tcgggtcttc ccggagtcgg
14040gttgcttggg aatgcagccc aaagcgggtg gtaaactcca tctaaggcta aataccggca
14100cgagaccgat agtcaacaag taccgtaagg gaaagttgaa aagaactttg aagagagagt
14160tcaagagggc gtgaaaccgt taagaggtaa acgggtgggg tccgcgcagt ccgcccggag
14220gattcaaccc ggcggcgcgc gtccggccgt gcccggtggt cccggcggat ctttcccgct
14280ccccgttcct cccgacccct ccacccgcgc gtcgttcccc tcttcctccc cgcgtccggc
14340gcctccggcg gcgggcgcgg ggggtggtgt ggtggtggcg cgcgggcggg gccgggggtg
14400gggtcggcgg gggaccgccc ccggccggcg accggccgcc gccgggcgca cttccaccgt
14460ggcggtgcgc cgcgaccggc tccgggacgg ccgggaaggc ccggtgggga aggtggctcg
14520gggggggcgg cgcgtctcag ggcgcgccga accacctcac cccgagtgtt acagccctcc
14580ggccgcgctt tcgccgaatc ccggggccga ggaagccaga tacccgtcgc cgcgctctcc
14640ctctcccccc gtccgcctcc cgggcgggcg tgggggtggg ggccgggccg cccctcccac
14700ggcgcgaccg ctctcccacc cccctccgtc gcctctctcg gggcccggtg gggggcgggg
14760cggactgtcc ccagtgcgcc ccgggcgtcg tcgcgccgtc gggtcccggg gggaccgtcg
14820gtcacgcgtc tcccgacgaa gccgagcgca cggggtcggc ggcgatgtcg gctacccacc
14880cgacccgtct tgaaacacgg accaaggagt ctaacgcgtg cgcgagtcag gggctcgtcc
14940gaaagccgcc gtggcgcaat gaaggtgaag ggccccgccc gggggcccga ggtgggatcc
15000cgaggcctct ccagtccgcc gagggcgcac caccggcccg tctcgcccgc cgcgccgggg
15060aggtggagca cgagcgtacg cgttaggacc cgaaagatgg tgaactatgc ttgggcaggg
15120cgaagccaga ggaaactctg gtggaggtcc gtagcggtcc tgacgtgcaa atcggtcgtc
15180cgacctgggt ataggggcga aagactaatc gaaccatcta gtagctggtt ccctccgaag
15240tttccctcag gatagctggc gctctcgctc ccgacgtacg cagttttatc cggtaaagcg
15300aatgattaga ggtcttgggg ccgaaacgat ctcaacctat tctcaaactt taaatgggta
15360agaagcccgg ctcgctggcg tggagccggg cgtggaatgc gagtgcctag tgggccactt
15420ttggtaagca gaactggcgc tgcgggatga accgaacgcc gggttaaggc gcccgatgcc
15480gacgctcatc agaccccaga aaaggtgttg gttgatatag acagcaggac ggtggccatg
15540gaagtcggaa tccgctaagg agtgtgtaac aactcacctg ccgaatcaac tagccctgaa
15600aatggatggc gctggagcgt cgggcccata cccggccgtc gccgcagtcg gaacggaacg
15660ggacgggagc ggccgcgggt gcgcgtctct cggggtcggg ggtgcgtggc gggggcccgt
15720cccccgcctc ccctccgcgc gccgggttcg cccccgcggc gtcgggcccc gcggagccta
15780cgccgcgacg agtaggaggg ccgctgcggt gagccttgaa gcctagggcg cgggcccggg
15840tggagccgcc gcaggtgcag atcttggtgg tagtagcaaa tattcaaacg agaactttga
15900aggccgaagt ggagaagggt tccatgtgaa cagcagttga acatgggtca gtcggtcctg
15960agagatgggc gagtgccgtt ccgaagggac gggcgatggc ctccgttgcc ctcggccgat
16020cgaaagggag tcgggttcag atccccgaat ccggagtggc ggagatgggc gccgcgaggc
16080cagtgcggta acgcgaccga tcccggagaa gccggcggga ggcctcgggg agagttctct
16140tttctttgtg aagggcaggg cgccctggaa tgggttcgcc ccgagagagg ggcccgtgcc
16200ttggaaagcg tcgcggttcc ggcggcgtcc ggtgagctct cgctggccct tgaaaatccg
16260ggggagaggg tgtaaatctc gcgccgggcc gtacccatat ccgcagcagg tctccaaggt
16320gaacagcctc tggcatgttg gaacaatgta ggtaagggaa gtcggcaagc cggatccgta
16380acttcgggat aaggattggc tctaagggct gggtcggtcg ggctggggcg cgaagcgggg
16440ctgggcgcgc gccgcggctg gacgaggcgc cgccgccctc tcccacgtcc ggggagaccc
16500cccgtccttt ccgcccgggc ccgccctccc ctcttccccg cggggccccg tcgtcccccg
16560cgtcgtcgcc acctctcttc ccccctcctt cttcccgtcg gggggcgggt cgggggtcgg
16620cgcgcggcgc gggctccggg gcggcgggtc caaccccgcg ggggttccgg agcgggagga
16680accagcggtc cccggtgggg cggggggccc ggacactcgg ggggccggcg gcggcggcga
16740ctctggacgc gagccgggcc cttcccgtgg atcgcctcag ctgcggcggg cgtcgcggcc
16800gctcccgggg agcccggcgg gtgccggcgc gggtcccctc cccgcggggc ctcgctccac
16860ccccccatcg cctctcccga ggtgcgtggc gggggcgggc gggcgtgtcc cgcgcgtgtg
16920gggggaacct ccgcgtcggt gttcccccgc cgggtccgcc ccccgggccg cggttttccg
16980cgcggcgccc ccgcctcggc cggcgcctag cagccgactt agaactggtg cggaccaggg
17040gaatccgact gtttaattaa aacaaagcat cgcgaaggcc cgcggcgggt gttgacgcga
17100tgtgatttct gcccagtgct ctgaatgtca aagtgaagaa attcaatgaa gcgcgggtaa
17160acggcgggag taactatgac tctcttaagg tagccaaatg cctcgtcatc taattagtga
17220cgcgcatgaa tggatgaacg agattcccac tgtccctacc tactatccag cgaaaccaca
17280gccaagggaa cgggcttggc ggaatcagcg gggaaagaag accctgttga gcttgactct
17340agtctggcac ggtgaagaga catgagaggt gtagaataag tgggaggccc ccggcgcccg
17400gccccgtcct cgcgtcgggg tcggggcacg ccggcctcgc gggccgccgg tgaaatacca
17460ctactctcat cgttttttca ctgacccggt gaggcggggg ggcgagcccc gaggggctct
17520cgcttctggc gccaagcgtc cgtcccgcgc gtgcgggcgg gcgcgacccg ctccggggac
17580agtgccaggt ggggagtttg actggggcgg tacacctgtc aaacggtaac gcaggtgtcc
17640taaggcgagc tcagggagga cagaaacctc ccgtggagca gaagggcaaa agctcgcttg
17700atcttgattt tcagtacgaa tacagaccgt gaaagcgggg cctcacgatc cttctgacct
17760tttgggtttt aagcaggagg tgtcagaaaa gttaccacag ggataactgg cttgtggcgg
17820ccaagcgttc atagcgacgt cgctttttga tccttcgatg tcggctcttc ctatcattgt
17880gaagcagaat tcaccaagcg ttggattgtt cacccactaa tagggaacgt gagctgggtt
17940tagaccgtcg tgagacaggt tagttttacc ctactgatga tgtgttgttg ccatggtaat
18000cctgctcagt acgagaggaa ccgcaggttc agacatttgg tgtatgtgct tggctgagga
18060gccaatgggg cgaagctacc atctgtggga ttatgactga acgcctctaa gtcagaatcc
18120gcccaagcgg aacgatacgg cagcgccgaa ggagcctcgg ttggccccgg atagccgggt
18180ccccgtccgt cccgctcggc ggggtccccg cgtcgccccg cggcggcgcg gggtctcccc
18240ccgccgggcg tcgggaccgg ggtccggtgc ggagagccgt tcgtcttggg aaacggggtg
18300cggccggaaa gggggccgcc ctctcgcccg tcacgttgaa cgcacgttcg tgtggaacct
18360ggcgctaaac cattcgtaga cgacctgctt ctgggtcggg gtttcgtacg tagcagagca
18420gctccctcgc tgcgatctat tgaaagtcag ccctcgacac aagggtttgt ctctgcgggc
18480tttcccgtcg cacgcccgct cgctcgcacg cgaccgtgtc gccgcccggg cgtcacgggg
18540gcggtcgcct cggcccccgc gcggttgccc gaacgaccgt gtggtggttg ggggggggat
18600cgtcttctcc tccgtctccc gaggacggtt cgtttctctt tccccttccg tcgctctcct
18660tgggtgtggg agcctcgtgc cgtcgcgacc gcggcctgcc gtcgcctgcc gccgcagccc
18720cttgccctcc ggccttggcc aagccggagg gcggaggagg gggatcggcg gcggcggcga
18780ccgcggcgcg gtgacgcacg gtgggatccc catcctcggc gcgtccgtcg gggacggccg
18840gttggagggg cgggaggggt ttttcccgtg aacgccgcgt tcggcgccag gcctctggcg
18900gccggggggg cgctctctcc gcccgagcat ccccactccc gcccctcctc ttcgcgcgcc
18960gcggcggcga cgtgcgtacg aggggaggat gtcgcggtgt ggaggcggag agggtccggc
19020gcggcgcctc ttccattttt tcccccccaa cttcggaggt cgaccagtac tccgggcgac
19080actttgtttt ttttttttcc cccgatgctg gaggtcgacc agatgtccga aagtgtcccc
19140cccccccccc ccccccggcg cggagcggcg gggccactct ggactctttt tttttttttt
19200tttttttttt ttaaattcct ggaaccttta ggtcgaccag ttgtccgtct tttactcctt
19260catataggtc gaccagtact ccgggtggta ctttgtcttt ttctgaaaat cccagaggtc
19320gaccagatat ccgaaagtcc tctctttccc tttactcttc cccacagcga ttctcttttt
19380tttttttttt tttggtgtgc ctctttttga cttatataca tgtaaatagt gtgtacgttt
19440atatacttat aggaggaggt cgaccagtac tccgggcgac actttgtttt tttttttttt
19500tccaccgatg atggaggtcg accagatgtc cgaaagtgtc ccgtcccccc cctccccccc
19560ccgcgacgcg gcgggctcac tctggactct tttttttttt tttttttttt tttaaatttc
19620tggaacctta aggtcgacca gttgtccgtc tttcactcat tcatataggt cgaccggtgg
19680tactttgtct ttttctgaaa atcgcagagg tcgaccagat gtcagaaagt ctggtggtcg
19740ataaattatc tgatctagat ttgtttttct gtttttcagt tttgtgttgt tttgtgttgt
19800tttgtgttgt tttgttttgt tttgttttgt tttgttttgt tttgttttgt tttgttttgt
19860tttgtgttgt gttgtgttgt gttgtgttgg gttgggttgg gttgggttgg gttgggttgg
19920gttgggttgg gttgggttgt gttgtttggt tttgtgttgt ttggtgttgt tggttttgtt
19980ttgtttgctg ttgttttgtg ttttgcgggt cgaacagttg tccctaaccg agtttttttg
20040tacacaaaca tgcacttttt ttaaaataaa tttttaaaat aaatgcgaaa atcgaccaat
20100tatccctttc cttctctctc ttttttaaaa attttctttg tgtgtgtgtg tgtgtgtgtg
20160tgtgtgtgtg tgcgtgtgtg tgtgtgtgtg cgtgcagcgt gcgcgcgctc gttttataaa
20220tacttataat aataggtcgc cgggtggtgg tagcttcccg gactccagag gcagaggcag
20280gcagacttct gagttcgagg ccagcctggt ctacagagga accctgtctc gaaaaatgaa
20340aataaataca tacatacata catacataca tacatacata catacataca tacatatgag
20400gttgaccagt tgtcaatcct ttagaatttt gtttttaatt aatgtgatag agagatagat
20460aatagataga tggatagagt gatacaaata taggtttttt tttcagtaaa tatgaggttg
20520attaaccact tttccctttt taggtttttt tttttttccc ctgtccatgt ggttgctggg
20580atttgaactc aggaccctgg caggtcaact ggaaaacgtg ttttctatat atataaatag
20640tggtctgtct gctgtttgtt tgtttgcttg cttgcttgct tgcttgcttg cttgcttgct
20700tgcttttttt tttcttctga gacagtattt ctctgtgtaa cctggtgccc tgaaactcac
20760tctgtagacc agcctggcct caatcgaact cagaaatcct cctgcctctt gtctacctcc
20820caattttgga gtaaaggtgt gctacaccac tgcctggcat tattatcatt atcattatta
20880attttattat tagacagaac gaaatcaact agttggtcct gtttcgttaa ttcatttgaa
20940attagttgga ccaattagtt ggctggtttg ggaggtttct tttgtttccg atttgggtgt
21000ttgtggggct ggggatcagg tatctcaacg gaatgcatga aggttaaggt gagatggctc
21060gatttttgta aagattactt ttcttagtct gaggaaaaaa taaaataata ttgggctacg
21120tttcattgct tcatttctat ttctctttct ttctttcttt ctttcagata aggaggtcgg
21180ccagttcctc ctgccttctg gaagatgtag gcattgcatt gggaaaagca ttgtttgaga
21240gatgtgctag tgaaccagag agtttggatg tcaagccgta taatgtttat tacaatatag
21300aaaagttcta acaaagtgat ctttaacttt tttttttttt tttctccttc tacttctact
21360tgttctcact ctgccaccaa cgcgctttgt acattgaatg tgagctttgt tttgcttaac
21420agacatatat tttttctttt ggttttgctt gacatggttt ccctttctat ccgtgcaggg
21480ttcccagacg gccttttgag aataaaatgg gaggccagaa ccaaagtctt ttgaataaag
21540caccacaact ctaacctgtt tggctgtttt ccttcccaag gcacagatct ttcccagcat
21600ggaaaagcat gtagcagttg taggacacac tagacgagag caccagatct cattgtgggt
21660ggttgtgaac cacccaccat gtggttgcct gggatttgaa ctcaggatct tcagaagacg
21720agtcagggct ctaaaccgat gagccatctc tccagccctc ctacattcct tcttaaggca
21780tgaatgatcc cagcatggga agacagtctg ccctctttgt ggtatatcac catatactca
21840ataaaataat gaaatgaatg aagtctccac gtatttattt cttcgagcta tctaaattct
21900ctcacagcac ctccccctcc cccacactgc ctttctccct atgtttgggt ggggctgggg
21960gaggggtggg gtgggggcag ggatctgcat gtcttcttgc aggtctgtga actatttgcg
22020atggcctggt tctctgaact gttgagcctt gtctatccag aggctgactg gctagttttc
22080tacctgaagt ccctgagtga tgatttccct gtgaattc
22118
SEQ ID NO: 19; length: 175; type: DNA; organism: Mus musculus
ctcccgcgcg gcccccgtgt tcgccgttcc cgtggcgcgg
acaatgcggt tgtgcgtcca 60cgtgtgcgtg tccgtgcagt gccgttgtgg agtgcctcgc
tctcctcctc ctccccggca 120gcgttcccac ggttggggac caccggtgac ctcgccctct
tcgggcctgg atccg 175
SEQ ID NO: 20; length: 755; type: DNA; organism: Mus musculus
ggtctggtgg gaattgttga
cctcgctctc gggtgcggcc tttggggaac ggcggggtcg 60gtcgtgcccg gcgccggacg
tgtgtcgggg cccacttccc gctcgagggt ggcggtggcg 120gcggcgttgg tagtctcccg
tgttgcgtct tcccgggctc ttgggggggg tgccgtcgtt 180ttcggggccg gcgttgcttg
gcttacgcag gcttggtttg ggactgcctc aggagtcgtg 240ggcggtgtga ttcccgccgg
ttttgcctcg cgtctgcctg ctttgcctcg ggtttgcttg 300gttcgtgtct cgggagcggt
ggtttttttt tttttcgggt cccggggaga ggggtttttc 360cgggggacgt tcccgtcgcc
ccctgccgcc ggtgggtttt cgtttcgggc tgtgttcgtt 420tccccttccc cgtttcgccg
tcggttctcc ccggtcggtc ggccctctcc ccggtcggtc 480gcccggccgt gctgccggac
ccccccttct gggggggatg cccgggcacg cacgcgtccg 540ggcggccact gtggtccggg
agctgctcgg caggcgggtg agccagttgg aggggcgtca 600tgcccccgcg ggctcccgtg
gccgacgcgg cgtgttcttt gggggggcct gtgcgtgcgg 660gaaggctgcg cacgttgtcg
gtccttgcga gggaaagagg cttttttttt ttagggggtc 720gtccttcgtc gtcccgtcgg
cggtggatcc ggcct 755
SEQ ID NO: 21; length: 463; type: DNA; organism: Mus musculus
ggccgaggtg cgtctgcggg ttggggctcg tccggccccg tcgtcctccg ggaaggcgtt
60tagcgggtac cgtcgccgcg ccgaggtggg cgcacgtcgg tgagataacc ccgagcgtgt
120ttctggttgt tggcggcggg ggctccggtc gatgtcttcc cctccccctc tccccgaggc
180caggtcagcc tccgcctgtg ggcttcgtcg gccgtctccc cccccctcac gtccctcgcg
240agcgagcccg tccgttcgac cttccttccg ccttcccccc atctttccgc gctccgttgg
300ccccggggtt ttcacggcgc cccccacgct cctccgcctc tccgcccgtg gtttggacgc
360ctggttccgg tctccccgcc aaaccccggt tgggttggtc tccggccccg gcttgctctt
420cgggtctccc aacccccggc cggaagggtt cgggggttcc ggg
463
SEQ ID NO: 22; length: 378; type: DNA; organism: Mus musculus
ggattcttca ggattgaaac ccaaaccggt tcagtttcct
ttccggctcc ggccgggggg 60ggcggccccg ggcggtttgg tgagttagat aacctcgggc
cgatcgcacg ccccccgtgg 120cggcgacgac ccattcgaac gtctgcccta tcaactttcg
atggtagtcg atgtgcctac 180catggtgacc acgggtgacg gggaatcagg gttcgattcc
ggagagggag cctgagaaac 240ggctaccaca tccaaggaag gcagcaggcg cgcaaattac
ccactcccga cccggggagg 300tagtgacgaa aaataacaat acaggactct ttcgaggccc
tgtaattgga atgagtccac 360tttaaatcct ttaagcag
378
SEQ ID NO: 23; length: 378; type: DNA; organism: Mus musculus
gatccattgg agggcaagtc
tggtgccagc agccgcggta attccagctc caatagcgta 60tattaaagtt gctgcagtta
aaaagctcgt agttggatct tgggagcggg cgggcggtcc 120gccgcgaggc gagtcaccgc
ccgtccccgc cccttgcctc tcggcgcccc ctcgatgctc 180ttagctgagt tgtcccgcgg
ggcccgaagc gtttactttg aaaaaattag agttgtttca 240aagcaggccc gagccgcctg
gataccgcca gctaggaaat aatggaatag gaccgcggtt 300cctattttgt ttggttttcg
gaactgagcc catgattaag ggaaacggcc gggggcattc 360ccttattgcg ccccccta
378
SEQ ID NO: 24; length: 719; type: DNA; organism: Mus musculus
ggatctttcc cgctccccgt tcctcccggc ccctccaccc gcgcgtctcc ccccttcttt
60tcccctctcc ggaggggggg gaggtggggg cgcgtgggcg gggtcggggg tggggtcggc
120gggggaccgc ccccggccgg caaaaggccg ccgccgggcg cacttcaacc gtagcggtgc
180gccgcgaccg gctacgagac ggctgggaag gcccgacggg gaatgtggct cggggggggc
240ggcgcgtctc agggcgcgcc gaaccacctc accccgagtg ttacagccct ccggccgcgc
300tttcgcggaa tcccggggcc gaggggaagc ccgatacccg tcgccgcgct tttcccctcc
360ccccgtccgc ctcccgggcg ggcgtggggg tgggggccgg gccgcccctc ccacgcccgt
420ggtttctctc tctcccggtc tcggccggtt tggggggggg agcccggttg ggggcggggc
480ggactgtcct cagtgcgccc cgggcgtcgt cgcgccgtcg ggcccggggg gttctctcgg
540tcacgccgcc cccgacgaag ccgagcgcac ggggtcggcg gcgatgtcgg ctacccaccc
600gacccgtctt gaaacacgga ccaaggagtc taacgcgtgc gcgagtcagg ggctcgcacg
660aaagccgccg tggcgcaatg aaggtgaagg gccccgtccg ggggcccgag gtgggatcc
719
SEQ ID NO: 25; length: 685; type: DNA; organism: Mus musculus
cgaggcctct ccagtccgcc gagggcgcac caccggcccg
tctcgcccgc cgcgtcgggg 60aggtggagca cgagcgtacg cgttaggacc cgaaagatgg
tgaactatgc ctgggcaggg 120cgaagccaga ggaaactctg gtggaggtcc gtagcggtcc
tgacgtgcaa atcggtcgtc 180cgacctgggt ataggggcga aagactaatc gaaccatcta
gtagctggtt ccctccgaag 240tttccctcag gatagctggc gctctcgcaa ccttcggaag
cagttttatc cgggtaaagg 300cggaatggat taggaggtct tggggccgga aacgatctca
aactatttct caaactttaa 360atgggtaagg aagcccggct cgctggcgtg gagccgggcg
tggaatgcga gtgcctagtg 420ggccactttt ggtaagcaga actggcgctg cgggatgaac
cgaacgccgg gttaaggcgc 480ccgatgccga cgctcatcag accccagaaa aggtgttggt
tgatatagac agcaggacgg 540tggccatgga agtcggaatc cgctaaggag tgtgtaacaa
ctcacctgcc gaatcaacta 600gccctgaaaa tggatggcgc tggagcgtcg ggcccatacc
cggccgtcgc cggcagtcgg 660aacgggacgg gacgggagcg gccgc
685
SEQ ID NO: 26; length: 5162; type: DNA; organism: Artificial Sequence; description: Chimeric bacterial plasmid
gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc
tgctctgatg 60ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct
gagtagtgcg 120cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg
aagaatctgc 180ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg
cgttgacatt 240gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat
agcccatata 300tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg
cccaacgacc 360cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata
gggactttcc 420attgacgtca atgggtggac tatttacggt aaactgccca cttggcagta
catcaagtgt 480atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc
gcctggcatt 540atgcccagta catgacctta tgggactttc ctacttggca gtacatctac
gtattagtca 600tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga
tagcggtttg 660actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg
ttttggcacc 720aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg
caaatgggcg 780gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact
agagaaccca 840ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa
gcttggtacc 900gagctcggat cgatatctgc ggccgcgtcg acggaattca gtggatccac
tagtaacggc 960cgccagtgtg ctggaattaa ttcgctgtct gcgagggcca gctgttgggg
tgagtactcc 1020ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa
cgaggaggat 1080ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat
ctggtcagaa 1140aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg
gccatacact 1200tgagtgacaa tgacatccac tttgcctttc tctccacagg tgtccactcc
caggtccaac 1260tgcaggtcga gcatgcatct agggcggcca attccgcccc tctccctccc
ccccccctaa 1320cgttactggc cgaagccgct tggaataagg ccggtgtgcg tttgtctata
tgtgattttc 1380caccatattg ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg
tcttcttgac 1440gagcattcct aggggtcttt cccctctcgc caaaggaatg caaggtctgt
tgaatgtcgt 1500gaaggaagca gttcctctgg aagcttcttg aagacaaaca acgtctgtag
cgaccctttg 1560caggcagcgg aaccccccac ctggcgacag gtgcctctgc ggccaaaagc
cacgtgtata 1620agatacacct gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga
tagttgtgga 1680aagagtcaaa tggctctcct caagcgtatt caacaagggg ctgaaggatg
cccagaaggt 1740accccattgt atgggatctg atctggggcc tcggtgcaca tgctttacat
gtgtttagtc 1800gaggttaaaa aaacgtctag gccccccgaa ccacggggac gtggttttcc
tttgaaaaac 1860acgatgataa gcttgccaca acccgggatc caccggtcgc caccatggtg
agcaagggcg 1920aggagctgtt caccggggtg gtgcccatcc tggtcgagct ggacggcgac
gtaaacggcc 1980acaagttcag cgtgtccggc gagggcgagg gcgatgccac ctacggcaag
ctgaccctga 2040agttcatctg caccaccggc aagctgcccg tgccctggcc caccctcgtg
accaccctga 2100cctacggcgt gcagtgcttc agccgctacc ccgaccacat gaagcagcac
gacttcttca 2160agtccgccat gcccgaaggc tacgtccagg agcgcaccat cttcttcaag
gacgacggca 2220actacaagac ccgcgccgag gtgaagttcg agggcgacac cctggtgaac
cgcatcgagc 2280tgaagggcat cgacttcaag gaggacggca acatcctggg gcacaagctg
gagtacaact 2340acaacagcca caacgtctat atcatggccg acaagcagaa gaacggcatc
aaggtgaact 2400tcaagatccg ccacaacatc gaggacggca gcgtgcagct cgccgaccac
taccagcaga 2460acacccccat cggcgacggc cccgtgctgc tgcccgacaa ccactacctg
agcacccagt 2520ccgccctgag caaagacccc aacgagaagc gcgatcacat ggtcctgctg
gagttcgtga 2580ccgccgccgg gatcactctc ggcatggacg agctgtacaa gtaaagcggc
cctagagctc 2640gctgatcagc ctcgactgtg cctctagttg ccagccatct gttgtttgcc
cctcccccgt 2700gccttccttg accctggaag gtgccactcc cactgtcctt tcctaataaa
atgaggaaat 2760tgcatcgcat tgtctgagta ggtgtcattc tattctgggg ggtggggtgg
ggcaggacag 2820caagggggag gattgggaag acaatagcag gcatgctggg gatgcggtgg
gctctatggc 2880ttctgaggcg gaaagaacca gctggggctc gagtgcattc tagttgtggt
ttgtccaaac 2940tcatcaatgt atcttatcat gtctgtatac cgtcgacctc tagctagagc
ttggcgtaat 3000catggtcata gctgtttcct gtgtgaaatt gttatccgct cacaattcca
cacaacatac 3060gagccggaag cataaagtgt aaagcctggg gtgcctaatg agtgagctaa
ctcacattaa 3120ttgcgttgcg ctcactgccc gctttccagt cgggaaacct gtcgtgccag
ctgcattaat 3180gaatcggcca acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc
gcttcctcgc 3240tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct
cactcaaagg 3300cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg
tgagcaaaag 3360gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc
cataggctcc 3420gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga
aacccgacag 3480gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct
cctgttccga 3540ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg
gcgctttctc 3600aatgctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag
ctgggctgtg 3660tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat
cgtcttgagt 3720ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac
aggattagca 3780gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac
tacggctaca 3840ctagaaggac agtatttggt atctgcgctc tgctgaagcc agttaccttc
ggaaaaagag 3900ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt
tttgtttgca 3960agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc
ttttctacgg 4020ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg
agattatcaa 4080aaaggatctt cacctagatc cttttaaatt aaaaatgaag ttttaaatca
atctaaagta 4140tatatgagta aacttggtct gacagttacc aatgcttaat cagtgaggca
cctatctcag 4200cgatctgtct atttcgttca tccatagttg cctgactccc cgtcgtgtag
ataactacga 4260tacgggaggg cttaccatct ggccccagtg ctgcaatgat accgcgagac
ccacgctcac 4320cggctccaga tttatcagca ataaaccagc cagccggaag ggccgagcgc
agaagtggtc 4380ctgcaacttt atccgcctcc atccagtcta ttaattgttg ccgggaagct
agagtaagta 4440gttcgccagt taatagtttg cgcaacgttg ttgccattgc tacaggcatc
gtggtgtcac 4500gctcgtcgtt tggtatggct tcattcagct ccggttccca acgatcaagg
cgagttacat 4560gatcccccat gttgtgcaaa aaagcggtta gctccttcgg tcctccgatc
gttgtcagaa 4620gtaagttggc cgcagtgtta tcactcatgg ttatggcagc actgcataat
tctcttactg 4680tcatgccatc cgtaagatgc ttttctgtga ctggtgagta ctcaaccaag
tcattctgag 4740aatagtgtat gcggcgaccg agttgctctt gcccggcgtc aatacgggat
aataccgcgc 4800cacatagcag aactttaaaa gtgctcatca ttggaaaacg ttcttcgggg
cgaaaactct 4860caaggatctt accgctgttg agatccagtt cgatgtaacc cactcgtgca
cccaactgat 4920cttcagcatc ttttactttc accagcgttt ctgggtgagc aaaaacagga
aggcaaaatg 4980ccgcaaaaaa gggaataagg gcgacacgga aatgttgaat actcatactc
ttcctttttc 5040aatattattg aagcatttat cagggttatt gtctcatgag cggatacata
tttgaatgta 5100tttagaaaaa taaacaaata ggggttccgc gcacatttcc ccgaaaagtg
ccacctgacg 5160tc
5162
27 5627 DNA Artificial Sequence
pMG plasmid from InvivoGen; IRES sequence modified EMCV nucleotides 2736-3308
27
caccggcgaa
ggaggcctag atctatcgat tgtacagcta gctcgacatg ataagataca 60ttgatgagtt
tggacaaacc acaactagaa tgcagtgaaa aaaatgcttt atttgtgaaa 120tttgtgatgc
tattgcttta tttgtgaaat ttgtgatgct attgctttat ttgtaaccat 180tataagctgc
aataaacaag ttaacaacaa caattgcatt cattttatgt ttcaggttca 240gggggaggtg
tgggaggttt tttaaagcaa gtaaaacctc tacaaatgtg gtagatccat 300ttaaatgtta
attaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 360ggccgcgttg
ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 420acgctcaagt
cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 480tggaagctcc
ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 540ctttctccct
tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 600ggtgtaggtc
gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 660ctgcgcctta
tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 720actggcagca
gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 780gttcttgaag
tggtggccta actacggcta cactagaaga acagtatttg gtatctgcgc 840tctgctgaag
ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 900caccgctggt
agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 960atctcaagaa
gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 1020acgttaaggg
attttggtca tggctagtta attaagctgc aataaacaat cattattttc 1080attggatctg
tgtgttggtt ttttgtgtgg gcttggggga gggggaggcc agaatgactc 1140caagagctac
aggaaggcag gtcagagacc ccactggaca aacagtggct ggactctgca 1200ccataacaca
caatcaacag gggagtgagc tggatcgagc tagagtccgt tacataactt 1260acggtaaatg
gcccgcctgg ctgaccgccc aacgaccccc gcccattgac gtcaataatg 1320acgtatgttc
ccatagtaac gccaataggg actttccatt gacgtcaatg ggtggagtat 1380ttacggtaaa
ctgcccactt ggcagtacat caagtgtatc atatgccaag tacgccccct 1440attgacgtca
atgacggtaa atggcccgcc tggcattatg cccagtacat gaccttatgg 1500gactttccta
cttggcagta catctacgta ttagtcatcg ctattaccat ggtgatgcgg 1560ttttggcagt
acatcaatgg gcgtggatag cggtttgact cacggggatt tccaagtctc 1620caccccattg
acgtcaatgg gagtttgttt tggcaccaaa atcaacggga ctttccaaaa 1680tgtcgtaaca
actccgcccc attgacgcaa atgggcggta ggcgtgtacg gtgggaggtc 1740tatataagca
gagctcgttt agtgaaccgt cagatcgcct ggagacgcca tccacgctgt 1800tttgacctcc
atagaagaca ccgggaccga tccagcctcc gcggccggga acggtgcatt 1860ggaacgcgga
ttccccgtgc caagagtgac gtaagtaccg cctatagagt ctataggccc 1920acccccttgg
cttcttatgc atgctatact gtttttggct tggggtctat acacccccgc 1980ttcctcatgt
tataggtgat ggtatagctt agcctatagg tgtgggttat tgaccattat 2040tgaccactcc
cctattggtg acgatacttt ccattactaa tccataacat ggctctttgc 2100cacaactctc
tttattggct atatgccaat acactgtcct tcagagactg acacggactc 2160tgtattttta
caggatgggg tctcatttat tatttacaaa ttcacatata caacaccacc 2220gtccccagtg
cccgcagttt ttattaaaca taacgtggga tctccacgcg aatctcgggt 2280acgtgttccg
gacatgggct cttctccggt agcggcggag cttctacatc cgagccctgc 2340tcccatgcct
ccagcgactc atggtcgctc ggcagctcct tgctcctaac agtggaggcc 2400agacttaggc
acagcacgat gcccaccacc accagtgtgc cgcacaaggc cgtggcggta 2460gggtatgtgt
ctgaaaatga gctcggggag cgggcttgca ccgctgacgc atttggaaga 2520cttaaggcag
cggcagaaga agatgcaggc agctgagttg ttgtgttctg ataagagtca 2580gaggtaactc
ccgttgcggt gctgttaacg gtggagggca gtgtagtctg agcagtactc 2640gttgctgccg
cgcgcgccac cagacataat agctgacaga ctaacagact gttcctttcc 2700atgggtcttt
tctgcagtca cccgggggat ccttcgaacg tagctctaga ttgagtcgac 2760gttactggcc
gaagccgctt ggaataaggc cggtgtgcgt ttgtctatat gttattttcc 2820accatattgc
cgtcttttgg caatgtgagg gcccggaaac ctggccctgt cttcttgacg 2880agcattccta
ggggtctttc ccctctcgcc aaaggaatgc aaggtctgtt gaatgtcgtg 2940aaggaagcag
ttcctctgga agcttcttga agacaaacaa cgtctgtagc gaccctttgc 3000aggcagcgga
accccccacc tggcgacagg tgcctctgcg gccaaaagcc acgtgtataa 3060gatacacctg
caaaggcggc acaaccccag tgccacgttg tgagttggat agttgtggaa 3120agagtcaaat
ggctctcctc aagcgtattc aacaaggggc tgaaggatgc ccagaaggta 3180ccccattgta
tgggatctga tctggggcct cggtgcacat gctttacatg tgtttagtcg 3240aggttaaaaa
aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca 3300cgataatacc
atgggtaagt gatatctact agttgtgacc ggcgcctagt gttgacaatt 3360aatcatcggc
atagtatatc ggcatagtat aatacgactc actataggag ggccaccatg 3420tcgactacta
accttcttct ctttcctaca gctgagatca ccggtaggag ggccatcatg 3480aaaaagcctg
aactcaccgc gacgtctgtc gcgaagtttc tgatcgaaaa gttcgacagc 3540gtctccgacc
tgatgcagct ctcggagggc gaagaatctc gtgctttcag cttcgatgta 3600ggagggcgtg
gatatgtcct gcgggtaaat agctgcgccg atggtttcta caaagatcgt 3660tatgtttatc
ggcactttgc atcggccgcg ctcccgattc cggaagtgct tgacattggg 3720gaattcagcg
agagcctgac ctattgcatc tcccgccgtg cacagggtgt cacgttgcaa 3780gacctgcctg
aaaccgaact gcccgctgtt ctgcaacccg tcgcggagct catggatgcg 3840atcgctgcgg
ccgatcttag ccagacgagc gggttcggcc cattcggacc gcaaggaatc 3900ggtcaataca
ctacatggcg tgatttcata tgcgcgattg ctgatcccca tgtgtatcac 3960tggcaaactg
tgatggacga caccgtcagt gcgtccgtcg cgcaggctct cgatgagctg 4020atgctttggg
ccgaggactg ccccgaagtc cggcacctcg tgcacgcgga tttcggctcc 4080aacaatgtcc
tgacggacaa tggccgcata acagcggtca ttgactggag cgaggcgatg 4140ttcggggatt
cccaatacga ggtcgccaac atcttcttct ggaggccgtg gttggcttgt 4200atggagcagc
agacgcgcta cttcgagcgg aggcatccgg agcttgcagg atcgccgcgg 4260ctccgggcgt
atatgctccg cattggtctt gaccaactct atcagagctt ggttgacggc 4320aatttcgatg
atgcagcttg ggcgcagggt cgatgcgacg caatcgtccg atccggagcc 4380gggactgtcg
ggcgtacaca aatcgcccgc agaagcgcgg ccgtctggac cgatggctgt 4440gtagaagtac
tcgccgatag tggaaaccga cgccccagca ctcgtccgag ggcaaaggaa 4500tgagtcgaga
attcgctaga gggccctatt ctatagtgtc acctaaatgc tagagctcgc 4560tgatcagcct
cgactgtgcc ttctagttgc cagccatctg ttgtttgccc ctcccccgtg 4620ccttccttga
ccctggaagg tgccactccc actgtccttt cctaataaaa tgaggaaatt 4680gcatcgcatt
gtctgagtag gtgtcattct attctggggg gtggggtggg gcaggacagc 4740aagggggagg
attgggaaga caatagcagg catgcgcagg gcccaattgc tcgagcggcc 4800gcaataaaat
atctttattt tcattacatc tgtgtgttgg ttttttgtgt gaatcgtaac 4860taacatacgc
tctccatcaa aacaaaacga aacaaaacaa actagcaaaa taggctgtcc 4920ccagtgcaag
tgcaggtgcc agaacatttc tctatcgaag gatctgcgat cgctccggtg 4980cccgtcagtg
ggcagagcgc acatcgccca cagtccccga gaagttgggg ggaggggtcg 5040gcaattgaac
cggtgcctag agaaggtggc gcggggtaaa ctgggaaagt gatgtcgtgt 5100actggctccg
cctttttccc gagggtgggg gagaaccgta tataagtgca gtagtcgccg 5160tgaacgttct
ttttcgcaac gggtttgccg ccagaacaca gctgaagctt cgaggggctc 5220gcatctctcc
ttcacgcgcc cgccgcccta cctgaggccg ccatccacgc cggttgagtc 5280gcgttctgcc
gcctcccgcc tgtggtgcct cctgaactgc gtccgccgtc taggtaagtt 5340taaagctcag
gtcgagaccg ggcctttgtc cggcgctccc ttggagccta cctagactca 5400gccggctctc
cacgctttgc ctgaccctgc ttgctcaact ctacgtcttt gtttcgtttt 5460ctgttctgcg
ccgttacaga tccaagctgt gaccggcgcc tacgtaagtg atatctacta 5520gatttatcaa
aaagagtgtt gacttgtgag cgctcacaat tgatacttag attcatcgag 5580agggacacgt
cgactactaa ccttcttctc tttcctacag ctgagat
5627
28 553 DNA Artificial Sequence
pMG plasmid from InvivoGen EMCV IRES sequence
28
aacgttactg gccgaagccg cttggaataa ggccggtgtg cgtttgtcta
tatgttattt 60tccaccatat tgccgtcttt tggcaatgtg agggcccgga aacctggccc
tgtcttcttg 120acgagcattc ctaggggtct ttcccctctc gccaaaggaa tgcaaggtct
gttgaatgtc 180gtgaaggaag cagttcctct ggaagcttct tgaagacaaa caacgtctgt
agcgaccctt 240tgcaggcagc ggaacccccc acctggcgac aggtgcctct gcggccaaaa
gccacgtgta 300taagatacac ctgcaaaggc ggcacaaccc cagtgccacg ttgtgagttg
gatagttgtg 360gaaagagtca aatggctctc ctcaagcgta ttcaacaagg ggctgaagga
tgcccagaag 420gtaccccatt gtatgggatc tgatctgggg cctcggtgca catgctttac
gtgtgtttag 480tcgaggttaa aaaacgtcta ggccccccga accacgggga cgtggttttc
ctttgaaaaa 540cacgatgata ata
553
29 4692 DNA Artificial Sequence
pDSred1-N1 plasmid from Clontech
29
tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata
tggagttccg 60cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc
cccgcccatt 120gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc
attgacgtca 180atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt
atcatatgcc 240aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt
atgcccagta 300catgacctta tgggactttc ctacttggca gtacatctac gtattagtca
tcgctattac 360catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg
actcacgggg 420atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc
aaaatcaacg 480ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg
gtaggcgtgt 540acggtgggag gtctatataa gcagagctgg tttagtgaac cgtcagatcc
gctagcgcta 600ccggactcag atctcgagct caagcttcga attctgcagt cgacggtacc
gcgggcccgg 660gatccaccgg tcgccaccat ggtgcgctcc tccaagaacg tcatcaagga
gttcatgcgc 720ttcaaggtgc gcatggaggg caccgtgaac ggccacgagt tcgagatcga
gggcgagggc 780gagggccgcc cctacgaggg ccacaacacc gtgaagctga aggtgaccaa
gggcggcccc 840ctgcccttcg cctgggacat cctgtccccc cagttccagt acggctccaa
ggtgtacgtg 900aagcaccccg ccgacatccc cgactacaag aagctgtcct tccccgaggg
cttcaagtgg 960gagcgcgtga tgaacttcga ggacggcggc gtggtgaccg tgacccagga
ctcctccctg 1020caggacggct gcttcatcta caaggtgaag ttcatcggcg tgaacttccc
ctccgacggc 1080cccgtaatgc agaagaagac catgggctgg gaggcctcca ccgagcgcct
gtacccccgc 1140gacggcgtgc tgaagggcga gatccacaag gccctgaagc tgaaggacgg
cggccactac 1200ctggtggagt tcaagtccat ctacatggcc aagaagcccg tgcagctgcc
cggctactac 1260tacgtggact ccaagctgga catcacctcc cacaacgagg actacaccat
cgtggagcag 1320tacgagcgca ccgagggccg ccaccacctg ttcctgtagc ggccgcgact
ctagatcata 1380atcagccata ccacatttgt agaggtttta cttgctttaa aaaacctccc
acacctcccc 1440ctgaacctga aacataaaat gaatgcaatt gttgttgtta acttgtttat
tgcagcttat 1500aatggttaca aataaagcaa tagcatcaca aatttcacaa ataaagcatt
tttttcactg 1560cattctagtt gtggtttgtc caaactcatc aatgtatctt aaggcgtaaa
ttgtaagcgt 1620taatattttg ttaaaattcg cgttaaattt ttgttaaatc agctcatttt
ttaaccaata 1680ggccgaaatc ggcaaaatcc cttataaatc aaaagaatag accgagatag
ggttgagtgt 1740tgttccagtt tggaacaaga gtccactatt aaagaacgtg gactccaacg
tcaaagggcg 1800aaaaaccgtc tatcagggcg atggcccact acgtgaacca tcaccctaat
caagtttttt 1860ggggtcgagg tgccgtaaag cactaaatcg gaaccctaaa gggagccccc
gatttagagc 1920ttgacgggga aagccggcga acgtggcgag aaaggaaggg aagaaagcga
aaggagcggg 1980cgctagggcg ctggcaagtg tagcggtcac gctgcgcgta accaccacac
ccgccgcgct 2040taatgcgccg ctacagggcg cgtcaggtgg cacttttcgg ggaaatgtgc
gcggaacccc 2100tatttgttta tttttctaaa tacattcaaa tatgtatccg ctcatgagac
aataaccctg 2160ataaatgctt caataatatt gaaaaaggaa gagtcctgag gcggaaagaa
ccagctgtgg 2220aatgtgtgtc agttagggtg tggaaagtcc ccaggctccc cagcaggcag
aagtatgcaa 2280agcatgcatc tcaattagtc agcaaccagg tgtggaaagt ccccaggctc
cccagcaggc 2340agaagtatgc aaagcatgca tctcaattag tcagcaacca tagtcccgcc
cctaactccg 2400cccatcccgc ccctaactcc gcccagttcc gcccattctc cgccccatgg
ctgactaatt 2460ttttttattt atgcagaggc cgaggccgcc tcggcctctg agctattcca
gaagtagtga 2520ggaggctttt ttggaggcct aggcttttgc aaagatcgat caagagacag
gatgaggatc 2580gtttcgcatg attgaacaag atggattgca cgcaggttct ccggccgctt
gggtggagag 2640gctattcggc tatgactggg cacaacagac aatcggctgc tctgatgccg
ccgtgttccg 2700gctgtcagcg caggggcgcc cggttctttt tgtcaagacc gacctgtccg
gtgccctgaa 2760tgaactgcaa gacgaggcag cgcggctatc gtggctggcc acgacgggcg
ttccttgcgc 2820agctgtgctc gacgttgtca ctgaagcggg aagggactgg ctgctattgg
gcgaagtgcc 2880ggggcaggat ctcctgtcat ctcaccttgc tcctgccgag aaagtatcca
tcatggctga 2940tgcaatgcgg cggctgcata cgcttgatcc ggctacctgc ccattcgacc
accaagcgaa 3000acatcgcatc gagcgagcac gtactcggat ggaagccggt cttgtcgatc
aggatgatct 3060ggacgaagag catcaggggc tcgcgccagc cgaactgttc gccaggctca
aggcgagcat 3120gcccgacggc gaggatctcg tcgtgaccca tggcgatgcc tgcttgccga
atatcatggt 3180ggaaaatggc cgcttttctg gattcatcga ctgtggccgg ctgggtgtgg
cggaccgcta 3240tcaggacata gcgttggcta cccgtgatat tgctgaagag cttggcggcg
aatgggctga 3300ccgcttcctc gtgctttacg gtatcgccgc tcccgattcg cagcgcatcg
ccttctatcg 3360ccttcttgac gagttcttct gagcgggact ctggggttcg aaatgaccga
ccaagcgacg 3420cccaacctgc catcacgaga tttcgattcc accgccgcct tctatgaaag
gttgggcttc 3480ggaatcgttt tccgggacgc cggctggatg atcctccagc gcggggatct
catgctggag 3540ttcttcgccc accctagggg gaggctaact gaaacacgga aggagacaat
accggaagga 3600acccgcgcta tgacggcaat aaaaagacag aataaaacgc acggtgttgg
gtcgtttgtt 3660cataaacgcg gggttcggtc ccagggctgg cactctgtcg ataccccacc
gagaccccat 3720tggggccaat acgcccgcgt ttcttccttt tccccacccc accccccaag
ttcgggtgaa 3780ggcccagggc tcgcagccaa cgtcggggcg gcaggccctg ccatagcctc
aggttactca 3840tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta
ggtgaagatc 3900ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca
ctgagcgtca 3960gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg
cgtaatctgc 4020tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga
tcaagagcta 4080ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa
tactgtcctt 4140ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc
tacatacctc 4200gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg
tcttaccggg 4260ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac
ggggggttcg 4320tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct
acagcgtgag 4380ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc
ggtaagcggc 4440agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg
gtatctttat 4500agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg
ctcgtcaggg 4560gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct
ggccttttgc 4620tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga
taaccgtatt 4680accgccatgc at
4692
30 4257 DNA Artificial Sequence
pPur plasmid from Clontech
30
ctgtggaatg tgtgtcagtt agggtgtgga aagtccccag gctccccagc aggcagaagt
60atgcaaagca tgcatctcaa ttagtcagca accaggtgtg gaaagtcccc aggctcccca
120gcaggcagaa gtatgcaaag catgcatctc aattagtcag caaccatagt cccgccccta
180actccgccca tcccgcccct aactccgccc agttccgccc attctccgcc ccatggctga
240ctaatttttt ttatttatgc agaggccgag gccgcctcgg cctctgagct attccagaag
300tagtgaggag gcttttttgg aggcctaggc ttttgcaaaa agcttgcatg cctgcaggtc
360ggccgccacg accggtgccg ccaccatccc ctgacccacg cccctgaccc ctcacaagga
420gacgaccttc catgaccgag tacaagccca cggtgcgcct cgccacccgc gacgacgtcc
480cccgggccgt acgcaccctc gccgccgcgt tcgccgacta ccccgccacg cgccacaccg
540tcgacccgga ccgccacatc gagcgggtca ccgagctgca agaactcttc ctcacgcgcg
600tcgggctcga catcggcaag gtgtgggtcg cggacgacgg cgccgcggtg gcggtctgga
660ccacgccgga gagcgtcgaa gcgggggcgg tgttcgccga gatcggcccg cgcatggccg
720agttgagcgg ttcccggctg gccgcgcagc aacagatgga aggcctcctg gcgccgcacc
780ggcccaagga gcccgcgtgg ttcctggcca ccgtcggcgt ctcgcccgac caccagggca
840agggtctggg cagcgccgtc gtgctccccg gagtggaggc ggccgagcgc gccggggtgc
900ccgccttcct ggagacctcc gcgccccgca acctcccctt ctacgagcgg ctcggcttca
960ccgtcaccgc cgacgtcgag gtgcccgaag gaccgcgcac ctggtgcatg acccgcaagc
1020ccggtgcctg acgcccgccc cacgacccgc agcgcccgac cgaaaggagc gcacgacccc
1080atggctccga ccgaagccga cccgggcggc cccgccgacc ccgcacccgc ccccgaggcc
1140caccgactct agaggatcat aatcagccat accacatttg tagaggtttt acttgcttta
1200aaaaacctcc cacacctccc cctgaacctg aaacataaaa tgaatgcaat tgttgttgtt
1260aacttgttta ttgcagctta taatggttac aaataaagca atagcatcac aaatttcaca
1320aataaagcat ttttttcact gcattctagt tgtggtttgt ccaaactcat caatgtatct
1380tatcatgtct ggatccccag gaagctcctc tgtgtcctca taaaccctaa cctcctctac
1440ttgagaggac attccaatca taggctgccc atccaccctc tgtgtcctcc tgttaattag
1500gtcacttaac aaaaaggaaa ttgggtaggg gtttttcaca gaccgctttc taagggtaat
1560tttaaaatat ctgggaagtc ccttccactg ctgtgttcca gaagtgttgg taaacagccc
1620acaaatgtca acagcagaaa catacaagct gtcagctttg cacaagggcc caacaccctg
1680ctcatcaaga agcactgtgg ttgctgtgtt agtaatgtgc aaaacaggag gcacattttc
1740cccacctgtg taggttccaa aatatctagt gttttcattt ttacttggat caggaaccca
1800gcactccact ggataagcat tatccttatc caaaacagcc ttgtggtcag tgttcatctg
1860ctgactgtca actgtagcat tttttggggt tacagtttga gcaggatatt tggtcctgta
1920gtttgctaac acaccctgca gctccaaagg ttccccacca acagcaaaaa aatgaaaatt
1980tgacccttga atgggttttc cagcaccatt ttcatgagtt ttttgtgtcc ctgaatgcaa
2040gtttaacata gcagttaccc caataacctc agttttaaca gtaacagctt cccacatcaa
2100aatatttcca caggttaagt cctcatttaa attaggcaaa ggaattcttg aagacgaaag
2160ggcctcgtga tacgcctatt tttataggtt aatgtcatga taataatggt ttcttagacg
2220tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt tttctaaata
2280cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca ataatattga
2340aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt ttttgcggca
2400ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga tgctgaagat
2460cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa gatccttgag
2520agttttcgcc ccgaagaacg ttttccaatg atgagcactt ttaaagttct gctatgtggc
2580gcggtattat cccgtgttga cgccgggcaa gagcaactcg gtcgccgcat acactattct
2640cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga tggcatgaca
2700gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc caacttactt
2760ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat gggggatcat
2820gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa cgacgagcgt
2880gacaccacga tgcctgcagc aatggcaaca acgttgcgca aactattaac tggcgaacta
2940cttactctag cttcccggca acaattaata gactggatgg aggcggataa agttgcagga
3000ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc tggagccggt
3060gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc ctcccgtatc
3120gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag acagatcgct
3180gagataggtg cctcactgat taagcattgg taactgtcag accaagttta ctcatatata
3240ctttagattg atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt
3300gataatctca tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc
3360gtagaaaaga tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg
3420caaacaaaaa aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact
3480ctttttccga aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg
3540tagccgtagt taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg
3600ctaatcctgt taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac
3660tcaagacgat agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca
3720cagcccagct tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga
3780gaaagcgcca cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc
3840ggaacaggag agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct
3900gtcgggtttc gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg
3960agcctatgga aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct
4020tttgctcaca tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgcc
4080tttgagtgag ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc
4140gaggaagcgg aagagcgcct gatgcggtat tttctcctta cgcatctgtg cggtatttca
4200caccgcatat ggtgcactct cagtacaatc tgctctgatg ccgcatagtt aagccag
4257
31 8136 DNA Artificial Sequence
pWE15 cosmid vector
31
ctatagtgag
tcgtattatg cggccgcgaa ttcttgaaga cgaaagggcc tcgtgatacg 60cctattttta
taggttaatg tcatgataat aatggtttct tagacgtcag gtggcacttt 120tcggggaaat
gtgcgcggaa cccctatttg tttatttttc taaatacatt caaatatgta 180tccgctcatg
agacaataac cctgataaat gcttcaataa tattgaaaaa ggaagagtat 240gagtattcaa
catttccgtg tcgcccttat tccctttttt gcggcatttt gcttcctgtt 300tttgctcacc
cagaaacgct ggtgaaagta aaagatgctg aagatcagtt gggtgcacga 360gtgggttaca
tcgaactgga tctcaacagc ggtaagatcc ttgagagttt tcgccccgaa 420gaacgttttc
caatgatgag cacttttaaa gttctgctat gtggcgcggt attatcccgt 480gttgacgccg
ggcaagagca actcggtcgc cgcatacact attctcagaa tgacttggtt 540gagtactcac
cagtcacaga aaagcatctt acggatggca tgacagtaag agaattatgc 600agtgctgcca
taaccatgag tgataacact gcggccaact tacttctgac aacgatcgga 660ggaccgaagg
agctaaccgc ttttttgcac aacatggggg atcatgtaac tcgccttgat 720cgttgggaac
cggagctgaa tgaagccata ccaaacgacg agcgtgacac cacgatgcct 780gcagcaatgg
caacaacgtt gcgcaaacta ttaactggcg aactacttac tctagcttcc 840cggcaacaat
taatagactg gatggaggcg gataaagttg caggaccact tctgcgctcg 900gcccttccgg
ctggctggtt tattgctgat aaatctggag ccggtgagcg tgggtctcgc 960ggtatcattg
cagcactggg gccagatggt aagccctccc gtatcgtagt tatctacacg 1020acggggagtc
aggcaactat ggatgaacga aatagacaga tcgctgagat aggtgcctca 1080ctgattaagc
attggtaact gtcagaccaa gtttactcat atatacttta gattgattta 1140aaacttcatt
tttaatttaa aaggatctag gtgaagatcc tttttgataa tctcatgacc 1200aaaatccctt
aacgtgagtt ttcgttccac tgagcgtcag accccgtaga aaagatcaaa 1260ggatcttctt
gagatccttt ttttctgcgc gtaatctgct gcttgcaaac aaaaaaacca 1320ccgctaccag
cggtggtttg tttgccggat caagagctac caactctttt tccgaaggta 1380actggcttca
gcagagcgca gataccaaat actgtccttc tagtgtagcc gtagttaggc 1440caccacttca
agaactctgt agcaccgcct acatacctcg ctctgctaat cctgttacca 1500gtggctgctg
ccagtggcga taagtcgtgt cttaccgggt tggactcaag acgatagtta 1560ccggataagg
cgcagcggtc gggctgaacg gggggttcgt gcacacagcc cagcttggag 1620cgaacgacct
acaccgaact gagataccta cagcgtgagc tatgagaaag cgccacgctt 1680ccgaagggag
aaaggcggac aggtatccgg taagcggcag ggtcggaaca ggagagcgca 1740cgagggagct
tccaggggga aacgcctggt atctttatag tcctgtcggg gtttcgccac 1800ctctgacttg
agcgtcgatt tttgtgatgc tcgtcagggg ggcggagcct atggaaaaac 1860gccagcaacg
cggccttttt acggttcctg gccttttgct ggccttttgc tcacatgttc 1920tttcctgcgt
tatcccctga ttctgtggat aaccgtatta ccgcctttga gtgagctgat 1980accgctcgcc
gcagccgaac gaccgagcgc agcgagtcag tgagcgagga agcggaagag 2040cgctgacttc
cgcgtttcca gactttacga aacacggaaa ccgaagacca ttcatgttgt 2100tgctcaggtc
gcagacgttt tgcagcagca gtcgcttcac gttcgctcgc gtatcggtga 2160ttcattctgc
taaccagtaa ggcaaccccg ccagcctagc cgggtcctca acgacaggag 2220cacgatcatg
cgcacccgtc agatccagac atgataagat acattgatga gtttggacaa 2280accacaacta
gaatgcagtg aaaaaaatgc tttatttgtg aaatttgtga tgctattgct 2340ttatttgtaa
ccattataag ctgcaataaa caagttaaca acaacaattg cattcatttt 2400atgtttcagg
ttcaggggga ggtgtgggag gttttttaaa gcaagtaaaa cctctacaaa 2460tgtggtatgg
ctgattatga tctctagtca aggcactata catcaaatat tccttattaa 2520cccctttaca
aattaaaaag ctaaaggtac acaatttttg agcatagtta ttaatagcag 2580acactctatg
cctgtgtgga gtaagaaaaa acagtatgtt atgattataa ctgttatgcc 2640tacttataaa
ggttacagaa tatttttcca taattttctt gtatagcagt gcagcttttt 2700cctttgtggt
gtaaatagca aagcaagcaa gagttctatt actaaacaca gcatgactca 2760aaaaacttag
caattctgaa ggaaagtcct tggggtcttc tacctttctc ttcttttttg 2820gaggagtaga
atgttgagag tcagcagtag cctcatcatc actagatggc atttcttctg 2880agcaaaacag
gttttcctca ttaaaggcat tccaccactg ctcccattca tcagttccat 2940aggttggaat
ctaaaataca caaacaatta gaatcagtag tttaacacat tatacactta 3000aaaattttat
atttacctta gagctttaaa tctctgtagg tagtttgtcc aattatgtca 3060caccacagaa
gtaaggttcc ttcacaaaga tccggaccaa agcggccatc gtgcctcccc 3120actcctgcag
ttcgggggca tggatgcgcg gatagccgct gctggtttcc tggatgccga 3180cggatttgca
ctgccggtag aactcgcgag gtcgtccagc ctcaggcagc agctgaacca 3240actcgcgagg
ggatcgagcc cggggtgggc gaagaactcc agcatgagat ccccgcgctg 3300gaggatcatc
cagccggcgt cccggaaaac gattccgaag cccaaccttt catagaaggc 3360ggcggtggaa
tcgaaatctc gtgatggcag gttgggcgtc gcttggtcgg tcatttcgaa 3420ccccagagtc
ccgctcagaa gaactcgtca agaaggcgat agaaggcgat gcgctgcgaa 3480tcgggagcgg
cgataccgta aagcacgagg aagcggtcag cccattcgcc gccaagctct 3540tcagcaatat
cacgggtagc caacgctatg tcctgatagc ggtccgccac acccagccgg 3600ccacagtcga
tgaatccaga aaagcggcca ttttccacca tgatattcgg caagcaggca 3660tcgccatggg
tcacgacgag atcctcgccg tcgggatgcg cgccttgagc ctggcgaaca 3720gttcggctgg
cgcgagcccc tgatgctctt cgtccagatc atcctgatcg acaagaccgg 3780cttccatccg
agtacgtgct cgctcgatgc gatgtttcgc ttggtggtcg aatgggcagg 3840tagccggatc
aagcgtatgc agccgccgca ttgcatcagc catgatggat actttctcgg 3900caggagcaag
gtgagatgac aggagatcct gccccggcac ttcgcccaat agcagccagt 3960cccttcccgc
ttcagtgaca acgtcgagca cagctgcgca aggaacgccc gtcgtggcca 4020gccacgatag
ccgcgctgcc tcgtcctgca gttcattcag ggcaccggac aggtcggtct 4080tgacaaaaag
aaccgggcgc ccctgcgctg acagccggaa cacggcggca tcagagcagc 4140cgattgtctg
ttgtgcccag tcatagccga atagcctctc cacccaagcg gccggagaac 4200ctgcgtgcaa
tccatcttgt tcaatcatgc gaaacgatcc tcatcctgtc tcttgatcag 4260atcttgatcc
cctgcgccat cagatccttg gcggcaagaa agccatccag tttactttgc 4320agggcttccc
aaccttacca gagggcgccc cagctggcaa ttccggttcg cttgctgtcc 4380ataaaaccgc
ccagtctagc tatcgccatg taagcccact gcaagctacc tgctttctct 4440ttgcgcttgc
gttttccctt gtccagatag cccagtagct gacattcatc cggggtcagc 4500accgtttctg
cggactggct ttctacgtgt tccgcttcct ttagcagccc ttgcgccctg 4560agtgcttgcg
gcagcgtgaa agctttttgc aaaagcctag gcctccaaaa aagcctcctc 4620actacttctg
gaatagctca gaggccgagg cggcctaaat aaaaaaaatt agtcagccat 4680ggggcggaga
atgggcggaa ctgggcggag ttaggggcgg gatgggcgga gttaggggcg 4740ggactatggt
tgctgactaa ttgagatgca tgctttgcat acttctgcct gctggggagc 4800ctggggactt
tccacacctg gttgctgact aattgagatg catgctttgc atacttctgc 4860ctgctgggga
gcctggggac tttccacacc ctaactgaca cacattccac agccggatct 4920gcaggaccca
acgctgcccg agatgcgccg cgtgcggctg ctggagatgg cggacgcgat 4980ggatatgttc
tgccaagggt tggtttgcgc attcacagtt ctccgcaaga attgattggc 5040tccaattctt
ggagtggtga atccgttagc gaggtgccgc cggcttccat tcaggtcgag 5100gtggcccggc
tccatgcacc gcgacgcaac gcggggaggc agacaaggta tagggcggcg 5160cctacaatcc
atgccaaccc gttccatgtg ctcgccgagg cgcataaatc gccgtgacga 5220tcagcggtcc
aatgatcgaa gttaggctgg taagagccgc gagcgatcct tgaagctgtc 5280cctgatggtc
gtcatctacc tgcctggaca gcatggcctg caacgcggca tcccgatgcc 5340gccggaagcg
agaagaatca taatggggaa ggccatccag cctcgcgtcg cgaacgccag 5400caagacgtag
cccagcgcgt cgggccgcca tgccggcgat aatggcctgc ttctcgccga 5460aacgtttggt
ggcgggacca gtgacgaagg cttgagcgag ggcgtgcaag attccgaata 5520ccgcaagcga
caggccgatc atcgtcgcgc tccagcgaaa gcggtcctcg ccgaaaatga 5580cccagagcgc
tgccggcacc tgtcctacga gttgcatgat aaagaagaca gtcataagtg 5640cggcgacgat
agtcatgccc cgcgcccacc ggaaggagct gactgggttg aaggctctca 5700agggcatcgg
tcgacgctct cccttatgcg actcctgcat taggaagcag cccagtagta 5760ggttgaggcc
gttgagcacc gccgccgcaa ggaatggtgc atgcaaggag atggcgccca 5820acagtccccc
ggccacgggc ctgccaccat acccacgccg aaacaagcgc tcatgagccc 5880gaagtggcga
gcccgatctt ccccatcggt gatgtcggcg atataggcgc cagcaaccgc 5940acctgtggcg
ccggtgatgc cggccacgat gcgtccggcg tagaggatct tggcagtcac 6000agcatgcgca
tatccatgct tcgaccatgc gctcacaaag taggtgaatg cgcaatgtag 6060tacccacatc
gtcatcgctt tccactgctc tcgcgaataa agatggaaaa tcaatctcat 6120ggtaatagtc
catgaaaatc cttgtattca taaatcctcc aggtagctat atgcaaattg 6180aaacaaaaga
gatggtgatc tttctaagag atgatggaat ctcccttcag tatcccgatg 6240gtcaatgcgc
tggatatggg atagatggga atatgctgat ttttatggga cagagttgcg 6300aactgttccc
aactaaaatc attttgcacg atcagcgcac tacgaacttt acccacaaat 6360agtcaggtaa
tgaatcctga tataaagaca ggttgataaa tcagtcttct acgcgcatcg 6420cacgcgcaca
ccgtagaaag tctttcagtt gtgagcctgg gcaaaccgtt aactttcggc 6480ggctttgctg
tgcgacaggc tcacgtctaa aaggaaataa atcatgggtc ataaaattat 6540cacgttgtcc
ggcgcggcga cggatgttct gtatgcgctg tttttccgtg gcgcgttgct 6600gtctggtgat
ctgccttcta aatctggcac agccgaattg cgcgagcttg gttttgctga 6660aaccagacac
acagcaactg aataccagaa agaaaatcac tttacctttc tgacatcaga 6720agggcagaaa
tttgccgttg aacacctggt caatacgcgt tttggtgagc agcaatattg 6780cgcttcgatg
acgcttggcg ttgagattga tacctctgct gcacaaaagg caatcgacga 6840gctggaccag
cgcattcgtg acaccgtctc cttcgaactt attcgcaatg gagtgtcatt 6900catcaaggac
gccgctatcg caaatggtgc tatccacgca gcggcaatcg aaacacctca 6960gccggtgacc
aatatctaca acatcagcct tggtatccag cgtgatgagc cagcgcagaa 7020caaggtaacc
gtcagtgccg ataagttcaa agttaaacct ggtgttgata ccaacattga 7080aacgttgatc
gaaaacgcgc tgaaaaacgc tgctgaatgt gcggcgctgg atgtcacaaa 7140gcaaatggca
gcagacaaga aagcgatgga tgaactggct tcctatgtcc gcacggccat 7200catgatggaa
tgtttccccg gtggtgttat ctggcagcag tgccgtcgat agtatgcaat 7260tgataattat
tatcatttgc gggtcctttc cggcgatccg ccttgttacg gggcggcgac 7320ctcgcgggtt
ttcgctattt atgaaaattt tccggtttaa ggcgtttccg ttcttcttcg 7380tcataactta
atgtttttat ttaaaatacc ctctgaaaag aaaggaaacg acaggtgctg 7440aaagcgagct
ttttggcctc tgtcgtttcc tttctctgtt tttgtccgtg gaatgaacaa 7500tggaagtcaa
caaaaagcag ctggctgaca ttttcggtgc gagtatccgt accattcaga 7560actggcagga
acagggaatg cccgttctgc gaggcggtgg caagggtaat gaggtgcttt 7620atgactctgc
cgccgtcata aaatggtatg ccgaaaggga tgctgaaatt gagaacgaaa 7680agctgcgccg
ggaggttgaa gaactgcggc aggccagcga ggcagatcca caggacgggt 7740gtggtcgcca
tgatcgcgta gtcgatagtg gctccaagta gcgaagcgag caggactggg 7800cggcggcaaa
gcggtcggac agtgctccga gaacgggtgc gcatagaaat tgcatcaacg 7860catatagcgc
tagcagcacg ccatagtgac tggcgatgct gtcggaatgg acgatatccc 7920gcaagaggcc
cggcagtacc ggcataacca agcctatgcc tacagcatcc agggtgacgg 7980tgccgaggat
gacgatgagc gcattgttag atttcataca cggtgcctga ctgcgttagc 8040aatttaactg
tgataaacta ccgcattaaa gcttatcgat gataagcggt caaacatgag 8100aattcgcggc
cgcaattaac cctcactaaa ggatcc
8136322713DNAArtificial SequencepNEB193 plasmid 32tcgcgcgttt cggtgatgac
ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat
gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg
cttaactatg cggcatcaga gcagattgta ctgagagtgc 180accatatgcg gtgtgaaata
ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240attcgccatt caggctgcgc
aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300tacgccagct ggcgaaaggg
ggatgtgctg caaggcgatt aagttgggta acgccagggt 360tttcccagtc acgacgttgt
aaaacgacgg ccagtgaatt cgagctcggt acccgggggc 420gcgccggatc cttaattaag
tctagagtcg actgtttaaa cctgcaggca tgcaagcttg 480gcgtaatcat ggtcatagct
gtttcctgtg tgaaattgtt atccgctcac aattccacac 540aacatacgag ccggaagcat
aaagtgtaaa gcctggggtg cctaatgagt gagctaactc 600acattaattg cgttgcgctc
actgcccgct ttccagtcgg gaaacctgtc gtgccagctg 660cattaatgaa tcggccaacg
cgcggggaga ggcggtttgc gtattgggcg ctcttccgct 720tcctcgctca ctgactcgct
gcgctcggtc gttcggctgc ggcgagcggt atcagctcac 780tcaaaggcgg taatacggtt
atccacagaa tcaggggata acgcaggaaa gaacatgtga 840gcaaaaggcc agcaaaaggc
caggaaccgt aaaaaggccg cgttgctggc gtttttccat 900aggctccgcc cccctgacga
gcatcacaaa aatcgacgct caagtcagag gtggcgaaac 960ccgacaggac tataaagata
ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 1020gttccgaccc tgccgcttac
cggatacctg tccgcctttc tcccttcggg aagcgtggcg 1080ctttctcata gctcacgctg
taggtatctc agttcggtgt aggtcgttcg ctccaagctg 1140ggctgtgtgc acgaaccccc
cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1200cttgagtcca acccggtaag
acacgactta tcgccactgg cagcagccac tggtaacagg 1260attagcagag cgaggtatgt
aggcggtgct acagagttct tgaagtggtg gcctaactac 1320ggctacacta gaaggacagt
atttggtatc tgcgctctgc tgaagccagt taccttcgga 1380aaaagagttg gtagctcttg
atccggcaaa caaaccaccg ctggtagcgg tggttttttt 1440gtttgcaagc agcagattac
gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt 1500tctacggggt ctgacgctca
gtggaacgaa aactcacgtt aagggatttt ggtcatgaga 1560ttatcaaaaa ggatcttcac
ctagatcctt ttaaattaaa aatgaagttt taaatcaatc 1620taaagtatat atgagtaaac
ttggtctgac agttaccaat gcttaatcag tgaggcacct 1680atctcagcga tctgtctatt
tcgttcatcc atagttgcct gactccccgt cgtgtagata 1740actacgatac gggagggctt
accatctggc cccagtgctg caatgatacc gcgagaccca 1800cgctcaccgg ctccagattt
atcagcaata aaccagccag ccggaagggc cgagcgcaga 1860agtggtcctg caactttatc
cgcctccatc cagtctatta attgttgccg ggaagctaga 1920gtaagtagtt cgccagttaa
tagtttgcgc aacgttgttg ccattgctac aggcatcgtg 1980gtgtcacgct cgtcgtttgg
tatggcttca ttcagctccg gttcccaacg atcaaggcga 2040gttacatgat cccccatgtt
gtgcaaaaaa gcggttagct ccttcggtcc tccgatcgtt 2100gtcagaagta agttggccgc
agtgttatca ctcatggtta tggcagcact gcataattct 2160cttactgtca tgccatccgt
aagatgcttt tctgtgactg gtgagtactc aaccaagtca 2220ttctgagaat agtgtatgcg
gcgaccgagt tgctcttgcc cggcgtcaat acgggataat 2280accgcgccac atagcagaac
tttaaaagtg ctcatcattg gaaaacgttc ttcggggcga 2340aaactctcaa ggatcttacc
gctgttgaga tccagttcga tgtaacccac tcgtgcaccc 2400aactgatctt cagcatcttt
tactttcacc agcgtttctg ggtgagcaaa aacaggaagg 2460caaaatgccg caaaaaaggg
aataagggcg acacggaaat gttgaatact catactcttc 2520ctttttcaat attattgaag
catttatcag ggttattgtc tcatgagcgg atacatattt 2580gaatgtattt agaaaaataa
acaaataggg gttccgcgca catttccccg aaaagtgcca 2640cctgacgtct aagaaaccat
tattatcatg acattaacct ataaaaatag gcgtatcacg 2700aggccctttc gtc
2713
33 21 DNA Artificial Sequence attP
33 cagctttttt atactaagtt g 21
34 21 DNA Artificial Sequence attB
34 ctgctttttt atactaactt g 21
35 21 DNA Artificial Sequence attL
35 ctgctttttt atactaagtt g 21
36 21 DNA Artificial Sequence attR
36 cagctttttt atactaactt g 21
37 1071 DNA Artificial Sequence Integrase E174R
37 atg gga aga agg cga agt cat gag cgc cgg gat tta
ccc cct aac ctt 48Met Gly Arg Arg Arg Ser His Glu Arg Arg Asp Leu
Pro Pro Asn Leu1 5 10
15tat ata aga aac aat gga tat tac tgc tac agg gac cca agg acg ggt
96Tyr Ile Arg Asn Asn Gly Tyr Tyr Cys Tyr Arg Asp Pro Arg Thr Gly
20 25 30aaa gag ttt gga tta ggc aga
gac agg cga atc gca atc act gaa gct 144Lys Glu Phe Gly Leu Gly Arg
Asp Arg Arg Ile Ala Ile Thr Glu Ala 35 40
45ata cag gcc aac att gag tta ttt tca gga cac aaa cac aag cct
ctg 192Ile Gln Ala Asn Ile Glu Leu Phe Ser Gly His Lys His Lys Pro
Leu 50 55 60aca gcg aga atc aac agt
gat aat tcc gtt acg tta cat tca tgg ctt 240Thr Ala Arg Ile Asn Ser
Asp Asn Ser Val Thr Leu His Ser Trp Leu65 70
75 80gat cgc tac gaa aaa atc ctg gcc agc aga gga
atc aag cag aag aca 288Asp Arg Tyr Glu Lys Ile Leu Ala Ser Arg Gly
Ile Lys Gln Lys Thr 85 90
95ctc ata aat tac atg agc aaa att aaa gca ata agg agg ggt ctg cct
336Leu Ile Asn Tyr Met Ser Lys Ile Lys Ala Ile Arg Arg Gly Leu Pro
100 105 110gat gct cca ctt gaa gac
atc acc aca aaa gaa att gcg gca atg ctc 384Asp Ala Pro Leu Glu Asp
Ile Thr Thr Lys Glu Ile Ala Ala Met Leu 115 120
125aat gga tac ata gac gag ggc aag gcg gcg tca gcc aag tta
atc aga 432Asn Gly Tyr Ile Asp Glu Gly Lys Ala Ala Ser Ala Lys Leu
Ile Arg 130 135 140tca aca ctg agc gat
gca ttc cga gag gca ata gct gaa ggc cat ata 480Ser Thr Leu Ser Asp
Ala Phe Arg Glu Ala Ile Ala Glu Gly His Ile145 150
155 160aca aca aac cat gtc gct gcc act cgc gca
gca aaa tct aga gta agg 528Thr Thr Asn His Val Ala Ala Thr Arg Ala
Ala Lys Ser Arg Val Arg 165 170
175aga tca aga ctt acg gct gac gaa tac ctg aaa att tat caa gca gca
576Arg Ser Arg Leu Thr Ala Asp Glu Tyr Leu Lys Ile Tyr Gln Ala Ala
180 185 190gaa tca tca cca tgt tgg
ctc aga ctt gca atg gaa ctg gct gtt gtt 624Glu Ser Ser Pro Cys Trp
Leu Arg Leu Ala Met Glu Leu Ala Val Val 195 200
205acc ggg caa cga gtt ggt gat tta tgc gaa atg aag tgg tct
gat atc 672Thr Gly Gln Arg Val Gly Asp Leu Cys Glu Met Lys Trp Ser
Asp Ile 210 215 220gta gat gga tat ctt
tat gtc gag caa agc aaa aca ggc gta aaa att 720Val Asp Gly Tyr Leu
Tyr Val Glu Gln Ser Lys Thr Gly Val Lys Ile225 230
235 240gcc atc cca aca gca ttg cat att gat gct
ctc gga ata tca atg aag 768Ala Ile Pro Thr Ala Leu His Ile Asp Ala
Leu Gly Ile Ser Met Lys 245 250
255gaa aca ctt gat aaa tgc aaa gag att ctt ggc gga gaa acc ata att
816Glu Thr Leu Asp Lys Cys Lys Glu Ile Leu Gly Gly Glu Thr Ile Ile
260 265 270gca tct act cgt cgc gaa
ccg ctt tca tcc ggc aca gta tca agg tat 864Ala Ser Thr Arg Arg Glu
Pro Leu Ser Ser Gly Thr Val Ser Arg Tyr 275 280
285ttt atg cgc gca cga aaa gca tca ggt ctt tcc ttc gaa ggg
gat ccg 912Phe Met Arg Ala Arg Lys Ala Ser Gly Leu Ser Phe Glu Gly
Asp Pro 290 295 300cct acc ttt cac gag
ttg cgc agt ttg tct gca aga ctc tat gag aag 960Pro Thr Phe His Glu
Leu Arg Ser Leu Ser Ala Arg Leu Tyr Glu Lys305 310
315 320cag ata agc gat aag ttt gct caa cat ctt
ctc ggg cat aag tcg gac 1008Gln Ile Ser Asp Lys Phe Ala Gln His Leu
Leu Gly His Lys Ser Asp 325 330
335acc atg gca tca cag tat cgt gat gac aga ggc agg gag tgg gac aaa
1056Thr Met Ala Ser Gln Tyr Arg Asp Asp Arg Gly Arg Glu Trp Asp Lys
340 345 350att gaa atc aaa taa
1071Ile Glu Ile Lys *
35538356PRTArtificial SequenceIntegrase E174R 38Met Gly Arg Arg Arg Ser
His Glu Arg Arg Asp Leu Pro Pro Asn Leu1 5
10 15Tyr Ile Arg Asn Asn Gly Tyr Tyr Cys Tyr Arg Asp
Pro Arg Thr Gly 20 25 30Lys
Glu Phe Gly Leu Gly Arg Asp Arg Arg Ile Ala Ile Thr Glu Ala 35
40 45Ile Gln Ala Asn Ile Glu Leu Phe Ser
Gly His Lys His Lys Pro Leu 50 55
60Thr Ala Arg Ile Asn Ser Asp Asn Ser Val Thr Leu His Ser Trp Leu65
70 75 80Asp Arg Tyr Glu Lys
Ile Leu Ala Ser Arg Gly Ile Lys Gln Lys Thr 85
90 95Leu Ile Asn Tyr Met Ser Lys Ile Lys Ala Ile
Arg Arg Gly Leu Pro 100 105
110Asp Ala Pro Leu Glu Asp Ile Thr Thr Lys Glu Ile Ala Ala Met Leu
115 120 125Asn Gly Tyr Ile Asp Glu Gly
Lys Ala Ala Ser Ala Lys Leu Ile Arg 130 135
140Ser Thr Leu Ser Asp Ala Phe Arg Glu Ala Ile Ala Glu Gly His
Ile145 150 155 160Thr Thr
Asn His Val Ala Ala Thr Arg Ala Ala Lys Ser Arg Val Arg
165 170 175Arg Ser Arg Leu Thr Ala Asp
Glu Tyr Leu Lys Ile Tyr Gln Ala Ala 180 185
190Glu Ser Ser Pro Cys Trp Leu Arg Leu Ala Met Glu Leu Ala
Val Val 195 200 205Thr Gly Gln Arg
Val Gly Asp Leu Cys Glu Met Lys Trp Ser Asp Ile 210
215 220Val Asp Gly Tyr Leu Tyr Val Glu Gln Ser Lys Thr
Gly Val Lys Ile225 230 235
240Ala Ile Pro Thr Ala Leu His Ile Asp Ala Leu Gly Ile Ser Met Lys
245 250 255Glu Thr Leu Asp Lys
Cys Lys Glu Ile Leu Gly Gly Glu Thr Ile Ile 260
265 270Ala Ser Thr Arg Arg Glu Pro Leu Ser Ser Gly Thr
Val Ser Arg Tyr 275 280 285Phe Met
Arg Ala Arg Lys Ala Ser Gly Leu Ser Phe Glu Gly Asp Pro 290
295 300Pro Thr Phe His Glu Leu Arg Ser Leu Ser Ala
Arg Leu Tyr Glu Lys305 310 315
320Gln Ile Ser Asp Lys Phe Ala Gln His Leu Leu Gly His Lys Ser Asp
325 330 335Thr Met Ala Ser
Gln Tyr Arg Asp Asp Arg Gly Arg Glu Trp Asp Lys 340
345 350Ile Glu Ile Lys 35539876DNADiscosoma
speciesCDS(45)...(737)Nucleotide sequence encoding red fluorescent
protein (FP593) 39agtttcagcc agtgacaggg tgagctgcca ggtattctaa caag atg
agt tgt tcc 56 Met
Ser Cys Ser 1aag aat gtg
atc aag gag ttc atg agg ttc aag gtt cgt atg gaa gga 104Lys Asn Val
Ile Lys Glu Phe Met Arg Phe Lys Val Arg Met Glu Gly5 10
15 20acg gtc aat ggg cac gag ttt gaa
ata aaa ggc gaa ggt gaa ggg agg 152Thr Val Asn Gly His Glu Phe Glu
Ile Lys Gly Glu Gly Glu Gly Arg 25 30
35cct tac gaa ggt cac tgt tcc gta aag ctt atg gta acc aag
ggt gga 200Pro Tyr Glu Gly His Cys Ser Val Lys Leu Met Val Thr Lys
Gly Gly 40 45 50cct ttg cca
ttt gct ttt gat att ttg tca cca caa ttt cag tat gga 248Pro Leu Pro
Phe Ala Phe Asp Ile Leu Ser Pro Gln Phe Gln Tyr Gly 55
60 65agc aag gta tat gtc aaa cac cct gcc gac ata
cca gac tat aaa aag 296Ser Lys Val Tyr Val Lys His Pro Ala Asp Ile
Pro Asp Tyr Lys Lys 70 75 80ctg tca
ttt cct gag gga ttt aaa tgg gaa agg gtc atg aac ttt gaa 344Leu Ser
Phe Pro Glu Gly Phe Lys Trp Glu Arg Val Met Asn Phe Glu85
90 95 100gac ggt ggc gtg gtt act gta
tcc caa gat tcc agt ttg aaa gac ggc 392Asp Gly Gly Val Val Thr Val
Ser Gln Asp Ser Ser Leu Lys Asp Gly 105
110 115tgt ttc atc tac gag gtc aag ttc att ggg gtg aac
ttt cct tct gat 440Cys Phe Ile Tyr Glu Val Lys Phe Ile Gly Val Asn
Phe Pro Ser Asp 120 125 130gga
cct gtt atg cag agg agg aca cgg ggc tgg gaa gcc agc tct gag 488Gly
Pro Val Met Gln Arg Arg Thr Arg Gly Trp Glu Ala Ser Ser Glu 135
140 145cgt ttg tat cct cgt gat ggg gtg ctg
aaa gga gac atc cat atg gct 536Arg Leu Tyr Pro Arg Asp Gly Val Leu
Lys Gly Asp Ile His Met Ala 150 155
160ctg agg ctg gaa gga ggc ggc cat tac ctc gtt gaa ttc aaa agt att
584Leu Arg Leu Glu Gly Gly Gly His Tyr Leu Val Glu Phe Lys Ser Ile165
170 175 180tac atg gta aag
aag cct tca gtg cag ttg cca ggc tac tat tat gtt 632Tyr Met Val Lys
Lys Pro Ser Val Gln Leu Pro Gly Tyr Tyr Tyr Val 185
190 195gac tcc aaa ctg gat atg acg agc cac aac
gaa gat tac aca gtc gtt 680Asp Ser Lys Leu Asp Met Thr Ser His Asn
Glu Asp Tyr Thr Val Val 200 205
210gag cag tat gaa aaa acc cag gga cgc cac cat ccg ttc att aag cct
728Glu Gln Tyr Glu Lys Thr Gln Gly Arg His His Pro Phe Ile Lys Pro
215 220 225ctg cag tga actcggctca
gtcatggatt agcggtaatg gccacaaaag 777Leu Gln *
230gcacgatgat cgttttttag gaatgcagcc aaaaattgaa ggttatgaca gtagaaatac
837aagcaacagg ctttgcttat taaacatgta attgaaaac
87640230PRTDiscosoma species 40Met Ser Cys Ser Lys Asn Val Ile Lys Glu
Phe Met Arg Phe Lys Val1 5 10
15Arg Met Glu Gly Thr Val Asn Gly His Glu Phe Glu Ile Lys Gly Glu
20 25 30Gly Glu Gly Arg Pro Tyr
Glu Gly His Cys Ser Val Lys Leu Met Val 35 40
45Thr Lys Gly Gly Pro Leu Pro Phe Ala Phe Asp Ile Leu Ser
Pro Gln 50 55 60Phe Gln Tyr Gly Ser
Lys Val Tyr Val Lys His Pro Ala Asp Ile Pro65 70
75 80Asp Tyr Lys Lys Leu Ser Phe Pro Glu Gly
Phe Lys Trp Glu Arg Val 85 90
95Met Asn Phe Glu Asp Gly Gly Val Val Thr Val Ser Gln Asp Ser Ser
100 105 110Leu Lys Asp Gly Cys
Phe Ile Tyr Glu Val Lys Phe Ile Gly Val Asn 115
120 125Phe Pro Ser Asp Gly Pro Val Met Gln Arg Arg Thr
Arg Gly Trp Glu 130 135 140Ala Ser Ser
Glu Arg Leu Tyr Pro Arg Asp Gly Val Leu Lys Gly Asp145
150 155 160Ile His Met Ala Leu Arg Leu
Glu Gly Gly Gly His Tyr Leu Val Glu 165
170 175Phe Lys Ser Ile Tyr Met Val Lys Lys Pro Ser Val
Gln Leu Pro Gly 180 185 190Tyr
Tyr Tyr Val Asp Ser Lys Leu Asp Met Thr Ser His Asn Glu Asp 195
200 205Tyr Thr Val Val Glu Gln Tyr Glu Lys
Thr Gln Gly Arg His His Pro 210 215
220Phe Ile Lys Pro Leu Gln225 230
41 25 DNA Artificial Sequence m-att
41 rkycwgcttt yktrtacnaa stsgb 25
42 25 DNA Artificial Sequence m-attB
42 agccwgcttt yktrtacnaa ctsgb 25
43 25 DNA Artificial Sequence m-attR
43 gttcagcttt cktrtacnaa ctsgb 25
44 25 DNA Artificial Sequence m-attL
44 agccwgcttt cktrtacnaa gtsgb 25
45 25 DNA Artificial Sequence m-attP1
45 gttcagcttt yktrtacnaa gtsgb 25
46 25 DNA Artificial Sequence attB1
46 agcctgcttt tttgtacaaa cttgt 25
47 25 DNA Artificial Sequence attB2
47 agcctgcttt cttgtacaaa cttgt 25
48 25 DNA Artificial Sequence attB3
48 acccagcttt cttgtacaaa cttgt 25
49 25 DNA Artificial Sequence attR1
49 gttcagcttt tttgtacaaa cttgt 25
50 25 DNA Artificial Sequence attR2
50 gttcagcttt cttgtacaaa cttgt 25
51 25 DNA Artificial Sequence attR3
51 gttcagcttt cttgtacaaa gttgg 25
52 25 DNA Artificial Sequence attL1
52 agcctgcttt tttgtacaaa gttgg 25
53 25 DNA Artificial Sequence attL2
53 agcctgcttt cttgtacaaa gttgg 25
54 25 DNA Artificial Sequence attL3
54 acccagcttt cttgtacaaa gttgg 25
55 25 DNA Artificial Sequence attP1
55 gttcagcttt tttgtacaaa gttgg 25
56 25 DNA Artificial Sequence attP2,P3
56 gttcagcttt cttgtacaaa gttgg 25
57 34 DNA Artificial Sequence Lox P site
57 ataacttcgt ataatgtatg ctatacgaag ttat 34
58 1032 DNA Escherichia coli CDS(1)...(1032) nucleotide
sequence encoding Cre recombinase 58atg tcc aat tta ctg acc gta cac caa
aat ttg cct gca tta ccg gtc 48Met Ser Asn Leu Leu Thr Val His Gln
Asn Leu Pro Ala Leu Pro Val1 5 10
15gat gca acg agt gat gag gtt cgc aag aac ctg atg gac atg ttc
agg 96Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe
Arg 20 25 30gat cgc cag gcg
ttt tct gag cat acc tgg aaa atg ctt ctg tcc gtt 144Asp Arg Gln Ala
Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val 35
40 45tgc cgg tcg tgg gcg gca tgg tgc aag ttg aat aac
cgg aaa tgg ttt 192Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn
Arg Lys Trp Phe 50 55 60ccc gca gaa
cct gaa gat gtt cgc gat tat ctt cta tat ctt cag gcg 240Pro Ala Glu
Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala65 70
75 80cgc ggt ctg gca gta aaa act atc
cag caa cat ttg ggc cag cta aac 288Arg Gly Leu Ala Val Lys Thr Ile
Gln Gln His Leu Gly Gln Leu Asn 85 90
95atg ctt cat cgt cgg tcc ggg ctg cca cga cca agt gac agc
aat gct 336Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser
Asn Ala 100 105 110gtt tca ctg
gtt atg cgg cgg atc cga aaa gaa aac gtt gat gcc ggt 384Val Ser Leu
Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly 115
120 125gaa cgt gca aaa cag gct cta gcg ttc gaa cgc
act gat ttc gac cag 432Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg
Thr Asp Phe Asp Gln 130 135 140gtt cgt
tca ctc atg gaa aat agc gat cgc tgc cag gat ata cgt aat 480Val Arg
Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn145
150 155 160ctg gca ttt ctg ggg att gct
tat aac acc ctg tta cgt ata gcc gaa 528Leu Ala Phe Leu Gly Ile Ala
Tyr Asn Thr Leu Leu Arg Ile Ala Glu 165
170 175att gcc agg atc agg gtt aaa gat atc tca cgt act
gac ggt ggg aga 576Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr
Asp Gly Gly Arg 180 185 190atg
tta atc cat att ggc aga acg aaa acg ctg gtt agc acc gca ggt 624Met
Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly 195
200 205gta gag aag gca ctt agc ctg ggg gta
act aaa ctg gtc gag cga tgg 672Val Glu Lys Ala Leu Ser Leu Gly Val
Thr Lys Leu Val Glu Arg Trp 210 215
220att tcc gtc tct ggt gta gct gat gat ccg aat aac tac ctg ttt tgc
720Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys225
230 235 240cgg gtc aga aaa
aat ggt gtt gcc gcg cca tct gcc acc agc cag cta 768Arg Val Arg Lys
Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu 245
250 255tca act cgc gcc ctg gaa ggg att ttt gaa
gca act cat cga ttg att 816Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu
Ala Thr His Arg Leu Ile 260 265
270tac ggc gct aag gat gac tct ggt cag aga tac ctg gcc tgg tct gga
864Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly
275 280 285cac agt gcc cgt gtc gga gcc
gcg cga gat atg gcc cgc gct gga gtt 912His Ser Ala Arg Val Gly Ala
Ala Arg Asp Met Ala Arg Ala Gly Val 290 295
300tca ata ccg gag atc atg caa gct ggt ggc tgg acc aat gta aat att
960Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile305
310 315 320gtc atg aac tat
atc cgt aac ctg gat agt gaa aca ggg gca atg gtg 1008Val Met Asn Tyr
Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val 325
330 335cgc ctg ctg gaa gat ggc gat tag
1032Arg Leu Leu Glu Asp Gly Asp *
34059343PRTEscherichia coli 59Met Ser Asn Leu Leu Thr Val His Gln Asn Leu
Pro Ala Leu Pro Val1 5 10
15Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg
20 25 30Asp Arg Gln Ala Phe Ser Glu
His Thr Trp Lys Met Leu Leu Ser Val 35 40
45Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp
Phe 50 55 60Pro Ala Glu Pro Glu Asp
Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala65 70
75 80Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His
Leu Gly Gln Leu Asn 85 90
95Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala
100 105 110Val Ser Leu Val Met Arg
Arg Ile Arg Lys Glu Asn Val Asp Ala Gly 115 120
125Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe
Asp Gln 130 135 140Val Arg Ser Leu Met
Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn145 150
155 160Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr
Leu Leu Arg Ile Ala Glu 165 170
175Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg
180 185 190Met Leu Ile His Ile
Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly 195
200 205Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu
Val Glu Arg Trp 210 215 220Ile Ser Val
Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys225
230 235 240Arg Val Arg Lys Asn Gly Val
Ala Ala Pro Ser Ala Thr Ser Gln Leu 245
250 255Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr
His Arg Leu Ile 260 265 270Tyr
Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly 275
280 285His Ser Ala Arg Val Gly Ala Ala Arg
Asp Met Ala Arg Ala Gly Val 290 295
300Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile305
310 315 320Val Met Asn Tyr
Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val 325
330 335Arg Leu Leu Glu Asp Gly Asp
340601272DNASaccharomyces cerevisiaeCDS(1)...(1272)nucleotide sequence
encoding FLP recombinase 60atg cca caa ttt ggt ata tta tgt aaa aca cca
cct aag gtg ctt gtt 48Met Pro Gln Phe Gly Ile Leu Cys Lys Thr Pro
Pro Lys Val Leu Val1 5 10
15cgt cag ttt gtg gaa agg ttt gaa aga cct tca ggt gag aaa ata gca
96Arg Gln Phe Val Glu Arg Phe Glu Arg Pro Ser Gly Glu Lys Ile Ala
20 25 30tta tgt gct gct gaa cta acc
tat tta tgt tgg atg att aca cat aac 144Leu Cys Ala Ala Glu Leu Thr
Tyr Leu Cys Trp Met Ile Thr His Asn 35 40
45gga aca gca atc aag aga gcc aca ttc atg agc tat aat act atc
ata 192Gly Thr Ala Ile Lys Arg Ala Thr Phe Met Ser Tyr Asn Thr Ile
Ile 50 55 60agc aat tcg ctg agt ttc
gat att gtc aat aaa tca ctc cag ttt aaa 240Ser Asn Ser Leu Ser Phe
Asp Ile Val Asn Lys Ser Leu Gln Phe Lys65 70
75 80tac aag acg caa aaa gca aca att ctg gaa gcc
tca tta aag aaa ttg 288Tyr Lys Thr Gln Lys Ala Thr Ile Leu Glu Ala
Ser Leu Lys Lys Leu 85 90
95att cct gct tgg gaa ttt aca att att cct tac tat gga caa aaa cat
336Ile Pro Ala Trp Glu Phe Thr Ile Ile Pro Tyr Tyr Gly Gln Lys His
100 105 110caa tct gat atc act gat
att gta agt agt ttg caa tta cag ttc gaa 384Gln Ser Asp Ile Thr Asp
Ile Val Ser Ser Leu Gln Leu Gln Phe Glu 115 120
125tca tcg gaa gaa gca gat aag gga aat agc cac agt aaa aaa
atg ctt 432Ser Ser Glu Glu Ala Asp Lys Gly Asn Ser His Ser Lys Lys
Met Leu 130 135 140aaa gca ctt cta agt
gag ggt gaa agc atc tgg gag atc act gag aaa 480Lys Ala Leu Leu Ser
Glu Gly Glu Ser Ile Trp Glu Ile Thr Glu Lys145 150
155 160ata cta aat tcg ttt gag tat act tcg aga
ttt aca aaa aca aaa act 528Ile Leu Asn Ser Phe Glu Tyr Thr Ser Arg
Phe Thr Lys Thr Lys Thr 165 170
175tta tac caa ttc ctc ttc cta gct act ttc atc aat tgt gga aga ttc
576Leu Tyr Gln Phe Leu Phe Leu Ala Thr Phe Ile Asn Cys Gly Arg Phe
180 185 190agc gat att aag aac gtt
gat ccg aaa tca ttt aaa tta gtc caa aat 624Ser Asp Ile Lys Asn Val
Asp Pro Lys Ser Phe Lys Leu Val Gln Asn 195 200
205aag tat ctg gga gta ata atc cag tgt tta gtg aca gag aca
aag aca 672Lys Tyr Leu Gly Val Ile Ile Gln Cys Leu Val Thr Glu Thr
Lys Thr 210 215 220agc gtt agt agg cac
ata tac ttc ttt agc gca agg ggt agg atc gat 720Ser Val Ser Arg His
Ile Tyr Phe Phe Ser Ala Arg Gly Arg Ile Asp225 230
235 240cca ctt gta tat ttg gat gaa ttt ttg agg
aat tct gaa cca gtc cta 768Pro Leu Val Tyr Leu Asp Glu Phe Leu Arg
Asn Ser Glu Pro Val Leu 245 250
255aaa cga gta aat agg acc ggc aat tct tca agc aat aaa cag gaa tac
816Lys Arg Val Asn Arg Thr Gly Asn Ser Ser Ser Asn Lys Gln Glu Tyr
260 265 270caa tta tta aaa gat aac
tta gtc aga tcg tac aat aaa gct ttg aag 864Gln Leu Leu Lys Asp Asn
Leu Val Arg Ser Tyr Asn Lys Ala Leu Lys 275 280
285aaa aat gcg cct tat tca atc ttt gct ata aaa aat ggc cca
aaa tct 912Lys Asn Ala Pro Tyr Ser Ile Phe Ala Ile Lys Asn Gly Pro
Lys Ser 290 295 300cac att gga aga cat
ttg atg acc tca ttt ctt tca atg aag ggc cta 960His Ile Gly Arg His
Leu Met Thr Ser Phe Leu Ser Met Lys Gly Leu305 310
315 320acg gag ttg act aat gtt gtg gga aat tgg
agc gat aag cgt gct tct 1008Thr Glu Leu Thr Asn Val Val Gly Asn Trp
Ser Asp Lys Arg Ala Ser 325 330
335gcc gtg gcc agg aca acg tat act cat cag ata aca gca ata cct gat
1056Ala Val Ala Arg Thr Thr Tyr Thr His Gln Ile Thr Ala Ile Pro Asp
340 345 350cac tac ttc gca cta gtt
tct cgg tac tat gca tat gat cca ata tca 1104His Tyr Phe Ala Leu Val
Ser Arg Tyr Tyr Ala Tyr Asp Pro Ile Ser 355 360
365aag gaa atg ata gca ttg aag gat gag act aat cca att gag
gag tgg 1152Lys Glu Met Ile Ala Leu Lys Asp Glu Thr Asn Pro Ile Glu
Glu Trp 370 375 380cag cat ata gaa cag
cta aag ggt agt gct gaa gga agc ata cga tac 1200Gln His Ile Glu Gln
Leu Lys Gly Ser Ala Glu Gly Ser Ile Arg Tyr385 390
395 400ccc gca tgg aat ggg ata ata tca cag gag
gta cta gac tac ctt tca 1248Pro Ala Trp Asn Gly Ile Ile Ser Gln Glu
Val Leu Asp Tyr Leu Ser 405 410
415tcc tac ata aat aga cgc ata taa
1272Ser Tyr Ile Asn Arg Arg Ile * 42061422PRTSaccharomyces
cerevisiae 61Pro Gln Phe Gly Ile Leu Cys Lys Thr Pro Pro Lys Val Leu Val
Arg1 5 10 15Gln Phe Val
Glu Arg Phe Glu Arg Pro Ser Gly Glu Lys Ile Ala Leu 20
25 30Cys Ala Ala Glu Leu Thr Tyr Leu Cys Trp
Met Ile Thr His Asn Gly 35 40
45Thr Ala Ile Lys Arg Ala Thr Phe Met Ser Tyr Asn Thr Ile Ile Ser 50
55 60Asn Ser Leu Ser Phe Asp Ile Val Asn
Lys Ser Leu Gln Phe Lys Tyr65 70 75
80Lys Thr Gln Lys Ala Thr Ile Leu Glu Ala Ser Leu Lys Lys
Leu Ile 85 90 95Pro Ala
Trp Glu Phe Thr Ile Ile Pro Tyr Tyr Gly Gln Lys His Gln 100
105 110Ser Asp Ile Thr Asp Ile Val Ser Ser
Leu Gln Leu Gln Phe Glu Ser 115 120
125Ser Glu Glu Ala Asp Lys Gly Asn Ser His Ser Lys Lys Met Leu Lys
130 135 140Ala Leu Leu Ser Glu Gly Glu
Ser Ile Trp Glu Ile Thr Glu Lys Ile145 150
155 160Leu Asn Ser Phe Glu Tyr Thr Ser Arg Phe Thr Lys
Thr Lys Thr Leu 165 170
175Tyr Gln Phe Leu Phe Leu Ala Thr Phe Ile Asn Cys Gly Arg Phe Ser
180 185 190Asp Ile Lys Asn Val Asp
Pro Lys Ser Phe Lys Leu Val Gln Asn Lys 195 200
205Tyr Leu Gly Val Ile Ile Gln Cys Leu Val Thr Glu Thr Lys
Thr Ser 210 215 220Val Ser Arg His Ile
Tyr Phe Phe Ser Ala Arg Gly Arg Ile Asp Pro225 230
235 240Leu Val Tyr Leu Asp Glu Phe Leu Arg Asn
Ser Glu Pro Val Leu Lys 245 250
255Arg Val Asn Arg Thr Gly Asn Ser Ser Ser Asn Lys Gln Glu Tyr Gln
260 265 270Leu Leu Lys Asp Asn
Leu Val Arg Ser Tyr Asn Lys Ala Leu Lys Lys 275
280 285Asn Ala Pro Tyr Ser Ile Phe Ala Ile Lys Asn Gly
Pro Lys Ser His 290 295 300Ile Gly Arg
His Leu Met Thr Ser Phe Leu Ser Met Lys Gly Leu Thr305
310 315 320Glu Leu Thr Asn Val Val Gly
Asn Trp Ser Asp Lys Arg Ala Ser Ala 325
330 335Val Ala Arg Thr Thr Tyr Thr His Gln Ile Thr Ala
Ile Pro Asp His 340 345 350Tyr
Phe Ala Leu Val Ser Arg Tyr Tyr Ala Tyr Asp Pro Ile Ser Lys 355
360 365Glu Met Ile Ala Leu Lys Asp Glu Thr
Asn Pro Ile Glu Glu Trp Gln 370 375
380His Ile Glu Gln Leu Lys Gly Ser Ala Glu Gly Ser Ile Arg Tyr Pro385
390 395 400Ala Trp Asn Gly
Ile Ile Ser Gln Glu Val Leu Asp Tyr Leu Ser Ser 405
410 415Tyr Ile Asn Arg Arg Ile
420
62 48 DNA Artificial Sequence IR2
62 gaagttccta ttccgaagtt cctattctct agaaagtata ggaacttc 48
63 48 DNA Artificial Sequence IR1
63 gaagttccta tactttctag agaataggaa cttcggaata ggaacttc 48
64 66 DNA Bacteriophage mu CDS(1)...(66) nucleotide sequence encoding GIN
recombinase 64tca act ctg tat aaa aaa cac ccc gcg aaa cga gcg cat ata gaa
aac 48Ser Thr Leu Tyr Lys Lys His Pro Ala Lys Arg Ala His Ile Glu
Asn1 5 10 15gac gat cga
atc aat taa 66Asp Asp Arg
Ile Asn * 206521PRTBacteriophage mu 65Ser Thr Leu Tyr Lys Lys
His Pro Ala Lys Arg Ala His Ile Glu Asn1 5
10 15Asp Asp Arg Ile Asn
206669DNABacteriophage muCDS(1)...(69)nucleotide sequence encoding Gin
recombinase 66tat aaa aaa cat ccc gcg aaa cga acg cat ata gaa aac gac gat
cga 48Tyr Lys Lys His Pro Ala Lys Arg Thr His Ile Glu Asn Asp Asp
Arg1 5 10 15atc aat caa
atc gat cgg taa 69Ile Asn Gln
Ile Asp Arg * 206722PRTBacteriophage muGin recombinase of
bacteriophage mu 67Tyr Lys Lys His Pro Ala Lys Arg Thr His Ile Glu Asn
Asp Asp Arg1 5 10 15Ile
Asn Gln Ile Asp Arg 2068555DNAEscherichia
coliCDS(1)...(555)nucleotide sequence encoding PIN recombinase 68atg ctt
att ggc tat gta cgc gta tca aca aat gac cag aac aca gat 48Met Leu
Ile Gly Tyr Val Arg Val Ser Thr Asn Asp Gln Asn Thr Asp1 5
10 15cta caa cgt aat gcg ctg aac tgt
gca gga tgc gag ctg att ttt gaa 96Leu Gln Arg Asn Ala Leu Asn Cys
Ala Gly Cys Glu Leu Ile Phe Glu 20 25
30gac aag ata agc ggc aca aag tcc gaa agg ccg gga ctg aaa aaa
ctg 144Asp Lys Ile Ser Gly Thr Lys Ser Glu Arg Pro Gly Leu Lys Lys
Leu 35 40 45ctc agg aca tta tcg
gca ggt gac act ctg gtt gtc tgg aag ctg gat 192Leu Arg Thr Leu Ser
Ala Gly Asp Thr Leu Val Val Trp Lys Leu Asp 50 55
60cgg ctg ggg cgt agt atg cgg cat ctt gtc gtg ctg gtg gag
gag ttg 240Arg Leu Gly Arg Ser Met Arg His Leu Val Val Leu Val Glu
Glu Leu65 70 75 80cgc
gaa cga ggc atc aac ttt cgt agt ctg acg gat tca att gat acc 288Arg
Glu Arg Gly Ile Asn Phe Arg Ser Leu Thr Asp Ser Ile Asp Thr
85 90 95agc aca cca atg gga cgc ttt
ttc ttt cat gtg atg ggt gcc ctg gct 336Ser Thr Pro Met Gly Arg Phe
Phe Phe His Val Met Gly Ala Leu Ala 100 105
110gaa atg gag cgt gaa ctg att gtt gaa cga aca aaa gct gga
ctg gaa 384Glu Met Glu Arg Glu Leu Ile Val Glu Arg Thr Lys Ala Gly
Leu Glu 115 120 125act gct cgt gca
cag gga cga att ggt gga cgt cgt ccc aaa ctt aca 432Thr Ala Arg Ala
Gln Gly Arg Ile Gly Gly Arg Arg Pro Lys Leu Thr 130
135 140cca gaa caa tgg gca caa gct gga cga tta att gca
gca gga act cct 480Pro Glu Gln Trp Ala Gln Ala Gly Arg Leu Ile Ala
Ala Gly Thr Pro145 150 155
160cgc cag aag gtg gcg att atc tat gat gtt ggt gtg tca act ttg tat
528Arg Gln Lys Val Ala Ile Ile Tyr Asp Val Gly Val Ser Thr Leu Tyr
165 170 175aag agg ttt cct gca
ggg gat aaa taa 555Lys Arg Phe Pro Ala
Gly Asp Lys * 18069184PRTEscherichia coli 69Met Leu Ile Gly
Tyr Val Arg Val Ser Thr Asn Asp Gln Asn Thr Asp1 5
10 15Leu Gln Arg Asn Ala Leu Asn Cys Ala Gly
Cys Glu Leu Ile Phe Glu 20 25
30Asp Lys Ile Ser Gly Thr Lys Ser Glu Arg Pro Gly Leu Lys Lys Leu
35 40 45Leu Arg Thr Leu Ser Ala Gly Asp
Thr Leu Val Val Trp Lys Leu Asp 50 55
60Arg Leu Gly Arg Ser Met Arg His Leu Val Val Leu Val Glu Glu Leu65
70 75 80Arg Glu Arg Gly Ile
Asn Phe Arg Ser Leu Thr Asp Ser Ile Asp Thr 85
90 95Ser Thr Pro Met Gly Arg Phe Phe Phe His Val
Met Gly Ala Leu Ala 100 105
110Glu Met Glu Arg Glu Leu Ile Val Glu Arg Thr Lys Ala Gly Leu Glu
115 120 125Thr Ala Arg Ala Gln Gly Arg
Ile Gly Gly Arg Arg Pro Lys Leu Thr 130 135
140Pro Glu Gln Trp Ala Gln Ala Gly Arg Leu Ile Ala Ala Gly Thr
Pro145 150 155 160Arg Gln
Lys Val Ala Ile Ile Tyr Asp Val Gly Val Ser Thr Leu Tyr
165 170 175Lys Arg Phe Pro Ala Gly Asp
Lys 180704778DNAArtificial SequencepCX plasmid 70gtcgacattg
attattgact agttattaat agtaatcaat tacggggtca ttagttcata 60gcccatatat
ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 120ccaacgaccc
ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 180ggactttcca
ttgacgtcaa tgggtggact atttacggta aactgcccac ttggcagtac 240atcaagtgta
tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg 300cctggcatta
tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg 360tattagtcat
cgctattacc atgggtcgag gtgagcccca cgttctgctt cactctcccc 420atctcccccc
cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca 480gcgatggggg
cggggggggg gggggcgcgc gccaggcggg gcggggcggg gcgaggggcg 540gggcggggcg
aggcggagag gtgcggcggc agccaatcag agcggcgcgc tccgaaagtt 600tccttttatg
gcgaggcggc ggcggcggcg gccctataaa aagcgaagcg cgcggcgggc 660gggagtcgct
gcgttgcctt cgccccgtgc cccgctccgc gccgcctcgc gccgcccgcc 720ccggctctga
ctgaccgcgt tactcccaca ggtgagcggg cgggacggcc cttctcctcc 780gggctgtaat
tagcgcttgg tttaatgacg gctcgtttct tttctgtggc tgcgtgaaag 840ccttaaaggg
ctccgggagg gccctttgtg cgggggggag cggctcgggg ggtgcgtgcg 900tgtgtgtgtg
cgtggggagc gccgcgtgcg gcccgcgctg cccggcggct gtgagcgctg 960cgggcgcggc
gcggggcttt gtgcgctccg cgtgtgcgcg aggggagcgc ggccgggggc 1020ggtgccccgc
ggtgcggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080tgggggggtg
agcagggggt gtgggcgcgg cggtcgggct gtaacccccc cctgcacccc 1140cctccccgag
ttgctgagca cggcccggct tcgggtgcgg ggctccgtgc ggggcgtggc 1200gcggggctcg
ccgtgccggg cggggggtgg cggcaggtgg gggtgccggg cggggcgggg 1260ccgcctcggg
ccggggaggg ctcgggggag gggcgcggcg gccccggagc gccggcggct 1320gtcgaggcgc
ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380gacttccttt
gtcccaaatc tggcggagcc gaaatctggg aggcgccgcc gcaccccctc 1440tagcgggcgc
gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg ggagggcctt 1500cgtgcgtcgc
cgcgccgccg tccccttctc catctccagc ctcggggctg ccgcaggggg 1560acggctgcct
tcggggggga cggggcaggg cggggttcgg cttctggcgt gtgaccggcg 1620gctctagagc
ctctgctaac catgttcatg ccttcttctt tttcctacag ctcctgggca 1680acgtgctggt
tgttgtgctg tctcatcatt ttggcaaaga attcactcct caggtgcagg 1740ctgcctatca
gaaggtggtg gctggtgtgg ccaatgccct ggctcacaaa taccactgag 1800atctttttcc
ctctgccaaa aattatgggg acatcatgaa gccccttgag catctgactt 1860ctggctaata
aaggaaattt attttcattg caatagtgtg ttggaatttt ttgtgtctct 1920cactcggaag
gacatatggg agggcaaatc atttaaaaca tcagaatgag tatttggttt 1980agagtttggc
aacatatgcc atatgctggc tgccatgaac aaaggtggct ataaagaggt 2040catcagtata
tgaaacagcc ccctgctgtc cattccttat tccatagaaa agccttgact 2100tgaggttaga
ttttttttat attttgtttt gtgttatttt tttctttaac atccctaaaa 2160ttttccttac
atgttttact agccagattt ttcctcctct cctgactact cccagtcata 2220gctgtccctc
ttctcttatg aagatccctc gacctgcagc ccaagcttgg cgtaatcatg 2280gtcatagctg
tttcctgtgt gaaattgtta tccgctcaca attccacaca acatacgagc 2340cggaagcata
aagtgtaaag cctggggtgc ctaatgagtg agctaactca cattaattgc 2400gttgcgctca
ctgcccgctt tccagtcggg aaacctgtcg tgccagcgga tccgcatctc 2460aattagtcag
caaccatagt cccgccccta actccgccca tcccgcccct aactccgccc 2520agttccgccc
attctccgcc ccatggctga ctaatttttt ttatttatgc agaggccgag 2580gccgcctcgg
cctctgagct attccagaag tagtgaggag gcttttttgg aggcctaggc 2640ttttgcaaaa
agctaacttg tttattgcag cttataatgg ttacaaataa agcaatagca 2700tcacaaattt
cacaaataaa gcattttttt cactgcattc tagttgtggt ttgtccaaac 2760tcatcaatgt
atcttatcat gtctggatcc gctgcattaa tgaatcggcc aacgcgcggg 2820gagaggcggt
ttgcgtattg ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc 2880ggtcgttcgg
ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac 2940agaatcaggg
gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa 3000ccgtaaaaag
gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca 3060caaaaatcga
cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc 3120gtttccccct
ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata 3180cctgtccgcc
tttctccctt cgggaagcgt ggcgctttct caatgctcac gctgtaggta 3240tctcagttcg
gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca 3300gcccgaccgc
tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga 3360cttatcgcca
ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg 3420tgctacagag
ttcttgaagt ggtggcctaa ctacggctac actagaagga cagtatttgg 3480tatctgcgct
ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg 3540caaacaaacc
accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag 3600aaaaaaagga
tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa 3660cgaaaactca
cgttaaggga ttttggtcat gagattatca aaaaggatct tcacctagat 3720ccttttaaat
taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc 3780tgacagttac
caatgcttaa tcagtgaggc acctatctca gcgatctgtc tatttcgttc 3840atccatagtt
gcctgactcc ccgtcgtgta gataactacg atacgggagg gcttaccatc 3900tggccccagt
gctgcaatga taccgcgaga cccacgctca ccggctccag atttatcagc 3960aataaaccag
ccagccggaa gggccgagcg cagaagtggt cctgcaactt tatccgcctc 4020catccagtct
attaattgtt gccgggaagc tagagtaagt agttcgccag ttaatagttt 4080gcgcaacgtt
gttgccattg ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc 4140ttcattcagc
tccggttccc aacgatcaag gcgagttaca tgatccccca tgttgtgcaa 4200aaaagcggtt
agctccttcg gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt 4260atcactcatg
gttatggcag cactgcataa ttctcttact gtcatgccat ccgtaagatg 4320cttttctgtg
actggtgagt actcaaccaa gtcattctga gaatagtgta tgcggcgacc 4380gagttgctct
tgcccggcgt caatacggga taataccgcg ccacatagca gaactttaaa 4440agtgctcatc
attggaaaac gttcttcggg gcgaaaactc tcaaggatct taccgctgtt 4500gagatccagt
tcgatgtaac ccactcgtgc acccaactga tcttcagcat cttttacttt 4560caccagcgtt
tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa agggaataag 4620ggcgacacgg
aaatgttgaa tactcatact cttccttttt caatattatt gaagcattta 4680tcagggttat
tgtctcatga gcggatacat atttgaatgt atttagaaaa ataaacaaat 4740aggggttccg
cgcacatttc cccgaaaagt gccacctg
4778
SEQ ID NO: 71; length 5510; DNA; Artificial Sequence; pCXeGFP plasmid
gtcgacattg attattgact
agttattaat agtaatcaat tacggggtca ttagttcata 60gcccatatat ggagttccgc
gttacataac ttacggtaaa tggcccgcct ggctgaccgc 120ccaacgaccc ccgcccattg
acgtcaataa tgacgtatgt tcccatagta acgccaatag 180ggactttcca ttgacgtcaa
tgggtggact atttacggta aactgcccac ttggcagtac 240atcaagtgta tcatatgcca
agtacgcccc ctattgacgt caatgacggt aaatggcccg 300cctggcatta tgcccagtac
atgaccttat gggactttcc tacttggcag tacatctacg 360tattagtcat cgctattacc
atgggtcgag gtgagcccca cgttctgctt cactctcccc 420atctcccccc cctccccacc
cccaattttg tatttattta ttttttaatt attttgtgca 480gcgatggggg cggggggggg
gggggcgcgc gccaggcggg gcggggcggg gcgaggggcg 540gggcggggcg aggcggagag
gtgcggcggc agccaatcag agcggcgcgc tccgaaagtt 600tccttttatg gcgaggcggc
ggcggcggcg gccctataaa aagcgaagcg cgcggcgggc 660gggagtcgct gcgttgcctt
cgccccgtgc cccgctccgc gccgcctcgc gccgcccgcc 720ccggctctga ctgaccgcgt
tactcccaca ggtgagcggg cgggacggcc cttctcctcc 780gggctgtaat tagcgcttgg
tttaatgacg gctcgtttct tttctgtggc tgcgtgaaag 840ccttaaaggg ctccgggagg
gccctttgtg cgggggggag cggctcgggg ggtgcgtgcg 900tgtgtgtgtg cgtggggagc
gccgcgtgcg gcccgcgctg cccggcggct gtgagcgctg 960cgggcgcggc gcggggcttt
gtgcgctccg cgtgtgcgcg aggggagcgc ggccgggggc 1020ggtgccccgc ggtgcggggg
ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080tgggggggtg agcagggggt
gtgggcgcgg cggtcgggct gtaacccccc cctgcacccc 1140cctccccgag ttgctgagca
cggcccggct tcgggtgcgg ggctccgtgc ggggcgtggc 1200gcggggctcg ccgtgccggg
cggggggtgg cggcaggtgg gggtgccggg cggggcgggg 1260ccgcctcggg ccggggaggg
ctcgggggag gggcgcggcg gccccggagc gccggcggct 1320gtcgaggcgc ggcgagccgc
agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380gacttccttt gtcccaaatc
tggcggagcc gaaatctggg aggcgccgcc gcaccccctc 1440tagcgggcgc gggcgaagcg
gtgcggcgcc ggcaggaagg aaatgggcgg ggagggcctt 1500cgtgcgtcgc cgcgccgccg
tccccttctc catctccagc ctcggggctg ccgcaggggg 1560acggctgcct tcggggggga
cggggcaggg cggggttcgg cttctggcgt gtgaccggcg 1620gctctagagc ctctgctaac
catgttcatg ccttcttctt tttcctacag ctcctgggca 1680acgtgctggt tgttgtgctg
tctcatcatt ttggcaaaga attcgccacc atggtgagca 1740agggcgagga gctgttcacc
ggggtggtgc ccatcctggt cgagctggac ggcgacgtaa 1800acggccacaa gttcagcgtg
tccggcgagg gcgagggcga tgccacctac ggcaagctga 1860ccctgaagtt catctgcacc
accggcaagc tgcccgtgcc ctggcccacc ctcgtgacca 1920ccctgaccta cggcgtgcag
tgcttcagcc gctaccccga ccacatgaag cagcacgact 1980tcttcaagtc cgccatgccc
gaaggctacg tccaggagcg caccatcttc ttcaaggacg 2040acggcaacta caagacccgc
gccgaggtga agttcgaggg cgacaccctg gtgaaccgca 2100tcgagctgaa gggcatcgac
ttcaaggagg acggcaacat cctggggcac aagctggagt 2160acaactacaa cagccacaac
gtctatatca tggccgacaa gcagaagaac ggcatcaagg 2220tgaacttcaa gatccgccac
aacatcgagg acggcagcgt gcagctcgcc gaccactacc 2280agcagaacac ccccatcggc
gacggccccg tgctgctgcc cgacaaccac tacctgagca 2340cccagtccgc cctgagcaaa
gaccccaacg agaagcgcga tcacatggtc ctgctggagt 2400tcgtgaccgc cgccgggatc
actctcggca tggacgagct gtacaagtaa gaattcactc 2460ctcaggtgca ggctgcctat
cagaaggtgg tggctggtgt ggccaatgcc ctggctcaca 2520aataccactg agatcttttt
ccctctgcca aaaattatgg ggacatcatg aagccccttg 2580agcatctgac ttctggctaa
taaaggaaat ttattttcat tgcaatagtg tgttggaatt 2640ttttgtgtct ctcactcgga
aggacatatg ggagggcaaa tcatttaaaa catcagaatg 2700agtatttggt ttagagtttg
gcaacatatg ccatatgctg gctgccatga acaaaggtgg 2760ctataaagag gtcatcagta
tatgaaacag ccccctgctg tccattcctt attccataga 2820aaagccttga cttgaggtta
gatttttttt atattttgtt ttgtgttatt tttttcttta 2880acatccctaa aattttcctt
acatgtttta ctagccagat ttttcctcct ctcctgacta 2940ctcccagtca tagctgtccc
tcttctctta tgaagatccc tcgacctgca gcccaagctt 3000ggcgtaatca tggtcatagc
tgtttcctgt gtgaaattgt tatccgctca caattccaca 3060caacatacga gccggaagca
taaagtgtaa agcctggggt gcctaatgag tgagctaact 3120cacattaatt gcgttgcgct
cactgcccgc tttccagtcg ggaaacctgt cgtgccagcg 3180gatccgcatc tcaattagtc
agcaaccata gtcccgcccc taactccgcc catcccgccc 3240ctaactccgc ccagttccgc
ccattctccg ccccatggct gactaatttt ttttatttat 3300gcagaggccg aggccgcctc
ggcctctgag ctattccaga agtagtgagg aggctttttt 3360ggaggcctag gcttttgcaa
aaagctaact tgtttattgc agcttataat ggttacaaat 3420aaagcaatag catcacaaat
ttcacaaata aagcattttt ttcactgcat tctagttgtg 3480gtttgtccaa actcatcaat
gtatcttatc atgtctggat ccgctgcatt aatgaatcgg 3540ccaacgcgcg gggagaggcg
gtttgcgtat tgggcgctct tccgcttcct cgctcactga 3600ctcgctgcgc tcggtcgttc
ggctgcggcg agcggtatca gctcactcaa aggcggtaat 3660acggttatcc acagaatcag
gggataacgc aggaaagaac atgtgagcaa aaggccagca 3720aaaggccagg aaccgtaaaa
aggccgcgtt gctggcgttt ttccataggc tccgcccccc 3780tgacgagcat cacaaaaatc
gacgctcaag tcagaggtgg cgaaacccga caggactata 3840aagataccag gcgtttcccc
ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 3900gcttaccgga tacctgtccg
cctttctccc ttcgggaagc gtggcgcttt ctcaatgctc 3960acgctgtagg tatctcagtt
cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 4020accccccgtt cagcccgacc
gctgcgcctt atccggtaac tatcgtcttg agtccaaccc 4080ggtaagacac gacttatcgc
cactggcagc agccactggt aacaggatta gcagagcgag 4140gtatgtaggc ggtgctacag
agttcttgaa gtggtggcct aactacggct acactagaag 4200gacagtattt ggtatctgcg
ctctgctgaa gccagttacc ttcggaaaaa gagttggtag 4260ctcttgatcc ggcaaacaaa
ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 4320gattacgcgc agaaaaaaag
gatctcaaga agatcctttg atcttttcta cggggtctga 4380cgctcagtgg aacgaaaact
cacgttaagg gattttggtc atgagattat caaaaaggat 4440cttcacctag atccttttaa
attaaaaatg aagttttaaa tcaatctaaa gtatatatga 4500gtaaacttgg tctgacagtt
accaatgctt aatcagtgag gcacctatct cagcgatctg 4560tctatttcgt tcatccatag
ttgcctgact ccccgtcgtg tagataacta cgatacggga 4620gggcttacca tctggcccca
gtgctgcaat gataccgcga gacccacgct caccggctcc 4680agatttatca gcaataaacc
agccagccgg aagggccgag cgcagaagtg gtcctgcaac 4740tttatccgcc tccatccagt
ctattaattg ttgccgggaa gctagagtaa gtagttcgcc 4800agttaatagt ttgcgcaacg
ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc 4860gtttggtatg gcttcattca
gctccggttc ccaacgatca aggcgagtta catgatcccc 4920catgttgtgc aaaaaagcgg
ttagctcctt cggtcctccg atcgttgtca gaagtaagtt 4980ggccgcagtg ttatcactca
tggttatggc agcactgcat aattctctta ctgtcatgcc 5040atccgtaaga tgcttttctg
tgactggtga gtactcaacc aagtcattct gagaatagtg 5100tatgcggcga ccgagttgct
cttgcccggc gtcaatacgg gataataccg cgccacatag 5160cagaacttta aaagtgctca
tcattggaaa acgttcttcg gggcgaaaac tctcaaggat 5220cttaccgctg ttgagatcca
gttcgatgta acccactcgt gcacccaact gatcttcagc 5280atcttttact ttcaccagcg
tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa 5340aaagggaata agggcgacac
ggaaatgttg aatactcata ctcttccttt ttcaatatta 5400ttgaagcatt tatcagggtt
attgtctcat gagcggatac atatttgaat gtatttagaa 5460aaataaacaa ataggggttc
cgcgcacatt tccccgaaaa gtgccacctg 5510
SEQ ID NO: 72; length 282; DNA; Artificial Sequence; attp
ccttgcgcta atgctctgtt acaggtcact aataccatct aagtagttga
ttcatagtga 60ctgcatatgt tgtgttttac agtattatgt agtctgtttt ttatgcaaaa
tctaatttaa 120tatattgata tttatatcat tttacgtttc tcgttcagct tttttatact
aagttggcat 180tataaaaaag cattgcttat caatttgttg caacgaacag gtcactatca
gtcaaaataa 240aatcattatt tgatttcaat tttgtcccac tccctgcctc tg
282
SEQ ID NO: 73; length 20; DNA; Artificial Sequence; Primer
ggccccgtaa tgcagaagaa 20
SEQ ID NO: 74; length 32; DNA; Artificial Sequence; Primer
ggtttaaagt gcgctcctcc aagaacgtca tc 32
SEQ ID NO: 75; length 40; DNA; Artificial Sequence; Primer
agatctagag ccgccgctac aggaacaggt ggtggcggcc 40
SEQ ID NO: 76; length 37; DNA; Artificial Sequence; Primer 5PacSV40
ctgttaatta actgtggaat gtgtgtcagt tagggtg 37
SEQ ID NO: 77; length 20; DNA; Artificial Sequence; Primer Antisense Zeo
tgaacagggt cacgtcgtcc 20
SEQ ID NO: 78; length 24; DNA; Artificial Sequence; Primer 5' HETS
gggccgaaac gatctcaacc tatt 24
SEQ ID NO: 79; length 19; DNA; Artificial Sequence; Primer 3' HETS
cgcagcggcc ctcctactc 19
SEQ ID NO: 80; length 29; DNA; Artificial Sequence; Primer 5BSD
accatgaaaa catttaacat ttctcaaca 29
SEQ ID NO: 81; length 29; DNA; Artificial Sequence; Primer SV40polyA
tttatttgtg aaatttgtga tgctattgc 29
SEQ ID NO: 82; length 25; DNA; Artificial Sequence; Primer 3BSP
ttaatttcgg gtatatttga gtgga 25
SEQ ID NO: 83; length 32; DNA; Artificial Sequence; Primer EPO5XBA
tatctagaat gggggtgcac gaatgtcctg cc 32
SEQ ID NO: 84; length 32; DNA; Artificial Sequence; Primer EPO3SBI
tacgtacgtc atctgtcccc tgtcctgcag gc 32
SEQ ID NO: 85; length 27; DNA; Artificial Sequence; Primer GENEPO3BSI
cgtacgtcat ctgtcccctg tcctgca 27
SEQ ID NO: 86; length 28; DNA; Artificial Sequence; Primer GENEPO5XBA
tctagaatgg gggtgcacgg tgagtact 28
SEQ ID NO: 87; length 4862; DNA; Artificial Sequence; pD2eGFP-1N plasmid from Clontech
tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg
60cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt
120gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca
180atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc
240aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta
300catgacctta tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac
360catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg
420atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg
480ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt
540acggtgggag gtctatataa gcagagctgg tttagtgaac cgtcagatcc gctagcgcta
600ccggactcag atctcgagct caagcttcga attctgcagt cgacggtacc gcgggcccgg
660gatccaccgg tcgccaccat ggtgagcaag ggcgaggagc tgttcaccgg ggtggtgccc
720atcctggtcg agctggacgg cgacgtaaac ggccacaagt tcagcgtgtc cggcgagggc
780gagggcgatg ccacctacgg caagctgacc ctgaagttca tctgcaccac cggcaagctg
840cccgtgccct ggcccaccct cgtgaccacc ctgacctacg gcgtgcagtg cttcagccgc
900taccccgacc acatgaagca gcacgacttc ttcaagtccg ccatgcccga aggctacgtc
960caggagcgca ccatcttctt caaggacgac ggcaactaca agacccgcgc cgaggtgaag
1020ttcgagggcg acaccctggt gaaccgcatc gagctgaagg gcatcgactt caaggaggac
1080ggcaacatcc tggggcacaa gctggagtac aactacaaca gccacaacgt ctatatcatg
1140gccgacaagc agaagaacgg catcaaggtg aacttcaaga tccgccacaa catcgaggac
1200ggcagcgtgc agctcgccga ccactaccag cagaacaccc ccatcggcga cggccccgtg
1260ctgctgcccg acaaccacta cctgagcacc cagtccgccc tgagcaaaga ccccaacgag
1320aagcgcgatc acatggtcct gctggagttc gtgaccgccg ccgggatcac tctcggcatg
1380gacgagctgt acaagaagct tagccatggc ttcccgccgg aggtggagga gcaggatgat
1440ggcacgctgc ccatgtcttg tgcccaggag agcgggatgg accgtcaccc tgcagcctgt
1500gcttctgcta ggatcaatgt gtagatgcgc ggccgcgact ctagatcata atcagccata
1560ccacatttgt agaggtttta cttgctttaa aaaacctccc acacctcccc ctgaacctga
1620aacataaaat gaatgcaatt gttgttgtta acttgtttat tgcagcttat aatggttaca
1680aataaagcaa tagcatcaca aatttcacaa ataaagcatt tttttcactg cattctagtt
1740gtggtttgtc caaactcatc aatgtatctt aaggcgtaaa ttgtaagcgt taatattttg
1800ttaaaattcg cgttaaattt ttgttaaatc agctcatttt ttaaccaata ggccgaaatc
1860ggcaaaatcc cttataaatc aaaagaatag accgagatag ggttgagtgt tgttccagtt
1920tggaacaaga gtccactatt aaagaacgtg gactccaacg tcaaagggcg aaaaaccgtc
1980tatcagggcg atggcccact acgtgaacca tcaccctaat caagtttttt ggggtcgagg
2040tgccgtaaag cactaaatcg gaaccctaaa gggagccccc gatttagagc ttgacgggga
2100aagccggcga acgtggcgag aaaggaaggg aagaaagcga aaggagcggg cgctagggcg
2160ctggcaagtg tagcggtcac gctgcgcgta accaccacac ccgccgcgct taatgcgccg
2220ctacagggcg cgtcaggtgg cacttttcgg ggaaatgtgc gcggaacccc tatttgttta
2280tttttctaaa tacattcaaa tatgtatccg ctcatgagac aataaccctg ataaatgctt
2340caataatatt gaaaaaggaa gagtcctgag gcggaaagaa ccagctgtgg aatgtgtgtc
2400agttagggtg tggaaagtcc ccaggctccc cagcaggcag aagtatgcaa agcatgcatc
2460tcaattagtc agcaaccagg tgtggaaagt ccccaggctc cccagcaggc agaagtatgc
2520aaagcatgca tctcaattag tcagcaacca tagtcccgcc cctaactccg cccatcccgc
2580ccctaactcc gcccagttcc gcccattctc cgccccatgg ctgactaatt ttttttattt
2640atgcagaggc cgaggccgcc tcggcctctg agctattcca gaagtagtga ggaggctttt
2700ttggaggcct aggcttttgc aaagatcgat caagagacag gatgaggatc gtttcgcatg
2760attgaacaag atggattgca cgcaggttct ccggccgctt gggtggagag gctattcggc
2820tatgactggg cacaacagac aatcggctgc tctgatgccg ccgtgttccg gctgtcagcg
2880caggggcgcc cggttctttt tgtcaagacc gacctgtccg gtgccctgaa tgaactgcaa
2940gacgaggcag cgcggctatc gtggctggcc acgacgggcg ttccttgcgc agctgtgctc
3000gacgttgtca ctgaagcggg aagggactgg ctgctattgg gcgaagtgcc ggggcaggat
3060ctcctgtcat ctcaccttgc tcctgccgag aaagtatcca tcatggctga tgcaatgcgg
3120cggctgcata cgcttgatcc ggctacctgc ccattcgacc accaagcgaa acatcgcatc
3180gagcgagcac gtactcggat ggaagccggt cttgtcgatc aggatgatct ggacgaagag
3240catcaggggc tcgcgccagc cgaactgttc gccaggctca aggcgagcat gcccgacggc
3300gaggatctcg tcgtgaccca tggcgatgcc tgcttgccga atatcatggt ggaaaatggc
3360cgcttttctg gattcatcga ctgtggccgg ctgggtgtgg cggaccgcta tcaggacata
3420gcgttggcta cccgtgatat tgctgaagag cttggcggcg aatgggctga ccgcttcctc
3480gtgctttacg gtatcgccgc tcccgattcg cagcgcatcg ccttctatcg ccttcttgac
3540gagttcttct gagcgggact ctggggttcg aaatgaccga ccaagcgacg cccaacctgc
3600catcacgaga tttcgattcc accgccgcct tctatgaaag gttgggcttc ggaatcgttt
3660tccgggacgc cggctggatg atcctccagc gcggggatct catgctggag ttcttcgccc
3720accctagggg gaggctaact gaaacacgga aggagacaat accggaagga acccgcgcta
3780tgacggcaat aaaaagacag aataaaacgc acggtgttgg gtcgtttgtt cataaacgcg
3840gggttcggtc ccagggctgg cactctgtcg ataccccacc gagaccccat tggggccaat
3900acgcccgcgt ttcttccttt tccccacccc accccccaag ttcgggtgaa ggcccagggc
3960tcgcagccaa cgtcggggcg gcaggccctg ccatagcctc aggttactca tatatacttt
4020agattgattt aaaacttcat ttttaattta aaaggatcta ggtgaagatc ctttttgata
4080atctcatgac caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag
4140aaaagatcaa aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa
4200caaaaaaacc accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt
4260ttccgaaggt aactggcttc agcagagcgc agataccaaa tactgtcctt ctagtgtagc
4320cgtagttagg ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa
4380tcctgttacc agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa
4440gacgatagtt accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc
4500ccagcttgga gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa
4560gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa
4620caggagagcg cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg
4680ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc
4740tatggaaaaa cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg
4800ctcacatgtt ctttcctgcg ttatcccctg attctgtgga taaccgtatt accgccatgc
4860at
4862
SEQ ID NO: 88; length 5192; DNA; Artificial Sequence; pIRESpuro2 plasmid from Clontech
gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg
60ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg
120cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc
180ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt
240gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata
300tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc
360cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc
420attgacgtca atgggtggac tatttacggt aaactgccca cttggcagta catcaagtgt
480atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt
540atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca
600tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg
660actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc
720aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg
780gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca
840ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gcttggtacc
900gagctcggat cgatatctgc ggcctagcta gcgcttaagg cctgttaacc ggtcgtacgt
960ctccggattc gaattcggat ccgcggccgc atagataact gatccagtgt gctggaatta
1020attcgctgtc tgcgagggcc agctgttggg gtgagtactc cctctcaaaa gcgggcatga
1080cttctgcgct aagattgtca gtttccaaaa acgaggagga tttgatattc acctggcccg
1140cggtgatgcc tttgagggtg gccgcgtcca tctggtcaga aaagacaatc tttttgttgt
1200caagcttgag gtgtggcagg cttgagatct ggccatacac ttgagtgaca atgacatcca
1260ctttgccttt ctctccacag gtgtccactc ccaggtccaa ctgcaggtcg agcatgcatc
1320tagggcggcc aattccgccc ctctccctcc ccccccccta acgttactgg ccgaagccgc
1380ttggaataag gccggtgtgc gtttgtctat atgtgatttt ccaccatatt gccgtctttt
1440ggcaatgtga gggcccggaa acctggccct gtcttcttga cgagcattcc taggggtctt
1500tcccctctcg ccaaaggaat gcaaggtctg ttgaatgtcg tgaaggaagc agttcctctg
1560gaagcttctt gaagacaaac aacgtctgta gcgacccttt gcaggcagcg gaacccccca
1620cctggcgaca ggtgcctctg cggccaaaag ccacgtgtat aagatacacc tgcaaaggcg
1680gcacaacccc agtgccacgt tgtgagttgg atagttgtgg aaagagtcaa atggctctcc
1740tcaagcgtat tcaacaaggg gctgaaggat gcccagaagg taccccattg tatgggatct
1800gatctggggc ctcggtgcac atgctttaca tgtgtttagt cgaggttaaa aaaacgtcta
1860ggccccccga accacgggga cgtggttttc ctttgaaaaa cacgatgata agcttgccac
1920aacccacaag gagacgacct tccatgaccg agtacaagcc cacggtgcgc ctcgccaccc
1980gcgacgacgt cccccgggcc gtacgcaccc tcgccgccgc gttcgccgac taccccgcca
2040cgcgccacac cgtcgacccg gaccgccaca tcgagcgggt caccgagctg caagaactct
2100tcctcacgcg cgtcgggctc gacatcggca aggtgtgggt cgcggacgac ggcgccgcgg
2160tggcggtctg gaccacgccg gagagcgtcg aagcgggggc ggtgttcgcc gagatcggcc
2220cgcgcatggc cgagttgagc ggttcccggc tggccgcgca gcaacagatg gaaggcctcc
2280tggcgccgca ccggcccaag gagcccgcgt ggttcctggc caccgtcggc gtctcgcccg
2340accaccaggg caagggtctg ggcagcgccg tcgtgctccc cggagtggag gcggccgagc
2400gcgccggggt gcccgccttc ctggagacct ccgcgccccg caacctcccc ttctacgagc
2460ggctcggctt caccgtcacc gccgacgtcg agtgcccgaa ggaccgcgcg acctggtgca
2520tgacccgcaa gcccggtgcc tgacgcccgc cccacgaccc gcagcgcccg accgaaagga
2580gcgcacgacc ccatggctcc gaccgaagcc gacccgggcg gccccgccga ccccgcaccc
2640gcccccgagg cccaccgact ctagagctcg ctgatcagcc tcgactgtgc cttctagttg
2700ccagccatct gttgtttgcc cctcccccgt gccttccttg accctggaag gtgccactcc
2760cactgtcctt tcctaataaa atgaggaaat tgcatcgcat tgtctgagta ggtgtcattc
2820tattctgggg ggtggggtgg ggcaggacag caagggggag gattgggaag acaatagcag
2880gcatgctggg gatgcggtgg gctctatggc ttctgaggcg gaaagaacca gctggggctc
2940gagtgcattc tagttgtggt ttgtccaaac tcatcaatgt atcttatcat gtctgtatac
3000cgtcgacctc tagctagagc ttggcgtaat catggtcata gctgtttcct gtgtgaaatt
3060gttatccgct cacaattcca cacaacatac gagccggaag cataaagtgt aaagcctggg
3120gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc gctttccagt
3180cgggaaacct gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg agaggcggtt
3240tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc
3300tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca gaatcagggg
3360ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg
3420ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac aaaaatcgac
3480gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg tttccccctg
3540gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac ctgtccgcct
3600ttctcccttc gggaagcgtg gcgctttctc aatgctcacg ctgtaggtat ctcagttcgg
3660tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct
3720gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac ttatcgccac
3780tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt gctacagagt
3840tcttgaagtg gtggcctaac tacggctaca ctagaaggac agtatttggt atctgcgctc
3900tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca
3960ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat
4020ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac gaaaactcac
4080gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc cttttaaatt
4140aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct gacagttacc
4200aatgcttaat cagtgaggca cctatctcag cgatctgtct atttcgttca tccatagttg
4260cctgactccc cgtcgtgtag ataactacga tacgggaggg cttaccatct ggccccagtg
4320ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca ataaaccagc
4380cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc atccagtcta
4440ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg cgcaacgttg
4500ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct tcattcagct
4560ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa aaagcggtta
4620gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta tcactcatgg
4680ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc ttttctgtga
4740ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg agttgctctt
4800gcccggcgtc aatacgggat aataccgcgc cacatagcag aactttaaaa gtgctcatca
4860ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg agatccagtt
4920cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc accagcgttt
4980ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg gcgacacgga
5040aatgttgaat actcatactc ttcctttttc aatattattg aagcatttat cagggttatt
5100gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata ggggttccgc
5160gcacatttcc ccgaaaagtg ccacctgacg tc
5192
SEQ ID NO: 89; length 11182; DNA; Artificial Sequence; pAg1 Plasmid
catgccaacc acagggttcc
cctcgggatc aaagtacttt gatccaaccc ctccgctgct 60atagtgcagt cggcttctga
cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca 120agtcctaagt tacgcgacag
gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt 180gttttagtcg cataaagtag
aatacttgcg actagaaccg gagacattac gccatgaaca 240agagcgccgc cgctggcctg
ctgggctatg cccgcgtcag caccgacgac caggacttga 300ccaaccaacg ggccgaactg
cacgcggccg gctgcaccaa gctgttttcc gagaagatca 360ccggcaccag gcgcgaccgc
ccggagctgg ccaggatgct tgaccaccta cgccctggcg 420acgttgtgac agtgaccagg
ctagaccgcc tggcccgcag cacccgcgac ctactggaca 480ttgccgagcg catccaggag
gccggcgcgg gcctgcgtag cctggcagag ccgtgggccg 540acaccaccac gccggccggc
cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg 600agcgttccct aatcatcgac
cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg 660tgaagtttgg cccccgccct
accctcaccc cggcacagat cgcgcacgcc cgcgagctga 720tcgaccagga aggccgcacc
gtgaaagagg cggctgcact gcttggcgtg catcgctcga 780ccctgtaccg cgcacttgag
cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg 840gtgccttccg tgaggacgca
ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac 900gccaagagga acaagcatga
aaccgcacca ggacggccag gacgaaccgt ttttcattac 960cgaagagatc gaggcggaga
tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt 1020ctcaaccgtg cggctgcatg
aaatcctggc cggtttgtct gatgccaagc tggcggcctg 1080gccggccagc ttggccgctg
aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt 1140tgagtaaaac agcttgcgtc
atgcggtcgc tgcgtatatg atgcgatgag taaataaaca 1200aatacgcaag gggaacgcat
gaaggttatc gctgtactta accagaaagg cgggtcaggc 1260aagacgacca tcgcaaccca
tctagcccgc gccctgcaac tcgccggggc cgatgttctg 1320ttagtcgatt ccgatcccca
gggcagtgcc cgcgattggg cggccgtgcg ggaagatcaa 1380ccgctaaccg ttgtcggcat
cgaccgcccg acgattgacc gcgacgtgaa ggccatcggc 1440cggcgcgact tcgtagtgat
cgacggagcg ccccaggcgg cggacttggc tgtgtccgcg 1500atcaaggcag ccgacttcgt
gctgattccg gtgcagccaa gcccttacga catatgggcc 1560accgccgacc tggtggagct
ggttaagcag cgcattgagg tcacggatgg aaggctacaa 1620gcggcctttg tcgtgtcgcg
ggcgatcaaa ggcacgcgca tcggcggtga ggttgccgag 1680gcgctggccg ggtacgagct
gcccattctt gagtcccgta tcacgcagcg cgtgagctac 1740ccaggcactg ccgccgccgg
cacaaccgtt cttgaatcag aacccgaggg cgacgctgcc 1800cgcgaggtcc aggcgctggc
cgctgaaatt aaatcaaaac tcatttgagt taatgaggta 1860aagagaaaat gagcaaaagc
acaaacacgc taagtgccgg ccgtccgagc gcacgcagca 1920gcaaggctgc aacgttggcc
agcctggcag acacgccagc catgaagcgg gtcaactttc 1980agttgccggc ggaggatcac
accaagctga agatgtacgc ggtacgccaa ggcaagacca 2040ttaccgagct gctatctgaa
tacatcgcgc agctaccaga gtaaatgagc aaatgaataa 2100atgagtagat gaattttagc
ggctaaagga ggcggcatgg aaaatcaaga acaaccaggc 2160accgacgccg tggaatgccc
catgtgtgga ggaacgggcg gttggccagg cgtaagcggc 2220tgggttgtct gccggccctg
caatggcact ggaaccccca agcccgagga atcggcgtga 2280cggtcgcaaa ccatccggcc
cggtacaaat cggcgcggcg ctgggtgatg acctggtgga 2340gaagttgaag gccgcgcagg
ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg 2400tgaatcgtgg caagcggccg
ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc 2460cggtgcgccg tcgattagga
agccgcccaa gggcgacgag caaccagatt ttttcgttcc 2520gatgctctat gacgtgggca
cccgcgatag tcgcagcatc atggacgtgg ccgttttccg 2580tctgtcgaag cgtgaccgac
gagctggcga ggtgatccgc tacgagcttc cagacgggca 2640cgtagaggtt tccgcagggc
cggccggcat ggccagtgtg tgggattacg acctggtact 2700gatggcggtt tcccatctaa
ccgaatccat gaaccgatac cgggaaggga agggagacaa 2760gcccggccgc gtgttccgtc
cacacgttgc ggacgtactc aagttctgcc ggcgagccga 2820tggcggaaag cagaaagacg
acctggtaga aacctgcatt cggttaaaca ccacgcacgt 2880tgccatgcag cgtacgaaga
aggccaagaa cggccgcctg gtgacggtat ccgagggtga 2940agccttgatt agccgctaca
agatcgtaaa gagcgaaacc gggcggccgg agtacatcga 3000gatcgagcta gctgattgga
tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct 3060gacggttcac cccgattact
ttttgatcga tcccggcatc ggccgttttc tctaccgcct 3120ggcacgccgc gccgcaggca
aggcagaagc cagatggttg ttcaagacga tctacgaacg 3180cagtggcagc gccggagagt
tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc 3240aaatgacctg ccggagtacg
atttgaagga ggaggcgggg caggctggcc cgatcctagt 3300catgcgctac cgcaacctga
tcgagggcga agcatccgcc ggttcctaat gtacggagca 3360gatgctaggg caaattgccc
tagcagggga aaaaggtcga aaaggtctct ttcctgtgga 3420tagcacgtac attgggaacc
caaagccgta cattgggaac cggaacccgt acattgggaa 3480cccaaagccg tacattggga
accggtcaca catgtaagtg actgatataa aagagaaaaa 3540aggcgatttt tccgcctaaa
actctttaaa acttattaaa actcttaaaa cccgcctggc 3600ctgtgcataa ctgtctggcc
agcgcacagc cgaagagctg caaaaagcgc ctacccttcg 3660gtcgctgcgc tccctacgcc
ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc 3720aaaaatggct ggcctacggc
caggcaatct accagggcgc ggacaagccg cgccgtcgcc 3780actcgaccgc cggcgcccac
atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg 3840aaaacctctg acacatgcag
ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg 3900ggagcagaca agcccgtcag
ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca 3960tgacccagtc acgtagcgat
agcggagtgt atactggctt aactatgcgg catcagagca 4020gattgtactg agagtgcacc
atatgcggtg tgaaataccg cacagatgcg taaggagaaa 4080ataccgcatc aggcgctctt
ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4140gctgcggcga gcggtatcag
ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4200ggataacgca ggaaagaaca
tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4260ggccgcgttg ctggcgtttt
tccataggct ccgcccccct gacgagcatc acaaaaatcg 4320acgctcaagt cagaggtggc
gaaacccgac aggactataa agataccagg cgtttccccc 4380tggaagctcc ctcgtgcgct
ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4440ctttctccct tcgggaagcg
tggcgctttc tcatagctca cgctgtaggt atctcagttc 4500ggtgtaggtc gttcgctcca
agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4560ctgcgcctta tccggtaact
atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4620actggcagca gccactggta
acaggattag cagagcgagg tatgtaggcg gtgctacaga 4680gttcttgaag tggtggccta
actacggcta cactagaagg acagtatttg gtatctgcgc 4740tctgctgaag ccagttacct
tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4800caccgctggt agcggtggtt
tttttgtttg caagcagcag attacgcgca gaaaaaaagg 4860atctcaagaa gatcctttga
tcttttctac ggggtctgac gctcagtgga acgaaaactc 4920acgttaaggg attttggtca
tgcattctag gtactaaaac aattcatcca gtaaaatata 4980atattttatt ttctcccaat
caggcttgat ccccagtaag tcaaaaaata gctcgacata 5040ctgttcttcc ccgatatcct
ccctgatcga ccggacgcag aaggcaatgt cataccactt 5100gtccgccctg ccgcttctcc
caagatcaat aaagccactt actttgccat ctttcacaaa 5160gatgttgctg tctcccaggt
cgccgtggga aaagacaagt tcctcttcgg gcttttccgt 5220ctttaaaaaa tcatacagct
cgcgcggatc tttaaatgga gtgtcttctt cccagttttc 5280gcaatccaca tcggccagat
cgttattcag taagtaatcc aattcggcta agcggctgtc 5340taagctattc gtatagggac
aatccgatat gtcgatggag tgaaagagcc tgatgcactc 5400cgcatacagc tcgataatct
tttcagggct ttgttcatct tcatactctt ccgagcaaag 5460gacgccatcg gcctcactca
tgagcagatt gctccagcca tcatgccgtt caaagtgcag 5520gacctttgga acaggcagct
ttccttccag ccatagcatc atgtcctttt cccgttccac 5580atcataggtg gtccctttat
accggctgtc cgtcattttt aaatataggt tttcattttc 5640tcccaccagc ttatatacct
tagcaggaga cattccttcc gtatctttta cgcagcggta 5700tttttcgatc agttttttca
attccggtga tattctcatt ttagccattt attatttcct 5760tcctcttttc tacagtattt
aaagataccc caagaagcta attataacaa gacgaactcc 5820aattcactgt tccttgcatt
ctaaaacctt aaataccaga aaacagcttt ttcaaagttg 5880ttttcaaagt tggcgtataa
catagtatcg acggagccga ttttgaaacc gcggtgatca 5940caggcagcaa cgctctgtca
tcgttacaat caacatgcta ccctccgcga gatcatccgt 6000gtttcaaacc cggcagctta
gttgccgttc ttccgaatag catcggtaac atgagcaaag 6060tctgccgcct tacaacggct
ctcccgctga cgccgtcccg gactgatggg ctgcctgtat 6120cgagtggtga ttttgtgccg
agctgccggt cggggagctg ttggctggct ggtggcagga 6180tatattgtgg tgtaaacaaa
ttgacgctta gacaacttaa taacacattg cggacgtttt 6240taatgtactg aattaacgcc
gaattaattc gggggatctg gattttagta ctggattttg 6300gttttaggaa ttagaaattt
tattgataga agtattttac aaatacaaat acatactaag 6360ggtttcttat atgctcaaca
catgagcgaa accctatagg aaccctaatt cccttatctg 6420ggaactactc acacattatt
atggagaaac tcgagtcaaa tctcggtgac gggcaggacc 6480ggacggggcg gtaccggcag
gctgaagtcc agctgccaga aacccacgtc atgccagttc 6540ccgtgcttga agccggccgc
ccgcagcatg ccgcgggggg catatccgag cgcctcgtgc 6600atgcgcacgc tcgggtcgtt
gggcagcccg atgacagcga ccacgctctt gaagccctgt 6660gcctccaggg acttcagcag
gtgggtgtag agcgtggagc ccagtcccgt ccgctggtgg 6720cggggggaga cgtacacggt
cgactcggcc gtccagtcgt aggcgttgcg tgccttccag 6780gggcccgcgt aggcgatgcc
ggcgacctcg ccgtccacct cggcgacgag ccagggatag 6840cgctcccgca gacggacgag
gtcgtccgtc cactcctgcg gttcctgcgg ctcggtacgg 6900aagttgaccg tgcttgtctc
gatgtagtgg ttgacgatgg tgcagaccgc cggcatgtcc 6960gcctcggtgg cacggcggat
gtcggccggg cgtcgttctg ggctcatggt agactcgaga 7020gagatagatt tgtagagaga
gactggtgat ttcagcgtgt cctctccaaa tgaaatgaac 7080ttccttatat agaggaaggt
cttgcgaagg atagtgggat tgtgcgtcat cccttacgtc 7140agtggagata tcacatcaat
ccacttgctt tgaagacgtg gttggaacgt cttctttttc 7200cacgatgctc ctcgtgggtg
ggggtccatc tttgggacca ctgtcggcag aggcatcttg 7260aacgatagcc tttcctttat
cgcaatgatg gcatttgtag gtgccacctt ccttttctac 7320tgtccttttg atgaagtgac
agatagctgg gcaatggaat ccgaggaggt ttcccgatat 7380taccctttgt tgaaaagtct
caatagccct ttggtcttct gagactgtat ctttgatatt 7440cttggagtag acgagagtgt
cgtgctccac catgttatca catcaatcca cttgctttga 7500agacgtggtt ggaacgtctt
ctttttccac gatgctcctc gtgggtgggg gtccatcttt 7560gggaccactg tcggcagagg
catcttgaac gatagccttt cctttatcgc aatgatggca 7620tttgtaggtg ccaccttcct
tttctactgt ccttttgatg aagtgacaga tagctgggca 7680atggaatccg aggaggtttc
ccgatattac cctttgttga aaagtctcaa tagccctttg 7740gtcttctgag actgtatctt
tgatattctt ggagtagacg agagtgtcgt gctccaccat 7800gttggcaagc tgctctagcc
aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 7860taatgcagct ggcacgacag
gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 7920aatgtgagtt agctcactca
ttaggcaccc caggctttac actttatgct tccggctcgt 7980atgttgtgtg gaattgtgag
cggataacaa tttcacacag gaaacagcta tgaccatgat 8040tacgaattcg agccttgact
agagggtcga cggtatacag acatgataag atacattgat 8100gagtttggac aaaccacaac
tagaatgcag tgaaaaaaat gctttatttg tgaaatttgt 8160gatgctattg ctttatttgt
aaccattata agctgcaata aacaagttgg ggtgggcgaa 8220gaactccagc atgagatccc
cgcgctggag gatcatccag ccggcgtccc ggaaaacgat 8280tccgaagccc aacctttcat
agaaggcggc ggtggaatcg aaatctcgta gcacgtgtca 8340gtcctgctcc tcggccacga
agtgcacgca gttgccggcc gggtcgcgca gggcgaactc 8400ccgcccccac ggctgctcgc
cgatctcggt catggccggc ccggaggcgt cccggaagtt 8460cgtggacacg acctccgacc
actcggcgta cagctcgtcc aggccgcgca cccacaccca 8520ggccagggtg ttgtccggca
ccacctggtc ctggaccgcg ctgatgaaca gggtcacgtc 8580gtcccggacc acaccggcga
agtcgtcctc cacgaagtcc cgggagaacc cgagccggtc 8640ggtccagaac tcgaccgctc
cggcgacgtc gcgcgcggtg agcaccggaa cggcactggt 8700caacttggcc atggatccag
atttcgctca agttagtata aaaaagcagg cttcaatcct 8760gcaggaattc gatcgacact
ctcgtctact ccaagaatat caaagataca gtctcagaag 8820accaaagggc tattgagact
tttcaacaaa gggtaatatc gggaaacctc ctcggattcc 8880attgcccagc tatctgtcac
ttcatcaaaa ggacagtaga aaaggaaggt ggcacctaca 8940aatgccatca ttgcgataaa
ggaaaggcta tcgttcaaga tgcctctgcc gacagtggtc 9000ccaaagatgg acccccaccc
acgaggagca tcgtggaaaa agaagacgtt ccaaccacgt 9060cttcaaagca agtggattga
tgtgataaca tggtggagca cgacactctc gtctactcca 9120agaatatcaa agatacagtc
tcagaagacc aaagggctat tgagactttt caacaaaggg 9180taatatcggg aaacctcctc
ggattccatt gcccagctat ctgtcacttc atcaaaagga 9240cagtagaaaa ggaaggtggc
acctacaaat gccatcattg cgataaagga aaggctatcg 9300ttcaagatgc ctctgccgac
agtggtccca aagatggacc cccacccacg aggagcatcg 9360tggaaaaaga agacgttcca
accacgtctt caaagcaagt ggattgatgt gatatctcca 9420ctgacgtaag ggatgacgca
caatcccact atccttcgca agaccttcct ctatataagg 9480aagttcattt catttggaga
ggacacgctg aaatcaccag tctctctcta caaatctatc 9540tctctcgagc tttcgcagat
ccgggggggc aatgagatat gaaaaagcct gaactcaccg 9600cgacgtctgt cgagaagttt
ctgatcgaaa agttcgacag cgtctccgac ctgatgcagc 9660tctcggaggg cgaagaatct
cgtgctttca gcttcgatgt aggagggcgt ggatatgtcc 9720tgcgggtaaa tagctgcgcc
gatggtttct acaaagatcg ttatgtttat cggcactttg 9780catcggccgc gctcccgatt
ccggaagtgc ttgacattgg ggagtttagc gagagcctga 9840cctattgcat ctcccgccgt
gcacagggtg tcacgttgca agacctgcct gaaaccgaac 9900tgcccgctgt tctacaaccg
gtcgcggagg ctatggatgc gatcgctgcg gccgatctta 9960gccagacgag cgggttcggc
ccattcggac cgcaaggaat cggtcaatac actacatggc 10020gtgatttcat atgcgcgatt
gctgatcccc atgtgtatca ctggcaaact gtgatggacg 10080acaccgtcag tgcgtccgtc
gcgcaggctc tcgatgagct gatgctttgg gccgaggact 10140gccccgaagt ccggcacctc
gtgcacgcgg atttcggctc caacaatgtc ctgacggaca 10200atggccgcat aacagcggtc
attgactgga gcgaggcgat gttcggggat tcccaatacg 10260aggtcgccaa catcttcttc
tggaggccgt ggttggcttg tatggagcag cagacgcgct 10320acttcgagcg gaggcatccg
gagcttgcag gatcgccacg actccgggcg tatatgctcc 10380gcattggtct tgaccaactc
tatcagagct tggttgacgg caatttcgat gatgcagctt 10440gggcgcaggg tcgatgcgac
gcaatcgtcc gatccggagc cgggactgtc gggcgtacac 10500aaatcgcccg cagaagcgcg
gccgtctgga ccgatggctg tgtagaagta ctcgccgata 10560gtggaaaccg acgccccagc
actcgtccga gggcaaagaa atagagtaga tgccgaccgg 10620atctgtcgat cgacaagctc
gagtttctcc ataataatgt gtgagtagtt cccagataag 10680ggaattaggg ttcctatagg
gtttcgctca tgtgttgagc atataagaaa cccttagtat 10740gtatttgtat ttgtaaaata
cttctatcaa taaaatttct aattcctaaa accaaaatcc 10800agtactaaaa tccagatccc
ccgaattaat tcggcgttaa ttcagatcaa gcttggcact 10860ggccgtcgtt ttacaacgtc
gtgactggga aaaccctggc gttacccaac ttaatcgcct 10920tgcagcacat ccccctttcg
ccagctggcg taatagcgaa gaggcccgca ccgatcgccc 10980ttcccaacag ttgcgcagcc
tgaatggcga atgctagagc agcttgagct tggatcagat 11040tgtcgtttcc cgccttcagt
ttaaactatc agtgtttgac aggatatatt ggcgggtaaa 11100cctaagagaa aagagcgttt
attagaataa cggatattta aaagggcgtg aaaaggttta 11160tccgttcgtc catttgtatg
tg 11182
90 8428 DNA Artificial Sequence pCambia3300 Plasmid 90
catgccaacc acagggttcc cctcgggatc aaagtacttt
gatccaaccc ctccgctgct 60atagtgcagt cggcttctga cgttcagtgc agccgtcttc
tgaaaacgac atgtcgcaca 120agtcctaagt tacgcgacag gctgccgccc tgcccttttc
ctggcgtttt cttgtcgcgt 180gttttagtcg cataaagtag aatacttgcg actagaaccg
gagacattac gccatgaaca 240agagcgccgc cgctggcctg ctgggctatg cccgcgtcag
caccgacgac caggacttga 300ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa
gctgttttcc gagaagatca 360ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct
tgaccaccta cgccctggcg 420acgttgtgac agtgaccagg ctagaccgcc tggcccgcag
cacccgcgac ctactggaca 480ttgccgagcg catccaggag gccggcgcgg gcctgcgtag
cctggcagag ccgtgggccg 540acaccaccac gccggccggc cgcatggtgt tgaccgtgtt
cgccggcatt gccgagttcg 600agcgttccct aatcatcgac cgcacccgga gcgggcgcga
ggccgccaag gcccgaggcg 660tgaagtttgg cccccgccct accctcaccc cggcacagat
cgcgcacgcc cgcgagctga 720tcgaccagga aggccgcacc gtgaaagagg cggctgcact
gcttggcgtg catcgctcga 780ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc
caccgaggcc aggcggcgcg 840gtgccttccg tgaggacgca ttgaccgagg ccgacgccct
ggcggccgcc gagaatgaac 900gccaagagga acaagcatga aaccgcacca ggacggccag
gacgaaccgt ttttcattac 960cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg
ttcgagccgc ccgcgcacgt 1020ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct
gatgccaagc tggcggcctg 1080gccggccagc ttggccgctg aagaaaccga gcgccgccgt
ctaaaaaggt gatgtgtatt 1140tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg
atgcgatgag taaataaaca 1200aatacgcaag gggaacgcat gaaggttatc gctgtactta
accagaaagg cgggtcaggc 1260aagacgacca tcgcaaccca tctagcccgc gccctgcaac
tcgccggggc cgatgttctg 1320ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg
cggccgtgcg ggaagatcaa 1380ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc
gcgacgtgaa ggccatcggc 1440cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg
cggacttggc tgtgtccgcg 1500atcaaggcag ccgacttcgt gctgattccg gtgcagccaa
gcccttacga catatgggcc 1560accgccgacc tggtggagct ggttaagcag cgcattgagg
tcacggatgg aaggctacaa 1620gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca
tcggcggtga ggttgccgag 1680gcgctggccg ggtacgagct gcccattctt gagtcccgta
tcacgcagcg cgtgagctac 1740ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag
aacccgaggg cgacgctgcc 1800cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac
tcatttgagt taatgaggta 1860aagagaaaat gagcaaaagc acaaacacgc taagtgccgg
ccgtccgagc gcacgcagca 1920gcaaggctgc aacgttggcc agcctggcag acacgccagc
catgaagcgg gtcaactttc 1980agttgccggc ggaggatcac accaagctga agatgtacgc
ggtacgccaa ggcaagacca 2040ttaccgagct gctatctgaa tacatcgcgc agctaccaga
gtaaatgagc aaatgaataa 2100atgagtagat gaattttagc ggctaaagga ggcggcatgg
aaaatcaaga acaaccaggc 2160accgacgccg tggaatgccc catgtgtgga ggaacgggcg
gttggccagg cgtaagcggc 2220tgggttgtct gccggccctg caatggcact ggaaccccca
agcccgagga atcggcgtga 2280cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg
ctgggtgatg acctggtgga 2340gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc
gaggcagaag cacgccccgg 2400tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa
tcccggcaac cgccggcagc 2460cggtgcgccg tcgattagga agccgcccaa gggcgacgag
caaccagatt ttttcgttcc 2520gatgctctat gacgtgggca cccgcgatag tcgcagcatc
atggacgtgg ccgttttccg 2580tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc
tacgagcttc cagacgggca 2640cgtagaggtt tccgcagggc cggccggcat ggccagtgtg
tgggattacg acctggtact 2700gatggcggtt tcccatctaa ccgaatccat gaaccgatac
cgggaaggga agggagacaa 2760gcccggccgc gtgttccgtc cacacgttgc ggacgtactc
aagttctgcc ggcgagccga 2820tggcggaaag cagaaagacg acctggtaga aacctgcatt
cggttaaaca ccacgcacgt 2880tgccatgcag cgtacgaaga aggccaagaa cggccgcctg
gtgacggtat ccgagggtga 2940agccttgatt agccgctaca agatcgtaaa gagcgaaacc
gggcggccgg agtacatcga 3000gatcgagcta gctgattgga tgtaccgcga gatcacagaa
ggcaagaacc cggacgtgct 3060gacggttcac cccgattact ttttgatcga tcccggcatc
ggccgttttc tctaccgcct 3120ggcacgccgc gccgcaggca aggcagaagc cagatggttg
ttcaagacga tctacgaacg 3180cagtggcagc gccggagagt tcaagaagtt ctgtttcacc
gtgcgcaagc tgatcgggtc 3240aaatgacctg ccggagtacg atttgaagga ggaggcgggg
caggctggcc cgatcctagt 3300catgcgctac cgcaacctga tcgagggcga agcatccgcc
ggttcctaat gtacggagca 3360gatgctaggg caaattgccc tagcagggga aaaaggtcga
aaaggtctct ttcctgtgga 3420tagcacgtac attgggaacc caaagccgta cattgggaac
cggaacccgt acattgggaa 3480cccaaagccg tacattggga accggtcaca catgtaagtg
actgatataa aagagaaaaa 3540aggcgatttt tccgcctaaa actctttaaa acttattaaa
actcttaaaa cccgcctggc 3600ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg
caaaaagcgc ctacccttcg 3660gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct
atcgcggccg ctggccgctc 3720aaaaatggct ggcctacggc caggcaatct accagggcgc
ggacaagccg cgccgtcgcc 3780actcgaccgc cggcgcccac atcaaggcac cctgcctcgc
gcgtttcggt gatgacggtg 3840aaaacctctg acacatgcag ctcccggaga cggtcacagc
ttgtctgtaa gcggatgccg 3900ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg
cgggtgtcgg ggcgcagcca 3960tgacccagtc acgtagcgat agcggagtgt atactggctt
aactatgcgg catcagagca 4020gattgtactg agagtgcacc atatgcggtg tgaaataccg
cacagatgcg taaggagaaa 4080ataccgcatc aggcgctctt ccgcttcctc gctcactgac
tcgctgcgct cggtcgttcg 4140gctgcggcga gcggtatcag ctcactcaaa ggcggtaata
cggttatcca cagaatcagg 4200ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa
aaggccagga accgtaaaaa 4260ggccgcgttg ctggcgtttt tccataggct ccgcccccct
gacgagcatc acaaaaatcg 4320acgctcaagt cagaggtggc gaaacccgac aggactataa
agataccagg cgtttccccc 4380tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg
cttaccggat acctgtccgc 4440ctttctccct tcgggaagcg tggcgctttc tcatagctca
cgctgtaggt atctcagttc 4500ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa
ccccccgttc agcccgaccg 4560ctgcgcctta tccggtaact atcgtcttga gtccaacccg
gtaagacacg acttatcgcc 4620actggcagca gccactggta acaggattag cagagcgagg
tatgtaggcg gtgctacaga 4680gttcttgaag tggtggccta actacggcta cactagaagg
acagtatttg gtatctgcgc 4740tctgctgaag ccagttacct tcggaaaaag agttggtagc
tcttgatccg gcaaacaaac 4800caccgctggt agcggtggtt tttttgtttg caagcagcag
attacgcgca gaaaaaaagg 4860atctcaagaa gatcctttga tcttttctac ggggtctgac
gctcagtgga acgaaaactc 4920acgttaaggg attttggtca tgcattctag gtactaaaac
aattcatcca gtaaaatata 4980atattttatt ttctcccaat caggcttgat ccccagtaag
tcaaaaaata gctcgacata 5040ctgttcttcc ccgatatcct ccctgatcga ccggacgcag
aaggcaatgt cataccactt 5100gtccgccctg ccgcttctcc caagatcaat aaagccactt
actttgccat ctttcacaaa 5160gatgttgctg tctcccaggt cgccgtggga aaagacaagt
tcctcttcgg gcttttccgt 5220ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga
gtgtcttctt cccagttttc 5280gcaatccaca tcggccagat cgttattcag taagtaatcc
aattcggcta agcggctgtc 5340taagctattc gtatagggac aatccgatat gtcgatggag
tgaaagagcc tgatgcactc 5400cgcatacagc tcgataatct tttcagggct ttgttcatct
tcatactctt ccgagcaaag 5460gacgccatcg gcctcactca tgagcagatt gctccagcca
tcatgccgtt caaagtgcag 5520gacctttgga acaggcagct ttccttccag ccatagcatc
atgtcctttt cccgttccac 5580atcataggtg gtccctttat accggctgtc cgtcattttt
aaatataggt tttcattttc 5640tcccaccagc ttatatacct tagcaggaga cattccttcc
gtatctttta cgcagcggta 5700tttttcgatc agttttttca attccggtga tattctcatt
ttagccattt attatttcct 5760tcctcttttc tacagtattt aaagataccc caagaagcta
attataacaa gacgaactcc 5820aattcactgt tccttgcatt ctaaaacctt aaataccaga
aaacagcttt ttcaaagttg 5880ttttcaaagt tggcgtataa catagtatcg acggagccga
ttttgaaacc gcggtgatca 5940caggcagcaa cgctctgtca tcgttacaat caacatgcta
ccctccgcga gatcatccgt 6000gtttcaaacc cggcagctta gttgccgttc ttccgaatag
catcggtaac atgagcaaag 6060tctgccgcct tacaacggct ctcccgctga cgccgtcccg
gactgatggg ctgcctgtat 6120cgagtggtga ttttgtgccg agctgccggt cggggagctg
ttggctggct ggtggcagga 6180tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa
taacacattg cggacgtttt 6240taatgtactg aattaacgcc gaattaattc gggggatctg
gattttagta ctggattttg 6300gttttaggaa ttagaaattt tattgataga agtattttac
aaatacaaat acatactaag 6360ggtttcttat atgctcaaca catgagcgaa accctatagg
aaccctaatt cccttatctg 6420ggaactactc acacattatt atggagaaac tcgagtcaaa
tctcggtgac gggcaggacc 6480ggacggggcg gtaccggcag gctgaagtcc agctgccaga
aacccacgtc atgccagttc 6540ccgtgcttga agccggccgc ccgcagcatg ccgcgggggg
catatccgag cgcctcgtgc 6600atgcgcacgc tcgggtcgtt gggcagcccg atgacagcga
ccacgctctt gaagccctgt 6660gcctccaggg acttcagcag gtgggtgtag agcgtggagc
ccagtcccgt ccgctggtgg 6720cggggggaga cgtacacggt cgactcggcc gtccagtcgt
aggcgttgcg tgccttccag 6780gggcccgcgt aggcgatgcc ggcgacctcg ccgtccacct
cggcgacgag ccagggatag 6840cgctcccgca gacggacgag gtcgtccgtc cactcctgcg
gttcctgcgg ctcggtacgg 6900aagttgaccg tgcttgtctc gatgtagtgg ttgacgatgg
tgcagaccgc cggcatgtcc 6960gcctcggtgg cacggcggat gtcggccggg cgtcgttctg
ggctcatggt agactcgaga 7020gagatagatt tgtagagaga gactggtgat ttcagcgtgt
cctctccaaa tgaaatgaac 7080ttccttatat agaggaaggt cttgcgaagg atagtgggat
tgtgcgtcat cccttacgtc 7140agtggagata tcacatcaat ccacttgctt tgaagacgtg
gttggaacgt cttctttttc 7200cacgatgctc ctcgtgggtg ggggtccatc tttgggacca
ctgtcggcag aggcatcttg 7260aacgatagcc tttcctttat cgcaatgatg gcatttgtag
gtgccacctt ccttttctac 7320tgtccttttg atgaagtgac agatagctgg gcaatggaat
ccgaggaggt ttcccgatat 7380taccctttgt tgaaaagtct caatagccct ttggtcttct
gagactgtat ctttgatatt 7440cttggagtag acgagagtgt cgtgctccac catgttatca
catcaatcca cttgctttga 7500agacgtggtt ggaacgtctt ctttttccac gatgctcctc
gtgggtgggg gtccatcttt 7560gggaccactg tcggcagagg catcttgaac gatagccttt
cctttatcgc aatgatggca 7620tttgtaggtg ccaccttcct tttctactgt ccttttgatg
aagtgacaga tagctgggca 7680atggaatccg aggaggtttc ccgatattac cctttgttga
aaagtctcaa tagccctttg 7740gtcttctgag actgtatctt tgatattctt ggagtagacg
agagtgtcgt gctccaccat 7800gttggcaagc tgctctagcc aatacgcaaa ccgcctctcc
ccgcgcgttg gccgattcat 7860taatgcagct ggcacgacag gtttcccgac tggaaagcgg
gcagtgagcg caacgcaatt 7920aatgtgagtt agctcactca ttaggcaccc caggctttac
actttatgct tccggctcgt 7980atgttgtgtg gaattgtgag cggataacaa tttcacacag
gaaacagcta tgaccatgat 8040tacgaattcg agctcggtac ccggggatcc tctagagtcg
acctgcaggc atgcaagctt 8100ggcactggcc gtcgttttac aacgtcgtga ctgggaaaac
cctggcgtta cccaacttaa 8160tcgccttgca gcacatcccc ctttcgccag ctggcgtaat
agcgaagagg cccgcaccga 8220tcgcccttcc caacagttgc gcagcctgaa tggcgaatgc
tagagcagct tgagcttgga 8280tcagattgtc gtttcccgcc ttcagtttaa actatcagtg
tttgacagga tatattggcg 8340ggtaaaccta agagaaaaga gcgtttatta gaataacgga
tatttaaaag ggcgtgaaaa 8400ggtttatccg ttcgtccatt tgtatgtg
8428
91 3438 DNA Artificial Sequence pLIT38attBZeo Plasmid 91
tcgaccctct agtcaaggcc ttaagtgagt cgtattacgg actggccgtc
gttttacaac 60gtcgtgactg ggaaaaccct ggcgttaccc aacttaatcg ccttgcagca
catccccctt 120tcgccagctg gcgtaatagc gaagaggccc gcaccgatcg cccttcccaa
cagttgcgca 180gcctgaatgg cgaatggcgc ttcgcttggt aataaagccc gcttcggcgg
gctttttttt 240gttaactacg tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta
tttgtttatt 300tttctaaata cattcaaata tgtatccgct catgagacaa taaccctgat
aaatgcttca 360ataatattga aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc
ttattccctt 420ttttgcggca ttttgccttc ctgtttttgc tcacccagaa acgctggtga
aagtaaaaga 480tgctgaagat cagttgggtg cacgagtggg ttacatcgaa ctggatctca
acagcggtaa 540gatccttgag agttttcgcc ccgaagaacg ttctccaatg atgagcactt
ttaaagttct 600gctatgtggc gcggtattat cccgtgttga cgccgggcaa gagcaactcg
gtcgccgcat 660acactattct cagaatgact tggttgagta ctcaccagtc acagaaaagc
atcttacgga 720tggcatgaca gtaagagaat tatgcagtgc tgccataacc atgagtgata
acactgcggc 780caacttactt ctgacaacga tcggaggacc gaaggagcta accgcttttt
tgcacaacat 840gggggatcat gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag
ccataccaaa 900cgacgagcgt gacaccacga tgcctgtagc aatggcaaca acgttgcgca
aactattaac 960tggcgaacta cttactctag cttcccggca acaattaata gactggatgg
aggcggataa 1020agttgcagga ccacttctgc gctcggccct tccggctggc tggtttattg
ctgataaatc 1080tggagccggt gagcgtgggt ctcgcggtat cattgcagca ctggggccag
atggtaagcc 1140ctcccgtatc gtagttatct acacgacggg gagtcaggca actatggatg
aacgaaatag 1200acagatcgct gagataggtg cctcactgat taagcattgg taactgtcag
accaagttta 1260ctcatatata ctttagattg atttaccccg gttgataatc agaaaagccc
caaaaacagg 1320aagattgtat aagcaaatat ttaaattgta aacgttaata ttttgttaaa
attcgcgtta 1380aatttttgtt aaatcagctc attttttaac caataggccg aaatcggcaa
aatcccttat 1440aaatcaaaag aatagcccga gatagggttg agtgttgttc cagtttggaa
caagagtcca 1500ctattaaaga acgtggactc caacgtcaaa gggcgaaaaa ccgtctatca
gggcgatggc 1560ccactacgtg aaccatcacc caaatcaagt tttttggggt cgaggtgccg
taaagcacta 1620aatcggaacc ctaaagggag cccccgattt agagcttgac ggggaaagcg
aacgtggcga 1680gaaaggaagg gaagaaagcg aaaggagcgg gcgctagggc gctggcaagt
gtagcggtca 1740cgctgcgcgt aaccaccaca cccgccgcgc ttaatgcgcc gctacagggc
gcgtaaaagg 1800atctaggtga agatcctttt tgataatctc atgaccaaaa tcccttaacg
tgagttttcg 1860ttccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga
tccttttttt 1920ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt
ggtttgtttg 1980ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag
agcgcagata 2040ccaaatactg ttcttctagt gtagccgtag ttaggccacc acttcaagaa
ctctgtagca 2100ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag
tggcgataag 2160tcgtgtctta ccgggttgga ctcaagacga tagttaccgg ataaggcgca
gcggtcgggc 2220tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac
cgaactgaga 2280tacctacagc gtgagctatg agaaagcgcc acgcttcccg aagggagaaa
ggcggacagg 2340tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc
agggggaaac 2400gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg
tcgatttttg 2460tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc
ctttttacgg 2520ttcctggcct tttgctggcc ttttgctcac atgtaatgtg agttagctca
ctcattaggc 2580accccaggct ttacacttta tgcttccggc tcgtatgttg tgtggaattg
tgagcggata 2640acaatttcac acaggaaaca gctatgacca tgattacgcc aagctacgta
atacgactca 2700ctagtggggc ccgtgcaatt gaagccggct ggcgccaagc ttctctgcag
gattgaagcc 2760tgctttttta tactaacttg agcgaaatct ggatccatgg ccaagttgac
cagtgccgtt 2820ccggtgctca ccgcgcgcga cgtcgccgga gcggtcgagt tctggaccga
ccggctcggg 2880ttctcccggg acttcgtgga ggacgacttc gccggtgtgg tccgggacga
cgtgaccctg 2940ttcatcagcg cggtccagga ccaggtggtg ccggacaaca ccctggcctg
ggtgtgggtg 3000cgcggcctgg acgagctgta cgccgagtgg tcggaggtcg tgtccacgaa
cttccgggac 3060gcctccgggc cggccatgac cgagatcggc gagcagccgt gggggcggga
gttcgccctg 3120cgcgacccgg ccggcaactg cgtgcacttc gtggccgagg agcaggactg
acacgtgcta 3180cgagatttcg attccaccgc cgccttctat gaaaggttgg gcttcggaat
cgttttccgg 3240gacgccggct ggatgatcct ccagcgcggg gatctcatgc tggagttctt
cgcccacccc 3300aacttgttta ttgcagctta taatggttac aaataaagca atagcatcac
aaatttcaca 3360aataaagcat ttttttcact gcattctagt tgtggtttgt ccaaactcat
caatgtatct 3420tatcatgtct gtataccg
3438
92 10549 DNA Artificial Sequence pCambia1302 Plasmid 92
catggtagat ctgactagta aaggagaaga acttttcact ggagttgtcc caattcttgt
60tgaattagat ggtgatgtta atgggcacaa attttctgtc agtggagagg gtgaaggtga
120tgcaacatac ggaaaactta cccttaaatt tatttgcact actggaaaac tacctgttcc
180gtggccaaca cttgtcacta ctttctctta tggtgttcaa tgcttttcaa gatacccaga
240tcatatgaag cggcacgact tcttcaagag cgccatgcct gagggatacg tgcaggagag
300gaccatcttc ttcaaggacg acgggaacta caagacacgt gctgaagtca agtttgaggg
360agacaccctc gtcaacagga tcgagcttaa gggaatcgat ttcaaggagg acggaaacat
420cctcggccac aagttggaat acaactacaa ctcccacaac gtatacatca tggccgacaa
480gcaaaagaac ggcatcaaag ccaacttcaa gacccgccac aacatcgaag acggcggcgt
540gcaactcgct gatcattatc aacaaaatac tccaattggc gatggccctg tccttttacc
600agacaaccat tacctgtcca cacaatctgc cctttcgaaa gatcccaacg aaaagagaga
660ccacatggtc cttcttgagt ttgtaacagc tgctgggatt acacatggca tggatgaact
720atacaaagct agccaccacc accaccacca cgtgtgaatt ggtgaccagc tcgaatttcc
780ccgatcgttc aaacatttgg caataaagtt tcttaagatt gaatcctgtt gccggtcttg
840cgatgattat catataattt ctgttgaatt acgttaagca tgtaataatt aacatgtaat
900gcatgacgtt atttatgaga tgggttttta tgattagagt cccgcaatta tacatttaat
960acgcgataga aaacaaaata tagcgcgcaa actaggataa attatcgcgc gcggtgtcat
1020ctatgttact agatcgggaa ttaaactatc agtgtttgac aggatatatt ggcgggtaaa
1080cctaagagaa aagagcgttt attagaataa cggatattta aaagggcgtg aaaaggttta
1140tccgttcgtc catttgtatg tgcatgccaa ccacagggtt cccctcggga tcaaagtact
1200ttgatccaac ccctccgctg ctatagtgca gtcggcttct gacgttcagt gcagccgtct
1260tctgaaaacg acatgtcgca caagtcctaa gttacgcgac aggctgccgc cctgcccttt
1320tcctggcgtt ttcttgtcgc gtgttttagt cgcataaagt agaatacttg cgactagaac
1380cggagacatt acgccatgaa caagagcgcc gccgctggcc tgctgggcta tgcccgcgtc
1440agcaccgacg accaggactt gaccaaccaa cgggccgaac tgcacgcggc cggctgcacc
1500aagctgtttt ccgagaagat caccggcacc aggcgcgacc gcccggagct ggccaggatg
1560cttgaccacc tacgccctgg cgacgttgtg acagtgacca ggctagaccg cctggcccgc
1620agcacccgcg acctactgga cattgccgag cgcatccagg aggccggcgc gggcctgcgt
1680agcctggcag agccgtgggc cgacaccacc acgccggccg gccgcatggt gttgaccgtg
1740ttcgccggca ttgccgagtt cgagcgttcc ctaatcatcg accgcacccg gagcgggcgc
1800gaggccgcca aggcccgagg cgtgaagttt ggcccccgcc ctaccctcac cccggcacag
1860atcgcgcacg cccgcgagct gatcgaccag gaaggccgca ccgtgaaaga ggcggctgca
1920ctgcttggcg tgcatcgctc gaccctgtac cgcgcacttg agcgcagcga ggaagtgacg
1980cccaccgagg ccaggcggcg cggtgccttc cgtgaggacg cattgaccga ggccgacgcc
2040ctggcggccg ccgagaatga acgccaagag gaacaagcat gaaaccgcac caggacggcc
2100aggacgaacc gtttttcatt accgaagaga tcgaggcgga gatgatcgcg gccgggtacg
2160tgttcgagcc gcccgcgcac gtctcaaccg tgcggctgca tgaaatcctg gccggtttgt
2220ctgatgccaa gctggcggcc tggccggcca gcttggccgc tgaagaaacc gagcgccgcc
2280gtctaaaaag gtgatgtgta tttgagtaaa acagcttgcg tcatgcggtc gctgcgtata
2340tgatgcgatg agtaaataaa caaatacgca aggggaacgc atgaaggtta tcgctgtact
2400taaccagaaa ggcgggtcag gcaagacgac catcgcaacc catctagccc gcgccctgca
2460actcgccggg gccgatgttc tgttagtcga ttccgatccc cagggcagtg cccgcgattg
2520ggcggccgtg cgggaagatc aaccgctaac cgttgtcggc atcgaccgcc cgacgattga
2580ccgcgacgtg aaggccatcg gccggcgcga cttcgtagtg atcgacggag cgccccaggc
2640ggcggacttg gctgtgtccg cgatcaaggc agccgacttc gtgctgattc cggtgcagcc
2700aagcccttac gacatatggg ccaccgccga cctggtggag ctggttaagc agcgcattga
2760ggtcacggat ggaaggctac aagcggcctt tgtcgtgtcg cgggcgatca aaggcacgcg
2820catcggcggt gaggttgccg aggcgctggc cgggtacgag ctgcccattc ttgagtcccg
2880tatcacgcag cgcgtgagct acccaggcac tgccgccgcc ggcacaaccg ttcttgaatc
2940agaacccgag ggcgacgctg cccgcgaggt ccaggcgctg gccgctgaaa ttaaatcaaa
3000actcatttga gttaatgagg taaagagaaa atgagcaaaa gcacaaacac gctaagtgcc
3060ggccgtccga gcgcacgcag cagcaaggct gcaacgttgg ccagcctggc agacacgcca
3120gccatgaagc gggtcaactt tcagttgccg gcggaggatc acaccaagct gaagatgtac
3180gcggtacgcc aaggcaagac cattaccgag ctgctatctg aatacatcgc gcagctacca
3240gagtaaatga gcaaatgaat aaatgagtag atgaatttta gcggctaaag gaggcggcat
3300ggaaaatcaa gaacaaccag gcaccgacgc cgtggaatgc cccatgtgtg gaggaacggg
3360cggttggcca ggcgtaagcg gctgggttgt ctgccggccc tgcaatggca ctggaacccc
3420caagcccgag gaatcggcgt gacggtcgca aaccatccgg cccggtacaa atcggcgcgg
3480cgctgggtga tgacctggtg gagaagttga aggccgcgca ggccgcccag cggcaacgca
3540tcgaggcaga agcacgcccc ggtgaatcgt ggcaagcggc cgctgatcga atccgcaaag
3600aatcccggca accgccggca gccggtgcgc cgtcgattag gaagccgccc aagggcgacg
3660agcaaccaga ttttttcgtt ccgatgctct atgacgtggg cacccgcgat agtcgcagca
3720tcatggacgt ggccgttttc cgtctgtcga agcgtgaccg acgagctggc gaggtgatcc
3780gctacgagct tccagacggg cacgtagagg tttccgcagg gccggccggc atggccagtg
3840tgtgggatta cgacctggta ctgatggcgg tttcccatct aaccgaatcc atgaaccgat
3900accgggaagg gaagggagac aagcccggcc gcgtgttccg tccacacgtt gcggacgtac
3960tcaagttctg ccggcgagcc gatggcggaa agcagaaaga cgacctggta gaaacctgca
4020ttcggttaaa caccacgcac gttgccatgc agcgtacgaa gaaggccaag aacggccgcc
4080tggtgacggt atccgagggt gaagccttga ttagccgcta caagatcgta aagagcgaaa
4140ccgggcggcc ggagtacatc gagatcgagc tagctgattg gatgtaccgc gagatcacag
4200aaggcaagaa cccggacgtg ctgacggttc accccgatta ctttttgatc gatcccggca
4260tcggccgttt tctctaccgc ctggcacgcc gcgccgcagg caaggcagaa gccagatggt
4320tgttcaagac gatctacgaa cgcagtggca gcgccggaga gttcaagaag ttctgtttca
4380ccgtgcgcaa gctgatcggg tcaaatgacc tgccggagta cgatttgaag gaggaggcgg
4440ggcaggctgg cccgatccta gtcatgcgct accgcaacct gatcgagggc gaagcatccg
4500ccggttccta atgtacggag cagatgctag ggcaaattgc cctagcaggg gaaaaaggtc
4560gaaaaggtct ctttcctgtg gatagcacgt acattgggaa cccaaagccg tacattggga
4620accggaaccc gtacattggg aacccaaagc cgtacattgg gaaccggtca cacatgtaag
4680tgactgatat aaaagagaaa aaaggcgatt tttccgccta aaactcttta aaacttatta
4740aaactcttaa aacccgcctg gcctgtgcat aactgtctgg ccagcgcaca gccgaagagc
4800tgcaaaaagc gcctaccctt cggtcgctgc gctccctacg ccccgccgct tcgcgtcggc
4860ctatcgcggc cgctggccgc tcaaaaatgg ctggcctacg gccaggcaat ctaccagggc
4920gcggacaagc cgcgccgtcg ccactcgacc gccggcgccc acatcaaggc accctgcctc
4980gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc agctcccgga gacggtcaca
5040gcttgtctgt aagcggatgc cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt
5100ggcgggtgtc ggggcgcagc catgacccag tcacgtagcg atagcggagt gtatactggc
5160ttaactatgc ggcatcagag cagattgtac tgagagtgca ccatatgcgg tgtgaaatac
5220cgcacagatg cgtaaggaga aaataccgca tcaggcgctc ttccgcttcc tcgctcactg
5280actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa
5340tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc
5400aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc
5460ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat
5520aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc
5580cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct
5640cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg
5700aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc
5760cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga
5820ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa
5880ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta
5940gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc
6000agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg
6060acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgcattct aggtactaaa
6120acaattcatc cagtaaaata taatatttta ttttctccca atcaggcttg atccccagta
6180agtcaaaaaa tagctcgaca tactgttctt ccccgatatc ctccctgatc gaccggacgc
6240agaaggcaat gtcataccac ttgtccgccc tgccgcttct cccaagatca ataaagccac
6300ttactttgcc atctttcaca aagatgttgc tgtctcccag gtcgccgtgg gaaaagacaa
6360gttcctcttc gggcttttcc gtctttaaaa aatcatacag ctcgcgcgga tctttaaatg
6420gagtgtcttc ttcccagttt tcgcaatcca catcggccag atcgttattc agtaagtaat
6480ccaattcggc taagcggctg tctaagctat tcgtataggg acaatccgat atgtcgatgg
6540agtgaaagag cctgatgcac tccgcataca gctcgataat cttttcaggg ctttgttcat
6600cttcatactc ttccgagcaa aggacgccat cggcctcact catgagcaga ttgctccagc
6660catcatgccg ttcaaagtgc aggacctttg gaacaggcag ctttccttcc agccatagca
6720tcatgtcctt ttcccgttcc acatcatagg tggtcccttt ataccggctg tccgtcattt
6780ttaaatatag gttttcattt tctcccacca gcttatatac cttagcagga gacattcctt
6840ccgtatcttt tacgcagcgg tatttttcga tcagtttttt caattccggt gatattctca
6900ttttagccat ttattatttc cttcctcttt tctacagtat ttaaagatac cccaagaagc
6960taattataac aagacgaact ccaattcact gttccttgca ttctaaaacc ttaaatacca
7020gaaaacagct ttttcaaagt tgttttcaaa gttggcgtat aacatagtat cgacggagcc
7080gattttgaaa ccgcggtgat cacaggcagc aacgctctgt catcgttaca atcaacatgc
7140taccctccgc gagatcatcc gtgtttcaaa cccggcagct tagttgccgt tcttccgaat
7200agcatcggta acatgagcaa agtctgccgc cttacaacgg ctctcccgct gacgccgtcc
7260cggactgatg ggctgcctgt atcgagtggt gattttgtgc cgagctgccg gtcggggagc
7320tgttggctgg ctggtggcag gatatattgt ggtgtaaaca aattgacgct tagacaactt
7380aataacacat tgcggacgtt tttaatgtac tgaattaacg ccgaattaat tcgggggatc
7440tggattttag tactggattt tggttttagg aattagaaat tttattgata gaagtatttt
7500acaaatacaa atacatacta agggtttctt atatgctcaa cacatgagcg aaaccctata
7560ggaaccctaa ttcccttatc tgggaactac tcacacatta ttatggagaa actcgagctt
7620gtcgatcgac agatccggtc ggcatctact ctatttcttt gccctcggac gagtgctggg
7680gcgtcggttt ccactatcgg cgagtacttc tacacagcca tcggtccaga cggccgcgct
7740tctgcgggcg atttgtgtac gcccgacagt cccggctccg gatcggacga ttgcgtcgca
7800tcgaccctgc gcccaagctg catcatcgaa attgccgtca accaagctct gatagagttg
7860gtcaagacca atgcggagca tatacgcccg gagtcgtggc gatcctgcaa gctccggatg
7920cctccgctcg aagtagcgcg tctgctgctc catacaagcc aaccacggcc tccagaagaa
7980gatgttggcg acctcgtatt gggaatcccc gaacatcgcc tcgctccagt caatgaccgc
8040tgttatgcgg ccattgtccg tcaggacatt gttggagccg aaatccgcgt gcacgaggtg
8100ccggacttcg gggcagtcct cggcccaaag catcagctca tcgagagcct gcgcgacgga
8160cgcactgacg gtgtcgtcca tcacagtttg ccagtgatac acatggggat cagcaatcgc
8220gcatatgaaa tcacgccatg tagtgtattg accgattcct tgcggtccga atgggccgaa
8280cccgctcgtc tggctaagat cggccgcagc gatcgcatcc atagcctccg cgaccggttg
8340tagaacagcg ggcagttcgg tttcaggcag gtcttgcaac gtgacaccct gtgcacggcg
8400ggagatgcaa taggtcaggc tctcgctaaa ctccccaatg tcaagcactt ccggaatcgg
8460gagcgcggcc gatgcaaagt gccgataaac ataacgatct ttgtagaaac catcggcgca
8520gctatttacc cgcaggacat atccacgccc tcctacatcg aagctgaaag cacgagattc
8580ttcgccctcc gagagctgca tcaggtcgga gacgctgtcg aacttttcga tcagaaactt
8640ctcgacagac gtcgcggtga gttcaggctt tttcatatct cattgccccc cgggatctgc
8700gaaagctcga gagagataga tttgtagaga gagactggtg atttcagcgt gtcctctcca
8760aatgaaatga acttccttat atagaggaag gtcttgcgaa ggatagtggg attgtgcgtc
8820atcccttacg tcagtggaga tatcacatca atccacttgc tttgaagacg tggttggaac
8880gtcttctttt tccacgatgc tcctcgtggg tgggggtcca tctttgggac cactgtcggc
8940agaggcatct tgaacgatag cctttccttt atcgcaatga tggcatttgt aggtgccacc
9000ttccttttct actgtccttt tgatgaagtg acagatagct gggcaatgga atccgaggag
9060gtttcccgat attacccttt gttgaaaagt ctcaatagcc ctttggtctt ctgagactgt
9120atctttgata ttcttggagt agacgagagt gtcgtgctcc accatgttat cacatcaatc
9180cacttgcttt gaagacgtgg ttggaacgtc ttctttttcc acgatgctcc tcgtgggtgg
9240gggtccatct ttgggaccac tgtcggcaga ggcatcttga acgatagcct ttcctttatc
9300gcaatgatgg catttgtagg tgccaccttc cttttctact gtccttttga tgaagtgaca
9360gatagctggg caatggaatc cgaggaggtt tcccgatatt accctttgtt gaaaagtctc
9420aatagccctt tggtcttctg agactgtatc tttgatattc ttggagtaga cgagagtgtc
9480gtgctccacc atgttggcaa gctgctctag ccaatacgca aaccgcctct ccccgcgcgt
9540tggccgattc attaatgcag ctggcacgac aggtttcccg actggaaagc gggcagtgag
9600cgcaacgcaa ttaatgtgag ttagctcact cattaggcac cccaggcttt acactttatg
9660cttccggctc gtatgttgtg tggaattgtg agcggataac aatttcacac aggaaacagc
9720tatgaccatg attacgaatt cgagctcggt acccggggat cctctagagt cgacctgcag
9780gcatgcaagc ttggcactgg ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt
9840tacccaactt aatcgccttg cagcacatcc ccctttcgcc agctggcgta atagcgaaga
9900ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat gctagagcag
9960cttgagcttg gatcagattg tcgtttcccg ccttcagttt agcttcatgg agtcaaagat
10020tcaaatagag gacctaacag aactcgccgt aaagactggc gaacagttca tacagagtct
10080cttacgactc aatgacaaga agaaaatctt cgtcaacatg gtggagcacg acacacttgt
10140ctactccaaa aatatcaaag atacagtctc agaagaccaa agggcaattg agacttttca
10200acaaagggta atatccggaa acctcctcgg attccattgc ccagctatct gtcactttat
10260tgtgaagata gtggaaaagg aaggtggctc ctacaaatgc catcattgcg ataaaggaaa
10320ggccatcgtt gaagatgcct ctgccgacag tggtcccaaa gatggacccc cacccacgag
10380gagcatcgtg gaaaaagaag acgttccaac cacgtcttca aagcaagtgg attgatgtga
10440tatctccact gacgtaaggg atgacgcaca atcccactat ccttcgcaag acccttcctc
10500tatataagga agttcatttc atttggagag aacacggggg actcttgac
10549
93 33 DNA Artificial Sequence CaMV35SpolyA Primer 93
ctgaattaac gccgaattaa ttcgggggat ctg 33
94 29 DNA Artificial Sequence CaMV35Spr Primer 94
ctagagcagc ttgccaacat ggtggagca 29
95 12592 DNA Artificial Sequence pAg2 Plasmid 95
gtacgaagaa ggccaagaac ggccgcctgg tgacggtatc
cgagggtgaa gccttgatta 60gccgctacaa gatcgtaaag agcgaaaccg ggcggccgga
gtacatcgag atcgagctag 120ctgattggat gtaccgcgag atcacagaag gcaagaaccc
ggacgtgctg acggttcacc 180ccgattactt tttgatcgat cccggcatcg gccgttttct
ctaccgcctg gcacgccgcg 240ccgcaggcaa ggcagaagcc agatggttgt tcaagacgat
ctacgaacgc agtggcagcg 300ccggagagtt caagaagttc tgtttcaccg tgcgcaagct
gatcgggtca aatgacctgc 360cggagtacga tttgaaggag gaggcggggc aggctggccc
gatcctagtc atgcgctacc 420gcaacctgat cgagggcgaa gcatccgccg gttcctaatg
tacggagcag atgctagggc 480aaattgccct agcaggggaa aaaggtcgaa aaggtctctt
tcctgtggat agcacgtaca 540ttgggaaccc aaagccgtac attgggaacc ggaacccgta
cattgggaac ccaaagccgt 600acattgggaa ccggtcacac atgtaagtga ctgatataaa
agagaaaaaa ggcgattttt 660ccgcctaaaa ctctttaaaa cttattaaaa ctcttaaaac
ccgcctggcc tgtgcataac 720tgtctggcca gcgcacagcc gaagagctgc aaaaagcgcc
tacccttcgg tcgctgcgct 780ccctacgccc cgccgcttcg cgtcggccta tcgcggccgc
tggccgctca aaaatggctg 840gcctacggcc aggcaatcta ccagggcgcg gacaagccgc
gccgtcgcca ctcgaccgcc 900ggcgcccaca tcaaggcacc ctgcctcgcg cgtttcggtg
atgacggtga aaacctctga 960cacatgcagc tcccggagac ggtcacagct tgtctgtaag
cggatgccgg gagcagacaa 1020gcccgtcagg gcgcgtcagc gggtgttggc gggtgtcggg
gcgcagccat gacccagtca 1080cgtagcgata gcggagtgta tactggctta actatgcggc
atcagagcag attgtactga 1140gagtgcacca tatgcggtgt gaaataccgc acagatgcgt
aaggagaaaa taccgcatca 1200ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc
ggtcgttcgg ctgcggcgag 1260cggtatcagc tcactcaaag gcggtaatac ggttatccac
agaatcaggg gataacgcag 1320gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa
ccgtaaaaag gccgcgttgc 1380tggcgttttt ccataggctc cgcccccctg acgagcatca
caaaaatcga cgctcaagtc 1440agaggtggcg aaacccgaca ggactataaa gataccaggc
gtttccccct ggaagctccc 1500tcgtgcgctc tcctgttccg accctgccgc ttaccggata
cctgtccgcc tttctccctt 1560cgggaagcgt ggcgctttct catagctcac gctgtaggta
tctcagttcg gtgtaggtcg 1620ttcgctccaa gctgggctgt gtgcacgaac cccccgttca
gcccgaccgc tgcgccttat 1680ccggtaacta tcgtcttgag tccaacccgg taagacacga
cttatcgcca ctggcagcag 1740ccactggtaa caggattagc agagcgaggt atgtaggcgg
tgctacagag ttcttgaagt 1800ggtggcctaa ctacggctac actagaagga cagtatttgg
tatctgcgct ctgctgaagc 1860cagttacctt cggaaaaaga gttggtagct cttgatccgg
caaacaaacc accgctggta 1920gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag
aaaaaaagga tctcaagaag 1980atcctttgat cttttctacg gggtctgacg ctcagtggaa
cgaaaactca cgttaaggga 2040ttttggtcat gcattctagg tactaaaaca attcatccag
taaaatataa tattttattt 2100tctcccaatc aggcttgatc cccagtaagt caaaaaatag
ctcgacatac tgttcttccc 2160cgatatcctc cctgatcgac cggacgcaga aggcaatgtc
ataccacttg tccgccctgc 2220cgcttctccc aagatcaata aagccactta ctttgccatc
tttcacaaag atgttgctgt 2280ctcccaggtc gccgtgggaa aagacaagtt cctcttcggg
cttttccgtc tttaaaaaat 2340catacagctc gcgcggatct ttaaatggag tgtcttcttc
ccagttttcg caatccacat 2400cggccagatc gttattcagt aagtaatcca attcggctaa
gcggctgtct aagctattcg 2460tatagggaca atccgatatg tcgatggagt gaaagagcct
gatgcactcc gcatacagct 2520cgataatctt ttcagggctt tgttcatctt catactcttc
cgagcaaagg acgccatcgg 2580cctcactcat gagcagattg ctccagccat catgccgttc
aaagtgcagg acctttggaa 2640caggcagctt tccttccagc catagcatca tgtccttttc
ccgttccaca tcataggtgg 2700tccctttata ccggctgtcc gtcattttta aatataggtt
ttcattttct cccaccagct 2760tatatacctt agcaggagac attccttccg tatcttttac
gcagcggtat ttttcgatca 2820gttttttcaa ttccggtgat attctcattt tagccattta
ttatttcctt cctcttttct 2880acagtattta aagatacccc aagaagctaa ttataacaag
acgaactcca attcactgtt 2940ccttgcattc taaaacctta aataccagaa aacagctttt
tcaaagttgt tttcaaagtt 3000ggcgtataac atagtatcga cggagccgat tttgaaaccg
cggtgatcac aggcagcaac 3060gctctgtcat cgttacaatc aacatgctac cctccgcgag
atcatccgtg tttcaaaccc 3120ggcagcttag ttgccgttct tccgaatagc atcggtaaca
tgagcaaagt ctgccgcctt 3180acaacggctc tcccgctgac gccgtcccgg actgatgggc
tgcctgtatc gagtggtgat 3240tttgtgccga gctgccggtc ggggagctgt tggctggctg
gtggcaggat atattgtggt 3300gtaaacaaat tgacgcttag acaacttaat aacacattgc
ggacgttttt aatgtactga 3360attaacgccg aattaattcg ggggatctgg attttagtac
tggattttgg ttttaggaat 3420tagaaatttt attgatagaa gtattttaca aatacaaata
catactaagg gtttcttata 3480tgctcaacac atgagcgaaa ccctatagga accctaattc
ccttatctgg gaactactca 3540cacattatta tggagaaact cgagtcaaat ctcggtgacg
ggcaggaccg gacggggcgg 3600taccggcagg ctgaagtcca gctgccagaa acccacgtca
tgccagttcc cgtgcttgaa 3660gccggccgcc cgcagcatgc cgcggggggc atatccgagc
gcctcgtgca tgcgcacgct 3720cgggtcgttg ggcagcccga tgacagcgac cacgctcttg
aagccctgtg cctccaggga 3780cttcagcagg tgggtgtaga gcgtggagcc cagtcccgtc
cgctggtggc ggggggagac 3840gtacacggtc gactcggccg tccagtcgta ggcgttgcgt
gccttccagg ggcccgcgta 3900ggcgatgccg gcgacctcgc cgtccacctc ggcgacgagc
cagggatagc gctcccgcag 3960acggacgagg tcgtccgtcc actcctgcgg ttcctgcggc
tcggtacgga agttgaccgt 4020gcttgtctcg atgtagtggt tgacgatggt gcagaccgcc
ggcatgtccg cctcggtggc 4080acggcggatg tcggccgggc gtcgttctgg gctcatggta
gactcgagag agatagattt 4140gtagagagag actggtgatt tcagcgtgtc ctctccaaat
gaaatgaact tccttatata 4200gaggaaggtc ttgcgaagga tagtgggatt gtgcgtcatc
ccttacgtca gtggagatat 4260cacatcaatc cacttgcttt gaagacgtgg ttggaacgtc
ttctttttcc acgatgctcc 4320tcgtgggtgg gggtccatct ttgggaccac tgtcggcaga
ggcatcttga acgatagcct 4380ttcctttatc gcaatgatgg catttgtagg tgccaccttc
cttttctact gtccttttga 4440tgaagtgaca gatagctggg caatggaatc cgaggaggtt
tcccgatatt accctttgtt 4500gaaaagtctc aatagccctt tggtcttctg agactgtatc
tttgatattc ttggagtaga 4560cgagagtgtc gtgctccacc atgttatcac atcaatccac
ttgctttgaa gacgtggttg 4620gaacgtcttc tttttccacg atgctcctcg tgggtggggg
tccatctttg ggaccactgt 4680cggcagaggc atcttgaacg atagcctttc ctttatcgca
atgatggcat ttgtaggtgc 4740caccttcctt ttctactgtc cttttgatga agtgacagat
agctgggcaa tggaatccga 4800ggaggtttcc cgatattacc ctttgttgaa aagtctcaat
agccctttgg tcttctgaga 4860ctgtatcttt gatattcttg gagtagacga gagtgtcgtg
ctccaccatg ttggcaagct 4920gctctagcca atacgcaaac cgcctctccc cgcgcgttgg
ccgattcatt aatgcagctg 4980gcacgacagg tttcccgact ggaaagcggg cagtgagcgc
aacgcaatta atgtgagtta 5040gctcactcat taggcacccc aggctttaca ctttatgctt
ccggctcgta tgttgtgtgg 5100aattgtgagc ggataacaat ttcacacagg aaacagctat
gaccatgatt acgaattcga 5160gccttgacta gagggtcgac ggtatacaga catgataaga
tacattgatg agtttggaca 5220aaccacaact agaatgcagt gaaaaaaatg ctttatttgt
gaaatttgtg atgctattgc 5280tttatttgta accattataa gctgcaataa acaagttggg
gtgggcgaag aactccagca 5340tgagatcccc gcgctggagg atcatccagc cggcgtcccg
gaaaacgatt ccgaagccca 5400acctttcata gaaggcggcg gtggaatcga aatctcgtag
cacgtgtcag tcctgctcct 5460cggccacgaa gtgcacgcag ttgccggccg ggtcgcgcag
ggcgaactcc cgcccccacg 5520gctgctcgcc gatctcggtc atggccggcc cggaggcgtc
ccggaagttc gtggacacga 5580cctccgacca ctcggcgtac agctcgtcca ggccgcgcac
ccacacccag gccagggtgt 5640tgtccggcac cacctggtcc tggaccgcgc tgatgaacag
ggtcacgtcg tcccggacca 5700caccggcgaa gtcgtcctcc acgaagtccc gggagaaccc
gagccggtcg gtccagaact 5760cgaccgctcc ggcgacgtcg cgcgcggtga gcaccggaac
ggcactggtc aacttggcca 5820tggatccaga tttcgctcaa gttagtataa aaaagcaggc
ttcaatcctg caggaattcg 5880atcgacactc tcgtctactc caagaatatc aaagatacag
tctcagaaga ccaaagggct 5940attgagactt ttcaacaaag ggtaatatcg ggaaacctcc
tcggattcca ttgcccagct 6000atctgtcact tcatcaaaag gacagtagaa aaggaaggtg
gcacctacaa atgccatcat 6060tgcgataaag gaaaggctat cgttcaagat gcctctgccg
acagtggtcc caaagatgga 6120cccccaccca cgaggagcat cgtggaaaaa gaagacgttc
caaccacgtc ttcaaagcaa 6180gtggattgat gtgataacat ggtggagcac gacactctcg
tctactccaa gaatatcaaa 6240gatacagtct cagaagacca aagggctatt gagacttttc
aacaaagggt aatatcggga 6300aacctcctcg gattccattg cccagctatc tgtcacttca
tcaaaaggac agtagaaaag 6360gaaggtggca cctacaaatg ccatcattgc gataaaggaa
aggctatcgt tcaagatgcc 6420tctgccgaca gtggtcccaa agatggaccc ccacccacga
ggagcatcgt ggaaaaagaa 6480gacgttccaa ccacgtcttc aaagcaagtg gattgatgtg
atatctccac tgacgtaagg 6540gatgacgcac aatcccacta tccttcgcaa gaccttcctc
tatataagga agttcatttc 6600atttggagag gacacgctga aatcaccagt ctctctctac
aaatctatct ctctcgagct 6660ttcgcagatc cgggggggca atgagatatg aaaaagcctg
aactcaccgc gacgtctgtc 6720gagaagtttc tgatcgaaaa gttcgacagc gtctccgacc
tgatgcagct ctcggagggc 6780gaagaatctc gtgctttcag cttcgatgta ggagggcgtg
gatatgtcct gcgggtaaat 6840agctgcgccg atggtttcta caaagatcgt tatgtttatc
ggcactttgc atcggccgcg 6900ctcccgattc cggaagtgct tgacattggg gagtttagcg
agagcctgac ctattgcatc 6960tcccgccgtg cacagggtgt cacgttgcaa gacctgcctg
aaaccgaact gcccgctgtt 7020ctacaaccgg tcgcggaggc tatggatgcg atcgctgcgg
ccgatcttag ccagacgagc 7080gggttcggcc cattcggacc gcaaggaatc ggtcaataca
ctacatggcg tgatttcata 7140tgcgcgattg ctgatcccca tgtgtatcac tggcaaactg
tgatggacga caccgtcagt 7200gcgtccgtcg cgcaggctct cgatgagctg atgctttggg
ccgaggactg ccccgaagtc 7260cggcacctcg tgcacgcgga tttcggctcc aacaatgtcc
tgacggacaa tggccgcata 7320acagcggtca ttgactggag cgaggcgatg ttcggggatt
cccaatacga ggtcgccaac 7380atcttcttct ggaggccgtg gttggcttgt atggagcagc
agacgcgcta cttcgagcgg 7440aggcatccgg agcttgcagg atcgccacga ctccgggcgt
atatgctccg cattggtctt 7500gaccaactct atcagagctt ggttgacggc aatttcgatg
atgcagcttg ggcgcagggt 7560cgatgcgacg caatcgtccg atccggagcc gggactgtcg
ggcgtacaca aatcgcccgc 7620agaagcgcgg ccgtctggac cgatggctgt gtagaagtac
tcgccgatag tggaaaccga 7680cgccccagca ctcgtccgag ggcaaagaaa tagagtagat
gccgaccgga tctgtcgatc 7740gacaagctcg agtttctcca taataatgtg tgagtagttc
ccagataagg gaattagggt 7800tcctataggg tttcgctcat gtgttgagca tataagaaac
ccttagtatg tatttgtatt 7860tgtaaaatac ttctatcaat aaaatttcta attcctaaaa
ccaaaatcca gtactaaaat 7920ccagatcccc cgaattaatt cggcgttaat tcagatcaag
cttggcactg gccgtcgttt 7980tacaacgtcg tgactgggaa aaccctggcg ttacccaact
taatcgcctt gcagcacatc 8040cccctttcgc cagctggcgt aatagcgaag aggcccgcac
cgatcgccct tcccaacagt 8100tgcgcagcct gaatggcgaa tgctagagca gcttgagctt
ggatcagatt gtcgtttccc 8160gccttcagtt tggggatcct ctagactgaa ggcgggaaac
gacaatctga tcatgagcgg 8220agaattaagg gagtcacgtt atgacccccg ccgatgacgc
gggacaagcc gttttacgtt 8280tggaactgac agaaccgcaa cgttgaagga gccactcagc
cgcgggtttc tggagtttaa 8340tgagctaagc acatacgtca gaaaccatta ttgcgcgttc
aaaagtcgcc taaggtcact 8400atcagctagc aaatatttct tgtcaaaaat gctccactga
cgttccataa attcccctcg 8460gtatccaatt agagtctcat attcactctc aatccaaata
atctgcaccg gatctcgaga 8520atcgaattcc cgcggccgcc atggtagatc tgactagtaa
aggagaagaa cttttcactg 8580gagttgtccc aattcttgtt gaattagatg gtgatgttaa
tgggcacaaa ttttctgtca 8640gtggagaggg tgaaggtgat gcaacatacg gaaaacttac
ccttaaattt atttgcacta 8700ctggaaaact acctgttccg tggccaacac ttgtcactac
tttctcttat ggtgttcaat 8760gcttttcaag atacccagat catatgaagc ggcacgactt
cttcaagagc gccatgcctg 8820agggatacgt gcaggagagg accatcttct tcaaggacga
cgggaactac aagacacgtg 8880ctgaagtcaa gtttgaggga gacaccctcg tcaacaggat
cgagcttaag ggaatcgatt 8940tcaaggagga cggaaacatc ctcggccaca agttggaata
caactacaac tcccacaacg 9000tatacatcat ggccgacaag caaaagaacg gcatcaaagc
caacttcaag acccgccaca 9060acatcgaaga cggcggcgtg caactcgctg atcattatca
acaaaatact ccaattggcg 9120atggccctgt ccttttacca gacaaccatt acctgtccac
acaatctgcc ctttcgaaag 9180atcccaacga aaagagagac cacatggtcc ttcttgagtt
tgtaacagct gctgggatta 9240cacatggcat ggatgaacta tacaaagcta gccaccacca
ccaccaccac gtgtgaattg 9300gtgaccagct cgaatttccc cgatcgttca aacatttggc
aataaagttt cttaagattg 9360aatcctgttg ccggtcttgc gatgattatc atataatttc
tgttgaatta cgttaagcat 9420gtaataatta acatgtaatg catgacgtta tttatgagat
gggtttttat gattagagtc 9480ccgcaattat acatttaata cgcgatagaa aacaaaatat
agcgcgcaaa ctaggataaa 9540ttatcgcgcg cggtgtcatc tatgttacta gatcgggaat
taaactatca gtgtttgaca 9600ggatatattg gcgggtaaac ctaagagaaa agagcgttta
ttagaataac ggatatttaa 9660aagggcgtga aaaggtttat ccgttcgtcc atttgtatgt
gcatgccaac cacagggttc 9720ccctcgggat caaagtactt tgatccaacc cctccgctgc
tatagtgcag tcggcttctg 9780acgttcagtg cagccgtctt ctgaaaacga catgtcgcac
aagtcctaag ttacgcgaca 9840ggctgccgcc ctgccctttt cctggcgttt tcttgtcgcg
tgttttagtc gcataaagta 9900gaatacttgc gactagaacc ggagacatta cgccatgaac
aagagcgccg ccgctggcct 9960gctgggctat gcccgcgtca gcaccgacga ccaggacttg
accaaccaac gggccgaact 10020gcacgcggcc ggctgcacca agctgttttc cgagaagatc
accggcacca ggcgcgaccg 10080cccggagctg gccaggatgc ttgaccacct acgccctggc
gacgttgtga cagtgaccag 10140gctagaccgc ctggcccgca gcacccgcga cctactggac
attgccgagc gcatccagga 10200ggccggcgcg ggcctgcgta gcctggcaga gccgtgggcc
gacaccacca cgccggccgg 10260ccgcatggtg ttgaccgtgt tcgccggcat tgccgagttc
gagcgttccc taatcatcga 10320ccgcacccgg agcgggcgcg aggccgccaa ggcccgaggc
gtgaagtttg gcccccgccc 10380taccctcacc ccggcacaga tcgcgcacgc ccgcgagctg
atcgaccagg aaggccgcac 10440cgtgaaagag gcggctgcac tgcttggcgt gcatcgctcg
accctgtacc gcgcacttga 10500gcgcagcgag gaagtgacgc ccaccgaggc caggcggcgc
ggtgccttcc gtgaggacgc 10560attgaccgag gccgacgccc tggcggccgc cgagaatgaa
cgccaagagg aacaagcatg 10620aaaccgcacc aggacggcca ggacgaaccg tttttcatta
ccgaagagat cgaggcggag 10680atgatcgcgg ccgggtacgt gttcgagccg cccgcgcacg
tctcaaccgt gcggctgcat 10740gaaatcctgg ccggtttgtc tgatgccaag ctggcggcct
ggccggccag cttggccgct 10800gaagaaaccg agcgccgccg tctaaaaagg tgatgtgtat
ttgagtaaaa cagcttgcgt 10860catgcggtcg ctgcgtatat gatgcgatga gtaaataaac
aaatacgcaa ggggaacgca 10920tgaaggttat cgctgtactt aaccagaaag gcgggtcagg
caagacgacc atcgcaaccc 10980atctagcccg cgccctgcaa ctcgccgggg ccgatgttct
gttagtcgat tccgatcccc 11040agggcagtgc ccgcgattgg gcggccgtgc gggaagatca
accgctaacc gttgtcggca 11100tcgaccgccc gacgattgac cgcgacgtga aggccatcgg
ccggcgcgac ttcgtagtga 11160tcgacggagc gccccaggcg gcggacttgg ctgtgtccgc
gatcaaggca gccgacttcg 11220tgctgattcc ggtgcagcca agcccttacg acatatgggc
caccgccgac ctggtggagc 11280tggttaagca gcgcattgag gtcacggatg gaaggctaca
agcggccttt gtcgtgtcgc 11340gggcgatcaa aggcacgcgc atcggcggtg aggttgccga
ggcgctggcc gggtacgagc 11400tgcccattct tgagtcccgt atcacgcagc gcgtgagcta
cccaggcact gccgccgccg 11460gcacaaccgt tcttgaatca gaacccgagg gcgacgctgc
ccgcgaggtc caggcgctgg 11520ccgctgaaat taaatcaaaa ctcatttgag ttaatgaggt
aaagagaaaa tgagcaaaag 11580cacaaacacg ctaagtgccg gccgtccgag cgcacgcagc
agcaaggctg caacgttggc 11640cagcctggca gacacgccag ccatgaagcg ggtcaacttt
cagttgccgg cggaggatca 11700caccaagctg aagatgtacg cggtacgcca aggcaagacc
attaccgagc tgctatctga 11760atacatcgcg cagctaccag agtaaatgag caaatgaata
aatgagtaga tgaattttag 11820cggctaaagg aggcggcatg gaaaatcaag aacaaccagg
caccgacgcc gtggaatgcc 11880ccatgtgtgg aggaacgggc ggttggccag gcgtaagcgg
ctgggttgtc tgccggccct 11940gcaatggcac tggaaccccc aagcccgagg aatcggcgtg
acggtcgcaa accatccggc 12000ccggtacaaa tcggcgcggc gctgggtgat gacctggtgg
agaagttgaa ggccgcgcag 12060gccgcccagc ggcaacgcat cgaggcagaa gcacgccccg
gtgaatcgtg gcaagcggcc 12120gctgatcgaa tccgcaaaga atcccggcaa ccgccggcag
ccggtgcgcc gtcgattagg 12180aagccgccca agggcgacga gcaaccagat tttttcgttc
cgatgctcta tgacgtgggc 12240acccgcgata gtcgcagcat catggacgtg gccgttttcc
gtctgtcgaa gcgtgaccga 12300cgagctggcg aggtgatccg ctacgagctt ccagacgggc
acgtagaggt ttccgcaggg 12360ccggccggca tggccagtgt gtgggattac gacctggtac
tgatggcggt ttcccatcta 12420accgaatcca tgaaccgata ccgggaaggg aagggagaca
agcccggccg cgtgttccgt 12480ccacacgttg cggacgtact caagttctgc cggcgagccg
atggcggaaa gcagaaagac 12540gacctggtag aaacctgcat tcggttaaac accacgcacg
ttgccatgca gc 12592
96 3357 DNA Artificial Sequence pGEMEasyNOS Plasmid
96 tatcactagt gaattcgcgg ccgcctgcag gtcgaccata tgggagagct
cccaacgcgt 60tggatgcata gcttgagtat tctatagtgt cacctaaata gcttggcgta
atcatggtca 120tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat
acgagccgga 180agcataaagt gtaaagcctg gggtgcctaa tgagtgagct aactcacatt
aattgcgttg 240cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta
atgaatcggc 300caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc
gctcactgac 360tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa
ggcggtaata 420cggttatcca cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa
aggccagcaa 480aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct
ccgcccccct 540gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac
aggactataa 600agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc
gaccctgccg 660cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc
tcatagctca 720cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg
tgtgcacgaa 780ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga
gtccaacccg 840gtaagacacg acttatcgcc actggcagca gccactggta acaggattag
cagagcgagg 900tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta
cactagaaga 960acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag
agttggtagc 1020tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg
caagcagcag 1080attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac
ggggtctgac 1140gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc
aaaaaggatc 1200ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag
tatatatgag 1260taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc
agcgatctgt 1320ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac
gatacgggag 1380ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc
accggctcca 1440gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg
tcctgcaact 1500ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag
tagttcgcca 1560gttaatagtt tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc
acgctcgtcg 1620tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac
atgatccccc 1680atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag
aagtaagttg 1740gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac
tgtcatgcca 1800tccgtaagat gcttttctgt gactggtgag tactcaacca agtcattctg
agaatagtgt 1860atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc
gccacatagc 1920agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact
ctcaaggatc 1980ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg
atcttcagca 2040tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa
tgccgcaaaa 2100aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt
tcaatattat 2160tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg
tatttagaaa 2220aataaacaaa taggggttcc gcgcacattt ccccgaaaag tgccacctga
tgcggtgtga 2280aataccgcac agatgcgtaa ggagaaaata ccgcatcagg aaattgtaag
cgttaatatt 2340ttgttaaaat tcgcgttaaa tttttgttaa atcagctcat tttttaacca
ataggccgaa 2400atcggcaaaa tcccttataa atcaaaagaa tagaccgaga tagggttgag
tgttgttcca 2460gtttggaaca agagtccact attaaagaac gtggactcca acgtcaaagg
gcgaaaaacc 2520gtctatcagg gcgatggccc actacgtgaa ccatcaccct aatcaagttt
tttggggtcg 2580aggtgccgta aagcactaaa tcggaaccct aaagggagcc cccgatttag
agcttgacgg 2640ggaaagccgg cgaacgtggc gagaaaggaa gggaagaaag cgaaaggagc
gggcgctagg 2700gcgctggcaa gtgtagcggt cacgctgcgc gtaaccacca cacccgccgc
gcttaatgcg 2760ccgctacagg gcgcgtccat tcgccattca ggctgcgcaa ctgttgggaa
gggcgatcgg 2820tgcgggcctc ttcgctatta cgccagctgg cgaaaggggg atgtgctgca
aggcgattaa 2880gttgggtaac gccagggttt tcccagtcac gacgttgtaa aacgacggcc
agtgaattgt 2940aatacgactc actatagggc gaattgggcc cgacgtcgca tgctcccggc
cgccatggcg 3000gccgcgggaa ttcgattctc gagatccggt gcagattatt tggattgaga
gtgaatatga 3060gactctaatt ggataccgag gggaatttat ggaacgtcag tggagcattt
ttgacaagaa 3120atatttgcta gctgatagtg accttaggcg acttttgaac gcgcaataat
ggtttctgac 3180gtatgtgctt agctcattaa actccagaaa cccgcggctg agtggctcct
tcaacgttgc 3240ggttctgtca gttccaaacg taaaacggct tgtcccgcgt catcggcggg
ggtcataacg 3300tgactccctt aattctccgc tcatgatcag attgtcgttt cccgccttca
gtctaga 3357
97 10122 DNA Artificial Sequence p1302NOS Plasmid
97 catggtagat ctgactagta aaggagaaga acttttcact ggagttgtcc caattcttgt
60tgaattagat ggtgatgtta atgggcacaa attttctgtc agtggagagg gtgaaggtga
120tgcaacatac ggaaaactta cccttaaatt tatttgcact actggaaaac tacctgttcc
180gtggccaaca cttgtcacta ctttctctta tggtgttcaa tgcttttcaa gatacccaga
240tcatatgaag cggcacgact tcttcaagag cgccatgcct gagggatacg tgcaggagag
300gaccatcttc ttcaaggacg acgggaacta caagacacgt gctgaagtca agtttgaggg
360agacaccctc gtcaacagga tcgagcttaa gggaatcgat ttcaaggagg acggaaacat
420cctcggccac aagttggaat acaactacaa ctcccacaac gtatacatca tggccgacaa
480gcaaaagaac ggcatcaaag ccaacttcaa gacccgccac aacatcgaag acggcggcgt
540gcaactcgct gatcattatc aacaaaatac tccaattggc gatggccctg tccttttacc
600agacaaccat tacctgtcca cacaatctgc cctttcgaaa gatcccaacg aaaagagaga
660ccacatggtc cttcttgagt ttgtaacagc tgctgggatt acacatggca tggatgaact
720atacaaagct agccaccacc accaccacca cgtgtgaatt ggtgaccagc tcgaatttcc
780ccgatcgttc aaacatttgg caataaagtt tcttaagatt gaatcctgtt gccggtcttg
840cgatgattat catataattt ctgttgaatt acgttaagca tgtaataatt aacatgtaat
900gcatgacgtt atttatgaga tgggttttta tgattagagt cccgcaatta tacatttaat
960acgcgataga aaacaaaata tagcgcgcaa actaggataa attatcgcgc gcggtgtcat
1020ctatgttact agatcgggaa ttaaactatc agtgtttgac aggatatatt ggcgggtaaa
1080cctaagagaa aagagcgttt attagaataa cggatattta aaagggcgtg aaaaggttta
1140tccgttcgtc catttgtatg tgcatgccaa ccacagggtt cccctcggga tcaaagtact
1200ttgatccaac ccctccgctg ctatagtgca gtcggcttct gacgttcagt gcagccgtct
1260tctgaaaacg acatgtcgca caagtcctaa gttacgcgac aggctgccgc cctgcccttt
1320tcctggcgtt ttcttgtcgc gtgttttagt cgcataaagt agaatacttg cgactagaac
1380cggagacatt acgccatgaa caagagcgcc gccgctggcc tgctgggcta tgcccgcgtc
1440agcaccgacg accaggactt gaccaaccaa cgggccgaac tgcacgcggc cggctgcacc
1500aagctgtttt ccgagaagat caccggcacc aggcgcgacc gcccggagct ggccaggatg
1560cttgaccacc tacgccctgg cgacgttgtg acagtgacca ggctagaccg cctggcccgc
1620agcacccgcg acctactgga cattgccgag cgcatccagg aggccggcgc gggcctgcgt
1680agcctggcag agccgtgggc cgacaccacc acgccggccg gccgcatggt gttgaccgtg
1740ttcgccggca ttgccgagtt cgagcgttcc ctaatcatcg accgcacccg gagcgggcgc
1800gaggccgcca aggcccgagg cgtgaagttt ggcccccgcc ctaccctcac cccggcacag
1860atcgcgcacg cccgcgagct gatcgaccag gaaggccgca ccgtgaaaga ggcggctgca
1920ctgcttggcg tgcatcgctc gaccctgtac cgcgcacttg agcgcagcga ggaagtgacg
1980cccaccgagg ccaggcggcg cggtgccttc cgtgaggacg cattgaccga ggccgacgcc
2040ctggcggccg ccgagaatga acgccaagag gaacaagcat gaaaccgcac caggacggcc
2100aggacgaacc gtttttcatt accgaagaga tcgaggcgga gatgatcgcg gccgggtacg
2160tgttcgagcc gcccgcgcac gtctcaaccg tgcggctgca tgaaatcctg gccggtttgt
2220ctgatgccaa gctggcggcc tggccggcca gcttggccgc tgaagaaacc gagcgccgcc
2280gtctaaaaag gtgatgtgta tttgagtaaa acagcttgcg tcatgcggtc gctgcgtata
2340tgatgcgatg agtaaataaa caaatacgca aggggaacgc atgaaggtta tcgctgtact
2400taaccagaaa ggcgggtcag gcaagacgac catcgcaacc catctagccc gcgccctgca
2460actcgccggg gccgatgttc tgttagtcga ttccgatccc cagggcagtg cccgcgattg
2520ggcggccgtg cgggaagatc aaccgctaac cgttgtcggc atcgaccgcc cgacgattga
2580ccgcgacgtg aaggccatcg gccggcgcga cttcgtagtg atcgacggag cgccccaggc
2640ggcggacttg gctgtgtccg cgatcaaggc agccgacttc gtgctgattc cggtgcagcc
2700aagcccttac gacatatggg ccaccgccga cctggtggag ctggttaagc agcgcattga
2760ggtcacggat ggaaggctac aagcggcctt tgtcgtgtcg cgggcgatca aaggcacgcg
2820catcggcggt gaggttgccg aggcgctggc cgggtacgag ctgcccattc ttgagtcccg
2880tatcacgcag cgcgtgagct acccaggcac tgccgccgcc ggcacaaccg ttcttgaatc
2940agaacccgag ggcgacgctg cccgcgaggt ccaggcgctg gccgctgaaa ttaaatcaaa
3000actcatttga gttaatgagg taaagagaaa atgagcaaaa gcacaaacac gctaagtgcc
3060ggccgtccga gcgcacgcag cagcaaggct gcaacgttgg ccagcctggc agacacgcca
3120gccatgaagc gggtcaactt tcagttgccg gcggaggatc acaccaagct gaagatgtac
3180gcggtacgcc aaggcaagac cattaccgag ctgctatctg aatacatcgc gcagctacca
3240gagtaaatga gcaaatgaat aaatgagtag atgaatttta gcggctaaag gaggcggcat
3300ggaaaatcaa gaacaaccag gcaccgacgc cgtggaatgc cccatgtgtg gaggaacggg
3360cggttggcca ggcgtaagcg gctgggttgt ctgccggccc tgcaatggca ctggaacccc
3420caagcccgag gaatcggcgt gacggtcgca aaccatccgg cccggtacaa atcggcgcgg
3480cgctgggtga tgacctggtg gagaagttga aggccgcgca ggccgcccag cggcaacgca
3540tcgaggcaga agcacgcccc ggtgaatcgt ggcaagcggc cgctgatcga atccgcaaag
3600aatcccggca accgccggca gccggtgcgc cgtcgattag gaagccgccc aagggcgacg
3660agcaaccaga ttttttcgtt ccgatgctct atgacgtggg cacccgcgat agtcgcagca
3720tcatggacgt ggccgttttc cgtctgtcga agcgtgaccg acgagctggc gaggtgatcc
3780gctacgagct tccagacggg cacgtagagg tttccgcagg gccggccggc atggccagtg
3840tgtgggatta cgacctggta ctgatggcgg tttcccatct aaccgaatcc atgaaccgat
3900accgggaagg gaagggagac aagcccggcc gcgtgttccg tccacacgtt gcggacgtac
3960tcaagttctg ccggcgagcc gatggcggaa agcagaaaga cgacctggta gaaacctgca
4020ttcggttaaa caccacgcac gttgccatgc agcgtacgaa gaaggccaag aacggccgcc
4080tggtgacggt atccgagggt gaagccttga ttagccgcta caagatcgta aagagcgaaa
4140ccgggcggcc ggagtacatc gagatcgagc tagctgattg gatgtaccgc gagatcacag
4200aaggcaagaa cccggacgtg ctgacggttc accccgatta ctttttgatc gatcccggca
4260tcggccgttt tctctaccgc ctggcacgcc gcgccgcagg caaggcagaa gccagatggt
4320tgttcaagac gatctacgaa cgcagtggca gcgccggaga gttcaagaag ttctgtttca
4380ccgtgcgcaa gctgatcggg tcaaatgacc tgccggagta cgatttgaag gaggaggcgg
4440ggcaggctgg cccgatccta gtcatgcgct accgcaacct gatcgagggc gaagcatccg
4500ccggttccta atgtacggag cagatgctag ggcaaattgc cctagcaggg gaaaaaggtc
4560gaaaaggtct ctttcctgtg gatagcacgt acattgggaa cccaaagccg tacattggga
4620accggaaccc gtacattggg aacccaaagc cgtacattgg gaaccggtca cacatgtaag
4680tgactgatat aaaagagaaa aaaggcgatt tttccgccta aaactcttta aaacttatta
4740aaactcttaa aacccgcctg gcctgtgcat aactgtctgg ccagcgcaca gccgaagagc
4800tgcaaaaagc gcctaccctt cggtcgctgc gctccctacg ccccgccgct tcgcgtcggc
4860ctatcgcggc cgctggccgc tcaaaaatgg ctggcctacg gccaggcaat ctaccagggc
4920gcggacaagc cgcgccgtcg ccactcgacc gccggcgccc acatcaaggc accctgcctc
4980gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc agctcccgga gacggtcaca
5040gcttgtctgt aagcggatgc cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt
5100ggcgggtgtc ggggcgcagc catgacccag tcacgtagcg atagcggagt gtatactggc
5160ttaactatgc ggcatcagag cagattgtac tgagagtgca ccatatgcgg tgtgaaatac
5220cgcacagatg cgtaaggaga aaataccgca tcaggcgctc ttccgcttcc tcgctcactg
5280actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa
5340tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc
5400aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc
5460ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat
5520aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc
5580cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct
5640cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg
5700aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc
5760cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga
5820ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa
5880ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta
5940gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc
6000agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg
6060acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgcattct aggtactaaa
6120acaattcatc cagtaaaata taatatttta ttttctccca atcaggcttg atccccagta
6180agtcaaaaaa tagctcgaca tactgttctt ccccgatatc ctccctgatc gaccggacgc
6240agaaggcaat gtcataccac ttgtccgccc tgccgcttct cccaagatca ataaagccac
6300ttactttgcc atctttcaca aagatgttgc tgtctcccag gtcgccgtgg gaaaagacaa
6360gttcctcttc gggcttttcc gtctttaaaa aatcatacag ctcgcgcgga tctttaaatg
6420gagtgtcttc ttcccagttt tcgcaatcca catcggccag atcgttattc agtaagtaat
6480ccaattcggc taagcggctg tctaagctat tcgtataggg acaatccgat atgtcgatgg
6540agtgaaagag cctgatgcac tccgcataca gctcgataat cttttcaggg ctttgttcat
6600cttcatactc ttccgagcaa aggacgccat cggcctcact catgagcaga ttgctccagc
6660catcatgccg ttcaaagtgc aggacctttg gaacaggcag ctttccttcc agccatagca
6720tcatgtcctt ttcccgttcc acatcatagg tggtcccttt ataccggctg tccgtcattt
6780ttaaatatag gttttcattt tctcccacca gcttatatac cttagcagga gacattcctt
6840ccgtatcttt tacgcagcgg tatttttcga tcagtttttt caattccggt gatattctca
6900ttttagccat ttattatttc cttcctcttt tctacagtat ttaaagatac cccaagaagc
6960taattataac aagacgaact ccaattcact gttccttgca ttctaaaacc ttaaatacca
7020gaaaacagct ttttcaaagt tgttttcaaa gttggcgtat aacatagtat cgacggagcc
7080gattttgaaa ccgcggtgat cacaggcagc aacgctctgt catcgttaca atcaacatgc
7140taccctccgc gagatcatcc gtgtttcaaa cccggcagct tagttgccgt tcttccgaat
7200agcatcggta acatgagcaa agtctgccgc cttacaacgg ctctcccgct gacgccgtcc
7260cggactgatg ggctgcctgt atcgagtggt gattttgtgc cgagctgccg gtcggggagc
7320tgttggctgg ctggtggcag gatatattgt ggtgtaaaca aattgacgct tagacaactt
7380aataacacat tgcggacgtt tttaatgtac tgaattaacg ccgaattaat tcgggggatc
7440tggattttag tactggattt tggttttagg aattagaaat tttattgata gaagtatttt
7500acaaatacaa atacatacta agggtttctt atatgctcaa cacatgagcg aaaccctata
7560ggaaccctaa ttcccttatc tgggaactac tcacacatta ttatggagaa actcgagctt
7620gtcgatcgac agatccggtc ggcatctact ctatttcttt gccctcggac gagtgctggg
7680gcgtcggttt ccactatcgg cgagtacttc tacacagcca tcggtccaga cggccgcgct
7740tctgcgggcg atttgtgtac gcccgacagt cccggctccg gatcggacga ttgcgtcgca
7800tcgaccctgc gcccaagctg catcatcgaa attgccgtca accaagctct gatagagttg
7860gtcaagacca atgcggagca tatacgcccg gagtcgtggc gatcctgcaa gctccggatg
7920cctccgctcg aagtagcgcg tctgctgctc catacaagcc aaccacggcc tccagaagaa
7980gatgttggcg acctcgtatt gggaatcccc gaacatcgcc tcgctccagt caatgaccgc
8040tgttatgcgg ccattgtccg tcaggacatt gttggagccg aaatccgcgt gcacgaggtg
8100ccggacttcg gggcagtcct cggcccaaag catcagctca tcgagagcct gcgcgacgga
8160cgcactgacg gtgtcgtcca tcacagtttg ccagtgatac acatggggat cagcaatcgc
8220gcatatgaaa tcacgccatg tagtgtattg accgattcct tgcggtccga atgggccgaa
8280cccgctcgtc tggctaagat cggccgcagc gatcgcatcc atagcctccg cgaccggttg
8340tagaacagcg ggcagttcgg tttcaggcag gtcttgcaac gtgacaccct gtgcacggcg
8400ggagatgcaa taggtcaggc tctcgctaaa ctccccaatg tcaagcactt ccggaatcgg
8460gagcgcggcc gatgcaaagt gccgataaac ataacgatct ttgtagaaac catcggcgca
8520gctatttacc cgcaggacat atccacgccc tcctacatcg aagctgaaag cacgagattc
8580ttcgccctcc gagagctgca tcaggtcgga gacgctgtcg aacttttcga tcagaaactt
8640ctcgacagac gtcgcggtga gttcaggctt tttcatatct cattgccccc ccggatctgc
8700gaaagctcga gagagataga tttgtagaga gagactggtg atttcagcgt gtcctctcca
8760aatgaaatga acttccttat atagaggaag gtcttgcgaa ggatagtggg attgtgcgtc
8820atcccttacg tcagtggaga tatcacatca atccacttgc tttgaagacg tggttggaac
8880gtcttctttt tccacgatgc tcctcgtggg tgggggtcca tctttgggac cactgtcggc
8940agaggcatct tgaacgatag cctttccttt atcgcaatga tggcatttgt aggtgccacc
9000ttccttttct actgtccttt tgatgaagtg acagatagct gggcaatgga atccgaggag
9060gtttcccgat attacccttt gttgaaaagt ctcaatagcc ctttggtctt ctgagactgt
9120atctttgata ttcttggagt agacgagagt gtcgtgctcc accatgttat cacatcaatc
9180cacttgcttt gaagacgtgg ttggaacgtc ttctttttcc acgatgctcc tcgtgggtgg
9240gggtccatct ttgggaccac tgtcggcaga ggcatcttga acgatagcct ttcctttatc
9300gcaatgatgg catttgtagg tgccaccttc cttttctact gtccttttga tgaagtgaca
9360gatagctggg caatggaatc cgaggaggtt tcccgatatt accctttgtt gaaaagtctc
9420aatagccctt tggtcttctg agactgtatc tttgatattc ttggagtaga cgagagtgtc
9480gtgctccacc atgttggcaa gctgctctag ccaatacgca aaccgcctct ccccgcgcgt
9540tggccgattc attaatgcag ctggcacgac aggtttcccg actggaaagc gggcagtgag
9600cgcaacgcaa ttaatgtgag ttagctcact cattaggcac cccaggcttt acactttatg
9660cttccggctc gtatgttgtg tggaattgtg agcggataac aatttcacac aggaaacagc
9720tatgaccatg attacgaatt cgagctcggt acccggggat cctctagact gaaggcggga
9780aacgacaatc tgatcatgag cggagaatta agggagtcac gttatgaccc ccgccgatga
9840cgcgggacaa gccgttttac gtttggaact gacagaaccg caacgttgaa ggagccactc
9900agccgcgggt ttctggagtt taatgagcta agcacatacg tcagaaacca ttattgcgcg
9960ttcaaaagtc gcctaaggtc actatcagct agcaaatatt tcttgtcaaa aatgctccac
10020tgacgttcca taaattcccc tcggtatcca attagagtct catattcact ctcaatccaa
10080ataatctgca ccggatctcg agaatcgaat tcccgcggcc gc
10122
98 621 DNA Artificial Sequence N. tabacum rDNA intergenic spacer (IGS) sequence
98 gtgctagcca atgtttaaca agatgtcaag cacaatgaat gttggtggtt
ggtggtcgtg 60gctggcggtg gtggaaaatt gcggtggttc gagcggtagt gatcggcgat
ggttggtgtt 120tgcagcggtg tttgatatcg gaatcactta tggtggttgt cacaatggag
gtgcgtcatg 180gttattggtg gttggtcatc tatatatttt tataataata ttaagtattt
tacctatttt 240ttacatattt tttattaaat ttatgcattg tttgtatttt taaatagttt
ttatcgtact 300tgttttataa aatattttat tattttatgt gttatattat tacttgatgt
attggaaatt 360ttctccattg ttttttctat atttataata attttcttat ttttttttgt
tttattatgt 420attttttcgt tttataataa atatttatta aaaaaaatat tatttttgta
aaatatatca 480tttacaatgt ttaaaagtca tttgtgaata tattagctaa gttgtacttc
tttttgtgca 540tttggtgttg tacatgtcta ttatgattct ctggccaaaa catgtctact
cctgtcactt 600gggttttttt ttttaagaca t
621
99 25 DNA Artificial Sequence NTIGS-F1 Primer
99 gtgctagcca atgtttaaca agatg 25
100 28 DNA Artificial Sequence NTIGS-R1 Primer
100 atgtcttaaa aaaaaaaacc caagtgac 28
101 233 DNA Mus musculus Genbank #V00846 1989-07-06
101 gacctggaat atggcgagaa aactgaaaat
cacggaaaat gagaaataca cactttagga 60cgtgaaatat ggcgaggaaa actgaaaaag
gtggaaaatt tagaaatgtc cactgtagga 120cgtggaatat ggcaagaaaa ctgaaaatca
tggaaaatga gaaacatcca cttgacgact 180tgaaaaatga cgaaatcact aaaaaacgtg
aaaaatgaga aatgcacact gaa 233
102 31 DNA Artificial Sequence MSAT-F1 Primer
102 aataccgcgg aagcttgacc tggaatatcg c 31
103 27 DNA Artificial Sequence MSAT-Ri Primer
103 ataaccgcgg agtccttcag tgtgcat 27
104 277 DNA Artificial Sequence Nopaline Synthase Promoter Sequence
104 gagctcgaat ttccccgatc gttcaaacat ttggcaataa agtttcttaa gattgaatcc
60tgttgccggt cttgcgatga ttatcatata atttctgttg aattacgtta agcatgtaat
120aattaacatg taatgcatga cgttatttat gagatgggtt tttatgatta gagtcccgca
180attatacatt taatacgcga tagaaaacaa aatatagcgc gcaaactagg ataaattatc
240gcgcgcggtg tcatctatgt tactagatcg ggaattc
277
105 1812 DNA Escherichia coli CDS (1)...(1812) Beta-Glucuronidase
105 atg tta
cgt cct gta gaa acc cca acc cgt gaa atc aaa aaa ctc gac 48Met Leu
Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys Lys Leu Asp1 5
10 15ggc ctg tgg gca ttc agt ctg gat
cgc gaa aac tgt gga att gat cag 96Gly Leu Trp Ala Phe Ser Leu Asp
Arg Glu Asn Cys Gly Ile Asp Gln 20 25
30cgt tgg tgg gaa agc gcg tta caa gaa agc cgg gca att gct gtg
cca 144Arg Trp Trp Glu Ser Ala Leu Gln Glu Ser Arg Ala Ile Ala Val
Pro 35 40 45ggc agt ttt aac gat
cag ttc gcc gat gca gat att cgt aat tat gcg 192Gly Ser Phe Asn Asp
Gln Phe Ala Asp Ala Asp Ile Arg Asn Tyr Ala 50 55
60ggc aac gtc tgg tat cag cgc gaa gtc ttt ata ccg aaa ggt
tgg gca 240Gly Asn Val Trp Tyr Gln Arg Glu Val Phe Ile Pro Lys Gly
Trp Ala65 70 75 80ggc
cag cgt atc gtg ctg cgt ttc gat gcg gtc act cat tac ggc aaa 288Gly
Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr His Tyr Gly Lys
85 90 95gtg tgg gtc aat aat cag gaa
gtg atg gag cat cag ggc ggc tat acg 336Val Trp Val Asn Asn Gln Glu
Val Met Glu His Gln Gly Gly Tyr Thr 100 105
110cca ttt gaa gcc gat gtc acg ccg tat gtt att gcc ggg aaa
agt gta 384Pro Phe Glu Ala Asp Val Thr Pro Tyr Val Ile Ala Gly Lys
Ser Val 115 120 125cgt atc acc gtt
tgt gtg aac aac gaa ctg aac tgg cag act atc ccg 432Arg Ile Thr Val
Cys Val Asn Asn Glu Leu Asn Trp Gln Thr Ile Pro 130
135 140ccg gga atg gtg att acc gac gaa aac ggc aag aaa
aag cag tct tac 480Pro Gly Met Val Ile Thr Asp Glu Asn Gly Lys Lys
Lys Gln Ser Tyr145 150 155
160ttc cat gat ttc ttt aac tat gcc gga atc cat cgc agc gta atg ctc
528Phe His Asp Phe Phe Asn Tyr Ala Gly Ile His Arg Ser Val Met Leu
165 170 175tac acc acg ccg aac
acc tgg gtg gac gat atc acc gtg gtg acg cat 576Tyr Thr Thr Pro Asn
Thr Trp Val Asp Asp Ile Thr Val Val Thr His 180
185 190gtc gcg caa gac tgt aac cac gcg tct gtt gac tgg
cag gtg gtg gcc 624Val Ala Gln Asp Cys Asn His Ala Ser Val Asp Trp
Gln Val Val Ala 195 200 205aat ggt
gat gtc agc gtt gaa ctg cgt gat gcg gat caa cag gtg gtt 672Asn Gly
Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln Val Val 210
215 220gca act gga caa ggc act agc ggg act ttg caa
gtg gtg aat ccg cac 720Ala Thr Gly Gln Gly Thr Ser Gly Thr Leu Gln
Val Val Asn Pro His225 230 235
240ctc tgg caa ccg ggt gaa ggt tat ctc tat gaa ctg tgc gtc aca gcc
768Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Ala
245 250 255aaa agc cag aca gag
tgt gat atc tac ccg ctt cgc gtc ggc atc cgg 816Lys Ser Gln Thr Glu
Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg 260
265 270tca gtg gca gtg aag ggc gaa cag ttc ctg att aac
cac aaa ccg ttc 864Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn
His Lys Pro Phe 275 280 285tac ttt
act ggc ttt ggt cgt cat gaa gat gcg gac ttg cgt ggc aaa 912Tyr Phe
Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Lys 290
295 300gga ttc gat aac gtg ctg atg gtg cac gac cac
gca tta atg gac tgg 960Gly Phe Asp Asn Val Leu Met Val His Asp His
Ala Leu Met Asp Trp305 310 315
320att ggg gcc aac tcc tac cgt acc tcg cat tac cct tac gct gaa gag
1008Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr Ala Glu Glu
325 330 335atg ctc gac tgg gca
gat gaa cat ggc atc gtg gtg att gat gaa act 1056Met Leu Asp Trp Ala
Asp Glu His Gly Ile Val Val Ile Asp Glu Thr 340
345 350gct gct gtc ggc ttt aac ctc tct tta ggc att ggt
ttc gaa gcg ggc 1104Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly
Phe Glu Ala Gly 355 360 365aac aag
ccg aaa gaa ctg tac agc gaa gag gca gtc aac ggg gaa act 1152Asn Lys
Pro Lys Glu Leu Tyr Ser Glu Glu Ala Val Asn Gly Glu Thr 370
375 380cag caa gcg cac tta cag gcg att aaa gag ctg
ata gcg cgt gac aaa 1200Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu
Ile Ala Arg Asp Lys385 390 395
400aac cac cca agc gtg gtg atg tgg agt att gcc aac gaa ccg gat acc
1248Asn His Pro Ser Val Val Met Trp Ser Ile Ala Asn Glu Pro Asp Thr
405 410 415cgt ccg caa ggt gca
cgg gaa tat ttc gcg cca ctg gcg gaa gca acg 1296Arg Pro Gln Gly Ala
Arg Glu Tyr Phe Ala Pro Leu Ala Glu Ala Thr 420
425 430cgt aaa ctc gac ccg acg cgt ccg atc acc tgc gtc
aat gta atg ttc 1344Arg Lys Leu Asp Pro Thr Arg Pro Ile Thr Cys Val
Asn Val Met Phe 435 440 445tgc gac
gct cac acc gat acc atc agc gat ctc ttt gat gtg ctg tgc 1392Cys Asp
Ala His Thr Asp Thr Ile Ser Asp Leu Phe Asp Val Leu Cys 450
455 460ctg aac cgt tat tac gga tgg tat gtc caa agc
ggc gat ttg gaa acg 1440Leu Asn Arg Tyr Tyr Gly Trp Tyr Val Gln Ser
Gly Asp Leu Glu Thr465 470 475
480gca gag aag gta ctg gaa aaa gaa ctt ctg gcc tgg cag gag aaa ctg
1488Ala Glu Lys Val Leu Glu Lys Glu Leu Leu Ala Trp Gln Glu Lys Leu
485 490 495cat cag ccg att atc
atc acc gaa tac ggc gtg gat acg tta gcc ggg 1536His Gln Pro Ile Ile
Ile Thr Glu Tyr Gly Val Asp Thr Leu Ala Gly 500
505 510ctg cac tca atg tac acc gac atg tgg agt gaa gag
tat cag tgt gca 1584Leu His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu
Tyr Gln Cys Ala 515 520 525tgg ctg
gat atg tat cac cgc gtc ttt gat cgc gtc agc gcc gtc gtc 1632Trp Leu
Asp Met Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val 530
535 540ggt gaa cag gta tgg aat ttc gcc gat ttt gcg
acc tcg caa ggc ata 1680Gly Glu Gln Val Trp Asn Phe Ala Asp Phe Ala
Thr Ser Gln Gly Ile545 550 555
560ttg cgc gtt ggc ggt aac aag aaa ggg atc ttc act cgc gac cgc aaa
1728Leu Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp Arg Lys
565 570 575ccg aag tcg gcg gct
ttt ctg ctg caa aaa cgc tgg act ggc atg aac 1776Pro Lys Ser Ala Ala
Phe Leu Leu Gln Lys Arg Trp Thr Gly Met Asn 580
585 590ttc ggt gaa aaa ccg cag cag gga ggc aaa caa tga
1812Phe Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln *
595 600
106 603 PRT Escherichia coli Genbank #S69414 1994-09-23
106 Met Leu Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys
Lys Leu Asp1 5 10 15Gly
Leu Trp Ala Phe Ser Leu Asp Arg Glu Asn Cys Gly Ile Asp Gln 20
25 30Arg Trp Trp Glu Ser Ala Leu Gln
Glu Ser Arg Ala Ile Ala Val Pro 35 40
45Gly Ser Phe Asn Asp Gln Phe Ala Asp Ala Asp Ile Arg Asn Tyr Ala
50 55 60Gly Asn Val Trp Tyr Gln Arg Glu
Val Phe Ile Pro Lys Gly Trp Ala65 70 75
80Gly Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr His
Tyr Gly Lys 85 90 95Val
Trp Val Asn Asn Gln Glu Val Met Glu His Gln Gly Gly Tyr Thr
100 105 110Pro Phe Glu Ala Asp Val Thr
Pro Tyr Val Ile Ala Gly Lys Ser Val 115 120
125Arg Ile Thr Val Cys Val Asn Asn Glu Leu Asn Trp Gln Thr Ile
Pro 130 135 140Pro Gly Met Val Ile Thr
Asp Glu Asn Gly Lys Lys Lys Gln Ser Tyr145 150
155 160Phe His Asp Phe Phe Asn Tyr Ala Gly Ile His
Arg Ser Val Met Leu 165 170
175Tyr Thr Thr Pro Asn Thr Trp Val Asp Asp Ile Thr Val Val Thr His
180 185 190Val Ala Gln Asp Cys Asn
His Ala Ser Val Asp Trp Gln Val Val Ala 195 200
205Asn Gly Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln
Val Val 210 215 220Ala Thr Gly Gln Gly
Thr Ser Gly Thr Leu Gln Val Val Asn Pro His225 230
235 240Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr
Glu Leu Cys Val Thr Ala 245 250
255Lys Ser Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg
260 265 270Ser Val Ala Val Lys
Gly Glu Gln Phe Leu Ile Asn His Lys Pro Phe 275
280 285Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp
Leu Arg Gly Lys 290 295 300Gly Phe Asp
Asn Val Leu Met Val His Asp His Ala Leu Met Asp Trp305
310 315 320Ile Gly Ala Asn Ser Tyr Arg
Thr Ser His Tyr Pro Tyr Ala Glu Glu 325
330 335Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val
Ile Asp Glu Thr 340 345 350Ala
Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly Phe Glu Ala Gly 355
360 365Asn Lys Pro Lys Glu Leu Tyr Ser Glu
Glu Ala Val Asn Gly Glu Thr 370 375
380Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu Ile Ala Arg Asp Lys385
390 395 400Asn His Pro Ser
Val Val Met Trp Ser Ile Ala Asn Glu Pro Asp Thr 405
410 415Arg Pro Gln Gly Ala Arg Glu Tyr Phe Ala
Pro Leu Ala Glu Ala Thr 420 425
430Arg Lys Leu Asp Pro Thr Arg Pro Ile Thr Cys Val Asn Val Met Phe
435 440 445Cys Asp Ala His Thr Asp Thr
Ile Ser Asp Leu Phe Asp Val Leu Cys 450 455
460Leu Asn Arg Tyr Tyr Gly Trp Tyr Val Gln Ser Gly Asp Leu Glu
Thr465 470 475 480Ala Glu
Lys Val Leu Glu Lys Glu Leu Leu Ala Trp Gln Glu Lys Leu
485 490 495His Gln Pro Ile Ile Ile Thr
Glu Tyr Gly Val Asp Thr Leu Ala Gly 500 505
510Leu His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu Tyr Gln
Cys Ala 515 520 525Trp Leu Asp Met
Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val 530
535 540Gly Glu Gln Val Trp Asn Phe Ala Asp Phe Ala Thr
Ser Gln Gly Ile545 550 555
560Leu Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp Arg Lys
565 570 575Pro Lys Ser Ala Ala
Phe Leu Leu Gln Lys Arg Trp Thr Gly Met Asn 580
585 590Phe Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln
595 600
107 277 DNA Artificial Sequence Nopaline Synthase Terminator Sequence
107 gagctcgaat ttccccgatc gttcaaacat ttggcaataa
agtttcttaa gattgaatcc 60tgttgccggt cttgcgatga ttatcatata atttctgttg
aattacgtta agcatgtaat 120aattaacatg taatgcatga cgttatttat gagatgggtt
tttatgatta gagtcccgca 180attatacatt taatacgcga tagaaaacaa aatatagcgc
gcaaactagg ataaattatc 240gcgcgcggtg tcatctatgt tactagatcg ggaattc
277
108 3451 DNA Artificial Sequence HindIII Fragment containing the beta-glucuronidase coding sequence, the rDNA intergenic spacer, and the Mast1 sequence
108 aagcttgacc tggaatatcg
cgagtaaact gaaaatcacg gaaaatgaga aatacacact 60ttaggacgtg aaatatggcg
aggaaaactg aaaaaggtgg aaaatttaga aatgtccact 120gtaggacgtg gaatatggca
agaaaactga aaatcatgga aaatgagaaa catccacttg 180acgacttgaa aaatgacgaa
atcactaaaa aacgtgaaaa atgagaaatg cacactgaag 240gactccgcgg gaattcgatt
gtgctagcca atgtttaaca agatgtcaag cacaatgaat 300gttggtggtt ggtggtcgtg
gctggcggtg gtggaaaatt gcggtggttc gagcggtagt 360gatcggcgat ggttggtgtt
tgcagcggtg tttgatatcg gaatcactta tggtggttgt 420cacaatggag gtgcgtcatg
gttattggtg gttggtcatc tatatatttt tataataata 480ttaagtattt tacctatttt
ttacatattt tttattaaat ttatgcattg tttgtatttt 540taaatagttt ttatcgtact
tgttttataa aatattttat tattttatgt gttatattat 600tacttgatgt attggaaatt
ttctccattg ttttttctat atttataata attttcttat 660ttttttttgt tttattatgt
attttttcgt tttataataa atatttatta aaaaaaatat 720tatttttgta aaatatatca
tttacaatgt ttaaaagtca tttgtgaata tattagctaa 780gttgtacttc tttttgtgca
tttggtgttg tacatgtcta ttatgattct ctggccaaaa 840catgtctact cctgtcactt
gggttttttt ttttaagaca taatcactag tgattatatc 900tagactgaag gcgggaaacg
acaatctgat catgagcgga gaattaaggg agtcacgtta 960tgacccccgc cgatgacgcg
ggacaagccg ttttacgttt ggaactgaca gaaccgcaac 1020gttgaaggag ccactcagcc
gcgggtttct ggagtttaat gagctaagca catacgtcag 1080aaaccattat tgcgcgttca
aaagtcgcct aaggtcacta tcagctagca aatatttctt 1140gtcaaaaatg ctccactgac
gttccataaa ttcccctcgg tatccaatta gagtctcata 1200ttcactctca atccaaataa
tctgcaccgg atctcgagat cgaattcccg cggccgcgaa 1260ttcactagtg gatccccggg
tacggtcagt cccttatgtt acgtcctgta gaaaccccaa 1320cccgtgaaat caaaaaactc
gacggcctgt gggcattcag tctggatcgc gaaaactgtg 1380gaattgagca gcgttggtgg
gaaagcgcgt tacaagaaag ccgggcaatt gctgtgccag 1440gcagttttaa cgatcagttc
gccgatgcag atattcgtaa ttatgtgggc aacgtctggt 1500atcagcgcga agtctttata
ccgaaaggtt gggcaggcca gcgtatcgtg ctgcgtttcg 1560atgcggtcac tcattacggc
aaagtgtggg tcaataatca ggaagtgatg gagcatcagg 1620gcggctatac gccatttgaa
gccgatgtca cgccgtatgt tattgccggg aaaagtgtac 1680gtatcacagt ttgtgtgaac
aacgaactga actggcagac tatcccgccg ggaatggtga 1740ttaccgacga aaacggcaag
aaaaagcagt cttacttcca tgatttcttt aactacgccg 1800ggatccatcg cagcgtaatg
ctctacacca cgccgaacac ctgggtggac gatatcaccg 1860tggtgacgca tgtcgcgcaa
gactgtaacc acgcgtctgt tgactggcag gtggtggcca 1920atggtgatgt cagcgttgaa
ctgcgtgatg cggatcaaca ggtggttgca actggacaag 1980gcaccagcgg gactttgcaa
gtggtgaatc cgcacctctg gcaaccgggt gaaggttatc 2040tctatgaact gtacgtcaca
gccaaaagcc agacagagtg tgatatctac ccgctgcgcg 2100tcggcatccg gtcagtggca
gtgaagggcg aacagttcct gatcaaccac aaaccgttct 2160actttactgg ctttggccgt
catgaagatg cggatttgcg cggcaaagga ttcgataacg 2220tgctgatggt gcacgatcac
gcattaatgg actggattgg ggccaactcc taccgtacct 2280cgcattaccc ttacgctgaa
gagatgctcg actgggcaga tgaacatggc atcgtggtga 2340ttgatgaaac tgcagctgtc
ggctttaacc tctctttagg cattggtttc gaagcgggca 2400acaagccgaa agaactgtac
agcgaagagg cagtcaacgg ggaaactcag caggcgcact 2460tacaggcgat taaagagctg
atagcgcgtg acaaaaacca cccaagcgtg gtgatgtgga 2520gtattgccaa cgaaccggat
acccgtccgc aaggtgcacg ggaatatttc gcgccactgg 2580cggaagcaac gcgtaaactc
gatccgacgc gtccgatcac ctgcgtcaat gtaatgttct 2640gcgacgctca caccgatacc
atcagcgatc tctttgatgt gctgtgcctg aaccgttatt 2700acggttggta tgtccaaagc
ggcgatttgg aaacggcaga gaaggtactg gaaaaagaac 2760ttctggcctg gcaggagaaa
ctgcatcagc cgattatcat caccgaatac ggcgtggata 2820cgttagccgg gctgcactca
atgtacaccg acatgtggag tgaagagtat cagtgtgcat 2880ggctggatat gtatcaccgc
gtctttgatc gcgtcagcgc cgtcgtcggt gaacaggtat 2940ggaatttcgc cgattttgcg
acctcgcaag gcatattgcg cgttggcggt aacaagaagg 3000ggatcttcac ccgcgaccgc
aaaccgaagt cggcggcttt tctgctgcaa aaacgctgga 3060ctggcatgaa cttcggtgaa
aaaccgcagc agggaggcaa acaatgaatc aacaactctc 3120ctggcgcacc atcgtcggct
acagcctcgg gaattgcgta ccgagctcga atttccccga 3180tcgttcaaac atttggcaat
aaagtttctt aagattgaat cctgttgccg gtcttgcgat 3240gattatcata taatttctgt
tgaattacgt taagcatgta ataattaaca tgtaatgcat 3300gacgttattt atgagatggg
tttttatgat tagagtcccg caattataca tttaatacgc 3360gatagaaaac aaaatatagc
gcgcaaacta ggataaatta tcgcgcgcgg tgtcatctat 3420gttactagat cgggaattcg
atatcaagct t 3451
109 14627 DNA Artificial Sequence pAg11a Plasmid
109 catgccaacc acagggttcc cctcgggatc aaagtacttt
gatccaaccc ctccgctgct 60atagtgcagt cggcttctga cgttcagtgc agccgtcttc
tgaaaacgac atgtcgcaca 120agtcctaagt tacgcgacag gctgccgccc tgcccttttc
ctggcgtttt cttgtcgcgt 180gttttagtcg cataaagtag aatacttgcg actagaaccg
gagacattac gccatgaaca 240agagcgccgc cgctggcctg ctgggctatg cccgcgtcag
caccgacgac caggacttga 300ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa
gctgttttcc gagaagatca 360ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct
tgaccaccta cgccctggcg 420acgttgtgac agtgaccagg ctagaccgcc tggcccgcag
cacccgcgac ctactggaca 480ttgccgagcg catccaggag gccggcgcgg gcctgcgtag
cctggcagag ccgtgggccg 540acaccaccac gccggccggc cgcatggtgt tgaccgtgtt
cgccggcatt gccgagttcg 600agcgttccct aatcatcgac cgcacccgga gcgggcgcga
ggccgccaag gcccgaggcg 660tgaagtttgg cccccgccct accctcaccc cggcacagat
cgcgcacgcc cgcgagctga 720tcgaccagga aggccgcacc gtgaaagagg cggctgcact
gcttggcgtg catcgctcga 780ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc
caccgaggcc aggcggcgcg 840gtgccttccg tgaggacgca ttgaccgagg ccgacgccct
ggcggccgcc gagaatgaac 900gccaagagga acaagcatga aaccgcacca ggacggccag
gacgaaccgt ttttcattac 960cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg
ttcgagccgc ccgcgcacgt 1020ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct
gatgccaagc tggcggcctg 1080gccggccagc ttggccgctg aagaaaccga gcgccgccgt
ctaaaaaggt gatgtgtatt 1140tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg
atgcgatgag taaataaaca 1200aatacgcaag gggaacgcat gaaggttatc gctgtactta
accagaaagg cgggtcaggc 1260aagacgacca tcgcaaccca tctagcccgc gccctgcaac
tcgccggggc cgatgttctg 1320ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg
cggccgtgcg ggaagatcaa 1380ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc
gcgacgtgaa ggccatcggc 1440cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg
cggacttggc tgtgtccgcg 1500atcaaggcag ccgacttcgt gctgattccg gtgcagccaa
gcccttacga catatgggcc 1560accgccgacc tggtggagct ggttaagcag cgcattgagg
tcacggatgg aaggctacaa 1620gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca
tcggcggtga ggttgccgag 1680gcgctggccg ggtacgagct gcccattctt gagtcccgta
tcacgcagcg cgtgagctac 1740ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag
aacccgaggg cgacgctgcc 1800cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac
tcatttgagt taatgaggta 1860aagagaaaat gagcaaaagc acaaacacgc taagtgccgg
ccgtccgagc gcacgcagca 1920gcaaggctgc aacgttggcc agcctggcag acacgccagc
catgaagcgg gtcaactttc 1980agttgccggc ggaggatcac accaagctga agatgtacgc
ggtacgccaa ggcaagacca 2040ttaccgagct gctatctgaa tacatcgcgc agctaccaga
gtaaatgagc aaatgaataa 2100atgagtagat gaattttagc ggctaaagga ggcggcatgg
aaaatcaaga acaaccaggc 2160accgacgccg tggaatgccc catgtgtgga ggaacgggcg
gttggccagg cgtaagcggc 2220tgggttgtct gccggccctg caatggcact ggaaccccca
agcccgagga atcggcgtga 2280cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg
ctgggtgatg acctggtgga 2340gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc
gaggcagaag cacgccccgg 2400tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa
tcccggcaac cgccggcagc 2460cggtgcgccg tcgattagga agccgcccaa gggcgacgag
caaccagatt ttttcgttcc 2520gatgctctat gacgtgggca cccgcgatag tcgcagcatc
atggacgtgg ccgttttccg 2580tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc
tacgagcttc cagacgggca 2640cgtagaggtt tccgcagggc cggccggcat ggccagtgtg
tgggattacg acctggtact 2700gatggcggtt tcccatctaa ccgaatccat gaaccgatac
cgggaaggga agggagacaa 2760gcccggccgc gtgttccgtc cacacgttgc ggacgtactc
aagttctgcc ggcgagccga 2820tggcggaaag cagaaagacg acctggtaga aacctgcatt
cggttaaaca ccacgcacgt 2880tgccatgcag cgtacgaaga aggccaagaa cggccgcctg
gtgacggtat ccgagggtga 2940agccttgatt agccgctaca agatcgtaaa gagcgaaacc
gggcggccgg agtacatcga 3000gatcgagcta gctgattgga tgtaccgcga gatcacagaa
ggcaagaacc cggacgtgct 3060gacggttcac cccgattact ttttgatcga tcccggcatc
ggccgttttc tctaccgcct 3120ggcacgccgc gccgcaggca aggcagaagc cagatggttg
ttcaagacga tctacgaacg 3180cagtggcagc gccggagagt tcaagaagtt ctgtttcacc
gtgcgcaagc tgatcgggtc 3240aaatgacctg ccggagtacg atttgaagga ggaggcgggg
caggctggcc cgatcctagt 3300catgcgctac cgcaacctga tcgagggcga agcatccgcc
ggttcctaat gtacggagca 3360gatgctaggg caaattgccc tagcagggga aaaaggtcga
aaaggtctct ttcctgtgga 3420tagcacgtac attgggaacc caaagccgta cattgggaac
cggaacccgt acattgggaa 3480cccaaagccg tacattggga accggtcaca catgtaagtg
actgatataa aagagaaaaa 3540aggcgatttt tccgcctaaa actctttaaa acttattaaa
actcttaaaa cccgcctggc 3600ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg
caaaaagcgc ctacccttcg 3660gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct
atcgcggccg ctggccgctc 3720aaaaatggct ggcctacggc caggcaatct accagggcgc
ggacaagccg cgccgtcgcc 3780actcgaccgc cggcgcccac atcaaggcac cctgcctcgc
gcgtttcggt gatgacggtg 3840aaaacctctg acacatgcag ctcccggaga cggtcacagc
ttgtctgtaa gcggatgccg 3900ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg
cgggtgtcgg ggcgcagcca 3960tgacccagtc acgtagcgat agcggagtgt atactggctt
aactatgcgg catcagagca 4020gattgtactg agagtgcacc atatgcggtg tgaaataccg
cacagatgcg taaggagaaa 4080ataccgcatc aggcgctctt ccgcttcctc gctcactgac
tcgctgcgct cggtcgttcg 4140gctgcggcga gcggtatcag ctcactcaaa ggcggtaata
cggttatcca cagaatcagg 4200ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa
aaggccagga accgtaaaaa 4260ggccgcgttg ctggcgtttt tccataggct ccgcccccct
gacgagcatc acaaaaatcg 4320acgctcaagt cagaggtggc gaaacccgac aggactataa
agataccagg cgtttccccc 4380tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg
cttaccggat acctgtccgc 4440ctttctccct tcgggaagcg tggcgctttc tcatagctca
cgctgtaggt atctcagttc 4500ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa
ccccccgttc agcccgaccg 4560ctgcgcctta tccggtaact atcgtcttga gtccaacccg
gtaagacacg acttatcgcc 4620actggcagca gccactggta acaggattag cagagcgagg
tatgtaggcg gtgctacaga 4680gttcttgaag tggtggccta actacggcta cactagaagg
acagtatttg gtatctgcgc 4740tctgctgaag ccagttacct tcggaaaaag agttggtagc
tcttgatccg gcaaacaaac 4800caccgctggt agcggtggtt tttttgtttg caagcagcag
attacgcgca gaaaaaaagg 4860atctcaagaa gatcctttga tcttttctac ggggtctgac
gctcagtgga acgaaaactc 4920acgttaaggg attttggtca tgcattctag gtactaaaac
aattcatcca gtaaaatata 4980atattttatt ttctcccaat caggcttgat ccccagtaag
tcaaaaaata gctcgacata 5040ctgttcttcc ccgatatcct ccctgatcga ccggacgcag
aaggcaatgt cataccactt 5100gtccgccctg ccgcttctcc caagatcaat aaagccactt
actttgccat ctttcacaaa 5160gatgttgctg tctcccaggt cgccgtggga aaagacaagt
tcctcttcgg gcttttccgt 5220ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga
gtgtcttctt cccagttttc 5280gcaatccaca tcggccagat cgttattcag taagtaatcc
aattcggcta agcggctgtc 5340taagctattc gtatagggac aatccgatat gtcgatggag
tgaaagagcc tgatgcactc 5400cgcatacagc tcgataatct tttcagggct ttgttcatct
tcatactctt ccgagcaaag 5460gacgccatcg gcctcactca tgagcagatt gctccagcca
tcatgccgtt caaagtgcag 5520gacctttgga acaggcagct ttccttccag ccatagcatc
atgtcctttt cccgttccac 5580atcataggtg gtccctttat accggctgtc cgtcattttt
aaatataggt tttcattttc 5640tcccaccagc ttatatacct tagcaggaga cattccttcc
gtatctttta cgcagcggta 5700tttttcgatc agttttttca attccggtga tattctcatt
ttagccattt attatttcct 5760tcctcttttc tacagtattt aaagataccc caagaagcta
attataacaa gacgaactcc 5820aattcactgt tccttgcatt ctaaaacctt aaataccaga
aaacagcttt ttcaaagttg 5880ttttcaaagt tggcgtataa catagtatcg acggagccga
ttttgaaacc gcggtgatca 5940caggcagcaa cgctctgtca tcgttacaat caacatgcta
ccctccgcga gatcatccgt 6000gtttcaaacc cggcagctta gttgccgttc ttccgaatag
catcggtaac atgagcaaag 6060tctgccgcct tacaacggct ctcccgctga cgccgtcccg
gactgatggg ctgcctgtat 6120cgagtggtga ttttgtgccg agctgccggt cggggagctg
ttggctggct ggtggcagga 6180tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa
taacacattg cggacgtttt 6240taatgtactg aattaacgcc gaattaattc gggggatctg
gattttagta ctggattttg 6300gttttaggaa ttagaaattt tattgataga agtattttac
aaatacaaat acatactaag 6360ggtttcttat atgctcaaca catgagcgaa accctatagg
aaccctaatt cccttatctg 6420ggaactactc acacattatt atggagaaac tcgagtcaaa
tctcggtgac gggcaggacc 6480ggacggggcg gtaccggcag gctgaagtcc agctgccaga
aacccacgtc atgccagttc 6540ccgtgcttga agccggccgc ccgcagcatg ccgcgggggg
catatccgag cgcctcgtgc 6600atgcgcacgc tcgggtcgtt gggcagcccg atgacagcga
ccacgctctt gaagccctgt 6660gcctccaggg acttcagcag gtgggtgtag agcgtggagc
ccagtcccgt ccgctggtgg 6720cggggggaga cgtacacggt cgactcggcc gtccagtcgt
aggcgttgcg tgccttccag 6780gggcccgcgt aggcgatgcc ggcgacctcg ccgtccacct
cggcgacgag ccagggatag 6840cgctcccgca gacggacgag gtcgtccgtc cactcctgcg
gttcctgcgg ctcggtacgg 6900aagttgaccg tgcttgtctc gatgtagtgg ttgacgatgg
tgcagaccgc cggcatgtcc 6960gcctcggtgg cacggcggat gtcggccggg cgtcgttctg
ggctcatggt agactcgaga 7020gagatagatt tgtagagaga gactggtgat ttcagcgtgt
cctctccaaa tgaaatgaac 7080ttccttatat agaggaaggt cttgcgaagg atagtgggat
tgtgcgtcat cccttacgtc 7140agtggagata tcacatcaat ccacttgctt tgaagacgtg
gttggaacgt cttctttttc 7200cacgatgctc ctcgtgggtg ggggtccatc tttgggacca
ctgtcggcag aggcatcttg 7260aacgatagcc tttcctttat cgcaatgatg gcatttgtag
gtgccacctt ccttttctac 7320tgtccttttg atgaagtgac agatagctgg gcaatggaat
ccgaggaggt ttcccgatat 7380taccctttgt tgaaaagtct caatagccct ttggtcttct
gagactgtat ctttgatatt 7440cttggagtag acgagagtgt cgtgctccac catgttatca
catcaatcca cttgctttga 7500agacgtggtt ggaacgtctt ctttttccac gatgctcctc
gtgggtgggg gtccatcttt 7560gggaccactg tcggcagagg catcttgaac gatagccttt
cctttatcgc aatgatggca 7620tttgtaggtg ccaccttcct tttctactgt ccttttgatg
aagtgacaga tagctgggca 7680atggaatccg aggaggtttc ccgatattac cctttgttga
aaagtctcaa tagccctttg 7740gtcttctgag actgtatctt tgatattctt ggagtagacg
agagtgtcgt gctccaccat 7800gttggcaagc tgctctagcc aatacgcaaa ccgcctctcc
ccgcgcgttg gccgattcat 7860taatgcagct ggcacgacag gtttcccgac tggaaagcgg
gcagtgagcg caacgcaatt 7920aatgtgagtt agctcactca ttaggcaccc caggctttac
actttatgct tccggctcgt 7980atgttgtgtg gaattgtgag cggataacaa tttcacacag
gaaacagcta tgaccatgat 8040tacgaattcg agccttgact agagggtcga cggtatacag
acatgataag atacattgat 8100gagtttggac aaaccacaac tagaatgcag tgaaaaaaat
gctttatttg tgaaatttgt 8160gatgctattg ctttatttgt aaccattata agctgcaata
aacaagttgg ggtgggcgaa 8220gaactccagc atgagatccc cgcgctggag gatcatccag
ccggcgtccc ggaaaacgat 8280tccgaagccc aacctttcat agaaggcggc ggtggaatcg
aaatctcgta gcacgtgtca 8340gtcctgctcc tcggccacga agtgcacgca gttgccggcc
gggtcgcgca gggcgaactc 8400ccgcccccac ggctgctcgc cgatctcggt catggccggc
ccggaggcgt cccggaagtt 8460cgtggacacg acctccgacc actcggcgta cagctcgtcc
aggccgcgca cccacaccca 8520ggccagggtg ttgtccggca ccacctggtc ctggaccgcg
ctgatgaaca gggtcacgtc 8580gtcccggacc acaccggcga agtcgtcctc cacgaagtcc
cgggagaacc cgagccggtc 8640ggtccagaac tcgaccgctc cggcgacgtc gcgcgcggtg
agcaccggaa cggcactggt 8700caacttggcc atggatccag atttcgctca agttagtata
aaaaagcagg cttcaatcct 8760gcaggaattc gatcgacact ctcgtctact ccaagaatat
caaagataca gtctcagaag 8820accaaagggc tattgagact tttcaacaaa gggtaatatc
gggaaacctc ctcggattcc 8880attgcccagc tatctgtcac ttcatcaaaa ggacagtaga
aaaggaaggt ggcacctaca 8940aatgccatca ttgcgataaa ggaaaggcta tcgttcaaga
tgcctctgcc gacagtggtc 9000ccaaagatgg acccccaccc acgaggagca tcgtggaaaa
agaagacgtt ccaaccacgt 9060cttcaaagca agtggattga tgtgataaca tggtggagca
cgacactctc gtctactcca 9120agaatatcaa agatacagtc tcagaagacc aaagggctat
tgagactttt caacaaaggg 9180taatatcggg aaacctcctc ggattccatt gcccagctat
ctgtcacttc atcaaaagga 9240cagtagaaaa ggaaggtggc acctacaaat gccatcattg
cgataaagga aaggctatcg 9300ttcaagatgc ctctgccgac agtggtccca aagatggacc
cccacccacg aggagcatcg 9360tggaaaaaga agacgttcca accacgtctt caaagcaagt
ggattgatgt gatatctcca 9420ctgacgtaag ggatgacgca caatcccact atccttcgca
agaccttcct ctatataagg 9480aagttcattt catttggaga ggacacgctg aaatcaccag
tctctctcta caaatctatc 9540tctctcgagc tttcgcagat ccgggggggc aatgagatat
gaaaaagcct gaactcaccg 9600cgacgtctgt cgagaagttt ctgatcgaaa agttcgacag
cgtctccgac ctgatgcagc 9660tctcggaggg cgaagaatct cgtgctttca gcttcgatgt
aggagggcgt ggatatgtcc 9720tgcgggtaaa tagctgcgcc gatggtttct acaaagatcg
ttatgtttat cggcactttg 9780catcggccgc gctcccgatt ccggaagtgc ttgacattgg
ggagtttagc gagagcctga 9840cctattgcat ctcccgccgt gcacagggtg tcacgttgca
agacctgcct gaaaccgaac 9900tgcccgctgt tctacaaccg gtcgcggagg ctatggatgc
gatcgctgcg gccgatctta 9960gccagacgag cgggttcggc ccattcggac cgcaaggaat
cggtcaatac actacatggc 10020gtgatttcat atgcgcgatt gctgatcccc atgtgtatca
ctggcaaact gtgatggacg 10080acaccgtcag tgcgtccgtc gcgcaggctc tcgatgagct
gatgctttgg gccgaggact 10140gccccgaagt ccggcacctc gtgcacgcgg atttcggctc
caacaatgtc ctgacggaca 10200atggccgcat aacagcggtc attgactgga gcgaggcgat
gttcggggat tcccaatacg 10260aggtcgccaa catcttcttc tggaggccgt ggttggcttg
tatggagcag cagacgcgct 10320acttcgagcg gaggcatccg gagcttgcag gatcgccacg
actccgggcg tatatgctcc 10380gcattggtct tgaccaactc tatcagagct tggttgacgg
caatttcgat gatgcagctt 10440gggcgcaggg tcgatgcgac gcaatcgtcc gatccggagc
cgggactgtc gggcgtacac 10500aaatcgcccg cagaagcgcg gccgtctgga ccgatggctg
tgtagaagta ctcgccgata 10560gtggaaaccg acgccccagc actcgtccga gggcaaagaa
atagagtaga tgccgaccgg 10620atctgtcgat cgacaagctc gagtttctcc ataataatgt
gtgagtagtt cccagataag 10680ggaattaggg ttcctatagg gtttcgctca tgtgttgagc
atataagaaa cccttagtat 10740gtatttgtat ttgtaaaata cttctatcaa taaaatttct
aattcctaaa accaaaatcc 10800agtactaaaa tccagatccc ccgaattaat tcggcgttaa
ttcagatcaa gcttgacctg 10860gaatatcgcg agtaaactga aaatcacgga aaatgagaaa
tacacacttt aggacgtgaa 10920atatggcgag gaaaactgaa aaaggtggaa aatttagaaa
tgtccactgt aggacgtgga 10980atatggcaag aaaactgaaa atcatggaaa atgagaaaca
tccacttgac gacttgaaaa 11040atgacgaaat cactaaaaaa cgtgaaaaat gagaaatgca
cactgaagga ctccgcggga 11100attcgattgt gctagccaat gtttaacaag atgtcaagca
caatgaatgt tggtggttgg 11160tggtcgtggc tggcggtggt ggaaaattgc ggtggttcga
gcggtagtga tcggcgatgg 11220ttggtgtttg cagcggtgtt tgatatcgga atcacttatg
gtggttgtca caatggaggt 11280gcgtcatggt tattggtggt tggtcatcta tatattttta
taataatatt aagtatttta 11340cctatttttt acatattttt tattaaattt atgcattgtt
tgtattttta aatagttttt 11400atcgtacttg ttttataaaa tattttatta ttttatgtgt
tatattatta cttgatgtat 11460tggaaatttt ctccattgtt ttttctatat ttataataat
tttcttattt ttttttgttt 11520tattatgtat tttttcgttt tataataaat atttattaaa
aaaaatatta tttttgtaaa 11580atatatcatt tacaatgttt aaaagtcatt tgtgaatata
ttagctaagt tgtacttctt 11640tttgtgcatt tggtgttgta catgtctatt atgattctct
ggccaaaaca tgtctactcc 11700tgtcacttgg gttttttttt ttaagacata atcactagtg
attatatcta gactgaaggc 11760gggaaacgac aatctgatca tgagcggaga attaagggag
tcacgttatg acccccgccg 11820atgacgcggg acaagccgtt ttacgtttgg aactgacaga
accgcaacgt tgaaggagcc 11880actcagccgc gggtttctgg agtttaatga gctaagcaca
tacgtcagaa accattattg 11940cgcgttcaaa agtcgcctaa ggtcactatc agctagcaaa
tatttcttgt caaaaatgct 12000ccactgacgt tccataaatt cccctcggta tccaattaga
gtctcatatt cactctcaat 12060ccaaataatc tgcaccggat ctcgagatcg aattcccgcg
gccgcgaatt cactagtgga 12120tccccgggta cggtcagtcc cttatgttac gtcctgtaga
aaccccaacc cgtgaaatca 12180aaaaactcga cggcctgtgg gcattcagtc tggatcgcga
aaactgtgga attgagcagc 12240gttggtggga aagcgcgtta caagaaagcc gggcaattgc
tgtgccaggc agttttaacg 12300atcagttcgc cgatgcagat attcgtaatt atgtgggcaa
cgtctggtat cagcgcgaag 12360tctttatacc gaaaggttgg gcaggccagc gtatcgtgct
gcgtttcgat gcggtcactc 12420attacggcaa agtgtgggtc aataatcagg aagtgatgga
gcatcagggc ggctatacgc 12480catttgaagc cgatgtcacg ccgtatgtta ttgccgggaa
aagtgtacgt atcacagttt 12540gtgtgaacaa cgaactgaac tggcagacta tcccgccggg
aatggtgatt accgacgaaa 12600acggcaagaa aaagcagtct tacttccatg atttctttaa
ctacgccggg atccatcgca 12660gcgtaatgct ctacaccacg ccgaacacct gggtggacga
tatcaccgtg gtgacgcatg 12720tcgcgcaaga ctgtaaccac gcgtctgttg actggcaggt
ggtggccaat ggtgatgtca 12780gcgttgaact gcgtgatgcg gatcaacagg tggttgcaac
tggacaaggc accagcggga 12840ctttgcaagt ggtgaatccg cacctctggc aaccgggtga
aggttatctc tatgaactgt 12900acgtcacagc caaaagccag acagagtgtg atatctaccc
gctgcgcgtc ggcatccggt 12960cagtggcagt gaagggcgaa cagttcctga tcaaccacaa
accgttctac tttactggct 13020ttggccgtca tgaagatgcg gatttgcgcg gcaaaggatt
cgataacgtg ctgatggtgc 13080acgatcacgc attaatggac tggattgggg ccaactccta
ccgtacctcg cattaccctt 13140acgctgaaga gatgctcgac tgggcagatg aacatggcat
cgtggtgatt gatgaaactg 13200cagctgtcgg ctttaacctc tctttaggca ttggtttcga
agcgggcaac aagccgaaag 13260aactgtacag cgaagaggca gtcaacgggg aaactcagca
ggcgcactta caggcgatta 13320aagagctgat agcgcgtgac aaaaaccacc caagcgtggt
gatgtggagt attgccaacg 13380aaccggatac ccgtccgcaa ggtgcacggg aatatttcgc
gccactggcg gaagcaacgc 13440gtaaactcga tccgacgcgt ccgatcacct gcgtcaatgt
aatgttctgc gacgctcaca 13500ccgataccat cagcgatctc tttgatgtgc tgtgcctgaa
ccgttattac ggttggtatg 13560tccaaagcgg cgatttggaa acggcagaga aggtactgga
aaaagaactt ctggcctggc 13620aggagaaact gcatcagccg attatcatca ccgaatacgg
cgtggatacg ttagccgggc 13680tgcactcaat gtacaccgac atgtggagtg aagagtatca
gtgtgcatgg ctggatatgt 13740atcaccgcgt ctttgatcgc gtcagcgccg tcgtcggtga
acaggtatgg aatttcgccg 13800attttgcgac ctcgcaaggc atattgcgcg ttggcggtaa
caagaagggg atcttcaccc 13860gcgaccgcaa accgaagtcg gcggcttttc tgctgcaaaa
acgctggact ggcatgaact 13920tcggtgaaaa accgcagcag ggaggcaaac aatgaatcaa
caactctcct ggcgcaccat 13980cgtcggctac agcctcggga attgcgtacc gagctcgaat
ttccccgatc gttcaaacat 14040ttggcaataa agtttcttaa gattgaatcc tgttgccggt
cttgcgatga ttatcatata 14100atttctgttg aattacgtta agcatgtaat aattaacatg
taatgcatga cgttatttat 14160gagatgggtt tttatgatta gagtcccgca attatacatt
taatacgcga tagaaaacaa 14220aatatagcgc gcaaactagg ataaattatc gcgcgcggtg
tcatctatgt tactagatcg 14280ggaattcgat atcaagcttg gcactggccg tcgttttaca
acgtcgtgac tgggaaaacc 14340ctggcgttac ccaacttaat cgccttgcag cacatccccc
tttcgccagc tggcgtaata 14400gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg
cagcctgaat ggcgaatgct 14460agagcagctt gagcttggat cagattgtcg tttcccgcct
tcagtttaaa ctatcagtgt 14520ttgacaggat atattggcgg gtaaacctaa gagaaaagag
cgtttattag aataacggat 14580atttaaaagg gcgtgaaaag gtttatccgt tcgtccattt
gtatgtg 14627
110 9080 DNA Artificial Sequence p18attBZeo(6XHS4)2eGFP Plasmid
110 cagttgccgg ccgggtcgcg
cagggcgaac tcccgccccc acggctgctc gccgatctcg 60gtcatggccg gcccggaggc
gtcccggaag ttcgtggaca cgacctccga ccactcggcg 120tacagctcgt ccaggccgcg
cacccacacc caggccaggg tgttgtccgg caccacctgg 180tcctggaccg cgctgatgaa
cagggtcacg tcgtcccgga ccacaccggc gaagtcgtcc 240tccacgaagt cccgggagaa
cccgagccgg tcggtccaga actcgaccgc tccggcgacg 300tcgcgcgcgg tgagcaccgg
aacggcactg gtcaacttgg ccatggatcc agatttcgct 360caagttagta taaaaaagca
ggcttcaatc ctgcagagaa gcttgatatc gaattcctgc 420agccccgcgg atccgctcac
ggggacagcc cccccccaaa gcccccaggg atgtaattac 480gtccctcccc cgctaggggg
cagcagcgag ccgcccgggg ctccgctccg gtccggcgct 540ccccccgcat ccccgagccg
gcagcgtgcg gggacagccc gggcacgggg aaggtggcac 600gggatcgctt tcctctgaac
gcttctcgct gctctttgag cctgcagaca cctgggggat 660acggggccgc ggatccgctc
acggggacag ccccccccca aagcccccag ggatgtaatt 720acgtccctcc cccgctaggg
ggcagcagcg agccgcccgg ggctccgctc cggtccggcg 780ctccccccgc atccccgagc
cggcagcgtg cggggacagc ccgggcacgg ggaaggtggc 840acgggatcgc tttcctctga
acgcttctcg ctgctctttg agcctgcaga cacctggggg 900atacggggcc gcggatccgc
tcacggggac agcccccccc caaagccccc agggatgtaa 960ttacgtccct cccccgctag
ggggcagcag cgagccgccc ggggctccgc tccggtccgg 1020cgctcccccc gcatccccga
gccggcagcg tgcggggaca gcccgggcac ggggaaggtg 1080gcacgggatc gctttcctct
gaacgcttct cgctgctctt tgagcctgca gacacctggg 1140ggatacgggg ccgcggatcc
gctcacgggg acagcccccc cccaaagccc ccagggatgt 1200aattacgtcc ctcccccgct
agggggcagc agcgagccgc ccggggctcc gctccggtcc 1260ggcgctcccc ccgcatcccc
gagccggcag cgtgcgggga cagcccgggc acggggaagg 1320tggcacggga tcgctttcct
ctgaacgctt ctcgctgctc tttgagcctg cagacacctg 1380ggggatacgg ggccgcggat
ccgctcacgg ggacagcccc cccccaaagc ccccagggat 1440gtaattacgt ccctcccccg
ctagggggca gcagcgagcc gcccggggct ccgctccggt 1500ccggcgctcc ccccgcatcc
ccgagccggc agcgtgcggg gacagcccgg gcacggggaa 1560ggtggcacgg gatcgctttc
ctctgaacgc ttctcgctgc tctttgagcc tgcagacacc 1620tgggggatac ggggccgcgg
atccgctcac ggggacagcc cccccccaaa gcccccaggg 1680atgtaattac gtccctcccc
cgctaggggg cagcagcgag ccgcccgggg ctccgctccg 1740gtccggcgct ccccccgcat
ccccgagccg gcagcgtgcg gggacagccc gggcacgggg 1800aaggtggcac gggatcgctt
tcctctgaac gcttctcgct gctctttgag cctgcagaca 1860cctgggggat acggggcggg
ggatccacta gttattaata gtaatcaatt acggggtcat 1920tagttcatag cccatatatg
gagttccgcg ttacataact tacggtaaat ggcccgcctg 1980gctgaccgcc caacgacccc
cgcccattga cgtcaataat gacgtatgtt cccatagtaa 2040cgccaatagg gactttccat
tgacgtcaat gggtggacta tttacggtaa actgcccact 2100tggcagtaca tcaagtgtat
catatgccaa gtacgccccc tattgacgtc aatgacggta 2160aatggcccgc ctggcattat
gcccagtaca tgaccttatg ggactttcct acttggcagt 2220acatctacgt attagtcatc
gctattacca tgggtcgagg tgagccccac gttctgcttc 2280actctcccca tctccccccc
ctccccaccc ccaattttgt atttatttat tttttaatta 2340ttttgtgcag cgatgggggc
gggggggggg ggggcgcgcg ccaggcgggg cggggcgggg 2400cgaggggcgg ggcggggcga
ggcggagagg tgcggcggca gccaatcaga gcggcgcgct 2460ccgaaagttt ccttttatgg
cgaggcggcg gcggcggcgg ccctataaaa agcgaagcgc 2520gcggcgggcg ggagtcgctg
cgttgccttc gccccgtgcc ccgctccgcg ccgcctcgcg 2580ccgcccgccc cggctctgac
tgaccgcgtt actcccacag gtgagcgggc gggacggccc 2640ttctcctccg ggctgtaatt
agcgcttggt ttaatgacgg ctcgtttctt ttctgtggct 2700gcgtgaaagc cttaaagggc
tccgggaggg ccctttgtgc gggggggagc ggctcggggg 2760gtgcgtgcgt gtgtgtgtgc
gtggggagcg ccgcgtgcgg cccgcgctgc ccggcggctg 2820tgagcgctgc gggcgcggcg
cggggctttg tgcgctccgc gtgtgcgcga ggggagcgcg 2880gccgggggcg gtgccccgcg
gtgcgggggg gctgcgaggg gaacaaaggc tgcgtgcggg 2940gtgtgtgcgt gggggggtga
gcagggggtg tgggcgcggc ggtcgggctg taaccccccc 3000ctgcaccccc ctccccgagt
tgctgagcac ggcccggctt cgggtgcggg gctccgtgcg 3060gggcgtggcg cggggctcgc
cgtgccgggc ggggggtggc ggcaggtggg ggtgccgggc 3120ggggcggggc cgcctcgggc
cggggagggc tcgggggagg ggcgcggcgg ccccggagcg 3180ccggcggctg tcgaggcgcg
gcgagccgca gccattgcct tttatggtaa tcgtgcgaga 3240gggcgcaggg acttcctttg
tcccaaatct ggcggagccg aaatctggga ggcgccgccg 3300caccccctct agcgggcgcg
ggcgaagcgg tgcggcgccg gcaggaagga aatgggcggg 3360gagggccttc gtgcgtcgcc
gcgccgccgt ccccttctcc atctccagcc tcggggctgc 3420cgcaggggga cggctgcctt
cgggggggac ggggcagggc ggggttcggc ttctggcgtg 3480tgaccggcgg ctctagagcc
tctgctaacc atgttcatgc cttcttcttt ttcctacagc 3540tcctgggcaa cgtgctggtt
gttgtgctgt ctcatcattt tggcaaagaa ttcgccacca 3600tggtgagcaa gggcgaggag
ctgttcaccg gggtggtgcc catcctggtc gagctggacg 3660gcgacgtaaa cggccacaag
ttcagcgtgt ccggcgaggg cgagggcgat gccacctacg 3720gcaagctgac cctgaagttc
atctgcacca ccggcaagct gcccgtgccc tggcccaccc 3780tcgtgaccac cctgacctac
ggcgtgcagt gcttcagccg ctaccccgac cacatgaagc 3840agcacgactt cttcaagtcc
gccatgcccg aaggctacgt ccaggagcgc accatcttct 3900tcaaggacga cggcaactac
aagacccgcg ccgaggtgaa gttcgagggc gacaccctgg 3960tgaaccgcat cgagctgaag
ggcatcgact tcaaggagga cggcaacatc ctggggcaca 4020agctggagta caactacaac
agccacaacg tctatatcat ggccgacaag cagaagaacg 4080gcatcaaggt gaacttcaag
atccgccaca acatcgagga cggcagcgtg cagctcgccg 4140accactacca gcagaacacc
cccatcggcg acggccccgt gctgctgccc gacaaccact 4200acctgagcac ccagtccgcc
ctgagcaaag accccaacga gaagcgcgat cacatggtcc 4260tgctggagtt cgtgaccgcc
gccgggatca ctctcggcat ggacgagctg tacaagtaag 4320aattcactcc tcaggtgcag
gctgcctatc agaaggtggt ggctggtgtg gccaatgccc 4380tggctcacaa ataccactga
gatctttttc cctctgccaa aaattatggg gacatcatga 4440agccccttga gcatctgact
tctggctaat aaaggaaatt tattttcatt gcaatagtgt 4500gttggaattt tttgtgtctc
tcactcggaa ggacatatgg gagggcaaat catttaaaac 4560atcagaatga gtatttggtt
tagagtttgg caacatatgc catatgctgg ctgccatgaa 4620caaaggtggc tataaagagg
tcatcagtat atgaaacagc cccctgctgt ccattcctta 4680ttccatagaa aagccttgac
ttgaggttag atttttttta tattttgttt tgtgttattt 4740ttttctttaa catccctaaa
attttcctta catgttttac tagccagatt tttcctcctc 4800tcctgactac tcccagtcat
agctgtccct cttctcttat gaagatccct cgacctgcag 4860cccaagcttg catgcctgca
ggtcgactct agtggatccc ccgccccgta tcccccaggt 4920gtctgcaggc tcaaagagca
gcgagaagcg ttcagaggaa agcgatcccg tgccaccttc 4980cccgtgcccg ggctgtcccc
gcacgctgcc ggctcgggga tgcgggggga gcgccggacc 5040ggagcggagc cccgggcggc
tcgctgctgc cccctagcgg gggagggacg taattacatc 5100cctgggggct ttgggggggg
gctgtccccg tgagcggatc cgcggccccg tatcccccag 5160gtgtctgcag gctcaaagag
cagcgagaag cgttcagagg aaagcgatcc cgtgccacct 5220tccccgtgcc cgggctgtcc
ccgcacgctg ccggctcggg gatgcggggg gagcgccgga 5280ccggagcgga gccccgggcg
gctcgctgct gccccctagc gggggaggga cgtaattaca 5340tccctggggg ctttgggggg
gggctgtccc cgtgagcgga tccgcggccc cgtatccccc 5400aggtgtctgc aggctcaaag
agcagcgaga agcgttcaga ggaaagcgat cccgtgccac 5460cttccccgtg cccgggctgt
ccccgcacgc tgccggctcg gggatgcggg gggagcgccg 5520gaccggagcg gagccccggg
cggctcgctg ctgcccccta gcgggggagg gacgtaatta 5580catccctggg ggctttgggg
gggggctgtc cccgtgagcg gatccgcggc cccgtatccc 5640ccaggtgtct gcaggctcaa
agagcagcga gaagcgttca gaggaaagcg atcccgtgcc 5700accttccccg tgcccgggct
gtccccgcac gctgccggct cggggatgcg gggggagcgc 5760cggaccggag cggagccccg
ggcggctcgc tgctgccccc tagcggggga gggacgtaat 5820tacatccctg ggggctttgg
gggggggctg tccccgtgag cggatccgcg gccccgtatc 5880ccccaggtgt ctgcaggctc
aaagagcagc gagaagcgtt cagaggaaag cgatcccgtg 5940ccaccttccc cgtgcccggg
ctgtccccgc acgctgccgg ctcggggatg cggggggagc 6000gccggaccgg agcggagccc
cgggcggctc gctgctgccc cctagcgggg gagggacgta 6060attacatccc tgggggcttt
gggggggggc tgtccccgtg agcggatccg cggccccgta 6120tcccccaggt gtctgcaggc
tcaaagagca gcgagaagcg ttcagaggaa agcgatcccg 6180tgccaccttc cccgtgcccg
ggctgtcccc gcacgctgcc ggctcgggga tgcgggggga 6240gcgccggacc ggagcggagc
cccgggcggc tcgctgctgc cccctagcgg gggagggacg 6300taattacatc cctgggggct
ttgggggggg gctgtccccg tgagcggatc cgcggggctg 6360caggaattcg taatcatggt
catagctgtt tcctgtgtga aattgttatc cgctcacaat 6420tccacacaac atacgagccg
gaagcataaa gtgtaaagcc tggggtgcct aatgagtgag 6480ctaactcaca ttaattgcgt
tgcgctcact gcccgctttc cagtcgggaa acctgtcgtg 6540ccagctgcat taatgaatcg
gccaacgcgc ggggagaggc ggtttgcgta ttgggcgctc 6600ttccgcttcc tcgctcactg
actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc 6660agctcactca aaggcggtaa
tacggttatc cacagaatca ggggataacg caggaaagaa 6720catgtgagca aaaggccagc
aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt 6780tttccatagg ctccgccccc
ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg 6840gcgaaacccg acaggactat
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg 6900ctctcctgtt ccgaccctgc
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag 6960cgtggcgctt tctcatagct
cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc 7020caagctgggc tgtgtgcacg
aaccccccgt tcagcccgac cgctgcgcct tatccggtaa 7080ctatcgtctt gagtccaacc
cggtaagaca cgacttatcg ccactggcag cagccactgg 7140taacaggatt agcagagcga
ggtatgtagg cggtgctaca gagttcttga agtggtggcc 7200taactacggc tacactagaa
ggacagtatt tggtatctgc gctctgctga agccagttac 7260cttcggaaaa agagttggta
gctcttgatc cggcaaacaa accaccgctg gtagcggtgg 7320tttttttgtt tgcaagcagc
agattacgcg cagaaaaaaa ggatctcaag aagatccttt 7380gatcttttct acggggtctg
acgctcagtg gaacgaaaac tcacgttaag ggattttggt 7440catgagatta tcaaaaagga
tcttcaccta gatcctttta aattaaaaat gaagttttaa 7500atcaatctaa agtatatatg
agtaaacttg gtctgacagt taccaatgct taatcagtga 7560ggcacctatc tcagcgatct
gtctatttcg ttcatccata gttgcctgac tccccgtcgt 7620gtagataact acgatacggg
agggcttacc atctggcccc agtgctgcaa tgataccgcg 7680agacccacgc tcaccggctc
cagatttatc agcaataaac cagccagccg gaagggccga 7740gcgcagaagt ggtcctgcaa
ctttatccgc ctccatccag tctattaatt gttgccggga 7800agctagagta agtagttcgc
cagttaatag tttgcgcaac gttgttgcca ttgctacagg 7860catcgtggtg tcacgctcgt
cgtttggtat ggcttcattc agctccggtt cccaacgatc 7920aaggcgagtt acatgatccc
ccatgttgtg caaaaaagcg gttagctcct tcggtcctcc 7980gatcgttgtc agaagtaagt
tggccgcagt gttatcactc atggttatgg cagcactgca 8040taattctctt actgtcatgc
catccgtaag atgcttttct gtgactggtg agtactcaac 8100caagtcattc tgagaatagt
gtatgcggcg accgagttgc tcttgcccgg cgtcaatacg 8160ggataatacc gcgccacata
gcagaacttt aaaagtgctc atcattggaa aacgttcttc 8220ggggcgaaaa ctctcaagga
tcttaccgct gttgagatcc agttcgatgt aacccactcg 8280tgcacccaac tgatcttcag
catcttttac tttcaccagc gtttctgggt gagcaaaaac 8340aggaaggcaa aatgccgcaa
aaaagggaat aagggcgaca cggaaatgtt gaatactcat 8400actcttcctt tttcaatatt
attgaagcat ttatcagggt tattgtctca tgagcggata 8460catatttgaa tgtatttaga
aaaataaaca aataggggtt ccgcgcacat ttccccgaaa 8520agtgccacct gacgtagtta
acaaaaaaaa gcccgccgaa gcgggcttta ttaccaagcg 8580aagcgccatt cgccattcag
gctgcgcaac tgttgggaag ggcgatcggt gcgggcctct 8640tcgctattac gccagctggc
gaaaggggga tgtgctgcaa ggcgattaag ttgggtaacg 8700ccagggtttt cccagtcacg
acgttgtaaa acgacggcca gtccgtaata cgactcactt 8760aaggccttga ctagagggtc
gacggtatac agacatgata agatacattg atgagtttgg 8820acaaaccaca actagaatgc
agtgaaaaaa atgctttatt tgtgaaattt gtgatgctat 8880tgctttattt gtaaccatta
taagctgcaa taaacaagtt ggggtgggcg aagaactcca 8940gcatgagatc cccgcgctgg
aggatcatcc agccggcgtc ccggaaaacg attccgaagc 9000ccaacctttc atagaaggcg
gcggtggaat cgaaatctcg tagcacgtgt cagtcctgct 9060cctcggccac gaagtgcacg
9080
SEQ ID NO: 111; Length: 4223; Type: DNA; Organism: Artificial Sequence; Other information: pLIT38attBBSRpolyA10 Plasmid
Sequence 111:
gttaactacg tcaggtggca cttttcgggg
aaatgtgcgc ggaaccccta tttgtttatt 60tttctaaata cattcaaata tgtatccgct
catgagacaa taaccctgat aaatgcttca 120ataatattga aaaaggaaga gtatgagtat
tcaacatttc cgtgtcgccc ttattccctt 180ttttgcggca ttttgccttc ctgtttttgc
tcacccagaa acgctggtga aagtaaaaga 240tgctgaagat cagttgggtg cacgagtggg
ttacatcgaa ctggatctca acagcggtaa 300gatccttgag agttttcgcc ccgaagaacg
ttctccaatg atgagcactt ttaaagttct 360gctatgtggc gcggtattat cccgtgttga
cgccgggcaa gagcaactcg gtcgccgcat 420acactattct cagaatgact tggttgagta
ctcaccagtc acagaaaagc atcttacgga 480tggcatgaca gtaagagaat tatgcagtgc
tgccataacc atgagtgata acactgcggc 540caacttactt ctgacaacga tcggaggacc
gaaggagcta accgcttttt tgcacaacat 600gggggatcat gtaactcgcc ttgatcgttg
ggaaccggag ctgaatgaag ccataccaaa 660cgacgagcgt gacaccacga tgcctgtagc
aatggcaaca acgttgcgca aactattaac 720tggcgaacta cttactctag cttcccggca
acaattaata gactggatgg aggcggataa 780agttgcagga ccacttctgc gctcggccct
tccggctggc tggtttattg ctgataaatc 840tggagccggt gagcgtgggt ctcgcggtat
cattgcagca ctggggccag atggtaagcc 900ctcccgtatc gtagttatct acacgacggg
gagtcaggca actatggatg aacgaaatag 960acagatcgct gagataggtg cctcactgat
taagcattgg taactgtcag accaagttta 1020ctcatatata ctttagattg atttaccccg
gttgataatc agaaaagccc caaaaacagg 1080aagattgtat aagcaaatat ttaaattgta
aacgttaata ttttgttaaa attcgcgtta 1140aatttttgtt aaatcagctc attttttaac
caataggccg aaatcggcaa aatcccttat 1200aaatcaaaag aatagcccga gatagggttg
agtgttgttc cagtttggaa caagagtcca 1260ctattaaaga acgtggactc caacgtcaaa
gggcgaaaaa ccgtctatca gggcgatggc 1320ccactacgtg aaccatcacc caaatcaagt
tttttggggt cgaggtgccg taaagcacta 1380aatcggaacc ctaaagggag cccccgattt
agagcttgac ggggaaagcg aacgtggcga 1440gaaaggaagg gaagaaagcg aaaggagcgg
gcgctagggc gctggcaagt gtagcggtca 1500cgctgcgcgt aaccaccaca cccgccgcgc
ttaatgcgcc gctacagggc gcgtaaaagg 1560atctaggtga agatcctttt tgataatctc
atgaccaaaa tcccttaacg tgagttttcg 1620ttccactgag cgtcagaccc cgtagaaaag
atcaaaggat cttcttgaga tccttttttt 1680ctgcgcgtaa tctgctgctt gcaaacaaaa
aaaccaccgc taccagcggt ggtttgtttg 1740ccggatcaag agctaccaac tctttttccg
aaggtaactg gcttcagcag agcgcagata 1800ccaaatactg ttcttctagt gtagccgtag
ttaggccacc acttcaagaa ctctgtagca 1860ccgcctacat acctcgctct gctaatcctg
ttaccagtgg ctgctgccag tggcgataag 1920tcgtgtctta ccgggttgga ctcaagacga
tagttaccgg ataaggcgca gcggtcgggc 1980tgaacggggg gttcgtgcac acagcccagc
ttggagcgaa cgacctacac cgaactgaga 2040tacctacagc gtgagctatg agaaagcgcc
acgcttcccg aagggagaaa ggcggacagg 2100tatccggtaa gcggcagggt cggaacagga
gagcgcacga gggagcttcc agggggaaac 2160gcctggtatc tttatagtcc tgtcgggttt
cgccacctct gacttgagcg tcgatttttg 2220tgatgctcgt caggggggcg gagcctatgg
aaaaacgcca gcaacgcggc ctttttacgg 2280ttcctggcct tttgctggcc ttttgctcac
atgtaatgtg agttagctca ctcattaggc 2340accccaggct ttacacttta tgcttccggc
tcgtatgttg tgtggaattg tgagcggata 2400acaatttcac acaggaaaca gctatgacca
tgattacgcc aagctacgta atacgactca 2460ctagtggggc ccgtgcaatt gaagccggct
ggcgccaagc ttctctgcag gattgaagcc 2520tgctttttta tactaacttg agcgaaatct
ggatcaccat gaaaacattt aacatttctc 2580aacaagatct agaattagta gaagtagcga
cagagaagat tacaatgctt tatgaggata 2640ataaacatca tgtgggagcg gcaattcgta
cgaaaacagg agaaatcatt tcggcagtac 2700atattgaagc gtatatagga cgagtaactg
tttgtgcaga agccattgcg attggtagtg 2760cagtttcgaa tggacaaaag gattttgaca
cgattgtagc tgttagacac ccttattctg 2820acgaagtaga tagaagtatt cgagtggtaa
gtccttgtgg tatgtgtagg gagttgattt 2880cagactatgc accagattgt tttgtgttaa
tagaaatgaa tggcaagtta gtcaaaacta 2940cgattgaaga actcattcca ctcaaatata
cccgaaatta aaagttttac cataccaagc 3000ttggctgctg cctgaggctg gacgacctcg
cggagttcta ccggcagtgc aaatccgtcg 3060gcatccagga aaccagcagc ggctatccgc
gcatccatgc ccccgaactg caggagtggg 3120gaggcacgat ggccgctttg gtccggatct
ttgtgaagga accttacttc tgtggtgtga 3180cataattgga caaactacct acagagattt
aaagctctaa ggtaaatata aaatttttaa 3240gtgtataatg tgttaaacta ctgattctaa
ttgtttgtgt attttagatt ccaacctatg 3300gaactgatga atgggagcag tggtggaatg
cctttaatga ggaaaacctg ttttgctcag 3360aagaaatgcc atctagtgat gatgaggcta
ctgctgactc tcaacattct actcctccaa 3420aaaagaagag aaaggtagaa gaccccaagg
actttccttc agaattgcta agttttttga 3480gtcatgctgt gtttagtaat agaactcttg
cttgctttgc tatttacacc acaaaggaaa 3540aagctgcact gctatacaag aaaattatgg
aaaaatattc tgtaaccttt ataagtaggc 3600ataacagtta taatcataac atactgtttt
ttcttactcc acacaggcat agagtgtctg 3660ctattaataa ctatgctcaa aaattgtgta
cctttagctt tttaatttgt aaaggggtta 3720ataaggaata tttgatgtat agtgccttga
ctagagatca taatcagcca taccacattt 3780gtagaggttt tacttgcttt aaaaaacctc
ccacacctcc ccctgaacct gaaacataaa 3840atgaatgcaa ttgttgttgt taacttgttt
attgcagctt ataatggtta caaataaagc 3900aatagcatca caaatttcac aaataaagat
ccacgaattc gctagcttcg gccgtgacgc 3960gtctccggat gtacaggcat gcgtcgaccc
tctagtcaag gccttaagtg agtcgtatta 4020cggactggcc gtcgttttac aacgtcgtga
ctgggaaaac cctggcgtta cccaacttaa 4080tcgccttgca gcacatcccc ctttcgccag
ctggcgtaat agcgaagagg cccgcaccga 4140tcgcccttcc caacagttgc gcagcctgaa
tggcgaatgg cgcttcgctt ggtaataaag 4200cccgcttcgg cgggcttttt ttt
4223
SEQ ID NO: 112; Length: 5855; Type: DNA; Organism: Artificial Sequence; Other information: pCX-LamIntR Plasmid
Sequence 112:
gtcgacattg attattgact agttattaat
agtaatcaat tacggggtca ttagttcata 60gcccatatat ggagttccgc gttacataac
ttacggtaaa tggcccgcct ggctgaccgc 120ccaacgaccc ccgcccattg acgtcaataa
tgacgtatgt tcccatagta acgccaatag 180ggactttcca ttgacgtcaa tgggtggact
atttacggta aactgcccac ttggcagtac 240atcaagtgta tcatatgcca agtacgcccc
ctattgacgt caatgacggt aaatggcccg 300cctggcatta tgcccagtac atgaccttat
gggactttcc tacttggcag tacatctacg 360tattagtcat cgctattacc atgggtcgag
gtgagcccca cgttctgctt cactctcccc 420atctcccccc cctccccacc cccaattttg
tatttattta ttttttaatt attttgtgca 480gcgatggggg cggggggggg gggggcgcgc
gccaggcggg gcggggcggg gcgaggggcg 540gggcggggcg aggcggagag gtgcggcggc
agccaatcag agcggcgcgc tccgaaagtt 600tccttttatg gcgaggcggc ggcggcggcg
gccctataaa aagcgaagcg cgcggcgggc 660gggagtcgct gcgttgcctt cgccccgtgc
cccgctccgc gccgcctcgc gccgcccgcc 720ccggctctga ctgaccgcgt tactcccaca
ggtgagcggg cgggacggcc cttctcctcc 780gggctgtaat tagcgcttgg tttaatgacg
gctcgtttct tttctgtggc tgcgtgaaag 840ccttaaaggg ctccgggagg gccctttgtg
cgggggggag cggctcgggg ggtgcgtgcg 900tgtgtgtgtg cgtggggagc gccgcgtgcg
gcccgcgctg cccggcggct gtgagcgctg 960cgggcgcggc gcggggcttt gtgcgctccg
cgtgtgcgcg aggggagcgc ggccgggggc 1020ggtgccccgc ggtgcggggg ggctgcgagg
ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080tgggggggtg agcagggggt gtgggcgcgg
cggtcgggct gtaacccccc cctgcacccc 1140cctccccgag ttgctgagca cggcccggct
tcgggtgcgg ggctccgtgc ggggcgtggc 1200gcggggctcg ccgtgccggg cggggggtgg
cggcaggtgg gggtgccggg cggggcgggg 1260ccgcctcggg ccggggaggg ctcgggggag
gggcgcggcg gccccggagc gccggcggct 1320gtcgaggcgc ggcgagccgc agccattgcc
ttttatggta atcgtgcgag agggcgcagg 1380gacttccttt gtcccaaatc tggcggagcc
gaaatctggg aggcgccgcc gcaccccctc 1440tagcgggcgc gggcgaagcg gtgcggcgcc
ggcaggaagg aaatgggcgg ggagggcctt 1500cgtgcgtcgc cgcgccgccg tccccttctc
catctccagc ctcggggctg ccgcaggggg 1560acggctgcct tcggggggga cggggcaggg
cggggttcgg cttctggcgt gtgaccggcg 1620gctctagagc ctctgctaac catgttcatg
ccttcttctt tttcctacag ctcctgggca 1680acgtgctggt tgttgtgctg tctcatcatt
ttggcaaaga attcatggga agaaggcgaa 1740gtcatgagcg ccgggattta ccccctaacc
tttatataag aaacaatgga tattactgct 1800acagggaccc aaggacgggt aaagagtttg
gattaggcag agacaggcga atcgcaatca 1860ctgaagctat acaggccaac attgagttat
tttcaggaca caaacacaag cctctgacag 1920cgagaatcaa cagtgataat tccgttacgt
tacattcatg gcttgatcgc tacgaaaaaa 1980tcctggccag cagaggaatc aagcagaaga
cactcataaa ttacatgagc aaaattaaag 2040caataaggag gggtctgcct gatgctccac
ttgaagacat caccacaaaa gaaattgcgg 2100caatgctcaa tggatacata gacgagggca
aggcggcgtc agccaagtta atcagatcaa 2160cactgagcga tgcattccga gaggcaatag
ctgaaggcca tataacaaca aaccatgtcg 2220ctgccactcg cgcagcaaaa tctagagtaa
ggagatcaag acttacggct gacgaatacc 2280tgaaaattta tcaagcagca gaatcatcac
catgttggct cagacttgca atggaactgg 2340ctgttgttac cgggcaacga gttggtgatt
tatgcgaaat gaagtggtct gatatcgtag 2400atggatatct ttatgtcgag caaagcaaaa
caggcgtaaa aattgccatc ccaacagcat 2460tgcatattga tgctctcgga atatcaatga
aggaaacact tgataaatgc aaagagattc 2520ttggcggaga aaccataatt gcatctactc
gtcgcgaacc gctttcatcc ggcacagtat 2580caaggtattt tatgcgcgca cgaaaagcat
caggtctttc cttcgaaggg gatccgccta 2640cctttcacga gttgcgcagt ttgtctgcaa
gactctatga gaagcagata agcgataagt 2700ttgctcaaca tcttctcggg cataagtcgg
acaccatggc atcacagtat cgtgatgaca 2760gaggcaggga gtgggacaaa attgaaatca
aataagaatt cactcctcag gtgcaggctg 2820cctatcagaa ggtggtggct ggtgtggcca
atgccctggc tcacaaatac cactgagatc 2880tttttccctc tgccaaaaat tatggggaca
tcatgaagcc ccttgagcat ctgacttctg 2940gctaataaag gaaatttatt ttcattgcaa
tagtgtgttg gaattttttg tgtctctcac 3000tcggaaggac atatgggagg gcaaatcatt
taaaacatca gaatgagtat ttggtttaga 3060gtttggcaac atatgccata tgctggctgc
catgaacaaa ggtggctata aagaggtcat 3120cagtatatga aacagccccc tgctgtccat
tccttattcc atagaaaagc cttgacttga 3180ggttagattt tttttatatt ttgttttgtg
ttattttttt ctttaacatc cctaaaattt 3240tccttacatg ttttactagc cagatttttc
ctcctctcct gactactccc agtcatagct 3300gtccctcttc tcttatgaag atccctcgac
ctgcagccca agcttggcgt aatcatggtc 3360atagctgttt cctgtgtgaa attgttatcc
gctcacaatt ccacacaaca tacgagccgg 3420aagcataaag tgtaaagcct ggggtgccta
atgagtgagc taactcacat taattgcgtt 3480gcgctcactg cccgctttcc agtcgggaaa
cctgtcgtgc cagcggatcc gcatctcaat 3540tagtcagcaa ccatagtccc gcccctaact
ccgcccatcc cgcccctaac tccgcccagt 3600tccgcccatt ctccgcccca tggctgacta
atttttttta tttatgcaga ggccgaggcc 3660gcctcggcct ctgagctatt ccagaagtag
tgaggaggct tttttggagg cctaggcttt 3720tgcaaaaagc taacttgttt attgcagctt
ataatggtta caaataaagc aatagcatca 3780caaatttcac aaataaagca tttttttcac
tgcattctag ttgtggtttg tccaaactca 3840tcaatgtatc ttatcatgtc tggatccgct
gcattaatga atcggccaac gcgcggggag 3900aggcggtttg cgtattgggc gctcttccgc
ttcctcgctc actgactcgc tgcgctcggt 3960cgttcggctg cggcgagcgg tatcagctca
ctcaaaggcg gtaatacggt tatccacaga 4020atcaggggat aacgcaggaa agaacatgtg
agcaaaaggc cagcaaaagg ccaggaaccg 4080taaaaaggcc gcgttgctgg cgtttttcca
taggctccgc ccccctgacg agcatcacaa 4140aaatcgacgc tcaagtcaga ggtggcgaaa
cccgacagga ctataaagat accaggcgtt 4200tccccctgga agctccctcg tgcgctctcc
tgttccgacc ctgccgctta ccggatacct 4260gtccgccttt ctcccttcgg gaagcgtggc
gctttctcaa tgctcacgct gtaggtatct 4320cagttcggtg taggtcgttc gctccaagct
gggctgtgtg cacgaacccc ccgttcagcc 4380cgaccgctgc gccttatccg gtaactatcg
tcttgagtcc aacccggtaa gacacgactt 4440atcgccactg gcagcagcca ctggtaacag
gattagcaga gcgaggtatg taggcggtgc 4500tacagagttc ttgaagtggt ggcctaacta
cggctacact agaaggacag tatttggtat 4560ctgcgctctg ctgaagccag ttaccttcgg
aaaaagagtt ggtagctctt gatccggcaa 4620acaaaccacc gctggtagcg gtggtttttt
tgtttgcaag cagcagatta cgcgcagaaa 4680aaaaggatct caagaagatc ctttgatctt
ttctacgggg tctgacgctc agtggaacga 4740aaactcacgt taagggattt tggtcatgag
attatcaaaa aggatcttca cctagatcct 4800tttaaattaa aaatgaagtt ttaaatcaat
ctaaagtata tatgagtaaa cttggtctga 4860cagttaccaa tgcttaatca gtgaggcacc
tatctcagcg atctgtctat ttcgttcatc 4920catagttgcc tgactccccg tcgtgtagat
aactacgata cgggagggct taccatctgg 4980ccccagtgct gcaatgatac cgcgagaccc
acgctcaccg gctccagatt tatcagcaat 5040aaaccagcca gccggaaggg ccgagcgcag
aagtggtcct gcaactttat ccgcctccat 5100ccagtctatt aattgttgcc gggaagctag
agtaagtagt tcgccagtta atagtttgcg 5160caacgttgtt gccattgcta caggcatcgt
ggtgtcacgc tcgtcgtttg gtatggcttc 5220attcagctcc ggttcccaac gatcaaggcg
agttacatga tcccccatgt tgtgcaaaaa 5280agcggttagc tccttcggtc ctccgatcgt
tgtcagaagt aagttggccg cagtgttatc 5340actcatggtt atggcagcac tgcataattc
tcttactgtc atgccatccg taagatgctt 5400ttctgtgact ggtgagtact caaccaagtc
attctgagaa tagtgtatgc ggcgaccgag 5460ttgctcttgc ccggcgtcaa tacgggataa
taccgcgcca catagcagaa ctttaaaagt 5520gctcatcatt ggaaaacgtt cttcggggcg
aaaactctca aggatcttac cgctgttgag 5580atccagttcg atgtaaccca ctcgtgcacc
caactgatct tcagcatctt ttactttcac 5640cagcgtttct gggtgagcaa aaacaggaag
gcaaaatgcc gcaaaaaagg gaataagggc 5700gacacggaaa tgttgaatac tcatactctt
cctttttcaa tattattgaa gcatttatca 5760gggttattgt ctcatgagcg gatacatatt
tgaatgtatt tagaaaaata aacaaatagg 5820ggttccgcgc acatttcccc gaaaagtgcc
acctg 5855
SEQ ID NO: 113; Length: 4346; Type: DNA; Organism: Artificial Sequence; Other information: pSV40-193AttpsensePur Plasmid
Sequence 113:
ccggtgccgc caccatcccc tgacccacgc
ccctgacccc tcacaaggag acgaccttcc 60atgaccgagt acaagcccac ggtgcgcctc
gccacccgcg acgacgtccc ccgggccgta 120cgcaccctcg ccgccgcgtt cgccgactac
cccgccacgc gccacaccgt cgacccggac 180cgccacatcg agcgggtcac cgagctgcaa
gaactcttcc tcacgcgcgt cgggctcgac 240atcggcaagg tgtgggtcgc ggacgacggc
gccgcggtgg cggtctggac cacgccggag 300agcgtcgaag cgggggcggt gttcgccgag
atcggcccgc gcatggccga gttgagcggt 360tcccggctgg ccgcgcagca acagatggaa
ggcctcctgg cgccgcaccg gcccaaggag 420cccgcgtggt tcctggccac cgtcggcgtc
tcgcccgacc accagggcaa gggtctgggc 480agcgccgtcg tgctccccgg agtggaggcg
gccgagcgcg ccggggtgcc cgccttcctg 540gagacctccg cgccccgcaa cctccccttc
tacgagcggc tcggcttcac cgtcaccgcc 600gacgtcgagg tgcccgaagg accgcgcacc
tggtgcatga cccgcaagcc cggtgcctga 660cgcccgcccc acgacccgca gcgcccgacc
gaaaggagcg cacgacccca tggctccgac 720cgaagccgac ccgggcggcc ccgccgaccc
cgcacccgcc cccgaggccc accgactcta 780gaggatcata atcagccata ccacatttgt
agaggtttta cttgctttaa aaaacctccc 840acacctcccc ctgaacctga aacataaaat
gaatgcaatt gttgttgtta acttgtttat 900tgcagcttat aatggttaca aataaagcaa
tagcatcaca aatttcacaa ataaagcatt 960tttttcactg cattctagtt gtggtttgtc
caaactcatc aatgtatctt atcatgtctg 1020gatccgcgcc ggatccttaa ttaagtctag
agtcgactgt ttaaacctgc aggcatgcaa 1080gcttggcgta atcatggtca tagctgtttc
ctgtgtgaaa ttgttatccg ctcacaattc 1140cacacaacat acgagccgga agcataaagt
gtaaagcctg gggtgcctaa tgagtgagct 1200aactcacatt aattgcgttg cgctcactgc
ccgctttcca gtcgggaaac ctgtcgtgcc 1260agctgcatta atgaatcggc caacgcgcgg
ggagaggcgg tttgcgtatt gggcgctctt 1320ccgcttcctc gctcactgac tcgctgcgct
cggtcgttcg gctgcggcga gcggtatcag 1380ctcactcaaa ggcggtaata cggttatcca
cagaatcagg ggataacgca ggaaagaaca 1440tgtgagcaaa aggccagcaa aaggccagga
accgtaaaaa ggccgcgttg ctggcgtttt 1500tccataggct ccgcccccct gacgagcatc
acaaaaatcg acgctcaagt cagaggtggc 1560gaaacccgac aggactataa agataccagg
cgtttccccc tggaagctcc ctcgtgcgct 1620ctcctgttcc gaccctgccg cttaccggat
acctgtccgc ctttctccct tcgggaagcg 1680tggcgctttc tcatagctca cgctgtaggt
atctcagttc ggtgtaggtc gttcgctcca 1740agctgggctg tgtgcacgaa ccccccgttc
agcccgaccg ctgcgcctta tccggtaact 1800atcgtcttga gtccaacccg gtaagacacg
acttatcgcc actggcagca gccactggta 1860acaggattag cagagcgagg tatgtaggcg
gtgctacaga gttcttgaag tggtggccta 1920actacggcta cactagaagg acagtatttg
gtatctgcgc tctgctgaag ccagttacct 1980tcggaaaaag agttggtagc tcttgatccg
gcaaacaaac caccgctggt agcggtggtt 2040tttttgtttg caagcagcag attacgcgca
gaaaaaaagg atctcaagaa gatcctttga 2100tcttttctac ggggtctgac gctcagtgga
acgaaaactc acgttaaggg attttggtca 2160tgagattatc aaaaaggatc ttcacctaga
tccttttaaa ttaaaaatga agttttaaat 2220caatctaaag tatatatgag taaacttggt
ctgacagtta ccaatgctta atcagtgagg 2280cacctatctc agcgatctgt ctatttcgtt
catccatagt tgcctgactc cccgtcgtgt 2340agataactac gatacgggag ggcttaccat
ctggccccag tgctgcaatg ataccgcgag 2400acccacgctc accggctcca gatttatcag
caataaacca gccagccgga agggccgagc 2460gcagaagtgg tcctgcaact ttatccgcct
ccatccagtc tattaattgt tgccgggaag 2520ctagagtaag tagttcgcca gttaatagtt
tgcgcaacgt tgttgccatt gctacaggca 2580tcgtggtgtc acgctcgtcg tttggtatgg
cttcattcag ctccggttcc caacgatcaa 2640ggcgagttac atgatccccc atgttgtgca
aaaaagcggt tagctccttc ggtcctccga 2700tcgttgtcag aagtaagttg gccgcagtgt
tatcactcat ggttatggca gcactgcata 2760attctcttac tgtcatgcca tccgtaagat
gcttttctgt gactggtgag tactcaacca 2820agtcattctg agaatagtgt atgcggcgac
cgagttgctc ttgcccggcg tcaatacggg 2880ataataccgc gccacatagc agaactttaa
aagtgctcat cattggaaaa cgttcttcgg 2940ggcgaaaact ctcaaggatc ttaccgctgt
tgagatccag ttcgatgtaa cccactcgtg 3000cacccaactg atcttcagca tcttttactt
tcaccagcgt ttctgggtga gcaaaaacag 3060gaaggcaaaa tgccgcaaaa aagggaataa
gggcgacacg gaaatgttga atactcatac 3120tcttcctttt tcaatattat tgaagcattt
atcagggtta ttgtctcatg agcggataca 3180tatttgaatg tatttagaaa aataaacaaa
taggggttcc gcgcacattt ccccgaaaag 3240tgccacctga cgtctaagaa accattatta
tcatgacatt aacctataaa aataggcgta 3300tcacgaggcc ctttcgtctc gcgcgtttcg
gtgatgacgg tgaaaacctc tgacacatgc 3360agctcccgga gacggtcaca gcttgtctgt
aagcggatgc cgggagcaga caagcccgtc 3420agggcgcgtc agcgggtgtt ggcgggtgtc
ggggctggct taactatgcg gcatcagagc 3480agattgtact gagagtgcac catatgcggt
gtgaaatacc gcacagatgc gtaaggagaa 3540aataccgcat caggcgccat tcgccattca
ggctgcgcaa ctgttgggaa gggcgatcgg 3600tgcgggcctc ttcgctatta cgccagctgg
cgaaaggggg atgtgctgca aggcgattaa 3660gttgggtaac gccagggttt tcccagtcac
gacgttgtaa aacgacggcc agtgaattcg 3720agctgtggaa tgtgtgtcag ttagggtgtg
gaaagtcccc aggctcccca gcaggcagaa 3780gtatgcaaag catgcatctc aattagtcag
caaccaggtg tggaaagtcc ccaggctccc 3840cagcaggcag aagtatgcaa agcatgcatc
tcaattagtc agcaaccata gtcccgcccc 3900taactccgcc catcccgccc ctaactccgc
ccagttccgc ccattctccg ccccatggct 3960gactaatttt ttttatttat gcagaggccg
aggccgcctc ggcctctgag ctattccaga 4020agtagtgagg aggctttttt ggaggctcgg
tacccccttg cgctaatgct ctgttacagg 4080tcactaatac catctaagta gttgattcat
agtgactgca tatgttgtgt tttacagtat 4140tatgtagtct gttttttatg caaaatctaa
tttaatatat tgatatttat atcattttac 4200gtttctcgtt cagctttttt atactaagtt
ggcattataa aaaagcattg cttatcaatt 4260tgttgcaacg aacaggtcac tatcagtcaa
aataaaatca ttatttgatt tcaattttgt 4320cccactccct gcctctgggg ggcgcg
4346
SEQ ID NO: 114; Length: 3166; Type: DNA; Organism: Artificial Sequence; Other information: p18attBZeo Plasmid
Sequence 114:
cagttgccgg ccgggtcgcg cagggcgaac tcccgccccc
acggctgctc gccgatctcg 60gtcatggccg gcccggaggc gtcccggaag ttcgtggaca
cgacctccga ccactcggcg 120tacagctcgt ccaggccgcg cacccacacc caggccaggg
tgttgtccgg caccacctgg 180tcctggaccg cgctgatgaa cagggtcacg tcgtcccgga
ccacaccggc gaagtcgtcc 240tccacgaagt cccgggagaa cccgagccgg tcggtccaga
actcgaccgc tccggcgacg 300tcgcgcgcgg tgagcaccgg aacggcactg gtcaacttgg
ccatggatcc agatttcgct 360caagttagta taaaaaagca ggcttcaatc ctgcagagaa
gcttgcatgc ctgcaggtcg 420actctagagg atccccgggt accgagctcg aattcgtaat
catggtcata gctgtttcct 480gtgtgaaatt gttatccgct cacaattcca cacaacatac
gagccggaag cataaagtgt 540aaagcctggg gtgcctaatg agtgagctaa ctcacattaa
ttgcgttgcg ctcactgccc 600gctttccagt cgggaaacct gtcgtgccag ctgcattaat
gaatcggcca acgcgcgggg 660agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc
tcactgactc gctgcgctcg 720gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg
cggtaatacg gttatccaca 780gaatcagggg ataacgcagg aaagaacatg tgagcaaaag
gccagcaaaa ggccaggaac 840cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc
gcccccctga cgagcatcac 900aaaaatcgac gctcaagtca gaggtggcga aacccgacag
gactataaag ataccaggcg 960tttccccctg gaagctccct cgtgcgctct cctgttccga
ccctgccgct taccggatac 1020ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc
atagctcacg ctgtaggtat 1080ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg
tgcacgaacc ccccgttcag 1140cccgaccgct gcgccttatc cggtaactat cgtcttgagt
ccaacccggt aagacacgac 1200ttatcgccac tggcagcagc cactggtaac aggattagca
gagcgaggta tgtaggcggt 1260gctacagagt tcttgaagtg gtggcctaac tacggctaca
ctagaaggac agtatttggt 1320atctgcgctc tgctgaagcc agttaccttc ggaaaaagag
ttggtagctc ttgatccggc 1380aaacaaacca ccgctggtag cggtggtttt tttgtttgca
agcagcagat tacgcgcaga 1440aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg
ggtctgacgc tcagtggaac 1500gaaaactcac gttaagggat tttggtcatg agattatcaa
aaaggatctt cacctagatc 1560cttttaaatt aaaaatgaag ttttaaatca atctaaagta
tatatgagta aacttggtct 1620gacagttacc aatgcttaat cagtgaggca cctatctcag
cgatctgtct atttcgttca 1680tccatagttg cctgactccc cgtcgtgtag ataactacga
tacgggaggg cttaccatct 1740ggccccagtg ctgcaatgat accgcgagac ccacgctcac
cggctccaga tttatcagca 1800ataaaccagc cagccggaag ggccgagcgc agaagtggtc
ctgcaacttt atccgcctcc 1860atccagtcta ttaattgttg ccgggaagct agagtaagta
gttcgccagt taatagtttg 1920cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac
gctcgtcgtt tggtatggct 1980tcattcagct ccggttccca acgatcaagg cgagttacat
gatcccccat gttgtgcaaa 2040aaagcggtta gctccttcgg tcctccgatc gttgtcagaa
gtaagttggc cgcagtgtta 2100tcactcatgg ttatggcagc actgcataat tctcttactg
tcatgccatc cgtaagatgc 2160ttttctgtga ctggtgagta ctcaaccaag tcattctgag
aatagtgtat gcggcgaccg 2220agttgctctt gcccggcgtc aatacgggat aataccgcgc
cacatagcag aactttaaaa 2280gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct
caaggatctt accgctgttg 2340agatccagtt cgatgtaacc cactcgtgca cccaactgat
cttcagcatc ttttactttc 2400accagcgttt ctgggtgagc aaaaacagga aggcaaaatg
ccgcaaaaaa gggaataagg 2460gcgacacgga aatgttgaat actcatactc ttcctttttc
aatattattg aagcatttat 2520cagggttatt gtctcatgag cggatacata tttgaatgta
tttagaaaaa taaacaaata 2580ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg
tagttaacaa aaaaaagccc 2640gccgaagcgg gctttattac caagcgaagc gccattcgcc
attcaggctg cgcaactgtt 2700gggaagggcg atcggtgcgg gcctcttcgc tattacgcca
gctggcgaaa gggggatgtg 2760ctgcaaggcg attaagttgg gtaacgccag ggttttccca
gtcacgacgt tgtaaaacga 2820cggccagtcc gtaatacgac tcacttaagg ccttgactag
agggtcgacg gtatacagac 2880atgataagat acattgatga gtttggacaa accacaacta
gaatgcagtg aaaaaaatgc 2940tttatttgtg aaatttgtga tgctattgct ttatttgtaa
ccattataag ctgcaataaa 3000caagttgggg tgggcgaaga actccagcat gagatccccg
cgctggagga tcatccagcc 3060ggcgtcccgg aaaacgattc cgaagcccaa cctttcatag
aaggcggcgg tggaatcgaa 3120atctcgtagc acgtgtcagt cctgctcctc ggccacgaag
tgcacg 3166
SEQ ID NO: 115; Length: 7600; Type: DNA; Organism: Artificial Sequence; Other information: p18attBZeo3'6XHS4eGFP Plasmid
Sequence 115:
cagttgccgg ccgggtcgcg cagggcgaac
tcccgccccc acggctgctc gccgatctcg 60gtcatggccg gcccggaggc gtcccggaag
ttcgtggaca cgacctccga ccactcggcg 120tacagctcgt ccaggccgcg cacccacacc
caggccaggg tgttgtccgg caccacctgg 180tcctggaccg cgctgatgaa cagggtcacg
tcgtcccgga ccacaccggc gaagtcgtcc 240tccacgaagt cccgggagaa cccgagccgg
tcggtccaga actcgaccgc tccggcgacg 300tcgcgcgcgg tgagcaccgg aacggcactg
gtcaacttgg ccatggatcc agatttcgct 360caagttagta taaaaaagca ggcttcaatc
ctgcagagaa gcttgatcta gttattaata 420gtaatcaatt acggggtcat tagttcatag
cccatatatg gagttccgcg ttacataact 480tacggtaaat ggcccgcctg gctgaccgcc
caacgacccc cgcccattga cgtcaataat 540gacgtatgtt cccatagtaa cgccaatagg
gactttccat tgacgtcaat gggtggacta 600tttacggtaa actgcccact tggcagtaca
tcaagtgtat catatgccaa gtacgccccc 660tattgacgtc aatgacggta aatggcccgc
ctggcattat gcccagtaca tgaccttatg 720ggactttcct acttggcagt acatctacgt
attagtcatc gctattacca tgggtcgagg 780tgagccccac gttctgcttc actctcccca
tctccccccc ctccccaccc ccaattttgt 840atttatttat tttttaatta ttttgtgcag
cgatgggggc gggggggggg ggggcgcgcg 900ccaggcgggg cggggcgggg cgaggggcgg
ggcggggcga ggcggagagg tgcggcggca 960gccaatcaga gcggcgcgct ccgaaagttt
ccttttatgg cgaggcggcg gcggcggcgg 1020ccctataaaa agcgaagcgc gcggcgggcg
ggagtcgctg cgttgccttc gccccgtgcc 1080ccgctccgcg ccgcctcgcg ccgcccgccc
cggctctgac tgaccgcgtt actcccacag 1140gtgagcgggc gggacggccc ttctcctccg
ggctgtaatt agcgcttggt ttaatgacgg 1200ctcgtttctt ttctgtggct gcgtgaaagc
cttaaagggc tccgggaggg ccctttgtgc 1260gggggggagc ggctcggggg gtgcgtgcgt
gtgtgtgtgc gtggggagcg ccgcgtgcgg 1320cccgcgctgc ccggcggctg tgagcgctgc
gggcgcggcg cggggctttg tgcgctccgc 1380gtgtgcgcga ggggagcgcg gccgggggcg
gtgccccgcg gtgcgggggg gctgcgaggg 1440gaacaaaggc tgcgtgcggg gtgtgtgcgt
gggggggtga gcagggggtg tgggcgcggc 1500ggtcgggctg taaccccccc ctgcaccccc
ctccccgagt tgctgagcac ggcccggctt 1560cgggtgcggg gctccgtgcg gggcgtggcg
cggggctcgc cgtgccgggc ggggggtggc 1620ggcaggtggg ggtgccgggc ggggcggggc
cgcctcgggc cggggagggc tcgggggagg 1680ggcgcggcgg ccccggagcg ccggcggctg
tcgaggcgcg gcgagccgca gccattgcct 1740tttatggtaa tcgtgcgaga gggcgcaggg
acttcctttg tcccaaatct ggcggagccg 1800aaatctggga ggcgccgccg caccccctct
agcgggcgcg ggcgaagcgg tgcggcgccg 1860gcaggaagga aatgggcggg gagggccttc
gtgcgtcgcc gcgccgccgt ccccttctcc 1920atctccagcc tcggggctgc cgcaggggga
cggctgcctt cgggggggac ggggcagggc 1980ggggttcggc ttctggcgtg tgaccggcgg
ctctagagcc tctgctaacc atgttcatgc 2040cttcttcttt ttcctacagc tcctgggcaa
cgtgctggtt gttgtgctgt ctcatcattt 2100tggcaaagaa ttcgccacca tggtgagcaa
gggcgaggag ctgttcaccg gggtggtgcc 2160catcctggtc gagctggacg gcgacgtaaa
cggccacaag ttcagcgtgt ccggcgaggg 2220cgagggcgat gccacctacg gcaagctgac
cctgaagttc atctgcacca ccggcaagct 2280gcccgtgccc tggcccaccc tcgtgaccac
cctgacctac ggcgtgcagt gcttcagccg 2340ctaccccgac cacatgaagc agcacgactt
cttcaagtcc gccatgcccg aaggctacgt 2400ccaggagcgc accatcttct tcaaggacga
cggcaactac aagacccgcg ccgaggtgaa 2460gttcgagggc gacaccctgg tgaaccgcat
cgagctgaag ggcatcgact tcaaggagga 2520cggcaacatc ctggggcaca agctggagta
caactacaac agccacaacg tctatatcat 2580ggccgacaag cagaagaacg gcatcaaggt
gaacttcaag atccgccaca acatcgagga 2640cggcagcgtg cagctcgccg accactacca
gcagaacacc cccatcggcg acggccccgt 2700gctgctgccc gacaaccact acctgagcac
ccagtccgcc ctgagcaaag accccaacga 2760gaagcgcgat cacatggtcc tgctggagtt
cgtgaccgcc gccgggatca ctctcggcat 2820ggacgagctg tacaagtaag aattcactcc
tcaggtgcag gctgcctatc agaaggtggt 2880ggctggtgtg gccaatgccc tggctcacaa
ataccactga gatctttttc cctctgccaa 2940aaattatggg gacatcatga agccccttga
gcatctgact tctggctaat aaaggaaatt 3000tattttcatt gcaatagtgt gttggaattt
tttgtgtctc tcactcggaa ggacatatgg 3060gagggcaaat catttaaaac atcagaatga
gtatttggtt tagagtttgg caacatatgc 3120catatgctgg ctgccatgaa caaaggtggc
tataaagagg tcatcagtat atgaaacagc 3180cccctgctgt ccattcctta ttccatagaa
aagccttgac ttgaggttag atttttttta 3240tattttgttt tgtgttattt ttttctttaa
catccctaaa attttcctta catgttttac 3300tagccagatt tttcctcctc tcctgactac
tcccagtcat agctgtccct cttctcttat 3360gaagatccct cgacctgcag cccaagcttg
catgcctgca ggtcgactct agtggatccc 3420ccgccccgta tcccccaggt gtctgcaggc
tcaaagagca gcgagaagcg ttcagaggaa 3480agcgatcccg tgccaccttc cccgtgcccg
ggctgtcccc gcacgctgcc ggctcgggga 3540tgcgggggga gcgccggacc ggagcggagc
cccgggcggc tcgctgctgc cccctagcgg 3600gggagggacg taattacatc cctgggggct
ttgggggggg gctgtccccg tgagcggatc 3660cgcggccccg tatcccccag gtgtctgcag
gctcaaagag cagcgagaag cgttcagagg 3720aaagcgatcc cgtgccacct tccccgtgcc
cgggctgtcc ccgcacgctg ccggctcggg 3780gatgcggggg gagcgccgga ccggagcgga
gccccgggcg gctcgctgct gccccctagc 3840gggggaggga cgtaattaca tccctggggg
ctttgggggg gggctgtccc cgtgagcgga 3900tccgcggccc cgtatccccc aggtgtctgc
aggctcaaag agcagcgaga agcgttcaga 3960ggaaagcgat cccgtgccac cttccccgtg
cccgggctgt ccccgcacgc tgccggctcg 4020gggatgcggg gggagcgccg gaccggagcg
gagccccggg cggctcgctg ctgcccccta 4080gcgggggagg gacgtaatta catccctggg
ggctttgggg gggggctgtc cccgtgagcg 4140gatccgcggc cccgtatccc ccaggtgtct
gcaggctcaa agagcagcga gaagcgttca 4200gaggaaagcg atcccgtgcc accttccccg
tgcccgggct gtccccgcac gctgccggct 4260cggggatgcg gggggagcgc cggaccggag
cggagccccg ggcggctcgc tgctgccccc 4320tagcggggga gggacgtaat tacatccctg
ggggctttgg gggggggctg tccccgtgag 4380cggatccgcg gccccgtatc ccccaggtgt
ctgcaggctc aaagagcagc gagaagcgtt 4440cagaggaaag cgatcccgtg ccaccttccc
cgtgcccggg ctgtccccgc acgctgccgg 4500ctcggggatg cggggggagc gccggaccgg
agcggagccc cgggcggctc gctgctgccc 4560cctagcgggg gagggacgta attacatccc
tgggggcttt gggggggggc tgtccccgtg 4620agcggatccg cggccccgta tcccccaggt
gtctgcaggc tcaaagagca gcgagaagcg 4680ttcagaggaa agcgatcccg tgccaccttc
cccgtgcccg ggctgtcccc gcacgctgcc 4740ggctcgggga tgcgggggga gcgccggacc
ggagcggagc cccgggcggc tcgctgctgc 4800cccctagcgg gggagggacg taattacatc
cctgggggct ttgggggggg gctgtccccg 4860tgagcggatc cgcggggctg caggaattcg
taatcatggt catagctgtt tcctgtgtga 4920aattgttatc cgctcacaat tccacacaac
atacgagccg gaagcataaa gtgtaaagcc 4980tggggtgcct aatgagtgag ctaactcaca
ttaattgcgt tgcgctcact gcccgctttc 5040cagtcgggaa acctgtcgtg ccagctgcat
taatgaatcg gccaacgcgc ggggagaggc 5100ggtttgcgta ttgggcgctc ttccgcttcc
tcgctcactg actcgctgcg ctcggtcgtt 5160cggctgcggc gagcggtatc agctcactca
aaggcggtaa tacggttatc cacagaatca 5220ggggataacg caggaaagaa catgtgagca
aaaggccagc aaaaggccag gaaccgtaaa 5280aaggccgcgt tgctggcgtt tttccatagg
ctccgccccc ctgacgagca tcacaaaaat 5340cgacgctcaa gtcagaggtg gcgaaacccg
acaggactat aaagatacca ggcgtttccc 5400cctggaagct ccctcgtgcg ctctcctgtt
ccgaccctgc cgcttaccgg atacctgtcc 5460gcctttctcc cttcgggaag cgtggcgctt
tctcatagct cacgctgtag gtatctcagt 5520tcggtgtagg tcgttcgctc caagctgggc
tgtgtgcacg aaccccccgt tcagcccgac 5580cgctgcgcct tatccggtaa ctatcgtctt
gagtccaacc cggtaagaca cgacttatcg 5640ccactggcag cagccactgg taacaggatt
agcagagcga ggtatgtagg cggtgctaca 5700gagttcttga agtggtggcc taactacggc
tacactagaa ggacagtatt tggtatctgc 5760gctctgctga agccagttac cttcggaaaa
agagttggta gctcttgatc cggcaaacaa 5820accaccgctg gtagcggtgg tttttttgtt
tgcaagcagc agattacgcg cagaaaaaaa 5880ggatctcaag aagatccttt gatcttttct
acggggtctg acgctcagtg gaacgaaaac 5940tcacgttaag ggattttggt catgagatta
tcaaaaagga tcttcaccta gatcctttta 6000aattaaaaat gaagttttaa atcaatctaa
agtatatatg agtaaacttg gtctgacagt 6060taccaatgct taatcagtga ggcacctatc
tcagcgatct gtctatttcg ttcatccata 6120gttgcctgac tccccgtcgt gtagataact
acgatacggg agggcttacc atctggcccc 6180agtgctgcaa tgataccgcg agacccacgc
tcaccggctc cagatttatc agcaataaac 6240cagccagccg gaagggccga gcgcagaagt
ggtcctgcaa ctttatccgc ctccatccag 6300tctattaatt gttgccggga agctagagta
agtagttcgc cagttaatag tttgcgcaac 6360gttgttgcca ttgctacagg catcgtggtg
tcacgctcgt cgtttggtat ggcttcattc 6420agctccggtt cccaacgatc aaggcgagtt
acatgatccc ccatgttgtg caaaaaagcg 6480gttagctcct tcggtcctcc gatcgttgtc
agaagtaagt tggccgcagt gttatcactc 6540atggttatgg cagcactgca taattctctt
actgtcatgc catccgtaag atgcttttct 6600gtgactggtg agtactcaac caagtcattc
tgagaatagt gtatgcggcg accgagttgc 6660tcttgcccgg cgtcaatacg ggataatacc
gcgccacata gcagaacttt aaaagtgctc 6720atcattggaa aacgttcttc ggggcgaaaa
ctctcaagga tcttaccgct gttgagatcc 6780agttcgatgt aacccactcg tgcacccaac
tgatcttcag catcttttac tttcaccagc 6840gtttctgggt gagcaaaaac aggaaggcaa
aatgccgcaa aaaagggaat aagggcgaca 6900cggaaatgtt gaatactcat actcttcctt
tttcaatatt attgaagcat ttatcagggt 6960tattgtctca tgagcggata catatttgaa
tgtatttaga aaaataaaca aataggggtt 7020ccgcgcacat ttccccgaaa agtgccacct
gacgtagtta acaaaaaaaa gcccgccgaa 7080gcgggcttta ttaccaagcg aagcgccatt
cgccattcag gctgcgcaac tgttgggaag 7140ggcgatcggt gcgggcctct tcgctattac
gccagctggc gaaaggggga tgtgctgcaa 7200ggcgattaag ttgggtaacg ccagggtttt
cccagtcacg acgttgtaaa acgacggcca 7260gtccgtaata cgactcactt aaggccttga
ctagagggtc gacggtatac agacatgata 7320agatacattg atgagtttgg acaaaccaca
actagaatgc agtgaaaaaa atgctttatt 7380tgtgaaattt gtgatgctat tgctttattt
gtaaccatta taagctgcaa taaacaagtt 7440ggggtgggcg aagaactcca gcatgagatc
cccgcgctgg aggatcatcc agccggcgtc 7500ccggaaaacg attccgaagc ccaacctttc
atagaaggcg gcggtggaat cgaaatctcg 7560tagcacgtgt cagtcctgct cctcggccac
gaagtgcacg 7600
SEQ ID NO: 116; Length: 7631; Type: DNA; Organism: Artificial Sequence; Other information: p18attBZeo5'6XHS4eGFP Plasmid; Sequence: 116
cagttgccgg ccgggtcgcg cagggcgaac
tcccgccccc acggctgctc gccgatctcg 60gtcatggccg gcccggaggc gtcccggaag
ttcgtggaca cgacctccga ccactcggcg 120tacagctcgt ccaggccgcg cacccacacc
caggccaggg tgttgtccgg caccacctgg 180tcctggaccg cgctgatgaa cagggtcacg
tcgtcccgga ccacaccggc gaagtcgtcc 240tccacgaagt cccgggagaa cccgagccgg
tcggtccaga actcgaccgc tccggcgacg 300tcgcgcgcgg tgagcaccgg aacggcactg
gtcaacttgg ccatggatcc agatttcgct 360caagttagta taaaaaagca ggcttcaatc
ctgcagagaa gcttgatatc gaattcctgc 420agccccgcgg atccgctcac ggggacagcc
cccccccaaa gcccccaggg atgtaattac 480gtccctcccc cgctaggggg cagcagcgag
ccgcccgggg ctccgctccg gtccggcgct 540ccccccgcat ccccgagccg gcagcgtgcg
gggacagccc gggcacgggg aaggtggcac 600gggatcgctt tcctctgaac gcttctcgct
gctctttgag cctgcagaca cctgggggat 660acggggccgc ggatccgctc acggggacag
ccccccccca aagcccccag ggatgtaatt 720acgtccctcc cccgctaggg ggcagcagcg
agccgcccgg ggctccgctc cggtccggcg 780ctccccccgc atccccgagc cggcagcgtg
cggggacagc ccgggcacgg ggaaggtggc 840acgggatcgc tttcctctga acgcttctcg
ctgctctttg agcctgcaga cacctggggg 900atacggggcc gcggatccgc tcacggggac
agcccccccc caaagccccc agggatgtaa 960ttacgtccct cccccgctag ggggcagcag
cgagccgccc ggggctccgc tccggtccgg 1020cgctcccccc gcatccccga gccggcagcg
tgcggggaca gcccgggcac ggggaaggtg 1080gcacgggatc gctttcctct gaacgcttct
cgctgctctt tgagcctgca gacacctggg 1140ggatacgggg ccgcggatcc gctcacgggg
acagcccccc cccaaagccc ccagggatgt 1200aattacgtcc ctcccccgct agggggcagc
agcgagccgc ccggggctcc gctccggtcc 1260ggcgctcccc ccgcatcccc gagccggcag
cgtgcgggga cagcccgggc acggggaagg 1320tggcacggga tcgctttcct ctgaacgctt
ctcgctgctc tttgagcctg cagacacctg 1380ggggatacgg ggccgcggat ccgctcacgg
ggacagcccc cccccaaagc ccccagggat 1440gtaattacgt ccctcccccg ctagggggca
gcagcgagcc gcccggggct ccgctccggt 1500ccggcgctcc ccccgcatcc ccgagccggc
agcgtgcggg gacagcccgg gcacggggaa 1560ggtggcacgg gatcgctttc ctctgaacgc
ttctcgctgc tctttgagcc tgcagacacc 1620tgggggatac ggggccgcgg atccgctcac
ggggacagcc cccccccaaa gcccccaggg 1680atgtaattac gtccctcccc cgctaggggg
cagcagcgag ccgcccgggg ctccgctccg 1740gtccggcgct ccccccgcat ccccgagccg
gcagcgtgcg gggacagccc gggcacgggg 1800aaggtggcac gggatcgctt tcctctgaac
gcttctcgct gctctttgag cctgcagaca 1860cctgggggat acggggcggg ggatccacta
gttattaata gtaatcaatt acggggtcat 1920tagttcatag cccatatatg gagttccgcg
ttacataact tacggtaaat ggcccgcctg 1980gctgaccgcc caacgacccc cgcccattga
cgtcaataat gacgtatgtt cccatagtaa 2040cgccaatagg gactttccat tgacgtcaat
gggtggacta tttacggtaa actgcccact 2100tggcagtaca tcaagtgtat catatgccaa
gtacgccccc tattgacgtc aatgacggta 2160aatggcccgc ctggcattat gcccagtaca
tgaccttatg ggactttcct acttggcagt 2220acatctacgt attagtcatc gctattacca
tgggtcgagg tgagccccac gttctgcttc 2280actctcccca tctccccccc ctccccaccc
ccaattttgt atttatttat tttttaatta 2340ttttgtgcag cgatgggggc gggggggggg
ggggcgcgcg ccaggcgggg cggggcgggg 2400cgaggggcgg ggcggggcga ggcggagagg
tgcggcggca gccaatcaga gcggcgcgct 2460ccgaaagttt ccttttatgg cgaggcggcg
gcggcggcgg ccctataaaa agcgaagcgc 2520gcggcgggcg ggagtcgctg cgttgccttc
gccccgtgcc ccgctccgcg ccgcctcgcg 2580ccgcccgccc cggctctgac tgaccgcgtt
actcccacag gtgagcgggc gggacggccc 2640ttctcctccg ggctgtaatt agcgcttggt
ttaatgacgg ctcgtttctt ttctgtggct 2700gcgtgaaagc cttaaagggc tccgggaggg
ccctttgtgc gggggggagc ggctcggggg 2760gtgcgtgcgt gtgtgtgtgc gtggggagcg
ccgcgtgcgg cccgcgctgc ccggcggctg 2820tgagcgctgc gggcgcggcg cggggctttg
tgcgctccgc gtgtgcgcga ggggagcgcg 2880gccgggggcg gtgccccgcg gtgcgggggg
gctgcgaggg gaacaaaggc tgcgtgcggg 2940gtgtgtgcgt gggggggtga gcagggggtg
tgggcgcggc ggtcgggctg taaccccccc 3000ctgcaccccc ctccccgagt tgctgagcac
ggcccggctt cgggtgcggg gctccgtgcg 3060gggcgtggcg cggggctcgc cgtgccgggc
ggggggtggc ggcaggtggg ggtgccgggc 3120ggggcggggc cgcctcgggc cggggagggc
tcgggggagg ggcgcggcgg ccccggagcg 3180ccggcggctg tcgaggcgcg gcgagccgca
gccattgcct tttatggtaa tcgtgcgaga 3240gggcgcaggg acttcctttg tcccaaatct
ggcggagccg aaatctggga ggcgccgccg 3300caccccctct agcgggcgcg ggcgaagcgg
tgcggcgccg gcaggaagga aatgggcggg 3360gagggccttc gtgcgtcgcc gcgccgccgt
ccccttctcc atctccagcc tcggggctgc 3420cgcaggggga cggctgcctt cgggggggac
ggggcagggc ggggttcggc ttctggcgtg 3480tgaccggcgg ctctagagcc tctgctaacc
atgttcatgc cttcttcttt ttcctacagc 3540tcctgggcaa cgtgctggtt gttgtgctgt
ctcatcattt tggcaaagaa ttcgccacca 3600tggtgagcaa gggcgaggag ctgttcaccg
gggtggtgcc catcctggtc gagctggacg 3660gcgacgtaaa cggccacaag ttcagcgtgt
ccggcgaggg cgagggcgat gccacctacg 3720gcaagctgac cctgaagttc atctgcacca
ccggcaagct gcccgtgccc tggcccaccc 3780tcgtgaccac cctgacctac ggcgtgcagt
gcttcagccg ctaccccgac cacatgaagc 3840agcacgactt cttcaagtcc gccatgcccg
aaggctacgt ccaggagcgc accatcttct 3900tcaaggacga cggcaactac aagacccgcg
ccgaggtgaa gttcgagggc gacaccctgg 3960tgaaccgcat cgagctgaag ggcatcgact
tcaaggagga cggcaacatc ctggggcaca 4020agctggagta caactacaac agccacaacg
tctatatcat ggccgacaag cagaagaacg 4080gcatcaaggt gaacttcaag atccgccaca
acatcgagga cggcagcgtg cagctcgccg 4140accactacca gcagaacacc cccatcggcg
acggccccgt gctgctgccc gacaaccact 4200acctgagcac ccagtccgcc ctgagcaaag
accccaacga gaagcgcgat cacatggtcc 4260tgctggagtt cgtgaccgcc gccgggatca
ctctcggcat ggacgagctg tacaagtaag 4320aattcactcc tcaggtgcag gctgcctatc
agaaggtggt ggctggtgtg gccaatgccc 4380tggctcacaa ataccactga gatctttttc
cctctgccaa aaattatggg gacatcatga 4440agccccttga gcatctgact tctggctaat
aaaggaaatt tattttcatt gcaatagtgt 4500gttggaattt tttgtgtctc tcactcggaa
ggacatatgg gagggcaaat catttaaaac 4560atcagaatga gtatttggtt tagagtttgg
caacatatgc catatgctgg ctgccatgaa 4620caaaggtggc tataaagagg tcatcagtat
atgaaacagc cccctgctgt ccattcctta 4680ttccatagaa aagccttgac ttgaggttag
atttttttta tattttgttt tgtgttattt 4740ttttctttaa catccctaaa attttcctta
catgttttac tagccagatt tttcctcctc 4800tcctgactac tcccagtcat agctgtccct
cttctcttat gaagatccct cgacctgcag 4860cccaagcttg catgcctgca ggtcgactct
agaggatccc cgggtaccga gctcgaattc 4920gtaatcatgg tcatagctgt ttcctgtgtg
aaattgttat ccgctcacaa ttccacacaa 4980catacgagcc ggaagcataa agtgtaaagc
ctggggtgcc taatgagtga gctaactcac 5040attaattgcg ttgcgctcac tgcccgcttt
ccagtcggga aacctgtcgt gccagctgca 5100ttaatgaatc ggccaacgcg cggggagagg
cggtttgcgt attgggcgct cttccgcttc 5160ctcgctcact gactcgctgc gctcggtcgt
tcggctgcgg cgagcggtat cagctcactc 5220aaaggcggta atacggttat ccacagaatc
aggggataac gcaggaaaga acatgtgagc 5280aaaaggccag caaaaggcca ggaaccgtaa
aaaggccgcg ttgctggcgt ttttccatag 5340gctccgcccc cctgacgagc atcacaaaaa
tcgacgctca agtcagaggt ggcgaaaccc 5400gacaggacta taaagatacc aggcgtttcc
ccctggaagc tccctcgtgc gctctcctgt 5460tccgaccctg ccgcttaccg gatacctgtc
cgcctttctc ccttcgggaa gcgtggcgct 5520ttctcatagc tcacgctgta ggtatctcag
ttcggtgtag gtcgttcgct ccaagctggg 5580ctgtgtgcac gaaccccccg ttcagcccga
ccgctgcgcc ttatccggta actatcgtct 5640tgagtccaac ccggtaagac acgacttatc
gccactggca gcagccactg gtaacaggat 5700tagcagagcg aggtatgtag gcggtgctac
agagttcttg aagtggtggc ctaactacgg 5760ctacactaga aggacagtat ttggtatctg
cgctctgctg aagccagtta ccttcggaaa 5820aagagttggt agctcttgat ccggcaaaca
aaccaccgct ggtagcggtg gtttttttgt 5880ttgcaagcag cagattacgc gcagaaaaaa
aggatctcaa gaagatcctt tgatcttttc 5940tacggggtct gacgctcagt ggaacgaaaa
ctcacgttaa gggattttgg tcatgagatt 6000atcaaaaagg atcttcacct agatcctttt
aaattaaaaa tgaagtttta aatcaatcta 6060aagtatatat gagtaaactt ggtctgacag
ttaccaatgc ttaatcagtg aggcacctat 6120ctcagcgatc tgtctatttc gttcatccat
agttgcctga ctccccgtcg tgtagataac 6180tacgatacgg gagggcttac catctggccc
cagtgctgca atgataccgc gagacccacg 6240ctcaccggct ccagatttat cagcaataaa
ccagccagcc ggaagggccg agcgcagaag 6300tggtcctgca actttatccg cctccatcca
gtctattaat tgttgccggg aagctagagt 6360aagtagttcg ccagttaata gtttgcgcaa
cgttgttgcc attgctacag gcatcgtggt 6420gtcacgctcg tcgtttggta tggcttcatt
cagctccggt tcccaacgat caaggcgagt 6480tacatgatcc cccatgttgt gcaaaaaagc
ggttagctcc ttcggtcctc cgatcgttgt 6540cagaagtaag ttggccgcag tgttatcact
catggttatg gcagcactgc ataattctct 6600tactgtcatg ccatccgtaa gatgcttttc
tgtgactggt gagtactcaa ccaagtcatt 6660ctgagaatag tgtatgcggc gaccgagttg
ctcttgcccg gcgtcaatac gggataatac 6720cgcgccacat agcagaactt taaaagtgct
catcattgga aaacgttctt cggggcgaaa 6780actctcaagg atcttaccgc tgttgagatc
cagttcgatg taacccactc gtgcacccaa 6840ctgatcttca gcatctttta ctttcaccag
cgtttctggg tgagcaaaaa caggaaggca 6900aaatgccgca aaaaagggaa taagggcgac
acggaaatgt tgaatactca tactcttcct 6960ttttcaatat tattgaagca tttatcaggg
ttattgtctc atgagcggat acatatttga 7020atgtatttag aaaaataaac aaataggggt
tccgcgcaca tttccccgaa aagtgccacc 7080tgacgtagtt aacaaaaaaa agcccgccga
agcgggcttt attaccaagc gaagcgccat 7140tcgccattca ggctgcgcaa ctgttgggaa
gggcgatcgg tgcgggcctc ttcgctatta 7200cgccagctgg cgaaaggggg atgtgctgca
aggcgattaa gttgggtaac gccagggttt 7260tcccagtcac gacgttgtaa aacgacggcc
agtccgtaat acgactcact taaggccttg 7320actagagggt cgacggtata cagacatgat
aagatacatt gatgagtttg gacaaaccac 7380aactagaatg cagtgaaaaa aatgctttat
ttgtgaaatt tgtgatgcta ttgctttatt 7440tgtaaccatt ataagctgca ataaacaagt
tggggtgggc gaagaactcc agcatgagat 7500ccccgcgctg gaggatcatc cagccggcgt
cccggaaaac gattccgaag cccaaccttt 7560catagaaggc ggcggtggaa tcgaaatctc
gtagcacgtg tcagtcctgc tcctcggcca 7620cgaagtgcac g 7631
SEQ ID NO: 117; Length: 4615; Type: DNA; Organism: Artificial Sequence; Other information: p18attBZeo6XHS4 Plasmid; Sequence: 117
cagttgccgg ccgggtcgcg cagggcgaac
tcccgccccc acggctgctc gccgatctcg 60gtcatggccg gcccggaggc gtcccggaag
ttcgtggaca cgacctccga ccactcggcg 120tacagctcgt ccaggccgcg cacccacacc
caggccaggg tgttgtccgg caccacctgg 180tcctggaccg cgctgatgaa cagggtcacg
tcgtcccgga ccacaccggc gaagtcgtcc 240tccacgaagt cccgggagaa cccgagccgg
tcggtccaga actcgaccgc tccggcgacg 300tcgcgcgcgg tgagcaccgg aacggcactg
gtcaacttgg ccatggatcc agatttcgct 360caagttagta taaaaaagca ggcttcaatc
ctgcagagaa gcttgcatgc ctgcaggtcg 420actctagtgg atcccccgcc ccgtatcccc
caggtgtctg caggctcaaa gagcagcgag 480aagcgttcag aggaaagcga tcccgtgcca
ccttccccgt gcccgggctg tccccgcacg 540ctgccggctc ggggatgcgg ggggagcgcc
ggaccggagc ggagccccgg gcggctcgct 600gctgccccct agcgggggag ggacgtaatt
acatccctgg gggctttggg ggggggctgt 660ccccgtgagc ggatccgcgg ccccgtatcc
cccaggtgtc tgcaggctca aagagcagcg 720agaagcgttc agaggaaagc gatcccgtgc
caccttcccc gtgcccgggc tgtccccgca 780cgctgccggc tcggggatgc ggggggagcg
ccggaccgga gcggagcccc gggcggctcg 840ctgctgcccc ctagcggggg agggacgtaa
ttacatccct gggggctttg ggggggggct 900gtccccgtga gcggatccgc ggccccgtat
cccccaggtg tctgcaggct caaagagcag 960cgagaagcgt tcagaggaaa gcgatcccgt
gccaccttcc ccgtgcccgg gctgtccccg 1020cacgctgccg gctcggggat gcggggggag
cgccggaccg gagcggagcc ccgggcggct 1080cgctgctgcc ccctagcggg ggagggacgt
aattacatcc ctgggggctt tggggggggg 1140ctgtccccgt gagcggatcc gcggccccgt
atcccccagg tgtctgcagg ctcaaagagc 1200agcgagaagc gttcagagga aagcgatccc
gtgccacctt ccccgtgccc gggctgtccc 1260cgcacgctgc cggctcgggg atgcgggggg
agcgccggac cggagcggag ccccgggcgg 1320ctcgctgctg ccccctagcg ggggagggac
gtaattacat ccctgggggc tttggggggg 1380ggctgtcccc gtgagcggat ccgcggcccc
gtatccccca ggtgtctgca ggctcaaaga 1440gcagcgagaa gcgttcagag gaaagcgatc
ccgtgccacc ttccccgtgc ccgggctgtc 1500cccgcacgct gccggctcgg ggatgcgggg
ggagcgccgg accggagcgg agccccgggc 1560ggctcgctgc tgccccctag cgggggaggg
acgtaattac atccctgggg gctttggggg 1620ggggctgtcc ccgtgagcgg atccgcggcc
ccgtatcccc caggtgtctg caggctcaaa 1680gagcagcgag aagcgttcag aggaaagcga
tcccgtgcca ccttccccgt gcccgggctg 1740tccccgcacg ctgccggctc ggggatgcgg
ggggagcgcc ggaccggagc ggagccccgg 1800gcggctcgct gctgccccct agcgggggag
ggacgtaatt acatccctgg gggctttggg 1860ggggggctgt ccccgtgagc ggatccgcgg
ggctgcagga attcgtaatc atggtcatag 1920ctgtttcctg tgtgaaattg ttatccgctc
acaattccac acaacatacg agccggaagc 1980ataaagtgta aagcctgggg tgcctaatga
gtgagctaac tcacattaat tgcgttgcgc 2040tcactgcccg ctttccagtc gggaaacctg
tcgtgccagc tgcattaatg aatcggccaa 2100cgcgcgggga gaggcggttt gcgtattggg
cgctcttccg cttcctcgct cactgactcg 2160ctgcgctcgg tcgttcggct gcggcgagcg
gtatcagctc actcaaaggc ggtaatacgg 2220ttatccacag aatcagggga taacgcagga
aagaacatgt gagcaaaagg ccagcaaaag 2280gccaggaacc gtaaaaaggc cgcgttgctg
gcgtttttcc ataggctccg cccccctgac 2340gagcatcaca aaaatcgacg ctcaagtcag
aggtggcgaa acccgacagg actataaaga 2400taccaggcgt ttccccctgg aagctccctc
gtgcgctctc ctgttccgac cctgccgctt 2460accggatacc tgtccgcctt tctcccttcg
ggaagcgtgg cgctttctca tagctcacgc 2520tgtaggtatc tcagttcggt gtaggtcgtt
cgctccaagc tgggctgtgt gcacgaaccc 2580cccgttcagc ccgaccgctg cgccttatcc
ggtaactatc gtcttgagtc caacccggta 2640agacacgact tatcgccact ggcagcagcc
actggtaaca ggattagcag agcgaggtat 2700gtaggcggtg ctacagagtt cttgaagtgg
tggcctaact acggctacac tagaaggaca 2760gtatttggta tctgcgctct gctgaagcca
gttaccttcg gaaaaagagt tggtagctct 2820tgatccggca aacaaaccac cgctggtagc
ggtggttttt ttgtttgcaa gcagcagatt 2880acgcgcagaa aaaaaggatc tcaagaagat
cctttgatct tttctacggg gtctgacgct 2940cagtggaacg aaaactcacg ttaagggatt
ttggtcatga gattatcaaa aaggatcttc 3000acctagatcc ttttaaatta aaaatgaagt
tttaaatcaa tctaaagtat atatgagtaa 3060acttggtctg acagttacca atgcttaatc
agtgaggcac ctatctcagc gatctgtcta 3120tttcgttcat ccatagttgc ctgactcccc
gtcgtgtaga taactacgat acgggagggc 3180ttaccatctg gccccagtgc tgcaatgata
ccgcgagacc cacgctcacc ggctccagat 3240ttatcagcaa taaaccagcc agccggaagg
gccgagcgca gaagtggtcc tgcaacttta 3300tccgcctcca tccagtctat taattgttgc
cgggaagcta gagtaagtag ttcgccagtt 3360aatagtttgc gcaacgttgt tgccattgct
acaggcatcg tggtgtcacg ctcgtcgttt 3420ggtatggctt cattcagctc cggttcccaa
cgatcaaggc gagttacatg atcccccatg 3480ttgtgcaaaa aagcggttag ctccttcggt
cctccgatcg ttgtcagaag taagttggcc 3540gcagtgttat cactcatggt tatggcagca
ctgcataatt ctcttactgt catgccatcc 3600gtaagatgct tttctgtgac tggtgagtac
tcaaccaagt cattctgaga atagtgtatg 3660cggcgaccga gttgctcttg cccggcgtca
atacgggata ataccgcgcc acatagcaga 3720actttaaaag tgctcatcat tggaaaacgt
tcttcggggc gaaaactctc aaggatctta 3780ccgctgttga gatccagttc gatgtaaccc
actcgtgcac ccaactgatc ttcagcatct 3840tttactttca ccagcgtttc tgggtgagca
aaaacaggaa ggcaaaatgc cgcaaaaaag 3900ggaataaggg cgacacggaa atgttgaata
ctcatactct tcctttttca atattattga 3960agcatttatc agggttattg tctcatgagc
ggatacatat ttgaatgtat ttagaaaaat 4020aaacaaatag gggttccgcg cacatttccc
cgaaaagtgc cacctgacgt agttaacaaa 4080aaaaagcccg ccgaagcggg ctttattacc
aagcgaagcg ccattcgcca ttcaggctgc 4140gcaactgttg ggaagggcga tcggtgcggg
cctcttcgct attacgccag ctggcgaaag 4200ggggatgtgc tgcaaggcga ttaagttggg
taacgccagg gttttcccag tcacgacgtt 4260gtaaaacgac ggccagtccg taatacgact
cacttaaggc cttgactaga gggtcgacgg 4320tatacagaca tgataagata cattgatgag
tttggacaaa ccacaactag aatgcagtga 4380aaaaaatgct ttatttgtga aatttgtgat
gctattgctt tatttgtaac cattataagc 4440tgcaataaac aagttggggt gggcgaagaa
ctccagcatg agatccccgc gctggaggat 4500catccagccg gcgtcccgga aaacgattcc
gaagcccaac ctttcataga aggcggcggt 4560ggaatcgaaa tctcgtagca cgtgtcagtc
ctgctcctcg gccacgaagt gcacg 4615
SEQ ID NO: 118; Length: 17384; Type: DNA; Organism: Artificial Sequence; Other information: pFK161 Plasmid; Sequence: 118
gcgcacgagg gagcttccag ggggaaacgc ctggtatctt
tatagtcctg tcggggtttc 60gccacctctg acttgagcgt cgatttttgt gatgctcgtc
aggggggcgg agcctatgga 120aaaacgccag caacgcggcc tttttacggt tcctggcctt
ttgctggcct tttgctcaca 180tgttctttcc tgcgttatcc cctgattctg tggataaccg
tattaccgcc tttgagtgag 240ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga
gtcagtgagc gaggaagcgg 300aagagcgctg acttccgcgt ttccagactt tacgaaacac
ggaaaccgaa gaccattcat 360gttgttgctc aggtcgcaga cgttttgcag cagcagtcgc
ttcacgttcg ctcgcgtatc 420ggtgattcat tctgctaacc agtaaggcaa ccccgccagc
ctagccgggt cctcaacgac 480aggagcacga tcatgcgcac ccgtcagatc cagacatgat
aagatacatt gatgagtttg 540gacaaaccac aactagaatg cagtgaaaaa aatgctttat
ttgtgaaatt tgtgatgcta 600ttgctttatt tgtaaccatt ataagctgca ataaacaagt
taacaacaac aattgcattc 660attttatgtt tcaggttcag ggggaggtgt gggaggtttt
ttaaagcaag taaaacctct 720acaaatgtgg tatggctgat tatgatctct agtcaaggca
ctatacatca aatattcctt 780attaacccct ttacaaatta aaaagctaaa ggtacacaat
ttttgagcat agttattaat 840agcagacact ctatgcctgt gtggagtaag aaaaaacagt
atgttatgat tataactgtt 900atgcctactt ataaaggtta cagaatattt ttccataatt
ttcttgtata gcagtgcagc 960tttttccttt gtggtgtaaa tagcaaagca agcaagagtt
ctattactaa acacagcatg 1020actcaaaaaa cttagcaatt ctgaaggaaa gtccttgggg
tcttctacct ttctcttctt 1080ttttggagga gtagaatgtt gagagtcagc agtagcctca
tcatcactag atggcatttc 1140ttctgagcaa aacaggtttt cctcattaaa ggcattccac
cactgctccc attcatcagt 1200tccataggtt ggaatctaaa atacacaaac aattagaatc
agtagtttaa cacattatac 1260acttaaaaat tttatattta ccttagagct ttaaatctct
gtaggtagtt tgtccaatta 1320tgtcacacca cagaagtaag gttccttcac aaagatccgg
accaaagcgg ccatcgtgcc 1380tccccactcc tgcagttcgg gggcatggat gcgcggatag
ccgctgctgg tttcctggat 1440gccgacggat ttgcactgcc ggtagaactc gcgaggtcgt
ccagcctcag gcagcagctg 1500aaccaactcg cgaggggatc gagcccgggg tgggcgaaga
actccagcat gagatccccg 1560cgctggagga tcatccagcc ggcgtcccgg aaaacgattc
cgaagcccaa cctttcatag 1620aaggcggcgg tggaatcgaa atctcgtgat ggcaggttgg
gcgtcgcttg gtcggtcatt 1680tcgaacccca gagtcccgct cagaagaact cgtcaagaag
gcgatagaag gcgatgcgct 1740gcgaatcggg agcggcgata ccgtaaagca cgaggaagcg
gtcagcccat tcgccgccaa 1800gctcttcagc aatatcacgg gtagccaacg ctatgtcctg
atagcggtcc gccacaccca 1860gccggccaca gtcgatgaat ccagaaaagc ggccattttc
caccatgata ttcggcaagc 1920aggcatcgcc atgggtcacg acgagatcct cgccgtcggg
atgcgcgcct tgagcctggc 1980gaacagttcg gctggcgcga gcccctgatg ctcttcgtcc
agatcatcct gatcgacaag 2040accggcttcc atccgagtac gtgctcgctc gatgcgatgt
ttcgcttggt ggtcgaatgg 2100gcaggtagcc ggatcaagcg tatgcagccg ccgcattgca
tcagccatga tggatacttt 2160ctcggcagga gcaaggtgag atgacaggag atcctgcccc
ggcacttcgc ccaatagcag 2220ccagtccctt cccgcttcag tgacaacgtc gagcacagct
gcgcaaggaa cgcccgtcgt 2280ggccagccac gatagccgcg ctgcctcgtc ctgcagttca
ttcagggcac cggacaggtc 2340ggtcttgaca aaaagaaccg ggcgcccctg cgctgacagc
cggaacacgg cggcatcaga 2400gcagccgatt gtctgttgtg cccagtcata gccgaatagc
ctctccaccc aagcggccgg 2460agaacctgcg tgcaatccat cttgttcaat catgcgaaac
gatcctcatc ctgtctcttg 2520atcagatctt gatcccctgc gccatcagat ccttggcggc
aagaaagcca tccagtttac 2580tttgcagggc ttcccaacct taccagaggg cgccccagct
ggcaattccg gttcgcttgc 2640tgtccataaa accgcccagt ctagctatcg ccatgtaagc
ccactgcaag ctacctgctt 2700tctctttgcg cttgcgtttt cccttgtcca gatagcccag
tagctgacat tcatccgggg 2760tcagcaccgt ttctgcggac tggctttcta cgtgttccgc
ttcctttagc agcccttgcg 2820ccctgagtgc ttgcggcagc gtgaaagctt tttgcaaaag
cctaggcctc caaaaaagcc 2880tcctcactac ttctggaata gctcagaggc cgaggcggcc
taaataaaaa aaattagtca 2940gccatggggc ggagaatggg cggaactggg cggagttagg
ggcgggatgg gcggagttag 3000gggcgggact atggttgctg actaattgag atgcatgctt
tgcatacttc tgcctgctgg 3060ggagcctggg gactttccac acctggttgc tgactaattg
agatgcatgc tttgcatact 3120tctgcctgct ggggagcctg gggactttcc acaccctaac
tgacacacat tccacagccg 3180gatctgcagg acccaacgct gcccgagatg cgccgcgtgc
ggctgctgga gatggcggac 3240gcgatggata tgttctgcca agggttggtt tgcgcattca
cagttctccg caagaattga 3300ttggctccaa ttcttggagt ggtgaatccg ttagcgaggt
gccgccggct tccattcagg 3360tcgaggtggc ccggctccat gcaccgcgac gcaacgcggg
gaggcagaca aggtataggg 3420cggcgcctac aatccatgcc aacccgttcc atgtgctcgc
cgaggcgcat aaatcgccgt 3480gacgatcagc ggtccaatga tcgaagttag gctggtaaga
gccgcgagcg atccttgaag 3540ctgtccctga tggtcgtcat ctacctgcct ggacagcatg
gcctgcaacg cggcatcccg 3600atgccgccgg aagcgagaag aatcataatg gggaaggcca
tccagcctcg cgtcgcgaac 3660gccagcaaga cgtagcccag cgcgtcgggc cgccatgccg
gcgataatgg cctgcttctc 3720gccgaaacgt ttggtggcgg gaccagtgac gaaggcttga
gcgagggcgt gcaagattcc 3780gaataccgca agcgacaggc cgatcatcgt cgcgctccag
cgaaagcggt cctcgccgaa 3840aatgacccag agcgctgccg gcacctgtcc tacgagttgc
atgataaaga agacagtcat 3900aagtgcggcg acgatagtca tgccccgcgc ccaccggaag
gagctgactg ggttgaaggc 3960tctcaagggc atcggtcgac gctctccctt atgcgactcc
tgcattagga agcagcccag 4020tagtaggttg aggccgttga gcaccgccgc cgcaaggaat
ggtgcatgca aggagatggc 4080gcccaacagt cccccggcca cgggcctgcc accataccca
cgccgaaaca agcgctcatg 4140agcccgaagt ggcgagcccg atcttcccca tcggtgatgt
cggcgatata ggcgccagca 4200accgcacctg tggcgccggt gatgccggcc acgatgcgtc
cggcgtagag gatcttggca 4260gtcacagcat gcgcatatcc atgcttcgac catgcgctca
caaagtaggt gaatgcgcaa 4320tgtagtaccc acatcgtcat cgctttccac tgctctcgcg
aataaagatg gaaaatcaat 4380ctcatggtaa tagtccatga aaatccttgt attcataaat
cctccaggta gctatatgca 4440aattgaaaca aaagagatgg tgatctttct aagagatgat
ggaatctccc ttcagtatcc 4500cgatggtcaa tgcgctggat atgggataga tgggaatatg
ctgattttta tgggacagag 4560ttgcgaactg ttcccaacta aaatcatttt gcacgatcag
cgcactacga actttaccca 4620caaatagtca ggtaatgaat cctgatataa agacaggttg
ataaatcagt cttctacgcg 4680catcgcacgc gcacaccgta gaaagtcttt cagttgtgag
cctgggcaaa ccgttaactt 4740tcggcggctt tgctgtgcga caggctcacg tctaaaagga
aataaatcat gggtcataaa 4800attatcacgt tgtccggcgc ggcgacggat gttctgtatg
cgctgttttt ccgtggcgcg 4860ttgctgtctg gtgatctgcc ttctaaatct ggcacagccg
aattgcgcga gcttggtttt 4920gctgaaacca gacacacagc aactgaatac cagaaagaaa
atcactttac ctttctgaca 4980tcagaagggc agaaatttgc cgttgaacac ctggtcaata
cgcgttttgg tgagcagcaa 5040tattgcgctt cgatgacgct tggcgttgag attgatacct
ctgctgcaca aaaggcaatc 5100gacgagctgg accagcgcat tcgtgacacc gtctccttcg
aacttattcg caatggagtg 5160tcattcatca aggacgccgc tatcgcaaat ggtgctatcc
acgcagcggc aatcgaaaca 5220cctcagccgg tgaccaatat ctacaacatc agccttggta
tccagcgtga tgagccagcg 5280cagaacaagg taaccgtcag tgccgataag ttcaaagtta
aacctggtgt tgataccaac 5340attgaaacgt tgatcgaaaa cgcgctgaaa aacgctgctg
aatgtgcggc gctggatgtc 5400acaaagcaaa tggcagcaga caagaaagcg atggatgaac
tggcttccta tgtccgcacg 5460gccatcatga tggaatgttt ccccggtggt gttatctggc
agcagtgccg tcgatagtat 5520gcaattgata attattatca tttgcgggtc ctttccggcg
atccgccttg ttacggggcg 5580gcgacctcgc gggttttcgc tatttatgaa aattttccgg
tttaaggcgt ttccgttctt 5640cttcgtcata acttaatgtt tttatttaaa ataccctctg
aaaagaaagg aaacgacagg 5700tgctgaaagc gagctttttg gcctctgtcg tttcctttct
ctgtttttgt ccgtggaatg 5760aacaatggaa gtcaacaaaa agcagctggc tgacattttc
ggtgcgagta tccgtaccat 5820tcagaactgg caggaacagg gaatgcccgt tctgcgaggc
ggtggcaagg gtaatgaggt 5880gctttatgac tctgccgccg tcataaaatg gtatgccgaa
agggatgctg aaattgagaa 5940cgaaaagctg cgccgggagg ttgaagaact gcggcaggcc
agcgaggcag atccacagga 6000cgggtgtggt cgccatgatc gcgtagtcga tagtggctcc
aagtagcgaa gcgagcagga 6060ctgggcggcg gcaaagcggt cggacagtgc tccgagaacg
ggtgcgcata gaaattgcat 6120caacgcatat agcgctagca gcacgccata gtgactggcg
atgctgtcgg aatggacgat 6180atcccgcaag aggcccggca gtaccggcat aaccaagcct
atgcctacag catccagggt 6240gacggtgccg aggatgacga tgagcgcatt gttagatttc
atacacggtg cctgactgcg 6300ttagcaattt aactgtgata aactaccgca ttaaagctta
tcgatgataa gcggtcaaac 6360atgagaattc gcggccgctc ttctcgttct gccagcgggc
cctcgtctct ccaccccatc 6420cgtctgccgg tggtgtgtgg aaggcagggg tgcggctctc
cggcccgacg ctgccccgcg 6480cgcacttttc tcagtggttc gcgtggtcct tgtggatgtg
tgaggcgccc ggttgtgccc 6540tcacgtgttt cactttggtc gtgtctcgct tgaccatgtt
cccagagtcg gtggatgtgg 6600ccggtggcgt tgcataccct tcccgtctgg tgtgtgcacg
cgctgtttct tgtaagcgtc 6660gaggtgctcc tggagcgttc caggtttgtc tcctaggtgc
ctgcttctga gctggtggtg 6720gcgctcccca ttccctggtg tgcctccggt gctccgtctg
gctgtgtgcc ttcccgtttg 6780tgtctgagaa gcccgtgaga ggggggtcga ggagagaagg
aggggcaaga ccccccttct 6840tcgtcgggtg aggcgcccac cccgcgacta gtacgcctgt
gcgtagggct ggtgctgagc 6900ggtcgcggct ggggttggaa agtttctcga gagactcatt
gctttcccgt ggggagcttt 6960gagaggcctg gctttcgggg gggaccggtt gcagggtctc
ccctgtccgc ggatgctcag 7020aatgcccttg gaagagaacc ttcctgttgc cgcagacccc
cccgcgcggt cgcccgcgtg 7080ttggtcttct ggtttccctg tgtgctcgtc gcatgcatcc
tctctcggtg gccggggctc 7140gtcggggttt tgggtccgtc ccgccctcag tgagaaagtt
tccttctcta gctatcttcc 7200ggaaagggtg cgggcttctt acggtctcga ggggtctctc
ccgaatggtc ccctggaggg 7260ctcgccccct gaccgcctcc cgcgcgcgca gcgtttgctc
tctcgtctac cgcggcccgc 7320ggcctccccg ctccgagttc ggggagggat cacgcggggc
agagcctgtc tgtcgtcctg 7380ccgttgctgc ggagcatgtg gctcggcttg tgtggttggt
ggctggggag agggctccgt 7440gcacaccccc gcgtgcgcgt actttcctcc cctcctgagg
gccgccgtgc ggacggggtg 7500tgggtaggcg acggtgggct cccgggtccc cacccgtctt
cccgtgcctc acccgtgcct 7560tccgtcgcgt gcgtccctct cgctcgcgtc cacgactttg
gccgctcccg cgacggcggc 7620ctgcgccgcg cgtggtgcgt gctgtgtgct tctcgggctg
tgtggttgtg tcgcctcgcc 7680ccccccttcc cgcggcagcg ttcccacggc tggcgaaatc
gcgggagtcc tccttcccct 7740cctcggggtc gagagggtcc gtgtctggcg ttgattgatc
tcgctctcgg ggacgggacc 7800gttctgtggg agaacggctg ttggccgcgt ccggcgcgac
gtcggacgtg gggacccact 7860gccgctcggg ggtcttcgtc ggtaggcatc ggtgtgtcgg
catcggtctc tctctcgtgt 7920cggtgtcgcc tcctcgggct cccggggggc cgtcgtgttt
cgggtcggct cggcgctgca 7980ggtgtggtgg gactgctcag gggagtggtg cagtgtgatt
cccgccggtt ttgcctcgcg 8040tgccctgacc ggtccgacgc ccgagcggtc tctcggtccc
ttgtgaggac ccccttccgg 8100gaggggcccg tttcggccgc ccttgccgtc gtcgccggcc
ctcgttctgc tgtgtcgttc 8160ccccctcccc gctcgccgca gccggtcttt tttcctctct
ccccccctct cctctgactg 8220acccgtggcc gtgctgtcgg accccccgca tgggggcggc
cgggcacgta cgcgtccggg 8280cggtcaccgg ggtcttgggg gggggccgag gggtaagaaa
gtcggctcgg cgggcgggag 8340gagctgtggt ttggagggcg tcccggcccc gcggccgtgg
cggtgtcttg cgcggtcttg 8400gagagggctg cgtgcgaggg gaaaaggttg ccccgcgagg
gcaaagggaa agaggctagc 8460agtggtcatt gtcccgacgg tgtggtggtc tgttggccga
ggtgcgtctg gggggctcgt 8520ccggccctgt cgtccgtcgg gaaggcgcgt gttggggcct
gccggagtgc cgaggtgggt 8580accctggcgg tgggattaac cccgcgcgcg tgtcccggtg
tggcggtggg ggctccggtc 8640gatgtctacc tccctctccc cgaggtctca ggccttctcc
gcgcgggctc tcggccctcc 8700cctcgttcct ccctctcgcg gggttcaagt cgctcgtcga
cctcccctcc tccgtccttc 8760catctctcgc gcaatggcgc cgcccgagtt cacggtgggt
tcgtcctccg cctccgcttc 8820tcgccggggg ctggccgctg tccggtctct cctgcccgac
ccccgttggc gtggtcttct 8880ctcgccggct tcgcggactc ctggcttcgc ccggagggtc
agggggcttc ccggttcccc 8940gacgttgcgc ctcgctgctg tgtgcttggg gggggcccgc
tgcggcctcc gcccgcccgt 9000gagcccctgc cgcacccgcc ggtgtgcggt ttcgcgccgc
ggtcagttgg gccctggcgt 9060tgtgtcgcgt cgggagcgtg tccgcctcgc ggcggctaga
cgcgggtgtc gccgggctcc 9120gacgggtggc ctatccaggg ctcgcccccg ccgacccccg
cctgcccgtc ccggtggtgg 9180tcgttggtgt ggggagtgaa tggtgctacc ggtcattccc
tcccgcgtgg tttgactgtc 9240tcgccggtgt cgcgcttctc tttccgccaa cccccacgcc
aacccaccac cctgctctcc 9300cggcccggtg cggtcgacgt tccggctctc ccgatgccga
ggggttcggg atttgtgccg 9360gggacggagg ggagagcggg taagagaggt gtcggagagc
tgtcccgggg cgacgctcgg 9420gttggctttg ccgcgtgcgt gtgctcgcgg acgggttttg
tcggaccccg acggggtcgg 9480tccggccgca tgcactctcc cgttccgcgc gagcgcccgc
ccggctcacc cccggtttgt 9540cctcccgcga ggctctccgc cgccgccgcc tcctcctcct
ctctcgcgct ctctgtcccg 9600cctggtcctg tcccaccccc gacgctccgc tcgcgcttcc
ttacctggtt gatcctgcca 9660ggtagcatat gcttgtctca aagattaagc catgcatgtc
taagtacgca cggccggtac 9720agtgaaactg cgaatggctc attaaatcag ttatggttcc
tttggtcgct cgctcctctc 9780ctacttggat aactgtggta attctagagc taatacatgc
cgacgggcgc tgacccccct 9840tcccgggggg ggatgcgtgc atttatcaga tcaaaaccaa
cccggtgagc tccctcccgg 9900ctccggccgg gggtcgggcg ccggcggctt ggtgactcta
gataacctcg ggccgatcgc 9960acgccccccg tggcggcgac gacccattcg aacgtctgcc
ctatcaactt tcgatggtag 10020tcgccgtgcc taccatggtg accacgggtg acggggaatc
agggttcgat tccggagagg 10080gagcctgaga aacggctacc acatccaagg aaggcagcag
gcgcgcaaat tacccactcc 10140cgacccgggg aggtagtgac gaaaaataac aatacaggac
tctttcgagg ccctgtaatt 10200ggaatgagtc cactttaaat cctttaacga ggatccattg
gagggcaagt ctggtgccag 10260cagccgcggt aattccagct ccaatagcgt atattaaagt
tgctgcagtt aaaaagctcg 10320tagttggatc ttgggagcgg gcgggcggtc cgccgcgagg
cgagtcaccg cccgtccccg 10380ccccttgcct ctcggcgccc cctcgatgct cttagctgag
tgtcccgcgg ggcccgaagc 10440gtttactttg aaaaaattag agtgttcaaa gcaggcccga
gccgcctgga taccgcagct 10500aggaataatg gaataggacc gcggttctat tttgttggtt
ttcggaactg aggccatgat 10560taagagggac ggccgggggc attcgtattg cgccgctaga
ggtgaaattc ttggaccggc 10620gcaagacgga ccagagcgaa agcatttgcc aagaatgttt
tcattaatca agaacgaaag 10680tcggaggttc gaagacgatc agataccgtc gtagttccga
ccataaacga tgccgactgg 10740cgatgcggcg gcgttattcc catgacccgc cgggcagctt
ccgggaaacc aaagtctttg 10800ggttccgggg ggagtatggt tgcaaagctg aaacttaaag
gaattgacgg aagggcacca 10860ccaggagtgg gcctgcggct taatttgact caacacggga
aacctcaccc ggcccggaca 10920cggacaggat tgacagattg atagctcttt ctcgattccg
tgggtggtgg tgcatggccg 10980ttcttagttg gtggagcgat ttgtctggtt aattccgata
acgaacgaga ctctggcatg 11040ctaactagtt acgcgacccc cgagcggtcg gcgtccccca
acttcttaga gggacaagtg 11100gcgttcagcc acccgagatt gagcaataac aggtctgtga
tgcccttaga tgtccggggc 11160tgcacgcgcg ctacactgac tggctcagcg tgtgcctacc
ctgcgccggc aggcgcgggt 11220aacccgttga accccattcg tgatggggat cggggattgc
aattattccc catgaacgag 11280gaattcccag taagtgcggg tcataagctt gcgttgatta
agtccctgcc ctttgtacac 11340accgcccgtc gctactaccg attggatggt ttagtgaggc
cctcggatcg gccccgccgg 11400ggtcggccca cggccctggc ggagcgctga gaagacggtc
gaacttgact atctagagga 11460agtaaaagtc gtaacaaggt ttccgtaggt gaacctgcgg
aaggatcatt aaacgggaga 11520ctgtggagga gcggcggcgt ggcccgctct ccccgtcttg
tgtgtgtcct cgccgggagg 11580cgcgtgcgtc ccgggtcccg tcgcccgcgt gtggagcgag
gtgtctggag tgaggtgaga 11640gaaggggtgg gtggggtcgg tctgggtccg tctgggaccg
cctccgattt cccctccccc 11700tcccctctcc ctcgtccggc tctgacctcg ccaccctacc
gcggcggcgg ctgctcgcgg 11760gcgtcttgcc tctttcccgt ccggctcttc cgtgtctacg
aggggcggta cgtcgttacg 11820ggtttttgac ccgtcccggg ggcgttcggt cgtcggggcg
cgcgctttgc tctcccggca 11880cccatccccg ccgcggctct ggcttttcta cgttggctgg
ggcggttgtc gcgtgtgggg 11940ggatgtgagt gtcgcgtgtg ggctcgcccg tcccgatgcc
acgcttttct ggcctcgcgt 12000gtcctccccg ctcctgtccc gggtacctag ctgtcgcgtt
ccggcgcgga ggtttaagga 12060ccccgggggg gtcgccctgc cgcccccagg gtcggggggc
ggtggggccc gtagggaagt 12120cggtcgttcg ggcggctctc cctcagactc catgaccctc
ctccccccgc tgccgccgtt 12180cccgaggcgg cggtcgtgtg ggggggtgga tgtctggagc
cccctcgggc gccgtggggg 12240cccgacccgc gccgccggct tgcccgattt ccgcgggtcg
gtcctgtcgg tgccggtcgt 12300gggttcccgt gtcgttcccg tgtttttccg ctcccgaccc
tttttttttc ctccccccca 12360cacgtgtctc gtttcgttcc tgctggccgg cctgaggcta
cccctcggtc catctgttct 12420cctctctctc cggggagagg agggcggtgg tcgttggggg
actgtgccgt cgtcagcacc 12480cgtgagttcg ctcacacccg aaataccgat acgactctta
gcggtggatc actcggctcg 12540tgcgtcgatg aagaacgcag ctagctgcga gaattaatgt
gaattgcagg acacattgat 12600catcgacact tcgaacgcac ttgcggcccc gggttcctcc
cggggctacg cctgtctgag 12660cgtcggttga cgatcaatcg cgtcacccgc tgcggtgggt
gctgcgcggc tgggagtttg 12720ctcgcagggc caacccccca acccgggtcg ggccctccgt
ctcccgaagt tcagacgtgt 12780gggcggttgt cggtgtggcg cgcgcgcccg cgtcgcggag
cctggtctcc cccgcgcatc 12840cgcgctcgcg gcttcttccc gctccgccgt tcccgccctc
gcccgtgcac cccggtcctg 12900gcctcgcgtc ggcgcctccc ggaccgctgc ctcaccagtc
tttctcggtc ccgtgccccg 12960tgggaaccca ccgcgccccc gtggcgcccg ggggtgggcg
cgtccgcatc tgctctggtc 13020gaggttggcg gttgagggtg tgcgtgcgcc gaggtggtgg
tcggtcccct gcggccgcgg 13080ggttgtcggg gtggcggtcg acgagggccg gtcggtcgcc
tgcggtggtt gtctgtgtgt 13140gtttgggtct tgcgctgggg gaggcggggt cgaccgctcg
cggggttggc gcggtcgccc 13200ggcgccgcgc accctccggc ttgtgtggag ggagagcgag
ggcgagaacg gagagaggtg 13260gtatccccgg tggcgttgcg agggagggtt tggcgtcccg
cgtccgtccg tccctccctc 13320cctcggtggg cgccttcgcg ccgcacgcgg ccgctagggg
cggtcggggc ccgtggcccc 13380cgtggctctt cttcgtctcc gcttctcctt cacccgggcg
gtacccgctc cggcgccggc 13440ccgcgggacg ccgcggcgtc cgtgcgccga tgcgagtcac
ccccgggtgt tgcgagttcg 13500gggagggaga gggcctcgct gacccgttgc gtcccggctt
ccctgggggg gacccggcgt 13560ctgtgggctg tgcgtcccgg gggttgcgtg tgagtaagat
cctccacccc cgccgccctc 13620ccctcccgcc ggcctctcgg ggaccccctg agacggttcg
ccggctcgtc ctcccgtgcc 13680gccgggtgcc gtctctttcc cgcccgcctc ctcgctctct
tcttcccgcg gctgggcgcg 13740tgtcccccct ttctgaccgc gacctcagat cagacgtggc
gacccgctga atttaagcat 13800attagtcagc ggaggaaaag aaactaacca ggattccctc
agtaacggcg agtgaacagg 13860gaagagccca gcgccgaatc cccgccgcgc gtcgcggcgt
gggaaatgtg gcgtacggaa 13920gacccactcc ccggcgccgc tcgtgggggg cccaagtcct
tctgatcgag gcccagcccg 13980tggacggtgt gaggccggta gcggccccgg cgcgccgggc
tcgggtcttc ccggagtcgg 14040gttgcttggg aatgcagccc aaagcgggtg gtaaactcca
tctaaggcta aataccggca 14100cgagaccgat agtcaacaag taccgtaagg gaaagttgaa
aagaactttg aagagagagt 14160tcaagagggc gtgaaaccgt taagaggtaa acgggtgggg
tccgcgcagt ccgcccggag 14220gattcaaccc ggcggcgcgc gtccggccgt gcccggtggt
cccggcggat ctttcccgct 14280ccccgttcct cccgacccct ccacccgcgc gtcgttcccc
tcttcctccc cgcgtccggc 14340gcctccggcg gcgggcgcgg ggggtggtgt ggtggtggcg
cgcgggcggg gccgggggtg 14400gggtcggcgg gggaccgccc ccggccggcg accggccgcc
gccgggcgca cttccaccgt 14460ggcggtgcgc cgcgaccggc tccgggacgg ccgggaaggc
ccggtgggga aggtggctcg 14520gggggggcgg cgcgtctcag ggcgcgccga accacctcac
cccgagtgtt acagccctcc 14580ggccgcgctt tcgccgaatc ccggggccga ggaagccaga
tacccgtcgc cgcgctctcc 14640ctctcccccc gtccgcctcc cgggcgggcg tgggggtggg
ggccgggccg cccctcccac 14700ggcgcgaccg ctctcccacc cccctccgtc gcctctctcg
gggcccggtg gggggcgggg 14760cggactgtcc ccagtgcgcc ccgggcgtcg tcgcgccgtc
gggtcccggg gggaccgtcg 14820gtcacgcgtc tcccgacgaa gccgagcgca cggggtcggc
ggcgatgtcg gctacccacc 14880cgacccgtct tgaaacacgg accaaggagt ctaacgcgtg
cgcgagtcag gggctcgtcc 14940gaaagccgcc gtggcgcaat gaaggtgaag ggccccgccc
gggggcccga ggtgggatcc 15000cgaggcctct ccagtccgcc gagggcgcac caccggcccg
tctcgcccgc cgcgccgggg 15060aggtggagca cgagcgtacg cgttaggacc cgaaagatgg
tgaactatgc ttgggcaggg 15120cgaagccaga ggaaactctg gtggaggtcc gtagcggtcc
tgacgtgcaa atcggtcgtc 15180cgacctgggt ataggggcga aagactaatc gaaccatcta
gtagctggtt ccctccgaag 15240tttccctcag gatagctggc gctctcgctc ccgacgtacg
cagttttatc cggtaaagcg 15300aatgattaga ggtcttgggg ccgaaacgat ctcaacctat
tctcaaactt taaatgggta 15360agaagcccgg ctcgctggcg tggagccggg cgtggaatgc
gagtgcctag tgggccactt 15420ttggtaagca gaactggcgc tgcgggatga accgaacgcc
gggttaaggc gcccgatgcc 15480gacgctcatc agaccccaga aaaggtgttg gttgatatag
acagcaggac ggtggccatg 15540gaagtcggaa tccgctaagg agtgtgtaac aactcacctg
ccgaatcaac tagccctgaa 15600aatggatggc gctggagcgt cgggcccata cccggccgtc
gccgcagtcg gaacggaacg 15660ggacgggagc ggccgcgaat tcttgaagac gaaagggcct
cgtgatacgc ctatttttat 15720aggttaatgt catgataata atggtttctt agacgtcagg
tggcactttt cggggaaatg 15780tgcgcggaac ccctatttgt ttatttttct aaatacattc
aaatatgtat ccgctcatga 15840gacaataacc ctgataaatg cttcaataat attgaaaaag
gaagagtatg agtattcaac 15900atttccgtgt cgcccttatt cccttttttg cggcattttg
cttcctgttt ttgctcaccc 15960agaaacgctg gtgaaagtaa aagatgctga agatcagttg
ggtgcacgag tgggttacat 16020cgaactggat ctcaacagcg gtaagatcct tgagagtttt
cgccccgaag aacgttttcc 16080aatgatgagc acttttaaag ttctgctatg tggcgcggta
ttatcccgtg ttgacgccgg 16140gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat
gacttggttg agtactcacc 16200agtcacagaa aagcatctta cggatggcat gacagtaaga
gaattatgca gtgctgccat 16260aaccatgagt gataacactg cggccaactt acttctgaca
acgatcggag gaccgaagga 16320gctaaccgct tttttgcaca acatggggga tcatgtaact
cgccttgatc gttgggaacc 16380ggagctgaat gaagccatac caaacgacga gcgtgacacc
acgatgcctg cagcaatggc 16440aacaacgttg cgcaaactat taactggcga actacttact
ctagcttccc ggcaacaatt 16500aatagactgg atggaggcgg ataaagttgc aggaccactt
ctgcgctcgg cccttccggc 16560tggctggttt attgctgata aatctggagc cggtgagcgt
gggtctcgcg gtatcattgc 16620agcactgggg ccagatggta agccctcccg tatcgtagtt
atctacacga cggggagtca 16680ggcaactatg gatgaacgaa atagacagat cgctgagata
ggtgcctcac tgattaagca 16740ttggtaactg tcagaccaag tttactcata tatactttag
attgatttaa aacttcattt 16800ttaatttaaa aggatctagg tgaagatcct ttttgataat
ctcatgacca aaatccctta 16860acgtgagttt tcgttccact gagcgtcaga ccccgtagaa
aagatcaaag gatcttcttg 16920agatcctttt tttctgcgcg taatctgctg cttgcaaaca
aaaaaaccac cgctaccagc 16980ggtggtttgt ttgccggatc aagagctacc aactcttttt
ccgaaggtaa ctggcttcag 17040cagagcgcag ataccaaata ctgtccttct agtgtagccg
tagttaggcc accacttcaa 17100gaactctgta gcaccgccta catacctcgc tctgctaatc
ctgttaccag tggctgctgc 17160cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga
cgatagttac cggataaggc 17220gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc
agcttggagc gaacgaccta 17280caccgaactg agatacctac agcgtgagct atgagaaagc
gccacgcttc cgaagggaga 17340aaggcggaca ggtatccggt aagcggcagg gtcggaacag
gaga 17384
119 2814 DNA Artificial Sequence pLITMUS38 Plasmid
119 gttaactacg tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta
tttgtttatt 60tttctaaata cattcaaata tgtatccgct catgagacaa taaccctgat
aaatgcttca 120ataatattga aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc
ttattccctt 180ttttgcggca ttttgccttc ctgtttttgc tcacccagaa acgctggtga
aagtaaaaga 240tgctgaagat cagttgggtg cacgagtggg ttacatcgaa ctggatctca
acagcggtaa 300gatccttgag agttttcgcc ccgaagaacg ttctccaatg atgagcactt
ttaaagttct 360gctatgtggc gcggtattat cccgtgttga cgccgggcaa gagcaactcg
gtcgccgcat 420acactattct cagaatgact tggttgagta ctcaccagtc acagaaaagc
atcttacgga 480tggcatgaca gtaagagaat tatgcagtgc tgccataacc atgagtgata
acactgcggc 540caacttactt ctgacaacga tcggaggacc gaaggagcta accgcttttt
tgcacaacat 600gggggatcat gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag
ccataccaaa 660cgacgagcgt gacaccacga tgcctgtagc aatggcaaca acgttgcgca
aactattaac 720tggcgaacta cttactctag cttcccggca acaattaata gactggatgg
aggcggataa 780agttgcagga ccacttctgc gctcggccct tccggctggc tggtttattg
ctgataaatc 840tggagccggt gagcgtgggt ctcgcggtat cattgcagca ctggggccag
atggtaagcc 900ctcccgtatc gtagttatct acacgacggg gagtcaggca actatggatg
aacgaaatag 960acagatcgct gagataggtg cctcactgat taagcattgg taactgtcag
accaagttta 1020ctcatatata ctttagattg atttaccccg gttgataatc agaaaagccc
caaaaacagg 1080aagattgtat aagcaaatat ttaaattgta aacgttaata ttttgttaaa
attcgcgtta 1140aatttttgtt aaatcagctc attttttaac caataggccg aaatcggcaa
aatcccttat 1200aaatcaaaag aatagcccga gatagggttg agtgttgttc cagtttggaa
caagagtcca 1260ctattaaaga acgtggactc caacgtcaaa gggcgaaaaa ccgtctatca
gggcgatggc 1320ccactacgtg aaccatcacc caaatcaagt tttttggggt cgaggtgccg
taaagcacta 1380aatcggaacc ctaaagggag cccccgattt agagcttgac ggggaaagcg
aacgtggcga 1440gaaaggaagg gaagaaagcg aaaggagcgg gcgctagggc gctggcaagt
gtagcggtca 1500cgctgcgcgt aaccaccaca cccgccgcgc ttaatgcgcc gctacagggc
gcgtaaaagg 1560atctaggtga agatcctttt tgataatctc atgaccaaaa tcccttaacg
tgagttttcg 1620ttccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga
tccttttttt 1680ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt
ggtttgtttg 1740ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag
agcgcagata 1800ccaaatactg ttcttctagt gtagccgtag ttaggccacc acttcaagaa
ctctgtagca 1860ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag
tggcgataag 1920tcgtgtctta ccgggttgga ctcaagacga tagttaccgg ataaggcgca
gcggtcgggc 1980tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac
cgaactgaga 2040tacctacagc gtgagctatg agaaagcgcc acgcttcccg aagggagaaa
ggcggacagg 2100tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc
agggggaaac 2160gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg
tcgatttttg 2220tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc
ctttttacgg 2280ttcctggcct tttgctggcc ttttgctcac atgtaatgtg agttagctca
ctcattaggc 2340accccaggct ttacacttta tgcttccggc tcgtatgttg tgtggaattg
tgagcggata 2400acaatttcac acaggaaaca gctatgacca tgattacgcc aagctacgta
atacgactca 2460ctagtggggc ccgtgcaatt gaagccggct ggcgccaagc ttctctgcag
gatatctgga 2520tccacgaatt cgctagcttc ggccgtgacg cgtctccgga tgtacaggca
tgcgtcgacc 2580ctctagtcaa ggccttaagt gagtcgtatt acggactggc cgtcgtttta
caacgtcgtg 2640actgggaaaa ccctggcgtt acccaactta atcgccttgc agcacatccc
cctttcgcca 2700gctggcgtaa tagcgaagag gcccgcaccg atcgcccttc ccaacagttg
cgcagcctga 2760atggcgaatg gcgcttcgct tggtaataaa gcccgcttcg gcgggctttt
tttt 2814
120 2847 DNA Artificial Sequence pLIT38attB Plasmid
120 gttaactacg tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt
60tttctaaata cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca
120ataatattga aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt
180ttttgcggca ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga
240tgctgaagat cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa
300gatccttgag agttttcgcc ccgaagaacg ttctccaatg atgagcactt ttaaagttct
360gctatgtggc gcggtattat cccgtgttga cgccgggcaa gagcaactcg gtcgccgcat
420acactattct cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga
480tggcatgaca gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc
540caacttactt ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat
600gggggatcat gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa
660cgacgagcgt gacaccacga tgcctgtagc aatggcaaca acgttgcgca aactattaac
720tggcgaacta cttactctag cttcccggca acaattaata gactggatgg aggcggataa
780agttgcagga ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc
840tggagccggt gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc
900ctcccgtatc gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag
960acagatcgct gagataggtg cctcactgat taagcattgg taactgtcag accaagttta
1020ctcatatata ctttagattg atttaccccg gttgataatc agaaaagccc caaaaacagg
1080aagattgtat aagcaaatat ttaaattgta aacgttaata ttttgttaaa attcgcgtta
1140aatttttgtt aaatcagctc attttttaac caataggccg aaatcggcaa aatcccttat
1200aaatcaaaag aatagcccga gatagggttg agtgttgttc cagtttggaa caagagtcca
1260ctattaaaga acgtggactc caacgtcaaa gggcgaaaaa ccgtctatca gggcgatggc
1320ccactacgtg aaccatcacc caaatcaagt tttttggggt cgaggtgccg taaagcacta
1380aatcggaacc ctaaagggag cccccgattt agagcttgac ggggaaagcg aacgtggcga
1440gaaaggaagg gaagaaagcg aaaggagcgg gcgctagggc gctggcaagt gtagcggtca
1500cgctgcgcgt aaccaccaca cccgccgcgc ttaatgcgcc gctacagggc gcgtaaaagg
1560atctaggtga agatcctttt tgataatctc atgaccaaaa tcccttaacg tgagttttcg
1620ttccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga tccttttttt
1680ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg
1740ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata
1800ccaaatactg ttcttctagt gtagccgtag ttaggccacc acttcaagaa ctctgtagca
1860ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag tggcgataag
1920tcgtgtctta ccgggttgga ctcaagacga tagttaccgg ataaggcgca gcggtcgggc
1980tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac cgaactgaga
2040tacctacagc gtgagctatg agaaagcgcc acgcttcccg aagggagaaa ggcggacagg
2100tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac
2160gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg
2220tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg
2280ttcctggcct tttgctggcc ttttgctcac atgtaatgtg agttagctca ctcattaggc
2340accccaggct ttacacttta tgcttccggc tcgtatgttg tgtggaattg tgagcggata
2400acaatttcac acaggaaaca gctatgacca tgattacgcc aagctacgta atacgactca
2460ctagtggggc ccgtgcaatt gaagccggct ggcgccaagc ttctctgcag gattgaagcc
2520tgctttttta tactaacttg agcgaaatct ggatccacga attcgctagc ttcggccgtg
2580acgcgtctcc ggatgtacag gcatgcgtcg accctctagt caaggcctta agtgagtcgt
2640attacggact ggccgtcgtt ttacaacgtc gtgactggga aaaccctggc gttacccaac
2700ttaatcgcct tgcagcacat ccccctttcg ccagctggcg taatagcgaa gaggcccgca
2760ccgatcgccc ttcccaacag ttgcgcagcc tgaatggcga atggcgcttc gcttggtaat
2820aaagcccgct tcggcgggct ttttttt
2847
121 4223 DNA Artificial Sequence pLIT38attBBSRpolyA2 Plasmid
121 accatgaaaa catttaacat ttctcaacaa gatctagaat tagtagaagt agcgacagag
60aagattacaa tgctttatga ggataataaa catcatgtgg gagcggcaat tcgtacgaaa
120acaggagaaa tcatttcggc agtacatatt gaagcgtata taggacgagt aactgtttgt
180gcagaagcca ttgcgattgg tagtgcagtt tcgaatggac aaaaggattt tgacacgatt
240gtagctgtta gacaccctta ttctgacgaa gtagatagaa gtattcgagt ggtaagtcct
300tgtggtatgt gtagggagtt gatttcagac tatgcaccag attgttttgt gttaatagaa
360atgaatggca agttagtcaa aactacgatt gaagaactca ttccactcaa atatacccga
420aattaaaagt tttaccatac caagcttggc tgctgcctga ggctggacga cctcgcggag
480ttctaccggc agtgcaaatc cgtcggcatc caggaaacca gcagcggcta tccgcgcatc
540catgcccccg aactgcagga gtggggaggc acgatggccg ctttggtccg gatctttgtg
600aaggaacctt acttctgtgg tgtgacataa ttggacaaac tacctacaga gatttaaagc
660tctaaggtaa atataaaatt tttaagtgta taatgtgtta aactactgat tctaattgtt
720tgtgtatttt agattccaac ctatggaact gatgaatggg agcagtggtg gaatgccttt
780aatgaggaaa acctgttttg ctcagaagaa atgccatcta gtgatgatga ggctactgct
840gactctcaac attctactcc tccaaaaaag aagagaaagg tagaagaccc caaggacttt
900ccttcagaat tgctaagttt tttgagtcat gctgtgttta gtaatagaac tcttgcttgc
960tttgctattt acaccacaaa ggaaaaagct gcactgctat acaagaaaat tatggaaaaa
1020tattctgtaa cctttataag taggcataac agttataatc ataacatact gttttttctt
1080actccacaca ggcatagagt gtctgctatt aataactatg ctcaaaaatt gtgtaccttt
1140agctttttaa tttgtaaagg ggttaataag gaatatttga tgtatagtgc cttgactaga
1200gatcataatc agccatacca catttgtaga ggttttactt gctttaaaaa acctcccaca
1260cctccccctg aacctgaaac ataaaatgaa tgcaattgtt gttgttaact tgtttattgc
1320agcttataat ggttacaaat aaagcaatag catcacaaat ttcacaaata aagatccaga
1380tttcgctcaa gttagtataa aaaagcaggc ttcaatcctg cagagaagct tggcgccagc
1440cggcttcaat tgcacgggcc ccactagtga gtcgtattac gtagcttggc gtaatcatgg
1500tcatagctgt ttcctgtgtg aaattgttat ccgctcacaa ttccacacaa catacgagcc
1560ggaagcataa agtgtaaagc ctggggtgcc taatgagtga gctaactcac attacatgtg
1620agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca
1680taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa
1740cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc
1800tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc
1860gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct
1920gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg
1980tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag
2040gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta
2100cggctacact agaagaacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg
2160aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt
2220tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt
2280ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag
2340attatcaaaa aggatcttca cctagatcct tttacgcgcc ctgtagcggc gcattaagcg
2400cggcgggtgt ggtggttacg cgcagcgtga ccgctacact tgccagcgcc ctagcgcccg
2460ctcctttcgc tttcttccct tcctttctcg ccacgttcgc tttccccgtc aagctctaaa
2520tcgggggctc cctttagggt tccgatttag tgctttacgg cacctcgacc ccaaaaaact
2580tgatttgggt gatggttcac gtagtgggcc atcgccctga tagacggttt ttcgcccttt
2640gacgttggag tccacgttct ttaatagtgg actcttgttc caaactggaa caacactcaa
2700ccctatctcg ggctattctt ttgatttata agggattttg ccgatttcgg cctattggtt
2760aaaaaatgag ctgatttaac aaaaatttaa cgcgaatttt aacaaaatat taacgtttac
2820aatttaaata tttgcttata caatcttcct gtttttgggg cttttctgat tatcaaccgg
2880ggtaaatcaa tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc
2940agtgaggcac ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc
3000gtcgtgtaga taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata
3060ccgcgagacc cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg
3120gccgagcgca gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc
3180cgggaagcta gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct
3240acaggcatcg tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa
3300cgatcaaggc gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt
3360cctccgatcg ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca
3420ctgcataatt ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac
3480tcaaccaagt cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca
3540acacgggata ataccgcgcc acatagcaga actttaaaag tgctcatcat tggagaacgt
3600tcttcggggc gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc
3660actcgtgcac ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca
3720aaaacaggaa ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata
3780ctcatactct tcctttttca atattattga agcatttatc agggttattg tctcatgagc
3840ggatacatat ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc
3900cgaaaagtgc cacctgacgt agttaacaaa aaaaagcccg ccgaagcggg ctttattacc
3960aagcgaagcg ccattcgcca ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg
4020cctcttcgct attacgccag ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg
4080taacgccagg gttttcccag tcacgacgtt gtaaaacgac ggccagtccg taatacgact
4140cacttaaggc cttgactaga gggtcgacgc atgcctgtac atccggagac gcgtcacggc
4200cgaagctagc gaattcgtgg atc
4223
122 2686 DNA Artificial Sequence pUC18 Plasmid
122 tcgcgcgttt cggtgatgac
ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat
gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg
cttaactatg cggcatcaga gcagattgta ctgagagtgc 180accatatgcg gtgtgaaata
ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240attcgccatt caggctgcgc
aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300tacgccagct ggcgaaaggg
ggatgtgctg caaggcgatt aagttgggta acgccagggt 360tttcccagtc acgacgttgt
aaaacgacgg ccagtgccaa gcttgcatgc ctgcaggtcg 420actctagagg atccccgggt
accgagctcg aattcgtaat catggtcata gctgtttcct 480gtgtgaaatt gttatccgct
cacaattcca cacaacatac gagccggaag cataaagtgt 540aaagcctggg gtgcctaatg
agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc 600gctttccagt cgggaaacct
gtcgtgccag ctgcattaat gaatcggcca acgcgcgggg 660agaggcggtt tgcgtattgg
gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg 720gtcgttcggc tgcggcgagc
ggtatcagct cactcaaagg cggtaatacg gttatccaca 780gaatcagggg ataacgcagg
aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac 840cgtaaaaagg ccgcgttgct
ggcgtttttc cataggctcc gcccccctga cgagcatcac 900aaaaatcgac gctcaagtca
gaggtggcga aacccgacag gactataaag ataccaggcg 960tttccccctg gaagctccct
cgtgcgctct cctgttccga ccctgccgct taccggatac 1020ctgtccgcct ttctcccttc
gggaagcgtg gcgctttctc atagctcacg ctgtaggtat 1080ctcagttcgg tgtaggtcgt
tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag 1140cccgaccgct gcgccttatc
cggtaactat cgtcttgagt ccaacccggt aagacacgac 1200ttatcgccac tggcagcagc
cactggtaac aggattagca gagcgaggta tgtaggcggt 1260gctacagagt tcttgaagtg
gtggcctaac tacggctaca ctagaaggac agtatttggt 1320atctgcgctc tgctgaagcc
agttaccttc ggaaaaagag ttggtagctc ttgatccggc 1380aaacaaacca ccgctggtag
cggtggtttt tttgtttgca agcagcagat tacgcgcaga 1440aaaaaaggat ctcaagaaga
tcctttgatc ttttctacgg ggtctgacgc tcagtggaac 1500gaaaactcac gttaagggat
tttggtcatg agattatcaa aaaggatctt cacctagatc 1560cttttaaatt aaaaatgaag
ttttaaatca atctaaagta tatatgagta aacttggtct 1620gacagttacc aatgcttaat
cagtgaggca cctatctcag cgatctgtct atttcgttca 1680tccatagttg cctgactccc
cgtcgtgtag ataactacga tacgggaggg cttaccatct 1740ggccccagtg ctgcaatgat
accgcgagac ccacgctcac cggctccaga tttatcagca 1800ataaaccagc cagccggaag
ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc 1860atccagtcta ttaattgttg
ccgggaagct agagtaagta gttcgccagt taatagtttg 1920cgcaacgttg ttgccattgc
tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct 1980tcattcagct ccggttccca
acgatcaagg cgagttacat gatcccccat gttgtgcaaa 2040aaagcggtta gctccttcgg
tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta 2100tcactcatgg ttatggcagc
actgcataat tctcttactg tcatgccatc cgtaagatgc 2160ttttctgtga ctggtgagta
ctcaaccaag tcattctgag aatagtgtat gcggcgaccg 2220agttgctctt gcccggcgtc
aatacgggat aataccgcgc cacatagcag aactttaaaa 2280gtgctcatca ttggaaaacg
ttcttcgggg cgaaaactct caaggatctt accgctgttg 2340agatccagtt cgatgtaacc
cactcgtgca cccaactgat cttcagcatc ttttactttc 2400accagcgttt ctgggtgagc
aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg 2460gcgacacgga aatgttgaat
actcatactc ttcctttttc aatattattg aagcatttat 2520cagggttatt gtctcatgag
cggatacata tttgaatgta tttagaaaaa taaacaaata 2580ggggttccgc gcacatttcc
ccgaaaagtg ccacctgacg tctaagaaac cattattatc 2640atgacattaa cctataaaaa
taggcgtatc acgaggccct ttcgtc 2686
123 8521 DNA Artificial Sequence pCXeGFPattB(6xHS4)2 Plasmid
123 tacggggcgg gggatccact agttattaat
agtaatcaat tacggggtca ttagttcata 60gcccatatat ggagttccgc gttacataac
ttacggtaaa tggcccgcct ggctgaccgc 120ccaacgaccc ccgcccattg acgtcaataa
tgacgtatgt tcccatagta acgccaatag 180ggactttcca ttgacgtcaa tgggtggact
atttacggta aactgcccac ttggcagtac 240atcaagtgta tcatatgcca agtacgcccc
ctattgacgt caatgacggt aaatggcccg 300cctggcatta tgcccagtac atgaccttat
gggactttcc tacttggcag tacatctacg 360tattagtcat cgctattacc atgggtcgag
gtgagcccca cgttctgctt cactctcccc 420atctcccccc cctccccacc cccaattttg
tatttattta ttttttaatt attttgtgca 480gcgatggggg cggggggggg gggggcgcgc
gccaggcggg gcggggcggg gcgaggggcg 540gggcggggcg aggcggagag gtgcggcggc
agccaatcag agcggcgcgc tccgaaagtt 600tccttttatg gcgaggcggc ggcggcggcg
gccctataaa aagcgaagcg cgcggcgggc 660gggagtcgct gcgttgcctt cgccccgtgc
cccgctccgc gccgcctcgc gccgcccgcc 720ccggctctga ctgaccgcgt tactcccaca
ggtgagcggg cgggacggcc cttctcctcc 780gggctgtaat tagcgcttgg tttaatgacg
gctcgtttct tttctgtggc tgcgtgaaag 840ccttaaaggg ctccgggagg gccctttgtg
cgggggggag cggctcgggg ggtgcgtgcg 900tgtgtgtgtg cgtggggagc gccgcgtgcg
gcccgcgctg cccggcggct gtgagcgctg 960cgggcgcggc gcggggcttt gtgcgctccg
cgtgtgcgcg aggggagcgc ggccgggggc 1020ggtgccccgc ggtgcggggg ggctgcgagg
ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080tgggggggtg agcagggggt gtgggcgcgg
cggtcgggct gtaacccccc cctgcacccc 1140cctccccgag ttgctgagca cggcccggct
tcgggtgcgg ggctccgtgc ggggcgtggc 1200gcggggctcg ccgtgccggg cggggggtgg
cggcaggtgg gggtgccggg cggggcgggg 1260ccgcctcggg ccggggaggg ctcgggggag
gggcgcggcg gccccggagc gccggcggct 1320gtcgaggcgc ggcgagccgc agccattgcc
ttttatggta atcgtgcgag agggcgcagg 1380gacttccttt gtcccaaatc tggcggagcc
gaaatctggg aggcgccgcc gcaccccctc 1440tagcgggcgc gggcgaagcg gtgcggcgcc
ggcaggaagg aaatgggcgg ggagggcctt 1500cgtgcgtcgc cgcgccgccg tccccttctc
catctccagc ctcggggctg ccgcaggggg 1560acggctgcct tcggggggga cggggcaggg
cggggttcgg cttctggcgt gtgaccggcg 1620gctctagagc ctctgctaac catgttcatg
ccttcttctt tttcctacag ctcctgggca 1680acgtgctggt tgttgtgctg tctcatcatt
ttggcaaaga attcgccacc atggtgagca 1740agggcgagga gctgttcacc ggggtggtgc
ccatcctggt cgagctggac ggcgacgtaa 1800acggccacaa gttcagcgtg tccggcgagg
gcgagggcga tgccacctac ggcaagctga 1860ccctgaagtt catctgcacc accggcaagc
tgcccgtgcc ctggcccacc ctcgtgacca 1920ccctgaccta cggcgtgcag tgcttcagcc
gctaccccga ccacatgaag cagcacgact 1980tcttcaagtc cgccatgccc gaaggctacg
tccaggagcg caccatcttc ttcaaggacg 2040acggcaacta caagacccgc gccgaggtga
agttcgaggg cgacaccctg gtgaaccgca 2100tcgagctgaa gggcatcgac ttcaaggagg
acggcaacat cctggggcac aagctggagt 2160acaactacaa cagccacaac gtctatatca
tggccgacaa gcagaagaac ggcatcaagg 2220tgaacttcaa gatccgccac aacatcgagg
acggcagcgt gcagctcgcc gaccactacc 2280agcagaacac ccccatcggc gacggccccg
tgctgctgcc cgacaaccac tacctgagca 2340cccagtccgc cctgagcaaa gaccccaacg
agaagcgcga tcacatggtc ctgctggagt 2400tcgtgaccgc cgccgggatc actctcggca
tggacgagct gtacaagtaa gaattcactc 2460ctcaggtgca ggctgcctat cagaaggtgg
tggctggtgt ggccaatgcc ctggctcaca 2520aataccactg agatcttttt ccctctgcca
aaaattatgg ggacatcatg aagccccttg 2580agcatctgac ttctggctaa taaaggaaat
ttattttcat tgcaatagtg tgttggaatt 2640ttttgtgtct ctcactcgga aggacatatg
ggagggcaaa tcatttaaaa catcagaatg 2700agtatttggt ttagagtttg gcaacatatg
ccatatgctg gctgccatga acaaaggtgg 2760ctataaagag gtcatcagta tatgaaacag
ccccctgctg tccattcctt attccataga 2820aaagccttga cttgaggtta gatttttttt
atattttgtt ttgtgttatt tttttcttta 2880acatccctaa aattttcctt acatgtttta
ctagccagat ttttcctcct ctcctgacta 2940ctcccagtca tagctgtccc tcttctctta
tgaagatccc tcgacctgca gcccaagctt 3000ggcgtaatca tggtcatagc tgtttcctgt
gtgaaattgt tatccgctca caattccaca 3060caacatacga gccggaagca taaagtgtaa
agcctggggt gcctaatgag tgagctaact 3120cacattaatt gcgttgcgct cactgcccgc
tttccagtcg ggaaacctgt cgtgccagcg 3180gatccgcatc tcaattagtc agcaaccata
gtcccgcccc taactccgcc catcccgccc 3240ctaactccgc ccagttccgc ccattctccg
ccccatggct gactaatttt ttttatttat 3300gcagaggccg aggccgcctc ggcctctgag
ctattccaga agtagtgagg aggctttttt 3360ggaggctagt ggatcccccg ccccgtatcc
cccaggtgtc tgcaggctca aagagcagcg 3420agaagcgttc agaggaaagc gatcccgtgc
caccttcccc gtgcccgggc tgtccccgca 3480cgctgccggc tcggggatgc ggggggagcg
ccggaccgga gcggagcccc gggcggctcg 3540ctgctgcccc ctagcggggg agggacgtaa
ttacatccct gggggctttg ggggggggct 3600gtccccgtga gcggatccgc ggccccgtat
cccccaggtg tctgcaggct caaagagcag 3660cgagaagcgt tcagaggaaa gcgatcccgt
gccaccttcc ccgtgcccgg gctgtccccg 3720cacgctgccg gctcggggat gcggggggag
cgccggaccg gagcggagcc ccgggcggct 3780cgctgctgcc ccctagcggg ggagggacgt
aattacatcc ctgggggctt tggggggggg 3840ctgtccccgt gagcggatcc gcggccccgt
atcccccagg tgtctgcagg ctcaaagagc 3900agcgagaagc gttcagagga aagcgatccc
gtgccacctt ccccgtgccc gggctgtccc 3960cgcacgctgc cggctcgggg atgcgggggg
agcgccggac cggagcggag ccccgggcgg 4020ctcgctgctg ccccctagcg ggggagggac
gtaattacat ccctgggggc tttggggggg 4080ggctgtcccc gtgagcggat ccgcggcccc
gtatccccca ggtgtctgca ggctcaaaga 4140gcagcgagaa gcgttcagag gaaagcgatc
ccgtgccacc ttccccgtgc ccgggctgtc 4200cccgcacgct gccggctcgg ggatgcgggg
ggagcgccgg accggagcgg agccccgggc 4260ggctcgctgc tgccccctag cgggggaggg
acgtaattac atccctgggg gctttggggg 4320ggggctgtcc ccgtgagcgg atccgcggcc
ccgtatcccc caggtgtctg caggctcaaa 4380gagcagcgag aagcgttcag aggaaagcga
tcccgtgcca ccttccccgt gcccgggctg 4440tccccgcacg ctgccggctc ggggatgcgg
ggggagcgcc ggaccggagc ggagccccgg 4500gcggctcgct gctgccccct agcgggggag
ggacgtaatt acatccctgg gggctttggg 4560ggggggctgt ccccgtgagc ggatccgcgg
ccccgtatcc cccaggtgtc tgcaggctca 4620aagagcagcg agaagcgttc agaggaaagc
gatcccgtgc caccttcccc gtgcccgggc 4680tgtccccgca cgctgccggc tcggggatgc
ggggggagcg ccggaccgga gcggagcccc 4740gggcggctcg ctgctgcccc ctagcggggg
agggacgtaa ttacatccct gggggctttg 4800ggggggggct gtccccgtga gcggatccgc
ggggctgcag gaattcgatt gaagcctgct 4860tttttatact aacttgagcg aaatcaagct
cctaggcttt tgcaaaaagc taacttgttt 4920attgcagctt ataatggtta caaataaagc
aatagcatca caaatttcac aaataaagca 4980tttttttcac tgcattctag ttgtggtttg
tccaaactca tcaatgtatc ttatcatgtc 5040tggatccgct gcattaatga atcggccaac
gcgcggggag aggcggtttg cgtattgggc 5100gctcttccgc ttcctcgctc actgactcgc
tgcgctcggt cgttcggctg cggcgagcgg 5160tatcagctca ctcaaaggcg gtaatacggt
tatccacaga atcaggggat aacgcaggaa 5220agaacatgtg agcaaaaggc cagcaaaagg
ccaggaaccg taaaaaggcc gcgttgctgg 5280cgtttttcca taggctccgc ccccctgacg
agcatcacaa aaatcgacgc tcaagtcaga 5340ggtggcgaaa cccgacagga ctataaagat
accaggcgtt tccccctgga agctccctcg 5400tgcgctctcc tgttccgacc ctgccgctta
ccggatacct gtccgccttt ctcccttcgg 5460gaagcgtggc gctttctcaa tgctcacgct
gtaggtatct cagttcggtg taggtcgttc 5520gctccaagct gggctgtgtg cacgaacccc
ccgttcagcc cgaccgctgc gccttatccg 5580gtaactatcg tcttgagtcc aacccggtaa
gacacgactt atcgccactg gcagcagcca 5640ctggtaacag gattagcaga gcgaggtatg
taggcggtgc tacagagttc ttgaagtggt 5700ggcctaacta cggctacact agaaggacag
tatttggtat ctgcgctctg ctgaagccag 5760ttaccttcgg aaaaagagtt ggtagctctt
gatccggcaa acaaaccacc gctggtagcg 5820gtggtttttt tgtttgcaag cagcagatta
cgcgcagaaa aaaaggatct caagaagatc 5880ctttgatctt ttctacgggg tctgacgctc
agtggaacga aaactcacgt taagggattt 5940tggtcatgag attatcaaaa aggatcttca
cctagatcct tttaaattaa aaatgaagtt 6000ttaaatcaat ctaaagtata tatgagtaaa
cttggtctga cagttaccaa tgcttaatca 6060gtgaggcacc tatctcagcg atctgtctat
ttcgttcatc catagttgcc tgactccccg 6120tcgtgtagat aactacgata cgggagggct
taccatctgg ccccagtgct gcaatgatac 6180cgcgagaccc acgctcaccg gctccagatt
tatcagcaat aaaccagcca gccggaaggg 6240ccgagcgcag aagtggtcct gcaactttat
ccgcctccat ccagtctatt aattgttgcc 6300gggaagctag agtaagtagt tcgccagtta
atagtttgcg caacgttgtt gccattgcta 6360caggcatcgt ggtgtcacgc tcgtcgtttg
gtatggcttc attcagctcc ggttcccaac 6420gatcaaggcg agttacatga tcccccatgt
tgtgcaaaaa agcggttagc tccttcggtc 6480ctccgatcgt tgtcagaagt aagttggccg
cagtgttatc actcatggtt atggcagcac 6540tgcataattc tcttactgtc atgccatccg
taagatgctt ttctgtgact ggtgagtact 6600caaccaagtc attctgagaa tagtgtatgc
ggcgaccgag ttgctcttgc ccggcgtcaa 6660tacgggataa taccgcgcca catagcagaa
ctttaaaagt gctcatcatt ggaaaacgtt 6720cttcggggcg aaaactctca aggatcttac
cgctgttgag atccagttcg atgtaaccca 6780ctcgtgcacc caactgatct tcagcatctt
ttactttcac cagcgtttct gggtgagcaa 6840aaacaggaag gcaaaatgcc gcaaaaaagg
gaataagggc gacacggaaa tgttgaatac 6900tcatactctt cctttttcaa tattattgaa
gcatttatca gggttattgt ctcatgagcg 6960gatacatatt tgaatgtatt tagaaaaata
aacaaatagg ggttccgcgc acatttcccc 7020gaaaagtgcc acctggtcga cggtatcgat
aagcttgata tcgaattcct gcagccccgc 7080ggatccgctc acggggacag ccccccccca
aagcccccag ggatgtaatt acgtccctcc 7140cccgctaggg ggcagcagcg agccgcccgg
ggctccgctc cggtccggcg ctccccccgc 7200atccccgagc cggcagcgtg cggggacagc
ccgggcacgg ggaaggtggc acgggatcgc 7260tttcctctga acgcttctcg ctgctctttg
agcctgcaga cacctggggg atacggggcc 7320gcggatccgc tcacggggac agcccccccc
caaagccccc agggatgtaa ttacgtccct 7380cccccgctag ggggcagcag cgagccgccc
ggggctccgc tccggtccgg cgctcccccc 7440gcatccccga gccggcagcg tgcggggaca
gcccgggcac ggggaaggtg gcacgggatc 7500gctttcctct gaacgcttct cgctgctctt
tgagcctgca gacacctggg ggatacgggg 7560ccgcggatcc gctcacgggg acagcccccc
cccaaagccc ccagggatgt aattacgtcc 7620ctcccccgct agggggcagc agcgagccgc
ccggggctcc gctccggtcc ggcgctcccc 7680ccgcatcccc gagccggcag cgtgcgggga
cagcccgggc acggggaagg tggcacggga 7740tcgctttcct ctgaacgctt ctcgctgctc
tttgagcctg cagacacctg ggggatacgg 7800ggccgcggat ccgctcacgg ggacagcccc
cccccaaagc ccccagggat gtaattacgt 7860ccctcccccg ctagggggca gcagcgagcc
gcccggggct ccgctccggt ccggcgctcc 7920ccccgcatcc ccgagccggc agcgtgcggg
gacagcccgg gcacggggaa ggtggcacgg 7980gatcgctttc ctctgaacgc ttctcgctgc
tctttgagcc tgcagacacc tgggggatac 8040ggggccgcgg atccgctcac ggggacagcc
cccccccaaa gcccccaggg atgtaattac 8100gtccctcccc cgctaggggg cagcagcgag
ccgcccgggg ctccgctccg gtccggcgct 8160ccccccgcat ccccgagccg gcagcgtgcg
gggacagccc gggcacgggg aaggtggcac 8220gggatcgctt tcctctgaac gcttctcgct
gctctttgag cctgcagaca cctgggggat 8280acggggccgc ggatccgctc acggggacag
ccccccccca aagcccccag ggatgtaatt 8340acgtccctcc cccgctaggg ggcagcagcg
agccgcccgg ggctccgctc cggtccggcg 8400ctccccccgc atccccgagc cggcagcgtg
cggggacagc ccgggcacgg ggaaggtggc 8460acgggatcgc tttcctctga acgcttctcg
ctgctctttg agcctgcaga cacctggggg 8520a
8521
124 8851 DNA Artificial Sequence p18EPOcDNA Plasmid
124 cagttgccgg ccgggtcgcg cagggcgaac tcccgccccc
acggctgctc gccgatctcg 60gtcatggccg gcccggaggc gtcccggaag ttcgtggaca
cgacctccga ccactcggcg 120tacagctcgt ccaggccgcg cacccacacc caggccaggg
tgttgtccgg caccacctgg 180tcctggaccg cgctgatgaa cagggtcacg tcgtcccgga
ccacaccggc gaagtcgtcc 240tccacgaagt cccgggagaa cccgagccgg tcggtccaga
actcgaccgc tccggcgacg 300tcgcgcgcgg tgagcaccgg aacggcactg gtcaacttgg
ccatggatcc agatttcgct 360caagttagta taaaaaagca ggcttcaatc ctgcagagaa
gcttgatatc gaattcctgc 420agccccgcgg atccgctcac ggggacagcc cccccccaaa
gcccccaggg atgtaattac 480gtccctcccc cgctaggggg cagcagcgag ccgcccgggg
ctccgctccg gtccggcgct 540ccccccgcat ccccgagccg gcagcgtgcg gggacagccc
gggcacgggg aaggtggcac 600gggatcgctt tcctctgaac gcttctcgct gctctttgag
cctgcagaca cctgggggat 660acggggccgc ggatccgctc acggggacag ccccccccca
aagcccccag ggatgtaatt 720acgtccctcc cccgctaggg ggcagcagcg agccgcccgg
ggctccgctc cggtccggcg 780ctccccccgc atccccgagc cggcagcgtg cggggacagc
ccgggcacgg ggaaggtggc 840acgggatcgc tttcctctga acgcttctcg ctgctctttg
agcctgcaga cacctggggg 900atacggggcc gcggatccgc tcacggggac agcccccccc
caaagccccc agggatgtaa 960ttacgtccct cccccgctag ggggcagcag cgagccgccc
ggggctccgc tccggtccgg 1020cgctcccccc gcatccccga gccggcagcg tgcggggaca
gcccgggcac ggggaaggtg 1080gcacgggatc gctttcctct gaacgcttct cgctgctctt
tgagcctgca gacacctggg 1140ggatacgggg ccgcggatcc gctcacgggg acagcccccc
cccaaagccc ccagggatgt 1200aattacgtcc ctcccccgct agggggcagc agcgagccgc
ccggggctcc gctccggtcc 1260ggcgctcccc ccgcatcccc gagccggcag cgtgcgggga
cagcccgggc acggggaagg 1320tggcacggga tcgctttcct ctgaacgctt ctcgctgctc
tttgagcctg cagacacctg 1380ggggatacgg ggccgcggat ccgctcacgg ggacagcccc
cccccaaagc ccccagggat 1440gtaattacgt ccctcccccg ctagggggca gcagcgagcc
gcccggggct ccgctccggt 1500ccggcgctcc ccccgcatcc ccgagccggc agcgtgcggg
gacagcccgg gcacggggaa 1560ggtggcacgg gatcgctttc ctctgaacgc ttctcgctgc
tctttgagcc tgcagacacc 1620tgggggatac ggggccgcgg atccgctcac ggggacagcc
cccccccaaa gcccccaggg 1680atgtaattac gtccctcccc cgctaggggg cagcagcgag
ccgcccgggg ctccgctccg 1740gtccggcgct ccccccgcat ccccgagccg gcagcgtgcg
gggacagccc gggcacgggg 1800aaggtggcac gggatcgctt tcctctgaac gcttctcgct
gctctttgag cctgcagaca 1860cctgggggat acggggcggg ggatccacta gttattaata
gtaatcaatt acggggtcat 1920tagttcatag cccatatatg gagttccgcg ttacataact
tacggtaaat ggcccgcctg 1980gctgaccgcc caacgacccc cgcccattga cgtcaataat
gacgtatgtt cccatagtaa 2040cgccaatagg gactttccat tgacgtcaat gggtggacta
tttacggtaa actgcccact 2100tggcagtaca tcaagtgtat catatgccaa gtacgccccc
tattgacgtc aatgacggta 2160aatggcccgc ctggcattat gcccagtaca tgaccttatg
ggactttcct acttggcagt 2220acatctacgt attagtcatc gctattacca tgggtcgagg
tgagccccac gttctgcttc 2280actctcccca tctccccccc ctccccaccc ccaattttgt
atttatttat tttttaatta 2340ttttgtgcag cgatgggggc gggggggggg ggggcgcgcg
ccaggcgggg cggggcgggg 2400cgaggggcgg ggcggggcga ggcggagagg tgcggcggca
gccaatcaga gcggcgcgct 2460ccgaaagttt ccttttatgg cgaggcggcg gcggcggcgg
ccctataaaa agcgaagcgc 2520gcggcgggcg ggagtcgctg cgttgccttc gccccgtgcc
ccgctccgcg ccgcctcgcg 2580ccgcccgccc cggctctgac tgaccgcgtt actcccacag
gtgagcgggc gggacggccc 2640ttctcctccg ggctgtaatt agcgcttggt ttaatgacgg
ctcgtttctt ttctgtggct 2700gcgtgaaagc cttaaagggc tccgggaggg ccctttgtgc
gggggggagc ggctcggggg 2760gtgcgtgcgt gtgtgtgtgc gtggggagcg ccgcgtgcgg
cccgcgctgc ccggcggctg 2820tgagcgctgc gggcgcggcg cggggctttg tgcgctccgc
gtgtgcgcga ggggagcgcg 2880gccgggggcg gtgccccgcg gtgcgggggg gctgcgaggg
gaacaaaggc tgcgtgcggg 2940gtgtgtgcgt gggggggtga gcagggggtg tgggcgcggc
ggtcgggctg taaccccccc 3000ctgcaccccc ctccccgagt tgctgagcac ggcccggctt
cgggtgcggg gctccgtgcg 3060gggcgtggcg cggggctcgc cgtgccgggc ggggggtggc
ggcaggtggg ggtgccgggc 3120ggggcggggc cgcctcgggc cggggagggc tcgggggagg
ggcgcggcgg ccccggagcg 3180ccggcggctg tcgaggcgcg gcgagccgca gccattgcct
tttatggtaa tcgtgcgaga 3240gggcgcaggg acttcctttg tcccaaatct ggcggagccg
aaatctggga ggcgccgccg 3300caccccctct agcgggcgcg ggcgaagcgg tgcggcgccg
gcaggaagga aatgggcggg 3360gagggccttc gtgcgtcgcc gcgccgccgt ccccttctcc
atctccagcc tcggggctgc 3420cgcaggggga cggctgcctt cgggggggac ggggcagggc
ggggttcggc ttctggcgtg 3480tgaccggcgg ctctagaatg ggggtgcacg aatgtcctgc
ctggctgtgg cttctcctgt 3540ccctgctgtc gctccctctg ggcctcccag tcctgggcgc
cccaccacgc ctcatctgtg 3600acagccgagt cctggagagg tacctcttgg aggccaagga
ggccgagaat atcacgacgg 3660gctgtgctga acactgcagc ttgaatgaga atatcactgt
cccagacacc aaagttaatt 3720tctatgcctg gaagaggatg gaggtcgggc agcaggccgt
agaagtctgg cagggcctgg 3780ccctgctgtc ggaagctgtc ctgcggggcc aggccctgtt
ggtcaactct tcccagccgt 3840gggagcccct gcagctgcat gtggataaag ccgtcagtgg
ccttcgcagc ctcaccactc 3900tgcttcgggc tctgggagcc cagaaggaag ccatctcccc
tccagatgcg gcctcagctg 3960ctccactccg aacaatcact gctgacactt tccgcaaact
cttccgagtc tactccaatt 4020tcctccgggg aaagctgaag ctgtacacag gggaggcctg
caggacaggg gacagatgac 4080gtacaagtaa gaattcactc ctcaggtgca ggctgcctat
cagaaggtgg tggctggtgt 4140ggccaatgcc ctggctcaca aataccactg agatcttttt
ccctctgcca aaaattatgg 4200ggacatcatg aagccccttg agcatctgac ttctggctaa
taaaggaaat ttattttcat 4260tgcaatagtg tgttggaatt ttttgtgtct ctcactcgga
aggacatatg ggagggcaaa 4320tcatttaaaa catcagaatg agtatttggt ttagagtttg
gcaacatatg ccatatgctg 4380gctgccatga acaaaggtgg ctataaagag gtcatcagta
tatgaaacag ccccctgctg 4440tccattcctt attccataga aaagccttga cttgaggtta
gatttttttt atattttgtt 4500ttgtgttatt tttttcttta acatccctaa aattttcctt
acatgtttta ctagccagat 4560ttttcctcct ctcctgacta ctcccagtca tagctgtccc
tcttctctta tgaagatccc 4620tcgacctgca gcccaagctt gcatgcctgc aggtcgactc
tagtggatcc cccgccccgt 4680atcccccagg tgtctgcagg ctcaaagagc agcgagaagc
gttcagagga aagcgatccc 4740gtgccacctt ccccgtgccc gggctgtccc cgcacgctgc
cggctcgggg atgcgggggg 4800agcgccggac cggagcggag ccccgggcgg ctcgctgctg
ccccctagcg ggggagggac 4860gtaattacat ccctgggggc tttggggggg ggctgtcccc
gtgagcggat ccgcggcccc 4920gtatccccca ggtgtctgca ggctcaaaga gcagcgagaa
gcgttcagag gaaagcgatc 4980ccgtgccacc ttccccgtgc ccgggctgtc cccgcacgct
gccggctcgg ggatgcgggg 5040ggagcgccgg accggagcgg agccccgggc ggctcgctgc
tgccccctag cgggggaggg 5100acgtaattac atccctgggg gctttggggg ggggctgtcc
ccgtgagcgg atccgcggcc 5160ccgtatcccc caggtgtctg caggctcaaa gagcagcgag
aagcgttcag aggaaagcga 5220tcccgtgcca ccttccccgt gcccgggctg tccccgcacg
ctgccggctc ggggatgcgg 5280ggggagcgcc ggaccggagc ggagccccgg gcggctcgct
gctgccccct agcgggggag 5340ggacgtaatt acatccctgg gggctttggg ggggggctgt
ccccgtgagc ggatccgcgg 5400ccccgtatcc cccaggtgtc tgcaggctca aagagcagcg
agaagcgttc agaggaaagc 5460gatcccgtgc caccttcccc gtgcccgggc tgtccccgca
cgctgccggc tcggggatgc 5520ggggggagcg ccggaccgga gcggagcccc gggcggctcg
ctgctgcccc ctagcggggg 5580agggacgtaa ttacatccct gggggctttg ggggggggct
gtccccgtga gcggatccgc 5640ggccccgtat cccccaggtg tctgcaggct caaagagcag
cgagaagcgt tcagaggaaa 5700gcgatcccgt gccaccttcc ccgtgcccgg gctgtccccg
cacgctgccg gctcggggat 5760gcggggggag cgccggaccg gagcggagcc ccgggcggct
cgctgctgcc ccctagcggg 5820ggagggacgt aattacatcc ctgggggctt tggggggggg
ctgtccccgt gagcggatcc 5880gcggccccgt atcccccagg tgtctgcagg ctcaaagagc
agcgagaagc gttcagagga 5940aagcgatccc gtgccacctt ccccgtgccc gggctgtccc
cgcacgctgc cggctcgggg 6000atgcgggggg agcgccggac cggagcggag ccccgggcgg
ctcgctgctg ccccctagcg 6060ggggagggac gtaattacat ccctgggggc tttggggggg
ggctgtcccc gtgagcggat 6120ccgcggggct gcaggaattc gtaatcatgg tcatagctgt
ttcctgtgtg aaattgttat 6180ccgctcacaa ttccacacaa catacgagcc ggaagcataa
agtgtaaagc ctggggtgcc 6240taatgagtga gctaactcac attaattgcg ttgcgctcac
tgcccgcttt ccagtcggga 6300aacctgtcgt gccagctgca ttaatgaatc ggccaacgcg
cggggagagg cggtttgcgt 6360attgggcgct cttccgcttc ctcgctcact gactcgctgc
gctcggtcgt tcggctgcgg 6420cgagcggtat cagctcactc aaaggcggta atacggttat
ccacagaatc aggggataac 6480gcaggaaaga acatgtgagc aaaaggccag caaaaggcca
ggaaccgtaa aaaggccgcg 6540ttgctggcgt ttttccatag gctccgcccc cctgacgagc
atcacaaaaa tcgacgctca 6600agtcagaggt ggcgaaaccc gacaggacta taaagatacc
aggcgtttcc ccctggaagc 6660tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg
gatacctgtc cgcctttctc 6720ccttcgggaa gcgtggcgct ttctcatagc tcacgctgta
ggtatctcag ttcggtgtag 6780gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg
ttcagcccga ccgctgcgcc 6840ttatccggta actatcgtct tgagtccaac ccggtaagac
acgacttatc gccactggca 6900gcagccactg gtaacaggat tagcagagcg aggtatgtag
gcggtgctac agagttcttg 6960aagtggtggc ctaactacgg ctacactaga aggacagtat
ttggtatctg cgctctgctg 7020aagccagtta ccttcggaaa aagagttggt agctcttgat
ccggcaaaca aaccaccgct 7080ggtagcggtg gtttttttgt ttgcaagcag cagattacgc
gcagaaaaaa aggatctcaa 7140gaagatcctt tgatcttttc tacggggtct gacgctcagt
ggaacgaaaa ctcacgttaa 7200gggattttgg tcatgagatt atcaaaaagg atcttcacct
agatcctttt aaattaaaaa 7260tgaagtttta aatcaatcta aagtatatat gagtaaactt
ggtctgacag ttaccaatgc 7320ttaatcagtg aggcacctat ctcagcgatc tgtctatttc
gttcatccat agttgcctga 7380ctccccgtcg tgtagataac tacgatacgg gagggcttac
catctggccc cagtgctgca 7440atgataccgc gagacccacg ctcaccggct ccagatttat
cagcaataaa ccagccagcc 7500ggaagggccg agcgcagaag tggtcctgca actttatccg
cctccatcca gtctattaat 7560tgttgccggg aagctagagt aagtagttcg ccagttaata
gtttgcgcaa cgttgttgcc 7620attgctacag gcatcgtggt gtcacgctcg tcgtttggta
tggcttcatt cagctccggt 7680tcccaacgat caaggcgagt tacatgatcc cccatgttgt
gcaaaaaagc ggttagctcc 7740ttcggtcctc cgatcgttgt cagaagtaag ttggccgcag
tgttatcact catggttatg 7800gcagcactgc ataattctct tactgtcatg ccatccgtaa
gatgcttttc tgtgactggt 7860gagtactcaa ccaagtcatt ctgagaatag tgtatgcggc
gaccgagttg ctcttgcccg 7920gcgtcaatac gggataatac cgcgccacat agcagaactt
taaaagtgct catcattgga 7980aaacgttctt cggggcgaaa actctcaagg atcttaccgc
tgttgagatc cagttcgatg 8040taacccactc gtgcacccaa ctgatcttca gcatctttta
ctttcaccag cgtttctggg 8100tgagcaaaaa caggaaggca aaatgccgca aaaaagggaa
taagggcgac acggaaatgt 8160tgaatactca tactcttcct ttttcaatat tattgaagca
tttatcaggg ttattgtctc 8220atgagcggat acatatttga atgtatttag aaaaataaac
aaataggggt tccgcgcaca 8280tttccccgaa aagtgccacc tgacgtagtt aacaaaaaaa
agcccgccga agcgggcttt 8340attaccaagc gaagcgccat tcgccattca ggctgcgcaa
ctgttgggaa gggcgatcgg 8400tgcgggcctc ttcgctatta cgccagctgg cgaaaggggg
atgtgctgca aggcgattaa 8460gttgggtaac gccagggttt tcccagtcac gacgttgtaa
aacgacggcc agtccgtaat 8520acgactcact taaggccttg actagagggt cgacggtata
cagacatgat aagatacatt 8580gatgagtttg gacaaaccac aactagaatg cagtgaaaaa
aatgctttat ttgtgaaatt 8640tgtgatgcta ttgctttatt tgtaaccatt ataagctgca
ataaacaagt tggggtgggc 8700gaagaactcc agcatgagat ccccgcgctg gaggatcatc
cagccggcgt cccggaaaac 8760gattccgaag cccaaccttt catagaaggc ggcggtggaa
tcgaaatctc gtagcacgtg 8820tcagtcctgc tcctcggcca cgaagtgcac g
8851
SEQ ID NO: 125; Length: 10474; Type: DNA; Organism: Artificial Sequence; Other information: p18genEPO Plasmid
Sequence 125:
cagttgccgg ccgggtcgcg cagggcgaac tcccgccccc acggctgctc gccgatctcg
60gtcatggccg gcccggaggc gtcccggaag ttcgtggaca cgacctccga ccactcggcg
120tacagctcgt ccaggccgcg cacccacacc caggccaggg tgttgtccgg caccacctgg
180tcctggaccg cgctgatgaa cagggtcacg tcgtcccgga ccacaccggc gaagtcgtcc
240tccacgaagt cccgggagaa cccgagccgg tcggtccaga actcgaccgc tccggcgacg
300tcgcgcgcgg tgagcaccgg aacggcactg gtcaacttgg ccatggatcc agatttcgct
360caagttagta taaaaaagca ggcttcaatc ctgcagagaa gcttgatatc gaattcctgc
420agccccgcgg atccgctcac ggggacagcc cccccccaaa gcccccaggg atgtaattac
480gtccctcccc cgctaggggg cagcagcgag ccgcccgggg ctccgctccg gtccggcgct
540ccccccgcat ccccgagccg gcagcgtgcg gggacagccc gggcacgggg aaggtggcac
600gggatcgctt tcctctgaac gcttctcgct gctctttgag cctgcagaca cctgggggat
660acggggccgc ggatccgctc acggggacag ccccccccca aagcccccag ggatgtaatt
720acgtccctcc cccgctaggg ggcagcagcg agccgcccgg ggctccgctc cggtccggcg
780ctccccccgc atccccgagc cggcagcgtg cggggacagc ccgggcacgg ggaaggtggc
840acgggatcgc tttcctctga acgcttctcg ctgctctttg agcctgcaga cacctggggg
900atacggggcc gcggatccgc tcacggggac agcccccccc caaagccccc agggatgtaa
960ttacgtccct cccccgctag ggggcagcag cgagccgccc ggggctccgc tccggtccgg
1020cgctcccccc gcatccccga gccggcagcg tgcggggaca gcccgggcac ggggaaggtg
1080gcacgggatc gctttcctct gaacgcttct cgctgctctt tgagcctgca gacacctggg
1140ggatacgggg ccgcggatcc gctcacgggg acagcccccc cccaaagccc ccagggatgt
1200aattacgtcc ctcccccgct agggggcagc agcgagccgc ccggggctcc gctccggtcc
1260ggcgctcccc ccgcatcccc gagccggcag cgtgcgggga cagcccgggc acggggaagg
1320tggcacggga tcgctttcct ctgaacgctt ctcgctgctc tttgagcctg cagacacctg
1380ggggatacgg ggccgcggat ccgctcacgg ggacagcccc cccccaaagc ccccagggat
1440gtaattacgt ccctcccccg ctagggggca gcagcgagcc gcccggggct ccgctccggt
1500ccggcgctcc ccccgcatcc ccgagccggc agcgtgcggg gacagcccgg gcacggggaa
1560ggtggcacgg gatcgctttc ctctgaacgc ttctcgctgc tctttgagcc tgcagacacc
1620tgggggatac ggggccgcgg atccgctcac ggggacagcc cccccccaaa gcccccaggg
1680atgtaattac gtccctcccc cgctaggggg cagcagcgag ccgcccgggg ctccgctccg
1740gtccggcgct ccccccgcat ccccgagccg gcagcgtgcg gggacagccc gggcacgggg
1800aaggtggcac gggatcgctt tcctctgaac gcttctcgct gctctttgag cctgcagaca
1860cctgggggat acggggcggg ggatccacta gttattaata gtaatcaatt acggggtcat
1920tagttcatag cccatatatg gagttccgcg ttacataact tacggtaaat ggcccgcctg
1980gctgaccgcc caacgacccc cgcccattga cgtcaataat gacgtatgtt cccatagtaa
2040cgccaatagg gactttccat tgacgtcaat gggtggacta tttacggtaa actgcccact
2100tggcagtaca tcaagtgtat catatgccaa gtacgccccc tattgacgtc aatgacggta
2160aatggcccgc ctggcattat gcccagtaca tgaccttatg ggactttcct acttggcagt
2220acatctacgt attagtcatc gctattacca tgggtcgagg tgagccccac gttctgcttc
2280actctcccca tctccccccc ctccccaccc ccaattttgt atttatttat tttttaatta
2340ttttgtgcag cgatgggggc gggggggggg ggggcgcgcg ccaggcgggg cggggcgggg
2400cgaggggcgg ggcggggcga ggcggagagg tgcggcggca gccaatcaga gcggcgcgct
2460ccgaaagttt ccttttatgg cgaggcggcg gcggcggcgg ccctataaaa agcgaagcgc
2520gcggcgggcg ggagtcgctg cgttgccttc gccccgtgcc ccgctccgcg ccgcctcgcg
2580ccgcccgccc cggctctgac tgaccgcgtt actcccacag gtgagcgggc gggacggccc
2640ttctcctccg ggctgtaatt agcgcttggt ttaatgacgg ctcgtttctt ttctgtggct
2700gcgtgaaagc cttaaagggc tccgggaggg ccctttgtgc gggggggagc ggctcggggg
2760gtgcgtgcgt gtgtgtgtgc gtggggagcg ccgcgtgcgg cccgcgctgc ccggcggctg
2820tgagcgctgc gggcgcggcg cggggctttg tgcgctccgc gtgtgcgcga ggggagcgcg
2880gccgggggcg gtgccccgcg gtgcgggggg gctgcgaggg gaacaaaggc tgcgtgcggg
2940gtgtgtgcgt gggggggtga gcagggggtg tgggcgcggc ggtcgggctg taaccccccc
3000ctgcaccccc ctccccgagt tgctgagcac ggcccggctt cgggtgcggg gctccgtgcg
3060gggcgtggcg cggggctcgc cgtgccgggc ggggggtggc ggcaggtggg ggtgccgggc
3120ggggcggggc cgcctcgggc cggggagggc tcgggggagg ggcgcggcgg ccccggagcg
3180ccggcggctg tcgaggcgcg gcgagccgca gccattgcct tttatggtaa tcgtgcgaga
3240gggcgcaggg acttcctttg tcccaaatct ggcggagccg aaatctggga ggcgccgccg
3300caccccctct agcgggcgcg ggcgaagcgg tgcggcgccg gcaggaagga aatgggcggg
3360gagggccttc gtgcgtcgcc gcgccgccgt ccccttctcc atctccagcc tcggggctgc
3420cgcaggggga cggctgcctt cgggggggac ggggcagggc ggggttcggc ttctggcgtg
3480tgaccggcgg ctctagatgc atgctcgagc ggccgccagt gtgatggata tctgcagaat
3540tcgccctttc tagaatgggg gtgcacggtg agtactcgcg ggctgggcgc tcccgcccgc
3600ccgggtccct gtttgagcgg ggatttagcg ccccggctat tggccaggag gtggctgggt
3660tcaaggaccg gcgacttgtc aaggaccccg gaagggggag gggggtgggg tgcctccacg
3720tgccagcggg gacttggggg agtccttggg gatggcaaaa acctgacctg tgaaggggac
3780acagtttggg ggttgagggg aagaaggttt gggggttctg ctgtgccagt ggagaggaag
3840ctgataagct gataacctgg gcgctggagc caccacttat ctgccagagg ggaagcctct
3900gtcacaccag gattgaagtt tggccggaga agtggatgct ggtagctggg ggtggggtgt
3960gcacacggca gcaggattga atgaaggcca gggaggcagc acctgagtgc ttgcatggtt
4020ggggacagga aggacgagct ggggcagaga cgtggggatg aaggaagctg tccttccaca
4080gccacccttc tccctccccg cctgactctc agcctggcta tctgttctag aatgtcctgc
4140ctggctgtgg cttctcctgt ccctgctgtc gctccctctg ggcctcccag tcctgggcgc
4200cccaccacgc ctcatctgtg acagccgagt cctggagagg tacctcttgg aggccaagga
4260ggccgagaat atcacggtga gaccccttcc ccagcacatt ccacagaact cacgctcagg
4320gcttcaggga actcctccca gatccaggaa cctggcactt ggtttggggt ggagttggga
4380agctagacac tgccccccta cataagaata agtctggtgg ccccaaacca tacctggaaa
4440ctaggcaagg agcaaagcca gcagatccta cggcctgtgg gccagggcca gagccttcag
4500ggacccttga ctccccgggc tgtgtgcatt tcagacgggc tgtgctgaac actgcagctt
4560gaatgagaat atcactgtcc cagacaccaa agttaatttc tatgcctgga agaggatgga
4620ggtgagttcc tttttttttt tttttccttt cttttggaga atctcatttg cgagcctgat
4680tttggatgaa agggagaatg atcgagggaa aggtaaaatg gagcagcaga gatgaggctg
4740cctgggcgca gaggctcacg tctataatcc caggctgaga tggccgagat gggagaattg
4800cttgagccct ggagtttcag accaacctag gcagcatagt gagatccccc atctctacaa
4860acatttaaaa aaattagtca ggtgaagtgg tgcatggtgg tagtcccaga tatttggaag
4920gctgaggcgg gaggatcgct tgagcccagg aatttgaggc tgcagtgagc tgtgatcaca
4980ccactgcact ccagcctcag tgacagagtg aggccctgtc tcaaaaaaga aaagaaaaaa
5040gaaaaataat gagggctgta tggaatacat tcattattca ttcactcact cactcactca
5100ttcattcatt cattcattca acaagtctta ttgcatacct tctgtttgct cagcttggtg
5160cttggggctg ctgaggggca ggagggagag ggtgacatgg gtcagctgac tcccagagtc
5220cactccctgt aggtcgggca gcaggccgta gaagtctggc agggcctggc cctgctgtcg
5280gaagctgtcc tgcggggcca ggccctgttg gtcaactctt cccagccgtg ggagcccctg
5340cagctgcatg tggataaagc cgtcagtggc cttcgcagcc tcaccactct gcttcgggct
5400ctgggagccc aggtgagtag gagcggacac ttctgcttgc cctttctgta agaaggggag
5460aagggtcttg ctaaggagta caggaactgt ccgtattcct tccctttctg tggcactgca
5520gcgacctcct gttttctcct tggcagaagg aagccatctc ccctccagat gcggcctcag
5580ctgctccact ccgaacaatc actgctgaca ctttccgcaa actcttccga gtctactcca
5640atttcctccg gggaaagctg aagctgtaca caggggaggc ctgcaggaca ggggacagat
5700gacgtacaag taagaattca ctcctcaggt gcaggctgcc tatcagaagg tggtggctgg
5760tgtggccaat gccctggctc acaaatacca ctgagatctt tttccctctg ccaaaaatta
5820tggggacatc atgaagcccc ttgagcatct gacttctggc taataaagga aatttatttt
5880cattgcaata gtgtgttgga attttttgtg tctctcactc ggaaggacat atgggagggc
5940aaatcattta aaacatcaga atgagtattt ggtttagagt ttggcaacat atgccatatg
6000ctggctgcca tgaacaaagg tggctataaa gaggtcatca gtatatgaaa cagccccctg
6060ctgtccattc cttattccat agaaaagcct tgacttgagg ttagattttt tttatatttt
6120gttttgtgtt atttttttct ttaacatccc taaaattttc cttacatgtt ttactagcca
6180gatttttcct cctctcctga ctactcccag tcatagctgt ccctcttctc ttatgaagat
6240ccctcgacct gcagcccaag cttgcatgcc tgcaggtcga ctctagtgga tcccccgccc
6300cgtatccccc aggtgtctgc aggctcaaag agcagcgaga agcgttcaga ggaaagcgat
6360cccgtgccac cttccccgtg cccgggctgt ccccgcacgc tgccggctcg gggatgcggg
6420gggagcgccg gaccggagcg gagccccggg cggctcgctg ctgcccccta gcgggggagg
6480gacgtaatta catccctggg ggctttgggg gggggctgtc cccgtgagcg gatccgcggc
6540cccgtatccc ccaggtgtct gcaggctcaa agagcagcga gaagcgttca gaggaaagcg
6600atcccgtgcc accttccccg tgcccgggct gtccccgcac gctgccggct cggggatgcg
6660gggggagcgc cggaccggag cggagccccg ggcggctcgc tgctgccccc tagcggggga
6720gggacgtaat tacatccctg ggggctttgg gggggggctg tccccgtgag cggatccgcg
6780gccccgtatc ccccaggtgt ctgcaggctc aaagagcagc gagaagcgtt cagaggaaag
6840cgatcccgtg ccaccttccc cgtgcccggg ctgtccccgc acgctgccgg ctcggggatg
6900cggggggagc gccggaccgg agcggagccc cgggcggctc gctgctgccc cctagcgggg
6960gagggacgta attacatccc tgggggcttt gggggggggc tgtccccgtg agcggatccg
7020cggccccgta tcccccaggt gtctgcaggc tcaaagagca gcgagaagcg ttcagaggaa
7080agcgatcccg tgccaccttc cccgtgcccg ggctgtcccc gcacgctgcc ggctcgggga
7140tgcgggggga gcgccggacc ggagcggagc cccgggcggc tcgctgctgc cccctagcgg
7200gggagggacg taattacatc cctgggggct ttgggggggg gctgtccccg tgagcggatc
7260cgcggccccg tatcccccag gtgtctgcag gctcaaagag cagcgagaag cgttcagagg
7320aaagcgatcc cgtgccacct tccccgtgcc cgggctgtcc ccgcacgctg ccggctcggg
7380gatgcggggg gagcgccgga ccggagcgga gccccgggcg gctcgctgct gccccctagc
7440gggggaggga cgtaattaca tccctggggg ctttgggggg gggctgtccc cgtgagcgga
7500tccgcggccc cgtatccccc aggtgtctgc aggctcaaag agcagcgaga agcgttcaga
7560ggaaagcgat cccgtgccac cttccccgtg cccgggctgt ccccgcacgc tgccggctcg
7620gggatgcggg gggagcgccg gaccggagcg gagccccggg cggctcgctg ctgcccccta
7680gcgggggagg gacgtaatta catccctggg ggctttgggg gggggctgtc cccgtgagcg
7740gatccgcggg gctgcaggaa ttcgtaatca tggtcatagc tgtttcctgt gtgaaattgt
7800tatccgctca caattccaca caacatacga gccggaagca taaagtgtaa agcctggggt
7860gcctaatgag tgagctaact cacattaatt gcgttgcgct cactgcccgc tttccagtcg
7920ggaaacctgt cgtgccagct gcattaatga atcggccaac gcgcggggag aggcggtttg
7980cgtattgggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt cgttcggctg
8040cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga atcaggggat
8100aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc
8160gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgacgc
8220tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga
8280agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt
8340ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct cagttcggtg
8400taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc
8460gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg
8520gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc
8580ttgaagtggt ggcctaacta cggctacact agaaggacag tatttggtat ctgcgctctg
8640ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc
8700gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct
8760caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga aaactcacgt
8820taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct tttaaattaa
8880aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa cttggtctga cagttaccaa
8940tgcttaatca gtgaggcacc tatctcagcg atctgtctat ttcgttcatc catagttgcc
9000tgactccccg tcgtgtagat aactacgata cgggagggct taccatctgg ccccagtgct
9060gcaatgatac cgcgagaccc acgctcaccg gctccagatt tatcagcaat aaaccagcca
9120gccggaaggg ccgagcgcag aagtggtcct gcaactttat ccgcctccat ccagtctatt
9180aattgttgcc gggaagctag agtaagtagt tcgccagtta atagtttgcg caacgttgtt
9240gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg gtatggcttc attcagctcc
9300ggttcccaac gatcaaggcg agttacatga tcccccatgt tgtgcaaaaa agcggttagc
9360tccttcggtc ctccgatcgt tgtcagaagt aagttggccg cagtgttatc actcatggtt
9420atggcagcac tgcataattc tcttactgtc atgccatccg taagatgctt ttctgtgact
9480ggtgagtact caaccaagtc attctgagaa tagtgtatgc ggcgaccgag ttgctcttgc
9540ccggcgtcaa tacgggataa taccgcgcca catagcagaa ctttaaaagt gctcatcatt
9600ggaaaacgtt cttcggggcg aaaactctca aggatcttac cgctgttgag atccagttcg
9660atgtaaccca ctcgtgcacc caactgatct tcagcatctt ttactttcac cagcgtttct
9720gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg gaataagggc gacacggaaa
9780tgttgaatac tcatactctt cctttttcaa tattattgaa gcatttatca gggttattgt
9840ctcatgagcg gatacatatt tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc
9900acatttcccc gaaaagtgcc acctgacgta gttaacaaaa aaaagcccgc cgaagcgggc
9960tttattacca agcgaagcgc cattcgccat tcaggctgcg caactgttgg gaagggcgat
10020cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg gggatgtgct gcaaggcgat
10080taagttgggt aacgccaggg ttttcccagt cacgacgttg taaaacgacg gccagtccgt
10140aatacgactc acttaaggcc ttgactagag ggtcgacggt atacagacat gataagatac
10200attgatgagt ttggacaaac cacaactaga atgcagtgaa aaaaatgctt tatttgtgaa
10260atttgtgatg ctattgcttt atttgtaacc attataagct gcaataaaca agttggggtg
10320ggcgaagaac tccagcatga gatccccgcg ctggaggatc atccagccgg cgtcccggaa
10380aacgattccg aagcccaacc tttcatagaa ggcggcggtg gaatcgaaat ctcgtagcac
10440gtgtcagtcc tgctcctcgg ccacgaagtg cacg
10474
SEQ ID NO: 126; Length: 6119; Type: DNA; Organism: Artificial Sequence; Other information: p18attBZeoeGFP Plasmid
Sequence 126:
cagttgccgg
ccgggtcgcg cagggcgaac tcccgccccc acggctgctc gccgatctcg 60gtcatggccg
gcccggaggc gtcccggaag ttcgtggaca cgacctccga ccactcggcg 120tacagctcgt
ccaggccgcg cacccacacc caggccaggg tgttgtccgg caccacctgg 180tcctggaccg
cgctgatgaa cagggtcacg tcgtcccgga ccacaccggc gaagtcgtcc 240tccacgaagt
cccgggagaa cccgagccgg tcggtccaga actcgaccgc tccggcgacg 300tcgcgcgcgg
tgagcaccgg aacggcactg gtcaacttgg ccatggatcc agatttcgct 360caagttagta
taaaaaagca ggcttcaatc ctgcagagaa gcttgggctg caggtcgagg 420gatcttcata
agagaagagg gacagctatg actgggagta gtcaggagag gaggaaaaat 480ctggctagta
aaacatgtaa ggaaaatttt agggatgtta aagaaaaaaa taacacaaaa 540caaaatataa
aaaaaatcta acctcaagtc aaggcttttc tatggaataa ggaatggaca 600gcagggggct
gtttcatata ctgatgacct ctttatagcc acctttgttc atggcagcca 660gcatatggca
tatgttgcca aactctaaac caaatactca ttctgatgtt ttaaatgatt 720tgccctccca
tatgtccttc cgagtgagag acacaaaaaa ttccaacaca ctattgcaat 780gaaaataaat
ttcctttatt agccagaagt cagatgctca aggggcttca tgatgtcccc 840ataatttttg
gcagagggaa aaagatctca gtggtatttg tgagccaggg cattggccac 900accagccacc
accttctgat aggcagcctg cacctgagga gtgaattctt acttgtacag 960ctcgtccatg
ccgagagtga tcccggcggc ggtcacgaac tccagcagga ccatgtgatc 1020gcgcttctcg
ttggggtctt tgctcagggc ggactgggtg ctcaggtagt ggttgtcggg 1080cagcagcacg
gggccgtcgc cgatgggggt gttctgctgg tagtggtcgg cgagctgcac 1140gctgccgtcc
tcgatgttgt ggcggatctt gaagttcacc ttgatgccgt tcttctgctt 1200gtcggccatg
atatagacgt tgtggctgtt gtagttgtac tccagcttgt gccccaggat 1260gttgccgtcc
tccttgaagt cgatgccctt cagctcgatg cggttcacca gggtgtcgcc 1320ctcgaacttc
acctcggcgc gggtcttgta gttgccgtcg tccttgaaga agatggtgcg 1380ctcctggacg
tagccttcgg gcatggcgga cttgaagaag tcgtgctgct tcatgtggtc 1440ggggtagcgg
ctgaagcact gcacgccgta ggtcagggtg gtcacgaggg tgggccaggg 1500cacgggcagc
ttgccggtgg tgcagatgaa cttcagggtc agcttgccgt aggtggcatc 1560gccctcgccc
tcgccggaca cgctgaactt gtggccgttt acgtcgccgt ccagctcgac 1620caggatgggc
accaccccgg tgaacagctc ctcgcccttg ctcaccatgg tggcgaattc 1680tttgccaaaa
tgatgagaca gcacaacaac cagcacgttg cccaggagct gtaggaaaaa 1740gaagaaggca
tgaacatggt tagcagaggc tctagagccg ccggtcacac gccagaagcc 1800gaaccccgcc
ctgccccgtc ccccccgaag gcagccgtcc ccctgcggca gccccgaggc 1860tggagatgga
gaaggggacg gcggcgcggc gacgcacgaa ggccctcccc gcccatttcc 1920ttcctgccgg
cgccgcaccg cttcgcccgc gcccgctaga gggggtgcgg cggcgcctcc 1980cagatttcgg
ctccgccaga tttgggacaa aggaagtccc tgcgccctct cgcacgatta 2040ccataaaagg
caatggctgc ggctcgccgc gcctcgacag ccgccggcgc tccggggccg 2100ccgcgcccct
cccccgagcc ctccccggcc cgaggcggcc ccgccccgcc cggcaccccc 2160acctgccgcc
accccccgcc cggcacggcg agccccgcgc cacgccccgc acggagcccc 2220gcacccgaag
ccgggccgtg ctcagcaact cggggagggg ggtgcagggg ggggttacag 2280cccgaccgcc
gcgcccacac cccctgctca cccccccacg cacacacccc gcacgcagcc 2340tttgttcccc
tcgcagcccc cccgcaccgc ggggcaccgc ccccggccgc gctcccctcg 2400cgcacacgcg
gagcgcacaa agccccgcgc cgcgcccgca gcgctcacag ccgccgggca 2460gcgcgggccg
cacgcggcgc tccccacgca cacacacacg cacgcacccc ccgagccgct 2520cccccccgca
caaagggccc tcccggagcc ctttaaggct ttcacgcagc cacagaaaag 2580aaacgagccg
tcattaaacc aagcgctaat tacagcccgg aggagaaggg ccgtcccgcc 2640cgctcacctg
tgggagtaac gcggtcagtc agagccgggg cgggcggcgc gaggcggcgc 2700ggagcggggc
acggggcgaa ggcaacgcag cgactcccgc ccgccgcgcg cttcgctttt 2760tatagggccg
ccgccgccgc cgcctcgcca taaaaggaaa ctttcggagc gcgccgctct 2820gattggctgc
cgccgcacct ctccgcctcg ccccgccccg cccctcgccc cgccccgccc 2880cgcctggcgc
gcgccccccc cccccccgcc cccatcgctg cacaaaataa ttaaaaaata 2940aataaataca
aaattggggg tggggagggg ggggagatgg ggagagtgaa gcagaacgtg 3000gggctcacct
cgacccatgg taatagcgat gactaatacg tagatgtact gccaagtagg 3060aaagtcccat
aaggtcatgt actgggcata atgccaggcg ggccatttac cgtcattgac 3120gtcaataggg
ggcgtacttg gcatatgata cacttgatgt actgccaagt gggcagttta 3180ccgtaaatag
tccacccatt gacgtcaatg gaaagtccct attggcgtta ctatgggaac 3240atacgtcatt
attgacgtca atgggcgggg gtcgttgggc ggtcagccag gcgggccatt 3300taccgtaagt
tatgtaacgc ggaactccat atatgggcta tgaactaatg accccgtaat 3360tgattactat
taataactag aggatccccg ggtaccgagc tcgaattcgt aatcatggtc 3420atagctgttt
cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg 3480aagcataaag
tgtaaagcct ggggtgccta atgagtgagc taactcacat taattgcgtt 3540gcgctcactg
cccgctttcc agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg 3600ccaacgcgcg
gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga 3660ctcgctgcgc
tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat 3720acggttatcc
acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca 3780aaaggccagg
aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc 3840tgacgagcat
cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata 3900aagataccag
gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 3960gcttaccgga
tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc 4020acgctgtagg
tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 4080accccccgtt
cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc 4140ggtaagacac
gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag 4200gtatgtaggc
ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag 4260gacagtattt
ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag 4320ctcttgatcc
ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 4380gattacgcgc
agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga 4440cgctcagtgg
aacgaaaact cacgttaagg gattttggtc atgagattat caaaaaggat 4500cttcacctag
atccttttaa attaaaaatg aagttttaaa tcaatctaaa gtatatatga 4560gtaaacttgg
tctgacagtt accaatgctt aatcagtgag gcacctatct cagcgatctg 4620tctatttcgt
tcatccatag ttgcctgact ccccgtcgtg tagataacta cgatacggga 4680gggcttacca
tctggcccca gtgctgcaat gataccgcga gacccacgct caccggctcc 4740agatttatca
gcaataaacc agccagccgg aagggccgag cgcagaagtg gtcctgcaac 4800tttatccgcc
tccatccagt ctattaattg ttgccgggaa gctagagtaa gtagttcgcc 4860agttaatagt
ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc 4920gtttggtatg
gcttcattca gctccggttc ccaacgatca aggcgagtta catgatcccc 4980catgttgtgc
aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca gaagtaagtt 5040ggccgcagtg
ttatcactca tggttatggc agcactgcat aattctctta ctgtcatgcc 5100atccgtaaga
tgcttttctg tgactggtga gtactcaacc aagtcattct gagaatagtg 5160tatgcggcga
ccgagttgct cttgcccggc gtcaatacgg gataataccg cgccacatag 5220cagaacttta
aaagtgctca tcattggaaa acgttcttcg gggcgaaaac tctcaaggat 5280cttaccgctg
ttgagatcca gttcgatgta acccactcgt gcacccaact gatcttcagc 5340atcttttact
ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa 5400aaagggaata
agggcgacac ggaaatgttg aatactcata ctcttccttt ttcaatatta 5460ttgaagcatt
tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa 5520aaataaacaa
ataggggttc cgcgcacatt tccccgaaaa gtgccacctg acgtagttaa 5580caaaaaaaag
cccgccgaag cgggctttat taccaagcga agcgccattc gccattcagg 5640ctgcgcaact
gttgggaagg gcgatcggtg cgggcctctt cgctattacg ccagctggcg 5700aaagggggat
gtgctgcaag gcgattaagt tgggtaacgc cagggttttc ccagtcacga 5760cgttgtaaaa
cgacggccag tccgtaatac gactcactta aggccttgac tagagggtcg 5820acggtataca
gacatgataa gatacattga tgagtttgga caaaccacaa ctagaatgca 5880gtgaaaaaaa
tgctttattt gtgaaatttg tgatgctatt gctttatttg taaccattat 5940aagctgcaat
aaacaagttg gggtgggcga agaactccag catgagatcc ccgcgctgga 6000ggatcatcca
gccggcgtcc cggaaaacga ttccgaagcc caacctttca tagaaggcgg 6060cggtggaatc
gaaatctcgt agcacgtgtc agtcctgctc ctcggccacg aagtgcacg
6119
SEQ ID NO: 127; Length: 5855; Type: DNA; Organism: Artificial Sequence; Other information: pCXLamInt Plasmid (Wildtype Integrase)
Sequence 127:
gtcgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata
60gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc
120ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag
180ggactttcca ttgacgtcaa tgggtggact atttacggta aactgcccac ttggcagtac
240atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg
300cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg
360tattagtcat cgctattacc atgggtcgag gtgagcccca cgttctgctt cactctcccc
420atctcccccc cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca
480gcgatggggg cggggggggg gggggcgcgc gccaggcggg gcggggcggg gcgaggggcg
540gggcggggcg aggcggagag gtgcggcggc agccaatcag agcggcgcgc tccgaaagtt
600tccttttatg gcgaggcggc ggcggcggcg gccctataaa aagcgaagcg cgcggcgggc
660gggagtcgct gcgttgcctt cgccccgtgc cccgctccgc gccgcctcgc gccgcccgcc
720ccggctctga ctgaccgcgt tactcccaca ggtgagcggg cgggacggcc cttctcctcc
780gggctgtaat tagcgcttgg tttaatgacg gctcgtttct tttctgtggc tgcgtgaaag
840ccttaaaggg ctccgggagg gccctttgtg cgggggggag cggctcgggg ggtgcgtgcg
900tgtgtgtgtg cgtggggagc gccgcgtgcg gcccgcgctg cccggcggct gtgagcgctg
960cgggcgcggc gcggggcttt gtgcgctccg cgtgtgcgcg aggggagcgc ggccgggggc
1020ggtgccccgc ggtgcggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg
1080tgggggggtg agcagggggt gtgggcgcgg cggtcgggct gtaacccccc cctgcacccc
1140cctccccgag ttgctgagca cggcccggct tcgggtgcgg ggctccgtgc ggggcgtggc
1200gcggggctcg ccgtgccggg cggggggtgg cggcaggtgg gggtgccggg cggggcgggg
1260ccgcctcggg ccggggaggg ctcgggggag gggcgcggcg gccccggagc gccggcggct
1320gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg
1380gacttccttt gtcccaaatc tggcggagcc gaaatctggg aggcgccgcc gcaccccctc
1440tagcgggcgc gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg ggagggcctt
1500cgtgcgtcgc cgcgccgccg tccccttctc catctccagc ctcggggctg ccgcaggggg
1560acggctgcct tcggggggga cggggcaggg cggggttcgg cttctggcgt gtgaccggcg
1620gctctagagc ctctgctaac catgttcatg ccttcttctt tttcctacag ctcctgggca
1680acgtgctggt tgttgtgctg tctcatcatt ttggcaaaga attcatggga agaaggcgaa
1740gtcatgagcg ccgggattta ccccctaacc tttatataag aaacaatgga tattactgct
1800acagggaccc aaggacgggt aaagagtttg gattaggcag agacaggcga atcgcaatca
1860ctgaagctat acaggccaac attgagttat tttcaggaca caaacacaag cctctgacag
1920cgagaatcaa cagtgataat tccgttacgt tacattcatg gcttgatcgc tacgaaaaaa
1980tcctggccag cagaggaatc aagcagaaga cactcataaa ttacatgagc aaaattaaag
2040caataaggag gggtctgcct gatgctccac ttgaagacat caccacaaaa gaaattgcgg
2100caatgctcaa tggatacata gacgagggca aggcggcgtc agccaagtta atcagatcaa
2160cactgagcga tgcattccga gaggcaatag ctgaaggcca tataacaaca aaccatgtcg
2220ctgccactcg cgcagcaaaa tcagaggtaa ggagatcaag acttacggct gacgaatacc
2280tgaaaattta tcaagcagca gaatcatcac catgttggct cagacttgca atggaactgg
2340ctgttgttac cgggcaacga gttggtgatt tatgcgaaat gaagtggtct gatatcgtag
2400atggatatct ttatgtcgag caaagcaaaa caggcgtaaa aattgccatc ccaacagcat
2460tgcatattga tgctctcgga atatcaatga aggaaacact tgataaatgc aaagagattc
2520ttggcggaga aaccataatt gcatctactc gtcgcgaacc gctttcatcc ggcacagtat
2580caaggtattt tatgcgcgca cgaaaagcat caggtctttc cttcgaaggg gatccgccta
2640cctttcacga gttgcgcagt ttgtctgcaa gactctatga gaagcagata agcgataagt
2700ttgctcaaca tcttctcggg cataagtcgg acaccatggc atcacagtat cgtgatgaca
2760gaggcaggga gtgggacaaa attgaaatca aataagaatt cactcctcag gtgcaggctg
2820cctatcagaa ggtggtggct ggtgtggcca atgccctggc tcacaaatac cactgagatc
2880tttttccctc tgccaaaaat tatggggaca tcatgaagcc ccttgagcat ctgacttctg
2940gctaataaag gaaatttatt ttcattgcaa tagtgtgttg gaattttttg tgtctctcac
3000tcggaaggac atatgggagg gcaaatcatt taaaacatca gaatgagtat ttggtttaga
3060gtttggcaac atatgccata tgctggctgc catgaacaaa ggtggctata aagaggtcat
3120cagtatatga aacagccccc tgctgtccat tccttattcc atagaaaagc cttgacttga
3180ggttagattt tttttatatt ttgttttgtg ttattttttt ctttaacatc cctaaaattt
3240tccttacatg ttttactagc cagatttttc ctcctctcct gactactccc agtcatagct
3300gtccctcttc tcttatgaag atccctcgac ctgcagccca agcttggcgt aatcatggtc
3360atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg
3420aagcataaag tgtaaagcct ggggtgccta atgagtgagc taactcacat taattgcgtt
3480gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagcggatcc gcatctcaat
3540tagtcagcaa ccatagtccc gcccctaact ccgcccatcc cgcccctaac tccgcccagt
3600tccgcccatt ctccgcccca tggctgacta atttttttta tttatgcaga ggccgaggcc
3660gcctcggcct ctgagctatt ccagaagtag tgaggaggct tttttggagg cctaggcttt
3720tgcaaaaagc taacttgttt attgcagctt ataatggtta caaataaagc aatagcatca
3780caaatttcac aaataaagca tttttttcac tgcattctag ttgtggtttg tccaaactca
3840tcaatgtatc ttatcatgtc tggatccgct gcattaatga atcggccaac gcgcggggag
3900aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt
3960cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga
4020atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg
4080taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa
4140aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt
4200tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct
4260gtccgccttt ctcccttcgg gaagcgtggc gctttctcaa tgctcacgct gtaggtatct
4320cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc
4380cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt
4440atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc
4500tacagagttc ttgaagtggt ggcctaacta cggctacact agaaggacag tatttggtat
4560ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa
4620acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa
4680aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga
4740aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct
4800tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa cttggtctga
4860cagttaccaa tgcttaatca gtgaggcacc tatctcagcg atctgtctat ttcgttcatc
4920catagttgcc tgactccccg tcgtgtagat aactacgata cgggagggct taccatctgg
4980ccccagtgct gcaatgatac cgcgagaccc acgctcaccg gctccagatt tatcagcaat
5040aaaccagcca gccggaaggg ccgagcgcag aagtggtcct gcaactttat ccgcctccat
5100ccagtctatt aattgttgcc gggaagctag agtaagtagt tcgccagtta atagtttgcg
5160caacgttgtt gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg gtatggcttc
5220attcagctcc ggttcccaac gatcaaggcg agttacatga tcccccatgt tgtgcaaaaa
5280agcggttagc tccttcggtc ctccgatcgt tgtcagaagt aagttggccg cagtgttatc
5340actcatggtt atggcagcac tgcataattc tcttactgtc atgccatccg taagatgctt
5400ttctgtgact ggtgagtact caaccaagtc attctgagaa tagtgtatgc ggcgaccgag
5460ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca catagcagaa ctttaaaagt
5520gctcatcatt ggaaaacgtt cttcggggcg aaaactctca aggatcttac cgctgttgag
5580atccagttcg atgtaaccca ctcgtgcacc caactgatct tcagcatctt ttactttcac
5640cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg gaataagggc
5700gacacggaaa tgttgaatac tcatactctt cctttttcaa tattattgaa gcatttatca
5760gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata aacaaatagg
5820ggttccgcgc acatttcccc gaaaagtgcc acctg
5855
128 303 DNA Artificial Sequence Human FER-1 Promoter 128
tccatgacaa
agcacttttt gagcccaagc ccagcctagc tcgagctaaa cgggcacaga 60gacgccaccg
ctgtcccaga ggcagtcggc taccggtccc cgctcccgag ctccgccaga 120gcgcgcgagg
gcctccagcg gccgcccctc ccccacagca ggggcggggt cccgcgccca 180ccggaaggag
cgggctcggg gcgggcggcg ctgattggcc ggggcgggcc tgacgccgac 240gcggctataa
gagaccacaa gcgacccgca gggccagacg ttcttcgccg agagtcgggt 300acc
303
129 6521 DNA Artificial Sequence pIRES-BSR Plasmid 129
tcaatattgg
ccattagcca tattattcat tggttatata gcataaatca atattggcta 60ttggccattg
catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120aatatgaccg
ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180gtcattagtt
catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240gcctggctga
ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300agtaacgcca
atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 360ccacttggca
gtacatcaag tgtatcatat gccaagtccg ccccctattg acgtcaatga 420cggtaaatgg
cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480gcagtacatc
tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540caatgggcgt
ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600caatgggagt
ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaacaactg 660cgatcgcccg
ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga ggtctatata 720agcagagctc
gtttagtgaa ccgtcagatc actagaagct ttattgcggt agtttatcac 780agttaaattg
ctaacgcagt cagtgcttct gacacaacag tctcgaactt aagctgcagt 840gactctctta
aggtagcctt gcagaagttg gtcgtgaggc actgggcagg taagtatcaa 900ggttacaaga
caggtttaag gagaccaata gaaactgggc ttgtcgagac agagaagact 960cttgcgtttc
tgataggcac ctattggtct tactgacatc cactttgcct ttctctccac 1020aggtgtccac
tcccagttca attacagctc ttaaggctag agtacttaat acgactcact 1080ataggctagc
ctcgagaatt cacgcgtcga gcatgcatct agggcggcca attccgcccc 1140tctccctccc
ccccccctaa cgttactggc cgaagccgct tggaataagg ccggtgtgcg 1200tttgtctata
tgtgattttc caccatattg ccgtcttttg gcaatgtgag ggcccggaaa 1260cctggccctg
tcttcttgac gagcattcct aggggtcttt cccctctcgc caaaggaatg 1320caaggtctgt
tgaatgtcgt gaaggaagca gttcctctgg aagcttcttg aagacaaaca 1380acgtctgtag
cgaccctttg caggcagcgg aaccccccac ctggcgacag gtgcctctgc 1440ggccaaaagc
cacgtgtata agatacacct gcaaaggcgg cacaacccca gtgccacgtt 1500gtgagttgga
tagttgtgga aagagtcaaa tggctctcct caagcgtatt caacaagggg 1560ctgaaggatg
cccagaaggt accccattgt atgggatctg atctggggcc tcggtgcaca 1620tgctttacat
gtgtttagtc gaggttaaaa aaacgtctag gccccccgaa ccacggggac 1680gtggttttcc
tttgaaaaac acgatgataa gcttgccaca acccaccatg aaaacattta 1740acatttctca
acaagatcta gaattagtag aagtagcgac agagaagatt acaatgcttt 1800atgaggataa
taaacatcat gtgggagcgg caattcgtac gaaaacagga gaaatcattt 1860cggcagtaca
tattgaagcg tatataggac gagtaactgt ttgtgcagaa gccattgcga 1920ttggtagtgc
agtttcgaat ggacaaaagg attttgacac gattgtagct gttagacacc 1980cttattctga
cgaagtagat agaagtattc gagtggtaag tccttgtggt atgtgtaggg 2040agttgatttc
agactatgca ccagattgtt ttgtgttaat agaaatgaat ggcaagttag 2100tcaaaactac
gattgaagaa ctcattccac tcaaatatac ccgaaattaa aagttttacc 2160ataccaagct
tggcgggcgg ccgcttccct ttagtgaggg ttaatgcttc gagcagacat 2220gataagatac
attgatgagt ttggacaaac cacaactaga atgcagtgaa aaaaatgctt 2280tatttgtgaa
atttgtgatg ctattgcttt atttgtaacc attataagct gcaataaaca 2340agttaacaac
aacaattgca ttcattttat gtttcaggtt cagggggaga tgtgggaggt 2400tttttaaagc
aagtaaaacc tctacaaatg tggtaaaatc cgataaggat cgatccgggc 2460tggcgtaata
gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg cagcctgaat 2520ggcgaatgga
cgcgccctgt agcggcgcat taagcgcggc gggtgtggtg gttacgcgca 2580gcgtgaccgc
tacacttgcc agcgccctag cgcccgctcc tttcgctttc ttcccttcct 2640ttctcgccac
gttcgccggc tttccccgtc aagctctaaa tcgggggctc cctttagggt 2700tccgatttag
agctttacgg cacctcgacc gcaaaaaact tgatttgggt gatggttcac 2760gtagtgggcc
atcgccctga tagacggttt ttcgcccttt gacgttggag tccacgttct 2820ttaatagtgg
actcttgttc caaactggaa caacactcaa ccctatctcg gtctattctt 2880ttgatttata
agggattttg ccgatttcgg cctattggtt aaaaaatgag ctgatttaac 2940aaatatttaa
cgcgaatttt aacaaaatat taacgtttac aatttcgcct gatgcggtat 3000tttctcctta
cgcatctgtg cggtatttca caccgcatac gcggatctgc gcagcaccat 3060ggcctgaaat
aacctctgaa agaggaactt ggttaggtac cttctgaggc ggaaagaacc 3120agctgtggaa
tgtgtgtcag ttagggtgtg gaaagtcccc aggctcccca gcaggcagaa 3180gtatgcaaag
catgcatctc aattagtcag caaccaggtg tggaaagtcc ccaggctccc 3240cagcaggcag
aagtatgcaa agcatgcatc tcaattagtc agcaaccata gtcccgcccc 3300taactccgcc
catcccgccc ctaactccgc ccagttccgc ccattctccg ccccatggct 3360gactaatttt
ttttatttat gcagaggccg aggccgcctc ggcctctgag ctattccaga 3420agtagtgagg
aggctttttt ggaggcctag gcttttgcaa aaagcttgat tcttctgaca 3480caacagtctc
gaacttaagg ctagagccac catgattgaa caagatggat tgcacgcagg 3540ttctccggcc
gcttgggtgg agaggctatt cggctatgac tgggcacaac agacaatcgg 3600ctgctctgat
gccgccgtgt tccggctgtc agcgcagggg cgcccggttc tttttgtcaa 3660gaccgacctg
tccggtgccc tgaatgaact gcaggacgag gcagcgcggc tatcgtggct 3720ggccacgacg
ggcgttcctt gcgcagctgt gctcgacgtt gtcactgaag cgggaaggga 3780ctggctgcta
ttgggcgaag tgccggggca ggatctcctg tcatctcacc ttgctcctgc 3840cgagaaagta
tccatcatgg ctgatgcaat gcggcggctg catacgcttg atccggctac 3900ctgcccattc
gaccaccaag cgaaacatcg catcgagcga gcacgtactc ggatggaagc 3960cggtcttgtc
gatcaggatg atctggacga agagcatcag gggctcgcgc cagccgaact 4020gttcgccagg
ctcaaggcgc gcatgcccga cggcgaggat ctcgtcgtga cccatggcga 4080tgcctgcttg
ccgaatatca tggtggaaaa tggccgcttt tctggattca tcgactgtgg 4140ccggctgggt
gtggcggacc gctatcagga catagcgttg gctacccgtg atattgctga 4200agagcttggc
ggcgaatggg ctgaccgctt cctcgtgctt tacggtatcg ccgctcccga 4260ttcgcagcgc
atcgccttct atcgccttct tgacgagttc ttctgagcgg gactctgggg 4320ttcgaaatga
ccgaccaagc gacgcccaac ctgccatcac gatggccgca ataaaatatc 4380tttattttca
ttacatctgt gtgttggttt tttgtgtgaa tcgatagcga taaggatccg 4440cgtatggtgc
actctcagta caatctgctc tgatgccgca tagttaagcc agccccgaca 4500cccgccaaca
cccgctgacg cgccctgacg ggcttgtctg ctcccggcat ccgcttacag 4560acaagctgtg
accgtctccg ggagctgcat gtgtcagagg ttttcaccgt catcaccgaa 4620acgcgcgaga
cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 4680aatggtttct
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 4740tttatttttc
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 4800gcttcaataa
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 4860tccctttttt
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 4920aaaagatgct
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 4980cggtaagatc
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 5040agttctgcta
tgtggcgcgg tattatcccg tattgacgcc gggcaagagc aactcggtcg 5100ccgcatacac
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 5160tacggatggc
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 5220tgcggccaac
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 5280caacatgggg
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 5340accaaacgac
gagcgtgaca ccacgatgcc tgtagcaatg gcaacaacgt tgcgcaaact 5400attaactggc
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 5460ggataaagtt
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 5520taaatctgga
gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 5580taagccctcc
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 5640aaatagacag
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 5700agtttactca
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 5760ggtgaagatc
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 5820ctgagcgtca
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 5880cgtaatctgc
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 5940tcaagagcta
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 6000tactgtcctt
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 6060tacatacctc
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 6120tcttaccggg
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 6180ggggggttcg
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 6240acagcgtgag
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 6300ggtaagcggc
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 6360gtatctttat
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 6420ctcgtcaggg
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 6480ggccttttgc
tggccttttg ctcacatggc tcgacagatc t 6521