Patent application title: METHODS AND COMPOSITIONS FOR TRANSPOSITION USING MINIMAL SEGMENTS OF THE EUKARYOTIC TRANSFORMATION VECTOR PIGGYBAC
Inventors:
Malcolm Fraser (Granger, IN, US)
Xu Li (Sharon, MA, US)
IPC8 Class: AC12N1585FI
Publication date: 2019-01-03
Patent application number: 20190002914
Abstract:
Isolated nucleic acid molecules that include a minimally-sized,
functional (minimally-functional) piggyBac transposon can incorporate (i)
a 5' internal domain (ID) comprising a nucleotide fragment that is
substantially homologous to a native piggyBac transposon sequence; (ii) a
5' terminal repeat domain (TRD) comprising a 5' terminal repeat (TR)
sequence, a 5' spacer sequence, and a 5' internal repeat (IR) sequence;
(iii) a sequence of interest; (iv) a 3' TRD comprising a 3' IR sequence, a
3' spacer sequence, and a 3' TR sequence; and (v) a 3' ID comprising a
nucleotide fragment that is substantially homologous to the native
piggyBac transposon sequence. The 5' TRD and the 3' TRD can be optionally
linked by a sequence comprising a multiple cloning site.
Claims:
1. An isolated nucleic acid molecule comprising a minimally-sized,
functional (minimally-functional) piggyBac transposon, the
minimally-functional piggyBac transposon comprising the following genetic
elements in a 5' to 3' direction: a 5' internal domain (ID) comprising a
nucleotide fragment that is substantially homologous to a native piggyBac
transposon sequence; a 5' terminal repeat domain (TRD) comprising a 5'
terminal repeat (TR) sequence, a 5' spacer sequence, and a 5' internal
repeat (IR) sequence; a sequence of interest; a 3' TRD comprising a 3' IR
sequence, a 3' spacer sequence, and a 3' TR sequence; and a 3' ID
comprising a nucleotide fragment that is substantially homologous to the
native piggyBac transposon sequence.
2. The isolated nucleic acid molecule of claim 1, wherein the 5' TRD and the 3' TRD are linked by a sequence comprising a multiple cloning site.
3. The isolated nucleic acid molecule of claim 1, wherein the nucleotide fragments of the 5' ID and the 3' ID are each at least 66-bp in length.
4. The isolated nucleic acid molecule of claim 1, wherein the nucleotide fragment of the 5' ID corresponds to a sequence selected from the group consisting of: SEQ ID NO: 192, SEQ ID NO: 198, SEQ ID NO: 199, or SEQ ID NO: 200.
5. The isolated nucleic acid molecule of claim 1, wherein the 5' TRD comprises a 35-bp sequence corresponding to SEQ ID NO: 193.
6. The isolated nucleic acid molecule of claim 1, wherein the 5' TR sequence comprises a 13-bp sequence corresponding to SEQ ID NO: 194.
7. The isolated nucleic acid molecule of claim 1, wherein the 5' IR sequence comprises a 19-bp sequence corresponding to SEQ ID NO: 195.
8. The isolated nucleic acid molecule of claim 1, wherein the 5' spacer sequence comprises the nucleotide sequence 5'-GTC-3'.
9. The isolated nucleic acid molecule of claim 1, wherein the 3' TRD comprises a 63-bp sequence corresponding to SEQ ID NO: 71.
10. The isolated nucleic acid molecule of claim 1, wherein the 3' TR sequence comprises a 13-bp sequence corresponding to SEQ ID NO: 194.
11. The isolated nucleic acid molecule of claim 1, wherein the 3' IR sequence comprises a 19-bp sequence corresponding to SEQ ID NO: 195.
12. The isolated nucleic acid molecule of claim 1, wherein the 5' TR and the 3' TR comprise a same 13-bp sequence corresponding to SEQ ID NO: 194.
13. The isolated nucleic acid molecule of claim 1, wherein the 5' IR and the 3' IR comprise a same 19-bp sequence corresponding to SEQ ID NO: 195.
14. The isolated nucleic acid molecule of claim 1, wherein the nucleotide fragment of the 3' ID comprises SEQ ID NO: 196.
15. The isolated nucleic acid molecule of claim 1, wherein the 3' spacer sequence comprises SEQ ID NO: 197.
16. A plasmid selected from the group consisting of: pBSII-ECFP-R1/L5, pBSII-ECFP-R2/L5, pBSII-ECFP-R3/L5, and pBSII-ECFP-R4/L5.
17. The plasmid of claim 16, wherein the plasmid is pBSII-ECFP-R1/L5.
18. The plasmid of claim 16, wherein the plasmid is pBSII-ECFP-R2/L5.
19. The plasmid of claim 16, wherein the plasmid is pBSII-ECFP-R3/L5.
20. The plasmid of claim 16, wherein the plasmid is pBSII-ECFP-R4/L5.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of U.S. patent application Ser. No. 11/454,947 filed on Jun. 19, 2006, entitled "METHODS AND COMPOSITIONS FOR TRANSPOSITION USING MINIMAL SEGMENTS OF THE EUKARYOTIC TRANSFORMATION VECTOR PIGGYBAC," which is a continuation-in-part of U.S. patent application Ser. No. 10/826,523 filed on Apr. 19, 2004, entitled "METHODS AND COMPOSITIONS FOR TRANSPOSITION USING MINIMAL SEGMENTS OF THE EUKARYOTIC TRANSFORMATION VECTOR PIGGYBAC," which issued as U.S. Pat. No. 7,105,343 on Sep. 12, 2006, which is a continuation-in-part of U.S. patent application Ser. No. 10/001,189 filed on Oct. 30, 2001, entitled "METHODS AND COMPOSITIONS FOR TRANSPOSITION USING MINIMAL SEGMENTS OF THE EUKARYOTIC TRANSFORMATION VECTOR PIGGYBAC," which issued as U.S. Pat. No. 6,962,810 on Nov. 8, 2005, which claims benefit of and priority to U.S. Provisional Patent Application No. 60/244,984 filed on Nov. 1, 2000 entitled "Methods and compositions for transposition using minimal segments of the eukaryotic transformation vector piggyBac" as well as U.S. Provisional Patent Application No. 60/244,667 filed on Oct. 31, 2000 entitled "Methods and compositions for transposition using minimal segments of the eukaryotic transformation vector piggyBac". U.S. patent application Ser. No. 10/826,523 also claims the benefit of and priority to United States Provisional Patent Application No. 60/562,324 filed on Apr. 15, 2004, entitled "Methods and compositions for transposition using minimal segments of the eukaryotic transformation vector piggyBac". All of the foregoing are incorporated herein by this reference in their entirety.
INCORPORATION-BY-REFERENCE OF A SEQUENCE LISTING
[0003] The sequence listing contained in the file "21395-6-1_2018-08-23_Sequence-Listing.txt" created on Aug. 23, 2018 and having a file size of 238,621 bytes and which contains SEQ ID Nos. 1-200 for the current application is incorporated herein by reference in its entirety.
BACKGROUND
The Field of the Invention
[0004] The present invention relates generally to transposable elements, and more particularly to the transposon piggyBac.
Related Art
[0005] Transposable elements (transposons) can move around a genome of a cell and are useful for inserting genes for the production of transgenic organisms. The Lepidopteran transposon piggyBac is capable of moving within the genomes of a wide variety of species, and is gaining prominence as a useful gene transduction vector. The transposon structure includes a complex repeat configuration consisting of an internal repeat (IR), a spacer, and a terminal repeat (TR) at both ends, and a single open reading frame encoding a transposase.
[0006] The Lepidopteran transposable element piggyBac was originally isolated from the TN-368 Trichoplusia ni cell culture as a gene disrupting insertion within spontaneous baculovirus plaque morphology mutants. PiggyBac is a 2475 bp short inverted repeat element that has an asymmetric terminal repeat structure with a 3-bp spacer between the 5' 13-bp TR (terminal repeat) and the 19-bp IR (internal repeat), and a 31-bp spacer between the 3' TR and IR. The single 2.1 kb open reading frame encodes a functional transposase (Cary et al., 1989; Fraser et al., 1983, 1995; Elick et al., 1996a; Lobo et al., 1999; Handler et al., 1998).
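The asymmetry of the two terminal repeat domains follows directly from the element lengths given above. The following minimal sketch simply sums those lengths; the dictionary layout and the function name `trd_length` are illustrative assumptions and are not part of the disclosure.

```python
# Element lengths in bp, as stated in the text above.
# The data layout and names are hypothetical, chosen only for illustration.
FIVE_PRIME_TRD = {"TR": 13, "spacer": 3, "IR": 19}
THREE_PRIME_TRD = {"IR": 19, "spacer": 31, "TR": 13}


def trd_length(domain):
    """Return the total length in bp of a terminal repeat domain."""
    return sum(domain.values())


print(trd_length(FIVE_PRIME_TRD))   # 35
print(trd_length(THREE_PRIME_TRD))  # 63
```

The totals of 35 bp and 63 bp agree with the TRD lengths recited elsewhere in this document; the 28-bp difference comes entirely from the two spacers.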
[0007] PiggyBac transposes via a unique cut-and-paste mechanism, inserting exclusively at 5' TTAA 3' target sites that are duplicated upon insertion, and excising precisely, leaving no footprint (Elick et al., 1996b; Fraser et al., 1996; Wang and Fraser 1993).
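The sequence-level outcome described above (insertion at TTAA, duplication of the target site, and footprint-free excision) can be illustrated with a toy string model. This is a sketch under stated assumptions only: the function names, the target string, and the placeholder element `XXXXXXXX` are hypothetical, and nothing here models the transposase chemistry.

```python
def find_ttaa_sites(seq):
    """Return 0-based start positions of every TTAA in seq."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "TTAA"]


def insert_at(seq, pos, element):
    """Insert element at the TTAA starting at pos, duplicating the TTAA
    so that the element is flanked by one copy on each side."""
    if seq[pos:pos + 4].upper() != "TTAA":
        raise ValueError("position is not a TTAA target site")
    return seq[:pos + 4] + element + seq[pos:]


def excise(seq, element):
    """Remove the element plus one flanking TTAA copy, restoring the
    original single target site (no footprint)."""
    return seq.replace(element + "TTAA", "", 1)


target = "GCGTTAACGC"   # hypothetical target DNA
element = "XXXXXXXX"    # placeholder, not a real transposon sequence

sites = find_ttaa_sites(target)
inserted = insert_at(target, sites[0], element)
print(sites)      # [3]
print(inserted)   # GCGTTAAXXXXXXXXTTAACGC
print(excise(inserted, element) == target)  # True
```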
[0008] Transient excision and interplasmid transposition assays have verified movement of this element in the SF21AE Spodoptera frugiperda cell line, and embryos of the Lepidopteran Pectinophora gossypiella, Bombyx mori, and T. ni, as well as the Dipteran species Drosophila melanogaster, Aedes aegypti, Aedes triseriatus, Aedes albopictus, Anopheles stephensi and Anopheles gambiae. There is also evidence of transposition in the Cos-7 primate cell line, and embryos of the zebra fish, Danio rerio (Fraser et al., 1995; Buck et al., 1996b; Fraser et al., 1996; Elick et al., 1997; Thibault et al., 1999; Tamura et al., 2000; Lobo et al., 1999).
[0009] The piggyBac element has been used successfully as a helper-dependent gene transfer vector in a wide variety of insect species, including the Mediterranean fruit fly, C. capitata, D. melanogaster, Bombyx mori, P. gossypiella, Tribolium castaneum, and Ae. aegypti (Handler et al., 1998, 1999; Tamura et al., 2000; Berghammer et al., 1999).
[0010] Excision assays using both wildtype and mutagenized piggyBac terminal sequences demonstrated that the element does not discriminate between proximal or distal duplicated ends, and suggest that the transposase does not first recognize an internal binding site and then scan towards the ends. In addition, mutagenesis of the terminal trinucleotides or the terminal-proximate three bases of the TTAA target sequence eliminates excision at the altered terminus (Elick et al., 1996b).
[0011] Although the reported piggyBac vector is useful, the length of genes that could be transferred is limited by the size of the other components of the vector. Minimizing the length of the vector to allow more room for the genetic material to be transferred would improve the versatility of the system and reduce costs of preparing synthetic vectors. Previously, the gene to be expressed or transduced was inserted into the middle of the piggyBac transposon in the plasmid p3E1.2. The final construct included the entire length of the piggyBac transposon (2475 bases) and flanking sequences derived from the baculovirus 25K gene region of approximately 813 bases, as well as the plasmid pUC backbone of 2686 bp, and an overall size of approximately 5962 bp. In cloning sequences into the pUC vector, 12 bp of multiple cloning site DNA was lost. This size limited the effective size of genes that may be inserted, because plasmids larger than 10 kb are generally more difficult to construct, maintain, and transduce into host genomes.
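The overall size quoted above can be reproduced from its components. The short sketch below does the arithmetic; the variable names are illustrative, and treating 10 kb as exactly 10,000 bp and as a hard ceiling is a simplifying assumption, since the text describes it only as the size above which plasmids become generally harder to handle.

```python
# Component sizes in bp, taken from the paragraph above.
PIGGYBAC_BP = 2475        # full-length piggyBac transposon
FLANKING_BP = 813         # baculovirus 25K gene region (approximate)
PUC_BACKBONE_BP = 2686    # pUC backbone
MCS_LOST_BP = 12          # multiple cloning site DNA lost during cloning

construct_bp = PIGGYBAC_BP + FLANKING_BP + PUC_BACKBONE_BP - MCS_LOST_BP
print(construct_bp)  # 5962

# Assumption for illustration only: 10 kb taken as a hard 10,000 bp ceiling.
PRACTICAL_LIMIT_BP = 10_000
print(PRACTICAL_LIMIT_BP - construct_bp)  # 4038 bp of nominal room for an insert
```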
[0012] Another problem was that previous cloning regimens involved the excision of a gene, the promoter controlling the gene, and polyadenylation signals, from one plasmid followed by insertion into the piggyBac transfer vector. This procedure was often complicated by the lack of suitable restriction enzyme sites for these manipulations.
BRIEF SUMMARY OF THE INVENTION
[0013] The present invention identifies the specific sequences in a mobile genetic element, the transposon piggyBac, and sequence configurations outside of piggyBac, that are minimally required for full functionality of the sequence as a transposon. Inserting DNA molecules into cells is enhanced using the methods and compositions of the present invention.
[0014] The present invention solves problems in use of the piggyBac vector for gene transfer caused by lack of suitable restriction sites to cut the components needed for gene transfer, and limitations on the sizes (lengths) of genes transferred by use of this vector. Methods and compositions of the present invention enlarge the size of the gene that may be transferred in two ways. First, a minimal sequence cartridge may be easily amplified using primers containing desired restriction endonuclease sites, and the cartridge may then be inserted into any plasmid containing the gene with its attendant promoter and polyadenylation signals intact, converting that plasmid into a piggyBac transposon. Second, a multiple cloning site may be inserted into a minimal plasmid vector, facilitating the insertion of genes in this more traditional plasmid vector. The vectors may both be used for applications including producing transgenic organisms, both plants and animals. The present invention has been successful in exemplary transpositions using the primate Cos-7 vertebrate cell line and embryos of the zebra fish, Danio rerio, among others.
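As an illustration of a primer that carries a restriction endonuclease site, the internal repeat specific primer listed in the description of FIG. 10A (SEQ ID NO: 1) begins with GGATCC, the recognition sequence of BamHI. The sketch below separates that site from the remainder of the primer; the function name `split_primer` is hypothetical, and calling the remainder the "annealing portion" is an interpretive assumption rather than a statement from the disclosure.

```python
BAMHI_SITE = "GGATCC"
# Primer sequence as printed in the description of FIG. 10A (SEQ ID NO: 1).
PRIMER = "GGATCCCATGCGTCAATTTTACGCA"


def split_primer(primer, site):
    """Split a primer into its 5' restriction site and the remaining bases.

    Raises ValueError if the primer does not start with the site.
    """
    primer = primer.upper()
    if not primer.startswith(site):
        raise ValueError("primer does not begin with the restriction site")
    return site, primer[len(site):]


site, annealing = split_primer(PRIMER, BAMHI_SITE)
print(site)            # GGATCC
print(annealing)       # CATGCGTCAATTTTACGCA
print(len(annealing))  # 19
```

The 19 remaining bases are the same length as the 19-bp internal repeat described in the Background.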
[0015] Methods and compositions are disclosed herein for transferring genes using the minimum internal and external sequences of the transformation vector piggyBac. In an embodiment of the invention, all non-essential sequences are removed, including the bulk of the piggyBac internal domain and the flanking baculovirus sequences. By means of the minimal piggyBac cartridge, a DNA molecule may be transferred from a plasmid into a host cell.
[0016] In one aspect, the invention provides a DNA molecule that in some embodiments comprises at least 163 consecutive nucleotide base pairs of the 3' terminal region beginning at the 3' terminal base pair, and at least 125 consecutive nucleotide base pairs of the 5' terminal region beginning at the 5' terminal base pair of the piggyBac molecule, the region extending from the restriction site SacI to the end of the piggyBac molecule.
[0017] In another aspect, the invention comprises a genetic cartridge designated ITR.
[0018] In some embodiments, the invention provides a genetic cartridge designated ITR1.1 k.
[0019] According to another aspect, the invention provides a vector. In some embodiments, the vector is designated pXL-Bac as shown in FIG. 3. In other embodiments, the vector is designated pXL-BacII-ECFP as shown in FIG. 24 D. In yet additional embodiments, the vector is designated pBSII-ITR1.1 k-ECFP as shown in FIG. 24 C.
[0020] In other aspects, the invention provides a nucleic acid molecule comprising a nucleic acid sequence. In some embodiments, the nucleic acid sequence comprises a minimal sequence of consecutive nucleotide base pairs (a minimal sequence component) having a sequence that is homologous to a nucleic acid sequence of a 5' terminal region of a piggyBac native nucleic acid sequence, and a longer sequence of consecutive nucleotide base pairs (a longer sequence component) that is homologous to a nucleic acid sequence of a 3' terminal region of a piggyBac native nucleic acid sequence.
[0021] In some embodiments, the minimal sequence of consecutive nucleotide base pairs that is homologous to a nucleic acid sequence of a 5' terminal region of the piggyBac native nucleic acid sequence is a sequence of nucleotide base pairs that is about 50 to about 80 base pairs in length, or is about 60 to about 70 base pairs in length, or is 66 base pairs in length. In other embodiments, the minimal sequence of consecutive nucleotide base pairs is defined as comprising a nucleic acid sequence that is homologous to a nucleic acid sequence that is the sequence at nucleotide positions 36 to 100 of the native piggyBac nucleic acid sequence.
[0022] In some embodiments, the longer sequence of consecutive nucleotide base pairs from the 3' terminal region of the piggyBac nucleic acid sequence is about 125 to about 450 base pairs in length, or about 200 to about 400 base pairs in length, or about 300 to about 380 base pairs in length, or about 311, 350, or 378 base pairs in length. In some embodiments, the longer sequence of consecutive nucleotide base pairs is defined as comprising a nucleic acid sequence that is homologous to a nucleic acid sequence that is the sequence at nucleotide positions 2031 to 2409 of the native piggyBac sequence.
[0023] The homology that the minimal sequence component and the longer sequence component have with the referenced native piggyBac nucleic acid sequence as defined herein is a degree of homology that is sufficient to produce a functionally equivalent activity that is equal or substantially equal to the native piggyBac nucleic acid sequence. Homology may also be described relative to the percent (%) similarity that the minimal sequence component or the longer sequence component has to the referenced native piggyBac nucleic acid sequence. In some embodiments, the homology may be 40% or more, 45% or more, 50% or more, 60% or more, 70% or more, 80% or more, 90% or more, or even up to 100% homology with the nucleic acid sequence of the corresponding native piggyBac sequence.
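The paragraph above does not specify how percent similarity is to be computed. Purely as an illustration, the sketch below assumes the simplest possible metric: ungapped, position-by-position identity between two sequences of equal length. The function name `percent_identity` and the example sequences are hypothetical; practical comparisons would normally rely on an alignment tool that handles gaps.

```python
def percent_identity(a, b):
    """Percent of positions at which two equal-length sequences match.

    Assumption: ungapped, case-insensitive comparison. This is one simple
    reading of 'percent similarity', not a method stated in the text.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


print(percent_identity("ACGT", "ACGT"))  # 100.0
print(percent_identity("ACGT", "ACGA"))  # 75.0
```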
[0024] In some embodiments, the DNA molecule comprises a nucleic acid sequence encoding a phenotypic marker.
[0025] In some embodiments, the DNA molecule comprises a nucleic acid sequence encoding a spacer sequence of interest. The spacer sequence may comprise a sequence of any desired length, and in some embodiments, may be described by the term, "stuffer". This "stuffer" may comprise a nucleic acid sequence of about 10 to about 1000 base pairs, or about 20, 30, 40, 50, 60, 100, 200, 300, 400, 500, 700, 800, or even 1000 base pairs or more. In yet other embodiments, the DNA molecule comprises a nucleic acid sequence encoding a molecule of interest, such as a protein, peptide, or a synthetic or non-synthetic, organic, inorganic, or other type of molecule.
[0026] In another aspect, the invention provides a plasmid comprising a nucleic acid sequence of a DNA molecule having the minimal nucleotide sequence of consecutive nucleotide base pairs from the 5' terminal region of the piggyBac nucleic acid sequence and the longer nucleotide sequence of consecutive nucleotide base pairs from the 3' terminal region of the piggyBac nucleic acid sequence.
[0027] In some embodiments the nucleic acid molecule may comprise a nucleic acid sequence comprising one or more than one minimal sequence of consecutive nucleotide base pairs substantially homologous to a 5' terminal region of a piggyBac nucleic acid sequence, one or more than one longer nucleotide sequence of consecutive nucleotide base pairs substantially homologous to a 3' terminal region of a piggyBac nucleic acid sequence, or any combination thereof and in any desired construct arrangement. By way of example and not limitation, one embodiment of such a nucleic acid molecule may comprise a first minimal sequence of consecutive nucleotide base pairs substantially homologous to a 5' terminal region of a piggyBac nucleic acid sequence, adjacent to a longer nucleotide sequence of consecutive nucleotide base pairs substantially similar to a 3' terminal region of a piggyBac nucleic acid sequence, and a second minimal sequence of consecutive nucleotide base pairs substantially homologous to a 5' terminal region of a piggyBac nucleic acid sequence. In some embodiments, this and any other of the constructs of the present invention may include one or more of the small repeat sequences, such as the -CAAAAT- or -ACTTATT- small repeat sequences.
[0028] In some embodiments, the invention provides a plasmid designated pBSII-ITR1.1 k-ECFP.
[0029] In other embodiments, the invention provides a plasmid designated pCaSpeR-hs-orf.
[0030] In still other embodiments, the invention provides a plasmid p(PZ)-Bac-EYFP (FIG. 29A).
[0031] In other embodiments, the invention provides a plasmid pBSII-3xP3-ECFP.
[0032] In yet other embodiments, the invention provides a plasmid designated pBSII-ECFP-R4/L. In particular embodiments, the plasmid is pBSII-ECFP-R4/L2, pBSII-ECFP-R4/L3, pBSII-ECFP-R4/L4, or pBSII-ECFP-R4/L5 (FIG. 27).
[0033] Another broad aspect of the invention provides a method for providing high frequency transformation of an insect genome using a vector comprising the minimal 5' terminal region and longer 3' terminal region sequence of a piggyBac sequence, in the presence of a helper plasmid. In some embodiments, the vector further comprises a small terminal repeat sequence, CAAAAT. In particular embodiments, the helper plasmid is a plasmid pCaSpeR-hs-orf.
[0034] In some embodiments, the insect genome is further described as that of an insect. In some embodiments, the insect is a mosquito.
[0035] In some embodiments, the method of high frequency transformation may be described as providing a frequency of transformation that is enhanced 100-fold or higher than the transformation frequency obtained with a vector other than the minimal 5', longer 3' terminal end piggyBac constructs described herein.
[0036] In another aspect, the invention provides a transformed cell transformed with a transformation vector comprising a nucleic acid sequence that includes a minimal sequence component homologous to a 5' terminal region of a piggyBac native nucleic acid sequence and a longer sequence component homologous to a 3' terminal region of a piggyBac native nucleic acid sequence. In some embodiments, the transformed cell is an insect cell, such as Drosophila melanogaster.
BRIEF DESCRIPTION OF THE DRAWINGS
[0037] The invention will be described in conjunction with the accompanying drawings, in which:
[0038] FIG. 1 shows a p3E1.2 deletion series of plasmids and excision assay results; the p3E1.2 plasmid was used to make progressive deletions using the exonuclease ExoIII; three of the maximum deletion plasmids, p3E1.2-d-7, p3E1.2-d-8 and p3E1.2-d-9, were used to perform excision assays in T. ni embryos; p3E1.2-d-7 and p3E1.2-d-8 plasmids retained the complete 3' terminal repeat configurations and were characterized by a similar excision frequency as the intact p3E1.2 plasmid; however, p3E1.2-d-9 did not yield any excision events, and sequencing results show that its 3' IR and part of the 31 bp spacer sequence are deleted;
[0039] FIG. 2A-2C(2). 2A shows the pIAO-P/L insertion series of plasmids and presents interplasmid transposition assay results: (A) lists the pIAO-P/L series of plasmids' insertion sequences (SEQ ID NOS: 35-39) and their interplasmid transposition assay (IPTA) frequencies are shown; all the pIAO-P/L insertion plasmids were co-injected with the piggyBac helper plasmid, phspBac, and the target plasmid, pGDV1, into T. ni embryos to perform an interplasmid transposition assay; the results show that when the insertion sequence is less than 40 bp, the transposition frequency drops dramatically; 2B is a schematic representation of the pIAO-P/L series plasmids; the piggyBac sequence was PCR amplified from a p3E1.2B/X plasmid, polhlacZ is from a pD2/-gal DraI/NruI fragment and AMP/ori was PCR amplified from a pUC18 plasmid; and (C1) is the nucleotide sequence of pIAO-P/L (SEQ ID NO: 57) and the amino acid sequences (SEQ ID NOS 58, 142-126, 59, 144-143, 60, 153-145, 61 & 62) (C2) is the nucleotide sequence of pIAO-P/L-Lambda (2.2 kb) (SEQ ID NO: 63) and the amino acid sequences (SEQ ID NOS 58, 142-126, 59, 144-143, 60, 153-145, 61, 157-154, 64, 190-158, 65 & 66);
[0040] FIG. 3A-3C(2) present a schematic representation of an ITR cartridge and pXL-Bac minimum piggyBac vectors; 3A shows that the ITR cartridge may be amplified from the pIAO-P/L-589 bp plasmid using an IR-specific primer; the amplified ITR may convert any existing plasmid into a piggyBac transposon, which may be mobilized if provided with the piggyBac transposase; 3B is a map of the pXL-Bac plasmid (MCS=multiple cloning site; BamHI and BssHII are restriction sites); 3C1 is the ITR cartridge nucleotide sequence (SEQ ID NO: 40); and 3C2 is the nucleotide sequence (SEQ ID NO: 41) of pXL-Bac;
[0041] FIG. 4 is a restriction map of plasmid pCaSpeR-hs-orf (p32), containing a 2016 bp PCR BamHI fragment containing piggyBac transposase and its terminator, cloned into BamHI sites of pCaSpeR-hs;
[0042] FIG. 5A-5B. 5A is a plasmid map showing the piggyBac ORF was amplified as a BamHI cartridge from the p3E1.2 plasmid and cloned into pCaSpeR-hs plasmid, positioning it for transcriptional control by the hsp70 promoter; 5B is the nucleotide sequence (SEQ ID NO: 42) of pCaSpeR-hs-orf;
[0043] FIG. 6A-6B. 6A is a plasmid map showing that the piggyBac ORF BamHI cartridge from pCaSpeR-hs-orf was cloned into the pBSII (Stratagene) positioning it for transcription under control of the T7 promoter to form pBSII-IFP2orf; 6B is the nucleotide sequence (SEQ ID NO: 43) of pBSII-IFP2-orf;
[0044] FIG. 7 is a plasmid map showing that the hsp70 promoter was excised from the pCaSpeR-hs plasmid by EcoR I and EcoR V digestion, followed by blunt ending, and cloned into pBSII-IFP2orf at the EcoR I and Hind III (blunt ended) sites to form pBSII-hs-orf (SEQ ID NO: 42);
[0045] FIG. 8A-8B. 8A is a plasmid map showing that the IE1 promoter was PCR amplified from the pIE1FB plasmid (Jarvis et al., 1990) and cloned into the pBSII-IFP2orf plasmid to form pBSII-IE1-orf; 8B is the nucleotide sequence (SEQ ID NO: 44) of pBSII-IE1-orf;
[0046] FIG. 9A-9B. 9A is a plasmid map showing that the base plasmid is pDsRed1-N1 (Clontech). The 3xP3 promoter was PCR amplified from pBac [3xP3-EYFPafm] (Horn and Wimmer, 2000) and cloned into the Xho I and EcoR I sites of pDsRed1-N1 to form the p3xP3-DsRed plasmid. The piggyBac ORF BamHI cartridge from pCaSpeR-hs-orf was then cloned into the BglII site of p3xP3-DsRed positioning it under control of the CMV (cytomegalovirus) promoter to form p3xP3-DsRed-orf; 9B is the nucleotide sequence (SEQ ID NO: 45) of p3xP3-DsRed-orf. DsRed is a marker from Invitrogen and 3xP3 is a promoter specific for eyes of insects;
[0047] FIG. 10A-10B. 10A is a plasmid map showing that the ITR cartridge was PCR amplified as a BamHI fragment using a piggyBac internal repeat specific primer (5'-GGATCCCATGCGTCAATTTTACGCA-3') (SEQ ID NO: 1) and pIAO-P/L-589 bp plasmid as a template, and cloned into the pCRII plasmid (Invitrogen) to form the pCRII-ITR plasmid; 10B is the nucleotide sequence of pCRII-ITR (SEQ ID NO: 46) and the amino acid sequence (SEQ ID NO: 47);
[0048] FIG. 11 is a plasmid map showing that the ITR BamHI cartridge was recovered from the pCRII-ITR plasmid and religated, then cut with BssHII and cloned into the BssHII sites of the pBSII plasmid (Stratagene) to form pBS-ITR (rev) plasmid. The Multiple Cloning Sites were PCR amplified as a BglII fragment from the pBSII plasmid and were cloned into the BamHI site to form the pXL-Bac plasmid;
[0049] FIG. 12A-12B. 12A is a plasmid map showing that the P element enhancer trap plasmid pP{PZ} (from Dr. O'Tousa, Univ. of Notre Dame) was digested with Hind III then self-ligated to produce the p(PZ)-HindIII plasmid. The ITR cartridge was excised using Sal I and Not I (blunt-ended) from pCRII-ITR and then cloned into the blunt ended Hind III site to form p(PZ)-Bac. The 3xP3-EYFP was PCR amplified as an Spe I fragment from pBac[3xP3-EYFPafm] (Horn and Wimmer, 2000) and cloned into the Spe I site of p(PZ)-Bac plasmid to form the p(PZ)-Bac-EYFP plasmid; 12B is the nucleotide sequence (SEQ ID NO: 48) of p(PZ)-Bac-EYFP;
[0050] FIG. 13A-13B. 13A is a plasmid map showing that the P element enhancer trap plasmid pP{PZ} (from Dr. O'Tousa, Univ. of Notre Dame) was digested with HindIII then self-ligated to produce the p(PZ)-HindIII plasmid. The ITR cartridge was excised using Sal I and Not I (blunt ended) from pCRII-ITR and then cloned into the blunt ended Hind III site to form p(PZ)-Bac. The 3xP3-ECFP was PCR amplified as an Spe I fragment from pBac[3xP3-ECFPafm] (Horn and Wimmer, 2000) and cloned into the Spe I site of the p(PZ)-Bac plasmid to form the p(PZ)-Bac-ECFP plasmid; 13B is the nucleotide sequence (SEQ ID NO: 49) of p(PZ)-Bac-ECFP;
[0051] FIG. 14A-14B. 14A is a plasmid map showing that the P element enhancer trap plasmid pP{PZ} (from Dr. O'Tousa, Univ. of Notre Dame) was digested with Hind III then self-ligated to produce the p(PZ)-HindIII plasmid. The ITR cartridge was excised using Sal I and Not I (blunt ended) from pCRII-ITR and then cloned into the blunt ended HindIII site to form p(PZ)-Bac. The 3xP3-EGFP was PCR amplified as an Spe I fragment from pBac[3xP3-EGFPafm] (Horn and Wimmer, 2000) and cloned into the Spe I site of the p(PZ)-Bac plasmid to form the p(PZ)-Bac-EGFP plasmid; 14B is the nucleotide sequence (SEQ ID NO: 50) of p(PZ)-Bac-EGFP;
[0052] FIG. 15A-15B. 15A is a plasmid map showing that the 3xP3-EYFP gene was PCR amplified as an Spe I fragment from pBac [3xP3-EYFPafm] (Horn and Wimmer, 2000) and cloned into the Spe I site of the pXL-Bac plasmid to form the pXL-Bac-EYFP plasmid; 15B is the nucleotide sequence (SEQ ID NO: 51) of pXL-Bac-EYFP;
[0053] FIG. 16A-16B. 16A is a plasmid map showing that the 3xP3-EGFP gene was PCR amplified as an Spe I fragment from pBac [3xP3-EGFPafm] (Horn and Wimmer, 2000) and cloned into the Spe I site of the pXL-Bac plasmid to form the pXL-Bac-EGFP plasmid; 16B is the nucleotide sequence (SEQ ID NO: 52) of pXL-Bac-EGFP;
[0054] FIG. 17A-17B. 17A is a plasmid map showing that the 3xP3-ECFP gene was PCR amplified as an Spe I fragment from pBac [3xP3-ECFPafm] (Horn and Wimmer, 2000) and cloned into the Spe I site of the pXL-Bac plasmid to form the pXL-Bac-ECFP plasmid; 17B is the nucleotide sequence (SEQ ID NO: 53) of pXL-Bac-ECFP;
[0055] FIG. 18A-18B. 18A is a plasmid map showing that the 3xP3-ECFP was PCR amplified as an Spe I fragment from pBac[3xP3-ECFPafm] (Horn and Wimmer, 2000) and cloned into the Spe I site of the pBS-ITR plasmid to form the pBS-ITR-ECFP plasmid; 18B is the nucleotide sequence (SEQ ID NO: 54) of pBS-ITR-ECFP;
[0056] FIG. 19A-19B. 19A is a plasmid map showing that the 3xP3-EGFP was PCR amplified as an Spe I fragment from pBac[3xP3-EGFPafm] (Horn and Wimmer, 2000) and cloned into the Spe I site of the pBS-ITR plasmid to form the pBS-ITR-EGFP plasmid; 19B is the nucleotide sequence (SEQ ID NO: 55) of pBS-ITR-EGFP;
[0057] FIG. 20A-20B. 20A is a plasmid map showing that the 3xP3-EYFP was PCR amplified as an Spe I fragment from pBac[3xP3-EYFPafm] (Horn and Wimmer, 2000) and cloned into the Spe I site of the pBS-ITR plasmid to form the pBS-ITR-EYFP plasmid; 20B is the nucleotide sequence (SEQ ID NO: 56) of pBS-ITR-EYFP;
[0058] FIG. 21A-21B. 21A is a plasmid map showing that the Actin 5c promoter was cloned as a BamHI and EcoRI fragment (bases 3046 to 3055 of SEQ ID NO: 67) from the pHAct5cEGFP plasmid (from Dr. Atkinson, UC Riverside) into the BamHI and EcoRI sites of the pBSII plasmid (Stratagene) to form the pBSII-Act5c plasmid. The piggyBac ORF BamHI cartridge from pCaSpeR-hs-orf was then cloned into the pBSII-Act5c plasmid under control of the Act5c promoter to form the pBSII-Act5c-orf plasmid; 21B is the nucleotide sequence (SEQ ID NO: 67) of pBSII-Act5c-orf;
[0059] FIG. 22 is the nucleotide sequence (SEQ ID NO: 68) of pCaSpeR-hs-pBac;
[0060] FIG. 23 is a comparison of natural and optimized piggyBac nucleotide sequences (SEQ ID NOS: 69 and 70) wherein "optimizing" means using codons specific for insects;
[0061] FIG. 24A-24D shows the construction of plasmids developed in the present work. 24A shows a diagram of the pCaSpeR-hs-orf helper used for the transformation assays. The piggyBac cassette was cloned as a PCR product into the BamH I site of the pCaSpeR-hs adjacent to the hsp70 promoter. 24B shows a diagram of the p(PZ)-Bac-EYFP construct demonstrating the inefficiency of the ITR cartridge (Li et al., 2001b) for transformation. A 7 kb Hind III fragment containing LacZ, hsp70 and Kan/ori sequences was excised from plasmid p(PZ) (Rubin and Spradling, 1983), and ligated to form a p(PZ)-7 kb intermediate plasmid. The ITR cartridge was excised from pBSII-ITR (Li et al., 2001b) using Not I and Sal I, blunt ended, and inserted into the blunt end Hind III site of the p(PZ)-7 kb plasmid. A 3xP3-EYFP Spe I fragment excised from pBac{3xP3-EYFPafm} (Horn and Wimmer, 2000) was then inserted into the Xba I site to form p(PZ)-Bac-EYFP. 24C shows a diagram of the pBSII-ITR1.1 k-ECFP minimal piggyBac vector constructed by PCR amplification from the pIAO-P/L 589 plasmid (Li et al., 2001b), which contains a minimum piggyBac cartridge with inverted 5' and 3' TRDs separated by a 589 bp λ DNA spacer sequence, and incorporates additional subterminal ID sequences necessary for efficient transformation. This construct is tagged by the addition of the 3xP3-ECFP marker gene excised as a SpeI fragment from the plasmid pBac{3xP3-ECFPafm} (Horn and Wimmer, 2000). 24D shows a diagram of the piggyBac minimal vector pXL-BacII-ECFP, constructed from the pBSII-ITR1.1 k plasmid essentially as previously described (Li et al., 2001b), with the addition of the 3xP3-ECFP SpeI fragment from pBac{3xP3-ECFPafm}.
[0062] FIG. 25 shows a schematic illustration f TRD and adjacent ID regions present in plasmids and synthetic piggyBac internal deletion series constructs tested for transformation efficiency. The plasmids p(PZ)-Bac-EYFP and all pBSII-ECFP synthetic deletions are based on sequences amplified from the pIAO-P/L-589 construct of Li et al. (2001b). All plasmids have the 35 bp 5' TRD and 63 bp 3' TRD, and include variable lengths of 5' and 3' adjacent ID sequence. The relative transformation frequency for each plasmid is listed to the right for convenience.
[0063] FIG. 26A-26B shows a direct PCR analysis of transformed flies. 26A shows a diagram of a generalized synthetic deletion construct indicating the position of primers and expected fragments. Three sets of PCR primers were used to verify the piggyBac insertion. The first primer set (IFP2_R1+MF34) detects the 5' terminal region (115 bp), the second primer set (IFP2_L+MF34) detects the 3' terminal region (240 bp), and the third primer set (IFP2_R1+IFP2_L) detects the presence of the external spacer sequence (945 bp). 26B shows the direct PCR results. a.) The first primer set yields a 115 bp fragment in all transformed strains, confirming the 5' terminal region. A less effectively amplified 115 bp fragment is also evident in the w.sup.1118 strain, reflecting the probable presence of piggyBac-like sequences in the D. melanogaster genome. b.) The second primer set yields the expected 240 bp fragment in all transformed strains, confirming the 3' terminal region, while this fragment is absent in the w.sup.1118 strain. c.) The external spacer primer set failed to amplify a sequence in any of the transformed strains or the control w.sup.1118.
[0064] FIG. 27A-27B shows Southern hybridization analysis of synthetic deletion plasmid transformed strains. Genomic DNAs from selected strains and the pBSII-ITR1.1k-ECFP plasmid control were digested with Hind III and hybridized to the pBSII-ITR1.1k-ECFP plasmid probe. 27A provides a map of the pBSII-ITR1.1k-ECFP plasmid showing the size of expected diagnostic fragments. 27B shows that all transformed strains exhibit the two diagnostic bands (2.9 kb and 1.16 kb) and at least two additional bands reflecting the piggyBac terminal adjacent sequences at the site of integration. A weak 1.3 kb band was also observed in all strains and probably represents a piggyBac-like sequence in the w.sup.1118 genome. The reduced intensity of the two additional bands representing joining sequences between the piggyBac termini and adjacent genomic DNA in each of the transformed strains is likely due to weaker hybridization of the 200 to 300 bp of AT rich sequences of this region of the probe.
[0065] FIG. 28 shows a schematic illustration of the locations of the two short repeat sequence motifs identified in the TRD adjacent ID sequences of piggyBac. Several of these repeat motifs are within regions between R and R1, or L and L2, which appear to be the critical regions based on the present transformation results. These repeats are also found in other positions of the piggyBac ID sequence.
[0066] FIG. 29A-29B show a Southern hybridization analysis of the single p(PZ)-Bac-EYFP transformant. Genomic DNA from the p(PZ)-Bac-EYFP strain and the w.sup.1118 white-eye strain were digested with Sal I, with a Sal I digest of the p(PZ) plasmid serving as control. The probe was PCR amplified from p(PZ)-Bac-EYFP using the primers 3xP3_for and M13_For. 29A shows a map of the p(PZ)-Bac-EYFP plasmid illustrating the position of Sal I sites, the region used as the probe, and the expected size (3.6 kb) of the diagnostic hybridization fragment. 29B shows that the two p(PZ)-Bac-EYFP transgenic sublines (lanes 2 and 3) exhibit the diagnostic 3.6 kb band and two additional bands representing junction fragments containing genomic sequences and piggyBac ends at the single insertion site.
[0067] FIG. 30 shows an identified point mutation in the 3' internal repeat sequence. A point mutation was discovered in the 19 bp internal repeat sequence (IR) of the 3' TRD in all of the constructs derived from the pIAO-P/L 589 plasmid (Li et al., 2001b). This nucleotide substitution from C to A (bold and underlined) had no apparent effect on the transposition frequency of any of these constructs relative to the pBac{3xP3-EYFP} control plasmid (SEQ ID NOS 71 & 72 are disclosed respectively in order of appearance).
DETAILED DESCRIPTION
[0068] It is advantageous to define several terms before describing the invention. It should be appreciated that the following definitions are used throughout this application.
Definitions
[0069] Where the definition of terms departs from the commonly used meaning of the term, applicant intends to utilize the definitions provided below, unless specifically indicated.
[0070] For the purposes of the present invention, the term "genetic construct" refers to any artificially assembled combination of DNA sequences.
[0071] For the purposes of the present invention, the term "helper construct" refers to any plasmid construction that generates the piggyBac transposase gene product upon transfection of cells or injection of embryos.
[0072] For the purposes of the present invention, the term "ID region" or "ID regions" relates to a nucleic acid sequence that is derived from the native piggyBac sequence.
[0073] For purposes of the present invention, the term "long" or "longer" as it refers to the length of a 3' terminal region of a piggyBac nucleic acid sequence is defined as a sequence that includes 250 base pairs (bp) or more, 300 bp or more, 350 bp or more, 375 bp or more, or 400 bp or more.
[0074] For purposes of the present invention, the term "native" refers to any sequence defined as or recognized to be functionally or otherwise homologous to a piggyBac nucleic acid sequence or amino acid sequence in any species, including but not limited to humans, zebra fish, mosquitoes, fruit flies (e.g., Drosophila melanogaster), or any other vertebrate or invertebrate.
[0075] For the purposes of the present invention, the term "plasmid" refers to any self-replicating extrachromosomal circular DNA molecule capable of maintaining itself in bacteria.
[0076] For the purposes of the present invention, the term "spacer" refers to sequences, for example from about 3 bp to about 31 bp or more in length, separating the 5' and 3' (respectively) terminal repeat and internal repeat sequences of the piggyBac transposon.
[0077] For purposes of the present invention, the term "substantially homologous" is defined as a nucleic acid sequence that has or is able to elicit the same or substantially similar functional activity as a native piggyBac sequence.
[0078] For the purposes of the present invention, the term "transgenic organism" refers to an organism that has been altered by the addition of foreign DNA sequences to its genome.
[0079] For the purposes of the present invention, the term "vector" refers to any plasmid containing piggyBac ends that is capable of moving foreign sequences into the genomes of a target organism or cell.
Description
[0080] The minimal sequence cartridges of the present invention facilitate transposition of DNA molecules of interest into cells, and production of transgenic organisms that include the transferred DNA molecule in some or all of their cells. A DNA molecule(s) is excised from a genetic (transformation) construct, and is transferred to a cell where it is inserted into the cell's genome. The DNA molecule is accompanied by regulatory elements sufficient to allow its expression in the host cell. "Cell" as used herein includes eukaryotic and prokaryotic cells. The genetic transposition construct includes a DNA molecule to be transferred flanked by a pair of transposon terminal inverted repeat nucleotide sequences from the piggyBac transposon. The DNA molecule to be transferred may be any molecule capable of being expressed in a host cell and/or transgenic organism. The method may also transfer DNA molecules that are not able to be expressed.
[0081] In the present invention, excision (Elick et al., 1996b) and interplasmid transposition assays (Lobo et al., 1999) were used to determine the relative importance of sequences internal to, or external to, the terminal repeat (TR) and internal repeat (IR) sequence configurations for movement of the piggyBac element.
[0082] It was found that progressive deletions within the internal sequence of the element have no noticeable effect on either excision or transposition capabilities. In contrast, deletion of the 3' IR eliminated excision of the element. Construction of vectors having only intact 5' and 3' repeat domains regenerates mobility of the plasmids when supplied with a helper vector expressing a transposase. These features permitted construction of a set of minimal vectors for use in transformation experiments.
[0083] The length of the intervening sequence between piggyBac termini in the donor plasmid also affects the piggyBac transposition frequency. In an embodiment of the present invention, a minimal distance of 55 nucleotide base pairs (bp) may be used between target sites and termini to provide for movement of the element. This suggests that the piggyBac transposase binds the termini simultaneously before any cleavage may occur, and/or that the formation of the transposition complex requires DNA bending between the two termini.
[0084] An aspect of this invention is that it allows the design of minimally sized genetic vectors that are functional for efficient insertion of genes into host genomes, in particular animal, plant, and insect genomes.
[0085] Useful Plasmids Created are:
[0086] A) A Transposition PiggyBac ITR Cartridge Plasmid:
[0087] PCR amplifications and restriction endonuclease cleavage and ligation allowed insertion of a 702 bp fragment containing sequences for piggyBac mobility into any given plasmid of choice, converting the recipient plasmid into an operational transposable sequence capable of being mobilized into an animal genome using the piggyBac transposase gene or purified protein. The pCRII (Invitrogen) plasmid re-amplification using specified primers allows this ITR cartridge to be inserted into any plasmid.
[0088] B) Operational Transposable Vectors (pXO and pXL-Bac):
[0089] Standard restriction endonuclease cleavage and ligation allows insertion of any gene of choice between the minimal sequences of the piggyBac transposon necessary for transposition into the genome of an animal. The total size of the resulting plasmid is preferably not larger than 10 kb.
[0090] According to an embodiment of the present invention, the inverted repeat configuration indicated as [TTAA/TR/IR . . . IR/31 bp/TR/TTAA] may be utilized to obtain a piggyBac transposon. This observation was arrived at through structured deletion mutagenesis within the piggyBac transposon sequence and examining the properties of both excision and interplasmid transposition of the deleted product.
[0091] Additionally, according to an embodiment of the present invention, an insertion sequence between the target site on a plasmid having the terminal repeat configuration [IR/31 bp/TR/TTAA . . . insertion sequence . . . TTAA/TR/IR] may be approximately 55 bp to achieve mobility.
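As a quick consistency check on the domain sizes quoted in this description, the following sketch adds up the components of each terminal repeat domain (TRD = TR + spacer + IR). The 19 bp IR, the 31 bp 3' spacer, and the 35 bp and 63 bp TRD totals are taken from the text. The 13 bp TR length and the 3 bp 5' spacer length are assumptions supplied here for illustration only (the 3 bp value matches the lower bound given in the definition of "spacer").

```python
# Component lengths in base pairs.
# IR_BP and SPACER_3_BP are quoted in the text; TR_BP and SPACER_5_BP
# are assumptions made for this illustration.
TR_BP = 13          # assumed terminal repeat (TR) length
IR_BP = 19          # internal repeat (IR) length
SPACER_5_BP = 3     # assumed 5' spacer length
SPACER_3_BP = 31    # 3' spacer length


def trd_length(tr_bp, spacer_bp, ir_bp):
    """Length of a terminal repeat domain: TR + spacer + IR."""
    return tr_bp + spacer_bp + ir_bp


trd_5 = trd_length(TR_BP, SPACER_5_BP, IR_BP)
trd_3 = trd_length(TR_BP, SPACER_3_BP, IR_BP)
print(trd_5, trd_3)  # 35 63
```

Under these assumptions the totals agree with the 35 bp 5' TRD and 63 bp 3' TRD stated for the constructs of FIG. 25.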
[0092] For ease of manipulation, a cartridge having the configuration [IR/31 bp/TR/TTAA . . . 589 . . . TTAA/TR/IR] which may be inserted within a plasmid, converting the plasmid into a functional piggyBac transposon, was constructed. The cartridge was cloned into the plasmid pCRII (Invitrogen). A cartridge is defined herein as a nucleic acid molecule of a specified construction (plasmid) that may be inserted into a vector.
[0093] A cartridge was derived by circularizing construct A and cutting it with BssHII to cleave at a unique BssHII site within the 589 bp spacer. This yielded a fragment BssHII . . . TTAA/TR/31 bp/IR/BamHI/IR/TR/TTAA . . . BssHII. Construct B was derived from a pBSII (Stratagene) plasmid by BssHII deletion of the multiple cloning site (MCS). The linearized fragment was then inserted into the pBSII.sup.aBssHII backbone. An MCS primer was synthesized and inserted in the BamHI site.
[0094] Construct A allows ease of construction of genetic vectors through use of a simple 702 bp cartridge that may be inserted into any existing plasmid to convert it immediately into a functional transposon.
[0095] Construct B allows ease of insertion of any genetic sequence into a plasmid having the minimal terminal sequence requirement for piggyBac mobility. The advantage of this construct is it provides a minimal backbone cloning vector for piggyBac transposon construction.
[0096] A kit is contemplated that would contain the two vector constructs along with the original p3E1.2, and/or a helper construct allowing constitutive production of piggyBac transposase in virtually any animal system. Promoter driven expression of the piggyBac transposase using RSV LTR sequences, the CMV early promoter, the AcMNPV/IE-1 promoter, or a poly-ubiquitin promoter, among others, is also contemplated.
[0097] Excision assays of plasmids containing progressive deletions of the piggyBac internal sequence revealed that the 5' and 3' IR, spacer, and TR configurations are sufficient for piggyBac movement when provided with a transposase in trans. Interplasmid transposition assays of plasmids having different sequence lengths between the target sites demonstrated that a minimal 55 bp intervening sequence provides for satisfactory piggyBac transposition, whereas lengths less than 40 bp result in dramatic decreases in frequency of transpositions. These results suggest that the piggyBac transposase binds the termini simultaneously before cleavage, and/or that the formation of the transposition complex requires DNA bending between the two termini. Based on these results, a 702 bp cartridge having a minimum piggyBac 5' and 3' terminal region configuration and intervening sequence was constructed. The ability of this region to convert any existing plasmid into a non-autonomous piggyBac transposon was verified. A minimal piggyBac vector, pXL-Bac, that contains an internal multiple cloning site sequence between the terminal regions, was also constructed. These vectors facilitate manipulations of the piggyBac transposon for use in a wide variety of hosts.
[0098] The excision assay provides a rapid way to characterize essential sequences involved in piggyBac transposition. The p3E1.2-d-7 and p3E1.2-d-8 plasmids, which retain the entire 3' and 5' IR, spacer and TR sequences, exhibit precise excision. In contrast, the p3E1.2-d-9 plasmid that retains the entire 5' terminal region and only 36 bp of the 3' terminal domain, including the TR and a portion of the 31 bp spacer, does not excise at a detectable frequency. The requirement for an internal 3' IR sequence in the excision process suggests that the IR region might play an essential role in transposase recognition or cleavage of the target site.
[0099] An alternative explanation is that simply shortening the internal sequence may hinder the formation of a transposition complex, or the binding of transposase to two termini simultaneously. A similar result is observed with the IS50 elements, for which the lengthening of Tn5 internal sequences increases the transposition frequency (Goryshin et al., 1994). However, insertion of a KO.alpha. fragment into the p3E1.2-d-9 at the SphI site did not improve the frequency of precise excision events recovered in the excision assay, suggesting that the length of the internal domain is less important than the presence of an intact IR sequence in excision of the piggyBac element.
[0100] The interplasmid transposition assays of pIAO-P/L series plasmids demonstrate that when the external sequence separating the terminal repeats is at least 55 bp, the transposition frequency is over 10.sup.-4, while reducing the length to less than 40 bp depresses the frequency of transposition. The inhibition of piggyBac transposition as terminal sequences are brought closer together suggests that formation of a transposition complex likely precedes DNA cleavage or nicking, and that the shorter distances between these termini do not allow proper bending of the sequences to permit formation of the complex, or result in steric hindrance of transposase binding at the termini.
[0101] These results also imply a necessity for transposase binding of both termini simultaneously before any cleavage (or nicking) may occur. If the simultaneous binding were not necessary, then the transposase could bind one terminal repeat, cleave it, and then bind the second to cleave, and transposition should occur with equivalent frequencies even with smaller intervening sequences.
[0102] Interplasmid transposition assays using pCRII-ITR (FIG. 10) verify that the terminal configuration IR, spacer, TR are the minimum sequence requirements for efficient piggyBac transposition. The rest of the piggyBac internal sequence is not required if transposase is provided in trans configuration. With the ITR fragment, a minimum piggyBac vector may easily be constructed from any plasmid which reduces vector size and leaves maximum space for desired foreign genes.
[0103] Inserting the ITR fragment into pBlueScript II (Stratagene) converts the plasmid into a transposable element that moves with a frequency similar to the intact piggyBac element. This ITR cartridge facilitates the construction of piggyBac transformation vectors from existing plasmids. In addition, the co-integration of the Amp/ori sequences from the donor plasmid into the genome provides an easy way to locate the insertion site because these insertions may be recovered by restriction enzyme digestion, religation, and transformation. The pXL-Bac (FIG. 11) minimum piggyBac vector replaces the internal sequence of the piggyBac transposon with a multiple cloning site. This plasmid allows any desired foreign genes or sequences to be easily inserted between piggyBac termini for movement in the presence of a helper plasmid. These constructs provide useful tools for the examination and use of piggyBac as a gene transfer vector in a wide variety of organisms.
[0104] The following Biological Deposits have been made on the following dates with a recognized International Depository Authority (IDA), the American Type Culture Collection (ATCC), at 10801 University Boulevard, Manassas, Va. 20110-2209, U.S.A., in compliance with the guidelines set forth in the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. All restrictions on the availability to the public of the materials deposited will be irrevocably removed upon granting of a patent. The deposits will be maintained for a period of 30 years from the date of deposit, or for a period of five years after the date of the most recent request of a sample, or for the enforceable life of the patent, whichever is longest. If a culture becomes non-viable, it will be replaced with a viable culture of the same kind.
TABLE-US-00001 Table of Biological Deposits
Deposit Type   Deposit Name                    Accession Number               Deposit Date
Plasmid        pXL-BACII-ECFP                  ATCC Accession # PTA-7310      Jan. 12, 2006
Plasmid        pBSII-ITR1.1k-ECFP              ATCC Accession # PTA-7311      Jan. 12, 2006
Plasmid        pBSII-ECFP-R.sub.4/L.sub.2      ATCC Accession # PTA-122185    May 14, 2015
Plasmid        pBSII-ECFP-R.sub.4/L.sub.3      ATCC Accession # PTA-122183    May 14, 2015
Plasmid        pBSII-ECFP-R.sub.4/L.sub.4      ATCC Accession # PTA-122184    May 14, 2015
[0105] The invention may now be advantageously described by reference to the following representative examples. These examples are in no way to be interpreted to limit the scope and/or description of any embodiment or method of making or using the invention, and are provided solely for illustrative purposes and for satisfaction of providing the best mode of practicing the invention.
EXAMPLES
Example 1--Excision Assay of p3E1.2 Internal Deletion Series in T. ni
[0106] The analysis was begun using three plasmids having the most extensive internal deletions, p3E1.2-d-9, p3E1.2-d-8 and p3E1.2-d-7. Sequencing of these three plasmids revealed that p3E1.2-d-8 and p3E1.2-d-7 retained 163 bp and 303 bp of the 3' terminal region, respectively, including the IR, 31 bp spacer, and TR sequence. The p3E1.2-d-9 deletion plasmid retained only 36 bp of the 3' terminal domain, including the 3' TTAA target site, 3' TR and a portion of the 31 bp spacer, but lacked the 3' IR sequence.
[0107] Embryos of T. ni were injected with combinations of each of the p3E1.2 deletion plasmids and the phspBac helper plasmid. Loss of piggyBac sequences from the deletion series plasmids renders the plasmids resistant to BsiWI and SphI digestion. Transformations with Hirt extract DNAs digested with BsiWI and SphI were compared with transformations employing equal amounts of uncut DNA as a control to determine the frequency of excision. Precise excision events were initially identified by a quick size screen for the characteristic 3.5 kb plasmid in recovered colonies, and these plasmids were then sequenced to confirm the precise excision events.
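The comparison of digested and uncut transformations can be sketched as a simple ratio. The sketch below is an illustration only: the ratio form of the estimate, the function name, and the colony counts are assumptions and are not data from the examples. In practice, colonies from the digested sample would still need the size screen and sequencing described above before being counted as precise excision events.

```python
def excision_frequency(colonies_digested, colonies_uncut):
    """Estimate excision frequency as the ratio of colonies recovered from
    BsiWI/SphI-digested Hirt extract DNA to colonies recovered from an equal
    amount of uncut DNA. The ratio form is an assumption of this sketch."""
    if colonies_uncut <= 0:
        raise ValueError("uncut control must yield at least one colony")
    return colonies_digested / colonies_uncut


# Hypothetical counts, for illustration only.
freq = excision_frequency(colonies_digested=12, colonies_uncut=60000)
print(f"{freq:.1e}")  # 2.0e-04
```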
[0108] A quick size screen method is used to identify plasmids with changed size directly from colonies (Sekar, 1987). Colonies at least 1 mm in diameter are picked up with pipette tips and resuspended in 10 .mu.l protoplasting buffer (30 mM Tris-HCl pH 8.0, 50 mM NaCl, 20% Sucrose, 5 mM EDTA, 100 mg/ml RNase, 100 mg/ml Lysozyme) in the Lux 60 well mini culture plate. A 0.9% agarose gel containing ethidium bromide is preloaded with 4.5 .mu.l lysis solution (80 mM Tris, 0.5% Sucrose, 0.04% Bromophenol Blue, 2% SDS, 2.5 mM EDTA) per well. The bacterial suspension is then loaded into the wells and the gel electrophoresed. Two kinds of markers are needed to distinguish the plasmids with changed size. One is the colony from the control plate or the original plasmid; the other is a molecular weight marker. Plasmids that differ in size by 500 bp or more are easily distinguished. Both the p3E1.2-d-8 and p3E1.2-d-7 yielded precise excision events at about the same relative frequency, while no excision events were recovered with the maximum deletion plasmid p3E1.2-d-9 (FIG. 1).
Example 2--Minimal Distance Required Between Termini for Movement of a PiggyBac Transposon Construct
[0109] The interplasmid transposition assay was carried out essentially as previously described by Lobo et al. (1999), Thibault et al. (1999) and Sarkar et al. (1997a). Embryos were injected with a combination of three plasmids. The donor plasmid, pB(KO.alpha.), carried a piggyBac element marked with the kanamycin resistance gene, the ColE1 origin of replication, and the lacZ gene. The transposase-providing helper plasmid, pCaSpeR-pB-orf, expressed the full length piggyBac ORF under the control of a minimal D. melanogaster hsp70 promoter (see FIG. 4). The target B. subtilis plasmid, pGDV1, is incapable of replication in E. coli and contains the chloramphenicol resistance gene. When the genetically tagged piggyBac element transposes from pB(KO.alpha.) into the target plasmid pGDV1 with the help of the transposase supplied by the helper, only the interplasmid transposition product is able to replicate in E. coli and produce blue colonies on LB/kan/cam/X-gal plates. For the pIAO-P/L series, embryos were injected with a mixture of the transposase-providing helper plasmid, phspBac, one of the pIAO-P/L series plasmids as the donor, and the pGDV1 target plasmid. Transposition of the tagged piggyBac element from any of the pIAO-P/L plasmids into the target plasmid pGDV1 allows the recipient pGDV1 to replicate in E. coli and produce blue colonies on LB/Amp/Cam/X-gal plates.
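The selection logic of the assay with the pB(KO.alpha.) donor can be illustrated with a small model. This is a sketch under simplifying assumptions: the `Plasmid` class and function names are hypothetical, donor backbone markers are omitted, and a plasmid is taken to give a colony only if it carries an E. coli origin and every antibiotic marker required by the plate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    name: str
    resistances: frozenset   # antibiotic markers carried
    ecoli_origin: bool       # able to replicate in E. coli
    lacz: bool               # carries lacZ (blue colonies on X-gal)


def transpose(element: Plasmid, target: Plasmid) -> Plasmid:
    """Model the transposition product: the target gains the element's features."""
    return Plasmid(
        name=f"{target.name}::{element.name}",
        resistances=target.resistances | element.resistances,
        ecoli_origin=target.ecoli_origin or element.ecoli_origin,
        lacz=target.lacz or element.lacz,
    )


def colony_on_plate(p: Plasmid, antibiotics: set) -> str:
    """Return 'blue', 'white', or 'none' for E. coli carrying p on an X-gal plate."""
    if not p.ecoli_origin or not antibiotics <= p.resistances:
        return "none"
    return "blue" if p.lacz else "white"


# Features of the tagged element and the target as described in the text.
donor = Plasmid("pB(KO-alpha)", frozenset({"kan"}), True, True)
pgdv1 = Plasmid("pGDV1", frozenset({"cam"}), False, False)
product = transpose(donor, pgdv1)

plate = {"kan", "cam"}
print(colony_on_plate(pgdv1, plate))    # none
print(colony_on_plate(donor, plate))    # none
print(colony_on_plate(product, plate))  # blue
```

Only the product combines both resistance markers with an E. coli origin, which is why blue colonies on the double-selection plate report transposition events.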
[0110] A total of 10 blue colonies were randomly picked from each transformation and prepared for sequencing analysis. Initial sequence analysis of the terminal repeat junction showed that all of the sequenced clones had the distinctive duplication of a TTAA tetranucleotide target site, a characteristic feature of piggyBac transposition. A random set of those clones for which the 5' terminus had been sequenced were also examined at their 3' terminus to confirm the duplication of the TTAA site at both ends. The accumulated results confirmed transposon insertion at 12 of the 21 possible TTAA target sites in the pGDV1 plasmid, all of which were previously identified as insertion sites in Lepidopteran assays by Lobo et al. (1999) and Thibault et al. (1999).
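The TTAA target site duplication checked in these clones can be sketched on a toy sequence. The target and element strings below are hypothetical placeholders, not sequences from pGDV1 or piggyBac, and the function names are chosen for this illustration.

```python
def find_ttaa_sites(seq):
    """Return 0-based start positions of every TTAA in seq (overlaps allowed)."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "TTAA"]


def insert_at_ttaa(target, element, site):
    """Insert element at a TTAA site so that the element ends up flanked by
    TTAA on both sides, modeling the target site duplication."""
    if target[site:site + 4].upper() != "TTAA":
        raise ValueError("site is not a TTAA tetranucleotide")
    return target[:site + 4] + element + "TTAA" + target[site + 4:]


target = "GCGCTTAAGGCC"   # hypothetical target with one TTAA site
element = "NNNNNNNN"      # placeholder for the transposon body

sites = find_ttaa_sites(target)
product = insert_at_ttaa(target, element, sites[0])
print(sites)                      # [4]
print(product)                    # GCGCTTAANNNNNNNNTTAAGGCC
print(find_ttaa_sites(product))   # [4, 16]
```

Sequencing across either junction of such a product would show the element immediately adjacent to a TTAA, which is the signature examined at the 5' and 3' termini of the recovered clones.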
[0111] The relative frequency at which a given pIAO-P/L series plasmid was able to undergo transposition into the target plasmid correlated with the sizes of the intervening sequence between the termini. With intervening sequences greater than 55 bp, the transposition frequency was over 1.2.times.10.sup.-4, which is consistent with the frequency obtained in previous assays with the p3E1.2 derived vectors by Lobo et al. (1999). If the length of the intervening sequence was reduced to 40 bp or less, the frequency of transposition began to decrease dramatically (FIG. 2).
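The length dependence reported for the pIAO-P/L series can be summarized as a simple rule. The sketch below encodes only the two thresholds stated in this description (at least 55 bp, and 40 bp or less). The function name and the category labels are hypothetical, and lengths between the two thresholds are reported as not characterized rather than guessed.

```python
def expected_transposition(intervening_bp):
    """Classify a donor by the length of the sequence between its termini,
    using only the thresholds stated in the description."""
    if intervening_bp >= 55:
        return "efficient"          # frequency reported as over 10^-4
    if intervening_bp <= 40:
        return "reduced"            # frequency reported to drop sharply
    return "not characterized"      # no result stated for 41 to 54 bp


for length in (589, 55, 47, 40):
    print(length, expected_transposition(length))
```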
Example 3--Interplasmid Transposition Assay of pCRII-ITR and pBSII-ITR Plasmids
[0112] According to an embodiment of the present invention, the excision assay described herein shows that a minimum of 163 bp of the 3' terminal region and 125 bp of the 5' terminal region (from the restriction site SacI to the end of the element) may be used for excision, while the pIAO-P/L constructs showed that a minimal distance of 55 bp between termini may be utilized to effect movement. These data suggested that the inclusion of intact left and right terminal and internal repeats and spacer domains would be sufficient for transposition.
[0113] The pCRII-ITR plasmid was constructed following PCR of the terminal domains from pIAO-P/L-589 using a single IR specific primer. A second construct pCRII-JFO3/04 was also prepared using two primers that annealed to the piggyBac 5' and 3' internal domains respectively, in case repeat proximate sequences were required.
[0114] The interplasmid transposition assay was performed in T. ni embryos and the plasmids were recovered using LB/Kan/Cam plates (Sambrook et al., 1989) with the controls plated on LB/Amp plates. A total of 10 randomly picked colonies were sequenced, and all were confirmed as resulting from transposition events, having the characteristic tetranucleotide TTAA duplication at the insertion sites. These insertion sites in pGDV1 were among the same previously described (Lobo et al., 1999 and Thibault et al., 1999). The sequencing results also confirmed that all 10 transposition events retained the expected terminal domain configurations. The frequency of transposition events was estimated at 2.times.10.sup.-4, a similar frequency to that obtained with non-mutagenized constructs for this species (Lobo et al., 1999).
[0115] Independent verification that the 702 bp PCR cloned fragment (ITR cartridge, FIG. 3(C1)) may be used as a cartridge to generate transpositionally competent plasmids was obtained by excising the BamHI fragment from pCRII-ITR, and ligating it into the pBlueScript II (Stratagene) plasmid to construct pBSII-ITR. Frequencies similar to those for the pCRII-ITR construct in the interplasmid transposition assay, were obtained.
Example 4--Construction of Minimum PiggyBac Vector pXL-Bac
[0116] A new piggyBac minimum vector pXL-Bac (FIG. 3(C2)) was also constructed by combining the 702 bp BamHI ITR fragment with the pBlueScript II BamHI fragment and inserting a PCR amplified pBSII multiple cloning site (MCS) between the terminal repeats. The pXL-Bac vector was tested by inserting an XbaI fragment from .pi.KO.alpha. (obtained from A. Sarkar, University of Notre Dame), containing the Kanamycin resistance gene, E. coli replication origin, and Lac .alpha.-peptide, into the MCS of pXL-Bac to form pXL-Bac-KO.alpha.. Interplasmid transposition assays yielded a frequency of over 10.sup.-4 for transposition of the modified ITR sequence, a level similar to that observed for the intact piggyBac element.
Example 5--Derivative Vectors of pXL-Bac
[0117] Using the pXL-Bac minimal vector, several derivative vectors may be constructed containing marker genes for detection of successful transformations. In one example, the vectors pXL-Bac-EYFP, pXL-Bac-EGFP, and pXL-Bac-ECFP (FIGS. 15-17) were assembled to contain the 3xP3 promoter driven fluorescent protein genes of Horn and Wimmer (2000) by PCR amplifying these sequences from their respective piggyBac vectors using the primers E*FP-for (5' ACGACTAGTGTTCCCACAATGGTTAATTCG 3') (SEQ ID NO: 2) and E*FP-rev (5' ACGACTAGTGCCGTACGCGTATCGATAAGC 3') (SEQ ID NO: 3) each terminating in an SpeI restriction endonuclease site, and inserting these fragments into the SpeI digested pXL-Bac vector at the unique SpeI site of the multiple cloning site. Vectors constructed in this fashion allow detection of successful transformation by the pXL-Bac vector and may be further modified to include a separate gene of choice and suitable promoter adjacent to the marker gene in the multiple cloning site.
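A short check can confirm that both primers carry the SpeI recognition sequence used for cloning. The primer sequences are copied from the paragraph above. The recognition sequences in the lookup table are the standard ones for these enzymes, and the function name is hypothetical. All five sites are palindromic, so scanning one strand is sufficient.

```python
# Standard recognition sequences (all palindromic).
SITES = {
    "SpeI": "ACTAGT",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
}


def sites_in(primer):
    """Map each enzyme whose site occurs in the primer to the 0-based
    position of the first occurrence."""
    primer = primer.upper().replace(" ", "")
    return {name: primer.find(site)
            for name, site in SITES.items() if site in primer}


primers = {
    "E*FP-for": "ACGACTAGTGTTCCCACAATGGTTAATTCG",   # SEQ ID NO: 2
    "E*FP-rev": "ACGACTAGTGCCGTACGCGTATCGATAAGC",   # SEQ ID NO: 3
}

for name, seq in primers.items():
    print(name, sites_in(seq))
# E*FP-for {'SpeI': 3}
# E*FP-rev {'SpeI': 3}
```

In both primers the site sits directly after a three-base 5' overhang, consistent with the amplified fragment terminating in SpeI sites at each end.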
Example 6--Derivative Vectors of pCRII-ITR or pBSII-ITR
[0118] Similar modifications may be made to either the pCRII-ITR or the companion vector, pBSII-ITR, by inserting a marker gene into the plasmid adjacent to the ITR cartridge of these plasmids. In one example, the plasmids pBSII-ITR-ECFP, pBSII-ITR-EGFP, and pBSII-ITR-EYFP (FIGS. 18-20) were constructed using the strategy described in Example 5 to PCR amplify an SpeI fragment containing the marker genes from the Horn and Wimmer (2000) piggyBac vectors and insert them into the unique SpeI site of the pBSII-ITR plasmid.
Example 7--Facilitating Expression of the Transposase
[0119] Expression of the transposase is important in gaining movement of any of the vectors described herein. To facilitate expression of the transposase, a BamHI cartridge containing only the piggyBac open reading frame sequences was PCR amplified from the piggyBac transposon clone p3E1.2 using the primers BamH1E-for 1 (5' GCTTGATAAGAAGAG 3') (SEQ ID NO: 4) and BamH1E-rev 1 (5' GCATGTTGCTTGCTATT 3') (SEQ ID NO: 5). This cartridge was then cloned into the pCaSpeR-hs vector at a unique BamHI site downstream of the Drosophila heat shock promoter (pCaSpeR-hs-orf) to effect heat shock induced expression of the piggyBac transposase following co-injection with any piggyBac vector.
Example 8--In Vitro Expression of mRNA of PiggyBac Transposase
[0120] In some eukaryotic systems, the heat shock promoter may not function to express the transposase protein. An additional plasmid was constructed to allow in vitro expression of the messenger RNA sequence of the piggyBac transposase. Co-injection of this mRNA into embryos along with the piggyBac vectors would allow translation of the piggyBac transposase without having to rely on the expression of the mRNA from a promoter which may or may not be active in the desired system. In addition, this strategy provides much more transposase protein in the embryos, leading to a greater mobility of the piggyBac vectors. The BamHI cartridge was excised from the plasmid pCaSpeR-hs-orf by restriction digestion with BamHI and ligated into a BamHI digested, commercially available vector, pBSII (Stratagene), to make pBSII-IFP2orf (FIG. 6), allowing in vitro transcription of the piggyBac transposase mRNA under control of the bacteriophage T7 promoter.
Example 9--Alternative Promoters for the PiggyBac Transposase Gene
[0121] Further modification of pBSII-IFP2orf may be effected to introduce alternative promoters that would drive expression of the piggyBac transposase gene. Three examples are provided. pBSII-hs-orf (FIG. 7) was constructed by excising the heat shock promoter region from pCaSpeR-hs using EcoR I and EcoR V digestion followed by blunt end polishing of the EcoRI terminus, and ligating the fragment to the blunt end polished EcoRI/HindIII digested pBSII-IFP2orf plasmid. The plasmid pBSII-IE1-orf was prepared by PCR amplification of the IE1 promoter from the plasmid pIE1 FB using the primers IE1-Ac-for (5' ACGTAAGCTTCGATGTCTTTGTGATGCGCC 3') (SEQ ID NO: 6) and IE1-Ac-rev (5' ACGGAATTCACTTGCAACTGAAACAATATCC 3') (SEQ ID NO: 7) to generate an EcoRI/HindIII tailed fragment that was then inserted into EcoRI and HindIII digested pBSII-IFP2orf. This plasmid allows constitutive expression of the piggyBac transposase in a diversity of eukaryotic systems. A final demonstration was prepared by digesting the plasmid pHAct5cEGFP (Pinkerton et al., 2000) with BamHI and EcoRI to recover the Drosophila Actin 5c promoter, which was then inserted into pBSII digested with EcoRI and BamHI. The BamHI cartridge from pCaSpeR-hs-orf was excised by digestion with BamHI and cloned downstream of the Actin 5c promoter at the unique BamHI site to form the plasmid pBSII-Act5c-orf (FIG. 21). This allows high level expression of the piggyBac transposase in embryos of insects.
Example 10--Transposase Expression in Vertebrate Systems
[0122] While all of the constructs in Example 9 permit expression of the transposase in insect systems, they may not permit optimal expression of the transposase in vertebrate systems. Using the commercially available pDsRed1-N1 plasmid (Clontech), the BamHI cartridge was cloned from pBSII-IFP2orf into the BamHI site adjacent to the CMV promoter to effect efficient expression of the piggyBac transposase in vertebrate systems. This plasmid was further modified by adding the 3xP3 promoter through PCR amplification of this promoter from the plasmid pBac{3xP3-EYFPafm} (Horn and Wimmer, 2000) using the primers 3xP3-for (5' ACTCTCGAGGTTCCCACAATGGTTAATTCG 3') (SEQ ID NO: 8) and 3xP3-rev (5' ACTGAATTCATGGTGGCGACCGGTGGATCG 3') (SEQ ID NO: 9) to generate a XhoI/EcoRI tailed cartridge that was then cloned into the XhoI and EcoRI digested pDsRed1-N1 backbone to generate the plasmid p3XP3-DsRed-orf (FIG. 9).
Example 11--Optimizing PiggyBac
[0123] In some cases it may be preferable to inject transposase protein to permit movement of the piggyBac transposon. The natural piggyBac transposase sequence is not efficiently expressed in prokaryotic systems due to a preponderance of eukaryotic codons. To achieve better expression of the piggyBac transposase in bacterial systems for purification and functional utility, a sequence called optimized piggyBac orf (FIG. 23) was created, substituting prokaryotic codon biases wherever possible. This sequence encodes the same protein sequence, but represents an artificial gene expressing the piggyBac transposase.
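The codon substitution described above can be illustrated with a minimal sketch. It assumes a toy open reading frame and a hypothetical preference table (`PREFERRED_CODON`) covering only three amino acids. Neither is taken from the optimized piggyBac orf of FIG. 23, and the preference table is not measured codon usage data. The sketch only shows that synonymous recoding changes the nucleotide sequence while leaving the encoded protein unchanged.

```python
from itertools import product

# Standard genetic code, built from the usual TCAG ordering of codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical preference table, for illustration only.
PREFERRED_CODON = {"L": "CTG", "R": "CGT", "G": "GGC"}


def translate(orf):
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def recode(orf, preferred=PREFERRED_CODON):
    """Swap each codon for the preferred synonymous codon, where one is listed."""
    if len(orf) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    recoded = []
    for i in range(0, len(orf), 3):
        codon = orf[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        # Codons without a listed preference are left unchanged.
        recoded.append(preferred.get(amino_acid, codon))
    return "".join(recoded)


toy_orf = "ATGTTAAGAGGATAA"  # toy sequence, not piggyBac
recoded_orf = recode(toy_orf)
```

Comparing `translate(toy_orf)` with `translate(recoded_orf)` is a quick check that a recoded gene still specifies the same protein.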
Example 12--Materials and Methods for Examples 1-11
Plasmids
[0124] p3E1.2 Deletion Series:
[0125] The p3E1.2 plasmid (Fraser et al., 1995) was first linearized using the restriction sites BamHI and EcoRI, blunt ended with the Klenow fragment, then religated to construct p3E1.2(DMCS), eliminating the MCS of the pUC18 sequence. Internal deletions were made using the Erase-A-Base System (Promega). p3E1.2(DMCS) was cut at the unique SacI site within the piggyBac element, generating an ExoIII resistant end, and then cut at the BglII site to generate an ExoIII sensitive end. Fractions of the ExoIII deletion reaction from the BglII site toward the 3' terminus were stopped every 30 seconds, and were blunt ended by S1 nuclease, recircularized, and transformed into DH5.alpha. cells. Recovered plasmids were size analyzed using a quick screen method (Sekar, 1987). The presence of intact 3' termini was confirmed by BsiWI digestion, and the plasmids were then sequenced. Nine consecutive plasmids in the size range of approximately 100.about.200 bp deletions were recovered and named p3E1.2-d-1 to p3E1.2-d-9, with p3E1.2-d-9 having the maximum deletion (FIG. 1).
[0126] pIAO-P/L Series:
[0127] The p3E1.2 B/X plasmid was constructed as a pCRII TA clone (Invitrogen) of the entire piggyBac transposon and flanking TTAA target sites following PCR from the plasmid p3E1.2 using the BamHI/XbaI-tailed primer MF34 (5'-GGATCCTCTAGATTAACCCTAGAAAGATA-3') (SEQ ID NO: 10). The element and flanking TTAA sites were then excised using the enzyme BamHI and ligated to form a circular molecule. Two outward facing internal piggyBac primers, one with a terminal ApaI site (5'-GAAAGGGCCCGTGATACGCCTATTTTTATAGGTT-3') (SEQ ID NO: 11) and the other with a terminal KpnI site (5'-AATCGGTACCAACGCGCGGGGAGAGGCGGTTTGCG-3') (SEQ ID NO: 12), were used to generate a linear ApaI/KpnI-tailed fragment. This fragment was ligated to a PCR fragment containing the beta-lactamase gene and E. coli replication origin amplified from pUC18 using an ApaI-tailed primer (5'-CCAAGGGCCCTGACGTGAACCATTGTCACACGT-3') (SEQ ID NO: 13) and a KpnI-tailed primer (5'-TGTGGGTACCGTCGATCAAACAAACGCGAGATACCG-3') (SEQ ID NO: 14). The resulting pIAO plasmid contains the circularized piggyBac transposon with ends separated by an 18 bp fragment of DNA having the restriction site configuration XbaI/BamHI/XbaI, with a beta-lactamase gene and the E. coli origin of replication. The lacZ gene under the control of the polyhedrin promoter was excised from pD-2/B-gal (Fraser et al., 1996) using restriction enzymes NruI and DraI, and cloned into the unique HpaI site within the piggyBac element of pIAO to form the pIAO-polh/lacZ (pIAO-P/L) plasmid.
[0128] The pIAO-P/L-TTAA1 plasmid was constructed by digesting pIAO-polh/lacZ with SphI and BsiWI, and the fragment containing the internal piggyBac sequence was isolated. Two complementing oligonucleotides, SphI (5'-CGTCAATTTTACGCAGACTATCTTTCTAGGG-3') (SEQ ID NO: 15) and TTAA-SphI (5'-TTAACCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG-3') (SEQ ID NO: 16), were annealed to form a SphI site on one end and a TTAA overhang on the other end. A second pair of oligonucleotides, BsiWI (5'-GTACGTCACAATATGATTATCTTTCTAGGG-3') (SEQ ID NO: 17) and TTAA-BsiWI (5'-TTAACCCTAGAAAGATAATCATATTGTGAC-3') (SEQ ID NO: 18), were annealed to form a BsiWI site on one end and a TTAA overhang on the other. These two primer pairs were joined using the TTAA overlaps and inserted into the SphI and BsiWI sites of the digested pIAO-polh/lacZ plasmid to form the circular pIAO-P/L-TTAA1 plasmid.
[0129] The pIAO-P/L-TTAA2 plasmid was constructed in a similar manner by combining the SphI-terminal primer with TTAATTAA-SphI (5'-TTAATTAACCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG-3') (SEQ ID NO: 19), and the BsiWI primer with TTAATTAA-BsiWI (5'-TTAATTAACCCTAGAAAGATAATCATATTGTGAC-3') (SEQ ID NO: 20).
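The annealing of complementary oligonucleotides to leave defined overhangs can be checked computationally. The following is a minimal sketch using the SphI, TTAA-SphI and TTAATTAA-SphI sequences given above. The helper names (`reverse_complement`, `overhangs`) are illustrative and are not part of the original methods.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def overhangs(short_oligo, long_oligo):
    """Return the unpaired 5' and 3' ends of long_oligo after annealing.

    Assumes short_oligo pairs along its full length with long_oligo.
    """
    paired = reverse_complement(short_oligo)
    start = long_oligo.find(paired)
    if start < 0:
        raise ValueError("oligonucleotides are not complementary")
    return long_oligo[:start], long_oligo[start + len(paired):]


SPHI = "CGTCAATTTTACGCAGACTATCTTTCTAGGG"                       # SEQ ID NO: 15
TTAA_SPHI = "TTAACCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG"           # SEQ ID NO: 16
TTAATTAA_SPHI = "TTAATTAACCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG"   # SEQ ID NO: 19

ttaa1_ends = overhangs(SPHI, TTAA_SPHI)
ttaa2_ends = overhangs(SPHI, TTAATTAA_SPHI)
```

In this sketch `ttaa1_ends` is `("TTAA", "CATG")` and `ttaa2_ends` is `("TTAATTAA", "CATG")`: a TTAA-containing 5' overhang on one end and a CATG 3' overhang, compatible with an SphI-cut end, on the other.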
[0130] The plasmids pIAO-P/L-2.2 kb, pIAO-P/L-589 bp, pIAO-P/L-354 bp, pIAO-P/L-212 bp and pIAO-P/L-73 bp were constructed by insertion of HindIII or PvuII fragments from the bacteriophage lambda into the blunt ended XbaI site between the adjacent TTAA target sites of pIAO-polh/lacZ.
[0131] Plasmids pIAO-P/L-55 bp, pIAO-P/L-40 bp and pIAO-P/L-22 bp were constructed by annealing oligonucleotide pIAO-4501 (5'-CTAGTACTAGTGCGCCGCGTACGTCTAGAGACGCGCAGTCTAGAA-3') (SEQ ID NO: 21) and pIAO-4502 (5'-TTCTAGACTGCGCGTCTCTAGACGTACGCGGCGCACTAGTACTAG-3') (SEQ ID NO: 22), forming two XbaI sites and one SpeI site, and ligating them into the blunt ended pIAO-P/L XbaI fragment to generate pIAO-P/L-55 bp. The pIAO-P/L-40 bp plasmid was constructed by cutting pIAO-P/L-55 bp plasmid at the XbaI sites of the inserted fragment and then religating. Cutting pIAO-P/L-40 bp at the XbaI and SpeI sites, and religating formed the pIAO-P/L-22 bp plasmid.
[0132] The pIAO-P/L-18 bp plasmid was constructed by PCR amplification of the pIAO-P/L plasmid using the pIAO-18 bp primer (5'-GATGACCTGCAGTAGGAAGACG-3') (SEQ ID NO: 23) and the TR-18 bp primer (5'-GACTCTAGACGTACGCGGAGCTTAACCCTAGAAAGATA-3') (SEQ ID NO: 24). The amplified fragment was cut with XbaI and PstI, and ligated to the pIAO-P/L XbaI and PstI cut fragment.
[0133] pCRII-ITR, pCRII-JF03/04 and pBS-ITR Plasmids:
[0134] The oligonucleotide ITR (5'-GGATTCCATGCGTCAATTTTACGCA-3') (SEQ ID NO: 25), having the piggyBac IR and a terminal BamHI site, was used to PCR amplify the piggyBac 3' and 5' IRs and TRs along with their spacer regions from the pIAO-P/L-589 bp plasmid. The PCR fragment was TA cloned into pCRII (Invitrogen). The resulting plasmid, pCRII-ITR, replaces the entire internal sequence of piggyBac with the pCRII plasmid sequences. A second plasmid, pCRII-JF03/04, was constructed using the same strategy with the primers JF03 (5'-GGATCCTCGATATACAGACCGATAAAAACACATG-3') (SEQ ID NO: 26) and JF04 (5'-GGTACCATTGCAAACAGCGACGGATTCGCGCTAT-3') (SEQ ID NO: 27). JF03 is 83 bp internal to the 5' terminus, and JF04 is 90 bp internal to the 3' terminus. To construct the pBS-ITR plasmid, the 702 bp BamHI fragment was excised from the pCRII-ITR plasmid and inserted into the BamHI site of the pBlueScript (Stratagene) plasmid.
[0135] pXL-Bac Plasmid:
[0136] The 702 bp fragment containing the piggyBac terminal repeats, isolated by BamHI digestion of the pCRII-ITR plasmid, was religated to form a circular molecule, followed by BssHII digestion. The pBlueScript II plasmid was also digested by BssHII and the large fragment was band isolated. These two fragments were ligated together to form the pBSII-ITR(Rev) plasmid. The Multiple Cloning Site (MCS) was PCR amplified from the pBSII plasmid using the MCS-for (5'-ACGCGTAGATCTTAATACGACTCACTATAGGG-3') (SEQ ID NO: 28) and MCS-rev (5'-ACGCGTAGATCTAATTAACCCTCACTAAAGGG-3') (SEQ ID NO: 29) primers, and cloned into the BamHI site of pBSII-ITR(Rev) to construct the pXL-Bac plasmid.
[0137] The pXL-Bac minimum piggyBac vector was constructed by circularizing an ITR BamHI fragment, followed by BssHII digestion. The resulting BssHII fragment was then ligated to the pBlueScript II BssHII AMP/ori containing fragment. The multiple cloning site was PCR amplified from pBSII plasmid and inserted into BamHI site to form the pXL-Bac vector. Any desired gene may be inserted into the MCS [the BssHII fragment taken from pBSII (Stratagene)] to construct a piggyBac transposon.
[0138] Helper Plasmid:
[0139] phspBac (formerly pBhsDSac, Handler et al., 1998) is a transposase-providing helper plasmid that expresses the piggyBac ORF under the control of the D. melanogaster hsp70 promoter.
[0140] Target Plasmid:
[0141] pGDV1 is a Bacillus subtilis plasmid (Sarkar et al., 1997a) containing a chloramphenicol resistance gene, and is incapable of replication in E. coli unless provided with an E. coli origin of replication.
[0142] Microinjection:
[0143] T. ni embryos were collected approximately 2 hours post oviposition and microinjected as described by Lobo et al., (1999). After injection, the embryos were allowed to develop for one hour at room temperature, heat shocked at 37.degree. C. for one hour, and allowed to recover at room temperature overnight. Plasmids were recovered using a modified Hirt (1967) extraction procedure.
[0144] Excision Assay:
[0145] The excision assay was performed as described by Thibault et al., (1999). Precise excision events were confirmed by sequencing using a fluorescent labeled M13 reverse primer (Integrated DNA Technologies, Inc.).
[0146] Interplasmid Transposition Assay:
[0147] The interplasmid transposition assay was performed as described by Lobo et al. (1999) and Sarkar et al. (1997a). Plasmids isolated from the injected and heat-shocked embryos, as well as those passaged through E. coli only, were resuspended in 20 .mu.l of sterile distilled water, and 4 .mu.l of the DNAs were then electroporated into 10 .mu.l of competent E. coli DH10B cells (Gibco-BRL) (Elick et al., 1996a). A 1.0-ml aliquot of SOC (2% w/v Bactotryptone, 0.5% w/v Bacto yeast extract, 8.5 mM NaCl, 2.5 mM KCl, 10 mM MgCl.sub.2, 20 mM glucose) was added to the electroporated cells, and the cells were allowed to recover at 37.degree. C. for 15 minutes. An aliquot (1%) of the transformed bacteria was plated on LB plates containing ampicillin (100 .mu.g/ml) and X-Gal (5-bromo-4-chloro-3-indolyl-.beta.-D-galactopyranoside; 0.025 .mu.g/ml), and the rest were plated on LB plates containing kanamycin (10 .mu.g/ml), chloramphenicol (10 .mu.g/ml) and X-Gal (0.025 .mu.g/ml). Restriction analysis using HindIII and EcoRV and PCR using outward facing primers specific to piggyBac (JF01: 5'-CCTCGATATACAGACCGATAAAACACATG-3' (SEQ ID NO: 30) and JF02: 5'-GCACGCCTCAGCCGAGCTCCAAGGGCGAC-3' (SEQ ID NO: 31)) enabled the preliminary identification of clones with putative interplasmid transposition events. The right insertion site of the clones was sequenced with the Thermo Sequenase fluorescence-labeled primer sequencing kit (Amersham) and an ALF Express Automated Sequencer (Pharmacia Biotech), using the fluorescence-labeled JF02 primer, while the left insertion site was sequenced with the MF 11 reverse primer (5'-GGATCCCTCAAAATTTCTTTCTAAAGTA-3') (SEQ ID NO: 32).
[0148] To check for plasmid replication in the embryos, Hirt-extracted plasmid DNAs recovered from injected D. melanogaster embryos were digested with the restriction enzyme DpnI (Geier and Modrich, 1979). E. coli cells were transformed with equal volumes of the digested and undigested plasmid DNAs and plated on LB plates containing kanamycin, chloramphenicol and X-Gal as above.
[0149] The pIAO-P/L series transposition events were sequenced using the fluorescent labeled MF 11-reverse primer (5'-GGATCCCTCAAAATTTCTTCTAAAGTA-3') (SEQ ID NO: 33) and JF02 primer (5'-GCACGCCTCAGCCGAGCTCCAAGCGGCGAC-3') (SEQ ID NO: 34), and the pCRII-ITR and pBSII-ITR transposition events were sequenced using fluorescent labeled M13 reverse primer.
[0150] Automatic Thermocycle Sequencing:
[0151] Sequencing was performed using the Thermo Sequenase Fluorescent Labeled Primer Sequencing Kit (Amersham) and the ALF Express Automated Sequencer (Pharmacia Biotech), following standard protocols provided by the manufacturers.
[0152] Other Plasmids:
[0153] FIGS. 12, 13 and 14 present alternative plasmids that may be useful for gene transfer.
Example 14--Identification of TRD Adjacent Regions
[0154] The present invention also provides ID sequences adjacent to the TRD of the piggyBac transposon that contribute to a high frequency of germline transformation in D. melanogaster. The present invention provides an analysis of a series of PCR synthesized deletion vectors constructed with the 3xP3-ECFP gene as a transformation marker (Horn and Wimmer, 2000). These vectors define ID sequences immediately adjacent to the 5' TRD and 3' TRD adjacent ID sequences that effect efficient germline transformation of D. melanogaster. Using this information, the present invention provides a new ITR cartridge, called ITR1.1K, and verifies its utility in converting an existing plasmid into a mobilizable piggyBac vector that enables efficient germline transformation. The present invention also provides a transposon-based cloning vector, pXL-BacII, for insertion of sequences within a minimal piggyBac transposon and verifies its capabilities in germline transformations.
Example 15--Materials and Methods for Example 12
Plasmids
[0155] The pCaSpeR-hs-orf helper plasmid was constructed by PCR amplifying the piggyBac open reading frame using IFP2orf_For and IFP2orf_Rev primers, cloning into the pCRII vector (Invitrogen), excising using BamH I, and inserting into the BamH I site of the P element vector, pCaSpeR-hs (Thummel, et al., 1992). A single clone with the correct orientation and sequence was identified and named pCaSpeR-hs-orf (FIG. 24).
[0156] The p(PZ)-Bac-EYFP plasmid was constructed from the p(PZ) plasmid (Rubin and Spradling, 1983) by digesting with Hind III and recircularizing the 7 kb fragment containing LacZ, hsp70 and Kan/ori sequences to form the p(PZ)-7 kb plasmid. The ITR cartridge was excised from pBSII-ITR (Li et al., 2001b) using Not I and Sal I and blunt end cloned into the Hind III site of the p(PZ)-7 kb plasmid. A 3xP3-EYFP marker gene was PCR amplified from pBac{3xP3-EYFPafm} (Horn and Wimmer, 2000), digested with Spe I, and inserted into the Xba I site to form p(PZ)-Bac-EYFP. It contains the LacZ gene, Drosophila hsp70 promoter, Kanamycin resistance gene, ColE1 replication origin, 3xP3-EYFP marker and the piggyBac terminal repeats-only ITR cartridge (FIG. 24).
[0157] The pBSII-3xP3-ECFP plasmid was constructed by PCR amplifying the 3xP3-ECFP marker gene from pBac{3xP3-ECFPafm} (Horn and Wimmer, 2000) using the primer pair ExFP_For and ExFP_Rev, then digesting the amplified fragment with Spe I, and cloning it into the Xba I site of pBlueScript II plasmid (Stratagene).
[0158] The piggyBac synthetic internal deletion plasmids were constructed by PCR amplification from the pIAO-P/L-589 bp plasmid (Li et al., 2001b) using a series of primers. A total of 9 PCR products were generated using the combination of IFP2_R4 against all five IFP2_L primers and IFP2_L5 against all four IFP2_R primers. Two additional PCR products were also obtained using the IFP2_R-TR+IFP2_L and IFP2_R1+IFP2_L primer pairs. These PCR products were then cloned into the pCR II vector using the TOPO TA cloning kit (Invitrogen), excised using Spe I digestion, and cloned into the Spe I site of the pBSII-3xP3-ECFP plasmid to form the piggyBac internal deletion series (FIG. 25). The pBSII-ITR1.1K-ECFP plasmid (FIG. 24) was constructed by cloning the EcoR V/Dra I fragment from pIAO-P/L-589 bp, which contained both piggyBac terminal repeats, into the EcoR V site of pBSII-3xP3-ECFP. The pXL-BacII-ECFP plasmid (FIG. 24) was constructed by PCR amplifying the ITR1.1k cartridge from the pBSII-ITR1.1k-ECFP plasmid using MCS_For and MCS_Rev primers flanked by Bgl II sites, cutting with Bgl II, religating and cutting again with BssH II, then inserting into the BssH II sites of the pBSII plasmid.
[0159] A separate cloning strategy was used to construct pBS-pBac/DsRed. The 731 bp Ase I-blunted fragment from p3E1.2, including 99 bp of 3' piggyBac terminal sequence and adjacent NPV insertion site sequence, was ligated into a unique Kpn I-blunted site in pBS-KS (Stratagene). The resulting plasmid was digested with Sac I and blunted, then digested with Pst I, and ligated to a 173 bp Hinc II-Nsi I fragment from p3E1.2, including 38 bp of 5' piggyBac terminal sequence. The pBS-pBac minimal vector was marked with polyubiquitin-regulated DsRed1 digested from pB[PubDsRed1] (Handler and Harrell, 2001a) and inserted into an EcoR I-Hind III deletion in the internal cloning site within the terminal sequences.
Example 16--Transformation of Drosophila Melanogaster
[0160] The D. melanogaster w.sup.1118 white eye strain was used for all microinjections employing a modification of the standard procedure described by Rubin and Spradling (1982), in which the dechorionation step was eliminated. Equal concentrations (0.5 .mu.g/.mu.l) of each of the internal deletion plasmids, or the control plasmid pBac{3xP3-ECFPafm}, were injected along with an equal amount of the pCaSpeR-hs-orf helper plasmid into fresh fly embryos, followed by a one hour heat shock at 37.degree. C. and recovery overnight at room temperature. Emerging adults were individually mated with w.sup.1118 flies, and progeny larvae were screened using an Olympus SZX12 fluorescent dissecting microscope equipped with GFP (480 nm excitation/510 nm barrier), CFP (436 nm excitation/480 nm barrier), and YFP (500 nm excitation/530 nm barrier) filter sets. Two positive adults from each of the vials were crossed with w.sup.1118 to establish germline transformed strains. The pBS-pBac/DsRed1 minimal vector was also injected and screened under HQ Texas Red.RTM. set no. 41004 (Handler and Harrell, 2001a).
Direct PCR Analysis
[0161] Genomic DNAs from each of the transformed strains, the w.sup.1118 wild type strain, and a piggyBac positive strain M23.1 (Handler and Harrell, 1999) were prepared using a modified DNAzol procedure. About 60 flies from each strain were combined with 150 .mu.l of DNAzol (Molecular Research Center, Inc.) in a 1.5 ml eppendorf tube. The flies were homogenized, an additional 450 .mu.l of DNAzol was added, and the homogenates were incubated at room temperature for one hour. The DNAs were extracted twice with phenol:chloroform (1:1 ratio), and the aqueous fractions were transferred to new tubes for precipitation of the DNA with an equal volume of 2-propanol. The DNA pellets were washed with 70% ethanol, air dried, and resuspended in 150 .mu.l of dH.sub.2O containing 10 .mu.g of RNase A.
[0162] Two sets of direct PCRs were performed to identify the presence of piggyBac sequences in transformed fly genomes. Primers MF34 and IFP2_L were used to identify the presence of the piggyBac 3' terminal repeat, while MF34 and IFP2_R1 were used for identifying the piggyBac 5' terminal repeat. To exclude the possibility of recombination, a second PCR was also performed using the IFP2_R1 and IFP2_L primers to amplify the external stuffer fragment (Li et al., 2001) between the terminal repeat regions.
Southern Hybridization Analysis
[0163] Southern hybridization analysis was performed using a standard procedure with minor modifications (Ausubel et al. 1994). Approximately 8 .mu.g of genomic DNA (isolated as above) from each of the transformed fly strains was digested with 40 units of Hind III for four hours, followed by agarose gel electrophoresis at 60 Volts for 4 to 5 hours. The gel was then denatured, neutralized and transferred to nylon membranes, and baked at 80.degree. C. for four hours. The membranes were pre-hybridized in the hybridization buffer overnight. A synthetic probe was prepared by nick translation (Invitrogen kit) using .sup.32P labeled dGTP against the pBSII-ITR1.1K-ECFP plasmid template. The purified probe was hybridized at 65.degree. C. overnight followed by several washes, and the membranes were first exposed on phosphor screens (Kodak) overnight for scanning with a Storm phosphor Scanner (Molecular Dynamics System), and then exposed on X-ray film (Kodak).
Universal PCR and Inverse PCR Analysis
[0164] The piggyBac insertion sites in the transformed fly strains were identified using either universal PCR (Beeman et al., 1997) or inverse PCR techniques (Ochman et al., 1988). For the universal PCR, the IFP2_L (3' TR) or IFP2_R1 (5' TR) primer was combined with one of 7 universal primers during the first round of PCR (94.degree. C. 1 minute, 40.degree. C. 1 minute, 72.degree. C. 2 minutes, 35 cycles). 2 .mu.l of the reaction mixture from the first round of PCR was then used for a second round of PCR (94.degree. C. 1 minute, 50.degree. C. 1 minute, 72.degree. C. 2 minutes, 35 cycles) using IFP2_L1 (3' TR) or iPCR_R1 (5' TR) together with a T7 primer (nested on the universal primer).
[0165] Inverse PCRs were performed by digesting 5 ug of the genomic DNAs from each of the transformed strains completely with HinP1 I for the 3' end or Taq I for the 5' end, followed by purification using the Geneclean kit (Q-Biogene) and self-ligation in a 100 ul volume overnight. The self-ligated DNAs were precipitated and resuspended in 30 ul ddH.sub.2O. A portion of each was then used for first round PCR (94.degree. C. 1 minute, 40.degree. C. 1 minute, 72.degree. C. 2 minutes, 35 cycles) with primer pairs IFP2_R1+MF14 for the 5' end and JF3+IFP2_Lb for the 3' end. 2 ul of the first round PCR products were used as templates for the second round PCR (94.degree. C. 1 minute, 50.degree. C. 1 minute, 72.degree. C. 2 minutes, 35 cycles) using primer pairs iPCR_R1+iPCR_6 for the 5' end and iPCR_L1+MF04 for the 3' end. The procedure for the pBSII-ITR1.1k-ECFP strains was slightly different: the primer pair iPCR_L1+IFP2_L-R was used for the 3' end in the second round PCR. All the PCR products were cloned into the pCRII vector (Invitrogen) and sequenced. The sequences were used to BLAST search the NCBI database to identify the locations of the insertions. MacVector 6.5.3 (Oxford Molecular Group) and ClustalX (Jeanmougin et al., 1998) were used for sequence alignments.
Example 17--Transformation Experiments with Synthetic Deletion Constructs
[0166] Each of the piggyBac synthetic internal deletion plasmids was formed by PCR amplifying from the pIAO-P/L-589 plasmid (Li et al., 2001) across the facing terminal repeats and spacer with primers that recognize 5' or 3' sequences adjacent to the respective TRDs (FIG. 24). The fragments generated were cloned into a pBSII-3xP3-ECFP plasmid and sequenced.
[0167] Each of the synthetic deletion series plasmids and the control plasmid, pBac{3xP3-ECFPafm}, were co-injected with the hsp70-regulated transposase helper into w.sup.1118 embryos, with surviving adults backcrossed, and G1 adult progeny screened for fluorescence. Positive transformants exhibited fluorescent eyes with CFP and GFP filter sets but not with the YFP filter set. Transformation frequencies from all injections are listed in Table 1, below.
TABLE 1
Transformation of Drosophila melanogaster

                      Embryos   Embryos  Adults  Adults    Transformants    Transformation
Plasmid               Injected  Hatched  Mated   Survived  Lines (G.sub.0)  Frequency
p(PZ)-Bac-EYFP        2730      376      217     83        1                0.6%
pBSII-ECFP-R1/L5      990       240      83      70        6                8.9%
pBSII-ECFP-R2/L5      620       75       21      16        2                12.5%
pBSII-ECFP-R3/L5      650       127      29      20        3                15.0%
pBSII-ECFP-R4/L5      730       182      39      31        4                12.9%
pBSII-ECFP-R4/L4      670       169      44      28        3                10.7%
pBSII-ECFP-R4/L3      710       147      44      31        3                9.7%
pBSII-ECFP-R4/L2      850       191      55      46        5                10.8%
pBSII-ECFP-R4/L1      990       231      75      86        0                0%
pBSII-ITR1.1K-ECFP    530       128      43      84        5                13.9%
pBSII-ECFP-R-TR/L     610       169      62      71        0                0%
pBSII-ECFP-R1/L       840       247      81      69        0                0%
pBac{3xP3-ECFPafm}    650       104      45      69        4                12.9%
pXL-BacII-ECFP        1020      181      42      36        8                22.2%
pBSII-ITR1.1k-ECFP*   515       120      48      22        8                36.4%
pXL-BacII-ECFP*       533       199      115     88        22               25.0%
*The injections were done independently (Handler lab) using a 0.4:0.2 ug/ul vector/helper concentration ratio of DNA.
[0168] Eight of the eleven synthetic ID deletion plasmids yielded positive transformants at an acceptable (not significantly different from control, P<0.05) frequency. The 5' ID deletion constructs pBSII-ECFP-R1/L5, pBSII-ECFP-R2/L5, pBSII-ECFP-R3/L5 and pBSII-ECFP-R4/L5 had variable deletions of the piggyBac 5' ID, retaining sequences from 66 bp (nucleotides 36-101 of the piggyBac sequence, GenBank Accession Number: AR307779) to 542 bp (36-567 of the piggyBac sequence). Each of these 5' ID deletions yielded ECFP positive germ line transformants at frequencies from 8.9% to 15.0% (Table 1) when paired with 1 kb of the 3' ID sequence (nucleotides 1454-2409 of the piggyBac sequence). These results suggested that a minimal sequence of no more than 66 bp of the 5' ID may be necessary for efficient germline transposition. The p(PZ)-Bac-EYFP plasmid yielded a low transformation frequency of 0.6% compared to the control plasmid pBac{3xP3-ECFPafm} frequency of 12.9% (Table 1).
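For readers checking the frequencies in Table 1, the following minimal sketch assumes that the transformation frequency is the number of transformant lines divided by the count in the fifth column of the table, expressed as a percentage. This formula is an inference: it reproduces several rows as printed (for example pBSII-ECFP-R2/L5, pBSII-ECFP-R3/L5, pBSII-ECFP-R4/L5 and pXL-BacII-ECFP) but not every row, so it should not be read as the stated method of calculation.

```python
def transformation_frequency(transformant_lines, adults_counted):
    """Percentage of counted adults that yielded transformant lines.

    Assumed formula: 100 * transformant_lines / adults_counted.
    """
    if adults_counted <= 0:
        raise ValueError("adults_counted must be positive")
    return 100.0 * transformant_lines / adults_counted


# (transformant lines, fifth-column count) for rows that this formula reproduces
rows = {
    "pBSII-ECFP-R2/L5": (2, 16),
    "pBSII-ECFP-R3/L5": (3, 20),
    "pBSII-ECFP-R4/L5": (4, 31),
    "pXL-BacII-ECFP": (8, 36),
}

frequencies = {
    name: round(transformation_frequency(lines, adults), 1)
    for name, (lines, adults) in rows.items()
}
```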
[0169] The R4 minimum 5' ID sequence primer was then used in combination with a series of 3' ID deletion primers to generate the constructs pBSII-ECFP-R4/L4, pBSII-ECFP-R4/L3, pBSII-ECFP-R4/L2 and pBSII-ECFP-R4/L1. Of these four constructs, only pBSII-ECFP-R4/L1, which represented the greatest deletion of 3' ID sequence (2284.about.2409 of the piggyBac sequence), failed to yield transformants. Once again, frequencies for the positive transformant constructs were similar to the control (Table 1). It was therefore deduced that the minimal 3' ID sequence requirement for efficient germline transformation was between 125 bp (L1) and 378 bp (L2) of the 3' TRD adjacent ID sequence.
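The bracketing logic used above, in which the minimal requirement lies between the largest retained sequence that failed and the smallest that succeeded, can be sketched as follows. Only the two lengths stated in the text (125 bp for L1 and 378 bp for L2) and the corresponding line counts from Table 1 are used. The function name `bracket_minimum` is illustrative.

```python
def bracket_minimum(results):
    """Bracket the minimal retained length needed for transformation.

    results: iterable of (retained_bp, transformant_lines) pairs.
    Returns (largest failing length, smallest succeeding length);
    either value is None if no construct falls in that class.
    """
    failing = [bp for bp, lines in results if lines == 0]
    succeeding = [bp for bp, lines in results if lines > 0]
    return max(failing, default=None), min(succeeding, default=None)


three_prime_series = [
    (125, 0),  # pBSII-ECFP-R4/L1: no transformants
    (378, 5),  # pBSII-ECFP-R4/L2: five transformant lines
]

lower_bound, upper_bound = bracket_minimum(three_prime_series)
```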
Example 18--Construction of the ITR1.1k Minimal Sequence piggyBac Cartridge
[0170] To construct a minimal sequence cartridge using the information gained from the synthetic deletion analysis, combinations of 5' and 3' minimal sequences were assembled and their transformation capabilities were tested. The pBSII-ECFP-R-TR/L construct is composed of a 35 bp 5' TRD lacking any 5' ID sequence, coupled to a fragment containing the 65 bp 3' TRD and 172 bp of the adjacent 3' ID sequence. This combination did not yield any transformants, confirming the necessity for having 5' ID sequences in combination with 3' ID sequences for efficient transformation. Unexpectedly, addition of 101 bp of the 5' ID sequences to the 5' TRD sequences in the construct pBSII-ECFP-R1/L was not sufficient to recover transformation capacity when paired with the 172 bp 3' ID sequences, even though the lower limit of essential 5' ID sequences had been suggested to be 66 bp using pBSII-ECFP-R1/L5 (Table 1). Increasing the 5' ID sequences to 276 bp in the pBSII-ITR1.1 k-ECFP plasmid recovered the full transformation capability when paired with the 172 bp 3' ID sequence (Table 2). The minimal operational requirement for 5' ID sequences is therefore between 276 and 101 bp when coupled to a minimal 3' ID sequence of 172 bp.
[0171] Two independent verifications of the pBSII-ITR1.1k-ECFP plasmid transforming capabilities were conducted for transformation of D. melanogaster. These transformation experiments resulted in calculated frequencies of 13.9% (FIG. 25) and 36% (Table 1). The discrepancy in frequencies may be attributed to differences in injection protocols between labs. Unless otherwise indicated, the transformation frequencies presented in Table 1 and FIG. 25 were obtained with injections of 0.6:0.6 ug/ul vector:helper concentration ratios. The increased efficiency of transformation for pBSII-ITR1.1k-ECFP observed in the second independent trial seems to be related to a decreased vector:helper concentration in D. melanogaster.
[0172] Five recovered pBSII-ITR1.1k-ECFP transformed strains were used to perform genetic mapping to identify their chromosome locations. Several of the strains had insertions on the second and third chromosomes (including strain 1), while strain 3 had an insertion on the X chromosome. Strain 1 and strain 3 were chosen for further analyses.
Direct PCR Analysis of Integrations:
[0173] Genomic DNAs from each of the transformed strains obtained with the synthetic deletion constructs in FIG. 24, as well as the piggyBac positive strain M23.1 and the negative white eye strain w.sup.1118, were used to perform two sets of PCRs to verify the presence of the piggyBac 5' and 3' terminal repeat regions. An additional negative control PCR was performed on all transformants to show the absence of the external lambda phage DNA stuffer sequence (FIG. 26).
[0174] The first set of PCRs utilized the IFP2_R1 and MF34 primers to amplify the 5' terminal repeat regions, and the second set of PCRs used the IFP2_L and MF34 primers to amplify the 3' terminal repeat regions. All of the synthetic deletion transformed strains, the M23.1 control strain, and the plasmid control yielded a strong PCR product of the correct size for each of the primer sets, confirming the presence of both of the piggyBac terminal repeat regions in all of the transformed strains. Interestingly, the white eye strain w.sup.1118 yielded a very weak product of the correct size with the 5' terminal repeat PCR amplification, but failed to generate a product with the 3' terminal specific primer set.
[0175] A third set of PCRs was performed using the IFP2_R1 and IFP2_L primers in an attempt to amplify the external lambda phage DNA stuffer sequence which would be present if an insertion resulted from recombination of the entire plasmid sequence rather than transposition. The control product from this PCR reaction is a 925 bp fragment, and no such corresponding fragments were generated with any of the transformed strain genomic DNAs.
Southern Hybridization Analysis:
[0176] Southern hybridization analysis was performed to verify the copy number and further confirm transposition of the piggyBac deletion plasmids into the Drosophila genome (FIG. 27 and FIG. 29). Genomic DNAs from two of the pBSII-ITR1.1 k-ECFP strains (strain 1 and strain 3) and one of each of the other strains were digested with Hind III, with the pBSII-ITR1.1k-ECFP plasmid Hind III digest as a plasmid control. The Hind III digestion of all transformed strains will generate four fragments if transpositional insertion has occurred: the pBSII plasmid backbone fragment (2960 bp), the 3xP3-ECFP marker fragment (1158 bp), the piggyBac 5' terminus fragment and the piggyBac 3' terminus fragment. Using the pBSII-ITR1.1k-ECFP plasmid as probe, all four fragments generated by the Hind III digestion may be detected.
[0177] The diagnostic 2960 bp pBSII backbone and 1158 bp ECFP marker fragments were present in all of the transformed strains examined. All of these strains also exhibited at least two additional bands corresponding to the piggyBac termini and adjacent sequences at the integration site (FIG. 27). These results confirmed that the observed frequencies were the result of transpositional integrations.
Example 19--Analysis of Insertion Site Sequences
[0178] To further verify that piggyBac-mediated transposition of the synthetic deletion constructs occurred in these transformants, individual insertion sites were examined by isolating joining regions between the transposon and genomic sequences using either universal PCR or inverse PCR. Subsequent sequencing analysis of these joining regions demonstrated that all of the insertions occurred exclusively at single TTAA target sites that were duplicated upon insertion, and all insertion sites had adjacent sequences that were unrelated to the vector. The two pBSII-ITR1.1 k-ECFP strains 1 and 3 have a single insertion on the third and X chromosome respectively.
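The target site duplication described above can be illustrated with a minimal sketch. The target and element sequences below are toy examples, not sequences recovered from the transformed strains, and the function names are illustrative. The sketch models insertion at a TTAA site so that the element ends up flanked by a TTAA on each side, and then checks a junction sequence for that signature.

```python
def insert_at_ttaa(target, element, site_index):
    """Insert element at the TTAA starting at site_index, duplicating the TTAA."""
    if target[site_index:site_index + 4] != "TTAA":
        raise ValueError("no TTAA target site at the given position")
    left = target[:site_index + 4]   # upstream sequence through TTAA
    right = target[site_index:]      # duplicated TTAA and downstream sequence
    return left + element + right


def flanked_by_ttaa(product, element):
    """Check that the first copy of element in product has TTAA on both sides."""
    start = product.find(element)
    if start < 4:
        return False
    end = start + len(element)
    return product[start - 4:start] == "TTAA" and product[end:end + 4] == "TTAA"


toy_target = "GCGCTTAACGTA"
# Toy element: short inverted terminal sequences around a 4 bp placeholder cargo.
toy_element = "CCCTAGAAAGATA" + "GGCC" + "TATCTTTCTAGGG"

integrated = insert_at_ttaa(toy_target, toy_element, toy_target.find("TTAA"))
```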
Example 20--Pairings of 5' PiggyBac Minimum Sequence with Long 3' End Transposon Sequences
[0179] In these studies, transformation results from synthetic unidirectional deletion plasmids demonstrate that no more than 66 bp (nt 36-101 of the piggyBac sequence) of the piggyBac 5' ID sequence and 378 bp (nt 2031-2409 of the native (wild-type) piggyBac sequence) of the piggyBac 3' ID sequence are necessary for efficient transformation when these deletions are paired with long (378 or 311 bp, respectively, or longer) ID sequences from the opposite end of the transposon. The transformation data from the pBSII-ITR1.1k-ECFP plasmid further define the 3' ID essential sequence as 172 bp (nt 2237-2409 of the native (wild-type) piggyBac sequence). Combining this same 172 bp 3' ID sequence with only the 5' TRD in the pBSII-ECFP-R-TR/L plasmid yielded no transformants, demonstrating that the 3' ID sequence alone was insufficient for full mobility. Unexpectedly, adding the 66 bp 5' ID sequence in pBSII-ECFP-R1/L also does not restore full transformation capability, even though the same 66 bp does allow full transformation capability when coupled to the larger (378 bp) 3' ID sequence in the pBSII-ECFP-R1/L2. This result cannot be explained by size alone, since the ITR cartridge strategy used to test this deletion sequence construct effectively replaces the rest of the piggyBac ID with the 2961 bp pBSII plasmid sequence.
[0180] There appears to be an important sequence within the additional 206 bp of the L2 3' ID sequence that compensates for the smaller 5' ID sequence of R1. The data imply that an analogous sequence at the 5' end should be located within the 210 bp added to the 5' ID sequence in construction of the pBSII-ITR1.1k-ECFP, since this construct exhibits full transforming capability using the L 3' ID sequence. Aligning these two sequences using MacVector 6.5.3 identified two small segments of repeat sequences common between these approximately 200 bp sequences. These repeats, ACTTATT (nt 275-281, 2120-2126 and 2163-2169 of the piggyBac sequence) and CAAAAT (nt 185-190, 158-163 and 2200-2205 of the piggyBac sequence), occur in direct and opposite orientations, and are also found in several other locations of the piggyBac ID (FIG. 28). It seems that a minimum of one set of these repeats on either side of the internal domains is required for the transposon to permit full transforming capability.
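The repeat search described above can be illustrated with a minimal sketch. It assumes that "opposite orientation" means the reverse complement of the motif on the same strand, which is an interpretation rather than something the text states. The function names and the toy sequence are hypothetical, and the toy sequence is not piggyBac sequence.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_repeats(seq: str, motif: str) -> list[tuple[int, str]]:
    """List 1-based start positions of `motif` in either orientation.

    "direct" means the motif itself is found; "opposite" means its
    reverse complement is found (an assumed reading of the text).
    """
    hits = []
    patterns = (("direct", motif), ("opposite", reverse_complement(motif)))
    for label, pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start + 1, label))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical toy sequence (not piggyBac): one direct CAAAAT at position 3
# and one opposite-orientation copy (ATTTTG) at position 13.
toy = "GGCAAAATCCGGATTTTGCC"
print(find_repeats(toy, "CAAAAT"))
```

Applied to the native piggyBac sequence, the same scan could be repeated for ACTTATT and the reported coordinates compared against the hits.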
Example 21--Materials Used in Transformation Studies with Synthetic Deletion Constructs
[0181] The present example describes the piggyBac construct materials (e.g. synthetic deletion constructs) used in the transformation of Drosophila melanogaster.
Materials and Methods
Plasmids
[0182] The pCaSpeR-hs-orf helper plasmid was constructed by PCR amplifying the piggyBac open reading frame using IFP2orf_For and IFP2orf_Rev primers, cloning into the pCRII vector (Invitrogen), excising with BamH I and inserting into the BamH I site of the P element vector, pCaSpeR-hs (Thummel, et al., 1992). A single clone with the correct orientation and sequence was identified and named pCaSpeR-hs-orf (FIG. 24A).
[0183] The p(PZ)-Bac-EYFP plasmid (FIG. 24B) was constructed from the p(PZ) plasmid (Rubin and Spradling, 1983) by digesting with Hind III and recircularizing the 7 kb fragment containing LacZ, hsp70 and Kan/ori sequences to form the p(PZ)-7 kb plasmid. The ITR cartridge was excised from pBSII-ITR (Li et al., 2001b) using Not I and Sal I and blunt-end cloned into the Hind III site of the p(PZ)-7 kb plasmid. A 3xP3-EYFP marker gene was PCR amplified from pBac{3xP3-EYFPafm} (Horn and Wimmer, 2000), digested with Spe I, and inserted into the Xba I site to form p(PZ)-Bac-EYFP.
[0184] The pBSII-3xP3-ECFP plasmid was constructed by PCR amplifying the 3xP3-ECFP marker gene from pBac{3xP3-ECFPafm} (Horn and Wimmer, 2000) using the primer pair ExFP_For and ExFP_Rev (Table 2), digesting the amplified fragment with Spe I, and cloning it into the Xba I site of pBlueScript II plasmid (Stratagene).
[0185] The piggyBac synthetic internal deletion plasmids were constructed by PCR amplification from the pIAO-P/L-589 bp plasmid (Li et al., 2001b) using a series of primers (Table 2). A total of 9 PCR products were generated using the combination of IFP2_R4 against all five IFP2_L primers and IFP2_L5 against all four IFP2_R primers. Two additional PCR products were also obtained using the IFP2_R-TR+IFP2_L and IFP2_R1+IFP2_L primer pairs. These PCR products were then cloned into the pCR II vector (Invitrogen), excised by Spe I digestion, and cloned into the Spe I site of the pBSII-3xP3-ECFP plasmid to form the piggyBac internal deletion series (FIG. 25). The pBSII-ITR1.1K-ECFP plasmid (FIG. 24C) was constructed by cloning the EcoR V/Dra I fragment from pIAO-P/L-589 bp, which contained both piggyBac terminal repeats, into the EcoR V site of pBSII-3xP3-ECFP. The pXL-BacII-ECFP plasmid (FIG. 24D) was constructed essentially as described previously (Li et al., 2001b) by PCR amplifying the ITR1.1k cartridge from pBSII-ITR1.1k-ECFP plasmid using MCS_For and MCS_Rev primers, each containing flanking Bgl II sites, cutting with Bgl II, religating and cutting again with BssH II, then inserting into the BssH II sites of the pBSII plasmid.
[0186] The pBS-pBac/DsRed1 plasmid was constructed by excising the 731 bp Ase I-fragment from p3E1.2, including 99 bp of 3' piggyBac terminal sequence and adjacent NPV insertion site sequence, and ligating it as a blunt fragment into a unique Kpn I-blunted site in pBS-KS (Stratagene). The resulting plasmid was digested with Sac I and blunted, digested with Pst I, and ligated to a 173 bp Hinc II-Nsi I fragment from p3E1.2, including 38 bp of 5' piggyBac terminal sequence. The pBS-pBac minimal vector was marked with the polyubiquitin-regulated DsRed1 digested from pB[PUbDsRed1] (Handler and Harrell, 2001a) and inserted into an EcoR I-Hind III deletion in the internal cloning site within the terminal sequences.
Transformation of Drosophila Melanogaster
[0187] The D. melanogaster w1118 white eye strain was used for all microinjections, employing a modification of the standard procedure described by Rubin and Spradling (1982) in which the dechorionation step was eliminated. Equal concentrations (0.5 µg/µl) of each of the internal deletion plasmids or the control plasmid pBac{3xP3-ECFPafm} were injected along with an equal amount of the pCaSpeR-hs-orf helper plasmid into embryos, followed by a one hour heat shock at 37° C. and recovery overnight at room temperature. Emerging adults were individually mated with w1118 flies, and progeny were screened as larvae using an Olympus SZX12 fluorescent dissecting microscope equipped with GFP (480 nm excitation/510 nm barrier), CFP (436 nm excitation/480 nm barrier), and YFP (500 nm excitation/530 nm barrier) filter sets. Two positive adults from each of the vials were crossed with w1118 to establish germ-line transformed strains. The pBS-pBac/DsRed1 minimal vector was also injected and screened using a HQ Texas Red® filter no. 41004 (Handler and Harrell, 2001a).
Direct PCR Analysis
[0188] Genomic DNAs from each of the transformed strains, the w1118 wild type strain, and a piggyBac positive strain M23.1 (Handler and Harrell, 1999) were prepared using a modified DNAzol procedure. About 60 flies from each strain were combined with 150 µl of DNAzol (Molecular Research Center, Inc.) in a 1.5 ml eppendorf tube. The flies were homogenized, an additional 450 µl of DNAzol was added, and the homogenates were incubated at room temperature for one hour. The DNAs were extracted twice with phenol:chloroform (1:1 ratio), and the aqueous fractions were transferred to new tubes for precipitation of the DNA with an equal volume of 2-propanol. The DNA pellets were washed with 70% ethanol, air dried, and resuspended in 150 µl of dH2O containing 10 µg of RNase A.
[0189] Two sets of direct PCRs were performed to identify the presence of piggyBac sequences in transformed fly genomes. Primers MF34 and IFP2_L were used to identify the presence of the piggyBac 3' terminal repeat, while MF34 and IFP2_R1 were used for identifying the piggyBac 5' terminal repeat. To exclude the possibility of recombination, an additional PCR was also performed using the IFP2_R1 and IFP2_L primers to amplify the external stuffer fragment (Li et al., 2001b) between the terminal repeat regions.
Southern Hybridization Analysis
[0190] Southern hybridization analysis was performed using a standard procedure with minor modifications (Ausubel et al. 1994). Approximately 8 µg of genomic DNA (isolated as above) from each of the transformed fly strains was digested with 40 units of Hind III for four hours, followed by agarose gel electrophoresis. The gel was then denatured and neutralized, the DNA was transferred to nylon membranes, the membranes were baked at 80° C. for four hours, and the membranes were pre-hybridized overnight. A synthetic probe was prepared by nick translation (Invitrogen kit) using 32P-labeled dGTP against the pBSII-ITR1.1K-ECFP plasmid template. Purified probe was hybridized at 65° C. overnight followed by several washes, and the membranes were first exposed on phosphor screens (Kodak) overnight for scanning with a Storm phosphor Scanner (Molecular Dynamics System), and then exposed on X-ray film (Kodak).
Universal PCR and Inverse PCR Analysis
[0191] The piggyBac insertion sites in the transformed fly strains were identified using either universal PCR (Beeman et al., 1997) or inverse PCR techniques (Ochman et al., 1988). For the universal PCR, the IFP2_L (3' TR) or IFP2_R1 (5' TR) primer was combined with one of 7 universal primers (Table 2) during the first round of PCR (94° C. 1 min, 40° C. 1 min, 72° C. 2 min, 35 cycles). A 2 µl portion of the reaction mix from the first round PCR was then used for a second round of PCR (94° C. 1 min, 50° C. 1 min, 72° C. 2 min, 35 cycles) using IFP2_L1 (3' TR) or iPCR_R1 (5' TR) together with a T7 primer (nested on the universal primer).
[0192] Inverse PCRs were performed by digesting 5 µg of the genomic DNAs from each of the transformed strains completely with HinP1 I for the 3' end or Taq I for the 5' end, followed by purification using the Geneclean kit (Q-Biogene) and self-ligation in a 100 µl volume overnight. The self-ligated DNAs were precipitated and resuspended in 30 µl ddH2O. A 5 µl portion of each ligation was used for first round PCR (94° C. 1 min, 40° C. 1 min, 72° C. 2 min, 35 cycles) with primer pairs IFP2_R1+MF14 for the 5' end, and JF3+IFP2_Lb for the 3' end (Table 2). A 2 µl portion of the first round PCR products was used as template for the second round PCR (94° C. 1 min, 50° C. 1 min, 72° C. 2 min, 35 cycles) using primer pairs iPCR_R1+iPCR_6 for the 5' end and iPCR_L1+MF04 for the 3' end. The primer pair iPCR_L1+IFP2_L-R was used for the second round PCR of the 3' end of pBSII-ITR1.1k-ECFP strains. All the PCR products were cloned into the pCRII vector (Invitrogen) and sequenced. Sequences were subjected to a BLAST search of the NCBI database to identify the locations of the insertions. MacVector 6.5.3 (Oxford Molecular Group) and ClustalX (Jeanmougin et al., 1998) were used for sequence alignments.
Example 22--Transformation Studies with Synthetic Deletion Constructs
[0193] Initial attempts to transform D. melanogaster with plasmids having only TRD sequences as specified in previous reports (Li et al., 2001b) yielded transformation frequencies far lower than those of full length piggyBac constructs. The p(PZ)-Bac-EYFP construct contains the ITR cartridge of Li et al. (2001b) composed of the 5' and 3' TRD and the spacer sequence, while the pBS-pBac/DsRed retains only 2 bp of 5' ID and 36 bp of 3' ID sequences in addition to the 5' and 3' TRD. Neither of these constructs was able to generate germ-line transformants at the frequencies previously reported for full length vectors (Handler and Harrell, 1999) or the less extensive internal deletion construct pBac{3xP3-ECFPafm} (Horn and Wimmer, 2000). The potential involvement of piggyBac ID sequences in generating germ line transformations was therefore reexamined.
[0194] The requirement for TRD-adjacent ID sequences of the piggyBac transposon was examined using a synthesized cartridge strategy based upon construction of the previously reported ITR cartridge (Li et al., 2001b), rather than digesting with an endonuclease and selecting clones representing an internal deletion series. Each of the piggyBac synthetic internal deletion plasmids was formed from the pIAO-P/L-589 plasmid (Li et al., 2001b) by PCR amplification across the facing TRDs and spacer sequences with primers that recognize 5' or 3' ID sequences adjacent to the respective TRDs (FIG. 24). The fragments generated were cloned into a pBSII-3xP3-ECFP plasmid and sequenced (Materials and Methods).
[0195] Each of the synthetic deletion series plasmids and the control plasmid, pBac{3xP3-ECFPafm}, was co-injected with the hsp70-regulated transposase helper into w1118 embryos, with surviving adults backcrossed, and G1 adult progeny screened for fluorescence. Positive transformants exhibited fluorescent eyes with CFP and GFP filter sets but not with the YFP filter set. Transformation frequencies from all injections are listed in Table 3. The p(PZ)-Bac-EYFP plasmid, which was constructed using the ITR cartridge previously described (Li et al., 2001b), yielded a relatively low transformation frequency of 0.6% compared to the frequency of 12.9% for the control plasmid, pBac{3xP3-ECFPafm} (Table 3).
[0196] Eight of the eleven synthetic ID deletion plasmids yielded positive transformants at an acceptable frequency compared to the control. The 5' ID deletion constructs pBSII-ECFP-R1/L5, pBSII-ECFP-R2/L5, pBSII-ECFP-R3/L5 and pBSII-ECFP-R4/L5 had variable deletions of the piggyBac 5' ID, retaining sequences from 66 bp (nucleotides 36-101; GenBank Accession Number: AR307779) to 542 bp (nucleotides 36-567) of the piggyBac sequence. Each of these 5' ID deletions yielded ECFP positive germ-line transformants at frequencies from 8.9% (+/-1.0%) to 15.0% (+/-0.6%) (Table 3) when paired with 1 kb of the 3' ID sequence (nucleotides 1454-2409). These results demonstrated a minimal sequence of no more than 66 bp of the 5' ID is appropriate for effective germ-line transposition.
[0197] The R4 minimum 5' ID sequence primer was then used in combination with a series of 3' ID deletion primers to generate the constructs pBSII-ECFP-R4/L4, pBSII-ECFP-R4/L3, pBSII-ECFP-R4/L2 and pBSII-ECFP-R4/L1. Of these four constructs, only pBSII-ECFP-R4/L1, which represented the greatest deletion of 3' ID sequence (nt 2284-2409 of the piggyBac sequence), failed to yield transformants. Once again, frequencies for the constructs that yielded positive transformants compared favorably with the control (Table 3). It was therefore deduced that the minimal 3' ID sequence requirement for efficient germline transformation was between 125 bp (L1) and 378 bp (L2) of the 3' TRD adjacent ID sequence.
Construction of the ITR1.1k Minimal Sequence PiggyBac Cartridge
[0198] To construct a minimal sequence cartridge using the information gained from the synthetic deletion analysis, combinations of 5' and 3' minimal sequences were constructed and tested for their transformation capabilities. The pBSII-ECFP-R-TR/L construct is composed of a 35 bp 5' TRD lacking any 5' ID sequence, coupled to a fragment containing the 63 bp 3' TRD and 172 bp of the adjacent 3' ID sequence. This combination did not yield any transformants, confirming the necessity for having 5' ID sequences in combination with 3' ID sequences for efficient transformation.
[0199] Unexpectedly, addition of 66 bp of the 5' ID sequences to the 5' TRD sequences in the construct pBSII-ECFP-R1/L was not sufficient to recover transformation capacity when paired with the 172 bp 3' ID sequences, even though the lower limit of essential 5' ID sequences had previously been defined as 66 bp using pBSII-ECFP-R1/L5 (Table 4). Increasing the 5' ID sequences to 276 bp in the pBSII-ITR1.1k-ECFP plasmid recovered the full transformation capability when paired with the 172 bp 3' ID sequence (Table 4). The minimal operational sequence requirement for 5' ID sequences is therefore between 66 and 276 bp when coupled to a minimal 3' ID sequence of 172 bp.
[0200] Two independent verifications of the pBSII-ITR1.1k-ECFP plasmid transforming capabilities were conducted for transformation of D. melanogaster. These transformation studies resulted in calculated frequencies of 13.9% (FIG. 24) and 36% (Table 3). The discrepancy in frequencies may be attributed at least in some part to differences in injection protocols between labs. Unless otherwise indicated, the transformation frequencies presented in Table 3 were obtained with injections of 0.6:0.6 µg/µl vector:helper concentration ratios. The increased efficiency of transformation for pBSII-ITR1.1k-ECFP observed in the second independent trial seems to be related to a decreased vector:helper concentration in D. melanogaster.
[0201] Five recovered pBSII-ITR1.1k-ECFP transformed strains were used to perform genetic mapping to identify their chromosome locations. Several of the strains had insertions on the second and third chromosomes (including strain 1), while strain 3 had an insertion on the X chromosome. Strain 1 and strain 3 were chosen for further analyses.
Direct PCR Analysis of Integrations:
[0202] Genomic DNAs from each of the transformed strains obtained with the synthetic deletion constructs in FIG. 1, as well as the piggyBac positive strain M23.1 and the negative white eye strain w1118, were used to perform two sets of PCRs to verify the presence of the piggyBac 5' and 3' terminal repeat regions. An additional negative control PCR was performed on all transformants to show the absence of the external lambda phage DNA stuffer sequence (FIG. 25).
[0203] The first set of PCRs utilized the IFP2_R1 and MF34 primers to amplify the 5' terminal repeat regions, and the second set of PCRs used the IFP2_L and MF34 primers to amplify the 3' terminal repeat regions. All of the synthetic deletion transformed strains, the M23.1 control strain, and the plasmid control yielded a strong PCR product of the correct size for each of the primer sets, confirming the presence of both of the piggyBac terminal repeat regions in all of the transformed strains. The white eye strain w1118 yielded a very weak product of the correct size with the 5' terminal repeat PCR amplification, but failed to generate a product with the 3' terminal specific primer set.
[0204] A third set of PCRs was performed using the IFP2_R1 and IFP2_L primers in an attempt to amplify the external lambda phage DNA stuffer sequence which would be present if an insertion resulted from recombination of the entire plasmid sequence rather than transposition. The control product from this PCR reaction is a 925 bp fragment, and no such corresponding fragments were generated with any of the transformed strain genomic DNAs.
Example 23--Southern Hybridization Analysis
[0205] Southern hybridization analysis was performed to verify the copy number and further confirm transposition of the piggyBac deletion plasmids into the Drosophila genome (FIG. 27, FIG. 29). Genomic DNAs from two of the pBSII-ITR1.1k-ECFP strains (strain 1 and strain 3) and one of each of the other strains were digested with Hind III, with the pBSII-ITR1.1k-ECFP plasmid Hind III digest as a plasmid control. The Hind III digestion of all transformed strains is expected to generate four fragments after transpositional insertion: the pBSII plasmid backbone fragment (2960 bp), the 3xP3-ECFP marker fragment (1158 bp), the piggyBac 5' terminus fragment and the piggyBac 3' terminus fragment. Using the pBSII-ITR1.1k-ECFP plasmid as probe, all four fragments generated by the Hind III digestion may be detected.
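The logic of the expected Hind III pattern can be sketched in silico. The following is a minimal illustration, not the procedure used in these studies: it relies only on the Hind III recognition sequence AAGCTT (cut after the first A), ignores sites that would span the origin of a circular sequence, and uses a hypothetical toy sequence rather than any plasmid described here. Internal fragments bounded by two sites inside the insert keep a fixed size, whereas the terminus fragments extend to the nearest genomic site and therefore vary between insertions.

```python
HINDIII_SITE = "AAGCTT"  # Hind III recognition sequence
CUT_OFFSET = 1           # top strand is cut after the first A (A^AGCTT)


def cut_positions(seq: str) -> list[int]:
    """Return 0-based indices at which the top strand is cut."""
    cuts = []
    start = seq.find(HINDIII_SITE)
    while start != -1:
        cuts.append(start + CUT_OFFSET)
        start = seq.find(HINDIII_SITE, start + 1)
    return cuts


def linear_fragments(seq: str) -> list[int]:
    """Fragment lengths from digesting a linear molecule."""
    bounds = [0] + cut_positions(seq) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def circular_fragments(seq: str) -> list[int]:
    """Fragment lengths from digesting a circular molecule such as a plasmid.

    Simplification: sites spanning the sequence origin are not detected.
    """
    cuts = cut_positions(seq)
    if not cuts:
        return [len(seq)]  # uncut circle
    n = len(seq)
    # Distance from each cut to the next one, wrapping around the origin.
    return [(cuts[(i + 1) % len(cuts)] - c) % n or n for i, c in enumerate(cuts)]


# Hypothetical 21-nt toy sequence with two Hind III sites.
toy = "CC" + "AAGCTT" + "GGGG" + "AAGCTT" + "TTT"
print(linear_fragments(toy))    # three fragments from a linear molecule
print(circular_fragments(toy))  # two fragments from the same sequence as a circle
```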
[0206] The diagnostic 2960 bp pBSII backbone and 1158 bp ECFP marker fragments were present in all of the transformed strains examined. All of these strains also exhibited at least two additional bands corresponding to the piggyBac termini and adjacent sequences at the integration site (FIG. 27). These results confirmed that the observed frequencies were the result of transpositional integrations.
Example 24--Analysis of Insertion Site Sequences
[0207] To further verify that piggyBac-mediated transposition of the synthetic deletion constructs occurred in these transformants, individual insertion sites were examined by isolating joining regions between the transposon and genomic sequences using either universal PCR or inverse PCR. Subsequent sequencing analysis of these joining regions demonstrated that all of the insertions occurred exclusively at single TTAA target sites that were duplicated upon insertion, and all insertion sites had adjacent sequences that were unrelated to the vector (Table 4). The two pBSII-ITR1.1k-ECFP strains 1 and 3 each have a single insertion, on the third and X chromosomes, respectively. These data are consistent with the information obtained from genetic crosses with balancer strains.
[0208] During sequence analysis of the integration sites, a reported point mutation in the present constructs was confirmed that occurs at position 2426 in the piggyBac sequence, within the 3' TRD at the boundary of the 31 bp spacer and the internal repeat sequence. This point mutation was apparently generated in constructing the pIAO-P/L plasmid (Li et al., 2001b) and was therefore present in all of the constructs generated by the PCR syntheses employed in these studies. This point mutation had no apparent effect on the transformation frequencies as evidenced by the efficiency of transformation obtained with pBSII-ITR1.1k-ECFP.
[0209] The available piggyBac insertion site data from previous reports and these studies were compiled and aligned using ClustalX to identify a potential common insertion site motif (Table 5). No apparent consensus motif arose from the comparison of these sequences outside of the required TTAA target site.
TABLE 2 A listing of the synthetic oligonucleotide primers used (SEQ ID NOS 73-106 respectively in order of appearance):

Internal Deletion Primers
IFP2_R1      ACTTCTAGAGTCCTAAATTGCAAACAGCGAC
IFP2_R2      ACTTCTAGACACGTAAGTAGAACATGAAATAAC
IFP2_R3      ACTTCTAGATCACTGTCAGAATCCTCACCAAC
IFP2_R4      ACTTCTAGAAGAAGCCAATGAAGAACCTGG
IFP2_L1      ACTTCTAGAAATAAATAAATAAACATAAATAAATTG
IFP2_L2      ACTTCTAGAGAAAGGCAAATGCATCGTGC
IFP2_L3      ACTTCTAGACGCAAAAAATTTATGAGAAACC
IFP2_L4      ACTTCTAGAGATGAGGATGCTTCTATCAACG
IFP2_L5      ACTTCTAGACGCGAGATACCGGAAGTACTG
IFP2_L       ACTTCTAGACTCGAGAGAGAATGTTTAAAAGTTTTGTT
IFP2_R-TR    ACTTCTAGACATGCGTCAATTTTACGCAGACTATCTTTCTAGGGTTAATCTAGCTGCATCAGG

Other Primers
ExFP_For     ACGACTAGTGTTCCCACAATGGTTAATTCG
ExFP_Rev     ACGACTAGTGCCGTACGCGTATCGATAAGC
IFP2orf_For  GGATCCTATATAATAAAATGGGTAGTTCTT
IFP2orf_Rev  GGATCCAAATTCAACAAACAATTTATTTATG
MF34         GGATCCTCTAGATTAACCCTAGAAAGATA
Univ-1       TAATACGACTCACTATAGGNNNNNNNNNNCTAT
Univ-2       TAATACGACTCACTATAGGNNNNNNNNNNAGTGC
Univ-3       TAATACGACTCACTATAGGNNNNNNNNNNGAATTC
Univ-4       TAATACGACTCACTATAGGNNNNNNNNNNAGTACT
Univ-5       TAATACGACTCACTATAGGNNNNNNNNNNAAGCTT
Univ-6       TAATACGACTCACTATAGGNNNNNNNNNNGGATCC
Univ-7       TAATACGACTCACTATAGGNNNNNNNNNNCTAG
iPCR_R1      ATTTTACGCAGACTATCTTTCTA
T7           TTAATACGACTCACTAT
MF14         GGATCCGCGGTAAGTGTCACTGA
JF3          GGATCCTCGATATACAGACCGATAAAAACACATG
IFP2_Lb      ACTGGGCCCATACTAATAATAAATTCAACAAAC
iPCR_6       TTATTTCATGTTCTACTTACGTG
iPCR_L1      TGATTATCTTTAACGTACGTCAC
MF04         GTCAGTCCAGAAACAACTTTGGC
IFP2_L-R+    CTAGAAATTTATTTATGTTTATTTATTTATTA
MCS_For      ACGCGTAGATCTTAATACGACTCACTATAGGG
MCS_Rev      ACGCGTAGATCTAATTAACCCTCACTAAAGGG
TABLE 3 Transformation of Drosophila melanogaster

Plasmid               Experiment  Embryos   Embryos  Adults  Transformed  Frequency  Overall    STD DEV  STD ERR
                                  Injected  Hatched  mated   Lines                   Frequency
p(PZ)-Bac-EYFP        1           920       136      55      1            1.8%       0.6%       1.0%     ±0.6%
                      2           910       120      56      0            0.0%
                      3           900       120      55      0            0.0%
pBSII-ECFP-R1/L5      1           350       86       21      2            9.5%       8.9%       1.8%     ±1.0%
                      2           380       70       16      1            6.3%
                      3           360       84       33      3            9.1%
pBSII-ECFP-R2/L5      1           320       37       11      1            9.1%       12.5%      7.7%     ±3.4%
                      2           300       38       3       1            20.0%
pBSII-ECFP-R3/L5      1           220       39       7       1            14.3%      15.0%      0.8%     ±0.6%
                      2           430       88       13      2            15.4%
pBSII-ECFP-R4/L5      1           220       59       12      1            8.3%       12.9%      5.3%     ±3.7%
                      2           510       123      19      3            15.8%
pBSII-ECFP-R4/L4      1           340       108      21      1            4.8%       10.7%      16.8%    ±11.9%
                      2           330       61       7       2            28.6%
pBSII-ECFP-R4/L3      1           220       39       9       0            0.0%       9.7%       12.9%    ±7.4%
                      2           240       53       14      1            7.1%
                      3           250       55       8       2            25.0%
pBSII-ECFP-R4/L2      1           320       43       11      1            9.1%       10.8%      4.9%     ±3.5%
                      2           530       148      25      4            16.0%
pBSII-ECFP-R4/L1      1           350       89       30      0            0.0%       0.0%       N/A      N/A
                      2           160       33       16      0            0.0%
                      3           330       78       25      0            0.0%
                      4           150       31       15      0            0.0%
pBSII-ECFP-R-TR/L     1           280       73       31      0            0.0%       0.0%       N/A      N/A
                      2           330       96       40      0            0.0%
pBSII-ECFP-R1/L       1           220       63       19      0            0.0%       0.0%       N/A      N/A
                      2           290       80       23      0            0.0%
                      3           330       104      27      0            0.0%
pBac(3xP3-ECFPafm)    1           300       45       14      2            14.3%      12.9%      1.8%     ±1.3%
                      2           350       59       17      2            11.8%
pBSII-ITR1.1K-ECFP    1           530       128      36      5            13.9%      13.9%      N/A      N/A
pXL-BacII-ECFP        1           500       80       14      3            21.4%      22.2%      0.9%     ±0.5%
                      2           520       101      22      5            22.7%
pBSII-ITR1.1k-ECFP*   1           515       120      22      8            36.4%      36.4%      N/A      N/A
pXL-BacII-ECFP*       1           533       199      88      22           25.0%      25.0%      N/A      N/A

*These injections were done independently (Handler lab) using a 0.4:0.2 µg/µl vector/helper concentration ratio of DNA. Statistical analysis of the data shows no significant difference between frequencies obtained with any of the synthetic deletion mutants that yielded detectable numbers of transformants and the control plasmid pBac(3xP3-ECFPafm).
The assay cannot be interpreted to represent relative efficiencies of transformation among these constructs, but only whether a particular construct was able to generate transformants at a detectable frequency with the number of surviving injection flies analyzed.
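For orientation, the summary columns of Table 3 can be approximated from the raw counts. The sketch below assumes, as an inference that the table does not state, that the per-experiment frequency is transformed lines divided by adults mated, that the overall frequency pools the counts across experiments, and that the standard deviation and standard error are taken over the per-experiment frequencies. These assumptions reproduce the control plasmid row but do not reproduce every row of the table. The function name is hypothetical.

```python
from math import sqrt
from statistics import stdev


def summarize(experiments: list[tuple[int, int]]) -> dict:
    """Summarize (adults_mated, transformed_lines) pairs for one plasmid.

    Assumed definitions (not stated in the table itself):
      frequency = lines / adults mated, per experiment
      overall   = pooled lines / pooled adults mated
      sd, se    = sample SD of per-experiment frequencies, and SD / sqrt(n)
    """
    per_exp = [100.0 * lines / adults for adults, lines in experiments]
    total_adults = sum(adults for adults, _ in experiments)
    total_lines = sum(lines for _, lines in experiments)
    sd = stdev(per_exp) if len(per_exp) > 1 else None
    se = sd / sqrt(len(per_exp)) if sd is not None else None
    return {
        "per_experiment": per_exp,
        "overall": 100.0 * total_lines / total_adults,
        "sd": sd,
        "se": se,
    }


# Control plasmid pBac(3xP3-ECFPafm): 2 lines from 14 adults, 2 lines from 17 adults.
control = summarize([(14, 2), (17, 2)])
print({k: v for k, v in control.items() if k != "per_experiment"})
```

Rounded to one decimal place, the control row gives 14.3% and 11.8% per experiment, 12.9% overall, a standard deviation of 1.8% and a standard error of 1.3%, matching the tabulated values.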
TABLE 4 Transformed Drosophila Insertion Sites

                        Chromosome  Insertion Site Sequence
Strain Name             Location    3' junction                              5' junction
p(PZ)-Bac-EYFP          3R          CCAAACTTCGGCGATGTTTTCTTAA   -piggyBac-
pBSII-ITR-1.1K-ECFP-1   3R          TAGAATTCATGTTTCCAATTTTTTAA  -piggyBac-
pBSII-ITR-1.1K-ECFP-3   X                                       -piggyBac-  TTAAATTCGCATATGTGCAAATGTT
pBSII-ECFP-R1/L5        3L          TCGGGTGGCACGTTGTGGATTTTAA   -piggyBac-  TTAAGCATGTCCTTAAGCATAAAAT
pBSII-ECFP-R2/L5        2L          AAATACGTCACTCCCCTTCCCTTAA   -piggyBac-  TTAATGCTAGCTGCATGCAGGATGC
pBSII-ECFP-R3/L5        2R          AGCTGCACTCACCGGATGTCCTTAA   -piggyBac-  TTAAACAAAAAATGAAACATAAGG
pBSII-ECFP-R4/L5        2R          CCCAAAGTATAGTTAAATAGCTTAA   -piggyBac-  TTAAAGGAATTAATAAAAATACAA
pBSII-ECFP-R4/L4        2R          GTTTATTTATGATTAGAGCCTTTAA   -piggyBac-  TTAATCTCCTCCGCCCTTCTTCAATT
pBSII-ECFP-R4/L3        2R          TGTTGTTTTTTTGTCCCCACGTTTAA  -piggyBac-  TTAAACAAACACCTTTGACAAATTT
pBSII-ECFP-R4/L2        2L          CTGCCTCTAGCCGCCTGCTTTATTAA  -piggyBac-  TTAATATTAATTGAAAATAAATGCA

The 5' (SEQ ID NOS 116-123) and 3' (SEQ ID NOS 107-115) flanking sequences for the inserted piggyBac sequences in each strain were obtained using end-specific inverse PCR (Materials and Methods) followed by cloning and sequencing of the recovered fragments. The chromosomal locations were determined from the sequences using the BLAST search program against the available Drosophila sequence in the GenBank library.
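The TTAA target site duplication reported in Table 4 can be checked directly on a pair of junction sequences. The following minimal sketch uses the pBSII-ECFP-R1/L5 flanks from the table; the function names are hypothetical, and the reconstruction of the pre-insertion site simply assumes that one copy of the duplicated TTAA was present before insertion.

```python
TARGET = "TTAA"  # piggyBac target site


def has_ttaa_duplication(flank_before: str, flank_after: str) -> bool:
    """True if TTAA is present on both sides of the inserted element.

    flank_before is the genomic sequence read up to the transposon;
    flank_after is the genomic sequence read onward from its other end.
    """
    return flank_before.endswith(TARGET) and flank_after.startswith(TARGET)


def inferred_empty_site(flank_before: str, flank_after: str) -> str:
    """Reconstruct the presumed pre-insertion sequence with a single TTAA."""
    if not has_ttaa_duplication(flank_before, flank_after):
        raise ValueError("flanks do not show a duplicated TTAA")
    return flank_before + flank_after[len(TARGET):]


# Junction sequences for strain pBSII-ECFP-R1/L5 as listed in Table 4.
before = "TCGGGTGGCACGTTGTGGATTTTAA"
after = "TTAAGCATGTCCTTAAGCATAAAAT"

print(has_ttaa_duplication(before, after))
print(inferred_empty_site(before, after))
```

The inferred empty site could then be used as a BLAST query to confirm that the two flanks are contiguous in the reference genome.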
TABLE 5 Percentage of each nucleotide at piggyBac insertion sites flanking sequences from position -10 to +10.

% of each nucleotide at piggyBac insertion sites flanking sequences
Nucleotide   -10  -9  -8  -7  -6  -5  -4  -3  -2  -1  TTAA  +1  +2  +3  +4  +5  +6  +7  +8  +9 +10
A             22  31  38  33  26  27  16  18  18  29        41  28  43  41  42  43  28  34  33  40
C             20  19  22  17  17  23  15  20  26  15        11  20  18  20  15  17  21  23  16  11
G             28  19  17  16  24   8  24  16  19  12        18  29  22  13  20  12  23   6  15  11
T             30  31  23  34  33  42  45  46  37  44        30  23  17  26  23  28  28  37  36  38

Note: Percentage of each nucleotide at piggyBac insertion site flanking sequences from position -10 to +10. The available piggyBac insertion sites include insertion sites in transformed insect genomes (Handler et al., 1998; Toshiki et al., 1999; Handler et al., 1999; Peloquin et al., 2000; Thomas et al., 2001; Handler and Harrell, 2000; Hediger et al., 2001; Kokoza et al., 2001; Nolan et al., 2002; Heinrich et al., 2002; Grossman et al., 2001; Lobo et al., 2002; Perera et al., 2002; Mandrioli and Wimmer, 2003; Sumitani et al., 2003; Elick et al., 1996; Li et al., 2001a; data from this report), insertion sites in baculoviruses (Lynne et al., 1989; Fraser et al., 1995) and insertion sites in transposition assay target plasmid pGDV1 (Thibault et al., 1999; Grossman et al., 2000; Lobo, Li and Fraser, unpublished and Li et al., 2001a). No consensus aside from the TTAA target site is apparent among these insertion sites. However, the piggyBac transposable element does have a preference for inserting in TA-rich regions, with 4-5 Ts before and 5-6 As after the TTAA target site.
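A per-position composition table of the kind shown in Table 5 can be derived from a set of aligned flanking sequences. The sketch below is illustrative only: it assumes the flanks have already been aligned to equal length (for example, the ten positions on one side of the TTAA site), the function name is hypothetical, and the four short input sequences are invented toy data, not insertion sites from this report.

```python
from collections import Counter


def position_percentages(flanks: list[str]) -> list[dict[str, float]]:
    """Return the percentage of A, C, G and T at each aligned position."""
    length = len(flanks[0])
    if any(len(s) != length for s in flanks):
        raise ValueError("sequences must be aligned to the same length")
    table = []
    for column in zip(*flanks):  # iterate over alignment columns
        counts = Counter(column)
        table.append({base: 100.0 * counts[base] / len(flanks) for base in "ACGT"})
    return table


# Invented toy alignment of four 4-nt flanks (not data from Table 4 or Table 5).
toy_flanks = ["ACGT", "ACGA", "AGGT", "TCGT"]
for offset, column in enumerate(position_percentages(toy_flanks), start=1):
    print(f"+{offset}", column)
```

Summing the A and T percentages at each position of such a table gives a quick view of the TA-rich preference noted for the sequences flanking the TTAA site.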
[0210] Attempts to transform insects using plasmids containing a previously reported piggyBac ITR minimal sequence cartridge (Li et al., 2001b), that has facing 5' and 3' TRDs with their respective TTAA target sites and is completely devoid of ID sequences, failed to produce a transformation frequency that was comparable to frequencies obtained with full length or conservative ID deletion constructs (Handler and Harrell, 1999; Horn and Wimmer, 2000).
[0211] Frequencies of transposition obtained for the ITR-based p(PZ)-Bac-EYFP and the similarly constructed pBS-pBac/DsRed were far less than expected. While Southern hybridization and inverse PCR analyses did confirm that the single transformant recovered with p(PZ)-Bac-EYFP had resulted from transpositional insertion, the efficient transposition of piggyBac minimal vectors evidenced in interplasmid transposition assays (Li et al., 2001b) did not necessarily predict the properties of piggyBac transposon movement in germline transformations.
[0212] The fact that germline transposition involves distinctly different cell populations than interplasmid transposition in injected embryos may explain these discrepancies. Similar discrepancies between transformation results and artificial transposition assays have been reported with other Class II transposons (Tosi and Beverley, 2000; Lohe and Hartl, 2001; Lozovsky et al., 2002). In addition, the Hermes transposable element undergoes normal cut-and-paste transposition in plasmid-based transposition assays (Sarkar et al., 1997a), but germline integrations in Ae. aegypti seem to occur either through general recombination or through a partial replicative transposition mechanism (Jasinskiene et al., 2000).
[0213] The synthetic cartridge approach used to examine the role of ID sequences in effecting efficient germline transposition has the advantage of examining the involvement of sequences through reconstruction rather than by analysis of successive internal deletions. The main disadvantage of this approach in analyzing piggyBac is the high AT content of the transposon, which limits the position of useful primers. As a result, the present analysis does not define the exact limits of the requisite sequences. However, some of the most effective nucleic acid sequences are delimited to a relatively narrow 250 bp of TRD adjacent nucleic acid sequences.
[0214] Transformation results from synthetic unidirectional deletion plasmids shown here demonstrate that no more than about 66 bp (nt 36-101) of the piggyBac 5' nucleic acid sequence and about 378 bp (nt 2031-2409) of the piggyBac 3' nucleic acid sequence are necessary for efficient transformation when these deletions are paired with long (378 or 311 bp, respectively, or longer) nucleic acid sequences from the opposite end of the transposon. The transformation data from the pBSII-ITR1.1k-ECFP plasmid further define the 3' nucleic acid sequence as 172 bp (nt 2237-2409). Combining this same 172 bp 3' nucleic acid sequence with only the 5' TRD in the pBSII-ECFP-R-TR/L plasmid yielded no transformants, demonstrating that the 3' nucleic acid sequence alone was insufficient for full mobility. Unexpectedly, adding the 66 bp 5' nucleic acid sequence in pBSII-ECFP-R1/L also does not restore full transformation capability, while the same 66 bp does allow full transformation capability when coupled to the larger (955 bp) 3' nucleic acid sequence in pBSII-ECFP-R1/L5. This result cannot be explained by size alone, since the ITR cartridge strategy used to test these deletion sequence constructs effectively replaces the rest of the piggyBac nucleic acid sequence with the 2961 bp pBSII plasmid sequence.
[0215] The frequencies obtained for a given construct may be higher or lower relative to the control. The present studies detect the limits of nucleic acid sequences that yield acceptable transformation frequencies, and do not evaluate the effectiveness of the deleted regions relative to one another.
[0216] The present results indicate the presence of important nucleic acid sequences between nucleotides 66 and 311 of the 5' nucleic acid sequence used for construction of the pBSII-ITR1.1k-ECFP, since this construct exhibits full transforming capability when matched with the L 3' ID sequence. Compensating sequences must be present in 3' nucleic acid sequences longer than 172 bp, since the 955 bp 3' nucleic acid sequence included with primer L5 is able to compensate for the 66 bp 5' nucleic acid sequence (construct pBSII-ECFP-R1/L5). Small repeats were noted in the 5' nucleic acid sequence of pBSII-ITR1.1k-ECFP that are matched by similar sequences in the 3' nucleic acid sequences included in construct pBSII-ECFP-R1/L5. These relatively small repeats (FIG. 28) occur in direct or opposite orientations and are also found in several other locations within the piggyBac nucleic acid sequence. There does seem to be a correlation between efficient transgenesis and the presence of at least one CAAAAT repeat in the 3' nucleic acid sequence combined with at least one in the 5' nucleic acid sequence, or the compensating presence of two or three sequence repeats in the 3' nucleic acid sequence. In some embodiments of the present inventive methods of transformation, the presence of this small repeat CAAAAT may be described as facilitating transpositional activity of piggyBac constructs.
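By way of illustration only, the occurrence of the CAAAAT repeat in direct or opposite orientation can be tallied with a short script. The following is a minimal sketch, not the analysis used in the present studies; the function names and the toy input sequence are hypothetical and are not taken from the piggyBac sequence. An occurrence in "opposite" orientation is assumed here to mean the reverse complement (ATTTTG) on the scanned strand.

```python
# Minimal sketch: locate a short repeat (default CAAAAT) in direct or
# opposite (reverse-complement) orientation. Names and the toy input
# are illustrative assumptions, not part of the original disclosure.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_motif(seq: str, motif: str = "CAAAAT") -> list:
    """Return (1-based position, orientation) for each motif occurrence."""
    seq = seq.upper()
    motif = motif.upper()
    opposite = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i + 1, "direct"))
        elif window == opposite:
            hits.append((i + 1, "opposite"))
    return hits


if __name__ == "__main__":
    toy = "GGCAAAATCCATTTTGAA"  # made-up sequence for demonstration
    for position, orientation in find_motif(toy):
        print(position, orientation)
```

Applied separately to the 5' and 3' nucleic acid sequences of a given construct, such a tally would give the repeat counts that the correlation described above refers to.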
[0217] Previous observations of efficient interplasmid transposition for the piggyBac ITR construct, completely devoid of piggyBac internal domain nucleic acid sequences (ID), support a mechanism for movement in which the piggyBac transposase binds at the terminal repeat regions (IR, spacer and TR) to effect transposition (Li et al., 2001b). Since the cut-and-paste reactions of excision and transposition do not appear to require ID sequences, the relatively unsuccessful application of the previously constructed ITR cartridge for germ-line transformation suggests the required ID sequences may be involved in other aspects of the transformation process than the mechanics of cut-and-paste. These other aspects seem to be linked to differential movement in germ line cells.
[0218] The presence of sequences important for full transforming capability within internal domains of transposons is not without precedent. Transposase binding to target sequences at or near the ends of the element is necessary to generate a synaptic complex that brings the ends of the element together for subsequent DNA cleavage (reviewed by Saedler and Gierl, 1996), but the efficiency of this interaction can be influenced by other sequences in the transposon. Multiple transposase binding sites or accessory factor binding sites are identified in other Class II transposon systems. Efficient transposition of mariner requires the continuity of several internal regions of this element and their proper spacing with respect to the terminal repeats (Lohe and Hartl, 2001; Lozovsky et al., 2002), although they are not essential for in vitro transposition (Tosi and Beverley, 2000). The P element transposase binding occurs at 10 bp subterminal sequences present at both 5' and 3' ends, while the 31 bp terminal inverted repeat is recognized by a Drosophila host protein, IRBP (inverted repeat binding protein), and an internally located 11 bp inverted repeat is shown to act as a transpositional enhancer in vivo (Rio and Rubin, 1988; Kaufman et al., 1989; Mullins et al., 1989). The maize Ac transposase binds specifically and cooperatively to repetitive ACG and TCG trinucleotides, which are found in more than 20 copies in both 5' and 3' subterminal regions, although the Ac transposase also weakly interacts with the terminal repeats (Kunze and Starlinger 1989; Becker and Kunze 1997). The TNPA transposase of the En/Spm element binds a 12 bp sequence found in multiple copies within the 5' and 3' 300 bp subterminal repeat regions (Gierl et al., 1988; Trentmann et al., 1993). The Arabidopsis transposon Tag1 also requires minimal subterminal sequences and a minimal internal spacer between 238 bp and 325 bp for efficient transposition (Liu et al., 2000). The Sleeping Beauty (SB) transposable element contains two transposase binding sites (DRs) at the end of the 230 bp terminal inverted repeats (Ivics et al., 1997). The DNA-bending protein HMGB1, a cellular cofactor, was found to interact with the SB transposase in vivo to stimulate preferential binding of the transposase to the DR further from the cleavage site, and promoted bending of DNA fragments containing the transposon IR (Zayed et al., 2003).
[0219] These examples suggest that the piggyBac transposase or some host accessory factors could be binding to the identified critical TRD-adjacent ID regions to promote efficient transposition in germ-line cells. While not intending to be limited to any particular theory or mechanism of action, these subterminal ID sequences may serve as additional piggyBac transposase binding sites, thus increasing the efficiency of movement by cooperative binding of the transposase. Alternatively, these sequences may serve as some accessory factor binding site(s) responsible for efficient alignment of the termini or facilitating association of the transposon with chromatin-complexed genomic sequences.
[0220] The present results force a reassessment of the reliability of plasmid-based transposition assays in predicting piggyBac movement for transgenesis. Plasmid-based transposition assays, while facilitating mutational analyses of the transposon, are likely to be reliable predictors of in vivo movement only when alterations lead to a loss of movement. This difference is likely due to the fact that plasmid-based assays indicate the activity of the transposon in somatic cells while transformation assays assess movement in germ-line cells. Chromatin in the primordial germ cells is structured and regulated differently than that of blastoderm cells (reviewed by Wolffe, 1996). This difference could contribute to different results in the two types of assays. Interplasmid transposition assays utilize purified supercoiled DNA as the target, while transformation assays target chromatin. Nucleosome formation on negatively supercoiled DNA occurs virtually instantaneously in vitro (Pfaffle and Jackson, 1990), and target plasmid DNA introduced into the embryo cells would most likely form nucleosome structures, but there will be a significant difference in complexity compared to chromatin. This difference in complexity could be the cause of different transposition results. Alternatively, the absence of additional transposase or accessory factor binding sites on the transposon could result in less efficient translocation of the DNA to the nucleus, or lessened affinity of the transposon/transposase complex for the genomic DNA.
Example 25--TRD Point Mutation Analysis
[0221] Sequence analysis of integrated constructs and subsequent detailed analysis of all the constructs confirms a point mutation in the TRD of all constructs examined in this study. This mutation is a C-A transversion in the 19 bp internal repeat sequence of the 3' TRD (FIG. 30). This point mutation originated during construction of the pIAO plasmid (Li et al., 2001b), and is most likely the result of mis-incorporation during PCR amplification. However, our results confirm that this mutation has no significant effect on the transformation efficiency.
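For clarity, a transversion is a substitution between a purine (A or G) and a pyrimidine (C or T), as opposed to a transition, which stays within one class. The following minimal sketch, with a hypothetical function name, illustrates the classification of a single-base change such as the C-A substitution noted above.

```python
# Minimal sketch: classify a single-base substitution as a transition
# or a transversion. The function name is an illustrative assumption.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a ref -> alt change."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError("ref and alt must be two different bases from A, C, G, T")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    print(classify_substitution("C", "A"))  # pyrimidine to purine
    print(classify_substitution("C", "T"))  # pyrimidine to pyrimidine
```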
[0222] Under the conditions of the present direct PCR amplification using piggyBac 5' terminus-specific primers, a weak band of the same size as the expected piggyBac band was generated from control w1118 flies. The Southern hybridizations detected a 1.3 kb band in all of the transformants that was distinct from the pBSII backbone fragment (2.96 kb) and 3xP3-ECFP (1.16 kb) marker bands. piggyBac-like sequences have been detected in many species by PCR and Southern hybridization analysis using probes derived from the piggyBac 5' terminal region, including moths, flies, beetles, etc. (reviewed by Handler, 2002). A homology search against the available sequence database has identified the existence of the piggyBac-like sequences in the D. melanogaster genome (Sarkar et al., 2003). These results reflect the presence of one of these degenerate piggyBac-like sequences in the Drosophila genome.
[0223] The insertion sites in the transformed fly strains were identified by either universal PCR or inverse PCR techniques. All insertions occurred exclusively at TTAA sites, verifying that these insertions were due to a specific piggyBac transposase-mediated mechanism (Fraser et al., 1995). A ClustalX alignment of all piggyBac insertion sites identified here, including insertion sites in the transposition assay target plasmid pGDV1 (Sarkar et al., 1997b), baculovirus, and transformed insect genomes, does not reveal any further significant similarities (Table 5). The proposed existence of a larger piggyBac insertion consensus sequence YYTTTTTT/AARTAAYAG (SEQ ID NO: 124) (Y=pyrimidine, R=purine, /=insertion point) by Cary et al. (1989) and Grossman et al. (2000), and a short 8 bp consensus sequence A/TNA/TTTAAA/T (SEQ ID NO: 125) proposed by Li et al. (2001a), seem to be contradicted by the accumulated insertion site data. A decided preference was noted for piggyBac insertion within TTAA target sites flanked by 4-5 Ts on the 5' side and 5-6 As on the 3' side (Table 5).
[0224] Based on the minimal piggyBac vector pBSII-ITR1.1k-ECFP, a plasmid minimal vector, pXL-BacII-ECFP, was constructed which also yields a high frequency of transformation in D. melanogaster (Table 3). The present results confirm that both the pBSII-ITR1.1k-ECFP and the pXL-BacII-ECFP minimal vectors can serve as highly efficient piggyBac transformation vectors.
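The flanking preference described above can be examined for any candidate target sequence by locating each TTAA site and counting the uninterrupted run of Ts immediately 5' of the site and of As immediately 3' of it. The following is a minimal sketch under that assumption; the function name and the toy sequence are hypothetical and do not reproduce the entries of Table 5.

```python
# Minimal sketch: for every TTAA site, count the consecutive T residues
# immediately upstream and the consecutive A residues immediately
# downstream. Names and the toy input are illustrative assumptions.


def ttaa_flanks(seq: str) -> list:
    """Return (1-based TTAA start, upstream T run, downstream A run)."""
    seq = seq.upper()
    results = []
    start = seq.find("TTAA")
    while start != -1:
        # Walk leftward from the base just before the site.
        t_run = 0
        i = start - 1
        while i >= 0 and seq[i] == "T":
            t_run += 1
            i -= 1
        # Walk rightward from the base just after the site.
        a_run = 0
        j = start + 4
        while j < len(seq) and seq[j] == "A":
            a_run += 1
            j += 1
        results.append((start + 1, t_run, a_run))
        start = seq.find("TTAA", start + 1)
    return results


if __name__ == "__main__":
    # Made-up site: TTAA preceded by four Ts and followed by five As.
    toy = "GC" + "T" * 6 + "A" * 7 + "GC"
    print(ttaa_flanks(toy))
```

In this sketch a site would match the noted preference when the upstream T run is 4 or 5 and the downstream A run is 5 or 6.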
[0225] All documents, patents, journal articles and other materials cited in the present application are hereby incorporated by reference.
[0226] Although the present invention has been fully described in conjunction with several embodiments thereof with reference to the accompanying drawings, it is to be understood that various changes and modifications may be apparent to those skilled in the art. Such changes and modifications are to be understood as included within the scope of the present invention as defined by the appended claims, unless they depart therefrom.
BIBLIOGRAPHY
[0227] The following materials are hereby specifically incorporated herein by reference in their entirety.
[0228] 1. Ausubel F M, et al. (1994), Current Protocols in Molecular Biology, John Wiley & Sons, Inc.
[0229] 2. Becker H A, Kunze R (1997), Mol. Gen. Genet., 254(3): 219-30.
[0230] 3. Beeman R W, Stauth D M (1997), Insect Mol. Biol., 6(1): 83-8.
[0231] 4. Berghammer A J, et al. (1999), Nature, 402: 370-371.
[0232] 5. Buck T A, et al. (1997), Mol. Gen. Genet., 255: 605-610.
[0233] 6. Cary L C, et al. (1989), Virology, 172: 156-169.
[0234] 7. Elick T A, et al. (1996a), Genetica, 97(2): 127-139.
[0235] 8. Elick T A, Bauser C A, Fraser M J Jr (1996b), Genetica, 98(1): 33-41.
[0236] 9. Elick T A, et al. (1997), Mol. Gen. Genet., 255(6): 605-610.
[0237] 10. Fraser M J Jr, et al. (1983), J. Virol., 47: 287-300.
[0238] 11. Fraser M J Jr, et al. (1985), Virology, 145(2): 356-61.
[0239] 12. Fraser M J Jr, et al. (1995), Virology, 211(2): 397-407.
[0240] 13. Fraser M J Jr, Ciszczon T, Elick T, Bauser C (1996), Insect Mol. Biol., 5(2): 141-51.
[0241] 14. Geier G, Modrich P (1979), J. Biol. Chem., 254(4): 1408-1413.
[0242] 15. Gierl A, Lutticke S, Saedler H (1988), EMBO J., 7(13): 4045-53.
[0243] 16. Goryshin I Y, et al. (1994), Proc. Natl. Acad. Sci. USA, 91: 10834-10838.
[0244] 17. Grossman G L, et al. (2000), Insect Biochem. Mol. Biol., 30(10): 909-14.
[0245] 18. Grossman G L, et al. (2001), Insect Mol. Biol., 10(6): 597-604.
[0246] 19. Grossniklaus U, et al. (1992), Genes Dev., 6(6): 1030-51.
[0247] 20. Handler A M, et al. (1998), Proc. Natl. Acad. Sci. USA, 95(13): 7520-5.
[0248] 21. Handler A M, Harrell R A 2nd (1999), Insect Mol. Biol., 8(4): 449-57.
[0249] 22. Handler A M, McCombs S D (2000), Insect Mol. Biol., 9(6): 605-12.
[0250] 23. Handler A M, Harrell R A 2nd (2001a), Biotechniques, 31(4): 824-8.
[0251] 24. Handler A M, Harrell R A 2nd (2001b), Insect Biochem. Mol. Biol., 31(2): 199-205.
[0252] 25. Handler A M (2002), Insect Biochem. Mol. Biol., 32(10): 1211-20.
[0253] 26. Hediger M, et al. (2001), Insect Mol. Biol., 10(2): 113-9.
[0254] 27. Heinrich J C, et al. (2002), Insect Mol. Biol., 11(1): 1-10.
[0255] 28. Hirt B (1967), J. Mol. Biol., 26: 367-369.
[0256] 29. Horn C, Wimmer E A (2000), Dev. Genes Evol., 210(12): 630-7.
[0257] 30. Ivics Z, Hackett P B, Plasterk R H, Izsvak Z (1997), Cell, 91(4): 501-10.
[0258] 31. Jarvis et al. (1990), Biotechnology, 8 (10): 950-955.
[0259] 32. Jasinskiene N, et al. (2000), Insect Mol. Biol., 9(1): 11-8.
[0260] 33. Kaufman P D, et al. (1989), Cell, 59(2): 359-71.
[0261] 34. Kokoza V, et al. (2001), Insect Biochem. Mol. Biol., 31(12): 1137-43.
[0262] 35. Kunze R, Starlinger P (1989), EMBO J., 8(11): 3177-85.
36. Li X, Heinrich J C, Scott M J (2001a), Insect Mol. Biol., 10(5): 447-55.
[0263] 37. Li X, Lobo N, Bauser C A, Fraser M J Jr (2001b), Mol. Genet. Genomics, 266(2): 190-8.
[0264] 38. Liu D, et al. (2000), Genetics, 157(2): 817-30.
[0265] 39. Lobo N, Li X, Fraser M J Jr (1999), Mol. Gen. Genet., 261(4-5): 803-10.
[0266] 40. Lobo N, et al. (2001), Mol. Genet. Genomics, 265(1): 66-71.
41. Lobo N F, et al. (2002), Insect Mol. Biol., 11(2): 133-9.
[0267] 42. Lohe A R, Hartl D L (2001), Genetics, 160(2): 519-26.
[0268] 43. Lozovsky E R, et al. (2002), Genetics, 160(2): 527-35.
[0269] 44. Mandrioli M, Wimmer E A (2002), Insect Biochem. Mol. Biol., 33(1): 1-5.
[0270] 45. Mullins M C, Rio D C, Rubin G M (1989), Genes Dev., 3(5): 729-38.
[0271] 46. Nolan T, et al. (2002), J. Biol. Chem., 277(11): 8759-62.
[0272] 47. Ochman H, et al. (1988), Genetics, 120(3): 621-3.
[0273] 48. Peloquin J J, et al. (2000), Insect Mol. Biol., 9(3): 323-33.
[0274] 49. Perera O P, et al. (2002), Insect Mol. Biol., 11(4): 291-7.
[0275] 50. Pfaffle P, Jackson V (1990), J. Biol. Chem., 265(28): 6821-9.
[0266] 51. Rio D C, Rubin G M (1988), Proc. Natl. Acad. Sci. USA, 85: 8929-8933.
[0276] 52. Rubin G M, Spradling A C (1982), Science, 218(4570): 348-53.
[0277] 53. Rubin G M, Spradling A C (1983), Nucleic Acids Res., 11(18): 6341-51.
[0278] 54. Saedler H, Gierl A (Editors) (1996), Transposable Elements, Springer-Verlag, Berlin.
[0279] 55. Sambrook J, Fritsch E F, and Maniatis T (1989) Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Press).
[0280] 56. Sarkar A, Yardley K, Atkinson P W, James A A, O'Brochta D A (1997a), Insect Biochem. Mol. Biol., 27(5): 359-63.
[0281] 57. Sarkar A, et al. (1997b), Genetica, 99(1): 15-29.
[0282] 58. Sarkar A, et al. (2003), Mol. Genet. Genomics, 270(2): 173-80.
[0283] 59. Sekar V (1987), BioTechniques, 5: 11-13.
[0284] 60. Sumitani M, et al. (2003), Insect Biochem. Mol. Biol., 33(4): 449-458.
[0285] 61. Tamura T, et al. (2000), Nat. Biotechnol., 18(1): 81-4.
[0286] 62. Thibault S T, et al. (1999), Insect Mol. Biol., 8(1): 119-23.
[0287] 63. Thomas J L, et al. (2002), Insect Biochem. Mol. Biol., 32(3): 247-53.
[0288] 64. Thummel C S, Pirrotta V (1992), Dros. Info. Service, 71: 150.
[0289] 65. Tosi L R, Beverley S M (2000), Nucleic Acids Res., 28(3): 784-90.
[0290] 66. Trentmann S M, Saedler H, Gierl A (1993), Mol. Gen. Genet., 238(1-2): 201-208.
[0291] 67. Wang H H, Fraser M J Jr (1993), Insect Mol. Biol., 1: 109-116.
[0292] 68. Zayed H, et al. (2003), Nucleic Acids Res., 31(9): 2313-2322.
Sequence CWU
1
1
200125DNAArtificial SequenceSynthetic primer 1ggatcccatg cgtcaatttt acgca
25230DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
2acgactagtg ttcccacaat ggttaattcg
30330DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 3acgactagtg ccgtacgcgt atcgataagc
30415DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 4gcttgataag aagag
15517DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 5gcatgttgct tgctatt
17630DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 6acgtaagctt cgatgtcttt
gtgatgcgcc 30731DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
7acggaattca cttgcaactg aaacaatatc c
31830DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 8actctcgagg ttcccacaat ggttaattcg
30930DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 9actgaattca tggtggcgac cggtggatcg
301029DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 10ggatcctcta gattaaccct agaaagata
291134DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 11gaaagggccc gtgatacgcc
tatttttata ggtt 341235DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
12aatcggtacc aacgcgcggg gagaggcggt ttgcg
351333DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 13ccaagggccc tgacgtgaac cattgtcaca cgt
331436DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 14tgtgggtacc gtcgatcaaa caaacgcgag ataccg
361531DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 15cgtcaatttt acgcagacta tctttctagg g
311639DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 16ttaaccctag aaagatagtc
tgcgtaaaat tgacgcatg 391730DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
17gtacgtcaca atatgattat ctttctaggg
301830DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 18ttaaccctag aaagataatc atattgtgac
301943DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 19ttaattaacc ctagaaagat agtctgcgta aaattgacgc atg
432034DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 20ttaattaacc ctagaaagat aatcatattg tgac
342146DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 21ctagtactag tgcgccgcgt
acgtctagag acgcgcagtc tagaad 462246DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
22ttctagactg cgcgtctcta gacgtacgcg gcgcactagt actagd
462323DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 23gatgacctgc agtaggaaga cgd
232439DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 24gactctagac gtacgcggag cttaacccta gaaagatad
392526DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 25ggattccatg cgtcaatttt acgcad
262635DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 26ggatcctcga tatacagacc
gataaaaaca catgd 352735DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
27ggtaccattg caaacagcga cggattcgcg ctatd
352832DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 28acgcgtagat cttaatacga ctcactatag gg
322932DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 29acgcgtagat ctaattaacc ctcactaaag gg
323029DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 30cctcgatata cagaccgata aaacacatg
293129DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 31gcacgcctca gccgagctcc
aagggcgac 293227DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
32ggatccctca aaatttcttc taaagta
273327DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 33ggatccctca aaatttcttc taaagta
273430DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 34gcacgcctca gccgagctcc aagcggcgac
303526DNAArtificial SequenceDescription of Artificial
Sequence Synthetic oligonucleotide 35ttaatctaga ggatcctcta gattaa
263626DNAArtificial
SequenceDescription of Artificial Sequence Synthetic oligonucleotide
36ttaatctaga cgtacgcgga gcttaa
263730DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 37ttaatctagc tagtactaga actagattaa
303848DNAArtificial SequenceDescription of Artificial
Sequence Synthetic oligonucleotide 38ttaatctagt tctagacgta
cgcggcgcac tagtactagc tagattaa 483963DNAArtificial
SequenceDescription of Artificial Sequence Synthetic oligonucleotide
39ttaatctagt tctagactgc gcgtctctag acgtacgcgg cgcactagta ctagctagat
60taa
6340707DNAArtificial SequenceDescription of Artificial Sequence Synthetic
ITR cartridge sequence 40ggatcccatg cgtcaatttt acgcagacta
tctttctagg gttaatctag ctgcatcagg 60atcatatcgt cgggtctttt ttccggctca
gtcatcgccc aagctggcgc tatctgggca 120tcggggagga agaagcccgt gccttttccc
gcgaggttga agcggcatgg aaagagtttg 180ccgaggatga ctgctgctgc attgacgttg
agcgaaaacg cacgtttacc atgatgattc 240gggaaggtgt ggccatgcac gcctttaacg
gtgaactgtt cgttcaggcc acctgggata 300ccagttcgtc gcggcttttc cggacacagt
tccggatggt cagcccgaag cgcatcagca 360acccgaacaa taccggcgac agccggaact
gccgtgccgg tgtgcagatt aatgacagcg 420gtgcggcgct gggatattac gtcagcgagg
acgggtatcc tggctggatg ccgcagaaat 480ggacatggat accccgtgag ttacccggcg
ggcgcgcctc gttcattcac gtttttgaac 540ccgtggagga cgggcagact cgcggtgcaa
atgtgtttta cagcgtgatg gagcagatga 600agatgctcga cacgctgcag aacacgcagc
tagattaacc ctagaaagat aatcatattg 660tgacgtacgt taaagataat catgcgtaaa
attgacgcat gggatcc 707413662DNAArtificial
SequenceDescription of Artificial Sequence Synthetic nucleotide
construct 41ctaaattgta agcgttaata ttttgttaaa attcgcgtta aatttttgtt
aaatcagctc 60attttttaac caataggccg aaatcggcaa aatcccttat aaatcaaaag
aatagaccga 120gatagggttg agtgttgttc cagtttggaa caagagtcca ctattaaaga
acgtggactc 180caacgtcaaa gggcgaaaaa ccgtctatca gggcgatggc ccactacgtg
aaccatcacc 240ctaatcaagt tttttggggt cgaggtgccg taaagcacta aatcggaacc
ctaaagggag 300cccccgattt agagcttgac ggggaaagcc ggcgaacgtg gcgagaaagg
aagggaagaa 360agcgaaagga gcgggcgcta gggcgctggc aagtgtagcg gtcacgctgc
gcgtaaccac 420cacacccgcc gcgcttaatg cgccgctaca gggcgcgtcc cattcgccat
tcaggctgcg 480caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc
tggcgaaagg 540gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt
cacgacgttg 600taaaacgacg gccagtgagc gcgcctcgtt cattcacgtt tttgaacccg
tggaggacgg 660gcagactcgc ggtgcaaatg tgttttacag cgtgatggag cagatgaaga
tgctcgacac 720gctgcagaac acgcagctag attaacccta gaaagataat catattgtga
cgtacgttaa 780agataatcat gcgtaaaatt gacgcatggg atctgtaata cgactcacta
tagggcgaat 840tgggtaccgg gccccccctc gaggtcgacg gtatcgataa gcttgatatc
gaattcctgc 900agcccggggg atccactagt tctagagcgg ccgccaccgc ggtggagctc
cagcttttgt 960tccctttagt gagggttaat tagatcccat gcgtcaattt tacgcagact
atctttctag 1020ggttaatcta gctgcatcag gatcatatcg tcgggtcttt tttccggctc
agtcatcgcc 1080caagctggcg ctatctgggc atcggggagg aagaagcccg tgccttttcc
cgcgaggttg 1140aagcggcatg gaaagagttt gccgaggatg actgctgctg cattgacgtt
gagcgaaaac 1200gcacgtttac catgatgatt cgggaaggtg tggccatgca cgcctttaac
ggtgaactgt 1260tcgttcaggc cacctgggat accagttcgt cgcggctttt ccggacacag
ttccggatgg 1320tcagcccgaa gcgcatcagc aacccgaaca ataccggcga cagccggaac
tgccgtgccg 1380gtgtgcagat taatgacagc ggtgcggcgc tgggatatta cgtcagcgag
gacgggtatc 1440ctggctggat gccgcagaaa tggacatgga taccccgtga gttacccggc
gggcgcgctt 1500ggcgtaatca tggtcatagc tgtttcctgt gtgaaattgt tatccgctca
caattccaca 1560caacatacga gccggaagca taaagtgtaa agcctggggt gcctaatgag
tgagctaact 1620cacattaatt gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt
cgtgccagct 1680gcattaatga atcggccaac gcgcggggag aggcggtttg cgtattgggc
gctcttccgc 1740ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg
tatcagctca 1800ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa
agaacatgtg 1860agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg
cgtttttcca 1920taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga
ggtggcgaaa 1980cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg
tgcgctctcc 2040tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg
gaagcgtggc 2100gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc
gctccaagct 2160gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg
gtaactatcg 2220tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca
ctggtaacag 2280gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt
ggcctaacta 2340cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag
ttaccttcgg 2400aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg
gtggtttttt 2460tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc
ctttgatctt 2520ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt
tggtcatgag 2580attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt
ttaaatcaat 2640ctaaagtata tatgagtaaa cttggtctga cagttaccaa tgcttaatca
gtgaggcacc 2700tatctcagcg atctgtctat ttcgttcatc catagttgcc tgactccccg
tcgtgtagat 2760aactacgata cgggagggct taccatctgg ccccagtgct gcaatgatac
cgcgagaccc 2820acgctcaccg gctccagatt tatcagcaat aaaccagcca gccggaaggg
ccgagcgcag 2880aagtggtcct gcaactttat ccgcctccat ccagtctatt aattgttgcc
gggaagctag 2940agtaagtagt tcgccagtta atagtttgcg caacgttgtt gccattgcta
caggcatcgt 3000ggtgtcacgc tcgtcgtttg gtatggcttc attcagctcc ggttcccaac
gatcaaggcg 3060agttacatga tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc
ctccgatcgt 3120tgtcagaagt aagttggccg cagtgttatc actcatggtt atggcagcac
tgcataattc 3180tcttactgtc atgccatccg taagatgctt ttctgtgact ggtgagtact
caaccaagtc 3240attctgagaa tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa
tacgggataa 3300taccgcgcca catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt
cttcggggcg 3360aaaactctca aggatcttac cgctgttgag atccagttcg atgtaaccca
ctcgtgcacc 3420caactgatct tcagcatctt ttactttcac cagcgtttct gggtgagcaa
aaacaggaag 3480gcaaaatgcc gcaaaaaagg gaataagggc gacacggaaa tgttgaatac
tcatactctt 3540cctttttcaa tattattgaa gcatttatca gggttattgt ctcatgagcg
gatacatatt 3600tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc acatttcccc
gaaaagtgcc 3660ac
3662425533DNAArtificial SequenceDescription of Artificial
Sequence Synthetic nucleotide construct 42ctaaattgta agcgttaata
ttttgttaaa attcgcgtta aatttttgtt aaatcagctc 60attttttaac caataggccg
aaatcggcaa aatcccttat aaatcaaaag aatagaccga 120gatagggttg agtgttgttc
cagtttggaa caagagtcca ctattaaaga acgtggactc 180caacgtcaaa gggcgaaaaa
ccgtctatca gggcgatggc ccactacgtg aaccatcacc 240ctaatcaagt tttttggggt
cgaggtgccg taaagcacta aatcggaacc ctaaagggag 300cccccgattt agagcttgac
ggggaaagcc ggcgaacgtg gcgagaaagg aagggaagaa 360agcgaaagga gcgggcgcta
gggcgctggc aagtgtagcg gtcacgctgc gcgtaaccac 420cacacccgcc gcgcttaatg
cgccgctaca gggcgcgtcc cattcgccat tcaggctgcg 480caactgttgg gaagggcgat
cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540gggatgtgct gcaaggcgat
taagttgggt aacgccaggg ttttcccagt cacgacgttg 600taaaacgacg gccagtgagc
gcgcgtaata cgactcacta tagggcgaat tgggtaccgg 660gccccccctc gaggtcgacg
gtatcgataa gctatccagt gcagtaaaaa ataaaaaaaa 720aatatgtttt tttaaatcta
cattctccaa aaaagggttt tattaactta catacatact 780agaattgatc cccgatcccc
ctagaatccc aaaacaaact ggttattgtg gtaggtcatt 840tgtttggcag aagaaaactc
gagaaatttc tctggccgtt attcgttatt ctctcttttc 900tttttgggtc tccctctctg
cactaatgct ctctcactct gtcacacagt aaacggcata 960ctgctctcgt tggttcgaga
gagcgcgcct cgaatgttcg cgaaaagagc gccggagtat 1020aaatagagcg cttcgtctac
ggagcgacaa ttcaattcaa acaagcaaag tgaacacgtc 1080gctaagcgaa agctaagcaa
ataaacaagc gcagctgaac aagctaaaca atctgcagta 1140aagtgcaagt taaagtgaat
caattaaaag taaccagcaa ccaagtaaat caactgcaac 1200tactgaaatc tgccaagaag
taattattga atacaagaag agaactctga atagggaatt 1260gggaattcct gcagcccggg
ggatcctata taataaaatg ggtagttctt tagacgatga 1320gcatatcctc tctgctcttc
tgcaaagcga tgacgagctt gttggtgagg attctgacag 1380tgaaatatca gatcacgtaa
gtgaagatga cgtccagagc gatacagaag aagcgtttat 1440agatgaggta catgaagtgc
agccaacgtc aagcggtagt gaaatattag acgaacaaaa 1500tgttattgaa caaccaggtt
cttcattggc ttctaacaga atcttgacct tgccacagag 1560gactattaga ggtaagaata
aacattgttg gtcaacttca aagtccacga ggcgtagccg 1620agtctctgca ctgaacattg
tcagatctca aagaggtccg acgcgtatgt gccgcaatat 1680atatgaccca cttttatgct
tcaaactatt ttttactgat gagataattt cggaaattgt 1740aaaatggaca aatgctgaga
tatcattgaa acgtcgggaa tctatgacag gtgctacatt 1800tcgtgacacg aatgaagatg
aaatctatgc tttctttggt attctggtaa tgacagcagt 1860gagaaaagat aaccacatgt
ccacagatga cctctttgat cgatctttgt caatggtgta 1920cgtctctgta atgagtcgtg
atcgttttga ttttttgata cgatgtctta gaatggatga 1980caaaagtata cggcccacac
ttcgagaaaa cgatgtattt actcctgtta gaaaaatatg 2040ggatctcttt atccatcagt
gcatacaaaa ttacactcca ggggctcatt tgaccataga 2100tgaacagtta cttggtttta
gaggacggtg tccgtttagg atgtatatcc caaacaagcc 2160aagtaagtat ggaataaaaa
tcctcatgat gtgtgacagt ggtacgaagt atatgataaa 2220tggaatgcct tatttgggaa
gaggaacaca gaccaacgga gtaccactcg gtgaatacta 2280cgtgaaggag ttatcaaagc
ctgtgcacgg tagttgtcgt aatattacgt gtgacaattg 2340gttcacctca atccctttgg
caaaaaactt actacaagaa ccgtataagt taaccattgt 2400gggaaccgtg cgatcaaaca
aacgcgagat accggaagta ctgaaaaaca gtcgctccag 2460gccagtggga acatcgatgt
tttgttttga cggacccctt actctcgtct catataaacc 2520gaagccagct aagatggtat
acttattatc atcttgtgat gaggatgctt ctatcaacga 2580aagtaccggt aaaccgcaaa
tggttatgta ttataatcaa actaaaggcg gagtggacac 2640gctagaccaa atgtgttctg
tgatgacctg cagtaggaag acgaataggt ggcctatggc 2700attattgtac ggaatgataa
acattgcctg cataaattct tttattatat acagccataa 2760tgtcagtagc aagggagaaa
aggttcaaag tcgcaaaaaa tttatgagaa acctttacat 2820gagcctgacg tcatcgttta
tgcgtaagcg tttagaagct cctactttga agagatattt 2880gcgcgataat atctctaata
ttttgccaaa tgaagtgcct ggtacatcag atgacagtac 2940tgaagagcca gtaatgaaaa
aacgtactta ctgtacttac tgcccctcta aaataaggcg 3000aaaggcaaat gcatcgtgca
aaaaatgcaa aaaagttatt tgtcgagagc ataatattga 3060tatgtgccaa agttgtttct
gactgactaa taagtataat ttgtttctat tatgtataag 3120ttaagctaat tacttatttt
ataatacaac atgactgttt ttaaagtaca aaataagttt 3180atttttgtaa aagagagaat
gtttaaaagt tttgttactt tagaagaaat tttgagtttt 3240tgtttttttt taataaataa
ataaacataa ataaattgtt tgttgaattt ggatccacta 3300gttctagagc ggccgccacc
gcggtggagc tccagctttt gttcccttta gtgagggtta 3360attgcgcgct tggcgtaatc
atggtcatag ctgtttcctg tgtgaaattg ttatccgctc 3420acaattccac acaacatacg
agccggaagc ataaagtgta aagcctgggg tgcctaatga 3480gtgagctaac tcacattaat
tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg 3540tcgtgccagc tgcattaatg
aatcggccaa cgcgcgggga gaggcggttt gcgtattggg 3600cgctcttccg cttcctcgct
cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg 3660gtatcagctc actcaaaggc
ggtaatacgg ttatccacag aatcagggga taacgcagga 3720aagaacatgt gagcaaaagg
ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg 3780gcgtttttcc ataggctccg
cccccctgac gagcatcaca aaaatcgacg ctcaagtcag 3840aggtggcgaa acccgacagg
actataaaga taccaggcgt ttccccctgg aagctccctc 3900gtgcgctctc ctgttccgac
cctgccgctt accggatacc tgtccgcctt tctcccttcg 3960ggaagcgtgg cgctttctca
tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt 4020cgctccaagc tgggctgtgt
gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc 4080ggtaactatc gtcttgagtc
caacccggta agacacgact tatcgccact ggcagcagcc 4140actggtaaca ggattagcag
agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg 4200tggcctaact acggctacac
tagaaggaca gtatttggta tctgcgctct gctgaagcca 4260gttaccttcg gaaaaagagt
tggtagctct tgatccggca aacaaaccac cgctggtagc 4320ggtggttttt ttgtttgcaa
gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat 4380cctttgatct tttctacggg
gtctgacgct cagtggaacg aaaactcacg ttaagggatt 4440ttggtcatga gattatcaaa
aaggatcttc acctagatcc ttttaaatta aaaatgaagt 4500tttaaatcaa tctaaagtat
atatgagtaa acttggtctg acagttacca atgcttaatc 4560agtgaggcac ctatctcagc
gatctgtcta tttcgttcat ccatagttgc ctgactcccc 4620gtcgtgtaga taactacgat
acgggagggc ttaccatctg gccccagtgc tgcaatgata 4680ccgcgagacc cacgctcacc
ggctccagat ttatcagcaa taaaccagcc agccggaagg 4740gccgagcgca gaagtggtcc
tgcaacttta tccgcctcca tccagtctat taattgttgc 4800cgggaagcta gagtaagtag
ttcgccagtt aatagtttgc gcaacgttgt tgccattgct 4860acaggcatcg tggtgtcacg
ctcgtcgttt ggtatggctt cattcagctc cggttcccaa 4920cgatcaaggc gagttacatg
atcccccatg ttgtgcaaaa aagcggttag ctccttcggt 4980cctccgatcg ttgtcagaag
taagttggcc gcagtgttat cactcatggt tatggcagca 5040ctgcataatt ctcttactgt
catgccatcc gtaagatgct tttctgtgac tggtgagtac 5100tcaaccaagt cattctgaga
atagtgtatg cggcgaccga gttgctcttg cccggcgtca 5160atacgggata ataccgcgcc
acatagcaga actttaaaag tgctcatcat tggaaaacgt 5220tcttcggggc gaaaactctc
aaggatctta ccgctgttga gatccagttc gatgtaaccc 5280actcgtgcac ccaactgatc
ttcagcatct tttactttca ccagcgtttc tgggtgagca 5340aaaacaggaa ggcaaaatgc
cgcaaaaaag ggaataaggg cgacacggaa atgttgaata 5400ctcatactct tcctttttca
atattattga agcatttatc agggttattg tctcatgagc 5460ggatacatat ttgaatgtat
ttagaaaaat aaacaaatag gggttccgcg cacatttccc 5520cgaaaagtgc cac
5533

SEQ ID NO: 43
Length: 4971
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic nucleotide construct
Sequence 43:
ctaaattgta agcgttaata ttttgttaaa attcgcgtta aatttttgtt
aaatcagctc 60attttttaac caataggccg aaatcggcaa aatcccttat aaatcaaaag
aatagaccga 120gatagggttg agtgttgttc cagtttggaa caagagtcca ctattaaaga
acgtggactc 180caacgtcaaa gggcgaaaaa ccgtctatca gggcgatggc ccactacgtg
aaccatcacc 240ctaatcaagt tttttggggt cgaggtgccg taaagcacta aatcggaacc
ctaaagggag 300cccccgattt agagcttgac ggggaaagcc ggcgaacgtg gcgagaaagg
aagggaagaa 360agcgaaagga gcgggcgcta gggcgctggc aagtgtagcg gtcacgctgc
gcgtaaccac 420cacacccgcc gcgcttaatg cgccgctaca gggcgcgtcc cattcgccat
tcaggctgcg 480caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc
tggcgaaagg 540gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt
cacgacgttg 600taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat
tgggtaccgg 660gccccccctc gaggtcgacg gtatcgataa gcttgatatc gaattcctgc
agcccggggg 720atcctatata ataaaatggg tagttcttta gacgatgagc atatcctctc
tgctcttctg 780caaagcgatg acgagcttgt tggtgaggat tctgacagtg aaatatcaga
tcacgtaagt 840gaagatgacg tccagagcga tacagaagaa gcgtttatag atgaggtaca
tgaagtgcag 900ccaacgtcaa gcggtagtga aatattagac gaacaaaatg ttattgaaca
accaggttct 960tcattggctt ctaacagaat cttgaccttg ccacagagga ctattagagg
taagaataaa 1020cattgttggt caacttcaaa gtccacgagg cgtagccgag tctctgcact
gaacattgtc 1080agatctcaaa gaggtccgac gcgtatgtgc cgcaatatat atgacccact
tttatgcttc 1140aaactatttt ttactgatga gataatttcg gaaattgtaa aatggacaaa
tgctgagata 1200tcattgaaac gtcgggaatc tatgacaggt gctacatttc gtgacacgaa
tgaagatgaa 1260atctatgctt tctttggtat tctggtaatg acagcagtga gaaaagataa
ccacatgtcc 1320acagatgacc tctttgatcg atctttgtca atggtgtacg tctctgtaat
gagtcgtgat 1380cgttttgatt ttttgatacg atgtcttaga atggatgaca aaagtatacg
gcccacactt 1440cgagaaaacg atgtatttac tcctgttaga aaaatatggg atctctttat
ccatcagtgc 1500atacaaaatt acactccagg ggctcatttg accatagatg aacagttact
tggttttaga 1560ggacggtgtc cgtttaggat gtatatccca aacaagccaa gtaagtatgg
aataaaaatc 1620ctcatgatgt gtgacagtgg tacgaagtat atgataaatg gaatgcctta
tttgggaaga 1680ggaacacaga ccaacggagt accactcggt gaatactacg tgaaggagtt
atcaaagcct 1740gtgcacggta gttgtcgtaa tattacgtgt gacaattggt tcacctcaat
ccctttggca 1800aaaaacttac tacaagaacc gtataagtta accattgtgg gaaccgtgcg
atcaaacaaa 1860cgcgagatac cggaagtact gaaaaacagt cgctccaggc cagtgggaac
atcgatgttt 1920tgttttgacg gaccccttac tctcgtctca tataaaccga agccagctaa
gatggtatac 1980ttattatcat cttgtgatga ggatgcttct atcaacgaaa gtaccggtaa
accgcaaatg 2040gttatgtatt ataatcaaac taaaggcgga gtggacacgc tagaccaaat
gtgttctgtg 2100atgacctgca gtaggaagac gaataggtgg cctatggcat tattgtacgg
aatgataaac 2160attgcctgca taaattcttt tattatatac agccataatg tcagtagcaa
gggagaaaag 2220gttcaaagtc gcaaaaaatt tatgagaaac ctttacatga gcctgacgtc
atcgtttatg 2280cgtaagcgtt tagaagctcc tactttgaag agatatttgc gcgataatat
ctctaatatt 2340ttgccaaatg aagtgcctgg tacatcagat gacagtactg aagagccagt
aatgaaaaaa 2400cgtacttact gtacttactg cccctctaaa ataaggcgaa aggcaaatgc
atcgtgcaaa 2460aaatgcaaaa aagttatttg tcgagagcat aatattgata tgtgccaaag
ttgtttctga 2520ctgactaata agtataattt gtttctatta tgtataagtt aagctaatta
cttattttat 2580aatacaacat gactgttttt aaagtacaaa ataagtttat ttttgtaaaa
gagagaatgt 2640ttaaaagttt tgttacttta gaagaaattt tgagtttttg ttttttttta
ataaataaat 2700aaacataaat aaattgtttg ttgaatttgg atccactagt tctagagcgg
ccgccaccgc 2760ggtggagctc cagcttttgt tccctttagt gagggttaat tgcgcgcttg
gcgtaatcat 2820ggtcatagct gtttcctgtg tgaaattgtt atccgctcac aattccacac
aacatacgag 2880ccggaagcat aaagtgtaaa gcctggggtg cctaatgagt gagctaactc
acattaattg 2940cgttgcgctc actgcccgct ttccagtcgg gaaacctgtc gtgccagctg
cattaatgaa 3000tcggccaacg cgcggggaga ggcggtttgc gtattgggcg ctcttccgct
tcctcgctca 3060ctgactcgct gcgctcggtc gttcggctgc ggcgagcggt atcagctcac
tcaaaggcgg 3120taatacggtt atccacagaa tcaggggata acgcaggaaa gaacatgtga
gcaaaaggcc 3180agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat
aggctccgcc 3240cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac
ccgacaggac 3300tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct
gttccgaccc 3360tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg
ctttctcata 3420gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg
ggctgtgtgc 3480acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt
cttgagtcca 3540acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg
attagcagag 3600cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac
ggctacacta 3660gaaggacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga
aaaagagttg 3720gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt
gtttgcaagc 3780agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt
tctacggggt 3840ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga
ttatcaaaaa 3900ggatcttcac ctagatcctt ttaaattaaa aatgaagttt taaatcaatc
taaagtatat 3960atgagtaaac ttggtctgac agttaccaat gcttaatcag tgaggcacct
atctcagcga 4020tctgtctatt tcgttcatcc atagttgcct gactccccgt cgtgtagata
actacgatac 4080gggagggctt accatctggc cccagtgctg caatgatacc gcgagaccca
cgctcaccgg 4140ctccagattt atcagcaata aaccagccag ccggaagggc cgagcgcaga
agtggtcctg 4200caactttatc cgcctccatc cagtctatta attgttgccg ggaagctaga
gtaagtagtt 4260cgccagttaa tagtttgcgc aacgttgttg ccattgctac aggcatcgtg
gtgtcacgct 4320cgtcgtttgg tatggcttca ttcagctccg gttcccaacg atcaaggcga
gttacatgat 4380cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc tccgatcgtt
gtcagaagta 4440agttggccgc agtgttatca ctcatggtta tggcagcact gcataattct
cttactgtca 4500tgccatccgt aagatgcttt tctgtgactg gtgagtactc aaccaagtca
ttctgagaat 4560agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat acgggataat
accgcgccac 4620atagcagaac tttaaaagtg ctcatcattg gaaaacgttc ttcggggcga
aaactctcaa 4680ggatcttacc gctgttgaga tccagttcga tgtaacccac tcgtgcaccc
aactgatctt 4740cagcatcttt tactttcacc agcgtttctg ggtgagcaaa aacaggaagg
caaaatgccg 4800caaaaaaggg aataagggcg acacggaaat gttgaatact catactcttc
ctttttcaat 4860attattgaag catttatcag ggttattgtc tcatgagcgg atacatattt
gaatgtattt 4920agaaaaataa acaaataggg gttccgcgca catttccccg aaaagtgcca c
4971

SEQ ID NO: 44
Length: 5523
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic nucleotide construct
Sequence 44:
ctaaattgta agcgttaata
ttttgttaaa attcgcgtta aatttttgtt aaatcagctc 60attttttaac caataggccg
aaatcggcaa aatcccttat aaatcaaaag aatagaccga 120gatagggttg agtgttgttc
cagtttggaa caagagtcca ctattaaaga acgtggactc 180caacgtcaaa gggcgaaaaa
ccgtctatca gggcgatggc ccactacgtg aaccatcacc 240ctaatcaagt tttttggggt
cgaggtgccg taaagcacta aatcggaacc ctaaagggag 300cccccgattt agagcttgac
ggggaaagcc ggcgaacgtg gcgagaaagg aagggaagaa 360agcgaaagga gcgggcgcta
gggcgctggc aagtgtagcg gtcacgctgc gcgtaaccac 420cacacccgcc gcgcttaatg
cgccgctaca gggcgcgtcc cattcgccat tcaggctgcg 480caactgttgg gaagggcgat
cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540gggatgtgct gcaaggcgat
taagttgggt aacgccaggg ttttcccagt cacgacgttg 600taaaacgacg gccagtgagc
gcgcgtaata cgactcacta tagggcgaat tgggtaccgg 660gccccccctc gaggtcgacg
gtatcgataa gcttcgatgt ctttgtgatg cgccgacatt 720tttgtaggtt attgataaaa
tgaacggata cagttgcccg acattatcat taaatccttg 780gcgtagaatt tgtcgggtcc
attgtccgtg tgcgctagca tgcccgctaa cggacctcgt 840acttttggct tcaaaggttt
tgcgcacaga caaaatgtgc cacacttgca gctctgcatg 900tgtgcgcgtt accacaaatc
ccaacggcgc agtgtacttg ttgtatgcaa ataaatctcg 960ataaaggcgc ggcgcgcgaa
tgcagctgat cacgtacgct cctcgtgttc cgttcaagga 1020cggtgttatc gacctcagat
taatgtttat cggccgactg ttttcgtatc cgctcaccaa 1080acgcgttttt gcattaacat
tgtatgtcgg cggatgttct atatctaatt tgaataaata 1140aacgataacc gcgttggttt
tagagggcat aataaaagaa atattgttat cgtgttcgcc 1200attagggcag tataaattga
cgttcatgtt ggatattgtt tcagttgcaa gtgaattcct 1260gcagcccggg ggatcctata
taataaaatg ggtagttctt tagacgatga gcatatcctc 1320tctgctcttc tgcaaagcga
tgacgagctt gttggtgagg attctgacag tgaaatatca 1380gatcacgtaa gtgaagatga
cgtccagagc gatacagaag aagcgtttat agatgaggta 1440catgaagtgc agccaacgtc
aagcggtagt gaaatattag acgaacaaaa tgttattgaa 1500caaccaggtt cttcattggc
ttctaacaga atcttgacct tgccacagag gactattaga 1560ggtaagaata aacattgttg
gtcaacttca aagtccacga ggcgtagccg agtctctgca 1620ctgaacattg tcagatctca
aagaggtccg acgcgtatgt gccgcaatat atatgaccca 1680cttttatgct tcaaactatt
ttttactgat gagataattt cggaaattgt aaaatggaca 1740aatgctgaga tatcattgaa
acgtcgggaa tctatgacag gtgctacatt tcgtgacacg 1800aatgaagatg aaatctatgc
tttctttggt attctggtaa tgacagcagt gagaaaagat 1860aaccacatgt ccacagatga
cctctttgat cgatctttgt caatggtgta cgtctctgta 1920atgagtcgtg atcgttttga
ttttttgata cgatgtctta gaatggatga caaaagtata 1980cggcccacac ttcgagaaaa
cgatgtattt actcctgtta gaaaaatatg ggatctcttt 2040atccatcagt gcatacaaaa
ttacactcca ggggctcatt tgaccataga tgaacagtta 2100cttggtttta gaggacggtg
tccgtttagg atgtatatcc caaacaagcc aagtaagtat 2160ggaataaaaa tcctcatgat
gtgtgacagt ggtacgaagt atatgataaa tggaatgcct 2220tatttgggaa gaggaacaca
gaccaacgga gtaccactcg gtgaatacta cgtgaaggag 2280ttatcaaagc ctgtgcacgg
tagttgtcgt aatattacgt gtgacaattg gttcacctca 2340atccctttgg caaaaaactt
actacaagaa ccgtataagt taaccattgt gggaaccgtg 2400cgatcaaaca aacgcgagat
accggaagta ctgaaaaaca gtcgctccag gccagtggga 2460acatcgatgt tttgttttga
cggacccctt actctcgtct catataaacc gaagccagct 2520aagatggtat acttattatc
atcttgtgat gaggatgctt ctatcaacga aagtaccggt 2580aaaccgcaaa tggttatgta
ttataatcaa actaaaggcg gagtggacac gctagaccaa 2640atgtgttctg tgatgacctg
cagtaggaag acgaataggt ggcctatggc attattgtac 2700ggaatgataa acattgcctg
cataaattct tttattatat acagccataa tgtcagtagc 2760aagggagaaa aggttcaaag
tcgcaaaaaa tttatgagaa acctttacat gagcctgacg 2820tcatcgttta tgcgtaagcg
tttagaagct cctactttga agagatattt gcgcgataat 2880atctctaata ttttgccaaa
tgaagtgcct ggtacatcag atgacagtac tgaagagcca 2940gtaatgaaaa aacgtactta
ctgtacttac tgcccctcta aaataaggcg aaaggcaaat 3000gcatcgtgca aaaaatgcaa
aaaagttatt tgtcgagagc ataatattga tatgtgccaa 3060agttgtttct gactgactaa
taagtataat ttgtttctat tatgtataag ttaagctaat 3120tacttatttt ataatacaac
atgactgttt ttaaagtaca aaataagttt atttttgtaa 3180aagagagaat gtttaaaagt
tttgttactt tagaagaaat tttgagtttt tgtttttttt 3240taataaataa ataaacataa
ataaattgtt tgttgaattt ggatccacta gttctagagc 3300ggccgccacc gcggtggagc
tccagctttt gttcccttta gtgagggtta attgcgcgct 3360tggcgtaatc atggtcatag
ctgtttcctg tgtgaaattg ttatccgctc acaattccac 3420acaacatacg agccggaagc
ataaagtgta aagcctgggg tgcctaatga gtgagctaac 3480tcacattaat tgcgttgcgc
tcactgcccg ctttccagtc gggaaacctg tcgtgccagc 3540tgcattaatg aatcggccaa
cgcgcgggga gaggcggttt gcgtattggg cgctcttccg 3600cttcctcgct cactgactcg
ctgcgctcgg tcgttcggct gcggcgagcg gtatcagctc 3660actcaaaggc ggtaatacgg
ttatccacag aatcagggga taacgcagga aagaacatgt 3720gagcaaaagg ccagcaaaag
gccaggaacc gtaaaaaggc cgcgttgctg gcgtttttcc 3780ataggctccg cccccctgac
gagcatcaca aaaatcgacg ctcaagtcag aggtggcgaa 3840acccgacagg actataaaga
taccaggcgt ttccccctgg aagctccctc gtgcgctctc 3900ctgttccgac cctgccgctt
accggatacc tgtccgcctt tctcccttcg ggaagcgtgg 3960cgctttctca tagctcacgc
tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc 4020tgggctgtgt gcacgaaccc
cccgttcagc ccgaccgctg cgccttatcc ggtaactatc 4080gtcttgagtc caacccggta
agacacgact tatcgccact ggcagcagcc actggtaaca 4140ggattagcag agcgaggtat
gtaggcggtg ctacagagtt cttgaagtgg tggcctaact 4200acggctacac tagaaggaca
gtatttggta tctgcgctct gctgaagcca gttaccttcg 4260gaaaaagagt tggtagctct
tgatccggca aacaaaccac cgctggtagc ggtggttttt 4320ttgtttgcaa gcagcagatt
acgcgcagaa aaaaaggatc tcaagaagat cctttgatct 4380tttctacggg gtctgacgct
cagtggaacg aaaactcacg ttaagggatt ttggtcatga 4440gattatcaaa aaggatcttc
acctagatcc ttttaaatta aaaatgaagt tttaaatcaa 4500tctaaagtat atatgagtaa
acttggtctg acagttacca atgcttaatc agtgaggcac 4560ctatctcagc gatctgtcta
tttcgttcat ccatagttgc ctgactcccc gtcgtgtaga 4620taactacgat acgggagggc
ttaccatctg gccccagtgc tgcaatgata ccgcgagacc 4680cacgctcacc ggctccagat
ttatcagcaa taaaccagcc agccggaagg gccgagcgca 4740gaagtggtcc tgcaacttta
tccgcctcca tccagtctat taattgttgc cgggaagcta 4800gagtaagtag ttcgccagtt
aatagtttgc gcaacgttgt tgccattgct acaggcatcg 4860tggtgtcacg ctcgtcgttt
ggtatggctt cattcagctc cggttcccaa cgatcaaggc 4920gagttacatg atcccccatg
ttgtgcaaaa aagcggttag ctccttcggt cctccgatcg 4980ttgtcagaag taagttggcc
gcagtgttat cactcatggt tatggcagca ctgcataatt 5040ctcttactgt catgccatcc
gtaagatgct tttctgtgac tggtgagtac tcaaccaagt 5100cattctgaga atagtgtatg
cggcgaccga gttgctcttg cccggcgtca atacgggata 5160ataccgcgcc acatagcaga
actttaaaag tgctcatcat tggaaaacgt tcttcggggc 5220gaaaactctc aaggatctta
ccgctgttga gatccagttc gatgtaaccc actcgtgcac 5280ccaactgatc ttcagcatct
tttactttca ccagcgtttc tgggtgagca aaaacaggaa 5340ggcaaaatgc cgcaaaaaag
ggaataaggg cgacacggaa atgttgaata ctcatactct 5400tcctttttca atattattga
agcatttatc agggttattg tctcatgagc ggatacatat 5460ttgaatgtat ttagaaaaat
aaacaaatag gggttccgcg cacatttccc cgaaaagtgc 5520cac
5523

SEQ ID NO: 45
Length: 6984
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic nucleotide construct
Sequence 45:
tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata
tggagttccg 60cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc
cccgcccatt 120gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc
attgacgtca 180atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt
atcatatgcc 240aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt
atgcccagta 300catgacctta tgggactttc ctacttggca gtacatctac gtattagtca
tcgctattac 360catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg
actcacgggg 420atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc
aaaatcaacg 480ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg
gtaggcgtgt 540acggtgggag gtctatataa gcagagctgg tttagtgaac cgtcagatcc
gctagcgcta 600ccggactcag atcctatata ataaaatggg tagttcttta gacgatgagc
atatcctctc 660tgctcttctg caaagcgatg acgagcttgt tggtgaggat tctgacagtg
aaatatcaga 720tcacgtaagt gaagatgacg tccagagcga tacagaagaa gcgtttatag
atgaggtaca 780tgaagtgcag ccaacgtcaa gcggtagtga aatattagac gaacaaaatg
ttattgaaca 840accaggttct tcattggctt ctaacagaat cttgaccttg ccacagagga
ctattagagg 900taagaataaa cattgttggt caacttcaaa gtccacgagg cgtagccgag
tctctgcact 960gaacattgtc agatctcaaa gaggtccgac gcgtatgtgc cgcaatatat
atgacccact 1020tttatgcttc aaactatttt ttactgatga gataatttcg gaaattgtaa
aatggacaaa 1080tgctgagata tcattgaaac gtcgggaatc tatgacaggt gctacatttc
gtgacacgaa 1140tgaagatgaa atctatgctt tctttggtat tctggtaatg acagcagtga
gaaaagataa 1200ccacatgtcc acagatgacc tctttgatcg atctttgtca atggtgtacg
tctctgtaat 1260gagtcgtgat cgttttgatt ttttgatacg atgtcttaga atggatgaca
aaagtatacg 1320gcccacactt cgagaaaacg atgtatttac tcctgttaga aaaatatggg
atctctttat 1380ccatcagtgc atacaaaatt acactccagg ggctcatttg accatagatg
aacagttact 1440tggttttaga ggacggtgtc cgtttaggat gtatatccca aacaagccaa
gtaagtatgg 1500aataaaaatc ctcatgatgt gtgacagtgg tacgaagtat atgataaatg
gaatgcctta 1560tttgggaaga ggaacacaga ccaacggagt accactcggt gaatactacg
tgaaggagtt 1620atcaaagcct gtgcacggta gttgtcgtaa tattacgtgt gacaattggt
tcacctcaat 1680ccctttggca aaaaacttac tacaagaacc gtataagtta accattgtgg
gaaccgtgcg 1740atcaaacaaa cgcgagatac cggaagtact gaaaaacagt cgctccaggc
cagtgggaac 1800atcgatgttt tgttttgacg gaccccttac tctcgtctca tataaaccga
agccagctaa 1860gatggtatac ttattatcat cttgtgatga ggatgcttct atcaacgaaa
gtaccggtaa 1920accgcaaatg gttatgtatt ataatcaaac taaaggcgga gtggacacgc
tagaccaaat 1980gtgttctgtg atgacctgca gtaggaagac gaataggtgg cctatggcat
tattgtacgg 2040aatgataaac attgcctgca taaattcttt tattatatac agccataatg
tcagtagcaa 2100gggagaaaag gttcaaagtc gcaaaaaatt tatgagaaac ctttacatga
gcctgacgtc 2160atcgtttatg cgtaagcgtt tagaagctcc tactttgaag agatatttgc
gcgataatat 2220ctctaatatt ttgccaaatg aagtgcctgg tacatcagat gacagtactg
aagagccagt 2280aatgaaaaaa cgtacttact gtacttactg cccctctaaa ataaggcgaa
aggcaaatgc 2340atcgtgcaaa aaatgcaaaa aagttatttg tcgagagcat aatattgata
tgtgccaaag 2400ttgtttctga ctgactaata agtataattt gtttctatta tgtataagtt
aagctaatta 2460cttattttat aatacaacat gactgttttt aaagtacaaa ataagtttat
ttttgtaaaa 2520gagagaatgt ttaaaagttt tgttacttta gaagaaattt tgagtttttg
ttttttttta 2580ataaataaat aaacataaat aaattgtttg ttgaatttgg atctcgaggt
tcccacaatg 2640gttaattcga gctcgcccgg ggatctaatt caattagaga ctaattcaat
tagagctaat 2700tcaattagga tccaagctta tcgatttcga accctcgacc gccggagtat
aaatagaggc 2760gcttcgtcta cggagcgaca attcaattca aacaagcaaa gtgaacacgt
cgctaagcga 2820aagctaagca aataaacaag cgcagctgaa caagctaaac aatcggggta
ccgctagagt 2880cgacggtacc gcgggcccgg gatccaccgg tcgccaccat gaattctgca
gtcgacggta 2940ccgcgggccc gggatccacc ggtcgccacc atggtgcgct cctccaagaa
cgtcatcaag 3000gagttcatgc gcttcaaggt gcgcatggag ggcaccgtga acggccacga
gttcgagatc 3060gagggcgagg gcgagggccg cccctacgag ggccacaaca ccgtgaagct
gaaggtgacc 3120aagggcggcc ccctgccctt cgcctgggac atcctgtccc cccagttcca
gtacggctcc 3180aaggtgtacg tgaagcaccc cgccgacatc cccgactaca agaagctgtc
cttccccgag 3240ggcttcaagt gggagcgcgt gatgaacttc gaggacggcg gcgtggtgac
cgtgacccag 3300gactcctccc tgcaggacgg ctgcttcatc tacaaggtga agttcatcgg
cgtgaacttc 3360ccctccgacg gccccgtaat gcagaagaag accatgggct gggaggcctc
caccgagcgc 3420ctgtaccccc gcgacggcgt gctgaagggc gagatccaca aggccctgaa
gctgaaggac 3480ggcggccact acctggtgga gttcaagtcc atctacatgg ccaagaagcc
cgtgcagctg 3540cccggctact actacgtgga ctccaagctg gacatcacct cccacaacga
ggactacacc 3600atcgtggagc agtacgagcg caccgagggc cgccaccacc tgttcctgta
gcggccgcga 3660ctctagatca taatcagcca taccacattt gtagaggttt tacttgcttt
aaaaaacctc 3720ccacacctcc ccctgaacct gaaacataaa atgaatgcaa ttgttgttgt
taacttgttt 3780attgcagctt ataatggtta caaataaagc aatagcatca caaatttcac
aaataaagca 3840tttttttcac tgcattctag ttgtggtttg tccaaactca tcaatgtatc
ttaaggcgta 3900aattgtaagc gttaatattt tgttaaaatt cgcgttaaat ttttgttaaa
tcagctcatt 3960ttttaaccaa taggccgaaa tcggcaaaat cccttataaa tcaaaagaat
agaccgagat 4020agggttgagt gttgttccag tttggaacaa gagtccacta ttaaagaacg
tggactccaa 4080cgtcaaaggg cgaaaaaccg tctatcaggg cgatggccca ctacgtgaac
catcacccta 4140atcaagtttt ttggggtcga ggtgccgtaa agcactaaat cggaacccta
aagggagccc 4200ccgatttaga gcttgacggg gaaagccggc gaacgtggcg agaaaggaag
ggaagaaagc 4260gaaaggagcg ggcgctaggg cgctggcaag tgtagcggtc acgctgcgcg
taaccaccac 4320acccgccgcg cttaatgcgc cgctacaggg cgcgtcaggt ggcacttttc
ggggaaatgt 4380gcgcggaacc cctatttgtt tatttttcta aatacattca aatatgtatc
cgctcatgag 4440acaataaccc tgataaatgc ttcaataata ttgaaaaagg aagagtcctg
aggcggaaag 4500aaccagctgt ggaatgtgtg tcagttaggg tgtggaaagt ccccaggctc
cccagcaggc 4560agaagtatgc aaagcatgca tctcaattag tcagcaacca ggtgtggaaa
gtccccaggc 4620tccccagcag gcagaagtat gcaaagcatg catctcaatt agtcagcaac
catagtcccg 4680cccctaactc cgcccatccc gcccctaact ccgcccagtt ccgcccattc
tccgccccat 4740ggctgactaa ttttttttat ttatgcagag gccgaggccg cctcggcctc
tgagctattc 4800cagaagtagt gaggaggctt ttttggaggc ctaggctttt gcaaagatcg
atcaagagac 4860aggatgagga tcgtttcgca tgattgaaca agatggattg cacgcaggtt
ctccggccgc 4920ttgggtggag aggctattcg gctatgactg ggcacaacag acaatcggct
gctctgatgc 4980cgccgtgttc cggctgtcag cgcaggggcg cccggttctt tttgtcaaga
ccgacctgtc 5040cggtgccctg aatgaactgc aagacgaggc agcgcggcta tcgtggctgg
ccacgacggg 5100cgttccttgc gcagctgtgc tcgacgttgt cactgaagcg ggaagggact
ggctgctatt 5160gggcgaagtg ccggggcagg atctcctgtc atctcacctt gctcctgccg
agaaagtatc 5220catcatggct gatgcaatgc ggcggctgca tacgcttgat ccggctacct
gcccattcga 5280ccaccaagcg aaacatcgca tcgagcgagc acgtactcgg atggaagccg
gtcttgtcga 5340tcaggatgat ctggacgaag agcatcaggg gctcgcgcca gccgaactgt
tcgccaggct 5400caaggcgagc atgcccgacg gcgaggatct cgtcgtgacc catggcgatg
cctgcttgcc 5460gaatatcatg gtggaaaatg gccgcttttc tggattcatc gactgtggcc
ggctgggtgt 5520ggcggaccgc tatcaggaca tagcgttggc tacccgtgat attgctgaag
agcttggcgg 5580cgaatgggct gaccgcttcc tcgtgcttta cggtatcgcc gctcccgatt
cgcagcgcat 5640cgccttctat cgccttcttg acgagttctt ctgagcggga ctctggggtt
cgaaatgacc 5700gaccaagcga cgcccaacct gccatcacga gatttcgatt ccaccgccgc
cttctatgaa 5760aggttgggct tcggaatcgt tttccgggac gccggctgga tgatcctcca
gcgcggggat 5820ctcatgctgg agttcttcgc ccaccctagg gggaggctaa ctgaaacacg
gaaggagaca 5880ataccggaag gaacccgcgc tatgacggca ataaaaagac agaataaaac
gcacggtgtt 5940gggtcgtttg ttcataaacg cggggttcgg tcccagggct ggcactctgt
cgatacccca 6000ccgagacccc attggggcca atacgcccgc gtttcttcct tttccccacc
ccacccccca 6060agttcgggtg aaggcccagg gctcgcagcc aacgtcgggg cggcaggccc
tgccatagcc 6120tcaggttact catatatact ttagattgat ttaaaacttc atttttaatt
taaaaggatc 6180taggtgaaga tcctttttga taatctcatg accaaaatcc cttaacgtga
gttttcgttc 6240cactgagcgt cagaccccgt agaaaagatc aaaggatctt cttgagatcc
tttttttctg 6300cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt
ttgtttgccg 6360gatcaagagc taccaactct ttttccgaag gtaactggct tcagcagagc
gcagatacca 6420aatactgtcc ttctagtgta gccgtagtta ggccaccact tcaagaactc
tgtagcaccg 6480cctacatacc tcgctctgct aatcctgtta ccagtggctg ctgccagtgg
cgataagtcg 6540tgtcttaccg ggttggactc aagacgatag ttaccggata aggcgcagcg
gtcgggctga 6600acggggggtt cctgcacaca gcccagcttg gagcgaacga cctacaccga
actgagatac 6660ctacagcgtg agctatgaga aagcgccacg cttcccgaag ggagaaaggc
ggacaggtat 6720ccggtaagcg gcagggtcgg aacaggagag cgcacgaggg agcttccagg
gggaaacgcc 6780tggtatcttt atagtcctgt cgggtttcgc cacctctgac ttgagcgtcg
atttttgtga 6840tgctcgtcag gggggcggag cctatggaaa aacgccagca acgcggcctt
tttacggttc 6900ctggcctttt gctggccttt tgctcacatg ttctttcctg cgttatcccc
tgattctgtg 6960gataaccgta ttaccgccat gcat
6984

SEQ ID NO: 46
Length: 4613
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic nucleotide construct
Feature: CDS (344)..(922)
Sequence 46:
agcgcccaat
acgcaaaccg cctctccccg cgcgttggcc gattcattaa tgcagctggc 60acgacaggtt
tcccgactgg aaagcgggca gtgagcgcaa cgcaattaat gtgagttagc 120tcactcatta
ggcaccccag gctttacact ttatgcttcc ggctcgtatg ttgtgtggaa 180ttgtgagcgg
ataacaattt cacacaggaa acagctatga ccatgattac gccaagcttg 240gtaccgagct
cggatccact agtaacggcc gccagtgtgc tggaattcgg cttggatccc 300atgcgtcaat
tttacgcaga ctatctttct agggttaatc tag ctg cat cag gat  355
                                     Leu His Gln Asp
                                     1
cat atc gtc ggg tct ttt ttc cgg ctc agt cat cgc cca agc tgg cgc  403
His Ile Val Gly Ser Phe Phe Arg Leu Ser His Arg Pro Ser Trp Arg
5                   10                  15                  20
tat ctg ggc atc ggg gag gaa gaa gcc cgt gcc ttt tcc cgc gag gtt  451
Tyr Leu Gly Ile Gly Glu Glu Glu Ala Arg Ala Phe Ser Arg Glu Val
                25                  30                  35
gaa gcg gca tgg aaa gag ttt gcc gag gat gac tgc tgc tgc att gac  499
Glu Ala Ala Trp Lys Glu Phe Ala Glu Asp Asp Cys Cys Cys Ile Asp
            40                  45                  50
gtt gag cga aaa cgc acg ttt acc atg atg att cgg gaa ggt gtg gcc  547
Val Glu Arg Lys Arg Thr Phe Thr Met Met Ile Arg Glu Gly Val Ala
        55                  60                  65
atg cac gcc ttt aac ggt gaa ctg ttc gtt cag gcc acc tgg gat acc  595
Met His Ala Phe Asn Gly Glu Leu Phe Val Gln Ala Thr Trp Asp Thr
    70                  75                  80
agt tcg tcg cgg ctt ttc cgg aca cag ttc cgg atg gtc agc ccg aag  643
Ser Ser Ser Arg Leu Phe Arg Thr Gln Phe Arg Met Val Ser Pro Lys
85                  90                  95                  100
cgc atc agc aac ccg aac aat acc ggc gac agc cgg aac tgc cgt gcc  691
Arg Ile Ser Asn Pro Asn Asn Thr Gly Asp Ser Arg Asn Cys Arg Ala
                105                 110                 115
ggt gtg cag att aat gac agc ggt gcg gcg ctg gga tat tac gtc agc  739
Gly Val Gln Ile Asn Asp Ser Gly Ala Ala Leu Gly Tyr Tyr Val Ser
            120                 125                 130
gag gac ggg tat cct ggc tgg atg ccg cag aaa tgg aca tgg ata ccc  787
Glu Asp Gly Tyr Pro Gly Trp Met Pro Gln Lys Trp Thr Trp Ile Pro
        135                 140                 145
cgt gag tta ccc ggc ggg cgc gcc tcg ttc att cac gtt ttt gaa ccc  835
Arg Glu Leu Pro Gly Gly Arg Ala Ser Phe Ile His Val Phe Glu Pro
    150                 155                 160
gtg gag gac ggg cag act cgc ggt gca aat gtg ttt tac agc gtg atg  883
Val Glu Asp Gly Gln Thr Arg Gly Ala Asn Val Phe Tyr Ser Val Met
165                 170                 175                 180
gag cag atg aag atg ctc gac acg ctg cag aac acg cag ctagattaac  932
Glu Gln Met Lys Met Leu Asp Thr Leu Gln Asn Thr Gln
                185                 190
cctagaaaga
taatcatatt gtgacgtacg ttaaagataa tcatgcgtaa aattgacgca 992tgggatccaa
gccgaattct gcagatatcc atcacactgg cggccgctcg agcatgcatc 1052tagagggccc
aattcgccct atagtgagtc gtattacaat tcactggccg tcgttttaca 1112acgtcgtgac
tgggaaaacc ctggcgttac ccaacttaat cgccttgcag cacatccccc 1172tttcgccagc
tggcgtaata gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg 1232cagcctgaat
ggcgaatggg acgcgccctg tagcggcgca ttaagcgcgg cgggtgtggt 1292ggttacgcgc
agcgtgaccg ctacacttgc cagcgcccta gcgcccgctc ctttcgcttt 1352cttcccttcc
tttctcgcca cgttcgccgg ctttccccgt caagctctaa atcgggggct 1412ccctttaggg
ttccgattta gagctttacg gcacctcgac cgcaaaaaac ttgatttggg 1472tgatggttca
cgtagtgggc catcgccctg atagacggtt tttcgccctt tgacgttgga 1532gtccacgttc
tttaatagtg gactcttgtt ccaaactgga acaacactca accctatcgc 1592ggtctattct
tttgatttat aagggatttt gccgatttcg gcctattggt taaaaaatga 1652gctgatttaa
caaattcagg gcgcaagggc tgctaaagga accggaacac gtagaaagcc 1712agtccgcaga
aacggtgctg accccggatg aatgtcagct actgggctat ctggacaagg 1772gaaaacgcaa
gcgcaaagag aaagcaggta gcttgcagtg ggcttacatg gcgatagcta 1832gactgggcgg
ttttatggac agcaagcgaa ccggaattgc cagctggggc gccctctggt 1892aaggttggga
agccctgcaa agtaaactgg atggctttct tgccgccaag gatctgatgg 1952cgcaggggat
caagatctga tcaagagaca ggatgaggat cgtttcgcat gattgaacaa 2012gatggattgc
acgcaggttc tccggccgct tgggtggaga ggctattcgg ctatgactgg 2072gcacaacaga
caatcggctg ctctgatgcc gccgtgttcc ggctgtcagc gcaggggcgc 2132ccggttcttt
ttgtcaagac cgacctgtcc ggtgccctga atgaactgca ggacgaggca 2192gcgcggctat
cgtggctggc cacgacgggc gttccttgcg cagctgtgct cgacgttgtc 2252actgaagcgg
gaagggactg gctgctattg ggcgaagtgc cggggcagga tctcctgtca 2312tctcgccttg
ctcctgccga gaaagtatcc atcatggctg atgcaatgcg gcggctgcat 2372acgcttgatc
cggctacctg cccattcgac caccaagcga aacatcgcat cgagcgagca 2432cgtactcgga
tggaagccgg tcttgtcgat caggatgatc tggacgaaga gcatcagggg 2492ctcgcgccag
ccgaactgtt cgccaggctc aaggcgcgca tgcccgacgg cgaggatctc 2552gtcgtgatcc
atggcgatgc ctgcttgccg aatatcatgg tggaaaatgg ccgcttttct 2612ggattcaacg
actgtggccg gctgggtgtg gcggaccgct atcaggacat agcgttggat 2672acccgtgata
ttgctgaaga gcttggcggc gaatgggctg accgcttcct cgtgctttac 2732ggtatcgccg
ctcccgattc gcagcgcatc gccttctatc gccttcttga cgagttcttc 2792tgaattgaaa
aaggaagagt atgagtattc aacatttccg tgtcgccctt attccctttt 2852ttgcggcatt
ttgccttcct gtttttgctc acccagaaac gctggtgaaa gtaaaagatg 2912ctgaagatca
gttgggtgca cgagtgggtt acatcgaact ggatctcaac agcggtaaga 2972tccttgagag
ttttcgcccc gaagaacgtt ttccaatgat gagcactttt aaagttctgc 3032tatgtcatac
actattatcc cgtattgacg ccgggcaaga gcaactcggt cgccgggcgc 3092ggtattctca
gaatgacttg gttgagtact caccagtcac agaaaagcat cttacggatg 3152gcatgacagt
aagagaatta tgcagtgctg ccataaccat gagtgataac actgcggcca 3212acttacttct
gacaacgatc ggaggaccga aggagctaac cgcttttttg cacaacatgg 3272gggatcatgt
aactcgcctt gatcgttggg aaccggagct gaatgaagcc ataccaaacg 3332acgagagtga
caccacgatg cctgtagcaa tgccaacaac gttgcgcaaa ctattaactg 3392gcgaactact
tactctagct tcccggcaac aattaataga ctggatggag gcggataaag 3452ttgcaggacc
acttctgcgc tcggcccttc cggctggctg gtttattgct gataaatctg 3512gagccggtga
gcgtgggtct cgcggtatca ttgcagcact ggggccagat ggtaagccct 3572cccgtatcgt
agttatctac acgacgggga gtcaggcaac tatggatgaa cgaaatagac 3632agatcgctga
gataggtgcc tcactgatta agcattggta actgtcagac caagtttact 3692catatatact
ttagattgat ttaaaacttc atttttaatt taaaaggatc taggtgaaga 3752tcctttttga
taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt 3812cagaccccgt
agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct 3872gctgcttgca
aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc 3932taccaactct
ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgtcc 3992ttctagtgta
gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc 4052tcgctctgct
aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg 4112ggttggactc
aagacgatag ttaccggata aggcgcagcg gtcgggctga acggggggtt 4172cgtgcacaca
gcccagcttg gagcgaacga cctacaccga actgagatac ctacagcgtg 4232agcattgaga
aagcgccacg cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg 4292gcagggtcgg
aacaggagag cgcacgaggg agcttccagg gggaaacgcc tggtatcttt 4352atagtcctgt
cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag 4412gggggcggag
cctatggaaa aacgccagca acgcggcctt tttacggttc ctggcctttt 4472gctggccttt
tgctcacatg ttctttcctg cgttatcccc tgattctgtg gataaccgta 4532ttaccgcctt
tgagtgagct gataccgctc gccgcagccg aacgaccgag cgcagcgagt 4592cagtgagcga
ggaagcggaa g
4613

SEQ ID NO: 47
Length: 193
Type: PRT
Organism: Artificial Sequence
Description: Synthetic Construct
Sequence 47:
Leu His Gln Asp His Ile Val Gly Ser Phe Phe Arg Leu Ser His Arg
1               5                   10                  15
Pro Ser Trp Arg Tyr Leu Gly Ile Gly Glu Glu Glu Ala Arg Ala Phe
            20                  25                  30
Ser Arg Glu Val Glu Ala Ala Trp Lys Glu Phe Ala Glu Asp Asp Cys
        35                  40                  45
Cys Cys Ile Asp Val Glu Arg Lys Arg Thr Phe Thr Met Met Ile Arg
    50                  55                  60
Glu Gly Val Ala Met His Ala Phe Asn Gly Glu Leu Phe Val Gln Ala
65                  70                  75                  80
Thr Trp Asp Thr Ser Ser Ser Arg Leu Phe Arg Thr Gln Phe Arg Met
                85                  90                  95
Val Ser Pro Lys Arg Ile Ser Asn Pro Asn Asn Thr Gly Asp Ser Arg
            100                 105                 110
Asn Cys Arg Ala Gly Val Gln Ile Asn Asp Ser Gly Ala Ala Leu Gly
        115                 120                 125
Tyr Tyr Val Ser Glu Asp Gly Tyr Pro Gly Trp Met Pro Gln Lys Trp
    130                 135                 140
Thr Trp Ile Pro Arg Glu Leu Pro Gly Gly Arg Ala Ser Phe Ile His
145                 150                 155                 160
Val Phe Glu Pro Val Glu Asp Gly Gln Thr Arg Gly Ala Asn Val Phe
                165                 170                 175
Tyr Ser Val Met Glu Gln Met Lys Met Leu Asp Thr Leu Gln Asn Thr
            180                 185                 190
Gln

SEQ ID NO: 48
Length: 8999
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic nucleotide construct
Sequence 48:
accgaagtat acacttaaat tcagtgcacg tttgcttgtt gagaggaaag
gttgtgtgcg 60gacgaatttt tttttgaaaa cattaaccct tacgtggaat aaaaaaaaat
gaaatattgc 120aaattttgct gcaaagctgt gactggagta aaattaattc acgtgccgaa
gtgtgctatt 180aagagaaaat tgtgggagca gagccttggg tgcagccttg gtgaaaactc
ccaaatttgt 240gatacccact ttaatgattc gcagtggaag gctgcacctg caaaaggtca
gacatttaaa 300aggaggcgac tcaacgcaga tgccgtacct agtaaagtga tagagcctga
accagaaaag 360ataaaagaag gctataccag tgggagtaca caaacagagt aagtttgaat
agtaaaaaaa 420atcatttatg taaacaataa cgtgactgtg cgttaggtcc tgttcattgt
ttaatgaaaa 480taagagcttg agggaaaaaa ttcgtacttt ggagtacgaa atgcgtcgtt
tagagcagca 540gccgaattca ctggccgtcg ttttacaacg tcgtgactgg gaaaaccctg
gcgttaccca 600acttaatcgc cttgcagcac atcccccttt cgccagctgg cgtaatagcg
aagaggcccg 660caccgatcgc ccttcccaac agttgcgcag cctgaatggc gaatggcgct
ttgcctggtt 720tccggcacca gaagcggtgc cggaaagctg gctggagtgc gatcttcctg
aggccgatac 780tgtcgtcgtc ccctcaaact ggcagatgca cggttacgat gcgcccatct
acaccaacgt 840aacctatccc attacggtca atccgccgtt tgttcccacg gagaatccga
cgggttgtta 900ctcgctcaca tttaatgttg atgaaagctg gctacaggaa ggccagacgc
gaattatttt 960tgatggcgtt aactcggcgt ttcatctgtg gtgcaacggg cgctgggtcg
gttacggcca 1020ggacagtcgt ttgccgtctg aatttgacct gagcgcattt ttacgcgccg
gagaaaaccg 1080cctcgcggtg atggtgctgc gttggagtga cggcagttat ctggaagatc
aggatatgtg 1140gcggatgagc ggcattttcc gtgacgtctc gttgctgcat aaaccgacta
cacaaatcag 1200cgatttccat gttgccactc gctttaatga tgatttcagc cgcgctgtac
tggaggctga 1260agttcagatg tgcggcgagt tgcgtgacta cctacgggta acagtttctt
tatggcaggg 1320tgaaacgcag gtcgccagcg gcaccgcgcc tttcggcggt gaaattatcg
atgagcgtgg 1380tggttatgcc gatcgcgtca cactacgtct gaacgtcgaa aacccgaaac
tgtggagcgc 1440cgaaatcccg aatctctatc gtgcggtggt tgaactgcac accgccgacg
gcacgctgat 1500tgaagcagaa gcctgcgatg tcggtttccg cgaggtgcgg attgaaaatg
gtctgctgct 1560gctgaacggc aagccgttgc tgattcgagg cgttaaccgt cacgagcatc
atcctctgca 1620tggtcaggtc atggatgagc agacgatggt gcaggatatc ctgctgatga
agcagaacaa 1680ctttaacgcc gtgcgctgtt cgcattatcc gaaccatccg ctgtggtaca
cgctgtgcga 1740ccgctacggc ctgtatgtgg tggatgaagc caatattgaa acccacggca
tggtgccaat 1800gaatcgtctg accgatgatc cgcgctggct accggcgatg agcgaacgcg
taacgcgaat 1860ggtgcagcgc gatcgtaatc acccgagtgt gatcatctgg tcgctgggga
atgaatcagg 1920ccacggcgct aatcacgacg cgctgtatcg ctggatcaaa tctgtcgatc
cttcccgccc 1980ggtgcagtat gaaggcggcg gagccgacac cacggccacc gatattattt
gcccgatgta 2040cgcgcgcgtg gatgaagacc agcccttccc ggctgtgccg aaatggtcca
tcaaaaaatg 2100gctttcgcta cctggagaga cgcgcccgct gatcctttgc gaatacgccc
acgcgatggg 2160taacagtctt ggcggtttcg ctaaatactg gcaggcgttt cgtcagtatc
cccgtttaca 2220gggcggcttc gtctgggact gggtggatca gtcgctgatt aaatatgatg
aaaacggcaa 2280cccgtggtcg gcttacggcg gtgattttgg cgatacgccg aacgatcgcc
agttctgtat 2340gaacggtctg gtctttgccg accgcacgcc gcatccagcg ctgacggaag
caaaacacca 2400gcagcagttt ttccagttcc gtttatccgg gcaaaccatc gaagtgacca
gcgaatacct 2460gttccgtcat agcgataacg agctcctgca ctggatggtg gcgctggatg
gtaagccgct 2520ggcaagcggt gaagtgcctc tggatgtcgc tccacaaggt aaacagttga
ttgaactgcc 2580tgaactaccg cagccggaga gcgccgggca actctggctc acagtacgcg
tagtgcaacc 2640gaacgcgacc gcatggtcag aagccgggca catcagcgcc tggcagcagt
ggcgtctggc 2700ggaaaacctc agtgtgacgc tccccgccgc gtcccacgcc atcccgcatc
tgaccaccag 2760cgaaatggat ttttgcatcg agctgggtaa taagcgttgg caatttaacc
gccagtcagg 2820ctttctttca cagatgtgga ttggcgataa aaaacaactg ctgacgccgc
tgcgcgatca 2880gttcacccgt gcaccgctgg ataacgacat tggcgtaagt gaagcgaccc
gcattgaccc 2940taacgcctgg gtcgaacgct ggaaggcggc gggccattac caggccgaag
cagcgttgtt 3000gcagtgcacg gcagatacac ttgctgatgc ggtgctgatt acgaccgctc
acgcgtggca 3060gcatcagggg aaaaccttat ttatcagccg gaaaacctac cggattgatg
gtagtggtca 3120aatggcgatt accgttgatg ttgaagtggc gagcgataca ccgcatccgg
cgcggattgg 3180cctgaactgc cagctggcgc aggtagcaga gcgggtaaac tggctcggat
tagggccgca 3240agaaaactat cccgaccgcc ttactgccgc ctgttttgac cgctgggatc
tgccattgtc 3300agacatgtat accccgtacg tcttcccgag cgaaaacggt ctgcgctgcg
ggacgcgcga 3360attgaattat ggcccacacc agtggcgcgg cgacttccag ttcaacatca
gccgctacag 3420tcaacagcaa ctgatggaaa ccagccatcg ccatctgctg cacgcggaag
aaggcacatg 3480gctgaatatc gacggtttcc atatggggat tggtggcgac gactcctgga
gcccgtcagt 3540atcggcggaa ttccagctga gcgccggtcg ctaccattac cagttggtct
ggtgtcgggg 3600atccgtcgac taaggccaaa gagtctaatt tttgttcatc aatgggttat
aacatatggg 3660ttatattata agtttgtttt aagtttttga gactgataag aatgtttcga
tcgaatattc 3720catagaacaa caatagtatt acctaattac caagtcttaa tttagcaaaa
atgttattgc 3780ttatagaaaa aataaattat ttatttgaaa tttaaagtca acttgtcatt
taatgtcttg 3840tagacttttg aaagtcttac gatacaatta gtatctaata tacatgggtt
cattctacat 3900tctatattag tgatgatttc tttagctagt aatacatttt aattatattc
ggctttgatg 3960attttctgat tttttccgaa cggattttcg tagacccttt cgatctcata
atggctcatt 4020ttattgcgat ggacggtcag gagagctcca cttttgaatt tctgttcgca
gacaccgcat 4080ttgtagcaca tagccgggac atccggtttg gggagatttt ccagtctctg
ttgcaattgg 4140ttttcgggaa tgcgttgcag gcgcatacgc tctatatcct ccgaacggcg
ctggttgacc 4200ctagcattta cataaggatc agcagcaaaa tttgcctctg cttcattgcc
cggaatcaca 4260gcaatcagat gtccctttcg gttacgatgg atattcaggt gcgaaccgca
cacaaagctc 4320tcgccgcaca ctccacactg atatggtcgc tcgccctgtg gcgccgcata
tggatcttaa 4380ggtcgttgga ctgcacaaag ctcttgctgc acattttgca ggagtacggc
ctttgacccg 4440tgtgcaatcg catgtgtcgc gccagcttgt tctgcgaaat aaacttcttg
gagcagatgc 4500ggccgcccgg ggtgggcgaa gaactccagc atgagatccc cgcgctggag
gatcatccag 4560ccggcgtccc ggaaaacgat tccgaagccc aacctttcat agaaggcggc
ggtggaatcg 4620aaatctcgtg atggcaggtt gggcgtcgct tggtcggtca tttcgaaccc
cagagtcccg 4680ctcagaagaa ctcgtcaaga aggcgataga aggcgatgcg ctgcgaatcg
ggagcggcga 4740taccgtaaag cacgaggaag cggtcagccc attcgccgcc aagctcttca
gcaatatcac 4800gggtagccaa cgctatgtcc tgatagcggt ccgccacacc cagccggcca
cagtcgatga 4860atccagaaaa gcggccattt tccaccatga tattcggcaa gcaggcatcg
ccatgggtca 4920cgacgagatc ctcgccgtcg ggcatgcgcg ccttgagcct ggcgaacagt
tcggctggcg 4980cgagcccctg atgctcttcg tccagatcat cctgatcgac aagaccggct
tccatccgag 5040tacgtgctcg ctcgatgcga tgtttcgctt ggtggtcgaa tgggcaggta
gccggatcaa 5100gcgtatgcag ccgccgcatt gcatcagcca tgatggatac tttctcggca
ggagcaaggt 5160gagatgacag gagatcctgc cccggcactt cgcccaatag cagccagtcc
cttcccgctt 5220cagtgacaac gtcgagcaca gctgcgcaag gaacgcccgt cgtggccagc
cacgatagcc 5280gcgctgcctc gtcctgcagt tcattcaggg caccggacag gtcggtcttg
acaaaaagaa 5340ccgggcgccc ctgcgctgac agccggaaca cggcggcatc agagcagccg
attgtctgtt 5400gtgcccagtc atagccgaat agcctctcca cccaagcggc cggagaacct
gcgtgcaatc 5460catcttgttc aatcatgcga aacgatcctc atcctgtctc ttgatcagat
cttgatcccc 5520tgcgccatca gatccttggc ggcaagaaag ccatccagtt tactttgcag
ggcttcccaa 5580ccttaccaga gggcgcccca gctggcaatt ccggttcgct tgctgtccat
aaaaccgccc 5640agtctagcta tcgccatgta agcccactgc aagctacctg ctttctcttt
gcgcttgcgt 5700tttcccttgt ccagatagcc cagtagctga cattcatccg gggtcagcac
cgtttctgcg 5760gactggcttt ctacgtgttc cgcttccttt agcagccctt gcgccctgag
tgcttgcggc 5820agcgtgaagc taattcatgg ttataaattt ttgttaaatc agctcatttt
ttaaccaata 5880ggccgaaatc ggcaaaatcc cttataaatc aaaagaatag cccgagatag
ggttgagtgt 5940tgttccagtt tggaacaaga gtccactatt aaagaacgtg gactccaacg
tcaaagggcg 6000aaaaaccgtc tatcagggcg atggccggat cagcttatgc ggtgtgaaat
accgcacaga 6060tgcgtaagga gaaaataccg catcaggcgc tcttccgctt cctcgctcac
tgactcgctg 6120cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt
aatacggtta 6180tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca
gcaaaaggcc 6240aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc
ccctgacgag 6300catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact
ataaagatac 6360caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct
gccgcttacc 6420ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag
ctcacgctgt 6480aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca
cgaacccccc 6540gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa
cccggtaaga 6600cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc
gaggtatgta 6660ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag
aaggacagta 6720tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg
tagctcttga 6780tccggcaaac aaaccaccgc tggtagcggc ggttttttgt ttgcaagcag
cagattacgc 6840gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc ttactgaacg
gtgatcccca 6900ccggaattgc ggccgcggaa ttctcatgtt tgacagctta tcatcgataa
gctggccgct 6960ctagaactag tgttcccaca atggttaatt cgagctcgcc cggggatcta
attcaattag 7020agactaattc aattagagct aattcaatta ggatccaagc ttatcgattt
cgaaccctcg 7080accgccggag tataaataga ggcgcttcgt ctacggagcg acaattcaat
tcaaacaagc 7140aaagtgaaca cgtcgctaag cgaaagctaa gcaaataaac aagcgcagct
gaacaagcta 7200aacaatcggg gtaccgctag agtcgacggt acgatccacc ggtcgccacc
atggtgagca 7260agggcgagga gctgttcacc ggggtggtgc ccatcctggt cgagctggac
ggcgacgtaa 7320acggccacaa gttcagcgtg tccggcgagg gcgagggcga tgccacctac
ggcaagctga 7380ccctgaagtt catctgcacc accggcaagc tgcccgtgcc ctggcccacc
ctcgtgacca 7440ccttcggcta cggcctgcag tgcttcgccc gctaccccga ccacatgaag
cagcacgact 7500tcttcaagtc cgccatgccc gaaggctacg tccaggagcg caccatcttc
ttcaaggacg 7560acggcaacta caagacccgc gccgaggtga agttcgaggg cgacaccctg
gtgaaccgca 7620tcgagctgaa gggcatcgac ttcaaggagg acggcaacat cctggggcac
aagctggagt 7680acaactacaa cagccacaac gtctatatca tggccgacaa gcagaagaac
ggcatcaagg 7740tgaacttcaa gatccgccac aacatcgagg acggcagcgt gcagctcgcc
gaccactacc 7800agcagaacac ccccatcggc gacggccccg tgctgctgcc cgacaaccac
tacctgagct 7860accagtccgc cctgagcaaa gaccccaacg agaagcgcga tcacatggtc
ctgctggagt 7920tcgtgaccgc cgccgggatc actctcggca tggacgagct gtacaagtaa
agcggccgcg 7980actctagatc ataatcagcc ataccacatt tgtagaggtt ttacttgctt
taaaaaacct 8040cccacacctc cccctgaacc tgaaacataa aatgaatgca attgttgttg
ttaacttgtt 8100tattgcagct tataatggtt acaaataaag caatagcatc acaaatttca
caaataaagc 8160atttttttca ctgcattcta gttgtggttt gtccaaactc atcaatgtat
cttaaagctt 8220atcgatacgc gtacggcact agtggatccc atgcgtcaat tttacgcatg
attatcttta 8280acgtacgtca caatatgatt atctttctag ggttaatcta gctgcgtgtt
ctgcagcgtg 8340tcgagcatct tcatctgctc catcacgctg taaaacacat ttgcaccgcg
agtctgcccg 8400tcctccacgg gttcaaaaac gtgaatgaac gaggcgcgcc cgccgggtaa
ctcacggggt 8460atccatgtcc atttctgcgg catccagcca ggatacccgt cctcgctgac
gtaatatccc 8520agcgccgcac cgctgtcatt aatctgcaca ccggcacggc agttccggct
gtcgccggta 8580ttgttcgggt tgctgatgcg cttcgggctg accatccgga actgtgtccg
gaaaagccgc 8640gacgaactgg tatcccaggt ggcctgaacg aacagttcac cgttaaaggc
gtgcatggcc 8700acaccttccc gaatcatcat ggtaaacgtg cgttttcgct caacgtcaat
gcagcagcag 8760tcatcctcgg caaactcttt ccatgccgct tcaacctcgc gggaaaaggc
acgggcttct 8820tcctccccga tgcccagata gcgccagctt gggcgatgac tgagccggaa
aaaagacccg 8880acgatatgat cctgatgcag ctagattaac cctagaaaga tagtctgcgt
aaaattgacg 8940catgggatcc cccgggctgc aggaattcga tatcaagctt atcgataccg
tcgaagctt 8999
<210> 49
<211> 9012
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic nucleotide construct
<400> 49
accgaagtat acacttaaat
tcagtgcacg tttgcttgtt gagaggaaag gttgtgtgcg 60gacgaatttt tttttgaaaa
cattaaccct tacgtggaat aaaaaaaaat gaaatattgc 120aaattttgct gcaaagctgt
gactggagta aaattaattc acgtgccgaa gtgtgctatt 180aagagaaaat tgtgggagca
gagccttggg tgcagccttg gtgaaaactc ccaaatttgt 240gatacccact ttaatgattc
gcagtggaag gctgcacctg caaaaggtca gacatttaaa 300aggaggcgac tcaacgcaga
tgccgtacct agtaaagtga tagagcctga accagaaaag 360ataaaagaag gctataccag
tgggagtaca caaacagagt aagtttgaat agtaaaaaaa 420atcatttatg taaacaataa
cgtgactgtg cgttaggtcc tgttcattgt ttaatgaaaa 480taagagcttg agggaaaaaa
ttcgtacttt ggagtacgaa atgcgtcgtt tagagcagca 540gccgaattca ctggccgtcg
ttttacaacg tcgtgactgg gaaaaccctg gcgttaccca 600acttaatcgc cttgcagcac
atcccccttt cgccagctgg cgtaatagcg aagaggcccg 660caccgatcgc ccttcccaac
agttgcgcag cctgaatggc gaatggcgct ttgcctggtt 720tccggcacca gaagcggtgc
cggaaagctg gctggagtgc gatcttcctg aggccgatac 780tgtcgtcgtc ccctcaaact
ggcagatgca cggttacgat gcgcccatct acaccaacgt 840aacctatccc attacggtca
atccgccgtt tgttcccacg gagaatccga cgggttgtta 900ctcgctcaca tttaatgttg
atgaaagctg gctacaggaa ggccagacgc gaattatttt 960tgatggcgtt aactcggcgt
ttcatctgtg gtgcaacggg cgctgggtcg gttacggcca 1020ggacagtcgt ttgccgtctg
aatttgacct gagcgcattt ttacgcgccg gagaaaaccg 1080cctcgcggtg atggtgctgc
gttggagtga cggcagttat ctggaagatc aggatatgtg 1140gcggatgagc ggcattttcc
gtgacgtctc gttgctgcat aaaccgacta cacaaatcag 1200cgatttccat gttgccactc
gctttaatga tgatttcagc cgcgctgtac tggaggctga 1260agttcagatg tgcggcgagt
tgcgtgacta cctacgggta acagtttctt tatggcaggg 1320tgaaacgcag gtcgccagcg
gcaccgcgcc tttcggcggt gaaattatcg atgagcgtgg 1380tggttatgcc gatcgcgtca
cactacgtct gaacgtcgaa aacccgaaac tgtggagcgc 1440cgaaatcccg aatctctatc
gtgcggtggt tgaactgcac accgccgacg gcacgctgat 1500tgaagcagaa gcctgcgatg
tcggtttccg cgaggtgcgg attgaaaatg gtctgctgct 1560gctgaacggc aagccgttgc
tgattcgagg cgttaaccgt cacgagcatc atcctctgca 1620tggtcaggtc atggatgagc
agacgatggt gcaggatatc ctgctgatga agcagaacaa 1680ctttaacgcc gtgcgctgtt
cgcattatcc gaaccatccg ctgtggtaca cgctgtgcga 1740ccgctacggc ctgtatgtgg
tggatgaagc caatattgaa acccacggca tggtgccaat 1800gaatcgtctg accgatgatc
cgcgctggct accggcgatg agcgaacgcg taacgcgaat 1860ggtgcagcgc gatcgtaatc
acccgagtgt gatcatctgg tcgctgggga atgaatcagg 1920ccacggcgct aatcacgacg
cgctgtatcg ctggatcaaa tctgtcgatc cttcccgccc 1980ggtgcagtat gaaggcggcg
gagccgacac cacggccacc gatattattt gcccgatgta 2040cgcgcgcgtg gatgaagacc
agcccttccc ggctgtgccg aaatggtcca tcaaaaaatg 2100gctttcgcta cctggagaga
cgcgcccgct gatcctttgc gaatacgccc acgcgatggg 2160taacagtctt ggcggtttcg
ctaaatactg gcaggcgttt cgtcagtatc cccgtttaca 2220gggcggcttc gtctgggact
gggtggatca gtcgctgatt aaatatgatg aaaacggcaa 2280cccgtggtcg gcttacggcg
gtgattttgg cgatacgccg aacgatcgcc agttctgtat 2340gaacggtctg gtctttgccg
accgcacgcc gcatccagcg ctgacggaag caaaacacca 2400gcagcagttt ttccagttcc
gtttatccgg gcaaaccatc gaagtgacca gcgaatacct 2460gttccgtcat agcgataacg
agctcctgca ctggatggtg gcgctggatg gtaagccgct 2520ggcaagcggt gaagtgcctc
tggatgtcgc tccacaaggt aaacagttga ttgaactgcc 2580tgaactaccg cagccggaga
gcgccgggca actctggctc acagtacgcg tagtgcaacc 2640gaacgcgacc gcatggtcag
aagccgggca catcagcgcc tggcagcagt ggcgtctggc 2700ggaaaacctc agtgtgacgc
tccccgccgc gtcccacgcc atcccgcatc tgaccaccag 2760cgaaatggat ttttgcatcg
agctgggtaa taagcgttgg caatttaacc gccagtcagg 2820ctttctttca cagatgtgga
ttggcgataa aaaacaactg ctgacgccgc tgcgcgatca 2880gttcacccgt gcaccgctgg
ataacgacat tggcgtaagt gaagcgaccc gcattgaccc 2940taacgcctgg gtcgaacgct
ggaaggcggc gggccattac caggccgaag cagcgttgtt 3000gcagtgcacg gcagatacac
ttgctgatgc ggtgctgatt acgaccgctc acgcgtggca 3060gcatcagggg aaaaccttat
ttatcagccg gaaaacctac cggattgatg gtagtggtca 3120aatggcgatt accgttgatg
ttgaagtggc gagcgataca ccgcatccgg cgcggattgg 3180cctgaactgc cagctggcgc
aggtagcaga gcgggtaaac tggctcggat tagggccgca 3240agaaaactat cccgaccgcc
ttactgccgc ctgttttgac cgctgggatc tgccattgtc 3300agacatgtat accccgtacg
tcttcccgag cgaaaacggt ctgcgctgcg ggacgcgcga 3360attgaattat ggcccacacc
agtggcgcgg cgacttccag ttcaacatca gccgctacag 3420tcaacagcaa ctgatggaaa
ccagccatcg ccatctgctg cacgcggaag aaggcacatg 3480gctgaatatc gacggtttcc
atatggggat tggtggcgac gactcctgga gcccgtcagt 3540atcggcggaa ttccagctga
gcgccggtcg ctaccattac cagttggtct ggtgtcgggg 3600atccgtcgac taaggccaaa
gagtctaatt tttgttcatc aatgggttat aacatatggg 3660ttatattata agtttgtttt
aagtttttga gactgataag aatgtttcga tcgaatattc 3720catagaacaa caatagtatt
acctaattac caagtcttaa tttagcaaaa atgttattgc 3780ttatagaaaa aataaattat
ttatttgaaa tttaaagtca acttgtcatt taatgtcttg 3840tagacttttg aaagtcttac
gatacaatta gtatctaata tacatgggtt cattctacat 3900tctatattag tgatgatttc
tttagctagt aatacatttt aattatattc ggctttgatg 3960attttctgat tttttccgaa
cggattttcg tagacccttt cgatctcata atggctcatt 4020ttattgcgat ggacggtcag
gagagctcca cttttgaatt tctgttcgca gacaccgcat 4080ttgtagcaca tagccgggac
atccggtttg gggagatttt ccagtctctg ttgcaattgg 4140ttttcgggaa tgcgttgcag
gcgcatacgc tctatatcct ccgaacggcg ctggttgacc 4200ctagcattta cataaggatc
agcagcaaaa tttgcctctg cttcattgcc cggaatcaca 4260gcaatcagat gtccctttcg
gttacgatgg atattcaggt gcgaaccgca cacaaagctc 4320tcgccgcaca ctccacactg
atatggtcgc tcgccctgtg gcgccgcata tggatcttaa 4380ggtcgttgga ctgcacaaag
ctcttgctgc acattttgca ggagtacggc ctttgacccg 4440tgtgcaatcg catgtgtcgc
gccagcttgt tctgcgaaat aaacttcttg gagcagatgc 4500ggccgcccgg ggtgggcgaa
gaactccagc atgagatccc cgcgctggag gatcatccag 4560ccggcgtccc ggaaaacgat
tccgaagccc aacctttcat agaaggcggc ggtggaatcg 4620aaatctcgtg atggcaggtt
gggcgtcgct tggtcggtca tttcgaaccc cagagtcccg 4680ctcagaagaa ctcgtcaaga
aggcgataga aggcgatgcg ctgcgaatcg ggagcggcga 4740taccgtaaag cacgaggaag
cggtcagccc attcgccgcc aagctcttca gcaatatcac 4800gggtagccaa cgctatgtcc
tgatagcggt ccgccacacc cagccggcca cagtcgatga 4860atcgagaaaa gcggccattt
tccaccatga tattcggcaa gcaggcatcg ccatgggtca 4920cgacgagatc ctcgccgtcg
ggcatgcgcg ccttgagcct ggcgaacagt tcggctggcg 4980cgagcccctg atgctcttcg
tccagatcat cctgatcgac aagaccggct tccatccgag 5040tacgtgctcg ctcgatgcga
tgtttcgctt ggtggtcgaa tgggcaggta gccggatcaa 5100gcgtatgcag ccgccgcatt
gcatcagcca tgatggatac tttctcggca ggagcaaggt 5160gagatgacag gagatcctgc
cccggcactt cgcccaatag cagccagtcc cttcccgctt 5220cagtgacaac gtcgagcaca
gctgcgcaag gaacgcccgt cgtggccagc cacgatagcc 5280gcgctgcctc gtcctgcagt
tcattcaggg caccggacag gtcggtcttg acaaaaagaa 5340ccgggcgccc ctgcgctgac
agccggaaca cggcggcatc agagcagccg attgtctgtt 5400gtgcccagtc atagccgaat
agcctctcca cccaagcggc cggagaacct gcgtgcaatc 5460catcttgttc aatcatgcga
aacgatcctc atcctgtctc ttgatcagat cttgatcccc 5520tgcgccatca gatccttggc
ggcaagaaag ccatccagtt tactttgcag ggcttcccaa 5580ccttaccaga gggcgcccca
gctggcaatt ccggttcgct tgctgtccat aaaaccgccc 5640agtctagcta tcgccatgta
agcccactgc aagctacctg ctttctcttt gcgcttgcgt 5700tttcccttgt ccagatagcc
cagtagctga cattcatccg gggtcagcac cgtttctgcg 5760gactggcttt ctacgtgttc
cgcttccttt agcagccctt gcgccctgag tgcttgcggc 5820agcgtgaagc taattcatgg
ttataaattt ttgttaaatc agctcatttt ttaaccaata 5880ggccgaaatc ggcaaaatcc
cttataaatc aaaagaatag cccgagatag ggttgagtgt 5940tgttccagtt tggaacaaga
gtccactatt aaagaacgtg gactccaacg tcaaagggcg 6000aaaaaccgtc tatcagggcg
atggccggat cagcttatgc ggtgtgaaat accgcacaga 6060tgcgtaagga gaaaataccg
catcaggcgc tcttccgctt cctcgctcac tgactcgctg 6120cgctcggtcg ttcggctgcg
gcgagcggta tcagctcact caaaggcggt aatacggtta 6180tccacagaat caggggataa
cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 6240aggaaccgta aaaaggccgc
gttgctggcg tttttccata ggctccgccc ccctgacgag 6300catcacaaaa atcgacgctc
aagtcagagg tggcgaaacc cgacaggact ataaagatac 6360caggcgtttc cccctggaag
ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 6420ggatacctgt ccgcctttct
cccttcggga agcgtggcgc tttctcatag ctcacgctgt 6480aggtatctca gttcggtgta
ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 6540gttcagcccg accgctgcgc
cttatccggt aactatcgtc ttgagtccaa cccggtaaga 6600cacgacttat cgccactggc
agcagccact ggtaacagga ttagcagagc gaggtatgta 6660ggcggtgcta cagagttctt
gaagtggtgg cctaactacg gctacactag aaggacagta 6720tttggtatct gcgctctgct
gaagccagtt accttcggaa aaagagttgg tagctcttga 6780tccggcaaac aaaccaccgc
tggtagcggc ggttttttgt ttgcaagcag cagattacgc 6840gcagaaaaaa aggatctcaa
gaagatcctt tgatcttttc ttactgaacg gtgatcccca 6900ccggaattgc ggccgcggaa
ttctcatgtt tgacagctta tcatcgataa gctggccgct 6960ctagaactag tgttcccaca
atggttaatt cgagctcgcc cggggatcta attcaattag 7020agactaattc aattagagct
aattcaatta ggatccaagc ttatcgattt cgaaccctcg 7080accgccggag tataaataga
ggcgcttcgt ctacggagcg acaattcaat tcaaacaagc 7140aaagtgaaca cgtcgctaag
cgaaagctaa gcaaataaac aagcgcagct gaacaagcta 7200aacaatcggg gtaccgctag
agtcgacggt acgatccacc ggtcgccacc atggtgagca 7260agggcgagga gctgttcacc
ggggtggtgc ccatcctggt cgagctggac ggcgacgtaa 7320acggccacaa gttcagcgtg
tccggcgagg gcgagggcga tgccacctac ggcaagctga 7380ccctgaagtt catctgcacc
accggcaagc tgcccgtgcc ctggcccacc ctcgtgacca 7440ccctgacctg gggcgtgcag
tgcttcagcc gctaccccga ccacatgaag cagcacgact 7500tcttcaagtc cgccatgccc
gaaggctacg tccaggagcg caccatcttc ttcaaggacg 7560acggcaacta caagacccgc
gccgaggtga agttcgaggg cgacaccctg gtgaaccgca 7620tcgagctgaa gggcatcgac
ttcaaggagg acggcaacat cctggggcac aagctggagt 7680acaactagat cagccacaac
gtctatatca ccgccgacaa gcagaagaac ggcatcaagg 7740ccaacttcaa gatccgccac
aacatcgagg acggcagcgt gcagctcgcc gaccactacc 7800agcagaacac ccccatcggc
gacggccccg tgctgctgcc cgacaaccac tacctgagca 7860cccagtccgc cctgagcaaa
gaccccaacg agaagcgcga tcacatggtc ctgctggagt 7920tcgtgaccgc cgccgggatc
actctcggca tggacgagct gtacaagtaa agcggccgcg 7980actctagatc ataatcagcc
ataccacatt tgtagaggtt ttacttgctt taaaaaacct 8040cccacacctc cccctgaacc
tgaaacataa aatgaatgca attgttgttg ttaacttgtt 8100tattgcagct tataatggtt
acaaataaag caatagcatc acaaatttca caaataaagc 8160atttttttca ctgcattcta
gttgtggttt gtccaaactc atcaatgtat cttaaagctt 8220atcgatacgc gtacggcgcg
cctaggccgg ccgattggat cccatgcgtc aattttacgc 8280atgattatct ttaacgtacg
tcacaatatg attatctttc tagggttaat ctagctgcgt 8340gttctgcagc gtgtcgagca
tcttcatctg ctccatcacg ctgtaaaaca catttgcacc 8400gcgagtctgc ccgtcctcca
cgggttcaaa aacgtgaatg aacgaggcgc gcccgccggg 8460taactcacgg ggtatccatg
tccatttctg cggcatccag ccaggatacc cgtcctcgct 8520gacgtaatat cccagcgccg
caccgctgtc attaatctgc acaccggcac ggcagttccg 8580gctgtcgccg gtattgttcg
ggttgctgat gcgcttcggg ctgaccatcc ggaactgtgt 8640ccggaaaagc cgcgacgaac
tggtatccca ggtggcctga acgaacagtt caccgttaaa 8700ggcgtgcatg gccacacctt
cccgaatcat catggtaaac gtgcgttttc gctcaacgtc 8760aatgcagcag cagtcatcct
cggcaaactc tttccatgcc gcttcaacct cgcgggaaaa 8820ggcacgggct tcttcctccc
cgatgcccag atagcgccag cttgggcgat gactgagccg 8880gaaaaaagac ccgacgatat
gatcctgatg cagctagatt aaccctagaa agatagtctg 8940cgtaaaattg acgcatggga
tcccccgggc tgcaggaatt cgatatcaag cttatcgata 9000ccgtcgaagc tt
9012
<210> 50
<211> 9013
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic nucleotide construct
<400> 50
accgaagtat acacttaaat tcagtgcacg tttgcttgtt gagaggaaag
gttgtgtgcg 60gacgaatttt tttttgaaaa cattaaccct tacgtggaat aaaaaaaaat
gaaatattgc 120aaattttgct gcaaagctgt gactggagta aaattaattc acgtgccgaa
gtgtgctatt 180aagagaaaat tgtgggagca gagccttggg tgcagccttg gtgaaaactc
ccaaatttgt 240gatacccact ttaatgattc gcagtggaag gctgcacctg caaaaggtca
gacatttaaa 300aggaggcgac tcaacgcaga tgccgtacct agtaaagtga tagagcctga
accagaaaag 360ataaaagaag gctataccag tgggagtaca caaacagagt aagtttgaat
agtaaaaaaa 420atcatttatg taaacaataa cgtgactgtg cgttaggtcc tgttcattgt
ttaatgaaaa 480taagagcttg agggaaaaaa ttcgtacttt ggagtacgaa atgcgtcgtt
tagagcagca 540gccgaattca ctggccgtcg ttttacaacg tcgtgactgg gaaaaccctg
gcgttaccca 600acttaatcgc cttgcagcac atcccccttt cgccagctgg cgtaatagcg
aagaggcccg 660caccgatcgc ccttcccaac agttgcgcag cctgaatggc gaatggcgct
ttgcctggtt 720tccggcacca gaagcggtgc cggaaagctg gctggagtgc gatcttcctg
aggccgatac 780tgtcgtcgtc ccctcaaact ggcagatgca cggttacgat gcgcccatct
acaccaacgt 840aacctatccc attacggtca atccgccgtt tgttcccacg gagaatccga
cgggttgtta 900ctcgctcaca tttaatgttg atgaaagctg gctacaggaa ggccagacgc
gaattatttt 960tgatggcgtt aactcggcgt ttcatctgtg gtgcaacggg cgctgggtcg
gttacggcca 1020ggacagtcgt ttgccgtctg aatttgacct gagcgcattt ttacgcgccg
gagaaaaccg 1080cctcgcggtg atggtgctgc gttggagtga cggcagttat ctggaagatc
aggatatgtg 1140gcggatgagc ggcattttcc gtgacgtctc gttgctgcat aaaccgacta
cacaaatcag 1200cgatttccat gttgccactc gctttaatga tgatttcagc cgcgctgtac
tggaggctga 1260agttcagatg tgcggcgagt tgcgtgacta cctacgggta acagtttctt
tatggcaggg 1320tgaaacgcag gtcgccagcg gcaccgcgcc tttcggcggt gaaattatcg
atgagcgtgg 1380tggttatgcc gatcgcgtca cactacgtct gaacgtcgaa aacccgaaac
tgtggagcgc 1440cgaaatcccg aatctctatc gtgcggtggt tgaactgcac accgccgacg
gcacgctgat 1500tgaagcagaa gcctgcgatg tcggtttccg cgaggtgcgg attgaaaatg
gtctgctgct 1560gctgaacggc aagccgttgc tgattcgagg cgttaaccgt cacgagcatc
atcctctgca 1620tggtcaggtc atggatgagc agacgatggt gcaggatatc ctgctgatga
agcagaacaa 1680ctttaacgcc gtgcgctgtt cgcattatcc gaaccatccg ctgtggtaca
cgctgtgcga 1740ccgctacggc ctgtatgtgg tggatgaagc caatattgaa acccacggca
tggtgccaat 1800gaatcgtctg accgatgatc cgcgctggct accggcgatg agcgaacgcg
taacgcgaat 1860ggtgcagcgc gatcgtaatc acccgagtgt gatcatctgg tcgctgggga
atgaatcagg 1920ccacggcgct aatcacgacg cgctgtatcg ctggatcaaa tctgtcgatc
cttcccgccc 1980ggtgcagtat gaaggcggcg gagccgacac cacggccacc gatattattt
gcccgatgta 2040cgcgcgcgtg gatgaagacc agcccttccc ggctgtgccg aaatggtcca
tcaaaaaatg 2100gctttcgcta cctggagaga cgcgcccgct gatcctttgc gaatacgccc
acgcgatggg 2160taacagtctt ggcggtttcg ctaaatactg gcaggcgttt cgtcagtatc
cccgtttaca 2220gggcggcttc gtctgggact gggtggatca gtcgctgatt aaatatgatg
aaaacggcaa 2280cccgtggtcg gcttacggcg gtgattttgg cgatacgccg aacgatcgcc
agttctgtat 2340gaacggtctg gtctttgccg accgcacgcc gcatccagcg ctgacggaag
caaaacacca 2400gcagcagttt ttccagttcc gtttatccgg gcaaaccatc gaagtgacca
gcgaatacct 2460gttccgtcat agcgataacg agctcctgca ctggatggtg gcgctggatg
gtaagccgct 2520ggcaagcggt gaagtgcctc tggatgtcgc tccacaaggt aaacagttga
ttgaactgcc 2580tgaactaccg cagccggaga gcgccgggca actctggctc acagtacgcg
tagtgcaacc 2640gaacgcgacc gcatggtcag aagccgggca catcagcgcc tggcagcagt
ggcgtctggc 2700ggaaaacctc agtgtgacgc tccccgccgc gtcccacgcc atcccgcatc
tgaccaccag 2760cgaaatggat ttttgcatcg agctgggtaa taagcgttgg caatttaacc
gccagtcagg 2820ctttctttca cagatgtgga ttggcgataa aaaacaactg ctgacgccgc
tgcgcgatca 2880gttcacccgt gcaccgctgg ataacgacat tggcgtaagt gaagcgaccc
gcattgaccc 2940taacgcctgg gtcgaacgct ggaaggcggc gggccattac caggccgaag
cagcgttgtt 3000gcagtgcacg gcagatacac ttgctgatgc ggtgctgatt acgaccgctc
acgcgtggca 3060gcatcagggg aaaaccttat ttatcagccg gaaaacctac cggattgatg
gtagtggtca 3120aatggcgatt accgttgatg ttgaagtggc gagcgataca ccgcatccgg
cgcggattgg 3180cctgaactgc cagctggcgc aggtagcaga gcgggtaaac tggctcggat
tagggccgca 3240agaaaactat cccgaccgcc ttactgccgc ctgttttgac cgctgggatc
tgccattgtc 3300agacatgtat accccgtacg tcttcccgag cgaaaacggt ctgcgctgcg
ggacgcgcga 3360attgaattat ggcccacacc agtggcgcgg cgacttccag ttcaacatca
gccgctacag 3420tcaacagcaa ctgatggaaa ccagccatcg ccatctgctg cacgcggaag
aaggcacatg 3480gctgaatatc gacggtttcc atatggggat tggtggcgac gactcctgga
gcccgtcagt 3540atcggcggaa ttccagctga gcgccggtcg ctaccattac cagttggtct
ggtgtcgggg 3600atccgtcgac taaggccaaa gagtctaatt tttgttcatc aatgggttat
aacatatggg 3660ttatattata agtttgtttt aagtttttga gactgataag aatgtttcga
tcgaatattc 3720catagaacaa caatagtatt acctaattac caagtcttaa tttagcaaaa
atgtaattgc 3780ttatagaaaa aataaattat ttatttgaaa tttaaagtca acttgtcatt
taatgtcttg 3840tagacttttg aaagtcttac gatacaatta gtatctaata tacatgggtt
cattctacat 3900tctatattag tgatgatttc tttagctagt aatacatttt aattatattc
ggctttgatg 3960attttctgat tttttccgaa cggattttcg tagacccttt cgatctcata
atggctcatt 4020ttattgcgat ggacggtcag gagagctcca cttttgaatt tctgttcgca
gacaccgcat 4080ttgtagcaca tagccgggac atccggtttg gggagatttt ccagtctctg
ttgcaattgg 4140ttttcgggaa tgcgttgcag gcgcatacgc tctatatcct ccgaacggcg
ctggttgacc 4200ctagcattta cataaggatc agcagcaaaa tttgcctctg cttcattgcc
cggaatcaca 4260gcaatcagat gtccctttcg gttacgatgg atattcaggt gcgaaccgca
cacaaagctc 4320tcgccgcaca ctccacactg atatggtcgc tcgccctgtg gcgccgcata
tggatcttaa 4380ggtcgttgga ctgcacaaag ctcttgctgc acattttgca ggagtacggc
ctttgacccg 4440tgtgcaatcg catgtgtcgc gccagcttgt tctgcgaaat aaacttcttg
gagcagatgc 4500ggccgcccgg ggtgggcgaa gaactccagc atgagatccc cgcgctggag
gatcatccag 4560ccggcgtccc ggaaaacgat tccgaagccc aacctttcat agaaggcggc
ggtggaatcg 4620aaatctcgtg atggcaggtt gggcgtcgct tggtcggtca tttcgaaccc
cagagtcccg 4680ctcagaagaa ctcgtcaaga aggcgataga aggcgatgcg ctgcgaatcg
ggagcggcga 4740taccgtaaag cacgaggaag cggtcagccc attcgccgcc aagctcttca
gcaatatcac 4800gggtagccaa cgctatgtcc tgatagcggt ccgccacacc cagccggcca
cagtcgatga 4860atccagaaaa gcggccattt tccaccatga tattcggcaa gcaggcatcg
ccatgggtca 4920cgacgagatc ctcgccgtcg ggcatgcgcg ccttgagcct ggcgaacagt
tcggctggcg 4980cgagcccctg atgctcttcg tccagatcat cctgatcgac aagaccggct
tccatccgag 5040tacgtgctcg ctcgatgcga tgtttcgctt ggtggtcgaa tgggcaggta
gccggatcaa 5100gcgtatgcag ccgccgcatt gcatcagcca tgatggatac tttctcggca
ggagcaaggt 5160gagatgacag gagatcctgc cccggcactt cgcccaatag cagccagtcc
cttcccgctt 5220cagtgacaac gtcgagcaca gctgcgcaag gaacgcccgt cgtggccagc
cacgatagcc 5280gcgctgcctc gtcctgcagt tcattcaggg caccggacag gtcggtcttg
acaaaaagaa 5340ccgggcgccc ctgcgctgac agccggaaca cggcggcatc agagcagccg
attgtctgtt 5400gtgcccagtc atagccgaat agcctctcca cccaagcggc cggagaacct
gcgtgcaatc 5460catcttgttc aatcatgcga aacgatcctc atcctgtctc ttgatcagat
cttgatcccc 5520tgcgccatca gatccttggc ggcaagaaag ccatccagtt tactttgcag
ggcttcccaa 5580ccttaccaga gggcgcccca gctggcaatt ccggttcgct tgctgtccat
aaaaccgccc 5640agtctagcta tcgccatgta agcccactgc aagctacctg ctttctcttt
gcgcttgcgt 5700tttcccttgt ccagatagcc cagtagctga cattcatccg gggtcagcac
cgtttctgcg 5760gactggcttt ctacgtgttc cgcttccttt agcagccctt gcgccctgag
tgcttgcggc 5820agcgtgaagc taattcatgg ttataaattt ttgttaaatc agctcatttt
ttaaccaata 5880ggccgaaatc ggcaaaatcc cttataaatc aaaagaatag cccgagatag
ggttgagtgt 5940tgttccagtt tggaacaaga gtccactatt aaagaacgtg gactccaacg
tcaaagggcg 6000aaaaaccgtc tatcagggcg atggccggat cagcttatgc ggtgtgaaat
accgcacaga 6060tgcgtaagga gaaaataccg catcaggcgc tcttccgctt cctcgctcac
tgactcgctg 6120cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt
aatacggtta 6180tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca
gcaaaaggcc 6240aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc
ccctgacgag 6300catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact
ataaagatac 6360caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct
gccgcttacc 6420ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag
ctcacgctgt 6480aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca
cgaacccccc 6540gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa
cccggtaaga 6600cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc
gaggtatgta 6660ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag
aaggacagta 6720tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg
tagctcttga 6780tccggcaaac aaaccaccgc tggtagcggc ggttttttgt ttgcaagcag
cagattacgc 6840gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc ttactgaacg
gtgatcccca 6900ccggaattgc ggccgcggaa ttctcatgtt tgacagctta tcatcgataa
gctggccgct 6960ctagaactag tgttcccaca atggttaatt cgagctcgcc cggggatcta
attcaattag 7020agactaattc aattagagct aattcaatta ggatccaagc ttatcgattt
cgaaccctcg 7080accgccggag tataaataga ggcgcttcgt ctacggagcg acaattcaat
tcaaacaagc 7140aaagtgaaca cgtcgctaag cgaaagctaa gcaaataaac aagcgcagct
gaacaagcta 7200aacaatcggg gtaccgctag agtcgacggt accgcgggcc cgggatccac
cggtcgccac 7260catggtgagc aagggcgagg agctgttcac cggggtggtg cccatcctgg
tcgagctgga 7320cggcgacgta aacggccaca agttcagcgt gtccggcgag ggcgagggcg
atgccaccta 7380cggcaagctg accctgaagt tcatctgcac caccggcaag ctgcccgtgc
cctggcccac 7440cctcgtgacc accctgacct acggcgtgca gtgcttcagc cgctaccccg
accacatgaa 7500gcagcacgac ttcttcaagt ccgccatgcc cgaaggctac gtccaggagc
gcaccatctt 7560cttcaaggac gacggcaact acaagacccg cgccgaggtg aagttcgagg
gcgacaccct 7620ggtgaaccgc atcgagctga agggcatcga cttcaaggag gacggcaaca
tcctggggca 7680caagctggag tacaactaca acagccacaa cgtctatatc atggccgaca
agcagaagaa 7740cggcatcaag gtgaacttca agatccgcca caacatcgag gacggcagcg
tgcagctcgc 7800cgaccactac cagcagaaca cccccatcgg cgacggcccc gtgctgctgc
ccgacaacca 7860ctacctgagc acccagtccg ccctgagcaa agaccccaac gagaagcgcg
atcacatggt 7920cctgctggag ttcgtgaccg ccgccgggat cactctcggc atggacgagc
tgtacaagta 7980aagcggccgc gactctagat cataatcagc cataccacat ttgtagaggt
tttacttgct 8040ttaaaaaacc tcccacacct ccccctgaac ctgaaacata aaatgaatgc
aattgttgtt 8100gttaacttgt ttattgcagc ttataatggt tacaaataaa gcaatagcat
cacaaatttc 8160acaaataaag catttttttc actgcattct agttgtggtt tgtccaaact
catcaatgta 8220tcttaaagct tatcgatacg cgtacggcgc gcctagtgga tcccatgcgt
caattttacg 8280catgattatc tttaacgtac gtcacaatat gattatcttt ctagggttaa
tctagctgcg 8340tgttctgcag cgtgtcgagc atcttcatct gctccatcac gctgtaaaac
acatttgcac 8400cgcgagtctg cccgtcctcc acgggttcaa aaacgtgaat gaacgaggcg
cgcccgccgg 8460gtaactcacg gggtatccat gtccatttct gcggcatcca gccaggatac
ccgtcctcgc 8520tgacgtaata tcccagcgcc gcaccgctgt cattaatctg cacaccggca
cggcagttcc 8580ggctgtcgcc ggtattgttc gggttgctga tgcgcttcgg gctgaccatc
cggaactgtg 8640tccggaaaag ccgcgacgaa ctggtatccc aggtggcctg aacgaacagt
tcaccgttaa 8700aggcgtgcat ggccacacct tcccgaatca tcatggtaaa cgtgcgtttt
cgctcaacgt 8760caatgcagca gcagtcatcc tcggcaaact ctttccatgc cgcttcaacc
tcgcgggaaa 8820aggcacgggc ttcttcctcc ccgatgccca gatagcgcca gcttgggcga
tgactgagcc 8880ggaaaaaaga cccgacgata tgatcctgat gcagctagat taaccctaga
aagatagtct 8940gcgtaaaatt gacgcatggg atcccccggg ctgcaggaat tcgatatcaa
gcttatcgat 9000accgtcgaag ctt
9013
<210> 51
<211> 4951
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic nucleotide construct
<400> 51
ctaaattgta agcgttaata
ttttgttaaa attcgcgtta aatttttgtt aaatcagctc 60attttttaac caataggccg
aaatcggcaa aatcccttat aaatcaaaag aatagaccga 120gatagggttg agtgttgttc
cagtttggaa caagagtcca ctattaaaga acgtggactc 180caacgtcaaa gggcgaaaaa
ccgtctatca gggcgatggc ccactacgtg aaccatcacc 240ctaatcaagt tttttggggt
cgaggtgccg taaagcacta aatcggaacc ctaaagggag 300cccccgattt agagcttgac
ggggaaagcc ggcgaacgtg gcgagaaagg aagggaagaa 360agcgaaagga gcgggcgcta
gggcgctggc aagtgtagcg gtcacgctgc gcgtaaccac 420cacacccgcc gcgcttaatg
cgccgctaca gggcgcgtcc cattcgccat tcaggctgcg 480caactgttgg gaagggcgat
cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540gggatgtgct gcaaggcgat
taagttgggt aacgccaggg ttttcccagt cacgacgttg 600taaaacgacg gccagtgagc
gcgcccgccg ggtaactcac ggggtatcca tgtccatttc 660tgcggcatcc agccaggata
cccgtcctcg ctgacgtaat atcccagcgc cgcaccgctg 720tcattaatct gcacaccggc
acggcagttc cggctgtcgc cggtattgtt cgggttgctg 780atgcgcttcg ggctgaccat
ccggaactgt gtccggaaaa gccgcgacga actggtatcc 840caggtggcct gaacgaacag
ttcaccgtta aaggcgtgca tggccacacc ttcccgaatc 900atcatggtaa acgtgcgttt
tcgctcaacg tcaatgcagc agcagtcatc ctcggcaaac 960tctttccatg ccgcttcaac
ctcgcgggaa aaggcacggg cttcttcctc cccgatgccc 1020agatagcgcc agcttgggcg
atgactgagc cggaaaaaag acccgacgat atgatcctga 1080tgcagctaga ttaaccctag
aaagatagtc tgcgtaaaat tgacgcatga tctaattaac 1140cctcactaaa gggaacaaaa
gctggagctc caccgcggtg gcggccgctc tagaactagt 1200gttcccacaa tggttaattc
gagctcgccc ggggatctaa ttcaattaga gactaattca 1260attagagcta attcaattag
gatccaagct tatcgatttc gaaccctcga ccgccggagt 1320ataaatagag gcgcttcgtc
tacggagcga caattcaatt caaacaagca aagtgaacac 1380gtcgctaagc gaaagctaag
caaataaaca agcgcagctg aacaagctaa acaatcgggg 1440taccgctaga gtcgacggta
cgatccaccg gtcgccacca tggtgagcaa gggcgaggag 1500ctgttcaccg gggtggtgcc
catcctggtc gagctggacg gcgacgtaaa cggccacaag 1560ttcagcgtgt ccggcgaggg
cgagggcgat gccacctacg gcaagctgac cctgaagttc 1620atctgcacca ccggcaagct
gcccgtgccc tggcccaccc tcgtgaccac cttcggctac 1680ggcctgcagt gcttcgcccg
ctaccccgac cacatgaagc agcacgactt cttcaagtcc 1740gccatgcccg aaggctacgt
ccaggagcgc accatcttct tcaaggacga cggcaactac 1800aagacccgcg ccgaggtgaa
gttcgagggc gacaccctgg tgaaccgcat cgagctgaag 1860ggcatcgact tcaaggagga
cggcaacatc ctggggcaca agctggagta caactacaac 1920agccacaacg tctatatcat
ggccgacaag cagaagaacg gcatcaaggt gaacttcaag 1980atccgccaca acatcgagga
cggcagcgtg cagctcgccg accactacca gcagaacacc 2040cccatcggcg acggccccgt
gctgctgccc gacaaccact acctgagcta ccagtccgcc 2100ctgagcaaag accccaacga
gaagcgcgat cacatggtcc tgctggagtt cgtgaccgcc 2160gccgggatca ctctcggcat
ggacgagctg tacaagtaaa gcggccgcga ctctagatca 2220taatcagcca taccacattt
gtagaggttt tacttgcttt aaaaaacctc ccacacctcc 2280ccctgaacct gaaacataaa
atgaatgcaa ttgttgttgt taacttgttt attgcagctt 2340ataatggtta caaataaagc
aatagcatca caaatttcac aaataaagca tttttttcac 2400tgcattctag ttgtggtttg
tccaaactca tcaatgtatc ttaaagctta tcgatacgcg 2460tacggcgcgc ctaggcacta
gtggatcccc cgggctgcag gaattcgata tcaagcttat 2520cgataccgtc gacctcgagg
gggggcccgg tacccaattc gccctatagt gagtcgtatt 2580aagatcacgc gtagatccat
gcgtcaattt tacgcatgat tatctttaac gtacgtcaca 2640atatgattat ctttctaggg
ttaatctagc tgcgtgttct gcagcgtgtc gagcatcttc 2700atctgctcca tcacgctgta
aaacacattt gcaccgcgag tctgcccgtc ctccacgggt 2760tcaaaaacgt gaatgaacga
ggcgcgcttg gcgtaatcat ggtcatagct gtttcctgtg 2820tgaaattgtt atccgctcac
aattccacac aacatacgag ccggaagcat aaagtgtaaa 2880gcctggggtg cctaatgagt
gagctaactc acattaattg cgttgcgctc actgcccgct 2940ttccagtcgg gaaacctgtc
gtgccagctg cattaatgaa tcggccaacg cgcggggaga 3000ggcggtttgc gtattgggcg
ctcttccgct tcctcgctca ctgactcgct gcgctcggtc 3060gttcggctgc ggcgagcggt
atcagctcac tcaaaggcgg taatacggtt atccacagaa 3120tcaggggata acgcaggaaa
gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt 3180aaaaaggccg cgttgctggc
gtttttccat aggctccgcc cccctgacga gcatcacaaa 3240aatcgacgct caagtcagag
gtggcgaaac ccgacaggac tataaagata ccaggcgttt 3300ccccctggaa gctccctcgt
gcgctctcct gttccgaccc tgccgcttac cggatacctg 3360tccgcctttc tcccttcggg
aagcgtggcg ctttctcata gctcacgctg taggtatctc 3420agttcggtgt aggtcgttcg
ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc 3480gaccgctgcg ccttatccgg
taactatcgt cttgagtcca acccggtaag acacgactta 3540tcgccactgg cagcagccac
tggtaacagg attagcagag cgaggtatgt aggcggtgct 3600acagagttct tgaagtggtg
gcctaactac ggctacacta gaaggacagt atttggtatc 3660tgcgctctgc tgaagccagt
taccttcgga aaaagagttg gtagctcttg atccggcaaa 3720caaaccaccg ctggtagcgg
tggttttttt gtttgcaagc agcagattac gcgcagaaaa 3780aaaggatctc aagaagatcc
tttgatcttt tctacggggt ctgacgctca gtggaacgaa 3840aactcacgtt aagggatttt
ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt 3900ttaaattaaa aatgaagttt
taaatcaatc taaagtatat atgagtaaac ttggtctgac 3960agttaccaat gcttaatcag
tgaggcacct atctcagcga tctgtctatt tcgttcatcc 4020atagttgcct gactccccgt
cgtgtagata actacgatac gggagggctt accatctggc 4080cccagtgctg caatgatacc
gcgagaccca cgctcaccgg ctccagattt atcagcaata 4140aaccagccag ccggaagggc
cgagcgcaga agtggtcctg caactttatc cgcctccatc 4200cagtctatta attgttgccg
ggaagctaga gtaagtagtt cgccagttaa tagtttgcgc 4260aacgttgttg ccattgctac
aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca 4320ttcagctccg gttcccaacg
atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa 4380gcggttagct ccttcggtcc
tccgatcgtt gtcagaagta agttggccgc agtgttatca 4440ctcatggtta tggcagcact
gcataattct cttactgtca tgccatccgt aagatgcttt 4500tctgtgactg gtgagtactc
aaccaagtca ttctgagaat agtgtatgcg gcgaccgagt 4560tgctcttgcc cggcgtcaat
acgggataat accgcgccac atagcagaac tttaaaagtg 4620ctcatcattg gaaaacgttc
ttcggggcga aaactctcaa ggatcttacc gctgttgaga 4680tccagttcga tgtaacccac
tcgtgcaccc aactgatctt cagcatcttt tactttcacc 4740agcgtttctg ggtgagcaaa
aacaggaagg caaaatgccg caaaaaaggg aataagggcg 4800acacggaaat gttgaatact
catactcttc ctttttcaat attattgaag catttatcag 4860ggttattgtc tcatgagcgg
atacatattt gaatgtattt agaaaaataa acaaataggg 4920gttccgcgca catttccccg
aaaagtgcca c 4951
SEQ ID NO: 52; Length: 4952; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic nucleotide construct
Sequence 52:
ctaaattgta agcgttaata ttttgttaaa attcgcgtta aatttttgtt
aaatcagctc 60attttttaac caataggccg aaatcggcaa aatcccttat aaatcaaaag
aatagaccga 120gatagggttg agtgttgttc cagtttggaa caagagtcca ctattaaaga
acgtggactc 180caacgtcaaa gggcgaaaaa ccgtctatca gggcgatggc ccactacgtg
aaccatcacc 240ctaatcaagt tttttggggt cgaggtgccg taaagcacta aatcggaacc
ctaaagggag 300cccccgattt agagcttgac ggggaaagcc ggcgaacgtg gcgagaaagg
aagggaagaa 360agcgaaagga gcgggcgcta gggcgctggc aagtgtagcg gtcacgctgc
gcgtaaccac 420cacacccgcc gcgcttaatg cgccgctaca gggcgcgtcc cattcgccat
tcaggctgcg 480caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc
tggcgaaagg 540gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt
cacgacgttg 600taaaacgacg gccagtgagc gcgcccgccg ggtaactcac ggggtatcca
tgtccatttc 660tgcggcatcc agccaggata cccgtcctcg ctgacgtaat atcccagcgc
cgcaccgctg 720tcattaatct gcacaccggc acggcagttc cggctgtcgc cggtattgtt
cgggttgctg 780atgcgcttcg ggctgaccat ccggaactgt gtccggaaaa gccgcgacga
actggtatcc 840caggtggcct gaacgaacag ttcaccgtta aaggcgtgca tggccacacc
ttcccgaatc 900atcatggtaa acgtgcgttt tcgctcaacg tcaatgcagc agcagtcatc
ctcggcaaac 960tctttccatg ccgcttcaac ctcgcgggaa aaggcacggg cttcttcctc
cccgatgccc 1020agatagcgcc agcttgggcg atgactgagc cggaaaaaag acccgacgat
atgatcctga 1080tgcagctaga ttaaccctag aaagatagtc tgcgtaaaat tgacgcatga
tctaattaac 1140cctcactaaa gggaacaaaa gctggagctc caccgcggtg gcggccgctc
tagaactagt 1200gccgtacgcg tatcgataag ctttaagata cattgatgag tttggacaaa
ccacaactag 1260aatgcagtga aaaaaatgct ttatttgtga aatttgtgat gctattgctt
tatttgtaac 1320cattataagc tgcaataaac aagttaacaa caacaattgc attcatttta
tgtttcaggt 1380tcagggggag gtgtgggagg ttttttaaag caagtaaaac ctctacaaat
gtggtatggc 1440tgattatgat ctagagtcgc ggccgcttta cttgtacagc tcgtccatgc
cgagagtgat 1500cccggcggcg gtcacgaact ccagcaggac catgtgatcg cgcttctcgt
tggggtcttt 1560gctcagggcg gactgggtgc tcaggtagtg gttgtcgggc agcagcacgg
ggccgtcgcc 1620gatgggggtg ttctgctggt agtggtcggc gagctgcacg ctgccgtcct
cgatgttgtg 1680gcggatcttg aagttcacct tgatgccgtt cttctgcttg tcggccatga
tatagacgtt 1740gtggctgttg tagttgtact ccagcttgtg ccccaggatg ttgccgtcct
ccttgaagtc 1800gatgcccttc agctcgatgc ggttcaccag ggtgtcgccc tcgaacttca
cctcggcgcg 1860ggtcttgtag ttgccgtcgt ccttgaagaa gatggtgcgc tcctggacgt
agccttcggg 1920catggcggac ttgaagaagt cgtgctgctt catgtggtcg gggtagcggc
tgaagcactg 1980cacgccgtag gtcagggtgg tcacgagggt gggccagggc acgggcagct
tgccggtggt 2040gcagatgaac ttcagggtca gcttgccgta ggtggcatcg ccctcgccct
cgccggacac 2100gctgaacttg tggccgttta cgtcgccgtc cagctcgacc aggatgggca
ccaccccggt 2160gaacagctcc tcgcccttgc tcaccatggt ggcgaccggt ggatcccggg
cccgcggtac 2220cgtcgactct agcggtaccc cgattgttta gcttgttcag ctgcgcttgt
ttatttgctt 2280agctttcgct tagcgacgtg ttcactttgc ttgtttgaat tgaattgtcg
ctccgtagac 2340gaagcgcctc tatttatact ccggcggtcg agggttcgaa atcgataagc
ttggatccta 2400attgaattag ctctaattga attagtctct aattgaatta gatccccggg
cgagctcgaa 2460ttaaccattg tgggaacact agtggatccc ccgggctgca ggaattcgat
atcaagctta 2520tcgataccgt cgacctcgag ggggggcccg gtacccaatt cgccctatag
tgagtcgtat 2580taagatcacg cgtagatcca tgcgtcaatt ttacgcatga ttatctttaa
cgtacgtcac 2640aatatgatta tctttctagg gttaatctag ctgcgtgttc tgcagcgtgt
cgagcatctt 2700catctgctcc atcacgctgt aaaacacatt tgcaccgcga gtctgcccgt
cctccacggg 2760ttcaaaaacg tgaatgaacg aggcgcgctt ggcgtaatca tggtcatagc
tgtttcctgt 2820gtgaaattgt tatccgctca caattccaca caacatacga gccggaagca
taaagtgtaa 2880agcctggggt gcctaatgag tgagctaact cacattaatt gcgttgcgct
cactgcccgc 2940tttccagtcg ggaaacctgt cgtgccagct gcattaatga atcggccaac
gcgcggggag 3000aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc
tgcgctcggt 3060cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt
tatccacaga 3120atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg
ccaggaaccg 3180taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg
agcatcacaa 3240aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat
accaggcgtt 3300tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta
ccggatacct 3360gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct
gtaggtatct 3420cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc
ccgttcagcc 3480cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa
gacacgactt 3540atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg
taggcggtgc 3600tacagagttc ttgaagtggt ggcctaacta cggctacact agaaggacag
tatttggtat 3660ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt
gatccggcaa 3720acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta
cgcgcagaaa 3780aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc
agtggaacga 3840aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca
cctagatcct 3900tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa
cttggtctga 3960cagttaccaa tgcttaatca gtgaggcacc tatctcagcg atctgtctat
ttcgttcatc 4020catagttgcc tgactccccg tcgtgtagat aactacgata cgggagggct
taccatctgg 4080ccccagtgct gcaatgatac cgcgagaccc acgctcaccg gctccagatt
tatcagcaat 4140aaaccagcca gccggaaggg ccgagcgcag aagtggtcct gcaactttat
ccgcctccat 4200ccagtctatt aattgttgcc gggaagctag agtaagtagt tcgccagtta
atagtttgcg 4260caacgttgtt gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg
gtatggcttc 4320attcagctcc ggttcccaac gatcaaggcg agttacatga tcccccatgt
tgtgcaaaaa 4380agcggttagc tccttcggtc ctccgatcgt tgtcagaagt aagttggccg
cagtgttatc 4440actcatggtt atggcagcac tgcataattc tcttactgtc atgccatccg
taagatgctt 4500ttctgtgact ggtgagtact caaccaagtc attctgagaa tagtgtatgc
ggcgaccgag 4560ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca catagcagaa
ctttaaaagt 4620gctcatcatt ggaaaacgtt cttcggggcg aaaactctca aggatcttac
cgctgttgag 4680atccagttcg atgtaaccca ctcgtgcacc caactgatct tcagcatctt
ttactttcac 4740cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg
gaataagggc 4800gacacggaaa tgttgaatac tcatactctt cctttttcaa tattattgaa
gcatttatca 4860gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata
aacaaatagg 4920ggttccgcgc acatttcccc gaaaagtgcc ac 4952
SEQ ID NO: 53; Length: 4941; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic nucleotide construct
Sequence 53:
ctaaattgta agcgttaata
ttttgttaaa attcgcgtta aatttttgtt aaatcagctc 60attttttaac caataggccg
aaatcggcaa aatcccttat aaatcaaaag aatagaccga 120gatagggttg agtgttgttc
cagtttggaa caagagtcca ctattaaaga acgtggactc 180caacgtcaaa gggcgaaaaa
ccgtctatca gggcgatggc ccactacgtg aaccatcacc 240ctaatcaagt tttttggggt
cgaggtgccg taaagcacta aatcggaacc ctaaagggag 300cccccgattt agagcttgac
ggggaaagcc ggcgaacgtg gcgagaaagg aagggaagaa 360agcgaaagga gcgggcgcta
gggcgctggc aagtgtagcg gtcacgctgc gcgtaaccac 420cacacccgcc gcgcttaatg
cgccgctaca gggcgcgtcc cattcgccat tcaggctgcg 480caactgttgg gaagggcgat
cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540gggatgtgct gcaaggcgat
taagttgggt aacgccaggg ttttcccagt cacgacgttg 600taaaacgacg gccagtgagc
gcgcccgccg ggtaactcac ggggtatcca tgtccatttc 660tgcggcatcc agccaggata
cccgtcctcg ctgacgtaat atcccagcgc cgcaccgctg 720tcattaatct gcacaccggc
acggcagttc cggctgtcgc cggtattgtt cgggttgctg 780atgcgcttcg ggctgaccat
ccggaactgt gtccggaaaa gccgcgacga actggtatcc 840caggtggcct gaacgaacag
ttcaccgtta aaggcgtgca tggccacacc ttcccgaatc 900atcatggtaa acgtgcgttt
tcgctcaacg tcaatgcagc agcagtcatc ctcggcaaac 960tctttccatg ccgcttcaac
ctcgcgggaa aaggcacggg cttcttcctc cccgatgccc 1020agatagcgcc agcttgggcg
atgactgagc cggaaaaaag acccgacgat atgatcctga 1080tgcagctaga ttaaccctag
aaagatagtc tgcgtaaaat tgacgcatga tctaattaac 1140cctcactaaa gggaacaaaa
gctggagctc caccgcggtg gccgccgctc tagaactagt 1200gttcccacaa tggttaattc
gagctcgccc ggggatctaa ttcaattaga gactaattca 1260attagagcta attcaattag
gatccaagct tatcgatttc gaaccctcga ccgccggagt 1320ataaatagag gcgcttcgtc
tacggagcga caattcaatt caaacaagca aagtgaacac 1380gtcgctaagc gaaagctaag
caaataaaca agcgcagctg aacaagctaa acaatcgggg 1440taccgctaga gtcgacggta
cgatccaccg gtcgccacca tggtgagcaa gggcgaggag 1500ctgttcaccg gggtggtgcc
catcctggtc gagctggacg gcgacgtaaa cggccacaag 1560ttcagcgtgt ccggcgaggg
cgagggcgat gccacctacg gcaagctgac cctgaagttc 1620atctgcacca ccggcaagct
gcccgtgccc tggcccaccc tcgtgaccac cctgacctgg 1680ggcgtgcagt gcttcagccg
ctaccccgac cacatgaagc agcacgactt cttcaagtcc 1740gccatgcccg aaggctacgt
ccaggagcgc accatcttct tcaaggacga cggcaactac 1800aagacccgcg ccgaggtgaa
gttcgagggc gacaccctgg tgaaccgcat cgagctgaag 1860ggcatcgact tcaaggagga
cggcaacatc ctggggcaca agctggagta caactacatc 1920agccacaacg tctatatcac
cgccgacaag cagaagaacg gcatcaaggc caacttcaag 1980atccgccaca acatcgagga
cggcagcgtg cagctcgccg accactacca gcagaacacc 2040cccatcggcg acggccccgt
gctgctgccc gacaaccact acctgagcac ccagtccgcc 2100ctgagcaaag accccaacga
gaagcgcgat cacatggtcc tgctggagtt cgtgaccgcc 2160gccgggatca ctctcggcat
ggacgagctg tacaagtaaa gcggccgcga ctctagatca 2220taatcagcca taccacattt
gtagaggttt tacttgcttt aaaaaacctc ccacacctcc 2280ccctgaacct gaaacataaa
atgaatgcaa ttgttgttgt taacttgttt attgcagctt 2340ataatggtta caaataaagc
aatagcatca caaatttcac aaataaagca tttttttcac 2400tgcattctag ttgtggtttg
tccaaactca tcaatgtatc ttaaagctta tcgatacgcg 2460tacggcacta gtggatcccc
cgggctgcag gaattcgata tcaagcttat cgataccgtc 2520gacctcgagg gggggcccgg
tacccaattc gccctatagt gagtcgtatt aagatcacgc 2580gtagatccat gcgtcaattt
tacgcatgat tatctttaac gtacgtcaca atatgattat 2640ctttctaggg ttaatctagc
tgcgtgttct gcagcgtgtc gagcatcttc atctgctcca 2700tcacgctgta aaacacattt
gcaccgcgag tctgcccgtc ctccacgggt tcaaaaacgt 2760gaatgaacga ggcgcgcttg
gcgtaatcat ggtcatagct gtttcctgtg tgaaattgtt 2820atccgctcac aattccacac
aacatacgag ccggaagcat aaagtgtaaa gcctggggtg 2880cctaatgagt gagctaactc
acattaattg cgttgcgctc actgcccgct ttccagtcgg 2940gaaacctgtc gtgccagctg
cattaatgaa tcggccaacg cgcggggaga ggcggtttgc 3000gtattgggcg ctcttccgct
tcctcgctca ctgactcgct gcgctcggtc gttcggctgc 3060ggcgagcggt atcagctcac
tcaaaggcgg taatacggtt atccacagaa tcaggggata 3120acgcaggaaa gaacatgtga
gcaaaaggcc agcaaaaggc caggaaccgt aaaaaggccg 3180cgttgctggc gtttttccat
aggctccgcc cccctgacga gcatcacaaa aatcgacgct 3240caagtcagag gtggcgaaac
ccgacaggac tataaagata ccaggcgttt ccccctggaa 3300gctccctcgt gcgctctcct
gttccgaccc tgccgcttac cggatacctg tccgcctttc 3360tcccttcggg aagcgtggcg
ctttctcata gctcacgctg taggtatctc agttcggtgt 3420aggtcgttcg ctccaagctg
ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg 3480ccttatccgg taactatcgt
cttgagtcca acccggtaag acacgactta tcgccactgg 3540cagcagccac tggtaacagg
attagcagag cgaggtatgt aggcggtgct acagagttct 3600tgaagtggtg gcctaactac
ggctacacta gaaggacagt atttggtatc tgcgctctgc 3660tgaagccagt taccttcgga
aaaagagttg gtagctcttg atccggcaaa caaaccaccg 3720ctggtagcgg tggttttttt
gtttgcaagc agcagattac gcgcagaaaa aaaggatctc 3780aagaagatcc tttgatcttt
tctacggggt ctgacgctca gtggaacgaa aactcacgtt 3840aagggatttt ggtcatgaga
ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa 3900aatgaagttt taaatcaatc
taaagtatat atgagtaaac ttggtctgac agttaccaat 3960gcttaatcag tgaggcacct
atctcagcga tctgtctatt tcgttcatcc atagttgcct 4020gactccccgt cgtgtagata
actacgatac gggagggctt accatctggc cccagtgctg 4080caatgatacc gcgagaccca
cgctcaccgg ctccagattt atcagcaata aaccagccag 4140ccggaagggc cgagcgcaga
agtggtcctg caactttatc cgcctccatc cagtctatta 4200attgttgccg ggaagctaga
gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg 4260ccattgctac aggcatcgtg
gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg 4320gttcccaacg atcaaggcga
gttacatgat cccccatgtt gtgcaaaaaa gcggttagct 4380ccttcggtcc tccgatcgtt
gtcagaagta agttggccgc agtgttatca ctcatggtta 4440tggcagcact gcataattct
cttactgtca tgccatccgt aagatgcttt tctgtgactg 4500gtgagtactc aaccaagtca
ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc 4560cggcgtcaat acgggataat
accgcgccac atagcagaac tttaaaagtg ctcatcattg 4620gaaaacgttc ttcggggcga
aaactctcaa ggatcttacc gctgttgaga tccagttcga 4680tgtaacccac tcgtgcaccc
aactgatctt cagcatcttt tactttcacc agcgtttctg 4740ggtgagcaaa aacaggaagg
caaaatgccg caaaaaaggg aataagggcg acacggaaat 4800gttgaatact catactcttc
ctttttcaat attattgaag catttatcag ggttattgtc 4860tcatgagcgg atacatattt
gaatgtattt agaaaaataa acaaataggg gttccgcgca 4920catttccccg aaaagtgcca c 4941
SEQ ID NO: 54; Length: 4943; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic nucleotide construct
Sequence 54:
cacctgacgc gccctgtagc ggcgcattaa gcgcggcggg tgtggtggtt
acgcgcagcg 60tgaccgctac acttgccagc gccctagcgc ccgctccttt cgctttcttc
ccttcctttc 120tcgccacgtt cgccggcttt ccccgtcaag ctctaaatcg ggggctccct
ttagggttcc 180gatttagtgc tttacggcac ctcgacccca aaaaacttga ttagggtgat
ggttcacgta 240gtgggccatc gccctgatag acggtttttc gccctttgac gttggagtcc
acgttcttta 300atagtggact cttgttccaa actggaacaa cactcaaccc tatctcggtc
tattcttttg 360atttataagg gattttgccg atttcggcct attggttaaa aaatgagctg
atttaacaaa 420aatttaacgc gaattttaac aaaatattaa cgcttacaat ttccattcgc
cattcaggct 480gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg ctattacgcc
agctggcgaa 540agggggatgt gctgcaaggc gattaagttg ggtaacgcca gggttttccc
agtcacgacg 600ttgtaaaacg acggccagtg aattgtaata cgactcacta tagggcgaat
tgggtaccgg 660gccccccctc gaggtcgacg gtatcgataa gcttgatatc gaattcctgc
agcccggggg 720atcccatgcg tcaattttac gcagactatc tttctagggt taatctagct
gcatcaggat 780catatcgtcg ggtctttttt ccggctcagt catcgcccaa gctggcgcta
tctgggcatc 840ggggaggaag aagcccgtgc cttttcccgc gaggttgaag cggcatggaa
agagtttgcc 900gaggatgact gctgctgcat tgacgttgag cgaaaacgca cgtttaccat
gatgattcgg 960gaaggtgtgg ccatgcacgc ctttaacggt gaactgttcg ttcaggccac
ctgggatacc 1020agttcgtcgc ggcttttccg gacacagttc cggatggtca gcccgaagcg
catcagcaac 1080ccgaacaata ccggcgacag ccggaactgc cgtgccggtg tgcagattaa
tgacagcggt 1140gcggcgctgg gatattacgt cagcgaggac gggtatcctg gctggatgcc
gcagaaatgg 1200acatggatac cccgtgagtt acccggcggc tcgttcattc acgtttttga
acccgtggag 1260gacgggcaga ctcgcggtgc aaatgtgttt tacagcgtga tggagcagat
gaagatgctc 1320gacacgctgc agaacacgca gctagattaa ccctagaaag ataatcatat
tgtgacgtac 1380gttaaagata atcatgcgta aaattgacgc atgggatcca ctagtgttcc
cacaatggtt 1440aattcgagct cgcccgggga tctaattcaa ttagagacta attcaattag
agctaattca 1500attaggatcc aagcttatcg atttcgaacc ctcgaccgcc ggagtataaa
tagaggcgct 1560tcgtctacgg agcgacaatt caattcaaac aagcaaagtg aacacgtcgc
taagcgaaag 1620ctaagcaaat aaacaagcgc agctgaacaa gctaaacaat cggggtaccg
ctagagtcga 1680cggtacgatc caccggtcgc caccatggtg agcaagggcg aggagctgtt
caccggggtg 1740gtgcccatcc tggtcgagct ggacggcgac gtaaacggcc acaagttcag
cgtgtccggc 1800gagggcgagg gcgatgccac ctacggcaag ctgaccctga agttcatctg
caccaccggc 1860aagctgcccg tgccctggcc caccctcgtg accaccctga cctggggcgt
gcagtgcttc 1920agccgctacc ccgaccacat gaagcagcac gacttcttca agtccgccat
gcccgaaggc 1980tacgtccagg agcgcaccat cttcttcaag gacgacggca actacaagac
ccgcgccgag 2040gtgaagttcg agggcgacac cctggtgaac cgcatcgagc tgaagggcat
cgacttcaag 2100gaggacggca acatcctggg gcacaagctg gagtacaact acatcagcca
caacgtctat 2160atcaccgccg acaagcagaa gaacggcatc aaggccaact tcaagatccg
ccacaacatc 2220gaggacggca gcgtgcagct cgccgaccac taccagcaga acacccccat
cggcgacggc 2280cccgtgctgc tgcccgacaa ccactacctg agcacccagt ccgccctgag
caaagacccc 2340aacgagaagc gcgatcacat ggtcctgctg gagttcgtga ccgccgccgg
gatcactctc 2400ggcatggacg agctgtacaa gtaaagcggc cgcgactcta gatcataatc
agccatacca 2460catttgtaga ggttttactt gctttaaaaa acctcccaca cctccccctg
aacctgaaac 2520ataaaatgaa tgcaattgtt gttgttaact tgtttattgc agcttataat
ggttacaaat 2580aaagcaatag catcacaaat ttcacaaata aagcattttt ttcactgcat
tctagttgtg 2640gtttgtccaa actcatcaat gtatcttaaa gcttatcgat acgcgtacgg
cgcgcctagg 2700ccggccgata ctagttctag agcggccgcc accgcggtgg agctccagct
tttgttccct 2760ttagtgaggg ttaatttcga gcttggcgta atcatggtca tagctgtttc
ctgtgtgaaa 2820ttgttatccg ctcacaattc cacacaacat acgagccgga agcataaagt
gtaaagcctg 2880gggtgcctaa tgagtgagct aactcacatt aattgcgttg cgctcactgc
ccgctttcca 2940gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg
ggagaggcgg 3000tttgcgtatt gggcgctctt ccgcttcctc gctcactgac tcgctgcgct
cggtcgttcg 3060gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca
cagaatcagg 3120ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga
accgtaaaaa 3180ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc
acaaaaatcg 3240acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg
cgtttccccc 3300tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat
acctgtccgc 3360ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt
atctcagttc 3420ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc
agcccgaccg 3480ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg
acttatcgcc 3540actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg
gtgctacaga 3600gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg
gtatctgcgc 3660tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg
gcaaacaaac 3720caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca
gaaaaaaagg 3780atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga
acgaaaactc 3840acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga
tccttttaaa 3900ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt
ctgacagtta 3960ccaatgctta atcagtgagg cacctatctc agcgatctgt ctatttcgtt
catccatagt 4020tgcctgactc cccgtcgtgt agataactac gatacgggag ggcttaccat
ctggccccag 4080tgctgcaatg ataccgcgag acccacgctc accggctcca gatttatcag
caataaacca 4140gccagccgga agggccgagc gcagaagtgg tcctgcaact ttatccgcct
ccatccagtc 4200tattaattgt tgccgggaag ctagagtaag tagttcgcca gttaatagtt
tgcgcaacgt 4260tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg
cttcattcag 4320ctccggttcc caacgatcaa ggcgagttac atgatccccc atgttgtgca
aaaaagcggt 4380tagctccttc ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt
tatcactcat 4440ggttatggca gcactgcata attctcttac tgtcatgcca tccgtaagat
gcttttctgt 4500gactggtgag tactcaacca agtcattctg agaatagtgt atgcggcgac
cgagttgctc 4560ttgcccggcg tcaatacggg ataataccgc gccacatagc agaactttaa
aagtgctcat 4620cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt
tgagatccag 4680ttcgatgtaa cccactcgtg cacccaactg atcttcagca tcttttactt
tcaccagcgt 4740ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa
gggcgacacg 4800gaaatgttga atactcatac tcttcctttt tcaatattat tgaagcattt
atcagggtta 4860ttgtctcatg agcggataca tatttgaatg tatttagaaa aataaacaaa
taggggttcc 4920gcgcacattt ccccgaaaag tgc 4943
SEQ ID NO: 55; Length: 4944; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic nucleotide construct
Sequence 55:
cacctgacgc gccctgtagc
ggcgcattaa gcgcggcggg tgtggtggtt acgcgcagcg 60tgaccgctac acttgccagc
gccctagcgc ccgctccttt cgctttcttc ccttcctttc 120tcgccacgtt cgccggcttt
ccccgtcaag ctctaaatcg ggggctccct ttagggttcc 180gatttagtgc tttacggcac
ctcgacccca aaaaacttga ttagggtgat ggttcacgta 240gtgggccatc gccctgatag
acggtttttc gccctttgac gttggagtcc acgttcttta 300atagtggact cttgttccaa
actggaacaa cactcaaccc tatctcggtc tattcttttg 360atttataagg gattttgccg
atttcggcct attggttaaa aaatgagctg atttaacaaa 420aatttaacgc gaattttaac
aaaatattaa cgcttacaat ttccattcgc cattcaggct 480gcgcaactgt tgggaagggc
gatcggtgcg ggcctcttcg ctattacgcc agctggcgaa 540agggggatgt gctgcaaggc
gattaagttg ggtaacgcca gggttttccc agtcacgacg 600ttgtaaaacg acggccagtg
aattgtaata cgactcacta tagggcgaat tgggtaccgg 660gccccccctc gaggtcgacg
gtatcgataa gcttgatatc gaattcctgc agcccggggg 720atcccatgcg tcaattttac
gcagactatc tttctagggt taatctagct gcatcaggat 780catatcgtcg ggtctttttt
ccggctcagt catcgcccaa gctggcgcta tctgggcatc 840ggggaggaag aagcccgtgc
cttttcccgc gaggttgaag cggcatggaa agagtttgcc 900gaggatgact gctgctgcat
tgacgttgag cgaaaacgca cgtttaccat gatgattcgg 960gaaggtgtgg ccatgcacgc
ctttaacggt gaactgttcg ttcaggccac ctgggatacc 1020agttcgtcgc ggcttttccg
gacacagttc cggatggtca gcccgaagcg catcagcaac 1080ccgaacaata ccggcgacag
ccggaactgc cgtgccggtg tgcagattaa tgacagcggt 1140gcggcgctgg gatattacgt
cagcgaggac gggtatcctg gctggatgcc gcagaaatgg 1200acatggatac cccgtgagtt
acccggcggc tcgttcattc acgtttttga acccgtggag 1260gacgggcaga ctcgcggtgc
aaatgtgttt tacagcgtga tggagcagat gaagatgctc 1320gacacgctgc agaacacgca
gctagattaa ccctagaaag ataatcatat tgtgacgtac 1380gttaaagata atcatgcgta
aaattgacgc atgggatcca ctagtgttcc cacaatggtt 1440aattcgagct cgcccgggga
tctaattcaa ttagagacta attcaattag agctaattca 1500attaggatcc aagcttatcg
atttcgaacc ctcgaccgcc ggagtataaa tagaggcgct 1560tcgtctacgg agcgacaatt
caattcaaac aagcaaagtg aacacgtcgc taagcgaaag 1620ctaagcaaat aaacaagcgc
agctgaacaa gctaaacaat cggggtaccg ctagagtcga 1680cggtaccgcg ggcccgggat
ccaccggtcg ccaccatggt gagcaagggc gaggagctgt 1740tcaccggggt ggtgcccatc
ctggtcgagc tggacggcga cgtaaacggc cacaagttca 1800gcgtgtccgg cgagggcgag
ggcgatgcca cctacggcaa gctgaccctg aagttcatct 1860gcaccaccgg caagctgccc
gtgccctggc ccaccctcgt gaccaccctg acctacggcg 1920tgcagtgctt cagccgctac
cccgaccaca tgaagcagca cgacttcttc aagtccgcca 1980tgcccgaagg ctacgtccag
gagcgcacca tcttcttcaa ggacgacggc aactacaaga 2040cccgcgccga ggtgaagttc
gagggcgaca ccctggtgaa ccgcatcgag ctgaagggca 2100tcgacttcaa ggaggacggc
aacatcctgg ggcacaagct ggagtacaac tacaacagcc 2160acaacgtcta tatcatggcc
gacaagcaga agaacggcat caaggtgaac ttcaagatcc 2220gccacaacat cgaggacggc
agcgtgcagc tcgccgacca ctaccagcag aacaccccca 2280tcggcgacgg ccccgtgctg
ctgcccgaca accactacct gagcacccag tccgccctga 2340gcaaagaccc caacgagaag
cgcgatcaca tggtcctgct ggagttcgtg accgccgccg 2400ggatcactct cggcatggac
gagctgtaca agtaaagcgg ccgcgactct agatcataat 2460cagccatacc acatttgtag
aggttttact tgctttaaaa aacctcccac acctccccct 2520gaacctgaaa cataaaatga
atgcaattgt tgttgttaac ttgtttattg cagcttataa 2580tggttacaaa taaagcaata
gcatcacaaa tttcacaaat aaagcatttt tttcactgca 2640ttctagttgt ggtttgtcca
aactcatcaa tgtatcttaa agcttatcga tacgcgtacg 2700gcgcgcctag actagttcta
gagcggccgc caccgcggtg gagctccagc ttttgttccc 2760tttagtgagg gttaatttcg
agcttggcgt aatcatggtc atagctgttt cctgtgtgaa 2820attgttatcc gctcacaatt
ccacacaaca tacgagccgg aagcataaag tgtaaagcct 2880ggggtgccta atgagtgagc
taactcacat taattgcgtt gcgctcactg cccgctttcc 2940agtcgggaaa cctgtcgtgc
cagctgcatt aatgaatcgg ccaacgcgcg gggagaggcg 3000gtttgcgtat tgggcgctct
tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc 3060ggctgcggcg agcggtatca
gctcactcaa aggcggtaat acggttatcc acagaatcag 3120gggataacgc aggaaagaac
atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa 3180aggccgcgtt gctggcgttt
ttccataggc tccgcccccc tgacgagcat cacaaaaatc 3240gacgctcaag tcagaggtgg
cgaaacccga caggactata aagataccag gcgtttcccc 3300ctggaagctc cctcgtgcgc
tctcctgttc cgaccctgcc gcttaccgga tacctgtccg 3360cctttctccc ttcgggaagc
gtggcgcttt ctcatagctc acgctgtagg tatctcagtt 3420cggtgtaggt cgttcgctcc
aagctgggct gtgtgcacga accccccgtt cagcccgacc 3480gctgcgcctt atccggtaac
tatcgtcttg agtccaaccc ggtaagacac gacttatcgc 3540cactggcagc agccactggt
aacaggatta gcagagcgag gtatgtaggc ggtgctacag 3600agttcttgaa gtggtggcct
aactacggct acactagaag gacagtattt ggtatctgcg 3660ctctgctgaa gccagttacc
ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa 3720ccaccgctgg tagcggtggt
ttttttgttt gcaagcagca gattacgcgc agaaaaaaag 3780gatctcaaga agatcctttg
atcttttcta cggggtctga cgctcagtgg aacgaaaact 3840cacgttaagg gattttggtc
atgagattat caaaaaggat cttcacctag atccttttaa 3900attaaaaatg aagttttaaa
tcaatctaaa gtatatatga gtaaacttgg tctgacagtt 3960accaatgctt aatcagtgag
gcacctatct cagggatctg tctatttcgt tcatccatag 4020ttgcctgact ccccgtcgtg
tagataacta cgatacggga gggcttacca tctggcccca 4080gtgctgcaat gataccgcga
gacccacgct caccggctcc agatttatca gcaataaacc 4140agccagccgg aagggccgag
cgcagaagtg gtcctgcaac tttatccgcc tccatccagt 4200ctattaattg ttgccgggaa
gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg 4260ttgttgccat tgctacaggc
atcgtggtgt cacgctcgtc gtttggtatg gcttcattca 4320gctccggttc ccaacgatca
aggcgagtta catgatcccc catgttgtgc aaaaaagcgg 4380ttagctcctt cggtcctccg
atcgttgtca gaagtaagtt ggccgcagtg ttatcactca 4440tggttatggc agcactgcat
aattctctta ctgtcatgcc atccgtaaga tgcttttctg 4500tgactggtga gtactcaacc
aagtcattct gagaatagtg tatgcggcga ccgagttgct 4560cttgcccggc gtcaatacgg
gataataccg cgccacatag cagaacttta aaagtgctca 4620tcattggaaa acgttcttcg
gggcgaaaac tctcaaggat cttaccgctg ttgagatcca 4680gttcgatgta acccactcgt
gcacccaact gatcttcagc atcttttact ttcaccagcg 4740tttctgggtg agcaaaaaca
ggaaggcaaa atgccgcaaa aaagggaata agggcgacac 4800ggaaatgttg aatactcata
ctcttccttt ttcaatatta ttgaagcatt tatcagggtt 4860attgtctcat gagcggatac
atatttgaat gtatttagaa aaataaacaa ataggggttc 4920cgcgcacatt tccccgaaaa
gtgc 4944
SEQ ID NO: 56; Length: 4944; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic nucleotide construct
Sequence 56:
cacctgacgc gccctgtagc ggcgcattaa gcgcggcggg tgtggtggtt
acgcgcagcg 60tgaccgctac acttgccagc gccctagcgc ccgctccttt cgctttcttc
ccttcctttc 120tcgccacgtt cgccggcttt ccccgtcaag ctctaaatcg ggggctccct
ttagggttcc 180gatttagtgc tttacggcac ctcgacccca aaaaacttga ttagggtgat
ggttcacgta 240gtgggccatc gccctgatag acggtttttc gccctttgac gttggagtcc
acgttcttta 300atagtggact cttgttccaa actggaacaa cactcaaccc tatctcggtc
tattcttttg 360atttataagg gattttgccg atttcggcct attggttaaa aaatgagctg
atttaacaaa 420aatttaacgc gaattttaac aaaatattaa cgcttacaat ttccattcgc
cattcaggct 480gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg ctattacgcc
agctggcgaa 540agggggatgt gctgcaaggc gattaagttg ggtaacgcca gggttttccc
agtcacgacg 600ttgtaaaacg acggccagtg aattgtaata cgactcacta tagggcgaat
tgggtaccgg 660gccccccctc gaggtcgacg gtatcgataa gcttgatatc gaattcctgc
agcccggggg 720atcccatgcg tcaattttac gcagactatc tttctagggt taatctagct
gcatcaggat 780catatcgtcg ggtctttttt ccggctcagt catcgcccaa gctggcgcta
tctgggcatc 840ggggaggaag aagcccgtgc cttttcccgc gaggttgaag cggcatggaa
agagtttgcc 900gaggatgact gctgctgcat tgacgttgag cgaaaacgca cgtttaccat
gatgattcgg 960gaaggtgtgg ccatgcacgc ctttaacggt gaactgttcg ttcaggccac
ctgggatacc 1020agttcgtcgc ggcttttccg gacacagttc cggatggtca gcccgaagcg
catcagcaac 1080ccgaacaata ccggcgacag ccggaactgc cgtgccggtg tgcagattaa
tgacagcggt 1140gcggcgctgg gatattacgt cagcgaggac gggtatcctg gctggatgcc
gcagaaatgg 1200acatggatac cccgtgagtt acccggcggc tcgttcattc acgtttttga
acccgtggag 1260gacgggcaga ctcgcggtgc aaatgtgttt tacagcgtga tggagcagat
gaagatgctc 1320gacacgctgc agaacacgca gctagattaa ccctagaaag ataatcatat
tgtgacgtac 1380gttaaagata atcatgcgta aaattgacgc atgggatcca ctagtgttcc
cacaatggtt 1440aattcgagct cgcccgggga tctaattcaa ttagagacta attcaattag
agctaattca 1500attaggatcc aagcttatcg atttcgaacc ctcgaccgcc ggagtataaa
tagaggcgct 1560tcgtctacgg agcgacaatt caattcaaac aagcaaagtg aacacgtcgc
taagcgaaag 1620ctaagcaaat aaacaagcgc agctgaacaa gctaaacaat cggggtaccg
ctagagtcga 1680cggtacgatc caccggtcgc caccatggtg agcaagggcg aggagctgtt
caccggggtg 1740gtgcccatcc tggtcgagct ggacggcgac gtaaacggcc acaagttcag
cgtgtccggc 1800gagggcgagg gcgatgccac ctacggcaag ctgaccctga agttcatctg
caccaccggc 1860aagctgcccg tgccctggcc caccctcgtg accaccttcg gctacggcct
gcagtgcttc 1920gcccgctacc ccgaccacat gaagcagcac gacttcttca agtccgccat
gcccgaaggc 1980tacgtccagg agcgcaccat cttcttcaag gacgacggca actacaagac
ccgcgccgag 2040gtgaagttcg agggcgacac cctggtgaac cgcatcgagc tgaagggcat
cgacttcaag 2100gaggacggca acatcctggg gcacaagctg gagtacaact acaacagcca
caacgtctat 2160atcatggccg acaagcagaa gaacggcatc aaggtgaact tcaagatccg
ccacaacatc 2220gaggacggca gcgtgcagct cgccgaccac taccagcaga acacccccat
cggcgacggc 2280cccgtgctgc tgcccgacaa ccactacctg agctaccagt ccgccctgag
caaagacccc 2340aacgagaagc gcgatcacat ggtcctgctg gagttcgtga ccgccgccgg
gatcactctc 2400ggcatggacg agctgtacaa gtaaagcggc cgcgactcta gatcataatc
agccatacca 2460catttgtaga ggttttactt gctttaaaaa acctcccaca cctccccctg
aacctgaaac 2520ataaaatgaa tgcaattgtt gttgttaact tgtttattgc agcttataat
ggttacaaat 2580aaagcaatag catcacaaat ttcacaaata aagcattttt ttcactgcat
tctagttgtg 2640gtttgtccaa actcatcaat gtatcttaaa gcttatcgat acgcgtacgg
cgcgcctagg 2700ccggccgatc actagttcta gagcggccgc caccgcggtg gagctccagc
ttttgttccc 2760tttagtgagg gttaatttcg agcttggcgt aatcatggtc atagctgttt
cctgtgtgaa 2820attgttatcc gctcacaatt ccacacaaca tacgagccgg aagcataaag
tgtaaagcct 2880ggggtgccta atgagtgagc taactcacat taattgcgtt gcgctcactg
cccgctttcc 2940agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg ccaacgcgcg
gggagaggcg 3000gtttgcgtat tgggcgctct tccgcttcct cgctcactga ctcgctgcgc
tcggtcgttc 3060ggctgcggcg agcggtatca gctcactcaa aggcggtaat acggttatcc
acagaatcag 3120gggataacgc aggaaagaac atgtgagcaa aaggccagca aaaggccagg
aaccgtaaaa 3180aggccgcgtt gctggcgttt ttccataggc tccgcccccc tgacgagcat
cacaaaaatc 3240gacgctcaag tcagaggtgg cgaaacccga caggactata aagataccag
gcgtttcccc 3300ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc gcttaccgga
tacctgtccg 3360cctttctccc ttcgggaagc gtggcgcttt ctcatagctc acgctgtagg
tatctcagtt 3420cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga accccccgtt
cagcccgacc 3480gctgcgcctt atccggtaac tatcgtcttg agtccaaccc ggtaagacac
gacttatcgc 3540cactggcagc agccactggt aacaggatta gcagagcgag gtatgtaggc
ggtgctacag 3600agttcttgaa gtggtggcct aactacggct acactagaag gacagtattt
ggtatctgcg 3660ctctgctgaa gccagttacc ttcggaaaaa gagttggtag ctcttgatcc
ggcaaacaaa 3720ccaccgctgg tagcggtggt ttttttgttt gcaagcagca gattacgcgc
agaaaaaaag 3780gatctcaaga agatcctttg atcttttcta cggggtctga cgctcagtgg
aacgaaaact 3840cacgttaagg gattttggtc atgagattat caaaaaggat cttcacctag
atccttttaa 3900attaaaaatg aagttttaaa tcaatctaaa gtatatatga gtaaacttgg
tctgacagtt 3960accaatgctt aatcagtgag gcacctatct cagcgatctg tctatttcgt
tcatccatag 4020ttgcctgact ccccgtcgtg tagataacta cgatacggga gggcttacca
tctggcccca 4080gtgctgcaat gataccgcga gacccacgct caccggctcc agatttatca
gcaataaacc 4140agccagccgg aagggccgag cgcagaagtg gtcctgcaac tttatccgcc
tccatccagt 4200ctattaattg ttgccgggaa gctagagtaa gtagttcgcc agttaatagt
ttgcgcaacg 4260ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc gtttggtatg
gcttcattca 4320gctccggttc ccaacgatca aggcgagtta catgatcccc catgttgtgc
aaaaaagcgg 4380ttagctcctt cggtcctccg atcgttgtca gaagtaagtt ggccgcagtg
ttatcactca 4440tggttatggc agcactgcat aattctctta ctgtcatgcc atccgtaaga
tgcttttctg 4500tgactggtga gtactcaacc aagtcattct gagaatagtg tatgcggcga
ccgagttgct 4560cttgcccggc gtcaatacgg gataataccg cgccacatag cagaacttta
aaagtgctca 4620tcattggaaa acgttcttcg gggcgaaaac tctcaaggat cttaccgctg
ttgagatcca 4680gttcgatgta acccactcgt gcacccaact gatcttcagc atcttttact
ttcaccagcg 4740tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa aaagggaata
agggcgacac 4800ggaaatgttg aatactcata ctcttccttt ttcaatatta ttgaagcatt
tatcagggtt 4860attgtctcat gagcggatac atatttgaat gtatttagaa aaataaacaa
ataggggttc 4920cgcgcacatt tccccgaaaa gtgc 4944
SEQ ID NO: 57; Length: 7670; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic nucleotide construct
Sequence 57:
aacgcgcggg gagaggcggt
ttgcgtattg ggcgctcttc cgcttcctcg ctcactgact 60cgctgcgctc ggtcgttcgg
ctgcggcgag cggtatcagc tcactcaaag gcggtaatac 120ggttatccac agaatcaggg
gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa 180aggccaggaa ccgtaaaaag
gccgcgttgc tggcgttttt ccataggctc cgcccccctg 240acgagcatca caaaaatcga
cgctcaagtc agaggtggcg aaacccgaca ggactataaa 300gataccaggc gtttccccct
ggaagctccc tcgtgcgctc tcctgttccg accctgccgc 360ttaccggata cctgtccgcc
tttctccctt cgggaagcgt ggcgctttct caatgctcac 420gctgtaggta tctcagttcg
gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac 480cccccgttca gcccgaccgc
tgcgccttat ccggtaacta tcgtcttgag tccaacccgg 540taagacacga cttatcgcca
ctggcagcag ccactggtaa caggattagc agagcgaggt 600atgtaggcgg tgctacagag
ttcttgaagt ggtggcctaa ctacggctac actagaagga 660cagtatttgg tatctgcgct
ctgctgaagc cagttacctt cggaaaaaga gttggtagct 720cttgatccgg caaacaaacc
accgctggta gcggtggttt ttttgtttgc aagcagcaga 780ttacgcgcag aaaaaaagga
tctcaagaag atcctttgat cttttctacg gggtctgacg 840ctcagtggaa cgaaaactca
cgttaaggga ttttggtcat gagattatca aaaaggatct 900tcacctagat ccttttaaat
taaaaatgaa gttttaaatc aatctaaagt atatatgagt 960aaacttggtc tgacagttac
caatgcttaa tcagtgaggc acctatctca gcgatctgtc 1020tatttcgttc atccatagtt
gcctgactcc ccgtcgtgta gataactacg atacgggagg 1080gcttaccatc tggccccagt
gctgcaatga taccgcgaga cccacgctca ccggctccag 1140atttatcagc aataaaccag
ccagccggaa gggccgagcg cagaagtggt cctgcaactt 1200tatccgcctc catccagtct
attaattgtt gccgggaagc tagagtaagt agttcgccag 1260ttaatagttt gcgcaacgtt
gttgccattg ctacaggcat cgtggtgtca cgctcgtcgt 1320ttggtatggc ttcattcagc
tccggttccc aacgatcaag gcgagttaca tgatccccca 1380tgttgtgcaa aaaagcggtt
agctccttcg gtcctccgat cgttgtcaga agtaagttgg 1440ccgcagtgtt atcactcatg
gttatggcag cactgcataa ttctcttact gtcatgccat 1500ccgtaagatg cttttctgtg
actggtgagt actcaaccaa gtcattctga gaatagtgta 1560tgcggcgacc gagttgctct
tgcccggcgt caatacggga taataccgcg ccacatagca 1620gaactttaaa agtgctcatc
attggaaaac gttcttcggg gcgaaaactc tcaaggatct 1680taccgctgtt gagatccagt
tcgatgtaac ccactcgtgc acccaactga tcttcagcat 1740cttttacttt caccagcgtt
tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa 1800agggaataag ggcgacacgg
aaatgttgaa tactcatact cttccttttt caatattatt 1860gaagcattta tcagggttat
tgtctcatga gcggatacat atttgaatgt atttagaaaa 1920ataaacaaat aggggttccg
cgcacatttc cccgaaaagt gccacctgac gtctaagaaa 1980ccattattat catgacatta
acctataaaa ataggcgtat cacggggccc tgaggtgaac 2040caattgtcac acgtaatatt
acgacaacta ccgtgcacag gctttgataa ctccttcacg 2100tagtattcac cgagtggtac
tccgttggtc tgtgttcctc ttcccaaata aggcattcca 2160tttatcatat acttcgtacc
actgtcacac atcatgagga tttttattcc atacttactt 2220ggcttgtttg ggatatacat
cctaaacgga caccgtcctc taaaaccaag taactgttca 2280tctatggtca aatgagcccc
tggagtgtaa ttttgtatgc actgatggat aaagagatcc 2340catatttttc taacaggagt
aaatacatcg ttttctcgaa gtgtgggccg tatacttttg 2400tcatccattc taagacatcg
tatcaaaaaa tccaaaacga tccacagact cattacagag 2460acgtacacat tgacaaagat
cgatccaaag aggtcatctg tggacatgtg gttatctttt 2520ctcactgctg tcattaccag
aataccaaag aaagcataga tttcatcttc attcgtgtca 2580cgaaatgtag cacctgtcat
agattcccga cgtttcaatg atatctcagc atttgtccat 2640tttacaattt gcgaaattat
ctcatcagta aaaaatagtt tgaagcataa aagtgggtca 2700tatatattgc ggcacatacg
cgtcggacct ctttgagatc tgacaatgtt cagtgcagag 2760actcggctac cgctcgtgga
ctttgaagtt aaattcagat ataaagacgc tgaaaatcat 2820ttgattttcg ctctaacata
ccaccctaaa gattataaat ttaatgaatt attaaaatac 2880gtacaacaat tgtctgtaaa
tcaacaacgc acagaatcta gcgcttaata aatgtactaa 2940taacaatgta tcgtgtttta
atacgccgga ccagtgaaca gaggtgcgtc tggtgcaaac 3000tcctttactt tgaacaccag
ggaaacttca aggagaattt cctcctcttc agcagagtcg 3060gtaccggtca cccggggatc
ccccctgccc ggttattatt atttttgaca ccagaccaac 3120tggtaatggt agcgaccggc
gctcagctgg aattccgccg atactgacgg gctccaggag 3180tcgtcgccac caatccccat
atggaaaccg tcgatattca gccatgtgcc ttcttccgcg 3240tgcagcagat ggcgatggct
ggtttccatc agttgctgtt gactgtagcg gctgatgttg 3300aactggaagt cgccgcgcca
ctggtgtggg ccataattca attcgcgcgt cccgcagcgc 3360agaccgtttt cgctcgggaa
gacgtacggg gtatacatgt ctgacaatgg cagatcccag 3420cggtcaaaac aggcggcagt
aaggcggtcg ggatagtttt cttgcggccc taatccgagc 3480cagtttaccc gctctgctac
ctgcgccagc tggcagttca ggccaatccg cgccggatgc 3540ggtgtatcgc tcgccacttc
aacatcaacg gtaatcgcca tttgaccact accatcaatc 3600cggtaggttt tccggctgat
aaataaggtt ttcccctgat gctgccacgc gtgagcggtc 3660gtaatcagca ccgcatcagc
aagtgtatct gccgtgcact gcaacaacgc tgcttcggcc 3720tggtaatggc ccgccgcctt
ccagcgttcg acccaggcgt tagggtcaat gcgggtcgct 3780tcacttacgc caatgtcgtt
atccagcggt gcacgggtga actgatcgcg cagcggcgtc 3840agcagttgtt ttttatcgcc
aatccacatc tgtgaaagaa agcctgactg gcggttaaat 3900tgccaacgct tattacccag
ctcgatgcaa aaatccattt cgctggtggt cagatgcggg 3960atggcgtggg acgcggcggg
gagcgtcaca ctgaggtttt ccgccagacg ccactgctgc 4020caggcgctga tgtgcccggc
ttctgaccat gcggtcgcgt tcggttgcac tacgcgtact 4080gtgagccaga gttgcccggc
gctctccggc tgcggtagtt caggcagttc aatcaactgt 4140ttaccttgtg gagcgacatc
cagaggcact tcaccgcttg ccagcggctt accatccagc 4200gccaccatcc agtgcaggag
ctcgttatcg ctatgacgga acaggtattc gctggtcact 4260tcgatggttt gcccggataa
acggaactgg aaaaactgct gctggtgttt tgcttccgtc 4320agcgctggat gcggcgtgcg
gtcggcaaag accagaccgt tcatacagaa ctggcgatcg 4380ttcggcgtat cgccaaaatc
accgccgtaa gccgaccacg ggttgccgtt ttcatcatat 4440ttaatcagcg actgatccac
ccagtcccag acgaagccgc cctgtaaacg gggatactga 4500cgaaacgcct gccagtattt
agcgaaaccg ccaagactgt tacccatcgc gtgggcgtat 4560tcgcaaagga tcagcgggcg
cgtctctcca ggtagcgaaa gccatttttt gatggaccat 4620ttcggcacag ccgggaaggg
ctggtcttca tccacgcgcg cgtacatcgg gcaaataata 4680tcggtggccg tggtgtcggc
tccgccgcct tcatactgca ccgggcggga aggatcgaca 4740gatttgatcc agcgatacag
cgcgtcgtga ttagcgccgt ggcctgattc attccccagc 4800gaccagatga tcacactcgg
gtgattacga tcgcgctgca ccattcgcgt tacgcgttcg 4860ctcatcgccg gtagccagcg
cggatcatcg gtcagacgat tcattggcac catgccgtgg 4920gtttcaatat tggcttcatc
caccacatac aggccgtagc ggtcgcacag cgtgtaccac 4980agcggatggt tcggataatg
cgaacagcgc acggcgttaa agttgttctg cttcatcagc 5040aggatatcct gcaccatcgt
ctgctcatcc atgacctgac catgcagagg atgatgctcg 5100tgacggttaa cgcctcgaat
cagcaacggc ttgccgttca gcagcagcag accattttca 5160atccgcacct cgcggaaacc
gacatcgcag gcttctgctt caatcagcgt gccgtcggcg 5220gtgtgcagtt caaccaccgc
acgatagaga ttcgggattt cggcgctcca cagtttcggg 5280ttttcgacgt tcagacgtag
tgtgacgcga tcggcataac caccacgctc atcgataatt 5340tcaccgccga aaggcgcggt
gccgctggcg acctgcgttt caccctgcca taaagaaact 5400gttacccgta ggtagtcacg
caactcgccg cacatctgaa cttcagcctc cagtacagcg 5460cggctgaaat catcattaaa
gcgagtggca acatggaaat cgctgatttg tgtagtcggt 5520ttatgcagca acgagacgtc
acggaaaatg ccgctcatcc gccacatatc ctgatcttcc 5580agataactgc cgtcactcca
acgcagcacc atcaccgcga ggcggttttc tccggcgcgt 5640aaaaatgcgc tcaggtcaaa
ttcagacggc aaacgactgt cctggccgta accgacccag 5700cgcccgttgc accacagatg
aaacgccgag ttaacgccat caaaaataat tcgcgtctgg 5760ccttcctgta gccagctttc
atcaacatta aatgtgagcg agtaacaacc cgtcggattc 5820tccgtgggaa caaacggcgg
attgaccgta atgggatagg ttacgttggt gtagatgggc 5880gcatcgtaac cgtgcatctg
ccagtttgag gggacgacga cgggatccgt ttttttatta 5940caaaactgtt acgaaaacag
taaaatactt atttattcgg accaacaatg tttattctta 6000cctctaatag tcctctgtgg
caaggtcaag attctgttag aagccaatga agaacctggt 6060tgttcaataa cattttgttc
gtctaatatt tcactacgct tgacgttggc tgacacttca 6120tgtacctcat ctataaacgc
ttcttctgta tcgctctgga cgtcttcact tacgtgatct 6180gatatttcac tgtcagaatc
ctcaccaaca agctcgtcat cgccttgcag aagagcagag 6240aggatatgct catcgtctaa
agaacatccc attttattat atattagtca cgatatctat 6300aacaagaaaa tatatatata
ataagttatc acgtaagtag aacatgaaat aacaatatta 6360attatcgtat gagttaaatc
ttaaaagtca cgtaaaagat aatcatgcgt cattttgact 6420cacgcggtcg ttatagttca
aaatcagtga cacttaccgc attgacaagc acgcctcagc 6480cgagctccaa gcggcgactg
agatgtccta aattgcaaac agcgacggat tcgcgctatt 6540tagaaagaga gagcaatatt
tcaagaatgc atgcgtcaat tttacgcaga ctatctttct 6600agggttaatc tagaggatcc
tctagattaa ccctagaaag ataatcatat tgtgacgtac 6660gttaaagata atcatgcgta
aaattgacgc atgtgttttt atcggtctgt atatcgaggt 6720ttatttatta atttgaatag
atattaagtt ttattatatt tacacttaca tactaataat 6780aaattcaaca aacaatttat
ttatgtttat ttatttatta aaaaaaaaca aaaactcaaa 6840atttcttcta aagtaacaaa
acttttaaac attctctctt ttacaaaaat aaacttattt 6900tgtactttaa aaacagtcat
gttgtattat aaaataagta attagcttaa cttatacata 6960atagaaacaa attatactta
ttagtcagtc cagaaacaac tttggcacat atcaatatta 7020tgctctcgac aaataacttt
tttgcatttt ttgcacgatg catttgcctt tcgccttatt 7080ttagaggggc agtaagtaca
gtaagtacgt tttttcatta ctggctcttc agtactgtca 7140tctgatgtac caggcacttc
atttggcaaa atattagaga tattatcgcg caaatatctc 7200ttcaaagtag gagcttctaa
acggttacgc ataaacgatg acgtcaggct catgtaaagg 7260tttctcataa attttttgcg
actttgaacc ttttctccct tgctactgac attatggctg 7320tatataataa aagaatttat
gcaggcaatg tttatcattc cgtacaataa tgccataggc 7380cacctattcg tcttcctact
gcaggtcatc acagaacaca tttggtctag cgtgtccact 7440ccgcctttag tttgattata
atacataacc atttgcggtt taccggtact ttcgttgata 7500gaagcatcct catcacaaga
tgataataag tataccatct tagctggctt cggtttatat 7560gagacgagag taaggggtcc
gtcaaaacaa aacatcgatg ttcccactgg cctggagcga 7620ctgtttttca gtacttccgg
tatctcgcgt ttgtttgatc gcacggtacc 7670
58 286 PRT Artificial Sequence Description of Artificial Sequence Synthetic protein 58
Met
Ser Ile Gln His Phe Arg Val Ala Leu Ile Pro Phe Phe Ala Ala 1
5 10 15 Phe Cys Leu Pro Val Phe
Ala His Pro Glu Thr Leu Val Lys Val Lys 20
25 30 Asp Ala Glu Asp Gln Leu Gly Ala Arg Val
Gly Tyr Ile Glu Leu Asp 35 40
45 Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg Pro Glu Glu
Arg Phe 50 55 60
Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys Gly Ala Val Leu Ser 65
70 75 80 Arg Ile Asp Ala Gly
Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser 85
90 95 Gln Asn Asp Leu Val Glu Tyr Ser Pro Val
Thr Glu Lys His Leu Thr 100 105
110 Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala Ile Thr Met
Ser 115 120 125 Asp
Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly Pro Lys 130
135 140 Glu Leu Thr Ala Phe Leu
His Asn Met Gly Asp His Val Thr Arg Leu 145 150
155 160 Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile
Pro Asn Asp Glu Arg 165 170
175 Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu
Leu 180 185 190 Thr
Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp 195
200 205 Met Glu Ala Asp Lys Val
Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro 210 215
220 Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala
Gly Glu Arg Gly Ser 225 230 235
240 Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile
245 250 255 Val Val
Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn 260
265 270 Arg Gln Ile Ala Glu Ile Gly
Ala Ser Leu Ile Lys His Trp 275 280
285
59 7 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide 59
Asn Phe Lys Val His Glu Arg 1 5
60 4 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide 60
Cys Trp Ser Glu 1
61 7 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide 61
Asn Gly Met Phe Phe Arg Arg 1 5
62 229 PRT Artificial Sequence Description of Artificial Sequence Synthetic protein 62
Gly
Thr Val Arg Ser Asn Lys Arg Glu Ile Pro Glu Val Leu Lys Asn 1
5 10 15 Ser Arg Ser Arg Pro Val
Gly Thr Ser Met Phe Cys Phe Asp Gly Pro 20
25 30 Leu Thr Leu Val Ser Tyr Lys Pro Lys Pro
Ala Lys Met Val Tyr Leu 35 40
45 Leu Ser Ser Cys Asp Glu Asp Ala Ser Ile Asn Glu Ser Thr
Gly Lys 50 55 60
Pro Gln Met Val Met Tyr Tyr Asn Gln Thr Lys Gly Gly Val Asp Thr 65
70 75 80 Leu Asp Gln Met Cys
Ser Val Met Thr Cys Ser Arg Lys Thr Asn Arg 85
90 95 Trp Pro Met Ala Leu Leu Tyr Gly Met Ile
Asn Ile Ala Cys Ile Asn 100 105
110 Ser Phe Ile Ile Tyr Ser His Asn Val Ser Ser Lys Gly Glu Lys
Val 115 120 125 Gln
Ser Arg Lys Lys Phe Met Arg Asn Leu Tyr Met Ser Leu Thr Ser 130
135 140 Ser Phe Met Arg Asn Arg
Leu Glu Ala Pro Thr Leu Lys Arg Tyr Leu 145 150
155 160 Arg Asp Asn Ile Ser Asn Ile Leu Pro Asn Glu
Val Pro Gly Thr Ser 165 170
175 Asp Asp Ser Thr Glu Glu Pro Val Met Lys Lys Arg Thr Tyr Cys
Thr 180 185 190 Tyr
Cys Pro Ser Lys Ile Arg Arg Lys Ala Asn Ala Ser Cys Lys Lys 195
200 205 Cys Lys Lys Val Ile Cys
Arg Glu His Asn Ile Asp Met Cys Gln Ser 210 215
220 Cys Phe Trp Thr Asp 225
63 9984 DNA Artificial Sequence Description of Artificial Sequence Synthetic nucleotide construct 63
aacgcgcggg gagaggcggt ttgcgtattg ggcgctcttc
cgcttcctcg ctcactgact 60cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc
tcactcaaag gcggtaatac 120ggttatccac agaatcaggg gataacgcag gaaagaacat
gtgagcaaaa ggccagcaaa 180aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt
ccataggctc cgcccccctg 240acgagcatca caaaaatcga cgctcaagtc agaggtggcg
aaacccgaca ggactataaa 300gataccaggc gtttccccct ggaagctccc tcgtgcgctc
tcctgttccg accctgccgc 360ttaccggata cctgtccgcc tttctccctt cgggaagcgt
ggcgctttct caatgctcac 420gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa
gctgggctgt gtgcacgaac 480cccccgttca gcccgaccgc tgcgccttat ccggtaacta
tcgtcttgag tccaacccgg 540taagacacga cttatcgcca ctggcagcag ccactggtaa
caggattagc agagcgaggt 600atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa
ctacggctac actagaagga 660cagtatttgg tatctgcgct ctgctgaagc cagttacctt
cggaaaaaga gttggtagct 720cttgatccgg caaacaaacc accgctggta gcggtggttt
ttttgtttgc aagcagcaga 780ttacgcgcag aaaaaaagga tctcaagaag atcctttgat
cttttctacg gggtctgacg 840ctcagtggaa cgaaaactca cgttaaggga ttttggtcat
gagattatca aaaaggatct 900tcacctagat ccttttaaat taaaaatgaa gttttaaatc
aatctaaagt atatatgagt 960aaacttggtc tgacagttac caatgcttaa tcagtgaggc
acctatctca gcgatctgtc 1020tatttcgttc atccatagtt gcctgactcc ccgtcgtgta
gataactacg atacgggagg 1080gcttaccatc tggccccagt gctgcaatga taccgcgaga
cccacgctca ccggctccag 1140atttatcagc aataaaccag ccagccggaa gggccgagcg
cagaagtggt cctgcaactt 1200tatccgcctc catccagtct attaattgtt gccgggaagc
tagagtaagt agttcgccag 1260ttaatagttt gcgcaacgtt gttgccattg ctacaggcat
cgtggtgtca cgctcgtcgt 1320ttggtatggc ttcattcagc tccggttccc aacgatcaag
gcgagttaca tgatccccca 1380tgttgtgcaa aaaagcggtt agctccttcg gtcctccgat
cgttgtcaga agtaagttgg 1440ccgcagtgtt atcactcatg gttatggcag cactgcataa
ttctcttact gtcatgccat 1500ccgtaagatg cttttctgtg actggtgagt actcaaccaa
gtcattctga gaatagtgta 1560tgcggcgacc gagttgctct tgcccggcgt caatacggga
taataccgcg ccacatagca 1620gaactttaaa agtgctcatc attggaaaac gttcttcggg
gcgaaaactc tcaaggatct 1680taccgctgtt gagatccagt tcgatgtaac ccactcgtgc
acccaactga tcttcagcat 1740cttttacttt caccagcgtt tctgggtgag caaaaacagg
aaggcaaaat gccgcaaaaa 1800agggaataag ggcgacacgg aaatgttgaa tactcatact
cttccttttt caatattatt 1860gaagcattta tcagggttat tgtctcatga gcggatacat
atttgaatgt atttagaaaa 1920ataaacaaat aggggttccg cgcacatttc cccgaaaagt
gccacctgac gtctaagaaa 1980ccattattat catgacatta acctataaaa ataggcgtat
cacggggccc tgaggtgaac 2040caattgtcac acgtaatatt acgacaacta ccgtgcacag
gctttgataa ctccttcacg 2100tagtattcac cgagtggtac tccgttggtc tgtgttcctc
ttcccaaata aggcattcca 2160tttatcatat acttcgtacc actgtcacac atcatgagga
tttttattcc atacttactt 2220ggcttgtttg ggatatacat cctaaacgga caccgtcctc
taaaaccaag taactgttca 2280tctatggtca aatgagcccc tggagtgtaa ttttgtatgc
actgatggat aaagagatcc 2340catatttttc taacaggagt aaatacatcg ttttctcgaa
gtgtgggccg tatacttttg 2400tcatccattc taagacatcg tatcaaaaaa tccaaaacga
tccacagact cattacagag 2460acgtacacat tgacaaagat cgatccaaag aggtcatctg
tggacatgtg gttatctttt 2520ctcactgctg tcattaccag aataccaaag aaagcataga
tttcatcttc attcgtgtca 2580cgaaatgtag cacctgtcat agattcccga cgtttcaatg
atatctcagc atttgtccat 2640tttacaattt gcgaaattat ctcatcagta aaaaatagtt
tgaagcataa aagtgggtca 2700tatatattgc ggcacatacg cgtcggacct ctttgagatc
tgacaatgtt cagtgcagag 2760actcggctac cgctcgtgga ctttgaagtt aaattcagat
ataaagacgc tgaaaatcat 2820ttgattttcg ctctaacata ccaccctaaa gattataaat
ttaatgaatt attaaaatac 2880gtacaacaat tgtctgtaaa tcaacaacgc acagaatcta
gcgcttaata aatgtactaa 2940taacaatgta tcgtgtttta atacgccgga ccagtgaaca
gaggtgcgtc tggtgcaaac 3000tcctttactt tgaacaccag ggaaacttca aggagaattt
cctcctcttc agcagagtcg 3060gtaccggtca cccggggatc ccccctgccc ggttattatt
atttttgaca ccagaccaac 3120tggtaatggt agcgaccggc gctcagctgg aattccgccg
atactgacgg gctccaggag 3180tcgtcgccac caatccccat atggaaaccg tcgatattca
gccatgtgcc ttcttccgcg 3240tgcagcagat ggcgatggct ggtttccatc agttgctgtt
gactgtagcg gctgatgttg 3300aactggaagt cgccgcgcca ctggtgtggg ccataattca
attcgcgcgt cccgcagcgc 3360agaccgtttt cgctcgggaa gacgtacggg gtatacatgt
ctgacaatgg cagatcccag 3420cggtcaaaac aggcggcagt aaggcggtcg ggatagtttt
cttgcggccc taatccgagc 3480cagtttaccc gctctgctac ctgcgccagc tggcagttca
ggccaatccg cgccggatgc 3540ggtgtatcgc tcgccacttc aacatcaacg gtaatcgcca
tttgaccact accatcaatc 3600cggtaggttt tccggctgat aaataaggtt ttcccctgat
gctgccacgc gtgagcggtc 3660gtaatcagca ccgcatcagc aagtgtatct gccgtgcact
gcaacaacgc tgcttcggcc 3720tggtaatggc ccgccgcctt ccagcgttcg acccaggcgt
tagggtcaat gcgggtcgct 3780tcacttacgc caatgtcgtt atccagcggt gcacgggtga
actgatcgcg cagcggcgtc 3840agcagttgtt ttttatcgcc aatccacatc tgtgaaagaa
agcctgactg gcggttaaat 3900tgccaacgct tattacccag ctcgatgcaa aaatccattt
cgctggtggt cagatgcggg 3960atggcgtggg acgcggcggg gagcgtcaca ctgaggtttt
ccgccagacg ccactgctgc 4020caggcgctga tgtgcccggc ttctgaccat gcggtcgcgt
tcggttgcac tacgcgtact 4080gtgagccaga gttgcccggc gctctccggc tgcggtagtt
caggcagttc aatcaactgt 4140ttaccttgtg gagcgacatc cagaggcact tcaccgcttg
ccagcggctt accatccagc 4200gccaccatcc agtgcaggag ctcgttatcg ctatgacgga
acaggtattc gctggtcact 4260tcgatggttt gcccggataa acggaactgg aaaaactgct
gctggtgttt tgcttccgtc 4320agcgctggat gcggcgtgcg gtcggcaaag accagaccgt
tcatacagaa ctggcgatcg 4380ttcggcgtat cgccaaaatc accgccgtaa gccgaccacg
ggttgccgtt ttcatcatat 4440ttaatcagcg actgatccac ccagtcccag acgaagccgc
cctgtaaacg gggatactga 4500cgaaacgcct gccagtattt agcgaaaccg ccaagactgt
tacccatcgc gtgggcgtat 4560tcgcaaagga tcagcgggcg cgtctctcca ggtagcgaaa
gccatttttt gatggaccat 4620ttcggcacag ccgggaaggg ctggtcttca tccacgcgcg
cgtacatcgg gcaaataata 4680tcggtggccg tggtgtcggc tccgccgcct tcatactgca
ccgggcggga aggatcgaca 4740gatttgatcc agcgatacag cgcgtcgtga ttagcgccgt
ggcctgattc attccccagc 4800gaccagatga tcacactcgg gtgattacga tcgcgctgca
ccattcgcgt tacgcgttcg 4860ctcatcgccg gtagccagcg cggatcatcg gtcagacgat
tcattggcac catgccgtgg 4920gtttcaatat tggcttcatc caccacatac aggccgtagc
ggtcgcacag cgtgtaccac 4980agcggatggt tcggataatg cgaacagcgc acggcgttaa
agttgttctg cttcatcagc 5040aggatatcct gcaccatcgt ctgctcatcc atgacctgac
catgcagagg atgatgctcg 5100tgacggttaa cgcctcgaat cagcaacggc ttgccgttca
gcagcagcag accattttca 5160atccgcacct cgcggaaacc gacatcgcag gcttctgctt
caatcagcgt gccgtcggcg 5220gtgtgcagtt caaccaccgc acgatagaga ttcgggattt
cggcgctcca cagtttcggg 5280ttttcgacgt tcagacgtag tgtgacgcga tcggcataac
caccacgctc atcgataatt 5340tcaccgccga aaggcgcggt gccgctggcg acctgcgttt
caccctgcca taaagaaact 5400gttacccgta ggtagtcacg caactcgccg cacatctgaa
cttcagcctc cagtacagcg 5460cggctgaaat catcattaaa gcgagtggca acatggaaat
cgctgatttg tgtagtcggt 5520ttatgcagca acgagacgtc acggaaaatg ccgctcatcc
gccacatatc ctgatcttcc 5580agataactgc cgtcactcca acgcagcacc atcaccgcga
ggcggttttc tccggcgcgt 5640aaaaatgcgc tcaggtcaaa ttcagacggc aaacgactgt
cctggccgta accgacccag 5700cgcccgttgc accacagatg aaacgccgag ttaacgccat
caaaaataat tcgcgtctgg 5760ccttcctgta gccagctttc atcaacatta aatgtgagcg
agtaacaacc cgtcggattc 5820tccgtgggaa caaacggcgg attgaccgta atgggatagg
ttacgttggt gtagatgggc 5880gcatcgtaac cgtgcatctg ccagtttgag gggacgacga
cgggatccgt ttttttatta 5940caaaactgtt acgaaaacag taaaatactt atttattcgg
accaacaatg tttattctta 6000cctctaatag tcctctgtgg caaggtcaag attctgttag
aagccaatga agaacctggt 6060tgttcaataa cattttgttc gtctaatatt tcactacgct
tgacgttggc tgacacttca 6120tgtacctcat ctataaacgc ttcttctgta tcgctctgga
cgtcttcact tacgtgatct 6180gatatttcac tgtcagaatc ctcaccaaca agctcgtcat
cgccttgcag aagagcagag 6240aggatatgct catcgtctaa agaacatccc attttattat
atattagtca cgatatctat 6300aacaagaaaa tatatatata ataagttatc acgtaagtag
aacatgaaat aacaatatta 6360attatcgtat gagttaaatc ttaaaagtca cgtaaaagat
aatcatgcgt cattttgact 6420cacgcggtcg ttatagttca aaatcagtga cacttaccgc
attgacaagc acgcctcagc 6480cgagctccaa gcggcgactg agatgtccta aattgcaaac
agcgacggat tcgcgctatt 6540tagaaagaga gagcaatatt tcaagaatgc atgcgtcaat
tttacgcaga ctatctttct 6600agggttaatc tagcttttct aatttaacct ttgtcaggtt
accaactact aaggttgtag 6660gctcaagagg gtgtgtcctg tcgtaggtaa ataactgacc
tgtcgagctt aatattctat 6720attgttgttc tttctgcaaa aaagtgggga agtgagtaat
gaaattattt ctaacattta 6780tctgcatcat accttccgag catttattaa gcatttcgct
ataagttctc gctggaagag 6840gtagtttttt cattgtactt taccttcatc tctgttcatt
atcatcgctt ttaaaacggt 6900tcgaccttct aatcctatct gaccattata attttttaga
atggtttcat aagaaagctc 6960tgaatcaacg gactgcgata ataagtggtg gtatccagaa
tttgtcactt caagtaaaaa 7020cacctcacga gttaaaacac ctaagttctc accgaatgtc
tcaatatccg gacggataat 7080atttattgct tctcttgacc gtaggacttt ccacatgcag
gattttggaa cctcttgcag 7140tactactggg gaatgagttg caattattgc tacaccattg
cgtgcatcga gtaagtcgct 7200taatgttcgt aaaaaagcag agagcaaagg tggatgcaga
tgaacctctg gttcatcgaa 7260taaaactaat gacttttcgc caacgacatc tactaatctt
gtgatagtaa ataaaacaat 7320tgcatgtcca gagctcattc gaagcagata tttctggata
ttgtcataaa acaatttagt 7380gaatttatca tcgtccactt gaatctgtgg ttcattacgt
cttaactctt catatttaga 7440aatgaggctg atgagttcca tatttgaaaa gttttcatca
ctacttagtt ttttgatagc 7500ttcaagccag agttgtcttt ttctatctac tctcatacaa
ccaataaatg ctgaaatgaa 7560ttctaagcgg agatcgccta gtgattttaa actattgctg
gcagcattct tgagtccaat 7620ataaaagtat tgtgtacctt ttgctgggtc aggttgttct
ttaggaggag taaaaggatc 7680aaatgcacta aacgaaactg aaacaagcga tcgaaaatat
ccctttggga ttcttgactc 7740gataagtcta ttattttcag agaaaaaata ttcattgttt
tctgggttgg tgattgcacc 7800aatcattcca ttcaaaattg ttgttttacc acacccattc
cgcccgataa aagcatgaat 7860gttcgtgctg ggcatagaat taaccgtcac ctcaaaaggt
atagttaaat cactgaatcc 7920gggagcactt tttctattaa atgaaaagtg gaaatctgac
aattctggca aaccatttaa 7980cacacgtgcg aactgtccat gaatttctga aagagttacc
cctctaagta atgaggtgtt 8040aaggacgctt tcattttcaa tgtcggctaa tcgatttggc
catactacta aatcctgaat 8100agctttaaga aggttatgtt taaaaccatc gcttaatttg
ctgagattaa catagtagtc 8160aatgctttca cctaaggaaa aaaacatttc agggagttga
ctgaattttt tatctattaa 8220tgaataagtg cttacttctt ctttttgacc tacaaaacca
attttaacat ttccgatatc 8280gcatttttca ccatgctcat caaagacagt aagataaaac
attgtaacaa aggaatagtc 8340attccaacca tctgctcgta ggaatgcctt atttttttct
actgcaggaa tatacccgcc 8400tctttcaata acactaaact ccaacatata gtaaccctta
attttattaa aataaccgca 8460atttatttgg cggcaacaca ggatctctct tttaagttac
tctctattac atacgttttc 8520catctaaaaa ttagtagtat tgaacttaac ggggcatcgt
attgtagttt tccatattta 8580gctttctgct tccttttgga taacccactg ttattcatgt
tgcatggtgc actgtttata 8640ccaacgatat agtctattaa tgcatatata gtatcgccga
acgattagct cttcaggctt 8700ctgaagaagc gtttcaagta ctaataagcc gatagatagc
cacggacttc gtagccattt 8760ttcataagtg ttaacttccg ctcctcgctc ataacagaca
ttcactacag ttatggcgga 8820aaggtatgca tgctgggtgt ggggaagtcg tgaaagaaaa
gaagtcagct gcgtcgtttg 8880acatcactgc tatcttctta ctggttatgc aggtcgtagt
gggtggcaca caaagctaga 8940ttaaccctag aaagataatc atattgtgac gtacgttaaa
gataatcatg cgtaaaattg 9000acgcatgtgt ttttatcggt ctgtatatcg aggtttattt
attaatttga atagatatta 9060agttttatta tatttacact tacatactaa taataaattc
aacaaacaat ttatttatgt 9120ttatttattt attaaaaaaa aacaaaaact caaaatttct
tctaaagtaa caaaactttt 9180aaacattctc tcttttacaa aaataaactt attttgtact
ttaaaaacag tcatgttgta 9240ttataaaata agtaattagc ttaacttata cataatagaa
acaaattata cttattagtc 9300agtccagaaa caactttggc acatatcaat attatgctct
cgacaaataa cttttttgca 9360ttttttgcac gatgcatttg cctttcgcct tattttagag
gggcagtaag tacagtaagt 9420acgttttttc attactggct cttcagtact gtcatctgat
gtaccaggca cttcatttgg 9480caaaatatta gagatattat cgcgcaaata tctcttcaaa
gtaggagctt ctaaacggtt 9540acgcataaac gatgacgtca ggctcatgta aaggtttctc
ataaattttt tgcgactttg 9600aaccttttct cccttgctac tgacattatg gctgtatata
ataaaagaat ttatgcaggc 9660aatgtttatc attccgtaca ataatgccat aggccaccta
ttcgtcttcc tactgcaggt 9720catcacagaa cacatttggt ctagcgtgtc cactccgcct
ttagtttgat tataatacat 9780aaccatttgc ggtttaccgg tactttcgtt gatagaagca
tcctcatcac aagatgataa 9840taagtatacc atcttagctg gcttcggttt atatgagacg
agagtaaggg gtccgtcaaa 9900acaaaacatc gatgttccca ctggcctgga gcgactgttt
ttcagtactt ccggtatctc 9960gcgtttgttt gatcgcacgg tacc
9984
64 10 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide 64
Met Leu Gly Arg Tyr Asp Ala Asp Lys Cys 1 5 10
65 8 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide 65
Val Tyr Ser Cys Ser Arg Lys Lys 1 5
66 229 PRT Artificial Sequence Description of Artificial Sequence Synthetic protein 66
Gly Thr Val Arg Ser Asn Lys Arg Glu Ile Pro Glu Val Leu Lys
Asn 1 5 10 15 Ser
Arg Ser Arg Pro Val Gly Thr Ser Met Phe Cys Phe Asp Gly Pro
20 25 30 Leu Thr Leu Val Ser
Tyr Lys Pro Lys Pro Ala Lys Met Val Tyr Leu 35
40 45 Leu Ser Ser Cys Asp Glu Asp Ala Ser
Ile Asn Glu Ser Thr Gly Lys 50 55
60 Pro Gln Met Val Met Tyr Tyr Asn Gln Thr Lys Gly Gly
Val Asp Thr 65 70 75
80 Leu Asp Gln Met Cys Ser Val Met Thr Cys Ser Arg Lys Thr Asn Arg
85 90 95 Trp Pro Met Ala
Leu Leu Tyr Gly Met Ile Asn Ile Ala Cys Ile Asn 100
105 110 Ser Phe Ile Ile Tyr Ser His Asn Val
Ser Ser Lys Gly Glu Lys Val 115 120
125 Gln Ser Arg Lys Lys Phe Met Arg Asn Leu Tyr Met Ser Leu
Thr Ser 130 135 140
Ser Phe Met Arg Asn Arg Leu Glu Ala Pro Thr Leu Lys Arg Tyr Leu 145
150 155 160 Arg Asp Asn Ile Ser
Asn Ile Leu Pro Asn Glu Val Pro Gly Thr Ser 165
170 175 Asp Asp Ser Thr Glu Glu Pro Val Met
Lys Lys Arg Thr Tyr Cys Thr 180 185
190 Tyr Cys Pro Ser Lys Ile Arg Arg Lys Ala Asn Ala Ser Cys
Lys Lys 195 200 205
Cys Lys Lys Val Ile Cys Arg Glu His Asn Ile Asp Met Cys Gln Ser 210
215 220 Cys Phe Trp Thr Asp
225
67 7411 DNA Artificial Sequence Description of Artificial Sequence Synthetic nucleotide construct 67
ctaaattgta agcgttaata
ttttgttaaa attcgcgtta aatttttgtt aaatcagctc 60attttttaac caataggccg
aaatcggcaa aatcccttat aaatcaaaag aatagaccga 120gatagggttg agtgttgttc
cagtttggaa caagagtcca ctattaaaga acgtggactc 180caacgtcaaa gggcgaaaaa
ccgtctatca gggcgatggc ccactacgtg aaccatcacc 240ctaatcaagt tttttggggt
cgaggtgccg taaagcacta aatcggaacc ctaaagggag 300cccccgattt agagcttgac
ggggaaagcc ggcgaacgtg gcgagaaagg aagggaagaa 360agcgaaagga gcgggcgcta
gggcgctggc aagtgtagcg gtcacgctgc gcgtaaccac 420cacacccgcc gcgcttaatg
cgccgctaca gggcgcgtcc cattcgccat tcaggctgcg 480caactgttgg gaagggcgat
cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540gggatgtgct gcaaggcgat
taagttgggt aacgccaggg ttttcccagt cacgacgttg 600taaaacgacg gccagtgagc
gcgcgtaata cgactcacta tagggcgaat tgggtaccgg 660gccccccctc gaggtcgacg
gtatcgataa gcttgatatc gaattctaaa aaaaatcatg 720aatggcatca actctgaatc
aaatctttgc agatgcacct acttctcatt tccactgtca 780catcattttt ccagatctcg
ctgcctgtta tgtggcccac aaaccaagac acgttttatg 840gccattaaag ctggctgatc
gtcgccaaac accaaataca tatcaatatg tacattcgag 900aaagaagcga tcaaagaagc
gtcttcgggc gagtaggaga atgcggagga gaaggagaac 960gagctgatct agtatctctc
cacaatccaa tgccaactga ccaactggcc atattcggag 1020caatttgaag ccaatttcca
tcgcctggcg atcgctccat tcttggctat atgtttttca 1080ccgttcccgg ggccattttc
aaagactcgt cggtaagata agattgtgtc actcgctgtc 1140tctcttcatt tgtcgaagaa
tgctgaggaa tttcgcgatg acgtcggcga gtattttgaa 1200gaatgagaat aatttgtatt
tatacgaaaa tcagttagtg gaattttcta caaaaacatg 1260ttatctatag ataattttgt
tgcaaaatat gttgactatg acaaagattg tatgtatata 1320cctttaatgt attctcattt
tcttatgtat ttataatggc aatgatgata ctgatgatat 1380tttaagatga tgccagacca
caggctgatt tctgcgtctt ttgccgaacg cagtgcatgt 1440gcggttgttg ttttttggaa
tagtttcaat tttcggactg tccgctttga tttcagtttc 1500ttggcttatt caaaaagcaa
agtaaagcca aaaaagcgag atggcaatac caaatgcggc 1560aaaacggtag tggaaggaaa
ggggtgcggg gcagcggaag gaagggtggg gcggggcgtg 1620gcggggtctg tggctgggcg
cgacgtcacc gacgttggag ccactccttt gaccatgtgt 1680gcgtgtgtgt attattcgtg
tctcgccact cgccggttgt ttttttcttt ttatctcgct 1740ctctctagcg ccatctcgta
cgcatgctca acgcaccgca tgttgccgtg tcctttatgc 1800gtcattttgg ctcgaaatag
gcaattattt aaacaaagat tagtcaacga aaacgctaaa 1860ataaataagt ctacaatatg
gttacttatt gccatgtgtg tgcagccaac gatagcaaca 1920aaagcaacaa cacagtggct
ttccctcttt cactttttgt ttgcaagcgc gtgcgagcaa 1980gacggcacga ccggcaaacg
caattacgct gacaaagagc agacgaagtt ttggccgaaa 2040aacatcaagg cgcctgatac
gaatgcattt gcaataacaa ttgcgatatt taatattgtt 2100tatgaagctg tttgacttca
aaacacacaa aaaaaaaaat aaaacaaatt atttgaaaga 2160gaattaggaa tcggacagct
tatcgttacg ggctaacagc acaccgagac gaaatagctt 2220acctgacgtc acagcctctg
gaagaactgc cgccaagcag acgatgcaga ggacgacaca 2280tagagtagcg gagtaggcca
gcgtagtacg catgtgcttg tgtgtgaggc gtctctctct 2340tcgtctcctg tttgcgcaaa
cgcatagact gcactgagaa aatcgattac ctatttttta 2400tgaatgaata tttgcactat
tactattcaa aactattaag atagcaatca cattcaatag 2460ccaaatacta taccacctga
gcgatgcaac gaaatgatca atttgagcaa aaatgctgca 2520tatttaggac ggcatcatta
tagaaatgct tcttgctgtg tacttttctc tcgtctggca 2580gctgtttcgc cgttattgtt
aaaaccggct taagttaggt gtgttttcta cgactagtga 2640tgcccctact agaagatgtg
tgttgcacaa atgtccctga ataaccaatt tgaagtgcag 2700atagcagtaa acgtaagcta
atatgaatat tatttaactg taatgtttta atatcgctgg 2760acattactaa taaacccact
ataaacacat gtacatatgt atgttttggc atacaatgag 2820tagttgggga aaaaatgtgt
aaaagcaccg tgaccatcac agcataaaga taaccagctg 2880aagtatcgaa tatgagtaac
ccccaaattg aatcacatgc cgcaactgat aggacccatg 2940gaagtacact cttcatggcg
atatacaaga cacacacaag cacgaacacc cagttgcgga 3000ggaaattctc cgtaaatgaa
aacccaatcg gcgaacaatt catacccata tatggtaaaa 3060gttttgaacg cgacttgaga
gcggagagca ttgcggctga taaggtttta gcgctaagcg 3120ggctttataa aacgggctgc
gggaccagtt ttcatatcgg atcctatata ataaaatggg 3180tagttcttta gacgatgagc
atatcctctc tgctcttctg caaagcgatg acgagcttgt 3240tggtgaggat tctgacagtg
aaatatcaga tcacgtaagt gaagatgacg tccagagcga 3300tacagaagaa gcgtttatag
atgaggtaca tgaagtgcag ccaacgtcaa gcggtagtga 3360aatattagac gaacaaaatg
ttattgaaca accaggttct tcattggctt ctaacagaat 3420cttgaccttg ccacagagga
ctattagagg taagaataaa cattgttggt caacttcaaa 3480gtccacgagg cgtagccgag
tctctgcact gaacattgtc agatctcaaa gaggtccgac 3540gcgtatgtgc cgcaatatat
atgacccact tttatgcttc aaactatttt ttactgatga 3600gataatttcg gaaattgtaa
aatggacaaa tgctgagata tcattgaaac gtcgggaatc 3660tatgacaggt gctacatttc
gtgacacgaa tgaagatgaa atctatgctt tctttggtat 3720tctggtaatg acagcagtga
gaaaagataa ccacatgtcc acagatgacc tctttgatcg 3780atctttgtca atggtgtacg
tctctgtaat gagtcgtgat cgttttgatt ttttgatacg 3840atgtcttaga atggatgaca
aaagtatacg gcccacactt cgagaaaacg atgtatttac 3900tcctgttaga aaaatatggg
atctctttat ccatcagtgc atacaaaatt acactccagg 3960ggctcatttg accatagatg
aacagttact tggttttaga ggacggtgtc cgtttaggat 4020gtatatccca aacaagccaa
gtaagtatgg aataaaaatc ctcatgatgt gtgacagtgg 4080tacgaagtat atgataaatg
gaatgcctta tttgggaaga ggaacacaga ccaacggagt 4140accactcggt gaatactacg
tgaaggagtt atcaaagcct gtgcacggta gttgtcgtaa 4200tattacgtgt gacaattggt
tcacctcaat ccctttggca aaaaacttac tacaagaacc 4260gtataagtta accattgtgg
gaaccgtgcg atcaaacaaa cgcgagatac cggaagtact 4320gaaaaacagt cgctccaggc
cagtgggaac atcgatgttt tgttttgacg gaccccttac 4380tctcgtctca tataaaccga
agccagctaa gatggtatac ttattatcat cttgtgatga 4440ggatgcttct atcaacgaaa
gtaccggtaa accgcaaatg gttatgtatt ataatcaaac 4500taaaggcgga gtggacacgc
tagaccaaat gtgttctgtg atgacctgca gtaggaagac 4560gaataggtgg cctatggcat
tattgtacgg aatgataaac attgcctgca taaattcttt 4620tattatatac agccataatg
tcagtagcaa gggagaaaag gttcaaagtc gcaaaaaatt 4680tatgagaaac ctttacatga
gcctgacgtc atcgtttatg cgtaagcgtt tagaagctcc 4740tactttgaag agatatttgc
gcgataatat ctctaatatt ttgccaaatg aagtgcctgg 4800tacatcagat gacagtactg
aagagccagt aatgaaaaaa cgtacttact gtacttactg 4860cccctctaaa ataaggcgaa
aggcaaatgc atcgtgcaaa aaatgcaaaa aagttatttg 4920tcgagagcat aatattgata
tgtgccaaag ttgtttctga ctgactaata agtataattt 4980gtttctatta tgtataagtt
aagctaatta cttattttat aatacaacat gactgttttt 5040aaagtacaaa ataagtttat
ttttgtaaaa gagagaatgt ttaaaagttt tgttacttta 5100gaagaaattt tgagtttttg
ttttttttta ataaataaat aaacataaat aaattgtttg 5160ttgaatttgg atccactagt
tctagagcgg ccgccaccgc ggtggagctc cagcttttgt 5220tccctttagt gagggttaat
tgcgcgcttg gcgtaatcat ggtcatagct gtttcctgtg 5280tgaaattgtt atccgctcac
aattccacac aacatacgag ccggaagcat aaagtgtaaa 5340gcctggggtg cctaatgagt
gagctaactc acattaattg cgttgcgctc actgcccgct 5400ttccagtcgg gaaacctgtc
gtgccagctg cattaatgaa tcggccaacg cgcggggaga 5460ggcggtttgc gtattgggcg
ctcttccgct tcctcgctca ctgactcgct gcgctcggtc 5520gttcggctgc ggcgagcggt
atcagctcac tcaaaggcgg taatacggtt atccacagaa 5580tcaggggata acgcaggaaa
gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt 5640aaaaaggccg cgttgctggc
gtttttccat aggctccgcc cccctgacga gcatcacaaa 5700aatcgacgct caagtcagag
gtggcgaaac ccgacaggac tataaagata ccaggcgttt 5760ccccctggaa gctccctcgt
gcgctctcct gttccgaccc tgccgcttac cggatacctg 5820tccgcctttc tcccttcggg
aagcgtggcg ctttctcata gctcacgctg taggtatctc 5880agttcggtgt aggtcgttcg
ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc 5940gaccgctgcg ccttatccgg
taactatcgt cttgagtcca acccggtaag acacgactta 6000tcgccactgg cagcagccac
tggtaacagg attagcagag cgaggtatgt aggcggtgct 6060acagagttct tgaagtggtg
gcctaactac ggctacacta gaaggacagt atttggtatc 6120tgcgctctgc tgaagccagt
taccttcgga aaaagagttg gtagctcttg atccggcaaa 6180caaaccaccg ctggtagcgg
tggttttttt gtttgcaagc agcagattac gcgcagaaaa 6240aaaggatctc aagaagatcc
tttgatcttt tctacggggt ctgacgctca gtggaacgaa 6300aactcacgtt aagggatttt
ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt 6360ttaaattaaa aatgaagttt
taaatcaatc taaagtatat atgagtaaac ttggtctgac 6420agttaccaat gcttaatcag
tgaggcacct atctcagcga tctgtctatt tcgttcatcc 6480atagttgcct gactccccgt
cgtgtagata actacgatac gggagggctt accatctggc 6540cccagtgctg caatgatacc
gcgagaccca cgctcaccgg ctccagattt atcagcaata 6600aaccagccag ccggaagggc
cgagcgcaga agtggtcctg caactttatc cgcctccatc 6660cagtctatta attgttgccg
ggaagctaga gtaagtagtt cgccagttaa tagtttgcgc 6720aacgttgttg ccattgctac
aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca 6780ttcagctccg gttcccaacg
atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa 6840gcggttagct ccttcggtcc
tccgatcgtt gtcagaagta agttggccgc agtgttatca 6900ctcatggtta tggcagcact
gcataattct cttactgtca tgccatccgt aagatgcttt 6960tctgtgactg gtgagtactc
aaccaagtca ttctgagaat agtgtatgcg gcgaccgagt 7020tgctcttgcc cggcgtcaat
acgggataat accgcgccac atagcagaac tttaaaagtg 7080ctcatcattg gaaaacgttc
ttcggggcga aaactctcaa ggatcttacc gctgttgaga 7140tccagttcga tgtaacccac
tcgtgcaccc aactgatctt cagcatcttt tactttcacc 7200agcgtttctg ggtgagcaaa
aacaggaagg caaaatgccg caaaaaaggg aataagggcg 7260acacggaaat gttgaatact
catactcttc ctttttcaat attattgaag catttatcag 7320ggttattgtc tcatgagcgg
atacatattt gaatgtattt agaaaaataa acaaataggg 7380gttccgcgca catttccccg
aaaagtgcca c 74116810330DNAArtificial
SequenceDescription of Artificial Sequence Synthetic nucleotide
construct 68aagcttgggc tgcaggtcga cggatccaaa ttcaacaaac aatttattta
tgtttattta 60tttattaaaa aaaaacaaaa actcaaaatt tcttctaaag taacaaaact
tttaaacatt 120ctctctttta caaaaataaa cttattttgt actttaaaaa cagtcatgtt
gtattataaa 180ataagtaatt agcttaactt atacataata gaaacaaatt atacttatta
gtcagtcaga 240aacaactttg gcacatatca atattatgct ctcgacaaat aacttttttg
cattttttgc 300acgatgcatt tgcctttcgc cttattttag aggggcagta agtacagtaa
gtacgttttt 360tcattactgg ctcttcagta ctgtcatctg atgtaccagg cacttcattt
ggcaaaatat 420tagagatatt atcgcgcaaa tatctcttca aagtaggagc ttctaaacgc
ttacgcataa 480acgatgacgt caggctcatg taaaggtttc tcataaattt tttgcgactt
tgaacctttt 540ctcccttgct actgacatta tggctgtata taataaaaga atttatgcag
gcaatgttta 600tcattccgta caataatgcc ataggccacc tattcgtctt cctactgcag
gtcatcacag 660aacacatttg gtctagcgtg tccactccgc ctttagtttg attataatac
ataaccattt 720gcggtttacc ggtactttcg ttgatagaag catcctcatc acaagatgat
aataagtata 780ccatcttagc tggcttcggt ttatatgaga cgagagtaag gggtccgtca
aaacaaaaca 840tcgatgttcc cactggcctg gagcgactgt ttttcagtac ttccggtatc
tcgcgtttgt 900ttgatcgcac ggttcccaca atggttaact tatacggttc ttgtagtaag
ttttttgcca 960aagggattga ggtgaaccaa ttgtcacacg taatattacg acaactaccg
tgcacaggct 1020ttgataactc cttcacgtag tattcaccga gtggtactcc gttggtctgt
gttcctcttc 1080ccaaataagg cattccattt atcatatact tcgtaccact gtcacacatc
atgaggattt 1140ttattccata cttacttggc ttgtttggga tatacatcct aaacggacac
cgtcctctaa 1200aaccaagtaa ctgttcatct atggtcaaat gagcccctgg agtgtaattt
tgtatgcact 1260gatggataaa gagatcccat atttttctaa caggagtaaa tacatcgttt
tctcgaagtg 1320tgggccgtat acttttgtca tccattctaa gacatcgtat caaaaaatca
aaacgatcac 1380gactcattac agagacgtac accattgaca aagatcgatc aaagaggtca
tctgtggaca 1440tgtggttatc ttttctcact gctgtcatta ccagaatacc aaagaaagca
tagatttcat 1500cttcattcgt gtcacgaaat gtagcacctg tcatagattc ccgacgtttc
aatgatatct 1560cagcatttgt ccattttaca atttccgaaa ttatctcatc agtaaaaaat
agtttgaagc 1620ataaaagtgg gtcatatata ttgcggcaca tacgcgtcgg acctctttga
gatctgacaa 1680tgttcagtgc agagactcgg ctacgcctcg tggactttga agttgaccaa
caatgtttat 1740tcttacctct aatagtcctc tgtggcaagg tcaagattct gttagaagcc
aatgaagaac 1800ctggttgttc aataacattt tgttcgtcta atatttcact accgcttgac
gttggctgca 1860cttcatgtac ctcatctata aacgcttctt ctgtatcgct ctggacgtca
tcttcactta 1920cgtgatctga tatttcactg tcagaatcct caccaacaag ctcgtcatcg
ctttgcagaa 1980gagcagagag gatatgctca tcgtctaaag aactacccat tttattatat
aggatccccg 2040acaccagacc aactggtaat ggtagcgacc ggcgctcagc tggaattagg
ccttctagac 2100cgcggccgca gatctgttaa cgaattccca attccctatt cagagttctc
ttcttgtatt 2160caataattac ttcttggcag atttcagtag ttgcagttga tttacttggt
tgctggttac 2220ttttaattga ttcactttaa cttgcacttt actgcagatt gtttagcttg
ttcagctgcg 2280cttgtttatt tgcttagctt tcgcttagcg acgtgttcac ttgcttgttt
gaattgaatt 2340gtcgctccgt agacgaagcg ctctatttat actccggcgc tcttttcgcg
aacattcgag 2400gcgcgctctc tcgaaccaac gagagcagta tgccgtttac tgtgtgacag
agtgagagag 2460cattagtgca gagagggaga cccaaaaaga aaagagagaa taacgaataa
cggccagaga 2520aatttctcga gttttcttct gccaaacaaa tgacctacca caataaccag
tttgttttgg 2580gattctaggg ggatcgggga tcaattctag tatgtatgta agttaataaa
accctttttt 2640ggagaatgta gatttaaaaa aacatatttt ttttttattt tttactgcac
tggatatcat 2700tgaacttatc tgatcagttt taaatttact tcgatccaag ggtatttgaa
gtaccaggtt 2760ctttcgatta cctctcactc aaaatgacat tccactcaaa gtcagcgctg
tttgcctcct 2820tctctgtcca cagaaatatc gccgtctctt tcgccgctgc gtccgctatc
tctttcgcca 2880ccgtttgtag cgttacctag cgtcaatgtc cgccttcagt tgcactttgt
cagcggtttc 2940gtgacgaagc tccaagcggt ttacgccatc aattaaacac aaagtgctgt
gccaaaactc 3000ctctcgcttc ttatttttgt ttgttttttg agtgattggg gtggtgattg
gttttgggtg 3060ggtaagcagg ggaaagtgtg aaaaatcccg gcaatgggcc aagaggatca
ggagctatta 3120attcgcggag gcagcaaaca cccatctgcc gagcatctga acaatgtgag
tagtacatgt 3180gcatacatct taagttcact tgatctatag gaactgcgat tgcaacatca
aattgtctgc 3240ggcgtgagaa ctgcgaccca caaaaatccc aaaccgcaat cgcacaaaca
aatagtgaca 3300cgaaacagat tattctggta gctgtgctcg ctatataaga caatttttaa
gatcatatca 3360tgatcaagac atctaaaggc attcattttc gactacattc ttttttacaa
aaaatataac 3420aaccagatat tttaagctga tcctagatgc acaaaaaata aataaaagta
taaacctact 3480tcgtaggata cttcgttttg ttcggggtta gatgagcata acgcttgtag
ttgatatttg 3540agatccccta tcattgcagg gtgacagcgg agcggcttcg cagagctgca
ttaaccaggg 3600cttcgggcag gccaaaaact acggcacgct cctgccaccc agtccgccgg
aggactccgg 3660ttcagggagc ggccaactag ccgagaacct cacctatgcc tggcacaata
tggacatctt 3720tggggcggtc aatcagccgg gctccggatg gcggcagctg gtcaaccgga
cacgcggact 3780attctgcaac gagcgacaca taccggcgcc caggaaacat ttgctcaaga
acggtgagtt 3840tctattcgca gtcggctgat ctgtgtgaaa tcttaataaa gggtccaatt
accaatttga 3900aactcagttt gcggcgtggc ctatccgggc gaacttttgg ccgtgatggg
cagttccggt 3960gccggaaaga cgaccctgct gaatgccctt gcctttcgat cgccgcaggg
catccaagta 4020tcgccatccg ggatgcgact gctcaatggc caacctgtgg acgccaagga
gatgcaggcc 4080aggtgcgcct atgtccagca ggatgacctc tttatcggct ccctaacggc
cagggaacac 4140ctgattttcc aggccatggt gcggatgcca cgacatctga cctatcggca
gcgagtggcc 4200cgcgtggatc aggtgatcca ggagctttcg ctcagcaaat gtcagcacac
gatcatcggt 4260gtgcccggca gggtgaaagg tctgtccggc ggagaaagga agcgtctggc
attcgcctcc 4320gaggcactaa ccgatccgcc gcttctgatc tgcgatgagc ccacctccgg
actggactca 4380tttaccgccc acagcgtcgt ccaggtgctg aagaagctgt cgcagaaggg
caagaccgtc 4440atcctgacca ttcatcagcc gtcttccgag ctgtttgagc tctttgacaa
gatccttctg 4500atggccgagg gcagggtagc tttcttgggc actcccagcg aagccgtcga
cttcttttcc 4560tagtgagttc gatgtgttta ttaagggtat ctagcattac attacatctc
aactcctatc 4620cagcgtgggt gcccagtgtc ctaccaacta caatccggcg gacttttacg
tacaggtgtt 4680ggccgttgtg cccggacggg agatcgagtc ccgtgatcgg atcgccaaga
tatgcgacaa 4740ttttgctatt agcaaagtag cccgggatat ggagcagttg ttggccacca
aaaatttgga 4800gaagccactg gagcagccgg agaatgggta cacctacaag gccacctggt
tcatgcagtt 4860ccgggcggtc ctgtggcgat cctggctgtc ggtgctcaag gaaccactcc
tcgtaaaagt 4920gcgacttatt cagacaacgg tgagtggttc cagtggaaac aaatgatata
acgcttacaa 4980ttcttggaaa caaattcgct agattttagt tagaattgcc tgattccaca
cccttcttag 5040tttttttcaa tgagatgtat agtttatagt tttgcagaaa ataaataaat
ttcatttaac 5100tcgcgaacat gttgaagata tgaatattaa tgagatgcga gtaacatttt
aatttgcaga 5160tggttgccat cttgattggc ctcatctttt tgggccaaca actcacgcaa
gtgggcgtga 5220tgaatatcaa cggagccatc ttcctcttcc tgaccaacat gacctttcaa
aacgtctttg 5280ccacgataaa tgtaagtctt gtttagaata catttgcata ttaataattt
actaactttc 5340taatgaatcg attcgattta ggtgttcacc tcagagctgc cagtttttat
gagggaggcc 5400cgaagtcgac tttatcgctg tgacacatac tttctgggca aaacgattgc
cgaattaccg 5460ctttttctca cagtgccact ggtcttcacg gcgattgcct atccgatgat
cggactgcgg 5520gccggagtgc tgcacttctt caactgcctg gcgctggtca ctctggtggc
caatgtgtca 5580acgtccttcg gatatctaat atcctgcgcc agctcctcga cctcgatggc
gctgtctgtg 5640ggtccgccgg ttatcatacc attcctgctc tttggcggct tcttcttgaa
ctcgggctcg 5700gtgccagtat acctcaaatg gttgtcgtac ctctcatggt tccgttacgc
caacgagggt 5760ctgctgatta accaatgggc ggacgtggag ccgggcgaaa ttagctgcac
atcgtcgaac 5820accacgtgcc ccagttcggg caaggtcatc ctggagacgc ttaacttctc
cgccgccgat 5880ctgccgctgg actacgtggg tctggccatt ctcatcgtga gcttccgggt
gctcgcatat 5940ctggctctaa gacttcgggc ccgacgcaag gagtagccga catatatccg
aaataactgc 6000ttgttttttt ttttaccatt attaccatcg tgtttactgt ttattgcccc
ctcaaaaagc 6060taatgtaatt atatttgtgc caataaaaac aagatatgac ctatagaata
caagtatttc 6120cccttcgaac atccccacaa gtagactttg gatttgtctt ctaaccaaaa
gacttacaca 6180cctgcatacc ttacatcaaa aactcgttta tcgctacata aaacaccggg
atatattttt 6240tatatacata cttttcaaat cgcgcgccct cttcataatt cacctccacc
acaccacgtt 6300tcgtagttgc tctttcgctg tctcccaccc gctctccgca acacattcac
cttttgttcg 6360acgaccttgg agcgactgtc gttagttccg cgcgattcgg ttcgctcaaa
tggttccgag 6420tggttcattt cgtctcaata gaaattagta ataaatattt gtatgtacaa
tttatttgct 6480ccaatatatt tgtatatatt tccctcacag ctatatttat tctaatttaa
tattatgact 6540ttttaaggta attttttgtg acctgttcgg agtgattagc gttacaattt
gaactgaaag 6600tgacatccag tgtttgttcc ttgtgtagat gcatctcaaa aaaatggtgg
gcataatagt 6660gttgtttata tatatcaaaa ataagaacta taataataag aatacattta
atttagaaaa 6720tgcttggatt tcactggaac tagaattaat tcggctgctg ctctaaacga
cgcatttcgt 6780actccaaagt acgaattttt tccctcaagc tcttattttc attaaacaat
gaacaggacc 6840taacgcacag tcacgttatt gtttacataa atgatttttt ttactattca
aacttactct 6900gtttgtgtac tcccactggt atagccttct tttatctttt ctggttcagg
ctctatcact 6960ttactaggta cggcatctgc gttgagtcgc ctccttttaa atgtctgacc
ttttgcaggt 7020gcagccttcc actgcgaatc tttaaagtgg gtatcacaaa tttgggagtt
ttcaccaagg 7080ctgcacccaa ggctctgctc ccacaatttt ctcttaatag cacacttcgg
cacgtgaatt 7140aattttactc cagtcacagc ttgcagcaaa atttgcaata tttcattttt
ttttattcca 7200cgtaagggtt aatgttttca aaaaaaaatt cgtccgcaca caacctttcc
tctcaacaag 7260caaacgtgca ctgaatttaa gtgtatactt cggtaagctt cggctatcga
cgggaccacc 7320ttatgttatt tcatcatggg ccagacccac gtagtccagc ggcagatcgg
cggcggagaa 7380gttaagcgtc tccaggatga ccttgcccga actggggcac gtggtgttcg
acgatgtgca 7440gctaatttcg cccggctcca cgtccgccca ttggttaatc agcagaccct
cgttggcgta 7500acggaaccat gagaggtacg acaaccattt gaggtatact ggcaccgagc
ccgagttcaa 7560gaagaaggcg tttttccata ggctccgccc ccctgacgag catcacaaaa
atcgacgctc 7620aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc
cccctggaag 7680ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt
ccgcctttct 7740cccttcggga agcgtggcgc tttctcaatg ctcacgctgt aggtatctca
gttcggtgta 7800ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg
accgctgcgc 7860cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat
cgccactggc 7920agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta
cagagttctt 7980gaagtggtgg cctaactacg gctacactag aaggacagta tttggtatct
gcgctctgct 8040gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac
aaaccaccgc 8100tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa
aaggatctca 8160agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa
actcacgtta 8220agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt
taaattaaaa 8280atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca
gttaccaatg 8340cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca
tagttgcctg 8400actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc
ccagtgctgc 8460aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa
accagccagc 8520cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc
agtctattaa 8580ttgttgccgg gaagctgagt aagtagttcg ccagttaata gtttgcgcaa
cgttgttgcc 8640attgctacag gcatcgtggt gtcacgctcg tcgtttggta tggcttcatt
cagctccggt 8700tcccaacgat caaggcgagt tacatgatcc cccatgttgt gcaaaaaagc
ggttagctcc 8760ttcggtcctc cgatcgttgt cagaagtaag ttggccgcag tgttatcact
catggttatg 8820gcagcactgc ataattctct tactgtcatg ccatccgtaa gatgcttttc
tgtgactggt 8880gagtactcaa ccaagtcatt ctgagaatag tgtatgcggc gaccgagttg
ctcttgcccg 8940gcgtcaatac gggataatac cgcgccacat agcagaactt taaaagtgct
catcattgga 9000aaacgttctt cggggcgaaa actctcaagg atcttaccgc tgttgagatc
cagttcgatg 9060taacccactc gtgcacccaa ctgatcttca gcatctttta ctttcaccag
cgtttctggg 9120tgagcaaaaa caggaaggca aaatgccgca aaaaagggaa taagggcgac
acggaaatgt 9180tgaatactca tactcttcct ttttcaatat tattgaagca tttatcaggg
ttattgtctc 9240atgagcggat acatatttga atgtatttag aaaaataaac aaataggggt
tccgcgcaca 9300tttccccgaa aagtgccacc tgacgtctaa gaaaccatta ttatcatgac
attaacctat 9360aaaaataggc gtatcacgag gccctttcgt ctcgcgcgtt tcggtgatga
cggtgaaaac 9420ctctgacaca tgcagctccc ggagacggtc acagcttgtc tgtaagcgga
tgccgggagc 9480agacaagccc gtcagggcgc gtcagcgggt gttggcgggt gtcggggctg
gcttaactat 9540gcggcatcag agcagattgt actgagagtg caccatatgc ggtgtgaaat
accgcaccga 9600atcgcgcgga actaacgaca gtcgctccaa ggtcgtcgaa caaaaggtga
atgtgttgcg 9660gagagcgggt gggagacagc gaaagagcaa ctacgaaacg tggtgtggtg
gaggtgaatt 9720atgaagaggg cgcgcgattt gaaaagtatg tatataaaaa atatatcccg
gtgttttatg 9780tagcgataaa cgagtttttg atgtaaggta tgcaggtgtg taagtctttt
ggttagaaga 9840caaatccaaa gtctacttgt ggggatgttc gaaggggaaa tacttgtatt
ctataggtca 9900tatcttgttt ttattggcac aaatataatt acattagctt tttgaggggg
caataaacag 9960taaacacgat ggtaataatg gtaaaaaaaa aaacaagcag ttatttcgga
tatatgtcgg 10020ctactccttg cgtcgggccc gaagtcttag agccagatat gcgagcaccc
ggaagctcac 10080gatgagaatg gccagaccat gatgaaataa cataaggtgg tcccgtcggc
aagagacatc 10140cacttaacgt atgcttgcaa taagtgcgag tgaaaggaat agtattctga
gtgtcgtatt 10200gagtctgagt gagacagcga tatgattgtt gattaaccct tagcatgtcc
gtggggtttg 10260aattaactca taatattaat tagacgaaat tatttttaaa gttttatttt
taataatttg 10320cgagtacgca
10330691785DNAArtificial SequenceDescription of Artificial
Sequence Synthetic nucleotide construct 69atgggtagtt ctttagacga
tgagcatatc ctctctgctc ttctgcaaag cgatgacgag 60cttgttggtg aggattctga
cagtgaaata tcagatcacg taagtgaaga tgacgtccag 120agcgatacag aagaagcgtt
tatagatgag gtacatgaag tgcagccaac gtcaagcggt 180agtgaaatat tagacgaaca
aaatgttatt gaacaaccag gttcttcatt ggcttctaac 240agaatcttga ccttgccaca
gaggactatt agaggtaaga ataaacattg ttggtcaact 300tcaaagtcca cgaggcgtag
ccgagtctct gcactgaaca ttgtcagatc tcaaagaggt 360ccgacgcgta tgtgccgcaa
tatatatgac ccacttttat gcttcaaact attttttact 420gatgagataa tttcggaaat
tgtaaaatgg acaaatgctg agatatcatt gaaacgtcgg 480gaatctatga caggtgctac
atttcgtgac acgaatgaag atgaaatcta tgctttcttt 540ggtattctgg taatgacagc
agtgagaaaa gataaccaca tgtccacaga tgacctcttt 600gatcgatctt tgtcaatggt
gtacgtctct gtaatgagtc gtgatcgttt tgattttttg 660atacgatgtc ttagaatgga
tgacaaaagt atacggccca cacttcgaga aaacgatgta 720tttactcctg ttagaaaaat
atgggatctc tttatccatc agtgcataca aaattacact 780ccaggggctc atttgaccat
agatgaacag ttacttggtt ttagaggacg gtgtccgttt 840aggatgtata tcccaaacaa
gccaagtaag tatggaataa aaatcctcat gatgtgtgac 900agtggtacga agtatatgat
aaatggaatg ccttatttgg gaagaggaac acagaccaac 960ggagtaccac tcggtgaata
ctacgtgaag gagttatcaa agcctgtgca cggtagttgt 1020cgtaatatta cgtgtgacaa
ttggttcacc tcaatccctt tggcaaaaaa cttactacaa 1080gaaccgtata agttaaccat
tgtgggaacc gtgcgatcaa acaaacgcga gataccggaa 1140gtactgaaaa acagtcgctc
caggccagtg ggaacatcga tgttttgttt tgacggaccc 1200cttactctcg tctcatataa
accgaagcca gctaagatgg tatacttatt atcatcttgt 1260gatgaggatg cttctatcaa
cgaaagtacc ggtaaaccgc aaatggttat gtattataat 1320caaactaaag gcggagtgga
cacgctagac caaatgtgtt ctgtgatgac ctgcagtagg 1380aagacgaata ggtggcctat
ggcattattg tacggaatga taaacattgc ctgcataaat 1440tcttttatta tatacagcca
taatgtcagt agcaagggag aaaaggttca aagtcgcaaa 1500aaatttatga gaaaccttta
catgagcctg acgtcatcgt ttatgcgtaa gcgtttagaa 1560gctcctactt tgaagagata
tttgcgcgat aatatctcta atattttgcc aaatgaagtg 1620cctggtacat cagatgacag
tactgaagag ccagtaatga aaaaacgtac ttactgtact 1680tactgcccct ctaaaataag
gcgaaaggca aatgcatcgt gcaaaaaatg caaaaaagtt 1740atttgtcgag agcataatat
tgatatgtgc caaagttgtt tctga 1785701785DNAArtificial
SequenceDescription of Artificial Sequence Synthetic nucleotide
construct 70atgggtagca gcctggatga tgaacatatc ctgagcgcgc tgctgcagag
cgacgacgaa 60ctggttggtg aagatagcga cagcgaaatc agcgatcacg tgagcgaaga
cgacgttcag 120agcgataccg aagaagcgtt catcgacgaa gttcacgaag tgcagccgac
cagcagcggt 180agcgaaatcc tggatgaaca gaacgttatc gaacagccgg gtagcagcct
ggcgagcaac 240cgtatcctga ccctgccgca gcgcaccatc cgtggtaaaa acaaacactg
ttggagcacc 300agcaaaagca cccgccgtag ccgtgttagc gcgctgaaca ttgttcgtag
ccagcgtggt 360ccgacccgta tgtgccgcaa catctacgat ccgctgctgt gcttcaaact
gttcttcacc 420gatgaaatca tcagcgaaat cgtgaaatgg accaacgccg aaatcagcct
gaaacgtcgc 480gaaagcatga ccggcgcgac cttccgcgat accaacgaag atgaaatcta
cgccttcttc 540ggtatcctgg tgatgaccgc ggtgcgtaaa gataaccaca tgagcaccga
tgatctgttt 600gatcgtagcc tgagcatggt ttacgttagc gttatgagcc gtgaccgttt
cgattttctg 660atccgttgtc tgcgtatgga tgataaaagc atccgcccga ccctgcgcga
aaacgatgtg 720ttcaccccgg ttcgcaaaat ctgggatctg ttcatccacc agtgcatcca
gaactacacc 780ccgggcgcgc acctgaccat cgatgaacag ctgctgggtt ttcgtggtcg
ctgtccgttt 840cgtatgtaca tcccgaacaa accgagcaaa tacggtatca aaatcctgat
gatgtgtgac 900agcggtacca agtacatgat caacggtatg ccgtatctgg gtcgtggtac
ccagaccaac 960ggtgtgccgc tgggtgaata ctacgtgaaa gaactgagca aaccggtgca
cggtagctgt 1020cgtaacatca cctgtgacaa ctggttcacc agcatcccgc tggcgaaaaa
cctgctgcag 1080gaaccgtata aactgaccat cgtgggtacc gttcgtagca acaaacgtga
aatcccggaa 1140gtgctgaaaa acagccgtag ccgtccggtg ggcaccagca tgttctgttt
cgatggtccg 1200ctgaccctgg ttagctacaa accgaaaccg gcgaaaatgg tgtacctgct
gagcagctgc 1260gacgaagacg cgagcatcaa cgaaagcacc ggtaaaccgc agatggttat
gtactacaac 1320cagaccaaag gcggtgtgga caccctggat cagatgtgca gcgttatgac
ctgcagccgc 1380aaaaccaacc gctggccgat ggcgctgctg tacggtatga tcaacatcgc
ctgcatcaac 1440agctttatca tctacagcca taacgttagc agcaaaggtg aaaaagttca
gagccgcaaa 1500aaatttatgc gtaacctgta catgagcctg accagcagct tcatgcgtaa
acgtctggaa 1560gccccgaccc tgaaacgtta tctgcgcgat aacatcagca acatcctgcc
gaacgaagtg 1620ccgggtacca gcgatgatag caccgaagaa ccggtgatga aaaaacgtac
ctactgtacc 1680tactgcccga gcaaaatccg ccgtaaagcg aacgcgagct gcaaaaaatg
caaaaaagtt 1740atctgtcgtg aacataacat cgatatgtgc cagagctgtt tctga
17857163DNAArtificial SequenceDescription of Artificial
Sequence Synthetic oligonucleotide 71ccctagaaag ataatcatat
tgtgacgtac gttaaagata atcatgcgta aaattgacgc 60atg
637262DNAArtificial
SequenceDescription of Artificial Sequence Synthetic oligonucleotide
72ccctagaaag ataatcatat tgtgacgtac gttaaagata atcatgagta aattgacgca
60tg
627331DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 73acttctagag tcctaaattg caaacagcga c
317433DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 74acttctagac acgtaagtag aacatgaaat aac
337532DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 75acttctagat cactgtcaga atcctcacca ac
327630DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 76acttctagaa gaagccaatg
aagaacctgg 307736DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
77acttctagaa ataaataaat aaacataaat aaattg
367829DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 78acttctagag aaaggcaaat gcatcgtgc
297931DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 79acttctagac gcaaaaaatt tatgagaaac c
318031DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 80acttctagag atgaggatgc ttctatcaac g
318130DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 81acttctagac gcgagatacc
ggaagtactg 308238DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
82acttctagac tcgagagaga atgtttaaaa gttttgtt
388363DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 83acttctagac atgcgtcaat tttacgcaga ctatctttct agggttaatc
tagctgcatc 60agg
638430DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 84acgactagtg ttcccacaat ggttaattcg
308530DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 85acgactagtg ccgtacgcgt
atcgataagc 308630DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
86ggatcctata taataaaatg ggtagttctt
308731DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 87ggatccaaat tcaacaaaca atttatttat g
318829DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 88ggatcctcta gattaaccct agaaagata
298934DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer; modified_base (21)..(30): a, c, g, t, unknown
or other; misc_feature (21)..(30): n is a, c, g, or t 89taatacgact cactataggg
nnnnnnnnnn ctat 349035DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
primer; modified_base (21)..(30): a, c, g, t, unknown or
other; misc_feature (21)..(30): n is a, c, g, or t 90taatacgact cactataggg
nnnnnnnnnn agtgc 359136DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
primer; modified_base (21)..(30): a, c, g, t, unknown or
other; misc_feature (21)..(30): n is a, c, g, or t 91taatacgact cactataggg
nnnnnnnnnn gaattc 369236DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
primer; modified_base (21)..(30): a, c, g, t, unknown or
other; misc_feature (21)..(30): n is a, c, g, or t 92taatacgact cactataggg
nnnnnnnnnn agtact 369336DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
primer; modified_base (21)..(30): a, c, g, t, unknown or
other; misc_feature (21)..(30): n is a, c, g, or t 93taatacgact cactataggg
nnnnnnnnnn aagctt 369436DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
primer; modified_base (21)..(30): a, c, g, t, unknown or
other; misc_feature (21)..(30): n is a, c, g, or t 94taatacgact cactataggg
nnnnnnnnnn ggatcc 369534DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
primer; modified_base (21)..(30): a, c, g, t, unknown or
other; misc_feature (21)..(30): n is a, c, g, or t 95taatacgact cactataggg
nnnnnnnnnn ctag 349623DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
96attttacgca gactatcttt cta
239717DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 97ttaatacgac tcactat
179823DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 98ggatccgcgg taagtgtcac tga
239934DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 99ggatcctcga tatacagacc gataaaaaca catg
3410033DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 100actgggccca tactaataat
aaattcaaca aac 3310123DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
101ttatttcatg ttctacttac gtg
2310223DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 102tgattatctt taacgtacgt cac
2310323DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 103gtcagtccag aaacaacttt ggc
2310432DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 104ctagaaattt atttatgttt
atttatttat ta 3210532DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
105acgcgtagat cttaatacga ctcactatag gg
3210632DNAArtificial SequenceDescription of Artificial Sequence Synthetic
primer 106acgcgtagat ctaattaacc ctcactaaag gg
3210725DNAArtificial SequenceDescription of Artificial
Sequence Synthetic oligonucleotide 107ccaaacttcg gcgatgtttt cttaa
2510826DNAArtificial
SequenceDescription of Artificial Sequence Synthetic oligonucleotide
108tagaattcat gtttccaatt ttttaa
2610925DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 109tcgggtggca cgttgtggat tttaa
2511025DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 110aaatacgtca
ctccccttcc cttaa
2511125DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 111agctgcactc accggatgtc cttaa
2511225DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 112cccaaagtat
agttaaatag cttaa
2511325DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 113gtttatttat gattagagcc tttaa
2511426DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 114tgttgttttt
ttgtccccac gtttaa
2611526DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 115ctgcctctag ccgcctgctt tattaa
2611625DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 116ttaaattcgc
atatgtgcaa atgtt
2511725DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 117ttaagcatgt ccttaagcat aaaat
2511825DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 118ttaatgctag
ctgcatgcag gatgc
2511924DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 119ttaaacaaaa aatgaaacat aagg
2412024DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 120ttaaaggaat
taataaaaat acaa
2412126DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 121ttaatctcct ccgcccttct tcaatt
2612225DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 122ttaaacaaac
acctttgaca aattt
2512325DNAArtificial SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 123ttaatattaa ttgaaaataa atgca
2512417DNAArtificial SequenceDescription of
Artificial Sequence Synthetic oligonucleotide 124yyttttttaa rtaayag
1712511DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
oligonucleotide; modified_base (3)..(3): a, c, g, t, unknown or
other; misc_feature (3)..(3): n is a, c, g, or t 125atnatttaaa t
1112622PRTArtificial
SequenceDescription of Artificial Sequence Synthetic peptide 126Pro
Ser Leu Cys Thr Glu His Cys Gln Ile Ser Lys Arg Ser Asp Ala 1
5 10 15 Tyr Val Pro Gln Tyr Ile
20 12710PRTArtificial SequenceDescription of
Artificial Sequence Synthetic peptide 127Pro Thr Phe Met Leu Gln Thr
Ile Phe Tyr 1 5 10 12811PRTArtificial
SequenceDescription of Artificial Sequence Synthetic peptide 128Asp
Asn Phe Ala Asn Cys Lys Met Asp Lys Cys 1 5
10 12915PRTArtificial SequenceDescription of Artificial Sequence
Synthetic peptide 129Asp Ile Ile Glu Thr Ser Gly Ile Tyr Asp Arg Cys
Tyr Ile Ser 1 5 10 15
13016PRTArtificial SequenceDescription of Artificial Sequence Synthetic
peptide 130Asn Leu Cys Phe Leu Trp Tyr Ser Gly Asn Asp Ser Ser Glu Lys
Arg 1 5 10 15
1315PRTArtificial SequenceDescription of Artificial Sequence Synthetic
peptide 131Pro His Val His Arg 1 5 13227PRTArtificial
SequenceDescription of Artificial Sequence Synthetic peptide 132Pro
Leu Trp Ile Asp Leu Cys Gln Cys Val Arg Leu Cys Asn Glu Ser 1
5 10 15 Val Asp Arg Phe Gly Phe
Phe Asp Thr Met Ser 20 25
13316PRTArtificial SequenceDescription of Artificial Sequence Synthetic
peptide 133Gln Lys Tyr Thr Ala His Thr Ser Arg Lys Arg Cys Ile Tyr Ser
Cys 1 5 10 15
13423PRTArtificial SequenceDescription of Artificial Sequence Synthetic
peptide 134Lys Asn Met Gly Ser Leu Tyr Pro Ser Val His Thr Lys Leu His
Ser 1 5 10 15 Arg
Gly Ser Phe Asp His Arg 20 1355PRTArtificial
SequenceDescription of Artificial Sequence Synthetic peptide 135Thr
Val Thr Trp Phe 1 5 1365PRTArtificial SequenceDescription
of Artificial Sequence Synthetic peptide 136Arg Thr Val Ser Val 1
5 1378PRTArtificial SequenceDescription of Artificial
Sequence Synthetic peptide 137Asp Val Tyr Pro Lys Gln Ala Lys 1
5 1389PRTArtificial SequenceDescription of
Artificial Sequence Synthetic peptide 138Val Trp Asn Lys Asn Pro His
Asp Val 1 5 13925PRTArtificial
SequenceDescription of Artificial Sequence Synthetic peptide 139Gln
Trp Tyr Glu Val Tyr Asp Lys Trp Asn Ala Leu Phe Gly Lys Arg 1
5 10 15 Asn Thr Asp Gln Arg Ser
Thr Thr Arg 20 25 14012PRTArtificial
SequenceDescription of Artificial Sequence Synthetic peptide 140Ile
Leu Arg Glu Gly Val Ile Lys Ala Cys Ala Arg 1 5
10 1418PRTArtificial SequenceDescription of Artificial
Sequence Synthetic peptide 141Gln Leu Val His Leu Arg Ala Pro 1
5 1426PRTArtificial SequenceDescription of
Artificial Sequence Synthetic peptide 142Tyr Ala Tyr Phe Tyr Arg 1
5 1437PRTArtificial SequenceDescription of Artificial
Sequence Synthetic peptide 143Ile Ser Ile Leu Leu Phe Ser 1
5 144949PRTArtificial SequenceDescription of Artificial
Sequence Synthetic protein 144Gln Phe Cys Asn Lys Lys Thr Asp Pro
Val Val Val Pro Ser Asn Trp 1 5 10
15 Gln Met His Gly Tyr Asp Ala Pro Ile Tyr Thr Asn Val Thr
Tyr Pro 20 25 30
Ile Thr Val Asn Pro Pro Phe Val Pro Thr Glu Asn Pro Thr Gly Cys
35 40 45 Tyr Ser Leu Thr
Phe Asn Val Asp Glu Ser Trp Leu Gln Glu Gly Gln 50
55 60 Thr Arg Ile Ile Phe Asp Gly Val
Asn Ser Ala Phe His Leu Trp Cys 65 70
75 80 Asn Gly Arg Trp Val Gly Tyr Gly Gln Asp Ser Arg
Leu Pro Ser Glu 85 90
95 Phe Asp Leu Ser Ala Phe Leu Arg Ala Gly Glu Asn Arg Leu Ala Val
100 105 110 Met Val Leu
Arg Trp Ser Asp Gly Ser Tyr Leu Glu Asp Gln Asp Met 115
120 125 Trp Arg Met Ser Gly Ile Phe Arg
Asp Val Ser Leu Leu His Lys Pro 130 135
140 Thr Thr Gln Ile Ser Asp Phe His Val Ala Thr Arg Phe
Asn Asp Asp 145 150 155
160 Phe Ser Arg Ala Val Leu Glu Ala Glu Val Gln Met Cys Gly Glu Leu
165 170 175 Arg Asp Tyr
Leu Arg Val Thr Val Ser Leu Trp Gln Gly Glu Thr Gln 180
185 190 Val Ala Ser Gly Thr Ala Pro Phe
Gly Gly Glu Ile Ile Asp Glu Arg 195 200
205 Gly Gly Tyr Ala Asp Arg Val Thr Leu Arg Leu Asn Val
Glu Asn Pro 210 215 220
Lys Leu Trp Ser Ala Glu Ile Pro Asn Leu Tyr Arg Ala Val Val Glu 225
230 235 240 Leu His Thr Ala
Asp Gly Thr Leu Ile Glu Ala Glu Ala Cys Asp Val 245
250 255 Gly Phe Arg Glu Val Arg Ile Glu Asn
Gly Leu Leu Leu Leu Asn Gly 260 265
270 Lys Pro Leu Leu Ile Arg Gly Val Asn Arg His Glu His His
Pro Leu 275 280 285
His Gly Gln Val Met Asp Glu Gln Thr Met Val Gln Asp Ile Leu Leu 290
295 300 Met Lys Gln Asn Asn
Phe Asn Ala Val Arg Cys Ser His Tyr Pro Asn 305 310
315 320 His Pro Leu Trp Tyr Thr Leu Cys Asp Arg
Tyr Gly Leu Tyr Val Val 325 330
335 Asp Glu Ala Asn Ile Glu Thr His Gly Met Val Pro Met Asn Arg
Leu 340 345 350 Thr
Asp Asp Pro Arg Trp Leu Pro Ala Met Ser Glu Arg Val Thr Arg 355
360 365 Met Val Gln Arg Asp Arg
Asn His Pro Ser Val Ile Ile Trp Ser Leu 370 375
380 Gly Asn Glu Ser Gly His Gly Ala Asn His Asp
Ala Leu Tyr Arg Trp 385 390 395
400 Ile Lys Ser Val Asp Pro Ser Arg Pro Val Gln Tyr Glu Gly Gly Gly
405 410 415 Ala Asp
Thr Thr Ala Thr Asp Ile Ile Cys Pro Met Tyr Ala Arg Val 420
425 430 Asp Glu Asp Gln Pro Phe Pro
Ala Val Pro Lys Trp Ser Ile Lys Lys 435 440
445 Trp Leu Ser Leu Pro Gly Glu Thr Arg Pro Leu Ile
Leu Cys Glu Tyr 450 455 460
Ala His Ala Met Gly Asn Ser Leu Gly Gly Phe Ala Lys Tyr Trp Gln 465
470 475 480 Ala Phe Arg
Gln Tyr Pro Arg Leu Gln Gly Gly Phe Val Trp Asp Trp 485
490 495 Val Asp Gln Ser Leu Ile Lys Tyr
Asp Glu Asn Gly Asn Pro Trp Ser 500 505
510 Ala Tyr Gly Gly Asp Phe Gly Asp Thr Pro Asn Asp Arg
Gln Phe Cys 515 520 525
Met Asn Gly Leu Val Phe Ala Asp Arg Thr Pro His Pro Ala Leu Thr 530
535 540 Glu Ala Lys His
Gln Gln Gln Phe Phe Gln Phe Arg Leu Ser Gly Gln 545 550
555 560 Thr Ile Glu Val Thr Ser Glu Tyr Leu
Phe Arg His Ser Asp Asn Glu 565 570
575 Leu Leu His Trp Met Val Ala Leu Asp Gly Lys Pro Leu Ala
Ser Gly 580 585 590
Glu Val Pro Leu Asp Val Ala Pro Gln Gly Lys Gln Leu Ile Glu Leu
595 600 605 Pro Glu Leu Pro
Gln Pro Glu Ser Ala Gly Gln Leu Trp Leu Thr Val 610
615 620 Arg Val Val Gln Pro Asn Ala Thr
Ala Trp Ser Glu Ala Gly His Ile 625 630
635 640 Ser Ala Trp Gln Gln Trp Arg Leu Ala Glu Asn Leu
Ser Val Thr Leu 645 650
655 Pro Ala Ala Ser His Ala Ile Pro His Leu Thr Thr Ser Glu Met Asp
660 665 670 Phe Cys Ile
Glu Leu Gly Asn Lys Arg Trp Gln Phe Asn Arg Gln Ser 675
680 685 Gly Phe Leu Ser Gln Met Trp Ile
Gly Asp Lys Lys Gln Leu Leu Thr 690 695
700 Pro Leu Arg Asp Gln Phe Thr Arg Ala Pro Leu Asp Asn
Asp Ile Gly 705 710 715
720 Val Ser Glu Ala Thr Arg Ile Asp Pro Asn Ala Trp Val Glu Arg Trp
725 730 735 Lys Ala Ala Gly
His Tyr Gln Ala Glu Ala Ala Leu Leu Gln Cys Thr 740
745 750 Ala Asp Thr Leu Ala Asp Ala Val Leu
Ile Thr Thr Ala His Ala Trp 755 760
765 Gln His Gln Gly Lys Thr Leu Phe Ile Ser Arg Lys Thr Tyr
Arg Ile 770 775 780
Asp Gly Ser Gly Gln Met Ala Ile Thr Val Asp Val Glu Val Ala Ser 785
790 795 800 Asp Thr Pro His Pro
Ala Arg Ile Gly Leu Asn Cys Gln Leu Ala Gln 805
810 815 Val Ala Glu Arg Val Asn Trp Leu Gly Leu
Gly Pro Gln Glu Asn Tyr 820 825
830 Pro Asp Arg Leu Thr Ala Ala Cys Phe Asp Arg Trp Asp Leu Pro
Leu 835 840 845 Ser
Asp Met Tyr Thr Pro Tyr Val Phe Pro Ser Glu Asn Gly Leu Arg 850
855 860 Cys Gly Thr Arg Glu Leu
Asn Tyr Gly Pro His Gln Trp Arg Gly Asp 865 870
875 880 Phe Gln Phe Asn Ile Ser Arg Tyr Ser Gln Gln
Gln Leu Met Glu Thr 885 890
895 Ser His Arg His Leu Leu His Ala Glu Glu Gly Thr Trp Leu Asn Ile
900 905 910 Asp Gly
Phe His Met Gly Ile Gly Gly Asp Asp Ser Trp Ser Pro Ser 915
920 925 Val Ser Ala Glu Phe Gln Leu
Ser Ala Gly Arg Tyr His Tyr Gln Leu 930 935
940 Val Trp Cys Gln Lys 945
SEQ ID NO: 145, length 10, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Ala Tyr Pro Leu Cys Ser Ser Ala Arg Arg
SEQ ID NO: 146, length 4, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Arg Ala Cys Trp
SEQ ID NO: 147, length 6, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Asn Ile Arg Ser Arg Lys
SEQ ID NO: 148, length 12, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Arg Arg Pro Glu Arg Tyr Arg Arg Ser Val Tyr Arg
SEQ ID NO: 149, length 7, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Ser Val Ser Gln Arg Gln Ala
SEQ ID NO: 150, length 8, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Asn Ile Arg Arg Thr Lys Cys Tyr
SEQ ID NO: 151, length 8, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Thr Thr Arg Phe Phe Ile Gly Phe
SEQ ID NO: 152, length 10, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Gln Asn Leu Asp Leu Ala Thr Glu Asp Tyr
SEQ ID NO: 153, length 4, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Thr Leu Leu Val
SEQ ID NO: 154, length 15, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Phe His Tyr Ser Leu Pro His Phe Phe Ala Glu Arg Thr Thr Ile
SEQ ID NO: 155, length 17, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Asn Ile Lys Leu Asp Arg Ser Val Ile Tyr Leu Arg Gln Asp Thr Pro Ser
SEQ ID NO: 156, length 7, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Ala Tyr Asn Leu Ser Ser Trp
SEQ ID NO: 157, length 4, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Pro Asp Lys Gly
SEQ ID NO: 158, length 9, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Gly Ile Pro Thr Ser Arg Trp Leu Glu
SEQ ID NO: 159, length 12, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Leu Phe Leu Cys Tyr Asn Val Leu Ser Tyr Cys Leu
SEQ ID NO: 160, length 7, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Lys Met Arg Tyr Arg Lys Cys
SEQ ID NO: 161, length 17, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Asn Trp Phe Cys Arg Ser Lys Arg Arg Ser Lys His Leu Phe Ile Asn Arg
SEQ ID NO: 162, length 6, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Lys Ile Gln Ser Thr Pro
SEQ ID NO: 163, length 6, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Asn Val Phe Phe Leu Arg
SEQ ID NO: 164, length 8, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Ser Gln Gln Ile Lys Arg Trp Phe
SEQ ID NO: 165, length 15, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Ser Tyr Ser Gly Phe Ser Ser Met Ala Lys Ser Ile Ser Arg His
SEQ ID NO: 166, length 4, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
His Leu Ile Thr
SEQ ID NO: 167, length 28, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Arg Gly Asn Ser Phe Arg Asn Ser Trp Thr Val Arg Thr Cys Val Lys Trp Phe Ala Arg Ile Val Arg Phe Pro Leu Phe Ile
SEQ ID NO: 168, length 7, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Lys Lys Cys Ser Arg Ile Gln
SEQ ID NO: 169, length 5, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Phe Asn Tyr Thr Phe
SEQ ID NO: 170, length 17, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Phe Tyr Ala Gln His Glu His Ser Cys Phe Tyr Arg Ala Glu Trp Val Trp
SEQ ID NO: 171, length 17, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Asn Asn Asn Phe Glu Trp Asn Asp Trp Cys Asn His Gln Pro Arg Lys Gln
SEQ ID NO: 172, length 4, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Ile Phe Phe Leu
SEQ ID NO: 173, length 19, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Thr Tyr Arg Val Lys Asn Pro Lys Gly Ile Phe Ser Ile Ala Cys Phe Ser Phe Val
SEQ ID NO: 174, length 5, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Ser Phe Tyr Ser Ser
SEQ ID NO: 175, length 18, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Pro Ser Lys Arg Tyr Thr Ile Leu Leu Tyr Trp Thr Gln Glu Cys Cys Gln Gln
SEQ ID NO: 176, length 22, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Phe Lys Ile Thr Arg Arg Ser Pro Leu Arg Ile His Phe Ser Ile Tyr Trp Leu Tyr Glu Ser Arg
SEQ ID NO: 177, length 6, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Lys Lys Thr Thr Leu Ala
SEQ ID NO: 178, length 6, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Ser Tyr Gln Lys Thr Lys
SEQ ID NO: 179, length 12, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Lys Leu Phe Lys Tyr Gly Thr His Gln Pro His Phe
SEQ ID NO: 180, length 4, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Arg Val Lys Thr
SEQ ID NO: 181, length 7, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Thr Thr Asp Ser Ser Gly Arg
SEQ ID NO: 182, length 36, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Gln Tyr Pro Glu Ile Ser Ala Ser Asn Glu Leu Trp Thr Cys Asn Cys Phe Ile Tyr Tyr His Lys Ile Ser Arg Cys Arg Trp Arg Lys Val Ile Ser Phe Ile Arg
SEQ ID NO: 183, length 63, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Thr Arg Gly Ser Ser Ala Ser Thr Phe Ala Leu Cys Phe Phe Thr Asn Ile Lys Arg Leu Thr Arg Cys Thr Gln Trp Cys Ser Asn Asn Cys Asn Ser Phe Pro Ser Ser Thr Ala Arg Gly Ser Lys Ile Leu His Val Glu Ser Pro Thr Val Lys Arg Ser Asn Lys Tyr Tyr Pro Ser Gly Tyr
SEQ ID NO: 184, length 7, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Glu Leu Arg Cys Phe Asn Ser
SEQ ID NO: 185, length 4, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Gly Val Phe Thr
SEQ ID NO: 186, length 14, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Ser Asp Lys Phe Trp Ile Pro Pro Leu Ile Ile Ala Val Arg
SEQ ID NO: 187, length 5, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Phe Arg Ala Phe Leu
SEQ ID NO: 188, length 6, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Asn His Ser Lys Lys Leu
SEQ ID NO: 189, length 19, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Trp Ser Asp Arg Ile Arg Arg Ser Asn Arg Phe Lys Ser Asp Asp Asn Glu Gln Arg
SEQ ID NO: 190, length 14, PRT, Artificial Sequence, Description of Artificial Sequence: Synthetic peptide
Ser Thr Met Lys Lys Leu Pro Leu Pro Ala Arg Thr Tyr Ser
1912476DNAArtificial SequenceSynthetic
nucleotide sequence 191ccctagaaag atagtctgcg taaaattgac gcatgcattc
ttgaaatatt gctctctctt 60tctaaatagc gcgaatccgt cgctgtttgc aatttaggac
atctcagtcg ccgcttggag 120ctcggctgag gcgtgcttgt caatgcggta agtgtcactg
attttgaact ataacgaccg 180cgtgagtcaa aatgacgcat gattatcttt tacgtgactt
ttaagattta actcatacga 240taattaatat tgttatttca tgttctactt acgtgataac
ttattatata tatattttct 300tgttatagat atcgtgacta atatataata aaatgggatg
ttctttagac gatgagcata 360tcctctctgc tcttctgcaa ggcgatgacg agcttgttgg
tgaggattct gacagtgaaa 420tatcagatca cgtaagtgaa gacgtccaga gcgatacaga
agaagcgttt atagatgagg 480tacatgaagt gtcagccaac gtcaagcgta gtgaaatatt
agacgaacaa aatgttattg 540aacaaccagg ttcttcattg gcttctaaca gaatcttgac
cttgccacag aggactatta 600gaggtaagaa taaacattgt tggtcaactt caaagtccac
gagcggtagc cgagtctctg 660cactgaacat tgtcagatct caaagaggtc cgacgcgtat
gtgccgcaat atatatgacc 720cacttttatg cttcaaacta ttttttactg atgagataat
ttcgcaaatt gtaaaatgga 780caaatgctga gatatcattg aaacgtcggg aatctatgac
aggtgctaca tttcgtgaca 840cgaatgaaga tgaaatctat gctttctttg gtattctggt
aatgacagca gtgagaaaag 900ataaccacat gtccacagat gacctctttg gatcgatctt
tgtcaatgtg tacgtctctg 960taatgagtct gtggatcgtt ttggattttt tgatacgatg
tcttagaatg gatgacaaaa 1020gtatacggcc cacacttcga gaaaacgatg tatttactcc
tgttagaaaa atatgggatc 1080tctttatcca tcagtgcata caaaattaca ctccaggggc
tcatttgacc atagatgaac 1140agttacttgg ttttagagga cggtgtccgt ttaggatgta
tatcccaaac aagccaagta 1200agtatggaat aaaaatcctc atgatgtgtg acagtggtac
gaagtatatg ataaatggaa 1260tgccttattt gggaagagga acacagacca acggagtacc
actcggtgaa tactacgtga 1320aggagttatc aaagcctgtg cacggtagtt gtcgtaatat
tacgtgtgac aattggttca 1380cctcaatccc tttggcaaaa aacttactac aagaaccgta
taagttaacc attgtgggaa 1440ccgtgcgatc aaacaaacgc gagataccgg aagtactgaa
aaacagtcgc tccaggccag 1500tgggaacatc gatgttttgt tttgacggac cccttactct
cgtctcatat aaaccgaagc 1560cagctaagat ggtatactta ttatcatctt gtgatgagga
tgcttctatc aacgaaagta 1620ccggtaaacc gcaaatggtt atgtattata atcaaactaa
aggcggagtg gacacgctag 1680accaaatgtg ttctgtgatg acctgcagta ggaagacgaa
taggtggcct atggcattat 1740tgtacggaat gataaacatt gcctgcataa attcttttat
tatatacagc cataatgtca 1800gtagcaaggg agaaaaggtt caaagtcgca aaaaatttat
gagaaacctt tacatgagcc 1860tgacgtcatc gtttatgcgt aaccgtttag aagctcctac
tttgaagaga tatttgcgcg 1920ataatatctc taatattttg ccaaatgaag tgcctggtac
atcagatgac agtactgaag 1980agccagtaat gaaaaaacgt acttactgta cttactgccc
ctctaaaata aggcgaaagg 2040caaatgcatc gtgcaaaaaa tgcaaaaaag ttatttgtcg
agagcataat attgatatgt 2100gccaaagttg tttctggact gactaataag tataatttgt
ttctattatg tataagttaa 2160gctaattact tattttataa tacaacatga ctgtttttaa
agtacaaaat aagtttattt 2220ttgtaaaaga gagaatgttt aaaagttttg ttactttaga
agaaattttg agtttttgtt 2280tttttttaat aaataaataa acataaataa attgtttgtt
gaatttatta ttagtatgta 2340agtgtaaata taataaaact taatatctat tcaaattaat
aaataaacct cgatatacag 2400accgataaaa acacatgcgt caattttacg catgattatc
tttaacgtac gtcacaatat 2460gattatcttt ctaggg
2476
SEQ ID NO: 192, length 66, DNA, Artificial Sequence, Synthetic nucleotide sequence
cattcttgaa atattgctct ctctttctaa atagcgcgaa tccgtcgctg tttgcaattt 60
aggaca 66
SEQ ID NO: 193, length 35, DNA, Artificial Sequence, Synthetic nucleotide sequence
ccctagaaag atagtctgag taaaattgac gcatg 35
SEQ ID NO: 194, length 35, DNA, Artificial Sequence, Synthetic nucleotide sequence
ccctagaaag atagtctgag taaaattgac gcatg 35
SEQ ID NO: 195, length 19, DNA, Artificial Sequence, Synthetic nucleotide sequence
tgcgtaaaat tgacgcatg 19
SEQ ID NO: 196, length 956, DNA, Artificial Sequence, Synthetic nucleotide sequence
caaacgcgag
ataccggaag tactgaaaaa cagtcgctcc aggccagtgg gaacatcgat 60gttttgtttt
gacggacccc ttactctcgt ctcatataaa ccgaagccag ctaagatggt 120atacttatta
tcatcttgtg atgaggatgc ttctatcaac gaaagtaccg gtaaaccgca 180aatggttatg
tattataatc aaactaaagg cggagtggac acgctagacc aaatgtgttc 240tgtgatgacc
tgcagtagga agacgaatag gtggcctatg gcattattgt acggaatgat 300aaacattgcc
tgcataaatt cttttattat atacagccat aatgtcagta gcaagggaga 360aaaggttcaa
agtcgcaaaa aatttatgag aaacctttac atgagcctga cgtcatcgtt 420tatgcgtaac
cgtttagaag ctcctacttt gaagagatat ttgcgcgata atatctctaa 480tattttgcca
aatgaagtgc ctggtacatc agatgacagt actgaagagc cagtaatgaa 540aaaacgtact
tactgtactt actgcccctc taaaataagg cgaaaggcaa atgcatcgtg 600caaaaaatgc
aaaaaagtta tttgtcgaga gcataatatt gatatgtgcc aaagttgttt 660ctggactgac
taataagtat aatttgtttc tattatgtat aagttaagct aattacttat 720tttataatac
aacatgactg tttttaaagt acaaaataag tttatttttg taaaagagag 780aatgtttaaa
agttttgtta ctttagaaga aattttgagt ttttgttttt ttttaataaa 840taaataaaca
taaataaatt gtttgttgaa tttattatta gtatgtaagt gtaaatataa 900taaaacttaa
tatctattca aattaataaa taaacctcga tatacagacc gataaa
956
SEQ ID NO: 197, length 31, DNA, Artificial Sequence, Synthetic nucleotide sequence
atcatattgt gacgtacgtt aaagataatc a 31
SEQ ID NO: 198, length 241, DNA, Artificial Sequence, Synthetic nucleotide sequence
cattcttgaa
atattgctct ctctttctaa atagcgcgaa tccgtcgctg tttgcaattt 60aggacatctc
agtcgccgct tggagctcgg ctgaggcgtg cttgtcaatg cggtaagtgt 120cactgatttt
gaactataac gaccgcgtga gtcaaaatga cgcatgatta tcttttacgt 180gacttttaag
atttaactca tacgataatt aatattgtta tttcatgttc tacttacgtg 240a
241
SEQ ID NO: 199, length 384, DNA, Artificial Sequence, Synthetic nucleotide sequence
cattcttgaa atattgctct ctctttctaa atagcgcgaa tccgtcgctg tttgcaattt
60aggacatctc agtcgccgct tggagctcgg ctgaggcgtg cttgtcaatg cggtaagtgt
120cactgatttt gaactataac gaccgcgtga gtcaaaatga cgcatgatta tcttttacgt
180gacttttaag atttaactca tacgataatt aatattgtta tttcatgttc tacttacgtg
240ataacttatt atatatatat tttcttgtta tagatatcgt gactaatata taataaaatg
300ggatgttctt tagacgatga gcatatcctc tctgctcttc tgcaaggcga tgacgagctt
360gttggtgagg attctgacag tgaa
384
SEQ ID NO: 200, length 542, DNA, Artificial Sequence, Synthetic nucleotide sequence
cattcttgaa atattgctct ctctttctaa atagcgcgaa tccgtcgctg tttgcaattt
60aggacatctc agtcgccgct tggagctcgg ctgaggcgtg cttgtcaatg cggtaagtgt
120cactgatttt gaactataac gaccgcgtga gtcaaaatga cgcatgatta tcttttacgt
180gacttttaag atttaactca tacgataatt aatattgtta tttcatgttc tacttacgtg
240ataacttatt atatatatat tttcttgtta tagatatcgt gactaatata taataaaatg
300ggatgttctt tagacgatga gcatatcctc tctgctcttc tgcaaggcga tgacgagctt
360gttggtgagg attctgacag tgaaatatca gatcacgtaa gtgaagacgt ccagagcgat
420acagaagaag cgtttataga tgaggtacat gaagtgtcag ccaacgtcaa gcgtagtgaa
480atattagacg aacaaaatgt tattgaacaa ccaggttctt cattggcttc taacagaatc
540tt
542