Patent application title: Cosmid Vector for Transforming Plant and Use Thereof
Inventors:
Japan Tobacco Inc. (Tokyo, JP)
Yoshimitsu Takakura (Iwata-Shi, Shizuoka, JP)
Toshihiko Komari (Iwata-Shi, Shizuoka, JP)
Yuji Ishida (Iwata-Shi, Shizuoka, JP)
Toshiyuki Komori (Iwata-Shi, Shizuoka, JP)
Yukoh Hiei (Iwata-Shi, Shizuoka, JP)
Toshiki Mine (Iwata-Shi, Shizuoka, JP)
Teruyuki Imayama (Iwata-Shi, Shizuoka, JP)
Assignees:
JAPAN TOBACCO INC.
IPC8 Class: AC12N1582FI
USPC Class:
800279
Class name: Multicellular living organisms and unmodified parts thereof and related processes method of introducing a polynucleotide molecule into or rearrangement of genetic material within a plant or plant part the polynucleotide confers pathogen or pest resistance
Publication date: 2013-04-11
Patent application number: 20130091599
Abstract:
The present invention provides novel cosmid vectors for plant
transformation. The cosmid vectors have a full length of 15 kb or less
and contain: 1) an origin of replication of an IncP plasmid, but not any
origin of replication of other plasmid groups; 2) the trfA1 gene of an
IncP plasmid; 3) an oriT of an IncP plasmid; 4) the incC1 gene of an IncP
plasmid; 5) a cos site of lambda phage, which is located outside the
T-DNA; 6) a drug resistance gene expressed in E. coli and a bacterium of
Agrobacterium; 7) a T-DNA right border sequence of a bacterium of
Agrobacterium; 8) a T-DNA left border sequence of a bacterium of
Agrobacterium; 9) a selectable marker gene for plant transformation
located between 7) and 8) and expressed in a plant; and 10) restriction
endonuclease recognition site(s) located between 7) and 8) for cloning a
foreign gene.
Claims:
1. A cosmid vector having a full length of 15 kb or less characterized in
that: 1) it contains an origin of replication (oriV) of an IncP plasmid,
but does not contain any origin of replication of other plasmid groups;
2) it contains the trfA1 gene of an IncP plasmid; 3) it contains an
origin of conjugative transfer (oriT) of an IncP plasmid; 4) it contains
the incC1 gene of an IncP plasmid; 5) it contains a cos site of lambda
phage and the cos site is located outside the T-DNA; 6) it contains a
drug resistance gene expressed in E. coli and a bacterium of the genus
Agrobacterium; 7) it contains a T-DNA right border sequence of a
bacterium of the genus Agrobacterium; 8) it contains a T-DNA left border
sequence of a bacterium of the genus Agrobacterium; 9) it contains a
selectable marker gene for plant transformation located between 7) and 8)
and expressed in a plant; and 10) it contains restriction endonuclease
recognition site(s) located between 7) and 8) for cloning a foreign gene,
wherein the cosmid vector is selected from the group consisting of: the
cosmid vector pLC40 consisting of the nucleotide sequence of SEQ ID NO: 2
or an equivalent thereof consisting of a nucleotide sequence having at
least 95% identity to the nucleotide sequence; the cosmid vector pLC40GWH
consisting of the nucleotide sequence of SEQ ID NO: 3 or an equivalent
thereof consisting of a nucleotide sequence having at least 95% identity
to the nucleotide sequence; the cosmid vector pLC40bar consisting of the
nucleotide sequence of SEQ ID NO: 4 or an equivalent thereof consisting
of a nucleotide sequence having at least 95% identity to the nucleotide
sequence; the cosmid vector pLC40GWB consisting of the nucleotide
sequence of SEQ ID NO: 5 or an equivalent thereof consisting of a
nucleotide sequence having at least 95% identity to the nucleotide
sequence; the cosmid vector pLC40GWHKorB consisting of the nucleotide
sequence of SEQ ID NO: 65 or an equivalent thereof consisting of a
nucleotide sequence having at least 95% identity to the nucleotide
sequence; the cosmid vector pLCleo consisting of the nucleotide sequence
of SEQ ID NO: 66 or an equivalent thereof consisting of a nucleotide
sequence having at least 95% identity to the nucleotide sequence; and the
cosmid vector pLC40GWHvG1 consisting of the nucleotide sequence of SEQ ID
NO: 7 or an equivalent thereof consisting of a nucleotide sequence having
at least 95% identity to the nucleotide sequence; wherein the cosmid
vector is stably maintained in E. coli and Agrobacterium cells.
2. A method for transforming a plant, comprising transforming the plant with a bacterium of the genus Agrobacterium harboring an expression vector containing a nucleic acid fragment of a plant inserted into the cosmid vector of claim 1.
3. A method for transforming a plant, comprising transforming the plant with a bacterium of the genus Agrobacterium harboring the cosmid vector according to claim 1 and a plasmid vector characterized in that: 1) it contains an element necessary for the replication of an IncW plasmid, but does not contain any origin of replication of other plasmid groups; 2) it contains the repA gene necessary for the replication of an IncW plasmid; 3) it contains a drug resistance gene expressed in E. coli and a bacterium of the genus Agrobacterium; and 4) the virG gene of a bacterium of the genus Agrobacterium.
4. The method for transforming a plant according to claim 2 or 3, wherein the selectable marker gene for plant transformation is selected from the group consisting of a hygromycin resistance gene, a phosphinothricin resistance gene and a kanamycin resistance gene.
Description:
[0001] This application is a Divisional of co-pending application Ser. No.
12/306,163 filed on Mar. 16, 2009 and for which priority is claimed under
35 U.S.C. §120. Application Ser. No. 12/306,163 is the national
phase of PCT International Application No. PCT/JP2007/062720 filed on
Jun. 25, 2007 under 35 U.S.C. §371. PCT International Application No.
PCT/JP2007/062720 claims the benefit of priority of PCT/JP2006/312633
filed on Jun. 23, 2006. The entire contents of each of the
above-identified applications are hereby incorporated by reference.
TECHNICAL FIELD
[0002] The present invention relates to novel cosmid vectors for transforming plants and to uses thereof.
BACKGROUND ART
[0003] Various vectors have been previously developed for the purpose of plant transformation.
[0004] Recently, the entire genome sequences of Arabidopsis thaliana and rice (Oryza sativa) were elucidated, which moved the focus of plant genome studies from the accumulation of nucleotide sequence information to the elucidation of gene functions. For the elucidation of gene functions, experiments are absolutely necessary in which cloned DNA is transferred into a plant to analyze changes in the phenotype. If large DNA could be transferred in this operation, the study efficiency would be dramatically improved.
[0005] Thus, a number of vectors intended to transfer large DNA fragments into plants were developed. As typical examples, cosmid vectors for plant transformation were prepared, such as pOCA18 (Olszewski et al., 1988, Nucleic Acids Res. 16: 10765-10782) and pLZO3 (Lazo et al., 1991, Bio/Technology 9: 963-967). The use of a cosmid has the advantage that a lambda phage packaging reaction can be used, which allows easy cloning of relatively large genomic fragments (Sambrook J. and Russell D. W. 2001. Molecular Cloning, A Laboratory Manual, 3rd edn. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., USA.). In cloning with a cosmid vector and a packaging reaction, the total size of the vector and the insert fragment is 40 kb-50 kb so that the size of the insert fragment is restricted within a certain range by the size of the vector and the sizes of the vector and the insert fragment inversely correlate with each other.
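The inverse relation between vector size and insert size described above can be made concrete with a small calculation. The following Python sketch is illustrative only: it assumes the packageable total (vector plus insert) is exactly 40-50 kb as stated in the text, the function name `insert_range_kb` is hypothetical, and the vector sizes in the usage loop are arbitrary examples.

```python
# Illustrative sketch: insert capacity of a cosmid under lambda packaging.
# Assumption: the packageable total (vector + insert) is 40-50 kb, as stated in the text.
PACKAGE_MIN_KB = 40.0
PACKAGE_MAX_KB = 50.0


def insert_range_kb(vector_kb):
    """Return (min, max) insert size in kb for a vector of the given size."""
    if vector_kb <= 0 or vector_kb >= PACKAGE_MAX_KB:
        raise ValueError("vector size must be positive and below the packaging maximum")
    low = max(0.0, PACKAGE_MIN_KB - vector_kb)  # smallest insert that still packages
    high = PACKAGE_MAX_KB - vector_kb           # largest insert that still packages
    return low, high


if __name__ == "__main__":
    # Arbitrary example vector sizes, chosen only to show the trend.
    for size in (15.0, 25.0, 30.0):
        low, high = insert_range_kb(size)
        print(f"{size:.0f} kb vector: insert of about {low:.0f}-{high:.0f} kb")
```

Under these assumptions, every kilobase removed from the vector backbone becomes a kilobase of additional insert capacity.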
[0006] Vectors such as pOCA18 and pLZO3 contain elements for plant transformation such as T-DNA border sequences and a selectable marker (kanamycin resistance gene) in pRK290 (Ditta et al., 1980, Proc. Natl. Acad. Sci. USA 77: 7347-7351), which is a typical vector having an origin of replication (oriV) of an IncP plasmid that is functional in both E. coli and Agrobacterium. These vectors themselves are 24.3-30.1 kb in size, and therefore the size of DNA that can be cloned using a packaging reaction was about 20 kb (pOCA18), or about 13-22 kb (pLZO3) on average. These vectors have an origin of replication (oriV) of an IncP plasmid, but other vectors such as pCIT103 and pCIT104 (Ma et al. 1992 Gene 117: 161-167) have an origin of replication from ColE1 in addition to an origin of replication (oriV) of an IncP plasmid. On the other hand, pC22 (Simoens et al. 1986 Nucleic Acids Res 14: 8073-8090) is a vector having an origin of replication from ColE1 and an origin of replication from an Ri plasmid. Other cosmid vectors capable of plant transformation include pMON565 (Klee et al. 1987 Mol Gen Genet 210: 282-287) and pCLD04541 (Bent et al. 1994 Science 265: 1856-1860), but they are not suitable for cloning DNA fragments of 25 kb or more because their own sizes are 24 kb and 29 kb, respectively. Other examples such as pE4cos (16 kb, Klee et al. 1987 Mol Gen Genet 210: 282-287), pMON565, pLZO3, pOCA18, pCLD04541, pC22 and the like had a structure containing a cos site within the T-DNA.
[0007] Subsequently, the BIBAC vector (binary bacterial artificial chromosome, Hamilton U.S. Pat. No. 5,733,744, Hamilton et al., 1996, Proc. Natl. Acad. Sci. USA 93:9975-9979, Hamilton, 1997, Gene 200:107-116) was developed, which is capable of cloning DNA fragments of up to about 150 kb and transferring them into plants. This vector is based on a BAC vector capable of carrying large DNA fragments and further contains elements for plant transformation such as T-DNA border sequences and a selectable marker as well as an origin of replication for Agrobacterium. The TAC vector (transformation-competent bacterial artificial chromosome) pYLTAC7 (Liu et al., 1999, Proc. Natl. Acad. Sci. USA 96: 6535-6540) was also developed, which is capable of cloning DNA of up to about 80 kb and transferring it into plants. This vector is based on a high-capacity PAC vector (P1-derived artificial chromosome) using the replication mechanism of P1 phage and contains elements for plant transformation such as T-DNA border sequences and a selectable marker as well as an origin of replication for Agrobacterium. These vectors contain an origin of replication (ori) from a plasmid existing as a single copy per cell in E. coli and Agrobacterium for the purpose of stably maintaining a large foreign gene. That is, they use an F factor ori (BIBAC) or a P1 phage ori (TAC) as the ori for E. coli and an Ri ori from Agrobacterium rhizogenes (both BIBAC and TAC) as the ori for Agrobacterium. However, the use of an origin of replication from a single-copy plasmid is not necessarily essential, and vectors having an origin of replication (oriV) of an IncP plasmid known to exist as a few copies per cell such as pSLJ1711 and pCLD04541 were reported to be capable of stably maintaining plant genomic DNA fragments of more than 100 kb in size (Tao and Zhang (1998) Nucleic Acids Res 26: 4901-4909). In addition, pBIGRZ was also reported, which contains the Ri ori in the versatile binary plasmid vector pBI121 (JPA Hei-10-155485).
[0008] Such vectors can be used to clone large DNA fragments far exceeding 50 kb, but involve complicated cloning operations. Cloning of large DNA requires skilled techniques and a considerable amount of time and labor. Transformation with BIBAC requires special Agrobacterium cells overexpressing virG or the like and results in a much lower transformation efficiency (the number of selected calli/inoculated leaf section) for fragments of 150 kb as compared with those of normal small vectors (Hamilton et al., 1996, Proc Natl Acad Sci USA 93: 9975-9979, Shibata and Liu, 2000, Trends Plant Sci 5: 354-357). Thus, transformation of large fragments into plants with BIBAC or TAC is limited to a few specific examples of large fragments (e.g., Hamilton et al., 1996, Proc Natl Acad Sci USA 93: 9975-9979, Liu et al., 1999, Proc Natl Acad Sci USA 96: 6535-6540, Lin et al., 2003, Proc Natl Acad Sci USA 100: 5962-5967, Nakano et al., 2005, Mol Gen Genomics 273: 123-129).
[0009] As described above, pCLD04541 is a cosmid of 29 kb in size, and therefore, the size of DNA fragments that can be cloned using a lambda phage packaging reaction is 10-20 kb. If cloning of larger DNA fragments is intended, a packaging reaction cannot be used as described above, and thus complicated cloning operations and a considerable amount of time and labor are required.
[0010] Recently, genetic markers based on DNA sequence polymorphisms or so-called DNA markers are used more and more frequently with the advance in genome studies of higher plants. Many attempts have been made to clone unknown genes of higher plants known only by their phenotypes on the basis of genetic map information using DNA markers, i.e., so-called map-based cloning. Generally, the basic protocol of map-based cloning is as follows.
[0011] 1. Examine a relatively small segregating population with a set of DNA markers widely used for rough mapping of a candidate region on a chromosome.
[0012] 2. Screen a large segregating population with a set of DNA markers newly designed for the particular region of the genome to narrow down the candidate region.
[0013] 3. Determine the nucleotide sequence of the genetic region and guess a candidate gene.
[0014] 4. Transfer a DNA fragment containing the candidate gene into a plant and determine the effect/function of the gene on the basis of the phenotype.
[0015] Many previous successful cases often involve narrowing down the genetic region to about 1-3 genes in step 3 and transferring several DNA fragments of several kilobases or less in step 4. However, it is not always easy to narrow down the genetic region. For example, it is often impossible to narrow down the genetic region to 150 kb or less in chromosomal regions near centromeres because of the low frequency of genetic recombination upon cross-hybridization. Even cases where narrowing down is possible often require repeating the operation of step 2 and therefore enormous amounts of time. Even if narrowing down to about 50 kb were possible, it would be very difficult to guess a candidate gene without strong information linking the phenotype to the gene sequence in step 3.
[0016] Thus, map-based cloning is relatively easy until the step of defining a candidate region including one to a few DNA fragments cloned by a BAC vector by narrowing down to some extent (to 50 kb to several hundreds of kilobases), but it is often technically difficult to further pursue the analysis to practically identify a gene, and even if it is possible, enormous amounts of labor and time are often required.
REFERENCES
[0017] Patent Publication No. 1: U.S. Pat. No. 5,733,744
[0018] Patent Publication No. 2: Japanese Patent Laid-open Publication No. H10-155485
[0019] Patent Publication No. 3: WO2005/040374
[0020] Non-patent Publication No. 1: Olszewski et al., 1988, Nucleic Acids Res. 16: 10765-10782
[0021] Non-patent Publication No. 2: Lazo et al., 1991, Bio/Technology 9: 963-967
[0023] Non-patent Publication No. 3: Ditta et al., 1980, Proc. Natl. Acad. Sci. USA 77: 7347-7351
[0024] Non-patent Publication No. 4: Ma et al. 1992 Gene 117: 161-167
[0025] Non-patent Publication No. 5: Simoens et al. 1986 Nucleic Acids Res 14: 8073-8090
[0026] Non-patent Publication No. 6: Klee et al. 1987 Mol Gen Genet 210: 282-287
[0027] Non-patent Publication No. 7: Bent et al. 1994 Science 265: 1856-1860
[0028] Non-patent Publication No. 8: Hamilton et al., 1996, Proc. Natl. Acad. Sci. USA 93:9975-9979
[0029] Non-patent Publication No. 9: Hamilton, 1997, Gene 200:107-116
[0030] Non-patent Publication No. 10: Liu et al., 1999, Proc. Natl. Acad. Sci. USA 96: 6535-6540
[0031] Non-patent Publication No. 11: Tao and Zhang, 1998, Nucleic Acids Res 26: 4901-4909
[0032] Non-patent Publication No. 12: Shibata and Liu, 2000, Trends Plant Sci 5: 354-357
[0033] Non-patent Publication No. 13: Lin et al., 2003, Proc Natl Acad Sci USA 100: 5962-5967
[0034] Non-patent Publication No. 14: Nakano et al., 2005, Mol Gen Genomics 273: 123-129
[0035] Non-patent Publication No. 15: Pansegrau et al. (1994) J Mol Biol 239: 623-663
[0036] Non-patent Publication No. 16: Knauf and Nester 1982 Plasmid 8: 45-54
[0037] Non-patent Publication No. 17: Komari et al. 1996 Plant J 10:165-174
[0038] Non-patent Publication No. 18: Zambryski et al. 1980 Science 209: 1385-1391
[0039] Non-patent Publication No. 19: Schmidhauser and Helinski, J. Bacteriol. 164:446-455, 1985
[0040] Non-patent Publication No. 20: Winans et al. 1986 Proc. Natl. Acad. Sci. USA 83: 8278-8282
[0041] Non-patent Publication No. 21: Pazour et al. 1992 J. Bacteriol. 174:4169-4174
[0042] Non-patent Publication No. 22: Ward et al. (1988) J Biol Chem 263: 5804-5814
[0043] Non-patent Publication No. 23: Frame et al. 2002 Plant Physiol 129: 13-22
[0044] Non-patent Publication No. 24: Hansen et al. 1994 Proc. Natl. Acad. Sci. USA 91:7603-7607
[0045] Non-patent Publication No. 25: Ishida et al. 1996 Nat Biotechnol 14:745-50
[0046] Non-patent Publication No. 26: Close et al. 1984 Plasmid 12: 111-118
[0047] Non-patent Publication No. 27: Jin et al. 1987 J Bacteriol 169: 4417-4425
[0048] Non-patent Publication No. 28: Wang et al. 2000 Gene 242: 105-114
[0049] Non-patent Publication No. 29: Okumura and Kado (1992) Mol Gen Genet 235: 55-63
[0050] Non-patent Publication No. 30: Christensen et al. 1992 Plant Mol Biol 18: 675-689
[0051] Non-patent Publication No. 31: Bilang et al. (1991) Gene 100: 247-250
[0052] Non-patent Publication No. 32: Hirsch and Beringer 1984 Plasmid 12: 139-141
[0053] Non-patent Publication No. 33: Konieczny and Ausubel 1993 Plant Journal 4: 403-410
[0054] Non-patent Publication No. 34: Hiei et al. (1994) Plant J 6: 271-282
[0055] Non-patent Publication No. 35: Ishida et al. (2003) Plant Biotechnology 20:57-66
[0056] Non-patent Publication No. 36: Hiei and Komari (2006) Plant Cell, Tissue and Organ Culture 85: 271-283
[0057] Non-patent Publication No. 37: Komori et al. (2004) Plant J 37: 315-325
[0058] Non-patent Publication No. 38: Kazama and Toriyama (2003) FEBS lett 544: 99-102.
DISCLOSURE OF THE INVENTION
Problems to be Solved by the Invention
[0059] WO2005/040374, which is incorporated by reference herein in its entirety, discloses a method for efficiently selecting and preparing a number of genomic DNA fragments capable of improving traits expressed in heterosis or quantitative traits as cloned DNA fragments. We have selected large genomic DNA fragments capable of introducing agriculturally useful mutations by using the method described in WO2005/040374. However, the success rate of transferring clones carried in E. coli into Agrobacterium was about 80%. Moreover, only about 60% of the Agrobacterium strains harboring clones were able to transform plants. In view of this result, we examined whether or not the efficiency of the method of WO2005/040374 could be significantly improved by changing the vector used. However, the efficiency of this method could not be improved by any previously known vector.
[0060] Thus, an object of the present invention is to provide a novel vector capable of improving the efficiency of selecting and cloning relatively large genomic DNA fragments, e.g., in the method described in WO2005/040374.
[0061] Another object of the present invention is to provide a vector preferably fulfilling all of the requirements below:
[0062] it allows efficient cloning of DNA fragments of about 25-40 kb in size;
[0063] it is stably maintained in E. coli and Agrobacterium cells;
[0064] it can be efficiently introduced into Agrobacterium;
[0065] the copy number per cell in E. coli and Agrobacterium is 4-5; and
[0066] it allows efficient transfer of only cloned DNA fragments of interest into plants, preferably monocotyledons.
[0067] Still another object of the present invention is to provide a gene transfer method for transferring a gene into a plant at a very high efficiency using such a vector.
[0068] Still another object of the present invention is to provide a method for rapidly narrowing down a gene region for completing map-based cloning with ease in a short time using such a vector.
[0069] Still another object of the present invention is to provide a plasmid capable of further improving the transformation efficiency by combining it with said vector.
Means for Solving the Problems
[0070] Cosmid Vectors
[0071] The cosmid vectors of the present invention are vectors having a full length of 15 kb or less satisfying all of the following criteria (hereinafter referred to as "pLC vectors"):
[0072] 1) they contain an origin of replication (oriV) of an IncP plasmid, but do not contain any origin of replication of other plasmid groups;
[0073] 2) they contain the trfA1 gene of an IncP plasmid;
[0074] 3) they contain an origin of conjugative transfer (oriT) of an IncP plasmid;
[0075] 4) they contain the incC1 gene of an IncP plasmid;
[0076] 5) they contain a cos site of lambda phage and the cos site is located outside the T-DNA;
[0077] 6) they contain a drug resistance gene expressed in E. coli and a bacterium of the genus Agrobacterium;
[0078] 7) they contain a T-DNA right border sequence of a bacterium of the genus Agrobacterium;
[0079] 8) they contain a T-DNA left border sequence of a bacterium of the genus Agrobacterium;
[0080] 9) they contain a selectable marker gene for plant transformation located between 7) and 8) and expressed in a plant; and
[0081] 10) they contain restriction endonuclease recognition site(s) located between 7) and 8) for cloning a foreign gene.
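The ten criteria above, together with the 15 kb length limit, can be read as a checklist applied to a candidate vector design. The Python sketch below is one hypothetical way to encode that checklist; the class `VectorDesign`, its field names and the function `unmet_criteria` are illustrative assumptions and are not part of the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VectorDesign:
    """Hypothetical summary of a candidate vector; field names are illustrative."""
    length_kb: float
    origin_groups: List[str]        # plasmid groups of all replication origins present
    has_trfA1: bool
    has_oriT: bool
    has_incC1: bool
    has_cos: bool
    cos_inside_tdna: bool
    has_bacterial_resistance: bool  # expressed in E. coli and Agrobacterium
    has_right_border: bool
    has_left_border: bool
    plant_marker_in_tdna: bool
    cloning_site_in_tdna: bool


def unmet_criteria(v: VectorDesign) -> List[str]:
    """Return the labels of the criteria that the design does not satisfy."""
    checks = [
        (v.length_kb <= 15.0, "full length of 15 kb or less"),
        (v.origin_groups == ["IncP"], "1) IncP oriV and no other origin"),
        (v.has_trfA1, "2) trfA1 gene"),
        (v.has_oriT, "3) oriT"),
        (v.has_incC1, "4) incC1 gene"),
        (v.has_cos and not v.cos_inside_tdna, "5) cos site outside the T-DNA"),
        (v.has_bacterial_resistance, "6) drug resistance gene"),
        (v.has_right_border, "7) T-DNA right border"),
        (v.has_left_border, "8) T-DNA left border"),
        (v.plant_marker_in_tdna, "9) plant selectable marker between the borders"),
        (v.cloning_site_in_tdna, "10) cloning site(s) between the borders"),
    ]
    return [label for ok, label in checks if not ok]
```

A design that returns an empty list satisfies the checklist as encoded here; whether an actual construct functions as intended still has to be established experimentally.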
[0082] The vectors of the present invention are cosmid vectors containing a cos site of lambda phage. This allows cloning of relatively large genomic fragments by a packaging reaction (Sambrook J. and Russell D. W. 2001. Molecular Cloning, A Laboratory Manual, 3rd edn. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., USA.). In cloning using a packaging reaction of a cosmid vector, the total size of the vector and the insert fragment is around 40 kb-50 kb so that the size of the insert fragment is restricted within a certain range by the size of the vector. The vectors of the present invention have a full length of 15 kb or less, preferably 12-14 kb because they are intended to clone a DNA fragment of up to about 25-40 kb, preferably 30-40 kb.
[0083] 1) An origin of replication (oriV) of an IncP plasmid: The oriV is functional in both E. coli and Agrobacterium. The nucleotide sequence of the oriV of the present invention is not specifically limited so far as it has the function of oriV, i.e., the function of an origin of replication of an IncP plasmid.
[0084] The oriV has molecular biological properties described in detail in Pansegrau et al. (1994) J Mol Biol 239: 623-663, and it is defined as nucleotides 12200-12750 of the sequence of Genbank/EMBL Accession Number L27758 (full length 60099 bp). This corresponds to nucleotides 3451-4002 of SEQ ID NO: 1 (core sequence of oriV).
[0085] The oriV can be conventionally prepared from an IncP plasmid such as pVK102 (Knauf and Nester 1982 Plasmid 8: 45-54). For example, a 0.9 kb DNA (nucleotides 3345-4247 of SEQ ID NO: 1) amplified by PCR from pVK102 can be used as the oriV.
[0086] Alternatively, a nucleic acid containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of nucleotides 3451-4002, more preferably 3345-4247 of SEQ ID NO: 1 described above under stringent conditions and having the function of oriV can also be used. Alternatively, a nucleic acid containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of nucleotides 3451-4002, more preferably 3345-4247 of SEQ ID NO: 1 described above and having the function of oriV can also be used.
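The identity thresholds above (95%, 97%, 99%) presuppose some way of computing sequence identity, which the text does not specify. As a minimal sketch, the Python code below assumes two sequences that have already been aligned to the same length (gaps written as "-") and counts identity as matching columns divided by all alignment columns; this is only one of several conventions in use, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: gaps are written as '-', and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a, aligned_b, threshold=95.0):
    """True if the identity reaches the given threshold (95% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Producing the alignment itself is a separate step that this sketch does not cover.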
[0087] It will be recognized by those skilled in the art that a shorter region in nucleotides 3451-4002 of SEQ ID NO: 1 may be selected as a sequence having a similar function. We investigated from various viewpoints the reason why the final transformation efficiency was only about 50% (80% × 60% = 48%) when the method described in WO2005/040374 was used, and concluded that this might be ascribable to the use of the cloning vector pSB200.
[0088] The replication origin of pSB200 is from ColE1, and plasmids having an origin of replication from ColE1 exist in a relatively high copy number, i.e., 30-40 copies per E. coli cell. Tao and Zhang (1998, Nucleic Acids Res 26: 4901-4909) assume that E. coli can stably maintain 1200-1500 kb of foreign DNA per cell. If 30-40 kb of DNA is cloned by pSB200, the total DNA amount including the vector size reaches 1200-2000 kb per cell, which may exceed the range assumed above. Another possible reason is that pSB200 is a plasmid that is not replicated alone in Agrobacterium. Thus, a vector called pSB1, which has an origin of replication (oriV) of an IncP plasmid, is introduced into Agrobacterium beforehand, and a cointegrate between pSB200 and pSB1 is prepared via homologous recombination between DNA sequences contained in both pSB200 and pSB1, thereby introducing pSB200 into Agrobacterium. It cannot be ruled out that some adverse event occurs during such an operation and causes the transfer of pSB200 to fail.
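The copy-number argument in this paragraph is simple arithmetic and can be sketched as follows. The Python code assumes a total plasmid size (vector plus insert) of 40-50 kb, which is the packaging range given earlier and reproduces the 1200-2000 kb figure; the text does not state the size of pSB200 itself. The comparison with 4-5 copies uses the copy number stated among the objects of the invention. Function names are hypothetical.

```python
# Range assumed by Tao and Zhang (1998) for stably maintained foreign DNA per cell (kb).
STABLE_RANGE_KB = (1200, 1500)


def dna_load_kb(copies, plasmid_kb):
    """Return the (min, max) plasmid DNA per cell in kb.

    copies and plasmid_kb are (low, high) ranges.
    """
    (low_copies, high_copies), (low_size, high_size) = copies, plasmid_kb
    return low_copies * low_size, high_copies * high_size


def may_exceed_stable_range(load_kb, stable_range_kb=STABLE_RANGE_KB):
    """Conservative check: True if the maximum load is above the lower stable bound."""
    return load_kb[1] > stable_range_kb[0]


if __name__ == "__main__":
    plasmid_kb = (40, 50)  # assumption: vector + insert, taken from the packaging range
    for label, copies in (("ColE1 origin, 30-40 copies", (30, 40)),
                          ("IncP oriV, 4-5 copies", (4, 5))):
        load = dna_load_kb(copies, plasmid_kb)
        print(label, load, "may exceed:", may_exceed_stable_range(load))
```

With these inputs, the high-copy case gives 1200-2000 kb per cell, while the low-copy case gives 160-250 kb, well below the assumed range.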
[0089] If the copy number in E. coli and Agrobacterium is too low, however, the analysis of DNA or the like will be inefficient.
[0090] Based on the foregoing discussion, we prepared and tested vectors containing an origin of replication (oriV) of an IncP plasmid that is functional in both E. coli and Agrobacterium but not any origin of replication of other plasmid groups and existing in 4-5 copies in these bacteria. As a result, we found that the transformation efficiency is improved by using such vectors in plant transformation, specifically e.g., in the method described in WO2005/040374, and thus achieved the present invention.
[0091] 2) The trfA1 gene of an IncP plasmid: The trfA1 gene is important as a trans-acting replication factor of IncP plasmids and necessary for an oriV to perform its function. The nucleotide sequence of the trfA1 gene of the present invention is not specifically limited so far as it has the function of trfA1, i.e. the function of a trans-acting replication factor.
[0092] It has molecular biological properties described in detail in Pansegrau et al. (1994) J Mol Biol 239: 623-663, and it is defined as nucleotides 16521-17669 of the sequence of Genbank/EMBL Accession Number L27758 (full length 60099 bp). This corresponds to nucleotides 6323-7471 of SEQ ID NO: 1 (core sequence of trfA1).
[0093] TrfA1 can be conventionally prepared from an IncP plasmid such as pVK102 (Knauf and Nester 1982 Plasmid 8: 45-54). For example, a 3.2 kb DNA fragment (nucleotides 5341-8507 of SEQ ID NO: 1) amplified by PCR from pVK102 can be used as the trfA1 gene.
[0094] Alternatively, a nucleic acid containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of nucleotides 6323-7471, more preferably 5341-8507 of SEQ ID NO: 1 described above under stringent conditions and having the function of the trfA1 gene can also be used. Alternatively, a nucleic acid containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of nucleotides 6323-7471, more preferably 5341-8507 of SEQ ID NO: 1 described above and having the function of the trfA1 gene can also be used.
[0095] It will be recognized by those skilled in the art that a shorter region in nucleotides 6323-7471 of SEQ ID NO: 1 may be selected as a sequence having a similar function.
[0096] 3) An origin of conjugative transfer (oriT) of an IncP plasmid: oriT is an element responsible for conjugation (mating). One of the purposes of the vectors of the present invention is to perform large-scale, high-efficiency transformation. For that purpose, conjugation (mating) between E. coli and Agrobacterium is necessary, and oriT contributes to the conjugation (mating). The sequence of the oriT of the present invention is not specifically limited so far as it has the function of oriT, i.e. the function of an element responsible for conjugation (mating).
[0097] The oriT has molecular biological properties described in detail in Pansegrau et al. (1994) J Mol Biol 239: 623-663, and it is defined as nucleotides 51097-51463 of the sequence of Genbank/EMBL Accession Number L27758 (full length 60099 bp). The oriT can be conventionally prepared from an IncP plasmid such as pVK102 (Knauf and Nester 1982 Plasmid 8: 45-54). For example, a 0.8 kb DNA fragment (nucleotides 1-816 of SEQ ID NO: 1) amplified by PCR from pVK102 can be used as the oriT.
[0098] Alternatively, a nucleic acid containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of nucleotides 1-816 of SEQ ID NO: 1 described above under stringent conditions and having the function of oriT can also be used. Alternatively, a nucleic acid containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of nucleotides 1-816 of SEQ ID NO: 1 described above and having the function of oriT can also be used.
[0099] It will be recognized by those skilled in the art that a shorter region in nucleotides 1-816 of SEQ ID NO: 1 may be selected as a sequence having a similar function.
[0100] 4) The incC1 gene of an IncP plasmid: The incC1 gene contributes to the stability of IncP plasmids. The nucleotide sequence of the incC1 gene of the present invention is not specifically limited so far as it has the function of the incC1 gene, i.e. the function of contributing to the stability of IncP plasmids.
[0101] This gene has molecular biological properties described in detail in Pansegrau et al. (1994) J Mol Biol 239: 623-663, and it is defined as nucleotides 58260-59354 of the sequence of Genbank/EMBL Accession Number L27758 (full length 60099 bp). This corresponds to nucleotides 1179-2273 of SEQ ID NO: 1 (core sequence of the incC1 gene).
[0102] IncC1 can be conventionally prepared from an IncP plasmid such as pVK102 (Knauf and Nester 1982 Plasmid 8: 45-54). For example, a 2.1 kb DNA fragment (nucleotides 817-2935 of SEQ ID NO: 1) amplified by PCR from pVK102 can be used as the incC1 gene.
[0103] Alternatively, a nucleic acid containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of nucleotides 1179-2273, more preferably 817-2935 of SEQ ID NO: 1 described above under stringent conditions and having the function of the incC1 gene can also be used. Alternatively, a nucleic acid containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of nucleotides 1179-2273, more preferably 817-2935 of SEQ ID NO: 1 described above and having the function of the incC1 gene can also be used.
[0104] It will be recognized by those skilled in the art that a shorter region in nucleotides 1179-2273 of SEQ ID NO: 1 may be selected as a sequence having a similar function.
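The fragment sizes quoted for the four IncP elements follow directly from their coordinates in SEQ ID NO: 1. As an illustrative sketch, the Python code below recomputes them from the 1-based, inclusive ranges given in the text; the dictionary and function names are hypothetical, and `to_python_slice` merely shows how such coordinates would map onto a 0-based string slice.

```python
# 1-based, inclusive coordinates in SEQ ID NO: 1, as given in the text
# for the PCR-amplified fragments.
SEGMENTS = {
    "oriT": (1, 816),
    "incC1": (817, 2935),
    "oriV": (3345, 4247),
    "trfA1": (5341, 8507),
}


def length_bp(start, end):
    """Length of a 1-based inclusive range."""
    return end - start + 1


def to_python_slice(start, end):
    """Convert a 1-based inclusive range to a 0-based Python slice."""
    return slice(start - 1, end)


if __name__ == "__main__":
    for name, (start, end) in SEGMENTS.items():
        size = length_bp(start, end)
        print(f"{name}: {size} bp (about {size / 1000:.1f} kb)")
```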
[0105] 5) A cos site of lambda phage: The vectors of the present invention contain a cos site of lambda phage to utilize the packaging reaction of cosmid vectors. The nucleotide sequence of the cos site of lambda phage of the present invention is not specifically limited so far as it has the function of a cos site of lambda phage, i.e. the function contributing to the packaging reaction of cosmid vectors.
[0106] The cos site of lambda phage has molecular biological properties described in detail in Sambrook J. and Russell D. W. (2001), and it has the sequence 5'-aggtcgccgccc-3' (SEQ ID NO: 9) (the core sequence of a cos site of lambda phage). The cos can be conventionally prepared from a plasmid such as pSB11 (Komari et al. 1996 Plant J 10:165-174). For example, a 0.4 kb DNA fragment (nucleotides 2936-3344 of SEQ ID NO: 1) amplified by PCR from pSB11 can be used.
[0107] Alternatively, a nucleic acid containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of SEQ ID NO: 9 described above, more preferably the nucleotide sequence of nucleotides 2936-3344 of SEQ ID NO: 1 under stringent conditions and having the function of a cos site of lambda phage can also be used. Alternatively, a nucleic acid containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of SEQ ID NO: 9 described above, more preferably the nucleotide sequence of nucleotides 2936-3344 of SEQ ID NO: 1 and having the function of a cos site of lambda phage can also be used.
[0108] The cos site should be located outside the T-DNA because undesired DNA will be introduced into plants if the cos site is located inside the T-DNA.
[0109] 6) The drug resistance gene expressed in E. coli and a bacterium of the genus Agrobacterium is used as a selectable marker for transformation. This drug resistance gene confers e.g., antibiotic resistance or autotrophy, including, but not limited to, a kanamycin resistance gene, a spectinomycin resistance gene, an ampicillin resistance gene, a tetracycline resistance gene, a gentamycin resistance gene, a hygromycin resistance gene, etc.
[0110] 7), 8) T-DNA right border sequence (RB) and left border sequence (LB) of a bacterium of the genus Agrobacterium are essential for transformation (Zambryski et al. 1980 Science 209: 1385-1391), and a cloning site for a foreign gene is located between them. The nucleotide sequences of the RB and LB of the present invention are not specifically limited so far as they have the function of T-DNA right border sequence (RB) and left border sequence (LB) of a bacterium of the genus Agrobacterium. They can be each conventionally prepared from a plasmid such as pSB11 (Komari et al. 1996 Plant J 10:165-174). For example, nucleotides 13253-13277 and 3479-3503 of SEQ ID NO: 2 can be used, respectively.
[0111] Alternatively, nucleic acids containing nucleotide sequences hybridizing to complementary strands of the nucleotide sequences of nucleotides 13253-13277 and 3479-3503 of SEQ ID NO: 2 described above under stringent conditions and having the functions of the RB and LB, respectively, can also be used. Alternatively, nucleic acids containing nucleotide sequences having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequences of nucleotides 13253-13277 and 3479-3503 of SEQ ID NO: 2 described above and having the functions of the RB and LB, respectively, can also be used.
[0112] It will be recognized by those skilled in the art that shorter regions in nucleotides 13253-13277 and 3479-3503 of SEQ ID NO: 2 may be selected as sequences having similar functions.
[0113] 9) A selectable marker gene for plant transformation expressed in a plant cell and located between 7) and 8) is included. The selectable marker gene for plant transformation is not specifically limited, and known selectable marker genes can be used. Preferably, it is any one of a hygromycin resistance gene, a phosphinothricin resistance gene, and a kanamycin resistance gene. For use in transformation of monocotyledons, a hygromycin resistance gene or a phosphinothricin resistance gene is preferred.
[0114] 10) Restriction endonuclease recognition site(s) located between 7) and 8) for cloning a foreign gene are included. The restriction endonuclease recognition sites for cloning a foreign gene are not specifically limited, and known restriction endonuclease recognition sites can be used, but the same recognition sites are desirably absent elsewhere on the vectors.
[0115] In the cosmid vector constructs of the present invention, the order of all of the seven elements consisting of elements 1)-6) and a unit of 7)-10) is not limited. Moreover, the order of 9) and 10) located between 7) and 8) is not limited.
[0116] The cosmid vectors of the present invention preferably satisfy one or more of the following criteria A-G.
[0117] A. The nucleotide sequence of oriV in 1) comprises the following nucleotide sequence:
[0118] i) the nucleotide sequence of nucleotides 3451-4002, more preferably 3345-4247 of SEQ ID NO: 1;
[0119] ii) a nucleotide sequence containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of nucleotides 3451-4002, more preferably 3345-4247 of SEQ ID NO: 1 under stringent conditions and having the function of oriV; or
[0120] iii) a nucleotide sequence containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of nucleotides 3451-4002, more preferably 3345-4247 of SEQ ID NO: 1 and having the function of oriV.
[0121] B. The trfA1 gene in 2) comprises the following nucleotide sequence:
[0122] i) the nucleotide sequence of nucleotides 6323-7471, more preferably 5341-8507 of SEQ ID NO: 1;
[0123] ii) a nucleotide sequence containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of nucleotides 6323-7471, more preferably 5341-8507 of SEQ ID NO: 1 under stringent conditions and having the function of the trfA1 gene;
[0124] iii) a nucleotide sequence containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of nucleotides 6323-7471, more preferably 5341-8507 of SEQ ID NO: 1 and having the function of the trfA1 gene.
[0125] C. The oriT in 3) comprises the following nucleotide sequence:
[0126] i) the nucleotide sequence of nucleotides 1-816 of SEQ ID NO: 1;
[0127] ii) a nucleotide sequence containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of nucleotides 1-816 of SEQ ID NO: 1 under stringent conditions and having the function of oriT;
[0128] iii) a nucleotide sequence containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of nucleotides 1-816 of SEQ ID NO: 1 and having the function of oriT.
[0129] D. The incC1 gene in 4) comprises the following nucleotide sequence:
[0130] i) the nucleotide sequence of nucleotides 1179-2273, more preferably 817-2935 of SEQ ID NO: 1;
[0131] ii) a nucleotide sequence containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of nucleotides 1179-2273, more preferably 817-2935 of SEQ ID NO: 1 under stringent conditions and having the function of the incC1 gene;
[0132] iii) a nucleotide sequence containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of nucleotides 1179-2273, more preferably 817-2935 of SEQ ID NO: 1 and having the function of the incC1 gene.
[0133] E. The cos site of lambda phage in 5) comprises the following nucleotide sequence:
[0134] i) the nucleotide sequence of SEQ ID NO: 9, more preferably the nucleotide sequence of nucleotides 2936-3344 of SEQ ID NO: 1;
[0135] ii) a nucleotide sequence containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of SEQ ID NO: 9, more preferably the nucleotide sequence of nucleotides 2936-3344 of SEQ ID NO: 1 under stringent conditions and having the function of a cos site of lambda phage;
[0136] iii) a nucleotide sequence containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of SEQ ID NO: 9, more preferably the nucleotide sequence of nucleotides 2936-3344 of SEQ ID NO: 1 and having the function of a cos site of lambda phage.
[0137] F. The T-DNA right border sequence (RB) of a bacterium of the genus Agrobacterium in 7) comprises the following nucleotide sequence:
[0138] i) the nucleotide sequence of nucleotides 13253-13277 of SEQ ID NO: 2;
[0139] ii) a nucleotide sequence containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of nucleotides 13253-13277 of SEQ ID NO: 2 under stringent conditions and having the function of RB;
[0140] iii) a nucleotide sequence containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of nucleotides 13253-13277 of SEQ ID NO: 2 and having the function of RB.
[0141] G. The T-DNA left border sequence (LB) of a bacterium of the genus Agrobacterium in 8) comprises the following nucleotide sequence:
[0142] i) the nucleotide sequence of nucleotides 3479-3503 of SEQ ID NO: 2;
[0143] ii) a nucleotide sequence containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of nucleotides 3479-3503 of SEQ ID NO: 2 under stringent conditions and having the function of LB;
[0144] iii) a nucleotide sequence containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of nucleotides 3479-3503 of SEQ ID NO: 2 and having the function of LB.
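The coordinates quoted in criteria A-E for SEQ ID NO: 1 are 1-based and inclusive, so the length of an element is end - start + 1. The following is a minimal sketch, not part of the claimed subject matter, that tabulates those coordinates and checks that each core region lies inside its preferred region. The dictionary layout and the helper names (span_length, extract, core_within_preferred) are illustrative assumptions.

```python
# Coordinates (1-based, inclusive) as quoted in the text for SEQ ID NO: 1.
# Names and layout are hypothetical; only the numbers come from the text.
PREFERRED = {
    "oriT": (1, 816),
    "incC1": (817, 2935),
    "cos": (2936, 3344),
    "oriV": (3345, 4247),
    "trfA1": (5341, 8507),
}
CORE = {
    "incC1": (1179, 2273),
    "oriV": (3451, 4002),
    "trfA1": (6323, 7471),
}


def span_length(start, end):
    """Length in nucleotides of a 1-based inclusive interval."""
    return end - start + 1


def extract(seq, start, end):
    """Return the sub-sequence for a 1-based inclusive interval."""
    return seq[start - 1:end]


def core_within_preferred(name):
    """True if the core region of an element lies inside its preferred region."""
    core_start, core_end = CORE[name]
    pref_start, pref_end = PREFERRED[name]
    return pref_start <= core_start and core_end <= pref_end


if __name__ == "__main__":
    for name, (start, end) in PREFERRED.items():
        print(f"{name}: {start}-{end} ({span_length(start, end)} nt)")
    for name in CORE:
        print(f"{name} core inside preferred region: {core_within_preferred(name)}")
```

Under these coordinates the incC1 fragment of nucleotides 817-2935 is 2119 nt (the "2.1 kb" fragment) and the cos fragment of nucleotides 2936-3344 is 409 nt (the "0.4 kb" fragment).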
[0145] By satisfying all of the criteria 1)-10) above for the cosmid vectors of the present invention, a vector fulfilling all of the requirements below can be prepared:
[0146] it allows efficient cloning of DNA fragments of about 25-40 kb in size, preferably 30-40 kb;
[0147] it is stably maintained in E. coli and Agrobacterium cells;
[0148] it can be efficiently introduced into Agrobacterium;
[0149] the copy number per cell in E. coli and Agrobacterium is 4-5; and
[0150] it allows efficient transfer of only cloned DNA fragments of interest into plants, preferably monocotyledons.
[0151] However, the development of such a vector was not straightforward even after the requirements above had been defined. This is partially due to the very complex control mechanism of plasmid replication. Specifically, the most suitable vector backbone for the requirements above is a small plasmid (on the order of 12 kb to 15 kb) having an origin of replication (oriV) of an IncP plasmid. However, the backbone 60 kb IncP plasmid has many genes involved in the replication of the plasmid and partitioning during cell division, resulting in a very complex mechanism, though its entire nucleotide sequence has been determined (Pansegrau et al. J. Mol. Biol. 239:623-663, 1994). Thus, it is not easy to prepare a small vector that has an origin of replication (oriV) of an IncP plasmid and is stably maintained in bacteria. In fact, plasmids of various sizes derived from IncP plasmids have been studied, but small plasmids are generally unstable and differ widely in stability depending on the bacterial species (Schmidhauser and Helinski, J. Bacteriol. 164:446-455, 1985). pE4cos is an example of a plasmid that has lost stability in Agrobacterium by size reduction. The reasons for this have been discussed to a certain extent (Klee et al. 1987 Mol Gen Genet 210: 282-287), but it can hardly be said that they have been clarified.
[0152] Schmidhauser and Helinski (J. Bacteriol. 164:446-455, 1985) say that "there is no universal set of genetic determinants in plasmid RK2 that accounts for stable maintenance in all gram-negative bacteria", indicating great difficulty in the preparation of a small and stable vector. The plasmid RK2 here (also often designated as pRK2) is one of typical IncP plasmids. The procedure for constructing such a vector often uses the step of cloning elements of a backbone plasmid using another vector. However, DNA fragments involved in the replication of bacterial plasmids or chromosomes are sometimes difficult to clone. If such a problem occurs, a means to solve it must be developed, which contributes to the difficulty in the construction of novel vectors.
[0153] Non-limitative examples of the cosmid vectors (pLC series) of the present invention are as follows.
[0154] i) pLC40 (SEQ ID NO: 2, FIG. 6)
[0155] A binary cosmid vector having a full length of 13429 bp characterized in that:
[0156] 1) it contains an origin of replication (oriV) of an IncP plasmid, but does not contain any origin of replication of other plasmid groups;
[0157] 2) it contains the trfA1 gene, 3) oriT, and 4) the incC1 gene of an IncP plasmid;
[0158] 5) it contains a cos site of lambda phage and the cos site is located outside the T-DNA;
[0159] 6) it contains the drug resistance gene nptIII (kanamycin resistance gene) expressed in E. coli and a bacterium of the genus Agrobacterium;
[0160] 7) it contains a T-DNA right border sequence of a bacterium of the genus Agrobacterium;
[0161] 8) it contains a T-DNA left border sequence of a bacterium of the genus Agrobacterium;
[0162] 9) it contains the selectable marker gene for plant transformation hpt (hygromycin resistance gene) located between 7) and 8) and expressed in a plant; and
[0163] 10) it contains restriction endonuclease recognition site(s) located between 7) and 8) for cloning a foreign gene, e.g., an NspV site.
[0164] pLC40 was prepared by inserting a region containing the T-DNA region of pSB200PcHm (FIG. 1) into p6FRG. It should be noted that p6FRG is a cosmid vector of 8507 bp in full length having the structure shown in FIG. 5 (SEQ ID NO: 1 in the Sequence Listing) characterized in that:
[0165] 1) it contains an origin of replication (oriV) of an IncP plasmid, but does not contain any origin of replication of other plasmid groups;
[0166] 2) it contains the trfA1 gene, oriT and the incC1 gene of an IncP plasmid, and a cos site of lambda phage;
[0167] 3) it contains the drug resistance gene nptIII (kanamycin resistance gene) expressed in E. coli and a bacterium of the genus Agrobacterium.
[0168] ii) pLC40GWH (SEQ ID NO: 3, FIG. 7)
[0169] A binary cosmid vector of 13174 bp in full length. It differs from pLC40 by an insertion of attB1, 2 sequences and a deletion of a 317 bp SspI-BalI region upstream of the RB. It was prepared by inserting a region containing the T-DNA region of pSB200PcHmGWH (FIG. 3) into p6FRG.
[0170] iii) pLC40 bar (SEQ ID NO: 4, FIG. 8)
[0171] A binary cosmid vector of 12884 bp in full length. The principal differences of pLC40 bar from pLC40 are that the selectable marker gene for plant transformation is bar (phosphinothricin resistance gene), and that the orientation of the selectable marker unit (ubiquitin promoter-ubiquitin intron-selectable marker gene for plant transformation) on the T-DNA is opposite. It was prepared by inserting a region containing the T-DNA region of pSB25UNpHm (FIG. 2) into p6FRG.
[0172] iv) pLC40GWB (SEQ ID NO: 5, FIG. 9)
[0173] A binary cosmid vector of 13026 bp in full length. It differs from pLC40 in that the selectable marker gene for plant transformation is bar (phosphinothricin resistance gene) and that attB1, 2 sequences have been inserted. It was prepared by inserting a region containing the T-DNA region of pSB200PcHmGWB (FIG. 4) into p6FRG.
[0174] v) pLC40GWHkorB (SEQ ID NO: 65, FIG. 10)
[0175] A binary cosmid vector of 14120 bp in full length. It differs from pLC40GWH in that it contains the nucleotide sequence of the korB gene. The korB gene is located near IncC1 described above, and contributes to the stability of IncP plasmids as IncC1 does. The nucleotide sequence of the korB gene of the present invention is not specifically limited so far as it has the function of the korB gene contributing to the stability of IncP plasmids.
[0176] This sequence has molecular biological properties described in detail in Pansegrau et al. (1994) J Mol Biol 239: 623-663, and it is defined as nucleotides 57187-58263 of the sequence of Genbank/EMBL Accession Number L27758 (full length 60099 bp). This corresponds to nucleotides 6306-7382 of SEQ ID NO: 65.
[0177] The korB can be conventionally prepared from an IncP plasmid such as pVK102 (Knauf and Nester 1982 Plasmid 8: 45-54). For example, a sequence amplified by PCR from pVK102 (nucleotides 6306-7382 of SEQ ID NO: 65) can be used as the korB gene.
[0178] Alternatively, a nucleic acid containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of nucleotides 6306-7382 of SEQ ID NO: 65 under stringent conditions and having the function of the korB gene can be used. Alternatively, a nucleic acid containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of nucleotides 6306-7382 of SEQ ID NO: 65 and having the function of the korB gene can also be used.
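The text gives each of incC1 and korB two sets of coordinates: one in Genbank/EMBL Accession Number L27758 and one in a sequence of the Sequence Listing. As a simple consistency check, the two intervals given for each gene should have the same length. The sketch below is illustrative only; the function and variable names are assumptions, and only interval lengths are compared (it makes no claim about strand orientation).

```python
# Coordinates (1-based, inclusive) as quoted in the text.
# Each entry: (interval in L27758, interval in the Sequence Listing).
GENE_COORDS = {
    "incC1": ((58260, 59354), (1179, 2273)),   # second interval: SEQ ID NO: 1
    "korB": ((57187, 58263), (6306, 7382)),    # second interval: SEQ ID NO: 65
}


def span_length(interval):
    """Length in nucleotides of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


def overlap_length(a, b):
    """Number of positions shared by two 1-based inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


if __name__ == "__main__":
    for gene, (ref_interval, local_interval) in GENE_COORDS.items():
        ref_len = span_length(ref_interval)
        local_len = span_length(local_interval)
        print(f"{gene}: {ref_len} nt in L27758, {local_len} nt locally")
    shared = overlap_length(GENE_COORDS["korB"][0], GENE_COORDS["incC1"][0])
    print(f"korB/incC1 shared positions in L27758: {shared}")
```

With the quoted coordinates, both intervals are 1095 nt for incC1 and 1077 nt for korB, and the two L27758 intervals share four positions (58260-58263), which agrees with the statement that korB is located near IncC1.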
[0179] vi) pLCleo (SEQ ID NO: 66, FIG. 11)
[0180] A binary cosmid vector of 14195 bp in full length. It differs from pLC40GWHkorB in that it contains a PspOMI site in the multicloning site, a PI-SceI site upstream of it, and an attB3 site upstream of the ubiquitin promoter.
[0181] vii) pLC40GWHvG1 (SEQ ID NO: 7, FIG. 13)
[0182] A binary cosmid vector of 14222 bp in full length. It differs from pLC40GWH in that the virG gene has been inserted. It was prepared by inserting the virG gene outside the T-DNA of pLC40GWH.
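The full lengths stated for the seven pLC-series vectors can be compared against the 15 kb upper limit recited for the cosmid vectors of the present invention. The following sketch is illustrative only; the names VECTOR_LENGTHS_BP, within_limit and total_construct_size are assumptions, and the 25-40 kb insert range is taken from the requirements listed above.

```python
# Full lengths (bp) as stated in the text for each example vector.
VECTOR_LENGTHS_BP = {
    "pLC40": 13429,
    "pLC40GWH": 13174,
    "pLC40 bar": 12884,
    "pLC40GWB": 13026,
    "pLC40GWHkorB": 14120,
    "pLCleo": 14195,
    "pLC40GWHvG1": 14222,
}

MAX_VECTOR_BP = 15_000             # "full length of 15 kb or less"
INSERT_RANGE_BP = (25_000, 40_000)  # insert sizes named in the requirements


def within_limit(length_bp, limit_bp=MAX_VECTOR_BP):
    """True if a vector length satisfies the stated size limit."""
    return length_bp <= limit_bp


def total_construct_size(vector_bp, insert_bp):
    """Size of the recombinant cosmid (vector plus cloned insert)."""
    return vector_bp + insert_bp


if __name__ == "__main__":
    low, high = INSERT_RANGE_BP
    for name, length in VECTOR_LENGTHS_BP.items():
        print(
            f"{name}: {length} bp, within limit: {within_limit(length)}, "
            f"with insert: {total_construct_size(length, low)}-"
            f"{total_construct_size(length, high)} bp"
        )
```

All seven stated lengths fall below 15000 bp; the largest, pLC40GWHvG1, is 14222 bp.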
[0183] Those skilled in the art can readily derive equivalents to the seven cosmid vectors described above, i.e.,
[0184] i) the cosmid vector pLC40 consisting of the nucleotide sequence of SEQ ID NO: 2;
[0185] ii) the cosmid vector pLC40GWH consisting of the nucleotide sequence of SEQ ID NO: 3;
[0186] iii) the cosmid vector pLC40 bar consisting of the nucleotide sequence of SEQ ID NO: 4;
[0187] iv) the cosmid vector pLC40GWB consisting of the nucleotide sequence of SEQ ID NO: 5;
[0188] v) the cosmid vector pLC40GWHkorB consisting of the nucleotide sequence of SEQ ID NO: 65;
[0189] vi) the cosmid vector pLCleo consisting of the nucleotide sequence of SEQ ID NO: 66; and
[0190] vii) the cosmid vector pLC40GWHvG1 consisting of the nucleotide sequence of SEQ ID NO: 7;
[0191] said equivalents having similar functions to those of these vectors even if the nucleotide sequences are not completely identical. Thus, these "equivalents" are also included as preferred embodiments of the cosmid vectors of the present invention.
[0192] For example, it is thought that even if the nucleotide sequences of the cosmid vectors of the present invention i)-vii) above are modified, especially in parts other than the elements related to criteria 1)-10) above (e.g., oriV in criterion 1), or the trfA1 gene in criterion 2)), they perform similar functions to those of the original vectors as cosmid vectors. Moreover, more than one gene or restriction endonuclease site having similar functions to those of the drug resistance gene in 6), the selectable marker gene for plant transformation in 9), and the restriction endonuclease recognition site(s) in 10) among criteria 1)-10) is known even if the nucleotide sequences are not completely identical to the nucleotide sequences in the cosmid vectors i)-vii), and those skilled in the art can modify these parts as appropriate.
[0193] Therefore, an "equivalent" to each of the cosmid vectors of the present invention i)-vii) preferably refers to a nucleotide sequence identical to or having an identity of at least 95% or more, 97% or more, 98% or more or 99% or more, more preferably 99.5% or more to the nucleotide sequence of each cosmid vector in the nucleotide sequences of the elements related to criteria 1)-5) and 7)-8) of the cosmid vectors of the present invention, especially the core sequences in these criteria or refers to a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of each cosmid vector under stringent conditions, said equivalent containing a mutation elsewhere in the nucleotide sequence while having similar function and effect to those of each vector. More preferably, it refers to a nucleotide sequence identical to the nucleotide sequence of each cosmid vector in the nucleotide sequences of the elements related to criteria 1)-10) of the cosmid vectors of the present invention, especially the core sequences in these criteria and containing a mutation elsewhere in the nucleotide sequence while having similar function and effect to those of each vector.
[0194] The degree of mutation is not specifically limited, but the "equivalent" preferably consists of a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of each of cosmid vectors i)-vii) under stringent conditions. The number of nucleotides that can be mutated is more preferably one or more, still more preferably one to a few (e.g., to the extent at which a mutation can be introduced by known site-directed mutagenesis).
[0195] The "equivalent" also preferably consists of a nucleotide sequence having an identity of 95% or more, 97% or more, 98% or more or 99% or more, more preferably 99.5% or more to a nucleotide sequence selected from the nucleotide sequences of cosmid vectors i)-vii).
[0196] The percent identity of two nucleic acid sequences can be determined by visual inspection and mathematical calculation, or more preferably, the comparison is done by comparing sequence information using a computer program. An exemplary, preferred computer program is the Genetics Computer Group (GCG; Madison, Wis.) Wisconsin package version 10.0 program, "GAP" (Devereux et al., 1984, Nucl. Acids Res. 12: 387). This "GAP" program can be used to compare not only two nucleic acid sequences but also two amino acid sequences or a nucleic acid sequence and an amino acid sequence. The preferred default parameters for the "GAP" program include (1) The GCG implementation of a unary comparison matrix (containing a value of 1 for identities and 0 for non-identities) for nucleotides, and the weighted amino acid comparison matrix of Gribskov and Burgess, Nucl. Acids Res. 14: 6745, 1986 as described by Schwartz and Dayhoff, eds., "Atlas of Polypeptide Sequence and Structure", National Biomedical Research Foundation, pp. 353-358, 1979; or other comparable comparison matrices; (2) a penalty of 30 for each gap and an additional penalty of 1 for each symbol in each gap for amino acid sequences, or penalty of 50 for each gap and an additional penalty of 3 for each symbol in each gap for nucleotide sequences; (3) no penalty for end gaps; and (4) no maximum penalty for long gaps. Other programs used by those skilled in the art of sequence comparison can also be used, such as, for example, the BLASTN program version 2.2.7, available for use via the National Library of Medicine website: http://www.ncbi.nlm.nih.gov/blast/bl2seq/bls.html, or the UW-BLAST 2.0 algorithm. Standard default parameter settings for UW-BLAST 2.0 are described at the following Internet site: http://blast.wustl.edu.
In addition, the BLAST algorithm uses the BLOSUM62 amino acid scoring matrix, and optional parameters that can be used are as follows: (A) inclusion of a filter to mask segments of the query sequence that have low compositional complexity (as determined by the SEG program of Wootton and Federhen (Computers and Chemistry, 1993); also see Wootton and Federhen, 1996, Analysis of compositionally biased regions in sequence databases, Methods Enzymol. 266: 554-71) or segments consisting of short-periodicity internal repeats (as determined by the XNU program of Claverie and States (Computers and Chemistry, 1993)), and (B) a statistical significance threshold for reporting matches against database sequences, or E-score (the expected probability of matches being found merely by chance, according to the stochastic model of Karlin and Altschul, 1990; if the statistical significance ascribed to a match is greater than this E-score threshold, the match will not be reported.); preferred E-score threshold values are 0.5, or in order of increasing preference, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 1e-5, 1e-10, 1e-15, 1e-20, 1e-25, 1e-30, 1e-40, 1e-50, 1e-75, or 1e-100.
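As a simple illustration of the "mathematical calculation" mentioned above, percent identity can be computed from two sequences that have already been aligned. The sketch below is not the GAP or BLAST program and does not perform the alignment itself; it assumes the alignment is supplied with "-" as the gap symbol, and it assumes that identity is counted over all alignment columns, which is only one of several conventions in use. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: identical columns divided by all alignment
    columns, skipping columns in which both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


def meets_threshold(identity_percent, threshold=95.0):
    """True if an identity value reaches the stated minimum (95% by default)."""
    return identity_percent >= threshold


if __name__ == "__main__":
    cos_core = "aggtcgccgccc"   # SEQ ID NO: 9 as quoted in the text
    variant = "aggtcgccgcca"    # hypothetical single-mismatch variant
    pid = percent_identity(cos_core, variant)
    print(f"{pid:.1f}% identity, meets 95%: {meets_threshold(pid)}")
```

For a 12-nucleotide sequence such as SEQ ID NO: 9, a single mismatch in an ungapped alignment already gives 11/12, or about 91.7%, under this convention.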
[0197] Plant Transformation Methods
[0198] The present invention also provides a plant transformation method using a cosmid vector of the present invention. Specifically, the plant transformation method of the present invention comprises transforming a plant with a bacterium of the genus Agrobacterium harboring a vector containing a nucleic acid fragment of a plant inserted into a cosmid vector of the present invention.
[0199] The type of the nucleic acid fragment inserted into the cosmid vector is not specifically limited, and any fragment can be used, such as a genomic DNA fragment, a cDNA fragment, etc. The nucleic acid fragment is preferably a genomic DNA fragment, more preferably a genomic DNA fragment derived from a plant. The size of the DNA fragment inserted is preferably 1 kb or more, more preferably 10 kb or more, still more preferably 20 kb or more, still more preferably 25-40 kb, still more preferably 30-40 kb.
[0200] The preparation and introduction of the nucleic acid fragment into the cosmid vector and other operations can be performed by a known method, e.g., the method described in WO2005/040374.
[0201] The source of the nucleic acid fragment is not specifically limited. In the case of plant genomic DNA fragments, preferred examples include plants in which heterosis may occur by cross with recipient plants of genomic DNA fragments. When the recipient plant is Japonica rice, for example, the donor is preferably a wild species of rice Oryza rufipogon or Indica rice. When the recipient plant is a specific variety of maize, preferred examples of donor plants include the other varieties of maize and wild species of teosinte. In general, higher heterosis has been observed between more distantly related plants.
[0202] The recipient plant used for transformation may belong to a different species from that of the donor plant of the genomic DNA or a different variety of the same species or the same variety of the same species. Preferred examples of plants include a substantially unrestricted, wide range of plants, e.g., cereals such as rice, barley, wheat, maize, sorghum, or millet such as an extremely early maturing variety of Italian millet or pearl millet; industrial crops such as sugar cane; pasture grasses such as Sudan grass or rose grass; plants for producing luxury grocery items such as coffee, cocoa, tea and tobacco; vegetables; fruits; ornamental plants such as flowers; weeds such as Arabidopsis, etc.
[0203] The cosmid vectors of the present invention were obtained especially to improve the efficiency of Agrobacterium-mediated transformation among biological transfer methods. Therefore, the plant transformation method is preferably Agrobacterium-mediated. However, other known plant transformation methods are not excluded. For example, known methods include physical transfer methods such as microinjection, electroporation, particle gun, silicon carbide-mediated method and air injection; and chemical transfer methods such as polyethylene glycol-mediated method.
[0204] The type of Agrobacterium strain is not specifically limited so far as it has an antibiotic resistance other than the antibiotic resistance (gene) for the bacterium used for the construction of the vector, and known strains such as LBA4404, A281, BHA105, PC2760, etc. can be used.
[0205] Map-Based Cloning Method
[0206] The present invention also provides an efficient map-based cloning method using a cosmid vector of the present invention as described above. The map-based cloning method is characterized in that it comprises the steps of:
[0207] 1) partially or completely digesting BAC clones containing candidate genes responsible for a plant phenotype with a restriction endonuclease;
[0208] 2) subcloning DNA fragments obtained in step 1) using a cosmid vector to construct a library; and
[0209] 3) individually transferring clones constituting the library into a plant to evaluate the phenotypes of transformed plants.
[0210] In this map-based cloning method, the DNA fragments obtained in step 1) preferably have a size of, but not limited to, 25-40 kb. More preferably, the cosmid vector in 2) is a cosmid vector as described in the section "Cosmid vectors" above.
[0211] The "candidate genes" refer to a group of genes including genes likely to be responsible for a plant phenotype. The "plant phenotype" is not specifically limited, but includes various agriculturally useful phenotypes such as high vigor of the whole plant, large sizes of the plant and organs, high yield, high growth speed, disease and insect resistance, resistance to various environmental stresses such as drought, high temperature, low temperature, etc., an increase or decrease of a specific component, an increase or decrease of a specific enzyme activity, dwarfness, etc.
[0212] For example, suppose that candidate genes were found to be contained in DNA fragments carried by more than one BAC clone of 100-200 kb. Then, these cloned DNAs are partially or completely digested with an appropriate restriction endonuclease to prepare overlapping fragments of about 40 kb, which are then subcloned using a transformation vector of the present invention. It is not necessary to investigate in detail the relative positions and the overlapping of the subcloned DNA fragments. According to a statistical calculation, any site on original fragments in 200 kb clones is maintained by randomized 21 subclones with a 99% probability (e.g., see [0043]-[0047] in WO2005/040374).
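The figure of 21 subclones can be reproduced with a commonly used library-coverage estimate, N = ln(1 - P) / ln(1 - f), where P is the desired probability and f is the fraction of the original clone carried by one subclone (here 40 kb / 200 kb). Whether WO2005/040374 uses exactly this formula is an assumption; the sketch below, with hypothetical function names, only shows that this estimate is consistent with the numbers stated above.

```python
import math


def subclones_needed(insert_kb, source_kb, probability):
    """Smallest number of random subclones so that a given site is
    represented at least once with the requested probability.

    Assumes each subclone independently covers a fraction
    insert_kb / source_kb of the source clone.
    """
    fraction = insert_kb / source_kb
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


def coverage_probability(n_subclones, insert_kb, source_kb):
    """Probability that a given site is present in at least one subclone."""
    fraction = insert_kb / source_kb
    return 1.0 - (1.0 - fraction) ** n_subclones


if __name__ == "__main__":
    n = subclones_needed(insert_kb=40, source_kb=200, probability=0.99)
    print(f"subclones needed: {n}")
    print(f"coverage with {n} subclones: {coverage_probability(n, 40, 200):.4f}")
    print(f"coverage with {n - 1} subclones: {coverage_probability(n - 1, 40, 200):.4f}")
```

Under this estimate, 20 subclones fall just short of 99% coverage and 21 subclones exceed it.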
[0213] Then, each subclone is transferred into a plant to prepare about 10 independent transformants per subclone, and the effect of the gene is analyzed. According to this operation, the candidate region can be first narrowed down to 40 kb by identifying subclones containing candidate genes, and then the candidate region can be further restricted to a very narrow region by comparing the experimental results between adjacent subclones. Thus, the efficiency of identifying candidate genes greatly improves.
[0214] Transformation Method Additionally Using the virG Gene (and the virB Gene)
[0215] In a preferred embodiment, the plant transformation method of the present invention is characterized in that it uses a bacterium of the genus Agrobacterium harboring the following elements:
[0216] 1a) a vector containing a nucleic acid fragment of a plant and the virG gene of a bacterium of the genus Agrobacterium inserted into the cosmid vector of the present invention; or
[0217] 1b) a vector containing a nucleic acid fragment of a plant inserted into the cosmid vector of the present invention, and a plasmid capable of coexisting with an IncP plasmid in a cell of a bacterium of the genus Agrobacterium and containing the virG gene of a bacterium of the genus Agrobacterium, and
[0218] 2) a Ti plasmid or Ri plasmid of a bacterium of the genus Agrobacterium.
[0219] virG is one of the vir genes of Agrobacterium that play a role in the transfer of the T-DNA into plants, and it is regarded as a transcription factor of the virB gene or the like (Winans et al. 1986 Proc. Natl. Acad. Sci. USA 83: 8278-8282). As an example of virG, virGN54D is a variant in which the amino acid at position 54 of the virG protein is changed from asparagine to aspartic acid to increase the expression of the virB gene as compared with the wild-type virG (Pazour et al. 1992 J. Bacteriol. 174:4169-4174). In the present transformation method, the virG gene is preferably virGN54D.
[0220] In an embodiment of the present invention 1a), the virG gene may be further inserted into a cosmid vector of the present invention containing a nucleic acid fragment of a plant. In embodiments where a cosmid vector of the present invention already carries the virG gene (e.g., pLC40GWHvG1), the virG gene need not be further inserted.
[0221] Alternatively, the virG gene may exist in an independent plasmid separate from a cosmid vector of the present invention. In this case, Agrobacterium in the method of the present invention harbors a plasmid capable of coexisting with an IncP plasmid in Agrobacterium cells and containing the virG gene of a bacterium of the genus Agrobacterium, in addition to the cosmid vector (embodiment 1b).
[0222] The Ti plasmid or Ri plasmid is not specifically limited, but is preferably disarmed by deleting the T-DNA.
[0223] The plasmid containing the virG gene of a bacterium of the genus Agrobacterium in 1b) may contain an origin of replication of an IncW plasmid. Preferably, it is pVGW having the structure shown in FIG. 14. More preferably, it is pVGW2 having the structure shown in FIG. 15.
[0224] The plasmid containing the virG gene of a bacterium of the genus Agrobacterium in 1b) may further contain the virB gene of a bacterium of the genus Agrobacterium. Here again, the plasmid may contain an origin of replication of an IncW plasmid. Such a plasmid is preferably pTOK47.
[0225] The virB gene of a bacterium of the genus Agrobacterium is described in detail in Ward et al. (1988) J Biol Chem 263: 5804-5814. For example, it can be conventionally prepared from a plasmid such as pSB1 (Komari et al. 1996 Plant J 10: 165-174). The nucleotide sequence of virB is defined as, e.g., nucleotides 3416-12851 of the nucleotide sequence of Genbank/EMBL Accession Number: AB027255 (pSB1). As a non-limitative example, a DNA containing a nucleotide sequence hybridizing to this sequence or a complementary strand thereto under stringent conditions can be used as the virB gene.
[0226] These transformation methods are more effective for plants normally associated with low efficiency of Agrobacterium-mediated transformation, e.g., including, but not limited to, maize and soybean. When the nucleic acid fragment to be transferred is large (e.g., 25-40 kb as a non-limitative example) or has a complex structure (e.g., a highly repeated sequence as a non-limitative example), pVGW described below is preferably used as a plasmid containing the virG gene of a bacterium of the genus Agrobacterium.
[0227] Plasmid Vectors
[0228] In plants such as maize, which are difficult to transform, the efficiency of Agrobacterium-mediated transformation with standard binary vectors containing the T-DNA is very low except for special cases (Frame et al. 2002 Plant Physiol 129: 13-22). Previous reports show an increase in the efficiency of transient expression in maize by the coexistence of a binary vector with another plasmid containing the virGN54D gene, a variant of the virG gene in Agrobacterium (Hansen et al. 1994 Proc. Natl. Acad. Sci. USA 91: 7603-7607), and a high efficiency maize transformation system with a binary vector containing virG and virB (Ishida et al. 1996 Nat Biotechnol 14:745-50).
[0229] However, no report has shown that the maize transformation efficiency was increased by the coexistence of a binary vector with a plasmid containing virG or virGN54D in Agrobacterium.
[0230] The cosmid vectors of the present invention (pLC vectors) (IncP plasmids) are also expected to further improve the maize transformation efficiency. Plasmids capable of coexisting with an IncP plasmid include, e.g., IncW plasmids (Close et al. 1984 Plasmid 12: 111-118). Previously reported IncW vectors containing virG are large because they contain origins of replication of other plasmids such as pBR322 ori. For example, pTOK47 contains IncW (pSa) ori and pBR322 ori (as well as both virG and virB) and it has a full length of about 28 kb (Jin et al. 1987 J Bacteriol 169: 4417-4425). pYW48 contains IncW (pSa) ori and pBR322 ori (as well as both virG and virA) and it has a full length of 15.5 kb (Wang et al. 2000 Gene 242: 105-114). Such vectors can also be used in the transformation methods of the present invention. However, these vectors are so long that they may cause problems in stability in bacteria when they coexist with a pLC vector containing a large fragment, and therefore, small vectors capable of coexisting with a pLC vector and containing virG are desirable.
[0231] As a means to solve these problems, the present invention provides a small plasmid vector capable of further improving the transformation efficiency by the coexistence with the cosmid vectors of the present invention described above.
[0232] The plasmid vector of the present invention satisfies all of the criteria below.
[0233] 1) it contains an origin of replication of an IncW plasmid, but does not contain any origin of replication of other plasmid groups;
[0234] 2) it contains the repA gene necessary for the replication of an IncW plasmid;
[0235] 3) it contains a drug resistance gene expressed in E. coli and a bacterium of the genus Agrobacterium; and
[0236] 4) it contains the virG gene of a bacterium of the genus Agrobacterium.
[0237] 1) The nucleotide sequence of the origin of replication of an IncW plasmid of the present invention is not specifically limited so far as it has the function as an origin of replication of an IncW plasmid.
[0238] The origin of replication of an IncW plasmid has molecular biological properties described in detail in Okumura and Kado (1992 Mol Gen Genet 235: 55-63), and it is defined as nucleotides 2170-2552 of Genbank/EMBL Accession Number: U30471 (full length 5500 bp). This corresponds to nucleotides 2832-3214 of SEQ ID NO: 8.
[0239] The origin of replication of an IncW plasmid can be conventionally prepared from an IncW plasmid such as pTOK47 (Jin et al. 1987 J Bacteriol 169: 4417-4425). For example, nucleotides 2832-3214 of SEQ ID NO: 8 in a 2.7 kb DNA amplified by PCR from pTOK47 with repA necessary for the replication of an IncW plasmid described below can be used.
[0240] Alternatively, a nucleic acid containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of nucleotides 2832-3214 of SEQ ID NO: 8 described above under stringent conditions and having the function of an origin of replication of an IncW plasmid can also be used. Alternatively, a nucleic acid containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of nucleotides 2832-3214 of SEQ ID NO: 8 described above and having the function of an origin of replication of an IncW plasmid can also be used.
[0241] It will be recognized by those skilled in the art that a shorter region in nucleotides 2832-3214 of SEQ ID NO: 8 may be selected as a sequence having a similar function.
[0242] 2) The nucleotide sequence of the repA gene of the present invention is not specifically limited so far as it has the function as the repA gene necessary for the replication of an IncW plasmid.
[0243] The repA necessary for the replication of an IncW plasmid has molecular biological properties described in detail in Okumura and Kado (1992 Mol Gen Genet 235: 55-63), and it is defined as nucleotides 1108-2079 of Genbank/EMBL Accession Number:U30471 (full length 5500 bp). This corresponds to nucleotides 1770-2741 of SEQ ID NO: 8.
[0244] The repA necessary for the replication of an IncW plasmid can be conventionally prepared from an IncW plasmid such as pTOK47 (Jin et al. 1987 J Bacteriol 169: 4417-4425). For example, nucleotides 1770-2741 of SEQ ID NO: 8 in a 2.7 kb DNA amplified by PCR from pTOK47 with an origin of replication of an IncW plasmid described above can be used.
[0245] Alternatively, a nucleic acid containing a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of nucleotides 1770-2741 of SEQ ID NO: 8 described above under stringent conditions and having the function of the repA gene necessary for the replication of an IncW plasmid can also be used. Alternatively, a nucleic acid containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to the nucleotide sequence of nucleotides 1770-2741 of SEQ ID NO: 8 described above and having the function of the repA gene necessary for the replication of an IncW plasmid can also be used.
[0246] It will be recognized by those skilled in the art that a shorter region in nucleotides 1770-2741 of SEQ ID NO: 8 may be selected as a sequence having a similar function.
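The coordinates quoted above for the IncW origin of replication and for repA can be cross-checked arithmetically. The sketch below assumes inclusive, 1-based coordinates and the same orientation in both records (an assumption, not stated in the text). It confirms that both regions are shifted by the same offset between Accession Number U30471 and SEQ ID NO: 8.

```python
# Inclusive, 1-based coordinates as quoted in the text.
REGIONS = {
    "IncW origin of replication": {
        "U30471": (2170, 2552),
        "SEQ ID NO: 8": (2832, 3214),
    },
    "repA": {
        "U30471": (1108, 2079),
        "SEQ ID NO: 8": (1770, 2741),
    },
}

def span_length(span):
    """Length of an inclusive coordinate pair."""
    start, end = span
    return end - start + 1

def offset(region):
    """Shift from U30471 numbering to SEQ ID NO: 8 numbering."""
    source, target = region["U30471"], region["SEQ ID NO: 8"]
    if span_length(source) != span_length(target):
        raise ValueError("the two spans differ in length")
    return target[0] - source[0]

offsets = {name: offset(region) for name, region in REGIONS.items()}
lengths = {name: span_length(region["U30471"]) for name, region in REGIONS.items()}
print(offsets)
print(lengths)
```

Both regions give the same offset, which is consistent with the statement that they were cloned together as a single fragment.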
[0247] 3) The drug resistance gene expressed in E. coli and a bacterium of the genus Agrobacterium is used as a selectable marker for transformation. This drug resistance gene confers e.g., antibiotic resistance or autotrophy, including, but not limited to, a kanamycin resistance gene, a spectinomycin resistance gene, an ampicillin resistance gene, a tetracycline resistance gene, a gentamycin resistance gene, a hygromycin resistance gene, etc.
[0248] 4) The virG gene of a bacterium of the genus Agrobacterium and virGN54D have molecular biological properties described in detail in Winans et al. (1986) Proc. Natl. Acad. Sci. USA 83: 8278-8282 and Pazour et al. (1992) J. Bacteriol. 174: 4169-4174, Hansen et al. 1994 Proc. Natl. Acad. Sci. USA 91: 7603-7607, respectively. virG is a transcription regulator (activator) of other vir genes such as virB and virE. virG is activated upon regulation (phosphorylation) by virA, whereas virGN54D is a variant in a permanently activated state without this regulation. The virG gene can be prepared by conventional procedure and virGN54D can be prepared by mutagenesis both from a plasmid such as pTOK47 (Jin et al. 1987 J Bacteriol 169: 4417-4425). For example, 1 kb virG DNA (nucleotides 4024-5069 of SEQ ID NO: 7) amplified by PCR from pTOK47 and 1 kb virGN54D DNA (nucleotides 1-1080 of SEQ ID NO: 8) amplified and prepared by PCR mutagenesis can be used.
[0249] Alternatively, a nucleic acid containing a nucleotide sequence hybridizing to a complementary strand of these nucleotide sequences under stringent conditions and having the function of the virG gene of a bacterium of the genus Agrobacterium or a nucleic acid containing a nucleotide sequence having an identity of at least 95%, more preferably 97%, still more preferably 99% to these nucleotide sequences and having the function of the virG gene of a bacterium of the genus Agrobacterium can also be used.
[0250] The plasmid vector of the present invention preferably has a full length of 10 kb or less, more preferably 5 kb or less.
[0251] The plasmid vector of the present invention is preferably the pVGW vector having the structure shown in FIG. 14. More preferably, it is pVGW2 having the structure shown in FIG. 15. pVGW and pVGW2 are vectors satisfying all of criteria 1)-4) above. pVGW shown as SEQ ID NO: 8 has a full length of 4531 bp, and pVGW2 shown as SEQ ID NO: 67 has a full length of 4836 bp, and they are characterized in that:
[0252] 1) they contain an origin of replication of an IncW plasmid, but do not contain any origin of replication of other plasmid groups;
[0253] 2) they contain the repA gene necessary for the replication of an IncW plasmid;
[0254] 3) they contain a gentamycin resistance gene as a drug resistance gene expressed in E. coli and a bacterium of the genus Agrobacterium; and
[0255] 4) they contain the virGN54D gene of a bacterium of the genus Agrobacterium.
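Criteria 1)-4) and the preferred sizes can be expressed as a simple check. The following Python sketch is illustrative only: the class PlasmidDescription and its field names are hypothetical, the element lists are taken from the descriptions in the text, and the drug resistance entry for pTOK47 is an assumption made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PlasmidDescription:
    name: str
    origins: frozenset              # origins of replication present
    has_repA: bool                  # criterion 2
    has_dual_host_resistance: bool  # criterion 3 (E. coli and Agrobacterium)
    has_virG: bool                  # criterion 4 (virG or virGN54D)
    length_bp: int

def meets_criteria(p: PlasmidDescription) -> bool:
    """Criterion 1 requires an IncW origin and no origin of any other group."""
    only_incw_origin = p.origins == frozenset({"IncW"})
    return (only_incw_origin and p.has_repA
            and p.has_dual_host_resistance and p.has_virG)

def size_preference(p: PlasmidDescription) -> str:
    if p.length_bp <= 5_000:
        return "more preferred (5 kb or less)"
    if p.length_bp <= 10_000:
        return "preferred (10 kb or less)"
    return "outside the preferred range"

pVGW = PlasmidDescription("pVGW", frozenset({"IncW"}), True, True, True, 4531)
pVGW2 = PlasmidDescription("pVGW2", frozenset({"IncW"}), True, True, True, 4836)
# pTOK47: IncW (pSa) ori plus pBR322 ori, about 28 kb according to the text.
# The resistance flag is assumed for illustration; it is not stated in the text.
pTOK47 = PlasmidDescription(
    "pTOK47", frozenset({"IncW", "pBR322"}), True, True, True, 28_000
)

for plasmid in (pVGW, pVGW2, pTOK47):
    print(plasmid.name, meets_criteria(plasmid), size_preference(plasmid))
```

In this sketch pTOK47 fails only because of its second origin of replication, which is the feature the text identifies as the cause of its large size.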
[0256] Among these components, the origin of replication of an IncW plasmid and the repA gene necessary for the replication of an IncW plasmid were simultaneously cloned and the gentamycin resistance gene and the virGN54D gene were separately cloned, after which all of the three DNA fragments (four components) were assembled.
[0257] Those skilled in the art can readily derive equivalents to the two plasmid vectors pVGW, pVGW2 of the present invention described above, said equivalents having similar functions to those of these vectors even if the nucleotide sequences are not completely identical. Thus, these "equivalents" are also included as preferred embodiments of the plasmid vectors of the present invention.
[0258] For example, it is thought that even if the nucleotide sequences of the plasmid vectors of the present invention are modified especially in parts other than the elements related to criteria 1)-4) above (e.g., the origin of replication of an IncW plasmid in criterion 1)), they perform similar functions to those of the original vectors as plasmid vectors. Moreover, more than one gene having a function similar to that of the drug resistance gene in 3) among criteria 1)-4) is known, even if the nucleotide sequences are not completely identical to those in the plasmid vectors, and those skilled in the art can modify these parts as appropriate.
[0259] Therefore, an "equivalent" to each of the plasmid vectors of the present invention preferably refers to a nucleotide sequence identical to or having an identity of at least 95% or more, 97% or more, 98% or more or 99% or more, more preferably 99.5% or more to the nucleotide sequence of each plasmid vector in the nucleotide sequences of the elements related to criteria 1)-2) and 4) of the plasmid vectors of the present invention or refers to a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of each plasmid vector under stringent conditions, said equivalent containing a mutation elsewhere in the nucleotide sequence while having similar function and effect to those of each vector. More preferably, it refers to a nucleotide sequence identical to the nucleotide sequence of each plasmid vector in the nucleotide sequences of the elements related to criteria 1)-4) of the plasmid vectors of the present invention and containing a mutation elsewhere in the nucleotide sequence while having similar function and effect to those of each vector.
[0260] The degree of mutation is not specifically limited, but the "equivalent" preferably consists of a nucleotide sequence hybridizing to a complementary strand of the nucleotide sequence of each plasmid vector under stringent conditions. The number of nucleotides that can be mutated is more preferably one or more, still more preferably one to a few (e.g., to the extent at which a mutation can be introduced by known site-directed mutagenesis).
[0261] The "equivalent" also preferably consists of a nucleotide sequence having an identity of 95% or more, 97% or more, 98% or more or 99% or more, more preferably 99.5% or more to a nucleotide sequence selected from the nucleotide sequences of the plasmid vectors.
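The identity thresholds used to define an "equivalent" can be illustrated with a short calculation. The sketch below makes a simplifying assumption: the two sequences are already aligned, have equal length, and contain no gaps. In practice an alignment tool would be used. The function names and the toy sequences are hypothetical.

```python
THRESHOLDS = (95.0, 97.0, 98.0, 99.0, 99.5)  # percentages named in the text

def percent_identity(a: str, b: str) -> float:
    """Identity of two pre-aligned, equal-length, gap-free sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)

def thresholds_met(identity: float) -> list:
    """Thresholds that the given identity reaches or exceeds."""
    return [t for t in THRESHOLDS if identity >= t]

# Toy 400-nt sequence and a copy carrying a single mismatch.
reference = "ACGT" * 100
single_mismatch = "T" + reference[1:]

identity = percent_identity(reference, single_mismatch)
print(identity, thresholds_met(identity))
```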
[0262] As used herein, the expression "under stringent conditions" refers to hybridization under conditions of moderate or high stringency. Specifically, conditions of moderate stringency can be readily determined by those having ordinary skill in the art based on, for example, the length of the DNA. The basic conditions are set forth by Sambrook et al. Molecular Cloning: A Laboratory Manual, 3rd Ed., Chapters 6-7, Cold Spring Harbor Laboratory Press, 2001, and include use of a prewashing solution for the nitrocellulose filters containing 5×SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0), hybridization conditions of 2×SSC to 6×SSC with or without about 50% formamide at about 40° C. to 50° C. (or other similar hybridization solution, such as Stark's solution, in about 50% formamide at about 42° C.), and washing conditions of 0.5 to 6×SSC, 0.1% SDS at about 40° C. to 60° C. Preferably, conditions of moderate stringency include hybridization conditions (and washing conditions) of 6×SSC at about 50° C. Conditions of high stringency can also be readily determined by the skilled artisan based on, for example, the length of the DNA.
[0263] Generally, such conditions include hybridization and/or washing at higher temperatures and/or lower salt concentrations than in the conditions of moderate stringency (e.g., hybridization in 6×SSC to 0.2×SSC, preferably 6×SSC, more preferably 2×SSC, most preferably 0.2×SSC at about 65° C.), and are defined to involve hybridization conditions as above and washing in 0.2×SSC, 0.1% SDS at about 65° C. to 68° C. SSPE (1×SSPE=0.15 M NaCl, 10 mM NaH2PO4, and 1.25 mM EDTA, pH 7.4) can be substituted for SSC (1×SSC=0.15 M NaCl and 15 mM sodium citrate) for use as hybridization and washing buffers, and washing is continued for 15 minutes after completion of hybridization.
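Because the stringency conditions are stated as multiples of SSC, it can help to convert them into salt concentrations. The short Python sketch below uses only the 1×SSC definition given above (0.15 M NaCl and 15 mM sodium citrate); the function name is hypothetical.

```python
# 1x SSC as defined in the text: 0.15 M NaCl and 15 mM sodium citrate.
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}

def ssc_concentrations(strength: float) -> dict:
    """Millimolar concentrations in an n-fold SSC solution."""
    return {solute: round(mm * strength, 3) for solute, mm in SSC_1X_MM.items()}

for strength in (6, 2, 0.5, 0.2):
    print(f"{strength}x SSC:", ssc_concentrations(strength))
```

The washing step at 0.2×SSC therefore corresponds to 30 mM NaCl and 3 mM sodium citrate, a thirtyfold lower salt concentration than the 6×SSC used under moderate stringency.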
[0264] Commercially available hybridization kits not using radioactive substances as probes can also be used. Specifically, hybridization can be performed by using ECL direct labeling & detection system (from Amersham), etc. Stringent hybridization conditions include hybridization in the hybridization buffer included in the kit containing 5% (w/v) Blocking reagent and 0.5 M NaCl at 42° C. for 4 hours, followed by washing twice in 0.4% SDS, 0.5×SSC at 55° C. for 20 minutes, and once in 2×SSC at room temperature for 5 minutes.
[0265] pVGW is characterized in that it is small and stable. Specifically, it is effective for improving the transformation efficiency by the coexistence with pLC especially when large fragments are used and/or when maize is used as a host. It is also effective for improving the efficiency of transformation of maize or the like by the coexistence with an ordinary vector other than pLC.
Effects of the Invention
[0266] The vectors (pLC vectors) of the present invention provide the following advantages that could not be achieved by known vectors:
[0267] they allow efficient cloning of DNA fragments of about 25-40 kb in size, preferably 30-40 kb;
[0268] they are stably maintained in E. coli and Agrobacterium cells;
[0269] they can be efficiently introduced in Agrobacterium;
[0270] the copy number per cell in E. coli and Agrobacterium is 4-5; and
[0271] they allow efficient transfer of only cloned DNA fragments of interest into plants, preferably monocotyledons (the transformation efficiency of pLC vectors is 90%, in contrast to 60% for pSB vectors).
[0272] The combined use of the pLC vectors of the present invention and the pVGW vector allows efficient gene transfer into even plants that are relatively difficult to transform such as maize.
[0273] Candidate gene sites can be narrowed down with little expenditure of labor and time by map-based cloning with the pLC vectors of the present invention.
[0274] The present invention is significantly effective even if mapping information is very limited. For example, suppose that nothing is known except for the presence of candidate genes at an end region of a chromosome. If the entire length of one chromosome is 40 Mb, for example, its end region may be assumed to be 2 Mb. This region can be covered by about 20 BAC clones carrying an insert fragment of 150 kb on average by constructing a library of aligned BACs (BAC contig). Therefore, if 20 subclones are prepared from each BAC, almost all candidate genes can be rapidly identified by preparing a total of 400 fragments (20 BACs × 20 subclones) and 4000 recombinants. Thus, by using the technique of the present invention, genes can be identified with an economy of labor and time that could not be expected from conventional techniques.
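The numbers in this example can be reproduced with a few lines of arithmetic. The Python sketch below uses only the figures quoted in the paragraph above. Reading the 4000 recombinants as transformants spread evenly over the 400 fragments is an interpretation, not something the text states explicitly.

```python
import math

region_kb = 2_000          # assumed end region of a 40 Mb chromosome (2 Mb)
bac_insert_kb = 150        # average BAC insert size
bac_clones = 20            # BACs in the contig
subclones_per_bac = 20
recombinants_total = 4_000

# Fewest BACs that could span the region if there were no overlap at all.
minimum_bacs = math.ceil(region_kb / bac_insert_kb)

# Average redundancy of the 20-clone contig over the region.
contig_redundancy = bac_clones * bac_insert_kb / region_kb

fragments_total = bac_clones * subclones_per_bac

# Interpretation: recombinants distributed evenly over the fragments.
recombinants_per_fragment = recombinants_total / fragments_total

print(minimum_bacs, contig_redundancy, fragments_total, recombinants_per_fragment)
```

The 20 BACs thus cover the 2 Mb region about 1.5 times over, which leaves room for the overlaps needed to build a contig.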
EMBODIMENTS OF THE PRESENT INVENTION
[0275] The present invention preferably includes the following embodiments.
Embodiment 1
[0276] A cosmid vector having a full length of 15 kb or less characterized in that:
[0277] 1) it contains an origin of replication (oriV) of an IncP plasmid, but does not contain any origin of replication of other plasmid groups;
[0278] 2) it contains the trfA1 gene of an IncP plasmid;
[0279] 3) it contains an origin of conjugative transfer (oriT) of an IncP plasmid;
[0280] 4) it contains the incC1 gene of an IncP plasmid;
[0281] 5) it contains a cos site of lambda phage and the cos site is located outside the T-DNA;
[0282] 6) it contains a drug resistance gene expressed in E. coli and a bacterium of the genus Agrobacterium;
[0283] 7) it contains a T-DNA right border sequence of a bacterium of the genus Agrobacterium;
[0284] 8) it contains a T-DNA left border sequence of a bacterium of the genus Agrobacterium;
[0285] 9) it contains a selectable marker gene for plant transformation located between 7) and 8) and expressed in a plant; and
[0286] 10) it contains restriction endonuclease recognition site(s) located between 7) and 8) for cloning a foreign gene.
Embodiment 2
[0287] The cosmid vector of Embodiment 1 wherein the selectable marker gene for plant transformation is selected from the group consisting of a hygromycin resistance gene, a phosphinothricin resistance gene and a kanamycin resistance gene.
Embodiment 3
[0288] The cosmid vector of Embodiment 1 or 2, which contains the korB gene of an IncP plasmid.
Embodiment 4
[0289] The cosmid vector of any one of Embodiments 1 to 3 selected from the group consisting of:
[0290] the cosmid vector pLC40 consisting of the nucleotide sequence of SEQ ID NO: 2 or an equivalent thereof;
[0291] the cosmid vector pLC40GWH consisting of the nucleotide sequence of SEQ ID NO: 3 or an equivalent thereof;
[0292] the cosmid vector pLC40 bar consisting of the nucleotide sequence of SEQ ID NO: 4 or an equivalent thereof;
[0293] the cosmid vector pLC40GWB consisting of the nucleotide sequence of SEQ ID NO: 5 or an equivalent thereof;
[0294] the cosmid vector pLC40GWHKorB consisting of the nucleotide sequence of SEQ ID NO: 65 or an equivalent thereof;
[0295] the cosmid vector pLCleo consisting of the nucleotide sequence of SEQ ID NO: 66 or an equivalent thereof; and
[0296] the cosmid vector pLC40GWHvG1 consisting of the nucleotide sequence of SEQ ID NO: 7 or an equivalent thereof.
Embodiment 5
[0297] The cosmid vector of Embodiment 4 selected from the group consisting of:
[0298] the cosmid vector pLC40 consisting of the nucleotide sequence of SEQ ID NO: 2;
[0299] the cosmid vector pLC40GWH consisting of the nucleotide sequence of SEQ ID NO: 3;
[0300] the cosmid vector pLC40 bar consisting of the nucleotide sequence of SEQ ID NO: 4;
[0301] the cosmid vector pLC40GWB consisting of the nucleotide sequence of SEQ ID NO: 5;
[0302] the cosmid vector pLC40GWHKorB consisting of the nucleotide sequence of SEQ ID NO: 65;
[0303] the cosmid vector pLCleo consisting of the nucleotide sequence of SEQ ID NO: 66; and
[0304] the cosmid vector pLC40GWHvG1 consisting of the nucleotide sequence of SEQ ID NO: 7.
Embodiment 6
[0305] A method for transforming a plant, comprising transforming the plant with a bacterium of the genus Agrobacterium harboring an expression vector containing a nucleic acid fragment of a plant inserted into the cosmid vector of any one of Embodiments 1 to 5.
Embodiment 7
[0306] The method of Embodiment 6 wherein the nucleic acid fragment inserted has a size of 25-40 kb.
Embodiment 8
[0307] The method of Embodiment 6 or 7 characterized in that it uses a bacterium of the genus Agrobacterium harboring the following elements for transforming the plant:
[0308] 1a) a vector containing a nucleic acid fragment of a plant and the virG gene of a bacterium of the genus Agrobacterium inserted into the cosmid vector of any one of Embodiments 1 to 5; or
[0309] 1b) a vector containing a nucleic acid fragment of a plant inserted into the cosmid vector of any one of Embodiments 1 to 5, and a plasmid capable of coexisting with an IncP plasmid in a cell of a bacterium of the genus Agrobacterium and containing the virG gene of a bacterium of the genus Agrobacterium, and
[0310] 2) a Ti plasmid or Ri plasmid of a bacterium of the genus Agrobacterium.
Embodiment 9
[0311] The method of Embodiment 8 wherein the virG gene of a bacterium of the genus Agrobacterium in 1a) or 1b) is virGN54D.
Embodiment 10
[0312] The method of Embodiment 8 wherein the plasmid containing the virG gene of a bacterium of the genus Agrobacterium in 1b) contains an origin of replication of an IncW plasmid.
Embodiment 11
[0313] The method of Embodiment 10 wherein the plasmid containing the virG gene of a bacterium of the genus Agrobacterium in 1b) is pVGW having the structure shown in FIG. 14 or pVGW2 having the structure shown in FIG. 15.
Embodiment 12
[0314] The method of Embodiment 8 wherein the plasmid containing the virG gene of a bacterium of the genus Agrobacterium in 1b) further contains the virB gene of a bacterium of the genus Agrobacterium.
Embodiment 13
[0315] The method of Embodiment 12 wherein the plasmid containing the virG gene of a bacterium of the genus Agrobacterium in 1b) contains an origin of replication of an IncW plasmid.
Embodiment 14
[0316] The method of Embodiment 13 wherein the plasmid containing the virG gene of a bacterium of the genus Agrobacterium in 1b) is pTOK47.
Embodiment 15
[0317] A map-based cloning method comprising the steps of:
[0318] 1) partially or completely digesting BAC clones containing candidate genes responsible for a plant phenotype with a restriction endonuclease;
[0319] 2) subcloning DNA fragments obtained in step 1) using a cosmid vector to construct a library; and
3) individually transferring clones constituting the library into a plant to evaluate the phenotypes of transformed plants.
Embodiment 16
[0320] The map-based cloning method of Embodiment 15 wherein the DNA fragments obtained in step 1) have a size of 25-40 kb.
Embodiment 17
[0321] The map-based cloning method of Embodiment 16 wherein the cosmid vector in 2) is the cosmid vector of any one of Embodiments 1 to 5.
Embodiment 18
[0322] A plasmid vector characterized in that:
[0323] 1) it contains an element necessary for the replication of an IncW plasmid, but does not contain any origin of replication of other plasmid groups;
[0324] 2) it contains the repA gene necessary for the replication of an IncW plasmid;
[0325] 3) it contains a drug resistance gene expressed in E. coli and a bacterium of the genus Agrobacterium; and
[0326] 4) it contains the virG gene of a bacterium of the genus Agrobacterium.
Embodiment 19
[0327] The plasmid vector of Embodiment 18, which has a full length of 10 kb or less.
Embodiment 20
[0328] The plasmid vector of Embodiment 19 wherein the virG gene of a bacterium of the genus Agrobacterium is virGN54D.
Embodiment 21
[0329] The plasmid vector of any one of Embodiments 18 to 20 selected from the group consisting of:
[0330] the plasmid vector pVGW consisting of the nucleotide sequence of SEQ ID NO: 8 or an equivalent thereof; and
[0331] the plasmid vector pVGW2 consisting of the nucleotide sequence of SEQ ID NO: 67 or an equivalent thereof.
Embodiment 22
[0332] The plasmid vector pVGW consisting of the nucleotide sequence of SEQ ID NO: 8.
Embodiment 23
[0333] The plasmid vector pVGW2 consisting of the nucleotide sequence of SEQ ID NO: 67.
Embodiment 24
[0334] A method for transforming a plant, comprising transforming the plant with a bacterium of the genus Agrobacterium harboring the plasmid vector of any one of Embodiments 18 to 23.
BRIEF EXPLANATION OF THE DRAWINGS
[0335] FIG. 1 is a schematic diagram of the vector pSB200PcHm.
[0336] FIG. 2 is a schematic diagram of the vector pSB25UNpHm.
[0337] FIG. 3 is a schematic diagram of the vector pSB200PcHmGWH.
[0338] FIG. 4 is a schematic diagram of the vector pSB200PcHmGWB.
[0339] FIG. 5 is a diagram showing a procedure for constructing the vector pLC40.
[0340] FIG. 6 is a schematic diagram of the vector pLC40.
[0341] FIG. 7 is a schematic diagram of the vector pLC40GWH.
[0342] FIG. 8 is a schematic diagram of the vector pLC40 bar.
[0343] FIG. 9 is a schematic diagram of the vector pLC40GWB.
[0344] FIG. 10 is a schematic diagram of the vector pLC40GWHkorB.
[0345] FIG. 11 is a schematic diagram of the vector pLCleo.
[0346] FIG. 12 is a schematic diagram of the vector pLCSBGWBSWa.
[0347] FIG. 13 is a schematic diagram of the vector pLC40GWHvG1.
[0348] FIG. 14 is a schematic diagram of the vector pVGW.
[0349] FIG. 15 is a schematic diagram of the vector pVGW2.
[0350] FIG. 16 shows the results of cloning of a genomic DNA fragment by a pLC vector. An example of a teosinte genomic DNA fragment is shown. M1: marker (1 kb ladder), M2: marker (λ-HindIII); the numbers represent clone numbers, and the arrow indicates the size of the band corresponding to pLC40GWH (13.2 kb). Plasmid DNA of eleven clones from a teosinte library was purified. The DNA was cleaved with the restriction endonucleases HindIII and SacI in the multicloning site at each end of the plasmid insert and separated by agarose gel (0.8%) electrophoresis.
[0351] FIG. 17 shows the results of transformation of a genomic DNA fragment into rice (in the center region of fragment B). M: markers, Yu: Yukihikari, Ru: Oryza rufipogon, Transgenic: transformed rice (two individuals). In the transformed rice, a band derived from Oryza rufipogon was detected in addition to a band derived from Yukihikari.
EXAMPLES
[0352] The following examples further illustrate the present invention but are not intended to limit the technical scope of the invention. Those skilled in the art can readily add modifications/changes to the present invention in the light of the description herein, and those modifications/changes are also included in the technical scope of the present invention.
Example 1
Construction of pLC Series Cosmid Vectors
[0353] In the following procedures, molecular biological experimental methods were performed as described in Sambrook J. and Russell D. W. 2001. Molecular Cloning, A Laboratory Manual, 3rd edn. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., USA., unless otherwise specified.
[0354] 1) Construction of T-DNA Regions
[0355] A PacI linker (gttaattaac) (SEQ ID NO: 10) was inserted into the EcoRV site of pSB200 (WO2005/040374) to construct pSB200Pac. The cauliflower mosaic virus 35S promoter in pSB25 (Ishida et al. 1996) was replaced by the ubiquitin promoter of maize (Christensen et al. 1992 Plant Mol Biol 18: 675-689) to construct pSB25U. The adapters HinNspISceRV and HinNspISceFW (Table 1) having recognition sites for the restriction endonuclease NspV and the homing endonuclease I-SceI were annealed. A portion of the annealed adapters was phosphorylated with a polynucleotide kinase (PNK, Amersham). The phosphorylated adapters were cloned into the SacI site of pSB200Pac and the HindIII site of pSB25U. The resulting plasmids were designated as pSB200PacHm1 and pSB25UNpHm1, respectively.
[0356] The adapters SpeICeuRV and SpeICeuFW containing a homing endonuclease I-CeuI site (Table 1) were inserted into the SpeI site of pSB200PacHm1 and pSB25UNpHm1. This operation generated the vector pSB200PcHm containing homing endonuclease sites I-SceI and I-CeuI inserted into the SacI and SpeI sites respectively of pSB200Pac (FIG. 1), and the vector pSB25UNpHm containing homing endonuclease sites I-SceI (+NspV site) and I-CeuI inserted into the HindIII and SpeI sites respectively of pSB25U (FIG. 2). Excision by I-SceI and I-CeuI was verified, and the nucleotide sequences were checked using an ABI PRISM Fluorescent Sequencer (Model 310 Genetic Analyzer, from Perkin Elmer) to confirm that a single adapter had been inserted. In these vectors, the I-SceI-selectable marker unit-LB-I-CeuI can be excised.
TABLE 1
Primer Name     Sequence                                               Length
HinNspISceRV    5'-AgC TTT CgA ATA ggg ATA ACA ggg TAA T-3'            28 mer
HinNspISceFW    5'-AgC TAT TAC CCT gTT ATC CCT ATT CgA A-3'            28 mer
SpeICeuRV       5'-CTA gTA ACT ATA ACg gTC CTA Agg TAg CgA C-3'        31 mer
SpeICeuFW       5'-CTA ggT CgC TAC CTT Agg ACC gTT ATA gTT A-3'        31 mer
SEQ ID NOs: 23-26 in order from the top.
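The adapter pairs in Table 1 can be checked for the expected annealing geometry. The Python sketch below tests whether each pair forms a duplex with a 4-nucleotide 5' overhang at both ends; the 4-nucleotide overhang length is an assumption made for the check. The sketch also looks for TTCGAA and TAGGGATAACAGGGTAAT, which are the recognition sequences commonly listed for NspV and I-SceI and are not quoted in the text. Function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def clean(seq: str) -> str:
    """Remove triplet spacing and normalise case."""
    return seq.replace(" ", "").upper()

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def duplex_overhangs(rv: str, fw: str, overhang: int = 4):
    """Return the two 5' overhangs if the strands pair everywhere else, else None."""
    rv, fw = clean(rv), clean(fw)
    if len(rv) != len(fw):
        return None
    paired = len(fw) - overhang
    if rv[overhang:] != reverse_complement(fw)[:paired]:
        return None
    return rv[:overhang], fw[:overhang]

# Sequences copied from Table 1 (5' to 3').
HIN_RV = "AgC TTT CgA ATA ggg ATA ACA ggg TAA T"
HIN_FW = "AgC TAT TAC CCT gTT ATC CCT ATT CgA A"
SPE_RV = "CTA gTA ACT ATA ACg gTC CTA Agg TAg CgA C"
SPE_FW = "CTA ggT CgC TAC CTT Agg ACC gTT ATA gTT A"

hin_overhangs = duplex_overhangs(HIN_RV, HIN_FW)
spe_overhangs = duplex_overhangs(SPE_RV, SPE_FW)
has_nspv_site = "TTCGAA" in clean(HIN_RV)
has_iscei_site = "TAGGGATAACAGGGTAAT" in clean(HIN_RV)
print(hin_overhangs, spe_overhangs, has_nspv_site, has_iscei_site)
```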
[0357] Then, pSB200PcHm was digested with BamHI to remove the hygromycin resistance gene (hpt), and then blunt-ended. This was ligated to the aatR1-ccdB-Cm-aatR fragment (Invitrogen), and transferred into E. coli DB3.1 to select a chloramphenicol-resistant colony, thereby generating the destination vector pDEST3342. Then, the following primers containing an aatB sequence were synthesized (aatB sequences are shown in uppercase) in order to introduce a marker gene into the pDONR/Zeo plasmid (Invitrogen) by the BP reaction.
TABLE 2
Primer Name   Sequence                                                                   Length
aatB1-HPT     ggg gAC AAG TTT GTA CAA AAA AGC AGG CTc aat gag ata tga aaa agc c            49 mer
HPT-aatB2     ggg gAC CAC TTT GTA CAA GAA AGC TGG GTc tat tcc ttt gcc ctc gga cga g        52 mer
aatB1-bar     ggg gAC AAG TTT GTA CAA AAA AGC AGG CTc cat gga ccc aga acg acg c            49 mer
bar-aatB2     ggg gAC CAC TTT GTA CAA GAA AGC TGG GTt cct aga cgc gtg aga tca g            49 mer
SEQ ID NOs: 27-30 in order from the top.
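Since Table 2 marks the recombination-site sequence in uppercase, each primer can be split by letter case into a 5' lowercase clamp, the uppercase site, and a 3' lowercase tail. Treating the tail as the gene-specific part is an assumption. The Python sketch below, with hypothetical function names, performs the split and confirms the stated primer lengths.

```python
import re

# Sequences copied from Table 2 (5' to 3'), spacing preserved.
PRIMERS = {
    "aatB1-HPT": "ggg gAC AAG TTT GTA CAA AAA AGC AGG CTc aat gag ata tga aaa agc c",
    "HPT-aatB2": "ggg gAC CAC TTT GTA CAA GAA AGC TGG GTc tat tcc ttt gcc ctc gga cga g",
    "aatB1-bar": "ggg gAC AAG TTT GTA CAA AAA AGC AGG CTc cat gga ccc aga acg acg c",
    "bar-aatB2": "ggg gAC CAC TTT GTA CAA GAA AGC TGG GTt cct aga cgc gtg aga tca g",
}

def split_by_case(primer: str):
    """Split into (5' lowercase clamp, uppercase site, 3' lowercase tail)."""
    compact = primer.replace(" ", "")
    match = re.fullmatch(r"([a-z]*)([A-Z]+)([a-z]+)", compact)
    if match is None:
        raise ValueError("primer does not follow the lower/UPPER/lower layout")
    return match.groups()

segment_lengths = {
    name: tuple(len(part) for part in split_by_case(seq))
    for name, seq in PRIMERS.items()
}
total_lengths = {name: sum(parts) for name, parts in segment_lengths.items()}
print(segment_lengths)
print(total_lengths)
```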
[0358] For the amplification of the hpt gene, the hpt gene described in Bilang et al. (1991) Gene 100: 247-250 was used as a template DNA along with aatB1-HPT and HPT-aatB2 as primers. For the amplification of the phosphinothricin resistance gene (bar), pSB25 (Ishida et al. 1996) was used as a template DNA along with aatB1-bar and bar-aatB2 as primers (Table 2). In 100 μl of a reaction solution containing 10 ng of each template DNA and 25 pmoles of the primers, 35 cycles of PCR were performed. After the completion of the reaction, the products were recovered by ethanol precipitation and used for the BP reaction (25° C., 6 hrs) according to the protocol attached to BP Clonase Enzyme Mix kit (Invitrogen), and then transferred into E. coli DH5α, and E. coli cells harboring plasmids of interest were selected on a low salt LA plate containing the antibiotic Zeocin. The nucleotide sequences of the finally obtained plasmids were confirmed by restriction endonuclease analysis, thereby generating pENT-HPTwt and pENT-bar, respectively.
[0359] The destination vector (pDEST3342) prepared before and the entry vectors (pENT-HPTwt and pENT-bar) were used to prepare final plasmids of interest by the LR reaction. After the reaction in 20 μl of a reaction solution (containing 300 ng each of the destination vector and entry vectors) at 25° C. for 4 hours according to the protocol attached to GATEWAY LR Clonase Enzyme Mix, the reaction products were transferred into E. coli DH5α by electroporation. Plasmid DNAs were prepared from colonies grown on an LA plate containing spectinomycin, and candidate clones were selected by restriction fragment patterns. They were confirmed by nucleotide sequence analysis to contain an attB sequence and the sequence of the hpt gene or bar gene and designated as pSB200PcHmGWH (FIG. 3) and pSB200PcHmGWB (FIG. 4), respectively.
[0360] 2) Construction of the Cosmid Vector pLC40
[0361] A PCR reaction was performed using Pyrobest DNA Polymerase (Takara) with the primers OriV3'ClaFW, OriV5'PvNhEc, OriT5'BglRV, OriT3'SpEcFW, InC5'XbRV, InC3'BgEcFW, R5'XhoIRV, R3'BmEcFW, 121KIII5'NspV, 121KIII3'SalI, COS5'BmRV and COS3'MunFW (Table 3). These primers were designed for amplifying a DNA fragment containing oriV, a DNA fragment containing oriT, a DNA fragment containing the incC2 gene, and a DNA fragment containing the trfA1 gene, all of which are derived from the IncP plasmid pVK102 (Knauf and Nester, Plasmid 8: 45-54, 1982), as well as a DNA fragment containing the nptIII gene from pBI121 and a DNA fragment containing cos from pSB11.
[0362] Each primer contains a restriction endonuclease site for later use. The PCR products other than the trfA1 gene, i.e., the DNA fragment containing oriV from pVK102 (884 bp), the DNA fragment containing oriT (810 bp), the DNA fragment containing the incC1 gene (2118 bp), the DNA fragment containing the nptIII gene from pBI121 (1087 bp), and the DNA fragment containing cos from pSB11 were each cloned into the vector pCR2.1Topo Blunt (from Invitrogen). As a result, the DNA fragment containing oriV was found to contain two nucleotide substitutions and one nucleotide addition as compared with the corresponding nucleotide sequence in a public database (Genbank accession L27758). These mutations were also found in the template plasmid, showing that they were not introduced by PCR but that the template plasmid had a nucleotide sequence different from the sequence in the public database. The nucleotide sequences of oriT, the incC1 gene, and cos were completely identical to those in the database. However, the trfA1 gene could not be cloned alone. Thus, the construction was pursued by the method described below.
[0363] The plasmid into which the DNA fragment containing oriV had been cloned was digested with the restriction endonucleases EcoRI and ClaI, and a 0.9 kb fragment was purified. Similarly, the DNA fragment containing oriT was digested with EcoRI and BglII, and the DNA fragment containing nptIII was digested with NspV and SalI, and the digests were purified. The PCR product of the DNA fragment containing the trfA1 gene was precipitated with ethanol, and then digested with XhoI and BamHI, and purified. These 4 fragments (oriV, the trfA1 gene, nptIII, oriT) were ligated in a single reaction and cloned together. The nucleotide sequence of the resulting plasmid (designated as pVRKT) was analyzed to reveal a frameshift mutation in the DNA fragment containing the trfA1 gene as compared with the corresponding nucleotide sequence in the public database (Genbank accession L27758). The same mutation was also found in pVK102 used as the template, leading to the conclusion that the mutation was not introduced by PCR and that pVK102 used as the template contained a nucleotide sequence different from the sequence in the public database.
[0364] The resulting plasmid pVRKT containing the 4 fragments was digested with EcoRI and SpeI, and the DNA fragment containing the incC1 gene recovered by digesting the plasmid containing the incC1 gene with EcoRI and XbaI was inserted into it. The resulting plasmid was further digested with EcoRI and BglII, and the DNA fragment containing cos recovered by digesting the plasmid containing the DNA fragment containing cos with MunI and BamHI was inserted into it to generate the low-copy vector backbone p6FRG (about 8.5 kb) consisting of the 6 fragments, i.e., the DNA fragment containing oriV, the DNA fragment containing the trfA1 gene, the DNA fragment containing nptIII, the DNA fragment containing oriT, the DNA fragment containing the incC1 gene, and the DNA fragment containing cos (SEQ ID NO: 1 in the Sequence Listing). The foregoing cloning procedure is summarized in the schematic diagram shown in FIG. 5. The T-DNA region (SspI-SpeI fragment) of pSB200PcHm was inserted into the PvuII, NheI sites of the p6FRG plasmid to generate the vector pLC40 (FIG. 6, SEQ ID NO: 2 in the Sequence Listing).
TABLE-US-00003 TABLE 3
Primer Name    Sequence    Target gene    Length
121KIII5'NspV    5'-TCg TTC gAA TCg ATA CTA TgT TAT ACg CCA AC-3'    nptIII    32 mer
121KIII3'SalI    5'-ATC gTC gAC TgC ACg AAT ACC AgC gAC CC-3'        29 mer
COS5'BmRV    5'-ggg ggA TCC TTC CAT TgT TCA TTC CAC ggA C-3'    cos    31 mer
COS3'MunFW    5'-ggg CAA TTg ACA TgA ggT TgC CCC gTA TTC-3'        30 mer
OriV3'ClaFW    5'-gAT ATC gAT AgC gTg gAC TCA Agg CTC TC-3'    oriV    29 mer
OriV5'PvNhEc    5'-AAA gAA TTC gCT AgC CAg CTg gCg CTg CCA TTT TTg ggg Tg-3'        41 mer
R5'XhoIRV    5'-AAA CTC gAg CAg CCg AgA ACA TTg gTT CC-3'    trfA1    29 mer
R3'BmEcFW    5'-TAg gAA TTC ggA TCC AAA ACA ACT gTC AAA gCg CAC-3'        36 mer
OriT5'BglRV    5'-CgT AgA TCT ggC gCT Cgg TCT TgC CTT g-3'    oriT    28 mer
OriT3'SpEcFW    5'-TgT gAA TTC ACT AgT gAT ATT CCA CAA AAC AgC Agg g-3'        37 mer
InC5'XbRV    5'-CCg TCT AgA TTC gAg CCA Cgg TAg Cgg C-3'    incC2    28 mer
InC3'BgEcFW    5'-CTT gAA TTC AgA TCT TCT Cgg Cgg CgA TCA CgA C-3'        34 mer
SEQ ID NOs: 31-42 in order from the top.
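The statement that each primer carries a restriction endonuclease site for later use can be checked mechanically. The following is a minimal sketch, not part of the original procedure: the primer sequences and lengths are copied from Table 3, while the recognition sequences in `SITES` are the commonly cited ones for these enzymes and are supplied here as an assumption. The helper names `clean` and `sites_in` are hypothetical.

```python
# Commonly cited recognition sequences (5'->3'); supplied here as an
# assumption, they are not listed in the text itself.
SITES = {
    "NspV": "TTCGAA", "ClaI": "ATCGAT", "SalI": "GTCGAC", "BamHI": "GGATCC",
    "MunI": "CAATTG", "PvuII": "CAGCTG", "NheI": "GCTAGC", "EcoRI": "GAATTC",
    "XhoI": "CTCGAG", "BglII": "AGATCT", "SpeI": "ACTAGT", "XbaI": "TCTAGA",
}

# Primer sequences and stated lengths copied from Table 3.
PRIMERS = {
    "121KIII5'NspV": ("TCg TTC gAA TCg ATA CTA TgT TAT ACg CCA AC", 32),
    "121KIII3'SalI": ("ATC gTC gAC TgC ACg AAT ACC AgC gAC CC", 29),
    "COS5'BmRV": ("ggg ggA TCC TTC CAT TgT TCA TTC CAC ggA C", 31),
    "COS3'MunFW": ("ggg CAA TTg ACA TgA ggT TgC CCC gTA TTC", 30),
    "OriV3'ClaFW": ("gAT ATC gAT AgC gTg gAC TCA Agg CTC TC", 29),
    "OriV5'PvNhEc": ("AAA gAA TTC gCT AgC CAg CTg gCg CTg CCA TTT TTg ggg Tg", 41),
    "R5'XhoIRV": ("AAA CTC gAg CAg CCg AgA ACA TTg gTT CC", 29),
    "R3'BmEcFW": ("TAg gAA TTC ggA TCC AAA ACA ACT gTC AAA gCg CAC", 36),
    "OriT5'BglRV": ("CgT AgA TCT ggC gCT Cgg TCT TgC CTT g", 28),
    "OriT3'SpEcFW": ("TgT gAA TTC ACT AgT gAT ATT CCA CAA AAC AgC Agg g", 37),
    "InC5'XbRV": ("CCg TCT AgA TTC gAg CCA Cgg TAg Cgg C", 28),
    "InC3'BgEcFW": ("CTT gAA TTC AgA TCT TCT Cgg Cgg CgA TCA CgA C", 34),
}


def clean(seq):
    """Remove spaces and normalise case."""
    return "".join(seq.split()).upper()


def sites_in(seq):
    """Return the names of the enzymes whose recognition sequence occurs in seq."""
    s = clean(seq)
    return sorted(name for name, site in SITES.items() if site in s)


if __name__ == "__main__":
    for name, (seq, stated_len) in PRIMERS.items():
        # Print the counted length next to the stated one, then the sites found.
        print(name, len(clean(seq)), stated_len, sites_in(seq))
```

Under these assumptions every primer in Table 3 contains at least one of the listed sites, and the counted lengths agree with the lengths stated in the table.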
[0365] 3) Construction of Other pLC Series Cosmid Vectors
pLC40GWH
[0366] Of the two BalI sites in the backbone of pSB200PcHmGWH, the one on the left side of the RB is not cleaved because it is methylated. Thus, pSB200PcHmGWH was first transferred into the E. coli strain GM48 to demethylate that site before use in the experiments below. pSB200PcHmGWH was treated with BalI and SpeI to excise a region containing the T-DNA, which was cloned into the PvuII, NheI sites of p6FRG described above to generate pLC40GWH (FIG. 7, SEQ ID NO: 3 in the Sequence Listing). This differs from pLC40 by insertions of the attB1 and attB2 sequences and a deletion of a 317 bp SspI-BalI region upstream of the RB.
[0367] pLC40 bar, pLC40GWB, pLC40GWBSW
[0368] pSB25UNpHm and pSB200PcHmGWB were digested with the restriction endonucleases SpeI and SspI, and fragments containing the T-DNA were recovered. These fragments were cloned into the PvuII, NheI sites of p6FRG to generate pLC40 bar (FIG. 8, SEQ ID NO: 4 in the Sequence Listing) and pLC40GWB (FIG. 9, SEQ ID NO: 5 in the Sequence Listing), respectively. pSB200PcHmGWB was treated with NspV, blunt-ended, and dephosphorylated. A pSwaI linker (Table 5) was inserted into this site (pSB200PcHmGWBSW). This plasmid was digested with the restriction endonucleases SpeI and SspI, and a fragment containing the T-DNA was recovered. This fragment was cloned into the PvuII, NheI sites of p6FRG to generate pLC40GWBSW.
[0369] pLC40:35S-IGUS, pLC40GWB:35S-IGUS
[0370] The vector pSB24 (Komari et al. 1996) was treated with the restriction endonucleases HindIII and EcoRI to excise a DNA fragment consisting of 35S promoter-I-GUS gene-NOS terminator. This fragment was blunt-ended by Klenow treatment, and then a 3.1 kb fragment was purified and recovered. The cosmid vector pLC40 described above was treated with the restriction endonuclease NspV, blunt-ended with Klenow enzyme, and then dephosphorylated and purified. Separately, pLC40GWBSW was treated with the restriction endonuclease SwaI, dephosphorylated and then gel-purified. The DNA fragment containing the GUS gene described above was inserted into these vectors to prepare pLC40:35S-IGUS and pLC40GWB:35S-IGUS, respectively.
[0371] pLC40GWHKorB
[0372] The cloned region of IncC1 in the pLC vector was extended, and the vector pLC40GWHKorB containing the korB gene was constructed. InC3'BgEcFW (described above) and IncC/KorB-Xba#1 (Table 4) were designed as primers for amplifying a DNA fragment containing IncC1-KorB of the IncP-based plasmid pVK102. Each primer contains a restriction endonuclease site for later use. A PCR reaction was performed as follows. In 50 μl of a reaction solution containing 500 ng of the pVK102 plasmid DNA, 5 μl of 10× Pyrobest Buffer II, 4 μl of 2.5 mM each dNTP, 50 pmoles of the primers, and 0.5 μl of Pyrobest DNA Polymerase (from Takara), one cycle of 96° C. for 3 minutes, and 10 cycles of 96° C. for 1 minute, 55° C. for 1 minute, and 72° C. for 2 minutes and 30 seconds were performed by using Mastercycler gradient (Eppendorf). The resulting amplified PCR product of IncC1-korB (3065 bp) was cloned into the vector pCR2.1Topo Blunt (from Invitrogen). Ligation reactions were performed following the instructions attached to the vector kit. The DNA was transferred into E. coli DH5α by electroporation, and incubated overnight at 37° C. on a 2×YT agar plate containing the antibiotic Zeocin (25 μg/ml). Colony direct PCR was performed to select candidate clones by using grown colonies as templates along with the same primer set as used for the amplification of IncC1-KorB. PCR conditions included one cycle of 96° C. for 3 minutes, and 30 cycles of 96° C. for 1 minute, 55° C. for 1 minute, and 72° C. for 2 minutes and 30 seconds using Mastercycler gradient in a suspension of the colony in 20 μl of a reaction solution containing 2 μl of 10× Extaq Buffer, 1.6 μl of 2.5 mM each dNTP, 5 pmoles of the primers, and 0.4 μl of Extaq DNA Polymerase (from Takara). Clones yielding PCR amplified products of about 3 kb were selected as candidates. The nucleotide sequences of these clones were determined by ABI PRISM Fluorescent Sequencer (Model 3100 Genetic Analyzer, from Applied Biosystems). As a result, the nucleotide sequence of IncC1-KorB was completely identical to the sequence in the database.
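The reagent volumes given for the two PCR reactions translate into final concentrations and nominal program times by simple arithmetic. The sketch below is illustrative only. It assumes that "50 pmoles of the primers" and "5 pmoles of the primers" refer to the amount of each primer, and the nominal program time ignores ramping between temperature steps. The function names are hypothetical.

```python
def final_conc(stock_conc, added_ul, total_ul):
    """Final concentration after diluting a stock into the reaction volume."""
    return stock_conc * added_ul / total_ul


def primer_um(pmol, total_ul):
    """pmol per microliter is numerically equal to micromolar."""
    return pmol / total_ul


def program_minutes(initial_min, cycles, step_minutes):
    """Nominal hold time of the program, ignoring temperature ramps."""
    return initial_min + cycles * sum(step_minutes)


# Amplification of IncC1-KorB from pVK102 (50 ul reaction, 10 cycles).
amplification = {
    "dNTP_mM_each": final_conc(2.5, 4, 50),
    "primer_uM": primer_um(50, 50),  # assumes 50 pmol of each primer
    "program_min": program_minutes(3, 10, (1, 1, 2.5)),
}

# Colony direct PCR (20 ul reaction, 30 cycles).
colony_pcr = {
    "dNTP_mM_each": final_conc(2.5, 1.6, 20),
    "primer_uM": primer_um(5, 20),  # assumes 5 pmol of each primer
    "program_min": program_minutes(3, 30, (1, 1, 2.5)),
}

if __name__ == "__main__":
    print(amplification)
    print(colony_pcr)
```

Both reactions work out to 0.2 mM of each dNTP, so the colony PCR is a scaled-down version of the same nucleotide conditions with a lower primer concentration and three times as many cycles.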
[0373] Then, the plasmid pVRKT described above was digested with EcoRI and SpeI, and the IncC1-KorB fragment recovered by digesting the plasmid containing IncC1-KorB with EcoRI and XbaI was inserted into it. The resulting plasmid was further digested with EcoRI and BglII, and the cos fragment (MunI-BamHI fragment) described above was inserted into it to generate the plasmid p6FRG2 consisting of the 6 fragments, i.e., oriV, trfA1, nptIII, oriT, IncC1-KorB and cos. The T-DNA region (BalI-SpeI fragment) from pSB3342GWH was inserted into the PvuII, NheI sites of the p6FRG2 plasmid to generate the vector pLC40GWHKorB (FIG. 10, SEQ ID NO: 65).
[0374] pLC40GWHKorBPI
[0375] In order that the cloned large genomic fragment could be excised in its intact form, a recognition site for the homing endonuclease PI-SceI was added upstream of the multicloning site. pLC40GWHKorB was digested with HindIII, and PI-SceI adapters (PI-SceIFw, PI-SceIRv, Table 4) were inserted to generate pLC40GWHKorBPI.
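The way the two PI-SceI adapter oligonucleotides fit a HindIII-digested vector can be illustrated with a short check. This is a sketch under stated assumptions: the sequences are those listed for PI-SceIFw and PI-SceIRv in Table 4, HindIII is taken to leave a 5'-AGCT overhang (its commonly cited cleavage pattern, not stated in the text), and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove spaces and normalise case."""
    return "".join(seq.split()).upper()


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def annealed_overhangs(top, bottom, overhang_len):
    """Return the 5' overhangs (top, bottom) left after annealing.

    The two oligonucleotides are assumed to pair everywhere except for
    their first overhang_len bases; None is returned if they do not.
    """
    top, bottom = clean(top), clean(bottom)
    if top[overhang_len:] != revcomp(bottom[overhang_len:]):
        return None
    return top[:overhang_len], bottom[:overhang_len]


# Adapter sequences copied from Table 4.
PI_SCEI_FW = "AGC TAT CTA TGT CGG GTG CGG AGA AAG AGG TAA TGA AAT GGC A"
PI_SCEI_RV = "AGC TTG CCA TTT CAT TAC CTC TTT CTC CGC ACC CGA CAT AGA T"

# Assumed HindIII overhang (A^AGCTT leaves 5'-AGCT).
HINDIII_OVERHANG = "AGCT"

if __name__ == "__main__":
    ends = annealed_overhangs(PI_SCEI_FW, PI_SCEI_RV, len(HINDIII_OVERHANG))
    print(ends)
```

With these inputs the 39 internal bases of the two oligonucleotides are perfectly complementary and both protruding ends read AGCT, which is consistent with insertion into a HindIII-cut vector in either orientation.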
[0376] pLC40GWHKorBPIattB3
[0377] In order that the promoter of the selectable marker of pLC40GWH could be replaced with any other promoter by the Gateway system, an attB3 site was added upstream of the ubiquitin promoter. pLC40GWHKorBPI was digested with I-SceI and attB3 adapters (attB3Fw, attB3Rv, Table 4) were inserted to prepare pLC40GWHKorBPIattB3.
[0378] pLCleo
[0379] In order that a NotI-digested genomic fragment could be cloned, a recognition site for PspOMI (producing the same sticky end as that of NotI) was formed at the multicloning site and simultaneously the recognition site of ApaI (a neoschizomer of PspOMI) in the ubiquitin intron was abolished. pLC40GWHKorBPIattB3 was digested with ApaI and NheI, and ApaIm-NheI adapters (ApaIm-NheIFw, ApaIm-NheIRv, Table 4) were inserted to prepare pLC40GWHKorBPIattB3ApaIm. This plasmid was digested with HindIII and NspV, and HindIII-PspOMI-NspV adapters (HindIII-PspOMI-NspVFw, HindIII-PspOMI-NspVRv, Table 4) were inserted to finally prepare pLCleo (FIG. 11, SEQ ID NO: 66 in the Sequence Listing).
TABLE-US-00004 TABLE 4
Primer/Adapter name    Sequence (5'-3')    Length
IncC/KorB-Xba#1    CGG TCT AGA GTG CGC AGC AGC TCG TTA TC    29 mer
PI-SceIFw    AGC TAT CTA TGT CGG GTG CGG AGA AAG AGG TAA TGA AAT GGC A    43 mer
PI-SceIRv    AGC TTG CCA TTT CAT TAC CTC TTT CTC CGC ACC CGA CAT AGA T    43 mer
attB3Fw    CAG GGT AAT CAA CTT TGT ATA ATA AAG TTG ATA A    34 mer
attB3Rv    CAA CTT TAT TAT ACA AAG TTG ATT ACC CTG TTA T    34 mer
ApaIm-NheIFw    GGGTAGTTCTACTTCTGTTCATGTTTGTGTTAGATCCGTGTTTGTGTTAGATCCGTGCTG    60 mer
ApaIm-NheIRv    CTAGCGCCGGATCTAACACAAACACGGATCTAACACAAACATGAACAGAAGTAGAACTACCCGGCC    66 mer
HindIII-PspOMI-NspVFw    AGC TTG GGC CCT T    13 mer
HindIII-PspOMI-NspVRv    AGG GCC CA    8 mer
SEQ ID NOs: 68-76 in order from the top.
[0380] p6FRGSwKp
[0381] p6FRG was treated with PvuII and dephosphorylated. The adapter DNAs SwaIKpnIRV and SwaIKpnIFW (Table 5), which have recognition sites for SwaI and KpnI, were annealed. A part of the annealed product was phosphorylated with PNK (Amersham). This SwaI-KpnI linker was inserted into the PvuII site of p6FRG to generate p6FRGSwKp. The KpnI site was designed for cloning a DNA fragment containing the virB gene and the virG gene derived from the Agrobacterium strain A281 in the next step, and the SwaI site was designed for cloning the T-DNA in the step after next.
TABLE-US-00005 TABLE 5
Linker/Adapter name    Sequence    Length
pSwaI linker    5'-cca ttt aaa tgg-3'    12 mer
SwaIKpnIRV    5'-cca ttt aaa tgg tac cgg-3'    18 mer
SwaIKpnIFW    5'-ccg gta cca ttt aaa tgg-3'    18 mer
SEQ ID NOs: 43-45 in order from the top.
[0382] p6FRGSVR, p6FRGSVF
[0383] The vector pSB1 (Komari et al. 1996) was digested with KpnI, and a 14.8 kb DNA fragment containing the virB gene and the virG gene was recovered. This fragment was inserted into the KpnI-treated and dephosphorylated vector p6FRGSwKp, thereby generating p6FRGSVR and p6FRGSVF.
[0384] pLCSBGWBSW
[0385] pSB200PcHmGWBSW was digested with SpeI and SspI, and a DNA fragment containing the T-DNA region was blunt-ended with Klenow enzyme. This fragment was inserted into the SwaI-digested and dephosphorylated vector p6FRGSVR, thereby generating pLCSBGWBSW (FIG. 12, SEQ ID NO: 6 in the Sequence Listing). This vector is a low-copy vector having a full length of about 28 kb, which contains the virB gene and the virG gene derived from the Agrobacterium strain A281 so that it may be used for transformation of maize. It also contains a cos site, which allows easy cloning of about 10-20 kb of DNA by a packaging reaction.
[0386] 4) A pLC Vector Containing virG
[0387] The vector pLC40GWHvG containing the virG gene in the pLC40GWH vector was constructed by the procedure described below as a means for improving the efficiency of plant transformation with a pLC40 series cosmid vector.
[0388] Preparation of the virG Gene
[0389] The primers virGProSm and virGTerSm for amplifying the virG gene (including its promoter, the structural gene and the 3' region) were designed and synthesized. These primers, and pTOK47 (Jin et al. 1987 J Bacteriol 169: 4417-4425) as a template DNA, were used to amplify the virG gene by PCR. As a result, a PCR product of about 1 kb was amplified. A part of the product was cloned into the vector pCR2.1Topo (from Invitrogen) in the same manner as described above, and the nucleotide sequence was determined. The DNA sequence of the virG gene contains an NspV site. Because the NspV site is to be used as a cloning site in the vector to be constructed, the site within the virG gene was removed by PCR mutagenesis. The primer virGonNspVRV, in which the first adenine in the NspV site (ttcgaa) is changed to guanine (ttcgga), and its complementary sequence virGonNspVFW were designed and synthesized. PCR was performed with two primer sets, i.e., one consisting of virGonNspVFW and the primer virGProSpe placed upstream of the virG gene promoter and the other consisting of virGonNspVRV and the primer virGTerSpe placed downstream of the virG gene terminator. The virG gene cloned into pCR2.1Topo was used as a template. As a result, a product of about 400 bp and a product of about 600 bp were amplified by the former and latter sets, respectively. These products were purified and used as templates for the next PCR reaction. A PCR reaction was performed with the two purified PCR products as templates and the previous primers virGProSpe and virGTerSpe. As a result, a PCR product of about 1 kb was amplified. The PCR product was cloned into the pCR2.1Topo vector, and the nucleotide sequence was determined to confirm the mutation (ttcgaa→ttcgga).
[0390] Similarly, the unmutated virG gene was amplified by PCR with virGProSpe and virGTerSpe, and cloned into pCR2.1Topo, and the nucleotide sequence was determined. The primers used in PCR are summarized in Table 6.
TABLE-US-00006 TABLE 6
Designation    Sequence 5'-3'    Length
virGProSm    TCA ATA CCC ggg gTA ACC TCg AAg CgT TTC AC    32 mer
virGTerSm    Tgg TgA CCC ggg ACC TAT Cgg AAC CCC TCA C    31 mer
virGProSpe    TCA ATA ACT AgT gTA ACC TCg AAg CgT TTC AC    32 mer
virGTerSpe    Tgg TgA ACT AgT ACC TAT Cgg AAC CCC TCA C    31 mer
virGonNspVRV    CTT gAg ATC gTT Cgg AAT CTg    21 mer
virGonNspVFW    CAg ATT CCg AAC gAT CTC AAg    21 mer
SEQ ID NOs: 46-51 in order from the top.
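The two-step PCR mutagenesis used to remove the NspV site relies on a pair of fully complementary mutagenic primers, so that the two first-round products overlap and can be fused in the second round. The following sketch checks the primer pair from Table 6. The NspV recognition sequence is taken from the text (ttcgaa), whereas the fused-length estimate uses the approximate product sizes given in the text and is therefore only a rough figure. The helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

NSPV_SITE = "TTCGAA"     # site to be removed, as given in the text
MUTATED_SITE = "TTCGGA"  # first adenine changed to guanine


def clean(seq):
    """Remove spaces and normalise case."""
    return "".join(seq.split()).upper()


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def fused_length(len_a, len_b, overlap):
    """Length of the product obtained by fusing two overlapping fragments."""
    return len_a + len_b - overlap


# Mutagenic primers copied from Table 6.
VIRG_ON_NSPV_RV = clean("CTT gAg ATC gTT Cgg AAT CTg")
VIRG_ON_NSPV_FW = clean("CAg ATT CCg AAC gAT CTC AAg")

if __name__ == "__main__":
    print("complementary:", revcomp(VIRG_ON_NSPV_RV) == VIRG_ON_NSPV_FW)
    print("NspV site present:", NSPV_SITE in VIRG_ON_NSPV_RV)
    print("mutated site present:", MUTATED_SITE in VIRG_ON_NSPV_RV)
    # Approximate first-round product sizes from the text; overlap = primer length.
    print("fused length (bp):", fused_length(400, 600, len(VIRG_ON_NSPV_RV)))
```

The estimated fused length of a little under 1 kb is in line with the "about 1 kb" product reported for the second PCR, bearing in mind that the 400 bp and 600 bp figures are themselves approximate.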
[0391] pLC40GWHvG1, pLC40GWHvGC1
[0392] The vector pLC40GWH was digested with the restriction endonuclease PvuII, and dephosphorylated. An SpeI linker (GACTAGTC, from Takara) was inserted to prepare pLC40GWHSpe. This plasmid was digested with the restriction endonuclease SpeI and dephosphorylated. A fragment of about 1 kb of the mutated virG gene excised with SpeI from the vector was inserted into this plasmid to prepare pLC40GWHvG1 (FIG. 13, SEQ ID NO: 7 in the Sequence Listing). Similarly, the unmutated virG gene was inserted into pLC40GWHSpe to prepare pLC40GWHvGC1.
[0393] pLC40GWHvG1:35S-IGUS, pLC40GWHvGC1:35S-IGUS
[0394] In the same manner as described above, the vector pSB24 (Komari et al. 1996) was treated with the restriction endonucleases HindIII and EcoRI to excise a DNA fragment containing the GUS gene, which was cloned into a vector having a multicloning site SgfI-HindIII-EcoRI-SgfI. The resulting plasmid was digested with SgfI to recover the DNA fragment containing the GUS gene. At this point, both ends of the DNA fragment containing the GUS gene are SgfI sites. The cosmid vector pLC40GWHvG1 described above was treated with the restriction endonuclease PacI and dephosphorylated. The 3.1 kb SgfI fragment (the DNA fragment containing the GUS gene) was cloned into it to generate pLC40GWHvG1:35S-IGUS. Similarly, 35S-IGUS-NOS was introduced into pLC40GWHvGC1 to prepare pLC40GWHvGC1:35S-IGUS.
[0395] 5) virG-Containing Vectors Capable of Coexisting with pLC
[0396] pVGW
[0397] pTOK47 is a large IncW plasmid of about 28 kb containing virG and virB (Jin et al. 1987 J Bacteriol 169: 4417-4425). Thus, a smaller vector capable of coexisting with a pLC vector and containing the origin of replication IncW ori, the virG gene, and a selectable marker gene (designated as pVGW) was designed and constructed.
[0398] The primers pSa5'EcT22 and pSa3'BglII for amplifying a fragment containing IncW ori from pTOK47 (Jin et al. 1987 J Bacteriol 169: 4417-4425), and the primers Gm5'Bm and Gm3'Xh-2nd for amplifying the gentamycin resistance gene (gentamycin acetyltransferase) from pPH1JI (Hirsch and Beringer 1984 Plasmid 12: 139-141) were designed (Table 7). Each primer contains a restriction endonuclease site for later use. pTOK47 and pPH1JI were used as templates, respectively. Pyrobest DNA Polymerase (from TaKaRa) was used to perform PCR. As a result, a DNA fragment of about 2.7 kb containing IncW ori and a DNA fragment of about 0.7 kb corresponding to the gentamycin resistance gene were amplified.
[0399] Separately, the primer virGN54DFW for changing the amino acid residue at position 54 of virG derived from pTOK47 from N to D by PCR mutagenesis (virGN54D, Hansen et al. 1994 Proc. Natl. Acad. Sci. USA 91: 7603-7607), and its complementary sequence virGN54DRV were designed. PCR was performed with two primer sets, i.e., one consisting of virGN54DFW and the primer virGProSal placed 5' of the virG gene promoter and the other consisting of virGN54DRV and the primer virGTerPst placed 3' of the virG gene terminator (Table 7). The pTOK47 plasmid was used as a template. As a result, a product of about 0.4 kb and a product of about 0.7 kb were amplified by the former and latter sets, respectively. These products were purified and used as templates along with the previous primers virGProSal and virGTerPst to further perform a PCR reaction. As a result, the product (virGN54D) of about 1.1 kb was amplified.
[0400] The PCR products of the fragment containing IncW ori, the gentamycin resistance gene, and virGN54D were cloned into the pCR-Blunt II-TOPO vector (Invitrogen). The nucleotide sequence was determined and compared with a publicly available sequence (Genbank/EMBL Accession Number: U30471) to reveal a deletion of 6 nucleotides in the fragment containing IncW ori, which was also found in pTOK47 used as a template. However, the nucleotide sequence of the gentamycin resistance gene was completely identical to the sequence in the database. virGN54D was found to contain the mutation at the desired site.
[0401] The plasmid into which the fragment containing IncW ori had been cloned was digested with EcoT22I and BglII, and a 2.7 kb fragment was recovered. Similarly, the gentamycin resistance gene was digested with BamHI and XhoI, and virGN54D was digested with SalI and PstI, and each fragment was purified. These three fragments were ligated together (BglII and BamHI, XhoI and SalI, and PstI and EcoT22I produce the same sticky ends) to generate pVGW (FIG. 14, SEQ ID NO: 8 in the Sequence Listing).
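The pairing of enzymes in this three-fragment ligation can be made explicit by deriving each overhang from the recognition site and the cut position. This is a minimal sketch: the recognition sequences and top-strand cut positions in `ENZYMES` are the commonly cited values and are an assumption supplied here (the text states only that the pairs give the same sticky ends), the calculation is valid only for palindromic sites, and the function names are hypothetical.

```python
# name: (recognition site 5'->3', number of bases before the top-strand cut)
# Commonly cited values, supplied as an assumption.
ENZYMES = {
    "BglII": ("AGATCT", 1),    # A^GATCT
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "XhoI": ("CTCGAG", 1),     # C^TCGAG
    "SalI": ("GTCGAC", 1),     # G^TCGAC
    "PstI": ("CTGCAG", 5),     # CTGCA^G
    "EcoT22I": ("ATGCAT", 5),  # ATGCA^T
}


def overhang(name):
    """Return (polarity, sequence) of the single-stranded end for a palindromic site."""
    site, cut = ENZYMES[name]
    n = len(site)
    if 2 * cut < n:
        return ("5'", site[cut:n - cut])
    if 2 * cut > n:
        return ("3'", site[n - cut:cut])
    return ("blunt", "")


def compatible(a, b):
    """True if both enzymes leave identical cohesive ends."""
    end_a, end_b = overhang(a), overhang(b)
    return end_a == end_b and end_a[0] != "blunt"


if __name__ == "__main__":
    for a, b in (("BglII", "BamHI"), ("XhoI", "SalI"), ("PstI", "EcoT22I")):
        print(a, b, overhang(a), compatible(a, b))
```

Because the three junctions carry three different overhangs (GATC, TCGA and TGCA under these assumptions), each fragment end can pair with only one partner, which is what lets the three fragments assemble in a defined order and orientation.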
TABLE-US-00007 TABLE 7
Designation    Sequence    Length
pSa5'EcT22    5'-aaa atg cat ggc atg ttt aac aga atc tg-3'    29 mer
pSa3'BglII    5'-ttt aga tct act cgt tcg cgg agc tgg-3'    27 mer
Gm5'Bm    5'-aaa gga tcc ttc atg gct tgt tat gac tg-3'    29 mer
Gm3'Xh-2nd    5'-tgc ctc gag aca att tac cga aca act ccg-3'    30 mer
virGN54DFW    5'-cga cct aaa tct aga tca aca ac-3'    23 mer
virGN54DRV    5'-gtt gtt gat cta gat tta ggt cg-3'    23 mer
virGProSal    5'-ttt gtc gac cat agg cga tct cct taa tc-3'    29 mer
virGTerPst    5'-aaa ctg cag gtg aag agg gac cta tcg g-3'    28 mer
SEQ ID NOs: 52-59 in order from the top.
[0402] pVGW2
[0403] To further increase the convenience of pVGW, the promoter region of the gentamycin resistance gene was extended and additional cloning sites were added to construct the vector pVGW2. The primers BamSmaGmPro and NheIsiteGmRv for amplifying the gentamycin resistance gene of the plasmid pPH1JI, and the primers MscIsite-virG5'Fw (for these primers, see Table 8) and pSa3'BglII (described above) for amplifying the virG-IncW region of pVGW were designed. Each primer contains a restriction endonuclease site. A PCR reaction was performed as follows. One cycle of 98° C. for 30 seconds and 35 cycles of 98° C. for 10 seconds, 55° C. for 5 seconds, and 72° C. for 1 minute were performed using Mastercycler gradient (Eppendorf) in 50 μl of a reaction solution containing 1 ng of the template plasmid DNA, 25 μl of 2× PrimeSTAR Max Premix (from Takara), and 15 pmoles of the primers. As a result, the PCR products of the gentamycin resistance gene (826 bp) and the virG-IncW region (3840 bp) were amplified. The gentamycin resistance gene was cloned into the vector pCR-Blunt II-TOPO (from Invitrogen), and transferred into E. coli TOP10 (Invitrogen) by electroporation. The cells were incubated on an LB agar plate containing the antibiotic kanamycin (50 μg/ml) at 37° C. overnight, and a plasmid was purified from the resulting colony. The nucleotide sequences of these clones (pCR-Gm) were determined by ABI PRISM Fluorescent Sequencer (Model 3100 Genetic Analyzer, from Applied Biosystems) to confirm that no mutation had been introduced by PCR error. The plasmid pCR-Gm was digested with BamHI and PvuII to recover the Gm fragment, which was ligated to the virG-IncW fragment digested with BglII (having a BglII site at one end and a blunt end at the other). The resulting clone was transferred into E. coli TOP10 by electroporation, and selected on an LB agar plate containing the antibiotic gentamycin (30 μg/ml). A plasmid was purified from the resulting colony and confirmed by the sequencer to contain no PCR error, thereby generating pVGW2 (FIG. 15, SEQ ID NO: 67 in the Sequence Listing).
TABLE-US-00008 TABLE 8
Designation    Sequence    Length
BamSmaGmPro    5'-AAA GGA TCC CGG GTT GAC ATA AGC CTG TTC GGT TCG-3'    36 mer
NheIsiteGmRv    5'-AAA GCT AGC AAT TTA CCG AAC AAC TCC GCG G-3'    31 mer
MscIsite-virG5'Fw    5'-AAA TGG CCA TAG GCG ATC TCC TTA ATC AAT-3'    30 mer
SEQ ID NOs: 77-79 in order from the top.
Example 2
Cloning of Large Fragments by pLC Vectors
[0404] The present example describes libraries of Arabidopsis thaliana (ecotype: Columbia), a wild species of rice (Oryza rufipogon), Sudan grass (Sorghum sudanense), an extremely early maturing variety of Italian millet (Setaria italica), teosinte (Zea diploperennis), pearl millet (Pennisetum typhoideum), Bahia grass (Paspalum notatum Flugge) and sugar cane (Saccharum officinarum) prepared with pLC40, pLC40GWH, pLCleo, pLC40GWHvG1, pSB200, pSB200PcHmGW, or pSB25U.
[0405] 1) Preparation of Genomic DNA
[0406] About 5 g of young leaves of each plant at about one month after seeding grown in a greenhouse was ground in a mortar under liquid nitrogen, and then the genomic DNA was purified by the CTAB method. The yield was about 500-600 μg expressed as DNA. The genomic DNA was partially digested with 0.02-0.06 U/μg of TaqI enzyme. After the partial digestion, fractions containing a genomic DNA fragment of 30-45 kb were recovered by 10-40% sucrose density gradient centrifugation.
[0407] 2) Preparation of the Vectors
[0408] The cosmid vectors pLC40, pLC40GWH, pLCleo, pLC40GWHvG1, pSB200, pSB200PcHmGW, and pSB25U were completely digested with the restriction endonuclease NspV (TOYOBO) and dephosphorylated, and then purified.
[0409] 3) Cloning by a Packaging Reaction
[0410] The vectors prepared as described above were ligated to the genomic DNA fragments, followed by a packaging reaction using GigaPack III XL Packaging extract at room temperature for 2 hours. After the reaction, the clones were incubated with E. coli GeneHogs (Invitrogen). As a result, libraries of about 10,000-100,000 cfu (colony-forming units) were prepared from all of the combinations of the plant species and vectors, as shown in Table 9.
TABLE-US-00009 TABLE 9
Plant species    Vector    Library cfu
Arabidopsis thaliana    pLC40    ca 80000
Arabidopsis thaliana    pSB200PcHmGWH    ca 100000
Oryza rufipogon    pLC40GWH    ca 20000
Oryza rufipogon    pSB200    ca 50000
Extremely early maturing Italian millet    pLC40GWH    ca 20000
Extremely early maturing Italian millet    pSB200PcHmGWH    ca 20000
Sugar cane    pLC40GWH    ca 50000
Sugar cane    pLC40GWHvG1    ca 50000
Sudan grass    pLC40GWH    ca 50000
Sudan grass    pSB200PcHmGWH    ca 30000
Pearl millet    pLC40GWH    ca 20000
Teosinte    pLC40GWH    ca 100000
Teosinte    pSB25UNpHm    ca 20000
Bahia grass    pLCleo    ca 10000
[0411] 4) Analysis of the Cloned Genomic DNA Fragments
[0412] Plasmids were purified from 12-24 clones of each library and cleaved with the restriction endonucleases HindIII and SacI in the multicloning site at each end of the insert, thereby yielding bands corresponding to the vectors (9.2-9.8 kb) in all of the clones analyzed in the case of pSB200, pSB25UNpHm and pSB200PcHmGW as well as bands corresponding to the vectors (13.2-14.2 kb) in all of the clones analyzed in the case of pLC40, pLC40GWH, pLCleo and pLC40GWHvG1. The length of the cloned large fragment is estimated to be in the range of 25 kb-45 kb from the total length of the restriction fragments of the insert of each clone, with an average of about 40 kb in the case of the pSB vectors and an average of about 35 kb in the case of the pLC vectors. FIG. 16 shows an example of teosinte genomic DNA/pLC40GWH.
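The library sizes in Table 9 and the average insert lengths given above can be combined into a rough estimate of the total amount of genomic DNA held in each library. The sketch below is an illustration only: it assumes that every colony carries one insert of the average size (about 35 kb for pLC vectors and about 40 kb for pSB vectors) and it says nothing about redundancy or genome coverage. The names `AVERAGE_INSERT_KB`, `LIBRARIES` and `total_cloned_mb` are hypothetical.

```python
# Average insert sizes stated in the text (kb), keyed by vector family.
AVERAGE_INSERT_KB = {"pLC": 35, "pSB": 40}

# (plant species, vector, approximate library size in cfu) from Table 9.
LIBRARIES = [
    ("Arabidopsis thaliana", "pLC40", 80000),
    ("Arabidopsis thaliana", "pSB200PcHmGWH", 100000),
    ("Oryza rufipogon", "pLC40GWH", 20000),
    ("Oryza rufipogon", "pSB200", 50000),
    ("Italian millet", "pLC40GWH", 20000),
    ("Italian millet", "pSB200PcHmGWH", 20000),
    ("Sugar cane", "pLC40GWH", 50000),
    ("Sugar cane", "pLC40GWHvG1", 50000),
    ("Sudan grass", "pLC40GWH", 50000),
    ("Sudan grass", "pSB200PcHmGWH", 30000),
    ("Pearl millet", "pLC40GWH", 20000),
    ("Teosinte", "pLC40GWH", 100000),
    ("Teosinte", "pSB25UNpHm", 20000),
    ("Bahia grass", "pLCleo", 10000),
]


def total_cloned_mb(vector, cfu):
    """Estimated total cloned DNA in Mb, assuming one average-sized insert per colony."""
    family = vector[:3]  # "pLC" or "pSB"
    return cfu * AVERAGE_INSERT_KB[family] / 1000


if __name__ == "__main__":
    for species, vector, cfu in LIBRARIES:
        print(f"{species:22s} {vector:15s} {total_cloned_mb(vector, cfu):8.0f} Mb")
```

Dividing such a total by the genome size of the species concerned would give a nominal fold coverage, but genome sizes are not given in the text and are therefore left out of the sketch.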
[0413] Then, the human genome (Human Genomic DNA, Male, from Promega, Catalog No.: G1471) was partially digested with TaqI to prepare a 30-40 kb fragment, which was then cloned into the vector pLC40GWH. Plasmid DNA was purified from E. coli containing the human genomic fragment from 12 arbitrarily chosen clones, and the nucleotide sequences at both ends of the insert were analyzed and searched through a database. Homology searches were performed by BLAST through the database of GenBank at NCBI (http://www.ncbi.nlm.nih.gov/BLAST/). The results showed that each of 11 of the 12 clones isolated was included in a single human genomic clone in the database. Ten clones, excluding one containing repeated sequences, were analyzed for homology to the human genome sequence in the database, whereby the lengths of the cloned human genomic fragments were estimated to be 28023 bp, 31645 bp, 38265 bp, 39599 bp, 31965 bp, 32631 bp, 34727 bp, 36925 bp, 38794 bp, and 34364 bp. The average length was 34693.8 bp, which agreed well with the value obtained by cloning the plant genomes.
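The average insert length quoted above follows directly from the ten estimated fragment lengths. As a short worked check, using only the values listed in the paragraph (the variable names are hypothetical):

```python
# Estimated lengths (bp) of the ten cloned human genomic fragments, from the text.
INSERT_LENGTHS_BP = [
    28023, 31645, 38265, 39599, 31965,
    32631, 34727, 36925, 38794, 34364,
]

total_bp = sum(INSERT_LENGTHS_BP)
mean_bp = total_bp / len(INSERT_LENGTHS_BP)

if __name__ == "__main__":
    print("number of clones:", len(INSERT_LENGTHS_BP))
    print("total (bp):", total_bp)
    print("mean (bp):", round(mean_bp, 1))
    print("range (bp):", min(INSERT_LENGTHS_BP), "-", max(INSERT_LENGTHS_BP))
```

The mean of about 34.7 kb sits close to the average of about 35 kb estimated for the pLC vectors from the plant libraries, and all ten inserts fall within the 25-45 kb range reported there.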
[0414] Then, the nucleotide sequences at both ends of the cloned plant genomic DNA fragments were determined. Homology searches were performed by BLAST on the sequence data of 300-600 nucleotides thus obtained through the database of GenBank at NCBI (http://www.ncbi.nlm.nih.gov/BLAST/) and the database of Beijing Genomics Institute (http://btn.genomics.org.cn:8080/rice/). As a result, Oryza rufipogon and Arabidopsis showed a homology of 87-100% to the genome sequence of rice and Arabidopsis, respectively, over a range of at least 100 bp. The libraries of the other plant species also showed significant homologies to the sequences of rice, Arabidopsis, maize, sorghum, etc.
Example 3
Transfer into Agrobacterium via Triparental Mating
[0415] 1) Transfer into Agrobacterium via triparental mating and its efficiency
[0416] Each vector containing a plant genomic fragment was transferred into Agrobacterium via triparental mating as follows.
[0417] i) pLC40 series cosmid vectors
[0418] pLC40 series cosmid vectors are resistant to kanamycin (Km) and hygromycin (Hm). GeneHogs® (Invitrogen) was used as host E. coli. pRK2073 (spectinomycin (Sp)-resistant) was used as a helper plasmid for triparental mating. HB101 was used as host E. coli for the helper plasmid. The Agrobacterium strain LBA4404 (no drug resistance) was used.
[0419] Initially, E. coli GeneHogs® was infected with an appropriate amount of a dilution of a packaging reaction, spread on an LA plate containing Km (50 μg/mL), and incubated at 23° C. for 3 days. E. coli cells in a colony that appeared were streaked with a toothpick on an LA plate containing Km, and incubated at 28° C. for 2 nights. Separately, LBA4404 was spread on an AB plate, and incubated at 25° C. for 5 days. HB101/pRK2073 was spread on an LA plate containing Sp (50 μg/mL), and incubated at 37° C. for 2 nights. The cultures of the three strains, i.e., GeneHogs® harboring a pLC40 series cosmid vector containing a cloned genomic fragment, LBA4404 and HB101/pRK2073 were mixed on an NA plate and incubated at 28° C. overnight. The entire amount of the mixture of the three strains was suspended in 250 μl of sterile water, and 5 μl of the suspension was spread on an AB plate containing Km (50 μg/mL) and Hm (25 μg/mL), and incubated at 28° C. for 7 days. The resulting recombinant Agrobacterium was used in plant transformation experiments. A single colony was reincubated on an AB plate containing Km and Hm, and a part of the grown colony was spread on an LA plate with drugs, showing that few E. coli cells had grown.
[0420] ii) pSB200 series cosmid vectors
[0421] pSB200 series cosmid vectors are Sp- and Hm-resistant. GeneHogs® (Invitrogen) was used as host E. coli. pRK2013 (Km-resistant) was used as a helper plasmid. HB101 was used as host E. coli for the helper plasmid. The Agrobacterium strain LBA4404 harboring pSB1 (tetracycline (Tc) resistance) was used.
[0422] Initially, E. coli GeneHogs® was infected with an appropriate amount of a dilution of a packaging reaction, spread on an LA plate containing Sp (50 μg/mL), and incubated at 23° C. for 3 days. A colony was picked with a toothpick and streaked on an LA plate containing Sp, and incubated at 28° C. for a further 2 nights. On the other hand, LBA4404/pSB1 was spread on an AB plate containing Tc (15 μg/mL), and incubated at 25° C. for 5 days. HB101/pRK2013 was spread on an LA plate containing Km (50 μg/mL) and incubated at 37° C. for 2 nights. The cultures of the three strains, i.e., GeneHogs® harboring a pSB200 series cosmid vector containing a cloned genomic fragment, LBA4404/pSB1 and HB101/pRK2013 were mixed on an NA plate and incubated at 28° C. overnight. The entire amount of the mixture of the three strains was suspended in 250 μl of sterile water, and 25 μl of the suspension was spread on an AB plate containing Sp (50 μg/mL) and Hm (25 μg/mL), and incubated at 28° C. for 7 days. The resulting recombinant Agrobacterium was used in plant transformation experiments.
[0423] iii) pCLD04541
[0424] Two genome libraries (the genomes of the rice variety CO39 and Arabidopsis ecotype Columbia, both having an average insert length of 110 kb in host E. coli DH10B) prepared with the vector pCLD04541 provided by Dr. Hongbin Zhang of Texas A&M University were used for triparental mating. The pCLD04541 vector is Km- and Tc-resistant. pRK2073 was used as a helper plasmid, and HB101 was used as host E. coli for the helper plasmid. The Agrobacterium strain LBA4404 was used.
[0425] E. coli harboring each clone of the pCLD04541 libraries was spread on an LA plate containing Tc (10 μg/mL), and incubated at 28° C. for 2 nights. On the other hand, LBA4404 was spread on an AB plate, and incubated at 25° C. for 5 days. HB101/pRK2073 was spread on an LA plate containing Sp (50 μg/mL) and incubated at 37° C. for 2 nights. The cultures of the three strains, i.e., DH10B harboring pCLD04541 containing a cloned genomic fragment, LBA4404 and HB101/pRK2073 were mixed on an NA plate and incubated at 28° C. overnight. The entire amount of the mixture of the three strains was suspended in 250 μl of sterile water, and a few microliters of the suspension was spread on an AB plate containing Km (25 μg/mL), and incubated at 28° C. for 7 days. The resulting recombinant Agrobacterium was used in plant transformation experiments.
[0426] As described above, genome clones included in the libraries prepared with pLC40 series cosmid vectors, pSB200 series cosmid vectors and the pCLD04541 vector were transferred into Agrobacterium. A summary of these triparental mating systems and the triparental mating efficiencies are shown in Table 7. The triparental mating efficiencies were 97% in pLC series vectors, 79% in pSB series vectors, and 93% in pCLD04541, respectively, showing that the pLC series vectors were the most efficient (Table 10).
TABLE-US-00010 TABLE 10
                                           # of clones used    # of clones giving
                                           for triparental     recombinant          Efficiency (%)
DNA donor plant         Vector             mating (a)          Agrobacterium (b)    b/a
<pLC40 series cosmid vectors>
Oryza rufipogon         pLC40GWH           5657                5469                 96.7
Arabidopsis thaliana    pLC40              1532                1410                 92.0
Sudan grass             pLC40GWH           2301                2201                 95.7
Italian millet          pLC40GWH           2521                2405                 95.4
Teosinte                pLC40GWH           10739               10593                98.6
Bahia grass             pLCleo             384                 383                  99.7
Total                                      23134               22461                97.1
<pSB200 series cosmid vectors>
Oryza rufipogon         pSB200             10375               7504                 72.3
Arabidopsis thaliana    pSB200PcHmGWH      1332                1179                 88.5
Sudan grass             pSB200PcHmGWH      2096                2031                 96.9
Italian millet          pSB200PcHmGWH      2336                2032                 87.0
Total                                      16139               12746                79.0
<pCLD04541>
Indica rice CO39        pCLD04541          149                 127                  85.2
Arabidopsis thaliana    pCLD04541          192                 190                  99.0
Total                                      341                 317                  93.0
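The efficiency column of Table 10 is b/a expressed as a percentage. The following is a minimal sketch that recomputes the group totals from the per-library counts; the function name `mating_efficiency` and the dictionary layout are illustrative assumptions and are not part of the described method.

```python
# Per-library counts taken from Table 10:
# (clones used for triparental mating, clones giving recombinant Agrobacterium)
TABLE_10 = {
    "pLC40 series": [(5657, 5469), (1532, 1410), (2301, 2201),
                     (2521, 2405), (10739, 10593), (384, 383)],
    "pSB200 series": [(10375, 7504), (1332, 1179), (2096, 2031), (2336, 2032)],
    "pCLD04541": [(149, 127), (192, 190)],
}


def mating_efficiency(counts):
    """Return (total a, total b, b/a in percent rounded to one decimal)."""
    total_a = sum(used for used, _ in counts)
    total_b = sum(recombinant for _, recombinant in counts)
    return total_a, total_b, round(100.0 * total_b / total_a, 1)


if __name__ == "__main__":
    for group, counts in TABLE_10.items():
        print(group, mating_efficiency(counts))
```

Pooling the counts before dividing, rather than averaging the per-library percentages, weights each library by the number of clones tested, which is how the Total rows of the table appear to have been derived.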
[0427] 2) Stability of Genomic DNA
[0428] To analyze whether or not the genomic DNA fragment carried on each clone has been transferred to Agrobacterium, Southern hybridization was performed using the entire genomic DNA fragment as a probe. Plasmid DNAs were conventionally extracted from E. coli and Agrobacterium, and digested with the restriction endonucleases HindIII and SacI. Then, a part of each digest was fractionated by agarose gel electrophoresis, and transferred to the nylon membrane filter HybondN+. Then, a part of the HindIII and SacI digest (precipitated with ethanol and redissolved in TE) of the E. coli-derived plasmid was labeled with an ECL labelling kit (Amersham) and hybridized to this membrane as a probe. Hybridization, washing and signal detection were performed following the instructions attached to the ECL kit. All four plasmids containing a rufipogon fragment cloned into pLC40GWH showed the transfer of the genomic DNA fragment from E. coli to Agrobacterium.
Example 4
Transformation of Large Fragments into Rice with pLC Vectors
[0429] 1) Rice Transformation and its Efficiency
[0430] i) Method for Rice Transformation
[0431] Immature embryos of the rice variety Yukihikari were infected with Agrobacterium. Rice transformation was performed by the method described in the Japanese Patent Application No. 2003-293125, except that all of the aseptically dissected immature embryos were centrifuged as a pretreatment before Agrobacterium inoculation. Specifically, the immature embryos were centrifuged in an eppendorf tube containing 1 ml of sterile water at 20000×g for 10 minutes (25° C.). Hygromycin B was used as a selective drug and added at 50 mg/l each in the selective medium, regeneration medium and rooting medium. In the case of pLC40 series cosmid vectors and pSB200 series cosmid vectors, one immature embryo was inoculated with one Agrobacterium strain (one type of DNA fragment). In the case of pCLD04541, however, two immature embryos were inoculated with one Agrobacterium strain (one type of DNA fragment). Paromomycin was used as a selective drug and added at a concentration of 400-800 mg/l in the selective medium, regeneration medium and rooting medium.
[0432] ii) Transfer of Plant Genomic Fragments into Rice
[0433] The results of transformation are shown in Table 11. In the case of pSB200 series cosmid vectors, hygromycin-resistant individuals were obtained from 59.1%-62.7% of the strains. In contrast, pLC40 series cosmid vectors gave transformants from 86.6%-95.4% of the strains. In all of the three donor plants of genomic DNA (Oryza rufipogon, Sudan grass, and an extremely early maturing variety of Italian millet), the efficiency was 24-36 percentage points higher when pLC40 series cosmid vectors were used. In the case of the pCLD04541 vector, however, the efficiency was as low as 41.0%-53.4%. These results suggested that pLC40 series cosmid vectors allow transfer of genomic DNA fragments into rice more efficiently than pSB200 series cosmid vectors and pCLD04541.
[0434] To evaluate the transformation efficiency of normal size gene expression units, vectors were tested by comparison in the transformation with a DNA fragment containing the GUS gene. When 25 Yukihikari immature embryos were used for each vector, pSB134 (WO2005/017169) gave an average of 11.7 hygromycin-resistant regenerated individuals per immature embryo while pLC40:35S-IGUS gave an average of 11.5 regenerated individuals.
TABLE-US-00011 TABLE 11 Results of the transformation of randomized plant genomic fragments into rice using a pSB or pLC vector
                                           # of genomic fragments    # of genomic fragments
                                           used for Agrobacterium-   that regenerated
                                           mediated                  hygromycin-resistant
Genome donor plant      Vector             transformation (A)        individuals (B)           B/A (%)
Oryza rufipogon         pSB200             2246                      1327                      59.1
Oryza rufipogon         pLC40GWH           2271                      2166                      95.4
Sudan grass             pSB200PcHmGWH      1997                      1252                      62.7
Sudan grass             pLC40GWH           1760                      1524                      86.6
Italian millet          pSB200PcHmGWH      1940                      1200                      61.9
Italian millet          pLC40GWH           2285                      1986                      86.9
Bahia grass             pLCleo             18                        16                        88.9
Indica rice CO39        pCLD04541          156                       64                        41.0
Arabidopsis thaliana    pCLD04541          189                       101                       53.4
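The comparison between pLC40 and pSB200 series vectors can be checked directly from the counts in Table 11. The sketch below computes, for each donor plant tested with both vector series, the difference in B/A in percentage points; the names `TABLE_11`, `percent` and `plc_advantage` are hypothetical helpers introduced only for illustration.

```python
# (fragments used (A), fragments giving resistant individuals (B)) from Table 11,
# restricted to the donor plants tested with both vector series.
TABLE_11 = {
    "Oryza rufipogon": {"pSB200": (2246, 1327), "pLC40GWH": (2271, 2166)},
    "Sudan grass": {"pSB200PcHmGWH": (1997, 1252), "pLC40GWH": (1760, 1524)},
    "Italian millet": {"pSB200PcHmGWH": (1940, 1200), "pLC40GWH": (2285, 1986)},
}


def percent(used, resistant):
    """B/A expressed as a percentage."""
    return 100.0 * resistant / used


def plc_advantage(rows):
    """Percentage-point difference (pLC minus pSB) for one donor plant."""
    plc = next(percent(*v) for k, v in rows.items() if k.startswith("pLC"))
    psb = next(percent(*v) for k, v in rows.items() if k.startswith("pSB"))
    return round(plc - psb, 1)


if __name__ == "__main__":
    for donor, rows in TABLE_11.items():
        print(donor, plc_advantage(rows))
```

The differences are absolute (percentage points), not relative increases, which matches the 24-36 range quoted for the three donor plants.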
[0435] 2) Verification of the Transfer of Large Fragments
[0436] i) PCR of Flanking Regions of Fragments
[0437] Genomic DNAs were extracted from 11 transformants and young leaves of Yukihikari by the method described above. PCR was performed on these DNAs with 2 sets of the primers shown in the table below. pSB200-9531F and pSB200-4R are primers for amplifying a 139 bp region from the RB to the genomic DNA fragment. HPTinRV and HPTinFW are primers for amplifying an internal region of the hygromycin resistance gene (Table 12). Thirty-five cycles of PCR were performed. As a result, the products were amplified with HPTinRV and HPTinFW in all of the 11 transformants, while no PCR product was obtained with either primer set in the control Yukihikari. When pSB200-9531F and pSB200-4R were used, PCR products were obtained in 10 of the 11 individuals. These results show that the flanking regions of the genomic DNA fragments were transferred into most of the plants transformed with pLC vectors, thus verifying the transfer of the genomic DNA fragments.
TABLE-US-00012 TABLE 12
Designation     Sequence                                   Length
pSB200-9531F    5'-ctg aag gcg gga aac gac aat ctg-3'      24 mer
pSB200-4R       5'-gct tgc tga gtg gct cct tca acg-3'      24 mer
pSB200-170R     5'-aac tgc act tca aac aag tgt gac-3'      24 mer
HPTinRV         5'-tat gtc ctg cgg gta aat ag-3'           20 mer
HPTinFW         5'-ttg ttg gag ccg aaa tcc g-3'            19 mer
SEQ ID NOs: 60-64 in order from the top.
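The primer lengths listed in Table 12 can be verified by removing the triplet spacing and counting bases. This is a minimal sketch; `PRIMERS` and `primer_length` are illustrative names, and only the sequences themselves come from the table.

```python
# Primer sequences as printed in Table 12 (spaces separate triplets only).
PRIMERS = {
    "pSB200-9531F": "ctg aag gcg gga aac gac aat ctg",
    "pSB200-4R": "gct tgc tga gtg gct cct tca acg",
    "pSB200-170R": "aac tgc act tca aac aag tgt gac",
    "HPTinRV": "tat gtc ctg cgg gta aat ag",
    "HPTinFW": "ttg ttg gag ccg aaa tcc g",
}


def primer_length(spaced_sequence):
    """Number of bases in a primer written with spaces between triplets."""
    bases = spaced_sequence.replace(" ", "")
    # Guard against stray characters introduced by text extraction.
    if set(bases) - set("acgt"):
        raise ValueError("unexpected character in primer sequence")
    return len(bases)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {primer_length(seq)} mer")
```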
[0438] ii) PCR of Both Terminal and Internal Sequences of Fragments
[0439] For each of three Oryza rufipogon fragments (called A, B, and C) used for the transformation into Yukihikari with the pLC40GWH vector, two T0 plants were analyzed by PCR to determine whether or not both ends and the center region of each fragment had been introduced. PCR conditions included a treatment at 94° C. for 2 minutes, followed by 35 cycles of thermal denaturation at 94° C. for 30 seconds, annealing at 60° C. for 30 seconds and extension at 60° C. for 30 seconds, and finally a treatment at 72° C. for 2 minutes.
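For record keeping, the cycling conditions above can be written as a small data structure. The sketch below encodes the temperatures and durations exactly as stated in the text (including the extension temperature as printed) and sums the programmed hold times; the class and function names are assumptions for illustration, and ramp times between steps are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # programmed hold time


# Values transcribed from the PCR conditions stated in the text.
INITIAL = Step("initial treatment", 94, 120)
CYCLE = [
    Step("thermal denaturation", 94, 30),
    Step("annealing", 60, 30),
    Step("extension", 60, 30),
]
FINAL = Step("final treatment", 72, 120)
N_CYCLES = 35


def total_hold_seconds(initial, cycle, n_cycles, final):
    """Sum of programmed hold times; ramping between steps is ignored."""
    return initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds


if __name__ == "__main__":
    seconds = total_hold_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
    print(f"{seconds} s ({seconds / 60:.1f} min) of programmed hold time")
```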
[0440] To detect the RB side of fragment A, PCR (PCR1) was performed with pSB200-9531F and a primer specific to fragment A (5'-gtt aat ttc ttg tga tcg aag gac-3' (SEQ ID NO: 11)). To detect the center region of fragment A, a PCR assay was performed by the CAPS method (Konieczny and Ausubel 1993 Plant Journal 4: 403-410) using nucleotide sequence polymorphisms found between the sequence of Nipponbare AP004667 corresponding to fragment A (identified by database searches) and the sequence of Oryza rufipogon. Specifically, PCR (PCR2) was performed with two primers (5'-ggg att ctt tat gct ggg ttt agg-3' (SEQ ID NO: 12) and 5'-gca agc aat acc tct gtt atg ctg-3' (SEQ ID NO: 13)), and the product was digested with SspI. To detect the HPT side, PCR (PCR3) was performed with pSB200-170R and a primer specific to fragment A (5'-gtt ttc aga tgg cga cct cag ctt tg-3' (SEQ ID NO: 14)).
[0441] Similar marker assays were performed on fragment B and fragment C. Thus, to detect the RB side of fragment B, PCR was performed with pSB200-9531F and a primer specific to fragment B (5'-cag gtg gct tta ttc ctc ctc tca-3' (SEQ ID NO: 15)). To detect the center region of fragment B, a PCR assay was performed by the CAPS method using nucleotide sequence polymorphisms found between the sequence of Nipponbare AP005967 corresponding to fragment B (identified by database searches) and the sequence of Oryza rufipogon. Specifically, PCR was performed with two primers (5'-ccg aaa gtt cgt ggg caa tgc cta-3' (SEQ ID NO: 16) and 5'-gcc atc ctt agc ata tga gtg gca-3' (SEQ ID NO: 17)), and the product was digested with HaeIII. To detect the HPT side of fragment B, PCR was performed with pSB200-170R and a primer specific to fragment B (5'-ggc tat tta cgt ggc atg tta cgt-3' (SEQ ID NO: 18)). To detect the RB side of fragment C, PCR was performed with pSB200-9531F and a primer specific to fragment C (5'-tcg taa gtc tac ttc cct tta cga-3' (SEQ ID NO: 19)). To detect the center region of fragment C, a PCR assay was performed by the CAPS method using nucleotide sequence polymorphisms found between the sequence of Nipponbare AL713907 corresponding to fragment C (identified by database searches) and the sequence of Oryza rufipogon. Specifically, PCR was performed with two primers (5'-cca aac cac atc ctt ata gtg tgc-3' (SEQ ID NO: 20) and 5'-cct cat tgc atg cgg tca cta c-3' (SEQ ID NO: 21)), and the product was digested with HaeIII. To detect the HPT side of fragment C, PCR was performed with pSB200-170R and a primer specific to fragment C (5'-gca ggg tat taa tcg atc aac acc-3' (SEQ ID NO: 22)).
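The CAPS assays described above distinguish alleles by whether a PCR product is cut by SspI or HaeIII. The sketch below simulates such a digest in silico, assuming the standard recognition sequences AATATT (SspI, blunt cut after the third base) and GGCC (HaeIII, blunt cut after the second base). The toy amplicons in the usage lines are hypothetical and are not sequences from fragments A-C.

```python
# Recognition site and cut offset (bases from the start of the site) for the
# two blunt cutters named in the text.
ENZYMES = {
    "SspI": ("AATATT", 3),
    "HaeIII": ("GGCC", 2),
}


def digest(sequence, enzyme):
    """Return the fragment lengths produced by a complete digest of a linear DNA.

    Both sites are palindromic, so scanning one strand is sufficient.
    """
    site, offset = ENZYMES[enzyme]
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


if __name__ == "__main__":
    # Hypothetical alleles differing by one base inside a HaeIII site.
    print(digest("ACGTGGCCTTAA", "HaeIII"))  # allele carrying the site
    print(digest("ACGTGGACTTAA", "HaeIII"))  # allele lacking the site
```

An allele that retains the site yields two or more fragments, while an allele with a polymorphism in the site remains uncut, which is the band pattern difference a CAPS marker relies on.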
[0442] Analytical results of fragment B are shown in FIG. 17, and analytical results of fragments A-C are summarized in Table 13. Neither of the two transformants tested for fragment A contained the entire large fragment; one contained the center region and one end, and the other contained both ends. However, for each of fragment B and fragment C, one of the two individuals tested was shown to contain the entire Oryza rufipogon fragment, i.e., both ends and the center region. These results verified that plant genomic fragments of 25-40 kb in size can be transferred into plants by pLC vectors.
TABLE-US-00013 TABLE 13
Fragment   T0 plant   RB side   center   HPT side
A          1          -         +        +
A          2          +         -        +
B          1          +         +        -
B          2          +         +        +
C          1          -         -        -
C          2          +         +        +
+: An Oryza rufipogon fragment was detected.
-: An Oryza rufipogon fragment was not detected.
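The criterion used to call a plant as carrying the entire fragment is that all three assays (RB side, center, HPT side) are positive. A minimal sketch of that rule applied to Table 13 follows; the data layout and function names are illustrative assumptions.

```python
# Detection calls from Table 13, keyed by (fragment, T0 plant number).
TABLE_13 = {
    ("A", 1): {"RB": False, "center": True, "HPT": True},
    ("A", 2): {"RB": True, "center": False, "HPT": True},
    ("B", 1): {"RB": True, "center": True, "HPT": False},
    ("B", 2): {"RB": True, "center": True, "HPT": True},
    ("C", 1): {"RB": False, "center": False, "HPT": False},
    ("C", 2): {"RB": True, "center": True, "HPT": True},
}


def contains_entire_fragment(calls):
    """True only if both ends and the center region were detected."""
    return all(calls.values())


def plants_with_entire_fragment(table):
    """Sorted list of (fragment, plant) keys passing all three assays."""
    return sorted(key for key, calls in table.items()
                  if contains_entire_fragment(calls))


if __name__ == "__main__":
    print(plants_with_entire_fragment(TABLE_13))
```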
Example 5
Transformation of Maize with pLC40 Series Cosmid Vectors
[0443] 1) Combination of pLC with pTOK47 or pLC with pVGW
[0444] A vector containing a vir gene is required for maize transformation to increase the transformation efficiency (Ishida et al. 1996 Nat Biotechnol 14:745-50) because the efficiency with ordinary binary vectors is very low except for special methods (Frame et al. (2002) Plant Physiol 129: 13-22). pLC40 series cosmid vectors are ordinary binary vectors so that they should be modified by using a vir gene to improve the transformation efficiency. Thus, the vector pTOK47 capable of coexisting with pLC40 series cosmid vectors (IncP plasmids) in bacteria and expressing a vir gene (Jin et al. 1987 J Bacteriol 169: 4417-4425), and a vector newly constructed by the present invention, pVGW were initially used in combination with pLC. pTOK47 is an IncW plasmid carrying a DNA fragment (KpnI 14.8 kb fragment) containing the virB gene and the virG gene derived from the Agrobacterium strain A281, and capable of coexisting with IncP plasmids. pVGW is a plasmid containing a variant virG (virGN54D) and IncW ori.
[0445] pTOK47 (tetracycline-resistant) was transferred into the Agrobacterium LBA4404 or EHA105 (a kind gift from Dr. Stanton Gelvin of Purdue University) via triparental mating. A plasmid was extracted from this Agrobacterium and confirmed by restriction endonuclease analysis to contain pTOK47. Further, pLC40:35S-IGUS or pLC40GWB:35S-IGUS was introduced into the resulting LBA4404/pTOK47 or EHA105/pTOK47 (Tc-resistant) via triparental mating. These Agrobacteria are described as LBA4404/pTOK47/pLC40:35S-IGUS, LBA4404/pTOK47/pLC40GWB:35S-IGUS, EHA105/pTOK47/pLC40:35S-IGUS, and EHA105/pTOK47/pLC40GWB:35S-IGUS. Plasmid DNAs were extracted from the Agrobacteria and analyzed by PCR to confirm the presence of the VirG, RB, hpt or bar, and GUS genes.
[0446] In the same manner, pVGW was transferred into the Agrobacterium LBA4404 by electroporation, and a colony was selected by gentamycin (Gm 50 μg/mL). pLC40:35S-IGUS or pLC40GWB:35S-IGUS was introduced into the resulting LBA4404/pVGW via triparental mating. These Agrobacteria are described as LBA4404/pVGW/pLC40:35S-IGUS and LBA4404/pVGW/pLC40GWB:35S-IGUS. Agrobacterium colonies (Km- and Gm-resistant) were directly analyzed by PCR to confirm the presence of the VirG, hpt or bar, and GUS genes.
[0447] Moreover, pIG121Hm derived from the IncP plasmid pBI121 (Hiei et al. (1994) Plant J 6: 271-282) was introduced into LBA4404/pTOK47 to prepare Agrobacterium LBA4404/pTOK47/pIG121Hm, which was used as a control in maize transformation experiments.
[0448] 2) Transformation of Maize
[0449] Maize immature embryos having a size of about 1.2 mm (variety: A188) were aseptically removed from a plant grown in a greenhouse, and immersed in a liquid medium for suspending Agrobacterium (LS-inf, Ishida et al. 1996). After thermal treatment at 46° C. for 3 minutes, the immature embryos were washed with the same liquid medium. After centrifugation at 15,000 rpm, 4° C., for 10 minutes, the immature embryos were then immersed in a suspension of each strain at about 1×10^9 cfu/ml in LS-inf medium (containing 100 μM acetosyringone) and then plated on a coculture medium (LS-AS (Ishida et al. 1996 Nat Biotechnol 14:745-50) containing AgNO3, CuSO4). After incubation at 25° C. in darkness for 3 days, a portion of the immature embryos was used for GUS analysis.
[0450] The cocultured immature embryos were plated on a selective medium containing hygromycin or phosphinothricin (Ishida et al. (2003) Plant Biotechnology 20:57-66) and incubated. A callus grown was excised and plated on a regeneration medium containing hygromycin (Hm) or phosphinothricin (PPT) (Ishida et al. 1996 Nat Biotechnol 14:745-50), and incubated under illumination. After two weeks, regenerated plants showing resistance to Hm or PPT were investigated.
[0451] Initially, A188 immature embryos were inoculated with various strains and observed for the transient expression of the GUS gene on day 3 of coculture. Immature embryos inoculated with the control LBA4404/pSB134 showed the expression of the GUS gene over a wide range. However, few immature embryos inoculated with LBA4404/pLC40:35S-IGUS showed the expression except for limited ones showing the expression in very small spots. No increase in expression was found when EHA105 was used as a host. Most of immature embryos inoculated with LBA4404/pTOK47/pLC40:35S-IGUS, LBA4404/pLC40GWHvG1:35S-IGUS, LBA4404/pVGW/pLC40:35S-IGUS and LBA4404/pVGW/pLC40GWB:35S-IGUS showed spots representing the expression of the GUS gene to a lesser extent than with LBA4404/pSB134, thus verifying that the gene transfer efficiency is improved by the coexistence with a plasmid containing the virB gene and virG gene derived from the Agrobacterium strain A281, or the coexistence with a plasmid containing virGN54D, or the addition of the virG gene. On the other hand, there is no difference in the expression of the GUS gene between pLC40GWHvG1:35S-IGUS and pLC40GWHvGC1:35S-IGUS, showing that a single nucleotide substitution for removing an NspV recognition site does not influence the virG activity.
[0452] Then, we tried to create transformed plants by incubating the cocultured immature embryos in a selective medium containing Hm or PPT and a regeneration medium. When EHA105 was used as a host, the pLCSBGWBSW vector gave no PPT-resistant plant. When LBA4404 was used as a host, however, the pLCSBGWBSW vector gave plants showing resistance to PPT at an efficiency comparable to that of the superbinary vector pSB131 (containing the GUS gene and the bar gene in the T-DNA region, Ishida et al. 1996 Nat Biotechnol 14:745-50) using the same strain as a host (Table 14).
[0453] LBA4404/pTOK47/pLC40GWB:35S-IGUS was also shown to give PPT-resistant plants at a high efficiency comparable to that of the superbinary vector pSB131. When the hygromycin resistance gene was used as a selectable marker gene, a pLC40 series cosmid vector (pLC40:35S-IGUS) combined with pTOK47 also gave hygromycin-resistant plants (Table 14). pLC40GWHvG1 containing the virG gene also achieved an efficiency comparable to that of the superbinary vector pSB134 (containing the GUS gene and the hygromycin resistance gene in the T-DNA region, Hiei and Komari 2006 Plant Cell, Tissue and Organ Culture 85: 271-283) (Table 14).
TABLE-US-00014 TABLE 14 Results of transformation of maize
                                                                # of immature embryos          Redifferentiation
                                                    Selective   Inoculated  Redifferentiated   ratio
Experiment  Strain                                  drug        (A)         (B)                (B/A, %)
1           LBA4404 (pLCSBGWBSW)                    PPT         46          10                 21.7
            EHA105 (pLCSBGWBSW)                     PPT         46          0                  0
            LBA4404 (pSB131)                        PPT         45          9                  20.0
2           LBA4404 (pLC40GWB:35S-IGUS)             PPT         56          0                  0
            LBA4404 (pLC40GWB:35S-IGUS/pTOK47)      PPT         57          14                 24.6
            LBA4404 (pSB131)                        PPT         59          19                 32.2
3           LBA4404 (pLC40:35S-IGUS)                Hm          43          0                  0
            LBA4404 (pLC40:35S-IGUS/pTOK47)         Hm          44          2                  4.5
            LBA4404 (pIG121Hm)                      Hm          42          0                  0
            LBA4404 (pIG121Hm/pTOK47)               Hm          42          0                  0
4           LBA4404 (pLC40GWHvG1)                   Hm          59          5                  8.5
            LBA4404 (pSB134)                        Hm          57          5                  8.8
PPT: phosphinothricin, Hm: hygromycin
[0454] In order to examine the influence of pVGW on maize transformation, maize was then transformed with LBA4404/pLC40:35S-IGUS, LBA4404/pVGW/pLC40:35S-IGUS, LBA4404/pVGW/pLC40GWB:35S-IGUS, and LBA4404/pSB134, and the regenerated individuals were analyzed for GUS expression. As a result, the proportion of the number of GUS-expressing individuals in pLC40:35S-IGUS was 0% (0/16), while the proportion of the number of GUS-expressing individuals per inoculated immature embryo in pLC40:35S-IGUS and pLC40GWB:35S-IGUS both combined with pVGW reached 40% (6/15) and 30% (6/20), respectively, which were comparable to 41.2% (7/17) in the superbinary vector pSB134. Thus, the transformation of maize with pLC vectors could be achieved at high efficiency by using pVGW.
[0455] We further tried to transform plant genomic fragments into maize by combining a pLC vector and the pVGW vector. A genomic fragment (30-35 kb) of Sudan grass was randomly cloned into the NspV site of the vector pLC40GWB. The resulting E. coli plasmid was transferred to Agrobacterium harboring pVGW (LBA4404) via triparental mating. In this manner, Agrobacterium harboring both of the plasmids pLC40GWB containing the genomic fragment of Sudan grass and pVGW was prepared, and inoculated into maize immature embryos (variety: A188). Transformed cells were selected to show that 17 of the 27 fragments inoculated gave redifferentiated plants (Table 15). This showed that plant genomic fragments can be efficiently transformed into maize by the combination of pLC and pVGW.
[0456] These results demonstrated that maize transformation can be efficiently achieved by the combination with a plasmid carrying a DNA fragment containing the virB gene and virG gene derived from the Agrobacterium strain A281 such as pTOK47, or the combination with a plasmid containing the virGN54D gene such as pVGW, or the incorporation of the virG gene into a pLC vector such as pLC40GWHvG1.
TABLE-US-00015 TABLE 15 Results of the transformation of randomized plant genomic fragments into maize using a pLC/pVGW vector system
                                            # of genomic fragments    # of genomic fragments
                                            used for Agrobacterium-   that regenerated
Genome                                      mediated                  PPT-resistant
donor plant   Strain                        transformation (A)        individuals (B)           B/A (%)
Sudan grass   LBA4404 (pLC40GWB/pVGW)       27                        17                        63.0
Example 6
Isolation of a Gene of Interest from BAC Clones Using pLC Vectors
[0457] Komori et al. (2004) (Plant J 37: 315-325) found that a cytoplasmic male sterile strain restores fertility when it is transformed with the PPR791 gene isolated from the rice variety IR24, thus demonstrating that PPR791 is the fertility restorer gene Rf-1. The PPR791 gene was identical with the PPR8-1 gene of the rice variety Milyang 23 that had been previously reported as a candidate for Rf-1 by Kazama and Toriyama (2003) (FEBS Lett 544: 99-102). Thus, the BAC clone OSIMBb0046F08 of Milyang 23 from which the PPR8-1 gene had been derived was obtained from Clemson University, and a model experiment was performed for isolating Rf-1 from the BAC.
[0458] Initially, a plasmid was extracted from OSIMBb0046F08 using High Purity Plasmid Midiprep System (Marligen). The plasmid was partially digested with TaqI and a DNA fragment around 30 kb was recovered by sucrose density gradient centrifugation. This DNA fragment was ligated to the BstBI-digested and CIP-treated pLC40GWH vector or the BstBI-digested and CIP-treated pSB200 vector using DNA Ligation Kit <Mighty Mix> (Takara Bio Inc.). The resulting construct was transferred into E. coli by electroporation to give colonies of transformants on an LB plate containing an appropriate antibiotic (50 μg/ml kanamycin or spectinomycin). To determine the presence or absence of the Rf-1 gene in the resulting plasmid, direct PCR (see Examples 1, 3)) was performed by using these colonies as templates along with primers designed for the Rf-1 gene (WSF7T7R1 and IR50226R, Table 16) to select Rf-1 positive clones giving an amplified product of about 2 kb from Rf-1 negative clones showing no amplification of the product. The incidence of positive clones in this PCR screening was 5/39 (12.8%) in the pLC40GWH construct and 6/96 (6.3%) in the pSB200 construct. That is, the cloning efficiency of a gene of interest was about twice as high with the pLC vector as with pSB.
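Two points in the preceding paragraph lend themselves to a quick check: that TaqI-generated inserts can be ligated into a BstBI-cut vector, and that the positive-clone incidence differs about twofold between the vectors. The sketch below assumes the standard published cut positions T^CGA for TaqI and TT^CGAA for BstBI; the function names are hypothetical helpers for illustration.

```python
# Palindromic recognition site and cut position (bases from the 5' end of the
# site on each strand). These specificities are an assumption taken from
# standard enzyme references, not from the text.
ENZYMES = {
    "TaqI": ("TCGA", 1),
    "BstBI": ("TTCGAA", 2),
}


def five_prime_overhang(site, cut):
    """Single-stranded 5' overhang left by a palindromic site.

    Valid when the cut lies in the first half of the site (5' overhang).
    """
    return site[cut:len(site) - cut]


def incidence(positive, screened):
    """Positive clones as a percentage of clones screened."""
    return 100.0 * positive / screened


if __name__ == "__main__":
    taq = five_prime_overhang(*ENZYMES["TaqI"])
    bst = five_prime_overhang(*ENZYMES["BstBI"])
    print("compatible ends:", taq == bst, taq)
    ratio = incidence(5, 39) / incidence(6, 96)
    print(f"pLC40GWH / pSB200 incidence ratio: {ratio:.2f}")
```

Under these assumptions both enzymes leave the same two-base overhang, which is why a partial TaqI digest can be cloned into the BstBI site without further end modification.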
[0459] One positive clone and two negative clones selected from the pLC40GWH construct, and one positive clone and two negative clones selected from the pSB200 construct were transferred from E. coli to Agrobacterium via triparental mating. The cytoplasmic male sterile strain MS Koshihikari was infected with the resulting Agrobacterium by the method described in Komori et al. (2004). The resulting transformed rice was acclimated and then grown in a greenhouse. During the maturing stage, an average ear was collected from each individual and evaluated for the fertility rate. The results showed that transformants from constructs containing no Rf-1 (pLC-7, pLC-11, pSB-1, pSB-7) were sterile, while constructs containing Rf-1 (pLC-8, pSB-37) gave fertile transformants (Table 17).
[0460] These results demonstrated that a gene of interest can be efficiently identified by preparing a library from DNA of BAC containing the gene of interest using a cosmid vector for plant transformation and transferring it into a plant and then selecting a plant showing an expected phenotype.
TABLE-US-00016 TABLE 16
Primer Name   Sequence                                   Length
WSF7T7R1      5'-AGT GTG TGG CAT GGT GCA TTT CCG-3'      24 mer
IR50226R      5'-CTC TAC AGG ATA CAC GGT GTA AGG-3'      24 mer
SEQ ID NOs: 80-81 in order from the top.
TABLE-US-00017 TABLE 17 Fertility restoration by various constructs
            Presence (+) or     # of individuals   # of individuals
Construct   absence (-) of Rf-1 analyzed           fertile
pLC-8       +                   6                  4
pLC-7       -                   9                  0
pLC-11      -                   8                  0
pSB-37      +                   9                  6
pSB-1       -                   9                  0
pSB-7       -                   9                  0
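The association between the presence of Rf-1 and fertility restoration in Table 17 can be summarized by pooling the constructs by genotype. This is a minimal sketch; `TABLE_17` and `fertile_by_rf1` are illustrative names and the pooling is an assumption made for presentation, not an analysis reported in the text.

```python
# (construct, Rf-1 present, individuals analyzed, fertile individuals) from Table 17.
TABLE_17 = [
    ("pLC-8", True, 6, 4),
    ("pLC-7", False, 9, 0),
    ("pLC-11", False, 8, 0),
    ("pSB-37", True, 9, 6),
    ("pSB-1", False, 9, 0),
    ("pSB-7", False, 9, 0),
]


def fertile_by_rf1(rows):
    """Map Rf-1 presence to pooled (individuals analyzed, individuals fertile)."""
    totals = {True: [0, 0], False: [0, 0]}
    for _construct, has_rf1, analyzed, fertile in rows:
        totals[has_rf1][0] += analyzed
        totals[has_rf1][1] += fertile
    return {present: tuple(counts) for present, counts in totals.items()}


if __name__ == "__main__":
    for present, (analyzed, fertile) in fertile_by_rf1(TABLE_17).items():
        label = "Rf-1 (+)" if present else "Rf-1 (-)"
        print(f"{label}: {fertile}/{analyzed} fertile")
```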
[0461] In conclusion, pLC vectors are characterized in that:
1. they allow easy cloning of DNA in the order of 25-40 kb;
2. they are stable in bacteria; and
3. they allow efficient transformation of plants, especially monocotyledons.
[0462] pLC vector series are useful for handling medium-size DNA in the field of functional genomics.
Sequence CWU
1
1
8118507DNAArtificialp6FRG 1tggcgctcgg tcttgccttg ctcgtcggtg atgtacttca
ccagctccgc gaagtcgctc 60ttcttgatgg agcgcatggg gacgtgcttg gcaatcacgc
gcaccccccg gccgttttag 120cggctaaaaa agtcatggct ctgccctcgg gcggaccacg
cccatcatga ccttgccaag 180ctcgtcctgc ttctcttcga tcttcgccag cagggcgagg
atcgtggcat caccgaaccg 240cgccgtgcgc gggtcgtcgg tgagccagag tttcagcagg
ccgcccaggc ggcccaggtc 300gccattgatg cgggccagct cgcggacgtg ctcatagtcc
acgacgcccg tgattttgta 360gccctggccg acggccagca ggtaggccga caggctcatg
ccggccgccg ccgccttttc 420ctcaatcgct cttcgttcgt ctggaaggca gtacaccttg
ataggtgggc tgcccttcct 480ggttggcttg gtttcatcag ccatccgctt gccctcatct
gttacgccgg cggtagccgg 540ccagcctcgc agagcaggat tcccgttgag caccgccagg
tgcgaataag ggacagtgaa 600gaaggaacac ccgctcgcgg gtgggcctac ttcacctatc
ctgcccggct gacgccgttg 660gatacaccaa ggaaagtcta cacgaaccct ttggcaaaat
cctgtatatc gtgcgaaaaa 720ggatggatat accgaaaaaa tcgctataat gaccccgaag
cagggttatg cagcggaaaa 780gcgctgcttc cctgctgttt tgtggaatat cactagattc
gagccacggt agcggcgggc 840gccgtgattg atgatatagc ggcccggctg ctcctggttc
tcgcgcaccg aaatgggtga 900cttcaccccg cgctctttga tcgtggcacc gatttccgcg
atgctctccg gggaaaagcc 960ggggttgtcg gccgtccgcg gctgatgcgg atcttcgtcg
atcaggtcca ggtccagctc 1020gatagggccg gaaccgccct gagacgccgc aggagcgtcc
aggaggctcg acaggtcgcc 1080gatgctatcc aaccccaggc cggacggctg cgccgcgcct
gcggcttcct gagcggccgc 1140agcggtgttt ttcttggtgg tcttggcttg agccgcagtc
attgggaaat ctccatcttc 1200gtgaacacgt aatcagccag ggcgcgaacc tctttcgatg
ccttgcgcgc ggccgttttc 1260ttgatcttcc agaccggcac accggatgcg agggcatcgg
cgatgctgct gcgcaggcca 1320acggtggccg gaatcatcat cttggggtac gcggccagca
gctcggcttg gtggcgcgcg 1380tggcgcggat tccgcgcatc gaccttgctg ggcaccatgc
caaggaattg cagcttggcg 1440ttcttctggc gcacgttcgc aatggtcgtg accatcttct
tgatgccctg gatgctgtac 1500gcctcaagct cgatggggga cagcacatag tcggccgcga
agagggcggc cgccaggccg 1560acgccaaggg tcggggccgt gtcgatcagg cacacgtcga
agccttggtt cgccagggcc 1620ttgatgttcg ccccgaacag ctcgcgggcg tcgtccagcg
acagccgttc ggcgttcgcc 1680agtaccgggt tggactcgat gagggcgagg cgcgcggcct
ggccgtcgcc ggctgcgggt 1740gcggtttcgg tccagccgcc ggcagggaca gcgccgaaca
gcttgcttgc atgcaggccg 1800gtagcaaagt ccttgagcgt gtaggacgca ttgccctggg
ggtccaggtc gatcacggca 1860acccgcaagc cgcgctcgaa aaagtcgaag gcaagatgca
caagggtcga agtcttgccg 1920acgccgcctt tctggttggc cgtgaccaaa gttttcatcg
tttggtttcc tgttttttct 1980tggcgtccgc ttcccacttc cggacgatgt acgcctgatg
ttccggcaga accgccgtta 2040cccgcgcgta cccctcgggc aagttcttgt cctcgaacgc
ggcccacacg cgatgcaccg 2100cttgcgacac tgcgcccctg gtcagtccca gcgacgttgc
gaacgtcgcc tgtggcttcc 2160catcgactaa gacgccccgc gctatctcga tggtctgctg
ccccacttcc agcccctgga 2220tcgcctcctg gaactggctt tcggtaagcc gtttcttcat
ggataacacc cataatttgc 2280tccgcgcctt ggttgaacat agcggtgaca gccgccagca
catgagagaa gtttagctaa 2340acatttctcg cacgtcaaca cctttagccg ctaaaactcg
tccttggcgt aacaaaacaa 2400aagcccggaa accgggcttt cgtctcttgc cgcttatggc
tctgcacccg gctccatcac 2460caacaggtcg cgcacgcgct tcactcggtt gcggatcgac
actgccagcc caacaaagcc 2520ggttgccgcc gccgccagga tcgcgccgat gatgccggcc
acaccggcca tcgcccacca 2580ggtcgccgcc ttccggttcc attcctgctg gtactgcttc
gcaatgctgg acctcggctc 2640accataggct gaccgctcga tggcgtatgc cgcttctccc
cttggcgtaa aacccagcgc 2700cgcaggcggc attgccatgc tgcccgccgc tttcccgacc
acgacgcgcg caccaggctt 2760gcggtccaga ccttcggcca cggcgagctg cgcaaggaca
taatcagccg ccgacttggc 2820tccacgcgcc tcgatcagct cttgcactcg cgcgaaatcc
ttggcctcca cggccgccat 2880gaatcgcgca cgcggcgaag gctccgcagg gccggcgtcg
tgatcgccgc cgagaagatc 2940cttccattgt tcattccacg gacaaaaaca gagaaaggaa
acgacagagg ccaaaaagct 3000cgctttcagc acctgtcgtt tcctttcttt tcagagggta
ttttaaataa aaacattaag 3060ttatgacgaa gaagaacgga aacgccttaa accggaaaat
tttcataaat agcgaaaacc 3120cgcgaggtcg ccgccccgta acctgtcgga tcaccggaaa
ggacccgtaa agtgataatg 3180attatcatct acatatcaca acgtgcgtgg aggccatcaa
accacgtcaa ataatcaatt 3240atgacgcagg tatcgtatta attgatctgc atcaacttaa
cgtaaaaaca acttcagaca 3300atacaaatca gcgacactga atacggggca acctcatgtc
aattcgctag ccagctggcg 3360ctgccatttt tggggtgagg ccgttcgcgg ccgaggggcg
cagcccctgg ggggatggga 3420ggcccgcgtt agcgggccgg gagggttcga gaaggggggg
cacccccctt cggcgtgcgc 3480ggtcacgcgc acagggcgca gccctggtta aaaacaaggt
ttataaatat tggtttaaaa 3540gcaggttaaa agacaggtta gcggtggccg aaaaacgggc
ggaaaccctt gcaaatgctg 3600gattttctgc ctgtggacag cccctcaaat gtcaataggt
gcgcccctca tctgtcagca 3660ctctgcccct caagtgtcaa ggatcgcgcc cctcatctgt
cagtagtcgc gcccctcaag 3720tgtcaatacc gcagggcact tatccccagg cttgtccaca
tcatctgtgg gaaactcgcg 3780taaaatcagg cgttttcgcc gatttgcgag gctggccagc
tccacgtcgc cggccgaaat 3840cgagcctgcc cctcatctgt caacgccgcg ccgggtgagt
cggcccctca agtgtcaacg 3900tccgcccctc atctgtcagt gagggccaag ttttccgcga
ggtatccaca acgccggcgg 3960ccgcggtgtc tcgcacacgg cttcgacggc gtttctggcg
cgtttgcagg gccatagacg 4020gccgccagcc cagcggcgag ggcaaccagc ccggtgagcg
tcggaaaggc gctggaagcc 4080ccgtagcgac gcggagaggg gcgagacaag ccaagggcgc
aggctcgatg cgcagcacga 4140catagccggt tctcgcaagg acgagaattt ccctgcggtg
cccctcaagt gtcaatgaaa 4200gtttccaacg cgagccattc gcgagagcct tgagtccacg
ctatcgaatc gatactatgt 4260tatacgccaa ctttgaaaac aactttgaaa aagctgtttt
ctggtattta aggttttaga 4320atgcaaggaa cagtgaattg gagttcgtct tgttataatt
agcttcttgg ggtatcttta 4380aatactgtag aaaagaggaa ggaaataata aatggctaaa
atgagaatat caccggaatt 4440gaaaaaactg atcgaaaaat accgctgcgt aaaagatacg
gaaggaatgt ctcctgctaa 4500ggtatataag ctggtgggag aaaatgaaaa cctatattta
aaaatgacgg acagccggta 4560taaagggacc acctatgatg tggaacggga aaaggacatg
atgctatggc tggaaggaaa 4620gctgcctgtt ccaaaggtcc tgcactttga acggcatgat
ggctggagca atctgctcat 4680gagtgaggcc gatggcgtcc tttgctcgga agagtatgaa
gatgaacaaa gccctgaaaa 4740gattatcgag ctgtatgcgg agtgcatcag gctctttcac
tccatcgaca tatcggattg 4800tccctatacg aatagcttag acagccgctt agccgaattg
gattacttac tgaataacga 4860tctggccgat gtggattgcg aaaactggga agaagacact
ccatttaaag atccgcgcga 4920gctgtatgat tttttaaaga cggaaaagcc cgaagaggaa
cttgtctttt cccacggcga 4980cctgggagac agcaacatct ttgtgaaaga tggcaaagta
agtggcttta ttgatcttgg 5040gagaagcggc agggcggaca agtggtatga cattgccttc
tgcgtccggt cgatcaggga 5100ggatatcggg gaagaacagt atgtcgagct attttttgac
ttactgggga tcaagcctga 5160ttgggagaaa ataaaatatt atattttact ggatgaattg
ttttagtacc tagatgtggc 5220gcaacgatgc cggcgacaag caggagcgca ccgacttctt
ccgcatcaag tgttttggct 5280ctcaggccga ggcccacggc aagtatttgg gcaaggggtc
gctggtattc gtgcagtcga 5340gcagccgaga acattggttc ctgtaggcat cgggattggc
ggatcaaaca ctaaagctac 5400tggaacgagc agaagtcctc cggccgccag ttgccaggcg
gtaaaggtga gcagaggcac 5460gggaggttgc cacttgcggg tcagcacggt tccgaacgcc
atggaaaccg cccccgccag 5520gcccgctgcg acgccgacag gatctagcgc tgcgtttggt
gtcaacacca acagcgccac 5580gcccgcagtt ccgcaaatag cccccaggac cgccatcaat
cgtatcgggc tacctagcag 5640agcggcagag atgaacacga ccatcagcgg ctgcacagcg
cctaccgtcg ccgcgacccg 5700cccggcaggc ggtagaccga aataaacaac aagctccaga
atagcgaaat attaagtgcg 5760ccgaggatga agatgcgcat ccaccagatt cccgttggaa
tctgtcggac gatcatcacg 5820agcaataaac ccgccggcaa cgcccgcagc agcataccgg
cgacccctcg gcctcgctgt 5880tcgggctcca cgaaaacgcc ggacagatgc gccttgtgag
cgtccttggg gccgtcctcc 5940tgtttgaaga ccgacagccc aatgatctcg ccgtcgatgt
aggcgccgaa tgccacggca 6000tctcgcaacc gttcagcgaa cgcctccatg ggctttttct
cctcgtgctc gtaaacggac 6060ccgaacatct ctggagcttt cttcagggcc gacaatcgga
tctcgcggaa atcctgcacg 6120tcggccgctc caagccgtcg aatctgagcc ttaatcacaa
ttgtcaattt taatcctctg 6180tttatcggca gttcgtagag cgcgccgtgc gtcccgagcg
atactgagcg aagcaagtgc 6240gtcgagcagt gcccgcttgt tcctgaaatg ccagtaaagc
gctggctgct gaacccccag 6300ccggaactga ccccacaagg ccctagcgtt tgcaatgcac
caggtcatca ttgacccagg 6360cgtgttccac caggccgctg cctcgcaact cttcgcaggc
ttcgccgacc tgctcgcgcc 6420acttcttcac gcgggtggaa tccgatccgc acatgaggcg
gaaggtttcc agcttgagcg 6480ggtacggctc ccggtgcgag ctgaaatagt cgaacatccg
tcgggccgtc ggcgacagct 6540tgcggtactt ctcccatatg aatttcgtgt agtggtcgcc
agcaaacagc acgacgattt 6600cctcgtcgat caggacctgg caacgggacg ttttcttgcc
acggtccagg acgcggaagc 6660ggtgcagcag cgacaccgat tccaggtgcc caacgcggtc
ggacgtgaag cccatcgccg 6720tcgcctgtag gcgcgacagg cattcctcgg ccttcgtgta
ataccggcca ttgatcgacc 6780agcccaggtc ctggcaaagc tcgtagaacg tgaaggtgat
cggctcgccg ataggggtgc 6840gcttcgcgta ctccaacacc tgctgccaca ccagttcgtc
atcgtcggcc cgcagctcga 6900cgccggtgta ggtgatcttc acgtccttgt tgacgtggaa
aatgaccttg ttttgcagcg 6960cctcgcgcgg gattttcttg ttgcgcgtgg tgaacagggc
agagcgggcc gtgtcgtttg 7020gcatcgctcg catcgtgtcc ggccacggcg caatatcgaa
caaggaaagc tgcatttcct 7080tgatctgctg cttcgtgtgt ttcagcaacg cggcctgctt
ggcctcgctg acctgttttg 7140ccaggtcctc gccggcggtt tttcgcttct tggtcgtcat
agttcctcgc gtgtcgatgg 7200tcatcgactt cgccaaacct gccgcctcct gttcgagacg
acgcgaacgc tccacggcgg 7260ccgatggcgc gggcagggca gggggagcca gttgcacgct
gtcgcgctcg atcttggccg 7320tagcttgctg gaccatcgag ccgacggact ggaaggtttc
gcggggcgca cgcatgacgg 7380tgcggcttgc gatggtttcg gcatcctcgg cggaaaaccc
cgcgtcgatc agttcttgcc 7440tgtatgcctt ccggtcaaac gtccgattca ttcaccctcc
ttgcgggatt gccccgactc 7500acgccggggc aatgtgccct tattcctgat ttgacccgcc
tggtgccttg gtgtccagat 7560aatccacctt atcggcaatg aagtcggtcc cgtagaccgt
ctggccgtcc ttctcgtact 7620tggtattccg aatcttgccc tgcacgaata ccagcgaccc
cttgcccaaa tacttgccgt 7680gggcctcggc ctgagagcca aaacacttga tgcggaagaa
gtcggtgcgc tcctgcttgt 7740cgccggcatc gttgcgccac tcttcattaa ccgctatatc
gaaaattgct tgcggcttgt 7800tagaattgcc atgacgtacc tcggtgtcac gggtaagatt
accgataaac tggaactgat 7860tatggctcat atcgaaagtc tccttgagaa aggagactct
agtttagcta aacattggtt 7920ccgctgtcaa gaactttagc ggctaaaatt ttgcgggccg
cgaccaaagg tgcgaggggc 7980ggcttccgct gtgtacaacc agatattttt caccaacatc
cttcgtctgc tcgatgagcg 8040gggcatgacg aaacatgagc tgtcggagag ggcaggggtt
tcaatttcgt ttttatcaga 8100cttaaccaac ggtaaggcca acccctcgtt gaaggtgatg
gaggccattg ccgacgccct 8160ggaaactccc ctacctcttc tcctggagtc caccgacctt
gaccgcgagg cactcgcgga 8220gattgcgggt catcctttca agagcagcgt gccgcccgga
tacgaacgca tcagtgtggt 8280tttgccgtca cataaggcgt ttatcgtaaa gaaatggggc
gacgacaccc gaaaaaagct 8340gcgtggaagg ctctgacgcc aagggttagg gcttgcactt
ccttctttag ccgctaaaac 8400ggccccttct ctgcgggccg tcggctcgcg catcatatcg
acatcctcaa cggaagccgt 8460gccgcgaatg gcatcgggcg ggtgcgcttt gacagttgtt
ttggatc 8507
SEQ ID NO: 2; length 13429; DNA; Artificial; pLC40
aagcttgcgg
ccgcttcgaa gatgttaatt aacatcggta ccgagctcta gggataacag 60ggtaatagct
cgaattctag cttgcatgcc tgcagtgcag cgtgacccgg tcgtgcccct 120ctctagagat
aatgagcatt gcatgtctaa gttataaaaa attaccacat attttttttg 180tcacacttgt
ttgaagtgca gtttatctat ctttatacat atatttaaac tttactctac 240gaataatata
atctatagta ctacaataat atcagtgttt tagagaatca tataaatgaa 300cagttagaca
tggtctaaag gacaattgag tattttgaca acaggactct acagttttat 360ctttttagtg
tgcatgtgtt ctcctttttt tttgcaaata gcttcaccta tataatactt 420catccatttt
attagtacat ccatttaggg tttagggtta atggttttta tagactaatt 480tttttagtac
atctatttta ttctatttta gcctctaaat taagaaaact aaaactctat 540tttagttttt
ttatttaata atttagatat aaaatagaat aaaataaagt gactaaaaat 600taaacaaata
ccctttaaga aattaaaaaa actaaggaaa catttttctt gtttcgagta 660gataatgcca
gcctgttaaa cgccgtcgac gagtctaacg gacaccaacc agcgaaccag 720cagcgtcgcg
tcgggccaag cgaagcagac ggcacggcat ctctgtcgct gcctctggac 780ccctctcgag
agttccgctc caccgttgga cttgctccgc tgtcggcatc cagaaattgc 840gtggcggagc
ggcagacgtg agccggcacg gcaggcggcc tcctcctcct ctcacggcac 900cggcagctac
gggggattcc tttcccaccg ctccttcgct ttcccttcct cgcccgccgt 960aataaataga
caccccctcc acaccctctt tccccaacct cgtgttgttc ggagcgcaca 1020cacacacaac
cagatctccc ccaaatccac ccgtcggcac ctccgcttca aggtacgccg 1080ctcgtcctcc
cccccccccc ctctctacct tctctagatc ggcgttccgg tccatggtta 1140gggcccggta
gttctacttc tgttcatgtt tgtgttagat ccgtgtttgt gttagatccg 1200tgctgctagc
gttcgtacac ggatgcgacc tgtacgtcag acacgttctg attgctaact 1260tgccagtgtt
tctctttggg gaatcctggg atggctctag ccgttccgca gacgggatcg 1320atttcatgat
tttttttgtt tcgttgcata gggtttggtt tgcccttttc ctttatttca 1380atatatgccg
tgcacttgtt tgtcgggtca tcttttcatg cttttttttg tcttggttgt 1440gatgatgtgg
tctggttggg cggtcgttct agatcggagt agaattctgt ttcaaactac 1500ctggtggatt
tattaatttt ggatctgtat gtgtgtgcca tacatattca tagttacgaa 1560ttgaagatga
tggatggaaa tatcgatcta ggataggtat acatgttgat gcgggtttta 1620ctgatgcata
tacagagatg ctttttgttc gcttggttgt gatgatgtgg tgtggttggg 1680cggtcgttca
ttcgttctag atcggagtag aatactgttt caaactacct ggtgtattta 1740ttaattttgg
aactgtatgt gtgtgtcata catcttcata gttacgagtt taagatggat 1800ggaaatatcg
atctaggata ggtatacatg ttgatgtggg ttttactgat gcatatacat 1860gatggcatat
gcagcatcta ttcatatgct ctaaccttga gtacctatct attataataa 1920acaagtatgt
tttataatta ttttgatctt gatatacttg gatgatggca tatgcagcag 1980ctatatgtgg
atttttttag ccctgccttc atacgctatt tatttgcttg gtactgtttc 2040ttttgtcgat
gctcaccctg ttgtttggtg ttacttctgc aggtcgactc tagaggatcc 2100cggggggcaa
tgagatatga aaaagcctga actcaccgcg acgtctgtcg agaagtttct 2160gatcgaaaag
ttcgacagcg tctccgacct gatgcagctc tcggagggcg aagaatctcg 2220tgctttcagc
ttcgatgtag gagggcgtgg atatgtcctg cgggtaaata gctgcgccga 2280tggtttctac
aaagatcgtt atgtttatcg gcactttgca tcggccgcgc tcccgattcc 2340ggaagtgctt
gacattgggg aattcagcga gagcctgacc tattgcatct cccgccgtgc 2400acagggtgtc
acgttgcaag acctgcctga aaccgaactg cccgctgttc tgcagccggt 2460cgcggaggcc
atggatgcga tcgctgcggc cgatcttagc cagacgagcg ggttcggccc 2520attcggaccg
caaggaatcg gtcaatacac tacatggcgt gatttcatat gcgcgattgc 2580tgatccccat
gtgtatcact ggcaaactgt gatggacgac accgtcagtg cgtccgtcgc 2640gcaggctctc
gatgagctga tgctttgggc cgaggactgc cccgaagtcc ggcacctcgt 2700gcacgcggat
ttcggctcca acaatgtcct gacggacaat ggccgcataa cagcggtcat 2760tgactggagc
gaggcgatgt tcggggattc ccaatacgag gtcgccaaca tcttcttctg 2820gaggccgtgg
ttggcttgta tggagcagca gacgcgctac ttcgagcgga ggcatccgga 2880gcttgcagga
tcgccgcggc tccgggcgta tatgctccgc attggtcttg accaactcta 2940tcagagcttg
gttgacggca atttcgatga tgcagcttgg gcgcagggtc gatgcgacgc 3000aatcgtccga
tccggagccg ggactgtcgg gcgtacacaa atcgcccgca gaagcgcggc 3060cgtctggacc
gatggctgtg tagaagtact cgccgatagt ggaaaccgac gccccagcac 3120tcgtccggga
tccgtcgacc tgcagatcgt tcaaacattt ggcaataaag tttcttaaga 3180ttgaatcctg
ttgccggtct tgcgatgatt atcatataat ttctgttgaa ttacgttaag 3240catgtaataa
ttaacatgta atgcatgacg ttatttatga gatgggtttt tatgattaga 3300gtcccgcaat
tatacattta atacgcgata gaaaacaaaa tatagcgcgc aaactaggat 3360aaattatcgc
gcgcggtgtc atctatgtta ctagatccga tgataagctg tcaaacatga 3420gaattcagta
cattaaaaac gtccgcaatg tgttattaag ttgtctaagc gtcaatttgt 3480ttacaccaca
atatatcctg ccaccagcca gccaacagct ccccgaccgg cagctcggca 3540caaaatcacc
actcgataca ggcagcccat cagtccggga cggcgtcagc gggagagccg 3600ttgtaaggcg
gcagactttg ctcatgttac cgatgctatt cggaagaacg gcaactaagc 3660tgccgggttt
gaaacacgga tgatctcgcg gagggtagca tgttgattgt aacgatgaca 3720gagcgttgct
gcctgtgatc aaatatcatc tccctcgcag agatccgaat tatcagcctt 3780cttattcatt
tctcgcttaa ccgtgacagg ctgtcgatct tgagaactat gccgacataa 3840taggaaatcg
ctggataaag ccgctgagga agctgagtgg cgctatttct ttagaagtga 3900acgttgacga
tcgtcgaccg taccccgatg aattaattcg gacgtacgtt ctgaacacag 3960ctggatactt
acttgggcga ttgtcataca tgacatcaac aatgtacccg tttgtgtaac 4020cgtctcttgg
aggttcgtat gacactaggt cgctacctta ggaccgttat agttactagc 4080gaattgacat
gaggttgccc cgtattcagt gtcgctgatt tgtattgtct gaagttgttt 4140ttacgttaag
ttgatgcaga tcaattaata cgatacctgc gtcataattg attatttgac 4200gtggtttgat
ggcctccacg cacgttgtga tatgtagatg ataatcatta tcactttacg 4260ggtcctttcc
ggtgatccga caggttacgg ggcggcgacc tcgcgggttt tcgctattta 4320tgaaaatttt
ccggtttaag gcgtttccgt tcttcttcgt cataacttaa tgtttttatt 4380taaaataccc
tctgaaaaga aaggaaacga caggtgctga aagcgagctt tttggcctct 4440gtcgtttcct
ttctctgttt ttgtccgtgg aatgaacaat ggaaggatct tctcggcggc 4500gatcacgacg
ccggccctgc ggagccttcg ccgcgtgcgc gattcatggc ggccgtggag 4560gccaaggatt
tcgcgcgagt gcaagagctg atcgaggcgc gtggagccaa gtcggcggct 4620gattatgtcc
ttgcgcagct cgccgtggcc gaaggtctgg accgcaagcc tggtgcgcgc 4680gtcgtggtcg
ggaaagcggc gggcagcatg gcaatgccgc ctgcggcgct gggttttacg 4740ccaaggggag
aagcggcata cgccatcgag cggtcagcct atggtgagcc gaggtccagc 4800attgcgaagc
agtaccagca ggaatggaac cggaaggcgg cgacctggtg ggcgatggcc 4860ggtgtggccg
gcatcatcgg cgcgatcctg gcggcggcgg caaccggctt tgttgggctg 4920gcagtgtcga
tccgcaaccg agtgaagcgc gtgcgcgacc tgttggtgat ggagccgggt 4980gcagagccat
aagcggcaag agacgaaagc ccggtttccg ggcttttgtt ttgttacgcc 5040aaggacgagt
tttagcggct aaaggtgttg acgtgcgaga aatgtttagc taaacttctc 5100tcatgtgctg
gcggctgtca ccgctatgtt caaccaaggc gcggagcaaa ttatgggtgt 5160tatccatgaa
gaaacggctt accgaaagcc agttccagga ggcgatccag gggctggaag 5220tggggcagca
gaccatcgag atagcgcggg gcgtcttagt cgatgggaag ccacaggcga 5280cgttcgcaac
gtcgctggga ctgaccaggg gcgcagtgtc gcaagcggtg catcgcgtgt 5340gggccgcgtt
cgaggacaag aacttgcccg aggggtacgc gcgggtaacg gcggttctgc 5400cggaacatca
ggcgtacatc gtccggaagt gggaagcgga cgccaagaaa aaacaggaaa 5460ccaaacgatg
aaaactttgg tcacggccaa ccagaaaggc ggcgtcggca agacttcgac 5520ccttgtgcat
cttgccttcg actttttcga gcgcggcttg cgggttgccg tgatcgacct 5580ggacccccag
ggcaatgcgt cctacacgct caaggacttt gctaccggcc tgcatgcaag 5640caagctgttc
ggcgctgtcc ctgccggcgg ctggaccgaa accgcacccg cagccggcga 5700cggccaggcc
gcgcgcctcg ccctcatcga gtccaacccg gtactggcga acgccgaacg 5760gctgtcgctg
gacgacgccc gcgagctgtt cggggcgaac atcaaggccc tggcgaacca 5820aggcttcgac
gtgtgcctga tcgacacggc cccgaccctt ggcgtcggcc tggcggccgc 5880cctcttcgcg
gccgactatg tgctgtcccc catcgagctt gaggcgtaca gcatccaggg 5940catcaagaag
atggtcacga ccattgcgaa cgtgcgccag aagaacgcca agctgcaatt 6000ccttggcatg
gtgcccagca aggtcgatgc gcggaatccg cgccacgcgc gccaccaagc 6060cgagctgctg
gccgcgtacc ccaagatgat gattccggcc accgttggcc tgcgcagcag 6120catcgccgat
gccctcgcat ccggtgtgcc ggtctggaag atcaagaaaa cggccgcgcg 6180caaggcatcg
aaagaggttc gcgccctggc tgattacgtg ttcacgaaga tggagatttc 6240ccaatgactg
cggctcaagc caagaccacc aagaaaaaca ccgctgcggc cgctcaggaa 6300gccgcaggcg
cggcgcagcc gtccggcctg gggttggata gcatcggcga cctgtcgagc 6360ctcctggacg
ctcctgcggc gtctcagggc ggttccggcc ctatcgagct ggacctggac 6420ctgatcgacg
aagatccgca tcagccgcgg acggccgaca accccggctt ttccccggag 6480agcatcgcgg
aaatcggtgc cacgatcaaa gagcgcgggg tgaagtcacc catttcggtg 6540cgcgagaacc
aggagcagcc gggccgctat atcatcaatc acggcgcccg ccgctaccgt 6600ggctcgaatc
tagtgatatt ccacaaaaca gcagggaagc agcgcttttc cgctgcataa 6660ccctgcttcg
gggtcattat agcgattttt tcggtatatc catccttttt cgcacgatat 6720acaggatttt
gccaaagggt tcgtgtagac tttccttggt gtatccaacg gcgtcagccg 6780ggcaggatag
gtgaagtagg cccacccgcg agcgggtgtt ccttcttcac tgtcccttat 6840tcgcacctgg
cggtgctcaa cgggaatcct gctctgcgag gctggccggc taccgccggc 6900gtaacagatg
agggcaagcg gatggctgat gaaaccaagc caaccaggaa gggcagccca 6960cctatcaagg
tgtactgcct tccagacgaa cgaagagcga ttgaggaaaa ggcggcggcg 7020gccggcatga
gcctgtcggc ctacctgctg gccgtcggcc agggctacaa aatcacgggc 7080gtcgtggact
atgagcacgt ccgcgagctg gcccgcatca atggcgacct gggccgcctg 7140ggcggcctgc
tgaaactctg gctcaccgac gacccgcgca cggcgcggtt cggtgatgcc 7200acgatcctcg
ccctgctggc gaagatcgaa gagaagcagg acgagcttgg caaggtcatg 7260atgggcgtgg
tccgcccgag ggcagagcca tgactttttt agccgctaaa acggccgggg 7320ggtgcgcgtg
attgccaagc acgtccccat gcgctccatc aagaagagcg acttcgcgga 7380gctggtgaag
tacatcaccg acgagcaagg caagaccgag cgccagatcc aaaacaactg 7440tcaaagcgca
cccgcccgat gccattcgcg gcacggcttc cgttgaggat gtcgatatga 7500tgcgcgagcc
gacggcccgc agagaagggg ccgttttagc ggctaaagaa ggaagtgcaa 7560gccctaaccc
ttggcgtcag agccttccac gcagcttttt tcgggtgtcg tcgccccatt 7620tctttacgat
aaacgcctta tgtgacggca aaaccacact gatgcgttcg tatccgggcg 7680gcacgctgct
cttgaaagga tgacccgcaa tctccgcgag tgcctcgcgg tcaaggtcgg 7740tggactccag
gagaagaggt aggggagttt ccagggcgtc ggcaatggcc tccatcacct 7800tcaacgaggg
gttggcctta ccgttggtta agtctgataa aaacgaaatt gaaacccctg 7860ccctctccga
cagctcatgt ttcgtcatgc cccgctcatc gagcagacga aggatgttgg 7920tgaaaaatat
ctggttgtac acagcggaag ccgcccctcg cacctttggt cgcggcccgc 7980aaaattttag
ccgctaaagt tcttgacagc ggaaccaatg tttagctaaa ctagagtctc 8040ctttctcaag
gagactttcg atatgagcca taatcagttc cagtttatcg gtaatcttac 8100ccgtgacacc
gaggtacgtc atggcaattc taacaagccg caagcaattt tcgatatagc 8160ggttaatgaa
gagtggcgca acgatgccgg cgacaagcag gagcgcaccg acttcttccg 8220catcaagtgt
tttggctctc aggccgaggc ccacggcaag tatttgggca aggggtcgct 8280ggtattcgtg
cagggcaaga ttcggaatac caagtacgag aaggacggcc agacggtcta 8340cgggaccgac
ttcattgccg ataaggtgga ttatctggac accaaggcac caggcgggtc 8400aaatcaggaa
taagggcaca ttgccccggc gtgagtcggg gcaatcccgc aaggagggtg 8460aatgaatcgg
acgtttgacc ggaaggcata caggcaagaa ctgatcgacg cggggttttc 8520cgccgaggat
gccgaaacca tcgcaagccg caccgtcatg cgtgcgcccc gcgaaacctt 8580ccagtccgtc
ggctcgatgg tccagcaagc tacggccaag atcgagcgcg acagcgtgca 8640actggctccc
cctgccctgc ccgcgccatc ggccgccgtg gagcgttcgc gtcgtctcga 8700acaggaggcg
gcaggtttgg cgaagtcgat gaccatcgac acgcgaggaa ctatgacgac 8760caagaagcga
aaaaccgccg gcgaggacct ggcaaaacag gtcagcgagg ccaagcaggc 8820cgcgttgctg
aaacacacga agcagcagat caaggaaatg cagctttcct tgttcgatat 8880tgcgccgtgg
ccggacacga tgcgagcgat gccaaacgac acggcccgct ctgccctgtt 8940caccacgcgc
aacaagaaaa tcccgcgcga ggcgctgcaa aacaaggtca ttttccacgt 9000caacaaggac
gtgaagatca cctacaccgg cgtcgagctg cgggccgacg atgacgaact 9060ggtgtggcag
caggtgttgg agtacgcgaa gcgcacccct atcggcgagc cgatcacctt 9120cacgttctac
gagctttgcc aggacctggg ctggtcgatc aatggccggt attacacgaa 9180ggccgaggaa
tgcctgtcgc gcctacaggc gacggcgatg ggcttcacgt ccgaccgcgt 9240tgggcacctg
gaatcggtgt cgctgctgca ccgcttccgc gtcctggacc gtggcaagaa 9300aacgtcccgt
tgccaggtcc tgatcgacga ggaaatcgtc gtgctgtttg ctggcgacca 9360ctacacgaaa
ttcatatggg agaagtaccg caagctgtcg ccgacggccc gacggatgtt 9420cgactatttc
agctcgcacc gggagccgta cccgctcaag ctggaaacct tccgcctcat 9480gtgcggatcg
gattccaccc gcgtgaagaa gtggcgcgag caggtcggcg aagcctgcga 9540agagttgcga
ggcagcggcc tggtggaaca cgcctgggtc aatgatgacc tggtgcattg 9600caaacgctag
ggccttgtgg ggtcagttcc ggctgggggt tcagcagcca gcgctttact 9660ggcatttcag
gaacaagcgg gcactgctcg acgcacttgc ttcgctcagt atcgctcggg 9720acgcacggcg
cgctctacga actgccgata aacagaggat taaaattgac aattgtgatt 9780aaggctcaga
ttcgacggct tggagcggcc gacgtgcagg atttccgcga gatccgattg 9840tcggccctga
agaaagctcc agagatgttc gggtccgttt acgagcacga ggagaaaaag 9900cccatggagg
cgttcgctga acggttgcga gatgccgtgg cattcggcgc ctacatcgac 9960ggcgagatca
ttgggctgtc ggtcttcaaa caggaggacg gccccaagga cgctcacaag 10020gcgcatctgt
ccggcgtttt cgtggagccc gaacagcgag gccgaggggt cgccggtatg 10080ctgctgcggg
cgttgccggc gggtttattg ctcgtgatga tcgtccgaca gattccaacg 10140ggaatctggt
ggatgcgcat cttcatcctc ggcgcactta atatttcgct attctggagc 10200ttgttgttta
tttcggtcta ccgcctgccg ggcgggtcgc ggcgacggta ggcgctgtgc 10260agccgctgat
ggtcgtgttc atctctgccg ctctgctagg tagcccgata cgattgatgg 10320cggtcctggg
ggctatttgc ggaactgcgg gcgtggcgct gttggtgttg acaccaaacg 10380cagcgctaga
tcctgtcggc gtcgcagcgg gcctggcggg ggcggtttcc atggcgttcg 10440gaaccgtgct
gacccgcaag tggcaacctc ccgtgcctct gctcaccttt accgcctggc 10500aactggcggc
cggaggactt ctgctcgttc cagtagcttt agtgtttgat ccgccaatcc 10560cgatgcctac
aggaaccaat gttctcggct gctcgactgc acgaatacca gcgacccctt 10620gcccaaatac
ttgccgtggg cctcggcctg agagccaaaa cacttgatgc ggaagaagtc 10680ggtgcgctcc
tgcttgtcgc cggcatcgtt gcgccacatc taggtactaa aacaattcat 10740ccagtaaaat
ataatatttt attttctccc aatcaggctt gatccccagt aagtcaaaaa 10800atagctcgac
atactgttct tccccgatat cctccctgat cgaccggacg cagaaggcaa 10860tgtcatacca
cttgtccgcc ctgccgcttc tcccaagatc aataaagcca cttactttgc 10920catctttcac
aaagatgttg ctgtctccca ggtcgccgtg ggaaaagaca agttcctctt 10980cgggcttttc
cgtctttaaa aaatcataca gctcgcgcgg atctttaaat ggagtgtctt 11040cttcccagtt
ttcgcaatcc acatcggcca gatcgttatt cagtaagtaa tccaattcgg 11100ctaagcggct
gtctaagcta ttcgtatagg gacaatccga tatgtcgatg gagtgaaaga 11160gcctgatgca
ctccgcatac agctcgataa tcttttcagg gctttgttca tcttcatact 11220cttccgagca
aaggacgcca tcggcctcac tcatgagcag attgctccag ccatcatgcc 11280gttcaaagtg
caggaccttt ggaacaggca gctttccttc cagccatagc atcatgtcct 11340tttcccgttc
cacatcatag gtggtccctt tataccggct gtccgtcatt tttaaatata 11400ggttttcatt
ttctcccacc agcttatata ccttagcagg agacattcct tccgtatctt 11460ttacgcagcg
gtatttttcg atcagttttt tcaattccgg tgatattctc attttagcca 11520tttattattt
ccttcctctt ttctacagta tttaaagata ccccaagaag ctaattataa 11580caagacgaac
tccaattcac tgttccttgc attctaaaac cttaaatacc agaaaacagc 11640tttttcaaag
ttgttttcaa agttggcgta taacatagta tcgattcgat agcgtggact 11700caaggctctc
gcgaatggct cgcgttggaa actttcattg acacttgagg ggcaccgcag 11760ggaaattctc
gtccttgcga gaaccggcta tgtcgtgctg cgcatcgagc ctgcgccctt 11820ggcttgtctc
gcccctctcc gcgtcgctac ggggcttcca gcgcctttcc gacgctcacc 11880gggctggttg
ccctcgccgc tgggctggcg gccgtctatg gccctgcaaa cgcgccagaa 11940acgccgtcga
agccgtgtgc gagacaccgc ggccgccggc gttgtggata cctcgcggaa 12000aacttggccc
tcactgacag atgaggggcg gacgttgaca cttgaggggc cgactcaccc 12060ggcgcggcgt
tgacagatga ggggcaggct cgatttcggc cggcgacgtg gagctggcca 12120gcctcgcaaa
tcggcgaaaa cgcctgattt tacgcgagtt tcccacagat gatgtggaca 12180agcctgggga
taagtgccct gcggtattga cacttgaggg gcgcgactac tgacagatga 12240ggggcgcgat
ccttgacact tgaggggcag agtgctgaca gatgaggggc gcacctattg 12300acatttgagg
ggctgtccac aggcagaaaa tccagcattt gcaagggttt ccgcccgttt 12360ttcggccacc
gctaacctgt cttttaacct gcttttaaac caatatttat aaaccttgtt 12420tttaaccagg
gctgcgccct gtgcgcgtga ccgcgcacgc cgaagggggg tgccccccct 12480tctcgaaccc
tcccggcccg ctaacgcggg cctcccatcc ccccaggggc tgcgcccctc 12540ggccgcgaac
ggcctcaccc caaaaatggc agcgccagat tattgaagca tttatcaggg 12600ttattgtctc
atgagcggat acatatttga atgtatttag aaaaataaac aaataggggt 12660tccgcgcaca
tttccccgaa aagtgccacc tgacgtctaa gaaaccatta ttatcatgac 12720attaacctat
aaaaataggc gtatcacgag gccctttcgt cttcaagaat tggtcgacga 12780tcttgctgcg
ttcggatatt ttcgtggagt tcccgccaca gacccggatt gaaggcgaga 12840tccagcaact
cgcgccagat catcctgtga cggaactttg gcgcgtgatg actggccagg 12900acgtcggccg
aaagagcgac aagcagatca cgcttttcga cagcgtcgga tttgcgatcg 12960aggatttttc
ggcgctgcgc tacgtccgcg accgcgttga gggatcaagc cacagcagcc 13020cactcgacct
tctagccgac ccagacgagc caagggatct ttttggaatg ctgctccgtc 13080gtcaggcttt
ccgacgtttg ggtggttgaa cagaagtcat tatcgcacgg aatgccaagc 13140actcccgagg
ggaaccctgt ggttggcatg cacatacaaa tggacgaacg gataaacctt 13200ttcacgccct
tttaaatatc cgattattct aataaacgct cttttctctt aggtttaccc 13260gccaatatat
cctgtcaaac actgatagtt taaactgaag gcgggaaacg acaatctgat 13320catgagcgga
gaattaaggg agtcacgtta tgacccccgc cgatgacgcg ggacaagccg 13380ttttacgttt
ggaactgaca gaaccgcaac gttgaaggag ccactcagc 13429
SEQ ID NO: 3; length 13174; DNA; Artificial; pLC40GWH
aagcttgcgg ccgcttcgaa gatgttaatt
aacatcggta ccgagctcta gggataacag 60ggtaatagct cgaattctag cttgcatgcc
tgcagtgcag cgtgacccgg tcgtgcccct 120ctctagagat aatgagcatt gcatgtctaa
gttataaaaa attaccacat attttttttg 180tcacacttgt ttgaagtgca gtttatctat
ctttatacat atatttaaac tttactctac 240gaataatata atctatagta ctacaataat
atcagtgttt tagagaatca tataaatgaa 300cagttagaca tggtctaaag gacaattgag
tattttgaca acaggactct acagttttat 360ctttttagtg tgcatgtgtt ctcctttttt
tttgcaaata gcttcaccta tataatactt 420catccatttt attagtacat ccatttaggg
tttagggtta atggttttta tagactaatt 480tttttagtac atctatttta ttctatttta
gcctctaaat taagaaaact aaaactctat 540tttagttttt ttatttaata atttagatat
aaaatagaat aaaataaagt gactaaaaat 600taaacaaata ccctttaaga aattaaaaaa
actaaggaaa catttttctt gtttcgagta 660gataatgcca gcctgttaaa cgccgtcgac
gagtctaacg gacaccaacc agcgaaccag 720cagcgtcgcg tcgggccaag cgaagcagac
ggcacggcat ctctgtcgct gcctctggac 780ccctctcgag agttccgctc caccgttgga
cttgctccgc tgtcggcatc cagaaattgc 840gtggcggagc ggcagacgtg agccggcacg
gcaggcggcc tcctcctcct ctcacggcac 900cggcagctac gggggattcc tttcccaccg
ctccttcgct ttcccttcct cgcccgccgt 960aataaataga caccccctcc acaccctctt
tccccaacct cgtgttgttc ggagcgcaca 1020cacacacaac cagatctccc ccaaatccac
ccgtcggcac ctccgcttca aggtacgccg 1080ctcgtcctcc cccccccccc ctctctacct
tctctagatc ggcgttccgg tccatggtta 1140gggcccggta gttctacttc tgttcatgtt
tgtgttagat ccgtgtttgt gttagatccg 1200tgctgctagc gttcgtacac ggatgcgacc
tgtacgtcag acacgttctg attgctaact 1260tgccagtgtt tctctttggg gaatcctggg
atggctctag ccgttccgca gacgggatcg 1320atttcatgat tttttttgtt tcgttgcata
gggtttggtt tgcccttttc ctttatttca 1380atatatgccg tgcacttgtt tgtcgggtca
tcttttcatg cttttttttg tcttggttgt 1440gatgatgtgg tctggttggg cggtcgttct
agatcggagt agaattctgt ttcaaactac 1500ctggtggatt tattaatttt ggatctgtat
gtgtgtgcca tacatattca tagttacgaa 1560ttgaagatga tggatggaaa tatcgatcta
ggataggtat acatgttgat gcgggtttta 1620ctgatgcata tacagagatg ctttttgttc
gcttggttgt gatgatgtgg tgtggttggg 1680cggtcgttca ttcgttctag atcggagtag
aatactgttt caaactacct ggtgtattta 1740ttaattttgg aactgtatgt gtgtgtcata
catcttcata gttacgagtt taagatggat 1800ggaaatatcg atctaggata ggtatacatg
ttgatgtggg ttttactgat gcatatacat 1860gatggcatat gcagcatcta ttcatatgct
ctaaccttga gtacctatct attataataa 1920acaagtatgt tttataatta ttttgatctt
gatatacttg gatgatggca tatgcagcag 1980ctatatgtgg atttttttag ccctgccttc
atacgctatt tatttgcttg gtactgtttc 2040ttttgtcgat gctcaccctg ttgtttggtg
ttacttctgc aggtcgactc tagaggatca 2100tcacaagttt gtacaaaaaa gcaggctcaa
tgagatatga aaaagcctga actcaccgcg 2160acgtctgtcg agaagtttct gatcgaaaag
ttcgacagcg tctccgacct gatgcagctc 2220tcggagggcg aagaatctcg tgctttcagc
ttcgatgtag gagggcgtgg atatgtcctg 2280cgggtaaata gctgcgccga tggtttctac
aaagatcgtt atgtttatcg gcactttgca 2340tcggccgcgc tcccgattcc ggaagtgctt
gacattgggg aattcagcga gagcctgacc 2400tattgcatct cccgccgtgc acagggtgtc
acgttgcaag acctgcctga aaccgaactg 2460cccgctgttc tgcagccggt cgcggaggcc
atggatgcga tcgctgcggc cgatcttagc 2520cagacgagcg ggttcggccc attcggaccg
caaggaatcg gtcaatacac tacatggcgt 2580gatttcatat gcgcgattgc tgatccccat
gtgtatcact ggcaaactgt gatggacgac 2640accgtcagtg cgtccgtcgc gcaggctctc
gatgagctga tgctttgggc cgaggactgc 2700cccgaagtcc ggcacctcgt gcacgcggat
ttcggctcca acaatgtcct gacggacaat 2760ggccgcataa cagcggtcat tgactggagc
gaggcgatgt tcggggattc ccaatacgag 2820gtcgccaaca tcttcttctg gaggccgtgg
ttggcttgta tggagcagca gacgcgctac 2880ttcgagcgga ggcatccgga gcttgcagga
tcgccgcggc tccgggcgta tatgctccgc 2940attggtcttg accaactcta tcagagcttg
gttgacggca atttcgatga tgcagcttgg 3000gcgcagggtc gatgcgacgc aatcgtccga
tccggagccg ggactgtcgg gcgtacacaa 3060atcgcccgca gaagcgcggc cgtctggacc
gatggctgtg tagaagtact cgccgatagt 3120ggaaaccgac gccccagcac tcgtccgagg
gcaaaggaat agacccagct ttcttgtaca 3180aagtggtgat gatccgtcga cctgcagatc
gttcaaacat ttggcaataa agtttcttaa 3240gattgaatcc tgttgccggt cttgcgatga
ttatcatata atttctgttg aattacgtta 3300agcatgtaat aattaacatg taatgcatga
cgttatttat gagatgggtt tttatgatta 3360gagtcccgca attatacatt taatacgcga
tagaaaacaa aatatagcgc gcaaactagg 3420ataaattatc gcgcgcggtg tcatctatgt
tactagatcc gatgataagc tgtcaaacat 3480gagaattcag tacattaaaa acgtccgcaa
tgtgttatta agttgtctaa gcgtcaattt 3540gtttacacca caatatatcc tgccaccagc
cagccaacag ctccccgacc ggcagctcgg 3600cacaaaatca ccactcgata caggcagccc
atcagtccgg gacggcgtca gcgggagagc 3660cgttgtaagg cggcagactt tgctcatgtt
accgatgcta ttcggaagaa cggcaactaa 3720gctgccgggt ttgaaacacg gatgatctcg
cggagggtag catgttgatt gtaacgatga 3780cagagcgttg ctgcctgtga tcaaatatca
tctccctcgc agagatccga attatcagcc 3840ttcttattca tttctcgctt aaccgtgaca
ggctgtcgat cttgagaact atgccgacat 3900aataggaaat cgctggataa agccgctgag
gaagctgagt ggcgctattt ctttagaagt 3960gaacgttgac gatcgtcgac cgtaccccga
tgaattaatt cggacgtacg ttctgaacac 4020agctggatac ttacttgggc gattgtcata
catgacatca acaatgtacc cgtttgtgta 4080accgtctctt ggaggttcgt atgacactag
gtcgctacct taggaccgtt atagttacta 4140gcgaattgac atgaggttgc cccgtattca
gtgtcgctga tttgtattgt ctgaagttgt 4200ttttacgtta agttgatgca gatcaattaa
tacgatacct gcgtcataat tgattatttg 4260acgtggtttg atggcctcca cgcacgttgt
gatatgtaga tgataatcat tatcacttta 4320cgggtccttt ccggtgatcc gacaggttac
ggggcggcga cctcgcgggt tttcgctatt 4380tatgaaaatt ttccggttta aggcgtttcc
gttcttcttc gtcataactt aatgttttta 4440tttaaaatac cctctgaaaa gaaaggaaac
gacaggtgct gaaagcgagc tttttggcct 4500ctgtcgtttc ctttctctgt ttttgtccgt
ggaatgaaca atggaaggat cttctcggcg 4560gcgatcacga cgccggccct gcggagcctt
cgccgcgtgc gcgattcatg gcggccgtgg 4620aggccaagga tttcgcgcga gtgcaagagc
tgatcgaggc gcgtggagcc aagtcggcgg 4680ctgattatgt ccttgcgcag ctcgccgtgg
ccgaaggtct ggaccgcaag cctggtgcgc 4740gcgtcgtggt cgggaaagcg gcgggcagca
tggcaatgcc gcctgcggcg ctgggtttta 4800cgccaagggg agaagcggca tacgccatcg
agcggtcagc ctatggtgag ccgaggtcca 4860gcattgcgaa gcagtaccag caggaatgga
accggaaggc ggcgacctgg tgggcgatgg 4920ccggtgtggc cggcatcatc ggcgcgatcc
tggcggcggc ggcaaccggc tttgttgggc 4980tggcagtgtc gatccgcaac cgagtgaagc
gcgtgcgcga cctgttggtg atggagccgg 5040gtgcagagcc ataagcggca agagacgaaa
gcccggtttc cgggcttttg ttttgttacg 5100ccaaggacga gttttagcgg ctaaaggtgt
tgacgtgcga gaaatgttta gctaaacttc 5160tctcatgtgc tggcggctgt caccgctatg
ttcaaccaag gcgcggagca aattatgggt 5220gttatccatg aagaaacggc ttaccgaaag
ccagttccag gaggcgatcc aggggctgga 5280agtggggcag cagaccatcg agatagcgcg
gggcgtctta gtcgatggga agccacaggc 5340gacgttcgca acgtcgctgg gactgaccag
gggcgcagtg tcgcaagcgg tgcatcgcgt 5400gtgggccgcg ttcgaggaca agaacttgcc
cgaggggtac gcgcgggtaa cggcggttct 5460gccggaacat caggcgtaca tcgtccggaa
gtgggaagcg gacgccaaga aaaaacagga 5520aaccaaacga tgaaaacttt ggtcacggcc
aaccagaaag gcggcgtcgg caagacttcg 5580acccttgtgc atcttgcctt cgactttttc
gagcgcggct tgcgggttgc cgtgatcgac 5640ctggaccccc agggcaatgc gtcctacacg
ctcaaggact ttgctaccgg cctgcatgca 5700agcaagctgt tcggcgctgt ccctgccggc
ggctggaccg aaaccgcacc cgcagccggc 5760gacggccagg ccgcgcgcct cgccctcatc
gagtccaacc cggtactggc gaacgccgaa 5820cggctgtcgc tggacgacgc ccgcgagctg
ttcggggcga acatcaaggc cctggcgaac 5880caaggcttcg acgtgtgcct gatcgacacg
gccccgaccc ttggcgtcgg cctggcggcc 5940gccctcttcg cggccgacta tgtgctgtcc
cccatcgagc ttgaggcgta cagcatccag 6000ggcatcaaga agatggtcac gaccattgcg
aacgtgcgcc agaagaacgc caagctgcaa 6060ttccttggca tggtgcccag caaggtcgat
gcgcggaatc cgcgccacgc gcgccaccaa 6120gccgagctgc tggccgcgta ccccaagatg
atgattccgg ccaccgttgg cctgcgcagc 6180agcatcgccg atgccctcgc atccggtgtg
ccggtctgga agatcaagaa aacggccgcg 6240cgcaaggcat cgaaagaggt tcgcgccctg
gctgattacg tgttcacgaa gatggagatt 6300tcccaatgac tgcggctcaa gccaagacca
ccaagaaaaa caccgctgcg gccgctcagg 6360aagccgcagg cgcggcgcag ccgtccggcc
tggggttgga tagcatcggc gacctgtcga 6420gcctcctgga cgctcctgcg gcgtctcagg
gcggttccgg ccctatcgag ctggacctgg 6480acctgatcga cgaagatccg catcagccgc
ggacggccga caaccccggc ttttccccgg 6540agagcatcgc ggaaatcggt gccacgatca
aagagcgcgg ggtgaagtca cccatttcgg 6600tgcgcgagaa ccaggagcag ccgggccgct
atatcatcaa tcacggcgcc cgccgctacc 6660gtggctcgaa tctagtgata ttccacaaaa
cagcagggaa gcagcgcttt tccgctgcat 6720aaccctgctt cggggtcatt atagcgattt
tttcggtata tccatccttt ttcgcacgat 6780atacaggatt ttgccaaagg gttcgtgtag
actttccttg gtgtatccaa cggcgtcagc 6840cgggcaggat aggtgaagta ggcccacccg
cgagcgggtg ttccttcttc actgtccctt 6900attcgcacct ggcggtgctc aacgggaatc
ctgctctgcg aggctggccg gctaccgccg 6960gcgtaacaga tgagggcaag cggatggctg
atgaaaccaa gccaaccagg aagggcagcc 7020cacctatcaa ggtgtactgc cttccagacg
aacgaagagc gattgaggaa aaggcggcgg 7080cggccggcat gagcctgtcg gcctacctgc
tggccgtcgg ccagggctac aaaatcacgg 7140gcgtcgtgga ctatgagcac gtccgcgagc
tggcccgcat caatggcgac ctgggccgcc 7200tgggcggcct gctgaaactc tggctcaccg
acgacccgcg cacggcgcgg ttcggtgatg 7260ccacgatcct cgccctgctg gcgaagatcg
aagagaagca ggacgagctt ggcaaggtca 7320tgatgggcgt ggtccgcccg agggcagagc
catgactttt ttagccgcta aaacggccgg 7380ggggtgcgcg tgattgccaa gcacgtcccc
atgcgctcca tcaagaagag cgacttcgcg 7440gagctggtga agtacatcac cgacgagcaa
ggcaagaccg agcgccagat ccaaaacaac 7500tgtcaaagcg cacccgcccg atgccattcg
cggcacggct tccgttgagg atgtcgatat 7560gatgcgcgag ccgacggccc gcagagaagg
ggccgtttta gcggctaaag aaggaagtgc 7620aagccctaac ccttggcgtc agagccttcc
acgcagcttt tttcgggtgt cgtcgcccca 7680tttctttacg ataaacgcct tatgtgacgg
caaaaccaca ctgatgcgtt cgtatccggg 7740cggcacgctg ctcttgaaag gatgacccgc
aatctccgcg agtgcctcgc ggtcaaggtc 7800ggtggactcc aggagaagag gtaggggagt
ttccagggcg tcggcaatgg cctccatcac 7860cttcaacgag gggttggcct taccgttggt
taagtctgat aaaaacgaaa ttgaaacccc 7920tgccctctcc gacagctcat gtttcgtcat
gccccgctca tcgagcagac gaaggatgtt 7980ggtgaaaaat atctggttgt acacagcgga
agccgcccct cgcacctttg gtcgcggccc 8040gcaaaatttt agccgctaaa gttcttgaca
gcggaaccaa tgtttagcta aactagagtc 8100tcctttctca aggagacttt cgatatgagc
cataatcagt tccagtttat cggtaatctt 8160acccgtgaca ccgaggtacg tcatggcaat
tctaacaagc cgcaagcaat tttcgatata 8220gcggttaatg aagagtggcg caacgatgcc
ggcgacaagc aggagcgcac cgacttcttc 8280cgcatcaagt gttttggctc tcaggccgag
gcccacggca agtatttggg caaggggtcg 8340ctggtattcg tgcagggcaa gattcggaat
accaagtacg agaaggacgg ccagacggtc 8400tacgggaccg acttcattgc cgataaggtg
gattatctgg acaccaaggc accaggcggg 8460tcaaatcagg aataagggca cattgccccg
gcgtgagtcg gggcaatccc gcaaggaggg 8520tgaatgaatc ggacgtttga ccggaaggca
tacaggcaag aactgatcga cgcggggttt 8580tccgccgagg atgccgaaac catcgcaagc
cgcaccgtca tgcgtgcgcc ccgcgaaacc 8640ttccagtccg tcggctcgat ggtccagcaa
gctacggcca agatcgagcg cgacagcgtg 8700caactggctc cccctgccct gcccgcgcca
tcggccgccg tggagcgttc gcgtcgtctc 8760gaacaggagg cggcaggttt ggcgaagtcg
atgaccatcg acacgcgagg aactatgacg 8820accaagaagc gaaaaaccgc cggcgaggac
ctggcaaaac aggtcagcga ggccaagcag 8880gccgcgttgc tgaaacacac gaagcagcag
atcaaggaaa tgcagctttc cttgttcgat 8940attgcgccgt ggccggacac gatgcgagcg
atgccaaacg acacggcccg ctctgccctg 9000ttcaccacgc gcaacaagaa aatcccgcgc
gaggcgctgc aaaacaaggt cattttccac 9060gtcaacaagg acgtgaagat cacctacacc
ggcgtcgagc tgcgggccga cgatgacgaa 9120ctggtgtggc agcaggtgtt ggagtacgcg
aagcgcaccc ctatcggcga gccgatcacc 9180ttcacgttct acgagctttg ccaggacctg
ggctggtcga tcaatggccg gtattacacg 9240aaggccgagg aatgcctgtc gcgcctacag
gcgacggcga tgggcttcac gtccgaccgc 9300gttgggcacc tggaatcggt gtcgctgctg
caccgcttcc gcgtcctgga ccgtggcaag 9360aaaacgtccc gttgccaggt cctgatcgac
gaggaaatcg tcgtgctgtt tgctggcgac 9420cactacacga aattcatatg ggagaagtac
cgcaagctgt cgccgacggc ccgacggatg 9480ttcgactatt tcagctcgca ccgggagccg
tacccgctca agctggaaac cttccgcctc 9540atgtgcggat cggattccac ccgcgtgaag
aagtggcgcg agcaggtcgg cgaagcctgc 9600gaagagttgc gaggcagcgg cctggtggaa
cacgcctggg tcaatgatga cctggtgcat 9660tgcaaacgct agggccttgt ggggtcagtt
ccggctgggg gttcagcagc cagcgcttta 9720ctggcatttc aggaacaagc gggcactgct
cgacgcactt gcttcgctca gtatcgctcg 9780ggacgcacgg cgcgctctac gaactgccga
taaacagagg attaaaattg acaattgtga 9840ttaaggctca gattcgacgg cttggagcgg
ccgacgtgca ggatttccgc gagatccgat 9900tgtcggccct gaagaaagct ccagagatgt
tcgggtccgt ttacgagcac gaggagaaaa 9960agcccatgga ggcgttcgct gaacggttgc
gagatgccgt ggcattcggc gcctacatcg 10020acggcgagat cattgggctg tcggtcttca
aacaggagga cggccccaag gacgctcaca 10080aggcgcatct gtccggcgtt ttcgtggagc
ccgaacagcg aggccgaggg gtcgccggta 10140tgctgctgcg ggcgttgccg gcgggtttat
tgctcgtgat gatcgtccga cagattccaa 10200cgggaatctg gtggatgcgc atcttcatcc
tcggcgcact taatatttcg ctattctgga 10260gcttgttgtt tatttcggtc taccgcctgc
cgggcgggtc gcggcgacgg taggcgctgt 10320gcagccgctg atggtcgtgt tcatctctgc
cgctctgcta ggtagcccga tacgattgat 10380ggcggtcctg ggggctattt gcggaactgc
gggcgtggcg ctgttggtgt tgacaccaaa 10440cgcagcgcta gatcctgtcg gcgtcgcagc
gggcctggcg ggggcggttt ccatggcgtt 10500cggaaccgtg ctgacccgca agtggcaacc
tcccgtgcct ctgctcacct ttaccgcctg 10560gcaactggcg gccggaggac ttctgctcgt
tccagtagct ttagtgtttg atccgccaat 10620cccgatgcct acaggaacca atgttctcgg
ctgctcgact gcacgaatac cagcgacccc 10680ttgcccaaat acttgccgtg ggcctcggcc
tgagagccaa aacacttgat gcggaagaag 10740tcggtgcgct cctgcttgtc gccggcatcg
ttgcgccaca tctaggtact aaaacaattc 10800atccagtaaa atataatatt ttattttctc
ccaatcaggc ttgatcccca gtaagtcaaa 10860aaatagctcg acatactgtt cttccccgat
atcctccctg atcgaccgga cgcagaaggc 10920aatgtcatac cacttgtccg ccctgccgct
tctcccaaga tcaataaagc cacttacttt 10980gccatctttc acaaagatgt tgctgtctcc
caggtcgccg tgggaaaaga caagttcctc 11040ttcgggcttt tccgtcttta aaaaatcata
cagctcgcgc ggatctttaa atggagtgtc 11100ttcttcccag ttttcgcaat ccacatcggc
cagatcgtta ttcagtaagt aatccaattc 11160ggctaagcgg ctgtctaagc tattcgtata
gggacaatcc gatatgtcga tggagtgaaa 11220gagcctgatg cactccgcat acagctcgat
aatcttttca gggctttgtt catcttcata 11280ctcttccgag caaaggacgc catcggcctc
actcatgagc agattgctcc agccatcatg 11340ccgttcaaag tgcaggacct ttggaacagg
cagctttcct tccagccata gcatcatgtc 11400cttttcccgt tccacatcat aggtggtccc
tttataccgg ctgtccgtca tttttaaata 11460taggttttca ttttctccca ccagcttata
taccttagca ggagacattc cttccgtatc 11520ttttacgcag cggtattttt cgatcagttt
tttcaattcc ggtgatattc tcattttagc 11580catttattat ttccttcctc ttttctacag
tatttaaaga taccccaaga agctaattat 11640aacaagacga actccaattc actgttcctt
gcattctaaa accttaaata ccagaaaaca 11700gctttttcaa agttgttttc aaagttggcg
tataacatag tatcgattcg atagcgtgga 11760ctcaaggctc tcgcgaatgg ctcgcgttgg
aaactttcat tgacacttga ggggcaccgc 11820agggaaattc tcgtccttgc gagaaccggc
tatgtcgtgc tgcgcatcga gcctgcgccc 11880ttggcttgtc tcgcccctct ccgcgtcgct
acggggcttc cagcgccttt ccgacgctca 11940ccgggctggt tgccctcgcc gctgggctgg
cggccgtcta tggccctgca aacgcgccag 12000aaacgccgtc gaagccgtgt gcgagacacc
gcggccgccg gcgttgtgga tacctcgcgg 12060aaaacttggc cctcactgac agatgagggg
cggacgttga cacttgaggg gccgactcac 12120ccggcgcggc gttgacagat gaggggcagg
ctcgatttcg gccggcgacg tggagctggc 12180cagcctcgca aatcggcgaa aacgcctgat
tttacgcgag tttcccacag atgatgtgga 12240caagcctggg gataagtgcc ctgcggtatt
gacacttgag gggcgcgact actgacagat 12300gaggggcgcg atccttgaca cttgaggggc
agagtgctga cagatgaggg gcgcacctat 12360tgacatttga ggggctgtcc acaggcagaa
aatccagcat ttgcaagggt ttccgcccgt 12420ttttcggcca ccgctaacct gtcttttaac
ctgcttttaa accaatattt ataaaccttg 12480tttttaacca gggctgcgcc ctgtgcgcgt
gaccgcgcac gccgaagggg ggtgcccccc 12540cttctcgaac cctcccggcc cgctaacgcg
ggcctcccat ccccccaggg gctgcgcccc 12600tcggccgcga acggcctcac cccaaaaatg
gcagcgccag ccaggacgtc ggccgaaaga 12660gcgacaagca gatcacgctt ttcgacagcg
tcggatttgc gatcgaggat ttttcggcgc 12720tgcgctacgt ccgcgaccgc gttgagggat
caagccacag cagcccactc gaccttctag 12780ccgacccaga cgagccaagg gatctttttg
gaatgctgct ccgtcgtcag gctttccgac 12840gtttgggtgg ttgaacagaa gtcattatcg
cacggaatgc caagcactcc cgaggggaac 12900cctgtggttg gcatgcacat acaaatggac
gaacggataa accttttcac gcccttttaa 12960atatccgatt attctaataa acgctctttt
ctcttaggtt tacccgccaa tatatcctgt 13020caaacactga tagtttaaac tgaaggcggg
aaacgacaat ctgatcatga gcggagaatt 13080aagggagtca cgttatgacc cccgccgatg
acgcgggaca agccgtttta cgtttggaac 13140tgacagaacc gcaacgttga aggagccact
cagc 13174
SEQ ID NO: 4; Length: 12884; Type: DNA; Organism: Artificial; Other information: pLC40bar
aagctttcga atagggataa cagggtaata gcttgctaga ggatctgcga tctagtaaca
60tagatgacac cgcgcgcgat aatttatcct agtttgcgcg ctatattttg ttttctatcg
120cgtattaaat gtataattgc gggactctaa tcataaaaac ccatctcata aataacgtca
180tgcattacat gttaattatt acatgcttaa cgtaattcaa cagaaattat atgataatca
240tcgcaagacc ggcaacagga ttcaatctta agaaacttta ttgccaaatg tttgaacgat
300ctgcttcgga tcctagacgc gtgagatcag atctcggtga cgggcaggac cggacggggc
360ggtaccggca ggctgaagtc cagctgccag aaacccacgt catgccagtt cccgtgcttg
420aagccggccg cccgcagcat gccgcggggg gcatatccga gcgcctcgtg catgcgcacg
480ctcgggtcgt tgggcagccc gatgacagcg accacgctct tgaagccctg tgcctccagg
540gacttcagca ggtgggtgta gagcgtggag cccagtcccg tccgctggtg gcggggggag
600acgtacacgg tcgactcggc cgtccagtcg taggcgttgc gtgccttcca ggggcccgcg
660taggcgatgc cggcgacctc gccgtccacc tcggcgacga gccagggata gcgctcccgc
720agacggacga ggtcgtccgt ccactcctgc ggttcctgcg gctcggtacg gaagttgacc
780gtgcttgtct cgatgtagtg gttgacgatg gtgcagaccg ccggcatgtc cgcctcggtg
840gcacggcgga tgtcggccgg gcgtcgttct gggtccatgg cgacctgcag aagtaacacc
900aaacaacagg gtgagcatcg acaaaagaaa cagtaccaag caaataaata gcgtatgaag
960gcagggctaa aaaaatccac atatagctgc tgcatatgcc atcatccaag tatatcaaga
1020tcaaaataat tataaaacat acttgtttat tataatagat aggtactcaa ggttagagca
1080tatgaataga tgctgcatat gccatcatgt atatgcatca gtaaaaccca catcaacatg
1140tatacctatc ctagatcgat atttccatcc atcttaaact cgtaactatg aagatgtatg
1200acacacacat acagttccaa aattaataaa tacaccaggt agtttgaaac agtattctac
1260tccgatctag aacgaatgaa cgaccgccca accacaccac atcatcacaa ccaagcgaac
1320aaaaagcatc tctgtatatg catcagtaaa acccgcatca acatgtatac ctatcctaga
1380tcgatatttc catccatcat cttcaattcg taactatgaa tatgtatggc acacacatac
1440agatccaaaa ttaataaatc caccaggtag tttgaaacag aattaattct actccgatct
1500agaacgaccg cccaaccaga ccacatcatc acaaccaaga caaaaaaaag catgaaaaga
1560tgacccgaca aacaagtgca cggcatatat tgaaataaag gaaaagggca aaccaaaccc
1620tatgcaacga aacaaaaaaa atcatgaaat cgatcccgtc tgcggaacgg ctagagccat
1680cccaggattc cccaaagaga aacactggca agttagcaat cagaacgtgt ctgacgtaca
1740ggtcgcatcc gtgtacgaac gctagcagca cggatctaac acaaacacgg atctaacaca
1800aacatgaaca gaagtagaac taccgggccc taaccatgga ccggaacgcc gatctagaga
1860aggtagagag gggggggggg ggaggacgag cggcgtacct tgaagcggag gtgccgacgg
1920gtggatttgg gggagatctg gttgtgtgtg tgtgcgctcc gaacaacacg aggttgggga
1980aagagggtgt ggagggggtg tctatttatt acggcgggcg aggaagggaa agcgaaggag
2040cggtgggaaa ggaatccccc gtagctgccg gtgccgtgag aggaggagga ggccgcctgc
2100cgtgccggct cacgtctgcc gctccgccac gcaatttctg gatgccgaca gcggagcaag
2160tccaacggtg gagcggaact ctcgagaggg gtccagaggc agcgacagag atgccgtgcc
2220gtctgcttcg cttggcccga cgcgacgctg ctggttcgct ggttggtgtc cgttagactc
2280gtcgacggcg tttaacaggc tggcattatc tactcgaaac aagaaaaatg tttccttagt
2340ttttttaatt tcttaaaggg tatttgttta atttttagtc actttatttt attctatttt
2400atatctaaat tattaaataa aaaaactaaa atagagtttt agttttctta atttagaggc
2460taaaatagaa taaaatagat gtactaaaaa aattagtcta taaaaaccat taaccctaaa
2520ccctaaatgg atgtactaat aaaatggatg aagtattata taggtgaagc tatttgcaaa
2580aaaaaaggag aacacatgca cactaaaaag ataaaactgt agagtcctgt tgtcaaaata
2640ctcaattgtc ctttagacca tgtctaactg ttcatttata tgattctcta aaacactgat
2700attattgtag tactatagat tatattattc gtagagtaaa gtttaaatat atgtataaag
2760atagataaac tgcacttcaa acaagtgtga caaaaaaaat atgtggtaat tttttataac
2820ttagacatgc aatgctcatt atctctagag aggggcacga ccgggtcacg ctgcacaatt
2880cagtacatta aaaacgtccg caatgtgtta ttaagttgtc taagcgtcaa tttgtttaca
2940ccacaatata tcctgccacc agccagccaa cagctccccg accggcagct cggcacaaaa
3000tcaccactcg atacaggcag cccatcagtc cgggacggcg tcagcgggag agccgttgta
3060aggcggcaga ctttgctcat gttaccgatg ctattcggaa gaacggcaac taagctgccg
3120ggtttgaaac acggatgatc tcgcggaggg tagcatgttg attgtaacga tgacagagcg
3180ttgctgcctg tgatcaaata tcatctccct cgcagagatc cgaattatca gccttcttat
3240tcatttctcg cttaaccgtg acaggctgtc gatcttgaga actatgccga cataatagga
3300aatcgctgga taaagccgct gaggaagctg agtggcgcta tttctttaga agtgaacgtt
3360gacgatcgtc gaccgtaccc cgatgaatta attcggacgt acgttctgaa cacagctgga
3420tacttacttg ggcgattgtc atacatgaca tcaacaatgt acccgtttgt gtaaccgtct
3480cttggaggtt cgtatgacac taggtcgcta ccttaggacc gttatagtta ctagcgaatt
3540gacatgaggt tgccccgtat tcagtgtcgc tgatttgtat tgtctgaagt tgtttttacg
3600ttaagttgat gcagatcaat taatacgata cctgcgtcat aattgattat ttgacgtggt
3660ttgatggcct ccacgcacgt tgtgatatgt agatgataat cattatcact ttacgggtcc
3720tttccggtga tccgacaggt tacggggcgg cgacctcgcg ggttttcgct atttatgaaa
3780attttccggt ttaaggcgtt tccgttcttc ttcgtcataa cttaatgttt ttatttaaaa
3840taccctctga aaagaaagga aacgacaggt gctgaaagcg agctttttgg cctctgtcgt
3900ttcctttctc tgtttttgtc cgtggaatga acaatggaag gatcttctcg gcggcgatca
3960cgacgccggc cctgcggagc cttcgccgcg tgcgcgattc atggcggccg tggaggccaa
4020ggatttcgcg cgagtgcaag agctgatcga ggcgcgtgga gccaagtcgg cggctgatta
4080tgtccttgcg cagctcgccg tggccgaagg tctggaccgc aagcctggtg cgcgcgtcgt
4140ggtcgggaaa gcggcgggca gcatggcaat gccgcctgcg gcgctgggtt ttacgccaag
4200gggagaagcg gcatacgcca tcgagcggtc agcctatggt gagccgaggt ccagcattgc
4260gaagcagtac cagcaggaat ggaaccggaa ggcggcgacc tggtgggcga tggccggtgt
4320ggccggcatc atcggcgcga tcctggcggc ggcggcaacc ggctttgttg ggctggcagt
4380gtcgatccgc aaccgagtga agcgcgtgcg cgacctgttg gtgatggagc cgggtgcaga
4440gccataagcg gcaagagacg aaagcccggt ttccgggctt ttgttttgtt acgccaagga
4500cgagttttag cggctaaagg tgttgacgtg cgagaaatgt ttagctaaac ttctctcatg
4560tgctggcggc tgtcaccgct atgttcaacc aaggcgcgga gcaaattatg ggtgttatcc
4620atgaagaaac ggcttaccga aagccagttc caggaggcga tccaggggct ggaagtgggg
4680cagcagacca tcgagatagc gcggggcgtc ttagtcgatg ggaagccaca ggcgacgttc
4740gcaacgtcgc tgggactgac caggggcgca gtgtcgcaag cggtgcatcg cgtgtgggcc
4800gcgttcgagg acaagaactt gcccgagggg tacgcgcggg taacggcggt tctgccggaa
4860catcaggcgt acatcgtccg gaagtgggaa gcggacgcca agaaaaaaca ggaaaccaaa
4920cgatgaaaac tttggtcacg gccaaccaga aaggcggcgt cggcaagact tcgacccttg
4980tgcatcttgc cttcgacttt ttcgagcgcg gcttgcgggt tgccgtgatc gacctggacc
5040cccagggcaa tgcgtcctac acgctcaagg actttgctac cggcctgcat gcaagcaagc
5100tgttcggcgc tgtccctgcc ggcggctgga ccgaaaccgc acccgcagcc ggcgacggcc
5160aggccgcgcg cctcgccctc atcgagtcca acccggtact ggcgaacgcc gaacggctgt
5220cgctggacga cgcccgcgag ctgttcgggg cgaacatcaa ggccctggcg aaccaaggct
5280tcgacgtgtg cctgatcgac acggccccga cccttggcgt cggcctggcg gccgccctct
5340tcgcggccga ctatgtgctg tcccccatcg agcttgaggc gtacagcatc cagggcatca
5400agaagatggt cacgaccatt gcgaacgtgc gccagaagaa cgccaagctg caattccttg
5460gcatggtgcc cagcaaggtc gatgcgcgga atccgcgcca cgcgcgccac caagccgagc
5520tgctggccgc gtaccccaag atgatgattc cggccaccgt tggcctgcgc agcagcatcg
5580ccgatgccct cgcatccggt gtgccggtct ggaagatcaa gaaaacggcc gcgcgcaagg
5640catcgaaaga ggttcgcgcc ctggctgatt acgtgttcac gaagatggag atttcccaat
5700gactgcggct caagccaaga ccaccaagaa aaacaccgct gcggccgctc aggaagccgc
5760aggcgcggcg cagccgtccg gcctggggtt ggatagcatc ggcgacctgt cgagcctcct
5820ggacgctcct gcggcgtctc agggcggttc cggccctatc gagctggacc tggacctgat
5880cgacgaagat ccgcatcagc cgcggacggc cgacaacccc ggcttttccc cggagagcat
5940cgcggaaatc ggtgccacga tcaaagagcg cggggtgaag tcacccattt cggtgcgcga
6000gaaccaggag cagccgggcc gctatatcat caatcacggc gcccgccgct accgtggctc
6060gaatctagtg atattccaca aaacagcagg gaagcagcgc ttttccgctg cataaccctg
6120cttcggggtc attatagcga ttttttcggt atatccatcc tttttcgcac gatatacagg
6180attttgccaa agggttcgtg tagactttcc ttggtgtatc caacggcgtc agccgggcag
6240gataggtgaa gtaggcccac ccgcgagcgg gtgttccttc ttcactgtcc cttattcgca
6300cctggcggtg ctcaacggga atcctgctct gcgaggctgg ccggctaccg ccggcgtaac
6360agatgagggc aagcggatgg ctgatgaaac caagccaacc aggaagggca gcccacctat
6420caaggtgtac tgccttccag acgaacgaag agcgattgag gaaaaggcgg cggcggccgg
6480catgagcctg tcggcctacc tgctggccgt cggccagggc tacaaaatca cgggcgtcgt
6540ggactatgag cacgtccgcg agctggcccg catcaatggc gacctgggcc gcctgggcgg
6600cctgctgaaa ctctggctca ccgacgaccc gcgcacggcg cggttcggtg atgccacgat
6660cctcgccctg ctggcgaaga tcgaagagaa gcaggacgag cttggcaagg tcatgatggg
6720cgtggtccgc ccgagggcag agccatgact tttttagccg ctaaaacggc cggggggtgc
6780gcgtgattgc caagcacgtc cccatgcgct ccatcaagaa gagcgacttc gcggagctgg
6840tgaagtacat caccgacgag caaggcaaga ccgagcgcca gatccaaaac aactgtcaaa
6900gcgcacccgc ccgatgccat tcgcggcacg gcttccgttg aggatgtcga tatgatgcgc
6960gagccgacgg cccgcagaga aggggccgtt ttagcggcta aagaaggaag tgcaagccct
7020aacccttggc gtcagagcct tccacgcagc ttttttcggg tgtcgtcgcc ccatttcttt
7080acgataaacg ccttatgtga cggcaaaacc acactgatgc gttcgtatcc gggcggcacg
7140ctgctcttga aaggatgacc cgcaatctcc gcgagtgcct cgcggtcaag gtcggtggac
7200tccaggagaa gaggtagggg agtttccagg gcgtcggcaa tggcctccat caccttcaac
7260gaggggttgg ccttaccgtt ggttaagtct gataaaaacg aaattgaaac ccctgccctc
7320tccgacagct catgtttcgt catgccccgc tcatcgagca gacgaaggat gttggtgaaa
7380aatatctggt tgtacacagc ggaagccgcc cctcgcacct ttggtcgcgg cccgcaaaat
7440tttagccgct aaagttcttg acagcggaac caatgtttag ctaaactaga gtctcctttc
7500tcaaggagac tttcgatatg agccataatc agttccagtt tatcggtaat cttacccgtg
7560acaccgaggt acgtcatggc aattctaaca agccgcaagc aattttcgat atagcggtta
7620atgaagagtg gcgcaacgat gccggcgaca agcaggagcg caccgacttc ttccgcatca
7680agtgttttgg ctctcaggcc gaggcccacg gcaagtattt gggcaagggg tcgctggtat
7740tcgtgcaggg caagattcgg aataccaagt acgagaagga cggccagacg gtctacggga
7800ccgacttcat tgccgataag gtggattatc tggacaccaa ggcaccaggc gggtcaaatc
7860aggaataagg gcacattgcc ccggcgtgag tcggggcaat cccgcaagga gggtgaatga
7920atcggacgtt tgaccggaag gcatacaggc aagaactgat cgacgcgggg ttttccgccg
7980aggatgccga aaccatcgca agccgcaccg tcatgcgtgc gccccgcgaa accttccagt
8040ccgtcggctc gatggtccag caagctacgg ccaagatcga gcgcgacagc gtgcaactgg
8100ctccccctgc cctgcccgcg ccatcggccg ccgtggagcg ttcgcgtcgt ctcgaacagg
8160aggcggcagg tttggcgaag tcgatgacca tcgacacgcg aggaactatg acgaccaaga
8220agcgaaaaac cgccggcgag gacctggcaa aacaggtcag cgaggccaag caggccgcgt
8280tgctgaaaca cacgaagcag cagatcaagg aaatgcagct ttccttgttc gatattgcgc
8340cgtggccgga cacgatgcga gcgatgccaa acgacacggc ccgctctgcc ctgttcacca
8400cgcgcaacaa gaaaatcccg cgcgaggcgc tgcaaaacaa ggtcattttc cacgtcaaca
8460aggacgtgaa gatcacctac accggcgtcg agctgcgggc cgacgatgac gaactggtgt
8520ggcagcaggt gttggagtac gcgaagcgca cccctatcgg cgagccgatc accttcacgt
8580tctacgagct ttgccaggac ctgggctggt cgatcaatgg ccggtattac acgaaggccg
8640aggaatgcct gtcgcgccta caggcgacgg cgatgggctt cacgtccgac cgcgttgggc
8700acctggaatc ggtgtcgctg ctgcaccgct tccgcgtcct ggaccgtggc aagaaaacgt
8760cccgttgcca ggtcctgatc gacgaggaaa tcgtcgtgct gtttgctggc gaccactaca
8820cgaaattcat atgggagaag taccgcaagc tgtcgccgac ggcccgacgg atgttcgact
8880atttcagctc gcaccgggag ccgtacccgc tcaagctgga aaccttccgc ctcatgtgcg
8940gatcggattc cacccgcgtg aagaagtggc gcgagcaggt cggcgaagcc tgcgaagagt
9000tgcgaggcag cggcctggtg gaacacgcct gggtcaatga tgacctggtg cattgcaaac
9060gctagggcct tgtggggtca gttccggctg ggggttcagc agccagcgct ttactggcat
9120ttcaggaaca agcgggcact gctcgacgca cttgcttcgc tcagtatcgc tcgggacgca
9180cggcgcgctc tacgaactgc cgataaacag aggattaaaa ttgacaattg tgattaaggc
9240tcagattcga cggcttggag cggccgacgt gcaggatttc cgcgagatcc gattgtcggc
9300cctgaagaaa gctccagaga tgttcgggtc cgtttacgag cacgaggaga aaaagcccat
9360ggaggcgttc gctgaacggt tgcgagatgc cgtggcattc ggcgcctaca tcgacggcga
9420gatcattggg ctgtcggtct tcaaacagga ggacggcccc aaggacgctc acaaggcgca
9480tctgtccggc gttttcgtgg agcccgaaca gcgaggccga ggggtcgccg gtatgctgct
9540gcgggcgttg ccggcgggtt tattgctcgt gatgatcgtc cgacagattc caacgggaat
9600ctggtggatg cgcatcttca tcctcggcgc acttaatatt tcgctattct ggagcttgtt
9660gtttatttcg gtctaccgcc tgccgggcgg gtcgcggcga cggtaggcgc tgtgcagccg
9720ctgatggtcg tgttcatctc tgccgctctg ctaggtagcc cgatacgatt gatggcggtc
9780ctgggggcta tttgcggaac tgcgggcgtg gcgctgttgg tgttgacacc aaacgcagcg
9840ctagatcctg tcggcgtcgc agcgggcctg gcgggggcgg tttccatggc gttcggaacc
9900gtgctgaccc gcaagtggca acctcccgtg cctctgctca cctttaccgc ctggcaactg
9960gcggccggag gacttctgct cgttccagta gctttagtgt ttgatccgcc aatcccgatg
10020cctacaggaa ccaatgttct cggctgctcg actgcacgaa taccagcgac cccttgccca
10080aatacttgcc gtgggcctcg gcctgagagc caaaacactt gatgcggaag aagtcggtgc
10140gctcctgctt gtcgccggca tcgttgcgcc acatctaggt actaaaacaa ttcatccagt
10200aaaatataat attttatttt ctcccaatca ggcttgatcc ccagtaagtc aaaaaatagc
10260tcgacatact gttcttcccc gatatcctcc ctgatcgacc ggacgcagaa ggcaatgtca
10320taccacttgt ccgccctgcc gcttctccca agatcaataa agccacttac tttgccatct
10380ttcacaaaga tgttgctgtc tcccaggtcg ccgtgggaaa agacaagttc ctcttcgggc
10440ttttccgtct ttaaaaaatc atacagctcg cgcggatctt taaatggagt gtcttcttcc
10500cagttttcgc aatccacatc ggccagatcg ttattcagta agtaatccaa ttcggctaag
10560cggctgtcta agctattcgt atagggacaa tccgatatgt cgatggagtg aaagagcctg
10620atgcactccg catacagctc gataatcttt tcagggcttt gttcatcttc atactcttcc
10680gagcaaagga cgccatcggc ctcactcatg agcagattgc tccagccatc atgccgttca
10740aagtgcagga cctttggaac aggcagcttt ccttccagcc atagcatcat gtccttttcc
10800cgttccacat cataggtggt ccctttatac cggctgtccg tcatttttaa atataggttt
10860tcattttctc ccaccagctt atatacctta gcaggagaca ttccttccgt atcttttacg
10920cagcggtatt tttcgatcag ttttttcaat tccggtgata ttctcatttt agccatttat
10980tatttccttc ctcttttcta cagtatttaa agatacccca agaagctaat tataacaaga
11040cgaactccaa ttcactgttc cttgcattct aaaaccttaa ataccagaaa acagcttttt
11100caaagttgtt ttcaaagttg gcgtataaca tagtatcgat tcgatagcgt ggactcaagg
11160ctctcgcgaa tggctcgcgt tggaaacttt cattgacact tgaggggcac cgcagggaaa
11220ttctcgtcct tgcgagaacc ggctatgtcg tgctgcgcat cgagcctgcg cccttggctt
11280gtctcgcccc tctccgcgtc gctacggggc ttccagcgcc tttccgacgc tcaccgggct
11340ggttgccctc gccgctgggc tggcggccgt ctatggccct gcaaacgcgc cagaaacgcc
11400gtcgaagccg tgtgcgagac accgcggccg ccggcgttgt ggatacctcg cggaaaactt
11460ggccctcact gacagatgag gggcggacgt tgacacttga ggggccgact cacccggcgc
11520ggcgttgaca gatgaggggc aggctcgatt tcggccggcg acgtggagct ggccagcctc
11580gcaaatcggc gaaaacgcct gattttacgc gagtttccca cagatgatgt ggacaagcct
11640ggggataagt gccctgcggt attgacactt gaggggcgcg actactgaca gatgaggggc
11700gcgatccttg acacttgagg ggcagagtgc tgacagatga ggggcgcacc tattgacatt
11760tgaggggctg tccacaggca gaaaatccag catttgcaag ggtttccgcc cgtttttcgg
11820ccaccgctaa cctgtctttt aacctgcttt taaaccaata tttataaacc ttgtttttaa
11880ccagggctgc gccctgtgcg cgtgaccgcg cacgccgaag gggggtgccc ccccttctcg
11940aaccctcccg gcccgctaac gcgggcctcc catcccccca ggggctgcgc ccctcggccg
12000cgaacggcct caccccaaaa atggcagcgc cagattattg aagcatttat cagggttatt
12060gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata ggggttccgc
12120gcacatttcc ccgaaaagtg ccacctgacg tctaagaaac cattattatc atgacattaa
12180cctataaaaa taggcgtatc acgaggccct ttcgtcttca agaattggtc gacgatcttg
12240ctgcgttcgg atattttcgt ggagttcccg ccacagaccc ggattgaagg cgagatccag
12300caactcgcgc cagatcatcc tgtgacggaa ctttggcgcg tgatgactgg ccaggacgtc
12360ggccgaaaga gcgacaagca gatcacgctt ttcgacagcg tcggatttgc gatcgaggat
12420ttttcggcgc tgcgctacgt ccgcgaccgc gttgagggat caagccacag cagcccactc
12480gaccttctag ccgacccaga cgagccaagg gatctttttg gaatgctgct ccgtcgtcag
12540gctttccgac gtttgggtgg ttgaacagaa gtcattatcg cacggaatgc caagcactcc
12600cgaggggaac cctgtggttg gcatgcacat acaaatggac gaacggataa accttttcac
12660gcccttttaa atatccgatt attctaataa acgctctttt ctcttaggtt tacccgccaa
12720tatatcctgt caaacactga tagtttaaac tgaaggcggg aaacgacaat ctgatcatga
12780gcggagaatt aagggagtca cgttatgacc cccgccgatg acgcgggaca agccgtttta
12840cgtttggaac tgacagaacc gcaacgttga aggagccact cagc
12884
SEQ ID NO: 5; Length: 13026; Type: DNA; Organism: Artificial; Other information: pLC40GWB
aagcttgcgg ccgcttcgaa gatgttaatt
aacatcggta ccgagctcta gggataacag 60ggtaatagct cgaattctag cttgcatgcc
tgcagtgcag cgtgacccgg tcgtgcccct 120ctctagagat aatgagcatt gcatgtctaa
gttataaaaa attaccacat attttttttg 180tcacacttgt ttgaagtgca gtttatctat
ctttatacat atatttaaac tttactctac 240gaataatata atctatagta ctacaataat
atcagtgttt tagagaatca tataaatgaa 300cagttagaca tggtctaaag gacaattgag
tattttgaca acaggactct acagttttat 360ctttttagtg tgcatgtgtt ctcctttttt
tttgcaaata gcttcaccta tataatactt 420catccatttt attagtacat ccatttaggg
tttagggtta atggttttta tagactaatt 480tttttagtac atctatttta ttctatttta
gcctctaaat taagaaaact aaaactctat 540tttagttttt ttatttaata atttagatat
aaaatagaat aaaataaagt gactaaaaat 600taaacaaata ccctttaaga aattaaaaaa
actaaggaaa catttttctt gtttcgagta 660gataatgcca gcctgttaaa cgccgtcgac
gagtctaacg gacaccaacc agcgaaccag 720cagcgtcgcg tcgggccaag cgaagcagac
ggcacggcat ctctgtcgct gcctctggac 780ccctctcgag agttccgctc caccgttgga
cttgctccgc tgtcggcatc cagaaattgc 840gtggcggagc ggcagacgtg agccggcacg
gcaggcggcc tcctcctcct ctcacggcac 900cggcagctac gggggattcc tttcccaccg
ctccttcgct ttcccttcct cgcccgccgt 960aataaataga caccccctcc acaccctctt
tccccaacct cgtgttgttc ggagcgcaca 1020cacacacaac cagatctccc ccaaatccac
ccgtcggcac ctccgcttca aggtacgccg 1080ctcgtcctcc cccccccccc ctctctacct
tctctagatc ggcgttccgg tccatggtta 1140gggcccggta gttctacttc tgttcatgtt
tgtgttagat ccgtgtttgt gttagatccg 1200tgctgctagc gttcgtacac ggatgcgacc
tgtacgtcag acacgttctg attgctaact 1260tgccagtgtt tctctttggg gaatcctggg
atggctctag ccgttccgca gacgggatcg 1320atttcatgat tttttttgtt tcgttgcata
gggtttggtt tgcccttttc ctttatttca 1380atatatgccg tgcacttgtt tgtcgggtca
tcttttcatg cttttttttg tcttggttgt 1440gatgatgtgg tctggttggg cggtcgttct
agatcggagt agaattctgt ttcaaactac 1500ctggtggatt tattaatttt ggatctgtat
gtgtgtgcca tacatattca tagttacgaa 1560ttgaagatga tggatggaaa tatcgatcta
ggataggtat acatgttgat gcgggtttta 1620ctgatgcata tacagagatg ctttttgttc
gcttggttgt gatgatgtgg tgtggttggg 1680cggtcgttca ttcgttctag atcggagtag
aatactgttt caaactacct ggtgtattta 1740ttaattttgg aactgtatgt gtgtgtcata
catcttcata gttacgagtt taagatggat 1800ggaaatatcg atctaggata ggtatacatg
ttgatgtggg ttttactgat gcatatacat 1860gatggcatat gcagcatcta ttcatatgct
ctaaccttga gtacctatct attataataa 1920acaagtatgt tttataatta ttttgatctt
gatatacttg gatgatggca tatgcagcag 1980ctatatgtgg atttttttag ccctgccttc
atacgctatt tatttgcttg gtactgtttc 2040ttttgtcgat gctcaccctg ttgtttggtg
ttacttctgc aggtcgactc tagaggatca 2100tcacaagttt gtacaaaaaa gcaggctcca
tggacccaga acgacgcccg gccgacatcc 2160gccgtgccac cgaggcggac atgccggcgg
tctgcaccat cgtcaaccac tacatcgaga 2220caagcacggt caacttccgt accgagccgc
aggaaccgca ggagtggacg gacgacctcg 2280tccgtctgcg ggagcgctat ccctggctcg
tcgccgaggt ggacggcgag gtcgccggca 2340tcgcctacgc gggcccctgg aaggcacgca
acgcctacga ctggacggcc gagtcgaccg 2400tgtacgtctc cccccgccac cagcggacgg
gactgggctc cacgctctac acccacctgc 2460tgaagtccct ggaggcacag ggcttcaaga
gcgtggtcgc tgtcatcggg ctgcccaacg 2520acccgagcgt gcgcatgcac gaggcgctcg
gatatgcccc ccgcggcatg ctgcgggcgg 2580ccggcttcaa gcacgggaac tggcatgacg
tgggtttctg gcagctggac ttcagcctgc 2640cggtaccgcc ccgtccggtc ctgcccgtca
ccgagatctg atctcacgcg tctaggaacc 2700cagctttctt gtacaaagtg gtgatgatcc
gtcgacctgc agatcgttca aacatttggc 2760aataaagttt cttaagattg aatcctgttg
ccggtcttgc gatgattatc atataatttc 2820tgttgaatta cgttaagcat gtaataatta
acatgtaatg catgacgtta tttatgagat 2880gggtttttat gattagagtc ccgcaattat
acatttaata cgcgatagaa aacaaaatat 2940agcgcgcaaa ctaggataaa ttatcgcgcg
cggtgtcatc tatgttacta gatccgatga 3000taagctgtca aacatgagaa ttcagtacat
taaaaacgtc cgcaatgtgt tattaagttg 3060tctaagcgtc aatttgttta caccacaata
tatcctgcca ccagccagcc aacagctccc 3120cgaccggcag ctcggcacaa aatcaccact
cgatacaggc agcccatcag tccgggacgg 3180cgtcagcggg agagccgttg taaggcggca
gactttgctc atgttaccga tgctattcgg 3240aagaacggca actaagctgc cgggtttgaa
acacggatga tctcgcggag ggtagcatgt 3300tgattgtaac gatgacagag cgttgctgcc
tgtgatcaaa tatcatctcc ctcgcagaga 3360tccgaattat cagccttctt attcatttct
cgcttaaccg tgacaggctg tcgatcttga 3420gaactatgcc gacataatag gaaatcgctg
gataaagccg ctgaggaagc tgagtggcgc 3480tatttcttta gaagtgaacg ttgacgatcg
tcgaccgtac cccgatgaat taattcggac 3540gtacgttctg aacacagctg gatacttact
tgggcgattg tcatacatga catcaacaat 3600gtacccgttt gtgtaaccgt ctcttggagg
ttcgtatgac actaggtcgc taccttagga 3660ccgttatagt tactagcgaa ttgacatgag
gttgccccgt attcagtgtc gctgatttgt 3720attgtctgaa gttgttttta cgttaagttg
atgcagatca attaatacga tacctgcgtc 3780ataattgatt atttgacgtg gtttgatggc
ctccacgcac gttgtgatat gtagatgata 3840atcattatca ctttacgggt cctttccggt
gatccgacag gttacggggc ggcgacctcg 3900cgggttttcg ctatttatga aaattttccg
gtttaaggcg tttccgttct tcttcgtcat 3960aacttaatgt ttttatttaa aataccctct
gaaaagaaag gaaacgacag gtgctgaaag 4020cgagcttttt ggcctctgtc gtttcctttc
tctgtttttg tccgtggaat gaacaatgga 4080aggatcttct cggcggcgat cacgacgccg
gccctgcgga gccttcgccg cgtgcgcgat 4140tcatggcggc cgtggaggcc aaggatttcg
cgcgagtgca agagctgatc gaggcgcgtg 4200gagccaagtc ggcggctgat tatgtccttg
cgcagctcgc cgtggccgaa ggtctggacc 4260gcaagcctgg tgcgcgcgtc gtggtcggga
aagcggcggg cagcatggca atgccgcctg 4320cggcgctggg ttttacgcca aggggagaag
cggcatacgc catcgagcgg tcagcctatg 4380gtgagccgag gtccagcatt gcgaagcagt
accagcagga atggaaccgg aaggcggcga 4440cctggtgggc gatggccggt gtggccggca
tcatcggcgc gatcctggcg gcggcggcaa 4500ccggctttgt tgggctggca gtgtcgatcc
gcaaccgagt gaagcgcgtg cgcgacctgt 4560tggtgatgga gccgggtgca gagccataag
cggcaagaga cgaaagcccg gtttccgggc 4620ttttgttttg ttacgccaag gacgagtttt
agcggctaaa ggtgttgacg tgcgagaaat 4680gtttagctaa acttctctca tgtgctggcg
gctgtcaccg ctatgttcaa ccaaggcgcg 4740gagcaaatta tgggtgttat ccatgaagaa
acggcttacc gaaagccagt tccaggaggc 4800gatccagggg ctggaagtgg ggcagcagac
catcgagata gcgcggggcg tcttagtcga 4860tgggaagcca caggcgacgt tcgcaacgtc
gctgggactg accaggggcg cagtgtcgca 4920agcggtgcat cgcgtgtggg ccgcgttcga
ggacaagaac ttgcccgagg ggtacgcgcg 4980ggtaacggcg gttctgccgg aacatcaggc
gtacatcgtc cggaagtggg aagcggacgc 5040caagaaaaaa caggaaacca aacgatgaaa
actttggtca cggccaacca gaaaggcggc 5100gtcggcaaga cttcgaccct tgtgcatctt
gccttcgact ttttcgagcg cggcttgcgg 5160gttgccgtga tcgacctgga cccccagggc
aatgcgtcct acacgctcaa ggactttgct 5220accggcctgc atgcaagcaa gctgttcggc
gctgtccctg ccggcggctg gaccgaaacc 5280gcacccgcag ccggcgacgg ccaggccgcg
cgcctcgccc tcatcgagtc caacccggta 5340ctggcgaacg ccgaacggct gtcgctggac
gacgcccgcg agctgttcgg ggcgaacatc 5400aaggccctgg cgaaccaagg cttcgacgtg
tgcctgatcg acacggcccc gacccttggc 5460gtcggcctgg cggccgccct cttcgcggcc
gactatgtgc tgtcccccat cgagcttgag 5520gcgtacagca tccagggcat caagaagatg
gtcacgacca ttgcgaacgt gcgccagaag 5580aacgccaagc tgcaattcct tggcatggtg
cccagcaagg tcgatgcgcg gaatccgcgc 5640cacgcgcgcc accaagccga gctgctggcc
gcgtacccca agatgatgat tccggccacc 5700gttggcctgc gcagcagcat cgccgatgcc
ctcgcatccg gtgtgccggt ctggaagatc 5760aagaaaacgg ccgcgcgcaa ggcatcgaaa
gaggttcgcg ccctggctga ttacgtgttc 5820acgaagatgg agatttccca atgactgcgg
ctcaagccaa gaccaccaag aaaaacaccg 5880ctgcggccgc tcaggaagcc gcaggcgcgg
cgcagccgtc cggcctgggg ttggatagca 5940tcggcgacct gtcgagcctc ctggacgctc
ctgcggcgtc tcagggcggt tccggcccta 6000tcgagctgga cctggacctg atcgacgaag
atccgcatca gccgcggacg gccgacaacc 6060ccggcttttc cccggagagc atcgcggaaa
tcggtgccac gatcaaagag cgcggggtga 6120agtcacccat ttcggtgcgc gagaaccagg
agcagccggg ccgctatatc atcaatcacg 6180gcgcccgccg ctaccgtggc tcgaatctag
tgatattcca caaaacagca gggaagcagc 6240gcttttccgc tgcataaccc tgcttcgggg
tcattatagc gattttttcg gtatatccat 6300cctttttcgc acgatataca ggattttgcc
aaagggttcg tgtagacttt ccttggtgta 6360tccaacggcg tcagccgggc aggataggtg
aagtaggccc acccgcgagc gggtgttcct 6420tcttcactgt cccttattcg cacctggcgg
tgctcaacgg gaatcctgct ctgcgaggct 6480ggccggctac cgccggcgta acagatgagg
gcaagcggat ggctgatgaa accaagccaa 6540ccaggaaggg cagcccacct atcaaggtgt
actgccttcc agacgaacga agagcgattg 6600aggaaaaggc ggcggcggcc ggcatgagcc
tgtcggccta cctgctggcc gtcggccagg 6660gctacaaaat cacgggcgtc gtggactatg
agcacgtccg cgagctggcc cgcatcaatg 6720gcgacctggg ccgcctgggc ggcctgctga
aactctggct caccgacgac ccgcgcacgg 6780cgcggttcgg tgatgccacg atcctcgccc
tgctggcgaa gatcgaagag aagcaggacg 6840agcttggcaa ggtcatgatg ggcgtggtcc
gcccgagggc agagccatga cttttttagc 6900cgctaaaacg gccggggggt gcgcgtgatt
gccaagcacg tccccatgcg ctccatcaag 6960aagagcgact tcgcggagct ggtgaagtac
atcaccgacg agcaaggcaa gaccgagcgc 7020cagatccaaa acaactgtca aagcgcaccc
gcccgatgcc attcgcggca cggcttccgt 7080tgaggatgtc gatatgatgc gcgagccgac
ggcccgcaga gaaggggccg ttttagcggc 7140taaagaagga agtgcaagcc ctaacccttg
gcgtcagagc cttccacgca gcttttttcg 7200ggtgtcgtcg ccccatttct ttacgataaa
cgccttatgt gacggcaaaa ccacactgat 7260gcgttcgtat ccgggcggca cgctgctctt
gaaaggatga cccgcaatct ccgcgagtgc 7320ctcgcggtca aggtcggtgg actccaggag
aagaggtagg ggagtttcca gggcgtcggc 7380aatggcctcc atcaccttca acgaggggtt
ggccttaccg ttggttaagt ctgataaaaa 7440cgaaattgaa acccctgccc tctccgacag
ctcatgtttc gtcatgcccc gctcatcgag 7500cagacgaagg atgttggtga aaaatatctg
gttgtacaca gcggaagccg cccctcgcac 7560ctttggtcgc ggcccgcaaa attttagccg
ctaaagttct tgacagcgga accaatgttt 7620agctaaacta gagtctcctt tctcaaggag
actttcgata tgagccataa tcagttccag 7680tttatcggta atcttacccg tgacaccgag
gtacgtcatg gcaattctaa caagccgcaa 7740gcaattttcg atatagcggt taatgaagag
tggcgcaacg atgccggcga caagcaggag 7800cgcaccgact tcttccgcat caagtgtttt
ggctctcagg ccgaggccca cggcaagtat 7860ttgggcaagg ggtcgctggt attcgtgcag
ggcaagattc ggaataccaa gtacgagaag 7920gacggccaga cggtctacgg gaccgacttc
attgccgata aggtggatta tctggacacc 7980aaggcaccag gcgggtcaaa tcaggaataa
gggcacattg ccccggcgtg agtcggggca 8040atcccgcaag gagggtgaat gaatcggacg
tttgaccgga aggcatacag gcaagaactg 8100atcgacgcgg ggttttccgc cgaggatgcc
gaaaccatcg caagccgcac cgtcatgcgt 8160gcgccccgcg aaaccttcca gtccgtcggc
tcgatggtcc agcaagctac ggccaagatc 8220gagcgcgaca gcgtgcaact ggctccccct
gccctgcccg cgccatcggc cgccgtggag 8280cgttcgcgtc gtctcgaaca ggaggcggca
ggtttggcga agtcgatgac catcgacacg 8340cgaggaacta tgacgaccaa gaagcgaaaa
accgccggcg aggacctggc aaaacaggtc 8400agcgaggcca agcaggccgc gttgctgaaa
cacacgaagc agcagatcaa ggaaatgcag 8460ctttccttgt tcgatattgc gccgtggccg
gacacgatgc gagcgatgcc aaacgacacg 8520gcccgctctg ccctgttcac cacgcgcaac
aagaaaatcc cgcgcgaggc gctgcaaaac 8580aaggtcattt tccacgtcaa caaggacgtg
aagatcacct acaccggcgt cgagctgcgg 8640gccgacgatg acgaactggt gtggcagcag
gtgttggagt acgcgaagcg cacccctatc 8700ggcgagccga tcaccttcac gttctacgag
ctttgccagg acctgggctg gtcgatcaat 8760ggccggtatt acacgaaggc cgaggaatgc
ctgtcgcgcc tacaggcgac ggcgatgggc 8820ttcacgtccg accgcgttgg gcacctggaa
tcggtgtcgc tgctgcaccg cttccgcgtc 8880ctggaccgtg gcaagaaaac gtcccgttgc
caggtcctga tcgacgagga aatcgtcgtg 8940ctgtttgctg gcgaccacta cacgaaattc
atatgggaga agtaccgcaa gctgtcgccg 9000acggcccgac ggatgttcga ctatttcagc
tcgcaccggg agccgtaccc gctcaagctg 9060gaaaccttcc gcctcatgtg cggatcggat
tccacccgcg tgaagaagtg gcgcgagcag 9120gtcggcgaag cctgcgaaga gttgcgaggc
agcggcctgg tggaacacgc ctgggtcaat 9180gatgacctgg tgcattgcaa acgctagggc
cttgtggggt cagttccggc tgggggttca 9240gcagccagcg ctttactggc atttcaggaa
caagcgggca ctgctcgacg cacttgcttc 9300gctcagtatc gctcgggacg cacggcgcgc
tctacgaact gccgataaac agaggattaa 9360aattgacaat tgtgattaag gctcagattc
gacggcttgg agcggccgac gtgcaggatt 9420tccgcgagat ccgattgtcg gccctgaaga
aagctccaga gatgttcggg tccgtttacg 9480agcacgagga gaaaaagccc atggaggcgt
tcgctgaacg gttgcgagat gccgtggcat 9540tcggcgccta catcgacggc gagatcattg
ggctgtcggt cttcaaacag gaggacggcc 9600ccaaggacgc tcacaaggcg catctgtccg
gcgttttcgt ggagcccgaa cagcgaggcc 9660gaggggtcgc cggtatgctg ctgcgggcgt
tgccggcggg tttattgctc gtgatgatcg 9720tccgacagat tccaacggga atctggtgga
tgcgcatctt catcctcggc gcacttaata 9780tttcgctatt ctggagcttg ttgtttattt
cggtctaccg cctgccgggc gggtcgcggc 9840gacggtaggc gctgtgcagc cgctgatggt
cgtgttcatc tctgccgctc tgctaggtag 9900cccgatacga ttgatggcgg tcctgggggc
tatttgcgga actgcgggcg tggcgctgtt 9960ggtgttgaca ccaaacgcag cgctagatcc
tgtcggcgtc gcagcgggcc tggcgggggc 10020ggtttccatg gcgttcggaa ccgtgctgac
ccgcaagtgg caacctcccg tgcctctgct 10080cacctttacc gcctggcaac tggcggccgg
aggacttctg ctcgttccag tagctttagt 10140gtttgatccg ccaatcccga tgcctacagg
aaccaatgtt ctcggctgct cgactgcacg 10200aataccagcg accccttgcc caaatacttg
ccgtgggcct cggcctgaga gccaaaacac 10260ttgatgcgga agaagtcggt gcgctcctgc
ttgtcgccgg catcgttgcg ccacatctag 10320gtactaaaac aattcatcca gtaaaatata
atattttatt ttctcccaat caggcttgat 10380ccccagtaag tcaaaaaata gctcgacata
ctgttcttcc ccgatatcct ccctgatcga 10440ccggacgcag aaggcaatgt cataccactt
gtccgccctg ccgcttctcc caagatcaat 10500aaagccactt actttgccat ctttcacaaa
gatgttgctg tctcccaggt cgccgtggga 10560aaagacaagt tcctcttcgg gcttttccgt
ctttaaaaaa tcatacagct cgcgcggatc 10620tttaaatgga gtgtcttctt cccagttttc
gcaatccaca tcggccagat cgttattcag 10680taagtaatcc aattcggcta agcggctgtc
taagctattc gtatagggac aatccgatat 10740gtcgatggag tgaaagagcc tgatgcactc
cgcatacagc tcgataatct tttcagggct 10800ttgttcatct tcatactctt ccgagcaaag
gacgccatcg gcctcactca tgagcagatt 10860gctccagcca tcatgccgtt caaagtgcag
gacctttgga acaggcagct ttccttccag 10920ccatagcatc atgtcctttt cccgttccac
atcataggtg gtccctttat accggctgtc 10980cgtcattttt aaatataggt tttcattttc
tcccaccagc ttatatacct tagcaggaga 11040cattccttcc gtatctttta cgcagcggta
tttttcgatc agttttttca attccggtga 11100tattctcatt ttagccattt attatttcct
tcctcttttc tacagtattt aaagataccc 11160caagaagcta attataacaa gacgaactcc
aattcactgt tccttgcatt ctaaaacctt 11220aaataccaga aaacagcttt ttcaaagttg
ttttcaaagt tggcgtataa catagtatcg 11280attcgatagc gtggactcaa ggctctcgcg
aatggctcgc gttggaaact ttcattgaca 11340cttgaggggc accgcaggga aattctcgtc
cttgcgagaa ccggctatgt cgtgctgcgc 11400atcgagcctg cgcccttggc ttgtctcgcc
cctctccgcg tcgctacggg gcttccagcg 11460cctttccgac gctcaccggg ctggttgccc
tcgccgctgg gctggcggcc gtctatggcc 11520ctgcaaacgc gccagaaacg ccgtcgaagc
cgtgtgcgag acaccgcggc cgccggcgtt 11580gtggatacct cgcggaaaac ttggccctca
ctgacagatg aggggcggac gttgacactt 11640gaggggccga ctcacccggc gcggcgttga
cagatgaggg gcaggctcga tttcggccgg 11700cgacgtggag ctggccagcc tcgcaaatcg
gcgaaaacgc ctgattttac gcgagtttcc 11760cacagatgat gtggacaagc ctggggataa
gtgccctgcg gtattgacac ttgaggggcg 11820cgactactga cagatgaggg gcgcgatcct
tgacacttga ggggcagagt gctgacagat 11880gaggggcgca cctattgaca tttgaggggc
tgtccacagg cagaaaatcc agcatttgca 11940agggtttccg cccgtttttc ggccaccgct
aacctgtctt ttaacctgct tttaaaccaa 12000tatttataaa ccttgttttt aaccagggct
gcgccctgtg cgcgtgaccg cgcacgccga 12060aggggggtgc ccccccttct cgaaccctcc
cggcccgcta acgcgggcct cccatccccc 12120caggggctgc gcccctcggc cgcgaacggc
ctcaccccaa aaatggcagc gccagattat 12180tgaagcattt atcagggtta ttgtctcatg
agcggataca tatttgaatg tatttagaaa 12240aataaacaaa taggggttcc gcgcacattt
ccccgaaaag tgccacctga cgtctaagaa 12300accattatta tcatgacatt aacctataaa
aataggcgta tcacgaggcc ctttcgtctt 12360caagaattgg tcgacgatct tgctgcgttc
ggatattttc gtggagttcc cgccacagac 12420ccggattgaa ggcgagatcc agcaactcgc
gccagatcat cctgtgacgg aactttggcg 12480cgtgatgact ggccaggacg tcggccgaaa
gagcgacaag cagatcacgc ttttcgacag 12540cgtcggattt gcgatcgagg atttttcggc
gctgcgctac gtccgcgacc gcgttgaggg 12600atcaagccac agcagcccac tcgaccttct
agccgaccca gacgagccaa gggatctttt 12660tggaatgctg ctccgtcgtc aggctttccg
acgtttgggt ggttgaacag aagtcattat 12720cgcacggaat gccaagcact cccgagggga
accctgtggt tggcatgcac atacaaatgg 12780acgaacggat aaaccttttc acgccctttt
aaatatccga ttattctaat aaacgctctt 12840ttctcttagg tttacccgcc aatatatcct
gtcaaacact gatagtttaa actgaaggcg 12900ggaaacgaca atctgatcat gagcggagaa
ttaagggagt cacgttatga cccccgccga 12960tgacgcggga caagccgttt tacgtttgga
actgacagaa ccgcaacgtt gaaggagcca 13020ctcagc
13026
SEQ ID NO: 6; Length: 27875; Type: DNA; Organism: Artificial; Other information: pLCSBGWBSW
aagcttgcgg ccgcttcgcc atttaaatgg cgaagatgtt aattaacatc ggtaccgagc
60tctagggata acagggtaat agctcgaatt ctagcttgca tgcctgcagt gcagcgtgac
120ccggtcgtgc ccctctctag agataatgag cattgcatgt ctaagttata aaaaattacc
180acatattttt tttgtcacac ttgtttgaag tgcagtttat ctatctttat acatatattt
240aaactttact ctacgaataa tataatctat agtactacaa taatatcagt gttttagaga
300atcatataaa tgaacagtta gacatggtct aaaggacaat tgagtatttt gacaacagga
360ctctacagtt ttatcttttt agtgtgcatg tgttctcctt tttttttgca aatagcttca
420cctatataat acttcatcca ttttattagt acatccattt agggtttagg gttaatggtt
480tttatagact aattttttta gtacatctat tttattctat tttagcctct aaattaagaa
540aactaaaact ctattttagt ttttttattt aataatttag atataaaata gaataaaata
600aagtgactaa aaattaaaca aatacccttt aagaaattaa aaaaactaag gaaacatttt
660tcttgtttcg agtagataat gccagcctgt taaacgccgt cgacgagtct aacggacacc
720aaccagcgaa ccagcagcgt cgcgtcgggc caagcgaagc agacggcacg gcatctctgt
780cgctgcctct ggacccctct cgagagttcc gctccaccgt tggacttgct ccgctgtcgg
840catccagaaa ttgcgtggcg gagcggcaga cgtgagccgg cacggcaggc ggcctcctcc
900tcctctcacg gcaccggcag ctacggggga ttcctttccc accgctcctt cgctttccct
960tcctcgcccg ccgtaataaa tagacacccc ctccacaccc tctttcccca acctcgtgtt
1020gttcggagcg cacacacaca caaccagatc tcccccaaat ccacccgtcg gcacctccgc
1080ttcaaggtac gccgctcgtc ctcccccccc ccccctctct accttctcta gatcggcgtt
1140ccggtccatg gttagggccc ggtagttcta cttctgttca tgtttgtgtt agatccgtgt
1200ttgtgttaga tccgtgctgc tagcgttcgt acacggatgc gacctgtacg tcagacacgt
1260tctgattgct aacttgccag tgtttctctt tggggaatcc tgggatggct ctagccgttc
1320cgcagacggg atcgatttca tgattttttt tgtttcgttg catagggttt ggtttgccct
1380tttcctttat ttcaatatat gccgtgcact tgtttgtcgg gtcatctttt catgcttttt
1440tttgtcttgg ttgtgatgat gtggtctggt tgggcggtcg ttctagatcg gagtagaatt
1500ctgtttcaaa ctacctggtg gatttattaa ttttggatct gtatgtgtgt gccatacata
1560ttcatagtta cgaattgaag atgatggatg gaaatatcga tctaggatag gtatacatgt
1620tgatgcgggt tttactgatg catatacaga gatgcttttt gttcgcttgg ttgtgatgat
1680gtggtgtggt tgggcggtcg ttcattcgtt ctagatcgga gtagaatact gtttcaaact
1740acctggtgta tttattaatt ttggaactgt atgtgtgtgt catacatctt catagttacg
1800agtttaagat ggatggaaat atcgatctag gataggtata catgttgatg tgggttttac
1860tgatgcatat acatgatggc atatgcagca tctattcata tgctctaacc ttgagtacct
1920atctattata ataaacaagt atgttttata attattttga tcttgatata cttggatgat
1980ggcatatgca gcagctatat gtggattttt ttagccctgc cttcatacgc tatttatttg
2040cttggtactg tttcttttgt cgatgctcac cctgttgttt ggtgttactt ctgcaggtcg
2100actctagagg atcatcacaa gtttgtacaa aaaagcaggc tccatggacc cagaacgacg
2160cccggccgac atccgccgtg ccaccgaggc ggacatgccg gcggtctgca ccatcgtcaa
2220ccactacatc gagacaagca cggtcaactt ccgtaccgag ccgcaggaac cgcaggagtg
2280gacggacgac ctcgtccgtc tgcgggagcg ctatccctgg ctcgtcgccg aggtggacgg
2340cgaggtcgcc ggcatcgcct acgcgggccc ctggaaggca cgcaacgcct acgactggac
2400ggccgagtcg accgtgtacg tctccccccg ccaccagcgg acgggactgg gctccacgct
2460ctacacccac ctgctgaagt ccctggaggc acagggcttc aagagcgtgg tcgctgtcat
2520cgggctgccc aacgacccga gcgtgcgcat gcacgaggcg ctcggatatg ccccccgcgg
2580catgctgcgg gcggccggct tcaagcacgg gaactggcat gacgtgggtt tctggcagct
2640ggacttcagc ctgccggtac cgccccgtcc ggtcctgccc gtcaccgaga tctgatctca
2700cgcgtctagg aacccagctt tcttgtacaa agtggtgatg atccgtcgac ctgcagatcg
2760ttcaaacatt tggcaataaa gtttcttaag attgaatcct gttgccggtc ttgcgatgat
2820tatcatataa tttctgttga attacgttaa gcatgtaata attaacatgt aatgcatgac
2880gttatttatg agatgggttt ttatgattag agtcccgcaa ttatacattt aatacgcgat
2940agaaaacaaa atatagcgcg caaactagga taaattatcg cgcgcggtgt catctatgtt
3000actagatccg atgataagct gtcaaacatg agaattcagt acattaaaaa cgtccgcaat
3060gtgttattaa gttgtctaag cgtcaatttg tttacaccac aatatatcct gccaccagcc
3120agccaacagc tccccgaccg gcagctcggc acaaaatcac cactcgatac aggcagccca
3180tcagtccggg acggcgtcag cgggagagcc gttgtaaggc ggcagacttt gctcatgtta
3240ccgatgctat tcggaagaac ggcaactaag ctgccgggtt tgaaacacgg atgatctcgc
3300ggagggtagc atgttgattg taacgatgac agagcgttgc tgcctgtgat caaatatcat
3360ctccctcgca gagatccgaa ttatcagcct tcttattcat ttctcgctta accgtgacag
3420gctgtcgatc ttgagaacta tgccgacata ataggaaatc gctggataaa gccgctgagg
3480aagctgagtg gcgctatttc tttagaagtg aacgttgacg atcgtcgacc gtaccccgat
3540gaattaattc ggacgtacgt tctgaacaca gctggatact tacttgggcg attgtcatac
3600atgacatcaa caatgtaccc gtttgtgtaa ccgtctcttg gaggttcgta tgacactagg
3660tcgctacctt aggaccgtta tagttactag aaatggtacc tgcggggaag cttacaataa
3720tgtgtgttgt taagtcttgt tgcctgtcat cgtctgactg actttcgtca taaatcccgg
3780cctccgtaac ccagctttgg gcaagctcac ggatttgatc cggcggaacg ggaatatcga
3840gatgccgggc tgaacgctgc agttccagct ttccctttcg ggacaggtac tccagctgat
3900tgattatctg ctgaagggtc ttggttccac ctcctggcac aatgcgaatg attacttgag
3960cgcgatcggg catccaattt tctcccgtca ggtgcgtggt caagtgctac aaggcacctt
4020tcagtaacga gcgaccgtcg atccgtcgcc gggatacgga caaaatggag cgcagtagtc
4080catcgagggc ggcgaaagcc tcgccaaaag caatacgttc atctcgcaca gcctccagat
4140ccgatcgagg gtcttcggcg taggcagata gaagcatgga tacattgctt gagagtattc
4200cgatggactg aagtatggct tccatctttt ctcgtgtgtc tgcatctatt tcgagaaagc
4260ccccgatgcg gcgcaccgca acgcgaattg ccatactatc cgaaagtccc agcaggcgcg
4320cttgatagga aaaggtttca tactcggccg atcgcagacg ggcactcacg accttgaacc
4380cttcaacttt cagggatcga tgctggttga tggtagtctc actcgacgtg gctctggtgt
4440gttttgacat agcttcctcc aaagaaagcg gaaggtctgg atactccagc acgaaatgtg
4500cccgggtaga cggatggaag tctagccctg ctcaatatga aatcaacagt acatttacag
4560tcaatactga atatacttgc tacatttgca attgtcttat aacgaatgtg aaataaaaat
4620agtgtaacaa cgcttttact catcgataat cacaaaaaca tttatacgaa caaaaataca
4680aatgcactcc ggtttcacag gataggcggg atcagaatat gcaacttttg acgttttgtt
4740ctttcaaagg gggtgctggc aaaaccaccg cactcatggg cctttgcgct gctttggcaa
4800atgacggtaa acgagtggcc ctctttgatg ccgacgaaaa ccggcctctg acgcgatgga
4860gagaaaacgc cttacaaagc agtactggga tcctcgctgt gaagtctatt ccgccgacga
4920aatgcccctt cttgaagcag cctatgaaaa tgccgagctc gaaggatttg attatgcgtt
4980ggccgatacg cgtggcggct cgagcgagct caacaacaca atcatcgcta gctcaaacct
5040gcttctgatc cccaccatgc taacgccgct cgacatcgat gaggcactat ctacctaccg
5100ctacgtcatc gagctgctgt tgagtgaaaa tttggcaatt cctacagctg ttttgcgcca
5160acgcgtcccg gtcggccgat tgacaacatc gcaacgcagg atgtcagaga cgctagagag
5220ccttccagtt gtaccgtctc ccatgcatga aagagatgca tttgccgcga tgaaagaacg
5280cggcatgttg catcttacat tactaaacac gggaactgat ccgacgatgc gcctcataga
5340gaggaatctt cggattgcga tggaggaagt cgtggtcatt tcgaaactga tcagcaaaat
5400cttggaggct tgaagatggc aattcgcaag cccgcattgt cggtcggcga agcacggcgg
5460cttgctggtg ctcgacccga gatccaccat cccaacccga cacttgttcc ccagaagctg
5520gacctccagc acttgcctga aaaagccgac gagaaagacc agcaacgtga gcctctcgtc
5580gccgatcaca tttacagtcc cgatcgacaa cttaagctaa ctgtggatgc ccttagtcca
5640cctccgtccc cgaaaaagct ccaggttttt ctttcagcgc gaccgcccgc gcctcaagtg
5700tcgaaaacat atgacaacct cgttcggcaa tacagtccct cgaagtcgct acaaatgatt
5760ttaaggcgcg cgttggacga tttcgaaagc atgctggcag atggatcatt tcgcgtggcc
5820ccgaaaagtt atccgatccc ttcaactaca gaaaaatccg ttctcgttca gacctcacgc
5880atgttcccgg ttgcgttgct cgaggtcgct cgaagtcatt ttgatccgtt ggggttggag
5940accgctcgag ctttcggcca caagctggct accgccgcgc tcgcgtcatt ctttgctgga
6000gagaagccat cgagcaattg gtgaagaggg acctatcgga acccctcacc aaatattgag
6060tgtaggtttg aggccgctgg ccgcgtcctc agtcaccttt tgagccagat aattaagagc
6120caaatgcaat tggctcaggc tgccatcgtc cccccgtgcg aaacctgcac gtccgcgtca
6180aagaaataac cggcacctct tgctgttttt atcagttgag ggcttgacgg atccgcctca
6240agtttgcggc gcagccgcaa aatgagaaca tctatactcc tgtcgtaaac ctcctcgtcg
6300cgtactcgac tggcaatgag aagttgctcg cgcgatagaa cgtcgcgggg tttctctaaa
6360aacgcgagga gaagattgaa ctcacctgcc gtaagtttca cctcaccgcc agcttcggac
6420atcaagcgac gttgcctgag attaagtgtc cagtcagtaa aacaaaaaga ccgtcggtct
6480ttggagcgga caacgttggg gcgcacgcgc aaggcaaccc gaatgcgtgc aagaaactct
6540ctcgtactaa acggcttagc gataaaatca cttgctccta gctcgagtgc aacaacttta
6600tccgtctcct caaggcggtc gccactgata attatgattg gaatatcaga ctttgccgcc
6660agatttcgaa cgatctcaag cccatcttca cgacctaaat ttagatcaac aaccacgaca
6720tcgaccgtcg cggaagagag tactctagtg aactgggtgc tgtcggctac cgcggtcact
6780ttgaaggcgt ggatcgtaag gtattcgata ataagatgcc gcatagcgac atcgtcatcg
6840ataagaagaa cgtgtttcaa cggctcacct ttcaatctaa aatctgaacc cttgttcaca
6900gcgcttgaga aattttcacg tgaaggatgt acaatcatct ccagctaaat gggcagttcg
6960tcagaattgc ggctgaccgc ggatgacgaa aatgcgaacc aagtatttca attttatgac
7020aaaagttctc aatcgttgtt acaagtgaaa cgcttcgagg ttacagctac tattgattaa
7080ggagatcgcc tatggtctcg ccccggcgtc gtgcgtccgc cgcgagccag atctcgccta
7140cttcataaac gtcctcatag gcacggaatg gaatgatgac atcgatcgcc gtagagagca
7200tgtcaatcag tgtgcgatct tccaagctag caccttgggc gctacttttg acaagggaaa
7260acagtttctt gaatccttgg attggattcg cgccgtgtat tgttgaaatc gatcccggat
7320gtcccgagac gacttcactc agataagccc atgctgcatc gtcgcgcatc tcgccaagca
7380atatccggtc cggccgcata cgcagacttg cttggagcaa gtgctcggcg ctcacagcac
7440ccagcccagc accgttcttg gagtagagta gtctaacatg attatcgtgt ggaatgacga
7500gttcgagcgt atcttctatg gtgattagcc tttcctgggg ggggatggcg ctgatcaagg
7560tcttgctcat tgttgtcttg ccgcttccgg tagggccaca tagcaacatc gtcagtcggc
7620tgacgacgca tgcgtgcaga aacgcttcca aatccccgtt gtcaaaatgc tgaaggatag
7680cttcatcatc ctgattttgg cgtttccttc gtgtctgcca ctggttccac ctcgaagcat
7740cataacggga ggagacttct ttaagaccag aaacacgcga gcttggccgt cgaatggtca
7800agctgacggt gcccgaggga acggtcggcg gcagacagat ttgtagtcgt tcaccaccag
7860gaagttcagt ggcgcagagg gggttacgtg gtccgacatc ctgctttctc agcgcgcccg
7920ctaaaatagc gatatcttca agatcatcat aagagacggg caaaggcatc ttggtaaaaa
7980tgccggcttg gcgcacaaat gcctctccag gtcgattgat cgcaatttct tcagtcttcg
8040ggtcatcgag ccattccaaa atcggcttca gaagaaagcg tagttgcgga tccacttcca
8100tttacaatgt atcctatctc taagcggaaa tttgaattca ttaagagcgg cggttcctcc
8160cccgcgtggc gccgccagtc aggcggagct ggtaaacacc aaagaaatcg aggtcccgtg
8220ctacgaaaat ggaaacggtg tcaccctgat tcttcttcag ggttggcggt atgttgatgg
8280ttgccttaag ggctgtctca gttgtctgct caccgttatt ttgaaagctg ttgaagctca
8340tcccgccacc cgagctgccg gcgtaggtgc tagctgcctg gaaggcgcct tgaacaacac
8400tcaagagcat agctccgcta aaacgctgcc agaagtggct gtcgaccgag cccggcaatc
8460ctgagcgacc gagttcgtcc gcgcttggcg atgttaacga gatcatcgca tggtcaggtg
8520tctcggcgcg atcccacaac acaaaaacgc gcccatctcc ctgttgcaag ccacgctgta
8580tttcgccaac aacggtggtg ccacgatcaa gaagcacgat attgttcgtt gttccacgaa
8640tatcctgagg caagacacac tttacatagc ctgccaaatt tgtgtcgatt gcggtttgca
8700agatgcacgg aattattgtc ccttgcgtta ccataaaatc ggggtgcggc aagagcgtgg
8760cgctgctggg ctgcagctcg gtgggtttca tacgtatcga caaatcgttc tcgccggaca
8820cttcgccatt cggcaaggag ttgtcgtcac gcttgccttc ttgtcttcgg cccgtgtcgc
8880cctgaatggc gcgtttgctg accccttgat cgccgctgct atatgcaaaa atcggtgttt
8940cttccggccg tggctcatgc cgctccggtt cgcccctcgg cggtagagga gcagcaggct
9000gaacagcctc ttgaaccgct ggaggatccg gcggcacctc aatcggagct ggatgaaatg
9060gcttggtgtt tgttgcgatc aaagttgacg gcgatgcgtt ctcattcacc ttcttttggc
9120gcccacctag ccaaatgagg cttaatgata acgcgagaac gacacctccg acgatcaatt
9180tctgagaccc cgaaagacgc cggcgatgtt tgtcggagac cagggatcca gatgcatcaa
9240cctcatgtgc cgcttgctga ctatcgttat tcatcccttc gcccccttca ggacgcgttt
9300cacatcgggc ctcaccgtgc ccgtttgcgg cctttggcca acgggatcgt aagcggtgtt
9360ccagatacat agtactgtgt ggccatccct cagacgccaa cctcgggaaa ccgaagaaat
9420ctcgacatcg ctccctttaa ctgaatagtt ggcaacagct tccttgccat caggattgat
9480ggtgtagatg gagggtatgc gtacattgcc cggaaagtgg aataccgtcg taaatccatt
9540gtcgaagact tcgagtggca acagcgaacg atcgccttgg gcgacgtagt gccaattact
9600gtccgccgca ccaagggctg tgacaggctg atccaataaa ttctcagctt tccgttgata
9660ttgtgcttcc gcgtgtagtc tgtccacaac agccttctgt tgtgcctccc ttcgccgagc
9720cgccgcatcg tcggcggggt aggcgaattg gacgctgtaa tagagatcgg gctgctcttt
9780atcgaggtgg gacagagtct tggaacttat actgaaaaca taacggcgca tcccggagtc
9840gcttgcggtt agcacgatta ctggctgagg cgtgaggacc tggcttgcct tgaaaaatag
9900ataatttccc cgcggtaggg ctgctagatc tttgctattt gaaacggcaa ccgctgtcac
9960cgtttcgttc gtggcgaatg ttacgaccaa agtagctcca accgccgtcg agaggcgcac
10020cacttgatcg ggattgtaag ccaaataacg catgcgcgga tctagcttgc ccgccattgg
10080agtgtcttca gcctccgcac cagtcgcagc ggcaaataaa catgctaaaa tgaaaagtgc
10140ttttctgatc atggttcgct gtggcctacg tttgaaacgg tatcttccga tgtctgatag
10200gaggtgacaa ccagacctgc cgggttggtt agtctcaatc tgccgggcaa gctggtcacc
10260ttttcgtagc gaactgtcgc ggtccacgta ctcaccacag gcattttgcc gtcaacgacg
10320agggtccttt tatagcgaat ttgctgcgtg cttggagtta catcatttga agcgatgtgc
10380tcgacctcca ccctgccgcg tttgccaaga atgacttgag gcgaactggg attgggatag
10440ttgaagaatt gctggtaatc ctggcgcact gttggggcac tgaagttcga taccaggtcg
10500taggcgtact gagcggtgtc ggcatcataa ctctcgcgca ggcgaacgta ctcccacaat
10560gaggcgttaa cgacggcctc ctcttgagtt gcaggcaatc gcgagacaga cacctcgctg
10620tcaacggtgc cgtccggccg tatccataga tatacgggca caagcctgct caacggcacc
10680attgtggcta tagcgaacgc ttgagcaaca tttcccaaaa tcgcgatagc tgcgacagct
10740gcaatgagtt tggagagacg tcgcgccgat ttcgctcgcg cggtttgaaa ggcttctact
10800tccttatagt gctcggcaag gctttcgcgc gccactagca tggcatattc aggccccgtc
10860atagcgtcca cccgaattgc cgagctgaag atctgacgga gtaggctgcc atcgccccac
10920attcagcggg aagatcgggc ctttgcagct cgctaatgtg tcgtttgtct ggcagccgct
10980caaagcgaca actaggcaca gcaggcaata cttcatagaa ttctccattg aggcgaattt
11040ttgcgcgacc tagcctcgct caacctgagc gaagcgacgg tacaagctgc tggcagattg
11100ggttgcgccg ctccagtaac tgcctccaat gttgccggcg atcgccggca aagcgacaat
11160gagcgcatcc cctgtcagaa aaaacatatc gagttcgtaa agaccaatga tcttggccgc
11220ggtcgtaccg gcgaaggtga ttacaccaag cataagggtg agcgcagtcg cttcggttag
11280gatgacgatc gttgccacga ggtttaagag gagaagcaag agaccgtagg tgataagttg
11340cccgatccac ttagctgcga tgtcccgcgt gcgatcaaaa atatatccga cgaggatcag
11400aggcccgatc gcgagaagca ctttcgtgag aattccaacg gcgtcgtaaa ctccgaaggc
11460agaccagagc gtgccgtaaa ggacccactg tgccccttgg aaagcaagga tgtcctggtc
11520gttcatcgga ccgatttcgg atgcgatttt ctgaaaaacg gcctgggtca cggcgaacat
11580tgtatccaac tgtgccggaa cagtctgcag aggcaagccg gttacactaa actgctgaac
11640aaagtttggg accgtctttt cgaagatgga aaccacatag tcttggtagt tagcctgccc
11700aacaattaga gcaacaacga tggtgaccgt gatcacccga gtgataccgc tacgggtatc
11760gacttcgccg cgtatgacta aaataccctg aacaataatc caaagagtga cacaggcgat
11820caatggcgca ctcaccgcct cctggatagt ctcaagcatc gagtccaagc ctgtcgtgaa
11880ggctacatcg aagatcgtat gaatggccgt aaacggcgcc ggaatcgtga aattcatcga
11940ttggacctga acttgactgg tttgtcgcat aatgttggat aaaatgagct cgcattcggc
12000gaggatgcgg gcggatgaac aaatcgccca gccttagggg agggcaccaa agatgacagc
12060ggtcttttga tgctccttgc gttgagcggc cgcctcttcc gcctcgtgaa ggccggcctg
12120cgcggtagtc atcgttaata ggcttgtcgc ctgtacattt tgaatcattg cgtcatggat
12180ctgcttgaga agcaaaccat tggtcacggt tgcctgcatg atattgcgag atcgggaaag
12240ctgagcagac gtatcagcat tcgccgtcaa gcgtttgtcc atcgtttcca gattgtcagc
12300cgcaatgcca gcgctgtttg cggaaccggt gatctgcgat cgcaacaggt ccgcttcagc
12360atcactaccc acgactgcac gatctgtatc gctggtgatc gcacgtgccg tggtcgacat
12420tggcattcgc ggcgaaaaca tttcattgtc taggtccttc gtcgaaggat actgattttt
12480ctggttgagc gaagtcagta gtccagtaac gccgtaggcc gacgtcaaca tcgtaaccat
12540cgctatagtc tgagtgagat tctccgcagt cgcgagcgca gtcgcgagcg tctcagcctc
12600cgttgccggg tcgctaacaa caaactgcgc ccgcgcgggc tgaatatata gaaagctgca
12660ggtcaaaact gttgcaataa gttgcgtcgt cttcatcgtt tcctacctta tcaatcttct
12720gcctcgtggt gacgggccat gaattcgctg agccagccag atgagttgcc ttcttgtgcc
12780tcgcgtagtc gagttgcaaa gcgcaccgtg ttggcacgcc ccgaaagcac ggcgacatat
12840tcacgcatat cccgcagatc aaattcgcag atgacgcttc cactttctcg tttaagaaga
12900aacttacggc tgccgaccgt catgtcttca cggatcgcct gaaattcctt ttcggtacat
12960ttcagtccat cgacataagc cgatcgatct gcggttggtg atggatagaa aatcttcgtc
13020atacattgcg caaccaagct ggctcctagc ggcgattcca gaacatgctc tggttgctgc
13080gttgccagta ttagcatccc gttgtttttt cgaacggtca ggaggaattt gtcgacgaca
13140gtcgaaaatt tagggtttaa caaataggcg cgaaactcat cgcagctcat cacaaaacgg
13200cggccgtcga tcatggctcc aatccgatgc aggagatatg ctgcagcggg agcgcatact
13260tcctcgtatt cgagaagatg cgtcatgtcg aagccggtaa tcgacggatc taactttact
13320tcgtcaactt cgccgtcaaa tgcccagcca agcgcatggc cccggcacca gcgttggagc
13380cgcgctcctg cgccttcggc gggcccatgc aacaaaaatt cacgtaaccc cgcgattgaa
13440cgcatttgtg gatcaaacga gagctgacga tggataccac ggaccagacg gcggttctct
13500tccggagaaa tcccaccccg accatcactc tcgatgagag ccacgatcca ttcgcgcaga
13560aaatcgtgtg aggctgctgt gttttctagg ccacgcaacg gcgccaaccc gctgggtgtg
13620cctctgtgaa gtgccaaata tgttcctcct gtggcgcgaa ccagcaattc gccaccccgg
13680tccttgtcaa agaacacgac cgtacctgca cggtcgacca tgctctgttc gagcatggct
13740agaacaaaca tcatgagcgt cgtcttaccc ctcccgatag gcccgaatat tgccgtcatg
13800ccaacatcgt gctcatgcgg gatatagtcg aaaggcgttc cgccattggt acgaaatcgg
13860gcaatcgcgt tgccccagtg gcctgagctg gcgccctctg gaaagttttc gaaagagaca
13920aaccctgcga aattgcgtga agtgattgcg ccagggcgtg tgcgccactt aaaattcccc
13980ggcaattggg accaataggc cgcttccata ccaatacctt cttggacaac cacggcacct
14040gcatccgcca ttcgtgtccg agcccgcgcg cccctgtccc caagactatt gagatcgtct
14100gcatagacgc aaaggctcaa atgatgtgag cccataacga attcgttgct cgcaagtgcg
14160tcctcagcct cggataattt gccgatttga gtcacggctt tatcgccgga actcagcatc
14220tggctcgatt tgaggctaag tttcgcgtgc gcttgcgggc gagtcaggaa cgaaaaactc
14280tgcgtgagaa caagtggaaa atcgagggat agcagcgcgt tgagcatgcc cggccgtgtt
14340tttgcagggt attcgcgaaa cgaatagatg gatccaacgt aactgtcttt tggcgttctg
14400atctcgagtc ctcgcttgcc gcaaatgact ctgtcggtat aaatcgaagc gccgagtgag
14460ccgctgacga ccggaaccgg tgtgaaccga ccagtcatga tcaaccgtag cgcttcgcca
14520atttcggtga agagcacacc ctgcttctcg cggatgccaa gacgatgcag gccatacgct
14580ttaagagagc cagcgacaac atgccaaaga tcttccatgt tcctgatctg gcccgtgaga
14640tcgttttccc tttttccgct tagcttggtg aacctcctct ttaccttccc taaagccgcc
14700tgtgggtaga caatcaacgt aaggaagtgt tcattgcgga ggagttggcc ggagagcacg
14760cgctgttcaa aagcttcgtt caggctagcg gcgaaaacac tacggaagtg tcgcggcgcc
14820gatgatggca cgtcggcatg acgtacgagg tgagcatata ttgacacatg atcatcagcg
14880atattgcgca acagcgtgtt gaacgcacga caacgcgcat tgcgcatttc agtttcctca
14940agctcgaatg caacgccatc aattctcgca atggtcatga tcgatccgtc ttcaagaagg
15000acgatatggt cgctgaggtg gccaatataa gggagataga tctcaccgga tctttcggtc
15060gttccactcg cgccgagcat cacaccattc ctctccctcg tgggggaacc ctaattggat
15120ttgggctaac agtagcgccc ccccaaactg cactatcaat gcttcttccc gcggtccgca
15180aaaatagcag gacgacgctc gccgcattgt agtctcgctc cacgatgagc cgggctgcaa
15240accataacgg cacgagaacg acttcgtaga gcgggttctg aacgataacg atgacaaagc
15300cggcgaacat catgaataac cctgccaatg tcagtggcac cccaagaaac aatgcgggcc
15360gtgtggctgc gaggtaaagg gtcgattctt ccaaacgatc agccatcaac taccgccagt
15420gagcgtttgg ccgaggaagc tcgccccaaa catgataaca atgccgccga cgacgccggc
15480aaccagccca agcgaagccc gcccgaacat ccaggagatc ccgatagcga caatgccgag
15540aacagcgagt gactggccga acggaccaag gataaacgtg catatattgt taaccattgt
15600ggcggggtca gtgccgccac ccgcagattg cgctgcggcg ggtccggatg aggaaatgct
15660ccatgcaatt gcaccgcaca agcttggggc gcagctcgat atcacgcgca tcatcgcatt
15720cgagagcgag aggcgattta gatgtaaacg gtatctctca aagcatcgca tcaatgcgca
15780cctccttagt ataagtcgaa taagacttga ttgtcgtctg cggatttgcc gttgtcctgg
15840tgtggcggtg gcggagcgat taaaccgcca gcgccatcct cctgcgagcg gcgctgatat
15900gacccccaaa catcccacgt ctcttcggat tttagcgcct cgtgatcgtc ttttggaggc
15960tcgattaacg cgggcaccag cgattgagca gctgtttcaa cttttcgcac gtagccgttt
16020gcaaaaccgc cgatgaaatt accggtgttg taagcggaga tcgcccgacg aagcgcaaat
16080tgcttctcgt caatcgtttc gccgcctgca taacgacttt tcagcatgtt tgcagcggca
16140gataatgatg tgcacgcctg gagcgcaccg tcaggtgtca gaccgagcat agaaaaattt
16200cgagagttta tttgcatgag gccaacatcc agcgaatgcc gtgcatcgag acggtgcctg
16260acgacttggg ttgcttggct gtgatcttgc cagtgaagcg tttcgccggt cgtgttgtca
16320tgaatcgcta aaggatcaaa gcgactctcc accttagcta tcgccgcaag cgtagatgtc
16380gcaactgatg gggcacactt gcgagcaaca tggtcaaact cagcagatga gagtggcgtg
16440gcaaggctcg acgaacagaa ggagaccatc aaggcaagag aaagcgaccc cgatctctta
16500agcatacctt atctccttag ctcgcaacta acaccgcctc tcccgttgga agaagtgcgt
16560tgttttatgt tgaagattat cgggagggtc ggttactcga aaattttcaa ttgcttcttt
16620atgatttcaa ttgaagcgag aaacctcgcc cggcgtcttg gaacgcaaca tggaccgaga
16680accgcgcatc catgactaag caaccggatc gacctattca ggccgcagtt ggtcaggtca
16740ggctcagaac gaaaatgctc ggcgaggtta cgctgtctgt aaacccattc gatgaacggg
16800aagcttcctt ccgattgctc ttggcaggaa tattggccca tgcctgcttg cgctttgcaa
16860atgctcttat cgcgttggta tcatatgcct tgtccgccag cagaaacgca ctctaagcga
16920ttatttgtaa aaatgtttcg gtcatgcggc ggtcatgggc ttgacccgct gtcagcgcaa
16980gacggatcgg tcaaccgtcg gcatcgacaa cagcgtgaat cttggtggtc aaaccgccac
17040gggaacgtcc catacagcca tcgtcttgat cccgctgttt cccgtcgccg catgttggtg
17100gacgcggaca caggaactgt caatcatgac gacattctat cgaaagcctt ggaaatcaca
17160ctcagaatat gatcccagac gtctgcctca cgccatcgta caaagcgatt gtagcaggtt
17220gtacaggaac cgtatcgatc aggaacgtct gcccagggcg ggcccgtccg gaagcgccac
17280aagatgacat tgatcacccg cgtcaacgcg cggcacgcga cgcggcttat ttgggaacaa
17340aggactgaac aacagtccat tcgaaatcgg tgacatcaaa gcggggacgg gttatcagtg
17400gcctccaagt caagcctcaa tgaatcaaaa tcagaccgat ttgcaaacct gatttatgag
17460tgtgcggcct aaatgatgaa atcgtccttc tagatcgcct ccgtggtgta gcaacacctc
17520gcagtatcgc cgtgctgacc ttggccaggg aattgactgg caagggtgct ttcacatgac
17580cgctcttttg gccgcgatag atgatttcgt tgctgctttg ggcacgtaga aggagagaag
17640tcatatcgga gaaattcctc ctggcgcgag agcctgctct atcgcgacgg catcccactg
17700tcgggaacag accggatcat tcacgaggcg aaagtcgtca acacatgcgt tataggcatc
17760ttcccttgaa ggatgatctt gttgctgcca atctggaggt gcggcagccg caggcagatg
17820cgatctcagc gcaacttgcg gcaaaacatc tcactcacct gaaaaccact agcgagtctc
17880gcgatcagac gaaggccttt tacttaacga cacaatatcc gatgtctgca tcacaggcgt
17940cgctatccca gtcaatacta aagcggtgca ggaactaaag attactgatg acttaggcgt
18000gccacgaggc ctgagacgac gcgcgtagac agttttttga aatcattatc aaagtgatgg
18060cctccgctga agcctatcac ctctgcgccg gtctgtcgga gagatgggca agcattatta
18120cggtcttcgc gcccgtacat gcattggacg attgcagggt caatggatct gagatcatcc
18180agaggattgc cgcccttacc ttccgtttcg agttggagcc agcccctaaa tgagacgaca
18240tagtcgactt gatgtgacaa tgccaagaga gagatttgct taacccgatt tttttgctca
18300agcgtaagcc tattgaagct tgccggcatg acgtccgcgc cgaaagaata tcctacaagt
18360aaaacattct gcacaccgaa atgcttggtg tagacatcga ttatgtgacc aagatcctta
18420gcagtttcgc ttggggaccg ctccgaccag aaataccgaa gtgaactgac gccaatgaca
18480ggaatccctt ccgtctgcag ataggtaccg gctggctagc gaattgacat gaggttgccc
18540cgtattcagt gtcgctgatt tgtattgtct gaagttgttt ttacgttaag ttgatgcaga
18600tcaattaata cgatacctgc gtcataattg attatttgac gtggtttgat ggcctccacg
18660cacgttgtga tatgtagatg ataatcatta tcactttacg ggtcctttcc ggtgatccga
18720caggttacgg ggcggcgacc tcgcgggttt tcgctattta tgaaaatttt ccggtttaag
18780gcgtttccgt tcttcttcgt cataacttaa tgtttttatt taaaataccc tctgaaaaga
18840aaggaaacga caggtgctga aagcgagctt tttggcctct gtcgtttcct ttctctgttt
18900ttgtccgtgg aatgaacaat ggaaggatct tctcggcggc gatcacgacg ccggccctgc
18960ggagccttcg ccgcgtgcgc gattcatggc ggccgtggag gccaaggatt tcgcgcgagt
19020gcaagagctg atcgaggcgc gtggagccaa gtcggcggct gattatgtcc ttgcgcagct
19080cgccgtggcc gaaggtctgg accgcaagcc tggtgcgcgc gtcgtggtcg ggaaagcggc
19140gggcagcatg gcaatgccgc ctgcggcgct gggttttacg ccaaggggag aagcggcata
19200cgccatcgag cggtcagcct atggtgagcc gaggtccagc attgcgaagc agtaccagca
19260ggaatggaac cggaaggcgg cgacctggtg ggcgatggcc ggtgtggccg gcatcatcgg
19320cgcgatcctg gcggcggcgg caaccggctt tgttgggctg gcagtgtcga tccgcaaccg
19380agtgaagcgc gtgcgcgacc tgttggtgat ggagccgggt gcagagccat aagcggcaag
19440agacgaaagc ccggtttccg ggcttttgtt ttgttacgcc aaggacgagt tttagcggct
19500aaaggtgttg acgtgcgaga aatgtttagc taaacttctc tcatgtgctg gcggctgtca
19560ccgctatgtt caaccaaggc gcggagcaaa ttatgggtgt tatccatgaa gaaacggctt
19620accgaaagcc agttccagga ggcgatccag gggctggaag tggggcagca gaccatcgag
19680atagcgcggg gcgtcttagt cgatgggaag ccacaggcga cgttcgcaac gtcgctggga
19740ctgaccaggg gcgcagtgtc gcaagcggtg catcgcgtgt gggccgcgtt cgaggacaag
19800aacttgcccg aggggtacgc gcgggtaacg gcggttctgc cggaacatca ggcgtacatc
19860gtccggaagt gggaagcgga cgccaagaaa aaacaggaaa ccaaacgatg aaaactttgg
19920tcacggccaa ccagaaaggc ggcgtcggca agacttcgac ccttgtgcat cttgccttcg
19980actttttcga gcgcggcttg cgggttgccg tgatcgacct ggacccccag ggcaatgcgt
20040cctacacgct caaggacttt gctaccggcc tgcatgcaag caagctgttc ggcgctgtcc
20100ctgccggcgg ctggaccgaa accgcacccg cagccggcga cggccaggcc gcgcgcctcg
20160ccctcatcga gtccaacccg gtactggcga acgccgaacg gctgtcgctg gacgacgccc
20220gcgagctgtt cggggcgaac atcaaggccc tggcgaacca aggcttcgac gtgtgcctga
20280tcgacacggc cccgaccctt ggcgtcggcc tggcggccgc cctcttcgcg gccgactatg
20340tgctgtcccc catcgagctt gaggcgtaca gcatccaggg catcaagaag atggtcacga
20400ccattgcgaa cgtgcgccag aagaacgcca agctgcaatt ccttggcatg gtgcccagca
20460aggtcgatgc gcggaatccg cgccacgcgc gccaccaagc cgagctgctg gccgcgtacc
20520ccaagatgat gattccggcc accgttggcc tgcgcagcag catcgccgat gccctcgcat
20580ccggtgtgcc ggtctggaag atcaagaaaa cggccgcgcg caaggcatcg aaagaggttc
20640gcgccctggc tgattacgtg ttcacgaaga tggagatttc ccaatgactg cggctcaagc
20700caagaccacc aagaaaaaca ccgctgcggc cgctcaggaa gccgcaggcg cggcgcagcc
20760gtccggcctg gggttggata gcatcggcga cctgtcgagc ctcctggacg ctcctgcggc
20820gtctcagggc ggttccggcc ctatcgagct ggacctggac ctgatcgacg aagatccgca
20880tcagccgcgg acggccgaca accccggctt ttccccggag agcatcgcgg aaatcggtgc
20940cacgatcaaa gagcgcgggg tgaagtcacc catttcggtg cgcgagaacc aggagcagcc
21000gggccgctat atcatcaatc acggcgcccg ccgctaccgt ggctcgaatc tagtgatatt
21060ccacaaaaca gcagggaagc agcgcttttc cgctgcataa ccctgcttcg gggtcattat
21120agcgattttt tcggtatatc catccttttt cgcacgatat acaggatttt gccaaagggt
21180tcgtgtagac tttccttggt gtatccaacg gcgtcagccg ggcaggatag gtgaagtagg
21240cccacccgcg agcgggtgtt ccttcttcac tgtcccttat tcgcacctgg cggtgctcaa
21300cgggaatcct gctctgcgag gctggccggc taccgccggc gtaacagatg agggcaagcg
21360gatggctgat gaaaccaagc caaccaggaa gggcagccca cctatcaagg tgtactgcct
21420tccagacgaa cgaagagcga ttgaggaaaa ggcggcggcg gccggcatga gcctgtcggc
21480ctacctgctg gccgtcggcc agggctacaa aatcacgggc gtcgtggact atgagcacgt
21540ccgcgagctg gcccgcatca atggcgacct gggccgcctg ggcggcctgc tgaaactctg
21600gctcaccgac gacccgcgca cggcgcggtt cggtgatgcc acgatcctcg ccctgctggc
21660gaagatcgaa gagaagcagg acgagcttgg caaggtcatg atgggcgtgg tccgcccgag
21720ggcagagcca tgactttttt agccgctaaa acggccgggg ggtgcgcgtg attgccaagc
21780acgtccccat gcgctccatc aagaagagcg acttcgcgga gctggtgaag tacatcaccg
21840acgagcaagg caagaccgag cgccagatcc aaaacaactg tcaaagcgca cccgcccgat
21900gccattcgcg gcacggcttc cgttgaggat gtcgatatga tgcgcgagcc gacggcccgc
21960agagaagggg ccgttttagc ggctaaagaa ggaagtgcaa gccctaaccc ttggcgtcag
22020agccttccac gcagcttttt tcgggtgtcg tcgccccatt tctttacgat aaacgcctta
22080tgtgacggca aaaccacact gatgcgttcg tatccgggcg gcacgctgct cttgaaagga
22140tgacccgcaa tctccgcgag tgcctcgcgg tcaaggtcgg tggactccag gagaagaggt
22200aggggagttt ccagggcgtc ggcaatggcc tccatcacct tcaacgaggg gttggcctta
22260ccgttggtta agtctgataa aaacgaaatt gaaacccctg ccctctccga cagctcatgt
22320ttcgtcatgc cccgctcatc gagcagacga aggatgttgg tgaaaaatat ctggttgtac
22380acagcggaag ccgcccctcg cacctttggt cgcggcccgc aaaattttag ccgctaaagt
22440tcttgacagc ggaaccaatg tttagctaaa ctagagtctc ctttctcaag gagactttcg
22500atatgagcca taatcagttc cagtttatcg gtaatcttac ccgtgacacc gaggtacgtc
22560atggcaattc taacaagccg caagcaattt tcgatatagc ggttaatgaa gagtggcgca
22620acgatgccgg cgacaagcag gagcgcaccg acttcttccg catcaagtgt tttggctctc
22680aggccgaggc ccacggcaag tatttgggca aggggtcgct ggtattcgtg cagggcaaga
22740ttcggaatac caagtacgag aaggacggcc agacggtcta cgggaccgac ttcattgccg
22800ataaggtgga ttatctggac accaaggcac caggcgggtc aaatcaggaa taagggcaca
22860ttgccccggc gtgagtcggg gcaatcccgc aaggagggtg aatgaatcgg acgtttgacc
22920ggaaggcata caggcaagaa ctgatcgacg cggggttttc cgccgaggat gccgaaacca
22980tcgcaagccg caccgtcatg cgtgcgcccc gcgaaacctt ccagtccgtc ggctcgatgg
23040tccagcaagc tacggccaag atcgagcgcg acagcgtgca actggctccc cctgccctgc
23100ccgcgccatc ggccgccgtg gagcgttcgc gtcgtctcga acaggaggcg gcaggtttgg
23160cgaagtcgat gaccatcgac acgcgaggaa ctatgacgac caagaagcga aaaaccgccg
23220gcgaggacct ggcaaaacag gtcagcgagg ccaagcaggc cgcgttgctg aaacacacga
23280agcagcagat caaggaaatg cagctttcct tgttcgatat tgcgccgtgg ccggacacga
23340tgcgagcgat gccaaacgac acggcccgct ctgccctgtt caccacgcgc aacaagaaaa
23400tcccgcgcga ggcgctgcaa aacaaggtca ttttccacgt caacaaggac gtgaagatca
23460cctacaccgg cgtcgagctg cgggccgacg atgacgaact ggtgtggcag caggtgttgg
23520agtacgcgaa gcgcacccct atcggcgagc cgatcacctt cacgttctac gagctttgcc
23580aggacctggg ctggtcgatc aatggccggt attacacgaa ggccgaggaa tgcctgtcgc
23640gcctacaggc gacggcgatg ggcttcacgt ccgaccgcgt tgggcacctg gaatcggtgt
23700cgctgctgca ccgcttccgc gtcctggacc gtggcaagaa aacgtcccgt tgccaggtcc
23760tgatcgacga ggaaatcgtc gtgctgtttg ctggcgacca ctacacgaaa ttcatatggg
23820agaagtaccg caagctgtcg ccgacggccc gacggatgtt cgactatttc agctcgcacc
23880gggagccgta cccgctcaag ctggaaacct tccgcctcat gtgcggatcg gattccaccc
23940gcgtgaagaa gtggcgcgag caggtcggcg aagcctgcga agagttgcga ggcagcggcc
24000tggtggaaca cgcctgggtc aatgatgacc tggtgcattg caaacgctag ggccttgtgg
24060ggtcagttcc ggctgggggt tcagcagcca gcgctttact ggcatttcag gaacaagcgg
24120gcactgctcg acgcacttgc ttcgctcagt atcgctcggg acgcacggcg cgctctacga
24180actgccgata aacagaggat taaaattgac aattgtgatt aaggctcaga ttcgacggct
24240tggagcggcc gacgtgcagg atttccgcga gatccgattg tcggccctga agaaagctcc
24300agagatgttc gggtccgttt acgagcacga ggagaaaaag cccatggagg cgttcgctga
24360acggttgcga gatgccgtgg cattcggcgc ctacatcgac ggcgagatca ttgggctgtc
24420ggtcttcaaa caggaggacg gccccaagga cgctcacaag gcgcatctgt ccggcgtttt
24480cgtggagccc gaacagcgag gccgaggggt cgccggtatg ctgctgcggg cgttgccggc
24540gggtttattg ctcgtgatga tcgtccgaca gattccaacg ggaatctggt ggatgcgcat
24600cttcatcctc ggcgcactta atatttcgct attctggagc ttgttgttta tttcggtcta
24660ccgcctgccg ggcgggtcgc ggcgacggta ggcgctgtgc agccgctgat ggtcgtgttc
24720atctctgccg ctctgctagg tagcccgata cgattgatgg cggtcctggg ggctatttgc
24780ggaactgcgg gcgtggcgct gttggtgttg acaccaaacg cagcgctaga tcctgtcggc
24840gtcgcagcgg gcctggcggg ggcggtttcc atggcgttcg gaaccgtgct gacccgcaag
24900tggcaacctc ccgtgcctct gctcaccttt accgcctggc aactggcggc cggaggactt
24960ctgctcgttc cagtagcttt agtgtttgat ccgccaatcc cgatgcctac aggaaccaat
25020gttctcggct gctcgactgc acgaatacca gcgacccctt gcccaaatac ttgccgtggg
25080cctcggcctg agagccaaaa cacttgatgc ggaagaagtc ggtgcgctcc tgcttgtcgc
25140cggcatcgtt gcgccacatc taggtactaa aacaattcat ccagtaaaat ataatatttt
25200attttctccc aatcaggctt gatccccagt aagtcaaaaa atagctcgac atactgttct
25260tccccgatat cctccctgat cgaccggacg cagaaggcaa tgtcatacca cttgtccgcc
25320ctgccgcttc tcccaagatc aataaagcca cttactttgc catctttcac aaagatgttg
25380ctgtctccca ggtcgccgtg ggaaaagaca agttcctctt cgggcttttc cgtctttaaa
25440aaatcataca gctcgcgcgg atctttaaat ggagtgtctt cttcccagtt ttcgcaatcc
25500acatcggcca gatcgttatt cagtaagtaa tccaattcgg ctaagcggct gtctaagcta
25560ttcgtatagg gacaatccga tatgtcgatg gagtgaaaga gcctgatgca ctccgcatac
25620agctcgataa tcttttcagg gctttgttca tcttcatact cttccgagca aaggacgcca
25680tcggcctcac tcatgagcag attgctccag ccatcatgcc gttcaaagtg caggaccttt
25740ggaacaggca gctttccttc cagccatagc atcatgtcct tttcccgttc cacatcatag
25800gtggtccctt tataccggct gtccgtcatt tttaaatata ggttttcatt ttctcccacc
25860agcttatata ccttagcagg agacattcct tccgtatctt ttacgcagcg gtatttttcg
25920atcagttttt tcaattccgg tgatattctc attttagcca tttattattt ccttcctctt
25980ttctacagta tttaaagata ccccaagaag ctaattataa caagacgaac tccaattcac
26040tgttccttgc attctaaaac cttaaatacc agaaaacagc tttttcaaag ttgttttcaa
26100agttggcgta taacatagta tcgattcgat agcgtggact caaggctctc gcgaatggct
26160cgcgttggaa actttcattg acacttgagg ggcaccgcag ggaaattctc gtccttgcga
26220gaaccggcta tgtcgtgctg cgcatcgagc ctgcgccctt ggcttgtctc gcccctctcc
26280gcgtcgctac ggggcttcca gcgcctttcc gacgctcacc gggctggttg ccctcgccgc
26340tgggctggcg gccgtctatg gccctgcaaa cgcgccagaa acgccgtcga agccgtgtgc
26400gagacaccgc ggccgccggc gttgtggata cctcgcggaa aacttggccc tcactgacag
26460atgaggggcg gacgttgaca cttgaggggc cgactcaccc ggcgcggcgt tgacagatga
26520ggggcaggct cgatttcggc cggcgacgtg gagctggcca gcctcgcaaa tcggcgaaaa
26580cgcctgattt tacgcgagtt tcccacagat gatgtggaca agcctgggga taagtgccct
26640gcggtattga cacttgaggg gcgcgactac tgacagatga ggggcgcgat ccttgacact
26700tgaggggcag agtgctgaca gatgaggggc gcacctattg acatttgagg ggctgtccac
26760aggcagaaaa tccagcattt gcaagggttt ccgcccgttt ttcggccacc gctaacctgt
26820cttttaacct gcttttaaac caatatttat aaaccttgtt tttaaccagg gctgcgccct
26880gtgcgcgtga ccgcgcacgc cgaagggggg tgccccccct tctcgaaccc tcccggcccg
26940ctaacgcggg cctcccatcc ccccaggggc tgcgcccctc ggccgcgaac ggcctcaccc
27000caaaaatggc agcgccagcc atttattatt gaagcattta tcagggttat tgtctcatga
27060gcggatacat atttgaatgt atttagaaaa ataaacaaat aggggttccg cgcacatttc
27120cccgaaaagt gccacctgac gtctaagaaa ccattattat catgacatta acctataaaa
27180ataggcgtat cacgaggccc tttcgtcttc aagaattggt cgacgatctt gctgcgttcg
27240gatattttcg tggagttccc gccacagacc cggattgaag gcgagatcca gcaactcgcg
27300ccagatcatc ctgtgacgga actttggcgc gtgatgactg gccaggacgt cggccgaaag
27360agcgacaagc agatcacgct tttcgacagc gtcggatttg cgatcgagga tttttcggcg
27420ctgcgctacg tccgcgaccg cgttgaggga tcaagccaca gcagcccact cgaccttcta
27480gccgacccag acgagccaag ggatcttttt ggaatgctgc tccgtcgtca ggctttccga
27540cgtttgggtg gttgaacaga agtcattatc gcacggaatg ccaagcactc ccgaggggaa
27600ccctgtggtt ggcatgcaca tacaaatgga cgaacggata aaccttttca cgccctttta
27660aatatccgat tattctaata aacgctcttt tctcttaggt ttacccgcca atatatcctg
27720tcaaacactg atagtttaaa ctgaaggcgg gaaacgacaa tctgatcatg agcggagaat
27780taagggagtc acgttatgac ccccgccgat gacgcgggac aagccgtttt acgtttggaa
27840ctgacagaac cgcaacgttg aaggagccac tcagc
27875
SEQ ID NO: 7; length 14222; DNA; Artificial; pLC40GWHvG1
aagcttgcgg ccgcttcgaa gatgttaatt
aacatcggta ccgagctcta gggataacag 60ggtaatagct cgaattctag cttgcatgcc
tgcagtgcag cgtgacccgg tcgtgcccct 120ctctagagat aatgagcatt gcatgtctaa
gttataaaaa attaccacat attttttttg 180tcacacttgt ttgaagtgca gtttatctat
ctttatacat atatttaaac tttactctac 240gaataatata atctatagta ctacaataat
atcagtgttt tagagaatca tataaatgaa 300cagttagaca tggtctaaag gacaattgag
tattttgaca acaggactct acagttttat 360ctttttagtg tgcatgtgtt ctcctttttt
tttgcaaata gcttcaccta tataatactt 420catccatttt attagtacat ccatttaggg
tttagggtta atggttttta tagactaatt 480tttttagtac atctatttta ttctatttta
gcctctaaat taagaaaact aaaactctat 540tttagttttt ttatttaata atttagatat
aaaatagaat aaaataaagt gactaaaaat 600taaacaaata ccctttaaga aattaaaaaa
actaaggaaa catttttctt gtttcgagta 660gataatgcca gcctgttaaa cgccgtcgac
gagtctaacg gacaccaacc agcgaaccag 720cagcgtcgcg tcgggccaag cgaagcagac
ggcacggcat ctctgtcgct gcctctggac 780ccctctcgag agttccgctc caccgttgga
cttgctccgc tgtcggcatc cagaaattgc 840gtggcggagc ggcagacgtg agccggcacg
gcaggcggcc tcctcctcct ctcacggcac 900cggcagctac gggggattcc tttcccaccg
ctccttcgct ttcccttcct cgcccgccgt 960aataaataga caccccctcc acaccctctt
tccccaacct cgtgttgttc ggagcgcaca 1020cacacacaac cagatctccc ccaaatccac
ccgtcggcac ctccgcttca aggtacgccg 1080ctcgtcctcc cccccccccc ctctctacct
tctctagatc ggcgttccgg tccatggtta 1140gggcccggta gttctacttc tgttcatgtt
tgtgttagat ccgtgtttgt gttagatccg 1200tgctgctagc gttcgtacac ggatgcgacc
tgtacgtcag acacgttctg attgctaact 1260tgccagtgtt tctctttggg gaatcctggg
atggctctag ccgttccgca gacgggatcg 1320atttcatgat tttttttgtt tcgttgcata
gggtttggtt tgcccttttc ctttatttca 1380atatatgccg tgcacttgtt tgtcgggtca
tcttttcatg cttttttttg tcttggttgt 1440gatgatgtgg tctggttggg cggtcgttct
agatcggagt agaattctgt ttcaaactac 1500ctggtggatt tattaatttt ggatctgtat
gtgtgtgcca tacatattca tagttacgaa 1560ttgaagatga tggatggaaa tatcgatcta
ggataggtat acatgttgat gcgggtttta 1620ctgatgcata tacagagatg ctttttgttc
gcttggttgt gatgatgtgg tgtggttggg 1680cggtcgttca ttcgttctag atcggagtag
aatactgttt caaactacct ggtgtattta 1740ttaattttgg aactgtatgt gtgtgtcata
catcttcata gttacgagtt taagatggat 1800ggaaatatcg atctaggata ggtatacatg
ttgatgtggg ttttactgat gcatatacat 1860gatggcatat gcagcatcta ttcatatgct
ctaaccttga gtacctatct attataataa 1920acaagtatgt tttataatta ttttgatctt
gatatacttg gatgatggca tatgcagcag 1980ctatatgtgg atttttttag ccctgccttc
atacgctatt tatttgcttg gtactgtttc 2040ttttgtcgat gctcaccctg ttgtttggtg
ttacttctgc aggtcgactc tagaggatca 2100tcacaagttt gtacaaaaaa gcaggctcaa
tgagatatga aaaagcctga actcaccgcg 2160acgtctgtcg agaagtttct gatcgaaaag
ttcgacagcg tctccgacct gatgcagctc 2220tcggagggcg aagaatctcg tgctttcagc
ttcgatgtag gagggcgtgg atatgtcctg 2280cgggtaaata gctgcgccga tggtttctac
aaagatcgtt atgtttatcg gcactttgca 2340tcggccgcgc tcccgattcc ggaagtgctt
gacattgggg aattcagcga gagcctgacc 2400tattgcatct cccgccgtgc acagggtgtc
acgttgcaag acctgcctga aaccgaactg 2460cccgctgttc tgcagccggt cgcggaggcc
atggatgcga tcgctgcggc cgatcttagc 2520cagacgagcg ggttcggccc attcggaccg
caaggaatcg gtcaatacac tacatggcgt 2580gatttcatat gcgcgattgc tgatccccat
gtgtatcact ggcaaactgt gatggacgac 2640accgtcagtg cgtccgtcgc gcaggctctc
gatgagctga tgctttgggc cgaggactgc 2700cccgaagtcc ggcacctcgt gcacgcggat
ttcggctcca acaatgtcct gacggacaat 2760ggccgcataa cagcggtcat tgactggagc
gaggcgatgt tcggggattc ccaatacgag 2820gtcgccaaca tcttcttctg gaggccgtgg
ttggcttgta tggagcagca gacgcgctac 2880ttcgagcgga ggcatccgga gcttgcagga
tcgccgcggc tccgggcgta tatgctccgc 2940attggtcttg accaactcta tcagagcttg
gttgacggca atttcgatga tgcagcttgg 3000gcgcagggtc gatgcgacgc aatcgtccga
tccggagccg ggactgtcgg gcgtacacaa 3060atcgcccgca gaagcgcggc cgtctggacc
gatggctgtg tagaagtact cgccgatagt 3120ggaaaccgac gccccagcac tcgtccgagg
gcaaaggaat agacccagct ttcttgtaca 3180aagtggtgat gatccgtcga cctgcagatc
gttcaaacat ttggcaataa agtttcttaa 3240gattgaatcc tgttgccggt cttgcgatga
ttatcatata atttctgttg aattacgtta 3300agcatgtaat aattaacatg taatgcatga
cgttatttat gagatgggtt tttatgatta 3360gagtcccgca attatacatt taatacgcga
tagaaaacaa aatatagcgc gcaaactagg 3420ataaattatc gcgcgcggtg tcatctatgt
tactagatcc gatgataagc tgtcaaacat 3480gagaattcag tacattaaaa acgtccgcaa
tgtgttatta agttgtctaa gcgtcaattt 3540gtttacacca caatatatcc tgccaccagc
cagccaacag ctccccgacc ggcagctcgg 3600cacaaaatca ccactcgata caggcagccc
atcagtccgg gacggcgtca gcgggagagc 3660cgttgtaagg cggcagactt tgctcatgtt
accgatgcta ttcggaagaa cggcaactaa 3720gctgccgggt ttgaaacacg gatgatctcg
cggagggtag catgttgatt gtaacgatga 3780cagagcgttg ctgcctgtga tcaaatatca
tctccctcgc agagatccga attatcagcc 3840ttcttattca tttctcgctt aaccgtgaca
ggctgtcgat cttgagaact atgccgacat 3900aataggaaat cgctggataa agccgctgag
gaagctgagt ggcgctattt ctttagaagt 3960gaacgttgac gatcgtcgac cgtaccccga
tgaattaatt cggacgtacg ttctgaacac 4020aggactagtg taacctcgaa gcgtttcact
tgtaacaacg attgagaact tttgtcataa 4080aattgaaata cttggttcgc attttcgtca
tccgcggtca gccgcaattc tgacgaactg 4140cccatttagc tggagatgat tgtacatcct
tcacgtgaaa atttctcaag cgctgtgaac 4200aagggttcag attttagatt gaaaggtgag
ccgttgaaac acgttcttct tatcgatgac 4260gatgtcgcta tgcggcatct tattatcgaa
taccttacga tccacgcctt caaagtgacc 4320gcggtagccg acagcaccca gttcactaga
gtactctctt ccgcgacggt cgatgtcgtg 4380gttgttgatc taaatttagg tcgtgaagat
gggcttgaga tcgttcgaaa tctggcggca 4440aagtctgata ttccaatcat aattatcagt
ggcgaccgcc ttgaggagac ggataaagtt 4500gttgcactcg agctaggagc aagtgatttt
atcgctaagc cgtttagtac gagagagttt 4560cttgcacgca ttcgggttgc cttgcgcgtg
cgccccaacg ttgtccgctc caaagaccga 4620cggtcttttt gttttactga ctggacactt
aatctcaggc aacgtcgctt gatgtccgaa 4680gctggcggtg aggtgaaact tacggcaggt
gagttcaatc ttctcctcgc gtttttagag 4740aaaccccgcg acgttctatc gcgcgagcaa
cttctcattg ccagtcgagt acgcgacgag 4800gaggtttacg acaggagtat agatgttctc
attttgcggc tgcgccgcaa acttgaggcg 4860gatccgtcaa gccctcaact gataaaaaca
gcaagaggtg ccggttattt ctttgacgcg 4920gacgtgcagg tttcgcacgg ggggacgatg
gcagcctgag ccaattgcat ttggctctta 4980attatctggc tcaaaaggtg actgaggacg
cggccagcgg cctcaaacct acactcaata 5040tttggtgagg ggttccgata ggtactagtc
ctggatactt acttgggcga ttgtcataca 5100tgacatcaac aatgtacccg tttgtgtaac
cgtctcttgg aggttcgtat gacactaggt 5160cgctacctta ggaccgttat agttactagc
gaattgacat gaggttgccc cgtattcagt 5220gtcgctgatt tgtattgtct gaagttgttt
ttacgttaag ttgatgcaga tcaattaata 5280cgatacctgc gtcataattg attatttgac
gtggtttgat ggcctccacg cacgttgtga 5340tatgtagatg ataatcatta tcactttacg
ggtcctttcc ggtgatccga caggttacgg 5400ggcggcgacc tcgcgggttt tcgctattta
tgaaaatttt ccggtttaag gcgtttccgt 5460tcttcttcgt cataacttaa tgtttttatt
taaaataccc tctgaaaaga aaggaaacga 5520caggtgctga aagcgagctt tttggcctct
gtcgtttcct ttctctgttt ttgtccgtgg 5580aatgaacaat ggaaggatct tctcggcggc
gatcacgacg ccggccctgc ggagccttcg 5640ccgcgtgcgc gattcatggc ggccgtggag
gccaaggatt tcgcgcgagt gcaagagctg 5700atcgaggcgc gtggagccaa gtcggcggct
gattatgtcc ttgcgcagct cgccgtggcc 5760gaaggtctgg accgcaagcc tggtgcgcgc
gtcgtggtcg ggaaagcggc gggcagcatg 5820gcaatgccgc ctgcggcgct gggttttacg
ccaaggggag aagcggcata cgccatcgag 5880cggtcagcct atggtgagcc gaggtccagc
attgcgaagc agtaccagca ggaatggaac 5940cggaaggcgg cgacctggtg ggcgatggcc
ggtgtggccg gcatcatcgg cgcgatcctg 6000gcggcggcgg caaccggctt tgttgggctg
gcagtgtcga tccgcaaccg agtgaagcgc 6060gtgcgcgacc tgttggtgat ggagccgggt
gcagagccat aagcggcaag agacgaaagc 6120ccggtttccg ggcttttgtt ttgttacgcc
aaggacgagt tttagcggct aaaggtgttg 6180acgtgcgaga aatgtttagc taaacttctc
tcatgtgctg gcggctgtca ccgctatgtt 6240caaccaaggc gcggagcaaa ttatgggtgt
tatccatgaa gaaacggctt accgaaagcc 6300agttccagga ggcgatccag gggctggaag
tggggcagca gaccatcgag atagcgcggg 6360gcgtcttagt cgatgggaag ccacaggcga
cgttcgcaac gtcgctggga ctgaccaggg 6420gcgcagtgtc gcaagcggtg catcgcgtgt
gggccgcgtt cgaggacaag aacttgcccg 6480aggggtacgc gcgggtaacg gcggttctgc
cggaacatca ggcgtacatc gtccggaagt 6540gggaagcgga cgccaagaaa aaacaggaaa
ccaaacgatg aaaactttgg tcacggccaa 6600ccagaaaggc ggcgtcggca agacttcgac
ccttgtgcat cttgccttcg actttttcga 6660gcgcggcttg cgggttgccg tgatcgacct
ggacccccag ggcaatgcgt cctacacgct 6720caaggacttt gctaccggcc tgcatgcaag
caagctgttc ggcgctgtcc ctgccggcgg 6780ctggaccgaa accgcacccg cagccggcga
cggccaggcc gcgcgcctcg ccctcatcga 6840gtccaacccg gtactggcga acgccgaacg
gctgtcgctg gacgacgccc gcgagctgtt 6900cggggcgaac atcaaggccc tggcgaacca
aggcttcgac gtgtgcctga tcgacacggc 6960cccgaccctt ggcgtcggcc tggcggccgc
cctcttcgcg gccgactatg tgctgtcccc 7020catcgagctt gaggcgtaca gcatccaggg
catcaagaag atggtcacga ccattgcgaa 7080cgtgcgccag aagaacgcca agctgcaatt
ccttggcatg gtgcccagca aggtcgatgc 7140gcggaatccg cgccacgcgc gccaccaagc
cgagctgctg gccgcgtacc ccaagatgat 7200gattccggcc accgttggcc tgcgcagcag
catcgccgat gccctcgcat ccggtgtgcc 7260ggtctggaag atcaagaaaa cggccgcgcg
caaggcatcg aaagaggttc gcgccctggc 7320tgattacgtg ttcacgaaga tggagatttc
ccaatgactg cggctcaagc caagaccacc 7380aagaaaaaca ccgctgcggc cgctcaggaa
gccgcaggcg cggcgcagcc gtccggcctg 7440gggttggata gcatcggcga cctgtcgagc
ctcctggacg ctcctgcggc gtctcagggc 7500ggttccggcc ctatcgagct ggacctggac
ctgatcgacg aagatccgca tcagccgcgg 7560acggccgaca accccggctt ttccccggag
agcatcgcgg aaatcggtgc cacgatcaaa 7620gagcgcgggg tgaagtcacc catttcggtg
cgcgagaacc aggagcagcc gggccgctat 7680atcatcaatc acggcgcccg ccgctaccgt
ggctcgaatc tagtgatatt ccacaaaaca 7740gcagggaagc agcgcttttc cgctgcataa
ccctgcttcg gggtcattat agcgattttt 7800tcggtatatc catccttttt cgcacgatat
acaggatttt gccaaagggt tcgtgtagac 7860tttccttggt gtatccaacg gcgtcagccg
ggcaggatag gtgaagtagg cccacccgcg 7920agcgggtgtt ccttcttcac tgtcccttat
tcgcacctgg cggtgctcaa cgggaatcct 7980gctctgcgag gctggccggc taccgccggc
gtaacagatg agggcaagcg gatggctgat 8040gaaaccaagc caaccaggaa gggcagccca
cctatcaagg tgtactgcct tccagacgaa 8100cgaagagcga ttgaggaaaa ggcggcggcg
gccggcatga gcctgtcggc ctacctgctg 8160gccgtcggcc agggctacaa aatcacgggc
gtcgtggact atgagcacgt ccgcgagctg 8220gcccgcatca atggcgacct gggccgcctg
ggcggcctgc tgaaactctg gctcaccgac 8280gacccgcgca cggcgcggtt cggtgatgcc
acgatcctcg ccctgctggc gaagatcgaa 8340gagaagcagg acgagcttgg caaggtcatg
atgggcgtgg tccgcccgag ggcagagcca 8400tgactttttt agccgctaaa acggccgggg
ggtgcgcgtg attgccaagc acgtccccat 8460gcgctccatc aagaagagcg acttcgcgga
gctggtgaag tacatcaccg acgagcaagg 8520caagaccgag cgccagatcc aaaacaactg
tcaaagcgca cccgcccgat gccattcgcg 8580gcacggcttc cgttgaggat gtcgatatga
tgcgcgagcc gacggcccgc agagaagggg 8640ccgttttagc ggctaaagaa ggaagtgcaa
gccctaaccc ttggcgtcag agccttccac 8700gcagcttttt tcgggtgtcg tcgccccatt
tctttacgat aaacgcctta tgtgacggca 8760aaaccacact gatgcgttcg tatccgggcg
gcacgctgct cttgaaagga tgacccgcaa 8820tctccgcgag tgcctcgcgg tcaaggtcgg
tggactccag gagaagaggt aggggagttt 8880ccagggcgtc ggcaatggcc tccatcacct
tcaacgaggg gttggcctta ccgttggtta 8940agtctgataa aaacgaaatt gaaacccctg
ccctctccga cagctcatgt ttcgtcatgc 9000cccgctcatc gagcagacga aggatgttgg
tgaaaaatat ctggttgtac acagcggaag 9060ccgcccctcg cacctttggt cgcggcccgc
aaaattttag ccgctaaagt tcttgacagc 9120ggaaccaatg tttagctaaa ctagagtctc
ctttctcaag gagactttcg atatgagcca 9180taatcagttc cagtttatcg gtaatcttac
ccgtgacacc gaggtacgtc atggcaattc 9240taacaagccg caagcaattt tcgatatagc
ggttaatgaa gagtggcgca acgatgccgg 9300cgacaagcag gagcgcaccg acttcttccg
catcaagtgt tttggctctc aggccgaggc 9360ccacggcaag tatttgggca aggggtcgct
ggtattcgtg cagggcaaga ttcggaatac 9420caagtacgag aaggacggcc agacggtcta
cgggaccgac ttcattgccg ataaggtgga 9480ttatctggac accaaggcac caggcgggtc
aaatcaggaa taagggcaca ttgccccggc 9540gtgagtcggg gcaatcccgc aaggagggtg
aatgaatcgg acgtttgacc ggaaggcata 9600caggcaagaa ctgatcgacg cggggttttc
cgccgaggat gccgaaacca tcgcaagccg 9660caccgtcatg cgtgcgcccc gcgaaacctt
ccagtccgtc ggctcgatgg tccagcaagc 9720tacggccaag atcgagcgcg acagcgtgca
actggctccc cctgccctgc ccgcgccatc 9780ggccgccgtg gagcgttcgc gtcgtctcga
acaggaggcg gcaggtttgg cgaagtcgat 9840gaccatcgac acgcgaggaa ctatgacgac
caagaagcga aaaaccgccg gcgaggacct 9900ggcaaaacag gtcagcgagg ccaagcaggc
cgcgttgctg aaacacacga agcagcagat 9960caaggaaatg cagctttcct tgttcgatat
tgcgccgtgg ccggacacga tgcgagcgat 10020gccaaacgac acggcccgct ctgccctgtt
caccacgcgc aacaagaaaa tcccgcgcga 10080ggcgctgcaa aacaaggtca ttttccacgt
caacaaggac gtgaagatca cctacaccgg 10140cgtcgagctg cgggccgacg atgacgaact
ggtgtggcag caggtgttgg agtacgcgaa 10200gcgcacccct atcggcgagc cgatcacctt
cacgttctac gagctttgcc aggacctggg 10260ctggtcgatc aatggccggt attacacgaa
ggccgaggaa tgcctgtcgc gcctacaggc 10320gacggcgatg ggcttcacgt ccgaccgcgt
tgggcacctg gaatcggtgt cgctgctgca 10380ccgcttccgc gtcctggacc gtggcaagaa
aacgtcccgt tgccaggtcc tgatcgacga 10440ggaaatcgtc gtgctgtttg ctggcgacca
ctacacgaaa ttcatatggg agaagtaccg 10500caagctgtcg ccgacggccc gacggatgtt
cgactatttc agctcgcacc gggagccgta 10560cccgctcaag ctggaaacct tccgcctcat
gtgcggatcg gattccaccc gcgtgaagaa 10620gtggcgcgag caggtcggcg aagcctgcga
agagttgcga ggcagcggcc tggtggaaca 10680cgcctgggtc aatgatgacc tggtgcattg
caaacgctag ggccttgtgg ggtcagttcc 10740ggctgggggt tcagcagcca gcgctttact
ggcatttcag gaacaagcgg gcactgctcg 10800acgcacttgc ttcgctcagt atcgctcggg
acgcacggcg cgctctacga actgccgata 10860aacagaggat taaaattgac aattgtgatt
aaggctcaga ttcgacggct tggagcggcc 10920gacgtgcagg atttccgcga gatccgattg
tcggccctga agaaagctcc agagatgttc 10980gggtccgttt acgagcacga ggagaaaaag
cccatggagg cgttcgctga acggttgcga 11040gatgccgtgg cattcggcgc ctacatcgac
ggcgagatca ttgggctgtc ggtcttcaaa 11100caggaggacg gccccaagga cgctcacaag
gcgcatctgt ccggcgtttt cgtggagccc 11160gaacagcgag gccgaggggt cgccggtatg
ctgctgcggg cgttgccggc gggtttattg 11220ctcgtgatga tcgtccgaca gattccaacg
ggaatctggt ggatgcgcat cttcatcctc 11280ggcgcactta atatttcgct attctggagc
ttgttgttta tttcggtcta ccgcctgccg 11340ggcgggtcgc ggcgacggta ggcgctgtgc
agccgctgat ggtcgtgttc atctctgccg 11400ctctgctagg tagcccgata cgattgatgg
cggtcctggg ggctatttgc ggaactgcgg 11460gcgtggcgct gttggtgttg acaccaaacg
cagcgctaga tcctgtcggc gtcgcagcgg 11520gcctggcggg ggcggtttcc atggcgttcg
gaaccgtgct gacccgcaag tggcaacctc 11580ccgtgcctct gctcaccttt accgcctggc
aactggcggc cggaggactt ctgctcgttc 11640cagtagcttt agtgtttgat ccgccaatcc
cgatgcctac aggaaccaat gttctcggct 11700gctcgactgc acgaatacca gcgacccctt
gcccaaatac ttgccgtggg cctcggcctg 11760agagccaaaa cacttgatgc ggaagaagtc
ggtgcgctcc tgcttgtcgc cggcatcgtt 11820gcgccacatc taggtactaa aacaattcat
ccagtaaaat ataatatttt attttctccc 11880aatcaggctt gatccccagt aagtcaaaaa
atagctcgac atactgttct tccccgatat 11940cctccctgat cgaccggacg cagaaggcaa
tgtcatacca cttgtccgcc ctgccgcttc 12000tcccaagatc aataaagcca cttactttgc
catctttcac aaagatgttg ctgtctccca 12060ggtcgccgtg ggaaaagaca agttcctctt
cgggcttttc cgtctttaaa aaatcataca 12120gctcgcgcgg atctttaaat ggagtgtctt
cttcccagtt ttcgcaatcc acatcggcca 12180gatcgttatt cagtaagtaa tccaattcgg
ctaagcggct gtctaagcta ttcgtatagg 12240gacaatccga tatgtcgatg gagtgaaaga
gcctgatgca ctccgcatac agctcgataa 12300tcttttcagg gctttgttca tcttcatact
cttccgagca aaggacgcca tcggcctcac 12360tcatgagcag attgctccag ccatcatgcc
gttcaaagtg caggaccttt ggaacaggca 12420gctttccttc cagccatagc atcatgtcct
tttcccgttc cacatcatag gtggtccctt 12480tataccggct gtccgtcatt tttaaatata
ggttttcatt ttctcccacc agcttatata 12540ccttagcagg agacattcct tccgtatctt
ttacgcagcg gtatttttcg atcagttttt 12600tcaattccgg tgatattctc attttagcca
tttattattt ccttcctctt ttctacagta 12660tttaaagata ccccaagaag ctaattataa
caagacgaac tccaattcac tgttccttgc 12720attctaaaac cttaaatacc agaaaacagc
tttttcaaag ttgttttcaa agttggcgta 12780taacatagta tcgattcgat agcgtggact
caaggctctc gcgaatggct cgcgttggaa 12840actttcattg acacttgagg ggcaccgcag
ggaaattctc gtccttgcga gaaccggcta 12900tgtcgtgctg cgcatcgagc ctgcgccctt
ggcttgtctc gcccctctcc gcgtcgctac 12960ggggcttcca gcgcctttcc gacgctcacc
gggctggttg ccctcgccgc tgggctggcg 13020gccgtctatg gccctgcaaa cgcgccagaa
acgccgtcga agccgtgtgc gagacaccgc 13080ggccgccggc gttgtggata cctcgcggaa
aacttggccc tcactgacag atgaggggcg 13140gacgttgaca cttgaggggc cgactcaccc
ggcgcggcgt tgacagatga ggggcaggct 13200cgatttcggc cggcgacgtg gagctggcca
gcctcgcaaa tcggcgaaaa cgcctgattt 13260tacgcgagtt tcccacagat gatgtggaca
agcctgggga taagtgccct gcggtattga 13320cacttgaggg gcgcgactac tgacagatga
ggggcgcgat ccttgacact tgaggggcag 13380agtgctgaca gatgaggggc gcacctattg
acatttgagg ggctgtccac aggcagaaaa 13440tccagcattt gcaagggttt ccgcccgttt
ttcggccacc gctaacctgt cttttaacct 13500gcttttaaac caatatttat aaaccttgtt
tttaaccagg gctgcgccct gtgcgcgtga 13560ccgcgcacgc cgaagggggg tgccccccct
tctcgaaccc tcccggcccg ctaacgcggg 13620cctcccatcc ccccaggggc tgcgcccctc
ggccgcgaac ggcctcaccc caaaaatggc 13680agcgccagcc aggacgtcgg ccgaaagagc
gacaagcaga tcacgctttt cgacagcgtc 13740ggatttgcga tcgaggattt ttcggcgctg
cgctacgtcc gcgaccgcgt tgagggatca 13800agccacagca gcccactcga ccttctagcc
gacccagacg agccaaggga tctttttgga 13860atgctgctcc gtcgtcaggc tttccgacgt
ttgggtggtt gaacagaagt cattatcgca 13920cggaatgcca agcactcccg aggggaaccc
tgtggttggc atgcacatac aaatggacga 13980acggataaac cttttcacgc ccttttaaat
atccgattat tctaataaac gctcttttct 14040cttaggttta cccgccaata tatcctgtca
aacactgata gtttaaactg aaggcgggaa 14100acgacaatct gatcatgagc ggagaattaa
gggagtcacg ttatgacccc cgccgatgac 14160gcgggacaag ccgttttacg tttggaactg
acagaaccgc aacgttgaag gagccactca 14220gc 14222
SEQ ID NO: 8; length 4531; DNA; Artificial; pVGW
tcgaccatag gcgatctcct taatcaatag tagctgtaac ctcgaagcgt ttcacttgta
60acaacgattg agaacttttg tcataaaatt gaaatacttg gttcgcattt tcgtcatccg
120cggtcagccg caattctgac gaactgccca tttagctgga gatgattgta catccttcac
180gtgaaaattt ctcaagcgct gtgaacaagg gttcagattt tagattgaaa ggtgagccgt
240tgaaacacgt tcttcttatc gatgacgatg tcgctatgcg gcatcttatt atcgaatacc
300ttacgatcca cgccttcaaa gtgaccgcgg tagccgacag cacccagttc actagagtac
360tctcttccgc gacggtcgat gtcgtggttg ttgatctaga tttaggtcgt gaagatgggc
420ttgagatcgt tcgaaatctg gcggcaaagt ctgatattcc aatcataatt atcagtggcg
480accgccttga ggagacggat aaagttgttg cactcgagct aggagcaagt gattttatcg
540ctaagccgtt tagtacgaga gagtttcttg cacgcattcg ggttgccttg cgcgtgcgcc
600ccaacgttgt ccgctccaaa gaccgacggt ctttttgttt tactgactgg acacttaatc
660tcaggcaacg tcgcttgatg tccgaagctg gcggtgaggt gaaacttacg gcaggtgagt
720tcaatcttct cctcgcgttt ttagagaaac cccgcgacgt tctatcgcgc gagcaacttc
780tcattgccag tcgagtacgc gacgaggagg tttacgacag gagtatagat gttctcattt
840tgcggctgcg ccgcaaactt gaggcggatc cgtcaagccc tcaactgata aaaacagcaa
900gaggtgccgg ttatttcttt gacgcggacg tgcaggtttc gcacgggggg acgatggcag
960cctgagccaa ttgcatttgg ctcttaatta tctggctcaa aaggtgactg aggacgcggc
1020cagcggcctc aaacctacac tcaatatttg gtgaggggtt ccgataggtc cctcttcacc
1080tgcatggcat gtttaaccga atctgacgtt ttccctgcaa atgccaaaat actatgccta
1140tctccgggtt tcgcgtgacg gccaagaccc ggaaaaccaa aaatacggtt tgctcgaata
1200cgcgaacgcc aaaggcttcg cgccgctaca gatcgaggaa gaaattgcca gcagagcaaa
1260ggactggcgc aagcgcaagc tcggagcaat catcgaaaag gccgagcgtg gcgacgtgct
1320actgacgccg gagattacgc gcattgccgg ttccgccctc gccgccttgg aaattctcaa
1380agcggcgagc gagcgcggcc taatcgtcca tgtgaccaaa cagaagatca tcatggacgg
1440cagcctacaa agcgacatca tggcaaccgt gcttggcttg gctgcacaga tcgagcggca
1500tttcattcag gcacgtacca ccgaggcgct acaagtcgcc agagagcgcg gcaagacgct
1560cgggcgaccc aagggcagca aatcgagcgc cttgaagctg gacagccgta ttgatgaagt
1620acaggcatac gtgaaccttg gcttgccgca aagtcgcgca gccgagttgt taggcgtcag
1680ccctcacacc ttgcgcctgt tcatcaaacg ccggaacatc aaacccacaa acactagacc
1740aaccatcacc atgccgggga gggaacaaca tgcctaagaa caacaaagcc cccggccatc
1800gtatcaacga gatcatcaag acgagcctcg cgctcgaaat ggaggatgcc cgcgaagctg
1860gcttagtcgg ctacatggcc cgttgccttg tgcaagcgac catgccccac accgacccca
1920agaccagcta ctttgagcgc accaatggca tcgtcacctt gtcgatcatg ggcaagccga
1980gcatcggcct gccctacggt tctatgccgc gcaccttgct tgcttggata tgcaccgagg
2040ccgtgcgaac gaaagacccc gtgttgaacc ttggccggtc gcaatcggaa tttctacaaa
2100ggctcggaat gcacaccgat ggccgttaca cggccaccct tcgcaatcag gcgcaacgcc
2160tgttttcatc catgatttcg cttgccggcg agcaaggcaa tgacttcggc attgagaacg
2220tcgtcattgc caagcgcgct tttctattct ggaatcccaa gcggccagaa gatcgggcgc
2280tatgggatag caccctcacc ctcacaggcg atttcttcga ggaagtcacc cgctcaccgg
2340ttcctatccg aatcgactac ctgcatgcct tgcggcagtc tccgcttgcg atggacattt
2400acacgtggct gacctatcgc gtgttcctgt tgcgggccaa gggccgcccc ttcgtgcaaa
2460tcccttgggt cgccctgcaa gcgcaattcg gctcatccta tggcagccgc gcacgcaact
2520cgcccgaact ggacgataag gcccgagagc gggcagagcg ggcagcactc gccagcttca
2580aatacaactt caaaaagcgc ctacgcgaag tgttgattgt ctatcccgag gcaagcgact
2640gcatcgaaga tgacggcgaa tgcctgcgca tcaaatccac acgcctgcat gtcacccgcg
2700cacccggcaa gggcgctcgc atcggccccc ctccgacttg accaggccaa cgctacgctt
2760ggcttggtca agccttccca tccaacagcc cgccgtcgag cgggcttttt tatccccgga
2820agcctgtgga tagagggtag ttatccacgt gaaaccgcta atgccccgca aagccttgat
2880tcacggggct ttccggcccg ctccaaaaac tatccacgtg aaatcgctaa tcagggtacg
2940tgaaatcgct aatcggagta cgtgaaatcg ctaataaggt cacgtgaaat cgctaatcaa
3000aaaggcacgt gagaacgcta atagcccttt cagatcaaca gcttgcaaac acccctcgct
3060ccggcaagta gttacagcaa gtagtatgtt caattagctt ttcaattatg aatatatata
3120tcaattattg gtcgcccttg gcttgtggac aatgcgctac gcgcaccggc tccgcccgtg
3180gacaaccgca agcggttgcc caccgtcgag cgcctttgcc cacaacccgg cggccgcaac
3240agatcgtttt ataaattttt ttttttgaaa aagaaaaagc ccgaaaggcg gcaacctctc
3300gggcttctgg atttccgatc aacgcaggag tcgttcggaa agtagctgtt ccagaattat
3360aggcgcagag acaccagatt ccaagatggc tctgttaaat tgttgtagta tgtagtatca
3420tacaacatac tacagtacag aggcccgcaa gaatggcaat cactaaacaa gacatttggc
3480gagcagccga cgaactggac gccgaaggca tccggcccac tttggccgcc gtgcgcaaga
3540aactcggaag cggtagcttc acaaccattt ccgatgcaat ggctgaatgg aaaaaccgca
3600agaccgccac cctgccctca tcagacccat tgccggttgc agtcaacgag catcttgccg
3660agcttggcaa tgcgctatgg gctatcgccc tggcgcacgc caacgcccgg tttgacgaag
3720atcggaaaca gatcgaggcc gacaaagcgg ccatcagcca gcagcttgcc gaagcaatcg
3780aactagccga caccttcacc cgcgaaaacg accagctccg cgaacgagta gatccttcat
3840ggcttgttat gactgttttt ttgtacagtc tatgcctcgg gcatccaagc agcaagcgcg
3900ttacgccgtg ggtcgatgtt tgatgttatg gagcagcaac gatgttacgc agcagcaacg
3960atgttacgca gcagggcagt cgccctaaaa caaagttagg tggctcaagt atgggcatca
4020ttcgcacatg taggctcggc cctgaccaag tcaaatccat gcgggctgct cttgatcttt
4080tcggtcgtga gttcggagac gtagccacct actcccaaca tcagccggac tccgattacc
4140tcgggaactt gctccgtagt aagacattca tcgcgcttgc tgccttcgac caagaagcgg
4200ttgttggcgc tctcgcggct tacgttctgc ccaagtttga gcagccgcgt agtgagatct
4260atatctatga tctcgcagtc tccggcgagc accggaggca gggcattgcc accgcgctca
4320tcaatctcct caagcatgag gccaacgcgc ttggtgctta tgtgatctac gtgcaagcag
4380attacggtga cgatcccgca gtggctctct atacaaagtt gggcatacgg gaagaagtga
4440tgcactttga tatcgaccca agtaccgcca cctaacaatt cgttcaagcc gagatcggct
4500tcccggccgc ggagttgttc ggtaaattgt c
4531
SEQ ID NO: 9; length 12; DNA; Artificial; cos sequence
aggtcgccgc cc
SEQ ID NO: 10; length 10; DNA; Artificial; Pac linker
gttaattaac
SEQ ID NO: 11; length 24; DNA; Artificial; primer for amplification
gttaatttct tgtgatcgaa ggac
SEQ ID NO: 12; length 24; DNA; Artificial; primer for amplification
gggattcttt atgctgggtt tagg
SEQ ID NO: 13; length 24; DNA; Artificial; primer for amplification
gcaagcaata cctctgttat gctg
SEQ ID NO: 14; length 26; DNA; Artificial; primer for amplification
gttttcagat ggcgacctca gctttg
SEQ ID NO: 15; length 24; DNA; Artificial; primer for amplification
caggtggctt tattcctcct ctca
SEQ ID NO: 16; length 24; DNA; Artificial; primer for amplification
ccgaaagttc gtgggcaatg ccta
SEQ ID NO: 17; length 24; DNA; Artificial; primer for amplification
gccatcctta gcatatgagt ggca
SEQ ID NO: 18; length 24; DNA; Artificial; primer for amplification
ggctatttac gtggcatgtt acgt
SEQ ID NO: 19; length 24; DNA; Artificial; primer for amplification
tcgtaagtct acttcccttt acga
SEQ ID NO: 20; length 24; DNA; Artificial; primer for amplification
ccaaaccaca tccttatagt gtgc
SEQ ID NO: 21; length 22; DNA; Artificial; primer for amplification
cctcattgca tgcggtcact ac
SEQ ID NO: 22; length 24; DNA; Artificial; primer for amplification
gcagggtatt aatcgatcaa cacc
SEQ ID NO: 23; length 28; DNA; Artificial; primer for amplification
agctttcgaa tagggataac agggtaat
SEQ ID NO: 24; length 28; DNA; Artificial; primer for amplification
agctattacc ctgttatccc tattcgaa
SEQ ID NO: 25; length 31; DNA; Artificial; primer for amplification
ctagtaacta taacggtcct aaggtagcga c
SEQ ID NO: 26; length 31; DNA; Artificial; primer for amplification
ctaggtcgct accttaggac cgttatagtt a
SEQ ID NO: 27; length 49; DNA; Artificial; primer for amplification
ggggacaagt ttgtacaaaa aagcaggctc aatgagatat gaaaaagcc
SEQ ID NO: 28; length 52; DNA; Artificial; primer for amplification
ggggaccact ttgtacaaga aagctgggtc tattcctttg ccctcggacg ag
SEQ ID NO: 29; length 49; DNA; Artificial; primer for amplification
ggggacaagt ttgtacaaaa aagcaggctc catggaccca gaacgacgc
SEQ ID NO: 30; length 49; DNA; Artificial; primer for amplification
ggggaccact ttgtacaaga aagctgggtt cctagacgcg tgagatcag
SEQ ID NO: 31; length 32; DNA; Artificial; primer for amplification
tcgttcgaat cgatactatg ttatacgcca ac
SEQ ID NO: 32; length 29; DNA; Artificial; primer for amplification
atcgtcgact gcacgaatac cagcgaccc
SEQ ID NO: 33; length 31; DNA; Artificial; primer for amplification
gggggatcct tccattgttc attccacgga c
SEQ ID NO: 34; length 30; DNA; Artificial; primer for amplification
gggcaattga catgaggttg ccccgtattc
SEQ ID NO: 35; length 29; DNA; Artificial; primer for amplification
gatatcgata gcgtggactc aaggctctc
SEQ ID NO: 36; length 41; DNA; Artificial; primer for amplification
aaagaattcg ctagccagct ggcgctgcca tttttggggt g
SEQ ID NO: 37; length 29; DNA; Artificial; primer for amplification
aaactcgagc agccgagaac attggttcc
SEQ ID NO: 38; length 36; DNA; Artificial; primer for amplification
taggaattcg gatccaaaac aactgtcaaa gcgcac
SEQ ID NO: 39; length 28; DNA; Artificial; primer for amplification
cgtagatctg gcgctcggtc ttgccttg
SEQ ID NO: 40; length 37; DNA; Artificial; primer for amplification
tgtgaattca ctagtgatat tccacaaaac agcaggg
SEQ ID NO: 41; length 28; DNA; Artificial; primer for amplification
ccgtctagat tcgagccacg gtagcggc
SEQ ID NO: 42; length 34; DNA; Artificial; primer for amplification
cttgaattca gatcttctcg gcggcgatca cgac
SEQ ID NO: 43; length 12; DNA; Artificial; pSwaI linker
ccatttaaat gg
SEQ ID NO: 44; length 18; DNA; Artificial; SwaIKpnIRV
ccatttaaat ggtaccgg
SEQ ID NO: 45; length 18; DNA; Artificial; SwaIKpnIFW
ccggtaccat ttaaatgg
SEQ ID NO: 46; length 32; DNA; Artificial; primer for amplification
tcaatacccg gggtaacctc gaagcgtttc ac
SEQ ID NO: 47; length 31; DNA; Artificial; primer for amplification
tggtgacccg ggacctatcg gaacccctca c
SEQ ID NO: 48; length 32; DNA; Artificial; primer for amplification
tcaataacta gtgtaacctc gaagcgtttc ac
SEQ ID NO: 49; length 31; DNA; Artificial; primer for amplification
tggtgaacta gtacctatcg gaacccctca c
SEQ ID NO: 50; length 21; DNA; Artificial; primer for amplification
cttgagatcg ttcggaatct g
SEQ ID NO: 51; length 21; DNA; Artificial; primer for amplification
cagattccga acgatctcaa g
SEQ ID NO: 52; length 29; DNA; Artificial; primer for amplification
aaaatgcatg gcatgtttaa cagaatctg
SEQ ID NO: 53; length 27; DNA; Artificial; primer for amplification
tttagatcta ctcgttcgcg gagctgg
SEQ ID NO: 54; length 29; DNA; Artificial; primer for amplification
aaaggatcct tcatggcttg ttatgactg
SEQ ID NO: 55; length 30; DNA; Artificial; primer for amplification
tgcctcgaga caatttaccg aacaactccg
SEQ ID NO: 56; length 23; DNA; Artificial; primer for amplification
cgacctaaat ctagatcaac aac
SEQ ID NO: 57; length 23; DNA; Artificial; primer for amplification
gttgttgatc tagatttagg tcg
SEQ ID NO: 58; length 29; DNA; Artificial; primer for amplification
tttgtcgacc ataggcgatc tccttaatc
SEQ ID NO: 59; length 28; DNA; Artificial; primer for amplification
aaactgcagg tgaagaggga cctatcgg
SEQ ID NO: 60; length 24; DNA; Artificial; primer for amplification
ctgaaggcgg gaaacgacaa tctg
SEQ ID NO: 61; length 24; DNA; Artificial; primer for amplification
gcttgctgag tggctccttc aacg
SEQ ID NO: 62; length 24; DNA; Artificial; primer for amplification
aactgcactt caaacaagtg tgac
SEQ ID NO: 63; length 20; DNA; Artificial; primer for amplification
tatgtcctgc gggtaaatag
SEQ ID NO: 64; length 19; DNA; Artificial; primer for amplification
ttgttggagc cgaaatccg
SEQ ID NO: 65; length 14120; DNA; Artificial; pLC40GWHkorB
aagcttgcgg ccgcttcgaa gatgttaatt
aacatcggta ccgagctcta gggataacag 60ggtaatagct cgaattctag cttgcatgcc
tgcagtgcag cgtgacccgg tcgtgcccct 120ctctagagat aatgagcatt gcatgtctaa
gttataaaaa attaccacat attttttttg 180tcacacttgt ttgaagtgca gtttatctat
ctttatacat atatttaaac tttactctac 240gaataatata atctatagta ctacaataat
atcagtgttt tagagaatca tataaatgaa 300cagttagaca tggtctaaag gacaattgag
tattttgaca acaggactct acagttttat 360ctttttagtg tgcatgtgtt ctcctttttt
tttgcaaata gcttcaccta tataatactt 420catccatttt attagtacat ccatttaggg
tttagggtta atggttttta tagactaatt 480tttttagtac atctatttta ttctatttta
gcctctaaat taagaaaact aaaactctat 540tttagttttt ttatttaata atttagatat
aaaatagaat aaaataaagt gactaaaaat 600taaacaaata ccctttaaga aattaaaaaa
actaaggaaa catttttctt gtttcgagta 660gataatgcca gcctgttaaa cgccgtcgac
gagtctaacg gacaccaacc agcgaaccag 720cagcgtcgcg tcgggccaag cgaagcagac
ggcacggcat ctctgtcgct gcctctggac 780ccctctcgag agttccgctc caccgttgga
cttgctccgc tgtcggcatc cagaaattgc 840gtggcggagc ggcagacgtg agccggcacg
gcaggcggcc tcctcctcct ctcacggcac 900cggcagctac gggggattcc tttcccaccg
ctccttcgct ttcccttcct cgcccgccgt 960aataaataga caccccctcc acaccctctt
tccccaacct cgtgttgttc ggagcgcaca 1020cacacacaac cagatctccc ccaaatccac
ccgtcggcac ctccgcttca aggtacgccg 1080ctcgtcctcc cccccccccc ctctctacct
tctctagatc ggcgttccgg tccatggtta 1140gggcccggta gttctacttc tgttcatgtt
tgtgttagat ccgtgtttgt gttagatccg 1200tgctgctagc gttcgtacac ggatgcgacc
tgtacgtcag acacgttctg attgctaact 1260tgccagtgtt tctctttggg gaatcctggg
atggctctag ccgttccgca gacgggatcg 1320atttcatgat tttttttgtt tcgttgcata
gggtttggtt tgcccttttc ctttatttca 1380atatatgccg tgcacttgtt tgtcgggtca
tcttttcatg cttttttttg tcttggttgt 1440gatgatgtgg tctggttggg cggtcgttct
agatcggagt agaattctgt ttcaaactac 1500ctggtggatt tattaatttt ggatctgtat
gtgtgtgcca tacatattca tagttacgaa 1560ttgaagatga tggatggaaa tatcgatcta
ggataggtat acatgttgat gcgggtttta 1620ctgatgcata tacagagatg ctttttgttc
gcttggttgt gatgatgtgg tgtggttggg 1680cggtcgttca ttcgttctag atcggagtag
aatactgttt caaactacct ggtgtattta 1740ttaattttgg aactgtatgt gtgtgtcata
catcttcata gttacgagtt taagatggat 1800ggaaatatcg atctaggata ggtatacatg
ttgatgtggg ttttactgat gcatatacat 1860gatggcatat gcagcatcta ttcatatgct
ctaaccttga gtacctatct attataataa 1920acaagtatgt tttataatta ttttgatctt
gatatacttg gatgatggca tatgcagcag 1980ctatatgtgg atttttttag ccctgccttc
atacgctatt tatttgcttg gtactgtttc 2040ttttgtcgat gctcaccctg ttgtttggtg
ttacttctgc aggtcgactc tagaggatca 2100tcacaagttt gtacaaaaaa gcaggctcaa
tgagatatga aaaagcctga actcaccgcg 2160acgtctgtcg agaagtttct gatcgaaaag
ttcgacagcg tctccgacct gatgcagctc 2220tcggagggcg aagaatctcg tgctttcagc
ttcgatgtag gagggcgtgg atatgtcctg 2280cgggtaaata gctgcgccga tggtttctac
aaagatcgtt atgtttatcg gcactttgca 2340tcggccgcgc tcccgattcc ggaagtgctt
gacattgggg aattcagcga gagcctgacc 2400tattgcatct cccgccgtgc acagggtgtc
acgttgcaag acctgcctga aaccgaactg 2460cccgctgttc tgcagccggt cgcggaggcc
atggatgcga tcgctgcggc cgatcttagc 2520cagacgagcg ggttcggccc attcggaccg
caaggaatcg gtcaatacac tacatggcgt 2580gatttcatat gcgcgattgc tgatccccat
gtgtatcact ggcaaactgt gatggacgac 2640accgtcagtg cgtccgtcgc gcaggctctc
gatgagctga tgctttgggc cgaggactgc 2700cccgaagtcc ggcacctcgt gcacgcggat
ttcggctcca acaatgtcct gacggacaat 2760ggccgcataa cagcggtcat tgactggagc
gaggcgatgt tcggggattc ccaatacgag 2820gtcgccaaca tcttcttctg gaggccgtgg
ttggcttgta tggagcagca gacgcgctac 2880ttcgagcgga ggcatccgga gcttgcagga
tcgccgcggc tccgggcgta tatgctccgc 2940attggtcttg accaactcta tcagagcttg
gttgacggca atttcgatga tgcagcttgg 3000gcgcagggtc gatgcgacgc aatcgtccga
tccggagccg ggactgtcgg gcgtacacaa 3060atcgcccgca gaagcgcggc cgtctggacc
gatggctgtg tagaagtact cgccgatagt 3120ggaaaccgac gccccagcac tcgtccgagg
gcaaaggaat agacccagct ttcttgtaca 3180aagtggtgat gatccgtcga cctgcagatc
gttcaaacat ttggcaataa agtttcttaa 3240gattgaatcc tgttgccggt cttgcgatga
ttatcatata atttctgttg aattacgtta 3300agcatgtaat aattaacatg taatgcatga
cgttatttat gagatgggtt tttatgatta 3360gagtcccgca attatacatt taatacgcga
tagaaaacaa aatatagcgc gcaaactagg 3420ataaattatc gcgcgcggtg tcatctatgt
tactagatcc gatgataagc tgtcaaacat 3480gagaattcag tacattaaaa acgtccgcaa
tgtgttatta agttgtctaa gcgtcaattt 3540gtttacacca caatatatcc tgccaccagc
cagccaacag ctccccgacc ggcagctcgg 3600cacaaaatca ccactcgata caggcagccc
atcagtccgg gacggcgtca gcgggagagc 3660cgttgtaagg cggcagactt tgctcatgtt
accgatgcta ttcggaagaa cggcaactaa 3720gctgccgggt ttgaaacacg gatgatctcg
cggagggtag catgttgatt gtaacgatga 3780cagagcgttg ctgcctgtga tcaaatatca
tctccctcgc agagatccga attatcagcc 3840ttcttattca tttctcgctt aaccgtgaca
ggctgtcgat cttgagaact atgccgacat 3900aataggaaat cgctggataa agccgctgag
gaagctgagt ggcgctattt ctttagaagt 3960gaacgttgac gatcgtcgac cgtaccccga
tgaattaatt cggacgtacg ttctgaacac 4020agctggatac ttacttgggc gattgtcata
catgacatca acaatgtacc cgtttgtgta 4080accgtctctt ggaggttcgt atgacactag
gtcgctacct taggaccgtt atagttacta 4140gcgaattgac atgaggttgc cccgtattca
gtgtcgctga tttgtattgt ctgaagttgt 4200ttttacgtta agttgatgca gatcaattaa
tacgatacct gcgtcataat tgattatttg 4260acgtggtttg atggcctcca cgcacgttgt
gatatgtaga tgataatcat tatcacttta 4320cgggtccttt ccggtgatcc gacaggttac
ggggcggcga cctcgcgggt tttcgctatt 4380tatgaaaatt ttccggttta aggcgtttcc
gttcttcttc gtcataactt aatgttttta 4440tttaaaatac cctctgaaaa gaaaggaaac
gacaggtgct gaaagcgagc tttttggcct 4500ctgtcgtttc ctttctctgt ttttgtccgt
ggaatgaaca atggaaggat cttctcggcg 4560gcgatcacga cgccggccct gcggagcctt
cgccgcgtgc gcgattcatg gcggccgtgg 4620aggccaagga tttcgcgcga gtgcaagagc
tgatcgaggc gcgtggagcc aagtcggcgg 4680ctgattatgt ccttgcgcag ctcgccgtgg
ccgaaggtct ggaccgcaag cctggtgcgc 4740gcgtcgtggt cgggaaagcg gcgggcagca
tggcaatgcc gcctgcggcg ctgggtttta 4800cgccaagggg agaagcggca tacgccatcg
agcggtcagc ctatggtgag ccgaggtcca 4860gcattgcgaa gcagtaccag caggaatgga
accggaaggc ggcgacctgg tgggcgatgg 4920ccggtgtggc cggcatcatc ggcgcgatcc
tggcggcggc ggcaaccggc tttgttgggc 4980tggcagtgtc gatccgcaac cgagtgaagc
gcgtgcgcga cctgttggtg atggagccgg 5040gtgcagagcc ataagcggca agagacgaaa
gcccggtttc cgggcttttg ttttgttacg 5100ccaaggacga gttttagcgg ctaaaggtgt
tgacgtgcga gaaatgttta gctaaacttc 5160tctcatgtgc tggcggctgt caccgctatg
ttcaaccaag gcgcggagca aattatgggt 5220gttatccatg aagaaacggc ttaccgaaag
ccagttccag gaggcgatcc aggggctgga 5280agtggggcag cagaccatcg agatagcgcg
gggcgtctta gtcgatggga agccacaggc 5340gacgttcgca acgtcgctgg gactgaccag
gggcgcagtg tcgcaagcgg tgcatcgcgt 5400gtgggccgcg ttcgaggaca agaacttgcc
cgaggggtac gcgcgggtaa cggcggttct 5460gccggaacat caggcgtaca tcgtccggaa
gtgggaagcg gacgccaaga aaaaacagga 5520aaccaaacga tgaaaacttt ggtcacggcc
aaccagaaag gcggcgtcgg caagacttcg 5580acccttgtgc atcttgcctt cgactttttc
gagcgcggct tgcgggttgc cgtgatcgac 5640ctggaccccc agggcaatgc gtcctacacg
ctcaaggact ttgctaccgg cctgcatgca 5700agcaagctgt tcggcgctgt ccctgccggc
ggctggaccg aaaccgcacc cgcagccggc 5760gacggccagg ccgcgcgcct cgccctcatc
gagtccaacc cggtactggc gaacgccgaa 5820cggctgtcgc tggacgacgc ccgcgagctg
ttcggggcga acatcaaggc cctggcgaac 5880caaggcttcg acgtgtgcct gatcgacacg
gccccgaccc ttggcgtcgg cctggcggcc 5940gccctcttcg cggccgacta tgtgctgtcc
cccatcgagc ttgaggcgta cagcatccag 6000ggcatcaaga agatggtcac gaccattgcg
aacgtgcgcc agaagaacgc caagctgcaa 6060ttccttggca tggtgcccag caaggtcgat
gcgcggaatc cgcgccacgc gcgccaccaa 6120gccgagctgc tggccgcgta ccccaagatg
atgattccgg ccaccgttgg cctgcgcagc 6180agcatcgccg atgccctcgc atccggtgtg
ccggtctgga agatcaagaa aacggccgcg 6240cgcaaggcat cgaaagaggt tcgcgccctg
gctgattacg tgttcacgaa gatggagatt 6300tcccaatgac tgcggctcaa gccaagacca
ccaagaaaaa caccgctgcg gccgctcagg 6360aagccgcagg cgcggcgcag ccgtccggcc
tggggttgga tagcatcggc gacctgtcga 6420gcctcctgga cgctcctgcg gcgtctcagg
gcggttccgg ccctatcgag ctggacctgg 6480acctgatcga cgaagatccg catcagccgc
ggacggccga caaccccggc ttttccccgg 6540agagcatcgc ggaaatcggt gccacgatca
aagagcgcgg ggtgaagtca cccatttcgg 6600tgcgcgagaa ccaggagcag ccgggccgct
atatcatcaa tcacggcgcc cgccgctacc 6660gtggctcgaa gtgggccggc aagaagtcca
tcccggcgtt catcgacaac gactacaacg 6720aagccgacca ggttatcgag aacctgcaac
gcaacgagct gaccccgcgc gaaattgccg 6780acttcattgg ccgcgagctg gcgaagggca
agaagaaagg cgatatcgcc aaggaaatcg 6840gcaagtcgcc ggcgttcatc acccagcacg
tcacgctgct ggacctgccg gagaagatcg 6900ccgatgcgtt caacaccggc cgcgtgcgcg
acgtgaccgt ggtgaacgag ctggtgacgg 6960ccttcaagaa gcgcccggag gaagtcgagg
cgtggcttga cgacgacacc caggaaatca 7020cgcgcggcac ggtcaagctg ctgcgcgagt
tcctggacga gaagggccgc gatcccaaca 7080ccgtcgatgc cttcaacggc cagactgatg
ccgagcgtga cgcggaggcc ggcgacggcc 7140aggacggcga ggacggcgac caggacggta
aggacgccaa ggaaaagggc gcgaaggagc 7200cggacccgga caagctgaaa aaggccatcg
tccaggtcga gcacgacgag cgccctgccc 7260gccttatcct caaccgtcgg ccgccggcgg
aaggctatgc ctggttgaag tacgaggacg 7320acggccagga gttcgaggcg aaccttgccg
acgtgaaact ggtcgcgctc atcgagggct 7380gatccccaaa gacagcggcg cgggccaccc
gcgccgcaca gacaacggtt ccgctacaag 7440gaggaccgaa gaatgaatcc gatgctgttc
tacatcgcgg gaggcgtagg cgcggcgttg 7500ctgctggttt ccgcgatcat gctgttcaag
ctgcgcgagc cgaagaagga acaccgaccg 7560cagcgcaagg cggcggcccc gacgccgcag
ccggtcgata acgagctgct gcgcactcta 7620gtgatattcc acaaaacagc agggaagcag
cgcttttccg ctgcataacc ctgcttcggg 7680gtcattatag cgattttttc ggtatatcca
tcctttttcg cacgatatac aggattttgc 7740caaagggttc gtgtagactt tccttggtgt
atccaacggc gtcagccggg caggataggt 7800gaagtaggcc cacccgcgag cgggtgttcc
ttcttcactg tcccttattc gcacctggcg 7860gtgctcaacg ggaatcctgc tctgcgaggc
tggccggcta ccgccggcgt aacagatgag 7920ggcaagcgga tggctgatga aaccaagcca
accaggaagg gcagcccacc tatcaaggtg 7980tactgccttc cagacgaacg aagagcgatt
gaggaaaagg cggcggcggc cggcatgagc 8040ctgtcggcct acctgctggc cgtcggccag
ggctacaaaa tcacgggcgt cgtggactat 8100gagcacgtcc gcgagctggc ccgcatcaat
ggcgacctgg gccgcctggg cggcctgctg 8160aaactctggc tcaccgacga cccgcgcacg
gcgcggttcg gtgatgccac gatcctcgcc 8220ctgctggcga agatcgaaga gaagcaggac
gagcttggca aggtcatgat gggcgtggtc 8280cgcccgaggg cagagccatg acttttttag
ccgctaaaac ggccgggggg tgcgcgtgat 8340tgccaagcac gtccccatgc gctccatcaa
gaagagcgac ttcgcggagc tggtgaagta 8400catcaccgac gagcaaggca agaccgagcg
ccagatccaa aacaactgtc aaagcgcacc 8460cgcccgatgc cattcgcggc acggcttccg
ttgaggatgt cgatatgatg cgcgagccga 8520cggcccgcag agaaggggcc gttttagcgg
ctaaagaagg aagtgcaagc cctaaccctt 8580ggcgtcagag ccttccacgc agcttttttc
gggtgtcgtc gccccatttc tttacgataa 8640acgccttatg tgacggcaaa accacactga
tgcgttcgta tccgggcggc acgctgctct 8700tgaaaggatg acccgcaatc tccgcgagtg
cctcgcggtc aaggtcggtg gactccagga 8760gaagaggtag gggagtttcc agggcgtcgg
caatggcctc catcaccttc aacgaggggt 8820tggccttacc gttggttaag tctgataaaa
acgaaattga aacccctgcc ctctccgaca 8880gctcatgttt cgtcatgccc cgctcatcga
gcagacgaag gatgttggtg aaaaatatct 8940ggttgtacac agcggaagcc gcccctcgca
cctttggtcg cggcccgcaa aattttagcc 9000gctaaagttc ttgacagcgg aaccaatgtt
tagctaaact agagtctcct ttctcaagga 9060gactttcgat atgagccata atcagttcca
gtttatcggt aatcttaccc gtgacaccga 9120ggtacgtcat ggcaattcta acaagccgca
agcaattttc gatatagcgg ttaatgaaga 9180gtggcgcaac gatgccggcg acaagcagga
gcgcaccgac ttcttccgca tcaagtgttt 9240tggctctcag gccgaggccc acggcaagta
tttgggcaag gggtcgctgg tattcgtgca 9300gggcaagatt cggaatacca agtacgagaa
ggacggccag acggtctacg ggaccgactt 9360cattgccgat aaggtggatt atctggacac
caaggcacca ggcgggtcaa atcaggaata 9420agggcacatt gccccggcgt gagtcggggc
aatcccgcaa ggagggtgaa tgaatcggac 9480gtttgaccgg aaggcataca ggcaagaact
gatcgacgcg gggttttccg ccgaggatgc 9540cgaaaccatc gcaagccgca ccgtcatgcg
tgcgccccgc gaaaccttcc agtccgtcgg 9600ctcgatggtc cagcaagcta cggccaagat
cgagcgcgac agcgtgcaac tggctccccc 9660tgccctgccc gcgccatcgg ccgccgtgga
gcgttcgcgt cgtctcgaac aggaggcggc 9720aggtttggcg aagtcgatga ccatcgacac
gcgaggaact atgacgacca agaagcgaaa 9780aaccgccggc gaggacctgg caaaacaggt
cagcgaggcc aagcaggccg cgttgctgaa 9840acacacgaag cagcagatca aggaaatgca
gctttccttg ttcgatattg cgccgtggcc 9900ggacacgatg cgagcgatgc caaacgacac
ggcccgctct gccctgttca ccacgcgcaa 9960caagaaaatc ccgcgcgagg cgctgcaaaa
caaggtcatt ttccacgtca acaaggacgt 10020gaagatcacc tacaccggcg tcgagctgcg
ggccgacgat gacgaactgg tgtggcagca 10080ggtgttggag tacgcgaagc gcacccctat
cggcgagccg atcaccttca cgttctacga 10140gctttgccag gacctgggct ggtcgatcaa
tggccggtat tacacgaagg ccgaggaatg 10200cctgtcgcgc ctacaggcga cggcgatggg
cttcacgtcc gaccgcgttg ggcacctgga 10260atcggtgtcg ctgctgcacc gcttccgcgt
cctggaccgt ggcaagaaaa cgtcccgttg 10320ccaggtcctg atcgacgagg aaatcgtcgt
gctgtttgct ggcgaccact acacgaaatt 10380catatgggag aagtaccgca agctgtcgcc
gacggcccga cggatgttcg actatttcag 10440ctcgcaccgg gagccgtacc cgctcaagct
ggaaaccttc cgcctcatgt gcggatcgga 10500ttccacccgc gtgaagaagt ggcgcgagca
ggtcggcgaa gcctgcgaag agttgcgagg 10560cagcggcctg gtggaacacg cctgggtcaa
tgatgacctg gtgcattgca aacgctaggg 10620ccttgtgggg tcagttccgg ctgggggttc
agcagccagc gctttactgg catttcagga 10680acaagcgggc actgctcgac gcacttgctt
cgctcagtat cgctcgggac gcacggcgcg 10740ctctacgaac tgccgataaa cagaggatta
aaattgacaa ttgtgattaa ggctcagatt 10800cgacggcttg gagcggccga cgtgcaggat
ttccgcgaga tccgattgtc ggccctgaag 10860aaagctccag agatgttcgg gtccgtttac
gagcacgagg agaaaaagcc catggaggcg 10920ttcgctgaac ggttgcgaga tgccgtggca
ttcggcgcct acatcgacgg cgagatcatt 10980gggctgtcgg tcttcaaaca ggaggacggc
cccaaggacg ctcacaaggc gcatctgtcc 11040ggcgttttcg tggagcccga acagcgaggc
cgaggggtcg ccggtatgct gctgcgggcg 11100ttgccggcgg gtttattgct cgtgatgatc
gtccgacaga ttccaacggg aatctggtgg 11160atgcgcatct tcatcctcgg cgcacttaat
atttcgctat tctggagctt gttgtttatt 11220tcggtctacc gcctgccggg cgggtcgcgg
cgacggtagg cgctgtgcag ccgctgatgg 11280tcgtgttcat ctctgccgct ctgctaggta
gcccgatacg attgatggcg gtcctggggg 11340ctatttgcgg aactgcgggc gtggcgctgt
tggtgttgac accaaacgca gcgctagatc 11400ctgtcggcgt cgcagcgggc ctggcggggg
cggtttccat ggcgttcgga accgtgctga 11460cccgcaagtg gcaacctccc gtgcctctgc
tcacctttac cgcctggcaa ctggcggccg 11520gaggacttct gctcgttcca gtagctttag
tgtttgatcc gccaatcccg atgcctacag 11580gaaccaatgt tctcggctgc tcgactgcac
gaataccagc gaccccttgc ccaaatactt 11640gccgtgggcc tcggcctgag agccaaaaca
cttgatgcgg aagaagtcgg tgcgctcctg 11700cttgtcgccg gcatcgttgc gccacatcta
ggtactaaaa caattcatcc agtaaaatat 11760aatattttat tttctcccaa tcaggcttga
tccccagtaa gtcaaaaaat agctcgacat 11820actgttcttc cccgatatcc tccctgatcg
accggacgca gaaggcaatg tcataccact 11880tgtccgccct gccgcttctc ccaagatcaa
taaagccact tactttgcca tctttcacaa 11940agatgttgct gtctcccagg tcgccgtggg
aaaagacaag ttcctcttcg ggcttttccg 12000tctttaaaaa atcatacagc tcgcgcggat
ctttaaatgg agtgtcttct tcccagtttt 12060cgcaatccac atcggccaga tcgttattca
gtaagtaatc caattcggct aagcggctgt 12120ctaagctatt cgtataggga caatccgata
tgtcgatgga gtgaaagagc ctgatgcact 12180ccgcatacag ctcgataatc ttttcagggc
tttgttcatc ttcatactct tccgagcaaa 12240ggacgccatc ggcctcactc atgagcagat
tgctccagcc atcatgccgt tcaaagtgca 12300ggacctttgg aacaggcagc tttccttcca
gccatagcat catgtccttt tcccgttcca 12360catcataggt ggtcccttta taccggctgt
ccgtcatttt taaatatagg ttttcatttt 12420ctcccaccag cttatatacc ttagcaggag
acattccttc cgtatctttt acgcagcggt 12480atttttcgat cagttttttc aattccggtg
atattctcat tttagccatt tattatttcc 12540ttcctctttt ctacagtatt taaagatacc
ccaagaagct aattataaca agacgaactc 12600caattcactg ttccttgcat tctaaaacct
taaataccag aaaacagctt tttcaaagtt 12660gttttcaaag ttggcgtata acatagtatc
gattcgatag cgtggactca aggctctcgc 12720gaatggctcg cgttggaaac tttcattgac
acttgagggg caccgcaggg aaattctcgt 12780ccttgcgaga accggctatg tcgtgctgcg
catcgagcct gcgcccttgg cttgtctcgc 12840ccctctccgc gtcgctacgg ggcttccagc
gcctttccga cgctcaccgg gctggttgcc 12900ctcgccgctg ggctggcggc cgtctatggc
cctgcaaacg cgccagaaac gccgtcgaag 12960ccgtgtgcga gacaccgcgg ccgccggcgt
tgtggatacc tcgcggaaaa cttggccctc 13020actgacagat gaggggcgga cgttgacact
tgaggggccg actcacccgg cgcggcgttg 13080acagatgagg ggcaggctcg atttcggccg
gcgacgtgga gctggccagc ctcgcaaatc 13140ggcgaaaacg cctgatttta cgcgagtttc
ccacagatga tgtggacaag cctggggata 13200agtgccctgc ggtattgaca cttgaggggc
gcgactactg acagatgagg ggcgcgatcc 13260ttgacacttg aggggcagag tgctgacaga
tgaggggcgc acctattgac atttgagggg 13320ctgtccacag gcagaaaatc cagcatttgc
aagggtttcc gcccgttttt cggccaccgc 13380taacctgtct tttaacctgc ttttaaacca
atatttataa accttgtttt taaccagggc 13440tgcgccctgt gcgcgtgacc gcgcacgccg
aaggggggtg cccccccttc tcgaaccctc 13500ccggcccgct aacgcgggcc tcccatcccc
ccaggggctg cgcccctcgg ccgcgaacgg 13560cctcacccca aaaatggcag cgccagccag
gacgtcggcc gaaagagcga caagcagatc 13620acgcttttcg acagcgtcgg atttgcgatc
gaggattttt cggcgctgcg ctacgtccgc 13680gaccgcgttg agggatcaag ccacagcagc
ccactcgacc ttctagccga cccagacgag 13740ccaagggatc tttttggaat gctgctccgt
cgtcaggctt tccgacgttt gggtggttga 13800acagaagtca ttatcgcacg gaatgccaag
cactcccgag gggaaccctg tggttggcat 13860gcacatacaa atggacgaac ggataaacct
tttcacgccc ttttaaatat ccgattattc 13920taataaacgc tcttttctct taggtttacc
cgccaatata tcctgtcaaa cactgatagt 13980ttaaactgaa ggcgggaaac gacaatctga
tcatgagcgg agaattaagg gagtcacgtt 14040atgacccccg ccgatgacgc gggacaagcc
gttttacgtt tggaactgac agaaccgcaa 14100cgttgaagga gccactcagc
14120
SEQ ID NO: 66; Length: 14195; Type: DNA; Organism: Artificial; Other information: pLCleo
Sequence 66:
aagcttgggc ccttcgaaga tgttaattaa catcggtacc gagctctagg gataacaggg
60taatcaactt tgtataataa agttgataac agggtaatag ctcgaattct agcttgcatg
120cctgcagtgc agcgtgaccc ggtcgtgccc ctctctagag ataatgagca ttgcatgtct
180aagttataaa aaattaccac atattttttt tgtcacactt gtttgaagtg cagtttatct
240atctttatac atatatttaa actttactct acgaataata taatctatag tactacaata
300atatcagtgt tttagagaat catataaatg aacagttaga catggtctaa aggacaattg
360agtattttga caacaggact ctacagtttt atctttttag tgtgcatgtg ttctcctttt
420tttttgcaaa tagcttcacc tatataatac ttcatccatt ttattagtac atccatttag
480ggtttagggt taatggtttt tatagactaa tttttttagt acatctattt tattctattt
540tagcctctaa attaagaaaa ctaaaactct attttagttt ttttatttaa taatttagat
600ataaaataga ataaaataaa gtgactaaaa attaaacaaa taccctttaa gaaattaaaa
660aaactaagga aacatttttc ttgtttcgag tagataatgc cagcctgtta aacgccgtcg
720acgagtctaa cggacaccaa ccagcgaacc agcagcgtcg cgtcgggcca agcgaagcag
780acggcacggc atctctgtcg ctgcctctgg acccctctcg agagttccgc tccaccgttg
840gacttgctcc gctgtcggca tccagaaatt gcgtggcgga gcggcagacg tgagccggca
900cggcaggcgg cctcctcctc ctctcacggc accggcagct acgggggatt cctttcccac
960cgctccttcg ctttcccttc ctcgcccgcc gtaataaata gacaccccct ccacaccctc
1020tttccccaac ctcgtgttgt tcggagcgca cacacacaca accagatctc ccccaaatcc
1080acccgtcggc acctccgctt caaggtacgc cgctcgtcct cccccccccc ccctctctac
1140cttctctaga tcggcgttcc ggtccatggt tagggccggg tagttctact tctgttcatg
1200tttgtgttag atccgtgttt gtgttagatc cgtgctgcta gcgttcgtac acggatgcga
1260cctgtacgtc agacacgttc tgattgctaa cttgccagtg tttctctttg gggaatcctg
1320ggatggctct agccgttccg cagacgggat cgatttcatg attttttttg tttcgttgca
1380tagggtttgg tttgcccttt tcctttattt caatatatgc cgtgcacttg tttgtcgggt
1440catcttttca tgcttttttt tgtcttggtt gtgatgatgt ggtctggttg ggcggtcgtt
1500ctagatcgga gtagaattct gtttcaaact acctggtgga tttattaatt ttggatctgt
1560atgtgtgtgc catacatatt catagttacg aattgaagat gatggatgga aatatcgatc
1620taggataggt atacatgttg atgcgggttt tactgatgca tatacagaga tgctttttgt
1680tcgcttggtt gtgatgatgt ggtgtggttg ggcggtcgtt cattcgttct agatcggagt
1740agaatactgt ttcaaactac ctggtgtatt tattaatttt ggaactgtat gtgtgtgtca
1800tacatcttca tagttacgag tttaagatgg atggaaatat cgatctagga taggtataca
1860tgttgatgtg ggttttactg atgcatatac atgatggcat atgcagcatc tattcatatg
1920ctctaacctt gagtacctat ctattataat aaacaagtat gttttataat tattttgatc
1980ttgatatact tggatgatgg catatgcagc agctatatgt ggattttttt agccctgcct
2040tcatacgcta tttatttgct tggtactgtt tcttttgtcg atgctcaccc tgttgtttgg
2100tgttacttct gcaggtcgac tctagaggat catcacaagt ttgtacaaaa aagcaggctc
2160aatgagatat gaaaaagcct gaactcaccg cgacgtctgt cgagaagttt ctgatcgaaa
2220agttcgacag cgtctccgac ctgatgcagc tctcggaggg cgaagaatct cgtgctttca
2280gcttcgatgt aggagggcgt ggatatgtcc tgcgggtaaa tagctgcgcc gatggtttct
2340acaaagatcg ttatgtttat cggcactttg catcggccgc gctcccgatt ccggaagtgc
2400ttgacattgg ggaattcagc gagagcctga cctattgcat ctcccgccgt gcacagggtg
2460tcacgttgca agacctgcct gaaaccgaac tgcccgctgt tctgcagccg gtcgcggagg
2520ccatggatgc gatcgctgcg gccgatctta gccagacgag cgggttcggc ccattcggac
2580cgcaaggaat cggtcaatac actacatggc gtgatttcat atgcgcgatt gctgatcccc
2640atgtgtatca ctggcaaact gtgatggacg acaccgtcag tgcgtccgtc gcgcaggctc
2700tcgatgagct gatgctttgg gccgaggact gccccgaagt ccggcacctc gtgcacgcgg
2760atttcggctc caacaatgtc ctgacggaca atggccgcat aacagcggtc attgactgga
2820gcgaggcgat gttcggggat tcccaatacg aggtcgccaa catcttcttc tggaggccgt
2880ggttggcttg tatggagcag cagacgcgct acttcgagcg gaggcatccg gagcttgcag
2940gatcgccgcg gctccgggcg tatatgctcc gcattggtct tgaccaactc tatcagagct
3000tggttgacgg caatttcgat gatgcagctt gggcgcaggg tcgatgcgac gcaatcgtcc
3060gatccggagc cgggactgtc gggcgtacac aaatcgcccg cagaagcgcg gccgtctgga
3120ccgatggctg tgtagaagta ctcgccgata gtggaaaccg acgccccagc actcgtccga
3180gggcaaagga atagacccag ctttcttgta caaagtggtg atgatccgtc gacctgcaga
3240tcgttcaaac atttggcaat aaagtttctt aagattgaat cctgttgccg gtcttgcgat
3300gattatcata taatttctgt tgaattacgt taagcatgta ataattaaca tgtaatgcat
3360gacgttattt atgagatggg tttttatgat tagagtcccg caattataca tttaatacgc
3420gatagaaaac aaaatatagc gcgcaaacta ggataaatta tcgcgcgcgg tgtcatctat
3480gttactagat ccgatgataa gctgtcaaac atgagaattc agtacattaa aaacgtccgc
3540aatgtgttat taagttgtct aagcgtcaat ttgtttacac cacaatatat cctgccacca
3600gccagccaac agctccccga ccggcagctc ggcacaaaat caccactcga tacaggcagc
3660ccatcagtcc gggacggcgt cagcgggaga gccgttgtaa ggcggcagac tttgctcatg
3720ttaccgatgc tattcggaag aacggcaact aagctgccgg gtttgaaaca cggatgatct
3780cgcggagggt agcatgttga ttgtaacgat gacagagcgt tgctgcctgt gatcaaatat
3840catctccctc gcagagatcc gaattatcag ccttcttatt catttctcgc ttaaccgtga
3900caggctgtcg atcttgagaa ctatgccgac ataataggaa atcgctggat aaagccgctg
3960aggaagctga gtggcgctat ttctttagaa gtgaacgttg acgatcgtcg accgtacccc
4020gatgaattaa ttcggacgta cgttctgaac acagctggat acttacttgg gcgattgtca
4080tacatgacat caacaatgta cccgtttgtg taaccgtctc ttggaggttc gtatgacact
4140aggtcgctac cttaggaccg ttatagttac tagcgaattg acatgaggtt gccccgtatt
4200cagtgtcgct gatttgtatt gtctgaagtt gtttttacgt taagttgatg cagatcaatt
4260aatacgatac ctgcgtcata attgattatt tgacgtggtt tgatggcctc cacgcacgtt
4320gtgatatgta gatgataatc attatcactt tacgggtcct ttccggtgat ccgacaggtt
4380acggggcggc gacctcgcgg gttttcgcta tttatgaaaa ttttccggtt taaggcgttt
4440ccgttcttct tcgtcataac ttaatgtttt tatttaaaat accctctgaa aagaaaggaa
4500acgacaggtg ctgaaagcga gctttttggc ctctgtcgtt tcctttctct gtttttgtcc
4560gtggaatgaa caatggaagg atcttctcgg cggcgatcac gacgccggcc ctgcggagcc
4620ttcgccgcgt gcgcgattca tggcggccgt ggaggccaag gatttcgcgc gagtgcaaga
4680gctgatcgag gcgcgtggag ccaagtcggc ggctgattat gtccttgcgc agctcgccgt
4740ggccgaaggt ctggaccgca agcctggtgc gcgcgtcgtg gtcgggaaag cggcgggcag
4800catggcaatg ccgcctgcgg cgctgggttt tacgccaagg ggagaagcgg catacgccat
4860cgagcggtca gcctatggtg agccgaggtc cagcattgcg aagcagtacc agcaggaatg
4920gaaccggaag gcggcgacct ggtgggcgat ggccggtgtg gccggcatca tcggcgcgat
4980cctggcggcg gcggcaaccg gctttgttgg gctggcagtg tcgatccgca accgagtgaa
5040gcgcgtgcgc gacctgttgg tgatggagcc gggtgcagag ccataagcgg caagagacga
5100aagcccggtt tccgggcttt tgttttgtta cgccaaggac gagttttagc ggctaaaggt
5160gttgacgtgc gagaaatgtt tagctaaact tctctcatgt gctggcggct gtcaccgcta
5220tgttcaacca aggcgcggag caaattatgg gtgttatcca tgaagaaacg gcttaccgaa
5280agccagttcc aggaggcgat ccaggggctg gaagtggggc agcagaccat cgagatagcg
5340cggggcgtct tagtcgatgg gaagccacag gcgacgttcg caacgtcgct gggactgacc
5400aggggcgcag tgtcgcaagc ggtgcatcgc gtgtgggccg cgttcgagga caagaacttg
5460cccgaggggt acgcgcgggt aacggcggtt ctgccggaac atcaggcgta catcgtccgg
5520aagtgggaag cggacgccaa gaaaaaacag gaaaccaaac gatgaaaact ttggtcacgg
5580ccaaccagaa aggcggcgtc ggcaagactt cgacccttgt gcatcttgcc ttcgactttt
5640tcgagcgcgg cttgcgggtt gccgtgatcg acctggaccc ccagggcaat gcgtcctaca
5700cgctcaagga ctttgctacc ggcctgcatg caagcaagct gttcggcgct gtccctgccg
5760gcggctggac cgaaaccgca cccgcagccg gcgacggcca ggccgcgcgc ctcgccctca
5820tcgagtccaa cccggtactg gcgaacgccg aacggctgtc gctggacgac gcccgcgagc
5880tgttcggggc gaacatcaag gccctggcga accaaggctt cgacgtgtgc ctgatcgaca
5940cggccccgac ccttggcgtc ggcctggcgg ccgccctctt cgcggccgac tatgtgctgt
6000cccccatcga gcttgaggcg tacagcatcc agggcatcaa gaagatggtc acgaccattg
6060cgaacgtgcg ccagaagaac gccaagctgc aattccttgg catggtgccc agcaaggtcg
6120atgcgcggaa tccgcgccac gcgcgccacc aagccgagct gctggccgcg taccccaaga
6180tgatgattcc ggccaccgtt ggcctgcgca gcagcatcgc cgatgccctc gcatccggtg
6240tgccggtctg gaagatcaag aaaacggccg cgcgcaaggc atcgaaagag gttcgcgccc
6300tggctgatta cgtgttcacg aagatggaga tttcccaatg actgcggctc aagccaagac
6360caccaagaaa aacaccgctg cggccgctca ggaagccgca ggcgcggcgc agccgtccgg
6420cctggggttg gatagcatcg gcgacctgtc gagcctcctg gacgctcctg cggcgtctca
6480gggcggttcc ggccctatcg agctggacct ggacctgatc gacgaagatc cgcatcagcc
6540gcggacggcc gacaaccccg gcttttcccc ggagagcatc gcggaaatcg gtgccacgat
6600caaagagcgc ggggtgaagt cacccatttc ggtgcgcgag aaccaggagc agccgggccg
6660ctatatcatc aatcacggcg cccgccgcta ccgtggctcg aagtgggccg gcaagaagtc
6720catcccggcg ttcatcgaca acgactacaa cgaagccgac caggttatcg agaacctgca
6780acgcaacgag ctgaccccgc gcgaaattgc cgacttcatt ggccgcgagc tggcgaaggg
6840caagaagaaa ggcgatatcg ccaaggaaat cggcaagtcg ccggcgttca tcacccagca
6900cgtcacgctg ctggacctgc cggagaagat cgccgatgcg ttcaacaccg gccgcgtgcg
6960cgacgtgacc gtggtgaacg agctggtgac ggccttcaag aagcgcccgg aggaagtcga
7020ggcgtggctt gacgacgaca cccaggaaat cacgcgcggc acggtcaagc tgctgcgcga
7080gttcctggac gagaagggcc gcgatcccaa caccgtcgat gccttcaacg gccagactga
7140tgccgagcgt gacgcggagg ccggcgacgg ccaggacggc gaggacggcg accaggacgg
7200taaggacgcc aaggaaaagg gcgcgaagga gccggacccg gacaagctga aaaaggccat
7260cgtccaggtc gagcacgacg agcgccctgc ccgccttatc ctcaaccgtc ggccgccggc
7320ggaaggctat gcctggttga agtacgagga cgacggccag gagttcgagg cgaaccttgc
7380cgacgtgaaa ctggtcgcgc tcatcgaggg ctgatcccca aagacagcgg cgcgggccac
7440ccgcgccgca cagacaacgg ttccgctaca aggaggaccg aagaatgaat ccgatgctgt
7500tctacatcgc gggaggcgta ggcgcggcgt tgctgctggt ttccgcgatc atgctgttca
7560agctgcgcga gccgaagaag gaacaccgac cgcagcgcaa ggcggcggcc ccgacgccgc
7620agccggtcga taacgagctg ctgcgcactc tagtgatatt ccacaaaaca gcagggaagc
7680agcgcttttc cgctgcataa ccctgcttcg gggtcattat agcgattttt tcggtatatc
7740catccttttt cgcacgatat acaggatttt gccaaagggt tcgtgtagac tttccttggt
7800gtatccaacg gcgtcagccg ggcaggatag gtgaagtagg cccacccgcg agcgggtgtt
7860ccttcttcac tgtcccttat tcgcacctgg cggtgctcaa cgggaatcct gctctgcgag
7920gctggccggc taccgccggc gtaacagatg agggcaagcg gatggctgat gaaaccaagc
7980caaccaggaa gggcagccca cctatcaagg tgtactgcct tccagacgaa cgaagagcga
8040ttgaggaaaa ggcggcggcg gccggcatga gcctgtcggc ctacctgctg gccgtcggcc
8100agggctacaa aatcacgggc gtcgtggact atgagcacgt ccgcgagctg gcccgcatca
8160atggcgacct gggccgcctg ggcggcctgc tgaaactctg gctcaccgac gacccgcgca
8220cggcgcggtt cggtgatgcc acgatcctcg ccctgctggc gaagatcgaa gagaagcagg
8280acgagcttgg caaggtcatg atgggcgtgg tccgcccgag ggcagagcca tgactttttt
8340agccgctaaa acggccgggg ggtgcgcgtg attgccaagc acgtccccat gcgctccatc
8400aagaagagcg acttcgcgga gctggtgaag tacatcaccg acgagcaagg caagaccgag
8460cgccagatcc aaaacaactg tcaaagcgca cccgcccgat gccattcgcg gcacggcttc
8520cgttgaggat gtcgatatga tgcgcgagcc gacggcccgc agagaagggg ccgttttagc
8580ggctaaagaa ggaagtgcaa gccctaaccc ttggcgtcag agccttccac gcagcttttt
8640tcgggtgtcg tcgccccatt tctttacgat aaacgcctta tgtgacggca aaaccacact
8700gatgcgttcg tatccgggcg gcacgctgct cttgaaagga tgacccgcaa tctccgcgag
8760tgcctcgcgg tcaaggtcgg tggactccag gagaagaggt aggggagttt ccagggcgtc
8820ggcaatggcc tccatcacct tcaacgaggg gttggcctta ccgttggtta agtctgataa
8880aaacgaaatt gaaacccctg ccctctccga cagctcatgt ttcgtcatgc cccgctcatc
8940gagcagacga aggatgttgg tgaaaaatat ctggttgtac acagcggaag ccgcccctcg
9000cacctttggt cgcggcccgc aaaattttag ccgctaaagt tcttgacagc ggaaccaatg
9060tttagctaaa ctagagtctc ctttctcaag gagactttcg atatgagcca taatcagttc
9120cagtttatcg gtaatcttac ccgtgacacc gaggtacgtc atggcaattc taacaagccg
9180caagcaattt tcgatatagc ggttaatgaa gagtggcgca acgatgccgg cgacaagcag
9240gagcgcaccg acttcttccg catcaagtgt tttggctctc aggccgaggc ccacggcaag
9300tatttgggca aggggtcgct ggtattcgtg cagggcaaga ttcggaatac caagtacgag
9360aaggacggcc agacggtcta cgggaccgac ttcattgccg ataaggtgga ttatctggac
9420accaaggcac caggcgggtc aaatcaggaa taagggcaca ttgccccggc gtgagtcggg
9480gcaatcccgc aaggagggtg aatgaatcgg acgtttgacc ggaaggcata caggcaagaa
9540ctgatcgacg cggggttttc cgccgaggat gccgaaacca tcgcaagccg caccgtcatg
9600cgtgcgcccc gcgaaacctt ccagtccgtc ggctcgatgg tccagcaagc tacggccaag
9660atcgagcgcg acagcgtgca actggctccc cctgccctgc ccgcgccatc ggccgccgtg
9720gagcgttcgc gtcgtctcga acaggaggcg gcaggtttgg cgaagtcgat gaccatcgac
9780acgcgaggaa ctatgacgac caagaagcga aaaaccgccg gcgaggacct ggcaaaacag
9840gtcagcgagg ccaagcaggc cgcgttgctg aaacacacga agcagcagat caaggaaatg
9900cagctttcct tgttcgatat tgcgccgtgg ccggacacga tgcgagcgat gccaaacgac
9960acggcccgct ctgccctgtt caccacgcgc aacaagaaaa tcccgcgcga ggcgctgcaa
10020aacaaggtca ttttccacgt caacaaggac gtgaagatca cctacaccgg cgtcgagctg
10080cgggccgacg atgacgaact ggtgtggcag caggtgttgg agtacgcgaa gcgcacccct
10140atcggcgagc cgatcacctt cacgttctac gagctttgcc aggacctggg ctggtcgatc
10200aatggccggt attacacgaa ggccgaggaa tgcctgtcgc gcctacaggc gacggcgatg
10260ggcttcacgt ccgaccgcgt tgggcacctg gaatcggtgt cgctgctgca ccgcttccgc
10320gtcctggacc gtggcaagaa aacgtcccgt tgccaggtcc tgatcgacga ggaaatcgtc
10380gtgctgtttg ctggcgacca ctacacgaaa ttcatatggg agaagtaccg caagctgtcg
10440ccgacggccc gacggatgtt cgactatttc agctcgcacc gggagccgta cccgctcaag
10500ctggaaacct tccgcctcat gtgcggatcg gattccaccc gcgtgaagaa gtggcgcgag
10560caggtcggcg aagcctgcga agagttgcga ggcagcggcc tggtggaaca cgcctgggtc
10620aatgatgacc tggtgcattg caaacgctag ggccttgtgg ggtcagttcc ggctgggggt
10680tcagcagcca gcgctttact ggcatttcag gaacaagcgg gcactgctcg acgcacttgc
10740ttcgctcagt atcgctcggg acgcacggcg cgctctacga actgccgata aacagaggat
10800taaaattgac aattgtgatt aaggctcaga ttcgacggct tggagcggcc gacgtgcagg
10860atttccgcga gatccgattg tcggccctga agaaagctcc agagatgttc gggtccgttt
10920acgagcacga ggagaaaaag cccatggagg cgttcgctga acggttgcga gatgccgtgg
10980cattcggcgc ctacatcgac ggcgagatca ttgggctgtc ggtcttcaaa caggaggacg
11040gccccaagga cgctcacaag gcgcatctgt ccggcgtttt cgtggagccc gaacagcgag
11100gccgaggggt cgccggtatg ctgctgcggg cgttgccggc gggtttattg ctcgtgatga
11160tcgtccgaca gattccaacg ggaatctggt ggatgcgcat cttcatcctc ggcgcactta
11220atatttcgct attctggagc ttgttgttta tttcggtcta ccgcctgccg ggcgggtcgc
11280ggcgacggta ggcgctgtgc agccgctgat ggtcgtgttc atctctgccg ctctgctagg
11340tagcccgata cgattgatgg cggtcctggg ggctatttgc ggaactgcgg gcgtggcgct
11400gttggtgttg acaccaaacg cagcgctaga tcctgtcggc gtcgcagcgg gcctggcggg
11460ggcggtttcc atggcgttcg gaaccgtgct gacccgcaag tggcaacctc ccgtgcctct
11520gctcaccttt accgcctggc aactggcggc cggaggactt ctgctcgttc cagtagcttt
11580agtgtttgat ccgccaatcc cgatgcctac aggaaccaat gttctcggct gctcgactgc
11640acgaatacca gcgacccctt gcccaaatac ttgccgtggg cctcggcctg agagccaaaa
11700cacttgatgc ggaagaagtc ggtgcgctcc tgcttgtcgc cggcatcgtt gcgccacatc
11760taggtactaa aacaattcat ccagtaaaat ataatatttt attttctccc aatcaggctt
11820gatccccagt aagtcaaaaa atagctcgac atactgttct tccccgatat cctccctgat
11880cgaccggacg cagaaggcaa tgtcatacca cttgtccgcc ctgccgcttc tcccaagatc
11940aataaagcca cttactttgc catctttcac aaagatgttg ctgtctccca ggtcgccgtg
12000ggaaaagaca agttcctctt cgggcttttc cgtctttaaa aaatcataca gctcgcgcgg
12060atctttaaat ggagtgtctt cttcccagtt ttcgcaatcc acatcggcca gatcgttatt
12120cagtaagtaa tccaattcgg ctaagcggct gtctaagcta ttcgtatagg gacaatccga
12180tatgtcgatg gagtgaaaga gcctgatgca ctccgcatac agctcgataa tcttttcagg
12240gctttgttca tcttcatact cttccgagca aaggacgcca tcggcctcac tcatgagcag
12300attgctccag ccatcatgcc gttcaaagtg caggaccttt ggaacaggca gctttccttc
12360cagccatagc atcatgtcct tttcccgttc cacatcatag gtggtccctt tataccggct
12420gtccgtcatt tttaaatata ggttttcatt ttctcccacc agcttatata ccttagcagg
12480agacattcct tccgtatctt ttacgcagcg gtatttttcg atcagttttt tcaattccgg
12540tgatattctc attttagcca tttattattt ccttcctctt ttctacagta tttaaagata
12600ccccaagaag ctaattataa caagacgaac tccaattcac tgttccttgc attctaaaac
12660cttaaatacc agaaaacagc tttttcaaag ttgttttcaa agttggcgta taacatagta
12720tcgattcgat agcgtggact caaggctctc gcgaatggct cgcgttggaa actttcattg
12780acacttgagg ggcaccgcag ggaaattctc gtccttgcga gaaccggcta tgtcgtgctg
12840cgcatcgagc ctgcgccctt ggcttgtctc gcccctctcc gcgtcgctac ggggcttcca
12900gcgcctttcc gacgctcacc gggctggttg ccctcgccgc tgggctggcg gccgtctatg
12960gccctgcaaa cgcgccagaa acgccgtcga agccgtgtgc gagacaccgc ggccgccggc
13020gttgtggata cctcgcggaa aacttggccc tcactgacag atgaggggcg gacgttgaca
13080cttgaggggc cgactcaccc ggcgcggcgt tgacagatga ggggcaggct cgatttcggc
13140cggcgacgtg gagctggcca gcctcgcaaa tcggcgaaaa cgcctgattt tacgcgagtt
13200tcccacagat gatgtggaca agcctgggga taagtgccct gcggtattga cacttgaggg
13260gcgcgactac tgacagatga ggggcgcgat ccttgacact tgaggggcag agtgctgaca
13320gatgaggggc gcacctattg acatttgagg ggctgtccac aggcagaaaa tccagcattt
13380gcaagggttt ccgcccgttt ttcggccacc gctaacctgt cttttaacct gcttttaaac
13440caatatttat aaaccttgtt tttaaccagg gctgcgccct gtgcgcgtga ccgcgcacgc
13500cgaagggggg tgccccccct tctcgaaccc tcccggcccg ctaacgcggg cctcccatcc
13560ccccaggggc tgcgcccctc ggccgcgaac ggcctcaccc caaaaatggc agcgccagcc
13620aggacgtcgg ccgaaagagc gacaagcaga tcacgctttt cgacagcgtc ggatttgcga
13680tcgaggattt ttcggcgctg cgctacgtcc gcgaccgcgt tgagggatca agccacagca
13740gcccactcga ccttctagcc gacccagacg agccaaggga tctttttgga atgctgctcc
13800gtcgtcaggc tttccgacgt ttgggtggtt gaacagaagt cattatcgca cggaatgcca
13860agcactcccg aggggaaccc tgtggttggc atgcacatac aaatggacga acggataaac
13920cttttcacgc ccttttaaat atccgattat tctaataaac gctcttttct cttaggttta
13980cccgccaata tatcctgtca aacactgata gtttaaactg aaggcgggaa acgacaatct
14040gatcatgagc ggagaattaa gggagtcacg ttatgacccc cgccgatgac gcgggacaag
14100ccgttttacg tttggaactg acagaaccgc aacgttgaag gagccactca gcaagctatc
14160tatgtcgggt gcggagaaag aggtaatgaa atggc
14195
SEQ ID NO: 67; Length: 4836; Type: DNA; Organism: Artificial; Other information: pVGW2
Sequence 67:
aaatggccat aggcgatctc cttaatcaat
agtagctgta acctcgaagc gtttcacttg 60taacaacgat tgagaacttt tgtcataaaa
ttgaaatact tggttcgcat tttcgtcatc 120cgcggtcagc cgcaattctg acgaactgcc
catttagctg gagatgattg tacatccttc 180acgtgaaaat ttctcaagcg ctgtgaacaa
gggttcagat tttagattga aaggtgagcc 240gttgaaacac gttcttctta tcgatgacga
tgtcgctatg cggcatctta ttatcgaata 300ccttacgatc cacgccttca aagtgaccgc
ggtagccgac agcacccagt tcactagagt 360actctcttcc gcgacggtcg atgtcgtggt
tgttgatcta gatttaggtc gtgaagatgg 420gcttgagatc gttcgaaatc tggcggcaaa
gtctgatatt ccaatcataa ttatcagtgg 480cgaccgcctt gaggagacgg ataaagttgt
tgcactcgag ctaggagcaa gtgattttat 540cgctaagccg tttagtacga gagagtttct
tgcacgcatt cgggttgcct tgcgcgtgcg 600ccccaacgtt gtccgctcca aagaccgacg
gtctttttgt tttactgact ggacacttaa 660tctcaggcaa cgtcgcttga tgtccgaagc
tggcggtgag gtgaaactta cggcaggtga 720gttcaatctt ctcctcgcgt ttttagagaa
accccgcgac gttctatcgc gcgagcaact 780tctcattgcc agtcgagtac gcgacgagga
ggtttacgac aggagtatag atgttctcat 840tttgcggctg cgccgcaaac ttgaggcgga
tccgtcaagc cctcaactga taaaaacagc 900aagaggtgcc ggttatttct ttgacgcgga
cgtgcaggtt tcgcacgggg ggacgatggc 960agcctgagcc aattgcattt ggctcttaat
tatctggctc aaaaggtgac tgaggacgcg 1020gccagcggcc tcaaacctac actcaatatt
tggtgagggg ttccgatagg tccctcttca 1080cctgcatggc atgtttaacc gaatctgacg
ttttccctgc aaatgccaaa atactatgcc 1140tatctccggg tttcgcgtga cggccaagac
ccggaaaacc aaaaatacgg tttgctcgaa 1200tacgcgaacg ccaaaggctt cgcgccgcta
cagatcgagg aagaaattgc cagcagagca 1260aaggactggc gcaagcgcaa gctcggagca
atcatcgaaa aggccgagcg tggcgacgtg 1320ctactgacgc cggagattac gcgcattgcc
ggttccgccc tcgccgcctt ggaaattctc 1380aaagcggcga gcgagcgcgg cctaatcgtc
catgtgacca aacagaagat catcatggac 1440ggcagcctac aaagcgacat catggcaacc
gtgcttggct tggctgcaca gatcgagcgg 1500catttcattc aggcacgtac caccgaggcg
ctacaagtcg ccagagagcg cggcaagacg 1560ctcgggcgac ccaagggcag caaatcgagc
gccttgaagc tggacagccg tattgatgaa 1620gtacaggcat acgtgaacct tggcttgccg
caaagtcgcg cagccgagtt gttaggcgtc 1680agccctcaca ccttgcgcct gttcatcaaa
cgccggaaca tcaaacccac aaacactaga 1740ccaaccatca ccatgccggg gagggaacaa
catgcctaag aacaacaaag cccccggcca 1800tcgtatcaac gagatcatca agacgagcct
cgcgctcgaa atggaggatg cccgcgaagc 1860tggcttagtc ggctacatgg cccgttgcct
tgtgcaagcg accatgcccc acaccgaccc 1920caagaccagc tactttgagc gcaccaatgg
catcgtcacc ttgtcgatca tgggcaagcc 1980gagcatcggc ctgccctacg gttctatgcc
gcgcaccttg cttgcttgga tatgcaccga 2040ggccgtgcga acgaaagacc ccgtgttgaa
ccttggccgg tcgcaatcgg aatttctaca 2100aaggctcgga atgcacaccg atggccgtta
cacggccacc cttcgcaatc aggcgcaacg 2160cctgttttca tccatgattt cgcttgccgg
cgagcaaggc aatgacttcg gcattgagaa 2220cgtcgtcatt gccaagcgcg cttttctatt
ctggaatccc aagcggccag aagatcgggc 2280gctatgggat agcaccctca ccctcacagg
cgatttcttc gaggaagtca cccgctcacc 2340ggttcctatc cgaatcgact acctgcatgc
cttgcggcag tctccgcttg cgatggacat 2400ttacacgtgg ctgacctatc gcgtgttcct
gttgcgggcc aagggccgcc ccttcgtgca 2460aatcccttgg gtcgccctgc aagcgcaatt
cggctcatcc tatggcagcc gcgcacgcaa 2520ctcgcccgaa ctggacgata aggcccgaga
gcgggcagag cgggcagcac tcgccagctt 2580caaatacaac ttcaaaaagc gcctacgcga
agtgttgatt gtctatcccg aggcaagcga 2640ctgcatcgaa gatgacggcg aatgcctgcg
catcaaatcc acacgcctgc atgtcacccg 2700cgcacccggc aagggcgctc gcatcggccc
ccctccgact tgaccaggcc aacgctacgc 2760ttggcttggt caagccttcc catccaacag
cccgccgtcg agcgggcttt tttatccccg 2820gaagcctgtg gatagagggt agttatccac
gtgaaaccgc taatgccccg caaagccttg 2880attcacgggg ctttccggcc cgctccaaaa
actatccacg tgaaatcgct aatcagggta 2940cgtgaaatcg ctaatcggag tacgtgaaat
cgctaataag gtcacgtgaa atcgctaatc 3000aaaaaggcac gtgagaacgc taatagccct
ttcagatcaa cagcttgcaa acacccctcg 3060ctccggcaag tagttacagc aagtagtatg
ttcaattagc ttttcaatta tgaatatata 3120tatcaattat tggtcgccct tggcttgtgg
acaatgcgct acgcgcaccg gctccgcccg 3180tggacaaccg caagcggttg cccaccgtcg
agcgcctttg cccacaaccc ggcggccgca 3240acagatcgtt ttataaattt ttttttttga
aaaagaaaaa gcccgaaagg cggcaacctc 3300tcgggcttct ggatttccga tcaacgcagg
agtcgttcgg aaagtagctg ttccagaatt 3360ataggcgcag agacaccaga ttccaagatg
gctctgttaa attgttgtag tatgtagtat 3420catacaacat actacagtac agaggcccgc
aagaatggca atcactaaac aagacatttg 3480gcgagcagcc gacgaactgg acgccgaagg
catccggccc actttggccg ccgtgcgcaa 3540gaaactcgga agcggtagct tcacaaccat
ttccgatgca atggctgaat ggaaaaaccg 3600caagaccgcc accctgccct catcagaccc
attgccggtt gcagtcaacg agcatcttgc 3660cgagcttggc aatgcgctat gggctatcgc
cctggcgcac gccaacgccc ggtttgacga 3720agatcggaaa cagatcgagg ccgacaaagc
ggccatcagc cagcagcttg ccgaagcaat 3780cgaactagcc gacaccttca cccgcgaaaa
cgaccagctc cgcgaacgag tagatcccgg 3840gttgacataa gcctgttcgg ttcgtaaact
gtaatgcaag tagcgtatgc gctcacgcaa 3900ctggtccaga accttgaccg aacgcagcgg
tggtaacggc gcagtggcgg ttttcatggc 3960ttgttatgac tgtttttttg tacagtctat
gcctcgggca tccaagcagc aagcgcgtta 4020cgccgtgggt cgatgtttga tgttatggag
cagcaacgat gttacgcagc agcaacgatg 4080ttacgcagca gggcagtcgc cctaaaacaa
agttaggtgg ctcaagtatg ggcatcattc 4140gcacatgtag gctcggccct gaccaagtca
aatccatgcg ggctgctctt gatcttttcg 4200gtcgtgagtt cggagacgta gccacctact
cccaacatca gccggactcc gattacctcg 4260ggaacttgct ccgtagtaag acattcatcg
cgcttgctgc cttcgaccaa gaagcggttg 4320ttggcgctct cgcggcttac gttctgccca
agtttgagca gccgcgtagt gagatctata 4380tctatgatct cgcagtctcc ggcgagcacc
ggaggcaggg cattgccacc gcgctcatca 4440atctcctcaa gcatgaggcc aacgcgcttg
gtgcttatgt gatctacgtg caagcagatt 4500acggtgacga tcccgcagtg gctctctata
caaagttggg catacgggaa gaagtgatgc 4560actttgatat cgacccaagt accgccacct
aacaattcgt tcaagccgag atcggcttcc 4620cggccgcgga gttgttcggt aaattgctag
ctttaagggc gaattctgca gatatccatc 4680acactggcgg ccgctcgagc atgcatctag
agggcccaat tcgccctata gtgagtcgta 4740ttacaattca ctggccgtcg ttttacaacg
tcgtgactgg gaaaaccctg gcgttaccca 4800acttaatcgc cttgcagcac atcccccttt
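cgccag 4836
SEQ ID NO: 68; length: 29; type: DNA; organism: Artificial; description: primer for amplification
cggtctagag tgcgcagcag ctcgttatc 29
SEQ ID NO: 69; length: 43; type: DNA; organism: Artificial; description: primer for amplification
agctatctat gtcgggtgcg gagaaagagg taatgaaatg gca 43
SEQ ID NO: 70; length: 43; type: DNA; organism: Artificial; description: primer for amplification
agcttgccat ttcattacct ctttctccgc acccgacata gat 43
SEQ ID NO: 71; length: 34; type: DNA; organism: Artificial; description: primer for amplification
cagggtaatc aactttgtat aataaagttg ataa 34
SEQ ID NO: 72; length: 34; type: DNA; organism: Artificial; description: primer for amplification
caactttatt atacaaagtt gattaccctg ttat 34
SEQ ID NO: 73; length: 60; type: DNA; organism: Artificial; description: primer for amplification
gggtagttct acttctgttc atgtttgtgt tagatccgtg tttgtgttag atccgtgctg 60
SEQ ID NO: 74; length: 66; type: DNA; organism: Artificial; description: primer for amplification
ctagcgccgg atctaacaca aacacggatc taacacaaac atgaacagaa gtagaactac 60
ccggcc 66
SEQ ID NO: 75; length: 13; type: DNA; organism: Artificial; description: primer for amplification
agcttgggcc ctt 13
SEQ ID NO: 76; length: 8; type: DNA; organism: Artificial; description: primer for amplification
agggccca 8
SEQ ID NO: 77; length: 36; type: DNA; organism: Artificial; description: primer for amplification
aaaggatccc gggttgacat aagcctgttc ggttcg 36
SEQ ID NO: 78; length: 31; type: DNA; organism: Artificial; description: primer for amplification
aaagctagca atttaccgaa caactccgcg g 31
SEQ ID NO: 79; length: 30; type: DNA; organism: Artificial; description: primer for amplification
aaatggccat aggcgatctc cttaatcaat 30
SEQ ID NO: 80; length: 24; type: DNA; organism: Artificial; description: primer for amplification
agtgtgtggc atggtgcatt tccg 24
SEQ ID NO: 81; length: 24; type: DNA; organism: Artificial; description: primer for amplification
ctctacagga tacacggtgt aagg 24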