Patent application title: PLANT GENOME MODIFICATION USING GUIDE RNA/CAS ENDONUCLEASE SYSTEMS AND METHODS OF USE
Inventors:
Andrew Mark Cigan (Johnston, IA, US)
Saverio Carl Falco (Wilmington, DE, US)
Huirong Gao (Johnston, IA, US)
Zhongsen Li (Hockessin, DE, US)
Zhan-Bin Liu (West Chester, PA, US)
L. Aleksander Lyznik (Johnston, IA, US)
Jinrui Shi (Johnston, IA, US)
Sergei Svitashev (Johnston, IA, US)
Joshua K. Young (Johnston, IA, US)
IPC8 Class: AC12N1582FI
USPC Class: 800270
Class name: Multicellular living organisms and unmodified parts thereof and related processes method of using a plant or plant part in a breeding process which includes a step of sexual hybridization method of breeding involving a mutation step
Publication date: 2015-03-19
Patent application number: 20150082478
Abstract:
Compositions and methods are provided for genome modification of a target
sequence in the genome of a plant or plant cell. The methods and
compositions employ a guide RNA/Cas endonuclease system to provide an
effective system for modifying or altering target sites within the genome
of a plant, plant cell or seed. Also provided are compositions and
methods employing a guide polynucleotide/Cas endonuclease system for
genome modification of a nucleotide sequence in the genome of a cell or
organism, for gene editing, and/or for inserting or deleting a
polynucleotide of interest into or from the genome of a cell or organism.
Once a genomic target site is identified, a variety of methods can be
employed to further modify the target sites such that they contain a
variety of polynucleotides of interest. Breeding methods and methods for
selecting plants utilizing a two component RNA guide and Cas endonuclease
system are also disclosed. Compositions and methods are also provided for
editing a nucleotide sequence in the genome of a cell.
Claims:
1. A method for selecting a plant comprising an altered target site in
its plant genome, the method comprising: a) obtaining a first plant
comprising at least one Cas endonuclease capable of introducing a double
strand break at a target site in the plant genome; b) obtaining a second
plant comprising a guide RNA that is capable of forming a complex with
the Cas endonuclease of (a); c) crossing the first plant of (a) with the
second plant of (b); d) evaluating the progeny of (c) for an alteration
in the target site; and, e) selecting a progeny plant that possesses the
desired alteration of said target site.
2. A method for selecting a plant comprising an altered target site in its plant genome, the method comprising selecting at least one progeny plant that comprises an alteration at a target site in its plant genome, wherein said progeny plant was obtained by crossing a first plant comprising at least one Cas endonuclease with a second plant comprising a guide RNA, wherein said Cas endonuclease is capable of introducing a double strand break at said target site.
3. A method for selecting a plant comprising an altered target site in its plant genome, the method comprising: a) obtaining a first plant comprising at least one Cas endonuclease capable of introducing a double strand break at a target site in the plant genome; b) obtaining a second plant comprising a guide RNA and a donor DNA, wherein said guide RNA is capable of forming a complex with the Cas endonuclease of (a), wherein said donor DNA comprises a polynucleotide of interest; c) crossing the first plant of (a) with the second plant of (b); d) evaluating the progeny of (c) for an alteration in the target site; and, e) selecting a progeny plant that comprises the polynucleotide of interest inserted at said target site.
4. A method for selecting a plant comprising an altered target site in its plant genome, the method comprising selecting at least one progeny plant that comprises an alteration at a target site in its plant genome, wherein said progeny plant was obtained by crossing a first plant expressing at least one Cas endonuclease to a second plant comprising a guide RNA and a donor DNA, wherein said Cas endonuclease is capable of introducing a double strand break at said target site, wherein said donor DNA comprises a polynucleotide of interest.
5. A method for modifying a target site in the genome of a plant cell, the method comprising providing a guide RNA to a plant cell having a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site.
6. A method for modifying a target site in the genome of a plant cell, the method comprising providing a guide RNA and a Cas endonuclease to said plant cell, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site.
7. A method for modifying a target site in the genome of a plant cell, the method comprising providing a guide RNA and a donor DNA to a plant cell having a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site, wherein said donor DNA comprises a polynucleotide of interest.
8. A method for modifying a target site in the genome of a plant cell, the method comprising: a) providing to a plant cell a guide RNA and a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site; and, b) identifying at least one plant cell that has a modification at said target, wherein the modification includes at least one deletion or substitution of one or more nucleotides in said target site.
9. A method for modifying a target DNA sequence in the genome of a plant cell, the method comprising: a) providing to a plant cell a first recombinant DNA construct capable of expressing a guide RNA and a second recombinant DNA construct capable of expressing a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site; and, b) identifying at least one plant cell that has a modification at said target, wherein the modification includes at least one deletion or substitution of one or more nucleotides in said target site.
10. A method for introducing a polynucleotide of interest into a target site in the genome of a plant cell, the method comprising: a) providing to a plant cell a first recombinant DNA construct capable of expressing a guide RNA and a second recombinant DNA construct capable of expressing a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site; b) contacting the plant cell of (a) with a donor DNA comprising a polynucleotide of interest; and, c) identifying at least one plant cell from (b) comprising in its genome the polynucleotide of interest integrated at said target site.
11. The method of claim 5, wherein the guide RNA is introduced directly by particle bombardment.
12. The method of claim 5, wherein the guide RNA is introduced via particle bombardment or Agrobacterium transformation of a recombinant DNA construct comprising the corresponding guide DNA operably linked to a plant U6 polymerase III promoter.
13. The method of claim 1, wherein the Cas endonuclease gene is a plant optimized Cas9 endonuclease.
14. The method of claim 1, wherein the Cas endonuclease gene is operably linked to a SV40 nuclear targeting signal upstream of the Cas codon region and a VirD2 nuclear localization signal downstream of the Cas codon region.
15. The method of claim 1, wherein the plant is a monocot or a dicot.
16. The method of claim 15, wherein the monocot is selected from the group consisting of maize, rice, sorghum, rye, barley, wheat, millet, oats, sugarcane, turfgrass, or switchgrass.
17. The method of claim 16, wherein the dicot is selected from the group consisting of soybean, canola, alfalfa, sunflower, cotton, tobacco, peanut, potato, Arabidopsis, or safflower.
18. The method of claim 1, wherein the target site is located in the gene sequence of an acetolactate synthase (ALS) gene, an enolpyruvylshikimate phosphate synthase (EPSPS) gene, or a male fertility gene (MS45, MS26 or MSCA1).
19. A plant or seed produced by the method of claim 5.
20. A plant comprising a recombinant DNA construct, said recombinant DNA construct comprising a promoter operably linked to a nucleotide sequence encoding a plant optimized Cas9 endonuclease, wherein said plant optimized Cas9 endonuclease is capable of binding to and creating a double strand break in a genomic target sequence of said plant genome.
21. A plant comprising a recombinant DNA construct and a guide RNA, wherein said recombinant DNA construct comprises a promoter operably linked to a nucleotide sequence encoding a plant optimized Cas9 endonuclease, wherein said plant optimized Cas9 endonuclease and guide RNA are capable of forming a complex and creating a double strand break in a genomic target sequence of said plant genome.
22. A recombinant DNA construct comprising a promoter operably linked to a nucleotide sequence encoding a plant optimized Cas9 endonuclease, wherein said plant optimized Cas9 endonuclease is capable of binding to and creating a double strand break in a genomic target sequence of a plant genome.
23. A recombinant DNA construct comprising a promoter operably linked to a nucleotide sequence expressing a guide RNA, wherein said guide RNA is capable of forming a complex with a plant optimized Cas9 endonuclease, and wherein said complex is capable of binding to and creating a double strand break in a genomic target sequence of a plant genome.
24. A method for selecting a male sterile plant, the method comprising selecting at least one progeny plant that comprises an alteration at a genomic target site located in a male fertility gene locus, wherein said progeny plant is obtained by crossing a first plant expressing a Cas9 endonuclease to a second plant comprising a guide RNA, wherein said Cas endonuclease is capable of introducing a double strand break at said genomic target site.
25. A method for producing a male sterile plant, the method comprising: a) obtaining a first plant comprising at least one Cas endonuclease capable of introducing a double strand break at a genomic target site located in a male fertility gene locus in the plant genome; b) obtaining a second plant comprising a guide RNA that is capable of forming a complex with the Cas endonuclease of (a); c) crossing the first plant of (a) with the second plant of (b); d) evaluating the progeny of (c) for an alteration in the target site; and, e) selecting a progeny plant that is male sterile.
26. The method of claim 24, wherein the male fertility gene is selected from the group consisting of MS26, MS45 and MSCA1.
27. The method of claim 24, wherein the plant is a monocot or a dicot.
28. The method of claim 27, wherein the monocot is selected from the group consisting of maize, rice, sorghum, rye, barley, wheat, millet, oats, sugarcane, turfgrass, or switchgrass.
29. A method for editing a nucleotide sequence in the genome of a cell, the method comprising introducing at least one guide RNA, at least one polynucleotide modification template and at least one Cas endonuclease into a cell, wherein the Cas endonuclease introduces a double-strand break at a target site in the genome of said cell, wherein said polynucleotide modification template comprises at least one nucleotide modification of said nucleotide sequence.
30. The method of claim 29, wherein the cell is a plant cell.
31. The method of claim 29 wherein the nucleotide sequence is a promoter, a regulatory sequence or a gene of interest.
32. The method of claim 31 wherein the gene of interest is an enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene or an acetolactate synthase (ALS) gene.
33. The method of claim 30 wherein the plant cell is a monocot or dicot plant cell.
34. A method for producing an epsps mutant plant, the method comprising: a) providing a guide RNA, a polynucleotide modification template and at least one Cas endonuclease to a plant cell, wherein the Cas endonuclease introduces a double strand break at a target site within an epsps genomic sequence in the plant genome, wherein said polynucleotide modification template comprises at least one nucleotide modification of said epsps genomic sequence; b) obtaining a plant from the plant cell of (a); c) evaluating the plant of (b) for the presence of said at least one nucleotide modification; and, d) selecting a progeny plant that shows tolerance to glyphosate.
35. A method for producing an epsps mutant plant, the method comprising: a) providing a guide RNA, a polynucleotide modification template and at least one Cas endonuclease to a plant cell, wherein the Cas endonuclease introduces a double strand break at a target site within an epsps genomic sequence in the plant genome, wherein said polynucleotide modification template comprises at least one nucleotide modification of said epsps genomic sequence; b) obtaining a plant from the plant cell of (a); c) evaluating the plant of (b) for the presence of said at least one nucleotide modification; and, d) screening a progeny plant of (c) that is void of said guide RNA and Cas endonuclease.
36. The method of claim 35 further comprising selecting a plant that shows resistance to glyphosate.
37. A plant, plant cell or seed produced by the method of claim 29.
38. The method of claim 29, wherein the Cas endonuclease is a Cas9 endonuclease.
39. The method of claim 38, wherein the Cas9 endonuclease is expressed by SEQ ID NO: 5.
40. The method of claim 38 wherein the Cas9 endonuclease is encoded by any one of SEQ ID NOs: 1, 124, 212, 213, 214, 215, 216, 193 or nucleotides 2037-6329 of SEQ ID NO: 5, or any functional fragment thereof.
41. The plant or plant cell of claim 37, wherein said plant cell shows resistance to glyphosate.
42. A plant cell comprising a modified nucleotide sequence, wherein the modified nucleotide sequence was produced by providing a guide RNA, a polynucleotide modification template and at least one Cas endonuclease to a plant cell, wherein the Cas endonuclease is capable of introducing a double-strand break at a target site in the plant genome, wherein said polynucleotide modification template comprises at least one nucleotide modification of said nucleotide sequence.
43. The method of claim 29, wherein the at least one nucleotide modification is not a modification at said target site.
44. A method for producing a male sterile plant, the method comprising: a) providing to a plant cell a guide RNA and a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site located in or near a male fertility gene; b) identifying at least one plant cell that has a modification in said male fertility gene, wherein the modification includes at least one deletion, insertion, or substitution of one or more nucleotides in said male fertility gene; and, c) obtaining a plant from the plant cell of b).
45. The method of claim 44, further comprising selecting a progeny plant from the plant of c) wherein said progeny plant is male sterile.
46. The method of claim 44, wherein the male fertility gene is selected from the group consisting of MS26, MS45 and MSCA1.
47. A plant comprising at least one altered target site, wherein the at least one altered target site originated from a corresponding target site that was recognized and cleaved by a guideRNA/Cas endonuclease system, and wherein the at least one altered target site is in a genomic region of interest that extends from the target sequence set forth in SEQ ID NO: 229 to the target site set forth in SEQ ID NO: 235.
48. The plant of claim 47, wherein the at least one altered target site has an alteration selected from the group consisting of (i) replacement of at least one nucleotide, (ii) a deletion of at least one nucleotide, (iii) an insertion of at least one nucleotide, and (iv) any combination of (i)-(iii).
49. The plant of claim 47, wherein the at least one altered target site comprises a recombinant DNA molecule.
50. The plant of claim 47, wherein the plant comprises at least two altered target sites, wherein each of the altered target sites originated from a corresponding target site that was recognized and cleaved by a guideRNA/Cas endonuclease system, wherein the corresponding target site is selected from the group consisting of SEQ ID NOs: 229, 230, 231, 232, 233, 234, 235 and 236.
51. A method for editing a nucleotide sequence in the genome of a cell, the method comprising providing a guide polynucleotide, a Cas endonuclease, and optionally a polynucleotide modification template, to a cell, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site in the genome of said cell, wherein said polynucleotide modification template comprises at least one nucleotide modification of said nucleotide sequence.
52. The method of claim 51, wherein the nucleotide sequence in the genome of a cell is selected from the group consisting of a promoter sequence, a terminator sequence, a regulatory element sequence, a splice site, a coding sequence, a polyubiquitination site, an intron site and an intron enhancing motif.
53. A method for editing a promoter sequence in the genome of a cell, the method comprising providing a guide polynucleotide, a polynucleotide modification template and at least one Cas endonuclease to a cell, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site in the genome of said cell, wherein said polynucleotide modification template comprises at least one nucleotide modification of said promoter sequence to be edited.
54. A method for replacing a first promoter sequence in a cell, the method comprising providing a guide RNA, a polynucleotide modification template, and a Cas endonuclease to said cell, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site in the genome of said cell, wherein said polynucleotide modification template comprises a second promoter or second promoter fragment that is different from said first promoter sequence.
55. The method of claim 54, wherein the replacement of the first promoter sequence results in any one of the following, or any one combination of the following: an increased promoter activity, an increased promoter tissue specificity, a decreased promoter activity, a decreased promoter tissue specificity, a new promoter activity, an inducible promoter activity, an extended window of gene expression, or a modification of the timing or developmental progress of gene expression in the same cell layer or other cell layer.
56. The method of claim 54, wherein the first promoter sequence is selected from the group consisting of Zea mays ARGOS 8 promoter, a soybean EPSPS1 promoter, a maize EPSPS promoter, maize NPK1 promoter, wherein the second promoter sequence is selected from the group consisting of a Zea mays GOS2 PRO:GOS2-intron promoter, a soybean ubiquitin promoter, a stress inducible maize RAB17 promoter, a Zea mays-PEPC1 promoter, a Zea mays Ubiquitin promoter, a Zea mays-Rootmet2 promoter, a rice actin promoter, a sorghum RCC3 promoter, a Zea mays-GOS2 promoter, a Zea mays-ACO2 promoter, and a Zea mays oleosin promoter.
57. A method for deleting a promoter sequence in the genome of a cell, the method comprising providing a guide polynucleotide and a Cas endonuclease to a cell, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break in at least one target site located inside or outside said promoter sequence.
58. A method for inserting a promoter or a promoter element in the genome of a cell, the method comprising providing a guide polynucleotide, a polynucleotide modification template comprising the promoter or the promoter element, and a Cas endonuclease to a cell, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site in the genome of said cell.
59. The method of claim 58, wherein the insertion of the promoter or promoter element results in any one of the following, or any one combination of the following: an increased promoter activity, an increased promoter tissue specificity, a decreased promoter activity, a decreased promoter tissue specificity, a new promoter activity, an inducible promoter activity, an extended window of gene expression, a modification of the timing or developmental progress of gene expression, a mutation of DNA binding elements, or an addition of DNA binding elements.
60. A method for editing a Zinc Finger transcription factor, the method comprising providing a guide polynucleotide, a Cas endonuclease, and optionally a polynucleotide modification template, to a cell, wherein the Cas endonuclease introduces a double-strand break at a target site in the genome of said cell, wherein said polynucleotide modification template comprises at least one nucleotide modification or deletion of said Zinc Finger transcription factor, wherein the deletion or modification of said Zinc Finger transcription factor results in the creation of a dominant negative Zinc Finger transcription factor mutant.
61. A method for creating a fusion protein, the method comprising introducing a guide polynucleotide, a Cas endonuclease, and a polynucleotide modification template, into a cell, wherein the Cas endonuclease introduces a double-strand break at a target site located inside or outside a first coding sequence in the genome of said cell, wherein said polynucleotide modification template comprises a second coding sequence encoding a protein of interest, wherein the protein fusion results in any one of the following, or any one combination of the following: a targeting of the fusion protein to the chloroplast of said cell, an increased protein activity, an increased protein functionality, a decreased protein activity, a decreased protein functionality, a new protein functionality, a modified protein functionality, a new protein localization, a new timing of protein expression, a modified protein expression pattern, a chimeric protein, or a modified protein with dominant phenotype functionality.
62. A method for producing in a plant a complex trait locus comprising at least two altered target sequences in a genomic region of interest, said method comprising: (a) selecting a genomic region in a plant, wherein the genomic region comprises a first target sequence and a second target sequence; (b) contacting at least one plant cell with at least a first guide polynucleotide, a second guide polynucleotide, and optionally at least one Donor DNA, and a Cas endonuclease, wherein the first and second guide polynucleotide and the Cas endonuclease can form a complex that enables the Cas endonuclease to introduce a double strand break in at least a first and a second target sequence; (c) identifying a cell from (b) comprising a first alteration at the first target sequence and a second alteration at the second target sequence; and (d) recovering a first fertile plant from the cell of (c), said fertile plant comprising the first alteration and the second alteration, wherein the first alteration and the second alteration are physically linked.
63. A method for producing in a plant a complex trait locus comprising at least two altered target sequences in a genomic region of interest, said method comprising: (a) selecting a genomic region in a plant, wherein the genomic region comprises a first target sequence and a second target sequence; (b) contacting at least one plant cell with a first guide polynucleotide, a Cas endonuclease, and optionally a first Donor DNA, wherein the first guide polynucleotide and the Cas endonuclease can form a complex that enables the Cas endonuclease to introduce a double strand break at a first target sequence; (c) identifying a cell from (b) comprising a first alteration at the first target sequence; (d) recovering a first fertile plant from the cell of (c), said first fertile plant comprising the first alteration; (e) contacting at least one plant cell with a second guide polynucleotide, a Cas endonuclease, and optionally a second Donor DNA; (f) identifying a cell from (e) comprising a second alteration at the second target sequence; (g) recovering a second fertile plant from the cell of (f), said second fertile plant comprising the second alteration; and, (h) obtaining a fertile progeny plant from the second fertile plant of (g), said fertile progeny plant comprising the first alteration and the second alteration, wherein the first alteration and the second alteration are physically linked.
64. The method of claim 29, wherein the editing of said nucleotide sequence renders said nucleotide sequence capable of conferring herbicide resistance to said cell.
65. A method for producing an acetolactate synthase (ALS) mutant plant, the method comprising: a) obtaining a plant or a seed thereof, wherein the plant or the seed comprises a modification in an endogenous ALS gene, the modification generated by a Cas endonuclease, a guide RNA and a polynucleotide modification template, wherein the plant or the seed is resistant to sulphonylurea; and, b) producing a progeny plant that is void of said guide RNA and Cas endonuclease.
66. A method of generating a sulphonylurea resistant plant, the method comprising providing a plant cell wherein its endogenous chromosomal ALS gene has been modified through a guide RNA/Cas endonuclease system to produce a sulphonylurea resistant ALS protein and growing a plant from said maize plant cell, wherein said plant is resistant to sulphonylurea.
Description:
[0001] This application claims the benefit of U.S. Provisional Application
No. 61/868,706, filed Aug. 22, 2013, U.S. Provisional Application No.
61/882,532, filed Sep. 25, 2013, U.S. Provisional Application No.
61/937,045, filed Feb. 7, 2014, U.S. Provisional Application No.
61/953,090, filed Mar. 14, 2014, and U.S. Provisional Application No.
62/023,239, filed Jul. 11, 2014; all of which are hereby incorporated
herein in their entirety by reference.
FIELD
[0002] The disclosure relates to the field of plant molecular biology, in particular, to methods for altering the genome of a plant cell.
REFERENCE TO SEQUENCE LISTING SUBMITTED ELECTRONICALLY
[0003] The official copy of the sequence listing is submitted electronically via EFS-Web as an ASCII formatted sequence listing with a file named 20140814_BB2284USNP_ST25_SeqLst, created on Aug. 14, 2014 and having a size of 560 kilobytes, and is filed concurrently with the specification. The sequence listing contained in this ASCII formatted document is part of the specification and is herein incorporated by reference in its entirety.
BACKGROUND
[0004] Recombinant DNA technology has made it possible to insert foreign DNA sequences into the genome of an organism, thus, altering the organism's phenotype. The most commonly used plant transformation methods are Agrobacterium infection and biolistic particle bombardment in which transgenes integrate into a plant genome in a random fashion and in an unpredictable copy number. Thus, efforts are undertaken to control transgene integration in plants.
[0005] One method for inserting or modifying a DNA sequence involves homologous DNA recombination by introducing a transgenic DNA sequence flanked by sequences homologous to the genomic target. U.S. Pat. No. 5,527,695 describes transforming eukaryotic cells with DNA sequences that are targeted to a predetermined sequence of the eukaryote's DNA. Specifically, the use of site-specific recombination is discussed. Transformed cells are identified through use of a selectable marker included as a part of the introduced DNA sequences.
[0006] It was shown that artificially induced site-specific genomic double-stranded breaks in plant cells were repaired by homologous recombination with exogenously supplied DNA using two different pathways. (Puchta et al., (1996) Proc. Natl. Acad. Sci. USA 93:5055-5060; U.S. Patent Application Publication No. 2005/0172365A1 published Aug. 4, 2005; U.S. Patent Application Publication No. 2006/0282914 published Dec. 14, 2006; WO 2005/028942 published Jun. 2, 2005).
[0007] Since the isolation, cloning, transfer and recombination of DNA segments, including coding sequences and non-coding sequences, is most conveniently carried out using restriction endonuclease enzymes, much research has focused on studying and designing endonucleases, as described in WO 2004/067736 published Aug. 12, 2004; U.S. Pat. No. 5,792,632 issued to Dujon et al., Aug. 11, 1998; U.S. Pat. No. 6,610,545 B2 issued to Dujon et al., Aug. 26, 2003; Chevalier et al., (2002) Mol Cell 10:895-905; Chevalier et al., (2001) Nucleic Acids Res 29:3757-3774; Seligman et al., (2002) Nucleic Acids Res 30:3870-3879.
[0008] Although several approaches have been developed to target a specific site for modification in the genome of a plant, there still remains a need for more efficient and effective methods for producing a fertile plant, having an altered genome comprising specific modifications in a defined region of the genome of the plant.
BRIEF SUMMARY
[0009] Compositions and methods are provided employing a guide RNA/Cas endonuclease system in plants for genome modification of a target sequence in the genome of a plant or plant cell, for selecting plants, for gene editing, and for inserting a polynucleotide of interest into the genome of a plant. The methods and compositions employ a guide RNA/Cas endonuclease system to provide for an effective system for modifying or altering target sites and nucleotides of interest within the genome of a plant, plant cell or seed. Once a genomic target site is identified, a variety of methods can be employed to further modify the target sites such that they contain a variety of polynucleotides of interest. Breeding methods and methods for selecting plants utilizing a two component RNA guide and Cas endonuclease system are also disclosed. Also provided are nucleic acid constructs, plants, plant cells, explants, seeds and grain having the guide RNA/Cas endonuclease system. Compositions and methods are also provided employing a guide polynucleotide/Cas endonuclease system for genome modification of a target sequence in the genome of a cell or organism, for gene editing, and for inserting or deleting a polynucleotide of interest into or from the genome of a cell or organism. The methods and compositions employ a guide polynucleotide/Cas endonuclease system to provide for an effective system for modifying or altering target sites and editing nucleotide sequences of interest within the genome of a cell, wherein the guide polynucleotide is comprised of an RNA sequence, a DNA sequence, or a DNA-RNA combination sequence.
[0010] Thus in a first embodiment of the disclosure, the method comprises a method for selecting a plant comprising an altered target site in its plant genome, the method comprising: a) obtaining a first plant comprising at least one Cas endonuclease capable of introducing a double strand break at a target site in the plant genome; b) obtaining a second plant comprising a guide RNA that is capable of forming a complex with the Cas endonuclease of (a), c) crossing the first plant of (a) with the second plant of (b); d) evaluating the progeny of (c) for an alteration in the target site and e) selecting a progeny plant that possesses the desired alteration of said target site.
[0011] In another embodiment, the method comprises, a method for selecting a plant comprising an altered target site in its plant genome, the method comprising selecting at least one progeny plant that comprises an alteration at a target site in its plant genome, wherein said progeny plant was obtained by crossing a first plant comprising at least one Cas endonuclease with a second plant comprising a guide RNA, wherein said Cas endonuclease is capable of introducing a double strand break at said target site.
[0012] In another embodiment, the method comprises, a method for selecting a plant comprising an altered target site in its plant genome, the method comprising:
[0013] a) obtaining a first plant comprising at least one Cas endonuclease capable of introducing a double strand break at a target site in the plant genome; b) obtaining a second plant comprising a guide RNA and a donor DNA, wherein said guide RNA is capable of forming a complex with the Cas endonuclease of (a), wherein said donor DNA comprises a polynucleotide of interest; c) crossing the first plant of (a) with the second plant of (b); d) evaluating the progeny of (c) for an alteration in the target site; and e) selecting a progeny plant that comprises the polynucleotide of interest inserted at said target site.
[0014] In another embodiment, the method comprises, a method for selecting a plant comprising an altered target site in its plant genome, the method comprising selecting at least one progeny plant that comprises an alteration at a target site in its plant genome, wherein said progeny plant was obtained by crossing a first plant expressing at least one Cas endonuclease to a second plant comprising a guide RNA and a donor DNA, wherein said Cas endonuclease is capable of introducing a double strand break at said target site, wherein said donor DNA comprises a polynucleotide of interest.
[0015] In another embodiment, the method comprises a method for modifying a target site in the genome of a plant cell, the method comprising introducing a guide RNA into a plant cell having a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site.
[0016] In another embodiment, the method comprises, a method for modifying a target site in the genome of a plant cell, the method comprising introducing a guide RNA and a Cas endonuclease into said plant cell, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site.
[0017] In another embodiment, the method comprises, a method for modifying a target site in the genome of a plant cell, the method comprising introducing a guide RNA and a donor DNA into a plant cell having a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site, wherein said donor DNA comprises a polynucleotide of interest.
[0018] In another embodiment, the method comprises a method for modifying a target site in the genome of a plant cell, the method comprising: a) introducing into a plant cell a guide RNA and a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site; and, b) identifying at least one plant cell that has a modification at said target, wherein the modification includes at least one deletion or substitution of one or more nucleotides in said target site.
[0019] In another embodiment, the method comprises a method for modifying a target DNA sequence in the genome of a plant cell, the method comprising: a) introducing into a plant cell a first recombinant DNA construct capable of expressing a guide RNA and a second recombinant DNA construct capable of expressing a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site; and, b) identifying at least one plant cell that has a modification at said target, wherein the modification includes at least one deletion or substitution of one or more nucleotides in said target site.
[0020] In another embodiment, the method comprises a method for introducing a polynucleotide of interest into a target site in the genome of a plant cell, the method comprising: a) introducing into a plant cell a first recombinant DNA construct capable of expressing a guide RNA and a second recombinant DNA construct capable of expressing a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site; b) contacting the plant cell of (a) with a donor DNA comprising a polynucleotide of interest; and, c) identifying at least one plant cell from (b) comprising in its genome the polynucleotide of interest integrated at said target site.
[0021] In some of these embodiments, the guide RNA can be introduced directly by particle bombardment or can be introduced via particle bombardment or Agrobacterium transformation of a recombinant DNA construct comprising the corresponding guide DNA operably linked to a plant U6 polymerase III promoter.
[0022] In some of these embodiments, the Cas endonuclease gene is a plant optimized Cas9 endonuclease.
[0023] In some of these embodiments, the Cas endonuclease gene is operably linked to a SV40 nuclear targeting signal upstream of the Cas codon region and a VirD2 nuclear localization signal downstream of the Cas codon region.
[0024] The plant in these embodiments is a monocot or a dicot. More specifically, the monocot is selected from the group consisting of maize, rice, sorghum, rye, barley, wheat, millet, oats, sugarcane, turfgrass, or switchgrass. The dicot is selected from the group consisting of soybean, canola, alfalfa, sunflower, cotton, tobacco, peanut, potato, Arabidopsis, or safflower.
[0025] In some embodiments, the target site is located in the gene sequence of an acetolactate synthase (ALS) gene, an enolpyruvylshikimate phosphate synthase (EPSPS) gene, or a male fertility (MS45, MS26 or MSCA1) gene.
[0026] In another embodiment the disclosure comprises a plant, plant part, or seed, comprising a recombinant DNA construct, said recombinant DNA construct comprising a promoter operably linked to a nucleotide sequence encoding a plant optimized Cas9 endonuclease, wherein said plant optimized Cas9 endonuclease is capable of binding to and creating a double strand break in a genomic target sequence of said plant genome.
[0027] In another embodiment the plant comprises a recombinant DNA construct and a guide RNA, wherein said recombinant DNA construct comprises a promoter operably linked to a nucleotide sequence encoding a plant optimized Cas9 endonuclease, wherein said plant optimized Cas9 endonuclease and guide RNA are capable of forming a complex and creating a double strand break in a genomic target sequence of said plant genome.
[0028] In another embodiment, the recombinant DNA construct comprises a promoter operably linked to a nucleotide sequence encoding a plant optimized Cas9 endonuclease, wherein said plant optimized Cas9 endonuclease is capable of binding to and creating a double strand break in a genomic target sequence of said plant genome.
[0029] In another embodiment, the recombinant DNA construct comprises a promoter operably linked to a nucleotide sequence expressing a guide RNA, wherein said guide RNA is capable of forming a complex with a plant optimized Cas9 endonuclease, and wherein said complex is capable of binding to and creating a double strand break in a genomic target sequence of said plant genome.
[0030] In another embodiment, the method comprises a method for selecting a male sterile or male fertile plant, the method comprising selecting at least one progeny plant that comprises an alteration at a genomic target site located in a male fertility gene locus, wherein said progeny plant is obtained by crossing a first plant expressing a Cas9 endonuclease to a second plant comprising a guide RNA, wherein said Cas endonuclease is capable of introducing a double strand break at said genomic target site.
[0031] In another embodiment, the method comprises a method for producing a male sterile or male fertile plant, the method comprising: a) obtaining a first plant comprising at least one Cas endonuclease capable of introducing a double strand break at a genomic target site located in a male fertility gene locus in the plant genome; b) obtaining a second plant comprising a guide RNA that is capable of forming a complex with the Cas endonuclease of (a); c) crossing the first plant of (a) with the second plant of (b); d) evaluating the progeny of (c) for an alteration in the target site; and e) selecting a progeny plant that is male sterile or male fertile. Male fertility genes can be selected from, but are not limited to, MS26, MS45, and MSCA1 genes.
[0032] Compositions and methods are also provided for editing a nucleotide sequence in the genome of a cell. In one embodiment, the disclosure describes a method for editing a nucleotide sequence in the genome of a plant cell, the method comprising providing a guide RNA, a polynucleotide modification template, and at least one maize optimized Cas9 endonuclease to a plant cell, wherein the maize optimized Cas9 endonuclease is capable of introducing a double-strand break at a target site in the plant genome, wherein said polynucleotide modification template includes at least one nucleotide modification of said nucleotide sequence. The nucleotide to be edited (the nucleotide sequence of interest) can be located within or outside a target site that is recognized and cleaved by a Cas endonuclease. Cells include, but are not limited to, human, animal, bacterial, fungal, insect, and plant cells as well as plants and seeds produced by the methods described herein.
[0033] Additional embodiments of the methods and compositions of the present disclosure are shown herein.
BRIEF DESCRIPTION OF THE DRAWINGS AND THE SEQUENCE LISTING
[0034] The disclosure can be more fully understood from the following detailed description and the accompanying drawings and Sequence Listing, which form a part of this application. The sequence descriptions and sequence listing attached hereto comply with the rules governing nucleotide and amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §§1.821-1.825. The sequence descriptions contain the three letter codes for amino acids as defined in 37 C.F.R. §§1.821-1.825, which are incorporated herein by reference.
FIGURES
[0035] FIG. 1A shows a maize optimized Cas9 gene (encoding a Cas9 endonuclease) containing a potato ST-LS1 intron, a SV40 amino terminal nuclear localization sequence (NLS), and a VirD2 carboxyl terminal NLS, operably linked to a plant ubiquitin promoter (SEQ ID NO: 5). The maize optimized Cas9 gene (just Cas9 coding sequence, no NLSs) corresponds to nucleotide positions 2037-2411 and 2601-6329 of SEQ ID NO: 5 with the potato intron residing at positions 2412-2600 of SEQ ID NO: 5. SV40 NLS is at positions 2010-2036 of SEQ ID NO: 5. VirD2 NLS is at positions 6330-6386 of SEQ ID NO: 5. FIG. 1B shows a long guide RNA operably linked to a maize U6 polymerase III promoter terminating with a maize U6 terminator (SEQ ID NO: 12). The long guide RNA containing the variable targeting domain corresponding to the maize LIGCas-3 target site (SEQ ID NO: 8) is transcribed from/corresponds to positions 1001-1094 of SEQ ID NO: 12. FIG. 1C shows the maize optimized Cas9 and long guide RNA expression cassettes combined on a single vector DNA (SEQ ID NO: 102).
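As an illustrative check of the coordinates recited for FIG. 1A, the following minimal Python sketch tabulates the features of SEQ ID NO: 5 and computes their lengths. The dictionary and function names are hypothetical and are not part of the disclosure. Only the positions come from the description above, and they are assumed here to be 1-based and inclusive.

```python
# Feature coordinates on SEQ ID NO: 5 as recited for FIG. 1A.
# Assumption: positions are 1-based and inclusive.
FEATURES = {
    "SV40 NLS": [(2010, 2036)],
    "Cas9 coding sequence": [(2037, 2411), (2601, 6329)],
    "ST-LS1 intron": [(2412, 2600)],
    "VirD2 NLS": [(6330, 6386)],
}


def feature_length(segments):
    """Total length in nucleotides of a feature made of inclusive segments."""
    return sum(end - start + 1 for start, end in segments)


def is_contiguous(features):
    """True if all segments, sorted by start, abut with no gaps or overlaps."""
    segments = sorted(seg for segs in features.values() for seg in segs)
    return all(prev[1] + 1 == nxt[0] for prev, nxt in zip(segments, segments[1:]))


for name, segments in FEATURES.items():
    print(name, feature_length(segments), "nt")
print("contiguous:", is_contiguous(FEATURES))
```

Under these assumptions the two Cas9 coding segments total 4,104 nucleotides, a multiple of three, and the listed features tile positions 2010-6386 without gaps or overlaps.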
[0036] FIG. 2A illustrates the duplexed crRNA (SEQ ID NO:6)-tracrRNA (SEQ ID NO:7)/Cas9 endonuclease system and target DNA complex relative to the appropriately oriented PAM sequence at the maize LIGCas-3 (SEQ ID NO: 18, Table 1) target site with triangles pointing towards the expected site of cleavage on both sense and anti-sense DNA strands. FIG. 2B illustrates the guide RNA/Cas9 endonuclease complex interacting with the genomic target site relative to the appropriately oriented PAM sequence (GGA) at the maize genomic LIGCas-3 target site (SEQ ID NO:18, Table 1). The guide RNA (shown as boxed-in in light gray, SEQ ID NO:8) is a fusion between a crRNA and tracrRNA and comprises a variable targeting domain that is complementary to one DNA strand of the double strand DNA genomic target site. The Cas9 endonuclease is shown in dark gray. Triangles point towards the expected site of DNA cleavage on both sense and anti-sense DNA strands.
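The statement that the variable targeting domain is complementary to one DNA strand of the target site can be illustrated with a minimal Python sketch. The 20-nucleotide sequence below is a made-up placeholder, not the LIGCas-3 target site, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def targeting_domain_rna(protospacer):
    """RNA with the same sequence as the given DNA strand (T replaced by U)."""
    return protospacer.replace("T", "U")


def pairs_with(rna, dna_strand):
    """True if the RNA base-pairs, antiparallel, with the whole DNA strand."""
    return reverse_complement(rna.replace("U", "T")) == dna_strand


# Hypothetical 20-nt target written 5' to 3'; not a sequence from the disclosure.
protospacer = "GATTACAGATTACAGATTAC"
bound_strand = reverse_complement(protospacer)  # the opposite DNA strand
guide = targeting_domain_rna(protospacer)

print(guide)
print(pairs_with(guide, bound_strand))  # pairs with the opposite strand
print(pairs_with(guide, protospacer))   # does not pair with the strand it matches
```

In this sketch the targeting domain pairs with only one of the two DNA strands, which is the relationship the figure description states.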
[0037] FIG. 3A-3B shows an alignment and count of the top 10 most frequent NHEJ mutations induced by the maize optimized guide RNA/Cas endonuclease system described herein compared to a LIG3-4 homing endonuclease control at the maize genomic Liguleless 1 locus. The mutations were identified by deep sequencing. The reference sequence represents the unmodified locus with each target site underlined. The PAM sequence and expected site of cleavage are also indicated. Deletions or insertions as a result of imperfect NHEJ are shown by a "-" or an italicized underlined nucleotide, respectively. The reference and mutations 1-10 of the LIGCas-1 target site correspond to SEQ ID NOs: 55-65, respectively. The reference and mutations 1-10 of the LIGCas-2 correspond to SEQ ID NOs: 55, 65-75, respectively. The reference and mutations 1-10 of the LIGCas-3 correspond to SEQ ID NOs: 76-86, respectively. The reference and mutations 1-10 of the LIG3-4 homing endonuclease target site correspond to SEQ ID NOs: 76, 87-96, respectively.
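The counting behind a "top 10 most frequent mutations" table can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline used for FIG. 3A-3B: the reads are toy strings assumed to be already aligned to the reference, deletions are written as "-" as in the figure, insertions are not handled, and all names are hypothetical.

```python
from collections import Counter


def top_mutations(aligned_reads, reference, n=10):
    """Return the n most frequent aligned reads that differ from the reference."""
    counts = Counter(read for read in aligned_reads if read != reference)
    return counts.most_common(n)


def deleted_bases(aligned_read, reference):
    """Number of deleted positions ('-') in a read aligned to the reference."""
    if len(aligned_read) != len(reference):
        raise ValueError("read must be aligned to the full reference length")
    return aligned_read.count("-")


# Toy data; not sequences from the disclosure.
reference = "ACGTACGTAC"
reads = [
    "ACGTACGTAC",  # unmodified
    "ACG-ACGTAC",  # 1-nt deletion
    "ACG-ACGTAC",
    "AC--ACGTAC",  # 2-nt deletion
    "ACGTACGTAC",
    "ACG-ACGTAC",
]

for read, count in top_mutations(reads, reference):
    print(read, count, deleted_bases(read, reference))
```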
[0038] FIG. 4 illustrates how the homologous recombination (HR) repair DNA vector (SEQ ID NO: 97) was constructed. To promote site-specific transgene insertion by homologous recombination, the transgene (shown in light gray) was flanked on either side by approximately 1 kb of DNA with homology to the maize genomic regions immediately adjacent to the LIGCas-3 and LIG3-4 homing endonuclease expected sites of cleavage.
[0039] FIG. 5 illustrates how genomic DNA extracted from stable transformants was screened for site-specific transgene insertion by PCR. Genomic primers (corresponding to SEQ ID NOs: 98 and 101) within the Liguleless 1 locus were designed outside of the regions used in constructing the HR repair DNA vector (SEQ ID NO: 97) and were paired with primers inside the transgene (corresponding to SEQ ID NOs: 99 and 100) to facilitate PCR detection of unique genomic DNA junctions created by appropriately oriented site-specific transgene integration.
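The logic of the junction PCR screen, in which a genomic primer outside the homology region is paired with a primer inside the transgene, can be illustrated with a toy in-silico PCR. The sketch assumes exact primer matches and uses made-up sequences. None of the sequences or names below correspond to SEQ ID NOs: 97-101.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Length of the product for exact-match primers, or None if no product.

    The forward primer must occur in the template strand as written; the
    reverse primer must anneal downstream, so its reverse complement must
    occur in the template after the forward primer site.
    """
    start = template.find(forward)
    if start == -1:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start


# Toy sequences (hypothetical).
flank = "AGCTTGCACC"      # genomic DNA outside the homology region
hr1 = "TTGACCGGTA"        # homology region 1
hr2 = "CGATCGGCTA"        # homology region 2
transgene = "ATGGCTAACTGA"

wild_type = flank + hr1 + hr2
integrated = flank + hr1 + transgene + hr2

genomic_primer = flank[:8]                             # outside the HR region
transgene_primer = reverse_complement(transgene[2:10])  # inside the transgene

print(amplicon_length(wild_type, genomic_primer, transgene_primer))
print(amplicon_length(integrated, genomic_primer, transgene_primer))
```

Because one primer site exists only in the genome and the other only in the transgene, a product is obtained only from the integrated locus, which is why the junction is diagnostic of site-specific insertion.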
[0040] FIG. 6 shows an alignment of the NHEJ mutations induced by the maize optimized guide RNA/Cas endonuclease system, described herein, when the short guide RNA was delivered directly as RNA. The mutations were identified by deep sequencing. The reference illustrates the unmodified locus with the genomic target site underlined. The PAM sequence and expected site of cleavage are also indicated. Deletions or insertions as a result of imperfect NHEJ are shown by a "-" or an italicized underlined nucleotide, respectively. The reference and mutations 1-6 for 55CasRNA-1 correspond to SEQ ID NOs: 104-110, respectively.
[0041] FIG. 7 shows the QC782 vector comprising the Cas9 expression cassette.
[0042] FIG. 8A shows the QC783 vector comprising the guide RNA expression cassette. FIG. 8B shows the DNA sequence (coding sequence) of the DD43CR1 (20 bp) variable targeting domain of the guide RNA, as well as the terminator sequence linked to the guide RNA. The 20 bp variable targeting domain DD43CR1 is in bold.
[0043] FIG. 9 shows the map of a linked soybean optimized Cas9 and guide RNA construct QC815.
[0044] FIG. 10A shows the DD20 soybean locus on chromosome 4 and the DD20CR1 and DD20CR2 genomic target sites (indicated by bold arrows). FIG. 10B shows the DD43 soybean locus on chromosome 4 and the DD43CR1 and DD43CR2 genomic target sites (indicated by bold arrows).
[0045] FIG. 11A-11D. Alignments of expected target site sequences with mutant target sequences detected in four guide RNA induced NHEJ experiments. FIG. 11A shows the DD20CR1 PCR amplicon (reference sequence, SEQ ID NO:142, genomic target site is underlined) and the 10 mutations (SEQ ID NOs: 147-156) induced by the guide RNA/Cas endonuclease system at the DD20CR1 genomic target site. FIG. 11B shows the DD20CR2 PCR amplicon (reference sequence, SEQ ID NO:143) and the 10 mutations (SEQ ID NOs: 157-166) induced by the guide RNA/Cas endonuclease system at the DD20CR2 genomic target site. FIG. 11C shows the DD43CR1 PCR amplicon (reference sequence, SEQ ID NO:144) and the mutations (SEQ ID NOs:167-176) induced by the guide RNA/Cas endonuclease system at the DD43CR1 genomic target site. FIG. 11D shows the DD43CR2 PCR amplicon (reference sequence, SEQ ID NO: 145) and the 10 mutations (SEQ ID NOs: 177-191) induced by the guide RNA/Cas endonuclease system at the DD43CR2 genomic target site. The target sequences corresponding to different guide RNAs are underlined. Each nucleotide deletion is indicated by "-". Inserted and replaced sequences are in bold. The total number of each mutant sequence is listed in the last column.
[0046] FIG. 12A-12B shows a schematic representation of the guide RNA/Cas endonuclease system used for editing a nucleotide sequence of interest. To enable specific nucleotide editing, a polynucleotide modification template that includes at least one nucleotide modification (when compared to the nucleotide sequence to be edited) is introduced into a cell together with the guide RNA and Cas endonuclease expression cassettes. For example, as shown herein, the nucleotide sequence to be edited is an endogenous wild type enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene in maize cells. The Cas endonuclease (shaded circle) is a maize optimized Cas9 endonuclease that cleaves a moCas9 target sequence within the epsps genomic locus using a guide RNA of SEQ ID NO:194. FIG. 12A shows a polynucleotide modification template that includes three nucleotide modifications (when compared to the wild type epsps locus depicted in FIG. 12B) flanked by two homology regions HR-1 and HR-2. FIG. 12B shows the guide RNA/maize optimized Cas9 endonuclease complex interacting with the epsps locus. The original nucleotide codons of the EPSPS gene that needed to be edited are shown as aCT and Cca (FIG. 12B). The nucleotide codons with modified nucleotides (shown in capitals) are shown as aTC and Tca (FIG. 12B).
[0047] FIG. 13 shows a diagram of a maize optimized Cas9 endonuclease expression cassette. The bacterial cas9 coding sequence was codon optimized for expression in maize cells and supplemented with the ST-LS1 potato intron (moCas9 coding sequence, SEQ ID NO: 193). A DNA fragment encoding the SV40 nuclear localization signal (NLS) was fused to the 5'-end of the moCas9 coding sequence. A maize ubiquitin promoter (Ubi promoter) and its cognate intron (ubi intron) provided controlling elements for the expression of moCas9 in maize cells. The pinII transcription termination sequence (pinII) completed the maize moCas9 gene design.
[0048] FIG. 14 shows some examples of the moCas9 target sequence (underlined), located on EPSPS DNA fragments, mutagenized by the introduction of double-strand breaks at the cleavage site of the moCas9 endonuclease (thick arrow) in maize cells. In SEQ ID NO: 206, three nucleotides were deleted (dashes) next to the moCas9 cleavage site. SEQ ID NOs: 207-208 indicate that the nucleotide deletion can expand beyond the moCas9 cleavage site.
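The codon edits recited for FIG. 12 (aCT to aTC, and Cca to Tca) can be checked against the standard genetic code with a short Python sketch. The codons are taken from the description above. The table construction and function names are illustrative assumptions rather than part of the disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codon):
    """One-letter amino acid for a DNA codon (case-insensitive)."""
    return CODON_TABLE[codon.upper()]


def changed_positions(original, edited):
    """0-based positions within the codon at which the two codons differ."""
    pairs = zip(original.upper(), edited.upper())
    return [i for i, (a, b) in enumerate(pairs) if a != b]


# (original codon, edited codon) as written in the description of FIG. 12.
EDITS = [("aCT", "aTC"), ("Cca", "Tca")]

for original, edited in EDITS:
    print(original, translate(original), "->", edited, translate(edited),
          "changed positions:", changed_positions(original, edited))

total_changes = sum(len(changed_positions(o, e)) for o, e in EDITS)
print("total nucleotide changes:", total_changes)
```

Under the standard genetic code these edits change threonine to isoleucine and proline to serine, and together they account for three nucleotide changes, consistent with the three nucleotide modifications described for the template.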
[0049] FIG. 15 depicts an EPSPS template vector used for delivery of the EPSPS polynucleotide modification template containing the three TIPS nucleotide modifications. The EPSPS polynucleotide modification template includes a partial fragment of the EPSPS gene. The vector was 6,475 bp in length and consisted of two homology regions to the epsps locus (epsps-HR1 and epsps-HR2). Two Gateway cloning sites (ATTL4 and ATTL3), an antibiotic resistance gene (KAN), and the pUC origin of replication (PUC ORI) completed synthesis of the EPSPS template vector1.
[0050] FIG. 16 illustrates the PCR-based screening strategy for the identification of maize events with TIPS nucleotide modifications in maize cells. Two pairs of PCR primers were used to amplify the genomic fragments of the epsps locus (upper section). Both of them contained the TIPS specific primers (an arrow with a dot indicating the site of the three TIPS modifications). The shorter fragment (780 bp F-E2) was produced by amplification of the EPSPS polynucleotide modification template fragment (template detection). The amplified EPSPS polynucleotide modification template fragment was found in all but 4 analyzed events (panel F-E2). The longer fragment (839 bp H-T) was produced by amplification of the genomic EPSPS sequence provided that the epsps locus contained the three nucleotide modifications responsible for the TIPS modifications. Six events were identified as containing the three nucleotide modifications (panel H-T). The white arrows point to events that contain both the amplified EPSPS polynucleotide modification template and the nucleotide modifications responsible for the TIPS modification.
[0051] FIG. 17A shows a schematic diagram of the PCR protocol used to identify edited EPSPS DNA fragments in selected events. A partial genomic fragment, comprising parts of Exon1, Intron 1 and Exon2 of the epsps locus, was amplified regardless of the editing product (panel A, 1050 bp F-E3). The amplification products, representing only partial EPSPS gene sequences having one or more mutations, were cloned and sequenced. FIG. 17B shows 2 examples of sequenced amplification products. In some amplification products, the epsps nucleotides and the moCas9 target sequence (underlined) were unchanged indicating that one EPSPS allele was not edited (wild type allele; SEQ ID NO: 210). In other amplification products, three specific nucleotide substitutions (representing the TIPS modifications) were identified with no mutations at the moCas9 target sequence (underlined) (SEQ ID NO: 209).
[0052] FIG. 18 shows the location of MHP14, TS8, TS9 and TS10 loci comprising target sites for the guide RNA/Cas endonuclease system near trait A (located at 53.14 cM) on chromosome 1 of maize.
[0053] FIG. 19A shows the location of the MHP14Cas-1 maize genomic target sequence (SEQ ID NO: 229) and the MHP14Cas-3 maize genomic target sequence (SEQ ID NO: 230) on the MHP14 maize genomic DNA locus on chromosome 1. The 5' to 3' sequence. FIG. 19B shows the location of the TS8Cas-1 (SEQ ID NO: 231) and TS8Cas-2 (SEQ ID NO: 232) maize genomic target sequences located on the TS8 locus. FIG. 19C shows the location of the TS9Cas-2 (SEQ ID NO: 233) and TS9Cas-3 (SEQ ID NO: 234) maize genomic target sequences located on the TS9 locus. FIG. 19D shows the location of the TS10Cas-1 (SEQ ID NO: 235), and TS10Cas-3 (SEQ ID NO: 236) maize genomic target sequences located on the TS10 locus. All these maize genomic target sites are recognized and cleaved by a guide RNA/Cas endonuclease system described herein. Each maize genomic target sequence (indicated by an arrow) is highlighted in bold and followed by the NGG PAM sequence shown boxed in.
[0054] FIG. 20 shows a schematic of a donor DNA (also referred to as HR repair DNA) comprising a transgene cassette with a selectable marker (phosphomannose isomerase, depicted in grey), flanked by homologous recombination sequences (HR1 and HR2) of about 0.5 to 1 kb in length, used to introduce the transgene cassette into a genomic target site for the guide RNA/Cas endonuclease system. The arrows indicate the sections of the genomic DNA sequence on either side of the endonuclease cleavage site that correspond to the homologous regions of the donor DNA. This schematic is representative for homologous recombination occurring at any one of the 8 target sites (4 loci) located on chromosome 1 from 51.54 cM to 54.56 cM in the maize genome.
[0055] FIG. 21 shows the junction PCR screen for identification of insertion events. Primers 1 and 2, located on the transgene donor, are common for all target sites. Primer TSHR1f is located on the genomic region outside of the homologous sequence HR1. Primer combination TSHR1f/primer1 amplifies junction 1. Primer TSHR2r is located on the genomic region outside of the HR2 region. Primer combination primer2/TSHR2r amplifies junction 2.
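A target sequence followed by an NGG PAM, as described for FIG. 19, can be located computationally. The following Python sketch scans both strands of a DNA string for such sites. The 20-nucleotide target length is an assumption taken from the 20 bp variable targeting domain mentioned for FIG. 8B, the input sequence is a made-up placeholder, and the function names are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def find_targets(sequence, length=20):
    """List (strand, start, target, pam) for targets followed by an NGG PAM.

    start is the 0-based index of the first target base on the strand that
    was scanned ("+" is the input as written, "-" its reverse complement).
    """
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    hits = []
    for strand, seq in (("+", sequence), ("-", reverse_complement(sequence))):
        for match in pattern.finditer(seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Toy locus (hypothetical): 2 nt of flank, a 20-nt target, an AGG PAM, 2 nt of flank.
locus = "TT" + "GATTACAGATTACAGATTAC" + "AGG" + "TT"

for hit in find_targets(locus):
    print(hit)
```

A real design workflow would also weigh uniqueness of each site in the genome, which this sketch does not attempt.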
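The donor DNA layout of FIG. 20 (HR1, transgene cassette, HR2) can be expressed as a simple string operation. The sketch below is illustrative only. It assumes the homology regions are taken immediately on either side of the cleavage position, uses very short toy sequences in place of arms of about 0.5 to 1 kb, and all names are hypothetical.

```python
def build_donor(genome, cut_index, cassette, arm_length):
    """Assemble HR1 + cassette + HR2 around a cleavage position.

    cut_index is the 0-based index of the first base after the cut.
    HR1 is the arm_length bases before the cut; HR2 the arm_length bases after.
    """
    if cut_index - arm_length < 0 or cut_index + arm_length > len(genome):
        raise ValueError("homology arms extend past the genomic sequence")
    hr1 = genome[cut_index - arm_length:cut_index]
    hr2 = genome[cut_index:cut_index + arm_length]
    return hr1 + cassette + hr2


def expected_insertion_allele(genome, cut_index, cassette):
    """Genomic sequence expected after precise insertion at the cut."""
    return genome[:cut_index] + cassette + genome[cut_index:]


# Toy values (hypothetical); real arms would be hundreds of bases long.
genome = "AAAACCCCGGGGTTTT"
cassette = "ATGTAA"
donor = build_donor(genome, cut_index=8, cassette=cassette, arm_length=4)
allele = expected_insertion_allele(genome, cut_index=8, cassette=cassette)

print(donor)
print(allele)
```

In this toy case the donor sequence appears intact within the expected insertion allele, reflecting that the arms match the genomic sequence on either side of the cleavage site.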
[0056] FIG. 22 shows a junction PCR screen for identification of insertion events at the TS10Cas10 locus. A gel picture indicates the presence of insertion events at the TS10Cas10-1 target site (lane 02 A1). PCR reaction of HR1 and HR2 junction loaded next to each other (lane 02-white label and lane 02-gray label), with white label representing HR1 junction PCR, gray label representing HR2 junction PCR.
[0057] FIG. 23 A-B. DNA expression cassettes used in gRNA/Cas9 mediated genome modification experiments. A) The Cas9 endonuclease cassette (EF1A2:CAS9), comprising a soybean EF1A2 promoter (GM-EF1A2 PRO) driving the soybean codon optimized Cas9 endonuclease (CAS9(SO)), a soybean optimized SV40 nuclear localization signal (SV40 NLS(SO)) and a PINII terminator (PINII TERM), was linked to a guide RNA expression cassette (U6-9.1:DD20CR1, comprising a soybean U6 promoter driving the DD20CR1 guide RNA) used in experiment U6-9.1DD20CR1 (Table 27). Other guide RNA/Cas9 cassettes listed in Table 27 are identical except for the 20 bp variable targeting domains of the guide RNA targeting the genomic target sites DD20CR2, DD43CR1, or DD43CR2. B) The donor DNA cassette (DD20HR1-SAMS:HPT-DD20HR2) used in experiment U6-9.1DD20CR1 (Table 27). DD20HR1 and DD20HR2 are homologous DNA regions between the donor DNA cassette and the genomic DNA sequences flanking the DD20 target site. Other donor DNA cassettes listed in Table 27 are identical except for the DD43HR1 and DD43HR2 regions in two of them.
[0058] FIG. 24 A-C. DD20 and DD43 soybean genomic target sites locations and qPCR amplicons. A) Diagram of Glycine max chromosome 04 indicating relative positions of DD20 and DD43 target sites. Genetic mapping positions of DD20 and DD43 sites are the positions of the most nearby genes Glyma04g39780.1 and Glyma04g39550.1. B) DD20 qPCR 64 bp amplicon 45936307-45936370 from chromosome 04 (SEQ ID NO: 304). Relative positions of the target sites DD20-CR1 and DD20-CR2, qPCR primers and probe DD20-F, DD20-R, and DD20-T are marked. C) DD43 qPCR 115 bp amplicon 45731879-45731993 from chromosome 04 (SEQ ID NO: 305). Relative positions of the target sites DD43-CR1 and DD43-CR2, qPCR primers and probe DD43-F2, DD43-F, DD43-R, and DD43-T are marked.
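The amplicon sizes recited for FIG. 24 follow directly from the chromosome 04 coordinates if those coordinates are read as inclusive. The short Python sketch below, with hypothetical names, shows the arithmetic and the spacing between the two amplicons.

```python
def amplicon_size(start, end):
    """Size in bp of an amplicon given inclusive start and end coordinates."""
    return end - start + 1


# Chromosome 04 coordinates as recited for FIG. 24 (assumed inclusive).
DD20_AMPLICON = (45936307, 45936370)
DD43_AMPLICON = (45731879, 45731993)

dd20_size = amplicon_size(*DD20_AMPLICON)
dd43_size = amplicon_size(*DD43_AMPLICON)

# Number of bases lying strictly between the two amplicons.
gap = DD20_AMPLICON[0] - DD43_AMPLICON[1] - 1

print(dd20_size, dd43_size, gap)
```

The computed sizes match the 64 bp and 115 bp amplicons stated above.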
[0059] FIG. 25 A-C. Schematic of guide RNA/Cas9 system mediated site-specific non-homologous end joining (NHEJ) and transgene insertion via homologous recombination (HR) at DD20CR1 site. A) Soybean plants are co-transformed with guide RNA/Cas9 and donor DNA cassettes as listed in Table 27. The DD20CR1 guide RNA/Cas9 complex transcribed from the linked guide RNA/Cas9 DNA cassettes will cleave specifically the DD20CR1 target site on chromosome 04 to make DNA double strand breaks. The breaks can be repaired spontaneously as NHEJs or repaired as a HR event by the donor DNA facilitated by the flanking homologous regions DD20-HR1 and DD20HR2. B) NHEJs are detected by DD20-specific qPCR and the mutated sequences are assessed by sequencing cloned HR1-HR2 PCR fragments. C) HR events are revealed by two border-specific PCR analyses HR1-SAMS and NOS-HR2, noting that the primers are only able to amplify DNA recombined between the DD20CR1 region of chromosome 04 and the donor DNA. Guide RNA/Cas9 mediated NHEJ and HR at DD20-CR2 site follow the same process except for using DD20-CR2 guide RNA. Guide RNA/Cas9 mediated site-specific NHEJ and HR at DD43CR1 and DD43CR2 sites follow the same process except for using guide RNA and homologous regions specific to the DD43 sites.
[0060] FIG. 26 A-C. Sequences of gRNA/Cas9 system mediated NHEJs. Only 60 bp sequences surrounding the genomic target site shown in bold case are aligned to show the mutations. The PAM sequence is shown boxed in. Insertion sequences are indicated by symbol marking the insertion position followed by the size of the insert. Actual insertion sequences are listed in the sequences listing. A) U6-9.1 DD20CR1 sequences. Three colonies were sequenced for each of 54 events from experiment U6-9.1 DD20CR1. A total of 150 sequences were returned, of which 26 were found to be short unique deletions while 2 of the events contained small insertions. B) U6-9.1 DD20CR2 sequences. Three colonies were sequenced for each of 28 events from experiment U6-9.1 DD20CR2. A total of 84 sequences were returned, of which 20 were found to be short unique deletions while 1 of the events contained a single bp insertion. C) U6-9.1DD43CR1 sequences. Three colonies were sequenced for each of 46 events from experiment U6-9.1 DD43CR1. A total of 132 sequences were returned, of which 18 were found to be short unique deletions while 10 of the events contained small insertions. D) U6-9.1DD43CR2 sequences.
[0061] FIG. 27 A-C shows the ten most prevalent types of NHEJ mutations recovered based on the crRNA/tracrRNA/Cas endonuclease system. FIG. 27A shows NHEJ mutations for the LIGCas-1 target site (corresponding to SEQ ID NOs: 415-424), FIG. 27B shows NHEJ mutations for the LIGCas-2 target site (corresponding to SEQ ID NOs: 425-434) and FIG. 27C shows NHEJ mutations for the LIGCas-3 target site (corresponding to SEQ ID NOs: 435-444).
[0062] FIG. 28. Schematic representation of Zm-GOS2 PRO:GOS2 INTRON insertion in the 5'-UTR of maize ARGOS8 gene by targeting the guide RNA/Cas9 target sequence 1 (CTS1, SEQ ID NO: 1) with the gRNA1/Cas9 endonuclease system, described herein. HR1 and HR2 indicate homologous recombination regions.
[0063] FIG. 29 A-C. Identification and analysis of Zm-GOS2 PRO:GOS2 INTRON insertion events in maize plants. (A) Schematic representation of Zm-GOS2 PRO:GOS2 INTRON insertion in the 5'-UTR of Zm-ARGOS8. CTS1 was targeted with the gRNA1/Cas9 endonuclease system, described herein. HR1 and HR2 indicate homologous recombination regions. P1 to P4 indicate PCR primers. (B) PCR screening of PMI-resistant calli to identify insertion events. PCR results are shown for 13 representative calli. The left and right junction PCRs were carried out with the primer pairs P1+P2 and P3+P4, respectively. (C) PCR analysis of a T0 plant. A PCR product with the expected size (2.4 kb, Lane T0) was amplified with primers P3 and P4.
[0064] FIG. 30. Schematic representation of Zm-ARGOS8 promoter substitution with Zm-GOS2 PRO:GOS2 INTRON by targeting CTS3 (SEQ ID NO: 3) and CTS2 (SEQ ID NO:2). HR1 and HR2 indicate homologous recombination regions.
[0065] FIG. 31 A-D. Substitution of the native promoter of the ARGOS8 gene with Zm-GOS2 PRO:GOS2 INTRON in maize plants. (A) Schematic representation of the Zm-GOS2 PRO:GOS2 INTRON:ARGOS8 allele generated by promoter swap. Two guide RNA/Cas9 target sites, CTS3 (SEQ ID NO:3) and CTS2 (SEQ ID NO:2), were targeted with a gRNA3/gRNA2/Cas9 system. HR1 and HR2 indicate homologous recombination regions. P1 to P5 indicate PCR primers. (B) PCR screening of PMI-resistance calli to identify swap events. PCR results are shown for 10 representative calli. One callus sample, 12A09, is positive for both left junction (L, primer P1+P2) and right junction (R, primer P5+P4) PCR, indicating that 12A09 is a swap event. (C) PCR analysis of the callus events identified in primary screening. PCR products with the expected size (2.4 kb) were amplified using the primer P3 and P4 from event #3, 4, 6, 8 and 9, indicating presence of the Zm-GOS2 PRO:GOS2 INTRON:ARGOS8 allele. (D) PCR analysis of a T0 plant. A PCR product with the expected size (2.4 kb, Lane T0) was amplified with the primer P3 and P4.
[0066] FIG. 32 A-B. Deletion of the native promoter of the ARGOS8 gene in maize plants. (A) Schematic representation of promoter deletion. Two guide RNAs and a Cas9 endonuclease system, referred to as a gRNA3/gRNA2/Cas9 system, were used to target the CTS3 and CTS2 sites in Zm-ARGOS8. P1 and P4 indicate PCR primers for deletion event screening. (B) PCR screening of PMI-resistance calli to identify deletion events. PCR results are shown for 15 representative calli. A 1.1-kb PCR product indicates deletion of the CTS3/CTS2 fragment.
[0067] FIG. 33. Schematic representation of enhancer element deletions using the guide RNA/Cas9 target sequence. The enhancer element to be deleted can be, but is not limited to, a 35S enhancer element.
[0068] FIG. 34 A-C. Modification of a maize EPSPS polyubiquitination site. (A) The selected maize EPSPS polyubiquitination site is compared to the analogous sites of other plant species. (B) The nucleotides to be edited in the maize EPSPS coding sequence (underlined, encoded amino acid shown in bold). (C) The edited EPSPS coding sequence identified in the selected T0 plant.
[0069] FIG. 35 A-C. The intron mediated enhanced element (A). The 5' section of the first intron of the EPSPS gene (editing: substitutions underlined and deletions represented by dots) (B) and its edited version conferring three IMEs elements (underlined). The edited nucleotides are shown in bold (C).
[0070] FIG. 36 A-B. Alternatively spliced EPSPS mRNA in maize cells. (A) The left panel represents analysis of EPSPS cDNA. The lane I4 in FIG. 36A shows amplification of the EPSPS pre-mRNA containing the 3rd intron unspliced (the 804 bp diagnostic fragment as shown in FIG. 36 B indicates an alternate splicing event). Lanes E3 and F8 show the EPSPS PCR amplified fragments with spliced introns. These diagnostic fragments are not amplified unless cDNA is synthesized, as is evident from the absence of bands in lanes E3, I4, and F8 comprising total RNA (shown in the total RNA panel on the right of FIG. 36A). The grey boxes in FIG. 36 B represent the eight EPSPS exons (their sizes are indicated above each of them).
[0071] FIG. 37. Splicing site at the junction between the second EPSPS intron and the third exon (bolded). The nucleotide to be edited is underlined.
[0072] FIG. 38. Schematic representation of Southern hybridization analysis of T0 and T1 maize plants.
SEQUENCES
[0073] SEQ ID NO: 1 is the nucleotide sequence of the Cas9 gene from Streptococcus pyogenes M1 GAS (SF370).
[0074] SEQ ID NO: 2 is the nucleotide sequence of the potato ST-LS1 intron.
[0075] SEQ ID NO: 3 is the amino acid sequence of SV40 amino N-terminal.
[0076] SEQ ID NO: 4 is the amino acid sequence of Agrobacterium tumefaciens bipartite VirD2 T-DNA border endonuclease carboxyl terminal.
[0077] SEQ ID NO: 5 is the nucleotide sequence of an expression cassette expressing the maize optimized Cas9.
[0078] SEQ ID NO: 6 is the nucleotide sequence of crRNA containing the LIGCas-3 target sequence in the variable targeting domain.
[0079] SEQ ID NO: 7 is the nucleotide sequence of the tracrRNA.
[0080] SEQ ID NO: 8 is the nucleotide sequence of a long guide RNA containing the LIGCas-3 target sequence in the variable targeting domain.
[0081] SEQ ID NO: 9 is the nucleotide sequence of the Chromosome 8 maize U6 polymerase III promoter.
[0082] SEQ ID NO: 10 lists two copies of the nucleotide sequence of the maize U6 polymerase III terminator.
[0083] SEQ ID NO: 11 is the nucleotide sequence of the maize optimized short guide RNA containing the LIGCas-3 variable targeting domain.
[0084] SEQ ID NO: 12 is the nucleotide sequence of the maize optimized long guide RNA expression cassette containing the LIGCas-3 variable targeting domain.
[0085] SEQ ID NO: 13 is the nucleotide sequence of the Maize genomic target site MS26Cas-1 plus PAM sequence.
[0086] SEQ ID NO: 14 is the nucleotide sequence of the Maize genomic target site MS26Cas-2 plus PAM sequence.
[0087] SEQ ID NO: 15 is the nucleotide sequence of the Maize genomic target site MS26Cas-3 plus PAM sequence.
[0088] SEQ ID NO: 16 is the nucleotide sequence of the Maize genomic target site LIGCas-2 plus PAM sequence.
[0089] SEQ ID NO: 17 is the nucleotide sequence of the Maize genomic target site LIGCas-3 plus PAM sequence.
[0090] SEQ ID NO: 18 is the nucleotide sequence of the Maize genomic target site LIGCas-4 plus PAM sequence.
[0091] SEQ ID NO: 19 is the nucleotide sequence of the Maize genomic target site MS45Cas-1 plus PAM sequence.
[0092] SEQ ID NO: 20 is the nucleotide sequence of the Maize genomic target site MS45Cas-2 plus PAM sequence.
[0093] SEQ ID NO: 21 is the nucleotide sequence of the Maize genomic target site MS45Cas-3 plus PAM sequence.
[0094] SEQ ID NO: 22 is the nucleotide sequence of the Maize genomic target site ALSCas-1 plus PAM sequence.
[0095] SEQ ID NO: 23 is the nucleotide sequence of the Maize genomic target site ALSCas-2 plus PAM sequence.
[0096] SEQ ID NO: 24 is the nucleotide sequence of the Maize genomic target site ALSCas-3 plus PAM sequence.
[0097] SEQ ID NO: 25 is the nucleotide sequence of the Maize genomic target site EPSPSCas-1 plus PAM sequence.
[0098] SEQ ID NO: 26 is the nucleotide sequence of the Maize genomic target site EPSPSCas-2 plus PAM sequence.
[0099] SEQ ID NO: 27 is the nucleotide sequence of the Maize genomic target site EPSPSCas-3 plus PAM sequence.
[0100] SEQ ID NOs: 28-52 are the nucleotide sequence of target site specific forward primers for primary PCR as shown in Table 2.
[0101] SEQ ID NO: 53 is the nucleotide sequence of the forward primer for secondary PCR.
[0102] SEQ ID NO: 54 is the nucleotide sequence of Reverse primer for secondary PCR
[0103] SEQ ID NO: 55 is the nucleotide sequence of the unmodified reference sequence for LIGCas-1 and LIGCas-2 locus.
[0104] SEQ ID NOs: 56-65 are the nucleotide sequences of mutations 1-10 for LIGCas-1.
[0105] SEQ ID NOs: 66-75 are the nucleotide sequences of mutations 1-10 for LIGCas-2.
[0106] SEQ ID NO: 76 is the nucleotide sequence of the unmodified reference sequence for the LIGCas-3 and LIG3-4 homing endonuclease locus.
[0107] SEQ ID NOs: 77-86 are the nucleotide sequences of mutations 1-10 for LIGCas-3.
[0108] SEQ ID NOs: 88-96 are the nucleotide sequences of mutations 1-10 for LIG3-4 homing endonuclease locus.
[0109] SEQ ID NO: 97 is the nucleotide sequence of a donor vector referred to as an HR Repair DNA.
[0110] SEQ ID NO: 98 is the nucleotide sequence of forward PCR primer for site-specific transgene insertion at junction 1.
[0111] SEQ ID NO: 99 is the nucleotide sequence of reverse PCR primer for site-specific transgene insertion at junction 1.
[0112] SEQ ID NO: 100 is the nucleotide sequence of forward PCR primer for site-specific transgene insertion at junction 2.
[0113] SEQ ID NO: 101 is the nucleotide sequence of reverse PCR primer for site-specific transgene insertion at junction 2.
[0114] SEQ ID NO: 102 is the nucleotide sequence of the linked Cas9 endonuclease and LIGCas-3 long guide RNA expression cassettes
[0115] SEQ ID NO: 103 is the nucleotide sequence of Maize genomic target site 55CasRNA-1 plus PAM sequence.
[0116] SEQ ID NO: 104 is the nucleotide sequence of the unmodified reference sequence for 55CasRNA-1 locus.
[0117] SEQ ID NOs: 105-110 are the nucleotide sequences of mutations 1-6 for 55CasRNA-1.
[0118] SEQ ID NO: 111 is the nucleotide sequence of LIG3-4 homing endonuclease target site
[0119] SEQ ID NO: 112 is the nucleotide sequence of LIG3-4 homing endonuclease coding sequence.
[0120] SEQ ID NO: 113 is the nucleotide sequence of the MS26++ homing endonuclease target site.
[0121] SEQ ID NO: 114 is the nucleotide sequence of MS26++ homing endonuclease coding sequence
[0122] SEQ ID NO: 115 is the nucleotide sequence of the soybean codon optimized Cas9 gene.
[0123] SEQ ID NO: 116 is the nucleotide sequence of the soybean constitutive promoter GM-EF1A2.
[0124] SEQ ID NO: 117 is the nucleotide sequence of linker SV40 NLS.
[0125] SEQ ID NO: 118 is the amino acid sequence of soybean optimized Cas9 with a SV40 NLS.
[0126] SEQ ID NO: 119 is the nucleotide sequence of vector QC782.
[0127] SEQ ID NO: 120 is the nucleotide sequence of soybean U6 polymerase III promoter described herein, GM-U6-13.1 PRO.
[0128] SEQ ID NO: 121 is the nucleotide sequence of the guide RNA in FIG. 8B.
[0129] SEQ ID NO: 122 is the nucleotide sequence of vector QC783.
[0130] SEQ ID NO: 123 is the nucleotide sequence of vector QC815.
[0131] SEQ ID NO: 124 is the nucleotide sequence of a Cas9 endonuclease (cas9-2) from S. pyogenes.
[0132] SEQ ID NO: 125 is the nucleotide sequence of the DD20CR1 soybean target site
[0133] SEQ ID NO: 126 is the nucleotide sequence of the DD20CR2 soybean target site
[0134] SEQ ID NO: 127 is the nucleotide sequence of the DD43CR1 soybean target site
[0135] SEQ ID NO: 128 is the nucleotide sequence of the DD43CR2 soybean target site
[0136] SEQ ID NO: 129 is the nucleotide sequence of the DD20 sequence in FIG. 10A.
[0137] SEQ ID NO: 130 is the nucleotide sequence of the DD20 sequence complementary in FIG. 10A.
[0138] SEQ ID NO: 131 is the nucleotide sequence of DD43 sequence.
[0139] SEQ ID NO: 132 is the nucleotide sequence of the DD43 complementary sequence.
[0140] SEQ ID NO: 133-141 are primer sequences.
[0141] SEQ ID NO: 142 is the nucleotide sequence of the DD20CR1 PCR amplicon.
[0142] SEQ ID NO: 143 is the nucleotide sequence of the DD20CR2 PCR amplicon.
[0143] SEQ ID NO: 144 is the nucleotide sequence of the DD43CR1 PCR amplicon.
[0144] SEQ ID NO: 145 is the nucleotide sequence of the DD43CR2 PCR amplicon.
[0145] SEQ ID NO: 146 is the nucleotide sequence of the DD43CR2 PCR amplicon.
[0146] SEQ ID NO: 147-156 are the nucleotide sequence of mutations 1 to 10 for the DD20CR1 target site
[0147] SEQ ID NO: 157-166 are the nucleotide sequence of mutations 1 to 10 for the DD20CR2 target site
[0148] SEQ ID NO: 167-176 are the nucleotide sequence of mutations 1 to 10 for the DD43CR1 target site
[0149] SEQ ID NO: 177-191 are the nucleotide sequence of mutations 1 to 10 for the DD43CR2 target site.
[0150] SEQ ID NO: 192 is the amino acid sequence of a maize optimized version of the Cas9 protein.
[0151] SEQ ID NO: 193 is the nucleotide sequence of the maize optimized version of the Cas9 gene of SEQ ID NO: 192.
[0152] SEQ ID NO: 194 is the DNA version of guide RNA (EPSPS sgRNA).
[0153] SEQ ID NO: 195 is the EPSPS polynucleotide modification template.
[0154] SEQ ID NO: 196 is a nucleotide fragment comprising the TIPS nucleotide modifications.
[0155] SEQ ID NO: 197-204 are primer sequences shown in Table 15.
[0156] SEQ ID NO: 205-208 are nucleotide fragments shown in FIG. 14.
[0157] SEQ ID NO: 209 is an example of a TIPS edited EPSPS nucleotide sequence fragment shown in FIG. 17.
[0158] SEQ ID NO: 210 is an example of a Wild-type EPSPS nucleotide sequence fragment shown in FIG. 17.
[0159] SEQ ID NO: 211 is the nucleotide sequence of a maize enolpyruvylshikimate-3-phosphate synthase (epsps) locus
[0160] SEQ ID NO: 212 is the nucleotide sequence of a Cas9 endonuclease (genbank CS571758.1) from S. thermophilus.
[0161] SEQ ID NO: 213 is the nucleotide sequence of a Cas9 endonuclease (genbank CS571770.1) from S. thermophilus.
[0162] SEQ ID NO: 214 is the nucleotide sequence of a Cas9 endonuclease (genbank CS571785.1) from S. agalactiae.
[0163] SEQ ID NO: 215 is the nucleotide sequence of a Cas9 endonuclease (genbank CS571790.1) from S. agalactiae.
[0164] SEQ ID NO: 216 is the nucleotide sequence of a Cas9 endonuclease (genbank CS571790.1) from S. mutans.
[0165] SEQ ID NOs: 217-228 are primer and probe nucleotide sequences described in Example 17.
[0166] SEQ ID NO: 229 is the nucleotide sequence of the MHP14Cas1 target site.
[0167] SEQ ID NO: 230 is the nucleotide sequence of the MHP14Cas3 target site.
[0168] SEQ ID NO: 231 is the nucleotide sequence of the TS8Cas1 target site.
[0169] SEQ ID NO: 232 is the nucleotide sequence of the TS8Cas2 target site.
[0170] SEQ ID NO: 233 is the nucleotide sequence of the TS9Cas2 target site.
[0171] SEQ ID NO: 234 is the nucleotide sequence of the TS9Cas3 target site.
[0172] SEQ ID NO: 235 is the nucleotide sequence of the TS10Cas1 target site.
[0173] SEQ ID NO: 236 is the nucleotide sequence of the TS10Cas3 target site.
[0174] SEQ ID NOs: 237-244 are the nucleotide sequences shown in FIG. 19A-D.
[0175] SEQ ID NOs: 245-252 are the nucleotide sequences of the guide RNA expression cassettes described in Example 18.
[0176] SEQ ID NOs: 253-260 are the nucleotide sequences of donor DNA expression cassettes described in Example 18.
[0177] SEQ ID NOs: 261-270 are the nucleotide sequences of the primers described in Example 18.
[0178] SEQ ID NOs: 271-294 are the nucleotide sequences of the primers and probes described in Example 18.
[0179] SEQ ID NO: 295 is the nucleotide sequence of GM-U6-13.1 PRO, a soybean U6 polymerase III promoter described herein.
[0180] SEQ ID NOs: 298, 300, 301 and 303 are the nucleotide sequences of the linked guideRNA/Cas9 expression cassettes.
[0181] SEQ ID NOs: 299 and 302 are the nucleotide sequences of the donor DNA expression cassettes.
[0182] SEQ ID NOs: 271-294 are the nucleotide sequences of the primers and probes described in Example 18.
[0183] SEQ ID NO: 304 is the nucleotide sequence of the DD20 qPCR amplicon.
[0184] SEQ ID NO: 305 is the nucleotide sequence of the DD43 qPCR amplicon.
[0185] SEQ ID NOs: 306-328 are the nucleotide sequences of the primers and probes described herein.
[0186] SEQ ID NOs: 329-334 are the nucleotide sequences of PCR amplicons described herein.
[0187] SEQ ID NO: 335 is the nucleotide sequence of a soybean genomic region comprising the DD20CR1 target site.
[0188] SEQ ID NO: 364 is the nucleotide sequence of a soybean genomic region comprising the DD20CR2 target site.
[0189] SEQ ID NO: 386 is the nucleotide sequence of a soybean genomic region comprising the DD43CR1 target site.
[0190] SEQ ID NOs: 336-363, 365-385 and 387-414 are the nucleotide sequences shown in FIG. 26 A-C.
[0191] SEQ ID NOs: 415-444 are the nucleotide sequences of NHEJ mutations recovered based on the crRNA/tracrRNA/Cas endonuclease system shown in FIG. 27A-C.
[0192] SEQ ID NO: 445-447 are the nucleotide sequence of the LIGCas-1, LIGCas2 and LIGCas3 crRNA expression cassettes, respectively.
[0193] SEQ ID NO: 448 is the nucleotide sequence of the tracrRNA expression cassette.
[0194] SEQ ID NO: 449 is the nucleotide sequence of LIGCas-2 forward primer for primary PCR
[0195] SEQ ID NO: 450 is the nucleotide sequence of LIGCas-3 forward primer for primary PCR.
[0196] SEQ ID NO: 451 is the nucleotide sequence of the maize genomic Cas9 endonuclease target site Zm-ARGOS8-CTS1.
[0197] SEQ ID NO: 452 is the nucleotide sequence of the maize genomic Cas9 endonuclease target site Zm-ARGOS8-CTS2.
[0198] SEQ ID NO: 453 is the nucleotide sequence of the maize genomic Cas9 endonuclease target site Zm-ARGOS8-CTS3
[0199] SEQ ID NOs: 454-458 are the nucleotide sequence of primers P1, P2, P3, P4, P5, respectively.
[0200] SEQ ID NO: 459 is the nucleotide sequence of a Primer Binding Site (PBS), a sequence to facilitate event screening.
[0201] SEQ ID NO: 460 is the nucleotide sequence of the Zm-GOS2 PRO-GOS2 INTRON, the maize GOS2 promoter and GOS2 intron1 including the promoter, 5'-UTR1, INTRON1 and 5'-UTR2.
[0202] SEQ ID NO: 461 is the nucleotide sequence of the maize Zm-ARGOS8 promoter.
[0203] SEQ ID NO: 462 is the nucleotide sequence of the maize Zm-ARGOS8 5'-UTR.
[0204] SEQ ID NO: 463 is the nucleotide sequence of the maize Zm-ARGOS8 codon sequence
[0205] SEQ ID NO: 464 is the nucleotide sequence of the maize Zm-GOS2 gene, including promoter, 5'-UTR, CDS, 3'-UTR and introns.
[0206] SEQ ID NO: 465 is the nucleotide sequence of the maize Zm-GOS2 PRO promoter.
[0207] SEQ ID NO: 466 is the nucleotide sequence of the maize GOS2 INTRON, maize GOS2 5'-UTR1 and intron1 and 5'-UTR2.
[0208] SEQ ID NOs: 467-468, 490-491, 503-504 are the nucleotide sequence of the soybean genomic Cas endonuclease target sequences soy EPSPS-CR1, soy EPSPS-CR2, soy EPSPS-CR4, soy EPSPS-CR5, soy EPSPS-CR6, soy EPSPSCR7, respectively
[0209] SEQ ID NO: 469 is the nucleotide sequence of the soybean U6 small nuclear RNA promoter GM-U6-13.1.
[0210] SEQ ID NOs: 470, 471 are the nucleotide sequences of the QC868, QC879 plasmids, respectively.
[0211] SEQ ID NOs: 472, 473, 492, 493, 494, 505, 506, 507 are the nucleotide sequences of the RTW1013A, RTW1012A, RTW1199, RTW1200, RTW1190A, RTW1201, RTW1202, RTW1192A respectively.
[0212] SEQ ID NOs: 474-488, 495-502, 508-512 are the nucleotide sequences of primers and probes.
[0213] SEQ ID NO: 489 is the nucleotide sequence of the soybean codon optimized Cas9.
[0214] SEQ ID NO: 513 is the nucleotide sequence of the 35S enhancer.
[0215] SEQ ID NO: 514 is the nucleotide sequence of the 35S-CRTS for gRNA1 at 163-181 (including pam at 3' end).
[0216] SEQ ID NO: 515 is the nucleotide sequence of the 35S-CRTS for gRNA2 at 295-319 (including pam at 3' end).
[0217] SEQ ID NO: 516 is the nucleotide sequence of the 35S-CRT for gRNA3 at 331-350 (including pam at 3' end).
[0218] SEQ ID NO: 517 is the nucleotide sequence of the EPSPS-K90R template.
[0219] SEQ ID NO: 518 is the nucleotide sequence of the EPSPS-IME template.
[0220] SEQ ID NO: 519 is the nucleotide sequence of the EPSPS-Tspliced template.
[0221] SEQ ID NO: 520 is the amino acid sequence of ZM-RAP2.7 peptide
[0222] SEQ ID NO: 521 is the nucleotide sequence ZM-RAP2.7 coding DNA sequence
SEQ ID NO: 522 is the amino acid sequence of ZM-NPK1B peptide
[0223] SEQ ID NO: 523 is the nucleotide sequence of the ZM-NPK1B coding DNA sequence
[0224] SEQ ID NO: 524 is the nucleotide sequence of the RAB17 promoter
[0225] SEQ ID NO: 525 is the amino acid sequence of the Maize FTM1.
[0226] SEQ ID NO: 526 is the nucleotide sequence of the Maize FTM1 coding DNA sequence.
[0227] SEQ ID NOs: 527-532 are the nucleotide sequences shown in FIGS. 34, 35 and 37.
[0228] SEQ ID NOs: 533-534 are the nucleotide sequences of the Southern genomic probe and Southern MoPAT probe of FIG. 38, respectively. SEQ ID NOs: 535-541 are the nucleotide sequences of the RF-FPCas-1, RF-FPCas-2, ALSCas-4, ALS modification repair template 804, ALS modification repair template 127, ALS Forward_primer and ALS Reverse_primer, respectively.
[0229] SEQ ID NOs: 542-549 are the nucleotide sequences of the soy ALS1-CR1, Cas9 target sequence, soy ALS2-CR2, Cas9 target sequence, QC880, QC881, RTW1026A, WOL900, Forward_primer, WOL578, Reverse_primer and WOL573, Forward_primer, respectively.
[0230] SEQ ID NO: 550 is the nucleotide sequence of a maize ALS protein.
DETAILED DESCRIPTION
[0231] The present disclosure includes compositions and methods for genome modification of a target sequence in the genome of a plant or plant cell, for selecting plants, for gene editing, and for inserting a polynucleotide of interest into the genome of a plant. The methods employ a guide RNA/Cas endonuclease system, wherein the Cas endonuclease is guided by the guide RNA to recognize and optionally introduce a double strand break at a specific target site into the genome of a cell. The guide RNA/Cas endonuclease system provides for an effective system for modifying target sites within the genome of a plant, plant cell or seed. Further provided are methods and compositions employing a guide polynucleotide/Cas endonuclease system to provide an effective system for modifying target sites within the genome of a cell and for editing a nucleotide sequence in the genome of a cell. Once a genomic target site is identified, a variety of methods can be employed to further modify the target sites such that they contain a variety of polynucleotides of interest. Breeding methods utilizing a two component guide RNA/Cas endonuclease system are also disclosed. Compositions and methods are also provided for editing a nucleotide sequence in the genome of a cell. The nucleotide sequence to be edited (the nucleotide sequence of interest) can be located within or outside a target site that is recognized by a Cas endonuclease.
[0232] CRISPR loci (Clustered Regularly Interspaced Short Palindromic Repeats) (also known as SPIDRs--SPacer Interspersed Direct Repeats) constitute a family of recently described DNA loci. CRISPR loci consist of short and highly conserved DNA repeats (typically 24 to 40 bp, repeated from 1 to 140 times--also referred to as CRISPR-repeats) which are partially palindromic. The repeated sequences (usually specific to a species) are interspaced by variable sequences of constant length (typically 20 to 58 bp depending on the CRISPR locus) (WO2007/025097 published Mar. 1, 2007).
[0233] CRISPR loci were first recognized in E. coli (Ishino et al. (1987) J. Bacteriol. 169:5429-5433; Nakata et al. (1989) J. Bacteriol. 171:3553-3556). Similar interspersed short sequence repeats have been identified in Haloferax mediterranei, Streptococcus pyogenes, Anabaena, and Mycobacterium tuberculosis (Groenen et al. (1993) Mol. Microbiol. 10:1057-1065; Hoe et al. (1999) Emerg. Infect. Dis. 5:254-263; Masepohl et al. (1996) Biochim. Biophys. Acta 1307:26-30; Mojica et al. (1995) Mol. Microbiol. 17:85-93). The CRISPR loci differ from other SSRs by the structure of the repeats, which have been termed short regularly spaced repeats (SRSRs) (Janssen et al. (2002) OMICS J. Integ. Biol. 6:23-33; Mojica et al. (2000) Mol. Microbiol. 36:244-246). The repeats are short elements that occur in clusters that are always regularly spaced by variable sequences of constant length (Mojica et al. (2000) Mol. Microbiol. 36:244-246).
[0234] Cas gene includes a gene that is generally coupled, associated or close to or in the vicinity of flanking CRISPR loci. The terms "Cas gene", "CRISPR-associated (Cas) gene" are used interchangeably herein. A comprehensive review of the Cas protein family is presented in Haft et al. (2005) Computational Biology, PLoS Comput Biol 1(6): e60. doi:10.1371/journal.pcbi.0010060.
[0235] Therein, 41 CRISPR-associated (Cas) gene families are described, in addition to the four previously known gene families. The review shows that CRISPR systems belong to different classes, with different repeat patterns, sets of genes, and species ranges. The number of Cas genes at a given CRISPR locus can vary between species.
[0236] Cas endonuclease relates to a Cas protein encoded by a Cas gene, wherein said Cas protein is capable of introducing a double strand break into a DNA target sequence. The Cas endonuclease is guided by the guide polynucleotide to recognize and optionally introduce a double strand break at a specific target site into the genome of a cell. As used herein, the term "guide polynucleotide/Cas endonuclease system" includes a complex of a Cas endonuclease and a guide polynucleotide that is capable of introducing a double strand break into a DNA target sequence. The Cas endonuclease unwinds the DNA duplex in close proximity of the genomic target site and cleaves both DNA strands upon recognition of a target sequence by a guide RNA, but only if the correct protospacer-adjacent motif (PAM) is appropriately oriented at the 3' end of the target sequence (FIG. 2A, FIG. 2B).
[0237] In one embodiment, the Cas endonuclease gene is a Cas9 endonuclease, such as but not limited to, Cas9 genes listed in SEQ ID NOs: 462, 474, 489, 494, 499, 505, and 518 of WO2007/025097 published Mar. 1, 2007, and incorporated herein by reference. In another embodiment, the Cas endonuclease gene is plant, maize or soybean optimized Cas9 endonuclease (FIG. 1A). In another embodiment, the Cas endonuclease gene is operably linked to a SV40 nuclear targeting signal upstream of the Cas codon region and a bipartite VirD2 nuclear localization signal (Tinland et al. (1992) Proc. Natl. Acad. Sci. USA 89:7442-6) downstream of the Cas codon region.
[0238] In one embodiment, the Cas endonuclease gene is a Cas9 endonuclease gene of SEQ ID NO:1, 124, 212, 213, 214, 215, 216, 193 or nucleotides 2037-6329 of SEQ ID NO:5, or any functional fragment or variant thereof.
[0239] The terms "functional fragment", "fragment that is functionally equivalent" and "functionally equivalent fragment" are used interchangeably herein. These terms refer to a portion or subsequence of the Cas endonuclease sequence of the present disclosure in which the ability to create a double-strand break is retained.
[0240] The terms "functional variant", "variant that is functionally equivalent" and "functionally equivalent variant" are used interchangeably herein. These terms refer to a variant of the Cas endonuclease of the present disclosure in which the ability to create a double-strand break is retained. Fragments and variants can be obtained via methods such as site-directed mutagenesis and synthetic construction.
[0241] In one embodiment, the Cas endonuclease gene is a plant codon optimized Streptococcus pyogenes Cas9 gene that can recognize any genomic sequence of the form N(12-30)NGG, such that any such sequence can in principle be targeted.
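By way of illustration only, the N(12-30)NGG pattern described above can be located computationally. The following minimal Python sketch scans one strand of a DNA string for candidate sites made up of a fixed-length protospacer followed by an NGG PAM. The function name find_ngg_sites, the default length of 20 and the example sequence are assumptions made for this illustration and are not part of the disclosure. The opposite strand is not scanned.

```python
def find_ngg_sites(seq, length=20):
    """Return (start, protospacer, pam) for each N(length)NGG site on the given strand.

    `length` is the protospacer length and must lie in the 12 to 30 range
    stated in the text. Only the strand passed in is scanned.
    """
    if not 12 <= length <= 30:
        raise ValueError("length must be between 12 and 30")
    seq = seq.upper()
    sites = []
    # p is the index of the first PAM base (the "N" of NGG).
    for p in range(length, len(seq) - 2):
        if seq[p + 1:p + 3] == "GG":
            sites.append((p - length, seq[p - length:p], seq[p:p + 3]))
    return sites


# Hypothetical 15-nt example: one 12-nt protospacer followed by the PAM "AGG".
print(find_ngg_sites("ACGTACGTACGTAGG", length=12))
# [(0, 'ACGTACGTACGT', 'AGG')]
```

A fuller search would also scan the reverse complement, since a target site can lie on either strand.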
[0242] In one embodiment, the Cas endonuclease is introduced directly into a cell by any method known in the art, for example, but not limited to transient introduction methods, transfection and/or topical application.
[0243] Endonucleases are enzymes that cleave the phosphodiester bond within a polynucleotide chain, and include restriction endonucleases that cleave DNA at specific sites without damaging the bases. Restriction endonucleases include Type I, Type II, Type III, and Type IV endonucleases, which further include subtypes. In the Type I and Type III systems, both the methylase and restriction activities are contained in a single complex. Endonucleases also include meganucleases, also known as homing endonucleases (HEases), which, like restriction endonucleases, bind and cut at a specific recognition site; however, the recognition sites for meganucleases are typically longer, about 18 bp or more (patent application WO-PCT PCT/US12/30061 filed on Mar. 22, 2012). Meganucleases have been classified into four families based on conserved sequence motifs: the LAGLIDADG, GIY-YIG, H-N-H, and His-Cys box families. These motifs participate in the coordination of metal ions and hydrolysis of phosphodiester bonds. HEases are notable for their long recognition sites, and for tolerating some sequence polymorphisms in their DNA substrates. The naming convention for meganucleases is similar to the convention for other restriction endonucleases. Meganucleases are also characterized by the prefix F-, I-, or PI- for enzymes encoded by free-standing ORFs, introns, and inteins, respectively. One step in the recombination process involves polynucleotide cleavage at or near the recognition site. This cleaving activity can be used to produce a double-strand break. For reviews of site-specific recombinases and their recognition sites, see Sauer (1994) Curr Op Biotechnol 5:521-7; and Sadowski (1993) FASEB 7:760-7. In some examples the recombinase is from the Integrase or Resolvase families.
[0244] TAL effector nucleases are a new class of sequence-specific nucleases that can be used to make double-strand breaks at specific target sequences in the genome of a plant or other organism (Miller et al. (2011) Nature Biotechnology 29:143-148). Zinc finger nucleases (ZFNs) are engineered double-strand break inducing agents comprised of a zinc finger DNA binding domain and a double-strand-break-inducing agent domain. Recognition site specificity is conferred by the zinc finger domain, which typically comprises two, three, or four zinc fingers, for example having a C2H2 structure; however, other zinc finger structures are known and have been engineered. Zinc finger domains are amenable for designing polypeptides which specifically bind a selected polynucleotide recognition sequence. ZFNs include an engineered DNA-binding zinc finger domain linked to a non-specific endonuclease domain, for example a nuclease domain from a Type IIs endonuclease such as FokI. Additional functionalities can be fused to the zinc-finger binding domain, including transcriptional activator domains, transcription repressor domains, and methylases. In some examples, dimerization of the nuclease domain is required for cleavage activity. Each zinc finger recognizes three consecutive base pairs in the target DNA. For example, a 3 finger domain recognizes a sequence of 9 contiguous nucleotides; with a dimerization requirement of the nuclease, two sets of zinc finger triplets are used to bind an 18 nucleotide recognition sequence.
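The recognition-length arithmetic described above (three base pairs per zinc finger, doubled when the nuclease domain must dimerize) can be written out as a short Python sketch. The function name zfn_recognition_length and its parameters are hypothetical and serve only to illustrate the calculation. Any spacer between the two half-sites is deliberately left out of the count.

```python
BP_PER_FINGER = 3  # each zinc finger recognizes three consecutive base pairs


def zfn_recognition_length(fingers_per_monomer, dimer=True):
    """Total number of base pairs contacted by the zinc fingers of a ZFN.

    With dimer=True, two zinc finger arrays (one per monomer) bind the
    target, so the contacted length is doubled.
    """
    if fingers_per_monomer < 1:
        raise ValueError("at least one zinc finger is required")
    per_monomer = fingers_per_monomer * BP_PER_FINGER
    return 2 * per_monomer if dimer else per_monomer


print(zfn_recognition_length(3, dimer=False))  # 9
print(zfn_recognition_length(3))               # 18
```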
[0245] Bacteria and archaea have evolved adaptive immune defenses termed clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) systems that use short RNA to direct degradation of foreign nucleic acids (WO2007/025097 published Mar. 1, 2007). The type II CRISPR/Cas system from bacteria employs a crRNA and tracrRNA to guide the Cas endonuclease to its DNA target. The crRNA (CRISPR RNA) contains the region complementary to one strand of the double strand DNA target and base pairs with the tracrRNA (trans-activating CRISPR RNA) forming a RNA duplex that directs the Cas endonuclease to cleave the DNA target (FIG. 2 B).
[0246] As used herein, the term "guide RNA" relates to a synthetic fusion of two RNA molecules, a crRNA (CRISPR RNA) comprising a variable targeting domain, and a tracrRNA (FIG. 2 B). In one embodiment, the guide RNA comprises a variable targeting domain of 12 to 30 nucleotide sequences and a RNA fragment that can interact with a Cas endonuclease.
[0247] As used herein, the term "guide polynucleotide" relates to a polynucleotide sequence that can form a complex with a Cas endonuclease and enables the Cas endonuclease to recognize and optionally cleave a DNA target site. The guide polynucleotide can be a single molecule or a double molecule. The guide polynucleotide sequence can be a RNA sequence, a DNA sequence, or a combination thereof (a RNA-DNA combination sequence). Optionally, the guide polynucleotide can comprise at least one nucleotide, phosphodiester bond or linkage modification such as, but not limited to, Locked Nucleic Acid (LNA), 5-methyl dC, 2,6-Diaminopurine, 2'-Fluoro A, 2'-Fluoro U, 2'-O-Methyl RNA, phosphorothioate bond, linkage to a cholesterol molecule, linkage to a polyethylene glycol molecule, linkage to a spacer 18 (hexaethylene glycol chain) molecule, or 5' to 3' covalent linkage resulting in circularization. A guide polynucleotide that solely comprises ribonucleic acids is also referred to as a "guide RNA".
[0248] The guide polynucleotide can be a double molecule (also referred to as duplex guide polynucleotide) comprising a first nucleotide sequence domain (referred to as Variable Targeting domain or VT domain) that is complementary to a nucleotide sequence in a target DNA and a second nucleotide sequence domain (referred to as Cas endonuclease recognition domain or CER domain) that interacts with a Cas endonuclease polypeptide. The CER domain of the double molecule guide polynucleotide comprises two separate molecules that are hybridized along a region of complementarity. The two separate molecules can be RNA, DNA, and/or RNA-DNA-combination sequences. In some embodiments, the first molecule of the duplex guide polynucleotide comprising a VT domain linked to a CER domain is referred to as "crDNA" (when composed of a contiguous stretch of DNA nucleotides) or "crRNA" (when composed of a contiguous stretch of RNA nucleotides), or "crDNA-RNA" (when composed of a combination of DNA and RNA nucleotides). The crNucleotide can comprise a fragment of the crRNA naturally occurring in Bacteria and Archaea. In one embodiment, the size of the fragment of the crRNA naturally occurring in Bacteria and Archaea that is present in a crNucleotide disclosed herein can range from, but is not limited to, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more nucleotides. In some embodiments the second molecule of the duplex guide polynucleotide comprising a CER domain is referred to as "tracrRNA" (when composed of a contiguous stretch of RNA nucleotides) or "tracrDNA" (when composed of a contiguous stretch of DNA nucleotides) or "tracrDNA-RNA" (when composed of a combination of DNA and RNA nucleotides). In one embodiment, the RNA that guides the RNA/Cas9 endonuclease complex is a duplexed RNA comprising a duplex crRNA-tracrRNA.
[0249] The guide polynucleotide can also be a single molecule comprising a first nucleotide sequence domain (referred to as Variable Targeting domain or VT domain) that is complementary to a nucleotide sequence in a target DNA and a second nucleotide domain (referred to as Cas endonuclease recognition domain or CER domain) that interacts with a Cas endonuclease polypeptide. By "domain" it is meant a contiguous stretch of nucleotides that can be RNA, DNA, and/or RNA-DNA-combination sequence. The VT domain and/or the CER domain of a single guide polynucleotide can comprise a RNA sequence, a DNA sequence, or a RNA-DNA-combination sequence. In some embodiments the single guide polynucleotide comprises a crNucleotide (comprising a VT domain linked to a CER domain) linked to a tracrNucleotide (comprising a CER domain), wherein the linkage is a nucleotide sequence comprising a RNA sequence, a DNA sequence, or a RNA-DNA combination sequence. The single guide polynucleotide being comprised of sequences from the crNucleotide and tracrNucleotide may be referred to as "single guide RNA" (when composed of a contiguous stretch of RNA nucleotides) or "single guide DNA" (when composed of a contiguous stretch of DNA nucleotides) or "single guide RNA-DNA" (when composed of a combination of RNA and DNA nucleotides). In one embodiment of the disclosure, the single guide RNA comprises a cRNA or cRNA fragment and a tracrRNA or tracrRNA fragment of the type II CRISPR/Cas system that can form a complex with a type II Cas endonuclease, wherein said guide RNA/Cas endonuclease complex can direct the Cas endonuclease to a plant genomic target site, enabling the Cas endonuclease to introduce a double strand break into the genomic target site. One aspect of using a single guide polynucleotide versus a duplex guide polynucleotide is that only one expression cassette needs to be made to express the single guide polynucleotide.
[0250] The terms "variable targeting domain" and "VT domain" are used interchangeably herein and include a nucleotide sequence that is complementary to one strand (nucleotide sequence) of a double strand DNA target site (FIGS. 2 A and 2 B). The percent complementarity between the first nucleotide sequence domain (VT domain) and the target sequence can be at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%. The variable targeting domain can be at least 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides in length. In some embodiments, the variable targeting domain comprises a contiguous stretch of 12 to 30 nucleotides. The variable targeting domain can be composed of a DNA sequence, a RNA sequence, a modified DNA sequence, a modified RNA sequence, or any combination thereof.
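By way of a non-limiting illustration, the percent complementarity between a VT domain and one strand of a target site could be computed as in the following minimal Python sketch. The function names, the assumption of an RNA VT domain paired against a DNA strand of equal length, and the ungapped antiparallel pairing are illustrative assumptions and are not part of the disclosure.

```python
# Illustrative sketch only: names and conventions here are assumptions.
# Watson-Crick partner on a DNA target strand for each RNA guide base.
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(vt_domain_rna: str, target_strand_dna: str) -> float:
    """Percent of VT-domain positions that base-pair with the target strand.

    Both sequences are written 5'->3'. Pairing is antiparallel, so the
    target strand is read in reverse while walking along the VT domain.
    """
    if not vt_domain_rna or len(vt_domain_rna) != len(target_strand_dna):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        RNA_TO_DNA_PARTNER.get(guide_base) == target_base
        for guide_base, target_base in zip(
            vt_domain_rna.upper(), reversed(target_strand_dna.upper())
        )
    )
    return 100.0 * paired / len(vt_domain_rna)


def vt_length_in_range(vt_domain: str, low: int = 12, high: int = 30) -> bool:
    """True if the VT domain length falls in the 12 to 30 nucleotide range."""
    return low <= len(vt_domain) <= high
```

In this sketch a VT domain that pairs at every position scores 100%, and each mismatch lowers the score by one position's share of the domain length.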
[0251] The term "Cas endonuclease recognition domain" or "CER domain" of a guide polynucleotide is used interchangeably herein and includes a nucleotide sequence (such as a second nucleotide sequence domain of a guide polynucleotide), that interacts with a Cas endonuclease polypeptide. The CER domain can be composed of a DNA sequence, a RNA sequence, a modified DNA sequence, a modified RNA sequence (see for example modifications described herein), or any combination thereof.
[0252] The nucleotide sequence linking the crNucleotide and the tracrNucleotide of a single guide polynucleotide can comprise a RNA sequence, a DNA sequence, or a RNA-DNA combination sequence. In one embodiment, the nucleotide sequence linking the crNucleotide and the tracrNucleotide of a single guide polynucleotide can be at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 or 100 nucleotides in length. In another embodiment, the nucleotide sequence linking the crNucleotide and the tracrNucleotide of a single guide polynucleotide can comprise a tetraloop sequence, such as, but not limited to, a GAAA tetraloop sequence.
[0253] Nucleotide sequence modification of the guide polynucleotide, VT domain and/or CER domain can be selected from, but not limited to, the group consisting of a 5' cap, a 3' polyadenylated tail, a riboswitch sequence, a stability control sequence, a sequence that forms a dsRNA duplex, a modification or sequence that targets the guide polynucleotide to a subcellular location, a modification or sequence that provides for tracking, a modification or sequence that provides a binding site for proteins, a Locked Nucleic Acid (LNA), a 5-methyl dC nucleotide, a 2,6-Diaminopurine nucleotide, a 2'-Fluoro A nucleotide, a 2'-Fluoro U nucleotide, a 2'-O-Methyl RNA nucleotide, a phosphorothioate bond, linkage to a cholesterol molecule, linkage to a polyethylene glycol molecule, linkage to a spacer 18 molecule, a 5' to 3' covalent linkage, or any combination thereof. These modifications can result in at least one additional beneficial feature, wherein the additional beneficial feature is selected from the group of a modified or regulated stability, a subcellular targeting, tracking, a fluorescent label, a binding site for a protein or protein complex, modified binding affinity to complementary target sequence, modified resistance to cellular degradation, and increased cellular permeability.
[0254] In one embodiment, the guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a DNA target site.
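As a non-limiting illustration of the layout described above, a single guide polynucleotide could be assembled, 5' to 3', as a crNucleotide (VT domain linked to a CER domain portion), a linking sequence, and a tracrNucleotide. The following Python sketch is hypothetical: the function name, parameter names and the default GAAA linker are illustrative assumptions, and no particular sequences are implied.

```python
# Illustrative sketch only: the function and parameter names are assumptions.
def build_single_guide(vt_domain: str, cr_cer: str, tracr: str,
                       linker: str = "GAAA") -> str:
    """Concatenate the parts of a single guide polynucleotide, 5'->3'.

    vt_domain : variable targeting domain
    cr_cer    : CER-domain portion carried by the crNucleotide
    tracr     : tracrNucleotide (CER-domain portion)
    linker    : linking sequence, e.g. a GAAA tetraloop
    """
    if len(linker) < 3:
        raise ValueError("linker is expected to be at least 3 nucleotides")
    cr_nucleotide = vt_domain + cr_cer  # VT domain linked to a CER domain
    return cr_nucleotide + linker + tracr
```

Because the parts are joined into one molecule, a single expression cassette can encode the resulting sequence.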
[0255] In one embodiment of the disclosure the variable target domain is 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides in length.
[0256] In one embodiment of the disclosure, the guide RNA comprises a cRNA (or cRNA fragment) and a tracrRNA (or tracrRNA fragment) of the type II CRISPR/Cas system that can form a complex with a type II Cas endonuclease, wherein said guide RNA/Cas endonuclease complex can direct the Cas endonuclease to a plant genomic target site, enabling the Cas endonuclease to introduce a double strand break into the genomic target site.
[0257] In one embodiment the guide RNA can be introduced into a plant or plant cell directly using any method known in the art such as, but not limited to, particle bombardment or topical applications.
[0258] In another embodiment the guide RNA can be introduced indirectly by introducing a recombinant DNA molecule comprising the corresponding guide DNA sequence operably linked to a plant specific promoter (as shown in FIG. 1B) that is capable of transcribing the guide RNA in said plant cell. The term "corresponding guide DNA" includes a DNA molecule that is identical to the RNA molecule but has a "T" substituted for each "U" of the RNA molecule.
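For illustration only, the relationship between a guide RNA and its "corresponding guide DNA" (a "T" substituted for each "U") can be sketched in Python as follows. The function name is a hypothetical choice made for this example.

```python
# Illustrative sketch only: the function name is an assumption.
def corresponding_guide_dna(guide_rna: str) -> str:
    """Return the DNA sequence identical to the RNA but with T in place of U."""
    return guide_rna.upper().replace("U", "T")
```

For example, the RNA string "GUUUUAGAGCUA" maps to the DNA string "GTTTTAGAGCTA".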
[0259] In some embodiments, the guide RNA is introduced via particle bombardment or Agrobacterium transformation of a recombinant DNA construct comprising the corresponding guide DNA operably linked to a plant U6 polymerase III promoter.
[0260] In one embodiment, the RNA that guides the RNA/Cas9 endonuclease complex is a duplexed RNA comprising a duplex crRNA-tracrRNA (as shown in FIG. 2B). One advantage of using a guide RNA versus a duplexed crRNA-tracrRNA is that only one expression cassette needs to be made to express the fused guide RNA.
[0261] The terms "target site", "target sequence", "target DNA", "target locus", "genomic target site", "genomic target sequence", and "genomic target locus" are used interchangeably herein and refer to a polynucleotide sequence in the genome (including chloroplastic and mitochondrial DNA) of a plant cell at which a double-strand break is induced in the plant cell genome by a Cas endonuclease. The target site can be an endogenous site in the plant genome, or alternatively, the target site can be heterologous to the plant and thereby not be naturally occurring in the genome, or the target site can be found in a heterologous genomic location compared to where it occurs in nature. As used herein, the terms "endogenous target sequence" and "native target sequence" are used interchangeably to refer to a target sequence that is endogenous or native to the genome of a plant and is at the endogenous or native position of that target sequence in the genome of the plant.
[0262] In one embodiment, the target site can be similar to a DNA recognition site or target site that is specifically recognized and/or bound by a double-strand break inducing agent such as a LIG3-4 endonuclease (US patent publication 2009-0133152 A1, published May 21, 2009) or a MS26++ meganuclease (U.S. patent application Ser. No. 13/526,912, filed Jun. 19, 2012).
[0263] The terms "artificial target site" and "artificial target sequence" are used interchangeably herein and refer to a target sequence that has been introduced into the genome of a plant. Such an artificial target sequence can be identical in sequence to an endogenous or native target sequence in the genome of a plant but be located in a different position (i.e., a non-endogenous or non-native position) in the genome of a plant.
[0264] The terms "altered target site", "altered target sequence", "modified target site", and "modified target sequence" are used interchangeably herein and refer to a target sequence as disclosed herein that comprises at least one alteration when compared to a non-altered target sequence. Such "alterations" include, for example:
(i) replacement of at least one nucleotide, (ii) a deletion of at least one nucleotide, (iii) an insertion of at least one nucleotide, or (iv) any combination of (i)-(iii).
[0265] Methods for modifying a plant genomic target site are disclosed herein. In one embodiment, a method for modifying a target site in the genome of a plant cell comprises introducing a guide RNA into a plant cell having a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site.
[0266] Also provided is a method for modifying a target site in the genome of a plant cell, the method comprising introducing a guide RNA and a Cas endonuclease into said plant, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site.
[0267] Further provided is a method for modifying a target site in the genome of a plant cell, the method comprising introducing a guide RNA and a donor DNA into a plant cell having a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site, wherein said donor DNA comprises a polynucleotide of interest.
[0268] Further provided is a method for modifying a target site in the genome of a plant cell, the method comprising: a) introducing into a plant cell a guide RNA comprising a variable targeting domain and a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site; and, b) identifying at least one plant cell that has a modification at said target, wherein the modification includes at least one deletion or substitution of one or more nucleotides in said target site.
[0269] Further provided, a method for modifying a target DNA sequence in the genome of a plant cell, the method comprising: a) introducing into a plant cell a first recombinant DNA construct capable of expressing a guide RNA and a second recombinant DNA construct capable of expressing a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site; and, b) identifying at least one plant cell that has a modification at said target, wherein the modification includes at least one deletion or substitution of one or more nucleotides in said target site.
[0270] The length of the target site can vary, and includes, for example, target sites that are at least 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more nucleotides in length. It is further possible that the target site can be palindromic, that is, the sequence on one strand reads the same in the opposite direction on the complementary strand. The nick/cleavage site can be within the target sequence or the nick/cleavage site could be outside of the target sequence. In another variation, the cleavage could occur at nucleotide positions immediately opposite each other to produce a blunt end cut or, in other cases, the incisions could be staggered to produce single-stranded overhangs, also called "sticky ends", which can be either 5' overhangs, or 3' overhangs.
[0271] In some embodiments, the genomic target site capable of being cleaved by a Cas endonuclease comprises a 12 to 30 nucleotide fragment of a male fertility gene such as MS26 (see for example U.S. Pat. Nos. 7,098,388, 7,517,975, 7,612,251), MS45 (see for example U.S. Pat. Nos. 5,478,369, 6,265,640) or MSCA1 (see for example U.S. Pat. No. 7,919,676), ALS or EPSPS genes.
[0272] Active variants of genomic target sites can also be used. Such active variants can comprise at least 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the given target site, wherein the active variants retain biological activity and hence are capable of being recognized and cleaved by a Cas endonuclease. Assays to measure the double-strand break of a target site by an endonuclease are known in the art and generally measure the overall activity and specificity of the agent on DNA substrates containing recognition sites.
[0273] Various methods and compositions can be employed to obtain a plant having a polynucleotide of interest inserted in a target site for a Cas endonuclease. Such methods can employ homologous recombination to provide integration of the polynucleotide of interest at the target site. In one method provided, a polynucleotide of interest is provided to the plant cell in a donor DNA construct. As used herein, "donor DNA" is a DNA construct that comprises a polynucleotide of interest to be inserted into the target site of a Cas endonuclease. The donor DNA construct further comprises a first and a second region of homology that flank the polynucleotide of interest. The first and second regions of homology of the donor DNA share homology to a first and a second genomic region, respectively, present in or flanking the target site of the plant genome. By "homology" is meant DNA sequences that are similar. For example, a "region of homology to a genomic region" that is found on the donor DNA is a region of DNA that has a similar sequence to a given "genomic region" in the plant genome. A region of homology can be of any length that is sufficient to promote homologous recombination at the cleaved target site. For example, the region of homology can comprise at least 5-10, 5-15, 5-20, 5-25, 5-30, 5-35, 5-40, 5-45, 5-50, 5-55, 5-60, 5-65, 5-70, 5-75, 5-80, 5-85, 5-90, 5-95, 5-100, 5-200, 5-300, 5-400, 5-500, 5-600, 5-700, 5-800, 5-900, 5-1000, 5-1100, 5-1200, 5-1300, 5-1400, 5-1500, 5-1600, 5-1700, 5-1800, 5-1900, 5-2000, 5-2100, 5-2200, 5-2300, 5-2400, 5-2500, 5-2600, 5-2700, 5-2800, 5-2900, 5-3000, 5-3100 or more bases in length such that the region of homology has sufficient homology to undergo homologous recombination with the corresponding genomic region. "Sufficient homology" indicates that two polynucleotide sequences have sufficient structural similarity to act as substrates for a homologous recombination reaction.
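The notions of a palindromic target site and of blunt versus staggered cleavage can be illustrated with the following minimal Python sketch. It is offered only as an example: the function names and the coordinate convention (each cut position counted as the number of top-strand nucleotides to the left of the cut) are assumptions introduced here.

```python
# Illustrative sketch only: names and the coordinate convention are assumptions.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.upper().translate(DNA_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if one strand reads the same as its complementary strand, 5'->3'."""
    return seq.upper() == reverse_complement(seq)


def end_type(top_cut: int, bottom_cut: int) -> str:
    """Classify the ends produced by a pair of strand cuts.

    Both positions are given in top-strand coordinates, i.e. the number of
    top-strand nucleotides lying to the left of the cut.
    """
    if top_cut == bottom_cut:
        return "blunt"
    return "5' overhang" if top_cut < bottom_cut else "3' overhang"
```

Under this convention, cuts directly opposite each other give blunt ends, a top-strand cut to the left of the bottom-strand cut leaves 5' overhangs, and the reverse arrangement leaves 3' overhangs.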
The structural similarity includes overall length of each polynucleotide fragment, as well as the sequence similarity of the polynucleotides. Sequence similarity can be described by the percent sequence identity over the whole length of the sequences, and/or by conserved regions comprising localized similarities such as contiguous nucleotides having 100% sequence identity, and percent sequence identity over a portion of the length of the sequences.
[0274] The amount of homology or sequence identity shared by a target and a donor polynucleotide can vary and includes total lengths and/or regions having unit integral values in the ranges of about 1-20 bp, 20-50 bp, 50-100 bp, 75-150 bp, 100-250 bp, 150-300 bp, 200-400 bp, 250-500 bp, 300-600 bp, 350-750 bp, 400-800 bp, 450-900 bp, 500-1000 bp, 600-1250 bp, 700-1500 bp, 800-1750 bp, 900-2000 bp, 1-2.5 kb, 1.5-3 kb, 2-4 kb, 2.5-5 kb, 3-6 kb, 3.5-7 kb, 4-8 kb, 5-10 kb, or up to and including the total length of the target site. These ranges include every integer within the range, for example, the range of 1-20 bp includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 and 20 bp. The amount of homology can also be described by percent sequence identity over the full aligned length of the two polynucleotides which includes percent sequence identity of about at least 50%, 55%, 60%, 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%. Sufficient homology includes any combination of polynucleotide length, global percent sequence identity, and optionally conserved regions of contiguous nucleotides or local percent sequence identity, for example sufficient homology can be described as a region of 75-150 bp having at least 80% sequence identity to a region of the target locus. Sufficient homology can also be described by the predicted ability of two polynucleotides to specifically hybridize under high stringency conditions, see, for example, Sambrook et al., (1989) Molecular Cloning: A Laboratory Manual, (Cold Spring Harbor Laboratory Press, NY); Current Protocols in Molecular Biology, Ausubel et al., Eds (1994) Current Protocols, (Greene Publishing Associates, Inc. and John Wiley & Sons, Inc.); and, Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology--Hybridization with Nucleic Acid Probes, (Elsevier, New York).
[0275] As used herein, a "genomic region" is a segment of a chromosome in the genome of a plant cell that is present on either side of the target site or, alternatively, also comprises a portion of the target site. The genomic region can comprise at least 5-10, 5-15, 5-20, 5-25, 5-30, 5-35, 5-40, 5-45, 5-50, 5-55, 5-60, 5-65, 5-70, 5-75, 5-80, 5-85, 5-90, 5-95, 5-100, 5-200, 5-300, 5-400, 5-500, 5-600, 5-700, 5-800, 5-900, 5-1000, 5-1100, 5-1200, 5-1300, 5-1400, 5-1500, 5-1600, 5-1700, 5-1800, 5-1900, 5-2000, 5-2100, 5-2200, 5-2300, 5-2400, 5-2500, 5-2600, 5-2700, 5-2800, 5-2900, 5-3000, 5-3100 or more bases such that the genomic region has sufficient homology to undergo homologous recombination with the corresponding region of homology.
[0276] Polynucleotides of interest and/or traits can be stacked together in a complex trait locus as described in US-2013-0263324-A1, published 3 Oct. 2013, and in PCT/US13/22891, published Jan. 24, 2013; both applications are hereby incorporated by reference. The guide polynucleotide/Cas9 endonuclease system described herein provides for an efficient system to generate double strand breaks and allows for traits to be stacked in a complex trait locus.
[0277] In one embodiment, the guide polynucleotide/Cas endonuclease system is used for introducing one or more polynucleotides of interest or one or more traits of interest into one or more target sites by providing one or more guide polynucleotides, one Cas endonuclease, and optionally one or more donor DNAs to a plant cell. A fertile plant can be produced from that plant cell that comprises an alteration at said one or more target sites, wherein the alteration is selected from the group consisting of (i) replacement of at least one nucleotide, (ii) a deletion of at least one nucleotide, (iii) an insertion of at least one nucleotide, and (iv) any combination of (i)-(iii). Plants comprising these altered target sites can be crossed with plants comprising at least one gene or trait of interest in the same complex trait locus, thereby further stacking traits in said complex trait locus (see also US-2013-0263324-A1, published 3 Oct. 2013, and PCT/US13/22891, published Jan. 24, 2013).
[0278] In one embodiment, the method comprises a method for producing in a plant a complex trait locus comprising at least two altered target sequences in a genomic region of interest, said method comprising: (a) selecting a genomic region in a plant, wherein the genomic region comprises a first target sequence and a second target sequence; (b) contacting at least one plant cell with at least a first guide polynucleotide, a second guide polynucleotide, and optionally at least one donor DNA, and a Cas endonuclease, wherein the first and second guide polynucleotide and the Cas endonuclease can form a complex that enables the Cas endonuclease to introduce a double strand break in at least a first and a second target sequence; (c) identifying a cell from (b) comprising a first alteration at the first target sequence and a second alteration at the second target sequence; and (d) recovering a first fertile plant from the cell of (c), said fertile plant comprising the first alteration and the second alteration, wherein the first alteration and the second alteration are physically linked.
[0279] In one embodiment, the method comprises a method for producing in a plant a complex trait locus comprising at least two altered target sequences in a genomic region of interest, said method comprising: (a) selecting a genomic region in a plant, wherein the genomic region comprises a first target sequence and a second target sequence; (b) contacting at least one plant cell with a first guide polynucleotide, a Cas endonuclease, and optionally a first donor DNA, wherein the first guide polynucleotide and the Cas endonuclease can form a complex that enables the Cas endonuclease to introduce a double strand break at a first target sequence; (c) identifying a cell from (b) comprising a first alteration at the first target sequence; (d) recovering a first fertile plant from the cell of (c), said first fertile plant comprising the first alteration; (e) contacting at least one plant cell with a second guide polynucleotide, a Cas endonuclease and optionally a second donor DNA; (f) identifying a cell from (e) comprising a second alteration at the second target sequence; (g) recovering a second fertile plant from the cell of (f), said second fertile plant comprising the second alteration; and, (h) obtaining a fertile progeny plant from the second fertile plant of (g), said fertile progeny plant comprising the first alteration and the second alteration, wherein the first alteration and the second alteration are physically linked.
[0280] The structural similarity between a given genomic region and the corresponding region of homology found on the donor DNA can be any degree of sequence identity that allows for homologous recombination to occur. For example, the amount of homology or sequence identity shared by the "region of homology" of the donor DNA and the "genomic region" of the plant genome can be at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity, such that the sequences undergo homologous recombination.
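As a non-limiting illustration of the example given above (a region of 75-150 bp having at least 80% sequence identity), a length-plus-identity criterion could be checked as in the following Python sketch. The sketch assumes two pre-aligned, ungapped sequences of equal length; the function names and default thresholds are illustrative assumptions, and the check is not a predictor of recombination in any organism.

```python
# Illustrative sketch only: names, defaults and the ungapped comparison
# of equal-length sequences are assumptions.
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that are identical in two pre-aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_homology_criterion(donor_region: str, genomic_region: str,
                             min_len: int = 75, max_len: int = 150,
                             min_identity: float = 80.0) -> bool:
    """True if the region length and percent identity both satisfy the criterion."""
    if not min_len <= len(donor_region) <= max_len:
        return False
    return percent_identity(donor_region, genomic_region) >= min_identity
```

Other combinations of length and identity recited above can be expressed by changing the threshold arguments.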
[0281] The region of homology on the donor DNA can have homology to any sequence flanking the target site. While in some embodiments the regions of homology share significant sequence homology to the genomic sequence immediately flanking the target site, it is recognized that the regions of homology can be designed to have sufficient homology to regions that may be further 5' or 3' to the target site. In still other embodiments, the regions of homology can also have homology with a fragment of the target site along with downstream genomic regions. In one embodiment, the first region of homology further comprises a first fragment of the target site and the second region of homology comprises a second fragment of the target site, wherein the first and second fragments are dissimilar.
[0282] As used herein, "homologous recombination" includes the exchange of DNA fragments between two DNA molecules at the sites of homology. The frequency of homologous recombination is influenced by a number of factors. Different organisms vary with respect to the amount of homologous recombination and the relative proportion of homologous to non-homologous recombination. Generally, the length of the region of homology affects the frequency of homologous recombination events: the longer the region of homology, the greater the frequency. The length of the homology region needed to observe homologous recombination is also species-variable. In many cases, at least 5 kb of homology has been utilized, but homologous recombination has been observed with as little as 25-50 bp of homology. See, for example, Singer et al., (1982) Cell 31:25-33; Shen and Huang, (1986) Genetics 112:441-57; Watt et al., (1985) Proc. Natl. Acad. Sci. USA 82:4768-72, Sugawara and Haber, (1992) Mol Cell Biol 12:563-75, Rubnitz and Subramani, (1984) Mol Cell Biol 4:2253-8; Ayares et al., (1986) Proc. Natl. Acad. Sci. USA 83:5199-203; Liskay et al., (1987) Genetics 115:161-7.
[0283] Homology-directed repair (HDR) is a mechanism in cells to repair double-stranded and single-stranded DNA breaks. Homology-directed repair includes homologous recombination (HR) and single-strand annealing (SSA) (Lieber. 2010 Annu. Rev. Biochem. 79:181-211). The most common form of HDR is called homologous recombination (HR), which has the longest sequence homology requirements between the donor and acceptor DNA. Other forms of HDR include single-stranded annealing (SSA) and breakage-induced replication, and these require shorter sequence homology relative to HR. Homology-directed repair at nicks (single-stranded breaks) can occur via a mechanism distinct from HDR at double-strand breaks (Davis and Maizels. PNAS (0027-8424), 111 (10), p. E924-E932).
[0284] Alteration of the genome of a plant cell, for example, through homologous recombination (HR), is a powerful tool for genetic engineering. Despite the low frequency of homologous recombination in higher plants, there are a few examples of successful homologous recombination of plant endogenous genes. The parameters for homologous recombination in plants have primarily been investigated by rescuing introduced truncated selectable marker genes. In these experiments, the homologous DNA fragments were typically between 0.3 kb and 2 kb. Observed frequencies for homologous recombination were on the order of 10⁻⁴ to 10⁻⁵. See, for example, Halfter et al., (1992) Mol Gen Genet 231:186-93; Offringa et al., (1990) EMBO J 9:3077-84; Offringa et al., (1993) Proc. Natl. Acad. Sci. USA 90:7346-50; Paszkowski et al., (1988) EMBO J 7:4021-6; Hrouda and Paszkowski, (1994) Mol Gen Genet 243:106-11; and Risseeuw et al., (1995) Plant J 7:109-19.
[0285] Homologous recombination has been demonstrated in insects. In Drosophila, Dray and Gloor found that as little as 3 kb of total template:target homology sufficed to copy a large non-homologous segment of DNA into the target with reasonable efficiency (Dray and Gloor, (1997) Genetics 147:689-99). Using FLP-mediated DNA integration at a target FRT in Drosophila, Golic et al., showed integration was approximately 10-fold more efficient when the donor and target shared 4.1 kb of homology as compared to 1.1 kb of homology (Golic et al., (1997) Nucleic Acids Res 25:3665). Data from Drosophila indicates that 2-4 kb of homology is sufficient for efficient targeting, but there is some evidence that much less homology may suffice, on the order of about 30 bp to about 100 bp (Nassif and Engels, (1993) Proc. Natl. Acad. Sci. USA 90:1262-6; Keeler and Gloor, (1997) Mol Cell Biol 17:627-34).
[0286] Homologous recombination has also been accomplished in other organisms. For example, at least 150-200 bp of homology was required for homologous recombination in the parasitic protozoan Leishmania (Papadopoulou and Dumas, (1997) Nucleic Acids Res 25:4278-86). In the filamentous fungus Aspergillus nidulans, gene replacement has been accomplished with as little as 50 bp flanking homology (Chaveroche et al., (2000) Nucleic Acids Res 28:e97). Targeted gene replacement has also been demonstrated in the ciliate Tetrahymena thermophila (Gaertig et al., (1994) Nucleic Acids Res 22:5391-8). In mammals, homologous recombination has been most successful in the mouse using pluripotent embryonic stem cell lines (ES) that can be grown in culture, transformed, selected and introduced into a mouse embryo. Embryos bearing inserted transgenic ES cells develop as genetically chimeric offspring. By interbreeding siblings, homozygous mice carrying the selected genes can be obtained. An overview of the process is provided in Watson et al., (1992) Recombinant DNA, 2nd Ed., (Scientific American Books distributed by WH Freeman & Co.); Capecchi, (1989) Trends Genet 5:70-6; and Bronson, (1994) J Biol Chem 269:27155-8. Homologous recombination in mammals other than mouse has been limited by the lack of stem cells capable of being transplanted to oocytes or developing embryos. However, McCreath et al., Nature 405:1066-9 (2000) reported successful homologous recombination in sheep by transformation and selection in primary embryo fibroblast cells.
[0287] Error-prone DNA repair mechanisms can produce mutations at double-strand break sites. The Non-Homologous-End-Joining (NHEJ) pathways are the most common repair mechanism to bring the broken ends together (Bleuyard et al., (2006) DNA Repair 5:1-12). The structural integrity of chromosomes is typically preserved by the repair, but deletions, insertions, or other rearrangements are possible. The two ends of one double-strand break are the most prevalent substrates of NHEJ (Kirik et al., (2000) EMBO J 19:5562-6); however, if two different double-strand breaks occur, the free ends from different breaks can be ligated and result in chromosomal deletions (Siebert and Puchta, (2002) Plant Cell 14:1121-31), or chromosomal translocations between different chromosomes (Pacher et al., (2007) Genetics 175:21-9).
[0288] Episomal DNA molecules can also be ligated into the double-strand break, for example, integration of T-DNAs into chromosomal double-strand breaks (Chilton and Que, (2003) Plant Physiol 133:956-65; Salomon and Puchta, (1998) EMBO J 17:6086-95). Once the sequence around the double-strand breaks is altered, for example, by exonuclease activities involved in the maturation of double-strand breaks, gene conversion pathways can restore the original structure if a homologous sequence is available, such as a homologous chromosome in non-dividing somatic cells, or a sister chromatid after DNA replication (Molinier et al., (2004) Plant Cell 16:342-52). Ectopic and/or epigenic DNA sequences may also serve as a DNA repair template for homologous recombination (Puchta, (1999) Genetics 152:1173-81).
[0289] Once a double-strand break is induced in the DNA, the cell's DNA repair mechanism is activated to repair the break. Error-prone DNA repair mechanisms can produce mutations at double-strand break sites. The most common repair mechanism to bring the broken ends together is the nonhomologous end-joining (NHEJ) pathway (Bleuyard et al., (2006) DNA Repair 5:1-12). The structural integrity of chromosomes is typically preserved by the repair, but deletions, insertions, or other rearrangements are possible (Siebert and Puchta, (2002) Plant Cell 14:1121-31; Pacher et al., (2007) Genetics 175:21-9).
[0290] Alternatively, the double-strand break can be repaired by homologous recombination between homologous DNA sequences. Once the sequence around the double-strand break is altered, for example, by exonuclease activities involved in the maturation of double-strand breaks, gene conversion pathways can restore the original structure if a homologous sequence is available, such as a homologous chromosome in non-dividing somatic cells, or a sister chromatid after DNA replication (Molinier et al., (2004) Plant Cell 16:342-52). Ectopic and/or epigenic DNA sequences may also serve as a DNA repair template for homologous recombination (Puchta, (1999) Genetics 152:1173-81).
[0291] DNA double-strand breaks appear to be an effective factor to stimulate homologous recombination pathways (Puchta et al., (1995) Plant Mol Biol 28:281-92; Tzfira and White, (2005) Trends Biotechnol 23:567-9; Puchta, (2005) J Exp Bot 56:1-14). Using DNA-breaking agents, a two- to nine-fold increase of homologous recombination was observed between artificially constructed homologous DNA repeats in plants (Puchta et al., (1995) Plant Mol Biol 28:281-92). In maize protoplasts, experiments with linear DNA molecules demonstrated enhanced homologous recombination between plasmids (Lyznik et al., (1991) Mol Gen Genet 230:209-18).
[0292] In one embodiment provided herein, the method comprises contacting a plant cell with the donor DNA and the endonuclease. Once a double-strand break is introduced in the target site by the endonuclease, the first and second regions of homology of the donor DNA can undergo homologous recombination with their corresponding genomic regions of homology resulting in exchange of DNA between the donor and the genome. As such, the provided methods result in the integration of the polynucleotide of interest of the donor DNA into the double-strand break in the target site in the plant genome, thereby altering the original target site and producing an altered genomic target site.
[0293] The donor DNA may be introduced by any means known in the art. For example, a plant having a target site is provided. The donor DNA may be provided by any transformation method known in the art including, for example, Agrobacterium-mediated transformation or biolistic particle bombardment. The donor DNA may be present transiently in the cell or it could be introduced via a viral replicon. In the presence of the Cas endonuclease and the target site, the donor DNA is inserted into the transformed plant's genome.
[0294] Another approach uses protein engineering of existing homing endonucleases to alter their target specificities. Homing endonucleases, such as I-SceI or I-CreI, bind to and cleave relatively long DNA recognition sequences (18 bp and 22 bp, respectively). These sequences are predicted to naturally occur infrequently in a genome, typically only 1 or 2 sites/genome. The cleavage specificity of a homing endonuclease can be changed by rational design of amino acid substitutions at the DNA binding domain and/or combinatorial assembly and selection of mutated monomers (see, for example, Arnould et al., (2006) J Mol Biol 355:443-58; Ashworth et al., (2006) Nature 441:656-9; Doyon et al., (2006) J Am Chem Soc 128:2477-84; Rosen et al., (2006) Nucleic Acids Res 34:4791-800; and Smith et al., (2006) Nucleic Acids Res 34:e149; Lyznik et al., (2009) U.S. Patent Application Publication No. 20090133152A1; Smith et al., (2007) U.S. Patent Application Publication No. 20070117128A1). Engineered meganucleases have been shown to cleave cognate mutant sites without broadening their specificity. An artificial recognition site specific to the wild type yeast I-SceI homing nuclease was introduced into the maize genome and mutations of the recognition sequence were detected in 1% of analyzed F1 plants when a transgenic I-SceI was introduced by crossing and activated by gene excision (Yang et al., (2009) Plant Mol Biol 70:669-79). More practically, the maize liguleless locus was targeted using an engineered single-chain endonuclease designed based on the I-CreI meganuclease sequence. Mutations of the selected liguleless locus recognition sequence were detected in 3% of the T0 transgenic plants when the designed homing nuclease was introduced by Agrobacterium-mediated transformation of immature embryos (Gao et al., (2010) Plant J 61:176-87).
[0295] Polynucleotides of interest are further described herein and are reflective of the commercial markets and interests of those involved in the development of the crop. Crops and markets of interest change, and as developing nations open up world markets, new crops and technologies will also emerge. In addition, as our understanding of agronomic traits and characteristics such as yield and heterosis increases, the choice of genes for genetic engineering will change accordingly.
[0296] Genome Editing Using the Guide RNA/Cas Endonuclease System
[0297] As described herein, the guide RNA/Cas endonuclease system can be used in combination with a co-delivered polynucleotide modification template to allow for editing of a genomic nucleotide sequence of interest. Also, as described herein, for each embodiment that uses a guide RNA/Cas endonuclease system, a similar guide polynucleotide/Cas endonuclease system can be deployed where the guide polynucleotide does not solely comprise ribonucleic acids but wherein the guide polynucleotide comprises a combination of RNA-DNA molecules or solely comprises DNA molecules.
[0298] While numerous double-strand break-making systems exist, their practical applications for gene editing may be restricted due to the relatively low frequency of induced double-strand breaks (DSBs). To date, many genome modification methods rely on the homologous recombination system. Homologous recombination (HR) can provide molecular means for finding genomic DNA sequences of interest and modifying them according to the experimental specifications. Homologous recombination takes place in plant somatic cells at low frequency. The process can be enhanced to a practical level for genome engineering by introducing double-strand breaks (DSBs) at selected endonuclease target sites. The challenge has been to efficiently make DSBs at genomic sites of interest since there is a bias in the directionality of information transfer between two interacting DNA molecules (the broken one acts as an acceptor of genetic information). Described herein is the use of a guide RNA/Cas system which provides flexible genome cleavage specificity and results in a high frequency of double-strand breaks at a DNA target site, thereby enabling efficient gene editing in a nucleotide sequence of interest, wherein the nucleotide sequence of interest to be edited can be located within or outside the target site recognized and cleaved by a Cas endonuclease.
[0299] A "modified nucleotide" or "edited nucleotide" refers to a nucleotide sequence of interest that comprises at least one alteration when compared to its non-modified nucleotide sequence. Such "alterations" include, for example: (i) replacement of at least one nucleotide, (ii) a deletion of at least one nucleotide, (iii) an insertion of at least one nucleotide, or (iv) any combination of (i)-(iii).
[0300] The term "polynucleotide modification template" includes a polynucleotide that comprises at least one nucleotide modification when compared to the nucleotide sequence to be edited. A nucleotide modification can be at least one nucleotide substitution, addition or deletion. Optionally, the polynucleotide modification template can further comprise homologous nucleotide sequences flanking the at least one nucleotide modification, wherein the flanking homologous nucleotide sequences provide sufficient homology to the desired nucleotide sequence to be edited.
[0301] In one embodiment, the disclosure describes a method for editing a nucleotide sequence in the genome of a cell, the method comprising providing a guide RNA, a polynucleotide modification template, and at least one Cas endonuclease to a cell, wherein the Cas endonuclease is capable of introducing a double-strand break at a target sequence in the genome of said cell, wherein said polynucleotide modification template includes at least one nucleotide modification of said nucleotide sequence. Cells include, but are not limited to, human, animal, bacterial, fungal, insect, and plant cells as well as plants and seeds produced by the methods described herein. The nucleotide to be edited can be located within or outside a target site recognized and cleaved by a Cas endonuclease. In one embodiment, the at least one nucleotide modification is not a modification at a target site recognized and cleaved by a Cas endonuclease. In another embodiment, there are at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 30, 40, 50, 100, 200, 300, 400, 500, 600, 700, 900 or 1000 nucleotides between the at least one nucleotide to be edited and the genomic target site.
[0302] In another embodiment, the disclosure describes a method for editing a nucleotide sequence in the genome of a plant cell, the method comprising providing a guide RNA, a polynucleotide modification template, and at least one maize optimized Cas9 endonuclease to a plant cell, wherein the maize optimized Cas9 endonuclease is capable of providing a double-strand break at a moCas9 target sequence in the plant genome, wherein said polynucleotide modification template includes at least one nucleotide modification of said nucleotide sequence.
[0303] In another embodiment, the disclosure describes a method for editing a nucleotide sequence in the genome of a cell, the method comprising providing a guide RNA, a polynucleotide modification template and at least one Cas endonuclease to a cell, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site, wherein said polynucleotide modification template comprises at least one nucleotide modification of said nucleotide sequence.
[0304] In another embodiment of genome editing, editing of the endogenous enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene is disclosed herein (Example 16). In this embodiment, the polynucleotide modification template (EPSPS polynucleotide modification template) includes a partial fragment of the EPSPS gene (and therefore does not encode a fully functional EPSPS polypeptide by itself). The EPSPS polynucleotide modification template contained three point mutations that were responsible for the creation of the T102I/P106S (TIPS) double mutant (Funke, T et al., J. Biol. Chem. 2009, 284:9854-9860), which provides glyphosate tolerance to transgenic plants expressing an EPSPS double mutant transgene.
[0305] As defined herein, "glyphosate" includes any herbicidally effective form of N-phosphonomethylglycine (including any salt thereof), other forms which result in the production of the glyphosate anion in plants and any other herbicides of the phosphonomethylglycine family.
[0306] In one embodiment of the disclosure, an epsps mutant plant is produced by the method described herein, said method comprising: a) providing a guide RNA, a polynucleotide modification template and at least one Cas endonuclease to a plant cell, wherein the Cas endonuclease introduces a double strand break at a target site within an epsps (enolpyruvylshikimate-3-phosphate synthase) genomic sequence in the plant genome, wherein said polynucleotide modification template comprises at least one nucleotide modification of said epsps genomic sequence; b) obtaining a plant from the plant cell of (a); c) evaluating the plant of (b) for the presence of said at least one nucleotide modification and d) selecting a progeny plant that shows resistance to glyphosate.
[0307] Increased resistance to an herbicide is demonstrated when plants which display the increased resistance to an herbicide are subjected to the herbicide and a dose/response curve is shifted to the right when compared with that provided by an appropriate control plant. Such dose/response curves have "dose" plotted on the x-axis and "percentage injury", "herbicidal effect" etc. plotted on the y-axis. Plants which are substantially resistant to the herbicide exhibit few, if any, bleached, necrotic, lytic, chlorotic or other lesions and are not stunted, wilted or deformed when subjected to the herbicide at concentrations and rates which are typically employed by the agricultural community to kill weeds in the field. The terms resistance and tolerance may be used interchangeably.
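The rightward shift of a dose/response curve can be illustrated with a short numerical sketch. The two-parameter log-logistic curve used here, the function name `percent_injury`, the slope and both ED50 values are assumptions chosen for illustration only; they are not taken from the disclosure and do not represent measured data.

```python
def percent_injury(dose, ed50, slope=2.0):
    """Hypothetical log-logistic dose/response curve.

    Returns 0% injury at dose 0 and 50% injury when dose == ed50.
    """
    if dose <= 0:
        return 0.0
    return 100.0 / (1.0 + (ed50 / dose) ** slope)


# Hypothetical ED50 values in arbitrary dose units (illustration only).
control_ed50 = 1.0     # control plant
resistant_ed50 = 8.0   # resistant plant: same curve shape, shifted to the right

# "dose" on the x-axis, "percentage injury" on the y-axis.
for dose in (0.5, 1.0, 2.0, 4.0, 8.0):
    control = percent_injury(dose, control_ed50)
    resistant = percent_injury(dose, resistant_ed50)
    print(f"dose={dose:4.1f}  control={control:5.1f}%  resistant={resistant:5.1f}%")
```

Under these assumed parameters, the resistant curve reaches any given injury level only at a higher dose than the control curve, which is what a shift to the right means.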
[0308] FIG. 12 shows a schematic representation of components used in the genome editing procedure. A maize optimized Cas endonuclease, a guide RNA and a polynucleotide modification template were provided to a plant cell. For example, as shown in FIG. 12, the polynucleotide modification template included three nucleotide modifications (indicated by arrows) when compared to the EPSPS genomic sequence to be edited. These three nucleotide modifications are referred to as TIPS mutations as these nucleotide modifications result in the amino acid changes T-102 to I-102 and P-106 to S-106. The first point mutation results from the substitution of the C nucleotide in the codon sequence ACT with a T nucleotide; the second point mutation results from the substitution of the T nucleotide in the same codon sequence ACT with a C nucleotide to form the isoleucine codon ATC; and the third point mutation results from the substitution of the first C nucleotide in the codon sequence CCA with a T nucleotide in order to form a serine codon TCA (FIG. 12).
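The three TIPS substitutions described above can be checked with a short sketch. The helper name `substitute` and the reduced codon table are illustrative assumptions; the table holds only the standard genetic code entries for the codons discussed in the text.

```python
# Standard genetic code entries for the codons discussed above only.
CODON_TO_AA = {"ACT": "T", "ATT": "I", "ATC": "I", "CCA": "P", "TCA": "S"}


def substitute(codon, position, new_base):
    """Return the codon with the base at 0-based `position` replaced."""
    return codon[:position] + new_base + codon[position + 1:]


# Codon 102: ACT (Thr) -> ATT (C to T) -> ATC (T to C), an isoleucine codon.
codon_102 = substitute("ACT", 1, "T")
codon_102 = substitute(codon_102, 2, "C")

# Codon 106: CCA (Pro) -> TCA (first C to T), a serine codon.
codon_106 = substitute("CCA", 0, "T")

print("102:", "ACT", CODON_TO_AA["ACT"], "->", codon_102, CODON_TO_AA[codon_102])
print("106:", "CCA", CODON_TO_AA["CCA"], "->", codon_106, CODON_TO_AA[codon_106])
```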
[0309] In one embodiment, the disclosure describes a method for producing an epsps (enolpyruvylshikimate-3-phosphate synthase) mutant plant, the method comprising: a) providing a guide RNA, a polynucleotide modification template and at least one Cas endonuclease to a plant cell, wherein the Cas endonuclease introduces a double strand break at a target site within an epsps genomic sequence in the plant genome, wherein said polynucleotide modification template comprises at least one nucleotide modification of said epsps genomic sequence; b) obtaining a plant from the plant cell of (a); c) evaluating the plant of (b) for the presence of said at least one nucleotide modification; and, d) screening a progeny plant of (c) that is void of said guide RNA and Cas endonuclease.
[0310] The nucleotide sequence to be edited can be a sequence that is endogenous, artificial, pre-existing, or transgenic to the cell that is being edited. For example, the nucleotide sequence in the genome of a cell can be a native gene, a mutated gene, a non-native gene, a foreign gene, or a transgene that is stably incorporated into the genome of a cell. Editing of such nucleotide may result in a further desired phenotype or genotype.
Regulatory Sequence Modifications Using the Guide Polynucleotide/Cas Endonuclease System
[0311] In one embodiment the nucleotide sequence to be modified can be a regulatory sequence such as a promoter wherein the editing of the promoter comprises replacing the promoter (also referred to as a "promoter swap" or "promoter replacement") or promoter fragment with a different promoter (also referred to as replacement promoter) or promoter fragment (also referred to as replacement promoter fragment), wherein the promoter replacement results in any one of the following or any one combination of the following: an increased promoter activity, an increased promoter tissue specificity, a decreased promoter activity, a decreased promoter tissue specificity, a new promoter activity, an inducible promoter activity, an extended window of gene expression, a modification of the timing or developmental progress of gene expression in the same cell layer or other cell layer (such as, but not limited to, extending the timing of gene expression in the tapetum of maize anthers (U.S. Pat. No. 5,837,850 issued Nov. 17, 1998)), a mutation of DNA binding elements and/or a deletion or addition of DNA binding elements. The promoter (or promoter fragment) to be modified can be a promoter (or promoter fragment) that is endogenous, artificial, pre-existing, or transgenic to the cell that is being edited. The replacement promoter (or replacement promoter fragment) can be a promoter (or promoter fragment) that is endogenous, artificial, pre-existing, or transgenic to the cell that is being edited.
[0312] In one embodiment the nucleotide sequence can be a promoter wherein the editing of the promoter comprises replacing an ARGOS 8 promoter with a Zea mays GOS2 PRO:GOS2-intron promoter.
[0313] In one embodiment the nucleotide sequence can be a promoter wherein the editing of the promoter comprises replacing a native EPSPS1 promoter with a plant ubiquitin promoter.
[0314] In one embodiment the nucleotide sequence can be a promoter wherein the editing of the promoter comprises replacing an endogenous maize NPK1 promoter with a stress inducible maize RAB17 promoter.
[0315] In one embodiment the nucleotide sequence can be a promoter wherein the promoter to be edited is selected from the group comprising Zea mays-PEPC1 promoter (Kausch et al, Plant Molecular Biology, 45: 1-15, 2001), Zea mays Ubiquitin promoter (UBI1ZM PRO, Christensen et al, Plant Molecular Biology 18: 675-689, 1992), Zea mays-Rootmet2 promoter (U.S. Pat. No. 7,214,855), Rice actin promoter (OS-ACTIN PRO, U.S. Pat. No. 5,641,876; McElroy et al, The Plant Cell, Vol 2, 163-171, February 1990), Sorghum RCC3 promoter (US 2012/0210463 filed on 13 Feb. 2012), Zea mays-GOS2 promoter (U.S. Pat. No. 6,504,083), Zea mays-ACO2 promoter (U.S. application Ser. No. 14/210,711 filed 14 Mar. 2014) or Zea mays-oleosin promoter (U.S. Pat. No. 8,466,341 B2).
[0316] In another embodiment, the guide polynucleotide/Cas endonuclease system can be used in combination with a co-delivered polynucleotide modification template or donor DNA sequence to allow for the insertion of a promoter or promoter element into a genomic nucleotide sequence of interest, wherein the promoter insertion (or promoter element insertion) results in any one of the following or any one combination of the following: an increased promoter activity (increased promoter strength), an increased promoter tissue specificity, a decreased promoter activity, a decreased promoter tissue specificity, a new promoter activity, an inducible promoter activity, an extended window of gene expression, a modification of the timing or developmental progress of gene expression, a mutation of DNA binding elements and/or an addition of DNA binding elements. Promoter elements to be inserted can be, but are not limited to, promoter core elements (such as, but not limited to, a CAAT box, a CCAAT box, a Pribnow box, and/or a TATA box), translational regulation sequences and/or a repressor system for inducible expression (such as TET operator repressor/operator/inducer elements, or sulphonylurea (Su) repressor/operator/inducer elements). The dehydration-responsive element (DRE) was first identified as a cis-acting promoter element in the promoter of the drought-responsive gene rd29A, which contains a 9 bp conserved core sequence, TACCGACAT (Yamaguchi-Shinozaki, K, and Shinozaki, K. (1994) Plant Cell 6, 251-264). Insertion of DRE into an endogenous promoter may confer a drought inducible expression of the downstream gene. Another example is the ABA-responsive elements (ABREs), which contain a (C/T)ACGTGGC consensus sequence found to be present in numerous ABA and/or stress-regulated genes (Busk P. K., Pages M. (1998) Plant Mol. Biol. 37:425-435). Insertion of a 35S enhancer or MMV enhancer into an endogenous promoter region will increase gene expression (U.S. Pat. No. 5,196,525).
The promoter (or promoter element) to be inserted can be a promoter (or promoter element) that is endogenous, artificial, pre-existing, or transgenic to the cell that is being edited.
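As an illustration of how the DRE core sequence and the ABRE consensus sequence quoted above might be located in a promoter before or after an insertion, the following minimal sketch scans one strand of a sequence for both motifs. The function name `find_motifs` and the example promoter fragment are hypothetical and made up for illustration; only the two motif sequences come from the text.

```python
import re

# Motifs quoted in the text; the ABRE consensus (C/T)ACGTGGC is written
# as the regular expression [CT]ACGTGGC.
MOTIFS = {"DRE": "TACCGACAT", "ABRE": "[CT]ACGTGGC"}


def find_motifs(promoter):
    """Return {motif name: [0-based start positions]} on the given strand."""
    seq = promoter.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping occurrences are reported too.
        hits[name] = [m.start() for m in re.finditer(f"(?={pattern})", seq)]
    return hits


# Hypothetical promoter fragment (not a real maize sequence).
example_promoter = "GGATACCGACATTTCACGTGGCAA"
print(find_motifs(example_promoter))
```

The sketch searches only the strand supplied; a fuller analysis would also scan the reverse complement.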
[0317] In one embodiment, the guide polynucleotide/Cas endonuclease system can be used to insert an enhancer element, such as but not limited to a Cauliflower Mosaic Virus 35S enhancer, in front of an endogenous FMT1 promoter to enhance expression of the FTM1.
[0318] In one embodiment, the guide polynucleotide/Cas endonuclease system can be used to insert a component of the TET operator repressor/operator/inducer system, or a component of the sulphonylurea (Su) repressor/operator/inducer system into plant genomes to generate or control inducible expression systems.
[0319] In another embodiment, the guide polynucleotide/Cas endonuclease system can be used to allow for the deletion of a promoter or promoter element, wherein the promoter deletion (or promoter element deletion) results in any one of the following or any one combination of the following: a permanently inactivated gene locus, an increased promoter activity (increased promoter strength), an increased promoter tissue specificity, a decreased promoter activity, a decreased promoter tissue specificity, a new promoter activity, an inducible promoter activity, an extended window of gene expression, a modification of the timing or developmental progress of gene expression, a mutation of DNA binding elements and/or an addition of DNA binding elements. Promoter elements to be deleted can be, but are not limited to, promoter core elements, promoter enhancer elements or 35S enhancer elements (as described in Example 32). The promoter or promoter fragment to be deleted can be endogenous, artificial, pre-existing, or transgenic to the cell that is being edited.
[0320] In one embodiment, the guide polynucleotide/Cas endonuclease system can be used to delete the ARGOS 8 promoter present in a maize genome as described herein.
[0321] In one embodiment, the guide polynucleotide/Cas endonuclease system can be used to delete a 35S enhancer element present in a plant genome as described herein.
Terminator Modifications Using the Guide Polynucleotide/Cas Endonuclease System
[0322] In one embodiment the nucleotide sequence to be modified can be a terminator wherein the editing of the terminator comprises replacing the terminator (also referred to as a "terminator swap" or "terminator replacement") or terminator fragment with a different terminator (also referred to as replacement terminator) or terminator fragment (also referred to as replacement terminator fragment), wherein the terminator replacement results in any one of the following or any one combination of the following: an increased terminator activity, an increased terminator tissue specificity, a decreased terminator activity, a decreased terminator tissue specificity, a mutation of DNA binding elements and/or a deletion or addition of DNA binding elements. The terminator (or terminator fragment) to be modified can be a terminator (or terminator fragment) that is endogenous, artificial, pre-existing, or transgenic to the cell that is being edited. The replacement terminator (or replacement terminator fragment) can be a terminator (or terminator fragment) that is endogenous, artificial, pre-existing, or transgenic to the cell that is being edited.
[0323] In one embodiment the nucleotide sequence to be modified can be a terminator wherein the terminator to be edited is selected from the group comprising terminators from maize Argos 8 or SRTF18 genes, or other terminators, such as potato PinII terminator, sorghum actin terminator (SB-ACTIN TERM, WO 2013/184537 A1 published December 2013), sorghum SB-GKAF TERM (WO2013019461), rice T28 terminator (OS-T28 TERM, WO 2013/012729 A2), ATT9 TERM (WO 2013/012729 A2) or GZ-W64A TERM (U.S. Pat. No. 7,053,282).
[0324] In one embodiment, the guide polynucleotide/Cas endonuclease system can be used in combination with a co-delivered polynucleotide modification template or donor DNA sequence to allow for the insertion of a terminator or terminator element into a genomic nucleotide sequence of interest, wherein the terminator insertion (or terminator element insertion) results in any one of the following or any one combination of the following: an increased terminator activity (increased terminator strength), an increased terminator tissue specificity, a decreased terminator activity, a decreased terminator tissue specificity, a mutation of DNA binding elements and/or an addition of DNA binding elements.
The terminator (or terminator element) to be inserted can be a terminator (or terminator element) that is endogenous, artificial, pre-existing, or transgenic to the cell that is being edited.
[0325] In another embodiment, the guide polynucleotide/Cas endonuclease system can be used to allow for the deletion of a terminator or terminator element, wherein the terminator deletion (or terminator element deletion) results in any one of the following or any one combination of the following: an increased terminator activity (increased terminator strength), an increased terminator tissue specificity, a decreased terminator activity, a decreased terminator tissue specificity, a mutation of DNA binding elements and/or an addition of DNA binding elements. The terminator or terminator fragment to be deleted can be endogenous, artificial, pre-existing, or transgenic to the cell that is being edited.
Additional Regulatory Sequence Modifications Using the Guide Polynucleotide/Cas Endonuclease System
[0326] In one embodiment, the guide polynucleotide/Cas endonuclease system can be used to modify or replace a regulatory sequence in the genome of a cell. A regulatory sequence is a segment of a nucleic acid molecule which is capable of increasing or decreasing the expression of specific genes within an organism and/or is capable of altering tissue specific expression of genes within an organism. Examples of regulatory sequences include, but are not limited to, 3' UTR (untranslated region) region, 5' UTR region, transcription activators, transcriptional enhancers, transcriptional repressors, translational repressors, splicing factors, miRNAs, siRNA, artificial miRNAs, promoter elements, CaMV 35S enhancer, MMV enhancer elements (PCT/US14/23451 filed Mar. 11, 2013), SECIS elements, polyadenylation signals, and polyubiquitination sites. In some embodiments the editing (modification) or replacement of a regulatory element results in altered protein translation, RNA cleavage, RNA splicing, transcriptional termination or post translational modification. In one embodiment, regulatory elements can be identified within a promoter and these regulatory elements can be edited or modified to optimize these regulatory elements for up or down regulation of the promoter.
[0327] In one embodiment, the genomic sequence of interest to be modified is a polyubiquitination site, wherein the modification of the polyubiquitination sites results in a modified rate of protein degradation. The ubiquitin tag condemns proteins to be degraded by proteasomes or autophagy. Proteasome inhibitors are known to cause protein overproduction. Modifications made to a DNA sequence encoding a protein of interest can result in at least one amino acid modification of the protein of interest, wherein said modification allows for the polyubiquitination of the protein (a post translational modification) resulting in a modification of the protein degradation.
[0328] In one embodiment, the genomic sequence of interest to be modified is a polyubiquitination site on a maize EPSPS gene, wherein the polyubiquitination site is modified, resulting in an increased protein content due to a slower rate of EPSPS protein degradation.
[0329] In one embodiment, the genomic sequence of interest to be modified is an intron site, wherein the modification consists of inserting an intron enhancing motif into the intron which results in modulation of the transcriptional activity of the gene comprising said intron.
[0330] In one embodiment, the genomic sequence of interest to be modified is an intron site, wherein the modification consists of replacing a soybean EPSP1 intron with a soybean ubiquitin intron 1 as described herein (Example 25).
[0331] In one embodiment, the genomic sequence of interest to be modified is an intron or UTR site, wherein the modification consists of inserting at least one microRNA into said intron or UTR site, wherein expression of the gene comprising the intron or UTR site also results in expression of said microRNA, which in turn can silence any gene targeted by the microRNA without disrupting the gene expression of the native/transgene comprising said intron.
[0332] In one embodiment, the guide polynucleotide/Cas endonuclease system can be used to allow for the deletion or mutation of a Zinc Finger transcription factor, wherein the deletion or mutation of the Zinc Finger transcription factor results in or allows for the creation of a dominant negative Zinc Finger transcription factor mutant (Li et al 2013 Rice zinc finger protein DST enhances grain production through controlling Gn1a/OsCKX2 expression PNAS 110:3167-3172). Insertion of a single base pair downstream of the zinc finger domain will result in a frame shift and produce a new protein which can still bind to DNA but lacks transcription activity. The mutant protein will compete to bind to cytokinin oxidase gene promoters and block the expression of the cytokinin oxidase gene. Reduction of cytokinin oxidase gene expression will increase cytokinin level and promote panicle growth in rice and ear growth in maize, and increase yield under normal and stress conditions.
Modifications of Splicing Sites and/or Introducing Alternate Splicing Sites Using the Guide Polynucleotide/Cas Endonuclease System
[0333] Protein synthesis utilizes mRNA molecules that emerge from pre-mRNA molecules subjected to the maturation process. The pre-mRNA molecules are capped, spliced and stabilized by addition of polyA tails. Eukaryotic cells developed a complex process of splicing that results in alternative variants of the original pre-mRNA molecules. Some of them may not produce functional templates for protein synthesis. In maize cells, the splicing process is affected by splicing sites at the exon-intron junction sites. An example of a canonical splice site is AGGT. Gene coding sequences can contain a number of alternate splicing sites that may affect the overall efficiency of the pre-mRNA maturation process and as such may limit the protein accumulation in cells. The guide polynucleotide/Cas endonuclease system can be used in combination with a co-delivered polynucleotide modification template to edit a gene of interest to introduce a canonical splice site at a described junction or any variant of a splicing site that changes the splicing pattern of pre-mRNA molecules.
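The idea of introducing a canonical splice site by editing can be sketched as follows. The sketch lists occurrences of the AGGT sequence given above and shows how a single-nucleotide substitution, of the kind a polynucleotide modification template could carry, creates a new occurrence. The function names and the example sequence are hypothetical and chosen only for illustration.

```python
SPLICE_SITE = "AGGT"  # canonical splice site quoted in the text


def splice_site_positions(seq):
    """Return 0-based start positions of every AGGT occurrence in seq."""
    seq = seq.upper()
    last_start = len(seq) - len(SPLICE_SITE)
    return [i for i in range(last_start + 1) if seq.startswith(SPLICE_SITE, i)]


def apply_substitution(seq, position, new_base):
    """Return seq with the base at 0-based `position` replaced by new_base."""
    return seq[:position] + new_base + seq[position + 1:]


# Hypothetical junction sequence (not a real maize sequence).
original = "GCTCAGATGACC"
edited = apply_substitution(original, 6, "G")  # A -> G turns AGAT into AGGT

print(original, splice_site_positions(original))
print(edited, splice_site_positions(edited))
```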
[0334] In one embodiment, the nucleotide sequence of interest to be modified is a maize EPSPS gene, wherein the modification of the gene consists of modifying alternative splicing sites resulting in enhanced production of the functional gene transcripts and gene products (proteins).
[0335] In one embodiment, the nucleotide sequence of interest to be modified is a gene, wherein the modification of the gene consists of editing the intron borders of alternatively spliced genes to alter the accumulation of splice variants.
Modifications of Nucleotide Sequences Encoding a Protein of Interest Using the Guide Polynucleotide/Cas Endonuclease System
[0336] In one embodiment, the guide polynucleotide/Cas endonuclease system can be used to modify or replace a coding sequence in the genome of a cell, wherein the modification or replacement results in any one of the following, or any one combination of the following: an increased protein (enzyme) activity, an increased protein functionality, a decreased protein activity, a decreased protein functionality, a site specific mutation, a protein domain swap, a protein knock-out, a new protein functionality, or a modified protein functionality.
[0337] In one embodiment the protein knockout is due to the introduction of a stop codon into the coding sequence of interest.
[0338] In one embodiment the protein knockout is due to the deletion of a start codon from the coding sequence of interest.
Amino Acid and/or Protein Fusions Using the Guide Polynucleotide/Cas Endonuclease System
[0339] In one embodiment, the guide polynucleotide/Cas endonuclease system can be used with or without a co-delivered polynucleotide sequence to fuse a first coding sequence encoding a first protein to a second coding sequence encoding a second protein in the genome of a cell, wherein the protein fusion results in any one of the following or any one combination of the following: an increased protein (enzyme) activity, an increased protein functionality, a decreased protein activity, a decreased protein functionality, a new protein functionality, a modified protein functionality, a new protein localization, a new timing of protein expression, a modified protein expression pattern, a chimeric protein, or a modified protein with dominant phenotype functionality.
[0340] In one embodiment, the guide polynucleotide/Cas endonuclease system can be used with or without a co-delivered polynucleotide sequence to fuse a first coding sequence encoding a chloroplast localization signal to a second coding sequence encoding a protein of interest, wherein the protein fusion results in targeting the protein of interest to the chloroplast.
[0342] In one embodiment, the guide polynucleotide/Cas endonuclease system can be used with or without a co-delivered polynucleotide sequence to fuse a first coding sequence encoding a chloroplast localization signal (e.g., a chloroplast transit peptide) to a second coding sequence, wherein the protein fusion results in a modified protein with dominant phenotype functionality.
Gene Silencing by Expressing an Inverted Repeat into a Gene of Interest Using the Guide Polynucleotide/Cas Endonuclease System
[0343] In one embodiment, the guide polynucleotide/Cas endonuclease system can be used in combination with a co-delivered polynucleotide sequence to insert an inverted gene fragment into a gene of interest in the genome of an organism, wherein the insertion of the inverted gene fragment can allow for an in-vivo creation of an inverted repeat (hairpin) and results in the silencing of said endogenous gene.
[0344] In one embodiment the insertion of the inverted gene fragment can result in the formation of an in-vivo created inverted repeat (hairpin) in a native (or modified) promoter of a gene and/or in a native 5' end of the native gene. The inverted gene fragment can further comprise an intron which can result in an enhanced silencing of the targeted gene.
Genome Deletion for Trait Locus Characterization
[0345] Trait mapping in plant breeding often results in the detection of chromosomal regions housing one or more genes controlling expression of a trait of interest. For a qualitative trait, the guide polynucleotide/Cas endonuclease system can be used to eliminate candidate genes in the identified chromosomal regions to determine if deletion of the gene affects expression of the trait. For quantitative traits, expression of a trait of interest is governed by multiple quantitative trait loci (QTL) of varying effect-size, complexity, and statistical significance across one or more chromosomes. In cases of negative effect or deleterious QTL regions affecting a complex trait, the guide polynucleotide/Cas endonuclease system can be used to eliminate whole regions delimited by marker-assisted fine mapping, and to target specific regions for their selective elimination or rearrangement. Similarly, presence/absence variation (PAV) or copy number variation (CNV) can be manipulated with selective genome deletion using the guide polynucleotide/Cas endonuclease system.
[0346] In one embodiment, the region of interest can be flanked by two independent guide polynucleotide/Cas endonuclease target sequences. Cutting would be done concurrently. The deletion event would be the repair of the two chromosomal ends without the region of interest. Alternative results would include inversions of the region of interest, mutations at the cut sites and duplication of the region of interest.
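As a non-limiting illustration of the outcomes listed above, the following Python sketch models a chromosome as a plain string and two cut sites as string indices. It is a deliberately simplified model and not a description of the actual repair process: the function names, the cut coordinates and the example sequence are hypothetical, cuts are treated as blunt, and small mutations at the cut sites are not modeled.

```python
# Complement table for unambiguous DNA bases.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def two_cut_outcomes(chromosome: str, cut1: int, cut2: int) -> dict[str, str]:
    """Model three possible products of two concurrent cuts.

    `cut1` and `cut2` are 0-based positions; the region of interest is
    chromosome[cut1:cut2]. Both cuts are assumed (for illustration) to be blunt.
    """
    if not 0 <= cut1 < cut2 <= len(chromosome):
        raise ValueError("cut positions must satisfy 0 <= cut1 < cut2 <= length")
    left = chromosome[:cut1]
    region = chromosome[cut1:cut2]
    right = chromosome[cut2:]
    return {
        # The two chromosomal ends are rejoined without the region.
        "deletion": left + right,
        # The region is reinserted in the opposite orientation.
        "inversion": left + reverse_complement(region) + right,
        # The region is present twice in tandem.
        "duplication": left + region + region + right,
    }


if __name__ == "__main__":
    # Hypothetical 13-base "chromosome" with the region GGATC at positions 4-8.
    for name, product in two_cut_outcomes("AAAAGGATCTTTT", 4, 9).items():
        print(name, product)
```

Keeping the three products in one dictionary makes it straightforward to compare a sequenced junction against each expected outcome.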
[0347] Methods for Identifying at Least One Plant Cell Comprising in its Genome a Polynucleotide of Interest Integrated at the Target Site.
[0348] Further provided are methods for identifying at least one plant cell, comprising in its genome, a polynucleotide of interest integrated at the target site. A variety of methods are available for identifying those plant cells with insertion into the genome at or near to the target site without using a screenable marker phenotype. Such methods can be viewed as directly analyzing a target sequence to detect any change in the target sequence, including but not limited to PCR methods, sequencing methods, nuclease digestion, Southern blots, and any combination thereof. See, for example, U.S. patent application Ser. No. 12/147,834, herein incorporated by reference to the extent necessary for the methods described herein. The method also comprises recovering a plant from the plant cell comprising a polynucleotide of interest integrated into its genome. The plant may be sterile or fertile. It is recognized that any polynucleotide of interest can be provided, integrated into the plant genome at the target site, and expressed in a plant.
[0349] Polynucleotides/polypeptides of interest include, but are not limited to, herbicide-resistance coding sequences, insecticidal coding sequences, nematicidal coding sequences, antimicrobial coding sequences, antifungal coding sequences, antiviral coding sequences, abiotic and biotic stress tolerance coding sequences, or sequences modifying plant traits such as yield, grain quality, nutrient content, starch quality and quantity, nitrogen fixation and/or utilization, fatty acids, and oil content and/or composition. More specific polynucleotides of interest include, but are not limited to, genes that improve crop yield, polypeptides that improve desirability of crops, genes encoding proteins conferring resistance to abiotic stress, such as drought, nitrogen, temperature, salinity, toxic metals or trace elements, or those conferring resistance to toxins such as pesticides and herbicides, or to biotic stress, such as attacks by fungi, viruses, bacteria, insects, and nematodes, and development of diseases associated with these organisms. General categories of genes of interest include, for example, those genes involved in information, such as zinc fingers, those involved in communication, such as kinases, and those involved in housekeeping, such as heat shock proteins. More specific categories of transgenes, for example, include genes encoding important traits for agronomics, insect resistance, disease resistance, herbicide resistance, fertility or sterility, grain characteristics, and commercial products. Genes of interest include, generally, those involved in oil, starch, carbohydrate, or nutrient metabolism as well as those affecting kernel size, sucrose loading, and the like that can be stacked or used in combination with other traits, such as but not limited to herbicide resistance, described herein.
[0350] Agronomically important traits such as oil, starch, and protein content can be genetically altered in addition to using traditional breeding methods. Modifications include increasing content of oleic acid, saturated and unsaturated oils, increasing levels of lysine and sulfur, providing essential amino acids, and also modification of starch. Hordothionin protein modifications are described in U.S. Pat. Nos. 5,703,049, 5,885,801, 5,885,802, and 5,990,389, herein incorporated by reference. Another example is lysine and/or sulfur rich seed protein encoded by the soybean 2S albumin described in U.S. Pat. No. 5,850,016, and the chymotrypsin inhibitor from barley, described in Williamson et al. (1987) Eur. J. Biochem. 165:99-106, the disclosures of which are herein incorporated by reference.
[0351] Commercial traits can also be encoded on a polynucleotide of interest that could increase for example, starch for ethanol production, or provide expression of proteins. Another important commercial use of transformed plants is the production of polymers and bioplastics such as described in U.S. Pat. No. 5,602,321. Genes such as β-Ketothiolase, PHBase (polyhydroxybutyrate synthase), and acetoacetyl-CoA reductase (see Schubert et al. (1988) J. Bacteriol. 170:5837-5847) facilitate expression of polyhydroxyalkanoates (PHAs).
[0352] Derivatives of the coding sequences can be made by site-directed mutagenesis to increase the level of preselected amino acids in the encoded polypeptide. For example, the gene encoding the barley high lysine polypeptide (BHL) is derived from barley chymotrypsin inhibitor, U.S. application Ser. No. 08/740,682, filed Nov. 1, 1996, and WO 98/20133, the disclosures of which are herein incorporated by reference. Other proteins include methionine-rich plant proteins such as from sunflower seed (Lilley et al. (1989) Proceedings of the World Congress on Vegetable Protein Utilization in Human Foods and Animal Feedstuffs, ed. Applewhite (American Oil Chemists Society, Champaign, Ill.), pp. 497-502; herein incorporated by reference); corn (Pedersen et al. (1986) J. Biol. Chem. 261:6279; Kirihara et al. (1988) Gene 71:359; both of which are herein incorporated by reference); and rice (Musumura et al. (1989) Plant Mol. Biol. 12:123, herein incorporated by reference). Other agronomically important genes encode latex, Floury 2, growth factors, seed storage factors, and transcription factors.
[0353] Polynucleotides that improve crop yield include dwarfing genes, such as Rht1 and Rht2 (Peng et al. (1999) Nature 400:256-261), and those that increase plant growth, such as ammonium-inducible glutamate dehydrogenase. Polynucleotides that improve desirability of crops include, for example, those that allow plants to have reduced saturated fat content, those that boost the nutritional value of plants, and those that increase grain protein. Polynucleotides that improve salt tolerance are those that increase or allow plant growth in an environment of higher salinity than the native environment of the plant into which the salt-tolerant gene(s) has been introduced.
[0354] Polynucleotides/polypeptides that influence amino acid biosynthesis include, for example, anthranilate synthase (AS; EC 4.1.3.27) which catalyzes the first reaction branching from the aromatic amino acid pathway to the biosynthesis of tryptophan in plants, fungi, and bacteria. In plants, the chemical processes for the biosynthesis of tryptophan are compartmentalized in the chloroplast. See, for example, US Pub. 20080050506, herein incorporated by reference. Additional sequences of interest include Chorismate Pyruvate Lyase (CPL) which refers to a gene encoding an enzyme which catalyzes the conversion of chorismate to pyruvate and pHBA. The most well characterized CPL gene has been isolated from E. coli and bears the GenBank accession number M96268. See, U.S. Pat. No. 7,361,811, herein incorporated by reference.
[0355] Polynucleotide sequences of interest may encode proteins involved in providing disease or pest resistance. By "disease resistance" or "pest resistance" is intended that the plants avoid the harmful symptoms that are the outcome of the plant-pathogen interactions. Pest resistance genes may encode resistance to pests that have great yield drag such as rootworm, cutworm, European Corn Borer, and the like. Disease resistance and insect resistance genes such as lysozymes or cecropins for antibacterial protection, or proteins such as defensins, glucanases or chitinases for antifungal protection, or Bacillus thuringiensis endotoxins, protease inhibitors, collagenases, lectins, or glycosidases for controlling nematodes or insects are all examples of useful gene products. Genes encoding disease resistance traits include detoxification genes, such as against fumonisin (U.S. Pat. No. 5,792,931); avirulence (avr) and disease resistance (R) genes (Jones et al. (1994) Science 266:789; Martin et al. (1993) Science 262:1432; and Mindrinos et al. (1994) Cell 78:1089); and the like. Insect resistance genes may encode resistance to pests that have great yield drag such as rootworm, cutworm, European Corn Borer, and the like. Such genes include, for example, Bacillus thuringiensis toxic protein genes (U.S. Pat. Nos. 5,366,892; 5,747,450; 5,736,514; 5,723,756; 5,593,881; and Geiser et al. (1986) Gene 48:109); and the like.
[0356] An "herbicide resistance protein" or a protein resulting from expression of an "herbicide resistance-encoding nucleic acid molecule" includes proteins that confer upon a cell the ability to tolerate a higher concentration of an herbicide than cells that do not express the protein, or to tolerate a certain concentration of an herbicide for a longer period of time than cells that do not express the protein. Herbicide resistance traits may be introduced into plants by genes coding for resistance to herbicides that act to inhibit the action of acetolactate synthase (ALS), in particular the sulfonylurea-type herbicides, genes coding for resistance to herbicides that act to inhibit the action of glutamine synthase, such as phosphinothricin or basta (e.g., the bar gene), glyphosate (e.g., the EPSP synthase gene and the GAT gene), HPPD inhibitors (e.g., the HPPD gene) or other such genes known in the art. See, for example, U.S. Pat. Nos. 7,626,077, 5,310,667, 5,866,775, 6,225,114, 6,248,876, 7,169,970, 6,867,293, and U.S. Provisional Application No. 61/401,456, each of which is herein incorporated by reference. The bar gene encodes resistance to the herbicide basta, the nptII gene encodes resistance to the antibiotics kanamycin and geneticin, and the ALS-gene mutants encode resistance to the herbicide chlorsulfuron.
[0357] Sterility genes can also be encoded in an expression cassette and provide an alternative to physical detasseling. Examples of genes used in such ways include male fertility genes such as MS26 (see for example U.S. Pat. Nos. 7,098,388, 7,517,975, 7,612,251), MS45 (see for example U.S. Pat. Nos. 5,478,369, 6,265,640) or MSCA1 (see for example U.S. Pat. No. 7,919,676). Maize plants (Zea mays L.) can be bred by both self-pollination and cross-pollination techniques. Maize has male flowers, located on the tassel, and female flowers, located on the ear, on the same plant. It can self-pollinate ("selfing") or cross pollinate. Natural pollination occurs in maize when wind blows pollen from the tassels to the silks that protrude from the tops of the incipient ears. Pollination may be readily controlled by techniques known to those of skill in the art. The development of maize hybrids requires the development of homozygous inbred lines, the crossing of these lines, and the evaluation of the crosses. Pedigree breeding and recurrent selections are two of the breeding methods used to develop inbred lines from populations. Breeding programs combine desirable traits from two or more inbred lines or various broad-based sources into breeding pools from which new inbred lines are developed by selfing and selection of desired phenotypes. A hybrid maize variety is the cross of two such inbred lines, each of which may have one or more desirable characteristics lacked by the other or which complement the other. The new inbreds are crossed with other inbred lines and the hybrids from these crosses are evaluated to determine which have commercial potential. The hybrid progeny of the first generation is designated F1. The F1 hybrid is more vigorous than its inbred parents. This hybrid vigor, or heterosis, can be manifested in many ways, including increased vegetative growth and increased yield.
[0358] Hybrid maize seed can be produced by a male sterility system incorporating manual detasseling. To produce hybrid seed, the male tassel is removed from the growing female inbred parent, which can be planted in various alternating row patterns with the male inbred parent. Consequently, providing that there is sufficient isolation from sources of foreign maize pollen, the ears of the female inbred will be fertilized only with pollen from the male inbred. The resulting seed is therefore hybrid (F1) and will form hybrid plants.
[0359] Field variation impacting plant development can result in plants tasseling after manual detasseling of the female parent is completed. Or, a female inbred plant tassel may not be completely removed during the detasseling process. In any event, the result is that the female plant will successfully shed pollen and some female plants will be self-pollinated. This will result in seed of the female inbred being harvested along with the hybrid seed which is normally produced. Female inbred seed does not exhibit heterosis and therefore is not as productive as F1 seed. In addition, the presence of female inbred seed can represent a germplasm security risk for the company producing the hybrid.
[0360] Alternatively, the female inbred can be detasseled by machine.
[0361] Mechanical detasseling is approximately as reliable as hand detasseling, but is faster and less costly. However, most detasseling machines produce more damage to the plants than hand detasseling. Thus, no form of detasseling is presently entirely satisfactory, and a need continues to exist for alternatives which further reduce production costs and eliminate self-pollination of the female parent in the production of hybrid seed.
[0362] Mutations that cause male sterility in plants have the potential to be useful in methods for hybrid seed production for crop plants such as maize and can lower production costs by eliminating the need for the labor-intensive removal of male flowers (also known as de-tasseling) from the maternal parent plants used as a hybrid parent. Mutations that cause male sterility in maize have been produced by a variety of methods such as X-rays or UV-irradiations, chemical treatments, or transposable element insertions (ms23, ms25, ms26, ms32) (Chaubal et al. (2000) Am J Bot 87:1193-1201). Conditional regulation of fertility genes through fertility/sterility "molecular switches" could enhance the options for designing new male-sterility systems for crop improvement (Unger et al. (2002) Transgenic Res 11:455-465).
[0363] Besides identification of novel genes impacting male fertility, there remains a need to provide a reliable system of producing genetic male sterility.
[0364] In U.S. Pat. No. 5,478,369, a method is described by which the Ms45 male fertility gene was tagged and cloned on maize chromosome 9. Previously, there had been described a male fertility gene on chromosome 9, ms2, which had never been cloned and sequenced. It is not allelic to the gene referred to in the '369 patent. See Albertsen, M. and Phillips, R. L., "Developmental Cytology of 13 Genetic Male Sterile Loci in Maize" Canadian Journal of Genetics & Cytology 23:195-208 (January 1981). The only fertility gene cloned before that had been the Arabidopsis gene described at Aarts, et al., supra.
[0365] Examples of genes that have been discovered subsequently that are important to male fertility are numerous and include the Arabidopsis ABORTED MICROSPORES (AMS) gene (Sorensen et al., The Plant Journal (2003) 33(2):413-423); the Arabidopsis MS1 gene (Wilson et al., The Plant Journal (2001) 39(2):170-181); the NEF1 gene (Ariizumi et al., The Plant Journal (2004) 39(2):170-181); the Arabidopsis AtGPAT1 gene (Zheng et al., The Plant Cell (2003) 15:1872-1887); the Arabidopsis dde2-2 mutation, shown to be defective in the allene oxide synthase gene (Malek et al., Planta (2002) 216:187-192); the Arabidopsis faceless pollen-1 gene (flp1) (Ariizumi et al., Plant Mol. Biol. (2003) 53:107-116); the Arabidopsis MALE MEIOCYTE DEATH1 gene (Yang et al., The Plant Cell (2003) 15:1281-1295); the tapetum-specific zinc finger gene, TAZ1 (Kapoor et al., The Plant Cell (2002) 14:2353-2367); and the TAPETUM DETERMINANT1 gene (Lan et al., The Plant Cell (2003) 15:2792-2804).
[0366] Other known male fertility mutants or genes from Zea mays are listed in U.S. Pat. No. 7,919,676 incorporated herein by reference.
[0367] Other genes include kinases and those encoding compounds toxic to either male or female gametophytic development.
[0368] Furthermore, it is recognized that the polynucleotide of interest may also comprise antisense sequences complementary to at least a portion of the messenger RNA (mRNA) for a targeted gene sequence of interest. Antisense nucleotides are constructed to hybridize with the corresponding mRNA. Modifications of the antisense sequences may be made as long as the sequences hybridize to and interfere with expression of the corresponding mRNA. In this manner, antisense constructions having 70%, 80%, or 85% sequence identity to the corresponding antisense sequences may be used. Furthermore, portions of the antisense nucleotides may be used to disrupt the expression of the target gene. Generally, sequences of at least 50 nucleotides, 100 nucleotides, 200 nucleotides, or greater may be used.
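For illustration only, an antisense sequence complementary to a portion of an mRNA can be derived by taking the reverse complement of that portion. The Python sketch below assumes an RNA alphabet of A, C, G and U; the function name `antisense_rna` and the example sequence are hypothetical and far shorter than the lengths of at least 50 nucleotides mentioned above.

```python
# Complement table for unambiguous RNA bases.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna_segment: str) -> str:
    """Return the antisense strand (reverse complement) of an mRNA segment,
    written 5' to 3'. Input is assumed to contain only A, C, G and U."""
    return mrna_segment.upper().translate(_RNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical six-base segment, chosen only to keep the example readable.
    print(antisense_rna("AUGGCC"))
```

Applying the function twice returns the original segment, which is a convenient sanity check when preparing constructs.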
[0369] In addition, the polynucleotide of interest may also be used in the sense orientation to suppress the expression of endogenous genes in plants. Methods for suppressing gene expression in plants using polynucleotides in the sense orientation are known in the art. The methods generally involve transforming plants with a DNA construct comprising a promoter that drives expression in a plant operably linked to at least a portion of a nucleotide sequence that corresponds to the transcript of the endogenous gene. Typically, such a nucleotide sequence has substantial sequence identity to the sequence of the transcript of the endogenous gene, generally greater than about 65% sequence identity, about 85% sequence identity, or greater than about 95% sequence identity. See, U.S. Pat. Nos. 5,283,184 and 5,034,323; herein incorporated by reference.
[0370] The polynucleotide of interest can also be a phenotypic marker. A phenotypic marker is a screenable or selectable marker, including visual markers and selectable markers, whether positive or negative. Any phenotypic marker can be used. Specifically, a selectable or screenable marker comprises a DNA segment that allows one to identify, or select for or against a molecule or a cell that contains it, often under particular conditions. These markers can encode an activity, such as, but not limited to, production of RNA, peptide, or protein, or can provide a binding site for RNA, peptides, proteins, inorganic and organic compounds or compositions and the like.
[0371] Examples of selectable markers include, but are not limited to, DNA segments that comprise restriction enzyme sites; DNA segments that encode products which provide resistance against otherwise toxic compounds including antibiotics, such as spectinomycin, ampicillin, kanamycin, tetracycline, Basta, neomycin phosphotransferase II (NEO) and hygromycin phosphotransferase (HPT); DNA segments that encode products which are otherwise lacking in the recipient cell (e.g., tRNA genes, auxotrophic markers); DNA segments that encode products which can be readily identified (e.g., phenotypic markers such as β-galactosidase, GUS; fluorescent proteins such as green fluorescent protein (GFP), cyan (CFP), yellow (YFP), red (RFP), and cell surface proteins); the generation of new primer sites for PCR (e.g., the juxtaposition of two DNA sequences not previously juxtaposed); the inclusion of DNA sequences not acted upon or acted upon by a restriction endonuclease or other DNA modifying enzyme, chemical, etc.; and the inclusion of DNA sequences required for a specific modification (e.g., methylation) that allows their identification.
[0372] Additional selectable markers include genes that confer resistance to herbicidal compounds, such as glufosinate ammonium, bromoxynil, imidazolinones, and 2,4-dichlorophenoxyacetate (2,4-D). See for example, Yarranton, (1992) Curr Opin Biotech 3:506-11; Christopherson et al., (1992) Proc. Natl. Acad. Sci. USA 89:6314-8; Yao et al., (1992) Cell 71:63-72; Reznikoff, (1992) Mol Microbiol 6:2419-22; Hu et al., (1987) Cell 48:555-66; Brown et al., (1987) Cell 49:603-12; Figge et al., (1988) Cell 52:713-22; Deuschle et al., (1989) Proc. Natl. Acad. Sci. USA 86:5400-4; Fuerst et al., (1989) Proc. Natl. Acad. Sci. USA 86:2549-53; Deuschle et al., (1990) Science 248:480-3; Gossen, (1993) Ph.D. Thesis, University of Heidelberg; Reines et al., (1993) Proc. Natl. Acad. Sci. USA 90:1917-21; Labow et al., (1990) Mol Cell Biol 10:3343-56; Zambretti et al., (1992) Proc. Natl. Acad. Sci. USA 89:3952-6; Baim et al., (1991) Proc. Natl. Acad. Sci. USA 88:5072-6; Wyborski et al., (1991) Nucleic Acids Res 19:4647-53; Hillen and Wissman, (1989) Topics Mol Struc Biol 10:143-62; Degenkolb et al., (1991) Antimicrob Agents Chemother 35:1591-5; Kleinschnidt et al., (1988) Biochemistry 27:1094-104; Bonin, (1993) Ph.D. Thesis, University of Heidelberg; Gossen et al., (1992) Proc. Natl. Acad. Sci. USA 89:5547-51; Oliva et al., (1992) Antimicrob Agents Chemother 36:913-9; Hlavka et al., (1985) Handbook of Experimental Pharmacology, Vol. 78 (Springer-Verlag, Berlin); Gill et al., (1988) Nature 334:721-4. Commercial traits can also be encoded on a gene or genes that could increase for example, starch for ethanol production, or provide expression of proteins. Another important commercial use of transformed plants is the production of polymers and bioplastics such as described in U.S. Pat. No. 5,602,321. Genes such as β-Ketothiolase, PHBase (polyhydroxybutyrate synthase), and acetoacetyl-CoA reductase (see Schubert et al. (1988) J. Bacteriol. 170:5837-5847) facilitate expression of polyhydroxyalkanoates (PHAs).
[0373] Exogenous products include plant enzymes and products as well as those from other sources including procaryotes and other eukaryotes. Such products include enzymes, cofactors, hormones, and the like. The level of proteins, particularly modified proteins having improved amino acid distribution to improve the nutrient value of the plant, can be increased. This is achieved by the expression of such proteins having enhanced amino acid content.
[0374] The transgenes, recombinant DNA molecules, DNA sequences of interest, and polynucleotides of interest can comprise one or more DNA sequences for gene silencing. Methods for gene silencing involving the expression of DNA sequences in plants are known in the art and include, but are not limited to, cosuppression, antisense suppression, double-stranded RNA (dsRNA) interference, hairpin RNA (hpRNA) interference, intron-containing hairpin RNA (ihpRNA) interference, transcriptional gene silencing, and micro RNA (miRNA) interference.
[0375] As used herein, "nucleic acid" means a polynucleotide and includes a single or a double-stranded polymer of deoxyribonucleotide or ribonucleotide bases. Nucleic acids may also include fragments and modified nucleotides. Thus, the terms "polynucleotide", "nucleic acid sequence", "nucleotide sequence" and "nucleic acid fragment" are used interchangeably to denote a polymer of RNA and/or DNA that is single- or double-stranded, optionally containing synthetic, non-natural, or altered nucleotide bases. Nucleotides (usually found in their 5'-monophosphate form) are referred to by their single letter designation as follows: "A" for adenosine or deoxyadenosine (for RNA or DNA, respectively), "C" for cytidine or deoxycytidine, "G" for guanosine or deoxyguanosine, "U" for uridine, "T" for deoxythymidine, "R" for purines (A or G), "Y" for pyrimidines (C or T), "K" for G or T, "H" for A or C or T, "I" for inosine, and "N" for any nucleotide.
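The single letter designations above can be illustrated, without limitation, by a short Python sketch that expands a DNA sequence written with the ambiguity letters R, Y, K, H and N into every concrete sequence it denotes. The table covers only the DNA letters named in the preceding paragraph (inosine and uridine are omitted), and the names `AMBIGUITY_CODES` and `expand_ambiguous` are hypothetical.

```python
from itertools import product

# Letters taken from the designations listed above (DNA only).
AMBIGUITY_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purines
    "Y": "CT",    # pyrimidines
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def expand_ambiguous(sequence: str) -> list[str]:
    """Return every unambiguous DNA sequence matching `sequence`.

    Raises KeyError for letters that are not in AMBIGUITY_CODES.
    """
    pools = [AMBIGUITY_CODES[base] for base in sequence.upper()]
    return ["".join(combo) for combo in product(*pools)]


if __name__ == "__main__":
    print(expand_ambiguous("AR"))
    print(len(expand_ambiguous("NNH")))
```

The number of expansions grows multiplicatively with each ambiguous position, so this approach is only practical for short motifs.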
[0376] "Open reading frame" is abbreviated ORF.
[0377] The terms "subfragment that is functionally equivalent" and "functionally equivalent subfragment" are used interchangeably herein. These terms refer to a portion or subsequence of an isolated nucleic acid fragment in which the ability to alter gene expression or produce a certain phenotype is retained whether or not the fragment or subfragment encodes an active enzyme. For example, the fragment or subfragment can be used in the design of genes to produce the desired phenotype in a transformed plant. Genes can be designed for use in suppression by linking a nucleic acid fragment or subfragment thereof, whether or not it encodes an active enzyme, in the sense or antisense orientation relative to a plant promoter sequence.
[0378] The term "conserved domain" or "motif" means a set of amino acids conserved at specific positions along an aligned sequence of evolutionarily related proteins. While amino acids at other positions can vary between homologous proteins, amino acids that are highly conserved at specific positions indicate amino acids that are essential to the structure, the stability, or the activity of a protein. Because they are identified by their high degree of conservation in aligned sequences of a family of protein homologues, they can be used as identifiers, or "signatures", to determine if a protein with a newly determined sequence belongs to a previously identified protein family.
[0379] Polynucleotide and polypeptide sequences, variants thereof, and the structural relationships of these sequences can be described by the terms "homology", "homologous", "substantially identical", "substantially similar" and "corresponding substantially" which are used interchangeably herein. These refer to polypeptide or nucleic acid fragments wherein changes in one or more amino acids or nucleotide bases do not affect the function of the molecule, such as the ability to mediate gene expression or to produce a certain phenotype. These terms also refer to modification(s) of nucleic acid fragments that do not substantially alter the functional properties of the resulting nucleic acid fragment relative to the initial, unmodified fragment. These modifications include deletion, substitution, and/or insertion of one or more nucleotides in the nucleic acid fragment.
[0380] Substantially similar nucleic acid sequences encompassed may be defined by their ability to hybridize (under moderately stringent conditions, e.g., 0.5×SSC, 0.1% SDS, 60° C.) with the sequences exemplified herein, or to any portion of the nucleotide sequences disclosed herein and which are functionally equivalent to any of the nucleic acid sequences disclosed herein. Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes determine stringency conditions.
[0381] The term "selectively hybridizes" includes reference to hybridization, under stringent hybridization conditions, of a nucleic acid sequence to a specified nucleic acid target sequence to a detectably greater degree (e.g., at least 2-fold over background) than its hybridization to non-target nucleic acid sequences and to the substantial exclusion of non-target nucleic acids. Selectively hybridizing sequences typically have at least about 80% sequence identity, or 90% sequence identity, up to and including 100% sequence identity (i.e., fully complementary) with each other.
[0382] The term "stringent conditions" or "stringent hybridization conditions" includes reference to conditions under which a probe will selectively hybridize to its target sequence in an in vitro hybridization assay. Stringent conditions are sequence-dependent and will be different in different circumstances. By controlling the stringency of the hybridization and/or washing conditions, target sequences can be identified which are 100% complementary to the probe (homologous probing). Alternatively, stringency conditions can be adjusted to allow some mismatching in sequences so that lower degrees of similarity are detected (heterologous probing). Generally, a probe is less than about 1000 nucleotides in length, optionally less than 500 nucleotides in length.
[0383] Typically, stringent conditions will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salt(s)) at pH 7.0 to 8.3, and at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. Exemplary low stringency conditions include hybridization with a buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulphate) at 37° C., and a wash in 1× to 2×SSC (20×SSC=3.0 M NaCl/0.3 M trisodium citrate) at 50 to 55° C. Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 0.5× to 1×SSC at 55 to 60° C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 0.1×SSC at 60 to 65° C.
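By way of a non-limiting illustration, the sodium ion concentration of an SSC wash can be estimated from the 20×SSC composition stated above. The following Python sketch is illustrative only; the function and constant names are hypothetical, and it assumes complete dissociation of both salts, with trisodium citrate contributing three sodium ions per formula unit.

```python
# 20x SSC stock composition as stated in the text above.
NACL_M_AT_20X = 3.0      # mol/L NaCl
CITRATE_M_AT_20X = 0.3   # mol/L trisodium citrate
NA_PER_CITRATE = 3       # trisodium citrate carries three Na+ per formula unit


def ssc_sodium_molar(fold: float) -> float:
    """Approximate Na+ molarity of an SSC solution at the given fold strength.

    Assumes full dissociation of NaCl and trisodium citrate (a simplification).
    """
    if fold < 0:
        raise ValueError("fold strength must be non-negative")
    na_at_20x = NACL_M_AT_20X + NA_PER_CITRATE * CITRATE_M_AT_20X
    return fold * na_at_20x / 20.0


# Wash strengths named in the exemplary low, moderate and high stringency conditions.
for fold in (2.0, 1.0, 0.5, 0.1):
    print(f"{fold}x SSC -> {ssc_sodium_molar(fold):.4f} M Na+")
```

Under these assumptions, lower fold strengths correspond to lower sodium concentrations, consistent with the more dilute washes used in the higher stringency conditions above.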
[0384] "Sequence identity" or "identity" in the context of nucleic acid or polypeptide sequences refers to the nucleic acid bases or amino acid residues in two sequences that are the same when aligned for maximum correspondence over a specified comparison window.
[0385] The term "percentage of sequence identity" refers to the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the results by 100 to yield the percentage of sequence identity. Useful examples of percent sequence identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95%, or any integer percentage from 50% to 100%. These identities can be determined using any of the programs described herein.
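As a non-limiting illustration of the calculation described above, the following Python sketch computes a percentage of sequence identity from two sequences that have already been aligned. The function name and the use of "-" as the gap character are assumptions made for the example; it is not the routine used by any of the programs referenced herein.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a comparison window of two pre-aligned sequences.

    Both strings must have the same length; gaps are marked with `gap`.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # A position counts as matched only when both sequences carry the same
    # residue; a gap opposite a gap is not counted as a match.
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    # Divide by the total number of positions in the window, gaps included.
    return 100.0 * matches / len(aligned_a)


# Eight matched positions in a ten-position window.
print(percent_identity("ACGT-ACGTA", "ACGTTACCTA"))  # 80.0
```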
[0386] Sequence alignments and percent identity or similarity calculations may be determined using a variety of comparison methods designed to detect homologous sequences including, but not limited to, the MegAlign® program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Within the context of this application it will be understood that, where sequence analysis software is used for analysis, the results of the analysis will be based on the "default values" of the program referenced, unless otherwise specified. As used herein "default values" will mean any set of values or parameters that originally load with the software when first initialized.
[0387] The "Clustal V method of alignment" corresponds to the alignment method labeled Clustal V (described by Higgins and Sharp, (1989) CABIOS 5:151-153; Higgins et al., (1992) Comput Appl Biosci 8:189-191) and found in the MegAlign® program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). For multiple alignments, the default values correspond to GAP PENALTY=10 and GAP LENGTH PENALTY=10. Default parameters for pairwise alignments and calculation of percent identity of protein sequences using the Clustal method are KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. For nucleic acids these parameters are KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS SAVED=4. After alignment of the sequences using the Clustal V program, it is possible to obtain a "percent identity" by viewing the "sequence distances" table in the same program.
[0388] The "Clustal W method of alignment" corresponds to the alignment method labeled Clustal W (described by Higgins and Sharp, (1989) CABIOS 5:151-153; Higgins et al., (1992) Comput Appl Biosci 8:189-191) and found in the MegAlign® v6.1 program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Default parameters for multiple alignment are GAP PENALTY=10, GAP LENGTH PENALTY=0.2, Delay Divergent Seqs (%)=30, DNA Transition Weight=0.5, Protein Weight Matrix=Gonnet Series, and DNA Weight Matrix=IUB. After alignment of the sequences using the Clustal W program, it is possible to obtain a "percent identity" by viewing the "sequence distances" table in the same program.
[0389] Unless otherwise stated, sequence identity/similarity values provided herein refer to the value obtained using GAP Version 10 (GCG, Accelrys, San Diego, Calif.) using the following parameters: % identity and % similarity for a nucleotide sequence using a gap creation penalty weight of 50 and a gap length extension penalty weight of 3, and the nwsgapdna.cmp scoring matrix; % identity and % similarity for an amino acid sequence using a GAP creation penalty weight of 8 and a gap length extension penalty of 2, and the BLOSUM62 scoring matrix (Henikoff and Henikoff, (1992) Proc. Natl. Acad. Sci. USA 89:10915). GAP uses the algorithm of Needleman and Wunsch, (1970) J Mol Biol 48:443-53, to find an alignment of two complete sequences that maximizes the number of matches and minimizes the number of gaps. GAP considers all possible alignments and gap positions and creates the alignment with the largest number of matched bases and the fewest gaps, using a gap creation penalty and a gap extension penalty in units of matched bases.
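For illustration only, the following Python sketch shows a minimal global alignment in the style of Needleman and Wunsch. It is a simplification and is not the GAP program: it uses a single linear gap penalty rather than separate gap creation and gap extension penalties, and the scoring values (match=1, mismatch=-1, gap=-2) and function name are assumptions chosen for the example.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment of two complete sequences with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). Scoring values are illustrative.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

An affine scheme with distinct gap creation and gap extension penalties, as described for GAP above, would require additional bookkeeping that is omitted from this sketch.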
[0390] "BLAST" is a searching algorithm provided by the National Center for Biotechnology Information (NCBI) used to find regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches to identify sequences having sufficient similarity to a query sequence such that the similarity would not be predicted to have occurred randomly. BLAST reports the identified sequences and their local alignment to the query sequence.
[0391] It is well understood by one skilled in the art that many levels of sequence identity are useful in identifying polypeptides from other species or modified naturally or synthetically wherein such polypeptides have the same or similar function or activity. Useful examples of percent identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95%, or any integer percentage from 50% to 100%. Indeed, any integer amino acid identity from 50% to 100% may be useful in describing the present disclosure, such as 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%.
[0392] "Gene" includes a nucleic acid fragment that expresses a functional molecule such as, but not limited to, a specific protein, including regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its own regulatory sequences.
[0393] A "mutated gene" is a gene that has been altered through human intervention. Such a "mutated gene" has a sequence that differs from the sequence of the corresponding non-mutated gene by at least one nucleotide addition, deletion, or substitution. In certain embodiments of the disclosure, the mutated gene comprises an alteration that results from a guide polynucleotide/Cas endonuclease system as disclosed herein. A mutated plant is a plant comprising a mutated gene.
[0394] As used herein, a "targeted mutation" is a mutation in a native gene that was made by altering a target sequence within the native gene using a method involving a double-strand-break-inducing agent that is capable of inducing a double-strand break in the DNA of the target sequence as disclosed herein or known in the art.
[0395] In one embodiment, the targeted mutation is the result of a guide RNA/Cas endonuclease induced gene editing as described herein. The guide RNA/Cas endonuclease induced targeted mutation can occur in a nucleotide sequence that is located within or outside a genomic target site that is recognized and cleaved by a Cas endonuclease.
[0396] The term "genome" as it applies to a plant cell encompasses not only chromosomal DNA found within the nucleus, but organelle DNA found within subcellular components (e.g., mitochondria, or plastid) of the cell.
[0397] A "codon-modified gene" or "codon-preferred gene" or "codon-optimized gene" is a gene having its frequency of codon usage designed to mimic the frequency of preferred codon usage of the host cell.
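As a non-limiting sketch of the concept, the following Python example replaces each codon of a coding sequence with a host-preferred synonymous codon. The `PREFERRED` table is hypothetical and covers only a few amino acids; it is not actual codon usage data for any plant. Codons whose amino acid is absent from the table are left unchanged. The standard genetic code is built from the conventional TCAG ordering.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}

# Hypothetical host-preferred codons (illustrative only, not real usage data).
PREFERRED = {"M": "ATG", "A": "GCC", "K": "AAG", "L": "CTG", "*": "TGA"}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence (length divisible by 3) to protein."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds: str, preferred: dict = PREFERRED) -> str:
    """Swap each codon for the preferred synonymous codon, when one is listed."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        out.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(out)


original = "ATGGCTAAATTATAA"
optimized = codon_optimize(original)
print(optimized, translate(optimized))
```

The encoded amino acid sequence is unchanged by the substitution; only the codon choice differs.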
[0398] An "allele" is one of several alternative forms of a gene occupying a given locus on a chromosome. When all the alleles present at a given locus on a chromosome are the same, that plant is homozygous at that locus. If the alleles present at a given locus on a chromosome differ, that plant is heterozygous at that locus.
[0399] "Coding sequence" refers to a polynucleotide sequence which codes for a specific amino acid sequence. "Regulatory sequences" refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to: promoters, translation leader sequences, 5' untranslated sequences, 3' untranslated sequences, introns, polyadenylation target sequences, RNA processing sites, effector binding sites, and stem-loop structures.
[0400] "A plant-optimized nucleotide sequence" is a nucleotide sequence that has been optimized for increased expression in plants, particularly in one or more plants of interest. For example, a plant-optimized nucleotide sequence can be synthesized by modifying a nucleotide sequence encoding a protein such as, for example, a double-strand-break-inducing agent (e.g., an endonuclease) as disclosed herein, using one or more plant-preferred codons for improved expression. See, for example, Campbell and Gowri (1990) Plant Physiol. 92:1-11 for a discussion of host-preferred codon usage.
[0401] Methods are available in the art for synthesizing plant-preferred genes. See, for example, U.S. Pat. Nos. 5,380,831, and 5,436,391, and Murray et al. (1989) Nucleic Acids Res. 17:477-498, herein incorporated by reference. Additional sequence modifications are known to enhance gene expression in a plant host. These include, for example, elimination of: one or more sequences encoding spurious polyadenylation signals, one or more exon-intron splice site signals, one or more transposon-like repeats, and other such well-characterized sequences that may be deleterious to gene expression. The G-C content of the sequence may be adjusted to levels average for a given plant host, as calculated by reference to known genes expressed in the host plant cell. When possible, the sequence is modified to avoid one or more predicted hairpin secondary mRNA structures. Thus, "a plant-optimized nucleotide sequence" of the present disclosure comprises one or more of such sequence modifications.
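By way of example only, the G-C content referred to above can be computed as the percentage of G and C bases in a sequence. The following Python sketch uses a hypothetical function name and an arbitrary example sequence; the target G-C level for any given plant host is not specified here.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


# Arbitrary example sequence: 4 of 8 bases are G or C.
print(gc_content("ATGCATGC"))  # 50.0
```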
[0402] "Promoter" refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. An "enhancer" is a DNA sequence that can stimulate promoter activity, and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, and/or comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of some variation may have identical promoter activity. Promoters that cause a gene to be expressed in most cell types at most times are commonly referred to as "constitutive promoters".
[0403] It has been shown that certain promoters are able to direct RNA synthesis at a higher rate than others. These are called "strong promoters". Certain other promoters have been shown to direct RNA synthesis at higher levels only in particular types of cells or tissues and are often referred to as "tissue specific promoters", or "tissue-preferred promoters" if the promoters direct RNA synthesis preferably in certain tissues but also in other tissues at reduced levels. Since patterns of expression of a chimeric gene (or genes) introduced into a plant are controlled using promoters, there is an ongoing interest in the isolation of novel promoters which are capable of controlling the expression of a chimeric gene (or genes) at certain levels in specific tissue types or at specific plant developmental stages.
[0404] Some embodiments of the disclosure relate to newly discovered U6 RNA polymerase III promoters, GM-U6-13.1 (SEQ ID NO: 120) as described in Example 12 and GM-U6-9.1 (SEQ ID NO: 295) described in Example 19.
[0405] Non-limiting examples of methods and compositions relating to the soybean promoters described herein are as follows:
[0406] A1. A recombinant DNA construct comprising a nucleotide sequence comprising any of the sequences set forth in SEQ ID NO:120 or SEQ ID NO:295, or a functional fragment thereof, operably linked to at least one heterologous sequence, wherein said nucleotide sequence is a promoter.
[0407] A2. The recombinant DNA construct of embodiment A1, wherein the nucleotide sequence has at least 95% identity, based on the Clustal V method of alignment with pairwise alignment default parameters (KTUPLE=2, GAP PENALTY=5, WINDOW=4 and DIAGONALS SAVED=4), when compared to the sequence set forth in SEQ ID NO:120 or SEQ ID NO: 295.
[0408] A3. A vector comprising the recombinant DNA construct of embodiment A1.
[0409] A4. A cell comprising the recombinant DNA construct of embodiment A1.
[0410] A5. The cell of embodiment A4, wherein the cell is a plant cell.
[0411] A6. A transgenic plant having stably incorporated into its genome the recombinant DNA construct of embodiment A1.
[0412] A7. The transgenic plant of embodiment A6, wherein said plant is a dicot plant.
[0413] A8. The transgenic plant of embodiment A7 wherein the plant is soybean.
[0414] A9. A transgenic seed produced by the transgenic plant of embodiment A7, wherein the transgenic seed comprises the recombinant DNA construct.
[0415] A10. The recombinant DNA construct of embodiment A1 wherein the at least one heterologous sequence codes for a gene selected from the group consisting of: a reporter gene, a selection marker, a disease resistance conferring gene, a herbicide resistance conferring gene, an insect resistance conferring gene; a gene involved in carbohydrate metabolism, a gene involved in fatty acid metabolism, a gene involved in amino acid metabolism, a gene involved in plant development, a gene involved in plant growth regulation, a gene involved in yield improvement, a gene involved in drought resistance, a gene involved in cold resistance, a gene involved in heat resistance and a gene involved in salt resistance in plants.
[0416] A11. The recombinant DNA construct of embodiment A1, wherein the at least one heterologous sequence encodes a protein selected from the group consisting of: a reporter protein, a selection marker, a protein conferring disease resistance, protein conferring herbicide resistance, protein conferring insect resistance; protein involved in carbohydrate metabolism, protein involved in fatty acid metabolism, protein involved in amino acid metabolism, protein involved in plant development, protein involved in plant growth regulation, protein involved in yield improvement, protein involved in drought resistance, protein involved in cold resistance, protein involved in heat resistance and protein involved in salt resistance in plants.
[0417] A12. A method of expressing a coding sequence or a functional RNA in a plant comprising:
[0418] a) introducing the recombinant DNA construct of embodiment A1 into the plant, wherein the at least one heterologous sequence comprises a coding sequence or encodes a functional RNA;
[0419] b) growing the plant of step a); and
[0420] c) selecting a plant displaying expression of the coding sequence or the functional RNA of the recombinant DNA construct.
[0421] A13. A method of transgenically altering a marketable plant trait, comprising:
[0422] a) introducing a recombinant DNA construct of embodiment A1 into the plant;
[0423] b) growing a fertile, mature plant resulting from step a); and
[0424] c) selecting a plant expressing the at least one heterologous sequence in at least one plant tissue based on the altered marketable trait.
[0425] A14. The method of embodiment A13 wherein the marketable trait is selected from the group consisting of: disease resistance, herbicide resistance, insect resistance, carbohydrate metabolism, fatty acid metabolism, amino acid metabolism, plant development, plant growth regulation, yield improvement, drought resistance, cold resistance, heat resistance, and salt resistance.
[0426] A15. A method for altering expression of at least one heterologous sequence in a plant comprising:
[0427] (a) transforming a plant cell with the recombinant DNA construct of embodiment A1;
[0428] (b) growing fertile mature plants from the transformed plant cell of step (a); and
[0429] (c) selecting plants containing the transformed plant cell wherein the expression of the heterologous sequence is increased or decreased.
[0430] A16. The method of Embodiment A15 wherein the plant is a soybean plant.
[0431] A17. A plant stably transformed with a recombinant DNA construct comprising a soybean promoter and a heterologous nucleic acid fragment operably linked to said promoter, wherein said promoter is capable of controlling expression of said heterologous nucleic acid fragment in a plant cell, and further wherein said promoter comprises any of the sequences set forth in SEQ ID NO: 120 or SEQ ID NO:295.
[0432] New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro and Goldberg, (1989) In The Biochemistry of Plants, Vol. 115, Stumpf and Conn, eds (New York, N.Y.: Academic Press), pp. 1-82.
[0433] "Translation leader sequence" refers to a polynucleotide sequence located between the promoter sequence of a gene and the coding sequence. The translation leader sequence is present in the mRNA upstream of the translation start sequence. The translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency. Examples of translation leader sequences have been described (e.g., Turner and Foster, (1995) Mol Biotechnol 3:225-236).
[0434] "3' non-coding sequences", "transcription terminator" or "termination sequences" refer to DNA sequences located downstream of a coding sequence and include polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3' end of the mRNA precursor. The use of different 3' non-coding sequences is exemplified by Ingelbrecht et al., (1989) Plant Cell 1:671-680.
[0435] "RNA transcript" refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript or pre-mRNA. An RNA transcript is referred to as the mature RNA or mRNA when it is an RNA sequence derived from post-transcriptional processing of the primary transcript (pre-mRNA). "Messenger RNA" or "mRNA" refers to the RNA that is without introns and that can be translated into protein by the cell. "cDNA" refers to a DNA that is complementary to, and synthesized from, an mRNA template using the enzyme reverse transcriptase. The cDNA can be single-stranded or converted into double-stranded form using the Klenow fragment of DNA polymerase I. "Sense" RNA refers to RNA transcript that includes the mRNA and can be translated into protein within a cell or in vitro. "Antisense RNA" refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA, and that blocks the expression of a target gene (see, e.g., U.S. Pat. No. 5,107,065). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding sequence, introns, or the coding sequence. "Functional RNA" refers to antisense RNA, ribozyme RNA, or other RNA that may not be translated but yet has an effect on cellular processes. The terms "complement" and "reverse complement" are used interchangeably herein with respect to mRNA transcripts, and are meant to define the antisense RNA of the message.
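As a non-limiting illustration of the "reverse complement" relationship between an mRNA and its antisense RNA, the following Python sketch complements each base and reverses the result so that the output reads 5' to 3'. The function name and the example sequence are hypothetical, and only the four standard RNA bases are handled.

```python
# Watson-Crick pairing for the four standard RNA bases.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str) -> str:
    """Return the reverse complement of an RNA sequence, written 5' to 3'."""
    mrna = mrna.upper()
    if set(mrna) - set("ACGU"):
        raise ValueError("sequence contains non-RNA characters")
    return mrna.translate(_RNA_COMPLEMENT)[::-1]


print(antisense_rna("AUGGCC"))  # GGCCAU
```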
[0436] The term "operably linked" refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is regulated by the other. For example, a promoter is operably linked with a coding sequence when it is capable of regulating the expression of that coding sequence (i.e., the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in a sense or antisense orientation. In another example, the complementary RNA regions can be operably linked, either directly or indirectly, 5' to the target mRNA, or 3' to the target mRNA, or within the target mRNA, or a first complementary region is 5' and its complement is 3' to the target mRNA.
[0437] Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described more fully in Sambrook et al., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y. (1989). Transformation methods are well known to those skilled in the art and are described infra.
[0438] "PCR" or "polymerase chain reaction" is a technique for the synthesis of specific DNA segments and consists of a series of repetitive denaturation, annealing, and extension cycles. Typically, a double-stranded DNA is heat denatured, and two primers complementary to the 3' boundaries of the target segment are annealed to the DNA at low temperature, and then extended at an intermediate temperature. One set of these three consecutive steps is referred to as a "cycle".
[0439] The term "recombinant" refers to an artificial combination of two otherwise separated segments of sequence, e.g., by chemical synthesis, or manipulation of isolated segments of nucleic acids by genetic engineering techniques.
[0440] The terms "plasmid", "vector" and "cassette" refer to an extrachromosomal element often carrying genes that are not part of the central metabolism of the cell, and usually in the form of double-stranded DNA. Such elements may be autonomously replicating sequences, genome integrating sequences, phage, or nucleotide sequences, in linear or circular form, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a polynucleotide of interest into a cell. "Transformation cassette" refers to a specific vector containing a gene and having elements in addition to the gene that facilitate transformation of a particular host cell. "Expression cassette" refers to a specific vector containing a gene and having elements in addition to the gene that allow for expression of that gene in a host.
[0441] The terms "recombinant DNA molecule", "recombinant construct", "expression construct", "construct", and "recombinant DNA construct" are used interchangeably herein. A recombinant construct comprises an artificial combination of nucleic acid fragments, e.g., regulatory and coding sequences that are not all found together in nature. For example, a construct may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. Such a construct may be used by itself or may be used in conjunction with a vector. If a vector is used, then the choice of vector is dependent upon the method that will be used to transform host cells as is well known to those skilled in the art. For example, a plasmid vector can be used. The skilled artisan is well aware of the genetic elements that must be present on the vector in order to successfully transform, select and propagate host cells. The skilled artisan will also recognize that different independent transformation events may result in different levels and patterns of expression (Jones et al., (1985) EMBO J. 4:2411-2418; De Almeida et al., (1989) Mol Gen Genetics 218:78-86), and thus that multiple events are typically screened in order to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by standard molecular biological, biochemical, and other assays including Southern analysis of DNA, Northern analysis of mRNA expression, PCR, real time quantitative PCR (qPCR), reverse transcription PCR (RT-PCR), immunoblotting analysis of protein expression, enzyme or activity assays, and/or phenotypic analysis.
[0442] The term "expression", as used herein, refers to the production of a functional end-product (e.g., an mRNA, guide RNA, or a protein) in either precursor or mature form.
[0443] The term "introduced" means providing a nucleic acid (e.g., expression construct) or protein into a cell. Introduced includes reference to the incorporation of a nucleic acid into a eukaryotic or prokaryotic cell where the nucleic acid may be incorporated into the genome of the cell, and includes reference to the transient provision of a nucleic acid or protein to the cell. Introduced includes reference to stable or transient transformation methods, as well as sexually crossing. Thus, "introduced" in the context of inserting a nucleic acid fragment (e.g., a recombinant DNA construct/expression construct) into a cell, means "transfection" or "transformation" or "transduction" and includes reference to the incorporation of a nucleic acid fragment into a eukaryotic or prokaryotic cell where the nucleic acid fragment may be incorporated into the genome of the cell (e.g., chromosome, plasmid, plastid, or mitochondrial DNA), converted into an autonomous replicon, or transiently expressed (e.g., transfected mRNA).
[0444] "Mature" protein refers to a post-translationally processed polypeptide (i.e., one from which any pre- or propeptides present in the primary translation product have been removed). "Precursor" protein refers to the primary product of translation of mRNA (i.e., with pre- and propeptides still present). Pre- and propeptides may be but are not limited to intracellular localization signals.
[0445] "Stable transformation" refers to the transfer of a nucleic acid fragment into a genome of a host organism, including both nuclear and organellar genomes, resulting in genetically stable inheritance. In contrast, "transient transformation" refers to the transfer of a nucleic acid fragment into the nucleus, or other DNA-containing organelle, of a host organism resulting in gene expression without integration or stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" organisms.
[0446] The commercial development of genetically improved germplasm has also advanced to the stage of introducing multiple traits into crop plants, often referred to as a gene stacking approach. In this approach, multiple genes conferring different characteristics of interest can be introduced into a plant. Gene stacking can be accomplished by many means including but not limited to co-transformation, retransformation, and crossing lines with different genes of interest.
[0447] The term "plant" refers to whole plants, plant organs, plant tissues, seeds, plant cells, and progeny of the same. Plant cells include, without limitation, cells from seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen and microspores. Plant parts include differentiated and undifferentiated tissues including, but not limited to roots, stems, shoots, leaves, pollen, seeds, tumor tissue and various forms of cells and culture (e.g., single cells, protoplasts, embryos, and callus tissue). The plant tissue may be in plant or in a plant organ, tissue or cell culture. The term "plant organ" refers to plant tissue or a group of tissues that constitute a morphologically and functionally distinct part of a plant. The term "genome" refers to the entire complement of genetic material (genes and non-coding sequences) that is present in each cell of an organism, or virus or organelle; and/or a complete set of chromosomes inherited as a (haploid) unit from one parent. "Progeny" comprises any subsequent generation of a plant.
[0448] A transgenic plant includes, for example, a plant which comprises within its genome a heterologous polynucleotide introduced by a transformation step. The heterologous polynucleotide can be stably integrated within the genome such that the polynucleotide is passed on to successive generations. The heterologous polynucleotide may be integrated into the genome alone or as part of a recombinant DNA construct. A transgenic plant can also comprise more than one heterologous polynucleotide within its genome. Each heterologous polynucleotide may confer a different trait to the transgenic plant. A heterologous polynucleotide can include a sequence that originates from a foreign species, or, if from the same species, can be substantially modified from its native form. Transgenic can include any cell, cell line, callus, tissue, plant part or plant, the genotype of which has been altered by the presence of heterologous nucleic acid including those transgenics initially so altered as well as those created by sexual crosses or asexual propagation from the initial transgenic. The alterations of the genome (chromosomal or extra-chromosomal) by conventional plant breeding methods, by the genome editing procedure described herein that does not result in an insertion of a foreign polynucleotide, or by naturally occurring events such as random cross-fertilization, non-recombinant viral infection, non-recombinant bacterial transformation, non-recombinant transposition, or spontaneous mutation are not intended to be regarded as transgenic.
[0449] In certain embodiments of the disclosure, a fertile plant is a plant that produces viable male and female gametes and is self-fertile. Such a self-fertile plant can produce a progeny plant without the contribution from any other plant of a gamete and the genetic material contained therein. Other embodiments of the disclosure can involve the use of a plant that is not self-fertile because the plant does not produce male gametes, or female gametes, or both, that are viable or otherwise capable of fertilization. As used herein, a "male sterile plant" is a plant that does not produce male gametes that are viable or otherwise capable of fertilization. As used herein, a "female sterile plant" is a plant that does not produce female gametes that are viable or otherwise capable of fertilization. It is recognized that male-sterile and female-sterile plants can be female-fertile and male-fertile, respectively. It is further recognized that a male fertile (but female sterile) plant can produce viable progeny when crossed with a female fertile plant and that a female fertile (but male sterile) plant can produce viable progeny when crossed with a male fertile plant.
[0450] A "centimorgan" (cM) or "map unit" is the distance between two linked genes, markers, target sites, loci, or any pair thereof, wherein 1% of the products of meiosis are recombinant. Thus, a centimorgan is equivalent to a distance equal to a 1% average recombination frequency between the two linked genes, markers, target sites, loci, or any pair thereof.
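By way of illustration only, the centimorgan definition above can be expressed as a short calculation. The following is a minimal sketch in Python; the function name map_distance_cm and the example counts are hypothetical and are not taken from the disclosure. It applies the simple proportion used in the definition, which is a reasonable approximation only for closely linked loci.

```python
def map_distance_cm(recombinant: int, total: int) -> float:
    """Return the map distance in centimorgans (cM).

    Per the definition above, 1 cM corresponds to 1% of the products
    of meiosis being recombinant, so distance = 100 * recombinant / total.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= recombinant <= total:
        raise ValueError("recombinant must be between 0 and total")
    return 100.0 * recombinant / total


# Hypothetical counts: 12 recombinant products among 1000 scored.
distance = map_distance_cm(recombinant=12, total=1000)
print(f"{distance:.1f} cM")
```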
Breeding Methods and Methods for Selecting Plants Utilizing a Two Component RNA Guide and Cas Endonuclease System
[0451] The present disclosure finds use in the breeding of plants comprising one or more transgenic traits. Most commonly, transgenic traits are randomly inserted throughout the plant genome as a consequence of transformation systems based on Agrobacterium, biolistics, or other commonly used procedures. More recently, gene targeting protocols have been developed that enable directed transgene insertion. One important technology, site-specific integration (SSI), enables the targeting of a transgene to the same chromosomal location as a previously inserted transgene. Custom-designed meganucleases and custom-designed zinc finger nucleases allow researchers to design nucleases to target specific chromosomal locations, and these reagents allow the targeting of transgenes at the chromosomal site cleaved by these nucleases.
[0452] The currently used systems for precision genetic engineering of eukaryotic genomes, e.g. plant genomes, rely upon homing endonucleases, meganucleases, zinc finger nucleases, and transcription activator-like effector nucleases (TALENs), which require de novo protein engineering for every new target locus. The highly specific, RNA-directed DNA nuclease, guide RNA/Cas9 endonuclease system described herein, is more easily customizable and therefore more useful when modification of many different target sequences is the goal. This disclosure takes further advantage of the two component nature of the guide RNA/Cas system, with its constant protein component, the Cas endonuclease, and its variable and easily reprogrammable targeting component, the guide RNA or the crRNA.
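To illustrate why the variable component is easily reprogrammable, the following minimal Python sketch lists candidate target sites in a DNA sequence. It assumes the convention commonly used for Streptococcus pyogenes Cas9, namely a 20-nucleotide protospacer immediately followed by an NGG protospacer adjacent motif; that convention, the function name find_candidate_targets, and the example sequence are assumptions made for illustration and are not stated in this section. Only the given strand is scanned.

```python
import re


def find_candidate_targets(seq: str, protospacer_len: int = 20):
    """Return (start, protospacer, pam) tuples for N{len}-NGG sites.

    A lookahead is used so that overlapping candidate sites are all reported.
    Only the strand supplied in `seq` is scanned (an assumption of this sketch).
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical sequence containing a single N20-NGG site.
example = "TT" + "ACGTACGTACGTACGTACGT" + "AGG" + "TT"
sites = find_candidate_targets(example)
for start, protospacer, pam in sites:
    print(start, protospacer, pam)
```

Under these assumptions, retargeting amounts to choosing a different protospacer from such a list, while the Cas endonuclease component is left unchanged.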
[0453] The guide RNA/Cas system described herein is especially useful for genome engineering, especially plant genome engineering, in circumstances where nuclease off-target cutting can be toxic to the targeted cells. In one embodiment of the guide RNA/Cas system described herein, the constant component, in the form of an expression-optimized Cas9 gene, is stably integrated into the target genome, e.g. plant genome. Expression of the Cas9 gene is under control of a promoter, e.g. plant promoter, which can be a constitutive promoter, tissue-specific promoter or inducible promoter, e.g. temperature-inducible, stress-inducible, developmental stage inducible, or chemically inducible promoter. In the absence of the variable component, i.e. the guide RNA or crRNA, the Cas9 protein is not able to cut DNA and therefore its presence in the plant cell should have little or no consequence. Hence a key advantage of the guide RNA/Cas system described herein is the ability to create and maintain a cell line or transgenic organism capable of efficient expression of the Cas9 protein with little or no consequence to cell viability. In order to induce cutting at desired genomic sites to achieve targeted genetic modifications, guide RNAs or crRNAs can be introduced by a variety of methods into cells containing the stably-integrated and expressed Cas9 gene. For example, guide RNAs or crRNAs can be chemically or enzymatically synthesized, and introduced into the Cas9 expressing cells via direct delivery methods such as particle bombardment or electroporation.
[0454] Alternatively, genes capable of efficiently expressing guide RNAs or crRNAs in the target cells can be synthesized chemically, enzymatically or in a biological system, and these genes can be introduced into the Cas9 expressing cells via direct delivery methods such as particle bombardment or electroporation, or via biological delivery methods such as Agrobacterium-mediated DNA delivery.
[0455] One embodiment of the disclosure is a method for selecting a plant comprising an altered target site in its plant genome, the method comprising: a) obtaining a first plant comprising at least one Cas endonuclease capable of introducing a double strand break at a target site in the plant genome; b) obtaining a second plant comprising a guide RNA that is capable of forming a complex with the Cas endonuclease of (a); c) crossing the first plant of (a) with the second plant of (b); d) evaluating the progeny of (c) for an alteration in the target site; and e) selecting a progeny plant that possesses the desired alteration of said target site.
[0456] Another embodiment of the disclosure is a method for selecting a plant comprising an altered target site in its plant genome, the method comprising: a) obtaining a first plant comprising at least one Cas endonuclease capable of introducing a double strand break at a target site in the plant genome; b) obtaining a second plant comprising a guide RNA and a donor DNA, wherein said guide RNA is capable of forming a complex with the Cas endonuclease of (a), wherein said donor DNA comprises a polynucleotide of interest; c) crossing the first plant of (a) with the second plant of (b); d) evaluating the progeny of (c) for an alteration in the target site and e) selecting a progeny plant that comprises the polynucleotide of interest inserted at said target site.
[0457] Another embodiment of the disclosure is a method for selecting a plant comprising an altered target site in its plant genome, the method comprising selecting at least one progeny plant that comprises an alteration at a target site in its plant genome, wherein said progeny plant was obtained by crossing a first plant expressing at least one Cas endonuclease to a second plant comprising a guide RNA and a donor DNA, wherein said Cas endonuclease is capable of introducing a double strand break at said target site, wherein said donor DNA comprises a polynucleotide of interest.
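As a rough illustration of the crossing step in the embodiments above, the following Python sketch enumerates the progeny classes expected from a cross between a Cas endonuclease parent and a guide RNA parent. The zygosity of each parent (hemizygous by default), the independent segregation of the two loci, and the function name progeny_classes are assumptions made for this sketch and are not specified in the disclosure.

```python
from fractions import Fraction
from itertools import product


def progeny_classes(cas_parent=("Cas9", None), guide_parent=("gRNA", None)):
    """Return the expected fraction of each progeny class.

    Each parent is given as the two alleles at its transgene locus, with None
    for the absence of the transgene. Keys are (has_cas, has_guide) pairs.
    Each gamete is assumed to carry either allele with equal probability.
    """
    counts = {}
    for cas_allele, guide_allele in product(cas_parent, guide_parent):
        key = (cas_allele is not None, guide_allele is not None)
        counts[key] = counts.get(key, 0) + 1
    total = sum(counts.values())
    return {key: Fraction(n, total) for key, n in counts.items()}


classes = progeny_classes()
both = classes[(True, True)]  # progeny carrying both components
print(both)
```

Under these assumptions, only the progeny class carrying both the Cas endonuclease and the guide RNA would be expected to show an alteration at the target site, which is why the progeny are evaluated before selection.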
[0458] As disclosed herein, a guide RNA/Cas system mediating gene targeting can be used in methods for directing transgene insertion and/or for producing complex transgenic trait loci comprising multiple transgenes in a fashion similar to that disclosed in WO2013/0198888 (published Aug. 1, 2013), where instead of using a double strand break inducing agent to introduce a gene of interest, a guide RNA/Cas system or a guide polynucleotide/Cas system as disclosed herein is used. In one embodiment, a complex transgenic trait locus is a genomic locus that has multiple transgenes genetically linked to each other. By inserting independent transgenes within 0.1, 0.2, 0.3, 0.4, 0.5, 1, 2, or even 5 centimorgans (cM) from each other, the transgenes can be bred as a single genetic locus (see, for example, U.S. patent application Ser. No. 13/427,138 or PCT application PCT/US2012/030061). After selecting a plant comprising a transgene, plants containing (at least) one transgene can be crossed to form an F1 that contains both transgenes. In progeny from these F1 (F2 or BC1), 1/500 progeny would have the two different transgenes recombined onto the same chromosome. The complex locus can then be bred as a single genetic locus with both transgene traits. This process can be repeated to stack as many traits as desired.
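The 1/500 frequency given above implies a practical screening burden, which can be estimated as follows. This is a minimal Python sketch that assumes each progeny plant is an independent trial with the same probability of carrying the desired recombinant chromosome; the function name progeny_needed and the 95% confidence level are hypothetical choices for illustration.

```python
import math


def progeny_needed(p_recombinant: float, confidence: float = 0.95) -> int:
    """Smallest n such that P(at least one recombinant among n progeny) >= confidence.

    Solves 1 - (1 - p)^n >= confidence for n, assuming independent progeny.
    """
    if not 0 < p_recombinant < 1:
        raise ValueError("p_recombinant must be between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_recombinant))


# Frequency taken from the paragraph above: 1 recombinant per 500 progeny.
n = progeny_needed(1 / 500)
print(n)
```

Under these assumptions, roughly 1,500 progeny would be screened to have a 95% chance of recovering at least one plant with both transgenes on the same chromosome.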
[0459] Chromosomal intervals that correlate with a phenotype or trait of interest can be identified. A variety of methods well known in the art are available for identifying chromosomal intervals. The boundaries of such chromosomal intervals are drawn to encompass markers that will be linked to the gene controlling the trait of interest. In other words, the chromosomal interval is drawn such that any marker that lies within that interval (including the terminal markers that define the boundaries of the interval) can be used as a marker for northern leaf blight resistance. In one embodiment, the chromosomal interval comprises at least one QTL, and furthermore, may indeed comprise more than one QTL. Close proximity of multiple QTLs in the same interval may obfuscate the correlation of a particular marker with a particular QTL, as one marker may demonstrate linkage to more than one QTL. Conversely, e.g., if two markers in close proximity show co-segregation with the desired phenotypic trait, it is sometimes unclear if each of those markers identifies the same QTL or two different QTL. The term "quantitative trait locus" or "QTL" refers to a region of DNA that is associated with the differential expression of a quantitative phenotypic trait in at least one genetic background, e.g., in at least one breeding population. The region of the QTL encompasses or is closely linked to the gene or genes that affect the trait in question. An "allele of a QTL" can comprise multiple genes or other genetic factors within a contiguous genomic region or linkage group, such as a haplotype. An allele of a QTL can denote a haplotype within a specified window wherein said window is a contiguous genomic region that can be defined, and tracked, with a set of one or more polymorphic markers. A haplotype can be defined by the unique fingerprint of alleles at each marker within the specified window.
[0460] A variety of methods are available to identify those cells having an altered genome at or near a target site without using a screenable marker phenotype. Such methods can be viewed as directly analyzing a target sequence to detect any change in the target sequence, including but not limited to PCR methods, sequencing methods, nuclease digestion, Southern blots, and any combination thereof.
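As one hedged illustration of directly analyzing a target sequence, the following Python sketch compares sequenced target-site amplicons against an unmodified reference and reports a coarse class for each. The function name classify_target_site, the plant identifiers, and the sequences are hypothetical. The classification is by net length only, so a complex change that preserves length is reported as a substitution.

```python
def classify_target_site(reference: str, observed: str) -> str:
    """Classify an observed target-site sequence relative to the reference.

    Returns "unaltered", "insertion", "deletion", or "substitution",
    based on sequence identity and net length difference only.
    """
    reference, observed = reference.upper(), observed.upper()
    if observed == reference:
        return "unaltered"
    if len(observed) > len(reference):
        return "insertion"
    if len(observed) < len(reference):
        return "deletion"
    return "substitution"


# Hypothetical reference and amplicon sequences from three plants.
reference = "ACGTACGTTGCA"
amplicons = {
    "plant_1": "ACGTACGTTGCA",   # identical to reference
    "plant_2": "ACGTACTTGCA",    # one base shorter
    "plant_3": "ACGTACGATTGCA",  # one base longer
}
calls = {plant: classify_target_site(reference, seq) for plant, seq in amplicons.items()}
altered = [plant for plant, call in calls.items() if call != "unaltered"]
print(calls)
```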
[0461] Proteins may be altered in various ways including amino acid substitutions, deletions, truncations, and insertions. Methods for such manipulations are generally known. For example, amino acid sequence variants of the protein(s) can be prepared by mutations in the DNA. Methods for mutagenesis and nucleotide sequence alterations include, for example, Kunkel, (1985) Proc. Natl. Acad. Sci. USA 82:488-92; Kunkel et al., (1987) Meth Enzymol 154:367-82; U.S. Pat. No. 4,873,192; Walker and Gaastra, eds. (1983) Techniques in Molecular Biology (MacMillan Publishing Company, New York) and the references cited therein. Guidance regarding amino acid substitutions not likely to affect biological activity of the protein is found, for example, in the model of Dayhoff et al., (1978) Atlas of Protein Sequence and Structure (Natl Biomed Res Found, Washington, D.C.). Conservative substitutions, such as exchanging one amino acid with another having similar properties, may be preferable. Conservative deletions, insertions, and amino acid substitutions are not expected to produce radical changes in the characteristics of the protein, and the effect of any substitution, deletion, insertion, or combination thereof can be evaluated by routine screening assays. Assays for double-strand-break-inducing activity are known and generally measure the overall activity and specificity of the agent on DNA substrates containing target sites.
[0462] A variety of methods are known for the introduction of nucleotide sequences and polypeptides into an organism, including, for example, transformation, sexual crossing, and the introduction of the polypeptide, DNA, or mRNA into the cell.
[0463] Methods for contacting, providing, and/or introducing a composition into various organisms are known and include but are not limited to, stable transformation methods, transient transformation methods, virus-mediated methods, and sexual breeding. Stable transformation indicates that the introduced polynucleotide integrates into the genome of the organism and is capable of being inherited by progeny thereof. Transient transformation indicates that the introduced composition is only temporarily expressed or present in the organism.
[0464] Protocols for introducing polynucleotides and polypeptides into plants may vary depending on the type of plant or plant cell targeted for transformation, such as monocot or dicot. Suitable methods of introducing polynucleotides and polypeptides into plant cells and subsequent insertion into the plant genome include microinjection (Crossway et al., (1986) Biotechniques 4:320-34 and U.S. Pat. No. 6,300,543), meristem transformation (U.S. Pat. No. 5,736,369), electroporation (Riggs et al., (1986) Proc. Natl. Acad. Sci. USA 83:5602-6), Agrobacterium-mediated transformation (U.S. Pat. Nos. 5,563,055 and 5,981,840), direct gene transfer (Paszkowski et al., (1984) EMBO J. 3:2717-22), and ballistic particle acceleration (U.S. Pat. Nos. 4,945,050; 5,879,918; 5,886,244; 5,932,782; Tomes et al., (1995) "Direct DNA Transfer into Intact Plant Cells via Microprojectile Bombardment" in Plant Cell, Tissue, and Organ Culture: Fundamental Methods, ed. Gamborg & Phillips (Springer-Verlag, Berlin); McCabe et al., (1988) Biotechnology 6:923-6; Weissinger et al., (1988) Ann Rev Genet. 22:421-77; Sanford et al., (1987) Particulate Science and Technology 5:27-37 (onion); Christou et al., (1988) Plant Physiol 87:671-4 (soybean); Finer and McMullen, (1991) In Vitro Cell Dev Biol 27P:175-82 (soybean); Singh et al., (1998) Theor Appl Genet 96:319-24 (soybean); Datta et al., (1990) Biotechnology 8:736-40 (rice); Klein et al., (1988) Proc. Natl. Acad. Sci. USA 85:4305-9 (maize); Klein et al., (1988) Biotechnology 6:559-63 (maize); U.S. Pat. Nos. 5,240,855; 5,322,783 and 5,324,646; Klein et al., (1988) Plant Physiol 91:440-4 (maize); Fromm et al., (1990) Biotechnology 8:833-9 (maize); Hooykaas-Van Slogteren et al., (1984) Nature 311:763-4; U.S. Pat. No. 5,736,369 (cereals); Bytebier et al., (1987) Proc. Natl. Acad. Sci. USA 84:5345-9 (Liliaceae); De Wet et al., (1985) in The Experimental Manipulation of Ovule Tissues, ed. Chapman et al., (Longman, N.Y.), pp. 197-209 (pollen); Kaeppler et al., (1990) Plant Cell Rep 9:415-8) and Kaeppler et al., (1992) Theor Appl Genet 84:560-6 (whisker-mediated transformation); D'Halluin et al., (1992) Plant Cell 4:1495-505 (electroporation); Li et al., (1993) Plant Cell Rep 12:250-5; Christou and Ford (1995) Annals Botany 75:407-13 (rice) and Osjoda et al., (1996) Nat Biotechnol 14:745-50 (maize via Agrobacterium tumefaciens).
[0465] Alternatively, polynucleotides may be introduced into plants by contacting plants with a virus or viral nucleic acids. Generally, such methods involve incorporating a polynucleotide within a viral DNA or RNA molecule. In some examples a polypeptide of interest may be initially synthesized as part of a viral polyprotein, which is later processed by proteolysis in vivo or in vitro to produce the desired recombinant protein. Methods for introducing polynucleotides into plants and expressing a protein encoded therein, involving viral DNA or RNA molecules, are known, see, for example, U.S. Pat. Nos. 5,889,191, 5,889,190, 5,866,785, 5,589,367 and 5,316,931. Transient transformation methods include, but are not limited to, the introduction of polypeptides, such as a double-strand break inducing agent, directly into the organism, the introduction of polynucleotides such as DNA and/or RNA polynucleotides, and the introduction of the RNA transcript, such as an mRNA encoding a double-strand break inducing agent, into the organism. Such methods include, for example, microinjection or particle bombardment. See, for example Crossway et al., (1986) Mol Gen Genet 202:179-85; Nomura et al., (1986) Plant Sci 44:53-8; Hepler et al., (1994) Proc. Natl. Acad. Sci. USA 91:2176-80; and, Hush et al., (1994) J Cell Sci 107:775-84.
[0466] The term "dicot" refers to the subclass of angiosperm plants also known as "dicotyledoneae" and includes reference to whole plants, plant organs (e.g., leaves, stems, roots, etc.), seeds, plant cells, and progeny of the same. Plant cell, as used herein, includes, without limitation, seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, and microspores.
[0467] The term "crossed" or "cross" or "crossing" in the context of this disclosure means the fusion of gametes via pollination to produce progeny (i.e., cells, seeds, or plants). The term encompasses both sexual crosses (the pollination of one plant by another) and selfing (self-pollination, i.e., when the pollen and ovule (or microspores and megaspores) are from the same plant or genetically identical plants).
[0468] The term "introgression" refers to the transmission of a desired allele of a genetic locus from one genetic background to another. For example, introgression of a desired allele at a specified locus can be transmitted to at least one progeny plant via a sexual cross between two parent plants, where at least one of the parent plants has the desired allele within its genome. Alternatively, for example, transmission of an allele can occur by recombination between two donor genomes, e.g., in a fused protoplast, where at least one of the donor protoplasts has the desired allele in its genome. The desired allele can be, e.g., a transgene, a modified (mutated or edited) native allele, or a selected allele of a marker or QTL.
[0469] Standard DNA isolation, purification, molecular cloning, vector construction, and verification/characterization methods are well established, see, for example Sambrook et al., (1989) Molecular Cloning: A Laboratory Manual, (Cold Spring Harbor Laboratory Press, NY). Vectors and constructs include circular plasmids, and linear polynucleotides, comprising a polynucleotide of interest and optionally other components including linkers, adapters, regulatory regions, introns, restriction sites, enhancers, insulators, selectable markers, nucleotide sequences of interest, promoters, and/or other sites that aid in vector construction or analysis. In some examples a recognition site and/or target site can be contained within an intron, coding sequence, 5' UTRs, 3' UTRs, and/or regulatory regions.
[0470] The present disclosure further provides expression constructs for expressing in a plant, plant cell, or plant part a guide RNA/Cas system that is capable of binding to and creating a double strand break in a target site. In one embodiment, the expression constructs of the disclosure comprise a promoter operably linked to a nucleotide sequence encoding a Cas gene and a promoter operably linked to a guide RNA of the present disclosure. The promoter is capable of driving expression of an operably linked nucleotide sequence in a plant cell.
[0471] A promoter is a region of DNA involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. A plant promoter is a promoter capable of initiating transcription in a plant cell, for a review of plant promoters, see, Potenza et al., (2004) In Vitro Cell Dev Biol 40:1-22. Constitutive promoters include, for example, the core promoter of the Rsyn7 promoter and other constitutive promoters disclosed in WO99/43838 and U.S. Pat. No. 6,072,050; the core CaMV 35S promoter (Odell et al., (1985) Nature 313:810-2); rice actin (McElroy et al., (1990) Plant Cell 2:163-71); ubiquitin (Christensen et al., (1989) Plant Mol Biol 12:619-32; Christensen et al., (1992) Plant Mol Biol 18:675-89); pEMU (Last et al., (1991) Theor Appl Genet 81:581-8); MAS (Velten et al., (1984) EMBO J 3:2723-30); ALS promoter (U.S. Pat. No. 5,659,026), and the like. Other constitutive promoters are described in, for example, U.S. Pat. Nos. 5,608,149; 5,608,144; 5,604,121; 5,569,597; 5,466,785; 5,399,680; 5,268,463; 5,608,142 and 6,177,611. In some examples an inducible promoter may be used. Pathogen-inducible promoters induced following infection by a pathogen include, but are not limited to those regulating expression of PR proteins, SAR proteins, beta-1,3-glucanase, chitinase, etc.
[0472] Chemical-regulated promoters can be used to modulate the expression of a gene in a plant through the application of an exogenous chemical regulator. The promoter may be a chemical-inducible promoter, where application of the chemical induces gene expression, or a chemical-repressible promoter, where application of the chemical represses gene expression. Chemical-inducible promoters include, but are not limited to, the maize In2-2 promoter, activated by benzene sulfonamide herbicide safeners (De Veylder et al., (1997) Plant Cell Physiol 38:568-77), the maize GST promoter (GST-II-27, WO93/01294), activated by hydrophobic electrophilic compounds used as pre-emergent herbicides, and the tobacco PR-1a promoter (Ono et al., (2004) Biosci Biotechnol Biochem 68:803-7) activated by salicylic acid. Other chemical-regulated promoters include steroid-responsive promoters (see, for example, the glucocorticoid-inducible promoter (Schena et al., (1991) Proc. Natl. Acad. Sci. USA 88:10421-5; McNellis et al., (1998) Plant J 14:247-257)) and tetracycline-inducible and tetracycline-repressible promoters (Gatz et al., (1991) Mol Gen Genet 227:229-37; U.S. Pat. Nos. 5,814,618 and 5,789,156).
[0473] Tissue-preferred promoters can be utilized to target enhanced expression within a particular plant tissue. Tissue-preferred promoters include, for example, Kawamata et al., (1997) Plant Cell Physiol 38:792-803; Hansen et al., (1997) Mol Gen Genet 254:337-43; Russell et al., (1997) Transgenic Res 6:157-68; Rinehart et al., (1996) Plant Physiol 112:1331-41; Van Camp et al., (1996) Plant Physiol 112:525-35; Canevascini et al., (1996) Plant Physiol 112:513-524; Lam, (1994) Results Probl Cell Differ 20:181-96; and Guevara-Garcia et al., (1993) Plant J 4:495-505. Leaf-preferred promoters include, for example, Yamamoto et al., (1997) Plant J 12:255-65; Kwon et al., (1994) Plant Physiol 105:357-67; Yamamoto et al., (1994) Plant Cell Physiol 35:773-8; Gotor et al., (1993) Plant J 3:509-18; Orozco et al., (1993) Plant Mol Biol 23:1129-38; Matsuoka et al., (1993) Proc. Natl. Acad. Sci. USA 90:9586-90; Simpson et al., (1985) EMBO J. 4:2723-9; and Timko et al., (1985) Nature 318:57-8. Root-preferred promoters include, for example, Hire et al., (1992) Plant Mol Biol 20:207-18 (soybean root-specific glutamine synthase gene); Miao et al., (1991) Plant Cell 3:11-22 (cytosolic glutamine synthase (GS)); Keller and Baumgartner, (1991) Plant Cell 3:1051-61 (root-specific control element in the GRP 1.8 gene of French bean); Sanger et al., (1990) Plant Mol Biol 14:433-43 (root-specific promoter of A. tumefaciens mannopine synthase (MAS)); Bogusz et al., (1990) Plant Cell 2:633-41 (root-specific promoters isolated from Parasponia andersonii and Trema tomentosa); Leach and Aoyagi, (1991) Plant Sci 79:69-76 (A. rhizogenes rolC and rolD root-inducing genes); Teeri et al., (1989) EMBO J 8:343-50 (Agrobacterium wound-induced TR1' and TR2' genes); VfENOD-GRP3 gene promoter (Kuster et al., (1995) Plant Mol Biol 29:759-72); and rolB promoter (Capana et al., (1994) Plant Mol Biol 25:681-91); phaseolin gene (Murai et al., (1983) Science 23:476-82; Sengopta-Gopalen et al., (1988) Proc. Natl. Acad. Sci. USA 82:3320-4). See also, U.S. Pat. Nos. 5,837,876; 5,750,386; 5,633,363; 5,459,252; 5,401,836; 5,110,732 and 5,023,179.
[0474] Seed-preferred promoters include both seed-specific promoters active during seed development, as well as seed-germinating promoters active during seed germination. See, Thompson et al., (1989) BioEssays 10:108. Seed-preferred promoters include, but are not limited to, Cim1 (cytokinin-induced message); cZ19B1 (maize 19 kDa zein); and milps (myo-inositol-1-phosphate synthase); (WO00/11177; and U.S. Pat. No. 6,225,529). For dicots, seed-preferred promoters include, but are not limited to, bean β-phaseolin, napin, β-conglycinin, soybean lectin, cruciferin, and the like. For monocots, seed-preferred promoters include, but are not limited to, maize 15 kDa zein, 22 kDa zein, 27 kDa gamma zein, waxy, shrunken 1, shrunken 2, globulin 1, oleosin, and nuc1. See also, WO00/12733, where seed-preferred promoters from END1 and END2 genes are disclosed.
[0475] A phenotypic marker is a screenable or selectable marker that includes visual markers and selectable markers whether it is a positive or negative selectable marker. Any phenotypic marker can be used. Specifically, a selectable or screenable marker comprises a DNA segment that allows one to identify, or select for or against a molecule or a cell that contains it, often under particular conditions. These markers can encode an activity, such as, but not limited to, production of RNA, peptide, or protein, or can provide a binding site for RNA, peptides, proteins, inorganic and organic compounds or compositions and the like.
[0476] Examples of selectable markers include, but are not limited to, DNA segments that comprise restriction enzyme sites; DNA segments that encode products which provide resistance against otherwise toxic compounds (e.g., antibiotics such as spectinomycin, ampicillin, kanamycin, tetracycline, Basta, neomycin phosphotransferase II (NEO) and hygromycin phosphotransferase (HPT)); DNA segments that encode products which are otherwise lacking in the recipient cell (e.g., tRNA genes, auxotrophic markers); DNA segments that encode products which can be readily identified (e.g., phenotypic markers such as β-galactosidase, GUS; fluorescent proteins such as green fluorescent protein (GFP), cyan (CFP), yellow (YFP), red (RFP), and cell surface proteins); the generation of new primer sites for PCR (e.g., the juxtaposition of two DNA sequences not previously juxtaposed); the inclusion of DNA sequences not acted upon or acted upon by a restriction endonuclease or other DNA modifying enzyme, chemical, etc.; and the inclusion of DNA sequences required for a specific modification (e.g., methylation) that allows its identification.
[0477] Additional selectable markers include genes that confer resistance to herbicidal compounds, such as glufosinate ammonium, bromoxynil, imidazolinones, and 2,4-dichlorophenoxyacetate (2,4-D). See for example, Yarranton, (1992) Curr Opin Biotech 3:506-11; Christopherson et al., (1992) Proc. Natl. Acad. Sci. USA 89:6314-8; Yao et al., (1992) Cell 71:63-72; Reznikoff, (1992) Mol Microbiol 6:2419-22; Hu et al., (1987) Cell 48:555-66; Brown et al., (1987) Cell 49:603-12; Figge et al., (1988) Cell 52:713-22; Deuschle et al., (1989) Proc. Natl. Acad. Sci. USA 86:5400-4; Fuerst et al., (1989) Proc. Natl. Acad. Sci. USA 86:2549-53; Deuschle et al., (1990) Science 248:480-3; Gossen, (1993) Ph.D. Thesis, University of Heidelberg; Reines et al., (1993) Proc. Natl. Acad. Sci. USA 90:1917-21; Labow et al., (1990) Mol Cell Biol 10:3343-56; Zambretti et al., (1992) Proc. Natl. Acad. Sci. USA 89:3952-6; Baim et al., (1991) Proc. Natl. Acad. Sci. USA 88:5072-6; Wyborski et al., (1991) Nucleic Acids Res 19:4647-53; Hillen and Wissman, (1989) Topics Mol Struc Biol 10:143-62; Degenkolb et al., (1991) Antimicrob Agents Chemother 35:1591-5; Kleinschnidt et al., (1988) Biochemistry 27:1094-104; Bonin, (1993) Ph.D. Thesis, University of Heidelberg; Gossen et al., (1992) Proc. Natl. Acad. Sci. USA 89:5547-51; Oliva et al., (1992) Antimicrob Agents Chemother 36:913-9; Hlavka et al., (1985) Handbook of Experimental Pharmacology, Vol. 78 (Springer-Verlag, Berlin); Gill et al., (1988) Nature 334:721-4.
[0478] The cells having the introduced sequence may be grown or regenerated into plants using conventional conditions, see for example, McCormick et al., (1986) Plant Cell Rep 5:81-4. These plants may then be grown, and either pollinated with the same transformed strain or with a different transformed or untransformed strain, and the resulting progeny having the desired characteristic and/or comprising the introduced polynucleotide or polypeptide identified. Two or more generations may be grown to ensure that the polynucleotide is stably maintained and inherited, and seeds harvested.
[0479] Any plant can be used, including monocot and dicot plants. Examples of monocot plants that can be used include, but are not limited to, corn (Zea mays), rice (Oryza sativa), rye (Secale cereale), sorghum (Sorghum bicolor, Sorghum vulgare), millet (e.g., pearl millet (Pennisetum glaucum), proso millet (Panicum miliaceum), foxtail millet (Setaria italica), finger millet (Eleusine coracana)), wheat (Triticum aestivum), sugarcane (Saccharum spp.), oats (Avena), barley (Hordeum), switchgrass (Panicum virgatum), pineapple (Ananas comosus), banana (Musa spp.), palm, ornamentals, turfgrasses, and other grasses. Examples of dicot plants that can be used include, but are not limited to, soybean (Glycine max), canola (Brassica napus and B. campestris), alfalfa (Medicago sativa), tobacco (Nicotiana tabacum), Arabidopsis (Arabidopsis thaliana), sunflower (Helianthus annuus), cotton (Gossypium arboreum), and peanut (Arachis hypogaea), tomato (Solanum lycopersicum), potato (Solanum tuberosum) etc.
[0480] The transgenes, recombinant DNA molecules, DNA sequences of interest, and polynucleotides of interest can comprise one or more genes of interest. Such genes of interest can encode, for example, a protein that provides agronomic advantage to the plant.
[0481] Marker Assisted Selection and Breeding of Plants
[0482] A primary motivation for development of molecular markers in crop species is the potential for increased efficiency in plant breeding through marker assisted selection (MAS). Genetic marker alleles, or alternatively, quantitative trait loci (QTL) alleles, are used to identify plants that contain a desired genotype at one or more loci, and that are expected to transfer the desired genotype, along with a desired phenotype, to their progeny. Genetic marker alleles (or QTL alleles) can be used to identify plants that contain a desired genotype at one locus, or at several unlinked or linked loci (e.g., a haplotype), and that would be expected to transfer the desired genotype, along with a desired phenotype, to their progeny. It will be appreciated that for the purposes of MAS, the term marker can encompass both marker and QTL loci.
[0483] After a desired phenotype and a polymorphic chromosomal locus, e.g., a marker locus or QTL, are determined to segregate together, it is possible to use those polymorphic loci to select for alleles corresponding to the desired phenotype, a process called marker-assisted selection (MAS). In brief, a nucleic acid corresponding to the marker nucleic acid is detected in a biological sample from a plant to be selected. This detection can take the form of hybridization of a probe nucleic acid to a marker, e.g., using allele-specific hybridization, Southern blot analysis, northern blot analysis, in situ hybridization, hybridization of primers followed by PCR amplification of a region of the marker, or the like. A variety of procedures for detecting markers are well known in the art. After the presence (or absence) of a particular marker in the biological sample is verified, the plant is selected, i.e., used to make progeny plants by selective breeding.
[0484] Plant breeders need to combine traits of interest with genes for high yield and other desirable traits to develop improved plant varieties. Screening for large numbers of samples can be expensive, time consuming, and unreliable. Use of markers and/or genetically-linked nucleic acids is an effective method for selecting plants having the desired traits in breeding programs. For example, one advantage of marker-assisted selection over field evaluations is that MAS can be done at any time of year regardless of the growing season. Moreover, environmental effects are irrelevant to marker-assisted selection.
[0485] When a population is segregating for multiple loci affecting one or multiple traits, the efficiency of MAS compared to phenotypic screening becomes even greater because all the loci can be processed in the lab together from a single sample of DNA.
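The selection step of MAS across several loci can be sketched as a simple filter over genotype calls obtained from a single DNA sample per plant. The following Python sketch is illustrative only; the function name select_plants, the marker names M1 and M2, the plant identifiers, and the allele calls are all hypothetical.

```python
def select_plants(genotypes, desired):
    """Return the plants carrying the desired allele at every marker locus.

    genotypes: {plant: {marker: (allele_1, allele_2)}}
    desired:   {marker: desired_allele}
    A plant with a missing call at any desired marker is not selected.
    """
    selected = []
    for plant, calls in genotypes.items():
        if all(desired[marker] in calls.get(marker, ()) for marker in desired):
            selected.append(plant)
    return selected


# Hypothetical genotype calls at two marker loci for three plants.
genotypes = {
    "P1": {"M1": ("A", "A"), "M2": ("T", "C")},
    "P2": {"M1": ("G", "A"), "M2": ("C", "C")},
    "P3": {"M1": ("G", "G"), "M2": ("T", "T")},
}
desired = {"M1": "A", "M2": "T"}
selected = select_plants(genotypes, desired)
print(selected)
```

Because all loci are evaluated in one pass over the same data, adding further markers changes only the contents of the desired mapping in this sketch.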
[0486] The DNA repair mechanisms of cells are the basis to introduce extraneous DNA or induce mutations on endogenous genes. DNA homologous recombination is a specialized way of DNA repair that the cells repair DNA damages using a homologous sequence. In plants, DNA homologous recombination happens at frequencies too low to be routinely used in gene targeting or gene editing until it has been found that the process can be stimulated by DNA double-strand breaks (Bibikova et al., (2001) Mol. Cell. Biol. 21:289-297; Puchta and Baltimore, (2003) Science 300:763; Wright et al., (2005) Plant J. 44:693-705).
[0487] The meaning of abbreviations is as follows: "sec" means second(s), "min" means minute(s), "h" means hour(s), "d" means day(s), "μL" means microliter(s), "mL" means milliliter(s), "L" means liter(s), "μM" means micromolar, "mM" means millimolar, "M" means molar, "mmol" means millimole(s), "μmole" means micromole(s), "g" means gram(s), "μg" means microgram(s), "ng" means nanogram(s), "U" means unit(s), "bp" means base pair(s) and "kb" means kilobase(s).
[0488] Also, as described herein, for each example or embodiment that cites a guide RNA, a similar guide polynucleotide can be designed wherein the guide polynucleotide does not solely comprise ribonucleic acids but wherein the guide polynucleotide comprises a combination of RNA-DNA molecules or solely comprises DNA molecules.
Non-limiting examples of compositions and methods disclosed herein are as follows:
[0489] 1. A method for selecting a plant comprising an altered target site in its plant genome, the method comprising:
[0490] a) obtaining a first plant comprising at least one Cas endonuclease capable of introducing a double strand break at a target site in the plant genome;
[0491] b) obtaining a second plant comprising a guide RNA that is capable of forming a complex with the Cas endonuclease of (a);
[0492] c) crossing the first plant of (a) with the second plant of (b);
[0493] d) evaluating the progeny of (c) for an alteration in the target site; and,
[0494] e) selecting a progeny plant that possesses the desired alteration of said target site.
[0495] 2. A method for selecting a plant comprising an altered target site in its plant genome, the method comprising selecting at least one progeny plant that comprises an alteration at a target site in its plant genome, wherein said progeny plant was obtained by crossing a first plant comprising at least one Cas endonuclease with a second plant comprising a guide RNA, wherein said Cas endonuclease is capable of introducing a double strand break at said target site.
[0496] 3. A method for selecting a plant comprising an altered target site in its plant genome, the method comprising:
[0497] a) obtaining a first plant comprising at least one Cas endonuclease capable of introducing a double strand break at a target site in the plant genome;
[0498] b) obtaining a second plant comprising a guide RNA and a donor DNA, wherein said guide RNA is capable of forming a complex with the Cas endonuclease of (a), wherein said donor DNA comprises a polynucleotide of interest;
[0499] c) crossing the first plant of (a) with the second plant of (b);
[0500] d) evaluating the progeny of (c) for an alteration in the target site; and,
[0501] e) selecting a progeny plant that comprises the polynucleotide of interest inserted at said target site.
[0502] 4. A method for selecting a plant comprising an altered target site in its plant genome, the method comprising selecting at least one progeny plant that comprises an alteration at a target site in its plant genome, wherein said progeny plant was obtained by crossing a first plant expressing at least one Cas endonuclease to a second plant comprising a guide RNA and a donor DNA, wherein said Cas endonuclease is capable of introducing a double strand break at said target site, wherein said donor DNA comprises a polynucleotide of interest.
[0503] 5. A method for modifying a target site in the genome of a plant cell, the method comprising introducing a guide RNA into a plant cell having a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site.
[0504] 6. A method for modifying a target site in the genome of a plant cell, the method comprising introducing a guide RNA and a Cas endonuclease into said plant cell, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site.
[0505] 7. A method for modifying a target site in the genome of a plant cell, the method comprising introducing a guide RNA and a donor DNA into a plant cell having a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site, wherein said donor DNA comprises a polynucleotide of interest.
[0506] 8. A method for modifying a target site in the genome of a plant cell, the method comprising:
[0507] a) introducing into a plant cell a guide RNA and a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site; and,
[0508] b) identifying at least one plant cell that has a modification at said target site, wherein the modification includes at least one deletion or substitution of one or more nucleotides in said target site.
[0509] 9. A method for modifying a target DNA sequence in the genome of a plant cell, the method comprising:
[0510] a) introducing into a plant cell a first recombinant DNA construct capable of expressing a guide RNA and a second recombinant DNA construct capable of expressing a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site; and,
[0511] b) identifying at least one plant cell that has a modification at said target site, wherein the modification includes at least one deletion or substitution of one or more nucleotides in said target site.
[0512] 10. A method for introducing a polynucleotide of interest into a target site in the genome of a plant cell, the method comprising:
[0513] a) introducing into a plant cell a first recombinant DNA construct capable of expressing a guide RNA and a second recombinant DNA construct capable of expressing a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site;
[0514] b) contacting the plant cell of (a) with a donor DNA comprising a polynucleotide of interest; and,
[0515] c) identifying at least one plant cell from (b) comprising in its genome the polynucleotide of interest integrated at said target site.
[0516] 10-B. A method for introducing a polynucleotide of interest into a target site in the genome of a plant cell, the method comprising:
[0517] a) introducing into a plant cell a guide RNA and a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site;
[0518] b) contacting the plant cell of (a) with a donor DNA comprising a polynucleotide of interest; and,
[0519] c) identifying at least one plant cell from (b) comprising in its genome the polynucleotide of interest integrated at said target site.
[0520] 11. The method of any one of embodiments 5-8, wherein the guide RNA is introduced directly by particle bombardment.
[0521] 12. The method of any one of embodiments 5-9, wherein the guide RNA is introduced via particle bombardment or Agrobacterium transformation of a recombinant DNA construct comprising the corresponding guide DNA operably linked to a plant U6 polymerase III promoter.
[0522] 13. The method of any one of embodiments 1-10, wherein the Cas endonuclease gene is a plant optimized Cas9 endonuclease.
[0523] 14. The method of any one of embodiments 1-10, wherein the Cas endonuclease gene is operably linked to a SV40 nuclear targeting signal upstream of the Cas codon region and a VirD2 nuclear localization signal downstream of the Cas codon region.
[0524] 15. The method of any one of embodiments 1-14, wherein the plant is a monocot or a dicot.
[0525] 16. The method of embodiment 15, wherein the monocot is selected from the group consisting of maize, rice, sorghum, rye, barley, wheat, millet, oats, sugarcane, turfgrass, or switchgrass.
[0526] 17. The method of embodiment 15, wherein the dicot is selected from the group consisting of soybean, canola, alfalfa, sunflower, cotton, tobacco, peanut, potato, Arabidopsis, or safflower.
[0527] 18. The method of any one of embodiments 1-17, wherein the target site is located in the gene sequence of an acetolactate synthase (ALS) gene, an enolpyruvylshikimate phosphate synthase (EPSPS) gene, or a male fertility gene (MS45, MS26 or MSCA1).
[0528] 19. A plant or seed produced by any one of embodiments 1-17.
[0529] 20. A plant comprising a recombinant DNA construct, said recombinant DNA construct comprising a promoter operably linked to a nucleotide sequence encoding a plant optimized Cas9 endonuclease, wherein said plant optimized Cas9 endonuclease is capable of binding to and creating a double strand break in a genomic target sequence of said plant genome.
[0530] 21. A plant comprising a recombinant DNA construct and a guide RNA, wherein said recombinant DNA construct comprises a promoter operably linked to a nucleotide sequence encoding a plant optimized Cas9 endonuclease, wherein said plant optimized Cas9 endonuclease and guide RNA are capable of forming a complex and creating a double strand break in a genomic target sequence of said plant genome.
[0531] 22. A recombinant DNA construct comprising a promoter operably linked to a nucleotide sequence encoding a plant optimized Cas9 endonuclease, wherein said plant optimized Cas9 endonuclease is capable of binding to and creating a double strand break in a genomic target sequence of a plant genome.
[0532] 23. A recombinant DNA construct comprising a promoter operably linked to a nucleotide sequence expressing a guide RNA, wherein said guide RNA is capable of forming a complex with a plant optimized Cas9 endonuclease, and wherein said complex is capable of binding to and creating a double strand break in a genomic target sequence of a plant genome.
[0533] 24. A method for selecting a male sterile plant, the method comprising selecting at least one progeny plant that comprises an alteration at a genomic target site located in a male fertility gene locus, wherein said progeny plant is obtained by crossing a first plant expressing a Cas9 endonuclease to a second plant comprising a guide RNA, wherein said Cas endonuclease is capable of introducing a double strand break at said genomic target site.
[0534] 25. A method for producing a male sterile plant, the method comprising:
[0535] a) obtaining a first plant comprising at least one Cas endonuclease capable of introducing a double strand break at a genomic target site located in a male fertility gene locus in the plant genome;
[0536] b) obtaining a second plant comprising a guide RNA that is capable of forming a complex with the Cas endonuclease of (a);
[0537] c) crossing the first plant of (a) with the second plant of (b);
[0538] d) evaluating the progeny of (c) for an alteration in the target site; and,
[0539] e) selecting a progeny plant that is male sterile.
[0540] 26. The method of any of embodiments 24-25, wherein the male fertility gene is selected from the list comprising MS26, MS45, M.
[0541] 27. The method of any one of embodiments 24-26, wherein the plant is a monocot or a dicot.
[0542] 28. The method of embodiment 27, wherein the monocot is selected from the group consisting of maize, rice, sorghum, rye, barley, wheat, millet, oats, sugarcane, turfgrass, or switchgrass.
[0543] 29. A method for editing a nucleotide sequence in the genome of a cell, the method comprising introducing a guide RNA, a polynucleotide modification template and at least one Cas endonuclease into a cell, wherein the Cas endonuclease introduces a double-strand break at a target site in the genome of said cell, wherein said polynucleotide modification template comprises at least one nucleotide modification of said nucleotide sequence.
[0544] 30. The method of embodiment 29, wherein the cell is a plant cell.
[0545] 31. The method of embodiment 29, wherein the nucleotide sequence is a promoter, a regulatory sequence or a gene of interest.
[0546] 32. The method of embodiment 31 wherein the gene of interest is an EPSPS gene.
[0547] 33. The method of embodiment 30 wherein the plant cell is a monocot or dicot plant cell.
[0548] 34. A method for producing an epsps mutant plant, the method comprising:
[0549] a) providing a guide RNA, a polynucleotide modification template and at least one Cas endonuclease to a plant cell, wherein the Cas endonuclease introduces a double strand break at a target site within an epsps genomic sequence in the plant genome, wherein said polynucleotide modification template comprises at least one nucleotide modification of said epsps genomic sequence;
[0550] b) obtaining a plant from the plant cell of (a);
[0551] c) evaluating the plant of (b) for the presence of said at least one nucleotide modification; and,
[0552] d) selecting a progeny plant that shows tolerance to glyphosate.
[0553] 35. A method for producing an epsps mutant plant, the method comprising:
[0554] a) providing a guide RNA, a polynucleotide modification template and at least one Cas endonuclease to a plant cell, wherein the Cas endonuclease introduces a double strand break at a target site within an epsps genomic sequence in the plant genome, wherein said polynucleotide modification template comprises at least one nucleotide modification of said epsps genomic sequence;
[0555] b) obtaining a plant from the plant cell of (a);
[0556] c) evaluating the plant of (b) for the presence of said at least one nucleotide modification; and,
[0557] d) screening a progeny plant of (c) that is void of said guide RNA and Cas endonuclease.
[0558] 36. The method of embodiment 35, further comprising selecting a plant that shows resistance to glyphosate.
[0559] 37. A plant, plant cell or seed produced by the method of any one of embodiments 29-36.
[0560] 38. The method of any one of embodiments 29-36 wherein the Cas endonuclease is a Cas9 endonuclease.
[0561] 39. The method of embodiment 38 wherein the Cas9 endonuclease is expressed by SEQ ID NO:5.
[0562] 40. The method of embodiment 38 wherein the Cas9 endonuclease is encoded by any one of SEQ ID NOs: 1, 124, 212, 213, 214, 215, 216, 193 or nucleotides 2037-6329 of SEQ ID NO:5, or any functional fragment or variant thereof.
[0563] 41. The plant or plant cell of embodiment 37, wherein said plant cell shows resistance to glyphosate.
[0564] 42. A plant cell comprising a modified nucleotide sequence, wherein the modified nucleotide sequence was produced by providing a guide RNA, a polynucleotide modification template and at least one Cas endonuclease to a plant cell, wherein the Cas endonuclease is capable of introducing a double-strand break at a target site in the plant genome wherein said polynucleotide modification template comprises at least one nucleotide modification of said nucleotide sequence.
[0565] 43. The method of any one of embodiments 29, 34 or 35, wherein the at least one nucleotide modification is not a modification at said target site.
[0566] 44. A method for producing a male sterile plant, the method comprising:
[0567] a) introducing into a plant cell a guide RNA and a Cas endonuclease, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site located in or near a male fertility gene;
[0568] b) identifying at least one plant cell that has a modification in said male fertility gene, wherein the modification includes at least one deletion or substitution of one or more nucleotides in said male fertility gene; and,
[0569] c) obtaining a plant from the plant cell of b).
[0570] 45. The method of embodiment 44, further comprising selecting a progeny plant from the plant of c) wherein said progeny plant is male sterile.
[0571] 46. The method of embodiment 44, wherein the male fertility gene is selected from the group comprising MS26, MS45 and MSCA1.
[0572] 47. A plant comprising at least one altered target site, wherein the at least one altered target site originated from a corresponding target site that was recognized and cleaved by a guide RNA/Cas endonuclease system, and wherein the at least one altered target site is in a genomic region of interest that extends from the target sequence set forth in SEQ ID NO: 229 to the target site set forth in SEQ ID NO: 235.
[0573] 48. The plant of embodiment 47, wherein the at least one altered target site has an alteration selected from the group consisting of (i) replacement of at least one nucleotide, (ii) a deletion of at least one nucleotide, (iii) an insertion of at least one nucleotide, and (iv) any combination of (i)-(iii).
[0574] 49. The plant of embodiment 47, wherein the at least one altered target site comprises a recombinant DNA molecule.
[0575] 50. The plant of embodiment 47, wherein the plant comprises at least two altered target sites, wherein each of the altered target sites originated from a corresponding target site that was recognized and cleaved by a guide RNA/Cas endonuclease system, wherein the corresponding target site is selected from the group consisting of SEQ ID NOs: 229, 230, 231, 232, 233, 234, 235 and 236.
[0576] 51. A recombinant DNA construct comprising a nucleotide sequence set forth in SEQ ID NO: 120 or SEQ ID NO:295, or a functional fragment thereof, operably linked to at least one heterologous sequence, wherein said nucleotide sequence is a promoter.
[0577] 52. A plant stably transformed with a recombinant DNA construct comprising a soybean promoter and a heterologous nucleic acid fragment operably linked to said soybean promoter, wherein said promoter is capable of controlling expression of said heterologous nucleic acid fragment in a plant cell, and further wherein said promoter comprises any of the sequences set forth in SEQ ID NO: 120 or SEQ ID NO: 295.
[0578] 53. A method for editing a nucleotide sequence in the genome of a cell, the method comprising introducing a guide polynucleotide, a Cas endonuclease, and optionally a polynucleotide modification template, into a cell, wherein said guide polynucleotide and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site in the genome of said cell, wherein said polynucleotide modification template comprises at least one nucleotide modification of said nucleotide sequence.
[0579] 54. The method of embodiment 53, wherein the nucleotide sequence in the genome of a cell is selected from the group consisting of a promoter sequence, a terminator sequence, a regulatory element sequence, a splice site, a coding sequence, a polyubiquitination site, an intron site and an intron enhancing motif.
[0580] 55. A method for editing a promoter sequence in the genome of a cell, the method comprising introducing a guide polynucleotide, a polynucleotide modification template and at least one Cas endonuclease into a cell, wherein said guide polynucleotide and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site in the genome of said cell, wherein said polynucleotide modification template comprises at least one nucleotide modification of said promoter sequence.
[0581] 56. A method for replacing a first promoter sequence in a cell, the method comprising introducing a guide RNA, a polynucleotide modification template, and a Cas endonuclease into said cell, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site in the genome of said cell, wherein said polynucleotide modification template comprises a second promoter or second promoter fragment that is different from said first promoter sequence.
[0582] 57. The method of embodiment 56, wherein the replacement of the first promoter sequence results in any one of the following, or any one combination of the following: an increased promoter activity, an increased promoter tissue specificity, a decreased promoter activity, a decreased promoter tissue specificity, a new promoter activity, an inducible promoter activity, an extended window of gene expression, or a modification of the timing or developmental progress of gene expression in the same cell layer or other cell layer.
[0583] 58. The method of embodiment 56, wherein the first promoter sequence is selected from the group consisting of a Zea mays ARGOS 8 promoter, a soybean EPSPS1 promoter, a maize EPSPS promoter, and a maize NPK1 promoter, and wherein the second promoter sequence is selected from the group consisting of a Zea mays GOS2 PRO:GOS2-intron promoter, a soybean ubiquitin promoter, a stress inducible maize RAB17 promoter, a Zea mays-PEPC1 promoter, a Zea mays Ubiquitin promoter, a Zea mays-Rootmet2 promoter, a rice actin promoter, a sorghum RCC3 promoter, a Zea mays-GOS2 promoter, a Zea mays-ACO2 promoter and a Zea mays oleosin promoter.
[0584] 59. A method for deleting a promoter sequence in the genome of a cell, the method comprising introducing a guide polynucleotide and a Cas endonuclease into a cell, wherein said guide polynucleotide and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break in at least one target site located inside or outside said promoter sequence.
[0585] 60. A method for inserting a promoter or a promoter element in the genome of a cell, the method comprising introducing a guide polynucleotide, a polynucleotide modification template comprising the promoter or the promoter element, and a Cas endonuclease into a cell, wherein said guide polynucleotide and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site in the genome of said cell.
[0586] 61. The method of embodiment 60, wherein the insertion of the promoter or promoter element results in any one of the following, or any one combination of the following: an increased promoter activity, an increased promoter tissue specificity, a decreased promoter activity, a decreased promoter tissue specificity, a new promoter activity, an inducible promoter activity, an extended window of gene expression, a modification of the timing or developmental progress of gene expression, a mutation of DNA binding elements, or an addition of DNA binding elements.
[0587] 62. A method for editing a Zinc Finger transcription factor, the method comprising introducing a guide polynucleotide, a Cas endonuclease, and optionally a polynucleotide modification template, into a cell, wherein the Cas endonuclease introduces a double-strand break at a target site in the genome of said cell, wherein said polynucleotide modification template comprises at least one nucleotide modification or deletion of said Zinc Finger transcription factor, wherein the deletion or modification of said Zinc Finger transcription factor results in the creation of a dominant negative Zinc Finger transcription factor mutant.
[0588] 63. A method for creating a fusion protein, the method comprising introducing a guide polynucleotide, a Cas endonuclease, and a polynucleotide modification template, into a cell, wherein the Cas endonuclease introduces a double-strand break at a target site located inside or outside a first coding sequence in the genome of said cell, wherein said polynucleotide modification template comprises a second coding sequence encoding a protein of interest, wherein the protein fusion results in any one of the following, or any one combination of the following: a targeting of the fusion protein to the chloroplast of said cell, an increased protein activity, an increased protein functionality, a decreased protein activity, a decreased protein functionality, a new protein functionality, a modified protein functionality, a new protein localization, a new timing of protein expression, a modified protein expression pattern, a chimeric protein, or a modified protein with dominant phenotype functionality.
[0589] 64. A method for producing in a plant a complex trait locus comprising at least two altered target sequences in a genomic region of interest, said method comprising:
[0590] (a) selecting a genomic region in a plant, wherein the genomic region comprises a first target sequence and a second target sequence;
[0591] (b) contacting at least one plant cell with at least a first guide polynucleotide, a second guide polynucleotide, and optionally at least one Donor DNA, and a Cas endonuclease, wherein the first and second guide polynucleotides and the Cas endonuclease can form a complex that enables the Cas endonuclease to introduce a double strand break in at least a first and a second target sequence;
[0592] (c) identifying a cell from (b) comprising a first alteration at the first target sequence and a second alteration at the second target sequence; and,
[0593] (d) recovering a first fertile plant from the cell of (c), said fertile plant comprising the first alteration and the second alteration, wherein the first alteration and the second alteration are physically linked.
[0594] 65. A method for producing in a plant a complex trait locus comprising at least two altered target sequences in a genomic region of interest, said method comprising:
[0595] (a) selecting a genomic region in a plant, wherein the genomic region comprises a first target sequence and a second target sequence;
[0596] (b) contacting at least one plant cell with a first guide polynucleotide, a Cas endonuclease, and optionally a first Donor DNA, wherein the first guide polynucleotide and the Cas endonuclease can form a complex that enables the Cas endonuclease to introduce a double strand break at a first target sequence;
[0597] (c) identifying a cell from (b) comprising a first alteration at the first target sequence;
[0598] (d) recovering a first fertile plant from the cell of (c), said first fertile plant comprising the first alteration;
[0599] (e) contacting at least one plant cell with a second guide polynucleotide, a Cas endonuclease, and optionally a second Donor DNA;
[0600] (f) identifying a cell from (e) comprising a second alteration at the second target sequence;
[0601] (g) recovering a second fertile plant from the cell of (f), said second fertile plant comprising the second alteration; and,
[0602] (h) obtaining a fertile progeny plant from the second fertile plant of (g), said fertile progeny plant comprising the first alteration and the second alteration, wherein the first alteration and the second alteration are physically linked.
[0603] 66. A method for editing a nucleotide sequence in the genome of a cell, the method comprising introducing at least one guide RNA, at least one polynucleotide modification template and at least one Cas endonuclease into a cell, wherein the Cas endonuclease introduces a double-strand break at a target site in the genome of said cell, wherein said polynucleotide modification template comprises at least one nucleotide modification of said nucleotide sequence.
[0604] 67. The method of embodiment 66 wherein the editing of said nucleotide sequence renders said nucleotide sequence capable of conferring herbicide resistance to said cell.
[0605] 68. The method of embodiment 67, wherein the cell is a plant cell.
[0606] 69. The method of embodiment 66, wherein the nucleotide sequence is a promoter, a regulatory sequence or a gene of interest.
[0607] 70. The method of embodiment 69 wherein the gene of interest is an enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene or an ALS gene.
[0608] 71. The method of embodiment 68, wherein the plant cell is a monocot or dicot plant cell.
[0609] 72. A method for producing an acetolactate synthase (ALS) mutant plant, the method comprising:
[0610] a) providing a guide RNA, a polynucleotide modification template, and a Cas endonuclease to a plant cell comprising an ALS nucleotide sequence, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site in the genome of said plant cell, wherein said polynucleotide modification template comprises at least one nucleotide modification of said ALS nucleotide sequence;
[0611] b) obtaining a plant from the plant cell of (a);
[0612] c) evaluating the plant of (b) for the presence of said at least one nucleotide modification; and,
[0613] d) selecting a progeny plant that shows resistance to sulphonylurea.
[0614] 73. A method for producing an acetolactate synthase (ALS) mutant plant, the method comprising:
[0615] a) providing a guide RNA and a polynucleotide modification template to a plant cell comprising a Cas endonuclease and an ALS nucleotide sequence, wherein said Cas endonuclease introduces a double strand break at a target site in the genome of said plant cell, wherein said polynucleotide modification template comprises at least one nucleotide modification of said ALS nucleotide sequence;
[0616] b) obtaining a plant from the plant cell of (a);
[0617] c) evaluating the plant of (b) for the presence of said at least one nucleotide modification; and,
[0618] d) selecting a progeny plant that shows resistance to sulphonylurea.
[0619] 74. The method of any of embodiments 72-73, wherein said polynucleotide modification template comprises a non-functional or partial fragment of the ALS nucleotide sequence.
[0620] 75. The method of any of embodiments 72-73, wherein the target site is located within the ALS nucleotide sequence.
[0621] 76. The method of any of embodiments 72-73, further comprising selecting a progeny plant that is void of said guide RNA and Cas endonuclease.
[0622] 77. A method for producing an acetolactate synthase (ALS) mutant plant, the method comprising:
[0623] a) obtaining a plant or a seed thereof, wherein the plant or the seed comprises a modification in an endogenous ALS gene, the modification generated by a Cas endonuclease, a guide RNA and a polynucleotide modification template, wherein the plant or the seed is resistant to sulphonylurea; and,
[0624] b) producing a progeny plant that is void of said guide RNA and Cas endonuclease.
[0625] 78. The method of embodiment 77, further comprising selecting a plant that shows resistance to sulphonylurea.
[0626] 79. The method of any one of embodiments 72-78, wherein the plant is a monocot or a dicot.
[0627] 80. The method of embodiment 79, wherein the monocot is selected from the group consisting of maize, rice, sorghum, rye, barley, wheat, millet, oats, sugarcane, turfgrass, or switchgrass.
[0628] 81. The method of embodiment 79, wherein the dicot is selected from the group consisting of soybean, canola, alfalfa, sunflower, cotton, tobacco, peanut, potato, Arabidopsis, or safflower.
[0629] 82. A method of generating a sulphonylurea resistant plant, the method comprising providing a plant cell wherein its endogenous chromosomal ALS gene has been modified through a guide RNA/Cas endonuclease system to produce a sulphonylurea resistant ALS protein and growing a plant from said plant cell, wherein said plant is resistant to sulphonylurea.
[0630] 83. The method of embodiment 82, wherein the plant is a monocot or a dicot.
[0631] 84. A plant produced by the method of embodiment 82.
[0632] 85. A seed produced by the plant of embodiment 84.
[0633] 86. A guide RNA wherein the variable targeting domain targets a fragment of a plant EPSPS or ALS nucleotide sequence.
[0634] 87. A method for producing an acetolactate synthase (ALS) mutant plant cell, the method comprising:
[0635] a) providing to a cell comprising an ALS nucleotide sequence, a guide RNA, a Cas endonuclease, and a polynucleotide modification template, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site in the genome of said cell, wherein said polynucleotide modification template comprises at least one nucleotide modification of said ALS nucleotide sequence; and,
[0636] b) obtaining at least one plant cell of (a) that has at least one nucleotide modification at said ALS nucleotide sequence, wherein the modification includes at least one deletion, insertion or substitution of one or more nucleotides in said ALS nucleotide sequence.
[0637] 88. A method for producing an acetolactate synthase (ALS) mutant plant cell, the method comprising:
[0638] a) providing a guide RNA and a polynucleotide modification template to a plant cell comprising a Cas endonuclease and an ALS nucleotide sequence, wherein said Cas endonuclease introduces a double strand break at a target site in the genome of said plant cell, wherein said polynucleotide modification template comprises at least one nucleotide modification of said ALS nucleotide sequence; and,
[0639] b) identifying at least one plant cell of (a) that has at least one nucleotide modification at said ALS nucleotide sequence, wherein the modification includes at least one deletion, insertion or substitution of one or more nucleotides in said ALS nucleotide sequence.
[0640] 89. A method for producing an acetolactate synthase (ALS) mutant cell, the method comprising:
[0641] a) providing to a cell comprising an ALS nucleotide sequence, a first recombinant DNA construct capable of expressing a guide RNA, a second recombinant DNA construct capable of expressing a Cas endonuclease, and a polynucleotide modification template, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site in the genome of said cell, wherein said polynucleotide modification template comprises a non-functional fragment of the ALS gene and at least one nucleotide modification of said ALS nucleotide sequence; and,
[0642] b) identifying at least one cell of (a) that has at least one nucleotide modification at said ALS nucleotide sequence, wherein the modification includes at least one deletion, insertion or substitution of one or more nucleotides in said ALS nucleotide sequence.
EXAMPLES
[0643] In the following Examples, unless otherwise stated, parts and percentages are by weight and degrees are Celsius. It should be understood that these Examples, while indicating embodiments of the disclosure, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can make various changes and modifications of the disclosure to adapt it to various usages and conditions. Such modifications are also intended to fall within the scope of the appended claims.
Example 1
[0644] Maize Optimized Expression Cassettes for Guide RNA/Cas Endonuclease Based Genome Modification in Maize Plants
[0645] For genome engineering applications, the type II CRISPR/Cas system minimally requires the Cas9 protein and a duplexed crRNA/tracrRNA molecule or a synthetically fused crRNA and tracrRNA (guide RNA) molecule for DNA target site recognition and cleavage (Gasiunas et al. (2012) Proc. Natl. Acad. Sci. USA 109:E2579-86, Jinek et al. (2012) Science 337:816-21, Mali et al. (2013) Science 339:823-26, and Cong et al. (2013) Science 339:819-23). Described herein is a guide RNA/Cas endonuclease system that is based on the type II CRISPR/Cas system and consists of a Cas endonuclease and a guide RNA (or duplexed crRNA and tracrRNA) that together can form a complex that recognizes a genomic target site in a plant and introduces a double-strand break into said target site.
[0646] To test the guide RNA/Cas endonuclease system in maize, the Cas9 gene from Streptococcus pyogenes M1 GAS (SF370) (SEQ ID NO: 1) was maize codon optimized per standard techniques known in the art and the potato ST-LS1 intron (SEQ ID NO: 2) was introduced in order to eliminate its expression in E. coli and Agrobacterium (FIG. 1A). To facilitate nuclear localization of the Cas9 protein in maize cells, Simian virus 40 (SV40) monopartite amino terminal nuclear localization signal (MAPKKKRKV, SEQ ID NO: 3) and Agrobacterium tumefaciens bipartite VirD2 T-DNA border endonuclease carboxyl terminal nuclear localization signal (KRPRDRHDGELGGRKRAR, SEQ ID NO: 4) were incorporated at the amino and carboxyl-termini of the Cas9 open reading frame (FIG. 1A), respectively. The maize optimized Cas9 gene was operably linked to a maize constitutive or regulated promoter by standard molecular biological techniques. An example of the maize optimized Cas9 expression cassette (SEQ ID NO: 5) is illustrated in FIG. 1A. FIG. 1A shows a maize optimized Cas9 gene containing the ST-LS1 intron, SV40 amino terminal nuclear localization signal (NLS) and VirD2 carboxyl terminal NLS driven by a plant Ubiquitin promoter.
[0647] The second component necessary to form a functional guide RNA/Cas endonuclease system for genome engineering applications is a duplex of the crRNA and tracrRNA molecules or a synthetic fusion of the crRNA and tracrRNA molecules (a guide RNA). To confer efficient guide RNA expression (or expression of the duplexed crRNA and tracrRNA) in maize, the maize U6 polymerase III promoter (SEQ ID NO: 9) and maize U6 polymerase III terminator (first 8 bases of SEQ ID NO: 10) residing on chromosome 8 were isolated and operably fused to the termini of a guide RNA (FIG. 1B) using standard molecular biology techniques. Two different guide RNA configurations were developed for testing in maize, a short guide RNA (SEQ ID NO: 11) based on Jinek et al. (2012) Science 337:816-21 and a long guide RNA (SEQ ID NO: 8) based on Mali et al. (2013) Science 339:823-26. An example expression cassette (SEQ ID NO: 12) is shown in FIG. 1B which illustrates a maize U6 polymerase III promoter driving expression of a long guide RNA terminated with a U6 polymerase III terminator.
[0648] As shown in FIGS. 2A and 2B, the guide RNA or crRNA molecule contains a region complementary to one strand of the double strand DNA target (referred to as the variable targeting domain) that is approximately 12-30 nucleotides in length and upstream of a PAM sequence (5'NGG3' on antisense strand of FIGS. 2A-2B, corresponding to 5'CCN3' on sense strand of FIGS. 2A-2B) for target site recognition and cleavage (Gasiunas et al. (2012) Proc. Natl. Acad. Sci. USA 109:E2579-86, Jinek et al. (2012) Science 337:816-21, Mali et al. (2013) Science 339:823-26, and Cong et al. (2013) Science 339:819-23). To facilitate the rapid introduction of maize genomic DNA target sequences into the crRNA or guide RNA expression constructs, two Type IIS BbsI restriction endonuclease target sites were introduced in an inverted tandem orientation with cleavage orientated in an outward direction as described in Cong et al. (2013) Science 339:819-23. Upon cleavage, the Type IIS restriction endonuclease excises its target sites from the crRNA or guide RNA expression plasmid, generating overhangs allowing for the in-frame directional cloning of duplexed oligos containing the desired maize genomic DNA target site into the variable targeting domain. In this example, only target sequences starting with a G nucleotide were used to promote favorable polymerase III expression of the guide RNA or crRNA.
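By way of illustration only, the target-site criteria described above (a variable targeting domain that begins with G and lies immediately upstream of an NGG PAM) can be sketched in Python. The function name, the default length of 20 nucleotides and the short flanking sequence are assumptions introduced for this sketch; the protospacer shown is the MS45Cas-1 target of Table 1 as read with its wrapped table cell joined. Only the strand supplied is scanned; the opposite strand would be scanned by passing its reverse complement.

```python
import re


def find_candidate_targets(seq, protospacer_len=20):
    """List (start, protospacer, pam) for G-initial sites followed by an NGG PAM.

    Hypothetical helper: scans only the strand given, 5' to 3'.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(
        r"(?=(G[ACGT]{%d})([ACGT]GG))" % (protospacer_len - 1)
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# MS45Cas-1 target (19 nt) plus its CGG PAM, with two arbitrary flanking
# bases on each side (the flanks are made up for the example).
demo = "TT" + "GCTGGCCGAGGTCGACTAC" + "CGG" + "AA"
hits = find_candidate_targets(demo, protospacer_len=19)
for start, protospacer, pam in hits:
    print(start, protospacer, pam)
```

The length is left as a parameter because the targets listed in Table 1 are not all the same length.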
[0649] Expression of both the Cas endonuclease gene and the guide RNA then allows for the formation of the guide RNA/Cas complex depicted in FIG. 2B (SEQ ID NO: 8). Alternatively, expression of the Cas endonuclease gene, crRNA, and tracrRNA allows for the formation of the crRNA/tracrRNA/Cas complex depicted in FIG. 2A (SEQ ID NOs: 6-7).
Example 2
[0650] The Guide RNA/Cas Endonuclease System Cleaves Chromosomal DNA in Maize and Introduces Mutations by Imperfect Non-Homologous End-Joining
[0651] To test whether the maize optimized guide RNA/Cas endonuclease described in Example 1 could recognize, cleave, and mutate maize chromosomal DNA through imprecise non-homologous end-joining (NHEJ) repair pathways, three different genomic target sequences in each of 5 maize loci were targeted for cleavage (see Table 1) and examined by deep sequencing for the presence of NHEJ mutations.
TABLE 1 Maize genomic target sites targeted by a guide RNA/Cas endonuclease system.
Locus | Location | Guide RNA Used | Target Site Designation | Maize Genomic Target Site Sequence | PAM Sequence | SEQ ID NO:
MS26 | Chr. 1: 51.81 cM | Long | MS26Cas-1 | GTACTCCATCCGCCCCATCGAGTA | GGG | 13
MS26 | Chr. 1: 51.81 cM | Long | MS26Cas-2 | GCACGTACGTCACCATCCCGC | CGG | 14
MS26 | Chr. 1: 51.81 cM | Long | MS26Cas-3 | GACGTACGTGCCCTACTCGAT | GGG | 15
LIG | Chr. 2: 28.45 cM | Long | LIGCas-1 | GTACCGTACGTGCCCCGGCGG | AGG | 16
LIG | Chr. 2: 28.45 cM | Long | LIGCas-2 | GGAATTGTACCGTACGTGCCC | CGG | 17
LIG | Chr. 2: 28.45 cM | Long | LIGCas-3 | GCGTACGCGTACGTGTG | AGG | 18
MS45 | Chr. 9: 119.15 cM | Long | MS45Cas-1 | GCTGGCCGAGGTCGACTAC | CGG | 19
MS45 | Chr. 9: 119.15 cM | Long | MS45Cas-2 | GGCCGAGGTCGACTACCGGC | CGG | 20
MS45 | Chr. 9: 119.15 cM | Long | MS45Cas-3 | GGCGCGAGCTCGTGCTTCAC | CGG | 21
ALS | Chr. 4: 107.73 cM and Chr. 5: 115.49 cM | Long | ALSCas-1 | GGTGCCAATCATGCGTCG | CGG | 22
ALS | Chr. 4: 107.73 cM and Chr. 5: 115.49 cM | Long | ALSCas-2 | GGTCGCCATCACGGGAC | AGG | 23
ALS | Chr. 4: 107.73 cM and Chr. 5: 115.49 cM | Long | ALSCas-3 | GTCGCGGCACCTGTCCCGTGA | TGG | 24
EPSPS | Chr. 9: 69.43 cM | Long | EPSPSCas-1 | GGAATGCTGGAACTGCAATG | CGG | 25
EPSPS | Chr. 9: 69.43 cM | Long | EPSPSCas-2 | GCAGCTCTTCTTGGGGAATGC | TGG | 26
EPSPS | Chr. 9: 69.43 cM | Long | EPSPSCas-3 | GCAGTAACAGCTGCTGTCAA | TGG | 27
MS26 = Male Sterility Gene 26, LIG = Liguleless 1 Gene Promoter, MS45 = Male Sterility Gene 45, ALS = Acetolactate Synthase Gene, EPSPS = Enolpyruvylshikimate Phosphate Synthase Gene
[0652] The maize optimized Cas9 endonuclease and long guide RNA expression cassettes containing the specific maize variable targeting domains were co-delivered to 60-90 Hi-II immature maize embryos by particle-mediated delivery (see Example 10) in the presence of BBM and WUS2 genes (see Example 11). Hi-II maize embryos transformed with either the LIG3-4 or MS26++ homing endonucleases (see Example 9) targeting the same maize genomic loci as the LIGCas or MS26Cas target sites served as positive controls and embryos transformed with only the Cas9 or guide RNA expression cassette served as negative controls. After 7 days, the 20-30 most uniformly transformed embryos from each treatment were pooled and total genomic DNA was extracted. The region surrounding the intended target site was PCR amplified with Phusion® High Fidelity PCR Master Mix (New England Biolabs, M0531L), adding on the sequences necessary for amplicon-specific barcodes and Illumina sequencing using "tailed" primers through two rounds of PCR. The primers used in the primary PCR reaction are shown in Table 2 and the primers used in the secondary PCR reaction were AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACG (forward, SEQ ID NO: 53) and CAAGCAGAAGACGGCATA (reverse, SEQ ID NO: 54).
TABLE 2 PCR primer sequences
Target Site | Primer Orientation | Primary PCR Primer Sequence | SEQ ID NO:
MS26Cas-1 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAGGACCGGAAGCTCGCCGCGT | 28
MS26Cas-1 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTTCCTGGAGGACGACGTGCTG | 29
MS26Cas-2 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAAGGTCCTGGAGGACGACGTGCTG | 30
MS26Cas-2 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTCCGGAAGCTCGCCGCGT | 31
MS26Cas-3 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTTCCTCCGGAAGCTCGCCGCGT | 32
MS26Cas-3 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTTCCTGGAGGACGACGTGCTG | 29
MS26 Meganuclease | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTTTCCTCCTGGAGGACGACGTGCTG | 33
MS26 Meganuclease | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTCCGGAAGCTCGCCGCGT | 31
LIGCas-1 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAGGACTGTAACGATTTACGCACCTGCTG | 34
LIGCas-1 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCAAATGAGTAGCAGCGCACGTAT | 35
LIGCas-2 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTTCCTCTGTAACGATTTACGCACCTGCTG | 36
LIGCas-2 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCAAATGAGTAGCAGCGCACGTAT | 35
LIGCas-3 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAAGGCGCAAATGAGTAGCAGCGCAC | 37
LIGCas-3 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTCACCTGCTGGGAATTGTACCGTA | 38
LIG3-4 Meganuclease | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTCCTTCGCAAATGAGTAGCAGCGCAC | 39
LIG3-4 Meganuclease | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTCACCTGCTGGGAATTGTACCGTA | 38
MS45Cas-1 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAGGAGGACCCGTTCGGCCTCAGT | 40
MS45Cas-1 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCCGGCTGGCATTGTCTCTG | 41
MS45Cas-2 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTTCCTGGACCCGTTCGGCCTCAGT | 42
MS45Cas-2 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCCGGCTGGCATTGTCTCTG | 41
MS45Cas-3 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTGAAGGGACCCGTTCGGCCTCAGT | 43
MS45Cas-3 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCCGGCTGGCATTGTCTCTG | 41
ALSCas-1 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAAGGCGACGATGGGCGTCTCCTG | 44
ALSCas-1 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCGTCTGCATCGCCACCTC | 45
ALSCas-2 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTTTCCCGACGATGGGCGTCTCCTG | 46
ALSCas-2 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCGTCTGCATCGCCACCTC | 45
ALSCas-3 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTGGAACGACGATGGGCGTCTCCTG | 47
ALSCas-3 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCGTCTGCATCGCCACCTC | 45
EPSPSCas-1 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTGGAAGAGGAAACATACGTTGCATTTCCA | 48
EPSPSCas-1 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGGTGGAAAGTTCCCAGTTGAGGA | 49
EPSPSCas-2 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAAGCGGTGGAAAGTTCCCAGTTGAGGA | 50
EPSPSCas-2 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGAGGAAACATACGTTGCATTTCCA | 51
EPSPSCas-3 | Forward | CTACACTCTTTCCCTACACGACGCTCTTCCGATCTCCTTGAGGAAACATACGTTGCATTTCCA | 52
EPSPSCas-3 | Reverse | CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGGTGGAAAGTTCCCAGTTGAGGA | 49
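As a non-limiting illustration of the "tailed" two-round PCR design, the following Python sketch measures how much of each secondary PCR primer (SEQ ID NOs: 53 and 54) coincides with the 5' tail of a primary PCR primer, here the MS26Cas-1 pair (SEQ ID NOs: 28 and 29) with their wrapped table cells joined. The helper name is hypothetical, and the reading that the secondary primers anneal to these shared tails is an inference from the sequences rather than a statement made in the text.

```python
def suffix_prefix_overlap(upstream, downstream):
    """Length of the longest suffix of `upstream` that is a prefix of `downstream`."""
    for k in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream[-k:] == downstream[:k]:
            return k
    return 0


SECONDARY_FWD = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACG"  # SEQ ID NO: 53
SECONDARY_REV = "CAAGCAGAAGACGGCATA"  # SEQ ID NO: 54
# Primary primers for MS26Cas-1 (SEQ ID NOs: 28 and 29).
PRIMARY_FWD = "CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAGGACCGGAAGCTCGCCGCGT"
PRIMARY_REV = "CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTTCCTGGAGGACGACGTGCTG"

fwd_overlap = suffix_prefix_overlap(SECONDARY_FWD, PRIMARY_FWD)
rev_overlap = suffix_prefix_overlap(SECONDARY_REV, PRIMARY_REV)
print(fwd_overlap, rev_overlap)
```

Because every forward and every reverse primer in Table 2 begins with the same tail, the same check would apply to the other target sites.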
[0653] The resulting PCR amplifications were purified with a Qiagen PCR purification spin column, concentration measured with a Hoechst dye-based fluorometric assay, combined in an equimolar ratio, and single read 100 nucleotide-length deep sequencing was performed on Illumina's MiSeq Personal Sequencer with a 30-40% (v/v) spike of PhiX control v3 (Illumina, FC-110-3001) to off-set sequence bias. Only those reads with a ≥1 nucleotide indel arising within the 10 nucleotide window centered over the expected site of cleavage and not found in a similar level in the negative control were classified as NHEJ mutations. NHEJ mutant reads with the same mutation were counted and collapsed into a single read and the top 10 most prevalent mutations were visually confirmed as arising within the expected site of cleavage. The total numbers of visually confirmed NHEJ mutations were then used to calculate the % mutant reads based on the total number of reads of an appropriate length containing a perfect match to the barcode and forward primer.
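The read classification described above can be sketched, by way of example only, as follows. The sketch assumes that each read has already been aligned and summarized as a list of indels in reference coordinates; that representation, the function names and the toy reads are assumptions made for illustration. The comparison against the negative control and the visual confirmation step are not modeled.

```python
from collections import Counter


def touches_window(indel, cut_site, window=10):
    """True if an indel overlaps the `window`-nt region centered on `cut_site`.

    indel is (kind, start, length): kind is "del" or "ins", start is a
    0-based reference position and length is the indel size in nucleotides.
    """
    kind, start, length = indel
    half = window // 2
    win_start, win_end = cut_site - half, cut_site + half  # half-open interval
    # A deletion spans `length` reference bases; an insertion sits at one position.
    span = length if kind == "del" else 1
    return start < win_end and start + span > win_start


def percent_mutant_reads(reads, cut_site, window=10):
    """Return (% mutant reads, the ten most common collapsed mutations)."""
    total = 0
    mutations = Counter()
    for indels in reads:
        total += 1
        hits = tuple(sorted(i for i in indels
                            if i[2] >= 1 and touches_window(i, cut_site, window)))
        if hits:
            mutations[hits] += 1  # identical mutations collapse onto one key
    mutant = sum(mutations.values())
    percent = 100.0 * mutant / total if total else 0.0
    return percent, mutations.most_common(10)


# Toy input: five reads, three of which carry an indel at the cut site.
example_reads = [[], [("del", 48, 3)], [("del", 48, 3)],
                 [("ins", 50, 1)], [("del", 10, 2)]]
percent, top = percent_mutant_reads(example_reads, cut_site=50)
print(percent, top)
```

In this sketch the denominator is simply the number of reads supplied; in the procedure above it is the number of reads of an appropriate length with a perfect match to the barcode and forward primer.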
[0654] The frequency of NHEJ mutations recovered by deep sequencing for the guide RNA/Cas endonuclease system targeting the three LIGCas targets (SEQ ID NOS: 16, 17, 18) compared to the LIG3-4 homing endonuclease targeting the same locus is shown in Table 3. The ten most prevalent types of NHEJ mutations recovered based on the guide RNA/Cas endonuclease system compared to the LIG3-4 homing endonuclease are shown in FIG. 3A (corresponding to SEQ ID NOs: 55-75) and FIG. 3B (corresponding to SEQ ID NOs: 76-96). Approximately 12- to 23-fold higher frequencies of NHEJ mutations were observed when using a guide RNA/Cas system to introduce a double strand break at a maize genomic target site (Cas target sites), relative to the LIG3-4 homing endonuclease control. As shown in Table 4, a similar difference between the guide RNA/Cas system and meganuclease double-strand break technologies was observed at the MS26 locus, with approximately 14- to 25-fold higher frequencies of NHEJ mutations when a guide RNA/Cas endonuclease system was used. High frequencies of NHEJ mutations were also recovered at the MS45, ALS and EPSPS Cas targets (see Table 5) when using a guide RNA/Cas endonuclease system. These data indicate that the guide RNA/Cas9 endonuclease system described herein can be effectively used to introduce an alteration at genomic sites of interest such as those related to male fertility, wherein an alteration results in the creation of a male sterile gene locus and male sterile plants. Altering the EPSPS target can result in the production of plants that are tolerant and/or resistant to glyphosate-based herbicides. Altering the acetolactate synthase (ALS) gene target site can result in the production of plants that are tolerant and/or resistant to imidazolinone and sulphonylurea herbicides.
TABLE 3 Percent (%) mutant reads at maize Liguleless 1 target locus produced by a guide RNA/Cas system versus a homing endonuclease system.
System | Total Number of Reads | Number of Mutant Reads | % Mutant Reads
Cas9 Only Control | 640,063 | 1 | 0.00%
guide RNA Only Control | 646,774 | 1 | 0.00%
LIG3-4 Homing Endonuclease | 616,536 | 1,211 | 0.20%
LIGCas-1 guide/Cas9 | 716,854 | 33,050 | 4.61%
LIGCas-2 guide/Cas9 | 711,047 | 16,675 | 2.35%
LIGCas-3 guide/Cas9 | 713,183 | 27,959 | 3.92%
TABLE 4 Percent (%) mutant reads at maize Male Sterility 26 target locus produced by a guide RNA/Cas system versus a homing endonuclease.
System | Total Number of Reads | Number of Mutant Reads | % Mutant Reads
Cas9 Only Control | 403,123 | 15 | 0.00%
MS26++ Homing Endonuclease | 512,784 | 642 | 0.13%
MS26Cas-1 guide/Cas9 | 575,671 | 10,073 | 1.75%
MS26Cas-2 guide/Cas9 | 543,856 | 16,930 | 3.11%
MS26Cas-3 guide/Cas9 | 538,141 | 13,879 | 2.58%
TABLE 5 Percent (%) mutant reads at maize Male Sterility 45, Acetolactate Synthase and Enolpyruvylshikimate Phosphate Synthase target loci produced by the guide RNA/Cas system.
System | Total Number of Reads | Number of Mutant Reads | % Mutant Reads
Cas9 Only Control (MS45) | 899,500 | 27 | 0.00%
MS45Cas-1 guide/Cas9 | 812,644 | 3,795 | 0.47%
MS45Cas-2 guide/Cas9 | 785,183 | 14,704 | 1.87%
MS45Cas-3 guide/Cas9 | 728,023 | 9,203 | 1.26%
Cas9 Only Control (ALS) | 534,764 | 19 | 0.00%
ALSCas-1 guide/Cas9 | 434,452 | 9,669 | 2.23%
ALSCas-2 guide/Cas9 | 472,351 | 6,352 | 1.345%
ALSCas-3 guide/Cas9 | 497,786 | 8,535 | 1.715%
Cas9 Only Control (EPSPS) | 1,347,086 | 6 | 0.00%
EPSPSCas-1 guide/Cas9 | 1,420,274 | 13,051 | 0.92%
EPSPSCas-2 guide/Cas9 | 1,225,082 | 26,340 | 2.15%
EPSPSCas-3 guide/Cas9 | 1,406,905 | 53,603 | 3.81%
[0655] Taken together, our data indicate that the maize optimized guide RNA/Cas endonuclease system described herein using a long guide RNA expression cassette efficiently cleaves maize chromosomal DNA and generates imperfect NHEJ mutations at frequencies greater than the engineered LIG3-4 and MS26++ homing endonucleases.
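For illustration, the fold differences quoted above can be recomputed from the read counts reported in Tables 3 and 4. The following Python sketch uses hypothetical function and variable names and works from the raw counts rather than from the rounded percentages.

```python
def percent_mutant(total_reads, mutant_reads):
    """Percent of reads carrying an NHEJ mutation."""
    return 100.0 * mutant_reads / total_reads


def fold_over_control(counts, control):
    """Mutation frequency of each system relative to the named control."""
    control_pct = percent_mutant(*counts[control])
    return {name: percent_mutant(*c) / control_pct
            for name, c in counts.items() if name != control}


# (total reads, mutant reads) as reported in Tables 3 and 4.
LIG = {"LIG3-4": (616_536, 1_211), "LIGCas-1": (716_854, 33_050),
       "LIGCas-2": (711_047, 16_675), "LIGCas-3": (713_183, 27_959)}
MS26 = {"MS26++": (512_784, 642), "MS26Cas-1": (575_671, 10_073),
        "MS26Cas-2": (543_856, 16_930), "MS26Cas-3": (538_141, 13_879)}

lig_fold = fold_over_control(LIG, "LIG3-4")
ms26_fold = fold_over_control(MS26, "MS26++")
for name, fold in {**lig_fold, **ms26_fold}.items():
    print(f"{name}: {fold:.1f}-fold")
```

Rounded to whole numbers, the extremes come out at about 12 and 23 for the LIG targets and about 14 and 25 for the MS26 targets.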
Example 3
[0656] Long Guide RNA of the Maize Optimized Guide RNA/Cas Endonuclease System Cleaves Maize Chromosomal DNA More Efficiently than the Short Guide RNA
[0657] To determine the most effective guide RNA (comprising a fusion of the crRNA and tracrRNA) for use in maize, the recovery of NHEJ mutations using a short guide RNA (SEQ ID NO: 11) based on Jinek et al. (2012) Science 337:816-21 and a long guide RNA (SEQ ID NO: 8) based on Mali et al. (2013) Science 339:823-26 was examined.
[0658] The variable targeting domains of the guide RNA targeting the maize genomic target sites at the LIG locus (LIGCas-1, LIGCas-2 and LIGCas-3, SEQ ID NOs: 16, 17 and 18, Table 1) were introduced into both the maize optimized long and short guide RNA expression cassettes as described in Example 1 and co-transformed along with the maize optimized Cas9 endonuclease expression cassette into immature maize embryos and deep sequenced for NHEJ mutations as described in Example 2. Embryos transformed with only the Cas9 endonuclease expression cassette served as a negative control.
[0659] As shown in Table 6 below, the frequency of NHEJ mutations recovered with the long guide RNA far exceeded those obtained with the short guide RNA. This data indicates that the long guide RNA paired with the maize optimized Cas9 endonuclease gene described herein more efficiently cleaves maize chromosomal DNA.
TABLE 6 Percent (%) mutant reads at the maize Liguleless 1 target locus produced by a guide RNA/Cas system with a long versus a short guide RNA.
System | guide RNA Used | Total Number of Reads | Number of Mutant Reads | % Mutant Reads
Cas9 Only | N/A | 640,063 | 1 | 0.00%
LIGCas-1 guide/Cas9 | Short | 676,870 | 43 | 0.01%
LIGCas-2 guide/Cas9 | Short | 747,945 | 91 | 0.01%
LIGCas-3 guide/Cas9 | Short | 655,157 | 10 | 0.00%
LIGCas-1 guide/Cas9 | Long | 716,854 | 33,050 | 4.61%
LIGCas-2 guide/Cas9 | Long | 711,047 | 16,675 | 2.35%
LIGCas-3 guide/Cas9 | Long | 713,183 | 27,959 | 3.92%
Example 4
[0660] The Guide RNA/Cas Endonuclease System May be Multiplexed to Simultaneously Target Multiple Chromosomal Loci in Maize for Mutagenesis by Imperfect Non-Homologous End-Joining
[0661] To test if multiple chromosomal loci may be simultaneously mutagenized with the guide RNA/maize optimized Cas endonuclease system described herein, the long guide RNA expression cassettes targeting the MS26Cas-2 target site (SEQ ID NO: 14), the LIGCas-3 target site (SEQ ID NO: 18) and the MS45Cas-2 target site (SEQ ID NO: 20), were co-transformed into maize embryos either in duplex or in triplex along with the Cas9 endonuclease expression cassette and examined by deep sequencing for the presence of imprecise NHEJ mutations as described in Example 2.
[0662] Hi-II maize embryos co-transformed with the Cas9 expression cassette and the corresponding guide RNA expression cassette singly served as a positive control and embryos transformed with only the Cas9 expression cassette served as a negative control.
[0663] As shown in Table 7 below, mutations resulting from imprecise NHEJ were recovered at all relevant loci when multiple guide RNA expression cassettes were simultaneously introduced either in duplex or triplex, with frequencies of mutant reads near those of the positive control. This demonstrates that the maize optimized guide RNA/Cas endonuclease system described herein may be used to simultaneously introduce imprecise NHEJ mutations at multiple loci in maize.
TABLE 7 Percent (%) mutant reads at maize target loci produced by a multiplexed guide RNA/Cas system.
Target Site Examined for NHEJ Mutations | guide RNAs Co-transformed Individually, in Duplex, or in Triplex with Cas9 | Total Number of Reads | Number of Mutant Reads | % Mutant Reads
LIGCas-3, MS26Cas-2, MS45Cas-2 | None (Cas9 Only control) | 527,691 | 9 | 0.00%
LIGCas-3 | LIGCas-3 | 645,107 | 12,631 | 1.96%
LIGCas-3 | LIGCas-3, MS26Cas-2 | 579,992 | 10,348 | 1.78%
LIGCas-3 | LIGCas-3, MS26Cas-2, MS45Cas-2 | 648,901 | 12,094 | 1.86%
MS26Cas-2 | MS26Cas-2 | 699,154 | 17,247 | 2.47%
MS26Cas-2 | LIGCas-3, MS26Cas-2 | 717,158 | 10,256 | 1.43%
MS26Cas-2 | MS26Cas-2, MS45Cas-2 | 613,431 | 9,931 | 1.62%
MS26Cas-2 | LIGCas-3, MS26Cas-2, MS45Cas-2 | 471,890 | 7,311 | 1.55%
MS45Cas-2 | MS45Cas-2 | 503,423 | 10,034 | 1.99%
MS45Cas-2 | MS26Cas-2, MS45Cas-2 | 480,178 | 8,008 | 1.67%
MS45Cas-2 | LIGCas-3, MS26Cas-2, MS45Cas-2 | 416,711 | 7,190 | 1.73%
Example 5
[0664] Guide RNA/Cas Endonuclease Mediated DNA Cleavage in Maize Chromosomal Loci can Stimulate Homologous Recombination Repair-Mediated Transgene Insertion
[0665] To test the utility of the maize optimized guide RNA/Cas system described herein to cleave maize chromosomal loci and stimulate homologous recombination (HR) repair pathways to site-specifically insert a transgene, a HR repair DNA vector (also referred to as a donor DNA) (SEQ ID NO: 97) was constructed as illustrated in FIG. 4 using standard molecular biology techniques and co-transformed with a long guide RNA expression cassette, comprising a variable targeting domain corresponding to the LIGCas-3 genomic target site, and a Cas9 endonuclease expression cassette into immature maize embryos as described in Example 2.
[0666] Maize embryos co-transformed with the HR repair DNA vector and LIG3-4 homing endonuclease (see Example 9) targeting the same genomic target site as LIGCas-3 served as a positive control. Since successful delivery of the HR repair DNA vector confers bialaphos herbicide resistance, callus events containing putative HR-mediated transgenic insertions were selected by placing the callus on herbicide containing media. After selection, stable callus events were sampled, total genomic DNA extracted, and using the primer pairs shown in FIG. 5 (corresponding to SEQ ID NOs: 98-101), PCR amplification was carried out at both possible transgene genomic DNA junctions to identify putative HR-mediated transgenic insertions. The resulting amplifications were sequenced for confirmation.
[0667] Sequence confirmed PCR amplifications indicating site-specific transgene insertion for the guide RNA/Cas system were detected for 37 out of 384 stable transformants with 15 containing amplifications across both transgene genomic DNA junctions indicating near perfect site-specific transgene insertion. The LIG3-4 homing endonuclease positive control yielded PCR amplifications indicating site-specific transgene insertion for 3 out of 192 stable transformants with 1 containing amplifications across both transgene genomic DNA junctions. The data clearly demonstrates that maize chromosomal loci cleaved with the maize optimized guide RNA/Cas system described herein can be used to stimulate HR repair pathways to site-specifically insert transgenes at frequencies greater than the LIG3-4 homing endonuclease.
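As a simple worked example, the insertion frequencies implied by the counts above can be computed as follows. The Python sketch uses hypothetical names, takes its counts from the preceding paragraph, and performs no statistical test.

```python
def percent(events, total):
    """Events as a percentage of the total."""
    return 100.0 * events / total


# Counts of stable transformants reported above.
RESULTS = {
    "guide RNA/Cas9 (LIGCas-3)": {"transformants": 384, "detected": 37, "both": 15},
    "LIG3-4 homing endonuclease": {"transformants": 192, "detected": 3, "both": 1},
}

rates = {
    name: {
        "detected": percent(r["detected"], r["transformants"]),
        "both": percent(r["both"], r["transformants"]),
    }
    for name, r in RESULTS.items()
}

for name, r in rates.items():
    print(f"{name}: insertion detected {r['detected']:.1f}%, "
          f"both junctions {r['both']:.1f}%")
```

On these counts, site-specific insertion was detected in roughly 9.6% of the guide RNA/Cas transformants against roughly 1.6% for the homing endonuclease control, and amplification across both junctions in roughly 3.9% against roughly 0.5%.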
Example 6
[0668] Guide RNA/Cas Endonuclease System Transformed Together on a Single Vector Results in Greater Recovery of Imperfect Non-Homologous End-Joining Mutations
[0669] To evaluate different delivery methods for the maize optimized guide RNA/Cas endonuclease system described herein, the recovery of NHEJ mutations when the guide RNA/Cas expression cassettes were either co-transformed as separate DNA vectors as in Examples 2, 3, 4 and 5 or transformed as a single vector DNA (comprising both guide RNA and Cas endonuclease expression cassettes, as shown in FIG. 1C) was examined.
[0670] The long guide RNA expression cassette for LIGCas-3 and the Cas9 expression cassette were consolidated onto a single vector DNA (FIG. 1C, SEQ ID NO: 102) by standard molecular biology techniques and transformed into immature Hi-II maize embryos as described in Examples 10 and 11 by particle-mediated delivery. Hi-II embryos co-transformed with the Cas9 and LIGCas-3 long guide RNA expression cassettes served as a positive control while embryos transformed with only the Cas9 expression cassette served as a negative control. Deep sequencing for NHEJ mutations was performed as described in Example 2.
[0671] As shown in Table 8 below, the frequency of NHEJ mutations recovered when the Cas endonuclease and long guide RNA expression cassettes were delivered together as a single vector DNA was approximately 2-fold greater than that observed from the equivalent co-transformation experiment. This indicates that delivery of the guide RNA/Cas system expression cassettes together on a single vector DNA results in a greater recovery of imperfect non-homologous end-joining mutations.
TABLE 8 Percent (%) mutant reads at the maize Liguleless 1 target locus produced by a guide RNA/Cas system with Cas9 and guide RNA expression cassettes combined into one DNA vector versus two separate DNA vectors.
System | Total Number of Reads | Number of Mutant Reads | % Mutant Reads
Cas9 Only Control | 1,519,162 | 97 | 0.01%
LIGCas-3 guide/Cas9 (Two vector DNAs) | 1,515,0607 | 36,346 | 2.40%
LIGCas-3 guide/Cas9 (Single vector DNA) | 1,860,031 | 105,854 | 5.69%
Example 7
[0672] Delivery Methods for Plant Genome Editing Using the Guide RNA/Cas Endonuclease System
[0673] This example describes methods to deliver or maintain and express the Cas9 endonuclease and guide RNA (or individual crRNA and tracrRNAs) into, or within plants, respectively, to enable directed DNA modification or gene insertion via homologous recombination. More specifically this example describes a variety of methods which include, but are not limited to, delivery of the Cas9 endonuclease as a DNA, RNA (5'-capped and polyadenylated) or protein molecule. In addition, the guide RNA may be delivered as a DNA or RNA molecule.
[0674] As shown in Example 2, a high mutation frequency was observed when Cas9 endonuclease and guide RNA were delivered as DNA vectors by biolistic transformation of immature corn embryos. Other embodiments of this disclosure can be to deliver the Cas9 endonuclease as a DNA, RNA or protein and the guide RNA as a DNA or RNA molecule or as a duplex crRNA/tracrRNA molecule as RNA or DNA or a combination. Various combinations of Cas9 endonuclease, guide RNA and crRNA/tracrRNA delivery methods can be, but are not limited to, the methods shown in Table 9.
TABLE 9 Various combinations of delivery of the Cas9 endonuclease, guide RNA or crRNA + tracrRNA.
Combination | Components delivered (delivery method is shown between brackets)
1 | Cas9 (DNA vector), guide RNA (DNA vector)
2 | Cas9 (DNA vector), guide RNA (RNA)
3 | Cas9 (RNA), guide RNA (DNA)
4 | Cas9 (RNA), guide RNA (RNA)
5 | Cas9 (Protein), guide RNA (DNA)
6 | Cas9 (Protein), guide RNA (RNA)
7 | Cas9 (DNA vector), crRNA (DNA), tracrRNA (DNA)
8 | Cas9 (DNA vector), crRNA (RNA), tracrRNA (DNA)
9 | Cas9 (DNA vector), crRNA (RNA), tracrRNA (RNA)
10 | Cas9 (DNA vector), crRNA (DNA), tracrRNA (RNA)
11 | Cas9 (RNA), crRNA (DNA), tracrRNA (DNA)
12 | Cas9 (RNA), crRNA (RNA), tracrRNA (DNA)
13 | Cas9 (RNA), crRNA (RNA), tracrRNA (RNA)
14 | Cas9 (RNA), crRNA (DNA), tracrRNA (RNA)
15 | Cas9 (Protein), crRNA (DNA), tracrRNA (DNA)
16 | Cas9 (Protein), crRNA (RNA), tracrRNA (DNA)
17 | Cas9 (Protein), crRNA (RNA), tracrRNA (RNA)
18 | Cas9 (Protein), crRNA (DNA), tracrRNA (RNA)
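The eighteen combinations of Table 9 follow from pairing three delivery forms for Cas9 with two forms for each RNA component. As an illustrative sketch only, they can be enumerated in Python as shown below; the function name and the dictionary layout are assumptions, the guide RNA form is written simply as "DNA" or "RNA", and the order in which the crRNA/tracrRNA combinations are produced differs from the row order of Table 9.

```python
from itertools import product

CAS9_FORMS = ["DNA vector", "RNA", "Protein"]
NUCLEIC_FORMS = ["DNA", "RNA"]


def delivery_combinations():
    """Enumerate delivery forms for Cas9 with a guide RNA or with crRNA + tracrRNA."""
    combos = []
    # Single guide RNA: 3 Cas9 forms x 2 guide RNA forms = 6 combinations.
    for cas9, guide in product(CAS9_FORMS, NUCLEIC_FORMS):
        combos.append({"Cas9": cas9, "guide RNA": guide})
    # Duplexed crRNA/tracrRNA: 3 x 2 x 2 = 12 combinations.
    for cas9, cr, tracr in product(CAS9_FORMS, NUCLEIC_FORMS, NUCLEIC_FORMS):
        combos.append({"Cas9": cas9, "crRNA": cr, "tracrRNA": tracr})
    return combos


combos = delivery_combinations()
for number, combo in enumerate(combos, start=1):
    print(number, combo)
```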
[0675] Delivery of the Cas9 (as DNA vector) and guide RNA (as DNA vector) example (Table 9, combination 1) can also be accomplished by co-delivering these DNA cassettes on a single or multiple Agrobacterium vectors and transforming plant tissues by Agrobacterium mediated transformation. In addition, a vector containing a constitutive, tissue-specific or conditionally regulated Cas9 gene can be first delivered to plant cells to allow for stable integration into the plant genome to establish a plant line that contains only the Cas9 gene in the plant genome. In this example, single or multiple guide RNAs, or single or multiple crRNA and a tracrRNA can be delivered as either DNA or RNA, or combination, to the plant line containing the genome-integrated version of the Cas9 gene for the purpose of generating mutations or promoting homologous recombination when HR repair DNA vectors for targeted integration are co-delivered with the guide RNAs. As an extension of this example, a plant line containing the genome-integrated version of the Cas9 gene and a tracrRNA as a DNA molecule can also be established. In this example, single or multiple crRNA molecules can be delivered as RNA or DNA to promote the generation of mutations or to promote homologous recombination when HR repair DNA vectors for targeted integration are co-delivered with crRNA molecule(s), enabling the targeted mutagenesis or homologous recombination at single or multiple sites in the plant genome.
Example 8
[0676] Components of the Guide RNA/Cas Endonuclease System Delivered Directly as RNA in Plants
[0677] This example illustrates the use of the method described in Table 9, combination 2, of Example 7 [Cas9 (DNA vector), guide RNA (RNA)] for modification or mutagenesis of chromosomal loci in plants. The maize optimized Cas9 endonuclease expression cassette described in Example 1 was co-delivered by particle gun as described in Example 2 along with single stranded RNA molecules (synthesized by Integrated DNA Technologies, Inc.) constituting a short guide RNA targeting the maize locus and sequence shown in Table 10. Embryos transformed with only the Cas9 expression cassette or short guide RNA molecules served as negative controls. Seven days post-bombardment, the immature embryos were harvested and analyzed by deep sequencing for NHEJ mutations as described in Example 2. Mutations not present in the negative controls were found at the site (FIG. 6, corresponding to SEQ ID NOs: 104-110). These mutations were similar to those found in Examples 2, 3, 4 and 6. These data indicate that component(s) of the maize optimized guide RNA/Cas endonuclease system described herein may be delivered directly as RNA.
TABLE 10 Maize genomic target site and location for short guide RNA delivered as RNA.
Locus | Location | Guide RNA Used | Designation | Maize Target Site | PAM Sequence | SEQ ID NO
55 | Chr. 1: 51.78 cM | Short | 55CasRNA-1 | TGGGCAGGTCTCACGACGGT | TGG | 103
Example 9
Creation of Rare Cutting Engineered Meganucleases
LIG3-4 Meganuclease and LIG3-4 Intended Recognition Sequence
[0678] An endogenous maize genomic target site comprising the LIG3-4 intended recognition sequence (SEQ ID NO: 111) was selected for design of a rare-cutting double-strand break inducing agent (SEQ ID NO: 112) as described in US patent publication 2009-0133152 A1 (published May 21, 2009). The LIG3-4 intended recognition sequence is a 22 bp polynucleotide having the following sequence:
(SEQ ID NO: 111) ATATACCTCACACGTACGCGTA.
MS 26++ meganuclease
[0679] An endogenous maize genomic target site designated "TS-MS26" (SEQ ID NO: 113) was selected for design of a custom double-strand break inducing agent MS26++ as described in U.S. patent application Ser. No. 13/526,912 (filed Jun. 19, 2012). The TS-MS26 target site is a 22 bp polynucleotide positioned 62 bps from the 5' end of the fifth exon of the maize MS26 gene and having the following sequence: gatggtgacgtac gtgccctac (SEQ ID NO: 113). The double strand break site and overhang region is underlined; the enzyme cuts after C13, i.e., between gatggtgacgtac and gtgccctac. Plant optimized nucleotide sequences for an engineered endonuclease (SEQ ID NO: 114) encoding an engineered MS26++ endonuclease were designed to bind and make double-strand breaks at the selected TS-MS26 target site.
Example 10
Transformation of Maize Immature Embryos
[0680] Transformation can be accomplished by various methods known to be effective in plants, including particle-mediated delivery, Agrobacterium-mediated transformation, PEG-mediated delivery, and electroporation.
[0681] a. Particle-Mediated Delivery
[0682] Transformation of maize immature embryos using particle delivery is performed as follows. Media recipes follow below.
[0683] The ears are husked and surface sterilized in 30% Clorox bleach plus 0.5% Micro detergent for 20 minutes, and rinsed two times with sterile water. The immature embryos are isolated and placed embryo axis side down (scutellum side up), 25 embryos per plate, on 560Y medium for 4 hours and then aligned within the 2.5-cm target zone in preparation for bombardment. Alternatively, isolated embryos are placed on 560L (Initiation medium) and placed in the dark at temperatures ranging from 26° C. to 37° C. for 8 to 24 hours prior to placing on 560Y for 4 hours at 26° C. prior to bombardment as described above.
[0684] Plasmids containing the double strand break inducing agent and donor DNA are constructed using standard molecular biology techniques and co-bombarded with plasmids containing the developmental genes ODP2 (AP2 domain transcription factor ODP2 (Ovule development protein 2); US20090328252 A1) and Wushel (US2011/0167516).
[0685] The plasmids and DNA of interest are precipitated onto 0.6 μm (average diameter) gold pellets using a water-soluble cationic lipid transfection reagent as follows. DNA solution is prepared on ice using 1 μg of plasmid DNA and optionally other constructs for co-bombardment such as 50 ng (0.5 μl) of each plasmid containing the developmental genes ODP2 (AP2 domain transcription factor ODP2 (Ovule development protein 2); US20090328252 A1) and Wushel. To the pre-mixed DNA, 20 μl of prepared gold particles (15 mg/ml) and 5 μl of a water-soluble cationic lipid transfection reagent are added in water and mixed carefully. Gold particles are pelleted in a microfuge at 10,000 rpm for 1 min and supernatant is removed. The resulting pellet is carefully rinsed with 100 μl of 100% EtOH without resuspending the pellet and the EtOH rinse is carefully removed. 105 μl of 100% EtOH is added and the particles are resuspended by brief sonication. Then, 10 μl is spotted onto the center of each macrocarrier and allowed to dry about 2 minutes before bombardment.
[0686] Alternatively, the plasmids and DNA of interest are precipitated onto 1.1 μm (average diameter) tungsten pellets using a calcium chloride (CaCl2) precipitation procedure by mixing 100 μl prepared tungsten particles in water, 10 μl (1 μg) DNA in Tris EDTA buffer (1 μg total DNA), 100 μl 2.5 M CaCl2, and 10 μl 0.1 M spermidine. Each reagent is added sequentially to the tungsten particle suspension, with mixing. The final mixture is sonicated briefly and allowed to incubate under constant vortexing for 10 minutes. After the precipitation period, the tubes are centrifuged briefly, liquid is removed, and the particles are washed with 500 μl 100% ethanol, followed by a 30 second centrifugation. Again, the liquid is removed, and 105 μl of 100% ethanol is added to the final tungsten particle pellet. For particle gun bombardment, the tungsten/DNA particles are briefly sonicated. 10 μl of the tungsten/DNA particles is spotted onto the center of each macrocarrier, after which the spotted particles are allowed to dry about 2 minutes before bombardment.
[0687] The sample plates are bombarded at level #4 with a Biorad Helium Gun. All samples receive a single shot at 450 PSI, with a total of ten aliquots taken from each tube of prepared particles/DNA.
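As a non-limiting illustration, the per-shot DNA and gold load implied by the gold-particle preparation above can be estimated with a short calculation. The sketch below assumes that all DNA binds to the particles and that nothing is lost during the rinse, which the text does not state; the function names are hypothetical and introduced only for this illustration.

```python
# Rough per-shot load for the gold-particle preparation described above.
# Assumption (not stated in the text): complete DNA binding, no loss on rinsing.

PLASMID_NG = 1000.0        # 1 ug of plasmid DNA
HELPER_NG_EACH = 50.0      # 50 ng each of the ODP2 and Wushel plasmids (optional)
N_HELPERS = 2
GOLD_UL = 20.0             # volume of prepared gold particles
GOLD_MG_PER_ML = 15.0      # concentration of prepared gold particles
RESUSPENSION_UL = 105.0    # final volume of 100% EtOH
SPOT_UL = 10.0             # volume spotted on each macrocarrier


def dna_per_shot_ng(include_helpers=True):
    """Nanograms of DNA carried by one 10 ul spot."""
    total_ng = PLASMID_NG
    if include_helpers:
        total_ng += HELPER_NG_EACH * N_HELPERS
    return total_ng * SPOT_UL / RESUSPENSION_UL


def gold_per_shot_ug():
    """Micrograms of gold carried by one 10 ul spot."""
    total_gold_ug = GOLD_UL * GOLD_MG_PER_ML  # ul * mg/ml = ug
    return total_gold_ug * SPOT_UL / RESUSPENSION_UL


def max_full_shots():
    """Number of complete 10 ul aliquots available from one tube."""
    return int(RESUSPENSION_UL // SPOT_UL)


if __name__ == "__main__":
    print(f"DNA per shot:  {dna_per_shot_ng():.1f} ng")
    print(f"Gold per shot: {gold_per_shot_ug():.1f} ug")
    print(f"Full shots per tube: {max_full_shots()}")
```

Under these assumptions each tube yields ten full aliquots, which agrees with the ten aliquots taken from each tube of prepared particles/DNA.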
[0688] Following bombardment, the embryos are incubated on 560P (maintenance medium) for 12 to 48 hours at temperatures ranging from 26° C. to 37° C., and then placed at 26° C. After 5 to 7 days the embryos are transferred to 560R selection medium containing 3 mg/liter Bialaphos, and subcultured every 2 weeks at 26° C. After approximately 10 weeks of selection, selection-resistant callus clones are transferred to 288J medium to initiate plant regeneration. Following somatic embryo maturation (2-4 weeks), well-developed somatic embryos are transferred to medium for germination and transferred to a lighted culture room. Approximately 7-10 days later, developing plantlets are transferred to 272V hormone-free medium in tubes for 7-10 days until plantlets are well established. Plants are then transferred to inserts in flats (equivalent to a 2.5'' pot) containing potting soil and grown for 1 week in a growth chamber, subsequently grown an additional 1-2 weeks in the greenhouse, then transferred to Classic 600 pots (1.6 gallon) and grown to maturity. Plants are monitored and scored for transformation efficiency, and/or modification of regenerative capabilities.
[0689] Initiation medium (560L) comprises 4.0 g/l N6 basal salts (SIGMA C-1416), 1.0 ml/l Eriksson's Vitamin Mix (1000×SIGMA-1511), 0.5 mg/l thiamine HCl, 20.0 g/l sucrose, 1.0 mg/l 2,4-D, and 2.88 g/l L-proline (brought to volume with D-I H2O following adjustment to pH 5.8 with KOH); 2.0 g/l Gelrite (added after bringing to volume with D-I H2O); and 8.5 mg/l silver nitrate (added after sterilizing the medium and cooling to room temperature).
[0690] Maintenance medium (560P) comprises 4.0 g/l N6 basal salts (SIGMA C-1416), 1.0 ml/l Eriksson's Vitamin Mix (1000×SIGMA-1511), 0.5 mg/l thiamine HCl, 30.0 g/l sucrose, 2.0 mg/l 2,4-D, and 0.69 g/l L-proline (brought to volume with D-I H2O following adjustment to pH 5.8 with KOH); 3.0 g/l Gelrite (added after bringing to volume with D-I H2O); and 0.85 mg/l silver nitrate (added after sterilizing the medium and cooling to room temperature).
[0691] Bombardment medium (560Y) comprises 4.0 g/l N6 basal salts (SIGMA C-1416), 1.0 ml/l Eriksson's Vitamin Mix (1000×SIGMA-1511), 0.5 mg/l thiamine HCl, 120.0 g/l sucrose, 1.0 mg/l 2,4-D, and 2.88 g/l L-proline (brought to volume with D-I H2O following adjustment to pH 5.8 with KOH); 2.0 g/l Gelrite (added after bringing to volume with D-I H2O); and 8.5 mg/l silver nitrate (added after sterilizing the medium and cooling to room temperature).
Selection medium (560R) comprises 4.0 g/l N6 basal salts (SIGMA C-1416), 1.0 ml/l Eriksson's Vitamin Mix (1000×SIGMA-1511), 0.5 mg/l thiamine HCl, 30.0 g/l sucrose, and 2.0 mg/l 2,4-D (brought to volume with D-I H2O following adjustment to pH 5.8 with KOH); 3.0 g/l Gelrite (added after bringing to volume with D-I H2O); and 0.85 mg/l silver nitrate and 3.0 mg/l bialaphos (both added after sterilizing the medium and cooling to room temperature).
[0692] Plant regeneration medium (288J) comprises 4.3 g/l MS salts (GIBCO 11117-074), 5.0 ml/l MS vitamins stock solution (0.100 g/l nicotinic acid, 0.02 g/l thiamine HCL, 0.10 g/l pyridoxine HCL, and 0.40 g/l glycine brought to volume with polished D-I H2O) (Murashige and Skoog (1962) Physiol. Plant. 15:473), 100 mg/l myo-inositol, 0.5 mg/l zeatin, 60 g/l sucrose, and 1.0 ml/l of 0.1 mM abscisic acid (brought to volume with polished D-I H2O after adjusting to pH 5.6); 3.0 g/l Gelrite (added after bringing to volume with D-I H2O); and 1.0 mg/l indoleacetic acid and 3.0 mg/l bialaphos (added after sterilizing the medium and cooling to 60° C.). Hormone-free medium (272V) comprises 4.3 g/l MS salts (GIBCO 11117-074), 5.0 ml/l MS vitamins stock solution (0.100 g/l nicotinic acid, 0.02 g/l thiamine HCL, 0.10 g/l pyridoxine HCL, and 0.40 g/l glycine brought to volume with polished D-I H2O), 0.1 g/l myo-inositol, and 40.0 g/l sucrose (brought to volume with polished D-I H2O after adjusting pH to 5.6); and 6 g/l bacto-agar (added after bringing to volume with polished D-I H2O), sterilized and cooled to 60° C.
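As a non-limiting illustration, the per-liter media recipes above can be scaled to an arbitrary batch volume. The sketch below encodes only the bombardment medium (560Y) quantities given above; the dictionary layout and the function name scale_recipe are assumptions made for this illustration, and the order of addition (pH adjustment, Gelrite after bringing to volume, silver nitrate after sterilizing and cooling) is not modeled.

```python
# Per-liter quantities for bombardment medium 560Y, taken from the recipe above.
# Keys carry the unit so that scaled values remain unambiguous.
MEDIUM_560Y_PER_LITER = {
    "N6 basal salts (g)": 4.0,
    "Eriksson's Vitamin Mix, 1000x (ml)": 1.0,
    "thiamine HCl (mg)": 0.5,
    "sucrose (g)": 120.0,
    "2,4-D (mg)": 1.0,
    "L-proline (g)": 2.88,
    "Gelrite (g)": 2.0,
    "silver nitrate (mg)": 8.5,
}


def scale_recipe(per_liter, volume_l):
    """Scale a per-liter recipe to the requested batch volume in liters."""
    if volume_l <= 0:
        raise ValueError("batch volume must be positive")
    return {name: round(amount * volume_l, 4) for name, amount in per_liter.items()}


if __name__ == "__main__":
    # Example: quantities for a 0.5 liter batch of 560Y.
    for name, amount in scale_recipe(MEDIUM_560Y_PER_LITER, 0.5).items():
        print(f"{name}: {amount}")
```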
[0693] b. Agrobacterium-Mediated Transformation
[0694] Agrobacterium-mediated transformation was performed essentially as described in Djukanovic et al. (2006) Plant Biotech J 4:345-57. Briefly, 10-12 day old immature embryos (0.8-2.5 mm in size) were dissected from sterilized kernels and placed into liquid medium (4.0 g/L N6 Basal Salts (Sigma C-1416), 1.0 ml/L Eriksson's Vitamin Mix (Sigma E-1511), 1.0 mg/L thiamine HCl, 1.5 mg/L 2,4-D, 0.690 g/L L-proline, 68.5 g/L sucrose, 36.0 g/L glucose, pH 5.2). After embryo collection, the medium was replaced with 1 ml Agrobacterium at a concentration of 0.35-0.45 OD550. Maize embryos were incubated with Agrobacterium for 5 min at room temperature, then the mixture was poured onto a media plate containing 4.0 g/L N6 Basal Salts (Sigma C-1416), 1.0 ml/L Eriksson's Vitamin Mix (Sigma E-1511), 1.0 mg/L thiamine HCl, 1.5 mg/L 2,4-D, 0.690 g/L L-proline, 30.0 g/L sucrose, 0.85 mg/L silver nitrate, 0.1 nM acetosyringone, and 3.0 g/L Gelrite, pH 5.8. Embryos were incubated axis down, in the dark for 3 days at 20° C., then incubated 4 days in the dark at 28° C., then transferred onto new media plates containing 4.0 g/L N6 Basal Salts (Sigma C-1416), 1.0 ml/L Eriksson's Vitamin Mix (Sigma E-1511), 1.0 mg/L thiamine HCl, 1.5 mg/L 2,4-D, 0.69 g/L L-proline, 30.0 g/L sucrose, 0.5 g/L MES buffer, 0.85 mg/L silver nitrate, 3.0 mg/L Bialaphos, 100 mg/L carbenicillin, and 6.0 g/L agar, pH 5.8. Embryos were subcultured every three weeks until transgenic events were identified. Somatic embryogenesis was induced by transferring a small amount of tissue onto regeneration medium (4.3 g/L MS salts (Gibco 11117), 5.0 ml/L MS Vitamins Stock Solution, 100 mg/L myo-inositol, 0.1 μM ABA, 1 mg/L IAA, 0.5 mg/L zeatin, 60.0 g/L sucrose, 1.5 mg/L Bialaphos, 100 mg/L carbenicillin, 3.0 g/L Gelrite, pH 5.6) and incubation in the dark for two weeks at 28° C. All material with visible shoots and roots was transferred onto media containing 4.3 g/L MS salts (Gibco 11117), 5.0 ml/L MS Vitamins Stock Solution, 100 mg/L myo-inositol, 40.0 g/L sucrose, 1.5 g/L Gelrite, pH 5.6, and incubated under artificial light at 28° C. One week later, plantlets were moved into glass tubes containing the same medium and grown until they were sampled and/or transplanted into soil.
Example 11
Transient Expression of BBM Enhances Transformation
[0695] Parameters of the transformation protocol can be modified to ensure that the BBM activity is transient. One such method involves precipitating the BBM-containing plasmid in a manner that allows for transcription and expression, but precludes subsequent release of the DNA, for example, by using the chemical PEI. In one example, the BBM plasmid is precipitated onto gold particles with PEI, while the transgenic expression cassette (UBI::moPAT˜GFPm::PinII; moPAT is the maize optimized PAT gene) to be integrated is precipitated onto gold particles using the standard calcium chloride method.
[0696] Briefly, gold particles were coated with PEI as follows. First, the gold particles were washed. Thirty-five mg of gold particles, 1.0 μm in average diameter (A.S.I. #162-0010), were weighed out in a microcentrifuge tube, and 1.2 ml absolute EtOH was added and vortexed for one minute. The tube was incubated for 15 minutes at room temperature and then centrifuged at high speed using a microfuge for 15 minutes at 4° C. The supernatant was discarded and a fresh 1.2 ml aliquot of ethanol (EtOH) was added, vortexed for one minute, centrifuged for one minute, and the supernatant again discarded (this was repeated twice). A fresh 1.2 ml aliquot of EtOH was added, and this suspension (gold particles in EtOH) was stored at -20° C. for weeks. To coat particles with polyethylenimine (PEI; Sigma #P3143), 250 μl of the washed gold particle/EtOH mix was centrifuged and the EtOH discarded. The particles were washed once in 100 μl ddH2O to remove residual ethanol, 250 μl of 0.25 mM PEI was added, followed by a pulse-sonication to suspend the particles and then the tube was plunged into a dry ice/EtOH bath to flash-freeze the suspension, which was then lyophilized overnight. At this point, dry, coated particles could be stored at -80° C. for at least 3 weeks. Before use, the particles were rinsed 3 times with 250 μl aliquots of 2.5 mM HEPES buffer, pH 7.1, with 1× pulse-sonication, and then a quick vortex before each centrifugation. The particles were then suspended in a final volume of 250 μl HEPES buffer. A 25 μl aliquot of the particles was added to fresh tubes before attaching DNA. To attach uncoated DNA, the particles were pulse-sonicated, then 1 μg of DNA (in 5 μl water) was added, followed by mixing by pipetting up and down a few times with a Pipetman and incubated for 10 minutes. The particles were spun briefly (i.e. 10 seconds), the supernatant removed, and 60 μl EtOH added. The particles with PEI-precipitated DNA-1 were washed twice in 60 μl of EtOH. 
The particles were centrifuged, the supernatant discarded, and the particles were resuspended in 45 μl water. To attach the second DNA (DNA-2), precipitation using a water-soluble cationic lipid transfection reagent was used. The 45 μl of particles/DNA-1 suspension was briefly sonicated, and then 5 μl of 100 ng/μl of DNA-2 and 2.5 μl of the water-soluble cationic lipid transfection reagent were added. The solution was placed on a rotary shaker for 10 minutes, centrifuged at 10,000 g for 1 minute. The supernatant was removed, and the particles resuspended in 60 μl of EtOH. The solution was spotted onto macrocarriers and the gold particles onto which DNA-1 and DNA-2 had been sequentially attached were delivered into scutellar cells of 10 DAP Hi-II immature embryos using a standard protocol for the PDS-1000. For this experiment, the DNA-1 plasmid contained a UBI::RFP::pinII expression cassette, and DNA-2 contained a UBI::CFP::pinII expression cassette. Two days after bombardment, transient expression of both the CFP and RFP fluorescent markers was observed as numerous red & blue cells on the surface of the immature embryo. The embryos were then placed on non-selective culture medium and allowed to grow for 3 weeks before scoring for stable colonies. After this 3-week period, 10 multicellular, stably-expressing blue colonies were observed, in comparison to only one red colony. This demonstrated that PEI-precipitation could be used to effectively introduce DNA for transient expression while dramatically reducing integration of the PEI-introduced DNA and thus reducing the recovery of RFP-expressing transgenic events. In this manner, PEI-precipitation can be used to deliver transient expression of BBM and/or WUS2.
[0697] For example, the particles are first coated with UBI::BBM::pinII using PEI, then coated with UBI::moPAT˜YFP using a water-soluble cationic lipid transfection reagent, and then bombarded into scutellar cells on the surface of immature embryos. PEI-mediated precipitation results in a high frequency of transiently expressing cells on the surface of the immature embryo and extremely low frequencies of recovery of stable transformants. Thus, it is expected that the PEI-precipitated BBM cassette expresses transiently and stimulates a burst of embryogenic growth on the bombarded surface of the tissue (i.e. the scutellar surface), but this plasmid will not integrate. The PAT˜GFP plasmid released from the Ca++/gold particles is expected to integrate and express the selectable marker at a frequency that results in substantially improved recovery of transgenic events. As a control treatment, PEI-precipitated particles containing a UBI::GUS::pinII (instead of BBM) are mixed with the PAT˜GFP/Ca++ particles. Immature embryos from both treatments are moved onto culture medium containing 3 mg/l bialaphos. After 6-8 weeks, it is expected that GFP+, bialaphos-resistant calli will be observed in the PEI/BBM treatment at a much higher frequency relative to the control treatment (PEI/GUS).
[0698] As an alternative method, the BBM plasmid is precipitated onto gold particles with PEI, and then introduced into scutellar cells on the surface of immature embryos, and subsequent transient expression of the BBM gene elicits a rapid proliferation of embryogenic growth. During this period of induced growth, the explants are treated with Agrobacterium using standard methods for maize (see Example 1), with T-DNA delivery into the cell introducing a transgenic expression cassette such as UBI::moPAT˜GFPm::pinII. After co-cultivation, explants are allowed to recover on normal culture medium, and then are moved onto culture medium containing 3 mg/l bialaphos. After 6-8 weeks, it is expected that GFP+, bialaphos-resistant calli will be observed in the PEI/BBM treatment at a much higher frequency relative to the control treatment (PEI/GUS).
[0699] It may be desirable to "kick start" callus growth by transiently expressing the BBM and/or WUS2 polynucleotide products. This can be done by delivering BBM and WUS2 5'-capped polyadenylated RNA, expression cassettes containing BBM and WUS2 DNA, or BBM and/or WUS2 proteins. All of these molecules can be delivered using a biolistics particle gun. For example 5'-capped polyadenylated BBM and/or WUS2 RNA can easily be made in vitro using Ambion's mMessage mMachine kit. RNA is co-delivered along with DNA containing a polynucleotide of interest and a marker used for selection/screening such as Ubi::moPAT˜GFPm::PinII. It is expected that the cells receiving the RNA will immediately begin dividing more rapidly and a large portion of these will have integrated the agronomic gene. These events can further be validated as being transgenic clonal colonies because they will also express the PAT˜GFP fusion protein (and thus will display green fluorescence under appropriate illumination). Plants regenerated from these embryos can then be screened for the presence of the polynucleotide of interest.
Example 12
[0700] DNA Constructs to Test the Guide RNA/Cas Endonuclease System for Soybean Genome Modifications
[0701] To test if a guide RNA/Cas endonuclease system, similar to that described in Example 1 for maize, is functional in a dicot such as soybean, a Cas9 (SO) gene (SEQ ID NO:115), soybean codon optimized from Streptococcus pyogenes M1 GAS (SF370), was expressed with a strong soybean constitutive promoter, GM-EF1A2 (US patent application 20090133159) (SEQ ID NO: 116). A simian vacuolating virus 40 (SV40) large T-antigen nuclear localization signal (SEQ ID NO:117), representing the amino acid residues PKKKRKV with a linker SRAD (SRADPKKKRKV), was added to the carboxyl terminus of the codon optimized Cas9 to facilitate transporting the codon optimized Cas9 protein (SEQ ID NO:118) to the nucleus. The codon optimized Cas9 gene was synthesized as two pieces by GenScript USA Inc. (Piscataway, N.J.) and cloned in frame downstream of the GM-EF1A2 promoter to make DNA construct QC782 shown in FIG. 7 (SEQ ID NO:119).
[0702] Plant U6 RNA polymerase III promoters have been cloned and characterized from species such as Arabidopsis and Medicago truncatula (Waibel and Filipowicz, NAR 18:3451-3458 (1990); Li et al., J. Integrat. Plant Biol. 49:222-229 (2007); Kim and Nam, Plant Mol. Biol. Rep. 31:581-593 (2013); Wang et al., RNA 14:903-913 (2008)). Soybean U6 small nuclear RNA (snRNA) genes were identified herein by searching the public soybean variety Williams82 genomic sequence using the Arabidopsis U6 gene coding sequence. Approximately 0.5 kb of genomic DNA sequence upstream of the first G nucleotide of a U6 gene was selected to be used as an RNA polymerase III promoter, for example, the GM-U6-13.1 promoter (SEQ ID NO:120), to express guide RNA to direct Cas9 nuclease to a designated genomic site. The guide RNA coding sequence was 76 bp long (FIG. 8B) and comprised a 20 bp variable targeting domain from a chosen soybean genomic target site on the 5' end and a tract of 4 or more T residues as a transcription terminator on the 3' end (SEQ ID NO:121, FIG. 8B). The first nucleotide of the 20 bp variable targeting domain was a G residue to be used by RNA polymerase III for transcription. The U6 gene promoter and the complete guide RNA were synthesized and then cloned into an appropriate vector to make, for example, DNA construct QC783 shown in FIG. 8A (SEQ ID NO:122). Other soybean U6 homologous gene promoters were similarly cloned and used for small RNA expression.
[0703] Since the Cas9 endonuclease and the guide RNA need to form a protein/RNA complex to mediate site-specific DNA double strand cleavage, the Cas9 endonuclease and guide RNA must be expressed in the same cells. To improve their co-expression and presence, the Cas9 endonuclease and guide RNA expression cassettes were linked into a single DNA construct, for example, QC815 in FIG. 9 A (SEQ ID NO:123), which was then used to transform soybean cells to test the soybean optimized guide RNA/Cas system for genome modification. Similar DNA constructs were made to target different genomic sites using guide RNAs containing different target sequences.
Example 13
[0704] Selection of Soybean Genomic Sites to be Cleaved by the Guide RNA/Cas Endonuclease System
[0705] A region of soybean chromosome 4 (Gm04) was selected to test if the soybean optimized guide RNA/Cas endonuclease system could recognize, cleave, and mutate soybean chromosomal DNA through imprecise non-homologous end-joining (NHEJ) repair. Two genomic target regions were selected: one close to a predicted gene Glyma04g39780.1 at 114.13 cM, herein named the DD20 locus (FIG. 10A), and another close to Glyma04g39550.1 at 111.95 cM, herein named the DD43 locus (FIG. 10B). Each 20 bp variable targeting domain of the guide RNA started with a G residue required by RNA polymerase III and was followed in the soybean genome by a 3 bp PAM motif (Table 11). The chromosome positions of the soybean genomic target sites in close proximity to the PAM sequences were determined by blast searching the public soybean variety Williams82 genomic sequence. The soybean genomic target sites DD20CR1 (SEQ ID NO: 125), DD20CR2 (SEQ ID NO: 126), and DD43CR1 (SEQ ID NO: 127) were each identified as unique in the soybean genome, while a second identical 23 bp genomic target site for DD43CR2 (SEQ ID NO: 128) was found at Gm06:12072339-12072361, so there are two potential cleavage sites targeted by the DD43CR2 guide RNA. Both DD43CR1 and DD43CR2 are complementary strand sequences, indicated by "c" after the positions.
TABLE 11 Soybean genomic target sites for a guide RNA/Cas endonuclease system.
Chromosome | Positions | Designation | Genomic Target Site | PAM
Gm04, 114.13 cM | 45936311-45936333 | DD20CR1 | GGAACTGACACACGACATGA | TGG
Gm04, 114.13 cM | 45936324-45936346 | DD20CR2 | GACATGATGGAACGTGACTA | AGG
Gm04, 111.95 cM | 45731921-45731943c | DD43CR1 | GTCCCTTGTACTTGTACGTA | CGG
Gm04, 111.95 cM | 45731895-45731917c | DD43CR2 | GTATTCTAGAAAAGAGGAAT | TGG
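As a non-limiting illustration, the entries of Table 11 can be checked for internal consistency: each target site should be a 20 bp sequence starting with G, followed by an NGG PAM, and spanning 23 chromosome positions. Because DD20CR1 and DD20CR2 lie on the same strand and overlap by ten positions, their sequences should also agree over the shared coordinates. The helper names below are hypothetical and introduced only for this sketch.

```python
# Table 11 entries: (20 bp target site, PAM, start position, end position).
TARGETS = {
    "DD20CR1": ("GGAACTGACACACGACATGA", "TGG", 45936311, 45936333),
    "DD20CR2": ("GACATGATGGAACGTGACTA", "AGG", 45936324, 45936346),
    "DD43CR1": ("GTCCCTTGTACTTGTACGTA", "CGG", 45731921, 45731943),
    "DD43CR2": ("GTATTCTAGAAAAGAGGAAT", "TGG", 45731895, 45731917),
}


def check_site(target, pam, start, end):
    """True if the entry matches the G-N(19)-NGG layout over 23 positions."""
    return (
        len(target) == 20
        and target[0] == "G"
        and len(pam) == 3
        and pam[1:] == "GG"
        and end - start + 1 == 23
    )


def overlap_consistent(site_a, site_b):
    """For two same-strand sites with site_b starting inside site_a,
    check that the sequences agree over the shared positions."""
    seq_a = site_a[0] + site_a[1]
    seq_b = site_b[0] + site_b[1]
    offset = site_b[2] - site_a[2]
    if not 0 < offset < len(seq_a):
        raise ValueError("sites do not overlap in the expected orientation")
    shared = len(seq_a) - offset
    return seq_a[offset:] == seq_b[:shared]


if __name__ == "__main__":
    for name, entry in TARGETS.items():
        print(name, "layout ok:", check_site(*entry))
    print("DD20CR1/DD20CR2 overlap consistent:",
          overlap_consistent(TARGETS["DD20CR1"], TARGETS["DD20CR2"]))
```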
[0706] Guide RNA expression cassettes each comprising a variable targeting domain targeting one of the DD20CR1, DD20CR2, and DD43CR2 genomic target sites were similarly constructed and linked to the soybean Cas9 expression cassette to make DNA constructs QC817, QC818, and QC816, which are similar to QC815 in FIG. 9 A (SEQ ID NO:123) except for the 20 bp variable targeting domain of the guide RNA.
[0707] Since up to six continuous mismatches in the 5' region of the genomic target site (protospacer) with the 20 bp variable targeting domain can be tolerated, i.e., a continuous stretch of 14 base pairs between the variable targeting domain and the crRNA sequence proximate to the PAM is sufficient for efficient target cleavage, any 23 bp genomic DNA sequence following the pattern N(20)NGG can be selected as a target site for the guide RNA/Cas endonuclease system. The last NGG is the PAM sequence, which should not be included in the 20 bp variable targeting domain of the guide RNA. If the first N is not endogenously a G residue, it must be replaced with a G residue in the guide RNA target sequence to accommodate RNA polymerase III, which should not sacrifice recognition specificity of the target site by the guide RNA.
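As a non-limiting illustration, the N(20)NGG selection rule can be sketched as a scan of both strands of a DNA sequence, followed by replacement of the first nucleotide with G for RNA polymerase III transcription. The function names are hypothetical, the scan does not evaluate uniqueness of a site within the genome, and the sketch assumes input restricted to A, C, G and T.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_target_sites(seq):
    """List (strand, start, target, pam) for every N(20)NGG match.

    start is the 0-based index on the scanned strand; for the "-" strand
    it refers to the reverse complement of the input sequence.
    """
    seq = seq.upper()
    sites = []
    for strand, scanned in (("+", seq), ("-", reverse_complement(seq))):
        # A lookahead is used so that overlapping sites are all reported.
        for match in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", scanned):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


def variable_targeting_domain(target):
    """20 nt variable targeting domain with the first nucleotide forced to G."""
    if len(target) != 20:
        raise ValueError("target site must be 20 nt long")
    return "G" + target[1:]


if __name__ == "__main__":
    # DD20CR1 target site plus its PAM, as listed in Table 11.
    example = "GGAACTGACACACGACATGA" + "TGG"
    for strand, start, target, pam in find_target_sites(example):
        print(strand, start, variable_targeting_domain(target), pam)
```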
Example 14
[0708] Delivery of the Guide RNA/Cas Endonuclease System DNA to Soybean by Transient Transformation
[0709] The soybean optimized Cas9 endonuclease and guide RNA expression cassettes were delivered to young soybean somatic embryos in the form of embryogenic suspension cultures by particle gun bombardment. Soybean embryogenic suspension cultures were induced as follows. Cotyledons (˜3 mm in length) were dissected from surface sterilized, immature seeds and were cultured for 6-10 weeks in the light at 26° C. on a Murashige and Skoog (MS) media containing 0.7% agar and supplemented with 10 mg/ml 2,4-D (2,4-Dichlorophenoxyacetic acid). Globular stage somatic embryos, which produced secondary embryos, were then excised and placed into flasks containing liquid MS medium supplemented with 2,4-D (10 mg/ml) and cultured in the light on a rotary shaker. After repeated selection for clusters of somatic embryos that multiplied as early, globular staged embryos, the soybean embryogenic suspension cultures were maintained in 35 ml liquid media on a rotary shaker, 150 rpm, at 26° C. with fluorescent lights on a 16:8 hour day/night schedule. Cultures were subcultured every two weeks by inoculating approximately 35 mg of tissue into 35 ml of the same fresh liquid MS medium.
[0710] Soybean embryogenic suspension cultures were then transformed by the method of particle gun bombardment using a DuPont Biolistic® PDS1000/HE instrument (Bio-Rad Laboratories, Hercules, Calif.). To 50 μl of a 60 mg/ml 1.0 μm gold particle suspension were added (in order): 30 μl of 30 ng/μl QC815 DNA fragment U6-13.1:DD43CR1+EF1A2:CAS9 as an example, 20 μl of 0.1 M spermidine, and 25 μl of 5 M CaCl2. The particle preparation was then agitated for 3 minutes, spun in a centrifuge for 10 seconds and the supernatant removed. The DNA-coated particles were then washed once in 400 μl 100% ethanol and resuspended in 45 μl of 100% ethanol. The DNA/particle suspension was sonicated three times for one second each. Then 5 μl of the DNA-coated gold particles was loaded on each macro carrier disk.
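As a non-limiting illustration, the quantities in the particle preparation above imply the following final reagent concentrations in the precipitation mix and the following load per macro carrier disk. The sketch assumes simple additive volumes, complete binding of DNA to the gold particles, and no loss during the ethanol wash; none of these assumptions is stated in the text, and the function names are hypothetical.

```python
# Volumes (ul) and concentrations taken from the preparation described above.
GOLD_UL, GOLD_MG_PER_ML = 50.0, 60.0
DNA_UL, DNA_NG_PER_UL = 30.0, 30.0
SPERMIDINE_UL, SPERMIDINE_M = 20.0, 0.1
CACL2_UL, CACL2_M = 25.0, 5.0
RESUSPEND_UL, LOAD_UL = 45.0, 5.0


def precipitation_mix():
    """Total volume (ul) and final molarities in the precipitation mix."""
    total_ul = GOLD_UL + DNA_UL + SPERMIDINE_UL + CACL2_UL
    return {
        "total_ul": total_ul,
        "cacl2_M": CACL2_UL * CACL2_M / total_ul,
        "spermidine_M": SPERMIDINE_UL * SPERMIDINE_M / total_ul,
    }


def load_per_disk():
    """DNA (ng) and gold (mg) on one disk, assuming no losses."""
    total_dna_ng = DNA_UL * DNA_NG_PER_UL
    total_gold_mg = GOLD_UL * GOLD_MG_PER_ML / 1000.0  # ul * mg/ml / 1000 = mg
    return {
        "dna_ng": total_dna_ng * LOAD_UL / RESUSPEND_UL,
        "gold_mg": total_gold_mg * LOAD_UL / RESUSPEND_UL,
    }


if __name__ == "__main__":
    print(precipitation_mix())
    print(load_per_disk())
```

Under these assumptions, 900 ng of DNA and 3 mg of gold are distributed over nine 5 μl loads.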
[0711] Approximately 100 mg of a two-week-old suspension culture was placed in an empty 60×15 mm Petri dish and the residual liquid removed from the tissue with a pipette. Membrane rupture pressure was set at 1100 psi and the chamber was evacuated to a vacuum of 28 inches mercury. The tissue was placed approximately 3.5 inches away from the retaining screen and bombarded once. The tissue clumps were rearranged and bombarded another time. A minimum amount of liquid MS media without 2,4-D supplement was added to the tissue to prevent the cultures from drying or overgrowing. The 60×15 mm Petri dish was sealed in a 100×25 mm Petri dish containing agar solid MS media as another measure to keep the tissues from drying up. The tissues were harvested seven days later and genomic DNA was extracted for PCR analysis.
Example 15
[0712] Analysis of Guide RNA/Cas Endonuclease System Mediated Site-Specific NHEJ by Deep Sequencing
[0713] To evaluate DNA double strand cleavage at a soybean genomic target site mediated by the guide RNA/Cas endonuclease system, a region of approximately 100 bp genomic DNA surrounding the target site was amplified by PCR and the PCR product was then sequenced to check for mutations at the target site resulting from NHEJ. The region was first amplified by 20 cycles of PCR with Phusion High Fidelity mastermix (New England Biolabs) from 100 ng genomic DNA using gene-specific primers that also contain adaptors and amplicon-specific barcode sequences needed for a second round PCR and subsequent sequence analysis. For example, the first PCR for the four experiments listed in Table 12 was done using primers DD20-S3 (SEQ ID NO:133)/DD20-A (SEQ ID NO:134), DD20-S4 (SEQ ID NO:135)/DD20-A, DD43-S3 (SEQ ID NO:136)/DD43-A (SEQ ID NO:137) and DD43-S4 (SEQ ID NO:138)/DD43-A. One microliter of the first round PCR products was further amplified by another 20 cycles of PCR using universal primers (SEQ ID NOs:140, 141) with Phusion High Fidelity mastermix. The resulting PCR products were separated on a 1.5% agarose gel and the specific DNA bands were purified with Qiagen gel purification spin columns. DNA concentrations were measured with a DNA Bioanalyzer (Agilent) and equal molar amounts of DNA for up to 12 different samples, each with a specific barcode, were mixed as one sample for Illumina deep sequencing analysis. Single read 100 nucleotide-length deep sequencing was performed at a DuPont core facility on an Illumina MiSeq Personal Sequencer with a 40% (v/v) spike of PhiX control v3 (Illumina, FC-110-3001) to off-set sequence bias.
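As a non-limiting illustration, mixing equal molar amounts of barcoded amplicons can be sketched by converting each measured mass concentration to a molar concentration and computing the volume that contributes a fixed number of femtomoles. The conversion uses the commonly cited approximation of 660 g/mol per base pair of double-stranded DNA. The sample names, concentrations, and amplicon length in the usage example are hypothetical and are not taken from the experiments described above.

```python
def ng_per_ul_to_nM(conc_ng_per_ul, length_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA mass concentration to nM (equivalently fmol/ul)."""
    if length_bp <= 0:
        raise ValueError("amplicon length must be positive")
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * length_bp)


def equimolar_volumes(samples, fmol_each):
    """Volume (ul) of each sample that contributes fmol_each femtomoles.

    samples maps a sample name to (concentration in ng/ul, length in bp).
    """
    return {
        name: fmol_each / ng_per_ul_to_nM(conc, length)
        for name, (conc, length) in samples.items()
    }


if __name__ == "__main__":
    # Hypothetical Bioanalyzer readings for two barcoded amplicons.
    readings = {"sample_A": (6.6, 100), "sample_B": (13.2, 100)}
    for name, volume in equimolar_volumes(readings, fmol_each=50.0).items():
        print(f"{name}: {volume:.2f} ul")
```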
[0714] Since the genomic target site is located in the middle of the ˜100 bp long PCR amplicon (SEQ ID NOs: 142, 143, 144, 145), the 100 nucleotide-length deep sequencing is sufficient to cover the target site region. A window of 10 nucleotides centered over the expected cleavage site, i.e., 3 bp upstream of the PAM, was selected for sequence analysis. Only those reads with an indel of one or more nucleotides arising within the 10 nucleotide window and not found at a similar level in negative controls were classified as NHEJ mutations. NHEJ mutant reads of different lengths but with the same mutation were counted as a single mutation, and up to the 10 most prevalent mutations were visually confirmed to be specific mutations before they were used to calculate the % mutant reads based on the total analyzed reads containing the specific barcode and forward primer.
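As a non-limiting illustration, the window-based step of the read classification can be sketched as follows. The sketch assumes that each read has already been aligned to the reference amplicon and is supplied as a pair of equal-length gapped strings in which "-" marks a gap; the alignment itself, the comparison against negative controls, and the visual confirmation are outside the sketch, and the function names are hypothetical.

```python
def cleavage_index(target_start, target_len=20, pam_offset=3):
    """0-based reference index of the first base 3' of the expected cut,
    i.e. 3 bp upstream of the PAM on the target-site strand."""
    return target_start + target_len - pam_offset


def window_bounds(cut, size=10):
    """Half-open reference interval [lo, hi) of `size` nt centered on the cut."""
    half = size // 2
    return cut - half, cut + half


def has_indel_in_window(ref_aln, read_aln, cut, size=10):
    """True if the gapped alignment shows a deletion or insertion in the window."""
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned strings must have equal length")
    lo, hi = window_bounds(cut, size)
    ref_pos = 0  # index of the next ungapped reference base
    for ref_base, read_base in zip(ref_aln, read_aln):
        if ref_base == "-":
            # Insertion located at the junction just before reference base ref_pos.
            if lo <= ref_pos <= hi:
                return True
        else:
            # Deletion of the reference base at ref_pos.
            if read_base == "-" and lo <= ref_pos < hi:
                return True
            ref_pos += 1
    return False


if __name__ == "__main__":
    # Hypothetical amplicon: DD20CR1 target site and PAM with poly-A flanks.
    ref = "A" * 10 + "GGAACTGACACACGACATGA" + "TGG" + "A" * 10
    cut = cleavage_index(10)
    deleted = ref[:26] + "--" + ref[28:]
    print(has_indel_in_window(ref, ref, cut), has_indel_in_window(ref, deleted, cut))
```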
[0715] The frequencies of NHEJ mutations revealed by deep sequencing for four target sites DD20CR1, DD20CR2, DD43CR1, DD43CR2 with one RNA polymerase III promoter GM-U6-13.1 are shown in Table 12. The visually confirmed most prevalent NHEJ mutations are shown in FIG. 11A-11D. The mutant sequences in FIG. 11A-11E are listed as SEQ ID NOs:147-201. The top row is the original reference sequence with the target site sequence underlined. Deletions in the mutated sequences are indicated by " - - - " while additions and replacements are indicated by bold letters. The total count of each mutation across different reads is given in the last column. Cas9 nuclease construct only, guide RNA construct only, and no DNA bombardment negative controls were similarly performed and analyzed, but the data are not shown since no specific mutations were detected. Other target sites and guide RNAs were also tested with similar positive results (data not shown).
TABLE 12 Target site-specific mutations introduced by guide RNA/Cas endonuclease mediated NHEJ.
Experiment | DNA | Mutant reads | Total reads | % Mutants
U6-13.1:DD20CR1 + EF1A2:CAS9 | QC817 | 339 | 710,339 | 0.048%
U6-13.1:DD20CR2 + EF1A2:CAS9 | QC818 | 419 | 693,483 | 0.060%
U6-13.1:DD43CR1 + EF1A2:CAS9 | QC815 | 489 | 682,207 | 0.072%
U6-13.1:DD43CR2 + EF1A2:CAS9 | QC816 | 917** | 539,681 | 0.170%
**At least the top 15 reads are specific mutations but only the top 10 are counted in the table to be consistent with other experiments. If all top 15 mutations are counted, the total Mutant reads is 1080 and the % Mutants is 0.200%.
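As a non-limiting illustration, the % Mutants column of Table 12 can be recomputed from the mutant and total read counts. The read counts below are those listed in the table and its footnote; the function name is hypothetical.

```python
# (DNA construct, mutant reads, total reads) as listed in Table 12.
TABLE_12 = [
    ("QC817", 339, 710_339),
    ("QC818", 419, 693_483),
    ("QC815", 489, 682_207),
    ("QC816", 917, 539_681),
]


def percent_mutants(mutant_reads, total_reads):
    """Mutant reads as a percentage of total analyzed reads."""
    if total_reads <= 0:
        raise ValueError("total reads must be positive")
    return 100.0 * mutant_reads / total_reads


if __name__ == "__main__":
    for construct, mutant, total in TABLE_12:
        print(f"{construct}: {percent_mutants(mutant, total):.3f}%")
    # Footnote case: all top 15 mutations counted for QC816.
    print(f"QC816 (top 15): {percent_mutants(1080, 539_681):.3f}%")
```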
[0716] In conclusion, our data indicate that the soybean optimized guide RNA/Cas endonuclease system is able to effectively cleave soybean endogenous genomic DNA and create imperfect NHEJ mutations at the specified genomic target sites.
Example 16
[0717] The Guide RNA/Cas Endonuclease System Delivers Double-Strand Breaks (DSBs) to the Maize Epsps Locus Resulting in Desired Point Mutations
[0718] Two maize optimized Cas9 endonucleases were developed and evaluated for their ability to introduce a double-strand break at a genomic target sequence. A first Cas9 endonuclease was as described in FIG. 1A (Example 2 and expression cassette SEQ ID NO:5). A second maize optimized Cas9 endonuclease (moCas9 endonuclease; SEQ ID NO:192) was supplemented with the SV40 nuclear localization signal by adding the signal coding sequence to the 5' end of the moCas9 coding sequence (FIG. 13). The plant moCas9 expression cassette was subsequently modified by the insertion of the ST-LS1 intron into the moCas9 coding sequence in order to enhance its expression in maize cells and to eliminate its expression in E. coli and Agrobacterium. The maize ubiquitin promoter and the potato proteinase inhibitor II gene terminator sequences complemented the moCas9 endonuclease gene designs. The structural elements of the moCas9 expression cassette are shown in FIG. 13 and its amino acid and nucleotide sequences are listed as SEQ ID Nos: 192 and 193.
[0719] A single guide RNA (sgRNA) expression cassette was essentially as described in Example 1 and shown in FIG. 1B. It consists of the U6 polymerase III maize promoter (SEQ ID NO: 9) and its cognate U6 polymerase III termination sequences (TTTTTTTT). The guide RNA (SEQ ID NO: 194) comprised a 20 nucleotide variable targeting domain (nucleotides 1-20 of SEQ ID NO: 194) followed by an RNA sequence capable of interacting with the double strand break inducing endonuclease.
[0720] A maize optimized Cas9 endonuclease target sequence (moCas9 target sequence) within the EPSPS codon sequence was complementary to the 20 nucleotide variable sequence of the guide sgRNA, which determined the site of the Cas9 endonuclease cleavage within the EPSPS coding sequence.
[0721] The moCas9 target sequence (nucleotides 25-44 of SEQ ID NO:209) was synthesized and cloned into the guide RNA-Cas9 expression vector designed for delivery of the components of the guide RNA-Cas9 system to the BMS (Black Mexican Sweet) cells through Agrobacterium-mediated transformation. The Agrobacterium T-DNA also delivered the yeast FLP site-specific recombinase and the WDV (wheat dwarf virus) replication-associated protein (replicase). Since the moCas9 target sequences were flanked by the FLP recombination targets (FRT), they were excised by FLP in maize cells forming episomal (chromosome-like) structures. Such circular DNA fragments were replicated by the WDV replicase (the origin of replication was embedded into the WDV promoter) allowing their recovery in E. coli cells. If the maize optimized Cas9 endonuclease made a double-strand break at the moCas9 target sequence, its repair might produce mutations. The procedure is described in detail in: Lyznik, L. A., Djukanovic, V., Yang, M. and Jones, S. (2012) Double-strand break-induced targeted mutagenesis in plants. In: Transgenic plants: Methods and Protocols (Dunwell, J. M. and Wetten, A. C. eds). New York Heidelberg Dordrecht London: Springer, pp. 399-416.
[0722] The guide RNA/Cas endonuclease systems using either one of the maize optimized Cas9 endonucleases described herein generated double-strand breaks in the moCas9 target sequence (Table 13). Table 13 shows the percent of the moCas9 target sequences mutagenized in the maize BMS cells using the moCas9 endonuclease of SEQ ID NO: 192 or the maize optimized Cas9 endonuclease described in FIG. 1A and expressed by the expression cassette of SEQ ID NO:5. Both guide RNA/Cas endonuclease systems generated double-strand breaks (as judged by the number of targeted mutagenesis events) in 67 to 84% of the moCas9 target sequences available on episomal DNA molecules in maize BMS cells. A sample of mutagenized EPSPS target sequences is shown in FIG. 14. This observation indicates that the maize optimized Cas9 endonuclease described herein is functional in maize cells and efficiently generates double-strand breaks at the moCas9 target sequence.
TABLE 13 Percent of the moCas9 target sequences mutagenized in the maize BMS cells by maize optimized Cas9 endonucleases.
Cas9 endonuclease version | # of moCas9 target sequences analyzed | # of intact moCas9 target sequences recovered | # of mutagenized moCas9 target sequences found | Percent mutagenesis (%)
SEQ ID NO: 193 (FIG. 13) | 81 | 13 | 68 | 84%
SEQ ID NO: 5 (FIG. 1A) | 93 | 31 | 62 | 67%
[0723] In order to accomplish targeted genome editing of the maize chromosomal EPSPS gene, a polynucleotide modification template which provided genetic information for editing the EPSPS coding sequence was created (SEQ ID NO:195) and co-delivered with the guide RNA/Cas9 system components.
[0724] As shown in FIG. 12, the polynucleotide modification template comprised three nucleotide modifications (indicated by arrows) when compared to the EPSPS genomic sequence to be edited. These three nucleotide modifications are referred to as TIPS mutations as these nucleotide modifications result in the amino acid changes T-102 to I-102 and P-106 to S-106. The first point mutation results from the substitution of the C nucleotide in the codon sequence ACT with a T nucleotide, a second mutation results from the substitution of the T nucleotide on the same codon sequence ACT with a C nucleotide to form the isoleucine codon (ATC), the third point mutation results from the substitution of the first C nucleotide in the codon sequence CCA with a T nucleotide in order to form a serine codon, TCA. (FIG. 12). Both codon sequences were located within 9 nucleotides of each other as shown in SEQ ID NO: 196: atcgcaatgcggtca. The three nucleotide modifications are shown in bold. The nucleotides between the two codon sequences were homologous to the non-edited EPSPS gene on the epsps locus. The polynucleotide modification template further comprised DNA fragments of maize EPSPS genomic sequence that were used as homologous sequence for the EPSPS gene editing. The short arm of homologous sequence (HR1--FIG. 12) was 810 base pairs long and the long arm of homologous sequence (HR2--FIG. 12) was 2,883 base pairs long (SEQ ID NO: 195).
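The effect of the three TIPS nucleotide substitutions on the encoded amino acids can be checked with a small Python sketch. The edited sequence is SEQ ID NO: 196 as given above; the wild-type sequence is not quoted in this paragraph and is reconstructed here, as an assumption, by reverting the three described substitutions (ACT at the first codon, CCA at the last). The codon table contains only the codons needed for this example, and all names are hypothetical.

```python
# Minimal codon table covering only the codons that occur in this example.
CODON_TABLE = {
    "ACT": "T", "ATC": "I", "GCA": "A",
    "ATG": "M", "CGG": "R", "CCA": "P", "TCA": "S",
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


edited = "atcgcaatgcggtca"  # SEQ ID NO: 196 (TIPS-edited)
# Assumption: wild type rebuilt from the substitutions described in the text.
wild_type = "act" + edited[3:12] + "cca"

# 0-based positions at which the two sequences differ.
diffs = [i for i, (w, e) in enumerate(zip(wild_type, edited)) if w != e]

print(translate(wild_type))  # T at codon 102, P at codon 106
print(translate(edited))     # I at codon 102, S at codon 106
print(diffs)
```

Under these assumptions the two sequences differ at exactly three positions, two in the first codon and one in the last, which corresponds to the T-102 to I-102 and P-106 to S-106 changes.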
[0725] In this example, the EPSPS polynucleotide modification template was co-delivered using particle gun bombardment as a plasmid (see template vector 1, FIG. 15) together with the guide sgRNA expression cassette and a maize optimized Cas9 endonuclease expression vector which contained the maize optimized Cas9 endonuclease expression cassette described in FIG. 1A (Example 1, SEQ ID NO:5) and also contained a moPAT selectable marker gene. Ten to eleven day-old immature embryos were placed, embryo-axis down, onto plates containing the N6 medium (Table 14) and incubated at 28° C. for 4-6 hours before bombardment. The plates were placed on the third shelf from the bottom in the PDS-1000 apparatus and bombarded at 200 psi. Post-bombardment, embryos were incubated in the dark overnight at 28° C. and then transferred to plates containing the N6-2 media for 6-8 days at 28° C. The embryos were then transferred to plates containing the N6-3 media for three weeks, followed by transferring the responding callus to plates containing the N6-4 media for an additional three-week selection. After six total weeks of selection at 28° C., a small amount of selected tissue was transferred onto the MS regeneration medium and incubated for three weeks in the dark at 28° C.
TABLE 14 Composition of Culture Media.
Culture medium | Composition
N6 | 4.0 g/L N6 Basal Salts (Sigma C-1416; Sigma-Aldrich Co., St. Louis, MO, USA), 1.0 ml/L Eriksson's Vitamin Mix (Sigma E-1511), 0.5 mg/L thiamine HCl, 190 g/L sucrose, 1.0 mg/L 2,4-dichlorophenoxyacetic acid (2,4-D), 2.88 g/L L-proline, 8.5 mg/L silver nitrate, 25 mg/L cefotaxime, and 6.36 g/L Sigma agar at pH 5.8
N6-2 | 4.0 g/L N6 Basal Salts (Sigma C-1416), 1.0 ml/L Eriksson's Vitamin Mix (Sigma E-1511), 0.5 mg/L thiamine HCl, 20 g/L sucrose, 1.0 mg/L 2,4-D, 2.88 g/L L-proline, 8.5 mg/L silver nitrate, 25 mg/L cefotaxime, and 8.5 g/L Sigma agar at pH 5.8
N6-3 | 4.0 g/L N6 Basal Salts (Sigma C-1416), 1.0 ml/L Eriksson's Vitamin Mix (Sigma E-1511), 0.5 mg/L thiamine HCl, 30 g/L sucrose, 1.5 mg/L 2,4-D, 0.69 g/L L-proline, 0.5 g/L 2-(N-morpholino)ethanesulphonic acid (MES) buffer, 0.85 mg/L silver nitrate, 5 mg/L glufosinate NH4, and 8.0 g/L Sigma agar at pH 5.8
N6-4 | 4.0 g/L N6 Basal Salts (Sigma C-1416), 1.0 ml/L Eriksson's Vitamin Mix (Sigma E-1511), 0.5 mg/L thiamine HCl, 30 g/L sucrose, 1.5 mg/L 2,4-D, 0.69 g/L L-proline, 0.5 g/L MES buffer, 0.85 mg/L silver nitrate, 3 mg/L bialaphos, and 8.0 g/L Sigma agar at pH 5.8
MS | 4.3 g/L Murashige and Skoog (MS) salts (Gibco 11117; Gibco, Grand Island, NY), 5.0 ml/L MS Vitamins Stock Solution (Sigma M3900), 100 mg/L myo-inositol, 0.1 μmol abscisic acid (ABA), 1 mg/L indoleacetic acid (IAA), 0.5 mg/L zeatin, 60.0 g/L sucrose, 3.0 mg/L bialaphos, and 8.0 g/L Sigma agar at pH 5.6
[0726] DNA was extracted by placing callus cell samples, two stainless-steel beads, and 450 μl of extraction buffer (250 mM NaCl, 200 mM Tris-HCl pH 7.4, 25 mM EDTA, 4.2 M Guanidine HCl) into each tube of a Mega titer rack. The rack was shaken in the Genogrinder at 1650 r.p.m. for 60 seconds and centrifuged at 3000×g for 20 min at 4° C. Three hundred μl of supernatant was transferred to the wells of the Unifilter 96-well DNA Binding GF/F Microplate (770-2810, Whatman, GE Healthcare). The plate was placed on the top of a Multi-well plate vacuum manifold (5017, Pall Life Sciences). Vacuum pressure was applied until the wells were completely dried. The vacuum filtration procedure was repeated one time with 100 μl extraction buffer and two times with 250 μl washing buffer (50 mM Tris-HCl pH 7.4, 200 mM NaCl, 70% ethanol). The residual ethanol was removed by placing the GF/F filter plate on an empty waste collection plate and centrifuging for 10 min at 3000×g. The DNA was eluted in 100 μl Elution Buffer (10 mM Tris-HCl, pH 8.3) and centrifuged at 3000×g for 1 min. For each sample, four PCR reactions were run. They included approximately 40 ng genomic DNA, 10 μl REDExtract-N-Amp PCR ReadyMix (R4775, Sigma-Aldrich Co.), and 5 picomoles of each primer in a total volume of 20 μl. Primer combinations for each PCR reaction are listed in Table 15.
TABLE 15 Primer combinations for PCR reactions.
PCR reaction | Primer sequence | SEQ ID NO: | PCR product
F-E2 | CCGAGGAGATCGTGCTGCA | 197 | Template randomly integrated or gene editing event
F-E2 | CAATGGCCGCATTGCAGTTC | 198 |
F-T | CCGAGGAGATCGTGCTGCA | 199 | Wild-type EPSPS allele
F-T | TGACCGCATTGCGATTCCAG | 200 |
H-T | TCCAAGTCGCTTTCCAACAGGATC | 201 | TIPS editing event
H-T | TGACCGCATTGCGATTCCAG | 202 |
F-E3 | CCGAGGAGATCGTGCTGCA | 203 | A fragment of the epsps locus for cloning and sequencing
F-E3 | ACCAAGCTGCTTCAATCCGACAAC | 204 |
[0727] The same PCR reactions were done on five samples of genomic DNA obtained from untransformed maize inbred plantlets. After an initial denaturation at 95° C. for 5 minutes, each PCR amplification was carried out over 35 cycles using DNA Engine Tetrad2 Thermal Cycler (BioRad Laboratories, Hercules, Calif.) at 94° C. for 30 sec denaturation, 68° C. for 30 sec annealing, and 72° C. for 1 min extension. PCR products F-E2, F-T and H-T were separated in 1% agarose gel at 100 Volts for 45 minutes, with 100 bp DNA Ladder (N0467S, New England Biolabs). For sequencing, the F-E3 PCR amplified fragments from selected calli were cloned into pCR 2.1-TOPO vectors using the TOPO TA Cloning Kit (Invitrogen Corp, Carlsbad, Calif.). DNA sequencing was done with BigDye Terminator chemistry on ABI 3700 capillary sequencing machines (Applied Biosystems, Foster City, Calif.). Each sample contained about 0.5 μg Topo plasmid DNA and 6.4 pmole primer E3-EPex3 Rev (ACCAAGCTGCTTCAATCCGACAAC, SEQ ID NO: 204). Sequences were analyzed using the Sequencer program.
[0728] A sample of thirty one callus events selected on media containing bialaphos (the moPAT selectable marker gene was part of the guide RNA-moCas9 expression vector) was screened for the presence of the TIPS point mutations. Twenty four events contained the TIPS point mutations integrated into genomic DNA (FIG. 16, the F-E2 treatment). Among them, six events showed the PCR amplification product of the chromosomal EPSPS gene with TIPS mutations (FIG. 16, the H-T treatment). The pair of PCR primers (one that can hybridize to the genomic epsps sequence not present in the EPSPS polynucleotide modification template and the other one binding to the edited EPSPS sequence present in the EPSPS polynucleotide modification template) distinguished the EPSPS-TIPS editing products from the wild-type epsps alleles or random insertions of the TIPS mutations. If one EPSPS allele was edited to contain the TIPS substitutions, it should be detected as a DNA fragment originating from the genomic epsps locus, regardless of whether the TIPS substitutions were selected for during the PCR amplification process. The TIPS primer was replaced with the wild-type EPSPS primer (Table 15, the F-E3 pair of primers) and the PCR amplification products were cloned into the TOPO cloning vectors and sequenced. The sequencing data represented a random sample of the genomic epsps locus sequences in one of the selected events (FIG. 17, callus A12 3360.92). FIG. 17 shows that the method disclosed herein resulted in the successful nucleotide editing of three nucleotides (FIG. 17 bold) responsible for the TIPS mutations without altering any of the other epsps nucleotides, while the moCas9 target sequence (the site of guide RNA binding underlined in FIG. 17) was not mutagenized.
[0729] Also, the other EPSPS allele was not edited indicating that only one EPSPS allele was edited in this particular event (FIG. 17, lower section).
[0730] These data further show that use of the guide RNA/Cas system for gene editing, as disclosed herein, allows gene editing events to be recovered at a high efficiency of 1 out of fewer than 10 selected events.
Example 17
[0731] The Guide RNA/Cas Endonuclease System Delivers Double-Strand Breaks to the Maize Epsps Locus Resulting in Maize Plants Containing an EPSPS-TIPS Edited Gene.
[0732] The EPSPS gene edited events were produced and selected as described in Example 16. In short, the EPSPS polynucleotide modification template was co-delivered using particle gun bombardment as a plasmid (see template vector 1, FIG. 15) together with the guide RNA expression cassette and a maize optimized Cas9 endonuclease expression vector which contained the maize optimized Cas9 endonuclease expression cassette described in FIG. 1A (Example 1, SEQ ID NO:5) and also contained a moPAT selectable marker gene.
[0733] After six weeks of selection at 28° C., a small amount of selected tissue was transferred onto the MS regeneration medium and incubated for three weeks in the dark at 28° C. After the three week incubation, visible shoots were transferred to plates containing the MS-1 medium and incubated at 26° C. in the light for 1-2 weeks until they were ready to be sent to a greenhouse and transferred into soil flats. The MS-1 medium contained: 4.3 g/L MS salts (Gibco 11117), 5.0 ml/L MS Vitamins Stock Solution (Sigma M3900), 100 mg/L myo-inositol, 40.0 g/L sucrose, and 6.0 g/L Bacto-Agar at pH 5.6.
[0734] Using the procedures described above, 390 T0 maize plants were produced originating from 3282 embryos, resulting in an overall transformation efficiency of 12%, further indicating that the guide RNA/Cas system used herein results in low or no toxicity (Table 16).
TABLE 16 Transformation efficiency of the EPSPS editing.
Treatment | # Embryos | # Calli selected | Selection efficiency | T0 plants to GH | Overall Efficiency
Particle bombardment | 3282 | 489 | 15% | 390 | 12%
[0735] DNA was extracted from each T0 plantlet 7-10 days after transfer to the greenhouse and PCR procedures were conducted as described in the Example 16 to screen the T0 plants for mutations at the epsps locus.
[0736] Seventy two percent of analyzed T0 plants (270/375, Table 17) contained mutagenized EPSPS alleles as determined by the end-point PCR procedure described in Example 16. Most of the mutations (230/375 or 89%) were produced as a result of error-prone non-homologous end joining (NHEJ) while forty T0 plants (40/375 or 11%) contained the TIPS edited EPSPS alleles indicating the involvement of a templated double-strand break repair mechanism (Table 17).
TABLE 17 Mutations at the epsps locus.
Transformation | T0 Plants Analyzed | Mutations at the epsps locus | Mutation rate | TIPS editing | Gene Editing Rate (TIPS)
Particle bombardment | 375 | 270 | 72% | 40 | 11%
[0737] A pair of primers (Table 15, the F-E3 pair of primers) was used to amplify a native, endogenous fragment of the epsps locus containing the moCas9 target sequence and the EPSPS editing site from the genomic DNA of selected T0 plants. The PCR amplification products were cloned into the TOPO cloning vectors and sequenced as described in Example 16. The sequencing data represent a random sample of the genomic epsps locus sequences from a particular T0 plant (Table 18) and indicate the genotype of the selected T0 plants. The list of the EPSPS-TIPS allele-containing T0 plants transferred to the pots is presented in Table 18 (a selected set of T0 plants from the original 40 TIPS-containing events).
TABLE 18 The epsps locus genotypes observed in T0 plants. TIPS refers to a clone comprising the TIPS edited EPSPS sequence. NHEJ refers to the presence of a NHEJ mutation and WT refers to the presence of a wild-type EPSPS sequence amplified from the native epsps locus.
Event (T0 plant) | Observed sequences found at the epsps locus
E1 | 16 TIPS, 13 NHEJ
E2 | 28 TIPS, 0 NHEJ
E3 | 2 TIPS, 20 WT
E4 | 1 TIPS, 28 NHEJ
E5 | 2 TIPS, 2 NHEJ, 9 WT
E6 | 10 TIPS, 17 NHEJ
E7 | 12 TIPS, 17 NHEJ
E8 | 11 TIPS, 15 NHEJ
E9 | 17 TIPS, 10 NHEJ
[0738] As presented in Table 18, the selected plants of E1 and E3 to E9 contained the EPSPS-TIPS edited version of the EPSPS gene accompanied by either a wild-type EPSPS allele (WT) or a NHEJ mutagenized EPSPS allele (NHEJ). The numbers before TIPS, WT, NHEJ in Table 18 indicate the frequency at which a particular version of the EPSPS allele was identified. If all clones contained the TIPS-edited EPSPS sequence, the analyzed plant was likely to be homozygous for the EPSPS-TIPS allele (see for example E2). If only about 50% of clones contained a TIPS-edited EPSPS sequence, the analyzed plant was likely to be hemizygous for the EPSPS-TIPS allele (see for example E1). Other plants, such as E3 or E4, were likely to be chimeric for TIPS. In one event, E2, the T0 plant contained only TIPS-edited sequence at the epsps locus indicating that the guide RNA/Cas endonuclease system disclosed herein resulted in the successful nucleotide editing of three nucleotides (FIG. 17 bold) responsible for the two EPSPS-TIPS alleles at the epsps locus in maize plants.
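The genotype reasoning above can be sketched as a simple rule applied to the clone counts of Table 18. The Python example below is illustrative only: the function name, the labels, and in particular the tolerance used to interpret "about 50%" (here 40% to 60%) are assumptions that are not specified in this disclosure.

```python
def infer_genotype(tips_clones, other_clones, tolerance=0.10):
    """Suggest a likely genotype from sequenced clone counts.

    tips_clones: number of clones with the TIPS-edited sequence.
    other_clones: number of clones with NHEJ or wild-type sequence.
    tolerance: allowed deviation from 50% (assumed value, for illustration).
    """
    total = tips_clones + other_clones
    if total == 0:
        raise ValueError("no clones sequenced")
    fraction = tips_clones / total
    if fraction == 1.0:
        return "likely homozygous"
    if fraction == 0.0:
        return "no TIPS allele detected"
    if abs(fraction - 0.5) <= tolerance:
        return "likely hemizygous"
    return "likely chimeric"


# Clone counts taken from Table 18 (TIPS clones, all other clones).
for event, counts in {"E1": (16, 13), "E2": (28, 0),
                      "E3": (2, 20), "E4": (1, 28)}.items():
    print(event, infer_genotype(*counts))
```

With the assumed tolerance, the rule reproduces the interpretations given above for E1, E2, E3 and E4; events not discussed in the text would depend on the threshold chosen.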
[0739] A qPCR analysis was performed on the selected T0 plants to estimate the copy number of the wild-type EPSPS genes and the moCas9 endonuclease sequences. Multiplex qPCR amplifications of the maize EPSPS gene and the ADH housekeeping gene were carried out on the DNA samples from T0 plants. The primers and probes used in the PCR reaction are shown in Table 19.
TABLE 19 Primers used in qPCR analysis of T0 plants.
Primer/probe | Primary PCR Primer Sequence | SEQ ID NO:
primer qADH F | 5'-CAAGTCGCGGTTTTCAATCA-3' | SEQ ID NO: 217
primer qADH R | 5'-TGAAGGTGGAAGTCCCAACAA-3' | SEQ ID NO: 218
probe ADH-VIC | VIC-TGGGAAGCCTATCTACCAC | SEQ ID NO: 219
probe wtEPSPS | 6FAM-CGGCCATTGACAGCA-MGB-NFQ | SEQ ID NO: 220
forward primer qEPSPS F | 5'-TCTTGGGGAATGCTGGAACT-3' | SEQ ID NO: 221
reverse primer qEPSPSR | 5'-CACCAGCAGCAGTAACAGCTG-3' | SEQ ID NO: 222
FAM-wtEPSPS R probe | 6FAM-TGCTGTCAATGGCCGCA | SEQ ID NO: 223
forward primer qEPSPS F | 5'-TCTTGGGGAATGCTGGAACT-3' | SEQ ID NO: 224
reverse primer q wtEPSPS RA | 5'-CCACCAGCAGCAGTAACAGC-3' | SEQ ID NO: 225
[0740] All analyses were conducted using the LightCycler 480 Real-Time PCR System (Roche Diagnostics). A threshold value for the wtEPSPS genotype was set at 1.76. Every sample showing less than 1.76 copies of EPSPS, with the end-point fluorescence measurements up to two times lower than the wild-type control, was categorized as the One Allele EPSPS genotype (hemizygous for the wild-type EPSPS allele).
[0741] A qPCR method was used to estimate the TIPS sequence copy number. The primers and probes used in the qPCR reaction are shown in Table 20.
TABLE 20 Primers used in qPCR analysis to estimate the TIPS sequence copy number.
Primer/probe | Primary PCR Primer Sequence | SEQ ID NO:
forward primer q epTIPS F | 5'-GGAAGTGCAGCTCTTCTTGGG-3' | SEQ ID NO: 226
reverse primer q epTIPS R | 5'-AGCTGCTGTCAATGACCGC-3' | SEQ ID NO: 227
TIPS probe | 6FAM-AATGCTGGAATCGCA | SEQ ID NO: 228
[0742] A comparative Ct method with Delta Ct values normalized to the average Delta Ct from the bi-allelic TIPS genotypes provided a copy number estimation for the TIPS sequence detected in the analyzed plant samples.
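A comparative Ct calculation of this kind can be sketched in Python as follows. The exact formula used is not given in this disclosure, so the example assumes the commonly used form in which the copy number equals the calibrator copy number multiplied by 2 raised to the negative Delta-Delta Ct. It further assumes an amplification efficiency of 2 per cycle, a calibrator (bi-allelic TIPS genotype) carrying 2 copies, and a reference gene measured in the same sample. All names and Ct values are hypothetical.

```python
def delta_ct(ct_target, ct_reference):
    """Delta Ct of the target assay relative to the reference gene."""
    return ct_target - ct_reference


def estimate_copy_number(sample_dct, calibrator_dcts,
                         calibrator_copies=2, efficiency=2.0):
    """Estimate copy number by normalizing to the mean calibrator Delta Ct.

    calibrator_dcts: Delta Ct values from bi-allelic TIPS calibrator samples.
    calibrator_copies: copies assumed in the calibrator (assumption: 2).
    efficiency: fold amplification per cycle (assumption: perfect doubling).
    """
    mean_calibrator = sum(calibrator_dcts) / len(calibrator_dcts)
    delta_delta_ct = sample_dct - mean_calibrator
    return calibrator_copies * efficiency ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only.
calibrators = [2.5, 3.5]                      # mean Delta Ct = 3.0
sample = delta_ct(ct_target=27.0, ct_reference=23.0)  # Delta Ct = 4.0
print(estimate_copy_number(sample, calibrators))
```

A sample whose Delta Ct is one cycle higher than the calibrator mean is estimated at half the calibrator copy number, and one cycle lower at twice the calibrator copy number.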
TABLE 21 qPCR genotyping and copy number of selected T0 plants.
Event name | TIPS EPSPS allele | Wild-type EPSPS allele | TIPS copy # | moCas9 coding sequence
E1 | positive | Null | 5 | positive
E2 | positive | Null | 2 | positive
E7 | positive | Null | 6 | positive
E8 | positive | Null | 1 | positive
E9 | positive | Null | 3 | positive
[0743] The qPCR genotyping indicated that no wild-type EPSPS alleles were detected in the selected T0 plants of Events E1, E2, E7, E8 and E9 (Table 21). Both the TIPS template sequences and the moCas9 coding sequence were found in the selected T0 plants, presumably as a result of random insertions associated with the transformation process (Table 21: for the TIPS template sequences E1, E7, and E9 T0 plants). Both genetic elements (the randomly inserted TIPS templates and the moCas9 expression cassette) can be segregated out by standard breeding procedures in the T1 progeny generation, if not linked to the edited EPSPS-TIPS gene.
[0744] T0 plants grew well in the greenhouse and were fertile. A sample of T0 plants was sprayed with a 1× dose of glyphosate (Roundup Powermax) at V3 growth stage using the spray booth setting of 20 gallons per acre. The 1× dose of glyphosate was prepared as follows: 2.55 ml Powermax in 300 ml water (active ingredient: glyphosate, N-(phosphonomethyl)glycine, in the form of its potassium salt at 48.7%). Seven days after glyphosate application, no leaf tissue damage was observed in some of the T0 plants. These plantlets were hemizygous for the EPSPS-TIPS alleles, while other plantlets were severely damaged. One plant showing no damage to the leaf tissue 14 days after herbicide application contained 21 EPSPS-TIPS alleles among 44 genomic clones of the epsps locus (cloned and sequenced as described in Example 16).
[0745] These data indicate that a guide RNA/Cas system can be used to create a TIPS-edited EPSPS allele in maize. Maize plants homozygous at the epsps-tips locus (two EPSPS alleles edited) with no additional insertion of the TIPS template (plant E2) were obtained. Furthermore, some EPSPS-TIPS edited maize plants did show some level of tolerance against a 1× dose of glyphosate.
Example 18
[0746] Guide RNA/Cas Endonuclease Mediated DNA Cleavage in Maize Chromosomal Loci Enables Transgene Insertion in an Elite Maize Line
[0747] To test whether a maize optimized guide RNA/Cas system can cleave a maize chromosomal locus and enable homologous recombination (HR) mediated pathways to site-specifically insert a transgene in an elite maize line, 4 loci were selected on maize chromosome 1, located between 51.54 cM and 54.56 cM (FIG. 18). Two target sites for a Cas endonuclease were identified at each of the four loci and are referred to as MHP14Cas-1, MHP14Cas-3, TS8Cas-1, TS8Cas-2, TS9Cas-2, TS9Cas-3, TS10Cas-1 and TS10Cas-3 (FIG. 19, Table 22, SEQ ID NOs:229-236).
TABLE 22 Maize genomic target sites targeted by a guide RNA/Cas endonuclease.
Locus | Location | Maize Genomic Target Site | Target Site Sequence | PAM | SEQ ID NO:
MHP14 | Chr. 1: 51.54cM | MHP14Cas-1 | gttaaatctgacgtgaatctgtt | TGG | 229
MHP14 | Chr. 1: 51.54cM | MHP14Cas-3 | acaaacattgaagcgacatag | TGG | 230
TS8 | Chr. 1: 52.56cM | TS8Cas-1 | gtacgtaacgtgcagtac | TGG | 231
TS8 | Chr. 1: 52.56cM | TS8Cas-2 | gctcatcagtgatcagctgg | TGG | 232
TS9 | Chr. 1: 53.56cM | TS9Cas-2 | ggctgtttgcggcctcg | AGG | 233
TS9 | Chr. 1: 53.56cM | TS9Cas-3 | gcctcgaggttgcacgcacgt | CGG | 234
TS10 | Chr. 1: 54.56cM | TS10Cas-1 | gcctcgccttcgctagttaa | GGG | 235
TS10 | Chr. 1: 54.56cM | TS10Cas-3 | gctcgtgttggagataca | GGG | 236
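As an illustration, the PAM sequences listed in Table 22 can be checked against the NGG motif with a short Python sketch. The target site sequences below are transcribed from Table 22 with wrapped lines joined, which is an assumption about the table layout; the dictionary and function names are hypothetical.

```python
import re

# Target site name -> (target site sequence, PAM), transcribed from Table 22.
TARGET_SITES = {
    "MHP14Cas-1": ("gttaaatctgacgtgaatctgtt", "TGG"),
    "MHP14Cas-3": ("acaaacattgaagcgacatag", "TGG"),
    "TS8Cas-1": ("gtacgtaacgtgcagtac", "TGG"),
    "TS8Cas-2": ("gctcatcagtgatcagctgg", "TGG"),
    "TS9Cas-2": ("ggctgtttgcggcctcg", "AGG"),
    "TS9Cas-3": ("gcctcgaggttgcacgcacgt", "CGG"),
    "TS10Cas-1": ("gcctcgccttcgctagttaa", "GGG"),
    "TS10Cas-3": ("gctcgtgttggagataca", "GGG"),
}


def is_ngg_pam(pam):
    """True if the PAM is any nucleotide followed by GG."""
    return re.fullmatch(r"[ACGT]GG", pam.upper()) is not None


for name, (site, pam) in TARGET_SITES.items():
    print(f"{name}: {len(site)} nt target, PAM {pam}, NGG={is_ngg_pam(pam)}")
```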
[0748] The maize optimized Cas endonuclease cassette (SEQ ID NO: 5) was prepared as described in Example 1. Long guide RNA expression cassettes comprising a variable targeting domain targeting one of the 8 genomic target sites, driven by a maize U6 polymerase III promoter, and terminated by a maize U6 polymerase III terminator were designed as described in Examples 1 and 3 and are listed in Table 23. A donor DNA (HR repair DNA) containing a selectable marker (a phosphomannose-isomerase (PMI) expression cassette) flanked by two homologous regions was constructed using standard molecular biology techniques (FIG. 20).
TABLE 23 List of guide RNA (gRNA) and Donor DNA expression cassettes
Locus | Target Site | gRNA (SEQ ID NO:) | Donor DNA (SEQ ID NO:)
MHP14 | MHP14Cas-1 | 245 | 253
MHP14 | MHP14Cas-3 | 246 | 254
TS8 | TS8Cas-1 | 247 | 255
TS8 | TS8Cas-2 | 248 | 256
TS9 | TS9Cas-2 | 249 | 257
TS9 | TS9Cas-3 | 250 | 258
TS10 | TS10Cas-1 | 251 | 259
TS10 | TS10Cas-3 | 252 | 260
[0749] A vector containing the maize optimized Cas9 endonuclease of SEQ ID NO: 5, a vector containing one of eight long guide RNA expression cassettes of SEQ ID NOs: 245-252, and a vector containing one of eight donor DNAs of SEQ ID NOs: 253-260 were co-delivered to maize elite line immature embryos by particle-mediated delivery as described in Example 10. About 1000 embryos were bombarded for each target site. Since the donor DNA contained a selectable marker, PMI, successful delivery of the donor DNA allowed for callus growth on mannose media. Putative HR-mediated transgenic insertions were selected by placing the callus on mannose containing media. After selection, stable shoots on maturation plates were sampled, total genomic DNA extracted, and using the primer pairs shown in Table 24 (corresponding to SEQ ID NOs: 261-270), PCR amplification was carried out at both possible transgene genomic DNA junctions to identify putative HR-mediated transgenic insertions.
TABLE-US-00025 TABLE 24 Primer sequences used for integration event screening at each target site. SEQ Target ID Locus Site Junction Primer NO: UBIR donor 1 CCATGTCTAACTGTTCA 261 TTTATATGATTCTCT PSBF donor 2 GCTCGTGTCCAAGCGTC 262 ACTTACGATTAGCT MHP14 MHP14Cas-1 14-1HR1f CTCACATGAGGCTCTTC 263 MHP14Cas-3 TTTGCTTGCT 14-1HR2r AGGATCCTATTCCCCAA 264 TTTGTAGAT CHR1-8 TS8Cas-1 8HR1f CAGTCCGTGGATTGAAG 265 CCAT TS8Cas-2 8HR2r CTCTGTCTCCGAGACGT 266 GCTTA CHR1-9 TS9Cas-2 9HR1f GGAGCAAATGTTTTAGG 267 TATGAAATG TS9Cas-3 9HR2r CGGATTCTAAAGATCAT 268 ACGTAAATGAA CHR1-10 TS10Cas-1 10HR1f TGGCTTGTCTATGCGCA 269 TS10Cas-3 TCTC 10HR2r CCAGACCCAAACAGCAG 270 GTT
The same genomic primers were used for each of the two target sites at one locus. The resulting amplifications were sequenced to determine if these sites were mutated or contained a transgene insertion.
[0750] The "Event Recovery frequency" was calculated using the number of events recovered divided by the total number of embryos bombarded, and may indicate if an endonuclease has some toxic effect or not (Table 26). Hence, if 1000 embryos were bombarded and 240 were recovered, the Event Recovery frequency is 24%. Table 26 indicates that for all target sites analyzed the Event Recovery frequency ranged between 17 and 28%, indicating that the guide RNA/Cas system used herein results in low or no toxicity. Cas endonuclease activity was measured in-planta by determining the "Target Site Mutation frequency" (Table 26) which is defined as: (number of events with target site modification/total number recovered events)*100%. Hence, if 240 events were recovered and 180 events showed a mutation, the Target Site Mutation frequency is 75%. The target site mutation frequency was measured using target site allele copy number as described in Example 9 of U.S. application Ser. No. 13/886,317, filed on May 3, 2013. The primers and probes for obtaining the target site copy number using qPCR at each site were as listed in Table 25 (SEQ ID NO: 271-294).
TABLE 25 Primer and probe sequences used to assess DNA cleavage at 8 maize genomic target sites
Target Site Designation | Probe/primers | Sequence | SEQ ID NO:
MHP14Cas-1 | probe | CAGATTCACGTCAGATTT | 271
MHP14Cas-1 | forward | CATAGTGGTGTATGAAAGGAAGCACTT | 272
MHP14Cas-1 | reverse | CATTTTGGATTGTAATATGTGTACCTCATA | 273
MHP14Cas-3 | probe | CACCACTATGTCGCTTC | 274
MHP14Cas-3 | forward | CGGATGCACGAAAATTGTAGGA | 275
MHP14Cas-3 | reverse | CTGACGTGAATCTGTTTGGAATTG | 276
TS8Cas-1 | probe | TACGTAACGTGCAGTACT | 277
TS8Cas-1 | forward | ACGGACGGACCATACGTTATG | 278
TS8Cas-1 | reverse | TCAGCTGGTGGAGTATATTAGTTCGT | 279
TS8Cas-2 | probe | CCAGCTGATCACTGATGA | 280
TS8Cas-2 | forward | ACGGACGGACCATACGTTATG | 281
TS8Cas-2 | reverse | CGCACATGTTATAAATTACAATGCAT | 282
TS9Cas-2 | probe | CTGTTTGCGGCCTC | 283
TS9Cas-2 | forward | CTGCGGAGCTGCTGGCGAT | 284
TS9Cas-2 | reverse | CTTGCTGGCTTCGTCTGTCA | 285
TS9Cas-3 | probe | CCGACGTGCGTGCAA | 286
TS9Cas-3 | forward | CTGCGGAGCTGCTGGCGAT | 287
TS9Cas-3 | reverse | CTTGCTGGCTTCGTCTGTCA | 288
TS10Cas-1 | probe | TCGCCTTCGCTAGTTAA | 289
TS10Cas-1 | forward | AAGACCTGGCCGGTTTTCCA | 290
TS10Cas-1 | reverse | TAGCGGCCATTGCCATCA | 291
TS10Cas-3 | probe | CTGTATCTCCAACACGAGC | 292
TS10Cas-3 | forward | AAGACCTGGCCGGTTTTCCA | 293
TS10Cas-3 | reverse | TAGCGGCCATTGCCATCA | 294
As shown in Table 26, all 8 guide RNA/Cas9 systems were very efficient in cleaving their target DNA and inducing mutations (by non-homologous end joining (NHEJ)), as evidenced by a mutation frequency ranging from 33% to 90%.
[0751] All events were also screened for the presence of an inserted transgene. The insertion event screening for each target site is illustrated in FIG. 21. The primers used for insertion PCR analysis at each site are listed in Table 24. FIG. 22 shows one example of an insertion event screening PCR result. The frequency of transgene insertion was determined by calculating the "Insertion frequency" which is defined as: (number of events with target site insertion/total number recovered events)*100%. Hence, if 240 events were recovered and 21 events showed a transgene insertion, the Insertion frequency was 9%.
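The three frequencies defined in this example (Event Recovery frequency, Target Site Mutation frequency and Insertion frequency) share the same form, and the worked numbers given in the text can be reproduced with a short Python sketch. The function name is hypothetical; the counts are the illustrative values used in the text.

```python
def frequency_percent(numerator, denominator):
    """Generic frequency expressed as a percentage."""
    return 100.0 * numerator / denominator


embryos_bombarded = 1000
events_recovered = 240
events_with_mutation = 180
events_with_insertion = 21

event_recovery = frequency_percent(events_recovered, embryos_bombarded)
target_site_mutation = frequency_percent(events_with_mutation, events_recovered)
insertion = frequency_percent(events_with_insertion, events_recovered)

print(f"Event Recovery frequency: {event_recovery:.0f}%")
print(f"Target Site Mutation frequency: {target_site_mutation:.0f}%")
print(f"Insertion frequency: {insertion:.2f}% (reported as {round(insertion)}%)")
```

Note that the Event Recovery frequency is normalized to the number of embryos bombarded, whereas the other two frequencies are normalized to the number of recovered events.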
TABLE 26 Activity of the guide RNA/Cas9 system at 8 target sites as determined by target site mutation frequency and transgene insertion frequency at the desired target site in maize plant tissue
Target Site | Event Recovery (%) | Target Site Mutation frequency (%) | Insertion frequency (%)
TS10Cas-1 | 24% | 75% | 9% (7*)
TS10Cas-3 | 22% | 83% | 16% (20*)
TS8Cas-1 | 17% | 90% | 14% (9*)
TS8Cas-2 | 27% | 84% | 8% (10*)
MHP14Cas-1 | 17% | 33% | 2% (2*)
MHP14Cas-3 | 28% | 68% | 4% (1*)
TS9Cas-2 | 23% | 62% | 8%**
TS9Cas-3 | 28% | 84% | 8%**
*Number of events with HR1 and HR2 both junctions positive
**only HR2 junction available
[0752] Sequence-confirmed PCR amplifications indicated a site-specific transgene insertion for each of the 8 target sites as shown in Table 26 (column Insertion frequency). A transgene cassette was inserted at all 8 target sites with high efficiency (2-16%). The number of events containing amplifications across both transgene genomic DNA junctions, indicating near perfect site-specific transgene insertion, is shown in brackets in Table 26.
[0753] Taken together, these data demonstrate that maize chromosomal loci cleaved with the maize optimized guide RNA/Cas system described herein can be used to insert transgenes at high frequencies in a maize elite inbred line.
Example 19
[0754] Delivery of the Guide RNA/Cas9 Endonuclease System DNA to Soybean by Stable Transformation
[0755] A soybean U6 small nuclear RNA promoter (GM-U6-9.1; SEQ ID NO: 295) was identified in a similar manner as the soybean promoter GM-U6-13.1 (SEQ ID NO: 120) described in Example 12. The GM-U6-9.1 promoter was used to express guide RNA to direct Cas9 nuclease to the designated genomic target site.
[0756] A soybean codon optimized Cas9 endonuclease expression cassette (such as, for example, EF1A2:CAS9; SEQ ID NO: 296) and a guide RNA expression cassette (such as, for example, U6-9.1:DD20CR1; SEQ ID NO: 297) were linked (such as U6-9.1:DD20CR1+EF1A2:CAS9; SEQ ID NO: 298, FIG. 23A) and integrated into a DNA plasmid that was co-delivered with another plasmid comprising a donor DNA (repair DNA) cassette (such as DD20HR1-SAMS:HPT-DD20HR2; SEQ ID NO: 299) to young soybean somatic embryos in the form of embryogenic suspension cultures by particle gun bombardment (FIGS. 23A and 23B). Other guide RNA/Cas9 DNA constructs targeting various soybean genomic sites and donor DNA constructs for site-specific transgene integration through homologous recombination were similarly configured and are listed in Table 27. The four gRNA/Cas9 constructs differed only in the 20 bp guide RNA targeting domain (variable targeting domain) targeting the soybean genomic target sites DD20CR1 (SEQ ID NO: 125), DD20CR2 (SEQ ID NO: 126), DD43CR1 (SEQ ID NO: 127), or DD43CR2 (SEQ ID NO: 128). The two donor DNA constructs differed only in the homologous regions, such as DD20HR1 and DD20HR2 (FIG. 23B), or DD43HR1 and DD43HR2. These guide RNA/Cas9 DNA constructs and donor DNAs were co-delivered to an elite (93B86) or a non-elite (Jack) soybean genome by the stable transformation procedure described below.
TABLE-US-00028 TABLE 27 Guide RNA/Cas9 Mediated Soybean Stable Transformation.
Experiment       Guide RNA/Cas9                   Donor DNA                    SEQ ID NOs:
U6-9.1DD20CR1    U6-9.1:DD20CR1 + EF1A2:CAS9      DD20HR1-SAMS:HPT-DD20HR2     298, 299
U6-9.1DD20CR2    U6-9.1:DD20CR2 + EF1A2:CAS9      DD20HR1-SAMS:HPT-DD20HR2     300, 299
U6-9.1DD43CR1    U6-9.1:DD43CR1 + EF1A2:CAS9      DD43HR1-SAMS:HPT-DD43HR2     301, 302
U6-9.1DD43CR2    U6-9.1:DD43CR2 + EF1A2:CAS9      DD43HR1-SAMS:HPT-DD43HR2     303, 302
[0757] Soybean somatic embryogenic suspension cultures were induced from a DuPont Pioneer proprietary elite cultivar 93B86 as follows. Cotyledons (˜3 mm in length) were dissected from surface sterilized, immature seeds and were cultured for 6-10 weeks in the light at 26° C. on a Murashige and Skoog (MS) media containing 0.7% agar and supplemented with 10 mg/ml 2,4-D (2,4-Dichlorophenoxyacetic acid). Globular stage somatic embryos, which produced secondary embryos, were then excised and placed into flasks containing liquid MS medium supplemented with 2,4-D (10 mg/ml) and cultured in light on a rotary shaker. After repeated selection for clusters of somatic embryos that multiplied as early, globular staged embryos, the soybean embryogenic suspension cultures were maintained in 35 ml liquid media on a rotary shaker, 150 rpm, at 26° C. with fluorescent lights on a 16:8 hour day/night schedule. Cultures were subcultured every two weeks by inoculating approximately 35 mg of tissue into 35 ml of the same fresh liquid MS medium.
[0758] Soybean embryogenic suspension cultures were then transformed by the method of particle gun bombardment using a DuPont Biolistic® PDS1000/HE instrument (Bio-Rad Laboratories, Hercules, Calif.). To 50 μl of a 60 mg/ml 1.0 mm gold particle suspension were added, in order: 30 μl of equal amounts (30 ng/μl) of plasmid DNA comprising, for example, U6-9.1:DD20CR1+EF1A2:CAS9 (SEQ ID NO: 298) and plasmid DNA comprising, for example, DD20HR1-SAMS:HPT-DD20HR2 (SEQ ID NO: 299) (Experiment U6-9.1DD20CR1 listed in Table 27), 20 μl of 0.1 M spermidine, and 25 μl of 5 M CaCl2. The particle preparation was then agitated for 3 minutes, spun in a centrifuge for 10 seconds and the supernatant removed. The DNA-coated particles were then washed once in 400 μl of 100% ethanol and resuspended in 45 μl of 100% ethanol. The DNA/particle suspension was sonicated three times for one second each. Then 5 μl of the DNA-coated gold particles was loaded on each macro carrier disk.
[0759] Approximately 300-400 mg of a two-week-old suspension culture was placed in an empty 60×15 mm Petri dish and the residual liquid removed from the tissue with a pipette. For each transformation experiment, approximately 5 to 10 plates of tissue were bombarded. Membrane rupture pressure was set at 1100 psi and the chamber was evacuated to a vacuum of 28 inches mercury. The tissue was placed approximately 3.5 inches away from the retaining screen and bombarded once. Following bombardment, the tissue was divided in half and placed back into liquid media and cultured as described above.
[0760] Five to seven days post bombardment, the liquid media was exchanged with fresh media containing 30 mg/ml hygromycin as selection agent. This selective media was refreshed weekly. Seven to eight weeks post bombardment, green, transformed tissue was observed growing from untransformed, necrotic embryogenic clusters. Isolated green tissue was removed and inoculated into individual flasks to generate new, clonally propagated, transformed embryogenic suspension cultures. Each clonally propagated culture was treated as an independent transformation event and subcultured in the same liquid MS media supplemented with 2,4-D (10 mg/ml) and 30 ng/ml hygromycin selection agent to increase mass. The embryogenic suspension cultures were then transferred to agar solid MS media plates without 2,4-D supplement to allow somatic embryos to develop. A sample of each event was collected at this stage for quantitative PCR analysis.
[0761] Cotyledon stage somatic embryos were dried-down (by transferring them into an empty small Petri dish that was seated on top of a 10 cm Petri dish containing some agar gel to allow slow dry down) to mimic the last stages of soybean seed development. Dried-down embryos were placed on germination solid media and transgenic soybean plantlets were regenerated. The transgenic plants were then transferred to soil and maintained in growth chambers for seed production. Transgenic events were sampled at somatic embryo stage or T0 leaf stage for molecular analysis.
[0762] Similar transformation experiments (U6-9.1 DD20CR2, U6-9.1 DD43CR1, U6-9.1DD43CR2) with the components listed in Table 27 and using the elite cultivar 93B86 were performed as described above.
[0763] Two transformation experiments, U6-9.1 DD20CR1 and U6-9.1 DD43CR1 listed in Table 27, were also performed in a non-elite soybean cultivar "Jack" to test the gRNA/Cas9 system performance in different soybean genotypes.
Example 20
Detection of Site-Specific NHEJ Mediated by the Guide RNA/Cas9 System in Stably Transformed Soybean
[0764] Genomic DNA was extracted from somatic embryo samples and analyzed by quantitative PCR using a 7500 real time PCR system (Applied Biosystems, Foster City, Calif.) with target site-specific primers and a FAM-labeled fluorescence probe to check copy number changes of the target site DD20 or DD43 (FIG. 24 A-C). The qPCR analysis was done in duplex reactions with a heat shock protein (HSP) gene as the endogenous control and a wild type 93B86 genomic DNA sample that contains one copy of the target site with 2 alleles as the single copy calibrator. The HSP endogenous control qPCR employed primer probe set HSP-F/HSP-T/HSP-R. The DD20-CR1 (SEQ ID NO:306) and DD20-CR2 (SEQ ID NO:307) specific qPCR employed primer probe set DD20-F (SEQ ID NO:308)/DD20-T (SEQ ID NO:309)/DD20-R (SEQ ID NO:310). The DD43-CR1 (SEQ ID NO:311) specific qPCR employed primer probe set DD43-F (SEQ ID NO:313)/DD43-T (SEQ ID NO:315)/DD43-R (SEQ ID NO:316) while the DD43-CR2 (SEQ ID NO:312) specific qPCR employed primer probe set DD43-F2 (SEQ ID NO:314)/DD43-T/DD43-R. The guide RNA/Cas9 DNA (SEQ ID NOs: 298, 300, 301, and 303) specific qPCR employed primer probe set Cas9-F (SEQ ID NO:317)/Cas9-T (SEQ ID NO:318)/Cas9-R (SEQ ID NO:319). The donor DNA (SEQ ID NOs: 299 and 302) specific qPCR employed primer probe set Sams-76F (SEQ ID NO:320)/FRT1I63-T (SEQ ID NO:321)/FRT1I-41F (SEQ ID NO:322). The endogenous control probe HSP-T was labeled with VIC and the gene-specific probes DD20-T, DD43-T, Cas9-T, and FRT1I63-T were labeled with FAM for the simultaneous detection of both fluorescent probes (Applied Biosystems). PCR reaction data were captured and analyzed using the sequence detection software provided with the 7500 real time PCR system and the gene copy numbers were calculated using the relative quantification methodology (Applied Biosystems).
[0765] Since the wild type 93B86 genomic DNA with two alleles of the target site was used as the single copy calibrator, events without any change of the target site would be detected as one copy, herein termed Wt-Homo (qPCR value>=0.7); events with one allele changed, which is no longer detectable by the target site-specific qPCR, would be detected as half copy, herein termed NHEJ-Hemi (qPCR value between 0.1 and 0.7); and events with both alleles changed would be detected as null, herein termed NHEJ-Null (qPCR value<=0.1). The wide range of the qPCR values suggested that most of the events contained mixed mutant and wild type sequences of the target site. High percentages of NHEJ-Hemi (ranging from 10.1 to 33.5%, Table 28) and NHEJ-Null (ranging from 32.3 to 46.4%, Table 28) were detected in all four experiments, with combined NHEJ average frequencies of more than 60% (Table 28).
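The three-way classification of events by qPCR value can be illustrated with a minimal sketch. The thresholds (0.7 and 0.1) are taken from the paragraph above; the function names and the list of qPCR values are hypothetical and serve only to show how the binning and the per-class percentages in Tables 28 and 29 could be derived.

```python
from collections import Counter

WT_THRESHOLD = 0.7    # qPCR value >= 0.7 is scored as Wt-Homo
NULL_THRESHOLD = 0.1  # qPCR value <= 0.1 is scored as NHEJ-Null


def classify_event(qpcr_value: float) -> str:
    """Assign an event to Wt-Homo, NHEJ-Hemi or NHEJ-Null from its qPCR value."""
    if qpcr_value >= WT_THRESHOLD:
        return "Wt-Homo"
    if qpcr_value <= NULL_THRESHOLD:
        return "NHEJ-Null"
    return "NHEJ-Hemi"


def summarize_events(qpcr_values):
    """Return {class: (count, percent of all events)} for a list of qPCR values."""
    counts = Counter(classify_event(value) for value in qpcr_values)
    total = len(qpcr_values)
    return {
        name: (counts[name], 100 * counts[name] / total if total else 0.0)
        for name in ("Wt-Homo", "NHEJ-Hemi", "NHEJ-Null")
    }


# Hypothetical qPCR values, for illustration only (not data from the experiments).
hypothetical_values = [1.02, 0.95, 0.70, 0.48, 0.33, 0.10, 0.02, 0.00]
for name, (count, percent) in summarize_events(hypothetical_values).items():
    print(f"{name}: {count} ({percent:.1f}%)")
```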
TABLE-US-00029 TABLE 28 Target Site Mutations and Site Specific Gene Integration Induced by the Guide RNA/Cas9 system in elite soybean germplasm. Numbers indicate no. of events (numbers in parentheses are %). NA = not analyzed.
Project          Total event    Wt-Homo (%)    NHEJ-Hemi (%)    NHEJ-Null (%)    Insertion Frequency (%)
U6-9.1DD20CR1    239            85 (35.6%)     77 (32.2%)       77 (32.2%)       11 (4.6%)
U6-9.1DD20CR2    79             43 (54.4%)     8 (10.1%)        28 (35.4%)       NA
U6-9.1DD43CR1    263            53 (20.2%)     88 (33.5%)       122 (46.4%)      10 (3.8%)
TABLE-US-00030 TABLE 29 Target Site Mutations and Site Specific Gene Integration Induced by the Guide RNA/Cas9 system in non-elite soybean germplasm. Numbers indicate no. of events (numbers in parentheses are % of the total analyzed events).
Project               Total event    Wt-Homo (%)    NHEJ-Hemi (%)    NHEJ-Null (%)    Insertion frequency (%)
U6-9.1DD20CR1-Jack    149            99 (66.4%)     34 (22.8%)       16 (10.7%)       0 (0%)
U6-9.1DD43CR1-Jack    141            84 (59.6%)     27 (19.1%)       30 (21.3%)       1 (0.7%)
[0766] Both NHEJ-Hemi and NHEJ-Null were detected in the two experiments U6-9.1DD20CR1-Jack and U6-9.1DD43CR1-Jack repeated in "Jack" genotype though at lower frequencies (Table 29). The differences between NHEJ frequencies were likely caused by variations between transformation experiments.
[0767] The target region of NHEJ-Null events was amplified by regular PCR from the same genomic DNA samples using DD20-LB (SEQ ID NO: 323) and DD20-RB (SEQ ID NO: 326) primers specific respectively to DD20-HR1 and DD20-HR2 for the DD20 target site specific HR1-HR2 PCR amplicon (FIG. 25 A-C; SEQ ID NO: 329), or DD43-LB (SEQ ID NO: 327) and DD43-RB (SEQ ID NO: 328) primers specific respectively to DD43-HR1 and DD43-HR2 for the DD43 target site specific HR1-HR2 PCR amplicon (SEQ ID NO: 332). The PCR bands were cloned into pCR2.1 vector using a TOPO-TA cloning kit (Invitrogen) and multiple clones were sequenced to check for target site sequence changes resulting from NHEJ. Various small deletions at the Cas9 cleavage site, 3 bp upstream of the PAM, were revealed at all four tested target sites (FIG. 26 A-C). Small insertions were also detected in some sequences. Different mutated sequences were identified from some of the same events, indicating the chimeric nature of these events. Some of the same mutated sequences were also identified from different events, suggesting that the same mutations could have happened independently or that some of the events could be clonal events. These sequence analyses confirmed the occurrence of NHEJ mediated by the guide RNA/Cas9 system at the specific Cas9 target sites.
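To make the position of the cleavage site concrete, the following sketch locates the cut 3 bp upstream of the PAM in a target site written 5' to 3' with the PAM at its 3' end. The function names are assumptions, and the 23-nucleotide sequence is a made-up placeholder, not one of the DD20 or DD43 target sites.

```python
def cleavage_index(site_with_pam: str, pam_length: int = 3, offset: int = 3) -> int:
    """Return the string index at which the cut falls.

    The site is written 5'->3' and ends with the PAM. The cut lies `offset`
    nucleotides upstream of the PAM, i.e. between site[index - 1] and site[index].
    """
    protospacer_length = len(site_with_pam) - pam_length
    if protospacer_length < offset:
        raise ValueError("site is too short for the requested PAM length and offset")
    return protospacer_length - offset


def split_at_cleavage(site_with_pam: str):
    """Split the site into the sequence 5' of the cut and the sequence 3' of the cut."""
    index = cleavage_index(site_with_pam)
    return site_with_pam[:index], site_with_pam[index:]


# Hypothetical 20-nt targeting sequence followed by a TGG PAM (illustration only).
hypothetical_site = "ACGTACGTACGTACGTACGT" + "TGG"
five_prime, three_prime = split_at_cleavage(hypothetical_site)
print(five_prime, "|", three_prime)
```

Small deletions of the kind described above would remove nucleotides on either side of the `|` position.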
Example 21
[0768] Identification of Site-Specific Gene Integration Via Homologous Recombination Mediated by the Guide RNA/Cas9 System in Stably Transformed Soybean
[0769] Site-specific gene integration via guide RNA/Cas9 system mediated DNA homologous recombination was determined by border-specific PCR analysis. The 5' end borders of DD20CR1 and DD20CR2 events were amplified as a 1204 bp DD20 HR1-SAMS PCR amplicon (SEQ ID NO: 330) by PCR with primers DD20-LB (SEQ ID NO: 323) and Sams-A1 (SEQ ID NO: 324) while the 3' borders of the same events were amplified as a 1459 bp DD20 NOS--HR2 PCR amplicon (SEQ ID NO: 331) with primers QC498A-S1 and DD20-RB (FIG. 25 A-C). Any events with both the 5' border and 3' border-specific bands amplified are considered as site-specific integration events through homologous recombination containing the transgene from the donor DNA fragment DD20HR1-SAMS:HPT-DD20HR2 or its circular form (FIG. 23). The 5' end borders of DD43CR1 and DD43CR2 events were amplified as a 1202 bp DD43 HR1-SAMS PCR amplicon (SEQ ID NO: 333) by PCR with primers DD43-LB and Sams-A1 while the 3' borders of the same events were amplified as a 1454 bp DD43 NOS-HR2 PCR amplicon (SEQ ID NO: 334) with primers QC498A-S1 (SEQ ID NO: 325) and DD43-RB (SEQ ID NO: 328). Any events with both the 5' border and 3' border-specific bands amplified are considered as site-specific integration events through homologous recombination containing the transgene from repair DNA fragment DD43HR1-SAMS:HPT-DD43HR2 or its circular form. Some of the border-specific PCR fragments were sequenced and were all confirmed to be recombined sequences as expected from homologous recombination. On average, gene integration through the guide RNA/Cas9 mediated homologous recombination occurred at approximately 4% of the total transgenic events (Insertion frequency, Table 28 and Table 29). One homologous recombination event was identified from experiment U6-9.1 DD43CR1-Jack repeated in "Jack" genotype (Table 29).
Example 22
[0770] The crRNA/tracrRNA/Cas Endonuclease System Cleaves Chromosomal DNA in Maize and Introduces Mutations by Imperfect Non-Homologous End-Joining
[0771] To test whether the maize optimized crRNA/tracrRNA/Cas endonuclease system described in Example 1 could recognize, cleave, and mutate maize chromosomal DNA through imprecise non-homologous end-joining (NHEJ) repair pathways, three different genomic target sequences were targeted for cleavage (see Table 30) and examined by deep sequencing for the presence of NHEJ mutations.
TABLE-US-00031 TABLE 30 Maize genomic target sequences targeted by a crRNA/tracrRNA/Cas endonuclease system.
Locus    Location            Cas RNA System Used    Target Site Designation    Maize Genomic Target Site Sequence    PAM Sequence    SEQ ID NO:
LIG      Chr. 2: 28.45cM     crRNA/tracrRNA         LIGCas-1                   GTACCGTACGTGCCCCGGCGG                 AGG             16
                             crRNA/tracrRNA         LIGCas-2                   GGAATTGTACCGTACGTGCCC                 CGG             17
                             crRNA/tracrRNA         LIGCas-3                   GCGTACGCGTACGTGTG                     AGG             18
LIG = Liguleless 1 Gene Promoter
[0772] The maize optimized Cas9 endonuclease expression cassette, crRNA expression cassettes containing the specific maize variable targeting domains (SEQ ID NOs: 445-447) complementary to the antisense strand of the maize genomic target sequences listed in Table 30, and a tracrRNA expression cassette (SEQ ID NO: 448) were co-delivered to 60-90 Hi-II immature maize embryos by particle-mediated delivery (see Example 5) in the presence of BBM and WUS2 genes (see Example 6). Hi-II maize embryos transformed with the Cas9 and long guide RNA expression cassettes targeting the LIGCas-3 genomic target site (SEQ ID NO: 18) for cleavage served as a positive control and embryos transformed with only the Cas9 expression cassette served as a negative control. After 7 days, the 20-30 most uniformly transformed embryos from each treatment were pooled and total genomic DNA was extracted. The region surrounding the intended target site was PCR amplified with Phusion® High Fidelity PCR Master Mix (New England Biolabs, M0531L), adding on the sequences necessary for amplicon-specific barcodes and Illumina sequencing using "tailed" primers through two rounds of PCR. The primers used in the primary PCR reaction are shown in Table 31 and the primers used in the secondary PCR reaction were AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACG (forward, SEQ ID NO: 53) and CAAGCAGAAGACGGCATA (reverse, SEQ ID NO: 54).
TABLE-US-00032 TABLE 31 PCR primer sequences
Target Site    Cas RNA System Used    Primer Orientation    Primary PCR Primer Sequence                                        SEQ ID NO:
LIGCas-1       crRNA/tracrRNA         Forward               CTACACTCTTTCCCTACACGACGCTCTTCCGATCTTCCTCTGTAACGATTTACGCACCTGCTG    36
LIGCas-1       crRNA/tracrRNA         Reverse               CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCAAATGAGTAGCAGCGCACGTAT         35
LIGCas-2       crRNA/tracrRNA         Forward               CTACACTCTTTCCCTACACGACGCTCTTCCGATCTGAAGCTGTAACGATTTACGCACCTGCTG    449
LIGCas-2       crRNA/tracrRNA         Reverse               CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTGCAAATGAGTAGCAGCGCACGTAT         35
LIGCas-3       crRNA/tracrRNA         Forward               CTACACTCTTTCCCTACACGACGCTCTTCCGATCTAAGGCGCAAATGAGTAGCAGCGCAC       37
LIGCas-3       crRNA/tracrRNA         Reverse               CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTCACCTGCTGGGAATTGTACCGTA          38
LIGCas-3       Long guide RNA         Forward               CTACACTCTTTCCCTACACGACGCTCTTCCGATCTTTCCCGCAAATGAGTAGCAGCGCAC       450
LIGCas-3       Long guide RNA         Reverse               CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTCACCTGCTGGGAATTGTACCGTA          38
[0773] The resulting PCR amplifications were purified with a Qiagen PCR purification spin column, concentration measured with a Hoechst dye-based fluorometric assay, combined in an equimolar ratio, and single read 100 nucleotide-length deep sequencing was performed on Illumina's MiSeq Personal Sequencer with a 30-40% (v/v) spike of PhiX control v3 (Illumina, FC-110-3001) to off-set sequence bias. Only those reads with a ≥1 nucleotide indel arising within the 10 nucleotide window centered over the expected site of cleavage and not found in a similar level in the negative control were classified as NHEJ mutations. NHEJ mutant reads with the same mutation were counted and collapsed into a single read and the top 10 most prevalent mutations were visually confirmed as arising within the expected site of cleavage. The total numbers of visually confirmed NHEJ mutations were then used to calculate the % mutant reads based on the total number of reads of an appropriate length containing a perfect match to the barcode and forward primer.
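The read-filtering logic described above can be sketched as follows, under several stated assumptions: each read has already been aligned and reduced to at most one indel record; an indel is treated as "arising within" the window when its start position falls within 5 nucleotides on either side of the expected cut; and the negative-control comparison is simplified to excluding any mutation listed in a control set (the text compares levels, which is not modeled here). The `Indel` record and all function names are hypothetical, and the visual confirmation step is not represented.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional


class Indel(NamedTuple):
    start: int   # 0-based position of the first affected base in the reference amplicon
    length: int  # number of inserted or deleted nucleotides
    kind: str    # "ins" or "del"


def in_window(indel: Indel, cut_site: int, window: int = 10) -> bool:
    """True if the indel starts inside the window centered over the cut site."""
    half = window // 2
    return cut_site - half <= indel.start < cut_site + half


def summarize_reads(read_indels: Iterable[Optional[Indel]], cut_site: int,
                    control_mutations=frozenset(), top_n: int = 10):
    """Collapse identical mutations and return (top mutations, % mutant reads).

    `read_indels` holds one entry per read: an Indel, or None for a read
    without an indel. Mutations present in `control_mutations` are ignored.
    """
    reads = list(read_indels)
    counts = Counter(
        indel for indel in reads
        if indel is not None
        and indel.length >= 1
        and in_window(indel, cut_site)
        and indel not in control_mutations
    )
    mutant_reads = sum(counts.values())
    percent = 100 * mutant_reads / len(reads) if reads else 0.0
    return counts.most_common(top_n), percent


# Hypothetical reads: 96 unmodified, 3 sharing a 1-nt deletion at the cut,
# and 1 carrying an insertion far outside the window.
example_reads = [None] * 96 + [Indel(50, 1, "del")] * 3 + [Indel(80, 2, "ins")]
top_mutations, percent_mutant = summarize_reads(example_reads, cut_site=50)
print(top_mutations, percent_mutant)
```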
[0774] The frequency of NHEJ mutations recovered by deep sequencing for the crRNA/tracrRNA/Cas endonuclease system targeting the three LIGCas targets (SEQ ID NOS: 16, 17, 18) compared to the long guide RNA/Cas endonuclease system targeting the same locus is shown in Table 32.
TABLE-US-00033 TABLE 32 Percent (%) mutant reads at maize Liguleless 1 target locus produced by crRNA/tracrRNA/Cas endonuclease system compared to the long guide RNA/Cas endonuclease system
System                      Total Number of Reads    Number of Mutant Reads    % Mutant Reads
Cas9 Only Control           1,744,427                0                         0.00%
LIGCas-3 long guide RNA     1,596,955                35,300                    2.21%
LIGCas-1 crRNA/tracrRNA     1,803,163                4,331                     0.24%
LIGCas-2 crRNA/tracrRNA     1,648,743                3,290                     0.20%
LIGCas-3 crRNA/tracrRNA     1,681,130                2,409                     0.14%
[0775] The ten most prevalent types of NHEJ mutations recovered based on the crRNA/tracrRNA/Cas endonuclease system are shown in FIG. 27A (for LIGCas-1 target site, corresponding to SEQ ID NOs: 415-424), FIG. 27B (for LIGCas-2 target site, corresponding to SEQ ID NOs: 425-434) and FIG. 27C (for LIGCas-3 target site, corresponding to SEQ ID NOs: 435-444). Approximately 9- to 16-fold lower frequencies of NHEJ mutations were observed when using a crRNA/tracrRNA/Cas endonuclease system to introduce a double strand break at a maize genomic target site, relative to the long guide RNA/Cas endonuclease system control.
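The percentages in Table 32 and the fold difference stated above can be recomputed from the read counts. The sketch below uses only the counts reported in Table 32; the dictionary layout and names are illustrative assumptions.

```python
# Read counts from Table 32: system -> (total number of reads, number of mutant reads).
table_32 = {
    "Cas9 Only Control": (1_744_427, 0),
    "LIGCas-3 long guide RNA": (1_596_955, 35_300),
    "LIGCas-1 crRNA/tracrRNA": (1_803_163, 4_331),
    "LIGCas-2 crRNA/tracrRNA": (1_648_743, 3_290),
    "LIGCas-3 crRNA/tracrRNA": (1_681_130, 2_409),
}


def percent_mutant(total_reads: int, mutant_reads: int) -> float:
    """Percent of reads carrying an NHEJ mutation."""
    return 100 * mutant_reads / total_reads


percents = {name: percent_mutant(*counts) for name, counts in table_32.items()}

# Fold reduction of each crRNA/tracrRNA treatment relative to the long guide RNA control.
reference = percents["LIGCas-3 long guide RNA"]
fold_lower = {
    name: reference / value
    for name, value in percents.items()
    if name.endswith("crRNA/tracrRNA")
}

for name, value in percents.items():
    print(f"{name}: {value:.2f}%")
for name, fold in fold_lower.items():
    print(f"{name}: {fold:.1f}-fold lower than the long guide RNA control")
```

Computed from the unrounded counts, the ratios fall at roughly 9, 11 and 15; dividing the rounded percentages instead (for example 2.21/0.14) gives a value closer to the upper end of the stated 9- to 16-fold range.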
[0776] Taken together, our data indicate that the maize optimized crRNA/tracrRNA/Cas endonuclease system described herein cleaves maize chromosomal DNA and generates imperfect NHEJ mutations.
Example 23
[0777] Modifying the ARGOS8 Gene to Improve Drought Tolerance and Nitrogen Use Efficiency in Maize Plants
[0778] ARGOS is a negative regulator for ethylene responses in plants (WO 2013/066805 A1, published 10 May 2013). ARGOS proteins target the ethylene signal transduction pathway. When over-expressed in maize plants, ARGOS reduces plant sensitivity to ethylene and promotes organ growth, leading to increased drought tolerance (DRT) and improved nitrogen use efficiency (NUE) (WO 2013/066805 A1, published 10 May 2013). To achieve optimal ethylene sensitivity, promoters have been tested for driving Zm-ARGOS8 over-expression in transgenic maize plants. Field trials showed that a maize promoter, Zm-GOS2 PRO:GOS2 INTRON (SEQ ID NO:460, U.S. Pat. No. 6,504,083 patent issued on Jan. 7, 2003; Zm-GOS2 is a maize homologous gene of rice GOS2. Rice GOS2 stands for Gene from Oryza Sativa 2), provided a favorable expression level and tissue coverage for Zm-ARGOS8, and the transgenic plants had a higher grain yield than non-transgenic controls under drought stress and low nitrogen conditions (WO 2013/066805 A1, published 10 May 2013). However, these transgenic plants contain two ARGOS8 genes, the endogenous gene and the transgene. ARGOS8 protein levels, therefore, are determined by these two genes. Because the endogenous ARGOS8 gene varies in sequence and expression level among different inbred lines, the ARGOS8 protein level will be different when the transgene is integrated into different inbreds. Here we present a mutagenization (gene editing) method to modify the promoter region of the endogenous ARGOS8 gene to attain desired expression patterns and eliminate the need for a transgene.
[0779] The promoter Zm-GOS2 PRO:GOS2 INTRON (SEQ ID NO:460; U.S. Pat. No. 6,504,083 patent issued on Jan. 7, 2003) was inserted into the 5'-UTR of Zm-ARGOS8 (SEQ ID NO:462) by using a guideRNA/Cas9 system. The Zm-GOS2 PRO:GOS2 INTRON fragment also included a primer binding site (SEQ ID NO:459) at its 5' end to facilitate event screening with PCR. We also substituted the native promoter of Zm-ARGOS8 (SEQ ID NO:461) with Zm-GOS2 PRO:GOS2 INTRON (SEQ ID NO:460). The resulting maize lines carry a new ARGOS8 allele whose expression levels and tissue specificity will differ from the native form. We expect that these lines will recapitulate the phenotype of increased drought tolerance and improved NUE as observed in the Zm-GOS2 PRO:Zm-ARGOS8 transgenic plants (WO 2013/066805 A1, published 10 May 2013). These maize lines differ from conventional transgenic events in that: (1) there is only one ARGOS8 gene in the genome; (2) this modified version of Zm-ARGOS8 resides at its native locus; (3) the ARGOS8 protein level and the tissue specificity of gene expression are entirely controlled by the edited allele. The DNA reagents used during the mutagenization, such as guideRNA, Cas9 endonuclease, transformation selection marker and other DNA fragments, are not required for function of the newly generated ARGOS8 allele and can be eliminated from the genome by segregation through standard breeding methods. Because the promoter Zm-GOS2 PRO:GOS2 INTRON was copied from the maize GOS2 gene (SEQ ID NO:464) and inserted into the ARGOS8 locus through homologous recombination, this ARGOS8 allele is indistinguishable from natural mutant alleles.
A. Insertion of Zea mays-GOS2 PRO:GOS2 INTRON into Maize-ARGOS 8 Promoter
[0780] To insert Zm-GOS2 PRO:GOS2 INTRON into the 5'-UTR of the maize ARGOS8 gene, a guideRNA construct, gRNA1, was made using the maize U6 promoter and terminator as described herein. The 5'-end of the guide RNA contained a 19-bp variable targeting domain targeting the genomic target sequence 1 (CTS1; SEQ ID NO: 451) in the 5'-UTR of Zm-ARGOS8 (FIG. 28). A polynucleotide modification template contained the Zm-GOS2 PRO:GOS2 INTRON flanked by two genomic DNA fragments (HR1 and HR2, 370 and 430-bp in length, respectively) derived from the upstream and downstream regions of the CTS1 (FIG. 28). The gRNA1 construct, the polynucleotide modification template, a Cas9 cassette and the transformation selection marker phosphomannose isomerase (PMI) were introduced into maize immature embryo cells by using a particle bombardment method. PMI-resistant calli were screened with PCR for Zm-GOS2 PRO:GOS2 INTRON insertion (FIGS. 29A and 29B). Multiple callus events were identified and plants were regenerated. The insertion events were confirmed by amplifying the Zm-ARGOS8 region in T0 plants with PCR (FIG. 29C) and sequencing the PCR products.
B. Replacement of Zm-ARGOS 8 Promoter with Zm-GOS2 PRO:GOS2 INTRON Promoter (Promoter Swap).
[0781] To substitute (replace) the native promoter of Zm-ARGOS8 with Zm-GOS2 PRO:GOS2 INTRON, a guide RNA construct, gRNA3, was made for targeting the genomic target site CTS3 (SEQ ID NO:453), located 710-bp upstream of the Zm-ARGOS8 start codon (FIG. 30). Another guide RNA, gRNA2, was designed to target the genomic target site CTS2 (SEQ ID NO:452) located in the 5'-UTR of Zm-ARGOS8 (FIG. 30). The polynucleotide modification template contained a 400-bp genomic DNA fragment derived from the upstream region of CTS3, Zm-GOS2 PRO:GOS2 INTRON and a 360-bp genomic DNA fragment derived from the downstream region of CTS2 (FIG. 30). The gRNA3 and gRNA2, the Cas9 cassette, the polynucleotide modification template and the PMI selection marker were used to transform immature embryo cells. Multiple promoter swap (promoter replacement) events were identified by PCR screening of the PMI-resistant calli (FIGS. 31A, 31B & 31C) and plants were regenerated. The swap events were confirmed by PCR analysis of the Zm-ARGOS8 region in T0 plants (FIG. 31D).
C. Deletion of Zm-ARGOS 8 Promoter
[0782] To delete the promoter of Zm-ARGOS8, we screened the PMI-resistant calli obtained from the above gRNA3/gRNA2 experiment to look for events that produce a 1.1-kb PCR product (FIG. 32A). Multiple deletion events were identified (FIG. 32B) and plants were regenerated. The deletion events were confirmed by amplifying the Zm-ARGOS8 region in T0 plants with PCR and sequencing the PCR products.
Example 24
[0783] Gene Editing of the Soybean EPSPS1 Gene Using the Guide RNA/Cas Endonuclease System
A. guideRNA/Cas9 Endonuclease Target Site Design on the Soybean EPSPS Genes.
[0784] Two guideRNA/Cas9 endonuclease target sites (soy EPSPS-CR1 and soy EPSPS-CR2) were identified in the Exon2 of the soybean EPSPS1 gene Glyma01g33660 (Table 33).
TABLE-US-00034 TABLE 33 Guide RNA/Cas9 endonuclease target sites on soybean EPSPS1 gene
Name of gRNA-Cas9 endonuclease target site    Cas endonuclease target sequence (SEQ ID NO:)    Physical location
soy EPSPS-CR1                                 467                                              Gm01: 45865337 . . . 45865315
soy EPSPS-CR2                                 468                                              Gm01: 45865311 . . . 45865333
B. Guide-RNA Expression Cassettes, Cas9 Endonuclease Expression Cassettes and Polynucleotide Modification Templates for Introduction of Specific Amino Acid Changes in the Soybean EPSPS1 Gene
[0785] The soybean U6 small nuclear RNA promoter, GM-U6-13.1 (SEQ ID NO: 469), was used to express guide RNAs to direct Cas9 nuclease to designated genomic target sites (Table 34). A soybean codon optimized Cas9 endonuclease (SEQ ID NO: 489) expression cassette and a guide RNA expression cassette were linked in a first plasmid that was co-delivered with a polynucleotide modification template. The polynucleotide modification template contained specific nucleotide changes that encoded for amino acid changes in the EPSPS1 polypeptide (Glyma01g33660), such as the T183I and P187S (TIPS) in the Exon2. Other amino acid changes in the EPSPS1 polypeptide can also be obtained using the guide RNA/Cas endonuclease system described herein. Specific amino acid modifications can be achieved by homologous recombination between the genomic DNA and the polynucleotide modification template facilitated by the guideRNA/Cas endonuclease system.
TABLE-US-00035 TABLE 34 Guide RNA/Cas9 expression cassettes and polynucleotide modification templates used in soybean stable transformation for the specific amino acid modifications of the EPSPS1 gene.
Experiment       Guide RNA/Cas9 (plasmid name)               SEQ ID NO:    polynucleotide modification template    SEQ ID NO:
soy EPSPS-CR1    U6-13.1:EPSPS CR1 + EF1A2:CAS9 (QC878)      470           RTW1013A                                472
soy EPSPS-CR2    U6-13.1:EPSPS CR2 + EF1A2:CAS9 (QC879)      471           RTW1012A                                473
C. Detection of Site-Specific Non-Homologous-End-Joining (NHEJ) Mediated by the Guide RNA/Cas9 System in Stably Transformed Soybean
[0786] Genomic DNA was extracted from somatic embryo samples and analyzed by quantitative PCR using a 7500 real time PCR system (Applied Biosystems, Foster City, Calif.) with target site-specific primers and a FAM-labeled fluorescence probe to check copy number changes of the double strand break target sites. The qPCR analysis was done in duplex reactions with a syringolide induced protein (SIP) as the endogenous control and a wild type 93B86 genomic DNA sample that contains one copy of the target site with 2 alleles as the single copy calibrator. The presence or absence of the guide RNA-Cas9 expression cassette in the transgenic events was also analyzed with the qPCR primer/probes for guideRNA/Cas9 (SEQ ID NOs: 477-479) and for PinII (SEQ ID NOs: 480-482). The qPCR primers/probes are listed in Table 35.
TABLE-US-00036 TABLE 35 Primers/Probes used in qPCR analyses of transgenic soybean events.
Target Site              Primer/Probe Name       Sequences                         SEQ ID NOs:
EPSPS-CR1 & EPSPS-CR2    Soy1-F1                 CCACTAGTAAGGAATCTAAAGATGAAATCA    474
                         Soy1-R2                 CCTGCAGCAACCACAGCTGCTGTC          475
                         Soy1-T1 (FAM-MGB)       CTGCAATGCGTCCTT                   476
gRNA/CAS9                Cas9-F                  CCTTCTTCCACCGCCTTGA               477
                         Cas9-R                  TGGGTGTCTCTCGTGCTTTTT             478
                         Cas9-T (FAM-MGB)        AATCATTCCTGGTGGAGGA               479
pINII                    pINII-99F               TGATGCCCACATTATAGTGATTAGC         480
                         pINII-13R               CATCTTCTGGATTGGCCAACTT            481
                         pINII-69T (FAM-MGB)     ACTATGTGTGCATCCTT                 482
SIP                      SIP-130F                TTCAAGTTGGGCTTTTTCAGAAG           483
                         SIP-198R                TCTCCTTGGTGCTCTCATCACA            484
                         SIP-170T (VIC-MGB)      CTGCAGCAGAACCAA                   485
[0787] The endogenous control probe SIP-T was labeled with VIC and the gene-specific probes for all the target sites were labeled with FAM for the simultaneous detection of both fluorescent probes (Applied Biosystems). PCR reaction data were captured and analyzed using the sequence detection software provided with the 7500 real time PCR system and the gene copy numbers were calculated using the relative quantification methodology (Applied Biosystems).
[0788] Since the wild type 93B86 genomic DNA with two alleles of the double strand break target site was used as the single copy calibrator, events without any change of the target site would be detected as one copy, herein termed Wt-Homo (qPCR value>=0.7); events with one allele changed, which is no longer detectable by the target site-specific qPCR, would be detected as half copy, herein termed NHEJ-Hemi (qPCR value between 0.1 and 0.7); and events with both alleles changed would be detected as null, herein termed NHEJ-Null (qPCR value<=0.1). As shown in Table 36, both guideRNA/Cas endonuclease systems targeting the soy EPSPS-CR1 and EPSPS-CR2 sites introduced double strand breaks (DSBs) efficiently at their designed target sites. Both NHEJ-Hemi and NHEJ-Null were detected in the 93B86 genotype. NHEJ (Non-Homologous-End-Joining) mutations mediated by the guide RNA/Cas9 system at the specific Cas9 target sites were confirmed by PCR/topo cloning/sequencing.
TABLE-US-00037 TABLE 36 Target Site Double Strand Break Rate Mutations Induced by the Guide RNA/Cas9 system on soybean EPSPS1 gene. Numbers indicate no. of events (numbers in parentheses are %).
Project               Total event    Wt-Homo (%)    NHEJ-Hemi (%)    NHEJ-Null (%)
U6-13.1 EPSPS-CR1     168            63 (38%)       66 (39%)         39 (23%)
U6-13.1 EPSPS-CR2     111            50 (45%)       21 (19%)         40 (36%)
D. Detection of the TIPS Mutation in the Soybean EPSPS Gene
[0789] In order to edit specific amino acids at the native EPSPS gene (such as those resulting in a TIPS modification), a polynucleotide modification template, such as RTW1013A or RTW1012A (Table 34), was co-delivered with the guideRNA/Cas9 expression cassettes into soybean cells.
[0790] The modification of the native EPSPS1 gene via guide RNA/Cas9 system mediated DNA homologous recombination was determined by specific PCR analysis. A specific PCR assay with primer pair WOL569 (SEQ ID NO: 486) and WOL876 (SEQ ID NO: 487) was used to detect perfect TIPS modification at the native EPSPS1 gene. A second primer pair, WOL569 (SEQ ID NO: 486) and WOL570 (SEQ ID NO: 488), was used to amplify both the TIPS-modified EPSPS1 allele and the WT (wild type)/NHEJ-mutated allele. Topo cloning/sequencing was used to verify the sequences.
Example 25
[0791] Intron Replacement of Soybean Genes Using the guideRNA/Cas Endonuclease System
A. guideRNA/Cas9 Endonuclease Target Site Design.
[0792] Four guideRNA/Cas9 endonuclease target sites were identified in the soybean EPSPS1 gene Glyma01g33660 (Table 37). Two of the target sites (soy EPSPS-CR1 and soy EPSPS-CR2) were identified to target the Exon2 of the soybean EPSPS gene as described in Example 24. Another two target sites (soy EPSPS-CR4 and soy EPSPS-CR5) were designed near the 5' end of the intron1 of the soybean EPSPS gene.
TABLE-US-00038 TABLE 37 Guide RNA/Cas9 endonuclease target sites on soybean EPSPS1 gene.
Name of gRNA-Cas9 endonuclease target site | Cas endonuclease target sequence (SEQ ID NO:) | Physical location
soy EPSPS-CR1 | 467 | Gm01: 45865337 . . . 45865315
soy EPSPS-CR2 | 468 | Gm01: 45865311 . . . 45865333
soy EPSPS-CR4 | 490 | Gm01: 45866302 . . . 45866280
soy EPSPS-CR5 | 491 | Gm01: 45866295 . . . 45866274
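For illustration only, the physical locations listed in Table 37 can be parsed to obtain the length of each target site and the order in which its coordinates are listed. The parsing function is a hypothetical helper. Whether a descending coordinate order denotes a site on the opposite strand is an assumption not stated in the table, so the sketch reports only the listed order.

```python
import re

# Physical locations copied from Table 37.
TABLE_37 = {
    "soy EPSPS-CR1": "Gm01: 45865337 . . . 45865315",
    "soy EPSPS-CR2": "Gm01: 45865311 . . . 45865333",
    "soy EPSPS-CR4": "Gm01: 45866302 . . . 45866280",
    "soy EPSPS-CR5": "Gm01: 45866295 . . . 45866274",
}


def parse_location(location):
    """Split 'Gm01: start . . . end' into chromosome, start, end, length, order."""
    match = re.fullmatch(r"(\w+):\s*(\d+)\s*(?:\.\s*)+(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognized location: {location!r}")
    chromosome = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    length = abs(end - start) + 1  # inclusive coordinates
    order = "descending" if start > end else "ascending"
    return chromosome, start, end, length, order


lengths = {}
for name, location in TABLE_37.items():
    chromosome, start, end, length, order = parse_location(location)
    lengths[name] = length
    print(name, chromosome, start, end, f"{length} bp", order)
```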
B. Guide RNA/Cas9 Endonuclease Expression Cassettes and Polynucleotide Modification Templates Used in Soybean Stable Transformation for the Replacement of the Intron 1 of the Soybean EPSPS1 Gene with the Soybean Ubiquitin (UBQ) Intron 1
[0793] The soybean U6 small nuclear RNA promoter GM-U6-13.1 (SEQ ID NO: 469) was used to express two guide RNAs (soy-EPSPS-CR1 and soy-EPSPS-CR4, or soy-EPSPS-CR1 and soy-EPSPS-CR5) to direct Cas9 endonuclease to designated genomic target sites (Table 38). One of the target sites (soy-EPSPS-CR1) was located in exon2, as described in Example 24, and a second target site (soy-EPSPS-CR4 or soy-EPSPS-CR5) was located near the 5' end of intron1 of the native EPSPS1 gene. A soybean codon optimized Cas9 endonuclease expression cassette and a guide RNA expression cassette were linked in the expression plasmids QC878/RTW1199 (SEQ ID NO:470/492) or QC878/RTW1200 (SEQ ID NO:470/493) that were co-delivered with a polynucleotide modification template. The polynucleotide modification template, RTW1190A (SEQ ID NO:494), contained the 532 bp intron1 of the soybean UBQ gene and the TIPS modified Exon2. Soybean EPSPS1 intron 1 replacement with the soybean UBQ intron1 can be achieved with the guide RNA/Cas system by homologous recombination between the genomic DNA and the polynucleotide modification template, resulting in enhancement of the native or modified soy EPSPS1 gene expression.
TABLE-US-00039 TABLE 38 Guide RNA/Cas9 endonuclease expression cassettes and polynucleotide modification templates used in soybean stable transformation for the replacement of the Intron1 of the soybean EPSPS1 gene with the soybean ubiquitin (UBQ) intron1
Experiment | Guide RNA/Cas9 | SEQ ID NO: | polynucleotide modification template | SEQ ID NO:
soy EPSPS-CR1 and soy EPSPS-CR4 | U6-13.1:EPSPS CR1 + CR4 + EF1A2:CAS9 (QC878/RTW1199) | 470/492 | RTW1190A | 494
soy EPSPS-CR1 and soy EPSPS-CR5 | U6-13.1:EPSPS CR1 + CR5 + EF1A2:CAS9 (QC878/RTW1200) | 470/493 | RTW1190A | 494
C. Detection of Site-Specific NHEJ Mediated by the Guide RNA/Cas9 System in Stably Transformed Soybean
[0794] Site-specific NHEJ was detected as described in Example 24C, using the qPCR primers/probes listed in Table 39.
TABLE-US-00040 TABLE 39 Primers/Probes used in qPCR analyses of transgenic soybean events.
Target Site | Primer/Probe Name | Sequences | SEQ ID NOs:
EPSPS-CR1 & EPSPS-CR2 | Soy1-F1 | CCACTAGTAAGGAATCTAAAGATGAAATCA | 474
EPSPS-CR1 & EPSPS-CR2 | Soy1-R2 | CCTGCAGCAACCACAGCTGCTGTC | 475
EPSPS-CR1 & EPSPS-CR2 | Soy1-T1 (FAM-MGB) | CTGCAATGCGTCCTT | 476
EPSPS-CR4 | Soy1-F3 | GTTTGTTTGTTGTTGGGTGTGGG | 495
EPSPS-CR4 | Soy1-R3 | GACATGATGCTTCATTTTCACAGAA | 496
EPSPS-CR4 | Soy-T2 (FAM-MGB) | TGTGTAGAGTGGATTTTG | 497
EPSPS-CR5 | Soy1-F2 | TGTTGTTGGGTGTGGGAATAGG | 498
EPSPS-CR5 | Soy1-R3 | GACATGATGCTTCATTTTCACAGAA | 496
EPSPS-CR5 | Soy1-T2 (FAM-MGB) | TGTGTAGAGTGGATTTTG | 497
gRNA/CAS9 | Cas9-F | CCTTCTTCCACCGCCTTGA | 477
gRNA/CAS9 | Cas9-R | TGGGTGTCTCTCGTGCTTTTT | 478
gRNA/CAS9 | Cas9-T (FAM-MGB) | AATCATTCCTGGTGGAGGA | 479
plNll | plNll-99F | TGATGCCCACATTATAGTGATTAGC | 480
plNll | plNll-13R | CATCTTCTGGATTGGCCAACTT | 481
plNll | plNll-69T (FAM-MGB) | ACTATGTGTGCATCCTT | 482
SIP | SIP-130F | TTCAAGTTGGGCTTTTTCAGAAG | 483
SIP | SIP-198R | TCTCCTTGGTGCTCTCATCACA | 484
SIP | SIP-170T (VIC-MGB) | CTGCAGCAGAACCAA | 485
D. Detection of the Replacement of the Soybean EPSPS1 Intron1 with the Soybean UBQ Intron 1 Using the Guide RNA/Cas9 Endonuclease System.
[0795] In order to replace the soybean EPSPS1 intron1 with the soybean UBQ intron1 at the native EPSPS1 gene, two guideRNA expression vectors were used as shown in Table 38. The QC878 vector (SEQ ID NO: 470) targeted exon2, and the RTW1199 (SEQ ID NO:492) or RTW1200 (SEQ ID NO:493) vector targeted the 5' end of intron1. The double cleavage of the soybean EPSPS gene with the two guide RNA/Cas systems resulted in the removal of the native EPSPS1 intron1/partial Exon2 fragment. At the same time, a polynucleotide modification template RTW1190A (SEQ ID NO:494) was co-delivered into soybean cells, and homologous recombination between the polynucleotide modification template and the genomic DNA resulted in the replacement of EPSPS1 intron1 with the soybean UBQ intron1 and the desired amino acid modifications in exon2, as evidenced by PCR analysis. PCR assays with the WOL1001/WOL1002 primer pair (SEQ ID NO: 499 and 500) and the WOL1003/WOL1004 primer pair (SEQ ID NO: 501 and 502) were used to detect the intron replacement events.
Example 26
[0796] Promoter Replacement (Promoter Swap) of Soybean Genes Using the guideRNA/Cas Endonuclease System
A. guideRNA/Cas9 Endonuclease Target Site Design.
[0797] Four guideRNA/Cas9 endonuclease target sites were identified in the soybean EPSPS1 gene Glyma01g33660 (Table 40). Two of the target sites (soy EPSPS-CR1 and soy EPSPS-CR2) were identified to target Exon2 of the soybean EPSPS gene as described in Example 24. The soy EPSPS-CR6 and soy EPSPS-CR7 target sites were identified near the 5' end of the -798 bp native EPSPS promoter.
TABLE-US-00041 TABLE 40 Guide RNA/Cas9 endonuclease target sites on soybean EPSPS1 gene.
Name of gRNA-Cas9 endonuclease target site | Cas endonuclease target sequence (SEQ ID NO:) | Physical location
soy EPSPS-CR1 | 467 | Gm01: 45865337 . . . 45865315
soy EPSPS-CR2 | 468 | Gm01: 45865311 . . . 45865333
soy EPSPS-CR6 | 503 | Gm01: 45867471 . . . 45867493
soy EPSPS-CR7 | 504 | Gm01: 45867459 . . . 45867481
B. Guide RNA/Cas9 Endonuclease Expression Cassettes and Polynucleotide Modification Templates Used in Soybean Stable Transformation for the Replacement of the -798 bp Soybean EPSPS1 Promoter with the Soybean UBQ Promoter.
[0798] The soybean U6 small nuclear RNA promoter GM-U6-13.1 (SEQ ID NO: 469) was used to express two guide RNAs (soyEPSPS-CR1 and soyEPSPS-CR6, or soyEPSPS-CR1 and soyEPSPS-CR7) to direct Cas9 nuclease to designated genomic target sites (Table 41). One of the target sites (soy-EPSPS-CR1) was located in exon2, as described in Example 24, and a second target site (soy-EPSPS-CR6 or soy-EPSPS-CR7) was located near the 5' end of the -798 bp native EPSPS1 promoter. A soybean codon optimized Cas9 endonuclease expression cassette and a guide RNA expression cassette were linked in the expression plasmids QC878/RTW1201 (SEQ ID NO:470/505) or QC878/RTW1202 (SEQ ID NO:470/506) that were co-delivered with a polynucleotide modification template, RTW1192A (SEQ ID NO:507). The polynucleotide modification template contained 1369 bp of the soybean UBQ gene promoter, the 47 bp 5'UTR and the 532 bp UBQ intron1. Specific soybean EPSPS1 promoter replacement with the soybean UBQ promoter can be achieved with the guide RNA/Cas system by homologous recombination between the genomic DNA and the polynucleotide modification template, resulting in enhancement of the native or modified soy EPSPS1 gene expression.
TABLE-US-00042 TABLE 41 Guide RNA/Cas9 endonuclease expression cassettes and polynucleotide modification templates used in soybean stable transformation for the replacement of the -798 bp soybean EPSPS1 promoter with the soybean UBQ promoter
Experiment | Guide RNA/Cas9 | SEQ ID NO: | polynucleotide modification template | SEQ ID NO:
soy EPSPS-CR1 and soy EPSPS-CR6 | U6-13.1:EPSPS CR1 + CR6 + EF1A2:CAS9 (QC878/RTW1201) | 470, 505 | RTW1192A | 507
soy EPSPS-CR1 and soy EPSPS-CR7 | U6-13.1:EPSPS CR1 + CR7 + EF1A2:CAS9 (QC878/RTW1202) | 470, 506 | RTW1192A | 507
C. Detection of Site-Specific NHEJ Mediated by the Guide RNA/Cas9 System in Stably Transformed Soybean
[0799] Site-specific NHEJ was detected as described in Example 24C, using the qPCR primers/probes listed in Table 42.
TABLE-US-00043 TABLE 42 Primers/Probes used in qPCR analyses of transgenic soybean events
Target Site | Primer/Probe Name | Sequences | SEQ ID NOs:
EPSPS-CR1 & EPSPS-CR2 | Soy1-F1 | CCACTAGTAAGGAATCTAAAGATGAAATCA | 474
EPSPS-CR1 & EPSPS-CR2 | Soy1-R2 | CCTGCAGCAACCACAGCTGCTGTC | 475
EPSPS-CR1 & EPSPS-CR2 | Soy1-T1 (FAM-MGB) | CTGCAATGCGTCCTT | 476
EPSPS-CR6 & EPSPS-CR7 | Soy1-F4 | TCAATAATACTACTCTCTTAGACACCAAACAA | 508
EPSPS-CR6 & EPSPS-CR7 | Soy1-R4 | CAAGGAAAATGAATGATGGCTTT | 509
EPSPS-CR6 & EPSPS-CR7 | Soy1-T3 (FAM-MGB) | CCTTCCCAAACTATAATC | 510
gRNA/CAS9 | Cas9-F | CCTTCTTCCACCGCCTTGA | 477
gRNA/CAS9 | Cas9-R | TGGGTGTCTCTCGTGCTTTTT | 478
gRNA/CAS9 | Cas9-T (FAM-MGB) | AATCATTCCTGGTGGAGGA | 479
plNll | plNll-99F | TGATGCCCACATTATAGTGATTAGC | 480
plNll | plNll-13R | CATCTTCTGGATTGGCCAACTT | 481
plNll | plNll-69T (FAM-MGB) | ACTATGTGTGCATCCTT | 482
SIP | SIP-130F | TTCAAGTTGGGCTTTTTCAGAAG | 483
SIP | SIP-198R | TCTCCTTGGTGCTCTCATCACA | 484
SIP | SIP-170T (VIC-MGB) | CTGCAGCAGAACCAA | 485
D. Detection of the Promoter Replacement of the Soybean EPSPS1 Promoter with the Soybean UBQ Promoter Using the Guide RNA/Cas9 Endonuclease System.
[0800] In order to replace the soybean EPSPS1 promoter with the soybean UBQ promoter at the native EPSPS1 gene, two guideRNA expression vectors were used in each soybean transformation experiment as shown in Table 41. The QC878 vector (SEQ ID NO: 470) targeted exon2, and the RTW1201 (SEQ ID NO: 505) or RTW1202 (SEQ ID NO: 506) vector targeted the 5' end of the soybean -798 bp promoter. The double cleavage of the soybean EPSPS1 gene with the two guide RNA/Cas systems resulted in removal of the native EPSPS1 promoter/5'UTR-Exon1/Intron1/partial Exon2 fragment at the native EPSPS gene. At the same time, a polynucleotide modification template RTW1192A (SEQ ID NO: 507) was co-delivered into soybean cells. This RTW1192A DNA contained the 1369 bp soybean UBQ promoter, its 47 bp 5'UTR and the 532 bp UBQ intron1 in front of the EPSPS1 exon1-Intron1-modified Exon2. Homologous recombination between the polynucleotide modification template and the genomic DNA resulted in the replacement of the EPSPS1 promoter/5'UTR with the soybean UBQ promoter/5'UTR/Intron1 and the desired amino acid modifications, as evidenced by PCR analysis. PCR assays with the WOL1005/WOL1006 primer pair (SEQ ID NO: 511 and 512) and the WOL1003/WOL1004 primer pair (SEQ ID NO: 501 and 502) were used to detect the promoter replacement events.
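As a rough, non-limiting illustration, the size of the genomic fragment lying between the paired target sites can be bounded from the physical locations listed in Table 40. The exact cleavage position within each target site is not assumed here, so the sketch reports only the gap strictly between the two sites (a lower bound) and the span covering both sites (an upper bound). The function names are hypothetical.

```python
# Target-site coordinates (Gm01) copied from Table 40.
TABLE_40 = {
    "soy EPSPS-CR1": (45865337, 45865315),
    "soy EPSPS-CR6": (45867471, 45867493),
    "soy EPSPS-CR7": (45867459, 45867481),
}


def site_bounds(site):
    """Return (low, high) coordinates of a target site, ignoring listed order."""
    low, high = sorted(TABLE_40[site])
    return low, high


def fragment_size_range(site_a, site_b):
    """Return (inner gap, outer span) in bp for two non-overlapping sites."""
    (a_low, a_high), (b_low, b_high) = sorted([site_bounds(site_a), site_bounds(site_b)])
    inner_gap = b_low - a_high - 1   # bases strictly between the two sites
    outer_span = b_high - a_low + 1  # bases from the first to the last site base
    return inner_gap, outer_span


for partner in ("soy EPSPS-CR6", "soy EPSPS-CR7"):
    inner, outer = fragment_size_range("soy EPSPS-CR1", partner)
    print(f"soy EPSPS-CR1 + {partner}: between {inner} and {outer} bp")
```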
Example 27
Enhancer Element Deletions Using the guideRNA/Cas Endonuclease System
[0801] The guide RNA/Cas endonuclease system described herein can be used to allow for the deletion of a promoter element from either a transgenic (pre-existing, artificial) or endogenous gene. Promoter elements, such as enhancer elements, are often introduced in multiple copies (3×=3 copies of enhancer element, FIG. 33) into promoters driving gene expression cassettes for trait gene testing or to produce transgenic plants expressing a specific trait. Enhancer elements can be, but are not limited to, a 35S enhancer element (Benfey et al, EMBO J, August 1989; 8(8): 2195-2202, SEQ ID NO:513). In some plants (events), the enhancer elements can cause an unwanted phenotype, a yield drag, or an undesired change in the expression pattern of the trait of interest. For example, as shown in FIG. 33, a plant comprising multiple enhancer elements (3 copies, 3×) in its genomic DNA located between two trait cassettes (Trait A and Trait B) was characterized to show an unwanted phenotype. It is desired to remove the extra copies of the enhancer element while keeping the trait gene cassettes intact at their integrated genomic location. The guide RNA/Cas endonuclease system described herein can be used to remove the unwanted enhancer element from the plant genome. A guide RNA can be designed to contain a variable targeting region targeting a target site sequence of 12-30 bps adjacent to a NGG (PAM) in the enhancer. If a Cas endonuclease target site sequence is present in all copies of the enhancer elements (such as the three Cas endonuclease target sites 35S-CRTS1 (SEQ ID NO:514), 35S-CRTS2 (SEQ ID NO:515), 35S-CRTS3 (SEQ ID NO:516)), only one guide RNA is needed to guide the Cas endonuclease to the target sites and induce a double strand break in all the enhancer elements at once. The Cas endonuclease can cleave to remove one or multiple enhancers. The guideRNA/Cas endonuclease system can be introduced by either Agrobacterium or particle gun bombardment.
Alternatively, two different guide RNAs (targeting two different genomic target sites) can be used to remove all 3× enhancer elements from the genome of an organism, in a manner similar to the removal of a (transgenic or endogenous) promoter described herein.
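The target site selection described above (a target sequence of 12-30 bp adjacent to an NGG PAM that recurs in every copy of the enhancer) can be sketched as follows. This is a minimal illustration only: the repeat sequence is a made-up placeholder rather than the 35S enhancer, the default target length of 20 nt is an assumption within the stated 12-30 bp range, and the function names are hypothetical.

```python
import re
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_target_sites(seq, target_len=20):
    """List (strand, start, target, pam) for every target followed by an NGG PAM.

    Start positions on the '-' strand are given in reverse-complement coordinates.
    """
    # Zero-width lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{target_len}}})([ACGT]GG))")
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            sites.append((strand, m.start(), m.group(1), m.group(2)))
    return sites


# Placeholder repeat unit (hypothetical), present in three tandem copies (3x).
REPEAT_UNIT = "ATCATCATCATCATCATCAT" + "TGG" + "AATT"
locus = REPEAT_UNIT * 3

sites = find_target_sites(locus)
copies_per_target = Counter(target for _, _, target, _ in sites)
for target, n in copies_per_target.items():
    print(target, "occurs next to a PAM in", n, "copies")
```

A target sequence that occurs once per repeat copy is a candidate for the single-guide approach, whereas targets unique to the flanks of the repeat array would be candidates for the two-guide approach.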
Example 28
Regulatory Sequence Modifications Using the Guide RNA/Cas Endonuclease System
A. Modification of Polyubiquitination Sites
[0802] There are defined ubiquitination sites on proteins to be degraded, and such sites were found within the maize EPSPS protein by using dedicated computer programs (for example, CKSAAP_UbSite; Ziding Zhang's Laboratory of Protein Bioinformatics, College of Biological Sciences, China Agricultural University, 100193 Beijing, China). One of the selected polyubiquitination sites within the maize EPSPS coding sequence is shown in FIG. 34A and its amino acid signature sequence is compared to the equivalent EPSPS sites from other plants (FIG. 34A). The lysine amino acid (K) at position 90 (highly conserved in other plant species) was selected as a potential site of EPSPS protein polyubiquitination. The polynucleotide modification template (referred to as EPSPS polynucleotide maize K90R template) used to edit the epsps locus is listed as SEQ ID NO: 517. This template allowed for editing the epsps locus to contain the lysine (K) to arginine (R) substitution at position 90 (K90R) and two additional TIPS substitutions at positions 102 and 106 (FIGS. 34B and 34C). Maize genomic DNA was edited using the guideRNA/Cas endonuclease system described herein and T0 plants were produced as described herein. The T0 plants that contained the nucleotide modifications, as specified by the information provided on the K90R template (FIG. 34C), were selected by the genotyping methods described herein. F1 EPSPS-K90R plants can be selected for elevated protein content due to a slower rate of EPSPS protein degradation.
B. Editing Intron Elements to Introduce Intron Mediated Enhancer Elements (IMEs)
[0803] Transcriptional activity of the native EPSPS gene can be modulated by transcriptional enhancers positioned in the vicinity of other transcription controlling elements. Introns are known to contain enhancer elements affecting the overall rate of transcription from native promoters including the EPSPS promoter. For example, the first intron of the maize ubiquitin 5'UTR confers a high level of expression in monocot plants as specified in the WO 2011/156535 A1 patent application. An intron enhancing motif, CATATCTG (FIG. 35A), also referred to as an intron-mediated enhancer element (IME), was identified by proprietary analysis (WO2011/156535 A1, published on Dec. 15, 2011), and appropriate nucleotide sites at the 5' end of the EPSPS first intron were selected for editing in order to introduce the intron-mediated enhancer elements (IMEs) (FIG. 35B-35C). The polynucleotide modification template (referred to as EPSPS polynucleotide maize IME template) is listed as SEQ ID NO: 518. The polynucleotide modification template allows for editing of the epsps locus to contain three IMEs (two on one strand of the DNA, one on the reverse strand) in the first EPSPS intron and the TIPS substitutions at positions 102 and 106. The genomic DNA of maize plants was edited using the guideRNA/Cas endonuclease system described herein. Maize plants containing the IME edited EPSPS coding sequence can be selected by genotyping the T0 plants and can be further evaluated for elevated EPSPS-TIPS protein content due to the enhanced transcription rate of the native EPSPS gene.
Example 29
[0804] Modifications of Splicing Sites and/or Introducing Alternate Splicing Sites Using the Guide RNA/Cas Endonuclease System
[0805] In maize cells, the splicing process is affected by splicing sites at the exon-intron junction sites as illustrated in the EPSPS mRNA production (FIG. 36A-36B). FIG. 36A shows analysis of EPSPS amplified pre-mRNA (cDNA panel on left). Lane I4 in FIG. 36A shows amplification of the EPSPS pre-mRNA containing the 3rd intron unspliced, resulting in an 804 bp diagnostic fragment indicative of an alternate splicing event. Lanes E3 and F8 show the EPSPS PCR amplified fragments resulting from regular spliced introns. Diagnostic fragments such as the 804 bp fragment of lane I4 are not amplified unless cDNA is synthesized, as is evident from the absence of bands in lanes E3, I4, and F8 comprising total RNA (shown in the total RNA panel on the right of FIG. 36A). The canonical splice site in the maize EPSPS gene and genes from other species is AGGT, while other (alternative) variants of the splice sites may lead to the aberrant processing of pre-mRNA molecules. The EPSPS coding sequence contains a number of alternate splicing sites that may affect the overall efficiency of the pre-mRNA maturation process and as such may limit the EPSPS protein accumulation in maize cells.
[0806] In order to limit the occurrence of alternate splicing events during EPSPS gene expression, a guideRNA/Cas endonuclease system as described herein can be used to edit splicing sites. The splicing site at the junction of the second native EPSPS intron and the third exon is AGTT and can be edited in order to introduce the canonical AGGT splice site at this junction (FIG. 37). The T>G substitution does not affect the native EPSPS open reading frame and it does not change the EPSPS amino acid sequence. The polynucleotide modification template (referred to as EPSPS polynucleotide maize Tspliced template) is listed as SEQ ID NO: 519. This polynucleotide modification template allows for editing of the epsps locus to contain the canonical AGGT splice site at the 2nd intron-3rd exon junction site and the TIPS substitutions at positions 102 and 106. Maize plants are edited using the procedures described herein. F1 EPSPS-Tspliced maize plants can be evaluated for increased protein content due to the enhanced production of functional EPSPS mRNA messages.
Example 30
[0807] Shortening Maturity Via Manipulation of Early Flowering Phenotype with ZmRap2.7 Down-Regulation Using the Guide RNA/Cas Endonuclease System
[0808] Overall plant maturity can be shortened by modulating the flowering time phenotype of plants through modulation of a maize ZmRap2.7 gene. Shortening of plant maturity can be obtained by an early flowering phenotype.
[0809] RAP2.7 is an acronym for Related to APETALA 2.7. RAPL means RAP2.7 LIKE, and RAP2.7 functions as an AP2-family transcription factor that suppresses floral transition (SEQ ID NOs:520 and 521). Silencing or knock-down of Rap2.7 in transgenic plants resulted in early flowering and reduced plant height but, surprisingly, normal ear and tassel development as compared to the wild-type plants (PCT/US14/26279 application, filed Mar. 13, 2014). The guide RNA/Cas endonuclease system described herein can be used to target and induce a double strand break at a Cas endonuclease target site located within the RAP2.7 gene. Plants comprising NHEJ within the RAP2.7 gene can be selected and evaluated for the presence of a shortened maturity phenotype.
Example 31
[0810] Modulating Expression of a Maize NPK1B Gene for Engineering Frost Tolerance in Maize Using a Guide RNA/Cas Endonuclease System
[0811] Nicotiana Protein Kinase1 (NPK1) is a mitogen activated protein kinase kinase kinase that is involved in cytokinesis regulation and oxidative stress signal transduction. ZM-NPK1B (SEQ ID NO: 522 and SEQ ID NO: 523), which has about 70% amino acid similarity to rice NPKL3, has been tested for frost tolerance in maize at seedling and reproductive stages (PCT/US14/26279 application, filed Mar. 13, 2014). Transgenic seedlings and plants comprising a ZM-NPK1B driven by the inducible promoter Rab17 had significantly higher frost tolerance than control seedlings and control plants. The gene appeared to be induced after cold acclimation and during the -3° C. treatment period in most of the events, but at low levels (PCT/US14/26279 application, filed Mar. 13, 2014).
[0812] A guide RNA/Cas endonuclease system described herein can be used to replace the endogenous promoter of the NPK1 gene with a stress-inducible promoter such as the maize RAB17 promoter (SEQ ID NO: 524; PCT/US14/26279 application, filed Mar. 13, 2014), thus modulating NPK1B expression in a stress-responsive manner and providing frost tolerance to the modulated maize plants.
Example 32
Shortening Maturity Via Manipulation of Early Flowering Phenotype with FTM1 Expression Using a Guide RNA/Cas Endonuclease System
[0813] Overall plant maturity can be shortened by modulating the flowering time phenotype of plants through expressing a transgene. Such a phenotype modification can also be achieved with additional transgenes or through a breeding approach.
[0814] FTM1 stands for Floral Transition MADS 1 transcription factor (SEQ ID NOs: 525 and 526). It is a MADS Box transcriptional factor and induces floral transition. Upon expression of FTM1 under a constitutive promoter, transgenic plants exhibited early flowering and shortened maturity, but surprisingly ear and tassel developed normally as compared to the wild-type plants (PCT/US14/26279 application, filed Mar. 13, 2014).
[0815] FTM1-expressing maize plants demonstrated that by manipulating a floral transition gene, time to flowering can be reduced significantly, leading to a shortened maturity for the plant. As maturity can be generally described as time from seeding to harvest, a shorter maturity is desired for ensuring that a crop can finish in the northern continental dry climatic environment (PCT/US14/26279 application, filed Mar. 13, 2014).
[0816] A guide RNA/Cas endonuclease system described herein can be used to introduce enhancer elements such as the CaMV35S enhancers (Benfey et al, EMBO J, August 1989; 8(8): 2195-2202, SEQ ID NO:513), specifically targeted in front of the endogenous promoter of FTM1, in order to enhance the expression of FTM1 while preserving most of the tissue and temporal specificities of native expression, providing shortened maturity to the modulated plants.
Example 33
Inserting Inducible Responsive Elements in Plant Genomes
[0817] Inducible expression systems controlled by an external stimulus are desirable for functional analysis of cellular proteins as well as trait development as changes in the expression level of the gene of interest can lead to an accompanying phenotype modification. Ideally such a system would not only mediate an "on/off" status for gene expression but would also permit limited expression of a gene at a defined level.
The guide RNA/Cas endonuclease system described herein can be used to introduce components of repressor/operator/inducer systems to regulate gene expression of an organism. Repressor/operator/inducer systems and their components are well known in the art (US 2003/0186281 published Oct. 2, 2003; U.S. Pat. No. 6,271,348). For example, but not limited to, components of the tetracycline (Tc) resistance system of E. coli have been found to function in eukaryotic cells and have been used to regulate gene expression (U.S. Pat. No. 6,271,348). Nucleotide sequences of tet operators of different classes are known in the art; see, for example, the classA, classB, classC, classD, and classE TET operator sequences listed as SEQ ID NOs:11-15 of U.S. Pat. No. 6,271,348.
[0818] Components of a sulfonylurea-responsive repressor system (as described in U.S. Pat. No. 8,257,956, issued on Sep. 4, 2012) can also be introduced into plant genomes to generate a repressor/operator/inducer system in said plant, where polypeptides can specifically bind to an operator and the specific binding is regulated by a sulfonylurea compound.
Example 34
Genome Deletion for Trait Locus Characterization
[0819] Trait mapping in plant breeding often results in the detection of chromosomal regions housing one or more genes controlling expression of a trait of interest. For quantitative traits, expression of a trait of interest is governed by multiple quantitative trait loci (QTL) of varying effect-size, complexity, and statistical significance across one or more chromosomes. A QTL or haplotype that is associated with suppression of kernel-row number in the maize ear can be found to be endemic in elite breeding germplasm. The negative effect of this QTL for kernel row number can be fine-mapped to a resolution that permits selective elimination of this negative QTL segment within specific recipient germplasm. Two flanking cut sites for the guide polynucleotide/Cas endonuclease system are designed via haplotype, marker, and/or DNA sequence context at the targeted QTL region, and the two guide polynucleotide/Cas endonuclease systems are deployed simultaneously or sequentially to produce the desired end product of two independent double strand breaks (cuts) that liberate the intervening region from the chromosome. Individuals harboring the desired deletion event would result from NHEJ repair of the two chromosomal ends, which eliminates the intervening DNA region. Assays to identify these individuals are based on the presence of flanking DNA marker regions but the absence of intervening DNA markers. A proprietary haplotype for kernel-row-number is thereby created that is not extant in the previously defined elite breeding germplasm pool.
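The two-cut deletion and the marker-based assay described above can be sketched, under simplifying assumptions, as follows. The sketch idealizes NHEJ repair as an exact rejoining of the two chromosomal ends without additional insertions or deletions. The toy chromosome string, the marker names, and the function names are all hypothetical.

```python
def delete_between_cuts(chromosome, cut_1, cut_2):
    """Rejoin the flanks after two double strand breaks (idealized NHEJ)."""
    left, right = sorted((cut_1, cut_2))
    return chromosome[:left] + chromosome[right:]


def is_deletion_event(marker_present, flanking_markers, intervening_markers):
    """True when all flanking markers are present and no intervening marker is."""
    flanks_present = all(marker_present[m] for m in flanking_markers)
    interior_absent = not any(marker_present[m] for m in intervening_markers)
    return flanks_present and interior_absent


# Toy chromosome: left flank, intervening QTL segment, right flank.
chromosome = "LLLLL" + "qqqqqqqq" + "RRRRR"
edited = delete_between_cuts(chromosome, 5, 13)

# Hypothetical marker sequences used to score presence or absence.
markers = {"flank_left": "LLL", "qtl_internal": "qqq", "flank_right": "RRR"}


def score(sequence):
    return {name: probe in sequence for name, probe in markers.items()}


flanking = ["flank_left", "flank_right"]
intervening = ["qtl_internal"]
print(is_deletion_event(score(edited), flanking, intervening))
print(is_deletion_event(score(chromosome), flanking, intervening))
```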
An alternative approach would be to delete a region containing a fluorescent gene. Recovery of plants with, and without, fluorescence would give an approximate indication of the efficiency of the deletion process.
Example 35
Engineering Drought Tolerance and Nitrogen Use Efficiency into Maize Via Gene Silencing by Expressing an Inverted Repeat into an ACS6 Gene Using the Guide RNA/Cas Endonuclease System
[0820] ACC (1-aminocyclopropane-1-carboxylic acid) synthase (ACS) genes encode enzymes that catalyze the rate limiting step in ethylene biosynthesis. A construct containing one of the maize ACS genes, ZM-ACS6, in an inverted repeat configuration, has been extensively tested for improved abiotic stress tolerance in maize (PCT/US2010/051358, filed Oct. 4, 2010; PCT/US2010/031008, filed Apr. 14, 2010). Multiple transgenic maize events containing a ZM-ACS6 RNAi sequence driven by a ubiquitin constitutive promoter had reduced ethylene emission, and a concomitant increase in grain yield relative to controls under both drought and low nitrogen field conditions (Plant Biotechnology Journal: 12 MAR 2014, DOI: 10.1111/pbi.12172).
[0821] In one embodiment, the guide RNA/Cas endonuclease system can be used in combination with a co-delivered polynucleotide sequence to insert an inverted ZM-ACS6 gene fragment into the genome of maize, wherein the insertion of the inverted gene fragment allows for the in-vivo creation of an inverted repeat (hairpin) and results in the silencing of the endogenous ethylene biosynthesis gene.
[0822] In an embodiment the insertion of the inverted gene fragment can result in the formation of an in-vivo created inverted repeat (hairpin) in a native (or modified) promoter of an ACS6 gene and/or in a native 5' end of the native ACS6 gene. The inverted gene fragment can further comprise an intron which can result in an enhanced silencing of the targeted ethylene biosynthetic gene.
Example 36
T0 Plants from the Multiplexed Guide RNA/Cas Experiment Carried High Frequency of Bi-Allelic Mutations and Demonstrated Proper Inheritance of Mutagenized Alleles in the T1 Population
[0823] This example demonstrates the high efficiency of the guide RNA/Cas endonuclease system in generating maize plants with multiple mutagenized loci and their inheritance in the consecutive generation(s).
[0824] Mutated events generated in the multiplexed experiment described in Example 4 were used to regenerate T0 plants with mutations at 3 different target sites: MS26Cas-2 target site (SEQ ID NO: 14), LIGCas-3 target site (SEQ ID NO: 18) and MS45Cas-2 target site (SEQ ID NO: 20).
[0825] For further analysis, total genomic DNA was extracted from leaf tissue of individual T0 plants. Fragments spanning all 3 target sites were PCR amplified using primer pairs for the corresponding target sites, cloned into the pCR2.1-TOPO cloning vector (Invitrogen), and sequenced. Table 43 shows examples of mutations detected in four T0 plants resulting from imprecise NHEJ at all relevant loci when multiple guide RNA expression cassettes were simultaneously introduced either in duplex (see TS=Lig34/MS26) or triplex (see TS=Lig34/MS26/MS45), respectively.
TABLE-US-00044 TABLE 43 Examples of mutations at maize target loci produced by a multiplexed guide RNA/Cas system
Target sites (TS) | T0 plant | qPCR data | Sequencing data: Lig3/4 TS | Sequencing data: Ms26 TS | Sequencing data: Ms45 TS
Lig34/MS26 | 1 | NULL/NULL* | 1 bp ins/2 bp del | 1 bp ins/19 bp del + 1 bp ins |
Lig34/MS26 | 2 | NULL/NULL | 1 bp ins/1 bp del | 1 bp ins/1 bp ins |
Lig34/MS26/MS45 | 1 | NULL/NULL/NULL | 1 bp ins/large del | 1 bp ins/1 bp del | 15 bp del/large del
Lig34/MS26/MS45 | 2 | INDEL**/NULL/NULL | 1 bp ins/WT | 1 bp (T) ins/1 bp (C) ins | 1 bp ins/large del
*NULL indicates that both alleles are mutated
**INDEL indicates mutation in one of the two alleles.
del = deletion, ins = insertion, bp = base pair
[0826] All T0 plants were crossed with wild type maize plants to produce T1 seeds. T1 progeny plants (32 plants) of the second T0 plant from the triplex experiment (see Table 43, Lig34/MS26/MS45) were analyzed by sequencing to evaluate segregation frequencies of the mutated alleles. Our results demonstrated proper inheritance and expected (1:1) segregation of the mutated alleles as well as between mutated and wild type alleles at all three target sites.
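One conventional way to evaluate whether observed progeny counts are consistent with the expected 1:1 segregation is a chi-square goodness-of-fit test with one degree of freedom. The sketch below is illustrative only: the counts (17 and 15 out of 32 plants) are hypothetical and are not the data from the experiment, and the text does not state which statistical test, if any, was applied.

```python
# Chi-square critical value for 1 degree of freedom at the 0.05 level.
CRITICAL_VALUE_DF1_P05 = 3.841


def chi_square_1_to_1(n_mutant, n_wild_type):
    """Chi-square statistic for observed counts against a 1:1 expectation."""
    expected = (n_mutant + n_wild_type) / 2
    return ((n_mutant - expected) ** 2 + (n_wild_type - expected) ** 2) / expected


# Hypothetical counts for 32 T1 plants at one target site.
n_mutant, n_wild_type = 17, 15
statistic = chi_square_1_to_1(n_mutant, n_wild_type)
consistent = statistic < CRITICAL_VALUE_DF1_P05
print(f"chi-square = {statistic:.3f}; consistent with 1:1 = {consistent}")
```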
[0827] The data clearly demonstrate that the guide RNA/maize optimized Cas endonuclease system described herein can be used to simultaneously mutagenize multiple chromosomal loci and produce progeny plants containing the stably inherited multiple gene knock-outs.
Example 37
[0828] Guide RNA/Cas endonuclease mediated DNA cleavage in maize chromosomal loci can stimulate homologous recombination repair-mediated transgene insertion and resulting T1 progeny plants demonstrated proper inheritance of the modified alleles.
[0829] Maize events generated in the experiment described in Example 5 were used to regenerate T0 plants. T0 plants were regenerated from 7 independent callus events with correct amplifications across both transgene genomic DNA junctions and analyzed. Leaf tissue was sampled, total genomic DNA was extracted, and PCR amplification at both transgene genomic DNA junctions was carried out using the primer pairs (corresponding to SEQ ID NOs: 98-101). The resulting amplification products were sequenced for confirmation. Plants with confirmed junctions at both ends were further analyzed by Southern hybridization (FIG. 38) using two probes, genomic (outside HR1 region, SEQ ID NO: 533) and transgenic (within MoPAT gene, SEQ ID NO: 534). PCR, sequencing and Southern hybridization data demonstrated that plants regenerated from two of the 7 events (events 1 and 2) carried a perfect, clean, single copy transgene integration at the expected target site via homologous recombination. Plants regenerated from the remaining 5 events contained either additional, randomly integrated copies of the transgene (events 4, 5, and 6) or rearranged copies of the transgene integrated into the target site (events 3 and 7).
[0830] T0 plants from events 1 and 2 were crossed with wild type maize plants to produce T1 seeds. Ninety-six T1 plants from events 1 and 2 were analyzed by Southern hybridization (using the same probes as above) to evaluate segregation frequencies of the transgene locus. Southern results demonstrated proper inheritance and expected (1:1) segregation of the transgene and wild type loci.
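A 1:1 segregation of this kind is commonly checked with a chi-square goodness-of-fit test. The following is a minimal sketch only: the function name is an assumption, and the counts used in the example are hypothetical, since the per-class counts for the 96 T1 plants are not reported here.

```python
import math


def chi_square_1to1(n_transgene: int, n_wild_type: int) -> tuple[float, float]:
    """Chi-square goodness-of-fit test against a 1:1 ratio (1 degree of freedom)."""
    total = n_transgene + n_wild_type
    expected = total / 2.0
    chi2 = ((n_transgene - expected) ** 2 + (n_wild_type - expected) ** 2) / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical split of 96 T1 plants; NOT data from the experiment.
chi2, p = chi_square_1to1(50, 46)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

A p-value above the chosen significance threshold (0.05 is a common choice) means the observed split does not differ detectably from 1:1.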
[0831] The data clearly demonstrate that maize chromosomal loci cleaved with the maize optimized guide RNA/Cas system described herein can be used to stimulate HR repair pathways to site-specifically insert transgenes and produce progeny plants that have the inserted transgene stably inherited.
Example 38
Production of Maize Transgenic Lines with Pre-Integrated Cas9 for Transient Delivery of Guide RNA
[0832] This example describes the rationale, production, and testing of maize transgenic lines with an integrated Cas9 gene under constitutive and temperature inducible promoters.
[0833] As demonstrated in Example 2, a high mutation frequency was observed when Cas9 endonuclease and guide RNA were delivered as DNA vectors by biolistic transformation to immature corn embryo cells. When Cas9 endonuclease was delivered as a DNA vector and guide RNA as RNA molecules, a reduced mutation frequency was observed (Table 44).
TABLE-US-00045 TABLE 44 Mutant reads at LigCas-3 target site produced by transiently delivered guide RNA
Target Site Examined for Mutations | Transient Delivery | Expression Cassette | Mutant Reads | Total Reads
LIGCas-3 | -- | Cas9 | 24.2 | 1,599,492
LIGCas-3 | -- | Cas9/guide RNA | 44170 | 1,674,825
LIGCas-3 | 35 ng guide RNA | Cas9 | 418 | 1,622,180
LIGCas-3 | 70 ng guide RNA | Cas9 | 667 | 1,791,388
LIGCas-3 | 140 ng guide RNA | Cas9 | 239 | 1,632,137
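The read counts in Table 44 can be converted to mutation frequencies to make the difference between DNA-delivered and RNA-delivered guide RNA easier to compare. This is a minimal sketch: the variable and function names are assumptions, and the Cas9-only control row is left out.

```python
# Rows of Table 44 as (transient delivery, expression cassette, mutant reads, total reads).
# The Cas9-only control row is omitted from this sketch.
TABLE_44 = [
    ("none", "Cas9/guide RNA", 44170, 1_674_825),
    ("35 ng guide RNA", "Cas9", 418, 1_622_180),
    ("70 ng guide RNA", "Cas9", 667, 1_791_388),
    ("140 ng guide RNA", "Cas9", 239, 1_632_137),
]


def mutation_frequency(mutant_reads: int, total_reads: int) -> float:
    """Return mutant reads as a percentage of total reads."""
    return 100.0 * mutant_reads / total_reads


# Map each (delivery, cassette) pair to its mutation frequency in percent.
frequencies = {
    (delivery, cassette): mutation_frequency(mutant, total)
    for delivery, cassette, mutant, total in TABLE_44
}

for (delivery, cassette), freq in frequencies.items():
    print(f"{delivery:>17} | {cassette:<14} | {freq:.4f}%")
```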
[0834] Increased efficiency (increased mutant reads) may occur when the Cas9 protein and guide RNA are present in the cell at the same time. To facilitate the presence of both Cas9 endonuclease and guide RNA in the same cell, a vector containing a constitutively or conditionally regulated Cas9 gene can first be delivered to plant cells to allow for stable integration into the plant genome, establishing a plant line that contains only the Cas9 gene in the plant genome. Then, single or multiple guide RNAs can be delivered as DNA, as RNA, or as a combination of both, to the embryo cells of the plant line containing the genome-integrated version of the Cas9 gene.
[0835] Transgenic maize (genotype Hi-II) lines with an integrated Cas9 gene driven by either a constitutive (Ubi) or an inducible (CAS) promoter were generated via Agrobacterium-mediated transformation. Besides the Cas9 gene, the Agro vector also contained a visible marker (END2:Cyan) and a Red Fluorescent Protein sequence interrupted with a 318 bp long linker (H2B:RF-FP) (as described in U.S. patent application Ser. No. 13/526,912, filed Jun. 19, 2012). The linker sequence was flanked with 370 bp long direct repeats to promote recombination and restoration of a functional RFP gene sequence upon a double strand break within the linker.
[0836] Lines with single copies of the transgene were identified and used for further experiments. Two guide RNA constructs targeting 2 different sites (Table 45) in the linker sequence were delivered into immature embryo cells via particle bombardment. Meganuclease variant LIG3-4 B65, which has very high cutting activity and was previously used in similar experiments, was used as the positive control.
TABLE-US-00046 TABLE 45 Target sites in the RF-FP linker for guideRNA/Cas endonuclease system.
Locus | Guide RNA Used | Target Site Designation | Target Site Sequence | PAM Sequence | SEQ ID NO:
RF-FP linker | Long | RF-FP Cas-1 | GCAGGTCTCACGACGGT | TGG | 535
RF-FP linker | Long | RF-FP Cas-2 | GTAAAGTACGCGTACGTGT | AGG | 536
[0837] After transformation, embryos with the Cas9 gene under the Ubiquitin promoter were incubated at 28° C., while embryos with the Cas9 gene under the temperature inducible CAS promoter were first incubated at 37° C. for 15-20 hours and then transferred to 28° C. Embryos were examined 3-5 days after bombardment under a luminescent microscope. Expression and activity of the pre-integrated Cas9 protein were visually evaluated based on the number of embryo cells with RFP protein expression. In most lines, the guide RNA/Cas endonuclease system demonstrated a similar or higher frequency of RFP repair than the LIG3-4 B65 meganuclease, indicating a high level of Cas9 protein expression and activity in the generated transgenic lines.
[0838] This example describes the production of transgenic lines with a pre-integrated Cas9 gene that can be used in further experiments to evaluate efficiency of mutagenesis at a target site upon transient delivery of guide RNA in the form of RNA molecules.
Example 39
[0839] The Guide RNA/Cas Endonuclease System Delivers Double-Strand Breaks to the Maize ALS Locus and Facilitates Editing of the ALS Gene
[0840] This example demonstrates that the guide RNA/Cas endonuclease system can be efficiently used to introduce specific changes into the nucleotide sequence of the maize ALS gene resulting in resistance to sulfonylurea class herbicides, specifically, chlorsulfuron.
[0841] Endogenous ALS protein is the target site of ALS inhibitor sulfonylurea class herbicides. Expression of the herbicide tolerant version of ALS protein in crops confers tolerance to this class of herbicides. The ALS protein contains N-terminal transit peptides, and the mature protein is formed following transport into the chloroplast and subsequent cleavage of the transit peptide. The mature protein starts at residue S41, resulting in a mature protein of 598 amino acids with a predicted molecular weight of 65 kDa (SEQ ID NO: 550).
TABLE-US-00047 TABLE 46 Deduced Amino Acid Sequence of the Full-Length ZM-ALS Protein (SEQ ID NO: 550)
  1 MATAAAASTA LTGATTAAPK ARRRAHLLAT RRALAAPIRC SAASPAMPMA
 51 PPATPLRPWG PTEPRKGADI LVESLERCGV RDVFAYPGGA SMEIHQALTR
101 SPVIANHLFR HEQGEAFAAS GYARSSGRVG VCIATSGPGA TNLVSALADA
151 LLDSVPMVAI TGQVPRRMIG TDAFQETPIV EVTRSITKHN YLVLDVDDIP
201 RVVQEAFFLA SSGRPGPVLV DIPKDIQQQM AVPVWDKPMS LPGYIARLPK
251 PPATELLEQV LRLVGESRRP VLYVGGGCAA SGEELRRFVE LTGIPVTTTL
301 MGLGNFPSDD PLSLRMLGMH GTVYANYAVD KADLLLALGV RFDDRVTGKI
351 EAFASRAKIV HVDIDPAEIG KNKQPHVSIC ADVKLALQGM NALLEGSTSK
401 KSFDFGSWND ELDQQKREFP LGYKTSNEEI QPQYAIQVLD ELTKGEAIIG
451 TGVGQHQMWA AQYYTYKRPR QWLSSAGLGA MGFGLPAAAG ASVANPGVTV
501 VDIDGDGSFL MNVQELAMIR IENLPVKVFV LNNQHLGMVV QWEDRFYKAN
551 RAHTYLGNPE NESEIYPDFV TIAKGFNIPA VRVTKKNEVR AAIKKMLETP
601 GPYLLDIIVP HQEHVLPMIP SGGAFKDMIL DGDGRTVY
[0842] Modification of a single amino acid residue (P165A or P165S, shown in bold) from the endogenous maize acetolactate synthase protein provides resistance to herbicides in maize.
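The residue numbering used above can be checked directly against the sequence in Table 46. The sketch below, with helper names that are assumptions, confirms that the mature protein beginning at S41 has 598 residues and that position 165 of the full-length protein is a proline, then builds the P165A and P165S variants as plain strings.

```python
# Full-length ZM-ALS protein (SEQ ID NO: 550), copied from Table 46; spaces are removed below.
ZM_ALS = (
    "MATAAAASTA LTGATTAAPK ARRRAHLLAT RRALAAPIRC SAASPAMPMA "
    "PPATPLRPWG PTEPRKGADI LVESLERCGV RDVFAYPGGA SMEIHQALTR "
    "SPVIANHLFR HEQGEAFAAS GYARSSGRVG VCIATSGPGA TNLVSALADA "
    "LLDSVPMVAI TGQVPRRMIG TDAFQETPIV EVTRSITKHN YLVLDVDDIP "
    "RVVQEAFFLA SSGRPGPVLV DIPKDIQQQM AVPVWDKPMS LPGYIARLPK "
    "PPATELLEQV LRLVGESRRP VLYVGGGCAA SGEELRRFVE LTGIPVTTTL "
    "MGLGNFPSDD PLSLRMLGMH GTVYANYAVD KADLLLALGV RFDDRVTGKI "
    "EAFASRAKIV HVDIDPAEIG KNKQPHVSIC ADVKLALQGM NALLEGSTSK "
    "KSFDFGSWND ELDQQKREFP LGYKTSNEEI QPQYAIQVLD ELTKGEAIIG "
    "TGVGQHQMWA AQYYTYKRPR QWLSSAGLGA MGFGLPAAAG ASVANPGVTV "
    "VDIDGDGSFL MNVQELAMIR IENLPVKVFV LNNQHLGMVV QWEDRFYKAN "
    "RAHTYLGNPE NESEIYPDFV TIAKGFNIPA VRVTKKNEVR AAIKKMLETP "
    "GPYLLDIIVP HQEHVLPMIP SGGAFKDMIL DGDGRTVY"
).replace(" ", "")

MATURE_START = 41  # 1-based position of the first mature residue (S41)


def residue(seq: str, position: int) -> str:
    """Return the residue at a 1-based position."""
    return seq[position - 1]


def substitute(seq: str, position: int, expected: str, new: str) -> str:
    """Replace one residue, checking that the original residue is the expected one."""
    if residue(seq, position) != expected:
        raise ValueError(f"expected {expected} at position {position}")
    return seq[: position - 1] + new + seq[position:]


mature = ZM_ALS[MATURE_START - 1:]
p165a = substitute(ZM_ALS, 165, "P", "A")
p165s = substitute(ZM_ALS, 165, "P", "S")
print(len(ZM_ALS), len(mature), residue(ZM_ALS, 165))
```

Note that P165 matches the numbering of the full-length protein in Table 46, not a numbering that starts at the mature N-terminus.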
[0843] There are two ALS genes in maize, ALS1 and ALS2, located on chromosomes 5 and 4, respectively. As described in Example 2, guide RNA expressing constructs for 3 different target sites within the ALS genes were tested. Based on polymorphism between the ALS1 and ALS2 nucleotide sequences, an ALS1-specific target site, ALSCas-4, was identified and tested. The ALSCas-1 guide RNA expressing construct, which targets both the ALS1 and ALS2 genes, was used as a control (Table 47).
TABLE-US-00048 TABLE 47 Maize ALS genomic target sites tested.
Locus | Location | Guide RNA | Target Site Designation | Maize Genomic Target Site Sequence | PAM Sequence | SEQ ID NO:
ALS | Chr. 4: 107.73cM and Chr. 5: 115.49cM | Long | ALSCas-1 | GGTGCCAATCATGCGTCG | CGG | 22
ALS | Chr. 4: 107.73cM and Chr. 5: 115.49cM | Long | ALSCas-4 | GCTGCTCGATTCCGTCCCCA | TGG* | 537
*Target site in the ALS1 gene; bolded nucleotides are different in the ALS2 gene.
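Each target site in Table 47 is listed with its adjacent PAM, and Streptococcus pyogenes Cas9 recognizes PAMs of the form NGG. The following sketch checks that pattern and confirms that the ALSCas-1 target site plus its PAM reproduces SEQ ID NO: 22 as given in the sequence listing. The helper names are assumptions made for illustration.

```python
import re

# (designation, target site sequence, PAM) as listed in Table 47.
TABLE_47 = [
    ("ALSCas-1", "GGTGCCAATCATGCGTCG", "CGG"),
    ("ALSCas-4", "GCTGCTCGATTCCGTCCCCA", "TGG"),
]

# SEQ ID NO: 22 as it appears in the sequence listing (21 nt, target site plus PAM).
SEQ_ID_22 = "ggtgccaatcatgcgtcgcgg"

# NGG: any nucleotide followed by two guanines.
NGG_PAM = re.compile(r"[ACGT]GG")


def has_ngg_pam(pam: str) -> bool:
    """Return True if the PAM is exactly three nucleotides of the form NGG."""
    return NGG_PAM.fullmatch(pam.upper()) is not None


def target_with_pam(target: str, pam: str) -> str:
    """Concatenate a target site and its 3' PAM, upper-cased."""
    return (target + pam).upper()


for name, target, pam in TABLE_47:
    print(name, len(target), pam, has_ngg_pam(pam))
```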
The experiment was conducted and mutation frequency determined as described in Example 2 and results are shown in Table 48.
TABLE-US-00049 TABLE 48 Frequencies of NHEJ mutations at the two ALS target sites recovered by deep sequencing.
TS | Total Reads | Mutant reads (ALS1) | Mutant reads (ALS2)
ALSCas-1 | 204,230 | 5072 (2.5%) | 2704 (1.3%)
ALSCas-4 | 120,766 | 3294 (2.7%) | 40 (0.03%)
The results demonstrated that ALSCas-4 guide RNA/Cas9 system mutates the ALS1 gene with approximately 90 times higher efficiency than the ALS2 gene. Therefore, the ALSCas-4 target site and the corresponding guide RNA were selected for the ALS gene editing experiment.
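The ALS1 versus ALS2 selectivity can be recomputed from the read counts in Table 48. In this sketch (names are assumptions), the percentages are derived from the raw counts and the ratio of ALS1 to ALS2 mutant reads is taken for each target site. Computed from raw counts, the ALSCas-4 ratio comes out somewhat below the figure obtained by dividing the rounded percentages (2.7% / 0.03%).

```python
# Read counts from Table 48.
TABLE_48 = {
    "ALSCas-1": {"total": 204_230, "ALS1": 5072, "ALS2": 2704},
    "ALSCas-4": {"total": 120_766, "ALS1": 3294, "ALS2": 40},
}


def percent(mutant_reads: int, total_reads: int) -> float:
    """Mutant reads as a percentage of total reads."""
    return 100.0 * mutant_reads / total_reads


def selectivity(row: dict) -> float:
    """Ratio of ALS1 mutant reads to ALS2 mutant reads for one target site."""
    return row["ALS1"] / row["ALS2"]


for site, row in TABLE_48.items():
    als1 = percent(row["ALS1"], row["total"])
    als2 = percent(row["ALS2"], row["total"])
    print(f"{site}: ALS1 {als1:.2f}%, ALS2 {als2:.3f}%, ratio {selectivity(row):.1f}")
```

Because both mutant read counts share the same total, the ratio of counts equals the ratio of unrounded frequencies.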
[0844] To produce edited events, the ALS polynucleotide modification repair template was delivered by particle bombardment either as a plasmid with an 804 bp long homologous region (SEQ ID NO: 538) or as a single-stranded 127 bp DNA fragment (SEQ ID NO: 539), together with the maize optimized Cas9 endonuclease expression vector described in Example 1, the guide RNA expression cassette (targeting the ALSCas-4 site), a moPAT-DsRed fusion as selectable and visible markers, and developmental genes (ODP-2 and WUS). Approximately 1000 Hi-II immature embryos were bombarded with each of the two repair templates described above. Forty days after bombardment, 600 young callus events (300 for each repair template) were collected and transferred to media with bialaphos selection. The embryos with remaining events were transferred to media with 100 ppm of chlorsulfuron for selection. A month later, events that continued growing under chlorsulfuron selection were collected and used for analysis.
[0845] A small amount of callus tissue from each selected event was used for total DNA extraction. A pair of genomic primers outside the repair/donor DNA fragment (SEQ ID NO: 540 and SEQ ID NO: 541) was used to amplify an endogenous fragment of the ALS1 locus containing the ALSCas-4 target sequence. The PCR amplification products were gel purified, cloned into the pCR2.1 TOPO cloning vector (Invitrogen) and sequenced. A total of 6 events demonstrated the presence of the specifically edited ALS1 allele as well as either a wild type or a mutagenized second allele.
[0846] These data indicate that a guide RNA/Cas system can be successfully used to create an edited ALS allele in maize. The data further demonstrate that the guide RNA/maize optimized Cas endonuclease system described herein can be used to produce progeny plants containing gene edits that are stably inherited.
Example 40
Gene Editing of the Soybean ALS1 Gene and Use as a Transformation Selectable Marker for Soybean Transformation with the Guide RNA/Cas Endonuclease System
[0847] A. guideRNA/Cas9 Endonuclease Target Site Design on the Soybean ALS1 Gene.
[0848] There are four ALS genes in soybean (Glyma04g37270, Glyma06g17790, Glyma13g31470 and Glyma15g07860). Two guideRNA/Cas9 endonuclease target sites (soy ALS1-CR1 and soy ALS1-CR2) were designed near the Proline 178 of the soybean ALS1 gene Glyma04g37270 (Table 49).
TABLE-US-00050 TABLE 49 Guide RNA/Cas9 endonuclease target sites on soybean ALS1 gene
Name of gRNA-Cas9 endonuclease target site | Cas endonuclease target sequence (SEQ ID NO:) | Physical location
soy ALS1-CR1 | 542 | Gm04: 43645633 . . . 43645612
soy ALS1-CR2 | 543 | Gm04: 43645594 . . . 43645615
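The physical locations in Table 49 can be read as coordinate pairs on chromosome Gm04. The sketch below rests on two assumptions that the table does not state: coordinates are inclusive, and a descending pair (start greater than end) denotes a site on the reverse strand. The class and function names are hypothetical. Under those assumptions, it reports the length and strand of each site and how many positions the two sites share.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetSite:
    name: str
    chrom: str
    start: int  # first coordinate as listed in Table 49
    end: int    # second coordinate as listed in Table 49

    @property
    def strand(self) -> str:
        # Assumption: descending coordinates indicate the reverse strand.
        return "+" if self.start <= self.end else "-"

    @property
    def span(self) -> tuple[int, int]:
        """Return (lowest, highest) coordinate regardless of strand."""
        return min(self.start, self.end), max(self.start, self.end)

    @property
    def length(self) -> int:
        # Assumption: coordinates are inclusive at both ends.
        low, high = self.span
        return high - low + 1


def overlap(a: TargetSite, b: TargetSite) -> int:
    """Number of positions shared by two sites on the same chromosome."""
    if a.chrom != b.chrom:
        return 0
    low = max(a.span[0], b.span[0])
    high = min(a.span[1], b.span[1])
    return max(0, high - low + 1)


CR1 = TargetSite("soy ALS1-CR1", "Gm04", 43645633, 43645612)
CR2 = TargetSite("soy ALS1-CR2", "Gm04", 43645594, 43645615)

for site in (CR1, CR2):
    print(site.name, site.strand, site.length)
print("shared positions:", overlap(CR1, CR2))
```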
B. Guide-RNA Expression Cassettes, Cas9 Endonuclease Expression Cassettes, Polynucleotide Modification Templates for Introduction of Specific Amino Acid Changes and Use of the P178S Modified ALS1 Allele as a Soybean Transformation Selectable Marker
[0849] The soybean U6 small nuclear RNA promoter, GM-U6-13.1 (SEQ ID NO: 469), was used to express guide RNAs to direct Cas9 nuclease to designated genomic target sites (Table 50). A soybean codon optimized Cas9 endonuclease (SEQ ID NO: 489) expression cassette and a guide RNA expression cassette were linked in a first plasmid that was co-delivered with a polynucleotide modification template. The polynucleotide modification template contained specific nucleotide changes that encoded amino acid changes in the soy ALS1 polypeptide (Glyma04g37270), such as P178S.
[0850] Other amino acid changes in the ALS1 polypeptide can also be obtained using the guide RNA/Cas endonuclease system described herein. Specific amino acid modifications can be achieved by homologous recombination between the genomic DNA and the polynucleotide modification template facilitated by the guideRNA/Cas endonuclease system.
TABLE-US-00051 TABLE 50 Guide RNA/Cas9 expression cassettes and polynucleotide modification templates used in soybean stable transformation for the specific amino acid modifications of the soy ALS1 gene.
Experiment | Guide RNA/Cas9 (plasmid name) | SEQ ID NO: | Polynucleotide modification template | SEQ ID NO:
soy ALS1-CR1 | U6-13.1:ALS1-CR1 + EF1A2:CAS9 (QC880) | 544 | RTW1026A | 546
soy ALS1-CR2 | U6-13.1:ALS1-CR2 + EF1A2:CAS9 (QC881) | 545 | RTW1026A | 546
[0851] C. Detection of the P178S Mutation in the Soybean ALS1 Gene in the Event Selected by Chlorsulfuron
[0852] In order to edit specific amino acids at the native ALS1 gene (such as the P178S modification), a polynucleotide modification template such as RTW1026A (Table 50) was co-delivered with the guideRNA/Cas9 expression cassettes into soybean cells. Chlorsulfuron (100 ppb) was used to select the P178S ALS1 gene editing events in the soybean transformation process.
[0853] The modification of the native ALS1 gene via guide RNA/Cas9 system mediated DNA homologous recombination was determined by specific PCR analysis. A specific PCR assay with primer pair WOL900 (SEQ ID NO: 547) and WOL578 (SEQ ID NO: 548) was used to detect a perfect P178S modification at the native ALS1 gene. A second primer pair, WOL573 (SEQ ID NO: 549) and WOL578 (SEQ ID NO: 548), was used to amplify both a P178S modified soy ALS1 allele and an NHEJ mutated allele. A chlorsulfuron tolerant event (MSE3772-18) was generated from the soy ALS1-CR2 experiment. The event contained a perfect P178S modified allele and a second allele with a 5 bp deletion at the soy ALS1-CR2 cleavage site. Topo cloning/sequencing was used to verify the sequences. Our results demonstrated that one P178S modified ALS1 allele is sufficient to provide chlorsulfuron selection in the soybean transformation process.
Sequence CWU
1
1
55014107DNAStreptococcus pyogenes M1 GAS (SF370) 1atggataaga aatactcaat
aggcttagat atcggcacaa atagcgtcgg atgggcggtg 60atcactgatg aatataaggt
tccgtctaaa aagttcaagg ttctgggaaa tacagaccgc 120cacagtatca aaaaaaatct
tataggggct cttttatttg acagtggaga gacagcggaa 180gcgactcgtc tcaaacggac
agctcgtaga aggtatacac gtcggaagaa tcgtatttgt 240tatctacagg agattttttc
aaatgagatg gcgaaagtag atgatagttt ctttcatcga 300cttgaagagt cttttttggt
ggaagaagac aagaagcatg aacgtcatcc tatttttgga 360aatatagtag atgaagttgc
ttatcatgag aaatatccaa ctatctatca tctgcgaaaa 420aaattggtag attctactga
taaagcggat ttgcgcttaa tctatttggc cttagcgcat 480atgattaagt ttcgtggtca
ttttttgatt gagggagatt taaatcctga taatagtgat 540gtggacaaac tatttatcca
gttggtacaa acctacaatc aattatttga agaaaaccct 600attaacgcaa gtggagtaga
tgctaaagcg attctttctg cacgattgag taaatcaaga 660cgattagaaa atctcattgc
tcagctcccc ggtgagaaga aaaatggctt atttgggaat 720ctcattgctt tgtcattggg
tttgacccct aattttaaat caaattttga tttggcagaa 780gatgctaaat tacagctttc
aaaagatact tacgatgatg atttagataa tttattggcg 840caaattggag atcaatatgc
tgatttgttt ttggcagcta agaatttatc agatgctatt 900ttactttcag atatcctaag
agtaaatact gaaataacta aggctcccct atcagcttca 960atgattaaac gctacgatga
acatcatcaa gacttgactc ttttaaaagc tttagttcga 1020caacaacttc cagaaaagta
taaagaaatc ttttttgatc aatcaaaaaa cggatatgca 1080ggttatattg atgggggagc
tagccaagaa gaattttata aatttatcaa accaatttta 1140gaaaaaatgg atggtactga
ggaattattg gtgaaactaa atcgtgaaga tttgctgcgc 1200aagcaacgga cctttgacaa
cggctctatt ccccatcaaa ttcacttggg tgagctgcat 1260gctattttga gaagacaaga
agacttttat ccatttttaa aagacaatcg tgagaagatt 1320gaaaaaatct tgacttttcg
aattccttat tatgttggtc cattggcgcg tggcaatagt 1380cgttttgcat ggatgactcg
gaagtctgaa gaaacaatta ccccatggaa ttttgaagaa 1440gttgtcgata aaggtgcttc
agctcaatca tttattgaac gcatgacaaa ctttgataaa 1500aatcttccaa atgaaaaagt
actaccaaaa catagtttgc tttatgagta ttttacggtt 1560tataacgaat tgacaaaggt
caaatatgtt actgaaggaa tgcgaaaacc agcatttctt 1620tcaggtgaac agaagaaagc
cattgttgat ttactcttca aaacaaatcg aaaagtaacc 1680gttaagcaat taaaagaaga
ttatttcaaa aaaatagaat gttttgatag tgttgaaatt 1740tcaggagttg aagatagatt
taatgcttca ttaggtacct accatgattt gctaaaaatt 1800attaaagata aagatttttt
ggataatgaa gaaaatgaag atatcttaga ggatattgtt 1860ttaacattga ccttatttga
agatagggag atgattgagg aaagacttaa aacatatgct 1920cacctctttg atgataaggt
gatgaaacag cttaaacgtc gccgttatac tggttgggga 1980cgtttgtctc gaaaattgat
taatggtatt agggataagc aatctggcaa aacaatatta 2040gattttttga aatcagatgg
ttttgccaat cgcaatttta tgcagctgat ccatgatgat 2100agtttgacat ttaaagaaga
cattcaaaaa gcacaagtgt ctggacaagg cgatagttta 2160catgaacata ttgcaaattt
agctggtagc cctgctatta aaaaaggtat tttacagact 2220gtaaaagttg ttgatgaatt
ggtcaaagta atggggcggc ataagccaga aaatatcgtt 2280attgaaatgg cacgtgaaaa
tcagacaact caaaagggcc agaaaaattc gcgagagcgt 2340atgaaacgaa tcgaagaagg
tatcaaagaa ttaggaagtc agattcttaa agagcatcct 2400gttgaaaata ctcaattgca
aaatgaaaag ctctatctct attatctcca aaatggaaga 2460gacatgtatg tggaccaaga
attagatatt aatcgtttaa gtgattatga tgtcgatcac 2520attgttccac aaagtttcct
taaagacgat tcaatagaca ataaggtctt aacgcgttct 2580gataaaaatc gtggtaaatc
ggataacgtt ccaagtgaag aagtagtcaa aaagatgaaa 2640aactattgga gacaacttct
aaacgccaag ttaatcactc aacgtaagtt tgataattta 2700acgaaagctg aacgtggagg
tttgagtgaa cttgataaag ctggttttat caaacgccaa 2760ttggttgaaa ctcgccaaat
cactaagcat gtggcacaaa ttttggatag tcgcatgaat 2820actaaatacg atgaaaatga
taaacttatt cgagaggtta aagtgattac cttaaaatct 2880aaattagttt ctgacttccg
aaaagatttc caattctata aagtacgtga gattaacaat 2940taccatcatg cccatgatgc
gtatctaaat gccgtcgttg gaactgcttt gattaagaaa 3000tatccaaaac ttgaatcgga
gtttgtctat ggtgattata aagtttatga tgttcgtaaa 3060atgattgcta agtctgagca
agaaataggc aaagcaaccg caaaatattt cttttactct 3120aatatcatga acttcttcaa
aacagaaatt acacttgcaa atggagagat tcgcaaacgc 3180cctctaatcg aaactaatgg
ggaaactgga gaaattgtct gggataaagg gcgagatttt 3240gccacagtgc gcaaagtatt
gtccatgccc caagtcaata ttgtcaagaa aacagaagta 3300cagacaggcg gattctccaa
ggagtcaatt ttaccaaaaa gaaattcgga caagcttatt 3360gctcgtaaaa aagactggga
tccaaaaaaa tatggtggtt ttgatagtcc aacggtagct 3420tattcagtcc tagtggttgc
taaggtggaa aaagggaaat cgaagaagtt aaaatccgtt 3480aaagagttac tagggatcac
aattatggaa agaagttcct ttgaaaaaaa tccgattgac 3540tttttagaag ctaaaggata
taaggaagtt aaaaaagact taatcattaa actacctaaa 3600tatagtcttt ttgagttaga
aaacggtcgt aaacggatgc tggctagtgc cggagaatta 3660caaaaaggaa atgagctggc
tctgccaagc aaatatgtga attttttata tttagctagt 3720cattatgaaa agttgaaggg
tagtccagaa gataacgaac aaaaacaatt gtttgtggag 3780cagcataagc attatttaga
tgagattatt gagcaaatca gtgaattttc taagcgtgtt 3840attttagcag atgccaattt
agataaagtt cttagtgcat ataacaaaca tagagacaaa 3900ccaatacgtg aacaagcaga
aaatattatt catttattta cgttgacgaa tcttggagct 3960cccgctgctt ttaaatattt
tgatacaaca attgatcgta aacgatatac gtctacaaaa 4020gaagttttag atgccactct
tatccatcaa tccatcactg gtctttatga aacacgcatt 4080gatttgagtc agctaggagg
tgactga 41072189DNASolanum
tuberosum 2gtaagtttct gcttctacct ttgatatata tataataatt atcattaatt
agtagtaata 60taatatttca aatatttttt tcaaaataaa agaatgtagt atatagcaat
tgcttttctg 120tagtttataa gtgtgtatat tttaatttat aacttttcta atatatgacc
aaaacatggt 180gatgtgcag
18939PRTSimian virus 40 3Met Ala Pro Lys Lys Lys Arg Lys Val
1 5 418PRTAgrobacterium tumefaciens 4Lys
Arg Pro Arg Asp Arg His Asp Gly Glu Leu Gly Gly Arg Lys Arg 1
5 10 15 Ala Arg
56717DNAArtificial SequenceMaize optimized Cas9 expression cassette
5gtgcagcgtg acccggtcgt gcccctctct agagataatg agcattgcat gtctaagtta
60taaaaaatta ccacatattt tttttgtcac acttgtttga agtgcagttt atctatcttt
120atacatatat ttaaacttta ctctacgaat aatataatct atagtactac aataatatca
180gtgttttaga gaatcatata aatgaacagt tagacatggt ctaaaggaca attgagtatt
240ttgacaacag gactctacag ttttatcttt ttagtgtgca tgtgttctcc tttttttttg
300caaatagctt cacctatata atacttcatc cattttatta gtacatccat ttagggttta
360gggttaatgg tttttataga ctaatttttt tagtacatct attttattct attttagcct
420ctaaattaag aaaactaaaa ctctatttta gtttttttat ttaataattt agatataaaa
480tagaataaaa taaagtgact aaaaattaaa caaataccct ttaagaaatt aaaaaaacta
540aggaaacatt tttcttgttt cgagtagata atgccagcct gttaaacgcc gtcgacgagt
600ctaacggaca ccaaccagcg aaccagcagc gtcgcgtcgg gccaagcgaa gcagacggca
660cggcatctct gtcgctgcct ctggacccct ctcgagagtt ccgctccacc gttggacttg
720ctccgctgtc ggcatccaga aattgcgtgg cggagcggca gacgtgagcc ggcacggcag
780gcggcctcct cctcctctca cggcaccggc agctacgggg gattcctttc ccaccgctcc
840ttcgctttcc cttcctcgcc cgccgtaata aatagacacc ccctccacac cctctttccc
900caacctcgtg ttgttcggag cgcacacaca cacaaccaga tctcccccaa atccacccgt
960cggcacctcc gcttcaaggt acgccgctcg tcctcccccc cccccctctc taccttctct
1020agatcggcgt tccggtccat gcatggttag ggcccggtag ttctacttct gttcatgttt
1080gtgttagatc cgtgtttgtg ttagatccgt gctgctagcg ttcgtacacg gatgcgacct
1140gtacgtcaga cacgttctga ttgctaactt gccagtgttt ctctttgggg aatcctggga
1200tggctctagc cgttccgcag acgggatcga tttcatgatt ttttttgttt cgttgcatag
1260ggtttggttt gcccttttcc tttatttcaa tatatgccgt gcacttgttt gtcgggtcat
1320cttttcatgc ttttttttgt cttggttgtg atgatgtggt ctggttgggc ggtcgttcta
1380gatcggagta gaattctgtt tcaaactacc tggtggattt attaattttg gatctgtatg
1440tgtgtgccat acatattcat agttacgaat tgaagatgat ggatggaaat atcgatctag
1500gataggtata catgttgatg cgggttttac tgatgcatat acagagatgc tttttgttcg
1560cttggttgtg atgatgtggt gtggttgggc ggtcgttcat tcgttctaga tcggagtaga
1620atactgtttc aaactacctg gtgtatttat taattttgga actgtatgtg tgtgtcatac
1680atcttcatag ttacgagttt aagatggatg gaaatatcga tctaggatag gtatacatgt
1740tgatgtgggt tttactgatg catatacatg atggcatatg cagcatctat tcatatgctc
1800taaccttgag tacctatcta ttataataaa caagtatgtt ttataattat tttgatcttg
1860atatacttgg atgatggcat atgcagcagc tatatgtgga tttttttagc cctgccttca
1920tacgctattt atttgcttgg tactgtttct tttgtcgatg ctcaccctgt tgtttggtgt
1980tacttctgca ggtcgactct agaggatcca tggcaccgaa gaagaagcgc aaggtgatgg
2040acaagaagta cagcatcggc ctcgacatcg gcaccaactc ggtgggctgg gccgtcatca
2100cggacgaata taaggtcccg tcgaagaagt tcaaggtcct cggcaataca gaccgccaca
2160gcatcaagaa aaacttgatc ggcgccctcc tgttcgatag cggcgagacc gcggaggcga
2220ccaggctcaa gaggaccgcc aggagacggt acactaggcg caagaacagg atctgctacc
2280tgcaggagat cttcagcaac gagatggcga aggtggacga ctccttcttc caccgcctgg
2340aggaatcatt cctggtggag gaggacaaga agcatgagcg gcacccaatc ttcggcaaca
2400tcgtcgacga ggtaagtttc tgcttctacc tttgatatat atataataat tatcattaat
2460tagtagtaat ataatatttc aaatattttt ttcaaaataa aagaatgtag tatatagcaa
2520ttgcttttct gtagtttata agtgtgtata ttttaattta taacttttct aatatatgac
2580caaaacatgg tgatgtgcag gtggcctacc acgagaagta cccgacaatc taccacctcc
2640ggaagaaact ggtggacagc acagacaagg cggacctccg gctcatctac cttgccctcg
2700cgcatatgat caagttccgc ggccacttcc tcatcgaggg cgacctgaac ccggacaact
2760ccgacgtgga caagctgttc atccagctcg tgcagacgta caatcaactg ttcgaggaga
2820accccataaa cgctagcggc gtggacgcca aggccatcct ctcggccagg ctctcgaaat
2880caagaaggct ggagaacctt atcgcgcagt tgccaggcga aaagaagaac ggcctcttcg
2940gcaaccttat tgcgctcagc ctcggcctga cgccgaactt caaatcaaac ttcgacctcg
3000cggaggacgc caagctccag ctctcaaagg acacctacga cgacgacctc gacaacctcc
3060tggcccagat aggagaccag tacgcggacc tcttcctcgc cgccaagaac ctctccgacg
3120ctatcctgct cagcgacatc cttcgggtca acaccgaaat taccaaggca ccgctgtccg
3180ccagcatgat taaacgctac gacgagcacc atcaggacct cacgctgctc aaggcactcg
3240tccgccagca gctccccgag aagtacaagg agatcttctt cgaccaatca aaaaacggct
3300acgcgggata tatcgacggc ggtgccagcc aggaagagtt ctacaagttc atcaaaccaa
3360tcctggagaa gatggacggc accgaggagt tgctggtcaa gctcaacagg gaggacctcc
3420tcaggaagca gaggaccttc gacaacggct ccatcccgca tcagatccac ctgggcgaac
3480tgcatgccat cctgcggcgc caggaggact tctacccgtt cctgaaggat aaccgggaga
3540agatcgagaa gatcttgacg ttccgcatcc catactacgt gggcccgctg gctcgcggca
3600actcccggtt cgcctggatg acccggaagt cggaggagac catcacaccc tggaactttg
3660aggaggtggt cgataagggc gctagcgctc agagcttcat cgagcgcatg accaacttcg
3720ataaaaacct gcccaatgaa aaagtcctcc ccaagcactc gctgctctac gagtacttca
3780ccgtgtacaa cgagctcacc aaggtcaaat acgtcaccga gggcatgcgg aagccggcgt
3840tcctgagcgg cgagcagaag aaggcgatag tggacctcct cttcaagacc aacaggaagg
3900tgaccgtgaa gcaattaaaa gaggactact tcaagaaaat agagtgcttc gactccgtgg
3960agatctcggg cgtggaggat cggttcaacg cctcactcgg cacgtatcac gacctcctca
4020agatcattaa agacaaggac ttcctcgaca acgaggagaa cgaggacatc ctcgaggaca
4080tcgtcctcac cctgaccctg ttcgaggacc gcgaaatgat cgaggagagg ctgaagacct
4140acgcgcacct gttcgacgac aaggtcatga aacagctcaa gaggcgccgc tacactggtt
4200ggggaaggct gtcccgcaag ctcattaatg gcatcaggga caagcagagc ggcaagacca
4260tcctggactt cctcaagtcc gacgggttcg ccaaccgcaa cttcatgcag ctcattcacg
4320acgactcgct cacgttcaag gaagacatcc agaaggcaca ggtgagcggg cagggtgact
4380ccctccacga acacatcgcc aacctggccg gctcgccggc cattaaaaag ggcatcctgc
4440agacggtcaa ggtcgtcgac gagctcgtga aggtgatggg ccggcacaag cccgaaaata
4500tcgtcataga gatggccagg gagaaccaga ccacccaaaa agggcagaag aactcgcgcg
4560agcggatgaa acggatcgag gagggcatta aagagctcgg gtcccagatc ctgaaggagc
4620accccgtgga aaatacccag ctccagaatg aaaagctcta cctctactac ctgcagaacg
4680gccgcgacat gtacgtggac caggagctgg acattaatcg gctatcggac tacgacgtcg
4740accacatcgt gccgcagtcg ttcctcaagg acgatagcat cgacaacaag gtgctcaccc
4800ggtcggataa aaatcggggc aagagcgaca acgtgcccag cgaggaggtc gtgaagaaga
4860tgaaaaacta ctggcgccag ctcctcaacg cgaaactgat cacccagcgc aagttcgaca
4920acctgacgaa ggcggaacgc ggtggcttga gcgaactcga taaggcgggc ttcataaaaa
4980ggcagctggt cgagacgcgc cagatcacga agcatgtcgc ccagatcctg gacagccgca
5040tgaatactaa gtacgatgaa aacgacaagc tgatccggga ggtgaaggtg atcacgctga
5100agtccaagct cgtgtcggac ttccgcaagg acttccagtt ctacaaggtc cgcgagatca
5160acaactacca ccacgcccac gacgcctacc tgaatgcggt ggtcgggacc gccctgatca
5220agaagtaccc gaagctggag tcggagttcg tgtacggcga ctacaaggtc tacgacgtgc
5280gcaaaatgat cgccaagtcc gagcaggaga tcggcaaggc cacggcaaaa tacttcttct
5340actcgaacat catgaacttc ttcaagaccg agatcaccct cgcgaacggc gagatccgca
5400agcgcccgct catcgaaacc aacggcgaga cgggcgagat cgtctgggat aagggccggg
5460atttcgcgac ggtccgcaag gtgctctcca tgccgcaagt caatatcgtg aaaaagacgg
5520aggtccagac gggcgggttc agcaaggagt ccatcctccc gaagcgcaac tccgacaagc
5580tcatcgcgag gaagaaggat tgggacccga aaaaatatgg cggcttcgac agcccgaccg
5640tcgcatacag cgtcctcgtc gtggcgaagg tggagaaggg caagtcaaag aagctcaagt
5700ccgtgaagga gctgctcggg atcacgatta tggagcggtc ctccttcgag aagaacccga
5760tcgacttcct agaggccaag ggatataagg aggtcaagaa ggacctgatt attaaactgc
5820cgaagtactc gctcttcgag ctggaaaacg gccgcaagag gatgctcgcc tccgcaggcg
5880agttgcagaa gggcaacgag ctcgccctcc cgagcaaata cgtcaatttc ctgtacctcg
5940ctagccacta tgaaaagctc aagggcagcc cggaggacaa cgagcagaag cagctcttcg
6000tggagcagca caagcattac ctggacgaga tcatcgagca gatcagcgag ttctcgaagc
6060gggtgatcct cgccgacgcg aacctggaca aggtgctgtc ggcatataac aagcaccgcg
6120acaaaccaat acgcgagcag gccgaaaata tcatccacct cttcaccctc accaacctcg
6180gcgctccggc agccttcaag tacttcgaca ccacgattga ccggaagcgg tacacgagca
6240cgaaggaggt gctcgatgcg acgctgatcc accagagcat cacagggctc tatgaaacac
6300gcatcgacct gagccagctg ggcggagaca agagaccacg ggaccgccac gatggcgagc
6360tgggaggccg caagcgggca aggtaggtac cgttaaccta gacttgtcca tcttctggat
6420tggccaactt aattaatgta tgaaataaaa ggatgcacac atagtgacat gctaatcact
6480ataatgtggg catcaaagtt gtgtgttatg tgtaattact agttatctga ataaaagaga
6540aagagatcat ccatatttct tatcctaaat gaatgtcacg tgtctttata attctttgat
6600gaaccagatg catttcatta accaaatcca tatacatata aatattaatc atatataatt
6660aatatcaatt gggttagcaa aacaaatcta gtctaggtgt gttttgcgaa tgcggcc
6717639RNAArtificial SequencecrRNA containing the LIGCas-3 target
sequence in the variable targeting domain 6gcguacgcgu acgugugguu
uuagagcuau gcuguuuug 39786RNAStreptococcus
pyogenes M1 GAS (SF370)tracRNA(1)..(86) 7ggaaccauuc aaaacagcau agcaaguuaa
aauaaggcua guccguuauc aacuugaaaa 60aguggcaccg agucggugcu uuuuuu
86894RNAArtificial SequenceLong guide
RNA containing the LIGCas-3 target sequence in the variable
targeting domain 8gcguacgcgu acgugugguu uuagagcuag aaauagcaag uuaaaauaag
gcuaguccgu 60uaucaacuug aaaaaguggc accgagucgg ugcu
9491000DNAZea mays 9tgagagtaca atgatgaacc tagattaatc
aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa gattgcttga
ggccctgttc ggttgttccg gattagagcc 120ccggattaat tcctagccgg attacttctc
taatttatat agattttgat gagctggaat 180gaatcctggc ttattccggt acaaccgaac
aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta
ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat ggcagggtga
ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct gtttgttagc gcatgtacaa
ggaatgcaag ttttgagcga gggggcatca 480aagatctggc tgtgtttcca gctgtttttg
ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct cgcttgtata
gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag taagtcgtgt
ttagaaatta tttttttata tacctttttt 720ccttctatgt acagtaggac acagtgtcag
cgccgcgttg acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt
gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc aatacgaagc
accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac cgagccgcaa
gcaccgaatt 10001016DNAZea mays 10tttttttttt
tttttt
161159RNAArtificial SequenceShort guide RNA containing the LIGCas-3
variable targeting domain 11gcguacgcgu acgugugguu uuagagcuag aaauagcaag
uuaaaauaag gcuaguccg 59121102DNAArtificial SequenceMaize optimized
long guide RNA expression cassette containing the LIGCas-3 variable
targeting domain 12tgagagtaca atgatgaacc tagattaatc aatgccaaag tctgaaaaat
gcaccctcag 60tctatgatcc agaaaatcaa gattgcttga ggccctgttc ggttgttccg
gattagagcc 120ccggattaat tcctagccgg attacttctc taatttatat agattttgat
gagctggaat 180gaatcctggc ttattccggt acaaccgaac aggccctgaa ggataccagt
aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta ttgcagcaag gtagtgagat
aaccggcatc 300atggtgccag tttgatggca ccattagggt tagagatggt ggccatgggc
gcatgtcctg 360gccaactttg tatgatatat ggcagggtga ataggaaagt aaaattgtat
tgtaaaaagg 420gatttcttct gtttgttagc gcatgtacaa ggaatgcaag ttttgagcga
gggggcatca 480aagatctggc tgtgtttcca gctgtttttg ttagccccat cgaatccttg
acataatgat 540cccgcttaaa taagcaacct cgcttgtata gttccttgtg ctctaacaca
cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt tccaggccca gttgtaaaag
ctaaaatgct 660attcgaattt ctactagcag taagtcgtgt ttagaaatta tttttttata
tacctttttt 720ccttctatgt acagtaggac acagtgtcag cgccgcgttg acggagaata
tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt gccaaaaact tcgtcacaga
gagggccata 840agaaacatgg cccacggccc aatacgaagc accgcgacga agcccaaaca
gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa cgttttgtcc caccttgact
aatcacaaga 960gtggagcgta ccttataaac cgagccgcaa gcaccgaatt gcgtacgcgt
acgtgtggtt 1020ttagagctag aaatagcaag ttaaaataag gctagtccgt tatcaacttg
aaaaagtggc 1080accgagtcgg tgcttttttt tt
11021327DNAZea mays 13gtactccatc cgccccatcg agtaggg
271424DNAZea mays 14gcacgtacgt caccatcccg
ccgg 241524DNAZea mays
15gacgtacgtg ccctactcga tggg
241624DNAZea mays 16gtaccgtacg tgccccggcg gagg
241724DNAZea mays 17ggaattgtac cgtacgtgcc ccgg
241820DNAZea mays 18gcgtacgcgt acgtgtgagg
201922DNAZea mays
19gctggccgag gtcgactacc gg
222023DNAZea mays 20ggccgaggtc gactaccggc cgg
232123DNAZea mays 21ggcgcgagct cgtgcttcac cgg
232221DNAZea mays 22ggtgccaatc atgcgtcgcg
g 212320DNAZea mays
23ggtcgccatc acgggacagg
202424DNAZea mays 24gtcgcggcac ctgtcccgtg atgg
242523DNAZea mays 25ggaatgctgg aactgcaatg cgg
232624DNAZea mays 26gcagctcttc ttggggaatg
ctgg 242723DNAZea mays
27gcagtaacag ctgctgtcaa tgg
232856DNAArtificial SequenceMS26Cas-1 forward primer 28ctacactctt
tccctacacg acgctcttcc gatctaggac cggaagctcg ccgcgt
562954DNAArtificial SequenceMS26Cas-1 and MS26Cas-3 reverse primer
29caagcagaag acggcatacg agctcttccg atcttcctgg aggacgacgt gctg
543059DNAArtificial SequenceMS26Cas-2 forward primer 30ctacactctt
tccctacacg acgctcttcc gatctaaggt cctggaggac gacgtgctg
593151DNAArtificial SequenceMS26Cas-2 and MS26 meganuclease reverse
primer 31caagcagaag acggcatacg agctcttccg atctccggaa gctcgccgcg t
513256DNAArtificial SequenceMS26Cas-3 forward primer 32ctacactctt
tccctacacg acgctcttcc gatcttcctc cggaagctcg ccgcgt
563359DNAArtificial SequenceMS26 Meganuclease forward primer 33ctacactctt
tccctacacg acgctcttcc gatctttcct cctggaggac gacgtgctg
593463DNAArtificial SequenceLIGCas-1 forward primer 34ctacactctt
tccctacacg acgctcttcc gatctaggac tgtaacgatt tacgcacctg 60ctg
633558DNAArtificial SequenceLIGCas-1 and LIGCas-2 reverse primer
35caagcagaag acggcatacg agctcttccg atctgcaaat gagtagcagc gcacgtat
583663DNAArtificial SequenceLIGCas-2 forward primer 36ctacactctt
tccctacacg acgctcttcc gatcttcctc tgtaacgatt tacgcacctg 60ctg
633760DNAArtificial SequenceLIGCas-3 forward primer 37ctacactctt
tccctacacg acgctcttcc gatctaaggc gcaaatgagt agcagcgcac
603857DNAArtificial SequenceLIGCas-3 and LIG3-4 meganuclease reverse
primer 38caagcagaag acggcatacg agctcttccg atctcacctg ctgggaattg taccgta
573960DNAArtificial SequenceLIG3-4 meganuclease forward primer
39ctacactctt tccctacacg acgctcttcc gatctccttc gcaaatgagt agcagcgcac
604058DNAArtificial SequenceMS45Cas-1 forward primer 40ctacactctt
tccctacacg acgctcttcc gatctaggag gacccgttcg gcctcagt
584154DNAArtificial SequenceMS45Cas-1, MS45Cas-2 and MS45Cas-3 reverse
primer 41caagcagaag acggcatacg agctcttccg atctgccggc tggcattgtc tctg
544258DNAArtificial SequenceMS45Cas-2 forward primer 42ctacactctt
tccctacacg acgctcttcc gatcttcctg gacccgttcg gcctcagt
584358DNAArtificial SequenceMS45Cas-3 forward primer 43ctacactctt
tccctacacg acgctcttcc gatctgaagg gacccgttcg gcctcagt
584458DNAArtificial SequenceALSCas-1 forward primer 44ctacactctt
tccctacacg acgctcttcc gatctaaggc gacgatgggc gtctcctg
584553DNAArtificial SequenceALSCas-1, ALSCas-2 and ALSCas-3 reverse
primer 45caagcagaag acggcatacg agctcttccg atctgcgtct gcatcgccac ctc
534658DNAArtificial SequenceALSCas-2 forward primer 46ctacactctt
tccctacacg acgctcttcc gatctttccc gacgatgggc gtctcctg
584758DNAArtificial SequenceALSCas-3 forward primer 47ctacactctt
tccctacacg acgctcttcc gatctggaac gacgatgggc gtctcctg
584863DNAArtificial SequenceEPSPSCas-1 forward primer 48ctacactctt
tccctacacg acgctcttcc gatctggaag aggaaacata cgttgcattt 60cca
634957DNAArtificial SequenceEPSPSCas-1 and EPSPSCas-3 reverse primer
49caagcagaag acggcatacg agctcttccg atctggtgga aagttcccag ttgagga
575062DNAArtificial SequenceEPSPSCas-2 forward primer 50ctacactctt
tccctacacg acgctcttcc gatctaagcg gtggaaagtt cccagttgag 60ga
625158DNAArtificial SequenceEPSPSCas-2 reverse primer 51caagcagaag
acggcatacg agctcttccg atctgaggaa acatacgttg catttcca
585263DNAArtificial SequenceEPSPSCas-3 forward primer 52ctacactctt
tccctacacg acgctcttcc gatctccttg aggaaacata cgttgcattt 60cca
635343DNAArtificial SequenceForward primer for secondary PCR 53aatgatacgg
cgaccaccga gatctacact ctttccctac acg
435418DNAArtificial SequenceReverse primer for secondary PCR 54caagcagaag
acggcata 185593DNAZea
mays 55ctgtaacgat ttacgcacct gctgggaatt gtaccgtacg tgccccggcg gaggatatat
60atacctcaca cgtacgcgta cgcgtatata tac
935698DNAArtificial SequenceMutation 1 for LIGCas-1 locus 56aggactgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgcccc ggtcggagga 60tatatatacc
tcacacgtac gcgtacgcgt atatatac
985798DNAArtificial SequenceMutation 2 for LIGCas-1 locus 57aggactgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgcccc ggacggagga 60tatatatacc
tcacacgtac gcgtacgcgt atatatac
985898DNAArtificial SequenceMutation 3 for LIGCas-1 locus 58aggactgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgcccc gggcggagga 60tatatatacc
tcacacgtac gcgtacgcgt atatatac
985995DNAArtificial SequenceMutation 4 for LIGCas-1 locus 59aggactgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgcggt cggaggatat 60atatacctca
cacgtacgcg tacgcgtata tatac
956098DNAArtificial SequenceMutation 5 for LIGCas-1 locus 60aggactgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgcccc ggccggagga 60tatatatacc
tcacacgtac gcgtacgcgt atatatac
986196DNAArtificial SequenceMutation 6 for LIGCas-1 locus 61aggactgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgcccc gcggaggata 60tatatacctc
acacgtacgc gtacgcgtat atatac
966296DNAArtificial SequenceMutation 7 for LIGCas-1 locus 62aggactgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgcccc ggggaggata 60tatatacctc
acacgtacgc gtacgcgtat atatac
966394DNAArtificial SequenceMutation 8 for LIGCas-1 locus 63aggactgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgcccc ggaggatata 60tatacctcac
acgtacgcgt acgcgtatat atac
946494DNAArtificial SequenceMutation 9 for LIGCas-1 locus 64aggactgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgcgtc ggaggatata 60tatacctcac
acgtacgcgt acgcgtatat atac
946543DNAArtificial SequenceMutation 10 for LIGCas-1 locus 65aggactgtaa
cgatttacgc acctgctggg aattgtaccg tac
436698DNAArtificial SequenceMutation 1 for LIGCas-2 locus 66tcctctgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgaccc cggcggagga 60tatatatacc
tcacacgtac gcgtacgcgt atatatac
986796DNAArtificial SequenceMutation 2 for LIGCas-2 locus 67tcctctgtaa
cgatttacgc acctgctggg aattgtaccg tacgtccccg gcggaggata 60tatatacctc
acacgtacgc gtacgcgtat atatac
966898DNAArtificial SequenceMutation 3 for LIGCas-2 locus 68tcctctgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgtccc cggcggagga 60tatatatacc
tcacacgtac gcgtacgcgt atatatac
986994DNAArtificial SequenceMutation 4 for LIGCas-2 locus 69tcctctgtaa
cgatttacgc acctgctggg aattgtaccg tacccccggc ggaggatata 60tatacctcac
acgtacgcgt acgcgtatat atac
947093DNAArtificial SequenceMutation 5 for LIGCas-2 locus 70tcctctgtaa
cgatttacgc acctgctggg aattgtaccg taccccggcg gaggatatat 60atacctcaca
cgtacgcgta cgcgtatata tac
937198DNAArtificial SequenceMutation 6 for LIGCas-2 locus 71tcctctgtaa
cgatttacgc acctgctggg aattgtaccg tacgtggccc cggcggagga 60tatatatacc
tcacacgtac gcgtacgcgt atatatac
987292DNAArtificial SequenceMutation 7 for LIGCas-2 locus 72tcctctgtaa
cgatttacgc acctgctggg aattgtaccg tacccggcgg aggatatata 60tacctcacac
gtacgcgtac gcgtatatat ac
927399DNAArtificial SequenceMutation 8 for LIGCas-2 locus 73tcctctgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgaacc ccggcggagg 60atatatatac
ctcacacgta cgcgtacgcg tatatatac
997461DNAArtificial SequenceMutation 9 for LIGCas-2 locus 74tcctctgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgtacg cgtatatata 60c
617595DNAArtificial SequenceMutation 10 for LIGCas-2 locus 75tcctctgtaa
cgatttacgc acctgctggg aattgtaccg tacgccccgg cggaggatat 60atatacctca
cacgtacgcg tacgcgtata tatac 957693DNAZea
mays 76cgcaaatgag tagcagcgca cgtatatata cgcgtacgcg tacgtgtgag gtatatatat
60cctccgccgg ggcacgtacg gtacaattcc cag
937798DNAArtificial SequenceMutation 1 for LIGCas-3 locus 77aaggcgcaaa
tgagtagcag cgcacgtata tatacgcgta cgcgtacgtt gtgaggtata 60tatatcctcc
gccggggcac gtacggtaca attcccag
987896DNAArtificial SequenceMutation 2 for LIGCas-3 locus 78aaggcgcaaa
tgagtagcag cgcacgtata tatacgcgta cgcgtacggt gaggtatata 60tatcctccgc
cggggcacgt acggtacaat tcccag
967996DNAArtificial SequenceMutation 3 for LIGCas-3 locus 79aaggcgcaaa
tgagtagcag cgcacgtata tatacgcgta cgcgtactgt gaggtatata 60tatcctccgc
cggggcacgt acggtacaat tcccag
968095DNAArtificial SequenceMutation 4 for LIGCas-3 locus 80aaggcgcaaa
tgagtagcag cgcacgtata tatacgcgta cgcgtacgtg aggtatatat 60atcctccgcc
ggggcacgta cggtacaatt cccag
958168DNAArtificial SequenceMutation 5 for LIGCas-3 locus 81aaggcgcaaa
tgagtagcag cgcacgtata tatatcctcc gccggggcac gtacggtaca 60attcccag
688255DNAArtificial SequenceMutation 6 for LIGCas-3 locus 82aaggcgcaaa
tgagtagcag cgcacgtata tatacgcgta cggtacaatt cccag
558393DNAArtificial SequenceMutation 7 for LIGCas-3 locus 83aaggcgcaaa
tgagtagcag cgcacgtata tatacgcgta cgcgtgtgag gtatatatat 60cctccgccgg
ggcacgtacg gtacaattcc cag
938469DNAArtificial SequenceMutation 8 for LIGCas-3 locus 84aaggcgcaaa
tgagtagcag cgcacgtata tatacgcgta cgccggggca cgtacggtac 60aattcccag
698566DNAArtificial SequenceMutation 9 for LIGCas-3 locus 85aaggcgcaaa
tgagtagcag cgcacgtata tatcctccgc cggggcacgt acggtacaat 60tcccag
668695DNAArtificial SequenceMutation 10 for LIGCas-3 locus 86aaggcgcaaa
tgagtagcag cgcacgtata tatacgcgta cgcgtatgtg aggtatatat 60atcctccgcc
ggggcacgta cggtacaatt cccag
958795DNAArtificial SequenceMutation 1 for LIG3-4 homing endonuclease
locus 87ccttcgcaaa tgagtagcag cgcacgtata tatacgcgta cgcgtacgtg aggtatatat
60atcctccgcc ggggcacgta cggtacaatt cccag
958868DNAArtificial SequenceMutation 2 for LIG3-4 homing endonuclease
locus 88ccttcgcaaa tgagtagcag cgcacgtata tatatcctcc gccggggcac gtacggtaca
60attcccag
688965DNAArtificial SequenceMutation 3 for LIG3-4 homing endonuclease
locus 89ccttcgcaaa tgagtagcag cgcacgtata tatacgcgta cgcgtacgta cggtacaatt
60cccag
659055DNAArtificial SequenceMutation 4 for LIG3-4 homing endonuclease
locus 90ccttcgcaaa tgagtagcag cgcacgtata tatacgcgta cggtacaatt cccag
559169DNAArtificial SequenceMutation 5 for LIG3-4 homing endonuclease
locus 91ccttcgcaaa tgagtagcag cgcacgtata tatacgcgta cgccggggca cgtacggtac
60aattcccag
699285DNAArtificial SequenceMutation 6 for LIG3-4 homing endonuclease
locus 92ccttcgcaaa tgagtagcag cgcacgtata tatacgtgtg aggtatatat atcctccgcc
60ggggcacgta cggtacaatt cccag
859393DNAArtificial SequenceMutation 7 for LIG3-4 homing endonuclease
locus 93ccttcgcaaa tgagtagcag cgcacgtata tatacgcgta cgcgtgtgag gtatatatat
60cctccgccgg ggcacgtacg gtacaattcc cag
939466DNAArtificial SequenceMutation 8 for LIG3-4 homing endonuclease
locus 94ccttcgcaaa tgagtagcag cgcacgtata tatcctccgc cggggcacgt acggtacaat
60tcccag
669595DNAArtificial SequenceMutation 9 for LIG3-4 homing endonuclease
locus 95ccttcgcaaa tgagtagcag cgcacgtata tatacgcgta cgcgtacgtg tggtatatat
60atcctccgcc ggggcacgta cggtacaatt cccag
9596102DNAArtificial SequenceMutation 10 for LIG3-4 homing endonuclease
locus 96ccttcgcaaa tgagtagcag cgcacgtata tatacgcgta cggtatatat acgtgtgagg
60tatatatatc ctccgccggg gcacgtacgg tacaattccc ag
102975424DNAArtificial Sequencedonor DNA -HR Repair DNA 97cccatagaaa
actgtgtgct ataatacacc aaaaggaaag caaagtgaaa aggaaacttt 60gaatagccaa
gaagactcgg agtgcttcac gccttcacct atcccacata ggtgatgagc 120taagagtaaa
atgtagattc tctcgagtac tgaatattgc ctgcactttt ccttgcagta 180aatacacctt
taatccatga cgagagtcca ctctttgagt ccgtcttgag attcttccat 240tgatcataca
acatgacctc gaagtcctga tggagaacaa cttatataat taaaactaca 300atacagaaag
ttcctgacaa ttaaaacctt tggtggtggc atgccgtagg ttaaaaaaaa 360tagataatga
caacacaact ggagacacgc tctttgccga gtgctcacac gtttgctgag 420agcgagcact
cggcaaatat atgatttgcc gaataccacc ctcctcggca aaacaataca 480ctaggcaaaa
aggtagtttc ccatcaccat gatgcccgcc gttaatgtac cttctatgcc 540gagtatgttg
gcgctcagca aagagatcgt taccggcgtt tgtttcacca agagctcttt 600gacgagtgtg
gcacacgaca aaaccttttg ccgagtgtaa ttagtcgttt gccaagtgac 660tggtgcagtt
ggcaaaggag tcgtttatta tgtgtgggca aaatgatata tggtgccagt 720tagggctagc
aaattaaagg gggggggggg ggggttaggt tgaagaaggt gacgagtaat 780aaggtctcgg
acggccgcgc gcatatatat cagatccgat ccaatggcac acggtgcaaa 840cgaaaagcac
gaaatttcca ccagcttaat tagggagaga aaaatagagc accagctgat 900gagtgaatga
atgagataga cgggacacag agggtccagc aggctagcct actctggccg 960ccctaaatag
aagtcagtgc cgtgacgacg cgcaaacttc ttttgatcgg ctgcggaaat 1020aatatactgt
aacgatttac gcacctgctg ggaattgtac cgtacgtgcc ccggcggagg 1080atatatatac
ctcacacaag ggcgaattgt actagttagt tagctagtcg gtcctagatg 1140ccgtaatcat
tagctaatcg taagtgacgc ttggacacga gcggcttgag ctaggaacct 1200acgaagtcat
cggaatcagc tcaggtgtac agaagttcct atactttctg gagaatagga 1260acttcggaat
aggaacttcg tatacgctag ggccgcattc gcaaaacaca cctagactag 1320atttgttttg
ctaacccaat tgatattaat tatatatgat taatatttat atgtatatgg 1380atttggttaa
tgaaatgcat ctggttcatc aaagaattat aaagacacgt gacattcatt 1440taggataaga
aatatggatg atctctttct cttttattca gataactagt aattacacat 1500aacacacaac
tttgatgccc acattatagt gattagcatg tcactatgtg tgcatccttt 1560tatttcatac
attaattaag ttggccaatc cagaagatgg acaagtctag gtttcgactc 1620agatctgcgt
caccgggcgc accgggcgcg gcggggccgg cagctcgaag tcgcgctgcc 1680agaagccgac
gtcgtgccag ccgccgtgct tgtagccggc ggcgcggagg gtgccgcggg 1740cggtgtagcc
gagggcctcg tggaggcgca cggacgggtc gttcgggagg ccgatcacgg 1800ccaccacgga
cttgaagccc tgggcctcca tgctcttgag gaggtgggtg tagagggtgg 1860agccgaggcc
gaggcgctgg tggcggtggg acacgtacac ggtggactcc acggtccagt 1920cgtaggcgtt
gcgggccttc cacgggccgg cgtaggcgat gccggccacc acgccctcca 1980cctcggccac
gagccacggg tagcggtcct ggaggcgctc caggtcgtcg atccactcct 2040gcggggtctg
cggctcggtg cggaagttca cggtggaggt ctcgatgtag tggttcacga 2100tgtcgcacac
ggcggccatg tcggcggcgg tggccgggcg gatctcgacg gggcggcgct 2160cgggggacat
ggtgtcgtgt ggatcccggt ggatctgaag ttcctatact ttctagagaa 2220taggaacttc
ggaataggaa cttcgctagc gaattgatcc tctagagtcg acctgcagaa 2280gtaacaccaa
acaacagggt gagcatcgac aaaagaaaca gtaccaagca aataaatagc 2340gtatgaaggc
agggctaaaa aaatccacat atagctgctg catatgccat catccaagta 2400tatcaagatc
aaaataatta taaaacatac ttgtttatta taatagatag gtactcaagg 2460ttagagcata
tgaatagatg ctgcatatgc catcatgtat atgcatcagt aaaacccaca 2520tcaacatgta
tacctatcct agatcgatat ttccatccat cttaaactcg taactatgaa 2580gatgtatgac
acacacatac agttccaaaa ttaataaata caccaggtag tttgaaacag 2640tattctactc
cgatctagaa cgaatgaacg accgcccaac cacaccacat catcacaacc 2700aagcgaacaa
aaagcatctc tgtatatgca tcagtaaaac ccgcatcaac atgtatacct 2760atcctagatc
gatatttcca tccatcatct tcaattcgta actatgaata tgtatggcac 2820acacatacag
atccaaaatt aataaatcca ccaggtagtt tgaaacagaa ttctactccg 2880atctagaacg
accgcccaac cagaccacat catcacaacc aagacaaaaa aaagcatgaa 2940aagatgaccc
gacaaacaag tgcacggcat atattgaaat aaaggaaaag ggcaaaccaa 3000accctatgca
acgaaacaaa aaaaatcatg aaatcgatcc cgtctgcgga acggctagag 3060ccatcccagg
attccccaaa gagaaacact ggcaagttag caatcagaac gtgtctgacg 3120tacaggtcgc
atccgtgtac gaacgctagc agcacggatc taacacaaac acggatctaa 3180cacaaacatg
aacagaagta gaactaccgg gccctaacca tgcatggacc ggaacgccga 3240tctagagaag
gtagagaggg ggggggggga ggacgagcgg cgtaccttga agcggaggtg 3300ccgacgggtg
gatttggggg agatctggtt gtgtgtgtgt gcgctccgaa caacacgagg 3360ttggggaaag
agggtgtgga gggggtgtct atttattacg gcgggcgagg aagggaaagc 3420gaaggagcgg
tgggaaagga atcccccgta gctgccggtg ccgtgagagg aggaggaggc 3480cgcctgccgt
gccggctcac gtctgccgct ccgccacgca atttctggat gccgacagcg 3540gagcaagtcc
aacggtggag cggaactctc gagaggggtc cagaggcagc gacagagatg 3600ccgtgccgtc
tgcttcgctt ggcccgacgc gacgctgctg gttcgctggt tggtgtccgt 3660tagactcgtc
gacggcgttt aacaggctgg cattatctac tcgaaacaag aaaaatgttt 3720ccttagtttt
tttaatttct taaagggtat ttgtttaatt tttagtcact ttattttatt 3780ctattttata
tctaaattat taaataaaaa aactaaaata gagttttagt tttcttaatt 3840tagaggctaa
aatagaataa aatagatgta ctaaaaaaat tagtctataa aaaccattaa 3900ccctaaaccc
taaatggatg tactaataaa atggatgaag tattatatag gtgaagctat 3960ttgcaaaaaa
aaaggagaac acatgcacac taaaaagata aaactgtaga gtcctgttgt 4020caaaatactc
aattgtcctt tagaccatgt ctaactgttc atttatatga ttctctaaaa 4080cactgatatt
attgtagtac tatagattat attattcgta gagtaaagtt taaatatatg 4140tataaagata
gataaactgc acttcaaaca agtgtgacaa aaaaaatatg tggtaatttt 4200ttataactta
gacatgcaat gctcattatc tctagagagg ggcacgaccg ggtcacgctg 4260cactgcaggc
tagcggcgaa ttcgcccttg tacgcgtacg cgtatatata cgtgcgctgc 4320tactcatttg
cgcgggaata cagctcagtc tgctgtgcgc tgcaggatgt acatacatac 4380atgcgcaggt
gcaaagtcta cgcgcgcggg caatgcaagc ccctggcgta gttgggccat 4440gactgagatc
acgcctcatg gtcatggaac gaaacaccgc gtccggccgg gctgcccctg 4500gcgtcacgcg
ggaggcagct gctagcgtta gcgtacgtac ccaccgtctc gtacacacca 4560ccgcagggag
agagaagagc gatgcaatgc acatgtacag catccgcatc atgcatagat 4620actcatatct
tcaaggccac acatgcagca gtgtcgtacg ctacgttgtt tcaacggagg 4680aggaggatac
atacatagac acccacagcc agcctagcat atagcagata gcatacggac 4740tcccgggtga
ggaaaaatgg agggcgaacc aaaccaacca caaagaagca gcagcagcag 4800cagcagcagc
tgcggctgct atcaccactc accaactcca attaaagatc tctctctctc 4860tctctactgg
ccggccctgt cagtgccagc gcccggtttg ttgctagctg agctgcgggc 4920gtcgctctta
gatatagccc aaaactcact ccaccaccac tcgttccatg gaaccctaga 4980ccaaaagtac
tcgcgctctc ggccctcgct ctcgccctct ccctctccgc agcaaaagag 5040atccggccgg
ccgagaaggg cgcgcgctag ctgcccggct actagctggc gcccgcccgc 5100gcatatatct
gtgtcatcgc catcacccac accatggccc ggccggccaa caccgccgta 5160ttagctctgt
ctgtcgctcg tccacctgcg accgactgag cgatcgatct ccaccgagct 5220ctccgctaag
cgctgtcctt gccgccgtcc tcccctccgt cccctacgca tccatttccg 5280tgtgctcgtg
tgtgcgcgcg cgggcactcc tgctcctgct ccctccggcc cctcctcccc 5340tcccaggctc
ccagctagcc gcgcccgccc gcgcgacctg cacctgcaca gatcgggcgg 5400ccgggccgac
cgatcgatcg agat
54249824DNAArtificial SequenceForward PCR primer 98cccgttattg tatgaggtaa
tgac 249931DNAArtificial
SequenceReverse PCR primer for site-specific transgene insertion at
junction 1 99gctcgtgtcc aagcgtcact tacgattagc t
3110032DNAArtificial SequenceForward PCR primer for
site-specific transgene insertion at junction 2 100ccatgtctaa
ctgttcattt atatgattct ct
3210124DNAArtificial SequenceReverse PCR primer for site-specific
transgene insertion at junction 2 101gcagccgata ggttcatcat cttc
241027850DNAArtificial
SequenceLinked Cas9 and LIGCas-3 long guide RNA expression cassettes
102gtgcagcgtg acccggtcgt gcccctctct agagataatg agcattgcat gtctaagtta
60taaaaaatta ccacatattt tttttgtcac acttgtttga agtgcagttt atctatcttt
120atacatatat ttaaacttta ctctacgaat aatataatct atagtactac aataatatca
180gtgttttaga gaatcatata aatgaacagt tagacatggt ctaaaggaca attgagtatt
240ttgacaacag gactctacag ttttatcttt ttagtgtgca tgtgttctcc tttttttttg
300caaatagctt cacctatata atacttcatc cattttatta gtacatccat ttagggttta
360gggttaatgg tttttataga ctaatttttt tagtacatct attttattct attttagcct
420ctaaattaag aaaactaaaa ctctatttta gtttttttat ttaataattt agatataaaa
480tagaataaaa taaagtgact aaaaattaaa caaataccct ttaagaaatt aaaaaaacta
540aggaaacatt tttcttgttt cgagtagata atgccagcct gttaaacgcc gtcgacgagt
600ctaacggaca ccaaccagcg aaccagcagc gtcgcgtcgg gccaagcgaa gcagacggca
660cggcatctct gtcgctgcct ctggacccct ctcgagagtt ccgctccacc gttggacttg
720ctccgctgtc ggcatccaga aattgcgtgg cggagcggca gacgtgagcc ggcacggcag
780gcggcctcct cctcctctca cggcaccggc agctacgggg gattcctttc ccaccgctcc
840ttcgctttcc cttcctcgcc cgccgtaata aatagacacc ccctccacac cctctttccc
900caacctcgtg ttgttcggag cgcacacaca cacaaccaga tctcccccaa atccacccgt
960cggcacctcc gcttcaaggt acgccgctcg tcctcccccc cccccctctc taccttctct
1020agatcggcgt tccggtccat gcatggttag ggcccggtag ttctacttct gttcatgttt
1080gtgttagatc cgtgtttgtg ttagatccgt gctgctagcg ttcgtacacg gatgcgacct
1140gtacgtcaga cacgttctga ttgctaactt gccagtgttt ctctttgggg aatcctggga
1200tggctctagc cgttccgcag acgggatcga tttcatgatt ttttttgttt cgttgcatag
1260ggtttggttt gcccttttcc tttatttcaa tatatgccgt gcacttgttt gtcgggtcat
1320cttttcatgc ttttttttgt cttggttgtg atgatgtggt ctggttgggc ggtcgttcta
1380gatcggagta gaattctgtt tcaaactacc tggtggattt attaattttg gatctgtatg
1440tgtgtgccat acatattcat agttacgaat tgaagatgat ggatggaaat atcgatctag
1500gataggtata catgttgatg cgggttttac tgatgcatat acagagatgc tttttgttcg
1560cttggttgtg atgatgtggt gtggttgggc ggtcgttcat tcgttctaga tcggagtaga
1620atactgtttc aaactacctg gtgtatttat taattttgga actgtatgtg tgtgtcatac
1680atcttcatag ttacgagttt aagatggatg gaaatatcga tctaggatag gtatacatgt
1740tgatgtgggt tttactgatg catatacatg atggcatatg cagcatctat tcatatgctc
1800taaccttgag tacctatcta ttataataaa caagtatgtt ttataattat tttgatcttg
1860atatacttgg atgatggcat atgcagcagc tatatgtgga tttttttagc cctgccttca
1920tacgctattt atttgcttgg tactgtttct tttgtcgatg ctcaccctgt tgtttggtgt
1980tacttctgca ggtcgactct agaggatcca tggcaccgaa gaagaagcgc aaggtgatgg
2040acaagaagta cagcatcggc ctcgacatcg gcaccaactc ggtgggctgg gccgtcatca
2100cggacgaata taaggtcccg tcgaagaagt tcaaggtcct cggcaataca gaccgccaca
2160gcatcaagaa aaacttgatc ggcgccctcc tgttcgatag cggcgagacc gcggaggcga
2220ccaggctcaa gaggaccgcc aggagacggt acactaggcg caagaacagg atctgctacc
2280tgcaggagat cttcagcaac gagatggcga aggtggacga ctccttcttc caccgcctgg
2340aggaatcatt cctggtggag gaggacaaga agcatgagcg gcacccaatc ttcggcaaca
2400tcgtcgacga ggtaagtttc tgcttctacc tttgatatat atataataat tatcattaat
2460tagtagtaat ataatatttc aaatattttt ttcaaaataa aagaatgtag tatatagcaa
2520ttgcttttct gtagtttata agtgtgtata ttttaattta taacttttct aatatatgac
2580caaaacatgg tgatgtgcag gtggcctacc acgagaagta cccgacaatc taccacctcc
2640ggaagaaact ggtggacagc acagacaagg cggacctccg gctcatctac cttgccctcg
2700cgcatatgat caagttccgc ggccacttcc tcatcgaggg cgacctgaac ccggacaact
2760ccgacgtgga caagctgttc atccagctcg tgcagacgta caatcaactg ttcgaggaga
2820accccataaa cgctagcggc gtggacgcca aggccatcct ctcggccagg ctctcgaaat
2880caagaaggct ggagaacctt atcgcgcagt tgccaggcga aaagaagaac ggcctcttcg
2940gcaaccttat tgcgctcagc ctcggcctga cgccgaactt caaatcaaac ttcgacctcg
3000cggaggacgc caagctccag ctctcaaagg acacctacga cgacgacctc gacaacctcc
3060tggcccagat aggagaccag tacgcggacc tcttcctcgc cgccaagaac ctctccgacg
3120ctatcctgct cagcgacatc cttcgggtca acaccgaaat taccaaggca ccgctgtccg
3180ccagcatgat taaacgctac gacgagcacc atcaggacct cacgctgctc aaggcactcg
3240tccgccagca gctccccgag aagtacaagg agatcttctt cgaccaatca aaaaacggct
3300acgcgggata tatcgacggc ggtgccagcc aggaagagtt ctacaagttc atcaaaccaa
3360tcctggagaa gatggacggc accgaggagt tgctggtcaa gctcaacagg gaggacctcc
3420tcaggaagca gaggaccttc gacaacggct ccatcccgca tcagatccac ctgggcgaac
3480tgcatgccat cctgcggcgc caggaggact tctacccgtt cctgaaggat aaccgggaga
3540agatcgagaa gatcttgacg ttccgcatcc catactacgt gggcccgctg gctcgcggca
3600actcccggtt cgcctggatg acccggaagt cggaggagac catcacaccc tggaactttg
3660aggaggtggt cgataagggc gctagcgctc agagcttcat cgagcgcatg accaacttcg
3720ataaaaacct gcccaatgaa aaagtcctcc ccaagcactc gctgctctac gagtacttca
3780ccgtgtacaa cgagctcacc aaggtcaaat acgtcaccga gggcatgcgg aagccggcgt
3840tcctgagcgg cgagcagaag aaggcgatag tggacctcct cttcaagacc aacaggaagg
3900tgaccgtgaa gcaattaaaa gaggactact tcaagaaaat agagtgcttc gactccgtgg
3960agatctcggg cgtggaggat cggttcaacg cctcactcgg cacgtatcac gacctcctca
4020agatcattaa agacaaggac ttcctcgaca acgaggagaa cgaggacatc ctcgaggaca
4080tcgtcctcac cctgaccctg ttcgaggacc gcgaaatgat cgaggagagg ctgaagacct
4140acgcgcacct gttcgacgac aaggtcatga aacagctcaa gaggcgccgc tacactggtt
4200ggggaaggct gtcccgcaag ctcattaatg gcatcaggga caagcagagc ggcaagacca
4260tcctggactt cctcaagtcc gacgggttcg ccaaccgcaa cttcatgcag ctcattcacg
4320acgactcgct cacgttcaag gaagacatcc agaaggcaca ggtgagcggg cagggtgact
4380ccctccacga acacatcgcc aacctggccg gctcgccggc cattaaaaag ggcatcctgc
4440agacggtcaa ggtcgtcgac gagctcgtga aggtgatggg ccggcacaag cccgaaaata
4500tcgtcataga gatggccagg gagaaccaga ccacccaaaa agggcagaag aactcgcgcg
4560agcggatgaa acggatcgag gagggcatta aagagctcgg gtcccagatc ctgaaggagc
4620accccgtgga aaatacccag ctccagaatg aaaagctcta cctctactac ctgcagaacg
4680gccgcgacat gtacgtggac caggagctgg acattaatcg gctatcggac tacgacgtcg
4740accacatcgt gccgcagtcg ttcctcaagg acgatagcat cgacaacaag gtgctcaccc
4800ggtcggataa aaatcggggc aagagcgaca acgtgcccag cgaggaggtc gtgaagaaga
4860tgaaaaacta ctggcgccag ctcctcaacg cgaaactgat cacccagcgc aagttcgaca
4920acctgacgaa ggcggaacgc ggtggcttga gcgaactcga taaggcgggc ttcataaaaa
4980ggcagctggt cgagacgcgc cagatcacga agcatgtcgc ccagatcctg gacagccgca
5040tgaatactaa gtacgatgaa aacgacaagc tgatccggga ggtgaaggtg atcacgctga
5100agtccaagct cgtgtcggac ttccgcaagg acttccagtt ctacaaggtc cgcgagatca
5160acaactacca ccacgcccac gacgcctacc tgaatgcggt ggtcgggacc gccctgatca
5220agaagtaccc gaagctggag tcggagttcg tgtacggcga ctacaaggtc tacgacgtgc
5280gcaaaatgat cgccaagtcc gagcaggaga tcggcaaggc cacggcaaaa tacttcttct
5340actcgaacat catgaacttc ttcaagaccg agatcaccct cgcgaacggc gagatccgca
5400agcgcccgct catcgaaacc aacggcgaga cgggcgagat cgtctgggat aagggccggg
5460atttcgcgac ggtccgcaag gtgctctcca tgccgcaagt caatatcgtg aaaaagacgg
5520aggtccagac gggcgggttc agcaaggagt ccatcctccc gaagcgcaac tccgacaagc
5580tcatcgcgag gaagaaggat tgggacccga aaaaatatgg cggcttcgac agcccgaccg
5640tcgcatacag cgtcctcgtc gtggcgaagg tggagaaggg caagtcaaag aagctcaagt
5700ccgtgaagga gctgctcggg atcacgatta tggagcggtc ctccttcgag aagaacccga
5760tcgacttcct agaggccaag ggatataagg aggtcaagaa ggacctgatt attaaactgc
5820cgaagtactc gctcttcgag ctggaaaacg gccgcaagag gatgctcgcc tccgcaggcg
5880agttgcagaa gggcaacgag ctcgccctcc cgagcaaata cgtcaatttc ctgtacctcg
5940ctagccacta tgaaaagctc aagggcagcc cggaggacaa cgagcagaag cagctcttcg
6000tggagcagca caagcattac ctggacgaga tcatcgagca gatcagcgag ttctcgaagc
6060gggtgatcct cgccgacgcg aacctggaca aggtgctgtc ggcatataac aagcaccgcg
6120acaaaccaat acgcgagcag gccgaaaata tcatccacct cttcaccctc accaacctcg
6180gcgctccggc agccttcaag tacttcgaca ccacgattga ccggaagcgg tacacgagca
6240cgaaggaggt gctcgatgcg acgctgatcc accagagcat cacagggctc tatgaaacac
6300gcatcgacct gagccagctg ggcggagaca agagaccacg ggaccgccac gatggcgagc
6360tgggaggccg caagcgggca aggtaggtac cgttaaccta gacttgtcca tcttctggat
6420tggccaactt aattaatgta tgaaataaaa ggatgcacac atagtgacat gctaatcact
6480ataatgtggg catcaaagtt gtgtgttatg tgtaattact agttatctga ataaaagaga
6540aagagatcat ccatatttct tatcctaaat gaatgtcacg tgtctttata attctttgat
6600gaaccagatg catttcatta accaaatcca tatacatata aatattaatc atatataatt
6660aatatcaatt gggttagcaa aacaaatcta gtctaggtgt gttttgcgaa tgcggccccc
6720cctcgaggtc gacggtatcg ataagctttg agagtacaat gatgaaccta gattaatcaa
6780tgccaaagtc tgaaaaatgc accctcagtc tatgatccag aaaatcaaga ttgcttgagg
6840ccctgttcgg ttgttccgga ttagagcccc ggattaattc ctagccggat tacttctcta
6900atttatatag attttgatga gctggaatga atcctggctt attccggtac aaccgaacag
6960gccctgaagg ataccagtaa tcgctgagct aaattggcat gctgtcagag tgtcagtatt
7020gcagcaaggt agtgagataa ccggcatcat ggtgccagtt tgatggcacc attagggtta
7080gagatggtgg ccatgggcgc atgtcctggc caactttgta tgatatatgg cagggtgaat
7140aggaaagtaa aattgtattg taaaaaggga tttcttctgt ttgttagcgc atgtacaagg
7200aatgcaagtt ttgagcgagg gggcatcaaa gatctggctg tgtttccagc tgtttttgtt
7260agccccatcg aatccttgac ataatgatcc cgcttaaata agcaacctcg cttgtatagt
7320tccttgtgct ctaacacacg atgatgataa gtcgtaaaat agtggtgtcc aaagaatttc
7380caggcccagt tgtaaaagct aaaatgctat tcgaatttct actagcagta agtcgtgttt
7440agaaattatt tttttatata ccttttttcc ttctatgtac agtaggacac agtgtcagcg
7500ccgcgttgac ggagaatatt tgcaaaaaag taaaagagaa agtcatagcg gcgtatgtgc
7560caaaaacttc gtcacagaga gggccataag aaacatggcc cacggcccaa tacgaagcac
7620cgcgacgaag cccaaacagc agtccgtagg tggagcaaag cgctgggtaa tacgcaaacg
7680ttttgtccca ccttgactaa tcacaagagt ggagcgtacc ttataaaccg agccgcaagc
7740accgaattgc gtacgcgtac gtgtggtttt agagctagaa atagcaagtt aaaataaggc
7800tagtccgtta tcaacttgaa aaagtggcac cgagtcggtg cttttttttt
785010323DNAZea mays 103tgggcaggtc tcacgacggt tgg
2310491DNAZea mays 104ccggtttcgc gtgctctggc
tttacattac atgggcaggt ctcacgacgg ttgggctgga 60gagccggctg gtaggggagg
acctcaacgg c 9110590DNAArtificial
SequenceMutation 1 for 55CasRNA-1 locus 105ccggtttcgc gtgctctggc
tttacattac atgggcaggt ctcacgaggt tgggctggag 60agccggctgg taggggagga
cctcaacggc 9010690DNAArtificial
SequenceMutation 2 for 55CasRNA-1 locus 106ccggtttcgc gtgctctggc
tttacattac atgggcaggt ctcacacggt tgggctggag 60agccggctgg taggggagga
cctcaacggc 9010792DNAArtificial
SequenceMutation 3 for 55CasRNA-1 locus 107ccggtttcgc gtgctctggc
tttacattac atgggcaggt ctcacgacgg tttgggctgg 60agagccggct ggtaggggag
gacctcaacg gc 9210889DNAArtificial
SequenceMutation 4 for 55CasRNA-1 locus 108ccggtttcgc gtgctctggc
tttacattgc atgagcaggt cgtgacggtt gggctggaga 60gccggctggt aggggaggac
ctcaacggc 8910957DNAArtificial
SequenceMutation 5 for 55CasRNA-1 locus 109gggcaggtct cgacggttgg
gctggagagc cggctggtag gggaggacct caacggc 5711057DNAArtificial
SequenceMutation 6 for 55CasRNA-1 locus 110ccggtttcgc gtgctcttgg
gctggagagc cggctggtag gggaggacct caacggc 5711122DNAZea mays
111atatacctca cacgtacgcg ta
221121053DNAZea mays 112atgaacacca agtacaacaa ggagttcctg ctctacctgg
ccggcttcgt ggacggcgac 60ggctccatca aggcgcagat caagccgaac cagtcctgca
agttcaagca ccagctctcc 120ctgaccttcc aggtgaccca gaagacgcag aggcgctggt
tcctcgacaa gctggtcgac 180gagatcgggg tgggctacgt ctacgaccgc gggtcggtgt
ccgactacga gctctcccag 240atcaagcccc tgcacaactt cctcacccag ctccagccgt
tcctcaagct gaagcagaag 300caggcgaacc tcgtcctgaa gatcatcgag cagctcccct
cggccaagga gtccccggac 360aagttcctgg aggtgtgcac gtgggtcgac cagatcgcgg
ccctcaacga cagcaagacc 420cgcaagacga cctcggagac ggtgcgggcg gtcctggact
ccctcccagg atccgtggga 480ggtctatcgc catctcaggc atccagcgcc gcatcctcgg
cttcctcaag cccgggttca 540gggatctccg aagcactcag agctggagca actaagtcca
aggaattcct gctctacctg 600gccggcttcg tggacggcga cggctccatc atcgcgtcca
tcaagccgcg ccagtgctac 660aagttcaagc acgagctccg cctggagttc accgtgaccc
agaagacgca gaggcgctgg 720ttcctcgaca agctggtcga cgagatcggg gtgggctacg
tctacgaccg cgggtcggtg 780tccgactacc gcctctccca gatcaagccc ctgcacaact
tcctcaccca gctccagccg 840ttcctcaagc tgaagcagaa gcaggcgaac ctcgtcctga
agatcatcga gcagctcccc 900tcggccaagg agtccccgga caagttcctg gaggtgtgca
cgtgggtcga ccagatcgcg 960gccctcaacg acagcaagac ccgcaagacg acctcggaga
cggtgcgggc ggtcctggac 1020tccctcagcg agaagaagaa gtcgtccccc tga
105311322DNAZea mays 113gatggtgacg tacgtgccct ac
221141053DNAZea mays
114atgaacacca agtacaacaa ggagttcctc ctctacctgg caggtttcgt ggacggcgat
60gggtctatca tcgcccagat taccccgcaa cagtcctaca agttcaagca cgccctgcgg
120ctgaggttca cggtcactca gaagacgcag cgcaggtggt tcctcgataa gctggtcgac
180gaaatcggag tcggcaaggt gcgggacagg ggctctgtca gcgactacat cctctcccag
240aagaagccgc tccacaactt cctgacccag ctgcagccct tcctcaagct caagcagaag
300caggccaacc tggtgctcaa gatcatcgag cagctgccat ctgccaagga gtcaccagac
360aagttccttg aggtctgcac ctgggtcgat cagatcgctg ccctgaacga ctccaagacg
420aggaagacca cctccgagac cgtcagggct gtgctggact cactcccagg atccgttggc
480ggtctcagcc cttctcaggc tagctcggct gcttcctcag ccagcagctc acctggctcc
540ggtatcagcg aggctctcag agcaggtgcc accaagtcca aggagttcct cctgtacctg
600gcaggcttcg ttgacggcga cggctcgatc atggcgtcca ttaccccgaa ccagtcgtgt
660aagttcaagc atcagctgcg cctgcgcttt accgtcacgc agaagaccca gaggcgctgg
720ttcctggaca aactggtgga cgagatcggg gtcgggaagg tgtacgacag agggagcgtt
780agcgactacc ggctgtccca gaagaagccg ctccacaact tcctgacgca gctccaaccc
840ttcctgaagc tgaagcagaa gcaggcgaac cttgtgctga agatcattga gcagctgccg
900agcgccaagg agagccctga caagttcctg gaggtctgca cctgggtcga ccagatcgct
960gccctcaacg actccaagac caggaagacc acgagcgaga ccgttcgggc tgtcctggac
1020agcctctccg agaagaagaa gtcgagcccg tag
10531154104DNAArtificial Sequencesoybean codon optimized Cas9
115atggacaaaa agtactcaat agggctcgac atagggacta actccgttgg atgggccgtc
60atcaccgacg agtacaaggt gccctccaag aagttcaagg tgttgggaaa caccgacagg
120cacagcataa agaagaattt gatcggtgcc ctcctcttcg actccggaga gaccgctgag
180gctaccaggc tcaagaggac cgctagaagg cgctacacca gaaggaagaa cagaatctgc
240tacctgcagg agatcttctc caacgagatg gccaaggtgg acgactcctt cttccaccgc
300cttgaggaat cattcctggt ggaggaggat aaaaagcacg agagacaccc aatcttcggg
360aacatcgtcg acgaggtggc ctaccatgaa aagtacccta ccatctacca cctgaggaag
420aagctggtcg actctaccga caaggctgac ttgcgcttga tttacctggc tctcgctcac
480atgataaagt tccgcggaca cttcctcatt gagggagacc tgaacccaga caactccgac
540gtggacaagc tcttcatcca gctcgttcag acctacaacc agcttttcga ggagaaccca
600atcaacgcca gtggagttga cgccaaggct atcctctctg ctcgtctgtc aaagtccagg
660aggcttgaga acttgattgc ccagctgcct ggcgaaaaga agaacggact gttcggaaac
720ttgatcgctc tctccctggg attgactccc aacttcaagt ccaacttcga cctcgccgag
780gacgctaagt tgcagttgtc taaagacacc tacgacgatg acctcgacaa cttgctggcc
840cagataggcg accaatacgc cgatctcttc ctcgccgcta agaacttgtc cgacgcaatc
900ctgctgtccg acatcctgag agtcaacact gagattacca aagctcctct gtctgcttcc
960atgattaagc gctacgacga gcaccaccaa gatctgaccc tgctcaaggc cctggtgaga
1020cagcagctgc ccgagaagta caaggagatc tttttcgacc agtccaagaa cggctacgcc
1080ggatacattg acggaggcgc ctcccaggaa gagttctaca agttcatcaa gcccatcctt
1140gagaagatgg acggtaccga ggagctgttg gtgaagttga acagagagga cctgttgagg
1200aagcagagaa ccttcgacaa cggaagcatc cctcaccaaa tccacctggg agagctccac
1260gccatcttga ggaggcagga ggatttctat cccttcctga aggacaaccg cgagaagatt
1320gagaagatct tgaccttcag aattccttac tacgtcgggc cactcgccag aggaaactct
1380aggttcgcct ggatgacccg caaatctgaa gagaccatta ctccctggaa cttcgaggaa
1440gtcgtggaca agggcgcttc cgctcagtct ttcatcgaga ggatgaccaa cttcgataaa
1500aatctgccca acgagaaggt gctgcccaag cactccctgt tgtacgagta tttcacagtg
1560tacaacgagc tcaccaaggt gaagtacgtc acagagggaa tgaggaagcc tgccttcttg
1620tccggagagc agaagaaggc catcgtcgac ctgctcttca agaccaacag gaaggtgact
1680gtcaagcagc tgaaggagga ctacttcaag aagatcgagt gcttcgactc cgtcgagatc
1740tctggtgtcg aggacaggtt caacgcctcc cttgggactt accacgatct gctcaagatt
1800attaaagaca aggacttcct ggacaacgag gagaacgagg acatccttga ggacatcgtg
1860ctcaccctga ccttgttcga agacagggaa atgatcgaag agaggctcaa gacctacgcc
1920cacctcttcg acgacaaggt gatgaaacag ctgaagagac gcagatatac cggctgggga
1980aggctctccc gcaaattgat caacgggatc agggacaagc agtcagggaa gactatactc
2040gacttcctga agtccgacgg attcgccaac aggaacttca tgcagctcat tcacgacgac
2100tccttgacct tcaaggagga catccagaag gctcaggtgt ctggacaggg tgactccttg
2160catgagcaca ttgctaactt ggccggctct cccgctatta agaagggcat tttgcagacc
2220gtgaaggtcg ttgacgagct cgtgaaggtg atgggacgcc acaagccaga gaacatcgtt
2280attgagatgg ctcgcgagaa ccaaactacc cagaaagggc agaagaattc ccgcgagagg
2340atgaagcgca ttgaggaggg cataaaagag cttggctctc agatcctcaa ggagcacccc
2400gtcgagaaca ctcagctgca gaacgagaag ctgtacctgt actacctcca aaacggaagg
2460gacatgtacg tggaccagga gctggacatc aacaggttgt ccgactacga cgtcgaccac
2520atcgtgcctc agtccttcct gaaggatgac tccatcgaca ataaagtgct gacacgctcc
2580gataaaaata gaggcaagtc cgacaacgtc ccctccgagg aggtcgtgaa gaagatgaaa
2640aactactgga gacagctctt gaacgccaag ctcatcaccc agcgtaagtt cgacaacctg
2700actaaggctg agagaggagg attgtccgag ctcgataagg ccggattcat caagagacag
2760ctcgtcgaaa cccgccaaat taccaagcac gtggcccaaa ttctggattc ccgcatgaac
2820accaagtacg atgaaaatga caagctgatc cgcgaggtca aggtgatcac cttgaagtcc
2880aagctggtct ccgacttccg caaggacttc cagttctaca aggtgaggga gatcaacaac
2940taccaccacg cacacgacgc ctacctcaac gctgtcgttg gaaccgccct catcaaaaaa
3000tatcctaagc tggagtctga gttcgtctac ggcgactaca aggtgtacga cgtgaggaag
3060atgatcgcta agtctgagca ggagatcggc aaggccaccg ccaagtactt cttctactcc
3120aacatcatga acttcttcaa gaccgagatc actctcgcca acggtgagat caggaagcgc
3180ccactgatcg agaccaacgg tgagactgga gagatcgtgt gggacaaagg gagggatttc
3240gctactgtga ggaaggtgct ctccatgcct caggtgaaca tcgtcaagaa gaccgaagtt
3300cagaccggag gattctccaa ggagtccatc ctccccaaga gaaactccga caagctgatc
3360gctagaaaga aagactggga ccctaagaag tacggaggct tcgattctcc taccgtggcc
3420tactctgtgc tggtcgtggc caaggtggag aagggcaagt ccaagaagct gaaatccgtc
3480aaggagctcc tcgggattac catcatggag aggagttcct tcgagaagaa ccctatcgac
3540ttcctggagg ccaagggata taaagaggtg aagaaggacc tcatcatcaa gctgcccaag
3600tactccctct tcgagttgga gaacggaagg aagaggatgc tggcttctgc cggagagttg
3660cagaagggaa atgagctcgc ccttccctcc aagtacgtga acttcctgta cctcgcctct
3720cactatgaaa agttgaaggg ctctcctgag gacaacgagc agaagcagct cttcgtggag
3780cagcacaagc actacctgga cgaaattatc gagcagatct ctgagttctc caagcgcgtg
3840atattggccg acgccaacct cgacaaggtg ctgtccgcct acaacaagca cagggataag
3900cccattcgcg agcaggctga aaacattatc cacctgttta ccctcacaaa cttgggagcc
3960cctgctgcct tcaagtactt cgacaccacc attgacagga agagatacac ctccaccaag
4020gaggtgctcg acgcaacact catccaccaa tccatcaccg gcctctatga aacaaggatt
4080gacttgtccc agctgggagg cgac
4104
116 1503 DNA Glycine max
116 ccgggtttac ttattttgtg ggtatctata cttttattag
atttttaatc aggctcctga 60tttcttttta tttcgattga attcctgaac ttgtattatt
cagtagatcg aataaattat 120aaaaagataa aatcataaaa taatatttta tcctatcaat
catattaaag caatgaatat 180gtaaaattaa tcttatcttt attttaaaaa atcatatagg
tttagtattt ttttaaaaat 240aaagatagga ttagttttac tattcactgc ttattacttt
taaaaaaatc ataaaggttt 300agtatttttt taaaataaat ataggaatag ttttactatt
cactgcttta atagaaaaat 360agtttaaaat ttaagatagt tttaatccca gcatttgcca
cgtttgaacg tgagccgaaa 420cgatgtcgtt acattatctt aacctagctg aaacgatgtc
gtcataatat cgccaaatgc 480caactggact acgtcgaacc cacaaatccc acaaagcgcg
tgaaatcaaa tcgctcaaac 540cacaaaaaag aacaacgcgt ttgttacacg ctcaatccca
cgcgagtaga gcacagtaac 600cttcaaataa gcgaatgggg cataatcaga aatccgaaat
aaacctaggg gcattatcgg 660aaatgaaaag tagctcactc aatataaaaa tctaggaacc
ctagttttcg ttatcactct 720gtgctccctc gctctatttc tcagtctctg tgtttgcggc
tgaggattcc gaacgagtga 780ccttcttcgt ttctcgcaaa ggtaacagcc tctgctcttg
tctcttcgat tcgatctatg 840cctgtctctt atttacgatg atgtttcttc ggttatgttt
ttttatttat gctttatgct 900gttgatgttc ggttgtttgt ttcgctttgt ttttgtggtt
cagtttttta ggattctttt 960ggtttttgaa tcgattaatc ggaagagatt ttcgagttat
ttggtgtgtt ggaggtgaat 1020cttttttttg aggtcataga tctgttgtat ttgtgttata
aacatgcgac tttgtatgat 1080tttttacgag gttatgatgt tctggttgtt ttattatgaa
tctgttgaga cagaaccatg 1140atttttgttg atgttcgttt acactattaa aggtttgttt
taacaggatt aaaagttttt 1200taagcatgtt gaaggagtct tgtagatatg taaccgtcga
tagttttttt gtgggtttgt 1260tcacatgtta tcaagcttaa tcttttacta tgtatgcgac
catatctgga tccagcaaag 1320gcgatttttt aattccttgt gaaacttttg taatatgaag
ttgaaatttt gttattggta 1380aactataaat gtgtgaagtt ggagtatacc tttaccttct
tatttggctt tgtgatagtt 1440taatttatat gtattttgag ttctgacttg tatttctttg
aattgattct agtttaagta 1500atc
1503
117 33 DNA Artificial sequence linker SV40 NLS
117 tctagagccg atcccaagaa gaagagaaag gtg
33
118 1379 PRT Artificial sequence Cas9 with a SV40 NLS
118 Met Asp Lys Lys
Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val 1 5
10 15 Gly Trp Ala Val Ile Thr Asp Glu Tyr
Lys Val Pro Ser Lys Lys Phe 20 25
30 Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn
Leu Ile 35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu 50
55 60 Lys Arg Thr Ala Arg
Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys 65 70
75 80 Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met
Ala Lys Val Asp Asp Ser 85 90
95 Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys
Lys 100 105 110 His
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr 115
120 125 His Glu Lys Tyr Pro Thr
Ile Tyr His Leu Arg Lys Lys Leu Val Asp 130 135
140 Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr
Leu Ala Leu Ala His 145 150 155
160 Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175 Asp Asn
Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr 180
185 190 Asn Gln Leu Phe Glu Glu Asn
Pro Ile Asn Ala Ser Gly Val Asp Ala 195 200
205 Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg
Arg Leu Glu Asn 210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn 225
230 235 240 Leu Ile Ala
Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe 245
250 255 Asp Leu Ala Glu Asp Ala Lys Leu
Gln Leu Ser Lys Asp Thr Tyr Asp 260 265
270 Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln
Tyr Ala Asp 275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp 290
295 300 Ile Leu Arg Val
Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser 305 310
315 320 Met Ile Lys Arg Tyr Asp Glu His His
Gln Asp Leu Thr Leu Leu Lys 325 330
335 Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile
Phe Phe 340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365 Gln Glu Glu Phe
Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp 370
375 380 Gly Thr Glu Glu Leu Leu Val Lys
Leu Asn Arg Glu Asp Leu Leu Arg 385 390
395 400 Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His
Gln Ile His Leu 405 410
415 Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430 Leu Lys Asp
Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile 435
440 445 Pro Tyr Tyr Val Gly Pro Leu Ala
Arg Gly Asn Ser Arg Phe Ala Trp 450 455
460 Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn
Phe Glu Glu 465 470 475
480 Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495 Asn Phe Asp Lys
Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser 500
505 510 Leu Leu Tyr Glu Tyr Phe Thr Val Tyr
Asn Glu Leu Thr Lys Val Lys 515 520
525 Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly
Glu Gln 530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr 545
550 555 560 Val Lys Gln Leu Lys
Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp 565
570 575 Ser Val Glu Ile Ser Gly Val Glu Asp Arg
Phe Asn Ala Ser Leu Gly 580 585
590 Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu
Asp 595 600 605 Asn
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr 610
615 620 Leu Phe Glu Asp Arg Glu
Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala 625 630
635 640 His Leu Phe Asp Asp Lys Val Met Lys Gln Leu
Lys Arg Arg Arg Tyr 645 650
655 Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670 Lys Gln
Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe 675
680 685 Ala Asn Arg Asn Phe Met Gln
Leu Ile His Asp Asp Ser Leu Thr Phe 690 695
700 Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln
Gly Asp Ser Leu 705 710 715
720 His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735 Ile Leu Gln
Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly 740
745 750 Arg His Lys Pro Glu Asn Ile Val
Ile Glu Met Ala Arg Glu Asn Gln 755 760
765 Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met
Lys Arg Ile 770 775 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro 785
790 795 800 Val Glu Asn Thr
Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu 805
810 815 Gln Asn Gly Arg Asp Met Tyr Val Asp
Gln Glu Leu Asp Ile Asn Arg 820 825
830 Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe
Leu Lys 835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg 850
855 860 Gly Lys Ser Asp Asn
Val Pro Ser Glu Glu Val Val Lys Lys Met Lys 865 870
875 880 Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys
Leu Ile Thr Gln Arg Lys 885 890
895 Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu
Asp 900 905 910 Lys
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr 915
920 925 Lys His Val Ala Gln Ile
Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp 930 935
940 Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val
Ile Thr Leu Lys Ser 945 950 955
960 Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975 Glu Ile
Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val 980
985 990 Val Gly Thr Ala Leu Ile Lys
Lys Tyr Pro Lys Leu Glu Ser Glu Phe 995 1000
1005 Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val
Arg Lys Met Ile Ala 1010 1015 1020
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035 Tyr Ser
Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala 1040
1045 1050 Asn Gly Glu Ile Arg Lys Arg
Pro Leu Ile Glu Thr Asn Gly Glu 1055 1060
1065 Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe
Ala Thr Val 1070 1075 1080
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr 1085
1090 1095 Glu Val Gln Thr Gly
Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys 1100 1105
1110 Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys
Lys Asp Trp Asp Pro 1115 1120 1125
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140 Leu Val
Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys 1145
1150 1155 Ser Val Lys Glu Leu Leu Gly
Ile Thr Ile Met Glu Arg Ser Ser 1160 1165
1170 Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys
Gly Tyr Lys 1175 1180 1185
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu 1190
1195 1200 Phe Glu Leu Glu Asn
Gly Arg Lys Arg Met Leu Ala Ser Ala Gly 1205 1210
1215 Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu
Pro Ser Lys Tyr Val 1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245 Pro Glu
Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys 1250
1255 1260 His Tyr Leu Asp Glu Ile Ile
Glu Gln Ile Ser Glu Phe Ser Lys 1265 1270
1275 Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val
Leu Ser Ala 1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn 1295
1300 1305 Ile Ile His Leu Phe
Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala 1310 1315
1320 Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg
Lys Arg Tyr Thr Ser 1325 1330 1335
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350 Gly Leu
Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp 1355
1360 1365 Ser Arg Ala Asp Pro Lys Lys
Lys Arg Lys Val 1370 1375
119 8519 DNA Artificial sequence QC782
119 ccgggtttac ttattttgtg ggtatctata
cttttattag atttttaatc aggctcctga 60tttcttttta tttcgattga attcctgaac
ttgtattatt cagtagatcg aataaattat 120aaaaagataa aatcataaaa taatatttta
tcctatcaat catattaaag caatgaatat 180gtaaaattaa tcttatcttt attttaaaaa
atcatatagg tttagtattt ttttaaaaat 240aaagatagga ttagttttac tattcactgc
ttattacttt taaaaaaatc ataaaggttt 300agtatttttt taaaataaat ataggaatag
ttttactatt cactgcttta atagaaaaat 360agtttaaaat ttaagatagt tttaatccca
gcatttgcca cgtttgaacg tgagccgaaa 420cgatgtcgtt acattatctt aacctagctg
aaacgatgtc gtcataatat cgccaaatgc 480caactggact acgtcgaacc cacaaatccc
acaaagcgcg tgaaatcaaa tcgctcaaac 540cacaaaaaag aacaacgcgt ttgttacacg
ctcaatccca cgcgagtaga gcacagtaac 600cttcaaataa gcgaatgggg cataatcaga
aatccgaaat aaacctaggg gcattatcgg 660aaatgaaaag tagctcactc aatataaaaa
tctaggaacc ctagttttcg ttatcactct 720gtgctccctc gctctatttc tcagtctctg
tgtttgcggc tgaggattcc gaacgagtga 780ccttcttcgt ttctcgcaaa ggtaacagcc
tctgctcttg tctcttcgat tcgatctatg 840cctgtctctt atttacgatg atgtttcttc
ggttatgttt ttttatttat gctttatgct 900gttgatgttc ggttgtttgt ttcgctttgt
ttttgtggtt cagtttttta ggattctttt 960ggtttttgaa tcgattaatc ggaagagatt
ttcgagttat ttggtgtgtt ggaggtgaat 1020cttttttttg aggtcataga tctgttgtat
ttgtgttata aacatgcgac tttgtatgat 1080tttttacgag gttatgatgt tctggttgtt
ttattatgaa tctgttgaga cagaaccatg 1140atttttgttg atgttcgttt acactattaa
aggtttgttt taacaggatt aaaagttttt 1200taagcatgtt gaaggagtct tgtagatatg
taaccgtcga tagttttttt gtgggtttgt 1260tcacatgtta tcaagcttaa tcttttacta
tgtatgcgac catatctgga tccagcaaag 1320gcgatttttt aattccttgt gaaacttttg
taatatgaag ttgaaatttt gttattggta 1380aactataaat gtgtgaagtt ggagtatacc
tttaccttct tatttggctt tgtgatagtt 1440taatttatat gtattttgag ttctgacttg
tatttctttg aattgattct agtttaagta 1500atccatggac aaaaagtact caatagggct
cgacataggg actaactccg ttggatgggc 1560cgtcatcacc gacgagtaca aggtgccctc
caagaagttc aaggtgttgg gaaacaccga 1620caggcacagc ataaagaaga atttgatcgg
tgccctcctc ttcgactccg gagagaccgc 1680tgaggctacc aggctcaaga ggaccgctag
aaggcgctac accagaagga agaacagaat 1740ctgctacctg caggagatct tctccaacga
gatggccaag gtggacgact ccttcttcca 1800ccgccttgag gaatcattcc tggtggagga
ggataaaaag cacgagagac acccaatctt 1860cgggaacatc gtcgacgagg tggcctacca
tgaaaagtac cctaccatct accacctgag 1920gaagaagctg gtcgactcta ccgacaaggc
tgacttgcgc ttgatttacc tggctctcgc 1980tcacatgata aagttccgcg gacacttcct
cattgaggga gacctgaacc cagacaactc 2040cgacgtggac aagctcttca tccagctcgt
tcagacctac aaccagcttt tcgaggagaa 2100cccaatcaac gccagtggag ttgacgccaa
ggctatcctc tctgctcgtc tgtcaaagtc 2160caggaggctt gagaacttga ttgcccagct
gcctggcgaa aagaagaacg gactgttcgg 2220aaacttgatc gctctctccc tgggattgac
tcccaacttc aagtccaact tcgacctcgc 2280cgaggacgct aagttgcagt tgtctaaaga
cacctacgac gatgacctcg acaacttgct 2340ggcccagata ggcgaccaat acgccgatct
cttcctcgcc gctaagaact tgtccgacgc 2400aatcctgctg tccgacatcc tgagagtcaa
cactgagatt accaaagctc ctctgtctgc 2460ttccatgatt aagcgctacg acgagcacca
ccaagatctg accctgctca aggccctggt 2520gagacagcag ctgcccgaga agtacaagga
gatctttttc gaccagtcca agaacggcta 2580cgccggatac attgacggag gcgcctccca
ggaagagttc tacaagttca tcaagcccat 2640ccttgagaag atggacggta ccgaggagct
gttggtgaag ttgaacagag aggacctgtt 2700gaggaagcag agaaccttcg acaacggaag
catccctcac caaatccacc tgggagagct 2760ccacgccatc ttgaggaggc aggaggattt
ctatcccttc ctgaaggaca accgcgagaa 2820gattgagaag atcttgacct tcagaattcc
ttactacgtc gggccactcg ccagaggaaa 2880ctctaggttc gcctggatga cccgcaaatc
tgaagagacc attactccct ggaacttcga 2940ggaagtcgtg gacaagggcg cttccgctca
gtctttcatc gagaggatga ccaacttcga 3000taaaaatctg cccaacgaga aggtgctgcc
caagcactcc ctgttgtacg agtatttcac 3060agtgtacaac gagctcacca aggtgaagta
cgtcacagag ggaatgagga agcctgcctt 3120cttgtccgga gagcagaaga aggccatcgt
cgacctgctc ttcaagacca acaggaaggt 3180gactgtcaag cagctgaagg aggactactt
caagaagatc gagtgcttcg actccgtcga 3240gatctctggt gtcgaggaca ggttcaacgc
ctcccttggg acttaccacg atctgctcaa 3300gattattaaa gacaaggact tcctggacaa
cgaggagaac gaggacatcc ttgaggacat 3360cgtgctcacc ctgaccttgt tcgaagacag
ggaaatgatc gaagagaggc tcaagaccta 3420cgcccacctc ttcgacgaca aggtgatgaa
acagctgaag agacgcagat ataccggctg 3480gggaaggctc tcccgcaaat tgatcaacgg
gatcagggac aagcagtcag ggaagactat 3540actcgacttc ctgaagtccg acggattcgc
caacaggaac ttcatgcagc tcattcacga 3600cgactccttg accttcaagg aggacatcca
gaaggctcag gtgtctggac agggtgactc 3660cttgcatgag cacattgcta acttggccgg
ctctcccgct attaagaagg gcattttgca 3720gaccgtgaag gtcgttgacg agctcgtgaa
ggtgatggga cgccacaagc cagagaacat 3780cgttattgag atggctcgcg agaaccaaac
tacccagaaa gggcagaaga attcccgcga 3840gaggatgaag cgcattgagg agggcataaa
agagcttggc tctcagatcc tcaaggagca 3900ccccgtcgag aacactcagc tgcagaacga
gaagctgtac ctgtactacc tccaaaacgg 3960aagggacatg tacgtggacc aggagctgga
catcaacagg ttgtccgact acgacgtcga 4020ccacatcgtg cctcagtcct tcctgaagga
tgactccatc gacaataaag tgctgacacg 4080ctccgataaa aatagaggca agtccgacaa
cgtcccctcc gaggaggtcg tgaagaagat 4140gaaaaactac tggagacagc tcttgaacgc
caagctcatc acccagcgta agttcgacaa 4200cctgactaag gctgagagag gaggattgtc
cgagctcgat aaggccggat tcatcaagag 4260acagctcgtc gaaacccgcc aaattaccaa
gcacgtggcc caaattctgg attcccgcat 4320gaacaccaag tacgatgaaa atgacaagct
gatccgcgag gtcaaggtga tcaccttgaa 4380gtccaagctg gtctccgact tccgcaagga
cttccagttc tacaaggtga gggagatcaa 4440caactaccac cacgcacacg acgcctacct
caacgctgtc gttggaaccg ccctcatcaa 4500aaaatatcct aagctggagt ctgagttcgt
ctacggcgac tacaaggtgt acgacgtgag 4560gaagatgatc gctaagtctg agcaggagat
cggcaaggcc accgccaagt acttcttcta 4620ctccaacatc atgaacttct tcaagaccga
gatcactctc gccaacggtg agatcaggaa 4680gcgcccactg atcgagacca acggtgagac
tggagagatc gtgtgggaca aagggaggga 4740tttcgctact gtgaggaagg tgctctccat
gcctcaggtg aacatcgtca agaagaccga 4800agttcagacc ggaggattct ccaaggagtc
catcctcccc aagagaaact ccgacaagct 4860gatcgctaga aagaaagact gggaccctaa
gaagtacgga ggcttcgatt ctcctaccgt 4920ggcctactct gtgctggtcg tggccaaggt
ggagaagggc aagtccaaga agctgaaatc 4980cgtcaaggag ctcctcggga ttaccatcat
ggagaggagt tccttcgaga agaaccctat 5040cgacttcctg gaggccaagg gatataaaga
ggtgaagaag gacctcatca tcaagctgcc 5100caagtactcc ctcttcgagt tggagaacgg
aaggaagagg atgctggctt ctgccggaga 5160gttgcagaag ggaaatgagc tcgcccttcc
ctccaagtac gtgaacttcc tgtacctcgc 5220ctctcactat gaaaagttga agggctctcc
tgaggacaac gagcagaagc agctcttcgt 5280ggagcagcac aagcactacc tggacgaaat
tatcgagcag atctctgagt tctccaagcg 5340cgtgatattg gccgacgcca acctcgacaa
ggtgctgtcc gcctacaaca agcacaggga 5400taagcccatt cgcgagcagg ctgaaaacat
tatccacctg tttaccctca caaacttggg 5460agcccctgct gccttcaagt acttcgacac
caccattgac aggaagagat acacctccac 5520caaggaggtg ctcgacgcaa cactcatcca
ccaatccatc accggcctct atgaaacaag 5580gattgacttg tcccagctgg gaggcgactc
tagagccgat cccaagaaga agagaaaggt 5640gtaggttaac ctagacttgt ccatcttctg
gattggccaa cttaattaat gtatgaaata 5700aaaggatgca cacatagtga catgctaatc
actataatgt gggcatcaaa gttgtgtgtt 5760atgtgtaatt actagttatc tgaataaaag
agaaagagat catccatatt tcttatccta 5820aatgaatgtc acgtgtcttt ataattcttt
gatgaaccag atgcatttca ttaaccaaat 5880ccatatacat ataaatatta atcatatata
attaatatca attgggttag caaaacaaat 5940ctagtctagg tgtgttttgc gaatgcggcc
gctcgagggg gggcccggta ccggcgcgcc 6000gttctatagt gtcacctaaa tcgtatgtgt
atgatacata aggttatgta ttaattgtag 6060ccgcgttcta acgacaatat gtccatatgg
tgcactctca gtacaatctg ctctgatgcc 6120gcatagttaa gccagccccg acacccgcca
acacccgctg acgcgccctg acgggcttgt 6180ctgctcccgg catccgctta cagacaagct
gtgaccgtct ccgggagctg catgtgtcag 6240aggttttcac cgtcatcacc gaaacgcgcg
agacgaaagg gcctcgtgat acgcctattt 6300ttataggtta atgtcatgac caaaatccct
taacgtgagt tttcgttcca ctgagcgtca 6360gaccccgtag aaaagatcaa aggatcttct
tgagatcctt tttttctgcg cgtaatctgc 6420tgcttgcaaa caaaaaaacc accgctacca
gcggtggttt gtttgccgga tcaagagcta 6480ccaactcttt ttccgaaggt aactggcttc
agcagagcgc agataccaaa tactgtcctt 6540ctagtgtagc cgtagttagg ccaccacttc
aagaactctg tagcaccgcc tacatacctc 6600gctctgctaa tcctgttacc agtggctgct
gccagtggcg ataagtcgtg tcttaccggg 6660ttggactcaa gacgatagtt accggataag
gcgcagcggt cgggctgaac ggggggttcg 6720tgcacacagc ccagcttgga gcgaacgacc
tacaccgaac tgagatacct acagcgtgag 6780cattgagaaa gcgccacgct tcccgaaggg
agaaaggcgg acaggtatcc ggtaagcggc 6840agggtcggaa caggagagcg cacgagggag
cttccagggg gaaacgcctg gtatctttat 6900agtcctgtcg ggtttcgcca cctctgactt
gagcgtcgat ttttgtgatg ctcgtcaggg 6960gggcggagcc tatggaaaaa cgccagcaac
gcggcctttt tacggttcct ggccttttgc 7020tggccttttg ctcacatgtt ctttcctgcg
ttatcccctg attctgtgga taaccgtatt 7080accgcctttg agtgagctga taccgctcgc
cgcagccgaa cgaccgagcg cagcgagtca 7140gtgagcgagg aagcggaaga gcgcccaata
cgcaaaccgc ctctccccgc gcgttggccg 7200attcattaat gcaggttgat cagatctcga
tcccgcgaaa ttaatacgac tcactatagg 7260gagaccacaa cggtttccct ctagaaataa
ttttgtttaa ctttaagaag gagatatacc 7320catggaaaag cctgaactca ccgcgacgtc
tgtcgagaag tttctgatcg aaaagttcga 7380cagcgtctcc gacctgatgc agctctcgga
gggcgaagaa tctcgtgctt tcagcttcga 7440tgtaggaggg cgtggatatg tcctgcgggt
aaatagctgc gccgatggtt tctacaaaga 7500tcgttatgtt tatcggcact ttgcatcggc
cgcgctcccg attccggaag tgcttgacat 7560tggggaattc agcgagagcc tgacctattg
catctcccgc cgtgcacagg gtgtcacgtt 7620gcaagacctg cctgaaaccg aactgcccgc
tgttctgcag ccggtcgcgg aggctatgga 7680tgcgatcgct gcggccgatc ttagccagac
gagcgggttc ggcccattcg gaccgcaagg 7740aatcggtcaa tacactacat ggcgtgattt
catatgcgcg attgctgatc cccatgtgta 7800tcactggcaa actgtgatgg acgacaccgt
cagtgcgtcc gtcgcgcagg ctctcgatga 7860gctgatgctt tgggccgagg actgccccga
agtccggcac ctcgtgcacg cggatttcgg 7920ctccaacaat gtcctgacgg acaatggccg
cataacagcg gtcattgact ggagcgaggc 7980gatgttcggg gattcccaat acgaggtcgc
caacatcttc ttctggaggc cgtggttggc 8040ttgtatggag cagcagacgc gctacttcga
gcggaggcat ccggagcttg caggatcgcc 8100gcggctccgg gcgtatatgc tccgcattgg
tcttgaccaa ctctatcaga gcttggttga 8160cggcaatttc gatgatgcag cttgggcgca
gggtcgatgc gacgcaatcg tccgatccgg 8220agccgggact gtcgggcgta cacaaatcgc
ccgcagaagc gcggccgtct ggaccgatgg 8280ctgtgtagaa gtactcgccg atagtggaaa
ccgacgcccc agcactcgtc cgagggcaaa 8340ggaatagtga ggtacagctt ggatcgatcc
ggctgctaac aaagcccgaa aggaagctga 8400gttggctgct gccaccgctg agcaataact
agcataaccc cttggggcct ctaaacgggt 8460cttgaggggt tttttgctga aaggaggaac
tatatccgga tgatcgggcg cgccggtac 8519
120 434 DNA Glycine max
120 ccgggtgtga
tttagtataa agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata
agtaatattg aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa
ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc agagcagagt 180catcatgaag
ctagaaaggc taccgataga taaactatag ttaattaaat acattaaaaa 240atacttggat
ctttctctta ccctgtttat attgagacct gaaacttgag agagatacac 300taatcttgcc
ttgttgtttc attccctaac ttacaggact cagcgcatgt catgtggtct 360cgttccccat
ttaagtccca caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca
acaa
434
121 104 DNA Artificial sequence Guide RNA for DD43CR1
121 gtcccttgta
cttgtacgta gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac
ttgaaaaagt ggcaccgagt cggtgctttt tttt
104
122 3098 DNA Artificial sequence QC783
122 ccgggtgtga tttagtataa agtgaagtaa
tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata agtaatattg aacaaaataa
atggtaaagt gtcagatata taaaataggc 120tttaataaaa ggaagaaaaa aaacaaacaa
aaaataggtt gcaatggggc agagcagagt 180catcatgaag ctagaaaggc taccgataga
taaactatag ttaattaaat acattaaaaa 240atacttggat ctttctctta ccctgtttat
attgagacct gaaacttgag agagatacac 300taatcttgcc ttgttgtttc attccctaac
ttacaggact cagcgcatgt catgtggtct 360cgttccccat ttaagtccca caccgtctaa
acttattaaa ttattaatgt ttataactag 420atgcacaaca acaaagcttg tcccttgtac
ttgtacgtag ttttagagct agaaatagca 480agttaaaata aggctagtcc gttatcaact
tgaaaaagtg gcaccgagtc ggtgcttttt 540tttgcggccg ctcgaggggg ggcccggtac
cggcgcgccg ttctatagtg tcacctaaat 600cgtatgtgta tgatacataa ggttatgtat
taattgtagc cgcgttctaa cgacaatatg 660tccatatggt gcactctcag tacaatctgc
tctgatgccg catagttaag ccagccccga 720cacccgccaa cacccgctga cgcgccctga
cgggcttgtc tgctcccggc atccgcttac 780agacaagctg tgaccgtctc cgggagctgc
atgtgtcaga ggttttcacc gtcatcaccg 840aaacgcgcga gacgaaaggg cctcgtgata
cgcctatttt tataggttaa tgtcatgacc 900aaaatccctt aacgtgagtt ttcgttccac
tgagcgtcag accccgtaga aaagatcaaa 960ggatcttctt gagatccttt ttttctgcgc
gtaatctgct gcttgcaaac aaaaaaacca 1020ccgctaccag cggtggtttg tttgccggat
caagagctac caactctttt tccgaaggta 1080actggcttca gcagagcgca gataccaaat
actgtccttc tagtgtagcc gtagttaggc 1140caccacttca agaactctgt agcaccgcct
acatacctcg ctctgctaat cctgttacca 1200gtggctgctg ccagtggcga taagtcgtgt
cttaccgggt tggactcaag acgatagtta 1260ccggataagg cgcagcggtc gggctgaacg
gggggttcgt gcacacagcc cagcttggag 1320cgaacgacct acaccgaact gagataccta
cagcgtgagc attgagaaag cgccacgctt 1380cccgaaggga gaaaggcgga caggtatccg
gtaagcggca gggtcggaac aggagagcgc 1440acgagggagc ttccaggggg aaacgcctgg
tatctttata gtcctgtcgg gtttcgccac 1500ctctgacttg agcgtcgatt tttgtgatgc
tcgtcagggg ggcggagcct atggaaaaac 1560gccagcaacg cggccttttt acggttcctg
gccttttgct ggccttttgc tcacatgttc 1620tttcctgcgt tatcccctga ttctgtggat
aaccgtatta ccgcctttga gtgagctgat 1680accgctcgcc gcagccgaac gaccgagcgc
agcgagtcag tgagcgagga agcggaagag 1740cgcccaatac gcaaaccgcc tctccccgcg
cgttggccga ttcattaatg caggttgatc 1800agatctcgat cccgcgaaat taatacgact
cactataggg agaccacaac ggtttccctc 1860tagaaataat tttgtttaac tttaagaagg
agatataccc atggaaaagc ctgaactcac 1920cgcgacgtct gtcgagaagt ttctgatcga
aaagttcgac agcgtctccg acctgatgca 1980gctctcggag ggcgaagaat ctcgtgcttt
cagcttcgat gtaggagggc gtggatatgt 2040cctgcgggta aatagctgcg ccgatggttt
ctacaaagat cgttatgttt atcggcactt 2100tgcatcggcc gcgctcccga ttccggaagt
gcttgacatt ggggaattca gcgagagcct 2160gacctattgc atctcccgcc gtgcacaggg
tgtcacgttg caagacctgc ctgaaaccga 2220actgcccgct gttctgcagc cggtcgcgga
ggctatggat gcgatcgctg cggccgatct 2280tagccagacg agcgggttcg gcccattcgg
accgcaagga atcggtcaat acactacatg 2340gcgtgatttc atatgcgcga ttgctgatcc
ccatgtgtat cactggcaaa ctgtgatgga 2400cgacaccgtc agtgcgtccg tcgcgcaggc
tctcgatgag ctgatgcttt gggccgagga 2460ctgccccgaa gtccggcacc tcgtgcacgc
ggatttcggc tccaacaatg tcctgacgga 2520caatggccgc ataacagcgg tcattgactg
gagcgaggcg atgttcgggg attcccaata 2580cgaggtcgcc aacatcttct tctggaggcc
gtggttggct tgtatggagc agcagacgcg 2640ctacttcgag cggaggcatc cggagcttgc
aggatcgccg cggctccggg cgtatatgct 2700ccgcattggt cttgaccaac tctatcagag
cttggttgac ggcaatttcg atgatgcagc 2760ttgggcgcag ggtcgatgcg acgcaatcgt
ccgatccgga gccgggactg tcgggcgtac 2820acaaatcgcc cgcagaagcg cggccgtctg
gaccgatggc tgtgtagaag tactcgccga 2880tagtggaaac cgacgcccca gcactcgtcc
gagggcaaag gaatagtgag gtacagcttg 2940gatcgatccg gctgctaaca aagcccgaaa
ggaagctgag ttggctgctg ccaccgctga 3000gcaataacta gcataacccc ttggggcctc
taaacgggtc ttgaggggtt ttttgctgaa 3060aggaggaact atatccggat gatcgggcgc
gccggtac 3098
123 9093 DNA Artificial sequence QC815
123 ccgggtgtga tttagtataa agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta
60cctagtaata agtaatattg aacaaaataa atggtaaagt gtcagatata taaaataggc
120tttaataaaa ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc agagcagagt
180catcatgaag ctagaaaggc taccgataga taaactatag ttaattaaat acattaaaaa
240atacttggat ctttctctta ccctgtttat attgagacct gaaacttgag agagatacac
300taatcttgcc ttgttgtttc attccctaac ttacaggact cagcgcatgt catgtggtct
360cgttccccat ttaagtccca caccgtctaa acttattaaa ttattaatgt ttataactag
420atgcacaaca acaaagcttg tcccttgtac ttgtacgtag ttttagagct agaaatagca
480agttaaaata aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc ggtgcttttt
540tttgcggccg caattggatc gggtttactt attttgtggg tatctatact tttattagat
600ttttaatcag gctcctgatt tctttttatt tcgattgaat tcctgaactt gtattattca
660gtagatcgaa taaattataa aaagataaaa tcataaaata atattttatc ctatcaatca
720tattaaagca atgaatatgt aaaattaatc ttatctttat tttaaaaaat catataggtt
780tagtattttt ttaaaaataa agataggatt agttttacta ttcactgctt attactttta
840aaaaaatcat aaaggtttag tattttttta aaataaatat aggaatagtt ttactattca
900ctgctttaat agaaaaatag tttaaaattt aagatagttt taatcccagc atttgccacg
960tttgaacgtg agccgaaacg atgtcgttac attatcttaa cctagctgaa acgatgtcgt
1020cataatatcg ccaaatgcca actggactac gtcgaaccca caaatcccac aaagcgcgtg
1080aaatcaaatc gctcaaacca caaaaaagaa caacgcgttt gttacacgct caatcccacg
1140cgagtagagc acagtaacct tcaaataagc gaatggggca taatcagaaa tccgaaataa
1200acctaggggc attatcggaa atgaaaagta gctcactcaa tataaaaatc taggaaccct
1260agttttcgtt atcactctgt gctccctcgc tctatttctc agtctctgtg tttgcggctg
1320aggattccga acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc tgctcttgtc
1380tcttcgattc gatctatgcc tgtctcttat ttacgatgat gtttcttcgg ttatgttttt
1440ttatttatgc tttatgctgt tgatgttcgg ttgtttgttt cgctttgttt ttgtggttca
1500gttttttagg attcttttgg tttttgaatc gattaatcgg aagagatttt cgagttattt
1560ggtgtgttgg aggtgaatct tttttttgag gtcatagatc tgttgtattt gtgttataaa
1620catgcgactt tgtatgattt tttacgaggt tatgatgttc tggttgtttt attatgaatc
1680tgttgagaca gaaccatgat ttttgttgat gttcgtttac actattaaag gtttgtttta
1740acaggattaa aagtttttta agcatgttga aggagtcttg tagatatgta accgtcgata
1800gtttttttgt gggtttgttc acatgttatc aagcttaatc ttttactatg tatgcgacca
1860tatctggatc cagcaaaggc gattttttaa ttccttgtga aacttttgta atatgaagtt
1920gaaattttgt tattggtaaa ctataaatgt gtgaagttgg agtatacctt taccttctta
1980tttggctttg tgatagttta atttatatgt attttgagtt ctgacttgta tttctttgaa
2040ttgattctag tttaagtaat ccatggacaa aaagtactca atagggctcg acatagggac
2100taactccgtt ggatgggccg tcatcaccga cgagtacaag gtgccctcca agaagttcaa
2160ggtgttggga aacaccgaca ggcacagcat aaagaagaat ttgatcggtg ccctcctctt
2220cgactccgga gagaccgctg aggctaccag gctcaagagg accgctagaa ggcgctacac
2280cagaaggaag aacagaatct gctacctgca ggagatcttc tccaacgaga tggccaaggt
2340ggacgactcc ttcttccacc gccttgagga atcattcctg gtggaggagg ataaaaagca
2400cgagagacac ccaatcttcg ggaacatcgt cgacgaggtg gcctaccatg aaaagtaccc
2460taccatctac cacctgagga agaagctggt cgactctacc gacaaggctg acttgcgctt
2520gatttacctg gctctcgctc acatgataaa gttccgcgga cacttcctca ttgagggaga
2580cctgaaccca gacaactccg acgtggacaa gctcttcatc cagctcgttc agacctacaa
2640ccagcttttc gaggagaacc caatcaacgc cagtggagtt gacgccaagg ctatcctctc
2700tgctcgtctg tcaaagtcca ggaggcttga gaacttgatt gcccagctgc ctggcgaaaa
2760gaagaacgga ctgttcggaa acttgatcgc tctctccctg ggattgactc ccaacttcaa
2820gtccaacttc gacctcgccg aggacgctaa gttgcagttg tctaaagaca cctacgacga
2880tgacctcgac aacttgctgg cccagatagg cgaccaatac gccgatctct tcctcgccgc
2940taagaacttg tccgacgcaa tcctgctgtc cgacatcctg agagtcaaca ctgagattac
3000caaagctcct ctgtctgctt ccatgattaa gcgctacgac gagcaccacc aagatctgac
3060cctgctcaag gccctggtga gacagcagct gcccgagaag tacaaggaga tctttttcga
3120ccagtccaag aacggctacg ccggatacat tgacggaggc gcctcccagg aagagttcta
3180caagttcatc aagcccatcc ttgagaagat ggacggtacc gaggagctgt tggtgaagtt
3240gaacagagag gacctgttga ggaagcagag aaccttcgac aacggaagca tccctcacca
3300aatccacctg ggagagctcc acgccatctt gaggaggcag gaggatttct atcccttcct
3360gaaggacaac cgcgagaaga ttgagaagat cttgaccttc agaattcctt actacgtcgg
3420gccactcgcc agaggaaact ctaggttcgc ctggatgacc cgcaaatctg aagagaccat
3480tactccctgg aacttcgagg aagtcgtgga caagggcgct tccgctcagt ctttcatcga
3540gaggatgacc aacttcgata aaaatctgcc caacgagaag gtgctgccca agcactccct
3600gttgtacgag tatttcacag tgtacaacga gctcaccaag gtgaagtacg tcacagaggg
3660aatgaggaag cctgccttct tgtccggaga gcagaagaag gccatcgtcg acctgctctt
3720caagaccaac aggaaggtga ctgtcaagca gctgaaggag gactacttca agaagatcga
3780gtgcttcgac tccgtcgaga tctctggtgt cgaggacagg ttcaacgcct cccttgggac
3840ttaccacgat ctgctcaaga ttattaaaga caaggacttc ctggacaacg aggagaacga
3900ggacatcctt gaggacatcg tgctcaccct gaccttgttc gaagacaggg aaatgatcga
3960agagaggctc aagacctacg cccacctctt cgacgacaag gtgatgaaac agctgaagag
4020acgcagatat accggctggg gaaggctctc ccgcaaattg atcaacggga tcagggacaa
4080gcagtcaggg aagactatac tcgacttcct gaagtccgac ggattcgcca acaggaactt
4140catgcagctc attcacgacg actccttgac cttcaaggag gacatccaga aggctcaggt
4200gtctggacag ggtgactcct tgcatgagca cattgctaac ttggccggct ctcccgctat
4260taagaagggc attttgcaga ccgtgaaggt cgttgacgag ctcgtgaagg tgatgggacg
4320ccacaagcca gagaacatcg ttattgagat ggctcgcgag aaccaaacta cccagaaagg
4380gcagaagaat tcccgcgaga ggatgaagcg cattgaggag ggcataaaag agcttggctc
4440tcagatcctc aaggagcacc ccgtcgagaa cactcagctg cagaacgaga agctgtacct
4500gtactacctc caaaacggaa gggacatgta cgtggaccag gagctggaca tcaacaggtt
4560gtccgactac gacgtcgacc acatcgtgcc tcagtccttc ctgaaggatg actccatcga
4620caataaagtg ctgacacgct ccgataaaaa tagaggcaag tccgacaacg tcccctccga
4680ggaggtcgtg aagaagatga aaaactactg gagacagctc ttgaacgcca agctcatcac
4740ccagcgtaag ttcgacaacc tgactaaggc tgagagagga ggattgtccg agctcgataa
4800ggccggattc atcaagagac agctcgtcga aacccgccaa attaccaagc acgtggccca
4860aattctggat tcccgcatga acaccaagta cgatgaaaat gacaagctga tccgcgaggt
4920caaggtgatc accttgaagt ccaagctggt ctccgacttc cgcaaggact tccagttcta
4980caaggtgagg gagatcaaca actaccacca cgcacacgac gcctacctca acgctgtcgt
5040tggaaccgcc ctcatcaaaa aatatcctaa gctggagtct gagttcgtct acggcgacta
5100caaggtgtac gacgtgagga agatgatcgc taagtctgag caggagatcg gcaaggccac
5160cgccaagtac ttcttctact ccaacatcat gaacttcttc aagaccgaga tcactctcgc
5220caacggtgag atcaggaagc gcccactgat cgagaccaac ggtgagactg gagagatcgt
5280gtgggacaaa gggagggatt tcgctactgt gaggaaggtg ctctccatgc ctcaggtgaa
5340catcgtcaag aagaccgaag ttcagaccgg aggattctcc aaggagtcca tcctccccaa
5400gagaaactcc gacaagctga tcgctagaaa gaaagactgg gaccctaaga agtacggagg
5460cttcgattct cctaccgtgg cctactctgt gctggtcgtg gccaaggtgg agaagggcaa
5520gtccaagaag ctgaaatccg tcaaggagct cctcgggatt accatcatgg agaggagttc
5580cttcgagaag aaccctatcg acttcctgga ggccaaggga tataaagagg tgaagaagga
5640cctcatcatc aagctgccca agtactccct cttcgagttg gagaacggaa ggaagaggat
5700gctggcttct gccggagagt tgcagaaggg aaatgagctc gcccttccct ccaagtacgt
5760gaacttcctg tacctcgcct ctcactatga aaagttgaag ggctctcctg aggacaacga
5820gcagaagcag ctcttcgtgg agcagcacaa gcactacctg gacgaaatta tcgagcagat
5880ctctgagttc tccaagcgcg tgatattggc cgacgccaac ctcgacaagg tgctgtccgc
5940ctacaacaag cacagggata agcccattcg cgagcaggct gaaaacatta tccacctgtt
6000taccctcaca aacttgggag cccctgctgc cttcaagtac ttcgacacca ccattgacag
6060gaagagatac acctccacca aggaggtgct cgacgcaaca ctcatccacc aatccatcac
6120cggcctctat gaaacaagga ttgacttgtc ccagctggga ggcgactcta gagccgatcc
6180caagaagaag agaaaggtgt aggttaacct agacttgtcc atcttctgga ttggccaact
6240taattaatgt atgaaataaa aggatgcaca catagtgaca tgctaatcac tataatgtgg
6300gcatcaaagt tgtgtgttat gtgtaattac tagttatctg aataaaagag aaagagatca
6360tccatatttc ttatcctaaa tgaatgtcac gtgtctttat aattctttga tgaaccagat
6420gcatttcatt aaccaaatcc atatacatat aaatattaat catatataat taatatcaat
6480tgggttagca aaacaaatct agtctaggtg tgttttgcga attcgatatc aagcttatcg
6540ataccgtcga gggggggccc ggtaccggcg cgccgttcta tagtgtcacc taaatcgtat
6600gtgtatgata cataaggtta tgtattaatt gtagccgcgt tctaacgaca atatgtccat
6660atggtgcact ctcagtacaa tctgctctga tgccgcatag ttaagccagc cccgacaccc
6720gccaacaccc gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca
6780agctgtgacc gtctccggga gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg
6840cgcgagacga aagggcctcg tgatacgcct atttttatag gttaatgtca tgaccaaaat
6900cccttaacgt gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc
6960ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct
7020accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg
7080cttcagcaga gcgcagatac caaatactgt ccttctagtg tagccgtagt taggccacca
7140cttcaagaac tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc
7200tgctgccagt ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga
7260taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac
7320gacctacacc gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga
7380agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag
7440ggagcttcca gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg
7500acttgagcgt cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag
7560caacgcggcc tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc
7620tgcgttatcc cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc
7680tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc
7740aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat taatgcaggt tgatcagatc
7800tcgatcccgc gaaattaata cgactcacta tagggagacc acaacggttt ccctctagaa
7860ataattttgt ttaactttaa gaaggagata tacccatgga aaagcctgaa ctcaccgcga
7920cgtctgtcga gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct
7980cggagggcga agaatctcgt gctttcagct tcgatgtagg agggcgtgga tatgtcctgc
8040gggtaaatag ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat
8100cggccgcgct cccgattccg gaagtgcttg acattgggga attcagcgag agcctgacct
8160attgcatctc ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc
8220ccgctgttct gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc
8280agacgagcgg gttcggccca ttcggaccgc aaggaatcgg tcaatacact acatggcgtg
8340atttcatatg cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca
8400ccgtcagtgc gtccgtcgcg caggctctcg atgagctgat gctttgggcc gaggactgcc
8460ccgaagtccg gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg
8520gccgcataac agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg
8580tcgccaacat cttcttctgg aggccgtggt tggcttgtat ggagcagcag acgcgctact
8640tcgagcggag gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca
8700ttggtcttga ccaactctat cagagcttgg ttgacggcaa tttcgatgat gcagcttggg
8760cgcagggtcg atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa
8820tcgcccgcag aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg
8880gaaaccgacg ccccagcact cgtccgaggg caaaggaata gtgaggtaca gcttggatcg
8940atccggctgc taacaaagcc cgaaaggaag ctgagttggc tgctgccacc gctgagcaat
9000aactagcata accccttggg gcctctaaac gggtcttgag gggttttttg ctgaaaggag
9060gaactatatc cggatgatcg ggcgcgccgg tac
90931244107DNAS. pyogenes 124atggataaga aatactcaat aggcttagat atcggcacaa
atagcgtcgg atgggcggtg 60atcactgatg attataaggt tccgtctaaa aagctcaagg
gtctgggaaa tacagaccgc 120cacggtatca aaaaaaatct tataggggct cttttatttg
acagtggaga gacagcggaa 180gcgactcgtc tcaaacggac agctcgtaga aggtatacac
gtcggaagaa tcgtatttgt 240tatctacagg agattttttc aaatgagatg gcgaaagtag
atgatagttt ctttcatcga 300cttgaagagt cttttttggt ggaagaagac aagaagcatg
aacgtcatcc tatttttgga 360aatatagtag atgaagttgc ttatcatgag aaatatccaa
ctatctatca tctgcgaaaa 420aaattggcag attctactga taaagtggat ttgcgcttaa
tctatttggc cttagcgcat 480atgattaagt ttcgtggtca ttttttgatt gagggagatt
taaatcctga taatagtgat 540gtggacaaac tatttatcca gttggtacaa acctacaatc
aattatttga agaaaaccct 600attaacgcaa gtagagtaga tgctaaagcg attctttctg
cacgattgag taaatcaaga 660cgattagaaa atctcattgc tcagctcccc ggtgagaaga
aaaatggatt gtttgggaat 720ctcattgctt tgtcattggg attgacccct aattttaaat
caaattttga tttggcagaa 780gatgctaaat tacagctttc aaaagatact tacgatgatg
atttagataa tttattggcg 840caaattggag atcaatatgc tgatttgttt ttggcagcta
agaatttatc agatgctact 900ttactttcag atatcctaag agtaaatagt gaaataacta
aggctcccct atcagcttca 960atgattaagc gctacgatga acatcatcaa gacttgactc
ttttaaaagc tttagttcga 1020caacaacttc cagaaaagta taaagaaatc ttttttgatc
aatcaaaaaa cggatatgca 1080ggttatattg atgggggagc tagccaagaa gaattttata
aatttatcaa accaatttta 1140gaaaaaatgg atggtactga ggaattattg gcgaaactaa
atcgtgaaga tttgctgcgc 1200aagcaacgga cctttgacaa cggctctatt ccctatcaaa
ttcacttggg tgagctgcat 1260gctattttga gaagacaaga agacttttat ccatttttaa
aagacaatcg tgagaagatt 1320gaaaaaatct tgacttttcg aattccttat tatgttggtc
cattggcgcg tggcaatagt 1380cgttttgcat ggatgactcg gaagtctgaa gaaacaatta
ccccatggaa ttttgaagaa 1440gttgtcgata aaggtgcttc agctcaatca tttattgaac
gcatgacaaa ctttgataaa 1500aatcttccaa atgaaaaagt actaccaaaa catagtttgc
tttatgagta ttttacggtt 1560tataacgaat tgacaaaagt caaatatgtt actgagggaa
tgcgaaaacc agcatttctt 1620tcaggtgaac agaagaaagc cattgttgat ttactcttca
aaacaaatcg aaaagtaacc 1680gttaagcaat taaaagaaga ttatttcaaa aaaatagaat
gttttgatag tgttgaaatt 1740tcaggagttg aagatagatt taatgcttca ttaggtacct
accatgattt gctaaaaatt 1800attaaagata aagatttttt ggataatgaa gaaaacgaag
atatcttaga ggatattgtt 1860ttaacattga ccttatttga agatagggag atgattgagg
aaagacttaa aacatatgct 1920cacctctttg atgataaggt gatgaaacag cttaaacgtc
gccgttatac tggttgggga 1980cgtttgtctc gaaaattgat taatggtatt agggataagc
aatctggcaa aacaatatta 2040gattttttga aatcagatgg ttttgccaat cgcaatttta
tgcagctgat ccatgatgat 2100agtttgacat ttaaagaaga cattcaaaaa gcacaagtgt
ctggacaagg cgatagttta 2160catgaacata ttgcaaattt agctggtagc cctgctatta
aaaaaggtat tttacagact 2220gtaaaagttg ttgatgaatt ggtcaaagta atggggcggc
ataagccaga aaatatcgtt 2280attgaaatgg cacgtgaaaa tcagacaact caaaagggcc
agaaaaattc gcgtgagcgt 2340atgaaacgta ttgaagaagg aataaaagaa ctaggaagtg
atattctaaa ggagtatcct 2400gttgaaaaca ctcaattaca aaatgaaaag ctctatctct
attatctcca aaatggaaga 2460gacatgtatg tggaccaaga attagatatt aatcgtttaa
gtgattatga tgtcgatcac 2520attgttccac aaagtttcct taaagacgat tcaatagaca
ataaggtctt aacgcgttct 2580gataaaaatc gtggtaaatc ggataacgtt ccaagtgaag
aagtagtcaa aaagatgaaa 2640aactattgga gacaacttct aaacgccaag ttaatcactc
aacgtaagtt tgataattta 2700acaaaagctg aacgtggagg tttgagtgaa cttgataaag
ttggttttat caaacgccaa 2760ttggttgaaa ctcgccaaat cactaagcat gtggcacaaa
ttttggatag tcgcatgaat 2820actaaatacg atgaaaatga taaacttatt cgagaggtta
gagtgattac cttaaaatct 2880aaattagttt ctgacttccg aaaagatttc caattctata
aagtacgtga gattaacaat 2940taccatcatg cccatgatgc gtatcttaat gccgtcgttg
gaactgcttt gattaagaaa 3000tatccaaaac ttgaatcgga gtttgtctat ggtgattata
aagtttatga tgttcgtaaa 3060atgattgcta agtctgagca ggaaataggc aaagcaaccg
caaaatattt cttttactct 3120aatatcatga acttcttcaa aacagaaatt acacttgcaa
atggagagat tcgcaaacgc 3180cctctaatcg aaactaatgg ggaaactgga gaaattgtct
gggataaagg gcgagatttt 3240gccacagtgc gcaaagtatt gtccatgccc caagtcaata
ttgtcaagaa aacagaagta 3300cagacaggcg gattctccaa ggagtcaatt ttaccaaaaa
gaaattcgga caagcttatt 3360gctcgtaaaa aagactggga tccaaaaaaa tatggtggtt
ttgatagtcc aacggtagct 3420tattcagtcc tagtggttgc taaggtggaa aaagggaaat
cgaagaagtt aaaatccgtt 3480aaagagttac tagggatcac aataatggaa agaagctctt
ttgaaaaaga tccgattgac 3540tttttagaag ctaaaggata taaggaagtt agaaaagact
taatcattaa actacctaaa 3600tatagtcttt ttgagttaga aaacggtcgt aaacggatgc
tggctagtgc cggagaattg 3660caaaaaggaa atgagctagc tctgccaagc aaatatgtga
attttttata tttagctagt 3720cattatgaaa agttgaaggg tagtccagaa gataacgaac
aaaaacaatt gtttgtggag 3780cagcataagc attatttaga tgagattatt gagcaaatca
gtgaattttc taagcgtgtt 3840attttagcag atgccaattt agataaagtt cttagtgcat
ataacaaaca tagagacaaa 3900ccaatacgtg aacaagcaga aaatattatt catttattta
cgttgacgaa tcttggagct 3960cccgctgctt ttaaatattt tgatacaaca attgatcgta
aacgatatac gtctacaaaa 4020gaagttttag atgccactct tatccatcaa tccatcactg
gtctttatga aacacgcatt 4080gatttgagtc agctaggagg tgactga
410712520DNAGlycine max 125ggaactgaca cacgacatga
2012620DNAGlycine max
126gacatgatgg aacgtgacta
2012720DNAGlycine max 127gtcccttgta cttgtacgta
2012820DNAGlycine max 128gtattctaga aaagaggaat
2012970DNAGlycine max
129atcaaaattc ggaactgaca cacgacatga tggaacgtga ctaaggtggg tttttgactt
60tgcatgtcga
7013070DNAGlycine max 130tcgacatgca aagtcaaaaa cccaccttag tcacgttcca
tcatgtcgtg tgtcagttcc 60gaattttgat
7013170DNAGlycine max 131ggcagactcc aattcctctt
ttctagaata ccctccgtac gtacaagtac aagggacttg 60tgagttgtaa
7013270DNAGlycine max
132ttacaactca caagtccctt gtacttgtac gtacggaggg tattctagaa aagaggaatt
60ggagtctgcc
7013371DNAGlycine max 133ctacactctt tccctacacg acgctcttcc gatctggaat
ttacagcaca agtagatcac 60ttgtacttat c
7113459DNAGlycine max 134caagcagaag acggcatacg
agctcttccg atctaaatca ctctcacttc gacatgcaa 5913571DNAGlycine max
135ctacactctt tccctacacg acgctcttcc gatctttcct ttacagcaca agtagatcac
60ttgtacttat c
7113668DNAGlycine max 136ctacactctt tccctacacg acgctcttcc gatctagctg
taaatacagc cttacaactc 60acaagtcc
6813763DNAArtificial sequencePrimer, DD43-A
137caagcagaag acggcatacg agctcttccg atctttaatt taggactaaa agaagaggca
60gac
6313868DNAArtificial sequencePrimer, DD43-S4 138ctacactctt tccctacacg
acgctcttcc gatctctagg taaatacagc cttacaactc 60acaagtcc
6813968DNAArtificial
sequencePrimer, DD43-S5 139ctacactctt tccctacacg acgctcttcc gatctgatcg
taaatacagc cttacaactc 60acaagtcc
6814043DNAArtificial sequencePrimer, JKY557
140aatgatacgg cgaccaccga gatctacact ctttccctac acg
4314118DNAArtificial sequencePrimer, JKY558 141caagcagaag acggcata
18142117DNAArtificial
sequenceDD20CR1 PCR amplicon 142ggaatttaca gcacaagtag atcacttgta
cttatcaaaa ttcggaactg acacacgaca 60tgatggaacg tgactaaggt gggtttttga
ctttgcatgt cgaagtgaga gtgattt 117143117DNAArtificial
sequenceDD20CR2 PCR amplicon 143ttcctttaca gcacaagtag atcacttgta
cttatcaaaa ttcggaactg acacacgaca 60tgatggaacg tgactaaggt gggtttttga
ctttgcatgt cgaagtgaga gtgattt 117144108DNAArtificial
sequenceDD43CR1 PCR amplicon 144agctgtaaat acagccttac aactcacaag
tcccttgtac ttgtacgtac ggagggtatt 60ctagaaaaga ggaattggag tctgcctctt
cttttagtcc taaattaa 108145108DNAArtificial
sequenceDD43CR2 PCR amplicon 145ctaggtaaat acagccttac aactcacaag
tcccttgtac ttgtacgtac ggagggtatt 60ctagaaaaga ggaattggag tctgcctctt
cttttagtcc taaattaa 108146108DNAArtificial
sequenceamplicon 146ctaggtaaat acagccttac aactcacaag tcccttgtac
ttgtacgtac ggagggtatt 60ctagaaaaga ggaattggag tctgcctctt cttttagtcc
taaattaa 108147101DNAArtificial sequenceDD20CR1 mutant
target site 147ggaatttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg
acacacgatg 60atggaacgtg actaaggtgg gtttttgact ttgcatgtcg a
101148101DNAArtificial sequenceDD20CR1 mutant target site
148ggaatttaca gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacacgatg
60gaacgtgact aaggtgggtt tttgactttg catgtcgaag t
101149101DNAArtificial sequenceDD20CR1 mutant target site 149ggaatttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacacgact 60gatggaacgt
gactaaggtg ggtttttgac tttgcatgtc g
101150101DNAArtificial sequenceDD20CR1 mutant target site 150ggaatttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacacatgg 60aacgtgacta
aggtgggttt ttgactttgc atgtcgaagt g
101151101DNAArtificial sequenceDD20CR1 mutant target site 151ggaatttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacatgatg 60gaacgtgact
aaggtgggtt tttgactttg catgtcgaag t
101152101DNAArtificial sequenceDD20CR1 mutant target site 152ggaatttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacagacat 60gatggaacgt
gactaaggtg ggtttttgac tttgcatgtc g
101153101DNAArtificial sequenceDD20CR1 mutant target site 153ggaatttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacgacatg 60atggaacgtg
actaaggtgg gtttttgact ttgcatgtcg a
101154101DNAArtificial sequenceDD20CR1 mutant target site i 154ggaatttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacaagaaa 60tgatggaacg
tgactaaggt gggtttttga ctttgcatgt c
101155101DNAArtificial sequenceDD20CR1 mutant target site 155ggaatttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacacgatt 60gaacgtgact
aaggtgggtt tttgactttg catgtcgaag t
101156101DNAArtificial sequenceDD20CR1 mutant target site 156ggaatttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacacattg 60aacgtgacta
aggtgggttt ttgactttgc atgtcgaagt g
101157101DNAArtificial sequenceDD20CR2 mutant target site 157ttcctttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacacgaca 60tgatggaacg
tctaaggtgg gtttttgact ttgcatgtcg a
101158101DNAArtificial sequenceDD20CR2 mutant target site 158ttcctttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacacgaca 60tgatggaacc
taaggtgggt ttttgacttt gcatgtcgaa g
101159101DNAArtificial sequenceDD20CR2 mutant target site 159ttcctttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacacgaca 60tgatggaacg
tgactaggtg ggtttttgac tttgcatgtc g
101160101DNAArtificial sequenceDD20CR2 mutant target site 160ttcctttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacacgaca 60tgatggaact
aaggtgggtt tttgactttg catgtcgaag t
101161101DNAArtificial sequenceDD20CR2 mutant target site 161ttcctttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacacgaca 60tgatggaacg
aaggtgggtt tttgactttg catgtcgaag t
101162101DNAArtificial sequenceDD20CR2 mutant target site 162ttcctttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacacgaca 60tgatggaagg
tgggtttttg actttgcatg tcgaagtgag a
101163101DNAArtificial sequenceDD20CR2 mutant target site i 163ttcctttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacacgaca 60tgatggacgt
gactaaggtg ggtttttgac tttgcatgtc g
101164101DNAArtificial sequenceDD20CR2 mutant target site 164ttcctttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacacgaca 60tgatggaact
ttactaaggt gggtttttga ctttgcatgt c
101165101DNAArtificial sequenceDD20CR2 mutant target site 165ttcctttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacacgaca 60tgatggaacg
tgacaaggtg ggtttttgac tttgcatgtc g
101166101DNAArtificial sequenceDD20CR2 mutant target site 166ttcctttaca
gcacaagtag atcacttgta cttatcaaaa ttcggaactg acacactaca 60ttatttaact
ttactaaggt gggtttttga ctttgcatgt c
101167108DNAArtificial sequenceDD43CR1 mutant target site 167agctgtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagaaaaga
ggaattggag tctgcctctt cttttagtcc taaattaa
108168101DNAArtificial sequenceDD43CR1 mutant target site 168agctgtaaat
acagccttac aactcacaag tcccttgtac ggagggtatt ctagaaaaga 60ggaattggag
tctgcctctt cttttagtcc taaattaaag a
101169101DNAArtificial sequenceDD43CR1 mutant target site 169agctgtaaat
acagccttac aactcacaag tcccttgtac ttgtacggag ggtattctag 60aaaagaggaa
ttggagtctg cctcttcttt tagtcctaaa t
101170101DNAArtificial sequenceDD43CR1 mutant target site 170agctgtaaat
acagccttac aactcacaag tcccttacgg agggtattct agaaaagagg 60aattggagtc
tgcctcttct tttagtccta aattaaagat c
101171101DNAArtificial sequenceDD43CR1 mutant target site 171agctgtaaat
acagccttac aactcacaag tcccttgtac ttgtaccgta cggagggtat 60tctagaaaag
aggaattgga gtctgcctct tcttttagtc c
101172101DNAArtificial sequenceDD43CR1 mutant target site 172agctgtaaat
acagccttac aactcacaag tcccttgtac tgtacggagg gtattctaga 60aaagaggaat
tggagtctgc ctcttctttt agtcctaaat t
101173101DNAArtificial sequenceDD43CR1 mutant target site 173agctgtaaat
acagccttac aactcacaag tcccttgtag tacggagggt attctagaaa 60agaggaattg
gagtctgcct cttcttttag tcctaaatta a
101174101DNAArtificial sequenceDD43CR1 mutant target site 174agctgtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtag ggtattctag 60aaaagaggaa
ttggagtctg cctcttcttt tagtcctaaa t
101175100DNAArtificial sequenceDD43CR1 mutant target site 175agctgtaaat
acagccttac aactcacaag tcctacactc tttccctaca cgacgctctt 60cttttagtcc
taaattaaag atcggaagat ctcgtatgcc
100176101DNAArtificial sequenceDD43CR1 mutant target site 176agctgtaaat
acagccttac aactcacaag tcccttgtac ttgtacctta cggagggtat 60tctagaaaag
aggaattgga gtctgcctct tcttttagtc c
101177101DNAArtificial sequenceDD43CR2 mutant target site 177ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagaaaatt
ggagtctgcc tcttctttta gtcctaaatt a
101178101DNAArtificial sequenceDD43CR2 mutant target site 178ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagaaaaga
attggagtct gcctcttctt ttagtcctaa a
101179101DNAArtificial sequenceDD43CR2 mutant target site 179ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagaattgg
agtctgcctc ttcttttagt cctaaattaa a
101180101DNAArtificial sequenceDD43CR2 mutant target site 180ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagaaaaga
aattggagtc tgcctcttct tttagtccta a
101181101DNAArtificial sequenceDD43CR2 mutant target site 181ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagaaaaat
tggagtctgc ctcttctttt agtcctaaat t
101182101DNAArtificial sequenceDD43CR2 mutant target site 182ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagaaaaga
ggattggagt ctgcctcttc ttttagtcct a
101183101DNAArtificial sequenceDD43CR2 mutant target site 183ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagaaattg
gagtctgcct cttcttttag tcctaaatta a
101184101DNAArtificial sequenceDD43CR2 mutant target site 184ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctattggagt
ctgcctcttc ttttagtcct aaattaaaga t
101185101DNAArtificial sequenceDD43CR2 mutant target site 185ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagtctgcc
tcttctttta gtcctaaatt aaagatcgga a
101186101DNAArtificial sequenceDD43CR2 mutant target site 186ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagaaaagt
ctgcctcttc ttttagtcct aaattaaaga t
101187101DNAArtificial sequenceDD43CR2 mutant target site 187ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagaaaaga
gaattggagt ctgcctcttc ttttagtcct a
101188101DNAArtificial sequenceDD43CR2 mutant target site 188ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagaaaaga
ggagtctgcc tcttctttta gtcctaaatt a
101189101DNAArtificial sequenceDD43CR2 mutant target site 189ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctaattggag
tctgcctctt cttttagtcc taaattaaag a
101190101DNAArtificial sequenceDD43CR2 mutant target site 190ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagaaaaga
ggaaattgga gtctgcctct tcttttagtc c
101191101DNAArtificial sequenceDD43CR2 mutant target site 191ctaggtaaat
acagccttac aactcacaag tcccttgtac ttgtacgtac ggagggtatt 60ctagaaagag
gaattggagt ctgcctcttc ttttagtcct a
1011921377PRTArtificial sequencemaize optimized moCAS9 endonuclease
192Met Ala Pro Lys Lys Lys Arg Lys Val Met Asp Lys Lys Tyr Ser Ile 1
5 10 15 Gly Leu Asp Ile
Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp 20
25 30 Glu Tyr Lys Val Pro Ser Lys Lys Phe
Lys Val Leu Gly Asn Thr Asp 35 40
45 Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe
Asp Ser 50 55 60
Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg 65
70 75 80 Tyr Thr Arg Arg Lys
Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser 85
90 95 Asn Glu Met Ala Lys Val Asp Asp Ser Phe
Phe His Arg Leu Glu Glu 100 105
110 Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile
Phe 115 120 125 Gly
Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile 130
135 140 Tyr His Leu Arg Lys Lys
Leu Val Asp Ser Thr Asp Lys Ala Asp Leu 145 150
155 160 Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile
Lys Phe Arg Gly His 165 170
175 Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys
180 185 190 Leu Phe
Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn 195
200 205 Pro Ile Asn Ala Ser Gly Val
Asp Ala Lys Ala Ile Leu Ser Ala Arg 210 215
220 Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala
Gln Leu Pro Gly 225 230 235
240 Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly
245 250 255 Leu Thr Pro
Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys 260
265 270 Leu Gln Leu Ser Lys Asp Thr Tyr
Asp Asp Asp Leu Asp Asn Leu Leu 275 280
285 Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala
Ala Lys Asn 290 295 300
Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu 305
310 315 320 Ile Thr Lys Ala
Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu 325
330 335 His His Gln Asp Leu Thr Leu Leu Lys
Ala Leu Val Arg Gln Gln Leu 340 345
350 Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn
Gly Tyr 355 360 365
Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe 370
375 380 Ile Lys Pro Ile Leu
Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val 385 390
395 400 Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
Gln Arg Thr Phe Asp Asn 405 410
415 Gly Ser Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile
Leu 420 425 430 Arg
Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys 435
440 445 Ile Glu Lys Ile Leu Thr
Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu 450 455
460 Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr
Arg Lys Ser Glu Glu 465 470 475
480 Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser
485 490 495 Ala Gln
Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro 500
505 510 Asn Glu Lys Val Leu Pro Lys
His Ser Leu Leu Tyr Glu Tyr Phe Thr 515 520
525 Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val Thr
Glu Gly Met Arg 530 535 540
Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu 545
550 555 560 Leu Phe Lys
Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp 565
570 575 Tyr Phe Lys Lys Ile Glu Cys Phe
Asp Ser Val Glu Ile Ser Gly Val 580 585
590 Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp
Leu Leu Lys 595 600 605
Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile 610
615 620 Leu Glu Asp Ile
Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met 625 630
635 640 Ile Glu Glu Arg Leu Lys Thr Tyr Ala
His Leu Phe Asp Asp Lys Val 645 650
655 Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg
Leu Ser 660 665 670
Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile
675 680 685 Leu Asp Phe Leu
Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln 690
695 700 Leu Ile His Asp Asp Ser Leu Thr
Phe Lys Glu Asp Ile Gln Lys Ala 705 710
715 720 Gln Val Ser Gly Gln Gly Asp Ser Leu His Glu His
Ile Ala Asn Leu 725 730
735 Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val
740 745 750 Val Asp Glu
Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile 755
760 765 Val Ile Glu Met Ala Arg Glu Asn
Gln Thr Thr Gln Lys Gly Gln Lys 770 775
780 Asn Ser Arg Glu Arg Met Lys Arg Ile Glu Glu Gly Ile
Lys Glu Leu 785 790 795
800 Gly Ser Gln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln
805 810 815 Asn Glu Lys Leu
Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr 820
825 830 Val Asp Gln Glu Leu Asp Ile Asn Arg
Leu Ser Asp Tyr Asp Val Asp 835 840
845 His Ile Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp
Asn Lys 850 855 860
Val Leu Thr Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val Pro 865
870 875 880 Ser Glu Glu Val Val
Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu 885
890 895 Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
Asp Asn Leu Thr Lys Ala 900 905
910 Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys
Arg 915 920 925 Gln
Leu Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu 930
935 940 Asp Ser Arg Met Asn Thr
Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg 945 950
955 960 Glu Val Lys Val Ile Thr Leu Lys Ser Lys Leu
Val Ser Asp Phe Arg 965 970
975 Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His
980 985 990 Ala His
Asp Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys 995
1000 1005 Lys Tyr Pro Lys Leu
Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys 1010 1015
1020 Val Tyr Asp Val Arg Lys Met Ile Ala Lys
Ser Glu Gln Glu Ile 1025 1030 1035
Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn
1040 1045 1050 Phe Phe
Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys 1055
1060 1065 Arg Pro Leu Ile Glu Thr Asn
Gly Glu Thr Gly Glu Ile Val Trp 1070 1075
1080 Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val
Leu Ser Met 1085 1090 1095
Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly 1100
1105 1110 Phe Ser Lys Glu Ser
Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu 1115 1120
1125 Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
Lys Tyr Gly Gly Phe 1130 1135 1140
Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val
1145 1150 1155 Glu Lys
Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu 1160
1165 1170 Gly Ile Thr Ile Met Glu Arg
Ser Ser Phe Glu Lys Asn Pro Ile 1175 1180
1185 Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys
Lys Asp Leu 1190 1195 1200
Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly 1205
1210 1215 Arg Lys Arg Met Leu
Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn 1220 1225
1230 Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
Phe Leu Tyr Leu Ala 1235 1240 1245
Ser His Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln
1250 1255 1260 Lys Gln
Leu Phe Val Glu Gln His Lys His Tyr Leu Asp Glu Ile 1265
1270 1275 Ile Glu Gln Ile Ser Glu Phe
Ser Lys Arg Val Ile Leu Ala Asp 1280 1285
1290 Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys
His Arg Asp 1295 1300 1305
Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe Thr 1310
1315 1320 Leu Thr Asn Leu Gly
Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr 1325 1330
1335 Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
Lys Glu Val Leu Asp 1340 1345 1350
Ala Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg
1355 1360 1365 Ile Asp
Leu Ser Gln Leu Gly Gly Asp 1370 1375
1936677DNAArtificial sequencemaize optimized moCAS9 endonuclease
193ctgcagtgca gcgtgacccg gtcgtgcccc tctctagaga taatgagcat tgcatgtcta
60agttataaaa aattaccaca tatttttttt gtcacacttg tttgaagtgc agtttatcta
120tctttataca tatatttaaa ctttactcta cgaataatat aatctatagt actacaataa
180tatcagtgtt ttagagaatc atataaatga acagttagac atggtctaaa ggacaattga
240gtattttgac aacaggactc tacagtttta tctttttagt gtgcatgtgt tctccttttt
300ttttgcaaat agcttcacct atataatact tcatccattt tattagtaca tccatttagg
360gtttagggtt aatggttttt atagactaat ttttttagta catctatttt attctatttt
420agcctctaaa ttaagaaaac taaaactcta ttttagtttt tttatttaat aatttagata
480taaaatagaa taaaataaag tgactaaaaa ttaaacaaat accctttaag aaattaaaaa
540aactaaggaa acatttttct tgtttcgagt agataatgcc agcctgttaa acgccgtcga
600cgagtctaac ggacaccaac cagcgaacca gcagcgtcgc gtcgggccaa gcgaagcaga
660cggcacggca tctctgtcgc tgcctctgga cccctctcga gagttccgct ccaccgttgg
720acttgctccg ctgtcggcat ccagaaattg cgtggcggag cggcagacgt gagccggcac
780ggcaggcggc ctcctcctcc tctcacggca cggcagctac gggggattcc tttcccaccg
840ctccttcgct ttcccttcct cgcccgccgt aataaataga caccccctcc acaccctctt
900tccccaacct cgtgttgttc ggagcgcaca cacacacaac cagatctccc ccaaatccac
960ccgtcggcac ctccgcttca aggtacgccg ctcgtcctcc cccccccccc tctctacctt
1020ctctagatcg gcgttccggt ccatgcatgg ttagggcccg gtagttctac ttctgttcat
1080gtttgtgtta gatccgtgtt tgtgttagat ccgtgctgct agcgttcgta cacggatgcg
1140acctgtacgt cagacacgtt ctgattgcta acttgccagt gtttctcttt ggggaatcct
1200gggatggctc tagccgttcc gcagacggga tcgatttcat gatttttttt gtttcgttgc
1260atagggtttg gtttgccctt ttcctttatt tcaatatatg ccgtgcactt gtttgtcggg
1320tcatcttttc atgctttttt ttgtcttggt tgtgatgatg tggtctggtt gggcggtcgt
1380tctagatcgg agtagaattc tgtttcaaac tacctggtgg atttattaat tttggatctg
1440tatgtgtgtg ccatacatat tcatagttac gaattgaaga tgatggatgg aaatatcgat
1500ctaggatagg tatacatgtt gatgcgggtt ttactgatgc atatacagag atgctttttg
1560ttcgcttggt tgtgatgatg tggtgtggtt gggcggtcgt tcattcgttc tagatcggag
1620tagaatactg tttcaaacta cctggtgtat ttattaattt tggaactgta tgtgtgtgtc
1680atacatcttc atagttacga gtttaagatg gatggaaata tcgatctagg ataggtatac
1740atgttgatgt gggttttact gatgcatata catgatggca tatgcagcat ctattcatat
1800gctctaacct tgagtaccta tctattataa taaacaagta tgttttataa ttattttgat
1860cttgatatac ttggatgatg gcatatgcag cagctatatg tggatttttt tagccctgcc
1920ttcatacgct atttatttgc ttggtactgt ttcttttgtc gatgctcacc ctgttgtttg
1980gtgttacttc tgcaggtcga ctctagagga tccccatggc cccgaagaag aagaggaagg
2040tgcacatgga taagaagtac agcatcggcc tcgacatcgg gaccaacagc gtcggctggg
2100ccgtcatcac cgacgaatat aaggtgccca gcaagaagtt caaggtgctc gggaatacag
2160accgccacag catcaagaag aacctgatcg gcgccctcct gttcgactcg ggcgagaccg
2220ctgaggccac cagactaaag aggaccgctc gccgccgcta cacccgccgc aagaaccgca
2280tatgctacct ccaggagatc ttcagcaacg agatggccaa ggtggacgac agcttcttcc
2340accgccttga ggagtcgttc ctcgtggagg aggacaagaa gcatgagagg cacccgatct
2400tcgggaacat cgtggacgag gtaagtttct gcttctacct ttgatatata tataataatt
2460atcattaatt agtagtaata taatatttca aatatttttt tcaaaataaa agaatgtagt
2520atatagcaat tgcttttctg tagtttataa gtgtgtatat tttaatttat aacttttcta
2580atatatgacc aaaacatggt gatgtgcagg tggcgtacca cgagaagtac ccgacgatct
2640accacctccg caagaagctg gtcgactcca cagacaaggc cgacctcaga ctgatctacc
2700tggccctcgc gcacatgatc aagttccgcg ggcacttcct catcgagggc gacctgaacc
2760cggacaactc cgacgtcgac aagctcttca tccagctggt ccagacctac aatcaactgt
2820tcgaggagaa cccgatcaac gcgtccggcg tggacgcgaa ggccatcctc agcgcgaggc
2880tcagcaaatc aagacggctg gagaacctga tcgcccagct cccaggcgag aagaaaaacg
2940gcttgttcgg caacctgatc gcgctctcgc tcggcctcac gcccaacttc aaatcaaact
3000tcgacctggc cgaggacgcg aaactgcagc tgtccaagga cacttacgac gacgacctcg
3060acaacctgct ggcgcaaatc ggtgaccagt acgcagacct cttcctggcc gccaagaacc
3120tctcggacgc catcctgctg tccgatatcc tgagagtgaa tacggagatc accaaggcgc
3180cgctcagcgc ctccatgatt aaaaggtacg acgagcacca ccaggacctg acgctgctca
3240aggccctggt gcgccagcag ctccccgaga agtacaagga gatcttcttc gaccaatcaa
3300aaaacggcta cgccggctac atcgacgggg gcgcctccca ggaggagttc tacaagttca
3360tcaaaccaat tctcgagaag atggacggca cggaggagct tctcgtgaag ctcaaccggg
3420aggacctcct gaggaagcag aggacgttcg acaacggctc gataccgcat cagatccacc
3480tgggcgagct ccacgccatc ctgcgccggc aggaggattt ctatccgttc ctcaaggaca
3540acagggagaa gatcgagaaa attctgacgt tccgcatccc gtactacgtg ggccctctcg
3600cgcgcgggaa cagccggttc gcctggatga ctcggaagtc ggaggagacg atcacgccgt
3660ggaacttcga ggaggtggtg gacaagggcg cctccgccca gtcgttcatc gagcgcatga
3720cgaacttcga taaaaatctg cccaatgaaa aagtgctccc gaagcacagc ctcctctacg
3780agtacttcac ggtgtacaac gagctcacga aggtgaagta cgtgaccgag ggtatgcgga
3840agccggcgtt cctgagcggc gagcagaaga aggccatcgt ggacctcctc ttcaagacga
3900accggaaagt caccgtgaag caattaaagg aggactactt caagaaaata gagtgcttcg
3960acagcgtcga gatctcgggc gtcgaggaca ggttcaacgc gtcgctgggc acataccacg
4020acctcctcaa gatcattaaa gacaaggact tcctggacaa cgaggagaac gaggacatcc
4080tcgaggacat cgtgctgacc ctcaccctgt ttgaggaccg ggagatgatc gaggagcgcc
4140tcaagacgta cgctcacctt ttcgacgaca aggtgatgaa acagctgaag cggcgccgct
4200acaccggatg gggccggctc tcccgcaagc tcattaatgg gatcagggac aagcagtccg
4260gcaagaccat actcgatttc ctgaagagcg acggcttcgc caaccggaac ttcatgcagc
4320tcatccacga cgactccctc actttcaagg aggacatcca gaaggcccag gtcagcggac
4380agggcgactc gctccacgaa cacatcgcca acctggccgg gtcgcctgcg attaaaaagg
4440gaatccttca gaccgtcaag gtcgtggacg agctggtgaa ggtgatgggc aggcacaagc
4500ccgaaaatat cgtcattgag atggcccggg agaaccagac cacgcagaaa ggccagaaga
4560acagccggga gcgcatgaaa cggatcgagg agggtatcaa ggagctgggc tcgcagatcc
4620tcaaggagca ccctgtggaa aatacccagc tgcagaatga aaagctctac ctctactacc
4680tccagaacgg ccgcgacatg tacgtggacc aggagctgga cattaatcgc ctctcggact
4740acgacgtcga ccacatcgtc ccgcagtcct tcctgaagga cgacagcatc gacaacaagg
4800tcttgacccg ctccgataaa aatcgcggga agtccgacaa cgtgccgtcg gaggaggtgg
4860tcaagaagat gaaaaactac tggcgccagc tgctcaacgc caagctaatc acgcagcgca
4920agttcgacaa cctcaccaag gccgaacgcg gcggtctctc cgagcttgat aaggctgggt
4980tcatcaagag acagctggtg gagacccggc agatcaccaa gcatgtcgcc cagatcctgg
5040actcgcgcat gaatactaag tacgatgaaa acgacaagct catccgcgag gtgaaggtga
5100tcaccctgaa gagcaagctg gtctcggact tccggaagga cttccagttc tacaaggtcc
5160gggagatcaa caactaccac cacgcgcacg acgcctacct gaacgcggtg gtgggcacag
5220cccttataaa gaagtaccct aagctcgagt ccgagttcgt gtacggcgac tacaaggtgt
5280acgacgtccg caagatgatc gcgaagagcg agcaggagat cgggaaggcc accgcaaaat
5340acttcttcta ctccaacatc atgaacttct tcaagaccga gatcaccctg gccaacgggg
5400agatccgcaa gcgcccgctg attgagacga acggagagac aggcgagata gtctgggaca
5460agggcaggga cttcgccacc gtgcgcaagg ttctgtccat gccgcaggtg aacatcgtga
5520agaagactga ggtgcagaca ggcggcttct cgaaggagtc catcctgccc aagcggaaca
5580gcgacaagct catcgcgcgg aagaaggact gggaccctaa aaaatatggc gggttcgact
5640cgcccaccgt ggcttactcg gtcctcgtgg tggccaaggt cgagaagggc aaaagcaaga
5700agctgaagag cgtcaaggag ctcctcggca tcaccatcat ggagcggtcc agcttcgaga
5760agaacccgat cgacttcctc gaggcgaagg gatataagga ggtgaagaag gacctcatca
5820ttaaactgcc gaagtactcg ctattcgaac tggagaatgg tcgcaagagg atgctcgcga
5880gcgctggcga gctgcagaaa gggaacgagc tggctctccc gagcaagtac gtcaacttcc
5940tctacctggc ctcccactat gaaaagctca agggctcgcc ggaggacaac gagcagaagc
6000agctgttcgt cgagcagcac aagcattacc tcgacgagat catcgagcag atctcggagt
6060tcagcaagcg cgtgatcctg gccgacgcca acctcgacaa ggtgctgtcc gcatataaca
6120agcaccgcga caaaccaata cgggagcagg ccgaaaatat catccacctg ttcaccctca
6180cgaacctggg cgcccccgcc gcgttcaagt acttcgacac aaccatcgac cgcaagcggt
6240acacgagcac gaaggaggtg ctggacgcca cgttgattca ccagtccatc acgggcctgt
6300atgaaacaag gatcgatctc agccagctcg gcggcgacta ggtaccacat ggttaaccta
6360gacttgtcca tcttctggat tggccaactt aattaatgta tgaaataaaa ggatgcacac
6420atagtgacat gctaatcact ataatgtggg catcaaagtt gtgtgttatg tgtaattact
6480agttatctga ataaaagaga aagagatcat ccatatttct tatcctaaat gaatgtcacg
6540tgtctttata attctttgat gaaccagatg catttcatta accaaatcca tatacatata
6600aatattaatc atatataatt aatatcaatt gggttagcaa aacaaatcta gtctaggtgt
6660gttttgcgaa ttgcggc
6677
194 100 DNA Artificial sequence DNA version of guide RNA (EPSPS sgRNA)
194gcagtaacag ctgctgtcaa gttttagagc tagaaatagc aagttaaaat aaggctagtc
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgctttt
100
195 3708 DNA Artificial sequence EPSPS polynucleotide template
195ctgcagccca tcaaggagat ctccggcacc gtcaagctgc cggggtccaa gtcgctttcc
60aacaggatcc tcctgctcgc cgccctgtcc gaggtgagcg attttggtgc ttgctgcgct
120gccctgtctc actgctacct aaatgttttg cctgtcgaat accatggatt ctcggtgtaa
180tccatctcac gatcagatgc accgcatgtc gcatgcctag ctctctctaa tttgtctagt
240agtttgtata cggattaaga ttgataaatc ggtaccgcaa aagctaggtg taaataaaca
300ctacaaaatt ggatgttccc ctatcggcct gtactcggct actcgttctt gtgatggcat
360gttatttctt cttggtgttt ggtgaactcc cttatgaaat ttgggcgcaa agaaatcgcc
420ctcaagggtt gatcttatgc catcgtcatg ataaacagtg aagcacggat gatcctttac
480gttgttttta acaaactttg tcagaaaact agcaatgtta acttcttaat gatgatttca
540caacaaaaaa ggtaaccttg ctactaacat aacaaaagac ttgttgctta ttaattatat
600gtttttttaa tctttgatca ggggacaaca gtggttgata acctgttgaa cagtgaggat
660gtccactaca tgctcggggc cttgaggact cttggtctct ctgtcgaagc ggacaaagct
720gccaaaagag ctgtagttgt tggctgtggt ggaaagttcc cagttgagga tgctaaagag
780gaagtgcagc tcttcttggg gaatgctgga atcgcaatgc ggtcattgac agcagctgtt
840actgctgctg gtggaaatgc aacgtatgtt tcctctctct ctctacaata cttgttggag
900ttagtatgaa acccatgtgt atgtctagtg gcttatggtg tattggtttt tgaacttcag
960ttacgtgctt gatggagtac caagaatgag ggagagaccc attggcgact tggttgtcgg
1020attgaagcag cttggtgcag atgttgattg tttccttggc actgactgcc cacctgttcg
1080tgtcaatgga atcggagggc tacctggtgg caaggttagt tactaagggc cacatgttac
1140attcttctgt aaatggtaca actattgtcg agcttttgca tttgtaagga aaacattgat
1200tgatctgaat ttgatgctac accacaaaat atctacaaat ggtcatccct aactagcaaa
1260ccatgtctcc attaagctca atgaagtaat acttggcatg tgtttatcaa cttaatttcc
1320atcttctggg gtattgcctg ttttctagtc taatagcatt tgtttttaga attagctctt
1380acaactgtta tgttctacag gtcaagctgt ctggctccat cagcagtcag tacttgagtg
1440ccttgctgat ggctgctcct ttggctcttg gggatgtgga gattgaaatc attgataaat
1500taatctccat tccctacgtc gaaatgacat tgagattgat ggagcgtttt ggtgtgaaag
1560cagagcattc tgatagctgg gacagattct acattaaggg aggtcaaaaa tacaagtaag
1620ctctgtaatg tatttcacta ctttgatgcc aatgtttcag ttttcagttt tccaaacagt
1680cgcatcaata tttgaataga tgcactgtag aaaaaaatca ttgcagggaa aaactagtac
1740tgagtatttt gactgtaaat tatttaacca gtcggaatat agtcagtcta ttggagtcaa
1800gagcgtgaac cgaaatagcc agttaattat cccattatac agaggacaac catgtatact
1860attgaaactt ggtttaagag aatctaggta gctggactcg tagctgcttg gcatggatac
1920cttcttatct ttaggaaaag acacttgatt ttttttctgt ggccctctat gatgtgtgaa
1980cctgcttctc tattgcttta gaaggatata tctatgtcgt tatgcaacat gcttccctta
2040gtcatttgta ctgaaatcag tttcataagt tcgttagtgg ttccctaaac gaaaccttgt
2100ttttctttgc aatcaacagg tcccctaaaa atgcctatgt tgaaggtgat gcctcaagcg
2160caagctattt cttggctggt gctgcaatta ctggagggac tgtgactgtg gaaggttgtg
2220gcaccaccag tttgcaggta aagatttctt ggctggtgct acgataactg cttttgtctt
2280tttggtttca gcattgttct cagagtcact aaataacatt atcatctgca aacgtcaaat
2340agacatactt aggtgaatgg atattcatgt aaccgtttcc ttacaaattt gctgaaacct
2400cagggtgatg tgaagtttgc tgaggtactg gagatgatgg gagcgaaggt tacatggacc
2460gagactagcg taactgttac tggcccaccg cgggagccat ttgggaggaa acacctcaag
2520gcgattgatg tcaacatgaa caagatgcct gatgtcgcca tgactcttgc tgtggttgcc
2580ctctttgccg atggcccgac agccatcaga gacggtaaaa cattctcagc cctacaacca
2640tgcctcttct acatcactac ttgacaagac taaaaactat tggctcgttg gcagtggctt
2700cctggagagt aaaggagacc gagaggatgg ttgcgatccg gacggagcta accaaggtaa
2760ggctacatac ttcacatgtc tcacgtcgtc tttccatagc tcgctgcctc ttagcggctt
2820gcctgcggtc gctccatcct cggttgctgt ctgtgttttc cacagctggg agcatctgtt
2880gaggaagggc cggactactg catcatcacg ccgccggaga agctgaacgt gacggcgatc
2940gacacgtacg acgaccacag gatggccatg gccttctccc ttgccgcctg tgccgaggtc
3000cccgtgacca tccgggaccc tgggtgcacc cggaagacct tccccgacta cttcgatgtg
3060ctgagcactt tcgtcaagaa ttaataaagc gtgcgatact accacgcagc ttgattgaag
3120tgataggctt gtgctgagga aatacatttc ttttgttctg ttttttctct ttcacgggat
3180taagttttga gtctgtaacg ttagttgttt gtagcaagtt tctatttcgg atcttaagtt
3240tgtgcactgt aagccaaatt tcatttcaag agtggttcgt tggaataata agaataataa
3300attacgtttc agtggctgtc aagcctgctg ctacgtttta ggagatggca ttagacattc
3360atcatcaaca acaataaaac cttttagcct caaacaataa tagtgaagtt attttttagt
3420cctaaacaag ttgcattagg atatagttaa aacacaaaag aagctaaagt tagggtttag
3480acatgtggat attgttttcc atgtatagta tgttctttct ttgagtctca tttaactacc
3540tctacacata ccaactttag ttttttttct acctcttcat gttactatgg tgccttctta
3600tcccactgag cattggtata tttagaggtt tttgttgaac atgcctaaat catctcaatc
3660aacgatggac aatcttttct tcgattgagc tgaggtacgt catctaga
3708
196 15 DNA Artificial sequence TIPS nucleotide modifications
196atcgcaatgc ggtca
15
197 19 DNA Artificial sequence Primer Sequence-1 F-E2
197ccgaggagat cgtgctgca
19
198 20 DNA Artificial sequence Primer Sequence-2 F-E2
198caatggccgc attgcagttc
20
199 19 DNA Artificial sequence Primer Sequence-1 F-T
199ccgaggagat cgtgctgca
19
200 20 DNA Artificial sequence Primer Sequence-2 F-T
200tgaccgcatt gcgattccag
20
201 24 DNA Artificial sequence Primer Sequence-1 H-T
201tccaagtcgc tttccaacag gatc
24
202 20 DNA Artificial sequence Primer Sequence-2 H-T
202tgaccgcatt gcgattccag
20
203 19 DNA Artificial sequence Primer Sequence-1 F-E3
203ccgaggagat cgtgctgca
19
204 24 DNA Artificial sequence Primer Sequence-2 F-E3
204accaagctgc ttcaatccga caac
24
205 63 DNA Artificial sequence DNA fragment with intact Cas target sequence
205ggggaatgct ggaactgcaa tgcggccatt gacagcagct gttactgctg ctggtggaaa
60tgc
63
206 60 DNA Artificial sequence DNA fragment with mutated Cas target sequence
206ggggaatgct ggaactgcaa tgcggccatt ggcagctgtt actgctgctg gtggaaatgc
60
207 50 DNA Artificial sequence DNA fragment with mutated Cas target sequence
207ggggaatgct ggaactgcac agcagctgtt actgctgctg gtggaaatgc
50
208 33 DNA Artificial sequence DNA fragment with mutated Cas target sequence
208ggggaatgct gttactgctg ctggtggaaa tgc
33
209 51 DNA Artificial sequence TIPS edited EPSPS nucleotide sequence fragment
209aatgctggaa tcgcaatgcg gtcattgaca gcagctgtta ctgctgctgg t
51
210 51 DNA Artificial sequence Wild-type epsps nucleotide sequence fragment
210aatgctggaa ctgcaatgcg gccattgaca gcagctgtta ctgctgctgg t
51
211 5124 DNA Zea mays
211atggcggcca
tggcgaccaa ggccgccgcg ggcaccgtgt cgctggacct cgccgcgccg 60ccggcggcgg
cagcggcggc ggcggtgcag gcgggtgccg aggagatcgt gctgcagccc 120atcaaggaga
tctccggcac cgtcaagctg ccggggtcca agtcgctttc caacaggatc 180ctcctgctcg
ccgccctgtc cgaggtgagc gattttggtg cttgctgcgc tgccctgtct 240cactgctacc
taaatgtttt gcctgtcgaa taccatggat tctcggtgta atccatctca 300cgatcagatg
caccgcatgt cgcatgccta gctctctcta atttgtctag tagtttgtat 360acggattaag
attgataaat cggtaccgca aaagctaggt gtaaataaac actacaaaat 420tggatgttcc
cctatcggcc tgtactcggc tactcgttct tgtgatggca tgttatttct 480tcttggtgtt
tggtgaactc ccttatgaaa tttgggcgca aagaaatcgc cctcaagggt 540tgatcttatg
ccatcgtcat gataaacagt gaagcacgga tgatccttta cgttgttttt 600aacaaacttt
gtcagaaaac tagcaatgtt aacttcttaa tgatgatttc acaacaaaaa 660aggtaacctt
gctactaaca taacaaaaga cttgttgctt attaattata tgttttttta 720atctttgatc
aggggacaac agtggttgat aacctgttga acagtgagga tgtccactac 780atgctcgggg
ccttgaggac tcttggtctc tctgtcgaag cggacaaagc tgccaaaaga 840gctgtagttg
ttggctgtgg tggaaagttc ccagttgagg atgctaaaga ggaagtgcag 900ctcttcttgg
ggaatgctgg aactgcaatg cggccattga cagcagctgt tactgctgct 960ggtggaaatg
caacgtatgt ttcctctctc tctctacaat acttgttgga gttagtatga 1020aacccatgtg
tatgtctagt ggcttatggt gtattggttt ttgaacttca gttacgtgct 1080tgatggagta
ccaagaatga gggagagacc cattggcgac ttggttgtcg gattgaagca 1140gcttggtgca
gatgttgatt gtttccttgg cactgactgc ccacctgttc gtgtcaatgg 1200aatcggaggg
ctacctggtg gcaaggttag ttactaaggg ccacatgtta cattcttctg 1260taaatggtac
aactattgtc gagcttttgc atttgtaagg aaaacattga ttgatctgaa 1320tttgatgcta
caccacaaaa tatctacaaa tggtcatccc taactagcaa accatgtctc 1380cattaagctc
aatgaagtaa tacttggcat gtgtttatca acttaatttc catcttctgg 1440ggtattgcct
gttttctagt ctaatagcat ttgtttttag aattagctct tacaactgtt 1500atgttctaca
ggtcaagctg tctggctcca tcagcagtca gtacttgagt gccttgctga 1560tggctgctcc
tttggctctt ggggatgtgg agattgaaat cattgataaa ttaatctcca 1620ttccctacgt
cgaaatgaca ttgagattga tggagcgttt tggtgtgaaa gcagagcatt 1680ctgatagctg
ggacagattc tacattaagg gaggtcaaaa atacaagtaa gctctgtaat 1740gtatttcact
actttgatgc caatgtttca gttttcagtt ttccaaacag tcgcatcaat 1800atttgaatag
atgcactgta gaaaaaaatc attgcaggga aaaactagta ctgagtattt 1860tgactgtaaa
ttatttaacc agtcggaata tagtcagtct attggagtca agagcgtgaa 1920ccgaaatagc
cagttaatta tcccattata cagaggacaa ccatgtatac tattgaaact 1980tggtttaaga
gaatctaggt agctggactc gtagctgctt ggcatggata ccttcttatc 2040tttaggaaaa
gacacttgat tttttttctg tggccctcta tgatgtgtga acctgcttct 2100ctattgcttt
agaaggatat atctatgtcg ttatgcaaca tgcttccctt agtcatttgt 2160actgaaatca
gtttcataag ttcgttagtg gttccctaaa cgaaaccttg tttttctttg 2220caatcaacag
gtcccctaaa aatgcctatg ttgaaggtga tgcctcaagc gcaagctatt 2280tcttggctgg
tgctgcaatt actggaggga ctgtgactgt ggaaggttgt ggcaccacca 2340gtttgcaggt
aaagatttct tggctggtgc tacgataact gcttttgtct ttttggtttc 2400agcattgttc
tcagagtcac taaataacat tatcatctgc aaacgtcaaa tagacatact 2460taggtgaatg
gatattcatg taaccgtttc cttacaaatt tgctgaaacc tcagggtgat 2520gtgaagtttg
ctgaggtact ggagatgatg ggagcgaagg ttacatggac cgagactagc 2580gtaactgtta
ctggcccacc gcgggagcca tttgggagga aacacctcaa ggcgattgat 2640gtcaacatga
acaagatgcc tgatgtcgcc atgactcttg ctgtggttgc cctctttgcc 2700gatggcccga
cagccatcag agacggtaaa acattctcag ccctacaacc atgcctcttc 2760tacatcacta
cttgacaaga ctaaaaacta ttggctcgtt ggcagtggct tcctggagag 2820taaaggagac
cgagaggatg gttgcgatcc ggacggagct aaccaaggta aggctacata 2880cttcacatgt
ctcacgtcgt ctttccatag ctcgctgcct cttagcggct tgcctgcggt 2940cgctccatcc
tcggttgctg tctgtgtttt ccacagctgg gagcatctgt tgaggaaggg 3000ccggactact
gcatcatcac gccgccggag aagctgaacg tgacggcgat cgacacgtac 3060gacgaccaca
ggatggccat ggccttctcc cttgccgcct gtgccgaggt ccccgtgacc 3120atccgggacc
ctgggtgcac ccggaagacc ttccccgact acttcgatgt gctgagcact 3180ttcgtcaaga
attaataaag cgtgcgatac taccacgcag cttgattgaa gtgataggct 3240tgtgctgagg
aaatacattt cttttgttct gttttttctc tttcacggga ttaagttttg 3300agtctgtaac
gttagttgtt tgtagcaagt ttctatttcg gatcttaagt ttgtgcactg 3360taagccaaat
ttcatttcaa gagtggttcg ttggaataat aagaataata aattacgttt 3420cagtggctgt
caagcctgct gctacgtttt aggagatggc attagacatt catcatcaac 3480aacaataaaa
ccttttagcc tcaaacaata atagtgaagt tattttttag tcctaaacaa 3540gttgcattag
gatatagtta aaacacaaaa gaagctaaag ttagggttta gacatgtgga 3600tattgttttc
catgtatagt atgttctttc tttgagtctc atttaactac ctctacacat 3660accaacttta
gttttttttc tacctcttca tgttactatg gtgccttctt atcccactga 3720gcattggtat
atttagaggt ttttgttgaa catgcctaaa tcatctcaat caacgatgga 3780caatcttttc
ttcgattgag ctgaggtacg tcatctagag gataggacct tgagaatatg 3840tgtccgtcaa
tagctaaccc tctactaatt ttttcaatca agcaacctat tggcttgact 3900ttaattcgta
ccggcttcta ctacttctac agtattttgt ctctataaat tgcagctaca 3960acagtcagaa
cggctggctt taaaatcaaa tggcctaagg atcattgaaa ggcatcttag 4020caatgtctaa
aattattacc ttctctagac gttgatatct ttgctccgga ttcgatccct 4080tgttgtatga
ccacaaatcc aacaccaaat acgcatttct gcaacacacc caaacacccc 4140ttccaaataa
gtggaatggt tgagaaattt gctattttga ttaaatattg gtgaaggggc 4200aaggctgagg
aaacgagacg aaggttcctt gacagctgaa aaatggaaca ctctagaggc 4260ggagggagcg
aggcgagctg tgtgaattgc cacccattga ttaagaatcc aacaacttga 4320ctagcaaatg
ccgacatggg tagcctacaa aggcgagttt tggagctggt ttcgtaataa 4380ggaaatttct
caaccaacta ctttccttag aaaagagttg cttgaccgga tcaacatctc 4440cccctaaacc
ccttggaggg ggagggggct aagattttaa tctacaagtt agatctaact 4500gtccacctca
atccccctca aggaggtttt tgtattattt gttagtgtag aatgataaag 4560tggatgtatt
gataggagat ggggtacaca tatttatagg gactcaaccc taaccctaat 4620gggtcggcag
cccaacagtg gtgtccggcc cacacacaca ctcacacaca cagtctaaca 4680tcccccgcag
tcgcaacggg gacaccacac acgatgagac tggagtagag gccgaaggta 4740ggagccgacg
ggttgaaatc ccccctagtc gcagcgtcgt gatagtacga atgttgcggc 4800tggagtagag
accggtgtgt gctccaagaa gacgatagcc cctagatgcc gaggtagccg 4860aagtcgaggt
ggtcgcggtc ggaagacgcg cagcaaaagc ctgatcttcg ggatggtcga 4920cgttcgagcg
tcaacgatcg gtagggcgac acaataaaag ggcaccagca ggtcgacctt 4980cctgcttctt
cgatcgtcca gacgtcaagg agcctcgcta gggaggccga cggcagcgca 5040cgcggctacg
ccggtcatgg tgtcctcacc cgcggcagaa aagaagggga atgtcggatc 5100cgaccgagaa
ggccacggca gcga 5124
212 3387 DNA S. thermophilus
212atgagtgact tagttttagg acttgatatc ggtataggtt ctgttggtgt
aggtatcctt 60aacaaagtga caggagaaat tatccataaa aactcacgca tcttcccagc
agctcaagca 120gaaaataacc tagtacgtag aacgaatcgt caaggaagac gcttgacacg
acgtaaaaaa 180catcgtatag ttcgtttaaa tcgtctattt gaggaaagtg gattaatcac
cgattttacg 240aagatttcaa ttaatcttaa cccatatcaa ttacgagtta agggcttgac
cgatgaattg 300tctaatgaag aactgtttat cgctcttaaa aatatggtga aacaccgtgg
gattagttac 360ctcgatgatg ctagtgatga cggaaattca tcagtaggag actatgcaca
aattgttaag 420gaaaatagta aacaattaga aactaagaca ccgggacaga tacagttgga
acgctaccaa 480acatatggtc aattacgtgg tgattttact gttgagaaag atggcaaaaa
acatcgcttg 540attaatgtct ttccaacatc agcttatcgt tcagaagcct taaggatact
gcaaactcaa 600caagaattta atccacagat tacagatgaa tttattaatc gttatctcga
aattttaact 660ggaaaacgga aatattatca tggacccgga aatgaaaagt cacggactga
ttatggtcgt 720tacagaacga gtggagaaac tttagacaat atttttggaa ttctaattgg
gaaatgtaca 780ttttatccag aagagtttag agcagcaaaa gcttcctaca cggctcaaga
attcaatttg 840ctaaatgatt tgaacaatct aacagttcct actgaaacca aaaagttgag
caaagaacag 900aagaatcaaa tcattaatta tgtcaaaaat gaaaaggcaa tggggccagc
gaaacttttt 960aaatatatcg ctaagttact ttcttgtgat gttgcagata tcaagggata
ccgtatcgac 1020aaatcaggta aggctgagat tcatactttc gaagcctatc gaaaaatgaa
aacgcttgaa 1080accttagata ttgaacaaat ggatagagaa acgcttgata aattagccta
tgtcttaaca 1140ttaaacactg agagggaagg tattcaagaa gccttagaac atgaatttgc
tgatggtagc 1200tttagccaga agcaagttga cgaattggtt caattccgca aagcaaatag
ttccattttt 1260ggaaaaggat ggcataattt ttctgtcaaa ctgatgatgg agttaattcc
agaattgtat 1320gagacgtcag aagagcaaat gactatcctg acacgacttg gaaaacaaaa
acgacttcgt 1380cttcaaataa aacaaaatat ttcaaataaa acaaaatata tagatgagaa
actattaact 1440gaagaaatct ataatcctgt tgttgctaag tctgttcgcc aggctataaa
aatcgtaaat 1500gcggcgatta aagaatacgg agactttgac aatattgtca tcgaaatggc
tcgtgaaaca 1560aatgaagatg atgaaaagaa agctattcaa aagattcaaa aagccaacaa
agatgaaaaa 1620gatgcagcaa tgcttaaggc tgctaaccaa tataatggaa aggctgaatt
accacatagt 1680gttttccacg gtcataagca attagcgact aaaatccgcc tttggcatca
gcaaggagaa 1740cgttgccttt atactggtaa gacaatctca atccatgatt tgataaataa
tcctaatcag 1800tttgaagtag atcatatttt acctctttct atcacattcg atgatagcct
tgcaaataag 1860gttttggttt atgcaactgc taaccaagaa aaaggacaac gaacacctta
tcaggcttta 1920gatagtatgg atgatgcgtg gtctttccgt gaattaaaag cttttgtacg
tgagtcaaaa 1980acactttcaa acaagaaaaa agaatacctc cttacagaag aagatatttc
aaagtttgat 2040gttcgaaaga aatttattga acgaaatctt gtagatacaa gatacgcttc
aagagttgtc 2100ctcaatgccc ttcaagaaca ctttagagct cacaagattg atacaaaagt
ttccgtggtt 2160cgtggccaat ttacatctca attgagacgc cattggggaa ttgagaagac
tcgtgatact 2220tatcatcacc atgctgtcga tgcattgatt attgccgcct caagtcagtt
gaatttgtgg 2280aaaaaacaaa agaataccct tgtaagttat tcagaagaac aactccttga
tattgaaaca 2340ggtgaactta ttagtgatga tgagtacaag gaatctgtgt tcaaagcccc
ttatcaacat 2400tttgttgata cattgaagag taaagaattt gaagacagta tcttattctc
atatcaagtg 2460gattctaagt ttaatcgtaa aatatcagat gccactattt atgcgacaag
acaggctaaa 2520gtgggaaaag ataagaagga tgaaacttat gtcttaggga aaatcaaaga
tatctatact 2580caggatggtt atgatgcctt tatgaagatt tataagaagg ataagtcaaa
attcctcatg 2640tatcgtcacg acccacaaac ctttgagaaa gttatcgagc caattttaga
gaactatcct 2700aataagcaaa tgaatgaaaa aggaaaagag gtaccatgta atcctttcct
aaaatataaa 2760gaagaacatg gctatattcg taaatatagt aaaaaaggca atggtcctga
aatcaagagt 2820cttaaatact atgatagtaa gcttttaggt aatcctattg atattactcc
agagaatagt 2880aaaaataaag ttgtcttaca gtcattaaaa ccttggagaa cagatgtcta
tttcaataag 2940gctactggaa aatacgaaat ccttggatta aaatatgctg atctacaatt
tgagaaaggg 3000acaggaacat ataagatttc ccaggaaaaa tacaatgaca ttaagaaaaa
agagggtgta 3060gattctgatt cagaattcaa gtttacactt tataaaaatg atttgttact
cgttaaagat 3120acagaaacaa aagaacaaca gcttttccgt tttctttctc gaactttacc
taaacaaaag 3180cattatgttg aattaaaacc ttatgataaa cagaaatttg aaggaggtga
ggcgttaatt 3240aaagtgttgg gtaacgttgc taatggtggt caatgcataa aaggactagc
aaaatcaaat 3300atttctattt ataaagtaag aacagatgtc ctaggaaatc agcatatcat
caaaaatgag 3360ggtgataagc ctaagctaga tttttaa
3387
213 3369 DNA S. thermophilus
213atgagtgact tagttttagg
acttgatatc ggtataggtt ctgttggtgt aggtatcctt 60aacaaagtga caggagaaat
tatccataaa aactcacgca tcttcccagc agctcaagca 120gaaaataacc tagtacgtag
aacgaatcgt caaggaagac gcttgacacg acgtaaaaaa 180catcgtatag ttcgtttaaa
tcgtctattt gaggaaagtg gattaatcac cgattttacg 240aagatttcaa ttaatcttaa
cccatatcaa ttacgagtta agggcttgac cgatgaattg 300tctaatgaag aactgtttat
cgctcttaaa aatatggtga aacaccgtgg gattagttac 360ctcgatgatg ctagtgatga
cggaaattca tcagtaggag actatgcaca aattgttaag 420gaaaatagta aacaattaga
aactaagaca ccgggacaga tacagttgga acgctaccaa 480acatatggtc aattacgtgg
tgattttact gttgagaaag atggcaaaaa acatcgcttg 540attaatgtct ttccaacatc
agcttatcgt tcagaagcct taaggatact gcaaactcaa 600caagaattta attcacagat
tacagatgaa tttattaatc gttatctcga aattttaact 660ggaaaacgga aatattatca
tggacccgga aatgaaaagt cacggactga ttatggtcgt 720tacagaacga atggagaaac
tttagacaat atttttggaa ttctaattgg gaaatgtaca 780ttttatccag acgagtttag
agcagcaaaa gcttcctaca cggctcaaga attcaatttg 840ctaaatgatt tgaacaatct
aacagttcct actgaaacca aaaagttgag caaagaacag 900aagaatcaaa tcattaatta
tgtcaaaaat gaaaaggtaa tggggccagc gaaacttttt 960aaatatatcg ctaaattact
ttcttgtgat gttgcagata tcaagggaca ccgtatcgac 1020aaatcaggta aggctgagat
tcatactttc gaagcctatc gaaaaatgaa aacgcttgaa 1080accttagata ttgagcaaat
ggatagagaa acgcttgata aattagccta tgtcttaaca 1140ttaaacactg agagggaagg
tattcaagaa gctttagaac atgaatttgc tgatggtagc 1200tttagccaga agcaagttga
cgaattggtt caattccgca aagcaaatag ttccattttt 1260ggaaaaggat ggcataattt
ttctgtcaaa ctgatgatgg agttaattcc agaattgtat 1320gagacgtcag aagagcaaat
gactatcctg acacgacttg gaaaacaaaa aacaacttcg 1380tcttcaaata aaacaaaata
tatagatgag aaactattaa ctgaagaaat ctataatcct 1440gttgttgcta agtctgttcg
ccaggctata aaaatcgtaa atgcggcgat taaagaatac 1500ggagactttg acaatattgt
catcgaaatg gctcgtgaaa caaatgaaga tgatgaaaag 1560aaagctattc aaaagattca
aaaagccaac aaagatgaaa aagatgcagc aatgcttaag 1620gctgctaacc aatataatgg
aaaggctgaa ttaccacata gtgttttcca cggtcataag 1680caattagcga ctaaaatccg
cctttggcat cagcaaggag aacgttgcct ttatactggt 1740aagacaatct caatccatga
tttgataaat aatcctaatc agtttgaagt agatcatatt 1800ttacctcttt ctatcacatt
cgatgatagc cttgcaaata aggttttggt ttatgcaact 1860gctaaccaag aaaaaggaca
acgaacacct tatcaggctt tagatagtat ggatgatgcg 1920tggtctttcc gtgaattaaa
agcttttgta cgtgagtcaa aaacactttc aaacaagaaa 1980aaagaatacc tccttacaga
agaagatatt tcaaagtttg atgttcgaaa gaaatttatt 2040gaacgaaatc ttgtagatac
aagatacgct tcaagagttg tcctcaatgc ccttcaagaa 2100cactttagag ctcacaagat
tgatacaaaa gtttccgtgg ttcgtggcca atttacatct 2160caattgagac gccattgggg
aattgagaag actcgtgata cttatcatca ccatgctgtc 2220gatgcattga ttattgccgc
ctcaagtcag ttgaatttgt ggaaaaaaca aaagaatacc 2280cttgtaagtt attcagaaga
acaactcctt gatattgaaa caggtgaact tattagtgat 2340gatgagtaca aggaatctgt
gttcaaagcc ccttatcaac attttgttga tacattgaag 2400agtaaagaat ttgaagacag
tatcttattc tcatatcaag tggattctaa gtttaatcgt 2460aaaatatcag atgccactat
ttatgcgaca agacaggcta aagtgggaaa agataagaag 2520gatgaaactt atgtcttagg
gaaaatcaaa gatatctata ctcaggatgg ttatgatgcc 2580tttatgaaga tttataagaa
ggataagtca aaattcctca tgtatcgtca cgacccacaa 2640acctttgaga aagttatcga
gccaatttta gagaactatc ctaataagga aatgaatgaa 2700aaagggaaag aagtaccatg
taatcctttc ctaaaatata aagaagaaca tggctatatt 2760cgtaaatata gtaaaaaagg
caatggtcct gaaatcaaga gtcttaaata ctatgatagt 2820aagcttttag gtaatcctat
tgatattact ccagagaata gtaaaaataa agttgtctta 2880cagtcattaa aaccttggag
aacagatgtc tatttcaata aaaatactgg taaatatgaa 2940attttaggac tgaaatatgc
tgatttacaa tttgaaaaga agacaggaac atataagatt 3000tcccaggaaa aatacaatgg
cattatgaaa gaagagggtg tagattctga ttcagaattc 3060aagtttacac tttataaaaa
tgatttgtta ctcgttaaag atacagaaac aaaagaacaa 3120cagcttttcc gttttctttc
tcgaactatg cctaatgtga aatattatgt agagttaaag 3180ccttattcaa aagataaatt
tgagaagaat gagtcactta ttgaaatttt aggttctgca 3240gataagtcag gacgatgtat
aaaagggcta ggaaaatcaa atatttctat ttataaggta 3300agaacagatg tcctaggaaa
tcagcatatc atcaaaaatg agggtgataa gcctaagcta 3360gatttttaa
3369
214 4113 DNA S. agalactiae
214atgaataagc catattcaat aggccttgac atcggtacta attccgtcgg atggagcatt
60attacagatg attataaagt acctgctaag aagatgagag ttttagggaa cactgataaa
120gaatatatta agaagaatct cataggtgct ctgctttttg atggcgggaa tactgctgca
180gatagacgct tgaagcgaac tgctcgtcgt cgttatacac gtcgtagaaa tcgtattcta
240tatttacaag aaatttttgc agaggaaatg agtaaagttg atgatagttt ctttcatcga
300ttagaggatt cttttctagt tgaggaagat aagagaggga gcaagtatcc tatctttgca
360acattgcagg aagagaaaga ttatcatgaa aaattttcga caatctatca tttgagaaaa
420gaattagctg acaagaaaga aaaagcagac cttcgtctta tttatattgc tctagctcat
480atcattaaat ttagagggca tttcctaatt gaggatgata gctttgatgt caggaataca
540gacatttcaa aacaatatca agatttttta gaaatcttta atacaacttt tgaaaataat
600gatttgttat ctcaaaacgt tgacgtagag gcaatactaa cagataagat tagcaagtct
660gcgaagaaag atcgtatttt agcgcagtat cctaaccaaa aatctactgg catttttgca
720gaatttttga aattgattgt cggaaatcaa gctgacttca agaaatattt caatttggag
780gataaaacgc cgcttcaatt cgctaaggat agctacgatg aagatttaga aaatcttctt
840ggacagattg gtgatgaatt tgcagactta ttctcagcag cgaaaaagtt atatgatagt
900gtccttttgt ctggcattct tacagtaatc gacctcagta ccaaggcgcc actttcagct
960tctatgattc agcgttatga tgaacataga gaggacttga aacagttaaa acaattcgta
1020aaagcttcat tgccggaaaa atatcaagaa atatttgctg attcatcaaa agatggctac
1080gctggttata ttgaaggtaa aactaatcaa gaagcttttt ataaatacct gtcaaaattg
1140ttgaccaagc aagaagatag cgagaatttt cttgaaaaaa tcaagaatga agatttcttg
1200agaaaacaaa ggacctttga taatggctca attccacacc aagtccattt gacagagctg
1260aaagctatta tccgccgtca atcagaatac tatcccttct tgaaagagaa tcaagatagg
1320attgaaaaaa tccttacctt tagaattcct tattatatcg ggccactagc acgtgagaag
1380agtgattttg catggatgac tcgcaaaaca gatgacagta ttcgaccttg gaattttgaa
1440gacttggttg ataaagaaaa atctgcggaa gcttttatcc atcgtatgac caacaatgat
1500ttttatcttc ctgaagaaaa agttttacca aagcatagtc ttatttatga aaaatttacg
1560gtctataatg agttgactaa ggttagatat aaaaatgagc aaggtgagac ttattttttt
1620gatagcaata ttaaacaaga aatctttgat ggagtattca aggaacatcg taaggtatcc
1680aagaagaagt tgctagattt tctggctaaa gaatatgagg agtttaggat agtagatgtt
1740attggtctag ataaagaaaa taaagctttc aacgcctcat tgggaactta ccacgatctc
1800gaaaaaatac tagacaaaga ttttctagat aatccagata atgagtctat tctggaagat
1860atcgtccaaa ctctaacatt atttgaagac agagaaatga ttaagaagcg tcttgaaaac
1920tataaagatc tttttacaga gtcacaacta aaaaaactct atcgtcgtca ctatactggc
1980tggggacgat tgtctgctaa gttaatcaat ggtattcgag ataaagagag tcaaaaaaca
2040atcttggact atcttattga tgatggtaga tctaatcgca actttatgca gttgataaat
2100gatgatggtc tatctttcaa atcaattatc agtaaggcac aggctggtag tcattcagat
2160aatctaaaag aagttgtagg tgagcttgca ggtagccctg ctattaaaaa gggaattcta
2220caaagtttga aaattgttga tgagcttgtt aaagtcatgg gatacgaacc tgaacaaatt
2280gtggttgaga tggcgcgtga gaatcaaaca acaaatcaag gtcgtcgtaa ctctcgacaa
2340cgctataaac ttcttgatga tggcgttaag aatctagcta gtgacttgaa tggcaatatt
2400ttgaaagaat atcctacgga taatcaagcg ttgcaaaatg aaagactttt cctttactac
2460ttacaaaacg gaagagatat gtatacaggg gaagctctag atattgacaa tttaagtcaa
2520tatgatattg accacattat tcctcaagct ttcataaaag atgattctat tgataatcgt
2580gttttggtat catctgctaa aaatcgtgga aagtcagatg atgttcctag ccttgaaatt
2640gtaaaagatt gtaaagtttt ctggaaaaaa ttacttgatg ctaagttaat gagtcagcgt
2700aagtatgata atttgactaa ggcagagcgc ggaggcctaa cttccgatga taaggcaaga
2760tttatccaac gtcagttggt tgagacacga caaattacca agcatgttgc ccgtatcttg
2820gatgaacgct ttaataatga gcttgatagt aaaggtagaa ggatccgcaa agttaaaatt
2880gtaaccttga agtcaaattt ggtttcaaat ttccgaaaag aatttggatt ctataaaatt
2940cgtgaagtta acaattatca ccatgcacat gatgcctatc ttaatgcagt agttgctaaa
3000gctattctaa ccaaatatcc tcagttagag ccagaatttg tctacggcga ctatccaaaa
3060tataatagtt acaaaacgcg taaatccgct acagaaaagc tatttttcta ttcaaatatt
3120atgaacttct ttaaaactaa ggtaacttta gcggatggaa ccgttgttgt aaaagatgat
3180attgaagtta ataatgatac gggtgaaatt gtttgggata aaaagaaaca ctttgcgaca
3240gttagaaaag tcttgtcata ccctcagaac aatatcgtga agaagacaga gattcagaca
3300ggtggtttct ctaaggaatc aatcttggcg catggtaact cagataagtt gattccaaga
3360aaaacgaagg atatttattt agatcctaag aaatatggag gttttgatag tccgatagta
3420gcttactctg ttttagttgt agctgatatc aaaaagggta aagcacaaaa actaaaaaca
3480gttacggaac ttttaggaat taccatcatg gagaggtcca gatttgagaa aaatccatca
3540gctttccttg aatcaaaagg ctatttaaat attagggctg ataaactaat tattttgccc
3600aagtatagtc tgttcgaatt agaaaatggg cgtcgtcgat tacttgctag tgctggtgaa
3660ttacaaaaag gtaatgagct agccttacca acacaattta tgaagttctt ataccttgca
3720agtcgttata atgagtcaaa aggtaaacca gaggagattg agaagaaaca agaatttgta
3780aatcaacatg tctcttattt tgatgacatc cttcaattaa ttaatgattt ttcaaaacga
3840gttattctag cagatgctaa tttagagaaa atcaataagc tttaccaaga taataaggaa
3900aatatatcag tagatgaact tgctaataat attatcaatc tatttacttt taccagtcta
3960ggagctccag cagcttttaa attttttgat aaaatagttg atagaaaacg ctatacatca
4020actaaagaag tacttaattc taccctaatt catcaatcta ttactggact ttatgaaaca
4080cgtattgatt tgggtaagtt aggagaagat tga
4113
215 4134 DNA S. agalactiae
215atgaataagc catattcaat aggccttgac
atcggtacta attccgtcgg atggagcatt 60attacagatg attataaagt acctgctaag
aagatgagag ttttagggaa cactgataaa 120gaatatatta agaagaatct cataggtgct
ctgctttttg atggcgggaa tactgctgca 180gatagacgct tgaagcgaac tgctcgtcgt
cgttatacac gtcgtagaaa tcgtattcta 240tatttacaag aaatttttgc agaggaaatg
agtaaagttg atgatagttt ctttcatcga 300ttagaggatt cttttctagt tgaggaagat
aagagaggta gcaagtatcc tatctttgca 360acaatgcagg aggagaaata ttatcatgaa
aaatttccga caatctatca tttgagaaaa 420gaattggctg acaagaaaga aaaagcagac
cttcgtcttg tttatctggc tctagctcat 480atcattaaat tcagagggca tttcctaatt
gaggatgata gatttgatgt gaggaatacc 540gatattcaaa aacaatatca agccttttta
gaaatttttg atactacctt tgaaaataat 600catttgttat ctcaaaatgt agatgtagaa
gcaattctaa cagataagat tagcaagtct 660gcgaagaagg atcgcatctt agcgcagtat
cctaaccaaa aatctactgg tatttttgca 720gaatttttga aattgattgt cggaaatcaa
gctgacttca agaaacattt caatttggag 780gataaaacac cgcttcaatt cgctaaggat
agctacgatg aagatttaga aaatcttctt 840ggacagattg gtgatgaatt tgcagactta
ttctcagtag cgaaaaagct atatgatagt 900gttcttttat ctggcattct tacagtaact
gatctcagta ccaaggcgcc actttctgcc 960tctatgattc agcgttatga tgaacatcat
gaggacttaa agcatctaaa acaattcgta 1020aaagcttcat tacctgaaaa ttatcgggaa
gtatttgctg attcatcaaa agatggctac 1080gctggctata ttgaaggcaa aactaatcaa
gaagcttttt ataaatatct gttaaaattg 1140ttgaccaaac aagaaggtag cgagtatttt
cttgagaaaa ttaagaatga agattttttg 1200agaaaacaga gaacctttga taatggctca
atcccgcatc aagtccattt gacagaattg 1260agggctatta ttcgacgtca atcagaatac
tatccattct tgaaagagaa tcaagatagg 1320attgaaaaaa tccttacctt tagaattcct
tattatgtcg ggccactagc acgtgagaag 1380agtgattttg catggatgac tcgcaaaaca
gatgacagta ttcgaccttg gaattttgaa 1440gacttggttg ataaagaaaa atctgcggaa
gcttttatcc atcgcatgac caacaatgac 1500ctctatcttc cagaagaaaa agttttacca
aagcatagtc ttatttatga aaaatttact 1560gtttacaatg aattaacgaa ggttagattt
ttggcagaag gctttaaaga ttttcaattt 1620ttaaatagga agcaaaaaga aactatcttt
aacagcttgt ttaaggaaaa acgtaaagta 1680actgaaaagg atattattag ttttttgaat
aaagttgatg gatatgaagg aattgcaatc 1740aaaggaattg agaaacagtt taacgctagc
ctttcaacct atcatgatct taaaaaaata 1800cttggcaagg atttccttga taatacagat
aacgagctta ttttggaaga tatcgtccaa 1860actctaacct tatttgaaga tagagaaatg
attaagaagt gtcttgacat ctataaagat 1920ttttttacag agtcacagct taaaaagctc
tatcgccgtc actatactgg ctggggacga 1980ttgtctgcta agctaataaa tggcatccga
aataaagaga atcaaaaaac aatcttggac 2040tatcttattg atgatggaag tgcaaaccga
aacttcatgc agttgataaa tgatgatgat 2100ctatcattta aaccaattat tgacaaggca
cgaactggta gtcattcgga taatctgaaa 2160gaagttgtag gtgaacttgc tggtagccct
gctattaaaa aagggattct acaaagtttg 2220aaaatagttg atgagctggt taaagtcatg
ggctatgaac ctgaacaaat cgtggttgaa 2280atggcacgtg agaaccaaac gacagcaaaa
ggattaagtc gttcacgaca acgcttgaca 2340accttgagag aatctcttgc taatttgaag
agtaatattt tggaagagaa aaagcctaag 2400tatgtgaaag atcaagttga aaatcatcat
ttatctgatg accgtctttt cctttactac 2460ttacaaaacg gaagagatat gtatacaaaa
aaggctctgg atattgataa tttaagtcaa 2520tatgatattg accacattat tcctcaagct
ttcataaaag atgattctat tgataatcgt 2580gttttggtat catctgctaa aaatcgtgga
aaatcagatg atgttcctag cattgaaatt 2640gtaaaagctc gcaaaatgtt ctggaaaaat
ttactggatg ctaagttaat gagtcagcgt 2700aagtatgata atttgactaa ggcagagcgc
ggaggcctaa cttccgatga taaggcaaga 2760tttatccaac gtcagttggt tgagactcga
caaattacca agcatgtagc tcgtatcttg 2820gatgaacgct tcaataatga agttgataat
ggtaaaaaga tttgcaaggt taaaattgta 2880accttgaagt caaatttggt ttcaaatttc
cgaaaagaat ttggattcta taaaattcgt 2940gaagttaatg attatcacca tgcacacgat
gcttatctta atgcagtagt tgccaaagct 3000attctaacca aatatccaca gttagagcca
gagtttgtct acggaatgta tagacagaaa 3060aaactttcga aaatcgttca tgaggataag
gaagaaaaat atagtgaagc aaccaggaaa 3120atgtttttct actccaactt gatgaatatg
ttcaaaagag ttgtgaggtt agcagatggt 3180tctattgttg taagaccagt aatagaaact
ggtagatata tgagaaaaac tgcatgggat 3240aaaaagaaac actttgcgac agttagaaaa
gtcttgtcat accctcagaa caatatcgtg 3300aagaagacag agattcagac aggtggtttc
tctaaggaat caatcttggc gcatggtaac 3360tcagataagt tgattccaag aaaaacgaag
gatatttatt tagatcctaa gaaatatgga 3420ggttttgata gtccgatagt agcttactct
gttttagttg tagctgatat caaaaaaggt 3480aaagcacaaa aactaaaaac agttacggaa
cttttaggaa ttaccatcat ggagaggtcc 3540agatttgaga aaaatccatc agctttcctt
gaatcaaaag gttatttaaa tattagggac 3600gataaattaa tgattttacc gaagtatagt
ctgttcgaat tagaaaatgg gcgtcgtcga 3660ttacttgcta gtgctggtga attacaaaaa
ggtaacgagc tagccttacc aacacaattt 3720atgaagttct tataccttgc aagtcgttat
aatgagtcaa aaggtaaacc agaggagatt 3780gagaagaaac aagaatttgt aaatcaacat
gtctcttatt ttgatgacat ccttcaatta 3840attaatgatt tttcaaaacg agttattcta
gcagatgcta atttagagaa aatcaataag 3900ctttaccagg ataataagga aaatatacca
gtagatgaac ttgctaataa tattatcaat 3960ctatttactt ttaccagtct aggagctcca
gcagctttta aattttttga taaaatagtt 4020gatagaaaac gctatacatc aactaaagaa
gtacttaatt ctactctaat ccatcaatct 4080attactggac tttatgaaac acgtattgat
ttgggtaaat taggagaaga ttga 4134
216 4038 DNA S. mutans
216atgaaaaaac
cttactctat tggacttgat attggaacca attctgttgg ttgggctgtt 60gtgacagatg
actacaaagt tcctgctaag aagatgaagg ttctgggaaa tacagataaa 120agtcatatcg
agaaaaattt gcttggcgct ttattatttg atagcgggaa tactgcagaa 180gacagacggt
taaagagaac tgctcgccgt cgttacacac gtcgcagaaa tcgtatttta 240tatttgcaag
agattttttc agaagaaatg ggcaaggtag atgatagttt ctttcatcgt 300ttagaggatt
cttttcttgt tactgaggat aaacgaggag agcgccatcc catttttggg 360aatcttgaag
aagaagttaa gtatcatgaa aattttccaa ccatttatca tttgcggcaa 420tatcttgcgg
ataatccaga aaaagttgat ttgcgtttag tttatttggc tttggcacat 480ataattaagt
ttagaggtca ttttttaatt gaaggaaagt ttgatacacg caataatgat 540gtacaaagac
tgtttcaaga atttttagca gtctatgata atacttttga gaatagttcg 600cttcaggagc
aaaatgttca agttgaagaa attctgactg ataaaatcag taaatctgct 660aagaaagata
gagttttgaa actttttcct aatgaaaagt ctaatggccg ctttgcagaa 720tttctaaaac
taattgttgg taatcaagct gattttaaaa agcattttga attagaagag 780aaagcaccat
tgcaattttc taaagatact tatgaagaag agttagaagt actattagct 840caaattggag
ataattacgc agagctcttt ttatcagcaa agaaactgta tgatagtatc 900cttttatcag
ggattttaac agttactgat gttggtacca aagcgccttt atctgcttcg 960atgattcagc
gatataatga acatcagatg gatttagctc agcttaaaca attcattcgt 1020cagaaattat
cagataaata taacgaagtt ttttctgatg tttcaaaaga cggctatgcg 1080ggttatattg
atgggaaaac aaatcaagaa gctttttata aataccttaa aggtctatta 1140aataagattg
agggaagtgg ctatttcctt gataaaattg agcgtgaaga ttttctaaga 1200aagcaacgta
cctttgacaa tggctctatt ccacatcaga ttcatcttca agaaatgcgt 1260gctatcattc
gtagacaggc tgaattttat ccgtttttag cagacaatca agataggatt 1320gagaaattat
tgactttccg tattccctac tatgttggtc cattagcgcg cggaaaaagt 1380gattttgctt
ggttaagtcg gaaatcggct gataaaatta caccatggaa ttttgatgaa 1440atcgttgata
aagaatcctc tgcagaagct tttatcaatc gtatgacaaa ttatgatttg 1500tacttgccaa
atcaaaaagt tcttcctaaa catagtttat tatacgaaaa atttactgtt 1560tacaatgaat
taacaaaggt taaatataaa acagagcaag gaaaaacagc attttttgat 1620gccaatatga
agcaagaaat ctttgatggc gtatttaagg tttatcgaaa agtaactaaa 1680gataaattaa
tggatttcct tgaaaaagaa tttgatgaat ttcgtattgt tgatttaaca 1740ggtctggata
aagaaaataa agtatttaac gcttcttatg gaacttatca tgatttgtgt 1800aaaattttag
ataaagattt tctcgataat tcaaagaatg aaaagatttt agaagatatt 1860gtgttgacct
taacgttatt tgaagataga gaaatgatta gaaaacgtct agaaaattac 1920agtgatttat
tgaccaaaga acaagtgaaa aagctggaaa gacgtcatta tactggttgg 1980ggaagattat
cagctgagtt aattcatggt attcgcaata aagaaagcag aaaaacaatt 2040cttgattatc
tcattgatga tggcaatagc aatcggaact ttatgcaact gattaacgat 2100gatgctcttt
ctttcaaaga agagattgct aaggcacaag ttattggaga aacagacaat 2160ctaaatcaag
ttgttagtga tattgctggc agccctgcta ttaaaaaagg aattttacaa 2220agcttgaaga
ttgttgatga gcttgtcaaa attatgggac atcaacctga aaatatcgtc 2280gtggagatgg
cgcgtgaaaa ccagtttacc aatcagggac gacgaaattc acagcaacgt 2340ttgaaaggtt
tgacagattc tattaaagaa tttggaagtc aaattcttaa agaacatccg 2400gttgagaatt
cacagttaca aaatgataga ttgtttctat attatttaca aaacggcaga 2460gatatgtata
ctggagaaga attggatatt gattatctaa gccagtatga tatagaccat 2520attatcccgc
aagcttttat aaaggataat tctattgata atagagtatt gactagctca 2580aaggaaaatc
gtggaaaatc ggatgatgta ccaagtaaag atgttgttcg taaaatgaaa 2640tcctattgga
gtaagctact ttcggcaaag cttattacac aacgtaaatt tgataatttg 2700acaaaagctg
aacgaggtgg attgaccgac gatgataaag ctggattcat caagcgtcaa 2760ttagtagaaa
cacgacaaat taccaaacat gtagcacgta ttctggacga acgatttaat 2820acagaaacag
atgaaaacaa caagaaaatt cgtcaagtaa aaattgtgac cttgaaatca 2880aatcttgttt
ccaatttccg taaagagttt gaactctaca aagtgcgtga aattaatgac 2940tatcatcatg
cacatgatgc ctatctcaat gctgtaattg gaaaggcttt actaggtgtt 3000tacccacaat
tggaacctga atttgtttat ggtgattatc ctcattttca tggacataaa 3060gaaaataaag
caactgctaa gaaatttttc tattcaaata ttatgaactt ctttaaaaaa 3120gatgatgtcc
gtactgataa aaatggtgaa attatctgga aaaaagatga gcatatttct 3180aatattaaaa
aagtgctttc ttatccacaa gttaatattg ttaagaaagt agaggagcaa 3240acgggaggat
tttctaaaga atctatcttg ccgaaaggta attctgacaa gcttattcct 3300cgaaaaacga
agaaatttta ttgggatacc aagaaatatg gaggatttga tagcccgatt 3360gttgcttatt
ctattttagt tattgctgat attgaaaaag gtaaatctaa aaaattgaaa 3420acagtcaaag
ccttagttgg tgtcactatt atggaaaaga tgacttttga aagggatcca 3480gttgcttttc
ttgagcgaaa aggctatcga aatgttcaag aagaaaatat tataaagtta 3540ccaaaatata
gtttatttaa actagaaaac ggacgaaaaa ggctattggc aagtgctagg 3600gaacttcaaa
agggaaatga aatcgttttg ccaaatcatt taggaacctt gctttatcac 3660gctaaaaata
ttcataaagt tgatgaacca aagcatttgg actatgttga taaacataaa 3720gatgaattta
aggagttgct agatgttgtg tcaaactttt ctaaaaaata tactttagca 3780gaaggaaatt
tagaaaaaat caaagaatta tatgcacaaa ataatggtga agatcttaaa 3840gaattagcaa
gttcatttat caacttatta acatttactg ctataggagc accggctact 3900tttaaattct
ttgataaaaa tattgatcga aaacgatata cttcaactac tgaaattctc 3960aacgctaccc
tcatccacca atccatcacc ggtctttatg aaacgcggat tgatctcaat 4020aagttaggag
gagactaa
403821720DNAArtificial SequencePrimer qADH-F 217caagtcgcgg ttttcaatca
2021821DNAArtificial
SequencePrimer qADH-R 218tgaaggtgga agtcccaaca a
2121919DNAArtificial Sequenceprobe ADH-VIC
219tgggaagcct atctaccac
1922015DNAArtificial SequenceProbe wtEPSPS 220cggccattga cagca
1522120DNAArtificial
SequenceForward primer qEPSPS-F 221tcttggggaa tgctggaact
2022221DNAArtificial Sequencereverse
primer qEPSPSR 222caccagcagc agtaacagct g
2122317DNAArtificial SequenceFAM-wtEPSPS R probe
223tgctgtcaat ggccgca
1722420DNAArtificial Sequenceforward primer qEPSPS-F 224tcttggggaa
tgctggaact
2022520DNAArtificial Sequencereverse primer q wtEPSPS RA 225ccaccagcag
cagtaacagc
2022621DNAArtificial Sequenceforward primer q epTIPS F 226ggaagtgcag
ctcttcttgg g
2122719DNAArtificial Sequencereverse primer q epTIPS R 227agctgctgtc
aatgaccgc
1922815DNAArtificial SequenceTIPS probe 228aatgctggaa tcgca
1522923DNAZea maysMHP14Cas1 target
site(1)..(23) 229gttaaatctg acgtgaatct gtt
2323021DNAZea maysMHP14Cas3 target site(1)..(21)
230acaaacattg aagcgacata g
2123118DNAZea maysTS8Cas1 target site(1)..(18) 231gtacgtaacg tgcagtac
1823220DNAZea maysTS8Cas2
target site(1)..(20) 232gctcatcagt gatcagctgg
2023317DNAZea maysTS9Cas2 target site(1)..(17)
233ggctgtttgc ggcctcg
1723421DNAZea maysTS9Cas3 target site(1)..(21) 234gcctcgaggt tgcacgcacg t
2123520DNAZea maysTS10Cas1
target site(1)..(20) 235gcctcgcctt cgctagttaa
2023618DNAZea maysTS10Cas3 target site(1)..(18)
236gctcgtgttg gagataca
1823780DNAZea mays 237gttaaatctg acgtgaatct gtttggaatt gaaaaacaag
tgcttccttt catacaccac 60tatgtcgctt caatgtttgt
8023880DNAZea mays 238acaaacattg aagcgacata
gtggtgtatg aaaggaagca cttgtttttc aattccaaac 60agattcacgt cagatttaac
8023966DNAZea mays
239ccagtactgc acgttacgta cgtacgaact aatatactcc accagctgat cactgatgag
60ccgagc
6624066DNAZea mays 240gctcggctca tcagtgatca gctggtggag tatattagtt
cgtacgtacg taacgtgcag 60tactgg
6624135DNAZea mays 241ccgacgtgcg tgcaacctcg
aggccgcaaa cagcc 3524235DNAZea mays
242ggctgtttgc ggcctcgagg ttgcacgcac gtcgg
3524368DNAZea mays 243gctcgtgttg gagatacagg gacagcaagt acttggccct
taactagcga aggcgaggcg 60gccatgga
6824468DNAZea mays 244tccatggccg cctcgccttc
gctagttaag ggccaagtac ttgctgtccc tgtatctcca 60acacgagc
682451108DNAArtificial
SequenceMHP14Cas-1 guideRNA cassette 245tgagagtaca atgatgaacc tagattaatc
aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa gattgcttga
ggccctgttc ggttgttccg gattagagcc 120ccggattaat tcctagccgg attacttctc
taatttatat agattttgat gagctggaat 180gaatcctggc ttattccggt acaaccgaac
aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta
ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat ggcagggtga
ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct gtttgttagc gcatgtacaa
ggaatgcaag ttttgagcga gggggcatca 480aagatctggc tgtgtttcca gctgtttttg
ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct cgcttgtata
gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag taagtcgtgt
ttagaaatta tttttttata tacctttttt 720ccttctatgt acagtaggac acagtgtcag
cgccgcgttg acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt
gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc aatacgaagc
accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac cgagccgcaa
gcaccgaatt gttaaatctg acgtgaatct 1020gttgttttag agctagaaat agcaagttaa
aataaggcta gtccgttatc aacttgaaaa 1080agtggcaccg agtcggtgct tttttttt
11082461106DNAArtificial
SequenceMHP14Cas-3 gRNA cassette 246tgagagtaca atgatgaacc tagattaatc
aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa gattgcttga
ggccctgttc ggttgttccg gattagagcc 120ccggattaat tcctagccgg attacttctc
taatttatat agattttgat gagctggaat 180gaatcctggc ttattccggt acaaccgaac
aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta
ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat ggcagggtga
ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct gtttgttagc gcatgtacaa
ggaatgcaag ttttgagcga gggggcatca 480aagatctggc tgtgtttcca gctgtttttg
ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct cgcttgtata
gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag taagtcgtgt
ttagaaatta tttttttata tacctttttt 720ccttctatgt acagtaggac acagtgtcag
cgccgcgttg acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt
gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc aatacgaagc
accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac cgagccgcaa
gcaccgaatt gcaaacattg aagcgacata 1020ggttttagag ctagaaatag caagttaaaa
taaggctagt ccgttatcaa cttgaaaaag 1080tggcaccgag tcggtgcttt tttttt
11062471103DNAArtificial
SequenceTS8Cas-1 guideRNA cassette 247tgagagtaca atgatgaacc tagattaatc
aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa gattgcttga
ggccctgttc ggttgttccg gattagagcc 120ccggattaat tcctagccgg attacttctc
taatttatat agattttgat gagctggaat 180gaatcctggc ttattccggt acaaccgaac
aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta
ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat ggcagggtga
ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct gtttgttagc gcatgtacaa
ggaatgcaag ttttgagcga gggggcatca 480aagatctggc tgtgtttcca gctgtttttg
ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct cgcttgtata
gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag taagtcgtgt
ttagaaatta tttttttata tacctttttt 720ccttctatgt acagtaggac acagtgtcag
cgccgcgttg acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt
gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc aatacgaagc
accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac cgagccgcaa
gcaccgaatt gtacgtaacg tgcagtacgt 1020tttagagcta gaaatagcaa gttaaaataa
ggctagtccg ttatcaactt gaaaaagtgg 1080caccgagtcg gtgctttttt ttt
11032481105DNAArtificial
SequenceTS8Cas-2 guideRNA cassette 248tgagagtaca atgatgaacc tagattaatc
aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa gattgcttga
ggccctgttc ggttgttccg gattagagcc 120ccggattaat tcctagccgg attacttctc
taatttatat agattttgat gagctggaat 180gaatcctggc ttattccggt acaaccgaac
aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta
ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat ggcagggtga
ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct gtttgttagc gcatgtacaa
ggaatgcaag ttttgagcga gggggcatca 480aagatctggc tgtgtttcca gctgtttttg
ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct cgcttgtata
gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag taagtcgtgt
ttagaaatta tttttttata tacctttttt 720ccttctatgt acagtaggac acagtgtcag
cgccgcgttg acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt
gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc aatacgaagc
accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac cgagccgcaa
gcaccgaatt gctcatcagt gatcagctgg 1020gttttagagc tagaaatagc aagttaaaat
aaggctagtc cgttatcaac ttgaaaaagt 1080ggcaccgagt cggtgctttt ttttt
11052491102DNAArtificial
SequenceTS9Cas-2 guideRNA cassette 249tgagagtaca atgatgaacc tagattaatc
aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa gattgcttga
ggccctgttc ggttgttccg gattagagcc 120ccggattaat tcctagccgg attacttctc
taatttatat agattttgat gagctggaat 180gaatcctggc ttattccggt acaaccgaac
aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta
ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat ggcagggtga
ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct gtttgttagc gcatgtacaa
ggaatgcaag ttttgagcga gggggcatca 480aagatctggc tgtgtttcca gctgtttttg
ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct cgcttgtata
gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag taagtcgtgt
ttagaaatta tttttttata tacctttttt 720ccttctatgt acagtaggac acagtgtcag
cgccgcgttg acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt
gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc aatacgaagc
accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac cgagccgcaa
gcaccgaatt ggctgtttgc ggcctcggtt 1020ttagagctag aaatagcaag ttaaaataag
gctagtccgt tatcaacttg aaaaagtggc 1080accgagtcgg tgcttttttt tt
11022501106DNAArtificial
SequenceTS9Cas-3 guideRNA cassette 250tgagagtaca atgatgaacc tagattaatc
aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa gattgcttga
ggccctgttc ggttgttccg gattagagcc 120ccggattaat tcctagccgg attacttctc
taatttatat agattttgat gagctggaat 180gaatcctggc ttattccggt acaaccgaac
aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta
ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat ggcagggtga
ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct gtttgttagc gcatgtacaa
ggaatgcaag ttttgagcga gggggcatca 480aagatctggc tgtgtttcca gctgtttttg
ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct cgcttgtata
gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag taagtcgtgt
ttagaaatta tttttttata tacctttttt 720ccttctatgt acagtaggac acagtgtcag
cgccgcgttg acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt
gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc aatacgaagc
accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac cgagccgcaa
gcaccgaatt gcctcgaggt tgcacgcacg 1020tgttttagag ctagaaatag caagttaaaa
taaggctagt ccgttatcaa cttgaaaaag 1080tggcaccgag tcggtgcttt tttttt
11062511105DNAArtificial
SequenceTS10Cas-1 guideRNA cassette 251tgagagtaca atgatgaacc tagattaatc
aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa gattgcttga
ggccctgttc ggttgttccg gattagagcc 120ccggattaat tcctagccgg attacttctc
taatttatat agattttgat gagctggaat 180gaatcctggc ttattccggt acaaccgaac
aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta
ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat ggcagggtga
ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct gtttgttagc gcatgtacaa
ggaatgcaag ttttgagcga gggggcatca 480aagatctggc tgtgtttcca gctgtttttg
ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct cgcttgtata
gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag taagtcgtgt
ttagaaatta tttttttata tacctttttt 720ccttctatgt acagtaggac acagtgtcag
cgccgcgttg acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt
gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc aatacgaagc
accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac cgagccgcaa
gcaccgaatt gcctcgcctt cgctagttaa 1020gttttagagc tagaaatagc aagttaaaat
aaggctagtc cgttatcaac ttgaaaaagt 1080ggcaccgagt cggtgctttt ttttt
11052521103DNAArtificial SequenceTS10Cas-3
guideRNA cassette 252tgagagtaca atgatgaacc tagattaatc aatgccaaag
tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa gattgcttga ggccctgttc
ggttgttccg gattagagcc 120ccggattaat tcctagccgg attacttctc taatttatat
agattttgat gagctggaat 180gaatcctggc ttattccggt acaaccgaac aggccctgaa
ggataccagt aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta ttgcagcaag
gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt tagagatggt
ggccatgggc gcatgtcctg 360gccaactttg tatgatatat ggcagggtga ataggaaagt
aaaattgtat tgtaaaaagg 420gatttcttct gtttgttagc gcatgtacaa ggaatgcaag
ttttgagcga gggggcatca 480aagatctggc tgtgtttcca gctgtttttg ttagccccat
cgaatccttg acataatgat 540cccgcttaaa taagcaacct cgcttgtata gttccttgtg
ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt tccaggccca
gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag taagtcgtgt ttagaaatta
tttttttata tacctttttt 720ccttctatgt acagtaggac acagtgtcag cgccgcgttg
acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt gccaaaaact
tcgtcacaga gagggccata 840agaaacatgg cccacggccc aatacgaagc accgcgacga
agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa cgttttgtcc
caccttgact aatcacaaga 960gtggagcgta ccttataaac cgagccgcaa gcaccgaatt
gctcgtgttg gagatacagt 1020tttagagcta gaaatagcaa gttaaaataa ggctagtccg
ttatcaactt gaaaaagtgg 1080caccgagtcg gtgctttttt ttt
11032534928DNAArtificial SequenceMHP14Cas1 donor
253gccatgtcat cttgtagtta gggcttggag ctagtcgacc gttggaggct ttgtcctcat
60gcggcaccgg acagtctggt gctacaccgg acagtccggt gcccctctga ccatctgctc
120tgacatctga attgcactgt tcactttgca gagtcgacca ttgcgtgcag gtagccattg
180ctccgctggt gcaccggaca gtccagtggc acaccggaca gtccgatgaa ttatagcgga
240gctgcgcctg ggaaacccga agctgaggag tttgagctga ttcaccctgg tgcaccggac
300actgtccggt ggcacactgg acagtccggt gcgccggacc agggcacact tcggtttcct
360ttttgctcct ttcttttgaa gcctaacttg ttcttttgat tggtttgtgt tgaaccttta
420gcacctgtag aatgtatgat ctagagcaaa ctagttagtc caattatttg tgttgggcaa
480ttcaaccacc aaaaacattt aggaaaatgt ttgatcttat ttccctttca tattctctta
540ttgctagttg tcggggtgaa gttgagctct tgcttaggtt ttaattagtg ttgattttta
600gaaaaaccca attcaccccc ctcttgggca tcgtgatcct tttagcaaca aaatgtgcac
660acatcaaaac aagcgcttct accatatgta gttgttgcac aataatggtc ctccttagga
720tttgcaaccg tttaacaata gctatgtgac cacagattta tgtcggatgc acgaaaattg
780taggatttta catttcttta ccttggttca caaacattga agcgacatag tggtgtatga
840aaggaagcac ttgtttttca attccaaacc gcggtaccat ttaaatctta agcctaggat
900aacttcgtat agcatacatt atacgaagtt atggcgccgc tagcctgcag tgcagcgtga
960cccggtcgtg cccctctcta gagataatga gcattgcatg tctaagttat aaaaaattac
1020cacatatttt ttttgtcaca cttgtttgaa gtgcagttta tctatcttta tacatatatt
1080taaactttac tctacgaata atataatcta tagtactaca ataatatcag tgttttagag
1140aatcatataa atgaacagtt agacatggtc taaaggacaa ttgagtattt tgacaacagg
1200actctacagt tttatctttt tagtgtgcat gtgttctcct ttttttttgc aaatagcttc
1260acctatataa tacttcatcc attttattag tacatccatt tagggtttag ggttaatggt
1320ttttatagac taattttttt agtacatcta ttttattcta ttttagcctc taaattaaga
1380aaactaaaac tctattttag tttttttatt taataattta gatataaaat agaataaaat
1440aaagtgacta aaaattaaac aaataccctt taagaaatta aaaaaactaa ggaaacattt
1500ttcttgtttc gagtagataa tgccagcctg ttaaacgccg tcgacgagtc taacggacac
1560caaccagcga accagcagcg tcgcgtcggg ccaagcgaag cagacggcac ggcatctctg
1620tcgctgcctc tggacccctc tcgagagttc cgctccaccg ttggacttgc tccgctgtcg
1680gcatccagaa attgcgtggc ggagcggcag acgtgagccg gcacggcagg cggcctcctc
1740ctcctctcac ggcaccggca gctacggggg attcctttcc caccgctcct tcgctttccc
1800ttcctcgccc gccgtaataa atagacaccc cctccacacc ctctttcccc aacctcgtgt
1860tgttcggagc gcacacacac acaaccagat ctcccccaaa tccacccgtc ggcacctccg
1920cttcaaggta cgccgctcgt cctccccccc ccccctctct accttctcta gatcggcgtt
1980ccggtccatg catggttagg gcccggtagt tctacttctg ttcatgtttg tgttagatcc
2040gtgtttgtgt tagatccgtg ctgctagcgt tcgtacacgg atgcgacctg tacgtcagac
2100acgttctgat tgctaacttg ccagtgtttc tctttgggga atcctgggat ggctctagcc
2160gttccgcaga cgggatcgat ttcatgattt tttttgtttc gttgcatagg gtttggtttg
2220cccttttcct ttatttcaat atatgccgtg cacttgtttg tcgggtcatc ttttcatgct
2280tttttttgtc ttggttgtga tgatgtggtc tggttgggcg gtcgttctag atcggagtag
2340aattctgttt caaactacct ggtggattta ttaattttgg atctgtatgt gtgtgccata
2400catattcata gttacgaatt gaagatgatg gatggaaata tcgatctagg ataggtatac
2460atgttgatgc gggttttact gatgcatata cagagatgct ttttgttcgc ttggttgtga
2520tgatgtggtg tggttgggcg gtcgttcatt cgttctagat cggagtagaa tactgtttca
2580aactacctgg tgtatttatt aattttggaa ctgtatgtgt gtgtcataca tcttcatagt
2640tacgagttta agatggatgg aaatatcgat ctaggatagg tatacatgtt gatgtgggtt
2700ttactgatgc atatacatga tggcatatgc agcatctatt catatgctct aaccttgagt
2760acctatctat tataataaac aagtatgttt tataattatt ttgatcttga tatacttgga
2820tgatggcata tgcagcagct atatgtggat ttttttagcc ctgccttcat acgctattta
2880tttgcttggt actgtttctt ttgtcgatgc tcaccctgtt gtttggtgtt acttctgcag
2940gtcgactcta gaggatcaat tcgctagcga agttcctatt ccgaagttcc tattctctag
3000aaagtatagg aacttcagat ccaccgggat ccccgatcat gcaaaaactc attaactcag
3060tgcaaaacta tgcctggggc agcaaaacgg cgttgactga actttatggt atggaaaatc
3120cgtccagcca gccgatggcc gagctgtgga tgggcgcaca tccgaaaagc agttcacgag
3180tgcagaatgc cgccggagat atcgtttcac tgcgtgatgt gattgagagt gataaatcga
3240ctctgctcgg agaggccgtt gccaaacgct ttggcgaact gcctttcctg ttcaaagtat
3300tatgcgcagc acagccactc tccattcagg ttcatccaaa caaacacaat tctgaaatcg
3360gttttgccaa agaaaatgcc gcaggtatcc cgatggatgc cgccgagcgt aactataaag
3420atcctaacca caagccggag ctggtttttg cgctgacgcc tttccttgcg atgaacgcgt
3480ttcgtgaatt ttccgagatt gtctccctac tccagccggt cgcaggtgca catccggcga
3540ttgctcactt tttacaacag cctgatgccg aacgtttaag cgaactgttc gccagcctgt
3600tgaatatgca gggtgaagaa aaatcccgcg cgctggcgat tttaaaatcg gccctcgata
3660gccagcaggg tgaaccgtgg caaacgattc gtttaatttc tgaattttac ccggaagaca
3720gcggtctgtt ctccccgcta ttgctgaatg tggtgaaatt gaaccctggc gaagcgatgt
3780tcctgttcgc tgaaacaccg cacgcttacc tgcaaggcgt ggcgctggaa gtgatggcaa
3840actccgataa cgtgctgcgt gcgggtctga cgcctaaata cattgatatt ccggaactgg
3900ttgccaatgt gaaattcgaa gccaaaccgg ctaaccagtt gttgacccag ccggtgaaac
3960aaggtgcaga actggacttc ccgattccag tggatgattt tgccttctcg ctgcatgacc
4020ttagtgataa agaaaccacc attagccagc agagtgccgc cattttgttc tgcgtcgaag
4080gcgatgcaac gttgtggaaa ggttctcagc agttacagct taaaccgggt gaatcagcgt
4140ttattgccgc caacgaatca ccggtgactg tcaaaggcca cggccgttta gcgcgtgttt
4200acaacaagct gtaagagctt actgaaaaaa ttaacatctc ttgctaagct gggggtggaa
4260cctagacttg tccatcttct ggattggcca acttaattaa tgtatgaaat aaaaggatgc
4320acacatagtg acatgctaat cactataatg tgggcatcaa agttgtgtgt tatgtgtaat
4380tactagttat ctgaataaaa gagaaagaga tcatccatat ttcttatcct aaatgaatgt
4440cacgtgtctt tataattctt tgatgaacca gatgcatttc attaaccaaa tccatataca
4500tataaatatt aatcatatat aattaatatc aattgggtta gcaaaacaaa tctagtctag
4560gtgtgttttg cgaatgcggc cctagcgtat acgaagttcc tattccgaag ttcctattct
4620ccagaaagta taggaacttc tgtacacctg agctgattcc gatgacttcg taggttccta
4680gctcaagccg ctcgtgtcca agcgtcactt acgattagct aatgattacg gcatctagga
4740ccgactagct aactaactag taccgaggcc ggccccgcgg gagctcggcg cgccagattc
4800acgtcagatt taaccaaaac tatattatga ggtacacata ttacaatcca aaatgaatta
4860tctagttctc gagttgtaca cagtttatca cgtgttttac acattccaac cctaaactcc
4920aaccgtgg
49282544570DNAArtificial SequenceMHP14Cas3 donor 254acacttcggt ttcctttttg
ctcctttctt ttgaagccta acttgttctt ttgattggtt 60tgtgttgaac ctttagcacc
tgtagaatgt atgatctaga gcaaactagt tagtccaatt 120atttgtgttg ggcaattcaa
ccaccaaaaa catttaggaa aatgtttgat cttatttccc 180tttcatattc tcttattgct
agttgtcggg gtgaagttga gctcttgctt aggttttaat 240tagtgttgat ttttagaaaa
acccaattca cccccctctt gggcatcgtg atccttttag 300caacaaaatg tgcacacatc
aaaacaagcg cttctaccat atgtagttgt tgcacaataa 360tggtcctcct taggatttgc
aaccgtttaa caatagctat gtgaccacag atttatgtcg 420gatgcacgaa aattgtagga
ttttacattt ctttaccttg gttcacaaac attgaagcga 480caggtaccat ttaaatctta
agcctaggat aacttcgtat agcatacatt atacgaagtt 540atggcgccgc tagcctgcag
tgcagcgtga cccggtcgtg cccctctcta gagataatga 600gcattgcatg tctaagttat
aaaaaattac cacatatttt ttttgtcaca cttgtttgaa 660gtgcagttta tctatcttta
tacatatatt taaactttac tctacgaata atataatcta 720tagtactaca ataatatcag
tgttttagag aatcatataa atgaacagtt agacatggtc 780taaaggacaa ttgagtattt
tgacaacagg actctacagt tttatctttt tagtgtgcat 840gtgttctcct ttttttttgc
aaatagcttc acctatataa tacttcatcc attttattag 900tacatccatt tagggtttag
ggttaatggt ttttatagac taattttttt agtacatcta 960ttttattcta ttttagcctc
taaattaaga aaactaaaac tctattttag tttttttatt 1020taataattta gatataaaat
agaataaaat aaagtgacta aaaattaaac aaataccctt 1080taagaaatta aaaaaactaa
ggaaacattt ttcttgtttc gagtagataa tgccagcctg 1140ttaaacgccg tcgacgagtc
taacggacac caaccagcga accagcagcg tcgcgtcggg 1200ccaagcgaag cagacggcac
ggcatctctg tcgctgcctc tggacccctc tcgagagttc 1260cgctccaccg ttggacttgc
tccgctgtcg gcatccagaa attgcgtggc ggagcggcag 1320acgtgagccg gcacggcagg
cggcctcctc ctcctctcac ggcaccggca gctacggggg 1380attcctttcc caccgctcct
tcgctttccc ttcctcgccc gccgtaataa atagacaccc 1440cctccacacc ctctttcccc
aacctcgtgt tgttcggagc gcacacacac acaaccagat 1500ctcccccaaa tccacccgtc
ggcacctccg cttcaaggta cgccgctcgt cctccccccc 1560ccccctctct accttctcta
gatcggcgtt ccggtccatg catggttagg gcccggtagt 1620tctacttctg ttcatgtttg
tgttagatcc gtgtttgtgt tagatccgtg ctgctagcgt 1680tcgtacacgg atgcgacctg
tacgtcagac acgttctgat tgctaacttg ccagtgtttc 1740tctttgggga atcctgggat
ggctctagcc gttccgcaga cgggatcgat ttcatgattt 1800tttttgtttc gttgcatagg
gtttggtttg cccttttcct ttatttcaat atatgccgtg 1860cacttgtttg tcgggtcatc
ttttcatgct tttttttgtc ttggttgtga tgatgtggtc 1920tggttgggcg gtcgttctag
atcggagtag aattctgttt caaactacct ggtggattta 1980ttaattttgg atctgtatgt
gtgtgccata catattcata gttacgaatt gaagatgatg 2040gatggaaata tcgatctagg
ataggtatac atgttgatgc gggttttact gatgcatata 2100cagagatgct ttttgttcgc
ttggttgtga tgatgtggtg tggttgggcg gtcgttcatt 2160cgttctagat cggagtagaa
tactgtttca aactacctgg tgtatttatt aattttggaa 2220ctgtatgtgt gtgtcataca
tcttcatagt tacgagttta agatggatgg aaatatcgat 2280ctaggatagg tatacatgtt
gatgtgggtt ttactgatgc atatacatga tggcatatgc 2340agcatctatt catatgctct
aaccttgagt acctatctat tataataaac aagtatgttt 2400tataattatt ttgatcttga
tatacttgga tgatggcata tgcagcagct atatgtggat 2460ttttttagcc ctgccttcat
acgctattta tttgcttggt actgtttctt ttgtcgatgc 2520tcaccctgtt gtttggtgtt
acttctgcag gtcgactcta gaggatcaat tcgctagcga 2580agttcctatt ccgaagttcc
tattctctag aaagtatagg aacttcagat ccaccgggat 2640ccccgatcat gcaaaaactc
attaactcag tgcaaaacta tgcctggggc agcaaaacgg 2700cgttgactga actttatggt
atggaaaatc cgtccagcca gccgatggcc gagctgtgga 2760tgggcgcaca tccgaaaagc
agttcacgag tgcagaatgc cgccggagat atcgtttcac 2820tgcgtgatgt gattgagagt
gataaatcga ctctgctcgg agaggccgtt gccaaacgct 2880ttggcgaact gcctttcctg
ttcaaagtat tatgcgcagc acagccactc tccattcagg 2940ttcatccaaa caaacacaat
tctgaaatcg gttttgccaa agaaaatgcc gcaggtatcc 3000cgatggatgc cgccgagcgt
aactataaag atcctaacca caagccggag ctggtttttg 3060cgctgacgcc tttccttgcg
atgaacgcgt ttcgtgaatt ttccgagatt gtctccctac 3120tccagccggt cgcaggtgca
catccggcga ttgctcactt tttacaacag cctgatgccg 3180aacgtttaag cgaactgttc
gccagcctgt tgaatatgca gggtgaagaa aaatcccgcg 3240cgctggcgat tttaaaatcg
gccctcgata gccagcaggg tgaaccgtgg caaacgattc 3300gtttaatttc tgaattttac
ccggaagaca gcggtctgtt ctccccgcta ttgctgaatg 3360tggtgaaatt gaaccctggc
gaagcgatgt tcctgttcgc tgaaacaccg cacgcttacc 3420tgcaaggcgt ggcgctggaa
gtgatggcaa actccgataa cgtgctgcgt gcgggtctga 3480cgcctaaata cattgatatt
ccggaactgg ttgccaatgt gaaattcgaa gccaaaccgg 3540ctaaccagtt gttgacccag
ccggtgaaac aaggtgcaga actggacttc ccgattccag 3600tggatgattt tgccttctcg
ctgcatgacc ttagtgataa agaaaccacc attagccagc 3660agagtgccgc cattttgttc
tgcgtcgaag gcgatgcaac gttgtggaaa ggttctcagc 3720agttacagct taaaccgggt
gaatcagcgt ttattgccgc caacgaatca ccggtgactg 3780tcaaaggcca cggccgttta
gcgcgtgttt acaacaagct gtaagagctt actgaaaaaa 3840ttaacatctc ttgctaagct
gggggtggaa cctagacttg tccatcttct ggattggcca 3900acttaattaa tgtatgaaat
aaaaggatgc acacatagtg acatgctaat cactataatg 3960tgggcatcaa agttgtgtgt
tatgtgtaat tactagttat ctgaataaaa gagaaagaga 4020tcatccatat ttcttatcct
aaatgaatgt cacgtgtctt tataattctt tgatgaacca 4080gatgcatttc attaaccaaa
tccatataca tataaatatt aatcatatat aattaatatc 4140aattgggtta gcaaaacaaa
tctagtctag gtgtgttttg cgaatgcggc cctagcgtat 4200acgaagttcc tattccgaag
ttcctattct ccagaaagta taggaacttc tgtacacctg 4260agctgattcc gatgacttcg
taggttccta gctcaagccg ctcgtgtcca agcgtcactt 4320acgattagct aatgattacg
gcatctagga ccgactagct aactaactag taccgaggcc 4380ggccccgcgg gagctcggcg
cgcctagtgg tgtatgaaag gaagcacttg tttttcaatt 4440ccaaacagat tcacgtcaga
tttaaccaaa actatattat gaggtacaca tattacaatc 4500caaaatgaat tatctagttc
tcgagttgta cacagtttat cacgtgtttt acacattcca 4560accctaaact
45702555091DNAArtificial
SequenceTS8Cas-1 donor 255cacacatgac tgcctgagaa tctgctgccg ttgcctctca
tattatattc gatcccctga 60ctaaaaaaac tcggggccgg ctaatacgta ctgtacgtac
gcagaattta cggtccagca 120cgggcatgcc gcgcgggctg actttgctcc actgactcga
tcatgtgcgg attccatcgc 180ggcgtagcgt agccaaccgc aacgcaaacc gacttcatct
ttttttttta ttatgaacaa 240aaggagatcg agagaaacgt gaacggtaaa taatatatct
gatcccatgc atgcacgctg 300cctgggtcga tctcgctctc gctccgccca gacgaacatg
catgctggtc aggctcaacg 360ctcaggcggg caagctgtgg gaggacatgg gatgggagag
gaggacacat gcatgctggc 420cagtcaggca ctgtgctggc acatgaggta gggatagggg
ggccctcggc cagtgtccag 480gccgcatgca tgcatgcccc ccctgctgct cgaccgaaca
acgttggatg cctggattga 540tgcaacagtt tggacggacg gaccatacgt tatgtaccag
taggtaccat ttaaatctta 600agcctaggat aacttcgtat agcatacatt atacgaagtt
atggcgccgc tagcctgcag 660tgcagcgtga cccggtcgtg cccctctcta gagataatga
gcattgcatg tctaagttat 720aaaaaattac cacatatttt ttttgtcaca cttgtttgaa
gtgcagttta tctatcttta 780tacatatatt taaactttac tctacgaata atataatcta
tagtactaca ataatatcag 840tgttttagag aatcatataa atgaacagtt agacatggtc
taaaggacaa ttgagtattt 900tgacaacagg actctacagt tttatctttt tagtgtgcat
gtgttctcct ttttttttgc 960aaatagcttc acctatataa tacttcatcc attttattag
tacatccatt tagggtttag 1020ggttaatggt ttttatagac taattttttt agtacatcta
ttttattcta ttttagcctc 1080taaattaaga aaactaaaac tctattttag tttttttatt
taataattta gatataaaat 1140agaataaaat aaagtgacta aaaattaaac aaataccctt
taagaaatta aaaaaactaa 1200ggaaacattt ttcttgtttc gagtagataa tgccagcctg
ttaaacgccg tcgacgagtc 1260taacggacac caaccagcga accagcagcg tcgcgtcggg
ccaagcgaag cagacggcac 1320ggcatctctg tcgctgcctc tggacccctc tcgagagttc
cgctccaccg ttggacttgc 1380tccgctgtcg gcatccagaa attgcgtggc ggagcggcag
acgtgagccg gcacggcagg 1440cggcctcctc ctcctctcac ggcaccggca gctacggggg
attcctttcc caccgctcct 1500tcgctttccc ttcctcgccc gccgtaataa atagacaccc
cctccacacc ctctttcccc 1560aacctcgtgt tgttcggagc gcacacacac acaaccagat
ctcccccaaa tccacccgtc 1620ggcacctccg cttcaaggta cgccgctcgt cctccccccc
ccccctctct accttctcta 1680gatcggcgtt ccggtccatg catggttagg gcccggtagt
tctacttctg ttcatgtttg 1740tgttagatcc gtgtttgtgt tagatccgtg ctgctagcgt
tcgtacacgg atgcgacctg 1800tacgtcagac acgttctgat tgctaacttg ccagtgtttc
tctttgggga atcctgggat 1860ggctctagcc gttccgcaga cgggatcgat ttcatgattt
tttttgtttc gttgcatagg 1920gtttggtttg cccttttcct ttatttcaat atatgccgtg
cacttgtttg tcgggtcatc 1980ttttcatgct tttttttgtc ttggttgtga tgatgtggtc
tggttgggcg gtcgttctag 2040atcggagtag aattctgttt caaactacct ggtggattta
ttaattttgg atctgtatgt 2100gtgtgccata catattcata gttacgaatt gaagatgatg
gatggaaata tcgatctagg 2160ataggtatac atgttgatgc gggttttact gatgcatata
cagagatgct ttttgttcgc 2220ttggttgtga tgatgtggtg tggttgggcg gtcgttcatt
cgttctagat cggagtagaa 2280tactgtttca aactacctgg tgtatttatt aattttggaa
ctgtatgtgt gtgtcataca 2340tcttcatagt tacgagttta agatggatgg aaatatcgat
ctaggatagg tatacatgtt 2400gatgtgggtt ttactgatgc atatacatga tggcatatgc
agcatctatt catatgctct 2460aaccttgagt acctatctat tataataaac aagtatgttt
tataattatt ttgatcttga 2520tatacttgga tgatggcata tgcagcagct atatgtggat
ttttttagcc ctgccttcat 2580acgctattta tttgcttggt actgtttctt ttgtcgatgc
tcaccctgtt gtttggtgtt 2640acttctgcag gtcgactcta gaggatcaat tcgctagcga
agttcctatt ccgaagttcc 2700tattctctag aaagtatagg aacttcagat ccaccgggat
ccccgatcat gcaaaaactc 2760attaactcag tgcaaaacta tgcctggggc agcaaaacgg
cgttgactga actttatggt 2820atggaaaatc cgtccagcca gccgatggcc gagctgtgga
tgggcgcaca tccgaaaagc 2880agttcacgag tgcagaatgc cgccggagat atcgtttcac
tgcgtgatgt gattgagagt 2940gataaatcga ctctgctcgg agaggccgtt gccaaacgct
ttggcgaact gcctttcctg 3000ttcaaagtat tatgcgcagc acagccactc tccattcagg
ttcatccaaa caaacacaat 3060tctgaaatcg gttttgccaa agaaaatgcc gcaggtatcc
cgatggatgc cgccgagcgt 3120aactataaag atcctaacca caagccggag ctggtttttg
cgctgacgcc tttccttgcg 3180atgaacgcgt ttcgtgaatt ttccgagatt gtctccctac
tccagccggt cgcaggtgca 3240catccggcga ttgctcactt tttacaacag cctgatgccg
aacgtttaag cgaactgttc 3300gccagcctgt tgaatatgca gggtgaagaa aaatcccgcg
cgctggcgat tttaaaatcg 3360gccctcgata gccagcaggg tgaaccgtgg caaacgattc
gtttaatttc tgaattttac 3420ccggaagaca gcggtctgtt ctccccgcta ttgctgaatg
tggtgaaatt gaaccctggc 3480gaagcgatgt tcctgttcgc tgaaacaccg cacgcttacc
tgcaaggcgt ggcgctggaa 3540gtgatggcaa actccgataa cgtgctgcgt gcgggtctga
cgcctaaata cattgatatt 3600ccggaactgg ttgccaatgt gaaattcgaa gccaaaccgg
ctaaccagtt gttgacccag 3660ccggtgaaac aaggtgcaga actggacttc ccgattccag
tggatgattt tgccttctcg 3720ctgcatgacc ttagtgataa agaaaccacc attagccagc
agagtgccgc cattttgttc 3780tgcgtcgaag gcgatgcaac gttgtggaaa ggttctcagc
agttacagct taaaccgggt 3840gaatcagcgt ttattgccgc caacgaatca ccggtgactg
tcaaaggcca cggccgttta 3900gcgcgtgttt acaacaagct gtaagagctt actgaaaaaa
ttaacatctc ttgctaagct 3960gggggtggaa cctagacttg tccatcttct ggattggcca
acttaattaa tgtatgaaat 4020aaaaggatgc acacatagtg acatgctaat cactataatg
tgggcatcaa agttgtgtgt 4080tatgtgtaat tactagttat ctgaataaaa gagaaagaga
tcatccatat ttcttatcct 4140aaatgaatgt cacgtgtctt tataattctt tgatgaacca
gatgcatttc attaaccaaa 4200tccatataca tataaatatt aatcatatat aattaatatc
aattgggtta gcaaaacaaa 4260tctagtctag gtgtgttttg cgaatgcggc cctagcgtat
acgaagttcc tattccgaag 4320ttcctattct ccagaaagta taggaacttc tgtacacctg
agctgattcc gatgacttcg 4380taggttccta gctcaagccg ctcgtgtcca agcgtcactt
acgattagct aatgattacg 4440gcatctagga ccgactagct aactaactag taccgaggcc
ggccccgcgg actgcacgtt 4500acgtacgtac gaactaatat actccaccag ctgatcactg
atgagccgag ccgccatgca 4560ttgtaattta taacatgtgc ggctgtacgc ttccatctca
aatacctttt tatatatata 4620ttgtacttta tagtctacga cataatctgc catggtaatt
tataagatgt gctttattgc 4680tcgttgttct gttctcatct gtgtccatgg catggcatgg
atacaaaatg tatgtatggc 4740cacgcatcca atctgtgacg ttgtcaaggc agaggtccaa
ccgtccaaga ccctcttgtg 4800ccgccctgta cttgcagtca gtgacgttgt gagaaaaagc
tgtgggtggt ctccgcagag 4860cgcgcgggcc acgagaggga gccccatctc tcggccgagg
ggtacggggg ctccagacac 4920ggtcctttgg tttcttctgc ctgtagcgag cggccccgcc
ccccaccgcg ctgctagcct 4980agccgatgct gatccatcca ccacccacaa gggattgttc
cacgacttgt ggacctgacc 5040atgacgtgac ttcacgccat gtacgctcag ccgctcacta
gctttttttt c 5091
256 5237 DNA Artificial Sequence TS8Cas-2 donor
256 tctctttcag ggcttgttcg tttacgttgg attgcacccg gaatcgttac agctaatcaa
60agtttatata aattagagaa gcaaccggat aggaatcgtt ccgacccacc aattcgacac
120aaacgaacaa ggcctcaatc cttctcaatc cacctccaac ccaataagct cttggaggcg
180gcggcgggag agcagccaca cacatgactg cctgagaatc tgctgccgtt gcctctcata
240ttatattcga tcccctgact aaaaaaactc ggggccggct aatacgtact gtacgtacgc
300agaatttacg gtccagcacg ggcatgccgc gcgggctgac tttgctccac tgactcgatc
360atgtgcggat tccatcgcgg cgtagcgtag ccaaccgcaa cgcaaaccga cttcatcttt
420tttttttatt atgaacaaaa ggagatcgag agaaacgtga acggtaaata atatatctga
480tcccatgcat gcacgctgcc tgggtcgatc tcgctctcgc tccgcccaga cgaacatgca
540tgctggtcag gctcaacgct caggcgggca agctgtggga ggacatggga tgggagagga
600ggacacatgc atgctggcca gtcaggcact gtgctggcac atgaggtagg gatagggggg
660ccctcggcca gtgtccaggc cgcatgcatg catgcccccc ctgctgctcg accgaacaac
720gttggatgcc tggattgatg caacagtttg gacggacgga ccatacgtta tgtaccagta
780ctgcacgtta cgtacgtacg aactaatata ctccaccagg taccatttaa atcttaagcc
840taggataact tcgtatagca tacattatac gaagttatgg cgccgctagc ctgcagtgca
900gcgtgacccg gtcgtgcccc tctctagaga taatgagcat tgcatgtcta agttataaaa
960aattaccaca tatttttttt gtcacacttg tttgaagtgc agtttatcta tctttataca
1020tatatttaaa ctttactcta cgaataatat aatctatagt actacaataa tatcagtgtt
1080ttagagaatc atataaatga acagttagac atggtctaaa ggacaattga gtattttgac
1140aacaggactc tacagtttta tctttttagt gtgcatgtgt tctccttttt ttttgcaaat
1200agcttcacct atataatact tcatccattt tattagtaca tccatttagg gtttagggtt
1260aatggttttt atagactaat ttttttagta catctatttt attctatttt agcctctaaa
1320ttaagaaaac taaaactcta ttttagtttt tttatttaat aatttagata taaaatagaa
1380taaaataaag tgactaaaaa ttaaacaaat accctttaag aaattaaaaa aactaaggaa
1440acatttttct tgtttcgagt agataatgcc agcctgttaa acgccgtcga cgagtctaac
1500ggacaccaac cagcgaacca gcagcgtcgc gtcgggccaa gcgaagcaga cggcacggca
1560tctctgtcgc tgcctctgga cccctctcga gagttccgct ccaccgttgg acttgctccg
1620ctgtcggcat ccagaaattg cgtggcggag cggcagacgt gagccggcac ggcaggcggc
1680ctcctcctcc tctcacggca ccggcagcta cgggggattc ctttcccacc gctccttcgc
1740tttcccttcc tcgcccgccg taataaatag acaccccctc cacaccctct ttccccaacc
1800tcgtgttgtt cggagcgcac acacacacaa ccagatctcc cccaaatcca cccgtcggca
1860cctccgcttc aaggtacgcc gctcgtcctc cccccccccc ctctctacct tctctagatc
1920ggcgttccgg tccatgcatg gttagggccc ggtagttcta cttctgttca tgtttgtgtt
1980agatccgtgt ttgtgttaga tccgtgctgc tagcgttcgt acacggatgc gacctgtacg
2040tcagacacgt tctgattgct aacttgccag tgtttctctt tggggaatcc tgggatggct
2100ctagccgttc cgcagacggg atcgatttca tgattttttt tgtttcgttg catagggttt
2160ggtttgccct tttcctttat ttcaatatat gccgtgcact tgtttgtcgg gtcatctttt
2220catgcttttt tttgtcttgg ttgtgatgat gtggtctggt tgggcggtcg ttctagatcg
2280gagtagaatt ctgtttcaaa ctacctggtg gatttattaa ttttggatct gtatgtgtgt
2340gccatacata ttcatagtta cgaattgaag atgatggatg gaaatatcga tctaggatag
2400gtatacatgt tgatgcgggt tttactgatg catatacaga gatgcttttt gttcgcttgg
2460ttgtgatgat gtggtgtggt tgggcggtcg ttcattcgtt ctagatcgga gtagaatact
2520gtttcaaact acctggtgta tttattaatt ttggaactgt atgtgtgtgt catacatctt
2580catagttacg agtttaagat ggatggaaat atcgatctag gataggtata catgttgatg
2640tgggttttac tgatgcatat acatgatggc atatgcagca tctattcata tgctctaacc
2700ttgagtacct atctattata ataaacaagt atgttttata attattttga tcttgatata
2760cttggatgat ggcatatgca gcagctatat gtggattttt ttagccctgc cttcatacgc
2820tatttatttg cttggtactg tttcttttgt cgatgctcac cctgttgttt ggtgttactt
2880ctgcaggtcg actctagagg atcaattcgc tagcgaagtt cctattccga agttcctatt
2940ctctagaaag tataggaact tcagatccac cgggatcccc gatcatgcaa aaactcatta
3000actcagtgca aaactatgcc tggggcagca aaacggcgtt gactgaactt tatggtatgg
3060aaaatccgtc cagccagccg atggccgagc tgtggatggg cgcacatccg aaaagcagtt
3120cacgagtgca gaatgccgcc ggagatatcg tttcactgcg tgatgtgatt gagagtgata
3180aatcgactct gctcggagag gccgttgcca aacgctttgg cgaactgcct ttcctgttca
3240aagtattatg cgcagcacag ccactctcca ttcaggttca tccaaacaaa cacaattctg
3300aaatcggttt tgccaaagaa aatgccgcag gtatcccgat ggatgccgcc gagcgtaact
3360ataaagatcc taaccacaag ccggagctgg tttttgcgct gacgcctttc cttgcgatga
3420acgcgtttcg tgaattttcc gagattgtct ccctactcca gccggtcgca ggtgcacatc
3480cggcgattgc tcacttttta caacagcctg atgccgaacg tttaagcgaa ctgttcgcca
3540gcctgttgaa tatgcagggt gaagaaaaat cccgcgcgct ggcgatttta aaatcggccc
3600tcgatagcca gcagggtgaa ccgtggcaaa cgattcgttt aatttctgaa ttttacccgg
3660aagacagcgg tctgttctcc ccgctattgc tgaatgtggt gaaattgaac cctggcgaag
3720cgatgttcct gttcgctgaa acaccgcacg cttacctgca aggcgtggcg ctggaagtga
3780tggcaaactc cgataacgtg ctgcgtgcgg gtctgacgcc taaatacatt gatattccgg
3840aactggttgc caatgtgaaa ttcgaagcca aaccggctaa ccagttgttg acccagccgg
3900tgaaacaagg tgcagaactg gacttcccga ttccagtgga tgattttgcc ttctcgctgc
3960atgaccttag tgataaagaa accaccatta gccagcagag tgccgccatt ttgttctgcg
4020tcgaaggcga tgcaacgttg tggaaaggtt ctcagcagtt acagcttaaa ccgggtgaat
4080cagcgtttat tgccgccaac gaatcaccgg tgactgtcaa aggccacggc cgtttagcgc
4140gtgtttacaa caagctgtaa gagcttactg aaaaaattaa catctcttgc taagctgggg
4200gtggaaccta gacttgtcca tcttctggat tggccaactt aattaatgta tgaaataaaa
4260ggatgcacac atagtgacat gctaatcact ataatgtggg catcaaagtt gtgtgttatg
4320tgtaattact agttatctga ataaaagaga aagagatcat ccatatttct tatcctaaat
4380gaatgtcacg tgtctttata attctttgat gaaccagatg catttcatta accaaatcca
4440tatacatata aatattaatc atatataatt aatatcaatt gggttagcaa aacaaatcta
4500gtctaggtgt gttttgcgaa tgcggcccta gcgtatacga agttcctatt ccgaagttcc
4560tattctccag aaagtatagg aacttctgta cacctgagct gattccgatg acttcgtagg
4620ttcctagctc aagccgctcg tgtccaagcg tcacttacga ttagctaatg attacggcat
4680ctaggaccga ctagctaact aactagtacc gaggccggcc ccgcgggagc tcgctgatca
4740ctgatgagcc gagccgccat gcattgtaat ttataacatg tgcggctgta cgcttccatc
4800tcaaatacct ttttatatat atattgtact ttatagtcta cgacataatc tgccatggta
4860atttataaga tgtgctttat tgctcgttgt tctgttctca tctgtgtcca tggcatggca
4920tggatacaaa atgtatgtat ggccacgcat ccaatctgtg acgttgtcaa ggcagaggtc
4980caaccgtcca agaccctctt gtgccgccct gtacttgcag tcagtgacgt tgtgagaaaa
5040agctgtgggt ggtctccgca gagcgcgcgg gccacgagag ggagccccat ctctcggccg
5100aggggtacgg gggctccaga cacggtcctt tggtttcttc tgcctgtagc gagcggcccc
5160gccccccacc gcgctgctag cctagccgat gctgatccat ccaccaccca caagggattg
5220ttccacgact tgtggac
5237
257 5427 DNA Artificial Sequence TS9Cas-2 donor
257 agcaaggaac taaactgtta
ttggacgcta aagtttagta ctttatcttt aacatctttc 60agcatttcta tgtagatatt
taagggctaa attttagcaa gtgtgctgat aaattttagc 120ctaaatgttt ctgttgggct
aaattttagc aagtgtactg ttaaatttta gcatattcct 180tttagagtgg tatgggtgtg
catagactaa atgtttccgt tgggccctaa tttaacgatg 240tgtacgcagg cctgtttaga
tgacttggta ccggcatatg gcctcgtact gtttcatttg 300atgacgcgag cgtgcggccc
atgcagcagc agcacgccgg gaaggcagcg gattttgaag 360tactattgga cagcgcggcg
cggggaccgg gtcgttggcg cgcggtggag tgggggtggg 420tggtcctggc gtcctgccct
gcgcgatggt cgatggatgc cccatgcgcg tgtaaccgcc 480cagccgtcgc catccgacca
ggtgggcaga cgtacgtacg gtggcacgcc cacggcccat 540cggccatcgc gatcgcgttc
gtatcgtgtc ctcaataacg aaagcgccaa cggaaggcgc 600tgtcgtcgtc agttcaccgc
gcgccggcgc cctgtgtcct cgtccctctc gacttctcga 660ccagtaagaa ctctcgcgag
ctgcggagct gctggcgatg gccggccggt gggatccgac 720gtgcgtgcaa cctcgaattt
aaatcttaag cctaggataa cttcgtatag catacattat 780acgaagttat ggcgccgcta
gcctgcagtg cagcgtgacc cggtcgtgcc cctctctaga 840gataatgagc attgcatgtc
taagttataa aaaattacca catatttttt ttgtcacact 900tgtttgaagt gcagtttatc
tatctttata catatattta aactttactc tacgaataat 960ataatctata gtactacaat
aatatcagtg ttttagagaa tcatataaat gaacagttag 1020acatggtcta aaggacaatt
gagtattttg acaacaggac tctacagttt tatcttttta 1080gtgtgcatgt gttctccttt
ttttttgcaa atagcttcac ctatataata cttcatccat 1140tttattagta catccattta
gggtttaggg ttaatggttt ttatagacta atttttttag 1200tacatctatt ttattctatt
ttagcctcta aattaagaaa actaaaactc tattttagtt 1260tttttattta ataatttaga
tataaaatag aataaaataa agtgactaaa aattaaacaa 1320atacccttta agaaattaaa
aaaactaagg aaacattttt cttgtttcga gtagataatg 1380ccagcctgtt aaacgccgtc
gacgagtcta acggacacca accagcgaac cagcagcgtc 1440gcgtcgggcc aagcgaagca
gacggcacgg catctctgtc gctgcctctg gacccctctc 1500gagagttccg ctccaccgtt
ggacttgctc cgctgtcggc atccagaaat tgcgtggcgg 1560agcggcagac gtgagccggc
acggcaggcg gcctcctcct cctctcacgg caccggcagc 1620tacgggggat tcctttccca
ccgctccttc gctttccctt cctcgcccgc cgtaataaat 1680agacaccccc tccacaccct
ctttccccaa cctcgtgttg ttcggagcgc acacacacac 1740aaccagatct cccccaaatc
cacccgtcgg cacctccgct tcaaggtacg ccgctcgtcc 1800tccccccccc ccctctctac
cttctctaga tcggcgttcc ggtccatgca tggttagggc 1860ccggtagttc tacttctgtt
catgtttgtg ttagatccgt gtttgtgtta gatccgtgct 1920gctagcgttc gtacacggat
gcgacctgta cgtcagacac gttctgattg ctaacttgcc 1980agtgtttctc tttggggaat
cctgggatgg ctctagccgt tccgcagacg ggatcgattt 2040catgattttt tttgtttcgt
tgcatagggt ttggtttgcc cttttccttt atttcaatat 2100atgccgtgca cttgtttgtc
gggtcatctt ttcatgcttt tttttgtctt ggttgtgatg 2160atgtggtctg gttgggcggt
cgttctagat cggagtagaa ttctgtttca aactacctgg 2220tggatttatt aattttggat
ctgtatgtgt gtgccataca tattcatagt tacgaattga 2280agatgatgga tggaaatatc
gatctaggat aggtatacat gttgatgcgg gttttactga 2340tgcatataca gagatgcttt
ttgttcgctt ggttgtgatg atgtggtgtg gttgggcggt 2400cgttcattcg ttctagatcg
gagtagaata ctgtttcaaa ctacctggtg tatttattaa 2460ttttggaact gtatgtgtgt
gtcatacatc ttcatagtta cgagtttaag atggatggaa 2520atatcgatct aggataggta
tacatgttga tgtgggtttt actgatgcat atacatgatg 2580gcatatgcag catctattca
tatgctctaa ccttgagtac ctatctatta taataaacaa 2640gtatgtttta taattatttt
gatcttgata tacttggatg atggcatatg cagcagctat 2700atgtggattt ttttagccct
gccttcatac gctatttatt tgcttggtac tgtttctttt 2760gtcgatgctc accctgttgt
ttggtgttac ttctgcaggt cgactctaga ggatcaattc 2820gctagcgaag ttcctattcc
gaagttccta ttctctagaa agtataggaa cttcagatcc 2880accgggatcc ccgatcatgc
aaaaactcat taactcagtg caaaactatg cctggggcag 2940caaaacggcg ttgactgaac
tttatggtat ggaaaatccg tccagccagc cgatggccga 3000gctgtggatg ggcgcacatc
cgaaaagcag ttcacgagtg cagaatgccg ccggagatat 3060cgtttcactg cgtgatgtga
ttgagagtga taaatcgact ctgctcggag aggccgttgc 3120caaacgcttt ggcgaactgc
ctttcctgtt caaagtatta tgcgcagcac agccactctc 3180cattcaggtt catccaaaca
aacacaattc tgaaatcggt tttgccaaag aaaatgccgc 3240aggtatcccg atggatgccg
ccgagcgtaa ctataaagat cctaaccaca agccggagct 3300ggtttttgcg ctgacgcctt
tccttgcgat gaacgcgttt cgtgaatttt ccgagattgt 3360ctccctactc cagccggtcg
caggtgcaca tccggcgatt gctcactttt tacaacagcc 3420tgatgccgaa cgtttaagcg
aactgttcgc cagcctgttg aatatgcagg gtgaagaaaa 3480atcccgcgcg ctggcgattt
taaaatcggc cctcgatagc cagcagggtg aaccgtggca 3540aacgattcgt ttaatttctg
aattttaccc ggaagacagc ggtctgttct ccccgctatt 3600gctgaatgtg gtgaaattga
accctggcga agcgatgttc ctgttcgctg aaacaccgca 3660cgcttacctg caaggcgtgg
cgctggaagt gatggcaaac tccgataacg tgctgcgtgc 3720gggtctgacg cctaaataca
ttgatattcc ggaactggtt gccaatgtga aattcgaagc 3780caaaccggct aaccagttgt
tgacccagcc ggtgaaacaa ggtgcagaac tggacttccc 3840gattccagtg gatgattttg
ccttctcgct gcatgacctt agtgataaag aaaccaccat 3900tagccagcag agtgccgcca
ttttgttctg cgtcgaaggc gatgcaacgt tgtggaaagg 3960ttctcagcag ttacagctta
aaccgggtga atcagcgttt attgccgcca acgaatcacc 4020ggtgactgtc aaaggccacg
gccgtttagc gcgtgtttac aacaagctgt aagagcttac 4080tgaaaaaatt aacatctctt
gctaagctgg gggtggaacc tagacttgtc catcttctgg 4140attggccaac ttaattaatg
tatgaaataa aaggatgcac acatagtgac atgctaatca 4200ctataatgtg ggcatcaaag
ttgtgtgtta tgtgtaatta ctagttatct gaataaaaga 4260gaaagagatc atccatattt
cttatcctaa atgaatgtca cgtgtcttta taattctttg 4320atgaaccaga tgcatttcat
taaccaaatc catatacata taaatattaa tcatatataa 4380ttaatatcaa ttgggttagc
aaaacaaatc tagtctaggt gtgttttgcg aatgcggccc 4440tagcgtatac gaagttccta
ttccgaagtt cctattctcc agaaagtata ggaacttctg 4500tacacctgag ctgattccga
tgacttcgta ggttcctagc tcaagccgct cgtgtccaag 4560cgtcacttac gattagctaa
tgattacggc atctaggacc gactagctaa ctaactagta 4620ccgaggccgg ccccgcggga
gctcggccgc aaacagcctg gtgacagacg aagccagcaa 4680gcacgtacgt acgcacgtct
ctgctggtct ggatgtgtat ggatatggac gtctcacgtc 4740tggacgtcgt cgtcgccgtt
gtattgtatc atgccaacca cttccgtacc gtaccccctc 4800gcgtgccaac atgaccaccg
ccggtacgtc tccatcgtcg gccgtcggcg tctcaggcag 4860ctctcaatta agcggacgtg
ttttggtaat ctggtggaac gccgcgcgca ctgagggttt 4920gggggccccg gcggacgagc
gagcgagaga cggtgcatgc atgccaaatg gcaacgaggg 4980cccgcccgcc catccaataa
ccaacccaga cgtagcgcaa ccaacgtacg agtcctgtgc 5040tggcgcgtac gactaccacg
ctagctgccg cgacatgcga actacggtcc accaggcacc 5100agccatgaca atatatactg
tatatatatt tttcttcttc tttttgtttc cgctctctca 5160agttcctgct ctgctcctgc
ctgtccgcgg tgccgatcgg cgagagagca tgcatggaca 5220tggaccacgc gagatccagg
aaccggcacg ggcccatgcg tggcaggcgg ccgtttcgtc 5280aggttccccg aaatgcccca
actgcgcggc tgcaggatgg ctcatggctg gctgcctagc 5340tggcccgtga caccgatcga
tcggtaacga cgacgcacgc acctgaagca caggaaggag 5400cctccctctc gcatgcacgt
tagtact 5427
258 5426 DNA Artificial Sequence TS9Cas-3 donor
258 agcaaggaac taaactgtta ttggacgcaa agtttagtac
tttatcttta acatctttca 60gcatttctat gtagatattt aagggctaaa ttttagcaag
tgtgctgata aattttagcc 120taaatgtttc tgttgggcta aattttagca agtgtactgt
taaattttag catattcctt 180ttagagtggt atgggtgtgc atagactaaa tgtttccgtt
gggccctaat ttaacgatgt 240gtacgcaggc ctgtttagat gacttggtac cggcatatgg
cctcgtactg tttcatttga 300tgacgcgagc gtgcggccca tgcagcagca gcacgccggg
aaggcagcgg attttgaagt 360actattggac agcgcggcgc ggggaccggg tcgttggcgc
gcggtggagt gggggtgggt 420ggtcctggcg tcctgccctg cgcgatggtc gatggatgcc
ccatgcgcgt gtaaccgccc 480agccgtcgcc atccgaccag gtgggcagac gtacgtacgg
tggcacgccc acggcccatc 540ggccatcgcg atcgcgttcg tatcgtgtcc tcaataacga
aagcgccaac ggaaggcgct 600gtcgtcgtca gttcaccgcg cgccggcgcc ctgtgtcctc
gtccctctcg acttctcgac 660cagtaagaac tctcgcgagc tgcggagctg ctggcgatgg
ccggccggtg ggatccgacg 720atttaaatct taagcctagg ataacttcgt atagcataca
ttatacgaag ttatggcgcc 780gctagcctgc agtgcagcgt gacccggtcg tgcccctctc
tagagataat gagcattgca 840tgtctaagtt ataaaaaatt accacatatt ttttttgtca
cacttgtttg aagtgcagtt 900tatctatctt tatacatata tttaaacttt actctacgaa
taatataatc tatagtacta 960caataatatc agtgttttag agaatcatat aaatgaacag
ttagacatgg tctaaaggac 1020aattgagtat tttgacaaca ggactctaca gttttatctt
tttagtgtgc atgtgttctc 1080cttttttttt gcaaatagct tcacctatat aatacttcat
ccattttatt agtacatcca 1140tttagggttt agggttaatg gtttttatag actaattttt
ttagtacatc tattttattc 1200tattttagcc tctaaattaa gaaaactaaa actctatttt
agttttttta tttaataatt 1260tagatataaa atagaataaa ataaagtgac taaaaattaa
acaaataccc tttaagaaat 1320taaaaaaact aaggaaacat ttttcttgtt tcgagtagat
aatgccagcc tgttaaacgc 1380cgtcgacgag tctaacggac accaaccagc gaaccagcag
cgtcgcgtcg ggccaagcga 1440agcagacggc acggcatctc tgtcgctgcc tctggacccc
tctcgagagt tccgctccac 1500cgttggactt gctccgctgt cggcatccag aaattgcgtg
gcggagcggc agacgtgagc 1560cggcacggca ggcggcctcc tcctcctctc acggcaccgg
cagctacggg ggattccttt 1620cccaccgctc cttcgctttc ccttcctcgc ccgccgtaat
aaatagacac cccctccaca 1680ccctctttcc ccaacctcgt gttgttcgga gcgcacacac
acacaaccag atctccccca 1740aatccacccg tcggcacctc cgcttcaagg tacgccgctc
gtcctccccc ccccccctct 1800ctaccttctc tagatcggcg ttccggtcca tgcatggtta
gggcccggta gttctacttc 1860tgttcatgtt tgtgttagat ccgtgtttgt gttagatccg
tgctgctagc gttcgtacac 1920ggatgcgacc tgtacgtcag acacgttctg attgctaact
tgccagtgtt tctctttggg 1980gaatcctggg atggctctag ccgttccgca gacgggatcg
atttcatgat tttttttgtt 2040tcgttgcata gggtttggtt tgcccttttc ctttatttca
atatatgccg tgcacttgtt 2100tgtcgggtca tcttttcatg cttttttttg tcttggttgt
gatgatgtgg tctggttggg 2160cggtcgttct agatcggagt agaattctgt ttcaaactac
ctggtggatt tattaatttt 2220ggatctgtat gtgtgtgcca tacatattca tagttacgaa
ttgaagatga tggatggaaa 2280tatcgatcta ggataggtat acatgttgat gcgggtttta
ctgatgcata tacagagatg 2340ctttttgttc gcttggttgt gatgatgtgg tgtggttggg
cggtcgttca ttcgttctag 2400atcggagtag aatactgttt caaactacct ggtgtattta
ttaattttgg aactgtatgt 2460gtgtgtcata catcttcata gttacgagtt taagatggat
ggaaatatcg atctaggata 2520ggtatacatg ttgatgtggg ttttactgat gcatatacat
gatggcatat gcagcatcta 2580ttcatatgct ctaaccttga gtacctatct attataataa
acaagtatgt tttataatta 2640ttttgatctt gatatacttg gatgatggca tatgcagcag
ctatatgtgg atttttttag 2700ccctgccttc atacgctatt tatttgcttg gtactgtttc
ttttgtcgat gctcaccctg 2760ttgtttggtg ttacttctgc aggtcgactc tagaggatca
attcgctagc gaagttccta 2820ttccgaagtt cctattctct agaaagtata ggaacttcag
atccaccggg atccccgatc 2880atgcaaaaac tcattaactc agtgcaaaac tatgcctggg
gcagcaaaac ggcgttgact 2940gaactttatg gtatggaaaa tccgtccagc cagccgatgg
ccgagctgtg gatgggcgca 3000catccgaaaa gcagttcacg agtgcagaat gccgccggag
atatcgtttc actgcgtgat 3060gtgattgaga gtgataaatc gactctgctc ggagaggccg
ttgccaaacg ctttggcgaa 3120ctgcctttcc tgttcaaagt attatgcgca gcacagccac
tctccattca ggttcatcca 3180aacaaacaca attctgaaat cggttttgcc aaagaaaatg
ccgcaggtat cccgatggat 3240gccgccgagc gtaactataa agatcctaac cacaagccgg
agctggtttt tgcgctgacg 3300cctttccttg cgatgaacgc gtttcgtgaa ttttccgaga
ttgtctccct actccagccg 3360gtcgcaggtg cacatccggc gattgctcac tttttacaac
agcctgatgc cgaacgttta 3420agcgaactgt tcgccagcct gttgaatatg cagggtgaag
aaaaatcccg cgcgctggcg 3480attttaaaat cggccctcga tagccagcag ggtgaaccgt
ggcaaacgat tcgtttaatt 3540tctgaatttt acccggaaga cagcggtctg ttctccccgc
tattgctgaa tgtggtgaaa 3600ttgaaccctg gcgaagcgat gttcctgttc gctgaaacac
cgcacgctta cctgcaaggc 3660gtggcgctgg aagtgatggc aaactccgat aacgtgctgc
gtgcgggtct gacgcctaaa 3720tacattgata ttccggaact ggttgccaat gtgaaattcg
aagccaaacc ggctaaccag 3780ttgttgaccc agccggtgaa acaaggtgca gaactggact
tcccgattcc agtggatgat 3840tttgccttct cgctgcatga ccttagtgat aaagaaacca
ccattagcca gcagagtgcc 3900gccattttgt tctgcgtcga aggcgatgca acgttgtgga
aaggttctca gcagttacag 3960cttaaaccgg gtgaatcagc gtttattgcc gccaacgaat
caccggtgac tgtcaaaggc 4020cacggccgtt tagcgcgtgt ttacaacaag ctgtaagagc
ttactgaaaa aattaacatc 4080tcttgctaag ctgggggtgg aacctagact tgtccatctt
ctggattggc caacttaatt 4140aatgtatgaa ataaaaggat gcacacatag tgacatgcta
atcactataa tgtgggcatc 4200aaagttgtgt gttatgtgta attactagtt atctgaataa
aagagaaaga gatcatccat 4260atttcttatc ctaaatgaat gtcacgtgtc tttataattc
tttgatgaac cagatgcatt 4320tcattaacca aatccatata catataaata ttaatcatat
ataattaata tcaattgggt 4380tagcaaaaca aatctagtct aggtgtgttt tgcgaatgcg
gccctagcgt atacgaagtt 4440cctattccga agttcctatt ctccagaaag tataggaact
tctgtacacc tgagctgatt 4500ccgatgactt cgtaggttcc tagctcaagc cgctcgtgtc
caagcgtcac ttacgattag 4560ctaatgatta cggcatctag gaccgactag ctaactaact
agtaccgagg ccggccccgc 4620gggagctctg cgtgcaacct cgaggccgca aacagcctgg
tgacagacga agccagcaag 4680cacgtacgta cgcacgtctc tgctggtctg gatgtgtatg
gatatggacg tctcacgtct 4740ggacgtcgtc gtcgccgttg tattgtatca tgccaaccac
ttccgtaccg taccccctcg 4800cgtgccaaca tgaccaccgc cggtacgtct ccatcgtcgg
ccgtcggcgt ctcaggcagc 4860tctcaattaa gcggacgtgt tttggtaatc tggtggaacg
ccgcgcgcac tgagggtttg 4920ggggccccgg cggacgagcg agcgagagac ggtgcatgca
tgccaaatgg caacgagggc 4980ccgcccgccc atccaataac caacccagac gtagcgcaac
caacgtacga gtcctgtgct 5040ggcgcgtacg actaccacgc tagctgccgc gacatgcgaa
ctacggtcca ccaggcacca 5100gccatgacaa tatatactgt atatatattt ttcttcttct
ttttgtttcc gctctctcaa 5160gttcctgctc tgctcctgcc tgtccgcggt gccgatcggc
gagagagcat gcatggacat 5220ggaccacgcg agatccagga accggcacgg gcccatgcgt
ggcaggcggc cgtttcgtca 5280ggttccccga aatgccccaa ctgcgcggct gcaggatggc
tcatggctgg ctgcctagct 5340ggcccgtgac accgatcgat cggtaacgac gacgcacgca
cctgaagcac aggaaggagc 5400ctccctctcg catgcacgtt agtact
5426
259 5152 DNA Artificial Sequence TS10Cas-1 donor
259 ggtaccaaat agtaaacggg aggggaggtc gctagtagta aacgctaggt agctaggata
60atccgtctcg tgttggacgg aaggttttgg acgcatctgc gtgcacagcc cgctgataca
120gatctgatcg actagctagc tagatgccga ggccccagag caaggcccgg atactcctgc
180acagtccctg agatttcagc acagcaggtg ctgttgcatc aatatataaa tccctgcttt
240attaatttaa tctctgtgca tgtatccata catcgtcagc ggctcagcgc tatcacactg
300cagtgcacgc agctagttga gcgcctgggt cagtatatat atagctagta gggacaaagg
360ggggcactgt acgttggttt ggtttggcac gcacgcgatc gagagtggtg gaatggactg
420cagatcatcg atcgctgcac tgtacgcacg cgcaccggac tgcatttgca tgcccctgaa
480ggaggaaagg ggaaggaaag aaaagaaata ggagaaagaa gaagaagcag agaaatacgt
540cacagtccaa gaagagtgag ccgccctagc tagcttcaac cctgacgaac ccggcagcca
600cacttccggc catgtatgca tgcatgcatg gcttagcttc agatgtccaa tcgaatccat
660caagacctgg ccggttttcc atggccgcct cgccttcgct agtggtacca tttaaatctt
720aagcctagga taacttcgta tagcatacat tatacgaagt tatggcgccg ctagcctgca
780gtgcagcgtg acccggtcgt gcccctctct agagataatg agcattgcat gtctaagtta
840taaaaaatta ccacatattt tttttgtcac acttgtttga agtgcagttt atctatcttt
900atacatatat ttaaacttta ctctacgaat aatataatct atagtactac aataatatca
960gtgttttaga gaatcatata aatgaacagt tagacatggt ctaaaggaca attgagtatt
1020ttgacaacag gactctacag ttttatcttt ttagtgtgca tgtgttctcc tttttttttg
1080caaatagctt cacctatata atacttcatc cattttatta gtacatccat ttagggttta
1140gggttaatgg tttttataga ctaatttttt tagtacatct attttattct attttagcct
1200ctaaattaag aaaactaaaa ctctatttta gtttttttat ttaataattt agatataaaa
1260tagaataaaa taaagtgact aaaaattaaa caaataccct ttaagaaatt aaaaaaacta
1320aggaaacatt tttcttgttt cgagtagata atgccagcct gttaaacgcc gtcgacgagt
1380ctaacggaca ccaaccagcg aaccagcagc gtcgcgtcgg gccaagcgaa gcagacggca
1440cggcatctct gtcgctgcct ctggacccct ctcgagagtt ccgctccacc gttggacttg
1500ctccgctgtc ggcatccaga aattgcgtgg cggagcggca gacgtgagcc ggcacggcag
1560gcggcctcct cctcctctca cggcaccggc agctacgggg gattcctttc ccaccgctcc
1620ttcgctttcc cttcctcgcc cgccgtaata aatagacacc ccctccacac cctctttccc
1680caacctcgtg ttgttcggag cgcacacaca cacaaccaga tctcccccaa atccacccgt
1740cggcacctcc gcttcaaggt acgccgctcg tcctcccccc cccccctctc taccttctct
1800agatcggcgt tccggtccat gcatggttag ggcccggtag ttctacttct gttcatgttt
1860gtgttagatc cgtgtttgtg ttagatccgt gctgctagcg ttcgtacacg gatgcgacct
1920gtacgtcaga cacgttctga ttgctaactt gccagtgttt ctctttgggg aatcctggga
1980tggctctagc cgttccgcag acgggatcga tttcatgatt ttttttgttt cgttgcatag
2040ggtttggttt gcccttttcc tttatttcaa tatatgccgt gcacttgttt gtcgggtcat
2100cttttcatgc ttttttttgt cttggttgtg atgatgtggt ctggttgggc ggtcgttcta
2160gatcggagta gaattctgtt tcaaactacc tggtggattt attaattttg gatctgtatg
2220tgtgtgccat acatattcat agttacgaat tgaagatgat ggatggaaat atcgatctag
2280gataggtata catgttgatg cgggttttac tgatgcatat acagagatgc tttttgttcg
2340cttggttgtg atgatgtggt gtggttgggc ggtcgttcat tcgttctaga tcggagtaga
2400atactgtttc aaactacctg gtgtatttat taattttgga actgtatgtg tgtgtcatac
2460atcttcatag ttacgagttt aagatggatg gaaatatcga tctaggatag gtatacatgt
2520tgatgtgggt tttactgatg catatacatg atggcatatg cagcatctat tcatatgctc
2580taaccttgag tacctatcta ttataataaa caagtatgtt ttataattat tttgatcttg
2640atatacttgg atgatggcat atgcagcagc tatatgtgga tttttttagc cctgccttca
2700tacgctattt atttgcttgg tactgtttct tttgtcgatg ctcaccctgt tgtttggtgt
2760tacttctgca ggtcgactct agaggatcaa ttcgctagcg aagttcctat tccgaagttc
2820ctattctcta gaaagtatag gaacttcaga tccaccggga tccccgatca tgcaaaaact
2880cattaactca gtgcaaaact atgcctgggg cagcaaaacg gcgttgactg aactttatgg
2940tatggaaaat ccgtccagcc agccgatggc cgagctgtgg atgggcgcac atccgaaaag
3000cagttcacga gtgcagaatg ccgccggaga tatcgtttca ctgcgtgatg tgattgagag
3060tgataaatcg actctgctcg gagaggccgt tgccaaacgc tttggcgaac tgcctttcct
3120gttcaaagta ttatgcgcag cacagccact ctccattcag gttcatccaa acaaacacaa
3180ttctgaaatc ggttttgcca aagaaaatgc cgcaggtatc ccgatggatg ccgccgagcg
3240taactataaa gatcctaacc acaagccgga gctggttttt gcgctgacgc ctttccttgc
3300gatgaacgcg tttcgtgaat tttccgagat tgtctcccta ctccagccgg tcgcaggtgc
3360acatccggcg attgctcact ttttacaaca gcctgatgcc gaacgtttaa gcgaactgtt
3420cgccagcctg ttgaatatgc agggtgaaga aaaatcccgc gcgctggcga ttttaaaatc
3480ggccctcgat agccagcagg gtgaaccgtg gcaaacgatt cgtttaattt ctgaatttta
3540cccggaagac agcggtctgt tctccccgct attgctgaat gtggtgaaat tgaaccctgg
3600cgaagcgatg ttcctgttcg ctgaaacacc gcacgcttac ctgcaaggcg tggcgctgga
3660agtgatggca aactccgata acgtgctgcg tgcgggtctg acgcctaaat acattgatat
3720tccggaactg gttgccaatg tgaaattcga agccaaaccg gctaaccagt tgttgaccca
3780gccggtgaaa caaggtgcag aactggactt cccgattcca gtggatgatt ttgccttctc
3840gctgcatgac cttagtgata aagaaaccac cattagccag cagagtgccg ccattttgtt
3900ctgcgtcgaa ggcgatgcaa cgttgtggaa aggttctcag cagttacagc ttaaaccggg
3960tgaatcagcg tttattgccg ccaacgaatc accggtgact gtcaaaggcc acggccgttt
4020agcgcgtgtt tacaacaagc tgtaagagct tactgaaaaa attaacatct cttgctaagc
4080tgggggtgga acctagactt gtccatcttc tggattggcc aacttaatta atgtatgaaa
4140taaaaggatg cacacatagt gacatgctaa tcactataat gtgggcatca aagttgtgtg
4200ttatgtgtaa ttactagtta tctgaataaa agagaaagag atcatccata tttcttatcc
4260taaatgaatg tcacgtgtct ttataattct ttgatgaacc agatgcattt cattaaccaa
4320atccatatac atataaatat taatcatata taattaatat caattgggtt agcaaaacaa
4380atctagtcta ggtgtgtttt gcgaatgcgg ccctagcgta tacgaagttc ctattccgaa
4440gttcctattc tccagaaagt ataggaactt ctgtacacct gagctgattc cgatgacttc
4500gtaggttcct agctcaagcc gctcgtgtcc aagcgtcact tacgattagc taatgattac
4560ggcatctagg accgactagc taactaacta gtaccgaggc cggccccgcg ggagctcggc
4620gcgcctaagg gccaagtact tgctgtccct gtatctccaa cacgagcctt gattcctgcc
4680ggccggtgat ggcaatggcc gctagtagtc tccgctagct agggagcggc gatccgacgc
4740gacgccacca tgtgtctaga aaagaagttt cttgctttgc atgcagactt attagcgcgg
4800tcgacacctg tggggacccc gtgtcttgag acaatgagac tgcctgtccg cccaagacac
4860tacttgtagc catgaagcca tcgactcctc tccttgctct ccagtaatcc agtggatgga
4920tccatcatcg atagtttagt ttatcagtct tcttgaggcc ggtgtccccc atgcataatg
4980atgacagaaa gcctgggcca ggtaaaagcc aaaaagtttg accctctagg tactggggcc
5040agccctggcg tttgaacaaa aaaaaaatct gagcgtgtcg ccccggcctg ttttcgaact
5100cctaaacgac gtcgcaactt tttttataca cacactaccg gtacatggct tt
5152
260 5146 DNA Artificial Sequence TS10Cas-3 donor
260 aaatagtaaa cgggagggga
ggtcgctagt agtaaacgct aggtagctag gataatccgt 60ctcgtgttgg acggaaggtt
ttggacgcat ctgcgtgcac agcccgctga tacagatctg 120atcgactagc tagctagatg
ccgaggcccc agagcaaggc ccggatactc ctgcacagtc 180cctgagattt cagcacagca
ggtgctgttg catcaatata taaatccctg ctttattaat 240ttaatctctg tgcatgtatc
catacatcgt cagcggctca gcgctatcac actgcagtgc 300acgcagctag ttgagcgcct
gggtcagtat atatatagct agtagggaca aaggggggca 360ctgtacgttg gtttggtttg
gcacgcacgc gatcgagagt ggtggaatgg actgcagatc 420atcgatcgct gcactgtacg
cacgcgcacc ggactgcatt tgcatgcccc tgaaggagga 480aaggggaagg aaagaaaaga
aataggagaa agaagaagaa gcagagaaat acgtcacagt 540ccaagaagag tgagccgccc
tagctagctt caaccctgac gaacccggca gccacacttc 600cggccatgta tgcatgcatg
catggcttag cttcagatgt ccaatcgaat ccatcaagac 660ctggccggtt ttccatggcc
gcctcgcctt cgctagttaa gggccaagta cttgctgtcc 720ctgtggtacc atttaaatct
taagcctagg ataacttcgt atagcataca ttatacgaag 780ttatggcgcc gctagcctgc
agtgcagcgt gacccggtcg tgcccctctc tagagataat 840gagcattgca tgtctaagtt
ataaaaaatt accacatatt ttttttgtca cacttgtttg 900aagtgcagtt tatctatctt
tatacatata tttaaacttt actctacgaa taatataatc 960tatagtacta caataatatc
agtgttttag agaatcatat aaatgaacag ttagacatgg 1020tctaaaggac aattgagtat
tttgacaaca ggactctaca gttttatctt tttagtgtgc 1080atgtgttctc cttttttttt
gcaaatagct tcacctatat aatacttcat ccattttatt 1140agtacatcca tttagggttt
agggttaatg gtttttatag actaattttt ttagtacatc 1200tattttattc tattttagcc
tctaaattaa gaaaactaaa actctatttt agttttttta 1260tttaataatt tagatataaa
atagaataaa ataaagtgac taaaaattaa acaaataccc 1320tttaagaaat taaaaaaact
aaggaaacat ttttcttgtt tcgagtagat aatgccagcc 1380tgttaaacgc cgtcgacgag
tctaacggac accaaccagc gaaccagcag cgtcgcgtcg 1440ggccaagcga agcagacggc
acggcatctc tgtcgctgcc tctggacccc tctcgagagt 1500tccgctccac cgttggactt
gctccgctgt cggcatccag aaattgcgtg gcggagcggc 1560agacgtgagc cggcacggca
ggcggcctcc tcctcctctc acggcaccgg cagctacggg 1620ggattccttt cccaccgctc
cttcgctttc ccttcctcgc ccgccgtaat aaatagacac 1680cccctccaca ccctctttcc
ccaacctcgt gttgttcgga gcgcacacac acacaaccag 1740atctccccca aatccacccg
tcggcacctc cgcttcaagg tacgccgctc gtcctccccc 1800ccccccctct ctaccttctc
tagatcggcg ttccggtcca tgcatggtta gggcccggta 1860gttctacttc tgttcatgtt
tgtgttagat ccgtgtttgt gttagatccg tgctgctagc 1920gttcgtacac ggatgcgacc
tgtacgtcag acacgttctg attgctaact tgccagtgtt 1980tctctttggg gaatcctggg
atggctctag ccgttccgca gacgggatcg atttcatgat 2040tttttttgtt tcgttgcata
gggtttggtt tgcccttttc ctttatttca atatatgccg 2100tgcacttgtt tgtcgggtca
tcttttcatg cttttttttg tcttggttgt gatgatgtgg 2160tctggttggg cggtcgttct
agatcggagt agaattctgt ttcaaactac ctggtggatt 2220tattaatttt ggatctgtat
gtgtgtgcca tacatattca tagttacgaa ttgaagatga 2280tggatggaaa tatcgatcta
ggataggtat acatgttgat gcgggtttta ctgatgcata 2340tacagagatg ctttttgttc
gcttggttgt gatgatgtgg tgtggttggg cggtcgttca 2400ttcgttctag atcggagtag
aatactgttt caaactacct ggtgtattta ttaattttgg 2460aactgtatgt gtgtgtcata
catcttcata gttacgagtt taagatggat ggaaatatcg 2520atctaggata ggtatacatg
ttgatgtggg ttttactgat gcatatacat gatggcatat 2580gcagcatcta ttcatatgct
ctaaccttga gtacctatct attataataa acaagtatgt 2640tttataatta ttttgatctt
gatatacttg gatgatggca tatgcagcag ctatatgtgg 2700atttttttag ccctgccttc
atacgctatt tatttgcttg gtactgtttc ttttgtcgat 2760gctcaccctg ttgtttggtg
ttacttctgc aggtcgactc tagaggatca attcgctagc 2820gaagttccta ttccgaagtt
cctattctct agaaagtata ggaacttcag atccaccggg 2880atccccgatc atgcaaaaac
tcattaactc agtgcaaaac tatgcctggg gcagcaaaac 2940ggcgttgact gaactttatg
gtatggaaaa tccgtccagc cagccgatgg ccgagctgtg 3000gatgggcgca catccgaaaa
gcagttcacg agtgcagaat gccgccggag atatcgtttc 3060actgcgtgat gtgattgaga
gtgataaatc gactctgctc ggagaggccg ttgccaaacg 3120ctttggcgaa ctgcctttcc
tgttcaaagt attatgcgca gcacagccac tctccattca 3180ggttcatcca aacaaacaca
attctgaaat cggttttgcc aaagaaaatg ccgcaggtat 3240cccgatggat gccgccgagc
gtaactataa agatcctaac cacaagccgg agctggtttt 3300tgcgctgacg cctttccttg
cgatgaacgc gtttcgtgaa ttttccgaga ttgtctccct 3360actccagccg gtcgcaggtg
cacatccggc gattgctcac tttttacaac agcctgatgc 3420cgaacgttta agcgaactgt
tcgccagcct gttgaatatg cagggtgaag aaaaatcccg 3480cgcgctggcg attttaaaat
cggccctcga tagccagcag ggtgaaccgt ggcaaacgat 3540tcgtttaatt tctgaatttt
acccggaaga cagcggtctg ttctccccgc tattgctgaa 3600tgtggtgaaa ttgaaccctg
gcgaagcgat gttcctgttc gctgaaacac cgcacgctta 3660cctgcaaggc gtggcgctgg
aagtgatggc aaactccgat aacgtgctgc gtgcgggtct 3720gacgcctaaa tacattgata
ttccggaact ggttgccaat gtgaaattcg aagccaaacc 3780ggctaaccag ttgttgaccc
agccggtgaa acaaggtgca gaactggact tcccgattcc 3840agtggatgat tttgccttct
cgctgcatga ccttagtgat aaagaaacca ccattagcca 3900gcagagtgcc gccattttgt
tctgcgtcga aggcgatgca acgttgtgga aaggttctca 3960gcagttacag cttaaaccgg
gtgaatcagc gtttattgcc gccaacgaat caccggtgac 4020tgtcaaaggc cacggccgtt
tagcgcgtgt ttacaacaag ctgtaagagc ttactgaaaa 4080aattaacatc tcttgctaag
ctgggggtgg aacctagact tgtccatctt ctggattggc 4140caacttaatt aatgtatgaa
ataaaaggat gcacacatag tgacatgcta atcactataa 4200tgtgggcatc aaagttgtgt
gttatgtgta attactagtt atctgaataa aagagaaaga 4260gatcatccat atttcttatc
ctaaatgaat gtcacgtgtc tttataattc tttgatgaac 4320cagatgcatt tcattaacca
aatccatata catataaata ttaatcatat ataattaata 4380tcaattgggt tagcaaaaca
aatctagtct aggtgtgttt tgcgaatgcg gccctagcgt 4440atacgaagtt cctattccga
agttcctatt ctccagaaag tataggaact tctgtacacc 4500tgagctgatt ccgatgactt
cgtaggttcc tagctcaagc cgctcgtgtc caagcgtcac 4560ttacgattag ctaatgatta
cggcatctag gaccgactag ctaactaact agtaccgagg 4620ccggccccgc gggagctcgg
cgcgccatct ccaacacgag ccttgattcc tgccggccgg 4680tgatggcaat ggccgctagt
agtctccgct agctagggag cggcgatccg acgcgacgcc 4740accatgtgtc tagaaaagaa
gtttcttgct ttgcatgcag acttattagc gcggtcgaca 4800cctgtgggga ccccgtgtct
tgagacaatg agactgcctg tccgcccaag acactacttg 4860tagccatgaa gccatcgact
cctctccttg ctctccagta atccagtgga tggatccatc 4920atcgatagtt tagtttatca
gtcttcttga ggccggtgtc ccccatgcat aatgatgaca 4980gaaagcctgg gccaggtaaa
agccaaaaag tttgaccctc taggtactgg ggccagccct 5040ggcgtttgaa caaaaaaaaa
atctgagcgt gtcgccccgg cctgttttcg aactcctaaa 5100cgacgtcgca acttttttta
tacacacact accggtacat ggcttt 5146
261 32 DNA Artificial Sequence ubir primer from donor
261 ccatgtctaa ctgttcattt atatgattct ct 32
262 31 DNA Artificial Sequence psbf primer from dono
262 gctcgtgtcc aagcgtcact tacgattagc t 31
263 27 DNA Artificial Sequence sMHP14 14-1HR1f primer
263 ctcacatgag gctcttcttt gcttgct 27
264 26 DNA Artificial Sequence MHP14 14-1HR2r primer
264 aggatcctat tccccaattt gtagat 26
265 21 DNA Artificial Sequence CHR1-8 8HR1f primer
265 cagtccgtgg attgaagcca t 21
266 22 DNA Artificial Sequence CHR1-8 8HR2r primer
266 ctctgtctcc gagacgtgct ta 22
267 26 DNA Artificial Sequence CHR1-9 9HR1f primer
267 ggagcaaatg ttttaggtat gaaatg 26
268 28 DNA Artificial Sequence CHR1-9 9HR2r primer
268 cggattctaa agatcatacg taaatgaa 28
269 21 DNA Artificial Sequence CHR1-10 10HR1f primer
269 tggcttgtct atgcgcatct c 21
270 20 DNA Artificial Sequence CHR1-10 10HR2r primer
270 ccagacccaa acagcaggtt 20
271 18 DNA Artificial Sequence MHP14Cas-1 probe
271 cagattcacg tcagattt 18
272 27 DNA Artificial Sequence MHP14cas-1 forward primer
272 catagtggtg tatgaaagga agcactt 27
273 30 DNA Artificial Sequence MHP14cas-1 reverse primer
273 cattttggat tgtaatatgt gtacctcata 30
274 17 DNA Artificial Sequence MHP14Cas-3 probe
274 caccactatg tcgcttc 17
275 22 DNA Artificial Sequence MHP14Cas-3 forward primer
275 cggatgcacg aaaattgtag ga 22
276 24 DNA Artificial Sequence MHP14Cas-3 reverse primer
276 ctgacgtgaa tctgtttgga attg 24
277 18 DNA Artificial Sequence TS8Cas-1 probe
277 tacgtaacgt gcagtact 18
278 21 DNA Artificial Sequence TS8Cas-1 forward primer
278 acggacggac catacgttat g 21
279 26 DNA Artificial Sequence TS8Cas-1 reverse primer
279 tcagctggtg gagtatatta gttcgt 26
280 18 DNA Artificial Sequence TS8Cas-2 probe
280 ccagctgatc actgatga 18
281 21 DNA Artificial Sequence TS8Cas-2 forward primer
281 acggacggac catacgttat g 21
282 26 DNA Artificial Sequence TS8Cas-2 reverse primer
282 cgcacatgtt ataaattaca atgcat 26
283 14 DNA Artificial Sequence TS9Cas-2 probe
283 ctgtttgcgg cctc 14
284 19 DNA Artificial Sequence TS9Cas-2 forward primer
284 ctgcggagct gctggcgat 19
285 20 DNA Artificial Sequence TS9Cas-2 reverse primer
285 cttgctggct tcgtctgtca 20
286 15 DNA Artificial Sequence TS9Cas-3 probe
286 ccgacgtgcg tgcaa 15
287 19 DNA Artificial Sequence TS9Cas-3 forward primer
287 ctgcggagct gctggcgat 19
288 20 DNA Artificial Sequence TS9Cas-3 reverse primer
288 cttgctggct tcgtctgtca 20
289 17 DNA Artificial Sequence TS10Cas-1 probe
289 tcgccttcgc tagttaa 17
290 20 DNA Artificial Sequence TS10Cas-1 forward primer
290 aagacctggc cggttttcca 20
291 18 DNA Artificial Sequence TS10Cas-1 reverse primer
291 tagcggccat tgccatca 18
292 19 DNA Artificial Sequence TS10Cas-3 probe
292 ctgtatctcc aacacgagc 19
293 20 DNA Artificial Sequence TS10Cas-3 forward primer
293 aagacctggc cggttttcca 20
294 18 DNA Artificial Sequence TS10Cas-3 reverse primer
294 tagcggccat tgccatca 18
295 472 DNA Glycine max GM-U6-9.1 promoter (1)..(472)
295 cccgggttaa
gagaattgta agtgtgcttt tatatattta aaattaatat attttgaaat 60
gttaaaatat aaaagaaaat tcaatgtaaa ttaaaaataa ataaatgttt aataaagata 120
aattttaaaa cataaaagaa aatgtctaac aagaggatta agatcctgtg ctcttaaatt 180
tttaggtgtt gaaatcttag ccatacaaaa tatattttat taaaaccaag catgaaaaaa 240
gtcactaaag agctatataa ctcatgcagc tagaaatgaa gtgaagggaa tccagtttgt 300
tctcagtcga aagagtgtct atctttgttc ttttctgcaa ccgagttaag caaaatggga 360
atgcgaggta tcttcctttc gttaggggag caccagatgc atagttagtc ccacattgat 420
gaatataaca agagcttcac agaatatata gcccaggcca cagtaaaagc tt 472
296 5958 DNA Artificial sequence EF1A2-CAS9
296 gggtttactt
tatctatact tttattagat ttttaatcag gctcctgatt 60tctttttatt tcgattgaat
tcctgaactt gtattattca gtagatcgaa taaattataa 120aaagataaaa tcataaaata
atattttatc ctatcaatca tattaaagca atgaatatgt 180aaaattaatc ttatctttat
tttaaaaaat catataggtt tagtattttt ttaaaaataa 240agataggatt agttttacta
ttcactgctt attactttta aaaaaatcat aaaggtttag 300tattttttta aaataaatat
aggaatagtt ttactattca ctgctttaat agaaaaatag 360tttaaaattt aagatagttt
taatcccagc atttgccacg tttgaacgtg agccgaaacg 420atgtcgttac attatcttaa
cctagctgaa acgatgtcgt cataatatcg ccaaatgcca 480actggactac gtcgaaccca
caaatcccac aaagcgcgtg aaatcaaatc gctcaaacca 540caaaaaagaa caacgcgttt
gttacacgct caatcccacg cgagtagagc acagtaacct 600tcaaataagc gaatggggca
taatcagaaa tccgaaataa acctaggggc attatcggaa 660atgaaaagta gctcactcaa
tataaaaatc taggaaccct agttttcgtt atcactctgt 720gctccctcgc tctatttctc
agtctctgtg tttgcggctg aggattccga acgagtgacc 780ttcttcgttt ctcgcaaagg
taacagcctc tgctcttgtc tcttcgattc gatctatgcc 840tgtctcttat ttacgatgat
gtttcttcgg ttatgttttt ttatttatgc tttatgctgt 900tgatgttcgg ttgtttgttt
cgctttgttt ttgtggttca gttttttagg attcttttgg 960tttttgaatc gattaatcgg
aagagatttt cgagttattt ggtgtgttgg aggtgaatct 1020tttttttgag gtcatagatc
tgttgtattt gtgttataaa catgcgactt tgtatgattt 1080tttacgaggt tatgatgttc
tggttgtttt attatgaatc tgttgagaca gaaccatgat 1140ttttgttgat gttcgtttac
actattaaag gtttgtttta acaggattaa aagtttttta 1200agcatgttga aggagtcttg
tagatatgta accgtcgata gtttttttgt gggtttgttc 1260acatgttatc aagcttaatc
ttttactatg tatgcgacca tatctggatc cagcaaaggc 1320gattttttaa ttccttgtga
aacttttgta atatgaagtt gaaattttgt tattggtaaa 1380ctataaatgt gtgaagttgg
agtatacctt taccttctta tttggctttg tgatagttta 1440atttatatgt attttgagtt
ctgacttgta tttctttgaa ttgattctag tttaagtaat 1500ccatggacaa aaagtactca
atagggctcg acatagggac taactccgtt ggatgggccg 1560tcatcaccga cgagtacaag
gtgccctcca agaagttcaa ggtgttggga aacaccgaca 1620ggcacagcat aaagaagaat
ttgatcggtg ccctcctctt cgactccgga gagaccgctg 1680aggctaccag gctcaagagg
accgctagaa ggcgctacac cagaaggaag aacagaatct 1740gctacctgca ggagatcttc
tccaacgaga tggccaaggt ggacgactcc ttcttccacc 1800gccttgagga atcattcctg
gtggaggagg ataaaaagca cgagagacac ccaatcttcg 1860ggaacatcgt cgacgaggtg
gcctaccatg aaaagtaccc taccatctac cacctgagga 1920agaagctggt cgactctacc
gacaaggctg acttgcgctt gatttacctg gctctcgctc 1980acatgataaa gttccgcgga
cacttcctca ttgagggaga cctgaaccca gacaactccg 2040acgtggacaa gctcttcatc
cagctcgttc agacctacaa ccagcttttc gaggagaacc 2100caatcaacgc cagtggagtt
gacgccaagg ctatcctctc tgctcgtctg tcaaagtcca 2160ggaggcttga gaacttgatt
gcccagctgc ctggcgaaaa gaagaacgga ctgttcggaa 2220acttgatcgc tctctccctg
ggattgactc ccaacttcaa gtccaacttc gacctcgccg 2280aggacgctaa gttgcagttg
tctaaagaca cctacgacga tgacctcgac aacttgctgg 2340cccagatagg cgaccaatac
gccgatctct tcctcgccgc taagaacttg tccgacgcaa 2400tcctgctgtc cgacatcctg
agagtcaaca ctgagattac caaagctcct ctgtctgctt 2460ccatgattaa gcgctacgac
gagcaccacc aagatctgac cctgctcaag gccctggtga 2520gacagcagct gcccgagaag
tacaaggaga tctttttcga ccagtccaag aacggctacg 2580ccggatacat tgacggaggc
gcctcccagg aagagttcta caagttcatc aagcccatcc 2640ttgagaagat ggacggtacc
gaggagctgt tggtgaagtt gaacagagag gacctgttga 2700ggaagcagag aaccttcgac
aacggaagca tccctcacca aatccacctg ggagagctcc 2760acgccatctt gaggaggcag
gaggatttct atcccttcct gaaggacaac cgcgagaaga 2820ttgagaagat cttgaccttc
agaattcctt actacgtcgg gccactcgcc agaggaaact 2880ctaggttcgc ctggatgacc
cgcaaatctg aagagaccat tactccctgg aacttcgagg 2940aagtcgtgga caagggcgct
tccgctcagt ctttcatcga gaggatgacc aacttcgata 3000aaaatctgcc caacgagaag
gtgctgccca agcactccct gttgtacgag tatttcacag 3060tgtacaacga gctcaccaag
gtgaagtacg tcacagaggg aatgaggaag cctgccttct 3120tgtccggaga gcagaagaag
gccatcgtcg acctgctctt caagaccaac aggaaggtga 3180ctgtcaagca gctgaaggag
gactacttca agaagatcga gtgcttcgac tccgtcgaga 3240tctctggtgt cgaggacagg
ttcaacgcct cccttgggac ttaccacgat ctgctcaaga 3300ttattaaaga caaggacttc
ctggacaacg aggagaacga ggacatcctt gaggacatcg 3360tgctcaccct gaccttgttc
gaagacaggg aaatgatcga agagaggctc aagacctacg 3420cccacctctt cgacgacaag
gtgatgaaac agctgaagag acgcagatat accggctggg 3480gaaggctctc ccgcaaattg
atcaacggga tcagggacaa gcagtcaggg aagactatac 3540tcgacttcct gaagtccgac
ggattcgcca acaggaactt catgcagctc attcacgacg 3600actccttgac cttcaaggag
gacatccaga aggctcaggt gtctggacag ggtgactcct 3660tgcatgagca cattgctaac
ttggccggct ctcccgctat taagaagggc attttgcaga 3720ccgtgaaggt cgttgacgag
ctcgtgaagg tgatgggacg ccacaagcca gagaacatcg 3780ttattgagat ggctcgcgag
aaccaaacta cccagaaagg gcagaagaat tcccgcgaga 3840ggatgaagcg cattgaggag
ggcataaaag agcttggctc tcagatcctc aaggagcacc 3900ccgtcgagaa cactcagctg
cagaacgaga agctgtacct gtactacctc caaaacggaa 3960gggacatgta cgtggaccag
gagctggaca tcaacaggtt gtccgactac gacgtcgacc 4020acatcgtgcc tcagtccttc
ctgaaggatg actccatcga caataaagtg ctgacacgct 4080ccgataaaaa tagaggcaag
tccgacaacg tcccctccga ggaggtcgtg aagaagatga 4140aaaactactg gagacagctc
ttgaacgcca agctcatcac ccagcgtaag ttcgacaacc 4200tgactaaggc tgagagagga
ggattgtccg agctcgataa ggccggattc atcaagagac 4260agctcgtcga aacccgccaa
attaccaagc acgtggccca aattctggat tcccgcatga 4320acaccaagta cgatgaaaat
gacaagctga tccgcgaggt caaggtgatc accttgaagt 4380ccaagctggt ctccgacttc
cgcaaggact tccagttcta caaggtgagg gagatcaaca 4440actaccacca cgcacacgac
gcctacctca acgctgtcgt tggaaccgcc ctcatcaaaa 4500aatatcctaa gctggagtct
gagttcgtct acggcgacta caaggtgtac gacgtgagga 4560agatgatcgc taagtctgag
caggagatcg gcaaggccac cgccaagtac ttcttctact 4620ccaacatcat gaacttcttc
aagaccgaga tcactctcgc caacggtgag atcaggaagc 4680gcccactgat cgagaccaac
ggtgagactg gagagatcgt gtgggacaaa gggagggatt 4740tcgctactgt gaggaaggtg
ctctccatgc ctcaggtgaa catcgtcaag aagaccgaag 4800ttcagaccgg aggattctcc
aaggagtcca tcctccccaa gagaaactcc gacaagctga 4860tcgctagaaa gaaagactgg
gaccctaaga agtacggagg cttcgattct cctaccgtgg 4920cctactctgt gctggtcgtg
gccaaggtgg agaagggcaa gtccaagaag ctgaaatccg 4980tcaaggagct cctcgggatt
accatcatgg agaggagttc cttcgagaag aaccctatcg 5040acttcctgga ggccaaggga
tataaagagg tgaagaagga cctcatcatc aagctgccca 5100agtactccct cttcgagttg
gagaacggaa ggaagaggat gctggcttct gccggagagt 5160tgcagaaggg aaatgagctc
gcccttccct ccaagtacgt gaacttcctg tacctcgcct 5220ctcactatga aaagttgaag
ggctctcctg aggacaacga gcagaagcag ctcttcgtgg 5280agcagcacaa gcactacctg
gacgaaatta tcgagcagat ctctgagttc tccaagcgcg 5340tgatattggc cgacgccaac
ctcgacaagg tgctgtccgc ctacaacaag cacagggata 5400agcccattcg cgagcaggct
gaaaacatta tccacctgtt taccctcaca aacttgggag 5460cccctgctgc cttcaagtac
ttcgacacca ccattgacag gaagagatac acctccacca 5520aggaggtgct cgacgcaaca
ctcatccacc aatccatcac cggcctctat gaaacaagga 5580ttgacttgtc ccagctggga
ggcgactcta gagccgatcc caagaagaag agaaaggtgt 5640aggttaacct agacttgtcc
atcttctgga ttggccaact taattaatgt atgaaataaa 5700aggatgcaca catagtgaca
tgctaatcac tataatgtgg gcatcaaagt tgtgtgttat 5760gtgtaattac tagttatctg
aataaaagag aaagagatca tccatatttc ttatcctaaa 5820tgaatgtcac gtgtctttat
aattctttga tgaaccagat gcatttcatt aaccaaatcc 5880atatacatat aaatattaat
catatataat taatatcaat tgggttagca aaacaaatct 5940
agtctaggtg tgttttgc 5958
297 573 DNA Artificial sequence U6-9.1-DD20CR1
297 ccgggttaag agaattgtaa gtgtgctttt atatatttaa aattaatata ttttgaaatg 60
ttaaaatata aaagaaaatt caatgtaaat taaaaataaa taaatgttta ataaagataa 120
attttaaaac ataaaagaaa atgtctaaca agaggattaa gatcctgtgc tcttaaattt 180
ttaggtgttg aaatcttagc catacaaaat atattttatt aaaaccaagc atgaaaaaag 240
tcactaaaga gctatataac tcatgcagct agaaatgaag tgaagggaat ccagtttgtt 300
ctcagtcgaa agagtgtcta tctttgttct tttctgcaac cgagttaagc aaaatgggaa 360
tgcgaggtat cttcctttcg ttaggggagc accagatgca tagttagtcc cacattgatg 420
aatataacaa gagcttcaca gaatatatag cccaggccac agtaaaagct tggaactgac 480
acacgacatg agttttagag ctagaaatag caagttaaaa taaggctagt ccgttatcaa 540
cttgaaaaag tggcaccgag tcggtgcttt ttt 573
298 6611 DNA Artificial sequence U6-9.1-DD20CR1+EF1A2-CAS9
298 cgcgccggta cccgggttaa gagaattgta
agtgtgcttt tatatattta aaattaatat 60attttgaaat gttaaaatat aaaagaaaat
tcaatgtaaa ttaaaaataa ataaatgttt 120aataaagata aattttaaaa cataaaagaa
aatgtctaac aagaggatta agatcctgtg 180ctcttaaatt tttaggtgtt gaaatcttag
ccatacaaaa tatattttat taaaaccaag 240catgaaaaaa gtcactaaag agctatataa
ctcatgcagc tagaaatgaa gtgaagggaa 300tccagtttgt tctcagtcga aagagtgtct
atctttgttc ttttctgcaa ccgagttaag 360caaaatggga atgcgaggta tcttcctttc
gttaggggag caccagatgc atagttagtc 420ccacattgat gaatataaca agagcttcac
agaatatata gcccaggcca cagtaaaagc 480ttggaactga cacacgacat gagttttaga
gctagaaata gcaagttaaa ataaggctag 540tccgttatca acttgaaaaa gtggcaccga
gtcggtgctt ttttttgcgg ccgcaattgg 600atcgggttta cttattttgt gggtatctat
acttttatta gatttttaat caggctcctg 660atttcttttt atttcgattg aattcctgaa
cttgtattat tcagtagatc gaataaatta 720taaaaagata aaatcataaa ataatatttt
atcctatcaa tcatattaaa gcaatgaata 780tgtaaaatta atcttatctt tattttaaaa
aatcatatag gtttagtatt tttttaaaaa 840taaagatagg attagtttta ctattcactg
cttattactt ttaaaaaaat cataaaggtt 900tagtattttt ttaaaataaa tataggaata
gttttactat tcactgcttt aatagaaaaa 960tagtttaaaa tttaagatag ttttaatccc
agcatttgcc acgtttgaac gtgagccgaa 1020acgatgtcgt tacattatct taacctagct
gaaacgatgt cgtcataata tcgccaaatg 1080ccaactggac tacgtcgaac ccacaaatcc
cacaaagcgc gtgaaatcaa atcgctcaaa 1140ccacaaaaaa gaacaacgcg tttgttacac
gctcaatccc acgcgagtag agcacagtaa 1200ccttcaaata agcgaatggg gcataatcag
aaatccgaaa taaacctagg ggcattatcg 1260gaaatgaaaa gtagctcact caatataaaa
atctaggaac cctagttttc gttatcactc 1320tgtgctccct cgctctattt ctcagtctct
gtgtttgcgg ctgaggattc cgaacgagtg 1380accttcttcg tttctcgcaa aggtaacagc
ctctgctctt gtctcttcga ttcgatctat 1440gcctgtctct tatttacgat gatgtttctt
cggttatgtt tttttattta tgctttatgc 1500tgttgatgtt cggttgtttg tttcgctttg
tttttgtggt tcagtttttt aggattcttt 1560tggtttttga atcgattaat cggaagagat
tttcgagtta tttggtgtgt tggaggtgaa 1620tctttttttt gaggtcatag atctgttgta
tttgtgttat aaacatgcga ctttgtatga 1680ttttttacga ggttatgatg ttctggttgt
tttattatga atctgttgag acagaaccat 1740gatttttgtt gatgttcgtt tacactatta
aaggtttgtt ttaacaggat taaaagtttt 1800ttaagcatgt tgaaggagtc ttgtagatat
gtaaccgtcg atagtttttt tgtgggtttg 1860ttcacatgtt atcaagctta atcttttact
atgtatgcga ccatatctgg atccagcaaa 1920ggcgattttt taattccttg tgaaactttt
gtaatatgaa gttgaaattt tgttattggt 1980aaactataaa tgtgtgaagt tggagtatac
ctttaccttc ttatttggct ttgtgatagt 2040ttaatttata tgtattttga gttctgactt
gtatttcttt gaattgattc tagtttaagt 2100aatccatgga caaaaagtac tcaatagggc
tcgacatagg gactaactcc gttggatggg 2160ccgtcatcac cgacgagtac aaggtgccct
ccaagaagtt caaggtgttg ggaaacaccg 2220acaggcacag cataaagaag aatttgatcg
gtgccctcct cttcgactcc ggagagaccg 2280ctgaggctac caggctcaag aggaccgcta
gaaggcgcta caccagaagg aagaacagaa 2340tctgctacct gcaggagatc ttctccaacg
agatggccaa ggtggacgac tccttcttcc 2400accgccttga ggaatcattc ctggtggagg
aggataaaaa gcacgagaga cacccaatct 2460tcgggaacat cgtcgacgag gtggcctacc
atgaaaagta ccctaccatc taccacctga 2520ggaagaagct ggtcgactct accgacaagg
ctgacttgcg cttgatttac ctggctctcg 2580ctcacatgat aaagttccgc ggacacttcc
tcattgaggg agacctgaac ccagacaact 2640ccgacgtgga caagctcttc atccagctcg
ttcagaccta caaccagctt ttcgaggaga 2700acccaatcaa cgccagtgga gttgacgcca
aggctatcct ctctgctcgt ctgtcaaagt 2760ccaggaggct tgagaacttg attgcccagc
tgcctggcga aaagaagaac ggactgttcg 2820gaaacttgat cgctctctcc ctgggattga
ctcccaactt caagtccaac ttcgacctcg 2880ccgaggacgc taagttgcag ttgtctaaag
acacctacga cgatgacctc gacaacttgc 2940tggcccagat aggcgaccaa tacgccgatc
tcttcctcgc cgctaagaac ttgtccgacg 3000caatcctgct gtccgacatc ctgagagtca
acactgagat taccaaagct cctctgtctg 3060cttccatgat taagcgctac gacgagcacc
accaagatct gaccctgctc aaggccctgg 3120tgagacagca gctgcccgag aagtacaagg
agatcttttt cgaccagtcc aagaacggct 3180acgccggata cattgacgga ggcgcctccc
aggaagagtt ctacaagttc atcaagccca 3240tccttgagaa gatggacggt accgaggagc
tgttggtgaa gttgaacaga gaggacctgt 3300tgaggaagca gagaaccttc gacaacggaa
gcatccctca ccaaatccac ctgggagagc 3360tccacgccat cttgaggagg caggaggatt
tctatccctt cctgaaggac aaccgcgaga 3420agattgagaa gatcttgacc ttcagaattc
cttactacgt cgggccactc gccagaggaa 3480actctaggtt cgcctggatg acccgcaaat
ctgaagagac cattactccc tggaacttcg 3540aggaagtcgt ggacaagggc gcttccgctc
agtctttcat cgagaggatg accaacttcg 3600ataaaaatct gcccaacgag aaggtgctgc
ccaagcactc cctgttgtac gagtatttca 3660cagtgtacaa cgagctcacc aaggtgaagt
acgtcacaga gggaatgagg aagcctgcct 3720tcttgtccgg agagcagaag aaggccatcg
tcgacctgct cttcaagacc aacaggaagg 3780tgactgtcaa gcagctgaag gaggactact
tcaagaagat cgagtgcttc gactccgtcg 3840agatctctgg tgtcgaggac aggttcaacg
cctcccttgg gacttaccac gatctgctca 3900agattattaa agacaaggac ttcctggaca
acgaggagaa cgaggacatc cttgaggaca 3960tcgtgctcac cctgaccttg ttcgaagaca
gggaaatgat cgaagagagg ctcaagacct 4020acgcccacct cttcgacgac aaggtgatga
aacagctgaa gagacgcaga tataccggct 4080ggggaaggct ctcccgcaaa ttgatcaacg
ggatcaggga caagcagtca gggaagacta 4140tactcgactt cctgaagtcc gacggattcg
ccaacaggaa cttcatgcag ctcattcacg 4200acgactcctt gaccttcaag gaggacatcc
agaaggctca ggtgtctgga cagggtgact 4260ccttgcatga gcacattgct aacttggccg
gctctcccgc tattaagaag ggcattttgc 4320agaccgtgaa ggtcgttgac gagctcgtga
aggtgatggg acgccacaag ccagagaaca 4380tcgttattga gatggctcgc gagaaccaaa
ctacccagaa agggcagaag aattcccgcg 4440agaggatgaa gcgcattgag gagggcataa
aagagcttgg ctctcagatc ctcaaggagc 4500accccgtcga gaacactcag ctgcagaacg
agaagctgta cctgtactac ctccaaaacg 4560gaagggacat gtacgtggac caggagctgg
acatcaacag gttgtccgac tacgacgtcg 4620accacatcgt gcctcagtcc ttcctgaagg
atgactccat cgacaataaa gtgctgacac 4680gctccgataa aaatagaggc aagtccgaca
acgtcccctc cgaggaggtc gtgaagaaga 4740tgaaaaacta ctggagacag ctcttgaacg
ccaagctcat cacccagcgt aagttcgaca 4800acctgactaa ggctgagaga ggaggattgt
ccgagctcga taaggccgga ttcatcaaga 4860gacagctcgt cgaaacccgc caaattacca
agcacgtggc ccaaattctg gattcccgca 4920tgaacaccaa gtacgatgaa aatgacaagc
tgatccgcga ggtcaaggtg atcaccttga 4980agtccaagct ggtctccgac ttccgcaagg
acttccagtt ctacaaggtg agggagatca 5040acaactacca ccacgcacac gacgcctacc
tcaacgctgt cgttggaacc gccctcatca 5100aaaaatatcc taagctggag tctgagttcg
tctacggcga ctacaaggtg tacgacgtga 5160ggaagatgat cgctaagtct gagcaggaga
tcggcaaggc caccgccaag tacttcttct 5220actccaacat catgaacttc ttcaagaccg
agatcactct cgccaacggt gagatcagga 5280agcgcccact gatcgagacc aacggtgaga
ctggagagat cgtgtgggac aaagggaggg 5340atttcgctac tgtgaggaag gtgctctcca
tgcctcaggt gaacatcgtc aagaagaccg 5400aagttcagac cggaggattc tccaaggagt
ccatcctccc caagagaaac tccgacaagc 5460tgatcgctag aaagaaagac tgggacccta
agaagtacgg aggcttcgat tctcctaccg 5520tggcctactc tgtgctggtc gtggccaagg
tggagaaggg caagtccaag aagctgaaat 5580ccgtcaagga gctcctcggg attaccatca
tggagaggag ttccttcgag aagaacccta 5640tcgacttcct ggaggccaag ggatataaag
aggtgaagaa ggacctcatc atcaagctgc 5700ccaagtactc cctcttcgag ttggagaacg
gaaggaagag gatgctggct tctgccggag 5760agttgcagaa gggaaatgag ctcgcccttc
cctccaagta cgtgaacttc ctgtacctcg 5820cctctcacta tgaaaagttg aagggctctc
ctgaggacaa cgagcagaag cagctcttcg 5880tggagcagca caagcactac ctggacgaaa
ttatcgagca gatctctgag ttctccaagc 5940gcgtgatatt ggccgacgcc aacctcgaca
aggtgctgtc cgcctacaac aagcacaggg 6000ataagcccat tcgcgagcag gctgaaaaca
ttatccacct gtttaccctc acaaacttgg 6060gagcccctgc tgccttcaag tacttcgaca
ccaccattga caggaagaga tacacctcca 6120ccaaggaggt gctcgacgca acactcatcc
accaatccat caccggcctc tatgaaacaa 6180ggattgactt gtcccagctg ggaggcgact
ctagagccga tcccaagaag aagagaaagg 6240tgtaggttaa cctagacttg tccatcttct
ggattggcca acttaattaa tgtatgaaat 6300aaaaggatgc acacatagtg acatgctaat
cactataatg tgggcatcaa agttgtgtgt 6360tatgtgtaat tactagttat ctgaataaaa
gagaaagaga tcatccatat ttcttatcct 6420aaatgaatgt cacgtgtctt tataattctt
tgatgaacca gatgcatttc attaaccaaa 6480tccatataca tataaatatt aatcatatat
aattaatatc aattgggtta gcaaaacaaa 6540tctagtctag gtgtgttttg cgaattcgat
atcaagctta tcgataccgt cgaggggggg 6600
cccggtaccg g 6611
299 5686 DNA Artificial sequence DD20HR1-SAMSHPT-DD20HR2
299 cgcgcctcta gttgaagaca cgttcatgtc
ttcatcgtaa gaagacactc agtagtcttc 60ggccagaatg gccatctgga ttcagcaggc
ctagaaggcc atttaaatcc tgaggatctg 120gtcttcctaa ggacccggga tatcgctatc
aactttgtat agaaaagttg ggccgaattc 180gagctcggta cggccagaat ccggtaagtg
actagggtca cgtgacccta gtcacttaaa 240ttcggccaga atggccatct ggattcagca
ggcctagaag gcccggaccg attaaacttt 300aattcggtcc gggttacctc gagcctagta
ataattacac atctaagata tccccttctt 360tttcaagtaa aataatatca tatgatctca
ttttagtgaa acaatactat ttccctgata 420actctcttca acattaggga cttcatctaa
tcatctactt tcaaggtata actagacgta 480tttgttcttt taaaaaaaac actagatgta
ctcgtcaact caaaattcat cgttcatgca 540ttttaattaa actttaatta gctaatgagt
agaaaaagat catacgagta aaatagaaga 600atcttcctag attttggaag aatggattgg
agtgtaagtg aattgatcca ttagtggaag 660atgctcttta caatggccaa actgttctaa
ttgttagagc acatttgaga tgaaacactt 720cagtagtgga ggtaacctac aatcctagga
tctgtatcct ctatcactaa tggagcaatg 780ggtttgagat tgacttactc ctttccttgt
ctctcgtagt gcatatgcgc actttcaaag 840gctacacaaa agccgttaac tttttgttta
tttaagttac gaaagatagt tgaattagag 900taaatggtga tattgaatta ggattttaaa
taattttaaa agaatttttt taataaaaaa 960aatattgtgt tgttggatca aaatttttaa
ataacatgaa taaggaaatg gattgcaatg 1020aggttttaaa caattatttt aacatatagg
attttagaaa gacttttata atattttgtt 1080gaagtttaga ttttaatata tttatgtttt
aaaattttaa aaaaaacttc atgaatttat 1140aatatttgaa aaagacacgt gaatatttag
aaaacattta aaattacaat aataaatcat 1200aatgagatag ggtgtattca tgtgtagacg
agacaccaag tatatggttc acaagtgaat 1260catctttttt ttttacagca caagtagatc
acttgtactt atcaaaattc ggaactgaca 1320cacactagtg gtcacctaag tgactagggt
cacgtgaccc tagtcactta ttcccaaaca 1380ctagtaacgg ccgccagtgt gctggaattc
gcccttccca agctttgctc tagatcaaac 1440tcacatccaa acataacatg gatatcttcc
ttaccaatca tactaattat tttgggttaa 1500atattaatca ttatttttaa gatattaatt
aagaaattaa aagatttttt aaaaaaatgt 1560ataaaattat attattcatg atttttcata
catttgattt tgataataaa tatatttttt 1620ttaatttctt aaaaaatgtt gcaagacact
tattagacat agtcttgttc tgtttacaaa 1680agcattcatc atttaataca ttaaaaaata
tttaatacta acagtagaat cttcttgtga 1740gtggtgtggg agtaggcaac ctggcattga
aacgagagaa agagagtcag aaccagaaga 1800caaataaaaa gtatgcaaca aacaaatcaa
aatcaaaggg caaaggctgg ggttggctca 1860attggttgct acattcaatt ttcaactcag
tcaacggttg agattcactc tgacttcccc 1920aatctaagcc gcggatgcaa acggttgaat
ctaacccaca atccaatctc gttacttagg 1980ggcttttccg tcattaactc acccctgcca
cccggtttcc ctataaattg gaactcaatg 2040ctcccctcta aactcgtatc gcttcagagt
tgagaccaag acacactcgt tcatatatct 2100ctctgctctt ctcttctctt ctacctctca
aggtactttt cttctccctc taccaaatcc 2160tagattccgt ggttcaattt cggatcttgc
acttctggtt tgctttgcct tgctttttcc 2220tcaactgggt ccatctagga tccatgtgaa
actctactct ttctttaata tctgcggaat 2280acgcgtttga ctttcagatc tagtcgaaat
catttcataa ttgcctttct ttcttttagc 2340ttatgagaaa taaaatcact ttttttttat
ttcaaaataa accttgggcc ttgtgctgac 2400tgagatgggg tttggtgatt acagaatttt
agcgaatttt gtaattgtac ttgtttgtct 2460gtagttttgt tttgttttct tgtttctcat
acattcctta ggcttcaatt ttattcgagt 2520ataggtcaca ataggaattc aaactttgag
caggggaatt aatcccttcc ttcaaatcca 2580gtttgtttgt atatatgttt aaaaaatgaa
acttttgctt taaattctat tataactttt 2640tttatggctg aaatttttgc atgtgtcttt
gctctctgtt gtaaatttac tgtttaggta 2700ctaactctag gcttgttgtg cagtttttga
agtataacaa cagaagttcc tattccgaag 2760ttcctattct ctagaaagta taggaacttc
cactagtcca tgaaaaagcc tgaactcacc 2820gcgacgtctg tcgagaagtt tctgatcgaa
aagttcgaca gcgtctccga cctgatgcag 2880ctctcggagg gcgaagaatc tcgtgctttc
agcttcgatg taggagggcg tggatatgtc 2940ctgcgggtaa atagctgcgc cgatggtttc
tacaaagatc gttatgttta tcggcacttt 3000gcatcggccg cgctcccgat tccggaagtg
cttgacattg gggaattcag cgagagcctg 3060acctattgca tctcccgccg tgcacagggt
gtcacgttgc aagacctgcc tgaaaccgaa 3120ctgcccgctg ttctgcagcc ggtcgcggag
gccatggatg cgatcgctgc ggccgatctt 3180agccagacga gcgggttcgg cccattcgga
ccgcaaggaa tcggtcaata cactacatgg 3240cgtgatttca tatgcgcgat tgctgatccc
catgtgtatc actggcaaac tgtgatggac 3300gacaccgtca gtgcgtccgt cgcgcaggct
ctcgatgagc tgatgctttg ggccgaggac 3360tgccccgaag tccggcacct cgtgcacgcg
gatttcggct ccaacaatgt cctgacggac 3420aatggccgca taacagcggt cattgactgg
agcgaggcga tgttcgggga ttcccaatac 3480gaggtcgcca acatcttctt ctggaggccg
tggttggctt gtatggagca gcagacgcgc 3540tacttcgagc ggaggcatcc ggagcttgca
ggatcgccgc ggctccgggc gtatatgctc 3600cgcattggtc ttgaccaact ctatcagagc
ttggttgacg gcaatttcga tgatgcagct 3660tgggcgcagg gtcgatgcga cgcaatcgtc
cgatccggag ccgggactgt cgggcgtaca 3720caaatcgccc gcagaagcgc ggccgtctgg
accgatggct gtgtagaagt actcgccgat 3780agtggaaacc gacgccccag cactcgtccg
agggcaaagg aatagtgagg tacctaaaga 3840aggagtgcgt cgaagcagat cgttcaaaca
tttggcaata aagtttctta agattgaatc 3900ctgttgccgg tcttgcgatg attatcatat
aatttctgtt gaattacgtt aagcatgtaa 3960taattaacat gtaatgcatg acgttattta
tgagatgggt ttttatgatt agagtcccgc 4020aattatacat ttaatacgcg atagaaaaca
aaatatagcg cgcaaactag gataaattat 4080cgcgcgcggt gtcatctatg ttactagatc
gatgtcgacc cgggccctag gaggccggcc 4140cagctgatga tcccggtgaa gttcctattc
cgaagttcct attctccaga aagtatagga 4200acttcactag agcttgcggc cgcgcatgct
gacttaatca gctaacgcca ctcgacctgc 4260aggcatgccc gcggatatcg atgggccccg
gccgaagctt caagtttgta caaaaaagca 4320ggctggcgcc ggaaccaatt cagtcgactg
gatccggtac cgaattcgcg gccgcactcg 4380agatatctag acccagcttt cttgtacaaa
gtggccgtta acggatcggc cagaatccgg 4440taagtgacta gggtcacgtg accctagtca
cttaaattcg gccagaatgg ccatctggat 4500tcagcaggcc tagaaggccc ggaccgatta
aactttaatt cggtccgggt tacctctaga 4560aagcttgtcg acctgcagac acgacatgat
ggaacgtgac taaggtgggt ttttgacttt 4620gcatgtcgaa gtgagagtga ttttattgag
agaataatag aagacctaca aaacaaatga 4680tcccgacgct aaagtaagta cgagagttaa
gagaataaat gggaaaatat gcatacatga 4740ttaggtgtgt gttcgtctca agaaagtacg
aatgaatatg gtgtgtttgt agtacatgaa 4800tgatgtgttt tgagggttca agggaaattg
atatttatag agtgaaatgg aaccagaggt 4860ctttgttgac aagggttgtt atgactcttg
caaataatta atagcttata aataatagcc 4920aataacttat tatagataga gttagagata
atatatagct aaatttgaac aaggcataca 4980aaacaaaaat gctaaatatg aataagacaa
tcaaaattgt agtcgatgtt caactctttg 5040tcgttgaaga acttgtttgc agtggtatag
taaatgggtg tgagtgcagt gtctcaccca 5100tctcacacca cacaaccaac ttcatatcta
aagatattgt cgctgaatac aaaattgagt 5160tatggaatat acaattcata atatagatac
gaaaaatcat ttcttacaaa acattcaatc 5220aaaaattatt caaacataat tctagattaa
gtaatccgaa gtacaagtta gtatcctaga 5280tccgttaatt taaaattatg tttgcataat
tttggatttg gtgttctata agggcacaat 5340tttgttcatt cttacaagtt tgtcaattct
aaaatatatg caaatttgaa gaaaaaaaat 5400ttacgaatgt gtctcaaaca ataacttaat
gggaggagaa tgagggatga agaagctcaa 5460aattaccaac gccttctacc tcaagaagct
acttcacaca aaatatgact ggcggaagga 5520taggggacaa ccgataacga gaaggagata
cataaggtaa tgtacgttgt tgtgtgaggg 5580atccggtcac ctaagtgact agggtcacgt
gaccctagtc acttattccc gggcaacttt 5640attatacaaa gttgatagat ctcgaattca
ttccgattaa tcgtgg 5686
300 6611 DNA Artificial sequence U6-9.1-DD20CR2+EF1A2-CAS9
300 cgcgccggta cccgggttaa gagaattgta
agtgtgcttt tatatattta aaattaatat 60attttgaaat gttaaaatat aaaagaaaat
tcaatgtaaa ttaaaaataa ataaatgttt 120aataaagata aattttaaaa cataaaagaa
aatgtctaac aagaggatta agatcctgtg 180ctcttaaatt tttaggtgtt gaaatcttag
ccatacaaaa tatattttat taaaaccaag 240catgaaaaaa gtcactaaag agctatataa
ctcatgcagc tagaaatgaa gtgaagggaa 300tccagtttgt tctcagtcga aagagtgtct
atctttgttc ttttctgcaa ccgagttaag 360caaaatggga atgcgaggta tcttcctttc
gttaggggag caccagatgc atagttagtc 420ccacattgat gaatataaca agagcttcac
agaatatata gcccaggcca cagtaaaagc 480ttgacatgat ggaacgtgac tagttttaga
gctagaaata gcaagttaaa ataaggctag 540tccgttatca acttgaaaaa gtggcaccga
gtcggtgctt ttttttgcgg ccgcaattgg 600atcgggttta cttattttgt gggtatctat
acttttatta gatttttaat caggctcctg 660atttcttttt atttcgattg aattcctgaa
cttgtattat tcagtagatc gaataaatta 720taaaaagata aaatcataaa ataatatttt
atcctatcaa tcatattaaa gcaatgaata 780tgtaaaatta atcttatctt tattttaaaa
aatcatatag gtttagtatt tttttaaaaa 840taaagatagg attagtttta ctattcactg
cttattactt ttaaaaaaat cataaaggtt 900tagtattttt ttaaaataaa tataggaata
gttttactat tcactgcttt aatagaaaaa 960tagtttaaaa tttaagatag ttttaatccc
agcatttgcc acgtttgaac gtgagccgaa 1020acgatgtcgt tacattatct taacctagct
gaaacgatgt cgtcataata tcgccaaatg 1080ccaactggac tacgtcgaac ccacaaatcc
cacaaagcgc gtgaaatcaa atcgctcaaa 1140ccacaaaaaa gaacaacgcg tttgttacac
gctcaatccc acgcgagtag agcacagtaa 1200ccttcaaata agcgaatggg gcataatcag
aaatccgaaa taaacctagg ggcattatcg 1260gaaatgaaaa gtagctcact caatataaaa
atctaggaac cctagttttc gttatcactc 1320tgtgctccct cgctctattt ctcagtctct
gtgtttgcgg ctgaggattc cgaacgagtg 1380accttcttcg tttctcgcaa aggtaacagc
ctctgctctt gtctcttcga ttcgatctat 1440gcctgtctct tatttacgat gatgtttctt
cggttatgtt tttttattta tgctttatgc 1500tgttgatgtt cggttgtttg tttcgctttg
tttttgtggt tcagtttttt aggattcttt 1560tggtttttga atcgattaat cggaagagat
tttcgagtta tttggtgtgt tggaggtgaa 1620tctttttttt gaggtcatag atctgttgta
tttgtgttat aaacatgcga ctttgtatga 1680ttttttacga ggttatgatg ttctggttgt
tttattatga atctgttgag acagaaccat 1740gatttttgtt gatgttcgtt tacactatta
aaggtttgtt ttaacaggat taaaagtttt 1800ttaagcatgt tgaaggagtc ttgtagatat
gtaaccgtcg atagtttttt tgtgggtttg 1860ttcacatgtt atcaagctta atcttttact
atgtatgcga ccatatctgg atccagcaaa 1920ggcgattttt taattccttg tgaaactttt
gtaatatgaa gttgaaattt tgttattggt 1980aaactataaa tgtgtgaagt tggagtatac
ctttaccttc ttatttggct ttgtgatagt 2040ttaatttata tgtattttga gttctgactt
gtatttcttt gaattgattc tagtttaagt 2100aatccatgga caaaaagtac tcaatagggc
tcgacatagg gactaactcc gttggatggg 2160ccgtcatcac cgacgagtac aaggtgccct
ccaagaagtt caaggtgttg ggaaacaccg 2220acaggcacag cataaagaag aatttgatcg
gtgccctcct cttcgactcc ggagagaccg 2280ctgaggctac caggctcaag aggaccgcta
gaaggcgcta caccagaagg aagaacagaa 2340tctgctacct gcaggagatc ttctccaacg
agatggccaa ggtggacgac tccttcttcc 2400accgccttga ggaatcattc ctggtggagg
aggataaaaa gcacgagaga cacccaatct 2460tcgggaacat cgtcgacgag gtggcctacc
atgaaaagta ccctaccatc taccacctga 2520ggaagaagct ggtcgactct accgacaagg
ctgacttgcg cttgatttac ctggctctcg 2580ctcacatgat aaagttccgc ggacacttcc
tcattgaggg agacctgaac ccagacaact 2640ccgacgtgga caagctcttc atccagctcg
ttcagaccta caaccagctt ttcgaggaga 2700acccaatcaa cgccagtgga gttgacgcca
aggctatcct ctctgctcgt ctgtcaaagt 2760ccaggaggct tgagaacttg attgcccagc
tgcctggcga aaagaagaac ggactgttcg 2820gaaacttgat cgctctctcc ctgggattga
ctcccaactt caagtccaac ttcgacctcg 2880ccgaggacgc taagttgcag ttgtctaaag
acacctacga cgatgacctc gacaacttgc 2940tggcccagat aggcgaccaa tacgccgatc
tcttcctcgc cgctaagaac ttgtccgacg 3000caatcctgct gtccgacatc ctgagagtca
acactgagat taccaaagct cctctgtctg 3060cttccatgat taagcgctac gacgagcacc
accaagatct gaccctgctc aaggccctgg 3120tgagacagca gctgcccgag aagtacaagg
agatcttttt cgaccagtcc aagaacggct 3180acgccggata cattgacgga ggcgcctccc
aggaagagtt ctacaagttc atcaagccca 3240tccttgagaa gatggacggt accgaggagc
tgttggtgaa gttgaacaga gaggacctgt 3300tgaggaagca gagaaccttc gacaacggaa
gcatccctca ccaaatccac ctgggagagc 3360tccacgccat cttgaggagg caggaggatt
tctatccctt cctgaaggac aaccgcgaga 3420agattgagaa gatcttgacc ttcagaattc
cttactacgt cgggccactc gccagaggaa 3480actctaggtt cgcctggatg acccgcaaat
ctgaagagac cattactccc tggaacttcg 3540aggaagtcgt ggacaagggc gcttccgctc
agtctttcat cgagaggatg accaacttcg 3600ataaaaatct gcccaacgag aaggtgctgc
ccaagcactc cctgttgtac gagtatttca 3660cagtgtacaa cgagctcacc aaggtgaagt
acgtcacaga gggaatgagg aagcctgcct 3720tcttgtccgg agagcagaag aaggccatcg
tcgacctgct cttcaagacc aacaggaagg 3780tgactgtcaa gcagctgaag gaggactact
tcaagaagat cgagtgcttc gactccgtcg 3840agatctctgg tgtcgaggac aggttcaacg
cctcccttgg gacttaccac gatctgctca 3900agattattaa agacaaggac ttcctggaca
acgaggagaa cgaggacatc cttgaggaca 3960tcgtgctcac cctgaccttg ttcgaagaca
gggaaatgat cgaagagagg ctcaagacct 4020acgcccacct cttcgacgac aaggtgatga
aacagctgaa gagacgcaga tataccggct 4080ggggaaggct ctcccgcaaa ttgatcaacg
ggatcaggga caagcagtca gggaagacta 4140tactcgactt cctgaagtcc gacggattcg
ccaacaggaa cttcatgcag ctcattcacg 4200acgactcctt gaccttcaag gaggacatcc
agaaggctca ggtgtctgga cagggtgact 4260ccttgcatga gcacattgct aacttggccg
gctctcccgc tattaagaag ggcattttgc 4320agaccgtgaa ggtcgttgac gagctcgtga
aggtgatggg acgccacaag ccagagaaca 4380tcgttattga gatggctcgc gagaaccaaa
ctacccagaa agggcagaag aattcccgcg 4440agaggatgaa gcgcattgag gagggcataa
aagagcttgg ctctcagatc ctcaaggagc 4500accccgtcga gaacactcag ctgcagaacg
agaagctgta cctgtactac ctccaaaacg 4560gaagggacat gtacgtggac caggagctgg
acatcaacag gttgtccgac tacgacgtcg 4620accacatcgt gcctcagtcc ttcctgaagg
atgactccat cgacaataaa gtgctgacac 4680gctccgataa aaatagaggc aagtccgaca
acgtcccctc cgaggaggtc gtgaagaaga 4740tgaaaaacta ctggagacag ctcttgaacg
ccaagctcat cacccagcgt aagttcgaca 4800acctgactaa ggctgagaga ggaggattgt
ccgagctcga taaggccgga ttcatcaaga 4860gacagctcgt cgaaacccgc caaattacca
agcacgtggc ccaaattctg gattcccgca 4920tgaacaccaa gtacgatgaa aatgacaagc
tgatccgcga ggtcaaggtg atcaccttga 4980agtccaagct ggtctccgac ttccgcaagg
acttccagtt ctacaaggtg agggagatca 5040acaactacca ccacgcacac gacgcctacc
tcaacgctgt cgttggaacc gccctcatca 5100aaaaatatcc taagctggag tctgagttcg
tctacggcga ctacaaggtg tacgacgtga 5160ggaagatgat cgctaagtct gagcaggaga
tcggcaaggc caccgccaag tacttcttct 5220actccaacat catgaacttc ttcaagaccg
agatcactct cgccaacggt gagatcagga 5280agcgcccact gatcgagacc aacggtgaga
ctggagagat cgtgtgggac aaagggaggg 5340atttcgctac tgtgaggaag gtgctctcca
tgcctcaggt gaacatcgtc aagaagaccg 5400aagttcagac cggaggattc tccaaggagt
ccatcctccc caagagaaac tccgacaagc 5460tgatcgctag aaagaaagac tgggacccta
agaagtacgg aggcttcgat tctcctaccg 5520tggcctactc tgtgctggtc gtggccaagg
tggagaaggg caagtccaag aagctgaaat 5580ccgtcaagga gctcctcggg attaccatca
tggagaggag ttccttcgag aagaacccta 5640tcgacttcct ggaggccaag ggatataaag
aggtgaagaa ggacctcatc atcaagctgc 5700ccaagtactc cctcttcgag ttggagaacg
gaaggaagag gatgctggct tctgccggag 5760agttgcagaa gggaaatgag ctcgcccttc
cctccaagta cgtgaacttc ctgtacctcg 5820cctctcacta tgaaaagttg aagggctctc
ctgaggacaa cgagcagaag cagctcttcg 5880tggagcagca caagcactac ctggacgaaa
ttatcgagca gatctctgag ttctccaagc 5940gcgtgatatt ggccgacgcc aacctcgaca
aggtgctgtc cgcctacaac aagcacaggg 6000ataagcccat tcgcgagcag gctgaaaaca
ttatccacct gtttaccctc acaaacttgg 6060gagcccctgc tgccttcaag tacttcgaca
ccaccattga caggaagaga tacacctcca 6120ccaaggaggt gctcgacgca acactcatcc
accaatccat caccggcctc tatgaaacaa 6180ggattgactt gtcccagctg ggaggcgact
ctagagccga tcccaagaag aagagaaagg 6240tgtaggttaa cctagacttg tccatcttct
ggattggcca acttaattaa tgtatgaaat 6300aaaaggatgc acacatagtg acatgctaat
cactataatg tgggcatcaa agttgtgtgt 6360tatgtgtaat tactagttat ctgaataaaa
gagaaagaga tcatccatat ttcttatcct 6420aaatgaatgt cacgtgtctt tataattctt
tgatgaacca gatgcatttc attaaccaaa 6480tccatataca tataaatatt aatcatatat
aattaatatc aattgggtta gcaaaacaaa 6540tctagtctag gtgtgttttg cgaattcgat
atcaagctta tcgataccgt cgaggggggg 6600cccggtaccg g
6611
301 6611 DNA Artificial sequence U6-9.1DD43CR1+EF1A2CAS9
301 cgcgccggta cccgggttaa gagaattgta
agtgtgcttt tatatattta aaattaatat 60attttgaaat gttaaaatat aaaagaaaat
tcaatgtaaa ttaaaaataa ataaatgttt 120aataaagata aattttaaaa cataaaagaa
aatgtctaac aagaggatta agatcctgtg 180ctcttaaatt tttaggtgtt gaaatcttag
ccatacaaaa tatattttat taaaaccaag 240catgaaaaaa gtcactaaag agctatataa
ctcatgcagc tagaaatgaa gtgaagggaa 300tccagtttgt tctcagtcga aagagtgtct
atctttgttc ttttctgcaa ccgagttaag 360caaaatggga atgcgaggta tcttcctttc
gttaggggag caccagatgc atagttagtc 420ccacattgat gaatataaca agagcttcac
agaatatata gcccaggcca cagtaaaagc 480ttgtcccttg tacttgtacg tagttttaga
gctagaaata gcaagttaaa ataaggctag 540tccgttatca acttgaaaaa gtggcaccga
gtcggtgctt ttttttgcgg ccgcaattgg 600atcgggttta cttattttgt gggtatctat
acttttatta gatttttaat caggctcctg 660atttcttttt atttcgattg aattcctgaa
cttgtattat tcagtagatc gaataaatta 720taaaaagata aaatcataaa ataatatttt
atcctatcaa tcatattaaa gcaatgaata 780tgtaaaatta atcttatctt tattttaaaa
aatcatatag gtttagtatt tttttaaaaa 840taaagatagg attagtttta ctattcactg
cttattactt ttaaaaaaat cataaaggtt 900tagtattttt ttaaaataaa tataggaata
gttttactat tcactgcttt aatagaaaaa 960tagtttaaaa tttaagatag ttttaatccc
agcatttgcc acgtttgaac gtgagccgaa 1020acgatgtcgt tacattatct taacctagct
gaaacgatgt cgtcataata tcgccaaatg 1080ccaactggac tacgtcgaac ccacaaatcc
cacaaagcgc gtgaaatcaa atcgctcaaa 1140ccacaaaaaa gaacaacgcg tttgttacac
gctcaatccc acgcgagtag agcacagtaa 1200ccttcaaata agcgaatggg gcataatcag
aaatccgaaa taaacctagg ggcattatcg 1260gaaatgaaaa gtagctcact caatataaaa
atctaggaac cctagttttc gttatcactc 1320tgtgctccct cgctctattt ctcagtctct
gtgtttgcgg ctgaggattc cgaacgagtg 1380accttcttcg tttctcgcaa aggtaacagc
ctctgctctt gtctcttcga ttcgatctat 1440gcctgtctct tatttacgat gatgtttctt
cggttatgtt tttttattta tgctttatgc 1500tgttgatgtt cggttgtttg tttcgctttg
tttttgtggt tcagtttttt aggattcttt 1560tggtttttga atcgattaat cggaagagat
tttcgagtta tttggtgtgt tggaggtgaa 1620tctttttttt gaggtcatag atctgttgta
tttgtgttat aaacatgcga ctttgtatga 1680ttttttacga ggttatgatg ttctggttgt
tttattatga atctgttgag acagaaccat 1740gatttttgtt gatgttcgtt tacactatta
aaggtttgtt ttaacaggat taaaagtttt 1800ttaagcatgt tgaaggagtc ttgtagatat
gtaaccgtcg atagtttttt tgtgggtttg 1860ttcacatgtt atcaagctta atcttttact
atgtatgcga ccatatctgg atccagcaaa 1920ggcgattttt taattccttg tgaaactttt
gtaatatgaa gttgaaattt tgttattggt 1980aaactataaa tgtgtgaagt tggagtatac
ctttaccttc ttatttggct ttgtgatagt 2040ttaatttata tgtattttga gttctgactt
gtatttcttt gaattgattc tagtttaagt 2100aatccatgga caaaaagtac tcaatagggc
tcgacatagg gactaactcc gttggatggg 2160ccgtcatcac cgacgagtac aaggtgccct
ccaagaagtt caaggtgttg ggaaacaccg 2220acaggcacag cataaagaag aatttgatcg
gtgccctcct cttcgactcc ggagagaccg 2280ctgaggctac caggctcaag aggaccgcta
gaaggcgcta caccagaagg aagaacagaa 2340tctgctacct gcaggagatc ttctccaacg
agatggccaa ggtggacgac tccttcttcc 2400accgccttga ggaatcattc ctggtggagg
aggataaaaa gcacgagaga cacccaatct 2460tcgggaacat cgtcgacgag gtggcctacc
atgaaaagta ccctaccatc taccacctga 2520ggaagaagct ggtcgactct accgacaagg
ctgacttgcg cttgatttac ctggctctcg 2580ctcacatgat aaagttccgc ggacacttcc
tcattgaggg agacctgaac ccagacaact 2640ccgacgtgga caagctcttc atccagctcg
ttcagaccta caaccagctt ttcgaggaga 2700acccaatcaa cgccagtgga gttgacgcca
aggctatcct ctctgctcgt ctgtcaaagt 2760ccaggaggct tgagaacttg attgcccagc
tgcctggcga aaagaagaac ggactgttcg 2820gaaacttgat cgctctctcc ctgggattga
ctcccaactt caagtccaac ttcgacctcg 2880ccgaggacgc taagttgcag ttgtctaaag
acacctacga cgatgacctc gacaacttgc 2940tggcccagat aggcgaccaa tacgccgatc
tcttcctcgc cgctaagaac ttgtccgacg 3000caatcctgct gtccgacatc ctgagagtca
acactgagat taccaaagct cctctgtctg 3060cttccatgat taagcgctac gacgagcacc
accaagatct gaccctgctc aaggccctgg 3120tgagacagca gctgcccgag aagtacaagg
agatcttttt cgaccagtcc aagaacggct 3180acgccggata cattgacgga ggcgcctccc
aggaagagtt ctacaagttc atcaagccca 3240tccttgagaa gatggacggt accgaggagc
tgttggtgaa gttgaacaga gaggacctgt 3300tgaggaagca gagaaccttc gacaacggaa
gcatccctca ccaaatccac ctgggagagc 3360tccacgccat cttgaggagg caggaggatt
tctatccctt cctgaaggac aaccgcgaga 3420agattgagaa gatcttgacc ttcagaattc
cttactacgt cgggccactc gccagaggaa 3480actctaggtt cgcctggatg acccgcaaat
ctgaagagac cattactccc tggaacttcg 3540aggaagtcgt ggacaagggc gcttccgctc
agtctttcat cgagaggatg accaacttcg 3600ataaaaatct gcccaacgag aaggtgctgc
ccaagcactc cctgttgtac gagtatttca 3660cagtgtacaa cgagctcacc aaggtgaagt
acgtcacaga gggaatgagg aagcctgcct 3720tcttgtccgg agagcagaag aaggccatcg
tcgacctgct cttcaagacc aacaggaagg 3780tgactgtcaa gcagctgaag gaggactact
tcaagaagat cgagtgcttc gactccgtcg 3840agatctctgg tgtcgaggac aggttcaacg
cctcccttgg gacttaccac gatctgctca 3900agattattaa agacaaggac ttcctggaca
acgaggagaa cgaggacatc cttgaggaca 3960tcgtgctcac cctgaccttg ttcgaagaca
gggaaatgat cgaagagagg ctcaagacct 4020acgcccacct cttcgacgac aaggtgatga
aacagctgaa gagacgcaga tataccggct 4080ggggaaggct ctcccgcaaa ttgatcaacg
ggatcaggga caagcagtca gggaagacta 4140tactcgactt cctgaagtcc gacggattcg
ccaacaggaa cttcatgcag ctcattcacg 4200acgactcctt gaccttcaag gaggacatcc
agaaggctca ggtgtctgga cagggtgact 4260ccttgcatga gcacattgct aacttggccg
gctctcccgc tattaagaag ggcattttgc 4320agaccgtgaa ggtcgttgac gagctcgtga
aggtgatggg acgccacaag ccagagaaca 4380tcgttattga gatggctcgc gagaaccaaa
ctacccagaa agggcagaag aattcccgcg 4440agaggatgaa gcgcattgag gagggcataa
aagagcttgg ctctcagatc ctcaaggagc 4500accccgtcga gaacactcag ctgcagaacg
agaagctgta cctgtactac ctccaaaacg 4560gaagggacat gtacgtggac caggagctgg
acatcaacag gttgtccgac tacgacgtcg 4620accacatcgt gcctcagtcc ttcctgaagg
atgactccat cgacaataaa gtgctgacac 4680gctccgataa aaatagaggc aagtccgaca
acgtcccctc cgaggaggtc gtgaagaaga 4740tgaaaaacta ctggagacag ctcttgaacg
ccaagctcat cacccagcgt aagttcgaca 4800acctgactaa ggctgagaga ggaggattgt
ccgagctcga taaggccgga ttcatcaaga 4860gacagctcgt cgaaacccgc caaattacca
agcacgtggc ccaaattctg gattcccgca 4920tgaacaccaa gtacgatgaa aatgacaagc
tgatccgcga ggtcaaggtg atcaccttga 4980agtccaagct ggtctccgac ttccgcaagg
acttccagtt ctacaaggtg agggagatca 5040acaactacca ccacgcacac gacgcctacc
tcaacgctgt cgttggaacc gccctcatca 5100aaaaatatcc taagctggag tctgagttcg
tctacggcga ctacaaggtg tacgacgtga 5160ggaagatgat cgctaagtct gagcaggaga
tcggcaaggc caccgccaag tacttcttct 5220actccaacat catgaacttc ttcaagaccg
agatcactct cgccaacggt gagatcagga 5280agcgcccact gatcgagacc aacggtgaga
ctggagagat cgtgtgggac aaagggaggg 5340atttcgctac tgtgaggaag gtgctctcca
tgcctcaggt gaacatcgtc aagaagaccg 5400aagttcagac cggaggattc tccaaggagt
ccatcctccc caagagaaac tccgacaagc 5460tgatcgctag aaagaaagac tgggacccta
agaagtacgg aggcttcgat tctcctaccg 5520tggcctactc tgtgctggtc gtggccaagg
tggagaaggg caagtccaag aagctgaaat 5580ccgtcaagga gctcctcggg attaccatca
tggagaggag ttccttcgag aagaacccta 5640tcgacttcct ggaggccaag ggatataaag
aggtgaagaa ggacctcatc atcaagctgc 5700ccaagtactc cctcttcgag ttggagaacg
gaaggaagag gatgctggct tctgccggag 5760agttgcagaa gggaaatgag ctcgcccttc
cctccaagta cgtgaacttc ctgtacctcg 5820cctctcacta tgaaaagttg aagggctctc
ctgaggacaa cgagcagaag cagctcttcg 5880tggagcagca caagcactac ctggacgaaa
ttatcgagca gatctctgag ttctccaagc 5940gcgtgatatt ggccgacgcc aacctcgaca
aggtgctgtc cgcctacaac aagcacaggg 6000ataagcccat tcgcgagcag gctgaaaaca
ttatccacct gtttaccctc acaaacttgg 6060gagcccctgc tgccttcaag tacttcgaca
ccaccattga caggaagaga tacacctcca 6120ccaaggaggt gctcgacgca acactcatcc
accaatccat caccggcctc tatgaaacaa 6180ggattgactt gtcccagctg ggaggcgact
ctagagccga tcccaagaag aagagaaagg 6240tgtaggttaa cctagacttg tccatcttct
ggattggcca acttaattaa tgtatgaaat 6300aaaaggatgc acacatagtg acatgctaat
cactataatg tgggcatcaa agttgtgtgt 6360tatgtgtaat tactagttat ctgaataaaa
gagaaagaga tcatccatat ttcttatcct 6420aaatgaatgt cacgtgtctt tataattctt
tgatgaacca gatgcatttc attaaccaaa 6480tccatataca tataaatatt aatcatatat
aattaatatc aattgggtta gcaaaacaaa 6540tctagtctag gtgtgttttg cgaattcgat
atcaagctta tcgataccgt cgaggggggg 6600cccggtaccg g
6611
302 5719 DNA Artificial sequence DD43HR1-SAMSHPT-DD43HR2
302 cgcgcctcta gttgaagaca cgttcatgtc
ttcatcgtaa gaagacactc agtagtcttc 60ggccagaatg gccatctgga ttcagcaggc
ctagaaggcc atttaaatcc tgaggatctg 120gtcttcctaa ggacccggga tatcgctatc
aactttgtat agaaaagttg ggccgaattc 180gagctcggta cggccagaat ccggtaagtg
actagggtca cgtgacccta gtcacttaaa 240ttcggccaga atggccatct ggattcagca
ggcctagaag gcccggaccg attaaacttt 300aattcggtcc gggttacctc gagatcttgt
tcccctcctt ggtttggcat aaattgattt 360tcatggctct tctcggtcga aactggagct
aattcaccct tagtctctct taaaattctg 420gctgtaagaa acaccacaga acacataaat
tataaactaa ttataatttg aagagtaaaa 480tatgttttta ctcttatgat ttaattagtg
tagttttaat tttctccttt ttttaaaaaa 540ttttggtatt cataaatttc aattttttaa
aaataattgt tgttacccgt taatgataac 600gggatatgtt atgttaccac taaatcggac
aaaaaaaatt caaaactttt ataaggatta 660aaattaacaa aaatatttta aaaaaatcta
acctcaataa agttaaattt ataagcacaa 720aataatactt ttaagcctaa tttggcaaga
cacaagcaag ctcacctgta gcattaatag 780aaaggaagca aagcaagaga aaagcaacca
gaaggaagcg tttgcttggt gacacagcca 840tcttacttga atttatggta ttactgagaa
accttgatct tgcttcaaaa tcttctagtt 900accctctttt tataggcaga aagagaacta
gctagttgcc aataggatat gaggacatgt 960ggtgcaatgc actcactctt caaggacaag
aaaaacaatg gctacaattg tggttcaaat 1020caatgtctcc tgctctgtcc tgcctgaaaa
tgacaccctt ttgcttggaa aagaggatca 1080aagctaagaa caggagtggc ttcattccct
tcatgtaacc aaacactttc gcattctgtc 1140attcgtgaat cagcaaaatc tgcaaccaaa
aatatatggt gcctaaataa aagaaataaa 1200ataatttaga gttgcggact aaaataataa
acaaaagaaa tatattataa tctagaatta 1260atttaggact aaaagaagag gcagactcca
attcctcttt tctagaatac cctccgtacg 1320tacactagtg gtcacctaag tgactagggt
cacgtgaccc tagtcactta ttcccaaaca 1380ctagtaacgg ccgccagtgt gctggaattc
gcccttccca agctttgctc tagatcaaac 1440tcacatccaa acataacatg gatatcttcc
ttaccaatca tactaattat tttgggttaa 1500atattaatca ttatttttaa gatattaatt
aagaaattaa aagatttttt aaaaaaatgt 1560ataaaattat attattcatg atttttcata
catttgattt tgataataaa tatatttttt 1620ttaatttctt aaaaaatgtt gcaagacact
tattagacat agtcttgttc tgtttacaaa 1680agcattcatc atttaataca ttaaaaaata
tttaatacta acagtagaat cttcttgtga 1740gtggtgtggg agtaggcaac ctggcattga
aacgagagaa agagagtcag aaccagaaga 1800caaataaaaa gtatgcaaca aacaaatcaa
aatcaaaggg caaaggctgg ggttggctca 1860attggttgct acattcaatt ttcaactcag
tcaacggttg agattcactc tgacttcccc 1920aatctaagcc gcggatgcaa acggttgaat
ctaacccaca atccaatctc gttacttagg 1980ggcttttccg tcattaactc acccctgcca
cccggtttcc ctataaattg gaactcaatg 2040ctcccctcta aactcgtatc gcttcagagt
tgagaccaag acacactcgt tcatatatct 2100ctctgctctt ctcttctctt ctacctctca
aggtactttt cttctccctc taccaaatcc 2160tagattccgt ggttcaattt cggatcttgc
acttctggtt tgctttgcct tgctttttcc 2220tcaactgggt ccatctagga tccatgtgaa
actctactct ttctttaata tctgcggaat 2280acgcgtttga ctttcagatc tagtcgaaat
catttcataa ttgcctttct ttcttttagc 2340ttatgagaaa taaaatcact ttttttttat
ttcaaaataa accttgggcc ttgtgctgac 2400tgagatgggg tttggtgatt acagaatttt
agcgaatttt gtaattgtac ttgtttgtct 2460gtagttttgt tttgttttct tgtttctcat
acattcctta ggcttcaatt ttattcgagt 2520ataggtcaca ataggaattc aaactttgag
caggggaatt aatcccttcc ttcaaatcca 2580gtttgtttgt atatatgttt aaaaaatgaa
acttttgctt taaattctat tataactttt 2640tttatggctg aaatttttgc atgtgtcttt
gctctctgtt gtaaatttac tgtttaggta 2700ctaactctag gcttgttgtg cagtttttga
agtataacaa cagaagttcc tattccgaag 2760ttcctattct ctagaaagta taggaacttc
cactagtcca tgaaaaagcc tgaactcacc 2820gcgacgtctg tcgagaagtt tctgatcgaa
aagttcgaca gcgtctccga cctgatgcag 2880ctctcggagg gcgaagaatc tcgtgctttc
agcttcgatg taggagggcg tggatatgtc 2940ctgcgggtaa atagctgcgc cgatggtttc
tacaaagatc gttatgttta tcggcacttt 3000gcatcggccg cgctcccgat tccggaagtg
cttgacattg gggaattcag cgagagcctg 3060acctattgca tctcccgccg tgcacagggt
gtcacgttgc aagacctgcc tgaaaccgaa 3120ctgcccgctg ttctgcagcc ggtcgcggag
gccatggatg cgatcgctgc ggccgatctt 3180agccagacga gcgggttcgg cccattcgga
ccgcaaggaa tcggtcaata cactacatgg 3240cgtgatttca tatgcgcgat tgctgatccc
catgtgtatc actggcaaac tgtgatggac 3300gacaccgtca gtgcgtccgt cgcgcaggct
ctcgatgagc tgatgctttg ggccgaggac 3360tgccccgaag tccggcacct cgtgcacgcg
gatttcggct ccaacaatgt cctgacggac 3420aatggccgca taacagcggt cattgactgg
agcgaggcga tgttcgggga ttcccaatac 3480gaggtcgcca acatcttctt ctggaggccg
tggttggctt gtatggagca gcagacgcgc 3540tacttcgagc ggaggcatcc ggagcttgca
ggatcgccgc ggctccgggc gtatatgctc 3600cgcattggtc ttgaccaact ctatcagagc
ttggttgacg gcaatttcga tgatgcagct 3660tgggcgcagg gtcgatgcga cgcaatcgtc
cgatccggag ccgggactgt cgggcgtaca 3720caaatcgccc gcagaagcgc ggccgtctgg
accgatggct gtgtagaagt actcgccgat 3780agtggaaacc gacgccccag cactcgtccg
agggcaaagg aatagtgagg tacctaaaga 3840aggagtgcgt cgaagcagat cgttcaaaca
tttggcaata aagtttctta agattgaatc 3900ctgttgccgg tcttgcgatg attatcatat
aatttctgtt gaattacgtt aagcatgtaa 3960taattaacat gtaatgcatg acgttattta
tgagatgggt ttttatgatt agagtcccgc 4020aattatacat ttaatacgcg atagaaaaca
aaatatagcg cgcaaactag gataaattat 4080cgcgcgcggt gtcatctatg ttactagatc
gatgtcgacc cgggccctag gaggccggcc 4140cagctgatga tcccggtgaa gttcctattc
cgaagttcct attctccaga aagtatagga 4200acttcactag agcttgcggc cgcgcatgct
gacttaatca gctaacgcca ctcgacctgc 4260aggcatgccc gcggatatcg atgggccccg
gccgaagctt caagtttgta caaaaaagca 4320ggctggcgcc ggaaccaatt cagtcgactg
gatccggtac cgaattcgcg gccgcactcg 4380agatatctag acccagcttt cttgtacaaa
gtggccgtta acggatcggc cagaatccgg 4440taagtgacta gggtcacgtg accctagtca
cttaaattcg gccagaatgg ccatctggat 4500tcagcaggcc tagaaggccc ggaccgatta
aactttaatt cggtccgggt tacctctaga 4560aagcttgtcg acctgcaggt acaagtacaa
gggacttgtg agttgtaagg ctgtatttac 4620aatagtgaaa agagaatcat ctgggtgatt
gggtttttag tccccagtga cgaattaaag 4680gtttgaattc ttagtatgtt tgggaatcaa
ttaggaattt cgttttggac tttccaaagc 4740aattattcac tttttcattc attaaatgtg
actaaaaaat tgttatttct ccattggcca 4800ggatgcatcg tttatataaa cataacctta
gtgaaagcag tgttttcatg tgacagcggc 4860agactatatc ttaaacaaaa ttacttgtaa
agaaagatac cgttaggaaa aaaatgaaaa 4920gaaaattgaa gctatcactt gtttactttc
ctaatatctt tcaagaatac aatgtggtga 4980atttcaattt tccctacata tgtataccgt
cagcctgacg caacttatga aacttctctt 5040tctttcattt gatgtatata taaagacaca
ttatatataa agaaacttta tatatatctc 5100catcatattt tagtacttgc tactatgtaa
aattagctgt tggaagtatc tcaagaaaca 5160tttaatttat tgaaccaagc attaaccatt
catctacatt tgagttctaa aataaatctt 5220aaatgatgtg gaggaaggga aattgttaat
tatttccctc ttctcctaca tggatatacc 5280tgaaacatgc aatggatgga ttagatttta
acatttgcag cctgagaagt tcactgactt 5340tcctccagct attttatgtg tgcccgccac
catttatagc tcatgattgt agctgaactg 5400caaaaactgc atcgattgca aactgaaatt
gagaatctct tttcaacttt atatgctgat 5460tgatgcatgc tgagcatgct atactagtac
tcgaagttcc tatatgtaga ctttgttact 5520gcctaatata ctttgtgttt gttctcaagt
tcttatttta tttcatattt tttcctataa 5580aaggttaatg gctctataaa ggttgagtga
cggatccggt cacctaagtg actagggtca 5640cgtgacccta gtcacttatt cccgggcaac
tttattatac aaagttgata gatctcgaat 5700tcattccgat taatcgtgg
5719
303 6611 DNA Artificial sequence U6-9.1DD43CR2+EF1A2CAS9
303 cgcgccggta cccgggttaa gagaattgta
agtgtgcttt tatatattta aaattaatat 60attttgaaat gttaaaatat aaaagaaaat
tcaatgtaaa ttaaaaataa ataaatgttt 120aataaagata aattttaaaa cataaaagaa
aatgtctaac aagaggatta agatcctgtg 180ctcttaaatt tttaggtgtt gaaatcttag
ccatacaaaa tatattttat taaaaccaag 240catgaaaaaa gtcactaaag agctatataa
ctcatgcagc tagaaatgaa gtgaagggaa 300tccagtttgt tctcagtcga aagagtgtct
atctttgttc ttttctgcaa ccgagttaag 360caaaatggga atgcgaggta tcttcctttc
gttaggggag caccagatgc atagttagtc 420ccacattgat gaatataaca agagcttcac
agaatatata gcccaggcca cagtaaaagc 480ttgtattcta gaaaagagga atgttttaga
gctagaaata gcaagttaaa ataaggctag 540tccgttatca acttgaaaaa gtggcaccga
gtcggtgctt ttttttgcgg ccgcaattgg 600atcgggttta cttattttgt gggtatctat
acttttatta gatttttaat caggctcctg 660atttcttttt atttcgattg aattcctgaa
cttgtattat tcagtagatc gaataaatta 720taaaaagata aaatcataaa ataatatttt
atcctatcaa tcatattaaa gcaatgaata 780tgtaaaatta atcttatctt tattttaaaa
aatcatatag gtttagtatt tttttaaaaa 840taaagatagg attagtttta ctattcactg
cttattactt ttaaaaaaat cataaaggtt 900tagtattttt ttaaaataaa tataggaata
gttttactat tcactgcttt aatagaaaaa 960tagtttaaaa tttaagatag ttttaatccc
agcatttgcc acgtttgaac gtgagccgaa 1020acgatgtcgt tacattatct taacctagct
gaaacgatgt cgtcataata tcgccaaatg 1080ccaactggac tacgtcgaac ccacaaatcc
cacaaagcgc gtgaaatcaa atcgctcaaa 1140ccacaaaaaa gaacaacgcg tttgttacac
gctcaatccc acgcgagtag agcacagtaa 1200ccttcaaata agcgaatggg gcataatcag
aaatccgaaa taaacctagg ggcattatcg 1260gaaatgaaaa gtagctcact caatataaaa
atctaggaac cctagttttc gttatcactc 1320tgtgctccct cgctctattt ctcagtctct
gtgtttgcgg ctgaggattc cgaacgagtg 1380accttcttcg tttctcgcaa aggtaacagc
ctctgctctt gtctcttcga ttcgatctat 1440gcctgtctct tatttacgat gatgtttctt
cggttatgtt tttttattta tgctttatgc 1500tgttgatgtt cggttgtttg tttcgctttg
tttttgtggt tcagtttttt aggattcttt 1560tggtttttga atcgattaat cggaagagat
tttcgagtta tttggtgtgt tggaggtgaa 1620tctttttttt gaggtcatag atctgttgta
tttgtgttat aaacatgcga ctttgtatga 1680ttttttacga ggttatgatg ttctggttgt
tttattatga atctgttgag acagaaccat 1740gatttttgtt gatgttcgtt tacactatta
aaggtttgtt ttaacaggat taaaagtttt 1800ttaagcatgt tgaaggagtc ttgtagatat
gtaaccgtcg atagtttttt tgtgggtttg 1860ttcacatgtt atcaagctta atcttttact
atgtatgcga ccatatctgg atccagcaaa 1920ggcgattttt taattccttg tgaaactttt
gtaatatgaa gttgaaattt tgttattggt 1980aaactataaa tgtgtgaagt tggagtatac
ctttaccttc ttatttggct ttgtgatagt 2040ttaatttata tgtattttga gttctgactt
gtatttcttt gaattgattc tagtttaagt 2100aatccatgga caaaaagtac tcaatagggc
tcgacatagg gactaactcc gttggatggg 2160ccgtcatcac cgacgagtac aaggtgccct
ccaagaagtt caaggtgttg ggaaacaccg 2220acaggcacag cataaagaag aatttgatcg
gtgccctcct cttcgactcc ggagagaccg 2280ctgaggctac caggctcaag aggaccgcta
gaaggcgcta caccagaagg aagaacagaa 2340tctgctacct gcaggagatc ttctccaacg
agatggccaa ggtggacgac tccttcttcc 2400accgccttga ggaatcattc ctggtggagg
aggataaaaa gcacgagaga cacccaatct 2460tcgggaacat cgtcgacgag gtggcctacc
atgaaaagta ccctaccatc taccacctga 2520ggaagaagct ggtcgactct accgacaagg
ctgacttgcg cttgatttac ctggctctcg 2580ctcacatgat aaagttccgc ggacacttcc
tcattgaggg agacctgaac ccagacaact 2640ccgacgtgga caagctcttc atccagctcg
ttcagaccta caaccagctt ttcgaggaga 2700acccaatcaa cgccagtgga gttgacgcca
aggctatcct ctctgctcgt ctgtcaaagt 2760ccaggaggct tgagaacttg attgcccagc
tgcctggcga aaagaagaac ggactgttcg 2820gaaacttgat cgctctctcc ctgggattga
ctcccaactt caagtccaac ttcgacctcg 2880ccgaggacgc taagttgcag ttgtctaaag
acacctacga cgatgacctc gacaacttgc 2940tggcccagat aggcgaccaa tacgccgatc
tcttcctcgc cgctaagaac ttgtccgacg 3000caatcctgct gtccgacatc ctgagagtca
acactgagat taccaaagct cctctgtctg 3060cttccatgat taagcgctac gacgagcacc
accaagatct gaccctgctc aaggccctgg 3120tgagacagca gctgcccgag aagtacaagg
agatcttttt cgaccagtcc aagaacggct 3180acgccggata cattgacgga ggcgcctccc
aggaagagtt ctacaagttc atcaagccca 3240tccttgagaa gatggacggt accgaggagc
tgttggtgaa gttgaacaga gaggacctgt 3300tgaggaagca gagaaccttc gacaacggaa
gcatccctca ccaaatccac ctgggagagc 3360tccacgccat cttgaggagg caggaggatt
tctatccctt cctgaaggac aaccgcgaga 3420agattgagaa gatcttgacc ttcagaattc
cttactacgt cgggccactc gccagaggaa 3480actctaggtt cgcctggatg acccgcaaat
ctgaagagac cattactccc tggaacttcg 3540aggaagtcgt ggacaagggc gcttccgctc
agtctttcat cgagaggatg accaacttcg 3600ataaaaatct gcccaacgag aaggtgctgc
ccaagcactc cctgttgtac gagtatttca 3660cagtgtacaa cgagctcacc aaggtgaagt
acgtcacaga gggaatgagg aagcctgcct 3720tcttgtccgg agagcagaag aaggccatcg
tcgacctgct cttcaagacc aacaggaagg 3780tgactgtcaa gcagctgaag gaggactact
tcaagaagat cgagtgcttc gactccgtcg 3840agatctctgg tgtcgaggac aggttcaacg
cctcccttgg gacttaccac gatctgctca 3900agattattaa agacaaggac ttcctggaca
acgaggagaa cgaggacatc cttgaggaca 3960tcgtgctcac cctgaccttg ttcgaagaca
gggaaatgat cgaagagagg ctcaagacct 4020acgcccacct cttcgacgac aaggtgatga
aacagctgaa gagacgcaga tataccggct 4080ggggaaggct ctcccgcaaa ttgatcaacg
ggatcaggga caagcagtca gggaagacta 4140tactcgactt cctgaagtcc gacggattcg
ccaacaggaa cttcatgcag ctcattcacg 4200acgactcctt gaccttcaag gaggacatcc
agaaggctca ggtgtctgga cagggtgact 4260ccttgcatga gcacattgct aacttggccg
gctctcccgc tattaagaag ggcattttgc 4320agaccgtgaa ggtcgttgac gagctcgtga
aggtgatggg acgccacaag ccagagaaca 4380tcgttattga gatggctcgc gagaaccaaa
ctacccagaa agggcagaag aattcccgcg 4440agaggatgaa gcgcattgag gagggcataa
aagagcttgg ctctcagatc ctcaaggagc 4500accccgtcga gaacactcag ctgcagaacg
agaagctgta cctgtactac ctccaaaacg 4560gaagggacat gtacgtggac caggagctgg
acatcaacag gttgtccgac tacgacgtcg 4620accacatcgt gcctcagtcc ttcctgaagg
atgactccat cgacaataaa gtgctgacac 4680gctccgataa aaatagaggc aagtccgaca
acgtcccctc cgaggaggtc gtgaagaaga 4740tgaaaaacta ctggagacag ctcttgaacg
ccaagctcat cacccagcgt aagttcgaca 4800acctgactaa ggctgagaga ggaggattgt
ccgagctcga taaggccgga ttcatcaaga 4860gacagctcgt cgaaacccgc caaattacca
agcacgtggc ccaaattctg gattcccgca 4920tgaacaccaa gtacgatgaa aatgacaagc
tgatccgcga ggtcaaggtg atcaccttga 4980agtccaagct ggtctccgac ttccgcaagg
acttccagtt ctacaaggtg agggagatca 5040acaactacca ccacgcacac gacgcctacc
tcaacgctgt cgttggaacc gccctcatca 5100aaaaatatcc taagctggag tctgagttcg
tctacggcga ctacaaggtg tacgacgtga 5160ggaagatgat cgctaagtct gagcaggaga
tcggcaaggc caccgccaag tacttcttct 5220actccaacat catgaacttc ttcaagaccg
agatcactct cgccaacggt gagatcagga 5280agcgcccact gatcgagacc aacggtgaga
ctggagagat cgtgtgggac aaagggaggg 5340atttcgctac tgtgaggaag gtgctctcca
tgcctcaggt gaacatcgtc aagaagaccg 5400aagttcagac cggaggattc tccaaggagt
ccatcctccc caagagaaac tccgacaagc 5460tgatcgctag aaagaaagac tgggacccta
agaagtacgg aggcttcgat tctcctaccg 5520tggcctactc tgtgctggtc gtggccaagg
tggagaaggg caagtccaag aagctgaaat 5580ccgtcaagga gctcctcggg attaccatca
tggagaggag ttccttcgag aagaacccta 5640tcgacttcct ggaggccaag ggatataaag
aggtgaagaa ggacctcatc atcaagctgc 5700ccaagtactc cctcttcgag ttggagaacg
gaaggaagag gatgctggct tctgccggag 5760agttgcagaa gggaaatgag ctcgcccttc
cctccaagta cgtgaacttc ctgtacctcg 5820cctctcacta tgaaaagttg aagggctctc
ctgaggacaa cgagcagaag cagctcttcg 5880tggagcagca caagcactac ctggacgaaa
ttatcgagca gatctctgag ttctccaagc 5940gcgtgatatt ggccgacgcc aacctcgaca
aggtgctgtc cgcctacaac aagcacaggg 6000ataagcccat tcgcgagcag gctgaaaaca
ttatccacct gtttaccctc acaaacttgg 6060gagcccctgc tgccttcaag tacttcgaca
ccaccattga caggaagaga tacacctcca 6120ccaaggaggt gctcgacgca acactcatcc
accaatccat caccggcctc tatgaaacaa 6180ggattgactt gtcccagctg ggaggcgact
ctagagccga tcccaagaag aagagaaagg 6240tgtaggttaa cctagacttg tccatcttct
ggattggcca acttaattaa tgtatgaaat 6300aaaaggatgc acacatagtg acatgctaat
cactataatg tgggcatcaa agttgtgtgt 6360tatgtgtaat tactagttat ctgaataaaa
gagaaagaga tcatccatat ttcttatcct 6420aaatgaatgt cacgtgtctt tataattctt
tgatgaacca gatgcatttc attaaccaaa 6480tccatataca tataaatatt aatcatatat
aattaatatc aattgggtta gcaaaacaaa 6540tctagtctag gtgtgttttg cgaattcgat
atcaagctta tcgataccgt cgaggggggg 6600cccggtaccg g
6611
304 64 DNA Artificial sequence DD20 qPCR amplicon
304 attcggaact gacacacgac atgatggaac gtgactaagg tgggtttttg actttgcatg 60
tcga 64
305 115 DNA Artificial sequence DD43 qPCR amplicon
305 aaagaagagg cagactccaa ttcctctttt ctagaatacc ctccgtacgt acaagtacaa 60
gggacttgtg agttgtaagg ctgtatttac aatagtgaaa agagaatcat ctggg 115
306 20 DNA Artificial sequence primer, DD20-CR1
306 ggaactgaca cacgacatga 20
307 20 DNA Artificial sequence primer, DD20-CR2
307 gacatgatgg aacgtgacta 20
308 22 DNA Artificial sequence primer, DD20-F
308 attcggaact gacacacgac at 22
309 17 DNA Artificial sequence FAM-MGB probe, DD20-T
309 atggaacgtg actaagg 17
310 22 DNA Artificial sequence primer, DD20-R
310 tcgacatgca aagtcaaaaa cc 22
311 20 DNA Artificial sequence primer, DD43CR1
311 gtcccttgta cttgtacgta 20
312 20 DNA Artificial sequence primer, DD43CR2
312 gtattctaga aaagaggaat 20
313 26 DNA Artificial sequence primer, DD43-F
313 ttctagaata ccctccgtac gtacaa 26
314 26 DNA Artificial sequence primer, DD43-F2
314 aaagaagagg cagactccaa ttcctc 26
315 19 DNA Artificial sequence FAM-MGB probe, DD43-T
315 caagggactt gtgagttgt 19
316 26 DNA Artificial sequence primer, DD43-R
316 cccagatgat tctcttttca ctattg 26
317 19 DNA Artificial sequence primer, Cas9-F
317 ccttcttcca ccgccttga 19
318 19 DNA Artificial sequence FAM-MGB probe, Cas9-T
318 aatcattcct ggtggagga 19
319 21 DNA Artificial sequence primer, Cas9-R
319 tgggtgtctc tcgtgctttt t 21
320 22 DNA Artificial sequence primer, Sams-76F
320 aggcttgttg tgcagttttt ga 22
321 22 DNA Artificial sequence FAM-MGB probe, FRT1I-63T
321 tggactagtg gaagttccta ta 22
322 21 DNA Artificial sequence primer, FRT1I-41F
322 gcggtgagtt caggcttttt c 21
323 31 DNA Artificial sequence primer, DD20-LB
323 ggttatacct tcttcttagt gtggtctatc c 31
324 31 DNA Artificial sequence primer, Sams-A1
324 cccaaaataa ttagtatgat tggtaaggaa g 31
325 23 DNA Artificial sequence primer, QC498A-S1
325 ggaacttcac tagagcttgc ggc 23
326 28 DNA Artificial sequence primer, DD20-RB
326 gccattacat tcttcataag ttcctctc 28
327 26 DNA Artificial sequence primer, DD43-LB
327 gtgtagtcca ttgtagccaa gtcacc 26
328 24 DNA Artificial sequence primer, DD43-RB
328 caaaccggag agagaggaag aacc 24
329 2105 DNA Artificial Sequence DD20 HR1-HR2 PCR amplicon
329 ggttatacct
tcttcttagt gtggtctatc ccctagtaat aattacacat ctaagatatc 60cccttctttt
tcaagtaaaa taatatcata tgatctcatt ttagtgaaac aatactattt 120ccctgataac
tctcttcaac attagggact tcatctaatc atctactttc aaggtataac 180tagacgtatt
tgttctttta aaaaaaacac tagatgtact cgtcaactca aaattcatcg 240ttcatgcatt
ttaattaaac tttaattagc taatgagtag aaaaagatca tacgagtaaa 300atagaagaat
cttcctagat tttggaagaa tggattggag tgtaagtgaa ttgatccatt 360agtggaagat
gctctttaca atggccaaac tgttctaatt gttagagcac atttgagatg 420aaacacttca
gtagtggagg taacctacaa tcctaggatc tgtatcctct atcactaatg 480gagcaatggg
tttgagattg acttactcct ttccttgtct ctcgtagtgc atatgcgcac 540tttcaaaggc
tacacaaaag ccgttaactt tttgtttatt taagttacga aagatagttg 600aattagagta
aatggtgata ttgaattagg attttaaata attttaaaag aattttttta 660ataaaaaaaa
tattgtgttg ttggatcaaa atttttaaat aacatgaata aggaaatgga 720ttgcaatgag
gttttaaaca attattttaa catataggat tttagaaaga cttttataat 780attttgttga
agtttagatt ttaatatatt tatgttttaa aattttaaaa aaaacttcat 840gaatttataa
tatttgaaaa agacacgtga atatttagaa aacatttaaa attacaataa 900taaatcataa
tgagataggg tgtattcatg tgtagacgag acaccaagta tatggttcac 960aagtgaatca
tctttttttt ttacagcaca agtagatcac ttgtacttat caaaattcgg 1020aactgacaca
cgacatgatg gaacgtgact aaggtgggtt tttgactttg catgtcgaag 1080tgagagtgat
tttattgaga gaataataga agacctacaa aacaaatgat cccgacgcta 1140aagtaagtac
gagagttaag agaataaatg ggaaaatatg catacatgat taggtgtgtg 1200ttcgtctcaa
gaaagtacga atgaatatgg tgtgtttgta gtacatgaat gatgtgtttt 1260gagggttcaa
gggaaattga tatttataga gtgaaatgga accagaggtc tttgttgaca 1320agggttgtta
tgactcttgc aaataattaa tagcttataa ataatagcca ataacttatt 1380atagatagag
ttagagataa tatatagcta aatttgaaca aggcatacaa aacaaaaatg 1440ctaaatatga
ataagacaat caaaattgta gtcgatgttc aactctttgt cgttgaagaa 1500cttgtttgca
gtggtatagt aaatgggtgt gagtgcagtg tctcacccat ctcacaccac 1560acaaccaact
tcatatctaa agatattgtc gctgaataca aaattgagtt atggaatata 1620caattcataa
tatagatacg aaaaatcatt tcttacaaaa cattcaatca aaaattattc 1680aaacataatt
ctagattaag taatccgaag tacaagttag tatcctagat ccgttaattt 1740aaaattatgt
ttgcataatt ttggatttgg tgttctataa gggcacaatt ttgttcattc 1800ttacaagttt
gtcaattcta aaatatatgc aaatttgaag aaaaaaaatt tacgaatgtg 1860tctcaaacaa
taacttaatg ggaggagaat gagggatgaa gaagctcaaa attaccaacg 1920ccttctacct
caagaagcta cttcacacaa aatatgactg gcggaaggat aggggacaac 1980cgataacgag
aaggagatac ataaggtaat gtacgttgtt gtgtgaggta cacaattatg 2040gggatgaaga
agttcaactt tagtcgaaaa aatgtttgag aggaacttat gaagaatgta 2100atggc
2105
330 1204 DNA Artificial Sequence DD20 HR1-SAMS PCR amplicon
330 ggttatacct
tcttcttagt gtggtctatc ccctagtaat aattacacat ctaagatatc 60cccttctttt
tcaagtaaaa taatatcata tgatctcatt ttagtgaaac aatactattt 120ccctgataac
tctcttcaac attagggact tcatctaatc atctactttc aaggtataac 180tagacgtatt
tgttctttta aaaaaaacac tagatgtact cgtcaactca aaattcatcg 240ttcatgcatt
ttaattaaac tttaattagc taatgagtag aaaaagatca tacgagtaaa 300atagaagaat
cttcctagat tttggaagaa tggattggag tgtaagtgaa ttgatccatt 360agtggaagat
gctctttaca atggccaaac tgttctaatt gttagagcac atttgagatg 420aaacacttca
gtagtggagg taacctacaa tcctaggatc tgtatcctct atcactaatg 480gagcaatggg
tttgagattg acttactcct ttccttgtct ctcgtagtgc atatgcgcac 540tttcaaaggc
tacacaaaag ccgttaactt tttgtttatt taagttacga aagatagttg 600aattagagta
aatggtgata ttgaattagg attttaaata attttaaaag aattttttta 660ataaaaaaaa
tattgtgttg ttggatcaaa atttttaaat aacatgaata aggaaatgga 720ttgcaatgag
gttttaaaca attattttaa catataggat tttagaaaga cttttataat 780attttgttga
agtttagatt ttaatatatt tatgttttaa aattttaaaa aaaacttcat 840gaatttataa
tatttgaaaa agacacgtga atatttagaa aacatttaaa attacaataa 900taaatcataa
tgagataggg tgtattcatg tgtagacgag acaccaagta tatggttcac 960aagtgaatca
tctttttttt ttacagcaca agtagatcac ttgtacttat caaaattcgg 1020aactgacaca
cactagtggt cacctaagtg actagggtca cgtgacccta gtcacttatt 1080cccaaacact
agtaacggcc gccagtgtgc tggaattcgc ccttcccaag ctttgctcta 1140gatcaaactc
acatccaaac ataacatgga tatcttcctt accaatcata ctaattattt 1200tggg
1204
331 1459 DNA Artificial Sequence DD20 NOS-HR2 PCR amplicon
331 ggaacttcac
tagagcttgc ggccgcgcat gctgacttaa tcagctaacg ccactcgacc 60tgcaggcatg
cccgcggata tcgatgggcc ccggccgaag cttcaagttt gtacaaaaaa 120gcaggctggc
gccggaacca attcagtcga ctggatccgg taccgaattc gcggccgcac 180tcgagatatc
tagacccagc tttcttgtac aaagtggccg ttaacggatc ggccagaatc 240cggtaagtga
ctagggtcac gtgaccctag tcacttaaat tcggccagaa tggccatctg 300gattcagcag
gcctagaagg cccggaccga ttaaacttta attcggtccg ggttacctct 360agaaagcttg
tcgacctgca gacacgacat gatggaacgt gactaaggtg ggtttttgac 420tttgcatgtc
gaagtgagag tgattttatt gagagaataa tagaagacct acaaaacaaa 480tgatcccgac
gctaaagtaa gtacgagagt taagagaata aatgggaaaa tatgcataca 540tgattaggtg
tgtgttcgtc tcaagaaagt acgaatgaat atggtgtgtt tgtagtacat 600gaatgatgtg
ttttgagggt tcaagggaaa ttgatattta tagagtgaaa tggaaccaga 660ggtctttgtt
gacaagggtt gttatgactc ttgcaaataa ttaatagctt ataaataata 720gccaataact
tattatagat agagttagag ataatatata gctaaatttg aacaaggcat 780acaaaacaaa
aatgctaaat atgaataaga caatcaaaat tgtagtcgat gttcaactct 840ttgtcgttga
agaacttgtt tgcagtggta tagtaaatgg gtgtgagtgc agtgtctcac 900ccatctcaca
ccacacaacc aacttcatat ctaaagatat tgtcgctgaa tacaaaattg 960agttatggaa
tatacaattc ataatataga tacgaaaaat catttcttac aaaacattca 1020atcaaaaatt
attcaaacat aattctagat taagtaatcc gaagtacaag ttagtatcct 1080agatccgtta
atttaaaatt atgtttgcat aattttggat ttggtgttct ataagggcac 1140aattttgttc
attcttacaa gtttgtcaat tctaaaatat atgcaaattt gaagaaaaaa 1200aatttacgaa
tgtgtctcaa acaataactt aatgggagga gaatgaggga tgaagaagct 1260caaaattacc
aacgccttct acctcaagaa gctacttcac acaaaatatg actggcggaa 1320ggatagggga
caaccgataa cgagaaggag atacataagg taatgtacgt tgttgtgtga 1380ggtacacaat
tatggggatg aagaagttca actttagtcg aaaaaatgtt tgagaggaac 1440ttatgaagaa
tgtaatggc
1459
332 2098 DNA Artificial Sequence DD43 HR1-HR2 PCR amplicon
332 gtgtagtcca
ttgtagccaa gtcaccaata tcttgttccc ctccttggtt tggcataaat 60tgattttcat
ggctcttctc ggtcgaaact ggagctaatt cacccttagt ctctcttaaa 120attctggctg
taagaaacac cacagaacac ataaattata aactaattat aatttgaaga 180gtaaaatatg
tttttactct tatgatttaa ttagtgtagt tttaattttc tccttttttt 240aaaaaatttt
ggtattcata aatttcaatt ttttaaaaat aattgttgtt acccgttaat 300gataacggga
tatgttatgt taccactaaa tcggacaaaa aaaattcaaa acttttataa 360ggattaaaat
taacaaaaat attttaaaaa aatctaacct caataaagtt aaatttataa 420gcacaaaata
atacttttaa gcctaatttg gcaagacaca agcaagctca cctgtagcat 480taatagaaag
gaagcaaagc aagagaaaag caaccagaag gaagcgtttg cttggtgaca 540cagccatctt
acttgaattt atggtattac tgagaaacct tgatcttgct tcaaaatctt 600ctagttaccc
tctttttata ggcagaaaga gaactagcta gttgccaata ggatatgagg 660acatgtggtg
caatgcactc actcttcaag gacaagaaaa acaatggcta caattgtggt 720tcaaatcaat
gtctcctgct ctgtcctgcc tgaaaatgac acccttttgc ttggaaaaga 780ggatcaaagc
taagaacagg agtggcttca ttcccttcat gtaaccaaac actttcgcat 840tctgtcattc
gtgaatcagc aaaatctgca accaaaaata tatggtgcct aaataaaaga 900aataaaataa
tttagagttg cggactaaaa taataaacaa aagaaatata ttataatcta 960gaattaattt
aggactaaaa gaagaggcag actccaattc ctcttttcta gaataccctc 1020cgtacgtaca
agtacaaggg acttgtgagt tgtaaggctg tatttacaat agtgaaaaga 1080gaatcatctg
ggtgattggg tttttagtcc ccagtgacga attaaaggtt tgaattctta 1140gtatgtttgg
gaatcaatta ggaatttcgt tttggacttt ccaaagcaat tattcacttt 1200ttcattcatt
aaatgtgact aaaaaattgt tatttctcca ttggccagga tgcatcgttt 1260atataaacat
aaccttagtg aaagcagtgt tttcatgtga cagcggcaga ctatatctta 1320aacaaaatta
cttgtaaaga aagataccgt taggaaaaaa atgaaaagaa aattgaagct 1380atcacttgtt
tactttccta atatctttca agaatacaat gtggtgaatt tcaattttcc 1440ctacatatgt
ataccgtcag cctgacgcaa cttatgaaac ttctctttct ttcatttgat 1500gtatatataa
agacacatta tatataaaga aactttatat atatctccat catattttag 1560tacttgctac
tatgtaaaat tagctgttgg aagtatctca agaaacattt aatttattga 1620accaagcatt
aaccattcat ctacatttga gttctaaaat aaatcttaaa tgatgtggag 1680gaagggaaat
tgttaattat ttccctcttc tcctacatgg atatacctga aacatgcaat 1740ggatggatta
gattttaaca tttgcagcct gagaagttca ctgactttcc tccagctatt 1800ttatgtgtgc
ccgccaccat ttatagctca tgattgtagc tgaactgcaa aaactgcatc 1860gattgcaaac
tgaaattgag aatctctttt caactttata tgctgattga tgcatgctga 1920gcatgctata
ctagtactcg aagttcctat atgtagactt tgttactgcc taatatactt 1980tgtgtttgtt
ctcaagttct tattttattt catatttttt cctataaaag gttaatggct 2040ctataaaggt
tgagtgacat atatatacta taaaggttct tcctctctct ccggtttg
20983331202DNAArtificial SequenceDD43 HR1-SAMS PCR amplicon
333gtgtagtcca ttgtagccaa gtcaccaata tcttgttccc ctccttggtt tggcataaat
60tgattttcat ggctcttctc ggtcgaaact ggagctaatt cacccttagt ctctcttaaa
120attctggctg taagaaacac cacagaacac ataaattata aactaattat aatttgaaga
180gtaaaatatg tttttactct tatgatttaa ttagtgtagt tttaattttc tccttttttt
240aaaaaatttt ggtattcata aatttcaatt ttttaaaaat aattgttgtt acccgttaat
300gataacggga tatgttatgt taccactaaa tcggacaaaa aaaattcaaa acttttataa
360ggattaaaat taacaaaaat attttaaaaa aatctaacct caataaagtt aaatttataa
420gcacaaaata atacttttaa gcctaatttg gcaagacaca agcaagctca cctgtagcat
480taatagaaag gaagcaaagc aagagaaaag caaccagaag gaagcgtttg cttggtgaca
540cagccatctt acttgaattt atggtattac tgagaaacct tgatcttgct tcaaaatctt
600ctagttaccc tctttttata ggcagaaaga gaactagcta gttgccaata ggatatgagg
660acatgtggtg caatgcactc actcttcaag gacaagaaaa acaatggcta caattgtggt
720tcaaatcaat gtctcctgct ctgtcctgcc tgaaaatgac acccttttgc ttggaaaaga
780ggatcaaagc taagaacagg agtggcttca ttcccttcat gtaaccaaac actttcgcat
840tctgtcattc gtgaatcagc aaaatctgca accaaaaata tatggtgcct aaataaaaga
900aataaaataa tttagagttg cggactaaaa taataaacaa aagaaatata ttataatcta
960gaattaattt aggactaaaa gaagaggcag actccaattc ctcttttcta gaataccctc
1020cgtacgtaca ctagtggtca cctaagtgac tagggtcacg tgaccctagt cacttattcc
1080caaacactag taacggccgc cagtgtgctg gaattcgccc ttcccaagct ttgctctaga
1140tcaaactcac atccaaacat aacatggata tcttccttac caatcatact aattattttg
1200gg
12023341454DNAArtificial SequenceDD43 NOS-HR2 PCR amplicon
334ggaacttcac tagagcttgc ggccgcgcat gctgacttaa tcagctaacg ccactcgacc
60tgcaggcatg cccgcggata tcgatgggcc ccggccgaag cttcaagttt gtacaaaaaa
120gcaggctggc gccggaacca attcagtcga ctggatccgg taccgaattc gcggccgcac
180tcgagatatc tagacccagc tttcttgtac aaagtggccg ttaacggatc ggccagaatc
240cggtaagtga ctagggtcac gtgaccctag tcacttaaat tcggccagaa tggccatctg
300gattcagcag gcctagaagg cccggaccga ttaaacttta attcggtccg ggttacctct
360agaaagcttg tcgacctgca ggtacaagta caagggactt gtgagttgta aggctgtatt
420tacaatagtg aaaagagaat catctgggtg attgggtttt tagtccccag tgacgaatta
480aaggtttgaa ttcttagtat gtttgggaat caattaggaa tttcgttttg gactttccaa
540agcaattatt cactttttca ttcattaaat gtgactaaaa aattgttatt tctccattgg
600ccaggatgca tcgtttatat aaacataacc ttagtgaaag cagtgttttc atgtgacagc
660ggcagactat atcttaaaca aaattacttg taaagaaaga taccgttagg aaaaaaatga
720aaagaaaatt gaagctatca cttgtttact ttcctaatat ctttcaagaa tacaatgtgg
780tgaatttcaa ttttccctac atatgtatac cgtcagcctg acgcaactta tgaaacttct
840ctttctttca tttgatgtat atataaagac acattatata taaagaaact ttatatatat
900ctccatcata ttttagtact tgctactatg taaaattagc tgttggaagt atctcaagaa
960acatttaatt tattgaacca agcattaacc attcatctac atttgagttc taaaataaat
1020cttaaatgat gtggaggaag ggaaattgtt aattatttcc ctcttctcct acatggatat
1080acctgaaaca tgcaatggat ggattagatt ttaacatttg cagcctgaga agttcactga
1140ctttcctcca gctattttat gtgtgcccgc caccatttat agctcatgat tgtagctgaa
1200ctgcaaaaac tgcatcgatt gcaaactgaa attgagaatc tcttttcaac tttatatgct
1260gattgatgca tgctgagcat gctatactag tactcgaagt tcctatatgt agactttgtt
1320actgcctaat atactttgtg tttgttctca agttcttatt ttatttcata ttttttccta
1380taaaaggtta atggctctat aaaggttgag tgacatatat atactataaa ggttcttcct
1440ctctctccgg tttg
145433560DNAGlycine maxsoybean genomic DD20CR1 target region(1)..(60)
335acttgtactt atcaaaattc ggaactgaca cacgacatga tggaacgtga ctaaggtggg
6033659DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
336acttgtactt atcaaaattc ggaactgaca cacgactgat ggaacgtgac taaggtggg
5933759DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
337acttgtactt atcaaaattc ggaactgaca cacgaatgat ggaacgtgac taaggtggg
5933858DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
338acttgtactt atcaaaattc ggaactgaca cacgatgatg gaacgtgact aaggtggg
5833958DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
339acttgtactt atcaaaattc ggaactgaca cacgacgatg gaacgtgact aaggtggg
5834058DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
340acttgtactt atcaaaattc ggaactgaca cacggtgatg gaacgtgact aaggtggg
5834157DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
341acttgtactt atcaaaattc ggaactgaca cacatgatgg aacgtgacta aggtggg
5734257DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
342acttgtactt atcaaaattc ggaactgaca cacgtgatgg aacgtgacta aggtggg
5734356DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
343acttgtactt atcaaaattc ggaactgaca cactgatgga acgtgactaa ggtggg
5634456DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
344acttgtactt atcaaaattc ggaactgaca cacggatgga acgtgactaa ggtggg
5634555DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
345acttgtactt atcaaaattc ggaactgaca cacgatggaa cgtgactaag gtggg
5534655DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
346acttgtactt atcaaaattc ggaactgaca catgatggaa cgtgactaag gtggg
5534754DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
347acttgtactt atcaaaattc ggaactgaca cacatggaac gtgactaagg tggg
5434854DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
348acttgtactt atcaaaattc ggaactgaca ctgatggaac gtgactaagg tggg
5434953DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
349acttgtactt atcaaaattc ggaactgaca cgatggaacg tgactaaggt ggg
5335051DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
350acttgtactt atcaaaattc ggaactgatg atggaacgtg actaaggtgg g
5135150DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
351acttgtactt atcaaaattc ggaactgaca tggaacgtga ctaaggtggg
5035250DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
352acttgtactt atcaaaattc ggaactgtga tggaacgtga ctaaggtggg
5035351DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
353acttgtactt atcaaaattc ggaactgaca cacgaacgtg actaaggtgg g
5135450DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
354acttgtactt atcaaaattc ggaactgaca cggaacgtga ctaaggtggg
5035549DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
355acttgtacct atcaaaattc ggaactgaat ggaacgtgac taaggtggg
4935648DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
356acttgtactt atcaaaattc ggaactgatg gaacgtgact aaggtggg
4835746DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
357acttgtactt atcaaaattc ggaactgaga acgtgactaa ggtggg
4635838DNAArtificial sequencesequence of gRNA/Cas9 system mediated target
site modification 358acttgtactt atcaaaattc ggaactgaca cacgacat
3835938DNAArtificial sequencesequence of gRNA/Cas9
system mediated NHEJ 359acttgtactt atcaaaattc ggaactgaca aaggtggg
3836039DNAArtificial sequencesequence of gRNA/Cas9
system mediated NHEJ 360acttgtactt atcaaaattc ggaacgtgac taaggtggg
3936124DNAArtificial sequencesequence of gRNA/Cas9
system mediated NHEJ 361actatggaac gtgactaagg tggg
24362211DNAArtificial sequencesequence of gRNA/Cas9
system mediated target site modification 362acttgtactt atcaaaattc
ggaactgaca cacggccggt gatggattgg tggatgagtg 60ttgcgtcgag cacctccttg
gtggaggtgt atctcttcct gtcaatggtg gtgtcgaagt 120acttgaaggc agcaggggct
cccaagtttg tgagggtaaa caggtggata atgttttcag 180cctgctcgcg atggaacgtg
actaaggtgg g 21136390DNAArtificial
sequencesequence of gRNA/Cas9 system mediated target site
modification 363acttgtactt atcaaaaact acttgtgctg taaaaaaaaa gaggaacaat
cttcactcat 60caataagtga tggaacgcga ctaaggtggg
9036457DNAGlycine maxsoybean genomic DD20CR2 target
region(1)..(57) 364gacacacgac atgatggaac gtgactaagg tgggtttttg actttgcatg
tcgaagt 5736561DNAArtificial sequencesequence of gRNA/Cas9 system
mediated target site modification 365actgacacac gacatgatgg
aacgtgaact aaggtgggtt tttgactttg catgtcgaag 60t
6136659DNAArtificial
sequencesequence of gRNA/Cas9 system mediated NHEJ 366actgacacac
gacatgatgg aacgtactaa ggtgggtttt tgactttgca tgtcgaagt
5936758DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
367actgacacac gacatgatgg aacgtctaag gtgggttttt gactttgcat gtcgaagt
5836858DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
368actgacacac gacatgatgg aacgtgaaag gtgggttttt gactttgcat gtcgaagt
5836957DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
369actgacacac gacatgatgg aacgctaagg tgggtttttg actttgcatg tcgaagt
5737057DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
370actgacacac gacatgatgg aacgtgaagg tgggtttttg actttgcatg tcgaagt
5737156DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
371actgacacac gacatgatgg aacgtgaggt gggtttttga ctttgcatgt cgaagt
5637256DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
372actgacacac gacatgatgg aacgtaaggt gggtttttga ctttgcatgt cgaagt
5637356DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
373actgacacac gacatgatgg aacctaaggt gggtttttga ctttgcatgt cgaagt
5637456DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
374actgacacac gacatgatgg aacgtgaggt gggtttttga ctttgcatgt cgaagt
5637555DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
375actgacacac gacatgatgg aactaaggtg ggtttttgac tttgcatgtc gaagt
5537654DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
376actgacacac gacatgatgg aataaggtgg gtttttgact ttgcatgtcg aagt
5437753DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
377actgacacac gacatgatgg ctaaggtggg tttttgactt tgcatgtcga agt
5337853DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
378actgacacac gacatgatgg ataaggtggg tttttgactt tgcatgtcga agt
5337951DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
379actgacacac gacatgatgg aaggtgggtt tttgactttg catgtcgaag t
5138050DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
380actgacacac gacatgatgg aggtgggttt ttgactttgc atgtcgaagt
5038144DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
381actgacacac gacatgatgg gtttttgact ttgcatgtcg aagt
4438243DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
382actgacacac gacaggtggg tttttgactt tgcatgtcga agt
4338340DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
383actgacacta aggtgggttt ttgactttgc atgtcgaagt
4038425DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
384actgacacac gacatgatgg aacgt
2538520DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
385actgacacac gacatgatgg
2038660DNAGlycine maxsoybean genomic DD43CR1 target region(1)..(60)
386agccttacaa ctcacaagtc ccttgtactt gtacgtacgg agggtattct agaaaagagg
6038759DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
387agccttacaa ctcacaagtc ccttgtactt gtactacgga gggtattcta gaaaagagg
5938859DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
388agccttacaa ctcacaagtc ccttgtactt gtagtacgga gggtattcta gaaaagagg
5938958DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
389agccttacaa ctcacaagtc ccttgtactt gtgtacggag ggtattctag aaaagagg
5839058DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
390agccttacaa ctcacaagtc ccttgtactt gcgtacggag ggtattctag aaaagagg
5839157DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
391agccttacaa ctcacaagtc ccttgtactt ggtacggagg gtattctaga aaagagg
5739257DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
392agccttacaa ctcacaagtc ccttgtactt gttacggagg gtattctaga aaagagg
5739356DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
393agccttacaa ctcacaagtc ccttgtactt gtacggaggg tattctagaa aagagg
5639455DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
394agccttacaa ctcacaagtc ccttgtactt tacggagggt attctagaaa agagg
5539555DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
395agccttacaa ctcacaagtc ccttgtactg tacggagggt attctagaaa agagg
5539654DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
396agccttacaa ctcacaagcc ccttgtactt acggagggta ttctagaaaa gagg
5439752DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
397agccttacaa ctcacaagtc ccttgtatac ggagggtatt ctagaaaaga gg
5239852DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
398agccttacaa ctcacaagtc ccttgtgtac ggagggtatt ctagaaaaga gg
5239950DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
399agccttacaa ctcacaagtc ccttgtacgg agggtattct agaaaagagg
5040049DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
400agccttacaa ctcacaagtc cctttacgga gggtattcta gaaaagagg
4940148DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
401agccttacaa ctcacaagtc ccttacggag ggtattctag aaaagagg
4840247DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
402agccttacaa ctcacaagtc cctacggagg gtattctaga aaagagg
4740343DNAArtificial sequencesequence of gRNA/Cas9 system mediated NHEJ
403agccttacaa ctcacaagtc ccttgtactt gtaagaaaag agg
4340449DNAArtificial sequencesequence of gRNA/Cas9 system mediated target
site modification 404agccttacaa ctcacaagtc ctaaattaaa ggttattcta
gaaaagagg 49405227DNAArtificial sequencesequence of
gRNA/Cas9 system mediated target site modification 405agccttacaa
ctcacaagtc ccttgtactt gtagaatcca gttcataaaa caagtgacac 60acaacagata
tgaactggac tacgtcgaac ccacaaatcc cacaaagcgc gtgaaatcaa 120atcgctcaaa
ccacaaaaaa gaacaacgcg tttgttacac gctaatacca aaattatacc 180caaatcttaa
gctatttatg cgtacggagg gtattctaga aaagagg
22740697DNAArtificial sequencesequence of gRNA/Cas9 system mediated
target site modification 406agccttacaa ctcacaagtc ccttgtactt
gtaatgctcc cctctaaact cgtatcgctt 60cagagttgag agtacggagg gtattctaga
aaagagg 97407183DNAArtificial
sequencesequence of gRNA/Cas9 system mediated target site
modification 407agccttacaa ctcacaagtc ccttgtatat agatacccac aaaataagta
aacccgatcc 60aaaatcttaa atgatgtgga ggaagggaaa ttgttaatta ttcccctctt
ctcctacatg 120gatatacctg aaacatgcaa tggatggatt agattttgta cggagggtat
tctagaaaag 180agg
183408234DNAArtificial sequencesequence of gRNA/Cas9 system
mediated target site modification 408agccttacaa ctcacaagtc
ccttgtactt gtaccagggg atgtttttta tttacattca 60cgtcttttgg aaagagccgc
taaattaagt tctcagttag gcgaaggaag tatgactgct 120ttaccaatag ttgaaactca
atcgggagat gtttcagctt atattcctac taatgtaatt 180tccattacag atggccaaat
attcttacgt acggagggta ttctagaaaa gagg 234409280DNAArtificial
sequencesequence of gRNA/Cas9 system mediated target site
modification 409agccttacaa ctcacaagtc ccttgtactt gtaccgaaaa tttcagccat
aaaaaaagtt 60ataatagaat ttaaagcaaa agtttcattt tttaaacata tatactgaca
cgctccgata 120aaaatagagg caagtccgac aacgtcccct ccgaggaggt cgtgaagaag
atgaaaaact 180actggagaca gctcttgaac gccaagctca tcacccagcg taagctcgac
aacctgacta 240aggctgagag aggtgtacgg agggtattct agaaaagagg
280410250DNAArtificial sequencesequence of gRNA/Cas9 system
mediated target site modification 410agccttacaa ctcacaagtc
ccttgtactt gtactggatt tggtgaggga tgcttccgtt 60gtcgaaggtt ctctgcttcc
tcaacaggtc ctctctgttc aacttcacca acagctcctc 120ggtaccgtcc atcttctcaa
ggatgaagat cgagtgcttc gactccgtcg agatctctgg 180tgtcgaggac aggttcaacg
cctcccttgg gacttgccac gatcgtacgg agggtattct 240agaaaagagg
250411161DNAArtificial
sequencesequence of gRNA/Cas9 system mediated target site
modification 411agccttacaa ctcacaagtc ccttatgacc tcaaaaaaaa gattcacctc
caacacacca 60aataactcga aaatctcttt cctattctct agaaagtata ggaacttcca
ctagtccatg 120aaaaagcctg aactcgtacg gagggtattc tagaaaagag g
161412185DNAArtificial sequencesequence of gRNA/Cas9 system
mediated target site modification 412agccttacaa ctcacaagtc
ccttgtactt gtacacctgg ggcatggaga gcaccttcct 60cacagtagcg aaatccctcc
ctttgtccca cacgatctct ccagtctcac cgttggtctc 120gatcagtggg cgcttcctga
tctcaccgtt ggcgagagtg tacggagggt attctagaaa 180agagg
185413212DNAArtificial
sequencesequence of gRNA/Cas9 system mediated target site
modification 413agccttacaa ctcacaagtc ccttgtactt gtgctaggtt agccgaaaga
tggttatcgg 60ttcaaggacg caaggtgccc ctgctttttc agggtaataa ggggtagaga
aaatgcctcg 120agccaaagtt cgagtaccag gcgctacagc gctgaagtaa tccatgccat
actcccagga 180aaagccgtac ggagggtatt ctagaaaaga gg
212414231DNAArtificial sequencesequence of gRNA/Cas9 system
mediated target site modification 414agccttacaa ctcacaagtc
ccttgtactt gtactcaagt tcttatttta tttcatattt 60tttcctataa aaggttaatg
gctctataaa ggttgagtga cggatccggt cacctaagtg 120actagggtca cgtgacccta
gtcacttatt cccgggcaac tttattatac aaagttgata 180gatctcgaat tcattccgat
taatcgtggc gagggtattc tagaaaagag g 23141598DNAArtificial
sequenceLIGCas-1 mutation 1 415tcctctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtgcccc ggtcggagga 60tatatatacc tcacacgtac gcgtacgcgt
atatatac 9841698DNAArtificial
sequenceLIGCas-1 mutation 2 416tcctctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtgcccc ggacggagga 60tatatatacc tcacacgtac gcgtacgcgt
atatatac 9841798DNAArtificial
sequenceLIGCas-1 mutation 3 417tcctctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtgcccc gggcggagga 60tatatatacc tcacacgtac gcgtacgcgt
atatatac 9841898DNAArtificial
sequenceLIGCas-1 mutation 4 418tcctctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtgcccc ggccggagga 60tatatatacc tcacacgtac gcgtacgcgt
atatatac 9841999DNAArtificial
sequenceLIGCas-1 mutation 5 419tcctctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtgcccc ggatcggagg 60atatatatac ctcacacgta cgcgtacgcg
tatatatac 9942094DNAArtificial
sequenceLIGCas-1 mutation 6 420tcctctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtgcccc ggaggatata 60tatacctcac acgtacgcgt acgcgtatat
atac 9442181DNAArtificial
sequenceLIGCas-1 mutation 7 421tcctctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtgcccc ggttcacacg 60tacgcgtacg cgtatatata c
8142265DNAArtificial sequenceLIGCas-1
mutation 8 422tcctctgtaa cgatttacgc acctgctggg aattgtaccg tacgtacgcg
tacgcgtata 60tatac
6542399DNAArtificial sequenceLIGCas-1 mutation 9
423tcctctgtaa cgatttacgc acctgctggg aattgtaccg tacgtgcccc ggttcggagg
60atatatatac ctcacacgta cgcgtacgcg tatatatac
9942495DNAArtificial sequenceLIGCas-1 mutation 10 424tcctctgtaa
cgatttacgc acctgctggg aattgtaccg tacgtgcccc cggaggatat 60atatacctca
cacgtacgcg tacgcgtata tatac
9542598DNAArtificial sequenceLIGCas-2 mutation 1 425gaagctgtaa cgatttacgc
acctgctggg aattgtaccg tacgtgaccc cggcggagga 60tatatatacc tcacacgtac
gcgtacgcgt atatatac 9842698DNAArtificial
sequenceLIGCas-2 mutation 2 426gaagctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtgtccc cggcggagga 60tatatatacc tcacacgtac gcgtacgcgt
atatatac 9842796DNAArtificial
sequenceLIGCas-2 mutation 3 427gaagctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtccccg gcggaggata 60tatatacctc acacgtacgc gtacgcgtat
atatac 9642898DNAArtificial
sequenceLIGCas-2 mutation 4 428gaagctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtggccc cggcggagga 60tatatatacc tcacacgtac gcgtacgcgt
atatatac 9842999DNAArtificial
sequenceLIGCas-2 mutation 5 429gaagctgtaa cgatttacgc acctgctggg
aattgtaccg tacgtgcacc ccggcggagg 60atatatatac ctcacacgta cgcgtacgcg
tatatatac 9943087DNAArtificial
sequenceLIGCas-2 mutation 6 430gaagctgtaa cgatttacgc acctgctggg
aattgtaccc ggcggaggat atatatacct 60cacacgtacg cgtacgcgta tatatac
8743192DNAArtificial sequenceLIGCas-2
mutation 7 431gaagctgtaa cgatttacgc acctgctggg aattgtaccg tccccggcgg
aggatatata 60tacctcacac gtacgcgtac gcgtatatat ac
9243294DNAArtificial sequenceLIGCas-2 mutation 8
432gaagctgtaa cgatttacgc acctgctggg aattgtaccg tacccccggc ggaggatata
60tatacctcac acgtacgcgt acgcgtatat atac
9443395DNAArtificial sequenceLIGCas-2 mutation 9 433gaagctgtaa cgatttacgc
acctgctggg aattgtaccg tacgccccgg cggaggatat 60atatacctca cacgtacgcg
tacgcgtata tatac 9543488DNAArtificial
sequenceLIGCas-2 mutation 10 434gaagctgtaa cgatttacgc acctgctggg
aattgtaccc cggcggagga tatatatacc 60tcacacgtac gcgtacgcgt atatatac
8843598DNAArtificial sequenceLIGCas-3
mutation 1 435aaggcgcaaa tgagtagcag cgcacgtata tatacgcgta cgcgtacgtt
gtgaggtata 60tatatcctcc gccggggcac gtacggtaca attcccag
9843696DNAArtificial sequenceLIGCas-3 mutation 2
436aaggcgcaaa tgagtagcag cgcacgtata tatacgcgta cgcgtacggt gaggtatata
60tatcctccgc cggggcacgt acggtacaat tcccag
9643795DNAArtificial sequenceLIGCas-3 mutation 3 437aaggcgcaaa tgagtagcag
cgcacgtata tatacgcgta cgcgtacgtg aggtatatat 60atcctccgcc ggggcacgta
cggtacaatt cccag 9543896DNAArtificial
sequenceLIGCas-3 mutation 4 438aaggcgcaaa tgagtagcag cgcacgtata
tatacgcgta cgcgtactgt gaggtatata 60tatcctccgc cggggcacgt acggtacaat
tcccag 9643968DNAArtificial
sequenceLIGCas-3 mutation 5 439aaggcgcaaa tgagtagcag cgcacgtata
tatatcctcc gccggggcac gtacggtaca 60attcccag
6844093DNAArtificial sequenceLIGCas-3
mutation 6 440aaggcgcaaa tgagtagcag cgcacgtata tatacgcgta cgcgtgtgag
gtatatatat 60cctccgccgg ggcacgtacg gtacaattcc cag
9344189DNAArtificial sequenceLIGCas-3 mutation 7
441aaggcgcaaa tgagtagcag cgcacgtata tatacgcgta cgtgaggtat atatatcctc
60cgccggggca cgtacggtac aattcccag
8944289DNAArtificial sequenceLIGCas-3 mutation 8 442aaggcgcaaa tgagtagcag
cgcacgtata tatacgcgta cgcgtactat atatatcctc 60cgccggggca cgtacggtac
aattcccag 8944394DNAArtificial
sequenceLIGCas-3 mutation 9 443aaggcgcaaa tgagtagcag cgcacgtata
tatacgcgta cgcgtacgga ggtatatata 60tcctccgccg gggcacgtac ggtacaattc
ccag 9444496DNAArtificial
sequenceLIGCas-3 mutation 10 444aaggcgcaaa tgagtagcag cgcacgtata
tatacgcgta cgcgtacgat gaggtatata 60tatcctccgc cggggcacgt acggtacaat
tcccag 964451051DNAArtificial
sequenceLIGCas-1_crRNA_Expression_Cassette 445tgagagtaca atgatgaacc
tagattaatc aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa
gattgcttga ggccctgttc ggttgttccg gattagagcc 120ccggattaat tcctagccgg
attacttctc taatttatat agattttgat gagctggaat 180gaatcctggc ttattccggt
acaaccgaac aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag
agtgtcagta ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag tttgatggca
ccattagggt tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat
ggcagggtga ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct gtttgttagc
gcatgtacaa ggaatgcaag ttttgagcga gggggcatca 480aagatctggc tgtgtttcca
gctgtttttg ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct
cgcttgtata gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt
ccaaagaatt tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag
taagtcgtgt ttagaaatta tttttttata tacctttttt 720ccttctatgt acagtaggac
acagtgtcag cgccgcgttg acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag
cggcgtatgt gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc
aatacgaagc accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt
aatacgcaaa cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac
cgagccgcaa gcaccgaatt gtaccgtacg tgccccggcg 1020ggttttagag ctatgctgtt
ttgttttttt t 10514461051DNAArtificial
sequenceLIGCas-2_crRNA_Expression_Cassette 446tgagagtaca atgatgaacc
tagattaatc aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa
gattgcttga ggccctgttc ggttgttccg gattagagcc 120ccggattaat tcctagccgg
attacttctc taatttatat agattttgat gagctggaat 180gaatcctggc ttattccggt
acaaccgaac aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag
agtgtcagta ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag tttgatggca
ccattagggt tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat
ggcagggtga ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct gtttgttagc
gcatgtacaa ggaatgcaag ttttgagcga gggggcatca 480aagatctggc tgtgtttcca
gctgtttttg ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct
cgcttgtata gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt
ccaaagaatt tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag
taagtcgtgt ttagaaatta tttttttata tacctttttt 720ccttctatgt acagtaggac
acagtgtcag cgccgcgttg acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag
cggcgtatgt gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc
aatacgaagc accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt
aatacgcaaa cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac
cgagccgcaa gcaccgaatt ggaattgtac cgtacgtgcc 1020cgttttagag ctatgctgtt
ttgttttttt t 10514471047DNAArtificial
sequenceLIGCas-3_crRNA_Expression_Cassette 447tgagagtaca atgatgaacc
tagattaatc aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa
gattgcttga ggccctgttc ggttgttccg gattagagcc 120ccggattaat tcctagccgg
attacttctc taatttatat agattttgat gagctggaat 180gaatcctggc ttattccggt
acaaccgaac aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag
agtgtcagta ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag tttgatggca
ccattagggt tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat
ggcagggtga ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct gtttgttagc
gcatgtacaa ggaatgcaag ttttgagcga gggggcatca 480aagatctggc tgtgtttcca
gctgtttttg ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct
cgcttgtata gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt
ccaaagaatt tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag
taagtcgtgt ttagaaatta tttttttata tacctttttt 720ccttctatgt acagtaggac
acagtgtcag cgccgcgttg acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag
cggcgtatgt gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc
aatacgaagc accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt
aatacgcaaa cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac
cgagccgcaa gcaccgaatt gcgtacgcgt acgtgtggtt 1020ttagagctat gctgttttgt
ttttttt 10474481087DNAArtificial
sequencetracrRNA_Expression_Cassette 448tgagagtaca atgatgaacc tagattaatc
aatgccaaag tctgaaaaat gcaccctcag 60tctatgatcc agaaaatcaa gattgcttga
ggccctgttc ggttgttccg gattagagcc 120ccggattaat tcctagccgg attacttctc
taatttatat agattttgat gagctggaat 180gaatcctggc ttattccggt acaaccgaac
aggccctgaa ggataccagt aatcgctgag 240ctaaattggc atgctgtcag agtgtcagta
ttgcagcaag gtagtgagat aaccggcatc 300atggtgccag tttgatggca ccattagggt
tagagatggt ggccatgggc gcatgtcctg 360gccaactttg tatgatatat ggcagggtga
ataggaaagt aaaattgtat tgtaaaaagg 420gatttcttct gtttgttagc gcatgtacaa
ggaatgcaag ttttgagcga gggggcatca 480aagatctggc tgtgtttcca gctgtttttg
ttagccccat cgaatccttg acataatgat 540cccgcttaaa taagcaacct cgcttgtata
gttccttgtg ctctaacaca cgatgatgat 600aagtcgtaaa atagtggtgt ccaaagaatt
tccaggccca gttgtaaaag ctaaaatgct 660attcgaattt ctactagcag taagtcgtgt
ttagaaatta tttttttata tacctttttt 720ccttctatgt acagtaggac acagtgtcag
cgccgcgttg acggagaata tttgcaaaaa 780agtaaaagag aaagtcatag cggcgtatgt
gccaaaaact tcgtcacaga gagggccata 840agaaacatgg cccacggccc aatacgaagc
accgcgacga agcccaaaca gcagtccgta 900ggtggagcaa agcgctgggt aatacgcaaa
cgttttgtcc caccttgact aatcacaaga 960gtggagcgta ccttataaac cgagccgcaa
gcaccgaatt ggaaccattc aaaacagcat 1020agcaagttaa aataaggcta gtccgttatc
aacttgaaaa agtggcaccg agtcggtgct 1080ttttttt
108744963DNAArtificial SequenceLIGCas-2
forward primer for primary 449ctacactctt tccctacacg acgctcttcc gatctgaagc
tgtaacgatt tacgcacctg 60ctg
6345060DNAArtificial SequenceLIGCas-3 forward
primer for primary PCR 450ctacactctt tccctacacg acgctcttcc gatctttccc
gcaaatgagt agcagcgcac 6045119DNAArtificial sequenceZm-ARGOS8-CTS1,
Cas9 target sequence 1 451gcgtgcatcg atccatcgc
1945221DNAArtificial sequenceZm-ARGOS8-CTS2, Cas9
target sequence 2 452ggctacggat agatatgatg c
2145320DNAArtificial sequenceZm-ARGOS8-CTS3, Cas9 target
sequence 3 453gttacttctc taagcacggc
2045421DNAArtificial sequenceP1, Forward_primer 454gcgccattcc
ctaaaggtaa c
2145523DNAArtificial sequenceP2, Reverse_primer 455gctaatcgta agtgacgctt
gga 2345631DNAArtificial
sequenceP3, Forward_primer 456gctcgtgtcc aagcgtcact tacgattagc t
3145720DNAArtificial sequenceP4, Reverse_primer
457ctgcgaactg cttgattccg
2045824DNAArtificial sequenceP5, Forward_primer 458accgtcctta tctctgcatc
atct 2445931DNAArtificial
sequencePBS, Primer Binding Site 459gctcgtgtcc aagcgtcact tacgattagc t
314601823DNAZea
maysmisc_feature(1)..(1823)Zm-GOS2 PRO-GOS2 INTRON, maize GOS2 promoter
and GOS2 intron1 including the promoter, 5'-UTR1, INTRON1 and
5'-UTR2 sequence 460taattattgg ctgtaggatt ctaaacagag cctaaatagc
tggaatagct ctagccctca 60atccaaacta atgatatcta tacttatgca actctaaatt
tttattctaa aagtaatatt 120tcatttttgt caacgagatt ctctactcta ttccacaatc
ttttgaagct atatttacct 180taaatctgta ctctatacca ataatcatat attctattat
ttatttttat ctctctccta 240aggagcatcc ctctatgtct gcatggcccc cgcctcgggt
cccaatctct tgctctgcta 300gtagcacaga agaaaacact agaaatgact tgcttgactt
agagtatcag ataaacatca 360tgtttactta actttaattt gtatcggttt ctactatttt
tataatattt ttgtctctat 420agatactacg tgcaacagta taatcaacct agtttaatcc
agagcgaagg attttttact 480aagtacgtga ctccatatgc acagcgttcc ttttatggtt
cctcactggg cacagcataa 540acgaaccctg tccaatgttt tcagcgcgaa caaacagaaa
ttccatcagc gaacaaacaa 600catacatgcg agatgaaaat aaataataaa aaaagctccg
tctcgatagg ccggcacgaa 660tcgagagcct ccatagccag ttttttccat cggaacggcg
gttcgcgcac ctaattatat 720gcaccacacg cctataaagc caaccaaccc gtcggagggg
cgcaagccag acagaagaca 780gcccgtcagc ccctctcgtt tttcatccgc cttcgcctcc
aaccgcgtgc gctccacgcc 840tcctccagga aagcgagagg tgagcgcagt cccctttccc
ctccttccaa ttcaattcgt 900cttctcgttc gcagccctag gatttggggg tctggagggg
tttgatcgtt tctcgccgtg 960aatctgcttt ggtgtaaacc aacggatctc ggatcgtagt
cttcagaaga tcccggattt 1020tgcggtttgg cccctcctgg attcaattcg tcgtatcgtt
cgcagcccta ggatttgggg 1080atctggaggg gtttgatcgt ttctcgccgc gaatctgctc
tggtgtaaac caacggatct 1140cgggtcgtag tcttcagaag gtcccggatt ttgcggtttg
gcccctcctg gattcaattc 1200gtcgtatcgt tcgcagccct aggatttggg gatctggagg
ggtttgatcc tttctcgccg 1260cgaatctgct ctggtataac caacggatct cgggtcgtag
tcttcagaag gtcccggatt 1320ttgcggtttg gtggttcttg ctctatgaat cagagggatg
gttcttcccg gatttatgcc 1380ttgcggccac tctgtcgaat catggggttt cgacccgatt
cgtaggcgtg ctccctgttt 1440tggatgggaa gtaggcgtgt ttgtagtatt cgtgcttcga
ttcgtcaacg gagattagaa 1500gacctgggat gggatttgag gaaatctagg tatctgtcta
gcacgtttct agatctattc 1560ttcagctgtt atatgagagt aattttggaa ccctggtggg
gtatgtttga ccgagtattc 1620tgtagattat tgtccgtgac ttgctggctg ttaccgtcct
tatctctgca tcatctatct 1680gtgctagttt ctgcgtgctt ctcaaatatt tccggcctgt
gtagcatgtg actgataata 1740tgattttggc agcttctgca taagaacaac aaatcaaaag
cttgatcagc tcggtgccta 1800caaaacctca acaaccaagt ttc
1823461556DNAZea
maysmisc_feature(1)..(556)Zm-ARGOS8 promoter 461gttacttctc taagcacggc
tggatttcag gcctctagtc ctctactagt actagctaca 60cgacgtgcac gcatgcatca
cagcatcaac aactagacac gcacacgctg cacgcggccg 120gggaacccac tgattccccc
cttccccgcg cgcggtttga tttcctttcc tggtacggat 180ccatatctga gggcttgttc
ggttattccc aacacacatg tattggatgg gattgaaaaa 240aaaatgagaa gaagtttgac
ttgtttggga ttcaaaccca tccaatccca ctcaatccac 300atggattgag agctaaccga
acaagccctc atagtacata cctggtacgg atccatatca 360tagtacatag atccagtaga
atagaaggtg atccgaccgc cggcgcttgc gttgttttcc 420ccggtccatt gaacctgcca
accctcctaa ccacaggcac gccaaaccgc gggctccggc 480caccaccgcc accgccacct
gccctgccgc acctctccaa ccccaaatcc aggggggggg 540gggggcacca tgcgtg
556462155DNAZea
maysmisc_feature(1)..(155)Zm-ARGOS8 5'-UTR 462catcgatcca tcgctggcgc
gcgggtccgg cggggcggtc tgtgagggca aatttatata 60ggtctagtgg gtacccggct
acggatagat atgatgctgc actgcacatt ggctatatct 120gaggctcctg cgcgcgcctt
ggccaggtgt ctgtc 155463285DNAZea mays
463atgcgggcga tgccgcagga agaggaagcc gcggtggcga cgacgaccat ggccgggggc
60aaggtggcgg cgctgctggc cacggcggcc gcgctgctgc tgctgctccc gctggcgctg
120ccgccgctgc cgccgccgcc cacgcagctg ttgttcgtcc ccgtggtctt gctgctcctc
180gtggcgtccc tcgcgttctg ccccgccgcg accccctcgc cgtcgccgat gcatgccgcc
240gaccacgggt cgttcgggac cactggatca ccgcacctat gttga
2854642843DNAZea maysmisc_feature(1)..(2843)Zm-GOS2 gene, including
promoter, 5'-UTR, CDS, 3'-UTR and introns sequence 464taattattgg
ctgtaggatt ctaaacagag cctaaatagc tggaatagct ctagccctca 60atccaaacta
atgatatcta tacttatgca actctaaatt tttattctaa aagtaatatt 120tcatttttgt
caacgagatt ctctactcta ttccacaatc ttttgaagct atatttacct 180taaatctgta
ctctatacca ataatcatat attctattat ttatttttat ctctctccta 240aggagcatcc
ctctatgtct gcatggcccc cgcctcgggt cccaatctct tgctctgcta 300gtagcacaga
agaaaacact agaaatgact tgcttgactt agagtatcag ataaacatca 360tgtttactta
actttaattt gtatcggttt ctactatttt tataatattt ttgtctctat 420agatactacg
tgcaacagta taatcaacct agtttaatcc agagcgaagg attttttact 480aagtacgtga
ctccatatgc acagcgttcc ttttatggtt cctcactggg cacagcataa 540acgaaccctg
tccaatgttt tcagcgcgaa caaacagaaa ttccatcagc gaacaaacaa 600catacatgcg
agatgaaaat aaataataaa aaaagctccg tctcgatagg ccggcacgaa 660tcgagagcct
ccatagccag ttttttccat cggaacggcg gttcgcgcac ctaattatat 720gcaccacacg
cctataaagc caaccaaccc gtcggagggg cgcaagccag acagaagaca 780gcccgtcagc
ccctctcgtt tttcatccgc cttcgcctcc aaccgcgtgc gctccacgcc 840tcctccagga
aagcgagagg tgagcgcagt cccctttccc ctccttccaa ttcaattcgt 900cttctcgttc
gcagccctag gatttggggg tctggagggg tttgatcgtt tctcgccgtg 960aatctgcttt
ggtgtaaacc aacggatctc ggatcgtagt cttcagaaga tcccggattt 1020tgcggtttgg
cccctcctgg attcaattcg tcgtatcgtt cgcagcccta ggatttgggg 1080atctggaggg
gtttgatcgt ttctcgccgc gaatctgctc tggtgtaaac caacggatct 1140cgggtcgtag
tcttcagaag gtcccggatt ttgcggtttg gcccctcctg gattcaattc 1200gtcgtatcgt
tcgcagccct aggatttggg gatctggagg ggtttgatcc tttctcgccg 1260cgaatctgct
ctggtataac caacggatct cgggtcgtag tcttcagaag gtcccggatt 1320ttgcggtttg
gtggttcttg ctctatgaat cagagggatg gttcttcccg gatttatgcc 1380ttgcggccac
tctgtcgaat catggggttt cgacccgatt cgtaggcgtg ctccctgttt 1440tggatgggaa
gtaggcgtgt ttgtagtatt cgtgcttcga ttcgtcaacg gagattagaa 1500gacctgggat
gggatttgag gaaatctagg tatctgtcta gcacgtttct agatctattc 1560ttcagctgtt
atatgagagt aattttggaa ccctggtggg gtatgtttga ccgagtattc 1620tgtagattat
tgtccgtgac ttgctggctg ttaccgtcct tatctctgca tcatctatct 1680gtgctagttt
ctgcgtgctt ctcaaatatt tccggcctgt gtagcatgtg actgataata 1740tgattttggc
agcttctgca taagaacaac aaatcaaaag cttgatcagc tcggtgccta 1800caaaacctca
acaaccaagt ttcatgtctg atctcgacgt ccagcttcca tctgcctttg 1860gtatggctac
ttctcaattc atgatgccat gttttttttt atattgtggt tttacataat 1920acatagcatc
ttccagcttc ctgaagagta ttactgaata gattgataac atcatacaca 1980cgaagttcat
cttgaacatg cttattagtg ttctgtttgc atctgatggt atggcatcat 2040ctttgataga
tccgtttgct gaggcaaatg ctgaggactc tggtgctggt cctggaacga 2100aggattatgt
gcatgtgcgc atccagcagc gcaacggcag aaagagtctg actacagtcc 2160agggtctgaa
gaaagagttc agctataaca agatcctcaa ggatctgaag aaggaattct 2220gctgcaatgg
tactgtagtt caggacccag agctaggcca ggtaagatac gagaacaatg 2280catttcaagc
ttgtaaaaat ggtatctgcc ggttggtgga tatactgatc tgtttgtccg 2340ctgcaggtca
ttcagctcca aggtgaccag cgcaagaatg ttgctacttt cctagttcag 2400gtattcagaa
tcttcagacc tggccagctg aatactgttt taccataccg atagatgttc 2460aatctgttaa
tactgatcgt gcaattatta cttgtcttgg taggctggga ttgcgaagaa 2520agagaacatc
aagattcacg ggttctaagg gacctgtaaa tgcttgtgcc ctatattgtg 2580tgcctccaca
tattggggag cttgaagcat cgacagttac tagtcattgc ttacttatat 2640aagaacataa
gtagtatttg ctattgtcaa gtgtgccttg cttgatgcaa gttgtgtttt 2700cgtatcatta
ttattatgca cggccatcgt acgtgtatgg cttgtatggg ttattgccaa 2760cttaataaaa
gcacactctg tttgcctata agcactgatg tttgcctcgt catgcacatg 2820ttgagtcggg
ttttatttgt att 2843
465 800 DNA Zea mays misc_feature (1)..(800) Zm-GOS2 PRO, maize GOS2 promoter
465 taattattgg
ctgtaggatt ctaaacagag cctaaatagc tggaatagct ctagccctca 60atccaaacta
atgatatcta tacttatgca actctaaatt tttattctaa aagtaatatt 120tcatttttgt
caacgagatt ctctactcta ttccacaatc ttttgaagct atatttacct 180taaatctgta
ctctatacca ataatcatat attctattat ttatttttat ctctctccta 240aggagcatcc
ctctatgtct gcatggcccc cgcctcgggt cccaatctct tgctctgcta 300gtagcacaga
agaaaacact agaaatgact tgcttgactt agagtatcag ataaacatca 360tgtttactta
actttaattt gtatcggttt ctactatttt tataatattt ttgtctctat 420agatactacg
tgcaacagta taatcaacct agtttaatcc agagcgaagg attttttact 480aagtacgtga
ctccatatgc acagcgttcc ttttatggtt cctcactggg cacagcataa 540acgaaccctg
tccaatgttt tcagcgcgaa caaacagaaa ttccatcagc gaacaaacaa 600catacatgcg
agatgaaaat aaataataaa aaaagctccg tctcgatagg ccggcacgaa 660tcgagagcct
ccatagccag ttttttccat cggaacggcg gttcgcgcac ctaattatat 720gcaccacacg
cctataaagc caaccaaccc gtcggagggg cgcaagccag acagaagaca 780gcccgtcagc
ccctctcgtt
800
466 1023 DNA Zea mays misc_feature (1)..(1023) GOS2 INTRON, maize GOS2 5'-UTR1 and intron1 and 5'-UTR2 sequence
466 tttcatccgc cttcgcctcc
aaccgcgtgc gctccacgcc tcctccagga aagcgagagg 60tgagcgcagt cccctttccc
ctccttccaa ttcaattcgt cttctcgttc gcagccctag 120gatttggggg tctggagggg
tttgatcgtt tctcgccgtg aatctgcttt ggtgtaaacc 180aacggatctc ggatcgtagt
cttcagaaga tcccggattt tgcggtttgg cccctcctgg 240attcaattcg tcgtatcgtt
cgcagcccta ggatttgggg atctggaggg gtttgatcgt 300ttctcgccgc gaatctgctc
tggtgtaaac caacggatct cgggtcgtag tcttcagaag 360gtcccggatt ttgcggtttg
gcccctcctg gattcaattc gtcgtatcgt tcgcagccct 420aggatttggg gatctggagg
ggtttgatcc tttctcgccg cgaatctgct ctggtataac 480caacggatct cgggtcgtag
tcttcagaag gtcccggatt ttgcggtttg gtggttcttg 540ctctatgaat cagagggatg
gttcttcccg gatttatgcc ttgcggccac tctgtcgaat 600catggggttt cgacccgatt
cgtaggcgtg ctccctgttt tggatgggaa gtaggcgtgt 660ttgtagtatt cgtgcttcga
ttcgtcaacg gagattagaa gacctgggat gggatttgag 720gaaatctagg tatctgtcta
gcacgtttct agatctattc ttcagctgtt atatgagagt 780aattttggaa ccctggtggg
gtatgtttga ccgagtattc tgtagattat tgtccgtgac 840ttgctggctg ttaccgtcct
tatctctgca tcatctatct gtgctagttt ctgcgtgctt 900ctcaaatatt tccggcctgt
gtagcatgtg actgataata tgattttggc agcttctgca 960taagaacaac aaatcaaaag
cttgatcagc tcggtgccta caaaacctca acaaccaagt 1020ttc
1023
467 23 DNA Glycine max
467 gcgtcctttg acagcagctg tgg 23
468 23 DNA Glycine max
468 gcaaccacag ctgctgtcaa agg 23
469 434 DNA Glycine max
469 ccgggtgtga tttagtataa
agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata agtaatattg
aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa ggaagaaaaa
aaacaaacaa aaaataggtt gcaatggggc agagcagagt 180catcatgaag ctagaaaggc
taccgataga taaactatag ttaattaaat acattaaaaa 240atacttggat ctttctctta
ccctgtttat attgagacct gaaacttgag agagatacac 300taatcttgcc ttgttgtttc
attccctaac ttacaggact cagcgcatgt catgtggtct 360cgttccccat ttaagtccca
caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca acaa
434
470 9093 DNA Artificial sequence QC878
470 ccgggtgtga tttagtataa agtgaagtaa tggtcaaaag aaaaagtgta
aaacgaagta 60cctagtaata agtaatattg aacaaaataa atggtaaagt gtcagatata
taaaataggc 120tttaataaaa ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc
agagcagagt 180catcatgaag ctagaaaggc taccgataga taaactatag ttaattaaat
acattaaaaa 240atacttggat ctttctctta ccctgtttat attgagacct gaaacttgag
agagatacac 300taatcttgcc ttgttgtttc attccctaac ttacaggact cagcgcatgt
catgtggtct 360cgttccccat ttaagtccca caccgtctaa acttattaaa ttattaatgt
ttataactag 420atgcacaaca acaaagcttg cgtcctttga cagcagctgg ttttagagct
agaaatagca 480agttaaaata aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc
ggtgcttttt 540tttgcggccg caattggatc gggtttactt attttgtggg tatctatact
tttattagat 600ttttaatcag gctcctgatt tctttttatt tcgattgaat tcctgaactt
gtattattca 660gtagatcgaa taaattataa aaagataaaa tcataaaata atattttatc
ctatcaatca 720tattaaagca atgaatatgt aaaattaatc ttatctttat tttaaaaaat
catataggtt 780tagtattttt ttaaaaataa agataggatt agttttacta ttcactgctt
attactttta 840aaaaaatcat aaaggtttag tattttttta aaataaatat aggaatagtt
ttactattca 900ctgctttaat agaaaaatag tttaaaattt aagatagttt taatcccagc
atttgccacg 960tttgaacgtg agccgaaacg atgtcgttac attatcttaa cctagctgaa
acgatgtcgt 1020cataatatcg ccaaatgcca actggactac gtcgaaccca caaatcccac
aaagcgcgtg 1080aaatcaaatc gctcaaacca caaaaaagaa caacgcgttt gttacacgct
caatcccacg 1140cgagtagagc acagtaacct tcaaataagc gaatggggca taatcagaaa
tccgaaataa 1200acctaggggc attatcggaa atgaaaagta gctcactcaa tataaaaatc
taggaaccct 1260agttttcgtt atcactctgt gctccctcgc tctatttctc agtctctgtg
tttgcggctg 1320aggattccga acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc
tgctcttgtc 1380tcttcgattc gatctatgcc tgtctcttat ttacgatgat gtttcttcgg
ttatgttttt 1440ttatttatgc tttatgctgt tgatgttcgg ttgtttgttt cgctttgttt
ttgtggttca 1500gttttttagg attcttttgg tttttgaatc gattaatcgg aagagatttt
cgagttattt 1560ggtgtgttgg aggtgaatct tttttttgag gtcatagatc tgttgtattt
gtgttataaa 1620catgcgactt tgtatgattt tttacgaggt tatgatgttc tggttgtttt
attatgaatc 1680tgttgagaca gaaccatgat ttttgttgat gttcgtttac actattaaag
gtttgtttta 1740acaggattaa aagtttttta agcatgttga aggagtcttg tagatatgta
accgtcgata 1800gtttttttgt gggtttgttc acatgttatc aagcttaatc ttttactatg
tatgcgacca 1860tatctggatc cagcaaaggc gattttttaa ttccttgtga aacttttgta
atatgaagtt 1920gaaattttgt tattggtaaa ctataaatgt gtgaagttgg agtatacctt
taccttctta 1980tttggctttg tgatagttta atttatatgt attttgagtt ctgacttgta
tttctttgaa 2040ttgattctag tttaagtaat ccatggacaa aaagtactca atagggctcg
acatagggac 2100taactccgtt ggatgggccg tcatcaccga cgagtacaag gtgccctcca
agaagttcaa 2160ggtgttggga aacaccgaca ggcacagcat aaagaagaat ttgatcggtg
ccctcctctt 2220cgactccgga gagaccgctg aggctaccag gctcaagagg accgctagaa
ggcgctacac 2280cagaaggaag aacagaatct gctacctgca ggagatcttc tccaacgaga
tggccaaggt 2340ggacgactcc ttcttccacc gccttgagga atcattcctg gtggaggagg
ataaaaagca 2400cgagagacac ccaatcttcg ggaacatcgt cgacgaggtg gcctaccatg
aaaagtaccc 2460taccatctac cacctgagga agaagctggt cgactctacc gacaaggctg
acttgcgctt 2520gatttacctg gctctcgctc acatgataaa gttccgcgga cacttcctca
ttgagggaga 2580cctgaaccca gacaactccg acgtggacaa gctcttcatc cagctcgttc
agacctacaa 2640ccagcttttc gaggagaacc caatcaacgc cagtggagtt gacgccaagg
ctatcctctc 2700tgctcgtctg tcaaagtcca ggaggcttga gaacttgatt gcccagctgc
ctggcgaaaa 2760gaagaacgga ctgttcggaa acttgatcgc tctctccctg ggattgactc
ccaacttcaa 2820gtccaacttc gacctcgccg aggacgctaa gttgcagttg tctaaagaca
cctacgacga 2880tgacctcgac aacttgctgg cccagatagg cgaccaatac gccgatctct
tcctcgccgc 2940taagaacttg tccgacgcaa tcctgctgtc cgacatcctg agagtcaaca
ctgagattac 3000caaagctcct ctgtctgctt ccatgattaa gcgctacgac gagcaccacc
aagatctgac 3060cctgctcaag gccctggtga gacagcagct gcccgagaag tacaaggaga
tctttttcga 3120ccagtccaag aacggctacg ccggatacat tgacggaggc gcctcccagg
aagagttcta 3180caagttcatc aagcccatcc ttgagaagat ggacggtacc gaggagctgt
tggtgaagtt 3240gaacagagag gacctgttga ggaagcagag aaccttcgac aacggaagca
tccctcacca 3300aatccacctg ggagagctcc acgccatctt gaggaggcag gaggatttct
atcccttcct 3360gaaggacaac cgcgagaaga ttgagaagat cttgaccttc agaattcctt
actacgtcgg 3420gccactcgcc agaggaaact ctaggttcgc ctggatgacc cgcaaatctg
aagagaccat 3480tactccctgg aacttcgagg aagtcgtgga caagggcgct tccgctcagt
ctttcatcga 3540gaggatgacc aacttcgata aaaatctgcc caacgagaag gtgctgccca
agcactccct 3600gttgtacgag tatttcacag tgtacaacga gctcaccaag gtgaagtacg
tcacagaggg 3660aatgaggaag cctgccttct tgtccggaga gcagaagaag gccatcgtcg
acctgctctt 3720caagaccaac aggaaggtga ctgtcaagca gctgaaggag gactacttca
agaagatcga 3780gtgcttcgac tccgtcgaga tctctggtgt cgaggacagg ttcaacgcct
cccttgggac 3840ttaccacgat ctgctcaaga ttattaaaga caaggacttc ctggacaacg
aggagaacga 3900ggacatcctt gaggacatcg tgctcaccct gaccttgttc gaagacaggg
aaatgatcga 3960agagaggctc aagacctacg cccacctctt cgacgacaag gtgatgaaac
agctgaagag 4020acgcagatat accggctggg gaaggctctc ccgcaaattg atcaacggga
tcagggacaa 4080gcagtcaggg aagactatac tcgacttcct gaagtccgac ggattcgcca
acaggaactt 4140catgcagctc attcacgacg actccttgac cttcaaggag gacatccaga
aggctcaggt 4200gtctggacag ggtgactcct tgcatgagca cattgctaac ttggccggct
ctcccgctat 4260taagaagggc attttgcaga ccgtgaaggt cgttgacgag ctcgtgaagg
tgatgggacg 4320ccacaagcca gagaacatcg ttattgagat ggctcgcgag aaccaaacta
cccagaaagg 4380gcagaagaat tcccgcgaga ggatgaagcg cattgaggag ggcataaaag
agcttggctc 4440tcagatcctc aaggagcacc ccgtcgagaa cactcagctg cagaacgaga
agctgtacct 4500gtactacctc caaaacggaa gggacatgta cgtggaccag gagctggaca
tcaacaggtt 4560gtccgactac gacgtcgacc acatcgtgcc tcagtccttc ctgaaggatg
actccatcga 4620caataaagtg ctgacacgct ccgataaaaa tagaggcaag tccgacaacg
tcccctccga 4680ggaggtcgtg aagaagatga aaaactactg gagacagctc ttgaacgcca
agctcatcac 4740ccagcgtaag ttcgacaacc tgactaaggc tgagagagga ggattgtccg
agctcgataa 4800ggccggattc atcaagagac agctcgtcga aacccgccaa attaccaagc
acgtggccca 4860aattctggat tcccgcatga acaccaagta cgatgaaaat gacaagctga
tccgcgaggt 4920caaggtgatc accttgaagt ccaagctggt ctccgacttc cgcaaggact
tccagttcta 4980caaggtgagg gagatcaaca actaccacca cgcacacgac gcctacctca
acgctgtcgt 5040tggaaccgcc ctcatcaaaa aatatcctaa gctggagtct gagttcgtct
acggcgacta 5100caaggtgtac gacgtgagga agatgatcgc taagtctgag caggagatcg
gcaaggccac 5160cgccaagtac ttcttctact ccaacatcat gaacttcttc aagaccgaga
tcactctcgc 5220caacggtgag atcaggaagc gcccactgat cgagaccaac ggtgagactg
gagagatcgt 5280gtgggacaaa gggagggatt tcgctactgt gaggaaggtg ctctccatgc
ctcaggtgaa 5340catcgtcaag aagaccgaag ttcagaccgg aggattctcc aaggagtcca
tcctccccaa 5400gagaaactcc gacaagctga tcgctagaaa gaaagactgg gaccctaaga
agtacggagg 5460cttcgattct cctaccgtgg cctactctgt gctggtcgtg gccaaggtgg
agaagggcaa 5520gtccaagaag ctgaaatccg tcaaggagct cctcgggatt accatcatgg
agaggagttc 5580cttcgagaag aaccctatcg acttcctgga ggccaaggga tataaagagg
tgaagaagga 5640cctcatcatc aagctgccca agtactccct cttcgagttg gagaacggaa
ggaagaggat 5700gctggcttct gccggagagt tgcagaaggg aaatgagctc gcccttccct
ccaagtacgt 5760gaacttcctg tacctcgcct ctcactatga aaagttgaag ggctctcctg
aggacaacga 5820gcagaagcag ctcttcgtgg agcagcacaa gcactacctg gacgaaatta
tcgagcagat 5880ctctgagttc tccaagcgcg tgatattggc cgacgccaac ctcgacaagg
tgctgtccgc 5940ctacaacaag cacagggata agcccattcg cgagcaggct gaaaacatta
tccacctgtt 6000taccctcaca aacttgggag cccctgctgc cttcaagtac ttcgacacca
ccattgacag 6060gaagagatac acctccacca aggaggtgct cgacgcaaca ctcatccacc
aatccatcac 6120cggcctctat gaaacaagga ttgacttgtc ccagctggga ggcgactcta
gagccgatcc 6180caagaagaag agaaaggtgt aggttaacct agacttgtcc atcttctgga
ttggccaact 6240taattaatgt atgaaataaa aggatgcaca catagtgaca tgctaatcac
tataatgtgg 6300gcatcaaagt tgtgtgttat gtgtaattac tagttatctg aataaaagag
aaagagatca 6360tccatatttc ttatcctaaa tgaatgtcac gtgtctttat aattctttga
tgaaccagat 6420gcatttcatt aaccaaatcc atatacatat aaatattaat catatataat
taatatcaat 6480tgggttagca aaacaaatct agtctaggtg tgttttgcga attcgatatc
aagcttatcg 6540ataccgtcga gggggggccc ggtaccggcg cgccgttcta tagtgtcacc
taaatcgtat 6600gtgtatgata cataaggtta tgtattaatt gtagccgcgt tctaacgaca
atatgtccat 6660atggtgcact ctcagtacaa tctgctctga tgccgcatag ttaagccagc
cccgacaccc 6720gccaacaccc gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg
cttacagaca 6780agctgtgacc gtctccggga gctgcatgtg tcagaggttt tcaccgtcat
caccgaaacg 6840cgcgagacga aagggcctcg tgatacgcct atttttatag gttaatgtca
tgaccaaaat 6900cccttaacgt gagttttcgt tccactgagc gtcagacccc gtagaaaaga
tcaaaggatc 6960ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa
aaccaccgct 7020accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga
aggtaactgg 7080cttcagcaga gcgcagatac caaatactgt ccttctagtg tagccgtagt
taggccacca 7140cttcaagaac tctgtagcac cgcctacata cctcgctctg ctaatcctgt
taccagtggc 7200tgctgccagt ggcgataagt cgtgtcttac cgggttggac tcaagacgat
agttaccgga 7260taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca cagcccagct
tggagcgaac 7320gacctacacc gaactgagat acctacagcg tgagcattga gaaagcgcca
cgcttcccga 7380agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag
agcgcacgag 7440ggagcttcca gggggaaacg cctggtatct ttatagtcct gtcgggtttc
gccacctctg 7500acttgagcgt cgatttttgt gatgctcgtc aggggggcgg agcctatgga
aaaacgccag 7560caacgcggcc tttttacggt tcctggcctt ttgctggcct tttgctcaca
tgttctttcc 7620tgcgttatcc cctgattctg tggataaccg tattaccgcc tttgagtgag
ctgataccgc 7680tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg
aagagcgccc 7740aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat taatgcaggt
tgatcagatc 7800tcgatcccgc gaaattaata cgactcacta tagggagacc acaacggttt
ccctctagaa 7860ataattttgt ttaactttaa gaaggagata tacccatgga aaagcctgaa
ctcaccgcga 7920cgtctgtcga gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg
atgcagctct 7980cggagggcga agaatctcgt gctttcagct tcgatgtagg agggcgtgga
tatgtcctgc 8040gggtaaatag ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg
cactttgcat 8100cggccgcgct cccgattccg gaagtgcttg acattgggga attcagcgag
agcctgacct 8160attgcatctc ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa
accgaactgc 8220ccgctgttct gcagccggtc gcggaggcta tggatgcgat cgctgcggcc
gatcttagcc 8280agacgagcgg gttcggccca ttcggaccgc aaggaatcgg tcaatacact
acatggcgtg 8340atttcatatg cgcgattgct gatccccatg tgtatcactg gcaaactgtg
atggacgaca 8400ccgtcagtgc gtccgtcgcg caggctctcg atgagctgat gctttgggcc
gaggactgcc 8460ccgaagtccg gcacctcgtg cacgcggatt tcggctccaa caatgtcctg
acggacaatg 8520gccgcataac agcggtcatt gactggagcg aggcgatgtt cggggattcc
caatacgagg 8580tcgccaacat cttcttctgg aggccgtggt tggcttgtat ggagcagcag
acgcgctact 8640tcgagcggag gcatccggag cttgcaggat cgccgcggct ccgggcgtat
atgctccgca 8700ttggtcttga ccaactctat cagagcttgg ttgacggcaa tttcgatgat
gcagcttggg 8760cgcagggtcg atgcgacgca atcgtccgat ccggagccgg gactgtcggg
cgtacacaaa 8820tcgcccgcag aagcgcggcc gtctggaccg atggctgtgt agaagtactc
gccgatagtg 8880gaaaccgacg ccccagcact cgtccgaggg caaaggaata gtgaggtaca
gcttggatcg 8940atccggctgc taacaaagcc cgaaaggaag ctgagttggc tgctgccacc
gctgagcaat 9000aactagcata accccttggg gcctctaaac gggtcttgag gggttttttg
ctgaaaggag 9060gaactatatc cggatgatcg ggcgcgccgg tac
9093
471 9093 DNA Artificial sequence QC879
471 ccgggtgtga
tttagtataa agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata
agtaatattg aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa
ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc agagcagagt 180catcatgaag
ctagaaaggc taccgataga taaactatag ttaattaaat acattaaaaa 240atacttggat
ctttctctta ccctgtttat attgagacct gaaacttgag agagatacac 300taatcttgcc
ttgttgtttc attccctaac ttacaggact cagcgcatgt catgtggtct 360cgttccccat
ttaagtccca caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca
acaaagcttg caaccacagc tgctgtcaag ttttagagct agaaatagca 480agttaaaata
aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc ggtgcttttt 540tttgcggccg
caattggatc gggtttactt attttgtggg tatctatact tttattagat 600ttttaatcag
gctcctgatt tctttttatt tcgattgaat tcctgaactt gtattattca 660gtagatcgaa
taaattataa aaagataaaa tcataaaata atattttatc ctatcaatca 720tattaaagca
atgaatatgt aaaattaatc ttatctttat tttaaaaaat catataggtt 780tagtattttt
ttaaaaataa agataggatt agttttacta ttcactgctt attactttta 840aaaaaatcat
aaaggtttag tattttttta aaataaatat aggaatagtt ttactattca 900ctgctttaat
agaaaaatag tttaaaattt aagatagttt taatcccagc atttgccacg 960tttgaacgtg
agccgaaacg atgtcgttac attatcttaa cctagctgaa acgatgtcgt 1020cataatatcg
ccaaatgcca actggactac gtcgaaccca caaatcccac aaagcgcgtg 1080aaatcaaatc
gctcaaacca caaaaaagaa caacgcgttt gttacacgct caatcccacg 1140cgagtagagc
acagtaacct tcaaataagc gaatggggca taatcagaaa tccgaaataa 1200acctaggggc
attatcggaa atgaaaagta gctcactcaa tataaaaatc taggaaccct 1260agttttcgtt
atcactctgt gctccctcgc tctatttctc agtctctgtg tttgcggctg 1320aggattccga
acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc tgctcttgtc 1380tcttcgattc
gatctatgcc tgtctcttat ttacgatgat gtttcttcgg ttatgttttt 1440ttatttatgc
tttatgctgt tgatgttcgg ttgtttgttt cgctttgttt ttgtggttca 1500gttttttagg
attcttttgg tttttgaatc gattaatcgg aagagatttt cgagttattt 1560ggtgtgttgg
aggtgaatct tttttttgag gtcatagatc tgttgtattt gtgttataaa 1620catgcgactt
tgtatgattt tttacgaggt tatgatgttc tggttgtttt attatgaatc 1680tgttgagaca
gaaccatgat ttttgttgat gttcgtttac actattaaag gtttgtttta 1740acaggattaa
aagtttttta agcatgttga aggagtcttg tagatatgta accgtcgata 1800gtttttttgt
gggtttgttc acatgttatc aagcttaatc ttttactatg tatgcgacca 1860tatctggatc
cagcaaaggc gattttttaa ttccttgtga aacttttgta atatgaagtt 1920gaaattttgt
tattggtaaa ctataaatgt gtgaagttgg agtatacctt taccttctta 1980tttggctttg
tgatagttta atttatatgt attttgagtt ctgacttgta tttctttgaa 2040ttgattctag
tttaagtaat ccatggacaa aaagtactca atagggctcg acatagggac 2100taactccgtt
ggatgggccg tcatcaccga cgagtacaag gtgccctcca agaagttcaa 2160ggtgttggga
aacaccgaca ggcacagcat aaagaagaat ttgatcggtg ccctcctctt 2220cgactccgga
gagaccgctg aggctaccag gctcaagagg accgctagaa ggcgctacac 2280cagaaggaag
aacagaatct gctacctgca ggagatcttc tccaacgaga tggccaaggt 2340ggacgactcc
ttcttccacc gccttgagga atcattcctg gtggaggagg ataaaaagca 2400cgagagacac
ccaatcttcg ggaacatcgt cgacgaggtg gcctaccatg aaaagtaccc 2460taccatctac
cacctgagga agaagctggt cgactctacc gacaaggctg acttgcgctt 2520gatttacctg
gctctcgctc acatgataaa gttccgcgga cacttcctca ttgagggaga 2580cctgaaccca
gacaactccg acgtggacaa gctcttcatc cagctcgttc agacctacaa 2640ccagcttttc
gaggagaacc caatcaacgc cagtggagtt gacgccaagg ctatcctctc 2700tgctcgtctg
tcaaagtcca ggaggcttga gaacttgatt gcccagctgc ctggcgaaaa 2760gaagaacgga
ctgttcggaa acttgatcgc tctctccctg ggattgactc ccaacttcaa 2820gtccaacttc
gacctcgccg aggacgctaa gttgcagttg tctaaagaca cctacgacga 2880tgacctcgac
aacttgctgg cccagatagg cgaccaatac gccgatctct tcctcgccgc 2940taagaacttg
tccgacgcaa tcctgctgtc cgacatcctg agagtcaaca ctgagattac 3000caaagctcct
ctgtctgctt ccatgattaa gcgctacgac gagcaccacc aagatctgac 3060cctgctcaag
gccctggtga gacagcagct gcccgagaag tacaaggaga tctttttcga 3120ccagtccaag
aacggctacg ccggatacat tgacggaggc gcctcccagg aagagttcta 3180caagttcatc
aagcccatcc ttgagaagat ggacggtacc gaggagctgt tggtgaagtt 3240gaacagagag
gacctgttga ggaagcagag aaccttcgac aacggaagca tccctcacca 3300aatccacctg
ggagagctcc acgccatctt gaggaggcag gaggatttct atcccttcct 3360gaaggacaac
cgcgagaaga ttgagaagat cttgaccttc agaattcctt actacgtcgg 3420gccactcgcc
agaggaaact ctaggttcgc ctggatgacc cgcaaatctg aagagaccat 3480tactccctgg
aacttcgagg aagtcgtgga caagggcgct tccgctcagt ctttcatcga 3540gaggatgacc
aacttcgata aaaatctgcc caacgagaag gtgctgccca agcactccct 3600gttgtacgag
tatttcacag tgtacaacga gctcaccaag gtgaagtacg tcacagaggg 3660aatgaggaag
cctgccttct tgtccggaga gcagaagaag gccatcgtcg acctgctctt 3720caagaccaac
aggaaggtga ctgtcaagca gctgaaggag gactacttca agaagatcga 3780gtgcttcgac
tccgtcgaga tctctggtgt cgaggacagg ttcaacgcct cccttgggac 3840ttaccacgat
ctgctcaaga ttattaaaga caaggacttc ctggacaacg aggagaacga 3900ggacatcctt
gaggacatcg tgctcaccct gaccttgttc gaagacaggg aaatgatcga 3960agagaggctc
aagacctacg cccacctctt cgacgacaag gtgatgaaac agctgaagag 4020acgcagatat
accggctggg gaaggctctc ccgcaaattg atcaacggga tcagggacaa 4080gcagtcaggg
aagactatac tcgacttcct gaagtccgac ggattcgcca acaggaactt 4140catgcagctc
attcacgacg actccttgac cttcaaggag gacatccaga aggctcaggt 4200gtctggacag
ggtgactcct tgcatgagca cattgctaac ttggccggct ctcccgctat 4260taagaagggc
attttgcaga ccgtgaaggt cgttgacgag ctcgtgaagg tgatgggacg 4320ccacaagcca
gagaacatcg ttattgagat ggctcgcgag aaccaaacta cccagaaagg 4380gcagaagaat
tcccgcgaga ggatgaagcg cattgaggag ggcataaaag agcttggctc 4440tcagatcctc
aaggagcacc ccgtcgagaa cactcagctg cagaacgaga agctgtacct 4500gtactacctc
caaaacggaa gggacatgta cgtggaccag gagctggaca tcaacaggtt 4560gtccgactac
gacgtcgacc acatcgtgcc tcagtccttc ctgaaggatg actccatcga 4620caataaagtg
ctgacacgct ccgataaaaa tagaggcaag tccgacaacg tcccctccga 4680ggaggtcgtg
aagaagatga aaaactactg gagacagctc ttgaacgcca agctcatcac 4740ccagcgtaag
ttcgacaacc tgactaaggc tgagagagga ggattgtccg agctcgataa 4800ggccggattc
atcaagagac agctcgtcga aacccgccaa attaccaagc acgtggccca 4860aattctggat
tcccgcatga acaccaagta cgatgaaaat gacaagctga tccgcgaggt 4920caaggtgatc
accttgaagt ccaagctggt ctccgacttc cgcaaggact tccagttcta 4980caaggtgagg
gagatcaaca actaccacca cgcacacgac gcctacctca acgctgtcgt 5040tggaaccgcc
ctcatcaaaa aatatcctaa gctggagtct gagttcgtct acggcgacta 5100caaggtgtac
gacgtgagga agatgatcgc taagtctgag caggagatcg gcaaggccac 5160cgccaagtac
ttcttctact ccaacatcat gaacttcttc aagaccgaga tcactctcgc 5220caacggtgag
atcaggaagc gcccactgat cgagaccaac ggtgagactg gagagatcgt 5280gtgggacaaa
gggagggatt tcgctactgt gaggaaggtg ctctccatgc ctcaggtgaa 5340catcgtcaag
aagaccgaag ttcagaccgg aggattctcc aaggagtcca tcctccccaa 5400gagaaactcc
gacaagctga tcgctagaaa gaaagactgg gaccctaaga agtacggagg 5460cttcgattct
cctaccgtgg cctactctgt gctggtcgtg gccaaggtgg agaagggcaa 5520gtccaagaag
ctgaaatccg tcaaggagct cctcgggatt accatcatgg agaggagttc 5580cttcgagaag
aaccctatcg acttcctgga ggccaaggga tataaagagg tgaagaagga 5640cctcatcatc
aagctgccca agtactccct cttcgagttg gagaacggaa ggaagaggat 5700gctggcttct
gccggagagt tgcagaaggg aaatgagctc gcccttccct ccaagtacgt 5760gaacttcctg
tacctcgcct ctcactatga aaagttgaag ggctctcctg aggacaacga 5820gcagaagcag
ctcttcgtgg agcagcacaa gcactacctg gacgaaatta tcgagcagat 5880ctctgagttc
tccaagcgcg tgatattggc cgacgccaac ctcgacaagg tgctgtccgc 5940ctacaacaag
cacagggata agcccattcg cgagcaggct gaaaacatta tccacctgtt 6000taccctcaca
aacttgggag cccctgctgc cttcaagtac ttcgacacca ccattgacag 6060gaagagatac
acctccacca aggaggtgct cgacgcaaca ctcatccacc aatccatcac 6120cggcctctat
gaaacaagga ttgacttgtc ccagctggga ggcgactcta gagccgatcc 6180caagaagaag
agaaaggtgt aggttaacct agacttgtcc atcttctgga ttggccaact 6240taattaatgt
atgaaataaa aggatgcaca catagtgaca tgctaatcac tataatgtgg 6300gcatcaaagt
tgtgtgttat gtgtaattac tagttatctg aataaaagag aaagagatca 6360tccatatttc
ttatcctaaa tgaatgtcac gtgtctttat aattctttga tgaaccagat 6420gcatttcatt
aaccaaatcc atatacatat aaatattaat catatataat taatatcaat 6480tgggttagca
aaacaaatct agtctaggtg tgttttgcga attcgatatc aagcttatcg 6540ataccgtcga
gggggggccc ggtaccggcg cgccgttcta tagtgtcacc taaatcgtat 6600gtgtatgata
cataaggtta tgtattaatt gtagccgcgt tctaacgaca atatgtccat 6660atggtgcact
ctcagtacaa tctgctctga tgccgcatag ttaagccagc cccgacaccc 6720gccaacaccc
gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca 6780agctgtgacc
gtctccggga gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg 6840cgcgagacga
aagggcctcg tgatacgcct atttttatag gttaatgtca tgaccaaaat 6900cccttaacgt
gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc 6960ttcttgagat
cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct 7020accagcggtg
gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg 7080cttcagcaga
gcgcagatac caaatactgt ccttctagtg tagccgtagt taggccacca 7140cttcaagaac
tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc 7200tgctgccagt
ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga 7260taaggcgcag
cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac 7320gacctacacc
gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga 7380agggagaaag
gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag 7440ggagcttcca
gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg 7500acttgagcgt
cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag 7560caacgcggcc
tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc 7620tgcgttatcc
cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc 7680tcgccgcagc
cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc 7740aatacgcaaa
ccgcctctcc ccgcgcgttg gccgattcat taatgcaggt tgatcagatc 7800tcgatcccgc
gaaattaata cgactcacta tagggagacc acaacggttt ccctctagaa 7860ataattttgt
ttaactttaa gaaggagata tacccatgga aaagcctgaa ctcaccgcga 7920cgtctgtcga
gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct 7980cggagggcga
agaatctcgt gctttcagct tcgatgtagg agggcgtgga tatgtcctgc 8040gggtaaatag
ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat 8100cggccgcgct
cccgattccg gaagtgcttg acattgggga attcagcgag agcctgacct 8160attgcatctc
ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc 8220ccgctgttct
gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc 8280agacgagcgg
gttcggccca ttcggaccgc aaggaatcgg tcaatacact acatggcgtg 8340atttcatatg
cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca 8400ccgtcagtgc
gtccgtcgcg caggctctcg atgagctgat gctttgggcc gaggactgcc 8460ccgaagtccg
gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg 8520gccgcataac
agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg 8580tcgccaacat
cttcttctgg aggccgtggt tggcttgtat ggagcagcag acgcgctact 8640tcgagcggag
gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca 8700ttggtcttga
ccaactctat cagagcttgg ttgacggcaa tttcgatgat gcagcttggg 8760cgcagggtcg
atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa 8820tcgcccgcag
aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg 8880gaaaccgacg
ccccagcact cgtccgaggg caaaggaata gtgaggtaca gcttggatcg 8940atccggctgc
taacaaagcc cgaaaggaag ctgagttggc tgctgccacc gctgagcaat 9000aactagcata
accccttggg gcctctaaac gggtcttgag gggttttttg ctgaaaggag 9060gaactatatc
cggatgatcg ggcgcgccgg tac
9093
472 1357 DNA Artificial sequence RTW1013A
472 ctagaagata aaccctcccc
caaaacacaa attagaatga catttcaagt tccatgtatg 60tcactttcat tctattattt
ttacaacttt tagttactta acagatgtct tgttcagcat 120aaattataat ttattctgtt
tttttttagg gaacaactgt tgtagacaac ttgttgtata 180gtgaggatat tcattacatg
cttggtgcat taaggaccct tggactgcgt gtggaagatg 240acaaaacaac caaacaagca
attgttgaag gctgtggggg attgtttccc actagtaagg 300aatctaaaga tgaaatcaat
ttattccttg gaaatgctgg tattgcaatg agatctttga 360cagcagctgt tgttgctgca
ggtggaaatg caaggtctgt tttttttttt tttgttcagc 420ataatctttg aattgttcct
cgtataacta atcacaacag agtacgtgtt cttcttcctg 480ttataatcta aaaatctcat
ccagattagt catcctttct tcttaaaagg aacctttaat 540tatcaatgta tttatttaat
atttaaatta gcttgtcaaa gtctagcata tacatatttt 600gattatattc tgagaaatgc
acctgagggt gttcctcatg atctacttca acctctgtta 660ttattagatt ttctatcatg
attactggtt tgagtctcta agtagaccat cttgatgttc 720aaaatatttc agctacgtac
ttgatggggt gccccgaatg agagagaggc caattgggga 780tttggttgct ggtcttaagc
aacttggtgc agatgttgat tgctttcttg gcacaaactg 840tccacctgtt cgtgtaaatg
ggaagggagg acttcctggc ggaaaggtat ggtttggatt 900tcatttagaa taaggtggag
taactttcct ggatcaaaat tctaatttaa gaagcctccc 960tgttttcctc tctttagaat
aagactaagg gtaggtttag gagttgggtt ttggagagaa 1020atggaaggga gagcaatttt
tttcttcttc taataaatat tctttaattt gatacatttt 1080ttaagtaaaa gaatataaag
atagattagc ataacttaat gttttaatct tttatttatt 1140tttataaata ttatatacct
gtctatttaa aaatcaaata tttgtcctcc attccctttc 1200ccttcaaaac ctcagttcca
aatataccgt agttgaatta tattttggaa ggcctattgg 1260ttggagactt ttccttttca
gagattatcc ctcaccttta ttatagcctt tctattttta 1320aacttcatat agacgccatt
cttggggcgg ccgcgat 1357
473 1357 DNA Artificial sequence RTW1012A
473 ctagaagata aaccctcccc caaaacacaa attagaatga
catttcaagt tccatgtatg 60tcactttcat tctattattt ttacaacttt tagttactta
acagatgtct tgttcagcat 120aaattataat ttattctgtt tttttttagg gaacaactgt
tgtagacaac ttgttgtata 180gtgaggatat tcattacatg cttggtgcat taaggaccct
tggactgcgt gtggaagatg 240acaaaacaac caaacaagca attgttgaag gctgtggggg
attgtttccc actagtaagg 300aatctaaaga tgaaatcaat ttattccttg gaaatgctgg
tattgcaatg agatctttga 360cagcagctgt ggttgctgca ggtggaaatg caaggtctgt
tttttttttt tttgttcagc 420ataatctttg aattgttcct cgtataacta atcacaacag
agtacgtgtt cttcttcctg 480ttataatcta aaaatctcat ccagattagt catcctttct
tcttaaaagg aacctttaat 540tatcaatgta tttatttaat atttaaatta gcttgtcaaa
gtctagcata tacatatttt 600gattatattc tgagaaatgc acctgagggt gttcctcatg
atctacttca acctctgtta 660ttattagatt ttctatcatg attactggtt tgagtctcta
agtagaccat cttgatgttc 720aaaatatttc agctacgtac ttgatggggt gccccgaatg
agagagaggc caattgggga 780tttggttgct ggtcttaagc aacttggtgc agatgttgat
tgctttcttg gcacaaactg 840tccacctgtt cgtgtaaatg ggaagggagg acttcctggc
ggaaaggtat ggtttggatt 900tcatttagaa taaggtggag taactttcct ggatcaaaat
tctaatttaa gaagcctccc 960tgttttcctc tctttagaat aagactaagg gtaggtttag
gagttgggtt ttggagagaa 1020atggaaggga gagcaatttt tttcttcttc taataaatat
tctttaattt gatacatttt 1080ttaagtaaaa gaatataaag atagattagc ataacttaat
gttttaatct tttatttatt 1140tttataaata ttatatacct gtctatttaa aaatcaaata
tttgtcctcc attccctttc 1200ccttcaaaac ctcagttcca aatataccgt agttgaatta
tattttggaa ggcctattgg 1260ttggagactt ttccttttca gagattatcc ctcaccttta
ttatagcctt tctattttta 1320aacttcatat agacgccatt cttggggcgg ccgcgat
1357
474 30 DNA Artificial sequence primer, soy1-F1
474 ccactagtaa ggaatctaaa gatgaaatca 30
475 24 DNA Artificial sequence primer, soy1-R2
475 cctgcagcaa ccacagctgc tgtc 24
476 15 DNA Artificial sequence probe, soy1-T1(FAM-MGB)
476 ctgcaatgcg tcctt 15
477 19 DNA Artificial sequence primer, cas9-F
477 ccttcttcca ccgccttga 19
478 21 DNA Artificial sequence primer, Cas9-R
478 tgggtgtctc tcgtgctttt t 21
479 19 DNA Artificial sequence probe, Cas9-T(FAM-MGB)
479 aatcattcct ggtggagga 19
480 25 DNA Artificial sequence primer, pINII-99F
480 tgatgcccac attatagtga ttagc 25
481 22 DNA Artificial sequence primer, pINII-13R
481 catcttctgg attggccaac tt 22
482 17 DNA Artificial sequence probe, pINII-69T(FAM-MGB)
482 actatgtgtg catcctt 17
483 23 DNA Artificial sequence primer, SIP-130F
483 ttcaagttgg gctttttcag aag 23
484 22 DNA Artificial sequence primer, SIP-198R
484 tctccttggt gctctcatca ca 22
485 15 DNA Artificial sequence probe, SIP-170T(VIC-MGB)
485 ctgcagcaga accaa 15
486 24 DNA Artificial WOL569, Forward_primer
486 ggacccatta ggtgagagcg tggg 24
487 19 DNA Artificial WOL876, Reverse_primer
487 cagctgctgt caaagatct 19
488 29 DNA Artificial WOL570, Reverse_primer
488 tctaataata acagaggttg aagtagatc 29
489 4104 DNA Glycine max
489 atggacaaaa agtactcaat agggctcgac
atagggacta actccgttgg atgggccgtc 60atcaccgacg agtacaaggt gccctccaag
aagttcaagg tgttgggaaa caccgacagg 120cacagcataa agaagaattt gatcggtgcc
ctcctcttcg actccggaga gaccgctgag 180gctaccaggc tcaagaggac cgctagaagg
cgctacacca gaaggaagaa cagaatctgc 240tacctgcagg agatcttctc caacgagatg
gccaaggtgg acgactcctt cttccaccgc 300cttgaggaat cattcctggt ggaggaggat
aaaaagcacg agagacaccc aatcttcggg 360aacatcgtcg acgaggtggc ctaccatgaa
aagtacccta ccatctacca cctgaggaag 420aagctggtcg actctaccga caaggctgac
ttgcgcttga tttacctggc tctcgctcac 480atgataaagt tccgcggaca cttcctcatt
gagggagacc tgaacccaga caactccgac 540gtggacaagc tcttcatcca gctcgttcag
acctacaacc agcttttcga ggagaaccca 600atcaacgcca gtggagttga cgccaaggct
atcctctctg ctcgtctgtc aaagtccagg 660aggcttgaga acttgattgc ccagctgcct
ggcgaaaaga agaacggact gttcggaaac 720ttgatcgctc tctccctggg attgactccc
aacttcaagt ccaacttcga cctcgccgag 780gacgctaagt tgcagttgtc taaagacacc
tacgacgatg acctcgacaa cttgctggcc 840cagataggcg accaatacgc cgatctcttc
ctcgccgcta agaacttgtc cgacgcaatc 900ctgctgtccg acatcctgag agtcaacact
gagattacca aagctcctct gtctgcttcc 960atgattaagc gctacgacga gcaccaccaa
gatctgaccc tgctcaaggc cctggtgaga 1020cagcagctgc ccgagaagta caaggagatc
tttttcgacc agtccaagaa cggctacgcc 1080ggatacattg acggaggcgc ctcccaggaa
gagttctaca agttcatcaa gcccatcctt 1140gagaagatgg acggtaccga ggagctgttg
gtgaagttga acagagagga cctgttgagg 1200aagcagagaa ccttcgacaa cggaagcatc
cctcaccaaa tccacctggg agagctccac 1260gccatcttga ggaggcagga ggatttctat
cccttcctga aggacaaccg cgagaagatt 1320gagaagatct tgaccttcag aattccttac
tacgtcgggc cactcgccag aggaaactct 1380aggttcgcct ggatgacccg caaatctgaa
gagaccatta ctccctggaa cttcgaggaa 1440gtcgtggaca agggcgcttc cgctcagtct
ttcatcgaga ggatgaccaa cttcgataaa 1500aatctgccca acgagaaggt gctgcccaag
cactccctgt tgtacgagta tttcacagtg 1560tacaacgagc tcaccaaggt gaagtacgtc
acagagggaa tgaggaagcc tgccttcttg 1620tccggagagc agaagaaggc catcgtcgac
ctgctcttca agaccaacag gaaggtgact 1680gtcaagcagc tgaaggagga ctacttcaag
aagatcgagt gcttcgactc cgtcgagatc 1740tctggtgtcg aggacaggtt caacgcctcc
cttgggactt accacgatct gctcaagatt 1800attaaagaca aggacttcct ggacaacgag
gagaacgagg acatccttga ggacatcgtg 1860ctcaccctga ccttgttcga agacagggaa
atgatcgaag agaggctcaa gacctacgcc 1920cacctcttcg acgacaaggt gatgaaacag
ctgaagagac gcagatatac cggctgggga 1980aggctctccc gcaaattgat caacgggatc
agggacaagc agtcagggaa gactatactc 2040gacttcctga agtccgacgg attcgccaac
aggaacttca tgcagctcat tcacgacgac 2100tccttgacct tcaaggagga catccagaag
gctcaggtgt ctggacaggg tgactccttg 2160catgagcaca ttgctaactt ggccggctct
cccgctatta agaagggcat tttgcagacc 2220gtgaaggtcg ttgacgagct cgtgaaggtg
atgggacgcc acaagccaga gaacatcgtt 2280attgagatgg ctcgcgagaa ccaaactacc
cagaaagggc agaagaattc ccgcgagagg 2340atgaagcgca ttgaggaggg cataaaagag
cttggctctc agatcctcaa ggagcacccc 2400gtcgagaaca ctcagctgca gaacgagaag
ctgtacctgt actacctcca aaacggaagg 2460gacatgtacg tggaccagga gctggacatc
aacaggttgt ccgactacga cgtcgaccac 2520atcgtgcctc agtccttcct gaaggatgac
tccatcgaca ataaagtgct gacacgctcc 2580gataaaaata gaggcaagtc cgacaacgtc
ccctccgagg aggtcgtgaa gaagatgaaa 2640aactactgga gacagctctt gaacgccaag
ctcatcaccc agcgtaagtt cgacaacctg 2700actaaggctg agagaggagg attgtccgag
ctcgataagg ccggattcat caagagacag 2760ctcgtcgaaa cccgccaaat taccaagcac
gtggcccaaa ttctggattc ccgcatgaac 2820accaagtacg atgaaaatga caagctgatc
cgcgaggtca aggtgatcac cttgaagtcc 2880aagctggtct ccgacttccg caaggacttc
cagttctaca aggtgaggga gatcaacaac 2940taccaccacg cacacgacgc ctacctcaac
gctgtcgttg gaaccgccct catcaaaaaa 3000tatcctaagc tggagtctga gttcgtctac
ggcgactaca aggtgtacga cgtgaggaag 3060atgatcgcta agtctgagca ggagatcggc
aaggccaccg ccaagtactt cttctactcc 3120aacatcatga acttcttcaa gaccgagatc
actctcgcca acggtgagat caggaagcgc 3180ccactgatcg agaccaacgg tgagactgga
gagatcgtgt gggacaaagg gagggatttc 3240gctactgtga ggaaggtgct ctccatgcct
caggtgaaca tcgtcaagaa gaccgaagtt 3300cagaccggag gattctccaa ggagtccatc
ctccccaaga gaaactccga caagctgatc 3360gctagaaaga aagactggga ccctaagaag
tacggaggct tcgattctcc taccgtggcc 3420tactctgtgc tggtcgtggc caaggtggag
aagggcaagt ccaagaagct gaaatccgtc 3480aaggagctcc tcgggattac catcatggag
aggagttcct tcgagaagaa ccctatcgac 3540ttcctggagg ccaagggata taaagaggtg
aagaaggacc tcatcatcaa gctgcccaag 3600tactccctct tcgagttgga gaacggaagg
aagaggatgc tggcttctgc cggagagttg 3660cagaagggaa atgagctcgc ccttccctcc
aagtacgtga acttcctgta cctcgcctct 3720cactatgaaa agttgaaggg ctctcctgag
gacaacgagc agaagcagct cttcgtggag 3780cagcacaagc actacctgga cgaaattatc
gagcagatct ctgagttctc caagcgcgtg 3840atattggccg acgccaacct cgacaaggtg
ctgtccgcct acaacaagca cagggataag 3900cccattcgcg agcaggctga aaacattatc
cacctgttta ccctcacaaa cttgggagcc 3960cctgctgcct tcaagtactt cgacaccacc
attgacagga agagatacac ctccaccaag 4020gaggtgctcg acgcaacact catccaccaa
tccatcaccg gcctctatga aacaaggatt 4080gacttgtccc agctgggagg cgac
410449023DNAGlycine max 490gtttgtttgt
tgttgggtgt ggg
2349122DNAGlycine max 491tgttgttggg tgtgggaata gg
224929174DNAArtificial sequenceRTW1199 492ccgggtgtga
tttagtataa agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata
agtaatattg aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa
ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc agagcagagt 180catcatgaag
ctagaaaggc taccgataga taaactatag ttaattaaat acattaaaaa 240atacttggat
ctttctctta ccctgtttat attgagacct gaaacttgag agagatacac 300taatcttgcc
ttgttgtttc attccctaac ttacaggact cagcgcatgt catgtggtct 360cgttccccat
ttaagtccca caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca
acaaagcttg tttgtttgtt gttgggtgtg ttttagagct agaaatagca 480agttaaaata
aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc ggtgcttttt 540tttgcggccg
caattggatc gggtttactt attttgtggg tatctatact tttattagat 600ttttaatcag
gctcctgatt tctttttatt tcgattgaat tcctgaactt gtattattca 660gtagatcgaa
taaattataa aaagataaaa tcataaaata atattttatc ctatcaatca 720tattaaagca
atgaatatgt aaaattaatc ttatctttat tttaaaaaat catataggtt 780tagtattttt
ttaaaaataa agataggatt agttttacta ttcactgctt attactttta 840aaaaaatcat
aaaggtttag tattttttta aaataaatat aggaatagtt ttactattca 900ctgctttaat
agaaaaatag tttaaaattt aagatagttt taatcccagc atttgccacg 960tttgaacgtg
agccgaaacg atgtcgttac attatcttaa cctagctgaa acgatgtcgt 1020cataatatcg
ccaaatgcca actggactac gtcgaaccca caaatcccac aaagcgcgtg 1080aaatcaaatc
gctcaaacca caaaaaagaa caacgcgttt gttacacgct caatcccacg 1140cgagtagagc
acagtaacct tcaaataagc gaatggggca taatcagaaa tccgaaataa 1200acctaggggc
attatcggaa atgaaaagta gctcactcaa tataaaaatc taggaaccct 1260agttttcgtt
atcactctgt gctccctcgc tctatttctc agtctctgtg tttgcggctg 1320aggattccga
acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc tgctcttgtc 1380tcttcgattc
gatctatgcc tgtctcttat ttacgatgat gtttcttcgg ttatgttttt 1440ttatttatgc
tttatgctgt tgatgttcgg ttgtttgttt cgctttgttt ttgtggttca 1500gttttttagg
attcttttgg tttttgaatc gattaatcgg aagagatttt cgagttattt 1560ggtgtgttgg
aggtgaatct tttttttgag gtcatagatc tgttgtattt gtgttataaa 1620catgcgactt
tgtatgattt tttacgaggt tatgatgttc tggttgtttt attatgaatc 1680tgttgagaca
gaaccatgat ttttgttgat gttcgtttac actattaaag gtttgtttta 1740acaggattaa
aagtttttta agcatgttga aggagtcttg tagatatgta accgtcgata 1800gtttttttgt
gggtttgttc acatgttatc aagcttaatc ttttactatg tatgcgacca 1860tatctggatc
cagcaaaggc gattttttaa ttccttgtga aacttttgta atatgaagtt 1920gaaattttgt
tattggtaaa ctataaatgt gtgaagttgg agtatacctt taccttctta 1980tttggctttg
tgatagttta atttatatgt attttgagtt ctgacttgta tttctttgaa 2040ttgattctag
tttaagtaat ccatggcacc gaagaagaag cgcaaggtga tggacaaaaa 2100gtactcaata
gggctcgaca tagggactaa ctccgttgga tgggccgtca tcaccgacga 2160gtacaaggtg
ccctccaaga agttcaaggt gttgggaaac accgacaggc acagcataaa 2220gaagaatttg
atcggtgccc tcctcttcga ctccggagag accgctgagg ctaccaggct 2280caagaggacc
gctagaaggc gctacaccag aaggaagaac agaatctgct acctgcagga 2340gatcttctcc
aacgagatgg ccaaggtgga cgactccttc ttccaccgcc ttgaggaatc 2400attcctggtg
gaggaggata aaaagcacga gagacaccca atcttcggga acatcgtcga 2460cgaggtggcc
taccatgaaa agtaccctac catctaccac ctgaggaaga agctggtcga 2520ctctaccgac
aaggctgact tgcgcttgat ttacctggct ctcgctcaca tgataaagtt 2580ccgcggacac
ttcctcattg agggagacct gaacccagac aactccgacg tggacaagct 2640cttcatccag
ctcgttcaga cctacaacca gcttttcgag gagaacccaa tcaacgccag 2700tggagttgac
gccaaggcta tcctctctgc tcgtctgtca aagtccagga ggcttgagaa 2760cttgattgcc
cagctgcctg gcgaaaagaa gaacggactg ttcggaaact tgatcgctct 2820ctccctggga
ttgactccca acttcaagtc caacttcgac ctcgccgagg acgctaagtt 2880gcagttgtct
aaagacacct acgacgatga cctcgacaac ttgctggccc agataggcga 2940ccaatacgcc
gatctcttcc tcgccgctaa gaacttgtcc gacgcaatcc tgctgtccga 3000catcctgaga
gtcaacactg agattaccaa agctcctctg tctgcttcca tgattaagcg 3060ctacgacgag
caccaccaag atctgaccct gctcaaggcc ctggtgagac agcagctgcc 3120cgagaagtac
aaggagatct ttttcgacca gtccaagaac ggctacgccg gatacattga 3180cggaggcgcc
tcccaggaag agttctacaa gttcatcaag cccatccttg agaagatgga 3240cggtaccgag
gagctgttgg tgaagttgaa cagagaggac ctgttgagga agcagagaac 3300cttcgacaac
ggaagcatcc ctcaccaaat ccacctggga gagctccacg ccatcttgag 3360gaggcaggag
gatttctatc ccttcctgaa ggacaaccgc gagaagattg agaagatctt 3420gaccttcaga
attccttact acgtcgggcc actcgccaga ggaaactcta ggttcgcctg 3480gatgacccgc
aaatctgaag agaccattac tccctggaac ttcgaggaag tcgtggacaa 3540gggcgcttcc
gctcagtctt tcatcgagag gatgaccaac ttcgataaaa atctgcccaa 3600cgagaaggtg
ctgcccaagc actccctgtt gtacgagtat ttcacagtgt acaacgagct 3660caccaaggtg
aagtacgtca cagagggaat gaggaagcct gccttcttgt ccggagagca 3720gaagaaggcc
atcgtcgacc tgctcttcaa gaccaacagg aaggtgactg tcaagcagct 3780gaaggaggac
tacttcaaga agatcgagtg cttcgactcc gtcgagatct ctggtgtcga 3840ggacaggttc
aacgcctccc ttgggactta ccacgatctg ctcaagatta ttaaagacaa 3900ggacttcctg
gacaacgagg agaacgagga catccttgag gacatcgtgc tcaccctgac 3960cttgttcgaa
gacagggaaa tgatcgaaga gaggctcaag acctacgccc acctcttcga 4020cgacaaggtg
atgaaacagc tgaagagacg cagatatacc ggctggggaa ggctctcccg 4080caaattgatc
aacgggatca gggacaagca gtcagggaag actatactcg acttcctgaa 4140gtccgacgga
ttcgccaaca ggaacttcat gcagctcatt cacgacgact ccttgacctt 4200caaggaggac
atccagaagg ctcaggtgtc tggacagggt gactccttgc atgagcacat 4260tgctaacttg
gccggctctc ccgctattaa gaagggcatt ttgcagaccg tgaaggtcgt 4320tgacgagctc
gtgaaggtga tgggacgcca caagccagag aacatcgtta ttgagatggc 4380tcgcgagaac
caaactaccc agaaagggca gaagaattcc cgcgagagga tgaagcgcat 4440tgaggagggc
ataaaagagc ttggctctca gatcctcaag gagcaccccg tcgagaacac 4500tcagctgcag
aacgagaagc tgtacctgta ctacctccaa aacggaaggg acatgtacgt 4560ggaccaggag
ctggacatca acaggttgtc cgactacgac gtcgaccaca tcgtgcctca 4620gtccttcctg
aaggatgact ccatcgacaa taaagtgctg acacgctccg ataaaaatag 4680aggcaagtcc
gacaacgtcc cctccgagga ggtcgtgaag aagatgaaaa actactggag 4740acagctcttg
aacgccaagc tcatcaccca gcgtaagttc gacaacctga ctaaggctga 4800gagaggagga
ttgtccgagc tcgataaggc cggattcatc aagagacagc tcgtcgaaac 4860ccgccaaatt
accaagcacg tggcccaaat tctggattcc cgcatgaaca ccaagtacga 4920tgaaaatgac
aagctgatcc gcgaggtcaa ggtgatcacc ttgaagtcca agctggtctc 4980cgacttccgc
aaggacttcc agttctacaa ggtgagggag atcaacaact accaccacgc 5040acacgacgcc
tacctcaacg ctgtcgttgg aaccgccctc atcaaaaaat atcctaagct 5100ggagtctgag
ttcgtctacg gcgactacaa ggtgtacgac gtgaggaaga tgatcgctaa 5160gtctgagcag
gagatcggca aggccaccgc caagtacttc ttctactcca acatcatgaa 5220cttcttcaag
accgagatca ctctcgccaa cggtgagatc aggaagcgcc cactgatcga 5280gaccaacggt
gagactggag agatcgtgtg ggacaaaggg agggatttcg ctactgtgag 5340gaaggtgctc
tccatgcctc aggtgaacat cgtcaagaag accgaagttc agaccggagg 5400attctccaag
gagtccatcc tccccaagag aaactccgac aagctgatcg ctagaaagaa 5460agactgggac
cctaagaagt acggaggctt cgattctcct accgtggcct actctgtgct 5520ggtcgtggcc
aaggtggaga agggcaagtc caagaagctg aaatccgtca aggagctcct 5580cgggattacc
atcatggaga ggagttcctt cgagaagaac cctatcgact tcctggaggc 5640caagggatat
aaagaggtga agaaggacct catcatcaag ctgcccaagt actccctctt 5700cgagttggag
aacggaagga agaggatgct ggcttctgcc ggagagttgc agaagggaaa 5760tgagctcgcc
cttccctcca agtacgtgaa cttcctgtac ctcgcctctc actatgaaaa 5820gttgaagggc
tctcctgagg acaacgagca gaagcagctc ttcgtggagc agcacaagca 5880ctacctggac
gaaattatcg agcagatctc tgagttctcc aagcgcgtga tattggccga 5940cgccaacctc
gacaaggtgc tgtccgccta caacaagcac agggataagc ccattcgcga 6000gcaggctgaa
aacattatcc acctgtttac cctcacaaac ttgggagccc ctgctgcctt 6060caagtacttc
gacaccacca ttgacaggaa gagatacacc tccaccaagg aggtgctcga 6120cgcaacactc
atccaccaat ccatcaccgg cctctatgaa acaaggattg acttgtccca 6180gctgggaggc
gactctagag ccgatcccaa gaagaagaga aaggtgaaga gaccacggga 6240ccgccacgat
ggcgagctgg gaggccgcaa gcgggcaagg taggttaacc tagacttgtc 6300catcttctgg
attggccaac ttaattaatg tatgaaataa aaggatgcac acatagtgac 6360atgctaatca
ctataatgtg ggcatcaaag ttgtgtgtta tgtgtaatta ctagttatct 6420gaataaaaga
gaaagagatc atccatattt cttatcctaa atgaatgtca cgtgtcttta 6480taattctttg
atgaaccaga tgcatttcat taaccaaatc catatacata taaatattaa 6540tcatatataa
ttaatatcaa ttgggttagc aaaacaaatc tagtctaggt gtgttttgcg 6600aattcgatat
caagcttatc gataccgtcg agggggggcc cggtaccggc gcgccgttct 6660atagtgtcac
ctaaatcgta tgtgtatgat acataaggtt atgtattaat tgtagccgcg 6720ttctaacgac
aatatgtcca tatggtgcac tctcagtaca atctgctctg atgccgcata 6780gttaagccag
ccccgacacc cgccaacacc cgctgacgcg ccctgacggg cttgtctgct 6840cccggcatcc
gcttacagac aagctgtgac cgtctccggg agctgcatgt gtcagaggtt 6900ttcaccgtca
tcaccgaaac gcgcgagacg aaagggcctc gtgatacgcc tatttttata 6960ggttaatgtc
atgaccaaaa tcccttaacg tgagttttcg ttccactgag cgtcagaccc 7020cgtagaaaag
atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt 7080gcaaacaaaa
aaaccaccgc taccagcggt ggtttgtttg ccggatcaag agctaccaac 7140tctttttccg
aaggtaactg gcttcagcag agcgcagata ccaaatactg ttcttctagt 7200gtagccgtag
ttaggccacc acttcaagaa ctctgtagca ccgcctacat acctcgctct 7260gctaatcctg
ttaccagtgg ctgctgccag tggcgataag tcgtgtctta ccgggttgga 7320ctcaagacga
tagttaccgg ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac 7380acagcccagc
ttggagcgaa cgacctacac cgaactgaga tacctacagc gtgagctatg 7440agaaagcgcc
acgcttcccg aagggagaaa ggcggacagg tatccggtaa gcggcagggt 7500cggaacagga
gagcgcacga gggagcttcc agggggaaac gcctggtatc tttatagtcc 7560tgtcgggttt
cgccacctct gacttgagcg tcgatttttg tgatgctcgt caggggggcg 7620gagcctatgg
aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct tttgctggcc 7680ttttgctcac
atgttctttc ctgcgttatc ccctgattct gtggataacc gtattaccgc 7740ctttgagtga
gctgataccg ctcgccgcag ccgaacgacc gagcgcagcg agtcagtgag 7800cgaggaagcg
gaagagcgcc caatacgcaa accgcctctc cccgcgcgtt ggccgattca 7860ttaatgcagg
ttgatcagat ctcgatcccg cgaaattaat acgactcact atagggagac 7920cacaacggtt
tccctctaga aataattttg tttaacttta agaaggagat atacccatgg 7980aaaagcctga
actcaccgcg acgtctgtcg agaagtttct gatcgaaaag ttcgacagcg 8040tctccgacct
gatgcagctc tcggagggcg aagaatctcg tgctttcagc ttcgatgtag 8100gagggcgtgg
atatgtcctg cgggtaaata gctgcgccga tggtttctac aaagatcgtt 8160atgtttatcg
gcactttgca tcggccgcgc tcccgattcc ggaagtgctt gacattgggg 8220aattcagcga
gagcctgacc tattgcatct cccgccgtgc acagggtgtc acgttgcaag 8280acctgcctga
aaccgaactg cccgctgttc tgcagccggt cgcggaggct atggatgcga 8340tcgctgcggc
cgatcttagc cagacgagcg ggttcggccc attcggaccg caaggaatcg 8400gtcaatacac
tacatggcgt gatttcatat gcgcgattgc tgatccccat gtgtatcact 8460ggcaaactgt
gatggacgac accgtcagtg cgtccgtcgc gcaggctctc gatgagctga 8520tgctttgggc
cgaggactgc cccgaagtcc ggcacctcgt gcacgcggat ttcggctcca 8580acaatgtcct
gacggacaat ggccgcataa cagcggtcat tgactggagc gaggcgatgt 8640tcggggattc
ccaatacgag gtcgccaaca tcttcttctg gaggccgtgg ttggcttgta 8700tggagcagca
gacgcgctac ttcgagcgga ggcatccgga gcttgcagga tcgccgcggc 8760tccgggcgta
tatgctccgc attggtcttg accaactcta tcagagcttg gttgacggca 8820atttcgatga
tgcagcttgg gcgcagggtc gatgcgacgc aatcgtccga tccggagccg 8880ggactgtcgg
gcgtacacaa atcgcccgca gaagcgcggc cgtctggacc gatggctgtg 8940tagaagtact
cgccgatagt ggaaaccgac gccccagcac tcgtccgagg gcaaaggaat 9000agtgaggtac
agcttggatc gatccggctg ctaacaaagc ccgaaaggaa gctgagttgg 9060ctgctgccac
cgctgagcaa taactagcat aaccccttgg ggcctctaaa cgggtcttga 9120ggggtttttt
gctgaaagga ggaactatat ccggatgctc gggcgcgccg gtac
91744939174DNAArtificial sequenceRTW1200 493ccgggtgtga tttagtataa
agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata agtaatattg
aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa ggaagaaaaa
aaacaaacaa aaaataggtt gcaatggggc agagcagagt 180catcatgaag ctagaaaggc
taccgataga taaactatag ttaattaaat acattaaaaa 240atacttggat ctttctctta
ccctgtttat attgagacct gaaacttgag agagatacac 300taatcttgcc ttgttgtttc
attccctaac ttacaggact cagcgcatgt catgtggtct 360cgttccccat ttaagtccca
caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca acaaagcttg
tgttgttggg tgtgggaatg ttttagagct agaaatagca 480agttaaaata aggctagtcc
gttatcaact tgaaaaagtg gcaccgagtc ggtgcttttt 540tttgcggccg caattggatc
gggtttactt attttgtggg tatctatact tttattagat 600ttttaatcag gctcctgatt
tctttttatt tcgattgaat tcctgaactt gtattattca 660gtagatcgaa taaattataa
aaagataaaa tcataaaata atattttatc ctatcaatca 720tattaaagca atgaatatgt
aaaattaatc ttatctttat tttaaaaaat catataggtt 780tagtattttt ttaaaaataa
agataggatt agttttacta ttcactgctt attactttta 840aaaaaatcat aaaggtttag
tattttttta aaataaatat aggaatagtt ttactattca 900ctgctttaat agaaaaatag
tttaaaattt aagatagttt taatcccagc atttgccacg 960tttgaacgtg agccgaaacg
atgtcgttac attatcttaa cctagctgaa acgatgtcgt 1020cataatatcg ccaaatgcca
actggactac gtcgaaccca caaatcccac aaagcgcgtg 1080aaatcaaatc gctcaaacca
caaaaaagaa caacgcgttt gttacacgct caatcccacg 1140cgagtagagc acagtaacct
tcaaataagc gaatggggca taatcagaaa tccgaaataa 1200acctaggggc attatcggaa
atgaaaagta gctcactcaa tataaaaatc taggaaccct 1260agttttcgtt atcactctgt
gctccctcgc tctatttctc agtctctgtg tttgcggctg 1320aggattccga acgagtgacc
ttcttcgttt ctcgcaaagg taacagcctc tgctcttgtc 1380tcttcgattc gatctatgcc
tgtctcttat ttacgatgat gtttcttcgg ttatgttttt 1440ttatttatgc tttatgctgt
tgatgttcgg ttgtttgttt cgctttgttt ttgtggttca 1500gttttttagg attcttttgg
tttttgaatc gattaatcgg aagagatttt cgagttattt 1560ggtgtgttgg aggtgaatct
tttttttgag gtcatagatc tgttgtattt gtgttataaa 1620catgcgactt tgtatgattt
tttacgaggt tatgatgttc tggttgtttt attatgaatc 1680tgttgagaca gaaccatgat
ttttgttgat gttcgtttac actattaaag gtttgtttta 1740acaggattaa aagtttttta
agcatgttga aggagtcttg tagatatgta accgtcgata 1800gtttttttgt gggtttgttc
acatgttatc aagcttaatc ttttactatg tatgcgacca 1860tatctggatc cagcaaaggc
gattttttaa ttccttgtga aacttttgta atatgaagtt 1920gaaattttgt tattggtaaa
ctataaatgt gtgaagttgg agtatacctt taccttctta 1980tttggctttg tgatagttta
atttatatgt attttgagtt ctgacttgta tttctttgaa 2040ttgattctag tttaagtaat
ccatggcacc gaagaagaag cgcaaggtga tggacaaaaa 2100gtactcaata gggctcgaca
tagggactaa ctccgttgga tgggccgtca tcaccgacga 2160gtacaaggtg ccctccaaga
agttcaaggt gttgggaaac accgacaggc acagcataaa 2220gaagaatttg atcggtgccc
tcctcttcga ctccggagag accgctgagg ctaccaggct 2280caagaggacc gctagaaggc
gctacaccag aaggaagaac agaatctgct acctgcagga 2340gatcttctcc aacgagatgg
ccaaggtgga cgactccttc ttccaccgcc ttgaggaatc 2400attcctggtg gaggaggata
aaaagcacga gagacaccca atcttcggga acatcgtcga 2460cgaggtggcc taccatgaaa
agtaccctac catctaccac ctgaggaaga agctggtcga 2520ctctaccgac aaggctgact
tgcgcttgat ttacctggct ctcgctcaca tgataaagtt 2580ccgcggacac ttcctcattg
agggagacct gaacccagac aactccgacg tggacaagct 2640cttcatccag ctcgttcaga
cctacaacca gcttttcgag gagaacccaa tcaacgccag 2700tggagttgac gccaaggcta
tcctctctgc tcgtctgtca aagtccagga ggcttgagaa 2760cttgattgcc cagctgcctg
gcgaaaagaa gaacggactg ttcggaaact tgatcgctct 2820ctccctggga ttgactccca
acttcaagtc caacttcgac ctcgccgagg acgctaagtt 2880gcagttgtct aaagacacct
acgacgatga cctcgacaac ttgctggccc agataggcga 2940ccaatacgcc gatctcttcc
tcgccgctaa gaacttgtcc gacgcaatcc tgctgtccga 3000catcctgaga gtcaacactg
agattaccaa agctcctctg tctgcttcca tgattaagcg 3060ctacgacgag caccaccaag
atctgaccct gctcaaggcc ctggtgagac agcagctgcc 3120cgagaagtac aaggagatct
ttttcgacca gtccaagaac ggctacgccg gatacattga 3180cggaggcgcc tcccaggaag
agttctacaa gttcatcaag cccatccttg agaagatgga 3240cggtaccgag gagctgttgg
tgaagttgaa cagagaggac ctgttgagga agcagagaac 3300cttcgacaac ggaagcatcc
ctcaccaaat ccacctggga gagctccacg ccatcttgag 3360gaggcaggag gatttctatc
ccttcctgaa ggacaaccgc gagaagattg agaagatctt 3420gaccttcaga attccttact
acgtcgggcc actcgccaga ggaaactcta ggttcgcctg 3480gatgacccgc aaatctgaag
agaccattac tccctggaac ttcgaggaag tcgtggacaa 3540gggcgcttcc gctcagtctt
tcatcgagag gatgaccaac ttcgataaaa atctgcccaa 3600cgagaaggtg ctgcccaagc
actccctgtt gtacgagtat ttcacagtgt acaacgagct 3660caccaaggtg aagtacgtca
cagagggaat gaggaagcct gccttcttgt ccggagagca 3720gaagaaggcc atcgtcgacc
tgctcttcaa gaccaacagg aaggtgactg tcaagcagct 3780gaaggaggac tacttcaaga
agatcgagtg cttcgactcc gtcgagatct ctggtgtcga 3840ggacaggttc aacgcctccc
ttgggactta ccacgatctg ctcaagatta ttaaagacaa 3900ggacttcctg gacaacgagg
agaacgagga catccttgag gacatcgtgc tcaccctgac 3960cttgttcgaa gacagggaaa
tgatcgaaga gaggctcaag acctacgccc acctcttcga 4020cgacaaggtg atgaaacagc
tgaagagacg cagatatacc ggctggggaa ggctctcccg 4080caaattgatc aacgggatca
gggacaagca gtcagggaag actatactcg acttcctgaa 4140gtccgacgga ttcgccaaca
ggaacttcat gcagctcatt cacgacgact ccttgacctt 4200caaggaggac atccagaagg
ctcaggtgtc tggacagggt gactccttgc atgagcacat 4260tgctaacttg gccggctctc
ccgctattaa gaagggcatt ttgcagaccg tgaaggtcgt 4320tgacgagctc gtgaaggtga
tgggacgcca caagccagag aacatcgtta ttgagatggc 4380tcgcgagaac caaactaccc
agaaagggca gaagaattcc cgcgagagga tgaagcgcat 4440tgaggagggc ataaaagagc
ttggctctca gatcctcaag gagcaccccg tcgagaacac 4500tcagctgcag aacgagaagc
tgtacctgta ctacctccaa aacggaaggg acatgtacgt 4560ggaccaggag ctggacatca
acaggttgtc cgactacgac gtcgaccaca tcgtgcctca 4620gtccttcctg aaggatgact
ccatcgacaa taaagtgctg acacgctccg ataaaaatag 4680aggcaagtcc gacaacgtcc
cctccgagga ggtcgtgaag aagatgaaaa actactggag 4740acagctcttg aacgccaagc
tcatcaccca gcgtaagttc gacaacctga ctaaggctga 4800gagaggagga ttgtccgagc
tcgataaggc cggattcatc aagagacagc tcgtcgaaac 4860ccgccaaatt accaagcacg
tggcccaaat tctggattcc cgcatgaaca ccaagtacga 4920tgaaaatgac aagctgatcc
gcgaggtcaa ggtgatcacc ttgaagtcca agctggtctc 4980cgacttccgc aaggacttcc
agttctacaa ggtgagggag atcaacaact accaccacgc 5040acacgacgcc tacctcaacg
ctgtcgttgg aaccgccctc atcaaaaaat atcctaagct 5100ggagtctgag ttcgtctacg
gcgactacaa ggtgtacgac gtgaggaaga tgatcgctaa 5160gtctgagcag gagatcggca
aggccaccgc caagtacttc ttctactcca acatcatgaa 5220cttcttcaag accgagatca
ctctcgccaa cggtgagatc aggaagcgcc cactgatcga 5280gaccaacggt gagactggag
agatcgtgtg ggacaaaggg agggatttcg ctactgtgag 5340gaaggtgctc tccatgcctc
aggtgaacat cgtcaagaag accgaagttc agaccggagg 5400attctccaag gagtccatcc
tccccaagag aaactccgac aagctgatcg ctagaaagaa 5460agactgggac cctaagaagt
acggaggctt cgattctcct accgtggcct actctgtgct 5520ggtcgtggcc aaggtggaga
agggcaagtc caagaagctg aaatccgtca aggagctcct 5580cgggattacc atcatggaga
ggagttcctt cgagaagaac cctatcgact tcctggaggc 5640caagggatat aaagaggtga
agaaggacct catcatcaag ctgcccaagt actccctctt 5700cgagttggag aacggaagga
agaggatgct ggcttctgcc ggagagttgc agaagggaaa 5760tgagctcgcc cttccctcca
agtacgtgaa cttcctgtac ctcgcctctc actatgaaaa 5820gttgaagggc tctcctgagg
acaacgagca gaagcagctc ttcgtggagc agcacaagca 5880ctacctggac gaaattatcg
agcagatctc tgagttctcc aagcgcgtga tattggccga 5940cgccaacctc gacaaggtgc
tgtccgccta caacaagcac agggataagc ccattcgcga 6000gcaggctgaa aacattatcc
acctgtttac cctcacaaac ttgggagccc ctgctgcctt 6060caagtacttc gacaccacca
ttgacaggaa gagatacacc tccaccaagg aggtgctcga 6120cgcaacactc atccaccaat
ccatcaccgg cctctatgaa acaaggattg acttgtccca 6180gctgggaggc gactctagag
ccgatcccaa gaagaagaga aaggtgaaga gaccacggga 6240ccgccacgat ggcgagctgg
gaggccgcaa gcgggcaagg taggttaacc tagacttgtc 6300catcttctgg attggccaac
ttaattaatg tatgaaataa aaggatgcac acatagtgac 6360atgctaatca ctataatgtg
ggcatcaaag ttgtgtgtta tgtgtaatta ctagttatct 6420gaataaaaga gaaagagatc
atccatattt cttatcctaa atgaatgtca cgtgtcttta 6480taattctttg atgaaccaga
tgcatttcat taaccaaatc catatacata taaatattaa 6540tcatatataa ttaatatcaa
ttgggttagc aaaacaaatc tagtctaggt gtgttttgcg 6600aattcgatat caagcttatc
gataccgtcg agggggggcc cggtaccggc gcgccgttct 6660atagtgtcac ctaaatcgta
tgtgtatgat acataaggtt atgtattaat tgtagccgcg 6720ttctaacgac aatatgtcca
tatggtgcac tctcagtaca atctgctctg atgccgcata 6780gttaagccag ccccgacacc
cgccaacacc cgctgacgcg ccctgacggg cttgtctgct 6840cccggcatcc gcttacagac
aagctgtgac cgtctccggg agctgcatgt gtcagaggtt 6900ttcaccgtca tcaccgaaac
gcgcgagacg aaagggcctc gtgatacgcc tatttttata 6960ggttaatgtc atgaccaaaa
tcccttaacg tgagttttcg ttccactgag cgtcagaccc 7020cgtagaaaag atcaaaggat
cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt 7080gcaaacaaaa aaaccaccgc
taccagcggt ggtttgtttg ccggatcaag agctaccaac 7140tctttttccg aaggtaactg
gcttcagcag agcgcagata ccaaatactg ttcttctagt 7200gtagccgtag ttaggccacc
acttcaagaa ctctgtagca ccgcctacat acctcgctct 7260gctaatcctg ttaccagtgg
ctgctgccag tggcgataag tcgtgtctta ccgggttgga 7320ctcaagacga tagttaccgg
ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac 7380acagcccagc ttggagcgaa
cgacctacac cgaactgaga tacctacagc gtgagctatg 7440agaaagcgcc acgcttcccg
aagggagaaa ggcggacagg tatccggtaa gcggcagggt 7500cggaacagga gagcgcacga
gggagcttcc agggggaaac gcctggtatc tttatagtcc 7560tgtcgggttt cgccacctct
gacttgagcg tcgatttttg tgatgctcgt caggggggcg 7620gagcctatgg aaaaacgcca
gcaacgcggc ctttttacgg ttcctggcct tttgctggcc 7680ttttgctcac atgttctttc
ctgcgttatc ccctgattct gtggataacc gtattaccgc 7740ctttgagtga gctgataccg
ctcgccgcag ccgaacgacc gagcgcagcg agtcagtgag 7800cgaggaagcg gaagagcgcc
caatacgcaa accgcctctc cccgcgcgtt ggccgattca 7860ttaatgcagg ttgatcagat
ctcgatcccg cgaaattaat acgactcact atagggagac 7920cacaacggtt tccctctaga
aataattttg tttaacttta agaaggagat atacccatgg 7980aaaagcctga actcaccgcg
acgtctgtcg agaagtttct gatcgaaaag ttcgacagcg 8040tctccgacct gatgcagctc
tcggagggcg aagaatctcg tgctttcagc ttcgatgtag 8100gagggcgtgg atatgtcctg
cgggtaaata gctgcgccga tggtttctac aaagatcgtt 8160atgtttatcg gcactttgca
tcggccgcgc tcccgattcc ggaagtgctt gacattgggg 8220aattcagcga gagcctgacc
tattgcatct cccgccgtgc acagggtgtc acgttgcaag 8280acctgcctga aaccgaactg
cccgctgttc tgcagccggt cgcggaggct atggatgcga 8340tcgctgcggc cgatcttagc
cagacgagcg ggttcggccc attcggaccg caaggaatcg 8400gtcaatacac tacatggcgt
gatttcatat gcgcgattgc tgatccccat gtgtatcact 8460ggcaaactgt gatggacgac
accgtcagtg cgtccgtcgc gcaggctctc gatgagctga 8520tgctttgggc cgaggactgc
cccgaagtcc ggcacctcgt gcacgcggat ttcggctcca 8580acaatgtcct gacggacaat
ggccgcataa cagcggtcat tgactggagc gaggcgatgt 8640tcggggattc ccaatacgag
gtcgccaaca tcttcttctg gaggccgtgg ttggcttgta 8700tggagcagca gacgcgctac
ttcgagcgga ggcatccgga gcttgcagga tcgccgcggc 8760tccgggcgta tatgctccgc
attggtcttg accaactcta tcagagcttg gttgacggca 8820atttcgatga tgcagcttgg
gcgcagggtc gatgcgacgc aatcgtccga tccggagccg 8880ggactgtcgg gcgtacacaa
atcgcccgca gaagcgcggc cgtctggacc gatggctgtg 8940tagaagtact cgccgatagt
ggaaaccgac gccccagcac tcgtccgagg gcaaaggaat 9000agtgaggtac agcttggatc
gatccggctg ctaacaaagc ccgaaaggaa gctgagttgg 9060ctgctgccac cgctgagcaa
taactagcat aaccccttgg ggcctctaaa cgggtcttga 9120ggggtttttt gctgaaagga
ggaactatat ccggatgctc gggcgcgccg gtac 91744943175DNAArtificial
sequenceRTW1190A 494cgaattctac aggtcactaa taccatctaa gtagttggtt
catagtgact gcatatgtaa 60aaattatcct tattttaagg aaattaaaaa ttatcatata
tatataagtt ttaaattaat 120tatcttatat atgtaccaaa aagttttaaa gcaattatta
taaaaattaa taaatttatc 180atataaaata atttataatt aaattttaaa ttatcaattc
attaaattaa attatttaaa 240atttttgaat gataatataa taattttatc ctctactaag
tcccaacgtt tcctatttta 300ttccactttt agcaataaat tttgtcataa acacttataa
caaaaaaagt aagtaaaaaa 360taaaaaaaag tttttcaata aagtataaac taatttgtat
aaacttttag aaaaaataaa 420gttatacatt gataatataa attttttaca taattatccg
atcaactcat tatatatgat 480aaatttattg attttttaaa ataattatct taaaataatt
taaacaatga tttgcaatta 540gatgataata taaaattatt ttacacacta catgtattaa
actcaaactt ttatatatta 600gtttttctaa aaactaattt ttaactcaaa aaaaatgtta
cttataattt tcttatcttc 660tttttttata agtatttttt aagaaattta ttgaaacatg
accatgcttg ggtcaataat 720actactctct tagacaccaa acaacccttc ccaaactata
atctaatcca aaagccatca 780ttcattttcc ttggtaggta aagttccaag accttcacca
actttttcac tcaattgttt 840tggtgtaagc aattcgacat gtgttagtgt tagttggcaa
ccaaaaatcc ctttatgtga 900ctcaatccaa caaccactca caccaccaac ccccataacc
atttctcaca atacccttca 960tttacacatt atcatcacca aaaataaata aaaaaaacct
ctcatttcag agagagagag 1020agagacttca cagaccaaag tgcagagaac aacaaagttc
acaactttaa ggaaaattga 1080aatggcccaa gtgagcagag tgcacaatct tgctcaaagc
actcaaattt ttggccattc 1140ttccaactcc aacaaactca aatcggtgaa ttcggtttca
ttgaggccac gcctttgggg 1200ggcctcaaaa tctcgcatcc cgatgcataa aaatggaagc
tttatgggaa attttaatgt 1260ggggaaggga aattccggcg tgtttaaggt ttctgcatcg
gtcgccgccg cagagaagcc 1320gtcaacgtcg ccggagatcg tgttggaacc catcaaagac
ttctcgggta ccatcacatt 1380gccagggtcc aagtctctgt ccaatcgaat tttgcttctt
gctgctctct ctgaggttcg 1440tagatttctt ccgttttttt ttcttcttct ttattgtttg
ttctacatca gcatgatgtt 1500gatttgattg tgttttctat cgtttcatcg attataaatt
ttcataatca gaagattcag 1560cttttattaa tgcaagaacg tccttaattg atgattttat
aaccgtaaat taggtctaat 1620tagagttttt ttcataaaga ttttcagatc cgtttacaac
aagccttaat tgttgattct 1680gtagtcgtag attaaggttt ttttcatgaa ctacttcaga
tccgttaaac aacagcctta 1740tttgttgata cttcagtcgt ttttcaagaa attgttcaga
tccgttgata aaagccttat 1800tcgttgattc tgtatggtat ttcaagagat attgctcagg
tcctttagca actaccttat 1860ttgttgattc tgtggccata gattaggatt ttttttcacg
aaattgcttc ttgaaattac 1920gtgatggatt ttgattctga tttatcttgt gattgttgac
tctacaggga acaactgttg 1980tagacaactt gttgtatagt gaggatattc attacatgct
tggtgcatta aggacccttg 2040gactgcgtgt ggaagatgac aaaacaacca aacaagcaat
tgttgaaggc tgtgggggat 2100tgtttcccac tagtaaggaa tctaaagatg aaatcaattt
attccttgga aatgctggta 2160ttgcaatgag atctttgaca gcagctgttg ttgctgcagg
tggaaatgca aggtctgttt 2220tttttttttt tgttcagcat aatctttgaa ttgttcctcg
tataactaat cacaacagag 2280tacgtgttct tcttcctgtt ataatctaaa aatctcatcc
agattagtca tcctttcttc 2340ttaaaaggaa cctttaatta tcaatgtatt tatttaatat
ttaaattagc ttgtcaaagt 2400ctagcatata catattttga ttatattctg agaaatgcac
ctgagggtgt tcctcatgat 2460ctacttcaac ctctgttatt attagatttt ctatcatgat
tactggtttg agtctctaag 2520tagaccatct tgatgttcaa aatatttcag ctacgtactt
gatggggtgc cccgaatgag 2580agagaggcca attggggatt tggttgctgg tcttaagcaa
cttggtgcag atgttgattg 2640ctttcttggc acaaactgtc cacctgttcg tgtaaatggg
aagggaggac ttcctggcgg 2700aaaggtatgg tttggatttc atttagaata aggtggagta
actttcctgg atcaaaattc 2760taatttaaga agcctccctg ttttcctctc tttagaataa
gactaagggt aggtttagga 2820gttgggtttt ggagagaaat ggaagggaga gcaatttttt
tcttcttcta ataaatattc 2880tttaatttga tacatttttt aagtaaaaga atataaagat
agattagcat aacttaatgt 2940tttaatcttt tatttatttt tataaatatt atatacctgt
ctatttaaaa atcaaatatt 3000tgtcctccat tccctttccc ttcaaaacct cagttccaaa
tataccgtag ttgaattata 3060ttttggaagg cctattggtt ggagactttt ccttttcaga
gattatccct cacctttatt 3120atagcctttc tatttttaaa cttcatatag acgccattct
tggggcggcc gcgat 317549523DNAArtificial sequenceprimer, soy1-F3
495gtttgtttgt tgttgggtgt ggg
2349625DNAArtificial sequenceprimer, soy1-R3 496gacatgatgc ttcattttca
cagaa 2549718DNAArtificial
sequenceprobe, soy1-T2(FAM-MGB) 497tgtgtagagt ggattttg
1849822DNAArtificial sequenceprimer,
soy1-F2 498tgttgttggg tgtgggaata gg
2249931DNAArtificial sequenceWOL1001, Forward_primer 499aggtttaatt
ttatataatg ttagcataca g
3150028DNAArtificial sequenceWOL1002, Reverse_primer 500atcaacatca
tgctgatgta gaacaaac
2850129DNAArtificial sequenceWOL1003, Forward_primer 501attctgattt
atcttgtgat tgttgactc
2950227DNAArtificial sequenceWOL1004, Reverse_primer 502atttactttg
gagagaataa ggagggg
2750323DNAGlycine max 503gaaacgttgg gacttagtag agg
2350423DNAGlycine max 504ggaataaaat aggaaacgtt ggg
235059174DNAArtificial
sequenceRTW1201 505ccgggtgtga tttagtataa agtgaagtaa tggtcaaaag aaaaagtgta
aaacgaagta 60cctagtaata agtaatattg aacaaaataa atggtaaagt gtcagatata
taaaataggc 120tttaataaaa ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc
agagcagagt 180catcatgaag ctagaaaggc taccgataga taaactatag ttaattaaat
acattaaaaa 240atacttggat ctttctctta ccctgtttat attgagacct gaaacttgag
agagatacac 300taatcttgcc ttgttgtttc attccctaac ttacaggact cagcgcatgt
catgtggtct 360cgttccccat ttaagtccca caccgtctaa acttattaaa ttattaatgt
ttataactag 420atgcacaaca acaaagcttg aaacgttggg acttagtagg ttttagagct
agaaatagca 480agttaaaata aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc
ggtgcttttt 540tttgcggccg caattggatc gggtttactt attttgtggg tatctatact
tttattagat 600ttttaatcag gctcctgatt tctttttatt tcgattgaat tcctgaactt
gtattattca 660gtagatcgaa taaattataa aaagataaaa tcataaaata atattttatc
ctatcaatca 720tattaaagca atgaatatgt aaaattaatc ttatctttat tttaaaaaat
catataggtt 780tagtattttt ttaaaaataa agataggatt agttttacta ttcactgctt
attactttta 840aaaaaatcat aaaggtttag tattttttta aaataaatat aggaatagtt
ttactattca 900ctgctttaat agaaaaatag tttaaaattt aagatagttt taatcccagc
atttgccacg 960tttgaacgtg agccgaaacg atgtcgttac attatcttaa cctagctgaa
acgatgtcgt 1020cataatatcg ccaaatgcca actggactac gtcgaaccca caaatcccac
aaagcgcgtg 1080aaatcaaatc gctcaaacca caaaaaagaa caacgcgttt gttacacgct
caatcccacg 1140cgagtagagc acagtaacct tcaaataagc gaatggggca taatcagaaa
tccgaaataa 1200acctaggggc attatcggaa atgaaaagta gctcactcaa tataaaaatc
taggaaccct 1260agttttcgtt atcactctgt gctccctcgc tctatttctc agtctctgtg
tttgcggctg 1320aggattccga acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc
tgctcttgtc 1380tcttcgattc gatctatgcc tgtctcttat ttacgatgat gtttcttcgg
ttatgttttt 1440ttatttatgc tttatgctgt tgatgttcgg ttgtttgttt cgctttgttt
ttgtggttca 1500gttttttagg attcttttgg tttttgaatc gattaatcgg aagagatttt
cgagttattt 1560ggtgtgttgg aggtgaatct tttttttgag gtcatagatc tgttgtattt
gtgttataaa 1620catgcgactt tgtatgattt tttacgaggt tatgatgttc tggttgtttt
attatgaatc 1680tgttgagaca gaaccatgat ttttgttgat gttcgtttac actattaaag
gtttgtttta 1740acaggattaa aagtttttta agcatgttga aggagtcttg tagatatgta
accgtcgata 1800gtttttttgt gggtttgttc acatgttatc aagcttaatc ttttactatg
tatgcgacca 1860tatctggatc cagcaaaggc gattttttaa ttccttgtga aacttttgta
atatgaagtt 1920gaaattttgt tattggtaaa ctataaatgt gtgaagttgg agtatacctt
taccttctta 1980tttggctttg tgatagttta atttatatgt attttgagtt ctgacttgta
tttctttgaa 2040ttgattctag tttaagtaat ccatggcacc gaagaagaag cgcaaggtga
tggacaaaaa 2100gtactcaata gggctcgaca tagggactaa ctccgttgga tgggccgtca
tcaccgacga 2160gtacaaggtg ccctccaaga agttcaaggt gttgggaaac accgacaggc
acagcataaa 2220gaagaatttg atcggtgccc tcctcttcga ctccggagag accgctgagg
ctaccaggct 2280caagaggacc gctagaaggc gctacaccag aaggaagaac agaatctgct
acctgcagga 2340gatcttctcc aacgagatgg ccaaggtgga cgactccttc ttccaccgcc
ttgaggaatc 2400attcctggtg gaggaggata aaaagcacga gagacaccca atcttcggga
acatcgtcga 2460cgaggtggcc taccatgaaa agtaccctac catctaccac ctgaggaaga
agctggtcga 2520ctctaccgac aaggctgact tgcgcttgat ttacctggct ctcgctcaca
tgataaagtt 2580ccgcggacac ttcctcattg agggagacct gaacccagac aactccgacg
tggacaagct 2640cttcatccag ctcgttcaga cctacaacca gcttttcgag gagaacccaa
tcaacgccag 2700tggagttgac gccaaggcta tcctctctgc tcgtctgtca aagtccagga
ggcttgagaa 2760cttgattgcc cagctgcctg gcgaaaagaa gaacggactg ttcggaaact
tgatcgctct 2820ctccctggga ttgactccca acttcaagtc caacttcgac ctcgccgagg
acgctaagtt 2880gcagttgtct aaagacacct acgacgatga cctcgacaac ttgctggccc
agataggcga 2940ccaatacgcc gatctcttcc tcgccgctaa gaacttgtcc gacgcaatcc
tgctgtccga 3000catcctgaga gtcaacactg agattaccaa agctcctctg tctgcttcca
tgattaagcg 3060ctacgacgag caccaccaag atctgaccct gctcaaggcc ctggtgagac
agcagctgcc 3120cgagaagtac aaggagatct ttttcgacca gtccaagaac ggctacgccg
gatacattga 3180cggaggcgcc tcccaggaag agttctacaa gttcatcaag cccatccttg
agaagatgga 3240cggtaccgag gagctgttgg tgaagttgaa cagagaggac ctgttgagga
agcagagaac 3300cttcgacaac ggaagcatcc ctcaccaaat ccacctggga gagctccacg
ccatcttgag 3360gaggcaggag gatttctatc ccttcctgaa ggacaaccgc gagaagattg
agaagatctt 3420gaccttcaga attccttact acgtcgggcc actcgccaga ggaaactcta
ggttcgcctg 3480gatgacccgc aaatctgaag agaccattac tccctggaac ttcgaggaag
tcgtggacaa 3540gggcgcttcc gctcagtctt tcatcgagag gatgaccaac ttcgataaaa
atctgcccaa 3600cgagaaggtg ctgcccaagc actccctgtt gtacgagtat ttcacagtgt
acaacgagct 3660caccaaggtg aagtacgtca cagagggaat gaggaagcct gccttcttgt
ccggagagca 3720gaagaaggcc atcgtcgacc tgctcttcaa gaccaacagg aaggtgactg
tcaagcagct 3780gaaggaggac tacttcaaga agatcgagtg cttcgactcc gtcgagatct
ctggtgtcga 3840ggacaggttc aacgcctccc ttgggactta ccacgatctg ctcaagatta
ttaaagacaa 3900ggacttcctg gacaacgagg agaacgagga catccttgag gacatcgtgc
tcaccctgac 3960cttgttcgaa gacagggaaa tgatcgaaga gaggctcaag acctacgccc
acctcttcga 4020cgacaaggtg atgaaacagc tgaagagacg cagatatacc ggctggggaa
ggctctcccg 4080caaattgatc aacgggatca gggacaagca gtcagggaag actatactcg
acttcctgaa 4140gtccgacgga ttcgccaaca ggaacttcat gcagctcatt cacgacgact
ccttgacctt 4200caaggaggac atccagaagg ctcaggtgtc tggacagggt gactccttgc
atgagcacat 4260tgctaacttg gccggctctc ccgctattaa gaagggcatt ttgcagaccg
tgaaggtcgt 4320tgacgagctc gtgaaggtga tgggacgcca caagccagag aacatcgtta
ttgagatggc 4380tcgcgagaac caaactaccc agaaagggca gaagaattcc cgcgagagga
tgaagcgcat 4440tgaggagggc ataaaagagc ttggctctca gatcctcaag gagcaccccg
tcgagaacac 4500tcagctgcag aacgagaagc tgtacctgta ctacctccaa aacggaaggg
acatgtacgt 4560ggaccaggag ctggacatca acaggttgtc cgactacgac gtcgaccaca
tcgtgcctca 4620gtccttcctg aaggatgact ccatcgacaa taaagtgctg acacgctccg
ataaaaatag 4680aggcaagtcc gacaacgtcc cctccgagga ggtcgtgaag aagatgaaaa
actactggag 4740acagctcttg aacgccaagc tcatcaccca gcgtaagttc gacaacctga
ctaaggctga 4800gagaggagga ttgtccgagc tcgataaggc cggattcatc aagagacagc
tcgtcgaaac 4860ccgccaaatt accaagcacg tggcccaaat tctggattcc cgcatgaaca
ccaagtacga 4920tgaaaatgac aagctgatcc gcgaggtcaa ggtgatcacc ttgaagtcca
agctggtctc 4980cgacttccgc aaggacttcc agttctacaa ggtgagggag atcaacaact
accaccacgc 5040acacgacgcc tacctcaacg ctgtcgttgg aaccgccctc atcaaaaaat
atcctaagct 5100ggagtctgag ttcgtctacg gcgactacaa ggtgtacgac gtgaggaaga
tgatcgctaa 5160gtctgagcag gagatcggca aggccaccgc caagtacttc ttctactcca
acatcatgaa 5220cttcttcaag accgagatca ctctcgccaa cggtgagatc aggaagcgcc
cactgatcga 5280gaccaacggt gagactggag agatcgtgtg ggacaaaggg agggatttcg
ctactgtgag 5340gaaggtgctc tccatgcctc aggtgaacat cgtcaagaag accgaagttc
agaccggagg 5400attctccaag gagtccatcc tccccaagag aaactccgac aagctgatcg
ctagaaagaa 5460agactgggac cctaagaagt acggaggctt cgattctcct accgtggcct
actctgtgct 5520ggtcgtggcc aaggtggaga agggcaagtc caagaagctg aaatccgtca
aggagctcct 5580cgggattacc atcatggaga ggagttcctt cgagaagaac cctatcgact
tcctggaggc 5640caagggatat aaagaggtga agaaggacct catcatcaag ctgcccaagt
actccctctt 5700cgagttggag aacggaagga agaggatgct ggcttctgcc ggagagttgc
agaagggaaa 5760tgagctcgcc cttccctcca agtacgtgaa cttcctgtac ctcgcctctc
actatgaaaa 5820gttgaagggc tctcctgagg acaacgagca gaagcagctc ttcgtggagc
agcacaagca 5880ctacctggac gaaattatcg agcagatctc tgagttctcc aagcgcgtga
tattggccga 5940cgccaacctc gacaaggtgc tgtccgccta caacaagcac agggataagc
ccattcgcga 6000gcaggctgaa aacattatcc acctgtttac cctcacaaac ttgggagccc
ctgctgcctt 6060caagtacttc gacaccacca ttgacaggaa gagatacacc tccaccaagg
aggtgctcga 6120cgcaacactc atccaccaat ccatcaccgg cctctatgaa acaaggattg
acttgtccca 6180gctgggaggc gactctagag ccgatcccaa gaagaagaga aaggtgaaga
gaccacggga 6240ccgccacgat ggcgagctgg gaggccgcaa gcgggcaagg taggttaacc
tagacttgtc 6300catcttctgg attggccaac ttaattaatg tatgaaataa aaggatgcac
acatagtgac 6360atgctaatca ctataatgtg ggcatcaaag ttgtgtgtta tgtgtaatta
ctagttatct 6420gaataaaaga gaaagagatc atccatattt cttatcctaa atgaatgtca
cgtgtcttta 6480taattctttg atgaaccaga tgcatttcat taaccaaatc catatacata
taaatattaa 6540tcatatataa ttaatatcaa ttgggttagc aaaacaaatc tagtctaggt
gtgttttgcg 6600aattcgatat caagcttatc gataccgtcg agggggggcc cggtaccggc
gcgccgttct 6660atagtgtcac ctaaatcgta tgtgtatgat acataaggtt atgtattaat
tgtagccgcg 6720ttctaacgac aatatgtcca tatggtgcac tctcagtaca atctgctctg
atgccgcata 6780gttaagccag ccccgacacc cgccaacacc cgctgacgcg ccctgacggg
cttgtctgct 6840cccggcatcc gcttacagac aagctgtgac cgtctccggg agctgcatgt
gtcagaggtt 6900ttcaccgtca tcaccgaaac gcgcgagacg aaagggcctc gtgatacgcc
tatttttata 6960ggttaatgtc atgaccaaaa tcccttaacg tgagttttcg ttccactgag
cgtcagaccc 7020cgtagaaaag atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa
tctgctgctt 7080gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg ccggatcaag
agctaccaac 7140tctttttccg aaggtaactg gcttcagcag agcgcagata ccaaatactg
ttcttctagt 7200gtagccgtag ttaggccacc acttcaagaa ctctgtagca ccgcctacat
acctcgctct 7260gctaatcctg ttaccagtgg ctgctgccag tggcgataag tcgtgtctta
ccgggttgga 7320ctcaagacga tagttaccgg ataaggcgca gcggtcgggc tgaacggggg
gttcgtgcac 7380acagcccagc ttggagcgaa cgacctacac cgaactgaga tacctacagc
gtgagctatg 7440agaaagcgcc acgcttcccg aagggagaaa ggcggacagg tatccggtaa
gcggcagggt 7500cggaacagga gagcgcacga gggagcttcc agggggaaac gcctggtatc
tttatagtcc 7560tgtcgggttt cgccacctct gacttgagcg tcgatttttg tgatgctcgt
caggggggcg 7620gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct
tttgctggcc 7680ttttgctcac atgttctttc ctgcgttatc ccctgattct gtggataacc
gtattaccgc 7740ctttgagtga gctgataccg ctcgccgcag ccgaacgacc gagcgcagcg
agtcagtgag 7800cgaggaagcg gaagagcgcc caatacgcaa accgcctctc cccgcgcgtt
ggccgattca 7860ttaatgcagg ttgatcagat ctcgatcccg cgaaattaat acgactcact
atagggagac 7920cacaacggtt tccctctaga aataattttg tttaacttta agaaggagat
atacccatgg 7980aaaagcctga actcaccgcg acgtctgtcg agaagtttct gatcgaaaag
ttcgacagcg 8040tctccgacct gatgcagctc tcggagggcg aagaatctcg tgctttcagc
ttcgatgtag 8100gagggcgtgg atatgtcctg cgggtaaata gctgcgccga tggtttctac
aaagatcgtt 8160atgtttatcg gcactttgca tcggccgcgc tcccgattcc ggaagtgctt
gacattgggg 8220aattcagcga gagcctgacc tattgcatct cccgccgtgc acagggtgtc
acgttgcaag 8280acctgcctga aaccgaactg cccgctgttc tgcagccggt cgcggaggct
atggatgcga 8340tcgctgcggc cgatcttagc cagacgagcg ggttcggccc attcggaccg
caaggaatcg 8400gtcaatacac tacatggcgt gatttcatat gcgcgattgc tgatccccat
gtgtatcact 8460ggcaaactgt gatggacgac accgtcagtg cgtccgtcgc gcaggctctc
gatgagctga 8520tgctttgggc cgaggactgc cccgaagtcc ggcacctcgt gcacgcggat
ttcggctcca 8580acaatgtcct gacggacaat ggccgcataa cagcggtcat tgactggagc
gaggcgatgt 8640tcggggattc ccaatacgag gtcgccaaca tcttcttctg gaggccgtgg
ttggcttgta 8700tggagcagca gacgcgctac ttcgagcgga ggcatccgga gcttgcagga
tcgccgcggc 8760tccgggcgta tatgctccgc attggtcttg accaactcta tcagagcttg
gttgacggca 8820atttcgatga tgcagcttgg gcgcagggtc gatgcgacgc aatcgtccga
tccggagccg 8880ggactgtcgg gcgtacacaa atcgcccgca gaagcgcggc cgtctggacc
gatggctgtg 8940tagaagtact cgccgatagt ggaaaccgac gccccagcac tcgtccgagg
gcaaaggaat 9000agtgaggtac agcttggatc gatccggctg ctaacaaagc ccgaaaggaa
gctgagttgg 9060ctgctgccac cgctgagcaa taactagcat aaccccttgg ggcctctaaa
cgggtcttga 9120ggggtttttt gctgaaagga ggaactatat ccggatgctc gggcgcgccg
gtac 91745069174DNAArtificial sequenceRTW1202 506ccgggtgtga
tttagtataa agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata
agtaatattg aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa
ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc agagcagagt 180catcatgaag
ctagaaaggc taccgataga taaactatag ttaattaaat acattaaaaa 240atacttggat
ctttctctta ccctgtttat attgagacct gaaacttgag agagatacac 300taatcttgcc
ttgttgtttc attccctaac ttacaggact cagcgcatgt catgtggtct 360cgttccccat
ttaagtccca caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca
acaaagcttg gaataaaata ggaaacgttg ttttagagct agaaatagca 480agttaaaata
aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc ggtgcttttt 540tttgcggccg
caattggatc gggtttactt attttgtggg tatctatact tttattagat 600ttttaatcag
gctcctgatt tctttttatt tcgattgaat tcctgaactt gtattattca 660gtagatcgaa
taaattataa aaagataaaa tcataaaata atattttatc ctatcaatca 720tattaaagca
atgaatatgt aaaattaatc ttatctttat tttaaaaaat catataggtt 780tagtattttt
ttaaaaataa agataggatt agttttacta ttcactgctt attactttta 840aaaaaatcat
aaaggtttag tattttttta aaataaatat aggaatagtt ttactattca 900ctgctttaat
agaaaaatag tttaaaattt aagatagttt taatcccagc atttgccacg 960tttgaacgtg
agccgaaacg atgtcgttac attatcttaa cctagctgaa acgatgtcgt 1020cataatatcg
ccaaatgcca actggactac gtcgaaccca caaatcccac aaagcgcgtg 1080aaatcaaatc
gctcaaacca caaaaaagaa caacgcgttt gttacacgct caatcccacg 1140cgagtagagc
acagtaacct tcaaataagc gaatggggca taatcagaaa tccgaaataa 1200acctaggggc
attatcggaa atgaaaagta gctcactcaa tataaaaatc taggaaccct 1260agttttcgtt
atcactctgt gctccctcgc tctatttctc agtctctgtg tttgcggctg 1320aggattccga
acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc tgctcttgtc 1380tcttcgattc
gatctatgcc tgtctcttat ttacgatgat gtttcttcgg ttatgttttt 1440ttatttatgc
tttatgctgt tgatgttcgg ttgtttgttt cgctttgttt ttgtggttca 1500gttttttagg
attcttttgg tttttgaatc gattaatcgg aagagatttt cgagttattt 1560ggtgtgttgg
aggtgaatct tttttttgag gtcatagatc tgttgtattt gtgttataaa 1620catgcgactt
tgtatgattt tttacgaggt tatgatgttc tggttgtttt attatgaatc 1680tgttgagaca
gaaccatgat ttttgttgat gttcgtttac actattaaag gtttgtttta 1740acaggattaa
aagtttttta agcatgttga aggagtcttg tagatatgta accgtcgata 1800gtttttttgt
gggtttgttc acatgttatc aagcttaatc ttttactatg tatgcgacca 1860tatctggatc
cagcaaaggc gattttttaa ttccttgtga aacttttgta atatgaagtt 1920gaaattttgt
tattggtaaa ctataaatgt gtgaagttgg agtatacctt taccttctta 1980tttggctttg
tgatagttta atttatatgt attttgagtt ctgacttgta tttctttgaa 2040ttgattctag
tttaagtaat ccatggcacc gaagaagaag cgcaaggtga tggacaaaaa 2100gtactcaata
gggctcgaca tagggactaa ctccgttgga tgggccgtca tcaccgacga 2160gtacaaggtg
ccctccaaga agttcaaggt gttgggaaac accgacaggc acagcataaa 2220gaagaatttg
atcggtgccc tcctcttcga ctccggagag accgctgagg ctaccaggct 2280caagaggacc
gctagaaggc gctacaccag aaggaagaac agaatctgct acctgcagga 2340gatcttctcc
aacgagatgg ccaaggtgga cgactccttc ttccaccgcc ttgaggaatc 2400attcctggtg
gaggaggata aaaagcacga gagacaccca atcttcggga acatcgtcga 2460cgaggtggcc
taccatgaaa agtaccctac catctaccac ctgaggaaga agctggtcga 2520ctctaccgac
aaggctgact tgcgcttgat ttacctggct ctcgctcaca tgataaagtt 2580ccgcggacac
ttcctcattg agggagacct gaacccagac aactccgacg tggacaagct 2640cttcatccag
ctcgttcaga cctacaacca gcttttcgag gagaacccaa tcaacgccag 2700tggagttgac
gccaaggcta tcctctctgc tcgtctgtca aagtccagga ggcttgagaa 2760cttgattgcc
cagctgcctg gcgaaaagaa gaacggactg ttcggaaact tgatcgctct 2820ctccctggga
ttgactccca acttcaagtc caacttcgac ctcgccgagg acgctaagtt 2880gcagttgtct
aaagacacct acgacgatga cctcgacaac ttgctggccc agataggcga 2940ccaatacgcc
gatctcttcc tcgccgctaa gaacttgtcc gacgcaatcc tgctgtccga 3000catcctgaga
gtcaacactg agattaccaa agctcctctg tctgcttcca tgattaagcg 3060ctacgacgag
caccaccaag atctgaccct gctcaaggcc ctggtgagac agcagctgcc 3120cgagaagtac
aaggagatct ttttcgacca gtccaagaac ggctacgccg gatacattga 3180cggaggcgcc
tcccaggaag agttctacaa gttcatcaag cccatccttg agaagatgga 3240cggtaccgag
gagctgttgg tgaagttgaa cagagaggac ctgttgagga agcagagaac 3300cttcgacaac
ggaagcatcc ctcaccaaat ccacctggga gagctccacg ccatcttgag 3360gaggcaggag
gatttctatc ccttcctgaa ggacaaccgc gagaagattg agaagatctt 3420gaccttcaga
attccttact acgtcgggcc actcgccaga ggaaactcta ggttcgcctg 3480gatgacccgc
aaatctgaag agaccattac tccctggaac ttcgaggaag tcgtggacaa 3540gggcgcttcc
gctcagtctt tcatcgagag gatgaccaac ttcgataaaa atctgcccaa 3600cgagaaggtg
ctgcccaagc actccctgtt gtacgagtat ttcacagtgt acaacgagct 3660caccaaggtg
aagtacgtca cagagggaat gaggaagcct gccttcttgt ccggagagca 3720gaagaaggcc
atcgtcgacc tgctcttcaa gaccaacagg aaggtgactg tcaagcagct 3780gaaggaggac
tacttcaaga agatcgagtg cttcgactcc gtcgagatct ctggtgtcga 3840ggacaggttc
aacgcctccc ttgggactta ccacgatctg ctcaagatta ttaaagacaa 3900ggacttcctg
gacaacgagg agaacgagga catccttgag gacatcgtgc tcaccctgac 3960cttgttcgaa
gacagggaaa tgatcgaaga gaggctcaag acctacgccc acctcttcga 4020cgacaaggtg
atgaaacagc tgaagagacg cagatatacc ggctggggaa ggctctcccg 4080caaattgatc
aacgggatca gggacaagca gtcagggaag actatactcg acttcctgaa 4140gtccgacgga
ttcgccaaca ggaacttcat gcagctcatt cacgacgact ccttgacctt 4200caaggaggac
atccagaagg ctcaggtgtc tggacagggt gactccttgc atgagcacat 4260tgctaacttg
gccggctctc ccgctattaa gaagggcatt ttgcagaccg tgaaggtcgt 4320tgacgagctc
gtgaaggtga tgggacgcca caagccagag aacatcgtta ttgagatggc 4380tcgcgagaac
caaactaccc agaaagggca gaagaattcc cgcgagagga tgaagcgcat 4440tgaggagggc
ataaaagagc ttggctctca gatcctcaag gagcaccccg tcgagaacac 4500tcagctgcag
aacgagaagc tgtacctgta ctacctccaa aacggaaggg acatgtacgt 4560ggaccaggag
ctggacatca acaggttgtc cgactacgac gtcgaccaca tcgtgcctca 4620gtccttcctg
aaggatgact ccatcgacaa taaagtgctg acacgctccg ataaaaatag 4680aggcaagtcc
gacaacgtcc cctccgagga ggtcgtgaag aagatgaaaa actactggag 4740acagctcttg
aacgccaagc tcatcaccca gcgtaagttc gacaacctga ctaaggctga 4800gagaggagga
ttgtccgagc tcgataaggc cggattcatc aagagacagc tcgtcgaaac 4860ccgccaaatt
accaagcacg tggcccaaat tctggattcc cgcatgaaca ccaagtacga 4920tgaaaatgac
aagctgatcc gcgaggtcaa ggtgatcacc ttgaagtcca agctggtctc 4980cgacttccgc
aaggacttcc agttctacaa ggtgagggag atcaacaact accaccacgc 5040acacgacgcc
tacctcaacg ctgtcgttgg aaccgccctc atcaaaaaat atcctaagct 5100ggagtctgag
ttcgtctacg gcgactacaa ggtgtacgac gtgaggaaga tgatcgctaa 5160gtctgagcag
gagatcggca aggccaccgc caagtacttc ttctactcca acatcatgaa 5220cttcttcaag
accgagatca ctctcgccaa cggtgagatc aggaagcgcc cactgatcga 5280gaccaacggt
gagactggag agatcgtgtg ggacaaaggg agggatttcg ctactgtgag 5340gaaggtgctc
tccatgcctc aggtgaacat cgtcaagaag accgaagttc agaccggagg 5400attctccaag
gagtccatcc tccccaagag aaactccgac aagctgatcg ctagaaagaa 5460agactgggac
cctaagaagt acggaggctt cgattctcct accgtggcct actctgtgct 5520ggtcgtggcc
aaggtggaga agggcaagtc caagaagctg aaatccgtca aggagctcct 5580cgggattacc
atcatggaga ggagttcctt cgagaagaac cctatcgact tcctggaggc 5640caagggatat
aaagaggtga agaaggacct catcatcaag ctgcccaagt actccctctt 5700cgagttggag
aacggaagga agaggatgct ggcttctgcc ggagagttgc agaagggaaa 5760tgagctcgcc
cttccctcca agtacgtgaa cttcctgtac ctcgcctctc actatgaaaa 5820gttgaagggc
tctcctgagg acaacgagca gaagcagctc ttcgtggagc agcacaagca 5880ctacctggac
gaaattatcg agcagatctc tgagttctcc aagcgcgtga tattggccga 5940cgccaacctc
gacaaggtgc tgtccgccta caacaagcac agggataagc ccattcgcga 6000gcaggctgaa
aacattatcc acctgtttac cctcacaaac ttgggagccc ctgctgcctt 6060caagtacttc
gacaccacca ttgacaggaa gagatacacc tccaccaagg aggtgctcga 6120cgcaacactc
atccaccaat ccatcaccgg cctctatgaa acaaggattg acttgtccca 6180gctgggaggc
gactctagag ccgatcccaa gaagaagaga aaggtgaaga gaccacggga 6240ccgccacgat
ggcgagctgg gaggccgcaa gcgggcaagg taggttaacc tagacttgtc 6300catcttctgg
attggccaac ttaattaatg tatgaaataa aaggatgcac acatagtgac 6360atgctaatca
ctataatgtg ggcatcaaag ttgtgtgtta tgtgtaatta ctagttatct 6420gaataaaaga
gaaagagatc atccatattt cttatcctaa atgaatgtca cgtgtcttta 6480taattctttg
atgaaccaga tgcatttcat taaccaaatc catatacata taaatattaa 6540tcatatataa
ttaatatcaa ttgggttagc aaaacaaatc tagtctaggt gtgttttgcg 6600aattcgatat
caagcttatc gataccgtcg agggggggcc cggtaccggc gcgccgttct 6660atagtgtcac
ctaaatcgta tgtgtatgat acataaggtt atgtattaat tgtagccgcg 6720ttctaacgac
aatatgtcca tatggtgcac tctcagtaca atctgctctg atgccgcata 6780gttaagccag
ccccgacacc cgccaacacc cgctgacgcg ccctgacggg cttgtctgct 6840cccggcatcc
gcttacagac aagctgtgac cgtctccggg agctgcatgt gtcagaggtt 6900ttcaccgtca
tcaccgaaac gcgcgagacg aaagggcctc gtgatacgcc tatttttata 6960ggttaatgtc
atgaccaaaa tcccttaacg tgagttttcg ttccactgag cgtcagaccc 7020cgtagaaaag
atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt 7080gcaaacaaaa
aaaccaccgc taccagcggt ggtttgtttg ccggatcaag agctaccaac 7140tctttttccg
aaggtaactg gcttcagcag agcgcagata ccaaatactg ttcttctagt 7200gtagccgtag
ttaggccacc acttcaagaa ctctgtagca ccgcctacat acctcgctct 7260gctaatcctg
ttaccagtgg ctgctgccag tggcgataag tcgtgtctta ccgggttgga 7320ctcaagacga
tagttaccgg ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac 7380acagcccagc
ttggagcgaa cgacctacac cgaactgaga tacctacagc gtgagctatg 7440agaaagcgcc
acgcttcccg aagggagaaa ggcggacagg tatccggtaa gcggcagggt 7500cggaacagga
gagcgcacga gggagcttcc agggggaaac gcctggtatc tttatagtcc 7560tgtcgggttt
cgccacctct gacttgagcg tcgatttttg tgatgctcgt caggggggcg 7620gagcctatgg
aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct tttgctggcc 7680ttttgctcac
atgttctttc ctgcgttatc ccctgattct gtggataacc gtattaccgc 7740ctttgagtga
gctgataccg ctcgccgcag ccgaacgacc gagcgcagcg agtcagtgag 7800cgaggaagcg
gaagagcgcc caatacgcaa accgcctctc cccgcgcgtt ggccgattca 7860ttaatgcagg
ttgatcagat ctcgatcccg cgaaattaat acgactcact atagggagac 7920cacaacggtt
tccctctaga aataattttg tttaacttta agaaggagat atacccatgg 7980aaaagcctga
actcaccgcg acgtctgtcg agaagtttct gatcgaaaag ttcgacagcg 8040tctccgacct
gatgcagctc tcggagggcg aagaatctcg tgctttcagc ttcgatgtag 8100gagggcgtgg
atatgtcctg cgggtaaata gctgcgccga tggtttctac aaagatcgtt 8160atgtttatcg
gcactttgca tcggccgcgc tcccgattcc ggaagtgctt gacattgggg 8220aattcagcga
gagcctgacc tattgcatct cccgccgtgc acagggtgtc acgttgcaag 8280acctgcctga
aaccgaactg cccgctgttc tgcagccggt cgcggaggct atggatgcga 8340tcgctgcggc
cgatcttagc cagacgagcg ggttcggccc attcggaccg caaggaatcg 8400gtcaatacac
tacatggcgt gatttcatat gcgcgattgc tgatccccat gtgtatcact 8460ggcaaactgt
gatggacgac accgtcagtg cgtccgtcgc gcaggctctc gatgagctga 8520tgctttgggc
cgaggactgc cccgaagtcc ggcacctcgt gcacgcggat ttcggctcca 8580acaatgtcct
gacggacaat ggccgcataa cagcggtcat tgactggagc gaggcgatgt 8640tcggggattc
ccaatacgag gtcgccaaca tcttcttctg gaggccgtgg ttggcttgta 8700tggagcagca
gacgcgctac ttcgagcgga ggcatccgga gcttgcagga tcgccgcggc 8760tccgggcgta
tatgctccgc attggtcttg accaactcta tcagagcttg gttgacggca 8820atttcgatga
tgcagcttgg gcgcagggtc gatgcgacgc aatcgtccga tccggagccg 8880ggactgtcgg
gcgtacacaa atcgcccgca gaagcgcggc cgtctggacc gatggctgtg 8940tagaagtact
cgccgatagt ggaaaccgac gccccagcac tcgtccgagg gcaaaggaat 9000agtgaggtac
agcttggatc gatccggctg ctaacaaagc ccgaaaggaa gctgagttgg 9060ctgctgccac
cgctgagcaa taactagcat aaccccttgg ggcctctaaa cgggtcttga 9120ggggtttttt
gctgaaagga ggaactatat ccggatgctc gggcgcgccg gtac
91745076113DNAArtificial sequenceRTW1192A 507caagtagttc tagtcttaat
acaaatgtca aatggcacaa gtgagatttt gaatttctga 60tgttgtaaaa atctcaggac
atgaatacta ttgggaagca attattcata cttcaccaat 120ccaaactgac ccaaaattct
caaatcacat gaaagcaaaa atgcatataa cacgaagaat 180aagaagaaga ggaactaacc
tggggtttcg atgattaaag cgttgttgtt gatgatgaaa 240acgatgatta tggagagaaa
ttgttgttga atggtgaaat tgttatagaa agagaacgaa 300gatagagaaa aagatatata
gatttttcaa ggctcaaacc ctaaaatcac catgagagag 360aacaaagatt gagaaaccta
caaccactat gagagagaat gagcagaaca gaagcgtgag 420atagagaacg agagttaagg
tgcgagagga cacgaagaac aaaaggtgtg agagaaagaa 480caaaggagcc tacggtgtga
gatgagagaa tttgaaattc ttaccattta ggtggaattt 540caattctaca attttattct
attaaaatta ttttaaaaaa tgatgtcatt ttaaattctt 600taaaatctca tatccaacaa
ctgaattatg atagaggtat ttcaaattca cttaaaaaaa 660ttatcttatt taaatacccc
atccaaacat agcgaaatgt tcatgagaag gatcaagtgg 720tttggaaaca tagtactaat
ggtgtttata cagttcatgg aatccttgat agataattta 780aaggttgctg gaaattggat
gaaggtgtgg agattaaata ttcttccaaa aataaagcgt 840tttatttgga gagtgttgtg
tggttgtctc ccctgtaggc aaaagcttcg atgtaaagga 900gttcaatgtc caataaccta
tgctttctat ccctcgatta ttgaaaatga atgacacatt 960ttatttggtt gaaatcaaga
aataagcatg tggcaagcaa cgggtatttg acaattcata 1020gaacaaaagg tgaatgcagc
aaaagaatta atgaactcct tttcgatcta cttggatcac 1080tacatggaga tattatcaac
aaatttgatg ttactttatg gagcagttgg aattcttgga 1140atgacaagat atgaaatgaa
cataccaacc ctcctcttgt ttctgtttcg gtttctatgc 1200agtattttgt tgaatggcaa
agtgcaaggt aatatgctcc tcaacatcaa ttaacaaatg 1260ttcatgacat ctcttaccag
ctccaacttg gggacgtttg acaaacacca ccgtcaagtt 1320tccttaaatg caacattaat
gttgctcatt tcaaggagga gaatagtttt ggtgtcggca 1380tgatactcca tcaaggaaga
ttcgtcaaag ctcactcacg ttttcgacat gggtcgacat 1440gggttacctg acccaaaggc
tgaggcttag gcttgggttt gcttcaagta ttgatctggg 1500cccagactat tggtttacat
aatatcattt ttgaaaacct aacatctaaa actcaaggtt 1560gtttagaggt gcgccattcc
aaaataagat tatcctattt gtgcatgaat gcgaccaact 1620atctcctgtt tcagcattat
aaagtataaa caacaaactt ctttaatcaa gggactaaaa 1680gatattggac atacaagcta
aaagtgatag aatttgagaa aacaaatatt gacaacaata 1740ttcaagagga cactaaaaca
taattctcaa attttttttg tttatttaaa ataaagtggt 1800tcattaggta gctccgggtg
attgcggtta catcatgtac ggaaaaataa ttctaatcct 1860tgatttaaat ttgaacttga
ctatttattt attctttatt tcattttgta aatcatttta 1920tgtatctcct ggcaagcaat
tttatccacc ttgcaccaac accttcgggt tccataatca 1980aaccacctta acttcacacc
atgctgtaac tcacaccgcc cagcatctcc aatgtgaaag 2040aagctaaaat ttaataaaca
atcatacgaa gcagtgacaa aataccagat ggtattaatg 2100cttcgataaa attaattgga
aagtataaaa tggtagaaaa taataaatta taattaattt 2160aagtaagata aaaaataatt
aaaaactaaa atgttaaaat tttaaaaaaa ttattttaaa 2220taatatttaa aaacattaaa
aatcatttta aaaaatttat ttatagaaca attaaataaa 2280tatttcagct aataaaaaac
aaaagcttac ctagccttag aagacaactt gtccaacaat 2340tagatgatac ccattgccct
tacgttttct ttaacatcaa ttattgtttt tgtcaacaag 2400ctatctttta gttttatttt
attggtaaaa aatatgtcgc cttcaagttg catcatttaa 2460cacatctcgt cattagaaaa
ataaaactct tccctaaacg attagtagaa aaaatcattc 2520gataataaat aagaaagaaa
aattagaaaa aaataacttc attttaaaaa aatcattaag 2580gctatatttt ttaaatgact
aattttatat agactgtaac taaaagtata caatttatta 2640tgctatgtat cttaaagaat
tacttataaa aatctacgga agaatatctt acaaagtgaa 2700aaacaaatga gaaagaattt
agtgggatga ttatgatttt atttgaaaat tgaaaaaata 2760attattaaag actttagtgg
agtaagaaag ctttcctatt agtcttttct tatccataaa 2820aaaaaaaaaa aaaatctagc
gtgacagctt ttccatagat tttaataatg taaaatactg 2880gtagcagccg accgttcagg
taatggacac tgtggtccta acttgcaacg ggtgcgggcc 2940caatttaata acgccgtggt
aacggataaa gccaagcgtg aagcggtgaa ggtacatctc 3000tgactccgtc aagattacga
aaccgtcaac tacgaaggac tccccgaaat atcatctgtg 3060tcataaacac caagtcacac
catacatggg cacgcgtcac aatatgattg gagaacggtt 3120ccaccgcata tgctataaaa
tgcccccaca cccctcgacc ctaatcgcac ttcaattgca 3180atcaaattag ttcattctct
ttgcgcagtt ccctacctct cctttcaagg ttcgtagatt 3240tcttccgttt ttttttcttc
ttctttattg tttgttctac atcagcatga tgttgatttg 3300attgtgtttt ctatcgtttc
atcgattata aattttcata atcagaagat tcagctttta 3360ttaatgcaag aacgtcctta
attgatgatt ttataaccgt aaattaggtc taattagagt 3420ttttttcata aagattttca
gatccgttta caacaagcct taattgttga ttctgtagtc 3480gtagattaag gtttttttca
tgaactactt cagatccgtt aaacaacagc cttatttgtt 3540gatacttcag tcgtttttca
agaaattgtt cagatccgtt gataaaagcc ttattcgttg 3600attctgtatg gtatttcaag
agatattgct caggtccttt agcaactacc ttatttgttg 3660attctgtggc catagattag
gatttttttt cacgaaattg cttcttgaaa ttacgtgatg 3720gattttgatt ctgatttatc
ttgtgattgt tgactctaca gatggcccaa gtgagcagag 3780tgcacaatct tgctcaaagc
actcaaattt ttggccattc ttccaactcc aacaaactca 3840aatcggtgaa ttcggtttca
ttgaggccac gcctttgggg ggcctcaaaa tctcgcatcc 3900cgatgcataa aaatggaagc
tttatgggaa attttaatgt ggggaaggga aattccggcg 3960tgtttaaggt ttctgcatcg
gtcgccgccg cagagaagcc gtcaacgtcg ccggagatcg 4020tgttggaacc catcaaagac
ttctcgggta ccatcacatt gccagggtcc aagtctctgt 4080ccaatcgaat tttgcttctt
gctgctctct ctgaggtgaa gtttatttat ttatttattt 4140gtttgtttgt tgttgggtgt
gggaatagga gtttgatgtg tagagtggat tttgaatatt 4200tgattttttt ttgtattatt
ctgtgaaaat gaagcatcat gtcccatgaa agaaatggac 4260acgaaattaa gtggcttatg
atgtgaaatg aggatagaaa tgtgtgtagg gttttttaat 4320gggtagcaat aagcatattc
aatatctgga ttgatttgga cgtttctgta taaaggagta 4380tgctagcaat gtgttaatgt
atggcttgct aaaatactcc taaaaatcaa gtgggagtag 4440tatacatatc tacagcaaat
gtattaggtg aggcatttgg cttctctatt gtaaggaaca 4500aataatatca gttaatgtga
aaatcaatgg ttgatattcc aatacattca tgatgtgtta 4560tttatatgta cctaatattg
actgttgttt ttctccgcaa tgaccaagat tatttatttt 4620atcctctaaa gtgactaatt
gagttgctta ctttagagaa gttggaccca ttaggtgaga 4680gcgtgggggg aactaatctt
gaatatacaa tctgagtctt gattatccaa gtatggttgt 4740atgaacaatg ttagctctag
aagataaacc ctcccccaaa acacaaatta gaatgacatt 4800tcaagttcca tgtatgtcac
tttcattcta ttatttttac aacttttagt tacttaacag 4860atgtcttgtt cagcataaat
tataatttat tctgtttttt tttagggaac aactgttgta 4920gacaacttgt tgtatagtga
ggatattcat tacatgcttg gtgcattaag gacccttgga 4980ctgcgtgtgg aagatgacaa
aacaaccaaa caagcaattg ttgaaggctg tgggggattg 5040tttcccacta gtaaggaatc
taaagatgaa atcaatttat tccttggaaa tgctggtatt 5100gcaatgagat ctttgacagc
agctgttgtt gctgcaggtg gaaatgcaag gtctgttttt 5160tttttttttg ttcagcataa
tctttgaatt gttcctcgta taactaatca caacagagta 5220cgtgttcttc ttcctgttat
aatctaaaaa tctcatccag attagtcatc ctttcttctt 5280aaaaggaacc tttaattatc
aatgtattta tttaatattt aaattagctt gtcaaagtct 5340agcatataca tattttgatt
atattctgag aaatgcacct gagggtgttc ctcatgatct 5400acttcaacct ctgttattat
tagattttct atcatgatta ctggtttgag tctctaagta 5460gaccatcttg atgttcaaaa
tatttcagct acgtacttga tggggtgccc cgaatgagag 5520agaggccaat tggggatttg
gttgctggtc ttaagcaact tggtgcagat gttgattgct 5580ttcttggcac aaactgtcca
cctgttcgtg taaatgggaa gggaggactt cctggcggaa 5640aggtatggtt tggatttcat
ttagaataag gtggagtaac tttcctggat caaaattcta 5700atttaagaag cctccctgtt
ttcctctctt tagaataaga ctaagggtag gtttaggagt 5760tgggttttgg agagaaatgg
aagggagagc aatttttttc ttcttctaat aaatattctt 5820taatttgata cattttttaa
gtaaaagaat ataaagatag attagcataa cttaatgttt 5880taatctttta tttattttta
taaatattat atacctgtct atttaaaaat caaatatttg 5940tcctccattc cctttccctt
caaaacctca gttccaaata taccgtagtt gaattatatt 6000ttggaaggcc tattggttgg
agacttttcc ttttcagaga ttatccctca cctttattat 6060agcctttcta tttttaaact
tcatatagac gccattcttg gggcggccgc gat 611350832DNAArtificial
sequenceprimer, soy1-F4 508tcaataatac tactctctta gacaccaaac aa
3250923DNAArtificial sequenceprimer, soy1-R4
509caaggaaaat gaatgatggc ttt
2351018DNAArtificial sequenceprobe, soy1-T3(FAM-MGB) 510ccttcccaaa
ctataatc
1851127DNAArtificial sequenceWOL1005, Forward_primer 511aaatgttatc
agaggaacat gagctgc
2751228DNAArtificial sequenceWOL1006, Reverse_primer 512attatttttc
cgtacatgat gtaaccgc
28513438DNACauliflower mosaic virus 513cccatggagt caaagattca aatagaggac
ctaacagaac tcgccgtaaa gactggcgaa 60cagttcatac agagtctctt acgactcaat
gacaagaaga aaatcttcgt caacatggtg 120gagcacgaca cgcttgtcta ctccaaaaat
atcaaagata cagtctcaga agaccaaagg 180gcaattgaga cttttcaaca aagggtaata
tccggaaacc tcctcggatt ccattgccca 240gctatctgtc actttattgt gaagatagtg
gaaaaggaag gtggctccta caaatgccat 300cattgcgata aaggaaaggc catcgttgaa
gatgcctctg ccgacagtgg tcccaaagat 360ggacccccac ccacgaggag catcgtggaa
aaagaagacg ttccaaccac gtcttcaaag 420caagtggatt gatgtgat
43851419DNACauliflower mosaic virus
514gtctcagaag accaaaggg
1951525DNACauliflower mosaic virus 515tgccatcatt gcgataaagg aaagg
2551620DNACauliflower mosaic virus
516gatgcctctg ccgacagtgg
205173708DNAZea mays 517ctgcagccca tcaaggagat ctccggcacc gtcaagctgc
cggggtccaa gtcgctttcc 60aacaggatcc tcctgctcgc cgccctgtcc gaggtgagcg
attttggtgc ttgctgcgct 120gccctgtctc actgctacct aaatgttttg cctgtcgaat
accatggatt ctcggtgtaa 180tccatctcac gatcagatgc accgcatgtc gcatgcctag
ctctctctaa tttgtctagt 240agtttgtata cggattaaga ttgataaatc ggtaccgcaa
aagctaggtg taaataaaca 300ctacaaaatt ggatgttccc ctatcggcct gtactcggct
actcgttctt gtgatggcat 360gttatttctt cttggtgttt ggtgaactcc cttatgaaat
ttgggcgcaa agaaatcgcc 420ctcaagggtt gatcttatgc catcgtcatg ataaacagtg
aagcacggat gatcctttac 480gttgttttta acaaactttg tcagaaaact agcaatgtta
acttcttaat gatgatttca 540caacaaaaaa ggtaaccttg ctactaacat aacaaaagac
ttgttgctta ttaattatat 600gtttttttaa tctttgatca ggggacaaca gtggttgata
acctgttgaa cagtgaggat 660gtccactaca tgctcggggc cttgaggact cttggtctct
ctgtcgaagc ggacaaagct 720gccaaaagag ctgtagttgt tggctgtggt ggaaagttcc
cagttgagga tgctagagag 780gaagtgcagc tcttcttggg gaatgctgga atcgcaatgc
ggtcattgac agcagctgtt 840actgctgctg gtggaaatgc aacgtatgtt tcctctctct
ctctacaata cttgttggag 900ttagtatgaa acccatgtgt atgtctagtg gcttatggtg
tattggtttt tgaacttcag 960ttacgtgctt gatggagtac caagaatgag ggagagaccc
attggcgact tggttgtcgg 1020attgaagcag cttggtgcag atgttgattg tttccttggc
actgactgcc cacctgttcg 1080tgtcaatgga atcggagggc tacctggtgg caaggttagt
tactaagggc cacatgttac 1140attcttctgt aaatggtaca actattgtcg agcttttgca
tttgtaagga aaacattgat 1200tgatctgaat ttgatgctac accacaaaat atctacaaat
ggtcatccct aactagcaaa 1260ccatgtctcc attaagctca atgaagtaat acttggcatg
tgtttatcaa cttaatttcc 1320atcttctggg gtattgcctg ttttctagtc taatagcatt
tgtttttaga attagctctt 1380acaactgtta tgttctacag gtcaagctgt ctggctccat
cagcagtcag tacttgagtg 1440ccttgctgat ggctgctcct ttggctcttg gggatgtgga
gattgaaatc attgataaat 1500taatctccat tccctacgtc gaaatgacat tgagattgat
ggagcgtttt ggtgtgaaag 1560cagagcattc tgatagctgg gacagattct acattaaggg
aggtcaaaaa tacaagtaag 1620ctctgtaatg tatttcacta ctttgatgcc aatgtttcag
ttttcagttt tccaaacagt 1680cgcatcaata tttgaataga tgcactgtag aaaaaaatca
ttgcagggaa aaactagtac 1740tgagtatttt gactgtaaat tatttaacca gtcggaatat
agtcagtcta ttggagtcaa 1800gagcgtgaac cgaaatagcc agttaattat cccattatac
agaggacaac catgtatact 1860attgaaactt ggtttaagag aatctaggta gctggactcg
tagctgcttg gcatggatac 1920cttcttatct ttaggaaaag acacttgatt ttttttctgt
ggccctctat gatgtgtgaa 1980cctgcttctc tattgcttta gaaggatata tctatgtcgt
tatgcaacat gcttccctta 2040gtcatttgta ctgaaatcag tttcataagt tcgttagtgg
ttccctaaac gaaaccttgt 2100ttttctttgc aatcaacagg tcccctaaaa atgcctatgt
tgaaggtgat gcctcaagcg 2160caagctattt cttggctggt gctgcaatta ctggagggac
tgtgactgtg gaaggttgtg 2220gcaccaccag tttgcaggta aagatttctt ggctggtgct
acgataactg cttttgtctt 2280tttggtttca gcattgttct cagagtcact aaataacatt
atcatctgca aacgtcaaat 2340agacatactt aggtgaatgg atattcatgt aaccgtttcc
ttacaaattt gctgaaacct 2400cagggtgatg tgaagtttgc tgaggtactg gagatgatgg
gagcgaaggt tacatggacc 2460gagactagcg taactgttac tggcccaccg cgggagccat
ttgggaggaa acacctcaag 2520gcgattgatg tcaacatgaa caagatgcct gatgtcgcca
tgactcttgc tgtggttgcc 2580ctctttgccg atggcccgac agccatcaga gacggtaaaa
cattctcagc cctacaacca 2640tgcctcttct acatcactac ttgacaagac taaaaactat
tggctcgttg gcagtggctt 2700cctggagagt aaaggagacc gagaggatgg ttgcgatccg
gacggagcta accaaggtaa 2760ggctacatac ttcacatgtc tcacgtcgtc tttccatagc
tcgctgcctc ttagcggctt 2820gcctgcggtc gctccatcct cggttgctgt ctgtgttttc
cacagctggg agcatctgtt 2880gaggaagggc cggactactg catcatcacg ccgccggaga
agctgaacgt gacggcgatc 2940gacacgtacg acgaccacag gatggccatg gccttctccc
ttgccgcctg tgccgaggtc 3000cccgtgacca tccgggaccc tgggtgcacc cggaagacct
tccccgacta cttcgatgtg 3060ctgagcactt tcgtcaagaa ttaataaagc gtgcgatact
accacgcagc ttgattgaag 3120tgataggctt gtgctgagga aatacatttc ttttgttctg
ttttttctct ttcacgggat 3180taagttttga gtctgtaacg ttagttgttt gtagcaagtt
tctatttcgg atcttaagtt 3240tgtgcactgt aagccaaatt tcatttcaag agtggttcgt
tggaataata agaataataa 3300attacgtttc agtggctgtc aagcctgctg ctacgtttta
ggagatggca ttagacattc 3360atcatcaaca acaataaaac cttttagcct caaacaataa
tagtgaagtt attttttagt 3420cctaaacaag ttgcattagg atatagttaa aacacaaaag
aagctaaagt tagggtttag 3480acatgtggat attgttttcc atgtatagta tgttctttct
ttgagtctca tttaactacc 3540tctacacata ccaactttag ttttttttct acctcttcat
gttactatgg tgccttctta 3600tcccactgag cattggtata tttagaggtt tttgttgaac
atgcctaaat catctcaatc 3660aacgatggac aatcttttct tcgattgagc tgaggtacgt
catctaga 3708
518 3714 DNA Zea mays
518 ctgcagccca tcaaggagat
ctccggcacc gtcaagctgc cggggtccaa gtcgctttcc 60aacaggatcc tcctgctcgc
cgccctgtcc gaggtgagcg attttggtgc ttgctgcgct 120gccctgtctc actgctacct
aaatgttttg cctgtcgaat accatggatt ctcggtgtaa 180tccatatctg cacgatcaga
tatgcaccgc atgtcgcata tctgagctct ctctaatttg 240tctagtagtt tgtatacgga
ttaagattga taaatcggta ccgcaaaagc taggtgtaaa 300taaacactac aaaattggat
gttcccctat cggcctgtac tcggctactc gttcttgtga 360tggcatgtta tttcttcttg
gtgtttggtg aactccctta tgaaatttgg gcgcaaagaa 420atcgccctca agggttgatc
ttatgccatc gtcatgataa acagtgaagc acggatgatc 480ctttacgttg tttttaacaa
actttgtcag aaaactagca atgttaactt cttaatgatg 540atttcacaac aaaaaaggta
accttgctac taacataaca aaagacttgt tgcttattaa 600ttatatgttt ttttaatctt
tgatcagggg acaacagtgg ttgataacct gttgaacagt 660gaggatgtcc actacatgct
cggggccttg aggactcttg gtctctctgt cgaagcggac 720aaagctgcca aaagagctgt
agttgttggc tgtggtggaa agttcccagt tgaggatgct 780aaagaggaag tgcagctctt
cttggggaat gctggaatcg caatgcggtc attgacagca 840gctgttactg ctgctggtgg
aaatgcaacg tatgtttcct ctctctctct acaatacttg 900ttggagttag tatgaaaccc
atgtgtatgt ctagtggctt atggtgtatt ggtttttgaa 960cttcagttac gtgcttgatg
gagtaccaag aatgagggag agacccattg gcgacttggt 1020tgtcggattg aagcagcttg
gtgcagatgt tgattgtttc cttggcactg actgcccacc 1080tgttcgtgtc aatggaatcg
gagggctacc tggtggcaag gttagttact aagggccaca 1140tgttacattc ttctgtaaat
ggtacaacta ttgtcgagct tttgcatttg taaggaaaac 1200attgattgat ctgaatttga
tgctacacca caaaatatct acaaatggtc atccctaact 1260agcaaaccat gtctccatta
agctcaatga agtaatactt ggcatgtgtt tatcaactta 1320atttccatct tctggggtat
tgcctgtttt ctagtctaat agcatttgtt tttagaatta 1380gctcttacaa ctgttatgtt
ctacaggtca agctgtctgg ctccatcagc agtcagtact 1440tgagtgcctt gctgatggct
gctcctttgg ctcttgggga tgtggagatt gaaatcattg 1500ataaattaat ctccattccc
tacgtcgaaa tgacattgag attgatggag cgttttggtg 1560tgaaagcaga gcattctgat
agctgggaca gattctacat taagggaggt caaaaataca 1620agtaagctct gtaatgtatt
tcactacttt gatgccaatg tttcagtttt cagttttcca 1680aacagtcgca tcaatatttg
aatagatgca ctgtagaaaa aaatcattgc agggaaaaac 1740tagtactgag tattttgact
gtaaattatt taaccagtcg gaatatagtc agtctattgg 1800agtcaagagc gtgaaccgaa
atagccagtt aattatccca ttatacagag gacaaccatg 1860tatactattg aaacttggtt
taagagaatc taggtagctg gactcgtagc tgcttggcat 1920ggataccttc ttatctttag
gaaaagacac ttgatttttt ttctgtggcc ctctatgatg 1980tgtgaacctg cttctctatt
gctttagaag gatatatcta tgtcgttatg caacatgctt 2040cccttagtca tttgtactga
aatcagtttc ataagttcgt tagtggttcc ctaaacgaaa 2100ccttgttttt ctttgcaatc
aacaggtccc ctaaaaatgc ctatgttgaa ggtgatgcct 2160caagcgcaag ctatttcttg
gctggtgctg caattactgg agggactgtg actgtggaag 2220gttgtggcac caccagtttg
caggtaaaga tttcttggct ggtgctacga taactgcttt 2280tgtctttttg gtttcagcat
tgttctcaga gtcactaaat aacattatca tctgcaaacg 2340tcaaatagac atacttaggt
gaatggatat tcatgtaacc gtttccttac aaatttgctg 2400aaacctcagg gtgatgtgaa
gtttgctgag gtactggaga tgatgggagc gaaggttaca 2460tggaccgaga ctagcgtaac
tgttactggc ccaccgcggg agccatttgg gaggaaacac 2520ctcaaggcga ttgatgtcaa
catgaacaag atgcctgatg tcgccatgac tcttgctgtg 2580gttgccctct ttgccgatgg
cccgacagcc atcagagacg gtaaaacatt ctcagcccta 2640caaccatgcc tcttctacat
cactacttga caagactaaa aactattggc tcgttggcag 2700tggcttcctg gagagtaaag
gagaccgaga ggatggttgc gatccggacg gagctaacca 2760aggtaaggct acatacttca
catgtctcac gtcgtctttc catagctcgc tgcctcttag 2820cggcttgcct gcggtcgctc
catcctcggt tgctgtctgt gttttccaca gctgggagca 2880tctgttgagg aagggccgga
ctactgcatc atcacgccgc cggagaagct gaacgtgacg 2940gcgatcgaca cgtacgacga
ccacaggatg gccatggcct tctcccttgc cgcctgtgcc 3000gaggtccccg tgaccatccg
ggaccctggg tgcacccgga agaccttccc cgactacttc 3060gatgtgctga gcactttcgt
caagaattaa taaagcgtgc gatactacca cgcagcttga 3120ttgaagtgat aggcttgtgc
tgaggaaata catttctttt gttctgtttt ttctctttca 3180cgggattaag ttttgagtct
gtaacgttag ttgtttgtag caagtttcta tttcggatct 3240taagtttgtg cactgtaagc
caaatttcat ttcaagagtg gttcgttgga ataataagaa 3300taataaatta cgtttcagtg
gctgtcaagc ctgctgctac gttttaggag atggcattag 3360acattcatca tcaacaacaa
taaaaccttt tagcctcaaa caataatagt gaagttattt 3420tttagtccta aacaagttgc
attaggatat agttaaaaca caaaagaagc taaagttagg 3480gtttagacat gtggatattg
ttttccatgt atagtatgtt ctttctttga gtctcattta 3540actacctcta cacataccaa
ctttagtttt ttttctacct cttcatgtta ctatggtgcc 3600ttcttatccc actgagcatt
ggtatattta gaggtttttg ttgaacatgc ctaaatcatc 3660tcaatcaacg atggacaatc
ttttcttcga ttgagctgag gtacgtcatc taga 3714
519 3708 DNA Zea mays
519 ctgcagccca tcaaggagat ctccggcacc gtcaagctgc cggggtccaa gtcgctttcc
60aacaggatcc tcctgctcgc cgccctgtcc gaggtgagcg attttggtgc ttgctgcgct
120gccctgtctc actgctacct aaatgttttg cctgtcgaat accatggatt ctcggtgtaa
180tccatctcac gatcagatgc accgcatgtc gcatgcctag ctctctctaa tttgtctagt
240agtttgtata cggattaaga ttgataaatc ggtaccgcaa aagctaggtg taaataaaca
300ctacaaaatt ggatgttccc ctatcggcct gtactcggct actcgttctt gtgatggcat
360gttatttctt cttggtgttt ggtgaactcc cttatgaaat ttgggcgcaa agaaatcgcc
420ctcaagggtt gatcttatgc catcgtcatg ataaacagtg aagcacggat gatcctttac
480gttgttttta acaaactttg tcagaaaact agcaatgtta acttcttaat gatgatttca
540caacaaaaaa ggtaaccttg ctactaacat aacaaaagac ttgttgctta ttaattatat
600gtttttttaa tctttgatca ggggacaaca gtggttgata acctgttgaa cagtgaggat
660gtccactaca tgctcggggc cttgaggact cttggtctct ctgtcgaagc ggacaaagct
720gccaaaagag ctgtagttgt tggctgtggt ggaaagttcc cagttgagga tgctagaaag
780gaagtgcagc tcttcttggg gaatgctgga atcgcaatgc ggtcattgac agcagctgtt
840actgctgctg gtggaaatgc aacgtatgtt tcctctctct ctctacaata cttgttggag
900ttagtatgaa acccatgtgt atgtctagtg gcttatggtg tattggtttt tgaacttcag
960gtacgtgctt gatggagtac caagaatgag ggagagaccc attggcgact tggttgtcgg
1020attgaagcag cttggtgcag atgttgattg tttccttggc actgactgcc cacctgttcg
1080tgtcaatgga atcggagggc tacctggtgg caaggttagt tactaagggc cacatgttac
1140attcttctgt aaatggtaca actattgtcg agcttttgca tttgtaagga aaacattgat
1200tgatctgaat ttgatgctac accacaaaat atctacaaat ggtcatccct aactagcaaa
1260ccatgtctcc attaagctca atgaagtaat acttggcatg tgtttatcaa cttaatttcc
1320atcttctggg gtattgcctg ttttctagtc taatagcatt tgtttttaga attagctctt
1380acaactgtta tgttctacag gtcaagctgt ctggctccat cagcagtcag tacttgagtg
1440ccttgctgat ggctgctcct ttggctcttg gggatgtgga gattgaaatc attgataaat
1500taatctccat tccctacgtc gaaatgacat tgagattgat ggagcgtttt ggtgtgaaag
1560cagagcattc tgatagctgg gacagattct acattaaggg aggtcaaaaa tacaagtaag
1620ctctgtaatg tatttcacta ctttgatgcc aatgtttcag ttttcagttt tccaaacagt
1680cgcatcaata tttgaataga tgcactgtag aaaaaaatca ttgcagggaa aaactagtac
1740tgagtatttt gactgtaaat tatttaacca gtcggaatat agtcagtcta ttggagtcaa
1800gagcgtgaac cgaaatagcc agttaattat cccattatac agaggacaac catgtatact
1860attgaaactt ggtttaagag aatctaggta gctggactcg tagctgcttg gcatggatac
1920cttcttatct ttaggaaaag acacttgatt ttttttctgt ggccctctat gatgtgtgaa
1980cctgcttctc tattgcttta gaaggatata tctatgtcgt tatgcaacat gcttccctta
2040gtcatttgta ctgaaatcag tttcataagt tcgttagtgg ttccctaaac gaaaccttgt
2100ttttctttgc aatcaacagg tcccctaaaa atgcctatgt tgaaggtgat gcctcaagcg
2160caagctattt cttggctggt gctgcaatta ctggagggac tgtgactgtg gaaggttgtg
2220gcaccaccag tttgcaggta aagatttctt ggctggtgct acgataactg cttttgtctt
2280tttggtttca gcattgttct cagagtcact aaataacatt atcatctgca aacgtcaaat
2340agacatactt aggtgaatgg atattcatgt aaccgtttcc ttacaaattt gctgaaacct
2400cagggtgatg tgaagtttgc tgaggtactg gagatgatgg gagcgaaggt tacatggacc
2460gagactagcg taactgttac tggcccaccg cgggagccat ttgggaggaa acacctcaag
2520gcgattgatg tcaacatgaa caagatgcct gatgtcgcca tgactcttgc tgtggttgcc
2580ctctttgccg atggcccgac agccatcaga gacggtaaaa cattctcagc cctacaacca
2640tgcctcttct acatcactac ttgacaagac taaaaactat tggctcgttg gcagtggctt
2700cctggagagt aaaggagacc gagaggatgg ttgcgatccg gacggagcta accaaggtaa
2760ggctacatac ttcacatgtc tcacgtcgtc tttccatagc tcgctgcctc ttagcggctt
2820gcctgcggtc gctccatcct cggttgctgt ctgtgttttc cacagctggg agcatctgtt
2880gaggaagggc cggactactg catcatcacg ccgccggaga agctgaacgt gacggcgatc
2940gacacgtacg acgaccacag gatggccatg gccttctccc ttgccgcctg tgccgaggtc
3000cccgtgacca tccgggaccc tgggtgcacc cggaagacct tccccgacta cttcgatgtg
3060ctgagcactt tcgtcaagaa ttaataaagc gtgcgatact accacgcagc ttgattgaag
3120tgataggctt gtgctgagga aatacatttc ttttgttctg ttttttctct ttcacgggat
3180taagttttga gtctgtaacg ttagttgttt gtagcaagtt tctatttcgg atcttaagtt
3240tgtgcactgt aagccaaatt tcatttcaag agtggttcgt tggaataata agaataataa
3300attacgtttc agtggctgtc aagcctgctg ctacgtttta ggagatggca ttagacattc
3360atcatcaaca acaataaaac cttttagcct caaacaataa tagtgaagtt attttttagt
3420cctaaacaag ttgcattagg atatagttaa aacacaaaag aagctaaagt tagggtttag
3480acatgtggat attgttttcc atgtatagta tgttctttct ttgagtctca tttaactacc
3540tctacacata ccaactttag ttttttttct acctcttcat gttactatgg tgccttctta
3600tcccactgag cattggtata tttagaggtt tttgttgaac atgcctaaat catctcaatc
3660aacgatggac aatcttttct tcgattgagc tgaggtacgt catctaga
3708
520 464 PRT Zea mays
520 Met Gln Leu Asp Leu Asn Val Ala Glu Ala Pro Pro Pro Val Glu Met 16
Glu Ala Ser Asp Ser Gly Ser Ser Val Leu Asn Ala Ser Glu Ala Ala 32
Ser Ala Gly Gly Ala Pro Ala Pro Ala Glu Glu Gly Ser Ser Ser Thr 48
Pro Ala Val Leu Glu Phe Ser Ile Leu Ile Arg Ser Asp Ser Asp Ala 64
Ala Gly Ala Asp Glu Asp Glu Asp Ala Thr Pro Ser Pro Pro Pro Arg 80
His Arg His Gln His Gln Gln Gln Leu Val Thr Arg Glu Leu Phe Pro 96
Ala Gly Ala Gly Pro Pro Ala Pro Thr Pro Arg His Trp Ala Glu Leu 112
Gly Phe Phe Arg Ala Asp Leu Gln Gln Gln Gln Ala Pro Gly Pro Arg 128
Ile Val Pro His Pro His Ala Ala Pro Pro Pro Ala Lys Lys Ser Arg 144
Arg Gly Pro Arg Ser Arg Ser Ser Gln Tyr Arg Gly Val Thr Phe Tyr 160
Arg Arg Thr Gly Arg Trp Glu Ser His Ile Trp Asp Cys Gly Lys Gln 176
Val Tyr Leu Gly Gly Phe Asp Thr Ala His Ala Ala Ala Arg Ala Tyr 192
Asp Arg Ala Ala Ile Lys Phe Arg Gly Val Asp Ala Asp Ile Asn Phe 208
Asn Leu Ser Asp Tyr Glu Asp Asp Met Lys Gln Met Gly Ser Leu Ser 224
Lys Glu Glu Phe Val His Val Leu Arg Arg Gln Ser Thr Gly Phe Ser 240
Arg Gly Ser Ser Arg Tyr Arg Gly Val Thr Leu His Lys Cys Gly Arg 256
Trp Glu Ala Arg Met Gly Gln Phe Leu Gly Lys Lys Tyr Ile Tyr Leu 272
Gly Leu Phe Asp Ser Glu Val Glu Ala Ala Arg Ala Tyr Asp Lys Ala 288
Ala Ile Lys Cys Asn Gly Arg Glu Ala Val Thr Asn Phe Glu Pro Ser 304
Thr Tyr His Gly Glu Leu Pro Thr Glu Val Ala Asp Val Asp Leu Asn 320
Leu Ser Ile Ser Gln Pro Ser Pro Gln Arg Asp Lys Asn Ser Cys Leu 336
Gly Leu Gln Leu His His Gly Pro Phe Glu Gly Ser Glu Leu Lys Lys 352
Thr Lys Ile Asp Asp Ala Pro Ser Glu Leu Pro Gly Arg Pro Arg Gln 368
Leu Ser Pro Leu Val Ala Glu His Pro Pro Ala Trp Pro Ala Gln Pro 384
Pro His Pro Phe Phe Val Phe Thr Asn His Glu Met Ser Ala Ser Gly 400
Asp Leu His Arg Arg Pro Ala Gly Ala Val Pro Ser Trp Ala Trp Gln 416
Val Ala Ala Ala Ala Pro Pro Pro Ala Ala Leu Pro Ser Ser Ala Ala 432
Ala Ser Ser Gly Phe Ser Asn Thr Ala Thr Thr Ala Ala Thr Thr Ala 448
Pro Ser Ala Ser Ser Leu Arg Tyr Cys Pro Pro Pro Pro Pro Pro Ser 464
521 1413 DNA Zea mays
521 atgcagttgg atctgaacgt ggccgaggcg ccgccgccgg tggagatgga ggcgagcgac
60tcggggtcgt cggtgctgaa cgcgtcggaa gcggcgtcgg cgggcggcgc gcccgcgccg
120gcggaggagg gatctagctc aacgccggcc gtgctggagt tcagcatcct catccggagc
180gatagcgacg cggccggcgc ggacgaggac gaggacgcca cgccatcgcc tcctcctcgc
240caccgccacc agcaccagca gcagctcgtg acccgcgagc tgttcccggc cggcgccggt
300ccgccggccc cgacgccgcg gcattgggcc gagctcggct tcttccgcgc cgacctgcag
360cagcaacagg cgccgggccc caggatcgtg ccgcacccac acgccgcgcc gccgccggcc
420aagaagagcc gccgcggccc gcgctcccgc agctcgcagt accgcggcgt caccttctac
480cgccgcacag gccgctggga gtcccacatc tgggattgcg gcaagcaggt gtacctaggt
540ggattcgaca ccgctcacgc cgctgcaagg gcgtacgacc gggcggcgat caagttccgc
600ggcgtcgacg ccgacatcaa cttcaacctc agcgactacg aggacgacat gaagcagatg
660gggagcctgt ccaaggagga gttcgtgcac gtcctgcgcc gtcagagcac cggcttctcg
720agaggcagct ccaggtacag aggcgtcacc ctgcacaagt gcggccgctg ggaggcgcgc
780atggggcagt tcctcggcaa gaagtacata taccttgggc tattcgacag cgaagtagag
840gctgcaagag cctacgacaa ggccgccatc aaatgcaatg gcagagaggc cgtgacgaac
900ttcgagccga gcacgtatca cggggagctg ccgactgaag ttgctgatgt cgatctgaac
960ctgagcatat ctcagccgag cccccaaaga gacaagaaca gctgcctagg tctgcagctc
1020caccacggac cattcgaggg ctccgaactg aagaaaacca agatcgacga tgctccctct
1080gagctaccgg gccgccctcg tcagctgtct cctctcgtgg ctgagcatcc gccggcctgg
1140cctgcgcagc cgcctcaccc cttcttcgtc ttcacaaacc atgagatgag tgcatcagga
1200gatctccaca ggaggcctgc aggggctgtt cccagctggg catggcaggt ggcagcagca
1260gctcctcctc ctgccgccct gccgtcgtcc gctgcagcat catcaggatt ctccaacacc
1320gccacgacag ctgccaccac cgccccatcg gcctcctccc tccggtactg cccgccgccg
1380ccgccgccgt cgagccatca ccatccccgc tga
1413
522 514 PRT Zea mays
522 Met Thr Thr Ser Thr Thr Ala Lys Gln Leu Arg Arg Val Arg Thr Leu 16
Gly Arg Gly Ala Ser Gly Ala Val Val Trp Leu Ala Ser Asp Glu Ala 32
Ser Gly Glu Leu Val Ala Val Lys Ser Ala Arg Ala Ala Gly Ala Ala 48
Ala Gln Leu Gln Arg Glu Gly Arg Val Leu Arg Gly Leu Ser Ser Pro 64
His Ile Val Pro Cys Leu Gly Ser Arg Ala Ala Ala Gly Gly Glu Tyr 80
Gln Leu Leu Leu Glu Phe Ala Pro Gly Gly Ser Leu Ala Asp Glu Ala 96
Ala Arg Ser Gly Gly Gly Arg Leu Ala Glu Arg Ala Ile Gly Ala Tyr 112
Ala Gly Asp Val Ala Arg Gly Leu Ala Tyr Leu His Gly Arg Ser Leu 128
Val His Gly Asp Val Lys Ala Arg Asn Val Val Ile Gly Gly Asp Gly 144
Arg Ala Arg Leu Thr Asp Phe Gly Cys Ala Arg Pro Ala Gly Gly Ser 160
Thr Arg Pro Val Gly Gly Thr Pro Ala Phe Met Ala Pro Glu Val Ala 176
Arg Gly Gln Glu Gln Gly Pro Ala Ala Asp Val Trp Ala Leu Gly Cys 192
Met Val Val Glu Leu Ala Thr Gly Arg Ala Pro Trp Ser Asp Val Glu 208
Gly Asp Asp Leu Leu Ala Ala Leu His Arg Ile Gly Tyr Thr Asp Asp 224
Val Pro Glu Val Pro Ala Trp Leu Ser Pro Glu Ala Lys Asp Phe Leu 240
Ala Gly Cys Phe Glu Arg Arg Ala Ala Ala Arg Pro Thr Ala Ala Gln 256
Pro Ala Ala His Pro Phe Val Val Ala Ser Ala Ser Ala Ala Ala Ala 272
Ile Arg Gly Pro Ala Lys Gln Glu Val Val Pro Ser Pro Lys Ser Thr 288
Leu His Asp Ala Phe Trp Asp Ser Asp Ala Glu Asp Glu Ala Asp Glu 304
Met Ser Thr Gly Ala Ala Ala Glu Arg Ile Gly Ala Leu Ala Cys Ala 320
Ala Ser Ala Leu Pro Asp Trp Asp Thr Glu Glu Gly Trp Ile Asp Leu 336
Gln Asp Asp His Ser Ala Gly Thr Ala Asp Ala Pro Pro Ala Pro Val 352
Ala Asp Tyr Phe Ile Ser Trp Ala Glu Pro Ser Asp Ala Glu Leu Glu 368
Pro Phe Val Ala Val Ala Ala Ala Ala Gly Leu Pro His Val Ala Gly 384
Val Ala Leu Ala Gly Ala Thr Ala Val Asn Leu Gln Gly Ser Tyr Tyr 400
Tyr Tyr Pro Pro Met His Leu Gly Val Arg Gly Asn Glu Ile Pro Arg 416
Pro Leu Leu Asp His His Gly Asp Gly Leu Glu Lys Gly Gln Gly Ser 432
His Arg Val Cys Asn Arg Glu Thr Glu Lys Val Thr Met Lys Arg Ile 448
Ser Leu Lys Arg Arg Ala Ala Phe Leu Leu Asp Gln His His Val Arg 464
Ser Leu Asp Lys Leu Glu Tyr Arg Pro Arg His Asp Arg Met Leu Arg 480
Arg Arg Gln Ser Ile Tyr Arg Ser Asn Ser Val Leu Gly Tyr Asp Val 496
Ser Lys Gly Arg Gln Val Arg Trp Arg Arg Ala Val Cys Ile Ala Val 512
Ala Ala 514
523 1545 DNA Zea mays
523 atgacgacgt cgaccacggc gaagcagctc cggcgcgtgc
gcacgctcgg ccgcggcgcg 60tcgggcgccg tggtgtggct ggcctccgac gaggcctcgg
gcgagctggt ggcggtcaag 120tcggcgcgcg ccgccggggc cgcggcgcag ctgcagcgcg
agggccgcgt cctccggggc 180ctctcgtcgc cgcacatcgt gccctgcctc ggctcccgcg
ccgcggcggg cggcgagtac 240cagctcctgc tggagttcgc gccgggcggg tcgctggccg
acgaggccgc caggagcggc 300gggggccgcc tcgcggagcg cgccatcggc gcctacgccg
gggacgtggc gcgcgggctg 360gcgtacctcc acggccggtc gctcgtgcac ggggacgtca
aggcccggaa cgtggtcatc 420ggcggcgacg ggcgcgccag gctgaccgac ttcgggtgcg
cgaggccggc cggcgggtcg 480acgcgccccg tcgggggcac cccggcgttc atggcgcccg
aggtggcgcg cggccaggag 540cagggccccg ccgccgacgt ctgggcgctc gggtgcatgg
tcgtcgagct ggccacgggc 600cgcgcgccct ggagcgacgt ggagggcgac gacctcctcg
ccgcgctcca ccggatcggg 660tacacggacg acgtgccgga ggtgcccgcg tggctgtcgc
ccgaggccaa ggacttcctg 720gccggctgct tcgagcgccg cgccgccgcc cggcccacgg
ccgcgcagcc cgcggcgcac 780ccgttcgtcg tcgcctccgc ctccgccgcc gccgccatcc
gcggcccggc gaagcaggag 840gtggtcccgt cacccaagag cacgctgcac gacgcgttct
gggactcgga cgccgaggac 900gaagcggacg agatgtcgac gggcgcggcg gccgagagga
tcggggcatt ggcgtgcgcc 960gcctccgcgc tgcctgactg ggacaccgag gaaggctgga
tcgacctcca ggacgaccac 1020tcggccggaa ctgccgacgc accgccggcg cccgtcgcgg
actacttcat cagctgggcg 1080gagccgtcag acgcagagct ggaaccattc gtcgccgtcg
ccgccgccgc aggtctcccg 1140cacgttgcag gagttgcatt agcaggcgcc accgccgtta
acctgcaggg cagttattat 1200tattacccgc ctatgcatct aggcgtccgc ggaaacgaga
ttccacgccc gttgttggat 1260catcatggcg acgggttaga aaaggggcag ggatcccacc
gcgtttgtaa cagagaaaca 1320gaaaaggtaa caatgaaacg aatttcgtta aaaagaagag
ctgctttcct tctcgaccag 1380catcacgtgc gatcgctgga caaactggaa tatcgtccac
gtcacgaccg aatgctgcgt 1440cgacggcaat ctatatatcg gagcaatagc gtccttggtt
acgacgttag caaaggtagg 1500caggtccgtt ggcgccgtgc ggtttgcatt gccgttgctg
cctga 1545
524 671 DNA Zea mays
524 cggatccact agtaacggcc
gccagtgtgc tggaattcgc ccttgacggc ccgggctggt 60atttcaaaac tatagtattt
taaaattgca ttaacaaaca tgtcctaatt ggtactcctg 120agatactata ccctcctgtt
ttaaaatagt tggcattatc gaattatcat tttacttttt 180aatgttttct cttcttttaa
tatattttat gaattttaat gtattttaaa atgttatgca 240gttcgctctg gacttttctg
ctgcgcctac acttgggtgt actgggccta aattcagcct 300gaccgaccgc ctgcattgaa
taatggatga gcaccggtaa aatccgcgta cccaactttc 360gagaagaacc gagacgtggc
gggccgggcc accgacgcac ggcaccagcg actgcacacg 420tcccgccggc gtacgtgtac
gtgctgttcc ctcactggcc gcccaatcca ctcatgcatg 480cccacgtaca cccctgccgt
ggcgcgccca gatcctaatc ctttcgccgt tctgcacttc 540tgctgcctat aaatggcggc
atcgaccgtc acctgcttca ccaccggcga gccacatcga 600gaacacgatc gagcacacaa
gcacgaagac tcgtttagga gaaaccacaa accaccaagc 660cgtgcaagca c
671
525 245 PRT Zea mays
525 Met Gly Arg Gly Lys Val Gln Leu Lys Arg Ile Glu Asn Lys Ile Asn 16
Arg Gln Val Thr Phe Ser Lys Arg Arg Ser Gly Leu Leu Lys Lys Ala 32
His Glu Ile Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe 48
Ser Thr Lys Gly Lys Leu Tyr Glu Tyr Ser Thr Asp Ser Cys Met Asp 64
Lys Ile Leu Glu Arg Tyr Glu Arg Tyr Ser Tyr Ala Glu Lys Val Leu 80
Ile Ser Ala Glu Tyr Glu Thr Gln Gly Asn Trp Cys His Glu Tyr Arg 96
Lys Leu Lys Ala Lys Val Glu Thr Ile Gln Lys Cys Gln Lys His Leu 112
Met Gly Glu Asp Leu Glu Thr Leu Asn Leu Lys Glu Leu Gln Gln Leu 128
Glu Gln Gln Leu Glu Ser Ser Leu Lys His Ile Arg Thr Arg Lys Ser 144
Gln Leu Met Val Glu Ser Ile Ser Ala Leu Gln Arg Lys Glu Lys Ser 160
Leu Gln Glu Glu Asn Lys Val Leu Gln Lys Glu Leu Ala Glu Lys Gln 176
Lys Asp Gln Arg Gln Gln Val Gln Arg Asp Gln Thr Gln Gln Gln Thr 192
Ser Ser Ser Ser Thr Ser Phe Met Leu Arg Glu Ala Ala Pro Thr Thr 208
Asn Val Ser Ile Phe Pro Val Ala Ala Gly Gly Arg Val Val Glu Gly 224
Ala Ala Ala Gln Pro Gln Ala Arg Val Gly Leu Pro Pro Trp Met Leu 240
Ser His Leu Ser Cys 245
526 738 DNA Zea mays
526 atggggcgcg ggaaggtgca gctgaagcgg atcgagaaca agatcaaccg ccaggtgaca
60ttctccaagc gccgctcggg gctactcaag aaggcgcacg agatctccgt gctctgcgac
120gccgaggtcg cgctcatcat cttctccacc aagggcaagc tctacgagta ctctaccgat
180tcatgtatgg acaaaattct tgaacggtat gagcgctact cctatgcaga aaaggttctc
240atttccgcag aatatgaaac tcagggcaat tggtgccatg aatatagaaa actaaaggcg
300aaggtcgaga caatacagaa atgtcaaaag cacctcatgg gagaggatct tgaaactttg
360aatctcaaag agcttcagca actagagcag cagctggaga gttcactgaa acatatcaga
420acaaggaaga gccagcttat ggtcgagtca atttcagcgc tccaacggaa ggagaagtca
480ctgcaggagg agaacaaggt tctgcagaag gagctcgcgg agaagcagaa agaccagcgg
540cagcaagtgc aacgggacca aactcaacag cagaccagtt cgtcttccac gtccttcatg
600ttaagggaag ctgccccaac aacaaatgtc agcatcttcc ctgtggcagc aggcgggagg
660gtggtggaag gggcagcagc gcagccgcag gctcgcgttg gactgccacc atggatgctt
720agccatctga gctgctga
738
527 80 DNA Zea mays misc_feature (1)..(80) sequence of Figure 34B
527 gctaaagagg aagtgcagct cttcttgggg aatgctggaa ctgcaatgcg gccattgaca 60
gcagctgtta ctgctgctgg 80
528 80 DNA Zea mays misc_feature (1)..(80) sequence of Figure 34c
528 gctagagagg aagtgcagct cttcttgggg aatgctggaa tcgcaatgcg gtcattgaca 60
gcagctgtta ctgctgctgg 80
529 37 DNA Zea mays misc_feature (1)..(37) sequence of Figure 35b
529 catctcacga tcagatgcac cgcatgtcgc atgccta 37
530 42 DNA Zea mays misc_feature (1)..(42) sequence of Figure 35c
530 catatctgca cgatcagata tgcaccgcat gtcgcatatc tg 42
531 31 DNA Zea mays misc_feature (1)..(31) sequence of Figure 37
531 gtttttgaac ttcagttacg tgcttgatgg a 31
532 31 DNA Zea mays misc_feature (1)..(31) sequence of Figure 37
532 gtttttgaac ttcaggtacg tgcttgatgg a 31
533 459 DNA Artificial Sequence Southern genomic probe
533 agctttatcc
atccatccat cgcgctagct ggctgcaggc acgggttatc ttatcttgtc 60gtccagagga
cgacacacgg ccggccggtg aagtaaaagg gagtaatctt attttgccag 120gacgaggggc
ggtacatgat attacacacg taccatgcat gcatatatgc atggacaagg 180tacgtcgtcg
tcgatcgacg tcgatgcata tgtgtgtatg tatgtacgtg cataatgcat 240ggtaccagct
gctggcttat atatatttgt caccgatcga tgcatgctgc tgctctacac 300ggtttgacac
tttaatttga ctcatcgatg accttgctag atagtagcgg ctcgtcaatt 360aatgagccat
caagttaaca agagggcacg ggcttgcgcg actgattcca ccttattaac 420atacgccctg
cgcccgcgcg tgctgtacgt acgagaatt
459
534 446 DNA Artificial Sequence Southern MoPAT probe
534 tcgaagtcgc
gctgccagaa gccgacgtcg tgccagccgc cgtgcttgta gccggcggcg 60cggagggtgc
cgcgggcggt gtagccgagg gcctcgtgga ggcgcacgga cgggtcgttc 120gggaggccga
tcacggccac cacggacttg aagccctggg cctccatgct cttgaggagg 180tgggtgtaga
gggtggagcc gaggccgagg cgctggtggc ggtgggacac gtacacggtg 240gactccacgg
tccagtcgta ggcgttgcgg gccttccacg ggccggcgta ggcgatgccg 300gccaccacgc
cctccacctc ggccacgagc cacgggtagc ggtcctggag gcgctccagg 360tcgtcgatcc
actcctgcgg ggtctgcggc tcggtgcgga agttcacggt ggaggtctcg 420atgtagtggt
tcacgatgtc gcacac
446
535 20 DNA Artificial Sequence RF-FPCas-1
535 gcaggtctca cgacggttgg 20
536 23 DNA Artificial Sequence RF-FPCas-2
536 gtaaagtacg cgtacgtgtg agg 23
537 23 DNA Artificial Sequence ALSCas-4
537 gctgctcgat tccgtcccca tgg 23
538 804 DNA Artificial Sequence ALS modification repair template 804
538 agcttacagc cgccgcaacc atggccaccg ccgccgccgc gtctaccgcg ctcactggcg
60ccactaccgc tgcgcccaag gcgaggcgcc gggcgcacct cctggccacc cgccgcgccc
120tcgccgcgcc catcaggtgc tcagcggcgt cacccgccat gccgatggct cccccggcca
180ccccgctccg gccgtggggc cccaccgatc cccgcaaggg cgccgacatc ctcgtcgagt
240ccctcgagcg ctgcggcgtc cgcgacgtct tcgcctaccc cggcggcgcg tccatggaga
300tccaccaggc actcacccgc tcccccgtca tcgccaacca cctcttccgc cacgagcaag
360gggaggcctt tgcggcctcc ggctacgcgc gctcctcggg ccgcgtcggc gtctgcatcg
420ccacctccgg ccccggcgcc accaaccttg tctccgcgct cgccgacgcg ttgctcgact
480ccgtccccat tgtcgccatc acgggacagg tgtcgcgacg catgattggc accgacgcct
540tccaggagac gcccatcgtc gaggtcaccc gctccatcac caagcacaac tacctggtcc
600tcgacgtcga cgacatcccc cgcgtcgtgc aggaggcttt cttcctcgcc tcctctggtc
660gaccagggcc ggtgcttgtc gacatcccca aggacatcca gcagcagatg gcggtgcctg
720tctgggacaa gcccatgagt ctgcctgggt acattgcgcg ccttcccaag ccccctgcga
780ctgagttgct tgagcagaag ggcg
804
539 127 DNA Artificial Sequence ALS modification repair template 127
539 aaccttgtct ccgcgctcgc cgacgcgttg ctcgactccg tccccattgt cgccatcacg 60
ggacaggtgt cgcgacgcat gattggcacc gacgccttcc aggagacgcc catcgtcgag 120
gtcaccc 127
540 25 DNA Artificial Sequence ALS Forward_primer
540 ctacgcacat ccccctttct cccac 25
541 36 DNA Artificial Sequence ALS Reverse_primer
541 atgcatacct agcatgcgca gagacagtgg gtcgtc 36
542 22 DNA Artificial Sequence soy ALS1-CR1, Cas9 target sequence
542 caccggccag gtcccccgcc gg 22
543 22 DNA Artificial Sequence soy ALS2-CR2, Cas9 target sequence
543 ggcgtcggtg ccgatcatcc gg 22
544 9093 DNA Artificial Sequence QC880
544 ccgggtgtga tttagtataa agtgaagtaa tggtcaaaag aaaaagtgta
aaacgaagta 60cctagtaata agtaatattg aacaaaataa atggtaaagt gtcagatata
taaaataggc 120tttaataaaa ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc
agagcagagt 180catcatgaag ctagaaaggc taccgataga taaactatag ttaattaaat
acattaaaaa 240atacttggat ctttctctta ccctgtttat attgagacct gaaacttgag
agagatacac 300taatcttgcc ttgttgtttc attccctaac ttacaggact cagcgcatgt
catgtggtct 360cgttccccat ttaagtccca caccgtctaa acttattaaa ttattaatgt
ttataactag 420atgcacaaca acaaagcttg caccggccag gtcccccgcg ttttagagct
agaaatagca 480agttaaaata aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc
ggtgcttttt 540tttgcggccg caattggatc gggtttactt attttgtggg tatctatact
tttattagat 600ttttaatcag gctcctgatt tctttttatt tcgattgaat tcctgaactt
gtattattca 660gtagatcgaa taaattataa aaagataaaa tcataaaata atattttatc
ctatcaatca 720tattaaagca atgaatatgt aaaattaatc ttatctttat tttaaaaaat
catataggtt 780tagtattttt ttaaaaataa agataggatt agttttacta ttcactgctt
attactttta 840aaaaaatcat aaaggtttag tattttttta aaataaatat aggaatagtt
ttactattca 900ctgctttaat agaaaaatag tttaaaattt aagatagttt taatcccagc
atttgccacg 960tttgaacgtg agccgaaacg atgtcgttac attatcttaa cctagctgaa
acgatgtcgt 1020cataatatcg ccaaatgcca actggactac gtcgaaccca caaatcccac
aaagcgcgtg 1080aaatcaaatc gctcaaacca caaaaaagaa caacgcgttt gttacacgct
caatcccacg 1140cgagtagagc acagtaacct tcaaataagc gaatggggca taatcagaaa
tccgaaataa 1200acctaggggc attatcggaa atgaaaagta gctcactcaa tataaaaatc
taggaaccct 1260agttttcgtt atcactctgt gctccctcgc tctatttctc agtctctgtg
tttgcggctg 1320aggattccga acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc
tgctcttgtc 1380tcttcgattc gatctatgcc tgtctcttat ttacgatgat gtttcttcgg
ttatgttttt 1440ttatttatgc tttatgctgt tgatgttcgg ttgtttgttt cgctttgttt
ttgtggttca 1500gttttttagg attcttttgg tttttgaatc gattaatcgg aagagatttt
cgagttattt 1560ggtgtgttgg aggtgaatct tttttttgag gtcatagatc tgttgtattt
gtgttataaa 1620catgcgactt tgtatgattt tttacgaggt tatgatgttc tggttgtttt
attatgaatc 1680tgttgagaca gaaccatgat ttttgttgat gttcgtttac actattaaag
gtttgtttta 1740acaggattaa aagtttttta agcatgttga aggagtcttg tagatatgta
accgtcgata 1800gtttttttgt gggtttgttc acatgttatc aagcttaatc ttttactatg
tatgcgacca 1860tatctggatc cagcaaaggc gattttttaa ttccttgtga aacttttgta
atatgaagtt 1920gaaattttgt tattggtaaa ctataaatgt gtgaagttgg agtatacctt
taccttctta 1980tttggctttg tgatagttta atttatatgt attttgagtt ctgacttgta
tttctttgaa 2040ttgattctag tttaagtaat ccatggacaa aaagtactca atagggctcg
acatagggac 2100taactccgtt ggatgggccg tcatcaccga cgagtacaag gtgccctcca
agaagttcaa 2160ggtgttggga aacaccgaca ggcacagcat aaagaagaat ttgatcggtg
ccctcctctt 2220cgactccgga gagaccgctg aggctaccag gctcaagagg accgctagaa
ggcgctacac 2280cagaaggaag aacagaatct gctacctgca ggagatcttc tccaacgaga
tggccaaggt 2340ggacgactcc ttcttccacc gccttgagga atcattcctg gtggaggagg
ataaaaagca 2400cgagagacac ccaatcttcg ggaacatcgt cgacgaggtg gcctaccatg
aaaagtaccc 2460taccatctac cacctgagga agaagctggt cgactctacc gacaaggctg
acttgcgctt 2520gatttacctg gctctcgctc acatgataaa gttccgcgga cacttcctca
ttgagggaga 2580cctgaaccca gacaactccg acgtggacaa gctcttcatc cagctcgttc
agacctacaa 2640ccagcttttc gaggagaacc caatcaacgc cagtggagtt gacgccaagg
ctatcctctc 2700tgctcgtctg tcaaagtcca ggaggcttga gaacttgatt gcccagctgc
ctggcgaaaa 2760gaagaacgga ctgttcggaa acttgatcgc tctctccctg ggattgactc
ccaacttcaa 2820gtccaacttc gacctcgccg aggacgctaa gttgcagttg tctaaagaca
cctacgacga 2880tgacctcgac aacttgctgg cccagatagg cgaccaatac gccgatctct
tcctcgccgc 2940taagaacttg tccgacgcaa tcctgctgtc cgacatcctg agagtcaaca
ctgagattac 3000caaagctcct ctgtctgctt ccatgattaa gcgctacgac gagcaccacc
aagatctgac 3060cctgctcaag gccctggtga gacagcagct gcccgagaag tacaaggaga
tctttttcga 3120ccagtccaag aacggctacg ccggatacat tgacggaggc gcctcccagg
aagagttcta 3180caagttcatc aagcccatcc ttgagaagat ggacggtacc gaggagctgt
tggtgaagtt 3240gaacagagag gacctgttga ggaagcagag aaccttcgac aacggaagca
tccctcacca 3300aatccacctg ggagagctcc acgccatctt gaggaggcag gaggatttct
atcccttcct 3360gaaggacaac cgcgagaaga ttgagaagat cttgaccttc agaattcctt
actacgtcgg 3420gccactcgcc agaggaaact ctaggttcgc ctggatgacc cgcaaatctg
aagagaccat 3480tactccctgg aacttcgagg aagtcgtgga caagggcgct tccgctcagt
ctttcatcga 3540gaggatgacc aacttcgata aaaatctgcc caacgagaag gtgctgccca
agcactccct 3600gttgtacgag tatttcacag tgtacaacga gctcaccaag gtgaagtacg
tcacagaggg 3660aatgaggaag cctgccttct tgtccggaga gcagaagaag gccatcgtcg
acctgctctt 3720caagaccaac aggaaggtga ctgtcaagca gctgaaggag gactacttca
agaagatcga 3780gtgcttcgac tccgtcgaga tctctggtgt cgaggacagg ttcaacgcct
cccttgggac 3840ttaccacgat ctgctcaaga ttattaaaga caaggacttc ctggacaacg
aggagaacga 3900ggacatcctt gaggacatcg tgctcaccct gaccttgttc gaagacaggg
aaatgatcga 3960agagaggctc aagacctacg cccacctctt cgacgacaag gtgatgaaac
agctgaagag 4020acgcagatat accggctggg gaaggctctc ccgcaaattg atcaacggga
tcagggacaa 4080gcagtcaggg aagactatac tcgacttcct gaagtccgac ggattcgcca
acaggaactt 4140catgcagctc attcacgacg actccttgac cttcaaggag gacatccaga
aggctcaggt 4200gtctggacag ggtgactcct tgcatgagca cattgctaac ttggccggct
ctcccgctat 4260taagaagggc attttgcaga ccgtgaaggt cgttgacgag ctcgtgaagg
tgatgggacg 4320ccacaagcca gagaacatcg ttattgagat ggctcgcgag aaccaaacta
cccagaaagg 4380gcagaagaat tcccgcgaga ggatgaagcg cattgaggag ggcataaaag
agcttggctc 4440tcagatcctc aaggagcacc ccgtcgagaa cactcagctg cagaacgaga
agctgtacct 4500gtactacctc caaaacggaa gggacatgta cgtggaccag gagctggaca
tcaacaggtt 4560gtccgactac gacgtcgacc acatcgtgcc tcagtccttc ctgaaggatg
actccatcga 4620caataaagtg ctgacacgct ccgataaaaa tagaggcaag tccgacaacg
tcccctccga 4680ggaggtcgtg aagaagatga aaaactactg gagacagctc ttgaacgcca
agctcatcac 4740ccagcgtaag ttcgacaacc tgactaaggc tgagagagga ggattgtccg
agctcgataa 4800ggccggattc atcaagagac agctcgtcga aacccgccaa attaccaagc
acgtggccca 4860aattctggat tcccgcatga acaccaagta cgatgaaaat gacaagctga
tccgcgaggt 4920caaggtgatc accttgaagt ccaagctggt ctccgacttc cgcaaggact
tccagttcta 4980caaggtgagg gagatcaaca actaccacca cgcacacgac gcctacctca
acgctgtcgt 5040tggaaccgcc ctcatcaaaa aatatcctaa gctggagtct gagttcgtct
acggcgacta 5100caaggtgtac gacgtgagga agatgatcgc taagtctgag caggagatcg
gcaaggccac 5160cgccaagtac ttcttctact ccaacatcat gaacttcttc aagaccgaga
tcactctcgc 5220caacggtgag atcaggaagc gcccactgat cgagaccaac ggtgagactg
gagagatcgt 5280gtgggacaaa gggagggatt tcgctactgt gaggaaggtg ctctccatgc
ctcaggtgaa 5340catcgtcaag aagaccgaag ttcagaccgg aggattctcc aaggagtcca
tcctccccaa 5400gagaaactcc gacaagctga tcgctagaaa gaaagactgg gaccctaaga
agtacggagg 5460cttcgattct cctaccgtgg cctactctgt gctggtcgtg gccaaggtgg
agaagggcaa 5520gtccaagaag ctgaaatccg tcaaggagct cctcgggatt accatcatgg
agaggagttc 5580cttcgagaag aaccctatcg acttcctgga ggccaaggga tataaagagg
tgaagaagga 5640cctcatcatc aagctgccca agtactccct cttcgagttg gagaacggaa
ggaagaggat 5700gctggcttct gccggagagt tgcagaaggg aaatgagctc gcccttccct
ccaagtacgt 5760gaacttcctg tacctcgcct ctcactatga aaagttgaag ggctctcctg
aggacaacga 5820gcagaagcag ctcttcgtgg agcagcacaa gcactacctg gacgaaatta
tcgagcagat 5880ctctgagttc tccaagcgcg tgatattggc cgacgccaac ctcgacaagg
tgctgtccgc 5940ctacaacaag cacagggata agcccattcg cgagcaggct gaaaacatta
tccacctgtt 6000taccctcaca aacttgggag cccctgctgc cttcaagtac ttcgacacca
ccattgacag 6060gaagagatac acctccacca aggaggtgct cgacgcaaca ctcatccacc
aatccatcac 6120cggcctctat gaaacaagga ttgacttgtc ccagctggga ggcgactcta
gagccgatcc 6180caagaagaag agaaaggtgt aggttaacct agacttgtcc atcttctgga
ttggccaact 6240taattaatgt atgaaataaa aggatgcaca catagtgaca tgctaatcac
tataatgtgg 6300gcatcaaagt tgtgtgttat gtgtaattac tagttatctg aataaaagag
aaagagatca 6360tccatatttc ttatcctaaa tgaatgtcac gtgtctttat aattctttga
tgaaccagat 6420gcatttcatt aaccaaatcc atatacatat aaatattaat catatataat
taatatcaat 6480tgggttagca aaacaaatct agtctaggtg tgttttgcga attcgatatc
aagcttatcg 6540ataccgtcga gggggggccc ggtaccggcg cgccgttcta tagtgtcacc
taaatcgtat 6600gtgtatgata cataaggtta tgtattaatt gtagccgcgt tctaacgaca
atatgtccat 6660atggtgcact ctcagtacaa tctgctctga tgccgcatag ttaagccagc
cccgacaccc 6720gccaacaccc gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg
cttacagaca 6780agctgtgacc gtctccggga gctgcatgtg tcagaggttt tcaccgtcat
caccgaaacg 6840cgcgagacga aagggcctcg tgatacgcct atttttatag gttaatgtca
tgaccaaaat 6900cccttaacgt gagttttcgt tccactgagc gtcagacccc gtagaaaaga
tcaaaggatc 6960ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa
aaccaccgct 7020accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga
aggtaactgg 7080cttcagcaga gcgcagatac caaatactgt ccttctagtg tagccgtagt
taggccacca 7140cttcaagaac tctgtagcac cgcctacata cctcgctctg ctaatcctgt
taccagtggc 7200tgctgccagt ggcgataagt cgtgtcttac cgggttggac tcaagacgat
agttaccgga 7260taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca cagcccagct
tggagcgaac 7320gacctacacc gaactgagat acctacagcg tgagcattga gaaagcgcca
cgcttcccga 7380agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag
agcgcacgag 7440ggagcttcca gggggaaacg cctggtatct ttatagtcct gtcgggtttc
gccacctctg 7500acttgagcgt cgatttttgt gatgctcgtc aggggggcgg agcctatgga
aaaacgccag 7560caacgcggcc tttttacggt tcctggcctt ttgctggcct tttgctcaca
tgttctttcc 7620tgcgttatcc cctgattctg tggataaccg tattaccgcc tttgagtgag
ctgataccgc 7680tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg
aagagcgccc 7740aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat taatgcaggt
tgatcagatc 7800tcgatcccgc gaaattaata cgactcacta tagggagacc acaacggttt
ccctctagaa 7860ataattttgt ttaactttaa gaaggagata tacccatgga aaagcctgaa
ctcaccgcga 7920cgtctgtcga gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg
atgcagctct 7980cggagggcga agaatctcgt gctttcagct tcgatgtagg agggcgtgga
tatgtcctgc 8040gggtaaatag ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg
cactttgcat 8100cggccgcgct cccgattccg gaagtgcttg acattgggga attcagcgag
agcctgacct 8160attgcatctc ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa
accgaactgc 8220ccgctgttct gcagccggtc gcggaggcta tggatgcgat cgctgcggcc
gatcttagcc 8280agacgagcgg gttcggccca ttcggaccgc aaggaatcgg tcaatacact
acatggcgtg 8340atttcatatg cgcgattgct gatccccatg tgtatcactg gcaaactgtg
atggacgaca 8400ccgtcagtgc gtccgtcgcg caggctctcg atgagctgat gctttgggcc
gaggactgcc 8460ccgaagtccg gcacctcgtg cacgcggatt tcggctccaa caatgtcctg
acggacaatg 8520gccgcataac agcggtcatt gactggagcg aggcgatgtt cggggattcc
caatacgagg 8580tcgccaacat cttcttctgg aggccgtggt tggcttgtat ggagcagcag
acgcgctact 8640tcgagcggag gcatccggag cttgcaggat cgccgcggct ccgggcgtat
atgctccgca 8700ttggtcttga ccaactctat cagagcttgg ttgacggcaa tttcgatgat
gcagcttggg 8760cgcagggtcg atgcgacgca atcgtccgat ccggagccgg gactgtcggg
cgtacacaaa 8820tcgcccgcag aagcgcggcc gtctggaccg atggctgtgt agaagtactc
gccgatagtg 8880gaaaccgacg ccccagcact cgtccgaggg caaaggaata gtgaggtaca
gcttggatcg 8940atccggctgc taacaaagcc cgaaaggaag ctgagttggc tgctgccacc
gctgagcaat 9000aactagcata accccttggg gcctctaaac gggtcttgag gggttttttg
ctgaaaggag 9060gaactatatc cggatgatcg ggcgcgccgg tac
9093
545 9093 DNA Artificial sequence QC881 545
ccgggtgtga
tttagtataa agtgaagtaa tggtcaaaag aaaaagtgta aaacgaagta 60cctagtaata
agtaatattg aacaaaataa atggtaaagt gtcagatata taaaataggc 120tttaataaaa
ggaagaaaaa aaacaaacaa aaaataggtt gcaatggggc agagcagagt 180catcatgaag
ctagaaaggc taccgataga taaactatag ttaattaaat acattaaaaa 240atacttggat
ctttctctta ccctgtttat attgagacct gaaacttgag agagatacac 300taatcttgcc
ttgttgtttc attccctaac ttacaggact cagcgcatgt catgtggtct 360cgttccccat
ttaagtccca caccgtctaa acttattaaa ttattaatgt ttataactag 420atgcacaaca
acaaagcttg ggcgtcggtg ccgatcatcg ttttagagct agaaatagca 480agttaaaata
aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc ggtgcttttt 540tttgcggccg
caattggatc gggtttactt attttgtggg tatctatact tttattagat 600ttttaatcag
gctcctgatt tctttttatt tcgattgaat tcctgaactt gtattattca 660gtagatcgaa
taaattataa aaagataaaa tcataaaata atattttatc ctatcaatca 720tattaaagca
atgaatatgt aaaattaatc ttatctttat tttaaaaaat catataggtt 780tagtattttt
ttaaaaataa agataggatt agttttacta ttcactgctt attactttta 840aaaaaatcat
aaaggtttag tattttttta aaataaatat aggaatagtt ttactattca 900ctgctttaat
agaaaaatag tttaaaattt aagatagttt taatcccagc atttgccacg 960tttgaacgtg
agccgaaacg atgtcgttac attatcttaa cctagctgaa acgatgtcgt 1020cataatatcg
ccaaatgcca actggactac gtcgaaccca caaatcccac aaagcgcgtg 1080aaatcaaatc
gctcaaacca caaaaaagaa caacgcgttt gttacacgct caatcccacg 1140cgagtagagc
acagtaacct tcaaataagc gaatggggca taatcagaaa tccgaaataa 1200acctaggggc
attatcggaa atgaaaagta gctcactcaa tataaaaatc taggaaccct 1260agttttcgtt
atcactctgt gctccctcgc tctatttctc agtctctgtg tttgcggctg 1320aggattccga
acgagtgacc ttcttcgttt ctcgcaaagg taacagcctc tgctcttgtc 1380tcttcgattc
gatctatgcc tgtctcttat ttacgatgat gtttcttcgg ttatgttttt 1440ttatttatgc
tttatgctgt tgatgttcgg ttgtttgttt cgctttgttt ttgtggttca 1500gttttttagg
attcttttgg tttttgaatc gattaatcgg aagagatttt cgagttattt 1560ggtgtgttgg
aggtgaatct tttttttgag gtcatagatc tgttgtattt gtgttataaa 1620catgcgactt
tgtatgattt tttacgaggt tatgatgttc tggttgtttt attatgaatc 1680tgttgagaca
gaaccatgat ttttgttgat gttcgtttac actattaaag gtttgtttta 1740acaggattaa
aagtttttta agcatgttga aggagtcttg tagatatgta accgtcgata 1800gtttttttgt
gggtttgttc acatgttatc aagcttaatc ttttactatg tatgcgacca 1860tatctggatc
cagcaaaggc gattttttaa ttccttgtga aacttttgta atatgaagtt 1920gaaattttgt
tattggtaaa ctataaatgt gtgaagttgg agtatacctt taccttctta 1980tttggctttg
tgatagttta atttatatgt attttgagtt ctgacttgta tttctttgaa 2040ttgattctag
tttaagtaat ccatggacaa aaagtactca atagggctcg acatagggac 2100taactccgtt
ggatgggccg tcatcaccga cgagtacaag gtgccctcca agaagttcaa 2160ggtgttggga
aacaccgaca ggcacagcat aaagaagaat ttgatcggtg ccctcctctt 2220cgactccgga
gagaccgctg aggctaccag gctcaagagg accgctagaa ggcgctacac 2280cagaaggaag
aacagaatct gctacctgca ggagatcttc tccaacgaga tggccaaggt 2340ggacgactcc
ttcttccacc gccttgagga atcattcctg gtggaggagg ataaaaagca 2400cgagagacac
ccaatcttcg ggaacatcgt cgacgaggtg gcctaccatg aaaagtaccc 2460taccatctac
cacctgagga agaagctggt cgactctacc gacaaggctg acttgcgctt 2520gatttacctg
gctctcgctc acatgataaa gttccgcgga cacttcctca ttgagggaga 2580cctgaaccca
gacaactccg acgtggacaa gctcttcatc cagctcgttc agacctacaa 2640ccagcttttc
gaggagaacc caatcaacgc cagtggagtt gacgccaagg ctatcctctc 2700tgctcgtctg
tcaaagtcca ggaggcttga gaacttgatt gcccagctgc ctggcgaaaa 2760gaagaacgga
ctgttcggaa acttgatcgc tctctccctg ggattgactc ccaacttcaa 2820gtccaacttc
gacctcgccg aggacgctaa gttgcagttg tctaaagaca cctacgacga 2880tgacctcgac
aacttgctgg cccagatagg cgaccaatac gccgatctct tcctcgccgc 2940taagaacttg
tccgacgcaa tcctgctgtc cgacatcctg agagtcaaca ctgagattac 3000caaagctcct
ctgtctgctt ccatgattaa gcgctacgac gagcaccacc aagatctgac 3060cctgctcaag
gccctggtga gacagcagct gcccgagaag tacaaggaga tctttttcga 3120ccagtccaag
aacggctacg ccggatacat tgacggaggc gcctcccagg aagagttcta 3180caagttcatc
aagcccatcc ttgagaagat ggacggtacc gaggagctgt tggtgaagtt 3240gaacagagag
gacctgttga ggaagcagag aaccttcgac aacggaagca tccctcacca 3300aatccacctg
ggagagctcc acgccatctt gaggaggcag gaggatttct atcccttcct 3360gaaggacaac
cgcgagaaga ttgagaagat cttgaccttc agaattcctt actacgtcgg 3420gccactcgcc
agaggaaact ctaggttcgc ctggatgacc cgcaaatctg aagagaccat 3480tactccctgg
aacttcgagg aagtcgtgga caagggcgct tccgctcagt ctttcatcga 3540gaggatgacc
aacttcgata aaaatctgcc caacgagaag gtgctgccca agcactccct 3600gttgtacgag
tatttcacag tgtacaacga gctcaccaag gtgaagtacg tcacagaggg 3660aatgaggaag
cctgccttct tgtccggaga gcagaagaag gccatcgtcg acctgctctt 3720caagaccaac
aggaaggtga ctgtcaagca gctgaaggag gactacttca agaagatcga 3780gtgcttcgac
tccgtcgaga tctctggtgt cgaggacagg ttcaacgcct cccttgggac 3840ttaccacgat
ctgctcaaga ttattaaaga caaggacttc ctggacaacg aggagaacga 3900ggacatcctt
gaggacatcg tgctcaccct gaccttgttc gaagacaggg aaatgatcga 3960agagaggctc
aagacctacg cccacctctt cgacgacaag gtgatgaaac agctgaagag 4020acgcagatat
accggctggg gaaggctctc ccgcaaattg atcaacggga tcagggacaa 4080gcagtcaggg
aagactatac tcgacttcct gaagtccgac ggattcgcca acaggaactt 4140catgcagctc
attcacgacg actccttgac cttcaaggag gacatccaga aggctcaggt 4200gtctggacag
ggtgactcct tgcatgagca cattgctaac ttggccggct ctcccgctat 4260taagaagggc
attttgcaga ccgtgaaggt cgttgacgag ctcgtgaagg tgatgggacg 4320ccacaagcca
gagaacatcg ttattgagat ggctcgcgag aaccaaacta cccagaaagg 4380gcagaagaat
tcccgcgaga ggatgaagcg cattgaggag ggcataaaag agcttggctc 4440tcagatcctc
aaggagcacc ccgtcgagaa cactcagctg cagaacgaga agctgtacct 4500gtactacctc
caaaacggaa gggacatgta cgtggaccag gagctggaca tcaacaggtt 4560gtccgactac
gacgtcgacc acatcgtgcc tcagtccttc ctgaaggatg actccatcga 4620caataaagtg
ctgacacgct ccgataaaaa tagaggcaag tccgacaacg tcccctccga 4680ggaggtcgtg
aagaagatga aaaactactg gagacagctc ttgaacgcca agctcatcac 4740ccagcgtaag
ttcgacaacc tgactaaggc tgagagagga ggattgtccg agctcgataa 4800ggccggattc
atcaagagac agctcgtcga aacccgccaa attaccaagc acgtggccca 4860aattctggat
tcccgcatga acaccaagta cgatgaaaat gacaagctga tccgcgaggt 4920caaggtgatc
accttgaagt ccaagctggt ctccgacttc cgcaaggact tccagttcta 4980caaggtgagg
gagatcaaca actaccacca cgcacacgac gcctacctca acgctgtcgt 5040tggaaccgcc
ctcatcaaaa aatatcctaa gctggagtct gagttcgtct acggcgacta 5100caaggtgtac
gacgtgagga agatgatcgc taagtctgag caggagatcg gcaaggccac 5160cgccaagtac
ttcttctact ccaacatcat gaacttcttc aagaccgaga tcactctcgc 5220caacggtgag
atcaggaagc gcccactgat cgagaccaac ggtgagactg gagagatcgt 5280gtgggacaaa
gggagggatt tcgctactgt gaggaaggtg ctctccatgc ctcaggtgaa 5340catcgtcaag
aagaccgaag ttcagaccgg aggattctcc aaggagtcca tcctccccaa 5400gagaaactcc
gacaagctga tcgctagaaa gaaagactgg gaccctaaga agtacggagg 5460cttcgattct
cctaccgtgg cctactctgt gctggtcgtg gccaaggtgg agaagggcaa 5520gtccaagaag
ctgaaatccg tcaaggagct cctcgggatt accatcatgg agaggagttc 5580cttcgagaag
aaccctatcg acttcctgga ggccaaggga tataaagagg tgaagaagga 5640cctcatcatc
aagctgccca agtactccct cttcgagttg gagaacggaa ggaagaggat 5700gctggcttct
gccggagagt tgcagaaggg aaatgagctc gcccttccct ccaagtacgt 5760gaacttcctg
tacctcgcct ctcactatga aaagttgaag ggctctcctg aggacaacga 5820gcagaagcag
ctcttcgtgg agcagcacaa gcactacctg gacgaaatta tcgagcagat 5880ctctgagttc
tccaagcgcg tgatattggc cgacgccaac ctcgacaagg tgctgtccgc 5940ctacaacaag
cacagggata agcccattcg cgagcaggct gaaaacatta tccacctgtt 6000taccctcaca
aacttgggag cccctgctgc cttcaagtac ttcgacacca ccattgacag 6060gaagagatac
acctccacca aggaggtgct cgacgcaaca ctcatccacc aatccatcac 6120cggcctctat
gaaacaagga ttgacttgtc ccagctggga ggcgactcta gagccgatcc 6180caagaagaag
agaaaggtgt aggttaacct agacttgtcc atcttctgga ttggccaact 6240taattaatgt
atgaaataaa aggatgcaca catagtgaca tgctaatcac tataatgtgg 6300gcatcaaagt
tgtgtgttat gtgtaattac tagttatctg aataaaagag aaagagatca 6360tccatatttc
ttatcctaaa tgaatgtcac gtgtctttat aattctttga tgaaccagat 6420gcatttcatt
aaccaaatcc atatacatat aaatattaat catatataat taatatcaat 6480tgggttagca
aaacaaatct agtctaggtg tgttttgcga attcgatatc aagcttatcg 6540ataccgtcga
gggggggccc ggtaccggcg cgccgttcta tagtgtcacc taaatcgtat 6600gtgtatgata
cataaggtta tgtattaatt gtagccgcgt tctaacgaca atatgtccat 6660atggtgcact
ctcagtacaa tctgctctga tgccgcatag ttaagccagc cccgacaccc 6720gccaacaccc
gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca 6780agctgtgacc
gtctccggga gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg 6840cgcgagacga
aagggcctcg tgatacgcct atttttatag gttaatgtca tgaccaaaat 6900cccttaacgt
gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc 6960ttcttgagat
cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct 7020accagcggtg
gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg 7080cttcagcaga
gcgcagatac caaatactgt ccttctagtg tagccgtagt taggccacca 7140cttcaagaac
tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc 7200tgctgccagt
ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga 7260taaggcgcag
cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac 7320gacctacacc
gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga 7380agggagaaag
gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag 7440ggagcttcca
gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg 7500acttgagcgt
cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag 7560caacgcggcc
tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc 7620tgcgttatcc
cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc 7680tcgccgcagc
cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc 7740aatacgcaaa
ccgcctctcc ccgcgcgttg gccgattcat taatgcaggt tgatcagatc 7800tcgatcccgc
gaaattaata cgactcacta tagggagacc acaacggttt ccctctagaa 7860ataattttgt
ttaactttaa gaaggagata tacccatgga aaagcctgaa ctcaccgcga 7920cgtctgtcga
gaagtttctg atcgaaaagt tcgacagcgt ctccgacctg atgcagctct 7980cggagggcga
agaatctcgt gctttcagct tcgatgtagg agggcgtgga tatgtcctgc 8040gggtaaatag
ctgcgccgat ggtttctaca aagatcgtta tgtttatcgg cactttgcat 8100cggccgcgct
cccgattccg gaagtgcttg acattgggga attcagcgag agcctgacct 8160attgcatctc
ccgccgtgca cagggtgtca cgttgcaaga cctgcctgaa accgaactgc 8220ccgctgttct
gcagccggtc gcggaggcta tggatgcgat cgctgcggcc gatcttagcc 8280agacgagcgg
gttcggccca ttcggaccgc aaggaatcgg tcaatacact acatggcgtg 8340atttcatatg
cgcgattgct gatccccatg tgtatcactg gcaaactgtg atggacgaca 8400ccgtcagtgc
gtccgtcgcg caggctctcg atgagctgat gctttgggcc gaggactgcc 8460ccgaagtccg
gcacctcgtg cacgcggatt tcggctccaa caatgtcctg acggacaatg 8520gccgcataac
agcggtcatt gactggagcg aggcgatgtt cggggattcc caatacgagg 8580tcgccaacat
cttcttctgg aggccgtggt tggcttgtat ggagcagcag acgcgctact 8640tcgagcggag
gcatccggag cttgcaggat cgccgcggct ccgggcgtat atgctccgca 8700ttggtcttga
ccaactctat cagagcttgg ttgacggcaa tttcgatgat gcagcttggg 8760cgcagggtcg
atgcgacgca atcgtccgat ccggagccgg gactgtcggg cgtacacaaa 8820tcgcccgcag
aagcgcggcc gtctggaccg atggctgtgt agaagtactc gccgatagtg 8880gaaaccgacg
ccccagcact cgtccgaggg caaaggaata gtgaggtaca gcttggatcg 8940atccggctgc
taacaaagcc cgaaaggaag ctgagttggc tgctgccacc gctgagcaat 9000aactagcata
accccttggg gcctctaaac gggtcttgag gggttttttg ctgaaaggag 9060gaactatatc
cggatgatcg ggcgcgccgg tac
9093
546 1113 DNA Artificial sequence RTW1026A 546
agcttggtac cgagctcgga
tccactagta tggcggccac cgcttccaga accacccgat 60tctcttcttc ctcttcacac
cccaccttcc ccaaacgcat tactagatcc accctccctc 120tctctcatca aaccctcacc
aaacccaacc acgctctcaa aatcaaatgt tccatctcca 180aaccccccac ggcggcgccc
ttcaccaagg aagcgccgac cacggagccc ttcgtgtcac 240ggttcgcctc cggcgaacct
cgcaagggcg cggacatcct tgtggaggcg ctggagaggc 300agggcgtgac gacggtgttc
gcgtaccccg gcggtgcgtc gatggagatc caccaggcgc 360tcacgcgctc cgccgccatc
cgcaacgtgc tcccgcgcca cgagcagggc ggcgtcttcg 420ccgccgaagg ctacgcgcgt
tcctccggcc tccccggcgt ctgcattgcc acctccggcc 480ccggcgccac caacctcgtg
agcggcctcg ccgacgcttt aatggacagc gtcccagtcg 540tcgccatcac cggccaggtc
agccgtcgca tgatcggtac cgacgccttc caagaaaccc 600cgatcgtgga ggtgagcaga
tccatcacga agcacaacta cctcatcctc gacgtcgacg 660acatcccccg cgtcgtcgcc
gaggctttct tcgtcgccac ctccggccgc cccggtccgg 720tcctcatcga cattcccaaa
gacgttcagc agcaactcgc cgtgcctaat tgggacgagc 780ccgttaacct ccccggttac
ctcgccaggc tgcccaggcc ccccgccgag gcccaattgg 840aacacattgt cagactcatc
atggaggccc aaaagcccgt tctctacgtc ggcggtggca 900gtttgaattc cagtgctgaa
ttgaggcgct ttgttgaact cactggtatt cccgttgcta 960gcactttaat gggtcttgga
acttttccta ttggtgatga atattccctt cagatgctgg 1020gtatgcatgg tactgtttat
gctaactatg ctgttgacaa tagtgatttg ttgcttgcct 1080ttggggtaag gtttgatgac
cgtgttactg gga
1113
547 17 DNA Artificial WOL900, Forward_primer 547
atcaccggcc aggtcag 17
548 25 DNA Artificial WOL578, Reverse_primer 548
acttaccctc cactcctttc tcctc 25
549 29 DNA Artificial WOL573, Forward_primer 549
atggcggcca ccgcttccag aaccacccg 29
550 638 PRT Zea mays 550
Met Ala Thr Ala Ala Ala Ala Ser Thr Ala Leu Thr Gly Ala Thr Thr 1
5 10 15 Ala Ala Pro
Lys Ala Arg Arg Arg Ala His Leu Leu Ala Thr Arg Arg 20
25 30 Ala Leu Ala Ala Pro Ile Arg Cys
Ser Ala Ala Ser Pro Ala Met Pro 35 40
45 Met Ala Pro Pro Ala Thr Pro Leu Arg Pro Trp Gly Pro
Thr Glu Pro 50 55 60
Arg Lys Gly Ala Asp Ile Leu Val Glu Ser Leu Glu Arg Cys Gly Val 65
70 75 80 Arg Asp Val Phe
Ala Tyr Pro Gly Gly Ala Ser Met Glu Ile His Gln 85
90 95 Ala Leu Thr Arg Ser Pro Val Ile Ala
Asn His Leu Phe Arg His Glu 100 105
110 Gln Gly Glu Ala Phe Ala Ala Ser Gly Tyr Ala Arg Ser Ser
Gly Arg 115 120 125
Val Gly Val Cys Ile Ala Thr Ser Gly Pro Gly Ala Thr Asn Leu Val 130
135 140 Ser Ala Leu Ala Asp
Ala Leu Leu Asp Ser Val Pro Met Val Ala Ile 145 150
155 160 Thr Gly Gln Val Pro Arg Arg Met Ile Gly
Thr Asp Ala Phe Gln Glu 165 170
175 Thr Pro Ile Val Glu Val Thr Arg Ser Ile Thr Lys His Asn Tyr
Leu 180 185 190 Val
Leu Asp Val Asp Asp Ile Pro Arg Val Val Gln Glu Ala Phe Phe 195
200 205 Leu Ala Ser Ser Gly Arg
Pro Gly Pro Val Leu Val Asp Ile Pro Lys 210 215
220 Asp Ile Gln Gln Gln Met Ala Val Pro Val Trp
Asp Lys Pro Met Ser 225 230 235
240 Leu Pro Gly Tyr Ile Ala Arg Leu Pro Lys Pro Pro Ala Thr Glu Leu
245 250 255 Leu Glu
Gln Val Leu Arg Leu Val Gly Glu Ser Arg Arg Pro Val Leu 260
265 270 Tyr Val Gly Gly Gly Cys Ala
Ala Ser Gly Glu Glu Leu Arg Arg Phe 275 280
285 Val Glu Leu Thr Gly Ile Pro Val Thr Thr Thr Leu
Met Gly Leu Gly 290 295 300
Asn Phe Pro Ser Asp Asp Pro Leu Ser Leu Arg Met Leu Gly Met His 305
310 315 320 Gly Thr Val
Tyr Ala Asn Tyr Ala Val Asp Lys Ala Asp Leu Leu Leu 325
330 335 Ala Leu Gly Val Arg Phe Asp Asp
Arg Val Thr Gly Lys Ile Glu Ala 340 345
350 Phe Ala Ser Arg Ala Lys Ile Val His Val Asp Ile Asp
Pro Ala Glu 355 360 365
Ile Gly Lys Asn Lys Gln Pro His Val Ser Ile Cys Ala Asp Val Lys 370
375 380 Leu Ala Leu Gln
Gly Met Asn Ala Leu Leu Glu Gly Ser Thr Ser Lys 385 390
395 400 Lys Ser Phe Asp Phe Gly Ser Trp Asn
Asp Glu Leu Asp Gln Gln Lys 405 410
415 Arg Glu Phe Pro Leu Gly Tyr Lys Thr Ser Asn Glu Glu Ile
Gln Pro 420 425 430
Gln Tyr Ala Ile Gln Val Leu Asp Glu Leu Thr Lys Gly Glu Ala Ile
435 440 445 Ile Gly Thr Gly
Val Gly Gln His Gln Met Trp Ala Ala Gln Tyr Tyr 450
455 460 Thr Tyr Lys Arg Pro Arg Gln Trp
Leu Ser Ser Ala Gly Leu Gly Ala 465 470
475 480 Met Gly Phe Gly Leu Pro Ala Ala Ala Gly Ala Ser
Val Ala Asn Pro 485 490
495 Gly Val Thr Val Val Asp Ile Asp Gly Asp Gly Ser Phe Leu Met Asn
500 505 510 Val Gln Glu
Leu Ala Met Ile Arg Ile Glu Asn Leu Pro Val Lys Val 515
520 525 Phe Val Leu Asn Asn Gln His Leu
Gly Met Val Val Gln Trp Glu Asp 530 535
540 Arg Phe Tyr Lys Ala Asn Arg Ala His Thr Tyr Leu Gly
Asn Pro Glu 545 550 555
560 Asn Glu Ser Glu Ile Tyr Pro Asp Phe Val Thr Ile Ala Lys Gly Phe
565 570 575 Asn Ile Pro Ala
Val Arg Val Thr Lys Lys Asn Glu Val Arg Ala Ala 580
585 590 Ile Lys Lys Met Leu Glu Thr Pro Gly
Pro Tyr Leu Leu Asp Ile Ile 595 600
605 Val Pro His Gln Glu His Val Leu Pro Met Ile Pro Ser Gly
Gly Ala 610 615 620
Phe Lys Asp Met Ile Leu Asp Gly Asp Gly Arg Thr Val Tyr 625
630 635