Patent application title: Method of Reducing Acetylation in Plants to Improve Biofuel Production

Inventors:  Henrik Vibe Scheller (Millbrae, CA, US)
Assignees:  THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
IPC8 Class: AC12P100FI
USPC Class: 800278
Class name: Multicellular living organisms and unmodified parts thereof and related processes method of introducing a polynucleotide molecule into or rearrangement of genetic material within a plant or plant part
Publication date: 2012-02-23
Patent application number: 20120047601



Abstract:

The invention provides methods for engineering plants to have reduced levels of acetylation by decreasing expression of one or more Cas1L genes. Such plants can be used, e.g., to increase yield for biofuel production.

Claims:

1. A method of improving the final yield from, or efficiency of, a fermentation reaction comprising plant material that is acetylated, the method comprising: enzymatically or chemically degrading plant material from a mutant CAS1L plant that has decreased activity of at least one CAS1L gene to obtain degradation products; and fermenting the degradation products in a fermentation reaction, wherein the final yield, or efficiency of, the fermentation reaction is increased relative to the final yield, or efficiency, of a fermentation reaction using corresponding plant material from a wildtype CAS1L plant.

2. The method of claim 1, wherein the steps of degrading the plant material and fermenting the degradation products occur in the same reaction mixture.

3. The method of claim 1, wherein the steps of degrading the plant material and fermenting the degradation products occur in separate reaction mixtures.

4. The method of claim 1, wherein the improvement of the efficiency of the fermentation reaction is an increase in the amount of enzymatically degraded degradation product obtained per unit enzyme over a period of time compared to the amount of product obtained using corresponding plant material from a wildtype CAS1L plant per unit enzyme over the same period of time.

5. The method of claim 1, wherein the mutant Cas1L plant has been engineered to decrease the activity of at least two Cas1L genes.

6. The method of claim 1, wherein the mutant Cas1L plant has been engineered to decrease the activity of at least three Cas1L genes.

7. The method of claim 1, wherein the plant is a transgenic plant that comprises a vector comprising a nucleic acid sequence that encodes an RNAi that inhibits expression of at least one Cas1L gene.

8. The method of claim 1, wherein the plant is selected from the group consisting of corn, sorghum, millet, miscanthus, sugarcane, poplar, pine, eucalyptus, wheat, rice, soy, cotton, barley, switchgrass, turfgrass, ryegrass, bahiagrass, bermudagrass, dallisgrass, pangolagrass, big bluestem, Indian grass, fescue, Dactylis sp., Brachypodium, smooth bromegrass, orchardgrass, Kentucky bluegrass, timothy, Kochia, forage soybeans, alfalfa, clover, hemp, kenaf, tobacco, potato, bamboo, rape, sugar beet, sunflower, willow, and eucalyptus.

9. A plant comprising a vector that comprises an RNAi that inhibits a Cas1L gene.

10. The plant of claim 9, wherein the plant is selected from the group consisting of corn, sorghum, millet, miscanthus, sugarcane, poplar, pine, eucalyptus, wheat, rice, soy, cotton, barley, switchgrass, turfgrass, ryegrass, bahiagrass, bermudagrass, dallisgrass, pangolagrass, big bluestem, Indian grass, fescue, Dactylis sp., Brachypodium, smooth bromegrass, orchardgrass, Kentucky bluegrass, timothy, Kochia, forage soybeans, alfalfa, clover, hemp, kenaf, tobacco, potato, bamboo, rape, sugar beet, sunflower, willow, and eucalyptus.

11. An isolated mutant Cas1L plant in which expression of at least two Cas1L genes is inhibited.

12. The plant of claim 11, wherein the plant is selected from the group consisting of corn, sorghum, millet, miscanthus, sugarcane, poplar, pine, eucalyptus, wheat, rice, soy, cotton, barley, switchgrass, turfgrass, ryegrass, bahiagrass, bermudagrass, dallisgrass, pangolagrass, big bluestem, Indian grass, fescue, Dactylis sp., Brachypodium, smooth bromegrass, orchardgrass, Kentucky bluegrass, timothy, Kochia, forage soybeans, alfalfa, clover, hemp, kenaf, tobacco, potato, bamboo, rape, sugar beet, sunflower, willow, and eucalyptus.

13. A method of engineering a plant to reduce acetylation in the cell wall of a plant, the method comprising inhibiting expression of at least one Cas1L gene in the plant using RNAi.

14. The method of claim 13, further comprising determining the level of acetylation in the cell wall of the plant.

15. Bulk harvested material comprising material from at least two engineered plants that have decreased expression of at least one Cas1L gene.

16. The bulk harvested material of claim 15, wherein the at least two engineered plants are grass plants.

17. The bulk harvested material of claim 15, wherein the material is present in a fermentation reaction.

18. A plant engineered to have reduced expression of at least one Cas1L gene, wherein the plant has at least a 10% reduction in cell wall acetylation in comparison to a plant that has not been engineered.

19. The plant of claim 18, wherein the plant is a grass.

20. A polysaccharide or polysaccharide fraction isolated from a plant of claim 18.

21. A polysaccharide of claim 20, where the polysaccharide is homogalacturonan or pectin.

22. A food ingredient containing the polysaccharide of claim 20.

23. A food product containing an ingredient according to claim 22.

Description:

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority benefit of U.S. provisional patent application No. 61/153,202, filed Feb. 17, 2009, which application is incorporated by reference herein.

BACKGROUND OF THE INVENTION

[0003] Biomass for biofuel production contains large amounts of xylan as well as smaller amounts of other matrix polysaccharides, such as xyloglucan, mannans, pectins, and the like. Efficient biofuel production requires efficient degradation of these polysaccharides.

[0004] Enzymatic degradation into monosaccharides is hindered, however, by acetyl ester substitutions on the polysaccharide backbone. The inhibition of degradation by acetyl esters can be significant, as substitution can be at a high level, e.g., typically 25-50% of the xylose residues of grass xylan are acetylated. Moreover, the acetic acid that is released during enzymatic or chemical degradation is inhibitory to organisms, such as yeast, that are used for fermentation. The acetic acid contained in a biomass mixture for fermentation can easily be in the order of 0.1 M or 6 g/l, which is a highly inhibitory level. Reduction in the level of acetic acid would therefore be highly beneficial for fermentation. Accordingly, there is a need for improvement to biofuel production to reduce acetic acid levels.
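The equivalence of 0.1 M and roughly 6 g/l stated above follows from the molar mass of acetic acid. A minimal sketch of that arithmetic is given here; the function name and the rounded molar mass constant are illustrative assumptions and are not part of the application.

```python
# Molar mass of acetic acid (CH3COOH) in g/mol, rounded from standard
# atomic masses. This constant is supplied for illustration only.
ACETIC_ACID_MOLAR_MASS = 60.05


def molar_to_grams_per_liter(molar_conc, molar_mass=ACETIC_ACID_MOLAR_MASS):
    """Convert a molar concentration (mol/l) to a mass concentration (g/l)."""
    return molar_conc * molar_mass


# 0.1 M acetic acid expressed in g/l (approximately 6 g/l).
print(molar_to_grams_per_liter(0.1))
```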

[0005] A pathogenic fungus, Cryptococcus neoformans, has a polysaccharide coat consisting of O-acetylated glucuronoxylomannans. A protein encoded by the Cas1p gene has been identified as being essential for acetylation of the coat polysaccharide (Janbon et al., Molec. Microbiol. 42:453-467, 2001). Although the gene has been putatively annotated as an acetyltransferase, its biochemical activity was not confirmed. Homologs of this gene have been identified in various plants.

[0006] The present invention is based, in part, on the discovery that mutations in homologs to Cas1p reduce polysaccharide acetylation. Plants having reduced polysaccharide acetylation in accordance with the present invention can be used, for example, to provide plant mass that produces lower levels of acetic acid during fermentation.

BRIEF SUMMARY OF THE INVENTION

[0007] The invention provides, in part, methods of engineering plants to reduce the level of acetylation in plants, plants that have been engineered in accordance with the methods, and methods of using such plants, e.g., to improve the yield for biofuel production. Thus, in one aspect, the invention provides a method of improving the yield of a fermentation reaction from an enzymatic or chemical degradation of acetylated products in a reaction comprising plant material, e.g., cell wall material, the method comprising providing a plant that has been engineered to decrease the activity of at least one CAS1L gene, e.g., a CAS1L1 or CAS1L2 gene, in the plant; and incubating the plant material in the reaction. The plants can be engineered to decrease the activity of at least two CAS1L genes, or to decrease the activity of at least three CAS1L genes. In some embodiments, a plant can be engineered to reduce the activity of four CAS1L genes.

[0008] In some embodiments, improvement in yield can be an improvement of the final yield from, or efficiency of, a fermentation reaction, wherein the method comprises: enzymatically or chemically degrading plant material from a mutant CAS1L plant that has decreased activity of at least one CAS1L gene to obtain degradation products; and fermenting the degradation products in a fermentation reaction, wherein the final yield, or efficiency of, the fermentation reaction is increased relative to the final yield, or efficiency, of a fermentation reaction using corresponding plant material from a wildtype CAS1L plant. In some embodiments, the steps of degrading the plant material and fermenting the degradation products occur in the same reaction mixture. Alternatively, the steps of degrading the plant material and fermenting the degradation products can occur in separate reaction mixtures. In some embodiments, the improvement of the efficiency of the fermentation reaction is an increase in the amount of product obtained per unit enzyme over a period of time compared to the amount of product obtained using corresponding plant material from a wildtype CAS1L plant per unit enzyme over the same period of time. In some embodiments, the mutant CAS1L plant has been engineered to decrease the activity of at least one CAS1L gene, e.g., a CAS1L1 or a CAS1L2 gene. In some embodiments, the plants are engineered to decrease the activity of at least two CAS1L genes, or to decrease the activity of at least three CAS1L genes. In some embodiments, a plant can be engineered to reduce the activity of four CAS1L genes.

[0009] In some embodiments, the invention provides a method of chemically degrading or enzymatically degrading plant material from a mutant CAS1L plant that has decreased activity of at least one CAS1L gene to obtain degradation products for a fermentation reaction, wherein the amount of degradation product obtained and/or the efficiency of the degradation reaction is improved compared to a corresponding reaction in which wildtype CAS1L plant material is employed. In some embodiments, the degradation reaction is an enzymatic reaction and the efficiency (e.g., amount of product obtained per unit enzyme per unit of time) of the degradation reaction is improved. In some embodiments, the mutant CAS1L plant has been engineered to decrease the activity of at least one CAS1L gene, e.g., a CAS1L1 or a CAS1L2 gene. In some embodiments, the plants are engineered to decrease the activity of at least two CAS1L genes, or to decrease the activity of at least three CAS1L genes. In some embodiments, a plant can be engineered to reduce the activity of four CAS1L genes.

[0010] In some embodiments, the plant is a transgenic plant that comprises a vector comprising a nucleic acid sequence that encodes an RNAi, e.g., an artificial microRNA (miRNA) or other nucleic acid that encodes an RNAi, that inhibits expression of at least one CAS1L gene.

[0011] In some embodiments, the plant that is engineered in accordance with the invention is selected from the group consisting of corn, sorghum, millet, miscanthus, sugarcane, poplar, pine, eucalyptus, wheat, rice, soy, cotton, barley, switchgrass, turfgrass, ryegrass, bahiagrass, bermudagrass, dallisgrass, pangolagrass, big bluestem, Indian grass, fescue, Dactylis sp., Brachypodium, smooth bromegrass, orchardgrass, Kentucky bluegrass, timothy, Kochia, forage soybeans, alfalfa, clover, hemp, kenaf, tobacco, potato, bamboo, rape, sugar beet, sunflower, willow, and eucalyptus.

[0012] In an additional aspect, the invention provides a plant comprising a vector that comprises an RNAi that inhibits a CAS1L gene. In some embodiments, the plant is selected from the group consisting of corn, sorghum, millet, miscanthus, sugarcane, poplar, pine, eucalyptus, wheat, rice, soy, cotton, barley, switchgrass, turfgrass, ryegrass, bahiagrass, bermudagrass, dallisgrass, pangolagrass, big bluestem, Indian grass, fescue, Dactylis sp., Brachypodium, smooth bromegrass, orchardgrass, Kentucky bluegrass, timothy, Kochia, forage soybeans, alfalfa, clover, hemp, kenaf, tobacco, potato, bamboo, rape, sugar beet, sunflower, willow, and eucalyptus.

[0013] The invention also provides an isolated mutant CAS1L plant in which at least two CAS1L genes are inhibited. In some embodiments, three or four CAS1L genes are inhibited.

[0014] In some embodiments, the plant is selected from the group consisting of corn, sorghum, millet, miscanthus, sugarcane, poplar, pine, eucalyptus, wheat, rice, soy, cotton, barley, switchgrass, turfgrass, ryegrass, bahiagrass, bermudagrass, dallisgrass, pangolagrass, big bluestem, Indian grass, fescue, Dactylis sp., Brachypodium, smooth bromegrass, orchardgrass, Kentucky bluegrass, timothy, Kochia, forage soybeans, alfalfa, clover, hemp, kenaf, tobacco, potato, bamboo, rape, sugar beet, sunflower, willow, and eucalyptus.

[0015] In some embodiments, the invention also provides a method of producing a plant that has reduced acetylation in the cell wall material, the method comprising inhibiting expression of at least one CAS1L gene in the plant using RNAi. In some embodiments, the method further comprises determining the level of acetylation in the cell wall of the plant.

[0016] In another aspect, the invention provides bulk harvested material comprising material from at least two engineered plants, e.g., grass plants, that have decreased expression of at least one CAS1L gene. In some embodiments, the bulk harvested material is present in a fermentation reaction.

[0017] In an additional aspect, the invention provides a plant, e.g., a grass, engineered to have reduced expression of at least one CAS1L gene, wherein the plant has at least a 10% reduction in cell wall acetylation in comparison to a plant that has not been engineered.

[0018] The invention further provides a polysaccharide or polysaccharide fraction, e.g., homogalacturonan or pectin, isolated from a plant that has at least a 10% reduction in cell wall acetylation in comparison to a wildtype plant. In some embodiments, the invention provides a food ingredient containing such a polysaccharide.

BRIEF DESCRIPTION OF THE DRAWINGS

[0019] FIG. 1 provides an alignment of plant CAS1L genes.

[0020] FIG. 2 shows acetate levels in a small number of Arabidopsis plants that have been engineered to reduce CAS1L gene expression.

[0021] FIG. 3 provides data that illustrate the saccharification efficiency (determined as concentration of total monosaccharide released) as a function of time for cell wall material from wild type Arabidopsis plants and rwa2-1. The rwa2-1 line is an Arabidopsis plant with a mutated RWA2 gene, i.e., a mutated CAS1L2 gene.

[0022] FIG. 4 shows the results of a Botrytis cinerea pathoassay 3 days post infection.

DETAILED DESCRIPTION OF THE INVENTION

[0023] The term "biomass" in the context of this application refers to plant material that is processed to provide a product, e.g., a biofuel such as ethanol, or livestock feed. Such plant material can include whole plants, or parts of plants, e.g., stems, leaves, branches, shoots, roots, tubers, and the like.

[0024] The term "acetate" as used herein is used to refer to acetyl esters bound to polysaccharides and glycan structures on glycoproteins and proteoglycans. The acetyl ester can be bound to different OH-groups on sugars, and individual sugar residues can contain more than one acetyl ester group. Many different plant polysaccharides are known to be acetylated, including xylan, mannan, xyloglucan, and pectin.

[0025] A "fermentation reaction" as used herein is used to refer to the conversion of a substrate into a product, typically by an enzymatic reaction. Such reactions in the context of this invention can be anaerobic or aerobic. Typically, a fermentation reaction used in this invention is an anaerobic reaction in which yeast or bacteria convert polysaccharides, oligosaccharides and/or sugars to alcohols, acids, hydrocarbons and/or esters.

[0026] In the context of this invention, the term "yield", when referring to an enzymatic or chemical reaction for the conversion of a substrate into a product, refers to efficiency as well as overall amount of the product. Thus, the term "improved yield" not only refers to increases in overall, or final, yield of a reaction (i.e., the amount of product obtained per amount of starting material), but also includes a faster reaction rate that increases the amount of product produced over a shorter period when using biomass from a mutant CAS1L plant in comparison to corresponding biomass from the wildtype plant. In some embodiments in which CAS1L mutant plant material is degraded in an enzymatic reaction, "improved yield" includes an increase in the amount of product obtained per enzyme unit over a defined unit of time in comparison to the yield obtained using wildtype CAS1L plant material.
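The efficiency sense of "improved yield" described above, i.e., product obtained per enzyme unit over a defined unit of time, can be expressed as a simple ratio. The following is a minimal sketch only; the function names, the units, and the numbers in the usage lines are hypothetical and are not measurements from this application.

```python
def product_per_enzyme_unit_time(product_amount, enzyme_units, hours):
    """Amount of product obtained per enzyme unit per hour."""
    if enzyme_units <= 0 or hours <= 0:
        raise ValueError("enzyme_units and hours must be positive")
    return product_amount / (enzyme_units * hours)


def relative_improvement(mutant_rate, wildtype_rate):
    """Fractional change of the mutant rate relative to the wildtype rate."""
    if wildtype_rate <= 0:
        raise ValueError("wildtype_rate must be positive")
    return (mutant_rate - wildtype_rate) / wildtype_rate


# Hypothetical values for illustration: the same amount of corresponding
# plant material, the same enzyme loading, and the same incubation time.
wildtype = product_per_enzyme_unit_time(10.0, enzyme_units=2.0, hours=2.5)
mutant = product_per_enzyme_unit_time(12.5, enzyme_units=2.0, hours=2.5)
print(relative_improvement(mutant, wildtype))
```

Under this reading, a positive value indicates improved yield in the efficiency sense, provided that the comparison uses the same amount of corresponding plant material.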

[0027] In the context of this invention "corresponding plant material from a wildtype CAS1L plant" refers to plant material that is from the same part of the plant as the CAS1L mutant plant material. As understood in the art, improved yield is based upon comparisons of the same amount of corresponding plant material.

[0028] A plant that has a "mutant CAS1L gene" in the context of this invention refers to a plant in which at least one CAS1L gene is inactivated, or mutated to reduce expression, relative to a plant with a wildtype CAS1L gene.

[0029] A plant that has been "engineered to have decreased expression of a CAS1L gene" refers to a plant that has been modified by mutagenesis, and/or genetic engineering, e.g., engineering to express RNAi that targets one or more CAS1L genes, to exhibit decreased expression of the targeted CAS1L gene or genes.

[0030] A "CAS1L gene" refers generally to a nucleic acid encoding a polypeptide that is a member of the CAS1L gene family. Examples of members of the CAS1L gene family include CAS1L1, CAS1L2, CAS1L3, and CAS1L4. In the context of this invention, a CAS1L polypeptide that is encoded by a CAS1L gene has substantial identity to a polypeptide comprising the sequence set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, or SEQ ID NO:5; or to a CAS1L gene sequence set forth in FIG. 1. A "CAS1L gene" is also referred to herein as an RWA gene. Thus, for example, CAS1L genes CAS1L1, CAS1L2, CAS1L3, and CAS1L4 are also referred to as RWA1, RWA2, RWA3 and RWA4 genes.

[0031] The terms "decreased expression", "reduced expression", or "inhibited expression" of a CAS1L gene refer to a reduction in the level of expression of the CAS1L gene in an engineered plant compared to the level of expression in a wildtype plant. Thus, decreased expression can be a reduction in expression of a CAS1L gene of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% or greater. Decreased expression can be assessed by measuring decreases in the level of RNA encoded by the gene and/or decreases in the level of CAS1L protein or protein activity. CAS1L protein/protein activity can be assessed directly or indirectly, e.g., by measuring an endpoint such as the acetate content of a part of the plant in which the CAS1L gene is expressed.
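A percent reduction of the kind listed above can be computed from any paired measurement of a wildtype plant and an engineered plant, e.g., RNA level or acetate content. The sketch below is illustrative only; the function names are hypothetical, and the default threshold of 10% is simply the lowest reduction named in the text.

```python
def percent_reduction(wildtype_level, engineered_level):
    """Percent reduction of the engineered level relative to wildtype."""
    if wildtype_level <= 0:
        raise ValueError("wildtype_level must be positive")
    return 100.0 * (wildtype_level - engineered_level) / wildtype_level


def meets_reduction_threshold(wildtype_level, engineered_level,
                              threshold_percent=10.0):
    """True if the reduction is at least threshold_percent."""
    reduction = percent_reduction(wildtype_level, engineered_level)
    return reduction >= threshold_percent


# Hypothetical measurements in arbitrary units, for illustration only.
print(percent_reduction(200.0, 150.0))
print(meets_reduction_threshold(200.0, 150.0))
```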

[0032] The phrase "substantially identical," in the context of two nucleic acids or polypeptides, refers to a sequence or subsequence that has at least 50% identity, typically at least 60% sequence identity, to a reference sequence. Percent identity can be any integer from 50% to 100%. In some embodiments, a sequence is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% identical when compared to a reference sequence.

[0033] In the case of both expression of transgenes and inhibition of endogenous genes (e.g., by antisense, or sense suppression) one of skill will recognize that the inserted polynucleotide sequence need not be perfectly identical and may be "substantially identical" to a sequence of the gene from which it was derived. In the case of polynucleotides used to inhibit expression of an endogenous gene, the introduced sequence need not be perfectly identical to a sequence of the target endogenous gene. The introduced polynucleotide sequence will typically be at least substantially identical (as determined below) to the target endogenous sequence. Thus, an introduced "polynucleotide sequence from" a CAS1L gene may not be identical to the target CAS1L gene to be suppressed, but is functional in that it is capable of inhibiting expression of the target CAS1L gene. As explained below, these variants are specifically covered by this term.

[0034] The phrase "at least one CAS1L gene", or "at least two CAS1L genes", or "at least three CAS1L genes" is used to refer to members of the CAS1L gene family in a plant that has multiple CAS1L genes. For example, in the alignment in FIG. 1, rice has three CAS1L gene family members (Os01g0631100--Oryza (SEQ ID NO:15), Os05g0582100--Oryza (SEQ ID NO:16), and Os03g0314200--Oryza (SEQ ID NO:17)), poplar has four CAS1L gene family members (P2_P._trichocarpa (SEQ ID NO: 18), P3_P._trichocarpa (SEQ ID NO: 19), P1_P._trichocarpa (SEQ ID NO: 20), and P4_P._trichocarpa (SEQ ID NO: 21)), and Arabidopsis has four CAS1L gene family members (SEQ ID NOs: 3, 4, 1, and 5).

[0035] Two nucleic acid sequences or polypeptides are said to be "identical" if the sequence of nucleotides or amino acid residues, respectively, in the two sequences is the same when aligned for maximum correspondence as described below. The term "complementary to" is used herein to mean that the sequence is complementary to all or a portion of a reference polynucleotide sequence.

[0036] Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. (U.S.A.) 85: 2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group (GCG), 575 Science Dr., Madison, Wis.), or by inspection. In some embodiments, percent identity is determined using the BLAST2 algorithm set at the default settings.
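For orientation, a global alignment of the Needleman and Wunsch type can be sketched in a few lines. This is a textbook dynamic-programming version with a linear gap penalty; the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration and do not reproduce the default settings of GAP, BESTFIT, BLAST, or any other program named above.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# Illustrative sequences only; they are not taken from the sequence listing.
best, top, bottom = needleman_wunsch("ACGT", "AGT")
print(best)
print(top)
print(bottom)
```

The two returned strings have equal length, with "-" marking gap positions, which is the form of input that a percent identity calculation over a comparison window expects.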

[0037] "Percentage of sequence identity" is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. A "comparison window" may be, e.g., 20, 50, 100, 400, or more nucleotides or amino acids in length; or may be the entire length of the sequences being compared.
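The calculation described above (matched positions divided by the total number of positions in the comparison window, multiplied by 100) can be sketched as follows. The sketch assumes the two sequences have already been aligned to equal length with "-" as the gap character; the function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over a comparison window of two aligned sequences.

    Both inputs must already be aligned, i.e., have the same length, with
    `gap` marking additions or deletions. A gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window must not be empty")
    matched = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != gap)
    return 100.0 * matched / len(aligned_a)


# Illustrative window of six positions: four matches, one mismatch, one gap.
print(percent_identity("ACGT-A", "ACCTTA"))
```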

[0038] Proteins that are substantially identical include those that have conservative amino acid substitutions. Conservative amino acid substitutions refer to the interchangeability of residues having similar side chains. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine. Preferred conservative amino acid substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, aspartic acid-glutamic acid, and asparagine-glutamine.
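The side-chain groups listed above can be represented as a simple lookup for checking whether a given substitution is conservative. The sketch below encodes only the six side-chain groups, using standard one-letter amino acid codes; the names are hypothetical, and the separately listed preferred groups (e.g., aspartic acid-glutamic acid) are deliberately not included.

```python
# Side-chain groups from the paragraph above, in one-letter amino acid codes.
CONSERVATIVE_GROUPS = {
    "aliphatic": set("GAVLI"),          # Gly, Ala, Val, Leu, Ile
    "aliphatic-hydroxyl": set("ST"),    # Ser, Thr
    "amide-containing": set("NQ"),      # Asn, Gln
    "aromatic": set("FYW"),             # Phe, Tyr, Trp
    "basic": set("KRH"),                # Lys, Arg, His
    "sulfur-containing": set("CM"),     # Cys, Met
}


def shared_groups(res_a, res_b):
    """Names of the side-chain groups that contain both residues."""
    a, b = res_a.upper(), res_b.upper()
    return sorted(name for name, members in CONSERVATIVE_GROUPS.items()
                  if a in members and b in members)


def is_conservative(res_a, res_b):
    """True if two different residues fall within the same side-chain group."""
    return res_a.upper() != res_b.upper() and bool(shared_groups(res_a, res_b))


print(is_conservative("L", "I"))  # leucine to isoleucine, both aliphatic
print(is_conservative("K", "D"))  # lysine to aspartic acid, no shared group
```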

[0039] Another indication that nucleotide sequences are substantially identical is if two molecules hybridize to each other, or a third nucleic acid, under stringent conditions. Stringent conditions are sequence dependent and will be different in different circumstances. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Typically, stringent conditions will be those in which the salt concentration is about 0.02 molar at pH 7 and the temperature is at least about 60° C. For example, stringent conditions for hybridization, such as RNA-DNA hybridizations in a blotting technique are those which include at least one wash in 0.2×SSC at 55° C. for 20 minutes, or equivalent conditions.

[0040] A CAS1L polynucleotide sequence for use in the invention can also be amplified using PCR techniques, e.g., primers that are designed to amplify nucleic acid sequences that encode CAS1L protein sequences set forth in SEQ ID NOs: 1, 2, 3, 4, or 5.

[0041] The term "plant", as used herein, can refer to a whole plant or plant parts, e.g., cuttings, tubers, pollen, leaves, stems, flowers, roots, fruits, branches, and the like. The term also encompasses individual plant cells, groups of plant cells (e.g., cultured plant cells), protoplasts, plant extracts, seeds, and progeny thereof. The term includes plants of a variety of ploidy levels, including polyploid, diploid and haploid.

[0042] The term "progeny" refers generally to the offspring of a cross, and includes direct F1 progeny, as well as later generations of F2, F3, etc.

[0043] A polynucleotide sequence is "heterologous to" a second polynucleotide sequence if it originates from a foreign species, or, if from the same species, is modified by human action from its original form. For example, a promoter operably linked to a heterologous coding sequence refers to a coding sequence from a species different from that from which the promoter was derived, or, if from the same species, a coding sequence which is different from any naturally occurring allelic variants.

[0044] An "expression cassette" refers to a nucleic acid construct, which when introduced into a host cell, results in transcription and/or translation of an RNA or polypeptide, respectively.

[0045] The term "isolated", when applied to a nucleic acid or protein, denotes that the nucleic acid or protein is essentially free of other cellular components with which it is associated in the natural state. It is preferably in a homogeneous state and may be in either a dry or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A protein which is the predominant species present in a preparation is substantially purified. In particular, an isolated gene is separated from open reading frames which flank the gene and encode a protein other than the gene of interest.

[0046] The term "bulk harvested material" refers to combined plant material harvested from at least three plants, preferably at least 5, 10, 25, 50, 100, 500, or 1000 or more plants. The plant material may be whole plants, or parts of the plants, e.g., leaves or stems harvested from the plants. In some embodiments, the plant material present in the bulk harvested material is crushed or milled to a desired particle size, e.g., a size that is useful for producing biofuel.

Introduction

[0047] This invention is based, in part, on the discovery that plants can be engineered to suppress CAS1L gene expression to reduce the amount of acetate, i.e., polymer-bound acetyl esters, present in the plant material, thereby enhancing yield in an enzymatic reaction, e.g., a fermentation reaction, to obtain a desired product. Often, the product is a biofuel, such as ethanol. In typical embodiments, the fermentation reaction employs a microorganism, such as a yeast or bacteria, that is sensitive to acetic acid levels in the reaction.

[0048] In some embodiments, a plant such as a sugar beet or potato, that is engineered to reduce the amount of acetate in the plant material can be used as a source of other products, such as a pectin, that can be employed as a food ingredient.

[0049] The invention therefore provides methods of engineering plants to reduce acetate by suppressing expression of at least one, often two or three, CAS1L genes in the plant. The invention further provides plants that have been thusly engineered, as well as methods of using such plants, e.g., to enhance biofuel yield from plant material.

[0050] In the current invention, the yield of a fermentation reaction to obtain a desired product is increased due to the reduced acetylation in CAS1L mutant plants. This can be accomplished by having less acetate in the products that are used in the fermentation. Either an enzymatic or chemical degradation can be used.

[0051] In some embodiments, an enzymatic degradation is employed. In such embodiments, the enzymatic degradation reaction can itself be improved due to the lowered acetate content in CAS1L mutant plants. Typically, this results in increased yield in a fermentation reaction (i.e., either in the rate of fermentation and/or the total amount of fermentation product generated).

[0052] In some embodiments, improved degradation can also be advantageous without an effect on the final yield in fermentation. For example, in some embodiments, a reaction may employ less enzyme in order to degrade the biomass obtained from CAS1L mutant plants compared to the amount of enzyme required to degrade the biomass from wildtype CAS1L plants. Accordingly, in some embodiments, improved yield from biomass from CAS1L mutant plants results from an increase in the amount of degradation product generated per enzyme unit per unit of time relative to the yield from corresponding biomass from a wildtype CAS1L plant.

[0053] In the current invention, the degradation and fermentation of the biomass from the plant can be performed in one reaction mixture or using separate reaction mixtures. Thus, plant material from a CAS1L mutant plant, e.g., cell wall material from shoots, stems, etc., can be degraded either enzymatically or chemically in one reaction and the degradation products then fermented in a separate reaction mixture. In other embodiments, the degradation reaction and the fermentation reaction are conducted in the same reaction mixture such that the degradation products generated from enzymatic or chemical degradation of the plant biomass are fermented in the same mixture in which the biomass is degraded.

[0054] An "improved yield" from a fermentation reaction can thus arise from an improvement in the overall amount of product obtained or from an increased efficiency of the overall reaction.

Plants that Can be Engineered in Accordance with the Invention

[0055] Various kinds of plants can be engineered to reduce CAS1L gene expression as described herein to reduce acetate levels. The plant may be a monocotyledonous plant or a dicotyledonous plant. In certain embodiments of the invention, plants are green field plants.

[0056] In other embodiments, plants are grown specifically for "biomass energy". For example, suitable plants include corn, switchgrass, sorghum, miscanthus, sugarcane, poplar, pine, wheat, rice, soy, cotton, barley, turf grass, tobacco, potato, bamboo, rape, sugar beet, sunflower, willow, and eucalyptus. In further embodiments, the plant is switchgrass (Panicum virgatum), giant reed (Arundo donax), reed canarygrass (Phalaris arundinacea), Miscanthus×giganteus, Miscanthus sp., sericea lespedeza (Lespedeza cuneata), millet, ryegrass (Lolium multiflorum, Lolium sp.), timothy, Kochia (Kochia scoparia), forage soybeans, alfalfa, clover, sunn hemp, kenaf, bahiagrass, bermudagrass, dallisgrass, pangolagrass, big bluestem, indiangrass, fescue (Festuca sp.), Dactylis sp., Brachypodium distachyon, smooth bromegrass, orchardgrass, or Kentucky bluegrass among others.

Inhibition of CAS1L Gene Expression

[0057] The invention employs various routine recombinant nucleic acid techniques. Generally, the nomenclature and the laboratory procedures in recombinant DNA technology described below are those well known and commonly employed in the art. Many manuals that provide direction for performing recombinant DNA manipulations are available, e.g., Sambrook & Russell, Molecular Cloning, A Laboratory Manual (3rd Ed, 2001); and Current Protocols in Molecular Biology (Ausubel et al., eds., 1994-1999, updated through 2008).

[0058] Acetyl-related protein has been identified in Cryptococcus neoformans, where mutation in the Cas1p gene leads to lack of acetylation of the coat polysaccharide (Janbon et al. Molecular Microbiology 42: 453-467, 2001). The protein therefore plays a role in acetylation and has been described as a putative acetyl transferase, although no transferase activity has been demonstrated. In the Genbank database, several homologs of the Cryptococcus protein are annotated as "acetyl transferase", "acetyl transferase-related", "putative acetyl transferase", etc. These annotations are based on the sequence similarity with the Cryptococcus protein.

[0059] A CAS1L nucleic acid that is targeted for suppression (inhibition) in this invention encodes a CAS1L protein that is substantially similar to SEQ ID NO:1, 2, 3, 4, or 5, or a fragment thereof. CAS1L proteins that are substantially similar to SEQ ID NO:1, 2, 3, 4, or 5, or a fragment thereof, have one or more conserved domains. For example, FIG. 1 provides an alignment of plant CAS protein sequences, including sequences from Arabidopsis (SEQ ID NOs: 3, 4, 1 and 5), rice (SEQ ID NOs:15-17), poplar (SEQ ID NOs:18-21), and Selaginella (SEQ ID NO:22). These sequences include a highly conserved region, which is shown below using Arabidopsis CAS1L2 as a reference sequence (SEQ ID NO:11):

TABLE-US-00001 YLNRHQTEEWKGWMQVLFLMYHYFAAAEYYNAIRVFIACYVWMTGFGNFS YYYIRKDFSLARFAQMMWRLNFLVIFSCIVLNNSYMLYYTCPMHTLFTLM VY

This region is at least 78% conserved (i.e., in the context of this invention 78% identical to) Arabidopsis CAS1L2 in CAS1L sequences in FIG. 1. The following sequence provides examples of residue positions, designated by "X", that may vary in this conserved sequence (SEQ ID NO:12). In some embodiments, a residue occurring at an "X" position is a conservative amino acid substitution relative to the amino acid residue that occurs at that position in a CAS1L sequence shown in FIG. 1.

TABLE-US-00002 YLNRHQTEEWKGWMQVXFLMYHYFXAXEXYNAIRXFIAXYVWMTGFGNFS YYYXXKDFSXXRFXQMMWRLNXXVXXXCXXLXNXYXLYYICPMHTLFTXM VY.

Accordingly, a CAS1L nucleic acid that is targeted for inhibition as described herein typically encodes a protein that contains a region that has at least 60%, 65%, 70%, or 75% or greater identity to this sequence.
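
As an illustration of how such an identity threshold can be checked, the following sketch computes percent identity between two ungapped, equal-length protein regions. It is a simplification: identity values in practice are obtained from an alignment program, and the choice of the SEQ ID NO:14 stretch compared here was made by eye for this example and is an assumption, as are the function names.

```python
# SEQ ID NO:11 as printed in the text (Arabidopsis CAS1L2 conserved region).
REFERENCE = (
    "YLNRHQTEEWKGWMQVLFLMYHYFAAAEYYNAIRVFIACYVWMTGFGNFS"
    "YYYIRKDFSLARFAQMMWRLNFLVIFSCIVLNNSYMLYYTCPMHTLFTLM"
    "VY"
)

# Stretch of SEQ ID NO:14 that lines up with the reference without gaps.
# Picking this stretch by eye is an assumption of this sketch.
CAS1L3_REGION = (
    "YLNRHQTEEWKGWMQVLFLMYHYFAAAEIYNAIRVFIAAYVWMTGF"
    "GNFSYYYIRKDFSLARFTQMMWRLNLFVAFSCIILNNDYMLYYICPMHTL"
    "FTLMVY"
)


def percent_identity(candidate: str, reference: str) -> float:
    """Percent of positions that are identical in two pre-aligned sequences."""
    if len(candidate) != len(reference) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(c == r for c, r in zip(candidate, reference))
    return 100.0 * matches / len(reference)


def meets_threshold(candidate: str, reference: str = REFERENCE,
                    threshold: float = 60.0) -> bool:
    """True if the candidate region reaches the given identity threshold."""
    return percent_identity(candidate, reference) >= threshold


print(f"identity to reference: {percent_identity(CAS1L3_REGION, REFERENCE):.1f}%")
print("meets 60% threshold:", meets_threshold(CAS1L3_REGION))
```

A position-by-position comparison like this is only meaningful when the two regions align without insertions or deletions; otherwise a proper alignment is needed first.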

[0060] Another motif that is conserved is the sequence LHEWHFRSGLDRYIWI (SEQ ID NO:13). Accordingly, a CAS1L nucleic acid sequence that is targeted for inhibition typically encodes a protein that contains this motif.

[0061] Furthermore, a CAS1L nucleic acid typically encodes a protein that has substantial identity, e.g., at least 60%, 65%, 70%, or 75% identity to a central region, residues 116-350, shown with reference to an Arabidopsis CAS1L3 protein sequence (SEQ ID NO:14):

TABLE-US-00003 YFYISDRTSLLGESKKNYNRDLFLFLYCLLIIVSAMTSLKKHNDKSPITG KSILYLNRHQTEEWKGWMQVLFLMYHYFAAAEIYNAIRVFIAAYVWMTGF GNFSYYYIRKDFSLARFTQMMWRLNLFVAFSCIILNNDYMLYYICPMHTL FTLMVYGALGIFSRYNEIPSVMALKIASCFLVVIVMWEIPGVFEIFWSPL TFLLGYTDPAKPELPLLHEWHFRSGLDRYIWIIGM.

[0062] A plant may express multiple CAS1L family members. For example, Arabidopsis has multiple CAS1L family members, which have a high degree of amino acid sequence identity. For example, CAS1L1, CAS1L3, and CAS1L4 have at least 68%, 69%, and 74% amino acid sequence identity to CAS1L2.

[0063] In the present invention, at least one CAS1L gene sequence is inhibited in a plant. For example, in some embodiments, one CAS1L gene is inhibited. Such a gene may encode a protein that has at least 50%, 55%, 60%, 65%, 70%, or 75% identity, or greater, to a CAS1L1 or CAS1L2 reference sequence, e.g., SEQ ID NO:1 or 2. In some embodiments, plants are generated in which at least two CAS1L genes are inhibited. Plants may also be engineered in which at least three CAS1L genes or four CAS1L genes are inhibited. As understood in the art, the number of CAS1L genes that are inhibited will depend on such factors as effects on plant growth, plant general health and the like.

[0064] In some embodiments, a CAS1L gene that is inhibited has a coding nucleic acid sequence that is substantially identical, e.g., at least 50%, 60%, 65%, 70%, or 75% identical, to the CAS1L nucleic acid sequences SEQ ID NO:6, 7, 8, 9, or 10. Of the CAS1L nucleic acid sequences shown in SEQ ID NOs. 6-10, the sequences have the following percent identity as determined by Blast using standard parameters:

TABLE-US-00004
            CAS1L1  CAS1L2-a  CAS1L2-b  CAS1L3  CAS1L4
CAS1L1
CAS1L2-a    73
CAS1L2-b    73      99
CAS1L3      76      71        71
CAS1L4      77      72        72        87

Methods of Inhibiting CAS1L Gene Expression

[0065] The CAS1L gene can be suppressed using any number of techniques well known in the art. For example, one method of suppression is sense suppression (also known as co-suppression). Introduction of expression cassettes in which a nucleic acid is configured in the sense orientation with respect to the promoter has been shown to be an effective means by which to block the transcription of target genes. For an example of the use of this method to modulate expression of endogenous genes see, Napoli et al., The Plant Cell 2:279-289 (1990); Flavell, Proc. Natl. Acad. Sci., USA 91:3490-3496 (1994); Kooter and Mol, Current Opin. Biol. 4:166-171 (1993); and U.S. Pat. Nos. 5,034,323, 5,231,020, and 5,283,184.

[0066] Generally, where inhibition of expression is desired, some transcription of the introduced sequence occurs. The effect may occur where the introduced sequence contains no coding sequence per se, but only intron or untranslated sequences homologous to sequences present in the primary transcript of the endogenous sequence. The introduced sequence generally will be substantially identical to the endogenous sequence intended to be repressed. This minimal identity will typically be greater than about 65%, but a higher identity can exert a more effective repression of expression of the endogenous sequences. In some embodiments, sequences with substantially greater identity are used, e.g., at least about 80%, at least about 95%, or 100% identity. As with antisense regulation, further discussed below, the effect can be designed and tested to apply to any other proteins within a similar family of genes exhibiting homology or substantial homology.

[0067] For sense suppression, the introduced sequence in the expression cassette, needing less than absolute identity, also need not be full length, relative to either the primary transcription product or fully processed mRNA. Furthermore, the introduced sequence need not have the same intron or exon pattern, and identity of non-coding segments will be equally effective. In some embodiments, a sequence of the size ranges noted above for antisense regulation is used, i.e., 30-40, or at least about 20, 50, 100, 200, 500 or more nucleotides.

[0068] Endogenous gene expression may also be suppressed by means of RNA interference (RNAi) (and indeed co-suppression can be considered a type of RNAi), which uses a double-stranded RNA having a sequence identical or similar to the sequence of the target gene. As used herein, RNAi includes the use of microRNA, such as artificial miRNA, to suppress expression of a gene.

[0069] RNAi is the phenomenon in which, when a double-stranded RNA having a sequence identical or similar to that of the target gene is introduced into a cell, the expression of both the inserted exogenous gene and the target endogenous gene is suppressed. The double-stranded RNA may be formed from two separate complementary RNAs or may be a single RNA with internally complementary sequences that form a double-stranded RNA. Although complete details of the mechanism of RNAi are still unknown, it is considered that the introduced double-stranded RNA is initially cleaved into small fragments, which then serve as indexes of the target gene in some manner, thereby degrading the target gene transcript. RNAi is also known to be effective in plants (see, e.g., Chuang, C. F. & Meyerowitz, E. M., Proc. Natl. Acad. Sci. USA 97: 4985 (2000); Waterhouse et al., Proc. Natl. Acad. Sci. USA 95:13959-13964 (1998); Tabara et al. Science 282:430-431 (1998); Matthew, Comp Funct Genom 5: 240-244 (2004); Lu, et al., Nucleic Acids Res. 32(21):e171 (2004)).

[0070] Thus, in some embodiments, inhibition of a CAS1L gene is achieved using RNAi techniques. For example, to achieve suppression of the expression of a DNA encoding a protein using RNAi, a double-stranded RNA having the sequence of a DNA encoding the protein, or a substantially similar sequence thereof (including those engineered not to translate the protein) or fragment thereof, is introduced into a plant of interest. As used herein, RNAi and dsRNA both refer to gene-specific silencing that is induced by the introduction of a double-stranded RNA molecule, see e.g., U.S. Pat. Nos. 6,506,559 and 6,573,099, and includes reference to a molecule that has a region that is double-stranded, e.g., a short hairpin RNA molecule. The resulting plants may then be screened for a phenotype associated with the target CAS1L protein, e.g., reduced acetate, and/or by monitoring steady-state RNA levels for transcripts encoding the protein. Although the genes used for RNAi need not be completely identical to the target gene, they may be at least 70%, 80%, 90%, 95% or more identical to the target gene sequence. See, e.g., U.S. Patent Publication No. 2004/0029283. The constructs encoding an RNA molecule with a stem-loop structure that is unrelated to the target gene and that is positioned distally to a sequence specific for the gene of interest may also be used to inhibit target gene expression. See, e.g., U.S. Patent Publication No. 2003/0221211.

[0071] The RNAi polynucleotides may encompass the full-length target RNA or may correspond to a fragment of the target RNA. In some cases, the fragment will have fewer than 100, 200, 300, 400, or 500 nucleotides corresponding to the target sequence. In addition, in some embodiments, these fragments are at least, e.g., 50, 100, 150, 200, or more nucleotides in length. Interfering RNAs may be designed based on short duplexes (i.e., short regions of double-stranded sequences). Typically, the short duplex is at least about 15, 20, or 25-50 nucleotides in length (e.g., each complementary sequence of the double-stranded RNA is 15-50 nucleotides in length), often about 20-30 nucleotides, e.g., 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length. In some cases, fragments for use in RNAi will be at least substantially similar to regions of a target protein that do not occur in other proteins in the organism or may be selected to have as little similarity to other organism transcripts as possible, e.g., selected by comparison to sequences in publicly available sequence databases. Thus, RNAi fragments may be selected for similarity or identity with conserved domains of CAS1L sequences, such as those described herein, that lack significant homology to other sequences in the databases.
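
One simple way to look for a fragment with little similarity to other transcripts is to count shared short subsequences (k-mers). The sketch below is an illustration under stated assumptions, not the method of the disclosure: the function names, the default fragment length of 200, and the k-mer size of 21 are hypothetical choices, and only the sense strand is compared.

```python
def kmers(seq: str, k: int) -> set:
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def pick_fragment(target: str, off_targets: list, length: int = 200, k: int = 21):
    """Return (start, shared) for the window of the target that shares the
    fewest k-mers with the off-target transcripts (earliest window on ties)."""
    if length > len(target) or k > length:
        raise ValueError("need k <= length <= len(target)")
    off = set()
    for transcript in off_targets:
        off |= kmers(transcript, k)
    best = None
    for start in range(len(target) - length + 1):
        window = target[start:start + length]
        shared = sum(1 for i in range(length - k + 1) if window[i:i + k] in off)
        if best is None or shared < best[1]:
            best = (start, shared)
    return best


# Toy sequences for illustration only.
toy_target = "AAAAACCCCCGGGGGTTTTT"
toy_off_targets = ["AAAAACCCCC"]
start, shared = pick_fragment(toy_target, toy_off_targets, length=8, k=4)
print(start, shared, toy_target[start:start + 8])
```

In practice a database search would replace the in-memory list of off-target transcripts, and both strands would be considered.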

[0072] In some embodiments, an RNAi is introduced into a cell as part of a larger DNA construct. Often, such constructs allow stable expression of the RNAi in cells after introduction, e.g., by integration of the construct into the host genome. Thus, expression vectors that continually express iRNA in cells transfected with the vectors may be employed for this invention. For example, vectors that express small hairpin or stem-loop structure RNAs, or precursors to microRNA, which get processed in vivo into small RNAi molecules capable of carrying out gene-specific silencing (Brummelkamp et al., Science 296:550-553 (2002), and Paddison, et al., Genes & Dev. 16:948-958 (2002)) can be used. Post-transcriptional gene silencing by double-stranded RNA is discussed in further detail by Hammond et al. Nature Rev Gen 2: 110-119 (2001), Fire et al. Nature 391: 806-811 (1998) and Timmons and Fire Nature 395: 854 (1998).

[0073] Methods for selection and design of sequences that generate RNAi are well known in the art (e.g. Reynolds, 2004; see also U.S. Pat. No. 6,506,559; U.S. Pat. No. 6,511,824; and U.S. Pat. No. 6,489,127).

[0074] One of skill in the art will recognize that using technology based on specific nucleotide sequences (e.g., antisense or sense suppression technology), families of homologous genes can be suppressed with a single sense or antisense transcript (antisense suppression is discussed below). For instance, if a sense or antisense transcript is designed to have a sequence that is conserved among a family of genes, then multiple members of a gene family can be suppressed. Conversely, if the goal is to suppress only one member of a homologous gene family, then the sense or antisense transcript should be targeted to sequences with the most variation between family members.
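
The choice between a conserved region (to suppress several family members) and a variable region (to suppress one member) can be sketched as a sliding-window scan over pre-aligned sequences. This is a minimal illustration: it assumes gap-free, equal-length aligned sequences, and the function names and window width are hypothetical.

```python
def column_conservation(aligned: list) -> list:
    """For each alignment column, True if all sequences carry the same residue."""
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    return [len(set(column)) == 1 for column in zip(*aligned)]


def window_scores(aligned: list, width: int) -> list:
    """Fraction of fully conserved columns in each window of the given width."""
    conserved = column_conservation(aligned)
    if width <= 0 or width > len(conserved):
        raise ValueError("width must be between 1 and the alignment length")
    return [sum(conserved[i:i + width]) / width
            for i in range(len(conserved) - width + 1)]


def most_and_least_conserved(aligned: list, width: int) -> tuple:
    """Start positions of the most and the least conserved windows."""
    scores = window_scores(aligned, width)
    most = max(range(len(scores)), key=scores.__getitem__)
    least = min(range(len(scores)), key=scores.__getitem__)
    return most, least


# Toy alignment for illustration only.
toy_family = ["ACGTACGTAAAA", "ACGTACGTCCCC", "ACGTACGTGGGG"]
print(most_and_least_conserved(toy_family, width=4))
```

A window with a high score is a candidate for targeting the whole family, while a window with a low score is a candidate for member-specific targeting.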

[0075] The term "target RNA molecule", e.g., in this invention a target CAS1L RNA, refers to an RNA molecule to which an RNAi molecule is homologous or complementary.

[0076] One or more CAS1L genes can be inhibited using the same interfering RNA. For example, all of the CAS1L genes in a plant may be targeted by using an RNAi that is designed to a conserved region of the CAS1L gene. In other embodiments, individual CAS1L gene family members may be targeted by using an RNAi that is specific to that CAS1L gene.

Antisense and Ribozyme Suppression

[0077] A reduction of CAS1L gene expression in a plant to reduce polysaccharide acetylation may be obtained by introducing into plants antisense constructs based on the CAS1L polynucleotide sequences. For antisense suppression, a CAS1L sequence is arranged in reverse orientation relative to the promoter sequence in the expression vector. The introduced sequence need not be a full-length CAS1L cDNA or gene, and need not be identical to the CAS1L cDNA or a gene found in the plant variety to be transformed. Generally, however, where the introduced sequence is of shorter length, a higher degree of homology to the native CAS1L sequence is used to achieve effective antisense suppression. Preferably, the introduced antisense sequence in the vector will be at least 30 nucleotides in length, and improved antisense suppression will typically be observed as the length of the antisense sequence increases. Preferably, the length of the antisense sequence in the vector will be greater than 100 nucleotides. Transcription of an antisense construct as described results in the production of RNA molecules that are the reverse complement of mRNA molecules transcribed from the endogenous CAS1L gene. Suppression of endogenous CAS1L gene expression can also be achieved using a ribozyme. The production and use of ribozymes are disclosed in U.S. Pat. No. 4,987,071 to Cech and U.S. Pat. No. 5,543,508 to Haselhoff.
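
The reverse-complement relationship between an antisense insert and the sense strand can be shown in a few lines. This is a sketch only; the function names and the toy cDNA are hypothetical, and the code handles only unambiguous A, C, G, and T bases.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_insert(cdna: str, start: int, length: int) -> str:
    """Reverse complement of a cDNA segment, i.e., the segment as it reads
    when placed in reverse orientation relative to the promoter."""
    if start < 0 or length <= 0 or start + length > len(cdna):
        raise ValueError("segment must lie within the cDNA")
    return reverse_complement(cdna[start:start + length])


# Toy sequence for illustration only.
toy_cdna = "ATGCCGTA"
print(reverse_complement("ATGC"))
print(antisense_insert(toy_cdna, start=2, length=4))
```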

Mutagenesis

[0078] Alternatively, random mutagenesis approaches may be used to disrupt or "knock-out" the expression of a CAS1L gene using either chemical or insertional mutagenesis, or irradiation. One method of mutagenesis and mutant identification is known as TILLING (for targeting induced local lesions in genomes). In this method, mutations are induced in the seed of a plant of interest, for example, using EMS treatment. The resulting plants are grown and self-fertilized, and the progeny are assessed. For example, the plants may be assessed using PCR to identify whether a mutated plant has a CAS1L mutation, e.g., that reduces expression of a CAS1L gene, or by evaluating whether the plant has reduced levels of acetate in a part of the plant that expresses the CAS1L gene. TILLING can identify mutations that may alter the expression of specific genes or the activity of proteins encoded by these genes (see Colbert et al (2001) Plant Physiol 126:480-484; McCallum et al (2000) Nature Biotechnology 18:455-457).

[0079] Another method for abolishing or decreasing the expression of a CAS1L gene is by insertion mutagenesis using the T-DNA of Agrobacterium tumefaciens. After generating the insertion mutants, the mutants can be screened to identify those containing the insertion in a CAS1L gene. Mutants containing a single mutation event at the desired gene may be crossed to generate homozygous plants for the mutation (Koncz et al. (1992) Methods in Arabidopsis Research. World Scientific).

[0080] Another method to disrupt a CAS1L gene is by use of the cre-lox system (for example, as described in U.S. Pat. No. 5,658,772).

Plants in which Multiple CAS1L Genes are Inhibited

[0081] In some embodiments expression of two or more CAS1L genes is inhibited in a plant in accordance with the invention. As explained above, such plants can be generated by performing a molecular manipulation that targets all of the CAS1L gene family members in a plant, e.g., using an RNAi to a conserved region to inactivate all of the CAS1L genes. Such plants can also be obtained by breeding plants that each have individual mutations that inactivate different CAS1L genes to obtain progeny plants that are inactivated in all of the desired CAS1L genes. For example, to obtain a rice plant in which three CAS1L genes are inactivated, one of skill can target the genes using RNAi developed to a region that is conserved in all three of the rice CAS1L genes, or target the genes individually and breed the resulting mutant plants.

Expression of CAS1L Gene Inhibitors

[0082] Expression cassettes comprising polynucleotides that encode CAS1L gene expression inhibitors, e.g., an antisense or siRNA, can be constructed using methods well known in the art. Constructs include regulatory elements, including promoters and other sequences for expression and selection of cells that express the construct. Typically, plant transformation vectors include one or more cloned plant coding sequences (genomic or cDNA) under the transcriptional control of 5' and 3' regulatory sequences and a dominant selectable marker. Such plant transformation vectors typically also contain a promoter (e.g., a regulatory region controlling inducible or constitutive, environmentally- or developmentally-regulated, or cell- or tissue-specific expression), a transcription initiation start site, an RNA processing signal (such as intron splice sites), a transcription termination site, and/or a polyadenylation signal.

[0083] Examples of constitutive plant promoters which may be useful for expressing the inhibitor sequence include: the cauliflower mosaic virus (CaMV) 35S promoter, which confers constitutive, high-level expression in most plant tissues (see, e.g., Odell et al., (1985) Nature 313:810); the nopaline synthase promoter (An et al., (1988) Plant Physiol. 88:547); and the octopine synthase promoter (Fromm et al., (1989) Plant Cell 1:977).

[0084] Additional constitutive regulatory elements, including those for efficient expression in monocots, also are known in the art, for example, the pEmu promoter and promoters based on the rice Actin-1 5' region (Last et al., Theor. Appl. Genet. 81:581 (1991); McElroy et al., Mol. Gen. Genet. 231:150 (1991); McElroy et al., Plant Cell 2:163 (1990)). Chimeric regulatory elements, which combine elements from different genes, also can be useful for ectopically expressing a nucleic acid molecule encoding a CAS1L gene expression inhibitor (Comai et al., Plant Mol. Biol. 15:373 (1990)).

[0085] Other examples of constitutive promoters include the 1'- or 2'-promoter derived from T-DNA of Agrobacterium tumefaciens (see, e.g., Mengiste (1997) supra; O'Grady (1995) Plant Mol. Biol. 29:99-108); actin promoters, such as the Arabidopsis actin gene promoter (see, e.g., Huang (1997) Plant Mol. Biol. 33:125-139); alcohol dehydrogenase (Adh) gene promoters (see, e.g., Millar (1996) Plant Mol. Biol. 31:897-904); ACT11 from Arabidopsis (Huang et al. Plant Mol. Biol. 33:125-139 (1996)); Cat3 from Arabidopsis (GenBank No. U43147, Zhong et al., Mol. Gen. Genet. 251:196-203 (1996)); the gene encoding stearoyl-acyl carrier protein desaturase from Brassica napus (GenBank No. X74782, Slocombe et al. Plant Physiol. 104:1167-1176 (1994)); GPc1 from maize (GenBank No. X15596, Martinez et al. J. Mol. Biol 208:551-565 (1989)); Gpc2 from maize (GenBank No. U45855, Manjunath et al., Plant Mol. Biol. 33:97-112 (1997)); and other transcription initiation regions from various plant genes known to those of skill. See also Holtorf Plant Mol. Biol. 29:637-646 (1995).

[0086] A variety of plant gene promoters that regulate gene expression in response to various environmental, hormonal, chemical, and developmental signals, and in a tissue-active manner, are known in the art. Examples of environmental conditions that may affect transcription by inducible promoters include anaerobic conditions, elevated temperature, drought, or the presence of light. Examples of environmental promoters include the drought-inducible promoter of maize (Busk (1997) supra) and the cold, drought, and high salt inducible promoter from potato (Kirch (1997) Plant Mol. Biol. 33:897-909). Plant promoters that are inducible upon exposure to plant hormones, such as auxins, may also be employed. For example, the invention can use the auxin response elements E1 promoter fragment (AuxREs) in the soybean (Glycine max L.) (Liu (1997) Plant Physiol. 115:397-407); the auxin-responsive Arabidopsis GST6 promoter (also responsive to salicylic acid and hydrogen peroxide) (Chen (1996) Plant J. 10:955-966); the auxin-inducible parC promoter from tobacco (Sakai (1996) 37:906-913); a plant biotin response element (Streit (1997) Mol. Plant Microbe Interact. 10:933-937); and the promoter responsive to the stress hormone abscisic acid (Sheen (1996) Science 274:1900-1902).

[0087] Plant promoters which are inducible upon exposure to chemical reagents that can be applied to the plant, such as herbicides or antibiotics, may also be used in vectors as described herein. For example, the maize In2-2 promoter, activated by benzenesulfonamide herbicide safeners, can be used; application of different herbicide safeners induces distinct gene expression patterns, including expression in the root, hydathodes, and the shoot apical meristem. Other promoters, e.g., a tetracycline-inducible promoter; a salicylic acid responsive element promoter; promoters comprising copper-inducible regulatory elements; promoters comprising ecdysone-inducible regulatory elements; heat shock inducible promoters; a nitrate-inducible promoter; or a light-inducible promoter, may also be used.

[0088] In some embodiments, the plant promoter may direct expression of the polynucleotide of the invention in a specific tissue (tissue-specific promoters), such as a leaf or a stem. Tissue-specific promoters are transcriptional control elements that are only active in particular cells or tissues at specific times during plant development, such as in vegetative tissues or reproductive tissues. Examples of tissue-specific promoters include promoters that initiate transcription primarily in certain tissues, such as vegetative tissues, e.g., roots or leaves, or reproductive tissues, such as fruit, ovules, seeds, pollen, pistils, flowers, or any embryonic tissue. Other examples are promoters that direct expression specifically to cells and tissues with secondary cell wall deposition, such as xylem and fibers.

[0089] Plant expression vectors may also include RNA processing signals that may be positioned within, upstream, or downstream of the coding sequence. In addition, the expression vectors may include additional regulatory sequences from the 3'-untranslated region of plant genes, e.g., a 3' terminator region to increase stability of the mRNA, such as the PI-II terminator region of potato or the octopine or nopaline synthase 3' terminator regions.

[0090] Plant expression vectors routinely also include dominant selectable marker genes to allow for the ready selection of transformants. Such genes include those encoding antibiotic resistance genes (e.g., resistance to hygromycin, kanamycin, bleomycin, G418, streptomycin or spectinomycin), herbicide resistance genes (e.g., phosphinothricin acetyltransferase), and genes encoding positive selection enzymes (e.g. mannose isomerase).

[0091] Once an expression cassette comprising a polynucleotide encoding an inhibitor of the expression of a CAS1L gene, e.g., an antisense or siRNA, has been constructed, standard techniques may be used to introduce the polynucleotide into a plant in order to modify CAS1L activity and, accordingly, the level of acetylation in the plant or plant part in which the CAS1L target nucleic acid is expressed. See protocols described in Ammirato et al. (1984) Handbook of Plant Cell Culture--Crop Species. Macmillan Publ. Co.; Shimamoto et al. (1989) Nature 338:274-276; Fromm et al. (1990) Bio/Technology 8:833-839; and Vasil et al. (1990) Bio/Technology 8:429-434.

[0092] Transformation and regeneration of plants is known in the art, and the selection of the most appropriate transformation technique will be determined by the practitioner. Suitable methods may include, but are not limited to: electroporation of plant protoplasts; liposome-mediated transformation; polyethylene glycol (PEG) mediated transformation; transformation using viruses; micro-injection of plant cells; micro-projectile bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens-mediated transformation. Transformation means introducing a nucleotide sequence into a plant in a manner to cause stable or transient expression of the sequence. Examples of these methods in various plants include: U.S. Pat. Nos. 5,571,706; 5,677,175; 5,510,471; 5,750,386; 5,597,945; 5,589,615; 5,750,871; 5,268,526; 5,780,708; 5,538,880; 5,773,269; 5,736,369 and 5,610,042.

[0093] Following transformation, plants are preferably selected using a dominant selectable marker incorporated into the transformation vector. Typically, such a marker will confer antibiotic or herbicide resistance on the transformed plants or the ability to grow on a specific substrate, and selection of transformants can be accomplished by exposing the plants to appropriate concentrations of the antibiotic, herbicide, or substrate.

Evaluation of Plants Engineered to Reduce CAS1L Expression

[0094] After transformed plants are selected, parts of the plants may be evaluated to determine the level of CAS1L gene expression in a part of the plant that expresses the CAS1L gene, e.g., by evaluating the level of RNA or protein, or determining the levels of acetate in the plants. These analyses can be performed using any number of methods known in the art.

[0095] In some embodiments, acetyl esters in plant cell wall material can be measured. For example, cell walls are prepared from plant material. Several methods are known. In the simplest method, the plant material is ground and extracted repeatedly with 96% and 70% ethanol. The resulting "alcohol insoluble residue" is highly enriched in cell wall material. The sample is dried and resuspended in buffer at neutral pH. An aliquot of the sample is saponified by treatment with 0.1 M NaOH at 4° C. overnight or at room temperature for several hours. Following saponification, the sample is neutralized by adding 1 M HCl. Acetic acid in the saponified and neutralized sample can be determined in several different ways, e.g., by gas chromatography or HPLC on an appropriate column. A convenient method is to use an acetic acid determination kit, e.g., the kit by R-Biopharm, Germany, according to the manufacturer's instructions.

[0096] The principle of the kit is that acetate is enzymatically consumed in a series of reactions leading to the formation of NADPH, which can be determined spectrophotometrically at 340 nm. The kit allows for correction for interference in the determination.
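
The neutralization and spectrophotometric steps involve two small calculations, sketched below. Several points are assumptions of this sketch rather than statements of the disclosure or of any kit manual: that the sample contains NaOH at a final concentration of 0.1 M, that one mole of reduced cofactor is formed per mole of acetate, that the absorbance change has already been corrected for blanks, and the function names. The extinction coefficient is the commonly cited value for NAD(P)H at 340 nm.

```python
# Commonly cited molar extinction coefficient of NAD(P)H at 340 nm.
EPSILON_340 = 6220.0  # L mol^-1 cm^-1


def hcl_volume_to_neutralize(sample_ml: float, naoh_molar: float = 0.1,
                             hcl_molar: float = 1.0) -> float:
    """Volume of HCl (mL) with the same number of moles as the NaOH in the sample."""
    if sample_ml < 0 or naoh_molar < 0 or hcl_molar <= 0:
        raise ValueError("volumes and concentrations must be non-negative, HCl > 0")
    return sample_ml * naoh_molar / hcl_molar


def acetate_molar(delta_a340: float, path_cm: float = 1.0,
                  dilution: float = 1.0) -> float:
    """Beer-Lambert estimate of acetate concentration (mol/L), assuming a
    1:1 stoichiometry between acetate and the reduced cofactor (assumption)."""
    if path_cm <= 0 or dilution <= 0:
        raise ValueError("path length and dilution factor must be positive")
    return delta_a340 / (EPSILON_340 * path_cm) * dilution


print(f"HCl for 1 mL sample: {hcl_volume_to_neutralize(1.0):.2f} mL")
print(f"acetate at dA340 = 0.622: {acetate_molar(0.622) * 1e6:.0f} micromolar")
```

When a commercial kit is used, the calculation given in the manufacturer's instructions takes precedence over this simplified estimate.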

[0097] The procedure above can be used to determine total acetyl esters in the alcohol insoluble residue. To determine acetate esters in specific polymers, the alcohol insoluble residue can be sequentially extracted and/or digested with specific enzymes. The extracts and digests can be analyzed by the methods described above or by mass spectrometry.

[0098] As appreciated by one of skill in the art, double, triple, or quadruple CAS1L mutant plants can be generated using techniques well known in the art, including breeding techniques as well as introducing mutations into a plant that already has a mutation in a CAS1L gene.

[0099] Plants that exhibit reduced CAS1L gene expression have at least 5% reduction in acetylation, e.g., as assessed by evaluating cell walls, typically at least 10% reduction in acetylation, or more often at least 15%, 20%, 30%, 40%, or 50% or more reduction in acetylation in comparison to a plant that has not been engineered to decrease expression of the CAS1L gene. In some embodiments, such a reduction in acetylation is observed when one CAS1L gene is reduced in expression in an engineered plant; in some embodiments, such a reduction in acetylation is observed when two CAS1L genes in the engineered plant have reduced expression; and in some embodiments, such a reduction in acetylation is observed when three or four CAS1L genes are engineered to decrease expression in the plant. As understood in the art, CAS1L-mediated reduction in acetylation may occur in one or more parts of the plants, e.g., acetylation may be reduced in a leaf and/or a stem.
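
The percent reduction in acetylation relative to a non-engineered control can be computed as follows. This is a minimal sketch; the function names and the example acetate values are hypothetical, and the threshold list simply restates the percentages given above.

```python
# Reduction levels named in the text, in percent.
THRESHOLDS = (5, 10, 15, 20, 30, 40, 50)


def percent_reduction(engineered: float, control: float) -> float:
    """Percent reduction of the engineered value relative to the control."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * (control - engineered) / control


def highest_threshold_met(reduction_pct: float):
    """Largest listed threshold reached by the reduction, or None if below 5%."""
    met = [t for t in THRESHOLDS if reduction_pct >= t]
    return max(met) if met else None


# Hypothetical acetate contents (arbitrary units) for illustration only.
reduction = percent_reduction(engineered=80.0, control=100.0)
print(f"reduction: {reduction:.1f}%, threshold met: {highest_threshold_met(reduction)}")
```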

[0100] Plants that exhibit reduced CAS1L gene expression can be used in a variety of methods. The plants selected for reduced acetate levels may be evaluated further to confirm that the plants provide for improved yield. For example, plant material from the plants with reduced acetate content is ground or milled to a defined particle size. The plant material can be compared to ground or milled plant material from normal plants that have not been engineered to reduce CAS1L gene expression.

[0101] The milled plant material is subjected to a saccharification procedure. Many different procedures are currently used experimentally, e.g., dilute acid treatment, steam explosion, or ionic liquid treatments. As the beneficial effect of reduced acetate content will differ depending on the exact procedure used, several different pretreatment methods can be evaluated. For example, a dilute acid treatment method can be used. The pretreated plant material is then subjected to enzymatic hydrolysis using a mixture of cell wall degrading enzymes.

[0102] Procedures for cell wall pretreatment and enzymatic digestion are well known to those skilled in the art. The yield or efficiency of the procedure can be readily determined by measuring the amount of reducing sugar released, using a standard method for sugar detection, e.g., the dinitrosalicylic acid method. Plants engineered in accordance with the invention provide a higher sugar yield.
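As an illustration of how a colorimetric reducing-sugar readout might be converted to a sugar concentration, the sketch below fits a linear standard curve and inverts it. The standard concentrations, absorbance values, and function names are hypothetical assumptions, not data or procedures from the specification.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def reducing_sugar_mg_per_ml(absorbance, slope, intercept, dilution_factor=1.0):
    """Invert the standard curve and correct for any dilution of the sample."""
    return (absorbance - intercept) / slope * dilution_factor


# Hypothetical glucose standards (mg/mL) and their absorbance readings.
standards_mg_per_ml = [0.0, 0.5, 1.0, 1.5, 2.0]
standards_abs = [0.00, 0.25, 0.50, 0.75, 1.00]
slope, intercept = fit_line(standards_mg_per_ml, standards_abs)

# Hypothetical hydrolysate readings for a wild-type and an engineered sample.
wild_type = reducing_sugar_mg_per_ml(0.40, slope, intercept)
mutant = reducing_sugar_mg_per_ml(0.60, slope, intercept)
print(f"wild type: {wild_type:.2f} mg/mL, engineered: {mutant:.2f} mg/mL")
```

A linear fit is only appropriate within the range covered by the standards; readings outside that range would call for dilution and re-measurement.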

[0103] CAS1L-inhibited plants may also be evaluated in comparison to non-engineered plants to test for the effect of acetates and acetate-derived compounds on subsequent fermentation. For example, hydrolyzed biomass is subjected to fermentation using an organism such as yeast or E. coli that can convert the biomass into compounds such as ethanol, butanol, alkanes, lipids, etc. In the simplest test, the yield of ethanol obtained with a given amount of starting plant material and a standard yeast fermentation can be determined. Yield can be determined not only with organisms that can ferment glucose, but also with organisms that have the ability to ferment pentoses and/or other sugars derived from the biomass. In addition to determining the yield of product, e.g., ethanol, one can determine the growth rate of the organism. The plants of the invention that are engineered to reduce CAS1L activity, and accordingly to have reduced acetate, will exhibit a reduced inhibitory effect due to acetate and acetate-derived compounds in comparison to corresponding plants that have not been engineered to reduce CAS1L activity. The decreased inhibitory effect may result in higher final yields of a fermentation reaction, or in faster fermentation, or both.
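One way to express the growth rate mentioned above is the specific growth rate during exponential growth, computed from two optical density readings. The sketch below assumes exponential growth between the two time points; the readings and function names are hypothetical.

```python
import math


def specific_growth_rate(od_start: float, od_end: float, hours: float) -> float:
    """Specific growth rate (per hour), assuming exponential growth between two readings."""
    if od_start <= 0 or od_end <= 0 or hours <= 0:
        raise ValueError("optical densities and elapsed time must be positive")
    return math.log(od_end / od_start) / hours


def doubling_time(mu: float) -> float:
    """Doubling time in hours for a given specific growth rate."""
    return math.log(2) / mu


# Hypothetical optical density readings taken 6 h apart (illustration only).
mu = specific_growth_rate(od_start=0.1, od_end=0.8, hours=6.0)
print(f"specific growth rate: {mu:.3f} per h, doubling time: {doubling_time(mu):.2f} h")
```

Comparing this rate for an organism grown on hydrolysate from engineered and non-engineered plant material gives a simple measure of inhibition alongside the final product yield.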

[0104] Plants having reduced CAS1L activity can be used in a variety of reactions, including fermentation reactions. Such reactions are well known in the art. For example, fermentation reactions noted above, e.g., a yeast or bacterial fermentation reaction, may employ CAS1L mutants to obtain ethanol, butanol, lipids, and the like. For example, the plants may be used in industrial bioprocessing reactions that include fermentative bacteria, yeast, or filamentous fungi such as Corynebacterium sp., Brevibacterium sp., Rhodococcus sp., Azotobacter sp., Citrobacter sp., Enterobacter sp., Clostridium sp., Klebsiella sp., Salmonella sp., Lactobacillus sp., Aspergillus sp., Saccharomyces sp., Zygosaccharomyces sp., Pichia sp., Kluyveromyces sp., Candida sp., Hansenula sp., Dunaliella sp., Debaryomyces sp., Mucor sp., Torulopsis sp., Methylobacteria sp., Bacillus sp., Escherichia sp., Pseudomonas sp., Serratia sp., Rhizobium sp., Streptomyces sp., Zymomonas mobilis, acetic acid bacteria, methylotrophic bacteria, Propionibacterium, Acetobacter, Arthrobacter, Ralstonia, and Gluconobacter.

Evaluation for Increased Resistance to Fungus

[0105] CAS1L mutant plants, e.g., plants that have mutations in CAS1L2, also exhibit increased resistance to certain fungi in comparison to plants that have wildtype CAS1L2 genes. CAS1L-inhibited plants may also be evaluated for susceptibility to infection, e.g., fungus infection. In some embodiments, such an assay may also be used to identify mutant CAS1L plants. An example of an assay to evaluate fungus resistance is provided in Example 3. CAS1L mutant plants may thus be more readily propagated, e.g., the plants could be grown using reduced amounts of fungicide.

EXAMPLES

Example 1

Acetate Levels in CAS1L Mutant Plants

[0106] Arabidopsis mutants generated by random insertion of T-DNA were obtained from the Arabidopsis Biological Resource Center, Ohio, or constructed. Mutants where the T-DNA had been inserted in the coding region of the genes of interest, e.g., At3g06550, At2g34410, and At1g29890 were identified by searching the website of www.arabidopsis.org. The website contains sequence information for flanking regions of T-DNA for all the mutants deposited. Mutant seeds were first screened to confirm the T-DNA insertion and identify homozygous plants. Seeds were germinated and leaf samples were used to prepare DNA and carry out PCR amplification using primers suggested by ABRC. The principle behind this PCR screening is that a primer complementary to the T-DNA together with a primer complementary to the plant genome near the insertion will yield a PCR product indicative of the insertion. Likewise, two primers complementary to the plant genome and placed on different sides of the site of insertion will only yield a PCR band if there is no insert, thus indicating the wild type allele. By using all three primers for PCR reactions, plants that were homozygous for the insertion could be identified by the presence of the insert-specific PCR product and the absence of the wild-type-specific PCR product. The homozygous individuals were grown to maturity and the seeds harvested. Homozygosity of the offspring was confirmed by PCR.
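The three-primer screening logic described above can be summarized as a small decision function. This is a sketch only: the function name is hypothetical, and the heterozygous and failed-reaction calls are the usual inferences from the same logic rather than statements made in the specification.

```python
def call_genotype(insert_band: bool, wild_type_band: bool) -> str:
    """Interpret a three-primer PCR lane for a T-DNA insertion line.

    insert_band:    product from the T-DNA primer plus one genomic primer
    wild_type_band: product from the two genomic primers flanking the insertion site
    """
    if insert_band and not wild_type_band:
        return "homozygous insertion"
    if insert_band and wild_type_band:
        return "heterozygous"  # inferred: both alleles present
    if wild_type_band:
        return "wild type"
    return "no product (reaction failed or inconclusive)"


# Hypothetical gel results for four plants: (insert band seen, wild-type band seen).
lanes = {
    "plant 1": (True, False),
    "plant 2": (True, True),
    "plant 3": (False, True),
    "plant 4": (False, False),
}
for plant, (ins, wt) in lanes.items():
    print(plant, "->", call_genotype(ins, wt))
```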

[0107] Acetate levels were initially evaluated in leaf samples from a small number of Arabidopsis plants that were generated. Cell walls were prepared from plant material. An aliquot of the sample was saponified using h 0.1 M NaOH. Following saponification, the sample was neutralized by adding 1 M HCl. Acetic acid in the saponified and neutralized sample was then determined with a colorimetric assay essentially according to Beutler (1984, in Methods of Enzymatic Analysis (Bergmeyer, H. U., ed.) 3rd ed., vol. VI, pp. 639-645, Verlag Chemie, Weinheim, Deerfield Beach/Florida, Basel).

[0108] The results are shown in FIG. 2. Plants in which CAS1L2 expression was decreased showed a statistically significant reduction in acetate levels. Plants that had inhibited CAS1L3 or CAS1L4 also exhibited a small reduction in acetate levels, although in the experiment depicted in FIG. 2 with the small number of plants, the reduction was not significant. A different CAS1L2 mutant plant (i.e., that had a different CAS1L2 mutation) also exhibited reduced acetate levels (data not shown).

[0109] An initial analysis also indicated that Arabidopsis plants in which CAS1L1 expression is inhibited exhibited reduced acetate levels in the stems.

[0110] Crosses of mutant CAS1L1, CAS1L2, CAS1L3, and CAS1L4 plants were also obtained in all combinations, i.e., all the single, double, triple, and quadruple mutants were generated. The quadruple mutants and some of the triple mutant plants exhibited reduced plant growth.
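For orientation, the full set of single, double, triple, and quadruple mutant combinations of four genes can be enumerated as in the sketch below. The variable names are illustrative only.

```python
from itertools import combinations

genes = ["CAS1L1", "CAS1L2", "CAS1L3", "CAS1L4"]
order_names = {1: "single", 2: "double", 3: "triple", 4: "quadruple"}

# Every non-empty subset of the four genes, grouped by the number of mutated genes.
mutants = {order_names[k]: list(combinations(genes, k)) for k in range(1, 5)}
counts = {name: len(combos) for name, combos in mutants.items()}
total = sum(counts.values())

for name, combos in mutants.items():
    print(f"{name} ({len(combos)}):", [" ".join(c) for c in combos])
print("total genotypes:", total)
```

Four genes give 4 single, 6 double, 4 triple, and 1 quadruple combination, i.e., 15 mutant genotypes in total.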

Example 2

Enzymatic Digestion of Wild Type Arabidopsis and Mutant rwa2 to Determine the Effect of Acetylation on Saccharification Hydrolysis Kinetics

[0111] This example demonstrates that plant material from CAS1L mutant plants is more easily saccharified into sugars.

[0112] Cell wall material was isolated as described (Harholt et al., Plant Physiol.140:49-58, 2006). Rosette leaves from mature plants were ground to a powder in a ball mill and extracted with 96% and 70% ethanol.

[0113] Enzyme digestions were performed using Novozyme cellulase NS50013 and beta-glucosidase NS50010, with enzyme loadings of 10% wt enzyme/wt glucan and 1% wt enzyme/wt glucan, respectively. The percentage glucan was assumed to be 30% (dw) in all samples, and a 1% glucan loading was used for all reactions. All reactions were performed as duplicate technical replicates and two or three biological replicates.

[0114] Reactions were performed in 2 ml screwcap microcentrifuge tubes. Approximately 10 mg of cell wall material was measured into the tubes. A master mix containing 10% and 1% of NS50013 and NS50010, respectively, was prepared in 50 mM sodium acetate buffer, pH 4.8. Enzyme mixture was added to tubes on ice. Samples were then incubated for up to 72 hours in an ATR multitron shaker at 50° C. and 1000 RPM.
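The loadings stated in this example can be turned into per-tube amounts as sketched below. The calculation assumes that the enzyme loadings are weight of enzyme per weight of glucan, as stated, and that the 1% glucan loading means 1 g of glucan per 100 mL of reaction (weight per volume); the latter interpretation and the function name are assumptions made for illustration.

```python
def reaction_setup(cell_wall_mg: float,
                   glucan_fraction: float = 0.30,      # assumed glucan content (30% dw)
                   cellulase_loading: float = 0.10,    # NS50013, wt enzyme / wt glucan
                   bglucosidase_loading: float = 0.01, # NS50010, wt enzyme / wt glucan
                   glucan_loading_w_v: float = 0.01):  # ASSUMPTION: 1% read as g per mL x 100
    """Return glucan mass, enzyme masses (mg), and reaction volume (mL) for one tube."""
    glucan_mg = cell_wall_mg * glucan_fraction
    # 1% w/v corresponds to 10 mg of glucan per mL of reaction.
    glucan_mg_per_ml = glucan_loading_w_v * 1000.0
    return {
        "glucan_mg": glucan_mg,
        "cellulase_mg": glucan_mg * cellulase_loading,
        "beta_glucosidase_mg": glucan_mg * bglucosidase_loading,
        "reaction_volume_ml": glucan_mg / glucan_mg_per_ml,
    }


setup = reaction_setup(cell_wall_mg=10.0)
for name, value in setup.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, 10 mg of cell wall material corresponds to 3 mg of glucan, 0.3 mg of cellulase, 0.03 mg of beta-glucosidase, and a reaction volume of 0.3 mL.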

[0115] At various time points, samples were removed from the shaker and centrifuged at 14000×g for 10 min. Supernatants were collected and diluted 1:100 for analysis. Monosaccharide concentrations were determined using High Performance Anion Exchange Chromatography with Pulsed Amperometric Detection (HPAEC-PAD) on a Dionex DX600 equipped with a Dionex Carbopac PA-20 analytical column (3×150 mm) and a Carbopac PA-20 guard column (3×30 mm) (Dionex, Sunnyvale, Calif.). Eluent flow rate was 0.4 mL/min and the temperature was 30° C. The gradient consisted of a 12 min elution with 14 mM NaOH followed by a 5 min ramp to 450 mM NaOH for 20 min, then a return to the original NaOH concentration of 14 mM for 10 min prior to the next injection. Product concentrations were determined using an external standard.
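Quantification against an external standard, with correction for the 1:100 dilution, can be sketched as follows. A single-point standard with a linear detector response through the origin is assumed, and the 1:100 dilution is read as a 100-fold dilution; the peak areas, concentrations, and function name are hypothetical.

```python
def concentration_from_external_standard(sample_area: float,
                                         standard_area: float,
                                         standard_conc: float,
                                         dilution_factor: float = 100.0) -> float:
    """Concentration in the undiluted sample, in the units of standard_conc.

    ASSUMPTIONS: single-point external standard, linear response through the
    origin, and a 1:100 dilution interpreted as a 100-fold dilution.
    """
    if standard_area <= 0:
        raise ValueError("standard peak area must be positive")
    diluted_conc = sample_area / standard_area * standard_conc
    return diluted_conc * dilution_factor


# Hypothetical peak areas (arbitrary units) and a hypothetical 10 uM glucose standard.
glucose_um = concentration_from_external_standard(
    sample_area=25.0, standard_area=50.0, standard_conc=10.0)
print(f"glucose in undiluted supernatant: {glucose_um:.0f} uM")
```

In practice a multi-point calibration per monosaccharide would usually be preferred; the single-point form is used here only to keep the arithmetic visible.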

[0116] In one experiment, a much higher amount of sugar was released from the leaf cell walls from the rwa2-1 mutant (see FIG. 3). Most of the released sugar is glucose, but xylose release showed the same difference, with about 6-fold higher release from the rwa2 cell wall material. In a second experiment that was carried out in a similar way but with only a single time point of 24 h incubation, there was no significant difference between the mutant and the wildtype in total sugar release. However, the second experiment still showed a trend towards 3% increased xylose release from the mutant cell walls.

[0117] These experiments thus indicate that there is a benefit of reducing cell wall acetate on saccharification of biomass. The combined effect of a 20% reduction in hemicellulose acetylation on saccharification and on yeast fermentation has been simulated in a lignocellulose biorefinery model (Klein-Marcuschamer et al., Techno-Economic Modeling of Cellulosic Biorefineries. Presented at DOE Genomic Science Awardee Workshop, Feb. 7-10, 2010, Crystal City, Va.). This simulation showed that an expected result would be a 10% decrease in cost per gallon of produced ethanol. The main effect is on the improved fermentation.

Example 3

RWA Mutant Plants have Decreased Susceptibility to Fungus

[0118] RWA mutant plants also demonstrated increased resistance to fungus.

[0119] Botrytis cinerea IK2018 was isolated from strawberry fruit and obtained from Dr. Birgit Jensen, Dept. of Plant Biology and Biotechnology, University of Copenhagen, Denmark. B. cinerea was maintained on potato dextrose agar (Difco, USA). Spores were collected in 3 ml of 12 g/L potato dextrose broth (PDB; Difco, USA) by gentle rubbing and filtered through miracloth (Calbiochem/Merck, Germany) to remove mycelium. The number of spores was counted using a haemocytometer, and the suspension was adjusted to 5×10^5 conidiospores/ml in PDB for infection of leaves.
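Adjusting a counted spore suspension to the target density is a simple dilution calculation, sketched below. The chamber volume of 0.1 microliter per counted square is an assumption (it corresponds to a common haemocytometer format and should be checked against the chamber actually used), and the count and function names are hypothetical.

```python
def spores_per_ml(mean_count_per_square: float, square_volume_ul: float = 0.1) -> float:
    """Convert a mean haemocytometer count per square into spores per mL.

    ASSUMPTION: each counted square holds 0.1 microliter of suspension.
    """
    return mean_count_per_square / square_volume_ul * 1000.0


def dilution_volumes(stock_per_ml: float, target_per_ml: float, final_volume_ml: float):
    """Return (stock volume, diluent volume) in mL using C1 * V1 = C2 * V2."""
    if target_per_ml > stock_per_ml:
        raise ValueError("stock is more dilute than the target; concentrate it first")
    stock_ml = target_per_ml * final_volume_ml / stock_per_ml
    return stock_ml, final_volume_ml - stock_ml


# Hypothetical mean count of 200 spores per square (illustration only).
stock = spores_per_ml(200)
stock_ml, pdb_ml = dilution_volumes(stock, target_per_ml=5e5, final_volume_ml=1.0)
print(f"stock: {stock:.2e} spores/mL; mix {stock_ml:.2f} mL stock with {pdb_ml:.2f} mL PDB")
```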

[0120] Rosette leaves from 4-week-old soil-grown Arabidopsis plants (wild type Col-0 and the knock out mutants rwa2-1 and rwa2-3) were placed in Petri dishes containing 0.6% agar, with the petiole embedded in the medium. Inoculation was performed by placing 5 μl of a suspension of 5×10^5 conidiospores/ml in 12 g/L potato dextrose broth (PDB; Difco, Detroit, USA) on each side of the middle vein. The plates were incubated at 22° C. with a 12-h photoperiod. High humidity was maintained by covering the plates with a clear plastic lid. Lesion diameter in centimeters was obtained by hand analysis of high-resolution digital images of infected leaves using ImageJ (Abramoff et al., Biophotonics International 11:36-42, 2004). Included scale objects allowed standardization of measurements across images.
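The role of the scale objects can be illustrated with a short conversion from pixel measurements to centimeters. The pixel values, the 1.0 cm scale object, and the function names below are hypothetical; the measurements themselves were made by hand in ImageJ as described.

```python
def cm_per_pixel(scale_length_cm: float, scale_length_px: float) -> float:
    """Calibration factor derived from a scale object of known length in the image."""
    if scale_length_px <= 0:
        raise ValueError("scale object length in pixels must be positive")
    return scale_length_cm / scale_length_px


def lesion_diameter_cm(lesion_px: float, scale_length_cm: float, scale_length_px: float) -> float:
    """Lesion diameter in centimeters for one image, using that image's own scale object."""
    return lesion_px * cm_per_pixel(scale_length_cm, scale_length_px)


def mean(values):
    return sum(values) / len(values)


# Hypothetical measurements: (lesion diameter in px, scale object length in px);
# the scale object is assumed to be 1.0 cm long in every image.
wild_type_px = [(180.0, 400.0), (220.0, 440.0)]
mutant_px = [(90.0, 400.0), (100.0, 500.0)]

wild_type_cm = [lesion_diameter_cm(px, 1.0, scale) for px, scale in wild_type_px]
mutant_cm = [lesion_diameter_cm(px, 1.0, scale) for px, scale in mutant_px]
print(f"mean lesion diameter: wild type {mean(wild_type_cm):.3f} cm, mutant {mean(mutant_cm):.3f} cm")
```

Calibrating each image separately is what allows images taken at different magnifications to be compared on the same centimeter scale.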

[0121] The results showed that rwa2 mutant plants with decreased wall acetylation exhibited a surprising tolerance to the necrotrophic fungal pathogen Botrytis cinerea. The experiment was carried out twice with essentially the same results. The result of one of the experiments is shown in FIG. 4.

[0122] Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be readily apparent to one of ordinary skill in the art in light of the teachings of this invention that certain changes and modifications may be made thereto without departing from the spirit or scope of the appended claims.

[0123] All publications, accession numbers, patents, and patent applications cited in this specification are herein incorporated by reference as if each was specifically and individually indicated to be incorporated by reference.

Examples of CAS1L1 (RWA) Polypeptide and Nucleic Acid Sequences

TABLE-US-00005 [0124] CAS1L1 Arabidopsis sequence (NP_568662.1) AT5G46340 (SEQ ID NO: 1) 1 MVDPGPITPG QVSFLLGVIP IFVGWIYSEL LEYRKSWVPL KPHSDNNLVE LGDVAEKDDD 61 KADLLEGGLA RSPSVKFHNS SIRTNIIRFL SMEDSFLLEH RATLRAMSEF GAILIYFYIC 121 DRTELLGDST KNYNRDLFLF LYVLLIIVSA MTSLRKHNDK SPISGKSILY LNRHQTEEWK 181 GWMQVLFLMY HYFAAAEIYN AIRIFIAAYV WMTGFGNFSY YYVRKDFSVA RFAQMMWRLN 241 FFVAFCCIVL NNDYMLYYIC PMHTLFTLMV YGALGIFSKY NEIGSVMALK IFSCFLVVFL 301 LWEIPGAFEI FWGPLTFLLG YNDPAKPDLH RLHEWHFRSG LDRYIWIIGM IYAYYHPTVE 361 RWMEKLEDCE TKKRLSIKAA IVTITVLVGY VWYECIYKLD RTSYNMYHPY TSWIPITVYI 421 CLRNFTHQLR SVSLTLFAWL GKITLETYIS QFHIWLRSNM PDGQPKWLLS IIPGYPMLNF 481 MLTTAIYVLV SHRLFELTNT LKTVFVPTKD NKRLFSNFIA GIAIALPLYC FSFVLLQIHR Cas1L2 protein sequence NP-001118592.1 AT3G06550 (SEQ ID NO: 2) 1 MASSSPVTPG LMSVVFGIVP VIVAWLYSEY LHYAKYSVSA KTHSDVNLVE IAKDFVKEDD 61 KALLIEDGGG LQSASPRAKG PTTHSPLIRF VLLDESFLVE NRLTLRAIIE FAVLMVYFYI 121 CDRTDVFNSS KKSYNRDLFL FLYFLLIIVS AITSFTIHTD KSPFSGKAIM YLNRHQTEEW 181 KGWMQVLFLM YHYFAAAEYY NAIRVFIACY VWMTGFGNFS YYYIRKDFSL ARFAQMMWRL 241 NFLVIFSCIV LNNSYMLYYI CPMHTLFTLM VYGALGIMSK YNEMGSVIAA KFFACFVVVI 301 IVWEIPGVFE WIWSPFTLLM GYNDPAKPQL PLLHEWHFRS GLDRYIWIIG MLYAYYHPTV 361 ESWMDKLEEA EMKFRVAIKT SVALIALTVG YFWYEYIYKM DKLTYNKYHP YTSWIPITVY 421 ICLRNITQSF RGYSLTLLAW LGKITLETYI SQFHIWLRSG VPDGQPKLLL SLVPDYPLLN 481 FMLTTSIYVA ISYRLFELTN TLKTAFIPTK DDKRLVYNTI SALIICTCLY FFSFILITIP 541 QKLV Cas1L2 protein sequence NP_001078116.1 AT3G06550 (SEQ ID NO: 5) 1 MASSSPVTPG LMSVVFGIVP VIVAWLYSEY LHYAKYSVSA KTRHSDVNLV EIAKDFVKED 61 DKALLIEDGG GLQSASPRAK GPTTHSPLIR FVLLDESFLV ENRLTLRAII EFAVLMVYFY 121 ICDRTDVFNS SKKSYNRDLF LFLYFLLIIV SAITSFTIHT DKSPFSGKAI MYLNRHQTEE 181 WKGWMQVLFL MYHYFAAAEY YNAIRVFIAC YVWMTGFGNF SYYYIRKDFS LARFAQMMWR 241 LNFLVIFSCI VLNNSYMLYY ICPMHTLFTL MVYGALGIMS KYNEMGSVIA AKFFACFVVV 301 IIVWEIPGVF EWIWSPFTLL MGYNDPAKPQ LPLLHEWHFR SGLDRYIWII GMLYAYYHPT 361 VESWMDKLEE AEMKFRVAIK TSVALIALTV GYFWYEYIYK MDKLTYNKYH PYTSWIPITV 421 YICLRNITQS FRGYSLTLLA 
WLGKITLETY ISQFHIWLRS GVPDGQPKLL LSLVPDYPLL 481 NFMLTTSIYV AISYRLFELT NTLKTAFIPT KDDKRLVYNT ISALIICTCL YFFSFILITI 541 PQKLVSQNFI FLCGRKLFFP WYLSSLIC Cas1L3 protein sequence NP_001031478.1 AT2G34410 (SEQ ID NO: 3) 1 MADSQPITPG QVSFLLGVIP VFIAWIYSEF LEYKRSSLHS KVHSDNNLVE LGEVKNKEDE 61 GVVLLEGGLP RSVSTKFYNS PIKTNLIRFL TLEDSFLIEN RATLRAMAEF GAILFYFYIS 121 DRTSLLGESK KNYNRDLFLF LYCLLIIVSA MTSLKKHNDK SPITGKSILY LNRHQTEEWK 181 GWMQVLFLMY HYFAAAEIYN AIRVFIAAYV WMTGFGNFSY YYIRKDFSLA RFTQMMWRLN 241 LFVAFSCIIL NNDYMLYYIC PMHTLFTLMV YGALGIFSRY NEIPSVMALK IASCFLVVIV 301 MWEIPGVFEI FWSPLTFLLG YTDPAKPELP LLHEWHFRSG LDRYIWIIGM IYAYFHPTVE 361 RWMEKLEECD AKRKMSIKTS IIAISSFVGY LWYEYIYKLD KVTYNKYHPY TSWIPITVYI 421 CLRNSTQQLR NFSMTLFAWL GKITLETYIS QFHIWLRSNV PNGQPKWLLC IIPEYPMLNF 481 MLVTAIYVLV SHRLFELTNT LKSVFIPTKD DKRLLHNVLA GAAISFCLYL TSLILLQIPH Cas1L4 protein sequence NP174282.2 AT1G29890 (SEQ ID NO: 4) 1 MFSSHNIFLT IGIVFIRRFL TLEDSFLLEN RATLRAMAEF GAILLYFYIC DRTSLIGQSQ 61 KNYSRDLFLF LFCLLIIVSA MTSLKKHTDK SPITGKSILY LNRHQTEEWK GWMQVLFLMY 121 HYFAAVEFYN AIRVFIAGYV WMTGFGNFSY YYIRKDFSLA RFTQMMWRLN FFVAFCCIIL 181 NNDYMLYYIC PMHTLFTLMV YGALGIYSQY NEIASVMALK IASCFLVVIL MWEIPGVFEI 241 FWSPLAFLLG YTDPAKPDLP RLHEWHFRSG LDRYIWIIGM IYAYFHPTVE RWMEKLEECD 301 AKRRMSIKTS IIGISSFAGY LWYEYIYKLD KVTYNKYHPY TSWIPITVYI CLRNCTQQLR 361 RFSLTLFAWL GKITLETYIS QFHIWLRSSV PNGQPKLLLS IIPEYPMLNF MLTTAIYVLV 421 SVRLFELTNT LKSVFIPTKD DKRLLHNVIA MAAISFCLYI IGLILLLIPH SEQ ID NO: 6 CAS1L1 (RWA1) nucleic acid sequence (coding sequence) NCBI Reference Sequence: NM_124004.2 1 atggtggatc ctggaccaat tactccgggc caggtatctt ttcttcttgg agtaatccca 61 atatttgttg gttggatata ctcggagtta cttgagtaca gaaaatcttg ggttcccttg 121 aaacctcact cggataataa tctagttgaa ttgggagacg tagcagagaa ggacgacgac 181 aaagctgatc tgttggaggg aggtcttgcc cgatcaccat ctgtaaagtt tcataattca 241 tctatcagaa caaacataat caggtttttg agtatggaag attcattttt gctggaacat 301 cgagcaacct tgagagcaat gtcggaattt ggggcaatct taatatattt ctatatctgt 361 gaccgcacag aattgcttgg 
agattctacc aagaattaca accgcgacct tttccttttt 421 ctctacgttc ttctcatcat agtatcagcc atgacatctc tcagaaaaca caatgacaag 481 tcacccatat ctgggaagtc cattctttac cttaatcgcc accaaactga agaatggaaa 541 ggatggatgc aggttttgtt cttaatgtat cactactttg ctgcggccga gatatacaac 601 gcaatccgta tctttattgc tgcttatgtt tggatgactg gttttggaaa cttctcttac 661 tactatgtca gaaaggattt ctctgttgca cgttttgcgc agatgatgtg gaggctgaac 721 ttctttgtag cgttttgctg tattgttctc aacaacgact atatgttata ctacatctgc 781 ccaatgcaca ctcttttcac cctaatggta tatggagctc tgggtatctt cagcaagtac 841 aatgagatag gatcggtgat ggctctgaag atattttcat gcttcctcgt tgtctttttg 901 ttgtgggaaa ttcctggagc ttttgaaata ttttggggtc ccttaacatt tttgctaggt 961 tacaatgacc ctgccaagcc cgatcttcat cggctgcatg aatggcactt tagatcaggc 1021 cttgatcgct acatatggat catcggaatg atttatgcct attatcaccc aactgtagag 1081 agatggatgg agaagttaga ggactgtgaa acgaagaaaa gactatccat aaaggccgct 1141 attgttacta ttactgtgct tgttggctat gtgtggtatg aatgtatcta caagctggac 1201 aggaccagtt acaacatgta tcatccgtac acatcatgga tccccatcac tgtttacata 1261 tgccttcgga atttcaccca ccagcttcga agtgtctcat tgactctctt tgcgtggctt 1321 ggcaagatca ctttagagac ttacatttcc cagtttcata tatggctaag atcaaacatg 1381 cctgacgggc aaccaaaatg gcttctctct attattccgg gataccctat gctcaatttc 1441 atgctgacaa ctgctatata cgtccttgta tctcaccgtc tctttgaact aaccaacaca 1501 ctcaagacgg ttttcgtacc cacaaaagac aacaagcgac tcttctctaa cttcatagct 1561 gggattgcca tcgctcttcc actctattgc ttctcattcg ttcttcttca gattcatcgt 1621 tag SEQ ID NO: 7 CAS1L2 (RWA2) nucleic acid sequence (coding sequence) NCBI Reference Sequence: NM_001125120.1 1 atggcgagtt caagccctgt tacacctggg ctaatgtcgg tggtgttcgg gattgtgccg 61 gtgatcgtgg cttggctata ctctgagtat ctgcactatg ctaaatactc ggtctccgcc 121 aaaacgcact ctgatgtcaa tttggtggaa attgcgaaag attttgttaa agaagatgac 181 aaagctcttt taatagaaga tggaggtggt ctccaatcag cttctcctag agccaaaggc 241 ccgaccacac attctcctct catcaggttt gtcctcttgg atgagtcgtt cttggttgag 301 aacaggctga ctttaagggc aataattgag tttgcagtac ttatggtata cttttacata 361 
tgtgaccgca cagatgtctt caattcatca aagaagagtt acaaccggga tctctttctg 421 ttcctttact tccttctcat catcgtttca gcgataactt cattcacgat acatactgat 481 aaatcaccat tcagcggaaa agccatcatg tacttgaata ggcatcaaac cgaggagtgg 541 aaaggctgga tgcaggtcct tttcttgatg taccactact ttgctgctgc agagtactat 601 aatgcgatcc gtgttttcat tgcttgctat gtatggatga ctggatttgg gaatttttct 661 tattattaca ttcgcaagga ctttagcctt gcaaggtttg cacagatgat gtggcggcta 721 aatttcctgg tcatattctc ctgcatcgtc ctcaacaaca gttacatgct atactacatc 781 tgcccaatgc acactctgtt tactctaatg gtctatgggg cacttggtat tatgagcaag 841 tataatgaga tgggttcagt catagctgcc aaattttttg cctgcttcgt tgttgttatc 901 atcgtttggg aaattcctgg cgtttttgaa tggatttgga gtccatttac actcctaatg 961 ggttacaatg atcccgcaaa acctcagctt cccctcttgc atgagtggca tttccgctct 1021 ggacttgatc ggtacatatg gataatcggg atgctatatg catactacca cccaactgtt 1081 gaaagttgga tggataaact ggaggaagct gagatgaaat tcagggtggc tatcaaaaca 1141 tctgtggcac tgatagcact aacggtggga tatttttggt acgagtatat atacaagatg 1201 gacaagttaa cttacaacaa atatcatcct tacacctctt ggattccaat aactgtttat 1261 atctgtctcc ggaacatcac ccagtctttc cgcggctaca gtttgaccct tctggcgtgg 1321 cttggaaaga taacactgga gacatatatc tcccagtttc atatatggct cagatctgga 1381 gttcctgatg gtcaacccaa attactacta tctcttgtcc cggattaccc attgttgaac 1441 ttcatgctca ctacttcgat ttacgtcgct atctcttata ggctctttga gcttaccaac 1501 actttgaaaa cagccttcat accaaccaag gacgacaaac gccttgtcta caacacgatc 1561 tcagcactca taatctgcac ttgtctctac tttttctcat ttattcttat cacaattccc 1621 caaaaactgg tgtga SEQ ID NO: 8 CAS1L2 (RWA2) nucleic acid sequence (coding sequence) NCBI Reference Sequence: NM_001084647.4 1 atggcgagtt caagccctgt tacacctggg ctaatgtcgg tggtgttcgg gattgtgccg 61 gtgatcgtgg cttggctata ctctgagtat ctgcactatg ctaaatactc ggtctccgcc 121 aaaactaggc actctgatgt caatttggtg gaaattgcga aagattttgt taaagaagat 181 gacaaagctc ttttaataga agatggaggt ggtctccaat cagcttctcc tagagccaaa 241 ggcccgacca cacattctcc tctcatcagg tttgtcctct tggatgagtc gttcttggtt 301 gagaacaggc tgactttaag ggcaataatt gagtttgcag 
tacttatggt atacttttac 361 atatgtgacc gcacagatgt cttcaattca tcaaagaaga gttacaaccg ggatctcttt 421 ctgttccttt acttccttct catcatcgtt tcagcgataa cttcattcac gatacatact 481 gataaatcac cattcagcgg aaaagccatc atgtacttga ataggcatca aaccgaggag 541 tggaaaggct ggatgcaggt ccttttcttg atgtaccact actttgctgc tgcagagtac 601 tataatgcga tccgtgtttt cattgcttgc tatgtatgga tgactggatt tgggaatttt 661 tcttattatt acattcgcaa ggactttagc cttgcaaggt ttgcacagat gatgtggcgg 721 ctaaatttcc tggtcatatt ctcctgcatc gtcctcaaca acagttacat gctatactac 781 atctgcccaa tgcacactct gtttactcta atggtctatg gggcacttgg tattatgagc

841 aagtataatg agatgggttc agtcatagct gccaaatttt ttgcctgctt cgttgttgtt 901 atcatcgttt gggaaattcc tggcgttttt gaatggattt ggagtccatt tacactccta 961 atgggttaca atgatcccgc aaaacctcag cttcccctct tgcatgagtg gcatttccgc 1021 tctggacttg atcggtacat atggataatc gggatgctat atgcatacta ccacccaact 1081 gttgaaagtt ggatggataa actggaggaa gctgagatga aattcagggt ggctatcaaa 1141 acatctgtgg cactgatagc actaacggtg ggatattttt ggtacgagta tatatacaag 1201 atggacaagt taacttacaa caaatatcat ccttacacct cttggattcc aataactgtt 1261 tatatctgtc tccggaacat cacccagtct ttccgcggct acagtttgac ccttctggcg 1321 tggcttggaa agataacact ggagacatat atctcccagt ttcatatatg gctcagatct 1381 ggagttcctg atggtcaacc caaattacta ctatctcttg tcccggatta cccattgttg 1441 aacttcatgc tcactacttc gatttacgtc gctatctctt ataggctctt tgagcttacc 1501 aacactttga aaacagcctt cataccaacc aaggacgaca aacgccttgt ctacaacacg 1561 atctcagcac tcataatctg cacttgtctc tactttttct catttattct tatcacaatt 1621 ccccaaaaac tggtaagtca aaattttatc tttttgtgtg ggagaaagct tttttttccc 1681 tggtacttga gttcattgat atgttag SEQ ID NO: 9 CAS1L3 (RWA3) nucleic acid sequence (coding sequence) 1 atggcggatt ctcagccaat cacgcctggt caggtttcgt ttctactcgg agtcattcct 61 gtcttcatag catggattta ctcagagttt ctagagtata agaggtcttc attgcactct 121 aaagttcatt cagataataa tttggttgaa cttggtgagg taaaaaacaa ggaagatgaa 181 ggagtagttt tacttgaagg aggtcttcca agatcagtct ctacaaagtt ttataactca 241 cctatcaaaa caaacttgat tagatttctg acgctggaag actctttctt gattgaaaat 301 cgagcaacct tgagagcgat ggctgagttt ggggctattc ttttttactt ttatattagt 361 gatcgaacaa gcttgcttgg agagtctaaa aagaattaca acagagatct tttcctcttt 421 ctctactgtc ttctcatcat agtttcagcc atgacatcct tgaagaaaca caatgacaaa 481 tcacctataa caggaaaatc cattctctat cttaatcgtc accagactga agagtggaag 541 ggatggatgc aggttctatt tcttatgtat cattactttg ctgcggctga gatatataat 601 gcaatcaggg ttttcattgc tgcctacgtc tggatgactg ggtttgggaa cttctcttat 661 tactatatca gaaaggattt ctccctagca cgatttactc agatgatgtg gcgtcttaac 721 ttatttgtgg cgtttagctg cattattctc aataatgatt atatgctgta ctacatctgt 
781 ccaatgcaca ctctgttcac tcttatggtg tatggagccc ttggtatctt cagtcgatat 841 aacgaaatac catcagtaat ggctttgaag attgcttcat gctttctcgt ggttatcgtg 901 atgtgggaga ttcctggcgt ttttgagatt ttctggagtc ctttaacatt cttactggga 961 tacactgatc cagctaaacc agaactacca cttttacatg aatggcactt cagatcagga 1021 cttgaccgct acatatggat cattggaatg atatatgcct atttccatcc cactgtagag 1081 agatggatgg agaaattgga ggagtgtgat gccaagagaa agatgtcaat aaagacaagc 1141 ataattgcaa tttcctcatt tgttggttac ctatggtatg aatacatata caagcttgac 1201 aaggttacat acaacaaata tcatccctac acatcgtgga ttccaataac cgtctacatc 1261 tgtctgcgaa attctacaca acagctgcgt aatttctcca tgacactatt tgcgtggctc 1321 ggcaagatta ctctggaaac ctatatttct cagtttcaca tctggttaag atcgaatgtg 1381 ccaaatggac agcctaagtg gctattatgc attattccag aatacccaat gctcaacttc 1441 atgctcgtca cggccatcta tgtcttggtg tcccaccgac ttttcgagct tacaaacacg 1501 ttaaagtctg ttttcatacc aacaaaagac gacaagaggc tgctccacaa tgttctcgct 1561 ggagctgcca tctcgttctg tttatattta acatctctca ttcttctcca gatcccacac 1621 taa SEQ ID NO: 10 CAS1L4 (RWA4) nucleic acid sequence (coding sequence) NCBI Reference Sequence: NM_102729.2 1 atgttctcta gccataatat tttcttaacc attggcattg tgtttattcg taggtttctg 61 actttggaag actctttctt gcttgaaaac cgagcaacct tgagagcaat ggctgagttt 121 ggagcaattc ttttatattt ttatatttgt gatcgaacta gcttgatcgg gcagtctcaa 181 aagaattaca gccgagacct ttttctcttt ctcttctgtc ttctcatcat agtgtcagct 241 atgacgtcct tgaagaaaca cactgacaag tcaccaataa caggaaagtc cattctgtat 301 ctcaatcgtc accagactga agaatggaaa gggtggatgc aggttctatt tctgatgtat 361 cattattttg cggcagttga gttttacaat gcaatcaggg tcttcatcgc tggctatgtg 421 tggatgaccg gttttgggaa cttctcttat tactatatcc gaaaggattt ctcccttgca 481 cgattcactc agatgatgtg gcggcttaac ttttttgtgg cgttttgttg cattattctc 541 aacaatgact atatgctgta ctacatctgt ccaatgcaca ctctattcac gctgatggtc 601 tatggagccc ttggtattta cagtcagtat aacgaaatag catcagtgat ggctctgaag 661 attgcttcat gctttctcgt ggttatcctt atgtgggaga ttcctggagt ttttgagatt 721 ttctggagtc ctctggcatt cttactgggg tacacagatc cagctaaacc 
agaccttcca 781 cgtctacacg aatggcattt cagatctgga cttgatcgct acatatggat catcggcatg 841 atatatgcat attttcatcc cactgtagaa agatggatgg agaaattgga ggagtgtgat 901 gctaagagaa ggatgtcaat caagacaagc ataataggaa tttcttcatt cgctggttac 961 ctttggtatg aatacatcta caagctggac aaggttacgt acaacaaata tcatccctac 1021 acatcttgga ttccaataac tgtctacatc tgtctgcgaa attgcaccca acagctacgg 1081 agattttccc tgacactctt tgcgtggctg ggcaagataa ctctcgagac ctacatttca 1141 cagtttcaca tctggttaag atcgagtgtg ccaaatgggc agccaaagtt gctattatca 1201 atcatcccag aatacccaat gctcaacttc atgctcacca cggccatcta cgtcttggta 1261 tctgttcgac ttttcgagct aaccaataca ttaaaatcag ttttcatacc cacgaaagac 1321 gacaaacggc tgctccacaa cgtgattgct atggctgcga tatcattttg tttatatatt 1381 atcggtctta ttcttctctt gatcccacat taa

Sequence CWU 1

221540PRTArabidopsis thalianaCAS1L1 Arabidopsis sequence (NP_568662.1, AT5G46340) 1Met Val Asp Pro Gly Pro Ile Thr Pro Gly Gln Val Ser Phe Leu Leu1 5 10 15Gly Val Ile Pro Ile Phe Val Gly Trp Ile Tyr Ser Glu Leu Leu Glu 20 25 30Tyr Arg Lys Ser Trp Val Pro Leu Lys Pro His Ser Asp Asn Asn Leu 35 40 45Val Glu Leu Gly Asp Val Ala Glu Lys Asp Asp Asp Lys Ala Asp Leu 50 55 60Leu Glu Gly Gly Leu Ala Arg Ser Pro Ser Val Lys Phe His Asn Ser65 70 75 80Ser Ile Arg Thr Asn Ile Ile Arg Phe Leu Ser Met Glu Asp Ser Phe 85 90 95Leu Leu Glu His Arg Ala Thr Leu Arg Ala Met Ser Glu Phe Gly Ala 100 105 110Ile Leu Ile Tyr Phe Tyr Ile Cys Asp Arg Thr Glu Leu Leu Gly Asp 115 120 125Ser Thr Lys Asn Tyr Asn Arg Asp Leu Phe Leu Phe Leu Tyr Val Leu 130 135 140Leu Ile Ile Val Ser Ala Met Thr Ser Leu Arg Lys His Asn Asp Lys145 150 155 160Ser Pro Ile Ser Gly Lys Ser Ile Leu Tyr Leu Asn Arg His Gln Thr 165 170 175Glu Glu Trp Lys Gly Trp Met Gln Val Leu Phe Leu Met Tyr His Tyr 180 185 190Phe Ala Ala Ala Glu Ile Tyr Asn Ala Ile Arg Ile Phe Ile Ala Ala 195 200 205Tyr Val Trp Met Thr Gly Phe Gly Asn Phe Ser Tyr Tyr Tyr Val Arg 210 215 220Lys Asp Phe Ser Val Ala Arg Phe Ala Gln Met Met Trp Arg Leu Asn225 230 235 240Phe Phe Val Ala Phe Cys Cys Ile Val Leu Asn Asn Asp Tyr Met Leu 245 250 255Tyr Tyr Ile Cys Pro Met His Thr Leu Phe Thr Leu Met Val Tyr Gly 260 265 270Ala Leu Gly Ile Phe Ser Lys Tyr Asn Glu Ile Gly Ser Val Met Ala 275 280 285Leu Lys Ile Phe Ser Cys Phe Leu Val Val Phe Leu Leu Trp Glu Ile 290 295 300Pro Gly Ala Phe Glu Ile Phe Trp Gly Pro Leu Thr Phe Leu Leu Gly305 310 315 320Tyr Asn Asp Pro Ala Lys Pro Asp Leu His Arg Leu His Glu Trp His 325 330 335Phe Arg Ser Gly Leu Asp Arg Tyr Ile Trp Ile Ile Gly Met Ile Tyr 340 345 350Ala Tyr Tyr His Pro Thr Val Glu Arg Trp Met Glu Lys Leu Glu Asp 355 360 365Cys Glu Thr Lys Lys Arg Leu Ser Ile Lys Ala Ala Ile Val Thr Ile 370 375 380Thr Val Leu Val Gly Tyr Val Trp Tyr Glu Cys Ile Tyr Lys Leu Asp385 390 395 400Arg Thr Ser Tyr Asn Met Tyr His Pro Tyr Thr 
Ser Trp Ile Pro Ile 405 410 415Thr Val Tyr Ile Cys Leu Arg Asn Phe Thr His Gln Leu Arg Ser Val 420 425 430Ser Leu Thr Leu Phe Ala Trp Leu Gly Lys Ile Thr Leu Glu Thr Tyr 435 440 445Ile Ser Gln Phe His Ile Trp Leu Arg Ser Asn Met Pro Asp Gly Gln 450 455 460Pro Lys Trp Leu Leu Ser Ile Ile Pro Gly Tyr Pro Met Leu Asn Phe465 470 475 480Met Leu Thr Thr Ala Ile Tyr Val Leu Val Ser His Arg Leu Phe Glu 485 490 495Leu Thr Asn Thr Leu Lys Thr Val Phe Val Pro Thr Lys Asp Asn Lys 500 505 510Arg Leu Phe Ser Asn Phe Ile Ala Gly Ile Ala Ile Ala Leu Pro Leu 515 520 525Tyr Cys Phe Ser Phe Val Leu Leu Gln Ile His Arg 530 535 5402544PRTArabidopsis thalianaCas1L2 protein sequence (NP_001118592.1, AT3G06550) 2Met Ala Ser Ser Ser Pro Val Thr Pro Gly Leu Met Ser Val Val Phe1 5 10 15Gly Ile Val Pro Val Ile Val Ala Trp Leu Tyr Ser Glu Tyr Leu His 20 25 30Tyr Ala Lys Tyr Ser Val Ser Ala Lys Thr His Ser Asp Val Asn Leu 35 40 45Val Glu Ile Ala Lys Asp Phe Val Lys Glu Asp Asp Lys Ala Leu Leu 50 55 60Ile Glu Asp Gly Gly Gly Leu Gln Ser Ala Ser Pro Arg Ala Lys Gly65 70 75 80Pro Thr Thr His Ser Pro Leu Ile Arg Phe Val Leu Leu Asp Glu Ser 85 90 95Phe Leu Val Glu Asn Arg Leu Thr Leu Arg Ala Ile Ile Glu Phe Ala 100 105 110Val Leu Met Val Tyr Phe Tyr Ile Cys Asp Arg Thr Asp Val Phe Asn 115 120 125Ser Ser Lys Lys Ser Tyr Asn Arg Asp Leu Phe Leu Phe Leu Tyr Phe 130 135 140Leu Leu Ile Ile Val Ser Ala Ile Thr Ser Phe Thr Ile His Thr Asp145 150 155 160Lys Ser Pro Phe Ser Gly Lys Ala Ile Met Tyr Leu Asn Arg His Gln 165 170 175Thr Glu Glu Trp Lys Gly Trp Met Gln Val Leu Phe Leu Met Tyr His 180 185 190Tyr Phe Ala Ala Ala Glu Tyr Tyr Asn Ala Ile Arg Val Phe Ile Ala 195 200 205Cys Tyr Val Trp Met Thr Gly Phe Gly Asn Phe Ser Tyr Tyr Tyr Ile 210 215 220Arg Lys Asp Phe Ser Leu Ala Arg Phe Ala Gln Met Met Trp Arg Leu225 230 235 240Asn Phe Leu Val Ile Phe Ser Cys Ile Val Leu Asn Asn Ser Tyr Met 245 250 255Leu Tyr Tyr Ile Cys Pro Met His Thr Leu Phe Thr Leu Met Val Tyr 260 265 270Gly Ala Leu Gly Ile Met Ser Lys Tyr 
Asn Glu Met Gly Ser Val Ile 275 280 285Ala Ala Lys Phe Phe Ala Cys Phe Val Val Val Ile Ile Val Trp Glu 290 295 300Ile Pro Gly Val Phe Glu Trp Ile Trp Ser Pro Phe Thr Leu Leu Met305 310 315 320Gly Tyr Asn Asp Pro Ala Lys Pro Gln Leu Pro Leu Leu His Glu Trp 325 330 335His Phe Arg Ser Gly Leu Asp Arg Tyr Ile Trp Ile Ile Gly Met Leu 340 345 350Tyr Ala Tyr Tyr His Pro Thr Val Glu Ser Trp Met Asp Lys Leu Glu 355 360 365Glu Ala Glu Met Lys Phe Arg Val Ala Ile Lys Thr Ser Val Ala Leu 370 375 380Ile Ala Leu Thr Val Gly Tyr Phe Trp Tyr Glu Tyr Ile Tyr Lys Met385 390 395 400Asp Lys Leu Thr Tyr Asn Lys Tyr His Pro Tyr Thr Ser Trp Ile Pro 405 410 415Ile Thr Val Tyr Ile Cys Leu Arg Asn Ile Thr Gln Ser Phe Arg Gly 420 425 430Tyr Ser Leu Thr Leu Leu Ala Trp Leu Gly Lys Ile Thr Leu Glu Thr 435 440 445Tyr Ile Ser Gln Phe His Ile Trp Leu Arg Ser Gly Val Pro Asp Gly 450 455 460Gln Pro Lys Leu Leu Leu Ser Leu Val Pro Asp Tyr Pro Leu Leu Asn465 470 475 480Phe Met Leu Thr Thr Ser Ile Tyr Val Ala Ile Ser Tyr Arg Leu Phe 485 490 495Glu Leu Thr Asn Thr Leu Lys Thr Ala Phe Ile Pro Thr Lys Asp Asp 500 505 510Lys Arg Leu Val Tyr Asn Thr Ile Ser Ala Leu Ile Ile Cys Thr Cys 515 520 525Leu Tyr Phe Phe Ser Phe Ile Leu Ile Thr Ile Pro Gln Lys Leu Val 530 535 5403540PRTArabidopsis thalianaCas1L3 protein sequence (NP_001031478.1, AT2G34410) 3Met Ala Asp Ser Gln Pro Ile Thr Pro Gly Gln Val Ser Phe Leu Leu1 5 10 15Gly Val Ile Pro Val Phe Ile Ala Trp Ile Tyr Ser Glu Phe Leu Glu 20 25 30Tyr Lys Arg Ser Ser Leu His Ser Lys Val His Ser Asp Asn Asn Leu 35 40 45Val Glu Leu Gly Glu Val Lys Asn Lys Glu Asp Glu Gly Val Val Leu 50 55 60Leu Glu Gly Gly Leu Pro Arg Ser Val Ser Thr Lys Phe Tyr Asn Ser65 70 75 80Pro Ile Lys Thr Asn Leu Ile Arg Phe Leu Thr Leu Glu Asp Ser Phe 85 90 95Leu Ile Glu Asn Arg Ala Thr Leu Arg Ala Met Ala Glu Phe Gly Ala 100 105 110Ile Leu Phe Tyr Phe Tyr Ile Ser Asp Arg Thr Ser Leu Leu Gly Glu 115 120 125Ser Lys Lys Asn Tyr Asn Arg Asp Leu Phe Leu Phe Leu Tyr Cys Leu 130 135 140Leu Ile Ile 
Val Ser Ala Met Thr Ser Leu Lys Lys His Asn Asp Lys145 150 155 160Ser Pro Ile Thr Gly Lys Ser Ile Leu Tyr Leu Asn Arg His Gln Thr 165 170 175Glu Glu Trp Lys Gly Trp Met Gln Val Leu Phe Leu Met Tyr His Tyr 180 185 190Phe Ala Ala Ala Glu Ile Tyr Asn Ala Ile Arg Val Phe Ile Ala Ala 195 200 205Tyr Val Trp Met Thr Gly Phe Gly Asn Phe Ser Tyr Tyr Tyr Ile Arg 210 215 220Lys Asp Phe Ser Leu Ala Arg Phe Thr Gln Met Met Trp Arg Leu Asn225 230 235 240Leu Phe Val Ala Phe Ser Cys Ile Ile Leu Asn Asn Asp Tyr Met Leu 245 250 255Tyr Tyr Ile Cys Pro Met His Thr Leu Phe Thr Leu Met Val Tyr Gly 260 265 270Ala Leu Gly Ile Phe Ser Arg Tyr Asn Glu Ile Pro Ser Val Met Ala 275 280 285Leu Lys Ile Ala Ser Cys Phe Leu Val Val Ile Val Met Trp Glu Ile 290 295 300Pro Gly Val Phe Glu Ile Phe Trp Ser Pro Leu Thr Phe Leu Leu Gly305 310 315 320Tyr Thr Asp Pro Ala Lys Pro Glu Leu Pro Leu Leu His Glu Trp His 325 330 335Phe Arg Ser Gly Leu Asp Arg Tyr Ile Trp Ile Ile Gly Met Ile Tyr 340 345 350Ala Tyr Phe His Pro Thr Val Glu Arg Trp Met Glu Lys Leu Glu Glu 355 360 365Cys Asp Ala Lys Arg Lys Met Ser Ile Lys Thr Ser Ile Ile Ala Ile 370 375 380Ser Ser Phe Val Gly Tyr Leu Trp Tyr Glu Tyr Ile Tyr Lys Leu Asp385 390 395 400Lys Val Thr Tyr Asn Lys Tyr His Pro Tyr Thr Ser Trp Ile Pro Ile 405 410 415Thr Val Tyr Ile Cys Leu Arg Asn Ser Thr Gln Gln Leu Arg Asn Phe 420 425 430Ser Met Thr Leu Phe Ala Trp Leu Gly Lys Ile Thr Leu Glu Thr Tyr 435 440 445Ile Ser Gln Phe His Ile Trp Leu Arg Ser Asn Val Pro Asn Gly Gln 450 455 460Pro Lys Trp Leu Leu Cys Ile Ile Pro Glu Tyr Pro Met Leu Asn Phe465 470 475 480Met Leu Val Thr Ala Ile Tyr Val Leu Val Ser His Arg Leu Phe Glu 485 490 495Leu Thr Asn Thr Leu Lys Ser Val Phe Ile Pro Thr Lys Asp Asp Lys 500 505 510Arg Leu Leu His Asn Val Leu Ala Gly Ala Ala Ile Ser Phe Cys Leu 515 520 525Tyr Leu Thr Ser Leu Ile Leu Leu Gln Ile Pro His 530 535 5404470PRTArabidopsis thalianaCas1L4 protein sequence (NP174282.2, AT1G29890) 4Met Phe Ser Ser His Asn Ile Phe Leu Thr Ile Gly Ile Val Phe 
Ile1 5 10 15Arg Arg Phe Leu Thr Leu Glu Asp Ser Phe Leu Leu Glu Asn Arg Ala 20 25 30Thr Leu Arg Ala Met Ala Glu Phe Gly Ala Ile Leu Leu Tyr Phe Tyr 35 40 45Ile Cys Asp Arg Thr Ser Leu Ile Gly Gln Ser Gln Lys Asn Tyr Ser 50 55 60Arg Asp Leu Phe Leu Phe Leu Phe Cys Leu Leu Ile Ile Val Ser Ala65 70 75 80Met Thr Ser Leu Lys Lys His Thr Asp Lys Ser Pro Ile Thr Gly Lys 85 90 95Ser Ile Leu Tyr Leu Asn Arg His Gln Thr Glu Glu Trp Lys Gly Trp 100 105 110Met Gln Val Leu Phe Leu Met Tyr His Tyr Phe Ala Ala Val Glu Phe 115 120 125Tyr Asn Ala Ile Arg Val Phe Ile Ala Gly Tyr Val Trp Met Thr Gly 130 135 140Phe Gly Asn Phe Ser Tyr Tyr Tyr Ile Arg Lys Asp Phe Ser Leu Ala145 150 155 160Arg Phe Thr Gln Met Met Trp Arg Leu Asn Phe Phe Val Ala Phe Cys 165 170 175Cys Ile Ile Leu Asn Asn Asp Tyr Met Leu Tyr Tyr Ile Cys Pro Met 180 185 190His Thr Leu Phe Thr Leu Met Val Tyr Gly Ala Leu Gly Ile Tyr Ser 195 200 205Gln Tyr Asn Glu Ile Ala Ser Val Met Ala Leu Lys Ile Ala Ser Cys 210 215 220Phe Leu Val Val Ile Leu Met Trp Glu Ile Pro Gly Val Phe Glu Ile225 230 235 240Phe Trp Ser Pro Leu Ala Phe Leu Leu Gly Tyr Thr Asp Pro Ala Lys 245 250 255Pro Asp Leu Pro Arg Leu His Glu Trp His Phe Arg Ser Gly Leu Asp 260 265 270Arg Tyr Ile Trp Ile Ile Gly Met Ile Tyr Ala Tyr Phe His Pro Thr 275 280 285Val Glu Arg Trp Met Glu Lys Leu Glu Glu Cys Asp Ala Lys Arg Arg 290 295 300Met Ser Ile Lys Thr Ser Ile Ile Gly Ile Ser Ser Phe Ala Gly Tyr305 310 315 320Leu Trp Tyr Glu Tyr Ile Tyr Lys Leu Asp Lys Val Thr Tyr Asn Lys 325 330 335Tyr His Pro Tyr Thr Ser Trp Ile Pro Ile Thr Val Tyr Ile Cys Leu 340 345 350Arg Asn Cys Thr Gln Gln Leu Arg Arg Phe Ser Leu Thr Leu Phe Ala 355 360 365Trp Leu Gly Lys Ile Thr Leu Glu Thr Tyr Ile Ser Gln Phe His Ile 370 375 380Trp Leu Arg Ser Ser Val Pro Asn Gly Gln Pro Lys Leu Leu Leu Ser385 390 395 400Ile Ile Pro Glu Tyr Pro Met Leu Asn Phe Met Leu Thr Thr Ala Ile 405 410 415Tyr Val Leu Val Ser Val Arg Leu Phe Glu Leu Thr Asn Thr Leu Lys 420 425 430Ser Val Phe Ile Pro Thr Lys Asp Asp 
Lys Arg Leu Leu His Asn Val 435 440 445Ile Ala Met Ala Ala Ile Ser Phe Cys Leu Tyr Ile Ile Gly Leu Ile 450 455 460Leu Leu Leu Ile Pro His465 4705568PRTArabidopsis thalianaCas1L2 protein sequence (NP_001078116.1, AT3G06550) 5Met Ala Ser Ser Ser Pro Val Thr Pro Gly Leu Met Ser Val Val Phe1 5 10 15Gly Ile Val Pro Val Ile Val Ala Trp Leu Tyr Ser Glu Tyr Leu His 20 25 30Tyr Ala Lys Tyr Ser Val Ser Ala Lys Thr Arg His Ser Asp Val Asn 35 40 45Leu Val Glu Ile Ala Lys Asp Phe Val Lys Glu Asp Asp Lys Ala Leu 50 55 60Leu Ile Glu Asp Gly Gly Gly Leu Gln Ser Ala Ser Pro Arg Ala Lys65 70 75 80Gly Pro Thr Thr His Ser Pro Leu Ile Arg Phe Val Leu Leu Asp Glu 85 90 95Ser Phe Leu Val Glu Asn Arg Leu Thr Leu Arg Ala Ile Ile Glu Phe 100 105 110Ala Val Leu Met Val Tyr Phe Tyr Ile Cys Asp Arg Thr Asp Val Phe 115 120 125Asn Ser Ser Lys Lys Ser Tyr Asn Arg Asp Leu Phe Leu Phe Leu Tyr 130 135 140Phe Leu Leu Ile Ile Val Ser Ala Ile Thr Ser Phe Thr Ile His Thr145 150 155 160Asp Lys Ser Pro Phe Ser Gly Lys Ala Ile Met Tyr Leu Asn Arg His 165 170 175Gln Thr Glu Glu Trp Lys Gly Trp Met Gln Val Leu Phe Leu Met Tyr 180 185 190His Tyr Phe Ala Ala Ala Glu Tyr Tyr Asn Ala Ile Arg Val Phe Ile 195 200 205Ala Cys Tyr Val Trp Met Thr Gly Phe Gly Asn Phe Ser Tyr Tyr Tyr 210 215 220Ile Arg Lys Asp Phe Ser Leu Ala Arg Phe Ala Gln Met Met Trp Arg225 230 235 240Leu Asn Phe Leu Val Ile Phe Ser Cys Ile Val Leu Asn Asn Ser Tyr 245 250 255Met Leu Tyr Tyr Ile Cys Pro Met His Thr Leu Phe Thr Leu Met Val 260 265 270Tyr Gly Ala Leu Gly Ile Met Ser Lys Tyr Asn Glu Met Gly Ser Val 275 280 285Ile Ala Ala Lys Phe Phe Ala Cys Phe Val Val Val Ile Ile Val Trp 290 295 300Glu Ile Pro Gly Val Phe Glu Trp Ile Trp Ser Pro Phe Thr Leu Leu305 310 315 320Met Gly Tyr Asn Asp Pro Ala
Lys Pro Gln Leu Pro Leu Leu His Glu 325 330 335Trp His Phe Arg Ser Gly Leu Asp Arg Tyr Ile Trp Ile Ile Gly Met 340 345 350Leu Tyr Ala Tyr Tyr His Pro Thr Val Glu Ser Trp Met Asp Lys Leu 355 360 365Glu Glu Ala Glu Met Lys Phe Arg Val Ala Ile Lys Thr Ser Val Ala 370 375 380Leu Ile Ala Leu Thr Val Gly Tyr Phe Trp Tyr Glu Tyr Ile Tyr Lys385 390 395 400Met Asp Lys Leu Thr Tyr Asn Lys Tyr His Pro Tyr Thr Ser Trp Ile 405 410 415Pro Ile Thr Val Tyr Ile Cys Leu Arg Asn Ile Thr Gln Ser Phe Arg 420 425 430Gly Tyr Ser Leu Thr Leu Leu Ala Trp Leu Gly Lys Ile Thr Leu Glu 435 440 445Thr Tyr Ile Ser Gln Phe His Ile Trp Leu Arg Ser Gly Val Pro Asp 450 455 460Gly Gln Pro Lys Leu Leu Leu Ser Leu Val Pro Asp Tyr Pro Leu Leu465 470 475 480Asn Phe Met Leu Thr Thr Ser Ile Tyr Val Ala Ile Ser Tyr Arg Leu 485 490 495Phe Glu Leu Thr Asn Thr Leu Lys Thr Ala Phe Ile Pro Thr Lys Asp 500 505 510Asp Lys Arg Leu Val Tyr Asn Thr Ile Ser Ala Leu Ile Ile Cys Thr 515 520 525Cys Leu Tyr Phe Phe Ser Phe Ile Leu Ile Thr Ile Pro Gln Lys Leu 530 535 540Val Ser Gln Asn Phe Ile Phe Leu Cys Gly Arg Lys Leu Phe Phe Pro545 550 555 560Trp Tyr Leu Ser Ser Leu Ile Cys 56561623DNAArabidopsis thalianaCAS1L1 (RWA1) coding sequence (NM_124004.2) 6atggtggatc ctggaccaat tactccgggc caggtatctt ttcttcttgg agtaatccca 60atatttgttg gttggatata ctcggagtta cttgagtaca gaaaatcttg ggttcccttg 120aaacctcact cggataataa tctagttgaa ttgggagacg tagcagagaa ggacgacgac 180aaagctgatc tgttggaggg aggtcttgcc cgatcaccat ctgtaaagtt tcataattca 240tctatcagaa caaacataat caggtttttg agtatggaag attcattttt gctggaacat 300cgagcaacct tgagagcaat gtcggaattt ggggcaatct taatatattt ctatatctgt 360gaccgcacag aattgcttgg agattctacc aagaattaca accgcgacct tttccttttt 420ctctacgttc ttctcatcat agtatcagcc atgacatctc tcagaaaaca caatgacaag 480tcacccatat ctgggaagtc cattctttac cttaatcgcc accaaactga agaatggaaa 540ggatggatgc aggttttgtt cttaatgtat cactactttg ctgcggccga gatatacaac 600gcaatccgta tctttattgc tgcttatgtt tggatgactg gttttggaaa cttctcttac 660tactatgtca gaaaggattt 
ctctgttgca cgttttgcgc agatgatgtg gaggctgaac 720ttctttgtag cgttttgctg tattgttctc aacaacgact atatgttata ctacatctgc 780ccaatgcaca ctcttttcac cctaatggta tatggagctc tgggtatctt cagcaagtac 840aatgagatag gatcggtgat ggctctgaag atattttcat gcttcctcgt tgtctttttg 900ttgtgggaaa ttcctggagc ttttgaaata ttttggggtc ccttaacatt tttgctaggt 960tacaatgacc ctgccaagcc cgatcttcat cggctgcatg aatggcactt tagatcaggc 1020cttgatcgct acatatggat catcggaatg atttatgcct attatcaccc aactgtagag 1080agatggatgg agaagttaga ggactgtgaa acgaagaaaa gactatccat aaaggccgct 1140attgttacta ttactgtgct tgttggctat gtgtggtatg aatgtatcta caagctggac 1200aggaccagtt acaacatgta tcatccgtac acatcatgga tccccatcac tgtttacata 1260tgccttcgga atttcaccca ccagcttcga agtgtctcat tgactctctt tgcgtggctt 1320ggcaagatca ctttagagac ttacatttcc cagtttcata tatggctaag atcaaacatg 1380cctgacgggc aaccaaaatg gcttctctct attattccgg gataccctat gctcaatttc 1440atgctgacaa ctgctatata cgtccttgta tctcaccgtc tctttgaact aaccaacaca 1500ctcaagacgg ttttcgtacc cacaaaagac aacaagcgac tcttctctaa cttcatagct 1560gggattgcca tcgctcttcc actctattgc ttctcattcg ttcttcttca gattcatcgt 1620tag 162371635DNAArabidopsis thalianaCAS1L2 (RWA2) coding sequence (NM_001125120.1) 7atggcgagtt caagccctgt tacacctggg ctaatgtcgg tggtgttcgg gattgtgccg 60gtgatcgtgg cttggctata ctctgagtat ctgcactatg ctaaatactc ggtctccgcc 120aaaacgcact ctgatgtcaa tttggtggaa attgcgaaag attttgttaa agaagatgac 180aaagctcttt taatagaaga tggaggtggt ctccaatcag cttctcctag agccaaaggc 240ccgaccacac attctcctct catcaggttt gtcctcttgg atgagtcgtt cttggttgag 300aacaggctga ctttaagggc aataattgag tttgcagtac ttatggtata cttttacata 360tgtgaccgca cagatgtctt caattcatca aagaagagtt acaaccggga tctctttctg 420ttcctttact tccttctcat catcgtttca gcgataactt cattcacgat acatactgat 480aaatcaccat tcagcggaaa agccatcatg tacttgaata ggcatcaaac cgaggagtgg 540aaaggctgga tgcaggtcct tttcttgatg taccactact ttgctgctgc agagtactat 600aatgcgatcc gtgttttcat tgcttgctat gtatggatga ctggatttgg gaatttttct 660tattattaca ttcgcaagga ctttagcctt gcaaggtttg cacagatgat 
gtggcggcta 720aatttcctgg tcatattctc ctgcatcgtc ctcaacaaca gttacatgct atactacatc 780tgcccaatgc acactctgtt tactctaatg gtctatgggg cacttggtat tatgagcaag 840tataatgaga tgggttcagt catagctgcc aaattttttg cctgcttcgt tgttgttatc 900atcgtttggg aaattcctgg cgtttttgaa tggatttgga gtccatttac actcctaatg 960ggttacaatg atcccgcaaa acctcagctt cccctcttgc atgagtggca tttccgctct 1020ggacttgatc ggtacatatg gataatcggg atgctatatg catactacca cccaactgtt 1080gaaagttgga tggataaact ggaggaagct gagatgaaat tcagggtggc tatcaaaaca 1140tctgtggcac tgatagcact aacggtggga tatttttggt acgagtatat atacaagatg 1200gacaagttaa cttacaacaa atatcatcct tacacctctt ggattccaat aactgtttat 1260atctgtctcc ggaacatcac ccagtctttc cgcggctaca gtttgaccct tctggcgtgg 1320cttggaaaga taacactgga gacatatatc tcccagtttc atatatggct cagatctgga 1380gttcctgatg gtcaacccaa attactacta tctcttgtcc cggattaccc attgttgaac 1440ttcatgctca ctacttcgat ttacgtcgct atctcttata ggctctttga gcttaccaac 1500actttgaaaa cagccttcat accaaccaag gacgacaaac gccttgtcta caacacgatc 1560tcagcactca taatctgcac ttgtctctac tttttctcat ttattcttat cacaattccc 1620caaaaactgg tgtga 163581707DNAArabidopsis thalianaCAS1L2 (RWA2) coding sequence (NM_001084647.4) 8atggcgagtt caagccctgt tacacctggg ctaatgtcgg tggtgttcgg gattgtgccg 60gtgatcgtgg cttggctata ctctgagtat ctgcactatg ctaaatactc ggtctccgcc 120aaaactaggc actctgatgt caatttggtg gaaattgcga aagattttgt taaagaagat 180gacaaagctc ttttaataga agatggaggt ggtctccaat cagcttctcc tagagccaaa 240ggcccgacca cacattctcc tctcatcagg tttgtcctct tggatgagtc gttcttggtt 300gagaacaggc tgactttaag ggcaataatt gagtttgcag tacttatggt atacttttac 360atatgtgacc gcacagatgt cttcaattca tcaaagaaga gttacaaccg ggatctcttt 420ctgttccttt acttccttct catcatcgtt tcagcgataa cttcattcac gatacatact 480gataaatcac cattcagcgg aaaagccatc atgtacttga ataggcatca aaccgaggag 540tggaaaggct ggatgcaggt ccttttcttg atgtaccact actttgctgc tgcagagtac 600tataatgcga tccgtgtttt cattgcttgc tatgtatgga tgactggatt tgggaatttt 660tcttattatt acattcgcaa ggactttagc cttgcaaggt ttgcacagat gatgtggcgg 720ctaaatttcc 
tggtcatatt ctcctgcatc gtcctcaaca acagttacat gctatactac 780atctgcccaa tgcacactct gtttactcta atggtctatg gggcacttgg tattatgagc 840aagtataatg agatgggttc agtcatagct gccaaatttt ttgcctgctt cgttgttgtt 900atcatcgttt gggaaattcc tggcgttttt gaatggattt ggagtccatt tacactccta 960atgggttaca atgatcccgc aaaacctcag cttcccctct tgcatgagtg gcatttccgc 1020tctggacttg atcggtacat atggataatc gggatgctat atgcatacta ccacccaact 1080gttgaaagtt ggatggataa actggaggaa gctgagatga aattcagggt ggctatcaaa 1140acatctgtgg cactgatagc actaacggtg ggatattttt ggtacgagta tatatacaag 1200atggacaagt taacttacaa caaatatcat ccttacacct cttggattcc aataactgtt 1260tatatctgtc tccggaacat cacccagtct ttccgcggct acagtttgac ccttctggcg 1320tggcttggaa agataacact ggagacatat atctcccagt ttcatatatg gctcagatct 1380ggagttcctg atggtcaacc caaattacta ctatctcttg tcccggatta cccattgttg 1440aacttcatgc tcactacttc gatttacgtc gctatctctt ataggctctt tgagcttacc 1500aacactttga aaacagcctt cataccaacc aaggacgaca aacgccttgt ctacaacacg 1560atctcagcac tcataatctg cacttgtctc tactttttct catttattct tatcacaatt 1620ccccaaaaac tggtaagtca aaattttatc tttttgtgtg ggagaaagct tttttttccc 1680tggtacttga gttcattgat atgttag 170791623DNAArabidopsis thalianaCAS1L3 (RWA3) nucleic acid coding sequence 9atggcggatt ctcagccaat cacgcctggt caggtttcgt ttctactcgg agtcattcct 60gtcttcatag catggattta ctcagagttt ctagagtata agaggtcttc attgcactct 120aaagttcatt cagataataa tttggttgaa cttggtgagg taaaaaacaa ggaagatgaa 180ggagtagttt tacttgaagg aggtcttcca agatcagtct ctacaaagtt ttataactca 240cctatcaaaa caaacttgat tagatttctg acgctggaag actctttctt gattgaaaat 300cgagcaacct tgagagcgat ggctgagttt ggggctattc ttttttactt ttatattagt 360gatcgaacaa gcttgcttgg agagtctaaa aagaattaca acagagatct tttcctcttt 420ctctactgtc ttctcatcat agtttcagcc atgacatcct tgaagaaaca caatgacaaa 480tcacctataa caggaaaatc cattctctat cttaatcgtc accagactga agagtggaag 540ggatggatgc aggttctatt tcttatgtat cattactttg ctgcggctga gatatataat 600gcaatcaggg ttttcattgc tgcctacgtc tggatgactg ggtttgggaa cttctcttat 660tactatatca gaaaggattt 
ctccctagca cgatttactc agatgatgtg gcgtcttaac 720ttatttgtgg cgtttagctg cattattctc aataatgatt atatgctgta ctacatctgt 780ccaatgcaca ctctgttcac tcttatggtg tatggagccc ttggtatctt cagtcgatat 840aacgaaatac catcagtaat ggctttgaag attgcttcat gctttctcgt ggttatcgtg 900atgtgggaga ttcctggcgt ttttgagatt ttctggagtc ctttaacatt cttactggga 960tacactgatc cagctaaacc agaactacca cttttacatg aatggcactt cagatcagga 1020cttgaccgct acatatggat cattggaatg atatatgcct atttccatcc cactgtagag 1080agatggatgg agaaattgga ggagtgtgat gccaagagaa agatgtcaat aaagacaagc 1140ataattgcaa tttcctcatt tgttggttac ctatggtatg aatacatata caagcttgac 1200aaggttacat acaacaaata tcatccctac acatcgtgga ttccaataac cgtctacatc 1260tgtctgcgaa attctacaca acagctgcgt aatttctcca tgacactatt tgcgtggctc 1320ggcaagatta ctctggaaac ctatatttct cagtttcaca tctggttaag atcgaatgtg 1380ccaaatggac agcctaagtg gctattatgc attattccag aatacccaat gctcaacttc 1440atgctcgtca cggccatcta tgtcttggtg tcccaccgac ttttcgagct tacaaacacg 1500ttaaagtctg ttttcatacc aacaaaagac gacaagaggc tgctccacaa tgttctcgct 1560ggagctgcca tctcgttctg tttatattta acatctctca ttcttctcca gatcccacac 1620taa 1623101413DNAArabidopsis thalianaCAS1L4 (RWA4) coding sequence (NM_102729.2) 10atgttctcta gccataatat tttcttaacc attggcattg tgtttattcg taggtttctg 60actttggaag actctttctt gcttgaaaac cgagcaacct tgagagcaat ggctgagttt 120ggagcaattc ttttatattt ttatatttgt gatcgaacta gcttgatcgg gcagtctcaa 180aagaattaca gccgagacct ttttctcttt ctcttctgtc ttctcatcat agtgtcagct 240atgacgtcct tgaagaaaca cactgacaag tcaccaataa caggaaagtc cattctgtat 300ctcaatcgtc accagactga agaatggaaa gggtggatgc aggttctatt tctgatgtat 360cattattttg cggcagttga gttttacaat gcaatcaggg tcttcatcgc tggctatgtg 420tggatgaccg gttttgggaa cttctcttat tactatatcc gaaaggattt ctcccttgca 480cgattcactc agatgatgtg gcggcttaac ttttttgtgg cgttttgttg cattattctc 540aacaatgact atatgctgta ctacatctgt ccaatgcaca ctctattcac gctgatggtc 600tatggagccc ttggtattta cagtcagtat aacgaaatag catcagtgat ggctctgaag 660attgcttcat gctttctcgt ggttatcctt atgtgggaga ttcctggagt ttttgagatt 
720ttctggagtc ctctggcatt cttactgggg tacacagatc cagctaaacc agaccttcca 780cgtctacacg aatggcattt cagatctgga cttgatcgct acatatggat catcggcatg 840atatatgcat attttcatcc cactgtagaa agatggatgg agaaattgga ggagtgtgat 900gctaagagaa ggatgtcaat caagacaagc ataataggaa tttcttcatt cgctggttac 960ctttggtatg aatacatcta caagctggac aaggttacgt acaacaaata tcatccctac 1020acatcttgga ttccaataac tgtctacatc tgtctgcgaa attgcaccca acagctacgg 1080agattttccc tgacactctt tgcgtggctg ggcaagataa ctctcgagac ctacatttca 1140cagtttcaca tctggttaag atcgagtgtg ccaaatgggc agccaaagtt gctattatca 1200atcatcccag aatacccaat gctcaacttc atgctcacca cggccatcta cgtcttggta 1260tctgttcgac ttttcgagct aaccaataca ttaaaatcag ttttcatacc cacgaaagac 1320gacaaacggc tgctccacaa cgtgattgct atggctgcga tatcattttg tttatatatt 1380atcggtctta ttcttctctt gatcccacat taa 141311102PRTArtificial Sequencesynthetic sequence of highly conserved region of CAS1L protein sequence 11Tyr Leu Asn Arg His Gln Thr Glu Glu Trp Lys Gly Trp Met Gln Val1 5 10 15Leu Phe Leu Met Tyr His Tyr Phe Ala Ala Ala Glu Tyr Tyr Asn Ala 20 25 30Ile Arg Val Phe Ile Ala Cys Tyr Val Trp Met Thr Gly Phe Gly Asn 35 40 45Phe Ser Tyr Tyr Tyr Ile Arg Lys Asp Phe Ser Leu Ala Arg Phe Ala 50 55 60Gln Met Met Trp Arg Leu Asn Phe Leu Val Ile Phe Ser Cys Ile Val65 70 75 80Leu Asn Asn Ser Tyr Met Leu Tyr Tyr Ile Cys Pro Met His Thr Leu 85 90 95Phe Thr Leu Met Val Tyr 10012102PRTArtificial Sequencesynthetic consensus sequence of conserved region from CAS1L protein sequence 12Tyr Leu Asn Arg His Gln Thr Glu Glu Trp Lys Gly Trp Met Gln Val1 5 10 15Xaa Phe Leu Met Tyr His Tyr Phe Xaa Ala Xaa Glu Xaa Tyr Asn Ala 20 25 30Ile Arg Xaa Phe Ile Ala Xaa Tyr Val Trp Met Thr Gly Phe Gly Asn 35 40 45Phe Ser Tyr Tyr Tyr Xaa Xaa Lys Asp Phe Ser Xaa Xaa Arg Phe Xaa 50 55 60Gln Met Met Trp Arg Leu Asn Xaa Xaa Val Xaa Xaa Xaa Cys Xaa Xaa65 70 75 80Leu Xaa Asn Xaa Tyr Xaa Leu Tyr Tyr Ile Cys Pro Met His Thr Leu 85 90 95Phe Thr Xaa Met Val Tyr 1001316PRTArtificial Sequencesynthetic sequence of conserved 
motif from CAS1L protein sequence 13Leu His Glu Trp His Phe Arg Ser Gly Leu Asp Arg Tyr Ile Trp Ile1 5 10 1514235PRTArtificial Sequencesynthetic sequence of a central region from CAS1L protein sequence 14Tyr Phe Tyr Ile Ser Asp Arg Thr Ser Leu Leu Gly Glu Ser Lys Lys1 5 10 15Asn Tyr Asn Arg Asp Leu Phe Leu Phe Leu Tyr Cys Leu Leu Ile Ile 20 25 30Val Ser Ala Met Thr Ser Leu Lys Lys His Asn Asp Lys Ser Pro Ile 35 40 45Thr Gly Lys Ser Ile Leu Tyr Leu Asn Arg His Gln Thr Glu Glu Trp 50 55 60Lys Gly Trp Met Gln Val Leu Phe Leu Met Tyr His Tyr Phe Ala Ala65 70 75 80Ala Glu Ile Tyr Asn Ala Ile Arg Val Phe Ile Ala Ala Tyr Val Trp 85 90 95Met Thr Gly Phe Gly Asn Phe Ser Tyr Tyr Tyr Ile Arg Lys Asp Phe 100 105 110Ser Leu Ala Arg Phe Thr Gln Met Met Trp Arg Leu Asn Leu Phe Val 115 120 125Ala Phe Ser Cys Ile Ile Leu Asn Asn Asp Tyr Met Leu Tyr Tyr Ile 130 135 140Cys Pro Met His Thr Leu Phe Thr Leu Met Val Tyr Gly Ala Leu Gly145 150 155 160Ile Phe Ser Arg Tyr Asn Glu Ile Pro Ser Val Met Ala Leu Lys Ile 165 170 175Ala Ser Cys Phe Leu Val Val Ile Val Met Trp Glu Ile Pro Gly Val 180 185 190Phe Glu Ile Phe Trp Ser Pro Leu Thr Phe Leu Leu Gly Tyr Thr Asp 195 200 205Pro Ala Lys Pro Glu Leu Pro Leu Leu His Glu Trp His Phe Arg Ser 210 215 220Gly Leu Asp Arg Tyr Ile Trp Ile Ile Gly Met225 230 23515539PRTArtificial Sequencepartial Os01g0631100_Oryza sequence from CAS1L gene family 15Met Glu Val Phe Gly Pro Val Thr Ala Gly Gln Val Ser Phe Leu Leu1 5 10 15Gly Leu Phe Pro Val Leu Ile Ala Trp Ile Tyr Ser Glu Val Leu Glu 20 25 30Tyr Arg Lys Ser Ser Ser Met Lys Val His Ser Asp Ser Asn Leu Glu 35 40 45Asn Gly Thr Val Lys Glu Asp Asp Lys Thr Val Leu Leu Glu Gly Gly 50 55 60Leu Ser Lys Ser Pro Ser Thr Lys Phe Arg Ile Asn Ser Thr Lys Ala65 70 75 80Asn Leu Ile Arg Phe Ile Thr Met Asp Glu Ser Phe Leu Leu Glu Asn 85 90 95Arg Ala Val Leu Arg Ala Met Ala Glu Phe Gly Ile Val Leu Val Tyr 100 105 110Phe Tyr Ile Cys Asp Arg Thr Asn Ile Phe Pro Glu Ser Lys Lys Ser 115 120 125Tyr Asn Arg Asp Leu Phe Leu Phe 
Leu Tyr Ile Leu Leu Ile Ile Ala 130 135 140Ser Ala Leu Thr Ser Leu Lys Lys His His Asp Lys Ser Ala Phe Ser145 150 155 160Gly Lys Ser Ile Leu Tyr Leu Asn Arg His Gln Thr Glu Glu Trp Lys 165 170 175Gly Trp Met Gln Val Leu Phe Leu Met Tyr His Tyr Phe Ala Ala Thr 180 185 190Glu Ile Tyr Asn Ala Ile Arg Val Phe Ile Ala Ala Tyr Val Trp Met 195 200 205Thr Gly Phe Gly Asn Phe Ser Tyr Tyr Tyr Ile Lys Lys Asp Phe Ser 210 215 220Leu Ala Arg Phe Ala Gln Met Met Trp Arg Leu Asn Phe Phe Val Ala225 230 235 240Phe Cys Cys Ile Val Leu Asp Asn Asp Tyr Met Leu Tyr Tyr Ile Cys 245 250 255Pro Met His Thr Leu Phe Thr Leu Met Val Tyr Gly Ser Leu Gly Leu 260 265 270Phe Asn Lys Tyr Asn Glu Ile Pro Ser Val Met Ala Met Lys Ile Val 275 280 285Ser Cys Phe Leu Ala Val Ile Leu Ile Trp Glu Ile Pro Gly Val Phe 290 295
300Glu Leu Leu Trp Ser Pro Phe Thr Phe Leu Leu Gly Tyr Lys Asp Pro305 310 315 320Glu Pro Ser Lys Ala Asn Leu Pro Leu Leu His Glu Trp His Phe Arg 325 330 335Ser Gly Leu Asp Arg Tyr Ile Trp Ile Ile Gly Met Ile Tyr Ala Tyr 340 345 350Phe His Pro Asn Val Glu Arg Trp Met Glu Lys Leu Glu Glu Ser Glu 355 360 365Thr Lys Val Arg Leu Ser Ile Lys Gly Thr Ile Ile Ser Ile Ser Leu 370 375 380Val Ala Gly Tyr Leu Trp Tyr Glu Tyr Ile Tyr Lys Leu Asp Lys Ile385 390 395 400Thr Tyr Asn Lys Tyr His Pro Tyr Thr Ser Trp Ile Pro Ile Thr Val 405 410 415Tyr Ile Ser Leu Arg Asn Cys Thr Gln Gln Leu Arg Asn Val Ser Leu 420 425 430Thr Leu Phe Ala Trp Leu Gly Lys Ile Thr Leu Glu Thr Tyr Ile Ser 435 440 445Gln Ile His Ile Trp Leu Arg Ser Asn Met Pro Asn Gly Gln Pro Lys 450 455 460Trp Leu Leu Ser Phe Ile Pro Gly Tyr Pro Leu Leu Asn Phe Met Leu465 470 475 480Ala Thr Ala Ile Tyr Leu Leu Ile Ser Tyr Arg Val Phe Glu Leu Thr 485 490 495Gly Val Leu Lys Ser Ala Phe Ile Pro Ser Arg Asp Asn Asn Arg Leu 500 505 510Tyr Gln Asn Phe Val Ala Gly Ile Ala Ile Ser Val Cys Leu Tyr Phe 515 520 525Leu Ser Ile Val Leu Leu Lys Ile Pro Ile Val 530 53516539PRTArtificial Sequencepartial Os05g0582100_Oryza sequence from CAS1L gene family 16Met Glu Val Phe Gly Pro Val Thr Pro Gly Gln Val Ser Phe Leu Leu1 5 10 15Gly Leu Phe Pro Val Leu Ile Gly Trp Ile Tyr Ala Glu Ile Leu Glu 20 25 30Tyr Arg Lys Ser Leu Leu Tyr Gly Lys Val His Ser Asp Ala Asn Leu 35 40 45Glu Asn Glu Thr Met Lys Glu Asp Asp Lys Ala Val Leu Leu Gly Gly 50 55 60Gln Ser Lys Ser Pro Ser Thr Lys Leu Arg Asn Met Ser Thr Lys Ala65 70 75 80Asn Leu Ile Arg Phe Ile Thr Met Asp Glu Ser Phe Leu Leu Glu Asn 85 90 95Arg Ala Val Leu Arg Ala Met Ala Glu Val Gly Ile Ile Leu Val Tyr 100 105 110Phe Tyr Ile Cys Asp Arg Thr Asn Ile Phe Pro Glu Thr Lys Lys Ser 115 120 125Tyr Asn Arg Asp Leu Phe Leu Phe Leu Tyr Ile Leu Leu Ile Ile Ala 130 135 140Ser Ala Leu Thr Ser Leu Lys Lys His Asn Glu Lys Ser Ala Phe Thr145 150 155 160Gly Lys Ser Ile Leu Tyr Leu Asn Arg His Gln Thr Glu Glu Trp 
Lys 165 170 175Gly Trp Met Gln Val Leu Phe Leu Met Tyr His Tyr Phe Ala Ala Thr 180 185 190Glu Ile Tyr Asn Ala Ile Arg Val Phe Ile Ala Ala Tyr Val Trp Met 195 200 205Thr Gly Phe Gly Asn Phe Ser Tyr Tyr Tyr Ile Lys Lys Asp Phe Ser 210 215 220Ile Ala Arg Phe Ala Gln Met Met Trp Arg Leu Asn Phe Phe Val Ala225 230 235 240Phe Cys Cys Ile Val Leu Asp Asn Asp Tyr Met Leu Tyr Tyr Ile Cys 245 250 255Pro Met His Thr Leu Phe Thr Leu Met Val Tyr Gly Ser Leu Gly Leu 260 265 270Phe Asn Lys Tyr Asn Glu Lys Pro Ser Val Met Ala Ile Lys Ile Ala 275 280 285Cys Cys Phe Leu Thr Val Ile Leu Ile Trp Glu Ile Pro Gly Val Phe 290 295 300Glu Phe Leu Trp Ala Pro Phe Thr Phe Leu Leu Gly Tyr Lys Asp Pro305 310 315 320Glu Pro Ser Lys Ala Asn Leu Pro Leu Leu His Glu Trp His Phe Arg 325 330 335Ser Gly Leu Asp Arg Tyr Ile Trp Ile Ile Gly Met Ile Tyr Ala Tyr 340 345 350Phe His Pro Asn Val Glu Arg Trp Met Glu Lys Leu Glu Glu Ser Glu 355 360 365Thr Lys Val Arg Leu Phe Ile Lys Gly Ala Ile Val Thr Leu Ser Leu 370 375 380Thr Ala Gly Tyr Leu Trp Tyr Glu Tyr Ile Tyr Arg Leu Asp Lys Ile385 390 395 400Thr Tyr Asn Lys Tyr His Pro Tyr Thr Ser Trp Ile Pro Ile Thr Val 405 410 415Tyr Ile Cys Leu Arg Asn Cys Thr Gln Gln Leu Arg Ser Ala Ser Leu 420 425 430Ala Leu Phe Ala Trp Leu Gly Lys Ile Thr Leu Glu Thr Tyr Ile Ser 435 440 445Gln Ile His Ile Trp Leu Arg Ser Ser Thr Pro Asn Gly Gln Pro Lys 450 455 460Trp Leu Leu Ser Phe Val Pro Asp Tyr Pro Leu Leu Asn Phe Met Leu465 470 475 480Thr Thr Ala Ile Tyr Leu Leu Leu Ser Tyr Arg Val Phe Glu Ile Thr 485 490 495Gly Val Leu Lys Gly Ala Phe Ile Pro Ser Arg Asp Asn Asn Arg Leu 500 505 510Tyr Gln Asn Phe Ile Ala Gly Ile Ala Ile Ser Ala Cys Leu Tyr Phe 515 520 525Cys Ser Leu Ile Leu Val Lys Ile Thr Ile Val 530 53517552PRTArtificial Sequencepartial Os03g0314200_Oryza sequence from CAS1L gene family 17Met Ala Glu Ala Ile Ala Ser Ala Gly Gly Ile Ala Met Ala Ala Ser1 5 10 15Thr Ser Leu Thr Pro Gly Gln Val Ser Ala Leu Leu Gly Phe Leu Trp 20 25 30Val Phe Thr Ala Trp Ala Tyr Ala Glu Val 
Leu Tyr Tyr Arg Lys Asn 35 40 45Ala Ala Ser Ile Lys Ala His Ser Asp Val Asn Leu Ala Val Met Asp 50 55 60Ser Ser Ser Asn Lys Gly Glu Asp Gln Val Met Leu Leu Glu Glu Gly65 70 75 80Val Gln Ala Pro Val Gln Lys Pro Val Tyr Ala Ser Leu Thr Ser Gln 85 90 95Met Phe Arg Leu Phe Leu Leu Asp Gln Ala Leu Ile Leu Glu Asn Arg 100 105 110Leu Thr Leu Arg Ala Ile Ser Glu Phe Gly Gly His Leu Leu Tyr Phe 115 120 125Tyr Ile Cys Asp Arg Thr Asn Leu Leu Gly Glu Ser Ala Lys Asn Tyr 130 135 140Ser Arg Asp Met Phe Leu Phe Leu Tyr Phe Leu Leu Ile Ile Val Ala145 150 155 160Ala Met Thr Ser Phe Lys Val His Gln Asp Lys Ser Ser Phe Thr Gly 165 170 175Lys Ser Ile Leu Tyr Leu Asn Arg His Gln Thr Glu Glu Trp Lys Gly 180 185 190Trp Met Gln Val Leu Phe Leu Met Tyr His Tyr Phe Asn Ala Lys Glu 195 200 205Ile Tyr Asn Ala Ile Arg Val Phe Ile Ala Ala Tyr Val Trp Met Thr 210 215 220Gly Phe Gly Asn Phe Ser Tyr Tyr Tyr Val Arg Lys Asp Phe Ser Leu225 230 235 240Ala Arg Phe Ala Gln Met Met Trp Arg Leu Asn Phe Phe Val Ala Phe 245 250 255Cys Cys Ile Val Leu Asn Asn Asp Tyr Thr Leu Tyr Tyr Ile Cys Pro 260 265 270Met His Thr Leu Phe Thr Leu Met Val Tyr Gly Ala Leu Gly Ile Leu 275 280 285Asn Lys Tyr Asn Glu Ile Gly Ser Val Met Ala Ile Lys Phe Val Ala 290 295 300Cys Phe Leu Val Val Ile Leu Ile Trp Glu Ile Pro Gly Val Phe Glu305 310 315 320Ile Val Trp Ser Pro Phe Thr Phe Leu Leu Gly Tyr Thr Asp Pro Ser 325 330 335Lys Pro Asp Leu Pro Arg Leu His Glu Trp His Phe Arg Ser Gly Leu 340 345 350Asp Arg Tyr Ile Trp Ile Val Gly Met Ile Tyr Ala Tyr Tyr His Pro 355 360 365Thr Val Glu Lys Trp Met Glu Lys Leu Glu Glu Ala Glu Thr Lys Thr 370 375 380Lys Leu Tyr Ile Lys Ala Leu Ile Val Ser Ile Ala Leu Thr Ala Gly385 390 395 400Cys Leu Trp Tyr Glu Tyr Ile Tyr Lys Leu Asp Lys Ile Thr Tyr Asn 405 410 415Lys Tyr His Pro Tyr Thr Ser Trp Ile Pro Ile Thr Val Tyr Ile Cys 420 425 430Leu Arg Asn Phe Thr Gln Glu Phe Arg Cys Cys Ser Leu Thr Leu Phe 435 440 445Ala Trp Leu Gly Lys Ile Thr Leu Glu Thr Tyr Ile Ser Gln Phe His 450 455 460Ile Trp Leu 
Arg Ser Lys Val Pro Asn Gly Gln Pro Lys Trp Leu Leu465 470 475 480Thr Ile Ile Pro Asn Tyr Pro Met Leu Asn Phe Met Leu Thr Thr Ala 485 490 495Ile Tyr Val Ala Val Ser His Arg Leu Phe Glu Leu Thr Asn Thr Leu 500 505 510Lys Ile Ala Phe Val Pro Ser Arg Asp Asn Lys Arg Leu Ser Tyr Asn 515 520 525Phe Val Ala Gly Ile Ala Ile Ser Val Ala Leu Tyr Ser Leu Ser Phe 530 535 540Leu Ile Val Gly Val Ala Gly Tyr545 55018473PRTArtificial Sequencepartial P2_P._trichocarpa sequence from CAS1L gene family 18Arg Ser Ala Ser Ala Lys Phe His Ser Ser Ala Ile Lys Met Asn Leu1 5 10 15Ile Arg Phe Met Thr Leu Asp Asp Ser Phe Leu Leu Glu Asn Arg Ala 20 25 30Thr Leu Arg Ala Met Ser Glu Phe Gly Ala Val Leu Leu Tyr Phe Tyr 35 40 45Ile Cys Asp Arg Thr Asn Ile Leu Gly Glu Ser Thr Lys Ser Tyr Asn 50 55 60Arg Asp Leu Phe Val Phe Leu Tyr Ile Leu Leu Ile Ile Val Ser Ser65 70 75 80Met Thr Ser Leu Arg Lys His Thr Asp Lys Ser Ala Phe Thr Gly Lys 85 90 95Ser Met Leu Tyr Leu Asn Arg His Gln Thr Glu Glu Trp Lys Gly Trp 100 105 110Met Gln Val Leu Phe Leu Met Tyr His Tyr Phe Ala Ala Ala Glu Ile 115 120 125Tyr Asn Ala Ile Arg Ile Phe Ile Ala Ala Tyr Val Trp Met Thr Gly 130 135 140Phe Gly Asn Phe Ser Tyr Tyr Tyr Ile Arg Lys Asp Phe Ser Val Ala145 150 155 160Arg Phe Ser Gln Met Met Trp Arg Leu Asn Phe Phe Val Ala Phe Cys 165 170 175Cys Ile Ile Leu Asn Asn Asp Tyr Met Leu Tyr Tyr Ile Cys Pro Met 180 185 190His Thr Leu Phe Thr Leu Met Val Tyr Gly Ala Leu Gly Ile Phe Asn 195 200 205Lys Tyr Asn Glu Asn Ser Ser Val Met Ala Val Lys Ile Leu Ser Cys 210 215 220Phe Leu Val Val Ile Leu Ile Trp Glu Ile Pro Gly Val Phe Asp Phe225 230 235 240Leu Trp Ser Pro Leu Thr Phe Leu Leu Gly Tyr Ser Asp Pro Ala Lys 245 250 255Pro Asp Leu Pro Arg Leu His Glu Trp His Phe Arg Ser Gly Leu Asp 260 265 270Arg Tyr Ile Trp Ile Ile Gly Met Ile Tyr Ala Tyr Phe His Pro Asn 275 280 285Ile Glu Lys Trp Met Glu Lys Leu Glu Glu Ser Glu Thr Lys Lys Lys 290 295 300Leu Ser Met Lys Thr Gly Ile Val Ala Val Ser Val Ser Val Gly Tyr305 310 315 320Leu Trp Tyr Glu 
Tyr Ile Tyr Lys Leu Asp Lys Val Ser Tyr Asn Lys 325 330 335Tyr His Pro Tyr Thr Ser Trp Ile Pro Ile Thr Val Tyr Ile Cys Leu 340 345 350Arg Asn Cys Thr Gln Gln Leu Arg Ser Phe Ser Ser Thr Leu Phe Ala 355 360 365Trp Leu Gly Lys Ile Thr Leu Glu Thr Tyr Ile Ser Gln Phe His Ile 370 375 380Trp Leu Arg Ser Asp Ile Pro Asn Gly Gln Pro Lys Trp Leu Leu Ser385 390 395 400Phe Ile Pro Glu Tyr Pro Leu Leu Asn Phe Met Leu Thr Thr Ala Ile 405 410 415Tyr Val Leu Val Ser His Arg Leu Phe Glu Leu Thr Asn Thr Leu Lys 420 425 430Thr Val Phe Ile Pro Thr Lys Asp Asn Lys Arg Leu Phe Tyr Asn Ser 435 440 445Val Ala Gly Ala Ala Ile Ser Val Cys Leu Tyr Cys Val Ala Val Ile 450 455 460Leu Leu His Ile Pro His Ser Pro Ala465 47019482PRTArtificial Sequencepartial P3_P._trichocarpa sequence from CAS1L gene family 19Leu Pro Arg Ser Ala Ser Ala Lys Phe His Ser Ser Ala Thr Lys Met1 5 10 15Asn Leu Ile Arg Phe Met Thr Met Asp Asp Ser Phe Leu Leu Glu Asn 20 25 30Arg Thr Thr Leu Arg Val Met Ser Glu Phe Gly Ala Val Leu Val Tyr 35 40 45Phe Tyr Ile Cys Asp Arg Thr Ile Arg Phe Met Thr Met Asp Asp Ser 50 55 60Phe Leu Leu Glu Asn Arg Thr Thr Leu Arg Val Met Ser Glu Phe Gly65 70 75 80Ala Val Leu Val Tyr Phe Tyr Ile Cys Asp Arg Thr His Gln Thr Glu 85 90 95Glu Trp Lys Gly Trp Met Gln Val Ile Phe Leu Met Tyr His Tyr Phe 100 105 110Ala Ala Thr Glu Ile Tyr Asn Ala Ile Arg Val Phe Ile Ala Ala Tyr 115 120 125Val Trp Met Thr Gly Phe Gly Asn Phe Ser Tyr Tyr Tyr Ile Arg Lys 130 135 140Asp Phe Ser Val Ala Arg Phe Ala Gln Met Met Trp Arg Leu Asn Leu145 150 155 160Phe Val Ala Phe Cys Cys Ile Val Leu Asn Asn Asp Tyr Met Leu Tyr 165 170 175Tyr Ile Cys Pro Met His Thr Leu Phe Thr Val Met Val Tyr Gly Val 180 185 190Leu Gly Ile Phe Asn Lys Tyr Asn Glu Asn Ser Ser Val Ile Ala Val 195 200 205Lys Ile Leu Ser Cys Phe Leu Met Val Ile Leu Ile Trp Glu Thr Pro 210 215 220Gly Val Phe Asp Ile Leu Trp Ser Pro Leu Thr Phe Leu Leu Gly Tyr225 230 235 240Thr Asp Pro Ala Lys Pro Asp Leu Pro Arg Leu His Glu Trp His Phe 245 250 255Arg Ser Gly Leu 
Asp Arg Tyr Ile Trp Ile Ile Gly Met Ile Tyr Ala 260 265 270Tyr Phe His Pro Asn Val Glu Lys Trp Met Glu Lys Leu Glu Glu Ser 275 280 285Glu Ile Lys Lys Lys Leu Ser Ile Lys Thr Gly Leu Val Ala Val Ser 290 295 300Leu Ser Val Gly Tyr Leu Trp Tyr Glu Cys Ile Tyr Lys Leu Asp Lys305 310 315 320Val Ser Tyr Asn Lys Tyr His Pro Tyr Thr Ser Trp Ile Pro Ile Thr 325 330 335Val Tyr Ile Cys Leu Arg Asn Cys Thr Gln Gln Leu Arg Ser Phe Ser 340 345 350Leu Thr Leu Phe Ala Trp Leu Gly Lys Ile Thr Leu Glu Thr Tyr Ile 355 360 365Ser Gln Phe His Ile Trp Leu Arg Ser Asp Met Pro Asn Gly Gln Pro 370 375 380Lys Trp Leu Leu Ser Val Ile Pro Glu Tyr Pro Leu Leu Asn Phe Met385 390 395 400Leu Thr Thr Ala Ile Tyr Val Leu Val Ser His Arg Leu Phe Glu Leu 405 410 415Thr Asn Thr Leu Lys Thr Val Phe Ile Pro Thr Lys Asp Asn Met Arg 420 425 430Leu Phe Tyr Asn Phe Val Ala Gly Ala Ala Ile Ser Leu Cys Leu Tyr 435 440 445Cys Val Ala Val Ile Leu Leu His Ile Leu His Ser Ala Val Ser Pro 450 455 460Ser Leu Val Leu Glu Asn Asn Met Val Ala Ser Asp Asp Leu Glu Leu465 470 475 480Cys Ser20576PRTArtificial Sequencepartial P1_P._trichocarpa sequence from CAS1L gene family 20Met Phe Ala Met Leu Thr Gly Lys Lys Glu Glu Glu Gly Ile Gly Gly1 5 10 15Pro Lys Glu His Trp Val Asp Ala Ser Met Pro Met Leu Ser Pro Val 20 25 30Thr Pro Gly Gln Phe Ser Phe Leu Leu Gly Ile Val Pro Val Phe Ala 35 40 45Ala Trp Ile Tyr Thr Glu Tyr Leu Glu Tyr Lys Lys Asn Asn Thr Leu 50 55 60Ala Lys Ala His Ser Asp Val Gly Leu Val Glu Leu Gly Asn Glu Ala65 70 75 80Val Lys Glu Asp Asp Arg Ala Val Leu Leu Glu Gly Gly Val Gln Ser 85 90 95Ala Ser Pro Lys Ala Arg Ser Ser Thr Ser Thr Phe Pro Ile Phe Arg 100 105 110Phe Phe Thr Met Glu Glu Gln Phe Leu Ile Asp Asn Arg Leu Thr Leu 115 120 125Arg Ala Ile Ser Glu Phe Gly Phe Phe Met Val Tyr Phe Tyr
Ile Cys 130 135 140 Asp Arg Thr Asp Ile Leu Gly Ser Ser Lys Lys Ser Tyr Asn Arg Asp 145 150 155 160 Leu Phe Leu Phe Leu Tyr Phe Leu Leu Ile Ile Val Ser Ala Ile Thr 165 170 175 Ser Phe Lys Ile His His Asp Lys Ser Pro Phe Ser Gly Lys Pro Ile 180 185 190 Leu Tyr Leu Asn Arg His Gln Thr Glu Glu Trp Lys Gly Trp Met Gln 195 200 205 Val Leu Phe Leu Met Tyr His Tyr Phe Ala Ala Thr Glu Phe Tyr Asn 210 215 220 Ala Ile Arg Val Phe Ile Ala Ser Tyr Val Trp Met Thr Gly Phe Gly 225 230 235 240 Asn Phe Ser Tyr Tyr Tyr Val Arg Lys Asp Phe Ser Leu Ala Arg Phe 245 250 255 Ala Gln Met Met Trp Arg Leu Asn Phe Leu Val Leu Val Cys Cys Val 260 265 270 Val Leu Asn Asn Ser Tyr Met Leu Tyr Tyr Ile Cys Pro Met His Thr 275 280 285 Leu Phe Thr Leu Met Val Tyr Ala Ala Leu Gly Ile Phe Asn Lys Tyr 290 295 300 Asn Glu Ile Gly Ser Val Met Ala Ala Lys Ile Ile Ala Cys Phe Leu 305 310 315 320 Val Val Ile Leu Met Trp Glu Ile Pro Gly Val Phe Glu Val Val Trp 325 330 335 Ser Pro Phe Thr Phe Leu Phe Gly Tyr Thr Asp Pro Ala Lys Pro Asp 340 345 350 Leu Pro Arg Leu His Glu Trp His Phe Arg Ser Gly Leu Asp Arg Tyr 355 360 365 Ile Trp Ile Val Gly Met Ile Tyr Ala Tyr Tyr His Pro Met Val Glu 370 375 380 Gly Trp Met Glu Lys Leu Glu Glu Thr Glu Ala Lys Arg Arg Ile Ser 385 390 395 400 Ile Lys Thr Ala Val Ala Thr Ile Ser Leu Ala Val Gly Tyr Met Trp 405 410 415 Tyr Glu Tyr Ile Tyr Lys Leu Asp Lys Cys Val His Leu Phe Glu Lys 420 425 430 Cys His Pro Ala Leu Pro Leu Leu Gln Leu Asp Pro Phe Arg Leu Asn 435 440 445 His Leu Leu Glu Ala Leu Ile Gly Gly Asn Leu Arg Glu Leu Leu Phe 450 455 460 Leu Trp Leu Gly Lys Ile Thr Leu Glu Thr Tyr Ile Ser Gln Ile His 465 470 475 480 Ile Trp Leu Arg Ser Gly Ile Pro Asp Gly Gln Pro Lys Leu Leu Leu 485 490 495 Ser Leu Ile Pro Asp Tyr Pro Met Leu Asn Phe Met Leu Thr Thr Ser 500 505 510 Ile Tyr Val Ala Val Ser Tyr Arg Leu Phe Asp Leu Thr Asn Thr Leu 515 520 525 Lys Thr Ala Phe Val Pro Ser Lys Asp Asp Lys Arg Leu Thr Asn Asn 530 535 540 Ile Ile Thr Ala Val Ala Val Ser Ile Val Leu Tyr Ser Leu Ser Phe 545 550 555 560 Val Phe Leu Lys Ala Pro Gln Met Leu Val Leu Thr Ile Arg Thr Asp 565 570 575

21 447 PRT Artificial Sequence partial P4_P._trichocarpa sequence from CAS1L gene family 21
Gly Leu Gln Pro Ala Ser Pro Lys Ala Arg Thr Pro Thr Ser Ser Phe 1 5 10 15 Pro Ile Phe Arg Phe Leu Met Met Glu Glu Gln Phe Leu Ile Asp Asn 20 25 30 Arg Leu Thr Leu Arg Ala Ile Leu Glu Phe Gly Phe Phe Met Ala Tyr 35 40 45 Phe Tyr Ile Cys Asp Arg Thr Asp Met Leu Gly Ser Ser Lys Lys Ser 50 55 60 Tyr Asn Arg Asp Leu Phe Leu Phe Leu Tyr Phe Leu Leu Ile Ile Val 65 70 75 80 Ser Ala Val Thr Ser Phe Thr Ile His His Asp Lys Ser Pro Phe Ser 85 90 95 Gly Lys Pro Ile Leu Tyr Leu Asn Arg His Gln Thr Glu Glu Trp Lys 100 105 110 Gly Trp Met Gln Val Leu Phe Leu Met Tyr His Tyr Phe Ala Ala Thr 115 120 125 Glu Ile Tyr Asn Ala Ile Arg Met Phe Ile Ala Ala Tyr Val Trp Met 130 135 140 Thr Gly Phe Gly Asn Phe Ser Tyr Tyr Tyr Val Arg Lys Asp Phe Ser 145 150 155 160 Leu Ala Arg Phe Ala Gln Met Met Trp Arg Leu Asn Phe Leu Val Leu 165 170 175 Phe Cys Cys Val Val Leu Asp Asn Ser Tyr Met Leu Tyr Tyr Ile Cys 180 185 190 Pro Met His Thr Leu Phe Thr Leu Met Val Tyr Ala Ala Pro Ala Lys 195 200 205 Pro Asp Leu Pro Arg Leu His Glu Trp His Phe Arg Ser Gly Leu Asp 210 215 220 Arg Tyr Ile Trp Ile Ile Gly Met Ile Tyr Ala Tyr Tyr His Pro Lys 225 230 235 240 Val Glu Gly Trp Met Glu Lys Leu Glu Glu Thr Glu Ala Lys Arg Arg 245 250 255 Ile Pro Ile Lys Thr Ala Val Ala Thr Ile Ser Leu Ala Val Gly Tyr 260 265 270 Thr Trp Tyr Glu Tyr Ile Tyr Lys Leu Asp Lys Ile Ser Tyr Asn Lys 275 280 285 Tyr His Pro Tyr Thr Ser Trp Ile Pro Ile Thr Val Tyr Ile Cys Leu 290 295 300 Arg Asn Val Thr Gln Gln Phe Arg Cys Tyr Ser Leu Thr Leu Phe Ala 305 310 315 320 Trp Leu Gly Lys Ile Thr Leu Glu Thr Tyr Ile Ser Gln Ile His Ile 325 330 335 Trp Leu Arg Ser Gly Ile Pro Asp Gly Gln Pro Lys Leu Leu Leu Ser 340 345 350 Leu Ile Pro Asp Tyr Pro Met Leu Asn Phe Met Leu Thr Thr Ser Ile 355 360 365 Tyr Ile Gly Val Ser Tyr Arg Leu Phe Asp Leu Thr Asn Thr Leu Lys 370 375 380 Thr Ala Phe Val Pro Ser Lys Asp Asn Lys Arg Leu Thr Asn Asn Ile 385 390 395 400 Ile Thr Ala Ala Ala Val Ser Ser Val Leu Tyr Ser Leu Ser Phe Val 405 410 415 Phe Leu Lys Val Pro Gln Met Leu Ile Asn Asp Asn Leu Cys Ala Val 420 425 430 Cys His Leu Asn Ala Gln Phe Ala Asp Thr Leu Asn Leu Gln Val 435 440 445

22 542 PRT Artificial Sequence partial Selaginella_estExt_Genewise1.C sequence from CAS1L gene family 22
Met Val Glu Ile Ser Pro Pro Thr Thr Gly Gln Val Ala Leu Val Leu 1 5 10 15 Gly Phe Ile Pro Val Leu Thr Ala Trp Leu Tyr Ser Glu Phe Leu Glu 20 25 30 Tyr Arg Lys Gln Pro Val Pro Gly Lys Ala His Ser Asp Ile Asn Leu 35 40 45 Ser Glu Leu Glu His Gly Pro Arg Arg Asp Asn Glu Lys Asp Ser Leu 50 55 60 Leu Glu Asn Gly Phe Ser Val Ser Gly Thr Leu Lys Gly Ser Phe Ser 65 70 75 80 Ile Arg Met Gln Leu Phe Lys Phe Phe Thr Leu Asn Glu Thr Phe Leu 85 90 95 Val Glu Asn Arg Ser Leu Leu Arg Ala Ile Ala Glu Phe Gly Cys Leu 100 105 110 Leu Cys Tyr Phe Tyr Ile Cys Asp Arg Thr Asn Val Phe Gly Glu Leu 115 120 125 Lys Lys Asn Tyr Ser Arg Asp Leu Phe Val Phe Leu Tyr Phe Leu Leu 130 135 140 Ile Ile Val Ser Ser Ile Thr Ser Leu Lys Lys His Ala Glu Lys Ser 145 150 155 160 Val Ala Ser Gly Lys Ser Ile Leu Tyr Leu Asn Arg His Gln Thr Glu 165 170 175 Glu Trp Lys Gly Trp Met Gln Val Leu Phe Leu Met Tyr His Tyr Phe 180 185 190 Ala Ala Ala Glu Ile Tyr Asn Ala Ile Arg Leu Phe Ile Ala Gly Tyr 195 200 205 Val Trp Met Thr Gly Phe Gly Asn Phe Ser Tyr Tyr Tyr Val Arg Lys 210 215 220 Asp Phe Ser Leu Gly Arg Phe Ala Gln Met Met Trp Arg Leu Asn Phe 225 230 235 240 Leu Val Thr Phe Cys Cys Ile Val Leu Asn Asn Ser Tyr Met Leu Tyr 245 250 255 Tyr Ile Cys Pro Met His Thr Leu Phe Thr Leu Met Val Tyr Cys Ser 260 265 270 Leu Gly Ile Leu Asn Lys Tyr Asn Glu Val Pro Ser Val Ile Gly Ala 275 280 285 Lys Ile Ala Ala Cys Phe Ala Val Val Ile Leu Val Trp Glu Val Pro 290 295 300 Gly Val Phe Asp Phe Val Trp Arg Pro Phe Thr Phe Leu Val Glu Tyr 305 310 315 320 Thr Asp Pro Gly Lys Pro Asp Leu Pro Val Leu His Glu Trp His Phe 325 330 335 Arg Ser Gly Leu Asp Arg Tyr Ile Trp Ile Tyr Gly Met Ile Cys Ala 340 345 350 Tyr Phe His Pro Thr Val Glu Arg Trp Leu Glu Lys Leu Glu Glu Leu 355 360 365 Glu Cys Arg Arg Lys Phe Thr Tyr Lys Ser Val Ile Val Phe Val Ala 370 375 380 Ser Leu Val Gly Tyr Leu Trp Tyr Val His Ile Tyr Lys Leu Asp Lys 385 390 395 400 Leu Ser Tyr Asn Lys Leu His Pro Tyr Thr Ser Trp Ile Pro Ile Ser 405 410 415 Val Tyr Ile Val Leu Arg Asn Val Ser Gln Pro Leu Arg Asn Trp Ser 420 425 430 Leu Thr Leu Phe Ala Trp Leu Gly Lys Ile Thr Leu Glu Thr Tyr Ile 435 440 445 Ala Gln Phe His Ile Trp Leu Arg Thr Gly Val Ser Asn Gly Gln Pro 450 455 460 Lys Leu Leu Leu Ser Phe Ile Pro Asp Tyr Pro Met Leu Asn Phe Met 465 470 475 480 Leu Ala Thr Ser Ile Tyr Ile Leu Val Ser Tyr Arg Leu Phe Glu Leu 485 490 495 Thr Asn Thr Leu Lys Ser Ala Phe Val Pro Asn Lys Asp Asn Asn Arg 500 505 510 Leu Phe Leu Met Val Val Ser Gly Gly Thr Ile Phe Ser Leu Leu Tyr 515 520 525 Gly Val Ser Tyr Leu Leu Val Lys Ile Pro Tyr Ile Leu Val 530 535 540

