Patent application title: MAD NUCLEASES
Inventors:
Juhan Kim (Boulder, CO, US)
Benjamin Mijts (Boulder, CO, US)
Aamir Mir (Boulder, CO, US)
IPC8 Class: AC12N922FI
Publication date: 2022-07-07
Patent application number: 20220213458
Abstract:
The present disclosure provides new RNA-guided nuclease systems and
engineered nickases for making rational, direct edits to nucleic acids in
live cells.
Claims:
1. A system for CRISPR editing of live cells comprising a MAD2015
nuclease having a sequence SEQ ID NO: 1, a CRISPR repeat RNA having a
sequence SEQ ID NO: 2, and a tracr RNA having a sequence SEQ ID NO: 3; a
MAD2016 nuclease having a sequence SEQ ID NO: 4, a CRISPR repeat RNA
having a sequence SEQ ID NO: 5, and a tracr RNA having a sequence SEQ ID
NO: 6; a MAD2017 nuclease having a sequence SEQ ID NO: 7, a CRISPR repeat
RNA having a sequence SEQ ID NO: 8, and a tracr RNA having a sequence SEQ
ID NO: 9; a MAD2019 nuclease having a sequence SEQ ID NO: 10, a CRISPR
repeat RNA having a sequence SEQ ID NO: 11, and a tracr RNA having a
sequence SEQ ID NO: 12; a MAD2020 nuclease having a sequence SEQ ID NO:
13, a CRISPR repeat RNA having a sequence SEQ ID NO: 14, and a tracr RNA
having a sequence SEQ ID NO: 15; a MAD2021 nuclease having a sequence SEQ
ID NO: 16, a CRISPR repeat RNA having a sequence SEQ ID NO: 17, and a
tracr RNA having a sequence SEQ ID NO: 18; or a MAD2022 nuclease having a
sequence SEQ ID NO: 19, a CRISPR repeat RNA having a sequence SEQ ID NO:
20, and a tracr RNA having a sequence SEQ ID NO: 21.
2. The system for CRISPR editing of live cells of claim 1, comprising a MAD2015 nuclease having a sequence SEQ ID NO: 1, a CRISPR repeat RNA having a sequence SEQ ID NO: 2, and a tracr RNA having a sequence SEQ ID NO: 3.
3. The system for CRISPR editing of live cells of claim 1, comprising a MAD2016 nuclease having a sequence SEQ ID NO: 4, a CRISPR repeat RNA having a sequence SEQ ID NO: 5, and a tracr RNA having a sequence SEQ ID NO: 6.
4. The system for CRISPR editing of live cells of claim 1, comprising a MAD2017 nuclease having a sequence SEQ ID NO: 7, a CRISPR repeat RNA having a sequence SEQ ID NO: 8, and a tracr RNA having a sequence SEQ ID NO: 9.
5. The system for CRISPR editing of live cells of claim 1, comprising a MAD2019 nuclease having a sequence SEQ ID NO: 10, a CRISPR repeat RNA having a sequence SEQ ID NO: 11, and a tracr RNA having a sequence SEQ ID NO: 12.
6. The system for CRISPR editing of live cells of claim 1, comprising a MAD2020 nuclease having a sequence SEQ ID NO: 13, a CRISPR repeat RNA having a sequence SEQ ID NO: 14, and a tracr RNA having a sequence SEQ ID NO: 15.
7. The system for CRISPR editing of live cells of claim 1, comprising a MAD2021 nuclease having a sequence SEQ ID NO: 16, a CRISPR repeat RNA having a sequence SEQ ID NO: 17, and a tracr RNA having a sequence SEQ ID NO: 18.
8. The system for CRISPR editing of live cells of claim 1, comprising a MAD2022 nuclease having a sequence SEQ ID NO: 19, a CRISPR repeat RNA having a sequence SEQ ID NO: 20, and a tracr RNA having a sequence SEQ ID NO: 21.
Description:
RELATED CASES
[0001] This application is a continuation of U.S. Ser. No. 17/463,498, filed 31 Aug. 2021, now allowed; which claims priority to U.S. Ser. No. 63/133,502, filed 4 Jan. 2021, entitled "MAD NUCLEASES", which is incorporated herein in its entirety.
INCORPORATION BY REFERENCE
[0002] Submitted with the present application is an electronically filed sequence listing via EFS-Web as an ASCII formatted sequence listing, entitled "INSC083US2_SEQLIST_20220309", created Mar. 9, 2022, and 359,000 bytes in size. The sequence listing is part of the specification filed Mar. 9, 2022 and is incorporated by reference in its entirety.
FIELD OF THE INVENTION
[0003] The present disclosure provides new RNA-guided nuclease systems and engineered nickases for making rational, direct edits to nucleic acids in live cells.
BACKGROUND OF THE INVENTION
[0004] In the following discussion certain articles and methods will be described for background and introductory purposes. Nothing contained herein is to be construed as an "admission" of prior art. Applicant expressly reserves the right to demonstrate, where appropriate, that the methods referenced herein do not constitute prior art under the applicable statutory provisions.
[0005] The ability to make precise, targeted changes to the genome of living cells has been a long-standing goal in biomedical research and development. Recently, various nucleases have been identified that allow manipulation of gene sequence and, hence, gene function. These nucleases include nucleic acid-guided nucleases. The range of target sequences that nucleic acid-guided nucleases can recognize, however, is constrained by the need for a specific PAM to be located near the desired target sequence. PAMs are short nucleotide sequences recognized by a gRNA/nuclease complex, where this complex directs editing of the target sequence. The precise PAM sequence and PAM length requirements for different nucleic acid-guided nucleases vary; however, PAMs typically are 2-7 base-pair sequences adjacent or in proximity to the target sequence and, depending on the nuclease, can be 5' or 3' to the target sequence. Engineering nucleic acid-guided nucleases or mining for new nucleic acid-guided nucleases may provide nucleases with altered PAM preferences and/or altered activity or fidelity, all of which are changes that may increase the versatility of a nucleic acid-guided nuclease for certain editing tasks.
[0006] There is thus a need in the art of nucleic acid-guided nuclease gene editing for novel nucleases with varied PAM preferences, varied activity in cells from different organisms such as mammals and/or altered enzyme fidelity. The novel MAD nucleases described herein satisfy this need.
SUMMARY OF THE INVENTION
[0007] This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Other features, details, utilities, and advantages of the claimed subject matter will be apparent from the following written Detailed Description including those aspects illustrated in the accompanying drawings and defined in the appended claims.
[0008] The present disclosure provides Type II MAD nucleases (e.g., RNA-guided nucleases or RGNs) with varied PAM preferences, and/or varied activity in mammalian cells.
[0009] Thus, in one embodiment there are provided MAD nuclease systems that perform nucleic acid-guided nuclease editing including a MAD2015 system comprising SEQ ID Nos. 1 (MAD2015 nuclease), 2 (CRISPR RNA) and 3 (trans-activating crispr RNA); a MAD2016 system comprising SEQ ID Nos. 4 (MAD2016 nuclease), 5 (CRISPR RNA) and 6 (trans-activating crispr RNA); a MAD2017 system comprising SEQ ID Nos. 7 (MAD2017 nuclease), 8 (CRISPR RNA) and 9 (trans-activating crispr RNA); a MAD2019 system comprising SEQ ID Nos. 10 (MAD2019 nuclease), 11 (CRISPR RNA) and 12 (trans-activating crispr RNA); a MAD2020 system comprising SEQ ID Nos. 13 (MAD2020 nuclease), 14 (CRISPR RNA) and 15 (trans-activating crispr RNA); a MAD2021 system comprising SEQ ID Nos. 16 (MAD2021 nuclease), 17 (CRISPR RNA) and 18 (trans-activating crispr RNA); a MAD2022 system comprising SEQ ID Nos. 19 (MAD2022 nuclease), 20 (CRISPR RNA) and 21 (trans-activating crispr RNA); a MAD2023 system comprising SEQ ID Nos. 22 (MAD2023 nuclease), 23 (CRISPR RNA) and 24 (trans-activating crispr RNA); a MAD2024 system comprising SEQ ID Nos. 25 (MAD2024 nuclease), 26 (CRISPR RNA) and 27 (trans-activating crispr RNA); a MAD2025 system comprising SEQ ID Nos. 28 (MAD2025 nuclease), 29 (CRISPR RNA) and 30 (trans-activating crispr RNA); a MAD2026 system comprising SEQ ID Nos. 31 (MAD2026 nuclease), 32 (CRISPR RNA) and 33 (trans-activating crispr RNA); a MAD2027 system comprising SEQ ID Nos. 34 (MAD2027 nuclease), 35 (CRISPR RNA) and 36 (trans-activating crispr RNA); a MAD2028 system comprising SEQ ID Nos. 37 (MAD2028 nuclease), 38 (CRISPR RNA) and 39 (trans-activating crispr RNA); a MAD2029 system comprising SEQ ID Nos. 40 (MAD2029 nuclease), 41 (CRISPR RNA) and 42 (trans-activating crispr RNA); a MAD2030 system comprising SEQ ID Nos. 43 (MAD2030 nuclease), 44 (CRISPR RNA) and 45 (trans-activating crispr RNA); a MAD2031 system comprising SEQ ID Nos. 46 (MAD2031 nuclease), 47 (CRISPR RNA) and 48 (trans-activating crispr RNA); a MAD2032 system comprising SEQ ID Nos. 49 (MAD2032 nuclease), 50 (CRISPR RNA) and 51 (trans-activating crispr RNA); a MAD2033 system comprising SEQ ID Nos. 52 (MAD2033 nuclease), 53 (CRISPR RNA) and 54 (trans-activating crispr RNA); a MAD2034 system comprising SEQ ID Nos. 55 (MAD2034 nuclease), 56 (CRISPR RNA) and 57 (trans-activating crispr RNA); a MAD2035 system comprising SEQ ID Nos. 58 (MAD2035 nuclease), 59 (CRISPR RNA) and 60 (trans-activating crispr RNA); a MAD2036 system comprising SEQ ID Nos. 61 (MAD2036 nuclease), 62 (CRISPR RNA) and 63 (trans-activating crispr RNA); a MAD2037 system comprising SEQ ID Nos. 64 (MAD2037 nuclease), 65 (CRISPR RNA) and 66 (trans-activating crispr RNA); a MAD2038 system comprising SEQ ID Nos. 67 (MAD2038 nuclease), 68 (CRISPR RNA) and 69 (trans-activating crispr RNA); a MAD2039 system comprising SEQ ID Nos. 70 (MAD2039 nuclease), 71 (CRISPR RNA) and 72 (trans-activating crispr RNA); and a MAD2040 system comprising SEQ ID Nos. 73 (MAD2040 nuclease), 74 (CRISPR RNA) and 75 (trans-activating crispr RNA). In some aspects, the MAD system components are delivered as sequences to be transcribed (in the case of the gRNA components) and transcribed and translated (in the case of the MAD nuclease), and in some aspects, the coding sequence for the MAD nuclease and the gRNA component sequences are on the same vector. In other aspects, the coding sequence for the MAD nuclease and the gRNA component sequences are on different vectors and in some aspects, the gRNA component sequences are located in an editing cassette which also comprises a donor DNA (e.g., homology arm). In other aspects, the MAD nuclease is delivered to the cells as a peptide or the MAD nuclease and gRNA components are delivered to the cells as a ribonucleoprotein complex.
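The SEQ ID numbering in paragraph [0009] follows a regular pattern: each listed system takes three consecutive SEQ ID numbers (nuclease, CRISPR RNA, trans-activating CRISPR RNA), and no MAD2018 system is listed. As a minimal sketch of that bookkeeping, the following Python helper reproduces the numbering. The names `SYSTEMS` and `seq_ids` are hypothetical and used only for illustration; the sequence listing itself remains the authoritative source.

```python
# Systems as enumerated in paragraph [0009]: MAD2015 through MAD2040,
# with no MAD2018 entry in the list.
SYSTEMS = [n for n in range(2015, 2041) if n != 2018]


def seq_ids(system_number: int) -> dict:
    """Return the SEQ ID numbers for one listed MAD system.

    Assumes the pattern in paragraph [0009]: three consecutive SEQ IDs
    per system, in the order nuclease, CRISPR RNA, tracrRNA.
    """
    if system_number not in SYSTEMS:
        raise ValueError(f"MAD{system_number} is not listed in paragraph [0009]")
    base = 3 * SYSTEMS.index(system_number)
    return {"nuclease": base + 1, "crRNA": base + 2, "tracrRNA": base + 3}


if __name__ == "__main__":
    for number in (2015, 2019, 2040):
        print(f"MAD{number}", seq_ids(number))
```

The helper covers only the 25 nuclease systems of SEQ ID Nos. 1-75; the engineered nickases carry separately assigned SEQ ID numbers.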
[0010] Additionally, there are provided engineered nickases derived from the nucleases of the above-referenced systems, including MAD2016-H851A (SEQ ID NO: 178); MAD2016-N874A (SEQ ID NO: 179); MAD2032-H590A (SEQ ID NO: 180); MAD2039-H587A (SEQ ID NO: 181); and MAD2039-N610A (SEQ ID NO: 182).
[0011] These aspects and other features and advantages of the invention are described below in more detail.
BRIEF DESCRIPTION OF THE FIGURES
[0012] FIG. 1 is an exemplary workflow for creating and screening mined MAD nucleases or RGNs.
[0013] FIG. 2 is a simplified depiction of an in vitro test conducted on candidate enzymes.
[0014] FIG. 3 is a list of novel Type II MADzymes that have been identified.
[0015] FIG. 4 is a map of Type II MADzymes in cluster 59.
[0016] FIG. 5 is a map of Type II MADzymes in clusters 55, 56, 57 and 58.
[0017] FIG. 6 is a map of Type II MADzymes in cluster 141.
[0018] FIG. 7 is a reproduction of a gel showing nicked plasmid formation with different MADzyme nickases compared to corresponding MADzyme nucleases.
[0019] It should be understood that the drawings are not necessarily to scale.
DETAILED DESCRIPTION
[0020] The description set forth below in connection with the appended drawings is intended to be a description of various, illustrative embodiments of the disclosed subject matter. Specific features and functionalities are described in connection with each illustrative embodiment; however, it will be apparent to those skilled in the art that the disclosed embodiments may be practiced without each of those specific features and functionalities. Moreover, all of the functionalities described in connection with one embodiment are intended to be applicable to the additional embodiments described herein except where expressly stated or where the feature or function is incompatible with the additional embodiments. For example, where a given feature or function is expressly described in connection with one embodiment but not expressly mentioned in connection with an alternative embodiment, it should be understood that the feature or function may be deployed, utilized, or implemented in connection with the alternative embodiment unless the feature or function is incompatible with the alternative embodiment.
[0021] The practice of the techniques described herein may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, biological emulsion generation, and sequencing technology, which are within the skill of those who practice in the art. Such conventional techniques include polymer array synthesis, hybridization and ligation of polynucleotides, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the examples herein. However, other equivalent conventional procedures can, of course, also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Green, et al., Eds. (1999), Genome Analysis: A Laboratory Manual Series (Vols. I-IV); Weiner, Gabriel, Stephens, Eds. (2007), Genetic Variation: A Laboratory Manual; Dieffenbach, Dveksler, Eds. (2003), PCR Primer: A Laboratory Manual; Bowtell and Sambrook (2003), DNA Microarrays: A Molecular Cloning Manual; Mount (2004), Bioinformatics: Sequence and Genome Analysis; Sambrook and Russell (2006), Condensed Protocols from Molecular Cloning: A Laboratory Manual; and Sambrook and Russell (2002), Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press); Stryer, L. (1995) Biochemistry (4th Ed.) W.H. Freeman, New York N.Y.; Gait, "Oligonucleotide Synthesis: A Practical Approach" 1984, IRL Press, London; Nelson and Cox (2000), Lehninger, Principles of Biochemistry 3rd Ed., W. H. Freeman Pub., New York, N.Y.; Berg et al. (2002) Biochemistry, 5th Ed., W.H. Freeman Pub., New York, N.Y.; Cell and Tissue Culture: Laboratory Procedures in Biotechnology (Doyle & Griffiths, eds., John Wiley & Sons 1998), all of which are herein incorporated in their entirety by reference for all purposes. Nuclease-specific techniques can be found in, e.g., Genome Editing and Engineering From TALENs and CRISPRs to Molecular Surgery, Appasani and Church, 2018; and CRISPR: Methods and Protocols, Lindgren and Charpentier, 2015; both of which are herein incorporated in their entirety by reference for all purposes. Basic methods for enzyme engineering may be found in, Enzyme Engineering Methods and Protocols, Samuelson, ed., 2013; Protein Engineering, Kaumaya, ed., (2012); and Kaur and Sharma, "Directed Evolution: An Approach to Engineer Enzymes", Crit. Rev. Biotechnology, 26:165-69 (2006).
[0022] Note that as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "an oligonucleotide" refers to one or more oligonucleotides. Terms such as "first," "second," "third," etc., merely identify one of a number of portions, components, steps, operations, functions, and/or points of reference as disclosed herein, and likewise do not necessarily limit embodiments of the present disclosure to any particular configuration or orientation.
[0023] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. All publications mentioned herein are incorporated by reference for the purpose of describing and disclosing devices, methods and cell populations that may be used in connection with the presently described invention.
[0024] Where a range of values is provided, it is understood that each intervening value between the upper and lower limits of that range, and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
[0025] In the following description, numerous specific details are set forth to provide a more thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that the present invention may be practiced without one or more of these specific details. In other instances, features and procedures well known to those skilled in the art have not been described in order to avoid obscuring the invention.
[0026] The term "complementary" as used herein refers to Watson-Crick base pairing between nucleotides and specifically refers to nucleotides hydrogen bonded to one another with thymine or uracil residues linked to adenine residues by two hydrogen bonds and cytosine and guanine residues linked by three hydrogen bonds. In general, a nucleic acid includes a nucleotide sequence described as having a "percent complementarity" or "percent homology" to a specified second nucleotide sequence. For example, a nucleotide sequence may have 80%, 90%, or 100% complementarity to a specified second nucleotide sequence, indicating that 8 of 10, 9 of 10 or 10 of 10 nucleotides of a sequence are complementary to the specified second nucleotide sequence. For instance, the nucleotide sequence 3'-TCGA-5' is 100% complementary to the nucleotide sequence 5'-AGCT-3'; and the nucleotide sequence 3'-TCGA-5' is 100% complementary to a region of the nucleotide sequence 5'-TAGCTG-3'.
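The percent complementarity described in paragraph [0026] can be illustrated with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, the strands are compared position by position without gaps, and only Watson-Crick pairs (with uracil treated like thymine) are counted.

```python
# Watson-Crick partners; U pairs with A so that RNA strands can be compared too.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(strand_5to3: str, strand_3to5: str) -> float:
    """Percent of positions that base-pair between two antiparallel strands.

    strand_5to3 is written 5'->3' and strand_3to5 is written 3'->5', so
    position i of one strand faces position i of the other.
    """
    if not strand_5to3 or len(strand_5to3) != len(strand_3to5):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        (a, b) in PAIRS for a, b in zip(strand_5to3.upper(), strand_3to5.upper())
    )
    return 100.0 * paired / len(strand_5to3)


def best_region_complementarity(longer_5to3: str, probe_3to5: str) -> float:
    """Best percent complementarity of the probe to any ungapped region."""
    n = len(probe_3to5)
    if n == 0 or n > len(longer_5to3):
        raise ValueError("probe must be non-empty and no longer than the target")
    return max(
        percent_complementarity(longer_5to3[i:i + n], probe_3to5)
        for i in range(len(longer_5to3) - n + 1)
    )


if __name__ == "__main__":
    # The two examples given in paragraph [0026].
    print(percent_complementarity("AGCT", "TCGA"))
    print(best_region_complementarity("TAGCTG", "TCGA"))
```

A strand pair in which 8 of 10 positions pair yields 80%, matching the "8 of 10" reading in the paragraph above.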
[0027] The term DNA "control sequences" refers collectively to promoter sequences, polyadenylation signals, transcription termination sequences, upstream regulatory domains, origins of replication, internal ribosome entry sites, nuclear localization sequences, enhancers, and the like, which collectively provide for the replication, transcription and translation of a coding sequence in a recipient cell. Not all of these types of control sequences need to be present so long as a selected coding sequence is capable of being replicated, transcribed and--for some components--translated in an appropriate host cell.
[0028] As used herein the term "donor DNA" or "donor nucleic acid" refers to nucleic acid that is designed to introduce a DNA sequence modification (insertion, deletion, substitution) into a locus by homologous recombination using nucleic acid-guided nucleases. For homology-directed repair, the donor DNA must have sufficient homology to the regions flanking the "cut site" or site to be edited in the genomic target sequence. The length of the homology arm(s) will depend on, e.g., the type and size of the modification being made. In many instances and preferably, the donor DNA will have two regions of sequence homology (e.g., two homology arms) to the genomic target locus. Preferably, an "insert" region or "DNA sequence modification" region--the nucleic acid modification that one desires to be introduced into a genome target locus in a cell--will be located between two regions of homology. The DNA sequence modification may change one or more bases of the target genomic DNA sequence at one specific site or multiple specific sites. A change may include changing 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, or 500 or more base pairs of the target sequence. A deletion or insertion may be a deletion or insertion of 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50, 75, 100, 150, 200, 300, 400, or 500 or more base pairs of the target sequence.
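The donor DNA layout described in paragraph [0028], with a modification region located between two homology arms, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: `build_donor`, its parameters, and the 16-base example sequence are hypothetical, and practical homology arm lengths depend on the organism and the edit being made.

```python
def build_donor(target: str, edit_start: int, edit_end: int,
                insert: str, arm_len: int) -> str:
    """Assemble left arm + modification + right arm for one edit.

    target[edit_start:edit_end] is the stretch to be replaced:
      * insertion:    edit_start == edit_end, insert is non-empty
      * deletion:     edit_start <  edit_end, insert is empty
      * substitution: edit_start <  edit_end, insert is non-empty
    Both arms are copied from the sequence flanking the edit site.
    """
    if not 0 <= edit_start <= edit_end <= len(target):
        raise ValueError("edit coordinates fall outside the target sequence")
    if edit_start - arm_len < 0 or edit_end + arm_len > len(target):
        raise ValueError("not enough flanking sequence for the requested arms")
    left_arm = target[edit_start - arm_len:edit_start]
    right_arm = target[edit_end:edit_end + arm_len]
    return left_arm + insert + right_arm


if __name__ == "__main__":
    locus = "AAAACCCCGGGGTTTT"  # hypothetical 16-base locus
    print(build_donor(locus, 8, 8, "TAG", 4))   # 3-base insertion
    print(build_donor(locus, 6, 10, "", 4))     # 4-base deletion
    print(build_donor(locus, 8, 9, "A", 4))     # single-base substitution
```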
[0029] The terms "guide nucleic acid" or "guide RNA" or "gRNA" refer to a polynucleotide comprising 1) a guide sequence capable of hybridizing to a genomic target locus, and 2) a scaffold sequence capable of interacting or complexing with a nucleic acid-guided nuclease.
[0030] "Homology" or "identity" or "similarity" refers to sequence similarity between two peptides or, more often in the context of the present disclosure, between two nucleic acid molecules. The term "homologous region" or "homology arm" refers to a region on the donor DNA with a certain degree of homology with the target genomic DNA sequence. Homology can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position. A degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences.
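Paragraph [0030] defines the degree of homology as a function of the number of matching positions between aligned sequences. A minimal Python sketch of one such function is shown below. It assumes the two sequences have already been aligned (gaps written as "-") and divides by the number of aligned columns; other denominators are also in common use, so this convention and the name `percent_identity` are illustrative assumptions rather than a definition taken from the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns occupied by the same base or amino acid.

    Inputs are pre-aligned, equal-length strings with '-' marking gaps.
    Columns that are gaps in both sequences are ignored; a gap opposite
    a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 7 of 8 positions match
```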
[0031] "Operably linked" refers to an arrangement of elements where the components so described are configured so as to perform their usual function. Thus, control sequences operably linked to a coding sequence are capable of effecting the transcription, and in some cases, the translation, of a coding sequence. The control sequences need not be contiguous with the coding sequence so long as they function to direct the expression of the coding sequence. Thus, for example, intervening untranslated yet transcribed sequences can be present between a promoter sequence and the coding sequence and the promoter sequence can still be considered "operably linked" to the coding sequence. In fact, such sequences need not reside on the same contiguous DNA molecule (i.e. chromosome) and may still have interactions resulting in altered regulation.
[0032] A "promoter" or "promoter sequence" is a DNA regulatory region capable of binding RNA polymerase and initiating transcription of a polynucleotide or polypeptide coding sequence such as messenger RNA, ribosomal RNA, small nuclear or nucleolar RNA, guide RNA, or any kind of RNA transcribed by any class of any RNA polymerase I, II or III. Promoters may be constitutive or inducible and, in some embodiments--particularly many embodiments in which selection is employed--the transcription of at least one component of the nucleic acid-guided nuclease editing system is under the control of an inducible promoter.
[0033] As used herein the term "selectable marker" refers to a gene introduced into a cell, which confers a trait suitable for artificial selection. General use selectable markers are well-known to those of ordinary skill in the art. Drug selectable markers such as ampicillin/carbenicillin, kanamycin, chloramphenicol, erythromycin, tetracycline, gentamicin, bleomycin, streptomycin, rhamnose, puromycin, hygromycin, blasticidin, and G418 may be employed. In other embodiments, selectable markers include, but are not limited to human nerve growth factor receptor (detected with a MAb, such as described in U.S. Pat. No. 6,365,373); truncated human growth factor receptor (detected with MAb); mutant human dihydrofolate reductase (DHFR; fluorescent MTX substrate available); secreted alkaline phosphatase (SEAP; fluorescent substrate available); human thymidylate synthase (TS; confers resistance to anti-cancer agent fluorodeoxyuridine); human glutathione S-transferase alpha (GSTA1; conjugates glutathione to the stem cell selective alkylator busulfan; chemoprotective selectable marker in CD34+ cells); CD24 cell surface antigen in hematopoietic stem cells; human CAD gene to confer resistance to N-phosphonacetyl-L-aspartate (PALA); human multi-drug resistance-1 (MDR-1; P-glycoprotein surface protein selectable by increased drug resistance or enriched by FACS); human CD25 (IL-2α; detectable by Mab-FITC); Methylguanine-DNA methyltransferase (MGMT; selectable by carmustine); and Cytidine deaminase (CD; selectable by Ara-C). "Selective medium" as used herein refers to cell growth medium to which has been added a chemical compound or biological moiety that selects for or against selectable markers.
[0034] The terms "target genomic DNA sequence", "target sequence", or "genomic target locus" refer to any locus in vitro or in vivo, or in a nucleic acid (e.g., genome) of a cell or population of cells, in which a change of at least one nucleotide is desired using a nucleic acid-guided nuclease editing system. The target sequence can be a genomic locus or extrachromosomal locus.
[0035] A "vector" is any of a variety of nucleic acids that comprise a desired sequence or sequences to be delivered to and/or expressed in a cell. Vectors are typically composed of DNA, although RNA vectors are also available. Vectors include, but are not limited to, plasmids, fosmids, phagemids, virus genomes, synthetic chromosomes, and the like. As used herein, the phrase "engine vector" comprises a coding sequence for a nuclease to be used in the nucleic acid-guided nuclease systems and methods of the present disclosure. The engine vector may also comprise, in a bacterial system, the λ Red recombineering system or an equivalent thereto. Engine vectors also typically comprise a selectable marker. As used herein the phrase "editing vector" comprises a donor nucleic acid, optionally including an alteration to the target sequence that prevents nuclease binding at a PAM or spacer in the target sequence after editing has taken place, and a coding sequence for a gRNA. The editing vector may also comprise a selectable marker and/or a barcode. In some embodiments, the engine vector and editing vector may be combined; that is, the contents of the engine vector may be found on the editing vector. Further, the engine and editing vectors comprise control sequences operably linked to, e.g., the nuclease coding sequence, recombineering system coding sequences (if present), donor nucleic acid, guide nucleic acid, and selectable marker(s).
Editing in Nucleic Acid-Guided Nuclease Genome Systems
[0036] RNA-guided nucleases (RGNs) have rapidly become the foundational tools for genome engineering of prokaryotes and eukaryotes. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) systems are adaptive immune systems that protect prokaryotes against mobile genetic elements (MGEs). RGNs are a major part of this defense system because they identify and destroy MGEs. RGNs can be repurposed for genome editing in various organisms by reprogramming the CRISPR RNA (crRNA) that guides the RGN to a specific target DNA. A number of different RGNs have been identified to date for various applications; however, there are various properties that make some RGNs more desirable than others for specific applications. RGNs can be used for creating specific double strand breaks (DSBs), creating specific nicks in one strand of DNA, or guiding another moiety to a specific DNA sequence.
[0037] The ability of an RGN to specifically target any genomic sequence is perhaps the most desirable feature of RGNs; however, RGNs can only access their desired target if the target DNA also contains a short motif called a PAM (protospacer adjacent motif) that is specific for every RGN. Type V RGNs such as MAD7, AsCas12a and LbCas12a tend to access DNA targets that contain YTTN/TTTN on the 5' end whereas type II RGNs--such as the MADzymes disclosed herein--target DNA sequences containing a specific short motif on the 3' end. An example well known in the art for a type II RGN is SpCas9, which requires an NGG on the 3' end of the target DNA. Type II RGNs, unlike type V RGNs, require a transactivating RNA (tracrRNA) in addition to a crRNA for optimal function. Compared to type V RGNs, the type II RGNs create a double-strand break closer to the PAM sequence, which is highly desirable for precise genome editing applications.
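The PAM constraint described in paragraph [0037] can be made concrete with a small search routine. The following Python sketch is illustrative only: `find_targets` and its parameters are hypothetical names, the 20-nucleotide protospacer length is an assumed default, only the strand as written is scanned (a complete search would also scan the reverse complement), and the PAMs used are the SpCas9 (NGG, 3' side) and type V (YTTN, 5' side) examples named above, not PAMs of the MADzymes.

```python
import re

# IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_targets(seq: str, pam: str, spacer_len: int = 20,
                 pam_side: str = "3prime") -> list:
    """Return (protospacer_start, protospacer, pam_found) tuples.

    pam_side="3prime": the PAM lies immediately 3' of the protospacer.
    pam_side="5prime": the PAM lies immediately 5' of the protospacer.
    """
    if pam_side not in ("3prime", "5prime"):
        raise ValueError("pam_side must be '3prime' or '5prime'")
    seq = seq.upper()
    # A lookahead makes overlapping PAM occurrences visible to finditer.
    pattern = re.compile("(?=(" + "".join(IUPAC[c] for c in pam.upper()) + "))")
    hits = []
    for match in pattern.finditer(seq):
        pam_start = match.start()
        pam_found = match.group(1)
        if pam_side == "3prime":
            start, end = pam_start - spacer_len, pam_start
        else:
            start = pam_start + len(pam_found)
            end = start + spacer_len
        # Keep only sites with a full-length protospacer inside the sequence.
        if start >= 0 and end <= len(seq):
            hits.append((start, seq[start:end], pam_found))
    return hits


if __name__ == "__main__":
    print(find_targets("ACGTACGTACGTACGTACGTAGGCC", "NGG"))
    print(find_targets("CTTAGATTACAGATTACAGATTAC", "YTTN", pam_side="5prime"))
```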
[0038] A number of type II RGNs have been discovered so far; however, their use in widespread applications is limited by restrictive PAMs. For example, the PAM of SpCas9 occurs less frequently in AT-rich regions of the genome. New type II RGNs with new and less restrictive PAMs are beneficial for the field. Further, not all type II nucleases are active in multiple organisms. For example, a number of RGNs have been discussed in the scientific literature but only a few have been demonstrated to be active in vitro and fewer still are active in cells, particularly in mammalian cells. The present disclosure identifies multiple type II RGNs that have novel PAMs and are active in mammalian cells.
[0039] In performing nucleic acid-guided nuclease editing, the type II RGNs or MADzymes may be delivered to cells to be edited as a polypeptide; alternatively, a polynucleotide sequence encoding the MADzyme is transformed or transfected into the cells to be edited. The polynucleotide sequence encoding the MADzyme may be codon optimized for expression in particular cells, such as archaeal, prokaryotic or eukaryotic cells. Eukaryotic cells can be yeast, fungi, algae, plant, animal, or human cells. Eukaryotic cells may be those of or derived from a particular organism, such as a mammal, including but not limited to human, mouse, rat, rabbit, dog, or non-human mammals including non-human primates. The choice of the MADzyme to be employed depends on many factors, such as what type of edit is to be made in the target sequence and whether an appropriate PAM is located close to the desired target sequence. The MADzyme may be encoded by a DNA sequence on a vector (e.g., the engine vector) and be under the control of a constitutive or inducible promoter. In some embodiments, the sequence encoding the nuclease is under the control of an inducible promoter, and the inducible promoter may be separate from but the same as an inducible promoter controlling transcription of the guide nucleic acid; that is, a separate inducible promoter may drive the transcription of the nuclease and guide nucleic acid sequences but the two inducible promoters may be the same type of inducible promoter (e.g., both are pL promoters). Alternatively, the inducible promoter controlling expression of the nuclease may be different from the inducible promoter controlling transcription of the guide nucleic acid; that is, e.g., the nuclease may be under the control of the pBAD inducible promoter, and the guide nucleic acid may be under the control of the pL inducible promoter.
[0040] In general, a guide nucleic acid (e.g., gRNA) complexes with a compatible nucleic acid-guided nuclease and can then hybridize with a target sequence, thereby directing the nuclease to the target sequence. With the type II MADzymes described herein, the nucleic acid-guided nuclease editing system uses two separate guide nucleic acid components that combine and function as a guide nucleic acid; that is, a CRISPR RNA (crRNA) and a transactivating CRISPR RNA (tracrRNA). The gRNA may be encoded by a DNA sequence on a polynucleotide molecule such as a plasmid or linear construct, or the coding sequence may reside within an editing cassette; in either case, the coding sequence is under the control of a constitutive promoter or, in some embodiments, an inducible promoter as described below.
[0041] A guide nucleic acid comprises a guide polynucleotide sequence having sufficient complementarity with a target sequence to hybridize with the target sequence and direct sequence-specific binding of a complexed nucleic acid-guided nuclease to the target sequence. The degree of complementarity between a guide sequence and the corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences. In some embodiments, a guide sequence is about or more than about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In some embodiments, a guide sequence is less than about 75, 50, 45, 40, 35, 30, 25, 20 nucleotides in length. Preferably the guide sequence is 10-30 or 15-20 nucleotides long, or 15, 16, 17, 18, 19, or 20 nucleotides in length.
[0042] In the present methods and compositions, the components of the guide nucleic acid are provided as a sequence to be expressed from a plasmid or vector, the sequence comprising both the guide sequence and the scaffold sequence as a single transcript under the control of a promoter and, in some embodiments, an inducible promoter. In general, to generate an edit in a target sequence, the gRNA/nuclease complex binds to a target sequence as determined by the guide RNA, and the nuclease recognizes a protospacer adjacent motif (PAM) sequence adjacent to the target sequence. The target sequence can be any polynucleotide endogenous or exogenous to a prokaryotic or eukaryotic cell, or in vitro. For example, the target sequence can be a polynucleotide residing in the nucleus of a eukaryotic cell. A target sequence can be a sequence encoding a gene product (e.g., a protein) or a non-coding sequence (e.g., a regulatory polynucleotide, an intron, a PAM, or "junk" DNA).
[0043] The guide nucleic acid may be part of an editing cassette that encodes the donor nucleic acid. Alternatively, the guide nucleic acid may not be part of the editing cassette and instead may be encoded on the engine or editing vector backbone. For example, a sequence coding for a guide nucleic acid can be assembled or inserted into a vector backbone first, followed by insertion of the donor nucleic acid in, e.g., the editing cassette. In other cases, the donor nucleic acid in, e.g., an editing cassette can be inserted or assembled into a vector backbone first, followed by insertion of the sequence coding for the guide nucleic acid. In yet other cases, the sequence encoding the guide nucleic acid and the donor nucleic acid (inserted, for example, in an editing cassette) are simultaneously but separately inserted or assembled into a vector. In yet other embodiments, the sequence encoding the guide nucleic acid and the sequence encoding the donor nucleic acid are both included in the editing cassette.
[0044] The target sequence is associated with a PAM, which is a short nucleotide sequence recognized by the gRNA/nuclease complex. The precise PAM sequence and length requirements for different nucleic acid-guided nucleases vary; however, PAMs typically are 2-7 base-pair sequences adjacent or in proximity to the target sequence and, depending on the nuclease, can be 5' or 3' to the target sequence. Engineering of the PAM-interacting domain of a nucleic acid-guided nuclease may allow for alteration of PAM specificity, improve fidelity, or decrease fidelity. In certain embodiments, the genome editing of a target sequence both introduces a desired DNA change to a target sequence, e.g., the genomic DNA of a cell, and removes, mutates, or renders inactive a protospacer adjacent motif (PAM) region in the target sequence. Rendering the PAM at the target sequence inactive precludes additional editing of the cell genome at that target sequence, e.g., upon subsequent exposure to a nucleic acid-guided nuclease complexed with a synthetic guide nucleic acid in later rounds of editing. Thus, cells having the desired target sequence edit and an altered PAM can be selected using a nucleic acid-guided nuclease complexed with a synthetic guide nucleic acid complementary to the target sequence. Cells that did not undergo the first editing event will be cut, producing a double-stranded DNA break, and thus will not continue to be viable. The cells containing the desired target sequence edit and PAM alteration will not be cut, as these edited cells no longer contain the necessary PAM site, and will continue to grow and propagate.
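The relationship between a PAM and its adjacent target sequence can be sketched as a simple scan of one DNA strand. The sketch below is illustrative only: the PAM pattern "NGG", the 20-nucleotide spacer length, and the function names are assumptions chosen for the example and are not the PAMs of the nucleases disclosed herein. Only the strand supplied is searched.

```python
import re

# IUPAC nucleotide codes translated to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_protospacers(seq, pam, spacer_len=20, pam_side="3"):
    """Return (start, protospacer, pam_match) tuples for one strand.

    pam_side="3" means the PAM lies 3' of the protospacer;
    pam_side="5" means the PAM lies 5' of the protospacer.
    """
    seq = seq.upper()
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead lets overlapping PAM occurrences all be reported.
    pattern = re.compile("(?=(" + pam_regex + "))")
    hits = []
    for match in pattern.finditer(seq):
        pam_start = match.start(1)
        if pam_side == "3":
            start, end = pam_start - spacer_len, pam_start
        else:
            start = pam_start + len(pam)
            end = start + spacer_len
        if start >= 0 and end <= len(seq):
            hits.append((start, seq[start:end], match.group(1)))
    return hits


# Hypothetical sequence: a 20-nt protospacer followed by a TGG PAM.
example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
print(find_protospacers(example, "NGG"))
```

An edit that changes the matched PAM (here, the TGG) would remove the site from the scan results, which mirrors the PAM-inactivation selection described above.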
[0045] As mentioned previously, the range of target sequences that nucleic acid-guided nucleases can recognize is constrained by the need for a specific PAM to be located near the desired target sequence. As a result, it often can be difficult to target edits with the precision that is necessary for genome editing. It has been found that nucleases can recognize some PAMs very well (e.g., canonical PAMs), and other PAMs less well or poorly (e.g., non-canonical PAMs). Because the mined MAD nucleases disclosed herein may recognize different PAMs, the mined MAD nucleases increase the number of target sequences that can be targeted for editing; that is, mined MAD nucleases decrease the regions of "PAM deserts" in the genome. Thus, the mined MAD nucleases expand the scope of target sequences that may be edited by increasing the number (variety) of PAM sequences recognized. Moreover, cocktails of mined MAD nucleases may be delivered to cells such that target sequences adjacent to several different PAMs may be edited in a single editing run.
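The effect of combining nucleases with different PAM preferences can be sketched by measuring the longest stretch of sequence containing no recognizable PAM (a "PAM desert") for one PAM versus a set of PAMs. The PAM patterns "NGG" and "NCC", the example sequence, and the function names below are hypothetical placeholders, not the PAMs of the mined MAD nucleases.

```python
import re

# Minimal IUPAC subset needed for the example patterns (assumption).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def pam_positions(seq, pam):
    """Set of start positions at which the PAM pattern occurs on this strand."""
    regex = "".join(IUPAC[base] for base in pam.upper())
    return {m.start() for m in re.finditer("(?=" + regex + ")", seq.upper())}


def longest_desert(seq_len, positions):
    """Longest run of consecutive positions at which no PAM starts."""
    bounds = [-1] + sorted(positions) + [seq_len]
    return max(b - a - 1 for a, b in zip(bounds, bounds[1:]))


seq = "AAAA" + "TGG" + "AAAAAAAAA" + "TCC" + "AAAA"
single = pam_positions(seq, "NGG")
cocktail = single | pam_positions(seq, "NCC")
print(longest_desert(len(seq), single), longest_desert(len(seq), cocktail))
```

In this toy sequence the union of the two PAM sets shortens the longest PAM-free stretch, which is the sense in which additional PAM preferences expand the set of editable target sequences.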
[0046] Another component of the nucleic acid-guided nuclease system is the donor nucleic acid. In some embodiments, the donor nucleic acid is on the same polynucleotide (e.g., editing vector or editing cassette) as the guide nucleic acid and may be (but not necessarily) under the control of the same promoter as the guide nucleic acid (e.g., a single promoter driving the transcription of both the guide nucleic acid and the donor nucleic acid). For cassettes of this type, see U.S. Pat. Nos. 10,240,167; 10,266,849; 9,982,278; 10,351,877; 10,364,442; 10,435,715; and 10,465,207. The donor nucleic acid is designed to serve as a template for homologous recombination with a target sequence nicked or cleaved by the nucleic acid-guided nuclease as a part of the gRNA/nuclease complex. A donor nucleic acid polynucleotide may be of any suitable length, such as about or more than about 20, 25, 50, 75, 100, 150, 200, 500, or 1000 nucleotides in length. In certain preferred aspects, the donor nucleic acid can be provided as an oligonucleotide of between 20-300 nucleotides, more preferably between 50-250 nucleotides. The donor nucleic acid comprises a region that is complementary to a portion of the target sequence (e.g., a homology arm). When optimally aligned, the donor nucleic acid overlaps with (is complementary to) the target sequence by, e.g., about 20, 25, 30, 35, 40, 50, 60, 70, 80, 90 or more nucleotides. In many embodiments, the donor nucleic acid comprises two homology arms (regions complementary to the target sequence) flanking the mutation or difference between the donor nucleic acid and the target template. The donor nucleic acid comprises at least one mutation or alteration compared to the target sequence, such as an insertion, deletion, modification, or any combination thereof compared to the target sequence.
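The donor layout described above, two homology arms flanking the intended change, can be sketched as simple string assembly. The sketch writes the donor in the same orientation as the reference target sequence, and the function name, coordinates, and arm length are hypothetical values chosen for illustration.

```python
def build_donor(target, edit_start, edit_end, replacement, arm_len):
    """Assemble left homology arm + replacement + right homology arm.

    target[edit_start:edit_end] is the region to be replaced (an empty
    slice gives a pure insertion; an empty replacement gives a deletion).
    """
    if not (0 <= edit_start <= edit_end <= len(target)):
        raise ValueError("edit coordinates fall outside the target")
    if edit_start - arm_len < 0 or edit_end + arm_len > len(target):
        raise ValueError("target too short for the requested arm length")
    left_arm = target[edit_start - arm_len:edit_start]
    right_arm = target[edit_end:edit_end + arm_len]
    return left_arm + replacement + right_arm


# Hypothetical example: replace GGGG with TA using 4-nt arms.
target = "AAAACCCCGGGGTTTT"
print(build_donor(target, 8, 12, "TA", 4))
```

Practical homology arms are far longer than in this toy example; the text above mentions overlaps of about 20 to 90 or more nucleotides and donor oligonucleotides of 20-300 nucleotides.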
[0047] Often the donor nucleic acid is provided as an editing cassette, which is inserted into a vector backbone where the vector backbone may comprise a promoter driving transcription of the gRNA and the coding sequence of the gRNA, or the vector backbone may comprise a promoter driving the transcription of the gRNA but not the gRNA itself. Moreover, there may be more than one, e.g., two, three, four, or more guide nucleic acid/donor nucleic acid cassettes inserted into an engine vector, where each guide nucleic acid is under the control of separate different promoters, separate like promoters, or where all guide nucleic acid/donor nucleic acid pairs are under the control of a single promoter. In some embodiments the promoter driving transcription of the gRNA and the donor nucleic acid (or driving more than one gRNA/donor nucleic acid pair) is an inducible promoter. Inducible editing is advantageous in that isolated cells can be grown for several to many cell doublings to establish colonies before editing is initiated, which increases the likelihood that cells with edits will survive, as the double-strand cuts caused by active editing are largely toxic to the cells. This toxicity results both in cell death in the edited colonies, as well as a lag in growth for the edited cells that do survive but must repair and recover following editing. However, once the edited cells have a chance to recover, the size of the colonies of the edited cells will eventually catch up to the size of the colonies of unedited cells. See, e.g., U.S. Pat. Nos. 10,533,152; 10,550,363; 10,532,324; 10,633,626; 10,633,627; 10,647,958; 10,760,043; 10,723,995; 10,801,008; and 10,851,339. Further, a guide nucleic acid may be efficacious in directing the edit of more than one donor nucleic acid in an editing cassette; e.g., if the desired edits are close to one another in a target sequence.
[0048] In addition to the donor nucleic acid, an editing cassette may comprise one or more primer sites. The primer sites can be used to amplify the editing cassette by using oligonucleotide primers; for example, if the primer sites flank one or more of the other components of the editing cassette.
[0049] In addition, the editing cassette may comprise a barcode. A barcode is a unique DNA sequence that corresponds to the donor DNA sequence such that the barcode can identify the edit made to the corresponding target sequence. The barcode typically comprises four or more nucleotides. In some embodiments, the editing cassettes comprise a collection of donor nucleic acids representing, e.g., gene-wide or genome-wide libraries of donor nucleic acids. The library of editing cassettes is cloned into vector backbones where, e.g., each different donor nucleic acid is associated with a different barcode.
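The one-to-one association between barcodes and donor nucleic acids in a library can be sketched as a lookup table. The sketch below simply enumerates barcodes of a fixed length in lexicographic order; the function name and donor identifiers are hypothetical, and practical barcode sets are often chosen with additional constraints (for example, a minimum number of differences between barcodes) that are not modeled here.

```python
from itertools import islice, product


def assign_barcodes(donor_ids, length=4):
    """Map each donor identifier to a unique DNA barcode of the given length."""
    if length < 4:
        raise ValueError("barcodes are taken here to be four or more nucleotides")
    donor_ids = list(donor_ids)
    if len(donor_ids) > 4 ** length:
        raise ValueError("not enough distinct barcodes of this length")
    barcodes = ("".join(p) for p in product("ACGT", repeat=length))
    return dict(zip(islice(barcodes, len(donor_ids)), donor_ids))


# Hypothetical donor identifiers for a small library.
library = assign_barcodes(["donor_A", "donor_B", "donor_C"])
print(library)
```

Reading a barcode from a sequenced cassette and looking it up in such a table identifies which donor, and therefore which intended edit, the cassette carried.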
[0050] Additionally, in some embodiments, an expression vector or cassette encoding components of the nucleic acid-guided nuclease system further encodes one or more nuclear localization sequences (NLSs), such as about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs. In some embodiments, the nuclease comprises NLSs at or near the amino-terminus of the MADzyme, NLSs at or near the carboxy-terminus of the MADzyme, or a combination.
[0051] The engine and editing vectors comprise control sequences operably linked to the component sequences to be transcribed. As stated above, the promoters driving transcription of one or more components of the mined MAD nuclease editing system may be inducible, and an inducible system is likely employed if selection is to be performed. A number of gene regulation control systems have been developed for the controlled expression of genes in plant, microbe, and animal cells, including mammalian cells. These systems include the pL promoter (induced by heat inactivation of the CI857 repressor), the pBAD promoter (induced by the addition of arabinose to the cell growth medium), and the rhamnose inducible promoter (induced by the addition of rhamnose to the cell growth medium). Other systems include the tetracycline-controlled transcriptional activation system (Tet-On/Tet-Off, Clontech, Inc. (Palo Alto, Calif.); Bujard and Gossen, PNAS, 89(12):5547-5551 (1992)), the Lac Switch Inducible system (Wyborski et al., Environ Mol Mutagen, 28(4):447-58 (1996); DuCoeur et al., Strategies 5(3):70-72 (1992); U.S. Pat. No. 4,833,080), the ecdysone-inducible gene expression system (No et al., PNAS, 93(8):3346-3351 (1996)), the cumate gene-switch system (Mullick et al., BMC Biotechnology, 6:43 (2006)), and the tamoxifen-inducible gene expression system (Zhang et al., Nucleic Acids Research, 24:543-548 (1996)), as well as others.
[0052] Typically, performing genome editing in live cells entails transforming cells with the components necessary to perform nucleic acid-guided nuclease editing. For example, the cells may be transformed simultaneously with separate engine and editing vectors; the cells may already be expressing the mined MAD nuclease (e.g., the cells may have already been transformed with an engine vector or the coding sequence for the mined MAD nuclease may be stably integrated into the cellular genome) such that only the editing vector needs to be transformed into the cells; or the cells may be transformed with a single vector comprising all components required to perform nucleic acid-guided nuclease genome editing.
[0053] A variety of delivery systems can be used to introduce (e.g., transform or transfect) nucleic acid-guided nuclease editing system components into a host cell. These delivery systems include the use of yeast systems, lipofection systems, microinjection systems, biolistic systems, virosomes, liposomes, immunoliposomes, polycations, lipid:nucleic acid conjugates, virions, artificial virions, viral vectors, electroporation, cell permeable peptides, nanoparticles, nanowires, and exosomes. Alternatively, molecular trojan horse liposomes may be used to deliver nucleic acid-guided nuclease components across the blood brain barrier. Of particular interest is the use of electroporation, particularly flow-through electroporation (either as a stand-alone instrument or as a module in an automated multi-module system) as described in, e.g., U.S. Pat. Nos. 10,435,713; 10,443,074; 10,323,258; and 10,415,058.
[0054] After the cells are transformed with the components necessary to perform nucleic acid-guided nuclease editing, the cells are cultured under conditions that promote editing. For example, if constitutive promoters are used to drive transcription of the mined MAD nucleases and/or gRNA, the transformed cells need only be cultured in a typical culture medium under typical conditions (e.g., temperature, CO2 atmosphere, etc.). Alternatively, if editing is inducible--by, e.g., activating inducible promoters that control transcription of one or more of the components needed for nucleic acid-guided nuclease editing, such as, e.g., transcription of the gRNA, donor DNA, nuclease, or, in the case of bacteria, a recombineering system--the cells are subjected to inducing conditions.
EXAMPLES
[0055] The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present invention, and are not intended to limit the scope of what the inventors regard as their invention, nor are they intended to represent or imply that the experiments below are all of or the only experiments performed. It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific aspects without departing from the spirit or scope of the invention as broadly described. The present aspects are, therefore, to be considered in all respects as illustrative and not restrictive.
Example 1
Exemplary Workflow Overview
[0056] The disclosed MADzyme Type II CRISPR enzymes were identified by the method depicted in FIG. 1. FIG. 1 shows an exemplary workflow for creating and for in vitro screening of MADzymes, including those in untapped clusters. In a first step, metagenome mining was performed to identify putative RGNs of interest based on, e.g., sequence (HMMER profile) and a search for CRISPR arrays. Once putative RGNs of interest were identified in silico, candidate pools were created and each MADzyme was identified by cluster, the tracrRNA was identified, and the sgRNA structure was predicted. Final candidates were identified, then the genes were synthesized. An in vitro depletion test was performed (see FIG. 2), where a synthetic target library was constructed in which to test target depletion for each of the candidate MADzymes. After target depletion, amplicons were produced for in vivo analysis. FIG. 2 depicts the in vitro depletion test in more detail.
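One way a target depletion readout could be scored is sketched below. The scoring scheme is an assumption made for illustration and is not described in this section: read counts for each library member before and after nuclease treatment are converted to frequencies (with a pseudocount), and the log2 ratio of the two frequencies is reported, so that strongly negative scores indicate depleted (cleaved) targets. The function name and the counts are hypothetical.

```python
import math


def depletion_scores(counts_before, counts_after, pseudocount=1.0):
    """log2(frequency after / frequency before) for each library member.

    Assumed scoring scheme: negative scores indicate depletion.
    """
    members = set(counts_before) | set(counts_after)
    if not members:
        raise ValueError("no library members supplied")
    total_before = sum(counts_before.get(m, 0) + pseudocount for m in members)
    total_after = sum(counts_after.get(m, 0) + pseudocount for m in members)
    scores = {}
    for m in members:
        freq_before = (counts_before.get(m, 0) + pseudocount) / total_before
        freq_after = (counts_after.get(m, 0) + pseudocount) / total_after
        scores[m] = math.log2(freq_after / freq_before)
    return scores


# Hypothetical read counts keyed by the PAM flanking each library target.
before = {"TGG": 100, "TAA": 100}
after = {"TGG": 10, "TAA": 190}
print(depletion_scores(before, after))
```

Because frequencies are relative, members that are not cleaved appear enriched when others are depleted; comparing scores across the library, rather than reading any single value in isolation, is the intended use of such a sketch.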
Example 2
Metagenome Mining
[0057] The NCBI Metagenome database was used to search for novel, putative CRISPR nucleases using HMMER hidden Markov model searches. Hundreds of potential nucleases were identified. For each potential nuclease candidate, putative CRISPR arrays were identified and CRISPR repeats and anti-repeats were identified. Thirteen nucleases (FIG. 3) were chosen for in vitro validation and 11 active MADzymes were identified and assigned to clusters. There was less than 40% sequence identity between clusters. Cluster 59, shown in FIG. 4, presents two unique subclusters with distinct sgRNA architecture. Clusters 55-57 are shown in FIG. 5. These new MADzymes have diverse PAM preferences and distinct sgRNA structures. Cluster 141 (FIG. 6) is a distant cluster from 55, 56, 57 and 59 and shows diverse Cas protein structure and smaller-sized enzymes (e.g., approximately 200 amino acids shorter than the counterparts from the 55, 56, 57 and 59 clusters). Table 1 lists the identified MADzymes, including amino acid sequences, origin, and nucleic acid sequences of the CRISPR RNA and the trans-activating CRISPR RNA.
TABLE-US-00001 TABLE 1 Organism MAD Clus- (meta- CRISPR name ter Contig_id genome) Source aa_seq repeat tracrRNA MAD2015 59 DPZI01000013.1 Vagococcus MGKNYTIGLDIGTNSVGWSVVTENQQLVKKRMKIRGDS GTTTT TGTTGGT sp. EKKQVKKNFWGVRLFDEGETAEATRLKRTTRRRYTRRR AGAGC AGCATTC NRVVDLQNIFKDEINQKDSNFFNRLNESFLVVEDKKQP TATGC AAAACAA KQMIFGTVEEEASYHESFPTIYHLRKELVDNKDQADIR TGTTT CATAGCA LVYLAMAHMIKYRGHFLIEGQLSTENTSVEEKFHLFLK TGAAT AGTTAAA EYNSTFCKQEDGSLVNPVNEDINGEEILMGTLSRSKKA GCTTC ATAAGGC EQIMKSFEGEKSNGVFSQFLKMIVGNQGNFKKAFNLEE CAAAA TTTGTCC DAKIQFAKEEYDEDLTTLLSNIGDEYANVFSLAKETYE C GTTCTCA AIELSGILSTKDKETYAKLSSSMTERYEDHEKDLASLK [SEQ ACTTTTA SFFREHLPEKYAVMFKDVSKNGYAGYIENSNKISQEEF ID NO. GTGACGC YKYTKKLIGQIEGADYFIKKMEQEAFLRKQRTYDNGVI 2] TGTTTCG PYQVHLSELTHIINNQKKYYPFLLEKEEEIKSILTFKI GCG PYYIGPLAKGNSDFAWLIRNSNDKITPSNFNEVLDIEN [SEQ ID SASQFIERMTNNDVYLPEEKVLPKNSMLYQKYIVFNEL NO. 3] TKVRYINDRGTECNFSGEEKLQIFERFFKDSSTKVKKV SLENYLNKEYMIESPTIKGIEDDFNASFRTYHDFIKLG VSREMLDDIDNEEMFEDIVKILTIFEDRQMIKKQLEKY KDVFDSDILKKMVRRHYTGWGRLSKKLLHEMKDDNSGK TILDYLIEDDRLPKHINRNFMQLINDSNLSFKEKIEKA QLTDGTEDIDSVVKNLIGSPAIKKGISQSLKIVEELVS IMGYQPTSIVVEMARENQTTSKGKRQSIQRYKRLEAAI NELGSDLLKVCPTDNHALKDDRLYLYYLQNGRDMYTGL ELDIHNLSQYDIDHIVPRSFITDNSIDNRVLVSSKKNR GKLDNVPSKEIVQKNKLLWMNLKKSKLMSEKKYANLIK GETGGLTEDDKAKFLNRQLVETRQITKNVAQILDQRFN TQKDEKGNIIREVKVITLKSALVSQFRQNFEFYKVREV NDFHHANDAYLNAVVANTLLKVYPKLTPDFVYGEYRKG NPFKNTKATAKKHYYSNIMENLCHETTIIDDETGEILW DKKCIGTIKQVLNYHQVNVVKKVETQTGRFSEETLVPR GSTKNPIALKSHLDPQKYGGFKSPTIAYTIVIEYKKGK KDILIKELLGISIMNRGAFEKNNKEYLEKLNYKEPRVL MVLPKYSLFELENGRRRLLASDKESQKGNQMAVPSYLN NLLYHTNKSLSKNAKSLEYVNEHRQQFEELLEEIIDFA NQFTLAEKNTLLIADLYESNKEADIELLASSFINLLRF NQMGAPAEFSFFEKPIPRKRYSSTFELLKGKVIHQSIT GLYETHQKV [SEQ ID NO. 
1] MAD2016 59 DGLK01000042.1 Entero- New MKKDYVIGLDIGTNSVGWAVMTEDYQLVKKKMPIYGNT GTTTT TCTTTTG coccus York EKKKIKKNFWGVRLFEEGHTAEDRRLKRTARRIISRRR AGAGT GGACTAT faecalis City NRLRYLQAFFEEAMTDLDENFFARLQESFLVPEDKKWH CATGT TCTAAAC MTA RHPIFAKLEDEVAYHETYPTIYHLRKKLADSSEQADLR TGTTT AACATAG subway LIYLALAHIVKYRGHFLIEGKLSTENISVKEQFQQFMI AGAAT CAAGTTA IYNQTFVNGESRLVSAPLPESVLIEEELTEKASRTKKS GGTAC AAATAAG EKVLQQFPQEKANGLFGQFLKLMVGNKADFKKVFGLEE CAAAA GTTTTAA EAKITYASESYEEDLEGILAKVGDEYSDVFLAAKNVYD C CCGTAAT AVELSTILADSDKKSHAKLSSSMIVRFTEHQEDLKKFK [SEQ CAACTGT RFIRENCPDEYDNLFKNEQKDGYAGYIAHAGKVSQLKF ID NO. AAAGTGG YQYVKKIIQDIAGAEYFLEKIAQENFLRKQRTFDNGVI 5] CGCTGTT PHQIHLAELQAIIHRQAAYYPFLKENQEKIEQLVTFRI TCGGCGC PYYVGPLSKGDASTFAWLKRQSEEPIRPWNLQETVDLD [SEQ ID QSATAFIERMTNFDTYLPSEKVLPKHSLLYEKFMVFNE NO. 6] LTKISYTDDRGIKANFSGKEKEKIFDYLFKTRRKVKKK DIIQFYRNEYNTEIVTLSGLEEDQFNASFSTYQDLLKC GLTRAELDHPDNAEKLEDIIKILTIFEDRQRIRTQLST FKGQFSAEVLKKLERKHYTGWGRLSKKLINGIYDKESG KTILGYLIKDDGVSKHYNRNFMQLINDSQLSFKNAIQK AQSSEHEETLSETVNELAGSPAIKKGIYQSLKIVDELV AIMGYAPKRIVVEMARENQTTSTGKRRSIQRLKIVEKA MAEIGSNLLKEQPTTNEQLRDTRLFLYYMQNGKDMYTG DELSLHRLSHYDIDHIIPQSFMKDDSLDNLVLVGSTEN RGKSDDVPSKEVVKDMKAYWEKLYAAGLISQRKFQRLT KGEQGGLTLEDKAHFIQRQLVETRQITKNVAGILDQRY NANSKEKKVQIITLKASLTSQFRSIFGLYKVREVNDYH HGQDAYLNCVVATTLLKVYPNLAPEFVYGEYPKFQTFK ENKATAKAIIYTNLLRFFTEDEPRFTKDGEILWSNSYL KTIKKELNYHQMNIVKKVEVQKGGFSKESIKPKGPSNK LIPVKNGLDPQKYGGFDSPIVAYTVLFTHEKGKKPLIK QEILGITIMEKTRFEQNPILFLEEKGFLRPRVLMKLPK YTLYEFPEGRRRLLASAKEAQKGNQMVLPEHLLTLLYH AKQCLLPNQSESLTYVEQHQPEFQEILERVVDFAEVHT LAKSKVQQIVKLFEANQTADVKEIAASFIQLMQFNAMG APSTFKFFQKDIERARYTSIKEIFDATIIYQSTTGLYE TRRKVVD [SEQ ID NO. 4] MAD2017 59 DMKA01000006.1 Strepto- MKKPYSIGLDIGTNSVGWAVITDDYKVPAKKMKVLGNT GTTTT TGTTGGA coccus DKKYIKKNLLGALLFDSGETAEVTRLKRTARRRYTRRK AGAGC ACTATTC sp. 
NRLRYLQEIFAKEMTKVDESFFQRLEESFLTDDDKTFD TGTGC GAAACAA (firmi- SHPIFGNKAEEDAYHQKFPTIYHLRKYLADSQEKADLR TGTTT CACAGCG cutes) LVYLALAHMIKYRGHFLIEGELNAENTDVQKLFNVFVE CGAAT AGTTAAA TYDKIVDESHLSEIEVDASSILTEKVSKSRRLENLIKQ GGTTC ATAAGGC YPTEKKNTLFGNLIALALGLQPNFKTNFKLSEDAKLQF CAAAA TTTGTCC SKDTYEEDLEELLGKVGDDYADLFISAKNLYDAILLSG C GTACACA ILTVDDNSTKAPLSASMIKRYVEHHEDLEKLKEFIKIN [SEQ ACTTGTA KLKLYHDIFKDKTKNGYAGYIDNGVKQDEFYKYLKTIL ID NO. AAAGGGG TKIDDSDYFLDKIERDDFLRKQRTFDNGSIPHQIHLQE 8] CACCCGA MHSILRRQGEYYPFLKENQAKIEKILTFRIPYYVGPLA TTCGGGT RKDSRFAWANYHSDEPITPWNFDEVVDKEKSAEKFITR GCA MTLNDLYLPEEKVLPKHSHVYETFTVYNELTKIKYVNE [SEQ ID QGESFFFDANMKQEIFDHVFKENRKVTKAKLLSYLNNE NO. 9] FEEFRINDLIGLDKDSKSFNASLGTYHDLKKILDKSFL DDKTNEQIIEDIVLTLTLFEDRDMIHERLQKYSDFFTS QQLKKLERRHYTGWGRLSYKLINGIRNKENNKTILDFL IDDGHANRNFMQLINDESLSFKTIIQEAQVVGDVDDIE AVVHDLPGSPAIKKGILQSVKIVDELVKVMGDNPDNIV IEMARENQTTGYGRNKSNQRLKRLQDSLKEFGSDILSK KKPSYVDSKVENSHLQNDRLFLYYIQNGKDMYTGEELD IDRLSDYDIDHIIPQAFIKDNSIDNKVLTSSAKNRGKS DDVPSIEIVRNRRSYWYKLYKSGLISKRKFDNLTKAER GGLTEADKAGFIKRQLVETRQITKHVAQILDARFNTKR DENDKVIRDVKVITLKSNLVSQFRKEFKFYKVREINDY HHANDAYLNAVVGTALLKKYPKLTPEFVYGEYKKYDVR KLIAKSSDDYSEMGKATAKYFFYSNLMNFFKTEVKYAD GRVFERPDIETNADGEVVWNKQKDFDIVRKVLSYPQVN IVKKVEAQTGGFSKESILSKGDSDKLIPRKTKKVYWNT KKYGGFDSPTVAYSVLVVADIEKGKAKKLKTVKELVGI SIMERSFFEENPVSFLEKKGYHNVQEDKLIKLPKYSLF EFEGGRRRLLASATELQKGNEVMLPAHLVELLYHAHRI DSFNSTEHLKYVSEHKKEFEKVLSCVENFSNLYVDVEK NLSKVRAAAESMTNFSLEEISASFINLLTLTALGAPAD FNFLGEKIPRKRYTSTKECLSATLIHQSVTGLYETRID LSKLGEE [SEQ ID NO. 7] MAD2019 59 DOTL01000042.1 Strepto- MTKPYSIGLDIGTNSVGWAVITDDYKVPSKKMKVLGNT GTTTT GGTTTGA coccus SKKYIKKNLLGALLFDSGITAEGRRLKRTARRRYTRRR AGAGC AACCATT sp. 
NRILYLQEIFSTEMATLDDAFFQRLDDSFLVPDDKRDS TGTGT CGAAACA (firmi- KYPIFGNLVEEKAYHDEFPTIYHLRKYLADSTKKADLR TGTTT ATACAGC cutes) LVYLALAHMIKYRGHFLIEGEFNSKNNDIQKNFQDFLD CGAAT AAAGTTA TYNAIFESDLSLENSKQLEEIVKDKISKLEKKDRILKL GGTTC AAATAAG FPGEKNSGIFSEFLKLIVGNQADFKKYFNLDEKASLHF CAAAA GCTAGTC SKESYDEDLETLLGYIGDDYSDVFLKAKKLYDAILLSG C CGTATAC ILTVTDNGTETPLSSAMIMRYKEHEEDLGLLKAYIRNI [SEQ AACGTGA SLKTYNEVFNDDTKNGYAGYIDGKTNQEDFYVYLKKLL ID NO. AAACACG AKFEGADYFLEKIDREDFLRKQRTFDNGSIPYQIHLQE 11] TGGCACC MRAILDKQAKFYPFLAKNKERIEKILTFRIPYYVGPLA GATTCGG RGNSDFAWSIRKRNEKITPWNFEDVIDKESSAEAFINR TGC MTSFDLYLPEEKVLPKHSLLYETFTVYNELTKVRFIAE [SEQ ID GMSDYQFLDSKQKKDIVRLYFKGKRKVKVTDKDIIEYL NO. 12] HAIDGYDGIELKGIEKQFNSSLSTYHDLLNIINDKEFL DDSSNEAIIEEIIHTLTIFEDREMIKQRLSKFENIFDK SVLKKLSRRHYTGWGKLSAKLINGIRDEKSGNTILDYL IDDGISNRNFMQLIHDDALSFKKKIQKAQIIGDKDKDN IKEVVKSLPGSPAIKKGILQSIKIVDELVKVMGRKPES IVVEMARENQYTNQGKSNSQQRLKRLEESLEELGSKIL KENIPAKLSKIDNNSLQNDRLYLYYLQNGKDMYTGDDL DIDRLSNYDIDHIIPQAFLKDNSIDNKVLVSSASNRGK SDDVPSLEVVKKRKTLWYQLLKSKLISQRKFDNLTKAE RGGLSPEDKAGFIQRQLVETRQITKHVARLLDEKFNNK KDENNRAVRTVKIITLKSTLVSQFRKDFELYKVREIND FHHAHDAYLNAVVASALLKKYPKLEPEFVYGDYPKYNS FRERKSATEKVYFYSNIMNIFKKSISLADGRVIERPLI EVNEETGESVWNKESDLATVRRVLSYPQVNVVKKVEVQ SGGFSKELVQPHGNSDKLIPRKTKKMIWDTKKYGGFDS PIVAYSVLVMAEREKGKSKKLKPVKELVRITIMEKESF KENTIDFLERRGLRNIQDENIILLPKFSLFELENGRRR LLASAKELQKGNEFILPNKLVKLLYHAKNIHNTLEPEH LEYVESHRADFGKILDVVSVFSEKYILAEAKLEKIKEI YRKNMNTEIHEMATAFINLLTFTSIGAPATFKFFGHNI ERKRYSSVAEILNATLIHQSVTGLYETRIDLGKLGED [SEQ ID NO. 
10] MAD2020 55 DQFW01000027.1 Achole- human MKNNEETLKKLRLGLDIGTNSVGYALLDENNKLIKKNG GTTTG TGTAAAT plasmatales gut HTFWGVRMFDEAETAKDRGSYRKSRRRLLRRKERMEIL CTAGT AACATAA bacterium RSFFTKEICDIDPTFFERLDDSFYYKEDKKNKNTYNLF TATGT CGAGTGC TSEYTDKDFYLEYPTIYHLRKAMQEEDKKFDIRMVYLA TATTT AAATAAG IAHIIKYRGNFLYPGEEFSTSEYTSIKQFFLDFNDILD ATAGT CGTTTCG ELSNELEDNEDYSAEYFDKIENINDDFLEKLKVILMEI ATTAA CGAAAAT KGISNKKKELLDLFNVNKKSIYNELVIPFISGSAKVNI GCAAA TTACAGT SSLSVIKNSKYPKTEISLGSEELEGQVEEAISVAPEIK C GGCCCTG SVLEMIIKIKEISDFYFINKILSDSKTISESMVKMYDE [SEQ CTGTGGG HNEDLKKLKGFFKKYAEDQYNEIFKIRDEKLANYVAYV ID NO. GCCTTTT GFNKLRKNKVERFKHASREEFYGYLKQKLNNIKYAEAQ 14] TTATTTA EEIKYFIDKIDNNEFLLKQNSNQNGAFPMQLHLKELKT TCAAA ILNNQEKYYPFLSEGNDGYSIKEKIILTFKYKIPYYVG [SEQ ID PLNKESKYSWVVREDEKIYPWNFDKVVKLDETAEKFIL NO. 15] RMQNKCTYLKGDNDYCLPKNSLIFSEYSCLSYLNKLSI NGKPIDPIMKSKIFNEVFLIKKQPTKKDIIEFIKTNYN ADALTTTEKELPEATCNMASYIKMKEIFGKDFNDNKEM IENIIKDITIFEDKSILGNRLKELYKLNNDRIKQIKGL NYKGYSRLSKNLLVGLQIVDNQTGEIKGNVIEVMRKTN LNLQEILYLDGYRLIDAIDEYNRKNSLNDSYLCARDYI AENLVISPSFKRALIQTCSIIQEIERIFHKKIDEFYVE VTRTNKDKNKGKTTSSRYDKIKKIYSSCQELAMAYNFD MKRLKNELESNKDNLKSDILYFYFTQLGKCMYSLEDID ISDLTNNYHYDIDHIYPQSIIKDDSLSNRVLVDKKKNA AKTDKFLFEAKVLNPKAQQFYKKLLSLELISKEKYRRL TQKEISKDELEGFVNRQLVSTNQSVMGLIKLLKEYYKV DEKNIIYSKGENVSDFRHTFDLVKSRTANNFHHANDAY LNVVVGGILNKYYTSRRFYQFSDIARIENEGESLNPSR IFTKRDILKANGKVIWDKKEDIKRIEKDLYHRFDITET IRTYNPNKMYSKVTILPKGEGESAVPFQTTTPRVDVEK YGGITSNKFSRYVIIEAHGKKGLDTILEAIPKTACGDN NKIEKDIDNYIASLDEYQKYTSYKVVNYNIKANVVIQE GSFKYIITGKSGNQYVLQNVQDRFFSKKAMITIKNIDK YLNNKKLGIIMAKDNEKIIVSPARGKNNEEIFFEKTEL VNLLKEIKTMYSKDIYSFSAIQNIVNNIDCSIDYSIDD FIIICNNLLQILKTNERKNADLRLIHLSGNSGTLYLGK KLKSGMKFIWQSITGYYEEILYEVK [SEQ ID NO. 
13] MAD2021 57 DEED01000018.1 Lachno- MSEKYFVGLDMGTSSVGWAVTDEHYHLLRRKGKDLWGA GTTTG GATAATG spiraceae RLFDEAETAAGRRTNRVSRRRLARQRARIGWLKELFRP AGAGC TTTTACA bacterium YLEEKDAGFLQRLEESRFFLEDKTVKQPYALFSDKEFT CTTGT AGGCGAG DKDYYQKYPTIFHLRKELLESKAPHDVRLVFLAVLNMY AAAAC TTCAAAT AHRGHFLNPELQEGTLGDIHDLLSRLDAYIQDLFEDQG CGTAT AAGGATT WSILENVEEQQKVLAEKNISNTVRLEKILSAIGTSPKD ATCTC TATCCGA KEKKPLIEIYKLICGLKGSLSLAFSGVEMNETDAQMKF TCAAG AATCGCT SFSDSNLEENEPEIERILGERYFEMYSILKEIHAWGLL C TGCGTGC SEIMSDDSGKTYPYISYAKVDLYQKHHEQLRMLKKIIR [SEQ ATTGGCA TYAPDEYHRMFRSMEDNTYSAYVGSVNSKNKKQRRGAK ID NO. CCATCTA STDFFKEVKRIIEKIEKEHGELPECEEILDLIARDSFL 17] TCTTTTA PKQLTTANGVIPNQVYATELRQIVTNAAAYLPFLNDKD DTGLTNAEKIVEMFKFHIPYYIGPLKNDGNGTAWVVRK AGACTTT QQGTVYPWNIDEKVDMAKTRDQFILNLVRKCSYLNDET CTTTGAA VLPASSLLYEKFKVLNELNNLTINGQKISVELKQDIFR AGTCTT DLFRATGKRVTTRKLMGYLRRKAVIDADADETCLEGFD [SEQ ID KTQGGFVSTLSSYHKFMEIFSTDVLTDRQREIAEGAIY NO. 18] FATVYGEDKSFLKKVLRDKFSPAELSQAQIDRLSGIRF KDWSHLSREFLLLEEADHSTGEIMTIIDRLWNTNENLM QIIHSDEYTYKQAIEERTARLEKSLSEVSFEDIEDSYM SAPVRRMVWQTIRILQEIEEVMGSEPARVFVEMTRSEG EKGDKGRKDSRKKKLKELYKKCKDDDQGLLSDIEGRDE RDFRIRKLYLYYMQKGLCMYSGHPIDFGKLFDDSYYDI DHIYPRHYVKDDSIENNLVLVESKLNRDKKDTLLCPDI QERMHPVWEMLHRQGFMNDEKFKRLMRKEPFSEEEFAH FIERQLVETGQGTKEIARILNDVLGNKDENNKVIYVKA GNVSSFRNDNKKNPEFVKCRVINDHHHAKDAYLNIVVG NTYYTKFTLHPANFIRELRNKSHPTLEDQYNMDKLFAR RVERNGYTAWNPDTDFQTVKQVLRKNSVLISRRSFIEH GQIADLQLVSGRKISEVNGKGYLPIKASDIRLSGPSGT MKYGGYNKASGAYFFLVEHELKGKLVRTIEPVYVYMMA SIHGKEDLEKYCQEELGYIHPRICLKKIPMYSHIRING FDYYLTGRSNDRLFICNAVQLTLSSEWSAYIKALSKAV DEKWDAAYIEQQASRIQDSLKSEEVFISKERNDQLYKV LLQKHLEGFFNNRINSIGTIMKEGYDSFRALPVNEQAE TLMEILKISQLVNIGANLVSIGGKSRSGVATVSKKISD SKSFQLISDSVTGIFQRATDLLTI [SEQ ID NO. 
16] MAD2022 57 CACYWR010000004.1 uncultured Cattle MEKEYYLGLDMGTSSVGWAVTDKEYRLLRAKGKDMWGI GTTTG GAGAATT Lachno- rumen REFEEAQTAVERRTHRLSKRRRARQLVRIGLLKDYFHD AGAGT AACAAGA spiraceae EIMKIDPNFYIRLENSKYYLEDKDVRLASSNGIFDDKN CTTGT CGAGTGC bacterium YTDKDYYEQYKTIFHLRSELIHNSQKHDVRLVYLALLN TAATT AAATAAG MFKHRGHFLFEGDAYVQGNIGDIYKEFIQLLKNEYYED CTTAA GTTTATC ENVKLTDQIDYFKLKEILSNSEFSRTAKAEKINSLVHI AGGTG CGGAATC DKKNKLENTYIRLLCGLEIELKILFPEIDEKIKICFAK TAAAA GTCAATA GYDEKLVEITEILTDNQLQILENLKKIHDIAALDKIRK C TGACCTG
GKEYLSDARVAEYEKHREDLALLKKIYREYMTKQDYDR [SEQ CATTGTG MFREGEDGSYSAYVNSYNTSKKQRRNMKHRKIDEFYGT ID NO. CAGAATC IRKDLKLLLKQGIQDDNIERILEEIDGNNDNKFMPKQL 20] TTTAAAA SFANGVIPNSLHKAEMKAILRNAETYLPFLLETDESGL TCATATG TVSERILQLFSFHIPYYIGPVSVNSEKNNGNGWVVRRE ATTTCAT DGEVLPWNIEQKIDYGETSKRFIEKMVRRCTYISGEQV ATGGTTT LPKNSFIYEKYCVLNEINNIKIDGERITVELKQNIYND TA LYLHGKRVTKKQLINYLNNRGMIEDENQVSGIDINLNN [SEQ ID YLGSYGKFLPIFEEKLKEDNYIKIAEDIIYLASIYGDS NO. 21] KKMLKSQIKSKYGDILDDKQIKRILGLKFKDWGRISRR FLELEGLDKETGEITTIIKAMWDYNLNFMEIIHSDAFD FKDKIEELHANSIKPLAEIEVEDLDDMYFSAPVKRMIW QTFKVIKEIEKVMGCPPKKVFIEMTRINDKKSKGKRTN SRKEKFLSLYKNIHDELVDWKQLIISSDESGKLNSKKM YLYLTQQGICMYTGRRINLEELFDDNKYDIDHIYPRHF VKDDNLENNLVLVEKQSNSRKSDTYPIDKSIRNNSQVY KHWKSLREGNFISKEKYDRLTGKNEFTDEQKAGFIARQ MVETSQGTKGVADIIKQALPQSRIIYSKASNVSEFRRK YDILKSRTVNEFHHANDAYLNIVVGNVYDTKFTSNPLN FIKKQYNVDRKANNYNLDKMFVYDVKRGNEIAWIGWNP KKSEDSSEMSKRGTIVTVKKMLSKNTPLMTRMSFVGHG GIAEDNLSSHFVAKNKGYMPNGKESDVTKYGGYKKAKT AYFFVVEHGQTNNRIRTIETLPIYRRREVEKYEDGLIK YCEQSLSLLNPIIIYKKIKIQSLMKINGYYAYISGKSN EVYTFRNGVNMCLSQEWINYVKKLENYIEKDRQDRMIT YEKNIELYEIILRKYSTTILNKRLSKMDKKLINAKDRF CILNVKEQSQVLINVFVLSRIGDNQTDLSKIGIGKQSG QITQNKKITGCKEFKLVNQSVTGLYENEIDLLTV [SEQ ID NO. 19] MAD2023 56 DCGJ01000048.1 Lachno- Feces MEKNNYLLGLDIGTDSVGYAVTNDKYDILKFHGEPAWG GTTTG AGACCCC clostridium of six- VTIFDEASLSTEKRSFRVSRRRLDRRQQRVLLVQELFA AGAGT TATGGAT sp. years SEVAKVDKDFFKRIQESNLYRSDAENQAGLFIGEDYCD AGTGT TTACATT old REYYGQYPTIHHLISDLMNGTSPHDVRLVYLACAWLVA AAATC GCGAGTT elephant HRGHFLSNIDKDNLSGLKDFSSVYEGLMQYFSDNGYER CATAG CAAATAA PWNANVDVKALGDALKKKQGVTAKTKELLALLLDSAKA GGGTC AAGTTTA EKLPREEFPFSQDGIIKLLAGGTYKLSELFGNEEYKDF TCAAA CTCAAAT GSVKLSMDDEKLGEIMSNIGEDYELIASLRIVSDWAVL C CGTTGGC VDVLGESATISEAKVGIYNQHKADLEVLKKIIRKYTGK [SEQ TTGACCA EGYKKVFRQVDSKENYVAYSQHESDGKAPKEKGIDIAT ID NO. 
ACCGCAC FSKFILNIVRLLDVEPEDKEVYEDMVARLELNSFLPKQ 23] AGCGTGT VNTDNRVIPYQLYWFELHKILENASIYLPMLTEKDSNG ISVMEKLESVFMFRIPYFVGPLNKHSKYAWLERKEGKI GCTTAAA YPWNFENMVDLDASEANFIKRMTNTCTYLPGQNVLPKD GATCTCT SLRYHRFMVLNEINNLRINNERISVELKQKIYSELFLN TCAGTGA VKKVTRKRLVDFLISNGELRKGEESSLTGIDVEIKANL GGTC APQIAFKKLMESGQLTEEDVESIIERASYAEDKARLAH [SEQ ID WLEAKYSKLSEIDRKYICGIKIKDFGRLSKMFLSELEG NO. 24] VDKTTGEMTTILGAMWNSQLNLMELINSELYSFREAIC AYQTDYYSTHSSSLEERMNEMYLSNAVKRPVYRTLDIV KDVKKAFGEPKKIFVEMTRGASEEQKGKRTKSRKEQIL ELYKQCKDEDVRILQQQLEEMGDLADNKLQGDKLFLYY MQKGKCMYTGTPIVLEQLGSKAYDIDHIYPQAYVKDDS ILNNRVLVLSEANGKKKDIYPIEKETRDKMHGFWTYLN DKGMITEEKYKRLTRTTGFTEEEKWSFINRQLTETSQA TKAVATLLGELFPNAEIVYSKARLTSEFRQEFNLLKCR SYNDLHHAVDAYLNIVCGNVYNMKFTKRWFNINKDYSI KTKTVFTHPVVCGGQVVWDGQEMLNKVIRNAKKNTAHF TKYAYIRKGGFFDQMPVKAAEGLTPLKKDMPTAVYGGY NKPSVAFLIPTRYKAGKKTEIIILSVEHLFGERFLRDE AYAKEYAAERLKKILGKQVDEVSFPMGMRPWKINTVLS LDGFLICISGIGSGGKCLRAQSIMQFSSDYRWTIYLKR LERLVEKITVNAKYVYSEEFDKVSTIENIELYDLYIEK YKATIFSKRVNSPEEIIESGRDKFVKLDVLSQARALLC IHQTFGRIVGGCDLGLIGGKKNSAATGNFSSTISNWAK YYKDVRIIDQSTSGLWVRKSENLLELV [SEQ ID NO. 22] MAD2024 56 CADAKQ010000027.1 uncultured Cattle MNFDGEYFLGLDIGTDSVGYAVTDQRYNLVKFKGEPMW GTTTG GAGCCCT Lachno- rumen GSHLFDAANQCAERRGFRTARRRLDRRQQRVKLVDEIF AGAGT CTGGATT spiraceae APEVAKVDPNFYIRKMESALYPEDKSNKGDLYLYFNKQ AGTGT TACACTA bacterium EYDEKHYYKDYPTIHHLICALMNDEKTKFDIRLINIAI AAATC CGAGTTC DWLVAHRGHFLSEVGTDSVDKVLDFRKIYDEFMALFSD CAGAG AAATAAA EDDAVSSKPWENINPDELGKVLKIHGKNAKRNELKKLL GGCTC AATTATT YGGKIPTDEDSFIDRKLLIDFIAGTSVQCNKLFRNSEY CAAAA TCAAATC EDDLKITISNSDEREVVLPQLEDFHADIIAKLSSMYDW C GCCGCTA SVLSDILSGSTYISESKVKVYEQHKKDLKELKEFVRKY [SEQ TGTCGGC APEKYNDIFRLASKETYNYTAYSYNLKSVKDEKDLPKG ID NO. 
CGCACAG KASKEDFYSYLKKTLKLDKAENYNFVNDADTRFFDDMV 26] TGTGTGC ERISSGTFLPKQVNSDNRVIPYQVYYIELKKILENAKK ATTAAGA HYAFFEEKDEDGYSNVEKIMSVFTFRIPYYVGPLRNDD AAAGTCC KSPYAWIRRKADGKIYPWDFEEKVDLDASENAFIDRMT GAAAGGG NSCTYIPGADVLPKWSLLYTKYMVLNEINNIKVNNIGI C SVEAKQGIYNELFCKKAKVSLKAIREYLISNGFMQKDD [SEQ ID EMSGIDITVKSSLKSRYDFRHLLEKNELTTDDVEAIIS NO. 27] RSTYAEDKARFKKWLKKEFPQLSDEDYKYVSKLKYKDF GRLSRSLLNGLEGASKETGEIGTIMHFLWETNDNLMQL LSDRYTFMEEINKKRQDYYIEHKLTLNEQMEELGISNA VKRPVTRTLAVVKDVVSAIGYAPQKIFVEMARQEDEKK KRSVTRKEQILELYKNVEEDTKELERQLKKMGDTANNE LQSDALFLYYLQLGKCMYSGKPIDLTQIKTTKKYDIDH IWPQSMVKDDSLLNNRVLVLSEINGDKKDVYPIDESIR SKMHSYWKMLLDKNLITKEKYSRLTRPTPFTESEKLGF INRQLVETRQSMKAVTQLLNNMYPDSEIVYVKAKLAAD FKQDFKLAPKSRIINDLHHAKDTYLNVVAGNVYNERFT KKWFNVNEKYSMKTKVLFGHDVKIGDRLIWDSKKDLQT VKNTYEKNNIHLTRYAYCQKGGLFDQMPVKKGQGQIQL KKGMDIDRYGGYNKATASFFIIARYLRGGKKEVSFVPV ELMVSEKFLNDDNFAIEYITNVLTGMNTKKIENVELPL GKRVIKIKTVLLLDGYKVWVNGKASGGTRVMLTSAESL RMPKEYVEYLKKMENYSEKKKSNRNFMHDSENDGLSEE KNILLYDKLLEKLDENHFKKMPGNQCETMKSGRVKFIE LDFDVQISTLLNCIDLLKSGRTGGCDLKNIGGKSASGV VYISANLSACKYNDVHIIDISPAGLHENISCNLMELFE [SEQ ID NO. 25] MAD2025 56 DOQG01000053.1 Rumino- human MSFKENSKFYFGLDIGTDSVGWAVTDNLYKLYKYKNNL GTTTG TTTTACT coccaceae gut MWGVSLFEAASPAEDRRNHRTARRRLDRRQQRVALLRE AGAGT ACCCTAT bacterium LFAKEILKTDPDFFLRLKESSLYPEDRTNKNVNTYFDD AGTGT AAATTTA ADFKDSDYFKMYPTVHHLIKELSESDKPHDVRLVYLAC AAATT CACTACG AFIVAHRGHFLNGADENNVQEVLDFNSSYCEFTDWFKS TATAG AGTTCAA NDIEDNPFSESTENEFSVILRKKIGITAKEKEIKNLLF GGTAG ATAAAAA GTTKTPDCYKDEEYPIDIDVLIKFISGGKTNLAKLFRN TAAAA TTATTTC PAYDELDIQTVEVGKADFADTIDLLASSMEDTDVPLLS C AAATCGT AVKAMYDWSLLIDVLKGQKTISDAKVCEYEQHKSDLKA [SEQ ACTTTTT LKHIVRKYLDKAQYDEIFRTAGEKPNYVSYSYNVTDVK ID NO. 
AGTACCT LKQLPSNFKKKYSEEFCKYINSKLEKIKPEPDDEAVYN 29] TCACAAG ELIEKCNSKTLCPKQVTDENRVIPYQLYYHELSMILDK TGTTGTG ASAYLDFLNETEDGISVKQKILTLMKFRIPYFVGPSVK AATATTA RNETDNVWIVRKAEGRIYPWNFENMVDYDKSEDGFIRR ACTCACC MTCKCTYLAGEDVLPKYSLLYSRYTVLNEINNIKVKDV TTCGGGT KISPELKQDIFNELFMKTSRVTVKKITELLKRKGAFSE GAG ENGDSLSGVDINIKSSLKSYLDFRRLLENGSLSESDVE [SEQ ID RIIERITVTTDKPRLISWLKTEYPALPAEDIRYISRLS NO. 30] YKDYGRLSAKMLTGCYELDMDTGEIGGRSIIDLMWAEN INLMQIMSDSYGYKSFIEEENKKYYAINPTGSIAQTLR EMYVSPSASRAIIRTMDIVKELRKIIKRDPDKIFVEMA RGSKPEDKGKRTSSRREQIEKLFASAKEFVSDEEISHL RSQLGSLSDEQLRSEKYYLYFTQFGKCVYSGEAIDFSR LGDNHCYDIDHIYPQSKVKDDSLHNKVLVKSQLNGEKS DDYPIKEQIRNKMHPIWKNLFYRDPKNPTDKIKYERLT RSTPFTEDELAGFIERQLVETRQSTKAVATLLKEMFPD SKIVYVKAGQVSKFRHDFDMLKCREINDLHHAKDAYLN VVVGNVHDVKFTSNPLNFVKNADKHYTIKIKETLKHKV ARNGETAWNPETDFDTVKRMMSKNSVRYVRYCYKRKGE LFKQQPKKAGNPDLAWLKKNLDPVKYGGYNSKSISCFS LIKCTGVGVVIIPVELLCEKRYFSDDSFASEYAYSVLK NALPAKNIAKISIDDISFPLKRRPIKINTLFEFDGYRV NIRSKDSYSVFRISSAMAAIYSKDTSDYIKAISSYIDK SDKGSKFKPGEAFDVLSNLKAYDEIAKKCISEPFCKIS KLAEAGKKMEEGRNKFAELSIIEQMKTLLLLVDVLKTG RVDKCNLKPVGGVDNFHTERMSAILKNTKYSDIRIIDQ SPTGLYENKSDNLLEL [SEQ ID NO. 28] MAD2026 65 CADBQN010000053.1 uncultured Cattle MEQKDYYIGLDIGTNSVGWAVVDEGYQLCRFKKYDMWG GTTTG GACTACC Firmicutes rumen VRLFDSAETAAERRMNRVNRRRNRRKKQRIDLLQGLFA AGAGT ATATGAG bacterium EEIAKIDRTFFVRLNESRLHPEDKSTAFRHPLFNDPNY AGTGT ATTACAC TDVDYYKEYPTIYHLRKELMDSAEPHDIRLVYLALHHI AATTT TACACGG LKNRGHFLIEGGFEDSKKFEPTFRQLLEVLTEELGLKM CATAT TTCAAAT DGADAALAESVLKDRGMKKTEKVKRLKNVFTLNTTDMD GGTAG AAAGAAT QESQKKQKAQIDAVCKFLAGSKGDFKKLVADEALNELK TCAAA GTTCGAA LDTFALGTSKAEDIGLEIEKSAPQYCVVFESVKSVFDW C ACCGCCC KIMTQILGDESTFSSAKVKEYEKHHENLIILRELIRKY [SEQ TTTGGGG CDKETYRHFFNNVNGGYSRYIGSLKKNGKKYYVAGCTQ ID NO. 
CCCGCTT EEFYKELKGLLKSIDQRVDPEDRPVYQRVLAETEDETF 32] GTTGCGG LPLLRSKANSAIPRQIHQKELDDILQNASVYLPFLNDV ATTTACA DEDGLSAAEKIRSIFTFRIPYYVGPLSLRHKDKGAHVW GACTTGA IKRKEEGYIYPWNYEKKIDREKSNEEFIRRLINQCTYL TATCAAG KDEKVLPKKSLLYSEFMVLNELNNLRIRGKRLSEEQVE TCTG LKQRIYRDLFMTKTRVTKKTLLNYLRKEDSDLTEEDLS [SEQ ID GFDNDFKASLSSCLELKNKVFGDRIEEDRVRKIAEDLI NO. 33] RWLTIYDDDKKMIKEVIRAEYPNEFTNEQLDVICRLKF SGWGNLSEAFLCGVEGADKDTGEVFTIIEALRNTNHNL MELLSGNYTFTEKIREHNAALSSEIKAKDYESLVRDLY VSPACKRGIWQTIRITEEIKKIMGHEPKKIFVEMTREH RDSGRTTSRKDQLLALYQKCEEDARDWVKEIEDREERD FSSIKLFLYYLQQGKCMYSGEAIDLDELMSKNSRWDRD HIYPQSKIKDDSLDNLVLVKKELNAVKDNGEIAPDIQK RMKGFWLSLLRQGFLSKKKFDRLTRTGPFTSEELAGFI SRQLVETSQMSKAVAELLNQLYEDSRVVYVKAGLVSQF RQKDLGVLKSRSVNDYHHAKDAYLNVVVGDMFDRKFTS DPARWFKKNKKVNYSINQVFRRDYEENGKLIWKGIDRG EDGKPLFRDGLIHGGTIDLVRAIAKRNTNIRYTEYTYC ETGQLYNLTLLPKTDTAITIPLKKELPAAKYGGFKGAG TSYFSLIEFDDKKGHHHKQIVGVPIYVANMLEHNENAF IEYLETVCSFRNITVLCEKIKKNALISVNGYPMRIRGE NEILNMLKNNLQLVLSQEGEETLRHIEKYFNKKPGFEP DKEHDGIDRDAMAALYDEMTEKLCTVYKKRPTNQGELL KNNRGLFLNLEKRSEMAKVLSETAKMFGTTAQTTADLS LIKGSKYAGKIVINKNTLGAAKLILIHQSVTGLFETRV EL [SEQ ID NO. 31] MAD2027 65 CACWRN010000001.1 uncultured Cattle MSKKFAGEYYLGLDIGTDSVGWAVTDNQYNVLKFNGKS GTTTG TTTACCA Succini- rumen MWGIRLFDAAQTAAERRMFRTARRRVERRRWRLELLQE AGAGT TCCAGTG clasticum LFQNEIEKKDPDFFQRMKDSALYPEDSKTGKPFALFCD AATGT AGTTTAC sp. KDLNDKLYYKQYPTIYHLRKALLTENSKFDIRLVYLAI AAATT ATTACAA HHILKHRGHFLFNGDFSNVTRFSFAFEQLQTCLCNELD CATAG GTTCAAA MDFECNNVQKLSEILKDTHMSKNDKVKASVALFENSGD GATGG TAAAAAT KKQLQAVIGLFCGAKKKLADVFLDETLNDTEMPSISIA TAAAA TTATTCA DKPYEELRPELESILAEKCCVIDYIKAVYDWAILADML C ACCCGTT DGGEYGNRTYISVARVRQYEKHHDDLKKLKKLVRRYCK [SEQ CTTCGGA SEYKSFFSVAGTDNYCAYIGDDIETDDRKSVKKCKQED ID NO. 
ACCTCCA FYKRIKGLLKKAIENGCPKDEVVEIIKDIDAQVFLPLQ 35] CCGTGTG VTKDNGVIPHQVHEMELKQILKNAEKYYPFLCKKDEEG GAACATT IVTSNKILQLFKFRIPYYVGPLNSRIGKNSWIVRRAEG AAGGTCT KIYPWNFEEKVDFDKSEEGFIRRMTNPCTYMAGADVLP GCTTTGC KYSLLYSEFMVLNELNNVRICGDKLSVEIKQTIIKDLF AGGCC QRTRRVTVRKLCDKLKAEGVISRNSNQKDIDIKGIDQD [SEQ ID LKSSMVSYVDFKNIFGKEIEKYSVQQMCERIIFLLTIH NO. 36] HDDKRRLQKRIRAEFTEAQITDDQLQKVLRLNYQGWGR FSAEFLKELKGVDTETGEVFSIINALRETDDNLMQLLS NRYTFAEELEKYNSNKRKKIEALTYDNIMEGIVASPAI KRSAWQAISIVMELSKIMGREPKRIFVEMARGPEEKKH TISRKNQLLELYKSVKDESRDWKTELETKTESDFRSIK LFLYYTQMGRCMYTGEPIDLDQLANTTIYDRDHIYPQS LTKDDSLNNLVLVKKVENANKGNGLISADIQKKMRGFW AELKKKGLISDEKFSRLTRTTPLSDDELAGFINRQLVE TRQSSKIVADLFHQLYPTTQVVYVKAKIVSDFRHETLD MVKVRSLNDLHHAKDAYLNIVTGNVYYEKFSGNPLTWL RKNPDRNYSLNQMFNYDIVKKTKEGTSYVWKKGKDGSI AVVRRTMERNDILYTRQATENKNGGLFDQNIVSSKNKP FIPVKKGLDVNKYGGYKGITPAYFALIEFTDKKGSRQR LLEAVPLYLRADIDNDSNVLRDFYKNVLGLENPVVILN RIKKNSLLKINGFLIHLRGTTGFSASQLKVQNAVEFSL PHHMEDYVKKLENYEKHIIAERGSTKNSQIKITEWDGI SKEKNLQLYDMFINKMENTIYKFRPANQVSNLKENREV FNSLAVEDQCSVLNQVLMLFVCKPVTANLSLIKGSKNA GNMALSKIISNMRSAYLIHQSVTGLFEQKIDLLKVSSQ KD [SEQ ID NO. 34] MAD2028 66 DHKP01000031.1 Bacillales gut MANKLFIGLDVGSDSVGWAATDENFHLYRLKGKTAWGA GTTTG GCATTGT bacterium meta- RIFSEASDAKGRRGFRVAGRRLARRKERIRLLNTLFDP AGAGC AAGACAA genome LLKEKDPTFLLRLENSAIQNDDPNKPAQAVTDCLLFAN AGTGT CACTGCT KQEEKGFYKRYPTIWHLRKALMDNEDCAFSDIRFLYLA TGTCT ACGTTCA IHHIIKYRGNFLRDGEIKIGQFDYSVFDKLNETLSVLF TATAT AATAAGC DLQSEDEDSQEGHFVGLPKSQYEAFITTANDRNLPKQT AGCTC ATATTGC KKTKLLSMFEKDEESKSFLEMFCTLCAGGEFSTKKLNK GAAAA TACAAGG KGEETFDDTKISFNASYDQNEPNYQEILGDAFDLVDIA C TTCTCCC KAVFDYCDLSDILNGNDNLSNAFVELYDSHKSQLSALK [SEQ TCGGAGA AICKQIDNQSNLKGDASVYVKLFNDPNDKSNYPAFTHN ID NO. 
ATGACCA KTLVDKRCDIHTFDKYVIDTVLPYEPLLMGQDATNWQM 38] TTAGGTC LKSLAEQDRLLQTIALRSTSVIPMQLHQKELKIILKNA ACTTAGA ISRNVKGIAEIEEKILKLFQYKIPYYCGPLTTKSAYSN TAGCCGG VVFKNNEYRPLKPWDYEEAIDWDETKKKFMEGLTNKCT TTCTTCT YLKDKNVLPKQSILYQDFDAWNKLNNLKVNGSKPSLKE GGCTA LKDLFSFVSQRPKTTMKDIQRHFKSDTNSKDKDVVVSG [SEQ ID WNPEDYICCSSRASFGKNGVFDLNNPDSSDPKDLSKCE NO. 39] RMIFLKTIYADSPKDADVAILKEFPDLTNDQKSLLKTI KCKEWSPLSKEFLELRYADKYGEIRESIINLLRSGEGN LMQILAKYDYQERIDAYNADSFQTKSKSQIVSDLIEEM PPKMRRPVIQAVRIVHEVVKVAKKEPDQISIEVTRENN NKEKKQQLTKKAKSRSAQIQTFLKNLVKIDTFEEKRVD EVLEELKKYSDRSINGKHLYLYFLQNGKDAYTGKPINI DDVLSGNKYDTDHVIPQSKMKDDSIDNLVLVERSINQH RSNEYPLPESIRKNPANVAFWSKLKKAGMMSEKKFNNL
TRANPLTEEELSAFVAAQINVVNRSNIVIRDVLKVLYP NAKLIFSKAQYPSQIRKELNIPKLRDLNDTHHAVDAYL NIVSGVSLTERYGNLSFIKAAQKNENQTDYSLNMERYI SSLIQTKEGEKTSLGKLIDQTSRRHDFLLTYRFSYQDS AFYNQTIYKKNAGLIPVHEKLPPERYGGYNSMSTEVNC VVTIKGKKERRYLVGVPHLLLEKGNKVADINKEIANSV PHKENETIAVSLKDIIQLDSMVKKDGLVYLCTTQNKDL VKLKPFGPIFLSRESEVYLSNLNKFVEKYPNIADGNEN YSLKTNRYGEKSIDFLQEKTGNVLKELVDLSNQKRFDY CPMICKLRTIDYRKGVEGKTLTEQLILIRSFVGVFTRK SEALSNGSNFRKARGLVLQDGLVLCSDSITGLYHTERK L [SEQ ID NO. 37] MAD2029 66 DBKT01000013.1 Bacillales gut MADKLFIGLDVGSESVGWAATDENFHLYRLKGKTAWGA GTTTG GCATTGT bacterium meta- RIFSEANDAKTRRGFRVAGRRLARRKERIRLLNTLFDP AGAGC AAGACAA genome LLKKDPAFLLRLENSAIQNDDPNKPIQAIADCPLLVNK AGTGT CACTGCT QEEKDYYKRYPTIWHLRKALMENDDHAFSDIRFLYLAI TGTCT ACGTTCA HHIIKYRGNFLREGDIKIGQFDYSIFDKLNETLAVLFD TATAT AATAAGC LQNEDGENEEGRFIGLPKSQYEAFITCANDRNLPKQPK AGCTC ATATTGC KAKLLSMFEKTEESKAFLEMFCTLCSGGEFSTKKLNAK GAAAA TACAAGG GEETYQDAKISFNSSYDENEGAYQEILGDFFDLVDIAK C TTCTCCA AVFDYCDLSDILNGNDNLSSAFVELYDSHKSQLSALKS [SEQ TTGGAGA ICKRIDNQNGFIGEKSIYVKLFNDPNDKSNYPAFTNNK ID NO. ATGACCA TLVDKRCDIHTFDKYVKETILPYESSLTGRDAVNWQML 41] TTAGGTC KSLAEQDRLLQTIALRSTSVIPMQLHQKELKIILKNAV GCTTAGA SRNIKGVAEIEEKILKLFQYKIPYYCGPLTTKSDYSNV TAGCCAG VFKNNEYRPLKPWDYEEAIDWDGTKQKFMEGLTNKCTY TTCTTCT LKDKNVLPKQSVLYQDFDTWNKLNNLKVNGNKPSLEDL GGCTA NDLFSFVSQRSKTTMRDIQRYLKSKTNSKENDVVVSGW [SEQ ID NSEDYICCSSRASFNKNGIFNLNNSEVLKECERIIFLK NO. 
42] TIYTDSPKDADAAVLKEFPDLTNNQKTLLKTIKCKEWS PLSKEFLELRYSDKYGElRQSIIDLLRNGEGNLMQILA KYDYQEVIDACNAASFQTKSKSQIVSDLIEEMPPKMRR PVIQAVRIVQEVAKVAKKEPDEISIEVTRENNDKEKKQ QLTKKAKSRSTQIQNFLKNLVKIDASEKKQANEVLEEL KKYSDQSINGKHLYLYFLQNGKDAYTGKPINIDDVLSG NKYDTDHIIPQSKMKDDSIDNLVLVEREINQHRSNEYP LPESIRKNPANVAFWRKLKKAGMMSEKKFNNLTRSNPL TEEELGAFVAAQINVVNRSNVVIRDVLKILYPNAKLIF SKAQYPSQIRKELNIPKLRDLNDTHHAVDAYLNIVSGV TLTDRYGNMRFIKASQDEEKHSLNMERYISSLIQTKEG QRTELGELIDQTSRRHDFLLTYRFSYQDSAFYKQTIYK KNAGLIPAHDNLPPERYGGYDSMSTEVNCVATIIGKKT TRYLVGVPHLLIKKAKDGIDVNDELIKLVPHKENEVVK VDLNTTLQLDCTVKKDGFMYLCTSNNIALVKLKPFSPI FLSRESEIYLSNLMKYVEKYPNISDENSEYEFKINREN VDPIKFTEKQSIEVVQDLIIKAKQDRFSYCSMISKLRD INAEEMIHSKSLTEQLKIIKSLIGVFTRKSEILSDKNN FRKSRGAILQEDLFLCSDSITGLYHTERKL [SEQ ID NO. 40] MAD2030 66 DBLD01000015.1 Bacillales gut MEQNTKKLFIGLDVGTDSVGWAATDEYFNLYRLKGKTA GTTTG GCATTGT bacterium meta- WGARLFLDAANAKDRRQHRVSGRRLARRKERIRLLNAL AGAGC AAGACAA genome FDPLLKKVDPTFLLRLESSTLQNDDPNKDQRAVSDALL AGTGT CACTGCA FGNKKHEKAYYAAFPTIWHLRKALIENDDKAFSDIRYL TGTCT CGTTCAA YLAIHHIIKYRGNFLRQGEIKIGEFDFSCFDKLNQFFD TAAAT ATAAGCA IYFSKEDEEEVEFIGLPNENYQRFIDCAADKNLGKGKK AGCTC GATTGCT KGDLLKLMSFSEDEKPFCEMFCSLCAGLAFSTKKLNKK GAAAA ACAAGGT DETVFEDIKVEFNGKFDDKQEEIKSVLGDAYDLVELAK C TCCCGTA FIFDYCDLKDILGASTNRLSEAFAGIYDSHKEELKALK [SEQ AGGGAAT GICREIDRSLGNESKNSLYREVFNDKGIPNNYAAFIHH ID NO. GACCATC ETNSSRCGIADFNNYVLQKIEPLENLLSKQNYKNWIQL 44] TGGTCAC KQLASQGRLLQTIAIRSTSIIPMQLHLKDLKLILANAE ATGAATA KRDIPGIKDIKEKILLLFQFKVPYYCGPLTDRSQYSNV GCCCCCG VLKAGTREKITPWNFADQVDLEETKKKFMEGLTNKCTY GCAACGG LKDCNVLPRQSLMFQEYDAWNKLNNLSINGNKPSPEEM TGGCTG NALFDFASKRRKTTMSDIKKFEKRATMSKENDVTVSGW [SEQ ID NENDFIDLSSFVSLSGFFDLGEIHSADYMACEEAILLK NO. 
45] TIFTDAPQDADPIIAEKFPNLKPNQLAALKKMSCKGWA TLSREFLTLKAVDADGEVMNETLLGLMKEGKGNLMQLL HSSLYNFQDVIDSHNRAVFGDKSPKQIANDLIEEMPPQ MRRPVIQALRIVREVSKVAKKQPDVISIEVTRESNDKK KKEEWSKKATDRKKQIDLFLKNLKKTEDVKQTESELDG QAINDIDSIRGKHLYLYFLQNGKDAYTGLPIDINDVLN GTKYDTDHIIPQSLMKDDSIDNLVLVNREKNQHKSNEF PLPRDIQTKANIERWRALKKAGGMSEKKFNNLTRTTPL TEEELSAFVAAQINVVNRSNVVIRDVLKILYPNAKLIF SKAQYPSQIRRDLEIPKLRDLNDTHHAVDAFLNIVSGV ELTKQFGRMDVIKAAAKGDKDHSLNMTRYLERLLKKVD ENKNETMTELGNHVFVTSQRHDFLLTYRFDYQDSAFYN ATIYSPDKNLIPMHDGMDPERYGGYSSLNIEYNCIATI KGKKKTTRYLLGVPHLLALKFKNDGIDITSDLIKLVPH KGDEEVSIDWKNPIPLRITVKKDGVEYLLAPFNAQVME LKPVSPVFLPREAAEYLARLKKAVDQKKQFIYQNSAEI FQSKDKNNALQFGPEQSKNVALKIYALADAKKYDYCAM ISKLRDAALRAEMLDSLSSEALFKQYNDLISLLSQLTR RSKKISSKYFSKSRGALLQDGLKIVSKSITGLYETERN L [SEQ ID NO. 43] MAD2031 141 CACVOG010000001.1 uncultured Cattle MNYILGLDIGIASVGWAAVALDANDEPCKILDLNARIF ATTGT TTGTAAT Seleno- rumen EAAEQPKTGASLAAPRREARGSRRRTRRRRHRMERLRH ACCAT AACCTAT monadaceae LFAREELISAENIAALFEAPADVYRLRAEGLSRRLDEG AGCGA TTTACCT bacterium EWARVLYHIAKRRGFKSNRKGAASDADEGKVLEAVKEN GTTAA CGCTATG EALLKNYKTVGEMMFRDEKFQTAKRNKGGSYTFCVSRG ATTAG GCACAAT MLAEEIGELFAAQREQGNPHASETFETAYSKIFADQRS GGAAT TTGTTAT FDDGPDANSRSPYAGNQIEKMIGTCSLETDPPEKRAAK TACAA TACATGG ASYSFMRFSLLQKINHLRLKDAKGEERPLTDEERAAVE C ACATTAT ALAWKSPSLTYGAIRKALPLPDELRFTDLYYRWDKKPE [SEQ ACTAAAC EIEKKKLPFAAPYHEIRKALDKREKGRIQSLTPDALDA ID NO. ATTTCCT VGYAFTVFKNDAKIEAALSAAGIDGEDAVALMAAGLTF 47] AAAAAAG RGFGHISVKACRKLIPHLEKGMTYDKACKEAGYDLQKT CAACGAA GGEKTKLLSGNLDEIREIPNPVVRRAIAQTVKVVNAVI AAACGTG RRYGSPVAVNVELAREMGRTFQERRDMMKSMEDNNAEN CTGGCAG EKRKEELKGYGVVHPSGLDIVKLKLYKEQGGVCAYSLA CAA AMPIEKVLKDHDYAEVDHILPYSRSFDDSYANKVLVLS [SEQ ID KENRDKGNRTPMEYMANMPGRRHDFITWVKSAVRNPRK NO. 
48] RDNLLLEKFGEDKEAAWKERHLTDTKYIGSFIANLLRD HLEFAPWLNGKKKQHVLAVNGAVTDYTRKRLGIRKIRE DGDLHHAVDAAVIATVTQGNIQKLTDYSKQIERAFVKN RDGRYVNPDTGEVLKKDEWIVQRSRHFPEPWPGFRHEL EARVSDHPKEMIESLRLPTYTPEEIDGLKPPFVSRMPT RKVRGAAHLETVVSPRLKDEGMIVKKVSLDALKLTKDK DAIENYYAPESDHLLYEALLHRLQAFGGDGEKAFAESF HKPKADGTPGPVVKKVKIAEKSTLSVPVHHGRGLAANG GMVRVDVFFIPEGKDRGYYLVPVYTSDVVRGELPMRAV VQGKSYAEWKLMREEDFIFSLYPNDLVYIEHEKGVKVK IQKKLREISTLPREKTMTSGLFYYRTMGIAVASIHIYA PDGVYVQESLGVKTLKEFKKWTIDILGGEPHPVQKEKR QDFASVKRDPHAAKSTSSG [SEQ ID NO. 46] MAD2032 141 CACVWE010000020.1 uncultured Cattle MKYIIGLDMGITSVGFATMMLDDKDEPCRIIRMGSRIF GTTGT ATTGTAT Rumino- rumen EAAEHPKDGSSLAAPRRINRGMRRRLRRKSHRKERIKD AGTTC CATACCA coccus LIIKNELMTADEISAIYSTGKQLSDIYQIRAEALDRKL CCTAA AGAACAA sp. NTEEFVRLLIHLSQRRGFKSNRKVDAKEKGSDAGKLLS TTATT TTAGGTT AVNSNKELMIEKNYRTIGEMLYKDEKFSEYKRNKADDY CTTGG ACTATGA SNTFARSEYEDEIRQIFSAQQEHGNPYATDELKESYLD TATGG TAAGGTA IYLSQRSFDEGPGGSSPYGGNQIEKMIGNCTLEPEEKR TATAA GTATACC AAKATFSFEYFNLLSKVNSIKIVSSSGKRALNNDERQS T GCAAAGC VIRLAFAKNAISYTSLRKELNMEYSERFNISYSQSDKS [SEQ TCTAACA IEEIEKKTKFTYLTAYHTFKKAYGSVFVEWSADKKNSL ID NO. CCTCATC AYALTAYKNDTKIIEYLTQKGFDAAETDIALTLPSFSK 50] TTCGGAT WGNLSEKALNNIIPYLEQGMLYHDACTAAGYNFKADDT GAGGTGT DKRMYLPAHEKEAPELDDITNPVVRRAISQTIKVINAL TATCT IREMGESPCFVNIELARELSKNKAERSKIEKGQKENQV [SEQ ID RNDRIMERLRNEFGLLSPTGQDLIKLKLWEEQDGICPY NO. 51] SLKPIKIEKLFDVGYTDIDHIIPYSLSFDDTYNNKVLV MSSENRQKGNRIPMQYLEGKRQDDFWLWVDNSNLSRRK KQNLTKETLSEDDLSGFKKRNLQDTQYLSRFMMNYLKK YLALAPNTTGRKNTIQAVNGAVTSYLRKRWGIQKVREN GDTHHAVDAVVISCVTAGMTKRVSEYAKYKETEFQNPQ TGEFFDVDIRTGEVINRFPLPYARFRNELLMRCSENPS RILHEMPLPTYAADEKVAPIFVSRMPKHKVKGSAHKET IRRAFEEDGKKYTVSKVPLTDLKLKNGEIENYYNPESD GLLYNALKEQUAFGGDAAKAFEQPFYKPKSDGSEGPLV KKVKLINKATLTVPVLNNTAVADNGSMVRVDVFFVEGE GYYLVPIYVADTVKKELPNKAIIANKPYEEWKEMREEN FVFSLYPNDLIKISSRKDMKFNLVNKESTLAPNCQSKE ALVYYKGSDISTAAVTAINHDNTYKLRGLGVKTLLKIE KYQVDVLGNVFKVGKEKRVRFK [SEQ ID NO. 
49] MAD2033 141 DCJP01000021.1 un- Feces MKNTLYGIGLDIGVASVGWAVVGLNGTGEPVGLHRLGV GTTGT TTATACC cultivated of RIFDKAEQPKTGESLAAPRRMARGMRRRLRRKALRRAD AGTTC ATACCAA Faecali three- VYALLERSGLSTREALAQMFEAGGLEDIYALRTRALDE CCTAA GAACTGT bacterium weeks PVGKAEFSRILLHLAQRRGFKSNRRTASDGEDGRLLAA CAGTT TATGGTT sp. old VNENRRRMAQGGWRTVGEMLYRHEAFALRKRNKADEYL CTTGG GCTATGA elephant STVGRDMVAEEASLLFQRQRELGCAWATPELQAEYLSI TATGG TAAGGTC LLRQRSFDEGPGGNSPYGGNQVEKMVGRCTFEPDEPRA TATAA TTAGCAC AKAAYSFEYFSLLQKLNHIRLAENGETRPLTQPQRQQL T CGTAAAG LSLAHKTPDVSLARIRKELALPETVQFNGVRCRANETL [SEQ CTCTGAC EESEKKEKFACLPAYHKMRKALDGVVKGRISSLSISQR ID NO. GCCTCGC DAAATALSLYKNEDTLRAKLTEAGFQAPEIDALAGLTG 53] TTTCAGC FSKFGHLSLKACRKLIPHLEQGLTYDQACSAAGYDFKG GGGGCGT HGAGERAFTLPAAAPEMEQITSPVVRRAVAQTIKVVNG CATCTTT IIREMDASPAWVRIELARELSKTFGERQEMDRSMRENA TTTGCCC AQNERLMQELRDTFHLLSPTGQDLVKYRLWKEQDGVCA AAAAGAC YSLRRLDVERLFEPGYVDVDHIVPYSLSFDDRRSNKVL ACGGATA VLSSENRQKGNRLPLQYLQGKRREDFIVWTNSSVRDYR TTTTT KRQNLLREKFSGDEAEGFRQRNLQDTQHMARFLYNYIS [SEQ ID DHLAFAQSEALGKKRVFAVSGAVTSHLRKRWGLSKVRA NO. 54] DGDLHHALDAAVIACTTDGMIRRISGYYGHIEGEYLQD ADGAGSQHARTKERFPAPWPRFRDELIVRLSEQPGEHL LDINPAFYCEYGTEHICPVFVSRMPRRKVTGPGHKETI KGAAAADEGLLTVRKALTELKLDKDGEIKDYYMPSSDT LLYEALKAQLRRFGGDGKKAFAEPFYKPKADGTPGPLV RKVKTIEKATLTVPVHGGAASNDTMVRVDVFLVPGDGY YWVPVYVADTLKPELPNRAVVAFKPYSEWKEMREEDFI FSLYPNDLVYVEHKSGLKFTLQNADSTLEKTWVPKASF AYFVGGDISTAAISLRTHDNAYGLRGLGIKTLKVLKKY QVDVLGNISPVHRETRQRFR [SEQ ID NO. 
52] MAD2034 141 CACXAV010000001.1 uncultured Cattle MAYGIGLDIGIASVGFATVALNEQDEPCGILRMGSRIF GTTGT TTATACC Clostri- rumen DAAEHPKNGASLAAPRREARSARRRLRRHRHRLERIRN AGTTC ATACCAA diales LLVESCLISQDGLGSLFEGRLEDIYALRTRALDERLTD CCTAA GAACTGT bacterium AELCRVLIHLAQRRGFRSNRKADAADKEAGKLLKAVSE CGGTT TGGGTTA NDRRMEENGYRTVGEMLYKDPLFAEHRRNKGEAYLSTV CTTGG CTACAAT TRTAVEQEARLVLSTQREKGNAAITEDFVEKYLDILLS TATGG AAGGTAG QRPFDVGPGGNSPYGGNMIEKMIGRCTFEPDELRAPKA TATAA TAAACCG SYSFEYFQLLQKVNHIRLLRDGRSEPLSEEQRRAIIDL T AAAAGCT ALASADVTFAKIRKALSLPDSVRFNDVYYRESAEEAEK [SEQ CTGACGT KKKLGCMDAYHEMRKALDKVAKGRICAIPVEQRNAIAY ID NO. CTTGTTT VLTVHKTDERILTELQNINLERSDIDQLMQMKGFSKFG 56] GCGCAGG HLSIKACDRIIPYLEQGMTYSDACTAAGYAFRGHEGGE ACGTCAT HSLYLPAQTPEMDEITSPVVRRAVSQTIKVVNALIREQ CTTTATA GESPTFVNIELAREMSKDFAERNDIRRENEKNAKANEA TCAGACG VMNELRRTFGLVNPSGQDLVKYKLFLEQGGVCPYTQRP GATG MEPGRLFEAGYADVDHIVPYSISFDDRYCNKVLTFASV [SEQ ID NRKEKGNRLPLQFLKGERRESFIVYVKANVRDYRKQRL NO. 57] LLKETVTEEDRKGFRDRNLQDTKHMAAFLHSYINDHLQ FAPFQTDRKRHVTAVNGAVTAYLRKRWGIRKVRAEGDL HHASDALVIACTTPGMIQRLSRYAELREAEYMQTEDGA VRFDPATGEVLEKFPYPWPCFRQEWTARVSDDPQAMLQ DMKLTDYRGLPLEQVKPVFVSRMPKHKVTGAAHKDTVK SAKALDRGVVLVKRALTDLKLKDGEIENYYDPASDRLL YEALKERLIAFGGDAQKAFAEPFHKPKRDGTPGPLVKK VKLMEKSSLTVPVHDGKGVADNDSMVRIDVFFVAGEGY YFVPIYVADTVKPELPNRAVVANKPYAEWKEMKDEDFL FSLYPSDLMRVTQKKGIKLSLINKESTLKKEEMAQSIL LYYVKGSISTGSITAENHDRTYAINSLGIKTLEKLEKY QVDVLGNVSPVGKEKRLTFC [SEQ ID NO. 55] MAD2035 141 CADATZ010000012.1 uncultured Cattle MLPYAIGLDIGIASVGWAVVGLDTNERPFCILGMGSRI GTTGT TTATACC Chloroflexi rumen FDKAEQPKTGASLALPRREARSLRRRLRRHRHRNERIR AGTCC ATTCCAG bacterium NLLLREKIISESELQDLFSGTLSDIYQLRVEALDRKLD CCTGA AAACTAT DKEFSRVLIHIAQRRGFKSNRKNAAASQEDGKLLSAVT TGGTT TATGGTC ENQQRMNDKGYRTVSEMLLRDDKFKDHKRNKGGEYLTT TCTGG ACTACAA VTRTMVEDEVHKIFSAQRTHGNLKADNQLESEYLEILL AATGG TAAGGTA SQRSFDEGPGGDSPYGGSQIEKMIGKCTFFPEEKRAAK TATAA TTAGACC ATYTFEYFNLLEKINHIRLVSKDNLPEPLSDFQRRSLI T GTAGAGC ELAYKVENLTYDRIRKELHISPELKFNTIRYESDDLPE [SEQ ACTAACA NEKKQKLNCLKAYHElRKALDKLGKGTINTLSKEQLNT ID NO. 
CCCCATT IGTVLSMYKTSEIIKNKMEQIPAEIVDKLDEEGINFSK 59] TGGGGTG FGHLSIKACELIIPGLEKGLNYNDACEEAGLNFKAHNN TTATCTC EEKSFLLHPTEDDYADITSPVVKRAASQTIKVINAIIR TTTAAAC KQGCSPTYINIEVARELSKDFYERDKINKRNEANRAEN TGTCCAA ERSLEQIRKEYGKSNASGLDLVKFKLYQKQDGVCAYSQ AATTTAG KQISFERLFEPNYVEVDHIIPYSKCFDDRESNKVLVFA TATTGCA KENREKGNRLPLEYLDGKKRESFIVWVNSKVKDYRKKQ ATTATTG NLLKESLSEEEEKQFKERNLQDTKTVSKFLMNYINDNL A IFSSSNKRKKHVTAVSGGVTSYMRKRWGISKVREDGDQ [SEQ ID HHAVDALVIVCTTDGMIQQVSKYVEYKECQYIQTDAGS NO. 60] LAVDPYTGEVLRSFPYPWARFHEDAVTWTEKIFVSRMP MRKVTGPAHKETIKSPKALGEGLLIVRKPLTELKLKNG EIENYYKPEADLLLYNGLKERLMEFGGDAKKAFAEPFP KPGNPQKIVKKVRLTEKSTLNVPVLKGEGRADNDSMVR VDVFLKDGKYYLVPIYVADTLKPELPNKACIAHKPYDE WATMDDGDFLFSLYPNDLIYIKHKKGIKLTKINKNSTL ADSIEGKEFFLFYKTMGISSAVLTCTNHDNTYYIESLG VKTLESLEKCVVGVLGEIHKVRKEKRTGFSGN [SEQ ID NO. 58] MAD2036 141 CADAWQ010000026.1 Ruminoe- Cattle MLPYAIGLDIGISSVGWASVALDEEDKPCGIIGMGSRI GTTAT TTATACC coccacea rumen FDAAEQPKTGDSLAAPRRAARSARRRLRRRRHRNERIR AGTTC ATACCAA bacterium ALMLREGLLSEAELAALFDGRLEDICALRVRALDEAVT CCTGT GAACGAA
NDELARILLHLSQRRGFRSNRKTAATQEDGELLAAVSA TCGTT GCAGGTT NRALMQERGYRTVAEMLLRDERYRDHRRNKGGAYIATV CTTGG ACTATGA GRDMVEDEVRQIFAAQRALGSTAASETLETAYLEILLS TATGG TAAGGTA QRSFDAGPGEPSPYAGGQIERMIGRCTFEPDEPRAARA TATAA GTATACC TYSFEYFSLLEAVNHIRLTEAGESVPLTKEQREKLIAL T GCAGAGC AHRTADLSYAKIRKELGVPESQRFNMVTYGKTDSADEA [SEQ TCCAACG EKKTKLKQLRAYHQMRAAFEKAAKGSFVLLTKEQRNAV ID NO. CCTCGCT GQTLSIYKTSDNIRPRLREAGLTEAEIDVAEGLSFSKF 62] TTTGCGG GHLSVKACDKIIPFLEQGMKYSEACVAAGYAFRGHEGQ GGCGTTG DKQRLLPPLDNDAKDTITSPVVLRAVSQTIKVVNAIIR TCTCT ERGGSPTFINIELAREMAKDFSERSQIKREQDSNRARN [SEQ ID ERMMERIKTEYGKSSPTGLDLVKLKLYEEQAGVCAYSL NO. 63] KQMSLEHLFDPNYAEIDHIIPYSISFDDGYKNKVLVLA KENRDKGNRLPLEYLNGKRREDFIVWVNSSVRDWRKKQ NLLKEHVTPEDEAKFKERNLQDTKTASRFLLNYIADNL AFAPFQTERKKRVTAVNGSVTAYLRKRWGIAKVRANGD LHHAVDALVIACTTDGLIQKVSRYACYQENRYSEAGGV IVDSATGEVVAQFPEPWPRFRHELEARLSDDPARAVLG LGLAHYMTGEIRPRPLFVSRMPRRKVTGAAHKETVKSP RALDEGQLVTKTPLSALKLGKDGEIPGYYKPESDRLLY EALKARLRQFGGDGKKAFAEPFHKPKHDGTPGPVVTKV KLCEPATLSVPVHGGLGAANNDSMVRIDVFHVEGDGYY FVPIYIADTLKLELPNKACVKIKKISEWKHMKPQDFMF SLYPNDLFRIVSKKGITLNLVSKESTLPTSVNVSDTLL YFVSAGIASACLTCRNHDNTYQIESLGIKTLEKLEKYT VDVLGNVHRVEKEPRMSFSQKGD [SEQ ID NO. 61] MAD2037 141 DGSQ01000028.1 Clostri- low MLPYGIGLDIGITSVGWATVALDENDRPYGIIGMGSRI GTTAT TTATACC diales methane FDAAEQPKTGESLAAPRRAARSARRRLRRHRHRNERIR AGTTC ATACCAA bacterium producing ALILRENLLSEGQLLHLYDGQLSDVYSLRVKALDERVS CCTGA GAACTAT sheep NEEFARILIHISQRRGFKSNRKGASSKEDSELLAAISA TAGTT GAGGTTG NQVRMQQQGYRTVAEMYLKDPIYQEHRRNKGGNYIATV CTTGG CTATAAT SRAMVEDEVHQIFTGQRACGNPAATKELEEAYVEILLS TATGG AAGGTAG QRSFDDGPGDGSPYAGSQIERMIGKCQLEKEAGEPRAA TATAA TAAACCG KATYSFEYFSLLAAINNISIISNGQLSPLTKEQREMLI T CAGAGCT ALAHKTSELNYARIRKELGLSEAQRFNTVSYGKMEIAE [SEQ CTAACGC AEKKTKFEHLKAYHKMRREFERIAKGHFASITIEQRNA ID NO. 
CTCACAT IGDVLSKYKTDAKIRPALREAGLTELDIDAAEALNFSK 65] TTGTGGG FGHISIKACKKIIPWLEQGMKYSEACNAAGYNFKGHDG GCGTTAT QEKSHLLPPLDEESRNVITSPVALRAISQTIKVVNAII CTCT RERGCSPTFINIELAREMSKDFYERIEIKKEQDGNRAK NERMMERIRTEYGKASPTGQDLVKFKLYEEQGGVCAYS [SEQ ID LKQMSLAHLFEPDYAEVDHIVPYSISFDDGYKNKVLVL NO. 66] AKENRDKGNRLPLQYLQGKRREDFIAWVNSCVRDYKKR QRLLKESISEDDLRAFKERNLQDTKTASRFLLNYISDH LEFTQFATERKKHVTAVNGSVTAYLRKRWGITKIRENG DLHHAVDALVIACTTDGMIQQVSRFAQHRENQYSLAED SRFIIDPETGEVIKEFPYPWPRFRQELEARLSSNPGLA VRDRGFLLYMAESIPVHPLFVSRMPRRKVTGAAHKETI KSGKAQKDGLLIVKKPLTDLKLDKEGEIANYYNPMSDR LLYEALKKRLTAFNGDGKKAFADPFYKPKSDGTQGPLV NKVKLCEPSTLNVSVIGGKGVAENDSMVRIDVFRVEGD GYYFVPVYVADTVKPELPNKACVANKPYTDWKEMRESD FLFSLYPNDLLKVTHKKALILTKAQKDSDLPDCKETKS EMLYFVSASISTASLACRTHDNSYRINSLGIKTLEALE KYTVDVLGEYHPVRRETRQTFTGRESSGHSGIS [SEQ ID NO. 64] MAD2038 141 CACWHR010000008.1 Rumino- Cattle MRPYGIGLDIGISSVGWAAIALDHQDSPCGILDMGARI GTTGT TTATACC coccaceae rumen FDAAENPKDGASLAAPRREKRSQRRRLRRHRHRNERIR AGTTC ATACCAA bacterium RMLLKEGLLTEAELTGLFDGALEDIYALRTRALDEALT CCTGA GAACGAT KQEFARVLLHLSQRRGFRSNRRATAAQEDGKLLDAVSE TCGTT CAGGTTG NAKRMADCGYRTVGEMLCRDATFAKHKRNKGGEYLTTV CTTGG CTACAAT SRAMIEDEVKLVFASQRRLGSAFASEALEQGYLDILLS TATGG AAGGTAG QRSFDEGPGGNSPYGGAQIERMIGKCTFYPEEPRAARA TATAA TAAACCG CYSFEYFSLLQKVNHIRLQKDGESTPLTSEQRLQLIEL T AAGAGCT ANKTENLDYARIRRALQIPDAYRFNTVSYRIESDPAAA [SEQ CTAACGC EKKEKFQYLRAYHTMRKAIDGASKGRFALLSQEQRDQI ID NO. CCCGTTT GTVLTLYKSQERISEKLTEAGIEPCDIAALESVSGFSK 68] CTTTACG TGHISLRACKELIPYLEQGMNYNEACAAAGIEFHGHSG GGGCGTT TERTVVLHPTPDDLADITSPVVRRAVAQTVKVINAVIR ATCTCT RYGSPVFVNIELARELAKDFTERKKLEKDNKTNRAENE [SEQ ID RLMRRIREEYGKMNPTGLDLVKLRLYEEQAGVCPYSQK NO. 
69] QMSLQRLFEPNYAEVDHIIPYSISFDDSRRNKVLVLAE ENRNKGNRLPLQYLTGERRDNFIVWVNSSVRDYRKKQK LLKPTVTDEDKQQFKERNLQDTKTMSRFLMNYINDHLQ FGVSAKERKKRVTAVNGIVTSYLRKRWGITKIRGDGDL HHAVDALVIACATDGMIRQITRYAQYRECRYMQTDTGS AAIDEATGEVLRIFPYPWEHFRKELEARLSSDPARAVN ALRLPFYLDSGEPLPKPLFVSRMPRRKVSGAAHKDTVK SPKAMAEGKVIVRRALTDLKLKNGEIENYFDPGSDRLL YDALKARLAAFGGDGAKAFREPFYKPRHDGTPGPLVKK VKLCEPTTLNVAVHGGKGVADNDSMVRIDVFRVEGDGY YFVPIYIADTLKPVLPNKACVAFKPYSEWRTMDDRDFI FSLYPNDLIRVTHKSALKLSRVSKESTLPESIESKTAL LYYVSAGISGAAVSCRNHDNSYEIKSMGIKTLEKLEKY TVDVLGEYHKVEKERRMPFTGKRS [SEQ ID NO. 67] MAD2039 141 CACZLL010000017.1 Rumino- Cattle MRPYAIGLDIGITSVGWATVALDADESPCGIIGLGSRI GTTAT TTATACC coccaceae rumen FDAAEQPKTGESLAAPRRAARGSRRRLRRHRHRNERIR AGTTC ATACCAA bacterium SLMLEERLISQDELETLFDGRLEDIYALRVKALDEIVS CCTGA GAACTAT RTDFARILLHISQRRGFKSNRKNPTTKEDGVLLAAVNE TAGTT TTAGGTT NKQRMSEHGYRTVGEMFLLDETFKDHKRNKGGNYITTV CTTGG ACTATGA ARDMVADEVRAIFSAQRELGASFASEEFEERYLEILLS TATGG TAAGGTT QRSFDEGPGGNSPYGGSQIERMVGRCTFFPDEPRAAKA TATAA TAGTACA TYSFEYFTLLQKVNHIRIVENGVASKLTDEQRRIIIEL T CCTTAGA AHTTKDVSYAKIRKVLKLSDKQLFNIRYSDNSPAEDSE [SEQ GCTCTGA KKEKLGIMKAYHQMRSAIDRVSKGRFAMMPRAQRNAIG ID NO. CGCCTCG TALSLYKTSDKIRKYLTDAGLDEIDINSADSIGSFSKF 71] CTTTTGC GHISVKACDMLIPFLEQGMNYNEACAAAGLNFKGHDAG GAGGCGT EKSKLLHPKEEDYEDITSPVVRRAIAQTIKVINAIIRR TATCTCT EGCSPTFINIELAREMAKDFRERNRIKKENDDNRAKNE TTATATT RLLERIRTEYGKNNPTGLDLVKLRLYEEQSGVCMYSLK GCCAAAA QMSLEKLFEPNYAEVDHIVPYSISFDDSRKNKVLVLTE ATGCAAA ENRNKGNRLPLQYLKGRRREDFIVWVNNNVKDYRKRRL TATATCG LLKEELTAEDESGFKERNLQDTKTMSRFLLNYIADNLE TACAATG FAESTRGRKKKVTAVNGAVTAYMRKRWGITKIREDGDC GTGGC HHAVDAVVIACTTDAMIRQVSRYAQFRECEYMQTESGS [SEQ ID VAVDTGTGEVLRTFPYPWPDFRKELEARLANDPAKVIN NO. 72] DLHLPFYMSAGRPLPEPVFVSRMPRRKVTGAAHKDTIK SARELDNGYLIVKRPLTDLKLKNGEIENYYNPQSDKCL YDALKNALIEHGGDAKKAFAGEFRKPKRDGTPGPIVKK VKLLEPTTMCVPVHGGKGAADNDSMVRVDVFLSGGKYY LVPIYVADTLKPELPNKAVTRGKKYSEWLEMADEDFIF SLYPNDLICATSKNGITLSVCRKDSTLPPTVESKSFML YYRGTDISTGSISCITHDNAYKLRGLGVKTLEKLEKYT VDVLGEYHKVGKEVRQPFNIKRRKACPSEML [SEQ ID NO. 
70] MAD2040 141 DHKF01000115.1 Clostri- Feces MHRYAIGLDIGITSVGWAAIALDAEENPCGMLDFGSRI GTTGT TTATACC diales FTGAEHPKTGASLAAPRREARGARRRLRRHRHRNERIR AGTTC ATACCAA bacterium RLMVSGGLISQEQLESLFAGQLEDIYALRTRALDEQVA CCTGA GAACTGC UBA4701 REELARIMLHLSQRRGFRSNRKGGADAEDGKLLEAVGD TGGTT TCAGGTT NKRRMDEKGYRTAGEMFFKDEAFAAHKRNKGGNYIATV CTTGG ACTATGA TRAMTEDEVHRIFAAQRGFGAEYANEKLEAAYLDILLS TATGG TAAGGTA QRSFDEGPGGDSPYGGSQIERMIGTCAFEPDQPRAAKA TATAA GTAAACC AYSFEYFSLLEKLNHIRLVSGGKSEPLTDAQRKKLIEL T GAAGAGC AHKQDTLSYAKIRKELELNEAVRFNSVRYTDDATFEEQ [SEQ TCTAATG EKKEKIVCMKAYHAMRKAVDKNAKGRFAYLTIPQRNEI ID NO. CCCCGTC GRVLSTYKTSAKIEPALAAAGIEPCDIAALEGLSFSKF 74] TCGCACG GHLSIKACDKLIPFLEKAMNYNDACAAAGYDFRGHSRD GGGCATT GRQMYLPPLGGDCTEITSPVVRRAVSQTIKVINAIIRR ATCTCTA YGTSPVYVNIELAREMSKDFAERNKIKKQNDDNRSKNE ACAGCGA KIKEQVAEYKHGAATGLDIVKMKLFNEQGGICAYSQRQ AAAGGCA MSLERLFDPNYAEVDHIVPYSISFDDRYKNKVLVLTEE AA NRNKGNRLPLQYLTGERRDRFIVWVNNSVRDFQKRKLL [SEQ ID LKEALTPEEENDWKERNLQDTKFVSSFLLNYINDNLLF NO. 75] APSVRRKKRVTAVNGAVTDYMRKRWGISKVREDGDRHH AVDAVVIACTNDALIQKVSRYESWHERHYMPTENGSIL VDPATGEIKQTFPYPWAMFRKELEARLSNDPSRAVADL KLPFYMDADAPPVKPLFVSRMPTRKVTGAAHKDTVKSA RALADGLAIVRRPLTALKLDKDGEIAGYYNKDSDRLLY DALKARLTEYGGNAAKAFAEPFYKPKSDGTPGPVVNKV KLTEPTTLSVPVQDGTGIADNDSMVRIDVFRVVGDGYY FVPVYVADTLKQELPDRAVVAFKAHSEWKVMSDGDFVF SLYPNDLVKVTRKKDVILKRSFDNSTLPETIASNECLL YYAGADISTGAISCVTNDNAYSIRGLGIKTLVSMEKYT VDILGEYHPVRKEERQRFNTKR [SEQ ID NO. 73]
Example 3
Vector Cloning, MADZYME Library Construction and PCR
[0058] The MADzyme coding sequences were cloned into a pUC57 vector with a T7-promoter sequence attached to the 5'-end of the coding sequence and a T7-terminator sequence attached to the 3'-end of the coding sequence.
[0059] First, Q5 Hot Start 2× master mix reagent (NEB, Ipswich, MA) was used to amplify the MADzyme sequences cloned in the pUC57 vector. The forward primer 5'-TTGGGTAACGCCAGGGTTTT [SEQ ID NO. 172] and reverse primer 5'-TGTGTGGAATTGTGAGCGGA [SEQ ID NO. 173] amplified the sequences flanking the MADzyme in the pUC57 vector including the T7-promoter and T7-terminator components at the 5'- and 3'-end of the MADzymes, respectively. 1 µM primers were used in a 10 µL PCR reaction using 3.3 µL boiled cell samples as templates in 96 well PCR plates. The PCR conditions shown in Table 2 were used:
TABLE-US-00002 TABLE 2
STEP               TEMPERATURE   TIME
DENATURATION       98 °C         30 SEC
30 CYCLES          98 °C         10 SEC
                   66 °C         30 SEC
                   72 °C         3 MIN
FINAL EXTENSION    72 °C         2 MIN
HOLD               12 °C
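For planning purposes, the programmed duration of the Table 2 protocol can be worked out with a short sketch. The temperatures and times are transcribed from Table 2; the step labels inside the cycle ("denature", "anneal", "extend") and the function name are assumptions added for illustration, and the result excludes ramp times and the 12 °C hold.

```python
# Cycling program transcribed from Table 2 as (label, temperature in °C, seconds).
# The labels inside CYCLE are assumed; Table 2 lists only temperatures and times.
INITIAL = [("denaturation", 98, 30)]
CYCLE = [("denature", 98, 10), ("anneal", 66, 30), ("extend", 72, 180)]
FINAL = [("final extension", 72, 120)]
N_CYCLES = 30


def programmed_seconds(initial, cycle, final, n_cycles):
    """Sum the programmed hold times, ignoring ramping and the final hold."""
    one_cycle = sum(seconds for _, _, seconds in cycle)
    before = sum(seconds for _, _, seconds in initial)
    after = sum(seconds for _, _, seconds in final)
    return before + n_cycles * one_cycle + after


total = programmed_seconds(INITIAL, CYCLE, FINAL, N_CYCLES)
print(f"{total} s = {total / 60:.1f} min")  # 6750 s = 112.5 min
```

Most of the run time comes from the 3 min step at 72 °C, which accounts for 90 of the roughly 112 programmed minutes.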
Example 4
gRNA Construction
[0060] Several functional gRNAs associated with each MADzyme were designed by truncating the 5' region, the 3' region and the repeat/anti-repeat duplex (see Table 3).
TABLE-US-00003 TABLE 3 gRNA name sgRNAv1 sgRNAv2 sgRNAv3 sgRNAv4 sgRNAv5 sgM GTTTTAGAGCTATGC GTTTTAGAGCTATGC GTTTTAGAGCTATGC GTTTTAGAGCT NONE 2015 TGTTTTGAATGCTTC TGTTTTGAATGCTTC TGTTAACAACATAGC ATGCAAACATA CAAAACGAAATGTTG GTAGCATTCAAAACA AAGTTAAAATAAGGC GCAAGTTAAAA GTAGCATTCAAAACA ACATAGCAAGTTAAA TTTGTCCGTTCTCAA TAAGGCTTTGT ACATAGCAAGTTAAA ATAAGGCTTTGTCCG CTTTTAGTGACGCTG CCGTTCTCAAC ATAAGGCTTTGTCCG TTCTCAACTTTTAGT TTTCGGCG TTTTAGTGACG TTCTCAACTTTTAGT GACGCTGTTTCGGCG [SEQ ID NO. 78] CTGTTTCGGCG GACGCTGTTTCGGCG [SEQ ID NO. 77] [SEQ ID NO. [SEQ ID NO. 76] 79] sgM GTTTTAGAGTCATGT GTTTTAGAGTCATGT GTTTTAGAGTCATGT NONE NONE 2016 TGTTTAGAATGGTAC TGTAAAAACAACATA TGTAAAAACAACATA CAAAACATCTTTTGG GCAAGTTAAAATAAG GCAAGTTAAAATAAG GACTATTCTAAACAA GTTTTAACCGTAATC CGTAATCAACTGTAA CATAGCAAGTTAAAA AACTGTAAAGTGGCG AGTGGCGCTGTTTCG TAAGGTTTTAACCGT CTGTTTCGGCGC GCGC AATCAACTGTAAAGT [SEQ ID NO. 81] [SEQ ID NO. 82] GGCGCTGTTTCGGCG C [SEQ ID NO. 80] sgM GTTTTAGAGCTGTGC GTTTTAGAGCTGTGC GTTTTAGAGCTGTGC GTTTTAGAGCT NONE 2017 TGTTTCGAATGGTTC TGTTTCGAAAAATCG TGTAAAAACAACACA GTGCAAACACA CAAAACGAAATGTTG AAACAACACAGCGAG GCGAGTTAAAATAAG GCGAGTTAAAA GAACTATTCGAAACA TTAAAATAAGGCTTT GCTTTGTCCGTACAC TAAGGCTTTGT ACACAGCGAGTTAAA GTCCGTACACAACTT AACTTGTAAAAGGGG CCGTACACAAC ATAAGGCTTTGTCCG GTAAAAGGGGCACCC CACCCGATTCGGGTG TTGTAAAAGGG TACACAACTTGTAAA GATTCGGGTGC C GCACCCGATTC AGGGGCACCCGATTC [SEQ ID NO. 84] [SEQ ID NO. 85] GGGTGC GGGTGCA [SEQ ID NO. [SEQ ID NO. 83] 86] sgM GTTTTAGAGCTGTGT GTTTTAGAGCTGTGT GTTTTAGAGCTGTGT NONE NONE 2019 TGTTTCGAATGGTTC TGTAAAAACAATACA TGTAAAAACAATACA CAAAACGGTTTGAAA GCAAAGTTAAAATAA GCAAGTTAAAATAAG CCATTCGAAACAATA GGCTAGTCCGTATAC GCTAGTCCGTATACA CAGCAAAGTTAAAAT AACGTGAAAACACGT ACGTGAAAACACGTG AAGGCTAGTCCGTAT GGCACCGATTCGGTG GCACCGATTCGGTGC ACAACGTGAAAACAC C [SEQ ID NO. 89 GTGGCACCGATTCGG [SEQ ID NO. 88] TGC [SEQ ID NO. 
87] sgM GTTTGCTAGTTATGT GTTTGCTAGTTATGT GTTTGCTAGTTATGT NONE NONE 2020 TATTTATAGTATTAA TATAAAAATAACATA TATAAAAATAACATA GCAAACTGTAAATAA ACGAGTGCAAATAAG ACGAGTGCAAATAAG CATAACGAGTGCAAA CGTTTCGCGAAAATT CGTTTCGCGAAAATT TAAGCGTTTCGCGAA TACAGTGGCCCTGCT TACAGTGGCCCTGCT AATTTACAGTGGCCC GTGGGGCCTTTTTTA GTGGGGCC TGCTGTGGGGCCTTT TTTATCAAA [SEQ ID NO. 92] TTTATTTATCAAA [SEQ ID NO. 91] [SEQ ID NO. 90] sgM GTTTGAGAGCCTTGT NONE NONE NONE NONE 2021 AAAACCGTATATCTC TCAAGCGAAAGATAA TGTTTTACAAGGCGA GTTCAAATAAGGATT TATCCGAAATCGCTT GCGTGCATTGGCACC ATCTATCTTTTAAGA CTTTCTTTGAAAGTC TT [SEQ ID NO. 93] sgM GTTTGAGAGTCTTGT GTTTGAGAGTCTTGT GTTTGAGAGTCTTGT GTTTGAGAGTC NONE 2022 TAATTCTTAAAGGTG AAAAACAAGACGAGT AAAAACAAGACGAGT TTGTTAATTCA TAAAACGAGAATTAA GCAAATAAGGTTTAT GCAAATAAGGTTTAT AAAGAATTAAC CAAGACGAGTGCAAA CCGGAATCGTCAATA CCGGAATCGTCAATA AAGACGAGTGC TAAGGTTTATCCGGA TGACCTGCATTGTGC TGACCTGCATTGTGC AAATAAGGTTT ATCGTCAATATGACC AGAATCTTTAAAATC AG [SEQ ID NO. ATCCGGAATCG TGCATTGTGCAGAAT ATATGATTTCATATG 96] TCAATATGACC CTTTAAAATCATATG GTTTTA [SEQ ID TGCATTGTGCA ATTTCATATGGTTTT NO. 95] GAATCTTTAAA A [SEQ ID NO. ATCATATGATT 94] TCATATGGTTT TA [SEQ ID NO. 97] sgM GTTTGAGAGTAGTGT NONE NONE NONE NONE 2023 AAATCCATAGGGGTC TCAAACGAAAAGACC CCTATGGATTTACAT TGCGAGTTCAAATAA AAGTTTACTCAAATC GTTGGCTTGACCAAC CGCACAGCGTGTGCT TAAAGATCTCTTCAG TGAGGTC [SEQ ID NO. 98] sgM GTTTGAGAGTAGTGT NONE NONE NONE NONE 2024 AAATCCAGAGGGCTC CAAAACGAGCCCTCT GGATTTACACTACGA GTTCAAATAAAAATT ATTTCAAATCGCCGC TATGTCGGCCGCACA GTGTGTGCATTAAGA AAAGTCCGAAAGGGC [SEQ ID NO. 99] sgM GTTTGAGAGTAGTGT GTTTGAGAGTAGTGT GTTTGAGAGTAGTGT GTTTGAGAGTA NONE 2025 AAATTTATAGGGTAG AAAAATACACTACGA AAAAATACACTACGA GTGTAAATTTA TAAAACAAATTTTAC GTTCAAATAAAAATT GTTCAAATAAAAATT TAGGAAAACCT TACCCTATAAATTTA ATTTCAAATCGTACT ATTTCAAATCGTACT ATAAATTTACA CACTACGAGTTCAAA TTTTAGTACCTTCAC TTTTAGTACCTTCAC CTACGAGTTCA TAAAAATTATTTCAA AAGTGTTGTGAATAT AAGTGTTGTGAA AATAAAAATTA ATCGTACTTTTTAGT TAACTCACCTTCGGG [SEQ ID NO. 
TTTCAAATCGT ACCTTCACAAGTGTT TGAG [SEQ ID 102] ACTTTTTAGTA GTGAATATTAACTCA NO. 101] CCTTCACAAGT CCTTCGGGTGAG GTTGTGAATAT [SEQ ID NO. TAACTCACCTT 100] CGGGTGAG [SEQ ID NO. 103] sgM GTTTGAGAGTAGTGT NONE NONE NONE NONE 2026 AATTTCATATGGTAG TCAAACGACTACCAT ATGAGATTACACTAC ACGGTTCAAATAAAG AATGTTCGAAACCGC CCTTTGGGGCCCGCT TGTTGCGGATTTACA GACTTGATATCAAGT CTG [SEQ ID NO. 104] sgM GTTTGAGAGTAATGT GTTTGAGAGTAATGT GTTTGAGAGTAATGT GTTTGAGAGTA NONE 2027 AAATTCATAGGATGG AAAAATACATTACAA AAAAATACATTACAA ATGTAAATTCA TAAAACGAAATTTAC GTTCAAATAAAAATT GTTCAAATAAAAATT TAAAAGTGAGT CATCCAGTGAGTTTA TATTCAACCCGTTCT TATTCAACCCGTTCT TTACATTACAA CATTACAAGTTCAAA TCGGAACCTCCACCG TCGGAACCTCCACCG GTTCAAATAAA TAAAAATTTATTCAA TGTGGAACATTAAGG TGTGGA [SEQ ID AATTTATTCAA CCCGTTCTTCGGAAC TCTGCTTTGCAGGCC NO. 107] CCCGTTCTTCG CTCCACCGTGTGGAA [SEQ ID NO. GAACCTCCACC C [SEQ ID NO. 106] GTGTGGAACAT 105] TAAG [SEQ ID NO. 108] sgM GTTTGAGAGCAGTGT NONE NONE NONE NONE 2028 TGTCTTATATAGCTC GAAAACGCATTGTAA GACAACACTGCTACG TTCAAATAAGCATAT TGCTACAAGGTTCTC CCTCGGAGAATGACC ATTAGGTCACTTAGA TAGCCGGTTCTTCTG GCTA [SEQ ID NO. 109] sgM GTTTGAGAGCAGTGT GTTTGAGAGCAGTGT GTTTGAGAGCAGTGT GTTTGAGAGCA NONE 2029 TGTCTTATATAGCTC AAAAACACTGCTACG AAAAACACTGCTACG GTGTTGTCAAA GAAAACGCATTGTAA TTCAAATAAGCATAT TTCAAATAAGCATAT AGACAACACTG GACAACACTGCTACG TGCTACAAGGTTCTC TGCTACAAGGTTCTC CTACGTTCAAA TTCAAATAAGCATAT CATTGGAGAATGACC CATTGGAGAATGACC TAAGCATATTG TGCTACAAGGTTCTC ATTAGGTCGCTTAGA ATTAGGTC [SEQ CTACAAGGTTC CATTGGAGAATGACC TAGCCAGTTCTTCTG ID NO. 112] TCCATTGGAGA ATTAGGTCGCTTAGA GCTA [SEQ ID ATGACCATTAG TAGCCAGTTCTTCTG NO. 111] GTCGCTTAGAT GCTA [SEQ ID AGCCAGTTCTT NO. 110] CTGGCTA [SEQ ID NO. 113] sgM GTTTGAGAGCAGTGT NONE NONE NONE NONE 2030 TGTCTTAAATAGCTC GAAAACGCATTGTAA GACAACACTGCACGT TCAAATAAGCAGATT GCTACAAGGTTCCCG TAAGGGAATGACCAT CTGGTCACATGAATA GCCCCCGGCAACGGT GGCTG [SEQ ID NO. 
114] sgM ATTGTACCATAGCGA NONE NONE NONE NONE 2031 GTTAAATTAGGGAAT TACAACGAAATTGTA ATAACCTATTTTACC TCGCTATGGCACAAT TTGTTATTACATGGA CATTATACTAAACAT TTCCTAAAAAAGCAA CGAAAAACGTGCT [SEQ ID NO. 115] sgM GTTGTAGTTCCCTAA GTTGTAGTTCCCTAA GTTGTAGTTCCCTAA GTTGTAGTTCC NONE 2032 TTATTCTTGGTATGG TTATTCTTGGTAAAA TTATTCTTGGTAAAA CTAATTATTCT TATAATGAAAATTGT ACCAAGAACAATTAG ACCAAGAACAATTAG TGGTATGGTAA ATCATACCAAGAACA GTTACTATGATAAGG GTTACTATGATAAGG AAATATCATAC ATTAGGTTACTATGA TAGTATACCGCAAAG TAGTATACCGCAAAG CAAGAACAATA TAAGGTAGTATACCG CTCTAACACCTCATC CTCTAACACCTCATC GGTTACTATGA CAAAGCTCTAACACC TTCGGATGAGGTGTT TTCGGATGAG [SEQ TAAGGTAGTAT TCATCTTCGGATGAG A [SEQ ID NO. ID NO. 118] ACCGCAAAGCT GTGTTATCT [SEQ 117] CTAACACCTCA ID NO. 116] TCTTCGGATGA GGTGTTATCT [SEQ ID NO. 119] sgM GTTGTAGTTCCCTAA GTTGTAGTTCCCTAA GTTGTAGTTCCCTAA GTTGTAGTTCC NONE 2033 CAGTTCTTGGTATGG CAGTTCTAAAAAGAA CAGTTCTAAAAAGAA CTAACAGTAAA TATAATAAAAATTAT CTGTTATGGTTGCTA CTGTTATGGTTGCTA AACTGTTATGG ACCATACCAAGAACT TGATAAGGTCTTAGC TGATAAGGTCTTAGC TTGCTATGATA GTTATGGTTGCTATG ACCGTAAAGCTCTGA ACCGTAAAGCTCTGA AGGTCTTAGCA ATAAGGTCTTAGCAC CGCCTCGCTTTCAGC CGCCTCGCTTTCAGC CCGTAAAGCTC CGTAAAGCTCTGACG GGGGCGTCA [SEQ GGGG [SEQ ID TGACGCCTCGC CCTCGCTTTCAGCGG ID NO. 121] NO. 122] TTTCAGCGGGG GGCGTCATCTTTTTT CGTCA GCCCAAAAGACACGG [SEQ ID NO. ATATTTTT [SEQ 123] ID NO. 120] sgM GTTGTAGTTCCCTAA GTTGTAGTTCCCTAA GTTGTAGTTCCCTAA GTTGTAGTTCC NONE 2034 CGGTTCTTGGTATGG CGGTACTGTTGGGTT CGGTACTGTTGGGTT CTAACGGTTCT TATAATGAATTATAC ACTACAATAAGGTAG ACTACAATAAGGTAG TGAAAACAAGA CATACCAAGAACTGT TAAACCGAAAAGCTC TAAACCGAAAAGCTC ACTGTTGGGTT TGGGTTACTACAATA TGACGTCTTGTTTGC TGACGTCTTGTTTGC ACTACAATAAG AGGTAGTAAACCGAA GCAGGACGTCATCTT GCAGGACGTCATCTT GTAGTAAACCG AAGCTCTGACGTCTT TATATCAGACGGATG T [SEQ ID NO. AAAAGCTCTGA GTTTGCGCAGGACGT [SEQ ID NO. 126] CGTCTTGTTTG CATCTTTATATCAGA 125] CGCAGGACGTC CGGATG [SEQ ID ATCTTTATATC NO. 124] AGACGGATG [SEQ ID NO. 
127] sgM GTTGTAGTCCCCTGA NONE NONE NONE NONE 2035 TGGTTTCTGGAATGG TATAATGAAATTATA CCATTCCAGAAACTA TTATGGTCACTACAA TAAGGTATTAGACCG TAGAGCACTAACACC CCATTTGGGGTGTTA TCTCTTTAAACTGTC CAAAATTTAGTATTG CAATTATTGA [SEQ ID NO. 128]
sgM GTTATAGTTCCCTGT NONE NONE NONE NONE 2036 TCGTTCTTGGTATGG TATAATGAAATTATA CCATACCAAGAACGA AGCAGGTTACTATGA TAAGGTAGTATACCG CAGAGCTCCAACGCC TCGCTTTTGCGGGGC GTTGTCTCT [SEQ ID NO. 128] sgM GTTATAGTTCCCTGA NONE NONE NONE NONE 2037 TAGTTCTTGGTATGG TATAATGAAATTATA CCATACCAAGAACTA TGAGGTTGCTATAAT AAGGTAGTAAACCGC AGAGCTCTAACGCCT CACATTTGTGGGGCG TTATCTCT [SEQ ID NO. 129] sgM GTTGTAGTTCCCTGA NONE NONE NONE NONE 2038 TCGTTCTTGGTATGG TATAATGAAATTATA CCATACCAAGAACGA TCAGGTTGCTACAAT AAGGTAGTAAACCGA AGAGCTCTAACGCCC CGTTTCTTTACGGGG CGTTATCTCT [SEQ ID NO. 130] sgM GTTATAGTTCCCTGA GTTATAGTTCCCTGA GTTATAGTTCCCTGA GTTATAGTTCC GTTATAGTTC 2039 TAGTTCTTGGTATGG TAGTTCTTGGTATGG TAGTTCTTAACCAAG CTGATAGTTCT CCTGATAGTT TATAATGAATTATAC TATAATGAATTATAC AACTATTTAGGTTAC TGCAAGAACTA CTTGCAAGAA CATACCAAGAACTAT CATACCAAGAACTAT TATGATAAGGTTTAG TTTAGGTTACT CTATTTAGGT TTAGGTTACTATGAT TTAGGTTACTATGAT TACACCTTAGAGCTC ATGATAAGGTT TACTATGATA AAGGTTTAGTACACC AAGGTTTAGTACACC TGACGCCTCGCTTTT TAGTACACCTT AGGTTTAGTA TTAGAGCTCTGACGC TTAGAGCTCTGACGC GCGAGGCGTTATCTC AGAGCTCTGAC CACCTTAGAG CTCGCTTTTGCGAGG CTCGCTTTTGCGAGG T [SEQ ID NO. GCCTCGCTTTT CTCTGACGCC CGTTATCTCTTTATA CGTTATCTCT [SEQ 133] GCGAGGCGTTA AAAAGGCGTT TTGCCAAAAATGCAA ID NO. 132] TCTCT ATCTCT ATATATCGTACAATG [SEQ ID [SEQ ID GTGGC [SEQ ID NO. 134] NO. 135] NO. 131] sgM GTTGTAGTTCCCTGA NONE GTTGTAGTTCCCTGA GTTGTAGTTCC NONE 2040 TGGTTCTTGGTATGG TGGTTCTTGAAAAAG CTGATGGTTCT TATAATAAATTATAC AACTGCTCAGGTTAC TGAAAAAGAAC CATACCAAGAACTGC TATGATAAGGTAGTA TGCTCAGGTTA TCAGGTTACTATGAT AACCGAAGAGCTCTA CTATGATAAGG AAGGTAGTAAACCGA ATGCCCCGTCTCGCA TAGTAAACCGA AGAGCTCTAATGCCC CGGGGCATTATCTCT AGAGCTCTAAT CGTCTCGCACGGGGC [SEQ ID NO. GCCAAAGGGCA ATTATCTCT [SEQ 137] TTATCTCT ID NO. 136] [SEQ ID NO. 138]
[0061] To find the optimal gRNA length, sgRNAs were designed with different lengths of the spacer, the repeat:anti-repeat duplex and the 3' end of the tracrRNA. Each sgRNA was then synthesized as a single-stranded DNA downstream of the T7 promoter (see Table 4). These sgRNA templates were amplified using two primers (5'-AAACCCCTCCGTTTAGAGAG [SEQ ID NO. 174] and 5'-AAGCTAATACGACTCACTATAGGCCAGTC [SEQ ID NO. 175]) in 25 µL PCR reactions, one per sgRNA, each containing 1 µL of single-stranded DNA template diluted to 10 µM, according to the conditions of Table 5.
TABLE-US-00004 TABLE 4 Name Sequence sg M201 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGCGCCGAAACAGCGCCACTTTACAGTTGATT- ACGGT 6v1 TAAAACCTTATTTTAACTTGCTATGTTGTTTAGAATAGTCCCAAAAGATGTTTTGGTACCATTCTAAACA- A CATGACTCTAAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 139] sg M201 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGCGCCGAAACAGCGCCACTTTACAGTTGATT- ACGGT 6v2 TAAAACCTTATTTTAACTTGCTATGTTGTTTTTACAACATGACTCTAAAACCCAGTAACATTACTGACTG- G CCTATAGTGAGTCGTATTA [SEQ ID NO. 140] sg M201 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGCGCCGAAACAGCGCCACTTTACAGTTGATT- ACGCT 6v3 TATTTTAACTTGCTATGTTGTTTTTACAACATGACTCTAAAACCCAGTAACATTACTGACTGGCCTATAG- T GAGTCGTATTA [SEQ ID NO. 141] sg M201 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGCACCGAATCGGTGCCACGTGTTTTCACGTT- GTATA 9v1 CGGACTAGCCTTATTTTAACTTTGCTGTATTGTTTCGAATGGTTTCAAACCGTTTTGGAACCATTCGAAA- C AACACAGCTCTAAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 142] sg M201 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGCACCGAATCGGTGCCACGTGTTTTCACGTT- GTATA 9v2 CGGACTAGCCTTATTTTAACTTTGCTGTATTGTTTTTACAACACAGCTCTAAAACCCAGTAACATTACTG- A CTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 143] sg M201 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGCACCGAATCGGTGCCACGTGTTTTCACGTT- GTATA 9v3 CGGACTAGCCTTATTTTAACTTGCTGTATTGTTTTTACAACACAGCTCTAAAACCCAGTAACATTACTGA- C TGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 144] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATTTGATAAATAAAAAAGGCCCCACAGCAGGG- CCACT 0v1 GTAAATTTTCGCGAAACGCTTATTTGCACTCGTTATGTTATTTACAGTTTGCTTAATACTATAAATAACA- T AACTAGCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 145] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATTTGATAAATAAAAAAGGCCCCACAGCAGGG- CCACT 0v2 GTAAATTTTCGCGAAACGCTTATTTGCACTCGTTATGTTATTTTTATAACATAACTAGCAAACCCAGTAA- C ATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 146] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGGCCCCACAGCAGGGCCACTGTAAATTTTCG- CGAAA 0v3 CGCTTATTTGCACTCGTTATGTTATTTTTATAACATAACTAGCAAACCCAGTAACATTACTGACTGGCCT- A TAGTGAGTCGTATTA [SEQ ID NO. 
147] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATAAAACCATATGAAATCATATGATTTTAAAG- ATTCT 2v1 GCACAATGCAGGTCATATTGACGATTCCGGATAAACCTTATTTGCACTCGTCTTGTTAATTCTTTTGAAT- T AACAAGACTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 148] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATAAAACCATATGAAATCATATGATTTTAAAG- ATTCT 2v2 GCACAATGCAGGTCATATTGACGATTCCGGATAAACCTTATTTGCACTCGTCTTGTTTTTACAAGACTCT- C AAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 149] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTACTGCACAATGCAGGTCATATTGACGATTCCG- GATAA 2v3 ACCTTATTTGCACTCGTCTTGTTTTTACAAGACTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTG- A GTCGTATTA [SEQ ID NO. 150] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGCTCACCCGAAGGTGAGTTAATATTCACAACACTT- GTGAA 5v1 GGTACTAAAAAGTACGATTTGAAATAATTTTTATTTGAACTCGTAGTGTAAATTTATAGGTTTTCCTATA- A ATTTACACTACTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 151] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTACTCACCCGAAGGTGAGTTAATATTCACAACA- CTTGT 5v2 GAAGGTACTAAAAAGTACGATTTGAAATAATTTTTATTTGAACTCGTAGTGTATTTTTACACTACTCTCA- A ACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 152] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATTCACAACACTTGTGAAGGTACTAAAAAGTA- CGATT 5v3 TGAAATAATTTTTATTTGAACTCGTAGTGTATTTTTACACTACTCTCAAACCCAGTAACATTACTGACTG- G CCTATAGTGAGTCGTATTA [SEQ ID NO. 153] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGGCCTGCAAAGCAGACCTTAATGTTCCACAC- GGTGG 7v1 AGGTTCCGAAGAACGGGTTGAATAAATTTTTATTTGAACTTGTAATGTAAACTCACTTTTATGAATTTAC- A TTACTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 154] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGGCCTGCAAAGCAGACCTTAATGTTCCACAC- GGTGG 7v2 AGGTTCCGAAGAACGGGTTGAATAAATTTTTATTTGAACTTGTAATGTATTTTTACATTACTCTCAAACC- C AGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 155] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATCCACACGGTGGAGGTTCCGAAGAACGGGTT- GAATA 7v3 AATTTTTATTTGAACTTGTAATGTATTTTTACATTACTCTCAAACCCAGTAACATTACTGACTGGCCTAT- A GTGAGTCGTATTA [SEQ ID NO. 
156] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATAGCCAGAAGAACTGGCTATCTAAGCGACCT- AATGG 9v1 TCATTCTCCAATGGAGAACCTTGTAGCAATATGCTTATTTGAACGTAGCAGTGTTGTCTTTTGACAACAC- T GCTCTCAAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 157] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATAGCCAGAAGAACTGGCTATCTAAGCGACCT- AATGG 9v2 TCATTCTCCAATGGAGAACCTTGTAGCAATATGCTTATTTGAACGTAGCAGTGTTTTTACACTGCTCTCA- A ACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 158] sg M202 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAGACCTAATGGTCATTCTCCAATGGAGAACCT- TGTAG 9v3 CAATATGCTTATTTGAACGTAGCAGTGTTTTTACACTGCTCTCAAACCCAGTAACATTACTGACTGGCCT- A TAGTGAGTCGTATTA [SEQ ID NO. 159] sg M203 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAAGATAACACCTCATCCGAAGATGAGGTGTTA- GAGCT 2v1 TTGCGGTATACTACCTTATCATAGTAACCTAATTGTTCTTGGTATGATATTTTTACCATACCAAGAATAA- T TAGGGAACTACAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 160] sg M203 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATAACACCTCATCCGAAGATGAGGTGTTAGAG- CTTTG 2v2 CGGTATACTACCTTATCATAGTAACCTAATTGTTCTTGGTTTTTACCAAGAATAATTAGGGAACTACAAC- C CAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 161] sg M203 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTACTCATCCGAAGATGAGGTGTTAGAGCTTTGC- GGTAT 2v3 ACTACCTTATCATAGTAACCTAATTGTTCTTGGTTTTTACCAAGAATAATTAGGGAACTACAACCCAGTA- A CATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 162] sg M203 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATGACGCCCCGCTGAAAGCGAGGCGTCAGAGC- TTTAC 3v1 GGTGCTAAGACCTTATCATAGCAACCATAACAGTTTTTACTGTTAGGGAACTACAACCCAGTAACATTAC- T GACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 163] sg M203 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTATGACGCCCCGCTGAAAGCGAGGCGTCAGAGC- TTTAC 3v2 GGTGCTAAGACCTTATCATAGCAACCATAACAGTTCTTTTTAGAACTGTTAGGGAACTACAACCCAGTAA- C ATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 164] sg M203 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTACCCCGCTGAAAGCGAGGCGTCAGAGCTTTAC- GGTGC 3v3 TAAGACCTTATCATAGCAACCATAACAGTTCTTTTTAGAACTGTTAGGGAACTACAACCCAGTAACATTA- C TGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 
165] sg M203 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTACATCCGTCTGATATAAAGATGACGTCCTGCG- CAAAC 4v1 AAGACGTCAGAGCTTTTCGGTTTACTACCTTATTGTAGTAACCCAACAGTTCTTGTTTTCAAGAACCGTT- A GGGAACTACAACCCAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 166] sg M203 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTACATCCGTCTGATATAAAGATGACGTCCTGCG- CAAAC 4v2 AAGACGTCAGAGCTTTTCGGTTTACTACCTTATTGTAGTAACCCAACAGTACCGTTAGGGAACTACAACC- C AGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 167] sg M203 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAAAAGATGACGTCCTGCGCAAACAAGACGTCA- GAGCT 4v3 TTTCGGTTTACTACCTTATTGTAGTAACCCAACAGTACCGTTAGGGAACTACAACCCAGTAACATTACTG- A CTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 168] sg M203 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAAGAGATAACGCCTCGCAAAAGCGAGGCGTCA- GAGCT 9v1 CTAAGGTGTACTAAACCTTATCATAGTAACCTAAATAGTTCTTGCAAGAACTATCAGGGAACTATAACCC- A GTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 169] sg M203 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAAGAGATAACGCCTTTTGGCGTCAGAGCTCTA- AGGTG 9v2 TACTAAACCTTATCATAGTAACCTAAATAGTTCTTGCAAGAACTATCAGGGAACTATAACCCAGTAACAT- T ACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 170] sg M203 AAACCCCTCCGTTTAGAGAGGGGTTATGCTAGTTAAGAGATAACGCCTCGCAAAAGCGAGGCGTCA- GAGCT 9v3 CTAAGGTGTACTAAACCTTATCATAGTAACCTAAATAGTTCTTGGTTAAGAACTATCAGGGAACTATAAC- C CAGTAACATTACTGACTGGCCTATAGTGAGTCGTATTA [SEQ ID NO. 171]
TABLE-US-00005 TABLE 5
STEP               TEMPERATURE   TIME
DENATURATION       98° C.        30 SEC
12 CYCLES          98° C.        10 SEC
                   66° C.        30 SEC
                   72° C.        2 MIN
FINAL EXTENSION    72° C.        2 MIN
HOLD               12° C.
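The relationship between the two primers and the Table 4 templates can be checked with a short script. The sketch below is illustrative only: it uses the 5' and 3' ends shared by the Table 4 templates (read with the line-break hyphens removed) rather than any full template, and it assumes the consensus T7 promoter sequence, which is not spelled out in the text. The helper names are hypothetical.

```python
# Primers quoted in paragraph [0061] (SEQ ID NO. 174 and 175).
FWD_PRIMER = "AAACCCCTCCGTTTAGAGAG"
REV_PRIMER = "AAGCTAATACGACTCACTATAGGCCAGTC"

# Ends shared by the Table 4 templates; the variable middle part is omitted.
TEMPLATE_5P_END = "AAACCCCTCCGTTTAGAGAG"
TEMPLATE_3P_END = "ACTGACTGGCCTATAGTGAGTCGTATTA"

# Assumption: consensus T7 promoter; transcription starts at its final G.
T7_PROMOTER = "TAATACGACTCACTATAG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def annealed_length(primer: str, template: str) -> int:
    """Longest stretch at the primer's 3' end that pairs with the template's 3' end."""
    rc = reverse_complement(primer)
    for k in range(min(len(rc), len(template)), 0, -1):
        # rc[:k] is the reverse complement of the last k bases of the primer.
        if template.endswith(rc[:k]):
            return k
    return 0


forward_matches = TEMPLATE_5P_END.startswith(FWD_PRIMER)
reverse_overlap = annealed_length(REV_PRIMER, TEMPLATE_3P_END)
promoter_offset = REV_PRIMER.find(T7_PROMOTER)

print("forward primer matches template 5' end:", forward_matches)
print("reverse primer bases annealing to template 3' end:", reverse_overlap)
print("T7 promoter starts at primer index:", promoter_offset)
```

Under these assumptions the reverse primer carries the T7 promoter, so the PCR product is a double-stranded template whose promoter sits on the strand complementary to the sequences listed in Table 4.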
[0062] The target library was designed on the assumption that the PAMs of these nucleases reside at the 3' end of the target sequence (5'-CCAGTCAGTAATGTTACTGG [SEQ ID NO. 177]); eight randomized nucleotides (NNNNNNNN [SEQ ID NO. 176]) therefore follow the target sequence at that position.
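As a rough illustration of the library size implied by eight randomized positions, the following sketch enumerates every target-plus-PAM combination. It covers only the target sequence and the randomized PAM; any flanking or adapter sequence present in the actual library is not described here and is left out. The function name is hypothetical.

```python
from itertools import product

TARGET = "CCAGTCAGTAATGTTACTGG"  # SEQ ID NO. 177
PAM_LENGTH = 8                   # eight randomized positions (SEQ ID NO. 176)


def iter_library(target: str = TARGET, pam_length: int = PAM_LENGTH):
    """Yield the target followed by each possible PAM, 5' to 3'."""
    for bases in product("ACGT", repeat=pam_length):
        yield target + "".join(bases)


library_size = 4 ** PAM_LENGTH
first_member = next(iter_library())

print("number of distinct PAMs:", library_size)
print("first library member:", first_member)
```

Eight positions with four possible bases each give 4^8 = 65,536 distinct PAMs, which sets a lower bound on the sequencing depth needed to observe every library member.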
Example 5
In Vitro Transcription and Translation for Production of MAD Nucleases and gRNAs
[0063] The MADzymes were tested for activity by in vitro transcription and translation (txtl). Both the gRNA plasmid and nuclease plasmid were included in each txtl reaction. A PURExpress® In Vitro Protein Synthesis Kit (NEB, Ipswich, Mass.) was used to produce MADzymes from the PCR-amplified MADzyme library and also to produce the gRNA libraries. In each well of a 96-well plate, the reagents listed in Table 6 were mixed to start the production of MADzymes and gRNAs:
TABLE-US-00006 TABLE 6
     REAGENTS                        VOLUME (µl)
1    SolA (NEB kit)                  10
2    SolB (NEB kit)                  7.5
3    PCR amplified gRNA              0.4
4    Murine RNase inhibitor (NEB)    0.5
5    Water                           3.0
6    PCR amplified T7 MADZYMEs       3.6
[0064] A master mix of all reagents, with the exception of the PCR-amplified T7-MADzymes, was mixed on ice in a volume sufficient for the number of 96-well plates in the assay. After 21 µL of the master mix was distributed into each well of the 96-well plates, 4 µL of the mixture of PCR-amplified MADzymes and gRNA under the control of the T7 promoter was added. The 96-well plates were sealed and incubated for 4 hrs at 37° C. in a thermal cycler. The plates were kept at room temperature until the target pool was added to perform the target depletion reaction.
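The volumes in Table 6 can be scaled to a plate with a few lines of arithmetic. The sketch below is an illustration, not part of the described procedure: the 10% overage is an assumed pipetting allowance, and the function name is hypothetical. It also shows that the four non-template reagents sum to the 21 µL dispensed per well, while the gRNA and MADzyme templates sum to the 4 µL added afterwards.

```python
from decimal import Decimal

# Per-well volumes in microliters, copied from Table 6.
PER_WELL_UL = {
    "SolA (NEB kit)": Decimal("10"),
    "SolB (NEB kit)": Decimal("7.5"),
    "PCR amplified gRNA": Decimal("0.4"),
    "Murine RNase inhibitor (NEB)": Decimal("0.5"),
    "Water": Decimal("3.0"),
    "PCR amplified T7 MADZYMEs": Decimal("3.6"),
}

# Components added per well after the master mix (4 uL in total).
TEMPLATE_REAGENTS = {"PCR amplified gRNA", "PCR amplified T7 MADZYMEs"}


def master_mix(n_wells: int, overage: Decimal = Decimal("0.10")) -> dict:
    """Scale the non-template reagents to n_wells plus an assumed overage."""
    factor = n_wells * (1 + overage)
    return {
        name: volume * factor
        for name, volume in PER_WELL_UL.items()
        if name not in TEMPLATE_REAGENTS
    }


per_well_master = sum(v for k, v in PER_WELL_UL.items() if k not in TEMPLATE_REAGENTS)
per_well_template = sum(v for k, v in PER_WELL_UL.items() if k in TEMPLATE_REAGENTS)

print("master mix per well (uL):", per_well_master)
print("template mixture per well (uL):", per_well_template)
for name, volume in master_mix(96).items():
    print(f"{name}: {volume} uL for one 96-well plate")
```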
[0065] After the 4-hour incubation to allow production of the MADzymes and gRNAs, 4 µL of the target library pool (10 ng/µL) was added to 10 µL aliquots of the in vitro transcription/translation reaction mixture and allowed to deplete for 30 min, 3 hrs or overnight at 37° C. and 48° C. The target depletion reaction mixtures were diluted into PCR-grade water containing RNase A and incubated for 5 min at room temperature. Proteinase K was then added and the mixtures were incubated for 5 min at 55° C. The RNase A/Proteinase K-treated samples were purified with DNA purification kits, and the purified DNA samples were then amplified and sequenced. The PCR conditions are shown in Table 7:
TABLE-US-00007 TABLE 7
STEP               TEMPERATURE   TIME
DENATURATION       98° C.        30 SEC
4 CYCLES           98° C.        10 SEC
                   66° C.        30 SEC
                   72° C.        20 SEC
12 CYCLES          98° C.        10 SEC
                   72° C.        20 SEC
FINAL EXTENSION    72° C.        2 MINUTES
HOLD               12° C.
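The text does not state how the sequencing reads were scored. One common way to summarize a depletion assay of this kind, shown here purely as an assumed sketch, is to compare the frequency of each PAM in a control sample with its frequency after depletion and report a log2 ratio. The function name, the pseudocount and the read counts below are all invented for illustration.

```python
import math


def depletion_scores(control: dict, treated: dict, pseudocount: float = 1.0) -> dict:
    """Return log2(control frequency / treated frequency) for each PAM.

    Positive scores mean the PAM is under-represented after treatment,
    which is what cleavage of targets carrying that PAM would produce.
    """
    pams = set(control) | set(treated)
    control_total = sum(control.get(p, 0) + pseudocount for p in pams)
    treated_total = sum(treated.get(p, 0) + pseudocount for p in pams)
    scores = {}
    for pam in pams:
        control_freq = (control.get(pam, 0) + pseudocount) / control_total
        treated_freq = (treated.get(pam, 0) + pseudocount) / treated_total
        scores[pam] = math.log2(control_freq / treated_freq)
    return scores


# Invented read counts for three hypothetical 8-nt PAMs.
control_counts = {"AGGAAAAA": 100, "TTTTTTTT": 100, "CCCCCCCC": 100}
treated_counts = {"AGGAAAAA": 5, "TTTTTTTT": 100, "CCCCCCCC": 95}

scores = depletion_scores(control_counts, treated_counts)
for pam, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{pam}\t{score:.2f}")
```

Normalizing to frequencies before taking the ratio keeps the score independent of total read depth; the pseudocount avoids division by zero for PAMs that disappear entirely.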
Example 6
Measurement of Nicked Plasmid with Nickase RNP Complexes
[0066] Proteins were produced in vitro using a PURExpress® In Vitro Protein Synthesis Kit (NEB, Ipswich, Mass.). Guide RNAs that target the target plasmid were also produced under a T7 promoter in the same mixture. The MADzyme nickase or nuclease and guide complexes (RNP complexes) formed as they were produced in the in vitro transcription and translation reagent. Supercoiled plasmid target was diluted into the digestion buffer, then the RNP complex was added to the same digestion buffer to initiate the plasmid digestion. After incubation at 37° C. to allow digestion of the plasmid, the resulting mixtures were treated with RNase and Proteinase K, then the target plasmid was purified with a PCR cleanup kit and run on a TAE-agarose gel to observe the formation of nicked or double-strand-cut plasmid. The results are shown in FIG. 7. Table 8 lists the identified MADzyme nickases, including the variations from the nuclease sequence in Table 1 and the amino acid sequence.
TABLE-US-00008 TABLE 8 MAD zyme SEQ Nickase ID Name NO Amino Acid Sequence MAD2016- 178 MKKDYVIGLDIGTNSVGWAVMTEDYQLVKKKMPIYGNTEKKKIKKNFWGVRLFEEGHTAEDRR H851A LKRTARRIISRRRNRLRYLQAFFEEAMTDLDENFFARLQESFLVPEDKKWHRHPIFAKLEDEV AYHETYPTIYHLRKKLADSSEQADLRLIYLALAHIVKYRGHFLIEGKLSTENISVKEQFQQFM IIYNQTFVNGESRLVSAPLPESVLIEEELTEKASRTKKSEKVLQQFPQEKANGLFGQFLKLMV GNKADFKKVFGLEEEAKITYASESYEEDLEGILAKVGDEYSDVFLAAKNVYDAVELSTILADS DKKSHAKLSSSMIVRFTEHQEDLKKFKRFIRENCPDEYDNLFKNEQKDGYAGYIAHAGKVSQL KFYQYVKKIIQDIAGAEYFLEKIAQENFLRKQRTFDNGVIPHQIHLAELQAIIHRQAAYYPFL KENQEKIEQLVTFRIPYYVGPLSKGDASTFAWLKRQSEEPIRPWNLQETVDLDQSATAFIERM TNFDTYLPSEKVLPKHSLLYEKFMVFNELTKISYTDDRGIKANFSGKEKEKIFDYLFKTRRKV KKKDIIQFYRNEYNTEIVTLSGLEEDQFNASFSTYQDLLKCGLTRAELDHPDNAEKLEDIIKI LTIFEDRQRIRTQLSTFKGQFSAEVLKKLERKHYTGWGRLSKKLINGIYDKESGKTILGYLIK DDGVSKHYNRNFMQLINDSQLSFKNAIQKAQSSEHEETLSETVNELAGSPAIKKGIYQSLKIV DELVAIMGYAPKRIVVEMARENQTTSTGKRRSIQRLKIVEKAMAEIGSNLLKEQPTTNEQLRD TRLFLYYMQNGKDMYTGDELSLHRLSHYDIDAIIPQSFMKDDSLDNLVLVGSTENRGKSDDVP SKEVVKDMKAYWEKLYAAGLISQRKFQRLTKGEQGGLTLEDKAHFIQRQLVETRQITKNVAGI LDQRYNANSKEKKVQIITLKASLTSQFRSIFGLYKVREVNDYHHGQDAYLNCVVATTLLKVYP NLAPEFVYGEYPKFQTFKENKATAKAIIYTNLLRFFTEDEPRFTKDGEILWSNSYLKTIKKEL NYHQMNIVKKVEVQKGGFSKESIKPKGPSNKLIPVKNGLDPQKYGGFDSPIVAYTVLFTHEKG KKPLIKQEILGITIMEKTRFEQNPILFLEEKGFLRPRVLMKLPKYTLYEFPEGRRRLLASAKE AQKGNQMVLPEHLLTLLYHAKQCLLPNQSESLTYVEQHQPEFQEILERVVDFAEVHTLAKSKV QQIVKLFEANQTADVKEIAASFIQLMQFNAMGAPSTFKFFQKDIERARYTSIKEIFDATIIYQ STTGLYETRRKVVD MAD2016- 179 MKKDYVIGLDIGTNSVGWAVMTEDYQLVKKKMPIYGNTEKKKIKKNFWGVRLFEEGHTAEDRR N874A LKRTARRIISRRRNRLRYLQAFFEEAMTDLDENFFARLQESFLVPEDKKWHRHPIFAKLEDEV AYHETYPTIYHLRKKLADSSEQADLRLIYLALAHIVKYRGHFLIEGKLSTENISVKEQFQQFM IIYNQTFVNGESRLVSAPLPESVLIEEELTEKASRTKKSEKVLQQFPQEKANGLFGQFLKLMV GNKADFKKVFGLEEEAKITYASESYEEDLEGILAKVGDEYSDVFLAAKNVYDAVELSTILADS DKKSHAKLSSSMIVRFTEHQEDLKKFKRFIRENCPDEYDNLFKNEQKDGYAGYIAHAGKVSQL KFYQYVKKIIQDIAGAEYFLEKIAQENFLRKQRTFDNGVIPHQIHLAELQAIIHRQAAYYPFL KENQEKIEQLVTFRIPYYVGPLSKGDASTFAWLKRQSEEPIRPWNLQETVDLDQSATAFIERM 
TNFDTYLPSEKVLPKHSLLYEKFMVFNELTKISYTDDRGIKANFSGKEKEKIFDYLFKTRRKV KKKDIIQFYRNEYNTEIVTLSGLEEDQFNASFSTYQDLLKCGLTRAELDHPDNAEKLEDIIKI LTIFEDRQRIRTQLSTFKGQFSAEVLKKLERKHYTGWGRLSKKLINGIYDKESGKTILGYLIK DDGVSKHYNRNFMQLINDSQLSFKNAIQKAQSSEHEETLSETVNELAGSPAIKKGIYQSLKIV DELVAIMGYAPKRIVVEMARENQTTSTGKRRSIQRLKIVEKAMAEIGSNLLKEQPTTNEQLRD TRLFLYYMQNGKDMYTGDELSLHRLSHYDIDHIIPQSFMKDDSLDNLVLVGSTEARGKSDDVP SKEVVKDMKAYWEKLYAAGLISQRKFQRLTKGEQGGLTLEDKAHFIQRQLVETRQITKNVAGI LDQRYNANSKEKKVQIITLKASLTSQFRSIFGLYKVREVNDYHHGQDAYLNCVVATTLLKVYP NLAPEFVYGEYPKFQTFKENKATAKAIIYTNLLRFFTEDEPRFTKDGEILWSNSYLKTIKKEL NYHQMNIVKKVEVQKGGFSKESIKPKGPSNKLIPVKNGLDPQKYGGFDSPIVAYTVLFTHEKG KKPLIKQEILGITIMEKTRFEQNPILFLEEKGFLRPRVLMKLPKYTLYEFPEGRRRLLASAKE AQKGNQMVLPEHLLTLLYHAKQCLLPNQSESLTYVEQHQPEFQEILERVVDFAEVHTLAKSKV QQIVKLFEANQTADVKEIAASFIQLMQFNAMGAPSTFKFFQKDIERARYTSIKEIFDATIIYQ STTGLYETRRKVVD MAD2032- 180 MKYIIGLDMGITSVGFATMMLDDKDEPCRIIRMGSRIFEAAEHPKDGSSLAAPRRINRGMRRR H590A LRRKSHRKERIKDLIIKNELMTADEISAIYSTGKQLSDIYQIRAEALDRKLNTEEFVRLLIHL SQRRGFKSNRKVDAKEKGSDAGKLLSAVNSNKELMIEKNYRTIGEMLYKDEKFSEYKRNKADD YSNTFARSEYEDEIRQIFSAQQEHGNPYATDELKESYLDIYLSQRSFDEGPGGSSPYGGNQIE KMIGNCTLEPEEKRAAKATFSFEYFNLLSKVNSIKIVSSSGKRALNNDERQSVIRLAFAKNAI SYTSLRKELNMEYSERFNISYSQSDKSIEEIEKKTKFTYLTAYHTFKKAYGSVFVEWSADKKN SLAYALTAYKNDTKIIEYLTQKGFDAAETDIALTLPSFSKWGNLSEKALNNIIPYLEQGMLYH DACTAAGYNFKADDTDKRMYLPAHEKEAPELDDITNPVVRRAISQTIKVINALIREMGESPCF VNIELARELSKNKAERSKIEKGQKENQVRNDRIMERLRNEFGLLSPTGQDLIKLKLWEEQDGI CPYSLKPIKIEKLFDVGYTDIDAIIPYSLSFDDTYNNKVLVMSSENRQKGNRIPMQYLEGKRQ DDFWLWVDNSNLSRRKKQNLTKETLSEDDLSGFKKRNLQDTQYLSRFMMNYLKKYLALAPNTT GRKNTIQAVNGAVTSYLRKRWGIQKVRENGDTHHAVDAVVISCVTAGMTKRVSEYAKYKETEF QNPQTGEFFDVDIRTGEVINRFPLPYARFRNELLMRCSENPSRILHEMPLPTYAADEKVAPIF VSRMPKHKVKGSAHKETIRRAFEEDGKKYTVSKVPLTDLKLKNGEIENYYNPESDGLLYNALK EQUAFGGDAAKAFEQPFYKPKSDGSEGPLVKKVKLINKATLTVPVLNNTAVADNGSMVRVDVF FVEGEGYYLVPIYVADTVKKELPNKAIIANKPYEEWKEMREENFVFSLYPNDLIKISSRKDMK FNLVNKESTLAPNCQSKEALVYYKGSDISTAAVTAINHDNTYKLRGLGVKTLLKIEKYQVDVL GNVFKVGKEKRVRFK MAD2039- 181 
MRPYAIGLDIGITSVGWATVALDADESPCGIIGLGSRIFDAAEQPKTGESLAAPRRAARGSRR H587A RLRRHRHRNERIRSLMLEERLISQDELETLFDGRLEDIYALRVKALDEIVSRTDFARILLHIS QRRGFKSNRKNPTTKEDGVLLAAVNENKQRMSEHGYRTVGEMFLLDETFKDHKRNKGGNYITT VARDMVADEVRAIFSAQRELGASFASEEFEERYLEILLSQRSFDEGPGGNSPYGGSQIERMVG RCTFFPDEPRAAKATYSFEYFTLLQKVNHIRIVENGVASKLTDEQRRIIIELAHTTKDVSYAK IRKVLKLSDKQLFNIRYSDNSPAEDSEKKEKLGIMKAYHQMRSAIDRVSKGRFAMMPRAQRNA IGTALSLYKTSDKIRKYLTDAGLDEIDINSADSIGSFSKFGHISVKACDMLIPFLEQGMNYNE ACAAAGLNFKGHDAGEKSKLLHPKEEDYEDITSPVVRRAIAQTIKVINAIIRREGCSPTFINI ELAREMAKDFRERNRIKKENDDNRAKNERLLERIRTEYGKNNPTGLDLVKLRLYEEQSGVCMY SLKQMSLEKLFEPNYAEVDAIVPYSISFDDSRKNKVLVLTEENRNKGNRLPLQYLKGRRREDF IVWVNNNVKDYRKRRLLLKEELTAEDESGFKERNLQDTKTMSRFLLNYIADNLEFAESTRGRK KKVTAVNGAVTAYMRKRWGITKIREDGDCHHAVDAVVIACTTDAMIRQVSRYAQFRECEYMQT ESGSVAVDTGTGEVLRTFPYPWPDFRKELEARLANDPAKVINDLHLPFYMSAGRPLPEPVFVS RMPRRKVTGAAHKDTIKSARELDNGYLIVKRPLTDLKLKNGEIENYYNPQSDKCLYDALKNAL IEHGGDAKKAFAGEFRKPKRDGTPGPIVKKVKLLEPTTMCVPVHGGKGAADNDSMVRVDVFLS GGKYYLVPIYVADTLKPELPNKAVTRGKKYSEWLEMADEDFIFSLYPNDLICATSKNGITLSV CRKDSTLPPTVESKSFMLYYRGTDISTGSISCITHDNAYKLRGLGVKTLEKLEKYTVDVLGEY HKVGKEVRQPFNIKRRKACPSEML MAD2039- 182 MRPYAIGLDIGITSVGWATVALDADESPCGIIGLGSRIFDAAEQPKTGESLAAPRRAARGSRR N610A RLRRHRHRNERIRSLMLEERLISQDELETLFDGRLEDIYALRVKALDEIVSRTDFARILLHIS QRRGFKSNRKNPTTKEDGVLLAAVNENKQRMSEHGYRTVGEMFLLDETFKDHKRNKGGNYITT VARDMVADEVRAIFSAQRELGASFASEEFEERYLEILLSQRSFDEGPGGNSPYGGSQIERMVG RCTFFPDEPRAAKATYSFEYFTLLQKVNHIRIVENGVASKLTDEQRRIIIELAHTTKDVSYAK IRKVLKLSDKQLFNIRYSDNSPAEDSEKKEKLGIMKAYHQMRSAIDRVSKGRFAMMPRAQRNA IGTALSLYKTSDKIRKYLTDAGLDEIDINSADSIGSFSKFGHISVKACDMLIPFLEQGMNYNE ACAAAGLNFKGHDAGEKSKLLHPKEEDYEDITSPVVRRAIAQTIKVINAIIRREGCSPTFINI ELAREMAKDFRERNRIKKENDDNRAKNERLLERIRTEYGKNNPTGLDLVKLRLYEEQSGVCMY SLKQMSLEKLFEPNYAEVDHIVPYSISFDDSRKNKVLVLTEENRNKGNRLPLQYLKGRRREDF IVWVNNNVKDYRKRRLLLKEELTAEDESGFKERNLQDTKTMSRFLLNYIADNLEFAESTRGRK KKVTAVNGAVTAYMRKRWGITKIREDGDCHHAVDAVVIACTTDAMIRQVSRYAQFRECEYMQT ESGSVAVDTGTGEVLRTFPYPWPDFRKELEARLANDPAKVINDLHLPFYMSAGRPLPEPVFVS 
RMPRRKVTGAAHKDTIKSARELDNGYLIVKRPLTDLKLKNGEIENYYNPQSDKCLYDALKNAL IEHGGDAKKAFAGEFRKPKRDGTPGPIVKKVKLLEPTTMCVPVHGGKGAADNDSMVRVDVFLS GGKYYLVPIYVADTLKPELPAKAVTRGKKYSEWLEMADEDFIFSLYPNDLICATSKNGITLSV CRKDSTLPPTVESKSFMLYYRGTDISTGSISCITHDNAYKLRGLGVKTLEKLEKYTVDVLGEY HKVGKEVRQPFNIKRRKACPSEML
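The nickase names in Table 8 follow the usual substitution notation: wild-type residue, position, replacement residue (for example, H851A replaces a histidine with an alanine). A minimal sketch of applying such a substitution is given below. The helper is hypothetical, and the example uses a short fragment of the MAD2016 sequence with positions numbered within the fragment, so "H6A" here is a local stand-in and not the full-length coordinate.

```python
import re

_MUTATION = re.compile(r"([A-Z])(\d+)([A-Z])")


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a point substitution such as 'H851A' (1-based position)."""
    match = _MUTATION.fullmatch(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + replacement + sequence[position:]


# Fragment taken from the MAD2016 sequence; numbering is local to the fragment.
fragment = "HYDIDHIIPQSF"
nickase_fragment = apply_substitution(fragment, "H6A")
print(fragment, "->", nickase_fragment)
```

Checking the wild-type residue before substituting guards against off-by-one errors when a position is taken from a differently numbered sequence.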
[0067] While this invention is satisfied by embodiments in many different forms, as described in detail in connection with preferred embodiments of the invention, it is understood that the present disclosure is to be considered as exemplary of the principles of the invention and is not intended to limit the invention to the specific embodiments illustrated and described herein. Numerous variations may be made by persons skilled in the art without departure from the spirit of the invention. The scope of the invention will be measured by the appended claims and their equivalents. The abstract and the title are not to be construed as limiting the scope of the present invention, as their purpose is to enable the appropriate authorities, as well as the general public, to quickly determine the general nature of the invention. In the claims that follow, unless the term "means" is used, none of the features or elements recited therein should be construed as means-plus-function limitations pursuant to 35 U.S.C. § 112, ¶ 6.
Sequence CWU
1
1
18211339PRTVagococcus fluvialis 1Met Gly Lys Asn Tyr Thr Ile Gly Leu Asp
Ile Gly Thr Asn Ser Val1 5 10
15Gly Trp Ser Val Val Thr Glu Asn Gln Gln Leu Val Lys Lys Arg Met
20 25 30Lys Ile Arg Gly Asp Ser
Glu Lys Lys Gln Val Lys Lys Asn Phe Trp 35 40
45Gly Val Arg Leu Phe Asp Glu Gly Glu Thr Ala Glu Ala Thr
Arg Leu 50 55 60Lys Arg Thr Thr Arg
Arg Arg Tyr Thr Arg Arg Arg Asn Arg Val Val65 70
75 80Asp Leu Gln Asn Ile Phe Lys Asp Glu Ile
Asn Gln Lys Asp Ser Asn 85 90
95Phe Phe Asn Arg Leu Asn Glu Ser Phe Leu Val Val Glu Asp Lys Lys
100 105 110Gln Pro Lys Gln Met
Ile Phe Gly Thr Val Glu Glu Glu Ala Ser Tyr 115
120 125His Glu Ser Phe Pro Thr Ile Tyr His Leu Arg Lys
Glu Leu Val Asp 130 135 140Asn Lys Asp
Gln Ala Asp Ile Arg Leu Val Tyr Leu Ala Met Ala His145
150 155 160Met Ile Lys Tyr Arg Gly His
Phe Leu Ile Glu Gly Gln Leu Ser Thr 165
170 175Glu Asn Thr Ser Val Glu Glu Lys Phe His Leu Phe
Leu Lys Glu Tyr 180 185 190Asn
Ser Thr Phe Cys Lys Gln Glu Asp Gly Ser Leu Val Asn Pro Val 195
200 205Asn Glu Asp Ile Asn Gly Glu Glu Ile
Leu Met Gly Thr Leu Ser Arg 210 215
220Ser Lys Lys Ala Glu Gln Ile Met Lys Ser Phe Glu Gly Glu Lys Ser225
230 235 240Asn Gly Val Phe
Ser Gln Phe Leu Lys Met Ile Val Gly Asn Gln Gly 245
250 255Asn Phe Lys Lys Ala Phe Asn Leu Glu Glu
Asp Ala Lys Ile Gln Phe 260 265
270Ala Lys Glu Glu Tyr Asp Glu Asp Leu Thr Thr Leu Leu Ser Asn Ile
275 280 285Gly Asp Glu Tyr Ala Asn Val
Phe Ser Leu Ala Lys Glu Thr Tyr Glu 290 295
300Ala Ile Glu Leu Ser Gly Ile Leu Ser Thr Lys Asp Lys Glu Thr
Tyr305 310 315 320Ala Lys
Leu Ser Ser Ser Met Thr Glu Arg Tyr Glu Asp His Glu Lys
325 330 335Asp Leu Ala Ser Leu Lys Ser
Phe Phe Arg Glu His Leu Pro Glu Lys 340 345
350Tyr Ala Val Met Phe Lys Asp Val Ser Lys Asn Gly Tyr Ala
Gly Tyr 355 360 365Ile Glu Asn Ser
Asn Lys Ile Ser Gln Glu Glu Phe Tyr Lys Tyr Thr 370
375 380Lys Lys Leu Ile Gly Gln Ile Glu Gly Ala Asp Tyr
Phe Ile Lys Lys385 390 395
400Met Glu Gln Glu Ala Phe Leu Arg Lys Gln Arg Thr Tyr Asp Asn Gly
405 410 415Val Ile Pro Tyr Gln
Val His Leu Ser Glu Leu Thr His Ile Ile Asn 420
425 430Asn Gln Lys Lys Tyr Tyr Pro Phe Leu Leu Glu Lys
Glu Glu Glu Ile 435 440 445Lys Ser
Ile Leu Thr Phe Lys Ile Pro Tyr Tyr Ile Gly Pro Leu Ala 450
455 460Lys Gly Asn Ser Asp Phe Ala Trp Leu Ile Arg
Asn Ser Asn Asp Lys465 470 475
480Ile Thr Pro Ser Asn Phe Asn Glu Val Leu Asp Ile Glu Asn Ser Ala
485 490 495Ser Gln Phe Ile
Glu Arg Met Thr Asn Asn Asp Val Tyr Leu Pro Glu 500
505 510Glu Lys Val Leu Pro Lys Asn Ser Met Leu Tyr
Gln Lys Tyr Ile Val 515 520 525Phe
Asn Glu Leu Thr Lys Val Arg Tyr Ile Asn Asp Arg Gly Thr Glu 530
535 540Cys Asn Phe Ser Gly Glu Glu Lys Leu Gln
Ile Phe Glu Arg Phe Phe545 550 555
560Lys Asp Ser Ser Thr Lys Val Lys Lys Val Ser Leu Glu Asn Tyr
Leu 565 570 575Asn Lys Glu
Tyr Met Ile Glu Ser Pro Thr Ile Lys Gly Ile Glu Asp 580
585 590Asp Phe Asn Ala Ser Phe Arg Thr Tyr His
Asp Phe Ile Lys Leu Gly 595 600
605Val Ser Arg Glu Met Leu Asp Asp Ile Asp Asn Glu Glu Met Phe Glu 610
615 620Asp Ile Val Lys Ile Leu Thr Ile
Phe Glu Asp Arg Gln Met Ile Lys625 630
635 640Lys Gln Leu Glu Lys Tyr Lys Asp Val Phe Asp Ser
Asp Ile Leu Lys 645 650
655Lys Met Val Arg Arg His Tyr Thr Gly Trp Gly Arg Leu Ser Lys Lys
660 665 670Leu Leu His Glu Met Lys
Asp Asp Asn Ser Gly Lys Thr Ile Leu Asp 675 680
685Tyr Leu Ile Glu Asp Asp Arg Leu Pro Lys His Ile Asn Arg
Asn Phe 690 695 700Met Gln Leu Ile Asn
Asp Ser Asn Leu Ser Phe Lys Glu Lys Ile Glu705 710
715 720Lys Ala Gln Leu Thr Asp Gly Thr Glu Asp
Ile Asp Ser Val Val Lys 725 730
735Asn Leu Ile Gly Ser Pro Ala Ile Lys Lys Gly Ile Ser Gln Ser Leu
740 745 750Lys Ile Val Glu Glu
Leu Val Ser Ile Met Gly Tyr Gln Pro Thr Ser 755
760 765Ile Val Val Glu Met Ala Arg Glu Asn Gln Thr Thr
Ser Lys Gly Lys 770 775 780Arg Gln Ser
Ile Gln Arg Tyr Lys Arg Leu Glu Ala Ala Ile Asn Glu785
790 795 800Leu Gly Ser Asp Leu Leu Lys
Val Cys Pro Thr Asp Asn His Ala Leu 805
810 815Lys Asp Asp Arg Leu Tyr Leu Tyr Tyr Leu Gln Asn
Gly Arg Asp Met 820 825 830Tyr
Thr Gly Leu Glu Leu Asp Ile His Asn Leu Ser Gln Tyr Asp Ile 835
840 845Asp His Ile Val Pro Arg Ser Phe Ile
Thr Asp Asn Ser Ile Asp Asn 850 855
860Arg Val Leu Val Ser Ser Lys Lys Asn Arg Gly Lys Leu Asp Asn Val865
870 875 880Pro Ser Lys Glu
Ile Val Gln Lys Asn Lys Leu Leu Trp Met Asn Leu 885
890 895Lys Lys Ser Lys Leu Met Ser Glu Lys Lys
Tyr Ala Asn Leu Ile Lys 900 905
910Gly Glu Thr Gly Gly Leu Thr Glu Asp Asp Lys Ala Lys Phe Leu Asn
915 920 925Arg Gln Leu Val Glu Thr Arg
Gln Ile Thr Lys Asn Val Ala Gln Ile 930 935
940Leu Asp Gln Arg Phe Asn Thr Gln Lys Asp Glu Lys Gly Asn Ile
Ile945 950 955 960Arg Glu
Val Lys Val Ile Thr Leu Lys Ser Ala Leu Val Ser Gln Phe
965 970 975Arg Gln Asn Phe Glu Phe Tyr
Lys Val Arg Glu Val Asn Asp Phe His 980 985
990His Ala His Asp Ala Tyr Leu Asn Ala Val Val Ala Asn Thr
Leu Leu 995 1000 1005Lys Val Tyr
Pro Lys Leu Thr Pro Asp Phe Val Tyr Gly Glu Tyr 1010
1015 1020Arg Lys Gly Asn Pro Phe Lys Asn Thr Lys Ala
Thr Ala Lys Lys 1025 1030 1035His Tyr
Tyr Ser Asn Ile Met Glu Asn Leu Cys His Glu Thr Thr 1040
1045 1050Ile Ile Asp Asp Glu Thr Gly Glu Ile Leu
Trp Asp Lys Lys Cys 1055 1060 1065Ile
Gly Thr Ile Lys Gln Val Leu Asn Tyr His Gln Val Asn Val 1070
1075 1080Val Lys Lys Val Glu Thr Gln Thr Gly
Arg Phe Ser Glu Glu Thr 1085 1090
1095Leu Val Pro Arg Gly Ser Thr Lys Asn Pro Ile Ala Leu Lys Ser
1100 1105 1110His Leu Asp Pro Gln Lys
Tyr Gly Gly Phe Lys Ser Pro Thr Ile 1115 1120
1125Ala Tyr Thr Ile Val Ile Glu Tyr Lys Lys Gly Lys Lys Asp
Ile 1130 1135 1140Leu Ile Lys Glu Leu
Leu Gly Ile Ser Ile Met Asn Arg Gly Ala 1145 1150
1155Phe Glu Lys Asn Asn Lys Glu Tyr Leu Glu Lys Leu Asn
Tyr Lys 1160 1165 1170Glu Pro Arg Val
Leu Met Val Leu Pro Lys Tyr Ser Leu Phe Glu 1175
1180 1185Leu Glu Asn Gly Arg Arg Arg Leu Leu Ala Ser
Asp Lys Glu Ser 1190 1195 1200Gln Lys
Gly Asn Gln Met Ala Val Pro Ser Tyr Leu Asn Asn Leu 1205
1210 1215Leu Tyr His Thr Asn Lys Ser Leu Ser Lys
Asn Ala Lys Ser Leu 1220 1225 1230Glu
Tyr Val Asn Glu His Arg Gln Gln Phe Glu Glu Leu Leu Glu 1235
1240 1245Glu Ile Ile Asp Phe Ala Asn Gln Phe
Thr Leu Ala Glu Lys Asn 1250 1255
1260Thr Leu Leu Ile Ala Asp Leu Tyr Glu Ser Asn Lys Glu Ala Asp
1265 1270 1275Ile Glu Leu Leu Ala Ser
Ser Phe Ile Asn Leu Leu Arg Phe Asn 1280 1285
1290Gln Met Gly Ala Pro Ala Glu Phe Ser Phe Phe Glu Lys Pro
Ile 1295 1300 1305Pro Arg Lys Arg Tyr
Ser Ser Thr Phe Glu Leu Leu Lys Gly Lys 1310 1315
1320Val Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr His
Gln Lys 1325 1330
1335Val236DNAArtificial SequenceCRrNA 2gttttagagc tatgctgttt tgaatgcttc
caaaac 36380DNAArtificial SequenceTRACR RNA
3tgttggtagc attcaaaaca acatagcaag ttaaaataag gctttgtccg ttctcaactt
60ttagtgacgc tgtttcggcg
8041337PRTEnterococcus faecalis 4Met Lys Lys Asp Tyr Val Ile Gly Leu Asp
Ile Gly Thr Asn Ser Val1 5 10
15Gly Trp Ala Val Met Thr Glu Asp Tyr Gln Leu Val Lys Lys Lys Met
20 25 30Pro Ile Tyr Gly Asn Thr
Glu Lys Lys Lys Ile Lys Lys Asn Phe Trp 35 40
45Gly Val Arg Leu Phe Glu Glu Gly His Thr Ala Glu Asp Arg
Arg Leu 50 55 60Lys Arg Thr Ala Arg
Arg Ile Ile Ser Arg Arg Arg Asn Arg Leu Arg65 70
75 80Tyr Leu Gln Ala Phe Phe Glu Glu Ala Met
Thr Asp Leu Asp Glu Asn 85 90
95Phe Phe Ala Arg Leu Gln Glu Ser Phe Leu Val Pro Glu Asp Lys Lys
100 105 110Trp His Arg His Pro
Ile Phe Ala Lys Leu Glu Asp Glu Val Ala Tyr 115
120 125His Glu Thr Tyr Pro Thr Ile Tyr His Leu Arg Lys
Lys Leu Ala Asp 130 135 140Ser Ser Glu
Gln Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His145
150 155 160Ile Val Lys Tyr Arg Gly His
Phe Leu Ile Glu Gly Lys Leu Ser Thr 165
170 175Glu Asn Ile Ser Val Lys Glu Gln Phe Gln Gln Phe
Met Ile Ile Tyr 180 185 190Asn
Gln Thr Phe Val Asn Gly Glu Ser Arg Leu Val Ser Ala Pro Leu 195
200 205Pro Glu Ser Val Leu Ile Glu Glu Glu
Leu Thr Glu Lys Ala Ser Arg 210 215
220Thr Lys Lys Ser Glu Lys Val Leu Gln Gln Phe Pro Gln Glu Lys Ala225
230 235 240Asn Gly Leu Phe
Gly Gln Phe Leu Lys Leu Met Val Gly Asn Lys Ala 245
250 255Asp Phe Lys Lys Val Phe Gly Leu Glu Glu
Glu Ala Lys Ile Thr Tyr 260 265
270Ala Ser Glu Ser Tyr Glu Glu Asp Leu Glu Gly Ile Leu Ala Lys Val
275 280 285Gly Asp Glu Tyr Ser Asp Val
Phe Leu Ala Ala Lys Asn Val Tyr Asp 290 295
300Ala Val Glu Leu Ser Thr Ile Leu Ala Asp Ser Asp Lys Lys Ser
His305 310 315 320Ala Lys
Leu Ser Ser Ser Met Ile Val Arg Phe Thr Glu His Gln Glu
325 330 335Asp Leu Lys Lys Phe Lys Arg
Phe Ile Arg Glu Asn Cys Pro Asp Glu 340 345
350Tyr Asp Asn Leu Phe Lys Asn Glu Gln Lys Asp Gly Tyr Ala
Gly Tyr 355 360 365Ile Ala His Ala
Gly Lys Val Ser Gln Leu Lys Phe Tyr Gln Tyr Val 370
375 380Lys Lys Ile Ile Gln Asp Ile Ala Gly Ala Glu Tyr
Phe Leu Glu Lys385 390 395
400Ile Ala Gln Glu Asn Phe Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly
405 410 415Val Ile Pro His Gln
Ile His Leu Ala Glu Leu Gln Ala Ile Ile His 420
425 430Arg Gln Ala Ala Tyr Tyr Pro Phe Leu Lys Glu Asn
Gln Glu Lys Ile 435 440 445Glu Gln
Leu Val Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ser 450
455 460Lys Gly Asp Ala Ser Thr Phe Ala Trp Leu Lys
Arg Gln Ser Glu Glu465 470 475
480Pro Ile Arg Pro Trp Asn Leu Gln Glu Thr Val Asp Leu Asp Gln Ser
485 490 495Ala Thr Ala Phe
Ile Glu Arg Met Thr Asn Phe Asp Thr Tyr Leu Pro 500
505 510Ser Glu Lys Val Leu Pro Lys His Ser Leu Leu
Tyr Glu Lys Phe Met 515 520 525Val
Phe Asn Glu Leu Thr Lys Ile Ser Tyr Thr Asp Asp Arg Gly Ile 530
535 540Lys Ala Asn Phe Ser Gly Lys Glu Lys Glu
Lys Ile Phe Asp Tyr Leu545 550 555
560Phe Lys Thr Arg Arg Lys Val Lys Lys Lys Asp Ile Ile Gln Phe
Tyr 565 570 575Arg Asn Glu
Tyr Asn Thr Glu Ile Val Thr Leu Ser Gly Leu Glu Glu 580
585 590Asp Gln Phe Asn Ala Ser Phe Ser Thr Tyr
Gln Asp Leu Leu Lys Cys 595 600
605Gly Leu Thr Arg Ala Glu Leu Asp His Pro Asp Asn Ala Glu Lys Leu 610
615 620Glu Asp Ile Ile Lys Ile Leu Thr
Ile Phe Glu Asp Arg Gln Arg Ile625 630
635 640Arg Thr Gln Leu Ser Thr Phe Lys Gly Gln Phe Ser
Ala Glu Val Leu 645 650
655Lys Lys Leu Glu Arg Lys His Tyr Thr Gly Trp Gly Arg Leu Ser Lys
660 665 670Lys Leu Ile Asn Gly Ile
Tyr Asp Lys Glu Ser Gly Lys Thr Ile Leu 675 680
685Gly Tyr Leu Ile Lys Asp Asp Gly Val Ser Lys His Tyr Asn
Arg Asn 690 695 700Phe Met Gln Leu Ile
Asn Asp Ser Gln Leu Ser Phe Lys Asn Ala Ile705 710
715 720Gln Lys Ala Gln Ser Ser Glu His Glu Glu
Thr Leu Ser Glu Thr Val 725 730
735Asn Glu Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile Tyr Gln Ser
740 745 750Leu Lys Ile Val Asp
Glu Leu Val Ala Ile Met Gly Tyr Ala Pro Lys 755
760 765Arg Ile Val Val Glu Met Ala Arg Glu Asn Gln Thr
Thr Ser Thr Gly 770 775 780Lys Arg Arg
Ser Ile Gln Arg Leu Lys Ile Val Glu Lys Ala Met Ala785
790 795 800Glu Ile Gly Ser Asn Leu Leu
Lys Glu Gln Pro Thr Thr Asn Glu Gln 805
810 815Leu Arg Asp Thr Arg Leu Phe Leu Tyr Tyr Met Gln
Asn Gly Lys Asp 820 825 830Met
Tyr Thr Gly Asp Glu Leu Ser Leu His Arg Leu Ser His Tyr Asp 835
840 845Ile Asp His Ile Ile Pro Gln Ser Phe
Met Lys Asp Asp Ser Leu Asp 850 855
860Asn Leu Val Leu Val Gly Ser Thr Glu Asn Arg Gly Lys Ser Asp Asp865
870 875 880Val Pro Ser Lys
Glu Val Val Lys Asp Met Lys Ala Tyr Trp Glu Lys 885
890 895Leu Tyr Ala Ala Gly Leu Ile Ser Gln Arg
Lys Phe Gln Arg Leu Thr 900 905
910Lys Gly Glu Gln Gly Gly Leu Thr Leu Glu Asp Lys Ala His Phe Ile
915 920 925Gln Arg Gln Leu Val Glu Thr
Arg Gln Ile Thr Lys Asn Val Ala Gly 930 935
940Ile Leu Asp Gln Arg Tyr Asn Ala Asn Ser Lys Glu Lys Lys Val
Gln945 950 955 960Ile Ile
Thr Leu Lys Ala Ser Leu Thr Ser Gln Phe Arg Ser Ile Phe
965 970 975Gly Leu Tyr Lys Val Arg Glu
Val Asn Asp Tyr His His Gly Gln Asp 980 985
990Ala Tyr Leu Asn Cys Val Val Ala Thr Thr Leu Leu Lys Val
Tyr Pro 995 1000 1005Asn Leu Ala
Pro Glu Phe Val Tyr Gly Glu Tyr Pro Lys Phe Gln 1010
1015 1020Thr Phe Lys Glu Asn Lys Ala Thr Ala Lys Ala
Ile Ile Tyr Thr 1025 1030 1035Asn Leu
Leu Arg Phe Phe Thr Glu Asp Glu Pro Arg Phe Thr Lys 1040
1045 1050Asp Gly Glu Ile Leu Trp Ser Asn Ser Tyr
Leu Lys Thr Ile Lys 1055 1060 1065Lys
Glu Leu Asn Tyr His Gln Met Asn Ile Val Lys Lys Val Glu 1070
1075 1080Val Gln Lys Gly Gly Phe Ser Lys Glu
Ser Ile Lys Pro Lys Gly 1085 1090
1095Pro Ser Asn Lys Leu Ile Pro Val Lys Asn Gly Leu Asp Pro Gln
1100 1105 1110Lys Tyr Gly Gly Phe Asp
Ser Pro Ile Val Ala Tyr Thr Val Leu 1115 1120
1125Phe Thr His Glu Lys Gly Lys Lys Pro Leu Ile Lys Gln Glu
Ile 1130 1135 1140Leu Gly Ile Thr Ile
Met Glu Lys Thr Arg Phe Glu Gln Asn Pro 1145 1150
1155Ile Leu Phe Leu Glu Glu Lys Gly Phe Leu Arg Pro Arg
Val Leu 1160 1165 1170Met Lys Leu Pro
Lys Tyr Thr Leu Tyr Glu Phe Pro Glu Gly Arg 1175
1180 1185Arg Arg Leu Leu Ala Ser Ala Lys Glu Ala Gln
Lys Gly Asn Gln 1190 1195 1200Met Val
Leu Pro Glu His Leu Leu Thr Leu Leu Tyr His Ala Lys 1205
1210 1215Gln Cys Leu Leu Pro Asn Gln Ser Glu Ser
Leu Thr Tyr Val Glu 1220 1225 1230Gln
His Gln Pro Glu Phe Gln Glu Ile Leu Glu Arg Val Val Asp 1235
1240 1245Phe Ala Glu Val His Thr Leu Ala Lys
Ser Lys Val Gln Gln Ile 1250 1255
1260Val Lys Leu Phe Glu Ala Asn Gln Thr Ala Asp Val Lys Glu Ile
1265 1270 1275Ala Ala Ser Phe Ile Gln
Leu Met Gln Phe Asn Ala Met Gly Ala 1280 1285
1290Pro Ser Thr Phe Lys Phe Phe Gln Lys Asp Ile Glu Arg Ala
Arg 1295 1300 1305Tyr Thr Ser Ile Lys
Glu Ile Phe Asp Ala Thr Ile Ile Tyr Gln 1310 1315
1320Ser Thr Thr Gly Leu Tyr Glu Thr Arg Arg Lys Val Val
Asp 1325 1330 1335
SEQ ID NO: 5 (length 36, DNA, Artificial Sequence; CR REPEAT)
gttttagagt catgttgttt agaatggtac caaaac 36
SEQ ID NO: 6 (length 84, DNA, Artificial Sequence; TRACR RNA)
tcttttggga ctattctaaa caacatagca agttaaaata aggttttaac cgtaatcaac 60
tgtaaagtgg cgctgtttcg gcgc 84
SEQ ID NO: 7 (length 1375, PRT, Streptococcus acidominimus)
Met Lys Lys Pro Tyr Ser Ile Gly Leu
Asp Ile Gly Thr Asn Ser Val1 5 10
15Gly Trp Ala Val Ile Thr Asp Asp Tyr Lys Val Pro Ala Lys Lys
Met 20 25 30Lys Val Leu Gly
Asn Thr Asp Lys Lys Tyr Ile Lys Lys Asn Leu Leu 35
40 45Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu
Val Thr Arg Leu 50 55 60Lys Arg Thr
Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Leu Arg65 70
75 80Tyr Leu Gln Glu Ile Phe Ala Lys
Glu Met Thr Lys Val Asp Glu Ser 85 90
95Phe Phe Gln Arg Leu Glu Glu Ser Phe Leu Thr Asp Asp Asp
Lys Thr 100 105 110Phe Asp Ser
His Pro Ile Phe Gly Asn Lys Ala Glu Glu Asp Ala Tyr 115
120 125His Gln Lys Phe Pro Thr Ile Tyr His Leu Arg
Lys Tyr Leu Ala Asp 130 135 140Ser Gln
Glu Lys Ala Asp Leu Arg Leu Val Tyr Leu Ala Leu Ala His145
150 155 160Met Ile Lys Tyr Arg Gly His
Phe Leu Ile Glu Gly Glu Leu Asn Ala 165
170 175Glu Asn Thr Asp Val Gln Lys Leu Phe Asn Val Phe
Val Glu Thr Tyr 180 185 190Asp
Lys Ile Val Asp Glu Ser His Leu Ser Glu Ile Glu Val Asp Ala 195
200 205Ser Ser Ile Leu Thr Glu Lys Val Ser
Lys Ser Arg Arg Leu Glu Asn 210 215
220Leu Ile Lys Gln Tyr Pro Thr Glu Lys Lys Asn Thr Leu Phe Gly Asn225
230 235 240Leu Ile Ala Leu
Ala Leu Gly Leu Gln Pro Asn Phe Lys Thr Asn Phe 245
250 255Lys Leu Ser Glu Asp Ala Lys Leu Gln Phe
Ser Lys Asp Thr Tyr Glu 260 265
270Glu Asp Leu Glu Glu Leu Leu Gly Lys Val Gly Asp Asp Tyr Ala Asp
275 280 285Leu Phe Ile Ser Ala Lys Asn
Leu Tyr Asp Ala Ile Leu Leu Ser Gly 290 295
300Ile Leu Thr Val Asp Asp Asn Ser Thr Lys Ala Pro Leu Ser Ala
Ser305 310 315 320Met Ile
Lys Arg Tyr Val Glu His His Glu Asp Leu Glu Lys Leu Lys
325 330 335Glu Phe Ile Lys Ile Asn Lys
Leu Lys Leu Tyr His Asp Ile Phe Lys 340 345
350Asp Lys Thr Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Asn Gly
Val Lys 355 360 365Gln Asp Glu Phe
Tyr Lys Tyr Leu Lys Thr Ile Leu Thr Lys Ile Asp 370
375 380Asp Ser Asp Tyr Phe Leu Asp Lys Ile Glu Arg Asp
Asp Phe Leu Arg385 390 395
400Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415Gln Glu Met His Ser
Ile Leu Arg Arg Gln Gly Glu Tyr Tyr Pro Phe 420
425 430Leu Lys Glu Asn Gln Ala Lys Ile Glu Lys Ile Leu
Thr Phe Arg Ile 435 440 445Pro Tyr
Tyr Val Gly Pro Leu Ala Arg Lys Asp Ser Arg Phe Ala Trp 450
455 460Ala Asn Tyr His Ser Asp Glu Pro Ile Thr Pro
Trp Asn Phe Asp Glu465 470 475
480Val Val Asp Lys Glu Lys Ser Ala Glu Lys Phe Ile Thr Arg Met Thr
485 490 495Leu Asn Asp Leu
Tyr Leu Pro Glu Glu Lys Val Leu Pro Lys His Ser 500
505 510His Val Tyr Glu Thr Phe Thr Val Tyr Asn Glu
Leu Thr Lys Ile Lys 515 520 525Tyr
Val Asn Glu Gln Gly Glu Ser Phe Phe Phe Asp Ala Asn Met Lys 530
535 540Gln Glu Ile Phe Asp His Val Phe Lys Glu
Asn Arg Lys Val Thr Lys545 550 555
560Ala Lys Leu Leu Ser Tyr Leu Asn Asn Glu Phe Glu Glu Phe Arg
Ile 565 570 575Asn Asp Leu
Ile Gly Leu Asp Lys Asp Ser Lys Ser Phe Asn Ala Ser 580
585 590Leu Gly Thr Tyr His Asp Leu Lys Lys Ile
Leu Asp Lys Ser Phe Leu 595 600
605Asp Asp Lys Thr Asn Glu Gln Ile Ile Glu Asp Ile Val Leu Thr Leu 610
615 620Thr Leu Phe Glu Asp Arg Asp Met
Ile His Glu Arg Leu Gln Lys Tyr625 630
635 640Ser Asp Phe Phe Thr Ser Gln Gln Leu Lys Lys Leu
Glu Arg Arg His 645 650
655Tyr Thr Gly Trp Gly Arg Leu Ser Tyr Lys Leu Ile Asn Gly Ile Arg
660 665 670Asn Lys Glu Asn Asn Lys
Thr Ile Leu Asp Phe Leu Ile Asp Asp Gly 675 680
685His Ala Asn Arg Asn Phe Met Gln Leu Ile Asn Asp Glu Ser
Leu Ser 690 695 700Phe Lys Thr Ile Ile
Gln Glu Ala Gln Val Val Gly Asp Val Asp Asp705 710
715 720Ile Glu Ala Val Val His Asp Leu Pro Gly
Ser Pro Ala Ile Lys Lys 725 730
735Gly Ile Leu Gln Ser Val Lys Ile Val Asp Glu Leu Val Lys Val Met
740 745 750Gly Asp Asn Pro Asp
Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln 755
760 765Thr Thr Gly Tyr Gly Arg Asn Lys Ser Asn Gln Arg
Leu Lys Arg Leu 770 775 780Gln Asp Ser
Leu Lys Glu Phe Gly Ser Asp Ile Leu Ser Lys Lys Lys785
790 795 800Pro Ser Tyr Val Asp Ser Lys
Val Glu Asn Ser His Leu Gln Asn Asp 805
810 815Arg Leu Phe Leu Tyr Tyr Ile Gln Asn Gly Lys Asp
Met Tyr Thr Gly 820 825 830Glu
Glu Leu Asp Ile Asp Arg Leu Ser Asp Tyr Asp Ile Asp His Ile 835
840 845Ile Pro Gln Ala Phe Ile Lys Asp Asn
Ser Ile Asp Asn Lys Val Leu 850 855
860Thr Ser Ser Ala Lys Asn Arg Gly Lys Ser Asp Asp Val Pro Ser Ile865
870 875 880Glu Ile Val Arg
Asn Arg Arg Ser Tyr Trp Tyr Lys Leu Tyr Lys Ser 885
890 895Gly Leu Ile Ser Lys Arg Lys Phe Asp Asn
Leu Thr Lys Ala Glu Arg 900 905
910Gly Gly Leu Thr Glu Ala Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu
915 920 925Val Glu Thr Arg Gln Ile Thr
Lys His Val Ala Gln Ile Leu Asp Ala 930 935
940Arg Phe Asn Thr Lys Arg Asp Glu Asn Asp Lys Val Ile Arg Asp
Val945 950 955 960Lys Val
Ile Thr Leu Lys Ser Asn Leu Val Ser Gln Phe Arg Lys Glu
965 970 975Phe Lys Phe Tyr Lys Val Arg
Glu Ile Asn Asp Tyr His His Ala His 980 985
990Asp Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Leu Lys
Lys Tyr 995 1000 1005Pro Lys Leu
Thr Pro Glu Phe Val Tyr Gly Glu Tyr Lys Lys Tyr 1010
1015 1020Asp Val Arg Lys Leu Ile Ala Lys Ser Ser Asp
Asp Tyr Ser Glu 1025 1030 1035Met Gly
Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn Leu Met 1040
1045 1050Asn Phe Phe Lys Thr Glu Val Lys Tyr Ala
Asp Gly Arg Val Phe 1055 1060 1065Glu
Arg Pro Asp Ile Glu Thr Asn Ala Asp Gly Glu Val Val Trp 1070
1075 1080Asn Lys Gln Lys Asp Phe Asp Ile Val
Arg Lys Val Leu Ser Tyr 1085 1090
1095Pro Gln Val Asn Ile Val Lys Lys Val Glu Ala Gln Thr Gly Gly
1100 1105 1110Phe Ser Lys Glu Ser Ile
Leu Ser Lys Gly Asp Ser Asp Lys Leu 1115 1120
1125Ile Pro Arg Lys Thr Lys Lys Val Tyr Trp Asn Thr Lys Lys
Tyr 1130 1135 1140Gly Gly Phe Asp Ser
Pro Thr Val Ala Tyr Ser Val Leu Val Val 1145 1150
1155Ala Asp Ile Glu Lys Gly Lys Ala Lys Lys Leu Lys Thr
Val Lys 1160 1165 1170Glu Leu Val Gly
Ile Ser Ile Met Glu Arg Ser Phe Phe Glu Glu 1175
1180 1185Asn Pro Val Ser Phe Leu Glu Lys Lys Gly Tyr
His Asn Val Gln 1190 1195 1200Glu Asp
Lys Leu Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Phe 1205
1210 1215Glu Gly Gly Arg Arg Arg Leu Leu Ala Ser
Ala Thr Glu Leu Gln 1220 1225 1230Lys
Gly Asn Glu Val Met Leu Pro Ala His Leu Val Glu Leu Leu 1235
1240 1245Tyr His Ala His Arg Ile Asp Ser Phe
Asn Ser Thr Glu His Leu 1250 1255
1260Lys Tyr Val Ser Glu His Lys Lys Glu Phe Glu Lys Val Leu Ser
1265 1270 1275Cys Val Glu Asn Phe Ser
Asn Leu Tyr Val Asp Val Glu Lys Asn 1280 1285
1290Leu Ser Lys Val Arg Ala Ala Ala Glu Ser Met Thr Asn Phe
Ser 1295 1300 1305Leu Glu Glu Ile Ser
Ala Ser Phe Ile Asn Leu Leu Thr Leu Thr 1310 1315
1320Ala Leu Gly Ala Pro Ala Asp Phe Asn Phe Leu Gly Glu
Lys Ile 1325 1330 1335Pro Arg Lys Arg
Tyr Thr Ser Thr Lys Glu Cys Leu Ser Ala Thr 1340
1345 1350Leu Ile His Gln Ser Val Thr Gly Leu Tyr Glu
Thr Arg Ile Asp 1355 1360 1365Leu Ser
Lys Leu Gly Glu Glu 1370 1375
SEQ ID NO: 8 (length 36, DNA, Artificial Sequence; CR RNA)
gttttagagc tgtgctgttt cgaatggttc caaaac 36
SEQ ID NO: 9 (length 87, DNA, Artificial Sequence; TRACR RNA)
tgttggaact attcgaaaca acacagcgag ttaaaataag gctttgtccg tacacaactt 60
gtaaaagggg cacccgattc gggtgca 87
SEQ ID NO: 10 (length 1367, PRT, Streptococcus acidominimus)
Met Thr Lys Pro Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn
Ser Val1 5 10 15Gly Trp
Ala Val Ile Thr Asp Asp Tyr Lys Val Pro Ser Lys Lys Met 20
25 30Lys Val Leu Gly Asn Thr Ser Lys Lys
Tyr Ile Lys Lys Asn Leu Leu 35 40
45Gly Ala Leu Leu Phe Asp Ser Gly Ile Thr Ala Glu Gly Arg Arg Leu 50
55 60Lys Arg Thr Ala Arg Arg Arg Tyr Thr
Arg Arg Arg Asn Arg Ile Leu65 70 75
80Tyr Leu Gln Glu Ile Phe Ser Thr Glu Met Ala Thr Leu Asp
Asp Ala 85 90 95Phe Phe
Gln Arg Leu Asp Asp Ser Phe Leu Val Pro Asp Asp Lys Arg 100
105 110Asp Ser Lys Tyr Pro Ile Phe Gly Asn
Leu Val Glu Glu Lys Ala Tyr 115 120
125His Asp Glu Phe Pro Thr Ile Tyr His Leu Arg Lys Tyr Leu Ala Asp
130 135 140Ser Thr Lys Lys Ala Asp Leu
Arg Leu Val Tyr Leu Ala Leu Ala His145 150
155 160Met Ile Lys Tyr Arg Gly His Phe Leu Ile Glu Gly
Glu Phe Asn Ser 165 170
175Lys Asn Asn Asp Ile Gln Lys Asn Phe Gln Asp Phe Leu Asp Thr Tyr
180 185 190Asn Ala Ile Phe Glu Ser
Asp Leu Ser Leu Glu Asn Ser Lys Gln Leu 195 200
205Glu Glu Ile Val Lys Asp Lys Ile Ser Lys Leu Glu Lys Lys
Asp Arg 210 215 220Ile Leu Lys Leu Phe
Pro Gly Glu Lys Asn Ser Gly Ile Phe Ser Glu225 230
235 240Phe Leu Lys Leu Ile Val Gly Asn Gln Ala
Asp Phe Lys Lys Tyr Phe 245 250
255Asn Leu Asp Glu Lys Ala Ser Leu His Phe Ser Lys Glu Ser Tyr Asp
260 265 270Glu Asp Leu Glu Thr
Leu Leu Gly Tyr Ile Gly Asp Asp Tyr Ser Asp 275
280 285Val Phe Leu Lys Ala Lys Lys Leu Tyr Asp Ala Ile
Leu Leu Ser Gly 290 295 300Ile Leu Thr
Val Thr Asp Asn Gly Thr Glu Thr Pro Leu Ser Ser Ala305
310 315 320Met Ile Met Arg Tyr Lys Glu
His Glu Glu Asp Leu Gly Leu Leu Lys 325
330 335Ala Tyr Ile Arg Asn Ile Ser Leu Lys Thr Tyr Asn
Glu Val Phe Asn 340 345 350Asp
Asp Thr Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Lys Thr Asn 355
360 365Gln Glu Asp Phe Tyr Val Tyr Leu Lys
Lys Leu Leu Ala Lys Phe Glu 370 375
380Gly Ala Asp Tyr Phe Leu Glu Lys Ile Asp Arg Glu Asp Phe Leu Arg385
390 395 400Lys Gln Arg Thr
Phe Asp Asn Gly Ser Ile Pro Tyr Gln Ile His Leu 405
410 415Gln Glu Met Arg Ala Ile Leu Asp Lys Gln
Ala Lys Phe Tyr Pro Phe 420 425
430Leu Ala Lys Asn Lys Glu Arg Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445Pro Tyr Tyr Val Gly Pro Leu
Ala Arg Gly Asn Ser Asp Phe Ala Trp 450 455
460Ser Ile Arg Lys Arg Asn Glu Lys Ile Thr Pro Trp Asn Phe Glu
Asp465 470 475 480Val Ile
Asp Lys Glu Ser Ser Ala Glu Ala Phe Ile Asn Arg Met Thr
485 490 495Ser Phe Asp Leu Tyr Leu Pro
Glu Glu Lys Val Leu Pro Lys His Ser 500 505
510Leu Leu Tyr Glu Thr Phe Thr Val Tyr Asn Glu Leu Thr Lys
Val Arg 515 520 525Phe Ile Ala Glu
Gly Met Ser Asp Tyr Gln Phe Leu Asp Ser Lys Gln 530
535 540Lys Lys Asp Ile Val Arg Leu Tyr Phe Lys Gly Lys
Arg Lys Val Lys545 550 555
560Val Thr Asp Lys Asp Ile Ile Glu Tyr Leu His Ala Ile Asp Gly Tyr
565 570 575Asp Gly Ile Glu Leu
Lys Gly Ile Glu Lys Gln Phe Asn Ser Ser Leu 580
585 590Ser Thr Tyr His Asp Leu Leu Asn Ile Ile Asn Asp
Lys Glu Phe Leu 595 600 605Asp Asp
Ser Ser Asn Glu Ala Ile Ile Glu Glu Ile Ile His Thr Leu 610
615 620Thr Ile Phe Glu Asp Arg Glu Met Ile Lys Gln
Arg Leu Ser Lys Phe625 630 635
640Glu Asn Ile Phe Asp Lys Ser Val Leu Lys Lys Leu Ser Arg Arg His
645 650 655Tyr Thr Gly Trp
Gly Lys Leu Ser Ala Lys Leu Ile Asn Gly Ile Arg 660
665 670Asp Glu Lys Ser Gly Asn Thr Ile Leu Asp Tyr
Leu Ile Asp Asp Gly 675 680 685Ile
Ser Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ala Leu Ser 690
695 700Phe Lys Lys Lys Ile Gln Lys Ala Gln Ile
Ile Gly Asp Lys Asp Lys705 710 715
720Asp Asn Ile Lys Glu Val Val Lys Ser Leu Pro Gly Ser Pro Ala
Ile 725 730 735Lys Lys Gly
Ile Leu Gln Ser Ile Lys Ile Val Asp Glu Leu Val Lys 740
745 750Val Met Gly Arg Lys Pro Glu Ser Ile Val
Val Glu Met Ala Arg Glu 755 760
765Asn Gln Tyr Thr Asn Gln Gly Lys Ser Asn Ser Gln Gln Arg Leu Lys 770
775 780Arg Leu Glu Glu Ser Leu Glu Glu
Leu Gly Ser Lys Ile Leu Lys Glu785 790
795 800Asn Ile Pro Ala Lys Leu Ser Lys Ile Asp Asn Asn
Ser Leu Gln Asn 805 810
815Asp Arg Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Lys Asp Met Tyr Thr
820 825 830Gly Asp Asp Leu Asp Ile
Asp Arg Leu Ser Asn Tyr Asp Ile Asp His 835 840
845Ile Ile Pro Gln Ala Phe Leu Lys Asp Asn Ser Ile Asp Asn
Lys Val 850 855 860Leu Val Ser Ser Ala
Ser Asn Arg Gly Lys Ser Asp Asp Val Pro Ser865 870
875 880Leu Glu Val Val Lys Lys Arg Lys Thr Leu
Trp Tyr Gln Leu Leu Lys 885 890
895Ser Lys Leu Ile Ser Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu
900 905 910Arg Gly Gly Leu Ser
Pro Glu Asp Lys Ala Gly Phe Ile Gln Arg Gln 915
920 925Leu Val Glu Thr Arg Gln Ile Thr Lys His Val Ala
Arg Leu Leu Asp 930 935 940Glu Lys Phe
Asn Asn Lys Lys Asp Glu Asn Asn Arg Ala Val Arg Thr945
950 955 960Val Lys Ile Ile Thr Leu Lys
Ser Thr Leu Val Ser Gln Phe Arg Lys 965
970 975Asp Phe Glu Leu Tyr Lys Val Arg Glu Ile Asn Asp
Phe His His Ala 980 985 990His
Asp Ala Tyr Leu Asn Ala Val Val Ala Ser Ala Leu Leu Lys Lys 995
1000 1005Tyr Pro Lys Leu Glu Pro Glu Phe
Val Tyr Gly Asp Tyr Pro Lys 1010 1015
1020Tyr Asn Ser Phe Arg Glu Arg Lys Ser Ala Thr Glu Lys Val Tyr
1025 1030 1035Phe Tyr Ser Asn Ile Met
Asn Ile Phe Lys Lys Ser Ile Ser Leu 1040 1045
1050Ala Asp Gly Arg Val Ile Glu Arg Pro Leu Ile Glu Val Asn
Glu 1055 1060 1065Glu Thr Gly Glu Ser
Val Trp Asn Lys Glu Ser Asp Leu Ala Thr 1070 1075
1080Val Arg Arg Val Leu Ser Tyr Pro Gln Val Asn Val Val
Lys Lys 1085 1090 1095Val Glu Val Gln
Ser Gly Gly Phe Ser Lys Glu Leu Val Gln Pro 1100
1105 1110His Gly Asn Ser Asp Lys Leu Ile Pro Arg Lys
Thr Lys Lys Met 1115 1120 1125Ile Trp
Asp Thr Lys Lys Tyr Gly Gly Phe Asp Ser Pro Ile Val 1130
1135 1140Ala Tyr Ser Val Leu Val Met Ala Glu Arg
Glu Lys Gly Lys Ser 1145 1150 1155Lys
Lys Leu Lys Pro Val Lys Glu Leu Val Arg Ile Thr Ile Met 1160
1165 1170Glu Lys Glu Ser Phe Lys Glu Asn Thr
Ile Asp Phe Leu Glu Arg 1175 1180
1185Arg Gly Leu Arg Asn Ile Gln Asp Glu Asn Ile Ile Leu Leu Pro
1190 1195 1200Lys Phe Ser Leu Phe Glu
Leu Glu Asn Gly Arg Arg Arg Leu Leu 1205 1210
1215Ala Ser Ala Lys Glu Leu Gln Lys Gly Asn Glu Phe Ile Leu
Pro 1220 1225 1230Asn Lys Leu Val Lys
Leu Leu Tyr His Ala Lys Asn Ile His Asn 1235 1240
1245Thr Leu Glu Pro Glu His Leu Glu Tyr Val Glu Ser His
Arg Ala 1250 1255 1260Asp Phe Gly Lys
Ile Leu Asp Val Val Ser Val Phe Ser Glu Lys 1265
1270 1275Tyr Ile Leu Ala Glu Ala Lys Leu Glu Lys Ile
Lys Glu Ile Tyr 1280 1285 1290Arg Lys
Asn Met Asn Thr Glu Ile His Glu Met Ala Thr Ala Phe 1295
1300 1305Ile Asn Leu Leu Thr Phe Thr Ser Ile Gly
Ala Pro Ala Thr Phe 1310 1315 1320Lys
Phe Phe Gly His Asn Ile Glu Arg Lys Arg Tyr Ser Ser Val 1325
1330 1335Ala Glu Ile Leu Asn Ala Thr Leu Ile
His Gln Ser Val Thr Gly 1340 1345
1350Leu Tyr Glu Thr Arg Ile Asp Leu Gly Lys Leu Gly Glu Asp 1355
1360 1365
SEQ ID NO: 11 (length 36, DNA, Artificial Sequence; CR RNA)
gttttagagc tgtgttgttt cgaatggttc caaaac 36
SEQ ID NO: 12 (length 87, DNA, Artificial Sequence; TRACR RNA)
ggtttgaaac cattcgaaac aatacagcaa agttaaaata aggctagtcc gtatacaacg 60
tgaaaacacg tggcaccgat tcggtgc 87
SEQ ID NO: 13 (length 1393, PRT, Artificial Sequence; Acholeplasmatales)
Met Lys Asn Asn Glu Glu Thr Leu Lys Lys Leu
Arg Leu Gly Leu Asp1 5 10
15Ile Gly Thr Asn Ser Val Gly Tyr Ala Leu Leu Asp Glu Asn Asn Lys
20 25 30Leu Ile Lys Lys Asn Gly His
Thr Phe Trp Gly Val Arg Met Phe Asp 35 40
45Glu Ala Glu Thr Ala Lys Asp Arg Gly Ser Tyr Arg Lys Ser Arg
Arg 50 55 60Arg Leu Leu Arg Arg Lys
Glu Arg Met Glu Ile Leu Arg Ser Phe Phe65 70
75 80Thr Lys Glu Ile Cys Asp Ile Asp Pro Thr Phe
Phe Glu Arg Leu Asp 85 90
95Asp Ser Phe Tyr Tyr Lys Glu Asp Lys Lys Asn Lys Asn Thr Tyr Asn
100 105 110Leu Phe Thr Ser Glu Tyr
Thr Asp Lys Asp Phe Tyr Leu Glu Tyr Pro 115 120
125Thr Ile Tyr His Leu Arg Lys Ala Met Gln Glu Glu Asp Lys
Lys Phe 130 135 140Asp Ile Arg Met Val
Tyr Leu Ala Ile Ala His Ile Ile Lys Tyr Arg145 150
155 160Gly Asn Phe Leu Tyr Pro Gly Glu Glu Phe
Ser Thr Ser Glu Tyr Thr 165 170
175Ser Ile Lys Gln Phe Phe Leu Asp Phe Asn Asp Ile Leu Asp Glu Leu
180 185 190Ser Asn Glu Leu Glu
Asp Asn Glu Asp Tyr Ser Ala Glu Tyr Phe Asp 195
200 205Lys Ile Glu Asn Ile Asn Asp Asp Phe Leu Glu Lys
Leu Lys Val Ile 210 215 220Leu Met Glu
Ile Lys Gly Ile Ser Asn Lys Lys Lys Glu Leu Leu Asp225
230 235 240Leu Phe Asn Val Asn Lys Lys
Ser Ile Tyr Asn Glu Leu Val Ile Pro 245
250 255Phe Ile Ser Gly Ser Ala Lys Val Asn Ile Ser Ser
Leu Ser Val Ile 260 265 270Lys
Asn Ser Lys Tyr Pro Lys Thr Glu Ile Ser Leu Gly Ser Glu Glu 275
280 285Leu Glu Gly Gln Val Glu Glu Ala Ile
Ser Val Ala Pro Glu Ile Lys 290 295
300Ser Val Leu Glu Met Ile Ile Lys Ile Lys Glu Ile Ser Asp Phe Tyr305
310 315 320Phe Ile Asn Lys
Ile Leu Ser Asp Ser Lys Thr Ile Ser Glu Ser Met 325
330 335Val Lys Met Tyr Asp Glu His Asn Glu Asp
Leu Lys Lys Leu Lys Gly 340 345
350Phe Phe Lys Lys Tyr Ala Glu Asp Gln Tyr Asn Glu Ile Phe Lys Ile
355 360 365Arg Asp Glu Lys Leu Ala Asn
Tyr Val Ala Tyr Val Gly Phe Asn Lys 370 375
380Leu Arg Lys Asn Lys Val Glu Arg Phe Lys His Ala Ser Arg Glu
Glu385 390 395 400Phe Tyr
Gly Tyr Leu Lys Gln Lys Leu Asn Asn Ile Lys Tyr Ala Glu
405 410 415Ala Gln Glu Glu Ile Lys Tyr
Phe Ile Asp Lys Ile Asp Asn Asn Glu 420 425
430Phe Leu Leu Lys Gln Asn Ser Asn Gln Asn Gly Ala Phe Pro
Met Gln 435 440 445Leu His Leu Lys
Glu Leu Lys Thr Ile Leu Asn Asn Gln Glu Lys Tyr 450
455 460Tyr Pro Phe Leu Ser Glu Gly Asn Asp Gly Tyr Ser
Ile Lys Glu Lys465 470 475
480Ile Ile Leu Thr Phe Lys Tyr Lys Ile Pro Tyr Tyr Val Gly Pro Leu
485 490 495Asn Lys Glu Ser Lys
Tyr Ser Trp Val Val Arg Glu Asp Glu Lys Ile 500
505 510Tyr Pro Trp Asn Phe Asp Lys Val Val Lys Leu Asp
Glu Thr Ala Glu 515 520 525Lys Phe
Ile Leu Arg Met Gln Asn Lys Cys Thr Tyr Leu Lys Gly Asp 530
535 540Asn Asp Tyr Cys Leu Pro Lys Asn Ser Leu Ile
Phe Ser Glu Tyr Ser545 550 555
560Cys Leu Ser Tyr Leu Asn Lys Leu Ser Ile Asn Gly Lys Pro Ile Asp
565 570 575Pro Ile Met Lys
Ser Lys Ile Phe Asn Glu Val Phe Leu Ile Lys Lys 580
585 590Gln Pro Thr Lys Lys Asp Ile Ile Glu Phe Ile
Lys Thr Asn Tyr Asn 595 600 605Ala
Asp Ala Leu Thr Thr Thr Glu Lys Glu Leu Pro Glu Ala Thr Cys 610
615 620Asn Met Ala Ser Tyr Ile Lys Met Lys Glu
Ile Phe Gly Lys Asp Phe625 630 635
640Asn Asp Asn Lys Glu Met Ile Glu Asn Ile Ile Lys Asp Ile Thr
Ile 645 650 655Phe Glu Asp
Lys Ser Ile Leu Gly Asn Arg Leu Lys Glu Leu Tyr Lys 660
665 670Leu Asn Asn Asp Arg Ile Lys Gln Ile Lys
Gly Leu Asn Tyr Lys Gly 675 680
685Tyr Ser Arg Leu Ser Lys Asn Leu Leu Val Gly Leu Gln Ile Val Asp 690
695 700Asn Gln Thr Gly Glu Ile Lys Gly
Asn Val Ile Glu Val Met Arg Lys705 710
715 720Thr Asn Leu Asn Leu Gln Glu Ile Leu Tyr Leu Asp
Gly Tyr Arg Leu 725 730
735Ile Asp Ala Ile Asp Glu Tyr Asn Arg Lys Asn Ser Leu Asn Asp Ser
740 745 750Tyr Leu Cys Ala Arg Asp
Tyr Ile Ala Glu Asn Leu Val Ile Ser Pro 755 760
765Ser Phe Lys Arg Ala Leu Ile Gln Thr Cys Ser Ile Ile Gln
Glu Ile 770 775 780Glu Arg Ile Phe His
Lys Lys Ile Asp Glu Phe Tyr Val Glu Val Thr785 790
795 800Arg Thr Asn Lys Asp Lys Asn Lys Gly Lys
Thr Thr Ser Ser Arg Tyr 805 810
815Asp Lys Ile Lys Lys Ile Tyr Ser Ser Cys Gln Glu Leu Ala Met Ala
820 825 830Tyr Asn Phe Asp Met
Lys Arg Leu Lys Asn Glu Leu Glu Ser Asn Lys 835
840 845Asp Asn Leu Lys Ser Asp Ile Leu Tyr Phe Tyr Phe
Thr Gln Leu Gly 850 855 860Lys Cys Met
Tyr Ser Leu Glu Asp Ile Asp Ile Ser Asp Leu Thr Asn865
870 875 880Asn Tyr His Tyr Asp Ile Asp
His Ile Tyr Pro Gln Ser Ile Ile Lys 885
890 895Asp Asp Ser Leu Ser Asn Arg Val Leu Val Asp Lys
Lys Lys Asn Ala 900 905 910Ala
Lys Thr Asp Lys Phe Leu Phe Glu Ala Lys Val Leu Asn Pro Lys 915
920 925Ala Gln Gln Phe Tyr Lys Lys Leu Leu
Ser Leu Glu Leu Ile Ser Lys 930 935
940Glu Lys Tyr Arg Arg Leu Thr Gln Lys Glu Ile Ser Lys Asp Glu Leu945
950 955 960Glu Gly Phe Val
Asn Arg Gln Leu Val Ser Thr Asn Gln Ser Val Met 965
970 975Gly Leu Ile Lys Leu Leu Lys Glu Tyr Tyr
Lys Val Asp Glu Lys Asn 980 985
990Ile Ile Tyr Ser Lys Gly Glu Asn Val Ser Asp Phe Arg His Thr Phe
995 1000 1005Asp Leu Val Lys Ser Arg
Thr Ala Asn Asn Phe His His Ala His 1010 1015
1020Asp Ala Tyr Leu Asn Val Val Val Gly Gly Ile Leu Asn Lys
Tyr 1025 1030 1035Tyr Thr Ser Arg Arg
Phe Tyr Gln Phe Ser Asp Ile Ala Arg Ile 1040 1045
1050Glu Asn Glu Gly Glu Ser Leu Asn Pro Ser Arg Ile Phe
Thr Lys 1055 1060 1065Arg Asp Ile Leu
Lys Ala Asn Gly Lys Val Ile Trp Asp Lys Lys 1070
1075 1080Glu Asp Ile Lys Arg Ile Glu Lys Asp Leu Tyr
His Arg Phe Asp 1085 1090 1095Ile Thr
Glu Thr Ile Arg Thr Tyr Asn Pro Asn Lys Met Tyr Ser 1100
1105 1110Lys Val Thr Ile Leu Pro Lys Gly Glu Gly
Glu Ser Ala Val Pro 1115 1120 1125Phe
Gln Thr Thr Thr Pro Arg Val Asp Val Glu Lys Tyr Gly Gly 1130
1135 1140Ile Thr Ser Asn Lys Phe Ser Arg Tyr
Val Ile Ile Glu Ala His 1145 1150
1155Gly Lys Lys Gly Leu Asp Thr Ile Leu Glu Ala Ile Pro Lys Thr
1160 1165 1170Ala Cys Gly Asp Asn Asn
Lys Ile Glu Lys Asp Ile Asp Asn Tyr 1175 1180
1185Ile Ala Ser Leu Asp Glu Tyr Gln Lys Tyr Thr Ser Tyr Lys
Val 1190 1195 1200Val Asn Tyr Asn Ile
Lys Ala Asn Val Val Ile Gln Glu Gly Ser 1205 1210
1215Phe Lys Tyr Ile Ile Thr Gly Lys Ser Gly Asn Gln Tyr
Val Leu 1220 1225 1230Gln Asn Val Gln
Asp Arg Phe Phe Ser Lys Lys Ala Met Ile Thr 1235
1240 1245Ile Lys Asn Ile Asp Lys Tyr Leu Asn Asn Lys
Lys Leu Gly Ile 1250 1255 1260Ile Met
Ala Lys Asp Asn Glu Lys Ile Ile Val Ser Pro Ala Arg 1265
1270 1275Gly Lys Asn Asn Glu Glu Ile Phe Phe Glu
Lys Thr Glu Leu Val 1280 1285 1290Asn
Leu Leu Lys Glu Ile Lys Thr Met Tyr Ser Lys Asp Ile Tyr 1295
1300 1305Ser Phe Ser Ala Ile Gln Asn Ile Val
Asn Asn Ile Asp Cys Ser 1310 1315
1320Ile Asp Tyr Ser Ile Asp Asp Phe Ile Ile Ile Cys Asn Asn Leu
1325 1330 1335Leu Gln Ile Leu Lys Thr
Asn Glu Arg Lys Asn Ala Asp Leu Arg 1340 1345
1350Leu Ile His Leu Ser Gly Asn Ser Gly Thr Leu Tyr Leu Gly
Lys 1355 1360 1365Lys Leu Lys Ser Gly
Met Lys Phe Ile Trp Gln Ser Ile Thr Gly 1370 1375
1380Tyr Tyr Glu Glu Ile Leu Tyr Glu Val Lys 1385
1390
SEQ ID NO: 14 (length 36, DNA, Artificial Sequence; CR RNA)
gtttgctagt tatgttattt atagtattaa gcaaac 36
SEQ ID NO: 15 (length 82, DNA, Artificial Sequence; TRACR RNA)
tgtaaataac ataacgagtg caaataagcg tttcgcgaaa atttacagtg gccctgctgt 60
ggggcctttt ttatttatca aa 82
SEQ ID NO: 16 (length 1392, PRT, Artificial Sequence; Lachnospiraceae)
Met Ser Glu Lys Tyr Phe Val Gly Leu Asp Met Gly Thr Ser Ser Val1
5 10 15Gly Trp Ala Val Thr Asp
Glu His Tyr His Leu Leu Arg Arg Lys Gly 20 25
30Lys Asp Leu Trp Gly Ala Arg Leu Phe Asp Glu Ala Glu
Thr Ala Ala 35 40 45Gly Arg Arg
Thr Asn Arg Val Ser Arg Arg Arg Leu Ala Arg Gln Arg 50
55 60Ala Arg Ile Gly Trp Leu Lys Glu Leu Phe Arg Pro
Tyr Leu Glu Glu65 70 75
80Lys Asp Ala Gly Phe Leu Gln Arg Leu Glu Glu Ser Arg Phe Phe Leu
85 90 95Glu Asp Lys Thr Val Lys
Gln Pro Tyr Ala Leu Phe Ser Asp Lys Glu 100
105 110Phe Thr Asp Lys Asp Tyr Tyr Gln Lys Tyr Pro Thr
Ile Phe His Leu 115 120 125Arg Lys
Glu Leu Leu Glu Ser Lys Ala Pro His Asp Val Arg Leu Val 130
135 140Phe Leu Ala Val Leu Asn Met Tyr Ala His Arg
Gly His Phe Leu Asn145 150 155
160Pro Glu Leu Gln Glu Gly Thr Leu Gly Asp Ile His Asp Leu Leu Ser
165 170 175Arg Leu Asp Ala
Tyr Ile Gln Asp Leu Phe Glu Asp Gln Gly Trp Ser 180
185 190Ile Leu Glu Asn Val Glu Glu Gln Gln Lys Val
Leu Ala Glu Lys Asn 195 200 205Ile
Ser Asn Thr Val Arg Leu Glu Lys Ile Leu Ser Ala Ile Gly Thr 210
215 220Ser Pro Lys Asp Lys Glu Lys Lys Pro Leu
Ile Glu Ile Tyr Lys Leu225 230 235
240Ile Cys Gly Leu Lys Gly Ser Leu Ser Leu Ala Phe Ser Gly Val
Glu 245 250 255Met Asn Glu
Thr Asp Ala Gln Met Lys Phe Ser Phe Ser Asp Ser Asn 260
265 270Leu Glu Glu Asn Glu Pro Glu Ile Glu Arg
Ile Leu Gly Glu Arg Tyr 275 280
285Phe Glu Met Tyr Ser Ile Leu Lys Glu Ile His Ala Trp Gly Leu Leu 290
295 300Ser Glu Ile Met Ser Asp Asp Ser
Gly Lys Thr Tyr Pro Tyr Ile Ser305 310
315 320Tyr Ala Lys Val Asp Leu Tyr Gln Lys His His Glu
Gln Leu Arg Met 325 330
335Leu Lys Lys Ile Ile Arg Thr Tyr Ala Pro Asp Glu Tyr His Arg Met
340 345 350Phe Arg Ser Met Glu Asp
Asn Thr Tyr Ser Ala Tyr Val Gly Ser Val 355 360
365Asn Ser Lys Asn Lys Lys Gln Arg Arg Gly Ala Lys Ser Thr
Asp Phe 370 375 380Phe Lys Glu Val Lys
Arg Ile Ile Glu Lys Ile Glu Lys Glu His Gly385 390
395 400Glu Leu Pro Glu Cys Glu Glu Ile Leu Asp
Leu Ile Ala Arg Asp Ser 405 410
415Phe Leu Pro Lys Gln Leu Thr Thr Ala Asn Gly Val Ile Pro Asn Gln
420 425 430Val Tyr Ala Thr Glu
Leu Arg Gln Ile Val Thr Asn Ala Ala Ala Tyr 435
440 445Leu Pro Phe Leu Asn Asp Lys Asp Asp Thr Gly Leu
Thr Asn Ala Glu 450 455 460Lys Ile Val
Glu Met Phe Lys Phe His Ile Pro Tyr Tyr Ile Gly Pro465
470 475 480Leu Lys Asn Asp Gly Asn Gly
Thr Ala Trp Val Val Arg Lys Gln Gln 485
490 495Gly Thr Val Tyr Pro Trp Asn Ile Asp Glu Lys Val
Asp Met Ala Lys 500 505 510Thr
Arg Asp Gln Phe Ile Leu Asn Leu Val Arg Lys Cys Ser Tyr Leu 515
520 525Asn Asp Glu Thr Val Leu Pro Ala Ser
Ser Leu Leu Tyr Glu Lys Phe 530 535
540Lys Val Leu Asn Glu Leu Asn Asn Leu Thr Ile Asn Gly Gln Lys Ile545
550 555 560Ser Val Glu Leu
Lys Gln Asp Ile Phe Arg Asp Leu Phe Arg Ala Thr 565
570 575Gly Lys Arg Val Thr Thr Arg Lys Leu Met
Gly Tyr Leu Arg Arg Lys 580 585
590Ala Val Ile Asp Ala Asp Ala Asp Glu Thr Cys Leu Glu Gly Phe Asp
595 600 605Lys Thr Gln Gly Gly Phe Val
Ser Thr Leu Ser Ser Tyr His Lys Phe 610 615
620Met Glu Ile Phe Ser Thr Asp Val Leu Thr Asp Arg Gln Arg Glu
Ile625 630 635 640Ala Glu
Gly Ala Ile Tyr Phe Ala Thr Val Tyr Gly Glu Asp Lys Ser
645 650 655Phe Leu Lys Lys Val Leu Arg
Asp Lys Phe Ser Pro Ala Glu Leu Ser 660 665
670Gln Ala Gln Ile Asp Arg Leu Ser Gly Ile Arg Phe Lys Asp
Trp Ser 675 680 685His Leu Ser Arg
Glu Phe Leu Leu Leu Glu Glu Ala Asp His Ser Thr 690
695 700Gly Glu Ile Met Thr Ile Ile Asp Arg Leu Trp Asn
Thr Asn Glu Asn705 710 715
720Leu Met Gln Ile Ile His Ser Asp Glu Tyr Thr Tyr Lys Gln Ala Ile
725 730 735Glu Glu Arg Thr Ala
Arg Leu Glu Lys Ser Leu Ser Glu Val Ser Phe 740
745 750Glu Asp Ile Glu Asp Ser Tyr Met Ser Ala Pro Val
Arg Arg Met Val 755 760 765Trp Gln
Thr Ile Arg Ile Leu Gln Glu Ile Glu Glu Val Met Gly Ser 770
775 780Glu Pro Ala Arg Val Phe Val Glu Met Thr Arg
Ser Glu Gly Glu Lys785 790 795
800Gly Asp Lys Gly Arg Lys Asp Ser Arg Lys Lys Lys Leu Lys Glu Leu
805 810 815Tyr Lys Lys Cys
Lys Asp Asp Asp Gln Gly Leu Leu Ser Asp Ile Glu 820
825 830Gly Arg Asp Glu Arg Asp Phe Arg Ile Arg Lys
Leu Tyr Leu Tyr Tyr 835 840 845Met
Gln Lys Gly Leu Cys Met Tyr Ser Gly His Pro Ile Asp Phe Gly 850
855 860Lys Leu Phe Asp Asp Ser Tyr Tyr Asp Ile
Asp His Ile Tyr Pro Arg865 870 875
880His Tyr Val Lys Asp Asp Ser Ile Glu Asn Asn Leu Val Leu Val
Glu 885 890 895Ser Lys Leu
Asn Arg Asp Lys Lys Asp Thr Leu Leu Cys Pro Asp Ile 900
905 910Gln Glu Arg Met His Pro Val Trp Glu Met
Leu His Arg Gln Gly Phe 915 920
925Met Asn Asp Glu Lys Phe Lys Arg Leu Met Arg Lys Glu Pro Phe Ser 930
935 940Glu Glu Glu Phe Ala His Phe Ile
Glu Arg Gln Leu Val Glu Thr Gly945 950
955 960Gln Gly Thr Lys Glu Ile Ala Arg Ile Leu Asn Asp
Val Leu Gly Asn 965 970
975Lys Asp Glu Asn Asn Lys Val Ile Tyr Val Lys Ala Gly Asn Val Ser
980 985 990Ser Phe Arg Asn Asp Asn
Lys Lys Asn Pro Glu Phe Val Lys Cys Arg 995 1000
1005Val Ile Asn Asp His His His Ala Lys Asp Ala Tyr
Leu Asn Ile 1010 1015 1020Val Val Gly
Asn Thr Tyr Tyr Thr Lys Phe Thr Leu His Pro Ala 1025
1030 1035Asn Phe Ile Arg Glu Leu Arg Asn Lys Ser His
Pro Thr Leu Glu 1040 1045 1050Asp Gln
Tyr Asn Met Asp Lys Leu Phe Ala Arg Arg Val Glu Arg 1055
1060 1065Asn Gly Tyr Thr Ala Trp Asn Pro Asp Thr
Asp Phe Gln Thr Val 1070 1075 1080Lys
Gln Val Leu Arg Lys Asn Ser Val Leu Ile Ser Arg Arg Ser 1085
1090 1095Phe Ile Glu His Gly Gln Ile Ala Asp
Leu Gln Leu Val Ser Gly 1100 1105
1110Arg Lys Ile Ser Glu Val Asn Gly Lys Gly Tyr Leu Pro Ile Lys
1115 1120 1125Ala Ser Asp Ile Arg Leu
Ser Gly Pro Ser Gly Thr Met Lys Tyr 1130 1135
1140Gly Gly Tyr Asn Lys Ala Ser Gly Ala Tyr Phe Phe Leu Val
Glu 1145 1150 1155His Glu Leu Lys Gly
Lys Leu Val Arg Thr Ile Glu Pro Val Tyr 1160 1165
1170Val Tyr Met Met Ala Ser Ile His Gly Lys Glu Asp Leu
Glu Lys 1175 1180 1185Tyr Cys Gln Glu
Glu Leu Gly Tyr Ile His Pro Arg Ile Cys Leu 1190
1195 1200Lys Lys Ile Pro Met Tyr Ser His Ile Arg Ile
Asn Gly Phe Asp 1205 1210 1215Tyr Tyr
Leu Thr Gly Arg Ser Asn Asp Arg Leu Phe Ile Cys Asn 1220
1225 1230Ala Val Gln Leu Thr Leu Ser Ser Glu Trp
Ser Ala Tyr Ile Lys 1235 1240 1245Ala
Leu Ser Lys Ala Val Asp Glu Lys Trp Asp Ala Ala Tyr Ile 1250
1255 1260Glu Gln Gln Ala Ser Arg Ile Gln Asp
Ser Leu Lys Ser Glu Glu 1265 1270
1275Val Phe Ile Ser Lys Glu Arg Asn Asp Gln Leu Tyr Lys Val Leu
1280 1285 1290Leu Gln Lys His Leu Glu
Gly Phe Phe Asn Asn Arg Ile Asn Ser 1295 1300
1305Ile Gly Thr Ile Met Lys Glu Gly Tyr Asp Ser Phe Arg Ala
Leu 1310 1315 1320Pro Val Asn Glu Gln
Ala Glu Thr Leu Met Glu Ile Leu Lys Ile 1325 1330
1335Ser Gln Leu Val Asn Ile Gly Ala Asn Leu Val Ser Ile
Gly Gly 1340 1345 1350Lys Ser Arg Ser
Gly Val Ala Thr Val Ser Lys Lys Ile Ser Asp 1355
1360 1365Ser Lys Ser Phe Gln Leu Ile Ser Asp Ser Val
Thr Gly Ile Phe 1370 1375 1380Gln Arg
Ala Thr Asp Leu Leu Thr Ile 1385
1390
SEQ ID NO: 17 (length 36, DNA, Artificial Sequence; CR RNA)
gtttgagagc cttgtaaaac cgtatatctc tcaagc 36
SEQ ID NO: 18 (length 97, DNA, Artificial Sequence; TRACR RNA)
gataatgttt tacaaggcga gttcaaataa ggatttatcc gaaatcgctt gcgtgcattg 60
gcaccatcta tcttttaaga ctttctttga aagtctt 97
SEQ ID NO: 19 (length 1364, PRT, Artificial Sequence; Lachnospiraceae)
Met Glu Lys Glu Tyr Tyr
Leu Gly Leu Asp Met Gly Thr Ser Ser Val1 5
10 15Gly Trp Ala Val Thr Asp Lys Glu Tyr Arg Leu Leu
Arg Ala Lys Gly 20 25 30Lys
Asp Met Trp Gly Ile Arg Glu Phe Glu Glu Ala Gln Thr Ala Val 35
40 45Glu Arg Arg Thr His Arg Leu Ser Lys
Arg Arg Arg Ala Arg Gln Leu 50 55
60Val Arg Ile Gly Leu Leu Lys Asp Tyr Phe His Asp Glu Ile Met Lys65
70 75 80Ile Asp Pro Asn Phe
Tyr Ile Arg Leu Glu Asn Ser Lys Tyr Tyr Leu 85
90 95Glu Asp Lys Asp Val Arg Leu Ala Ser Ser Asn
Gly Ile Phe Asp Asp 100 105
110Lys Asn Tyr Thr Asp Lys Asp Tyr Tyr Glu Gln Tyr Lys Thr Ile Phe
115 120 125His Leu Arg Ser Glu Leu Ile
His Asn Ser Gln Lys His Asp Val Arg 130 135
140Leu Val Tyr Leu Ala Leu Leu Asn Met Phe Lys His Arg Gly His
Phe145 150 155 160Leu Phe
Glu Gly Asp Ala Tyr Val Gln Gly Asn Ile Gly Asp Ile Tyr
165 170 175Lys Glu Phe Ile Gln Leu Leu
Lys Asn Glu Tyr Tyr Glu Asp Glu Asn 180 185
190Val Lys Leu Thr Asp Gln Ile Asp Tyr Phe Lys Leu Lys Glu
Ile Leu 195 200 205Ser Asn Ser Glu
Phe Ser Arg Thr Ala Lys Ala Glu Lys Ile Asn Ser 210
215 220Leu Val His Ile Asp Lys Lys Asn Lys Leu Glu Asn
Thr Tyr Ile Arg225 230 235
240Leu Leu Cys Gly Leu Glu Ile Glu Leu Lys Ile Leu Phe Pro Glu Ile
245 250 255Asp Glu Lys Ile Lys
Ile Cys Phe Ala Lys Gly Tyr Asp Glu Lys Leu 260
265 270Val Glu Ile Thr Glu Ile Leu Thr Asp Asn Gln Leu
Gln Ile Leu Glu 275 280 285Asn Leu
Lys Lys Ile His Asp Ile Ala Ala Leu Asp Lys Ile Arg Lys 290
295 300Gly Lys Glu Tyr Leu Ser Asp Ala Arg Val Ala
Glu Tyr Glu Lys His305 310 315
320Arg Glu Asp Leu Ala Leu Leu Lys Lys Ile Tyr Arg Glu Tyr Met Thr
325 330 335Lys Gln Asp Tyr
Asp Arg Met Phe Arg Glu Gly Glu Asp Gly Ser Tyr 340
345 350Ser Ala Tyr Val Asn Ser Tyr Asn Thr Ser Lys
Lys Gln Arg Arg Asn 355 360 365Met
Lys His Arg Lys Ile Asp Glu Phe Tyr Gly Thr Ile Arg Lys Asp 370
375 380Leu Lys Leu Leu Leu Lys Gln Gly Ile Gln
Asp Asp Asn Ile Glu Arg385 390 395
400Ile Leu Glu Glu Ile Asp Gly Asn Asn Asp Asn Lys Phe Met Pro
Lys 405 410 415Gln Leu Ser
Phe Ala Asn Gly Val Ile Pro Asn Ser Leu His Lys Ala 420
425 430Glu Met Lys Ala Ile Leu Arg Asn Ala Glu
Thr Tyr Leu Pro Phe Leu 435 440
445Leu Glu Thr Asp Glu Ser Gly Leu Thr Val Ser Glu Arg Ile Leu Gln 450
455 460Leu Phe Ser Phe His Ile Pro Tyr
Tyr Ile Gly Pro Val Ser Val Asn465 470
475 480Ser Glu Lys Asn Asn Gly Asn Gly Trp Val Val Arg
Arg Glu Asp Gly 485 490
495Glu Val Leu Pro Trp Asn Ile Glu Gln Lys Ile Asp Tyr Gly Glu Thr
500 505 510Ser Lys Arg Phe Ile Glu
Lys Met Val Arg Arg Cys Thr Tyr Ile Ser 515 520
525Gly Glu Gln Val Leu Pro Lys Asn Ser Phe Ile Tyr Glu Lys
Tyr Cys 530 535 540Val Leu Asn Glu Ile
Asn Asn Ile Lys Ile Asp Gly Glu Arg Ile Thr545 550
555 560Val Glu Leu Lys Gln Asn Ile Tyr Asn Asp
Leu Tyr Leu His Gly Lys 565 570
575Arg Val Thr Lys Lys Gln Leu Ile Asn Tyr Leu Asn Asn Arg Gly Met
580 585 590Ile Glu Asp Glu Asn
Gln Val Ser Gly Ile Asp Ile Asn Leu Asn Asn 595
600 605Tyr Leu Gly Ser Tyr Gly Lys Phe Leu Pro Ile Phe
Glu Glu Lys Leu 610 615 620Lys Glu Asp
Asn Tyr Ile Lys Ile Ala Glu Asp Ile Ile Tyr Leu Ala625
630 635 640Ser Ile Tyr Gly Asp Ser Lys
Lys Met Leu Lys Ser Gln Ile Lys Ser 645
650 655Lys Tyr Gly Asp Ile Leu Asp Asp Lys Gln Ile Lys
Arg Ile Leu Gly 660 665 670Leu
Lys Phe Lys Asp Trp Gly Arg Ile Ser Arg Arg Phe Leu Glu Leu 675
680 685Glu Gly Leu Asp Lys Glu Thr Gly Glu
Ile Thr Thr Ile Ile Lys Ala 690 695
700Met Trp Asp Tyr Asn Leu Asn Phe Met Glu Ile Ile His Ser Asp Ala705
710 715 720Phe Asp Phe Lys
Asp Lys Ile Glu Glu Leu His Ala Asn Ser Ile Lys 725
730 735Pro Leu Ala Glu Ile Glu Val Glu Asp Leu
Asp Asp Met Tyr Phe Ser 740 745
750Ala Pro Val Lys Arg Met Ile Trp Gln Thr Phe Lys Val Ile Lys Glu
755 760 765Ile Glu Lys Val Met Gly Cys
Pro Pro Lys Lys Val Phe Ile Glu Met 770 775
780Thr Arg Ile Asn Asp Lys Lys Ser Lys Gly Lys Arg Thr Asn Ser
Arg785 790 795 800Lys Glu
Lys Phe Leu Ser Leu Tyr Lys Asn Ile His Asp Glu Leu Val
805 810 815Asp Trp Lys Gln Leu Ile Ile
Ser Ser Asp Glu Ser Gly Lys Leu Asn 820 825
830Ser Lys Lys Met Tyr Leu Tyr Leu Thr Gln Gln Gly Ile Cys
Met Tyr 835 840 845Thr Gly Arg Arg
Ile Asn Leu Glu Glu Leu Phe Asp Asp Asn Lys Tyr 850
855 860Asp Ile Asp His Ile Tyr Pro Arg His Phe Val Lys
Asp Asp Asn Leu865 870 875
880Glu Asn Asn Leu Val Leu Val Glu Lys Gln Ser Asn Ser Arg Lys Ser
885 890 895Asp Thr Tyr Pro Ile
Asp Lys Ser Ile Arg Asn Asn Ser Gln Val Tyr 900
905 910Lys His Trp Lys Ser Leu Arg Glu Gly Asn Phe Ile
Ser Lys Glu Lys 915 920 925Tyr Asp
Arg Leu Thr Gly Lys Asn Glu Phe Thr Asp Glu Gln Lys Ala 930
935 940Gly Phe Ile Ala Arg Gln Met Val Glu Thr Ser
Gln Gly Thr Lys Gly945 950 955
960Val Ala Asp Ile Ile Lys Gln Ala Leu Pro Gln Ser Arg Ile Ile Tyr
965 970 975Ser Lys Ala Ser
Asn Val Ser Glu Phe Arg Arg Lys Tyr Asp Ile Leu 980
985 990Lys Ser Arg Thr Val Asn Glu Phe His His Ala
His Asp Ala Tyr Leu 995 1000
1005Asn Ile Val Val Gly Asn Val Tyr Asp Thr Lys Phe Thr Ser Asn
1010 1015 1020Pro Leu Asn Phe Ile Lys
Lys Gln Tyr Asn Val Asp Arg Lys Ala 1025 1030
1035Asn Asn Tyr Asn Leu Asp Lys Met Phe Val Tyr Asp Val Lys
Arg 1040 1045 1050Gly Asn Glu Ile Ala
Trp Ile Gly Trp Asn Pro Lys Lys Ser Glu 1055 1060
1065Asp Ser Ser Glu Met Ser Lys Arg Gly Thr Ile Val Thr
Val Lys 1070 1075 1080Lys Met Leu Ser
Lys Asn Thr Pro Leu Met Thr Arg Met Ser Phe 1085
1090 1095Val Gly His Gly Gly Ile Ala Glu Asp Asn Leu
Ser Ser His Phe 1100 1105 1110Val Ala
Lys Asn Lys Gly Tyr Met Pro Asn Gly Lys Glu Ser Asp 1115
1120 1125Val Thr Lys Tyr Gly Gly Tyr Lys Lys Ala
Lys Thr Ala Tyr Phe 1130 1135 1140Phe
Val Val Glu His Gly Gln Thr Asn Asn Arg Ile Arg Thr Ile 1145
1150 1155Glu Thr Leu Pro Ile Tyr Arg Arg Arg
Glu Val Glu Lys Tyr Glu 1160 1165
1170Asp Gly Leu Ile Lys Tyr Cys Glu Gln Ser Leu Ser Leu Leu Asn
1175 1180 1185Pro Ile Ile Ile Tyr Lys
Lys Ile Lys Ile Gln Ser Leu Met Lys 1190 1195
1200Ile Asn Gly Tyr Tyr Ala Tyr Ile Ser Gly Lys Ser Asn Glu
Val 1205 1210 1215Tyr Thr Phe Arg Asn
Gly Val Asn Met Cys Leu Ser Gln Glu Trp 1220 1225
1230Ile Asn Tyr Val Lys Lys Leu Glu Asn Tyr Ile Glu Lys
Asp Arg 1235 1240 1245Gln Asp Arg Met
Ile Thr Tyr Glu Lys Asn Ile Glu Leu Tyr Glu 1250
1255 1260Ile Ile Leu Arg Lys Tyr Ser Thr Thr Ile Leu
Asn Lys Arg Leu 1265 1270 1275Ser Lys
Met Asp Lys Lys Leu Ile Asn Ala Lys Asp Arg Phe Cys 1280
1285 1290Ile Leu Asn Val Lys Glu Gln Ser Gln Val
Leu Ile Asn Val Phe 1295 1300 1305Val
Leu Ser Arg Ile Gly Asp Asn Gln Thr Asp Leu Ser Lys Ile 1310
1315 1320Gly Ile Gly Lys Gln Ser Gly Gln Ile
Thr Gln Asn Lys Lys Ile 1325 1330
1335Thr Gly Cys Lys Glu Phe Lys Leu Val Asn Gln Ser Val Thr Gly
1340 1345 1350Leu Tyr Glu Asn Glu Ile
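Asp Leu Leu Thr Val 1355 1360

SEQ ID NO: 20; length: 36; type: DNA; organism: Artificial Sequence; description: CR RNA
gtttgagagt cttgttaatt cttaaaggtg taaaac 36

SEQ ID NO: 21; length: 100; type: DNA; organism: Artificial Sequence; description: TRACR RNA
gagaattaac aagacgagtg caaataaggt ttatccggaa tcgtcaatat gacctgcatt 60
gtgcagaatc tttaaaatca tatgatttca tatggtttta 100

SEQ ID NO: 22; length: 1357; type: PRT; organism: Artificial Sequence; description: ELEPHANT FECES
Met Glu Lys Asn Asn Tyr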
Leu Leu Gly Leu Asp Ile Gly Thr Asp Ser1 5
10 15Val Gly Tyr Ala Val Thr Asn Asp Lys Tyr Asp Ile
Leu Lys Phe His 20 25 30Gly
Glu Pro Ala Trp Gly Val Thr Ile Phe Asp Glu Ala Ser Leu Ser 35
40 45Thr Glu Lys Arg Ser Phe Arg Val Ser
Arg Arg Arg Leu Asp Arg Arg 50 55
60Gln Gln Arg Val Leu Leu Val Gln Glu Leu Phe Ala Ser Glu Val Ala65
70 75 80Lys Val Asp Lys Asp
Phe Phe Lys Arg Ile Gln Glu Ser Asn Leu Tyr 85
90 95Arg Ser Asp Ala Glu Asn Gln Ala Gly Leu Phe
Ile Gly Glu Asp Tyr 100 105
110Cys Asp Arg Glu Tyr Tyr Gly Gln Tyr Pro Thr Ile His His Leu Ile
115 120 125Ser Asp Leu Met Asn Gly Thr
Ser Pro His Asp Val Arg Leu Val Tyr 130 135
140Leu Ala Cys Ala Trp Leu Val Ala His Arg Gly His Phe Leu Ser
Asn145 150 155 160Ile Asp
Lys Asp Asn Leu Ser Gly Leu Lys Asp Phe Ser Ser Val Tyr
165 170 175Glu Gly Leu Met Gln Tyr Phe
Ser Asp Asn Gly Tyr Glu Arg Pro Trp 180 185
190Asn Ala Asn Val Asp Val Lys Ala Leu Gly Asp Ala Leu Lys
Lys Lys 195 200 205Gln Gly Val Thr
Ala Lys Thr Lys Glu Leu Leu Ala Leu Leu Leu Asp 210
215 220Ser Ala Lys Ala Glu Lys Leu Pro Arg Glu Glu Phe
Pro Phe Ser Gln225 230 235
240Asp Gly Ile Ile Lys Leu Leu Ala Gly Gly Thr Tyr Lys Leu Ser Glu
245 250 255Leu Phe Gly Asn Glu
Glu Tyr Lys Asp Phe Gly Ser Val Lys Leu Ser 260
265 270Met Asp Asp Glu Lys Leu Gly Glu Ile Met Ser Asn
Ile Gly Glu Asp 275 280 285Tyr Glu
Leu Ile Ala Ser Leu Arg Ile Val Ser Asp Trp Ala Val Leu 290
295 300Val Asp Val Leu Gly Glu Ser Ala Thr Ile Ser
Glu Ala Lys Val Gly305 310 315
320Ile Tyr Asn Gln His Lys Ala Asp Leu Glu Val Leu Lys Lys Ile Ile
325 330 335Arg Lys Tyr Thr
Gly Lys Glu Gly Tyr Lys Lys Val Phe Arg Gln Val 340
345 350Asp Ser Lys Glu Asn Tyr Val Ala Tyr Ser Gln
His Glu Ser Asp Gly 355 360 365Lys
Ala Pro Lys Glu Lys Gly Ile Asp Ile Ala Thr Phe Ser Lys Phe 370
375 380Ile Leu Asn Ile Val Arg Leu Leu Asp Val
Glu Pro Glu Asp Lys Glu385 390 395
400Val Tyr Glu Asp Met Val Ala Arg Leu Glu Leu Asn Ser Phe Leu
Pro 405 410 415Lys Gln Val
Asn Thr Asp Asn Arg Val Ile Pro Tyr Gln Leu Tyr Trp 420
425 430Phe Glu Leu His Lys Ile Leu Glu Asn Ala
Ser Ile Tyr Leu Pro Met 435 440
445Leu Thr Glu Lys Asp Ser Asn Gly Ile Ser Val Met Glu Lys Leu Glu 450
455 460Ser Val Phe Met Phe Arg Ile Pro
Tyr Phe Val Gly Pro Leu Asn Lys465 470
475 480His Ser Lys Tyr Ala Trp Leu Glu Arg Lys Glu Gly
Lys Ile Tyr Pro 485 490
495Trp Asn Phe Glu Asn Met Val Asp Leu Asp Ala Ser Glu Ala Asn Phe
500 505 510Ile Lys Arg Met Thr Asn
Thr Cys Thr Tyr Leu Pro Gly Gln Asn Val 515 520
525Leu Pro Lys Asp Ser Leu Arg Tyr His Arg Phe Met Val Leu
Asn Glu 530 535 540Ile Asn Asn Leu Arg
Ile Asn Asn Glu Arg Ile Ser Val Glu Leu Lys545 550
555 560Gln Lys Ile Tyr Ser Glu Leu Phe Leu Asn
Val Lys Lys Val Thr Arg 565 570
575Lys Arg Leu Val Asp Phe Leu Ile Ser Asn Gly Glu Leu Arg Lys Gly
580 585 590Glu Glu Ser Ser Leu
Thr Gly Ile Asp Val Glu Ile Lys Ala Asn Leu 595
600 605Ala Pro Gln Ile Ala Phe Lys Lys Leu Met Glu Ser
Gly Gln Leu Thr 610 615 620Glu Glu Asp
Val Glu Ser Ile Ile Glu Arg Ala Ser Tyr Ala Glu Asp625
630 635 640Lys Ala Arg Leu Ala His Trp
Leu Glu Ala Lys Tyr Ser Lys Leu Ser 645
650 655Glu Ile Asp Arg Lys Tyr Ile Cys Gly Ile Lys Ile
Lys Asp Phe Gly 660 665 670Arg
Leu Ser Lys Met Phe Leu Ser Glu Leu Glu Gly Val Asp Lys Thr 675
680 685Thr Gly Glu Met Thr Thr Ile Leu Gly
Ala Met Trp Asn Ser Gln Leu 690 695
700Asn Leu Met Glu Leu Ile Asn Ser Glu Leu Tyr Ser Phe Arg Glu Ala705
710 715 720Ile Cys Ala Tyr
Gln Thr Asp Tyr Tyr Ser Thr His Ser Ser Ser Leu 725
730 735Glu Glu Arg Met Asn Glu Met Tyr Leu Ser
Asn Ala Val Lys Arg Pro 740 745
750Val Tyr Arg Thr Leu Asp Ile Val Lys Asp Val Lys Lys Ala Phe Gly
755 760 765Glu Pro Lys Lys Ile Phe Val
Glu Met Thr Arg Gly Ala Ser Glu Glu 770 775
780Gln Lys Gly Lys Arg Thr Lys Ser Arg Lys Glu Gln Ile Leu Glu
Leu785 790 795 800Tyr Lys
Gln Cys Lys Asp Glu Asp Val Arg Ile Leu Gln Gln Gln Leu
805 810 815Glu Glu Met Gly Asp Leu Ala
Asp Asn Lys Leu Gln Gly Asp Lys Leu 820 825
830Phe Leu Tyr Tyr Met Gln Lys Gly Lys Cys Met Tyr Thr Gly
Thr Pro 835 840 845Ile Val Leu Glu
Gln Leu Gly Ser Lys Ala Tyr Asp Ile Asp His Ile 850
855 860Tyr Pro Gln Ala Tyr Val Lys Asp Asp Ser Ile Leu
Asn Asn Arg Val865 870 875
880Leu Val Leu Ser Glu Ala Asn Gly Lys Lys Lys Asp Ile Tyr Pro Ile
885 890 895Glu Lys Glu Thr Arg
Asp Lys Met His Gly Phe Trp Thr Tyr Leu Asn 900
905 910Asp Lys Gly Met Ile Thr Glu Glu Lys Tyr Lys Arg
Leu Thr Arg Thr 915 920 925Thr Gly
Phe Thr Glu Glu Glu Lys Trp Ser Phe Ile Asn Arg Gln Leu 930
935 940Thr Glu Thr Ser Gln Ala Thr Lys Ala Val Ala
Thr Leu Leu Gly Glu945 950 955
960Leu Phe Pro Asn Ala Glu Ile Val Tyr Ser Lys Ala Arg Leu Thr Ser
965 970 975Glu Phe Arg Gln
Glu Phe Asn Leu Leu Lys Cys Arg Ser Tyr Asn Asp 980
985 990Leu His His Ala Val Asp Ala Tyr Leu Asn Ile
Val Cys Gly Asn Val 995 1000
1005Tyr Asn Met Lys Phe Thr Lys Arg Trp Phe Asn Ile Asn Lys Asp
1010 1015 1020Tyr Ser Ile Lys Thr Lys
Thr Val Phe Thr His Pro Val Val Cys 1025 1030
1035Gly Gly Gln Val Val Trp Asp Gly Gln Glu Met Leu Asn Lys
Val 1040 1045 1050Ile Arg Asn Ala Lys
Lys Asn Thr Ala His Phe Thr Lys Tyr Ala 1055 1060
1065Tyr Ile Arg Lys Gly Gly Phe Phe Asp Gln Met Pro Val
Lys Ala 1070 1075 1080Ala Glu Gly Leu
Thr Pro Leu Lys Lys Asp Met Pro Thr Ala Val 1085
1090 1095Tyr Gly Gly Tyr Asn Lys Pro Ser Val Ala Phe
Leu Ile Pro Thr 1100 1105 1110Arg Tyr
Lys Ala Gly Lys Lys Thr Glu Ile Ile Ile Leu Ser Val 1115
1120 1125Glu His Leu Phe Gly Glu Arg Phe Leu Arg
Asp Glu Ala Tyr Ala 1130 1135 1140Lys
Glu Tyr Ala Ala Glu Arg Leu Lys Lys Ile Leu Gly Lys Gln 1145
1150 1155Val Asp Glu Val Ser Phe Pro Met Gly
Met Arg Pro Trp Lys Ile 1160 1165
1170Asn Thr Val Leu Ser Leu Asp Gly Phe Leu Ile Cys Ile Ser Gly
1175 1180 1185Ile Gly Ser Gly Gly Lys
Cys Leu Arg Ala Gln Ser Ile Met Gln 1190 1195
1200Phe Ser Ser Asp Tyr Arg Trp Thr Ile Tyr Leu Lys Arg Leu
Glu 1205 1210 1215Arg Leu Val Glu Lys
Ile Thr Val Asn Ala Lys Tyr Val Tyr Ser 1220 1225
1230Glu Glu Phe Asp Lys Val Ser Thr Ile Glu Asn Ile Glu
Leu Tyr 1235 1240 1245Asp Leu Tyr Ile
Glu Lys Tyr Lys Ala Thr Ile Phe Ser Lys Arg 1250
1255 1260Val Asn Ser Pro Glu Glu Ile Ile Glu Ser Gly
Arg Asp Lys Phe 1265 1270 1275Val Lys
Leu Asp Val Leu Ser Gln Ala Arg Ala Leu Leu Cys Ile 1280
1285 1290His Gln Thr Phe Gly Arg Ile Val Gly Gly
Cys Asp Leu Gly Leu 1295 1300 1305Ile
Gly Gly Lys Lys Asn Ser Ala Ala Thr Gly Asn Phe Ser Ser 1310
1315 1320Thr Ile Ser Asn Trp Ala Lys Tyr Tyr
Lys Asp Val Arg Ile Ile 1325 1330
1335Asp Gln Ser Thr Ser Gly Leu Trp Val Arg Lys Ser Glu Asn Leu
1340 1345 1350Leu Glu Leu Val
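1355

SEQ ID NO: 23; length: 36; type: DNA; organism: Artificial Sequence; description: CR RNA
gtttgagagt agtgtaaatc cataggggtc tcaaac 36

SEQ ID NO: 24; length: 102; type: DNA; organism: Artificial Sequence; description: TRACR RNA
agacccctat ggatttacat tgcgagttca aataaaagtt tactcaaatc gttggcttga 60
ccaaccgcac agcgtgtgct taaagatctc ttcagtgagg tc 102

SEQ ID NO: 25; length: 1368; type: PRT; organism: Artificial Sequence; description: Lachnospiraceae bacterium
Met Asn Phe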
Asp Gly Glu Tyr Phe Leu Gly Leu Asp Ile Gly Thr Asp1 5
10 15Ser Val Gly Tyr Ala Val Thr Asp Gln
Arg Tyr Asn Leu Val Lys Phe 20 25
30Lys Gly Glu Pro Met Trp Gly Ser His Leu Phe Asp Ala Ala Asn Gln
35 40 45Cys Ala Glu Arg Arg Gly Phe
Arg Thr Ala Arg Arg Arg Leu Asp Arg 50 55
60Arg Gln Gln Arg Val Lys Leu Val Asp Glu Ile Phe Ala Pro Glu Val65
70 75 80Ala Lys Val Asp
Pro Asn Phe Tyr Ile Arg Lys Met Glu Ser Ala Leu 85
90 95Tyr Pro Glu Asp Lys Ser Asn Lys Gly Asp
Leu Tyr Leu Tyr Phe Asn 100 105
110Lys Gln Glu Tyr Asp Glu Lys His Tyr Tyr Lys Asp Tyr Pro Thr Ile
115 120 125His His Leu Ile Cys Ala Leu
Met Asn Asp Glu Lys Thr Lys Phe Asp 130 135
140Ile Arg Leu Ile Asn Ile Ala Ile Asp Trp Leu Val Ala His Arg
Gly145 150 155 160His Phe
Leu Ser Glu Val Gly Thr Asp Ser Val Asp Lys Val Leu Asp
165 170 175Phe Arg Lys Ile Tyr Asp Glu
Phe Met Ala Leu Phe Ser Asp Glu Asp 180 185
190Asp Ala Val Ser Ser Lys Pro Trp Glu Asn Ile Asn Pro Asp
Glu Leu 195 200 205Gly Lys Val Leu
Lys Ile His Gly Lys Asn Ala Lys Arg Asn Glu Leu 210
215 220Lys Lys Leu Leu Tyr Gly Gly Lys Ile Pro Thr Asp
Glu Asp Ser Phe225 230 235
240Ile Asp Arg Lys Leu Leu Ile Asp Phe Ile Ala Gly Thr Ser Val Gln
245 250 255Cys Asn Lys Leu Phe
Arg Asn Ser Glu Tyr Glu Asp Asp Leu Lys Ile 260
265 270Thr Ile Ser Asn Ser Asp Glu Arg Glu Val Val Leu
Pro Gln Leu Glu 275 280 285Asp Phe
His Ala Asp Ile Ile Ala Lys Leu Ser Ser Met Tyr Asp Trp 290
295 300Ser Val Leu Ser Asp Ile Leu Ser Gly Ser Thr
Tyr Ile Ser Glu Ser305 310 315
320Lys Val Lys Val Tyr Glu Gln His Lys Lys Asp Leu Lys Glu Leu Lys
325 330 335Glu Phe Val Arg
Lys Tyr Ala Pro Glu Lys Tyr Asn Asp Ile Phe Arg 340
345 350Leu Ala Ser Lys Glu Thr Tyr Asn Tyr Thr Ala
Tyr Ser Tyr Asn Leu 355 360 365Lys
Ser Val Lys Asp Glu Lys Asp Leu Pro Lys Gly Lys Ala Ser Lys 370
375 380Glu Asp Phe Tyr Ser Tyr Leu Lys Lys Thr
Leu Lys Leu Asp Lys Ala385 390 395
400Glu Asn Tyr Asn Phe Val Asn Asp Ala Asp Thr Arg Phe Phe Asp
Asp 405 410 415Met Val Glu
Arg Ile Ser Ser Gly Thr Phe Leu Pro Lys Gln Val Asn 420
425 430Ser Asp Asn Arg Val Ile Pro Tyr Gln Val
Tyr Tyr Ile Glu Leu Lys 435 440
445Lys Ile Leu Glu Asn Ala Lys Lys His Tyr Ala Phe Phe Glu Glu Lys 450
455 460Asp Glu Asp Gly Tyr Ser Asn Val
Glu Lys Ile Met Ser Val Phe Thr465 470
475 480Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Arg Asn
Asp Asp Lys Ser 485 490
495Pro Tyr Ala Trp Ile Arg Arg Lys Ala Asp Gly Lys Ile Tyr Pro Trp
500 505 510Asp Phe Glu Glu Lys Val
Asp Leu Asp Ala Ser Glu Asn Ala Phe Ile 515 520
525Asp Arg Met Thr Asn Ser Cys Thr Tyr Ile Pro Gly Ala Asp
Val Leu 530 535 540Pro Lys Trp Ser Leu
Leu Tyr Thr Lys Tyr Met Val Leu Asn Glu Ile545 550
555 560Asn Asn Ile Lys Val Asn Asn Ile Gly Ile
Ser Val Glu Ala Lys Gln 565 570
575Gly Ile Tyr Asn Glu Leu Phe Cys Lys Lys Ala Lys Val Ser Leu Lys
580 585 590Ala Ile Arg Glu Tyr
Leu Ile Ser Asn Gly Phe Met Gln Lys Asp Asp 595
600 605Glu Met Ser Gly Ile Asp Ile Thr Val Lys Ser Ser
Leu Lys Ser Arg 610 615 620Tyr Asp Phe
Arg His Leu Leu Glu Lys Asn Glu Leu Thr Thr Asp Asp625
630 635 640Val Glu Ala Ile Ile Ser Arg
Ser Thr Tyr Ala Glu Asp Lys Ala Arg 645
650 655Phe Lys Lys Trp Leu Lys Lys Glu Phe Pro Gln Leu
Ser Asp Glu Asp 660 665 670Tyr
Lys Tyr Val Ser Lys Leu Lys Tyr Lys Asp Phe Gly Arg Leu Ser 675
680 685Arg Ser Leu Leu Asn Gly Leu Glu Gly
Ala Ser Lys Glu Thr Gly Glu 690 695
700Ile Gly Thr Ile Met His Phe Leu Trp Glu Thr Asn Asp Asn Leu Met705
710 715 720Gln Leu Leu Ser
Asp Arg Tyr Thr Phe Met Glu Glu Ile Asn Lys Lys 725
730 735Arg Gln Asp Tyr Tyr Ile Glu His Lys Leu
Thr Leu Asn Glu Gln Met 740 745
750Glu Glu Leu Gly Ile Ser Asn Ala Val Lys Arg Pro Val Thr Arg Thr
755 760 765Leu Ala Val Val Lys Asp Val
Val Ser Ala Ile Gly Tyr Ala Pro Gln 770 775
780Lys Ile Phe Val Glu Met Ala Arg Gln Glu Asp Glu Lys Lys Lys
Arg785 790 795 800Ser Val
Thr Arg Lys Glu Gln Ile Leu Glu Leu Tyr Lys Asn Val Glu
805 810 815Glu Asp Thr Lys Glu Leu Glu
Arg Gln Leu Lys Lys Met Gly Asp Thr 820 825
830Ala Asn Asn Glu Leu Gln Ser Asp Ala Leu Phe Leu Tyr Tyr
Leu Gln 835 840 845Leu Gly Lys Cys
Met Tyr Ser Gly Lys Pro Ile Asp Leu Thr Gln Ile 850
855 860Lys Thr Thr Lys Lys Tyr Asp Ile Asp His Ile Trp
Pro Gln Ser Met865 870 875
880Val Lys Asp Asp Ser Leu Leu Asn Asn Arg Val Leu Val Leu Ser Glu
885 890 895Ile Asn Gly Asp Lys
Lys Asp Val Tyr Pro Ile Asp Glu Ser Ile Arg 900
905 910Ser Lys Met His Ser Tyr Trp Lys Met Leu Leu Asp
Lys Asn Leu Ile 915 920 925Thr Lys
Glu Lys Tyr Ser Arg Leu Thr Arg Pro Thr Pro Phe Thr Glu 930
935 940Ser Glu Lys Leu Gly Phe Ile Asn Arg Gln Leu
Val Glu Thr Arg Gln945 950 955
960Ser Met Lys Ala Val Thr Gln Leu Leu Asn Asn Met Tyr Pro Asp Ser
965 970 975Glu Ile Val Tyr
Val Lys Ala Lys Leu Ala Ala Asp Phe Lys Gln Asp 980
985 990Phe Lys Leu Ala Pro Lys Ser Arg Ile Ile Asn
Asp Leu His His Ala 995 1000
1005Lys Asp Thr Tyr Leu Asn Val Val Ala Gly Asn Val Tyr Asn Glu
1010 1015 1020Arg Phe Thr Lys Lys Trp
Phe Asn Val Asn Glu Lys Tyr Ser Met 1025 1030
1035Lys Thr Lys Val Leu Phe Gly His Asp Val Lys Ile Gly Asp
Arg 1040 1045 1050Leu Ile Trp Asp Ser
Lys Lys Asp Leu Gln Thr Val Lys Asn Thr 1055 1060
1065Tyr Glu Lys Asn Asn Ile His Leu Thr Arg Tyr Ala Tyr
Cys Gln 1070 1075 1080Lys Gly Gly Leu
Phe Asp Gln Met Pro Val Lys Lys Gly Gln Gly 1085
1090 1095Gln Ile Gln Leu Lys Lys Gly Met Asp Ile Asp
Arg Tyr Gly Gly 1100 1105 1110Tyr Asn
Lys Ala Thr Ala Ser Phe Phe Ile Ile Ala Arg Tyr Leu 1115
1120 1125Arg Gly Gly Lys Lys Glu Val Ser Phe Val
Pro Val Glu Leu Met 1130 1135 1140Val
Ser Glu Lys Phe Leu Asn Asp Asp Asn Phe Ala Ile Glu Tyr 1145
1150 1155Ile Thr Asn Val Leu Thr Gly Met Asn
Thr Lys Lys Ile Glu Asn 1160 1165
1170Val Glu Leu Pro Leu Gly Lys Arg Val Ile Lys Ile Lys Thr Val
1175 1180 1185Leu Leu Leu Asp Gly Tyr
Lys Val Trp Val Asn Gly Lys Ala Ser 1190 1195
1200Gly Gly Thr Arg Val Met Leu Thr Ser Ala Glu Ser Leu Arg
Met 1205 1210 1215Pro Lys Glu Tyr Val
Glu Tyr Leu Lys Lys Met Glu Asn Tyr Ser 1220 1225
1230Glu Lys Lys Lys Ser Asn Arg Asn Phe Met His Asp Ser
Glu Asn 1235 1240 1245Asp Gly Leu Ser
Glu Glu Lys Asn Ile Leu Leu Tyr Asp Lys Leu 1250
1255 1260Leu Glu Lys Leu Asp Glu Asn His Phe Lys Lys
Met Pro Gly Asn 1265 1270 1275Gln Cys
Glu Thr Met Lys Ser Gly Arg Val Lys Phe Ile Glu Leu 1280
1285 1290Asp Phe Asp Val Gln Ile Ser Thr Leu Leu
Asn Cys Ile Asp Leu 1295 1300 1305Leu
Lys Ser Gly Arg Thr Gly Gly Cys Asp Leu Lys Asn Ile Gly 1310
1315 1320Gly Lys Ser Ala Ser Gly Val Val Tyr
Ile Ser Ala Asn Leu Ser 1325 1330
1335Ala Cys Lys Tyr Asn Asp Val His Ile Ile Asp Ile Ser Pro Ala
1340 1345 1350Gly Leu His Glu Asn Ile
Ser Cys Asn Leu Met Glu Leu Phe Glu 1355 1360
1365

SEQ ID NO: 26; length: 36; type: DNA; organism: Artificial Sequence; description: CR RNA
gtttgagagt agtgtaaatc cagagggctc caaaac 36

SEQ ID NO: 27; length: 99; type: DNA; organism: Artificial Sequence; description: TRACR RNA
gagccctctg gatttacact acgagttcaa ataaaaatta tttcaaatcg ccgctatgtc 60
ggccgcacag tgtgtgcatt aagaaaagtc cgaaagggc 99

SEQ ID NO: 28; length: 1384; type: PRT; organism: Artificial Sequence; description: Ruminococcaceae
Met Ser Phe Lys Glu Asn Ser Lys Phe Tyr Phe Gly Leu Asp Ile Gly1
5 10 15Thr Asp Ser Val Gly Trp
Ala Val Thr Asp Asn Leu Tyr Lys Leu Tyr 20 25
30Lys Tyr Lys Asn Asn Leu Met Trp Gly Val Ser Leu Phe
Glu Ala Ala 35 40 45Ser Pro Ala
Glu Asp Arg Arg Asn His Arg Thr Ala Arg Arg Arg Leu 50
55 60Asp Arg Arg Gln Gln Arg Val Ala Leu Leu Arg Glu
Leu Phe Ala Lys65 70 75
80Glu Ile Leu Lys Thr Asp Pro Asp Phe Phe Leu Arg Leu Lys Glu Ser
85 90 95Ser Leu Tyr Pro Glu Asp
Arg Thr Asn Lys Asn Val Asn Thr Tyr Phe 100
105 110Asp Asp Ala Asp Phe Lys Asp Ser Asp Tyr Phe Lys
Met Tyr Pro Thr 115 120 125Val His
His Leu Ile Lys Glu Leu Ser Glu Ser Asp Lys Pro His Asp 130
135 140Val Arg Leu Val Tyr Leu Ala Cys Ala Phe Ile
Val Ala His Arg Gly145 150 155
160His Phe Leu Asn Gly Ala Asp Glu Asn Asn Val Gln Glu Val Leu Asp
165 170 175Phe Asn Ser Ser
Tyr Cys Glu Phe Thr Asp Trp Phe Lys Ser Asn Asp 180
185 190Ile Glu Asp Asn Pro Phe Ser Glu Ser Thr Glu
Asn Glu Phe Ser Val 195 200 205Ile
Leu Arg Lys Lys Ile Gly Ile Thr Ala Lys Glu Lys Glu Ile Lys 210
215 220Asn Leu Leu Phe Gly Thr Thr Lys Thr Pro
Asp Cys Tyr Lys Asp Glu225 230 235
240Glu Tyr Pro Ile Asp Ile Asp Val Leu Ile Lys Phe Ile Ser Gly
Gly 245 250 255Lys Thr Asn
Leu Ala Lys Leu Phe Arg Asn Pro Ala Tyr Asp Glu Leu 260
265 270Asp Ile Gln Thr Val Glu Val Gly Lys Ala
Asp Phe Ala Asp Thr Ile 275 280
285Asp Leu Leu Ala Ser Ser Met Glu Asp Thr Asp Val Pro Leu Leu Ser 290
295 300Ala Val Lys Ala Met Tyr Asp Trp
Ser Leu Leu Ile Asp Val Leu Lys305 310
315 320Gly Gln Lys Thr Ile Ser Asp Ala Lys Val Cys Glu
Tyr Glu Gln His 325 330
335Lys Ser Asp Leu Lys Ala Leu Lys His Ile Val Arg Lys Tyr Leu Asp
340 345 350Lys Ala Gln Tyr Asp Glu
Ile Phe Arg Thr Ala Gly Glu Lys Pro Asn 355 360
365Tyr Val Ser Tyr Ser Tyr Asn Val Thr Asp Val Lys Leu Lys
Gln Leu 370 375 380Pro Ser Asn Phe Lys
Lys Lys Tyr Ser Glu Glu Phe Cys Lys Tyr Ile385 390
395 400Asn Ser Lys Leu Glu Lys Ile Lys Pro Glu
Pro Asp Asp Glu Ala Val 405 410
415Tyr Asn Glu Leu Ile Glu Lys Cys Asn Ser Lys Thr Leu Cys Pro Lys
420 425 430Gln Val Thr Asp Glu
Asn Arg Val Ile Pro Tyr Gln Leu Tyr Tyr His 435
440 445Glu Leu Ser Met Ile Leu Asp Lys Ala Ser Ala Tyr
Leu Asp Phe Leu 450 455 460Asn Glu Thr
Glu Asp Gly Ile Ser Val Lys Gln Lys Ile Leu Thr Leu465
470 475 480Met Lys Phe Arg Ile Pro Tyr
Phe Val Gly Pro Ser Val Lys Arg Asn 485
490 495Glu Thr Asp Asn Val Trp Ile Val Arg Lys Ala Glu
Gly Arg Ile Tyr 500 505 510Pro
Trp Asn Phe Glu Asn Met Val Asp Tyr Asp Lys Ser Glu Asp Gly 515
520 525Phe Ile Arg Arg Met Thr Cys Lys Cys
Thr Tyr Leu Ala Gly Glu Asp 530 535
540Val Leu Pro Lys Tyr Ser Leu Leu Tyr Ser Arg Tyr Thr Val Leu Asn545
550 555 560Glu Ile Asn Asn
Ile Lys Val Lys Asp Val Lys Ile Ser Pro Glu Leu 565
570 575Lys Gln Asp Ile Phe Asn Glu Leu Phe Met
Lys Thr Ser Arg Val Thr 580 585
590Val Lys Lys Ile Thr Glu Leu Leu Lys Arg Lys Gly Ala Phe Ser Glu
595 600 605Glu Asn Gly Asp Ser Leu Ser
Gly Val Asp Ile Asn Ile Lys Ser Ser 610 615
620Leu Lys Ser Tyr Leu Asp Phe Arg Arg Leu Leu Glu Asn Gly Ser
Leu625 630 635 640Ser Glu
Ser Asp Val Glu Arg Ile Ile Glu Arg Ile Thr Val Thr Thr
645 650 655Asp Lys Pro Arg Leu Ile Ser
Trp Leu Lys Thr Glu Tyr Pro Ala Leu 660 665
670Pro Ala Glu Asp Ile Arg Tyr Ile Ser Arg Leu Ser Tyr Lys
Asp Tyr 675 680 685Gly Arg Leu Ser
Ala Lys Met Leu Thr Gly Cys Tyr Glu Leu Asp Met 690
695 700Asp Thr Gly Glu Ile Gly Gly Arg Ser Ile Ile Asp
Leu Met Trp Ala705 710 715
720Glu Asn Ile Asn Leu Met Gln Ile Met Ser Asp Ser Tyr Gly Tyr Lys
725 730 735Ser Phe Ile Glu Glu
Glu Asn Lys Lys Tyr Tyr Ala Ile Asn Pro Thr 740
745 750Gly Ser Ile Ala Gln Thr Leu Arg Glu Met Tyr Val
Ser Pro Ser Ala 755 760 765Ser Arg
Ala Ile Ile Arg Thr Met Asp Ile Val Lys Glu Leu Arg Lys 770
775 780Ile Ile Lys Arg Asp Pro Asp Lys Ile Phe Val
Glu Met Ala Arg Gly785 790 795
800Ser Lys Pro Glu Asp Lys Gly Lys Arg Thr Ser Ser Arg Arg Glu Gln
805 810 815Ile Glu Lys Leu
Phe Ala Ser Ala Lys Glu Phe Val Ser Asp Glu Glu 820
825 830Ile Ser His Leu Arg Ser Gln Leu Gly Ser Leu
Ser Asp Glu Gln Leu 835 840 845Arg
Ser Glu Lys Tyr Tyr Leu Tyr Phe Thr Gln Phe Gly Lys Cys Val 850
855 860Tyr Ser Gly Glu Ala Ile Asp Phe Ser Arg
Leu Gly Asp Asn His Cys865 870 875
880Tyr Asp Ile Asp His Ile Tyr Pro Gln Ser Lys Val Lys Asp Asp
Ser 885 890 895Leu His Asn
Lys Val Leu Val Lys Ser Gln Leu Asn Gly Glu Lys Ser 900
905 910Asp Asp Tyr Pro Ile Lys Glu Gln Ile Arg
Asn Lys Met His Pro Ile 915 920
925Trp Lys Asn Leu Phe Tyr Arg Asp Pro Lys Asn Pro Thr Asp Lys Ile 930
935 940Lys Tyr Glu Arg Leu Thr Arg Ser
Thr Pro Phe Thr Glu Asp Glu Leu945 950
955 960Ala Gly Phe Ile Glu Arg Gln Leu Val Glu Thr Arg
Gln Ser Thr Lys 965 970
975Ala Val Ala Thr Leu Leu Lys Glu Met Phe Pro Asp Ser Lys Ile Val
980 985 990Tyr Val Lys Ala Gly Gln
Val Ser Lys Phe Arg His Asp Phe Asp Met 995 1000
1005Leu Lys Cys Arg Glu Ile Asn Asp Leu His His Ala
Lys Asp Ala 1010 1015 1020Tyr Leu Asn
Val Val Val Gly Asn Val His Asp Val Lys Phe Thr 1025
1030 1035Ser Asn Pro Leu Asn Phe Val Lys Asn Ala Asp
Lys His Tyr Thr 1040 1045 1050Ile Lys
Ile Lys Glu Thr Leu Lys His Lys Val Ala Arg Asn Gly 1055
1060 1065Glu Thr Ala Trp Asn Pro Glu Thr Asp Phe
Asp Thr Val Lys Arg 1070 1075 1080Met
Met Ser Lys Asn Ser Val Arg Tyr Val Arg Tyr Cys Tyr Lys 1085
1090 1095Arg Lys Gly Glu Leu Phe Lys Gln Gln
Pro Lys Lys Ala Gly Asn 1100 1105
1110Pro Asp Leu Ala Trp Leu Lys Lys Asn Leu Asp Pro Val Lys Tyr
1115 1120 1125Gly Gly Tyr Asn Ser Lys
Ser Ile Ser Cys Phe Ser Leu Ile Lys 1130 1135
1140Cys Thr Gly Val Gly Val Val Ile Ile Pro Val Glu Leu Leu
Cys 1145 1150 1155Glu Lys Arg Tyr Phe
Ser Asp Asp Ser Phe Ala Ser Glu Tyr Ala 1160 1165
1170Tyr Ser Val Leu Lys Asn Ala Leu Pro Ala Lys Asn Ile
Ala Lys 1175 1180 1185Ile Ser Ile Asp
Asp Ile Ser Phe Pro Leu Lys Arg Arg Pro Ile 1190
1195 1200Lys Ile Asn Thr Leu Phe Glu Phe Asp Gly Tyr
Arg Val Asn Ile 1205 1210 1215Arg Ser
Lys Asp Ser Tyr Ser Val Phe Arg Ile Ser Ser Ala Met 1220
1225 1230Ala Ala Ile Tyr Ser Lys Asp Thr Ser Asp
Tyr Ile Lys Ala Ile 1235 1240 1245Ser
Ser Tyr Ile Asp Lys Ser Asp Lys Gly Ser Lys Phe Lys Pro 1250
1255 1260Gly Glu Ala Phe Asp Val Leu Ser Asn
Leu Lys Ala Tyr Asp Glu 1265 1270
1275Ile Ala Lys Lys Cys Ile Ser Glu Pro Phe Cys Lys Ile Ser Lys
1280 1285 1290Leu Ala Glu Ala Gly Lys
Lys Met Glu Glu Gly Arg Asn Lys Phe 1295 1300
1305Ala Glu Leu Ser Ile Ile Glu Gln Met Lys Thr Leu Leu Leu
Leu 1310 1315 1320Val Asp Val Leu Lys
Thr Gly Arg Val Asp Lys Cys Asn Leu Lys 1325 1330
1335Pro Val Gly Gly Val Asp Asn Phe His Thr Glu Arg Met
Ser Ala 1340 1345 1350Ile Leu Lys Asn
Thr Lys Tyr Ser Asp Ile Arg Ile Ile Asp Gln 1355
1360 1365Ser Pro Thr Gly Leu Tyr Glu Asn Lys Ser Asp
Asn Leu Leu Glu 1370 1375
1380Leu

SEQ ID NO: 29; length: 36; type: DNA; organism: Artificial Sequence; description: CR RNA
gtttgagagt agtgtaaatt tatagggtag taaaac 36

SEQ ID NO: 30; length: 108; type: DNA; organism: Artificial Sequence; description: TRACR RNA
ttttactacc ctataaattt acactacgag ttcaaataaa aattatttca aatcgtactt 60
tttagtacct tcacaagtgt tgtgaatatt aactcacctt cgggtgag 108

SEQ ID NO: 31; length: 1370; type: PRT; organism: Artificial Sequence; description: Firmicutes bacterium
Met Glu Gln Lys Asp Tyr Tyr Ile Gly Leu Asp Ile Gly Thr Asn
Ser1 5 10 15Val Gly Trp
Ala Val Val Asp Glu Gly Tyr Gln Leu Cys Arg Phe Lys 20
25 30Lys Tyr Asp Met Trp Gly Val Arg Leu Phe
Asp Ser Ala Glu Thr Ala 35 40
45Ala Glu Arg Arg Met Asn Arg Val Asn Arg Arg Arg Asn Arg Arg Lys 50
55 60Lys Gln Arg Ile Asp Leu Leu Gln Gly
Leu Phe Ala Glu Glu Ile Ala65 70 75
80Lys Ile Asp Arg Thr Phe Phe Val Arg Leu Asn Glu Ser Arg
Leu His 85 90 95Pro Glu
Asp Lys Ser Thr Ala Phe Arg His Pro Leu Phe Asn Asp Pro 100
105 110Asn Tyr Thr Asp Val Asp Tyr Tyr Lys
Glu Tyr Pro Thr Ile Tyr His 115 120
125Leu Arg Lys Glu Leu Met Asp Ser Ala Glu Pro His Asp Ile Arg Leu
130 135 140Val Tyr Leu Ala Leu His His
Ile Leu Lys Asn Arg Gly His Phe Leu145 150
155 160Ile Glu Gly Gly Phe Glu Asp Ser Lys Lys Phe Glu
Pro Thr Phe Arg 165 170
175Gln Leu Leu Glu Val Leu Thr Glu Glu Leu Gly Leu Lys Met Asp Gly
180 185 190Ala Asp Ala Ala Leu Ala
Glu Ser Val Leu Lys Asp Arg Gly Met Lys 195 200
205Lys Thr Glu Lys Val Lys Arg Leu Lys Asn Val Phe Thr Leu
Asn Thr 210 215 220Thr Asp Met Asp Gln
Glu Ser Gln Lys Lys Gln Lys Ala Gln Ile Asp225 230
235 240Ala Val Cys Lys Phe Leu Ala Gly Ser Lys
Gly Asp Phe Lys Lys Leu 245 250
255Val Ala Asp Glu Ala Leu Asn Glu Leu Lys Leu Asp Thr Phe Ala Leu
260 265 270Gly Thr Ser Lys Ala
Glu Asp Ile Gly Leu Glu Ile Glu Lys Ser Ala 275
280 285Pro Gln Tyr Cys Val Val Phe Glu Ser Val Lys Ser
Val Phe Asp Trp 290 295 300Lys Ile Met
Thr Gln Ile Leu Gly Asp Glu Ser Thr Phe Ser Ser Ala305
310 315 320Lys Val Lys Glu Tyr Glu Lys
His His Glu Asn Leu Ile Ile Leu Arg 325
330 335Glu Leu Ile Arg Lys Tyr Cys Asp Lys Glu Thr Tyr
Arg His Phe Phe 340 345 350Asn
Asn Val Asn Gly Gly Tyr Ser Arg Tyr Ile Gly Ser Leu Lys Lys 355
360 365Asn Gly Lys Lys Tyr Tyr Val Ala Gly
Cys Thr Gln Glu Glu Phe Tyr 370 375
380Lys Glu Leu Lys Gly Leu Leu Lys Ser Ile Asp Gln Arg Val Asp Pro385
390 395 400Glu Asp Arg Pro
Val Tyr Gln Arg Val Leu Ala Glu Thr Glu Asp Glu 405
410 415Thr Phe Leu Pro Leu Leu Arg Ser Lys Ala
Asn Ser Ala Ile Pro Arg 420 425
430Gln Ile His Gln Lys Glu Leu Asp Asp Ile Leu Gln Asn Ala Ser Val
435 440 445Tyr Leu Pro Phe Leu Asn Asp
Val Asp Glu Asp Gly Leu Ser Ala Ala 450 455
460Glu Lys Ile Arg Ser Ile Phe Thr Phe Arg Ile Pro Tyr Tyr Val
Gly465 470 475 480Pro Leu
Ser Leu Arg His Lys Asp Lys Gly Ala His Val Trp Ile Lys
485 490 495Arg Lys Glu Glu Gly Tyr Ile
Tyr Pro Trp Asn Tyr Glu Lys Lys Ile 500 505
510Asp Arg Glu Lys Ser Asn Glu Glu Phe Ile Arg Arg Leu Ile
Asn Gln 515 520 525Cys Thr Tyr Leu
Lys Asp Glu Lys Val Leu Pro Lys Lys Ser Leu Leu 530
535 540Tyr Ser Glu Phe Met Val Leu Asn Glu Leu Asn Asn
Leu Arg Ile Arg545 550 555
560Gly Lys Arg Leu Ser Glu Glu Gln Val Glu Leu Lys Gln Arg Ile Tyr
565 570 575Arg Asp Leu Phe Met
Thr Lys Thr Arg Val Thr Lys Lys Thr Leu Leu 580
585 590Asn Tyr Leu Arg Lys Glu Asp Ser Asp Leu Thr Glu
Glu Asp Leu Ser 595 600 605Gly Phe
Asp Asn Asp Phe Lys Ala Ser Leu Ser Ser Cys Leu Glu Leu 610
615 620Lys Asn Lys Val Phe Gly Asp Arg Ile Glu Glu
Asp Arg Val Arg Lys625 630 635
640Ile Ala Glu Asp Leu Ile Arg Trp Leu Thr Ile Tyr Asp Asp Asp Lys
645 650 655Lys Met Ile Lys
Glu Val Ile Arg Ala Glu Tyr Pro Asn Glu Phe Thr 660
665 670Asn Glu Gln Leu Asp Val Ile Cys Arg Leu Lys
Phe Ser Gly Trp Gly 675 680 685Asn
Leu Ser Glu Ala Phe Leu Cys Gly Val Glu Gly Ala Asp Lys Asp 690
695 700Thr Gly Glu Val Phe Thr Ile Ile Glu Ala
Leu Arg Asn Thr Asn His705 710 715
720Asn Leu Met Glu Leu Leu Ser Gly Asn Tyr Thr Phe Thr Glu Lys
Ile 725 730 735Arg Glu His
Asn Ala Ala Leu Ser Ser Glu Ile Lys Ala Lys Asp Tyr 740
745 750Glu Ser Leu Val Arg Asp Leu Tyr Val Ser
Pro Ala Cys Lys Arg Gly 755 760
765Ile Trp Gln Thr Ile Arg Ile Thr Glu Glu Ile Lys Lys Ile Met Gly 770
775 780His Glu Pro Lys Lys Ile Phe Val
Glu Met Thr Arg Glu His Arg Asp785 790
795 800Ser Gly Arg Thr Thr Ser Arg Lys Asp Gln Leu Leu
Ala Leu Tyr Gln 805 810
815Lys Cys Glu Glu Asp Ala Arg Asp Trp Val Lys Glu Ile Glu Asp Arg
820 825 830Glu Glu Arg Asp Phe Ser
Ser Ile Lys Leu Phe Leu Tyr Tyr Leu Gln 835 840
845Gln Gly Lys Cys Met Tyr Ser Gly Glu Ala Ile Asp Leu Asp
Glu Leu 850 855 860Met Ser Lys Asn Ser
Arg Trp Asp Arg Asp His Ile Tyr Pro Gln Ser865 870
875 880Lys Ile Lys Asp Asp Ser Leu Asp Asn Leu
Val Leu Val Lys Lys Glu 885 890
895Leu Asn Ala Val Lys Asp Asn Gly Glu Ile Ala Pro Asp Ile Gln Lys
900 905 910Arg Met Lys Gly Phe
Trp Leu Ser Leu Leu Arg Gln Gly Phe Leu Ser 915
920 925Lys Lys Lys Phe Asp Arg Leu Thr Arg Thr Gly Pro
Phe Thr Ser Glu 930 935 940Glu Leu Ala
Gly Phe Ile Ser Arg Gln Leu Val Glu Thr Ser Gln Met945
950 955 960Ser Lys Ala Val Ala Glu Leu
Leu Asn Gln Leu Tyr Glu Asp Ser Arg 965
970 975Val Val Tyr Val Lys Ala Gly Leu Val Ser Gln Phe
Arg Gln Lys Asp 980 985 990Leu
Gly Val Leu Lys Ser Arg Ser Val Asn Asp Tyr His His Ala Lys 995
1000 1005Asp Ala Tyr Leu Asn Val Val Val
Gly Asp Met Phe Asp Arg Lys 1010 1015
1020Phe Thr Ser Asp Pro Ala Arg Trp Phe Lys Lys Asn Lys Lys Val
1025 1030 1035Asn Tyr Ser Ile Asn Gln
Val Phe Arg Arg Asp Tyr Glu Glu Asn 1040 1045
1050Gly Lys Leu Ile Trp Lys Gly Ile Asp Arg Gly Glu Asp Gly
Lys 1055 1060 1065Pro Leu Phe Arg Asp
Gly Leu Ile His Gly Gly Thr Ile Asp Leu 1070 1075
1080Val Arg Ala Ile Ala Lys Arg Asn Thr Asn Ile Arg Tyr
Thr Glu 1085 1090 1095Tyr Thr Tyr Cys
Glu Thr Gly Gln Leu Tyr Asn Leu Thr Leu Leu 1100
1105 1110Pro Lys Thr Asp Thr Ala Ile Thr Ile Pro Leu
Lys Lys Glu Leu 1115 1120 1125Pro Ala
Ala Lys Tyr Gly Gly Phe Lys Gly Ala Gly Thr Ser Tyr 1130
1135 1140Phe Ser Leu Ile Glu Phe Asp Asp Lys Lys
Gly His His His Lys 1145 1150 1155Gln
Ile Val Gly Val Pro Ile Tyr Val Ala Asn Met Leu Glu His 1160
1165 1170Asn Glu Asn Ala Phe Ile Glu Tyr Leu
Glu Thr Val Cys Ser Phe 1175 1180
1185Arg Asn Ile Thr Val Leu Cys Glu Lys Ile Lys Lys Asn Ala Leu
1190 1195 1200Ile Ser Val Asn Gly Tyr
Pro Met Arg Ile Arg Gly Glu Asn Glu 1205 1210
1215Ile Leu Asn Met Leu Lys Asn Asn Leu Gln Leu Val Leu Ser
Gln 1220 1225 1230Glu Gly Glu Glu Thr
Leu Arg His Ile Glu Lys Tyr Phe Asn Lys 1235 1240
1245Lys Pro Gly Phe Glu Pro Asp Lys Glu His Asp Gly Ile
Asp Arg 1250 1255 1260Asp Ala Met Ala
Ala Leu Tyr Asp Glu Met Thr Glu Lys Leu Cys 1265
1270 1275Thr Val Tyr Lys Lys Arg Pro Thr Asn Gln Gly
Glu Leu Leu Lys 1280 1285 1290Asn Asn
Arg Gly Leu Phe Leu Asn Leu Glu Lys Arg Ser Glu Met 1295
1300 1305Ala Lys Val Leu Ser Glu Thr Ala Lys Met
Phe Gly Thr Thr Ala 1310 1315 1320Gln
Thr Thr Ala Asp Leu Ser Leu Ile Lys Gly Ser Lys Tyr Ala 1325
1330 1335Gly Lys Ile Val Ile Asn Lys Asn Thr
Leu Gly Ala Ala Lys Leu 1340 1345
1350Ile Leu Ile His Gln Ser Val Thr Gly Leu Phe Glu Thr Arg Val
1355 1360 1365Glu Leu
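1370

SEQ ID NO: 32; length: 36; type: DNA; organism: Artificial Sequence; description: CR RNA
gtttgagagt agtgtaattt catatggtag tcaaac 36

SEQ ID NO: 33; length: 102; type: DNA; organism: Artificial Sequence; description: TRACR RNA
gactaccata tgagattaca ctacacggtt caaataaaga atgttcgaaa ccgccctttg 60
gggcccgctt gttgcggatt tacagacttg atatcaagtc tg 102

SEQ ID NO: 34; length: 1370; type: PRT; organism: Artificial Sequence; description: uncultured Succiniclasticum sp.
Met Ser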
Lys Lys Phe Ala Gly Glu Tyr Tyr Leu Gly Leu Asp Ile Gly1 5
10 15Thr Asp Ser Val Gly Trp Ala Val
Thr Asp Asn Gln Tyr Asn Val Leu 20 25
30Lys Phe Asn Gly Lys Ser Met Trp Gly Ile Arg Leu Phe Asp Ala
Ala 35 40 45Gln Thr Ala Ala Glu
Arg Arg Met Phe Arg Thr Ala Arg Arg Arg Val 50 55
60Glu Arg Arg Arg Trp Arg Leu Glu Leu Leu Gln Glu Leu Phe
Gln Asn65 70 75 80Glu
Ile Glu Lys Lys Asp Pro Asp Phe Phe Gln Arg Met Lys Asp Ser
85 90 95Ala Leu Tyr Pro Glu Asp Ser
Lys Thr Gly Lys Pro Phe Ala Leu Phe 100 105
110Cys Asp Lys Asp Leu Asn Asp Lys Leu Tyr Tyr Lys Gln Tyr
Pro Thr 115 120 125Ile Tyr His Leu
Arg Lys Ala Leu Leu Thr Glu Asn Ser Lys Phe Asp 130
135 140Ile Arg Leu Val Tyr Leu Ala Ile His His Ile Leu
Lys His Arg Gly145 150 155
160His Phe Leu Phe Asn Gly Asp Phe Ser Asn Val Thr Arg Phe Ser Phe
165 170 175Ala Phe Glu Gln Leu
Gln Thr Cys Leu Cys Asn Glu Leu Asp Met Asp 180
185 190Phe Glu Cys Asn Asn Val Gln Lys Leu Ser Glu Ile
Leu Lys Asp Thr 195 200 205His Met
Ser Lys Asn Asp Lys Val Lys Ala Ser Val Ala Leu Phe Glu 210
215 220Asn Ser Gly Asp Lys Lys Gln Leu Gln Ala Val
Ile Gly Leu Phe Cys225 230 235
240Gly Ala Lys Lys Lys Leu Ala Asp Val Phe Leu Asp Glu Thr Leu Asn
245 250 255Asp Thr Glu Met
Pro Ser Ile Ser Ile Ala Asp Lys Pro Tyr Glu Glu 260
265 270Leu Arg Pro Glu Leu Glu Ser Ile Leu Ala Glu
Lys Cys Cys Val Ile 275 280 285Asp
Tyr Ile Lys Ala Val Tyr Asp Trp Ala Ile Leu Ala Asp Met Leu 290
295 300Asp Gly Gly Glu Tyr Gly Asn Arg Thr Tyr
Ile Ser Val Ala Arg Val305 310 315
320Arg Gln Tyr Glu Lys His His Asp Asp Leu Lys Lys Leu Lys Lys
Leu 325 330 335Val Arg Arg
Tyr Cys Lys Ser Glu Tyr Lys Ser Phe Phe Ser Val Ala 340
345 350Gly Thr Asp Asn Tyr Cys Ala Tyr Ile Gly
Asp Asp Ile Glu Thr Asp 355 360
365Asp Arg Lys Ser Val Lys Lys Cys Lys Gln Glu Asp Phe Tyr Lys Arg 370
375 380Ile Lys Gly Leu Leu Lys Lys Ala
Ile Glu Asn Gly Cys Pro Lys Asp385 390
395 400Glu Val Val Glu Ile Ile Lys Asp Ile Asp Ala Gln
Val Phe Leu Pro 405 410
415Leu Gln Val Thr Lys Asp Asn Gly Val Ile Pro His Gln Val His Glu
420 425 430Met Glu Leu Lys Gln Ile
Leu Lys Asn Ala Glu Lys Tyr Tyr Pro Phe 435 440
445Leu Cys Lys Lys Asp Glu Glu Gly Ile Val Thr Ser Asn Lys
Ile Leu 450 455 460Gln Leu Phe Lys Phe
Arg Ile Pro Tyr Tyr Val Gly Pro Leu Asn Ser465 470
475 480Arg Ile Gly Lys Asn Ser Trp Ile Val Arg
Arg Ala Glu Gly Lys Ile 485 490
495Tyr Pro Trp Asn Phe Glu Glu Lys Val Asp Phe Asp Lys Ser Glu Glu
500 505 510Gly Phe Ile Arg Arg
Met Thr Asn Pro Cys Thr Tyr Met Ala Gly Ala 515
520 525Asp Val Leu Pro Lys Tyr Ser Leu Leu Tyr Ser Glu
Phe Met Val Leu 530 535 540Asn Glu Leu
Asn Asn Val Arg Ile Cys Gly Asp Lys Leu Ser Val Glu545
550 555 560Ile Lys Gln Thr Ile Ile Lys
Asp Leu Phe Gln Arg Thr Arg Arg Val 565
570 575Thr Val Arg Lys Leu Cys Asp Lys Leu Lys Ala Glu
Gly Val Ile Ser 580 585 590Arg
Asn Ser Asn Gln Lys Asp Ile Asp Ile Lys Gly Ile Asp Gln Asp 595
600 605Leu Lys Ser Ser Met Val Ser Tyr Val
Asp Phe Lys Asn Ile Phe Gly 610 615
620Lys Glu Ile Glu Lys Tyr Ser Val Gln Gln Met Cys Glu Arg Ile Ile625
630 635 640Phe Leu Leu Thr
Ile His His Asp Asp Lys Arg Arg Leu Gln Lys Arg 645
650 655Ile Arg Ala Glu Phe Thr Glu Ala Gln Ile
Thr Asp Asp Gln Leu Gln 660 665
670Lys Val Leu Arg Leu Asn Tyr Gln Gly Trp Gly Arg Phe Ser Ala Glu
675 680 685Phe Leu Lys Glu Leu Lys Gly
Val Asp Thr Glu Thr Gly Glu Val Phe 690 695
700Ser Ile Ile Asn Ala Leu Arg Glu Thr Asp Asp Asn Leu Met Gln
Leu705 710 715 720Leu Ser
Asn Arg Tyr Thr Phe Ala Glu Glu Leu Glu Lys Tyr Asn Ser
725 730 735Asn Lys Arg Lys Lys Ile Glu
Ala Leu Thr Tyr Asp Asn Ile Met Glu 740 745
750Gly Ile Val Ala Ser Pro Ala Ile Lys Arg Ser Ala Trp Gln
Ala Ile 755 760 765Ser Ile Val Met
Glu Leu Ser Lys Ile Met Gly Arg Glu Pro Lys Arg 770
775 780Ile Phe Val Glu Met Ala Arg Gly Pro Glu Glu Lys
Lys His Thr Ile785 790 795
800Ser Arg Lys Asn Gln Leu Leu Glu Leu Tyr Lys Ser Val Lys Asp Glu
805 810 815Ser Arg Asp Trp Lys
Thr Glu Leu Glu Thr Lys Thr Glu Ser Asp Phe 820
825 830Arg Ser Ile Lys Leu Phe Leu Tyr Tyr Thr Gln Met
Gly Arg Cys Met 835 840 845Tyr Thr
Gly Glu Pro Ile Asp Leu Asp Gln Leu Ala Asn Thr Thr Ile 850
855 860Tyr Asp Arg Asp His Ile Tyr Pro Gln Ser Leu
Thr Lys Asp Asp Ser865 870 875
880Leu Asn Asn Leu Val Leu Val Lys Lys Val Glu Asn Ala Asn Lys Gly
885 890 895Asn Gly Leu Ile
Ser Ala Asp Ile Gln Lys Lys Met Arg Gly Phe Trp 900
905 910Ala Glu Leu Lys Lys Lys Gly Leu Ile Ser Asp
Glu Lys Phe Ser Arg 915 920 925Leu
Thr Arg Thr Thr Pro Leu Ser Asp Asp Glu Leu Ala Gly Phe Ile 930
935 940Asn Arg Gln Leu Val Glu Thr Arg Gln Ser
Ser Lys Ile Val Ala Asp945 950 955
960Leu Phe His Gln Leu Tyr Pro Thr Thr Gln Val Val Tyr Val Lys
Ala 965 970 975Lys Ile Val
Ser Asp Phe Arg His Glu Thr Leu Asp Met Val Lys Val 980
985 990Arg Ser Leu Asn Asp Leu His His Ala Lys
Asp Ala Tyr Leu Asn Ile 995 1000
1005Val Thr Gly Asn Val Tyr Tyr Glu Lys Phe Ser Gly Asn Pro Leu
1010 1015 1020Thr Trp Leu Arg Lys Asn
Pro Asp Arg Asn Tyr Ser Leu Asn Gln 1025 1030
1035Met Phe Asn Tyr Asp Ile Val Lys Lys Thr Lys Glu Gly Thr
Ser 1040 1045 1050Tyr Val Trp Lys Lys
Gly Lys Asp Gly Ser Ile Ala Val Val Arg 1055 1060
1065Arg Thr Met Glu Arg Asn Asp Ile Leu Tyr Thr Arg Gln
Ala Thr 1070 1075 1080Glu Asn Lys Asn
Gly Gly Leu Phe Asp Gln Asn Ile Val Ser Ser 1085
1090 1095Lys Asn Lys Pro Phe Ile Pro Val Lys Lys Gly
Leu Asp Val Asn 1100 1105 1110Lys Tyr
Gly Gly Tyr Lys Gly Ile Thr Pro Ala Tyr Phe Ala Leu 1115
1120 1125Ile Glu Phe Thr Asp Lys Lys Gly Ser Arg
Gln Arg Leu Leu Glu 1130 1135 1140Ala
Val Pro Leu Tyr Leu Arg Ala Asp Ile Asp Asn Asp Ser Asn 1145
1150 1155Val Leu Arg Asp Phe Tyr Lys Asn Val
Leu Gly Leu Glu Asn Pro 1160 1165
1170Val Val Ile Leu Asn Arg Ile Lys Lys Asn Ser Leu Leu Lys Ile
1175 1180 1185Asn Gly Phe Leu Ile His
Leu Arg Gly Thr Thr Gly Phe Ser Ala 1190 1195
1200Ser Gln Leu Lys Val Gln Asn Ala Val Glu Phe Ser Leu Pro
His 1205 1210 1215His Met Glu Asp Tyr
Val Lys Lys Leu Glu Asn Tyr Glu Lys His 1220 1225
1230Ile Ile Ala Glu Arg Gly Ser Thr Lys Asn Ser Gln Ile
Lys Ile 1235 1240 1245Thr Glu Trp Asp
Gly Ile Ser Lys Glu Lys Asn Leu Gln Leu Tyr 1250
1255 1260Asp Met Phe Ile Asn Lys Met Glu Asn Thr Ile
Tyr Lys Phe Arg 1265 1270 1275Pro Ala
Asn Gln Val Ser Asn Leu Lys Glu Asn Arg Glu Val Phe 1280
1285 1290Asn Ser Leu Ala Val Glu Asp Gln Cys Ser
Val Leu Asn Gln Val 1295 1300 1305Leu
Met Leu Phe Val Cys Lys Pro Val Thr Ala Asn Leu Ser Leu 1310
1315 1320Ile Lys Gly Ser Lys Asn Ala Gly Asn
Met Ala Leu Ser Lys Ile 1325 1330
1335Ile Ser Asn Met Arg Ser Ala Tyr Leu Ile His Gln Ser Val Thr
1340 1345 1350Gly Leu Phe Glu Gln Lys
Ile Asp Leu Leu Lys Val Ser Ser Gln 1355 1360
1365Lys Asp 1370
35 36 DNA Artificial Sequence CR RNA
35 gtttgagagt aatgtaaatt cataggatgg taaaac 36
36 103 DNA Artificial Sequence TRACR RNA
36 tttaccatcc agtgagttta cattacaagt tcaaataaaa atttattcaa cccgttcttc 60
ggaacctcca ccgtgtggaa cattaaggtc tgctttgcag gcc 103
37 1369 PRT Artificial Sequence Bacillales bacterium
37 Met Ala Asn Lys Leu Phe Ile Gly Leu Asp
Val Gly Ser Asp Ser Val1 5 10
15Gly Trp Ala Ala Thr Asp Glu Asn Phe His Leu Tyr Arg Leu Lys Gly
20 25 30Lys Thr Ala Trp Gly Ala
Arg Ile Phe Ser Glu Ala Ser Asp Ala Lys 35 40
45Gly Arg Arg Gly Phe Arg Val Ala Gly Arg Arg Leu Ala Arg
Arg Lys 50 55 60Glu Arg Ile Arg Leu
Leu Asn Thr Leu Phe Asp Pro Leu Leu Lys Glu65 70
75 80Lys Asp Pro Thr Phe Leu Leu Arg Leu Glu
Asn Ser Ala Ile Gln Asn 85 90
95Asp Asp Pro Asn Lys Pro Ala Gln Ala Val Thr Asp Cys Leu Leu Phe
100 105 110Ala Asn Lys Gln Glu
Glu Lys Gly Phe Tyr Lys Arg Tyr Pro Thr Ile 115
120 125Trp His Leu Arg Lys Ala Leu Met Asp Asn Glu Asp
Cys Ala Phe Ser 130 135 140Asp Ile Arg
Phe Leu Tyr Leu Ala Ile His His Ile Ile Lys Tyr Arg145
150 155 160Gly Asn Phe Leu Arg Asp Gly
Glu Ile Lys Ile Gly Gln Phe Asp Tyr 165
170 175Ser Val Phe Asp Lys Leu Asn Glu Thr Leu Ser Val
Leu Phe Asp Leu 180 185 190Gln
Ser Glu Asp Glu Asp Ser Gln Glu Gly His Phe Val Gly Leu Pro 195
200 205Lys Ser Gln Tyr Glu Ala Phe Ile Thr
Thr Ala Asn Asp Arg Asn Leu 210 215
220Pro Lys Gln Thr Lys Lys Thr Lys Leu Leu Ser Met Phe Glu Lys Asp225
230 235 240Glu Glu Ser Lys
Ser Phe Leu Glu Met Phe Cys Thr Leu Cys Ala Gly 245
250 255Gly Glu Phe Ser Thr Lys Lys Leu Asn Lys
Lys Gly Glu Glu Thr Phe 260 265
270Asp Asp Thr Lys Ile Ser Phe Asn Ala Ser Tyr Asp Gln Asn Glu Pro
275 280 285Asn Tyr Gln Glu Ile Leu Gly
Asp Ala Phe Asp Leu Val Asp Ile Ala 290 295
300Lys Ala Val Phe Asp Tyr Cys Asp Leu Ser Asp Ile Leu Asn Gly
Asn305 310 315 320Asp Asn
Leu Ser Asn Ala Phe Val Glu Leu Tyr Asp Ser His Lys Ser
325 330 335Gln Leu Ser Ala Leu Lys Ala
Ile Cys Lys Gln Ile Asp Asn Gln Ser 340 345
350Asn Leu Lys Gly Asp Ala Ser Val Tyr Val Lys Leu Phe Asn
Asp Pro 355 360 365Asn Asp Lys Ser
Asn Tyr Pro Ala Phe Thr His Asn Lys Thr Leu Val 370
375 380Asp Lys Arg Cys Asp Ile His Thr Phe Asp Lys Tyr
Val Ile Asp Thr385 390 395
400Val Leu Pro Tyr Glu Pro Leu Leu Met Gly Gln Asp Ala Thr Asn Trp
405 410 415Gln Met Leu Lys Ser
Leu Ala Glu Gln Asp Arg Leu Leu Gln Thr Ile 420
425 430Ala Leu Arg Ser Thr Ser Val Ile Pro Met Gln Leu
His Gln Lys Glu 435 440 445Leu Lys
Ile Ile Leu Lys Asn Ala Ile Ser Arg Asn Val Lys Gly Ile 450
455 460Ala Glu Ile Glu Glu Lys Ile Leu Lys Leu Phe
Gln Tyr Lys Ile Pro465 470 475
480Tyr Tyr Cys Gly Pro Leu Thr Thr Lys Ser Ala Tyr Ser Asn Val Val
485 490 495Phe Lys Asn Asn
Glu Tyr Arg Pro Leu Lys Pro Trp Asp Tyr Glu Glu 500
505 510Ala Ile Asp Trp Asp Glu Thr Lys Lys Lys Phe
Met Glu Gly Leu Thr 515 520 525Asn
Lys Cys Thr Tyr Leu Lys Asp Lys Asn Val Leu Pro Lys Gln Ser 530
535 540Ile Leu Tyr Gln Asp Phe Asp Ala Trp Asn
Lys Leu Asn Asn Leu Lys545 550 555
560Val Asn Gly Ser Lys Pro Ser Leu Lys Glu Leu Lys Asp Leu Phe
Ser 565 570 575Phe Val Ser
Gln Arg Pro Lys Thr Thr Met Lys Asp Ile Gln Arg His 580
585 590Phe Lys Ser Asp Thr Asn Ser Lys Asp Lys
Asp Val Val Val Ser Gly 595 600
605Trp Asn Pro Glu Asp Tyr Ile Cys Cys Ser Ser Arg Ala Ser Phe Gly 610
615 620Lys Asn Gly Val Phe Asp Leu Asn
Asn Pro Asp Ser Ser Asp Pro Lys625 630
635 640Asp Leu Ser Lys Cys Glu Arg Met Ile Phe Leu Lys
Thr Ile Tyr Ala 645 650
655Asp Ser Pro Lys Asp Ala Asp Val Ala Ile Leu Lys Glu Phe Pro Asp
660 665 670Leu Thr Asn Asp Gln Lys
Ser Leu Leu Lys Thr Ile Lys Cys Lys Glu 675 680
685Trp Ser Pro Leu Ser Lys Glu Phe Leu Glu Leu Arg Tyr Ala
Asp Lys 690 695 700Tyr Gly Glu Ile Arg
Glu Ser Ile Ile Asn Leu Leu Arg Ser Gly Glu705 710
715 720Gly Asn Leu Met Gln Ile Leu Ala Lys Tyr
Asp Tyr Gln Glu Arg Ile 725 730
735Asp Ala Tyr Asn Ala Asp Ser Phe Gln Thr Lys Ser Lys Ser Gln Ile
740 745 750Val Ser Asp Leu Ile
Glu Glu Met Pro Pro Lys Met Arg Arg Pro Val 755
760 765Ile Gln Ala Val Arg Ile Val His Glu Val Val Lys
Val Ala Lys Lys 770 775 780Glu Pro Asp
Gln Ile Ser Ile Glu Val Thr Arg Glu Asn Asn Asn Lys785
790 795 800Glu Lys Lys Gln Gln Leu Thr
Lys Lys Ala Lys Ser Arg Ser Ala Gln 805
810 815Ile Gln Thr Phe Leu Lys Asn Leu Val Lys Ile Asp
Thr Phe Glu Glu 820 825 830Lys
Arg Val Asp Glu Val Leu Glu Glu Leu Lys Lys Tyr Ser Asp Arg 835
840 845Ser Ile Asn Gly Lys His Leu Tyr Leu
Tyr Phe Leu Gln Asn Gly Lys 850 855
860Asp Ala Tyr Thr Gly Lys Pro Ile Asn Ile Asp Asp Val Leu Ser Gly865
870 875 880Asn Lys Tyr Asp
Thr Asp His Val Ile Pro Gln Ser Lys Met Lys Asp 885
890 895Asp Ser Ile Asp Asn Leu Val Leu Val Glu
Arg Ser Ile Asn Gln His 900 905
910Arg Ser Asn Glu Tyr Pro Leu Pro Glu Ser Ile Arg Lys Asn Pro Ala
915 920 925Asn Val Ala Phe Trp Ser Lys
Leu Lys Lys Ala Gly Met Met Ser Glu 930 935
940Lys Lys Phe Asn Asn Leu Thr Arg Ala Asn Pro Leu Thr Glu Glu
Glu945 950 955 960Leu Ser
Ala Phe Val Ala Ala Gln Ile Asn Val Val Asn Arg Ser Asn
965 970 975Ile Val Ile Arg Asp Val Leu
Lys Val Leu Tyr Pro Asn Ala Lys Leu 980 985
990Ile Phe Ser Lys Ala Gln Tyr Pro Ser Gln Ile Arg Lys Glu
Leu Asn 995 1000 1005Ile Pro Lys
Leu Arg Asp Leu Asn Asp Thr His His Ala Val Asp 1010
1015 1020Ala Tyr Leu Asn Ile Val Ser Gly Val Ser Leu
Thr Glu Arg Tyr 1025 1030 1035Gly Asn
Leu Ser Phe Ile Lys Ala Ala Gln Lys Asn Glu Asn Gln 1040
1045 1050Thr Asp Tyr Ser Leu Asn Met Glu Arg Tyr
Ile Ser Ser Leu Ile 1055 1060 1065Gln
Thr Lys Glu Gly Glu Lys Thr Ser Leu Gly Lys Leu Ile Asp 1070
1075 1080Gln Thr Ser Arg Arg His Asp Phe Leu
Leu Thr Tyr Arg Phe Ser 1085 1090
1095Tyr Gln Asp Ser Ala Phe Tyr Asn Gln Thr Ile Tyr Lys Lys Asn
1100 1105 1110Ala Gly Leu Ile Pro Val
His Glu Lys Leu Pro Pro Glu Arg Tyr 1115 1120
1125Gly Gly Tyr Asn Ser Met Ser Thr Glu Val Asn Cys Val Val
Thr 1130 1135 1140Ile Lys Gly Lys Lys
Glu Arg Arg Tyr Leu Val Gly Val Pro His 1145 1150
1155Leu Leu Leu Glu Lys Gly Asn Lys Val Ala Asp Ile Asn
Lys Glu 1160 1165 1170Ile Ala Asn Ser
Val Pro His Lys Glu Asn Glu Thr Ile Ala Val 1175
1180 1185Ser Leu Lys Asp Ile Ile Gln Leu Asp Ser Met
Val Lys Lys Asp 1190 1195 1200Gly Leu
Val Tyr Leu Cys Thr Thr Gln Asn Lys Asp Leu Val Lys 1205
1210 1215Leu Lys Pro Phe Gly Pro Ile Phe Leu Ser
Arg Glu Ser Glu Val 1220 1225 1230Tyr
Leu Ser Asn Leu Asn Lys Phe Val Glu Lys Tyr Pro Asn Ile 1235
1240 1245Ala Asp Gly Asn Glu Asn Tyr Ser Leu
Lys Thr Asn Arg Tyr Gly 1250 1255
1260Glu Lys Ser Ile Asp Phe Leu Gln Glu Lys Thr Gly Asn Val Leu
1265 1270 1275Lys Glu Leu Val Asp Leu
Ser Asn Gln Lys Arg Phe Asp Tyr Cys 1280 1285
1290Pro Met Ile Cys Lys Leu Arg Thr Ile Asp Tyr Arg Lys Gly
Val 1295 1300 1305Glu Gly Lys Thr Leu
Thr Glu Gln Leu Ile Leu Ile Arg Ser Phe 1310 1315
1320Val Gly Val Phe Thr Arg Lys Ser Glu Ala Leu Ser Asn
Gly Ser 1325 1330 1335Asn Phe Arg Lys
Ala Arg Gly Leu Val Leu Gln Asp Gly Leu Val 1340
1345 1350Leu Cys Ser Asp Ser Ile Thr Gly Leu Tyr His
Thr Glu Arg Lys 1355 1360
1365Leu
38 36 DNA Artificial Sequence CR RNA
38 gtttgagagc agtgttgtct tatatagctc gaaaac 36
39 103 DNA Artificial Sequence TRACR RNA
39 gcattgtaag acaacactgc tacgttcaaa taagcatatt gctacaaggt tctccctcgg 60
agaatgacca ttaggtcact tagatagccg gttcttctgg cta 103
40 1360 PRT Artificial Sequence Bacillales bacterium
40 Met Ala Asp Lys Leu Phe Ile Gly Leu Asp Val Gly Ser Glu Ser
Val1 5 10 15Gly Trp Ala
Ala Thr Asp Glu Asn Phe His Leu Tyr Arg Leu Lys Gly 20
25 30Lys Thr Ala Trp Gly Ala Arg Ile Phe Ser
Glu Ala Asn Asp Ala Lys 35 40
45Thr Arg Arg Gly Phe Arg Val Ala Gly Arg Arg Leu Ala Arg Arg Lys 50
55 60Glu Arg Ile Arg Leu Leu Asn Thr Leu
Phe Asp Pro Leu Leu Lys Lys65 70 75
80Asp Pro Ala Phe Leu Leu Arg Leu Glu Asn Ser Ala Ile Gln
Asn Asp 85 90 95Asp Pro
Asn Lys Pro Ile Gln Ala Ile Ala Asp Cys Pro Leu Leu Val 100
105 110Asn Lys Gln Glu Glu Lys Asp Tyr Tyr
Lys Arg Tyr Pro Thr Ile Trp 115 120
125His Leu Arg Lys Ala Leu Met Glu Asn Asp Asp His Ala Phe Ser Asp
130 135 140Ile Arg Phe Leu Tyr Leu Ala
Ile His His Ile Ile Lys Tyr Arg Gly145 150
155 160Asn Phe Leu Arg Glu Gly Asp Ile Lys Ile Gly Gln
Phe Asp Tyr Ser 165 170
175Ile Phe Asp Lys Leu Asn Glu Thr Leu Ala Val Leu Phe Asp Leu Gln
180 185 190Asn Glu Asp Gly Glu Asn
Glu Glu Gly Arg Phe Ile Gly Leu Pro Lys 195 200
205Ser Gln Tyr Glu Ala Phe Ile Thr Cys Ala Asn Asp Arg Asn
Leu Pro 210 215 220Lys Gln Pro Lys Lys
Ala Lys Leu Leu Ser Met Phe Glu Lys Thr Glu225 230
235 240Glu Ser Lys Ala Phe Leu Glu Met Phe Cys
Thr Leu Cys Ser Gly Gly 245 250
255Glu Phe Ser Thr Lys Lys Leu Asn Ala Lys Gly Glu Glu Thr Tyr Gln
260 265 270Asp Ala Lys Ile Ser
Phe Asn Ser Ser Tyr Asp Glu Asn Glu Gly Ala 275
280 285Tyr Gln Glu Ile Leu Gly Asp Phe Phe Asp Leu Val
Asp Ile Ala Lys 290 295 300Ala Val Phe
Asp Tyr Cys Asp Leu Ser Asp Ile Leu Asn Gly Asn Asp305
310 315 320Asn Leu Ser Ser Ala Phe Val
Glu Leu Tyr Asp Ser His Lys Ser Gln 325
330 335Leu Ser Ala Leu Lys Ser Ile Cys Lys Arg Ile Asp
Asn Gln Asn Gly 340 345 350Phe
Ile Gly Glu Lys Ser Ile Tyr Val Lys Leu Phe Asn Asp Pro Asn 355
360 365Asp Lys Ser Asn Tyr Pro Ala Phe Thr
Asn Asn Lys Thr Leu Val Asp 370 375
380Lys Arg Cys Asp Ile His Thr Phe Asp Lys Tyr Val Lys Glu Thr Ile385
390 395 400Leu Pro Tyr Glu
Ser Ser Leu Thr Gly Arg Asp Ala Val Asn Trp Gln 405
410 415Met Leu Lys Ser Leu Ala Glu Gln Asp Arg
Leu Leu Gln Thr Ile Ala 420 425
430Leu Arg Ser Thr Ser Val Ile Pro Met Gln Leu His Gln Lys Glu Leu
435 440 445Lys Ile Ile Leu Lys Asn Ala
Val Ser Arg Asn Ile Lys Gly Val Ala 450 455
460Glu Ile Glu Glu Lys Ile Leu Lys Leu Phe Gln Tyr Lys Ile Pro
Tyr465 470 475 480Tyr Cys
Gly Pro Leu Thr Thr Lys Ser Asp Tyr Ser Asn Val Val Phe
485 490 495Lys Asn Asn Glu Tyr Arg Pro
Leu Lys Pro Trp Asp Tyr Glu Glu Ala 500 505
510Ile Asp Trp Asp Gly Thr Lys Gln Lys Phe Met Glu Gly Leu
Thr Asn 515 520 525Lys Cys Thr Tyr
Leu Lys Asp Lys Asn Val Leu Pro Lys Gln Ser Val 530
535 540Leu Tyr Gln Asp Phe Asp Thr Trp Asn Lys Leu Asn
Asn Leu Lys Val545 550 555
560Asn Gly Asn Lys Pro Ser Leu Glu Asp Leu Asn Asp Leu Phe Ser Phe
565 570 575Val Ser Gln Arg Ser
Lys Thr Thr Met Arg Asp Ile Gln Arg Tyr Leu 580
585 590Lys Ser Lys Thr Asn Ser Lys Glu Asn Asp Val Val
Val Ser Gly Trp 595 600 605Asn Ser
Glu Asp Tyr Ile Cys Cys Ser Ser Arg Ala Ser Phe Asn Lys 610
615 620Asn Gly Ile Phe Asn Leu Asn Asn Ser Glu Val
Leu Lys Glu Cys Glu625 630 635
640Arg Ile Ile Phe Leu Lys Thr Ile Tyr Thr Asp Ser Pro Lys Asp Ala
645 650 655Asp Ala Ala Val
Leu Lys Glu Phe Pro Asp Leu Thr Asn Asn Gln Lys 660
665 670Thr Leu Leu Lys Thr Ile Lys Cys Lys Glu Trp
Ser Pro Leu Ser Lys 675 680 685Glu
Phe Leu Glu Leu Arg Tyr Ser Asp Lys Tyr Gly Glu Ile Arg Gln 690
695 700Ser Ile Ile Asp Leu Leu Arg Asn Gly Glu
Gly Asn Leu Met Gln Ile705 710 715
720Leu Ala Lys Tyr Asp Tyr Gln Glu Val Ile Asp Ala Cys Asn Ala
Ala 725 730 735Ser Phe Gln
Thr Lys Ser Lys Ser Gln Ile Val Ser Asp Leu Ile Glu 740
745 750Glu Met Pro Pro Lys Met Arg Arg Pro Val
Ile Gln Ala Val Arg Ile 755 760
765Val Gln Glu Val Ala Lys Val Ala Lys Lys Glu Pro Asp Glu Ile Ser 770
775 780Ile Glu Val Thr Arg Glu Asn Asn
Asp Lys Glu Lys Lys Gln Gln Leu785 790
795 800Thr Lys Lys Ala Lys Ser Arg Ser Thr Gln Ile Gln
Asn Phe Leu Lys 805 810
815Asn Leu Val Lys Ile Asp Ala Ser Glu Lys Lys Gln Ala Asn Glu Val
820 825 830Leu Glu Glu Leu Lys Lys
Tyr Ser Asp Gln Ser Ile Asn Gly Lys His 835 840
845Leu Tyr Leu Tyr Phe Leu Gln Asn Gly Lys Asp Ala Tyr Thr
Gly Lys 850 855 860Pro Ile Asn Ile Asp
Asp Val Leu Ser Gly Asn Lys Tyr Asp Thr Asp865 870
875 880His Ile Ile Pro Gln Ser Lys Met Lys Asp
Asp Ser Ile Asp Asn Leu 885 890
895Val Leu Val Glu Arg Glu Ile Asn Gln His Arg Ser Asn Glu Tyr Pro
900 905 910Leu Pro Glu Ser Ile
Arg Lys Asn Pro Ala Asn Val Ala Phe Trp Arg 915
920 925Lys Leu Lys Lys Ala Gly Met Met Ser Glu Lys Lys
Phe Asn Asn Leu 930 935 940Thr Arg Ser
Asn Pro Leu Thr Glu Glu Glu Leu Gly Ala Phe Val Ala945
950 955 960Ala Gln Ile Asn Val Val Asn
Arg Ser Asn Val Val Ile Arg Asp Val 965
970 975Leu Lys Ile Leu Tyr Pro Asn Ala Lys Leu Ile Phe
Ser Lys Ala Gln 980 985 990Tyr
Pro Ser Gln Ile Arg Lys Glu Leu Asn Ile Pro Lys Leu Arg Asp 995
1000 1005Leu Asn Asp Thr His His Ala Val
Asp Ala Tyr Leu Asn Ile Val 1010 1015
1020Ser Gly Val Thr Leu Thr Asp Arg Tyr Gly Asn Met Arg Phe Ile
1025 1030 1035Lys Ala Ser Gln Asp Glu
Glu Lys His Ser Leu Asn Met Glu Arg 1040 1045
1050Tyr Ile Ser Ser Leu Ile Gln Thr Lys Glu Gly Gln Arg Thr
Glu 1055 1060 1065Leu Gly Glu Leu Ile
Asp Gln Thr Ser Arg Arg His Asp Phe Leu 1070 1075
1080Leu Thr Tyr Arg Phe Ser Tyr Gln Asp Ser Ala Phe Tyr
Lys Gln 1085 1090 1095Thr Ile Tyr Lys
Lys Asn Ala Gly Leu Ile Pro Ala His Asp Asn 1100
1105 1110Leu Pro Pro Glu Arg Tyr Gly Gly Tyr Asp Ser
Met Ser Thr Glu 1115 1120 1125Val Asn
Cys Val Ala Thr Ile Ile Gly Lys Lys Thr Thr Arg Tyr 1130
1135 1140Leu Val Gly Val Pro His Leu Leu Ile Lys
Lys Ala Lys Asp Gly 1145 1150 1155Ile
Asp Val Asn Asp Glu Leu Ile Lys Leu Val Pro His Lys Glu 1160
1165 1170Asn Glu Val Val Lys Val Asp Leu Asn
Thr Thr Leu Gln Leu Asp 1175 1180
1185Cys Thr Val Lys Lys Asp Gly Phe Met Tyr Leu Cys Thr Ser Asn
1190 1195 1200Asn Ile Ala Leu Val Lys
Leu Lys Pro Phe Ser Pro Ile Phe Leu 1205 1210
1215Ser Arg Glu Ser Glu Ile Tyr Leu Ser Asn Leu Met Lys Tyr
Val 1220 1225 1230Glu Lys Tyr Pro Asn
Ile Ser Asp Glu Asn Ser Glu Tyr Glu Phe 1235 1240
1245Lys Ile Asn Arg Glu Asn Val Asp Pro Ile Lys Phe Thr
Glu Lys 1250 1255 1260Gln Ser Ile Glu
Val Val Gln Asp Leu Ile Ile Lys Ala Lys Gln 1265
1270 1275Asp Arg Phe Ser Tyr Cys Ser Met Ile Ser Lys
Leu Arg Asp Ile 1280 1285 1290Asn Ala
Glu Glu Met Ile His Ser Lys Ser Leu Thr Glu Gln Leu 1295
1300 1305Lys Ile Ile Lys Ser Leu Ile Gly Val Phe
Thr Arg Lys Ser Glu 1310 1315 1320Ile
Leu Ser Asp Lys Asn Asn Phe Arg Lys Ser Arg Gly Ala Ile 1325
1330 1335Leu Gln Glu Asp Leu Phe Leu Cys Ser
Asp Ser Ile Thr Gly Leu 1340 1345
1350Tyr His Thr Glu Arg Lys Leu 1355
1360
41 36 DNA Artificial Sequence CR RNA
41 gtttgagagc agtgttgtct tatatagctc gaaaac 36
42 103 DNA Artificial Sequence TRACR RNA
42 gcattgtaag acaacactgc tacgttcaaa taagcatatt gctacaaggt tctccattgg 60
agaatgacca ttaggtcgct tagatagcca gttcttctgg cta 103
43 1369 PRT Artificial Sequence Bacillales bacterium
43 Met Glu Gln Asn Thr
Lys Lys Leu Phe Ile Gly Leu Asp Val Gly Thr1 5
10 15Asp Ser Val Gly Trp Ala Ala Thr Asp Glu Tyr
Phe Asn Leu Tyr Arg 20 25
30Leu Lys Gly Lys Thr Ala Trp Gly Ala Arg Leu Phe Leu Asp Ala Ala
35 40 45Asn Ala Lys Asp Arg Arg Gln His
Arg Val Ser Gly Arg Arg Leu Ala 50 55
60Arg Arg Lys Glu Arg Ile Arg Leu Leu Asn Ala Leu Phe Asp Pro Leu65
70 75 80Leu Lys Lys Val Asp
Pro Thr Phe Leu Leu Arg Leu Glu Ser Ser Thr 85
90 95Leu Gln Asn Asp Asp Pro Asn Lys Asp Gln Arg
Ala Val Ser Asp Ala 100 105
110Leu Leu Phe Gly Asn Lys Lys His Glu Lys Ala Tyr Tyr Ala Ala Phe
115 120 125Pro Thr Ile Trp His Leu Arg
Lys Ala Leu Ile Glu Asn Asp Asp Lys 130 135
140Ala Phe Ser Asp Ile Arg Tyr Leu Tyr Leu Ala Ile His His Ile
Ile145 150 155 160Lys Tyr
Arg Gly Asn Phe Leu Arg Gln Gly Glu Ile Lys Ile Gly Glu
165 170 175Phe Asp Phe Ser Cys Phe Asp
Lys Leu Asn Gln Phe Phe Asp Ile Tyr 180 185
190Phe Ser Lys Glu Asp Glu Glu Glu Val Glu Phe Ile Gly Leu
Pro Asn 195 200 205Glu Asn Tyr Gln
Arg Phe Ile Asp Cys Ala Ala Asp Lys Asn Leu Gly 210
215 220Lys Gly Lys Lys Lys Gly Asp Leu Leu Lys Leu Met
Ser Phe Ser Glu225 230 235
240Asp Glu Lys Pro Phe Cys Glu Met Phe Cys Ser Leu Cys Ala Gly Leu
245 250 255Ala Phe Ser Thr Lys
Lys Leu Asn Lys Lys Asp Glu Thr Val Phe Glu 260
265 270Asp Ile Lys Val Glu Phe Asn Gly Lys Phe Asp Asp
Lys Gln Glu Glu 275 280 285Ile Lys
Ser Val Leu Gly Asp Ala Tyr Asp Leu Val Glu Leu Ala Lys 290
295 300Phe Ile Phe Asp Tyr Cys Asp Leu Lys Asp Ile
Leu Gly Ala Ser Thr305 310 315
320Asn Arg Leu Ser Glu Ala Phe Ala Gly Ile Tyr Asp Ser His Lys Glu
325 330 335Glu Leu Lys Ala
Leu Lys Gly Ile Cys Arg Glu Ile Asp Arg Ser Leu 340
345 350Gly Asn Glu Ser Lys Asn Ser Leu Tyr Arg Glu
Val Phe Asn Asp Lys 355 360 365Gly
Ile Pro Asn Asn Tyr Ala Ala Phe Ile His His Glu Thr Asn Ser 370
375 380Ser Arg Cys Gly Ile Ala Asp Phe Asn Asn
Tyr Val Leu Gln Lys Ile385 390 395
400Glu Pro Leu Glu Asn Leu Leu Ser Lys Gln Asn Tyr Lys Asn Trp
Ile 405 410 415Gln Leu Lys
Gln Leu Ala Ser Gln Gly Arg Leu Leu Gln Thr Ile Ala 420
425 430Ile Arg Ser Thr Ser Ile Ile Pro Met Gln
Leu His Leu Lys Asp Leu 435 440
445Lys Leu Ile Leu Ala Asn Ala Glu Lys Arg Asp Ile Pro Gly Ile Lys 450
455 460Asp Ile Lys Glu Lys Ile Leu Leu
Leu Phe Gln Phe Lys Val Pro Tyr465 470
475 480Tyr Cys Gly Pro Leu Thr Asp Arg Ser Gln Tyr Ser
Asn Val Val Leu 485 490
495Lys Ala Gly Thr Arg Glu Lys Ile Thr Pro Trp Asn Phe Ala Asp Gln
500 505 510Val Asp Leu Glu Glu Thr
Lys Lys Lys Phe Met Glu Gly Leu Thr Asn 515 520
525Lys Cys Thr Tyr Leu Lys Asp Cys Asn Val Leu Pro Arg Gln
Ser Leu 530 535 540Met Phe Gln Glu Tyr
Asp Ala Trp Asn Lys Leu Asn Asn Leu Ser Ile545 550
555 560Asn Gly Asn Lys Pro Ser Pro Glu Glu Met
Asn Ala Leu Phe Asp Phe 565 570
575Ala Ser Lys Arg Arg Lys Thr Thr Met Ser Asp Ile Lys Lys Phe Glu
580 585 590Lys Arg Ala Thr Met
Ser Lys Glu Asn Asp Val Thr Val Ser Gly Trp 595
600 605Asn Glu Asn Asp Phe Ile Asp Leu Ser Ser Phe Val
Ser Leu Ser Gly 610 615 620Phe Phe Asp
Leu Gly Glu Ile His Ser Ala Asp Tyr Met Ala Cys Glu625
630 635 640Glu Ala Ile Leu Leu Lys Thr
Ile Phe Thr Asp Ala Pro Gln Asp Ala 645
650 655Asp Pro Ile Ile Ala Glu Lys Phe Pro Asn Leu Lys
Pro Asn Gln Leu 660 665 670Ala
Ala Leu Lys Lys Met Ser Cys Lys Gly Trp Ala Thr Leu Ser Arg 675
680 685Glu Phe Leu Thr Leu Lys Ala Val Asp
Ala Asp Gly Glu Val Met Asn 690 695
700Glu Thr Leu Leu Gly Leu Met Lys Glu Gly Lys Gly Asn Leu Met Gln705
710 715 720Leu Leu His Ser
Ser Leu Tyr Asn Phe Gln Asp Val Ile Asp Ser His 725
730 735Asn Arg Ala Val Phe Gly Asp Lys Ser Pro
Lys Gln Ile Ala Asn Asp 740 745
750Leu Ile Glu Glu Met Pro Pro Gln Met Arg Arg Pro Val Ile Gln Ala
755 760 765Leu Arg Ile Val Arg Glu Val
Ser Lys Val Ala Lys Lys Gln Pro Asp 770 775
780Val Ile Ser Ile Glu Val Thr Arg Glu Ser Asn Asp Lys Lys Lys
Lys785 790 795 800Glu Glu
Trp Ser Lys Lys Ala Thr Asp Arg Lys Lys Gln Ile Asp Leu
805 810 815Phe Leu Lys Asn Leu Lys Lys
Thr Glu Asp Val Lys Gln Thr Glu Ser 820 825
830Glu Leu Asp Gly Gln Ala Ile Asn Asp Ile Asp Ser Ile Arg
Gly Lys 835 840 845His Leu Tyr Leu
Tyr Phe Leu Gln Asn Gly Lys Asp Ala Tyr Thr Gly 850
855 860Leu Pro Ile Asp Ile Asn Asp Val Leu Asn Gly Thr
Lys Tyr Asp Thr865 870 875
880Asp His Ile Ile Pro Gln Ser Leu Met Lys Asp Asp Ser Ile Asp Asn
885 890 895Leu Val Leu Val Asn
Arg Glu Lys Asn Gln His Lys Ser Asn Glu Phe 900
905 910Pro Leu Pro Arg Asp Ile Gln Thr Lys Ala Asn Ile
Glu Arg Trp Arg 915 920 925Ala Leu
Lys Lys Ala Gly Gly Met Ser Glu Lys Lys Phe Asn Asn Leu 930
935 940Thr Arg Thr Thr Pro Leu Thr Glu Glu Glu Leu
Ser Ala Phe Val Ala945 950 955
960Ala Gln Ile Asn Val Val Asn Arg Ser Asn Val Val Ile Arg Asp Val
965 970 975Leu Lys Ile Leu
Tyr Pro Asn Ala Lys Leu Ile Phe Ser Lys Ala Gln 980
985 990Tyr Pro Ser Gln Ile Arg Arg Asp Leu Glu Ile
Pro Lys Leu Arg Asp 995 1000
1005Leu Asn Asp Thr His His Ala Val Asp Ala Phe Leu Asn Ile Val
1010 1015 1020Ser Gly Val Glu Leu Thr
Lys Gln Phe Gly Arg Met Asp Val Ile 1025 1030
1035Lys Ala Ala Ala Lys Gly Asp Lys Asp His Ser Leu Asn Met
Thr 1040 1045 1050Arg Tyr Leu Glu Arg
Leu Leu Lys Lys Val Asp Glu Asn Lys Asn 1055 1060
1065Glu Thr Met Thr Glu Leu Gly Asn His Val Phe Val Thr
Ser Gln 1070 1075 1080Arg His Asp Phe
Leu Leu Thr Tyr Arg Phe Asp Tyr Gln Asp Ser 1085
1090 1095Ala Phe Tyr Asn Ala Thr Ile Tyr Ser Pro Asp
Lys Asn Leu Ile 1100 1105 1110Pro Met
His Asp Gly Met Asp Pro Glu Arg Tyr Gly Gly Tyr Ser 1115
1120 1125Ser Leu Asn Ile Glu Tyr Asn Cys Ile Ala
Thr Ile Lys Gly Lys 1130 1135 1140Lys
Lys Thr Thr Arg Tyr Leu Leu Gly Val Pro His Leu Leu Ala 1145
1150 1155Leu Lys Phe Lys Asn Asp Gly Ile Asp
Ile Thr Ser Asp Leu Ile 1160 1165
1170Lys Leu Val Pro His Lys Gly Asp Glu Glu Val Ser Ile Asp Trp
1175 1180 1185Lys Asn Pro Ile Pro Leu
Arg Ile Thr Val Lys Lys Asp Gly Val 1190 1195
1200Glu Tyr Leu Leu Ala Pro Phe Asn Ala Gln Val Met Glu Leu
Lys 1205 1210 1215Pro Val Ser Pro Val
Phe Leu Pro Arg Glu Ala Ala Glu Tyr Leu 1220 1225
1230Ala Arg Leu Lys Lys Ala Val Asp Gln Lys Lys Gln Phe
Ile Tyr 1235 1240 1245Gln Asn Ser Ala
Glu Ile Phe Gln Ser Lys Asp Lys Asn Asn Ala 1250
1255 1260Leu Gln Phe Gly Pro Glu Gln Ser Lys Asn Val
Ala Leu Lys Ile 1265 1270 1275Tyr Ala
Leu Ala Asp Ala Lys Lys Tyr Asp Tyr Cys Ala Met Ile 1280
1285 1290Ser Lys Leu Arg Asp Ala Ala Leu Arg Ala
Glu Met Leu Asp Ser 1295 1300 1305Leu
Ser Ser Glu Ala Leu Phe Lys Gln Tyr Asn Asp Leu Ile Ser 1310
1315 1320Leu Leu Ser Gln Leu Thr Arg Arg Ser
Lys Lys Ile Ser Ser Lys 1325 1330
1335Tyr Phe Ser Lys Ser Arg Gly Ala Leu Leu Gln Asp Gly Leu Lys
1340 1345 1350Ile Val Ser Lys Ser Ile
Thr Gly Leu Tyr Glu Thr Glu Arg Asn 1355 1360
1365Leu
44 36 DNA Artificial Sequence CR RNA
44 gtttgagagc agtgttgtct taaatagctc gaaaac 36
45 104 DNA Artificial Sequence TRACR RNA
45 gcattgtaag acaacactgc acgttcaaat aagcagattg ctacaaggtt cccgtaaggg 60
aatgaccatc tggtcacatg aatagccccc ggcaacggtg gctg 104
46 1121 PRT Artificial Sequence Selenomonadaceae
46 Met Asn Tyr Ile Leu Gly Leu Asp Ile Gly Ile Ala Ser Val Gly Trp1
5 10 15Ala Ala Val Ala Leu Asp
Ala Asn Asp Glu Pro Cys Lys Ile Leu Asp 20 25
30Leu Asn Ala Arg Ile Phe Glu Ala Ala Glu Gln Pro Lys
Thr Gly Ala 35 40 45Ser Leu Ala
Ala Pro Arg Arg Glu Ala Arg Gly Ser Arg Arg Arg Thr 50
55 60Arg Arg Arg Arg His Arg Met Glu Arg Leu Arg His
Leu Phe Ala Arg65 70 75
80Glu Glu Leu Ile Ser Ala Glu Asn Ile Ala Ala Leu Phe Glu Ala Pro
85 90 95Ala Asp Val Tyr Arg Leu
Arg Ala Glu Gly Leu Ser Arg Arg Leu Asp 100
105 110Glu Gly Glu Trp Ala Arg Val Leu Tyr His Ile Ala
Lys Arg Arg Gly 115 120 125Phe Lys
Ser Asn Arg Lys Gly Ala Ala Ser Asp Ala Asp Glu Gly Lys 130
135 140Val Leu Glu Ala Val Lys Glu Asn Glu Ala Leu
Leu Lys Asn Tyr Lys145 150 155
160Thr Val Gly Glu Met Met Phe Arg Asp Glu Lys Phe Gln Thr Ala Lys
165 170 175Arg Asn Lys Gly
Gly Ser Tyr Thr Phe Cys Val Ser Arg Gly Met Leu 180
185 190Ala Glu Glu Ile Gly Glu Leu Phe Ala Ala Gln
Arg Glu Gln Gly Asn 195 200 205Pro
His Ala Ser Glu Thr Phe Glu Thr Ala Tyr Ser Lys Ile Phe Ala 210
215 220Asp Gln Arg Ser Phe Asp Asp Gly Pro Asp
Ala Asn Ser Arg Ser Pro225 230 235
240Tyr Ala Gly Asn Gln Ile Glu Lys Met Ile Gly Thr Cys Ser Leu
Glu 245 250 255Thr Asp Pro
Pro Glu Lys Arg Ala Ala Lys Ala Ser Tyr Ser Phe Met 260
265 270Arg Phe Ser Leu Leu Gln Lys Ile Asn His
Leu Arg Leu Lys Asp Ala 275 280
285Lys Gly Glu Glu Arg Pro Leu Thr Asp Glu Glu Arg Ala Ala Val Glu 290
295 300Ala Leu Ala Trp Lys Ser Pro Ser
Leu Thr Tyr Gly Ala Ile Arg Lys305 310
315 320Ala Leu Pro Leu Pro Asp Glu Leu Arg Phe Thr Asp
Leu Tyr Tyr Arg 325 330
335Trp Asp Lys Lys Pro Glu Glu Ile Glu Lys Lys Lys Leu Pro Phe Ala
340 345 350Ala Pro Tyr His Glu Ile
Arg Lys Ala Leu Asp Lys Arg Glu Lys Gly 355 360
365Arg Ile Gln Ser Leu Thr Pro Asp Ala Leu Asp Ala Val Gly
Tyr Ala 370 375 380Phe Thr Val Phe Lys
Asn Asp Ala Lys Ile Glu Ala Ala Leu Ser Ala385 390
395 400Ala Gly Ile Asp Gly Glu Asp Ala Val Ala
Leu Met Ala Ala Gly Leu 405 410
415Thr Phe Arg Gly Phe Gly His Ile Ser Val Lys Ala Cys Arg Lys Leu
420 425 430Ile Pro His Leu Glu
Lys Gly Met Thr Tyr Asp Lys Ala Cys Lys Glu 435
440 445Ala Gly Tyr Asp Leu Gln Lys Thr Gly Gly Glu Lys
Thr Lys Leu Leu 450 455 460Ser Gly Asn
Leu Asp Glu Ile Arg Glu Ile Pro Asn Pro Val Val Arg465
470 475 480Arg Ala Ile Ala Gln Thr Val
Lys Val Val Asn Ala Val Ile Arg Arg 485
490 495Tyr Gly Ser Pro Val Ala Val Asn Val Glu Leu Ala
Arg Glu Met Gly 500 505 510Arg
Thr Phe Gln Glu Arg Arg Asp Met Met Lys Ser Met Glu Asp Asn 515
520 525Asn Ala Glu Asn Glu Lys Arg Lys Glu
Glu Leu Lys Gly Tyr Gly Val 530 535
540Val His Pro Ser Gly Leu Asp Ile Val Lys Leu Lys Leu Tyr Lys Glu545
550 555 560Gln Gly Gly Val
Cys Ala Tyr Ser Leu Ala Ala Met Pro Ile Glu Lys 565
570 575Val Leu Lys Asp His Asp Tyr Ala Glu Val
Asp His Ile Leu Pro Tyr 580 585
590Ser Arg Ser Phe Asp Asp Ser Tyr Ala Asn Lys Val Leu Val Leu Ser
595 600 605Lys Glu Asn Arg Asp Lys Gly
Asn Arg Thr Pro Met Glu Tyr Met Ala 610 615
620Asn Met Pro Gly Arg Arg His Asp Phe Ile Thr Trp Val Lys Ser
Ala625 630 635 640Val Arg
Asn Pro Arg Lys Arg Asp Asn Leu Leu Leu Glu Lys Phe Gly
645 650 655Glu Asp Lys Glu Ala Ala Trp
Lys Glu Arg His Leu Thr Asp Thr Lys 660 665
670Tyr Ile Gly Ser Phe Ile Ala Asn Leu Leu Arg Asp His Leu
Glu Phe 675 680 685Ala Pro Trp Leu
Asn Gly Lys Lys Lys Gln His Val Leu Ala Val Asn 690
695 700Gly Ala Val Thr Asp Tyr Thr Arg Lys Arg Leu Gly
Ile Arg Lys Ile705 710 715
720Arg Glu Asp Gly Asp Leu His His Ala Val Asp Ala Ala Val Ile Ala
725 730 735Thr Val Thr Gln Gly
Asn Ile Gln Lys Leu Thr Asp Tyr Ser Lys Gln 740
745 750Ile Glu Arg Ala Phe Val Lys Asn Arg Asp Gly Arg
Tyr Val Asn Pro 755 760 765Asp Thr
Gly Glu Val Leu Lys Lys Asp Glu Trp Ile Val Gln Arg Ser 770
775 780Arg His Phe Pro Glu Pro Trp Pro Gly Phe Arg
His Glu Leu Glu Ala785 790 795
800Arg Val Ser Asp His Pro Lys Glu Met Ile Glu Ser Leu Arg Leu Pro
805 810 815Thr Tyr Thr Pro
Glu Glu Ile Asp Gly Leu Lys Pro Pro Phe Val Ser 820
825 830Arg Met Pro Thr Arg Lys Val Arg Gly Ala Ala
His Leu Glu Thr Val 835 840 845Val
Ser Pro Arg Leu Lys Asp Glu Gly Met Ile Val Lys Lys Val Ser 850
855 860Leu Asp Ala Leu Lys Leu Thr Lys Asp Lys
Asp Ala Ile Glu Asn Tyr865 870 875
880Tyr Ala Pro Glu Ser Asp His Leu Leu Tyr Glu Ala Leu Leu His
Arg 885 890 895Leu Gln Ala
Phe Gly Gly Asp Gly Glu Lys Ala Phe Ala Glu Ser Phe 900
905 910His Lys Pro Lys Ala Asp Gly Thr Pro Gly
Pro Val Val Lys Lys Val 915 920
925Lys Ile Ala Glu Lys Ser Thr Leu Ser Val Pro Val His His Gly Arg 930
935 940Gly Leu Ala Ala Asn Gly Gly Met
Val Arg Val Asp Val Phe Phe Ile945 950
955 960Pro Glu Gly Lys Asp Arg Gly Tyr Tyr Leu Val Pro
Val Tyr Thr Ser 965 970
975Asp Val Val Arg Gly Glu Leu Pro Met Arg Ala Val Val Gln Gly Lys
980 985 990Ser Tyr Ala Glu Trp Lys
Leu Met Arg Glu Glu Asp Phe Ile Phe Ser 995 1000
1005Leu Tyr Pro Asn Asp Leu Val Tyr Ile Glu His Glu
Lys Gly Val 1010 1015 1020Lys Val Lys
Ile Gln Lys Lys Leu Arg Glu Ile Ser Thr Leu Pro 1025
1030 1035Arg Glu Lys Thr Met Thr Ser Gly Leu Phe Tyr
Tyr Arg Thr Met 1040 1045 1050Gly Ile
Ala Val Ala Ser Ile His Ile Tyr Ala Pro Asp Gly Val 1055
1060 1065Tyr Val Gln Glu Ser Leu Gly Val Lys Thr
Leu Lys Glu Phe Lys 1070 1075 1080Lys
Trp Thr Ile Asp Ile Leu Gly Gly Glu Pro His Pro Val Gln 1085
1090 1095Lys Glu Lys Arg Gln Asp Phe Ala Ser
Val Lys Arg Asp Pro His 1100 1105
1110Ala Ala Lys Ser Thr Ser Ser Gly 1115
1120
47 36 DNA Artificial Sequence CR RNA
47 attgtaccat agcgagttaa attagggaat tacaac 36
48 101 DNA Artificial Sequence TRACR RNA
48 ttgtaataac ctattttacc tcgctatggc acaatttgtt attacatgga cattatacta 60
aacatttcct aaaaaagcaa cgaaaaacgt gctggcagca a 101
49 1087 PRT Ruminococcus albus
49 Met Lys Tyr Ile Ile Gly Leu Asp Met Gly
Ile Thr Ser Val Gly Phe1 5 10
15Ala Thr Met Met Leu Asp Asp Lys Asp Glu Pro Cys Arg Ile Ile Arg
20 25 30Met Gly Ser Arg Ile Phe
Glu Ala Ala Glu His Pro Lys Asp Gly Ser 35 40
45Ser Leu Ala Ala Pro Arg Arg Ile Asn Arg Gly Met Arg Arg
Arg Leu 50 55 60Arg Arg Lys Ser His
Arg Lys Glu Arg Ile Lys Asp Leu Ile Ile Lys65 70
75 80Asn Glu Leu Met Thr Ala Asp Glu Ile Ser
Ala Ile Tyr Ser Thr Gly 85 90
95Lys Gln Leu Ser Asp Ile Tyr Gln Ile Arg Ala Glu Ala Leu Asp Arg
100 105 110Lys Leu Asn Thr Glu
Glu Phe Val Arg Leu Leu Ile His Leu Ser Gln 115
120 125Arg Arg Gly Phe Lys Ser Asn Arg Lys Val Asp Ala
Lys Glu Lys Gly 130 135 140Ser Asp Ala
Gly Lys Leu Leu Ser Ala Val Asn Ser Asn Lys Glu Leu145
150 155 160Met Ile Glu Lys Asn Tyr Arg
Thr Ile Gly Glu Met Leu Tyr Lys Asp 165
170 175Glu Lys Phe Ser Glu Tyr Lys Arg Asn Lys Ala Asp
Asp Tyr Ser Asn 180 185 190Thr
Phe Ala Arg Ser Glu Tyr Glu Asp Glu Ile Arg Gln Ile Phe Ser 195
200 205Ala Gln Gln Glu His Gly Asn Pro Tyr
Ala Thr Asp Glu Leu Lys Glu 210 215
220Ser Tyr Leu Asp Ile Tyr Leu Ser Gln Arg Ser Phe Asp Glu Gly Pro225
230 235 240Gly Gly Ser Ser
Pro Tyr Gly Gly Asn Gln Ile Glu Lys Met Ile Gly 245
250 255Asn Cys Thr Leu Glu Pro Glu Glu Lys Arg
Ala Ala Lys Ala Thr Phe 260 265
270Ser Phe Glu Tyr Phe Asn Leu Leu Ser Lys Val Asn Ser Ile Lys Ile
275 280 285Val Ser Ser Ser Gly Lys Arg
Ala Leu Asn Asn Asp Glu Arg Gln Ser 290 295
300Val Ile Arg Leu Ala Phe Ala Lys Asn Ala Ile Ser Tyr Thr Ser
Leu305 310 315 320Arg Lys
Glu Leu Asn Met Glu Tyr Ser Glu Arg Phe Asn Ile Ser Tyr
325 330 335Ser Gln Ser Asp Lys Ser Ile
Glu Glu Ile Glu Lys Lys Thr Lys Phe 340 345
350Thr Tyr Leu Thr Ala Tyr His Thr Phe Lys Lys Ala Tyr Gly
Ser Val 355 360 365Phe Val Glu Trp
Ser Ala Asp Lys Lys Asn Ser Leu Ala Tyr Ala Leu 370
375 380Thr Ala Tyr Lys Asn Asp Thr Lys Ile Ile Glu Tyr
Leu Thr Gln Lys385 390 395
400Gly Phe Asp Ala Ala Glu Thr Asp Ile Ala Leu Thr Leu Pro Ser Phe
405 410 415Ser Lys Trp Gly Asn
Leu Ser Glu Lys Ala Leu Asn Asn Ile Ile Pro 420
425 430Tyr Leu Glu Gln Gly Met Leu Tyr His Asp Ala Cys
Thr Ala Ala Gly 435 440 445Tyr Asn
Phe Lys Ala Asp Asp Thr Asp Lys Arg Met Tyr Leu Pro Ala 450
455 460His Glu Lys Glu Ala Pro Glu Leu Asp Asp Ile
Thr Asn Pro Val Val465 470 475
480Arg Arg Ala Ile Ser Gln Thr Ile Lys Val Ile Asn Ala Leu Ile Arg
485 490 495Glu Met Gly Glu
Ser Pro Cys Phe Val Asn Ile Glu Leu Ala Arg Glu 500
505 510Leu Ser Lys Asn Lys Ala Glu Arg Ser Lys Ile
Glu Lys Gly Gln Lys 515 520 525Glu
Asn Gln Val Arg Asn Asp Arg Ile Met Glu Arg Leu Arg Asn Glu 530
535 540Phe Gly Leu Leu Ser Pro Thr Gly Gln Asp
Leu Ile Lys Leu Lys Leu545 550 555
560Trp Glu Glu Gln Asp Gly Ile Cys Pro Tyr Ser Leu Lys Pro Ile
Lys 565 570 575Ile Glu Lys
Leu Phe Asp Val Gly Tyr Thr Asp Ile Asp His Ile Ile 580
585 590Pro Tyr Ser Leu Ser Phe Asp Asp Thr Tyr
Asn Asn Lys Val Leu Val 595 600
605Met Ser Ser Glu Asn Arg Gln Lys Gly Asn Arg Ile Pro Met Gln Tyr 610
615 620Leu Glu Gly Lys Arg Gln Asp Asp
Phe Trp Leu Trp Val Asp Asn Ser625 630
635 640Asn Leu Ser Arg Arg Lys Lys Gln Asn Leu Thr Lys
Glu Thr Leu Ser 645 650
655Glu Asp Asp Leu Ser Gly Phe Lys Lys Arg Asn Leu Gln Asp Thr Gln
660 665 670Tyr Leu Ser Arg Phe Met
Met Asn Tyr Leu Lys Lys Tyr Leu Ala Leu 675 680
685Ala Pro Asn Thr Thr Gly Arg Lys Asn Thr Ile Gln Ala Val
Asn Gly 690 695 700Ala Val Thr Ser Tyr
Leu Arg Lys Arg Trp Gly Ile Gln Lys Val Arg705 710
715 720Glu Asn Gly Asp Thr His His Ala Val Asp
Ala Val Val Ile Ser Cys 725 730
735Val Thr Ala Gly Met Thr Lys Arg Val Ser Glu Tyr Ala Lys Tyr Lys
740 745 750Glu Thr Glu Phe Gln
Asn Pro Gln Thr Gly Glu Phe Phe Asp Val Asp 755
760 765Ile Arg Thr Gly Glu Val Ile Asn Arg Phe Pro Leu
Pro Tyr Ala Arg 770 775 780Phe Arg Asn
Glu Leu Leu Met Arg Cys Ser Glu Asn Pro Ser Arg Ile785
790 795 800Leu His Glu Met Pro Leu Pro
Thr Tyr Ala Ala Asp Glu Lys Val Ala 805
810 815Pro Ile Phe Val Ser Arg Met Pro Lys His Lys Val
Lys Gly Ser Ala 820 825 830His
Lys Glu Thr Ile Arg Arg Ala Phe Glu Glu Asp Gly Lys Lys Tyr 835
840 845Thr Val Ser Lys Val Pro Leu Thr Asp
Leu Lys Leu Lys Asn Gly Glu 850 855
860Ile Glu Asn Tyr Tyr Asn Pro Glu Ser Asp Gly Leu Leu Tyr Asn Ala865
870 875 880Leu Lys Glu Gln
Leu Ile Ala Phe Gly Gly Asp Ala Ala Lys Ala Phe 885
890 895Glu Gln Pro Phe Tyr Lys Pro Lys Ser Asp
Gly Ser Glu Gly Pro Leu 900 905
910Val Lys Lys Val Lys Leu Ile Asn Lys Ala Thr Leu Thr Val Pro Val
915 920 925Leu Asn Asn Thr Ala Val Ala
Asp Asn Gly Ser Met Val Arg Val Asp 930 935
940Val Phe Phe Val Glu Gly Glu Gly Tyr Tyr Leu Val Pro Ile Tyr
Val945 950 955 960Ala Asp
Thr Val Lys Lys Glu Leu Pro Asn Lys Ala Ile Ile Ala Asn
965 970 975Lys Pro Tyr Glu Glu Trp Lys
Glu Met Arg Glu Glu Asn Phe Val Phe 980 985
990Ser Leu Tyr Pro Asn Asp Leu Ile Lys Ile Ser Ser Arg Lys
Asp Met 995 1000 1005Lys Phe Asn
Leu Val Asn Lys Glu Ser Thr Leu Ala Pro Asn Cys 1010
1015 1020Gln Ser Lys Glu Ala Leu Val Tyr Tyr Lys Gly
Ser Asp Ile Ser 1025 1030 1035Thr Ala
Ala Val Thr Ala Ile Asn His Asp Asn Thr Tyr Lys Leu 1040
1045 1050Arg Gly Leu Gly Val Lys Thr Leu Leu Lys
Ile Glu Lys Tyr Gln 1055 1060 1065Val
Asp Val Leu Gly Asn Val Phe Lys Val Gly Lys Glu Lys Arg 1070
1075 1080Val Arg Phe Lys
10855036DNAArtificial SequenceCR RNA 50gttgtagttc cctaattatt cttggtatgg
tataat 365189DNAArtificial SequenceTRACR
RNA 51attgtatcat accaagaaca attaggttac tatgataagg tagtataccg caaagctcta
60acacctcatc ttcggatgag gtgttatct
89521084PRTArtificial SequenceFaecalibacterium 52Met Lys Asn Thr Leu Tyr
Gly Ile Gly Leu Asp Ile Gly Val Ala Ser1 5
10 15Val Gly Trp Ala Val Val Gly Leu Asn Gly Thr Gly
Glu Pro Val Gly 20 25 30Leu
His Arg Leu Gly Val Arg Ile Phe Asp Lys Ala Glu Gln Pro Lys 35
40 45Thr Gly Glu Ser Leu Ala Ala Pro Arg
Arg Met Ala Arg Gly Met Arg 50 55
60Arg Arg Leu Arg Arg Lys Ala Leu Arg Arg Ala Asp Val Tyr Ala Leu65
70 75 80Leu Glu Arg Ser Gly
Leu Ser Thr Arg Glu Ala Leu Ala Gln Met Phe 85
90 95Glu Ala Gly Gly Leu Glu Asp Ile Tyr Ala Leu
Arg Thr Arg Ala Leu 100 105
110Asp Glu Pro Val Gly Lys Ala Glu Phe Ser Arg Ile Leu Leu His Leu
115 120 125Ala Gln Arg Arg Gly Phe Lys
Ser Asn Arg Arg Thr Ala Ser Asp Gly 130 135
140Glu Asp Gly Arg Leu Leu Ala Ala Val Asn Glu Asn Arg Arg Arg
Met145 150 155 160Ala Gln
Gly Gly Trp Arg Thr Val Gly Glu Met Leu Tyr Arg His Glu
165 170 175Ala Phe Ala Leu Arg Lys Arg
Asn Lys Ala Asp Glu Tyr Leu Ser Thr 180 185
190Val Gly Arg Asp Met Val Ala Glu Glu Ala Ser Leu Leu Phe
Gln Arg 195 200 205Gln Arg Glu Leu
Gly Cys Ala Trp Ala Thr Pro Glu Leu Gln Ala Glu 210
215 220Tyr Leu Ser Ile Leu Leu Arg Gln Arg Ser Phe Asp
Glu Gly Pro Gly225 230 235
240Gly Asn Ser Pro Tyr Gly Gly Asn Gln Val Glu Lys Met Val Gly Arg
245 250 255Cys Thr Phe Glu Pro
Asp Glu Pro Arg Ala Ala Lys Ala Ala Tyr Ser 260
265 270Phe Glu Tyr Phe Ser Leu Leu Gln Lys Leu Asn His
Ile Arg Leu Ala 275 280 285Glu Asn
Gly Glu Thr Arg Pro Leu Thr Gln Pro Gln Arg Gln Gln Leu 290
295 300Leu Ser Leu Ala His Lys Thr Pro Asp Val Ser
Leu Ala Arg Ile Arg305 310 315
320Lys Glu Leu Ala Leu Pro Glu Thr Val Gln Phe Asn Gly Val Arg Cys
325 330 335Arg Ala Asn Glu
Thr Leu Glu Glu Ser Glu Lys Lys Glu Lys Phe Ala 340
345 350Cys Leu Pro Ala Tyr His Lys Met Arg Lys Ala
Leu Asp Gly Val Val 355 360 365Lys
Gly Arg Ile Ser Ser Leu Ser Ile Ser Gln Arg Asp Ala Ala Ala 370
375 380Thr Ala Leu Ser Leu Tyr Lys Asn Glu Asp
Thr Leu Arg Ala Lys Leu385 390 395
400Thr Glu Ala Gly Phe Gln Ala Pro Glu Ile Asp Ala Leu Ala Gly
Leu 405 410 415Thr Gly Phe
Ser Lys Phe Gly His Leu Ser Leu Lys Ala Cys Arg Lys 420
425 430Leu Ile Pro His Leu Glu Gln Gly Leu Thr
Tyr Asp Gln Ala Cys Ser 435 440
445Ala Ala Gly Tyr Asp Phe Lys Gly His Gly Ala Gly Glu Arg Ala Phe 450
455 460Thr Leu Pro Ala Ala Ala Pro Glu
Met Glu Gln Ile Thr Ser Pro Val465 470
475 480Val Arg Arg Ala Val Ala Gln Thr Ile Lys Val Val
Asn Gly Ile Ile 485 490
495Arg Glu Met Asp Ala Ser Pro Ala Trp Val Arg Ile Glu Leu Ala Arg
500 505 510Glu Leu Ser Lys Thr Phe
Gly Glu Arg Gln Glu Met Asp Arg Ser Met 515 520
525Arg Glu Asn Ala Ala Gln Asn Glu Arg Leu Met Gln Glu Leu
Arg Asp 530 535 540Thr Phe His Leu Leu
Ser Pro Thr Gly Gln Asp Leu Val Lys Tyr Arg545 550
555 560Leu Trp Lys Glu Gln Asp Gly Val Cys Ala
Tyr Ser Leu Arg Arg Leu 565 570
575Asp Val Glu Arg Leu Phe Glu Pro Gly Tyr Val Asp Val Asp His Ile
580 585 590Val Pro Tyr Ser Leu
Ser Phe Asp Asp Arg Arg Ser Asn Lys Val Leu 595
600 605Val Leu Ser Ser Glu Asn Arg Gln Lys Gly Asn Arg
Leu Pro Leu Gln 610 615 620Tyr Leu Gln
Gly Lys Arg Arg Glu Asp Phe Ile Val Trp Thr Asn Ser625
630 635 640Ser Val Arg Asp Tyr Arg Lys
Arg Gln Asn Leu Leu Arg Glu Lys Phe 645
650 655Ser Gly Asp Glu Ala Glu Gly Phe Arg Gln Arg Asn
Leu Gln Asp Thr 660 665 670Gln
His Met Ala Arg Phe Leu Tyr Asn Tyr Ile Ser Asp His Leu Ala 675
680 685Phe Ala Gln Ser Glu Ala Leu Gly Lys
Lys Arg Val Phe Ala Val Ser 690 695
700Gly Ala Val Thr Ser His Leu Arg Lys Arg Trp Gly Leu Ser Lys Val705
710 715 720Arg Ala Asp Gly
Asp Leu His His Ala Leu Asp Ala Ala Val Ile Ala 725
730 735Cys Thr Thr Asp Gly Met Ile Arg Arg Ile
Ser Gly Tyr Tyr Gly His 740 745
750Ile Glu Gly Glu Tyr Leu Gln Asp Ala Asp Gly Ala Gly Ser Gln His
755 760 765Ala Arg Thr Lys Glu Arg Phe
Pro Ala Pro Trp Pro Arg Phe Arg Asp 770 775
780Glu Leu Ile Val Arg Leu Ser Glu Gln Pro Gly Glu His Leu Leu
Asp785 790 795 800Ile Asn
Pro Ala Phe Tyr Cys Glu Tyr Gly Thr Glu His Ile Cys Pro
805 810 815Val Phe Val Ser Arg Met Pro
Arg Arg Lys Val Thr Gly Pro Gly His 820 825
830Lys Glu Thr Ile Lys Gly Ala Ala Ala Ala Asp Glu Gly Leu
Leu Thr 835 840 845Val Arg Lys Ala
Leu Thr Glu Leu Lys Leu Asp Lys Asp Gly Glu Ile 850
855 860Lys Asp Tyr Tyr Met Pro Ser Ser Asp Thr Leu Leu
Tyr Glu Ala Leu865 870 875
880Lys Ala Gln Leu Arg Arg Phe Gly Gly Asp Gly Lys Lys Ala Phe Ala
885 890 895Glu Pro Phe Tyr Lys
Pro Lys Ala Asp Gly Thr Pro Gly Pro Leu Val 900
905 910Arg Lys Val Lys Thr Ile Glu Lys Ala Thr Leu Thr
Val Pro Val His 915 920 925Gly Gly
Ala Ala Ser Asn Asp Thr Met Val Arg Val Asp Val Phe Leu 930
935 940Val Pro Gly Asp Gly Tyr Tyr Trp Val Pro Val
Tyr Val Ala Asp Thr945 950 955
960Leu Lys Pro Glu Leu Pro Asn Arg Ala Val Val Ala Phe Lys Pro Tyr
965 970 975Ser Glu Trp Lys
Glu Met Arg Glu Glu Asp Phe Ile Phe Ser Leu Tyr 980
985 990Pro Asn Asp Leu Val Tyr Val Glu His Lys Ser
Gly Leu Lys Phe Thr 995 1000
1005Leu Gln Asn Ala Asp Ser Thr Leu Glu Lys Thr Trp Val Pro Lys
1010 1015 1020Ala Ser Phe Ala Tyr Phe
Val Gly Gly Asp Ile Ser Thr Ala Ala 1025 1030
1035Ile Ser Leu Arg Thr His Asp Asn Ala Tyr Gly Leu Arg Gly
Leu 1040 1045 1050Gly Ile Lys Thr Leu
Lys Val Leu Lys Lys Tyr Gln Val Asp Val 1055 1060
1065Leu Gly Asn Ile Ser Pro Val His Arg Glu Thr Arg Gln
Arg Phe 1070 1075
1080Arg5336DNAArtificial SequenceCR RNA 53gttgtagttc cctaacagtt
cttggtatgg tataat 3654117DNAArtificial
SequenceTRACR RNA 54ttataccata ccaagaactg ttatggttgc tatgataagg
tcttagcacc gtaaagctct 60gacgcctcgc tttcagcggg gcgtcatctt ttttgcccaa
aagacacgga tattttt 117551084PRTArtificial SequenceClostridia 55Met
Ala Tyr Gly Ile Gly Leu Asp Ile Gly Ile Ala Ser Val Gly Phe1
5 10 15Ala Thr Val Ala Leu Asn Glu
Gln Asp Glu Pro Cys Gly Ile Leu Arg 20 25
30Met Gly Ser Arg Ile Phe Asp Ala Ala Glu His Pro Lys Asn
Gly Ala 35 40 45Ser Leu Ala Ala
Pro Arg Arg Glu Ala Arg Ser Ala Arg Arg Arg Leu 50 55
60Arg Arg His Arg His Arg Leu Glu Arg Ile Arg Asn Leu
Leu Val Glu65 70 75
80Ser Cys Leu Ile Ser Gln Asp Gly Leu Gly Ser Leu Phe Glu Gly Arg
85 90 95Leu Glu Asp Ile Tyr Ala
Leu Arg Thr Arg Ala Leu Asp Glu Arg Leu 100
105 110Thr Asp Ala Glu Leu Cys Arg Val Leu Ile His Leu
Ala Gln Arg Arg 115 120 125Gly Phe
Arg Ser Asn Arg Lys Ala Asp Ala Ala Asp Lys Glu Ala Gly 130
135 140Lys Leu Leu Lys Ala Val Ser Glu Asn Asp Arg
Arg Met Glu Glu Asn145 150 155
160Gly Tyr Arg Thr Val Gly Glu Met Leu Tyr Lys Asp Pro Leu Phe Ala
165 170 175Glu His Arg Arg
Asn Lys Gly Glu Ala Tyr Leu Ser Thr Val Thr Arg 180
185 190Thr Ala Val Glu Gln Glu Ala Arg Leu Val Leu
Ser Thr Gln Arg Glu 195 200 205Lys
Gly Asn Ala Ala Ile Thr Glu Asp Phe Val Glu Lys Tyr Leu Asp 210
215 220Ile Leu Leu Ser Gln Arg Pro Phe Asp Val
Gly Pro Gly Gly Asn Ser225 230 235
240Pro Tyr Gly Gly Asn Met Ile Glu Lys Met Ile Gly Arg Cys Thr
Phe 245 250 255Glu Pro Asp
Glu Leu Arg Ala Pro Lys Ala Ser Tyr Ser Phe Glu Tyr 260
265 270Phe Gln Leu Leu Gln Lys Val Asn His Ile
Arg Leu Leu Arg Asp Gly 275 280
285Arg Ser Glu Pro Leu Ser Glu Glu Gln Arg Arg Ala Ile Ile Asp Leu 290
295 300Ala Leu Ala Ser Ala Asp Val Thr
Phe Ala Lys Ile Arg Lys Ala Leu305 310
315 320Ser Leu Pro Asp Ser Val Arg Phe Asn Asp Val Tyr
Tyr Arg Glu Ser 325 330
335Ala Glu Glu Ala Glu Lys Lys Lys Lys Leu Gly Cys Met Asp Ala Tyr
340 345 350His Glu Met Arg Lys Ala
Leu Asp Lys Val Ala Lys Gly Arg Ile Cys 355 360
365Ala Ile Pro Val Glu Gln Arg Asn Ala Ile Ala Tyr Val Leu
Thr Val 370 375 380His Lys Thr Asp Glu
Arg Ile Leu Thr Glu Leu Gln Asn Ile Asn Leu385 390
395 400Glu Arg Ser Asp Ile Asp Gln Leu Met Gln
Met Lys Gly Phe Ser Lys 405 410
415Phe Gly His Leu Ser Ile Lys Ala Cys Asp Arg Ile Ile Pro Tyr Leu
420 425 430Glu Gln Gly Met Thr
Tyr Ser Asp Ala Cys Thr Ala Ala Gly Tyr Ala 435
440 445Phe Arg Gly His Glu Gly Gly Glu His Ser Leu Tyr
Leu Pro Ala Gln 450 455 460Thr Pro Glu
Met Asp Glu Ile Thr Ser Pro Val Val Arg Arg Ala Val465
470 475 480Ser Gln Thr Ile Lys Val Val
Asn Ala Leu Ile Arg Glu Gln Gly Glu 485
490 495Ser Pro Thr Phe Val Asn Ile Glu Leu Ala Arg Glu
Met Ser Lys Asp 500 505 510Phe
Ala Glu Arg Asn Asp Ile Arg Arg Glu Asn Glu Lys Asn Ala Lys 515
520 525Ala Asn Glu Ala Val Met Asn Glu Leu
Arg Arg Thr Phe Gly Leu Val 530 535
540Asn Pro Ser Gly Gln Asp Leu Val Lys Tyr Lys Leu Phe Leu Glu Gln545
550 555 560Gly Gly Val Cys
Pro Tyr Thr Gln Arg Pro Met Glu Pro Gly Arg Leu 565
570 575Phe Glu Ala Gly Tyr Ala Asp Val Asp His
Ile Val Pro Tyr Ser Ile 580 585
590Ser Phe Asp Asp Arg Tyr Cys Asn Lys Val Leu Thr Phe Ala Ser Val
595 600 605Asn Arg Lys Glu Lys Gly Asn
Arg Leu Pro Leu Gln Phe Leu Lys Gly 610 615
620Glu Arg Arg Glu Ser Phe Ile Val Tyr Val Lys Ala Asn Val Arg
Asp625 630 635 640Tyr Arg
Lys Gln Arg Leu Leu Leu Lys Glu Thr Val Thr Glu Glu Asp
645 650 655Arg Lys Gly Phe Arg Asp Arg
Asn Leu Gln Asp Thr Lys His Met Ala 660 665
670Ala Phe Leu His Ser Tyr Ile Asn Asp His Leu Gln Phe Ala
Pro Phe 675 680 685Gln Thr Asp Arg
Lys Arg His Val Thr Ala Val Asn Gly Ala Val Thr 690
695 700Ala Tyr Leu Arg Lys Arg Trp Gly Ile Arg Lys Val
Arg Ala Glu Gly705 710 715
720Asp Leu His His Ala Ser Asp Ala Leu Val Ile Ala Cys Thr Thr Pro
725 730 735Gly Met Ile Gln Arg
Leu Ser Arg Tyr Ala Glu Leu Arg Glu Ala Glu 740
745 750Tyr Met Gln Thr Glu Asp Gly Ala Val Arg Phe Asp
Pro Ala Thr Gly 755 760 765Glu Val
Leu Glu Lys Phe Pro Tyr Pro Trp Pro Cys Phe Arg Gln Glu 770
775 780Trp Thr Ala Arg Val Ser Asp Asp Pro Gln Ala
Met Leu Gln Asp Met785 790 795
800Lys Leu Thr Asp Tyr Arg Gly Leu Pro Leu Glu Gln Val Lys Pro Val
805 810 815Phe Val Ser Arg
Met Pro Lys His Lys Val Thr Gly Ala Ala His Lys 820
825 830Asp Thr Val Lys Ser Ala Lys Ala Leu Asp Arg
Gly Val Val Leu Val 835 840 845Lys
Arg Ala Leu Thr Asp Leu Lys Leu Lys Asp Gly Glu Ile Glu Asn 850
855 860Tyr Tyr Asp Pro Ala Ser Asp Arg Leu Leu
Tyr Glu Ala Leu Lys Glu865 870 875
880Arg Leu Ile Ala Phe Gly Gly Asp Ala Gln Lys Ala Phe Ala Glu
Pro 885 890 895Phe His Lys
Pro Lys Arg Asp Gly Thr Pro Gly Pro Leu Val Lys Lys 900
905 910Val Lys Leu Met Glu Lys Ser Ser Leu Thr
Val Pro Val His Asp Gly 915 920
925Lys Gly Val Ala Asp Asn Asp Ser Met Val Arg Ile Asp Val Phe Phe 930
935 940Val Ala Gly Glu Gly Tyr Tyr Phe
Val Pro Ile Tyr Val Ala Asp Thr945 950
955 960Val Lys Pro Glu Leu Pro Asn Arg Ala Val Val Ala
Asn Lys Pro Tyr 965 970
975Ala Glu Trp Lys Glu Met Lys Asp Glu Asp Phe Leu Phe Ser Leu Tyr
980 985 990Pro Ser Asp Leu Met Arg
Val Thr Gln Lys Lys Gly Ile Lys Leu Ser 995 1000
1005Leu Ile Asn Lys Glu Ser Thr Leu Lys Lys Glu Glu
Met Ala Gln 1010 1015 1020Ser Ile Leu
Leu Tyr Tyr Val Lys Gly Ser Ile Ser Thr Gly Ser 1025
1030 1035Ile Thr Ala Glu Asn His Asp Arg Thr Tyr Ala
Ile Asn Ser Leu 1040 1045 1050Gly Ile
Lys Thr Leu Glu Lys Leu Glu Lys Tyr Gln Val Asp Val 1055
1060 1065Leu Gly Asn Val Ser Pro Val Gly Lys Glu
Lys Arg Leu Thr Phe 1070 1075
1080Cys5636DNAArtificial SequenceCR RNA 56gttgtagttc cctaacggtt
cttggtatgg tataat 3657102DNAArtificial
SequenceTRACR RNA 57ttataccata ccaagaactg ttgggttact acaataaggt
agtaaaccga aaagctctga 60cgtcttgttt gcgcaggacg tcatctttat atcagacgga
tg 102581058PRTArtificial SequenceChloroflexi 58Met
Leu Pro Tyr Ala Ile Gly Leu Asp Ile Gly Ile Ala Ser Val Gly1
5 10 15Trp Ala Val Val Gly Leu Asp
Thr Asn Glu Arg Pro Phe Cys Ile Leu 20 25
30Gly Met Gly Ser Arg Ile Phe Asp Lys Ala Glu Gln Pro Lys
Thr Gly 35 40 45Ala Ser Leu Ala
Leu Pro Arg Arg Glu Ala Arg Ser Leu Arg Arg Arg 50 55
60Leu Arg Arg His Arg His Arg Asn Glu Arg Ile Arg Asn
Leu Leu Leu65 70 75
80Arg Glu Lys Ile Ile Ser Glu Ser Glu Leu Gln Asp Leu Phe Ser Gly
85 90 95Thr Leu Ser Asp Ile Tyr
Gln Leu Arg Val Glu Ala Leu Asp Arg Lys 100
105 110Leu Asp Asp Lys Glu Phe Ser Arg Val Leu Ile His
Ile Ala Gln Arg 115 120 125Arg Gly
Phe Lys Ser Asn Arg Lys Asn Ala Ala Ala Ser Gln Glu Asp 130
135 140Gly Lys Leu Leu Ser Ala Val Thr Glu Asn Gln
Gln Arg Met Asn Asp145 150 155
160Lys Gly Tyr Arg Thr Val Ser Glu Met Leu Leu Arg Asp Asp Lys Phe
165 170 175Lys Asp His Lys
Arg Asn Lys Gly Gly Glu Tyr Leu Thr Thr Val Thr 180
185 190Arg Thr Met Val Glu Asp Glu Val His Lys Ile
Phe Ser Ala Gln Arg 195 200 205Thr
His Gly Asn Leu Lys Ala Asp Asn Gln Leu Glu Ser Glu Tyr Leu 210
215 220Glu Ile Leu Leu Ser Gln Arg Ser Phe Asp
Glu Gly Pro Gly Gly Asp225 230 235
240Ser Pro Tyr Gly Gly Ser Gln Ile Glu Lys Met Ile Gly Lys Cys
Thr 245 250 255Phe Phe Pro
Glu Glu Lys Arg Ala Ala Lys Ala Thr Tyr Thr Phe Glu 260
265 270Tyr Phe Asn Leu Leu Glu Lys Ile Asn His
Ile Arg Leu Val Ser Lys 275 280
285Asp Asn Leu Pro Glu Pro Leu Ser Asp Phe Gln Arg Arg Ser Leu Ile 290
295 300Glu Leu Ala Tyr Lys Val Glu Asn
Leu Thr Tyr Asp Arg Ile Arg Lys305 310
315 320Glu Leu His Ile Ser Pro Glu Leu Lys Phe Asn Thr
Ile Arg Tyr Glu 325 330
335Ser Asp Asp Leu Pro Glu Asn Glu Lys Lys Gln Lys Leu Asn Cys Leu
340 345 350Lys Ala Tyr His Glu Ile
Arg Lys Ala Leu Asp Lys Leu Gly Lys Gly 355 360
365Thr Ile Asn Thr Leu Ser Lys Glu Gln Leu Asn Thr Ile Gly
Thr Val 370 375 380Leu Ser Met Tyr Lys
Thr Ser Glu Ile Ile Lys Asn Lys Met Glu Gln385 390
395 400Ile Pro Ala Glu Ile Val Asp Lys Leu Asp
Glu Glu Gly Ile Asn Phe 405 410
415Ser Lys Phe Gly His Leu Ser Ile Lys Ala Cys Glu Leu Ile Ile Pro
420 425 430Gly Leu Glu Lys Gly
Leu Asn Tyr Asn Asp Ala Cys Glu Glu Ala Gly 435
440 445Leu Asn Phe Lys Ala His Asn Asn Glu Glu Lys Ser
Phe Leu Leu His 450 455 460Pro Thr Glu
Asp Asp Tyr Ala Asp Ile Thr Ser Pro Val Val Lys Arg465
470 475 480Ala Ala Ser Gln Thr Ile Lys
Val Ile Asn Ala Ile Ile Arg Lys Gln 485
490 495Gly Cys Ser Pro Thr Tyr Ile Asn Ile Glu Val Ala
Arg Glu Leu Ser 500 505 510Lys
Asp Phe Tyr Glu Arg Asp Lys Ile Asn Lys Arg Asn Glu Ala Asn 515
520 525Arg Ala Glu Asn Glu Arg Ser Leu Glu
Gln Ile Arg Lys Glu Tyr Gly 530 535
540Lys Ser Asn Ala Ser Gly Leu Asp Leu Val Lys Phe Lys Leu Tyr Gln545
550 555 560Lys Gln Asp Gly
Val Cys Ala Tyr Ser Gln Lys Gln Ile Ser Phe Glu 565
570 575Arg Leu Phe Glu Pro Asn Tyr Val Glu Val
Asp His Ile Ile Pro Tyr 580 585
590Ser Lys Cys Phe Asp Asp Arg Glu Ser Asn Lys Val Leu Val Phe Ala
595 600 605Lys Glu Asn Arg Glu Lys Gly
Asn Arg Leu Pro Leu Glu Tyr Leu Asp 610 615
620Gly Lys Lys Arg Glu Ser Phe Ile Val Trp Val Asn Ser Lys Val
Lys625 630 635 640Asp Tyr
Arg Lys Lys Gln Asn Leu Leu Lys Glu Ser Leu Ser Glu Glu
645 650 655Glu Glu Lys Gln Phe Lys Glu
Arg Asn Leu Gln Asp Thr Lys Thr Val 660 665
670Ser Lys Phe Leu Met Asn Tyr Ile Asn Asp Asn Leu Ile Phe
Ser Ser 675 680 685Ser Asn Lys Arg
Lys Lys His Val Thr Ala Val Ser Gly Gly Val Thr 690
695 700Ser Tyr Met Arg Lys Arg Trp Gly Ile Ser Lys Val
Arg Glu Asp Gly705 710 715
720Asp Gln His His Ala Val Asp Ala Leu Val Ile Val Cys Thr Thr Asp
725 730 735Gly Met Ile Gln Gln
Val Ser Lys Tyr Val Glu Tyr Lys Glu Cys Gln 740
745 750Tyr Ile Gln Thr Asp Ala Gly Ser Leu Ala Val Asp
Pro Tyr Thr Gly 755 760 765Glu Val
Leu Arg Ser Phe Pro Tyr Pro Trp Ala Arg Phe His Glu Asp 770
775 780Ala Val Thr Trp Thr Glu Lys Ile Phe Val Ser
Arg Met Pro Met Arg785 790 795
800Lys Val Thr Gly Pro Ala His Lys Glu Thr Ile Lys Ser Pro Lys Ala
805 810 815Leu Gly Glu Gly
Leu Leu Ile Val Arg Lys Pro Leu Thr Glu Leu Lys 820
825 830Leu Lys Asn Gly Glu Ile Glu Asn Tyr Tyr Lys
Pro Glu Ala Asp Leu 835 840 845Leu
Leu Tyr Asn Gly Leu Lys Glu Arg Leu Met Glu Phe Gly Gly Asp 850
855 860Ala Lys Lys Ala Phe Ala Glu Pro Phe Pro
Lys Pro Gly Asn Pro Gln865 870 875
880Lys Ile Val Lys Lys Val Arg Leu Thr Glu Lys Ser Thr Leu Asn
Val 885 890 895Pro Val Leu
Lys Gly Glu Gly Arg Ala Asp Asn Asp Ser Met Val Arg 900
905 910Val Asp Val Phe Leu Lys Asp Gly Lys Tyr
Tyr Leu Val Pro Ile Tyr 915 920
925Val Ala Asp Thr Leu Lys Pro Glu Leu Pro Asn Lys Ala Cys Ile Ala 930
935 940His Lys Pro Tyr Asp Glu Trp Ala
Thr Met Asp Asp Gly Asp Phe Leu945 950
955 960Phe Ser Leu Tyr Pro Asn Asp Leu Ile Tyr Ile Lys
His Lys Lys Gly 965 970
975Ile Lys Leu Thr Lys Ile Asn Lys Asn Ser Thr Leu Ala Asp Ser Ile
980 985 990Glu Gly Lys Glu Phe Phe
Leu Phe Tyr Lys Thr Met Gly Ile Ser Ser 995 1000
1005Ala Val Leu Thr Cys Thr Asn His Asp Asn Thr Tyr
Tyr Ile Glu 1010 1015 1020Ser Leu Gly
Val Lys Thr Leu Glu Ser Leu Glu Lys Cys Val Val 1025
1030 1035Gly Val Leu Gly Glu Ile His Lys Val Arg Lys
Glu Lys Arg Thr 1040 1045 1050Gly Phe
Ser Gly Asn 10555936DNAArtificial SequenceCR RNA 59gttgtagtcc
cctgatggtt tctggaatgg tataat
3660120DNAArtificial SequenceTRACR RNA 60ttataccatt ccagaaacta ttatggtcac
tacaataagg tattagaccg tagagcacta 60acaccccatt tggggtgtta tctctttaaa
ctgtccaaaa tttagtattg caattattga 120611087PRTArtificial
SequenceRuminococcaceae bacterium 61Met Leu Pro Tyr Ala Ile Gly Leu Asp
Ile Gly Ile Ser Ser Val Gly1 5 10
15Trp Ala Ser Val Ala Leu Asp Glu Glu Asp Lys Pro Cys Gly Ile
Ile 20 25 30Gly Met Gly Ser
Arg Ile Phe Asp Ala Ala Glu Gln Pro Lys Thr Gly 35
40 45Asp Ser Leu Ala Ala Pro Arg Arg Ala Ala Arg Ser
Ala Arg Arg Arg 50 55 60Leu Arg Arg
Arg Arg His Arg Asn Glu Arg Ile Arg Ala Leu Met Leu65 70
75 80Arg Glu Gly Leu Leu Ser Glu Ala
Glu Leu Ala Ala Leu Phe Asp Gly 85 90
95Arg Leu Glu Asp Ile Cys Ala Leu Arg Val Arg Ala Leu Asp
Glu Ala 100 105 110Val Thr Asn
Asp Glu Leu Ala Arg Ile Leu Leu His Leu Ser Gln Arg 115
120 125Arg Gly Phe Arg Ser Asn Arg Lys Thr Ala Ala
Thr Gln Glu Asp Gly 130 135 140Glu Leu
Leu Ala Ala Val Ser Ala Asn Arg Ala Leu Met Gln Glu Arg145
150 155 160Gly Tyr Arg Thr Val Ala Glu
Met Leu Leu Arg Asp Glu Arg Tyr Arg 165
170 175Asp His Arg Arg Asn Lys Gly Gly Ala Tyr Ile Ala
Thr Val Gly Arg 180 185 190Asp
Met Val Glu Asp Glu Val Arg Gln Ile Phe Ala Ala Gln Arg Ala 195
200 205Leu Gly Ser Thr Ala Ala Ser Glu Thr
Leu Glu Thr Ala Tyr Leu Glu 210 215
220Ile Leu Leu Ser Gln Arg Ser Phe Asp Ala Gly Pro Gly Glu Pro Ser225
230 235 240Pro Tyr Ala Gly
Gly Gln Ile Glu Arg Met Ile Gly Arg Cys Thr Phe 245
250 255Glu Pro Asp Glu Pro Arg Ala Ala Arg Ala
Thr Tyr Ser Phe Glu Tyr 260 265
270Phe Ser Leu Leu Glu Ala Val Asn His Ile Arg Leu Thr Glu Ala Gly
275 280 285Glu Ser Val Pro Leu Thr Lys
Glu Gln Arg Glu Lys Leu Ile Ala Leu 290 295
300Ala His Arg Thr Ala Asp Leu Ser Tyr Ala Lys Ile Arg Lys Glu
Leu305 310 315 320Gly Val
Pro Glu Ser Gln Arg Phe Asn Met Val Thr Tyr Gly Lys Thr
325 330 335Asp Ser Ala Asp Glu Ala Glu
Lys Lys Thr Lys Leu Lys Gln Leu Arg 340 345
350Ala Tyr His Gln Met Arg Ala Ala Phe Glu Lys Ala Ala Lys
Gly Ser 355 360 365Phe Val Leu Leu
Thr Lys Glu Gln Arg Asn Ala Val Gly Gln Thr Leu 370
375 380Ser Ile Tyr Lys Thr Ser Asp Asn Ile Arg Pro Arg
Leu Arg Glu Ala385 390 395
400Gly Leu Thr Glu Ala Glu Ile Asp Val Ala Glu Gly Leu Ser Phe Ser
405 410 415Lys Phe Gly His Leu
Ser Val Lys Ala Cys Asp Lys Ile Ile Pro Phe 420
425 430Leu Glu Gln Gly Met Lys Tyr Ser Glu Ala Cys Val
Ala Ala Gly Tyr 435 440 445Ala Phe
Arg Gly His Glu Gly Gln Asp Lys Gln Arg Leu Leu Pro Pro 450
455 460Leu Asp Asn Asp Ala Lys Asp Thr Ile Thr Ser
Pro Val Val Leu Arg465 470 475
480Ala Val Ser Gln Thr Ile Lys Val Val Asn Ala Ile Ile Arg Glu Arg
485 490 495Gly Gly Ser Pro
Thr Phe Ile Asn Ile Glu Leu Ala Arg Glu Met Ala 500
505 510Lys Asp Phe Ser Glu Arg Ser Gln Ile Lys Arg
Glu Gln Asp Ser Asn 515 520 525Arg
Ala Arg Asn Glu Arg Met Met Glu Arg Ile Lys Thr Glu Tyr Gly 530
535 540Lys Ser Ser Pro Thr Gly Leu Asp Leu Val
Lys Leu Lys Leu Tyr Glu545 550 555
560Glu Gln Ala Gly Val Cys Ala Tyr Ser Leu Lys Gln Met Ser Leu
Glu 565 570 575His Leu Phe
Asp Pro Asn Tyr Ala Glu Ile Asp His Ile Ile Pro Tyr 580
585 590Ser Ile Ser Phe Asp Asp Gly Tyr Lys Asn
Lys Val Leu Val Leu Ala 595 600
605Lys Glu Asn Arg Asp Lys Gly Asn Arg Leu Pro Leu Glu Tyr Leu Asn 610
615 620Gly Lys Arg Arg Glu Asp Phe Ile
Val Trp Val Asn Ser Ser Val Arg625 630
635 640Asp Trp Arg Lys Lys Gln Asn Leu Leu Lys Glu His
Val Thr Pro Glu 645 650
655Asp Glu Ala Lys Phe Lys Glu Arg Asn Leu Gln Asp Thr Lys Thr Ala
660 665 670Ser Arg Phe Leu Leu Asn
Tyr Ile Ala Asp Asn Leu Ala Phe Ala Pro 675 680
685Phe Gln Thr Glu Arg Lys Lys Arg Val Thr Ala Val Asn Gly
Ser Val 690 695 700Thr Ala Tyr Leu Arg
Lys Arg Trp Gly Ile Ala Lys Val Arg Ala Asn705 710
715 720Gly Asp Leu His His Ala Val Asp Ala Leu
Val Ile Ala Cys Thr Thr 725 730
735Asp Gly Leu Ile Gln Lys Val Ser Arg Tyr Ala Cys Tyr Gln Glu Asn
740 745 750Arg Tyr Ser Glu Ala
Gly Gly Val Ile Val Asp Ser Ala Thr Gly Glu 755
760 765Val Val Ala Gln Phe Pro Glu Pro Trp Pro Arg Phe
Arg His Glu Leu 770 775 780Glu Ala Arg
Leu Ser Asp Asp Pro Ala Arg Ala Val Leu Gly Leu Gly785
790 795 800Leu Ala His Tyr Met Thr Gly
Glu Ile Arg Pro Arg Pro Leu Phe Val 805
810 815Ser Arg Met Pro Arg Arg Lys Val Thr Gly Ala Ala
His Lys Glu Thr 820 825 830Val
Lys Ser Pro Arg Ala Leu Asp Glu Gly Gln Leu Val Thr Lys Thr 835
840 845Pro Leu Ser Ala Leu Lys Leu Gly Lys
Asp Gly Glu Ile Pro Gly Tyr 850 855
860Tyr Lys Pro Glu Ser Asp Arg Leu Leu Tyr Glu Ala Leu Lys Ala Arg865
870 875 880Leu Arg Gln Phe
Gly Gly Asp Gly Lys Lys Ala Phe Ala Glu Pro Phe 885
890 895His Lys Pro Lys His Asp Gly Thr Pro Gly
Pro Val Val Thr Lys Val 900 905
910Lys Leu Cys Glu Pro Ala Thr Leu Ser Val Pro Val His Gly Gly Leu
915 920 925Gly Ala Ala Asn Asn Asp Ser
Met Val Arg Ile Asp Val Phe His Val 930 935
940Glu Gly Asp Gly Tyr Tyr Phe Val Pro Ile Tyr Ile Ala Asp Thr
Leu945 950 955 960Lys Leu
Glu Leu Pro Asn Lys Ala Cys Val Lys Ile Lys Lys Ile Ser
965 970 975Glu Trp Lys His Met Lys Pro
Gln Asp Phe Met Phe Ser Leu Tyr Pro 980 985
990Asn Asp Leu Phe Arg Ile Val Ser Lys Lys Gly Ile Thr Leu
Asn Leu 995 1000 1005Val Ser Lys
Glu Ser Thr Leu Pro Thr Ser Val Asn Val Ser Asp 1010
1015 1020Thr Leu Leu Tyr Phe Val Ser Ala Gly Ile Ala
Ser Ala Cys Leu 1025 1030 1035Thr Cys
Arg Asn His Asp Asn Thr Tyr Gln Ile Glu Ser Leu Gly 1040
1045 1050Ile Lys Thr Leu Glu Lys Leu Glu Lys Tyr
Thr Val Asp Val Leu 1055 1060 1065Gly
Asn Val His Arg Val Glu Lys Glu Pro Arg Met Ser Phe Ser 1070
1075 1080Gln Lys Gly Asp
10856236DNAArtificial SequenceCR RNA 62gttatagttc cctgttcgtt cttggtatgg
tataat 366389DNAArtificial SequenceTRACR
RNA 63ttataccata ccaagaacga agcaggttac tatgataagg tagtataccg cagagctcca
60acgcctcgct tttgcggggc gttgtctct
89641097PRTClostridium absonum 64Met Leu Pro Tyr Gly Ile Gly Leu Asp Ile
Gly Ile Thr Ser Val Gly1 5 10
15Trp Ala Thr Val Ala Leu Asp Glu Asn Asp Arg Pro Tyr Gly Ile Ile
20 25 30Gly Met Gly Ser Arg Ile
Phe Asp Ala Ala Glu Gln Pro Lys Thr Gly 35 40
45Glu Ser Leu Ala Ala Pro Arg Arg Ala Ala Arg Ser Ala Arg
Arg Arg 50 55 60Leu Arg Arg His Arg
His Arg Asn Glu Arg Ile Arg Ala Leu Ile Leu65 70
75 80Arg Glu Asn Leu Leu Ser Glu Gly Gln Leu
Leu His Leu Tyr Asp Gly 85 90
95Gln Leu Ser Asp Val Tyr Ser Leu Arg Val Lys Ala Leu Asp Glu Arg
100 105 110Val Ser Asn Glu Glu
Phe Ala Arg Ile Leu Ile His Ile Ser Gln Arg 115
120 125Arg Gly Phe Lys Ser Asn Arg Lys Gly Ala Ser Ser
Lys Glu Asp Ser 130 135 140Glu Leu Leu
Ala Ala Ile Ser Ala Asn Gln Val Arg Met Gln Gln Gln145
150 155 160Gly Tyr Arg Thr Val Ala Glu
Met Tyr Leu Lys Asp Pro Ile Tyr Gln 165
170 175Glu His Arg Arg Asn Lys Gly Gly Asn Tyr Ile Ala
Thr Val Ser Arg 180 185 190Ala
Met Val Glu Asp Glu Val His Gln Ile Phe Thr Gly Gln Arg Ala 195
200 205Cys Gly Asn Pro Ala Ala Thr Lys Glu
Leu Glu Glu Ala Tyr Val Glu 210 215
220Ile Leu Leu Ser Gln Arg Ser Phe Asp Asp Gly Pro Gly Asp Gly Ser225
230 235 240Pro Tyr Ala Gly
Ser Gln Ile Glu Arg Met Ile Gly Lys Cys Gln Leu 245
250 255Glu Lys Glu Ala Gly Glu Pro Arg Ala Ala
Lys Ala Thr Tyr Ser Phe 260 265
270Glu Tyr Phe Ser Leu Leu Ala Ala Ile Asn Asn Ile Ser Ile Ile Ser
275 280 285Asn Gly Gln Leu Ser Pro Leu
Thr Lys Glu Gln Arg Glu Met Leu Ile 290 295
300Ala Leu Ala His Lys Thr Ser Glu Leu Asn Tyr Ala Arg Ile Arg
Lys305 310 315 320Glu Leu
Gly Leu Ser Glu Ala Gln Arg Phe Asn Thr Val Ser Tyr Gly
325 330 335Lys Met Glu Ile Ala Glu Ala
Glu Lys Lys Thr Lys Phe Glu His Leu 340 345
350Lys Ala Tyr His Lys Met Arg Arg Glu Phe Glu Arg Ile Ala
Lys Gly 355 360 365His Phe Ala Ser
Ile Thr Ile Glu Gln Arg Asn Ala Ile Gly Asp Val 370
375 380Leu Ser Lys Tyr Lys Thr Asp Ala Lys Ile Arg Pro
Ala Leu Arg Glu385 390 395
400Ala Gly Leu Thr Glu Leu Asp Ile Asp Ala Ala Glu Ala Leu Asn Phe
405 410 415Ser Lys Phe Gly His
Ile Ser Ile Lys Ala Cys Lys Lys Ile Ile Pro 420
425 430Trp Leu Glu Gln Gly Met Lys Tyr Ser Glu Ala Cys
Asn Ala Ala Gly 435 440 445Tyr Asn
Phe Lys Gly His Asp Gly Gln Glu Lys Ser His Leu Leu Pro 450
455 460Pro Leu Asp Glu Glu Ser Arg Asn Val Ile Thr
Ser Pro Val Ala Leu465 470 475
480Arg Ala Ile Ser Gln Thr Ile Lys Val Val Asn Ala Ile Ile Arg Glu
485 490 495Arg Gly Cys Ser
Pro Thr Phe Ile Asn Ile Glu Leu Ala Arg Glu Met 500
505 510Ser Lys Asp Phe Tyr Glu Arg Ile Glu Ile Lys
Lys Glu Gln Asp Gly 515 520 525Asn
Arg Ala Lys Asn Glu Arg Met Met Glu Arg Ile Arg Thr Glu Tyr 530
535 540Gly Lys Ala Ser Pro Thr Gly Gln Asp Leu
Val Lys Phe Lys Leu Tyr545 550 555
560Glu Glu Gln Gly Gly Val Cys Ala Tyr Ser Leu Lys Gln Met Ser
Leu 565 570 575Ala His Leu
Phe Glu Pro Asp Tyr Ala Glu Val Asp His Ile Val Pro 580
585 590Tyr Ser Ile Ser Phe Asp Asp Gly Tyr Lys
Asn Lys Val Leu Val Leu 595 600
605Ala Lys Glu Asn Arg Asp Lys Gly Asn Arg Leu Pro Leu Gln Tyr Leu 610
615 620Gln Gly Lys Arg Arg Glu Asp Phe
Ile Ala Trp Val Asn Ser Cys Val625 630
635 640Arg Asp Tyr Lys Lys Arg Gln Arg Leu Leu Lys Glu
Ser Ile Ser Glu 645 650
655Asp Asp Leu Arg Ala Phe Lys Glu Arg Asn Leu Gln Asp Thr Lys Thr
660 665 670Ala Ser Arg Phe Leu Leu
Asn Tyr Ile Ser Asp His Leu Glu Phe Thr 675 680
685Gln Phe Ala Thr Glu Arg Lys Lys His Val Thr Ala Val Asn
Gly Ser 690 695 700Val Thr Ala Tyr Leu
Arg Lys Arg Trp Gly Ile Thr Lys Ile Arg Glu705 710
715 720Asn Gly Asp Leu His His Ala Val Asp Ala
Leu Val Ile Ala Cys Thr 725 730
735Thr Asp Gly Met Ile Gln Gln Val Ser Arg Phe Ala Gln His Arg Glu
740 745 750Asn Gln Tyr Ser Leu
Ala Glu Asp Ser Arg Phe Ile Ile Asp Pro Glu 755
760 765Thr Gly Glu Val Ile Lys Glu Phe Pro Tyr Pro Trp
Pro Arg Phe Arg 770 775 780Gln Glu Leu
Glu Ala Arg Leu Ser Ser Asn Pro Gly Leu Ala Val Arg785
790 795 800Asp Arg Gly Phe Leu Leu Tyr
Met Ala Glu Ser Ile Pro Val His Pro 805
810 815Leu Phe Val Ser Arg Met Pro Arg Arg Lys Val Thr
Gly Ala Ala His 820 825 830Lys
Glu Thr Ile Lys Ser Gly Lys Ala Gln Lys Asp Gly Leu Leu Ile 835
840 845Val Lys Lys Pro Leu Thr Asp Leu Lys
Leu Asp Lys Glu Gly Glu Ile 850 855
860Ala Asn Tyr Tyr Asn Pro Met Ser Asp Arg Leu Leu Tyr Glu Ala Leu865
870 875 880Lys Lys Arg Leu
Thr Ala Phe Asn Gly Asp Gly Lys Lys Ala Phe Ala 885
890 895Asp Pro Phe Tyr Lys Pro Lys Ser Asp Gly
Thr Gln Gly Pro Leu Val 900 905
910Asn Lys Val Lys Leu Cys Glu Pro Ser Thr Leu Asn Val Ser Val Ile
915 920 925Gly Gly Lys Gly Val Ala Glu
Asn Asp Ser Met Val Arg Ile Asp Val 930 935
940Phe Arg Val Glu Gly Asp Gly Tyr Tyr Phe Val Pro Val Tyr Val
Ala945 950 955 960Asp Thr
Val Lys Pro Glu Leu Pro Asn Lys Ala Cys Val Ala Asn Lys
965 970 975Pro Tyr Thr Asp Trp Lys Glu
Met Arg Glu Ser Asp Phe Leu Phe Ser 980 985
990Leu Tyr Pro Asn Asp Leu Leu Lys Val Thr His Lys Lys Ala
Leu Ile 995 1000 1005Leu Thr Lys
Ala Gln Lys Asp Ser Asp Leu Pro Asp Cys Lys Glu 1010
1015 1020Thr Lys Ser Glu Met Leu Tyr Phe Val Ser Ala
Ser Ile Ser Thr 1025 1030 1035Ala Ser
Leu Ala Cys Arg Thr His Asp Asn Ser Tyr Arg Ile Asn 1040
1045 1050Ser Leu Gly Ile Lys Thr Leu Glu Ala Leu
Glu Lys Tyr Thr Val 1055 1060 1065Asp
Val Leu Gly Glu Tyr His Pro Val Arg Arg Glu Thr Arg Gln 1070
1075 1080Thr Phe Thr Gly Arg Glu Ser Ser Gly
His Ser Gly Ile Ser 1085 1090
1095
SEQ ID NO: 65 (length 36, DNA, Artificial Sequence, CR RNA)
gttatagttc cctgatagtt cttggtatgg tataat 36
SEQ ID NO: 66 (length 88, DNA, Artificial Sequence, TRACR RNA)
ttataccata ccaagaacta tgaggttgct ataataaggt agtaaaccgc agagctctaa 60
cgcctcacat ttgtggggcg ttatctct 88
SEQ ID NO: 67 (length 1088, PRT, Ruminococcus albus)
Met Arg Pro Tyr Gly Ile Gly Leu Asp Ile
Gly Ile Ser Ser Val Gly1 5 10
15Trp Ala Ala Ile Ala Leu Asp His Gln Asp Ser Pro Cys Gly Ile Leu
20 25 30Asp Met Gly Ala Arg Ile
Phe Asp Ala Ala Glu Asn Pro Lys Asp Gly 35 40
45Ala Ser Leu Ala Ala Pro Arg Arg Glu Lys Arg Ser Gln Arg
Arg Arg 50 55 60Leu Arg Arg His Arg
His Arg Asn Glu Arg Ile Arg Arg Met Leu Leu65 70
75 80Lys Glu Gly Leu Leu Thr Glu Ala Glu Leu
Thr Gly Leu Phe Asp Gly 85 90
95Ala Leu Glu Asp Ile Tyr Ala Leu Arg Thr Arg Ala Leu Asp Glu Ala
100 105 110Leu Thr Lys Gln Glu
Phe Ala Arg Val Leu Leu His Leu Ser Gln Arg 115
120 125Arg Gly Phe Arg Ser Asn Arg Arg Ala Thr Ala Ala
Gln Glu Asp Gly 130 135 140Lys Leu Leu
Asp Ala Val Ser Glu Asn Ala Lys Arg Met Ala Asp Cys145
150 155 160Gly Tyr Arg Thr Val Gly Glu
Met Leu Cys Arg Asp Ala Thr Phe Ala 165
170 175Lys His Lys Arg Asn Lys Gly Gly Glu Tyr Leu Thr
Thr Val Ser Arg 180 185 190Ala
Met Ile Glu Asp Glu Val Lys Leu Val Phe Ala Ser Gln Arg Arg 195
200 205Leu Gly Ser Ala Phe Ala Ser Glu Ala
Leu Glu Gln Gly Tyr Leu Asp 210 215
220Ile Leu Leu Ser Gln Arg Ser Phe Asp Glu Gly Pro Gly Gly Asn Ser225
230 235 240Pro Tyr Gly Gly
Ala Gln Ile Glu Arg Met Ile Gly Lys Cys Thr Phe 245
250 255Tyr Pro Glu Glu Pro Arg Ala Ala Arg Ala
Cys Tyr Ser Phe Glu Tyr 260 265
270Phe Ser Leu Leu Gln Lys Val Asn His Ile Arg Leu Gln Lys Asp Gly
275 280 285Glu Ser Thr Pro Leu Thr Ser
Glu Gln Arg Leu Gln Leu Ile Glu Leu 290 295
300Ala His Lys Thr Glu Asn Leu Asp Tyr Ala Arg Ile Arg Arg Ala
Leu305 310 315 320Gln Ile
Pro Asp Ala Tyr Arg Phe Asn Thr Val Ser Tyr Arg Ile Glu
325 330 335Ser Asp Pro Ala Ala Ala Glu
Lys Lys Glu Lys Phe Gln Tyr Leu Arg 340 345
350Ala Tyr His Thr Met Arg Lys Ala Ile Asp Gly Ala Ser Lys
Gly Arg 355 360 365Phe Ala Leu Leu
Ser Gln Glu Gln Arg Asp Gln Ile Gly Thr Val Leu 370
375 380Thr Leu Tyr Lys Ser Gln Glu Arg Ile Ser Glu Lys
Leu Thr Glu Ala385 390 395
400Gly Ile Glu Pro Cys Asp Ile Ala Ala Leu Glu Ser Val Ser Gly Phe
405 410 415Ser Lys Thr Gly His
Ile Ser Leu Arg Ala Cys Lys Glu Leu Ile Pro 420
425 430Tyr Leu Glu Gln Gly Met Asn Tyr Asn Glu Ala Cys
Ala Ala Ala Gly 435 440 445Ile Glu
Phe His Gly His Ser Gly Thr Glu Arg Thr Val Val Leu His 450
455 460Pro Thr Pro Asp Asp Leu Ala Asp Ile Thr Ser
Pro Val Val Arg Arg465 470 475
480Ala Val Ala Gln Thr Val Lys Val Ile Asn Ala Val Ile Arg Arg Tyr
485 490 495Gly Ser Pro Val
Phe Val Asn Ile Glu Leu Ala Arg Glu Leu Ala Lys 500
505 510Asp Phe Thr Glu Arg Lys Lys Leu Glu Lys Asp
Asn Lys Thr Asn Arg 515 520 525Ala
Glu Asn Glu Arg Leu Met Arg Arg Ile Arg Glu Glu Tyr Gly Lys 530
535 540Met Asn Pro Thr Gly Leu Asp Leu Val Lys
Leu Arg Leu Tyr Glu Glu545 550 555
560Gln Ala Gly Val Cys Pro Tyr Ser Gln Lys Gln Met Ser Leu Gln
Arg 565 570 575Leu Phe Glu
Pro Asn Tyr Ala Glu Val Asp His Ile Ile Pro Tyr Ser 580
585 590Ile Ser Phe Asp Asp Ser Arg Arg Asn Lys
Val Leu Val Leu Ala Glu 595 600
605Glu Asn Arg Asn Lys Gly Asn Arg Leu Pro Leu Gln Tyr Leu Thr Gly 610
615 620Glu Arg Arg Asp Asn Phe Ile Val
Trp Val Asn Ser Ser Val Arg Asp625 630
635 640Tyr Arg Lys Lys Gln Lys Leu Leu Lys Pro Thr Val
Thr Asp Glu Asp 645 650
655Lys Gln Gln Phe Lys Glu Arg Asn Leu Gln Asp Thr Lys Thr Met Ser
660 665 670Arg Phe Leu Met Asn Tyr
Ile Asn Asp His Leu Gln Phe Gly Val Ser 675 680
685Ala Lys Glu Arg Lys Lys Arg Val Thr Ala Val Asn Gly Ile
Val Thr 690 695 700Ser Tyr Leu Arg Lys
Arg Trp Gly Ile Thr Lys Ile Arg Gly Asp Gly705 710
715 720Asp Leu His His Ala Val Asp Ala Leu Val
Ile Ala Cys Ala Thr Asp 725 730
735Gly Met Ile Arg Gln Ile Thr Arg Tyr Ala Gln Tyr Arg Glu Cys Arg
740 745 750Tyr Met Gln Thr Asp
Thr Gly Ser Ala Ala Ile Asp Glu Ala Thr Gly 755
760 765Glu Val Leu Arg Ile Phe Pro Tyr Pro Trp Glu His
Phe Arg Lys Glu 770 775 780Leu Glu Ala
Arg Leu Ser Ser Asp Pro Ala Arg Ala Val Asn Ala Leu785
790 795 800Arg Leu Pro Phe Tyr Leu Asp
Ser Gly Glu Pro Leu Pro Lys Pro Leu 805
810 815Phe Val Ser Arg Met Pro Arg Arg Lys Val Ser Gly
Ala Ala His Lys 820 825 830Asp
Thr Val Lys Ser Pro Lys Ala Met Ala Glu Gly Lys Val Ile Val 835
840 845Arg Arg Ala Leu Thr Asp Leu Lys Leu
Lys Asn Gly Glu Ile Glu Asn 850 855
860Tyr Phe Asp Pro Gly Ser Asp Arg Leu Leu Tyr Asp Ala Leu Lys Ala865
870 875 880Arg Leu Ala Ala
Phe Gly Gly Asp Gly Ala Lys Ala Phe Arg Glu Pro 885
890 895Phe Tyr Lys Pro Arg His Asp Gly Thr Pro
Gly Pro Leu Val Lys Lys 900 905
910Val Lys Leu Cys Glu Pro Thr Thr Leu Asn Val Ala Val His Gly Gly
915 920 925Lys Gly Val Ala Asp Asn Asp
Ser Met Val Arg Ile Asp Val Phe Arg 930 935
940Val Glu Gly Asp Gly Tyr Tyr Phe Val Pro Ile Tyr Ile Ala Asp
Thr945 950 955 960Leu Lys
Pro Val Leu Pro Asn Lys Ala Cys Val Ala Phe Lys Pro Tyr
965 970 975Ser Glu Trp Arg Thr Met Asp
Asp Arg Asp Phe Ile Phe Ser Leu Tyr 980 985
990Pro Asn Asp Leu Ile Arg Val Thr His Lys Ser Ala Leu Lys
Leu Ser 995 1000 1005Arg Val Ser
Lys Glu Ser Thr Leu Pro Glu Ser Ile Glu Ser Lys 1010
1015 1020Thr Ala Leu Leu Tyr Tyr Val Ser Ala Gly Ile
Ser Gly Ala Ala 1025 1030 1035Val Ser
Cys Arg Asn His Asp Asn Ser Tyr Glu Ile Lys Ser Met 1040
1045 1050Gly Ile Lys Thr Leu Glu Lys Leu Glu Lys
Tyr Thr Val Asp Val 1055 1060 1065Leu
Gly Glu Tyr His Lys Val Glu Lys Glu Arg Arg Met Pro Phe 1070
1075 1080Thr Gly Lys Arg Ser
1085
SEQ ID NO: 68 (length 36, DNA, Artificial Sequence, CR RNA)
gttgtagttc cctgatcgtt cttggtatgg tataat 36
SEQ ID NO: 69 (length 90, DNA, Artificial Sequence, TRACR RNA)
ttataccata ccaagaacga tcaggttgct acaataaggt agtaaaccga agagctctaa 60
cgccccgttt ctttacgggg cgttatctct 90
SEQ ID NO: 70 (length 1095, PRT, Ruminococcus albus)
Met Arg Pro Tyr Ala Ile Gly Leu Asp Ile
Gly Ile Thr Ser Val Gly1 5 10
15Trp Ala Thr Val Ala Leu Asp Ala Asp Glu Ser Pro Cys Gly Ile Ile
20 25 30Gly Leu Gly Ser Arg Ile
Phe Asp Ala Ala Glu Gln Pro Lys Thr Gly 35 40
45Glu Ser Leu Ala Ala Pro Arg Arg Ala Ala Arg Gly Ser Arg
Arg Arg 50 55 60Leu Arg Arg His Arg
His Arg Asn Glu Arg Ile Arg Ser Leu Met Leu65 70
75 80Glu Glu Arg Leu Ile Ser Gln Asp Glu Leu
Glu Thr Leu Phe Asp Gly 85 90
95Arg Leu Glu Asp Ile Tyr Ala Leu Arg Val Lys Ala Leu Asp Glu Ile
100 105 110Val Ser Arg Thr Asp
Phe Ala Arg Ile Leu Leu His Ile Ser Gln Arg 115
120 125Arg Gly Phe Lys Ser Asn Arg Lys Asn Pro Thr Thr
Lys Glu Asp Gly 130 135 140Val Leu Leu
Ala Ala Val Asn Glu Asn Lys Gln Arg Met Ser Glu His145
150 155 160Gly Tyr Arg Thr Val Gly Glu
Met Phe Leu Leu Asp Glu Thr Phe Lys 165
170 175Asp His Lys Arg Asn Lys Gly Gly Asn Tyr Ile Thr
Thr Val Ala Arg 180 185 190Asp
Met Val Ala Asp Glu Val Arg Ala Ile Phe Ser Ala Gln Arg Glu 195
200 205Leu Gly Ala Ser Phe Ala Ser Glu Glu
Phe Glu Glu Arg Tyr Leu Glu 210 215
220Ile Leu Leu Ser Gln Arg Ser Phe Asp Glu Gly Pro Gly Gly Asn Ser225
230 235 240Pro Tyr Gly Gly
Ser Gln Ile Glu Arg Met Val Gly Arg Cys Thr Phe 245
250 255Phe Pro Asp Glu Pro Arg Ala Ala Lys Ala
Thr Tyr Ser Phe Glu Tyr 260 265
270Phe Thr Leu Leu Gln Lys Val Asn His Ile Arg Ile Val Glu Asn Gly
275 280 285Val Ala Ser Lys Leu Thr Asp
Glu Gln Arg Arg Ile Ile Ile Glu Leu 290 295
300Ala His Thr Thr Lys Asp Val Ser Tyr Ala Lys Ile Arg Lys Val
Leu305 310 315 320Lys Leu
Ser Asp Lys Gln Leu Phe Asn Ile Arg Tyr Ser Asp Asn Ser
325 330 335Pro Ala Glu Asp Ser Glu Lys
Lys Glu Lys Leu Gly Ile Met Lys Ala 340 345
350Tyr His Gln Met Arg Ser Ala Ile Asp Arg Val Ser Lys Gly
Arg Phe 355 360 365Ala Met Met Pro
Arg Ala Gln Arg Asn Ala Ile Gly Thr Ala Leu Ser 370
375 380Leu Tyr Lys Thr Ser Asp Lys Ile Arg Lys Tyr Leu
Thr Asp Ala Gly385 390 395
400Leu Asp Glu Ile Asp Ile Asn Ser Ala Asp Ser Ile Gly Ser Phe Ser
405 410 415Lys Phe Gly His Ile
Ser Val Lys Ala Cys Asp Met Leu Ile Pro Phe 420
425 430Leu Glu Gln Gly Met Asn Tyr Asn Glu Ala Cys Ala
Ala Ala Gly Leu 435 440 445Asn Phe
Lys Gly His Asp Ala Gly Glu Lys Ser Lys Leu Leu His Pro 450
455 460Lys Glu Glu Asp Tyr Glu Asp Ile Thr Ser Pro
Val Val Arg Arg Ala465 470 475
480Ile Ala Gln Thr Ile Lys Val Ile Asn Ala Ile Ile Arg Arg Glu Gly
485 490 495Cys Ser Pro Thr
Phe Ile Asn Ile Glu Leu Ala Arg Glu Met Ala Lys 500
505 510Asp Phe Arg Glu Arg Asn Arg Ile Lys Lys Glu
Asn Asp Asp Asn Arg 515 520 525Ala
Lys Asn Glu Arg Leu Leu Glu Arg Ile Arg Thr Glu Tyr Gly Lys 530
535 540Asn Asn Pro Thr Gly Leu Asp Leu Val Lys
Leu Arg Leu Tyr Glu Glu545 550 555
560Gln Ser Gly Val Cys Met Tyr Ser Leu Lys Gln Met Ser Leu Glu
Lys 565 570 575Leu Phe Glu
Pro Asn Tyr Ala Glu Val Asp His Ile Val Pro Tyr Ser 580
585 590Ile Ser Phe Asp Asp Ser Arg Lys Asn Lys
Val Leu Val Leu Thr Glu 595 600
605Glu Asn Arg Asn Lys Gly Asn Arg Leu Pro Leu Gln Tyr Leu Lys Gly 610
615 620Arg Arg Arg Glu Asp Phe Ile Val
Trp Val Asn Asn Asn Val Lys Asp625 630
635 640Tyr Arg Lys Arg Arg Leu Leu Leu Lys Glu Glu Leu
Thr Ala Glu Asp 645 650
655Glu Ser Gly Phe Lys Glu Arg Asn Leu Gln Asp Thr Lys Thr Met Ser
660 665 670Arg Phe Leu Leu Asn Tyr
Ile Ala Asp Asn Leu Glu Phe Ala Glu Ser 675 680
685Thr Arg Gly Arg Lys Lys Lys Val Thr Ala Val Asn Gly Ala
Val Thr 690 695 700Ala Tyr Met Arg Lys
Arg Trp Gly Ile Thr Lys Ile Arg Glu Asp Gly705 710
715 720Asp Cys His His Ala Val Asp Ala Val Val
Ile Ala Cys Thr Thr Asp 725 730
735Ala Met Ile Arg Gln Val Ser Arg Tyr Ala Gln Phe Arg Glu Cys Glu
740 745 750Tyr Met Gln Thr Glu
Ser Gly Ser Val Ala Val Asp Thr Gly Thr Gly 755
760 765Glu Val Leu Arg Thr Phe Pro Tyr Pro Trp Pro Asp
Phe Arg Lys Glu 770 775 780Leu Glu Ala
Arg Leu Ala Asn Asp Pro Ala Lys Val Ile Asn Asp Leu785
790 795 800His Leu Pro Phe Tyr Met Ser
Ala Gly Arg Pro Leu Pro Glu Pro Val 805
810 815Phe Val Ser Arg Met Pro Arg Arg Lys Val Thr Gly
Ala Ala His Lys 820 825 830Asp
Thr Ile Lys Ser Ala Arg Glu Leu Asp Asn Gly Tyr Leu Ile Val 835
840 845Lys Arg Pro Leu Thr Asp Leu Lys Leu
Lys Asn Gly Glu Ile Glu Asn 850 855
860Tyr Tyr Asn Pro Gln Ser Asp Lys Cys Leu Tyr Asp Ala Leu Lys Asn865
870 875 880Ala Leu Ile Glu
His Gly Gly Asp Ala Lys Lys Ala Phe Ala Gly Glu 885
890 895Phe Arg Lys Pro Lys Arg Asp Gly Thr Pro
Gly Pro Ile Val Lys Lys 900 905
910Val Lys Leu Leu Glu Pro Thr Thr Met Cys Val Pro Val His Gly Gly
915 920 925Lys Gly Ala Ala Asp Asn Asp
Ser Met Val Arg Val Asp Val Phe Leu 930 935
940Ser Gly Gly Lys Tyr Tyr Leu Val Pro Ile Tyr Val Ala Asp Thr
Leu945 950 955 960Lys Pro
Glu Leu Pro Asn Lys Ala Val Thr Arg Gly Lys Lys Tyr Ser
965 970 975Glu Trp Leu Glu Met Ala Asp
Glu Asp Phe Ile Phe Ser Leu Tyr Pro 980 985
990Asn Asp Leu Ile Cys Ala Thr Ser Lys Asn Gly Ile Thr Leu
Ser Val 995 1000 1005Cys Arg Lys
Asp Ser Thr Leu Pro Pro Thr Val Glu Ser Lys Ser 1010
1015 1020Phe Met Leu Tyr Tyr Arg Gly Thr Asp Ile Ser
Thr Gly Ser Ile 1025 1030 1035Ser Cys
Ile Thr His Asp Asn Ala Tyr Lys Leu Arg Gly Leu Gly 1040
1045 1050Val Lys Thr Leu Glu Lys Leu Glu Lys Tyr
Thr Val Asp Val Leu 1055 1060 1065Gly
Glu Tyr His Lys Val Gly Lys Glu Val Arg Gln Pro Phe Asn 1070
1075 1080Ile Lys Arg Arg Lys Ala Cys Pro Ser
Glu Met Leu 1085 1090
1095
SEQ ID NO: 71 (length 36, DNA, Artificial Sequence, CR RNA)
gttatagttc cctgatagtt cttggtatgg tataat 36
SEQ ID NO: 72 (length 131, DNA, Artificial Sequence, TRACR RNA)
ttataccata ccaagaacta tttaggttac tatgataagg tttagtacac cttagagctc 60
tgacgcctcg cttttgcgag gcgttatctc tttatattgc caaaaatgca aatatatcgt 120
acaatggtgg c 131
SEQ ID NO: 73 (length 1086, PRT, Clostridium absonum)
Met His Arg Tyr Ala Ile Gly Leu Asp Ile
Gly Ile Thr Ser Val Gly1 5 10
15Trp Ala Ala Ile Ala Leu Asp Ala Glu Glu Asn Pro Cys Gly Met Leu
20 25 30Asp Phe Gly Ser Arg Ile
Phe Thr Gly Ala Glu His Pro Lys Thr Gly 35 40
45Ala Ser Leu Ala Ala Pro Arg Arg Glu Ala Arg Gly Ala Arg
Arg Arg 50 55 60Leu Arg Arg His Arg
His Arg Asn Glu Arg Ile Arg Arg Leu Met Val65 70
75 80Ser Gly Gly Leu Ile Ser Gln Glu Gln Leu
Glu Ser Leu Phe Ala Gly 85 90
95Gln Leu Glu Asp Ile Tyr Ala Leu Arg Thr Arg Ala Leu Asp Glu Gln
100 105 110Val Ala Arg Glu Glu
Leu Ala Arg Ile Met Leu His Leu Ser Gln Arg 115
120 125Arg Gly Phe Arg Ser Asn Arg Lys Gly Gly Ala Asp
Ala Glu Asp Gly 130 135 140Lys Leu Leu
Glu Ala Val Gly Asp Asn Lys Arg Arg Met Asp Glu Lys145
150 155 160Gly Tyr Arg Thr Ala Gly Glu
Met Phe Phe Lys Asp Glu Ala Phe Ala 165
170 175Ala His Lys Arg Asn Lys Gly Gly Asn Tyr Ile Ala
Thr Val Thr Arg 180 185 190Ala
Met Thr Glu Asp Glu Val His Arg Ile Phe Ala Ala Gln Arg Gly 195
200 205Phe Gly Ala Glu Tyr Ala Asn Glu Lys
Leu Glu Ala Ala Tyr Leu Asp 210 215
220Ile Leu Leu Ser Gln Arg Ser Phe Asp Glu Gly Pro Gly Gly Asp Ser225
230 235 240Pro Tyr Gly Gly
Ser Gln Ile Glu Arg Met Ile Gly Thr Cys Ala Phe 245
250 255Glu Pro Asp Gln Pro Arg Ala Ala Lys Ala
Ala Tyr Ser Phe Glu Tyr 260 265
270Phe Ser Leu Leu Glu Lys Leu Asn His Ile Arg Leu Val Ser Gly Gly
275 280 285Lys Ser Glu Pro Leu Thr Asp
Ala Gln Arg Lys Lys Leu Ile Glu Leu 290 295
300Ala His Lys Gln Asp Thr Leu Ser Tyr Ala Lys Ile Arg Lys Glu
Leu305 310 315 320Glu Leu
Asn Glu Ala Val Arg Phe Asn Ser Val Arg Tyr Thr Asp Asp
325 330 335Ala Thr Phe Glu Glu Gln Glu
Lys Lys Glu Lys Ile Val Cys Met Lys 340 345
350Ala Tyr His Ala Met Arg Lys Ala Val Asp Lys Asn Ala Lys
Gly Arg 355 360 365Phe Ala Tyr Leu
Thr Ile Pro Gln Arg Asn Glu Ile Gly Arg Val Leu 370
375 380Ser Thr Tyr Lys Thr Ser Ala Lys Ile Glu Pro Ala
Leu Ala Ala Ala385 390 395
400Gly Ile Glu Pro Cys Asp Ile Ala Ala Leu Glu Gly Leu Ser Phe Ser
405 410 415Lys Phe Gly His Leu
Ser Ile Lys Ala Cys Asp Lys Leu Ile Pro Phe 420
425 430Leu Glu Lys Ala Met Asn Tyr Asn Asp Ala Cys Ala
Ala Ala Gly Tyr 435 440 445Asp Phe
Arg Gly His Ser Arg Asp Gly Arg Gln Met Tyr Leu Pro Pro 450
455 460Leu Gly Gly Asp Cys Thr Glu Ile Thr Ser Pro
Val Val Arg Arg Ala465 470 475
480Val Ser Gln Thr Ile Lys Val Ile Asn Ala Ile Ile Arg Arg Tyr Gly
485 490 495Thr Ser Pro Val
Tyr Val Asn Ile Glu Leu Ala Arg Glu Met Ser Lys 500
505 510Asp Phe Ala Glu Arg Asn Lys Ile Lys Lys Gln
Asn Asp Asp Asn Arg 515 520 525Ser
Lys Asn Glu Lys Ile Lys Glu Gln Val Ala Glu Tyr Lys His Gly 530
535 540Ala Ala Thr Gly Leu Asp Ile Val Lys Met
Lys Leu Phe Asn Glu Gln545 550 555
560Gly Gly Ile Cys Ala Tyr Ser Gln Arg Gln Met Ser Leu Glu Arg
Leu 565 570 575Phe Asp Pro
Asn Tyr Ala Glu Val Asp His Ile Val Pro Tyr Ser Ile 580
585 590Ser Phe Asp Asp Arg Tyr Lys Asn Lys Val
Leu Val Leu Thr Glu Glu 595 600
605Asn Arg Asn Lys Gly Asn Arg Leu Pro Leu Gln Tyr Leu Thr Gly Glu 610
615 620Arg Arg Asp Arg Phe Ile Val Trp
Val Asn Asn Ser Val Arg Asp Phe625 630
635 640Gln Lys Arg Lys Leu Leu Leu Lys Glu Ala Leu Thr
Pro Glu Glu Glu 645 650
655Asn Asp Trp Lys Glu Arg Asn Leu Gln Asp Thr Lys Phe Val Ser Ser
660 665 670Phe Leu Leu Asn Tyr Ile
Asn Asp Asn Leu Leu Phe Ala Pro Ser Val 675 680
685Arg Arg Lys Lys Arg Val Thr Ala Val Asn Gly Ala Val Thr
Asp Tyr 690 695 700Met Arg Lys Arg Trp
Gly Ile Ser Lys Val Arg Glu Asp Gly Asp Arg705 710
715 720His His Ala Val Asp Ala Val Val Ile Ala
Cys Thr Asn Asp Ala Leu 725 730
735Ile Gln Lys Val Ser Arg Tyr Glu Ser Trp His Glu Arg His Tyr Met
740 745 750Pro Thr Glu Asn Gly
Ser Ile Leu Val Asp Pro Ala Thr Gly Glu Ile 755
760 765Lys Gln Thr Phe Pro Tyr Pro Trp Ala Met Phe Arg
Lys Glu Leu Glu 770 775 780Ala Arg Leu
Ser Asn Asp Pro Ser Arg Ala Val Ala Asp Leu Lys Leu785
790 795 800Pro Phe Tyr Met Asp Ala Asp
Ala Pro Pro Val Lys Pro Leu Phe Val 805
810 815Ser Arg Met Pro Thr Arg Lys Val Thr Gly Ala Ala
His Lys Asp Thr 820 825 830Val
Lys Ser Ala Arg Ala Leu Ala Asp Gly Leu Ala Ile Val Arg Arg 835
840 845Pro Leu Thr Ala Leu Lys Leu Asp Lys
Asp Gly Glu Ile Ala Gly Tyr 850 855
860Tyr Asn Lys Asp Ser Asp Arg Leu Leu Tyr Asp Ala Leu Lys Ala Arg865
870 875 880Leu Thr Glu Tyr
Gly Gly Asn Ala Ala Lys Ala Phe Ala Glu Pro Phe 885
890 895Tyr Lys Pro Lys Ser Asp Gly Thr Pro Gly
Pro Val Val Asn Lys Val 900 905
910Lys Leu Thr Glu Pro Thr Thr Leu Ser Val Pro Val Gln Asp Gly Thr
915 920 925Gly Ile Ala Asp Asn Asp Ser
Met Val Arg Ile Asp Val Phe Arg Val 930 935
940Val Gly Asp Gly Tyr Tyr Phe Val Pro Val Tyr Val Ala Asp Thr
Leu945 950 955 960Lys Gln
Glu Leu Pro Asp Arg Ala Val Val Ala Phe Lys Ala His Ser
965 970 975Glu Trp Lys Val Met Ser Asp
Gly Asp Phe Val Phe Ser Leu Tyr Pro 980 985
990Asn Asp Leu Val Lys Val Thr Arg Lys Lys Asp Val Ile Leu
Lys Arg 995 1000 1005Ser Phe Asp
Asn Ser Thr Leu Pro Glu Thr Ile Ala Ser Asn Glu 1010
1015 1020Cys Leu Leu Tyr Tyr Ala Gly Ala Asp Ile Ser
Thr Gly Ala Ile 1025 1030 1035Ser Cys
Val Thr Asn Asp Asn Ala Tyr Ser Ile Arg Gly Leu Gly 1040
1045 1050Ile Lys Thr Leu Val Ser Met Glu Lys Tyr
Thr Val Asp Ile Leu 1055 1060 1065Gly
Glu Tyr His Pro Val Arg Lys Glu Glu Arg Gln Arg Phe Asn 1070
1075 1080Thr Lys Arg 1085
SEQ ID NO: 74 (length 36, DNA, Artificial Sequence, CR RNA)
gttgtagttc cctgatggtt cttggtatgg tataat 36
SEQ ID NO: 75 (length 107, DNA, Artificial Sequence, TRACR RNA)
ttataccata ccaagaactg ctcaggttac tatgataagg tagtaaaccg aagagctcta 60
atgccccgtc tcgcacgggg cattatctct aacagcgaaa aggcaaa 107
SEQ ID NO: 76 (length 120, DNA, Artificial Sequence, GRNA)
gttttagagc tatgctgttt tgaatgcttc
caaaacgaaa tgttggtagc attcaaaaca 60acatagcaag ttaaaataag gctttgtccg
ttctcaactt ttagtgacgc tgtttcggcg 12077105DNAArtificial SequenceGRNA
77gttttagagc tatgctgttt tgaatgcttc gtagcattca aaacaacata gcaagttaaa
60ataaggcttt gtccgttctc aacttttagt gacgctgttt cggcg
1057883DNAArtificial SequenceGRNA 78gttttagagc tatgctgtta acaacatagc
aagttaaaat aaggctttgt ccgttctcaa 60cttttagtga cgctgtttcg gcg
837977DNAArtificial SequenceGRNA
79gttttagagc tatgcaaaca tagcaagtta aaataaggct ttgtccgttc tcaactttta
60gtgacgctgt ttcggcg
7780121DNAArtificial SequenceGRNA 80gttttagagt catgttgttt agaatggtac
caaaacatct tttgggacta ttctaaacaa 60catagcaagt taaaataagg ttttaaccgt
aatcaactgt aaagtggcgc tgtttcggcg 120c
1218187DNAArtificial SequenceGRNA
81gttttagagt catgttgtaa aaacaacata gcaagttaaa ataaggtttt aaccgtaatc
60aactgtaaag tggcgctgtt tcggcgc
878279DNAArtificial SequenceGRNA 82gttttagagt catgttgtaa aaacaacata
gcaagttaaa ataagcgtaa tcaactgtaa 60agtggcgctg tttcggcgc
7983127DNAArtificial SequenceGRNA
83gttttagagc tgtgctgttt cgaatggttc caaaacgaaa tgttggaact attcgaaaca
60acacagcgag ttaaaataag gctttgtccg tacacaactt gtaaaagggg cacccgattc
120gggtgca
12784101DNAArtificial SequenceGRNA 84gttttagagc tgtgctgttt cgaaaaatcg
aaacaacaca gcgagttaaa ataaggcttt 60gtccgtacac aacttgtaaa aggggcaccc
gattcgggtg c 1018591DNAArtificial SequenceGRNA
85gttttagagc tgtgctgtaa aaacaacaca gcgagttaaa ataaggcttt gtccgtacac
60aacttgtaaa aggggcaccc gattcgggtg c
918683DNAArtificial SequenceGRNA 86gttttagagc tgtgcaaaca cagcgagtta
aaataaggct ttgtccgtac acaacttgta 60aaaggggcac ccgattcggg tgc
8387123DNAArtificial SequenceGRNA
87gttttagagc tgtgttgttt cgaatggttc caaaacggtt tgaaaccatt cgaaacaata
60cagcaaagtt aaaataaggc tagtccgtat acaacgtgaa aacacgtggc accgattcgg
120tgc
1238891DNAArtificial SequenceGRNA 88gttttagagc tgtgttgtaa aaacaataca
gcaaagttaa aataaggcta gtccgtatac 60aacgtgaaaa cacgtggcac cgattcggtg c
918990DNAArtificial SequenceGRNA
89gttttagagc tgtgttgtaa aaacaataca gcaagttaaa ataaggctag tccgtataca
60acgtgaaaac acgtggcacc gattcggtgc
9090118DNAArtificial SequenceGRNA 90gtttgctagt tatgttattt atagtattaa
gcaaactgta aataacataa cgagtgcaaa 60taagcgtttc gcgaaaattt acagtggccc
tgctgtgggg ccttttttat ttatcaaa 1189199DNAArtificial SequenceGRNA
91gtttgctagt tatgttataa aaataacata acgagtgcaa ataagcgttt cgcgaaaatt
60tacagtggcc ctgctgtggg gcctttttta tttatcaaa
999283DNAArtificial SequenceGRNA 92gtttgctagt tatgttataa aaataacata
acgagtgcaa ataagcgttt cgcgaaaatt 60tacagtggcc ctgctgtggg gcc
8393137DNAArtificial SequenceGRNA
93gtttgagagc cttgtaaaac cgtatatctc tcaagcgaaa gataatgttt tacaaggcga
60gttcaaataa ggatttatcc gaaatcgctt gcgtgcattg gcaccatcta tcttttaaga
120ctttctttga aagtctt
13794136DNAArtificial SequenceGRNA 94gtttgagagt cttgttaatt cttaaaggtg
taaaacgaga attaacaaga cgagtgcaaa 60taaggtttat ccggaatcgt caatatgacc
tgcattgtgc agaatcttta aaatcatatg 120atttcatatg gtttta
13695111DNAArtificial SequenceGRNA
95gtttgagagt cttgtaaaaa caagacgagt gcaaataagg tttatccgga atcgtcaata
60tgacctgcat tgtgcagaat ctttaaaatc atatgatttc atatggtttt a
1119677DNAArtificial SequenceGRNA 96gtttgagagt cttgtaaaaa caagacgagt
gcaaataagg tttatccgga atcgtcaata 60tgacctgcat tgtgcag
7797123DNAArtificial SequenceGRNA
97gtttgagagt cttgttaatt caaaagaatt aacaagacga gtgcaaataa ggtttatccg
60gaatcgtcaa tatgacctgc attgtgcaga atctttaaaa tcatatgatt tcatatggtt
120tta
12398142DNAArtificial SequenceGRNA 98gtttgagagt agtgtaaatc cataggggtc
tcaaacgaaa agacccctat ggatttacat 60tgcgagttca aataaaagtt tactcaaatc
gttggcttga ccaaccgcac agcgtgtgct 120taaagatctc ttcagtgagg tc
14299135DNAArtificial SequenceGRNA
99gtttgagagt agtgtaaatc cagagggctc caaaacgagc cctctggatt tacactacga
60gttcaaataa aaattatttc aaatcgccgc tatgtcggcc gcacagtgtg tgcattaaga
120aaagtccgaa agggc
135100147DNAArtificial SequenceGRNA 100gtttgagagt agtgtaaatt tatagggtag
taaaacaaat tttactaccc tataaattta 60cactacgagt tcaaataaaa attatttcaa
atcgtacttt ttagtacctt cacaagtgtt 120gtgaatatta actcaccttc gggtgag
147101109DNAArtificial SequenceGRNA
101gtttgagagt agtgtaaaaa tacactacga gttcaaataa aaattatttc aaatcgtact
60ttttagtacc ttcacaagtg ttgtgaatat taactcacct tcgggtgag
10910287DNAArtificial SequenceGRNA 102gtttgagagt agtgtaaaaa tacactacga
gttcaaataa aaattatttc aaatcgtact 60ttttagtacc ttcacaagtg ttgtgaa
87103129DNAArtificial SequenceGRNA
103gtttgagagt agtgtaaatt tataggaaaa cctataaatt tacactacga gttcaaataa
60aaattatttc aaatcgtact ttttagtacc ttcacaagtg ttgtgaatat taactcacct
120tcgggtgag
129104138DNAArtificial SequenceGRNA 104gtttgagagt agtgtaattt catatggtag
tcaaacgact accatatgag attacactac 60acggttcaaa taaagaatgt tcgaaaccgc
cctttggggc ccgcttgttg cggatttaca 120gacttgatat caagtctg
138105121DNAArtificial SequenceGRNA
105gtttgagagt aatgtaaatt cataggatgg taaaacgaaa tttaccatcc agtgagttta
60cattacaagt tcaaataaaa atttattcaa cccgttcttc ggaacctcca ccgtgtggaa
120c
121106105DNAArtificial SequenceGRNA 106gtttgagagt aatgtaaaaa tacattacaa
gttcaaataa aaatttattc aacccgttct 60tcggaacctc caccgtgtgg aacattaagg
tctgctttgc aggcc 10510781DNAArtificial SequenceGRNA
107gtttgagagt aatgtaaaaa tacattacaa gttcaaataa aaatttattc aacccgttct
60tcggaacctc caccgtgtgg a
81108103DNAArtificial SequenceGRNA 108gtttgagagt aatgtaaatt cataaaagtg
agtttacatt acaagttcaa ataaaaattt 60attcaacccg ttcttcggaa cctccaccgt
gtggaacatt aag 103109139DNAArtificial SequenceGRNA
109gtttgagagc agtgttgtct tatatagctc gaaaacgcat tgtaagacaa cactgctacg
60ttcaaataag catattgcta caaggttctc cctcggagaa tgaccattag gtcacttaga
120tagccggttc ttctggcta
139110139DNAArtificial SequenceGRNA 110gtttgagagc agtgttgtct tatatagctc
gaaaacgcat tgtaagacaa cactgctacg 60ttcaaataag catattgcta caaggttctc
cattggagaa tgaccattag gtcgcttaga 120tagccagttc ttctggcta
139111109DNAArtificial SequenceGRNA
111gtttgagagc agtgtaaaaa cactgctacg ttcaaataag catattgcta caaggttctc
60cattggagaa tgaccattag gtcgcttaga tagccagttc ttctggcta
10911283DNAArtificial SequenceGRNA 112gtttgagagc agtgtaaaaa cactgctacg
ttcaaataag catattgcta caaggttctc 60cattggagaa tgaccattag gtc
83113117DNAArtificial SequenceGRNA
113gtttgagagc agtgttgtca aaagacaaca ctgctacgtt caaataagca tattgctaca
60aggttctcca ttggagaatg accattaggt cgcttagata gccagttctt ctggcta
117114140DNAArtificial SequenceGRNA 114gtttgagagc agtgttgtct taaatagctc
gaaaacgcat tgtaagacaa cactgcacgt 60tcaaataagc agattgctac aaggttcccg
taagggaatg accatctggt cacatgaata 120gcccccggca acggtggctg
140115133DNAArtificial SequenceGRNA
115attgtaccat agcgagttaa attagggaat tacaacgaaa ttgtaataac ctattttacc
60tcgctatggc acaatttgtt attacatgga cattatacta aacatttcct aaaaaagcaa
120cgaaaaacgt gct
133116129DNAArtificial SequenceGRNA 116gttgtagttc cctaattatt cttggtatgg
tataatgaaa attgtatcat accaagaaca 60attaggttac tatgataagg tagtataccg
caaagctcta acacctcatc ttcggatgag 120gtgttatct
129117106DNAArtificial SequenceGRNA
117gttgtagttc cctaattatt cttggtaaaa accaagaaca attaggttac tatgataagg
60tagtataccg caaagctcta acacctcatc ttcggatgag gtgtta
106118100DNAArtificial SequenceGRNA 118gttgtagttc cctaattatt cttggtaaaa
accaagaaca attaggttac tatgataagg 60tagtataccg caaagctcta acacctcatc
ttcggatgag 100119121DNAArtificial SequenceGRNA
119gttgtagttc cctaattatt cttggtatgg taaaaatatc ataccaagaa caattaggtt
60actatgataa ggtagtatac cgcaaagctc taacacctca tcttcggatg aggtgttatc
120t
121120158DNAArtificial SequenceGRNA 120gttgtagttc cctaacagtt cttggtatgg
tataataaaa attataccat accaagaact 60gttatggttg ctatgataag gtcttagcac
cgtaaagctc tgacgcctcg ctttcagcgg 120ggcgtcatct tttttgccca aaagacacgg
atattttt 15812199DNAArtificial SequenceGRNA
121gttgtagttc cctaacagtt ctaaaaagaa ctgttatggt tgctatgata aggtcttagc
60accgtaaagc tctgacgcct cgctttcagc ggggcgtca
9912294DNAArtificial SequenceGRNA 122gttgtagttc cctaacagtt ctaaaaagaa
ctgttatggt tgctatgata aggtcttagc 60accgtaaagc tctgacgcct cgctttcagc
gggg 9412393DNAArtificial SequenceGRNA
123gttgtagttc cctaacagta aaaactgtta tggttgctat gataaggtct tagcaccgta
60aagctctgac gcctcgcttt cagcggggcg tca
93124126DNAArtificial SequenceGRNA 124cggttcttgg tatggtataa tgaattatac
cataccaaga actgttgggt tactacaata 60aggtagtaaa ccgaaaagct ctgacgtctt
gtttgcgcag gacgtcatct ttatatcaga 120cggatg
12612590DNAArtificial SequenceGRNA
125cggtactgtt gggttactac aataaggtag taaaccgaaa agctctgacg tcttgtttgc
60gcaggacgtc atctttatat cagacggatg
9012691DNAArtificial SequenceGRNA 126gttgtagttc cctaacggta ctgttgggtt
actacaataa ggtagtaaac cgaaaagctc 60tgacgtcttg tttgcgcagg acgtcatctt t
91127108DNAArtificial SequenceGRNA
127ctaacggttc ttgaaaacaa gaactgttgg gttactacaa taaggtagta aaccgaaaag
60ctctgacgtc ttgtttgcgc aggacgtcat ctttatatca gacggatg
108128160DNAArtificial SequenceGRNA 128gttgtagtcc cctgatggtt tctggaatgg
tataatgaaa ttataccatt ccagaaacta 60ttatggtcac tacaataagg tattagaccg
tagagcacta acaccccatt tggggtgtta 120tctctttaaa ctgtccaaaa tttagtattg
caattattga 160129128DNAArtificial SequenceGRNA
129gttatagttc cctgatagtt cttggtatgg tataatgaaa ttataccata ccaagaacta
60tgaggttgct ataataaggt agtaaaccgc agagctctaa cgcctcacat ttgtggggcg
120ttatctct
128130115DNAArtificial SequenceGRNA 130tcgttcttgg tatggtataa tgaaattata
ccataccaag aacgatcagg ttgctacaat 60aaggtagtaa accgaagagc tctaacgccc
cgtttcttta cggggcgtta tctct 115131170DNAArtificial SequenceGRNA
131gttatagttc cctgatagtt cttggtatgg tataatgaat tataccatac caagaactat
60ttaggttact atgataaggt ttagtacacc ttagagctct gacgcctcgc ttttgcgagg
120cgttatctct ttatattgcc aaaaatgcaa atatatcgta caatggtggc
170132130DNAArtificial SequenceGRNA 132gttatagttc cctgatagtt cttggtatgg
tataatgaat tataccatac caagaactat 60ttaggttact atgataaggt ttagtacacc
ttagagctct gacgcctcgc ttttgcgagg 120cgttatctct
130133106DNAArtificial SequenceGRNA
133gttatagttc cctgatagtt cttaaccaag aactatttag gttactatga taaggtttag
60tacaccttag agctctgacg cctcgctttt gcgaggcgtt atctct
106134104DNAArtificial SequenceGRNA 134gttatagttc cctgatagtt cttgcaagaa
ctatttaggt tactatgata aggtttagta 60caccttagag ctctgacgcc tcgcttttgc
gaggcgttat ctct 10413596DNAArtificial SequenceGRNA
135gttatagttc cctgatagtt cttgcaagaa ctatttaggt tactatgata aggtttagta
60caccttagag ctctgacgcc aaaaggcgtt atctct
96136129DNAArtificial SequenceGRNA 136gttgtagttc cctgatggtt cttggtatgg
tataataaat tataccatac caagaactgc 60tcaggttact atgataaggt agtaaaccga
agagctctaa tgccccgtct cgcacggggc 120attatctct
129137105DNAArtificial SequenceGRNA
137gttgtagttc cctgatggtt cttgaaaaag aactgctcag gttactatga taaggtagta
60aaccgaagag ctctaatgcc ccgtctcgca cggggcatta tctct
10513896DNAArtificial SequenceGRNA 138gttgtagttc cctgatggtt cttgaaaaag
aactgctcag gttactatga taaggtagta 60aaccgaagag ctctaatgcc aaagggcatt
atctct 96139195DNAArtificial SequenceGRNA
139aaacccctcc gtttagagag gggttatgct agttagcgcc gaaacagcgc cactttacag
60ttgattacgg ttaaaacctt attttaactt gctatgttgt ttagaatagt cccaaaagat
120gttttggtac cattctaaac aacatgactc taaaacccag taacattact gactggccta
180tagtgagtcg tatta
195140161DNAArtificial SequenceGRNA 140aaacccctcc gtttagagag gggttatgct
agttagcgcc gaaacagcgc cactttacag 60ttgattacgg ttaaaacctt attttaactt
gctatgttgt ttttacaaca tgactctaaa 120acccagtaac attactgact ggcctatagt
gagtcgtatt a 161141153DNAArtificial SequenceGRNA
141aaacccctcc gtttagagag gggttatgct agttagcgcc gaaacagcgc cactttacag
60ttgattacgc ttattttaac ttgctatgtt gtttttacaa catgactcta aaacccagta
120acattactga ctggcctata gtgagtcgta tta
153142197DNAArtificial SequenceGRNA 142aaacccctcc gtttagagag gggttatgct
agttagcacc gaatcggtgc cacgtgtttt 60cacgttgtat acggactagc cttattttaa
ctttgctgta ttgtttcgaa tggtttcaaa 120ccgttttgga accattcgaa acaacacagc
tctaaaaccc agtaacatta ctgactggcc 180tatagtgagt cgtatta
197143165DNAArtificial SequenceGRNA
143aaacccctcc gtttagagag gggttatgct agttagcacc gaatcggtgc cacgtgtttt
60cacgttgtat acggactagc cttattttaa ctttgctgta ttgtttttac aacacagctc
120taaaacccag taacattact gactggccta tagtgagtcg tatta
165144164DNAArtificial SequenceGRNA 144aaacccctcc gtttagagag gggttatgct
agttagcacc gaatcggtgc cacgtgtttt 60cacgttgtat acggactagc cttattttaa
cttgctgtat tgtttttaca acacagctct 120aaaacccagt aacattactg actggcctat
agtgagtcgt atta 164145192DNAArtificial SequenceGRNA
145aaacccctcc gtttagagag gggttatgct agttatttga taaataaaaa aggccccaca
60gcagggccac tgtaaatttt cgcgaaacgc ttatttgcac tcgttatgtt atttacagtt
120tgcttaatac tataaataac ataactagca aacccagtaa cattactgac tggcctatag
180tgagtcgtat ta
192146173DNAArtificial SequenceGRNA 146aaacccctcc gtttagagag gggttatgct
agttatttga taaataaaaa aggccccaca 60gcagggccac tgtaaatttt cgcgaaacgc
ttatttgcac tcgttatgtt atttttataa 120cataactagc aaacccagta acattactga
ctggcctata gtgagtcgta tta 173147157DNAArtificial SequenceGRNA
147aaacccctcc gtttagagag gggttatgct agttaggccc cacagcaggg ccactgtaaa
60ttttcgcgaa acgcttattt gcactcgtta tgttattttt ataacataac tagcaaaccc
120agtaacatta ctgactggcc tatagtgagt cgtatta
157148197DNAArtificial SequenceGRNA 148aaacccctcc gtttagagag gggttatgct
agttataaaa ccatatgaaa tcatatgatt 60ttaaagattc tgcacaatgc aggtcatatt
gacgattccg gataaacctt atttgcactc 120gtcttgttaa ttcttttgaa ttaacaagac
tctcaaaccc agtaacatta ctgactggcc 180tatagtgagt cgtatta
197149185DNAArtificial SequenceGRNA
149aaacccctcc gtttagagag gggttatgct agttataaaa ccatatgaaa tcatatgatt
60ttaaagattc tgcacaatgc aggtcatatt gacgattccg gataaacctt atttgcactc
120gtcttgtttt tacaagactc tcaaacccag taacattact gactggccta tagtgagtcg
180tatta
185150151DNAArtificial SequenceGRNA 150aaacccctcc gtttagagag gggttatgct
agttactgca caatgcaggt catattgacg 60attccggata aaccttattt gcactcgtct
tgtttttaca agactctcaa acccagtaac 120attactgact ggcctatagt gagtcgtatt a
151151200DNAArtificial SequenceGRNA
151aaacccctcc gtttagagag gggttatgct agctcacccg aaggtgagtt aatattcaca
60acacttgtga aggtactaaa aagtacgatt tgaaataatt tttatttgaa ctcgtagtgt
120aaatttatag gttttcctat aaatttacac tactctcaaa cccagtaaca ttactgactg
180gcctatagtg agtcgtatta
200152183DNAArtificial SequenceGRNA 152aaacccctcc gtttagagag gggttatgct
agttactcac ccgaaggtga gttaatattc 60acaacacttg tgaaggtact aaaaagtacg
atttgaaata atttttattt gaactcgtag 120tgtattttta cactactctc aaacccagta
acattactga ctggcctata gtgagtcgta 180tta
183153161DNAArtificial SequenceGRNA
153aaacccctcc gtttagagag gggttatgct agttattcac aacacttgtg aaggtactaa
60aaagtacgat ttgaaataat ttttatttga actcgtagtg tatttttaca ctactctcaa
120acccagtaac attactgact ggcctatagt gagtcgtatt a
161154193DNAArtificial SequenceGRNA 154aaacccctcc gtttagagag gggttatgct
agttaggcct gcaaagcaga ccttaatgtt 60ccacacggtg gaggttccga agaacgggtt
gaataaattt ttatttgaac ttgtaatgta 120aactcacttt tatgaattta cattactctc
aaacccagta acattactga ctggcctata 180gtgagtcgta tta
193155179DNAArtificial SequenceGRNA
155aaacccctcc gtttagagag gggttatgct agttaggcct gcaaagcaga ccttaatgtt
60ccacacggtg gaggttccga agaacgggtt gaataaattt ttatttgaac ttgtaatgta
120tttttacatt actctcaaac ccagtaacat tactgactgg cctatagtga gtcgtatta
179156155DNAArtificial SequenceGRNA 156aaacccctcc gtttagagag gggttatgct
agttatccac acggtggagg ttccgaagaa 60cgggttgaat aaatttttat ttgaacttgt
aatgtatttt tacattactc tcaaacccag 120taacattact gactggccta tagtgagtcg
tatta 155157191DNAArtificial SequenceGRNA
157aaacccctcc gtttagagag gggttatgct agttatagcc agaagaactg gctatctaag
60cgacctaatg gtcattctcc aatggagaac cttgtagcaa tatgcttatt tgaacgtagc
120agtgttgtct tttgacaaca ctgctctcaa acccagtaac attactgact ggcctatagt
180gagtcgtatt a
191158183DNAArtificial SequenceGRNA 158aaacccctcc gtttagagag gggttatgct
agttatagcc agaagaactg gctatctaag 60cgacctaatg gtcattctcc aatggagaac
cttgtagcaa tatgcttatt tgaacgtagc 120agtgttttta cactgctctc aaacccagta
acattactga ctggcctata gtgagtcgta 180tta
183159157DNAArtificial SequenceGRNA
159aaacccctcc gtttagagag gggttatgct agttagacct aatggtcatt ctccaatgga
60gaaccttgta gcaatatgct tatttgaacg tagcagtgtt tttacactgc tctcaaaccc
120agtaacatta ctgactggcc tatagtgagt cgtatta
157160195DNAArtificial SequenceGRNA 160aaacccctcc gtttagagag gggttatgct
agttaagata acacctcatc cgaagatgag 60gtgttagagc tttgcggtat actaccttat
catagtaacc taattgttct tggtatgata 120tttttaccat accaagaata attagggaac
tacaacccag taacattact gactggccta 180tagtgagtcg tatta
195161180DNAArtificial SequenceGRNA
161aaacccctcc gtttagagag gggttatgct agttataaca cctcatccga agatgaggtg
60ttagagcttt gcggtatact accttatcat agtaacctaa ttgttcttgg tttttaccaa
120gaataattag ggaactacaa cccagtaaca ttactgactg gcctatagtg agtcgtatta
180162174DNAArtificial SequenceGRNA 162aaacccctcc gtttagagag gggttatgct
agttactcat ccgaagatga ggtgttagag 60ctttgcggta tactacctta tcatagtaac
ctaattgttc ttggttttta ccaagaataa 120ttagggaact acaacccagt aacattactg
actggcctat agtgagtcgt atta 174163167DNAArtificial SequenceGRNA
163aaacccctcc gtttagagag gggttatgct agttatgacg ccccgctgaa agcgaggcgt
60cagagcttta cggtgctaag accttatcat agcaaccata acagttttta ctgttaggga
120actacaaccc agtaacatta ctgactggcc tatagtgagt cgtatta
167164173DNAArtificial SequenceGRNA 164aaacccctcc gtttagagag gggttatgct
agttatgacg ccccgctgaa agcgaggcgt 60cagagcttta cggtgctaag accttatcat
agcaaccata acagttcttt ttagaactgt 120tagggaacta caacccagta acattactga
ctggcctata gtgagtcgta tta 173165168DNAArtificial SequenceGRNA
165aaacccctcc gtttagagag gggttatgct agttaccccg ctgaaagcga ggcgtcagag
60ctttacggtg ctaagacctt atcatagcaa ccataacagt tctttttaga actgttaggg
120aactacaacc cagtaacatt actgactggc ctatagtgag tcgtatta
168166193DNAArtificial SequenceGRNA 166aaacccctcc gtttagagag gggttatgct
agttacatcc gtctgatata aagatgacgt 60cctgcgcaaa caagacgtca gagcttttcg
gtttactacc ttattgtagt aacccaacag 120ttcttgtttt caagaaccgt tagggaacta
caacccagta acattactga ctggcctata 180gtgagtcgta tta
193167179DNAArtificial SequenceGRNA
167aaacccctcc gtttagagag gggttatgct agttacatcc gtctgatata aagatgacgt
60cctgcgcaaa caagacgtca gagcttttcg gtttactacc ttattgtagt aacccaacag
120taccgttagg gaactacaac ccagtaacat tactgactgg cctatagtga gtcgtatta
179168165DNAArtificial SequenceGRNA 168aaacccctcc gtttagagag gggttatgct
agttaaaaga tgacgtcctg cgcaaacaag 60acgtcagagc ttttcggttt actaccttat
tgtagtaacc caacagtacc gttagggaac 120tacaacccag taacattact gactggccta
tagtgagtcg tatta 165169178DNAArtificial SequenceGRNA
169aaacccctcc gtttagagag gggttatgct agttaagaga taacgcctcg caaaagcgag
60gcgtcagagc tctaaggtgt actaaacctt atcatagtaa cctaaatagt tcttgcaaga
120actatcaggg aactataacc cagtaacatt actgactggc ctatagtgag tcgtatta
178170170DNAArtificial SequenceGRNA 170aaacccctcc gtttagagag gggttatgct
agttaagaga taacgccttt tggcgtcaga 60gctctaaggt gtactaaacc ttatcatagt
aacctaaata gttcttgcaa gaactatcag 120ggaactataa cccagtaaca ttactgactg
gcctatagtg agtcgtatta 170171180DNAArtificial SequenceGRNA
171aaacccctcc gtttagagag gggttatgct agttaagaga taacgcctcg caaaagcgag
60gcgtcagagc tctaaggtgt actaaacctt atcatagtaa cctaaatagt tcttggttaa
120gaactatcag ggaactataa cccagtaaca ttactgactg gcctatagtg agtcgtatta
18017220DNAArtificial SequencePRIMER 172ttgggtaacg ccagggtttt
2017320DNAArtificial SequencePRIMER
173tgtgtggaat tgtgagcgga
2017420DNAArtificial SequencePRIMER 174aaacccctcc gtttagagag
2017529DNAArtificial SequencePRIMER
175aagctaatac gactcactat aggccagtc
2917620DNAArtificial SequencePRIMER 176ccagtcagta atgttactgg
201771337PRTArtificial
SequenceENGINEERED NICKASE 177Met Lys Lys Asp Tyr Val Ile Gly Leu Asp Ile
Gly Thr Asn Ser Val1 5 10
15Gly Trp Ala Val Met Thr Glu Asp Tyr Gln Leu Val Lys Lys Lys Met
20 25 30Pro Ile Tyr Gly Asn Thr Glu
Lys Lys Lys Ile Lys Lys Asn Phe Trp 35 40
45Gly Val Arg Leu Phe Glu Glu Gly His Thr Ala Glu Asp Arg Arg
Leu 50 55 60Lys Arg Thr Ala Arg Arg
Ile Ile Ser Arg Arg Arg Asn Arg Leu Arg65 70
75 80Tyr Leu Gln Ala Phe Phe Glu Glu Ala Met Thr
Asp Leu Asp Glu Asn 85 90
95Phe Phe Ala Arg Leu Gln Glu Ser Phe Leu Val Pro Glu Asp Lys Lys
100 105 110Trp His Arg His Pro Ile
Phe Ala Lys Leu Glu Asp Glu Val Ala Tyr 115 120
125His Glu Thr Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu
Ala Asp 130 135 140Ser Ser Glu Gln Ala
Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His145 150
155 160Ile Val Lys Tyr Arg Gly His Phe Leu Ile
Glu Gly Lys Leu Ser Thr 165 170
175Glu Asn Ile Ser Val Lys Glu Gln Phe Gln Gln Phe Met Ile Ile Tyr
180 185 190Asn Gln Thr Phe Val
Asn Gly Glu Ser Arg Leu Val Ser Ala Pro Leu 195
200 205Pro Glu Ser Val Leu Ile Glu Glu Glu Leu Thr Glu
Lys Ala Ser Arg 210 215 220Thr Lys Lys
Ser Glu Lys Val Leu Gln Gln Phe Pro Gln Glu Lys Ala225
230 235 240Asn Gly Leu Phe Gly Gln Phe
Leu Lys Leu Met Val Gly Asn Lys Ala 245
250 255Asp Phe Lys Lys Val Phe Gly Leu Glu Glu Glu Ala
Lys Ile Thr Tyr 260 265 270Ala
Ser Glu Ser Tyr Glu Glu Asp Leu Glu Gly Ile Leu Ala Lys Val 275
280 285Gly Asp Glu Tyr Ser Asp Val Phe Leu
Ala Ala Lys Asn Val Tyr Asp 290 295
300Ala Val Glu Leu Ser Thr Ile Leu Ala Asp Ser Asp Lys Lys Ser His305
310 315 320Ala Lys Leu Ser
Ser Ser Met Ile Val Arg Phe Thr Glu His Gln Glu 325
330 335Asp Leu Lys Lys Phe Lys Arg Phe Ile Arg
Glu Asn Cys Pro Asp Glu 340 345
350Tyr Asp Asn Leu Phe Lys Asn Glu Gln Lys Asp Gly Tyr Ala Gly Tyr
355 360 365Ile Ala His Ala Gly Lys Val
Ser Gln Leu Lys Phe Tyr Gln Tyr Val 370 375
380Lys Lys Ile Ile Gln Asp Ile Ala Gly Ala Glu Tyr Phe Leu Glu
Lys385 390 395 400Ile Ala
Gln Glu Asn Phe Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly
405 410 415Val Ile Pro His Gln Ile His
Leu Ala Glu Leu Gln Ala Ile Ile His 420 425
430Arg Gln Ala Ala Tyr Tyr Pro Phe Leu Lys Glu Asn Gln Glu
Lys Ile 435 440 445Glu Gln Leu Val
Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ser 450
455 460Lys Gly Asp Ala Ser Thr Phe Ala Trp Leu Lys Arg
Gln Ser Glu Glu465 470 475
480Pro Ile Arg Pro Trp Asn Leu Gln Glu Thr Val Asp Leu Asp Gln Ser
485 490 495Ala Thr Ala Phe Ile
Glu Arg Met Thr Asn Phe Asp Thr Tyr Leu Pro 500
505 510Ser Glu Lys Val Leu Pro Lys His Ser Leu Leu Tyr
Glu Lys Phe Met 515 520 525Val Phe
Asn Glu Leu Thr Lys Ile Ser Tyr Thr Asp Asp Arg Gly Ile 530
535 540Lys Ala Asn Phe Ser Gly Lys Glu Lys Glu Lys
Ile Phe Asp Tyr Leu545 550 555
560Phe Lys Thr Arg Arg Lys Val Lys Lys Lys Asp Ile Ile Gln Phe Tyr
565 570 575Arg Asn Glu Tyr
Asn Thr Glu Ile Val Thr Leu Ser Gly Leu Glu Glu 580
585 590Asp Gln Phe Asn Ala Ser Phe Ser Thr Tyr Gln
Asp Leu Leu Lys Cys 595 600 605Gly
Leu Thr Arg Ala Glu Leu Asp His Pro Asp Asn Ala Glu Lys Leu 610
615 620Glu Asp Ile Ile Lys Ile Leu Thr Ile Phe
Glu Asp Arg Gln Arg Ile625 630 635
640Arg Thr Gln Leu Ser Thr Phe Lys Gly Gln Phe Ser Ala Glu Val
Leu 645 650 655Lys Lys Leu
Glu Arg Lys His Tyr Thr Gly Trp Gly Arg Leu Ser Lys 660
665 670Lys Leu Ile Asn Gly Ile Tyr Asp Lys Glu
Ser Gly Lys Thr Ile Leu 675 680
685Gly Tyr Leu Ile Lys Asp Asp Gly Val Ser Lys His Tyr Asn Arg Asn 690
695 700Phe Met Gln Leu Ile Asn Asp Ser
Gln Leu Ser Phe Lys Asn Ala Ile705 710
715 720Gln Lys Ala Gln Ser Ser Glu His Glu Glu Thr Leu
Ser Glu Thr Val 725 730
735Asn Glu Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile Tyr Gln Ser
740 745 750Leu Lys Ile Val Asp Glu
Leu Val Ala Ile Met Gly Tyr Ala Pro Lys 755 760
765Arg Ile Val Val Glu Met Ala Arg Glu Asn Gln Thr Thr Ser
Thr Gly 770 775 780Lys Arg Arg Ser Ile
Gln Arg Leu Lys Ile Val Glu Lys Ala Met Ala785 790
795 800Glu Ile Gly Ser Asn Leu Leu Lys Glu Gln
Pro Thr Thr Asn Glu Gln 805 810
815Leu Arg Asp Thr Arg Leu Phe Leu Tyr Tyr Met Gln Asn Gly Lys Asp
820 825 830Met Tyr Thr Gly Asp
Glu Leu Ser Leu His Arg Leu Ser His Tyr Asp 835
840 845Ile Asp Ala Ile Ile Pro Gln Ser Phe Met Lys Asp
Asp Ser Leu Asp 850 855 860Asn Leu Val
Leu Val Gly Ser Thr Glu Asn Arg Gly Lys Ser Asp Asp865
870 875 880Val Pro Ser Lys Glu Val Val
Lys Asp Met Lys Ala Tyr Trp Glu Lys 885
890 895Leu Tyr Ala Ala Gly Leu Ile Ser Gln Arg Lys Phe
Gln Arg Leu Thr 900 905 910Lys
Gly Glu Gln Gly Gly Leu Thr Leu Glu Asp Lys Ala His Phe Ile 915
920 925Gln Arg Gln Leu Val Glu Thr Arg Gln
Ile Thr Lys Asn Val Ala Gly 930 935
940Ile Leu Asp Gln Arg Tyr Asn Ala Asn Ser Lys Glu Lys Lys Val Gln945
950 955 960Ile Ile Thr Leu
Lys Ala Ser Leu Thr Ser Gln Phe Arg Ser Ile Phe 965
970 975Gly Leu Tyr Lys Val Arg Glu Val Asn Asp
Tyr His His Gly Gln Asp 980 985
990Ala Tyr Leu Asn Cys Val Val Ala Thr Thr Leu Leu Lys Val Tyr Pro
995 1000 1005Asn Leu Ala Pro Glu Phe
Val Tyr Gly Glu Tyr Pro Lys Phe Gln 1010 1015
1020Thr Phe Lys Glu Asn Lys Ala Thr Ala Lys Ala Ile Ile Tyr
Thr 1025 1030 1035Asn Leu Leu Arg Phe
Phe Thr Glu Asp Glu Pro Arg Phe Thr Lys 1040 1045
1050Asp Gly Glu Ile Leu Trp Ser Asn Ser Tyr Leu Lys Thr
Ile Lys 1055 1060 1065Lys Glu Leu Asn
Tyr His Gln Met Asn Ile Val Lys Lys Val Glu 1070
1075 1080Val Gln Lys Gly Gly Phe Ser Lys Glu Ser Ile
Lys Pro Lys Gly 1085 1090 1095Pro Ser
Asn Lys Leu Ile Pro Val Lys Asn Gly Leu Asp Pro Gln 1100
1105 1110Lys Tyr Gly Gly Phe Asp Ser Pro Ile Val
Ala Tyr Thr Val Leu 1115 1120 1125Phe
Thr His Glu Lys Gly Lys Lys Pro Leu Ile Lys Gln Glu Ile 1130
1135 1140Leu Gly Ile Thr Ile Met Glu Lys Thr
Arg Phe Glu Gln Asn Pro 1145 1150
1155Ile Leu Phe Leu Glu Glu Lys Gly Phe Leu Arg Pro Arg Val Leu
1160 1165 1170Met Lys Leu Pro Lys Tyr
Thr Leu Tyr Glu Phe Pro Glu Gly Arg 1175 1180
1185Arg Arg Leu Leu Ala Ser Ala Lys Glu Ala Gln Lys Gly Asn
Gln 1190 1195 1200Met Val Leu Pro Glu
His Leu Leu Thr Leu Leu Tyr His Ala Lys 1205 1210
1215Gln Cys Leu Leu Pro Asn Gln Ser Glu Ser Leu Thr Tyr
Val Glu 1220 1225 1230Gln His Gln Pro
Glu Phe Gln Glu Ile Leu Glu Arg Val Val Asp 1235
1240 1245Phe Ala Glu Val His Thr Leu Ala Lys Ser Lys
Val Gln Gln Ile 1250 1255 1260Val Lys
Leu Phe Glu Ala Asn Gln Thr Ala Asp Val Lys Glu Ile 1265
1270 1275Ala Ala Ser Phe Ile Gln Leu Met Gln Phe
Asn Ala Met Gly Ala 1280 1285 1290Pro
Ser Thr Phe Lys Phe Phe Gln Lys Asp Ile Glu Arg Ala Arg 1295
1300 1305Tyr Thr Ser Ile Lys Glu Ile Phe Asp
Ala Thr Ile Ile Tyr Gln 1310 1315
1320Ser Thr Thr Gly Leu Tyr Glu Thr Arg Arg Lys Val Val Asp 1325
1330 13351781337PRTArtificial
SequenceENGINEERED NICKASE 178Met Lys Lys Asp Tyr Val Ile Gly Leu Asp Ile
Gly Thr Asn Ser Val1 5 10
15Gly Trp Ala Val Met Thr Glu Asp Tyr Gln Leu Val Lys Lys Lys Met
20 25 30Pro Ile Tyr Gly Asn Thr Glu
Lys Lys Lys Ile Lys Lys Asn Phe Trp 35 40
45Gly Val Arg Leu Phe Glu Glu Gly His Thr Ala Glu Asp Arg Arg
Leu 50 55 60Lys Arg Thr Ala Arg Arg
Ile Ile Ser Arg Arg Arg Asn Arg Leu Arg65 70
75 80Tyr Leu Gln Ala Phe Phe Glu Glu Ala Met Thr
Asp Leu Asp Glu Asn 85 90
95Phe Phe Ala Arg Leu Gln Glu Ser Phe Leu Val Pro Glu Asp Lys Lys
100 105 110Trp His Arg His Pro Ile
Phe Ala Lys Leu Glu Asp Glu Val Ala Tyr 115 120
125His Glu Thr Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu
Ala Asp 130 135 140Ser Ser Glu Gln Ala
Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His145 150
155 160Ile Val Lys Tyr Arg Gly His Phe Leu Ile
Glu Gly Lys Leu Ser Thr 165 170
175Glu Asn Ile Ser Val Lys Glu Gln Phe Gln Gln Phe Met Ile Ile Tyr
180 185 190Asn Gln Thr Phe Val
Asn Gly Glu Ser Arg Leu Val Ser Ala Pro Leu 195
200 205Pro Glu Ser Val Leu Ile Glu Glu Glu Leu Thr Glu
Lys Ala Ser Arg 210 215 220Thr Lys Lys
Ser Glu Lys Val Leu Gln Gln Phe Pro Gln Glu Lys Ala225
230 235 240Asn Gly Leu Phe Gly Gln Phe
Leu Lys Leu Met Val Gly Asn Lys Ala 245
250 255Asp Phe Lys Lys Val Phe Gly Leu Glu Glu Glu Ala
Lys Ile Thr Tyr 260 265 270Ala
Ser Glu Ser Tyr Glu Glu Asp Leu Glu Gly Ile Leu Ala Lys Val 275
280 285Gly Asp Glu Tyr Ser Asp Val Phe Leu
Ala Ala Lys Asn Val Tyr Asp 290 295
300Ala Val Glu Leu Ser Thr Ile Leu Ala Asp Ser Asp Lys Lys Ser His305
310 315 320Ala Lys Leu Ser
Ser Ser Met Ile Val Arg Phe Thr Glu His Gln Glu 325
330 335Asp Leu Lys Lys Phe Lys Arg Phe Ile Arg
Glu Asn Cys Pro Asp Glu 340 345
350Tyr Asp Asn Leu Phe Lys Asn Glu Gln Lys Asp Gly Tyr Ala Gly Tyr
355 360 365Ile Ala His Ala Gly Lys Val
Ser Gln Leu Lys Phe Tyr Gln Tyr Val 370 375
380Lys Lys Ile Ile Gln Asp Ile Ala Gly Ala Glu Tyr Phe Leu Glu
Lys385 390 395 400Ile Ala
Gln Glu Asn Phe Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly
405 410 415Val Ile Pro His Gln Ile His
Leu Ala Glu Leu Gln Ala Ile Ile His 420 425
430Arg Gln Ala Ala Tyr Tyr Pro Phe Leu Lys Glu Asn Gln Glu
Lys Ile 435 440 445Glu Gln Leu Val
Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ser 450
455 460Lys Gly Asp Ala Ser Thr Phe Ala Trp Leu Lys Arg
Gln Ser Glu Glu465 470 475
480Pro Ile Arg Pro Trp Asn Leu Gln Glu Thr Val Asp Leu Asp Gln Ser
485 490 495Ala Thr Ala Phe Ile
Glu Arg Met Thr Asn Phe Asp Thr Tyr Leu Pro 500
505 510Ser Glu Lys Val Leu Pro Lys His Ser Leu Leu Tyr
Glu Lys Phe Met 515 520 525Val Phe
Asn Glu Leu Thr Lys Ile Ser Tyr Thr Asp Asp Arg Gly Ile 530
535 540Lys Ala Asn Phe Ser Gly Lys Glu Lys Glu Lys
Ile Phe Asp Tyr Leu545 550 555
560Phe Lys Thr Arg Arg Lys Val Lys Lys Lys Asp Ile Ile Gln Phe Tyr
565 570 575Arg Asn Glu Tyr
Asn Thr Glu Ile Val Thr Leu Ser Gly Leu Glu Glu 580
585 590Asp Gln Phe Asn Ala Ser Phe Ser Thr Tyr Gln
Asp Leu Leu Lys Cys 595 600 605Gly
Leu Thr Arg Ala Glu Leu Asp His Pro Asp Asn Ala Glu Lys Leu 610
615 620Glu Asp Ile Ile Lys Ile Leu Thr Ile Phe
Glu Asp Arg Gln Arg Ile625 630 635
640Arg Thr Gln Leu Ser Thr Phe Lys Gly Gln Phe Ser Ala Glu Val
Leu 645 650 655Lys Lys Leu
Glu Arg Lys His Tyr Thr Gly Trp Gly Arg Leu Ser Lys 660
665 670Lys Leu Ile Asn Gly Ile Tyr Asp Lys Glu
Ser Gly Lys Thr Ile Leu 675 680
685Gly Tyr Leu Ile Lys Asp Asp Gly Val Ser Lys His Tyr Asn Arg Asn 690
695 700Phe Met Gln Leu Ile Asn Asp Ser
Gln Leu Ser Phe Lys Asn Ala Ile705 710
715 720Gln Lys Ala Gln Ser Ser Glu His Glu Glu Thr Leu
Ser Glu Thr Val 725 730
735Asn Glu Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile Tyr Gln Ser
740 745 750Leu Lys Ile Val Asp Glu
Leu Val Ala Ile Met Gly Tyr Ala Pro Lys 755 760
765Arg Ile Val Val Glu Met Ala Arg Glu Asn Gln Thr Thr Ser
Thr Gly 770 775 780Lys Arg Arg Ser Ile
Gln Arg Leu Lys Ile Val Glu Lys Ala Met Ala785 790
795 800Glu Ile Gly Ser Asn Leu Leu Lys Glu Gln
Pro Thr Thr Asn Glu Gln 805 810
815Leu Arg Asp Thr Arg Leu Phe Leu Tyr Tyr Met Gln Asn Gly Lys Asp
820 825 830Met Tyr Thr Gly Asp
Glu Leu Ser Leu His Arg Leu Ser His Tyr Asp 835
840 845Ile Asp His Ile Ile Pro Gln Ser Phe Met Lys Asp
Asp Ser Leu Asp 850 855 860Asn Leu Val
Leu Val Gly Ser Thr Glu Ala Arg Gly Lys Ser Asp Asp865
870 875 880Val Pro Ser Lys Glu Val Val
Lys Asp Met Lys Ala Tyr Trp Glu Lys 885
890 895Leu Tyr Ala Ala Gly Leu Ile Ser Gln Arg Lys Phe
Gln Arg Leu Thr 900 905 910Lys
Gly Glu Gln Gly Gly Leu Thr Leu Glu Asp Lys Ala His Phe Ile 915
920 925Gln Arg Gln Leu Val Glu Thr Arg Gln
Ile Thr Lys Asn Val Ala Gly 930 935
940Ile Leu Asp Gln Arg Tyr Asn Ala Asn Ser Lys Glu Lys Lys Val Gln945
950 955 960Ile Ile Thr Leu
Lys Ala Ser Leu Thr Ser Gln Phe Arg Ser Ile Phe 965
970 975Gly Leu Tyr Lys Val Arg Glu Val Asn Asp
Tyr His His Gly Gln Asp 980 985
990Ala Tyr Leu Asn Cys Val Val Ala Thr Thr Leu Leu Lys Val Tyr Pro
995 1000 1005Asn Leu Ala Pro Glu Phe
Val Tyr Gly Glu Tyr Pro Lys Phe Gln 1010 1015
1020Thr Phe Lys Glu Asn Lys Ala Thr Ala Lys Ala Ile Ile Tyr
Thr 1025 1030 1035Asn Leu Leu Arg Phe
Phe Thr Glu Asp Glu Pro Arg Phe Thr Lys 1040 1045
1050Asp Gly Glu Ile Leu Trp Ser Asn Ser Tyr Leu Lys Thr
Ile Lys 1055 1060 1065Lys Glu Leu Asn
Tyr His Gln Met Asn Ile Val Lys Lys Val Glu 1070
1075 1080Val Gln Lys Gly Gly Phe Ser Lys Glu Ser Ile
Lys Pro Lys Gly 1085 1090 1095Pro Ser
Asn Lys Leu Ile Pro Val Lys Asn Gly Leu Asp Pro Gln 1100
1105 1110Lys Tyr Gly Gly Phe Asp Ser Pro Ile Val
Ala Tyr Thr Val Leu 1115 1120 1125Phe
Thr His Glu Lys Gly Lys Lys Pro Leu Ile Lys Gln Glu Ile 1130
1135 1140Leu Gly Ile Thr Ile Met Glu Lys Thr
Arg Phe Glu Gln Asn Pro 1145 1150
1155Ile Leu Phe Leu Glu Glu Lys Gly Phe Leu Arg Pro Arg Val Leu
1160 1165 1170Met Lys Leu Pro Lys Tyr
Thr Leu Tyr Glu Phe Pro Glu Gly Arg 1175 1180
1185Arg Arg Leu Leu Ala Ser Ala Lys Glu Ala Gln Lys Gly Asn
Gln 1190 1195 1200Met Val Leu Pro Glu
His Leu Leu Thr Leu Leu Tyr His Ala Lys 1205 1210
1215Gln Cys Leu Leu Pro Asn Gln Ser Glu Ser Leu Thr Tyr
Val Glu 1220 1225 1230Gln His Gln Pro
Glu Phe Gln Glu Ile Leu Glu Arg Val Val Asp 1235
1240 1245Phe Ala Glu Val His Thr Leu Ala Lys Ser Lys
Val Gln Gln Ile 1250 1255 1260Val Lys
Leu Phe Glu Ala Asn Gln Thr Ala Asp Val Lys Glu Ile 1265
1270 1275Ala Ala Ser Phe Ile Gln Leu Met Gln Phe
Asn Ala Met Gly Ala 1280 1285 1290Pro
Ser Thr Phe Lys Phe Phe Gln Lys Asp Ile Glu Arg Ala Arg 1295
1300 1305Tyr Thr Ser Ile Lys Glu Ile Phe Asp
Ala Thr Ile Ile Tyr Gln 1310 1315
1320Ser Thr Thr Gly Leu Tyr Glu Thr Arg Arg Lys Val Val Asp 1325
1330 13351791087PRTArtificial
SequenceENGINEERED NICKASE 179Met Lys Tyr Ile Ile Gly Leu Asp Met Gly Ile
Thr Ser Val Gly Phe1 5 10
15Ala Thr Met Met Leu Asp Asp Lys Asp Glu Pro Cys Arg Ile Ile Arg
20 25 30Met Gly Ser Arg Ile Phe Glu
Ala Ala Glu His Pro Lys Asp Gly Ser 35 40
45Ser Leu Ala Ala Pro Arg Arg Ile Asn Arg Gly Met Arg Arg Arg
Leu 50 55 60Arg Arg Lys Ser His Arg
Lys Glu Arg Ile Lys Asp Leu Ile Ile Lys65 70
75 80Asn Glu Leu Met Thr Ala Asp Glu Ile Ser Ala
Ile Tyr Ser Thr Gly 85 90
95Lys Gln Leu Ser Asp Ile Tyr Gln Ile Arg Ala Glu Ala Leu Asp Arg
100 105 110Lys Leu Asn Thr Glu Glu
Phe Val Arg Leu Leu Ile His Leu Ser Gln 115 120
125Arg Arg Gly Phe Lys Ser Asn Arg Lys Val Asp Ala Lys Glu
Lys Gly 130 135 140Ser Asp Ala Gly Lys
Leu Leu Ser Ala Val Asn Ser Asn Lys Glu Leu145 150
155 160Met Ile Glu Lys Asn Tyr Arg Thr Ile Gly
Glu Met Leu Tyr Lys Asp 165 170
175Glu Lys Phe Ser Glu Tyr Lys Arg Asn Lys Ala Asp Asp Tyr Ser Asn
180 185 190Thr Phe Ala Arg Ser
Glu Tyr Glu Asp Glu Ile Arg Gln Ile Phe Ser 195
200 205Ala Gln Gln Glu His Gly Asn Pro Tyr Ala Thr Asp
Glu Leu Lys Glu 210 215 220Ser Tyr Leu
Asp Ile Tyr Leu Ser Gln Arg Ser Phe Asp Glu Gly Pro225
230 235 240Gly Gly Ser Ser Pro Tyr Gly
Gly Asn Gln Ile Glu Lys Met Ile Gly 245
250 255Asn Cys Thr Leu Glu Pro Glu Glu Lys Arg Ala Ala
Lys Ala Thr Phe 260 265 270Ser
Phe Glu Tyr Phe Asn Leu Leu Ser Lys Val Asn Ser Ile Lys Ile 275
280 285Val Ser Ser Ser Gly Lys Arg Ala Leu
Asn Asn Asp Glu Arg Gln Ser 290 295
300Val Ile Arg Leu Ala Phe Ala Lys Asn Ala Ile Ser Tyr Thr Ser Leu305
310 315 320Arg Lys Glu Leu
Asn Met Glu Tyr Ser Glu Arg Phe Asn Ile Ser Tyr 325
330 335Ser Gln Ser Asp Lys Ser Ile Glu Glu Ile
Glu Lys Lys Thr Lys Phe 340 345
350Thr Tyr Leu Thr Ala Tyr His Thr Phe Lys Lys Ala Tyr Gly Ser Val
355 360 365Phe Val Glu Trp Ser Ala Asp
Lys Lys Asn Ser Leu Ala Tyr Ala Leu 370 375
380Thr Ala Tyr Lys Asn Asp Thr Lys Ile Ile Glu Tyr Leu Thr Gln
Lys385 390 395 400Gly Phe
Asp Ala Ala Glu Thr Asp Ile Ala Leu Thr Leu Pro Ser Phe
405 410 415Ser Lys Trp Gly Asn Leu Ser
Glu Lys Ala Leu Asn Asn Ile Ile Pro 420 425
430Tyr Leu Glu Gln Gly Met Leu Tyr His Asp Ala Cys Thr Ala
Ala Gly 435 440 445Tyr Asn Phe Lys
Ala Asp Asp Thr Asp Lys Arg Met Tyr Leu Pro Ala 450
455 460His Glu Lys Glu Ala Pro Glu Leu Asp Asp Ile Thr
Asn Pro Val Val465 470 475
480Arg Arg Ala Ile Ser Gln Thr Ile Lys Val Ile Asn Ala Leu Ile Arg
485 490 495Glu Met Gly Glu Ser
Pro Cys Phe Val Asn Ile Glu Leu Ala Arg Glu 500
505 510Leu Ser Lys Asn Lys Ala Glu Arg Ser Lys Ile Glu
Lys Gly Gln Lys 515 520 525Glu Asn
Gln Val Arg Asn Asp Arg Ile Met Glu Arg Leu Arg Asn Glu 530
535 540Phe Gly Leu Leu Ser Pro Thr Gly Gln Asp Leu
Ile Lys Leu Lys Leu545 550 555
560Trp Glu Glu Gln Asp Gly Ile Cys Pro Tyr Ser Leu Lys Pro Ile Lys
565 570 575Ile Glu Lys Leu
Phe Asp Val Gly Tyr Thr Asp Ile Asp Ala Ile Ile 580
585 590Pro Tyr Ser Leu Ser Phe Asp Asp Thr Tyr Asn
Asn Lys Val Leu Val 595 600 605Met
Ser Ser Glu Asn Arg Gln Lys Gly Asn Arg Ile Pro Met Gln Tyr 610
615 620Leu Glu Gly Lys Arg Gln Asp Asp Phe Trp
Leu Trp Val Asp Asn Ser625 630 635
640Asn Leu Ser Arg Arg Lys Lys Gln Asn Leu Thr Lys Glu Thr Leu
Ser 645 650 655Glu Asp Asp
Leu Ser Gly Phe Lys Lys Arg Asn Leu Gln Asp Thr Gln 660
665 670Tyr Leu Ser Arg Phe Met Met Asn Tyr Leu
Lys Lys Tyr Leu Ala Leu 675 680
685Ala Pro Asn Thr Thr Gly Arg Lys Asn Thr Ile Gln Ala Val Asn Gly 690
695 700Ala Val Thr Ser Tyr Leu Arg Lys
Arg Trp Gly Ile Gln Lys Val Arg705 710
715 720Glu Asn Gly Asp Thr His His Ala Val Asp Ala Val
Val Ile Ser Cys 725 730
735Val Thr Ala Gly Met Thr Lys Arg Val Ser Glu Tyr Ala Lys Tyr Lys
740 745 750Glu Thr Glu Phe Gln Asn
Pro Gln Thr Gly Glu Phe Phe Asp Val Asp 755 760
765Ile Arg Thr Gly Glu Val Ile Asn Arg Phe Pro Leu Pro Tyr
Ala Arg 770 775 780Phe Arg Asn Glu Leu
Leu Met Arg Cys Ser Glu Asn Pro Ser Arg Ile785 790
795 800Leu His Glu Met Pro Leu Pro Thr Tyr Ala
Ala Asp Glu Lys Val Ala 805 810
815Pro Ile Phe Val Ser Arg Met Pro Lys His Lys Val Lys Gly Ser Ala
820 825 830His Lys Glu Thr Ile
Arg Arg Ala Phe Glu Glu Asp Gly Lys Lys Tyr 835
840 845Thr Val Ser Lys Val Pro Leu Thr Asp Leu Lys Leu
Lys Asn Gly Glu 850 855 860Ile Glu Asn
Tyr Tyr Asn Pro Glu Ser Asp Gly Leu Leu Tyr Asn Ala865
870 875 880Leu Lys Glu Gln Leu Ile Ala
Phe Gly Gly Asp Ala Ala Lys Ala Phe 885
890 895Glu Gln Pro Phe Tyr Lys Pro Lys Ser Asp Gly Ser
Glu Gly Pro Leu 900 905 910Val
Lys Lys Val Lys Leu Ile Asn Lys Ala Thr Leu Thr Val Pro Val 915
920 925Leu Asn Asn Thr Ala Val Ala Asp Asn
Gly Ser Met Val Arg Val Asp 930 935
940Val Phe Phe Val Glu Gly Glu Gly Tyr Tyr Leu Val Pro Ile Tyr Val945
950 955 960Ala Asp Thr Val
Lys Lys Glu Leu Pro Asn Lys Ala Ile Ile Ala Asn 965
970 975Lys Pro Tyr Glu Glu Trp Lys Glu Met Arg
Glu Glu Asn Phe Val Phe 980 985
990Ser Leu Tyr Pro Asn Asp Leu Ile Lys Ile Ser Ser Arg Lys Asp Met
995 1000 1005Lys Phe Asn Leu Val Asn
Lys Glu Ser Thr Leu Ala Pro Asn Cys 1010 1015
1020Gln Ser Lys Glu Ala Leu Val Tyr Tyr Lys Gly Ser Asp Ile
Ser 1025 1030 1035Thr Ala Ala Val Thr
Ala Ile Asn His Asp Asn Thr Tyr Lys Leu 1040 1045
1050Arg Gly Leu Gly Val Lys Thr Leu Leu Lys Ile Glu Lys
Tyr Gln 1055 1060 1065Val Asp Val Leu
Gly Asn Val Phe Lys Val Gly Lys Glu Lys Arg 1070
1075 1080Val Arg Phe Lys 10851801095PRTArtificial
SequenceENGINEERED NICKASE 180Met Arg Pro Tyr Ala Ile Gly Leu Asp Ile Gly
Ile Thr Ser Val Gly1 5 10
15Trp Ala Thr Val Ala Leu Asp Ala Asp Glu Ser Pro Cys Gly Ile Ile
20 25 30Gly Leu Gly Ser Arg Ile Phe
Asp Ala Ala Glu Gln Pro Lys Thr Gly 35 40
45Glu Ser Leu Ala Ala Pro Arg Arg Ala Ala Arg Gly Ser Arg Arg
Arg 50 55 60Leu Arg Arg His Arg His
Arg Asn Glu Arg Ile Arg Ser Leu Met Leu65 70
75 80Glu Glu Arg Leu Ile Ser Gln Asp Glu Leu Glu
Thr Leu Phe Asp Gly 85 90
95Arg Leu Glu Asp Ile Tyr Ala Leu Arg Val Lys Ala Leu Asp Glu Ile
100 105 110Val Ser Arg Thr Asp Phe
Ala Arg Ile Leu Leu His Ile Ser Gln Arg 115 120
125Arg Gly Phe Lys Ser Asn Arg Lys Asn Pro Thr Thr Lys Glu
Asp Gly 130 135 140Val Leu Leu Ala Ala
Val Asn Glu Asn Lys Gln Arg Met Ser Glu His145 150
155 160Gly Tyr Arg Thr Val Gly Glu Met Phe Leu
Leu Asp Glu Thr Phe Lys 165 170
175Asp His Lys Arg Asn Lys Gly Gly Asn Tyr Ile Thr Thr Val Ala Arg
180 185 190Asp Met Val Ala Asp
Glu Val Arg Ala Ile Phe Ser Ala Gln Arg Glu 195
200 205Leu Gly Ala Ser Phe Ala Ser Glu Glu Phe Glu Glu
Arg Tyr Leu Glu 210 215 220Ile Leu Leu
Ser Gln Arg Ser Phe Asp Glu Gly Pro Gly Gly Asn Ser225
230 235 240Pro Tyr Gly Gly Ser Gln Ile
Glu Arg Met Val Gly Arg Cys Thr Phe 245
250 255Phe Pro Asp Glu Pro Arg Ala Ala Lys Ala Thr Tyr
Ser Phe Glu Tyr 260 265 270Phe
Thr Leu Leu Gln Lys Val Asn His Ile Arg Ile Val Glu Asn Gly 275
280 285Val Ala Ser Lys Leu Thr Asp Glu Gln
Arg Arg Ile Ile Ile Glu Leu 290 295
300Ala His Thr Thr Lys Asp Val Ser Tyr Ala Lys Ile Arg Lys Val Leu305
310 315 320Lys Leu Ser Asp
Lys Gln Leu Phe Asn Ile Arg Tyr Ser Asp Asn Ser 325
330 335Pro Ala Glu Asp Ser Glu Lys Lys Glu Lys
Leu Gly Ile Met Lys Ala 340 345
350Tyr His Gln Met Arg Ser Ala Ile Asp Arg Val Ser Lys Gly Arg Phe
355 360 365Ala Met Met Pro Arg Ala Gln
Arg Asn Ala Ile Gly Thr Ala Leu Ser 370 375
380Leu Tyr Lys Thr Ser Asp Lys Ile Arg Lys Tyr Leu Thr Asp Ala
Gly385 390 395 400Leu Asp
Glu Ile Asp Ile Asn Ser Ala Asp Ser Ile Gly Ser Phe Ser
405 410 415Lys Phe Gly His Ile Ser Val
Lys Ala Cys Asp Met Leu Ile Pro Phe 420 425
430Leu Glu Gln Gly Met Asn Tyr Asn Glu Ala Cys Ala Ala Ala
Gly Leu 435 440 445Asn Phe Lys Gly
His Asp Ala Gly Glu Lys Ser Lys Leu Leu His Pro 450
455 460Lys Glu Glu Asp Tyr Glu Asp Ile Thr Ser Pro Val
Val Arg Arg Ala465 470 475
480Ile Ala Gln Thr Ile Lys Val Ile Asn Ala Ile Ile Arg Arg Glu Gly
485 490 495Cys Ser Pro Thr Phe
Ile Asn Ile Glu Leu Ala Arg Glu Met Ala Lys 500
505 510Asp Phe Arg Glu Arg Asn Arg Ile Lys Lys Glu Asn
Asp Asp Asn Arg 515 520 525Ala Lys
Asn Glu Arg Leu Leu Glu Arg Ile Arg Thr Glu Tyr Gly Lys 530
535 540Asn Asn Pro Thr Gly Leu Asp Leu Val Lys Leu
Arg Leu Tyr Glu Glu545 550 555
560Gln Ser Gly Val Cys Met Tyr Ser Leu Lys Gln Met Ser Leu Glu Lys
565 570 575Leu Phe Glu Pro
Asn Tyr Ala Glu Val Asp Ala Ile Val Pro Tyr Ser 580
585 590Ile Ser Phe Asp Asp Ser Arg Lys Asn Lys Val
Leu Val Leu Thr Glu 595 600 605Glu
Asn Arg Asn Lys Gly Asn Arg Leu Pro Leu Gln Tyr Leu Lys Gly 610
615 620Arg Arg Arg Glu Asp Phe Ile Val Trp Val
Asn Asn Asn Val Lys Asp625 630 635
640Tyr Arg Lys Arg Arg Leu Leu Leu Lys Glu Glu Leu Thr Ala Glu
Asp 645 650 655Glu Ser Gly
Phe Lys Glu Arg Asn Leu Gln Asp Thr Lys Thr Met Ser 660
665 670Arg Phe Leu Leu Asn Tyr Ile Ala Asp Asn
Leu Glu Phe Ala Glu Ser 675 680
685Thr Arg Gly Arg Lys Lys Lys Val Thr Ala Val Asn Gly Ala Val Thr 690
695 700Ala Tyr Met Arg Lys Arg Trp Gly
Ile Thr Lys Ile Arg Glu Asp Gly705 710
715 720Asp Cys His His Ala Val Asp Ala Val Val Ile Ala
Cys Thr Thr Asp 725 730
735Ala Met Ile Arg Gln Val Ser Arg Tyr Ala Gln Phe Arg Glu Cys Glu
740 745 750Tyr Met Gln Thr Glu Ser
Gly Ser Val Ala Val Asp Thr Gly Thr Gly 755 760
765Glu Val Leu Arg Thr Phe Pro Tyr Pro Trp Pro Asp Phe Arg
Lys Glu 770 775 780Leu Glu Ala Arg Leu
Ala Asn Asp Pro Ala Lys Val Ile Asn Asp Leu785 790
795 800His Leu Pro Phe Tyr Met Ser Ala Gly Arg
Pro Leu Pro Glu Pro Val 805 810
815Phe Val Ser Arg Met Pro Arg Arg Lys Val Thr Gly Ala Ala His Lys
820 825 830Asp Thr Ile Lys Ser
Ala Arg Glu Leu Asp Asn Gly Tyr Leu Ile Val 835
840 845Lys Arg Pro Leu Thr Asp Leu Lys Leu Lys Asn Gly
Glu Ile Glu Asn 850 855 860Tyr Tyr Asn
Pro Gln Ser Asp Lys Cys Leu Tyr Asp Ala Leu Lys Asn865
870 875 880Ala Leu Ile Glu His Gly Gly
Asp Ala Lys Lys Ala Phe Ala Gly Glu 885
890 895Phe Arg Lys Pro Lys Arg Asp Gly Thr Pro Gly Pro
Ile Val Lys Lys 900 905 910Val
Lys Leu Leu Glu Pro Thr Thr Met Cys Val Pro Val His Gly Gly 915
920 925Lys Gly Ala Ala Asp Asn Asp Ser Met
Val Arg Val Asp Val Phe Leu 930 935
940Ser Gly Gly Lys Tyr Tyr Leu Val Pro Ile Tyr Val Ala Asp Thr Leu945
950 955 960Lys Pro Glu Leu
Pro Asn Lys Ala Val Thr Arg Gly Lys Lys Tyr Ser 965
970 975Glu Trp Leu Glu Met Ala Asp Glu Asp Phe
Ile Phe Ser Leu Tyr Pro 980 985
990Asn Asp Leu Ile Cys Ala Thr Ser Lys Asn Gly Ile Thr Leu Ser Val
995 1000 1005Cys Arg Lys Asp Ser Thr
Leu Pro Pro Thr Val Glu Ser Lys Ser 1010 1015
1020Phe Met Leu Tyr Tyr Arg Gly Thr Asp Ile Ser Thr Gly Ser
Ile 1025 1030 1035Ser Cys Ile Thr His
Asp Asn Ala Tyr Lys Leu Arg Gly Leu Gly 1040 1045
1050Val Lys Thr Leu Glu Lys Leu Glu Lys Tyr Thr Val Asp
Val Leu 1055 1060 1065Gly Glu Tyr His
Lys Val Gly Lys Glu Val Arg Gln Pro Phe Asn 1070
1075 1080Ile Lys Arg Arg Lys Ala Cys Pro Ser Glu Met
Leu 1085 1090 10951811095PRTArtificial
SequenceENGINEERED NICKASE 181Met Arg Pro Tyr Ala Ile Gly Leu Asp Ile Gly
Ile Thr Ser Val Gly1 5 10
15Trp Ala Thr Val Ala Leu Asp Ala Asp Glu Ser Pro Cys Gly Ile Ile
20 25 30Gly Leu Gly Ser Arg Ile Phe
Asp Ala Ala Glu Gln Pro Lys Thr Gly 35 40
45Glu Ser Leu Ala Ala Pro Arg Arg Ala Ala Arg Gly Ser Arg Arg
Arg 50 55 60Leu Arg Arg His Arg His
Arg Asn Glu Arg Ile Arg Ser Leu Met Leu65 70
75 80Glu Glu Arg Leu Ile Ser Gln Asp Glu Leu Glu
Thr Leu Phe Asp Gly 85 90
95Arg Leu Glu Asp Ile Tyr Ala Leu Arg Val Lys Ala Leu Asp Glu Ile
100 105 110Val Ser Arg Thr Asp Phe
Ala Arg Ile Leu Leu His Ile Ser Gln Arg 115 120
125Arg Gly Phe Lys Ser Asn Arg Lys Asn Pro Thr Thr Lys Glu
Asp Gly 130 135 140Val Leu Leu Ala Ala
Val Asn Glu Asn Lys Gln Arg Met Ser Glu His145 150
155 160Gly Tyr Arg Thr Val Gly Glu Met Phe Leu
Leu Asp Glu Thr Phe Lys 165 170
175Asp His Lys Arg Asn Lys Gly Gly Asn Tyr Ile Thr Thr Val Ala Arg
180 185 190Asp Met Val Ala Asp
Glu Val Arg Ala Ile Phe Ser Ala Gln Arg Glu 195
200 205Leu Gly Ala Ser Phe Ala Ser Glu Glu Phe Glu Glu
Arg Tyr Leu Glu 210 215 220Ile Leu Leu
Ser Gln Arg Ser Phe Asp Glu Gly Pro Gly Gly Asn Ser225
230 235 240Pro Tyr Gly Gly Ser Gln Ile
Glu Arg Met Val Gly Arg Cys Thr Phe 245
250 255Phe Pro Asp Glu Pro Arg Ala Ala Lys Ala Thr Tyr
Ser Phe Glu Tyr 260 265 270Phe
Thr Leu Leu Gln Lys Val Asn His Ile Arg Ile Val Glu Asn Gly 275
280 285Val Ala Ser Lys Leu Thr Asp Glu Gln
Arg Arg Ile Ile Ile Glu Leu 290 295
300Ala His Thr Thr Lys Asp Val Ser Tyr Ala Lys Ile Arg Lys Val Leu305
310 315 320Lys Leu Ser Asp
Lys Gln Leu Phe Asn Ile Arg Tyr Ser Asp Asn Ser 325
330 335Pro Ala Glu Asp Ser Glu Lys Lys Glu Lys
Leu Gly Ile Met Lys Ala 340 345
350Tyr His Gln Met Arg Ser Ala Ile Asp Arg Val Ser Lys Gly Arg Phe
355 360 365Ala Met Met Pro Arg Ala Gln
Arg Asn Ala Ile Gly Thr Ala Leu Ser 370 375
380Leu Tyr Lys Thr Ser Asp Lys Ile Arg Lys Tyr Leu Thr Asp Ala
Gly385 390 395 400Leu Asp
Glu Ile Asp Ile Asn Ser Ala Asp Ser Ile Gly Ser Phe Ser
405 410 415Lys Phe Gly His Ile Ser Val
Lys Ala Cys Asp Met Leu Ile Pro Phe 420 425
430Leu Glu Gln Gly Met Asn Tyr Asn Glu Ala Cys Ala Ala Ala
Gly Leu 435 440 445Asn Phe Lys Gly
His Asp Ala Gly Glu Lys Ser Lys Leu Leu His Pro 450
455 460Lys Glu Glu Asp Tyr Glu Asp Ile Thr Ser Pro Val
Val Arg Arg Ala465 470 475
480Ile Ala Gln Thr Ile Lys Val Ile Asn Ala Ile Ile Arg Arg Glu Gly
485 490 495Cys Ser Pro Thr Phe
Ile Asn Ile Glu Leu Ala Arg Glu Met Ala Lys 500
505 510Asp Phe Arg Glu Arg Asn Arg Ile Lys Lys Glu Asn
Asp Asp Asn Arg 515 520 525Ala Lys
Asn Glu Arg Leu Leu Glu Arg Ile Arg Thr Glu Tyr Gly Lys 530
535 540Asn Asn Pro Thr Gly Leu Asp Leu Val Lys Leu
Arg Leu Tyr Glu Glu545 550 555
560Gln Ser Gly Val Cys Met Tyr Ser Leu Lys Gln Met Ser Leu Glu Lys
565 570 575Leu Phe Glu Pro
Asn Tyr Ala Glu Val Asp His Ile Val Pro Tyr Ser 580
585 590Ile Ser Phe Asp Asp Ser Arg Lys Asn Lys Val
Leu Val Leu Thr Glu 595 600 605Glu
Asn Arg Asn Lys Gly Asn Arg Leu Pro Leu Gln Tyr Leu Lys Gly 610
615 620Arg Arg Arg Glu Asp Phe Ile Val Trp Val
Asn Asn Asn Val Lys Asp625 630 635
640Tyr Arg Lys Arg Arg Leu Leu Leu Lys Glu Glu Leu Thr Ala Glu
Asp 645 650 655Glu Ser Gly
Phe Lys Glu Arg Asn Leu Gln Asp Thr Lys Thr Met Ser 660
665 670Arg Phe Leu Leu Asn Tyr Ile Ala Asp Asn
Leu Glu Phe Ala Glu Ser 675 680
685Thr Arg Gly Arg Lys Lys Lys Val Thr Ala Val Asn Gly Ala Val Thr 690
695 700Ala Tyr Met Arg Lys Arg Trp Gly
Ile Thr Lys Ile Arg Glu Asp Gly705 710
715 720Asp Cys His His Ala Val Asp Ala Val Val Ile Ala
Cys Thr Thr Asp 725 730
735Ala Met Ile Arg Gln Val Ser Arg Tyr Ala Gln Phe Arg Glu Cys Glu
740 745 750Tyr Met Gln Thr Glu Ser
Gly Ser Val Ala Val Asp Thr Gly Thr Gly 755 760
765Glu Val Leu Arg Thr Phe Pro Tyr Pro Trp Pro Asp Phe Arg
Lys Glu 770 775 780Leu Glu Ala Arg Leu
Ala Asn Asp Pro Ala Lys Val Ile Asn Asp Leu785 790
795 800His Leu Pro Phe Tyr Met Ser Ala Gly Arg
Pro Leu Pro Glu Pro Val 805 810
815Phe Val Ser Arg Met Pro Arg Arg Lys Val Thr Gly Ala Ala His Lys
820 825 830Asp Thr Ile Lys Ser
Ala Arg Glu Leu Asp Asn Gly Tyr Leu Ile Val 835
840 845Lys Arg Pro Leu Thr Asp Leu Lys Leu Lys Asn Gly
Glu Ile Glu Asn 850 855 860Tyr Tyr Asn
Pro Gln Ser Asp Lys Cys Leu Tyr Asp Ala Leu Lys Asn865
870 875 880Ala Leu Ile Glu His Gly Gly
Asp Ala Lys Lys Ala Phe Ala Gly Glu 885
890 895Phe Arg Lys Pro Lys Arg Asp Gly Thr Pro Gly Pro
Ile Val Lys Lys 900 905 910Val
Lys Leu Leu Glu Pro Thr Thr Met Cys Val Pro Val His Gly Gly 915
920 925Lys Gly Ala Ala Asp Asn Asp Ser Met
Val Arg Val Asp Val Phe Leu 930 935
940Ser Gly Gly Lys Tyr Tyr Leu Val Pro Ile Tyr Val Ala Asp Thr Leu945
950 955 960Lys Pro Glu Leu
Pro Ala Lys Ala Val Thr Arg Gly Lys Lys Tyr Ser 965
970 975Glu Trp Leu Glu Met Ala Asp Glu Asp Phe
Ile Phe Ser Leu Tyr Pro 980 985
990Asn Asp Leu Ile Cys Ala Thr Ser Lys Asn Gly Ile Thr Leu Ser Val
995 1000 1005Cys Arg Lys Asp Ser Thr
Leu Pro Pro Thr Val Glu Ser Lys Ser 1010 1015
1020Phe Met Leu Tyr Tyr Arg Gly Thr Asp Ile Ser Thr Gly Ser
Ile 1025 1030 1035Ser Cys Ile Thr His
Asp Asn Ala Tyr Lys Leu Arg Gly Leu Gly 1040 1045
1050Val Lys Thr Leu Glu Lys Leu Glu Lys Tyr Thr Val Asp
Val Leu 1055 1060 1065Gly Glu Tyr His
Lys Val Gly Lys Glu Val Arg Gln Pro Phe Asn 1070
1075 1080Ile Lys Arg Arg Lys Ala Cys Pro Ser Glu Met
Leu 1085 1090 1095182129DNAArtificial
SequenceGRNA 182gttatagttc cctgttcgtt cttggtatgg tataatgaaa ttataccata
ccaagaacga 60agcaggttac tatgataagg tagtataccg cagagctcca acgcctcgct
tttgcggggc 120gttgtctct
129