Patent application title: RECOMBINANT FLAVIN-ADENINE DINUCLEOTIDE GLUCOSE DEHYDROGENASE AND USES THEREOF
Inventors:
IPC8 Class: AC12N904FI
USPC Class:
Class name:
Publication date: 2022-05-26
Patent application number: 20220162569
Abstract:
A recombinant protein, including: (a) alpha subunit of an FAD-GDH; and
(b) a minimal cytochrome c peptide is provided. Additionally, an
electrode coupled to a recombinant protein, the recombinant protein made
of: (a) a cofactor of a redox enzyme; (b) a redox enzyme; (c) a linker
moiety configured to link any one of: the cofactor or the enzyme to an
electron transfer (ET) domain; and (d) an ET domain, is also provided.
Methods for: (a) transferring an electron to an electrode, by coupling
the recombinant protein to an electrode; and (b) quantifying the amount of
an analyte, e.g., glucose, are also provided.
Claims:
1. A recombinant protein, comprising: (a) alpha subunit of a flavin
adenine dinucleotide-glucose dehydrogenase (FAD-GDH); and (b) a minimal
c-type cytochrome peptide.
2. The recombinant protein of claim 1, wherein said alpha subunit of an FAD-GDH is a Burkholderia cepacia alpha subunit of an FAD-GDH.
3. The recombinant protein of claim 1, wherein said minimal C-type cytochrome peptide is a cytochrome second magnetochrome domain (MCR-2) from magnetosome protein MamP.
4. The recombinant protein of claim 1, further being bound to a porphyrin comprising a metal.
5. The recombinant protein of claim 1, further being bound to a compound of formula I: ##STR00003## wherein said R is any electron donor, said compound of formula I is bound to a metal.
6. The recombinant protein of claim 4, wherein said metal is a trivalent metal or a divalent metal.
7. The recombinant protein of claim 1, comprising an amino acid sequence selected from the group consisting of: SEQ ID NOs: 10-13.
8. The recombinant protein of claim 1, wherein said alpha subunit of said FAD-GDH comprises the amino acid sequence set forth in SEQ ID NO: 8.
9. The recombinant protein of claim 1, further comprising a substitution of at least one amino acid residue of said alpha subunit of said FAD-GDH, said minimal c-type cytochrome peptide, or both, with a non-canonical amino acid residue.
10. The recombinant protein of claim 9, being coupled to an electrode via said non-canonical amino acid (ncAA) residue.
11. A composition comprising the recombinant protein of claim 1 and the gamma subunit of an FAD-GDH.
12. A DNA molecule comprising a nucleotide sequence encoding the recombinant protein of claim 1.
13. The DNA molecule of claim 12, comprising a nucleic acid sequence selected from the group consisting of: SEQ ID NOs: 6, 7, and 10.
14. An expression vector comprising the DNA molecule of claim 12.
15. A cell comprising the DNA molecule of claim 12.
16. A method for transferring an electron to an electrode, comprising coupling the recombinant protein of claim 9 to an electrode, wherein the recombinant protein is further bound to porphyrin comprising a metal, thereby transferring an electron to said electrode.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a divisional of U.S. patent application Ser. No. 16/636,013 filed on Feb. 2, 2020, which is a National Phase of PCT Patent Application No. PCT/IL2018/050863 having International filing date of Aug. 2, 2018 titled "A RECOMBINANT FLAVIN-ADENINE DINUCLEOTIDE GLUCOSE DEHYDROGENASE AND USES THEREOF", which claims the benefit of priority from Israeli Patent Application No. 253801 filed on Aug. 2, 2017. The contents of the above applications are all incorporated by reference as if fully set forth herein in their entirety.
FIELD OF INVENTION
[0002] The present invention relates to recombinant proteins and methods of using the same, such as for direct electron transfer in bio-electrochemical applications.
BACKGROUND OF THE INVENTION
[0003] Redox enzymes are proteins that participate in biocatalytic processes which involve electron transfer (ET). Depending on their redox potential, enzyme mediated redox reactions may be used in anodes and cathodes of biofuel cells as well as in biosensing applications. For the utilization of most redox enzymes in such devices, a mediator molecule should be used to mediate the ET between the enzyme and the electrode. The use of an external redox mediator results in a potential loss as well as in low power outputs.
[0004] Redox mediator molecules introduce two major challenges to the system. The first is having a midpoint potential value that affords an efficient electron transfer from an enzyme to the electrode, which results in insufficient energy production. The other is the need for diffusion of the mediator molecule through the solution towards the electrode, which might cause an additional loss of energy.
SUMMARY OF THE INVENTION
[0005] The present invention relates to a recombinant protein having superior FAD-GDH (flavin-adenine dinucleotide glucose dehydrogenase) activity and methods of using the same.
[0006] According to an aspect of some embodiments of the present invention there is provided a recombinant protein, comprising: (a) alpha subunit of an FAD-GDH; and (b) a minimal c-type cytochrome peptide.
[0007] In some embodiments, the minimal c-type cytochrome peptide comprises 11 to 30 amino acids.
[0008] In some embodiments, the alpha subunit of an FAD-GDH is a Burkholderia cepacia alpha subunit of an FAD-GDH.
[0009] In some embodiments, the alpha subunit of an FAD-GDH is derived from a thermostable enzyme, an oxygen independent enzyme, or both. In some embodiments, the minimal cytochrome peptide is a cytochrome domain MCR-2 from a MamP protein.
[0010] In some embodiments, the minimal cytochrome peptide is a magnetotactic bacterium minimal cytochrome peptide.
[0011] In some embodiments, the protein has a molecular weight in the range of 63 to 65 kDa.
[0012] In some embodiments, the recombinant protein is bound to a porphyrin comprising a metal.
[0013] In some embodiments, the recombinant protein is bound to a compound of formula I:
##STR00001##
wherein the R is any electron donor, the compound of formula I is bound to a metal.
[0014] In some embodiments, the metal is a trivalent metal or a divalent metal.
[0015] In some embodiments, the recombinant protein has both peroxidase activity and oxidative activity.
[0016] In some embodiments, the recombinant protein comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 8 to 13.
[0017] In some embodiments, there is provided a composition comprising the recombinant protein disclosed herein in an embodiment thereof, and the gamma subunit of an FAD-GDH.
[0018] In some embodiments, there is provided a composition comprising the recombinant protein disclosed herein in an embodiment thereof, and the gamma subunit of an FAD-GDH as two separate unbound proteins.
[0019] In some embodiments, there is provided a DNA molecule comprising a nucleotide sequence encoding the disclosed recombinant protein in an embodiment thereof with or without the gamma subunit of an FAD-GDH.
[0020] In some embodiments, the DNA molecule comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 5-7 and 10.
[0021] In some embodiments, there is provided an expression vector comprising the DNA molecule disclosed herein in an embodiment thereof.
[0022] In some embodiments, there is provided a cell comprising the DNA molecule disclosed herein in an embodiment thereof, the recombinant protein disclosed herein in an embodiment thereof, or both.
[0023] In some embodiments, the cell is a prokaryotic cell.
[0024] In some embodiments, there is provided the recombinant protein disclosed herein in an embodiment thereof, coupled to an electrode. In some embodiments, the coupling is covalent.
[0025] In some embodiments, the recombinant protein is coupled to an electrode via a mediator molecule comprising a non-canonical amino acid (ncAA) residue.
[0026] According to another aspect, there is provided an electrode coupled to a recombinant protein, the recombinant protein comprising:
[0027] (a) a cofactor of a redox enzyme; (b) a redox enzyme; (c) a linker moiety configured to link any one of: the cofactor or the enzyme to an electron transfer (ET) domain; and (d) an ET domain, configured to transfer electrons between the electrode and the cofactor; wherein a distance between the ET domain and the electrode is in the range of 0 to 14 Å.
[0028] In some embodiments, the recombinant protein is in the form of formula: (a)-(b)-(c)-(d).
[0029] In some embodiments, the cofactor is selected from the group consisting of: FAD, NAD+, and NADP+.
[0030] In some embodiments, the redox enzyme is selected from an oxidase, a dehydrogenase, and a malic enzyme.
[0031] In some embodiments, the redox enzyme has a redox potential of less than 50 mV vs. Ag/AgCl.
[0032] In some embodiments, the ET domain comprises a minimal c-type cytochrome peptide.
[0033] In some embodiments, the electrode comprises a material selected from gold, graphite, and glassy carbon electrode (GCE).
[0034] In some embodiments, the ET domain is attached to the electrode.
[0035] In some embodiments, the ET domain is covalently attached to the electrode via a mediator molecule. In some embodiments, the mediator molecule comprises one or more non-canonical amino acids.
[0036] In some embodiments, the one or more non-canonical amino acids comprise Propargyl-lysine (PrK).
[0037] In some embodiments, the mediator molecule comprises a polycyclic aromatic system.
[0038] In some embodiments, there is provided a device comprising the electrode.
[0039] In some embodiments, the device is for biosensing glucose in a medium.
[0040] According to another aspect, there is provided a method for transferring an electron to an electrode, comprising coupling the disclosed recombinant protein in an embodiment thereof to an electrode, thereby transferring an electron to the electrode.
[0041] In some embodiments, the coupling is in the absence of a mediator molecule.
[0042] According to another aspect, there is provided a method for quantifying the amount of a reporter in a sample having a first detectable range of light absorbance in an oxidized state and a second range of light absorbance in a non-oxidized state, comprising:
(a) contacting the disclosed recombinant protein in an embodiment thereof with the reporter in a non-oxidized state; and (b) measuring the amount of the reporter in an oxidized state, thereby quantifying the amount of a reporter in a sample.
[0043] In some embodiments, the first detectable range of light absorbance is detectable in visible light and the second range of light absorbance is non-detectable in visible light.
[0044] In some embodiments, the reporter is 2,6-Dichloroindophenol.
[0045] In some embodiments, the 2,6-Dichloroindophenol is coupled to glucose.
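A reporter that changes its visible-light absorbance with oxidation state is commonly quantified through the Beer-Lambert law, A = ε·c·l. The following is a minimal sketch of that conversion, not a procedure taken from this application: the function name, the extinction coefficient, and the path length are all illustrative assumptions, and a measured or literature value for the reporter at the chosen wavelength and pH would have to be substituted.

```python
def concentration_from_absorbance(absorbance, epsilon_per_mM_cm, path_cm=1.0):
    """Beer-Lambert law: A = epsilon * c * l, solved for c (in mM).

    epsilon_per_mM_cm and path_cm are placeholders supplied by the user;
    neither value is specified in this document.
    """
    if epsilon_per_mM_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance / (epsilon_per_mM_cm * path_cm)


def reporter_converted(abs_start, abs_end, epsilon_per_mM_cm, path_cm=1.0):
    """Amount of reporter (mM) that changed oxidation state between two readings."""
    return concentration_from_absorbance(
        abs_start - abs_end, epsilon_per_mM_cm, path_cm
    )


if __name__ == "__main__":
    # Hypothetical readings and a hypothetical extinction coefficient.
    print(reporter_converted(abs_start=0.80, abs_end=0.30, epsilon_per_mM_cm=20.0))
```

Working with the difference between two readings, rather than a single reading, cancels any constant background absorbance of the sample.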
[0046] In another aspect, there is provided a method for determining an analyte in a liquid medium, the analyte being capable of undergoing a biocatalytic oxidation or reduction reaction in the presence of an oxidizer or a reducer, respectively, the method comprising:
[0047] (i) providing the disclosed device in an embodiment thereof;
[0048] (ii) contacting the device with the liquid medium;
[0049] (iii) measuring the electric signal generated between the cathode and the anode, the electric signal being indicative of the presence and/or the concentration of the analyte; and
[0050] (iv) determining the analyte based on the electric signal.
[0051] In some embodiments, the analyte comprises glucose.
[0052] Unless otherwise defined, all technical and/or scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of embodiments of the invention, exemplary methods and/or materials are described below. In case of conflict, the patent specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and are not intended to be necessarily limiting.
BRIEF DESCRIPTION OF THE DRAWINGS
[0053] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
[0054] Some embodiments of the invention are herein described, by way of example only, with reference to the accompanying drawings and images. With specific reference now to the drawings and images in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of embodiments of the invention. In this regard, the description taken with the drawings and images makes apparent to those skilled in the art how embodiments of the invention may be practiced.
[0055] FIG. 1 is a 3D model of flavin-adenine dinucleotide dependent glucose dehydrogenase FAD-GDH-MCD (FGM) based on the structure of GDH, predicted by homology to formate oxidase using Swiss-model (3q9t) and on the structure of MCD from MamP crystal structure (4jj0). FAD-GDH from Burkholderia cepacia is presented with its FAD binding motif (orange, "1") and the FAD co-factor MCD model (light orange, "2") and the linker (grey, "3") were cut from the known MamP 3D structure and include the heme binding motif (red, "4") and heme molecule (pink, "5"). Heme and FAD molecules were attached to the protein model manually using PyMOL.
[0056] FIGS. 2A-2B present gel micrographs: Coomassie stained sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) analysis of FGM and GDH elution fractions resulting from IMAC purification. FGM and GDH catalytic sub-units are shown between 60 to 75 kDa. Band at ca. 13 kDa in both lanes correlates to FAD-GDH γ subunit (FIG. 2A); and in-gel heme staining showing the presence of a heme molecule in FGM compared to its absence in GDH (FIG. 2B, left panel) and Anti 6×His-tag Western blot verifying the full-length protein expression.
[0057] FIGS. 3A-3D are graphs showing: 2,6-dichloroindophenol (DCIP) reduction assay comparing the oxidation of D-glucose by both FGM (■) and GDH (●) (FIG. 3A) and heme activity measurements verifying the presence of a porphyrin in FGM compared to GDH. Measurements were performed at 37 °C, 50 mM Tris buffer, pH=7.0 (FIG. 3B); Peroxidase activity interference test; FGM was compared to Glucose oxidase (GOx) which is oxygen sensitive; No hydrogen peroxide was detected upon glucose oxidation by FGM while GOx, as a positive control, showed hydrogen peroxide production and its subsequent reduction by HRP is visible (FIG. 3C); and absorbance spectrum of FGM and GDH showing peak in absorbance at 408 nm for FGM and no peak for GDH (FIG. 3D).
[0058] FIGS. 4A-4C present graphs showing cyclic voltammograms of GCE/GDH and GCE/FGM with (+) and without (-) 5 mM glucose. The measurements were performed in 150 mM phosphate-citrate buffer, pH=5.0 at room temperature vs Ag/AgCl as a reference electrode at a scan rate of 5 mV s⁻¹ (FIG. 4A) and 100 mV s⁻¹ (FIG. 4B), and square-wave voltammograms (SWVs) of GCE/GDH and GCE/FGM. Measurements performed in 150 mM phosphate-citrate buffer, pH=5.0 vs Ag/AgCl reference electrode with 5 mV steps, amplitude 10 mV and a frequency 5 Hz (FIG. 4C). Background GCE current subtracted from the signal (GCE: glassy carbon electrode).
[0059] FIGS. 5A-5B present graphs showing steady state currents from chronoamperometry measurements of GCE/GDH (●) and GCE/FGM (■) using different glucose concentrations (FIG. 5A); and Lineweaver-Burk plots of electrochemical activity for both FGM and GDH. Linear trend line equations were calculated to be y=123x+42.3 for GDH and y=10.1x+7.1 for FGM (FIG. 5B).
[0060] FIG. 6 is a map of expressed FGM. For the GDH, the MCD sequence has been removed.
[0061] FIGS. 7A-7C are graphs showing: Michaelis-Menten (FIG. 7A) and Lineweaver-Burk (FIG. 7B) plots of biochemical FAD-GDH activity for both FGM and GDH. Linear trend line equations were calculated to be y=1.2x+6.7 for GDH and y=1.2x+7.7 for FGM and chronoamperometric measurements of glucose catalytic oxidation by FGM and GDH at an applied potential of 0.0 mV (FIG. 7C).
[0062] FIGS. 8A-8C are graphs showing a GCE/FGM selectivity test that was performed by adding 3.6 mM glucose followed by two sequential additions of sugars at their relevant physiological concentrations: 1.67 and 3.3 mM galactose (- -), 0.3 and 0.6 mM lactose ( ), 2.9 and 5.8 mM maltose (red -) and 1.67 and 3.3 mM xylose (- - -) (FIG. 8A); interference by other molecules was tested by adding 3.6 mM glucose followed by two additions of 0.17 mM ascorbic acid and 0.2 mM acetaminophen at an applied potential of 0.0 mV (FIG. 8B) and 300 mV vs. Ag/AgCl (FIG. 8C).
[0063] FIG. 9 presents a 3D model in which FAD-GDH from B. cepacia (green) is presented with its FAD binding motif (orange, "1") and the FAD co-factor (blue, "2"). MCD model (cyan, "3") and the linker (grey, "4") were cut from the known MamP 3D structure and include the heme molecule (pink, "5"). The fusion shown in this figure was manually generated using PyMOL software. Possible ncAA incorporation sites were colored by proximity to FAD (red, "6"), to MCD (yellow, "7"), or distant from both FAD and MCD (black, "8").
[0064] FIGS. 10A-10B present non-limiting pyrene-azide linker structures with different lengths (FIG. 10A) and 5-tetramethylrhodamine (TAMRA)-azide chemical structure (FIG. 10B).
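The linear trend lines quoted for FIG. 5B and FIG. 7B are in Lineweaver-Burk (double-reciprocal) form, 1/v = (Km/Vmax)(1/[S]) + 1/Vmax, so apparent Km and Vmax values follow from each slope and intercept as Km = slope/intercept and Vmax = 1/intercept. The sketch below only performs that arithmetic on the four equations given above; the function name is hypothetical, and because the axis units are not stated in this section the results are left unitless.

```python
def lineweaver_burk_to_mm(slope, intercept):
    """Convert a double-reciprocal fit y = slope*x + intercept
    (y = 1/v, x = 1/[S]) into apparent (Km, Vmax)."""
    if intercept <= 0:
        raise ValueError("intercept must be positive to define Vmax")
    vmax = 1.0 / intercept
    km = slope / intercept
    return km, vmax


# Slopes and intercepts as quoted in the figure descriptions.
FITS = {
    "GDH, electrochemical (FIG. 5B)": (123.0, 42.3),
    "FGM, electrochemical (FIG. 5B)": (10.1, 7.1),
    "GDH, biochemical (FIG. 7B)": (1.2, 6.7),
    "FGM, biochemical (FIG. 7B)": (1.2, 7.7),
}

if __name__ == "__main__":
    for label, (slope, intercept) in FITS.items():
        km, vmax = lineweaver_burk_to_mm(slope, intercept)
        print(f"{label}: apparent Km = {km:.3g}, apparent Vmax = {vmax:.3g}")
```

Double-reciprocal fits weight low-substrate points heavily, so values obtained this way are best read as apparent parameters for comparing the two proteins rather than as precise constants.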
DETAILED DESCRIPTION OF THE INVENTION
[0065] Provided herein is a recombinant flavin-adenine dinucleotide glucose dehydrogenase (FAD-GDH) enzyme and polynucleotide sequences encoding the same, useful for direct electron transfer such as in bio-electrochemical applications, including, but not limited to, glucose monitoring.
[0066] As provided in some embodiments of the present invention, fusing the enzyme with a minimal cytochrome domain (MCD), instead of a large cytochrome, shortens the enzyme-electrode distance and improves the direct electron transfer (DET) capabilities of the enzyme. As demonstrated hereinbelow under a non-limiting example, the fusion enzyme communicated with an electrode directly, without the use of a mediator molecule. Direct electron transfer between the redox enzyme and an electrode resulted in enhancement of chemical detection.
[0067] As demonstrated hereinbelow, the disclosed recombinant enzyme showed a substantially reduced redox potential, e.g., from +400 mV to 0 mV, thereby improving enzyme selectivity in various electrochemical applications including glucose sensing and monitoring as well as an electromotive force in a bio-electrochemical power device.
[0068] Before explaining further embodiments of the invention in detail, it is to be understood that the invention is not necessarily limited in its application to the details set forth in the following description or exemplified by the Examples. The invention is capable of other embodiments or of being practiced or carried out in various ways.
[0069] The present invention provides, in some embodiments, a recombinant protein comprising: (a) an alpha subunit of an FAD-GDH; and (b) a minimal cytochrome peptide.
[0070] In one embodiment, the alpha subunit of the FAD-GDH is derived or recovered from a prokaryotic cell. In one embodiment, the alpha subunit of the FAD-GDH is derived or recovered from a bacterial cell. In one embodiment, the alpha subunit of the FAD-GDH is a Burkholderia cepacia alpha subunit of FAD-GDH. In one embodiment, the alpha subunit of the FAD-GDH of the present invention is derived from a thermostable enzyme, an oxygen independent enzyme, or both.
[0071] The term "thermostable enzyme" refers to an enzyme that is relatively stable to heat. The thermostable enzymes can withstand the high temperature incubation used to remove the modifier groups, typically, but not exclusively, greater than 50.degree. C., without suffering an irreversible loss of activity.
[0072] In one embodiment, the recombinant protein further comprises the gamma subunit of an FAD-GDH. In one embodiment, the invention provides a composition comprising or consisting of the recombinant protein with or without the gamma subunit of an FAD-GDH. In one embodiment, the invention provides a composition comprising or consisting of the recombinant protein and the gamma subunit of an FAD-GDH. In one embodiment, the invention provides a composition comprising or consisting of at least two different proteins: (a) the recombinant protein; and (b) the gamma subunit of an FAD-GDH. In one embodiment, the at least two different proteins are unbound. In some embodiments, the gamma subunit is from the same FAD-GDH as the alpha subunit. In some embodiments, the gamma subunit is from a different FAD-GDH than the alpha subunit. In some embodiments, the gamma subunit is from the same FAD-GDH as, or a different FAD-GDH than, the alpha subunit.
[0073] In one embodiment, the recombinant protein further comprises a minimal cytochrome peptide. In one embodiment, the minimal cytochrome peptide is a natural peptide. In one embodiment, the minimal cytochrome peptide comprises a non-natural peptide.
[0074] In one embodiment, the recombinant protein is devoid of the gamma subunit of the FAD-GDH. In one embodiment, the minimal cytochrome peptide comprises a c-type cytochrome domain. In one embodiment, the minimal cytochrome peptide does not comprise a b-type cytochrome domain. In one embodiment, the minimal cytochrome peptide comprises a c-type cytochrome domain MCR-2 from a MamP protein. In one embodiment, the minimal cytochrome peptide comprises a magnetotactic bacterium minimal cytochrome peptide. In one embodiment, the minimal cytochrome peptide is a magnetotactic bacterium minimal cytochrome peptide.
[0075] In one embodiment, the minimal cytochrome peptide is a peptide comprising or consisting of 11 to 30 amino acids. In one embodiment, the minimal cytochrome peptide is a peptide comprising or consisting of 11 to 24 amino acids. In some embodiments, the minimal domain is a naturally occurring cytochrome. In some embodiments, the minimal domain is a synthetic cytochrome.
[0076] In one embodiment, the minimal cytochrome peptide is a cytochrome peptide (e.g., c-type cytochrome) comprising or consisting of 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 amino acids, including any range therebetween. In some embodiments, the peptide comprises cytochrome functionality. In some embodiments, the peptide comprises ET functionality.
[0077] In one embodiment, the minimal cytochrome peptide is linked to the amino terminus of the alpha subunit of an FAD-GDH. In one embodiment, the minimal cytochrome peptide is linked to the carboxy terminus of the alpha subunit of an FAD-GDH. In one embodiment, the minimal cytochrome peptide is linked to the amino terminus of the gamma subunit of an FAD-GDH. In one embodiment, the minimal cytochrome peptide is linked to the carboxy terminus of the gamma subunit of an FAD-GDH.
[0078] In one embodiment, the minimal cytochrome peptide is linked to the subunit of an FAD-GDH directly or indirectly. In one embodiment, the minimal cytochrome peptide is linked to the carboxy terminus of the subunit of an FAD-GDH directly or indirectly.
[0079] In one embodiment, the recombinant protein further comprises a linker. In one embodiment, the recombinant protein further comprises a peptide linker. In one embodiment, the minimal cytochrome peptide is linked to the carboxy terminus or the amino terminus of the alpha subunit of an FAD-GDH via a peptide linker. In one embodiment, the minimal cytochrome peptide is linked to the carboxy terminus or the amino terminus of the gamma subunit of an FAD-GDH via a peptide linker.
[0080] In one embodiment, the minimal cytochrome peptide is linked to the amino or carboxy terminus of the subunit via a linker comprising or consisting of 5 to 20 amino acids. In one embodiment, the minimal cytochrome peptide is linked to the amino or carboxy terminus of the subunit via a linker comprising or consisting of 8 to 18 amino acids. In one embodiment, the minimal cytochrome peptide is linked to the amino or carboxy terminus of the subunit via a linker comprising or consisting of 12 to 15 amino acids. In one embodiment, the minimal cytochrome peptide is linked to the amino or carboxy terminus of the subunit via a linker comprising or consisting of 5 to 15 amino acids. In some embodiments, the linker is between 1 and 10, 1 and 9, 1 and 8, 1 and 7, 2 and 10, 2 and 9, 2 and 8, 2 and 7, 3 and 10, 3 and 9, 2 and 8 or 2 and 7 amino acids in length.
[0081] In some embodiments, the linker comprises 30% to 60% glycine. In some embodiments, the linker comprises 30% to 60% serine. In some embodiments, the linker is hydrophilic. In some embodiments, the linker does not cause steric hindrance. In some embodiments, the linker is a flexible linker. In some embodiments, the linker does not interfere with maturation of the porphyrin binding MCD. In some embodiments, the linker does not interfere with enzymatic activity of the other subunits. In some embodiments, the linker is not so short that the other subunit interferes with maturation of the porphyrin binding MCD. In some embodiments, the linker retains the subunits in close enough proximity to allow electron transfer. In some embodiments, the linker has a length of up to 20 Å, up to 25 Å, or up to 30 Å. In some embodiments, the linker has a length of greater than or equal to 5 Å, 10 Å, 12 Å, 15 Å or 17 Å. A non-limiting exemplary linker is a peptide comprising GSGYGSG (SEQ ID NO: 27). In some embodiments, the linker comprises or consists of SEQ ID NO: 24. In some embodiments, the linker comprises or consists of a non-peptide backbone.
[0082] In one embodiment, the linker is encoded by a DNA sequence comprising or consisting of the nucleotide sequence: GAATTCGGTTCTGGTTATGGCTCTGGTCCGCCGGGTCCG (SEQ ID NO: 4). It will be understood by a skilled artisan that synonymous substitutions may be made to this sequence.
[0083] In one embodiment, the linker is encoded by a DNA sequence comprising or consisting of a nucleotide sequence synonymous with SEQ ID NO: 4.
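The glycine and serine percentages mentioned above can be checked for any candidate linker by simple residue counting. The sketch below is illustrative only (the function name is hypothetical) and applies the count to the exemplary peptide GSGYGSG named in this paragraph.

```python
from collections import Counter


def residue_fractions(peptide):
    """Return the fraction of each residue (one-letter code) in a peptide."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    counts = Counter(peptide)
    return {aa: n / len(peptide) for aa, n in counts.items()}


if __name__ == "__main__":
    fractions = residue_fractions("GSGYGSG")
    for aa in sorted(fractions):
        print(f"{aa}: {fractions[aa]:.1%}")
```

For GSGYGSG this counts four glycines, two serines, and one tyrosine out of seven residues.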
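The relationship between the linker DNA and the linker peptide can be illustrated by translating SEQ ID NO: 4 with the standard genetic code. The following is a minimal, self-contained sketch; the helper names are hypothetical and the codon table is built from the conventional T, C, A, G ordering rather than taken from this application.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


LINKER_DNA = "GAATTCGGTTCTGGTTATGGCTCTGGTCCGCCGGGTCCG"  # SEQ ID NO: 4

if __name__ == "__main__":
    print(translate(LINKER_DNA))  # EFGSGYGSGPPGP
```

The translated peptide contains the exemplary GSGYGSG motif. Because several codons encode the same residue, many different DNA sequences translate to this same peptide, which is what a synonymous substitution relies on.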
[0084] Without being bound by any particular mechanism it is assumed that a short linker containing glycine renders the linker with a desired flexibility. Further, and without being bound by any particular mechanism it is assumed that a short linker containing serine renders the linker with a desired hydrophilicity.
[0085] Further, a shorter linker (e.g., shorter than 5 Å) could prevent proper maturation of the porphyrin binding MCD (due to its close proximity to GDH); on the other hand, a longer linker (e.g., longer than 30 Å) could prevent efficient ET between the two domains.
[0086] In one embodiment, the recombinant protein further comprises a short tag peptide (3 to 20 amino acids long). In one embodiment, the short tag peptide is a His tag. In some embodiments, the tag is a 6×His tag. Protein tags are well known in the art and any tag that does not interfere with the function (redox and ET) of the recombinant protein may be used. In some embodiments, the short tag is between 1 and 30, 1 and 25, 1 and 20, 1 and 15, 1 and 10, 2 and 30, 2 and 25, 2 and 20, 2 and 25, 2 and 10, 3 and 30, 3 and 25, 3 and 20, 3 and 15 or 3 and 10 amino acids in length. Each possibility represents a separate embodiment of the invention.
[0087] In one embodiment, the recombinant protein has a molecular weight in the range of 58 to 75 kDa. In one embodiment, the recombinant protein has a molecular weight in the range of 60 to 70 kDa. In one embodiment, the recombinant protein has a molecular weight in the range of 62 to 68 kDa. In one embodiment, the recombinant protein has a molecular weight in the range of 63 to 65 kDa.
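A molecular weight can be estimated from an amino acid sequence by summing average residue masses and adding one water for the free termini. The sketch below is a rough, illustrative calculation only: the function name is hypothetical, the table holds standard average residue masses, and bound cofactors (FAD, heme) and any modifications are ignored, so it is not the method used to obtain the ranges above.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def molecular_weight(sequence):
    """Approximate average molecular weight (Da) of an unmodified peptide."""
    # Drop whitespace so sequences wrapped across lines can be pasted in.
    residues = "".join(sequence.split()).upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    unknown = set(residues) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in residues) + WATER_MASS


if __name__ == "__main__":
    # Demonstrated on the short exemplary linker peptide GSGYGSG.
    print(f"{molecular_weight('GSGYGSG'):.2f} Da")
```

Any of the full sequences listed in this document can be passed to the same function; a trailing period copied from the listing would have to be removed first.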
[0088] In some embodiments, the amino acid sequence of the recombinant protein comprises or consists of the following sequence:
TABLE-US-00001 (SEQ ID NO: 8, shortened) MADTDTQKADVVVVGSGVAGAIVAHQLAMAGKSVILLEAGPRMPRWEIVERFRNQVDKTD FMAPYPSSAWAPHPEYGPPNDYLILKGEHKFNSQYIRAVGGTTWHWAASAWRFIPNDFKMK TVYGVGRDWPIQYDDIEHYYQRAEEELGVWGPGPEEDLYSPRKEPYPMPPLPLSFNEQTIKSA LNGYDPKFHVVTEPVARNSRPYDGRPTCCGNNNCMPICPIGAMYNGIVHVEKAEQAGAKLID SAVVYKLETGPDKRITAAVYKDKTGADHRVEGKYFVIAANGIETPKILLMSANRDFPNGVAN SSDMVGRNLMDHPGTGVSFYANEKLWPGRGPQEMTSLIGFRDGPFRANEAAKKIHLSNMSRI NQETQKIFKGGKLMKPEELDAQIRDRSARFVQFDCFHEILPQPENRIVPSKTATDAVGIPRPEIT YAIDDYVKRGAVHTREVYATAAKVLGGTEVVFNDEFAPNNHITGATIMGADARDSVVDKDC RAFDHPNLFISSSSTMPTVGTVNVTLTIAAL.
[0089] Throughout this document, in some embodiments, "shortened" or "partial" refers to a sequence lacking, e.g., a tag peptide and/or a linker.
[0090] In some embodiments, the amino acid sequence of the recombinant protein comprises or consists of the following sequence:
TABLE-US-00002 (SEQ ID NO: 9, full) MADTDTQKADVVVVGSGVAGAIVAHQLAMAGKSVILLEAGPRMPRWEIVERFRNQVDKTD FMAPYPSSAWAPHPEYGPPNDYLILKGEHKFNSQYIRAVGGTTWHWAASAWRFIPNDFKMK TVYGVGRDWPIQYDDIEHYYQRAEEELGVWGPGPEEDLYSPRKEPYPMPPLPLSFNEQTIKSA LNGYDPKFHVVTEPVARNSRPYDGRPTCCGNNNCMPICPIGAMYNGIVHVEKAEQAGAKLID SAVVYKLETGPDKRITAAVYKDKTGADHRVEGKYFVIAANGIETPKILLMSANRDFPNGVAN SSDMVGRNLMDHPGTGVSFYANEKLWPGRGPQEMTSLIGFRDGPFRANEAAKKIHLSNMSRI NQETQKIFKGGKLMKPEELDAQIRDRSARFVQFDCFHEILPQPENRIVPSKTATDAVGIPRPEIT YAIDDYVKRGAVHTREVYATAAKVLGGTEVVFNDEFAPNNHITGATIMGADARDSVVDKDC RAFDHPNLFISSSSTMPTVGTVNVTLTIAALALRMSDTLKKEVIRAGATMPHRDRGPCGACHA IIQ.
[0091] In some embodiments, the amino acid sequence of the recombinant protein comprises or consists of the following sequence:
TABLE-US-00003 (SEQ ID NO: 10, full) MADTDTQKADVVVVGSGVAGAIVAHQLAMAGKSVILLEAGPRMPRWEIVERFRNQVDKTD FMAPYPSSAWAPHPEYGPPNDYLILKGEHKFNSQYIRAVGGTTWHWAASAWRFIPNDFKMK TVYGVGRDWPIQYDDIEHYYQRAEEELGVWGPGPEEDLYSPRKEPYPMPPLPLSFNEQTIKSA LNGYDPKFHVVTEPVARNSRPYDGRPTCCGNNNCMPICPIGAMYNGIVHVEKAEQAGAKLID SAVVYKLETGPDKRITAAVYKDKTGADHRVEGKYFVIAANGIETPKILLMSANRDFPNGVAN SSDMVGRNLMDHPGTGVSFYANEKLWPGRGPQEMTSLIGFRDGPFRANEAAKKIHLSNMSRI NQETQKIFKGGKLMKPEELDAQIRDRSARFVQFDCFHEILPQPENRIVPSKTATDAVGIPRPEIT YAIDDYVKRGAVHTREVYATAAKVLGGTEVVFNDEFAPNNHITGATIMGADARDSVVDKDC RAFDHPNLFISSSSTMPTVGTVNVTLTIAALALRMSDTLKKEVEFGSGYGSGPPGPIRAGATMP HRDRGPCGACHAIIQGSGSGHHHHHH.
[0092] In some embodiments, the amino acid sequence of the recombinant protein comprises or consists of the following sequence:
TABLE-US-00004 (SEQ ID NO: 11, partial) MADTDTQKADVVVVGSGVAGAIVAHQLAMAGKSVILLEAGPRMPRWEIVERFRNQVDKTD FMAPYPSSAWAPHPEYGPPNDYLILKGEHKFNSQYIRAVGGTTWHWAASAWRFIPNDFKMK TVYGVGRDWPIQYDDIEHYYQRAEEELGVWGPGPEEDLYSPRKEPYPMPPLPLSFNEQTIKSA LNGYDPKFHVVTEPVARNSRPYDGRPTCCGNNNCMPICPIGAMYNGIVHVEKAEQAGAKLID SAVVYKLETGPDKRITAAVYKDKTGADHRVEGKYFVIAANGIETPKILLMSANRDFPNGVAN SSDMVGRNLMDHPGTGVSFYANEKLWPGRGPQEMTSLIGFRDGPFRANEAAKKIHLSNMSRI NQETQKIFKGGKLMKPEELDAQIRDRSARFVQFDCFHEILPQPENRIVPSKTATDAVGIPRPEIT YAIDDYVKRGAVHTREVYATAAKVLGGTEVVFNDEFAPNNHITGATIMGADARDSVVDKDC RAFDHPNLFISSSSTMPTVGTVNVTLTIAALALRMSDTLKKEVEFGSGYGSGPPGPIRAGATMP HRDRGPCGACHAIIQ.
An Electrode
[0093] In some embodiments, there is provided an electrode carrying or coupled to a recombinant protein comprising A, B, C, and D,
wherein:
[0094] A is a cofactor of a redox enzyme;
[0095] B is a redox enzyme;
[0096] C is a linker moiety; and
[0097] D is an electron transfer (ET) domain that is configured to transfer electrons between the electrode and A. In some embodiments, the ET domain comprises a cytochrome.
[0098] In one embodiment, the A, B, C, and D are linked to each other in the following order: A-B-C-D.
[0099] In some embodiments, the cofactor, when not bound to a linker moiety, comprises at least one pair of hydroxyl groups.
[0100] In one embodiment, there is provided a device comprising the electrode.
[0101] In one embodiment, the electrode comprises a material selected from, without being limited thereto, graphite and glassy carbon electrode (GCE).
[0102] In one embodiment, the electrode is made of or coated by an electrically conducting substance, such as, without being limited thereto, gold, platinum, silver, or a conducting glass such as indium tin oxide (ITO).
[0103] In one embodiment, "chemically attach to" refers to being attached via a covalent bond. As used herein, the term "coupled" refers to a physical attachment, such that the two are bonded together. In some embodiments, the bond is a covalent bond. In some embodiments, the bond is a synthetic bond. In some embodiments, the bond is a chemical bond.
[0104] In another embodiment, "chemically attach to" refers to being attached to the electrode ("wiring") via a synthetic linker (also referred to as an "electrode linker", "mediator molecule", or "mediator") configured to link the recombinant protein to an electrode.
[0105] In one embodiment, the wiring can be obtained by a non-specific process, by using a chemical modification and conductive matrices such as, without limitation, graphene oxide and multi-walled carbon nanotubes.
[0106] In one embodiment, the wiring is a site-specific wiring, performed e.g., by inserting at least one non-canonical amino acid in a desired site of one of the groups (A to D) that, optionally, covalently links to a moiety that binds to an electrode as described herein.
[0107] Without being bound by any particular mechanism, it is assumed that the recombinant protein disclosed herein (e.g., in the form of A-B-C-D) allows direct electron transfer (DET) via a domain that affords DET, resulting in a "built in" redox mediator.
[0108] In some embodiments, in order to achieve an efficient DET, the distance between the enzyme's active site and the electrode is as short as a few Angstroms (e.g., 1 to 20 Å). In some embodiments, the distance is between 0-20, 0-19, 0-18, 0-17, 0-16, 0-15, 0-14, 0-13, 0-12, 0-11, 0-10, 1-20, 1-19, 1-18, 1-17, 1-16, 1-15, 1-14, 1-13, 1-12, 1-11, 1-10, 2-20, 2-19, 2-18, 2-17, 2-16, 2-15, 2-14, 2-13, 2-12, 2-11, 2-10, 3-20, 3-19, 3-18, 3-17, 3-16, 3-15, 3-14, 3-13, 3-12, 3-11, or 3-10 Å. Each possibility represents a separate embodiment of the invention.
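The sensitivity of DET to distance can be illustrated with a short sketch. The exponential distance dependence used here, and the decay constant `beta_per_angstrom`, are assumptions introduced for illustration (a commonly used approximation for electron tunneling); neither is stated in this specification, and the function name is hypothetical.

```python
import math


def relative_et_rate(distance_angstrom: float, beta_per_angstrom: float = 1.1) -> float:
    """Return the electron-transfer rate relative to zero separation.

    Assumes a simple exponential decay, rate ~ exp(-beta * d).
    The default beta is an illustrative placeholder, not a measured value.
    """
    if distance_angstrom < 0:
        raise ValueError("distance must be non-negative")
    return math.exp(-beta_per_angstrom * distance_angstrom)


# Compare a few distances inside the 1 to 20 Angstrom range discussed above.
for d in (3, 10, 20):
    print(f"{d:>2} A -> relative rate {relative_et_rate(d):.3e}")
```

Under this assumed model, each additional Angstrom reduces the rate by a constant factor, which is one way to see why a minimal ET domain and a short active-site-to-electrode distance are emphasized.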
[0109] In some embodiments, when linked to the enzyme, the ET domain is minimal so as not to introduce additional insulation to the system by a complex proteinaceous matrix. In some embodiments, the ET domain is linked by a flexible linker. As described herein, the flexibility allows avoiding interruption of the catalytic redox activity. In some embodiments, the minimal domain is not more than 10, 12, 15, 17, 20, 22, 25, 27, 30, 32, 35, 37, 40, 42, 45, 47 or 50 amino acids. Each possibility represents a separate embodiment of the invention.
[0110] Embodiments of "minimal domain" are described hereinabove e.g., for the c-type cytochrome.
[0111] In some embodiments, the cofactor is selected from, without being limited thereto, FAD, NAD⁺, and NADP⁺.
[0112] In one embodiment, the redox enzyme refers to an enzyme that can catalyze a redox reaction.
[0113] In one embodiment, the redox enzyme may be selected from, without being limited thereto, oxidase, dehydrogenase, and malic enzyme (e.g., malate dehydrogenase). In some embodiments, the redox enzyme is selected from an oxidase, a dehydrogenase, a reductase, a peroxidase, a glyoxalase, a hydroxylase and a malic enzyme. In some embodiments, the redox enzyme is a sugar dehydrogenase. In some embodiments, the sugar is glucose. In some embodiments, the redox enzyme is an alcohol dehydrogenase.
[0114] In one embodiment, the redox enzyme in the immobilized group is characterized by a redox potential of less than 50 mV. In one embodiment, the redox enzyme in the immobilized group is characterized by a redox potential of 50 mV, 40 mV, 30 mV, 20 mV, 10 mV, 0 mV, -10 mV, -20 mV, -30 mV, -40 mV, -50 mV, -60 mV, -70 mV, -80 mV, -90 mV, -100 mV, -110 mV, -120 mV, -130 mV, -140 mV, -150 mV, -160 mV, -170 mV, -180 mV, -190 mV, or -200 mV (induced potential vs. Ag/AgCl) including any value and range therebetween.
[0115] In one embodiment, the dehydrogenase is selected from, without being limited thereto, alcohol dehydrogenase, glutamic acid dehydrogenase, cholesterol dehydrogenase, aldehyde dehydrogenase, glucose dehydrogenase, fructose dehydrogenase, sorbitol dehydrogenase, lactate dehydrogenase, and glycerol dehydrogenase.
[0116] In one embodiment, the linker moiety comprises a peptide. In one embodiment, the peptide comprises serine. In one embodiment, the linker moiety comprises a short peptide e.g. having 5 to 20 amino acids, or 5 to 15 amino acids, e.g., 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 amino acids, including any range therebetween. In some embodiments, the linker is selected from any embodiments of a linker described hereinabove. In some embodiments, the linker moiety is a flexible linker. In some embodiments, the linker moiety is hydrophilic. In some embodiments, the linker moiety is synthetic. In some embodiments, the linker moiety is not from cellobiose dehydrogenase. In some embodiments, the linker moiety is not from pyranose dehydrogenase. In some embodiments, the linker moiety does not interfere with enzyme function and/or DET.
[0117] A non-limiting example of a linker is a peptide comprising GSGYGSG (SEQ ID NO: 27).
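As an illustration of the A-B-C-D arrangement, the following minimal sketch locates the GSGYGSG linker in the C-terminal portion of SEQ ID NO: 10 and then searches downstream of it for a CXXCH motif, the canonical heme-binding motif of c-type cytochromes. The helper name `locate_features` and the choice to analyze only the C-terminal tail are assumptions made for brevity; the sketch does not define the boundaries of the ET domain.

```python
import re

LINKER = "GSGYGSG"  # SEQ ID NO: 27
# C-terminal tail of SEQ ID NO: 10, copied from the sequence listing above.
C_TERMINAL_TAIL = "LKKEVEFGSGYGSGPPGPIRAGATMPHRDRGPCGACHAIIQGSGSGHHHHHH"


def locate_features(seq: str, linker: str = LINKER) -> dict:
    """Find the linker and the first CXXCH motif located after it."""
    start = seq.find(linker)
    if start < 0:
        raise ValueError("linker not found in sequence")
    end = start + len(linker)
    # CXXCH: Cys, any two residues, Cys, His.
    motif = re.search(r"C..CH", seq[end:])
    return {
        "linker_start": start,  # 0-based index of the first linker residue
        "linker_end": end,      # 0-based index just past the linker
        "heme_motif": motif.group() if motif else None,
    }


features = locate_features(C_TERMINAL_TAIL)
print(features)
```

The same function can be applied to a full-length sequence; only the reported indices change.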
[0118] In one embodiment, the linker is characterized by a length of: 5 to 40, or 20 to 30 Å, e.g., 5, 10, 15, 20, 25, 30, 35, or 40 Å, including any value and range therebetween.
[0119] In one embodiment, the electron transfer domain comprises a cytochrome, e.g., MCD.
[0120] Embodiments of the cytochrome are described hereinabove. In some embodiments, the electron transfer domain comprises a cytochrome c. In some embodiments, the electron transfer domain does not comprise a cytochrome b. In some embodiments, the recombinant protein is not a naturally occurring protein.
[0121] In one embodiment, the device is a biosensor.
[0122] As used herein and in the art, biosensors are analytical devices that combine a biological material (e.g., tissues, microorganisms, enzymes, antibodies, nucleic acids etc.) or a biologically-derived material with a physicochemical transducer or transducing microsystem.
[0123] In one embodiment, the device comprises or is configured to attach to an electronic circuitry for energizing the electrode and measuring the response.
[0124] In some embodiments, the biosensor is for measuring the concentration of a sugar in a medium.
[0125] In some embodiments, the biosensor is for measuring the concentration of glucose in a medium. In some embodiments, the biosensor is for measuring the concentration of alcohol in a medium.
[0126] In some embodiments, the medium is a bodily fluid. In some embodiments, the bodily fluid is selected from at least one of: blood, serum, gastric fluid, intestinal fluid, saliva, bile, tumor fluid, urine, breast milk, interstitial fluid, and stool. In some embodiments, the biosensor is capable of measuring whole blood, serum or plasma glucose without dilution or sample processing, and can thereby evaluate the current glucose status of an individual suffering from diabetes. In some embodiments, the bodily fluid is undiluted. In some embodiments, the bodily fluid is diluted with a buffer.
[0127] In some embodiments, the biosensor comprises an anode compartment and a cathode compartment, the compartments being in fluid communication, wherein the anode compartment comprises an anode electrode and a substrate; and wherein the cathode compartment comprises a cathode electrode, the anode electrode and cathode electrode in electrical communication.
[0128] In some embodiments, the electrode disclosed herein is the anode.
[0129] According to an aspect, the present invention provides a method for determining an analyte in a liquid medium, the analyte being capable of undergoing a biocatalytic oxidation or reduction reaction in the presence of an oxidizer or a reducer, respectively, the method comprising:
(i) providing the disclosed device in an embodiment thereof; (ii) contacting the device with the liquid medium; (iii) measuring the electric signal generated between the cathode and the anode, the electric signal being indicative of the presence and/or the concentration of the analyte; and (iv) determining the analyte based on the electric signal.
[0130] In some embodiments, the method is for determining the presence of the analyte in the medium. In some embodiments, the method is for determining the concentration of the analyte in the medium. When the liquid medium is a body fluid, e.g., blood, lymph fluid or cerebrospinal fluid, the method may be carried out in an invasive manner, in which case the method comprises inserting the biosensor into the body, bringing it into contact with the body fluid, and determining the analyte in the body fluid within the body. Alternatively, body fluids or any other analyte may be tested non-invasively, and in such cases the method may comprise adding a buffer to the fluid. In some embodiments, the buffer has pH 4 to 8. In some embodiments, the buffer has pH 5 (±1).
[0131] In some embodiments, the contacting is in vivo. In some embodiments, the contacting is ex vivo. In some embodiments, a detected electrical signal indicates the analyte is present. In some embodiments, the greater the electrical signal the greater the concentration of analyte. In some embodiments, the electrical signal is compared to a predetermined standard that indicates the concentration of the analyte based on the electrical signal.
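Comparison of a measured signal to a predetermined standard can be sketched as interpolation on a calibration curve. The calibration points, units, and function name below are hypothetical and serve only to show the mechanics; piecewise-linear interpolation is one possible choice and is not stated in this specification.

```python
def concentration_from_signal(signal: float, standards: list[tuple[float, float]]) -> float:
    """Interpolate analyte concentration from an electrical signal.

    `standards` holds (concentration, signal) pairs measured beforehand;
    signals are assumed to increase strictly with concentration.
    """
    points = sorted(standards, key=lambda p: p[1])
    if not points[0][1] <= signal <= points[-1][1]:
        raise ValueError("signal lies outside the calibrated range")
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            # Linear interpolation between the two bracketing standards.
            return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)
    raise ValueError("calibration requires at least two standards")


# Hypothetical calibration: concentration in mM, current in microamperes.
calibration = [(0.0, 0.0), (2.0, 1.0), (5.0, 2.0), (10.0, 3.0)]
estimate = concentration_from_signal(1.5, calibration)
print(f"estimated concentration: {estimate} mM")
```

Signals outside the calibrated range are rejected rather than extrapolated, which avoids reporting a concentration the standard does not support.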
[0132] Non-limiting examples of analytes are sugar molecules, e.g., glucose, galactose, lactose, maltose, xylose, and fructose; lactate; bilirubin; alcohols; and amino acids.
[0133] In some embodiments, provided herein is a method for transferring an electron to an electrode, comprising coupling the disclosed recombinant protein in an embodiment thereof to an electrode, thereby transferring an electron to an electrode. In one embodiment, coupling is in the absence of a mediator molecule.
[0134] In some embodiments, the amino acid sequence of the recombinant protein comprises or consists of the following sequence:
TABLE-US-00005 (SEQ ID NO: 12, full) MADTDTQKADVVVVGSGVAGAIVAHQLAMAGKSVILLEAGPX.sub.1MPRWEIVERFRNQVDKTD FMAPYPSSAWAPHPEYGPPNDYLILKGEHKFNSQYIRAVGGTTWHWAASAWRFIPNDFKMK TVYGVGRDWPIQYDDIEHYYQRAEEELGVWGPGPEEDLYSPRKEPYPMPPLPLSFNEQTIKSA LNGYDPKFHVVTEPVARNSRPYDGRPTCCGNNNCMPICPIGAMYNGIVHVEKAEQAGAKLID X.sub.2AVVYKLETGPDKRITAAVYKDKTGADHRVEGKYFVIAANGIETPKILLMSANRDFPNGVA NSSDMVGRNLMDHPGTGVSFYANEKLWPGRGPQEMTSLIGFRDGPFRANEAAKKIHLSNMS RINQETQKIFKGGKLMKPEELDAQIRX.sub.3RSARFVQFDCFHEILPQPENRIVPSKTATDAVGIPRP EITYAIDDYVKRGAVHTREVYATAAKVLGGTEVVFNDEFAPNNHITGATIMGADARDSVVDK DCRAFDHPNLFISSSSTMPTVGTVNVTLTIAALALRMSDTLKKEVEFGSGYGSGPPGPIRAGA X.sub.4MX.sub.5HRDRGPCGACHAIIQGSGSGHHHHHH.
[0135] In some embodiments, the amino acid sequence of the recombinant protein comprises or consists of the following sequence:
TABLE-US-00006 (SEQ ID NO: 13, partial) MADTDTQKADVVVVGSGVAGAIVAHQLAMAGKSVILLEAGPX.sub.1MPRWEIVERFRNQVDKTD FMAPYPSSAWAPHPEYGPPNDYLILKGEHKFNSQYIRAVGGTTWHWAASAWRFIPNDFKMK TVYGVGRDWPIQYDDIEHYYQRAEEELGVWGPGPEEDLYSPRKEPYPMPPLPLSFNEQTIKSA LNGYDPKFHVVTEPVARNSRPYDGRPTCCGNNNCMPICPIGAMYNGIVHVEKAEQAGAKLID X.sub.2AVVYKLETGPDKRITAAVYKDKTGADHRVEGKYFVIAANGIETPKILLMSANRDFPNGVA NSSDMVGRNLMDHPGTGVSFYANEKLWPGRGPQEMTSLIGFRDGPFRANEAAKKIHLSNMS RINQETQKIFKGGKLMKPEELDAQIRX.sub.3RSARFVQFDCFHEILPQPENRIVPSKTATDAVGIPRP EITYAIDDYVKRGAVHTREVYATAAKVLGGTEVVFNDEFAPNNHITGATIMGADARDSVVDK DCRAFDHPNLFISSSSTMPTVGTVNVTLTIAALALRMSDTLKKEVEFGSGYGSGPPGPIRAGA X.sub.4MX.sub.5HRDRGPCGACHAIIQ.
[0136] In some embodiments, X.sub.1 is R. In some embodiments, X.sub.2 is S. In some embodiments, X.sub.3 is D. In some embodiments, X.sub.4 is T. In some embodiments, X.sub.5 is P.
[0137] In some embodiments, one or more amino acids selected from X.sub.1 to X.sub.5 comprise at least one non-canonical amino acid (ncAA) residue.
[0138] Exemplary embodiments of SEQ ID NOs: 12-13 are presented in the Examples section below (e.g., SEQ ID NOs: 19-23).
[0139] The term "non-canonical amino acid residue" refers to amino acid residues in D- or L-form that are not among the 20 canonical amino acids generally incorporated into naturally occurring proteins, for example, β-amino acids, homoamino acids, cyclic amino acids and amino acids with derivatized side chains.
[0140] Non-canonical amino acid residues may be incorporated into a peptide within the scope of the invention by employing known techniques of protein engineering that use recombinantly expressing cells.
[0141] In some embodiments, one or more of X.sub.1 to X.sub.5 are present in proximity to a protein domain selected from, without being limited thereto: the FAD binding domain, or the MCD. In some embodiments, one or more of X.sub.1 to X.sub.5 are present at a site that is distant from either the FAD domain or the MCD.
[0142] In some embodiments, by "proximity" it is meant to refer to a distance of 10 to 25 Å, e.g., 10, 15, 20, or 25 Å, including any value and range therebetween.
[0143] In some embodiments, by "distant" it is meant to refer to a distance of 30 to 100 Å, e.g., 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 Å.
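The two distance definitions above can be expressed as a small classifier. This is a minimal sketch; the function name and the "unclassified" label for distances that fall outside both stated ranges are assumptions added for illustration.

```python
def classify_site(distance_angstrom: float) -> str:
    """Classify an ncAA site by its distance (in Angstroms) from the FAD domain or the MCD."""
    if 10 <= distance_angstrom <= 25:
        return "proximity"
    if 30 <= distance_angstrom <= 100:
        return "distant"
    # Distances below 10, between 25 and 30, or above 100 fall outside both definitions.
    return "unclassified"


for d in (12.0, 27.5, 60.0):
    print(d, classify_site(d))
```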
[0144] In some embodiments, the ncAA is a clickable ncAA.
[0145] By "clickable ncAA" it is meant to refer to an ncAA attachable to another group or moiety by a bioorthogonal chemical mechanism.
[0146] In exemplary embodiments, the ncAA comprises Propargyl-lysine (PrK).
[0147] In some embodiments, the ncAA has attached thereto an electrode linker (mediator) configured to couple the recombinant protein to an electrode.
[0148] In some embodiments, the linker comprises an aromatic group.
[0149] In some embodiments, the aromatic group comprises polycyclic aromatic hydrocarbon system.
[0150] In some embodiments, by "polycyclic aromatic hydrocarbon system" it is meant to refer to a system comprising e.g., 3, 4, 5, or 6 fused benzene rings, which, in some embodiments, is in the form of a flat aromatic system.
[0151] In some embodiments, the aromatic group is selected from, without being limited thereto, pyrene, perylene, benzopyrene, oxoperylene, rubrene, perylene bisimide, styrene, anthracene, tetracene, pentacene, or any derivative thereof.
[0152] In exemplary embodiments, the aromatic group comprises pyrene or a derivative thereof.
[0153] In some embodiments, the linker further comprises an azide group.
[0154] In some embodiments, the azide group allows binding to the PrK, for example, via an alkyne group, e.g., by "click" chemistry.
[0155] In some embodiments, the aromatic system (e.g., pyrene group) is present at one end of the linker, and the azide group is present at the other end of the linker.
[0156] In some embodiments, the two groups (e.g., azide group and the aromatic group) are connected to each other by an alkyl oxide, for example, and without being limited thereto, tri-ethylene oxide, di-ethylene oxide or mono-ethylene oxide. In some embodiments, the mediator is characterized by a length of 3 to 9 or 4 to 8 Å, for example 3, 4, 5, 6, 7, 8, or 9 Å, including any value and range therebetween.
The Porphyrin
[0157] In one embodiment, the recombinant protein (e.g., the cytochrome domain) is bound to a porphyrin comprising a metal. In one embodiment, the recombinant protein is bound to a compound of formula I:
##STR00002##
wherein R is any electron donor, or a compound as further described herein, wherein the compound of formula I is bound to a metal. In one embodiment, the recombinant protein but not the gamma subunit is bound to a porphyrin as described herein.
[0158] In one embodiment, R represents, independently and in each occurrence, a substituent selected from the group consisting of: alkyl, cycloalkyl, aryl, heteroalicyclic, heteroaryl, alkoxy, hydroxy, phosphonate, thiohydroxy, thioalkoxy, aryloxy, thioaryloxy, amino, nitro, halo, trihalomethyl, cyano, amide, amine, alkanoamine, carboxy, sulfonyl, sulfoxy, sulfinyl, and sulfonamide, or is absent. In one embodiment, R represents, independently and in each occurrence, hydrogen.
[0159] In one embodiment, R represents -(C₁-C₆)alkyl. In some embodiments, R represents -(C₁-C₆)alkoxy. In some embodiments, R represents -(C₁-C₆)alkylthio. In some embodiments, R represents -(C₁-C₆)alkylsulfinyl. In some embodiments, R represents -(C₁-C₆)alkylsulfonyl. In some embodiments, R represents -[(C₁-C₆)alkyl]NH. In some embodiments, R represents -[(C₁-C₆)alkyl]COOH.
[0160] In one embodiment, the term "alkyl" comprises an aliphatic hydrocarbon including straight chain and branched chain groups. Preferably, the alkyl group has 21 to 100 carbon atoms, and more preferably 21-50 carbon atoms. Whenever a numerical range, e.g., "21-100", is stated herein, it implies that the group, in this case the alkyl group, may contain 21 carbon atoms, 22 carbon atoms, 23 carbon atoms, etc., up to and including 100 carbon atoms.
[0161] In one embodiment, the term "long alkyl" comprises an alkyl having at least 20 carbon atoms in its main chain (the longest path of continuous covalently attached atoms). A short alkyl therefore has 20 or fewer main-chain carbons. In one embodiment, an alkyl can be substituted or unsubstituted. In one embodiment, the term "alkyl", as used herein, also encompasses saturated or unsaturated hydrocarbon, hence this term further encompasses alkenyl and alkynyl.
[0162] In one embodiment, the term "alkenyl" describes an unsaturated alkyl, as defined herein, having at least two carbon atoms and at least one carbon-carbon double bond. The alkenyl may be substituted or unsubstituted by one or more substituents, as described hereinabove. In one embodiment, the term "alkynyl", as defined herein, is an unsaturated alkyl having at least two carbon atoms and at least one carbon-carbon triple bond. The alkynyl may be substituted or unsubstituted by one or more substituents.
[0163] In one embodiment, the term "cycloalkyl" describes an all-carbon monocyclic or fused ring (i.e., rings which share an adjacent pair of carbon atoms) group where one or more of the rings does not have a completely conjugated pi-electron system. The cycloalkyl group may be substituted or unsubstituted.
[0164] In one embodiment, the term "aryl" describes an all-carbon monocyclic or fused-ring polycyclic (i.e., rings which share adjacent pairs of carbon atoms) groups having a completely conjugated pi-electron system. In one embodiment, an aryl group may be substituted or unsubstituted.
[0165] In one embodiment, the term "alkoxy" describes both an -O-alkyl and an -O-cycloalkyl group. In one embodiment, the term "aryloxy" describes an -O-aryl. In one embodiment, the alkyl, cycloalkyl and aryl groups in the general formulas herein may be substituted by one or more substituents, whereby each substituent group can independently be, for example, halide, alkyl, alkoxy, cycloalkyl, nitro, amine, hydroxyl, thiol, thioalkoxy, thiohydroxy, carboxy, amide, aryl and aryloxy, depending on the substituted group and its position in the molecule.
[0166] In one embodiment, "halide", "halogen" or "halo" describes fluorine, chlorine, bromine or iodine. In one embodiment, "haloalkyl" describes an alkyl group as defined herein, further substituted by one or more halide(s). In one embodiment, "haloalkoxy" describes an alkoxy group as defined herein, further substituted by one or more halide(s). In one embodiment, the term "hydroxyl" or "hydroxy" describes an -OH group. In one embodiment, the term "thiohydroxy" or "thiol" describes an -SH group. In one embodiment, the term "thioalkoxy" describes both an -S-alkyl group and an -S-cycloalkyl group. In one embodiment, the term "thioaryloxy" describes both an -S-aryl and an -S-heteroaryl group. In one embodiment, the term "amine" describes an -NR'R'' group, with R' and R'' as defined herein. In one embodiment, the term "heteroaryl" describes a monocyclic or fused ring (i.e., rings which share an adjacent pair of atoms) group having in the ring(s) one or more atoms, such as, for example, nitrogen, oxygen and sulfur and, in addition, having a completely conjugated pi-electron system. Examples, without limitation, of heteroaryl groups include pyrrole, furane, thiophene, imidazole, oxazole, thiazole, pyrazole, pyridine, pyrimidine, quinoline, isoquinoline and purine.
[0167] In one embodiment, the term "heteroalicyclic" or "heterocyclyl" describes a monocyclic or fused ring group having in the ring(s) one or more atoms such as nitrogen, oxygen and sulfur. The rings may also have one or more double bonds. In one embodiment, the rings do not have a completely conjugated pi-electron system. Representative examples are piperidine, piperazine, tetrahydrofurane, tetrahydropyrane, morpholino and the like.
[0168] In one embodiment, the term "carboxy" or "carboxylate" describes a -C(=O)-OR' group, where R' is hydrogen, alkyl, cycloalkyl, alkenyl, aryl, heteroaryl (bonded through a ring carbon) or heteroalicyclic (bonded through a ring carbon).
[0169] In one embodiment, the term "carbonyl" describes a -C(=O)-R' group, where R' is as defined hereinabove. In one embodiment, the above terms also encompass thio-derivatives thereof (thiocarboxy and thiocarbonyl).
[0170] In one embodiment, the term "thiocarbonyl" describes a -C(=S)-R' group, where R' is as defined hereinabove. In one embodiment, the term "thiocarboxy" group describes a -C(=S)-OR' group, where R' is as defined herein. In one embodiment, the term "sulfinyl" group describes an -S(=O)-R' group, where R' is as defined herein. In one embodiment, the term "sulfonyl" or "sulfonate" group describes an -S(=O)₂-R' group, where R' is as defined herein. In one embodiment, the term "carbamyl" or "carbamate" group describes an -OC(=O)-NR'R'' group, where R' is as defined herein and R'' is as defined for R'.
[0171] In one embodiment, the term "nitro" group refers to a -NO₂ group. In one embodiment, the term "cyano" or "nitrile" group refers to a -C≡N group. In one embodiment, the term "azide" refers to a -N₃ group. In one embodiment, the term "sulfonamide" refers to a -S(=O)₂-NR'R'' group, with R' and R'' as defined herein.
[0172] In one embodiment, the term "phosphonyl" or "phosphonate" describes an -O-P(=O)(OR')₂ group, with R' as defined hereinabove. In one embodiment, the term "phosphinyl" describes a -PR'R'' group, with R' and R'' as defined hereinabove.
[0173] In one embodiment, the term "alkaryl" describes an alkyl, as defined herein, which is substituted by an aryl, as described herein. In one embodiment, alkaryl is benzyl.
[0174] In one embodiment, the term "heteroaryl" describes a monocyclic or fused ring (i.e., rings which share an adjacent pair of atoms) group having in the ring(s) one or more atoms, such as, for example, nitrogen, oxygen and sulfur and, in addition, having a completely conjugated pi-electron system. Examples, without limitation, of heteroaryl groups include pyrrole, furane, thiophene, imidazole, oxazole, thiazole, pyrazole, pyridine, pyrimidine, quinoline, isoquinoline and purine. The heteroaryl group may be substituted or unsubstituted by one or more substituents, as described hereinabove. Representative examples are thiadiazole, pyridine, pyrrole, oxazole, indole, purine and the like.
[0175] In one embodiment, the terms "halo" and "halide", which are referred to herein interchangeably, describe an atom of a halogen, that is fluorine, chlorine, bromine or iodine, also referred to herein as fluoride, chloride, bromide and iodide. In one embodiment, the term "haloalkyl" describes an alkyl group as defined above, further substituted by one or more halide(s).
[0176] In one embodiment, the metal is a trivalent metal or a divalent metal.
[0177] In one embodiment, the recombinant protein as described herein comprises both peroxidase activity and oxidative activity. In one embodiment, the recombinant protein bound to a porphyrin comprising a metal comprises both peroxidase activity and oxidative activity. In one embodiment, the recombinant protein bound to a porphyrin comprising a metal comprises both peroxidase activity and oxidase activity.
[0178] In one embodiment, the recombinant protein is characterized by an apparent Michaelis-Menten constant (K_M^app) value which is at least 2-, 3-, 4-, or 5-fold, or more, higher compared to plain GDH, as measured under the same condition (e.g., glucose concentration) of glucose oxidation.
[0179] In one embodiment, the recombinant protein is characterized by higher selectivity towards glucose oxidation as compared to other sugar molecules. In one embodiment, this means that the rate of glucose oxidation is higher by at least 10%, by at least 20%, by at least 30%, by at least 40%, or by at least 50% than that of another sugar molecule.
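A comparison of apparent Michaelis-Menten constants can be sketched as follows. The sketch assumes the standard Michaelis-Menten rate law, v = V_max·[S] / (K_M + [S]), and estimates the parameters by a Lineweaver-Burk (double-reciprocal) linear fit; the fitting method, the function name, and all data values are illustrative assumptions and are not taken from this specification.

```python
def fit_michaelis_menten(substrate: list[float], rate: list[float]) -> tuple[float, float]:
    """Estimate (K_M, V_max) from a double-reciprocal fit: 1/v = (K_M/V_max)(1/S) + 1/V_max."""
    x = [1.0 / s for s in substrate]
    y = [1.0 / v for v in rate]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Ordinary least-squares slope and intercept.
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    intercept = mean_y - slope * mean_x
    v_max = 1.0 / intercept
    k_m = slope * v_max
    return k_m, v_max


# Synthetic, noise-free data generated with K_M = 2.0 and V_max = 10.0 (arbitrary units).
glucose = [0.5, 1.0, 2.0, 5.0, 10.0]
rates = [10.0 * s / (2.0 + s) for s in glucose]
k_m, v_max = fit_michaelis_menten(glucose, rates)

# Fold change relative to a hypothetical reference K_M of 0.5 for plain GDH.
fold_change = k_m / 0.5
print(f"K_M = {k_m:.3f}, V_max = {v_max:.3f}, fold change = {fold_change:.1f}")
```

With real, noisy measurements a nonlinear fit is generally preferable, because the double-reciprocal transform amplifies error at low substrate concentrations.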
[0180] As used herein, the terms "peptide", "polypeptide" and "protein" are used interchangeably to refer to a polymer of amino acid residues. In some embodiments, the peptides, polypeptides and proteins described herein have modifications rendering them more stable while in the body or more capable of penetrating into cells. In some embodiments, the terms "peptide", "polypeptide" and "protein" apply to naturally occurring amino acid polymers. In another embodiment, the terms "peptide", "polypeptide" and "protein" apply to amino acid polymers in which one or more amino acid residue is an artificial chemical analogue of a corresponding naturally occurring amino acid.
[0181] As used herein, the term "recombinant protein" refers to a protein which is coded for by a recombinant DNA, and is thus not naturally occurring. The term "recombinant DNA" refers to DNA molecules formed by laboratory methods of genetic recombination. Generally, this recombinant DNA is in the form of a vector used to express the recombinant protein in a cell. In one embodiment, the recombinant protein is provided within a single composition or kit with the gamma subunit FAD-GDH. In one embodiment, the recombinant protein and the gamma subunit FAD-GDH are provided within a single composition or kit as two separate proteins (unbound).
[0182] In some embodiments, the recombinant protein is a hybrid protein. In some embodiments, the recombinant protein is a chimeric protein. In some embodiments, the c-type cytochrome peptide is from a different protein than the alpha subunit. In some embodiments, the c-type cytochrome peptide is not from FAD-GDH. In some embodiments, the c-type cytochrome peptide is not from GDH. In some embodiments, the nucleic acid sequence encoding the alpha subunit and the sequence encoding the c-type cytochrome peptide are operably linked, such that a full-length protein is produced following translation and/or transcription. The term "operably linked" is intended to mean that the two nucleotide sequences of interest are linked to each other in a manner that allows for expression of the nucleotide sequence (e.g. in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell).
[0183] In general, and throughout this specification, the term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends, no free ends (e.g. circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art. One type of vector is a "plasmid" which refers to a circular double stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques. Another type of vector is a viral vector, wherein virally-derived DNA or RNA sequences are present in the virus (e.g. retroviruses, replication defective retroviruses, adenoviruses, replication defective adenoviruses, and adeno-associated viruses). Viral vectors also include polynucleotides carried by a virus for transfecting into host cells. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g. bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively-linked. Such vectors are referred to herein as "expression vectors". Common expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.
[0184] Recombinant expression vectors can comprise a nucleic acid coding for the protein of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vectors include one or more regulatory elements, which may be selected on the basis of the host cells to be used for expression, that are operatively-linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector "operably linked" is intended to mean that the nucleotide sequence of interest is linked to the regulatory element(s) in a manner that allows for expression of the nucleotide sequence (e.g. in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell).
[0185] A vector nucleic acid sequence generally contains at least an origin of replication for propagation in a cell and optionally additional elements, such as a heterologous polynucleotide sequence, expression control element (e.g., a promoter, enhancer), selectable marker (e.g., antibiotic resistance), and poly-adenine sequence.
[0186] A vector or a plasmid is an artificial composite. A vector or a plasmid as described herein is man-made. A vector or a plasmid as described herein is not a product of nature.
[0187] The vector may be a DNA plasmid delivered via non-viral methods or via viral methods. The viral vector may be a retroviral vector, a herpesviral vector, an adenoviral vector, an adeno-associated viral vector or a poxviral vector. The promoters may be active in mammalian cells. The promoters may be viral promoters.
[0188] In some embodiments, the vector is introduced into the cell by standard methods including electroporation (e.g., as described in Fromm et al., Proc. Natl. Acad. Sci. USA 82, 5824 (1985)), heat shock, infection by viral vectors, high velocity ballistic penetration by small particles with the nucleic acid either within the matrix of small beads or particles, or on the surface (Klein et al., Nature 327, 70-73 (1987)), and/or the like.
[0189] General methods in molecular and cellular biochemistry, such as may be useful for carrying out DNA and protein recombination, as well as other techniques described herein, can be found in such standard textbooks as Molecular Cloning: A Laboratory Manual, 3rd Ed. (Sambrook et al., Cold Spring Harbor Laboratory Press 2001); Short Protocols in Molecular Biology, 4th Ed. (Ausubel et al. eds., John Wiley & Sons 1999); Protein Methods (Bollag et al., John Wiley & Sons 1996); Nonviral Vectors for Gene Therapy (Wagner et al. eds., Academic Press 1999); Viral Vectors (Kaplitt & Loewy eds., Academic Press 1995); Immunology Methods Manual (I. Lefkovits ed., Academic Press 1997); and Cell and Tissue Culture: Laboratory Procedures in Biotechnology (Doyle & Griffiths, John Wiley & Sons 1998).
[0190] It should be well understood to one skilled in the art that a recombinant protein is produced by expressing the recombinant DNA in a cell and then purifying the protein. The cells expressing the DNA are cultured under effective conditions, which allow for the expression of high amounts of recombinant polypeptide. Such effective culture conditions include, but are not limited to, effective media, bioreactor, temperature, pH and oxygen conditions that permit protein production. In one embodiment, an effective medium refers to any medium in which a cell is cultured to produce the recombinant polypeptide of the present invention. In some embodiments, a medium typically includes an aqueous solution having assimilable carbon, nitrogen and phosphate sources, and appropriate salts, minerals, metals and other nutrients, such as vitamins. In some embodiments, cells of the present invention can be cultured in conventional fermentation bioreactors, shake flasks, test tubes, microtiter dishes and petri plates. In some embodiments, culturing is carried out at a temperature, pH and oxygen content appropriate for a recombinant cell. In some embodiments, culturing conditions are within the expertise of one of ordinary skill in the art.
[0191] Purification of a recombinant protein involves standard laboratory techniques for extracting a recombinant protein that is essentially free from contaminating cellular components, such as carbohydrate, lipid, or other proteinaceous impurities associated with the peptide in nature. Purification can be carried out using a tag that is part of the recombinant protein or through immuno-purification with antibodies directed to the recombinant protein. Kits are commercially available for such purifications and will be familiar to one skilled in the art. Typically, a preparation of purified peptide contains the peptide in a highly purified form, i.e., at least about 80% pure, at least about 90% pure, at least about 95% pure, greater than 95% pure, or greater than 99% pure.
[0192] In some embodiments, the protein comprises an amino acid sequence with at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99% homology to the amino acid sequence set forth below in SEQ ID NO: 1. In some embodiments, the amino acid sequence with at least 70% homology to SEQ ID NO: 1 is the amino acid sequence set forth in SEQ ID NO: 6.
[0193] Mutations and deletions in a recombinant protein are created by introducing the mutation or deletion into the recombinant DNA. Methods of site-directed mutagenesis, and routine DNA recombination can be found in such standard textbooks as are enumerated above. Mutagenesis of one amino acid to another may require mutation of 1, 2, or 3 of the bases that make up the codon corresponding to the amino acid to be changed.
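The statement that changing one amino acid may require mutating 1, 2, or 3 bases of its codon can be illustrated with a minimal sketch. It assumes the standard genetic code; the helper name min_base_changes is hypothetical and is used here only for illustration, not as part of any described method.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumption: standard code applies).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def min_base_changes(codon, target_aa):
    """Return the fewest base substitutions that turn `codon` into
    any codon encoding `target_aa` (one-letter code)."""
    codon = codon.upper()
    return min(
        sum(a != b for a, b in zip(codon, candidate))
        for candidate, aa in CODON_TABLE.items()
        if aa == target_aa
    )


if __name__ == "__main__":
    print(min_base_changes("GCG", "G"))  # Ala -> Gly
    print(min_base_changes("ATG", "W"))  # Met -> Trp
    print(min_base_changes("ATG", "C"))  # Met -> Cys
```

With the standard code, the three calls illustrate the three cases: one, two, and three base changes, respectively.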
[0194] In some embodiments, provided herein is a method for quantifying the amount of a reporter in a sample, the reporter having a first detectable range of light absorbance in an oxidized state and a second range of light absorbance in a non-oxidized state, the method comprising: contacting the recombinant protein with the reporter in a non-oxidized state; and measuring the amount of the reporter in an oxidized state, thereby quantifying the amount of the reporter in the sample. In some embodiments, coupled is directly bound. In another embodiment, the first detectable range of light absorbance is detectable in visible light and the second range of light absorbance is non-detectable in visible light. In some embodiments, the reporter is 2,6-Dichloroindophenol. In some embodiments, the reporter is coupled to glucose. In some embodiments, 2,6-Dichloroindophenol is coupled to glucose.
DNA Sequences
[0195] In one embodiment, provided herein is a DNA molecule encoding the recombinant protein. In one embodiment, provided herein is a DNA molecule comprising: a transcription regulatory element, a translation regulatory element or both; operably linked to a nucleotide sequence encoding the recombinant protein. In one embodiment, the invention provides a DNA molecule encoding both the recombinant protein and the gamma subunit of an FAD-GDH. In one embodiment, the invention provides a single DNA molecule encoding both the recombinant protein and the gamma subunit of an FAD-GDH as two separate proteins.
[0196] In one embodiment, provided herein is a DNA molecule comprising a nucleic acid sequence encoding the recombinant protein. In one embodiment, provided herein is a DNA molecule comprising a nucleic acid sequence encoding the recombinant protein and the gamma subunit of an FAD-GDH. In one embodiment, provided herein is a DNA molecule comprising the nucleic acid sequence selected from the group consisting of SEQ ID NOs: 5-7. In one embodiment, the invention provides a plasmid or a vector (such as an expression vector) comprising a nucleic acid sequence encoding the recombinant protein. In one embodiment, provided herein is a cell comprising a DNA molecule, a plasmid or a vector as described herein. In one embodiment, the cell is a prokaryotic cell. In one embodiment, the cell is a bacterial cell.
[0197] In one embodiment, the alpha subunit of the FAD-GDH is encoded by a DNA sequence comprising or consisting of the nucleotide sequence: ATGGCGGATACGGATACCCAGAAAGCGGACGTGGTCGTGGTTGGATCCGGCGTGGCAGG CGCAATCGTGGCTCATCAACTGGCAATGGCAGGTAAAAGCGTGATCCTGCTGGAAGCTGG TCCGCGTATGCCGCGTTGGGAAATTGTTGAACGTTTCCGCAATCAAGTCGATAAAACCGA CTTTATGGCACCGTATCCGAGCAGCGCATGGGCACCGCATCCGGAATATGGTCCGCCGAA TGATTACCTGATCCTGAAAGGCGAACACAAATTTAACTCACAGTACATTCGTGCAGTGGG CGGCACCACGTGGCATTGGGCAGCCTCGGCATGGCGCTTCATCCCGAACGATTTTAAAAT GAAAACCGTGTATGGCGTTGGTCGTGACTGGCCGATTCAGTACGATGACATCGAACATTA TTACCAACGCGCGGAAGAAGAACTGGGCGTGTGGGGTCCGGGCCCGGAAGAAGACCTGT ATTCACCGCGTAAAGAACCGTACCCGATGCCGCCGCTGCCGCTGAGTTTCAATGAACAAA CCATTAAATCCGCTCTGAACGGCTATGATCCGAAATTTCACGTGGTTACGGAACCGGTGG CCCGTAATTCGCGCCCGTACGACGGTCGCCCGACCTGCTGTGGCAACAATAACTGCATGC CGATTTGTCCGATCGGTGCAATGTATAACGGCATCGTCCATGTGGAAAAAGCTGAACAGG CAGGTGCTAAACTGATTGATAGTGCGGTCGTGTACAAACTGGAAACGGGCCCGGACAAA CGTATTACCGCAGCTGTTTATAAAGATAAAACGGGTGCGGACCATCGCGTCGAAGGCAAA TACTTCGTGATTGCGGCCAATGGTATCGAAACCCCGAAAATTCTGCTGATGAGCGCGAAC CGTGATTTTCCGAATGGTGTGGCCAACAGTTCCGATATGGTTGGCCGCAATCTGATGGAC CATCCGGGCACCGGCGTGAGCTTTTATGCAAACGAAAAACTGTGGCCGGGTCGTGGTCCG CAGGAAATGACCTCTCTGATCGGTTTCCGTGATGGCCCGTTTCGCGCGAATGAAGCAGCG AAGAAAATTCATCTGTCAAATATGTCGCGTATCAACCAGGAAACCCAAAAAATCTTTAAA GGCGGTAAACTGATGAAACCGGAAGAACTGGATGCGCAGATCCGTGACCGCAGTGCCCG CTTTGTTCAATTCGATTGCTTTCACGAAATCCTGCCGCAGCCGGAAAATCGTATTGTCCCG TCCAAAACCGCAACGGACGCAGTGGGTATTCCGCGTCCGGAAATTACGTATGCGATCGAT GACTACGTCAAACGTGGCGCAGTGCATACGCGCGAAGTTTATGCTACCGCGGCCAAAGTG CTGGGCGGCACCGAAGTGGTCTTCAACGATGAATTTGCGCCGAATAACCACATCACCGGT GCCACGATTATGGGCGCGGATGCCCGTGACTCAGTGGTTGATAAAGACTGTCGCGCCTTC GATCATCCGAACCTGTTTATTAGCAGCAGCAGCACCATGCCGACGGTTGGCACCGTTAAC GTCACCCTGACGATTGCAGCTCTGGCACTGCGTATGTCTGATACGCTGAAAAAAGAAGTC (SEQ ID NO: 1). In one embodiment, the recombinant protein is encoded by a DNA molecule comprising a coding nucleotide sequence encoding the alpha subunit of the FAD-GDH.
[0198] In one embodiment, the alpha subunit of the FAD-GDH is encoded by a DNA sequence of 1200 to 1700 nucleotides. In one embodiment, the alpha subunit of the FAD-GDH is encoded by a DNA sequence of 1200 to 1650 nucleotides. In one embodiment, the alpha subunit of the FAD-GDH is encoded by a DNA sequence of 1300 to 1650 nucleotides. In one embodiment, the alpha subunit of the FAD-GDH is encoded by a DNA sequence of 1400 to 1700 nucleotides. In one embodiment, the alpha subunit of the FAD-GDH is encoded by a DNA sequence of 1500 to 1700 nucleotides. In one embodiment, the alpha subunit of the FAD-GDH is encoded by a DNA sequence of 1500 to 1650 nucleotides. In one embodiment, the alpha subunit of the FAD-GDH is encoded by a DNA sequence of 1550 to 1640 nucleotides. In one embodiment, the alpha subunit of the FAD-GDH is encoded by a DNA sequence of 1600 to 1650 nucleotides.
[0199] In one embodiment, the alpha subunit of the FAD-GDH is a mutant of alpha FAD-GDH or a mutant of SEQ ID NO: 1. Active mutants of alpha FAD-GDH or SEQ ID NO: 1 are readily available to one of skill in the art. The term "active mutant", as used in conjunction with an FAD-GDH, means a mutated form of the naturally occurring FAD-GDH. FAD-GDH mutants or variants will typically but not exclusively have at least 70%, e.g., 80%, 85%, 90% to 95% or more, and for example 98% or more amino acid sequence identity to the amino acid sequence of the reference FAD-GDH molecule.
[0200] In one embodiment, the DNA sequence of a mutant of FAD-GDH or a mutant of SEQ ID NO: 1 is at least 70% identical to SEQ ID NO: 1. In one embodiment, the DNA sequence of a mutant of FAD-GDH or a mutant of SEQ ID NO: 1 is at least 75% identical to SEQ ID NO: 1. In one embodiment, the DNA sequence of a mutant of FAD-GDH or a mutant of SEQ ID NO: 1 is at least 80% identical to SEQ ID NO: 1. In one embodiment, the DNA sequence of a mutant of FAD-GDH or a mutant of SEQ ID NO: 1 is at least 85% identical to SEQ ID NO: 1. In one embodiment, the DNA sequence of a mutant of FAD-GDH or a mutant of SEQ ID NO: 1 is at least 90% identical to SEQ ID NO: 1. In one embodiment, the DNA sequence of a mutant of FAD-GDH or a mutant of SEQ ID NO: 1 is at least 95% identical to SEQ ID NO: 1. In one embodiment, the DNA sequence of a mutant of FAD-GDH or a mutant of SEQ ID NO: 1 is at least 97% identical to SEQ ID NO: 1.
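The identity thresholds above can be made concrete with a minimal sketch. It assumes the two sequences have already been aligned to equal length and simply counts matching positions; the function name percent_identity is hypothetical, and a dedicated alignment tool would normally be used for sequences of different lengths or with insertions and deletions.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two pre-aligned, equal-length
    sequences carry the same character (case-insensitive)."""
    a, b = seq_a.upper(), seq_b.upper()
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    # First 21 nucleotides of SEQ ID NO: 1 and a hypothetical variant
    # carrying a single substitution at the last position.
    reference = "ATGGCGGATACGGATACCCAG"
    variant = "ATGGCGGATACGGATACCCAA"
    identity = percent_identity(reference, variant)
    print(f"{identity:.1f}% identical; meets 95% threshold: {identity >= 95}")
```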
[0201] In one embodiment, the gamma subunit of the FAD-GDH is encoded by a DNA sequence comprising or consisting of the nucleotide sequence:
TABLE-US-00007 (SEQ ID NO: 2) ATGGCTCACAATGACAACACCCCGCACTCCCGCCGTACCGGCGATGCGGCCGTGACCGGT ATTACGCGTCGCCAGTGGCTGCAAGGCGCGCTGGCCCTGACCGCAGCTGGCCTGACGGGT TCCCTGGCCCTGCGCGCACTGGCTGATGATCCGGGCACCGCACCGCTGGATACCTTTATG ACGCTGAGCGAAGCTCTGACGGGCAAAAAAGGTCTGTCTCGTGTTCTGGGCCAGCGTTTT CTGCAAGCGCTGCAAAAAGGTTCATTCAAAACCGCGGATTCGCTGCCGCAGCTGGCGGGC GCCCTGGCAAGCGGTTCTCTGAACCCGGACCAAGAAGCTCTGGCGCTGAAAATCCTGGAA GCATGGTATCTGGGCATTGTTGATAATGTGGTTATCACCTACGAAGAAGCCCTGATGTTTA GTGTCGTGTCCGACACGCTGGTCATTCCGAGCTATTGCCCGAACAAACCGGGTTTCTGGG CCGAAAAACCGATCGAACGTCAGGCATAA.
[0202] In one embodiment, the gamma subunit of the FAD-GDH is translated into a discrete protein. In one embodiment, the gamma subunit of the FAD-GDH is encoded by a DNA sequence of 400 to 700 nucleotides. In one embodiment, the gamma subunit of the FAD-GDH is encoded by a DNA sequence of 450 to 650 nucleotides. In one embodiment, the gamma subunit of the FAD-GDH is encoded by a DNA sequence of 450 to 550 nucleotides. In one embodiment, the gamma subunit of the FAD-GDH is encoded by a DNA sequence of 480 to 530 nucleotides. In one embodiment, the gamma subunit of the FAD-GDH is encoded by a DNA sequence of 500 to 540 nucleotides. In one embodiment, the gamma subunit of the FAD-GDH is encoded by a DNA sequence of 500 to 530 nucleotides.
[0203] In one embodiment, the gamma subunit of the FAD-GDH is a mutant of gamma FAD-GDH or a mutant of SEQ ID NO: 2. In one embodiment, active mutants of gamma FAD-GDH or SEQ ID NO: 2 are readily available to one of skill in the art.
[0204] In one embodiment, the DNA sequence of a mutant of FAD-GDH or a mutant of SEQ ID NO: 2 is at least 70% identical to SEQ ID NO: 2. In one embodiment, the DNA sequence of a mutant of FAD-GDH or a mutant of SEQ ID NO: 2 is at least 75% identical to SEQ ID NO: 2. In one embodiment, the DNA sequence of a mutant of FAD-GDH or a mutant of SEQ ID NO: 2 is at least 80% identical to SEQ ID NO: 2. In one embodiment, the DNA sequence of a mutant of FAD-GDH or a mutant of SEQ ID NO: 2 is at least 85% identical to SEQ ID NO: 2. In one embodiment, the DNA sequence of a mutant of FAD-GDH or a mutant of SEQ ID NO: 2 is at least 90% identical to SEQ ID NO: 2. In one embodiment, the DNA sequence of a mutant of FAD-GDH or a mutant of SEQ ID NO: 2 is at least 95% identical to SEQ ID NO: 2. In one embodiment, the DNA sequence of a mutant of FAD-GDH or a mutant of SEQ ID NO: 2 is at least 97% identical to SEQ ID NO: 2.
[0205] In one embodiment, the minimal cytochrome peptide is encoded by a DNA sequence comprising or consisting of the nucleotide sequence:
TABLE-US-00008 (SEQ ID NO: 3) ATTCGTGCAGGTGCTACCATGCCGCATCGTGATCGTGGTCCGTGCGGTGC ATGTCACGCTATTATCCAG.
[0206] In one embodiment, the recombinant protein is encoded by a DNA sequence comprising or consisting of the nucleotide sequence:
TABLE-US-00009 (SEQ ID NO: 5 without linker, his and restriction sites) CTCACAATGACAACACCCCGCACTCCCGCCGTACCGGCGATGCGGCCGTGACCGGTATTA CGCGTCGCCAGTGGCTGCAAGGCGCGCTGGCCCTGACCGCAGCTGGCCTGACGGGTTCCC TGGCCCTGCGCGCACTGGCTGATGATCCGGGCACCGCACCGCTGGATACCTTTATGACGC TGAGCGAAGCTCTGACGGGCAAAAAAGGTCTGTCTCGTGTTCTGGGCCAGCGTTTTCTGC AAGCGCTGCAAAAAGGTTCATTCAAAACCGCGGATTCGCTGCCGCAGCTGGCGGGCGCCC TGGCAAGCGGTTCTCTGAACCCGGACCAAGAAGCTCTGGCGCTGAAAATCCTGGAAGCAT GGTATCTGGGCATTGTTGATAATGTGGTTATCACCTACGAAGAAGCCCTGATGTTTAGTGT CGTGTCCGACACGCTGGTCATTCCGAGCTATTGCCCGAACAAACCGGGTTTCTGGGCCGA AAAACCGATCGAACGTCAGGCATAATGGCGGATACGGATACCCAGAAAGCGGACGTGGT CGTGGTTGGATCCGGCGTGGCAGGCGCAATCGTGGCTCATCAACTGGCAATGGCAGGTAA AAGCGTGATCCTGCTGGAAGCTGGTCCGCGTATGCCGCGTTGGGAAATTGTTGAACGTTT CCGCAATCAAGTCGATAAAACCGACTTTATGGCACCGTATCCGAGCAGCGCATGGGCACC GCATCCGGAATATGGTCCGCCGAATGATTACCTGATCCTGAAAGGCGAACACAAATTTAA CTCACAGTACATTCGTGCAGTGGGCGGCACCACGTGGCATTGGGCAGCCTCGGCATGGCG CTTCATCCCGAACGATTTTAAAATGAAAACCGTGTATGGCGTTGGTCGTGACTGGCCGATT CAGTACGATGACATCGAACATTATTACCAACGCGCGGAAGAAGAACTGGGCGTGTGGGG TCCGGGCCCGGAAGAAGACCTGTATTCACCGCGTAAAGAACCGTACCCGATGCCGCCGCT GCCGCTGAGTTTCAATGAACAAACCATTAAATCCGCTCTGAACGGCTATGATCCGAAATT TCACGTGGTTACGGAACCGGTGGCCCGTAATTCGCGCCCGTACGACGGTCGCCCGACCTG CTGTGGCAACAATAACTGCATGCCGATTTGTCCGATCGGTGCAATGTATAACGGCATCGT CCATGTGGAAAAAGCTGAACAGGCAGGTGCTAAACTGATTGATAGTGCGGTCGTGTACAA ACTGGAAACGGGCCCGGACAAACGTATTACCGCAGCTGTTTATAAAGATAAAACGGGTG CGGACCATCGCGTCGAAGGCAAATACTTCGTGATTGCGGCCAATGGTATCGAAACCCCGA AAATTCTGCTGATGAGCGCGAACCGTGATTTTCCGAATGGTGTGGCCAACAGTTCCGATA TGGTTGGCCGCAATCTGATGGACCATCCGGGCACCGGCGTGAGCTTTTATGCAAACGAAA AACTGTGGCCGGGTCGTGGTCCGCAGGAAATGACCTCTCTGATCGGTTTCCGTGATGGCC CGTTTCGCGCGAATGAAGCAGCGAAGAAAATTCATCTGTCAAATATGTCGCGTATCAACC AGGAAACCCAAAAAATCTTTAAAGGCGGTAAACTGATGAAACCGGAAGAACTGGATGCG CAGATCCGTGACCGCAGTGCCCGCTTTGTTCAATTCGATTGCTTTCACGAAATCCTGCCGC AGCCGGAAAATCGTATTGTCCCGTCCAAAACCGCAACGGACGCAGTGGGTATTCCGCGTC CGGAAATTACGTATGCGATCGATGACTACGTCAAACGTGGCGCAGTGCATACGCGCGAAG 
TTTATGCTACCGCGGCCAAAGTGCTGGGCGGCACCGAAGTGGTCTTCAACGATGAATTTG CGCCGAATAACCACATCACCGGTGCCACGATTATGGGCGCGGATGCCCGTGACTCAGTGG TTGATAAAGACTGTCGCGCCTTCGATCATCCGAACCTGTTTATTAGCAGCAGCAGCACCAT GCCGACGGTTGGCACCGTTAACGTCACCCTGACGATTGCAGCTCTGGCACTGCGTATGTCT GATACGCTGAAAAAAGAAGTCATTCGTGCAGGTGCTACCATGCCGCATCGTGATCG TGGTCCGTGCGGTGCATGTCACGCTATTATCCAG.
[0207] In one embodiment, the recombinant protein is encoded by a DNA sequence comprising or consisting of the nucleotide sequence:
TABLE-US-00010 (SEQ ID NO: 6 with linker and without his and restriction sites) CTCACAATGACAACACCCCGCACTCCCGCCGTACCGGCGATGCGGCCGTGACCGGTATTA CGCGTCGCCAGTGGCTGCAAGGCGCGCTGGCCCTGACCGCAGCTGGCCTGACGGGTTCCC TGGCCCTGCGCGCACTGGCTGATGATCCGGGCACCGCACCGCTGGATACCTTTATGACGC TGAGCGAAGCTCTGACGGGCAAAAAAGGTCTGTCTCGTGTTCTGGGCCAGCGTTTTCTGC AAGCGCTGCAAAAAGGTTCATTCAAAACCGCGGATTCGCTGCCGCAGCTGGCGGGCGCCC TGGCAAGCGGTTCTCTGAACCCGGACCAAGAAGCTCTGGCGCTGAAAATCCTGGAAGCAT GGTATCTGGGCATTGTTGATAATGTGGTTATCACCTACGAAGAAGCCCTGATGTTTAGTGT CGTGTCCGACACGCTGGTCATTCCGAGCTATTGCCCGAACAAACCGGGTTTCTGGGCCGA AAAACCGATCGAACGTCAGGCATAATGGCGGATACGGATACCCAGAAAGCGGACGTGGT CGTGGTTGGATCCGGCGTGGCAGGCGCAATCGTGGCTCATCAACTGGCAATGGCAGGTAA AAGCGTGATCCTGCTGGAAGCTGGTCCGCGTATGCCGCGTTGGGAAATTGTTGAACGTTT CCGCAATCAAGTCGATAAAACCGACTTTATGGCACCGTATCCGAGCAGCGCATGGGCACC GCATCCGGAATATGGTCCGCCGAATGATTACCTGATCCTGAAAGGCGAACACAAATTTAA CTCACAGTACATTCGTGCAGTGGGCGGCACCACGTGGCATTGGGCAGCCTCGGCATGGCG CTTCATCCCGAACGATTTTAAAATGAAAACCGTGTATGGCGTTGGTCGTGACTGGCCGATT CAGTACGATGACATCGAACATTATTACCAACGCGCGGAAGAAGAACTGGGCGTGTGGGG TCCGGGCCCGGAAGAAGACCTGTATTCACCGCGTAAAGAACCGTACCCGATGCCGCCGCT GCCGCTGAGTTTCAATGAACAAACCATTAAATCCGCTCTGAACGGCTATGATCCGAAATT TCACGTGGTTACGGAACCGGTGGCCCGTAATTCGCGCCCGTACGACGGTCGCCCGACCTG CTGTGGCAACAATAACTGCATGCCGATTTGTCCGATCGGTGCAATGTATAACGGCATCGT CCATGTGGAAAAAGCTGAACAGGCAGGTGCTAAACTGATTGATAGTGCGGTCGTGTACAA ACTGGAAACGGGCCCGGACAAACGTATTACCGCAGCTGTTTATAAAGATAAAACGGGTG CGGACCATCGCGTCGAAGGCAAATACTTCGTGATTGCGGCCAATGGTATCGAAACCCCGA AAATTCTGCTGATGAGCGCGAACCGTGATTTTCCGAATGGTGTGGCCAACAGTTCCGATA TGGTTGGCCGCAATCTGATGGACCATCCGGGCACCGGCGTGAGCTTTTATGCAAACGAAA AACTGTGGCCGGGTCGTGGTCCGCAGGAAATGACCTCTCTGATCGGTTTCCGTGATGGCC CGTTTCGCGCGAATGAAGCAGCGAAGAAAATTCATCTGTCAAATATGTCGCGTATCAACC AGGAAACCCAAAAAATCTTTAAAGGCGGTAAACTGATGAAACCGGAAGAACTGGATGCG CAGATCCGTGACCGCAGTGCCCGCTTTGTTCAATTCGATTGCTTTCACGAAATCCTGCCGC AGCCGGAAAATCGTATTGTCCCGTCCAAAACCGCAACGGACGCAGTGGGTATTCCGCGTC CGGAAATTACGTATGCGATCGATGACTACGTCAAACGTGGCGCAGTGCATACGCGCGAAG 
TTTATGCTACCGCGGCCAAAGTGCTGGGCGGCACCGAAGTGGTCTTCAACGATGAATTTG CGCCGAATAACCACATCACCGGTGCCACGATTATGGGCGCGGATGCCCGTGACTCAGTGG TTGATAAAGACTGTCGCGCCTTCGATCATCCGAACCTGTTTATTAGCAGCAGCAGCACCAT GCCGACGGTTGGCACCGTTAACGTCACCCTGACGATTGCAGCTCTGGCACTGCGTATGTCT GATACGCTGAAAAAAGAAGTCGAATTCGGTTCTGGTTATGGCTCTGGTCCGCCGGGTCCG ATTCGTGCAGGTGCTACCATGCCGCATCGTGATCGTGGTCCGTGCGGTGCATGTCACGCTA TTATCCAG.
[0208] In one embodiment, the recombinant protein is encoded by a DNA sequence comprising or consisting of the nucleotide sequence: CCATGG CTCACAATGACAACACCCCGCACTCCCGCCGTACCGGCGATGCGGCCGTGACCGGTATTA CGCGTCGCCAGTGGCTGCAAGGCGCGCTGGCCCTGACCGCAGCTGGCCTGACGGGTTCCC TGGCCCTGCGCGCACTGGCTGATGATCCGGGCACCGCACCGCTGGATACCTTTATGACGC TGAGCGAAGCTCTGACGGGCAAAAAAGGTCTGTCTCGTGTTCTGGGCCAGCGTTTTCTGC AAGCGCTGCAAAAAGGTTCATTCAAAACCGCGGATTCGCTGCCGCAGCTGGCGGGCGCCC TGGCAAGCGGTTCTCTGAACCCGGACCAAGAAGCTCTGGCGCTGAAAATCCTGGAAGCAT GGTATCTGGGCATTGTTGATAATGTGGTTATCACCTACGAAGAAGCCCTGATGTTTAGTGT CGTGTCCGACACGCTGGTCATTCCGAGCTATTGCCCGAACAAACCGGGTTTCTGGGCCGA AAAACCGATCGAACGTCAGGCATAATGGCGGATACGGATACCCAGAAAGCGGACGTGGT CGTGGTTGGATCCGGCGTGGCAGGCGCAATCGTGGCTCATCAACTGGCAATGGCAGGTAA AAGCGTGATCCTGCTGGAAGCTGGTCCGCGTATGCCGCGTTGGGAAATTGTTGAACGTTT CCGCAATCAAGTCGATAAAACCGACTTTATGGCACCGTATCCGAGCAGCGCATGGGCACC GCATCCGGAATATGGTCCGCCGAATGATTACCTGATCCTGAAAGGCGAACACAAATTTAA CTCACAGTACATTCGTGCAGTGGGCGGCACCACGTGGCATTGGGCAGCCTCGGCATGGCG CTTCATCCCGAACGATTTTAAAATGAAAACCGTGTATGGCGTTGGTCGTGACTGGCCGATT CAGTACGATGACATCGAACATTATTACCAACGCGCGGAAGAAGAACTGGGCGTGTGGGG TCCGGGCCCGGAAGAAGACCTGTATTCACCGCGTAAAGAACCGTACCCGATGCCGCCGCT GCCGCTGAGTTTCAATGAACAAACCATTAAATCCGCTCTGAACGGCTATGATCCGAAATT TCACGTGGTTACGGAACCGGTGGCCCGTAATTCGCGCCCGTACGACGGTCGCCCGACCTG CTGTGGCAACAATAACTGCATGCCGATTTGTCCGATCGGTGCAATGTATAACGGCATCGT CCATGTGGAAAAAGCTGAACAGGCAGGTGCTAAACTGATTGATAGTGCGGTCGTGTACAA ACTGGAAACGGGCCCGGACAAACGTATTACCGCAGCTGTTTATAAAGATAAAACGGGTG CGGACCATCGCGTCGAAGGCAAATACTTCGTGATTGCGGCCAATGGTATCGAAACCCCGA AAATTCTGCTGATGAGCGCGAACCGTGATTTTCCGAATGGTGTGGCCAACAGTTCCGATA TGGTTGGCCGCAATCTGATGGACCATCCGGGCACCGGCGTGAGCTTTTATGCAAACGAAA AACTGTGGCCGGGTCGTGGTCCGCAGGAAATGACCTCTCTGATCGGTTTCCGTGATGGCC CGTTTCGCGCGAATGAAGCAGCGAAGAAAATTCATCTGTCAAATATGTCGCGTATCAACC AGGAAACCCAAAAAATCTTTAAAGGCGGTAAACTGATGAAACCGGAAGAACTGGATGCG CAGATCCGTGACCGCAGTGCCCGCTTTGTTCAATTCGATTGCTTTCACGAAATCCTGCCGC AGCCGGAAAATCGTATTGTCCCGTCCAAAACCGCAACGGACGCAGTGGGTATTCCGCGTC 
CGGAAATTACGTATGCGATCGATGACTACGTCAAACGTGGCGCAGTGCATACGCGCGAAG TTTATGCTACCGCGGCCAAAGTGCTGGGCGGCACCGAAGTGGTCTTCAACGATGAATTTG CGCCGAATAACCACATCACCGGTGCCACGATTATGGGCGCGGATGCCCGTGACTCAGTGG TTGATAAAGACTGTCGCGCCTTCGATCATCCGAACCTGTTTATTAGCAGCAGCAGCACCAT GCCGACGGTTGGCACCGTTAACGTCACCCTGACGATTGCAGCTCTGGCACTGCGTATGTCT GATACGCTGAAAAAAGAAGTCGAATTCGGTTCTGGTTATGGCTCTGGTCCGCCGGGTCCG ATTCGTGCAGGTGCTACCATGCCGCATCGTGATCGTGGTCCGTGCGGTGCATGTCACGCTA TTATCCAGGGCAGTGGTTCCGGCCATCACCATCACCATCACTAAAAGCTT (SEQ ID NO: 7 with linker, his and restriction sites).
[0209] In one embodiment, the recombinant protein with or without the gamma subunit as described herein is encoded by a DNA sequence of 1500 to 3000 nucleotides. In one embodiment, the recombinant protein with or without the gamma subunit as described herein is encoded by a DNA sequence of 1600 to 2600 nucleotides. In one embodiment, the recombinant protein with or without the gamma subunit as described herein is encoded by a DNA sequence of 1800 to 2500 nucleotides. In one embodiment, the recombinant protein is encoded by a DNA sequence of 2000 to 2400 nucleotides.
[0210] In one embodiment, the recombinant protein is 350 to 700 amino acids long. In one embodiment, the recombinant protein is 220 to 600 amino acids long. In one embodiment, the recombinant protein is 250 to 550 amino acids long. In one embodiment, the recombinant protein is 450 to 850 amino acids long. In one embodiment, the recombinant protein is 470 to 750 amino acids long. In one embodiment, the recombinant protein is 500 to 700 amino acids long. In one embodiment, the recombinant protein is 710 to 780 amino acids long. In one embodiment, the recombinant protein is 500 to 600 amino acids long. In one embodiment, the recombinant protein is 520 to 580 amino acids long. In one embodiment, the recombinant protein is 350 to 550 amino acids long.
[0211] In one embodiment, a DNA sequence encoding the recombinant protein and the gamma subunit protein as described herein of SEQ ID NOs: 5 and 6 further comprises a Methionine codon (the initiation codon nucleotide sequence) 5' to SEQ ID NOs: 5 and/or 6. In one embodiment, a DNA sequence encoding the recombinant protein of SEQ ID NOs: 5 and 6 further comprises a Methionine codon (the initiation codon nucleotide sequence) 5' to SEQ ID NOs: 5 and/or 6. In one embodiment, a DNA sequence encoding the recombinant protein of SEQ ID NOs: 5 and 6 further comprises at its 5' end, a short DNA sequence comprising any 1-10 nucleotides sequence. In one embodiment, a DNA sequence encoding the recombinant protein of SEQ ID NOs: 5 and 6 further comprises at its 5' end, a short DNA sequence comprising 1-10 nucleotides sequence. In one embodiment, the 1-10 nucleotides sequence comprises the Methionine codon.
[0212] In one embodiment, a DNA sequence or molecule as described herein comprising a coding sequence encoding the recombinant protein, further encodes the gamma subunit protein as described herein (as a separate protein).
[0213] In one embodiment, the DNA sequence encoding the recombinant protein of the invention is any DNA molecule encoding the amino acid sequence encoded by any one of SEQ ID NOs: 5-7 or the amino acid sequence of any one of SEQ ID NOs: 8-11. In one embodiment, the DNA sequence encoding the recombinant protein with or without the gamma subunit as described herein is at least 70% identical to any one of SEQ ID NOs: 5-7. In one embodiment, the DNA sequence encoding the recombinant protein with or without the gamma subunit as described herein is at least 75% identical to any one of SEQ ID NOs: 5-7. In one embodiment, the DNA sequence encoding the recombinant protein with or without the gamma subunit as described herein is at least 80% identical to any one of SEQ ID NOs: 5-7. In one embodiment, the DNA sequence encoding the recombinant protein with or without the gamma subunit as described herein is at least 85% identical to any one of SEQ ID NOs: 5-7. In one embodiment, the DNA sequence encoding the recombinant protein with or without the gamma subunit as described herein is at least 90% identical to any one of SEQ ID NOs: 5-7. In one embodiment, the DNA sequence encoding the recombinant protein with or without the gamma subunit as described herein is at least 95% identical to any one of SEQ ID NOs: 5-7.
[0214] In some embodiments, a DNA molecule of the invention or a DNA sequence described herein comprises or consists of any sequence encoding the recombinant protein, including (but not limited to) the recombinant protein comprising or consisting of any one of SEQ ID NOs: 8-11. In some embodiments, a DNA molecule of the invention or a DNA sequence described herein comprises or consists of any sequence encoding the recombinant protein and the gamma subunit, as described herein, including (but not limited to) the amino acid sequences set forth in any one of SEQ ID NOs: 8-11. In one embodiment, the recombinant protein and/or the gamma subunit, as described herein, is/are translated based on the DNA sequences provided herein. In one embodiment, the recombinant protein has an amino acid sequence that is at least 70%, 80%, 90%, 95%, or 97% identical to: (a) a recombinant protein translated from the DNA sequences provided herein or (b) any one of SEQ ID NOs: 8-11.
[0215] In some embodiments, the protein of the invention comprises a tag. In some embodiments, the DNA encoding the proteins of the invention comprises a sequence encoding the tag. In some embodiments, the tag is selected from an N-terminal tag, a C-terminal tag and an internal tag. A skilled artisan will appreciate that the tag should be positioned so as not to interfere with the function of the recombinant protein. Thus, the tag will not interfere with the redox activity or the DET activity. In some embodiments, the tag is a C-terminal tag. In some embodiments, the tag is a His tag. In some embodiments, the tag is a 6×His tag. In some embodiments, the His tag comprises or consists of the amino acid sequence HHHHHH (SEQ ID NO: 28). In some embodiments, the DNA encoding the His tag comprises the sequence CATCACCATCACCATCAC (SEQ ID NO: 24) (e.g., in addition to SEQ ID NO: 19-23). A skilled artisan will appreciate that any sequence which encodes the tag may be used. Protein tags are well known in the art and include, but are not limited to, HA tags, His tags, GFP tags, Myc tags, biotin tags, FLAG tags, streptavidin tags, and many others. Tagging may be useful for purification of the protein, and the tag may be cleaved before the enzyme is used. In some embodiments, the tag is small. In some embodiments, the tag is equal to or smaller than 40, 35, 30, 25, 20, 15, 10, 7, or 5 amino acids. Each possibility represents a separate embodiment of the invention. A smaller tag may be advantageous in that it is less likely to interfere with DET.
[0216] In some embodiments, the tag is connected to the recombinant protein by a linker. The linker may be a linker such as has been described hereinabove. In some embodiments, the linker comprises or consists of the sequence GSGSG (SEQ ID NO: 29). In some embodiments, the DNA sequence that encodes the linker comprises or consists of the sequence GGCAGTGGTTCCGGC (SEQ ID NO: 25) (e.g., in addition to SEQ ID NO: 19-23). A skilled artisan will appreciate that any sequence which encodes the linker may be used. In some embodiments, the linker is produced by the restriction site introduced into the DNA to produce the recombinant protein. Indeed, there may be a linker produced by restriction site insertion between any of the different parts of the recombinant protein.
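As an illustration of how the tag and linker DNA sequences given above map to their amino acid sequences, the following minimal sketch translates them with the standard genetic code. The translate helper is hypothetical and written only for this example; the two DNA strings are those recited as SEQ ID NO: 24 and SEQ ID NO: 25.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumption: standard code applies).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids
    ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


HIS_TAG_DNA = "CATCACCATCACCATCAC"  # SEQ ID NO: 24
LINKER_DNA = "GGCAGTGGTTCCGGC"      # SEQ ID NO: 25

if __name__ == "__main__":
    print(translate(HIS_TAG_DNA))  # HHHHHH
    print(translate(LINKER_DNA))   # GSGSG
```

Because the code is degenerate, other DNA sequences would translate to the same tag and linker, which is consistent with the statement that any sequence encoding them may be used.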
General
[0217] As used herein the term "about" refers to ±10%.
[0218] The terms "comprises", "comprising", "includes", "including", "having" and their conjugates mean "including but not limited to". The term "consisting of" means "including and limited to". The term "consisting essentially of" means that the composition, method or structure may include additional ingredients, steps and/or parts, but only if the additional ingredients, steps and/or parts do not materially alter the basic and novel characteristics of the claimed composition, method or structure.
[0219] The word "exemplary" is used herein to mean "serving as an example, instance or illustration". Any embodiment described as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments and/or to exclude the incorporation of features from other embodiments.
[0220] The word "optionally" is used herein to mean "is provided in some embodiments and not provided in other embodiments". Any particular embodiment of the invention may include a plurality of "optional" features unless such features conflict.
[0221] As used herein, the singular form "a", "an" and "the" include plural references unless the context clearly dictates otherwise. For example, the term "a compound" or "at least one compound" may include a plurality of compounds, including mixtures thereof.
[0222] Throughout this application, various embodiments of this invention may be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth of the range.
[0223] Whenever a numerical range is indicated herein, it is meant to include any cited numeral (fractional or integral) within the indicated range. The phrases "ranging/ranges between" a first indicate number and a second indicate number and "ranging/ranges from" a first indicate number "to" a second indicate number are used herein interchangeably and are meant to include the first and second indicated numbers and all the fractional and integral numerals therebetween.
[0224] As used herein the term "method" refers to manners, means, techniques and procedures for accomplishing a given task including, but not limited to, those manners, means, techniques and procedures either known to, or readily developed from known manners, means, techniques and procedures by practitioners of the chemical, and material arts.
[0225] As used herein, the term "treating" includes abrogating, substantially inhibiting, slowing or reversing the progression of a condition, of aesthetical symptoms of a condition.
[0226] In those instances where a convention analogous to "at least one of A, B, and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B, and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). It will be further understood by those within the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase "A or B" will be understood to include the possibilities of "A" or "B" or "A and B."
[0227] It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable subcombination or as suitable in any other described embodiment of the invention. Certain features described in the context of various embodiments are not to be considered essential features of those embodiments, unless the embodiment is inoperative without those elements.
[0228] Various embodiments and aspects of the present invention as delineated hereinabove and as claimed in the claims section below find experimental support in the following examples.
Examples
[0229] Reference is now made to the following examples which, together with the above descriptions, illustrate the invention in a non-limiting fashion.
Plasmid Construction
[0230] The FAD-GDH γ subunit as well as its catalytic α subunit were cloned into the pTrcHis6A2 vector between the NcoI and HindIII restriction sites. The partial FAD-GDH gene was followed by a short flexible polypeptide linker (13 amino acids long) and the MCR2 (minimal cytochrome domain, MCD) DNA sequence with a 6×His tag at the sequence's C-terminus, as shown in FIG. 6 in a map of the new fusion protein. For the WT GDH the same construct was used, but without the MCD sequence. Full DNA construct: CCATGGCTCACAATGACAACACCCCGCACTCCCGCCGTACCGGCGATGCGGCCGTGACCG GTATTACGCGTCGCCAGTGGCTGCAAGGCGCGCTGGCCCTGACCGCAGCTGGCCTGACGG GTTCCCTGGCCCTGCGCGCACTGGCTGATGATCCGGGCACCGCACCGCTGGATACCTTTAT GACGCTGAGCGAAGCTCTGACGGGCAAAAAAGGTCTGTCTCGTGTTCTGGGCCAGCGTTT TCTGCAAGCGCTGCAAAAAGGTTCATTCAAAACCGCGGATTCGCTGCCGCAGCTGGCGGG CGCCCTGGCAAGCGGTTCTCTGAACCCGGACCAAGAAGCTCTGGCGCTGAAAATCCTGGA AGCATGGTATCTGGGCATTGTTGATAATGTGGTTATCACCTACGAAGAAGCCCTGATGTTT AGTGTCGTGTCCGACACGCTGGTCATTCCGAGCTATTGCCCGAACAAACCGGGTTTCTGG GCCGAAAAACCGATCGAACGTCAGGCATAATGGCGGATACGGATACCCAGAAAGCGGAC GTGGTCGTGGTTGGATCCGGCGTGGCAGGCGCAATCGTGGCTCATCAACTGGCAATGGCA GGTAAAAGCGTGATCCTGCTGGAAGCTGGTCCGCGTATGCCGCGTTGGGAAATTGTTGAA CGTTTCCGCAATCAAGTCGATAAAACCGACTTTATGGCACCGTATCCGAGCAGCGCATGG GCACCGCATCCGGAATATGGTCCGCCGAATGATTACCTGATCCTGAAAGGCGAACACAAA TTTAACTCACAGTACATTCGTGCAGTGGGCGGCACCACGTGGCATTGGGCAGCCTCGGCA TGGCGCTTCATCCCGAACGATTTTAAAATGAAAACCGTGTATGGCGTTGGTCGTGACTGG CCGATTCAGTACGATGACATCGAACATTATTACCAACGCGCGGAAGAAGAACTGGGCGTG TGGGGTCCGGGCCCGGAAGAAGACCTGTATTCACCGCGTAAAGAACCGTACCCGATGCCG CCGCTGCCGCTGAGTTTCAATGAACAAACCATTAAATCCGCTCTGAACGGCTATGATCCG AAATTTCACGTGGTTACGGAACCGGTGGCCCGTAATTCGCGCCCGTACGACGGTCGCCCG ACCTGCTGTGGCAACAATAACTGCATGCCGATTTGTCCGATCGGTGCAATGTATAACGGC ATCGTCCATGTGGAAAAAGCTGAACAGGCAGGTGCTAAACTGATTGATAGTGCGGTCGTG TACAAACTGGAAACGGGCCCGGACAAACGTATTACCGCAGCTGTTTATAAAGATAAAAC GGGTGCGGACCATCGCGTCGAAGGCAAATACTTCGTGATTGCGGCCAATGGTATCGAAAC CCCGAAAATTCTGCTGATGAGCGCGAACCGTGATTTTCCGAATGGTGTGGCCAACAGTTC 
CGATATGGTTGGCCGCAATCTGATGGACCATCCGGGCACCGGCGTGAGCTTTTATGCAAA CGAAAAACTGTGGCCGGGTCGTGGTCCGCAGGAAATGACCTCTCTGATCGGTTTCCGTGA TGGCCCGTTTCGCGCGAATGAAGCAGCGAAGAAAATTCATCTGTCAAATATGTCGCGTAT CAACCAGGAAACCCAAAAAATCTTTAAAGGCGGTAAACTGATGAAACCGGAAGAACTGG ATGCGCAGATCCGTGACCGCAGTGCCCGCTTTGTTCAATTCGATTGCTTTCACGAAATCCT GCCGCAGCCGGAAAATCGTATTGTCCCGTCCAAAACCGCAACGGACGCAGTGGGTATTCC GCGTCCGGAAATTACGTATGCGATCGATGACTACGTCAAACGTGGCGCAGTGCATACGCG CGAAGTTTATGCTACCGCGGCCAAAGTGCTGGGCGGCACCGAAGTGGTCTTCAACGATGA ATTTGCGCCGAATAACCACATCACCGGTGCCACGATTATGGGCGCGGATGCCCGTGACTC AGTGGTTGATAAAGACTGTCGCGCCTTCGATCATCCGAACCTGTTTATTAGCAGCAGCAG CACCATGCCGACGGTTGGCACCGTTAACGTCACCCTGACGATTGCAGCTCTGGCACTGCG TATGTCTGATACGCTGAAAAAAGAAGTCGAATTCGGTTCTGGTTATGGCTCTGGTCCGCC GGGTCCGATTCGTGCAGGTGCTACCATGCCGCATCGTGATCGTGGTCCGTGCGGTGCATG TCACGCTATTATCCAGGGCAGTGGTTCCGGCCATCACCATCACCATCACTAAAAGCTT (SEQ ID NO: 26). The first 6 nucleotides and the last 6 nucleotides are NcoI and HindIII restriction sites. The Gamma subunit is expressed as a separate protein encoded from nucleotide 3 to nucleotide 512 of the DNA molecule described herein. The Alpha subunit is from nucleotide 512 to nucleotide 2128. The flexible linker is from nucleotide 2129 to nucleotide 2167. The carboxy terminal MCD linker is from nucleotide 2168 to nucleotide 2238. 6 times His-Tag from nucleotide 2239 to nucleotide 2269.
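The feature coordinates listed above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (the text does not state the convention); the names `FEATURES`, `feature_length`, `extract` and `summarize` are illustrative only. Under that assumption the flexible linker spans 39 nucleotides, i.e. 13 codons, which agrees with the 13-amino-acid linker described in the text. The remainder column flags any interval whose length is not a whole number of codons, so that its boundaries can be compared against the sequence listing.

```python
# Feature coordinates exactly as stated in the text.
# Assumption: 1-based, inclusive numbering on the full DNA construct.
FEATURES = {
    "gamma subunit": (3, 512),
    "alpha subunit": (512, 2128),
    "flexible linker": (2129, 2167),
    "MCD": (2168, 2238),
    "6xHis tag": (2239, 2269),
}


def feature_length(start, end):
    """Length in nucleotides of a 1-based, inclusive interval."""
    return end - start + 1


def extract(sequence, start, end):
    """Return the subsequence covered by a 1-based, inclusive interval."""
    return sequence[start - 1:end]


def summarize(features):
    """Return (name, length_nt, whole_codons, leftover_nt) for each feature."""
    rows = []
    for name, (start, end) in features.items():
        n = feature_length(start, end)
        rows.append((name, n, n // 3, n % 3))
    return rows


for name, n, codons, leftover in summarize(FEATURES):
    print(f"{name:16s} {n:5d} nt  {codons:4d} codons  remainder {leftover}")
```

To pull a feature out of the construct, pass the full DNA string and the interval to `extract`, for example `extract(construct, *FEATURES["flexible linker"])`, where `construct` is a hypothetical variable holding the sequence.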
Enzyme Expression and Purification
[0231] The complete pTrcHis6A2-FAD-GDH-MCD plasmid was transformed into E. coli BL21 cells for expression of the fusion protein (FAD-GDH-MCD). Bacteria were cultured in an auto-induction medium (Formedium.TM., Hunstanton, England) with 0.5% glycerol (Bio-Lab ltd., Jerusalem, Israel) and 10 .mu.g/mL carbenicillin (Apollo, Manchester, England) and grown at 37.degree. C. with shaking at 250 rpm for 6 hours, after which the cells were transferred to 27.degree. C. for 18 additional hours.
[0232] The cells were then centrifuged at 6000 rpm for 10 minutes, and the pellet was resuspended in 20 mL of 50 mM Tris-base buffer (TB, Fisher scientific, Geel, Belgium) pH=7.0, centrifuged, weighed, and resuspended in lysis buffer (300 mM KCl, 50 mM KH.sub.2PO.sub.4, 10 mM imidazole pH=7.0) in a 1:3 (weight:buffer volume) ratio. Protease inhibitors, lysozyme and supernuclease were added to the suspension in 1:500, 1:400, and 1:5000 ratios, respectively. The cells were lysed using a sonication needle, followed by 15 minutes of incubation at 50.degree. C. and centrifugation (11,500 rpm for 25 minutes). The supernatant was filtered through a 0.22 .mu.m filter.
[0233] The fusion enzyme was purified using an IMAC purification system (Bio-Rad, Profinia, Hercules, Calif., USA) according to the manufacturer's instructions.
FAD-GDH Activity Assay
[0234] To assess the D-glucose oxidation activity of the FAD-GDH-MCD fusion enzyme (FGM), glucose oxidation was measured in the presence of 0.6 mM 2,6-Dichloroindophenol (DCIP, Sigma-Aldrich, Rehovot, Israel), 0.6 mM phenazine methyl sulfate (PMS, Tokyo chemical industry, Tokyo, Japan), different concentrations of D-glucose and 0.2 mM FGM, all dissolved in 50 mM TB pH=7.0. The assay was performed at 37.degree. C. using a 96-well plate reader (BioTek instruments, Winooski, Vt., USA) with shaking between measurements, monitoring the decrease in DCIP absorbance at .lamda.=610 nm every 15 seconds over 25 min of activity.
Heme Activity Measurements
[0235] To verify the attachment of a porphyrin to the MCD domain, heme activity was measured using 1 mM dimethoxybenzidine (DMB, Alfa Aesar, Heysham, England), 1 mM hydrogen peroxide (Sigma-Aldrich, Rehovot, Israel), and 0.2 mM of the enzyme. Measurements were performed at 37.degree. C. using a 96-well plate reader, monitoring the increase in DMB absorbance at .lamda.=455 nm every 15 seconds over 30 min of activity.
Peroxidase Activity Interference Measurements
[0236] To verify that FGM does not transfer electrons to oxygen and thereby produce hydrogen peroxide, which would interfere with the FAD-GDH activity assay, a peroxidase activity test was performed without added hydrogen peroxide. 0.17 mM DMB and 172 mM glucose were mixed with 8 .mu.L of 0.1 mg/mL Horseradish peroxidase (HRP, Sigma-Aldrich, Rehovot, Israel) to generate a reaction mix. 8 .mu.L of concentrated FGM or glucose oxidase (GOx) were added to the reaction mix right before measurement. Measurements were performed at 37.degree. C. using a plate reader, monitoring the increase in DMB absorbance at .lamda.=500 nm every 15 seconds over 25 min of activity (FIG. 3C).
Electrode Preparation
[0237] Glassy carbon electrodes (GCE, 3 mm in diameter; ALS, Tokyo, Japan) were polished using 0.05 .mu.m alumina slurry on a polishing pad for two minutes, then transferred to a 20 mL glass with 10 mL double-distilled water (DDW) and sonicated for five minutes in a sonication bath. The electrodes were then dried under nitrogen gas. 10 .mu.L of enzyme solution at the desired concentration was dropped on the electrode surface. The electrodes were incubated at 4.degree. C. overnight to generate an enzyme film on the GCE. The electrode surfaces were then covered with a 12-14 kDa dialysis membrane (Membrane Filtration Products, Seguin, Tex., USA), which was tightened using a rubber O-ring to prevent diffusion of the enzyme into the solution.
Electrochemical Measurements
[0238] Cyclic voltammetric measurements were performed using a PalmSense2 potentiostat (Palm Instruments, Houten, The Netherlands) with a standard three-electrode system: a 0.9 mm graphite rod as the auxiliary electrode, a 3M KCl saturated Ag/AgCl reference electrode (ALS, Tokyo, Japan) and a GCE as the working electrode, in 0.15M phosphate-citrate buffer pH=5.0. Chronoamperometric measurements (FIG. 7) were performed under the same conditions with the application of 0 mV vs. Ag/AgCl and the addition of varying concentrations of glucose or potential interfering molecules. Square wave voltammetry (SWV) measurements were performed under the same conditions.
FGM Selectivity Test
[0239] The interference of different sugars and molecules with glucose biosensing by GCE/FGM was measured using chronoamperometry (see selectivity FIGS. 8A-C). Measurements were performed in 0.15M phosphate-citrate buffer pH=5.0 with the application of 0 mV vs. the Ag/AgCl reference electrode. FGM selectivity was tested by adding 3.6 mM of glucose followed by two sequential additions of one of the different sugars or molecules at its relevant physiological concentration. The sugars used for this test were D-galactose, lactose, D-maltose and D-xylose, and the molecules were ascorbic acid and acetaminophen (all from Sigma-Aldrich, Rehovot, Israel). The selectivity test revealed a small interference caused by galactose, while the other sugars did not interfere with the measurement. This result indicates high selectivity towards glucose and low or no reaction with other sugars that can be found in human blood samples, which is very important for biosensing accuracy. GCE/FGM showed low sensitivity to ascorbic acid and no sensitivity to acetaminophen.
The Biocatalytic Recombinant Protein
[0240] In the present study, a fusion enzyme was designed that combines the biocatalytic function of a redox enzyme domain with a natural minimal ET domain, fused via a short polypeptide linker as shown in FIG. 1. As the catalytic domain, the .alpha. subunit of an FAD-GDH from Burkholderia cepacia was used. As a minimal ET unit, the c-type cytochrome domain MCR-2 from a MamP protein, which originates from the magnetotactic bacterium magnetoovoid bacterium MO-1, was chosen. MamP is part of the magnetosome, a unique organelle found in magnetotactic bacteria that allows magnetotaxis to occur in these bacteria. MCR-2 is one of the shortest natural c-type cytochromes known today (23 amino-acids long), and thus can be used to achieve DET.
[0241] In order for the FAD-GDH-MCD (FGM) fused enzyme to mature properly in the host cell, the FAD-GDH .alpha. subunit should fold correctly. The enzyme's .gamma. subunit aids in the maturation of the .alpha. subunit and localizes it to the periplasm. Within the bacterial periplasmic environment, the maturation of c-type cytochromes (heme binding cytochromes) occurs with the help of a specific gene cluster called ccmA-H. In that manner, the holo-enzyme is transferred to the periplasm, where the maturation of the MCD occurs.
[0242] The engineered DNA sequence of the fusion enzyme was cloned into the pTrcHis6A2 expression vector and transformed into E. coli BL21. FGM was overexpressed in the bacterial expression system and then purified using an immobilized metal affinity chromatography (IMAC) purification system (FIG. 2A). In-gel heme staining was performed to verify the presence of heme in FGM compared to GDH, and anti-His-tag Western blot analysis was performed to verify expression of the full-length enzyme (FIG. 2B, right panel). As shown in FIG. 2B, both FGM and GDH were expressed and their respective bands appeared at the expected sizes, ca. 64 kDa and 62 kDa for FGM and GDH, respectively. In-gel heme staining revealed a band for FGM only, indicating the presence of an iron-containing porphyrin bound to FGM.
[0243] FGM catalytic redox activity and heme peroxidase activity were measured biochemically and compared to GDH as shown in FIG. 3A. FGM oxidized D-glucose, as measured by the FAD-GDH activity assay in 50 mM Tris-base (pH 7.0), 0.6 mM 2,6-Dichloroindophenol (DCIP) and 0.6 mM phenazine methyl sulfate (PMS) at 37.degree. C. Absorbance (lambda=610 nm) was monitored using a plate reader while the oxidized DCIP (blue color) was being reduced by FGM to its reduced form (colorless). FGM also showed peroxidase activity, measured by the heme activity assay in 1 mM 3,3'-Dimethoxybenzidine (DMB) and 1 mM hydrogen peroxide. Absorbance at 455 nm was monitored as the MCD oxidized DMB, yielding oxidized DMB (red color). The heme activity assay results indicate that FGM indeed binds a heme group, while no heme molecules are bound by GDH.
[0244] Absorbance spectra of the protein samples revealed a peak at 408 nm for FGM and no peak for GDH, indicating the presence of heme c in FGM (FIG. 3B). The A408 nm/A280 nm ratio was calculated to be 0.4 for FGM expressed in the presence of the pEC86 plasmid, compared to 0.2 for FGM expressed in its absence, indicating more efficient heme maturation in the presence of the helper plasmid.
[0245] The apparent kinetic and thermodynamic parameters of FGM were calculated using the FAD-GDH biochemical activity assay at 37.degree. C. (Table 1). FGM and GDH concentrations were first determined spectrophotometrically using a standard Bradford assay. Michaelis-Menten curves were transformed to Lineweaver-Burk curves in order to determine the kinetic and thermodynamic parameters of the enzymes. One enzyme activity unit was defined as the amount of enzyme oxidizing 1 .mu.M of substrate per minute. The molar absorption coefficient of DCIP was calculated to be 4.7 cm.sup.-1mM.sup.-1. Using the biochemical activity assay and the DCIP molar absorption coefficient, the specific activity of FGM and GDH was calculated to be 16 mU.
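The conversion from a measured absorbance decrease to a DCIP reduction rate follows from the Beer-Lambert law. The following is a minimal sketch, assuming a linear initial phase and using the molar absorption coefficient stated above (4.7 cm.sup.-1 mM.sup.-1). The optical path length is not given in the text and is left as a parameter; the default of 1 cm, the function names and the synthetic readings are assumptions for illustration, not data from the study.

```python
def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys versus xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def dcip_rate_mM_per_min(times_min, absorbances, epsilon=4.7, path_cm=1.0):
    """DCIP reduction rate from an absorbance time course.

    epsilon : molar absorption coefficient in mM^-1 cm^-1 (4.7 per the text)
    path_cm : optical path length in cm (assumed; not stated in the text)
    Beer-Lambert: A = epsilon * path * c, so dc/dt = (dA/dt) / (epsilon * path).
    The sign is flipped because absorbance falls as DCIP is reduced.
    """
    slope = linear_slope(times_min, absorbances)  # absorbance units per minute
    return -slope / (epsilon * path_cm)


# Synthetic example: one reading every 15 s (0.25 min), as in the assay schedule.
times = [i * 0.25 for i in range(9)]
readings = [1.0 - 0.047 * t for t in times]  # made-up linear decrease
print(dcip_rate_mM_per_min(times, readings))
```

For a plate reader the effective path length depends on the well volume, so `path_cm` should be replaced with the value appropriate to the instrument and fill volume in use.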
[0246] Lineweaver-Burk plots (FIGS. 7A-B) were used to calculate the kinetic and thermodynamic parameters of the enzymes. K.sub.M.sup.app values were 157.+-.5 .mu.M and 174.+-.9 .mu.M for FGM and GDH, respectively, showing a different affinity of the enzymes toward the substrate. The K.sub.M.sup.app value of GDH is lower than reported values, yet of the same order of magnitude as some of them. The k.sub.cat.sup.app value was ca. 3 times higher for FGM compared to GDH, indicating faster oxidation of D-glucose by FGM. FGM also showed more than 3 times higher catalytic efficiency (k.sub.cat.sup.app/K.sub.M.sup.app) compared to GDH (Table 1). Next, the electrochemical activity of the enzymes was measured to determine whether the addition of the minimal cytochrome domain improves enzyme-electrode ET.
[0247] Chronoamperometric measurements (FIG. 7C) were performed to determine the apparent kinetic parameters of FGM compared to GDH. Using a standard three-electrode electrochemical cell, the current was measured vs. successive glucose additions. A potential of 0V was applied during the measurements. The current for each glucose concentration was determined and is presented in FIG. 5A. As described above, the electrochemical kinetic constants were calculated using the linear part of the transient curves and the Lineweaver-Burk transformation (FIG. 5B).
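The Lineweaver-Burk transformation linearizes the Michaelis-Menten equation as 1/v = (K.sub.M/V.sub.max)(1/[S]) + 1/V.sub.max, so that a straight-line fit of 1/v against 1/[S] yields both constants. The following is a minimal sketch of that calculation using an ordinary least-squares fit; the function names are illustrative, and the data are synthetic values generated from arbitrary constants, not measurements from the study.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lineweaver_burk(substrate, rates):
    """Estimate (K_M, V_max) from a double-reciprocal plot.

    Fits 1/v against 1/[S]; the intercept is 1/V_max and the slope
    is K_M/V_max, so K_M = slope * V_max.
    """
    inv_s = [1.0 / s for s in substrate]
    inv_v = [1.0 / v for v in rates]
    slope, intercept = linear_fit(inv_s, inv_v)
    v_max = 1.0 / intercept
    k_m = slope * v_max
    return k_m, v_max


# Synthetic, noise-free data from arbitrary constants (illustration only).
TRUE_KM, TRUE_VMAX = 150.0, 2.0
substrate = [25.0, 50.0, 100.0, 200.0, 400.0, 800.0]
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in substrate]

k_m, v_max = lineweaver_burk(substrate, rates)
print(f"K_M = {k_m:.1f}, V_max = {v_max:.2f}")
```

Because the double-reciprocal plot gives the greatest weight to the lowest substrate concentrations, noisy measurements at low [S] can dominate the fit; a direct nonlinear fit of the Michaelis-Menten equation is a common alternative when that is a concern.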
TABLE-US-00011 TABLE 1 Apparent biochemical kinetic parameters of FGM and GDH
Enzyme   k.sub.cat.sup.app (s.sup.-1)   K.sub.M.sup.app (.mu.M)   k.sub.cat.sup.app/K.sub.M.sup.app (s.sup.-1 mM.sup.-1)
GDH      1.7 .+-. 0.1                   174 .+-. 9                9.6 .+-. 0
FGM      5.2 .+-. 0.1                   157 .+-. 5                33 .+-. 0
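The relative differences between the two enzymes can be read directly from the Table 1 values. The following is a short sketch that computes the FGM-to-GDH ratios for each reported parameter; the dictionary layout and key names are illustrative, and the calculation ignores the stated uncertainties.

```python
# Central values as reported in Table 1 (uncertainties omitted).
TABLE_1 = {
    "GDH": {"kcat_per_s": 1.7, "km_uM": 174.0, "efficiency": 9.6},
    "FGM": {"kcat_per_s": 5.2, "km_uM": 157.0, "efficiency": 33.0},
}


def fold_change(table, parameter, numerator="FGM", denominator="GDH"):
    """Ratio of one tabulated parameter between two enzymes."""
    return table[numerator][parameter] / table[denominator][parameter]


kcat_fold = fold_change(TABLE_1, "kcat_per_s")
km_fold = fold_change(TABLE_1, "km_uM")
efficiency_fold = fold_change(TABLE_1, "efficiency")

print(f"k_cat fold change:      {kcat_fold:.2f}")
print(f"K_M fold change:        {km_fold:.2f}")
print(f"efficiency fold change: {efficiency_fold:.2f}")
```

With these values the turnover number and the catalytic efficiency each differ by a factor of roughly three, while the apparent K.sub.M values differ by about 10%.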
[0248] For the electrochemical measurements, a standard three-electrode electrochemical cell was used, with a 0.9 mm graphite rod as the auxiliary electrode, a 3M KCl saturated Ag/AgCl reference electrode and a 3 mm diameter glassy carbon electrode (GCE) as the working electrode. 10 .mu.L of ca. 30 .mu.g/mL FGM or GDH enzyme solution was dropped on the GCE surface and dried at 4.degree. C. overnight. The electrode surface was then covered with a 12-14 kDa dialysis membrane tightened to the surface with an O-ring, to keep the enzyme close to the electrode surface during measurements and to avoid enzyme diffusion into the surrounding buffer. Cyclic voltammetry (CV) measurements were performed to compare the enzyme-electrode communication of FGM to that of GDH.
[0249] The optimal pH value for electrochemical measurements was tested by measuring the catalytic current at pH values of 3.6-7.0 and was found to be 5.0.
[0250] Measurements were performed in phosphate-citrate buffer pH=5.0 at room temperature and a scan rate of 5 mV/sec for both enzymes, with and without the addition of 5 mM glucose. It can be seen in FIGS. 4A-B that the CVs of both enzymes before the addition of glucose are almost identical. No clear anodic or cathodic peaks were identified for either enzyme. After the addition of glucose to a final concentration of 5 mM, FGM demonstrated a 10 times higher electrocatalytic current than GDH, with an onset potential of ca. -100 mV for both enzymes (indicating no apparent shift in potential due to the fusion of the MCD). This difference in ET efficiency is probably due to the addition of the minimal cytochrome domain, which mediates the ET from the buried FAD cofactor to the GCE.
[0251] After the addition of glucose to a final concentration of 5 mM, FGM demonstrated a higher electrocatalytic current compared to that of GDH with an onset potential of ca. (-) 150 mV. The fact that no catalytic current was observed at this high scan rate using GDH, but a significant catalytic current evolved using FGM, is an indication of fast ET rates of FGM. That observed ET efficiency is probably due to addition of the minimal cytochrome domain that mediates ET from the buried FAD cofactor to GCE. To identify a peak originating from the MCD domain, square-wave voltammetry (SWV) was performed. As shown in FIG. 4C, the voltammogram of an electrode with FGM revealed a peak around (-) 230 mV, whereas GDH had no observable peak at this potential.
[0252] As shown in Table 2, the electrochemical K.sub.M.sup.app is 2.84.+-.0.57 mM for GDH and 1.40.+-.0.27 mM for FGM, indicating no significant difference in the affinity towards glucose. The i.sub.max value is about five times higher for FGM than for GDH: 2.04.+-.0.45 .mu.Acm.sup.-2 and 0.4.+-.0.17 .mu.Acm.sup.-2, respectively. The difference in the i.sub.max value indicates that the DET efficiency differs between the two enzymes, with FGM showing five to seven times higher current than GDH for the same glucose concentrations.
TABLE-US-00012 TABLE 2 Apparent electrochemical kinetic and thermodynamic parameters of FGM and GDH
Enzyme   K.sub.M.sup.app (mM)   i.sub.max (.mu.A cm.sup.-2)
GDH      2.8 .+-. 0.6           0.4 .+-. 0.2
FGM      1.4 .+-. 0.3           2.0 .+-. 0.5
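The Table 2 parameters can be used to estimate how the current densities of the two electrodes compare at a given glucose concentration. The following is a minimal sketch, assuming the steady-state current density follows a Michaelis-Menten-type dependence, i = i.sub.max[S]/(K.sub.M.sup.app + [S]), which is the form implied by reporting an apparent K.sub.M and i.sub.max. The function names and the chosen concentrations are illustrative, and the stated uncertainties are ignored.

```python
# Central values from Table 2 (uncertainties omitted).
TABLE_2 = {
    "GDH": {"km_mM": 2.8, "i_max_uA_cm2": 0.4},
    "FGM": {"km_mM": 1.4, "i_max_uA_cm2": 2.0},
}


def current_density(glucose_mM, km_mM, i_max_uA_cm2):
    """Michaelis-Menten-type current density in uA/cm^2 (assumed model)."""
    return i_max_uA_cm2 * glucose_mM / (km_mM + glucose_mM)


def fgm_to_gdh_ratio(glucose_mM, table=TABLE_2):
    """Predicted FGM/GDH current ratio at one glucose concentration."""
    fgm = current_density(glucose_mM, **table["FGM"])
    gdh = current_density(glucose_mM, **table["GDH"])
    return fgm / gdh


for glucose in (0.5, 1.0, 5.0, 20.0):
    print(f"{glucose:5.1f} mM glucose: FGM/GDH = {fgm_to_gdh_ratio(glucose):.2f}")
```

Under this model the ratio is not constant: it approaches the ratio of the i.sub.max/K.sub.M.sup.app values at low glucose and the ratio of the i.sub.max values at saturating glucose, so the fold difference observed in an experiment depends on the concentration at which it is read.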
[0253] FAD-GDH from Burkholderia cepacia is considered a good candidate for biosensing applications because of its stability at high temperatures and its insensitivity to oxygen. By adding a minimal cytochrome domain to the FAD-GDH C-terminus, an improved DET was provided, showing higher catalytic currents compared to GDH with almost the same affinity to the substrate. When tested electrochemically at an applied potential of 0V vs. Ag/AgCl, FGM showed much higher currents than GDH for the same substrate concentrations, which makes it more accurate for glucose biosensing, with improved sensitivity over that previously reported for GDH.
Non-Canonical Amino Acids (ncAAs) Incorporation into FGM for Site-Specific Wiring to an Electrode:
FGM Electron Transfer Machineries Investigation
[0254] To incorporate non-canonical amino acids (ncAAs) into FGM, several constructs containing the amber (TAG) mutation on the pTrcHis6A2-FGM plasmid were prepared using a standard site-directed mutagenesis PCR protocol. The mutation sites were chosen for their proximity to the different protein domains: the FAD binding domain, the MCD, and one site that is distant from both. For proximity to the FAD binding domain, two sites were found to be possible for ncAA incorporation, R42X and S247X (FIG. 9, red), both ca. 11 .ANG. from the FAD binding domain. For proximity to the MCD, T558X and P560X (yellow) are possible sites for incorporation, at 10 and 7 .ANG. from the MCD, respectively. D395X (black) was found to be far from both FAD and MCD, as it is about 37 .ANG. from FAD and 83 .ANG. from MCD.
[0255] Mutated plasmids were transformed into super competent E. coli DH5.alpha. cells and plated on selective LB-agar plates. Bacterial colonies were isolated and plasmids were purified using a miniprep kit, followed by sequencing.
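At the sequence level, each of these mutants replaces one sense codon with the amber stop codon TAG. The following is a minimal in-silico sketch of that substitution, assuming residue numbers count from the first codon of whatever coding sequence is supplied (the numbering reference for R42X, S247X and the other sites is not restated here). The function name `introduce_amber` and the short toy sequence are illustrative; this does not design mutagenesis primers.

```python
AMBER = "TAG"


def introduce_amber(cds, residue_number):
    """Return a copy of cds with the given codon replaced by TAG.

    cds            : in-frame coding sequence (length must be a multiple of 3)
    residue_number : 1-based codon position, counted from the first codon
                     of cds (assumed numbering convention)
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    n_codons = len(cds) // 3
    if not 1 <= residue_number <= n_codons:
        raise ValueError(f"residue_number must be between 1 and {n_codons}")
    start = 3 * (residue_number - 1)
    return cds[:start] + AMBER + cds[start + 3:]


# Toy example: six codons (Met-Ala-Asp-Thr-Asp-Thr); mutate codon 3.
toy_cds = "ATGGCGGATACGGATACC"
mutant = introduce_amber(toy_cds, 3)
print(mutant)  # ATGGCGTAGACGGATACC
```

The same function can be applied to a full coding sequence to generate the expected mutant sequence for comparison with sequencing results.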
Non-Canonical Amino Acids (ncAA) Incorporation into FGM
[0256] Plasmids carrying the amber codon-containing FGM mutant sequences were co-transformed into an E. coli BL21 strain containing the pEVOL plasmid, which expresses the pyrrolysyl orthogonal translation system, to incorporate propargyl-lysine (PrK) into the protein sequence. The PrK-containing protein was expressed in 20 mL auto-induction medium (AIM) in the presence of 1 mM PrK, the cells were lysed using Bugbuster lysis solution, and the protein was isolated by IMAC purification. PrK is one example of a clickable ncAA; any clickable bioorthogonal chemical handle can be considered for the site-specific wiring of this enzyme.
Pyrene-Azide Linkers for Enzyme Wiring to Electrode
[0257] In order to wire FGM with site-specifically incorporated PrK to an electrode, a synthetic linker was used. The synthetic linker contained a pyrene group at one end and an azide group at the other. The two groups were connected by a tri-ethylene oxide, di-ethylene oxide or mono-ethylene oxide, giving three different lengths of 8.3, 6.4 and 4.3 .ANG., respectively. The pyrene group is a polycyclic aromatic hydrocarbon consisting of four fused benzene rings, resulting in a flat aromatic system. Due to overlapping of .pi.-bonds between aromatic systems, the pyrene group can attach to the glassy carbon electrode surface through .pi.-.pi. stacking. The azide group was used to attach the alkyne group of PrK using "click" chemistry. Exemplary pyrene-azide linker structures with different lengths are presented in FIGS. 10A-B.
Click Reaction:
[0258] The copper-catalyzed azide-alkyne cycloaddition (CuAAC) reaction is based on the formation of 1,4-disubstituted 1,2,3-triazoles between a terminal alkyne and an aliphatic azide in the presence of copper. The reaction is facile, selective and high-yielding, proceeds under mild conditions, and produces few or no byproducts. It can be performed at room temperature, which makes it suitable for use with proteins. CuAAC was used to link FGM to a pyrene-azide synthetic linker. The click reaction mix contained tris-KCl buffer (pH=8.2), pyrene-azide linker, Cu.sup.+, Tris(3-hydroxypropyltriazolylmethyl)amine (THPTA), FGM with incorporated PrK and sodium-ascorbate (Na-Asc). The reaction mix was incubated at RT for 30-60 min with moderate mixing and centrifuged for 10 min, and the supernatant was collected for further examination.
ncAA Incorporation Validation
[0259] TAMRA-azide is an azide-linked reporter tag that can be used for visualization of alkyne-containing proteins (see FIG. 11). To validate incorporation of PrK into the FGM sequence, purified FGM was clicked with the fluorescent marker TAMRA-azide. The clicked protein sample was loaded on SDS-PAGE and the protein gel was checked for fluorescence using a LAS4000 camera. In addition, anti 6.times.His-tag antibodies will be used for Western blot analysis. MS-MS analysis was performed on the isolated protein to provide further validation of PrK incorporation.
Enzyme Site-Specific Wiring to GCE
[0260] Ten .mu.L of pyrene-azide clicked to PrK-containing FGM was dropped on a clean GCE and incubated at RT for 15 min. The pyrene groups of the linker adhere to the GCE surface through .pi.-.pi. stacking. The GCE was washed with DW so that unbound protein would not contribute signal in the electrochemical measurements.
[0261] Relevant FGM DNA Sequences (all Sequences are of the FAD-GDH .alpha.-Subunit+MCD, without the .gamma.-Subunit):
[0262] R42X (SEQ ID NO: 14); S247X (SEQ ID NO: 15); D395X (SEQ ID NO: 16); T558X (SEQ ID NO: 17); P560X (SEQ ID NO: 18).
[0263] Relevant FGM Protein Sequences (all Sequences are of the FAD-GDH .alpha.-Subunit+MCD, without the .gamma.-Subunit):
[0264] R42X (SEQ ID NO: 19); S247X (SEQ ID NO: 20); D395X (SEQ ID NO: 21); T558X (SEQ ID NO: 22); P560X (SEQ ID NO: 23).
[0265] All publications, patents and patent applications mentioned in this specification are herein incorporated in their entirety by reference into the specification, to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated herein by reference. In addition, citation or identification of any reference in this application shall not be construed as an admission that such reference is available as prior art to the present invention. To the extent that section headings are used, they should not be construed as necessarily limiting.
Sequence CWU
1
1
2911617DNAArtificialSynthetic 1atggcggata cggataccca gaaagcggac gtggtcgtgg
ttggatccgg cgtggcaggc 60gcaatcgtgg ctcatcaact ggcaatggca ggtaaaagcg
tgatcctgct ggaagctggt 120ccgcgtatgc cgcgttggga aattgttgaa cgtttccgca
atcaagtcga taaaaccgac 180tttatggcac cgtatccgag cagcgcatgg gcaccgcatc
cggaatatgg tccgccgaat 240gattacctga tcctgaaagg cgaacacaaa tttaactcac
agtacattcg tgcagtgggc 300ggcaccacgt ggcattgggc agcctcggca tggcgcttca
tcccgaacga ttttaaaatg 360aaaaccgtgt atggcgttgg tcgtgactgg ccgattcagt
acgatgacat cgaacattat 420taccaacgcg cggaagaaga actgggcgtg tggggtccgg
gcccggaaga agacctgtat 480tcaccgcgta aagaaccgta cccgatgccg ccgctgccgc
tgagtttcaa tgaacaaacc 540attaaatccg ctctgaacgg ctatgatccg aaatttcacg
tggttacgga accggtggcc 600cgtaattcgc gcccgtacga cggtcgcccg acctgctgtg
gcaacaataa ctgcatgccg 660atttgtccga tcggtgcaat gtataacggc atcgtccatg
tggaaaaagc tgaacaggca 720ggtgctaaac tgattgatag tgcggtcgtg tacaaactgg
aaacgggccc ggacaaacgt 780attaccgcag ctgtttataa agataaaacg ggtgcggacc
atcgcgtcga aggcaaatac 840ttcgtgattg cggccaatgg tatcgaaacc ccgaaaattc
tgctgatgag cgcgaaccgt 900gattttccga atggtgtggc caacagttcc gatatggttg
gccgcaatct gatggaccat 960ccgggcaccg gcgtgagctt ttatgcaaac gaaaaactgt
ggccgggtcg tggtccgcag 1020gaaatgacct ctctgatcgg tttccgtgat ggcccgtttc
gcgcgaatga agcagcgaag 1080aaaattcatc tgtcaaatat gtcgcgtatc aaccaggaaa
cccaaaaaat ctttaaaggc 1140ggtaaactga tgaaaccgga agaactggat gcgcagatcc
gtgaccgcag tgcccgcttt 1200gttcaattcg attgctttca cgaaatcctg ccgcagccgg
aaaatcgtat tgtcccgtcc 1260aaaaccgcaa cggacgcagt gggtattccg cgtccggaaa
ttacgtatgc gatcgatgac 1320tacgtcaaac gtggcgcagt gcatacgcgc gaagtttatg
ctaccgcggc caaagtgctg 1380ggcggcaccg aagtggtctt caacgatgaa tttgcgccga
ataaccacat caccggtgcc 1440acgattatgg gcgcggatgc ccgtgactca gtggttgata
aagactgtcg cgccttcgat 1500catccgaacc tgtttattag cagcagcagc accatgccga
cggttggcac cgttaacgtc 1560accctgacga ttgcagctct ggcactgcgt atgtctgata
cgctgaaaaa agaagtc 16172510DNAArtificialSynthetic 2atggctcaca
atgacaacac cccgcactcc cgccgtaccg gcgatgcggc cgtgaccggt 60attacgcgtc
gccagtggct gcaaggcgcg ctggccctga ccgcagctgg cctgacgggt 120tccctggccc
tgcgcgcact ggctgatgat ccgggcaccg caccgctgga tacctttatg 180acgctgagcg
aagctctgac gggcaaaaaa ggtctgtctc gtgttctggg ccagcgtttt 240ctgcaagcgc
tgcaaaaagg ttcattcaaa accgcggatt cgctgccgca gctggcgggc 300gccctggcaa
gcggttctct gaacccggac caagaagctc tggcgctgaa aatcctggaa 360gcatggtatc
tgggcattgt tgataatgtg gttatcacct acgaagaagc cctgatgttt 420agtgtcgtgt
ccgacacgct ggtcattccg agctattgcc cgaacaaacc gggtttctgg 480gccgaaaaac
cgatcgaacg tcaggcataa
510369DNAArtificialSynthetic 3attcgtgcag gtgctaccat gccgcatcgt gatcgtggtc
cgtgcggtgc atgtcacgct 60attatccag
69439DNAArtificialSynthetic 4gaattcggtt
ctggttatgg ctctggtccg ccgggtccg
3952191DNAArtificialSynthetic 5ctcacaatga caacaccccg cactcccgcc
gtaccggcga tgcggccgtg accggtatta 60cgcgtcgcca gtggctgcaa ggcgcgctgg
ccctgaccgc agctggcctg acgggttccc 120tggccctgcg cgcactggct gatgatccgg
gcaccgcacc gctggatacc tttatgacgc 180tgagcgaagc tctgacgggc aaaaaaggtc
tgtctcgtgt tctgggccag cgttttctgc 240aagcgctgca aaaaggttca ttcaaaaccg
cggattcgct gccgcagctg gcgggcgccc 300tggcaagcgg ttctctgaac ccggaccaag
aagctctggc gctgaaaatc ctggaagcat 360ggtatctggg cattgttgat aatgtggtta
tcacctacga agaagccctg atgtttagtg 420tcgtgtccga cacgctggtc attccgagct
attgcccgaa caaaccgggt ttctgggccg 480aaaaaccgat cgaacgtcag gcataatggc
ggatacggat acccagaaag cggacgtggt 540cgtggttgga tccggcgtgg caggcgcaat
cgtggctcat caactggcaa tggcaggtaa 600aagcgtgatc ctgctggaag ctggtccgcg
tatgccgcgt tgggaaattg ttgaacgttt 660ccgcaatcaa gtcgataaaa ccgactttat
ggcaccgtat ccgagcagcg catgggcacc 720gcatccggaa tatggtccgc cgaatgatta
cctgatcctg aaaggcgaac acaaatttaa 780ctcacagtac attcgtgcag tgggcggcac
cacgtggcat tgggcagcct cggcatggcg 840cttcatcccg aacgatttta aaatgaaaac
cgtgtatggc gttggtcgtg actggccgat 900tcagtacgat gacatcgaac attattacca
acgcgcggaa gaagaactgg gcgtgtgggg 960tccgggcccg gaagaagacc tgtattcacc
gcgtaaagaa ccgtacccga tgccgccgct 1020gccgctgagt ttcaatgaac aaaccattaa
atccgctctg aacggctatg atccgaaatt 1080tcacgtggtt acggaaccgg tggcccgtaa
ttcgcgcccg tacgacggtc gcccgacctg 1140ctgtggcaac aataactgca tgccgatttg
tccgatcggt gcaatgtata acggcatcgt 1200ccatgtggaa aaagctgaac aggcaggtgc
taaactgatt gatagtgcgg tcgtgtacaa 1260actggaaacg ggcccggaca aacgtattac
cgcagctgtt tataaagata aaacgggtgc 1320ggaccatcgc gtcgaaggca aatacttcgt
gattgcggcc aatggtatcg aaaccccgaa 1380aattctgctg atgagcgcga accgtgattt
tccgaatggt gtggccaaca gttccgatat 1440ggttggccgc aatctgatgg accatccggg
caccggcgtg agcttttatg caaacgaaaa 1500actgtggccg ggtcgtggtc cgcaggaaat
gacctctctg atcggtttcc gtgatggccc 1560gtttcgcgcg aatgaagcag cgaagaaaat
tcatctgtca aatatgtcgc gtatcaacca 1620ggaaacccaa aaaatcttta aaggcggtaa
actgatgaaa ccggaagaac tggatgcgca 1680gatccgtgac cgcagtgccc gctttgttca
attcgattgc tttcacgaaa tcctgccgca 1740gccggaaaat cgtattgtcc cgtccaaaac
cgcaacggac gcagtgggta ttccgcgtcc 1800ggaaattacg tatgcgatcg atgactacgt
caaacgtggc gcagtgcata cgcgcgaagt 1860ttatgctacc gcggccaaag tgctgggcgg
caccgaagtg gtcttcaacg atgaatttgc 1920gccgaataac cacatcaccg gtgccacgat
tatgggcgcg gatgcccgtg actcagtggt 1980tgataaagac tgtcgcgcct tcgatcatcc
gaacctgttt attagcagca gcagcaccat 2040gccgacggtt ggcaccgtta acgtcaccct
gacgattgca gctctggcac tgcgtatgtc 2100tgatacgctg aaaaaagaag tcattcgtgc
aggtgctacc atgccgcatc gtgatcgtgg 2160tccgtgcggt gcatgtcacg ctattatcca g
219162230DNAArtificialSynthetic
6ctcacaatga caacaccccg cactcccgcc gtaccggcga tgcggccgtg accggtatta
60cgcgtcgcca gtggctgcaa ggcgcgctgg ccctgaccgc agctggcctg acgggttccc
120tggccctgcg cgcactggct gatgatccgg gcaccgcacc gctggatacc tttatgacgc
180tgagcgaagc tctgacgggc aaaaaaggtc tgtctcgtgt tctgggccag cgttttctgc
240aagcgctgca aaaaggttca ttcaaaaccg cggattcgct gccgcagctg gcgggcgccc
300tggcaagcgg ttctctgaac ccggaccaag aagctctggc gctgaaaatc ctggaagcat
360ggtatctggg cattgttgat aatgtggtta tcacctacga agaagccctg atgtttagtg
420tcgtgtccga cacgctggtc attccgagct attgcccgaa caaaccgggt ttctgggccg
480aaaaaccgat cgaacgtcag gcataatggc ggatacggat acccagaaag cggacgtggt
540cgtggttgga tccggcgtgg caggcgcaat cgtggctcat caactggcaa tggcaggtaa
600aagcgtgatc ctgctggaag ctggtccgcg tatgccgcgt tgggaaattg ttgaacgttt
660ccgcaatcaa gtcgataaaa ccgactttat ggcaccgtat ccgagcagcg catgggcacc
720gcatccggaa tatggtccgc cgaatgatta cctgatcctg aaaggcgaac acaaatttaa
780ctcacagtac attcgtgcag tgggcggcac cacgtggcat tgggcagcct cggcatggcg
840cttcatcccg aacgatttta aaatgaaaac cgtgtatggc gttggtcgtg actggccgat
900tcagtacgat gacatcgaac attattacca acgcgcggaa gaagaactgg gcgtgtgggg
960tccgggcccg gaagaagacc tgtattcacc gcgtaaagaa ccgtacccga tgccgccgct
1020gccgctgagt ttcaatgaac aaaccattaa atccgctctg aacggctatg atccgaaatt
1080tcacgtggtt acggaaccgg tggcccgtaa ttcgcgcccg tacgacggtc gcccgacctg
1140ctgtggcaac aataactgca tgccgatttg tccgatcggt gcaatgtata acggcatcgt
1200ccatgtggaa aaagctgaac aggcaggtgc taaactgatt gatagtgcgg tcgtgtacaa
1260actggaaacg ggcccggaca aacgtattac cgcagctgtt tataaagata aaacgggtgc
1320ggaccatcgc gtcgaaggca aatacttcgt gattgcggcc aatggtatcg aaaccccgaa
1380aattctgctg atgagcgcga accgtgattt tccgaatggt gtggccaaca gttccgatat
1440ggttggccgc aatctgatgg accatccggg caccggcgtg agcttttatg caaacgaaaa
1500actgtggccg ggtcgtggtc cgcaggaaat gacctctctg atcggtttcc gtgatggccc
1560gtttcgcgcg aatgaagcag cgaagaaaat tcatctgtca aatatgtcgc gtatcaacca
1620ggaaacccaa aaaatcttta aaggcggtaa actgatgaaa ccggaagaac tggatgcgca
1680gatccgtgac cgcagtgccc gctttgttca attcgattgc tttcacgaaa tcctgccgca
1740gccggaaaat cgtattgtcc cgtccaaaac cgcaacggac gcagtgggta ttccgcgtcc
1800ggaaattacg tatgcgatcg atgactacgt caaacgtggc gcagtgcata cgcgcgaagt
1860ttatgctacc gcggccaaag tgctgggcgg caccgaagtg gtcttcaacg atgaatttgc
1920gccgaataac cacatcaccg gtgccacgat tatgggcgcg gatgcccgtg actcagtggt
1980tgataaagac tgtcgcgcct tcgatcatcc gaacctgttt attagcagca gcagcaccat
2040gccgacggtt ggcaccgtta acgtcaccct gacgattgca gctctggcac tgcgtatgtc
2100tgatacgctg aaaaaagaag tcgaattcgg ttctggttat ggctctggtc cgccgggtcc
2160gattcgtgca ggtgctacca tgccgcatcg tgatcgtggt ccgtgcggtg catgtcacgc
2220tattatccag
223072278DNAArtificialSynthetic 7ccatggctca caatgacaac accccgcact
cccgccgtac cggcgatgcg gccgtgaccg 60gtattacgcg tcgccagtgg ctgcaaggcg
cgctggccct gaccgcagct ggcctgacgg 120gttccctggc cctgcgcgca ctggctgatg
atccgggcac cgcaccgctg gataccttta 180tgacgctgag cgaagctctg acgggcaaaa
aaggtctgtc tcgtgttctg ggccagcgtt 240ttctgcaagc gctgcaaaaa ggttcattca
aaaccgcgga ttcgctgccg cagctggcgg 300gcgccctggc aagcggttct ctgaacccgg
accaagaagc tctggcgctg aaaatcctgg 360aagcatggta tctgggcatt gttgataatg
tggttatcac ctacgaagaa gccctgatgt 420ttagtgtcgt gtccgacacg ctggtcattc
cgagctattg cccgaacaaa ccgggtttct 480gggccgaaaa accgatcgaa cgtcaggcat
aatggcggat acggataccc agaaagcgga 540cgtggtcgtg gttggatccg gcgtggcagg
cgcaatcgtg gctcatcaac tggcaatggc 600aggtaaaagc gtgatcctgc tggaagctgg
tccgcgtatg ccgcgttggg aaattgttga 660acgtttccgc aatcaagtcg ataaaaccga
ctttatggca ccgtatccga gcagcgcatg 720ggcaccgcat ccggaatatg gtccgccgaa
tgattacctg atcctgaaag gcgaacacaa 780atttaactca cagtacattc gtgcagtggg
cggcaccacg tggcattggg cagcctcggc 840atggcgcttc atcccgaacg attttaaaat
gaaaaccgtg tatggcgttg gtcgtgactg 900gccgattcag tacgatgaca tcgaacatta
ttaccaacgc gcggaagaag aactgggcgt 960gtggggtccg ggcccggaag aagacctgta
ttcaccgcgt aaagaaccgt acccgatgcc 1020gccgctgccg ctgagtttca atgaacaaac
cattaaatcc gctctgaacg gctatgatcc 1080gaaatttcac gtggttacgg aaccggtggc
ccgtaattcg cgcccgtacg acggtcgccc 1140gacctgctgt ggcaacaata actgcatgcc
gatttgtccg atcggtgcaa tgtataacgg 1200catcgtccat gtggaaaaag ctgaacaggc
aggtgctaaa ctgattgata gtgcggtcgt 1260gtacaaactg gaaacgggcc cggacaaacg
tattaccgca gctgtttata aagataaaac 1320gggtgcggac catcgcgtcg aaggcaaata
cttcgtgatt gcggccaatg gtatcgaaac 1380cccgaaaatt ctgctgatga gcgcgaaccg
tgattttccg aatggtgtgg ccaacagttc 1440cgatatggtt ggccgcaatc tgatggacca
tccgggcacc ggcgtgagct tttatgcaaa 1500cgaaaaactg tggccgggtc gtggtccgca
ggaaatgacc tctctgatcg gtttccgtga 1560tggcccgttt cgcgcgaatg aagcagcgaa
gaaaattcat ctgtcaaata tgtcgcgtat 1620caaccaggaa acccaaaaaa tctttaaagg
cggtaaactg atgaaaccgg aagaactgga 1680tgcgcagatc cgtgaccgca gtgcccgctt
tgttcaattc gattgctttc acgaaatcct 1740gccgcagccg gaaaatcgta ttgtcccgtc
caaaaccgca acggacgcag tgggtattcc 1800gcgtccggaa attacgtatg cgatcgatga
ctacgtcaaa cgtggcgcag tgcatacgcg 1860cgaagtttat gctaccgcgg ccaaagtgct
gggcggcacc gaagtggtct tcaacgatga 1920atttgcgccg aataaccaca tcaccggtgc
cacgattatg ggcgcggatg cccgtgactc 1980agtggttgat aaagactgtc gcgccttcga
tcatccgaac ctgtttatta gcagcagcag 2040caccatgccg acggttggca ccgttaacgt
caccctgacg attgcagctc tggcactgcg 2100tatgtctgat acgctgaaaa aagaagtcga
attcggttct ggttatggct ctggtccgcc 2160gggtccgatt cgtgcaggtg ctaccatgcc
gcatcgtgat cgtggtccgt gcggtgcatg 2220tcacgctatt atccagggca gtggttccgg
ccatcaccat caccatcact aaaagctt 2278
8 527 PRT Artificial Synthetic
8 Met
Ala Asp Thr Asp Thr Gln Lys Ala Asp Val Val Val Val Gly Ser1
5 10 15Gly Val Ala Gly Ala Ile Val
Ala His Gln Leu Ala Met Ala Gly Lys 20 25
30Ser Val Ile Leu Leu Glu Ala Gly Pro Arg Met Pro Arg Trp
Glu Ile 35 40 45Val Glu Arg Phe
Arg Asn Gln Val Asp Lys Thr Asp Phe Met Ala Pro 50 55
60Tyr Pro Ser Ser Ala Trp Ala Pro His Pro Glu Tyr Gly
Pro Pro Asn65 70 75
80Asp Tyr Leu Ile Leu Lys Gly Glu His Lys Phe Asn Ser Gln Tyr Ile
85 90 95Arg Ala Val Gly Gly Thr
Thr Trp His Trp Ala Ala Ser Ala Trp Arg 100
105 110Phe Ile Pro Asn Asp Phe Lys Met Lys Thr Val Tyr
Gly Val Gly Arg 115 120 125Asp Trp
Pro Ile Gln Tyr Asp Asp Ile Glu His Tyr Tyr Gln Arg Ala 130
135 140Glu Glu Glu Leu Gly Val Trp Gly Pro Gly Pro
Glu Glu Asp Leu Tyr145 150 155
160Ser Pro Arg Lys Glu Pro Tyr Pro Met Pro Pro Leu Pro Leu Ser Phe
165 170 175Asn Glu Gln Thr
Ile Lys Ser Ala Leu Asn Gly Tyr Asp Pro Lys Phe 180
185 190His Val Val Thr Glu Pro Val Ala Arg Asn Ser
Arg Pro Tyr Asp Gly 195 200 205Arg
Pro Thr Cys Cys Gly Asn Asn Asn Cys Met Pro Ile Cys Pro Ile 210
215 220Gly Ala Met Tyr Asn Gly Ile Val His Val
Glu Lys Ala Glu Gln Ala225 230 235
240Gly Ala Lys Leu Ile Asp Ser Ala Val Val Tyr Lys Leu Glu Thr
Gly 245 250 255Pro Asp Lys
Arg Ile Thr Ala Ala Val Tyr Lys Asp Lys Thr Gly Ala 260
265 270Asp His Arg Val Glu Gly Lys Tyr Phe Val
Ile Ala Ala Asn Gly Ile 275 280
285Glu Thr Pro Lys Ile Leu Leu Met Ser Ala Asn Arg Asp Phe Pro Asn 290
295 300Gly Val Ala Asn Ser Ser Asp Met
Val Gly Arg Asn Leu Met Asp His305 310
315 320Pro Gly Thr Gly Val Ser Phe Tyr Ala Asn Glu Lys
Leu Trp Pro Gly 325 330
335Arg Gly Pro Gln Glu Met Thr Ser Leu Ile Gly Phe Arg Asp Gly Pro
340 345 350Phe Arg Ala Asn Glu Ala
Ala Lys Lys Ile His Leu Ser Asn Met Ser 355 360
365Arg Ile Asn Gln Glu Thr Gln Lys Ile Phe Lys Gly Gly Lys
Leu Met 370 375 380Lys Pro Glu Glu Leu
Asp Ala Gln Ile Arg Asp Arg Ser Ala Arg Phe385 390
395 400Val Gln Phe Asp Cys Phe His Glu Ile Leu
Pro Gln Pro Glu Asn Arg 405 410
415Ile Val Pro Ser Lys Thr Ala Thr Asp Ala Val Gly Ile Pro Arg Pro
420 425 430Glu Ile Thr Tyr Ala
Ile Asp Asp Tyr Val Lys Arg Gly Ala Val His 435
440 445Thr Arg Glu Val Tyr Ala Thr Ala Ala Lys Val Leu
Gly Gly Thr Glu 450 455 460Val Val Phe
Asn Asp Glu Phe Ala Pro Asn Asn His Ile Thr Gly Ala465
470 475 480Thr Ile Met Gly Ala Asp Ala
Arg Asp Ser Val Val Asp Lys Asp Cys 485
490 495Arg Ala Phe Asp His Pro Asn Leu Phe Ile Ser Ser
Ser Ser Thr Met 500 505 510Pro
Thr Val Gly Thr Val Asn Val Thr Leu Thr Ile Ala Ala Leu 515
520 525
9 562 PRT Artificial Synthetic
9 Met Ala Asp
Thr Asp Thr Gln Lys Ala Asp Val Val Val Val Gly Ser1 5
10 15Gly Val Ala Gly Ala Ile Val Ala His
Gln Leu Ala Met Ala Gly Lys 20 25
30Ser Val Ile Leu Leu Glu Ala Gly Pro Arg Met Pro Arg Trp Glu Ile
35 40 45Val Glu Arg Phe Arg Asn Gln
Val Asp Lys Thr Asp Phe Met Ala Pro 50 55
60Tyr Pro Ser Ser Ala Trp Ala Pro His Pro Glu Tyr Gly Pro Pro Asn65
70 75 80Asp Tyr Leu Ile
Leu Lys Gly Glu His Lys Phe Asn Ser Gln Tyr Ile 85
90 95Arg Ala Val Gly Gly Thr Thr Trp His Trp
Ala Ala Ser Ala Trp Arg 100 105
110Phe Ile Pro Asn Asp Phe Lys Met Lys Thr Val Tyr Gly Val Gly Arg
115 120 125Asp Trp Pro Ile Gln Tyr Asp
Asp Ile Glu His Tyr Tyr Gln Arg Ala 130 135
140Glu Glu Glu Leu Gly Val Trp Gly Pro Gly Pro Glu Glu Asp Leu
Tyr145 150 155 160Ser Pro
Arg Lys Glu Pro Tyr Pro Met Pro Pro Leu Pro Leu Ser Phe
165 170 175Asn Glu Gln Thr Ile Lys Ser
Ala Leu Asn Gly Tyr Asp Pro Lys Phe 180 185
190His Val Val Thr Glu Pro Val Ala Arg Asn Ser Arg Pro Tyr
Asp Gly 195 200 205Arg Pro Thr Cys
Cys Gly Asn Asn Asn Cys Met Pro Ile Cys Pro Ile 210
215 220Gly Ala Met Tyr Asn Gly Ile Val His Val Glu Lys
Ala Glu Gln Ala225 230 235
240Gly Ala Lys Leu Ile Asp Ser Ala Val Val Tyr Lys Leu Glu Thr Gly
245 250 255Pro Asp Lys Arg Ile
Thr Ala Ala Val Tyr Lys Asp Lys Thr Gly Ala 260
265 270Asp His Arg Val Glu Gly Lys Tyr Phe Val Ile Ala
Ala Asn Gly Ile 275 280 285Glu Thr
Pro Lys Ile Leu Leu Met Ser Ala Asn Arg Asp Phe Pro Asn 290
295 300Gly Val Ala Asn Ser Ser Asp Met Val Gly Arg
Asn Leu Met Asp His305 310 315
320Pro Gly Thr Gly Val Ser Phe Tyr Ala Asn Glu Lys Leu Trp Pro Gly
325 330 335Arg Gly Pro Gln
Glu Met Thr Ser Leu Ile Gly Phe Arg Asp Gly Pro 340
345 350Phe Arg Ala Asn Glu Ala Ala Lys Lys Ile His
Leu Ser Asn Met Ser 355 360 365Arg
Ile Asn Gln Glu Thr Gln Lys Ile Phe Lys Gly Gly Lys Leu Met 370
375 380Lys Pro Glu Glu Leu Asp Ala Gln Ile Arg
Asp Arg Ser Ala Arg Phe385 390 395
400Val Gln Phe Asp Cys Phe His Glu Ile Leu Pro Gln Pro Glu Asn
Arg 405 410 415Ile Val Pro
Ser Lys Thr Ala Thr Asp Ala Val Gly Ile Pro Arg Pro 420
425 430Glu Ile Thr Tyr Ala Ile Asp Asp Tyr Val
Lys Arg Gly Ala Val His 435 440
445Thr Arg Glu Val Tyr Ala Thr Ala Ala Lys Val Leu Gly Gly Thr Glu 450
455 460Val Val Phe Asn Asp Glu Phe Ala
Pro Asn Asn His Ile Thr Gly Ala465 470
475 480Thr Ile Met Gly Ala Asp Ala Arg Asp Ser Val Val
Asp Lys Asp Cys 485 490
495Arg Ala Phe Asp His Pro Asn Leu Phe Ile Ser Ser Ser Ser Thr Met
500 505 510Pro Thr Val Gly Thr Val
Asn Val Thr Leu Thr Ile Ala Ala Leu Ala 515 520
525Leu Arg Met Ser Asp Thr Leu Lys Lys Glu Val Ile Arg Ala
Gly Ala 530 535 540Thr Met Pro His Arg
Asp Arg Gly Pro Cys Gly Ala Cys His Ala Ile545 550
555 560Ile Gln
10 586 PRT Artificial Synthetic
10 Met
Ala Asp Thr Asp Thr Gln Lys Ala Asp Val Val Val Val Gly Ser1
5 10 15Gly Val Ala Gly Ala Ile Val
Ala His Gln Leu Ala Met Ala Gly Lys 20 25
30Ser Val Ile Leu Leu Glu Ala Gly Pro Arg Met Pro Arg Trp
Glu Ile 35 40 45Val Glu Arg Phe
Arg Asn Gln Val Asp Lys Thr Asp Phe Met Ala Pro 50 55
60Tyr Pro Ser Ser Ala Trp Ala Pro His Pro Glu Tyr Gly
Pro Pro Asn65 70 75
80Asp Tyr Leu Ile Leu Lys Gly Glu His Lys Phe Asn Ser Gln Tyr Ile
85 90 95Arg Ala Val Gly Gly Thr
Thr Trp His Trp Ala Ala Ser Ala Trp Arg 100
105 110Phe Ile Pro Asn Asp Phe Lys Met Lys Thr Val Tyr
Gly Val Gly Arg 115 120 125Asp Trp
Pro Ile Gln Tyr Asp Asp Ile Glu His Tyr Tyr Gln Arg Ala 130
135 140Glu Glu Glu Leu Gly Val Trp Gly Pro Gly Pro
Glu Glu Asp Leu Tyr145 150 155
160Ser Pro Arg Lys Glu Pro Tyr Pro Met Pro Pro Leu Pro Leu Ser Phe
165 170 175Asn Glu Gln Thr
Ile Lys Ser Ala Leu Asn Gly Tyr Asp Pro Lys Phe 180
185 190His Val Val Thr Glu Pro Val Ala Arg Asn Ser
Arg Pro Tyr Asp Gly 195 200 205Arg
Pro Thr Cys Cys Gly Asn Asn Asn Cys Met Pro Ile Cys Pro Ile 210
215 220Gly Ala Met Tyr Asn Gly Ile Val His Val
Glu Lys Ala Glu Gln Ala225 230 235
240Gly Ala Lys Leu Ile Asp Ser Ala Val Val Tyr Lys Leu Glu Thr
Gly 245 250 255Pro Asp Lys
Arg Ile Thr Ala Ala Val Tyr Lys Asp Lys Thr Gly Ala 260
265 270Asp His Arg Val Glu Gly Lys Tyr Phe Val
Ile Ala Ala Asn Gly Ile 275 280
285Glu Thr Pro Lys Ile Leu Leu Met Ser Ala Asn Arg Asp Phe Pro Asn 290
295 300Gly Val Ala Asn Ser Ser Asp Met
Val Gly Arg Asn Leu Met Asp His305 310
315 320Pro Gly Thr Gly Val Ser Phe Tyr Ala Asn Glu Lys
Leu Trp Pro Gly 325 330
335Arg Gly Pro Gln Glu Met Thr Ser Leu Ile Gly Phe Arg Asp Gly Pro
340 345 350Phe Arg Ala Asn Glu Ala
Ala Lys Lys Ile His Leu Ser Asn Met Ser 355 360
365Arg Ile Asn Gln Glu Thr Gln Lys Ile Phe Lys Gly Gly Lys
Leu Met 370 375 380Lys Pro Glu Glu Leu
Asp Ala Gln Ile Arg Asp Arg Ser Ala Arg Phe385 390
395 400Val Gln Phe Asp Cys Phe His Glu Ile Leu
Pro Gln Pro Glu Asn Arg 405 410
415Ile Val Pro Ser Lys Thr Ala Thr Asp Ala Val Gly Ile Pro Arg Pro
420 425 430Glu Ile Thr Tyr Ala
Ile Asp Asp Tyr Val Lys Arg Gly Ala Val His 435
440 445Thr Arg Glu Val Tyr Ala Thr Ala Ala Lys Val Leu
Gly Gly Thr Glu 450 455 460Val Val Phe
Asn Asp Glu Phe Ala Pro Asn Asn His Ile Thr Gly Ala465
470 475 480Thr Ile Met Gly Ala Asp Ala
Arg Asp Ser Val Val Asp Lys Asp Cys 485
490 495Arg Ala Phe Asp His Pro Asn Leu Phe Ile Ser Ser
Ser Ser Thr Met 500 505 510Pro
Thr Val Gly Thr Val Asn Val Thr Leu Thr Ile Ala Ala Leu Ala 515
520 525Leu Arg Met Ser Asp Thr Leu Lys Lys
Glu Val Glu Phe Gly Ser Gly 530 535
540Tyr Gly Ser Gly Pro Pro Gly Pro Ile Arg Ala Gly Ala Thr Met Pro545
550 555 560His Arg Asp Arg
Gly Pro Cys Gly Ala Cys His Ala Ile Ile Gln Gly 565
570 575Ser Gly Ser Gly His His His His His His
580 585
11 575 PRT Artificial Synthetic
11 Met Ala Asp
Thr Asp Thr Gln Lys Ala Asp Val Val Val Val Gly Ser1 5
10 15Gly Val Ala Gly Ala Ile Val Ala His
Gln Leu Ala Met Ala Gly Lys 20 25
30Ser Val Ile Leu Leu Glu Ala Gly Pro Arg Met Pro Arg Trp Glu Ile
35 40 45Val Glu Arg Phe Arg Asn Gln
Val Asp Lys Thr Asp Phe Met Ala Pro 50 55
60Tyr Pro Ser Ser Ala Trp Ala Pro His Pro Glu Tyr Gly Pro Pro Asn65
70 75 80Asp Tyr Leu Ile
Leu Lys Gly Glu His Lys Phe Asn Ser Gln Tyr Ile 85
90 95Arg Ala Val Gly Gly Thr Thr Trp His Trp
Ala Ala Ser Ala Trp Arg 100 105
110Phe Ile Pro Asn Asp Phe Lys Met Lys Thr Val Tyr Gly Val Gly Arg
115 120 125Asp Trp Pro Ile Gln Tyr Asp
Asp Ile Glu His Tyr Tyr Gln Arg Ala 130 135
140Glu Glu Glu Leu Gly Val Trp Gly Pro Gly Pro Glu Glu Asp Leu
Tyr145 150 155 160Ser Pro
Arg Lys Glu Pro Tyr Pro Met Pro Pro Leu Pro Leu Ser Phe
165 170 175Asn Glu Gln Thr Ile Lys Ser
Ala Leu Asn Gly Tyr Asp Pro Lys Phe 180 185
190His Val Val Thr Glu Pro Val Ala Arg Asn Ser Arg Pro Tyr
Asp Gly 195 200 205Arg Pro Thr Cys
Cys Gly Asn Asn Asn Cys Met Pro Ile Cys Pro Ile 210
215 220Gly Ala Met Tyr Asn Gly Ile Val His Val Glu Lys
Ala Glu Gln Ala225 230 235
240Gly Ala Lys Leu Ile Asp Ser Ala Val Val Tyr Lys Leu Glu Thr Gly
245 250 255Pro Asp Lys Arg Ile
Thr Ala Ala Val Tyr Lys Asp Lys Thr Gly Ala 260
265 270Asp His Arg Val Glu Gly Lys Tyr Phe Val Ile Ala
Ala Asn Gly Ile 275 280 285Glu Thr
Pro Lys Ile Leu Leu Met Ser Ala Asn Arg Asp Phe Pro Asn 290
295 300Gly Val Ala Asn Ser Ser Asp Met Val Gly Arg
Asn Leu Met Asp His305 310 315
320Pro Gly Thr Gly Val Ser Phe Tyr Ala Asn Glu Lys Leu Trp Pro Gly
325 330 335Arg Gly Pro Gln
Glu Met Thr Ser Leu Ile Gly Phe Arg Asp Gly Pro 340
345 350Phe Arg Ala Asn Glu Ala Ala Lys Lys Ile His
Leu Ser Asn Met Ser 355 360 365Arg
Ile Asn Gln Glu Thr Gln Lys Ile Phe Lys Gly Gly Lys Leu Met 370
375 380Lys Pro Glu Glu Leu Asp Ala Gln Ile Arg
Asp Arg Ser Ala Arg Phe385 390 395
400Val Gln Phe Asp Cys Phe His Glu Ile Leu Pro Gln Pro Glu Asn
Arg 405 410 415Ile Val Pro
Ser Lys Thr Ala Thr Asp Ala Val Gly Ile Pro Arg Pro 420
425 430Glu Ile Thr Tyr Ala Ile Asp Asp Tyr Val
Lys Arg Gly Ala Val His 435 440
445Thr Arg Glu Val Tyr Ala Thr Ala Ala Lys Val Leu Gly Gly Thr Glu 450
455 460Val Val Phe Asn Asp Glu Phe Ala
Pro Asn Asn His Ile Thr Gly Ala465 470
475 480Thr Ile Met Gly Ala Asp Ala Arg Asp Ser Val Val
Asp Lys Asp Cys 485 490
495Arg Ala Phe Asp His Pro Asn Leu Phe Ile Ser Ser Ser Ser Thr Met
500 505 510Pro Thr Val Gly Thr Val
Asn Val Thr Leu Thr Ile Ala Ala Leu Ala 515 520
525Leu Arg Met Ser Asp Thr Leu Lys Lys Glu Val Glu Phe Gly
Ser Gly 530 535 540Tyr Gly Ser Gly Pro
Pro Gly Pro Ile Arg Ala Gly Ala Thr Met Pro545 550
555 560His Arg Asp Arg Gly Pro Cys Gly Ala Cys
His Ala Ile Ile Gln 565 570
575
12 586 PRT Artificial Synthetic
X (42)..(42): X is R or a non-canonical amino acid.
X (247)..(247): X is S or a non-canonical amino acid.
X (395)..(395): X is D or a non-canonical amino acid.
X (558)..(558): X is T or a non-canonical amino acid.
X (560)..(560): X is P or a non-canonical amino acid.
12 Met Ala
Asp Thr Asp Thr Gln Lys Ala Asp Val Val Val Val Gly Ser1 5
10 15Gly Val Ala Gly Ala Ile Val Ala
His Gln Leu Ala Met Ala Gly Lys 20 25
30Ser Val Ile Leu Leu Glu Ala Gly Pro Xaa Met Pro Arg Trp Glu
Ile 35 40 45Val Glu Arg Phe Arg
Asn Gln Val Asp Lys Thr Asp Phe Met Ala Pro 50 55
60Tyr Pro Ser Ser Ala Trp Ala Pro His Pro Glu Tyr Gly Pro
Pro Asn65 70 75 80Asp
Tyr Leu Ile Leu Lys Gly Glu His Lys Phe Asn Ser Gln Tyr Ile
85 90 95Arg Ala Val Gly Gly Thr Thr
Trp His Trp Ala Ala Ser Ala Trp Arg 100 105
110Phe Ile Pro Asn Asp Phe Lys Met Lys Thr Val Tyr Gly Val
Gly Arg 115 120 125Asp Trp Pro Ile
Gln Tyr Asp Asp Ile Glu His Tyr Tyr Gln Arg Ala 130
135 140Glu Glu Glu Leu Gly Val Trp Gly Pro Gly Pro Glu
Glu Asp Leu Tyr145 150 155
160Ser Pro Arg Lys Glu Pro Tyr Pro Met Pro Pro Leu Pro Leu Ser Phe
165 170 175Asn Glu Gln Thr Ile
Lys Ser Ala Leu Asn Gly Tyr Asp Pro Lys Phe 180
185 190His Val Val Thr Glu Pro Val Ala Arg Asn Ser Arg
Pro Tyr Asp Gly 195 200 205Arg Pro
Thr Cys Cys Gly Asn Asn Asn Cys Met Pro Ile Cys Pro Ile 210
215 220Gly Ala Met Tyr Asn Gly Ile Val His Val Glu
Lys Ala Glu Gln Ala225 230 235
240Gly Ala Lys Leu Ile Asp Xaa Ala Val Val Tyr Lys Leu Glu Thr Gly
245 250 255Pro Asp Lys Arg
Ile Thr Ala Ala Val Tyr Lys Asp Lys Thr Gly Ala 260
265 270Asp His Arg Val Glu Gly Lys Tyr Phe Val Ile
Ala Ala Asn Gly Ile 275 280 285Glu
Thr Pro Lys Ile Leu Leu Met Ser Ala Asn Arg Asp Phe Pro Asn 290
295 300Gly Val Ala Asn Ser Ser Asp Met Val Gly
Arg Asn Leu Met Asp His305 310 315
320Pro Gly Thr Gly Val Ser Phe Tyr Ala Asn Glu Lys Leu Trp Pro
Gly 325 330 335Arg Gly Pro
Gln Glu Met Thr Ser Leu Ile Gly Phe Arg Asp Gly Pro 340
345 350Phe Arg Ala Asn Glu Ala Ala Lys Lys Ile
His Leu Ser Asn Met Ser 355 360
365Arg Ile Asn Gln Glu Thr Gln Lys Ile Phe Lys Gly Gly Lys Leu Met 370
375 380Lys Pro Glu Glu Leu Asp Ala Gln
Ile Arg Xaa Arg Ser Ala Arg Phe385 390
395 400Val Gln Phe Asp Cys Phe His Glu Ile Leu Pro Gln
Pro Glu Asn Arg 405 410
415Ile Val Pro Ser Lys Thr Ala Thr Asp Ala Val Gly Ile Pro Arg Pro
420 425 430Glu Ile Thr Tyr Ala Ile
Asp Asp Tyr Val Lys Arg Gly Ala Val His 435 440
445Thr Arg Glu Val Tyr Ala Thr Ala Ala Lys Val Leu Gly Gly
Thr Glu 450 455 460Val Val Phe Asn Asp
Glu Phe Ala Pro Asn Asn His Ile Thr Gly Ala465 470
475 480Thr Ile Met Gly Ala Asp Ala Arg Asp Ser
Val Val Asp Lys Asp Cys 485 490
495Arg Ala Phe Asp His Pro Asn Leu Phe Ile Ser Ser Ser Ser Thr Met
500 505 510Pro Thr Val Gly Thr
Val Asn Val Thr Leu Thr Ile Ala Ala Leu Ala 515
520 525Leu Arg Met Ser Asp Thr Leu Lys Lys Glu Val Glu
Phe Gly Ser Gly 530 535 540Tyr Gly Ser
Gly Pro Pro Gly Pro Ile Arg Ala Gly Ala Xaa Met Xaa545
550 555 560His Arg Asp Arg Gly Pro Cys
Gly Ala Cys His Ala Ile Ile Gln Gly 565
570 575Ser Gly Ser Gly His His His His His His
580 585
13 575 PRT Artificial Synthetic
X (42)..(42): X is R or a non-canonical amino acid.
X (247)..(247): X is S or a non-canonical amino acid.
X (395)..(395): X is D or a non-canonical amino acid.
X (558)..(558): X is T or a non-canonical amino acid.
X (560)..(560): X is P or a non-canonical amino acid.
13 Met Ala Asp Thr Asp Thr Gln Lys Ala Asp Val Val Val Val Gly Ser1
5 10 15Gly Val Ala Gly Ala
Ile Val Ala His Gln Leu Ala Met Ala Gly Lys 20
25 30Ser Val Ile Leu Leu Glu Ala Gly Pro Xaa Met Pro
Arg Trp Glu Ile 35 40 45Val Glu
Arg Phe Arg Asn Gln Val Asp Lys Thr Asp Phe Met Ala Pro 50
55 60Tyr Pro Ser Ser Ala Trp Ala Pro His Pro Glu
Tyr Gly Pro Pro Asn65 70 75
80Asp Tyr Leu Ile Leu Lys Gly Glu His Lys Phe Asn Ser Gln Tyr Ile
85 90 95Arg Ala Val Gly Gly
Thr Thr Trp His Trp Ala Ala Ser Ala Trp Arg 100
105 110Phe Ile Pro Asn Asp Phe Lys Met Lys Thr Val Tyr
Gly Val Gly Arg 115 120 125Asp Trp
Pro Ile Gln Tyr Asp Asp Ile Glu His Tyr Tyr Gln Arg Ala 130
135 140Glu Glu Glu Leu Gly Val Trp Gly Pro Gly Pro
Glu Glu Asp Leu Tyr145 150 155
160Ser Pro Arg Lys Glu Pro Tyr Pro Met Pro Pro Leu Pro Leu Ser Phe
165 170 175Asn Glu Gln Thr
Ile Lys Ser Ala Leu Asn Gly Tyr Asp Pro Lys Phe 180
185 190His Val Val Thr Glu Pro Val Ala Arg Asn Ser
Arg Pro Tyr Asp Gly 195 200 205Arg
Pro Thr Cys Cys Gly Asn Asn Asn Cys Met Pro Ile Cys Pro Ile 210
215 220Gly Ala Met Tyr Asn Gly Ile Val His Val
Glu Lys Ala Glu Gln Ala225 230 235
240Gly Ala Lys Leu Ile Asp Xaa Ala Val Val Tyr Lys Leu Glu Thr
Gly 245 250 255Pro Asp Lys
Arg Ile Thr Ala Ala Val Tyr Lys Asp Lys Thr Gly Ala 260
265 270Asp His Arg Val Glu Gly Lys Tyr Phe Val
Ile Ala Ala Asn Gly Ile 275 280
285Glu Thr Pro Lys Ile Leu Leu Met Ser Ala Asn Arg Asp Phe Pro Asn 290
295 300Gly Val Ala Asn Ser Ser Asp Met
Val Gly Arg Asn Leu Met Asp His305 310
315 320Pro Gly Thr Gly Val Ser Phe Tyr Ala Asn Glu Lys
Leu Trp Pro Gly 325 330
335Arg Gly Pro Gln Glu Met Thr Ser Leu Ile Gly Phe Arg Asp Gly Pro
340 345 350Phe Arg Ala Asn Glu Ala
Ala Lys Lys Ile His Leu Ser Asn Met Ser 355 360
365Arg Ile Asn Gln Glu Thr Gln Lys Ile Phe Lys Gly Gly Lys
Leu Met 370 375 380Lys Pro Glu Glu Leu
Asp Ala Gln Ile Arg Xaa Arg Ser Ala Arg Phe385 390
395 400Val Gln Phe Asp Cys Phe His Glu Ile Leu
Pro Gln Pro Glu Asn Arg 405 410
415Ile Val Pro Ser Lys Thr Ala Thr Asp Ala Val Gly Ile Pro Arg Pro
420 425 430Glu Ile Thr Tyr Ala
Ile Asp Asp Tyr Val Lys Arg Gly Ala Val His 435
440 445Thr Arg Glu Val Tyr Ala Thr Ala Ala Lys Val Leu
Gly Gly Thr Glu 450 455 460Val Val Phe
Asn Asp Glu Phe Ala Pro Asn Asn His Ile Thr Gly Ala465
470 475 480Thr Ile Met Gly Ala Asp Ala
Arg Asp Ser Val Val Asp Lys Asp Cys 485
490 495Arg Ala Phe Asp His Pro Asn Leu Phe Ile Ser Ser
Ser Ser Thr Met 500 505 510Pro
Thr Val Gly Thr Val Asn Val Thr Leu Thr Ile Ala Ala Leu Ala 515
520 525Leu Arg Met Ser Asp Thr Leu Lys Lys
Glu Val Glu Phe Gly Ser Gly 530 535
540Tyr Gly Ser Gly Pro Pro Gly Pro Ile Arg Ala Gly Ala Xaa Met Xaa545
550 555 560His Arg Asp Arg
Gly Pro Cys Gly Ala Cys His Ala Ile Ile Gln 565
570 575
14 1727 DNA Artificial Synthetic
14 tggcggatac
ggatacccag aaagcggacg tggtcgtggt tggatccggc gtggcaggcg 60caatcgtggc
tcatcaactg gcaatggcag gtaaaagcgt gatcctgctg gaagctggtc 120cgtagatgcc
gcgttgggaa attgttgaac gtttccgcaa tcaagtcgat aaaaccgact 180ttatggcacc
gtatccgagc agcgcatggg caccgcatcc ggaatatggt ccgccgaatg 240attacctgat
cctgaaaggc gaacacaaat ttaactcaca gtacattcgt gcagtgggcg 300gcaccacgtg
gcattgggca gcctcggcat ggcgcttcat cccgaacgat tttaaaatga 360aaaccgtgta
tggcgttggt cgtgactggc cgattcagta cgatgacatc gaacattatt 420accaacgcgc
ggaagaagaa ctgggcgtgt ggggtccggg cccggaagaa gacctgtatt 480caccgcgtaa
agaaccgtac ccgatgccgc cgctgccgct gagtttcaat gaacaaacca 540ttaaatccgc
tctgaacggc tatgatccga aatttcacgt ggttacggaa ccggtggccc 600gtaattcgcg
cccgtacgac ggtcgcccga cctgctgtgg caacaataac tgcatgccga 660tttgtccgat
cggtgcaatg tataacggca tcgtccatgt ggaaaaagct gaacaggcag 720gtgctaaact
gattgatagt gcggtcgtgt acaaactgga aacgggcccg gacaaacgta 780ttaccgcagc
tgtttataaa gataaaacgg gtgcggacca tcgcgtcgaa ggcaaatact 840tcgtgattgc
ggccaatggt atcgaaaccc cgaaaattct gctgatgagc gcgaaccgtg 900attttccgaa
tggtgtggcc aacagttccg atatggttgg ccgcaatctg atggaccatc 960cgggcaccgg
cgtgagcttt tatgcaaacg aaaaactgtg gccgggtcgt ggtccgcagg 1020aaatgacctc
tctgatcggt ttccgtgatg gcccgtttcg cgcgaatgaa gcagcgaaga 1080aaattcatct
gtcaaatatg tcgcgtatca accaggaaac ccaaaaaatc tttaaaggcg 1140gtaaactgat
gaaaccggaa gaactggatg cgcagatccg tgaccgcagt gcccgctttg 1200ttcaattcga
ttgctttcac gaaatcctgc cgcagccgga aaatcgtatt gtcccgtcca 1260aaaccgcaac
ggacgcagtg ggtattccgc gtccggaaat tacgtatgcg atcgatgact 1320acgtcaaacg
tggcgcagtg catacgcgcg aagtttatgc taccgcggcc aaagtgctgg 1380gcggcaccga
agtggtcttc aacgatgaat ttgcgccgaa taaccacatc accggtgcca 1440cgattatggg
cgcggatgcc cgtgactcag tggttgataa agactgtcgc gccttcgatc 1500atccgaacct
gtttattagc agcagcagca ccatgccgac ggttggcacc gttaacgtca 1560ccctgacgat
tgcagctctg gcactgcgta tgtctgatac gctgaaaaaa gaagtcgaat 1620tcggttctgg
ttatggctct ggtccgccgg gtccgattcg tgcaggtgct accatgccgc 1680atcgtgatcg
tggtccgtgc ggtgcatgtc acgctattat ccagtaa
1727
15 1728 DNA Artificial Synthetic
15 atggcggata cggataccca gaaagcggac
gtggtcgtgg ttggatccgg cgtggcaggc 60gcaatcgtgg ctcatcaact ggcaatggca
ggtaaaagcg tgatcctgct ggaagctggt 120ccgcgtatgc cgcgttggga aattgttgaa
cgtttccgca atcaagtcga taaaaccgac 180tttatggcac cgtatccgag cagcgcatgg
gcaccgcatc cggaatatgg tccgccgaat 240gattacctga tcctgaaagg cgaacacaaa
tttaactcac agtacattcg tgcagtgggc 300ggcaccacgt ggcattgggc agcctcggca
tggcgcttca tcccgaacga ttttaaaatg 360aaaaccgtgt atggcgttgg tcgtgactgg
ccgattcagt acgatgacat cgaacattat 420taccaacgcg cggaagaaga actgggcgtg
tggggtccgg gcccggaaga agacctgtat 480tcaccgcgta aagaaccgta cccgatgccg
ccgctgccgc tgagtttcaa tgaacaaacc 540attaaatccg ctctgaacgg ctatgatccg
aaatttcacg tggttacgga accggtggcc 600cgtaattcgc gcccgtacga cggtcgcccg
acctgctgtg gcaacaataa ctgcatgccg 660atttgtccga tcggtgcaat gtataacggc
atcgtccatg tggaaaaagc tgaacaggca 720ggtgctaaac tgattgatta ggcggtcgtg
tacaaactgg aaacgggccc ggacaaacgt 780attaccgcag ctgtttataa agataaaacg
ggtgcggacc atcgcgtcga aggcaaatac 840ttcgtgattg cggccaatgg tatcgaaacc
ccgaaaattc tgctgatgag cgcgaaccgt 900gattttccga atggtgtggc caacagttcc
gatatggttg gccgcaatct gatggaccat 960ccgggcaccg gcgtgagctt ttatgcaaac
gaaaaactgt ggccgggtcg tggtccgcag 1020gaaatgacct ctctgatcgg tttccgtgat
ggcccgtttc gcgcgaatga agcagcgaag 1080aaaattcatc tgtcaaatat gtcgcgtatc
aaccaggaaa cccaaaaaat ctttaaaggc 1140ggtaaactga tgaaaccgga agaactggat
gcgcagatcc gtgaccgcag tgcccgcttt 1200gttcaattcg attgctttca cgaaatcctg
ccgcagccgg aaaatcgtat tgtcccgtcc 1260aaaaccgcaa cggacgcagt gggtattccg
cgtccggaaa ttacgtatgc gatcgatgac 1320tacgtcaaac gtggcgcagt gcatacgcgc
gaagtttatg ctaccgcggc caaagtgctg 1380ggcggcaccg aagtggtctt caacgatgaa
tttgcgccga ataaccacat caccggtgcc 1440acgattatgg gcgcggatgc ccgtgactca
gtggttgata aagactgtcg cgccttcgat 1500catccgaacc tgtttattag cagcagcagc
accatgccga cggttggcac cgttaacgtc 1560accctgacga ttgcagctct ggcactgcgt
atgtctgata cgctgaaaaa agaagtcgaa 1620ttcggttctg gttatggctc tggtccgccg
ggtccgattc gtgcaggtgc taccatgccg 1680catcgtgatc gtggtccgtg cggtgcatgt
cacgctatta tccagtaa 1728
16 1729 DNA Artificial Synthetic
16 atggcggata cggataccca gaaagcggac gtggtcgtgg ttggatccgg cgtggcaggc
60gcaatcgtgg ctcatcaact ggcaatggca ggtaaaagcg tgatcctgct ggaagctggt
120ccgcgtatgc cgcgttggga aattgttgaa cgtttccgca atcaagtcga taaaaccgac
180tttatggcac cgtatccgag cagcgcatgg gcaccgcatc cggaatatgg tccgccgaat
240gattacctga tcctgaaagg cgaacacaaa tttaactcac agtacattcg tgcagtgggc
300ggcaccacgt ggcattgggc agcctcggca tggcgcttca tcccgaacga ttttaaaatg
360aaaaccgtgt atggcgttgg tcgtgactgg ccgattcagt acgatgacat cgaacattat
420taccaacgcg cggaagaaga actgggcgtg tggggtccgg gcccggaaga agacctgtat
480tcaccgcgta aagaaccgta cccgatgccg ccgctgccgc tgagtttcaa tgaacaaacc
540attaaatccg ctctgaacgg ctatgatccg aaatttcacg tggttacgga accggtggcc
600cgtaattcgc gcccgtacga cggtcgcccg acctgctgtg gcaacaataa ctgcatgccg
660atttgtccga tcggtgcaat gtataacggc atcgtccatg tggaaaaagc tgaacaggca
720ggtgctaaac tgattgatag tgcggtcgtg tacaaactgg aaacgggccc ggacaaacgt
780attaccgcag ctgtttataa agataaaacg ggtgcggacc atcgcgtcga aggcaaatac
840ttcgtgattg cggccaatgg tatcgaaacc ccgaaaattc tgctgatgag cgcgaaccgt
900gattttccga atggtgtggc caacagttcc gatatggttg gccgcaatct gatggaccat
960ccgggcaccg gcgtgagctt ttatgcaaac gaaaaactgt ggccgggtcg tggtccgcag
1020gaaatgacct ctctgatcgg tttccgtgat ggcccgtttc gcgcgaatga agcagcgaag
1080aaaattcatc tgtcaaatat gtcgcgtatc aaccaggaaa cccaaaaaat ctttaaaggc
1140ggtaaactga tgaaaccgga agaactggat gcgcagatcc gttagcgcag tgcccgcttt
1200gttcaattcg attgctttca cgaaatcctg ccgcagccgg aaaatcgtat tgtcccgtcc
1260aaaaccgcaa cggacgcagt gggtattccg cgtccggaaa ttacgtatgc gatcgatgac
1320tacgtcaaac gtggcgcagt gcatacgcgc gaagtttatg ctaccgcggc caaagtgctg
1380ggcggcaccg aagtggtctt caacgatgaa tttgcgccga ataaccacat caccggtgcc
1440acgattatgg gcgcggatgc ccgtgactca gtggttgata aagactgtcg cgccttcgat
1500catccgaacc tgtttattag cagcagcagc accatgccga cggttggcac cgttaacgtc
1560accctgacga ttgcagctct ggcactgcgt atgtctgata cgctgaaaaa agaagtcgaa
1620ttcggttctg gttatggctc tggtccgccg ggtccgattc gtgcaggtgc taccatgccg
1680catcgtgatc gtggtccgtg cggtgcatgt cacgctatta tccaggtaa
1729
17 1728 DNA Artificial Synthetic
17 atggcggata cggataccca gaaagcggac
gtggtcgtgg ttggatccgg cgtggcaggc 60gcaatcgtgg ctcatcaact ggcaatggca
ggtaaaagcg tgatcctgct ggaagctggt 120ccgcgtatgc cgcgttggga aattgttgaa
cgtttccgca atcaagtcga taaaaccgac 180tttatggcac cgtatccgag cagcgcatgg
gcaccgcatc cggaatatgg tccgccgaat 240gattacctga tcctgaaagg cgaacacaaa
tttaactcac agtacattcg tgcagtgggc 300ggcaccacgt ggcattgggc agcctcggca
tggcgcttca tcccgaacga ttttaaaatg 360aaaaccgtgt atggcgttgg tcgtgactgg
ccgattcagt acgatgacat cgaacattat 420taccaacgcg cggaagaaga actgggcgtg
tggggtccgg gcccggaaga agacctgtat 480tcaccgcgta aagaaccgta cccgatgccg
ccgctgccgc tgagtttcaa tgaacaaacc 540attaaatccg ctctgaacgg ctatgatccg
aaatttcacg tggttacgga accggtggcc 600cgtaattcgc gcccgtacga cggtcgcccg
acctgctgtg gcaacaataa ctgcatgccg 660atttgtccga tcggtgcaat gtataacggc
atcgtccatg tggaaaaagc tgaacaggca 720ggtgctaaac tgattgatag tgcggtcgtg
tacaaactgg aaacgggccc ggacaaacgt 780attaccgcag ctgtttataa agataaaacg
ggtgcggacc atcgcgtcga aggcaaatac 840ttcgtgattg cggccaatgg tatcgaaacc
ccgaaaattc tgctgatgag cgcgaaccgt 900gattttccga atggtgtggc caacagttcc
gatatggttg gccgcaatct gatggaccat 960ccgggcaccg gcgtgagctt ttatgcaaac
gaaaaactgt ggccgggtcg tggtccgcag 1020gaaatgacct ctctgatcgg tttccgtgat
ggcccgtttc gcgcgaatga agcagcgaag 1080aaaattcatc tgtcaaatat gtcgcgtatc
aaccaggaaa cccaaaaaat ctttaaaggc 1140ggtaaactga tgaaaccgga agaactggat
gcgcagatcc gtgaccgcag tgcccgcttt 1200gttcaattcg attgctttca cgaaatcctg
ccgcagccgg aaaatcgtat tgtcccgtcc 1260aaaaccgcaa cggacgcagt gggtattccg
cgtccggaaa ttacgtatgc gatcgatgac 1320tacgtcaaac gtggcgcagt gcatacgcgc
gaagtttatg ctaccgcggc caaagtgctg 1380ggcggcaccg aagtggtctt caacgatgaa
tttgcgccga ataaccacat caccggtgcc 1440acgattatgg gcgcggatgc ccgtgactca
gtggttgata aagactgtcg cgccttcgat 1500catccgaacc tgtttattag cagcagcagc
accatgccga cggttggcac cgttaacgtc 1560accctgacga ttgcagctct ggcactgcgt
atgtctgata cgctgaaaaa agaagtcgaa 1620ttcggttctg gttatggctc tggtccgccg
ggtccgattc gtgcaggtgc ttagatgccg 1680catcgtgatc gtggtccgtg cggtgcatgt
cacgctatta tccagtaa 1728
18 1728 DNA Artificial Synthetic
18 atggcggata cggataccca gaaagcggac gtggtcgtgg ttggatccgg cgtggcaggc
60gcaatcgtgg ctcatcaact ggcaatggca ggtaaaagcg tgatcctgct ggaagctggt
120ccgcgtatgc cgcgttggga aattgttgaa cgtttccgca atcaagtcga taaaaccgac
180tttatggcac cgtatccgag cagcgcatgg gcaccgcatc cggaatatgg tccgccgaat
240gattacctga tcctgaaagg cgaacacaaa tttaactcac agtacattcg tgcagtgggc
300ggcaccacgt ggcattgggc agcctcggca tggcgcttca tcccgaacga ttttaaaatg
360aaaaccgtgt atggcgttgg tcgtgactgg ccgattcagt acgatgacat cgaacattat
420taccaacgcg cggaagaaga actgggcgtg tggggtccgg gcccggaaga agacctgtat
480tcaccgcgta aagaaccgta cccgatgccg ccgctgccgc tgagtttcaa tgaacaaacc
540attaaatccg ctctgaacgg ctatgatccg aaatttcacg tggttacgga accggtggcc
600cgtaattcgc gcccgtacga cggtcgcccg acctgctgtg gcaacaataa ctgcatgccg
660atttgtccga tcggtgcaat gtataacggc atcgtccatg tggaaaaagc tgaacaggca
720ggtgctaaac tgattgatag tgcggtcgtg tacaaactgg aaacgggccc ggacaaacgt
780attaccgcag ctgtttataa agataaaacg ggtgcggacc atcgcgtcga aggcaaatac
840ttcgtgattg cggccaatgg tatcgaaacc ccgaaaattc tgctgatgag cgcgaaccgt
900gattttccga atggtgtggc caacagttcc gatatggttg gccgcaatct gatggaccat
960ccgggcaccg gcgtgagctt ttatgcaaac gaaaaactgt ggccgggtcg tggtccgcag
1020gaaatgacct ctctgatcgg tttccgtgat ggcccgtttc gcgcgaatga agcagcgaag
1080aaaattcatc tgtcaaatat gtcgcgtatc aaccaggaaa cccaaaaaat ctttaaaggc
1140ggtaaactga tgaaaccgga agaactggat gcgcagatcc gtgaccgcag tgcccgcttt
1200gttcaattcg attgctttca cgaaatcctg ccgcagccgg aaaatcgtat tgtcccgtcc
1260aaaaccgcaa cggacgcagt gggtattccg cgtccggaaa ttacgtatgc gatcgatgac
1320tacgtcaaac gtggcgcagt gcatacgcgc gaagtttatg ctaccgcggc caaagtgctg
1380ggcggcaccg aagtggtctt caacgatgaa tttgcgccga ataaccacat caccggtgcc
1440acgattatgg gcgcggatgc ccgtgactca gtggttgata aagactgtcg cgccttcgat
1500catccgaacc tgtttattag cagcagcagc accatgccga cggttggcac cgttaacgtc
1560accctgacga ttgcagctct ggcactgcgt atgtctgata cgctgaaaaa agaagtcgaa
1620ttcggttctg gttatggctc tggtccgccg ggtccgattc gtgcaggtgc taccatgtag
1680catcgtgatc gtggtccgtg cggtgcatgt cacgctatta tccagtaa
1728
19 575 PRT Artificial Synthetic
X (42)..(42): X is a non-canonical amino acid.
19 Met Ala Asp Thr Asp Thr Gln Lys Ala Asp Val Val Val Val Gly Ser1
5 10 15Gly Val Ala Gly Ala Ile
Val Ala His Gln Leu Ala Met Ala Gly Lys 20 25
30Ser Val Ile Leu Leu Glu Ala Gly Pro Xaa Met Pro Arg
Trp Glu Ile 35 40 45Val Glu Arg
Phe Arg Asn Gln Val Asp Lys Thr Asp Phe Met Ala Pro 50
55 60Tyr Pro Ser Ser Ala Trp Ala Pro His Pro Glu Tyr
Gly Pro Pro Asn65 70 75
80Asp Tyr Leu Ile Leu Lys Gly Glu His Lys Phe Asn Ser Gln Tyr Ile
85 90 95Arg Ala Val Gly Gly Thr
Thr Trp His Trp Ala Ala Ser Ala Trp Arg 100
105 110Phe Ile Pro Asn Asp Phe Lys Met Lys Thr Val Tyr
Gly Val Gly Arg 115 120 125Asp Trp
Pro Ile Gln Tyr Asp Asp Ile Glu His Tyr Tyr Gln Arg Ala 130
135 140Glu Glu Glu Leu Gly Val Trp Gly Pro Gly Pro
Glu Glu Asp Leu Tyr145 150 155
160Ser Pro Arg Lys Glu Pro Tyr Pro Met Pro Pro Leu Pro Leu Ser Phe
165 170 175Asn Glu Gln Thr
Ile Lys Ser Ala Leu Asn Gly Tyr Asp Pro Lys Phe 180
185 190His Val Val Thr Glu Pro Val Ala Arg Asn Ser
Arg Pro Tyr Asp Gly 195 200 205Arg
Pro Thr Cys Cys Gly Asn Asn Asn Cys Met Pro Ile Cys Pro Ile 210
215 220Gly Ala Met Tyr Asn Gly Ile Val His Val
Glu Lys Ala Glu Gln Ala225 230 235
240Gly Ala Lys Leu Ile Asp Ser Ala Val Val Tyr Lys Leu Glu Thr
Gly 245 250 255Pro Asp Lys
Arg Ile Thr Ala Ala Val Tyr Lys Asp Lys Thr Gly Ala 260
265 270Asp His Arg Val Glu Gly Lys Tyr Phe Val
Ile Ala Ala Asn Gly Ile 275 280
285Glu Thr Pro Lys Ile Leu Leu Met Ser Ala Asn Arg Asp Phe Pro Asn 290
295 300Gly Val Ala Asn Ser Ser Asp Met
Val Gly Arg Asn Leu Met Asp His305 310
315 320Pro Gly Thr Gly Val Ser Phe Tyr Ala Asn Glu Lys
Leu Trp Pro Gly 325 330
335Arg Gly Pro Gln Glu Met Thr Ser Leu Ile Gly Phe Arg Asp Gly Pro
340 345 350Phe Arg Ala Asn Glu Ala
Ala Lys Lys Ile His Leu Ser Asn Met Ser 355 360
365Arg Ile Asn Gln Glu Thr Gln Lys Ile Phe Lys Gly Gly Lys
Leu Met 370 375 380Lys Pro Glu Glu Leu
Asp Ala Gln Ile Arg Asp Arg Ser Ala Arg Phe385 390
395 400Val Gln Phe Asp Cys Phe His Glu Ile Leu
Pro Gln Pro Glu Asn Arg 405 410
415Ile Val Pro Ser Lys Thr Ala Thr Asp Ala Val Gly Ile Pro Arg Pro
420 425 430Glu Ile Thr Tyr Ala
Ile Asp Asp Tyr Val Lys Arg Gly Ala Val His 435
440 445Thr Arg Glu Val Tyr Ala Thr Ala Ala Lys Val Leu
Gly Gly Thr Glu 450 455 460Val Val Phe
Asn Asp Glu Phe Ala Pro Asn Asn His Ile Thr Gly Ala465
470 475 480Thr Ile Met Gly Ala Asp Ala
Arg Asp Ser Val Val Asp Lys Asp Cys 485
490 495Arg Ala Phe Asp His Pro Asn Leu Phe Ile Ser Ser
Ser Ser Thr Met 500 505 510Pro
Thr Val Gly Thr Val Asn Val Thr Leu Thr Ile Ala Ala Leu Ala 515
520 525Leu Arg Met Ser Asp Thr Leu Lys Lys
Glu Val Glu Phe Gly Ser Gly 530 535
540Tyr Gly Ser Gly Pro Pro Gly Pro Ile Arg Ala Gly Ala Thr Met Pro545
550 555 560His Arg Asp Arg
Gly Pro Cys Gly Ala Cys His Ala Ile Ile Gln 565
570 575
20 575 PRT Artificial Synthetic
X (247)..(247): X is a non-canonical amino acid.
20 Met Ala Asp Thr Asp Thr Gln Lys Ala Asp Val
Val Val Val Gly Ser1 5 10
15Gly Val Ala Gly Ala Ile Val Ala His Gln Leu Ala Met Ala Gly Lys
20 25 30Ser Val Ile Leu Leu Glu Ala
Gly Pro Arg Met Pro Arg Trp Glu Ile 35 40
45Val Glu Arg Phe Arg Asn Gln Val Asp Lys Thr Asp Phe Met Ala
Pro 50 55 60Tyr Pro Ser Ser Ala Trp
Ala Pro His Pro Glu Tyr Gly Pro Pro Asn65 70
75 80Asp Tyr Leu Ile Leu Lys Gly Glu His Lys Phe
Asn Ser Gln Tyr Ile 85 90
95Arg Ala Val Gly Gly Thr Thr Trp His Trp Ala Ala Ser Ala Trp Arg
100 105 110Phe Ile Pro Asn Asp Phe
Lys Met Lys Thr Val Tyr Gly Val Gly Arg 115 120
125Asp Trp Pro Ile Gln Tyr Asp Asp Ile Glu His Tyr Tyr Gln
Arg Ala 130 135 140Glu Glu Glu Leu Gly
Val Trp Gly Pro Gly Pro Glu Glu Asp Leu Tyr145 150
155 160Ser Pro Arg Lys Glu Pro Tyr Pro Met Pro
Pro Leu Pro Leu Ser Phe 165 170
175Asn Glu Gln Thr Ile Lys Ser Ala Leu Asn Gly Tyr Asp Pro Lys Phe
180 185 190His Val Val Thr Glu
Pro Val Ala Arg Asn Ser Arg Pro Tyr Asp Gly 195
200 205Arg Pro Thr Cys Cys Gly Asn Asn Asn Cys Met Pro
Ile Cys Pro Ile 210 215 220Gly Ala Met
Tyr Asn Gly Ile Val His Val Glu Lys Ala Glu Gln Ala225
230 235 240Gly Ala Lys Leu Ile Asp Xaa
Ala Val Val Tyr Lys Leu Glu Thr Gly 245
250 255Pro Asp Lys Arg Ile Thr Ala Ala Val Tyr Lys Asp
Lys Thr Gly Ala 260 265 270Asp
His Arg Val Glu Gly Lys Tyr Phe Val Ile Ala Ala Asn Gly Ile 275
280 285Glu Thr Pro Lys Ile Leu Leu Met Ser
Ala Asn Arg Asp Phe Pro Asn 290 295
300Gly Val Ala Asn Ser Ser Asp Met Val Gly Arg Asn Leu Met Asp His305
310 315 320Pro Gly Thr Gly
Val Ser Phe Tyr Ala Asn Glu Lys Leu Trp Pro Gly 325
330 335Arg Gly Pro Gln Glu Met Thr Ser Leu Ile
Gly Phe Arg Asp Gly Pro 340 345
350Phe Arg Ala Asn Glu Ala Ala Lys Lys Ile His Leu Ser Asn Met Ser
355 360 365Arg Ile Asn Gln Glu Thr Gln
Lys Ile Phe Lys Gly Gly Lys Leu Met 370 375
380Lys Pro Glu Glu Leu Asp Ala Gln Ile Arg Asp Arg Ser Ala Arg
Phe385 390 395 400Val Gln
Phe Asp Cys Phe His Glu Ile Leu Pro Gln Pro Glu Asn Arg
405 410 415Ile Val Pro Ser Lys Thr Ala
Thr Asp Ala Val Gly Ile Pro Arg Pro 420 425
430Glu Ile Thr Tyr Ala Ile Asp Asp Tyr Val Lys Arg Gly Ala
Val His 435 440 445Thr Arg Glu Val
Tyr Ala Thr Ala Ala Lys Val Leu Gly Gly Thr Glu 450
455 460Val Val Phe Asn Asp Glu Phe Ala Pro Asn Asn His
Ile Thr Gly Ala465 470 475
480Thr Ile Met Gly Ala Asp Ala Arg Asp Ser Val Val Asp Lys Asp Cys
485 490 495Arg Ala Phe Asp His
Pro Asn Leu Phe Ile Ser Ser Ser Ser Thr Met 500
505 510Pro Thr Val Gly Thr Val Asn Val Thr Leu Thr Ile
Ala Ala Leu Ala 515 520 525Leu Arg
Met Ser Asp Thr Leu Lys Lys Glu Val Glu Phe Gly Ser Gly 530
535 540Tyr Gly Ser Gly Pro Pro Gly Pro Ile Arg Ala
Gly Ala Thr Met Pro545 550 555
560His Arg Asp Arg Gly Pro Cys Gly Ala Cys His Ala Ile Ile Gln
565 570
575
SEQ ID NO: 21; length 575; PRT; Artificial; Synthetic; X (395)..(395): X is a non-canonical amino acid
Met Ala Asp Thr Asp Thr Gln Lys Ala Asp Val Val Val Val Gly Ser1
5 10 15Gly Val Ala Gly Ala
Ile Val Ala His Gln Leu Ala Met Ala Gly Lys 20
25 30Ser Val Ile Leu Leu Glu Ala Gly Pro Arg Met Pro
Arg Trp Glu Ile 35 40 45Val Glu
Arg Phe Arg Asn Gln Val Asp Lys Thr Asp Phe Met Ala Pro 50
55 60Tyr Pro Ser Ser Ala Trp Ala Pro His Pro Glu
Tyr Gly Pro Pro Asn65 70 75
80Asp Tyr Leu Ile Leu Lys Gly Glu His Lys Phe Asn Ser Gln Tyr Ile
85 90 95Arg Ala Val Gly Gly
Thr Thr Trp His Trp Ala Ala Ser Ala Trp Arg 100
105 110Phe Ile Pro Asn Asp Phe Lys Met Lys Thr Val Tyr
Gly Val Gly Arg 115 120 125Asp Trp
Pro Ile Gln Tyr Asp Asp Ile Glu His Tyr Tyr Gln Arg Ala 130
135 140Glu Glu Glu Leu Gly Val Trp Gly Pro Gly Pro
Glu Glu Asp Leu Tyr145 150 155
160Ser Pro Arg Lys Glu Pro Tyr Pro Met Pro Pro Leu Pro Leu Ser Phe
165 170 175Asn Glu Gln Thr
Ile Lys Ser Ala Leu Asn Gly Tyr Asp Pro Lys Phe 180
185 190His Val Val Thr Glu Pro Val Ala Arg Asn Ser
Arg Pro Tyr Asp Gly 195 200 205Arg
Pro Thr Cys Cys Gly Asn Asn Asn Cys Met Pro Ile Cys Pro Ile 210
215 220Gly Ala Met Tyr Asn Gly Ile Val His Val
Glu Lys Ala Glu Gln Ala225 230 235
240Gly Ala Lys Leu Ile Asp Ser Ala Val Val Tyr Lys Leu Glu Thr
Gly 245 250 255Pro Asp Lys
Arg Ile Thr Ala Ala Val Tyr Lys Asp Lys Thr Gly Ala 260
265 270Asp His Arg Val Glu Gly Lys Tyr Phe Val
Ile Ala Ala Asn Gly Ile 275 280
285Glu Thr Pro Lys Ile Leu Leu Met Ser Ala Asn Arg Asp Phe Pro Asn 290
295 300Gly Val Ala Asn Ser Ser Asp Met
Val Gly Arg Asn Leu Met Asp His305 310
315 320Pro Gly Thr Gly Val Ser Phe Tyr Ala Asn Glu Lys
Leu Trp Pro Gly 325 330
335Arg Gly Pro Gln Glu Met Thr Ser Leu Ile Gly Phe Arg Asp Gly Pro
340 345 350Phe Arg Ala Asn Glu Ala
Ala Lys Lys Ile His Leu Ser Asn Met Ser 355 360
365Arg Ile Asn Gln Glu Thr Gln Lys Ile Phe Lys Gly Gly Lys
Leu Met 370 375 380Lys Pro Glu Glu Leu
Asp Ala Gln Ile Arg Xaa Arg Ser Ala Arg Phe385 390
395 400Val Gln Phe Asp Cys Phe His Glu Ile Leu
Pro Gln Pro Glu Asn Arg 405 410
415Ile Val Pro Ser Lys Thr Ala Thr Asp Ala Val Gly Ile Pro Arg Pro
420 425 430Glu Ile Thr Tyr Ala
Ile Asp Asp Tyr Val Lys Arg Gly Ala Val His 435
440 445Thr Arg Glu Val Tyr Ala Thr Ala Ala Lys Val Leu
Gly Gly Thr Glu 450 455 460Val Val Phe
Asn Asp Glu Phe Ala Pro Asn Asn His Ile Thr Gly Ala465
470 475 480Thr Ile Met Gly Ala Asp Ala
Arg Asp Ser Val Val Asp Lys Asp Cys 485
490 495Arg Ala Phe Asp His Pro Asn Leu Phe Ile Ser Ser
Ser Ser Thr Met 500 505 510Pro
Thr Val Gly Thr Val Asn Val Thr Leu Thr Ile Ala Ala Leu Ala 515
520 525Leu Arg Met Ser Asp Thr Leu Lys Lys
Glu Val Glu Phe Gly Ser Gly 530 535
540Tyr Gly Ser Gly Pro Pro Gly Pro Ile Arg Ala Gly Ala Thr Met Pro545
550 555 560His Arg Asp Arg
Gly Pro Cys Gly Ala Cys His Ala Ile Ile Gln 565
570 575
SEQ ID NO: 22; length 575; PRT; Artificial; Synthetic; X (558)..(558): X is a non-canonical amino acid
Met Ala Asp Thr Asp Thr Gln Lys Ala Asp Val
Val Val Val Gly Ser1 5 10
15Gly Val Ala Gly Ala Ile Val Ala His Gln Leu Ala Met Ala Gly Lys
20 25 30Ser Val Ile Leu Leu Glu Ala
Gly Pro Arg Met Pro Arg Trp Glu Ile 35 40
45Val Glu Arg Phe Arg Asn Gln Val Asp Lys Thr Asp Phe Met Ala
Pro 50 55 60Tyr Pro Ser Ser Ala Trp
Ala Pro His Pro Glu Tyr Gly Pro Pro Asn65 70
75 80Asp Tyr Leu Ile Leu Lys Gly Glu His Lys Phe
Asn Ser Gln Tyr Ile 85 90
95Arg Ala Val Gly Gly Thr Thr Trp His Trp Ala Ala Ser Ala Trp Arg
100 105 110Phe Ile Pro Asn Asp Phe
Lys Met Lys Thr Val Tyr Gly Val Gly Arg 115 120
125Asp Trp Pro Ile Gln Tyr Asp Asp Ile Glu His Tyr Tyr Gln
Arg Ala 130 135 140Glu Glu Glu Leu Gly
Val Trp Gly Pro Gly Pro Glu Glu Asp Leu Tyr145 150
155 160Ser Pro Arg Lys Glu Pro Tyr Pro Met Pro
Pro Leu Pro Leu Ser Phe 165 170
175Asn Glu Gln Thr Ile Lys Ser Ala Leu Asn Gly Tyr Asp Pro Lys Phe
180 185 190His Val Val Thr Glu
Pro Val Ala Arg Asn Ser Arg Pro Tyr Asp Gly 195
200 205Arg Pro Thr Cys Cys Gly Asn Asn Asn Cys Met Pro
Ile Cys Pro Ile 210 215 220Gly Ala Met
Tyr Asn Gly Ile Val His Val Glu Lys Ala Glu Gln Ala225
230 235 240Gly Ala Lys Leu Ile Asp Ser
Ala Val Val Tyr Lys Leu Glu Thr Gly 245
250 255Pro Asp Lys Arg Ile Thr Ala Ala Val Tyr Lys Asp
Lys Thr Gly Ala 260 265 270Asp
His Arg Val Glu Gly Lys Tyr Phe Val Ile Ala Ala Asn Gly Ile 275
280 285Glu Thr Pro Lys Ile Leu Leu Met Ser
Ala Asn Arg Asp Phe Pro Asn 290 295
300Gly Val Ala Asn Ser Ser Asp Met Val Gly Arg Asn Leu Met Asp His305
310 315 320Pro Gly Thr Gly
Val Ser Phe Tyr Ala Asn Glu Lys Leu Trp Pro Gly 325
330 335Arg Gly Pro Gln Glu Met Thr Ser Leu Ile
Gly Phe Arg Asp Gly Pro 340 345
350Phe Arg Ala Asn Glu Ala Ala Lys Lys Ile His Leu Ser Asn Met Ser
355 360 365Arg Ile Asn Gln Glu Thr Gln
Lys Ile Phe Lys Gly Gly Lys Leu Met 370 375
380Lys Pro Glu Glu Leu Asp Ala Gln Ile Arg Asp Arg Ser Ala Arg
Phe385 390 395 400Val Gln
Phe Asp Cys Phe His Glu Ile Leu Pro Gln Pro Glu Asn Arg
405 410 415Ile Val Pro Ser Lys Thr Ala
Thr Asp Ala Val Gly Ile Pro Arg Pro 420 425
430Glu Ile Thr Tyr Ala Ile Asp Asp Tyr Val Lys Arg Gly Ala
Val His 435 440 445Thr Arg Glu Val
Tyr Ala Thr Ala Ala Lys Val Leu Gly Gly Thr Glu 450
455 460Val Val Phe Asn Asp Glu Phe Ala Pro Asn Asn His
Ile Thr Gly Ala465 470 475
480Thr Ile Met Gly Ala Asp Ala Arg Asp Ser Val Val Asp Lys Asp Cys
485 490 495Arg Ala Phe Asp His
Pro Asn Leu Phe Ile Ser Ser Ser Ser Thr Met 500
505 510Pro Thr Val Gly Thr Val Asn Val Thr Leu Thr Ile
Ala Ala Leu Ala 515 520 525Leu Arg
Met Ser Asp Thr Leu Lys Lys Glu Val Glu Phe Gly Ser Gly 530
535 540Tyr Gly Ser Gly Pro Pro Gly Pro Ile Arg Ala
Gly Ala Xaa Met Pro545 550 555
560His Arg Asp Arg Gly Pro Cys Gly Ala Cys His Ala Ile Ile Gln
565 570
575
SEQ ID NO: 23; length 575; PRT; Artificial; Synthetic; X (560)..(560): X is a non-canonical amino acid
Met Ala Asp Thr Asp Thr Gln Lys Ala Asp Val Val Val Val Gly Ser1
5 10 15Gly Val Ala Gly Ala
Ile Val Ala His Gln Leu Ala Met Ala Gly Lys 20
25 30Ser Val Ile Leu Leu Glu Ala Gly Pro Arg Met Pro
Arg Trp Glu Ile 35 40 45Val Glu
Arg Phe Arg Asn Gln Val Asp Lys Thr Asp Phe Met Ala Pro 50
55 60Tyr Pro Ser Ser Ala Trp Ala Pro His Pro Glu
Tyr Gly Pro Pro Asn65 70 75
80Asp Tyr Leu Ile Leu Lys Gly Glu His Lys Phe Asn Ser Gln Tyr Ile
85 90 95Arg Ala Val Gly Gly
Thr Thr Trp His Trp Ala Ala Ser Ala Trp Arg 100
105 110Phe Ile Pro Asn Asp Phe Lys Met Lys Thr Val Tyr
Gly Val Gly Arg 115 120 125Asp Trp
Pro Ile Gln Tyr Asp Asp Ile Glu His Tyr Tyr Gln Arg Ala 130
135 140Glu Glu Glu Leu Gly Val Trp Gly Pro Gly Pro
Glu Glu Asp Leu Tyr145 150 155
160Ser Pro Arg Lys Glu Pro Tyr Pro Met Pro Pro Leu Pro Leu Ser Phe
165 170 175Asn Glu Gln Thr
Ile Lys Ser Ala Leu Asn Gly Tyr Asp Pro Lys Phe 180
185 190His Val Val Thr Glu Pro Val Ala Arg Asn Ser
Arg Pro Tyr Asp Gly 195 200 205Arg
Pro Thr Cys Cys Gly Asn Asn Asn Cys Met Pro Ile Cys Pro Ile 210
215 220Gly Ala Met Tyr Asn Gly Ile Val His Val
Glu Lys Ala Glu Gln Ala225 230 235
240Gly Ala Lys Leu Ile Asp Ser Ala Val Val Tyr Lys Leu Glu Thr
Gly 245 250 255Pro Asp Lys
Arg Ile Thr Ala Ala Val Tyr Lys Asp Lys Thr Gly Ala 260
265 270Asp His Arg Val Glu Gly Lys Tyr Phe Val
Ile Ala Ala Asn Gly Ile 275 280
285Glu Thr Pro Lys Ile Leu Leu Met Ser Ala Asn Arg Asp Phe Pro Asn 290
295 300Gly Val Ala Asn Ser Ser Asp Met
Val Gly Arg Asn Leu Met Asp His305 310
315 320Pro Gly Thr Gly Val Ser Phe Tyr Ala Asn Glu Lys
Leu Trp Pro Gly 325 330
335Arg Gly Pro Gln Glu Met Thr Ser Leu Ile Gly Phe Arg Asp Gly Pro
340 345 350Phe Arg Ala Asn Glu Ala
Ala Lys Lys Ile His Leu Ser Asn Met Ser 355 360
365Arg Ile Asn Gln Glu Thr Gln Lys Ile Phe Lys Gly Gly Lys
Leu Met 370 375 380Lys Pro Glu Glu Leu
Asp Ala Gln Ile Arg Asp Arg Ser Ala Arg Phe385 390
395 400Val Gln Phe Asp Cys Phe His Glu Ile Leu
Pro Gln Pro Glu Asn Arg 405 410
415Ile Val Pro Ser Lys Thr Ala Thr Asp Ala Val Gly Ile Pro Arg Pro
420 425 430Glu Ile Thr Tyr Ala
Ile Asp Asp Tyr Val Lys Arg Gly Ala Val His 435
440 445Thr Arg Glu Val Tyr Ala Thr Ala Ala Lys Val Leu
Gly Gly Thr Glu 450 455 460Val Val Phe
Asn Asp Glu Phe Ala Pro Asn Asn His Ile Thr Gly Ala465
470 475 480Thr Ile Met Gly Ala Asp Ala
Arg Asp Ser Val Val Asp Lys Asp Cys 485
490 495Arg Ala Phe Asp His Pro Asn Leu Phe Ile Ser Ser
Ser Ser Thr Met 500 505 510Pro
Thr Val Gly Thr Val Asn Val Thr Leu Thr Ile Ala Ala Leu Ala 515
520 525Leu Arg Met Ser Asp Thr Leu Lys Lys
Glu Val Glu Phe Gly Ser Gly 530 535
540Tyr Gly Ser Gly Pro Pro Gly Pro Ile Arg Ala Gly Ala Thr Met Xaa545
550 555 560His Arg Asp Arg
Gly Pro Cys Gly Ala Cys His Ala Ile Ile Gln 565
570 575
SEQ ID NO: 24; length 18; DNA; Artificial; Synthetic
catcaccatc accatcac 18
SEQ ID NO: 25; length 15; DNA; Artificial; Synthetic
ggcagtggtt ccggc 15
SEQ ID NO: 26; length 2278; DNA; Artificial; Synthetic
ccatggctca
caatgacaac accccgcact cccgccgtac cggcgatgcg gccgtgaccg 60gtattacgcg
tcgccagtgg ctgcaaggcg cgctggccct gaccgcagct ggcctgacgg 120gttccctggc
cctgcgcgca ctggctgatg atccgggcac cgcaccgctg gataccttta 180tgacgctgag
cgaagctctg acgggcaaaa aaggtctgtc tcgtgttctg ggccagcgtt 240ttctgcaagc
gctgcaaaaa ggttcattca aaaccgcgga ttcgctgccg cagctggcgg 300gcgccctggc
aagcggttct ctgaacccgg accaagaagc tctggcgctg aaaatcctgg 360aagcatggta
tctgggcatt gttgataatg tggttatcac ctacgaagaa gccctgatgt 420ttagtgtcgt
gtccgacacg ctggtcattc cgagctattg cccgaacaaa ccgggtttct 480gggccgaaaa
accgatcgaa cgtcaggcat aatggcggat acggataccc agaaagcgga 540cgtggtcgtg
gttggatccg gcgtggcagg cgcaatcgtg gctcatcaac tggcaatggc 600aggtaaaagc
gtgatcctgc tggaagctgg tccgcgtatg ccgcgttggg aaattgttga 660acgtttccgc
aatcaagtcg ataaaaccga ctttatggca ccgtatccga gcagcgcatg 720ggcaccgcat
ccggaatatg gtccgccgaa tgattacctg atcctgaaag gcgaacacaa 780atttaactca
cagtacattc gtgcagtggg cggcaccacg tggcattggg cagcctcggc 840atggcgcttc
atcccgaacg attttaaaat gaaaaccgtg tatggcgttg gtcgtgactg 900gccgattcag
tacgatgaca tcgaacatta ttaccaacgc gcggaagaag aactgggcgt 960gtggggtccg
ggcccggaag aagacctgta ttcaccgcgt aaagaaccgt acccgatgcc 1020gccgctgccg
ctgagtttca atgaacaaac cattaaatcc gctctgaacg gctatgatcc 1080gaaatttcac
gtggttacgg aaccggtggc ccgtaattcg cgcccgtacg acggtcgccc 1140gacctgctgt
ggcaacaata actgcatgcc gatttgtccg atcggtgcaa tgtataacgg 1200catcgtccat
gtggaaaaag ctgaacaggc aggtgctaaa ctgattgata gtgcggtcgt 1260gtacaaactg
gaaacgggcc cggacaaacg tattaccgca gctgtttata aagataaaac 1320gggtgcggac
catcgcgtcg aaggcaaata cttcgtgatt gcggccaatg gtatcgaaac 1380cccgaaaatt
ctgctgatga gcgcgaaccg tgattttccg aatggtgtgg ccaacagttc 1440cgatatggtt
ggccgcaatc tgatggacca tccgggcacc ggcgtgagct tttatgcaaa 1500cgaaaaactg
tggccgggtc gtggtccgca ggaaatgacc tctctgatcg gtttccgtga 1560tggcccgttt
cgcgcgaatg aagcagcgaa gaaaattcat ctgtcaaata tgtcgcgtat 1620caaccaggaa
acccaaaaaa tctttaaagg cggtaaactg atgaaaccgg aagaactgga 1680tgcgcagatc
cgtgaccgca gtgcccgctt tgttcaattc gattgctttc acgaaatcct 1740gccgcagccg
gaaaatcgta ttgtcccgtc caaaaccgca acggacgcag tgggtattcc 1800gcgtccggaa
attacgtatg cgatcgatga ctacgtcaaa cgtggcgcag tgcatacgcg 1860cgaagtttat
gctaccgcgg ccaaagtgct gggcggcacc gaagtggtct tcaacgatga 1920atttgcgccg
aataaccaca tcaccggtgc cacgattatg ggcgcggatg cccgtgactc 1980agtggttgat
aaagactgtc gcgccttcga tcatccgaac ctgtttatta gcagcagcag 2040caccatgccg
acggttggca ccgttaacgt caccctgacg attgcagctc tggcactgcg 2100tatgtctgat
acgctgaaaa aagaagtcga attcggttct ggttatggct ctggtccgcc 2160gggtccgatt
cgtgcaggtg ctaccatgcc gcatcgtgat cgtggtccgt gcggtgcatg 2220tcacgctatt
atccagggca gtggttccgg ccatcaccat caccatcact aaaagctt
2278
SEQ ID NO: 27; length 7; PRT; Artificial Sequence; Synthetic
Gly Ser Gly Tyr Gly Ser Gly
1               5
SEQ ID NO: 28; length 6; PRT; Artificial Sequence; Synthetic
His His His His His His
1               5
SEQ ID NO: 29; length 5; PRT; Artificial Sequence; Synthetic
Gly Ser Gly Ser Gly
1               5
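As a consistency check on the short sequences in this listing, the DNA of SEQ ID NO: 24 and SEQ ID NO: 25 can be translated and compared with the peptides of SEQ ID NO: 28 and SEQ ID NO: 29. The following is a minimal Python sketch, not part of the application: it assumes the standard genetic code applies, and the names `translate`, `SEQ_24` and `SEQ_25` are hypothetical and introduced here only for illustration.

```python
# Standard genetic code, with codons enumerated in t, c, a, g order.
# ASSUMPTION: the standard code is the appropriate table for these sequences.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter amino acid codes."""
    dna = dna.replace(" ", "").lower()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Sequences copied from the listing (hypothetical variable names).
SEQ_24 = "catcaccatc accatcac"
SEQ_25 = "ggcagtggtt ccggc"

print(translate(SEQ_24))  # HHHHHH, i.e. His His His His His His
print(translate(SEQ_25))  # GSGSG, i.e. Gly Ser Gly Ser Gly
```

Under these assumptions, the two translations agree with the residues listed for SEQ ID NO: 28 and SEQ ID NO: 29, respectively.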