Patent application title: RECOMBINANT MILK PROTEINS
Inventors:
Viviane Lanquar (San Carlos, CA, US)
Magi El-Richani (San Francisco, CA, US)
IPC8 Class: AC07K1447FI
USPC Class:
1 1
Class name:
Publication date: 2022-03-31
Patent application number: 20220098259
Abstract:
Provided herein are compositions and methods for producing recombinant
milk proteins, as well as food compositions comprising the same. In
aspects, the recombinant milk proteins of the disclosure are recombinant
fusion proteins comprising casein and beta-lactoglobulin.
Claims:
1. A recombinant fusion protein, comprising: a) casein; and b) β-lactoglobulin.
2. The recombinant fusion protein of claim 1, further comprising a protease cleavage site.
3. The recombinant fusion protein of claim 1, further comprising a chymosin cleavage site.
4. The recombinant fusion protein of claim 1, wherein the casein is bovine.
5. The recombinant fusion protein of claim 1, wherein the β-lactoglobulin is bovine.
6. The recombinant fusion protein of claim 1, wherein the casein and β-lactoglobulin are bovine.
7. A nucleic acid molecule encoding the recombinant fusion protein of claim 1.
8. The nucleic acid molecule of claim 7, wherein the nucleic acid sequence is codon optimized for expression in a plant.
9. The nucleic acid molecule of claim 8, wherein the plant is a soybean plant.
10. An expression vector comprising the nucleic acid molecule of claim 7.
11. A host cell comprising the expression vector of claim 10.
12. The host cell of claim 11, wherein the host cell is selected from the group consisting of plant cells, bacterial cells, fungal cells, and mammalian cells.
13. The host cell of claim 11, wherein the host cell is a plant cell.
14. A plant stably transformed with the nucleic acid molecule of claim 7.
15. The plant of claim 14, wherein the plant is a monocot selected from the group consisting of turf grass, maize, rice, oat, wheat, barley, sorghum, orchid, iris, lily, onion, palm, and duckweed.
16. The plant of claim 14, wherein the plant is a dicot selected from the group consisting of Arabidopsis, tobacco, tomato, potato, sweet potato, cassava, alfalfa, lima bean, pea, chick pea, soybean, carrot, strawberry, lettuce, oak, maple, walnut, rose, mint, squash, daisy, quinoa, buckwheat, mung bean, cow pea, lentil, lupin, peanut, fava bean, French beans, mustard, and cactus.
17. The plant of claim 14, wherein the plant is a soybean plant.
18. A food composition, comprising: a fusion protein comprising casein and β-lactoglobulin.
19. The food composition of claim 18, wherein the food composition is a solid.
20. The food composition of claim 18, wherein the food composition is a liquid.
21. The food composition of claim 18, wherein the food composition is a powder.
22. The food composition of claim 18, wherein the food composition is selected from the group consisting of: cheese, processed cheese product, yogurt, fermented dairy product, directly acidified counterpart of fermented dairy product, cottage cheese, dressing, frozen dairy product, frozen dessert, dessert, baked good, topping, icing, filling, low-fat spread, dairy-based dry mix, soup, sauce, salad dressing, geriatric nutrition, cream, creamer, analog dairy product, follow-up formula, baby formula, infant formula, milk, dairy beverage, acid dairy drink, smoothie, milk tea, butter, margarine, butter alternative, growing up milk, low-lactose product, low-lactose beverage, medical and clinical nutrition product, protein bar, nutrition bar, sport beverage, confection, meat product, analog meat product, meal replacement beverage, weight management food and beverage, dairy product, cultured buttermilk, sour cream, skyr, leben, lassi, kefir, powder containing a milk protein, and low-lactose product.
23. The food composition of claim 18, wherein the food composition is a dairy product.
24. The food composition of claim 18, wherein the food composition is an analog dairy product.
25. The food composition of claim 18, wherein the food composition is a low lactose product.
26. The food composition of claim 18, wherein the food composition is a milk.
27. The food composition of claim 18, wherein the food composition is a cheese.
28. The food composition of claim 18, wherein the food composition is fermented.
Description:
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of U.S. patent application Ser. No. 17/326,785 (now U.S. Pat. No. 11,142,555), filed May 21, 2021, which is a continuation of U.S. patent application Ser. No. 17/127,090 (now U.S. Pat. No. 11,034,743), filed Dec. 18, 2020, which is a continuation of U.S. patent application Ser. No. 17/039,760 (now U.S. Pat. No. 10,894,812), filed Sep. 30, 2020, the disclosures of which are hereby incorporated by reference in their entirety.
DESCRIPTION OF THE TEXT FILE SUBMITTED ELECTRONICALLY
[0002] The contents of the text file submitted electronically herewith are incorporated herein by reference in their entirety: A computer readable format copy of the Sequence Listing filename: ALRO_007_14US_SeqList_ST25.txt, date recorded: Oct. 4, 2021, file size 155 kilobytes.
FIELD OF THE DISCLOSURE
[0003] The present disclosure generally relates to recombinant milk proteins. The disclosure also relates to food compositions comprising recombinant milk proteins.
BACKGROUND
[0004] More than 7.5 billion people around the world consume milk and milk products. Demand for cow milk and dairy products is expected to keep increasing due to increased reliance on these products in developing countries as well as growth in the human population, which is expected to exceed 9 billion people by 2050.
[0005] Relying on animal agriculture to meet the growing demand for food is not a sustainable solution. According to the Food & Agriculture Organization of the United Nations, animal agriculture is responsible for 18% of all greenhouse gases, more than the entire transportation sector combined. Dairy cows alone account for 3% of this total.
[0006] In addition to impacting the environment, animal agriculture poses a serious risk to human health. A startling 80% of antibiotics used in the United States go towards treating animals, resulting in the development of antibiotic resistant microorganisms also known as superbugs. For years, food companies and farmers have administered antibiotics not only to sick animals, but also to healthy animals, to prevent illness. In September 2016, the United Nations announced the use of antibiotics in the food system as a crisis on par with Ebola and HIV.
[0007] It is estimated that cow milk accounts for 83% of global milk production. Accordingly, there is an urgent need to provide bovine milk and/or essential high-quality proteins from bovine milk in a more sustainable and humane manner, instead of solely relying on animal farming. Also, there is a need for selectively producing the specific milk proteins that confer nutritional and clinical benefits, and/or do not provoke allergic responses.
BRIEF SUMMARY
[0008] Provided herein are compositions and methods for producing milk proteins in transgenic plants. In some embodiments, a milk protein is stably expressed in a transgenic plant by fusing it to a stable protein, such as a stable mammalian, avian, plant or fungal protein. The compositions and methods provided herein allow for safe, sustainable and humane production of milk proteins for commercial use, such as use in food compositions.
[0009] In some embodiments, the disclosure provides a stably transformed plant comprising in its genome: a recombinant DNA construct encoding a fusion protein, the fusion protein comprising: (i) an unstructured milk protein, and (ii) a structured animal protein; wherein the fusion protein is stably expressed in the plant in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.
[0010] In some embodiments, the disclosure provides a stably transformed plant, comprising in its genome: a recombinant DNA construct encoding a fusion protein, the fusion protein comprising: κ-casein; and β-lactoglobulin; wherein the fusion protein is stably expressed in the plant in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.
[0011] In some embodiments, the disclosure provides a recombinant fusion protein comprising: (i) an unstructured milk protein, and (ii) a structured animal protein.
[0012] In some embodiments, the disclosure provides a plant-expressed recombinant fusion protein, comprising: κ-casein and β-lactoglobulin.
[0013] Also provided are nucleic acids encoding the recombinant fusion proteins described herein.
[0014] Also provided are vectors comprising a nucleic acid encoding one or more recombinant fusion proteins described herein, wherein the recombinant fusion protein comprises: (i) an unstructured milk protein, and (ii) a structured animal protein.
[0015] Also provided are plants comprising the recombinant fusion proteins and/or the nucleic acids described herein.
[0016] The instant disclosure also provides a method for stably expressing a recombinant fusion protein in a plant, the method comprising: a) transforming a plant with a plant transformation vector comprising an expression cassette comprising: a sequence encoding a fusion protein, wherein the fusion protein comprises an unstructured milk protein, and a structured animal protein; and b) growing the transformed plant under conditions wherein the recombinant fusion protein is expressed in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.
[0017] Also provided herein are methods for making food compositions, the methods comprising: expressing the recombinant fusion protein in a plant; extracting the recombinant fusion protein from the plant; optionally, separating the milk protein from the structured animal protein or the structured plant protein; and creating a food composition using the milk protein or the fusion protein.
[0018] Also provided herein are food compositions comprising one or more recombinant fusion proteins as described herein.
[0019] Also provided are food compositions produced using any one of the methods disclosed herein.
[0020] These and other embodiments are described in detail below.
BRIEF DESCRIPTION OF THE DRAWINGS
[0021] The accompanying figures, which are incorporated herein and form a part of the specification, illustrate some, but not the only or exclusive, example embodiments and/or features. It is intended that the embodiments and figures disclosed herein are to be considered illustrative rather than limiting.
[0022] FIGS. 1A, 1B, 1C, 1D, 1E, 1F, 1G, 1H, 1I, 1J, 1K, 1L, 1M, 1N, 1O, and 1P show expression cassettes having different combinations of fusions between structured and intrinsically unstructured proteins (not to scale). Coding regions and regulatory sequences are indicated as blocks (not to scale). As used in the figures, "L" refers to linker; "Sig" refers to a signal sequence that directs foreign proteins to protein storage vacuoles; "5' UTR" refers to the 5' untranslated region; and "KDEL" refers to an endoplasmic reticulum retention signal.
[0023] FIG. 2 shows the modified pAR15-00 cloning vector containing a selectable marker cassette conferring herbicide resistance. Coding regions and regulatory sequences are indicated as blocks (not to scale).
[0024] FIG. 3 shows an example expression cassette comprising an OKC1-T:OLG1 (Optimized Kappa Casein version 1:beta-lactoglobulin version 1, SEQ ID NOs: 71-72) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the ER retention signal (KDEL) and the 3'UTR of the arc5-1 gene, "arc-terminator". "arc5'UTR" refers to the 5' untranslated region of the arc5-1 gene. "Sig10" refers to the lectin 1 gene signal peptide. "RB" refers to ribosomal binding site. Coding regions and regulatory sequences are indicated as blocks (not to scale).
[0025] FIG. 4 shows an example expression cassette comprising an OBC-T2:FM:OLG1 (Optimized Beta Casein Truncated version 2:Chymosin cleavage site:beta-lactoglobulin version 1, SEQ ID NOs: 73-74) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the 3'UTR of the arc5-1 gene, "arc-terminator". "arc5'UTR" refers to the 5' untranslated region of the arc5-1 gene. "Sig10" refers to the lectin 1 gene signal peptide. "RB" refers to ribosomal binding site. Coding regions and regulatory sequences are indicated as blocks (not to scale). The Beta Casein is "truncated" in that the bovine secretion signal is removed, and replaced with a plant targeting signal.
[0026] FIG. 5 shows an example expression cassette comprising an OaS1-T:FM:OLG1 (Optimized Alpha S1 Casein Truncated version 1:Chymosin cleavage site:beta-lactoglobulin version 1, SEQ ID NOs: 75-76) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the 3'UTR of the arc5-1 gene, "arc-terminator". "arc5'UTR" refers to the 5' untranslated region of the arc5-1 gene. "Sig10" refers to the lectin 1 gene signal peptide. "RB" refers to ribosomal binding site. Coding regions and regulatory sequences are indicated as blocks (not to scale). The Alpha S1 Casein is "truncated" in that the bovine secretion signal is removed, and replaced with a plant targeting signal.
[0027] FIG. 6 shows an example expression cassette comprising a para-OKC1-T:FM:OLG1:KDEL (Optimized paraKappa Casein version 1:Chymosin cleavage site:beta-lactoglobulin version 1, SEQ ID NOs: 77-78) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the ER retention signal (KDEL) and the 3'UTR of the arc5-1 gene, "arc-terminator". "arc5'UTR" refers to the 5' untranslated region of the arc5-1 gene. "Sig10" refers to the lectin 1 gene signal peptide. "RB" refers to ribosomal binding site. Coding regions and regulatory sequences are indicated as blocks (not to scale).
[0028] FIG. 7 shows an example expression cassette comprising a para-OKC1-T:FM:OLG1 (Optimized paraKappa Casein version 1:Chymosin cleavage site:beta-lactoglobulin version 1, SEQ ID NOs: 79-80) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the 3'UTR of the arc5-1 gene, "arc-terminator". "arc5'UTR" refers to the 5' untranslated region of the arc5-1 gene. "Sig10" refers to the lectin 1 gene signal peptide. "RB" refers to ribosomal binding site. Coding regions and regulatory sequences are indicated as blocks (not to scale).
[0029] FIG. 8 shows an example expression cassette comprising an OKC1-T:OLG1 (Optimized Kappa Casein version 1:beta-lactoglobulin version 1, SEQ ID NOs: 81-82) fusion that is driven by the promoter and signal peptide of glycinin 1 (GmSeed2:sig2), followed by the ER retention signal (KDEL) and the nopaline synthase gene termination sequence (nos term). Coding regions and regulatory sequences are indicated as blocks (not to scale).
[0030] FIGS. 9A, 9B, 9C, and 9D show protein detection by western blotting. FIG. 9A shows detection of the fusion protein using a primary antibody raised against κ-casein (kCN). The kCN commercial protein is detected at an apparent MW of ~26 kDa (theoretical: 19 kDa; arrow). The fusion protein is detected at an apparent MW of ~40 kDa (theoretical: 38 kDa; arrowhead). FIG. 9B shows detection of the fusion protein using a primary antibody raised against β-lactoglobulin (LG). The LG commercial protein is detected at an apparent MW of ~18 kDa (theoretical: 18 kDa; arrow). The fusion protein is detected at an apparent MW of ~40 kDa (theoretical: 38 kDa; arrowhead). FIGS. 9C and 9D show protein gels as control for equal lane loading (image is taken at the end of the SDS run).
[0031] FIGS. 10A and 10B show two illustrative fusion proteins. In FIG. 10A, a κ-casein protein is fused to a β-lactoglobulin protein. The κ-casein comprises a natural chymosin cleavage site (arrow 1). Cleavage of the fusion protein with rennet (or chymosin) yields two fragments: a para-κ-casein fragment, and a fragment comprising a κ-casein macropeptide fused to β-lactoglobulin. In some embodiments, a second protease cleavage site may be added at the C-terminus of the κ-casein protein (i.e., at arrow 2), in order to further allow separation of the κ-casein macropeptide and the β-lactoglobulin. The second protease cleavage site may be a rennet cleavage site (e.g., a chymosin cleavage site), or it may be a cleavage site for a different protease. In FIG. 10B, a para-κ-casein protein is fused directly to β-lactoglobulin. A protease cleavage site (e.g., a rennet cleavage site) is added between the para-κ-casein and the β-lactoglobulin to allow for separation thereof. By fusing the para-κ-casein directly to the β-lactoglobulin, no κ-casein macropeptide is produced.
[0032] FIG. 11 is a flow-chart showing an illustrative process for producing a food composition comprising an unstructured milk protein, as described herein.
DETAILED DESCRIPTION
[0033] The following description includes information that may be useful in understanding the present disclosure. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed disclosures, or that any publication specifically or implicitly referenced is prior art.
Definitions
[0034] While the following terms are believed to be well understood by one of ordinary skill in the art, the following definitions are set forth to facilitate explanation of the presently disclosed subject matter.
[0035] All technical and scientific terms used herein, unless otherwise defined below, are intended to have the same meaning as commonly understood by one of ordinary skill in the art. References to techniques employed herein are intended to refer to the techniques as commonly understood in the art, including variations on those techniques and/or substitutions of equivalent techniques that would be apparent to one of skill in the art.
[0036] As used herein, the singular forms "a," "an," and "the" include plural referents unless the content clearly dictates otherwise.
[0037] The term "about" or "approximately" when immediately preceding a numerical value means a range (e.g., plus or minus 10% of that value). For example, "about 50" can mean 45 to 55, "about 25,000" can mean 22,500 to 27,500, etc., unless the context of the disclosure indicates otherwise, or is inconsistent with such an interpretation. For example, in a list of numerical values such as "about 49, about 50, about 55, . . . ", "about 50" means a range extending to less than half the interval(s) between the preceding and subsequent values, e.g., more than 49.5 to less than 52.5. Furthermore, the phrases "less than about" a value or "greater than about" a value should be understood in view of the definition of the term "about" provided herein. Similarly, the term "about" when preceding a series of numerical values or a range of values (e.g., "about 10, 20, 30" or "about 10-30") refers, respectively to all values in the series, or the endpoints of the range.
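The two readings of "about" given above can be checked with a short calculation. The following is a minimal sketch only; the function names `about_range` and `about_in_list` are hypothetical and are not part of the disclosure, and the 10% figure is simply the example tolerance stated in the definition.

```python
def about_range(value, percent=10):
    """Interval for 'about value' under the plus-or-minus `percent` reading."""
    delta = abs(value) * percent / 100
    return (value - delta, value + delta)


def about_in_list(values, index):
    """Bounds for 'about values[index]' when the value sits in a list.

    Each bound lies half-way to the neighbouring value; per the definition,
    the range extends to LESS than half the interval, so the bounds returned
    here are exclusive. A missing neighbour yields None for that side.
    """
    v = values[index]
    low = v - (v - values[index - 1]) / 2 if index > 0 else None
    high = v + (values[index + 1] - v) / 2 if index < len(values) - 1 else None
    return (low, high)


print(about_range(50))                 # (45.0, 55.0)
print(about_range(25000))              # (22500.0, 27500.0)
print(about_in_list([49, 50, 55], 1))  # (49.5, 52.5), both bounds exclusive
```

The printed intervals reproduce the worked figures in the paragraph above (45 to 55, 22,500 to 27,500, and more than 49.5 to less than 52.5).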
[0038] As used herein, "mammalian milk" can refer to milk derived from any mammal, such as bovine, human, goat, sheep, camel, buffalo, water buffalo, dromedary, llama and any combination thereof. In some embodiments, a mammalian milk is a bovine milk.
[0039] As used herein, "structured" refers to those proteins having a well-defined secondary and tertiary structure, and "unstructured" refers to proteins that do not have well defined secondary and/or tertiary structures. An unstructured protein may also be described as lacking a fixed or ordered three-dimensional structure. "Disordered" and "intrinsically disordered" are synonymous with unstructured.
[0040] As used herein, "rennet" refers to a set of enzymes typically produced in the stomachs of ruminant mammals. Chymosin, its key component, is a protease enzyme that cleaves κ-casein (to produce para-κ-casein). In addition to chymosin, rennet contains other enzymes, such as pepsin and lipase. Rennet is used to separate milk into solid curds (for cheesemaking) and liquid whey. Rennet or rennet substitutes are used in the production of most cheeses.
[0041] As used herein, "whey" refers to the liquid remaining after milk has been curdled and strained, for example during cheesemaking. Whey comprises a collection of globular proteins, typically a mixture of β-lactoglobulin, α-lactalbumin, bovine serum albumin, and immunoglobulins.
[0042] The term "plant" includes reference to whole plants, plant organs, plant tissues, and plant cells and progeny of same, and includes, but is not limited to, angiosperms and gymnosperms such as Arabidopsis, potato, tomato, tobacco, alfalfa, lettuce, carrot, strawberry, sugarbeet, cassava, sweet potato, soybean, lima bean, pea, chick pea, maize (corn), turf grass, wheat, rice, barley, sorghum, oat, oak, eucalyptus, walnut, palm and duckweed as well as fern and moss. Thus, a plant may be a monocot, a dicot, a vascular plant reproduced from spores such as fern or a nonvascular plant such as moss, liverwort, hornwort and algae. The word "plant," as used herein, also encompasses plant cells, seeds, plant progeny, propagule whether generated sexually or asexually, and descendants of any of these, such as cuttings or seed. Plant cells include suspension cultures, callus, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, seeds and microspores. Plants may be at various stages of maturity and may be grown in liquid or solid culture, or in soil or suitable media in pots, greenhouses or fields. Expression of an introduced leader, trailer or gene sequences in plants may be transient or permanent.
[0043] The term "vascular plant" refers to a large group of plants that are defined as those land plants that have lignified tissues (the xylem) for conducting water and minerals throughout the plant and a specialized non-lignified tissue (the phloem) to conduct products of photosynthesis. Vascular plants include the clubmosses, horsetails, ferns, gymnosperms (including conifers) and angiosperms (flowering plants). Scientific names for the group include Tracheophyta and Tracheobionta. Vascular plants are distinguished by two primary characteristics. First, vascular plants have vascular tissues which distribute resources through the plant. This feature allows vascular plants to evolve to a larger size than non-vascular plants, which lack these specialized conducting tissues and are therefore restricted to relatively small sizes. Second, in vascular plants, the principal generation phase is the sporophyte, which is usually diploid with two sets of chromosomes per cell. Only the germ cells and gametophytes are haploid. By contrast, the principal generation phase in non-vascular plants is the gametophyte, which is haploid with one set of chromosomes per cell. In these plants, only the spore stalk and capsule are diploid.
[0044] The term "non-vascular plant" refers to a plant without a vascular system consisting of xylem and phloem. Many non-vascular plants have simpler tissues that are specialized for internal transport of water. For example, mosses and leafy liverworts have structures that look like leaves, but are not true leaves because they are single sheets of cells with no stomata, no internal air spaces, and no xylem or phloem. Non-vascular plants include two distantly related groups. The first group is the bryophytes, which are further categorized as three separate land plant Divisions, namely Bryophyta (mosses), Marchantiophyta (liverworts), and Anthocerotophyta (hornworts). In all bryophytes, the primary plants are the haploid gametophytes, with the only diploid portion being the attached sporophyte, consisting of a stalk and sporangium. Because these plants lack lignified water-conducting tissues, they cannot become as tall as most vascular plants. The second group is the algae, especially the green algae, which consists of several unrelated groups. Only those groups of algae included in the Viridiplantae are still considered relatives of land plants.
[0045] The term "plant part" refers to any part of a plant including but not limited to the embryo, shoot, root, stem, seed, stipule, leaf, petal, flower bud, flower, ovule, bract, trichome, branch, petiole, internode, bark, pubescence, tiller, rhizome, frond, blade, ovule, pollen, stamen, and the like. The two main parts of plants grown in some sort of media, such as soil or vermiculite, are often referred to as the "above-ground" part, also often referred to as the "shoots", and the "below-ground" part, also often referred to as the "roots".
[0046] The term "plant tissue" refers to any part of a plant, such as a plant organ. Examples of plant organs include, but are not limited to the leaf, stem, root, tuber, seed, branch, pubescence, nodule, leaf axil, flower, pollen, stamen, pistil, petal, peduncle, stalk, stigma, style, bract, fruit, trunk, carpel, sepal, anther, ovule, pedicel, needle, cone, rhizome, stolon, shoot, pericarp, endosperm, placenta, berry, stamen, and leaf sheath.
[0047] The term "seed" is meant to encompass the whole seed and/or all seed components, including, for example, the coleoptile and leaves, radicle and coleorhiza, scutellum, starchy endosperm, aleurone layer, pericarp and/or testa, either during seed maturation or seed germination.
[0048] The term "transgenic plant" means a plant that has been transformed with one or more exogenous nucleic acids. "Transformation" refers to a process by which a nucleic acid is stably integrated into the genome of a plant cell. "Stably integrated" refers to the permanent, or non-transient retention and/or expression of a polynucleotide in and by a cell genome. Thus, a stably integrated polynucleotide is one that is a fixture within a transformed cell genome and can be replicated and propagated through successive progeny of the cell or resultant transformed plant. Transformation may occur under natural or artificial conditions using various methods well known in the art. Transformation may rely on any known method for the insertion of nucleic acid sequences into a prokaryotic or eukaryotic host cell, including Agrobacterium-mediated transformation protocols, viral infection, whiskers, electroporation, heat shock, lipofection, polyethylene glycol treatment, micro-injection, and particle bombardment.
[0049] As used herein, the terms "stably expressed" or "stable expression" refer to expression and accumulation of a protein in a plant cell over time. In some embodiments, a protein may accumulate because it is not degraded by endogenous plant proteases. In some embodiments, a protein is considered to be stably expressed in a plant if it is present in the plant in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.
[0050] As used herein, the term "fusion protein" refers to a protein comprising at least two constituent proteins (or fragments or variants thereof) that are encoded by separate genes, and that have been joined so that they are transcribed and translated as a single polypeptide. In some embodiments, a fusion protein may be separated into its constituent proteins, for example by cleavage with a protease.
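The separation of a fusion protein into its constituents can be pictured with a toy model. The sketch below is illustrative only: it treats the fusion described for FIG. 10A as an ordered list of named segments (not real amino acid sequences) and splits it at chosen segment boundaries. The function name `cleave` and the segment labels are hypothetical, and the model says nothing about protease specificity or efficiency.

```python
def cleave(segments, cut_after):
    """Split an ordered list of segment names into fragments.

    `cut_after` is a set of indices; a cut is made immediately after each
    listed segment, mimicking a protease site at that boundary.
    """
    fragments, current = [], []
    for idx, name in enumerate(segments):
        current.append(name)
        if idx in cut_after:
            fragments.append(current)
            current = []
    if current:
        fragments.append(current)
    return fragments


# Kappa-casein fused to beta-lactoglobulin, written as three segments.
fusion = ["para-kappa-casein", "kappa-casein macropeptide", "beta-lactoglobulin"]

# Natural chymosin site only (arrow 1): two fragments.
print(cleave(fusion, {0}))
# Natural site plus an added second site (arrow 2): three fragments.
print(cleave(fusion, {0, 1}))
```

With only the natural site, the macropeptide stays attached to the beta-lactoglobulin; adding the second boundary releases all three segments, matching the description of FIG. 10A.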
[0051] The term "recombinant" refers to nucleic acids or proteins formed by laboratory methods of genetic recombination (e.g., molecular cloning) to bring together genetic material from multiple sources, creating sequences that would not otherwise be found in the genome. A recombinant fusion protein is a protein created by combining sequences encoding two or more constituent proteins, such that they are expressed as a single polypeptide. Recombinant fusion proteins may be expressed in vivo in various types of host cells, including plant cells, bacterial cells, fungal cells, mammalian cells, etc. Recombinant fusion proteins may also be generated in vitro.
[0052] The term "promoter" or a "transcription regulatory region" refers to nucleic acid sequences that influence and/or promote initiation of transcription. Promoters are typically considered to include regulatory regions, such as enhancer or inducer elements. The promoter will generally be appropriate to the host cell in which the target gene is being expressed. The promoter, together with other transcriptional and translational regulatory nucleic acid sequences (also termed "control sequences"), is necessary to express any given gene. In general, the transcriptional and translational regulatory sequences include, but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop sequences, translational start and stop sequences, and enhancer or activator sequences.
[0053] The term "signal peptide" (also known as "signal sequence", "targeting signal", "localization signal", "localization sequence", "transit peptide", "leader sequence", or "leader peptide") is used herein to refer to an N-terminal peptide which directs a newly synthesized protein to a specific cellular location or pathway. Signal peptides are often cleaved from a protein during translation or transport, and are therefore not typically present in a mature protein.
[0054] The term "proteolysis" or "proteolytic" or "proteolyze" means the breakdown of proteins into smaller polypeptides or amino acids. Uncatalyzed hydrolysis of peptide bonds is extremely slow. Proteolysis is typically catalyzed by cellular enzymes called proteases, but may also occur by intra-molecular digestion. Low pH or high temperatures can also cause proteolysis non-enzymatically. Limited proteolysis of a polypeptide during or after translation in protein synthesis often occurs for many proteins. This may involve removal of the N-terminal methionine, signal peptide, and/or the conversion of an inactive or non-functional protein to an active one.
[0055] The term "2A peptide", used herein, refers to a nucleic acid sequence encoding a 2A peptide or the 2A peptide itself. The average length of 2A peptides is 18-22 amino acids. The designation "2A" refers to a specific region of picornavirus polyproteins and arose from a systematic nomenclature adopted by researchers. In foot-and-mouth disease virus (FMDV), a member of the Picornaviridae family, a 2A sequence appears to have the unique capability to mediate cleavage at its own C-terminus by an apparently enzyme-independent, novel type of reaction. This sequence can also mediate cleavage in a heterologous protein context in a range of eukaryotic expression systems. The 2A sequence is inserted between two genes of interest, maintaining a single open reading frame. Efficient cleavage of the polyprotein can lead to coordinated expression of two active proteins of interest. Self-processing polyproteins using the FMDV 2A sequence could therefore provide a system for ensuring coordinated, stable expression of multiple introduced proteins in cells including plant cells.
[0056] The term "purifying" is used interchangeably with the term "isolating" and generally refers to the separation of a particular component from other components of the environment in which it was found or produced. For example, purifying a recombinant protein from plant cells in which it was produced typically means subjecting transgenic protein-containing plant material to biochemical purification and/or column chromatography.
[0057] When referring to expression of a protein in a specific amount per the total protein weight of the soluble protein extractable from the plant ("TSP"), it is meant an amount of a protein of interest relative to the total amount of protein that may reasonably be extracted from a plant using standard methods. Methods for extracting total protein from a plant are known in the art. For example, total protein may be extracted from seeds by bead beating seeds at about 15000 rpm for about 1 min. The resulting powder may then be resuspended in an appropriate buffer (e.g., 50 mM Carbonate-Bicarbonate pH 10.8, 1 mM DTT, 1× Protease Inhibitor Cocktail). After the resuspended powder is incubated at about 4 °C for about 15 minutes, the supernatant may be collected after centrifuging (e.g., at 4000 g, 20 min, 4 °C). Total protein may be measured using standard assays, such as a Bradford assay. The amount of protein of interest may be measured using methods known in the art, such as an ELISA or a Western Blot.
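The "1% or higher per total protein weight of soluble protein" criterion reduces to a simple ratio. The following is a minimal sketch, assuming the protein of interest (e.g., from an ELISA) and the total soluble protein (e.g., from a Bradford assay) have already been measured in the same mass units for the same extract. The function names and the sample quantities are hypothetical illustrations, not measured data.

```python
def percent_of_tsp(protein_of_interest, total_soluble_protein):
    """Protein of interest as a percentage of total soluble protein (TSP).

    Both arguments must be in the same mass units (e.g., micrograms).
    """
    if total_soluble_protein <= 0:
        raise ValueError("total soluble protein must be positive")
    return 100.0 * protein_of_interest / total_soluble_protein


def meets_expression_threshold(protein_of_interest, total_soluble_protein,
                               threshold_percent=1.0):
    """True if the protein reaches the stated % TSP threshold (default 1%)."""
    return percent_of_tsp(protein_of_interest, total_soluble_protein) >= threshold_percent


# Hypothetical extract: 3 ug fusion protein in 200 ug total soluble protein.
print(percent_of_tsp(3.0, 200.0))              # 1.5
print(meets_expression_threshold(3.0, 200.0))  # True
print(meets_expression_threshold(1.0, 200.0))  # False (0.5% TSP)
```

The threshold is left as a parameter because it is stated as a floor ("1% or higher"), so the same check can be applied to any higher cut-off.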
[0058] When referring to a nucleic acid sequence or protein sequence, the term "identity" is used to denote similarity between two sequences. Sequence similarity or identity may be determined using standard techniques known in the art, including, but not limited to, the local sequence identity algorithm of Smith & Waterman, Adv. Appl. Math. 2, 482 (1981), by the sequence identity alignment algorithm of Needleman & Wunsch, J Mol. Biol. 48, 443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85, 2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Drive, Madison, Wis.), the Best Fit sequence program described by Devereux et al., Nucl. Acid Res. 12, 387-395 (1984), or by inspection. Another suitable algorithm is the BLAST algorithm, described in Altschul et al., J Mol. Biol. 215, 403-410, (1990) and Karlin et al., Proc. Natl. Acad. Sci. USA 90, 5873-5877 (1993). A particularly useful BLAST program is the WU-BLAST-2 program which was obtained from Altschul et al., Methods in Enzymology, 266, 460-480 (1996); http://blast.wustl/edu/blast/README.html. WU-BLAST-2 uses several search parameters, which are optionally set to the default values. The parameters are dynamic values and are established by the program itself depending upon the composition of the particular sequence and composition of the particular database against which the sequence of interest is being searched; however, the values may be adjusted to increase sensitivity. Further, an additional useful algorithm is gapped BLAST as reported by Altschul et al., (1997) Nucleic Acids Res. 25, 3389-3402.
As used herein, the terms "dicot" or "dicotyledon" or "dicotyledonous" refer to a flowering plant whose embryos have two seed leaves or cotyledons. Examples of dicots include, but are not limited to, Arabidopsis, tobacco, tomato, potato, sweet potato, cassava, alfalfa, lima bean, pea, chick pea, soybean, carrot, strawberry, lettuce, oak, maple, walnut, rose, mint, squash, daisy, quinoa, buckwheat, mung bean, cow pea, lentil, lupin, peanut, fava bean, French beans (i.e., common beans), mustard, or cactus.
[0059] The terms "monocot" or "monocotyledon" or "monocotyledonous" refer to a flowering plant whose embryos have one cotyledon or seed leaf. Examples of monocots include, but are not limited to turf grass, maize (corn), rice, oat, wheat, barley, sorghum, orchid, iris, lily, onion, palm, and duckweed.
[0060] As used herein, a "low lactose product" is any food composition considered by the FDA to be "lactose reduced", "low lactose", or "lactose free".
Unstructured Milk Proteins
[0061] The fusion proteins described herein may comprise one or more unstructured milk proteins. As used herein, the term "milk protein" refers to any protein, or fragment or variant thereof, that is typically found in one or more mammalian milks. Examples of mammalian milk include, but are not limited to, milk produced by a cow, human, goat, sheep, camel, horse, donkey, dog, cat, elephant, monkey, mouse, rat, hamster, guinea pig, whale, dolphin, seal, buffalo, water buffalo, dromedary, llama, yak, zebu, reindeer, mole, otter, weasel, wolf, raccoon, walrus, polar bear, rabbit, or giraffe.
[0062] An "unstructured milk protein" is a milk protein that lacks a defined secondary structure, a defined tertiary structure, or a defined secondary and tertiary structure. Whether a milk protein is unstructured may be determined using a variety of biophysical and biochemical methods known in the art, such as small angle X-ray scattering, Raman optical activity, circular dichroism, nuclear magnetic resonance (NMR) and protease sensitivity. In some embodiments, a milk protein is considered to be unstructured if it is unable to be crystallized using standard techniques.
[0063] Illustrative unstructured milk proteins that may be used in the fusion proteins of the disclosure include members of the casein family of proteins, such as .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, and .kappa.-casein. The caseins are phosphoproteins, and make up approximately 80% of the protein content in bovine milk and about 20-45% of the protein in human milk. Caseins form a multi-molecular, granular structure called a casein micelle in which some enzymes, water, and salts, such as calcium and phosphorus, are present. The micellar structure of casein in milk is significant both for how milk is digested in the stomach and intestine and as a basis for separating some proteins and other components from cow milk. In practice, casein proteins in bovine milk can be separated from whey proteins by acid precipitation of caseins, by breaking the micellar structure by partial hydrolysis of the protein molecules with proteolytic enzymes, or by microfiltration to separate the smaller soluble whey proteins from the larger casein micelle. Caseins are relatively hydrophobic, making them poorly soluble in water.
[0064] In some embodiments, the casein proteins described herein (e.g., .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, and/or .kappa.-casein) are isolated or derived from cow (Bos taurus), goat (Capra hircus), sheep (Ovis aries), water buffalo (Bubalus bubalis), dromedary camel (Camelus dromedarius), Bactrian camel (Camelus bactrianus), wild yak (Bos mutus), horse (Equus caballus), donkey (Equus asinus), reindeer (Rangifer tarandus), Eurasian elk (Alces alces), alpaca (Vicugna pacos), zebu (Bos indicus), llama (Lama glama), or human (Homo sapiens). In some embodiments, a casein protein (e.g., .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, or .kappa.-casein) has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity with a casein protein from one or more of cow (Bos taurus), goat (Capra hircus), sheep (Ovis aries), water buffalo (Bubalus bubalis), dromedary camel (Camelus dromedarius), Bactrian camel (Camelus bactrianus), wild yak (Bos mutus), horse (Equus caballus), donkey (Equus asinus), reindeer (Rangifer tarandus), Eurasian elk (Alces alces), alpaca (Vicugna pacos), zebu (Bos indicus), llama (Lama glama), or human (Homo sapiens).
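Identity thresholds such as those recited above can be checked computationally. The following is a minimal sketch, assuming a simple Needleman-Wunsch global alignment with illustrative scores (match +1, mismatch -1, gap -1) and defining percent identity as identical aligned positions divided by alignment length. These choices are assumptions made for illustration; established tools use substitution matrices and affine gap penalties and may define the denominator differently. The peptide strings are hypothetical placeholders, not sequences from this disclosure.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned positions as a percentage of alignment length."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    aligned_a, aligned_b = global_align(a, b)
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(a: str, b: str, threshold_percent: float) -> bool:
    """True if the two sequences share at least the given percent identity."""
    return percent_identity(a, b) >= threshold_percent


# Hypothetical 10-residue peptides differing at one position.
example_identity = percent_identity("MKVLAAGIVG", "MKVLAAGIVA")
```

Under these assumptions the two placeholder peptides share 9 of 10 aligned positions, so they would satisfy an "at least 90%" threshold but not an "at least 95%" threshold.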
[0065] As used herein, the term ".alpha.-S1 casein" refers to not only the .alpha.-S1 casein protein, but also fragments or variants thereof. .alpha.-S1 casein is found in the milk of numerous different mammalian species, including cow, goat, and sheep. The sequence, structure and physical/chemical properties of .alpha.-S1 casein derived from various species are highly variable. An exemplary sequence for bovine .alpha.-S1 casein can be found at Uniprot Accession No. P02662, and an exemplary sequence for goat .alpha.-S1 casein can be found at GenBank Accession No. X59836.1.
[0066] As used herein, the term ".alpha.-S2 casein" refers to not only the .alpha.-S2 casein protein, but also fragments or variants thereof. .alpha.-S2 casein is known as epsilon-casein in mouse, gamma-casein in rat, and casein-A in guinea pig. The sequence, structure and physical/chemical properties of .alpha.-S2 casein derived from various species are highly variable. An exemplary sequence for bovine .alpha.-S2 casein can be found at Uniprot Accession No. P02663, and an exemplary sequence for goat .alpha.-S2 casein can be found at Uniprot Accession No. P33049.
[0067] As used herein, the term ".beta.-casein" refers to not only the .beta.-casein protein, but also fragments or variants thereof. For example, A1 and A2 .beta.-casein are genetic variants of the .beta.-casein milk protein that differ by one amino acid (at amino acid 67, A2 .beta.-casein has a proline, whereas A1 has a histidine). Other genetic variants of .beta.-casein include the A3, B, C, D, E, F, H1, H2, I and G genetic variants. The sequence, structure and physical/chemical properties of .beta.-casein derived from various species are highly variable. Exemplary sequences for bovine .beta.-casein can be found at Uniprot Accession No. P02666 and GenBank Accession No. M15132.1.
[0068] As used herein, the term ".kappa.-casein" refers to not only the .kappa.-casein protein, but also fragments or variants thereof. .kappa.-casein is cleaved by rennet, which releases a macropeptide from the C-terminal region. The remaining product, which contains the N-terminus and about two-thirds of the original peptide chain, is referred to as para-.kappa.-casein. The sequence, structure and physical/chemical properties of .kappa.-casein derived from various species are highly variable. Exemplary sequences for bovine .kappa.-casein can be found at Uniprot Accession No. P02668 and GenBank Accession No. CAA25231.
[0069] In some embodiments, the unstructured milk protein is a casein protein, for example, .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, and/or .kappa.-casein. In some embodiments, the unstructured milk protein is .kappa.-casein and comprises the sequence of SEQ ID NO: 4, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is para-.kappa.-casein and comprises the sequence of SEQ ID NO: 2, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is .beta.-casein and comprises the sequence of SEQ ID NO: 6, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is .alpha.-S1 casein and comprises the sequence of SEQ ID NO: 8, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is .alpha.-S2 casein and comprises the sequence of SEQ ID NO: 84, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
[0070] In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 4. In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 2. In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 6. In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 8. In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 84.
[0071] In some embodiments, .alpha.-S1 casein is encoded by the sequence of SEQ ID NO: 7, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, .alpha.-S2 casein is encoded by the sequence of SEQ ID NO: 83, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, .beta.-casein is encoded by the sequence of SEQ ID NO: 5, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, .kappa.-casein is encoded by the sequence of SEQ ID NO: 3, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, para-.kappa.-casein is encoded by the sequence of SEQ ID NO: 1, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
[0072] In some embodiments, the unstructured milk protein is encoded by a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 7. In some embodiments, the unstructured milk protein is encoded by a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 83. In some embodiments, the unstructured milk protein is encoded by a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 3. In some embodiments, the unstructured milk protein is encoded by a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 1. In some embodiments, the unstructured milk protein is encoded by a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 5.
[0073] In some embodiments, the unstructured milk protein is a casein protein, and comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NO: 85-133. In some embodiments, the unstructured milk protein is a casein protein and comprises the sequence of any one of SEQ ID NO: 85-133.
[0074] In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NO: 85-98. In some embodiments, the unstructured milk protein comprises the sequence of any one of SEQ ID NO: 85-98.
[0075] In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NO: 99-109. In some embodiments, the unstructured milk protein comprises the sequence of any one of SEQ ID NO: 99-109.
[0076] In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NO: 110-120. In some embodiments, the unstructured milk protein comprises the sequence of any one of SEQ ID NO: 110-120.
[0077] In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NO: 121-133. In some embodiments, the unstructured milk protein comprises the sequence of any one of SEQ ID NO: 121-133.
Structured Proteins
[0078] The fusion proteins described herein may comprise one or more structured proteins, including any fragment or variant thereof. The proteins may be, for example, structured animal proteins, or structured plant proteins. In some embodiments, the structured animal proteins are mammalian proteins. In some embodiments, the structured animal proteins are avian proteins. In some embodiments, the structured proteins are structured milk proteins.
[0079] Whether a milk protein is structured may be determined using a variety of biophysical and biochemical methods known in the art, such as small angle X-ray scattering, Raman optical activity, circular dichroism, and protease sensitivity. In some embodiments, a milk protein is considered to be structured if it has been crystallized or if it may be crystallized using standard techniques.
[0080] In some embodiments, the structured protein is not a protein that is typically used as a marker. As used herein, the term "marker" refers to a protein that produces a visual or other signal and is used to detect successful delivery of a vector (e.g., a DNA sequence) into a cell. Proteins typically used as a marker may include, for example, fluorescent proteins (e.g., green fluorescent protein (GFP)) and bacterial or other enzymes (e.g., .beta.-glucuronidase (GUS), .beta.-galactosidase, luciferase, chloramphenicol acetyltransferase). In some embodiments, the structured protein is a non-marker protein.
[0081] A non-limiting list of illustrative structured proteins that may be used in the fusion proteins described herein is provided in Table 1. In some embodiments, a fragment or variant of any one of the proteins listed in Table 1 may be used. In some embodiments, the structured protein may be an animal protein. For example, in some embodiments, the structured protein may be a mammalian protein. In some embodiments, the structured protein may be a plant protein. For example, the plant protein may be a protein that is not typically expressed in a seed. In some embodiments, the plant protein may be a storage protein, e.g., a protein that acts as a storage reserve for nitrogen, carbon, and/or sulfur. In some embodiments, the plant protein may inhibit one or more proteases. In some embodiments, the structured protein may be a fungal protein.
TABLE 1 Structured proteins
Category | Protein or protein family | Native species | Exemplary Uniprot Accession No.
Mammalian | Alpha-lactalbumin | Bovine (Bos taurus) | P00711
Mammalian | Beta-lactoglobulin | Bovine (Bos taurus) | P02754
Mammalian | Albumin | Bovine (Bos taurus) | P02769
Mammalian | Lysozyme | Bovine (Bos taurus) | Q6B411
Mammalian | Collagen family | Human (Homo sapiens) | Q02388, P02452, P08123, P02458
Mammalian | Hemoglobin | Bovine (Bos taurus) | P02070
Avian proteins | Ovalbumin | Chicken (Gallus gallus) | P01012
Avian proteins | Ovotransferrin | Chicken (Gallus gallus) | P02789
Avian proteins | Ovoglobulin | Chicken (Gallus gallus) | I0J170
Avian proteins | Lysozyme | Chicken (Gallus gallus) | P00698
Plant proteins | Oleosins | Soybean (Glycine max) | P29530, P29531
Plant proteins | Leghemoglobin | Soybean (Glycine max) | Q41219
Plant proteins | Extensin-like protein family | Soybean (Glycine soja) | A0A445JU93
Plant proteins | Prolamine | Rice (Oryza sativa) | Q0DJ45
Plant proteins | Glutenin | Wheat (Sorghum bicolor) | P10388
Plant proteins | Gamma-kafirin preprotein | Wheat (Sorghum bicolor) | Q41506
Plant proteins | Alpha globulin | Rice (Oryza sativa) | P29835
Plant proteins | Basic 7S globulin precursor | Soybean (Glycine max) | P13917
Plant proteins | 2S albumin | Soybean (Glycine max) | P19594
Plant proteins | Beta-conglycinins | Soybean (Glycine max) | P0DO16, P0DO15, P0DO15
Plant proteins | Glycinins | Soybean (Glycine max) | P04347, P04776, P04405
Plant proteins | Canein | Sugar cane (Saccharum officinarum) | ABP64791.1
Plant proteins | Zein | Corn (Zea mays) | ABP64791.1
Plant proteins | Patatin | Tomato (Solanum lycopersicum) | P07745
Plant proteins | Kunitz-Trypsin inhibitor | Soybean (Glycine max) | Q39898
Plant proteins | Bowman-Birk inhibitor | Soybean (Glycine max) | I1MQD2
Plant proteins | Cystatine | Tomato (Solanum lycopersicum) | Q95E07
Fungal proteins | Hydrophobin I | Fungus (Trichoderma reesei) | P52754
Fungal proteins | Hydrophobin II | Fungus (Trichoderma reesei) | P79073
[0082] In some embodiments, the structured protein is an animal protein. In some embodiments, the structured protein is a mammalian protein. For example, the structured protein may be a mammalian protein selected from: .beta.-lactoglobulin, .alpha.-lactalbumin, albumin, lysozyme, lactoferrin, lactoperoxidase, hemoglobin, collagen, and an immunoglobulin (e.g., IgA, IgG, IgM, IgE). In some embodiments, the structured mammalian protein is .beta.-lactoglobulin and comprises the sequence of SEQ ID NO: 10, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the structured mammalian protein is .beta.-lactoglobulin and is encoded by the sequence of any one of SEQ ID NO: 9, 11, 12, or 13, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NO: 9, 11, 12, or 13. In some embodiments, the structured protein is an avian protein. For example, the structured protein may be an avian protein selected from: ovalbumin, ovotransferrin, lysozyme and ovoglobulin.
[0083] In some embodiments, the structured protein is a plant protein. For example, the structured protein may be a plant protein selected from: hydrophobin I, hydrophobin II, oleosins, leghemoglobin, extensin-like protein family, prolamine, glutenin, gamma-kafirin preprotein, .alpha.-globulin, basic 7S globulin precursor, 2S albumin, .beta.-conglycinins, glycinins, canein, zein, patatin, Kunitz-trypsin inhibitor, Bowman-Birk inhibitor, and cystatine.
Fusion Proteins
Fusion Proteins Comprising an Unstructured Milk Protein and a Structured Animal (e.g., Mammalian) Protein
[0084] In some embodiments, the fusion proteins described herein comprise (i) an unstructured milk protein, and (ii) a structured animal protein. In some embodiments, the fusion proteins described herein comprise (i) an unstructured milk protein, and (ii) a structured mammalian protein. In some embodiments, the fusion proteins described herein comprise (i) an unstructured milk protein, and (ii) a structured avian protein. In some embodiments, the fusion proteins described herein comprise (i) an unstructured milk protein, and (ii) a structured fungal protein.
[0085] In some embodiments, the fusion proteins comprise an unstructured milk protein, such as a casein protein. In some embodiments, the fusion proteins comprise an unstructured milk protein selected from .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, and .kappa.-casein. In some embodiments, the fusion proteins comprise an unstructured milk protein isolated or derived from cow (Bos taurus), goat (Capra hircus), sheep (Ovis aries), water buffalo (Bubalus bubalis), dromedary camel (Camelus dromedarius), Bactrian camel (Camelus bactrianus), wild yak (Bos mutus), horse (Equus caballus), donkey (Equus asinus), reindeer (Rangifer tarandus), Eurasian elk (Alces alces), alpaca (Vicugna pacos), zebu (Bos indicus), llama (Lama glama), or human (Homo sapiens). In some embodiments, the fusion proteins comprise a casein protein (e.g., .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, or .kappa.-casein) from cow (Bos taurus), goat (Capra hircus), sheep (Ovis aries), water buffalo (Bubalus bubalis), dromedary camel (Camelus dromedarius), Bactrian camel (Camelus bactrianus), wild yak (Bos mutus), horse (Equus caballus), donkey (Equus asinus), reindeer (Rangifer tarandus), Eurasian elk (Alces alces), alpaca (Vicugna pacos), zebu (Bos indicus), llama (Lama glama), or human (Homo sapiens).
[0086] In some embodiments, the unstructured milk protein is .alpha.-S1 casein. In some embodiments, the unstructured milk protein is .alpha.-S1 casein and comprises the sequence of SEQ ID NO: 8, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is .alpha.-S1 casein and comprises the sequence of any one of SEQ ID NO: 99-109, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
[0087] In some embodiments, the unstructured milk protein is .alpha.-S2 casein. In some embodiments, the unstructured milk protein is .alpha.-S2 casein and comprises the sequence SEQ ID NO: 84, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is .alpha.-S2 casein and comprises the sequence of any one of SEQ ID NO: 110-120, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
[0088] In some embodiments, the unstructured milk protein is .beta.-casein. In some embodiments, the unstructured milk protein is .beta.-casein and comprises the sequence of SEQ ID NO: 6, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is .beta.-casein and comprises the sequence of any one of SEQ ID NO: 121-133, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
[0089] In some embodiments, the unstructured milk protein is .kappa.-casein. In some embodiments, the unstructured milk protein is .kappa.-casein and comprises the sequence of SEQ ID NO: 4, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is .kappa.-casein and comprises the sequence of any one of SEQ ID NO: 85-98, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
[0090] In some embodiments, the unstructured milk protein is para-.kappa.-casein. In some embodiments, the unstructured milk protein is para-.kappa.-casein and comprises the sequence of SEQ ID NO: 2, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
[0091] In some embodiments, the structured mammalian protein is .beta.-lactoglobulin, .alpha.-lactalbumin, albumin, lysozyme, lactoferrin, lactoperoxidase, hemoglobin, collagen, or an immunoglobulin (e.g., IgA, IgG, IgM, or IgE). In some embodiments, the structured avian protein is ovalbumin, ovotransferrin, lysozyme or ovoglobulin.
[0092] In some embodiments, the structured mammalian protein is .beta.-lactoglobulin. In some embodiments, the structured mammalian protein is .beta.-lactoglobulin and comprises the sequence of SEQ ID NO: 10, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
[0093] In some embodiments, a fusion protein comprises a casein protein (e.g., .kappa.-casein, para-.kappa.-casein, .beta.-casein, or .alpha.-S1 casein) and .beta.-lactoglobulin. In some embodiments, a fusion protein comprises .kappa.-casein and .beta.-lactoglobulin (see, e.g., FIG. 3, FIG. 8, FIG. 10A-10B). In some embodiments, a fusion protein comprises para-.kappa.-casein and .beta.-lactoglobulin (see, e.g., FIG. 6, FIG. 7, FIG. 10A-10B). In some embodiments, a fusion protein comprises .beta.-casein and .beta.-lactoglobulin. In some embodiments, a fusion protein comprises .alpha.-S1 casein and .beta.-lactoglobulin.
[0094] In some embodiments, a plant-expressed recombinant fusion protein comprises .kappa.-casein, or fragment thereof; and .beta.-lactoglobulin, or fragment thereof. In some embodiments, the fusion protein comprises, in order from N-terminus to C-terminus, the .kappa.-casein and the .beta.-lactoglobulin.
Fusion Protein Comprising an Unstructured Milk Protein and a Structured Plant Protein
[0095] In some embodiments, the fusion proteins described herein comprise (i) an unstructured milk protein, and (ii) a structured plant protein. In some embodiments, the unstructured milk protein is a casein protein, such as .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, or .kappa.-casein. In some embodiments, the plant protein is selected from the group consisting of: hydrophobin I, hydrophobin II, oleosins, leghemoglobin, extensin-like protein family, prolamine, glutenin, gamma-kafirin preprotein, .alpha.-globulin, basic 7S globulin precursor, 2S albumin, .beta.-conglycinins, glycinins, canein, zein, patatin, Kunitz-trypsin inhibitor, Bowman-Birk inhibitor, and cystatine.
Fusion Protein Structure
[0096] The fusion proteins described herein may have various different structures, in order to increase expression and/or accumulation in a plant or other host organism or cell. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, an unstructured milk protein and a structured animal (e.g., mammalian or avian) protein. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, a structured animal (e.g., mammalian or avian) protein and a milk protein. For example, in some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, .kappa.-casein and .beta.-lactoglobulin. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, .beta.-lactoglobulin and .kappa.-casein. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, para-.kappa.-casein and .beta.-lactoglobulin. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, .beta.-lactoglobulin and para-.kappa.-casein. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, .beta.-casein and .beta.-lactoglobulin. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, .beta.-lactoglobulin and .beta.-casein. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, .alpha.-S1 casein and .beta.-lactoglobulin. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, .beta.-lactoglobulin and .alpha.-S1 casein.
[0097] In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, an unstructured milk protein and a structured plant protein. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, a structured plant protein and a milk protein. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, a casein protein and a structured plant protein. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, a structured plant protein and a casein protein.
[0098] In some embodiments, a fusion protein comprises a protease cleavage site. For example, in some embodiments, the fusion protein comprises an endoprotease, endopeptidase, and/or endoproteinase cleavage site. In some embodiments, the fusion protein comprises a rennet cleavage site. In some embodiments, the fusion protein comprises a chymosin cleavage site. In some embodiments, the fusion protein comprises a trypsin cleavage site.
[0099] The protease cleavage site may be located between the unstructured milk protein and the structured animal (e.g., mammalian or avian) protein, or between the unstructured milk protein and the structured plant protein, such that cleavage of the protein at the protease cleavage site will separate the unstructured milk protein from the structured animal (e.g., mammalian or avian) or plant protein.
[0100] In some embodiments, the protease cleavage site may be contained within the sequence of either the milk protein or the structured animal (e.g., mammalian or avian) or plant protein. In some embodiments, the protease cleavage site may be added separately, for example, between the two proteins.
[0101] In some embodiments, a fusion protein comprises a linker between the unstructured milk protein and the structured animal (e.g., mammalian or avian) protein, or between the unstructured milk protein and the structured plant protein. In some embodiments, the linker may comprise a peptide sequence recognizable by an endoprotease. In some embodiments, the linker may comprise a protease cleavage site. In some embodiments, the linker may comprise a self-cleaving peptide, such as a 2A peptide.
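The separation of the two fusion partners at a junction cleavage site can be illustrated with a minimal string-level sketch. It assumes a two-residue Phe-Met ("FM") junction, in line with the "FM" element that later paragraphs label a chymosin cleavage site. The function name `cleave_between` and both toy sequences are hypothetical and are not sequences of the disclosure. Real proteases recognize more sequence and structural context than a dipeptide, so this is only an approximation of the idea.

```python
def cleave_between(fusion: str, left: str, right: str) -> list[str]:
    """Split `fusion` between residues `left` and `right` at their first
    adjacent occurrence; return the uncut sequence if the pair is absent."""
    idx = fusion.find(left + right)
    if idx == -1:
        return [fusion]
    cut = idx + len(left)  # the cut falls after `left` and before `right`
    return [fusion[:cut], fusion[cut:]]


# Hypothetical toy sequences (NOT any SEQ ID NO of the disclosure).
toy_milk_part = "GSPEQTTVAF"        # stands in for the unstructured milk protein
toy_structured_part = "MLIVTQTKGD"  # stands in for the structured protein
toy_fusion = toy_milk_part + toy_structured_part

fragments = cleave_between(toy_fusion, "F", "M")
print(fragments)
```

In this sketch the cleavage site is formed by the last residue of one partner and the first residue of the other. A site carried on a separate linker could be modeled the same way by inserting the linker string between the two parts.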
[0102] In some embodiments, a fusion protein may comprise a signal peptide. The signal peptide may be cleaved from the fusion protein, for example, during processing or transport of the protein within the cell. In some embodiments, the signal peptide is located at the N-terminus of the fusion protein. In some embodiments, the signal peptide is located at the C-terminus of the fusion protein.
[0103] In some embodiments, the signal peptide is selected from the group consisting of GmSCB1, StPat21, 2Sss, Sig2, Sig12, Sig8, Sig10, Sig11, and Coixss. In some embodiments, the signal peptide is Sig10 and comprises SEQ ID NO: 15, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the signal peptide is Sig2 and comprises SEQ ID NO: 17, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
[0104] In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 71. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 73. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 75. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 77. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 79. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 81. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 135. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 137.
[0105] In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 71, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 73, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 75, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 77, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 79, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 81, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 135, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 137, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions.
[0106] In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 71, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 73, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 75, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 77, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 79, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 81, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 135, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 137, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
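The substitution counts and percent-identity thresholds recited above can be illustrated with a short sketch. It makes a simplifying assumption: the two sequences are already aligned and of equal length, so only substitutions are counted. In practice, percent identity is determined with a sequence alignment tool that also handles insertions and deletions. The function names and the toy sequences are hypothetical and do not correspond to any SEQ ID NO of the disclosure.

```python
def count_substitutions(reference: str, variant: str) -> int:
    """Count positions that differ between two equal-length sequences."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length (substitutions only)")
    return sum(1 for a, b in zip(reference, variant) if a != b)


def percent_identity(reference: str, variant: str) -> float:
    """Percent of matching positions, relative to the reference length."""
    mismatches = count_substitutions(reference, variant)
    return 100.0 * (len(reference) - mismatches) / len(reference)


# Hypothetical 20-residue toy sequences (NOT sequences of the disclosure).
toy_reference = "MKVLAAGIVGLLLAQPSFAE"
toy_variant = "MKVLAAGIVGLLLAQPSFAD"  # differs only at the last position

n_subs = count_substitutions(toy_reference, toy_variant)
identity = percent_identity(toy_reference, toy_variant)
print(n_subs, identity)
```

With one substitution in a 20-residue sequence, the variant is 95% identical to the reference, so in this toy case it would meet an "at least 95%" threshold but not an "at least 96%" threshold.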
[0107] In some embodiments, the fusion proteins have a molecular weight in the range of about 1 kDa to about 500 kDa, about 1 kDa to about 250 kDa, about 1 to about 100 kDa, about 10 to about 50 kDa, about 1 to about 10 kDa, about 10 to about 200 kDa, about 30 to about 150 kDa, about 30 kDa to about 50 kDa, or about 20 to about 80 kDa.
Nucleic Acids Encoding Fusion Proteins and Vectors Comprising the Same
[0108] Also provided herein are nucleic acids encoding the fusion proteins of the disclosure, for example fusion proteins comprising an unstructured milk protein and a structured animal (e.g., mammalian or avian) or plant protein. In some embodiments, the nucleic acids are DNAs. In some embodiments, the nucleic acids are RNAs.
[0109] In some embodiments, a nucleic acid comprises a sequence encoding a fusion protein. In some embodiments, a nucleic acid comprises a sequence encoding a fusion protein, which is operably linked to a promoter. In some embodiments, a nucleic acid comprises, in order from 5' to 3', a promoter, a 5' untranslated region (UTR), a sequence encoding a fusion protein, and a terminator.
[0110] The promoter may be a plant promoter. A "plant promoter" is a promoter capable of initiating transcription in plant cells. Examples of promoters under developmental control include promoters that preferentially initiate transcription in certain organs, such as leaves, roots, flowers, seeds and tissues such as fibers, xylem vessels, tracheids, or sclerenchyma. Such promoters are referred to as "tissue-preferred." Promoters which initiate transcription only in certain tissue are referred to as "tissue-specific." A "cell-type" specific promoter primarily drives expression in certain cell types in one or more organs, for example, vascular cells in leaves, roots, flowers, or seeds. An "inducible" promoter is a promoter which is under environmental control. Examples of environmental conditions that may affect transcription by inducible promoters include anaerobic conditions or the presence of light. Tissue-specific, tissue-preferred, cell-type specific, and inducible promoters constitute the class of "non-constitutive" promoters. A "constitutive" promoter is a promoter which is active under most environmental conditions.
[0111] In some embodiments, the promoter is a plant promoter derived from, for example, soybean, lima bean, Arabidopsis, tobacco, rice, maize, barley, sorghum, wheat, pea, and/or oat. In some embodiments, the promoter is a constitutive or an inducible promoter. Exemplary constitutive promoters include, but are not limited to, the promoters from plant viruses such as the 35S promoter from CaMV and the promoters from such genes as rice actin; ubiquitin; pEMU; MAS; and maize H3 histone. In some embodiments, the constitutive promoter is the ALS promoter, an XbaI/NcoI fragment 5' to the Brassica napus ALS3 structural gene (or a nucleotide sequence similar to said XbaI/NcoI fragment).
[0112] In some embodiments, the promoter is a plant tissue-specific or tissue-preferential promoter. In some embodiments, the promoter is isolated or derived from a soybean gene. Illustrative soybean tissue-specific promoters include AR-Pro1, AR-Pro2, AR-Pro3, AR-Pro4, AR-Pro5, AR-Pro6, AR-Pro7, AR-Pro8, and AR-Pro9.
[0113] In some embodiments, the promoter is a seed-specific promoter. In some embodiments, the seed-specific promoter is selected from the group consisting of PvPhas, BnNap, AtOle1, GmSeed2, GmSeed3, GmSeed5, GmSeed6, GmSeed7, GmSeed8, GmSeed10, GmSeed11, GmSeed12, pBCON, GmCEP1-L, GmTHIC, GmBg7S1, GmGRD, GmOLEA, GmOLEB, Gm2S-1, and GmBBId-II. In some embodiments, the seed-specific promoter is PvPhas and comprises the sequence of SEQ ID NO: 18, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the seed-specific promoter is GmSeed2 and comprises the sequence of SEQ ID NO: 19, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the promoter is a Cauliflower Mosaic Virus (CaMV) 35S promoter.
[0114] In some embodiments, the promoter is a soybean polyubiquitin (Gmubi) promoter, a soybean heat shock protein 90-like (GmHSP90L) promoter, or a soybean Ethylene Response Factor (GmERF) promoter. In some embodiments, the promoter is a constitutive soybean promoter derived from GmScreamM1, GmScreamM4, GmScreamM8 genes or GmubiXL genes.
[0115] In some embodiments, the 5' UTR is selected from the group consisting of Arc5'UTR and glnB1UTR. In some embodiments, the 5' untranslated region is Arc5'UTR and comprises the sequence of SEQ ID NO: 20, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
[0116] In some embodiments, the terminator sequence is isolated or derived from a gene encoding Nopaline synthase, Arc5-1, an Extensin, Rb7 matrix attachment region, a Heat shock protein, Ubiquitin 10, Ubiquitin 3, or M6 matrix attachment region. In some embodiments, the terminator sequence is isolated or derived from a Nopaline synthase gene and comprises the sequence of SEQ ID NO: 22, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
[0117] In some embodiments, the nucleic acid comprises a 3' UTR. For example, the 3' untranslated region may be Arc5-1 and comprise SEQ ID NO: 21, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
[0118] In some embodiments, the nucleic acid comprises a gene encoding a selectable marker. One illustrative selectable marker gene for plant transformation is the neomycin phosphotransferase II (nptII) gene, isolated from transposon Tn5, which, when placed under the control of plant regulatory signals, confers resistance to kanamycin. Another exemplary marker gene is the hygromycin phosphotransferase gene, which confers resistance to the antibiotic hygromycin. In some embodiments, the selectable marker is of bacterial origin and confers resistance to antibiotics, such as gentamycin acetyl transferase, streptomycin phosphotransferase, aminoglycoside-3'-adenyl transferase, or the bleomycin resistance determinant. In some embodiments, the selectable marker genes confer resistance to herbicides such as glyphosate, glufosinate or bromoxynil. In some embodiments, the selectable marker is mouse dihydrofolate reductase, plant 5-enolpyruvylshikimate-3-phosphate synthase, or plant acetolactate synthase. In some embodiments, the selectable marker is acetolactate synthase (e.g., AtCsr1.2).
[0119] In some embodiments, a nucleic acid comprises an endoplasmic reticulum retention signal. For example, in some embodiments, a nucleic acid comprises a KDEL sequence (SEQ ID NO: 23). In some embodiments, the nucleic acid may comprise an endoplasmic reticulum retention signal selected from any one of SEQ ID NO: 23-70.
[0120] Shown in Table 2 are exemplary promoters, 5' UTRs, signal peptides, and terminators that may be used in the nucleic acids of the disclosure.
TABLE 2: Promoters, 5' UTRs, signal peptides and terminators
Type | Name | Description | Native Species | Illustrative Accession No. (Glyma, GenBank)
Promoter | PvPhas | Phaseolin-1 (aka .beta.-phaseolin) | Common bean (Phaseolus vulgaris) | J01263.1
 | BnNap | Napin-1 | Rapeseed (Brassica napus) | J02798.1
 | AtOle1 | Oleosin-1 (Ole1) | Arabidopsis (Arabidopsis thaliana) | X62353.1, AT4G25140
 | GmSeed2 | Gy1 (Glycinin 1) | Soybean (Glycine max) | Glyma.03G163500
 | GmSeed3 | cysteine protease | Soybean (Glycine max) | Glyma.08G116300
 | GmSeed5 | Gy5 (Glycinin 5) | Soybean (Glycine max) | Glyma.13G123500
 | GmSeed6 | Gy4 (Glycinin 4) | Soybean (Glycine max) | Glyma.10G037100
 | GmSeed7 | Kunitz trypsin protease inhibitor | Soybean (Glycine max) | Glyma.01G095000
 | GmSeed8 | mKunitz trypsin protease inhibitor | Soybean (Glycine max) | Glyma.08G341500
 | GmSeed10 | Legume Lectin Domain | Soybean (Glycine max) | Glyma.02G012600
 | GmSeed11 | .beta.-conglycinin .alpha. subunit | Soybean (Glycine max) | Glyma.20G148400
 | GmSeed12 | .beta.-conglycinin .alpha.' subunit | Soybean (Glycine max) | Glyma.10G246300
 | pBCON | .beta.-conglycinin .beta. subunit | Soybean (Glycine max) | Glyma.20G148200
 | GmCEP1-L | KDEL-tailed cysteine endopeptidase CEP1-like | Soybean (Glycine max) | Glyma06g42780
 | GmTHIC | phosphomethylpyrimidine synthase | Soybean (Glycine max) | Glyma11g26470
 | GmBg7S1 | Basic 7S globulin precursor | Soybean (Glycine max) | Glyma03g39940
 | GmGRD | glucose and ribitol dehydrogenase-like | Soybean (Glycine max) | Glyma07g38790
 | GmOLEA | Oleosin isoform A | Soybean (Glycine max) | Glyma.19g063400
 | GmOLEB | Oleosin isoform B | Soybean (Glycine max) | Glyma.16g071800
 | Gm2S-1 | 2S albumin | Soybean (Glycine max) | Glyma13g36400
 | GmBBId-II | Bowman-Birk protease inhibitor | Soybean (Glycine max) | Glyma16g33400
5'UTR | Arc5'UTR | arc5-1 gene | Phaseolus vulgaris | J01263.1
 | glnB1UTR | 65 bp of native glutamine synthase | Soybean (Glycine max) | AF301590.1
Signal peptide | GmSCB1 | Seed coat BURP domain protein | Soybean (Glycine max) | Glyma07g28940.1
 | StPat21 | Patatin | Tomato (Solanum lycopersicum) | CAA27588
 | 2Sss | 2S albumin | Soybean (Glycine max) | Glyma13g36400
 | Sig2 | Glycinin G1 N-terminal peptide | Soybean (Glycine max) | Glyma.03G163500
 | Sig12 | Beta-conglycinin alpha prime subunit N-terminal peptide | Soybean (Glycine max) | Glyma.10G246300
 | Sig8 | Kunitz trypsin inhibitor N-terminal peptide | Soybean (Glycine max) | Glyma.08G341500
 | Sig10 | Lectin N-terminal peptide from Glycine max | Soybean (Glycine max) | Glyma.02G012600
 | Sig11 | Beta-conglycinin alpha subunit N-terminal peptide | Soybean (Glycine max) | Glyma.20G148400
 | Coixss | Alpha-coixin N-terminal peptide from Coix lacryma-jobi | Coix lacryma-jobi |
 | KDEL | C-terminal amino acids of sulfhydryl endopeptidase | Phaseolus vulgaris |
Terminator | NOS | Nopaline synthase gene termination sequence | Agrobacterium tumefaciens |
 | ARC | arc5-1 gene termination sequence | Phaseolus vulgaris | J01263.1
 | EU | Extensin termination sequence | Nicotiana tabacum |
 | Rb7 | Rb7 matrix attachment region termination sequence | Nicotiana tabacum |
 | HSP or AtHSP | Heat shock termination sequence | Arabidopsis thaliana |
 | AtUbi10 | Ubiquitin 10 termination sequence | Arabidopsis thaliana |
 | Stubi3 | Ubiquitin 3 termination sequence | Solanum tuberosum |
 | TM6 | M6 matrix attachment region termination sequence | Nicotiana tabacum |
[0121] Illustrative nucleic acids of the disclosure are provided in FIG. 1A-1P. In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding an unstructured milk protein, a sequence encoding a structured mammalian protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1A). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding an unstructured milk protein, a sequence encoding a linker, a sequence encoding a structured mammalian protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1B). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding an unstructured milk protein, a sequence encoding a linker, a sequence encoding a structured mammalian protein, and a terminator (See, e.g., FIG. 1C). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding an unstructured milk protein, a sequence encoding a structured mammalian protein, and a terminator (See, e.g., FIG. 1D). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a structured mammalian protein, a sequence encoding an unstructured milk protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1E). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a structured mammalian protein, a sequence encoding a linker, a sequence encoding an unstructured milk protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1F). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a structured mammalian protein, a sequence encoding a linker, a sequence encoding an unstructured milk protein, and a terminator (See, e.g., FIG. 1G). 
In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a structured mammalian protein, a sequence encoding an unstructured milk protein, and a terminator (See, e.g., FIG. 1H). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding an unstructured milk protein, a sequence encoding a structured mammalian protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1I). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding an unstructured milk protein, a sequence encoding a linker, a sequence encoding a structured mammalian protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1J). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding an unstructured milk protein, a sequence encoding a linker, a sequence encoding a structured mammalian protein, and a terminator (See, e.g., FIG. 1K). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding an unstructured milk protein, a sequence encoding a structured mammalian protein, and a terminator (See, e.g., FIG. 1L). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding a structured mammalian protein, a sequence encoding an unstructured milk protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1M). 
In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding a structured mammalian protein, a sequence encoding a linker, a sequence encoding an unstructured milk protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1N). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding a structured mammalian protein, a sequence encoding a linker, a sequence encoding an unstructured milk protein, and a terminator (See, e.g., FIG. 1O). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding a structured mammalian protein, a sequence encoding an unstructured milk protein, and a terminator (See, e.g., FIG. 1P).
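The sixteen arrangements recited in paragraph [0121] follow a combinatorial pattern: presence or absence of a signal peptide, the order of the two fusion partners, presence or absence of a linker, and presence or absence of an endoplasmic reticulum retention signal. A minimal sketch that enumerates them in the order of the text is given below. The function name `cassette_layouts` and the element labels are illustrative assumptions, and the mapping of letters to layouts is inferred from the paragraph itself, not from the figures.

```python
from itertools import product
from string import ascii_uppercase

# (has_linker, has_er_retention) in the order the paragraph walks through them.
LINKER_KDEL_ORDER = [(False, True), (True, True), (True, False), (False, False)]


def cassette_layouts() -> dict[str, list[str]]:
    """Return {figure label: elements listed 5' to 3'} for the 16 layouts."""
    layouts = {}
    # Outer loops: signal peptide absent then present; milk protein first then second.
    combos = product([False, True], [True, False], LINKER_KDEL_ORDER)
    for letter, (signal, milk_first, (linker, kdel)) in zip(ascii_uppercase, combos):
        first, second = "unstructured milk protein", "structured mammalian protein"
        if not milk_first:
            first, second = second, first
        parts = ["promoter", "5'UTR"]
        if signal:
            parts.append("signal peptide")
        parts.append(first)
        if linker:
            parts.append("linker")
        parts.append(second)
        if kdel:
            parts.append("ER retention signal")
        parts.append("terminator")
        layouts[f"FIG. 1{letter}"] = parts
    return layouts


layouts = cassette_layouts()
for label, parts in layouts.items():
    print(label, " > ".join(parts))
```

Enumerating the layouts this way makes it easy to confirm that the sixteen recited arrangements are all distinct and that no combination of the four options is missing.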
[0122] In some embodiments, the nucleic acid comprises an expression cassette comprising an OKC1-T:OLG1 (Optimized Kappa Casein version 1:beta-lactoglobulin version 1) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the ER retention signal (KDEL) and the 3'UTR of the arc5-1 gene, "arc-terminator" (See, e.g., FIG. 3). In some embodiments, the nucleic acid comprises SEQ ID NO: 72.
[0123] In some embodiments, the nucleic acid comprises an expression cassette comprising an OBC-T2:FM:OLG1 (Optimized Beta Casein Truncated version 2:Chymosin cleavage site:beta-lactoglobulin version 1) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the 3'UTR of the arc5-1 gene, "arc-terminator" (See, e.g., FIG. 4). In some embodiments, the nucleic acid comprises SEQ ID NO: 74. The Beta Casein is "truncated" in that the bovine secretion signal is removed and replaced with a plant targeting signal.
[0124] In some embodiments, the nucleic acid comprises an expression cassette comprising an OaS1-T:FM:OLG1 (Optimized Alpha S1 Casein Truncated version 1:Chymosin cleavage site:beta-lactoglobulin version 1) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the 3'UTR of the arc5-1 gene, "arc-terminator" (See, e.g., FIG. 5). In some embodiments, the nucleic acid comprises SEQ ID NO: 76. The Alpha S1 is "truncated" in that the bovine secretion signal is removed and replaced with a plant targeting signal.
[0125] In some embodiments, the nucleic acid comprises an expression cassette comprising a para-OKC1-T:FM:OLG1:KDEL (Optimized paraKappa Casein version 1:Chymosin cleavage site:beta-lactoglobulin version 1) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the ER retention signal (KDEL) and the 3'UTR of the arc5-1 gene, "arc-terminator" (See, e.g., FIG. 6). In some embodiments, the nucleic acid comprises SEQ ID NO: 78.
[0126] In some embodiments, the nucleic acid comprises an expression cassette comprising a para-OKC1-T:FM:OLG1 (Optimized paraKappa Casein version 1:Chymosin cleavage site:beta-lactoglobulin version 1) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the 3'UTR of the arc5-1 gene, "arc-terminator" (See, e.g., FIG. 7). In some embodiments, the nucleic acid comprises SEQ ID NO: 80.
[0127] In some embodiments, the nucleic acid comprises an expression cassette comprising an OKC1-T-OLG1 (Optimized Kappa Casein version 1:beta-lactoglobulin version 1) fusion that is driven by the promoter and signal peptide of glycinin 1 (GmSeed2:sig2) followed by the ER retention signal (KDEL) and the nopaline synthase gene termination sequence (nos term) (See, e.g., FIG. 8). In some embodiments, the nucleic acid comprises SEQ ID NO: 82. In some embodiments, a nucleic acid encoding a fusion protein comprises the sequence of any one of SEQ ID NO: 72, 74, 76, 78, 80, 82, 134, or 136.
[0128] In some embodiments, the nucleic acids are codon optimized for expression in a host cell. Codon optimization is a process used to improve gene expression and increase the translational efficiency of a gene of interest by accommodating codon bias of the host organism (i.e., the organism in which the gene is expressed). Codon-optimized mRNA sequences that are produced using different programs or approaches can vary because different codon optimization strategies differ in how they quantify codon usage and implement codon changes. Some approaches use the most optimal (frequently used) codon for all instances of an amino acid, or a variation of this approach. Other approaches adjust codon usage so that it is proportional to the natural distribution of the host organism. These approaches include codon harmonization, which endeavors to identify and maintain regions of slow translation thought to be important for protein folding. Alternative approaches involve using codons thought to correspond to abundant tRNAs, using codons according to their cognate tRNA concentrations, selectively replacing rare codons, or avoiding occurrences of codon-pairs that are known to translate slowly. In addition to approaches that vary in the extent to which codon usage is considered as a parameter, there are hypothesis-free approaches that do not consider this parameter. Algorithms for performing codon optimization are known to those of skill in the art and are widely available on the Internet.
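Two of the strategies mentioned above, using the most frequently used codon for every instance of an amino acid and choosing codons in proportion to host usage, can be contrasted in a short sketch. The codon frequencies in `TOY_USAGE` are hypothetical values chosen for illustration only and are not measured usage for soybean or any other host. The function names are likewise assumptions. A practical tool would cover all twenty amino acids and apply additional constraints.

```python
import random

# HYPOTHETICAL relative codon frequencies for three amino acids; illustrative
# values only, not measured codon usage for any real host organism.
TOY_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAG": 0.6, "AAA": 0.4},
    "L": {"CTT": 0.3, "TTG": 0.25, "CTC": 0.2, "CTG": 0.1, "TTA": 0.1, "CTA": 0.05},
}
CODON_TO_AA = {codon: aa for aa, codons in TOY_USAGE.items() for codon in codons}


def most_frequent(peptide: str) -> str:
    """Back-translate using the single most frequent codon per amino acid."""
    return "".join(max(TOY_USAGE[aa], key=TOY_USAGE[aa].get) for aa in peptide)


def proportional(peptide: str, seed: int = 0) -> str:
    """Back-translate by sampling codons in proportion to their frequency."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    chosen = []
    for aa in peptide:
        codons = list(TOY_USAGE[aa])
        weights = [TOY_USAGE[aa][c] for c in codons]
        chosen.append(rng.choices(codons, weights=weights, k=1)[0])
    return "".join(chosen)


def translate(dna: str) -> str:
    """Translate a coding sequence made only of codons in the toy table."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


toy_peptide = "MKLLK"  # hypothetical peptide, not a sequence of the disclosure
print(most_frequent(toy_peptide))
print(proportional(toy_peptide, seed=1))
```

Both functions return coding sequences that translate back to the same peptide. They differ only in which synonymous codons are chosen, which is the sense in which different optimization approaches can yield different nucleic acid sequences for one protein.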
[0129] In some embodiments, the nucleic acids are codon optimized for expression in a plant species. The plant species may be, for example, a monocot or a dicot. In some embodiments, the plant species is selected from soybean, lima bean, Arabidopsis, tobacco, rice, maize, barley, sorghum, wheat and/or oat. In some embodiments, the plant species is soybean.
[0130] The nucleic acids of the disclosure may be contained within a vector. The vector may be, for example, a viral vector or a non-viral vector. In some embodiments, the non-viral vector is a plasmid, such as an Agrobacterium Ti plasmid. In some embodiments, the non-viral vector is a lipid nanoparticle.
[0131] In some embodiments, a vector comprises a nucleic acid encoding a recombinant fusion protein, wherein the recombinant fusion protein comprises: (i) an unstructured milk protein, and (ii) a structured animal (e.g., mammalian or avian) protein. In some embodiments, the vector is an Agrobacterium Ti plasmid.
[0132] In some embodiments, a method for expressing a fusion protein in a plant comprises contacting the plant with a vector of the disclosure. In some embodiments, the method comprises maintaining the plant or part thereof under conditions in which the fusion protein is expressed.
Plants Expressing Fusion Proteins
[0133] Also provided herein are transgenic plants expressing one or more fusion proteins of the disclosure. In some embodiments, the transgenic plants stably express the fusion protein. In some embodiments, the transgenic plants stably express the fusion protein in the plant in an amount of at least 1% per the total protein weight of the soluble protein extractable from the plant. For example, the transgenic plants may stably express the fusion protein in an amount of at least 1%, at least 1.5%, at least 2%, at least 2.5%, at least 3%, at least 3.5%, at least 4%, at least 4.5%, at least 5%, at least 5.5%, at least 6%, at least 6.5%, at least 7%, at least 7.5%, at least 8%, at least 8.5%, at least 9%, at least 9.5%, at least 10%, at least 10.5%, at least 11%, at least 11.5%, at least 12%, at least 12.5%, at least 13%, at least 13.5%, at least 14%, at least 14.5%, at least 15%, at least 15.5%, at least 16%, at least 16.5%, at least 17%, at least 17.5%, at least 18%, at least 18.5%, at least 19%, at least 19.5%, at least 20%, or more of total protein weight of soluble protein extractable from the plant.
[0134] In some embodiments, the transgenic plants stably express the fusion protein in an amount of less than about 1% of the total protein weight of soluble protein extractable from the plant. In some embodiments, the transgenic plants stably express the fusion protein in the range of about 1% to about 2%, about 3% to about 4%, about 4% to about 5%, about 5% to about 6%, about 6% to about 7%, about 7% to about 8%, about 8% to about 9%, about 9% to about 10%, about 10% to about 11%, about 11% to about 12%, about 12% to about 13%, about 13% to about 14%, about 14% to about 15%, about 15% to about 16%, about 16% to about 17%, about 17% to about 18%, about 18% to about 19%, about 19% to about 20%, or more than about 20% of the total protein weight of soluble protein extractable from the plant.
[0135] In some embodiments, the transgenic plant stably expresses the fusion protein in an amount in the range of about 0.5% to about 3%, about 1% to about 4%, about 1% to about 5%, about 2% to about 5%, about 1% to about 10%, about 2% to about 10%, about 3% to about 10%, about 5% to about 12%, about 4% to about 10%, about 5% to about 10%, about 4% to about 8%, about 5% to about 15%, about 5% to about 18%, about 10% to about 20%, or about 1% to about 20% of the total protein weight of soluble protein extractable from the plant.
[0136] In some embodiments, the fusion protein is expressed at a level at least 2-fold higher than an unstructured milk protein expressed individually in a plant. For example, in some embodiments, the fusion protein is expressed at a level at least 2-fold, at least 2.5-fold, at least 3-fold, at least 3.5-fold, at least 4-fold, at least 4.5-fold, at least 5-fold, at least 5.5-fold, at least 6-fold, at least 7-fold, at least 7.5-fold, at least 8-fold, at least 8.5-fold, at least 9-fold, at least 9.5-fold, at least 10-fold, at least 25-fold, at least 50-fold, or at least 100-fold higher than an unstructured milk protein expressed individually in a plant.
[0137] In some embodiments, the fusion protein accumulates in the plant at least 2-fold higher than an unstructured milk protein expressed without the structured animal (e.g., mammalian or avian) protein. For example, in some embodiments, the fusion protein accumulates in the plant at least 2-fold, at least 2.5-fold, at least 3-fold, at least 3.5-fold, at least 4-fold, at least 4.5-fold, at least 5-fold, at least 5.5-fold, at least 6-fold, at least 7-fold, at least 7.5-fold, at least 8-fold, at least 8.5-fold, at least 9-fold, at least 9.5-fold, at least 10-fold, at least 25-fold, at least 50-fold, or at least 100-fold higher than an unstructured milk protein expressed without the structured animal protein.
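The two measures used in the preceding paragraphs, fusion protein as a percentage of total soluble protein and fold increase relative to the unstructured milk protein expressed alone, reduce to simple ratios. The sketch below shows the arithmetic. The function names and all input quantities are hypothetical and are not experimental results from the disclosure.

```python
def percent_of_soluble_protein(fusion_mg: float, total_soluble_mg: float) -> float:
    """Fusion protein as a percentage of total extractable soluble protein."""
    if total_soluble_mg <= 0:
        raise ValueError("total soluble protein must be positive")
    return 100.0 * fusion_mg / total_soluble_mg


def fold_increase(fusion_level: float, milk_protein_alone_level: float) -> float:
    """Ratio of fusion accumulation to the milk protein expressed by itself."""
    if milk_protein_alone_level <= 0:
        raise ValueError("comparison level must be positive")
    return fusion_level / milk_protein_alone_level


# Hypothetical quantities for illustration only (same units in each pair).
pct_fusion = percent_of_soluble_protein(fusion_mg=2.5, total_soluble_mg=100.0)
pct_alone = percent_of_soluble_protein(fusion_mg=0.5, total_soluble_mg=100.0)
fold = fold_increase(pct_fusion, pct_alone)
print(pct_fusion, pct_alone, fold)
```

In this toy case the fusion protein would fall within the "at least 2.5%" embodiment and would represent a 5-fold increase over the milk protein expressed individually.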
[0138] In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises (i) an unstructured milk protein, and (ii) a structured animal (e.g., mammalian or avian) protein. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 1% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 2% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 3% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 4% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 5% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 6% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 7% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 8% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 9% or higher per the total protein weight of the soluble protein extractable from the plant. 
In some embodiments, the fusion protein is stably expressed in the plant in an amount of 10% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 11% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 12% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 13% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 14% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 15% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 16% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 17% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 18% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 19% or higher per the total protein weight of the soluble protein extractable from the plant. 
In some embodiments, the fusion protein is stably expressed in the plant in an amount of 20% or higher per the total protein weight of the soluble protein extractable from the plant.
[0139] In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises from N-terminus to C-terminus, the unstructured milk protein and the animal (e.g., mammalian or avian) protein. In some embodiments, the fusion protein comprises, from N-terminus to C-terminus, the structured animal (e.g., mammalian or avian) protein and the milk protein.
[0140] In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises an unstructured milk protein such as a casein protein. In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises an unstructured milk protein selected from .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, and .kappa.-casein. In some embodiments, the unstructured milk protein is .alpha.-S1 casein. In some embodiments, the unstructured milk protein is .alpha.-S1 casein and comprises the sequence SEQ ID NO: 8, or a sequence at least 90% identical thereto. In some embodiments, the unstructured milk protein is .alpha.-S2 casein. In some embodiments, the unstructured milk protein is .alpha.-S2 casein and comprises the sequence SEQ ID NO: 84, or a sequence at least 90% identical thereto. In some embodiments, the unstructured milk protein is .beta.-casein. In some embodiments, the unstructured milk protein is .beta.-casein and comprises the sequence of SEQ ID NO: 6, or a sequence at least 90% identical thereto. In some embodiments, the unstructured milk protein is .kappa.-casein. In some embodiments, the unstructured milk protein is .kappa.-casein and comprises the sequence of SEQ ID NO: 4, or a sequence at least 90% identical thereto. In some embodiments, the unstructured milk protein is para-.kappa.-casein. In some embodiments, the unstructured milk protein is para-.kappa.-casein and comprises the sequence of SEQ ID NO: 2, or a sequence at least 90% identical thereto.
[0141] In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises a structured mammalian protein selected from .beta.-lactoglobulin, .alpha.-lactalbumin, albumin, lysozyme, lactoferrin, lactoperoxidase, hemoglobin, collagen, and an immunoglobulin (e.g., IgA, IgG, IgM, or IgE). In some embodiments, the structured mammalian protein is .beta.-lactoglobulin. In some embodiments, the structured mammalian protein is .beta.-lactoglobulin and comprises the sequence of SEQ ID NO: 10, or a sequence at least 90% identical thereto. In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises a structured avian protein selected from lysozyme, ovalbumin, ovotransferrin, and ovoglobulin.
[0142] In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises a casein protein and .beta.-lactoglobulin. In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises .kappa.-casein and .beta.-lactoglobulin. In some embodiments, the fusion protein comprises para-.kappa.-casein and .beta.-lactoglobulin. In some embodiments, the fusion protein comprises .beta.-casein and .beta.-lactoglobulin. In some embodiments, the fusion protein comprises .alpha.-S1 casein and .beta.-lactoglobulin.
[0143] In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein; wherein the fusion protein comprises (i) .kappa.-casein, and (ii) .beta.-lactoglobulin; and wherein the fusion protein is stably expressed in the plant in an amount of 1% or higher per the total protein weight of the soluble protein extractable from the plant.
[0144] In some embodiments, the stably transformed plant is a monocot. For example, in some embodiments, the plant may be a monocot selected from turf grass, maize (corn), rice, oat, wheat, barley, sorghum, orchid, iris, lily, onion, palm, and duckweed.
[0145] In some embodiments, the stably transformed plant is a dicot. For example, in some embodiments, the plant may be a dicot selected from Arabidopsis, tobacco, tomato, potato, sweet potato, cassava, alfalfa, lima bean, pea, chick pea, soybean, carrot, strawberry, lettuce, oak, maple, walnut, rose, mint, squash, daisy, quinoa, buckwheat, mung bean, cow pea, lentil, lupin, peanut, fava bean, French beans (i.e., common beans), mustard, or cactus. In some embodiments, the plant is a soybean (Glycine max).
[0146] In some embodiments, the plant is a non-vascular plant selected from moss, liverwort, hornwort or algae. In some embodiments, the plant is a vascular plant reproducing from spores (e.g., a fern).
[0147] In some embodiments, the recombinant DNA construct is codon-optimized for expression in the plant. For example, in some embodiments, the recombinant DNA construct is codon-optimized for expression in a soybean plant.
[0148] The transgenic plants described herein may be generated by various methods known in the art. For example, a nucleic acid encoding a fusion protein may be contacted with a plant, or a part thereof, and the plant may then be maintained under conditions wherein the fusion protein is expressed. In some embodiments, the nucleic acid is introduced into the plant, or part thereof, using one or more methods for plant transformation known in the art, such as Agrobacterium-mediated transformation, particle bombardment-mediated transformation, electroporation, and microinjection.
[0149] In some embodiments, a method for stably expressing a recombinant fusion protein in a plant comprises (i) transforming a plant with a plant transformation vector comprising an expression cassette comprising: a sequence encoding a fusion protein, wherein the fusion protein comprises an unstructured milk protein, and a structured animal (e.g., mammalian or avian) protein; and (ii) growing the transformed plant under conditions wherein the recombinant fusion protein is expressed. In some embodiments, the recombinant fusion protein is expressed in an amount of 1% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the unstructured milk protein is .kappa.-casein. In some embodiments, the structured mammalian protein is .beta.-lactoglobulin. In some embodiments, the unstructured milk protein is .kappa.-casein and the structured mammalian protein is .beta.-lactoglobulin.
Food Compositions Comprising a Fusion Protein
[0150] The fusion proteins and transgenic plants described herein may be used to prepare food compositions. The fusion protein may be used directly to prepare the food composition (i.e., in the form of a fusion protein), or the fusion protein may first be separated into its constituent proteins. For example, in some embodiments, a food composition may comprise either (i) a fusion protein, (ii) an unstructured milk protein, (iii) a structured mammalian, avian, or plant protein, or (iv) an unstructured milk protein and a structured mammalian, avian, or plant protein. An illustrative method for preparing a food composition of the disclosure is provided in FIG. 11.
[0151] In some embodiments, the fusion proteins and transgenic plants described herein may be used to prepare a food composition selected from cheese and processed cheese products, yogurt and fermented dairy products, directly acidified counterparts of fermented dairy products, cottage cheese dressing, frozen dairy products, frozen desserts, desserts, baked goods, toppings, icings, fillings, low-fat spreads, dairy-based dry mixes, soups, sauces, salad dressing, geriatric nutrition, creams and creamers, analog dairy products, follow-up formula, baby formula, infant formula, milk, dairy beverages, acid dairy drinks, smoothies, milk tea, butter, margarine, butter alternatives, growing up milks, low-lactose products and beverages, medical and clinical nutrition products, protein/nutrition bar applications, sports beverages, confections, meat products, analog meat products, meal replacement beverages, and weight management food and beverages.
[0152] In some embodiments, the fusion proteins and transgenic plants described herein may be used to prepare a dairy product. In some embodiments, the dairy product is a fermented dairy product. An illustrative list of fermented dairy products includes cultured buttermilk, sour cream, yogurt, skyr, leben, lassi, and kefir. In some embodiments, the fusion proteins and transgenic plants described herein may be used to prepare cheese products.
[0153] In some embodiments, the fusion proteins and transgenic plants described herein may be used to prepare a powder containing a milk protein. In some embodiments, the fusion proteins and transgenic plants described herein may be used to prepare a low-lactose product.
[0154] In some embodiments, a method for making a food composition comprises expressing a recombinant fusion protein of the disclosure in a plant, extracting the recombinant fusion protein from the plant, optionally separating the milk protein from the structured mammalian or plant protein, and creating a food composition using the fusion protein and/or the milk protein.
[0155] The recombinant fusion proteins may be extracted from a plant using standard methods known in the art. For example, the fusion proteins may be extracted using solvent or aqueous extraction. In some embodiments, the fusion proteins may be extracted using phenol extraction. Once extracted, the fusion proteins may be maintained in a buffered environment (e.g., Tris, MOPS, HEPES), in order to avoid sudden changes in the pH. The fusion proteins may also be maintained at a particular temperature, such as 4.degree. C. In some embodiments, one or more additives may be used to aid the extraction process (e.g., salts, protease/peptidase inhibitors, osmolytes, reducing agents, etc.).
[0156] In some embodiments, a method for making a food composition comprises expressing a recombinant fusion protein of the disclosure in a plant, extracting one or both of the unstructured milk protein and the structured mammalian or plant protein from the plant, and creating a food composition using the milk protein.
[0157] In some embodiments, the milk protein and the structured mammalian or plant protein are separated from one another in the plant cell, prior to extraction. In some embodiments, the milk protein is separated from the structured mammalian or plant protein after extraction, for example by contacting the fusion protein with an enzyme that cleaves the fusion protein. The enzyme may be, for example, chymosin. In some embodiments, the fusion protein is cleaved using rennet.
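By way of illustration only, the post-extraction separation step may be pictured with the following minimal Python sketch, which splits a one-letter amino acid sequence at a dipeptide cleavage site. The sketch assumes that the cleavage site is a Phe-Met (FM) dipeptide, as at the Phe105-Met106 bond that chymosin cleaves in native .kappa.-casein; the function name `cleave_fusion` and the toy sequences are hypothetical and are not taken from any SEQ ID NO of the disclosure. Actual chymosin specificity also depends on the residues surrounding the scissile bond, which this sketch ignores.

```python
def cleave_fusion(sequence: str, site: str = "FM") -> list[str]:
    """Split a one-letter amino acid sequence between the two residues
    of every occurrence of the dipeptide `site`.

    Simplification (assumption): every occurrence of the dipeptide is cut,
    regardless of the flanking residues.
    """
    if len(site) != 2:
        raise ValueError("site must be a dipeptide, e.g. 'FM'")
    fragments = []
    start = 0
    pos = sequence.find(site, start)
    while pos != -1:
        cut = pos + 1  # cut after the first residue of the dipeptide
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, start)
    fragments.append(sequence[start:])  # remainder after the last cut
    return fragments


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    milk_part = "AAKKSF"        # stands in for the unstructured milk protein
    structured_part = "MLIVTQ"  # stands in for the structured protein
    print(cleave_fusion(milk_part + structured_part))
```

In this toy case the fusion is split into one fragment ending in F and one fragment beginning with M, mirroring the separation of the milk protein from the structured protein.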
[0158] All references, articles, publications, patents, patent publications, and patent applications cited herein are incorporated by reference in their entireties for all purposes. However, mention of any reference, article, publication, patent, patent publication, and patent application cited herein is not, and should not be taken as an acknowledgment or any form of suggestion that they constitute valid prior art or form part of the common general knowledge in any country in the world, or that they disclose essential matter.
EXAMPLES
[0159] The following experiments demonstrate different recombinant fusion constructs of milk proteins and structured proteins, as well as methods of testing and producing the recombinant proteins, and food compositions produced from the extracted protein. While the examples below describe expression in soybean, it will be understood by those skilled in the art that the constructs and methods disclosed herein may be tailored for expression in any organism.
Example 1: Construction of Expression Vectors for Plant Transformation for Stable Expression of Recombinant Fusion Proteins
Binary Vector Design
[0160] While a number of vectors may be utilized for expression of the fusion proteins disclosed herein, the example constructs described below were built in the binary pCAMBIA3300 (Creative Biogene, VET1372) vector, which was customized for soybean transformation and selection. In order to modify the vector, pCAMBIA3300 was digested with HindIII and AseI, allowing the release of the vector backbone (LB T-DNA repeat_KanR_pBR322 ori_pBR322 bom_pVS1 oriV_pVS1 repA_pVS1 StaA_RB T-DNA repeat). The 6598 bp vector backbone was gel extracted, and a synthesized multiple cloning site (MCS) was ligated via In-Fusion cloning (In-Fusion.RTM. HD Cloning System CE, available on the world wide web at clontech.com) to allow modular vector modifications. A cassette containing the Arabidopsis thaliana Csr1.2 gene for acetolactate synthase was added to the vector backbone to be used as a marker for herbicide selection of transgenic plants. In order to build this cassette, the regulatory sequences from Solanum tuberosum ubiquitin/ribosomal fusion protein promoter (StUbi3 prom; -1 to -922 bp) and terminator (StUbi3 term; 414 bp) (GenBank accession no. L22576.1) were fused to the mutant (S653N) acetolactate synthase gene (Csr1.2; GenBank accession no. X51514.1) (Sathasivan et al, 1990; Ding et al, 2006) to generate imazapyr-resistant traits in soybean plants. The selectable marker cassette was introduced into the digested (EcoRI) modified vector backbone via In-Fusion cloning to form vector pAR15-00 (FIG. 2).
[0161] Recombinant DNA constructs were designed to express milk proteins (intrinsically unstructured and structured) in transgenic plants. The coding regions of the expression cassettes outlined below contain a fusion of codon-optimized nucleic acid sequences encoding bovine milk proteins, or a functional fragment thereof. To enhance protein expression in soybean, the nucleic acid sequences encoding .beta.-lactoglobulin (GenBank accession no. X14712.1), .kappa.-casein (GenBank accession no. CAA25231), .beta.-casein (GenBank accession no. M15132.1), and .alpha.S1-casein (GenBank accession no. X59836.1) were codon optimized using Glycine max codon bias and synthesized (available on the world wide web at idtdna.com/CodonOpt). The signal sequences were removed (i.e., making the constructs "truncated") and the new versions of the genes were renamed as OLG1 (.beta.-lactoglobulin version 1, SEQ ID NO: 9), OLG2 (.beta.-lactoglobulin version 2, SEQ ID NO: 11), OLG3 (.beta.-lactoglobulin version 3, SEQ ID NO: 12), OLG4 (.beta.-lactoglobulin version 4, SEQ ID NO: 13), OKC1-T (Optimized .kappa.-casein Truncated version 1, SEQ ID NO: 3), paraOKC1-T (only the para-.kappa. portion of OKC1-T, SEQ ID NO: 1), OBC-T2 (Optimized .beta.-casein Truncated version 2, SEQ ID NO: 5), and OaS1-T (Optimized .alpha.S1-casein Truncated version 1, SEQ ID NO: 7). As will be understood by those skilled in the art, any codon optimized nucleic acid sequences can present from 60% to 100% identity to the native version of the nucleic acid sequence.
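The relationship between a native coding sequence and its codon-optimized counterpart may be illustrated with the following minimal Python sketch, which computes nucleotide percent identity between two equal-length coding sequences and confirms that both encode the same amino acids under the standard genetic code. The function names and the nine-nucleotide toy sequences are hypothetical; they are not the sequences of the disclosure, and the sketch does not reproduce the optimization tool referenced above.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def percent_identity(native: str, optimized: str) -> float:
    """Percent of positions with identical nucleotides (ungapped, equal length)."""
    if not native or len(native) != len(optimized):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(native.upper(), optimized.upper()))
    return 100.0 * matches / len(native)


if __name__ == "__main__":
    # Hypothetical toy sequences: both encode Met-Lys-Leu.
    native = "ATGAAACTG"
    optimized = "ATGAAGTTG"
    assert translate(native) == translate(optimized)
    print(f"{percent_identity(native, optimized):.1f}% nucleotide identity")
```

Because synonymous codons differ mostly at the third position, an optimized sequence can encode an unchanged protein while diverging substantially from the native nucleotide sequence.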
[0162] All the expression cassettes described below and shown in FIGS. 3-8 contained codon-optimized nucleic acid sequences encoding bovine milk proteins, or a functional fragment thereof, a seed-specific promoter, a 5'UTR, a signal sequence (Sig) that directs foreign proteins to the protein storage vacuoles, and a termination sequence. In some versions of the constructs, a linker (FM), such as a chymosin cleavage site, was placed between the two proteins, and/or a C-terminal KDEL sequence for ER retention was included. Expression cassettes were inserted in the pAR15-00 vector described above utilizing a KpnI restriction site within the MCS (FIG. 2). Coding regions and regulatory sequences are indicated as blocks (not to scale) in FIGS. 3-8.
.kappa.-casein-.beta.-lactoglobulin Fusion with KDEL
[0163] Shown in FIG. 3 is an example expression cassette comprising .kappa.-casein (OKC1-T, SEQ ID NO: 3) and .beta.-lactoglobulin (OLG1, SEQ ID NO: 9). The regulatory sequences that were used in order to produce the heterologous milk proteins in soybean seeds include the promoter of the beta-phaseolin storage protein gene (PvPhas prom; -1 to -1543; GenBank accession no. J01263.1, SEQ ID NO: 18); the 5'UTR of the arc5-1 gene (arc5'UTR; -1 to -13; GenBank accession no. Z50202, SEQ ID NO: 20) (De Jaeger et al, 2002); the signal peptide of Lectin 1 gene 1 (sig10; +1 to +93; GenBank accession no. Glyma.02G012600, SEQ ID NO: 14) (Darnowski et al, 2002); and, the 3'UTR of the arc5-1 gene, (arc term 1197 bp; GenBank accession no. Z50202.1, SEQ ID NO: 21) (De Jaeger et al, 2002). A C-terminal KDEL (SEQ ID NO: 23) was also included for ER retention.
[0164] .beta.-casein-.beta.-lactoglobulin Fusion with Linker
[0165] Shown in FIG. 4 is an example expression cassette comprising .beta.-casein (OBC-T2, SEQ ID NO: 5) and .beta.-lactoglobulin (OLG1, SEQ ID NO: 9). The regulatory sequences that were used in order to produce the heterologous milk proteins in soybean seeds include the promoter of the beta-phaseolin storage protein gene (PvPhas prom; -1 to -1543; GenBank accession no. J01263.1, SEQ ID NO: 18); the 5'UTR of the arc5-1 gene (arc5'UTR; -1 to -13; GenBank accession no. Z50202, SEQ ID NO: 20) (De Jaeger et al, 2002); the signal peptide of Lectin 1 gene 1 (sig10; +1 to +93; accession no. Glyma.02G012600, SEQ ID NO: 14) (Darnowski et al, 2002); and, the 3'UTR of the arc5-1 gene, (arc term 1197 bp; accession no. Z50202.1, SEQ ID NO: 21) (De Jaeger et al, 2002). A linker (FM) comprising a chymosin cleavage site was inserted between the two proteins.
.alpha.S1-casein-.beta.-lactoglobulin Fusion with Linker
[0166] Shown in FIG. 5 is an example expression cassette comprising .alpha.S1-casein (OaS1-T, SEQ ID NO: 7) and .beta.-lactoglobulin (OLG1, SEQ ID NO: 9). The regulatory sequences that were used in order to produce the heterologous milk proteins in soybean seeds include the promoter of the beta-phaseolin storage protein gene (PvPhas prom; -1 to -1543; GenBank accession no. J01263.1, SEQ ID NO: 18); the 5'UTR of the arc5-1 gene (arc5'UTR; -1 to -13; GenBank accession no. Z50202, SEQ ID NO: 20) (De Jaeger et al, 2002); the signal peptide of Lectin 1 gene 1 (sig10; +1 to +93; accession no. Glyma.02G012600, SEQ ID NO: 14) (Darnowski et al, 2002); and, the 3'UTR of the arc5-1 gene, (arc term 1197 bp; GenBank accession no. Z50202.1, SEQ ID NO: 21) (De Jaeger et al, 2002). A linker (FM) comprising a chymosin cleavage site was inserted between the two proteins.
Para-.kappa.-casein-.beta.-lactoglobulin Fusion with Linker and KDEL
[0167] Shown in FIG. 6 is an example expression cassette comprising para-.kappa.-casein (paraOKC1-T, SEQ ID NO: 1) and .beta.-lactoglobulin (OLG1, SEQ ID NO: 9). The regulatory sequences that were used in order to produce the heterologous milk proteins in soybean seeds include the promoter of the beta-phaseolin storage protein gene (PvPhas prom; -1 to -1543; GenBank accession no. J01263.1, SEQ ID NO: 18); the 5'UTR of the arc5-1 gene (arc5'UTR; -1 to -13; GenBank accession no. Z50202, SEQ ID NO: 20) (De Jaeger et al, 2002); the signal peptide of Lectin 1 gene 1 (sig10; +1 to +93; GenBank accession no. Glyma.02G012600, SEQ ID NO: 14) (Darnowski et al, 2002); and, the 3'UTR of the arc5-1 gene, (arc term 1197 bp; GenBank accession no. Z50202.1, SEQ ID NO: 21) (De Jaeger et al, 2002). A linker (FM) comprising a chymosin cleavage site was inserted between the two proteins and a C-terminal KDEL (SEQ ID NO: 23) was also included for ER retention.
Para-.kappa.-casein-.beta.-lactoglobulin Fusion with Linker
[0168] Shown in FIG. 7 is an example expression cassette comprising para-.kappa.-casein (paraOKC1-T, SEQ ID NO: 1) and .beta.-lactoglobulin (OLG1, SEQ ID NO: 9). The regulatory sequences that were used in order to produce the heterologous milk proteins in soybean seeds include the promoter of the beta-phaseolin storage protein gene (PvPhas prom; -1 to -1543; GenBank accession no. J01263.1, SEQ ID NO: 18); the 5'UTR of the arc5-1 gene (arc5'UTR; -1 to -13; GenBank accession no. Z50202, SEQ ID NO: 20) (De Jaeger et al, 2002); the signal peptide of Lectin 1 gene 1 (sig10; +1 to +93; GenBank accession no. Glyma.02G012600, SEQ ID NO: 14) (Darnowski et al, 2002); and, the 3'UTR of the arc5-1 gene, (arc term 1197 bp; GenBank accession no. Z50202.1, SEQ ID NO: 21) (De Jaeger et al, 2002). A linker (FM) comprising a chymosin cleavage site was inserted between the two proteins.
Fusion Protein with Seed2 Promoter, Sig2 and Nopaline Synthase Terminator
[0169] Shown in FIG. 8 is an example expression cassette comprising .kappa.-casein (OKC1-T, SEQ ID NO: 3) and .beta.-lactoglobulin (OLG1, SEQ ID NO: 9). The regulatory sequences that were used in order to produce the heterologous milk proteins in soybean seeds include the promoter and signal peptide of glycinin 1 (GmSeed2 (SEQ ID NO: 19): sig2 (SEQ ID NO: 16)) followed by the ER retention signal (KDEL) and the Nopaline synthase termination sequence (nos term, SEQ ID NO: 22).
Example 2: Identification of Transgenic Events, Recombinant Protein Extraction and Detection
[0170] To quantify recombinant protein expression levels, DNA constructs such as those shown in FIGS. 3-8 were transformed into soybean using transformation protocols well known in the art, for example, by bombardment or Agrobacterium. Total soybean genomic DNA was isolated from the first trifoliate leaves of transgenic events using the PureGene tissue DNA isolation kit (product #158667: QIAGEN, Valencia, Calif., USA). Trifoliates were frozen in liquid nitrogen and pulverized. Cells were lysed using the PureGene Cell Lysis Buffer, proteins were precipitated using the PureGene Protein Precipitation Buffer, and DNA was precipitated from the resulting supernatant using ethanol. The DNA pellets were washed with 70% ethanol and resuspended in water.
[0171] Genomic DNA was quantified by the Quant-iT PicoGreen (product #P7589: ThermoFisher Scientific, Waltham, Mass., USA) assay as described by the manufacturer, and 150 ng of DNA was digested overnight with EcoRI, HindIII, NcoI, and/or KpnI. Of the digested DNA, 30 ng was used for a BioRad ddPCR reaction that included FAM- or HEX-labelled probes for the transgene and the endogenous Lectin1 gene, respectively. Transgene copy number (CNV) was calculated by comparing the measured transgene concentration to the reference gene concentration. A CNV of greater than or equal to one was deemed acceptable.
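The copy number comparison described above may be sketched as a simple ratio, as in the following minimal Python example. The function names, the example concentrations, and the `reference_copies` normalization parameter are assumptions introduced for illustration; the disclosure states only that the transgene concentration was compared to the reference gene concentration and that a CNV of at least one was deemed acceptable.

```python
def copy_number(transgene_conc: float, reference_conc: float,
                reference_copies: float = 1.0) -> float:
    """Estimate transgene copy number from ddPCR concentrations.

    transgene_conc and reference_conc are in the same units (e.g. copies
    per microliter). reference_copies is an assumed normalization factor:
    the number of copies the reference gene is taken to represent.
    """
    if reference_conc <= 0:
        raise ValueError("reference concentration must be positive")
    return reference_copies * transgene_conc / reference_conc


def is_acceptable(cnv: float, threshold: float = 1.0) -> bool:
    """Apply the 'greater than or equal to one' acceptance criterion."""
    return cnv >= threshold


if __name__ == "__main__":
    # Hypothetical concentrations, for illustration only.
    cnv = copy_number(transgene_conc=412.0, reference_conc=398.0)
    print(f"CNV = {cnv:.2f}, acceptable: {is_acceptable(cnv)}")
```

In practice, measurement noise may place a true single-copy event slightly below the threshold, so a tolerance around the cutoff may be appropriate; no such tolerance is specified in the disclosure.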
Preparation of Total Soluble Protein Samples
[0172] Total soluble soybean protein fractions were prepared from the seeds of transgenic events by bead beating seeds (seeds collected about 90 days after germination) at 15000 rpm for 1 min. The resulting powder was resuspended in 50 mM Carbonate-Bicarbonate pH 10.8, 1 mM DTT, 1.times.HALT Protease Inhibitor Cocktail (Product #78438 ThermoFisher Scientific). The resuspended powder was incubated at 4.degree. C. for 15 minutes, and the supernatant was then collected after centrifuging twice at 4000 g, 20 min, 4.degree. C. Protein concentration was measured using a modified Bradford assay (Thermo Scientific Pierce 660 nm assay; Product #22660 ThermoFisher Scientific) using a bovine serum albumin (BSA) standard curve.
Recombinant Protein Quantification via Western Blot Densitometry
[0173] SDS-PAGE was performed according to manufacturer's instructions (Product #5678105 BioRad, Hercules, Calif., USA) under denaturing and reducing conditions. 5 ug of total protein extracts were loaded per lane. For immunoblotting, proteins separated by SDS-PAGE were transferred to a PVDF membrane using Trans-Blot.RTM. Turbo.TM. Midi PVDF Transfer Packs (Product #1704157 BioRad) according to manufacturer's guidelines. Membranes were blocked with 3% BSA in phosphate buffered saline with 0.5% Tween-20, reacted with antigen-specific antibody and subsequently reacted with fluorescent goat anti-rabbit IgG (Product #60871 BioRad, CA). Membranes were scanned according to manufacturer's instructions using the ChemiDoc MP Imaging System (BioRad, CA) and analyzed using ImageLab Version 6.0.1 Standard Edition (BioRad Laboratories, Inc.). Recombinant protein from the seeds of transgenic events was quantified by densitometry from commercial reference protein spike-in standards.
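Quantification by densitometry against spike-in standards may be sketched as a linear calibration, as in the following minimal Python example. A straight line is fitted to band intensity versus the known percentage of spiked-in reference protein, and the line is then inverted for a sample band. The linear model, the function names, and all intensity values are assumptions for illustration; the spike-in percentages (0.05%, 0.5%, 1.5%) correspond to the .kappa.-casein standards of FIG. 9A, and the calculation performed by the imaging software is not reproduced here.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standards must span more than one concentration")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def percent_tsp(band_intensity: float, slope: float, intercept: float) -> float:
    """Invert the calibration line to estimate percent of total soluble protein."""
    if slope == 0:
        raise ValueError("calibration slope must be non-zero")
    return (band_intensity - intercept) / slope


if __name__ == "__main__":
    spike_in_percent = [0.05, 0.5, 1.5]          # standards, % of TSP
    standard_intensity = [1000.0, 10000.0, 30000.0]  # hypothetical readings
    slope, intercept = fit_line(spike_in_percent, standard_intensity)
    sample = percent_tsp(24000.0, slope, intercept)  # hypothetical sample band
    print(f"estimated expression: {sample:.2f}% TSP")
```

Estimates obtained this way are most reliable for bands whose intensity falls within the range spanned by the standards.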
[0174] Shown in FIGS. 9A, 9B, 9C, and 9D are Western Blots of protein extracted from transgenic soybeans expressing the .kappa.-casein-.beta.-lactoglobulin expression cassette shown in FIG. 3. FIG. 9A shows the fusion protein detected using a primary antibody raised against .kappa.-casein. The first lane is a molecular weight marker. Lanes two (DCI 9.1) and three (DCI 9.2) represent individual seeds from a single transgenic line. Lane four (DCI 3.1) represents a seed from a separate transgenic line. Lane five is protein extracted from wild-type soybean plants, and lanes six-eight are protein extracted from wild-type soybean plants spiked with 0.05% commercial .kappa.-casein (lane 6), 0.5% commercial .kappa.-casein (lane 7), and 1.5% commercial .kappa.-casein (lane 8). The .kappa.-casein commercial protein is detected at an apparent molecular weight (MW) of .about.26 kDa (theoretical: 19 kDa--arrow). The fusion protein is detected at an apparent MW of .about.40 kDa (theoretical: 38 kDa--arrowhead).
[0175] FIG. 9B shows the fusion protein detected using a primary antibody raised against .beta.-lactoglobulin. The first lane is a molecular weight marker. Lanes two (DCI 9.1) and three (DCI 9.2) represent individual seeds from a single transgenic line. Lane four (DCI 3.1) represents a seed from a separate transgenic line. Lane five is protein extracted from wild-type soybean plants, and lanes six-eight are protein extracted from wild-type soybean plants spiked with 0.05% commercial .beta.-lactoglobulin (lane 6), 1% commercial .beta.-lactoglobulin (lane 7), and 2% commercial .beta.-lactoglobulin (lane 8). The .beta.-lactoglobulin commercial protein is detected at an apparent MW of .about.18 kDa (theoretical: 18 kDa--arrow). The fusion protein is detected at an apparent MW of .about.40 kDa (theoretical: 38 kDa--arrowhead). FIGS. 9C and 9D show the protein gels as control for equal lane loading (image is taken at the end of the SDS run) for FIGS. 9A and 9B, respectively.
[0176] Other combinations of structured and unstructured proteins were tested and evaluated for the percentage of recombinant protein. Cassettes having the same promoter (Seed2-sig), signal peptide (EUT:Rb7T), and in some instances a different terminator, were built with either .alpha.-S1-casein, .beta.-casein, .kappa.-casein, or the fusion of .beta.-lactoglobulin with .kappa.-casein (kCN-LG) (See FIGS. 3 and 8). As shown below in Table 3, none of the cassettes encoding .alpha.-S1-casein, .beta.-casein, or .kappa.-casein were able to produce expression of the protein at a level that exceeded 1% total soluble protein. However, when .kappa.-casein was fused with .beta.-lactoglobulin, .kappa.-casein was expressed at a level that was greater than 1% total soluble protein.
TABLE 3
Expression levels of unstructured proteins

                                              Number of events.sup.1 accumulating the
                                              recombinant protein at the concentration:
Protein                Total events.sup.1     0-1% TSP        Above 1% TSP
                       analyzed
Unstructured
  .kappa.-Casein       89                     89              0
  .beta.-Casein        12                     12              0
  .alpha.S1-Casein     6                      6               0
Fusion
  kCN-LG               23                     12              11

.sup.1As used in Table 3, each "event" refers to an independent transgenic line.
[0177] As will be readily understood by those of skill in the art, T-DNA insertion into the plant genome is a random process and each T-DNA lands at an unpredictable genomic position. Hence, each of the 23 events generated in Table 3 for the fusion protein has a different genomic insertion locus. The genomic context greatly influences the expression levels of a gene, and each locus will be either favorable or unfavorable for the expression of the recombinant genes. The variability observed at the protein level is a reflection of that random insertion process, and explains why 12 out of 23 events present expression levels below 1%.
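The event counts reported in Table 3 may be summarized as fractions, as in the following minimal Python sketch. The counts are taken from Table 3; the dictionary layout and function name are illustrative only.

```python
# (total events analyzed, events above 1% TSP), from Table 3
TABLE_3 = {
    "kappa-casein": (89, 0),
    "beta-casein": (12, 0),
    "alphaS1-casein": (6, 0),
    "kCN-LG fusion": (23, 11),
}


def fraction_above(total: int, above: int) -> float:
    """Fraction of independent events exceeding 1% total soluble protein."""
    if total <= 0 or not 0 <= above <= total:
        raise ValueError("counts must satisfy 0 <= above <= total, total > 0")
    return above / total


if __name__ == "__main__":
    for name, (total, above) in TABLE_3.items():
        pct = 100.0 * fraction_above(total, above)
        print(f"{name}: {above}/{total} events above 1% TSP ({pct:.1f}%)")

    # Pool the three unstructured proteins expressed on their own.
    pooled_total = sum(t for n, (t, a) in TABLE_3.items() if "fusion" not in n)
    pooled_above = sum(a for n, (t, a) in TABLE_3.items() if "fusion" not in n)
    print(f"unstructured alone: {pooled_above}/{pooled_total}")
```

On these counts, 11 of 23 fusion events (about 48%) exceeded 1% TSP, whereas none of the 107 events expressing an unstructured protein alone did so.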
Example 3: Food Compositions
[0178] The transgenic plants expressing the recombinant fusion proteins described herein can produce milk proteins for the purpose of food industrial, non-food industrial, pharmaceutical, and commercial uses described in this disclosure. An illustrative method for making a food composition is provided in FIG. 11.
[0179] A fusion protein comprising an unstructured milk protein (para-.kappa.-casein) and a structured mammalian protein (.beta.-lactoglobulin) is expressed in a transgenic soybean plant. The fusion protein comprises a chymosin cleavage site between the para-.kappa.-casein and the .beta.-lactoglobulin.
[0180] The fusion protein is extracted from the plant. The fusion protein is then treated with chymosin, to separate the para-.kappa.-casein from the .beta.-lactoglobulin. The para-.kappa.-casein is isolated and/or purified and used to make a food composition (e.g., cheese).
Numbered Embodiments
[0181] Notwithstanding the appended claims, the following numbered embodiments also form part of the instant disclosure.
[0182] 1. A stably transformed plant comprising in its genome: a recombinant DNA construct encoding a fusion protein, the fusion protein comprising: (i) an unstructured milk protein, and (ii) a structured animal protein; wherein the fusion protein is stably expressed in the plant in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.
[0183] 2. The stably transformed plant of embodiment 1, wherein the fusion protein comprises, from N-terminus to C-terminus, the unstructured milk protein and the animal protein.
[0184] 3. The stably transformed plant of any one of embodiments 1-2, wherein the unstructured milk protein is .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, or .kappa.-casein.
[0185] 4. The stably transformed plant of embodiment 1, wherein the unstructured milk protein is .kappa.-casein and comprises the sequence of SEQ ID NO: 4, or a sequence at least 90% identical thereto.
[0186] 5. The stably transformed plant of embodiment 1, wherein the unstructured milk protein is para-.kappa.-casein and comprises the sequence of SEQ ID NO: 2, or a sequence at least 90% identical thereto.
[0187] 6. The stably transformed plant of embodiment 1, wherein the unstructured milk protein is .beta.-casein and comprises the sequence of SEQ ID NO: 6, or a sequence at least 90% identical thereto.
[0188] 7. The stably transformed plant of embodiment 1, wherein the unstructured milk protein is .alpha.-S1 casein and comprises the sequence SEQ ID NO: 8, or a sequence at least 90% identical thereto.
[0189] 8. The stably transformed plant of embodiment 1, wherein the unstructured milk protein is .alpha.-S2 casein and comprises the sequence SEQ ID NO: 84, or a sequence at least 90% identical thereto.
[0190] 9. The stably transformed plant of any one of embodiments 1-8, wherein the structured animal protein is a structured mammalian protein.
[0191] 10. The stably transformed plant of embodiment 9, wherein the structured mammalian protein is .beta.-lactoglobulin, .alpha.-lactalbumin, albumin, lysozyme, lactoferrin, lactoperoxidase, hemoglobin, collagen, or an immunoglobulin.
[0192] 11. The stably transformed plant of embodiment 9, wherein the structured mammalian protein is .beta.-lactoglobulin and comprises the sequence of SEQ ID NO: 10, or a sequence at least 90% identical thereto.
[0193] 12. The stably transformed plant of any one of embodiments 1-8, wherein the structured animal protein is a structured avian protein.
[0194] 13. The stably transformed plant of embodiment 12, wherein the structured avian protein is ovalbumin, ovotransferrin, lysozyme, or ovoglobulin.
[0195] 14. The stably transformed plant of embodiment 9, wherein the milk protein is .kappa.-casein and the structured mammalian protein is .beta.-lactoglobulin.
[0196] 15. The stably transformed plant of embodiment 9, wherein the milk protein is para-.kappa.-casein and the structured mammalian protein is .beta.-lactoglobulin.
[0197] 16. The stably transformed plant of embodiment 9, wherein the milk protein is .beta.-casein and the structured mammalian protein is .beta.-lactoglobulin.
[0198] 17. The stably transformed plant of embodiment 9, wherein the milk protein is .alpha.-S1 casein or .alpha.-S2 casein and the structured mammalian protein is .beta.-lactoglobulin.
[0199] 18. The stably transformed plant of any one of embodiments 1-17, wherein the plant is a dicot.
[0200] 19. The stably transformed plant of embodiment 18, wherein the dicot is Arabidopsis, tobacco, tomato, potato, sweet potato, cassava, alfalfa, lima bean, pea, chick pea, soybean, carrot, strawberry, lettuce, oak, maple, walnut, rose, mint, squash, daisy, quinoa, buckwheat, mung bean, cow pea, lentil, lupin, peanut, fava bean, French beans (i.e., common beans), mustard, or cactus.
[0201] 20. The stably transformed plant of any one of embodiments 1-19, wherein the plant is soybean.
[0202] 21. The stably transformed plant of any one of embodiments 1-20, wherein the recombinant DNA construct is codon-optimized for expression in the plant.
[0203] 22. The stably transformed plant of any one of embodiments 1-21, wherein the fusion protein comprises a protease cleavage site.
[0204] 23. The stably transformed plant of embodiment 22, wherein the protease cleavage site is a chymosin cleavage site.
[0205] 24. The stably transformed plant of any one of embodiments 1-23, wherein the fusion protein is expressed at a level at least 2-fold higher than an unstructured milk protein expressed individually in a plant.
[0206] 25. The stably transformed plant of any one of embodiments 1-24, wherein the fusion protein accumulates in the plant at least 2-fold higher than an unstructured milk protein expressed without the structured animal protein.
[0207] 26. A recombinant fusion protein comprising: (i) an unstructured milk protein, and (ii) a structured animal protein.
[0208] 27. The recombinant fusion protein of embodiment 26, wherein the fusion protein is expressed in a plant.
[0209] 28. The recombinant fusion protein of embodiment 26 or 27, wherein the unstructured milk protein is .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, or .kappa.-casein.
[0210] 29. The recombinant fusion protein of embodiment 28, wherein the milk protein is .kappa.-casein and comprises the sequence of SEQ ID NO: 4, or a sequence at least 90% identical thereto.
[0211] 30. The recombinant fusion protein of embodiment 28, wherein the milk protein is para-.kappa.-casein and comprises the sequence of SEQ ID NO: 2, or a sequence at least 90% identical thereto.
[0212] 31. The recombinant fusion protein of embodiment 28, wherein the milk protein is .beta.-casein and comprises the sequence of SEQ ID NO: 6, or a sequence at least 90% identical thereto.
[0213] 32. The recombinant fusion protein of embodiment 28, wherein the milk protein is .alpha.-S1 casein and comprises the sequence of SEQ ID NO: 8, or a sequence at least 90% identical thereto.
[0214] 33. The recombinant fusion protein of embodiment 28, wherein the milk protein is .alpha.-S2 casein and comprises the sequence of SEQ ID NO: 84, or a sequence at least 90% identical thereto.
[0215] 34. The recombinant fusion protein of any one of embodiments 26-33, wherein the structured animal protein is a structured mammalian protein.
[0216] 35. The recombinant fusion protein of embodiment 34, wherein the structured mammalian protein is .beta.-lactoglobulin, .alpha.-lactalbumin, albumin, lysozyme, lactoferrin, lactoperoxidase, hemoglobin, collagen, or an immunoglobulin.
[0217] 36. The recombinant fusion protein of embodiment 34, wherein the structured mammalian protein is .beta.-lactoglobulin and comprises the sequence of SEQ ID NO: 10, or a sequence at least 90% identical thereto.
[0218] 37. The recombinant fusion protein of any one of embodiments 26-33, wherein the structured animal protein is a structured avian protein.
[0219] 38. The recombinant fusion protein of embodiment 37, wherein the structured avian protein is ovalbumin, ovotransferrin, lysozyme or ovoglobulin.
[0220] 39. The recombinant fusion protein of embodiment 34, wherein the milk protein is .kappa.-casein and the structured mammalian protein is .beta.-lactoglobulin.
[0221] 40. The recombinant fusion protein of embodiment 34, wherein the milk protein is para-.kappa.-casein and the structured mammalian protein is .beta.-lactoglobulin.
[0222] 41. The recombinant fusion protein of embodiment 34, wherein the milk protein is .beta.-casein and the structured mammalian protein is .beta.-lactoglobulin.
[0223] 42. The recombinant fusion protein of embodiment 34, wherein the milk protein is .alpha.-S1 casein or .alpha.-S2 casein and the structured mammalian protein is .beta.-lactoglobulin.
[0224] 43. The recombinant fusion protein of embodiment 34, wherein the fusion protein comprises a protease cleavage site.
[0225] 44. The recombinant fusion protein of embodiment 34, wherein the protease cleavage site is a chymosin cleavage site.
[0226] 45. A nucleic acid encoding the recombinant fusion protein of any one of embodiments 26 to 44.
[0227] 46. The nucleic acid of embodiment 45, wherein the nucleic acid is codon optimized for expression in a plant species.
[0228] 47. The nucleic acid of embodiment 45 or 46, wherein the nucleic acid is codon optimized for expression in soybean.
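As an illustration of how a codon-optimized nucleic acid relates to the protein it encodes, the following minimal Python sketch translates the first thirty nucleotides of SEQ ID NO: 1 and compares the result with the first ten residues of SEQ ID NO: 2, both as printed in the Sequence Listing. The function name `translate` and the partial codon table (which covers only the codons present in this fragment) are hypothetical illustrations and are not part of the disclosure.

```python
# Partial standard genetic code: only the codons that occur in the fragment below.
CODON_TABLE = {
    "caa": "Gln", "cag": "Gln", "gag": "Glu", "aat": "Asn",
    "cca": "Pro", "atc": "Ile", "cgt": "Arg",
}


def translate(dna: str) -> list[str]:
    """Translate a DNA string (first reading frame) into three-letter residues."""
    dna = dna.replace(" ", "").lower()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


# First 30 nucleotides of SEQ ID NO: 1 (paraOKC1-T), copied from the Sequence Listing.
SEQ1_START = "caagagcaga atcaagagca gccaatccgt"
# First 10 residues of SEQ ID NO: 2, copied from the Sequence Listing.
SEQ2_START = ["Gln", "Glu", "Gln", "Asn", "Gln", "Glu", "Gln", "Pro", "Ile", "Arg"]

# True when the nucleotide fragment encodes the listed residues.
matches = translate(SEQ1_START) == SEQ2_START
```

A full check would use a complete codon table and the entire sequences. The fragment is kept short here so that it can be verified by eye.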
[0229] 48. A vector comprising a nucleic acid encoding a recombinant fusion protein, wherein the recombinant fusion protein comprises: (i) an unstructured milk protein, and (ii) a structured animal protein.
[0230] 49. The vector of embodiment 48, wherein the vector is a plasmid.
[0231] 50. The vector of embodiment 49, wherein the vector is an Agrobacterium Ti plasmid.
[0232] 51. The vector of any one of embodiments 48-50, wherein the nucleic acid comprises, in order from 5' to 3': a promoter; a 5' untranslated region; a sequence encoding the fusion protein; and a terminator.
[0233] 52. The vector of embodiment 51, wherein the promoter is a seed-specific promoter.
[0234] 53. The vector of embodiment 52, wherein the seed-specific promoter is selected from the group consisting of PvPhas, BnNap, AtOle1, GmSeed2, GmSeed3, GmSeed5, GmSeed6, GmSeed7, GmSeed8, GmSeed10, GmSeed11, GmSeed12, pBCON, GmCEP1-L, GmTHIC, GmBg7S1, GmGRD, GmOLEA, GmOLER, Gm2S-1, and GmBBld-II.
[0235] 54. The vector of embodiment 53, wherein the seed-specific promoter is PvPhas and comprises the sequence of SEQ ID NO: 18, or a sequence at least 90% identical thereto.
[0236] 55. The vector of embodiment 53, wherein the seed-specific promoter is GmSeed2 and comprises the sequence of SEQ ID NO: 19, or a sequence at least 90% identical thereto.
[0237] 56. The vector of any one of embodiments 51-55, wherein the 5' untranslated region is selected from the group consisting of Arc5'UTR and glnBlUTR.
[0238] 57. The vector of embodiment 56, wherein the 5' untranslated region is Arc5'UTR and comprises the sequence of SEQ ID NO: 20, or a sequence at least 90% identical thereto.
[0239] 58. The vector of any one of embodiments 51-57, wherein the expression cassette comprises a 3' untranslated region.
[0240] 59. The vector of embodiment 58, wherein the 3' untranslated region is Arc5-1 and comprises the sequence of SEQ ID NO: 21, or a sequence at least 90% identical thereto.
[0241] 60. The vector of any one of embodiments 51-59, wherein the terminator sequence is a terminator isolated or derived from a gene encoding Nopaline synthase, Arc5-1, an Extensin, Rb7 matrix attachment region, a Heat shock protein, Ubiquitin 10, Ubiquitin 3, or M6 matrix attachment region.
[0242] 61. The vector of embodiment 60, wherein the terminator sequence is isolated or derived from a Nopaline synthase gene and comprises the sequence of SEQ ID NO: 22, or a sequence at least 90% identical thereto.
[0243] 62. A plant comprising the recombinant fusion protein of any one of embodiments 26-44 or the nucleic acid of any one of embodiments 45-47.
[0244] 63. A method for stably expressing a recombinant fusion protein in a plant, the method comprising: a) transforming a plant with a plant transformation vector comprising an expression cassette comprising: a sequence encoding a fusion protein, wherein the fusion protein comprises an unstructured milk protein, and a structured animal protein; and b) growing the transformed plant under conditions wherein the recombinant fusion protein is expressed in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.
[0245] 64. The method of embodiment 63, wherein the unstructured milk protein is .kappa.-casein.
[0246] 65. The method of embodiment 63 or 64, wherein the structured animal protein is .beta.-lactoglobulin.
[0247] 66. A food composition comprising the recombinant fusion protein of any one of embodiments 26-44.
[0248] 67. A method for making a food composition, the method comprising: expressing the recombinant fusion protein of any one of embodiments 26-44 in a plant; extracting the recombinant fusion protein from the plant; optionally, separating the milk protein from the structured animal protein or the structured plant protein; and creating a food composition using the milk protein or the fusion protein.
[0249] 68. The method of embodiment 67, wherein the plant stably expresses the recombinant fusion protein.
[0250] 69. The method of embodiment 68, wherein the plant expresses the recombinant fusion protein in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.
[0251] 70. The method of any one of embodiments 67-69, wherein the plant is soybean.
[0252] 71. The method of any one of embodiments 67-70, wherein the food composition comprises the structured animal or plant protein.
[0253] 72. The method of any one of embodiments 67-71, wherein the milk protein and the structured animal or plant protein are separated from one another in the plant cell, prior to extraction.
[0254] 73. The method of any one of embodiments 67-71, wherein the milk protein is separated from the structured animal or plant protein after extraction, by contacting the fusion protein with an enzyme that cleaves the fusion protein.
[0255] 74. A food composition produced using the method of any one of embodiments 67-73.
[0256] 75. A plant-expressed recombinant fusion protein, comprising: .kappa.-casein; and .beta.-lactoglobulin.
[0257] 76. The plant-expressed recombinant fusion protein of embodiment 75, wherein the fusion protein comprises, in order from N-terminus to C-terminus, the .kappa.-casein and the .beta.-lactoglobulin.
[0258] 77. The plant-expressed recombinant fusion protein of embodiment 75 or 76, wherein the fusion protein comprises a protease cleavage site.
[0259] 78. The plant-expressed recombinant fusion protein of embodiment 77, wherein the protease cleavage site is a chymosin cleavage site.
[0260] 79. The plant-expressed recombinant fusion protein of any one of embodiments 75-78, wherein the fusion protein comprises a signal peptide.
[0261] 80. The plant-expressed recombinant fusion protein of embodiment 79, wherein the signal peptide is located at the N-terminus of the fusion protein.
[0262] 81. The plant-expressed recombinant fusion protein of any one of embodiments 75-80, wherein the fusion protein is encoded by a nucleic acid that is codon optimized for expression in a plant.
[0263] 82. The plant-expressed recombinant fusion protein of any one of embodiments 75-81, wherein the fusion protein is expressed in a soybean.
[0264] 83. The plant-expressed recombinant fusion protein of any one of embodiments 75-81, wherein the fusion protein has a molecular weight of 30 kDa to 50 kDa.
[0265] 84. The plant-expressed recombinant fusion protein of any one of embodiments 75-83, wherein the fusion protein is expressed in a plant in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.
[0266] 85. The plant-expressed recombinant fusion protein of any one of embodiments 75-84, wherein the fusion protein is expressed in the plant at a level at least 2-fold higher than .kappa.-casein expressed individually in a plant.
[0267] 86. The plant-expressed recombinant fusion protein of any one of embodiments 75-84, wherein the fusion protein accumulates in the plant at least 2-fold higher than .kappa.-casein expressed without .beta.-lactoglobulin.
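The 30 kDa to 50 kDa molecular weight range recited above can be sanity-checked with a rough estimate. The sketch below is a minimal Python illustration that assumes an average residue mass of about 110 Da (a common rule of thumb, not a value from the disclosure) and uses the lengths given in the Sequence Listing for SEQ ID NO: 4 (169 residues) and SEQ ID NO: 10 (162 residues). It ignores any signal peptide, linker, cleavage site, or post-translational modification, and the function name `estimate_mass_kda` is hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue (assumption, in daltons).
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(*lengths: int) -> float:
    """Approximate mass in kDa of a polypeptide built from segments of the given lengths."""
    return sum(lengths) * AVG_RESIDUE_MASS_DA / 1000.0


KAPPA_CASEIN_LEN = 169  # length of SEQ ID NO: 4 per the Sequence Listing
BETA_LG_LEN = 162       # length of SEQ ID NO: 10 per the Sequence Listing

mass_kda = estimate_mass_kda(KAPPA_CASEIN_LEN, BETA_LG_LEN)  # 331 residues in total
in_range = 30.0 <= mass_kda <= 50.0
```

Under these assumptions the estimate is roughly 36 kDa, which falls inside the recited range. An exact value would require summing the individual residue masses of the actual construct.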
[0268] 87. A stably transformed plant, comprising in its genome: a recombinant DNA construct encoding a fusion protein, the fusion protein comprising: .kappa.-casein; and .beta.-lactoglobulin; wherein the fusion protein is stably expressed in the plant in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.
[0269] 88. The stably transformed plant of embodiment 87, wherein the fusion protein comprises, in order from N-terminus to C-terminus, the .kappa.-casein and the .beta.-lactoglobulin.
[0270] 89. The stably transformed plant of embodiment 87 or 88, wherein the fusion protein comprises a protease cleavage site.
[0271] 90. The stably transformed plant of embodiment 89, wherein the protease cleavage site is a chymosin cleavage site.
[0272] 91. The stably transformed plant of any one of embodiments 87-90, wherein the fusion protein comprises a signal peptide.
[0273] 92. The stably transformed plant of embodiment 91, wherein the signal peptide is located at the N-terminus of the fusion protein.
[0274] 93. The stably transformed plant of any one of embodiments 87-92, wherein the plant is soybean.
[0275] 94. The stably transformed plant of any one of embodiments 87-93, wherein the recombinant DNA construct comprises codon-optimized nucleic acids for expression in the plant.
[0276] 95. The stably transformed plant of any one of embodiments 87-94, wherein the fusion protein has a molecular weight of 30 kDa to 50 kDa.
[0277] 96. The stably transformed plant of any one of embodiments 87-95, wherein the fusion protein is expressed at a level at least 2-fold higher than .kappa.-casein expressed individually in a plant.
[0278] 97. The stably transformed plant of any one of embodiments 87-96, wherein the fusion protein accumulates in the plant at least 2-fold higher than .kappa.-casein expressed without .beta.-lactoglobulin.
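The two quantitative thresholds used in these embodiments (at least 1% of total soluble protein by weight, and at least a 2-fold increase over .kappa.-casein expressed alone) reduce to simple ratios. The following minimal Python sketch shows the arithmetic. The function names and all measurement values are hypothetical and invented purely for illustration; they are not data from the disclosure.

```python
def percent_of_soluble_protein(protein_mg: float, total_soluble_mg: float) -> float:
    """Return a protein's share of total soluble protein, as a weight percentage."""
    if total_soluble_mg <= 0:
        raise ValueError("total soluble protein must be positive")
    return 100.0 * protein_mg / total_soluble_mg


def fold_change(fusion_level: float, reference_level: float) -> float:
    """Return the ratio of the fusion protein level to the reference level."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return fusion_level / reference_level


# Hypothetical measurements (mg of protein per extract), for illustration only.
fusion_pct = percent_of_soluble_protein(protein_mg=1.5, total_soluble_mg=100.0)
casein_alone_pct = percent_of_soluble_protein(protein_mg=0.5, total_soluble_mg=100.0)

meets_one_percent = fusion_pct >= 1.0
meets_two_fold = fold_change(fusion_pct, casein_alone_pct) >= 2.0
```

With these invented numbers the fusion protein would make up 1.5% of total soluble protein and would be present at 3 times the level of the casein expressed alone, so both thresholds would be met.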
[0279] 98. A plant-expressed recombinant fusion protein comprising: a casein protein and .beta.-lactoglobulin.
[0280] 99. The plant-expressed recombinant fusion protein of embodiment 98, wherein the casein protein is .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, or .kappa.-casein.
[0281] 100. A stably transformed plant, comprising in its genome: a recombinant DNA construct encoding a fusion protein, the fusion protein comprising: a casein protein and .beta.-lactoglobulin; wherein the fusion protein is stably expressed in the plant in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.
[0282] 101. The stably transformed plant of embodiment 100, wherein the casein protein is .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, or .kappa.-casein.
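Several embodiments above refer to a sequence "at least 90% identical" to a listed sequence. The following minimal Python sketch shows one simple way such a figure could be computed, assuming the two sequences have already been aligned to equal length, with `-` marking gaps. Percent identity has more than one definition in practice; this sketch divides the number of identical columns by the alignment length, which is an assumption and not a definition taken from the disclosure. The function name `percent_identity` and the variant sequence are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)


# First 20 residues of SEQ ID NO: 2 in one-letter code, from the Sequence Listing.
REFERENCE = "QEQNQEQPIRCEKDERFFSD"
# Hypothetical variant with two substitutions (N4D and K13R), for illustration only.
VARIANT = "QEQDQEQPIRCERDERFFSD"

identity = percent_identity(REFERENCE, VARIANT)
meets_threshold = identity >= 90.0
```

In this example 18 of 20 positions match, giving 90% identity, which just meets the threshold. A real comparison would be made over the full-length sequences after alignment with an established alignment tool.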
Sequence Listing
Number of SEQ ID NOs: 137
SEQ ID NO: 1 (DNA, 318 nt, Artificial Sequence): Optimized para-kappa-casein truncated version 1 (paraOKC1-T)
caagagcaga atcaagagca gccaatccgt tgtgagaagg
acgagaggtt cttctcagac 60aagatcgcca aatatatacc catacaatat gtactctcac
gctaccctag ctacgggctt 120aactactatc agcaaaaacc tgtagcactg ataaataacc
agtttctccc ctatccctat 180tatgctaaac ctgccgccgt gaggagtcca gcacaaatac
ttcagtggca agtgctcagt 240aacaccgtgc cagcaaaaag ctgccaggct cagcccacca
caatggcccg tcatccccat 300cctcacctta gcttcatg
3182106PRTArtificial SequenceOptimized
para-kappa-casein truncated version 1 (paraOKC1-T) 2Gln Glu Gln Asn
Gln Glu Gln Pro Ile Arg Cys Glu Lys Asp Glu Arg1 5
10 15Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile
Pro Ile Gln Tyr Val Leu 20 25
30Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Lys Pro Val
35 40 45Ala Leu Ile Asn Asn Gln Phe Leu
Pro Tyr Pro Tyr Tyr Ala Lys Pro 50 55
60Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln Val Leu Ser65
70 75 80Asn Thr Val Pro Ala
Lys Ser Cys Gln Ala Gln Pro Thr Thr Met Ala 85
90 95Arg His Pro His Pro His Leu Ser Phe Met
100 1053507DNAArtificial SequenceOptimized
kappa-casein truncated version 1 (OKC1-T) 3caagagcaga atcaagagca
gccaatccgt tgtgagaagg acgagaggtt cttctcagac 60aagatcgcca aatatatacc
catacaatat gtactctcac gctaccctag ctacgggctt 120aactactatc agcaaaaacc
tgtagcactg ataaataacc agtttctccc ctatccctat 180tatgctaaac ctgccgccgt
gaggagtcca gcacaaatac ttcagtggca agtgctcagt 240aacaccgtgc cagcaaaaag
ctgccaggct cagcccacca caatggcccg tcatccccat 300cctcacctta gcttcatggc
aatcccacca aagaagaatc aagacaagac cgaaatacct 360accatcaaca caattgcatc
tggagagcct accagtacac caacaactga ggcagtagag 420tctactgttg ctacccttga
ggacagcccc gaggttatag agtccccacc tgagataaat 480accgtgcagg tgacaagtac
cgccgta 5074169PRTArtificial
SequenceOptimized kappa-casein truncated version 1 (OKC1-T) 4Gln Glu
Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys Asp Glu Arg1 5
10 15Phe Phe Ser Asp Lys Ile Ala Lys
Tyr Ile Pro Ile Gln Tyr Val Leu 20 25
30Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Lys Pro
Val 35 40 45Ala Leu Ile Asn Asn
Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro 50 55
60Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln Val
Leu Ser65 70 75 80Asn
Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr Thr Met Ala
85 90 95Arg His Pro His Pro His Leu
Ser Phe Met Ala Ile Pro Pro Lys Lys 100 105
110Asn Gln Asp Lys Thr Glu Ile Pro Thr Ile Asn Thr Ile Ala
Ser Gly 115 120 125Glu Pro Thr Ser
Thr Pro Thr Thr Glu Ala Val Glu Ser Thr Val Ala 130
135 140Thr Leu Glu Asp Ser Pro Glu Val Ile Glu Ser Pro
Pro Glu Ile Asn145 150 155
160Thr Val Gln Val Thr Ser Thr Ala Val
1655627DNAArtificial SequenceOptimized beta-casein truncated version 2
(OBC-T2) 5cgcgaactgg aagagttgaa cgtaccagga gagattgtag aatcactgag
ctcctcagag 60gagtctatta ctcgtatcaa caagaagata gagaagttcc aatccgagga
gcaacaacaa 120acagaggacg aattgcagga caagatacat cctttcgcac agacccagag
cctcgtctat 180ccctttccag gtccaatccc taactctctc ccccagaata tcccaccctt
gactcagact 240cccgtggtcg tacccccttt cttgcaaccc gaggtgatgg gggtttctaa
agtcaaagag 300gctatggctc ctaaacataa ggaaatgcct tttcccaaat atccagtgga
gccattcact 360gagagccagt ctctgacact tacagatgtg gaaaacttgc acctgccctt
gccacttttg 420cagtcctgga tgcaccaacc acatcaaccc ttgcccccca cagtgatgtt
tcctccacaa 480tcagttctta gtctctccca aagcaaagtc cttccagtgc ctcagaaggc
cgtcccatac 540ccccagagag atatgccaat acaggcattc ttgctttacc aggaaccagt
gctcggtcct 600gtacgtggcc cattccctat catagtg
6276209PRTArtificial SequenceOptimized beta-casein truncated
version 2 (OBC-T2) 6Arg Glu Leu Glu Glu Leu Asn Val Pro Gly Glu Ile
Val Glu Ser Leu1 5 10
15Ser Ser Ser Glu Glu Ser Ile Thr Arg Ile Asn Lys Lys Ile Glu Lys
20 25 30Phe Gln Ser Glu Glu Gln Gln
Gln Thr Glu Asp Glu Leu Gln Asp Lys 35 40
45Ile His Pro Phe Ala Gln Thr Gln Ser Leu Val Tyr Pro Phe Pro
Gly 50 55 60Pro Ile Pro Asn Ser Leu
Pro Gln Asn Ile Pro Pro Leu Thr Gln Thr65 70
75 80Pro Val Val Val Pro Pro Phe Leu Gln Pro Glu
Val Met Gly Val Ser 85 90
95Lys Val Lys Glu Ala Met Ala Pro Lys His Lys Glu Met Pro Phe Pro
100 105 110Lys Tyr Pro Val Glu Pro
Phe Thr Glu Ser Gln Ser Leu Thr Leu Thr 115 120
125Asp Val Glu Asn Leu His Leu Pro Leu Pro Leu Leu Gln Ser
Trp Met 130 135 140His Gln Pro His Gln
Pro Leu Pro Pro Thr Val Met Phe Pro Pro Gln145 150
155 160Ser Val Leu Ser Leu Ser Gln Ser Lys Val
Leu Pro Val Pro Gln Lys 165 170
175Ala Val Pro Tyr Pro Gln Arg Asp Met Pro Ile Gln Ala Phe Leu Leu
180 185 190Tyr Gln Glu Pro Val
Leu Gly Pro Val Arg Gly Pro Phe Pro Ile Ile 195
200 205Val7597DNAArtificial SequenceOptimized alpha
S1-casein truncated version 1 (OaS1-T) 7cgcccaaaac atcccataaa
acatcaagga ttgccccagg aagtactcaa cgagaatctc 60ctccgttttt tcgttgctcc
tttccccgaa gtgttcggga aggaaaaagt aaacgagctt 120tcaaaggaca tcggctctga
aagtaccgag gatcaggcta tggaagatat caagcaaatg 180gaggccgaat ctataagttc
ttcagaagaa atagttccca actcagtgga gcagaagcac 240attcagaaag aagacgtgcc
cagcgagcgc tatctgggat atttggaaca gctgctcaga 300ctgaaaaagt acaaggtgcc
tcagctcgaa atcgtaccca atagtgctga agaaaggttg 360cactcaatga aagaggggat
tcacgcacaa caaaaagagc ctatgatcgg agtaaatcaa 420gaactggcat acttttatcc
cgagttgttt cgccaattct atcaactgga tgcctaccct 480tccggtgcat ggtactacgt
acccctcggt actcaatata ccgatgctcc ctccttttcc 540gacattccta atcctatagg
ttccgagaat agcgaaaaga ccaccatgcc cttatgg 5978199PRTArtificial
SequenceOptimized alpha S1-casein truncated version 1 (OaS1-T) 8Arg
Pro Lys His Pro Ile Lys His Gln Gly Leu Pro Gln Glu Val Leu1
5 10 15Asn Glu Asn Leu Leu Arg Phe
Phe Val Ala Pro Phe Pro Glu Val Phe 20 25
30Gly Lys Glu Lys Val Asn Glu Leu Ser Lys Asp Ile Gly Ser
Glu Ser 35 40 45Thr Glu Asp Gln
Ala Met Glu Asp Ile Lys Gln Met Glu Ala Glu Ser 50 55
60Ile Ser Ser Ser Glu Glu Ile Val Pro Asn Ser Val Glu
Gln Lys His65 70 75
80Ile Gln Lys Glu Asp Val Pro Ser Glu Arg Tyr Leu Gly Tyr Leu Glu
85 90 95Gln Leu Leu Arg Leu Lys
Lys Tyr Lys Val Pro Gln Leu Glu Ile Val 100
105 110Pro Asn Ser Ala Glu Glu Arg Leu His Ser Met Lys
Glu Gly Ile His 115 120 125Ala Gln
Gln Lys Glu Pro Met Ile Gly Val Asn Gln Glu Leu Ala Tyr 130
135 140Phe Tyr Pro Glu Leu Phe Arg Gln Phe Tyr Gln
Leu Asp Ala Tyr Pro145 150 155
160Ser Gly Ala Trp Tyr Tyr Val Pro Leu Gly Thr Gln Tyr Thr Asp Ala
165 170 175Pro Ser Phe Ser
Asp Ile Pro Asn Pro Ile Gly Ser Glu Asn Ser Glu 180
185 190Lys Thr Thr Met Pro Leu Trp
1959486DNAArtificial SequenceOptimized Beta Lactoglobulin 1 (OLG1)
9ttgatcgtaa cacagactat gaagggtctt gatatacaga aggtggccgg gacttggtac
60agtttggcaa tggccgcatc cgacatctcc ttgttggacg cacaatcagc cccattgcgt
120gtgtacgtag aagagcttaa accaactccc gagggggatc tggaaattct gctccagaaa
180tgggagaacg gtgagtgcgc ccagaagaag atcatcgcag agaagaccaa aattccagca
240gtattcaaaa tcgacgcatt gaacgaaaat aaggtgctcg tactggacac tgattataag
300aagtatctcc ttttctgtat ggagaactca gcagagcctg aacagagtct tgcctgccaa
360tgccttgttc gtaccccaga ggtagatgat gaagctctgg aaaagttcga taaggccctt
420aaggctctgc ctatgcacat taggctttct ttcaatccaa ctcaacttga ggaacaatgt
480cacatt
48610162PRTArtificial SequenceOptimized Beta Lactoglobulin 1 (OLG1) 10Leu
Ile Val Thr Gln Thr Met Lys Gly Leu Asp Ile Gln Lys Val Ala1
5 10 15Gly Thr Trp Tyr Ser Leu Ala
Met Ala Ala Ser Asp Ile Ser Leu Leu 20 25
30Asp Ala Gln Ser Ala Pro Leu Arg Val Tyr Val Glu Glu Leu
Lys Pro 35 40 45Thr Pro Glu Gly
Asp Leu Glu Ile Leu Leu Gln Lys Trp Glu Asn Gly 50 55
60Glu Cys Ala Gln Lys Lys Ile Ile Ala Glu Lys Thr Lys
Ile Pro Ala65 70 75
80Val Phe Lys Ile Asp Ala Leu Asn Glu Asn Lys Val Leu Val Leu Asp
85 90 95Thr Asp Tyr Lys Lys Tyr
Leu Leu Phe Cys Met Glu Asn Ser Ala Glu 100
105 110Pro Glu Gln Ser Leu Ala Cys Gln Cys Leu Val Arg
Thr Pro Glu Val 115 120 125Asp Asp
Glu Ala Leu Glu Lys Phe Asp Lys Ala Leu Lys Ala Leu Pro 130
135 140Met His Ile Arg Leu Ser Phe Asn Pro Thr Gln
Leu Glu Glu Gln Cys145 150 155
160His Ile11486DNAArtificial SequenceOptimized Beta Lactoglobulin 2
(OLG2) 11cttattgtga cccaaaccat gaagggcctc gacattcaaa aggttgccgg
aacctggtac 60tcccttgcta tggctgcttc cgatatctcc ttgctcgatg ctcaatccgc
tccacttagg 120gtgtacgtgg aagagttgaa gccaactcca gagggcgatc ttgagatctt
gcttcaaaag 180tgggagaacg atgagtgcgc ccagaagaag attatcgccg aaaagaccaa
gattcccgcc 240gtgttcaaga tcgatgctct caacgagaac aaggtgctcg tgctcgatac
cgactacaag 300aagtaccttc tcgtctgcat ggaaaactcc gctgagccag agcaatctct
tgtttgccaa 360tgccttgtga ggaccccaga ggttgacgat gaagctcttg agaagttcga
caaggctctc 420aaggctttgc ctatgcacat ccgccttagc ttcaacccaa ctcagcttga
ggaacagtgc 480cacatc
48612486DNAArtificial SequenceOptimized Beta Lactoglobulin 3
(OLG3) 12ctcattgtta cacaaaccat gaagggtctt gacattcaga aggttgctgg
gacatggtat 60tcactagcga tggctgcttc tgatatctcc ctgttggatg cacagtctgc
ccccctgaga 120gtgtatgttg aagaactgaa accgacacct gaaggagact tggaaatttt
actccagaaa 180tgggaaaatg atgagtgtgc ccaaaagaag ataatagccg agaagaccaa
aattcctgct 240gtgtttaaga ttgatgcttt gaatgagaac aaagtactag tcctcgacac
tgattacaag 300aaatacttat tagtgtgcat ggaaaacagc gcagagccag aacaatcact
tgtttgtcaa 360tgtttggtcc gtactccaga ggtagatgat gaagcattgg agaaatttga
taaagcattg 420aaggcacttc caatgcatat aaggcttagt ttcaatccta ctcagcttga
agagcaatgc 480cacatc
48613486DNAArtificial SequenceOptimized Beta Lactoglobulin 4
(OLG4) 13cttatagtaa ctcaaaccat gaagggactt gatatccaaa aagttgcagg
aacctggtac 60tcactggcta tggcagcttc cgacatctcc ttgttggacg cacaatccgc
accattgcgc 120gtctacgttg aggagttgaa acctacacca gagggggatc ttgagatttt
gctccagaaa 180tgggagaacg acgagtgtgc ccagaaaaaa attatagcag agaagactaa
aattcctgct 240gtttttaaga ttgatgccct gaacgagaat aaggtactgg tcctcgacac
tgattataaa 300aagtatttgc tggtgtgtat ggagaacagt gctgaacctg aacagagcct
ggtctgtcaa 360tgtcttgtaa ggacacctga ggttgatgac gaggcacttg aaaaattcga
caaggccctt 420aaggctctgc ctatgcacat ccgtctgagt ttcaacccta ctcagttgga
ggaacaatgt 480catatt
4861496DNAGlycine max 14atggctactt caaagttgaa aacccagaat
gtggttgtat ctctctccct aaccttaacc 60ttggtactgg tgctactgac cagcaaggca
aactca 961532PRTGlycine max 15Met Ala Thr
Ser Lys Leu Lys Thr Gln Asn Val Val Val Ser Leu Ser1 5
10 15Leu Thr Leu Thr Leu Val Leu Val Leu
Leu Thr Ser Lys Ala Asn Ser 20 25
301657DNAGlycine max 16atggccaagc tagttttttc cctttgtttt ctgcttttca
gtggctgctg cttcgct 571719PRTGlycine max 17Met Ala Lys Leu Val Phe
Ser Leu Cys Phe Leu Leu Phe Ser Gly Cys1 5
10 15Cys Phe Ala181543DNAGlycine max 18cattgtactc
ccagtatcat tatagtgaaa gttttggctc tctcgccggt ggttttttac 60ctctatttaa
aggggttttc cacctaaaaa ttctggtatc attctcactt tacttgttac 120tttaatttct
cataatcttt ggttgaaatt atcacgcttc cgcacacgat atccctacaa 180atttattatt
tgttaaacat tttcaaaccg cataaaattt tatgaagtcc cgtctatctt 240taatgtagtc
taacattttc atattgaaat atataattta cttaatttta gcgttggtag 300aaagcataat
gatttattct tattcttctt catataaatg tttaatatac aatataaaca 360aattctttac
cttaagaagg atttcccatt ttatatttta aaaatatatt tatcaaatat 420ttttcaacca
cgtaaatcac ataataataa gttgtttcaa aagtaataaa atttaactcc 480ataatttttt
tatttgactg atcttaaagc aacacccagt gacacaacta gccatttttt 540tctttgaata
aaaaaatcca attatcattg tatttttttt atacaatgaa aatttcacca 600aacaatgatt
tgtggtattt ctgaagcaag tcatgttatg caaaattcta taattcccat 660ttgacactac
ggaagtaact gaagatctgc ttttacatgc gagacacatc ttctaaagta 720attttaataa
tagttactat attcaagatt tcatatatca aatactcaat attacttcta 780aaaaattaat
tagatataat taaaatatta cttttttaat tttaagttta attgttgaat 840ttgtgactat
tgatttatta ttctactatg tttaaattgt tttataggta gtttaaagta 900aatataagta
atgtagtaga gtgttagagt gttaccctaa accataaact ataagattta 960tggtggacta
attttcatat atttcttatt gcttttacct tttcttggta tgtaagtccg 1020taactggaat
tactgtgggt tgccatgaca ctctgtggtc ttttggttca tgcatggatg 1080cttgcgcaag
aaaaagacaa agaacaaaga aaaaagacaa aacagagaga caaaacgcaa 1140tcacacaacc
aactcaaatt agtcactggc tgatcaagat cgccgcgtcc atgtatgtct 1200aaatgccatg
caaagcaaca cgtgcttaac atgcacttta aatggctcac ccatcccaac 1260ccactcacaa
acacattgcc tttttcttca tcatcaccac aaccacctgt atatattcat 1320tctcttccgc
cacctcaatt tcttcacttc aacacacgtc aacctgcata tgcgtgtcat 1380cccatgccca
aatctccatg catgttccta ccaccttctc tcttatataa tacctataaa 1440tacctctaat
atcactcact tctttcatca tccatccatc cagagtacta ctactctact 1500actataatac
cccaacccaa ctcatattca atactactct act
1543191384DNAGlycine max 19aacacaagct tcaagtttta aaaggaaaaa tgtcagccaa
aaactttaaa taaaatggta 60acaaggaaat tattcaaaaa ttacaaacct cgtcaaaata
ggaaagaaaa aaagtttagg 120gatttagaaa aaacatcaat ctagttccac cttattttat
agagagaaga aactaatata 180taagaactaa aaaacagaag aatagaaaaa aaaagtattg
acaggaaaga aaaagtagct 240gtatgcttat aagtactttg aggatttgaa ttctctctta
taaaacacaa acacaatttt 300tagattttat ttaaataatc atcaatccga ttataattat
ttatatattt ttctattttc 360aaagaagtaa atcatgagct tttccaactc aacatctatt
ttttttctct caaccttttt 420cacatcttaa gtagtctcac cctttatata tataacttat
ttcttacctt ttacattatg 480taacttttat caccaaaacc aacaacttta aaattttatt
aaatagactc cacaagtaac 540ttgacactct tacattcatc gacattaact tttatctgtt
ttataaatat tattgtgata 600taatttaatc aaaataacca caaactttca taaaaggttc
ttattaagca tggcatttaa 660taagcaaaaa caactcaatc actttcatat aggaggtagc
ctaagtacgt actcaaaatg 720ccaacaaata aaaaaaaagt tgctttaata atgccaaaac
aaattaataa aacacttaca 780acaccggatt ttttttaatt aaaatgtgcc atttaggata
aatagttaat atttttaata 840attatttaaa aagccgtatc tactaaaatg atttttattt
ggttgaaaat attaatatgt 900ttaaatcaac acaatctatc aaaattaaac taaaaaaaaa
ataagtgtac gtggttaaca 960ttagtacagt aatataagag gaaaatgaga aattaagaaa
ttgaaagcga gtctaatttt 1020taaattatga acctgcatat ataaaaggaa agaaagaatc
caggaagaaa agaaatgaaa 1080ccatgcatgg tcccctcgtc atcacgagtt tctgccattt
gcaatagaaa cactgaaaca 1140cctttctctt tgtcacttaa ttgagatgcc gaagccacct
cacaccatga acttcatgag 1200gtgtagcacc caaggcttcc atagccatgc atactgaaga
atgtctcaag ctcagcaccc 1260tacttctgtg acgtgtccct cattcacctt cctctcttcc
ctataaataa ccacgcctca 1320ggttctccgc ttcacaactc aaacattctc tccattggtc
cttaaacact catcagtcat 1380cacc
13842013DNAGlycine max 20tgaatgcatg atc
13211197DNAGlycine max
21aataaataaa atgggagcaa taaataaaat gggagctcat atatttacac catttacact
60gtctattatt caccatgcca attattactt cataatttta aaattatgtc atttttaaaa
120attgcttaat gatggaaagg attattataa gttaaaagta taacatagat aaactaacca
180caaaacaaat caatataaac taacttactc tcccatctaa tttttattta aatttcttta
240cacttctctt ccatttctat ttctacaaca ttatttaaca tttttattgt atttttctta
300ctttctaact ctattcattt caaaaatcaa tatatgttta tcaccacctc tctaaaaaaa
360actttacaat cattggtcca gaaaagttaa atcacgagat ggtcatttta gcattaaaac
420aacgattctt gtatcactat ttttcagcat gtagtccatt ctcttcaaac aaagacagcg
480gctatataat cgttgtgtta tattcagtct aaaacaattg ttatggtaaa agtcgtcatt
540ttacgccttt ttaaaagata taaaatgaca gttatggtta aaagtcatca tgttagatcc
600tccttaaaga tataaaatga cagttttgga taaaaagtgg tcattttata cgctcttgaa
660agatataaaa cgacggttat ggtaaaagct gccattttaa atgaaatatt tttgttttag
720ttcattttgt ttaatgctaa tcccatttaa attgacttgt acaattaaaa ctcacccacc
780cagatacaat ataaactaac ttactctcac agctaagttt tatttaaatt tctttacact
840tcttttccat ttctatttct atgacattaa ctaacatttt tctcgtaatt ttttttctta
900ttttctaact ctatccattt caaatcgata tatgtttatc accaccactt taaaaagaaa
960atttacaatt tctcgtgcaa aaaagctaaa tcatgaccgt cattttagca ttaaaacaac
1020gattcttgta tcgttgtttt tcagcatgta gtccattctt ttcaagcaaa gacaacagct
1080atataatcat cgtgttatat tcagtctaaa acaacagtaa tgataaaagt catcatttta
1140ggcctttctg aaatatatag aacgacattc atggtaaaaa atcgtcattt tagatcc
119722253DNAGlycine max 22gatcgttcaa acatttggca ataaagtttc ttaagattga
atcctgttgc cggtcttgcg 60atgattatca tataatttct gttgaattac gttaagcatg
taataattaa catgtaatgc 120atgacgttat ttatgagatg ggtttttatg attagagtcc
cgcaattata catttaatac 180gcgatagaaa acaaaatata gcgcgcaaac taggataaat
tatcgcgcgc ggtgtcatct 240atgttactag atc
253

SEQ ID NOs: 23-70. Each entry: length 4, type PRT, organism Unknown, description "Carboxy-terminal endoplasmic reticulum retention/retrieval signal".

23 Lys Asp Glu Leu
24 His Asp Glu Leu
25 His Asp Glu Phe
26 Arg Asp Glu Phe
27 Arg Asp Glu Leu
28 Trp Asp Glu Leu
29 Tyr Asp Glu Leu
30 His Glu Glu Phe
31 His Glu Glu Leu
32 Lys Glu Glu Leu
33 Arg Glu Glu Leu
34 Lys Ala Glu Leu
35 Lys Cys Glu Leu
36 Lys Phe Glu Leu
37 Lys Gly Glu Leu
38 Lys His Glu Leu
39 Lys Leu Glu Leu
40 Lys Asn Glu Leu
41 Lys Gln Glu Leu
42 Lys Arg Glu Leu
43 Lys Ser Glu Leu
44 Lys Val Glu Leu
45 Lys Trp Glu Leu
46 Lys Tyr Glu Leu
47 Lys Glu Asp Leu
48 Lys Ile Glu Leu
49 Asp Lys Glu Leu
50 Phe Asp Glu Leu
51 Lys Asp Glu Phe
52 Lys Lys Glu Leu
53 His Ala Asp Leu
54 His Ala Glu Leu
55 His Ile Glu Leu
56 His Asn Glu Leu
57 His Thr Glu Leu
58 Lys Thr Glu Leu
59 His Val Glu Leu
60 Asn Asp Glu Leu
61 Gln Asp Glu Leu
62 Arg Glu Asp Leu
63 Arg Asn Glu Leu
64 Arg Thr Asp Leu
65 Arg Thr Glu Leu
66 Ser Asp Glu Leu
67 Thr Asp Glu Leu
68 Ser Lys Glu Leu
69 Ser Thr Glu Leu
70 Glu Asp Glu Leu

71 367 PRT Artificial
SequenceFusion protein sig10OKC1-TOLG1KDEL 71Met Ala Thr Ser Lys Leu Lys
Thr Gln Asn Val Val Val Ser Leu Ser1 5 10
15Leu Thr Leu Thr Leu Val Leu Val Leu Leu Thr Ser Lys
Ala Asn Ser 20 25 30Gln Glu
Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys Asp Glu Arg 35
40 45Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile
Pro Ile Gln Tyr Val Leu 50 55 60Ser
Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Lys Pro Val65
70 75 80Ala Leu Ile Asn Asn Gln
Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro 85
90 95Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp
Gln Val Leu Ser 100 105 110Asn
Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr Thr Met Ala 115
120 125Arg His Pro His Pro His Leu Ser Phe
Met Ala Ile Pro Pro Lys Lys 130 135
140Asn Gln Asp Lys Thr Glu Ile Pro Thr Ile Asn Thr Ile Ala Ser Gly145
150 155 160Glu Pro Thr Ser
Thr Pro Thr Thr Glu Ala Val Glu Ser Thr Val Ala 165
170 175Thr Leu Glu Asp Ser Pro Glu Val Ile Glu
Ser Pro Pro Glu Ile Asn 180 185
190Thr Val Gln Val Thr Ser Thr Ala Val Leu Ile Val Thr Gln Thr Met
195 200 205Lys Gly Leu Asp Ile Gln Lys
Val Ala Gly Thr Trp Tyr Ser Leu Ala 210 215
220Met Ala Ala Ser Asp Ile Ser Leu Leu Asp Ala Gln Ser Ala Pro
Leu225 230 235 240Arg Val
Tyr Val Glu Glu Leu Lys Pro Thr Pro Glu Gly Asp Leu Glu
245 250 255Ile Leu Leu Gln Lys Trp Glu
Asn Gly Glu Cys Ala Gln Lys Lys Ile 260 265
270Ile Ala Glu Lys Thr Lys Ile Pro Ala Val Phe Lys Ile Asp
Ala Leu 275 280 285Asn Glu Asn Lys
Val Leu Val Leu Asp Thr Asp Tyr Lys Lys Tyr Leu 290
295 300Leu Phe Cys Met Glu Asn Ser Ala Glu Pro Glu Gln
Ser Leu Ala Cys305 310 315
320Gln Cys Leu Val Arg Thr Pro Glu Val Asp Asp Glu Ala Leu Glu Lys
325 330 335Phe Asp Lys Ala Leu
Lys Ala Leu Pro Met His Ile Arg Leu Ser Phe 340
345 350Asn Pro Thr Gln Leu Glu Glu Gln Cys His Ile Lys
Asp Glu Leu 355 360
365721104DNAArtificial SequenceNucleic acid sequence encoding fusion
protein sig10OKC1-TOLG1KDEL 72atggctactt caaagttgaa aacccagaat
gtggttgtat ctctctccct aaccttaacc 60ttggtactgg tgctactgac cagcaaggca
aactcacaag agcagaatca agagcagcca 120atccgttgtg agaaggacga gaggttcttc
tcagacaaga tcgccaaata tatacccata 180caatatgtac tctcacgcta ccctagctac
gggcttaact actatcagca aaaacctgta 240gcactgataa ataaccagtt tctcccctat
ccctattatg ctaaacctgc cgccgtgagg 300agtccagcac aaatacttca gtggcaagtg
ctcagtaaca ccgtgccagc aaaaagctgc 360caggctcagc ccaccacaat ggcccgtcat
ccccatcctc accttagctt catggcaatc 420ccaccaaaga agaatcaaga caagaccgaa
atacctacca tcaacacaat tgcatctgga 480gagcctacca gtacaccaac aactgaggca
gtagagtcta ctgttgctac ccttgaggac 540agccccgagg ttatagagtc cccacctgag
ataaataccg tgcaggtgac aagtaccgcc 600gtattgatcg taacacagac tatgaagggt
cttgatatac agaaggtggc cgggacttgg 660tacagtttgg caatggccgc atccgacatc
tccttgttgg acgcacaatc agccccattg 720cgtgtgtacg tagaagagct taaaccaact
cccgaggggg atctggaaat tctgctccag 780aaatgggaga acggtgagtg cgcccagaag
aagatcatcg cagagaagac caaaattcca 840gcagtattca aaatcgacgc attgaacgaa
aataaggtgc tcgtactgga cactgattat 900aagaagtatc tccttttctg tatggagaac
tcagcagagc ctgaacagag tcttgcctgc 960caatgccttg ttcgtacccc agaggtagat
gatgaagctc tggaaaagtt cgataaggcc 1020cttaaggctc tgcctatgca cattaggctt
tctttcaatc caactcaact tgaggaacaa 1080tgtcacatta aggatgagct ttaa
110473405PRTArtificial SequenceFusion
protein sig10OBC-T2FMOLG1 73Met Ala Thr Ser Lys Leu Lys Thr Gln Asn Val
Val Val Ser Leu Ser1 5 10
15Leu Thr Leu Thr Leu Val Leu Val Leu Leu Thr Ser Lys Ala Asn Ser
20 25 30Arg Glu Leu Glu Glu Leu Asn
Val Pro Gly Glu Ile Val Glu Ser Leu 35 40
45Ser Ser Ser Glu Glu Ser Ile Thr Arg Ile Asn Lys Lys Ile Glu
Lys 50 55 60Phe Gln Ser Glu Glu Gln
Gln Gln Thr Glu Asp Glu Leu Gln Asp Lys65 70
75 80Ile His Pro Phe Ala Gln Thr Gln Ser Leu Val
Tyr Pro Phe Pro Gly 85 90
95Pro Ile Pro Asn Ser Leu Pro Gln Asn Ile Pro Pro Leu Thr Gln Thr
100 105 110Pro Val Val Val Pro Pro
Phe Leu Gln Pro Glu Val Met Gly Val Ser 115 120
125Lys Val Lys Glu Ala Met Ala Pro Lys His Lys Glu Met Pro
Phe Pro 130 135 140Lys Tyr Pro Val Glu
Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu Thr145 150
155 160Asp Val Glu Asn Leu His Leu Pro Leu Pro
Leu Leu Gln Ser Trp Met 165 170
175His Gln Pro His Gln Pro Leu Pro Pro Thr Val Met Phe Pro Pro Gln
180 185 190Ser Val Leu Ser Leu
Ser Gln Ser Lys Val Leu Pro Val Pro Gln Lys 195
200 205Ala Val Pro Tyr Pro Gln Arg Asp Met Pro Ile Gln
Ala Phe Leu Leu 210 215 220Tyr Gln Glu
Pro Val Leu Gly Pro Val Arg Gly Pro Phe Pro Ile Ile225
230 235 240Val Phe Met Leu Ile Val Thr
Gln Thr Met Lys Gly Leu Asp Ile Gln 245
250 255Lys Val Ala Gly Thr Trp Tyr Ser Leu Ala Met Ala
Ala Ser Asp Ile 260 265 270Ser
Leu Leu Asp Ala Gln Ser Ala Pro Leu Arg Val Tyr Val Glu Glu 275
280 285Leu Lys Pro Thr Pro Glu Gly Asp Leu
Glu Ile Leu Leu Gln Lys Trp 290 295
300Glu Asn Gly Glu Cys Ala Gln Lys Lys Ile Ile Ala Glu Lys Thr Lys305
310 315 320Ile Pro Ala Val
Phe Lys Ile Asp Ala Leu Asn Glu Asn Lys Val Leu 325
330 335Val Leu Asp Thr Asp Tyr Lys Lys Tyr Leu
Leu Phe Cys Met Glu Asn 340 345
350Ser Ala Glu Pro Glu Gln Ser Leu Ala Cys Gln Cys Leu Val Arg Thr
355 360 365Pro Glu Val Asp Asp Glu Ala
Leu Glu Lys Phe Asp Lys Ala Leu Lys 370 375
380Ala Leu Pro Met His Ile Arg Leu Ser Phe Asn Pro Thr Gln Leu
Glu385 390 395 400Glu Gln
Cys His Ile 405741218DNAArtificial SequenceNucleic acid
encoding fusion protein sig10OBC-T2FMOLG1 74atggctactt caaagttgaa
aacccagaat gtggttgtat ctctctccct aaccttaacc 60ttggtactgg tgctactgac
cagcaaggca aactcacgcg aactggaaga gttgaacgta 120ccaggagaga ttgtagaatc
actgagctcc tcagaggagt ctattactcg tatcaacaag 180aagatagaga agttccaatc
cgaggagcaa caacaaacag aggacgaatt gcaggacaag 240atacatcctt tcgcacagac
ccagagcctc gtctatccct ttccaggtcc aatccctaac 300tctctccccc agaatatccc
acccttgact cagactcccg tggtcgtacc ccctttcttg 360caacccgagg tgatgggggt
ttctaaagtc aaagaggcta tggctcctaa acataaggaa 420atgccttttc ccaaatatcc
agtggagcca ttcactgaga gccagtctct gacacttaca 480gatgtggaaa acttgcacct
gcccttgcca cttttgcagt cctggatgca ccaaccacat 540caacccttgc cccccacagt
gatgtttcct ccacaatcag ttcttagtct ctcccaaagc 600aaagtccttc cagtgcctca
gaaggccgtc ccataccccc agagagatat gccaatacag 660gcattcttgc tttaccagga
accagtgctc ggtcctgtac gtggcccatt ccctatcata 720gtgttcatgt tgatcgtaac
acagactatg aagggtcttg atatacagaa ggtggccggg 780acttggtaca gtttggcaat
ggccgcatcc gacatctcct tgttggacgc acaatcagcc 840ccattgcgtg tgtacgtaga
agagcttaaa ccaactcccg agggggatct ggaaattctg 900ctccagaaat gggagaacgg
tgagtgcgcc cagaagaaga tcatcgcaga gaagaccaaa 960attccagcag tattcaaaat
cgacgcattg aacgaaaata aggtgctcgt actggacact 1020gattataaga agtatctcct
tttctgtatg gagaactcag cagagcctga acagagtctt 1080gcctgccaat gccttgttcg
taccccagag gtagatgatg aagctctgga aaagttcgat 1140aaggccctta aggctctgcc
tatgcacatt aggctttctt tcaatccaac tcaacttgag 1200gaacaatgtc acatttaa
121875395PRTArtificial
SequenceFusion protein sig10OaS1-TFMOLG1 75Met Ala Thr Ser Lys Leu Lys
Thr Gln Asn Val Val Val Ser Leu Ser1 5 10
15Leu Thr Leu Thr Leu Val Leu Val Leu Leu Thr Ser Lys
Ala Asn Ser 20 25 30Arg Pro
Lys His Pro Ile Lys His Gln Gly Leu Pro Gln Glu Val Leu 35
40 45Asn Glu Asn Leu Leu Arg Phe Phe Val Ala
Pro Phe Pro Glu Val Phe 50 55 60Gly
Lys Glu Lys Val Asn Glu Leu Ser Lys Asp Ile Gly Ser Glu Ser65
70 75 80Thr Glu Asp Gln Ala Met
Glu Asp Ile Lys Gln Met Glu Ala Glu Ser 85
90 95Ile Ser Ser Ser Glu Glu Ile Val Pro Asn Ser Val
Glu Gln Lys His 100 105 110Ile
Gln Lys Glu Asp Val Pro Ser Glu Arg Tyr Leu Gly Tyr Leu Glu 115
120 125Gln Leu Leu Arg Leu Lys Lys Tyr Lys
Val Pro Gln Leu Glu Ile Val 130 135
140Pro Asn Ser Ala Glu Glu Arg Leu His Ser Met Lys Glu Gly Ile His145
150 155 160Ala Gln Gln Lys
Glu Pro Met Ile Gly Val Asn Gln Glu Leu Ala Tyr 165
170 175Phe Tyr Pro Glu Leu Phe Arg Gln Phe Tyr
Gln Leu Asp Ala Tyr Pro 180 185
190Ser Gly Ala Trp Tyr Tyr Val Pro Leu Gly Thr Gln Tyr Thr Asp Ala
195 200 205Pro Ser Phe Ser Asp Ile Pro
Asn Pro Ile Gly Ser Glu Asn Ser Glu 210 215
220Lys Thr Thr Met Pro Leu Trp Phe Met Leu Ile Val Thr Gln Thr
Met225 230 235 240Lys Gly
Leu Asp Ile Gln Lys Val Ala Gly Thr Trp Tyr Ser Leu Ala
245 250 255Met Ala Ala Ser Asp Ile Ser
Leu Leu Asp Ala Gln Ser Ala Pro Leu 260 265
270Arg Val Tyr Val Glu Glu Leu Lys Pro Thr Pro Glu Gly Asp
Leu Glu 275 280 285Ile Leu Leu Gln
Lys Trp Glu Asn Gly Glu Cys Ala Gln Lys Lys Ile 290
295 300Ile Ala Glu Lys Thr Lys Ile Pro Ala Val Phe Lys
Ile Asp Ala Leu305 310 315
320Asn Glu Asn Lys Val Leu Val Leu Asp Thr Asp Tyr Lys Lys Tyr Leu
325 330 335Leu Phe Cys Met Glu
Asn Ser Ala Glu Pro Glu Gln Ser Leu Ala Cys 340
345 350Gln Cys Leu Val Arg Thr Pro Glu Val Asp Asp Glu
Ala Leu Glu Lys 355 360 365Phe Asp
Lys Ala Leu Lys Ala Leu Pro Met His Ile Arg Leu Ser Phe 370
375 380Asn Pro Thr Gln Leu Glu Glu Gln Cys His
Ile385 390 395761188DNAArtificial
SequenceNucleic acid encoding fusion protein sig10OaS1-TFMOLG1
76atggctactt caaagttgaa aacccagaat gtggttgtat ctctctccct aaccttaacc
60ttggtactgg tgctactgac cagcaaggca aactcacgcc caaaacatcc cataaaacat
120caaggattgc cccaggaagt actcaacgag aatctcctcc gttttttcgt tgctcctttc
180cccgaagtgt tcgggaagga aaaagtaaac gagctttcaa aggacatcgg ctctgaaagt
240accgaggatc aggctatgga agatatcaag caaatggagg ccgaatctat aagttcttca
300gaagaaatag ttcccaactc agtggagcag aagcacattc agaaagaaga cgtgcccagc
360gagcgctatc tgggatattt ggaacagctg ctcagactga aaaagtacaa ggtgcctcag
420ctcgaaatcg tacccaatag tgctgaagaa aggttgcact caatgaaaga ggggattcac
480gcacaacaaa aagagcctat gatcggagta aatcaagaac tggcatactt ttatcccgag
540ttgtttcgcc aattctatca actggatgcc tacccttccg gtgcatggta ctacgtaccc
600ctcggtactc aatataccga tgctccctcc ttttccgaca ttcctaatcc tataggttcc
660gagaatagcg aaaagaccac catgccctta tggttcatgt tgatcgtaac acagactatg
720aagggtcttg atatacagaa ggtggccggg acttggtaca gtttggcaat ggccgcatcc
780gacatctcct tgttggacgc acaatcagcc ccattgcgtg tgtacgtaga agagcttaaa
840ccaactcccg agggggatct ggaaattctg ctccagaaat gggagaacgg tgagtgcgcc
900cagaagaaga tcatcgcaga gaagaccaaa attccagcag tattcaaaat cgacgcattg
960aacgaaaata aggtgctcgt actggacact gattataaga agtatctcct tttctgtatg
1020gagaactcag cagagcctga acagagtctt gcctgccaat gccttgttcg taccccagag
1080gtagatgatg aagctctgga aaagttcgat aaggccctta aggctctgcc tatgcacatt
1140aggctttctt tcaatccaac tcaacttgag gaacaatgtc acatttaa
118877304PRTArtificial SequenceFusion protein sig10paraOKC1-TFMOLG1KDEL
77Met Ala Thr Ser Lys Leu Lys Thr Gln Asn Val Val Val Ser Leu Ser1
5 10 15Leu Thr Leu Thr Leu Val
Leu Val Leu Leu Thr Ser Lys Ala Asn Ser 20 25
30Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys
Asp Glu Arg 35 40 45Phe Phe Ser
Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln Tyr Val Leu 50
55 60Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln
Gln Lys Pro Val65 70 75
80Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro
85 90 95Ala Ala Val Arg Ser Pro
Ala Gln Ile Leu Gln Trp Gln Val Leu Ser 100
105 110Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro
Thr Thr Met Ala 115 120 125Arg His
Pro His Pro His Leu Ser Phe Met Leu Ile Val Thr Gln Thr 130
135 140Met Lys Gly Leu Asp Ile Gln Lys Val Ala Gly
Thr Trp Tyr Ser Leu145 150 155
160Ala Met Ala Ala Ser Asp Ile Ser Leu Leu Asp Ala Gln Ser Ala Pro
165 170 175Leu Arg Val Tyr
Val Glu Glu Leu Lys Pro Thr Pro Glu Gly Asp Leu 180
185 190Glu Ile Leu Leu Gln Lys Trp Glu Asn Gly Glu
Cys Ala Gln Lys Lys 195 200 205Ile
Ile Ala Glu Lys Thr Lys Ile Pro Ala Val Phe Lys Ile Asp Ala 210
215 220Leu Asn Glu Asn Lys Val Leu Val Leu Asp
Thr Asp Tyr Lys Lys Tyr225 230 235
240Leu Leu Phe Cys Met Glu Asn Ser Ala Glu Pro Glu Gln Ser Leu
Ala 245 250 255Cys Gln Cys
Leu Val Arg Thr Pro Glu Val Asp Asp Glu Ala Leu Glu 260
265 270Lys Phe Asp Lys Ala Leu Lys Ala Leu Pro
Met His Ile Arg Leu Ser 275 280
285Phe Asn Pro Thr Gln Leu Glu Glu Gln Cys His Ile Lys Asp Glu Leu 290
295 30078915DNAArtificial SequenceNucleic
acid encoding fusion protein sig10paraOKC1-TFMOLG1KDEL 78atggctactt
caaagttgaa aacccagaat gtggttgtat ctctctccct aaccttaacc 60ttggtactgg
tgctactgac cagcaaggca aactcacaag agcagaatca agagcagcca 120atccgttgtg
agaaggacga gaggttcttc tcagacaaga tcgccaaata tatacccata 180caatatgtac
tctcacgcta ccctagctac gggcttaact actatcagca aaaacctgta 240gcactgataa
ataaccagtt tctcccctat ccctattatg ctaaacctgc cgccgtgagg 300agtccagcac
aaatacttca gtggcaagtg ctcagtaaca ccgtgccagc aaaaagctgc 360caggctcagc
ccaccacaat ggcccgtcat ccccatcctc accttagctt catgttgatc 420gtaacacaga
ctatgaaggg tcttgatata cagaaggtgg ccgggacttg gtacagtttg 480gcaatggccg
catccgacat ctccttgttg gacgcacaat cagccccatt gcgtgtgtac 540gtagaagagc
ttaaaccaac tcccgagggg gatctggaaa ttctgctcca gaaatgggag 600aacggtgagt
gcgcccagaa gaagatcatc gcagagaaga ccaaaattcc agcagtattc 660aaaatcgacg
cattgaacga aaataaggtg ctcgtactgg acactgatta taagaagtat 720ctccttttct
gtatggagaa ctcagcagag cctgaacaga gtcttgcctg ccaatgcctt 780gttcgtaccc
cagaggtaga tgatgaagct ctggaaaagt tcgataaggc ccttaaggct 840ctgcctatgc
acattaggct ttctttcaat ccaactcaac ttgaggaaca atgtcacatt 900aaggatgagc
tttaa
91579300PRTArtificial SequenceFusion protein sig10paraOKC1-TFMOLG1 79Met
Ala Thr Ser Lys Leu Lys Thr Gln Asn Val Val Val Ser Leu Ser1
5 10 15Leu Thr Leu Thr Leu Val Leu
Val Leu Leu Thr Ser Lys Ala Asn Ser 20 25
30Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys Asp
Glu Arg 35 40 45Phe Phe Ser Asp
Lys Ile Ala Lys Tyr Ile Pro Ile Gln Tyr Val Leu 50 55
60Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln
Lys Pro Val65 70 75
80Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro
85 90 95Ala Ala Val Arg Ser Pro
Ala Gln Ile Leu Gln Trp Gln Val Leu Ser 100
105 110Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro
Thr Thr Met Ala 115 120 125Arg His
Pro His Pro His Leu Ser Phe Met Leu Ile Val Thr Gln Thr 130
135 140Met Lys Gly Leu Asp Ile Gln Lys Val Ala Gly
Thr Trp Tyr Ser Leu145 150 155
160Ala Met Ala Ala Ser Asp Ile Ser Leu Leu Asp Ala Gln Ser Ala Pro
165 170 175Leu Arg Val Tyr
Val Glu Glu Leu Lys Pro Thr Pro Glu Gly Asp Leu 180
185 190Glu Ile Leu Leu Gln Lys Trp Glu Asn Gly Glu
Cys Ala Gln Lys Lys 195 200 205Ile
Ile Ala Glu Lys Thr Lys Ile Pro Ala Val Phe Lys Ile Asp Ala 210
215 220Leu Asn Glu Asn Lys Val Leu Val Leu Asp
Thr Asp Tyr Lys Lys Tyr225 230 235
240Leu Leu Phe Cys Met Glu Asn Ser Ala Glu Pro Glu Gln Ser Leu
Ala 245 250 255Cys Gln Cys
Leu Val Arg Thr Pro Glu Val Asp Asp Glu Ala Leu Glu 260
265 270Lys Phe Asp Lys Ala Leu Lys Ala Leu Pro
Met His Ile Arg Leu Ser 275 280
285Phe Asn Pro Thr Gln Leu Glu Glu Gln Cys His Ile 290
295 30080903DNAArtificial SequenceNucleic acid encoding
fusion protein sig10paraOKC1-TFMOLG1 80atggctactt caaagttgaa
aacccagaat gtggttgtat ctctctccct aaccttaacc 60ttggtactgg tgctactgac
cagcaaggca aactcacaag agcagaatca agagcagcca 120atccgttgtg agaaggacga
gaggttcttc tcagacaaga tcgccaaata tatacccata 180caatatgtac tctcacgcta
ccctagctac gggcttaact actatcagca aaaacctgta 240gcactgataa ataaccagtt
tctcccctat ccctattatg ctaaacctgc cgccgtgagg 300agtccagcac aaatacttca
gtggcaagtg ctcagtaaca ccgtgccagc aaaaagctgc 360caggctcagc ccaccacaat
ggcccgtcat ccccatcctc accttagctt catgttgatc 420gtaacacaga ctatgaaggg
tcttgatata cagaaggtgg ccgggacttg gtacagtttg 480gcaatggccg catccgacat
ctccttgttg gacgcacaat cagccccatt gcgtgtgtac 540gtagaagagc ttaaaccaac
tcccgagggg gatctggaaa ttctgctcca gaaatgggag 600aacggtgagt gcgcccagaa
gaagatcatc gcagagaaga ccaaaattcc agcagtattc 660aaaatcgacg cattgaacga
aaataaggtg ctcgtactgg acactgatta taagaagtat 720ctccttttct gtatggagaa
ctcagcagag cctgaacaga gtcttgcctg ccaatgcctt 780gttcgtaccc cagaggtaga
tgatgaagct ctggaaaagt tcgataaggc ccttaaggct 840ctgcctatgc acattaggct
ttctttcaat ccaactcaac ttgaggaaca atgtcacatt 900taa
90381354PRTArtificial
SequenceFusion protein sig2OKC1-TOLG1KDEL 81Met Ala Lys Leu Val Phe Ser
Leu Cys Phe Leu Leu Phe Ser Gly Cys1 5 10
15Cys Phe Ala Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg
Cys Glu Lys 20 25 30Asp Glu
Arg Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln 35
40 45Tyr Val Leu Ser Arg Tyr Pro Ser Tyr Gly
Leu Asn Tyr Tyr Gln Gln 50 55 60Lys
Pro Val Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr65
70 75 80Ala Lys Pro Ala Ala Val
Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln 85
90 95Val Leu Ser Asn Thr Val Pro Ala Lys Ser Cys Gln
Ala Gln Pro Thr 100 105 110Thr
Met Ala Arg His Pro His Pro His Leu Ser Phe Met Ala Ile Pro 115
120 125Pro Lys Lys Asn Gln Asp Lys Thr Glu
Ile Pro Thr Ile Asn Thr Ile 130 135
140Ala Ser Gly Glu Pro Thr Ser Thr Pro Thr Thr Glu Ala Val Glu Ser145
150 155 160Thr Val Ala Thr
Leu Glu Asp Ser Pro Glu Val Ile Glu Ser Pro Pro 165
170 175Glu Ile Asn Thr Val Gln Val Thr Ser Thr
Ala Val Leu Ile Val Thr 180 185
190Gln Thr Met Lys Gly Leu Asp Ile Gln Lys Val Ala Gly Thr Trp Tyr
195 200 205Ser Leu Ala Met Ala Ala Ser
Asp Ile Ser Leu Leu Asp Ala Gln Ser 210 215
220Ala Pro Leu Arg Val Tyr Val Glu Glu Leu Lys Pro Thr Pro Glu
Gly225 230 235 240Asp Leu
Glu Ile Leu Leu Gln Lys Trp Glu Asn Gly Glu Cys Ala Gln
245 250 255Lys Lys Ile Ile Ala Glu Lys
Thr Lys Ile Pro Ala Val Phe Lys Ile 260 265
270Asp Ala Leu Asn Glu Asn Lys Val Leu Val Leu Asp Thr Asp
Tyr Lys 275 280 285Lys Tyr Leu Leu
Phe Cys Met Glu Asn Ser Ala Glu Pro Glu Gln Ser 290
295 300Leu Ala Cys Gln Cys Leu Val Arg Thr Pro Glu Val
Asp Asp Glu Ala305 310 315
320Leu Glu Lys Phe Asp Lys Ala Leu Lys Ala Leu Pro Met His Ile Arg
325 330 335Leu Ser Phe Asn Pro
Thr Gln Leu Glu Glu Gln Cys His Ile Lys Asp 340
345 350Glu Leu821065DNAArtificial SequenceNucleic acid
encoding fusion protein sig2OKC1-TOLG1KDEL 82atggccaagc tagttttttc
cctttgtttt ctgcttttca gtggctgctg cttcgctcaa 60gagcagaatc aagagcagcc
aatccgttgt gagaaggacg agaggttctt ctcagacaag 120atcgccaaat atatacccat
acaatatgta ctctcacgct accctagcta cgggcttaac 180tactatcagc aaaaacctgt
agcactgata aataaccagt ttctccccta tccctattat 240gctaaacctg ccgccgtgag
gagtccagca caaatacttc agtggcaagt gctcagtaac 300accgtgccag caaaaagctg
ccaggctcag cccaccacaa tggcccgtca tccccatcct 360caccttagct tcatggcaat
cccaccaaag aagaatcaag acaagaccga aatacctacc 420atcaacacaa ttgcatctgg
agagcctacc agtacaccaa caactgaggc agtagagtct 480actgttgcta cccttgagga
cagccccgag gttatagagt ccccacctga gataaatacc 540gtgcaggtga caagtaccgc
cgtattgatc gtaacacaga ctatgaaggg tcttgatata 600cagaaggtgg ccgggacttg
gtacagtttg gcaatggccg catccgacat ctccttgttg 660gacgcacaat cagccccatt
gcgtgtgtac gtagaagagc ttaaaccaac tcccgagggg 720gatctggaaa ttctgctcca
gaaatgggag aacggtgagt gcgcccagaa gaagatcatc 780gcagagaaga ccaaaattcc
agcagtattc aaaatcgacg cattgaacga aaataaggtg 840ctcgtactgg acactgatta
taagaagtat ctccttttct gtatggagaa ctcagcagag 900cctgaacaga gtcttgcctg
ccaatgcctt gttcgtaccc cagaggtaga tgatgaagct 960ctggaaaagt tcgataaggc
ccttaaggct ctgcctatgc acattaggct ttctttcaat 1020ccaactcaac ttgaggaaca
atgtcacatt aaggatgagc tttaa 106583621PRTArtificial
SequenceOptimized alpha S2-casein truncated version 1 (OaS2-T) 83Ala
Ala Gly Ala Ala Thr Ala Cys Thr Ala Thr Gly Gly Ala Ala Cys1
5 10 15Ala Cys Gly Thr Ala Ala Gly
Cys Thr Cys Ala Ala Gly Thr Gly Ala 20 25
30Ala Gly Ala Ala Thr Cys Thr Ala Thr Ala Ala Thr Ala Ala
Gly Thr 35 40 45Cys Ala Ala Gly
Ala Gly Ala Cys Ala Thr Ala Thr Ala Ala Gly Cys 50 55
60Ala Ala Gly Ala Gly Ala Ala Ala Ala Ala Cys Ala Thr
Gly Gly Cys65 70 75
80Ala Ala Thr Ala Ala Ala Thr Cys Cys Cys Thr Cys Cys Ala Ala Gly
85 90 95Gly Ala Gly Ala Ala Thr
Cys Thr Thr Thr Gly Thr Ala Gly Cys Ala 100
105 110Cys Thr Thr Thr Thr Thr Gly Cys Ala Ala Ala Gly
Ala Ala Gly Thr 115 120 125Thr Gly
Thr Gly Ala Gly Ala Ala Ala Thr Gly Cys Ala Ala Ala Thr 130
135 140Gly Ala Gly Gly Ala Ala Gly Ala Ala Thr Ala
Cys Thr Cys Ala Ala145 150 155
160Thr Ala Gly Gly Cys Ala Gly Cys Thr Cys Thr Thr Cys Cys Gly Ala
165 170 175Ala Gly Ala Ala
Thr Cys Thr Gly Cys Thr Gly Ala Ala Gly Thr Cys 180
185 190Gly Cys Thr Ala Cys Thr Gly Ala Ala Gly Ala
Gly Gly Thr Cys Ala 195 200 205Ala
Ala Ala Thr Ala Ala Cys Ala Gly Thr Thr Gly Ala Cys Gly Ala 210
215 220Cys Ala Ala Gly Cys Ala Thr Thr Ala Thr
Cys Ala Ala Ala Ala Ala225 230 235
240Gly Cys Cys Cys Thr Gly Ala Ala Thr Gly Ala Ala Ala Thr Ala
Ala 245 250 255Ala Cys Cys
Ala Gly Thr Thr Cys Thr Ala Cys Cys Ala Ala Ala Ala 260
265 270Ala Thr Thr Thr Cys Cys Cys Cys Ala Ala
Thr Ala Cys Cys Thr Cys 275 280
285Cys Ala Gly Thr Ala Cys Cys Thr Thr Thr Ala Thr Cys Ala Ala Gly 290
295 300Gly Ala Cys Cys Cys Ala Thr Ala
Gly Thr Cys Cys Thr Cys Ala Ala305 310
315 320Cys Cys Cys Thr Thr Gly Gly Gly Ala Thr Cys Ala
Gly Gly Thr Cys 325 330
335Ala Ala Gly Cys Gly Thr Ala Ala Thr Gly Cys Thr Gly Thr Thr Cys
340 345 350Cys Ala Ala Thr Ala Ala
Cys Ala Cys Cys Ala Ala Cys Ala Cys Thr 355 360
365Cys Ala Ala Thr Cys Gly Thr Gly Ala Ala Cys Ala Ala Cys
Thr Gly 370 375 380Thr Cys Thr Ala Cys
Cys Thr Cys Ala Gly Ala Ala Gly Ala Ala Ala385 390
395 400Ala Thr Thr Cys Cys Ala Ala Ala Ala Ala
Ala Ala Cys Thr Gly Thr 405 410
415Gly Gly Ala Thr Ala Thr Gly Gly Ala Ala Ala Gly Thr Ala Cys Ala
420 425 430Gly Ala Ala Gly Thr
Thr Thr Thr Thr Ala Cys Thr Ala Ala Ala Ala 435
440 445Ala Gly Ala Cys Cys Ala Ala Gly Cys Thr Cys Ala
Cys Cys Gly Ala 450 455 460Gly Gly Ala
Gly Gly Ala Ala Ala Ala Ala Ala Ala Thr Ala Gly Ala465
470 475 480Thr Thr Gly Ala Ala Thr Thr
Thr Thr Cys Thr Thr Ala Ala Gly Ala 485
490 495Ala Gly Ala Thr Cys Ala Gly Thr Cys Ala Ala Cys
Gly Cys Thr Ala 500 505 510Thr
Cys Ala Gly Ala Ala Gly Thr Thr Cys Gly Cys Cys Cys Thr Thr 515
520 525Cys Cys Ala Cys Ala Ala Thr Ala Cys
Cys Thr Cys Ala Ala Gly Ala 530 535
540Cys Thr Gly Thr Ala Thr Ala Cys Cys Ala Ala Cys Ala Thr Cys Ala545
550 555 560Gly Ala Ala Gly
Gly Cys Cys Ala Thr Gly Ala Ala Gly Cys Cys Thr 565
570 575Thr Gly Gly Ala Thr Thr Cys Ala Gly Cys
Cys Cys Ala Ala Ala Ala 580 585
590Cys Ala Ala Ala Gly Gly Thr Ala Ala Thr Cys Cys Cys Cys Thr Ala
595 600 605Thr Gly Thr Thr Ala Gly Ala
Thr Ala Cys Thr Thr Gly 610 615
62084207PRTArtificial SequenceOptimized alpha S2-casein truncated version
1 (OaS2-T) 84Lys Asn Thr Met Glu His Val Ser Ser Ser Glu Glu Ser Ile
Ile Ser1 5 10 15Gln Glu
Thr Tyr Lys Gln Glu Lys Asn Met Ala Ile Asn Pro Ser Lys 20
25 30Glu Asn Leu Cys Ser Thr Phe Cys Lys
Glu Val Val Arg Asn Ala Asn 35 40
45Glu Glu Glu Tyr Ser Ile Gly Ser Ser Ser Glu Glu Ser Ala Glu Val 50
55 60Ala Thr Glu Glu Val Lys Ile Thr Val
Asp Asp Lys His Tyr Gln Lys65 70 75
80Ala Leu Asn Glu Ile Asn Gln Phe Tyr Gln Lys Phe Pro Gln
Tyr Leu 85 90 95Gln Tyr
Leu Tyr Gln Gly Pro Ile Val Leu Asn Pro Trp Asp Gln Val 100
105 110Lys Arg Asn Ala Val Pro Ile Thr Pro
Thr Leu Asn Arg Glu Gln Leu 115 120
125Ser Thr Ser Glu Glu Asn Ser Lys Lys Thr Val Asp Met Glu Ser Thr
130 135 140Glu Val Phe Thr Lys Lys Thr
Lys Leu Thr Glu Glu Glu Lys Asn Arg145 150
155 160Leu Asn Phe Leu Lys Lys Ile Ser Gln Arg Tyr Gln
Lys Phe Ala Leu 165 170
175Pro Gln Tyr Leu Lys Thr Val Tyr Gln His Gln Lys Ala Met Lys Pro
180 185 190Trp Ile Gln Pro Lys Thr
Lys Val Ile Pro Tyr Val Arg Tyr Leu 195 200
20585171PRTCapra hircus 85Gln Glu Gln Asn Gln Glu Gln Pro Ile
Cys Cys Glu Lys Asp Glu Arg1 5 10
15Phe Phe Asp Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln Tyr Val
Leu 20 25 30Ser Arg Tyr Pro
Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Arg Pro Val 35
40 45Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr
Tyr Ala Lys Pro 50 55 60Val Ala Val
Arg Ser Pro Ala Gln Thr Leu Gln Trp Gln Val Leu Pro65 70
75 80Asn Thr Val Pro Ala Lys Ser Cys
Gln Asp Gln Pro Thr Thr Leu Ala 85 90
95Arg His Pro His Pro His Leu Ser Phe Met Ala Ile Pro Pro
Lys Lys 100 105 110Asp Gln Asp
Lys Thr Glu Val Pro Ala Ile Asn Thr Ile Ala Ser Ala 115
120 125Glu Pro Thr Val His Ser Thr Pro Thr Thr Glu
Ala Ile Val Asn Thr 130 135 140Val Asp
Asn Pro Glu Ala Ser Ser Glu Ser Ile Ala Ser Ala Ser Glu145
150 155 160Thr Asn Thr Ala Gln Val Thr
Ser Thr Glu Val 165 17086171PRTOvis aries
86Gln Glu Gln Asn Gln Glu Gln Arg Ile Cys Cys Glu Lys Asp Glu Arg1
5 10 15Phe Phe Asp Asp Lys Ile
Ala Lys Tyr Ile Pro Ile Gln Tyr Val Leu 20 25
30Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln
Arg Pro Val 35 40 45Ala Leu Ile
Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro 50
55 60Val Ala Val Arg Ser Pro Ala Gln Thr Leu Gln Trp
Gln Val Leu Pro65 70 75
80Asn Ala Val Pro Ala Lys Ser Cys Gln Asp Gln Pro Thr Ala Met Ala
85 90 95Arg His Pro His Pro His
Leu Ser Phe Met Ala Ile Pro Pro Lys Lys 100
105 110Asp Gln Asp Lys Thr Glu Ile Pro Ala Ile Asn Thr
Ile Ala Ser Ala 115 120 125Glu Pro
Thr Val His Ser Thr Pro Thr Thr Glu Ala Val Val Asn Ala 130
135 140Val Asp Asn Pro Glu Ala Ser Ser Glu Ser Ile
Ala Ser Ala Pro Glu145 150 155
160Thr Asn Thr Ala Gln Val Thr Ser Thr Glu Val 165
17087165PRTBubalus bubalis 87Gln Glu Gln Asn Gln Glu Gln Pro
Ile Arg Cys Glu Lys Glu Glu Arg1 5 10
15Phe Phe Asn Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln Tyr
Val Leu 20 25 30Ser Arg Tyr
Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Lys Pro Val 35
40 45Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro
Tyr Tyr Ala Lys Pro 50 55 60Ala Ala
Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln Val Leu Pro65
70 75 80Asn Thr Val Pro Ala Lys Ser
Cys Gln Ala Gln Pro Thr Thr Met Thr 85 90
95Arg His Pro His Pro His Leu Ser Phe Met Ala Ile Pro
Pro Lys Lys 100 105 110Asn Gln
Asp Lys Thr Glu Ile Pro Thr Ile Asn Thr Ile Val Ser Val 115
120 125Glu Pro Thr Ser Thr Pro Thr Thr Glu Ala
Ile Glu Asn Thr Val Ala 130 135 140Thr
Leu Glu Ala Ser Ser Glu Val Ile Glu Ser Val Pro Glu Thr Asn145
150 155 160Thr Ala Gln Val Thr
16588162PRTCamelus dromedarius 88Glu Val Gln Asn Gln Glu Gln Pro
Thr Cys Phe Glu Lys Val Glu Arg1 5 10
15Leu Leu Asn Glu Lys Thr Val Lys Tyr Phe Pro Ile Gln Phe
Val Gln 20 25 30Ser Arg Tyr
Pro Ser Tyr Gly Ile Asn Tyr Tyr Gln His Arg Leu Ala 35
40 45Val Pro Ile Asn Asn Gln Phe Ile Pro Tyr Pro
Asn Tyr Ala Lys Pro 50 55 60Val Ala
Ile Arg Leu His Ala Gln Ile Pro Gln Cys Gln Ala Leu Pro65
70 75 80Asn Ile Asp Pro Pro Thr Val
Glu Arg Arg Pro Arg Pro Arg Pro Ser 85 90
95Phe Ile Ala Ile Pro Pro Lys Lys Thr Gln Asp Lys Thr
Val Asn Pro 100 105 110Ala Ile
Asn Thr Val Ala Thr Val Glu Pro Pro Val Ile Pro Thr Ala 115
120 125Glu Pro Ala Val Asn Thr Val Val Ile Ala
Glu Ala Ser Ser Glu Phe 130 135 140Ile
Thr Thr Ser Thr Pro Glu Thr Thr Thr Val Gln Ile Thr Ser Thr145
150 155 160Glu Ile89162PRTCamelus
bactrianus 89Glu Val Gln Asn Gln Glu Gln Pro Thr Cys Cys Glu Lys Val Glu
Arg1 5 10 15Leu Leu Asn
Glu Lys Thr Val Lys Tyr Phe Pro Ile Gln Phe Val Gln 20
25 30Ser Arg Tyr Pro Ser Tyr Gly Ile Asn Tyr
Tyr Gln His Arg Leu Ala 35 40
45Val Pro Ile Asn Asn Gln Phe Ile Pro Tyr Pro Asn Tyr Ala Lys Pro 50
55 60Val Ala Ile Arg Leu His Ala Gln Ile
Pro Gln Cys Gln Ala Leu Pro65 70 75
80Asn Ile Asp Pro Pro Thr Val Glu Arg Arg Pro Arg Pro Arg
Pro Ser 85 90 95Phe Ile
Ala Ile Pro Pro Lys Lys Thr Gln Asp Lys Thr Val Asn Pro 100
105 110Ala Ile Asn Thr Val Ala Thr Val Glu
Pro Pro Val Ile Pro Thr Ala 115 120
125Glu Pro Ala Val Asn Thr Val Val Ile Ala Glu Ala Ser Ser Glu Phe
130 135 140Ile Thr Thr Ser Thr Pro Glu
Thr Thr Thr Val Gln Ile Thr Ser Thr145 150
155 160Glu Ile90173PRTBos mutus 90Gln Glu Gln Asn Gln
Glu Gln Pro Ile Arg Cys Glu Lys Asp Glu Arg1 5
10 15Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile Pro
Ile Gln Tyr Val Leu 20 25
30Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Lys Pro Val
35 40 45Ala Leu Ile Asn Asn Gln Phe Leu
Pro Tyr Pro Tyr Tyr Ala Lys Pro 50 55
60Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln Val Leu Ser65
70 75 80Asn Thr Val Pro Ala
Lys Ser Cys Gln Ala Gln Pro Thr Thr Met Ala 85
90 95Arg His Pro His Pro His Leu Ser Phe Met Ala
Ile Pro Pro Lys Lys 100 105
110Asn Gln Asp Lys Thr Glu Ile Pro Thr Ile Asn Thr Ile Ala Ser Gly
115 120 125Glu Pro Thr Ser Thr Pro Thr
Thr Glu Ala Val Glu Ser Thr Val Ala 130 135
140Thr Leu Glu Ala Ser Pro Glu Ala Ser Pro Glu Val Ile Glu Ser
Pro145 150 155 160Pro Glu
Ile Asn Thr Val Gln Val Thr Ser Thr Ala Val 165
17091165PRTEquus caballus 91Glu Val Gln Asn Gln Glu Gln Pro Thr Cys
His Lys Asn Asp Glu Arg1 5 10
15Phe Phe Asp Leu Lys Thr Val Lys Tyr Ile Pro Ile Tyr Tyr Val Leu
20 25 30Asn Ser Ser Pro Arg Tyr
Glu Pro Ile Tyr Tyr Gln His Arg Leu Ala 35 40
45Leu Leu Ile Asn Asn Gln His Met Pro Tyr Gln Tyr Tyr Ala
Arg Pro 50 55 60Ala Ala Val Arg Pro
His Val Gln Ile Pro Gln Trp Gln Val Leu Pro65 70
75 80Asn Ile Tyr Pro Ser Thr Val Val Arg His
Pro Cys Pro His Pro Ser 85 90
95Phe Ile Ala Ile Pro Pro Lys Lys Leu Gln Glu Ile Thr Val Ile Pro
100 105 110Lys Ile Asn Thr Ile
Ala Thr Val Glu Pro Thr Pro Ile Pro Thr Pro 115
120 125Glu Pro Thr Val Asn Asn Ala Val Ile Pro Asp Ala
Ser Ser Glu Phe 130 135 140Ile Ile Ala
Ser Thr Pro Glu Thr Thr Thr Val Pro Val Thr Ser Pro145
150 155 160Val Val Gln Lys Leu
16592162PRTEquus asinus 92Glu Val Gln Asn Gln Glu Gln Pro Thr Cys Arg
Lys Asn Asp Glu Arg1 5 10
15Phe Phe Asp Leu Lys Thr Val Lys Tyr Ile Pro Ile Tyr Tyr Val Leu
20 25 30Asn Ser Ser Pro Arg Asn Glu
Pro Ile Tyr Tyr Gln His Arg Leu Ala 35 40
45Val Leu Ile Asn Asn Gln His Met Pro Tyr Gln Tyr Tyr Ala Arg
Pro 50 55 60Ala Ala Val Arg Pro His
Val Gln Ile Pro Gln Trp Gln Val Leu Pro65 70
75 80Asn Ile Tyr Pro Ser Thr Val Val Arg His Pro
Arg Pro His Pro Ser 85 90
95Phe Ile Ala Ile Pro Pro Lys Lys Leu Gln Glu Lys Thr Val Ile Pro
100 105 110Lys Ile Asn Thr Ile Ala
Thr Val Glu Pro Thr Pro Ile Pro Thr Pro 115 120
125Glu Pro Thr Val Asn Asn Ala Val Ile Pro Asp Ala Ser Ser
Glu Phe 130 135 140Ile Ile Ala Ser Thr
Pro Glu Thr Thr Thr Val Pro Val Thr Ser Pro145 150
155 160Val Val93122PRTRangifer tarandus 93Val
Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys1
5 10 15Pro Gly Ala Val Arg Ser Pro
Ala Gln Ile Leu Gln Trp Gln Val Leu 20 25
30Pro Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr
Thr Leu 35 40 45Ala Arg His Pro
His Pro Arg Leu Ser Phe Met Ala Ile Pro Pro Lys 50 55
60Lys Asn Gln Asp Lys Thr Asp Ile Pro Thr Ile Asn Thr
Ile Ala Thr65 70 75
80Val Glu Ser Thr Ile Thr Pro Thr Thr Glu Ala Ile Val Asp Thr Val
85 90 95Ala Thr Leu Glu Ala Ser
Ser Glu Val Ile Glu Ser Ala Pro Glu Thr 100
105 110Asn Thr Asp Gln Val Thr Ser Thr Val Val 115
12094141PRTAlces alces 94Lys Ile Val Lys Tyr Ile Pro Ile
Gln Tyr Ala Leu Ser Arg Tyr Pro1 5 10
15Ser Tyr Gly Leu Ser Tyr Tyr Gln His Arg Pro Val Ala Leu
Ile Asn 20 25 30Asn Gln Phe
Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro Gly Ala Val Arg 35
40 45Ser Pro Ala Gln Ile Leu Gln Trp Gln Val Leu
Pro Asn Thr Val Pro 50 55 60Ala Lys
Ser Cys Gln Ala Gln Pro Thr Thr Met Ala Arg His Pro Arg65
70 75 80Pro Arg Leu Ser Phe Met Ala
Ile Pro Pro Lys Lys Asn Gln Asp Lys 85 90
95Thr Asp Ile Pro Thr Ile Asn Thr Ile Ala Thr Val Glu
Ser Thr Ile 100 105 110Thr Pro
Thr Thr Glu Ala Ile Glu Asp Asn Val Ala Thr Leu Glu Ala 115
120 125Ser Ser Glu Val Ile Glu Ser Ala Pro Glu
Thr Asn Thr 130 135 14095162PRTVicugna
pacos 95Glu Val Gln Asn Gln Glu Gln Pro Thr Cys Cys Glu Lys Val Glu Arg1
5 10 15Leu Leu Asn Glu
Lys Thr Val Lys Tyr Phe Pro Ile Gln Phe Val Gln 20
25 30Ser Arg Tyr Pro Ser Tyr Gly Ile Asn Tyr Tyr
Gln His Arg Leu Ala 35 40 45Val
Pro Ile Asn Asn Gln Phe Ile Pro Tyr Pro Asn Tyr Ala Lys Pro 50
55 60Val Ala Ile Arg Leu His Ala Gln Ile Pro
Gln Cys Gln Ala Leu Pro65 70 75
80Asn Ile Asp Pro Pro Thr Val Glu Arg Arg Pro Arg Pro Arg Pro
Ser 85 90 95Phe Ile Ala
Ile Pro Pro Lys Lys Thr Gln Asp Lys Thr Val Ile Pro 100
105 110Ala Ile Asn Thr Val Ala Thr Ala Glu Pro
Pro Val Ile Pro Thr Ala 115 120
125Glu Pro Val Val Asn Thr Val Val Ile Ala Glu Ala Ser Ser Glu Phe 130
135 140Ile Thr Thr Ser Thr Pro Glu Thr
Thr Thr Val Gln Ile Thr Ser Thr145 150
155 160Glu Ile96160PRTBos indicus 96Arg Cys Glu Lys Asp
Glu Arg Phe Phe Ser Asp Lys Ile Ala Lys Tyr1 5
10 15Ile Pro Ile Gln Tyr Val Leu Ser Arg Tyr Pro
Ser Tyr Gly Leu Asn 20 25
30Tyr Tyr Gln Gln Lys Pro Val Ala Leu Ile Asn Asn Gln Phe Leu Pro
35 40 45Tyr Pro Tyr Tyr Ala Lys Pro Ala
Ala Val Arg Ser Pro Ala Gln Ile 50 55
60Leu Gln Trp Gln Val Leu Ser Asn Thr Val Pro Ala Lys Ser Cys Gln65
70 75 80Ala Gln Pro Thr Thr
Met Ala Arg His Pro His Pro His Leu Ser Phe 85
90 95Met Ala Ile Pro Pro Lys Lys Asn Gln Asp Lys
Thr Glu Ile Pro Thr 100 105
110Ile Asn Thr Ile Ala Ser Gly Glu Pro Thr Ser Thr Pro Thr Thr Glu
115 120 125Ala Val Glu Ser Thr Val Ala
Thr Leu Glu Asp Ser Pro Glu Val Ile 130 135
140Glu Ser Pro Pro Glu Ile Asn Thr Val Gln Val Thr Ser Thr Ala
Val145 150 155
16097162PRTLama glama 97Glu Val Gln Asn Gln Glu Gln Pro Thr Cys Cys Glu
Lys Val Glu Arg1 5 10
15Leu Leu Asn Glu Lys Thr Val Lys Tyr Phe Pro Ile Gln Phe Val Gln
20 25 30Ser Arg Tyr Pro Ser Tyr Gly
Ile Asn Tyr Tyr Gln His Arg Leu Ala 35 40
45Val Pro Ile Asn Asn Gln Phe Ile Pro Tyr Pro Asn Tyr Ala Lys
Pro 50 55 60Val Ala Ile Arg Leu His
Ala Gln Ile Pro Gln Cys Gln Ala Leu Pro65 70
75 80Asn Ile Asp Pro Pro Thr Val Glu Arg Arg Pro
Arg Pro Arg Pro Ser 85 90
95Phe Ile Ala Ile Pro Pro Lys Lys Thr Gln Asp Lys Thr Val Ile Pro
100 105 110Ala Ile Asn Thr Val Ala
Thr Val Glu Pro Pro Val Ile Pro Thr Ala 115 120
125Glu Pro Val Val Asn Thr Val Val Ile Ala Glu Ala Ser Ser
Glu Phe 130 135 140Ile Thr Thr Ser Thr
Pro Glu Thr Thr Thr Val Gln Ile Thr Ser Thr145 150
155 160Glu Ile98162PRTHomo sapiens 98Glu Val Gln
Asn Gln Lys Gln Pro Ala Cys His Glu Asn Asp Glu Arg1 5
10 15Pro Phe Tyr Gln Lys Thr Ala Pro Tyr
Val Pro Met Tyr Tyr Val Pro 20 25
30Asn Ser Tyr Pro Tyr Tyr Gly Thr Asn Leu Tyr Gln Arg Arg Pro Ala
35 40 45Ile Ala Ile Asn Asn Pro Tyr
Val Pro Arg Thr Tyr Tyr Ala Asn Pro 50 55
60Ala Val Val Arg Pro His Ala Gln Ile Pro Gln Arg Gln Tyr Leu Pro65
70 75 80Asn Ser His Pro
Pro Thr Val Val Arg Arg Pro Asn Leu His Pro Ser 85
90 95Phe Ile Ala Ile Pro Pro Lys Lys Ile Gln
Asp Lys Ile Ile Ile Pro 100 105
110Thr Ile Asn Thr Ile Ala Thr Val Glu Pro Thr Pro Ala Pro Ala Thr
115 120 125Glu Pro Thr Val Asp Ser Val
Val Thr Pro Glu Ala Phe Ser Glu Ser 130 135
140Ile Ile Thr Ser Thr Pro Glu Thr Thr Thr Val Ala Val Thr Pro
Pro145 150 155 160Thr
Ala99199PRTCapra hircus 99Arg Pro Lys His Pro Ile Asn His Arg Gly Leu Ser
Pro Glu Val Pro1 5 10
15Asn Glu Asn Leu Leu Arg Phe Val Val Ala Pro Phe Pro Glu Val Phe
20 25 30Arg Lys Glu Asn Ile Asn Glu
Leu Ser Lys Asp Ile Gly Ser Glu Ser 35 40
45Thr Glu Asp Gln Ala Met Glu Asp Ala Lys Gln Met Lys Ala Gly
Ser 50 55 60Ser Ser Ser Ser Glu Glu
Ile Val Pro Asn Ser Ala Glu Gln Lys Tyr65 70
75 80Ile Gln Lys Glu Asp Val Pro Ser Glu Arg Tyr
Leu Gly Tyr Leu Glu 85 90
95Gln Leu Leu Arg Leu Lys Lys Tyr Asn Val Pro Gln Leu Glu Ile Val
100 105 110Pro Lys Ser Ala Glu Glu
Gln Leu His Ser Met Lys Glu Gly Asn Pro 115 120
125Ala His Gln Lys Gln Pro Met Ile Ala Val Asn Gln Glu Leu
Ala Tyr 130 135 140Phe Tyr Pro Gln Leu
Phe Arg Gln Phe Tyr Gln Leu Asp Ala Tyr Pro145 150
155 160Ser Gly Ala Trp Tyr Tyr Leu Pro Leu Gly
Thr Gln Tyr Thr Asp Ala 165 170
175Pro Ser Phe Ser Asp Ile Pro Asn Pro Ile Gly Ser Glu Asn Ser Gly
180 185 190Lys Thr Thr Met Pro
Leu Trp 195100199PRTOvis aries 100Arg Pro Lys His Pro Ile Lys His
Gln Gly Leu Ser Ser Glu Val Leu1 5 10
15Asn Glu Asn Leu Leu Arg Phe Val Val Ala Pro Phe Pro Glu
Val Phe 20 25 30Arg Lys Glu
Asn Ile Asn Glu Leu Ser Lys Asp Ile Gly Ser Glu Ser 35
40 45Ile Glu Asp Gln Ala Met Glu Asp Ala Lys Gln
Met Lys Ala Gly Ser 50 55 60Ser Ser
Ser Ser Glu Glu Ile Val Pro Asn Ser Ala Glu Gln Lys Tyr65
70 75 80Ile Gln Lys Glu Asp Val Pro
Ser Glu Arg Tyr Leu Gly Tyr Leu Glu 85 90
95Gln Leu Leu Arg Leu Lys Lys Tyr Asn Val Pro Gln Leu
Glu Ile Val 100 105 110Pro Lys
Ser Ala Glu Glu Gln Leu His Ser Met Lys Glu Gly Asn Pro 115
120 125Ala His Gln Lys Gln Pro Met Ile Ala Val
Asn Gln Glu Leu Ala Tyr 130 135 140Phe
Tyr Pro Gln Leu Phe Arg Gln Phe Tyr Gln Leu Asp Ala Tyr Pro145
150 155 160Ser Gly Ala Trp Tyr Tyr
Leu Pro Leu Gly Thr Gln Tyr Thr Asp Ala 165
170 175Pro Ser Phe Ser Asp Ile Pro Asn Pro Ile Gly Ser
Glu Asn Ser Gly 180 185 190Lys
Ile Thr Met Pro Leu Trp 195101199PRTBubalus bubalis 101Arg Pro Lys
Gln Pro Ile Lys His Gln Gly Leu Pro Gln Gly Val Leu1 5
10 15Asn Glu Asn Leu Leu Arg Phe Phe Val
Ala Pro Phe Pro Glu Val Phe 20 25
30Gly Lys Glu Lys Val Asn Glu Leu Ser Thr Asp Ile Gly Ser Glu Ser
35 40 45Thr Glu Asp Gln Ala Met Glu
Asp Ile Lys Gln Met Glu Ala Glu Ser 50 55
60Ile Ser Ser Ser Glu Glu Ile Val Pro Ile Ser Val Glu Gln Lys His65
70 75 80Ile Gln Lys Glu
Asp Val Pro Ser Glu Arg Tyr Leu Gly Tyr Leu Glu 85
90 95Gln Leu Leu Arg Leu Lys Lys Tyr Asn Val
Pro Gln Leu Glu Ile Val 100 105
110Pro Asn Leu Ala Glu Glu Gln Leu His Ser Met Lys Glu Gly Ile His
115 120 125Ala Gln Gln Lys Glu Pro Met
Ile Gly Val Asn Gln Glu Leu Ala Tyr 130 135
140Phe Tyr Pro Gln Leu Phe Arg Gln Phe Tyr Gln Leu Asp Ala Tyr
Pro145 150 155 160Ser Gly
Ala Trp Tyr Tyr Val Pro Leu Gly Thr Gln Tyr Pro Asp Ala
165 170 175Pro Ser Phe Ser Asp Ile Pro
Asn Pro Ile Gly Ser Glu Asn Ser Gly 180 185
190Lys Thr Thr Met Pro Leu Trp 195102154PRTCamelus
dromedarius 102Asp Thr Glu Arg Lys Glu Ser Gly Ser Ser Ser Ser Glu Glu
Val Val1 5 10 15Ser Ser
Thr Thr Glu Gln Lys Asp Ile Leu Lys Glu Asp Met Pro Ser 20
25 30Gln Arg Tyr Leu Glu Glu Leu His Arg
Leu Asn Lys Tyr Lys Leu Leu 35 40
45Gln Leu Glu Ala Ile Arg Asp Gln Lys Leu Ile Pro Arg Val Lys Leu 50
55 60Ser Ser His Pro Tyr Leu Glu Gln Leu
Tyr Arg Ile Asn Glu Asp Asn65 70 75
80His Pro Gln Leu Gly Glu Pro Val Lys Val Val Thr Gln Glu
Gln Ala 85 90 95Tyr Phe
His Leu Glu Pro Phe Pro Gln Phe Phe Gln Leu Gly Ala Ser 100
105 110Pro Tyr Val Ala Trp Tyr Tyr Pro Pro
Gln Val Met Gln Tyr Ile Ala 115 120
125His Pro Ser Ser Tyr Asp Thr Pro Glu Gly Ile Ala Ser Glu Asp Gly
130 135 140Gly Lys Thr Asp Val Met Pro
Gln Trp Trp145 150103207PRTCamelus bactrianus 103Arg Pro
Lys Tyr Pro Leu Arg Tyr Pro Glu Val Phe Gln Asn Glu Pro1 5
10 15Asp Ser Ile Glu Glu Val Leu Asn
Lys Arg Lys Ile Leu Glu Leu Ala 20 25
30Val Val Ser Pro Ile Gln Phe Arg Gln Glu Asn Ile Asp Glu Leu
Lys 35 40 45Asp Thr Arg Asn Glu
Pro Thr Glu Asp His Ile Met Glu Asp Thr Glu 50 55
60Arg Lys Glu Ser Gly Ser Ser Ser Ser Glu Glu Val Val Ser
Ser Thr65 70 75 80Thr
Glu Gln Lys Asp Ile Leu Lys Glu Asp Met Pro Ser Gln Arg Tyr
85 90 95Leu Glu Glu Leu His Arg Leu
Asn Lys Tyr Lys Leu Leu Gln Leu Glu 100 105
110Ala Ile Arg Asp Gln Lys Leu Ile Pro Arg Val Lys Leu Ser
Ser His 115 120 125Pro Tyr Leu Glu
Gln Leu Tyr Arg Ile Asn Glu Asp Asn His Pro Gln 130
135 140Leu Gly Glu Pro Val Lys Val Val Thr Gln Pro Phe
Pro Gln Phe Phe145 150 155
160Gln Leu Gly Ala Ser Pro Tyr Val Ala Trp Tyr Tyr Pro Pro Gln Val
165 170 175Met Gln Tyr Ile Ala
His Pro Ser Ser Tyr Asp Thr Pro Glu Gly Ile 180
185 190Ala Ser Glu Asp Gly Gly Lys Thr Asp Val Met Pro
Gln Trp Trp 195 200
205104199PRTBos mutus 104Arg Pro Lys His Pro Ile Lys His Gln Gly Leu Pro
Gln Glu Val Leu1 5 10
15Asn Glu Asn Leu Leu Arg Phe Phe Val Ala Pro Phe Pro Glu Val Phe
20 25 30Gly Lys Glu Lys Val Asn Glu
Leu Ser Lys Asp Ile Gly Ser Glu Ser 35 40
45Thr Glu Asp Gln Ala Met Glu Asp Ile Lys Gln Met Glu Ala Glu
Ser 50 55 60Ile Ser Ser Ser Glu Glu
Ile Val Pro Asn Ser Val Glu Gln Lys His65 70
75 80Ile Gln Lys Glu Asp Val Pro Ser Glu His Tyr
Leu Gly Tyr Leu Glu 85 90
95Gln Leu Leu Arg Leu Lys Lys Tyr Lys Val Pro Gln Leu Glu Ile Val
100 105 110Pro Asn Ser Ala Glu Glu
Arg Leu His Ser Met Lys Glu Gly Ile His 115 120
125Ala Gln Gln Lys Glu Pro Met Ile Gly Val Asn Gln Glu Leu
Ala Tyr 130 135 140Phe Tyr Pro Glu Leu
Phe Arg Gln Phe Tyr Gln Leu Asp Ala Tyr Pro145 150
155 160Ser Gly Ala Trp Tyr Tyr Val Pro Leu Gly
Thr Gln Tyr Thr Asp Ala 165 170
175Pro Ser Phe Ser Asp Ile Pro Asn Pro Ile Gly Ser Glu Asn Ser Gly
180 185 190Lys Thr Thr Met Pro
Leu Trp 195105226PRTEquus caballus 105Arg Glu Lys Glu Glu Leu Asn
Val Ser Ser Glu Thr Val Glu Ser Leu1 5 10
15Ser Ser Asn Glu Pro Asp Ser Ser Ser Glu Glu Ser Ile
Thr His Ile 20 25 30Asn Lys
Glu Lys Leu Gln Lys Phe Lys His Glu Gly Gln Gln Gln Arg 35
40 45Glu Val Glu Arg Gln Asp Lys Ile Ser Arg
Phe Val Gln Pro Gln Pro 50 55 60Val
Val Tyr Pro Tyr Ala Glu Pro Val Pro Tyr Ala Val Val Pro Gln65
70 75 80Ser Ile Leu Pro Leu Ala
Gln Pro Pro Ile Leu Pro Phe Leu Gln Pro 85
90 95Glu Ile Met Glu Val Ser Gln Ala Lys Glu Thr Ile
Leu Pro Lys Arg 100 105 110Lys
Val Met Pro Phe Leu Lys Ser Pro Ile Val Pro Phe Ser Glu Arg 115
120 125Gln Ile Leu Asn Pro Thr Asn Gly Glu
Asn Leu Arg Leu Pro Val His 130 135
140Leu Ile Gln Pro Phe Met His Gln Val Pro Gln Ser Leu Leu Gln Thr145
150 155 160Leu Met Leu Pro
Ser Gln Pro Val Leu Ser Pro Pro Gln Ser Lys Val 165
170 175Ala Pro Phe Pro Gln Pro Val Val Pro Tyr
Pro Gln Arg Asp Thr Pro 180 185
190Val Gln Ala Phe Leu Leu Tyr Gln Asp Pro Arg Leu Gly Pro Thr Gly
195 200 205Glu Leu Asp Pro Ala Thr Gln
Pro Ile Val Ala Val His Asn Pro Val 210 215
220Ile Val225106202PRTEquus asinus 106Arg Pro Lys Leu Pro His Arg
His Pro Glu Ile Ile Gln Asn Glu Gln1 5 10
15Asp Ser Arg Glu Lys Val Leu Lys Glu Arg Lys Phe Pro
Ser Phe Ala 20 25 30Leu His
Thr Pro Arg Glu Glu Tyr Ile Asn Glu Leu Asn Arg Gln Arg 35
40 45Glu Leu Leu Lys Glu Lys Gln Lys Asp Glu
His Lys Glu Tyr Leu Ile 50 55 60Glu
Asp Pro Glu Gln Gln Glu Ser Ser Ser Thr Ser Ser Ser Glu Glu65
70 75 80Val Val Pro Ile Asn Thr
Glu Gln Lys Arg Ile Pro Arg Glu Asp Met 85
90 95Leu Tyr Gln His Thr Leu Glu Gln Leu Arg Arg Leu
Ser Lys Tyr Asn 100 105 110Gln
Leu Gln Leu Gln Ala Ile Tyr Ala Gln Glu Gln Leu Ile Arg Met 115
120 125Lys Glu Asn Ser Gln Arg Lys Pro Met
Arg Val Val Asn Gln Glu Gln 130 135
140Ala Tyr Phe Tyr Leu Glu Pro Phe Gln Pro Ser Tyr Gln Leu Asp Val145
150 155 160Tyr Pro Tyr Ala
Ala Trp Phe His Pro Ala Gln Ile Met Gln His Val 165
170 175Ala Tyr Ser Pro Phe His Asp Thr Ala Lys
Leu Ile Ala Ser Glu Asn 180 185
190Ser Glu Lys Thr Asp Ile Ile Pro Glu Trp 195
200107199PRTBos indicusSITE(84)..(84)Xaa can be any naturally occurring
amino acid 107Arg Pro Lys His Pro Ile Lys His Gln Gly Leu Pro Gln Glu Val
Leu1 5 10 15Asn Glu Asn
Leu Leu Arg Phe Phe Val Ala Pro Phe Pro Glu Val Phe 20
25 30Gly Lys Glu Lys Val Asn Glu Leu Ser Lys
Asp Ile Gly Ser Glu Ser 35 40
45Thr Glu Asp Gln Ala Met Glu Asp Ile Lys Gln Met Glu Ala Glu Ser 50
55 60Ile Ser Ser Ser Glu Glu Ile Val Pro
Asn Ser Val Glu Gln Lys His65 70 75
80Ile Gln Lys Xaa Asp Val Pro Ser Glu Arg Tyr Leu Gly Tyr
Leu Glu 85 90 95Gln Leu
Leu Arg Leu Lys Lys Tyr Lys Val Pro Gln Leu Glu Ile Val 100
105 110Pro Asn Ser Ala Glu Glu Arg Leu His
Ser Met Lys Glu Gly Ile His 115 120
125Ala Gln Gln Lys Glu Pro Met Ile Gly Val Asn Gln Glu Leu Ala Tyr
130 135 140Phe Tyr Pro Glu Leu Phe Arg
Gln Phe Tyr Gln Leu Asp Ala Tyr Pro145 150
155 160Ser Gly Ala Trp Tyr Tyr Val Pro Leu Gly Thr Gln
Tyr Thr Asp Ala 165 170
175Pro Ser Phe Ser Asp Ile Pro Asn Pro Ile Gly Ser Glu Asn Ser Gly
180 185 190Lys Thr Thr Met Pro Leu
Trp 195108215PRTLama glama 108Arg Pro Lys Tyr Pro Leu Arg Tyr Pro
Glu Val Phe Gln Asn Glu Pro1 5 10
15Asp Ser Ile Gln Glu Val Leu Asn Lys Arg Lys Ile Leu Glu Leu
Ala 20 25 30Val Val Ser Pro
Ile Gln Phe Arg Gln Glu Asn Ile Asp Glu Leu Lys 35
40 45Asp Thr Arg Asn Glu Pro Thr Glu Asp His Ile Met
Glu Asp Thr Glu 50 55 60Arg Thr Val
Ser Gly Ser Ser Ser Ser Glu Glu Val Val Ser Ser Thr65 70
75 80Thr Glu Gln Lys Asp Ile Leu Lys
Glu Asp Met Pro Ser Gln Arg Ile 85 90
95Leu Glu Glu Leu His Arg Leu Asn Lys Tyr Lys Leu Leu Gln
Leu Glu 100 105 110Ala Ile Arg
Asp Gln Lys Leu Ile Pro Arg Val Lys Leu Ser Ser His 115
120 125Pro Tyr Leu Glu Gln Leu Tyr Arg Ile Asn Glu
Asp Asn His Pro Gln 130 135 140Leu Gly
Glu Pro Val Lys Val Val Thr Gln Glu Gln Ala Tyr Phe His145
150 155 160Leu Glu Pro Phe Gln Gln Phe
Phe Gln Leu Gly Ala Ser Pro Tyr Val 165
170 175Ala Trp Tyr Tyr Pro Pro Gln Val Met Gln Tyr Ile
Ala His Pro Ser 180 185 190Ser
His Asp Thr Pro Glu Gly Ile Ala Ser Glu Asp Gly Gly Lys Thr 195
200 205Asp Val Met Pro Gln Trp Trp 210
215109170PRTHomo sapiens 109Arg Pro Lys Leu Pro Leu Arg Tyr
Pro Glu Arg Leu Gln Asn Pro Ser1 5 10
15Glu Ser Ser Glu Pro Ile Pro Leu Glu Ser Arg Glu Glu Tyr
Met Asn 20 25 30Gly Met Asn
Arg Gln Arg Asn Ile Leu Arg Glu Lys Gln Thr Asp Glu 35
40 45Ile Lys Asp Thr Arg Asn Glu Ser Thr Gln Asn
Cys Val Val Ala Glu 50 55 60Pro Glu
Lys Met Glu Ser Ser Ile Ser Ser Ser Ser Glu Glu Met Ser65
70 75 80Leu Ser Lys Cys Ala Glu Gln
Phe Cys Arg Leu Asn Glu Tyr Asn Gln 85 90
95Leu Gln Leu Gln Ala Ala His Ala Gln Glu Gln Ile Arg
Arg Met Asn 100 105 110Glu Asn
Ser His Val Gln Val Pro Phe Gln Gln Leu Asn Gln Leu Ala 115
120 125Ala Tyr Pro Tyr Ala Val Trp Tyr Tyr Pro
Gln Ile Met Gln Tyr Val 130 135 140Pro
Phe Pro Pro Phe Ser Asp Ile Ser Asn Pro Thr Ala His Glu Asn145
150 155 160Tyr Glu Lys Asn Asn Val
Met Leu Gln Trp 165 170110208PRTCapra
hircus 110Lys His Lys Met Glu His Val Ser Ser Ser Glu Glu Pro Ile Asn
Ile1 5 10 15Phe Gln Glu
Ile Tyr Lys Gln Glu Lys Asn Met Ala Ile His Pro Arg 20
25 30Lys Glu Lys Leu Cys Thr Thr Ser Cys Glu
Glu Val Val Arg Asn Ala 35 40
45Asn Glu Glu Glu Tyr Ser Ile Arg Ser Ser Ser Glu Glu Ser Ala Glu 50
55 60Val Ala Pro Glu Glu Ile Lys Ile Thr
Val Asp Asp Lys His Tyr Gln65 70 75
80Lys Ala Leu Asn Glu Ile Asn Gln Phe Tyr Gln Lys Phe Pro
Gln Tyr 85 90 95Leu Gln
Tyr Pro Tyr Gln Gly Pro Ile Val Leu Asn Pro Trp Asp Gln 100
105 110Val Lys Arg Asn Ala Gly Pro Phe Thr
Pro Thr Val Asn Arg Glu Gln 115 120
125Leu Ser Thr Ser Glu Glu Asn Ser Lys Lys Thr Ile Asp Met Glu Ser
130 135 140Thr Glu Val Phe Thr Lys Lys
Thr Lys Leu Thr Glu Glu Glu Lys Asn145 150
155 160Arg Leu Asn Phe Leu Lys Lys Ile Ser Gln Tyr Tyr
Gln Lys Phe Ala 165 170
175Trp Pro Gln Tyr Leu Lys Thr Val Asp Gln His Gln Lys Ala Met Lys
180 185 190Pro Trp Thr Gln Pro Lys
Thr Asn Ala Ile Pro Tyr Val Arg Tyr Leu 195 200
205111208PRTOvis aries 111Lys His Lys Met Glu His Val Ser
Ser Ser Glu Glu Pro Ile Asn Ile1 5 10
15Ser Gln Glu Ile Tyr Lys Gln Glu Lys Asn Met Ala Ile His
Pro Arg 20 25 30Lys Glu Lys
Leu Cys Thr Thr Ser Cys Glu Glu Val Val Arg Asn Ala 35
40 45Asp Glu Glu Glu Tyr Ser Ile Arg Ser Ser Ser
Glu Glu Ser Ala Glu 50 55 60Val Ala
Pro Glu Glu Val Lys Ile Thr Val Asp Asp Lys His Tyr Gln65
70 75 80Lys Ala Leu Asn Glu Ile Asn
Gln Phe Tyr Gln Lys Phe Pro Gln Tyr 85 90
95Leu Gln Tyr Leu Tyr Gln Gly Pro Ile Val Leu Asn Pro
Trp Asp Gln 100 105 110Val Lys
Arg Asn Ala Gly Pro Phe Thr Pro Thr Val Asn Arg Glu Gln 115
120 125Leu Ser Thr Ser Glu Glu Asn Ser Lys Lys
Thr Ile Asp Met Glu Ser 130 135 140Thr
Glu Val Phe Thr Lys Lys Thr Lys Leu Thr Glu Glu Glu Lys Asn145
150 155 160Arg Leu Asn Phe Leu Lys
Lys Ile Ser Gln Tyr Tyr Gln Lys Phe Ala 165
170 175Trp Pro Gln Tyr Leu Lys Thr Val Asp Gln His Gln
Lys Ala Met Lys 180 185 190Pro
Trp Thr Gln Pro Lys Thr Asn Ala Ile Pro Tyr Val Arg Tyr Leu 195
200 205112207PRTBubalus bubalis 112Lys His
Thr Met Glu His Val Ser Ser Ser Glu Glu Ser Ile Ile Ser1 5
10 15Gln Glu Thr Tyr Lys Gln Glu Lys
Asn Met Ala Ile His Pro Ser Lys 20 25
30Glu Asn Leu Cys Ser Thr Phe Cys Lys Glu Val Ile Arg Asn Ala
Asn 35 40 45Glu Glu Glu Tyr Ser
Ile Gly Ser Ser Ser Glu Glu Ser Ala Glu Val 50 55
60Ala Thr Glu Glu Val Lys Ile Thr Val Asp Asp Lys His Tyr
Gln Lys65 70 75 80Ala
Leu Asn Glu Ile Asn Gln Phe Tyr Gln Lys Phe Pro Gln Tyr Leu
85 90 95Gln Tyr Leu Tyr Gln Gly Pro
Ile Val Leu Asn Pro Trp Asp Gln Val 100 105
110Lys Arg Asn Ala Val Pro Ile Thr Pro Thr Leu Asn Arg Glu
Gln Leu 115 120 125Ser Thr Ser Glu
Glu Asn Ser Lys Lys Thr Val Asp Met Glu Ser Thr 130
135 140Glu Val Ile Thr Lys Lys Thr Lys Leu Thr Glu Glu
Asp Lys Asn Arg145 150 155
160Leu Asn Phe Leu Lys Lys Ile Ser Gln His Tyr Gln Lys Phe Thr Trp
165 170 175Pro Gln Tyr Leu Lys
Thr Val Tyr Gln Tyr Gln Lys Ala Met Lys Pro 180
185 190Trp Thr Gln Pro Lys Thr Asn Val Ile Pro Tyr Val
Arg Tyr Leu 195 200
205113178PRTCamelus dromedarius 113Lys His Glu Met Asp Gln Gly Ser Ser
Ser Glu Glu Ser Ile Asn Val1 5 10
15Ser Gln Gln Lys Phe Lys Gln Val Lys Lys Val Ala Ile His Pro
Ser 20 25 30Lys Glu Asp Ile
Cys Ser Thr Phe Cys Glu Glu Ala Val Arg Asn Ile 35
40 45Lys Glu Val Glu Ser Ala Glu Val Pro Thr Glu Asn
Lys Ile Ser Gln 50 55 60Phe Tyr Gln
Lys Trp Lys Phe Leu Gln Tyr Leu Gln Ala Leu His Gln65 70
75 80Gly Gln Ile Val Met Asn Pro Trp
Asp Gln Gly Lys Thr Arg Ala Tyr 85 90
95Pro Phe Ile Pro Thr Val Asn Thr Glu Gln Leu Ser Ile Ser
Glu Glu 100 105 110Ser Thr Glu
Val Pro Thr Glu Glu Ser Thr Glu Val Phe Thr Lys Lys 115
120 125Thr Glu Leu Thr Glu Glu Glu Lys Asp His Gln
Lys Phe Leu Asn Lys 130 135 140Ile Tyr
Gln Tyr Tyr Gln Thr Phe Leu Trp Pro Glu Tyr Leu Lys Thr145
150 155 160Val Tyr Gln Tyr Gln Lys Thr
Met Thr Pro Trp Asn His Ile Lys Arg 165
170 175Tyr Phe114178PRTCamelus bactrianus 114Lys His Glu
Met Asp Gln Gly Ser Ser Ser Glu Glu Ser Ile Asn Val1 5
10 15Ser Gln Gln Lys Phe Lys Gln Val Lys
Lys Val Ala Ile His Pro Ser 20 25
30Lys Glu Asp Ile Cys Ser Thr Phe Cys Glu Glu Ala Val Arg Asn Ile
35 40 45Lys Glu Val Glu Ser Ala Glu
Val Pro Thr Glu Asn Lys Ile Ser Gln 50 55
60Phe Tyr Gln Lys Trp Lys Phe Leu Gln Tyr Leu Gln Ala Leu His Gln65
70 75 80Gly Gln Ile Val
Met Asn Pro Trp Asp Gln Gly Lys Thr Arg Ala Tyr 85
90 95Pro Phe Ile Pro Thr Val Asn Thr Glu Gln
Leu Ser Ile Ser Glu Glu 100 105
110Ser Thr Glu Val Pro Thr Glu Glu Ser Thr Glu Val Phe Asn Lys Lys
115 120 125Thr Glu Leu Thr Glu Glu Glu
Lys Asp His Gln Lys Phe Leu Asn Lys 130 135
140Ile Tyr Gln Tyr Tyr Gln Thr Phe Leu Trp Pro Glu Tyr Leu Lys
Thr145 150 155 160Val Tyr
Gln Tyr Gln Lys Thr Met Thr Pro Trp Asn His Ile Lys Arg
165 170 175Tyr Phe115204PRTBos mutus
115Lys Asn Thr Met Glu His Val Ser Ser Ser Glu Glu Ser Ile Ile Ser1
5 10 15Gln Glu Thr Tyr Lys Gln
Glu Lys Asn Met Ala Ile Asn Pro Ser Lys 20 25
30Gly Asn Leu Cys Ser Thr Phe Cys Lys Glu Val Val Arg
Asn Ala Asn 35 40 45Glu Glu Glu
Tyr Ser Ile Gly Ser Ser Ser Glu Glu Ser Ala Glu Val 50
55 60Ala Thr Glu Glu Val Lys Ile Thr Val Asp Asp Lys
His Tyr Gln Lys65 70 75
80Ala Leu Asn Glu Ile Asn Gln Phe Tyr Gln Lys Phe Pro Gln Tyr Leu
85 90 95Gln Tyr Leu Tyr Gln Gly
Pro Ile Val Leu Asn Pro Trp Asp Gln Val 100
105 110Lys Arg Asn Ala Val Pro Ile Thr Pro Thr Leu Asn
Arg Glu Gln Leu 115 120 125Ser Thr
Ser Glu Glu Asn Ser Lys Lys Thr Val Asp Met Glu Ser Thr 130
135 140Glu Val Phe Thr Lys Lys Thr Lys Leu Thr Glu
Glu Glu Lys Asn Arg145 150 155
160Leu Asn Phe Leu Lys Lys Ile Ser Gln Arg Tyr Gln Lys Phe Ala Leu
165 170 175Pro Gln Tyr Leu
Lys Thr Val Tyr Gln His Gln Lys Ala Met Lys Pro 180
185 190Trp Ile Gln Pro Lys Thr Lys Val Ile Pro Tyr
Val 195 200116216PRTEquus caballus 116Lys His Asn
Met Glu His Arg Ser Ser Ser Glu Asp Ser Val Asn Ile1 5
10 15Ser Gln Glu Lys Phe Lys Gln Glu Lys
Tyr Val Val Ile Pro Thr Ser 20 25
30Lys Glu Ser Ile Cys Ser Thr Ser Cys Glu Glu Ala Thr Arg Asn Ile
35 40 45Asn Glu Met Glu Ser Ala Lys
Phe Pro Thr Glu Val Tyr Ser Ser Ser 50 55
60Ser Ser Ser Glu Glu Ser Ala Lys Phe Pro Thr Glu Arg Glu Glu Lys65
70 75 80Glu Val Glu Glu
Lys His His Leu Lys Gln Leu Asn Lys Ile Asn Gln 85
90 95Phe Tyr Glu Lys Leu Asn Phe Leu Gln Tyr
Leu Gln Ala Leu Arg Gln 100 105
110Pro Arg Ile Val Leu Thr Pro Trp Asp Gln Thr Lys Thr Gly Asp Ser
115 120 125Pro Phe Ile Pro Ile Val Asn
Thr Glu Gln Leu Phe Thr Ser Glu Glu 130 135
140Ile Pro Lys Lys Thr Val Asp Met Glu Ser Thr Glu Val Val Thr
Glu145 150 155 160Lys Thr
Glu Leu Thr Glu Glu Glu Lys Asn Tyr Leu Lys Leu Leu Tyr
165 170 175Tyr Glu Lys Phe Thr Leu Pro
Gln Tyr Phe Lys Ile Val Arg Gln His 180 185
190Gln Thr Thr Met Asp Pro Arg Ser His Arg Lys Thr Asn Ser
Tyr Gln 195 200 205Ile Ile Pro Val
Leu Arg Tyr Phe 210 215117221PRTEquus asinus 117Lys
His Asn Met Glu His Arg Ser Ser Ser Glu Asp Ser Val Asn Ile1
5 10 15Ser Gln Glu Lys Phe Lys Gln
Glu Lys Tyr Val Val Ile Pro Thr Ser 20 25
30Lys Glu Ser Ile Cys Ser Thr Ser Cys Glu Glu Ala Thr Arg
Asn Ile 35 40 45Asn Glu Met Glu
Ser Ala Lys Phe Pro Thr Glu Val Tyr Ser Ser Ser 50 55
60Ser Ser Ser Glu Glu Ser Ala Lys Phe Pro Thr Glu Arg
Glu Glu Lys65 70 75
80Glu Val Glu Glu Lys His His Leu Lys Gln Leu Asn Lys Ile Asn Gln
85 90 95Phe Tyr Glu Lys Leu Asn
Phe Leu Gln Tyr Leu Gln Ala Leu Arg Gln 100
105 110Pro Arg Ile Val Leu Thr Pro Trp Asp Gln Thr Lys
Thr Gly Ala Ser 115 120 125Pro Phe
Ile Pro Ile Val Asn Thr Glu Gln Leu Phe Thr Ser Glu Glu 130
135 140Ile Pro Lys Lys Thr Val Asp Met Glu Ser Thr
Glu Val Val Thr Glu145 150 155
160Lys Thr Glu Leu Thr Glu Glu Glu Lys Asn Tyr Leu Lys Leu Leu Asn
165 170 175Lys Ile Asn Gln
Tyr Tyr Glu Lys Phe Thr Leu Pro Gln Tyr Phe Lys 180
185 190Ile Val His Gln His Gln Thr Thr Met Asp Pro
Gln Ser His Ser Lys 195 200 205Thr
Asn Ser Tyr Gln Ile Ile Pro Val Leu Arg Tyr Phe 210
215 220118192PRTVicugna pacos 118Lys His Glu Met Asp Gln
Gly Ser Ser Ser Glu Glu Ser Ile Asn Val1 5
10 15Ser Gln Gln Lys Leu Lys Gln Val Lys Lys Val Ala
Ile His Pro Ser 20 25 30Lys
Glu Asp Ile Cys Ser Thr Phe Cys Glu Glu Ala Val Arg Asn Ile 35
40 45Lys Glu Val Glu Ser Val Glu Val Pro
Thr Glu Asn Lys Ile Ser Gln 50 55
60Phe Tyr Gln Lys Trp Lys Phe Leu Gln Tyr Leu Gln Ala Leu His Gln65
70 75 80Gly Gln Ile Val Met
Asn Pro Trp Asp Gln Gly Lys Thr Met Val Tyr 85
90 95Pro Phe Ile Pro Thr Val Asn Thr Glu Gln Leu
Ser Ile Ser Glu Glu 100 105
110Ser Thr Glu Val Pro Thr Glu Glu Ser Thr Glu Val Phe Thr Lys Lys
115 120 125Thr Glu Leu Thr Glu Glu Glu
Lys Asp His Gln Lys Phe Leu Asn Lys 130 135
140Ile Tyr Gln Tyr Tyr Gln Thr Phe Leu Trp Pro Glu Tyr Leu Lys
Thr145 150 155 160Val Tyr
Gln Tyr Gln Lys Thr Met Thr Pro Trp Asn His Ile Lys Val
165 170 175Lys Ala Tyr Gln Ile Ile Pro
Asn Leu Val Ser Ser Thr Phe Tyr Leu 180 185
190119207PRTBos indicus 119Lys Asn Thr Met Glu His Val Ser
Ser Ser Glu Glu Ser Ile Ile Ser1 5 10
15Gln Glu Thr Tyr Lys Gln Glu Lys Asn Met Ala Ile Asn Pro
Ser Lys 20 25 30Glu Asn Leu
Cys Ser Thr Phe Cys Lys Glu Val Val Arg Asn Ala Asn 35
40 45Glu Glu Glu Tyr Ser Ile Gly Ser Ser Ser Glu
Glu Ser Ala Glu Val 50 55 60Ala Thr
Glu Glu Val Lys Ile Thr Val Asp Asp Lys His Tyr Gln Lys65
70 75 80Ala Leu Asn Glu Ile Asn Gln
Phe Tyr Gln Lys Phe Pro Gln Tyr Leu 85 90
95Gln Tyr Leu Tyr Gln Gly Pro Ile Val Leu Asn Pro Trp
Asp Gln Val 100 105 110Lys Arg
Asn Ala Val Pro Ile Thr Pro Thr Leu Asn Arg Glu Gln Leu 115
120 125Ser Thr Ser Glu Glu Asn Ser Lys Lys Thr
Val Asp Met Glu Ser Thr 130 135 140Glu
Val Phe Thr Lys Lys Thr Lys Leu Thr Glu Glu Glu Lys Asn Arg145
150 155 160Leu Asn Phe Leu Lys Lys
Ile Ser Gln Arg Tyr Gln Lys Phe Ala Leu 165
170 175Pro Gln Tyr Leu Lys Thr Val Tyr Gln His Gln Lys
Ala Met Lys Pro 180 185 190Trp
Ile Gln Pro Lys Thr Lys Val Ile Pro Tyr Val Arg Tyr Leu 195
200 205120187PRTLama glama 120Lys His Glu Met
Asp Gln Gly Ser Ser Ser Glu Glu Ser Ile Asn Val1 5
10 15Ser Gln Gln Lys Leu Lys Gln Val Lys Lys
Val Ala Ile His Pro Ser 20 25
30Lys Glu Asp Ile Cys Ser Thr Phe Cys Glu Glu Ala Val Arg Asn Ile
35 40 45Lys Glu Val Glu Ser Val Glu Val
Pro Thr Glu Asn Lys Ile Ser Gln 50 55
60Phe Tyr Gln Lys Trp Lys Phe Leu Gln Tyr Leu Gln Ala Leu His Gln65
70 75 80Gly Gln Ile Val Met
Asn Pro Trp Asp Gln Gly Lys Thr Met Val Tyr 85
90 95Pro Phe Ile Pro Thr Val Asn Thr Glu Gln Leu
Ser Ile Ser Glu Glu 100 105
110Ser Thr Glu Val Pro Thr Glu Glu Asn Ser Lys Lys Thr Val Asp Thr
115 120 125Glu Ser Thr Glu Val Phe Thr
Lys Lys Thr Glu Leu Thr Glu Glu Glu 130 135
140Lys Asp His Gln Lys Phe Leu Asn Lys Ile Tyr Gln Tyr Tyr Gln
Thr145 150 155 160Phe Leu
Trp Pro Glu Tyr Leu Lys Thr Val Tyr Gln Tyr Gln Lys Thr
165 170 175Met Thr Pro Trp Asn His Ile
Lys Arg Tyr Phe 180 185121207PRTCapra hircus
121Arg Glu Gln Glu Glu Leu Asn Val Val Gly Glu Thr Val Glu Ser Leu1
5 10 15Ser Ser Ser Glu Glu Ser
Ile Thr His Ile Asn Lys Lys Ile Glu Lys 20 25
30Phe Gln Ser Glu Glu Gln Gln Gln Thr Glu Asp Glu Leu
Gln Asp Lys 35 40 45Ile His Pro
Phe Ala Gln Ala Gln Ser Leu Val Tyr Pro Phe Thr Gly 50
55 60Pro Ile Pro Asn Ser Leu Pro Gln Asn Ile Leu Pro
Leu Thr Gln Thr65 70 75
80Pro Val Val Val Pro Pro Phe Leu Gln Pro Glu Ile Met Gly Val Pro
85 90 95Lys Val Lys Glu Thr Met
Val Pro Lys His Lys Glu Met Pro Phe Pro 100
105 110Lys Tyr Pro Val Glu Pro Phe Thr Glu Ser Gln Ser
Leu Thr Leu Thr 115 120 125Asp Val
Glu Lys Leu His Leu Pro Leu Pro Leu Val Gln Ser Trp Met 130
135 140His Gln Pro Pro Gln Pro Leu Ser Pro Thr Val
Met Phe Pro Pro Gln145 150 155
160Ser Val Leu Ser Leu Ser Gln Pro Lys Val Leu Pro Val Pro Gln Lys
165 170 175Ala Val Pro Gln
Arg Asp Met Pro Ile Gln Ala Phe Leu Leu Tyr Gln 180
185 190Glu Pro Val Leu Gly Pro Val Arg Gly Pro Phe
Pro Ile Leu Val 195 200
205122207PRTOvis aries 122Arg Glu Gln Glu Glu Leu Asn Val Val Gly Glu Thr
Val Glu Ser Leu1 5 10
15Ser Ser Ser Glu Glu Ser Ile Thr His Ile Asn Lys Lys Ile Glu Lys
20 25 30Phe Gln Ser Glu Glu Gln Gln
Gln Thr Glu Asp Glu Leu Gln Asp Lys 35 40
45Ile His Pro Phe Ala Gln Ala Gln Ser Leu Val Tyr Pro Phe Thr
Gly 50 55 60Pro Ile Pro Asn Ser Leu
Pro Gln Asn Ile Leu Pro Leu Thr Gln Thr65 70
75 80Pro Val Val Val Pro Pro Phe Leu Gln Pro Glu
Ile Met Gly Val Pro 85 90
95Lys Val Lys Glu Thr Met Val Pro Lys His Lys Glu Met Pro Phe Pro
100 105 110Lys Tyr Pro Val Glu Pro
Phe Thr Glu Ser Gln Ser Leu Thr Leu Thr 115 120
125Asp Val Glu Lys Leu His Leu Pro Leu Pro Leu Val Gln Ser
Trp Met 130 135 140His Gln Pro Pro Gln
Pro Leu Pro Pro Thr Val Met Phe Pro Pro Gln145 150
155 160Ser Val Leu Ser Leu Ser Gln Pro Lys Val
Leu Pro Val Pro Gln Lys 165 170
175Ala Val Pro Gln Arg Asp Met Pro Ile Gln Ala Phe Leu Leu Tyr Gln
180 185 190Glu Pro Val Leu Gly
Pro Val Arg Gly Pro Phe Pro Ile Leu Val 195 200
205123209PRTBubalus bubalis 123Arg Glu Leu Glu Glu Leu Asn
Val Pro Gly Glu Ile Val Glu Ser Leu1 5 10
15Ser Ser Ser Glu Glu Ser Ile Thr His Ile Asn Lys Lys
Ile Glu Lys 20 25 30Phe Gln
Ser Glu Glu Gln Gln Gln Met Glu Asp Glu Leu Gln Asp Lys 35
40 45Ile His Pro Phe Ala Gln Thr Gln Ser Leu
Val Tyr Pro Phe Pro Gly 50 55 60Pro
Ile Pro Lys Ser Leu Pro Gln Asn Ile Pro Pro Leu Thr Gln Thr65
70 75 80Pro Val Val Val Pro Pro
Phe Leu Gln Pro Glu Ile Met Gly Val Ser 85
90 95Lys Val Lys Glu Ala Met Ala Pro Lys His Lys Glu
Met Pro Phe Pro 100 105 110Lys
Tyr Pro Val Glu Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu Thr 115
120 125Asp Val Glu Asn Leu His Leu Pro Leu
Pro Leu Leu Gln Ser Trp Met 130 135
140His Gln Pro Pro Gln Pro Leu Pro Pro Thr Val Met Phe Pro Pro Gln145
150 155 160Ser Val Leu Ser
Leu Ser Gln Ser Lys Val Leu Pro Val Pro Gln Lys 165
170 175Ala Val Pro Tyr Pro Gln Arg Asp Met Pro
Ile Gln Ala Phe Leu Leu 180 185
190Tyr Gln Glu Pro Val Leu Gly Pro Val Arg Gly Pro Phe Pro Ile Ile
195 200 205Val124217PRTCamelus
dromedarius 124Arg Glu Lys Glu Glu Phe Lys Thr Ala Gly Glu Ala Leu Glu
Ser Ile1 5 10 15Ser Ser
Ser Glu Glu Ser Ile Thr His Ile Asn Lys Gln Lys Ile Glu 20
25 30Lys Phe Lys Ile Glu Glu Gln Gln Gln
Thr Glu Asp Glu Gln Gln Asp 35 40
45Lys Ile Tyr Thr Phe Pro Gln Pro Gln Ser Leu Val Tyr Ser His Thr 50
55 60Glu Pro Ile Pro Tyr Pro Ile Leu Pro
Gln Asn Phe Leu Pro Pro Leu65 70 75
80Gln Pro Ala Val Met Val Pro Phe Leu Gln Pro Lys Val Met
Asp Val 85 90 95Pro Lys
Thr Lys Glu Thr Ile Ile Pro Lys Arg Lys Glu Met Pro Leu 100
105 110Leu Gln Ser Pro Val Val Pro Phe Thr
Glu Ser Gln Ser Leu Thr Leu 115 120
125Thr Asp Leu Glu Asn Leu His Leu Pro Leu Pro Leu Leu Gln Ser Leu
130 135 140Met Tyr Gln Ile Pro Gln Pro
Val Pro Gln Thr Pro Met Ile Pro Pro145 150
155 160Gln Ser Leu Leu Ser Leu Ser Gln Phe Lys Val Leu
Pro Val Pro Gln 165 170
175Gln Met Val Pro Tyr Pro Gln Arg Ala Met Pro Val Gln Ala Val Leu
180 185 190Pro Phe Gln Glu Pro Val
Pro Asp Pro Val Arg Gly Leu His Pro Val 195 200
205Pro Gln Pro Leu Val Pro Val Ile Ala 210
215125217PRTCamelus bactrianus 125Arg Glu Lys Glu Glu Phe Lys Thr Ala
Gly Glu Ala Leu Glu Ser Ile1 5 10
15Ser Ser Ser Glu Glu Ser Ile Thr His Ile Asn Lys Gln Lys Ile
Glu 20 25 30Lys Phe Lys Ile
Glu Glu Gln Gln Gln Thr Glu Asp Glu Gln Gln Asp 35
40 45Lys Ile Tyr Thr Phe Pro Gln Pro Gln Ser Leu Val
Tyr Ser His Thr 50 55 60Glu Pro Ile
Pro Tyr Pro Ile Leu Pro Gln Asn Phe Leu Pro Pro Leu65 70
75 80Gln Pro Ala Val Met Val Pro Phe
Leu Gln Pro Lys Val Met Asp Val 85 90
95Pro Lys Thr Lys Glu Thr Ile Ile Pro Lys Arg Lys Glu Met
Pro Leu 100 105 110Leu Gln Ser
Pro Val Val Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu 115
120 125Thr Asp Leu Glu Asn Leu His Leu Pro Leu Pro
Leu Leu Gln Ser Leu 130 135 140Met Tyr
Gln Ile Pro Gln Pro Val Pro Gln Thr Pro Met Ile Pro Pro145
150 155 160Gln Ser Leu Leu Ser Leu Ser
Gln Phe Lys Val Leu Pro Val Pro Gln 165
170 175Gln Met Val Pro Tyr Pro Gln Arg Ala Ile Pro Val
Gln Ala Val Leu 180 185 190Pro
Phe Gln Glu Pro Val Pro Asp Pro Val Arg Gly Leu His Pro Val 195
200 205Pro Gln Pro Leu Val Pro Val Ile Ala
210 215126209PRTBos mutus 126Arg Glu Leu Glu Glu Leu Asn
Val Pro Gly Glu Ile Val Glu Ser Leu1 5 10
15Ser Ser Ser Glu Glu Ser Ile Thr Arg Ile Asn Lys Lys
Ile Glu Lys 20 25 30Phe Gln
Ser Glu Glu Gln Gln Gln Thr Glu Asp Glu Leu Gln Asp Lys 35
40 45Ile His Pro Phe Ala Gln Thr Gln Ser Leu
Val Tyr Pro Phe Pro Gly 50 55 60Pro
Ile Pro Asn Ser Leu Pro Gln Asn Ile Pro Pro Leu Thr Gln Thr65
70 75 80Pro Val Val Val Pro Pro
Phe Leu Gln Pro Glu Val Met Gly Val Ser 85
90 95Lys Val Lys Glu Ala Met Ala Pro Lys His Lys Glu
Met Pro Phe Pro 100 105 110Lys
Tyr Pro Val Glu Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu Thr 115
120 125Asp Val Glu Asn Leu His Leu Pro Leu
Pro Leu Leu Gln Ser Trp Met 130 135
140His Gln Pro His Gln Pro Leu Pro Pro Thr Val Met Phe Pro Pro Gln145
150 155 160Ser Val Leu Ser
Leu Ser Gln Ser Lys Val Leu Pro Val Pro Gln Lys 165
170 175Ala Val Pro Tyr Pro Gln Arg Asp Met Pro
Ile Gln Ala Phe Leu Leu 180 185
190Tyr Gln Glu Pro Val Leu Gly Pro Val Arg Gly Pro Phe Pro Ile Ile
195 200 205Val127226PRTEquus caballus
Arg Glu Lys Glu Glu Leu Asn Val Ser Ser Glu Thr Val Glu Ser Leu 16
Ser Ser Asn Glu Pro Asp Ser Ser Ser Glu Glu Ser Ile Thr His Ile 32
Asn Lys Glu Lys Leu Gln Lys Phe Lys His Glu Gly Gln Gln Gln Arg 48
Glu Val Glu Arg Gln Asp Lys Ile Ser Arg Phe Val Gln Pro Gln Pro 64
Val Val Tyr Pro Tyr Ala Glu Pro Val Pro Tyr Ala Val Val Pro Gln 80
Ser Ile Leu Pro Leu Ala Gln Pro Pro Ile Leu Pro Phe Leu Gln Pro 96
Glu Ile Met Glu Val Ser Gln Ala Lys Glu Thr Ile Leu Pro Lys Arg 112
Lys Val Met Pro Phe Leu Lys Ser Pro Ile Val Pro Phe Ser Glu Arg 128
Gln Ile Leu Asn Pro Thr Asn Gly Glu Asn Leu Arg Leu Pro Val His 144
Leu Ile Gln Pro Phe Met His Gln Val Pro Gln Ser Leu Leu Gln Thr 160
Leu Met Leu Pro Ser Gln Pro Val Leu Ser Pro Pro Gln Ser Lys Val 176
Ala Pro Phe Pro Gln Pro Val Val Pro Tyr Pro Gln Arg Asp Thr Pro 192
Val Gln Ala Phe Leu Leu Tyr Gln Asp Pro Arg Leu Gly Pro Thr Gly 208
Glu Leu Asp Pro Ala Thr Gln Pro Ile Val Ala Val His Asn Pro Val 224
Ile Val 226

SEQ ID NO: 128; Length: 226; Type: PRT; Organism: Equus asinus
Arg Glu Lys Glu Glu Leu Asn Val Ser Ser Glu Thr Val Glu Ser Leu 16
Ser Ser Asn Glu Pro Asp Ser Ser Ser Glu Glu Ser Ile Thr His Ile 32
Asn Lys Glu Lys Ser Gln Lys Phe Lys His Glu Gly Gln Gln Gln Arg 48
Glu Val Glu His Gln Asp Lys Ile Ser Arg Phe Val Gln Pro Gln Pro 64
Val Val Tyr Pro Tyr Ala Glu Pro Val Pro Tyr Ala Val Val Pro Gln 80
Asn Ile Leu Val Leu Ala Gln Pro Pro Ile Val Pro Phe Leu Gln Pro 96
Glu Ile Met Glu Val Ser Gln Ala Lys Glu Thr Ile Leu Pro Lys Arg 112
Lys Val Met Pro Phe Leu Lys Ser Pro Ile Val Pro Phe Ser Glu Arg 128
Gln Ile Leu Asn Pro Thr Asn Gly Glu Asn Leu Arg Leu Pro Val His 144
Leu Ile Gln Pro Phe Met His Gln Val Pro Gln Ser Leu Leu Gln Thr 160
Leu Met Leu Pro Ser Gln Pro Val Leu Ser Pro Pro Gln Ser Lys Val 176
Ala Pro Phe Pro Gln Pro Val Val Pro Tyr Pro Gln Arg Asp Thr Pro 192
Val Gln Ala Phe Leu Leu Tyr Gln Asp Pro Gln Leu Gly Leu Thr Gly 208
Glu Phe Asp Pro Ala Thr Gln Pro Ile Val Pro Val His Asn Pro Val 224
Ile Val 226

SEQ ID NO: 129; Length: 141; Type: PRT; Organism: Alces alces
Feature: SITE (6)..(6); Xaa can be any naturally occurring amino acid
Feature: SITE (17)..(17); Xaa can be any naturally occurring amino acid
Feature: SITE (65)..(65); Xaa can be any naturally occurring amino acid
Ile His Pro Phe Ala Xaa Thr Gln Ser Leu Val Tyr Pro Phe Thr Gly 16
Xaa Ile Pro Tyr Ser Leu Pro Gln Asn Phe Leu Pro Leu Pro Gln Thr 32
Pro Gly Met Val Pro Pro Phe Leu Gln Pro Glu Ile Met Gly Val Ser 48
Glu Val Lys Glu Thr Met Val Pro Lys Asn Lys Glu Met Pro Phe Pro 64
Xaa Tyr Pro Val Glu Pro Phe Ala Glu Gly Gln Ser Leu Thr Leu Thr 80
Asp Val Glu Asn Leu His Leu Pro Leu Pro Leu Leu Gln Ser Trp Met 96
His Gln Thr Pro Gln Pro Leu Pro Pro Thr Val Met Phe Pro Pro Gln 112
Ser Val Leu Ser Leu Ser Gln Pro Lys Val Leu Ser Val Pro Gln Lys 128
Ala Val Pro Tyr Pro Gln Arg Asp Met Pro Ile Gln Ala 141

SEQ ID NO: 130; Length: 173; Type: PRT; Organism: Vicugna pacos
Asp Glu Gln Gln Asp Lys Ile Tyr Thr Phe Pro Gln Pro Gln Ser Leu 16
Val Tyr Ser His Thr Glu Pro Ile Pro Tyr Pro Ile Leu Pro Gln Asn 32
Phe Leu Pro Pro Leu Gln Pro Ala Val Met Val Pro Phe Leu Gln Pro 48
Lys Val Met Asp Val Pro Lys Thr Lys Glu Ile Val Ile Pro Lys Arg 64
Lys Glu Met Pro Leu Leu Gln Ser Pro Leu Val Pro Phe Thr Glu Ser 80
Gln Ser Leu Thr Leu Thr Asp Leu Glu Asn Leu His Leu Pro Leu Pro 96
Leu Leu Gln Ser Leu Met His Gln Ile Pro Gln Pro Val Pro Gln Thr 112
Pro Met Ile Pro Pro Gln Ser Leu Leu Ser Leu Ser Gln Phe Lys Val 128
Leu Pro Val Pro Gln Gln Met Val Pro Tyr Pro Gln Arg Ala Met Pro 144
Val Gln Ala Leu Leu Pro Phe Gln Glu Pro Ile Pro Asp Pro Val Arg 160
Gly Leu His Pro Val Pro Gln Pro Leu Val Pro Val Ile 173

SEQ ID NO: 131; Length: 209; Type: PRT; Organism: Bos indicus
Arg Glu Leu Glu Glu Leu Asn Val Pro Gly Glu Ile Val Glu Ser Leu 16
Ser Ser Ser Glu Glu Ser Ile Thr Arg Ile Asn Lys Lys Ile Glu Lys 32
Phe Gln Ser Glu Glu Gln Gln Gln Thr Glu Asp Glu Leu Gln Asp Lys 48
Ile His Pro Phe Ala Gln Thr Gln Ser Leu Val Tyr Pro Phe Pro Gly 64
Pro Ile Pro Asn Ser Leu Pro Gln Asn Ile Pro Pro Leu Thr Gln Thr 80
Pro Val Val Val Pro Pro Phe Leu Gln Pro Glu Val Met Gly Val Ser 96
Lys Val Lys Glu Ala Met Ala Pro Lys His Lys Glu Met Pro Phe Pro 112
Lys Tyr Pro Val Glu Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu Thr 128
Asp Val Glu Asn Leu His Leu Pro Leu Pro Leu Leu Gln Ser Trp Met 144
His Gln Pro His Gln Pro Leu Pro Pro Thr Val Met Phe Pro Pro Gln 160
Ser Val Leu Ser Leu Ser Gln Ser Lys Val Leu Pro Val Pro Gln Lys 176
Ala Val Pro Tyr Pro Gln Arg Asp Met Pro Ile Gln Ala Phe Leu Leu 192
Tyr Gln Glu Pro Val Leu Gly Pro Val Arg Gly Pro Phe Pro Ile Ile 208
Val 209

SEQ ID NO: 132; Length: 217; Type: PRT; Organism: Lama glama
Arg Glu Lys Glu Glu Phe Lys Thr Ala Gly Glu Ala Val Glu Ser Ile 16
Ser Ser Ser Glu Glu Ser Ile Thr His Ile Asn Lys Gln Lys Ile Glu 32
Lys Phe Lys Ile Glu Glu Gln Gln Gln Thr Glu Asp Glu Gln Gln Asp 48
Lys Ile Tyr Thr Phe Pro Gln Pro Gln Ser Leu Val Tyr Ser His Thr 64
Glu Pro Ile Pro Tyr Pro Ile Leu Pro Gln Asn Phe Leu Pro Pro Leu 80
Gln Pro Ala Val Met Val Pro Phe Leu Gln Pro Lys Val Met Asp Val 96
Pro Lys Thr Lys Glu Ile Val Ile Pro Lys Arg Lys Glu Met Pro Leu 112
Leu Gln Ser Pro Leu Val Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu 128
Thr Asp Leu Glu Asn Leu His Leu Pro Leu Pro Leu Leu Gln Ser Leu 144
Met His Gln Ile Pro Gln Pro Val Pro Gln Thr Pro Met Ile Pro Pro 160
Gln Ser Leu Leu Ser Leu Ser Gln Phe Lys Val Leu Pro Val Pro Gln 176
Gln Met Val Pro Tyr Pro Gln Arg Ala Met Pro Val Gln Ala Leu Leu 192
Pro Phe Gln Glu Pro Ile Pro Asp Pro Val Arg Gly Leu His Pro Val 208
Pro Gln Pro Leu Val Pro Val Ile Ala 217

SEQ ID NO: 133; Length: 211; Type: PRT; Organism: Homo sapiens
Arg Glu Thr Ile Glu Ser Leu Ser Ser Ser Glu Glu Ser Ile Thr Glu 16
Tyr Lys Gln Lys Val Glu Lys Val Lys His Glu Asp Gln Gln Gln Gly 32
Glu Asp Glu His Gln Asp Lys Ile Tyr Pro Ser Phe Gln Pro Gln Pro 48
Leu Ile Tyr Pro Phe Val Glu Pro Ile Pro Tyr Gly Phe Leu Pro Gln 64
Asn Ile Leu Pro Leu Ala Gln Pro Ala Val Val Leu Pro Val Pro Gln 80
Pro Glu Ile Met Glu Val Pro Lys Ala Lys Asp Thr Val Tyr Thr Lys 96
Gly Arg Val Met Pro Val Leu Lys Ser Pro Thr Ile Pro Phe Phe Asp 112
Pro Gln Ile Pro Lys Leu Thr Asp Leu Glu Asn Leu His Leu Pro Leu 128
Pro Leu Leu Gln Pro Leu Met Gln Gln Val Pro Gln Pro Ile Pro Gln 144
Thr Leu Ala Leu Pro Pro Gln Pro Leu Trp Ser Val Pro Gln Pro Lys 160
Val Leu Pro Ile Pro Gln Gln Val Val Pro Tyr Pro Gln Arg Ala Val 176
Pro Val Gln Ala Leu Leu Leu Asn Gln Glu Leu Leu Leu Asn Pro Thr 192
His Gln Ile Tyr Pro Val Thr Gln Pro Leu Ala Pro Val His Asn Pro 208
Ile Ser Val 211

SEQ ID NO: 134; Length: 1059; Type: DNA; Organism: Artificial Sequence
Other information: Nucleic acid encoding fusion protein sig2OKC1-TFMOLG1
atggccaagc tagttttttc cctttgtttt ctgcttttca gtggctgctg cttcgctcaa 60
gagcagaatc aagagcagcc aatccgttgt gagaaggacg agaggttctt ctcagacaag 120
atcgccaaat atatacccat acaatatgta ctctcacgct accctagcta cgggcttaac 180
tactatcagc aaaaacctgt agcactgata aataaccagt ttctccccta tccctattat 240
gctaaacctg ccgccgtgag gagtccagca caaatacttc agtggcaagt gctcagtaac 300
accgtgccag caaaaagctg ccaggctcag cccaccacaa tggcccgtca tccccatcct 360
caccttagct tcatggcaat cccaccaaag aagaatcaag acaagaccga aatacctacc 420
atcaacacaa ttgcatctgg agagcctacc agtacaccaa caactgaggc agtagagtct 480
actgttgcta cccttgagga cagccccgag gttatagagt ccccacctga gataaatacc 540
gtgcaggtga caagtaccgc cgtattcatg ttgatcgtaa cacagactat gaagggtctt 600
gatatacaga aggtggccgg gacttggtac agtttggcaa tggccgcatc cgacatctcc 660
ttgttggacg cacaatcagc cccattgcgt gtgtacgtag aagagcttaa accaactccc 720
gagggggatc tggaaattct gctccagaaa tgggagaacg gtgagtgcgc ccagaagaag 780
atcatcgcag agaagaccaa aattccagca gtattcaaaa tcgacgcatt gaacgaaaat 840
aaggtgctcg tactggacac tgattataag aagtatctcc ttttctgtat ggagaactca 900
gcagagcctg aacagagtct tgcctgccaa tgccttgttc gtaccccaga ggtagatgat 960
gaagctctgg aaaagttcga taaggccctt aaggctctgc ctatgcacat taggctttct 1020
ttcaatccaa ctcaacttga ggaacaatgt cacatttaa 1059

SEQ ID NO: 135; Length: 352; Type: PRT; Organism: Artificial Sequence
Other information: Fusion protein sig2OKC1-TFMOLG1
Met Ala Lys Leu Val Phe Ser Leu Cys Phe Leu Leu Phe Ser Gly Cys 16
Cys Phe Ala Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys 32
Asp Glu Arg Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln 48
Tyr Val Leu Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln 64
Lys Pro Val Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr 80
Ala Lys Pro Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln 96
Val Leu Ser Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr 112
Thr Met Ala Arg His Pro His Pro His Leu Ser Phe Met Ala Ile Pro 128
Pro Lys Lys Asn Gln Asp Lys Thr Glu Ile Pro Thr Ile Asn Thr Ile 144
Ala Ser Gly Glu Pro Thr Ser Thr Pro Thr Thr Glu Ala Val Glu Ser 160
Thr Val Ala Thr Leu Glu Asp Ser Pro Glu Val Ile Glu Ser Pro Pro 176
Glu Ile Asn Thr Val Gln Val Thr Ser Thr Ala Val Phe Met Leu Ile 192
Val Thr Gln Thr Met Lys Gly Leu Asp Ile Gln Lys Val Ala Gly Thr 208
Trp Tyr Ser Leu Ala Met Ala Ala Ser Asp Ile Ser Leu Leu Asp Ala 224
Gln Ser Ala Pro Leu Arg Val Tyr Val Glu Glu Leu Lys Pro Thr Pro 240
Glu Gly Asp Leu Glu Ile Leu Leu Gln Lys Trp Glu Asn Gly Glu Cys 256
Ala Gln Lys Lys Ile Ile Ala Glu Lys Thr Lys Ile Pro Ala Val Phe 272
Lys Ile Asp Ala Leu Asn Glu Asn Lys Val Leu Val Leu Asp Thr Asp 288
Tyr Lys Lys Tyr Leu Leu Phe Cys Met Glu Asn Ser Ala Glu Pro Glu 304
Gln Ser Leu Ala Cys Gln Cys Leu Val Arg Thr Pro Glu Val Asp Asp 320
Glu Ala Leu Glu Lys Phe Asp Lys Ala Leu Lys Ala Leu Pro Met His 336
Ile Arg Leu Ser Phe Asn Pro Thr Gln Leu Glu Glu Gln Cys His Ile 352

SEQ ID NO: 136; Length: 1071; Type: DNA; Organism: Artificial Sequence
Other information: Nucleic acid encoding fusion protein sig2OKC1-TFMOLG1KDEL
atggccaagc tagttttttc cctttgtttt ctgcttttca gtggctgctg cttcgctcaa 60
gagcagaatc aagagcagcc aatccgttgt gagaaggacg agaggttctt ctcagacaag 120
atcgccaaat atatacccat acaatatgta ctctcacgct accctagcta cgggcttaac 180
tactatcagc aaaaacctgt agcactgata aataaccagt ttctccccta tccctattat 240
gctaaacctg ccgccgtgag gagtccagca caaatacttc agtggcaagt gctcagtaac 300
accgtgccag caaaaagctg ccaggctcag cccaccacaa tggcccgtca tccccatcct 360
caccttagct tcatggcaat cccaccaaag aagaatcaag acaagaccga aatacctacc 420
atcaacacaa ttgcatctgg agagcctacc agtacaccaa caactgaggc agtagagtct 480
actgttgcta cccttgagga cagccccgag gttatagagt ccccacctga gataaatacc 540
gtgcaggtga caagtaccgc cgtattcatg ttgatcgtaa cacagactat gaagggtctt 600
gatatacaga aggtggccgg gacttggtac agtttggcaa tggccgcatc cgacatctcc 660
ttgttggacg cacaatcagc cccattgcgt gtgtacgtag aagagcttaa accaactccc 720
gagggggatc tggaaattct gctccagaaa tgggagaacg gtgagtgcgc ccagaagaag 780
atcatcgcag agaagaccaa aattccagca gtattcaaaa tcgacgcatt gaacgaaaat 840
aaggtgctcg tactggacac tgattataag aagtatctcc ttttctgtat ggagaactca 900
gcagagcctg aacagagtct tgcctgccaa tgccttgttc gtaccccaga ggtagatgat 960
gaagctctgg aaaagttcga taaggccctt aaggctctgc ctatgcacat taggctttct 1020
ttcaatccaa ctcaacttga ggaacaatgt cacattaagg atgagcttta a 1071

SEQ ID NO: 137; Length: 356; Type: PRT; Organism: Artificial Sequence
Other information: Fusion protein sig2OKC1-TFMOLG1KDEL
Met Ala Lys Leu Val Phe Ser Leu Cys Phe Leu Leu Phe Ser Gly Cys 16
Cys Phe Ala Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys 32
Asp Glu Arg Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln 48
Tyr Val Leu Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln 64
Lys Pro Val Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr 80
Ala Lys Pro Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln 96
Val Leu Ser Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr 112
Thr Met Ala Arg His Pro His Pro His Leu Ser Phe Met Ala Ile Pro 128
Pro Lys Lys Asn Gln Asp Lys Thr Glu Ile Pro Thr Ile Asn Thr Ile 144
Ala Ser Gly Glu Pro Thr Ser Thr Pro Thr Thr Glu Ala Val Glu Ser 160
Thr Val Ala Thr Leu Glu Asp Ser Pro Glu Val Ile Glu Ser Pro Pro 176
Glu Ile Asn Thr Val Gln Val Thr Ser Thr Ala Val Phe Met Leu Ile 192
Val Thr Gln Thr Met Lys Gly Leu Asp Ile Gln Lys Val Ala Gly Thr 208
Trp Tyr Ser Leu Ala Met Ala Ala Ser Asp Ile Ser Leu Leu Asp Ala 224
Gln Ser Ala Pro Leu Arg Val Tyr Val Glu Glu Leu Lys Pro Thr Pro 240
Glu Gly Asp Leu Glu Ile Leu Leu Gln Lys Trp Glu Asn Gly Glu Cys 256
Ala Gln Lys Lys Ile Ile Ala Glu Lys Thr Lys Ile Pro Ala Val Phe 272
Lys Ile Asp Ala Leu Asn Glu Asn Lys Val Leu Val Leu Asp Thr Asp 288
Tyr Lys Lys Tyr Leu Leu Phe Cys Met Glu Asn Ser Ala Glu Pro Glu 304
Gln Ser Leu Ala Cys Gln Cys Leu Val Arg Thr Pro Glu Val Asp Asp 320
Glu Ala Leu Glu Lys Phe Asp Lys Ala Leu Lys Ala Leu Pro Met His 336
Ile Arg Leu Ser Phe Asn Pro Thr Gln Leu Glu Glu Gln Cys His Ile 352
Lys Asp Glu Leu 356
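
The two fusion constructs listed above appear to differ only at their 3' ends: the sequence numbered 136 is 12 nucleotides longer than the one numbered 134 (1071 versus 1059), and the protein numbered 137 carries four extra C-terminal residues (Lys Asp Glu Leu) relative to the one numbered 135 (356 versus 352). The following is a minimal Python sketch of that check, assuming the standard genetic code. The names `translate`, `TAIL_134`, and `TAIL_136` are illustrative and are not part of the application; only the last few codons of each nucleic acid sequence are copied from the listing.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon (whitespace ignored); '*' marks a stop."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# 3' ends copied from the listing (final codons only, not full sequences).
TAIL_134 = "cacatttaa"                # end of the 1059-nt sequence
TAIL_136 = "cacattaagg atgagcttta a"  # end of the 1071-nt sequence

print(translate(TAIL_134))  # HI*
print(translate(TAIL_136))  # HIKDEL*

# Length bookkeeping: codons minus one stop codon equals residues.
assert 1059 // 3 - 1 == 352
assert 1071 // 3 - 1 == 356
```

Under these assumptions the extra 12 nucleotides (aag gat gag ctt) translate to Lys Asp Glu Leu immediately before the stop codon, which matches the four-residue difference between the two listed proteins.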