Patent application title: RECOMBINANT FUSION PROTEINS FOR PRODUCING MILK PROTEINS IN PLANTS

Inventors:  Viviane Lanquar (San Carlos, CA, US)  Magi El-Richani (San Francisco, CA, US)
IPC8 Class: AC12N1582FI
Publication date: 2022-03-31
Patent application number: 20220098608



Abstract:

Provided herein are compositions and methods for producing milk proteins in plants, which allow for safe, sustainable and humane production of milk proteins for commercial use, such as use in food compositions. The disclosure provides recombinant fusion proteins comprising a milk protein, or fragment thereof, and a structured mammalian, avian, plant, or fungal protein, or fragment thereof. The disclosure also provides methods for producing the recombinant fusion proteins, and food compositions comprising the same.

Claims:

1. A food composition, comprising: at least one component of a fusion protein, said fusion protein comprising i) a bovine casein component and ii) a bovine β-lactoglobulin component, wherein the component has been separated from the fusion protein.

2. The food composition of claim 1, wherein the food composition comprises the bovine casein component.

3. The food composition of claim 1, wherein the food composition comprises the bovine β-lactoglobulin component.

4. The food composition of claim 1, wherein the food composition is selected from the group consisting of: cheese, processed cheese product, fermented dairy product, directly acidified counterpart of fermented dairy product, cottage cheese, dressing, frozen dairy product, frozen dessert, dessert, baked good, topping, icing, filling, low-fat spread, dairy-based dry mix, soup, sauce, salad dressing, geriatric nutrition, cream, creamer, analog dairy product, follow-up formula, baby formula, infant formula, milk, dairy beverage, acid dairy drink, smoothie, milk tea, butter, margarine, butter alternative, growing up milk, low-lactose product, low-lactose beverage, medical and clinical nutrition product, protein bar, nutrition bar, sport beverage, confection, meat product, analog meat product, meal replacement beverage, weight management food and beverage, dairy product, cultured buttermilk, sour cream, yogurt, skyr, leben, lassi, kefir, powder containing a milk protein, and low-lactose product.

5. The food composition of claim 1, wherein the food composition is a solid.

6. The food composition of claim 1, wherein the food composition is a liquid.

7. The food composition of claim 1, wherein the food composition is a powder.

8. The food composition of claim 1, wherein the food composition is a dairy product.

9. The food composition of claim 1, wherein the food composition is an analog dairy product.

10. The food composition of claim 1, wherein the food composition is a low lactose product.

11. The food composition of claim 1, wherein the food composition is a milk.

12. The food composition of claim 1, wherein the food composition is a cheese.

13. The food composition of claim 1, wherein the food composition is fermented.

14. A food composition, comprising: a fusion protein comprising bovine casein and bovine β-lactoglobulin.

15. The food composition of claim 14, wherein the fusion protein comprises a protease cleavage site.

16. The food composition of claim 14, wherein the fusion protein comprises a chymosin cleavage site.

17. The food composition of claim 14, wherein the fusion protein has a molecular weight of 30 kDa to 50 kDa.

18. The food composition of claim 14, wherein the food composition is selected from the group consisting of: cheese, processed cheese product, fermented dairy product, directly acidified counterpart of fermented dairy product, cottage cheese, dressing, frozen dairy product, frozen dessert, dessert, baked good, topping, icing, filling, low-fat spread, dairy-based dry mix, soup, sauce, salad dressing, geriatric nutrition, cream, creamer, analog dairy product, follow-up formula, baby formula, infant formula, milk, dairy beverage, acid dairy drink, smoothie, milk tea, butter, margarine, butter alternative, growing up milk, low-lactose product, low-lactose beverage, medical and clinical nutrition product, protein bar, nutrition bar, sport beverage, confection, meat product, analog meat product, meal replacement beverage, weight management food and beverage, dairy product, cultured buttermilk, sour cream, yogurt, skyr, leben, lassi, kefir, powder containing a milk protein, and low-lactose product.

19. The food composition of claim 14, wherein the food composition is a solid.

20. The food composition of claim 14, wherein the food composition is a liquid.

21. The food composition of claim 14, wherein the food composition is a powder.

22. The food composition of claim 14, wherein the food composition is a dairy product.

23. The food composition of claim 14, wherein the food composition is an analog dairy product.

24. The food composition of claim 14, wherein the food composition is a low lactose product.

25. The food composition of claim 14, wherein the food composition is a milk.

26. The food composition of claim 14, wherein the food composition is a cheese.

27. The food composition of claim 14, wherein the food composition is fermented.

28. The food composition of claim 14, wherein the fusion protein is a plant expressed fusion protein.

29. The food composition of claim 14, wherein the fusion protein is a soybean expressed fusion protein.

Description:

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application is a divisional of U.S. application Ser. No. 17/157,105, filed Jan. 25, 2021, which is a continuation of U.S. application Ser. No. 17/039,759, filed Sep. 30, 2020 and which issued as U.S. Pat. No. 10,947,552 on Mar. 16, 2021, the disclosures of which are hereby incorporated by reference in their entirety.

DESCRIPTION OF THE TEXT FILE SUBMITTED ELECTRONICALLY

[0002] The contents of the text file submitted electronically herewith are incorporated herein by reference in their entirety: A computer readable format copy of the Sequence Listing filename: ALRO_007_12US_SeqList_ST25.txt, date recorded: Jul. 9, 2021, file size ≈ 155 kilobytes.

FIELD OF THE DISCLOSURE

[0003] The present disclosure generally relates to recombinant milk proteins, and methods of production, extraction, and purification of the milk proteins from transgenic plants. The disclosure also relates to food compositions comprising recombinant milk proteins.

BACKGROUND

[0004] More than 7.5 billion people around the world consume milk and milk products. Demand for cow milk and dairy products is expected to keep increasing due to increased reliance on these products in developing countries as well as growth in the human population, which is expected to exceed 9 billion people by 2050.

[0005] Relying on animal agriculture to meet the growing demand for food is not a sustainable solution. According to the Food & Agriculture Organization of the United Nations, animal agriculture is responsible for 18% of all greenhouse gases, more than the entire transportation sector combined. Dairy cows alone account for 3% of this total.

[0006] In addition to impacting the environment, animal agriculture poses a serious risk to human health. A startling 80% of antibiotics used in the United States go towards treating animals, resulting in the development of antibiotic resistant microorganisms also known as superbugs. For years, food companies and farmers have administered antibiotics not only to sick animals, but also to healthy animals, to prevent illness. In September 2016, the United Nations announced the use of antibiotics in the food system as a crisis on par with Ebola and HIV.

[0007] It is estimated that cow milk accounts for 83% of global milk production. Accordingly, there is an urgent need to provide bovine milk and/or essential high-quality proteins from bovine milk in a more sustainable and humane manner, instead of solely relying on animal farming. Also, there is a need for selectively producing the specific milk proteins that confer nutritional and clinical benefits, and/or do not provoke allergic responses.

BRIEF SUMMARY

[0008] Provided herein are compositions and methods for producing milk proteins in transgenic plants. In some embodiments, a milk protein is stably expressed in a transgenic plant by fusing it to a stable protein, such as a stable mammalian, avian, plant or fungal protein. The compositions and methods provided herein allow for safe, sustainable and humane production of milk proteins for commercial use, such as use in food compositions.

[0009] In some embodiments, the disclosure provides a stably transformed plant comprising in its genome: a recombinant DNA construct encoding a fusion protein, the fusion protein comprising: (i) an unstructured milk protein, and (ii) a structured animal protein; wherein the fusion protein is stably expressed in the plant in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.

[0010] In some embodiments, the disclosure provides a stably transformed plant, comprising in its genome: a recombinant DNA construct encoding a fusion protein, the fusion protein comprising: κ-casein; and β-lactoglobulin; wherein the fusion protein is stably expressed in the plant in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.

[0011] In some embodiments, the disclosure provides a recombinant fusion protein comprising: (i) an unstructured milk protein, and (ii) a structured animal protein.

[0012] In some embodiments, the disclosure provides a plant-expressed recombinant fusion protein, comprising: κ-casein and β-lactoglobulin.

[0013] Also provided are nucleic acids encoding the recombinant fusion proteins described herein.

[0014] Also provided are vectors comprising a nucleic acid encoding one or more recombinant fusion proteins described herein, wherein the recombinant fusion protein comprises: (i) an unstructured milk protein, and (ii) a structured animal protein.

[0015] Also provided are plants comprising the recombinant fusion proteins and/or the nucleic acids described herein.

[0016] The instant disclosure also provides a method for stably expressing a recombinant fusion protein in a plant, the method comprising: a) transforming a plant with a plant transformation vector comprising an expression cassette comprising: a sequence encoding a fusion protein, wherein the fusion protein comprises an unstructured milk protein, and a structured animal protein; and b) growing the transformed plant under conditions wherein the recombinant fusion protein is expressed in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.

[0017] Also provided herein are methods for making food compositions, the methods comprising: expressing the recombinant fusion protein in a plant; extracting the recombinant fusion protein from the plant; optionally, separating the milk protein from the structured animal protein or the structured plant protein; and creating a food composition using the milk protein or the fusion protein.

[0018] Also provided herein are food compositions comprising one or more recombinant fusion proteins as described herein.

[0019] Also provided are food compositions produced using any one of the methods disclosed herein.

[0020] These and other embodiments are described in detail below.

BRIEF DESCRIPTION OF THE DRAWINGS

[0021] The accompanying figures, which are incorporated herein and form a part of the specification, illustrate some, but not the only or exclusive, example embodiments and/or features. It is intended that the embodiments and figures disclosed herein are to be considered illustrative rather than limiting.

[0022] FIGS. 1A, 1B, 1C, 1D, 1E, 1F, 1G, 1H, 1I, 1J, 1K, 1L, 1M, 1N, 1O, and 1P show expression cassettes having different combinations of fusions between structured and intrinsically unstructured proteins (not to scale). Coding regions and regulatory sequences are indicated as blocks (not to scale). As used in the figures, "L" refers to linker; "Sig" refers to a signal sequence that directs foreign proteins to protein storage vacuoles, "5' UTR" refers to the 5' untranslated region, and "KDEL" refers to an endoplasmic reticulum retention signal.

[0023] FIG. 2 shows the modified pAR15-00 cloning vector containing a selectable marker cassette conferring herbicide resistance. Coding regions and regulatory sequences are indicated as blocks (not to scale).

[0024] FIG. 3 shows an example expression cassette comprising a OKC1-T:OLG1 (Optimized Kappa Casein version 1:beta-lactoglobulin version 1, SEQ ID NOs: 71-72) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the ER retention signal (KDEL) and the 3'UTR of the arc5-1 gene, "arc-terminator". "arc5'UTR" refers to the 5' untranslated region of the arc5-1 gene. "Sig10" refers to the lectin 1 gene signal peptide. "RB" refers to ribosomal binding site. Coding regions and regulatory sequences are indicated as blocks (not to scale).

[0025] FIG. 4 shows an example expression cassette comprising a OBC-T2:FM:OLG1 (Optimized Beta Casein Truncated version 2:Chymosin cleavage site:beta-lactoglobulin version 1, SEQ ID NOs: 73-74) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the 3'UTR of the arc5-1 gene, "arc-terminator". "arc5'UTR" refers to the 5' untranslated region of the arc5-1 gene. "Sig10" refers to the lectin 1 gene signal peptide. "RB" refers to ribosomal binding site. Coding regions and regulatory sequences are indicated as blocks (not to scale). The Beta Casein is "truncated" in that the bovine secretion signal is removed, and replaced with a plant targeting signal.

[0026] FIG. 5 shows an example expression cassette comprising a OaS1-T:FM:OLG1 (Optimized Alpha S1 Casein Truncated version 1:Chymosin cleavage site:beta-lactoglobulin version 1, SEQ ID NOs: 75-76) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the 3'UTR of the arc5-1 gene, "arc-terminator". "arc5'UTR" refers to the 5' untranslated region of the arc5-1 gene. "Sig10" refers to the lectin 1 gene signal peptide. "RB" refers to ribosomal binding site. Coding regions and regulatory sequences are indicated as blocks (not to scale). The Alpha S1 Casein is "truncated" in that the bovine secretion signal is removed, and replaced with a plant targeting signal.

[0027] FIG. 6 shows an example expression cassette comprising a para-OKC1-T:FM:OLG1:KDEL (Optimized paraKappa Casein version 1:Chymosin cleavage site:beta-lactoglobulin version 1, SEQ ID NOs: 77-78) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the ER retention signal (KDEL) and the 3'UTR of the arc5-1 gene, "arc-terminator". "arc5'UTR" refers to the 5' untranslated region of the arc5-1 gene. "Sig10" refers to the lectin 1 gene signal peptide. "RB" refers to ribosomal binding site. Coding regions and regulatory sequences are indicated as blocks (not to scale).

[0028] FIG. 7 shows an example expression cassette comprising a para-OKC1-T:FM:OLG1 (Optimized paraKappa Casein version 1:Chymosin cleavage site:beta-lactoglobulin version 1, SEQ ID NOs: 79-80) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the 3'UTR of the arc5-1 gene, "arc-terminator". "arc5'UTR" refers to the 5' untranslated region of the arc5-1 gene. "Sig10" refers to the lectin 1 gene signal peptide. "RB" refers to ribosomal binding site. Coding regions and regulatory sequences are indicated as blocks (not to scale).

[0029] FIG. 8 shows an example expression cassette comprising a OKC1-T:OLG1 (Optimized Kappa Casein version 1:beta-lactoglobulin version 1, SEQ ID NOs: 81-82) fusion that is driven by the promoter and signal peptide of glycinin 1 (GmSeed2:sig2) followed by the ER retention signal (KDEL) and the nopaline synthase gene termination sequence, (nos term). Coding regions and regulatory sequences are indicated as blocks (not to scale).
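By way of non-limiting illustration only, the ordered layout of the expression cassettes of FIGS. 3 and 4 can be written down as a simple data structure. The following Python sketch is not part of the disclosed constructs; the class name `Element`, the `kind` labels, and the helper functions are hypothetical names introduced here for illustration, and only the element names (PvPhas, arc5'UTR, sig10, OKC1-T, OBC-T2, FM, OLG1, KDEL, arc-terminator) are taken from the figure descriptions above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Element:
    """One block of an expression cassette (hypothetical illustration type)."""
    kind: str  # e.g. "promoter", "5'UTR", "signal", "coding", "cleavage_site"
    name: str


def describe(cassette: List[Element]) -> str:
    """Join element names in 5'-to-3' order using the ':' notation of the figures."""
    return ":".join(e.name for e in cassette)


def has_er_retention(cassette: List[Element]) -> bool:
    """True if the cassette carries the KDEL endoplasmic reticulum retention signal."""
    return any(e.kind == "retention" and e.name == "KDEL" for e in cassette)


# FIG. 3: OKC1-T:OLG1 fusion with KDEL, as described in paragraph [0024].
FIG3 = [
    Element("promoter", "PvPhas"),
    Element("5'UTR", "arc5'UTR"),
    Element("signal", "sig10"),
    Element("coding", "OKC1-T"),
    Element("coding", "OLG1"),
    Element("retention", "KDEL"),
    Element("terminator", "arc-terminator"),
]

# FIG. 4: OBC-T2:FM:OLG1 fusion without KDEL, as described in paragraph [0025].
FIG4 = [
    Element("promoter", "PvPhas"),
    Element("5'UTR", "arc5'UTR"),
    Element("signal", "sig10"),
    Element("coding", "OBC-T2"),
    Element("cleavage_site", "FM"),
    Element("coding", "OLG1"),
    Element("terminator", "arc-terminator"),
]

if __name__ == "__main__":
    print(describe(FIG3))
    print(describe(FIG4))
```

Representing each cassette as an ordered list makes it straightforward to compare constructs, for example to check which of them include the ER retention signal.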

[0030] FIGS. 9A, 9B, 9C, and 9D show protein detection by western blotting. FIG. 9A shows detection of the fusion protein using a primary antibody raised against κ-casein (kCN). The kCN commercial protein is detected at an apparent MW of ~26 kDa (theoretical: 19 kDa; arrow). The fusion protein is detected at an apparent MW of ~40 kDa (theoretical: 38 kDa; arrowhead). FIG. 9B shows detection of the fusion protein using a primary antibody raised against β-lactoglobulin (LG). The LG commercial protein is detected at an apparent MW of ~18 kDa (theoretical: 18 kDa; arrow). The fusion protein is detected at an apparent MW of ~40 kDa (theoretical: 38 kDa; arrowhead). FIGS. 9C and 9D show protein gels as control for equal lane loading (image is taken at the end of the SDS run).

[0031] FIGS. 10A and 10B show two illustrative fusion proteins. In FIG. 10A, a κ-casein protein is fused to a β-lactoglobulin protein. The κ-casein comprises a natural chymosin cleavage site (arrow 1). Cleavage of the fusion protein with rennet (or chymosin) yields two fragments: a para-kappa casein fragment, and a fragment comprising a κ-casein macropeptide fused to β-lactoglobulin. In some embodiments, a second protease cleavage site may be added at the C-terminus of the κ-casein protein (i.e., at arrow 2), in order to further allow separation of the κ-casein macropeptide and the β-lactoglobulin. The second protease cleavage site may be a rennet cleavage site (e.g., a chymosin cleavage site), or it may be a cleavage site for a different protease. In FIG. 10B, a para-κ-casein protein is fused directly to β-lactoglobulin. A protease cleavage site (e.g., a rennet cleavage site) is added between the para-κ-casein and the β-lactoglobulin to allow for separation thereof. By fusing the para-κ-casein directly to the β-lactoglobulin, no κ-casein macropeptide is produced.
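By way of non-limiting illustration only, the fragments expected from the fusion proteins of FIGS. 10A and 10B can be enumerated with a short Python sketch. The sketch models each fusion as an ordered list of named segments rather than as an amino acid sequence; the function name `cleave` and the segment labels are hypothetical and are introduced here solely for illustration. It assumes complete cleavage at every listed site.

```python
from typing import List, Set


def cleave(segments: List[str], sites_after: Set[int]) -> List[List[str]]:
    """Split an ordered list of segments at protease sites.

    sites_after holds the indices of segments that are immediately followed
    by a cleavage site. Complete cleavage at every site is assumed.
    """
    fragments: List[List[str]] = []
    current: List[str] = []
    for index, name in enumerate(segments):
        current.append(name)
        if index in sites_after:
            fragments.append(current)
            current = []
    if current:
        fragments.append(current)
    return fragments


# FIG. 10A: kappa-casein (para-kappa-casein + macropeptide) fused to beta-lactoglobulin.
fig_10a = ["para-kappa-casein", "macropeptide", "beta-lactoglobulin"]

# Natural chymosin site only (arrow 1): two fragments.
natural_only = cleave(fig_10a, {0})

# Natural site plus an added second site (arrow 2): three fragments.
with_second_site = cleave(fig_10a, {0, 1})

# FIG. 10B: para-kappa-casein fused directly to beta-lactoglobulin via an added site.
fig_10b = ["para-kappa-casein", "beta-lactoglobulin"]
direct_fusion = cleave(fig_10b, {0})

if __name__ == "__main__":
    print(natural_only)
    print(with_second_site)
    print(direct_fusion)
```

In this sketch, the FIG. 10B arrangement yields no macropeptide fragment, consistent with the description above.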

[0032] FIG. 11 is a flow-chart showing an illustrative process for producing a food composition comprising an unstructured milk protein, as described herein.

DETAILED DESCRIPTION

[0033] The following description includes information that may be useful in understanding the present disclosure. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed disclosures, or that any publication specifically or implicitly referenced is prior art.

Definitions

[0034] While the following terms are believed to be well understood by one of ordinary skill in the art, the following definitions are set forth to facilitate explanation of the presently disclosed subject matter.

[0035] All technical and scientific terms used herein, unless otherwise defined below, are intended to have the same meaning as commonly understood by one of ordinary skill in the art. References to techniques employed herein are intended to refer to the techniques as commonly understood in the art, including variations on those techniques and/or substitutions of equivalent techniques that would be apparent to one of skill in the art.

[0036] As used herein, the singular forms "a," "an," and "the" include plural referents unless the content clearly dictates otherwise.

[0037] The term "about" or "approximately" when immediately preceding a numerical value means a range (e.g., plus or minus 10% of that value). For example, "about 50" can mean 45 to 55, "about 25,000" can mean 22,500 to 27,500, etc., unless the context of the disclosure indicates otherwise, or is inconsistent with such an interpretation. For example, in a list of numerical values such as "about 49, about 50, about 55, . . . ", "about 50" means a range extending to less than half the interval(s) between the preceding and subsequent values, e.g., more than 49.5 to less than 52.5. Furthermore, the phrases "less than about" a value or "greater than about" a value should be understood in view of the definition of the term "about" provided herein. Similarly, the term "about" when preceding a series of numerical values or a range of values (e.g., "about 10, 20, 30" or "about 10-30") refers, respectively, to all values in the series, or the endpoints of the range.
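By way of non-limiting illustration only, the two readings of "about" given above can be expressed as a short Python sketch. The function names `about_range` and `about_in_list` and the parameter `tolerance_pct` are hypothetical names introduced here; the 10% default reflects the example given in the definition and is not the only possible range.

```python
from typing import Tuple


def about_range(value: float, tolerance_pct: float = 10) -> Tuple[float, float]:
    """Inclusive interval for "about value", read as plus or minus tolerance_pct percent."""
    delta = abs(value) * tolerance_pct / 100
    return (value - delta, value + delta)


def about_in_list(previous: float, value: float, following: float) -> Tuple[float, float]:
    """Open interval for "about value" inside a list of values.

    The interval extends to less than half the distance to each neighbor,
    so the returned bounds are exclusive.
    """
    return (value - (value - previous) / 2, value + (following - value) / 2)


if __name__ == "__main__":
    print(about_range(50))            # bounds for "about 50"
    print(about_range(25000))         # bounds for "about 25,000"
    print(about_in_list(49, 50, 55))  # exclusive bounds for "about 50" in a list
```

With the values used in the definition, the first function gives 45 to 55 and 22,500 to 27,500, and the second gives the exclusive bounds 49.5 and 52.5.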

[0038] As used herein, "mammalian milk" can refer to milk derived from any mammal, such as bovine, human, goat, sheep, camel, buffalo, water buffalo, dromedary, llama and any combination thereof. In some embodiments, a mammalian milk is a bovine milk.

[0039] As used herein, "structured" refers to those proteins having a well-defined secondary and tertiary structure, and "unstructured" refers to proteins that do not have well defined secondary and/or tertiary structures. An unstructured protein may also be described as lacking a fixed or ordered three-dimensional structure. "Disordered" and "intrinsically disordered" are synonymous with unstructured.

[0040] As used herein, "rennet" refers to a set of enzymes typically produced in the stomachs of ruminant mammals. Chymosin, its key component, is a protease enzyme that cleaves κ-casein (to produce para-κ-casein). In addition to chymosin, rennet contains other enzymes, such as pepsin and lipase. Rennet is used to separate milk into solid curds (for cheesemaking) and liquid whey. Rennet or rennet substitutes are used in the production of most cheeses.

[0041] As used herein, "whey" refers to the liquid remaining after milk has been curdled and strained, for example during cheesemaking. Whey comprises a collection of globular proteins, typically a mixture of β-lactoglobulin, α-lactalbumin, bovine serum albumin, and immunoglobulins.

[0042] The term "plant" includes reference to whole plants, plant organs, plant tissues, and plant cells and progeny of same, and includes, but is not limited to, angiosperms and gymnosperms such as Arabidopsis, potato, tomato, tobacco, alfalfa, lettuce, carrot, strawberry, sugarbeet, cassava, sweet potato, soybean, lima bean, pea, chick pea, maize (corn), turf grass, wheat, rice, barley, sorghum, oat, oak, eucalyptus, walnut, palm and duckweed as well as fern and moss. Thus, a plant may be a monocot, a dicot, a vascular plant reproduced from spores such as fern or a nonvascular plant such as moss, liverwort, hornwort and algae. The word "plant," as used herein, also encompasses plant cells, seeds, plant progeny, propagule whether generated sexually or asexually, and descendants of any of these, such as cuttings or seed. Plant cells include suspension cultures, callus, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, seeds and microspores. Plants may be at various stages of maturity and may be grown in liquid or solid culture, or in soil or suitable media in pots, greenhouses or fields. Expression of an introduced leader, trailer or gene sequences in plants may be transient or permanent.

[0043] The term "vascular plant" refers to a large group of plants that are defined as those land plants that have lignified tissues (the xylem) for conducting water and minerals throughout the plant and a specialized non-lignified tissue (the phloem) to conduct products of photosynthesis. Vascular plants include the clubmosses, horsetails, ferns, gymnosperms (including conifers) and angiosperms (flowering plants). Scientific names for the group include Tracheophyta and Tracheobionta. Vascular plants are distinguished by two primary characteristics. First, vascular plants have vascular tissues which distribute resources through the plant. This feature allows vascular plants to evolve to a larger size than non-vascular plants, which lack these specialized conducting tissues and are therefore restricted to relatively small sizes. Second, in vascular plants, the principal generation phase is the sporophyte, which is usually diploid with two sets of chromosomes per cell. Only the germ cells and gametophytes are haploid. By contrast, the principal generation phase in non-vascular plants is the gametophyte, which is haploid with one set of chromosomes per cell. In these plants, only the spore stalk and capsule are diploid.

[0044] The term "non-vascular plant" refers to a plant without a vascular system consisting of xylem and phloem. Many non-vascular plants have simpler tissues that are specialized for internal transport of water. For example, mosses and leafy liverworts have structures that look like leaves, but are not true leaves because they are single sheets of cells with no stomata, no internal air spaces and have no xylem or phloem. Non-vascular plants include two distantly related groups. The first group is the bryophytes, which are further categorized as three separate land plant Divisions, namely Bryophyta (mosses), Marchantiophyta (liverworts), and Anthocerotophyta (hornworts). In all bryophytes, the primary plants are the haploid gametophytes, with the only diploid portion being the attached sporophyte, consisting of a stalk and sporangium. Because these plants lack lignified water-conducting tissues, they cannot become as tall as most vascular plants. The second group is the algae, especially the green algae, which consists of several unrelated groups. Only those groups of algae included in the Viridiplantae are still considered relatives of land plants.

[0045] The term "plant part" refers to any part of a plant including but not limited to the embryo, shoot, root, stem, seed, stipule, leaf, petal, flower bud, flower, ovule, bract, trichome, branch, petiole, internode, bark, pubescence, tiller, rhizome, frond, blade, ovule, pollen, stamen, and the like. The two main parts of plants grown in some sort of media, such as soil or vermiculite, are often referred to as the "above-ground" part, also often referred to as the "shoots", and the "below-ground" part, also often referred to as the "roots".

[0046] The term "plant tissue" refers to any part of a plant, such as a plant organ. Examples of plant organs include, but are not limited to the leaf, stem, root, tuber, seed, branch, pubescence, nodule, leaf axil, flower, pollen, stamen, pistil, petal, peduncle, stalk, stigma, style, bract, fruit, trunk, carpel, sepal, anther, ovule, pedicel, needle, cone, rhizome, stolon, shoot, pericarp, endosperm, placenta, berry, stamen, and leaf sheath.

[0047] The term "seed" is meant to encompass the whole seed and/or all seed components, including, for example, the coleoptile and leaves, radicle and coleorhiza, scutellum, starchy endosperm, aleurone layer, pericarp and/or testa, either during seed maturation or seed germination.

[0048] The term "transgenic plant" means a plant that has been transformed with one or more exogenous nucleic acids. "Transformation" refers to a process by which a nucleic acid is stably integrated into the genome of a plant cell. "Stably integrated" refers to the permanent, or non-transient retention and/or expression of a polynucleotide in and by a cell genome. Thus, a stably integrated polynucleotide is one that is a fixture within a transformed cell genome and can be replicated and propagated through successive progeny of the cell or resultant transformed plant. Transformation may occur under natural or artificial conditions using various methods well known in the art. Transformation may rely on any known method for the insertion of nucleic acid sequences into a prokaryotic or eukaryotic host cell, including Agrobacterium-mediated transformation protocols, viral infection, whiskers, electroporation, heat shock, lipofection, polyethylene glycol treatment, micro-injection, and particle bombardment.

[0049] As used herein, the terms "stably expressed" or "stable expression" refer to expression and accumulation of a protein in a plant cell over time. In some embodiments, a protein may accumulate because it is not degraded by endogenous plant proteases. In some embodiments, a protein is considered to be stably expressed in a plant if it is present in the plant in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.

[0050] As used herein, the term "fusion protein" refers to a protein comprising at least two constituent proteins (or fragments or variants thereof) that are encoded by separate genes, and that have been joined so that they are transcribed and translated as a single polypeptide. In some embodiments, a fusion protein may be separated into its constituent proteins, for example by cleavage with a protease.

[0051] The term "recombinant" refers to nucleic acids or proteins formed by laboratory methods of genetic recombination (e.g., molecular cloning) to bring together genetic material from multiple sources, creating sequences that would not otherwise be found in the genome. A recombinant fusion protein is a protein created by combining sequences encoding two or more constituent proteins, such that they are expressed as a single polypeptide. Recombinant fusion proteins may be expressed in vivo in various types of host cells, including plant cells, bacterial cells, fungal cells, mammalian cells, etc. Recombinant fusion proteins may also be generated in vitro.

[0052] The term "promoter" or a "transcription regulatory region" refers to nucleic acid sequences that influence and/or promote initiation of transcription. Promoters are typically considered to include regulatory regions, such as enhancer or inducer elements. The promoter will generally be appropriate to the host cell in which the target gene is being expressed. The promoter, together with other transcriptional and translational regulatory nucleic acid sequences (also termed "control sequences"), is necessary to express any given gene. In general, the transcriptional and translational regulatory sequences include, but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop sequences, translational start and stop sequences, and enhancer or activator sequences.

[0053] The term "signal peptide", also known as "signal sequence", "targeting signal", "localization signal", "localization sequence", "transit peptide", "leader sequence", or "leader peptide", is used herein to refer to an N-terminal peptide which directs a newly synthesized protein to a specific cellular location or pathway. Signal peptides are often cleaved from a protein during translation or transport, and are therefore not typically present in a mature protein.

[0054] The term "proteolysis" or "proteolytic" or "proteolyze" means the breakdown of proteins into smaller polypeptides or amino acids. Uncatalyzed hydrolysis of peptide bonds is extremely slow. Proteolysis is typically catalyzed by cellular enzymes called proteases, but may also occur by intra-molecular digestion. Low pH or high temperatures can also cause proteolysis non-enzymatically. Limited proteolysis of a polypeptide during or after translation in protein synthesis often occurs for many proteins. This may involve removal of the N-terminal methionine, signal peptide, and/or the conversion of an inactive or non-functional protein to an active one.

[0055] The term "2A peptide", used herein, refers to a nucleic acid sequence encoding a 2A peptide or the 2A peptide itself. The average length of 2A peptides is 18-22 amino acids. The designation "2A" refers to a specific region of picornavirus polyproteins and arose from a systematic nomenclature adopted by researchers. In foot-and-mouth disease virus (FMDV), a member of the Picornaviridae family, a 2A sequence appears to have the unique capability to mediate cleavage at its own C-terminus by an apparently enzyme-independent, novel type of reaction. This sequence can also mediate cleavage in a heterologous protein context in a range of eukaryotic expression systems. The 2A sequence is inserted between two genes of interest, maintaining a single open reading frame. Efficient cleavage of the polyprotein can lead to coordinated expression of two active proteins of interest. Self-processing polyproteins using the FMDV 2A sequence could therefore provide a system for ensuring coordinated, stable expression of multiple introduced proteins in cells including plant cells.

[0056] The term "purifying" is used interchangeably with the term "isolating" and generally refers to the separation of a particular component from other components of the environment in which it was found or produced. For example, purifying a recombinant protein from plant cells in which it was produced typically means subjecting transgenic protein-containing plant material to biochemical purification and/or column chromatography.

[0057] When referring to expression of a protein in a specific amount per the total protein weight of the soluble protein extractable from the plant ("TSP"), it is meant an amount of a protein of interest relative to the total amount of protein that may reasonably be extracted from a plant using standard methods. Methods for extracting total protein from a plant are known in the art. For example, total protein may be extracted from seeds by bead beating seeds at about 15000 rpm for about 1 min. The resulting powder may then be resuspended in an appropriate buffer (e.g., 50 mM Carbonate-Bicarbonate pH 10.8, 1 mM DTT, 1.times. Protease Inhibitor Cocktail). After the resuspended powder is incubated at about 4.degree. C. for about 15 minutes, the supernatant may be collected after centrifuging (e.g., at 4000 g, 20 min, 4.degree. C.). Total protein may be measured using standard assays, such as a Bradford assay. The amount of protein of interest may be measured using methods known in the art, such as an ELISA or a Western Blot.
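The expression level described in paragraph [0057] is a simple ratio of two measurements taken on the same extract. The following Python sketch shows that arithmetic; the function name and the example numbers are hypothetical and are not data from this disclosure.

```python
def percent_tsp(protein_of_interest_ug, total_soluble_protein_ug):
    """Express a protein of interest as a percentage of total soluble protein (TSP).

    Both arguments must refer to the same extract and use the same unit,
    e.g. micrograms measured by ELISA (protein of interest) and by a
    Bradford assay (total protein).
    """
    if total_soluble_protein_ug <= 0:
        raise ValueError("total soluble protein must be positive")
    if protein_of_interest_ug < 0:
        raise ValueError("protein of interest cannot be negative")
    return 100.0 * protein_of_interest_ug / total_soluble_protein_ug


# Hypothetical example: 25 ug of target protein in an extract
# containing 1000 ug of total soluble protein.
print(percent_tsp(25.0, 1000.0))  # prints 2.5
```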

[0058] When referring to a nucleic acid sequence or protein sequence, the term "identity" is used to denote similarity between two sequences. Sequence similarity or identity may be determined using standard techniques known in the art, including, but not limited to, the local sequence identity algorithm of Smith & Waterman, Adv. Appl. Math. 2, 482 (1981), the sequence identity alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48, 443 (1970), the search for similarity method of Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85, 2444 (1988), computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Drive, Madison, Wis.), the Best Fit sequence program described by Devereux et al., Nucl. Acid Res. 12, 387-395 (1984), or inspection. Another suitable algorithm is the BLAST algorithm, described in Altschul et al., J. Mol. Biol. 215, 403-410 (1990) and Karlin et al., Proc. Natl. Acad. Sci. USA 90, 5873-5877 (1993). A particularly useful BLAST program is the WU-BLAST-2 program which was obtained from Altschul et al., Methods in Enzymology, 266, 460-480 (1996); http://blast.wustl/edu/blast/README.html. WU-BLAST-2 uses several search parameters, which are optionally set to the default values. The parameters are dynamic values and are established by the program itself depending upon the composition of the particular sequence and composition of the particular database against which the sequence of interest is being searched; however, the values may be adjusted to increase sensitivity. Further, an additional useful algorithm is gapped BLAST as reported by Altschul et al., (1997) Nucleic Acids Res. 25, 3389-3402.

As used herein, the terms "dicot" or "dicotyledon" or "dicotyledonous" refer to a flowering plant whose embryos have two seed leaves or cotyledons. Examples of dicots include, but are not limited to, Arabidopsis, tobacco, tomato, potato, sweet potato, cassava, alfalfa, lima bean, pea, chick pea, soybean, carrot, strawberry, lettuce, oak, maple, walnut, rose, mint, squash, daisy, quinoa, buckwheat, mung bean, cow pea, lentil, lupin, peanut, fava bean, French beans (i.e., common beans), mustard, or cactus.

[0059] The terms "monocot" or "monocotyledon" or "monocotyledonous" refer to a flowering plant whose embryos have one cotyledon or seed leaf. Examples of monocots include, but are not limited to turf grass, maize (corn), rice, oat, wheat, barley, sorghum, orchid, iris, lily, onion, palm, and duckweed.

[0060] As used herein, a "low lactose product" is any food composition considered by the FDA to be "lactose reduced", "low lactose", or "lactose free".

Unstructured Milk Proteins

[0061] The fusion proteins described herein may comprise one or more unstructured milk proteins. As used herein the term "milk protein" refers to any protein, or fragment or variant thereof, that is typically found in one or more mammalian milks. Examples of mammalian milk include, but are not limited to, milk produced by a cow, human, goat, sheep, camel, horse, donkey, dog, cat, elephant, monkey, mouse, rat, hamster, guinea pig, whale, dolphin, seal, buffalo, water buffalo, dromedary, llama, yak, zebu, reindeer, mole, otter, weasel, wolf, raccoon, walrus, polar bear, rabbit, or giraffe.

[0062] An "unstructured milk protein" is a milk protein that lacks a defined secondary structure, a defined tertiary structure, or a defined secondary and tertiary structure. Whether a milk protein is unstructured may be determined using a variety of biophysical and biochemical methods known in the art, such as small angle X-ray scattering, Raman optical activity, circular dichroism, nuclear magnetic resonance (NMR) and protease sensitivity. In some embodiments, a milk protein is considered to be unstructured if it is unable to be crystallized using standard techniques.

[0063] Illustrative unstructured milk proteins that may be used in the fusion proteins of the disclosure include members of the casein family of proteins, such as .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, and .kappa.-casein. The caseins are phosphoproteins, and make up approximately 80% of the protein content in bovine milk and about 20-45% of the protein in human milk. Caseins form a multi-molecular, granular structure called a casein micelle in which some enzymes, water, and salts, such as calcium and phosphorus, are present. The micellar structure of casein in milk is significant both for the mode of digestion of milk in the stomach and intestine and as a basis for separating some proteins and other components from cow milk. In practice, casein proteins in bovine milk can be separated from whey proteins by acid precipitation of caseins, by breaking the micellar structure by partial hydrolysis of the protein molecules with proteolytic enzymes, or by microfiltration to separate the smaller soluble whey proteins from the larger casein micelle. Caseins are relatively hydrophobic, making them poorly soluble in water.

[0064] In some embodiments, the casein proteins described herein (e.g., .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, and/or .kappa.-casein) are isolated or derived from cow (Bos taurus), goat (Capra hircus), sheep (Ovis aries), water buffalo (Bubalus bubalis), dromedary camel (Camelus dromedarius), bactrian camel (Camelus bactrianus), wild yak (Bos mutus), horse (Equus caballus), donkey (Equus asinus), reindeer (Rangifer tarandus), eurasian elk (Alces alces), alpaca (Vicugna pacos), zebu (Bos indicus), llama (Lama glama), or human (Homo sapiens). In some embodiments, a casein protein (e.g., .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, or .kappa.-casein) has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity with a casein protein from one or more of cow (Bos taurus), goat (Capra hircus), sheep (Ovis aries), water buffalo (Bubalus bubalis), dromedary camel (Camelus dromedarius), bactrian camel (Camelus bactrianus), wild yak (Bos mutus), horse (Equus caballus), donkey (Equus asinus), reindeer (Rangifer tarandus), eurasian elk (Alces alces), alpaca (Vicugna pacos), zebu (Bos indicus), llama (Lama glama), or human (Homo sapiens).

[0065] As used herein, the term ".alpha.-S1 casein" refers to not only the .alpha.-S1 casein protein, but also fragments or variants thereof. .alpha.-S1 casein is found in the milk of numerous different mammalian species, including cow, goat, and sheep. The sequence, structure and physical/chemical properties of .alpha.-S1 casein derived from various species are highly variable. An exemplary sequence for bovine .alpha.-S1 casein can be found at Uniprot Accession No. P02662, and an exemplary sequence for goat .alpha.-S1 casein can be found at GenBank Accession No. X59836.1.

[0066] As used herein, the term ".alpha.-S2 casein" refers to not only the .alpha.-S2 casein protein, but also fragments or variants thereof. .alpha.-S2 casein is known as epsilon-casein in mouse, gamma-casein in rat, and casein-A in guinea pig. The sequence, structure and physical/chemical properties of .alpha.-S2 casein derived from various species are highly variable. An exemplary sequence for bovine .alpha.-S2 casein can be found at Uniprot Accession No. P02663, and an exemplary sequence for goat .alpha.-S2 casein can be found at Uniprot Accession No. P33049.

[0067] As used herein, the term ".beta.-casein" refers to not only the .beta.-casein protein, but also fragments or variants thereof. For example, A1 and A2 .beta.-casein are genetic variants of the .beta.-casein milk protein that differ by one amino acid (at amino acid 67, A2 .beta.-casein has a proline, whereas A1 has a histidine). Other genetic variants of .beta.-casein include the A3, B, C, D, E, F, H1, H2, I and G genetic variants. The sequence, structure and physical/chemical properties of .beta.-casein derived from various species are highly variable. Exemplary sequences for bovine .beta.-casein can be found at Uniprot Accession No. P02666 and GenBank Accession No. M15132.1.
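The single-residue difference between the A1 and A2 variants described in paragraph [0067] can be checked programmatically. The Python sketch below is illustrative only. It assumes the caller supplies a sequence numbered the same way as the paragraph (the disclosure does not state whether residue 67 is counted with or without the signal peptide), the function name is hypothetical, and the test sequence is a synthetic placeholder rather than a real .beta.-casein sequence.

```python
def classify_beta_casein_position_67(sequence):
    """Report whether residue 67 (1-based) is proline, histidine, or something else.

    The result is labeled "A2-like" or "A1-like" because position 67 alone
    does not distinguish the other genetic variants named in the text.
    """
    if len(sequence) < 67:
        raise ValueError("sequence is shorter than 67 residues")
    residue = sequence[66].upper()  # index 66 is residue 67 in 1-based numbering
    if residue == "P":
        return "A2-like (proline at 67)"
    if residue == "H":
        return "A1-like (histidine at 67)"
    return f"other (residue {residue} at 67)"


# Synthetic placeholder: 66 glycines, then the residue of interest.
print(classify_beta_casein_position_67("G" * 66 + "P" + "G" * 10))
```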

[0068] As used herein, the term ".kappa.-casein" refers to not only the .kappa.-casein protein, but also fragments or variants thereof. .kappa.-casein is cleaved by rennet, which releases a macropeptide from the C-terminal region. The remaining product with the N-terminus and two-thirds of the original peptide chain is referred to as para-.kappa.-casein. The sequence, structure and physical/chemical properties of .kappa.-casein derived from various species are highly variable. Exemplary sequences for bovine .kappa.-casein can be found at Uniprot Accession No. P02668 and GenBank Accession No. CAA25231.
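The relationship in paragraph [0068] between .kappa.-casein, para-.kappa.-casein, and the released macropeptide amounts to splitting one chain at a single position. The Python sketch below illustrates this. The 169-residue length and the cut after residue 105 are assumptions taken from commonly reported values for mature bovine .kappa.-casein, not from this disclosure, and the sequence is a placeholder string, not SEQ ID NO: 4.

```python
def split_at_cleavage_site(sequence, last_n_terminal_residue):
    """Split a protein sequence after the given 1-based residue position.

    Returns (n_terminal_fragment, c_terminal_fragment).
    """
    if not 0 < last_n_terminal_residue < len(sequence):
        raise ValueError("cleavage position must fall inside the sequence")
    return sequence[:last_n_terminal_residue], sequence[last_n_terminal_residue:]


# Placeholder of 169 residues (NOT the real kappa-casein sequence).
placeholder = "A" * 105 + "M" * 64
para_kappa, macropeptide = split_at_cleavage_site(placeholder, 105)

# Fraction of the chain retained in the N-terminal product.
fraction_retained = len(para_kappa) / len(placeholder)
print(f"{fraction_retained:.2f}")  # prints 0.62
```

Under these assumed numbers the N-terminal product retains roughly 62% of the chain, which is consistent with the "two-thirds" stated in the paragraph.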

[0069] In some embodiments, the unstructured milk protein is a casein protein, for example, .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, and/or .kappa.-casein. In some embodiments, the unstructured milk protein is .kappa.-casein and comprises the sequence of SEQ ID NO: 4, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is para-.kappa.-casein and comprises the sequence of SEQ ID NO: 2, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is .beta.-casein and comprises the sequence of SEQ ID NO: 6, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is .alpha.-S1 casein and comprises the sequence of SEQ ID NO: 8, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is .alpha.-S2 casein and comprises the sequence of SEQ ID NO: 84, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
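Thresholds such as "at least 90% identical" depend on how the two sequences are aligned and how identity is counted. As one illustration, the Python sketch below performs a basic Needleman-Wunsch global alignment (one of the methods cited in paragraph [0058]) and reports matches divided by alignment length. The scoring values, the choice of denominator, and the function names are assumptions made for this sketch; the disclosure does not prescribe them, and production tools such as BLAST use substitution matrices and different gap models. The test strings are arbitrary, not sequences from the sequence listing.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    aligned_a, aligned_b = global_align(a, b)
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def meets_threshold(candidate, reference, threshold_percent):
    """True if the candidate is at least threshold_percent identical to the reference."""
    return percent_identity(candidate, reference) >= threshold_percent
```

For example, two ten-residue strings that differ at one position score 90% under this definition, so `meets_threshold` returns `True` at a 90% threshold and `False` at 95%.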

[0070] In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 4. In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 2. In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 6. In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 8. In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 84.

[0071] In some embodiments, .alpha.-S1 casein is encoded by the sequence of SEQ ID NO: 7, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, .alpha.-S2 casein is encoded by the sequence of SEQ ID NO: 83, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, .beta.-casein is encoded by the sequence of SEQ ID NO: 5, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, .kappa.-casein is encoded by the sequence of SEQ ID NO: 3, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, para-.kappa.-casein is encoded by the sequence of SEQ ID NO: 1, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.

[0072] In some embodiments, the unstructured milk protein is encoded by a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 7. In some embodiments, the unstructured milk protein is encoded by a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 83. In some embodiments, the unstructured milk protein is encoded by a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 3. In some embodiments, the unstructured milk protein is encoded by a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 1. In some embodiments, the unstructured milk protein is encoded by a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 5.

[0073] In some embodiments, the unstructured milk protein is a casein protein, and comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NO: 85-133. In some embodiments, the unstructured milk protein is a casein protein and comprises the sequence of any one of SEQ ID NO: 85-133.

[0074] In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NO: 85-98. In some embodiments, the unstructured milk protein comprises the sequence of any one of SEQ ID NO: 85-98.

[0075] In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NO: 99-109. In some embodiments, the unstructured milk protein comprises the sequence of any one of SEQ ID NO: 99-109.

[0076] In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NO: 110-120. In some embodiments, the unstructured milk protein comprises the sequence of any one of SEQ ID NO: 110-120.

[0077] In some embodiments, the unstructured milk protein comprises a sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NO: 121-133. In some embodiments, the unstructured milk protein comprises the sequence of any one of SEQ ID NO: 121-133.

Structured Proteins

[0078] The fusion proteins described herein may comprise one or more structured proteins, including any fragment or variant thereof. The proteins may be, for example, structured animal proteins, or structured plant proteins. In some embodiments, the structured animal proteins are mammalian proteins. In some embodiments, the structured animal proteins are avian proteins. In some embodiments, the structured proteins are structured milk proteins.

[0079] Whether a milk protein is structured may be determined using a variety of biophysical and biochemical methods known in the art, such as small angle X-ray scattering, Raman optical activity, circular dichroism, and protease sensitivity. In some embodiments, a milk protein is considered to be structured if it has been crystallized or if it may be crystallized using standard techniques.

[0080] In some embodiments, the structured protein is not a protein that is typically used as a marker. As used herein, the term "marker" refers to a protein that produces a visual or other signal and is used to detect successful delivery of a vector (e.g., a DNA sequence) into a cell. Proteins typically used as a marker may include, for example, fluorescent proteins (e.g., green fluorescent protein (GFP)) and bacterial or other enzymes (e.g., .beta.-glucuronidase (GUS), .beta.-galactosidase, luciferase, chloramphenicol acetyltransferase). In some embodiments, the structured protein is a non-marker protein.

[0081] A non-limiting list of illustrative structured proteins that may be used in the fusion proteins described herein is provided in Table 1. In some embodiments, a fragment or variant of any one of the proteins listed in Table 1 may be used. In some embodiments, the structured protein may be an animal protein. For example, in some embodiments, the structured protein may be a mammalian protein. In some embodiments, the structured protein may be a plant protein. For example, the plant protein may be a protein that is not typically expressed in a seed. In some embodiments, the plant protein may be a storage protein, e.g., a protein that acts as a storage reserve for nitrogen, carbon, and/or sulfur. In some embodiments, the plant protein may inhibit one or more proteases. In some embodiments, the structured protein may be a fungal protein.

TABLE 1 Structured proteins

Protein Categories  Protein or Protein family     Native Species                        Exemplary Uniprot Accession No.
Mammalian           Alpha-lactalbumin             Bovine (Bos taurus)                   P00711
                    Beta-lactoglobulin            Bovine (Bos taurus)                   P02754
                    Albumin                       Bovine (Bos taurus)                   P02769
                    Lysozyme                      Bovine (Bos taurus)                   Q6B411
                    Collagen family               Human (Homo sapiens)                  Q02388, P02452, P08123, P02458
                    Hemoglobin                    Bovine (Bos taurus)                   P02070
Avian proteins      Ovalbumin                     Chicken (Gallus gallus)               P01012
                    Ovotransferrin                Chicken (Gallus gallus)               P02789
                    Ovoglobulin                   Chicken (Gallus gallus)               I0J170
                    Lysozyme                      Chicken (Gallus gallus)               P00698
Plant Proteins      Oleosins                      Soybean (Glycine max)                 P29530, P29531
                    Leghemoglobin                 Soybean (Glycine max)                 Q41219
                    Extensin-like protein family  Soybean (Glycine soja)                A0A445JU93
                    Prolamine                     Rice (Oryza sativa)                   Q0DJ45
                    Glutenin                      Wheat (Sorghum bicolor)               P10388
                    Gamma-kafirin preprotein      Wheat (Sorghum bicolor)               Q41506
                    Alpha globulin                Rice (Oryza sativa)                   P29835
                    Basic 7S globulin precursor   Soybean (Glycine max)                 P13917
                    2S albumin                    Soybean (Glycine max)                 P19594
                    Beta-conglycinins             Soybean (Glycine max)                 P0DO16, P0DO15, P0DO15
                    Glycinins                     Soybean (Glycine max)                 P04347, P04776, P04405
                    Canein                        Sugar cane (Saccharum officinarum)    ABP64791.1
                    Zein                          Corn (Zea mays)                       ABP64791.1
                    Patatin                       Tomato (Solanum lycopersicum)         P07745
                    Kunitz-Trypsin inhibitor      Soybean (Glycine max)                 Q39898
                    Bowman-Birk inhibitor         Soybean (Glycine max)                 I1MQD2
                    Cystatine                     Tomato (Solanum lycopersicum)         Q9SE07
Fungal proteins     Hydrophobin I                 Fungus (Trichoderma reesei)           P52754
                    Hydrophobin II                Fungus (Trichoderma reesei)           P79073

[0082] In some embodiments, the structured protein is an animal protein. In some embodiments, the structured protein is a mammalian protein. For example, the structured protein may be a mammalian protein selected from: .beta.-lactoglobulin, .alpha.-lactalbumin, albumin, lysozyme, lactoferrin, lactoperoxidase, hemoglobin, collagen, and an immunoglobulin (e.g., IgA, IgG, IgM, IgE). In some embodiments, the structured mammalian protein is .beta.-lactoglobulin and comprises the sequence of SEQ ID NO: 10, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the structured mammalian protein is .beta.-lactoglobulin and is encoded by the sequence of any one of SEQ ID NO: 9, 11, 12, or 13, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NO: 9, 11, 12, or 13. In some embodiments, the structured protein is an avian protein. For example, the structured protein may be an avian protein selected from: ovalbumin, ovotransferrin, lysozyme and ovoglobulin.

[0083] In some embodiments, the structured protein is a plant protein. For example, the structured protein may be a plant protein selected from: hydrophobin I, hydrophobin II, oleosins, leghemoglobin, extensin-like protein family, prolamine, glutenin, gamma-kafirin preprotein, .alpha.-globulin, basic 7S globulin precursor, 2S albumin, .beta.-conglycinins, glycinins, canein, zein, patatin, kunitz-trypsin inhibitor, bowman-birk inhibitor, and cystatine.

Fusion Proteins

Fusion Proteins Comprising an Unstructured Milk Protein and a Structured Animal (e.g., Mammalian) Protein

[0084] In some embodiments, the fusion proteins described herein comprise (i) an unstructured milk protein, and (ii) a structured animal protein. In some embodiments, the fusion proteins described herein comprise (i) an unstructured milk protein, and (ii) a structured mammalian protein. In some embodiments, the fusion proteins described herein comprise (i) an unstructured milk protein, and (ii) a structured avian protein. In some embodiments, the fusion proteins described herein comprise (i) an unstructured milk protein, and (ii) a structured fungal protein.

[0085] In some embodiments, the fusion proteins comprise an unstructured milk protein, such as a casein protein. In some embodiments, the fusion proteins comprise an unstructured milk protein selected from .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, and .kappa.-casein. In some embodiments, the fusion proteins comprise an unstructured milk protein isolated or derived from cow (Bos taurus), goat (Capra hircus), sheep (Ovis aries), water buffalo (Bubalus bubalis), dromedary camel (Camelus dromedarius), bactrian camel (Camelus bactrianus), wild yak (Bos mutus), horse (Equus caballus), donkey (Equus asinus), reindeer (Rangifer tarandus), eurasian elk (Alces alces), alpaca (Vicugna pacos), zebu (Bos indicus), llama (Lama glama), or human (Homo sapiens). In some embodiments, the fusion proteins comprise a casein protein (e.g., .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, or .kappa.-casein) from cow (Bos taurus), goat (Capra hircus), sheep (Ovis aries), water buffalo (Bubalus bubalis), dromedary camel (Camelus dromedarius), bactrian camel (Camelus bactrianus), wild yak (Bos mutus), horse (Equus caballus), donkey (Equus asinus), reindeer (Rangifer tarandus), eurasian elk (Alces alces), alpaca (Vicugna pacos), zebu (Bos indicus), llama (Lama glama), or human (Homo sapiens).

[0086] In some embodiments, the unstructured milk protein is .alpha.-S1 casein. In some embodiments, the unstructured milk protein is .alpha.-S1 casein and comprises the sequence of SEQ ID NO: 8, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is .alpha.-S1 casein and comprises the sequence of any one of SEQ ID NO: 99-109, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.

[0087] In some embodiments, the unstructured milk protein is .alpha.-S2 casein. In some embodiments, the unstructured milk protein is .alpha.-S2 casein and comprises the sequence SEQ ID NO: 84, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is .alpha.-S2 casein and comprises the sequence of any one of SEQ ID NO: 110-120, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.

[0088] In some embodiments, the unstructured milk protein is .beta.-casein. In some embodiments, the unstructured milk protein is .beta.-casein and comprises the sequence of SEQ ID NO: 6, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is .beta.-casein and comprises the sequence of any one of SEQ ID NO: 121-133, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.

[0089] In some embodiments, the unstructured milk protein is .kappa.-casein. In some embodiments, the unstructured milk protein is .kappa.-casein and comprises the sequence of SEQ ID NO: 4, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the unstructured milk protein is .kappa.-casein and comprises the sequence of any one of SEQ ID NO: 85-98, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.

[0090] In some embodiments, the unstructured milk protein is para-.kappa.-casein. In some embodiments, the unstructured milk protein is para-.kappa.-casein and comprises the sequence of SEQ ID NO: 2, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.

[0091] In some embodiments, the structured mammalian protein is .beta.-lactoglobulin, .alpha.-lactalbumin, albumin, lysozyme, lactoferrin, lactoperoxidase, hemoglobin, collagen, or an immunoglobulin (e.g., IgA, IgG, IgM, or IgE). In some embodiments, the structured avian protein is ovalbumin, ovotransferrin, lysozyme or ovoglobulin.

[0092] In some embodiments, the structured mammalian protein is .beta.-lactoglobulin. In some embodiments, the structured mammalian protein is .beta.-lactoglobulin and comprises the sequence of SEQ ID NO: 10, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.

[0093] In some embodiments, a fusion protein comprises a casein protein (e.g., .kappa.-casein, para-.kappa.-casein, .beta.-casein, or .alpha.-S1 casein) and .beta.-lactoglobulin. In some embodiments, a fusion protein comprises .kappa.-casein and .beta.-lactoglobulin (see, e.g., FIG. 3, FIG. 8, FIG. 10A-10B). In some embodiments, a fusion protein comprises para-.kappa.-casein and .beta.-lactoglobulin (see, e.g., FIG. 6, FIG. 7, FIG. 10A-10B). In some embodiments, a fusion protein comprises .beta.-casein and .beta.-lactoglobulin. In some embodiments, a fusion protein comprises .alpha.-S1 casein and .beta.-lactoglobulin.

[0094] In some embodiments, a plant-expressed recombinant fusion protein comprises .kappa.-casein, or fragment thereof; and .beta.-lactoglobulin, or fragment thereof. In some embodiments, the fusion protein comprises, in order from N-terminus to C-terminus, the .kappa.-casein and the .beta.-lactoglobulin.

Fusion Protein Comprising an Unstructured Milk Protein and a Structured Plant Protein

[0095] In some embodiments, the fusion proteins described herein comprise (i) an unstructured milk protein, and (ii) a structured plant protein. In some embodiments, the unstructured milk protein is a casein protein, such as .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, or .kappa.-casein. In some embodiments, the plant protein is selected from the group consisting of: hydrophobin I, hydrophobin II, oleosins, leghemoglobin, extensin-like protein family, prolamine, glutenin, gamma-kafirin preprotein, .alpha.-globulin, basic 7S globulin precursor, 2S albumin, .beta.-conglycinins, glycinins, canein, zein, patatin, kunitz-trypsin inhibitor, bowman-birk inhibitor, and cystatine.

Fusion Protein Structure

[0096] The fusion proteins described herein may have various different structures, in order to increase expression and/or accumulation in a plant or other host organism or cell. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, an unstructured milk protein and a structured animal (e.g., mammalian or avian) protein. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, a structured animal (e.g., mammalian or avian) protein and a milk protein. For example, in some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, .kappa.-casein and .beta.-lactoglobulin. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, .beta.-lactoglobulin and .kappa.-casein. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, para-.kappa.-casein and .beta.-lactoglobulin. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, .beta.-lactoglobulin and para-.kappa.-casein. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, .beta.-casein and .beta.-lactoglobulin. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, .beta.-lactoglobulin and .beta.-casein. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, .alpha.-S1 casein and .beta.-lactoglobulin. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, .beta.-lactoglobulin and .alpha.-S1 casein.

[0097] In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, an unstructured milk protein and a structured plant protein. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, a structured plant protein and a milk protein. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, a casein protein and a structured plant protein. In some embodiments, a fusion protein comprises, in order from N-terminus to C-terminus, a structured plant protein and a casein protein.

[0098] In some embodiments, a fusion protein comprises a protease cleavage site. For example, in some embodiments, the fusion protein comprises an endoprotease, endopeptidase, and/or endoproteinase cleavage site. In some embodiments, the fusion protein comprises a rennet cleavage site. In some embodiments, the fusion protein comprises a chymosin cleavage site. In some embodiments, the fusion protein comprises a trypsin cleavage site.

[0099] The protease cleavage site may be located between the unstructured milk protein and the structured animal (e.g., mammalian or avian) protein, or between the unstructured milk protein and the structured plant protein, such that cleavage of the protein at the protease cleavage site will separate the unstructured milk protein from the structured animal (e.g., mammalian or avian) or plant protein.

[0100] In some embodiments, the protease cleavage site may be contained within the sequence of either the milk protein or the structured animal (e.g., mammalian or avian) or plant protein. In some embodiments, the protease cleavage site may be added separately, for example, between the two proteins.
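By way of illustration only, the separation of the two fusion partners at a protease cleavage site can be sketched as a string operation. The sketch assumes a two-residue Phe-Met ("FM") motif as the site, cut between the two residues, in the manner of the chymosin-sensitive Phe-Met bond of .kappa.-casein. The function name and the toy sequences are hypothetical and are not sequences of the disclosure; real protease recognition also depends on flanking residues and protein conformation.

```python
from typing import Tuple


def cleave_at_site(fusion_seq: str, site: str = "FM") -> Tuple[str, str]:
    """Split a one-letter amino acid sequence at the first occurrence of `site`.

    Assumption: the scissile bond lies between the first and second residue
    of the site (e.g., between F and M for an "FM" motif).
    """
    idx = fusion_seq.find(site)
    if idx == -1:
        raise ValueError("cleavage site not found in fusion sequence")
    cut = idx + 1  # position immediately after the first residue of the site
    return fusion_seq[:cut], fusion_seq[cut:]


# Toy placeholder sequences (hypothetical, for illustration only).
toy_milk_part = "AAKKF"        # ends with the F of the FM motif
toy_structured_part = "MGGSS"  # begins with the M of the FM motif
n_fragment, c_fragment = cleave_at_site(toy_milk_part + toy_structured_part)
```

Here the N-terminal fragment corresponds to the first fusion partner and the C-terminal fragment to the second, mirroring a site positioned between the two proteins.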

[0101] In some embodiments, a fusion protein comprises a linker between the unstructured milk protein and the structured animal (e.g., mammalian or avian) protein, or between the unstructured milk protein and the structured plant protein. In some embodiments, the linker may comprise a peptide sequence recognizable by an endoprotease. In some embodiments, the linker may comprise a protease cleavage site. In some embodiments, the linker may comprise a self-cleaving peptide, such as a 2A peptide.

[0102] In some embodiments, a fusion protein may comprise a signal peptide. The signal peptide may be cleaved from the fusion protein, for example, during processing or transport of the protein within the cell. In some embodiments, the signal peptide is located at the N-terminus of the fusion protein. In some embodiments, the signal peptide is located at the C-terminus of the fusion protein.

[0103] In some embodiments, the signal peptide is selected from the group consisting of GmSCB1, StPat21, 2Sss, Sig2, Sig12, Sig8, Sig10, Sig11, and Coixss. In some embodiments, the signal peptide is Sig10 and comprises SEQ ID NO: 15, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the signal peptide is Sig2 and comprises SEQ ID NO: 17, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.

[0104] In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 71. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 73. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 75. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 77. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 79. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 81. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 135. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 137.

[0105] In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 71, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 73, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 75, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 77, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 79, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 81, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 135, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 137, with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid substitutions.

[0106] In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 71, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 73, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 75, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 77, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 79, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 81, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 135, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the fusion protein comprises the sequence of SEQ ID NO: 137, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
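As a non-limiting sketch, the substitution counts and percent-identity thresholds recited above can be computed for two sequences of equal length that are assumed to be already aligned position by position. The function names and the toy sequences are hypothetical; in practice, percent identity is usually determined with a sequence alignment tool that also handles insertions and deletions, which this sketch does not.

```python
def count_substitutions(reference: str, variant: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length (pre-aligned, no gaps)")
    return sum(1 for a, b in zip(reference, variant) if a != b)


def percent_identity(reference: str, variant: str) -> float:
    """Percent of positions that are identical between the two sequences."""
    if not reference:
        raise ValueError("reference sequence must not be empty")
    mismatches = count_substitutions(reference, variant)
    return 100.0 * (len(reference) - mismatches) / len(reference)


# Toy 10-residue sequences (hypothetical); one substitution at the last position.
toy_reference = "MKVLILACLV"
toy_variant = "MKVLILACLA"
identity = percent_identity(toy_reference, toy_variant)
meets_90_percent = identity >= 90.0
```

With one substitution in ten residues, the toy variant sits exactly at the 90% threshold; a longer reference tolerates proportionally more substitutions at the same threshold.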

[0107] In some embodiments, the fusion proteins have a molecular weight in the range of about 1 kDa to about 500 kDa, about 1 kDa to about 250 kDa, about 1 to about 100 kDa, about 10 to about 50 kDa, about 1 to about 10 kDa, about 10 to about 200 kDa, about 30 to about 150 kDa, about 30 kDa to about 50 kDa, or about 20 to about 80 kDa.
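For orientation only, the molecular weight of a fusion protein can be roughly estimated from its length using the common rule of thumb of about 110 Da per amino acid residue. The sketch below applies that approximation and checks the estimate against one of the ranges recited above. The function names and the 330-residue length are hypothetical, and an exact mass would require the actual residue composition and any post-translational modifications.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, not an exact value


def estimate_mw_kda(num_residues: int) -> float:
    """Approximate molecular weight in kDa from the number of residues."""
    if num_residues <= 0:
        raise ValueError("number of residues must be positive")
    return num_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


def within_range(mw_kda: float, low_kda: float, high_kda: float) -> bool:
    """True if the estimate falls inside the inclusive range."""
    return low_kda <= mw_kda <= high_kda


# Hypothetical fusion of 330 residues: roughly 36.3 kDa by this approximation.
estimated = estimate_mw_kda(330)
in_30_to_50 = within_range(estimated, 30.0, 50.0)
```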

Nucleic Acids Encoding Fusion Proteins and Vectors Comprising the Same

[0108] Also provided herein are nucleic acids encoding the fusion proteins of the disclosure, for example fusion proteins comprising an unstructured milk protein and a structured animal (e.g., mammalian or avian) or plant protein. In some embodiments, the nucleic acids are DNAs. In some embodiments, the nucleic acids are RNAs.

[0109] In some embodiments, a nucleic acid comprises a sequence encoding a fusion protein. In some embodiments, a nucleic acid comprises a sequence encoding a fusion protein, which is operably linked to a promoter. In some embodiments, a nucleic acid comprises, in order from 5' to 3', a promoter, a 5' untranslated region (UTR), a sequence encoding a fusion protein, and a terminator.

[0110] The promoter may be a plant promoter. A "plant promoter" is a promoter capable of initiating transcription in plant cells. Examples of promoters under developmental control include promoters that preferentially initiate transcription in certain organs, such as leaves, roots, flowers, seeds and tissues such as fibers, xylem vessels, tracheids, or sclerenchyma. Such promoters are referred to as "tissue-preferred." Promoters which initiate transcription only in certain tissue are referred to as "tissue-specific." A "cell-type" specific promoter primarily drives expression in certain cell types in one or more organs, for example, vascular cells in leaves, roots, flowers, or seeds. An "inducible" promoter is a promoter which is under environmental control. Examples of environmental conditions that may affect transcription by inducible promoters include anaerobic conditions or the presence of light. Tissue-specific, tissue-preferred, cell-type specific, and inducible promoters constitute the class of "non-constitutive" promoters. A "constitutive" promoter is a promoter which is active under most environmental conditions.

[0111] In some embodiments, the promoter is a plant promoter derived from, for example, soybean, lima bean, Arabidopsis, tobacco, rice, maize, barley, sorghum, wheat, pea, and/or oat. In some embodiments, the promoter is a constitutive or an inducible promoter. Exemplary constitutive promoters include, but are not limited to, the promoters from plant viruses such as the 35S promoter from CaMV and the promoters from such genes as rice actin; ubiquitin; pEMU; MAS and maize H3 histone. In some embodiments, the constitutive promoter is the ALS promoter, XbaI/NcoI fragment 5' to the Brassica napus ALS3 structural gene (or a nucleotide sequence with similarity to said XbaI/NcoI fragment).

[0112] In some embodiments, the promoter is a plant tissue-specific or tissue-preferential promoter. In some embodiments, the promoter is isolated or derived from a soybean gene. Illustrative soybean tissue-specific promoters include AR-Pro1, AR-Pro2, AR-Pro3, AR-Pro4, AR-Pro5, AR-Pro6, AR-Pro7, AR-Pro8, and AR-Pro9.

[0113] In some embodiments, the promoter is a seed-specific promoter. In some embodiments, the seed-specific promoter is selected from the group consisting of PvPhas, BnNap, AtOle1, GmSeed2, GmSeed3, GmSeed5, GmSeed6, GmSeed7, GmSeed8, GmSeed10, GmSeed11, GmSeed12, pBCON, GmCEP1-L, GmTHIC, GmBg7S1, GmGRD, GmOLEA, GmOLEB, Gm2S-1, and GmBBId-II. In some embodiments, the seed-specific promoter is PvPhas and comprises the sequence of SEQ ID NO: 18, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the seed-specific promoter is GmSeed2 and comprises the sequence of SEQ ID NO: 19, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the promoter is a Cauliflower Mosaic Virus (CaMV) 35S promoter.

[0114] In some embodiments, the promoter is a soybean polyubiquitin (Gmubi) promoter, a soybean heat shock protein 90-like (GmHSP90L) promoter, or a soybean Ethylene Response Factor (GmERF) promoter. In some embodiments, the promoter is a constitutive soybean promoter derived from GmScreamM1, GmScreamM4, GmScreamM8 genes or GmubiXL genes.

[0115] In some embodiments, the 5' UTR is selected from the group consisting of Arc5'UTR and glnB1UTR. In some embodiments, the 5' untranslated region is Arc5'UTR and comprises the sequence of SEQ ID NO: 20, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.

[0116] In some embodiments, the terminator sequence is isolated or derived from a gene encoding Nopaline synthase, Arc5-1, an Extensin, Rb7 matrix attachment region, a Heat shock protein, Ubiquitin 10, Ubiquitin 3, or M6 matrix attachment region. In some embodiments, the terminator sequence is isolated or derived from a Nopaline synthase gene and comprises the sequence of SEQ ID NO: 22, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.

[0117] In some embodiments, the nucleic acid comprises a 3' UTR. For example, the 3' untranslated region may be Arc5-1 and comprise SEQ ID NO: 21, or a sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.

[0118] In some embodiments, the nucleic acid comprises a gene encoding a selectable marker. One illustrative selectable marker gene for plant transformation is the neomycin phosphotransferase II (nptII) gene, isolated from transposon Tn5, which, when placed under the control of plant regulatory signals, confers resistance to kanamycin. Another exemplary marker gene is the hygromycin phosphotransferase gene, which confers resistance to the antibiotic hygromycin. In some embodiments, the selectable marker is of bacterial origin and confers resistance to antibiotics, such as gentamycin acetyl transferase, streptomycin phosphotransferase, aminoglycoside-3'-adenyl transferase, or the bleomycin resistance determinant. In some embodiments, the selectable marker genes confer resistance to herbicides such as glyphosate, glufosinate or bromoxynil. In some embodiments, the selectable marker is mouse dihydrofolate reductase, plant 5-enolpyruvylshikimate-3-phosphate synthase, or plant acetolactate synthase. In some embodiments, the selectable marker is acetolactate synthase (e.g., AtCsr1.2).

[0119] In some embodiments, a nucleic acid comprises an endoplasmic reticulum retention signal. For example, in some embodiments, a nucleic acid comprises a KDEL sequence (SEQ ID NO: 23). In some embodiments, the nucleic acid may comprise an endoplasmic reticulum retention signal selected from any one of SEQ ID NO: 23-70.

[0120] Shown in Table 2 are exemplary promoters, 5' UTRs, signal peptides, and terminators that may be used in the nucleic acids of the disclosure.

TABLE 2: Promoters, 5' UTRs, signal peptides and terminators

Type | Name | Description | Native Species | Illustrative Accession No. (Glyma, GenBank)
Promoter | PvPhas | Phaseolin-1 (aka .beta.-phaseolin) | Common bean (Phaseolus vulgaris) | J01263.1
 | BnNap | Napin-1 | Rapeseed (Brassica napus) | J02798.1
 | AtOle1 | Oleosin-1 (Ole1) | Arabidopsis (Arabidopsis thaliana) | X62353.1, AT4G25140
 | GmSeed2 | Gy1 (Glycinin 1) | Soybean (Glycine max) | Glyma.03G163500
 | GmSeed3 | cysteine protease | Soybean (Glycine max) | Glyma.08G116300
 | GmSeed5 | Gy5 (Glycinin 5) | Soybean (Glycine max) | Glyma.13G123500
 | GmSeed6 | Gy4 (Glycinin 4) | Soybean (Glycine max) | Glyma.10G037100
 | GmSeed7 | Kunitz trypsin protease inhibitor | Soybean (Glycine max) | Glyma.01G095000
 | GmSeed8 | Kunitz trypsin protease inhibitor | Soybean (Glycine max) | Glyma.08G341500
 | GmSeed10 | Legume Lectin Domain | Soybean (Glycine max) | Glyma.02G012600
 | GmSeed11 | .beta.-conglycinin .alpha. subunit | Soybean (Glycine max) | Glyma.20G148400
 | GmSeed12 | .beta.-conglycinin .alpha.' subunit | Soybean (Glycine max) | Glyma.10G246300
 | pBCON | .beta.-conglycinin .beta. subunit | Soybean (Glycine max) | Glyma.20G148200
 | GmCEP1-L | KDEL-tailed cysteine endopeptidase CEP1-like | Soybean (Glycine max) | Glyma06g42780
 | GmTHIC | phosphomethylpyrimidine synthase | Soybean (Glycine max) | Glyma11g26470
 | GmBg7S1 | Basic 7S globulin precursor | Soybean (Glycine max) | Glyma03g39940
 | GmGRD | glucose and ribitol dehydrogenase-like | Soybean (Glycine max) | Glyma07g38790
 | GmOLEA | Oleosin isoform A | Soybean (Glycine max) | Glyma.19g063400
 | GmOLEB | Oleosin isoform B | Soybean (Glycine max) | Glyma.16g071800
 | Gm2S-1 | 2S albumin | Soybean (Glycine max) | Glyma13g36400
 | GmBBId-II | Bowman-Birk protease inhibitor | Soybean (Glycine max) | Glyma16g33400
5'UTR | Arc5'UTR | arc5-1 gene | Phaseolus vulgaris | J01263.1
 | glnB1UTR | 65 bp of native glutamine synthase | Soybean (Glycine max) | AF301590.1
Signal peptide | GmSCB1 | Seed coat BURP domain protein | Soybean (Glycine max) | Glyma07g28940.1
 | StPat21 | Patatin | Tomato (Solanum lycopersicum) | CAA27588
 | 2Sss | 2S albumin | Soybean (Glycine max) | Glyma13g36400
 | Sig2 | Glycinin G1 N-terminal peptide | Soybean (Glycine max) | Glyma.03G163500
 | Sig12 | Beta-conglycinin alpha prime subunit N-terminal peptide | Soybean (Glycine max) | Glyma.10G246300
 | Sig8 | Kunitz trypsin inhibitor N-terminal peptide | Soybean (Glycine max) | Glyma.08G341500
 | Sig10 | Lectin N-terminal peptide from Glycine max | Soybean (Glycine max) | Glyma.02G012600
 | Sig11 | Beta-conglycinin alpha subunit N-terminal peptide | Soybean (Glycine max) | Glyma.20G148400
 | Coixss | Alpha-coixin N-terminal peptide from Coix lacryma-jobi | Coix lacryma-jobi |
 | KDEL | C-terminal amino acids of sulfhydryl endopeptidase | Phaseolus vulgaris |
Terminator | NOS | Nopaline synthase gene termination sequence | Agrobacterium tumefaciens |
 | ARC | arc5-1 gene termination sequence | Phaseolus vulgaris | J01263.1
 | EU | Extensin termination sequence | Nicotiana tabacum |
 | Rb7 | Rb7 matrix attachment region termination sequence | Nicotiana tabacum |
 | HSP or AtHSP | Heat shock termination sequence | Arabidopsis thaliana |
 | AtUbi10 | Ubiquitin 10 termination sequence | Arabidopsis thaliana |
 | Stubi3 | Ubiquitin 3 termination sequence | Solanum tuberosum |
 | TM6 | M6 matrix attachment region termination sequence | Nicotiana tabacum |

[0121] Illustrative nucleic acids of the disclosure are provided in FIG. 1A-1P. In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding an unstructured milk protein, a sequence encoding a structured mammalian protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1A). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding an unstructured milk protein, a sequence encoding a linker, a sequence encoding a structured mammalian protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1B). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding an unstructured milk protein, a sequence encoding a linker, a sequence encoding a structured mammalian protein, and a terminator (See, e.g., FIG. 1C). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding an unstructured milk protein, a sequence encoding a structured mammalian protein, and a terminator (See, e.g., FIG. 1D). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a structured mammalian protein, a sequence encoding an unstructured milk protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1E). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a structured mammalian protein, a sequence encoding a linker, a sequence encoding an unstructured milk protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1F). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a structured mammalian protein, a sequence encoding a linker, a sequence encoding an unstructured milk protein, and a terminator (See, e.g., FIG. 1G). 
In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a structured mammalian protein, a sequence encoding an unstructured milk protein, and a terminator (See, e.g., FIG. 1H). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding an unstructured milk protein, a sequence encoding a structured mammalian protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1I). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding an unstructured milk protein, a sequence encoding a linker, a sequence encoding a structured mammalian protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1J). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding an unstructured milk protein, a sequence encoding a linker, a sequence encoding a structured mammalian protein, and a terminator (See, e.g., FIG. 1K). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding an unstructured milk protein, a sequence encoding a structured mammalian protein, and a terminator (See, e.g., FIG. 1L). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding a structured mammalian protein, a sequence encoding an unstructured milk protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1M). 
In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding a structured mammalian protein, a sequence encoding a linker, a sequence encoding an unstructured milk protein, an endoplasmic reticulum retention signal, and a terminator (See, e.g., FIG. 1N). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding a structured mammalian protein, a sequence encoding a linker, a sequence encoding an unstructured milk protein, and a terminator (See, e.g., FIG. 1O). In some embodiments a nucleic acid comprises, from 5' to 3', a promoter, a 5'UTR, a sequence encoding a signal peptide, a sequence encoding a structured mammalian protein, a sequence encoding an unstructured milk protein, and a terminator (See, e.g., FIG. 1P).
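The sixteen arrangements recited for FIG. 1A-1P follow from four binary choices: presence of a signal peptide, order of the two fusion partners, presence of a linker, and presence of an endoplasmic reticulum retention signal. As a minimal sketch only, the enumeration below reproduces the element order of each arrangement as described in the text. The function name and the element labels are hypothetical shorthand, and the figure labels are assigned from the order in which the text recites the arrangements.

```python
# (has_linker, has_er_signal) in the order recited within each group of four.
LINKER_ER_OPTIONS = [(False, True), (True, True), (True, False), (False, False)]

MILK = "unstructured milk protein"
STRUCTURED = "structured mammalian protein"


def cassette_layouts() -> dict:
    """Map figure labels (FIG. 1A to FIG. 1P) to 5'-to-3' element lists."""
    layouts = {}
    labels = iter("ABCDEFGHIJKLMNOP")
    for has_signal_peptide in (False, True):
        for milk_first in (True, False):
            for has_linker, has_er_signal in LINKER_ER_OPTIONS:
                parts = ["promoter", "5'UTR"]
                if has_signal_peptide:
                    parts.append("signal peptide")
                first, second = (MILK, STRUCTURED) if milk_first else (STRUCTURED, MILK)
                parts.append(first)
                if has_linker:
                    parts.append("linker")
                parts.append(second)
                if has_er_signal:
                    parts.append("ER retention signal")
                parts.append("terminator")
                layouts["FIG. 1" + next(labels)] = parts
    return layouts


layouts = cassette_layouts()
```

For example, `layouts["FIG. 1J"]` lists a promoter, 5'UTR, signal peptide, unstructured milk protein, linker, structured mammalian protein, ER retention signal, and terminator, matching the arrangement recited for that figure.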

[0122] In some embodiments, the nucleic acid comprises an expression cassette comprising a OKC1-T:OLG1 (Optimized Kappa Casein version 1:beta-lactoglobulin version 1) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the ER retention signal (KDEL) and the 3'UTR of the arc5-1 gene, "arc-terminator" (See, e.g., FIG. 3). In some embodiments, the nucleic acid comprises SEQ ID NO: 72.

[0123] In some embodiments, the nucleic acid comprises an expression cassette comprising a OBC-T2:FM:OLG1 (Optimized Beta Casein Truncated version 2:Chymosin cleavage site:beta-lactoglobulin version 1) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the 3'UTR of the arc5-1 gene, "arc-terminator" (See, e.g., FIG. 4). In some embodiments, the nucleic acid comprises SEQ ID NO: 74. The Beta Casein is "truncated" in that the bovine secretion signal is removed, and replaced with a plant targeting signal.

[0124] In some embodiments, the nucleic acid comprises an expression cassette comprising a OaS1-T:FM:OLG1 (Optimized Alpha S1 Casein Truncated version 1:Chymosin cleavage site:beta-lactoglobulin version 1) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the 3'UTR of the arc5-1 gene, "arc-terminator" (See, e.g., FIG. 5). In some embodiments, the nucleic acid comprises SEQ ID NO: 76. The Alpha S1 Casein is "truncated" in that the bovine secretion signal is removed, and replaced with a plant targeting signal.

[0125] In some embodiments, the nucleic acid comprises an expression cassette comprising a para-OKC1-T:FM:OLG1:KDEL (Optimized paraKappa Casein version 1:Chymosin cleavage site:beta-lactoglobulin version 1) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the ER retention signal (KDEL) and the 3'UTR of the arc5-1 gene, "arc-terminator" (See, e.g., FIG. 6). In some embodiments, the nucleic acid comprises SEQ ID NO: 78.

[0126] In some embodiments, the nucleic acid comprises an expression cassette comprising a para-OKC1-T:FM:OLG1 (Optimized paraKappa Casein version 1:Chymosin cleavage site:beta-lactoglobulin version 1) fusion driven by PvPhas promoter fused with arc5'UTR:sig10, followed by the 3'UTR of the arc5-1 gene, "arc-terminator" (See, e.g., FIG. 7). In some embodiments, the nucleic acid comprises SEQ ID NO: 80.

[0127] In some embodiments, the nucleic acid comprises an expression cassette comprising a OKC1-T-OLG1 (Optimized Kappa Casein version 1:beta-lactoglobulin version 1) fusion that is driven by the promoter and signal peptide of glycinin 1 (GmSeed2:sig2) followed by the ER retention signal (KDEL) and the nopaline synthase gene termination sequence (nos term) (See, e.g., FIG. 8). In some embodiments, the nucleic acid comprises SEQ ID NO: 82.

In some embodiments, a nucleic acid encoding a fusion protein comprises the sequence of any one of SEQ ID NO: 72, 74, 76, 78, 80, 82, 134, or 136.

[0128] In some embodiments, the nucleic acids are codon optimized for expression in a host cell. Codon optimization is a process used to improve gene expression and increase the translational efficiency of a gene of interest by accommodating codon bias of the host organism (i.e., the organism in which the gene is expressed). Codon-optimized mRNA sequences that are produced using different programs or approaches can vary because different codon optimization strategies differ in how they quantify codon usage and implement codon changes. Some approaches use the most optimal (frequently used) codon for all instances of an amino acid, or a variation of this approach. Other approaches adjust codon usage so that it is proportional to the natural distribution of the host organism. These approaches include codon harmonization, which endeavors to identify and maintain regions of slow translation thought to be important for protein folding. Alternative approaches involve using codons thought to correspond to abundant tRNAs, using codons according to their cognate tRNA concentrations, selectively replacing rare codons, or avoiding occurrences of codon-pairs that are known to translate slowly. In addition to approaches that vary in the extent to which codon usage is considered as a parameter, there are hypothesis-free approaches that do not consider this parameter. Algorithms for performing codon optimization are known to those of skill in the art and are widely available on the Internet.
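One of the strategies described above, using the most frequently used codon for every instance of an amino acid, can be sketched as follows. This is an illustration under stated assumptions and not the optimization method of the disclosure: the function name is hypothetical, the usage table covers only a few amino acids, and its frequencies are invented placeholder values rather than measured codon usage for any host organism.

```python
# Placeholder codon usage table: amino acid -> {codon: relative frequency}.
# The frequencies are invented for illustration and are NOT real host data.
TOY_CODON_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.40, "AAG": 0.60},
    "F": {"TTT": 0.45, "TTC": 0.55},
    "E": {"GAA": 0.48, "GAG": 0.52},
    "*": {"TAA": 0.50, "TAG": 0.20, "TGA": 0.30},  # stop codons
}


def back_translate_most_frequent(protein: str, usage: dict) -> str:
    """Encode each residue with its highest-frequency codon in `usage`."""
    codons = []
    for residue in protein:
        if residue not in usage:
            raise KeyError(f"no codon usage data for residue {residue!r}")
        codon_freqs = usage[residue]
        codons.append(max(codon_freqs, key=codon_freqs.get))
    return "".join(codons)


# Toy peptide (hypothetical) followed by a stop.
optimized = back_translate_most_frequent("MKFE*", TOY_CODON_USAGE)
```

A proportional strategy, also mentioned above, would instead sample codons according to the table frequencies rather than always taking the maximum.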

[0129] In some embodiments, the nucleic acids are codon optimized for expression in a plant species. The plant species may be, for example, a monocot or a dicot. In some embodiments, the plant species is selected from soybean, lima bean, Arabidopsis, tobacco, rice, maize, barley, sorghum, wheat and/or oat. In some embodiments, the plant species is soybean.

[0130] The nucleic acids of the disclosure may be contained within a vector. The vector may be, for example, a viral vector or a non-viral vector. In some embodiments, the non-viral vector is a plasmid, such as an Agrobacterium Ti plasmid. In some embodiments, the non-viral vector is a lipid nanoparticle.

[0131] In some embodiments, a vector comprises a nucleic acid encoding a recombinant fusion protein, wherein the recombinant fusion protein comprises: (i) an unstructured milk protein, and (ii) a structured animal (e.g., mammalian or avian) protein. In some embodiments, the vector is an Agrobacterium Ti plasmid.

[0132] In some embodiments, a method for expressing a fusion protein in a plant comprises contacting the plant with a vector of the disclosure. In some embodiments, the method comprises maintaining the plant or part thereof under conditions in which the fusion protein is expressed.

Plants Expressing Fusion Proteins

[0133] Also provided herein are transgenic plants expressing one or more fusion proteins of the disclosure. In some embodiments, the transgenic plants stably express the fusion protein. In some embodiments, the transgenic plants stably express the fusion protein in the plant in an amount of at least 1% per the total protein weight of the soluble protein extractable from the plant. For example, the transgenic plants may stably express the fusion protein in an amount of at least 1%, at least 1.5%, at least 2%, at least 2.5%, at least 3%, at least 3.5%, at least 4%, at least 4.5%, at least 5%, at least 5.5%, at least 6%, at least 6.5%, at least 7%, at least 7.5%, at least 8%, at least 8.5%, at least 9%, at least 9.5%, at least 10%, at least 10.5%, at least 11%, at least 11.5%, at least 12%, at least 12.5%, at least 13%, at least 13.5%, at least 14%, at least 14.5%, at least 15%, at least 15.5%, at least 16%, at least 16.5%, at least 17%, at least 17.5%, at least 18%, at least 18.5%, at least 19%, at least 19.5%, at least 20%, or more of total protein weight of soluble protein extractable from the plant.

[0134] In some embodiments, the transgenic plants stably express the fusion protein in an amount of less than about 1% of the total protein weight of soluble protein extractable from the plant. In some embodiments, the transgenic plants stably express the fusion protein in the range of about 1% to about 2%, about 3% to about 4%, about 4% to about 5%, about 5% to about 6%, about 6% to about 7%, about 7% to about 8%, about 8% to about 9%, about 9% to about 10%, about 10% to about 11%, about 11% to about 12%, about 12% to about 13%, about 13% to about 14%, about 14% to about 15%, about 15% to about 16%, about 16% to about 17%, about 17% to about 18%, about 18% to about 19%, about 19% to about 20%, or more than about 20% of the total protein weight of soluble protein extractable from the plant.

[0135] In some embodiments, the transgenic plant stably expresses the fusion protein in an amount in the range of about 0.5% to about 3%, about 1% to about 4%, about 1% to about 5%, about 2% to about 5%, about 1% to about 10%, about 2% to about 10%, about 3% to about 10%, about 5% to about 12%, about 4% to about 10%, about 5% to about 10%, about 4% to about 8%, about 5% to about 15%, about 5% to about 18%, about 10% to about 20%, or about 1% to about 20% of the total protein weight of soluble protein extractable from the plant.

[0136] In some embodiments, the fusion protein is expressed at a level at least 2-fold higher than an unstructured milk protein expressed individually in a plant. For example, in some embodiments, the fusion protein is expressed at a level at least 2-fold, at least 2.5-fold, at least 3-fold, at least 3.5-fold, at least 4-fold, at least 4.5-fold, at least 5-fold, at least 5.5-fold, at least 6-fold, at least 7-fold, at least 7.5-fold, at least 8-fold, at least 8.5-fold, at least 9-fold, at least 9.5-fold, at least 10-fold, at least 25-fold, at least 50-fold, or at least 100-fold higher than an unstructured milk protein expressed individually in a plant.

[0137] In some embodiments, the fusion protein accumulates in the plant at least 2-fold higher than an unstructured milk protein expressed without the structured animal (e.g., mammalian or avian) protein. For example, in some embodiments, the fusion protein accumulates in the plant at least 2-fold, at least 2.5-fold, at least 3-fold, at least 3.5-fold, at least 4-fold, at least 4.5-fold, at least 5-fold, at least 5.5-fold, at least 6-fold, at least 7-fold, at least 7.5-fold, at least 8-fold, at least 8.5-fold, at least 9-fold, at least 9.5-fold, at least 10-fold, at least 25-fold, at least 50-fold, or at least 100-fold higher than an unstructured milk protein expressed without the structured animal protein.
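The two quantities used in the preceding paragraphs, expression as a percentage of total soluble protein and fold increase over an unstructured milk protein expressed alone, reduce to simple ratios. The sketch below shows the arithmetic only. The function names and all numeric inputs are hypothetical and are not measurements from the disclosure.

```python
def percent_of_total_soluble_protein(fusion_mass: float, total_soluble_mass: float) -> float:
    """Fusion protein mass as a percentage of total soluble protein mass (same units)."""
    if total_soluble_mass <= 0:
        raise ValueError("total soluble protein mass must be positive")
    return 100.0 * fusion_mass / total_soluble_mass


def fold_increase(fusion_level: float, reference_level: float) -> float:
    """Ratio of fusion protein level to the level of the milk protein expressed alone."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return fusion_level / reference_level


# Hypothetical inputs: 0.5 mg fusion protein in 25 mg total soluble protein,
# and a reference level of 0.125 mg for the milk protein expressed alone.
pct_tsp = percent_of_total_soluble_protein(0.5, 25.0)
fold = fold_increase(0.5, 0.125)
meets_one_percent = pct_tsp >= 1.0
meets_two_fold = fold >= 2.0
```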

[0138] In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises (i) an unstructured milk protein, and (ii) a structured animal (e.g., mammalian or avian) protein. In some embodiments, the fusion protein is stably expressed in the plant in an amount of 1% or higher per the total protein weight of the soluble protein extractable from the plant. For example, in some embodiments, the fusion protein is stably expressed in the plant in an amount of 2% or higher, 3% or higher, 4% or higher, 5% or higher, 6% or higher, 7% or higher, 8% or higher, 9% or higher, 10% or higher, 11% or higher, 12% or higher, 13% or higher, 14% or higher, 15% or higher, 16% or higher, 17% or higher, 18% or higher, 19% or higher, or 20% or higher per the total protein weight of the soluble protein extractable from the plant.

[0139] In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises from N-terminus to C-terminus, the unstructured milk protein and the animal (e.g., mammalian or avian) protein. In some embodiments, the fusion protein comprises, from N-terminus to C-terminus, the structured animal (e.g., mammalian or avian) protein and the milk protein.

[0140] In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises an unstructured milk protein such as a casein protein. In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises an unstructured milk protein selected from .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, and .kappa.-casein. In some embodiments, the unstructured milk protein is .alpha.-S1 casein. In some embodiments, the unstructured milk protein is .alpha.-S1 casein and comprises the sequence SEQ ID NO: 8, or a sequence at least 90% identical thereto. In some embodiments, the unstructured milk protein is .alpha.-S2 casein. In some embodiments, the unstructured milk protein is .alpha.-S2 casein and comprises the sequence SEQ ID NO: 84, or a sequence at least 90% identical thereto. In some embodiments, the unstructured milk protein is .beta.-casein. In some embodiments, the unstructured milk protein is .beta.-casein and comprises the sequence of SEQ ID NO: 6, or a sequence at least 90% identical thereto. In some embodiments, the unstructured milk protein is .kappa.-casein. In some embodiments, the unstructured milk protein is .kappa.-casein and comprises the sequence of SEQ ID NO: 4, or a sequence at least 90% identical thereto. In some embodiments, the unstructured milk protein is para-.kappa.-casein. In some embodiments, the unstructured milk protein is para-.kappa.-casein and comprises the sequence of SEQ ID NO: 2, or a sequence at least 90% identical thereto.
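The "at least 90% identical" criterion recited above can be illustrated with a minimal sketch. It assumes that the two sequences have already been aligned to equal length (with "-" marking gaps) and that identity is counted over all alignment columns; this passage does not specify the alignment method or the denominator, so both are assumptions. The function name and the toy peptides are hypothetical and are not taken from any SEQ ID NO.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both inputs are pre-aligned, equal-length strings in which
    '-' marks a gap. A column counts as identical only when both residues
    match and neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy peptides (hypothetical placeholders, one substitution in ten residues).
reference = "MKVLILACLV"
variant = "MKVLILACLA"
identity = percent_identity(reference, variant)
meets_threshold = identity >= 90.0
```

Other identity definitions (for example, dividing by the length of the shorter sequence) would give different values for gapped alignments, so the denominator should be fixed before any threshold is applied.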

[0141] In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises a structured mammalian protein selected from .beta.-lactoglobulin, .alpha.-lactalbumin, albumin, lysozyme, lactoferrin, lactoperoxidase, hemoglobin, collagen, and an immunoglobulin (e.g., IgA, IgG, IgM, or IgE). In some embodiments, the structured mammalian protein is .beta.-lactoglobulin. In some embodiments, the structured mammalian protein is .beta.-lactoglobulin and comprises the sequence of SEQ ID NO: 10, or a sequence at least 90% identical thereto. In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises a structured avian protein selected from lysozyme, ovalbumin, ovotransferrin, and ovoglobulin.

[0142] In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises a casein protein and .beta.-lactoglobulin. In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein, wherein the fusion protein comprises .kappa.-casein and .beta.-lactoglobulin. In some embodiments, the fusion protein comprises para-.kappa.-casein and .beta.-lactoglobulin. In some embodiments, the fusion protein comprises .beta.-casein and .beta.-lactoglobulin. In some embodiments, the fusion protein comprises .alpha.-S1 casein and .beta.-lactoglobulin.

[0143] In some embodiments, a stably transformed plant comprises in its genome: a recombinant DNA construct encoding a fusion protein; wherein the fusion protein comprises (i) .kappa.-casein, and (ii) .beta.-lactoglobulin; and wherein the fusion protein is stably expressed in the plant in an amount of 1% or higher per the total protein weight of the soluble protein extractable from the plant.

[0144] In some embodiments, the stably transformed plant is a monocot. For example, in some embodiments, the plant may be a monocot selected from turf grass, maize (corn), rice, oat, wheat, barley, sorghum, orchid, iris, lily, onion, palm, and duckweed.

[0145] In some embodiments, the stably transformed plant is a dicot. For example, in some embodiments, the plant may be a dicot selected from Arabidopsis, tobacco, tomato, potato, sweet potato, cassava, alfalfa, lima bean, pea, chick pea, soybean, carrot, strawberry, lettuce, oak, maple, walnut, rose, mint, squash, daisy, Quinoa, buckwheat, mung bean, cow pea, lentil, lupin, peanut, fava bean, French beans (i.e., common beans), mustard, or cactus. In some embodiments, the plant is a soybean (Glycine max).

[0146] In some embodiments, the plant is a non-vascular plant selected from moss, liverwort, hornwort or algae. In some embodiments, the plant is a vascular plant reproducing from spores (e.g., a fern).

[0147] In some embodiments, the recombinant DNA construct is codon-optimized for expression in the plant. For example, in some embodiments, the recombinant DNA construct is codon-optimized for expression in a soybean plant.

[0148] The transgenic plants described herein may be generated by various methods known in the art. For example, a nucleic acid encoding a fusion protein may be contacted with a plant, or a part thereof, and the plant may then be maintained under conditions wherein the fusion protein is expressed. In some embodiments, the nucleic acid is introduced into the plant, or part thereof, using one or more methods for plant transformation known in the art, such as Agrobacterium-mediated transformation, particle bombardment-mediated transformation, electroporation, and microinjection.

[0149] In some embodiments, a method for stably expressing a recombinant fusion protein in a plant comprises (i) transforming a plant with a plant transformation vector comprising an expression cassette comprising: a sequence encoding a fusion protein, wherein the fusion protein comprises an unstructured milk protein, and a structured animal (e.g., mammalian or avian) protein; and (ii) growing the transformed plant under conditions wherein the recombinant fusion protein is expressed. In some embodiments, the recombinant fusion protein is expressed in an amount of 1% or higher per the total protein weight of the soluble protein extractable from the plant. In some embodiments, the unstructured milk protein is .kappa.-casein. In some embodiments, the structured mammalian protein is .beta.-lactoglobulin. In some embodiments, the unstructured milk protein is .kappa.-casein and the structured mammalian protein is .beta.-lactoglobulin.

Food Compositions Comprising a Fusion Protein

[0150] The fusion proteins and transgenic plants described herein may be used to prepare food compositions. The fusion protein may be used directly to prepare the food composition (i.e., in the form of a fusion protein), or the fusion protein may first be separated into its constituent proteins. For example, in some embodiments, a food composition may comprise either (i) a fusion protein, (ii) an unstructured milk protein, (iii) a structured mammalian, avian, or plant protein, or (iv) an unstructured milk protein and a structured mammalian, avian, or plant protein. An illustrative method for preparing a food composition of the disclosure is provided in FIG. 11.

[0151] In some embodiments, the fusion proteins and transgenic plants described herein may be used to prepare a food composition selected from cheese and processed cheese products, yogurt and fermented dairy products, directly acidified counterparts of fermented dairy products, cottage cheese dressing, frozen dairy products, frozen desserts, desserts, baked goods, toppings, icings, fillings, low-fat spreads, dairy-based dry mixes, soups, sauces, salad dressing, geriatric nutrition, creams and creamers, analog dairy products, follow-up formula, baby formula, infant formula, milk, dairy beverages, acid dairy drinks, smoothies, milk tea, butter, margarine, butter alternatives, growing up milks, low-lactose products and beverages, medical and clinical nutrition products, protein/nutrition bar applications, sports beverages, confections, meat products, analog meat products, meal replacement beverages, and weight management food and beverages.

[0152] In some embodiments, the fusion proteins and transgenic plants described herein may be used to prepare a dairy product. In some embodiments, the dairy product is a fermented dairy product. An illustrative list of fermented dairy products includes cultured buttermilk, sour cream, yogurt, skyr, leben, lassi, and kefir. In some embodiments, the fusion proteins and transgenic plants described herein may be used to prepare cheese products.

[0153] In some embodiments, the fusion proteins and transgenic plants described herein may be used to prepare a powder containing a milk protein. In some embodiments, the fusion proteins and transgenic plants described herein may be used to prepare a low-lactose product.

[0154] In some embodiments, a method for making a food composition comprises expressing a recombinant fusion protein of the disclosure in a plant, extracting the recombinant fusion protein from the plant, optionally separating the milk protein from the structured mammalian or plant protein, and creating a food composition using the fusion protein and/or the milk protein.

[0155] The recombinant fusion proteins may be extracted from a plant using standard methods known in the art. For example, the fusion proteins may be extracted using solvent or aqueous extraction. In some embodiments, the fusion proteins may be extracted using phenol extraction. Once extracted, the fusion proteins may be maintained in a buffered environment (e.g., Tris, MOPS, HEPES), in order to avoid sudden changes in the pH. The fusion proteins may also be maintained at a particular temperature, such as 4.degree. C. In some embodiments, one or more additives may be used to aid the extraction process (e.g., salts, protease/peptidase inhibitors, osmolytes, reducing agents, etc.).

[0156] In some embodiments, a method for making a food composition comprises expressing a recombinant fusion protein of the disclosure in a plant, extracting one or both of the unstructured milk protein and the structured mammalian or plant protein from the plant, and creating a food composition using the milk protein.

[0157] In some embodiments, the milk protein and the structured mammalian or plant protein are separated from one another in the plant cell, prior to extraction. In some embodiments, the milk protein is separated from the structured mammalian or plant protein after extraction, for example by contacting the fusion protein with an enzyme that cleaves the fusion protein. The enzyme may be, for example, chymosin. In some embodiments, the fusion protein is cleaved using rennet.
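The enzymatic separation step can be sketched in silico. Chymosin is known to cleave bovine .kappa.-casein at its Phe105-Met106 bond; the sketch below simplifies this to cutting between the F and the M of an FM dipeptide located inside a known linker. The linker and flanking sequences are hypothetical placeholders rather than sequences of the disclosure, and real cleavage efficiency depends on the residues surrounding the scissile bond, which this sketch ignores.

```python
from typing import Tuple


def cleave_at_fm(fusion: str, linker: str) -> Tuple[str, str]:
    """Split a fusion protein at the F|M bond inside a known linker.

    Simplifying assumption: the linker occurs once in the fusion and
    contains an 'FM' dipeptide; the cut is placed between F and M.
    """
    start = fusion.find(linker)
    if start == -1:
        raise ValueError("linker not found in fusion sequence")
    fm_offset = linker.find("FM")
    if fm_offset == -1:
        raise ValueError("linker has no FM dipeptide")
    cut = start + fm_offset + 1  # index of the first residue after F
    return fusion[:cut], fusion[cut:]


# Hypothetical placeholder sequences, for illustration only.
milk_part = "AAAAPPPP"
linker = "SFMA"
structured_part = "GGGGTTTT"
fusion = milk_part + linker + structured_part
n_fragment, c_fragment = cleave_at_fm(fusion, linker)
```

In this toy case the N-terminal fragment retains the linker residues up to and including F, and the C-terminal fragment begins with M, so each released protein carries a short linker-derived extension.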

[0158] All references, articles, publications, patents, patent publications, and patent applications cited herein are incorporated by reference in their entireties for all purposes. However, mention of any reference, article, publication, patent, patent publication, and patent application cited herein is not, and should not be taken as an acknowledgment or any form of suggestion that they constitute valid prior art or form part of the common general knowledge in any country in the world, or that they disclose essential matter.

EXAMPLES

[0159] The following experiments demonstrate different recombinant fusion constructs of milk proteins and structured proteins, as well as methods of testing and producing the recombinant proteins, and food compositions produced from the extracted protein. While the examples below describe expression in soybean, it will be understood by those skilled in the art that the constructs and methods disclosed herein may be tailored for expression in any organism.

Example 1: Construction of Expression Vectors for Plant Transformation for Stable Expression of Recombinant Fusion Proteins

Binary Vector Design

[0160] While a number of vectors may be utilized for expression of the fusion proteins disclosed herein, the example constructs described below were built in the binary pCAMBIA3300 (Creative Biogene, VET1372) vector, which was customized for soybean transformation and selection. In order to modify the vector, pCAMBIA3300 was digested with HindIII and AseI, allowing the release of the vector backbone (LB T-DNA repeat_KanR_pBR322 ori_pBR322 bom_pVS1 oriV_pVS1 repA_pVS1 StaA_RB T-DNA repeat). The 6598 bp vector backbone was gel extracted and a synthesized multiple cloning site (MCS) was ligated via In-Fusion cloning (In-Fusion.RTM. HD Cloning System CE, available on the world wide web at clontech.com) to allow modular vector modifications. A cassette containing the Arabidopsis thaliana Csr1.2 gene for acetolactate synthase was added to the vector backbone to be used as a marker for herbicide selection of transgenic plants. In order to build this cassette, the regulatory sequences from Solanum tuberosum ubiquitin/ribosomal fusion protein promoter (StUbi3 prom; -1 to -922 bp) and terminator (StUbi3 term; 414 bp) (GenBank accession no. L22576.1) were fused to the mutant (S653N) acetolactate synthase gene (Csr1.2; GenBank accession no. X51514.1) (Sathasivan et al, 1990; Ding et al, 2006) to generate imazapyr-resistant traits in soybean plants. The selectable marker cassette was introduced into the EcoRI-digested modified vector backbone via In-Fusion cloning to form vector pAR15-00 (FIG. 2).

[0161] Recombinant DNA constructs were designed to express milk proteins (intrinsically unstructured and structured) in transgenic plants. The coding regions of the expression cassettes outlined below contain a fusion of codon-optimized nucleic acid sequences encoding bovine milk proteins, or a functional fragment thereof. To enhance protein expression in soybean, the nucleic acid sequences encoding .beta.-lactoglobulin (GenBank accession no. X14712.1), .kappa.-casein (GenBank accession no. CAA25231), .beta.-casein (GenBank accession no. M15132.1), and .alpha.S1-casein (GenBank accession no. X59836.1) were codon optimized using Glycine max codon bias and synthesized (available on the world wide web at idtdna.com/CodonOpt). The signal sequences were removed (i.e., making the constructs "truncated") and the new versions of the genes were renamed as OLG1 (.beta.-lactoglobulin version 1, SEQ ID NO: 9), OLG2 (.beta.-lactoglobulin version 2, SEQ ID NO: 11), OLG3 (.beta.-lactoglobulin version 3, SEQ ID NO: 12), OLG4 (.beta.-lactoglobulin version 4, SEQ ID NO: 13), OKC1-T (Optimized .kappa.-casein Truncated version 1, SEQ ID NO: 3), paraOKC1-T (only the para-.kappa. portion of OKC1-T, SEQ ID NO: 1), OBC-T2 (Optimized .beta.-casein Truncated version 2, SEQ ID NO: 5), and OaS1-T (Optimized .alpha.S1-casein Truncated version 1, SEQ ID NO: 7). As will be understood by those skilled in the art, any codon optimized nucleic acid sequences can present from 60% to 100% identity to the native version of the nucleic acid sequence.
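The idea behind codon optimization can be illustrated with a minimal back-translation sketch that replaces each amino acid with a single preferred codon. This is not the algorithm of the tool referenced above. The preference table below contains valid standard-genetic-code codons, but the choice among synonymous codons is an arbitrary placeholder and does not represent measured Glycine max codon usage; the toy peptide is likewise hypothetical.

```python
from typing import Dict

# Placeholder preference table: one codon per amino acid. The codons are
# valid in the standard genetic code, but which synonymous codon is listed
# is arbitrary here and is NOT measured Glycine max codon usage.
PREFERRED_CODON: Dict[str, str] = {
    "M": "ATG",
    "K": "AAG",
    "F": "TTC",
    "L": "CTT",
    "E": "GAG",
}


def back_translate(protein: str, table: Dict[str, str]) -> str:
    """Encode a protein using one preferred codon per residue."""
    try:
        return "".join(table[residue] for residue in protein)
    except KeyError as exc:
        raise ValueError(f"no codon listed for residue {exc.args[0]!r}") from None


def translate(dna: str, table: Dict[str, str]) -> str:
    """Decode DNA using the inverse of the same table (round-trip check)."""
    codon_to_residue = {codon: residue for residue, codon in table.items()}
    return "".join(codon_to_residue[dna[i:i + 3]] for i in range(0, len(dna), 3))


toy_peptide = "MKFLE"  # hypothetical, not from any SEQ ID NO
optimized = back_translate(toy_peptide, PREFERRED_CODON)
round_trip = translate(optimized, PREFERRED_CODON)
```

Because only synonymous codons are exchanged, the encoded protein is unchanged while the nucleic acid sequence may diverge from the native one, which is consistent with the 60% to 100% identity range noted above.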

[0162] All the expression cassettes described below and shown in FIGS. 3-8 contained codon-optimized nucleic acid sequences encoding bovine milk proteins, or a functional fragment thereof, a seed specific promoter, a 5'UTR, a signal sequence (Sig) that directs foreign proteins to the protein storage vacuoles, and a termination sequence. In some versions of the constructs, a linker (FM), such as a chymosin cleavage site, was placed between the two proteins and/or a C-terminal KDEL sequence for ER retention was included. Expression cassettes were inserted in the pAR15-00 vector described above utilizing a KpnI restriction site within the MCS (FIG. 2). Coding regions and regulatory sequences are indicated as blocks (not to scale) in FIGS. 3-8.

.kappa.-casein-.beta.-lactoglobulin Fusion with KDEL

[0163] Shown in FIG. 3 is an example expression cassette comprising .kappa.-casein (OKC1-T, SEQ ID NO: 3) and .beta.-lactoglobulin (OLG1, SEQ ID NO: 9). The regulatory sequences that were used in order to produce the heterologous milk proteins in soybean seeds include the promoter of the beta-phaseolin storage protein gene (PvPhas prom; -1 to -1543; GenBank accession no. J01263.1, SEQ ID NO: 18); the 5'UTR of the arc5-1 gene (arc5'UTR; -1 to -13; GenBank accession no. Z50202, SEQ ID NO: 20) (De Jaeger et al, 2002); the signal peptide of Lectin 1 gene 1 (sig10; +1 to +93; GenBank accession no. Glyma.02G012600, SEQ ID NO: 14) (Darnowski et al, 2002); and, the 3'UTR of the arc5-1 gene, (arc term 1197 bp; GenBank accession no. Z50202.1, SEQ ID NO: 21) (De Jaeger et al, 2002). A C-terminal KDEL (SEQ ID NO: 23) was also included for ER retention.

.beta.-casein-.beta.-lactoglobulin Fusion with Linker

[0164] Shown in FIG. 4 is an example expression cassette comprising .beta.-casein (OBC-T2, SEQ ID NO: 5) and .beta.-lactoglobulin (OLG1, SEQ ID NO: 9). The regulatory sequences that were used in order to produce the heterologous milk proteins in soybean seeds include the promoter of the beta-phaseolin storage protein gene (PvPhas prom; -1 to -1543; GenBank accession no. J01263.1, SEQ ID NO: 18); the 5'UTR of the arc5-1 gene (arc5'UTR; -1 to -13; GenBank accession no. Z50202, SEQ ID NO: 20) (De Jaeger et al, 2002); the signal peptide of Lectin 1 gene 1 (sig10; +1 to +93; accession no. Glyma.02G012600, SEQ ID NO: 14) (Darnowski et al, 2002); and, the 3'UTR of the arc5-1 gene, (arc term 1197 bp; accession no. Z50202.1, SEQ ID NO: 21) (De Jaeger et al, 2002). A linker (FM) comprising a chymosin cleavage site was inserted between the two proteins.

.alpha.S1-casein-.beta.-lactoglobulin Fusion with Linker

[0165] Shown in FIG. 5 is an example expression cassette comprising .alpha.S1-casein (OaS1-T, SEQ ID NO: 7) and .beta.-lactoglobulin (OLG1, SEQ ID NO: 9). The regulatory sequences that were used in order to produce the heterologous milk proteins in soybean seeds include the promoter of the beta-phaseolin storage protein gene (PvPhas prom; -1 to -1543; GenBank accession no. J01263.1, SEQ ID NO: 18); the 5'UTR of the arc5-1 gene (arc5'UTR; -1 to -13; GenBank accession no. Z50202, SEQ ID NO: 20) (De Jaeger et al, 2002); the signal peptide of Lectin 1 gene 1 (sig10; +1 to +93; accession no. Glyma.02G012600, SEQ ID NO: 14) (Darnowski et al, 2002); and, the 3'UTR of the arc5-1 gene, (arc term 1197 bp; GenBank accession no. Z50202.1, SEQ ID NO: 21)(De Jaeger et al, 2002). A linker (FM) comprising a chymosin cleavage site was inserted between the two proteins.

Para-.kappa.-casein-.beta.-lactoglobulin Fusion with Linker and KDEL

[0166] Shown in FIG. 6 is an example expression cassette comprising para-.kappa.-casein (paraOKC1-T, SEQ ID NO: 1) and .beta.-lactoglobulin (OLG1, SEQ ID NO: 9). The regulatory sequences that were used in order to produce the heterologous milk proteins in soybean seeds include the promoter of the beta-phaseolin storage protein gene (PvPhas prom; -1 to -1543; GenBank accession no. J01263.1, SEQ ID NO: 18); the 5'UTR of the arc5-1 gene (arc5'UTR; -1 to -13; GenBank accession no. Z50202, SEQ ID NO: 20) (De Jaeger et al, 2002); the signal peptide of Lectin 1 gene 1 (sig10; +1 to +93; GenBank accession no. Glyma.02G012600, SEQ ID NO: 14) (Darnowski et al, 2002); and, the 3'UTR of the arc5-1 gene, (arc term 1197 bp; GenBank accession no. Z50202.1, SEQ ID NO: 21) (De Jaeger et al, 2002). A linker (FM) comprising a chymosin cleavage site was inserted between the two proteins and a C-terminal KDEL (SEQ ID NO: 23) was also included for ER retention.

Para-.kappa.-casein-.beta.-lactoglobulin Fusion with Linker

[0167] Shown in FIG. 7 is an example expression cassette comprising para-.kappa.-casein (paraOKC1-T, SEQ ID NO: 1) and .beta.-lactoglobulin (OLG1, SEQ ID NO: 9). The regulatory sequences that were used in order to produce the heterologous milk proteins in soybean seeds include the promoter of the beta-phaseolin storage protein gene (PvPhas prom; -1 to -1543; GenBank accession no. J01263.1, SEQ ID NO: 18); the 5'UTR of the arc5-1 gene (arc5'UTR; -1 to -13; GenBank accession no. Z50202, SEQ ID NO: 20) (De Jaeger et al, 2002); the signal peptide of Lectin 1 gene 1 (sig10; +1 to +93; GenBank accession no. Glyma.02G012600, SEQ ID NO: 14) (Darnowski et al, 2002); and, the 3'UTR of the arc5-1 gene, (arc term 1197 bp; GenBank accession no. Z50202.1, SEQ ID NO: 21) (De Jaeger et al, 2002). A linker (FM) comprising a chymosin cleavage site was inserted between the two proteins.

Fusion Protein with Seed2 Promoter, Sig2 and Nopaline Synthase Terminator

[0168] Shown in FIG. 8 is an example expression cassette comprising .kappa.-casein (OKC1-T, SEQ ID NO: 3) and .beta.-lactoglobulin (OLG1, SEQ ID NO: 9). The regulatory sequences that were used in order to produce the heterologous milk proteins in soybean seeds include the promoter and signal peptide of glycinin 1 (GmSeed2 (SEQ ID NO: 19): sig2 (SEQ ID NO: 16)) followed by the ER retention signal (KDEL) and the Nopaline synthase termination sequence (nos term, SEQ ID NO: 22).
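The modular cassettes described in this example can be modeled as a small data structure, which makes the optional linker and KDEL elements explicit. The sketch below is illustrative only: the class and field names are hypothetical, and the 5' to 3' element order (promoter, 5'UTR, signal sequence, casein, optional linker, .beta.-lactoglobulin, optional KDEL, terminator) is an assumption inferred from the text rather than read from the figures. The element labels are those used in the paragraphs above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cassette:
    """Hypothetical model of a seed-specific fusion expression cassette."""
    promoter: str
    utr5: str
    signal: str
    first_cds: str
    second_cds: str
    terminator: str
    linker: Optional[str] = None        # e.g. FM chymosin cleavage site
    er_retention: Optional[str] = None  # e.g. C-terminal KDEL

    def layout(self) -> List[str]:
        """Return elements in the assumed 5' to 3' order."""
        parts = [self.promoter, self.utr5, self.signal, self.first_cds]
        if self.linker:
            parts.append(self.linker)
        parts.append(self.second_cds)
        if self.er_retention:
            parts.append(self.er_retention)
        parts.append(self.terminator)
        return parts


# Cassette resembling FIG. 3: kappa-casein fused to beta-lactoglobulin, with KDEL.
fig3_like = Cassette(
    promoter="PvPhas prom", utr5="arc5'UTR", signal="sig10",
    first_cds="OKC1-T", second_cds="OLG1", terminator="arc term",
    er_retention="KDEL",
)

# Cassette resembling FIG. 7: para-kappa-casein fusion with linker, no KDEL.
fig7_like = Cassette(
    promoter="PvPhas prom", utr5="arc5'UTR", signal="sig10",
    first_cds="paraOKC1-T", second_cds="OLG1", terminator="arc term",
    linker="FM",
)
```

Representing the linker and the retention signal as optional fields mirrors the way the constructs differ from one another while sharing the same regulatory elements.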

Example 2: Identification of Transgenic Events, Recombinant Protein Extraction and Detection

[0169] To quantify recombinant protein expression levels, DNA constructs such as those shown in FIGS. 3-8 were transformed into soybean using transformation protocols well known in the art, for example, by bombardment or Agrobacterium. Total soybean genomic DNA was isolated from the first trifoliate leaves of transgenic events using the PureGene tissue DNA isolation kit (product #158667: QIAGEN, Valencia, Calif., USA). Trifoliates were frozen in liquid nitrogen and pulverized. Cells were lysed using the PureGene Cell Lysis Buffer, proteins were precipitated using the PureGene Protein Precipitation Buffer, and DNA was precipitated from the resulting supernatant using ethanol. The DNA pellets were washed with 70% ethanol and resuspended in water.

[0170] Genomic DNA was quantified by the Quant-iT PicoGreen (product #P7589: ThermoFisher Scientific, Waltham, Mass., USA) assay as described by the manufacturer, and 150 ng of DNA was digested overnight with EcoRI, HindIII, NcoI, and/or KpnI. Of the digested DNA, 30 ng was used for a BioRad ddPCR reaction, including FAM- or HEX-labelled probes for the transgene and the Lectin1 endogenous gene, respectively. Transgene copy number (CNV) was calculated by comparing the measured transgene concentration to the reference gene concentration. A CNV of greater than or equal to one was deemed acceptable.
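The copy number comparison described above amounts to a ratio of two measured concentrations. The following is a minimal sketch under stated assumptions: the function name, the `reference_copies` scaling parameter (defaulting to a plain ratio), and the example concentrations are hypothetical; the text specifies only that the transgene concentration is compared to the reference gene concentration and that a value of at least one is acceptable.

```python
def transgene_copy_number(
    transgene_conc: float,
    reference_conc: float,
    reference_copies: float = 1.0,
) -> float:
    """Estimate transgene copy number from ddPCR concentrations.

    Both concentrations must be in the same units (e.g. copies per
    microliter). `reference_copies` is an assumed scaling factor for the
    number of reference gene copies the ratio is normalized to.
    """
    if reference_conc <= 0:
        raise ValueError("reference concentration must be positive")
    if transgene_conc < 0:
        raise ValueError("transgene concentration cannot be negative")
    return reference_copies * transgene_conc / reference_conc


# Hypothetical droplet readouts for one event (copies per microliter).
cnv = transgene_copy_number(transgene_conc=1050.0, reference_conc=1000.0)
acceptable = cnv >= 1.0  # acceptance criterion stated in the text
```

Whether the ratio should be multiplied by the number of reference gene copies per genome depends on how copy number is reported, which is why the scaling factor is exposed as a parameter rather than fixed.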

Preparation of Total Soluble Protein Samples

[0171] Total soluble soybean protein fractions were prepared from the seeds of transgenic events by bead beating seeds (seeds collected about 90 days after germination) at 15000 rpm for 1 min. The resulting powder was resuspended in 50 mM Carbonate-Bicarbonate pH 10.8, 1 mM DTT, 1.times.HALT Protease Inhibitor Cocktail (Product #78438 ThermoFisher Scientific). The resuspended powder was incubated at 4.degree. C. for 15 minutes and then the supernatant collected after centrifuging twice at 4000 g, 20 min, 4.degree. C. Protein concentration was measured using a modified Bradford assay (Thermo Scientific Pierce 660 nm assay; Product #22660 ThermoFisher Scientific) using a bovine serum albumin (BSA) standard curve.
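Reading a protein concentration from a BSA standard curve can be sketched as a least-squares line fit followed by inversion of that line. The standard concentrations and absorbance readings below are hypothetical and deliberately perfectly linear; real assay data are noisier and are only approximately linear within the working range stated by the assay manufacturer.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standard concentrations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical BSA standards (ug/mL) and absorbance readings at 660 nm.
bsa_ug_per_ml = [0.0, 250.0, 500.0, 1000.0, 2000.0]
absorbance = [0.05, 0.30, 0.55, 1.05, 2.05]

slope, intercept = fit_line(bsa_ug_per_ml, absorbance)

# Invert the fitted line for a hypothetical sample reading.
sample_absorbance = 0.80
sample_ug_per_ml = (sample_absorbance - intercept) / slope
```

The resulting concentration can then be used to compute the extract volume needed to load a fixed mass of protein per lane.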

Recombinant Protein Quantification Via Western Blot Densitometry

[0172] SDS-PAGE was performed according to manufacturer's instructions (Product #5678105 BioRad, Hercules, Calif., USA) under denaturing and reducing conditions. 5 ug of total protein extracts were loaded per lane. For immunoblotting, proteins separated by SDS-PAGE were transferred to a PVDF membrane using Trans-Blot.RTM. Turbo.TM. Midi PVDF Transfer Packs (Product #1704157 BioRad) according to manufacturer's guidelines. Membranes were blocked with 3% BSA in phosphate buffered saline with 0.5% Tween-20, reacted with antigen specific antibody and subsequently reacted with fluorescent goat anti rabbit IgG (Product #60871 BioRad, Calif.). Membranes were scanned according to manufacturer's instructions using the ChemiDoc MP Imaging System (BioRad, Calif.) and analyzed using ImageLab Version 6.0.1 Standard Edition (Bio-Rad Laboratories, Inc.). Recombinant protein from the seeds of transgenic events was quantified by densitometry from commercial reference protein spike-in standards.
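Quantification against spike-in standards can be sketched as interpolation between the band intensities of the standards. The sketch below assumes piecewise-linear interpolation, which is one possible choice and is not stated in the text. The standard levels mirror the .kappa.-casein spike-ins reported for FIG. 9A (0.05%, 0.5%, and 1.5%); all band intensities and the function name are hypothetical.

```python
from typing import List, Tuple


def percent_tsp_from_band(
    intensity: float, standards: List[Tuple[float, float]]
) -> float:
    """Interpolate percent of total soluble protein from a band intensity.

    `standards` holds (percent_tsp, band_intensity) pairs with distinct
    intensities. Values outside the standard range are rejected rather
    than extrapolated.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    lowest, highest = points[0][1], points[-1][1]
    if not lowest <= intensity <= highest:
        raise ValueError("band intensity is outside the range of the standards")
    for (pct_lo, int_lo), (pct_hi, int_hi) in zip(points, points[1:]):
        if int_lo <= intensity <= int_hi:
            fraction = (intensity - int_lo) / (int_hi - int_lo)
            return pct_lo + fraction * (pct_hi - pct_lo)
    raise ValueError("at least two standards with distinct intensities are required")


# Spike-in levels as in FIG. 9A; the intensities are hypothetical.
spike_in_standards = [(0.05, 1000.0), (0.5, 10000.0), (1.5, 30000.0)]
sample_percent = percent_tsp_from_band(20000.0, spike_in_standards)
```

Refusing to extrapolate keeps a band that is brighter than the highest standard from being assigned a value the standards cannot support.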

[0173] Shown in FIGS. 9A, 9B, 9C, and 9D are Western Blots of protein extracted from transgenic soybeans expressing the .kappa.-casein-.beta.-lactoglobulin expression cassette shown in FIG. 3. FIG. 9A shows the fusion protein detected using a primary antibody raised against .kappa.-casein. The first lane is a molecular weight marker. Lanes two (DCI 9.1) and three (DCI 9.2) represent individual seeds from a single transgenic line. Lane four (DCI 3.1) represents a seed from a separate transgenic line. Lane five is protein extracted from wild-type soybean plants, and lanes six-eight are protein extracted from wild-type soybean plants spiked with 0.05% commercial .kappa.-casein (lane 6), 0.5% commercial .kappa.-casein (lane 7), and 1.5% commercial .kappa.-casein (lane 8). The .kappa.-casein commercial protein is detected at an apparent molecular weight (MW) of .about.26 kDa (theoretical: 19 kDa--arrow). The fusion protein is detected at an apparent MW of .about.40 kDa (theoretical: 38 kDa--arrowhead).

[0174] FIG. 9B shows the fusion protein detected using a primary antibody raised against .beta.-lactoglobulin. The first lane is a molecular weight marker. Lanes two (DCI 9.1) and three (DCI 9.2) represent individual seeds from a single transgenic line. Lane four (DCI 3.1) represents a seed from a separate transgenic line. Lane five is protein extracted from wild-type soybean plants, and lanes six-eight are protein extracted from wild-type soybean plants spiked with 0.05% commercial .beta.-lactoglobulin (lane 6), 1% commercial .beta.-lactoglobulin (lane 7), and 2% commercial .beta.-lactoglobulin (lane 8). The .beta.-lactoglobulin commercial protein is detected at an apparent MW of .about.18 kDa (theoretical: 18 kDa--arrow). The fusion protein is detected at an apparent MW of .about.40 kDa (theoretical: 38 kDa--arrowhead). FIGS. 9C and 9D show the protein gels as control for equal lane loading (image is taken at the end of the SDS run) for FIGS. 9A and 9B, respectively.

[0175] Other combinations of structured and unstructured proteins were tested and evaluated for the percentage of recombinant protein. Cassettes having the same promoter (Seed2-sig), signal peptide (EUT:Rb7T), and in some instances a different terminator, were built with either .alpha.-S1-casein, .beta.-casein, .kappa.-casein, or the fusion of .beta.-lactoglobulin with .kappa.-casein (kCN-LG) (See FIGS. 3 and 8). As shown below in Table 3, none of the cassettes encoding .alpha.-S1-casein, .beta.-casein, or .kappa.-casein were able to produce expression of the protein at a level that exceeded 1% total soluble protein. However, when .kappa.-casein was fused with .beta.-lactoglobulin, .kappa.-casein was expressed at a level that was greater than 1% total soluble protein.

TABLE 3
Expression levels of unstructured proteins

                                                   Number of events.sup.1 accumulating the
                                                   recombinant protein at the concentration:
                                Total events.sup.1
                                analyzed           0-1% TSP       Above 1% TSP
Unstructured  .kappa.-Casein    89                 89             0
              .beta.-Casein     12                 12             0
              .alpha.S1-Casein  6                  6              0
Fusion        kCN-LG            23                 12             11

.sup.1As used in Table 3, each "event" refers to an independent transgenic line.

[0176] As will be readily understood by those of skill in the art, T-DNA insertion into the plant genome is a random process and each T-DNA lands at an unpredictable genomic position. Hence, each of the 23 events reported in Table 3 for the fusion protein has a different genomic insertion locus. The genomic context greatly influences the expression levels of a gene, and each locus will be either favorable or unfavorable for the expression of the recombinant genes. The variability observed at the protein level is a reflection of that random insertion process, and explains why 12 out of 23 events present expression levels below 1%.
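The event counts reported in Table 3 can be restated as the fraction of independent events exceeding 1% TSP for each construct. The sketch below uses only the counts given in Table 3; the dictionary keys and the function name are hypothetical labels.

```python
from typing import Dict, Tuple

# (total events analyzed, events at 0-1% TSP, events above 1% TSP), from Table 3.
TABLE_3: Dict[str, Tuple[int, int, int]] = {
    "kappa-casein": (89, 89, 0),
    "beta-casein": (12, 12, 0),
    "alphaS1-casein": (6, 6, 0),
    "kCN-LG fusion": (23, 12, 11),
}


def fraction_above_threshold(total: int, at_or_below: int, above: int) -> float:
    """Fraction of events above 1% TSP, with a consistency check on the counts."""
    if total <= 0 or at_or_below + above != total:
        raise ValueError("event counts are inconsistent")
    return above / total


fractions = {
    construct: fraction_above_threshold(*counts)
    for construct, counts in TABLE_3.items()
}
```

For the fusion construct this gives 11 of 23 events, or roughly 48%, above the threshold, compared with none of the 107 events analyzed for the three unfused caseins.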

Example 3: Food Compositions

[0177] The transgenic plants expressing the recombinant fusion proteins described herein can produce milk proteins for the purpose of food industrial, non-food industrial, pharmaceutical, and commercial uses described in this disclosure. An illustrative method for making a food composition is provided in FIG. 11.

[0178] A fusion protein comprising an unstructured milk protein (para-.kappa.-casein) and a structured mammalian protein (.beta.-lactoglobulin) is expressed in a transgenic soybean plant. The fusion protein comprises a chymosin cleavage site between the para-.kappa.-casein and the .beta.-lactoglobulin.

[0179] The fusion protein is extracted from the plant. The fusion protein is then treated with chymosin, to separate the para-.kappa.-casein from the .beta.-lactoglobulin. The para-.kappa.-casein is isolated and/or purified and used to make a food composition (e.g., cheese).

Numbered Embodiments

[0180] Notwithstanding the appended claims, the following numbered embodiments also form part of the instant disclosure.

[0181] 1. A stably transformed plant comprising in its genome: a recombinant DNA construct encoding a fusion protein, the fusion protein comprising: (i) an unstructured milk protein, and (ii) a structured animal protein; wherein the fusion protein is stably expressed in the plant in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.

[0182] 2. The stably transformed plant of embodiment 1, wherein the fusion protein comprises, from N-terminus to C-terminus, the unstructured milk protein and the animal protein.

[0183] 3. The stably transformed plant of any one of embodiments 1-2, wherein the unstructured milk protein is .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, or .kappa.-casein.

[0184] 4. The stably transformed plant of embodiment 1, wherein the unstructured milk protein is .kappa.-casein and comprises the sequence of SEQ ID NO: 4, or a sequence at least 90% identical thereto.

[0185] 5. The stably transformed plant of embodiment 1, wherein the unstructured milk protein is para-.kappa.-casein and comprises the sequence of SEQ ID NO: 2, or a sequence at least 90% identical thereto.

[0186] 6. The stably transformed plant of embodiment 1, wherein the unstructured milk protein is .beta.-casein and comprises the sequence of SEQ ID NO: 6, or a sequence at least 90% identical thereto.

[0187] 7. The stably transformed plant of embodiment 1, wherein the unstructured milk protein is .alpha.-S1 casein and comprises the sequence of SEQ ID NO: 8, or a sequence at least 90% identical thereto.

[0188] 8. The stably transformed plant of embodiment 1, wherein the unstructured milk protein is .alpha.-S2 casein and comprises the sequence of SEQ ID NO: 84, or a sequence at least 90% identical thereto.

[0189] 9. The stably transformed plant of any one of embodiments 1-8, wherein the structured animal protein is a structured mammalian protein.

[0190] 10. The stably transformed plant of embodiment 9, wherein the structured mammalian protein is .beta.-lactoglobulin, .alpha.-lactalbumin, albumin, lysozyme, lactoferrin, lactoperoxidase, hemoglobin, collagen, or an immunoglobulin.

[0191] 11. The stably transformed plant of embodiment 9, wherein the structured mammalian protein is .beta.-lactoglobulin and comprises the sequence of SEQ ID NO: 10, or a sequence at least 90% identical thereto.

[0192] 12. The stably transformed plant of any one of embodiments 1-8, wherein the structured animal protein is a structured avian protein.

[0193] 13. The stably transformed plant of embodiment 12, wherein the structured avian protein is ovalbumin, ovotransferrin, lysozyme or ovoglobulin.

[0194] 14. The stably transformed plant of embodiment 9, wherein the milk protein is .kappa.-casein and the structured mammalian protein is .beta.-lactoglobulin.

[0195] 15. The stably transformed plant of embodiment 9, wherein the milk protein is para-.kappa.-casein and the structured mammalian protein is .beta.-lactoglobulin.

[0196] 16. The stably transformed plant of embodiment 9, wherein the milk protein is .beta.-casein and the structured mammalian protein is .beta.-lactoglobulin.

[0197] 17. The stably transformed plant of embodiment 9, wherein the milk protein is .alpha.-S1 casein or .alpha.-S2 casein and the structured mammalian protein is .beta.-lactoglobulin.

[0198] 18. The stably transformed plant of any one of embodiments 1-17, wherein the plant is a dicot.

[0199] 19. The stably transformed plant of embodiment 18, wherein the dicot is Arabidopsis, tobacco, tomato, potato, sweet potato, cassava, alfalfa, lima bean, pea, chick pea, soybean, carrot, strawberry, lettuce, oak, maple, walnut, rose, mint, squash, daisy, Quinoa, buckwheat, mung bean, cow pea, lentil, lupin, peanut, fava bean, French beans (i.e., common beans), mustard, or cactus.

[0200] 20. The stably transformed plant of any one of embodiments 1-19, wherein the plant is soybean.

[0201] 21. The stably transformed plant of any one of embodiments 1-20, wherein the recombinant DNA construct is codon-optimized for expression in the plant.

[0202] 22. The stably transformed plant of any one of embodiments 1-21, wherein the fusion protein comprises a protease cleavage site.

[0203] 23. The stably transformed plant of embodiment 22, wherein the protease cleavage site is a chymosin cleavage site.

[0204] 24. The stably transformed plant of any one of embodiments 1-23, wherein the fusion protein is expressed at a level at least 2-fold higher than an unstructured milk protein expressed individually in a plant.

[0205] 25. The stably transformed plant of any one of embodiments 1-24, wherein the fusion protein accumulates in the plant at least 2-fold higher than an unstructured milk protein expressed without the structured animal protein.

[0206] 26. A recombinant fusion protein comprising: (i) an unstructured milk protein, and (ii) a structured animal protein.

[0207] 27. The recombinant fusion protein of embodiment 26, wherein the fusion protein is expressed in a plant.

[0208] 28. The recombinant fusion protein of embodiment 26 or 27, wherein the unstructured milk protein is .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, or .kappa.-casein.

[0209] 29. The recombinant fusion protein of embodiment 28, wherein the milk protein is .kappa.-casein and comprises the sequence of SEQ ID NO: 4, or a sequence at least 90% identical thereto.

[0210] 30. The recombinant fusion protein of embodiment 28, wherein the milk protein is para-.kappa.-casein and comprises the sequence of SEQ ID NO: 2, or a sequence at least 90% identical thereto.

[0211] 31. The recombinant fusion protein of embodiment 28, wherein the milk protein is .beta.-casein and comprises the sequence of SEQ ID NO: 6, or a sequence at least 90% identical thereto.

[0212] 32. The recombinant fusion protein of embodiment 28, wherein the milk protein is .alpha.-S1 casein and comprises the sequence of SEQ ID NO: 8, or a sequence at least 90% identical thereto.

[0213] 33. The recombinant fusion protein of embodiment 28, wherein the milk protein is .alpha.-S2 casein and comprises the sequence of SEQ ID NO: 84, or a sequence at least 90% identical thereto.

[0214] 34. The recombinant fusion protein of any one of embodiments 26-33, wherein the structured animal protein is a structured mammalian protein.

[0215] 35. The recombinant fusion protein of embodiment 34, wherein the structured mammalian protein is .beta.-lactoglobulin, .alpha.-lactalbumin, albumin, lysozyme, lactoferrin, lactoperoxidase, hemoglobin, collagen, or an immunoglobulin.

[0216] 36. The recombinant fusion protein of embodiment 34, wherein the structured mammalian protein is .beta.-lactoglobulin and comprises the sequence of SEQ ID NO: 10, or a sequence at least 90% identical thereto.

[0217] 37. The recombinant fusion protein of any one of embodiments 26-33, wherein the structured animal protein is a structured avian protein.

[0218] 38. The recombinant fusion protein of embodiment 37, wherein the structured avian protein is ovalbumin, ovotransferrin, lysozyme or ovoglobulin.

[0219] 39. The recombinant fusion protein of embodiment 34, wherein the milk protein is .kappa.-casein and the structured mammalian protein is .beta.-lactoglobulin.

[0220] 40. The recombinant fusion protein of embodiment 34, wherein the milk protein is para-.kappa.-casein and the structured mammalian protein is .beta.-lactoglobulin.

[0221] 41. The recombinant fusion protein of embodiment 34, wherein the milk protein is .beta.-casein and the structured mammalian protein is .beta.-lactoglobulin.

[0222] 42. The recombinant fusion protein of embodiment 34, wherein the milk protein is .alpha.-S1 casein or .alpha.-S2 casein and the structured mammalian protein is .beta.-lactoglobulin.

[0223] 43. The recombinant fusion protein of embodiment 34, wherein the fusion protein comprises a protease cleavage site.

[0224] 44. The recombinant fusion protein of embodiment 34, wherein the protease cleavage site is a chymosin cleavage site.

[0225] 45. A nucleic acid encoding the recombinant fusion protein of any one of embodiments 26 to 44.

[0226] 46. The nucleic acid of embodiment 45, wherein the nucleic acid is codon optimized for expression in a plant species.

[0227] 47. The nucleic acid of embodiment 45 or 46, wherein the nucleic acid is codon optimized for expression in soybean.

[0228] 48. A vector comprising a nucleic acid encoding a recombinant fusion protein, wherein the recombinant fusion protein comprises: (i) an unstructured milk protein, and (ii) a structured animal protein.

[0229] 49. The vector of embodiment 48, wherein the vector is a plasmid.

[0230] 50. The vector of embodiment 49, wherein the vector is an Agrobacterium Ti plasmid.

[0231] 51. The vector of any one of embodiments 48-50, wherein the nucleic acid comprises, in order from 5' to 3': a promoter; a 5' untranslated region; a sequence encoding the fusion protein; and a terminator.

[0232] 52. The vector of embodiment 51, wherein the promoter is a seed-specific promoter.

[0233] 53. The vector of embodiment 52, wherein the seed-specific promoter is selected from the group consisting of PvPhas, BnNap, AtOle1, GmSeed2, GmSeed3, GmSeed5, GmSeed6, GmSeed7, GmSeed8, GmSeed10, GmSeed11, GmSeed12, pBCON, GmCEP1-L, GmTHIC, GmBg7S1, GmGRD, GmOLEA, GmOLER, Gm2S-1, and GmBBld-II.

[0234] 54. The vector of embodiment 53, wherein the seed-specific promoter is PvPhas and comprises the sequence of SEQ ID NO: 18, or a sequence at least 90% identical thereto.

[0235] 55. The vector of embodiment 53, wherein the seed-specific promoter is GmSeed2 and comprises the sequence of SEQ ID NO: 19, or a sequence at least 90% identical thereto.

[0236] 56. The vector of any one of embodiments 51-55, wherein the 5' untranslated region is selected from the group consisting of Arc5'UTR and glnBIUTR.

[0237] 57. The vector of embodiment 56, wherein the 5' untranslated region is Arc5'UTR and comprises the sequence of SEQ ID NO: 20, or a sequence at least 90% identical thereto.

[0238] 58. The vector of any one of embodiments 51-57, wherein the expression cassette comprises a 3' untranslated region.

[0239] 59. The vector of embodiment 58, wherein the 3' untranslated region is Arc5-1 and comprises SEQ ID NO: 21, or a sequence at least 90% identical thereto.

[0240] 60. The vector of any one of embodiments 51-59, wherein the terminator sequence is a terminator isolated or derived from a gene encoding Nopaline synthase, Arc5-1, an Extensin, Rb7 matrix attachment region, a Heat shock protein, Ubiquitin 10, Ubiquitin 3, and M6 matrix attachment region.

[0241] 61. The vector of embodiment 60, wherein the terminator sequence is isolated or derived from a Nopaline synthase gene and comprises the sequence of SEQ ID NO: 22, or a sequence at least 90% identical thereto.

[0242] 62. A plant comprising the recombinant fusion protein of any one of embodiments 26-44 or the nucleic acid of any one of embodiments 45-47.

[0243] 63. A method for stably expressing a recombinant fusion protein in a plant, the method comprising: a) transforming a plant with a plant transformation vector comprising an expression cassette comprising: a sequence encoding a fusion protein, wherein the fusion protein comprises an unstructured milk protein, and a structured animal protein; and b) growing the transformed plant under conditions wherein the recombinant fusion protein is expressed in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.

[0244] 64. The method of embodiment 63, wherein the unstructured milk protein is .kappa.-casein.

[0245] 65. The method of embodiment 63 or 64, wherein the structured animal protein is .beta.-lactoglobulin.

[0246] 66. A food composition comprising the recombinant fusion protein of any one of embodiments 26-44.

[0247] 67. A method for making a food composition, the method comprising: expressing the recombinant fusion protein of any one of embodiments 26-44 in a plant; extracting the recombinant fusion protein from the plant; optionally, separating the milk protein from the structured animal protein or the structured plant protein; and creating a food composition using the milk protein or the fusion protein.

[0248] 68. The method of embodiment 67, wherein the plant stably expresses the recombinant fusion protein.

[0249] 69. The method of embodiment 68, wherein the plant expresses the recombinant fusion protein in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.

[0250] 70. The method of any one of embodiments 67-69, wherein the plant is soybean.

[0251] 71. The method of any one of embodiments 67-70, wherein the food composition comprises the structured animal or plant protein.

[0252] 72. The method of any one of embodiments 67-71, wherein the milk protein and the structured animal or plant protein are separated from one another in the plant cell, prior to extraction.

[0253] 73. The method of any one of embodiments 67-71, wherein the milk protein is separated from the structured animal or plant protein after extraction, by contacting the fusion protein with an enzyme that cleaves the fusion protein.

[0254] 74. A food composition produced using the method of any one of embodiments 67-73.

[0255] 75. A plant-expressed recombinant fusion protein, comprising: .kappa.-casein; and .beta.-lactoglobulin.

[0256] 76. The plant-expressed recombinant fusion protein of embodiment 75, wherein the fusion protein comprises, in order from N-terminus to C-terminus, the .kappa.-casein and the .beta.-lactoglobulin.

[0257] 77. The plant-expressed recombinant fusion protein of embodiment 75 or 76, wherein the fusion protein comprises a protease cleavage site.

[0258] 78. The plant-expressed recombinant fusion protein of embodiment 77, wherein the protease cleavage site is a chymosin cleavage site.

[0259] 79. The plant-expressed recombinant fusion protein of any one of embodiments 75-78, wherein the fusion protein comprises a signal peptide.

[0260] 80. The plant-expressed recombinant fusion protein of embodiment 79, wherein the signal peptide is located at the N-terminus of the fusion protein.

[0261] 81. The plant-expressed recombinant fusion protein of any one of embodiments 75-80, wherein the fusion protein is encoded by a nucleic acid that is codon optimized for expression in a plant.

[0262] 82. The plant-expressed recombinant fusion protein of any one of embodiments 75-81, wherein the fusion protein is expressed in a soybean.

[0263] 83. The plant-expressed recombinant fusion protein of any one of embodiments 75-81, wherein the fusion protein has a molecular weight of 30 kDa to 50 kDa.

[0264] 84. The plant-expressed recombinant fusion protein of any one of embodiments 75-83, wherein the fusion protein is expressed in a plant in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.

[0265] 85. The plant-expressed recombinant fusion protein of any one of embodiments 75-84, wherein the fusion protein is expressed in the plant at a level at least 2-fold higher than .kappa.-casein expressed individually in a plant.

[0266] 86. The plant-expressed recombinant fusion protein of any one of embodiments 75-84, wherein the fusion protein accumulates in the plant at least 2-fold higher than .kappa.-casein expressed without .beta.-lactoglobulin.

[0267] 87. A stably transformed plant, comprising in its genome: a recombinant DNA construct encoding a fusion protein, the fusion protein comprising: .kappa.-casein; and .beta.-lactoglobulin; wherein the fusion protein is stably expressed in the plant in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.

[0268] 88. The stably transformed plant of embodiment 87, wherein the fusion protein comprises, in order from N-terminus to C-terminus, the .kappa.-casein and the .beta.-lactoglobulin.

[0269] 89. The stably transformed plant of embodiment 87 or 88, wherein the fusion protein comprises a protease cleavage site.

[0270] 90. The stably transformed plant of embodiment 89, wherein the protease cleavage site is a chymosin cleavage site.

[0271] 91. The stably transformed plant of any one of embodiments 87-90, wherein the fusion protein comprises a signal peptide.

[0272] 92. The stably transformed plant of embodiment 91, wherein the signal peptide is located at the N-terminus of the fusion protein.

[0273] 93. The stably transformed plant of any one of embodiments 87-92, wherein the plant is soybean.

[0274] 94. The stably transformed plant of any one of embodiments 87-93, wherein the recombinant DNA construct comprises codon-optimized nucleic acids for expression in the plant.

[0275] 95. The stably transformed plant of any one of embodiments 87-94, wherein the fusion protein has a molecular weight of 30 kDa to 50 kDa.

[0276] 96. The stably transformed plant of any one of embodiments 87-95, wherein the fusion protein is expressed at a level at least 2-fold higher than .kappa.-casein expressed individually in a plant.

[0277] 97. The stably transformed plant of any one of embodiments 87-96, wherein the fusion protein accumulates in the plant at least 2-fold higher than .kappa.-casein expressed without .beta.-lactoglobulin.

[0278] 98. A plant-expressed recombinant fusion protein comprising: a casein protein and .beta.-lactoglobulin.

[0279] 99. The plant-expressed recombinant fusion protein of embodiment 98, wherein the casein protein is .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, or .kappa.-casein.

[0280] 100. A stably transformed plant, comprising in its genome: a recombinant DNA construct encoding a fusion protein, the fusion protein comprising: a casein protein and .beta.-lactoglobulin; wherein the fusion protein is stably expressed in the plant in an amount of 1% or higher per total protein weight of soluble protein extractable from the plant.

[0281] 101. The stably transformed plant of embodiment 100, wherein the casein protein is .alpha.-S1 casein, .alpha.-S2 casein, .beta.-casein, or .kappa.-casein.
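As a rough plausibility check on the 30 kDa to 50 kDa range recited in embodiments 83 and 95, the mass of a .kappa.-casein/.beta.-lactoglobulin fusion can be estimated from residue counts. This is a sketch under stated assumptions: the residue counts (169 and 162) are the lengths given in the sequence listing for SEQ ID NO: 4 and SEQ ID NO: 10, the average residue mass of about 110 Da is a common rule of thumb rather than a value from the disclosure, and the function name estimate_mass_kda is hypothetical. Linkers, cleavage sites, signal peptides, and post-translational modifications are ignored.

```python
# Rule-of-thumb average mass per amino acid residue (assumption, in daltons).
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kDa from its residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


KAPPA_CASEIN_LENGTH = 169  # length listed for SEQ ID NO: 4
BLG_LENGTH = 162           # length listed for SEQ ID NO: 10

fusion_kda = estimate_mass_kda(KAPPA_CASEIN_LENGTH + BLG_LENGTH)
within_range = 30.0 <= fusion_kda <= 50.0

print(f"estimated fusion mass: {fusion_kda:.1f} kDa")
print(f"within 30 kDa to 50 kDa: {within_range}")
```

With these inputs the estimate is roughly 36 kDa, which sits inside the recited range; a sequence-based calculation using exact residue masses would refine the figure.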

Sequence CWU 1

1

1371318DNAArtificial SequenceOptimized para-kappa-casein truncated version 1 (paraOKC1-T) 1caagagcaga atcaagagca gccaatccgt tgtgagaagg acgagaggtt cttctcagac 60aagatcgcca aatatatacc catacaatat gtactctcac gctaccctag ctacgggctt 120aactactatc agcaaaaacc tgtagcactg ataaataacc agtttctccc ctatccctat 180tatgctaaac ctgccgccgt gaggagtcca gcacaaatac ttcagtggca agtgctcagt 240aacaccgtgc cagcaaaaag ctgccaggct cagcccacca caatggcccg tcatccccat 300cctcacctta gcttcatg 3182106PRTArtificial SequenceOptimized para-kappa-casein truncated version 1 (paraOKC1-T) 2Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys Asp Glu Arg1 5 10 15Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln Tyr Val Leu 20 25 30Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Lys Pro Val 35 40 45Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro 50 55 60Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln Val Leu Ser65 70 75 80Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr Thr Met Ala 85 90 95Arg His Pro His Pro His Leu Ser Phe Met 100 1053507DNAArtificial SequenceOptimized kappa-casein truncated version 1 (OKC1-T) 3caagagcaga atcaagagca gccaatccgt tgtgagaagg acgagaggtt cttctcagac 60aagatcgcca aatatatacc catacaatat gtactctcac gctaccctag ctacgggctt 120aactactatc agcaaaaacc tgtagcactg ataaataacc agtttctccc ctatccctat 180tatgctaaac ctgccgccgt gaggagtcca gcacaaatac ttcagtggca agtgctcagt 240aacaccgtgc cagcaaaaag ctgccaggct cagcccacca caatggcccg tcatccccat 300cctcacctta gcttcatggc aatcccacca aagaagaatc aagacaagac cgaaatacct 360accatcaaca caattgcatc tggagagcct accagtacac caacaactga ggcagtagag 420tctactgttg ctacccttga ggacagcccc gaggttatag agtccccacc tgagataaat 480accgtgcagg tgacaagtac cgccgta 5074169PRTArtificial SequenceOptimized kappa-casein truncated version 1 (OKC1-T) 4Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys Asp Glu Arg1 5 10 15Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln Tyr Val Leu 20 25 30Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Lys Pro Val 35 40 45Ala Leu Ile 
Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro 50 55 60Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln Val Leu Ser65 70 75 80Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr Thr Met Ala 85 90 95Arg His Pro His Pro His Leu Ser Phe Met Ala Ile Pro Pro Lys Lys 100 105 110Asn Gln Asp Lys Thr Glu Ile Pro Thr Ile Asn Thr Ile Ala Ser Gly 115 120 125Glu Pro Thr Ser Thr Pro Thr Thr Glu Ala Val Glu Ser Thr Val Ala 130 135 140Thr Leu Glu Asp Ser Pro Glu Val Ile Glu Ser Pro Pro Glu Ile Asn145 150 155 160Thr Val Gln Val Thr Ser Thr Ala Val 1655627DNAArtificial SequenceOptimized beta-casein truncated version 2 (OBC-T2) 5cgcgaactgg aagagttgaa cgtaccagga gagattgtag aatcactgag ctcctcagag 60gagtctatta ctcgtatcaa caagaagata gagaagttcc aatccgagga gcaacaacaa 120acagaggacg aattgcagga caagatacat cctttcgcac agacccagag cctcgtctat 180ccctttccag gtccaatccc taactctctc ccccagaata tcccaccctt gactcagact 240cccgtggtcg tacccccttt cttgcaaccc gaggtgatgg gggtttctaa agtcaaagag 300gctatggctc ctaaacataa ggaaatgcct tttcccaaat atccagtgga gccattcact 360gagagccagt ctctgacact tacagatgtg gaaaacttgc acctgccctt gccacttttg 420cagtcctgga tgcaccaacc acatcaaccc ttgcccccca cagtgatgtt tcctccacaa 480tcagttctta gtctctccca aagcaaagtc cttccagtgc ctcagaaggc cgtcccatac 540ccccagagag atatgccaat acaggcattc ttgctttacc aggaaccagt gctcggtcct 600gtacgtggcc cattccctat catagtg 6276209PRTArtificial SequenceOptimized beta-casein truncated version 2 (OBC-T2) 6Arg Glu Leu Glu Glu Leu Asn Val Pro Gly Glu Ile Val Glu Ser Leu1 5 10 15Ser Ser Ser Glu Glu Ser Ile Thr Arg Ile Asn Lys Lys Ile Glu Lys 20 25 30Phe Gln Ser Glu Glu Gln Gln Gln Thr Glu Asp Glu Leu Gln Asp Lys 35 40 45Ile His Pro Phe Ala Gln Thr Gln Ser Leu Val Tyr Pro Phe Pro Gly 50 55 60Pro Ile Pro Asn Ser Leu Pro Gln Asn Ile Pro Pro Leu Thr Gln Thr65 70 75 80Pro Val Val Val Pro Pro Phe Leu Gln Pro Glu Val Met Gly Val Ser 85 90 95Lys Val Lys Glu Ala Met Ala Pro Lys His Lys Glu Met Pro Phe Pro 100 105 110Lys Tyr Pro Val Glu Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu Thr 
115 120 125Asp Val Glu Asn Leu His Leu Pro Leu Pro Leu Leu Gln Ser Trp Met 130 135 140His Gln Pro His Gln Pro Leu Pro Pro Thr Val Met Phe Pro Pro Gln145 150 155 160Ser Val Leu Ser Leu Ser Gln Ser Lys Val Leu Pro Val Pro Gln Lys 165 170 175Ala Val Pro Tyr Pro Gln Arg Asp Met Pro Ile Gln Ala Phe Leu Leu 180 185 190Tyr Gln Glu Pro Val Leu Gly Pro Val Arg Gly Pro Phe Pro Ile Ile 195 200 205Val7597DNAArtificial SequenceOptimized alpha S1-casein truncated version 1 (OaS1-T) 7cgcccaaaac atcccataaa acatcaagga ttgccccagg aagtactcaa cgagaatctc 60ctccgttttt tcgttgctcc tttccccgaa gtgttcggga aggaaaaagt aaacgagctt 120tcaaaggaca tcggctctga aagtaccgag gatcaggcta tggaagatat caagcaaatg 180gaggccgaat ctataagttc ttcagaagaa atagttccca actcagtgga gcagaagcac 240attcagaaag aagacgtgcc cagcgagcgc tatctgggat atttggaaca gctgctcaga 300ctgaaaaagt acaaggtgcc tcagctcgaa atcgtaccca atagtgctga agaaaggttg 360cactcaatga aagaggggat tcacgcacaa caaaaagagc ctatgatcgg agtaaatcaa 420gaactggcat acttttatcc cgagttgttt cgccaattct atcaactgga tgcctaccct 480tccggtgcat ggtactacgt acccctcggt actcaatata ccgatgctcc ctccttttcc 540gacattccta atcctatagg ttccgagaat agcgaaaaga ccaccatgcc cttatgg 5978199PRTArtificial SequenceOptimized alpha S1-casein truncated version 1 (OaS1-T) 8Arg Pro Lys His Pro Ile Lys His Gln Gly Leu Pro Gln Glu Val Leu1 5 10 15Asn Glu Asn Leu Leu Arg Phe Phe Val Ala Pro Phe Pro Glu Val Phe 20 25 30Gly Lys Glu Lys Val Asn Glu Leu Ser Lys Asp Ile Gly Ser Glu Ser 35 40 45Thr Glu Asp Gln Ala Met Glu Asp Ile Lys Gln Met Glu Ala Glu Ser 50 55 60Ile Ser Ser Ser Glu Glu Ile Val Pro Asn Ser Val Glu Gln Lys His65 70 75 80Ile Gln Lys Glu Asp Val Pro Ser Glu Arg Tyr Leu Gly Tyr Leu Glu 85 90 95Gln Leu Leu Arg Leu Lys Lys Tyr Lys Val Pro Gln Leu Glu Ile Val 100 105 110Pro Asn Ser Ala Glu Glu Arg Leu His Ser Met Lys Glu Gly Ile His 115 120 125Ala Gln Gln Lys Glu Pro Met Ile Gly Val Asn Gln Glu Leu Ala Tyr 130 135 140Phe Tyr Pro Glu Leu Phe Arg Gln Phe Tyr Gln Leu Asp Ala Tyr Pro145 150 155 160Ser Gly Ala Trp Tyr 
Tyr Val Pro Leu Gly Thr Gln Tyr Thr Asp Ala 165 170 175Pro Ser Phe Ser Asp Ile Pro Asn Pro Ile Gly Ser Glu Asn Ser Glu 180 185 190Lys Thr Thr Met Pro Leu Trp 1959486DNAArtificial SequenceOptimized Beta Lactoglobulin 1 (OLG1) 9ttgatcgtaa cacagactat gaagggtctt gatatacaga aggtggccgg gacttggtac 60agtttggcaa tggccgcatc cgacatctcc ttgttggacg cacaatcagc cccattgcgt 120gtgtacgtag aagagcttaa accaactccc gagggggatc tggaaattct gctccagaaa 180tgggagaacg gtgagtgcgc ccagaagaag atcatcgcag agaagaccaa aattccagca 240gtattcaaaa tcgacgcatt gaacgaaaat aaggtgctcg tactggacac tgattataag 300aagtatctcc ttttctgtat ggagaactca gcagagcctg aacagagtct tgcctgccaa 360tgccttgttc gtaccccaga ggtagatgat gaagctctgg aaaagttcga taaggccctt 420aaggctctgc ctatgcacat taggctttct ttcaatccaa ctcaacttga ggaacaatgt 480cacatt 48610162PRTArtificial SequenceOptimized Beta Lactoglobulin 1 (OLG1) 10Leu Ile Val Thr Gln Thr Met Lys Gly Leu Asp Ile Gln Lys Val Ala1 5 10 15Gly Thr Trp Tyr Ser Leu Ala Met Ala Ala Ser Asp Ile Ser Leu Leu 20 25 30Asp Ala Gln Ser Ala Pro Leu Arg Val Tyr Val Glu Glu Leu Lys Pro 35 40 45Thr Pro Glu Gly Asp Leu Glu Ile Leu Leu Gln Lys Trp Glu Asn Gly 50 55 60Glu Cys Ala Gln Lys Lys Ile Ile Ala Glu Lys Thr Lys Ile Pro Ala65 70 75 80Val Phe Lys Ile Asp Ala Leu Asn Glu Asn Lys Val Leu Val Leu Asp 85 90 95Thr Asp Tyr Lys Lys Tyr Leu Leu Phe Cys Met Glu Asn Ser Ala Glu 100 105 110Pro Glu Gln Ser Leu Ala Cys Gln Cys Leu Val Arg Thr Pro Glu Val 115 120 125Asp Asp Glu Ala Leu Glu Lys Phe Asp Lys Ala Leu Lys Ala Leu Pro 130 135 140Met His Ile Arg Leu Ser Phe Asn Pro Thr Gln Leu Glu Glu Gln Cys145 150 155 160His Ile11486DNAArtificial SequenceOptimized Beta Lactoglobulin 2 (OLG2) 11cttattgtga cccaaaccat gaagggcctc gacattcaaa aggttgccgg aacctggtac 60tcccttgcta tggctgcttc cgatatctcc ttgctcgatg ctcaatccgc tccacttagg 120gtgtacgtgg aagagttgaa gccaactcca gagggcgatc ttgagatctt gcttcaaaag 180tgggagaacg atgagtgcgc ccagaagaag attatcgccg aaaagaccaa gattcccgcc 240gtgttcaaga tcgatgctct caacgagaac aaggtgctcg tgctcgatac 
cgactacaag 300aagtaccttc tcgtctgcat ggaaaactcc gctgagccag agcaatctct tgtttgccaa 360tgccttgtga ggaccccaga ggttgacgat gaagctcttg agaagttcga caaggctctc 420aaggctttgc ctatgcacat ccgccttagc ttcaacccaa ctcagcttga ggaacagtgc 480cacatc 48612486DNAArtificial SequenceOptimized Beta Lactoglobulin 3 (OLG3) 12ctcattgtta cacaaaccat gaagggtctt gacattcaga aggttgctgg gacatggtat 60tcactagcga tggctgcttc tgatatctcc ctgttggatg cacagtctgc ccccctgaga 120gtgtatgttg aagaactgaa accgacacct gaaggagact tggaaatttt actccagaaa 180tgggaaaatg atgagtgtgc ccaaaagaag ataatagccg agaagaccaa aattcctgct 240gtgtttaaga ttgatgcttt gaatgagaac aaagtactag tcctcgacac tgattacaag 300aaatacttat tagtgtgcat ggaaaacagc gcagagccag aacaatcact tgtttgtcaa 360tgtttggtcc gtactccaga ggtagatgat gaagcattgg agaaatttga taaagcattg 420aaggcacttc caatgcatat aaggcttagt ttcaatccta ctcagcttga agagcaatgc 480cacatc 48613486DNAArtificial SequenceOptimized Beta Lactoglobulin 4 (OLG4) 13cttatagtaa ctcaaaccat gaagggactt gatatccaaa aagttgcagg aacctggtac 60tcactggcta tggcagcttc cgacatctcc ttgttggacg cacaatccgc accattgcgc 120gtctacgttg aggagttgaa acctacacca gagggggatc ttgagatttt gctccagaaa 180tgggagaacg acgagtgtgc ccagaaaaaa attatagcag agaagactaa aattcctgct 240gtttttaaga ttgatgccct gaacgagaat aaggtactgg tcctcgacac tgattataaa 300aagtatttgc tggtgtgtat ggagaacagt gctgaacctg aacagagcct ggtctgtcaa 360tgtcttgtaa ggacacctga ggttgatgac gaggcacttg aaaaattcga caaggccctt 420aaggctctgc ctatgcacat ccgtctgagt ttcaacccta ctcagttgga ggaacaatgt 480catatt 4861496DNAGlycine max 14atggctactt caaagttgaa aacccagaat gtggttgtat ctctctccct aaccttaacc 60ttggtactgg tgctactgac cagcaaggca aactca 961532PRTGlycine max 15Met Ala Thr Ser Lys Leu Lys Thr Gln Asn Val Val Val Ser Leu Ser1 5 10 15Leu Thr Leu Thr Leu Val Leu Val Leu Leu Thr Ser Lys Ala Asn Ser 20 25 301657DNAGlycine max 16atggccaagc tagttttttc cctttgtttt ctgcttttca gtggctgctg cttcgct 571719PRTGlycine max 17Met Ala Lys Leu Val Phe Ser Leu Cys Phe Leu Leu Phe Ser Gly Cys1 5 10 15Cys Phe Ala181543DNAGlycine max 
18cattgtactc ccagtatcat tatagtgaaa gttttggctc tctcgccggt ggttttttac 60ctctatttaa aggggttttc cacctaaaaa ttctggtatc attctcactt tacttgttac 120tttaatttct cataatcttt ggttgaaatt atcacgcttc cgcacacgat atccctacaa 180atttattatt tgttaaacat tttcaaaccg cataaaattt tatgaagtcc cgtctatctt 240taatgtagtc taacattttc atattgaaat atataattta cttaatttta gcgttggtag 300aaagcataat gatttattct tattcttctt catataaatg tttaatatac aatataaaca 360aattctttac cttaagaagg atttcccatt ttatatttta aaaatatatt tatcaaatat 420ttttcaacca cgtaaatcac ataataataa gttgtttcaa aagtaataaa atttaactcc 480ataatttttt tatttgactg atcttaaagc aacacccagt gacacaacta gccatttttt 540tctttgaata aaaaaatcca attatcattg tatttttttt atacaatgaa aatttcacca 600aacaatgatt tgtggtattt ctgaagcaag tcatgttatg caaaattcta taattcccat 660ttgacactac ggaagtaact gaagatctgc ttttacatgc gagacacatc ttctaaagta 720attttaataa tagttactat attcaagatt tcatatatca aatactcaat attacttcta 780aaaaattaat tagatataat taaaatatta cttttttaat tttaagttta attgttgaat 840ttgtgactat tgatttatta ttctactatg tttaaattgt tttataggta gtttaaagta 900aatataagta atgtagtaga gtgttagagt gttaccctaa accataaact ataagattta 960tggtggacta attttcatat atttcttatt gcttttacct tttcttggta tgtaagtccg 1020taactggaat tactgtgggt tgccatgaca ctctgtggtc ttttggttca tgcatggatg 1080cttgcgcaag aaaaagacaa agaacaaaga aaaaagacaa aacagagaga caaaacgcaa 1140tcacacaacc aactcaaatt agtcactggc tgatcaagat cgccgcgtcc atgtatgtct 1200aaatgccatg caaagcaaca cgtgcttaac atgcacttta aatggctcac ccatcccaac 1260ccactcacaa acacattgcc tttttcttca tcatcaccac aaccacctgt atatattcat 1320tctcttccgc cacctcaatt tcttcacttc aacacacgtc aacctgcata tgcgtgtcat 1380cccatgccca aatctccatg catgttccta ccaccttctc tcttatataa tacctataaa 1440tacctctaat atcactcact tctttcatca tccatccatc cagagtacta ctactctact 1500actataatac cccaacccaa ctcatattca atactactct act 1543191384DNAGlycine max 19aacacaagct tcaagtttta aaaggaaaaa tgtcagccaa aaactttaaa taaaatggta 60acaaggaaat tattcaaaaa ttacaaacct cgtcaaaata ggaaagaaaa aaagtttagg 120gatttagaaa aaacatcaat ctagttccac cttattttat 
agagagaaga aactaatata 180taagaactaa aaaacagaag aatagaaaaa aaaagtattg acaggaaaga aaaagtagct 240gtatgcttat aagtactttg aggatttgaa ttctctctta taaaacacaa acacaatttt 300tagattttat ttaaataatc atcaatccga ttataattat ttatatattt ttctattttc 360aaagaagtaa atcatgagct tttccaactc aacatctatt ttttttctct caaccttttt 420cacatcttaa gtagtctcac cctttatata tataacttat ttcttacctt ttacattatg 480taacttttat caccaaaacc aacaacttta aaattttatt aaatagactc cacaagtaac 540ttgacactct tacattcatc gacattaact tttatctgtt ttataaatat tattgtgata 600taatttaatc aaaataacca caaactttca taaaaggttc ttattaagca tggcatttaa 660taagcaaaaa caactcaatc actttcatat aggaggtagc ctaagtacgt actcaaaatg 720ccaacaaata aaaaaaaagt tgctttaata atgccaaaac aaattaataa aacacttaca 780acaccggatt ttttttaatt aaaatgtgcc atttaggata aatagttaat atttttaata 840attatttaaa aagccgtatc tactaaaatg atttttattt ggttgaaaat attaatatgt 900ttaaatcaac acaatctatc aaaattaaac taaaaaaaaa ataagtgtac gtggttaaca 960ttagtacagt aatataagag gaaaatgaga aattaagaaa ttgaaagcga gtctaatttt 1020taaattatga acctgcatat ataaaaggaa agaaagaatc caggaagaaa agaaatgaaa 1080ccatgcatgg tcccctcgtc atcacgagtt tctgccattt gcaatagaaa cactgaaaca 1140cctttctctt tgtcacttaa ttgagatgcc gaagccacct cacaccatga acttcatgag 1200gtgtagcacc caaggcttcc atagccatgc atactgaaga atgtctcaag ctcagcaccc 1260tacttctgtg acgtgtccct cattcacctt cctctcttcc ctataaataa ccacgcctca 1320ggttctccgc ttcacaactc aaacattctc tccattggtc cttaaacact catcagtcat 1380cacc 13842013DNAGlycine max 20tgaatgcatg atc 13211197DNAGlycine max 21aataaataaa atgggagcaa taaataaaat gggagctcat atatttacac catttacact 60gtctattatt caccatgcca attattactt cataatttta aaattatgtc atttttaaaa 120attgcttaat gatggaaagg attattataa gttaaaagta taacatagat aaactaacca 180caaaacaaat caatataaac taacttactc tcccatctaa tttttattta aatttcttta 240cacttctctt ccatttctat ttctacaaca ttatttaaca tttttattgt atttttctta 300ctttctaact ctattcattt caaaaatcaa tatatgttta tcaccacctc tctaaaaaaa 360actttacaat cattggtcca gaaaagttaa atcacgagat ggtcatttta gcattaaaac 420aacgattctt gtatcactat ttttcagcat 
gtagtccatt ctcttcaaac aaagacagcg 480gctatataat cgttgtgtta tattcagtct aaaacaattg ttatggtaaa agtcgtcatt 540ttacgccttt ttaaaagata taaaatgaca gttatggtta aaagtcatca tgttagatcc 600tccttaaaga tataaaatga cagttttgga taaaaagtgg tcattttata cgctcttgaa 660agatataaaa cgacggttat ggtaaaagct gccattttaa atgaaatatt tttgttttag 720ttcattttgt ttaatgctaa tcccatttaa attgacttgt acaattaaaa ctcacccacc 780cagatacaat ataaactaac ttactctcac agctaagttt tatttaaatt tctttacact 840tcttttccat ttctatttct atgacattaa ctaacatttt tctcgtaatt ttttttctta 900ttttctaact ctatccattt caaatcgata tatgtttatc accaccactt taaaaagaaa 960atttacaatt tctcgtgcaa aaaagctaaa tcatgaccgt cattttagca ttaaaacaac
1020gattcttgta tcgttgtttt tcagcatgta gtccattctt ttcaagcaaa gacaacagct 1080atataatcat cgtgttatat tcagtctaaa acaacagtaa tgataaaagt catcatttta 1140ggcctttctg aaatatatag aacgacattc atggtaaaaa atcgtcattt tagatcc 119722253DNAGlycine max 22gatcgttcaa acatttggca ataaagtttc ttaagattga atcctgttgc cggtcttgcg 60atgattatca tataatttct gttgaattac gttaagcatg taataattaa catgtaatgc 120atgacgttat ttatgagatg ggtttttatg attagagtcc cgcaattata catttaatac 180gcgatagaaa acaaaatata gcgcgcaaac taggataaat tatcgcgcgc ggtgtcatct 240atgttactag atc 253234PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 23Lys Asp Glu Leu1244PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 24His Asp Glu Leu1254PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 25His Asp Glu Phe1264PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 26Arg Asp Glu Phe1274PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 27Arg Asp Glu Leu1284PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 28Trp Asp Glu Leu1294PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 29Tyr Asp Glu Leu1304PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 30His Glu Glu Phe1314PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 31His Glu Glu Leu1324PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 32Lys Glu Glu Leu1334PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 33Arg Glu Glu Leu1344PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 34Lys Ala Glu Leu1354PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 35Lys Cys Glu Leu1364PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 36Lys Phe Glu Leu1374PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 37Lys Gly Glu 
Leu1384PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 38Lys His Glu Leu1394PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 39Lys Leu Glu Leu1404PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 40Lys Asn Glu Leu1414PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 41Lys Gln Glu Leu1424PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 42Lys Arg Glu Leu1434PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 43Lys Ser Glu Leu1444PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 44Lys Val Glu Leu1454PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 45Lys Trp Glu Leu1464PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 46Lys Tyr Glu Leu1474PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 47Lys Glu Asp Leu1484PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 48Lys Ile Glu Leu1494PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 49Asp Lys Glu Leu1504PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 50Phe Asp Glu Leu1514PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 51Lys Asp Glu Phe1524PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 52Lys Lys Glu Leu1534PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 53His Ala Asp Leu1544PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 54His Ala Glu Leu1554PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 55His Ile Glu Leu1564PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 56His Asn Glu Leu1574PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 57His Thr Glu Leu1584PRTUnknownCarboxy-terminal endoplasmic reticulum 
retention/retrieval signal 58Lys Thr Glu Leu1594PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 59His Val Glu Leu1604PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 60Asn Asp Glu Leu1614PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 61Gln Asp Glu Leu1624PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 62Arg Glu Asp Leu1634PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 63Arg Asn Glu Leu1644PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 64Arg Thr Asp Leu1654PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 65Arg Thr Glu Leu1664PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 66Ser Asp Glu Leu1674PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 67Thr Asp Glu Leu1684PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 68Ser Lys Glu Leu1694PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 69Ser Thr Glu Leu1704PRTUnknownCarboxy-terminal endoplasmic reticulum retention/retrieval signal 70Glu Asp Glu Leu171367PRTArtificial SequenceFusion protein sig10OKC1-TOLG1KDEL 71Met Ala Thr Ser Lys Leu Lys Thr Gln Asn Val Val Val Ser Leu Ser1 5 10 15Leu Thr Leu Thr Leu Val Leu Val Leu Leu Thr Ser Lys Ala Asn Ser 20 25 30Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys Asp Glu Arg 35 40 45Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln Tyr Val Leu 50 55 60Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Lys Pro Val65 70 75 80Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro 85 90 95Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln Val Leu Ser 100 105 110Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr Thr Met Ala 115 120 125Arg His Pro His Pro His Leu Ser Phe Met Ala Ile Pro Pro Lys Lys 130 135 140Asn Gln Asp Lys Thr Glu Ile Pro Thr Ile Asn Thr Ile Ala Ser Gly145 
150 155 160Glu Pro Thr Ser Thr Pro Thr Thr Glu Ala Val Glu Ser Thr Val Ala 165 170 175Thr Leu Glu Asp Ser Pro Glu Val Ile Glu Ser Pro Pro Glu Ile Asn 180 185 190Thr Val Gln Val Thr Ser Thr Ala Val Leu Ile Val Thr Gln Thr Met 195 200 205Lys Gly Leu Asp Ile Gln Lys Val Ala Gly Thr Trp Tyr Ser Leu Ala 210 215 220Met Ala Ala Ser Asp Ile Ser Leu Leu Asp Ala Gln Ser Ala Pro Leu225 230 235 240Arg Val Tyr Val Glu Glu Leu Lys Pro Thr Pro Glu Gly Asp Leu Glu 245 250 255Ile Leu Leu Gln Lys Trp Glu Asn Gly Glu Cys Ala Gln Lys Lys Ile 260 265 270Ile Ala Glu Lys Thr Lys Ile Pro Ala Val Phe Lys Ile Asp Ala Leu 275 280 285Asn Glu Asn Lys Val Leu Val Leu Asp Thr Asp Tyr Lys Lys Tyr Leu 290 295 300Leu Phe Cys Met Glu Asn Ser Ala Glu Pro Glu Gln Ser Leu Ala Cys305 310 315 320Gln Cys Leu Val Arg Thr Pro Glu Val Asp Asp Glu Ala Leu Glu Lys 325 330 335Phe Asp Lys Ala Leu Lys Ala Leu Pro Met His Ile Arg Leu Ser Phe 340 345 350Asn Pro Thr Gln Leu Glu Glu Gln Cys His Ile Lys Asp Glu Leu 355 360 365721104DNAArtificial SequenceNucleic acid sequence encoding fusion protein sig10OKC1-TOLG1KDEL 72atggctactt caaagttgaa aacccagaat gtggttgtat ctctctccct aaccttaacc 60ttggtactgg tgctactgac cagcaaggca aactcacaag agcagaatca agagcagcca 120atccgttgtg agaaggacga gaggttcttc tcagacaaga tcgccaaata tatacccata 180caatatgtac tctcacgcta ccctagctac gggcttaact actatcagca aaaacctgta 240gcactgataa ataaccagtt tctcccctat ccctattatg ctaaacctgc cgccgtgagg 300agtccagcac aaatacttca gtggcaagtg ctcagtaaca ccgtgccagc aaaaagctgc 360caggctcagc ccaccacaat ggcccgtcat ccccatcctc accttagctt catggcaatc 420ccaccaaaga agaatcaaga caagaccgaa atacctacca tcaacacaat tgcatctgga 480gagcctacca gtacaccaac aactgaggca gtagagtcta ctgttgctac ccttgaggac 540agccccgagg ttatagagtc cccacctgag ataaataccg tgcaggtgac aagtaccgcc 600gtattgatcg taacacagac tatgaagggt cttgatatac agaaggtggc cgggacttgg 660tacagtttgg caatggccgc atccgacatc tccttgttgg acgcacaatc agccccattg 720cgtgtgtacg tagaagagct taaaccaact cccgaggggg atctggaaat tctgctccag 780aaatgggaga 
acggtgagtg cgcccagaag aagatcatcg cagagaagac caaaattcca 840gcagtattca aaatcgacgc attgaacgaa aataaggtgc tcgtactgga cactgattat 900aagaagtatc tccttttctg tatggagaac tcagcagagc ctgaacagag tcttgcctgc 960caatgccttg ttcgtacccc agaggtagat gatgaagctc tggaaaagtt cgataaggcc 1020cttaaggctc tgcctatgca cattaggctt tctttcaatc caactcaact tgaggaacaa 1080tgtcacatta aggatgagct ttaa 110473405PRTArtificial SequenceFusion protein sig10OBC-T2FMOLG1 73Met Ala Thr Ser Lys Leu Lys Thr Gln Asn Val Val Val Ser Leu Ser1 5 10 15Leu Thr Leu Thr Leu Val Leu Val Leu Leu Thr Ser Lys Ala Asn Ser 20 25 30Arg Glu Leu Glu Glu Leu Asn Val Pro Gly Glu Ile Val Glu Ser Leu 35 40 45Ser Ser Ser Glu Glu Ser Ile Thr Arg Ile Asn Lys Lys Ile Glu Lys 50 55 60Phe Gln Ser Glu Glu Gln Gln Gln Thr Glu Asp Glu Leu Gln Asp Lys65 70 75 80Ile His Pro Phe Ala Gln Thr Gln Ser Leu Val Tyr Pro Phe Pro Gly 85 90 95Pro Ile Pro Asn Ser Leu Pro Gln Asn Ile Pro Pro Leu Thr Gln Thr 100 105 110Pro Val Val Val Pro Pro Phe Leu Gln Pro Glu Val Met Gly Val Ser 115 120 125Lys Val Lys Glu Ala Met Ala Pro Lys His Lys Glu Met Pro Phe Pro 130 135 140Lys Tyr Pro Val Glu Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu Thr145 150 155 160Asp Val Glu Asn Leu His Leu Pro Leu Pro Leu Leu Gln Ser Trp Met 165 170 175His Gln Pro His Gln Pro Leu Pro Pro Thr Val Met Phe Pro Pro Gln 180 185 190Ser Val Leu Ser Leu Ser Gln Ser Lys Val Leu Pro Val Pro Gln Lys 195 200 205Ala Val Pro Tyr Pro Gln Arg Asp Met Pro Ile Gln Ala Phe Leu Leu 210 215 220Tyr Gln Glu Pro Val Leu Gly Pro Val Arg Gly Pro Phe Pro Ile Ile225 230 235 240Val Phe Met Leu Ile Val Thr Gln Thr Met Lys Gly Leu Asp Ile Gln 245 250 255Lys Val Ala Gly Thr Trp Tyr Ser Leu Ala Met Ala Ala Ser Asp Ile 260 265 270Ser Leu Leu Asp Ala Gln Ser Ala Pro Leu Arg Val Tyr Val Glu Glu 275 280 285Leu Lys Pro Thr Pro Glu Gly Asp Leu Glu Ile Leu Leu Gln Lys Trp 290 295 300Glu Asn Gly Glu Cys Ala Gln Lys Lys Ile Ile Ala Glu Lys Thr Lys305 310 315 320Ile Pro Ala Val Phe Lys Ile Asp Ala Leu Asn Glu Asn Lys Val Leu 325 330 
335Val Leu Asp Thr Asp Tyr Lys Lys Tyr Leu Leu Phe Cys Met Glu Asn 340 345 350Ser Ala Glu Pro Glu Gln Ser Leu Ala Cys Gln Cys Leu Val Arg Thr 355 360 365Pro Glu Val Asp Asp Glu Ala Leu Glu Lys Phe Asp Lys Ala Leu Lys 370 375 380Ala Leu Pro Met His Ile Arg Leu Ser Phe Asn Pro Thr Gln Leu Glu385 390 395 400Glu Gln Cys His Ile 405741218DNAArtificial SequenceNucleic acid encoding fusion protein sig10OBC-T2FMOLG1 74atggctactt caaagttgaa aacccagaat gtggttgtat ctctctccct aaccttaacc 60ttggtactgg tgctactgac cagcaaggca aactcacgcg aactggaaga gttgaacgta 120ccaggagaga ttgtagaatc actgagctcc tcagaggagt ctattactcg tatcaacaag 180aagatagaga agttccaatc cgaggagcaa caacaaacag aggacgaatt gcaggacaag 240atacatcctt tcgcacagac ccagagcctc gtctatccct ttccaggtcc aatccctaac 300tctctccccc agaatatccc acccttgact cagactcccg tggtcgtacc ccctttcttg 360caacccgagg tgatgggggt ttctaaagtc aaagaggcta tggctcctaa acataaggaa 420atgccttttc ccaaatatcc agtggagcca ttcactgaga gccagtctct gacacttaca 480gatgtggaaa acttgcacct gcccttgcca cttttgcagt cctggatgca ccaaccacat 540caacccttgc cccccacagt gatgtttcct ccacaatcag ttcttagtct ctcccaaagc 600aaagtccttc cagtgcctca gaaggccgtc ccataccccc agagagatat gccaatacag 660gcattcttgc tttaccagga accagtgctc ggtcctgtac gtggcccatt ccctatcata 720gtgttcatgt tgatcgtaac acagactatg aagggtcttg atatacagaa ggtggccggg 780acttggtaca gtttggcaat ggccgcatcc gacatctcct tgttggacgc acaatcagcc 840ccattgcgtg tgtacgtaga agagcttaaa ccaactcccg agggggatct ggaaattctg 900ctccagaaat gggagaacgg tgagtgcgcc cagaagaaga tcatcgcaga gaagaccaaa 960attccagcag tattcaaaat cgacgcattg aacgaaaata aggtgctcgt actggacact 1020gattataaga agtatctcct tttctgtatg gagaactcag cagagcctga acagagtctt 1080gcctgccaat gccttgttcg taccccagag gtagatgatg aagctctgga aaagttcgat 1140aaggccctta aggctctgcc tatgcacatt aggctttctt tcaatccaac tcaacttgag 1200gaacaatgtc acatttaa 121875395PRTArtificial SequenceFusion protein sig10OaS1-TFMOLG1 75Met Ala Thr Ser Lys Leu Lys Thr Gln Asn Val Val Val Ser Leu Ser1 5 10 15Leu Thr Leu Thr Leu Val Leu Val Leu Leu Thr 
Ser Lys Ala Asn Ser 20 25 30Arg Pro Lys His Pro Ile Lys His Gln Gly Leu Pro Gln Glu Val Leu 35 40 45Asn Glu Asn Leu Leu Arg Phe Phe Val Ala Pro Phe Pro Glu Val Phe 50 55 60Gly Lys Glu Lys Val Asn Glu Leu Ser Lys Asp Ile Gly Ser Glu Ser65 70 75 80Thr Glu Asp Gln Ala Met Glu Asp Ile Lys Gln Met Glu Ala Glu Ser 85 90 95Ile Ser Ser Ser Glu Glu Ile Val Pro Asn Ser Val Glu Gln Lys His 100 105 110Ile Gln Lys Glu Asp Val Pro Ser Glu Arg Tyr Leu Gly Tyr Leu Glu 115 120 125Gln Leu Leu Arg Leu Lys Lys Tyr Lys Val Pro Gln Leu Glu Ile Val 130 135 140Pro Asn Ser Ala Glu Glu Arg Leu His Ser Met Lys Glu Gly Ile His145 150 155 160Ala Gln Gln Lys Glu Pro Met Ile Gly Val Asn Gln Glu Leu Ala Tyr 165 170 175Phe Tyr Pro Glu Leu Phe Arg Gln Phe Tyr Gln Leu Asp Ala Tyr Pro 180 185 190Ser Gly Ala Trp Tyr Tyr Val Pro Leu Gly Thr Gln Tyr Thr Asp Ala 195 200 205Pro Ser Phe Ser Asp Ile Pro Asn Pro Ile Gly Ser Glu Asn Ser Glu 210 215 220Lys Thr Thr Met Pro Leu Trp Phe Met Leu Ile Val Thr Gln Thr
Met225 230 235 240Lys Gly Leu Asp Ile Gln Lys Val Ala Gly Thr Trp Tyr Ser Leu Ala 245 250 255Met Ala Ala Ser Asp Ile Ser Leu Leu Asp Ala Gln Ser Ala Pro Leu 260 265 270Arg Val Tyr Val Glu Glu Leu Lys Pro Thr Pro Glu Gly Asp Leu Glu 275 280 285Ile Leu Leu Gln Lys Trp Glu Asn Gly Glu Cys Ala Gln Lys Lys Ile 290 295 300Ile Ala Glu Lys Thr Lys Ile Pro Ala Val Phe Lys Ile Asp Ala Leu305 310 315 320Asn Glu Asn Lys Val Leu Val Leu Asp Thr Asp Tyr Lys Lys Tyr Leu 325 330 335Leu Phe Cys Met Glu Asn Ser Ala Glu Pro Glu Gln Ser Leu Ala Cys 340 345 350Gln Cys Leu Val Arg Thr Pro Glu Val Asp Asp Glu Ala Leu Glu Lys 355 360 365Phe Asp Lys Ala Leu Lys Ala Leu Pro Met His Ile Arg Leu Ser Phe 370 375 380Asn Pro Thr Gln Leu Glu Glu Gln Cys His Ile385 390 395761188DNAArtificial SequenceNucleic acid encoding fusion protein sig10OaS1-TFMOLG1 76atggctactt caaagttgaa aacccagaat gtggttgtat ctctctccct aaccttaacc 60ttggtactgg tgctactgac cagcaaggca aactcacgcc caaaacatcc cataaaacat 120caaggattgc cccaggaagt actcaacgag aatctcctcc gttttttcgt tgctcctttc 180cccgaagtgt tcgggaagga aaaagtaaac gagctttcaa aggacatcgg ctctgaaagt 240accgaggatc aggctatgga agatatcaag caaatggagg ccgaatctat aagttcttca 300gaagaaatag ttcccaactc agtggagcag aagcacattc agaaagaaga cgtgcccagc 360gagcgctatc tgggatattt ggaacagctg ctcagactga aaaagtacaa ggtgcctcag 420ctcgaaatcg tacccaatag tgctgaagaa aggttgcact caatgaaaga ggggattcac 480gcacaacaaa aagagcctat gatcggagta aatcaagaac tggcatactt ttatcccgag 540ttgtttcgcc aattctatca actggatgcc tacccttccg gtgcatggta ctacgtaccc 600ctcggtactc aatataccga tgctccctcc ttttccgaca ttcctaatcc tataggttcc 660gagaatagcg aaaagaccac catgccctta tggttcatgt tgatcgtaac acagactatg 720aagggtcttg atatacagaa ggtggccggg acttggtaca gtttggcaat ggccgcatcc 780gacatctcct tgttggacgc acaatcagcc ccattgcgtg tgtacgtaga agagcttaaa 840ccaactcccg agggggatct ggaaattctg ctccagaaat gggagaacgg tgagtgcgcc 900cagaagaaga tcatcgcaga gaagaccaaa attccagcag tattcaaaat cgacgcattg 960aacgaaaata aggtgctcgt actggacact gattataaga agtatctcct 
tttctgtatg 1020gagaactcag cagagcctga acagagtctt gcctgccaat gccttgttcg taccccagag 1080gtagatgatg aagctctgga aaagttcgat aaggccctta aggctctgcc tatgcacatt 1140aggctttctt tcaatccaac tcaacttgag gaacaatgtc acatttaa 118877304PRTArtificial SequenceFusion protein sig10paraOKC1-TFMOLG1KDEL 77Met Ala Thr Ser Lys Leu Lys Thr Gln Asn Val Val Val Ser Leu Ser1 5 10 15Leu Thr Leu Thr Leu Val Leu Val Leu Leu Thr Ser Lys Ala Asn Ser 20 25 30Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys Asp Glu Arg 35 40 45Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln Tyr Val Leu 50 55 60Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Lys Pro Val65 70 75 80Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro 85 90 95Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln Val Leu Ser 100 105 110Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr Thr Met Ala 115 120 125Arg His Pro His Pro His Leu Ser Phe Met Leu Ile Val Thr Gln Thr 130 135 140Met Lys Gly Leu Asp Ile Gln Lys Val Ala Gly Thr Trp Tyr Ser Leu145 150 155 160Ala Met Ala Ala Ser Asp Ile Ser Leu Leu Asp Ala Gln Ser Ala Pro 165 170 175Leu Arg Val Tyr Val Glu Glu Leu Lys Pro Thr Pro Glu Gly Asp Leu 180 185 190Glu Ile Leu Leu Gln Lys Trp Glu Asn Gly Glu Cys Ala Gln Lys Lys 195 200 205Ile Ile Ala Glu Lys Thr Lys Ile Pro Ala Val Phe Lys Ile Asp Ala 210 215 220Leu Asn Glu Asn Lys Val Leu Val Leu Asp Thr Asp Tyr Lys Lys Tyr225 230 235 240Leu Leu Phe Cys Met Glu Asn Ser Ala Glu Pro Glu Gln Ser Leu Ala 245 250 255Cys Gln Cys Leu Val Arg Thr Pro Glu Val Asp Asp Glu Ala Leu Glu 260 265 270Lys Phe Asp Lys Ala Leu Lys Ala Leu Pro Met His Ile Arg Leu Ser 275 280 285Phe Asn Pro Thr Gln Leu Glu Glu Gln Cys His Ile Lys Asp Glu Leu 290 295 30078915DNAArtificial SequenceNucleic acid encoding fusion protein sig10paraOKC1-TFMOLG1KDEL 78atggctactt caaagttgaa aacccagaat gtggttgtat ctctctccct aaccttaacc 60ttggtactgg tgctactgac cagcaaggca aactcacaag agcagaatca agagcagcca 120atccgttgtg agaaggacga gaggttcttc tcagacaaga tcgccaaata tatacccata 
180caatatgtac tctcacgcta ccctagctac gggcttaact actatcagca aaaacctgta 240gcactgataa ataaccagtt tctcccctat ccctattatg ctaaacctgc cgccgtgagg 300agtccagcac aaatacttca gtggcaagtg ctcagtaaca ccgtgccagc aaaaagctgc 360caggctcagc ccaccacaat ggcccgtcat ccccatcctc accttagctt catgttgatc 420gtaacacaga ctatgaaggg tcttgatata cagaaggtgg ccgggacttg gtacagtttg 480gcaatggccg catccgacat ctccttgttg gacgcacaat cagccccatt gcgtgtgtac 540gtagaagagc ttaaaccaac tcccgagggg gatctggaaa ttctgctcca gaaatgggag 600aacggtgagt gcgcccagaa gaagatcatc gcagagaaga ccaaaattcc agcagtattc 660aaaatcgacg cattgaacga aaataaggtg ctcgtactgg acactgatta taagaagtat 720ctccttttct gtatggagaa ctcagcagag cctgaacaga gtcttgcctg ccaatgcctt 780gttcgtaccc cagaggtaga tgatgaagct ctggaaaagt tcgataaggc ccttaaggct 840ctgcctatgc acattaggct ttctttcaat ccaactcaac ttgaggaaca atgtcacatt 900aaggatgagc tttaa 91579300PRTArtificial SequenceFusion protein sig10paraOKC1-TFMOLG1 79Met Ala Thr Ser Lys Leu Lys Thr Gln Asn Val Val Val Ser Leu Ser1 5 10 15Leu Thr Leu Thr Leu Val Leu Val Leu Leu Thr Ser Lys Ala Asn Ser 20 25 30Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys Asp Glu Arg 35 40 45Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln Tyr Val Leu 50 55 60Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Lys Pro Val65 70 75 80Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro 85 90 95Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln Val Leu Ser 100 105 110Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr Thr Met Ala 115 120 125Arg His Pro His Pro His Leu Ser Phe Met Leu Ile Val Thr Gln Thr 130 135 140Met Lys Gly Leu Asp Ile Gln Lys Val Ala Gly Thr Trp Tyr Ser Leu145 150 155 160Ala Met Ala Ala Ser Asp Ile Ser Leu Leu Asp Ala Gln Ser Ala Pro 165 170 175Leu Arg Val Tyr Val Glu Glu Leu Lys Pro Thr Pro Glu Gly Asp Leu 180 185 190Glu Ile Leu Leu Gln Lys Trp Glu Asn Gly Glu Cys Ala Gln Lys Lys 195 200 205Ile Ile Ala Glu Lys Thr Lys Ile Pro Ala Val Phe Lys Ile Asp Ala 210 215 220Leu Asn Glu Asn Lys Val Leu Val Leu Asp Thr 
Asp Tyr Lys Lys Tyr225 230 235 240Leu Leu Phe Cys Met Glu Asn Ser Ala Glu Pro Glu Gln Ser Leu Ala 245 250 255Cys Gln Cys Leu Val Arg Thr Pro Glu Val Asp Asp Glu Ala Leu Glu 260 265 270Lys Phe Asp Lys Ala Leu Lys Ala Leu Pro Met His Ile Arg Leu Ser 275 280 285Phe Asn Pro Thr Gln Leu Glu Glu Gln Cys His Ile 290 295 30080903DNAArtificial SequenceNucleic acid encoding fusion protein sig10paraOKC1-TFMOLG1 80atggctactt caaagttgaa aacccagaat gtggttgtat ctctctccct aaccttaacc 60ttggtactgg tgctactgac cagcaaggca aactcacaag agcagaatca agagcagcca 120atccgttgtg agaaggacga gaggttcttc tcagacaaga tcgccaaata tatacccata 180caatatgtac tctcacgcta ccctagctac gggcttaact actatcagca aaaacctgta 240gcactgataa ataaccagtt tctcccctat ccctattatg ctaaacctgc cgccgtgagg 300agtccagcac aaatacttca gtggcaagtg ctcagtaaca ccgtgccagc aaaaagctgc 360caggctcagc ccaccacaat ggcccgtcat ccccatcctc accttagctt catgttgatc 420gtaacacaga ctatgaaggg tcttgatata cagaaggtgg ccgggacttg gtacagtttg 480gcaatggccg catccgacat ctccttgttg gacgcacaat cagccccatt gcgtgtgtac 540gtagaagagc ttaaaccaac tcccgagggg gatctggaaa ttctgctcca gaaatgggag 600aacggtgagt gcgcccagaa gaagatcatc gcagagaaga ccaaaattcc agcagtattc 660aaaatcgacg cattgaacga aaataaggtg ctcgtactgg acactgatta taagaagtat 720ctccttttct gtatggagaa ctcagcagag cctgaacaga gtcttgcctg ccaatgcctt 780gttcgtaccc cagaggtaga tgatgaagct ctggaaaagt tcgataaggc ccttaaggct 840ctgcctatgc acattaggct ttctttcaat ccaactcaac ttgaggaaca atgtcacatt 900taa 90381354PRTArtificial SequenceFusion protein sig2OKC1-TOLG1KDEL 81Met Ala Lys Leu Val Phe Ser Leu Cys Phe Leu Leu Phe Ser Gly Cys1 5 10 15Cys Phe Ala Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys 20 25 30Asp Glu Arg Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln 35 40 45Tyr Val Leu Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln 50 55 60Lys Pro Val Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr65 70 75 80Ala Lys Pro Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln 85 90 95Val Leu Ser Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln 
Pro Thr 100 105 110Thr Met Ala Arg His Pro His Pro His Leu Ser Phe Met Ala Ile Pro 115 120 125Pro Lys Lys Asn Gln Asp Lys Thr Glu Ile Pro Thr Ile Asn Thr Ile 130 135 140Ala Ser Gly Glu Pro Thr Ser Thr Pro Thr Thr Glu Ala Val Glu Ser145 150 155 160Thr Val Ala Thr Leu Glu Asp Ser Pro Glu Val Ile Glu Ser Pro Pro 165 170 175Glu Ile Asn Thr Val Gln Val Thr Ser Thr Ala Val Leu Ile Val Thr 180 185 190Gln Thr Met Lys Gly Leu Asp Ile Gln Lys Val Ala Gly Thr Trp Tyr 195 200 205Ser Leu Ala Met Ala Ala Ser Asp Ile Ser Leu Leu Asp Ala Gln Ser 210 215 220Ala Pro Leu Arg Val Tyr Val Glu Glu Leu Lys Pro Thr Pro Glu Gly225 230 235 240Asp Leu Glu Ile Leu Leu Gln Lys Trp Glu Asn Gly Glu Cys Ala Gln 245 250 255Lys Lys Ile Ile Ala Glu Lys Thr Lys Ile Pro Ala Val Phe Lys Ile 260 265 270Asp Ala Leu Asn Glu Asn Lys Val Leu Val Leu Asp Thr Asp Tyr Lys 275 280 285Lys Tyr Leu Leu Phe Cys Met Glu Asn Ser Ala Glu Pro Glu Gln Ser 290 295 300Leu Ala Cys Gln Cys Leu Val Arg Thr Pro Glu Val Asp Asp Glu Ala305 310 315 320Leu Glu Lys Phe Asp Lys Ala Leu Lys Ala Leu Pro Met His Ile Arg 325 330 335Leu Ser Phe Asn Pro Thr Gln Leu Glu Glu Gln Cys His Ile Lys Asp 340 345 350Glu Leu821065DNAArtificial SequenceNucleic acid encoding fusion protein sig2OKC1-TOLG1KDEL 82atggccaagc tagttttttc cctttgtttt ctgcttttca gtggctgctg cttcgctcaa 60gagcagaatc aagagcagcc aatccgttgt gagaaggacg agaggttctt ctcagacaag 120atcgccaaat atatacccat acaatatgta ctctcacgct accctagcta cgggcttaac 180tactatcagc aaaaacctgt agcactgata aataaccagt ttctccccta tccctattat 240gctaaacctg ccgccgtgag gagtccagca caaatacttc agtggcaagt gctcagtaac 300accgtgccag caaaaagctg ccaggctcag cccaccacaa tggcccgtca tccccatcct 360caccttagct tcatggcaat cccaccaaag aagaatcaag acaagaccga aatacctacc 420atcaacacaa ttgcatctgg agagcctacc agtacaccaa caactgaggc agtagagtct 480actgttgcta cccttgagga cagccccgag gttatagagt ccccacctga gataaatacc 540gtgcaggtga caagtaccgc cgtattgatc gtaacacaga ctatgaaggg tcttgatata 600cagaaggtgg ccgggacttg gtacagtttg gcaatggccg catccgacat 
ctccttgttg 660gacgcacaat cagccccatt gcgtgtgtac gtagaagagc ttaaaccaac tcccgagggg 720gatctggaaa ttctgctcca gaaatgggag aacggtgagt gcgcccagaa gaagatcatc 780gcagagaaga ccaaaattcc agcagtattc aaaatcgacg cattgaacga aaataaggtg 840ctcgtactgg acactgatta taagaagtat ctccttttct gtatggagaa ctcagcagag 900cctgaacaga gtcttgcctg ccaatgcctt gttcgtaccc cagaggtaga tgatgaagct 960ctggaaaagt tcgataaggc ccttaaggct ctgcctatgc acattaggct ttctttcaat 1020ccaactcaac ttgaggaaca atgtcacatt aaggatgagc tttaa 106583621PRTArtificial SequenceOptimized alpha S2-casein truncated version 1 (OaS2-T) 83Ala Ala Gly Ala Ala Thr Ala Cys Thr Ala Thr Gly Gly Ala Ala Cys1 5 10 15Ala Cys Gly Thr Ala Ala Gly Cys Thr Cys Ala Ala Gly Thr Gly Ala 20 25 30Ala Gly Ala Ala Thr Cys Thr Ala Thr Ala Ala Thr Ala Ala Gly Thr 35 40 45Cys Ala Ala Gly Ala Gly Ala Cys Ala Thr Ala Thr Ala Ala Gly Cys 50 55 60Ala Ala Gly Ala Gly Ala Ala Ala Ala Ala Cys Ala Thr Gly Gly Cys65 70 75 80Ala Ala Thr Ala Ala Ala Thr Cys Cys Cys Thr Cys Cys Ala Ala Gly 85 90 95Gly Ala Gly Ala Ala Thr Cys Thr Thr Thr Gly Thr Ala Gly Cys Ala 100 105 110Cys Thr Thr Thr Thr Thr Gly Cys Ala Ala Ala Gly Ala Ala Gly Thr 115 120 125Thr Gly Thr Gly Ala Gly Ala Ala Ala Thr Gly Cys Ala Ala Ala Thr 130 135 140Gly Ala Gly Gly Ala Ala Gly Ala Ala Thr Ala Cys Thr Cys Ala Ala145 150 155 160Thr Ala Gly Gly Cys Ala Gly Cys Thr Cys Thr Thr Cys Cys Gly Ala 165 170 175Ala Gly Ala Ala Thr Cys Thr Gly Cys Thr Gly Ala Ala Gly Thr Cys 180 185 190Gly Cys Thr Ala Cys Thr Gly Ala Ala Gly Ala Gly Gly Thr Cys Ala 195 200 205Ala Ala Ala Thr Ala Ala Cys Ala Gly Thr Thr Gly Ala Cys Gly Ala 210 215 220Cys Ala Ala Gly Cys Ala Thr Thr Ala Thr Cys Ala Ala Ala Ala Ala225 230 235 240Gly Cys Cys Cys Thr Gly Ala Ala Thr Gly Ala Ala Ala Thr Ala Ala 245 250 255Ala Cys Cys Ala Gly Thr Thr Cys Thr Ala Cys Cys Ala Ala Ala Ala 260 265 270Ala Thr Thr Thr Cys Cys Cys Cys Ala Ala Thr Ala Cys Cys Thr Cys 275 280 285Cys Ala Gly Thr Ala Cys Cys Thr Thr Thr Ala Thr Cys Ala Ala Gly 290 295 300Gly Ala Cys Cys 
Cys Ala Thr Ala Gly Thr Cys Cys Thr Cys Ala Ala305 310 315 320Cys Cys Cys Thr Thr Gly Gly Gly Ala Thr Cys Ala Gly Gly Thr Cys 325 330 335Ala Ala Gly Cys Gly Thr Ala Ala Thr Gly Cys Thr Gly Thr Thr Cys 340 345 350Cys Ala Ala Thr Ala Ala Cys Ala Cys Cys Ala Ala Cys Ala Cys Thr 355 360 365Cys Ala Ala Thr Cys Gly Thr Gly Ala Ala Cys Ala Ala Cys Thr Gly 370 375 380Thr Cys Thr Ala Cys Cys Thr Cys Ala Gly Ala Ala Gly Ala Ala Ala385 390 395 400Ala Thr Thr Cys Cys Ala Ala Ala Ala Ala Ala Ala Cys Thr Gly Thr 405 410 415Gly Gly Ala Thr Ala Thr Gly Gly Ala Ala Ala Gly Thr Ala Cys Ala 420 425 430Gly Ala Ala Gly Thr Thr Thr Thr Thr Ala Cys Thr Ala Ala Ala Ala 435 440 445Ala Gly Ala Cys Cys Ala Ala Gly Cys Thr Cys Ala Cys Cys Gly Ala 450 455 460Gly Gly Ala Gly Gly Ala Ala Ala Ala Ala Ala Ala Thr Ala Gly Ala465 470 475 480Thr Thr Gly Ala Ala Thr Thr Thr Thr Cys Thr Thr Ala Ala Gly Ala 485 490 495Ala Gly Ala Thr Cys Ala Gly Thr Cys Ala Ala Cys Gly Cys Thr Ala 500 505 510Thr Cys Ala Gly Ala Ala Gly Thr Thr Cys Gly Cys Cys Cys Thr Thr 515 520 525Cys Cys Ala Cys Ala Ala Thr Ala Cys Cys Thr Cys Ala Ala Gly Ala 530 535 540Cys Thr Gly Thr Ala Thr Ala Cys Cys Ala Ala Cys Ala Thr Cys Ala545 550 555 560Gly Ala Ala Gly Gly Cys Cys Ala Thr Gly Ala Ala Gly Cys Cys Thr 565
570 575Thr Gly Gly Ala Thr Thr Cys Ala Gly Cys Cys Cys Ala Ala Ala Ala 580 585 590Cys Ala Ala Ala Gly Gly Thr Ala Ala Thr Cys Cys Cys Cys Thr Ala 595 600 605Thr Gly Thr Thr Ala Gly Ala Thr Ala Cys Thr Thr Gly 610 615 62084207PRTArtificial SequenceOptimized alpha S2-casein truncated version 1 (OaS2-T) 84Lys Asn Thr Met Glu His Val Ser Ser Ser Glu Glu Ser Ile Ile Ser1 5 10 15Gln Glu Thr Tyr Lys Gln Glu Lys Asn Met Ala Ile Asn Pro Ser Lys 20 25 30Glu Asn Leu Cys Ser Thr Phe Cys Lys Glu Val Val Arg Asn Ala Asn 35 40 45Glu Glu Glu Tyr Ser Ile Gly Ser Ser Ser Glu Glu Ser Ala Glu Val 50 55 60Ala Thr Glu Glu Val Lys Ile Thr Val Asp Asp Lys His Tyr Gln Lys65 70 75 80Ala Leu Asn Glu Ile Asn Gln Phe Tyr Gln Lys Phe Pro Gln Tyr Leu 85 90 95Gln Tyr Leu Tyr Gln Gly Pro Ile Val Leu Asn Pro Trp Asp Gln Val 100 105 110Lys Arg Asn Ala Val Pro Ile Thr Pro Thr Leu Asn Arg Glu Gln Leu 115 120 125Ser Thr Ser Glu Glu Asn Ser Lys Lys Thr Val Asp Met Glu Ser Thr 130 135 140Glu Val Phe Thr Lys Lys Thr Lys Leu Thr Glu Glu Glu Lys Asn Arg145 150 155 160Leu Asn Phe Leu Lys Lys Ile Ser Gln Arg Tyr Gln Lys Phe Ala Leu 165 170 175Pro Gln Tyr Leu Lys Thr Val Tyr Gln His Gln Lys Ala Met Lys Pro 180 185 190Trp Ile Gln Pro Lys Thr Lys Val Ile Pro Tyr Val Arg Tyr Leu 195 200 20585171PRTCapra hircus 85Gln Glu Gln Asn Gln Glu Gln Pro Ile Cys Cys Glu Lys Asp Glu Arg1 5 10 15Phe Phe Asp Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln Tyr Val Leu 20 25 30Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Arg Pro Val 35 40 45Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro 50 55 60Val Ala Val Arg Ser Pro Ala Gln Thr Leu Gln Trp Gln Val Leu Pro65 70 75 80Asn Thr Val Pro Ala Lys Ser Cys Gln Asp Gln Pro Thr Thr Leu Ala 85 90 95Arg His Pro His Pro His Leu Ser Phe Met Ala Ile Pro Pro Lys Lys 100 105 110Asp Gln Asp Lys Thr Glu Val Pro Ala Ile Asn Thr Ile Ala Ser Ala 115 120 125Glu Pro Thr Val His Ser Thr Pro Thr Thr Glu Ala Ile Val Asn Thr 130 135 140Val Asp Asn Pro Glu Ala Ser Ser Glu Ser Ile Ala Ser Ala 
Ser Glu145 150 155 160Thr Asn Thr Ala Gln Val Thr Ser Thr Glu Val 165 17086171PRTOvis aries 86Gln Glu Gln Asn Gln Glu Gln Arg Ile Cys Cys Glu Lys Asp Glu Arg1 5 10 15Phe Phe Asp Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln Tyr Val Leu 20 25 30Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Arg Pro Val 35 40 45Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro 50 55 60Val Ala Val Arg Ser Pro Ala Gln Thr Leu Gln Trp Gln Val Leu Pro65 70 75 80Asn Ala Val Pro Ala Lys Ser Cys Gln Asp Gln Pro Thr Ala Met Ala 85 90 95Arg His Pro His Pro His Leu Ser Phe Met Ala Ile Pro Pro Lys Lys 100 105 110Asp Gln Asp Lys Thr Glu Ile Pro Ala Ile Asn Thr Ile Ala Ser Ala 115 120 125Glu Pro Thr Val His Ser Thr Pro Thr Thr Glu Ala Val Val Asn Ala 130 135 140Val Asp Asn Pro Glu Ala Ser Ser Glu Ser Ile Ala Ser Ala Pro Glu145 150 155 160Thr Asn Thr Ala Gln Val Thr Ser Thr Glu Val 165 17087165PRTBubalus bubalis 87Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys Glu Glu Arg1 5 10 15Phe Phe Asn Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln Tyr Val Leu 20 25 30Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Lys Pro Val 35 40 45Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro 50 55 60Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln Val Leu Pro65 70 75 80Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr Thr Met Thr 85 90 95Arg His Pro His Pro His Leu Ser Phe Met Ala Ile Pro Pro Lys Lys 100 105 110Asn Gln Asp Lys Thr Glu Ile Pro Thr Ile Asn Thr Ile Val Ser Val 115 120 125Glu Pro Thr Ser Thr Pro Thr Thr Glu Ala Ile Glu Asn Thr Val Ala 130 135 140Thr Leu Glu Ala Ser Ser Glu Val Ile Glu Ser Val Pro Glu Thr Asn145 150 155 160Thr Ala Gln Val Thr 16588162PRTCamelus dromedarius 88Glu Val Gln Asn Gln Glu Gln Pro Thr Cys Phe Glu Lys Val Glu Arg1 5 10 15Leu Leu Asn Glu Lys Thr Val Lys Tyr Phe Pro Ile Gln Phe Val Gln 20 25 30Ser Arg Tyr Pro Ser Tyr Gly Ile Asn Tyr Tyr Gln His Arg Leu Ala 35 40 45Val Pro Ile Asn Asn Gln Phe Ile Pro Tyr Pro Asn Tyr Ala Lys Pro 50 55 60Val Ala Ile 
Arg Leu His Ala Gln Ile Pro Gln Cys Gln Ala Leu Pro65 70 75 80Asn Ile Asp Pro Pro Thr Val Glu Arg Arg Pro Arg Pro Arg Pro Ser 85 90 95Phe Ile Ala Ile Pro Pro Lys Lys Thr Gln Asp Lys Thr Val Asn Pro 100 105 110Ala Ile Asn Thr Val Ala Thr Val Glu Pro Pro Val Ile Pro Thr Ala 115 120 125Glu Pro Ala Val Asn Thr Val Val Ile Ala Glu Ala Ser Ser Glu Phe 130 135 140Ile Thr Thr Ser Thr Pro Glu Thr Thr Thr Val Gln Ile Thr Ser Thr145 150 155 160Glu Ile89162PRTCamelus bactrianus 89Glu Val Gln Asn Gln Glu Gln Pro Thr Cys Cys Glu Lys Val Glu Arg1 5 10 15Leu Leu Asn Glu Lys Thr Val Lys Tyr Phe Pro Ile Gln Phe Val Gln 20 25 30Ser Arg Tyr Pro Ser Tyr Gly Ile Asn Tyr Tyr Gln His Arg Leu Ala 35 40 45Val Pro Ile Asn Asn Gln Phe Ile Pro Tyr Pro Asn Tyr Ala Lys Pro 50 55 60Val Ala Ile Arg Leu His Ala Gln Ile Pro Gln Cys Gln Ala Leu Pro65 70 75 80Asn Ile Asp Pro Pro Thr Val Glu Arg Arg Pro Arg Pro Arg Pro Ser 85 90 95Phe Ile Ala Ile Pro Pro Lys Lys Thr Gln Asp Lys Thr Val Asn Pro 100 105 110Ala Ile Asn Thr Val Ala Thr Val Glu Pro Pro Val Ile Pro Thr Ala 115 120 125Glu Pro Ala Val Asn Thr Val Val Ile Ala Glu Ala Ser Ser Glu Phe 130 135 140Ile Thr Thr Ser Thr Pro Glu Thr Thr Thr Val Gln Ile Thr Ser Thr145 150 155 160Glu Ile90173PRTBos mutus 90Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys Asp Glu Arg1 5 10 15Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln Tyr Val Leu 20 25 30Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln Lys Pro Val 35 40 45Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro 50 55 60Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln Val Leu Ser65 70 75 80Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr Thr Met Ala 85 90 95Arg His Pro His Pro His Leu Ser Phe Met Ala Ile Pro Pro Lys Lys 100 105 110Asn Gln Asp Lys Thr Glu Ile Pro Thr Ile Asn Thr Ile Ala Ser Gly 115 120 125Glu Pro Thr Ser Thr Pro Thr Thr Glu Ala Val Glu Ser Thr Val Ala 130 135 140Thr Leu Glu Ala Ser Pro Glu Ala Ser Pro Glu Val Ile Glu Ser Pro145 150 155 160Pro Glu Ile Asn Thr Val 
Gln Val Thr Ser Thr Ala Val 165 17091165PRTEquus caballus 91Glu Val Gln Asn Gln Glu Gln Pro Thr Cys His Lys Asn Asp Glu Arg1 5 10 15Phe Phe Asp Leu Lys Thr Val Lys Tyr Ile Pro Ile Tyr Tyr Val Leu 20 25 30Asn Ser Ser Pro Arg Tyr Glu Pro Ile Tyr Tyr Gln His Arg Leu Ala 35 40 45Leu Leu Ile Asn Asn Gln His Met Pro Tyr Gln Tyr Tyr Ala Arg Pro 50 55 60Ala Ala Val Arg Pro His Val Gln Ile Pro Gln Trp Gln Val Leu Pro65 70 75 80Asn Ile Tyr Pro Ser Thr Val Val Arg His Pro Cys Pro His Pro Ser 85 90 95Phe Ile Ala Ile Pro Pro Lys Lys Leu Gln Glu Ile Thr Val Ile Pro 100 105 110Lys Ile Asn Thr Ile Ala Thr Val Glu Pro Thr Pro Ile Pro Thr Pro 115 120 125Glu Pro Thr Val Asn Asn Ala Val Ile Pro Asp Ala Ser Ser Glu Phe 130 135 140Ile Ile Ala Ser Thr Pro Glu Thr Thr Thr Val Pro Val Thr Ser Pro145 150 155 160Val Val Gln Lys Leu 16592162PRTEquus asinus 92Glu Val Gln Asn Gln Glu Gln Pro Thr Cys Arg Lys Asn Asp Glu Arg1 5 10 15Phe Phe Asp Leu Lys Thr Val Lys Tyr Ile Pro Ile Tyr Tyr Val Leu 20 25 30Asn Ser Ser Pro Arg Asn Glu Pro Ile Tyr Tyr Gln His Arg Leu Ala 35 40 45Val Leu Ile Asn Asn Gln His Met Pro Tyr Gln Tyr Tyr Ala Arg Pro 50 55 60Ala Ala Val Arg Pro His Val Gln Ile Pro Gln Trp Gln Val Leu Pro65 70 75 80Asn Ile Tyr Pro Ser Thr Val Val Arg His Pro Arg Pro His Pro Ser 85 90 95Phe Ile Ala Ile Pro Pro Lys Lys Leu Gln Glu Lys Thr Val Ile Pro 100 105 110Lys Ile Asn Thr Ile Ala Thr Val Glu Pro Thr Pro Ile Pro Thr Pro 115 120 125Glu Pro Thr Val Asn Asn Ala Val Ile Pro Asp Ala Ser Ser Glu Phe 130 135 140Ile Ile Ala Ser Thr Pro Glu Thr Thr Thr Val Pro Val Thr Ser Pro145 150 155 160Val Val93122PRTRangifer tarandus 93Val Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys1 5 10 15Pro Gly Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln Val Leu 20 25 30Pro Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr Thr Leu 35 40 45Ala Arg His Pro His Pro Arg Leu Ser Phe Met Ala Ile Pro Pro Lys 50 55 60Lys Asn Gln Asp Lys Thr Asp Ile Pro Thr Ile Asn Thr Ile Ala Thr65 70 75 80Val Glu Ser Thr Ile 
Thr Pro Thr Thr Glu Ala Ile Val Asp Thr Val 85 90 95Ala Thr Leu Glu Ala Ser Ser Glu Val Ile Glu Ser Ala Pro Glu Thr 100 105 110Asn Thr Asp Gln Val Thr Ser Thr Val Val 115 12094141PRTAlces alces 94Lys Ile Val Lys Tyr Ile Pro Ile Gln Tyr Ala Leu Ser Arg Tyr Pro1 5 10 15Ser Tyr Gly Leu Ser Tyr Tyr Gln His Arg Pro Val Ala Leu Ile Asn 20 25 30Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr Ala Lys Pro Gly Ala Val Arg 35 40 45Ser Pro Ala Gln Ile Leu Gln Trp Gln Val Leu Pro Asn Thr Val Pro 50 55 60Ala Lys Ser Cys Gln Ala Gln Pro Thr Thr Met Ala Arg His Pro Arg65 70 75 80Pro Arg Leu Ser Phe Met Ala Ile Pro Pro Lys Lys Asn Gln Asp Lys 85 90 95Thr Asp Ile Pro Thr Ile Asn Thr Ile Ala Thr Val Glu Ser Thr Ile 100 105 110Thr Pro Thr Thr Glu Ala Ile Glu Asp Asn Val Ala Thr Leu Glu Ala 115 120 125Ser Ser Glu Val Ile Glu Ser Ala Pro Glu Thr Asn Thr 130 135 14095162PRTVicugna pacos 95Glu Val Gln Asn Gln Glu Gln Pro Thr Cys Cys Glu Lys Val Glu Arg1 5 10 15Leu Leu Asn Glu Lys Thr Val Lys Tyr Phe Pro Ile Gln Phe Val Gln 20 25 30Ser Arg Tyr Pro Ser Tyr Gly Ile Asn Tyr Tyr Gln His Arg Leu Ala 35 40 45Val Pro Ile Asn Asn Gln Phe Ile Pro Tyr Pro Asn Tyr Ala Lys Pro 50 55 60Val Ala Ile Arg Leu His Ala Gln Ile Pro Gln Cys Gln Ala Leu Pro65 70 75 80Asn Ile Asp Pro Pro Thr Val Glu Arg Arg Pro Arg Pro Arg Pro Ser 85 90 95Phe Ile Ala Ile Pro Pro Lys Lys Thr Gln Asp Lys Thr Val Ile Pro 100 105 110Ala Ile Asn Thr Val Ala Thr Ala Glu Pro Pro Val Ile Pro Thr Ala 115 120 125Glu Pro Val Val Asn Thr Val Val Ile Ala Glu Ala Ser Ser Glu Phe 130 135 140Ile Thr Thr Ser Thr Pro Glu Thr Thr Thr Val Gln Ile Thr Ser Thr145 150 155 160Glu Ile96160PRTBos indicus 96Arg Cys Glu Lys Asp Glu Arg Phe Phe Ser Asp Lys Ile Ala Lys Tyr1 5 10 15Ile Pro Ile Gln Tyr Val Leu Ser Arg Tyr Pro Ser Tyr Gly Leu Asn 20 25 30Tyr Tyr Gln Gln Lys Pro Val Ala Leu Ile Asn Asn Gln Phe Leu Pro 35 40 45Tyr Pro Tyr Tyr Ala Lys Pro Ala Ala Val Arg Ser Pro Ala Gln Ile 50 55 60Leu Gln Trp Gln Val Leu Ser Asn Thr Val Pro Ala Lys Ser Cys Gln65 70 75 80Ala 
Gln Pro Thr Thr Met Ala Arg His Pro His Pro His Leu Ser Phe 85 90 95Met Ala Ile Pro Pro Lys Lys Asn Gln Asp Lys Thr Glu Ile Pro Thr 100 105 110Ile Asn Thr Ile Ala Ser Gly Glu Pro Thr Ser Thr Pro Thr Thr Glu 115 120 125Ala Val Glu Ser Thr Val Ala Thr Leu Glu Asp Ser Pro Glu Val Ile 130 135 140Glu Ser Pro Pro Glu Ile Asn Thr Val Gln Val Thr Ser Thr Ala Val145 150 155 16097162PRTLama glama 97Glu Val Gln Asn Gln Glu Gln Pro Thr Cys Cys Glu Lys Val Glu Arg1 5 10 15Leu Leu Asn Glu Lys Thr Val Lys Tyr Phe Pro Ile Gln Phe Val Gln 20 25 30Ser Arg Tyr Pro Ser Tyr Gly Ile Asn Tyr Tyr Gln His Arg Leu Ala 35 40 45Val Pro Ile Asn Asn Gln Phe Ile Pro Tyr Pro Asn Tyr Ala Lys Pro 50 55 60Val Ala Ile Arg Leu His Ala Gln Ile Pro Gln Cys Gln Ala Leu Pro65 70 75 80Asn Ile Asp Pro Pro Thr Val Glu Arg Arg Pro Arg Pro Arg Pro Ser 85 90 95Phe Ile Ala Ile Pro Pro Lys Lys Thr Gln Asp Lys Thr Val Ile Pro 100 105 110Ala Ile Asn Thr Val Ala Thr Val Glu Pro Pro Val Ile Pro Thr Ala 115 120 125Glu Pro Val Val Asn Thr Val Val Ile Ala Glu Ala Ser Ser Glu Phe 130 135 140Ile Thr Thr Ser Thr Pro Glu Thr Thr Thr Val Gln Ile Thr Ser Thr145 150 155 160Glu Ile98162PRTHomo sapiens 98Glu Val Gln Asn Gln Lys Gln Pro Ala Cys His Glu Asn Asp Glu Arg1 5 10 15Pro Phe Tyr Gln Lys Thr Ala Pro Tyr Val Pro Met Tyr Tyr Val Pro 20 25 30Asn Ser Tyr Pro Tyr Tyr Gly Thr Asn Leu Tyr Gln Arg Arg Pro Ala 35 40 45Ile Ala Ile Asn Asn Pro Tyr Val Pro Arg Thr Tyr Tyr Ala Asn Pro 50 55 60Ala Val Val Arg Pro His Ala Gln Ile Pro Gln Arg Gln Tyr Leu Pro65 70 75 80Asn Ser His Pro Pro Thr Val Val Arg Arg Pro Asn Leu His Pro Ser 85 90 95Phe Ile Ala Ile Pro Pro Lys Lys Ile Gln
Asp Lys Ile Ile Ile Pro 100 105 110Thr Ile Asn Thr Ile Ala Thr Val Glu Pro Thr Pro Ala Pro Ala Thr 115 120 125Glu Pro Thr Val Asp Ser Val Val Thr Pro Glu Ala Phe Ser Glu Ser 130 135 140Ile Ile Thr Ser Thr Pro Glu Thr Thr Thr Val Ala Val Thr Pro Pro145 150 155 160Thr Ala99199PRTCapra hircus 99Arg Pro Lys His Pro Ile Asn His Arg Gly Leu Ser Pro Glu Val Pro1 5 10 15Asn Glu Asn Leu Leu Arg Phe Val Val Ala Pro Phe Pro Glu Val Phe 20 25 30Arg Lys Glu Asn Ile Asn Glu Leu Ser Lys Asp Ile Gly Ser Glu Ser 35 40 45Thr Glu Asp Gln Ala Met Glu Asp Ala Lys Gln Met Lys Ala Gly Ser 50 55 60Ser Ser Ser Ser Glu Glu Ile Val Pro Asn Ser Ala Glu Gln Lys Tyr65 70 75 80Ile Gln Lys Glu Asp Val Pro Ser Glu Arg Tyr Leu Gly Tyr Leu Glu 85 90 95Gln Leu Leu Arg Leu Lys Lys Tyr Asn Val Pro Gln Leu Glu Ile Val 100 105 110Pro Lys Ser Ala Glu Glu Gln Leu His Ser Met Lys Glu Gly Asn Pro 115 120 125Ala His Gln Lys Gln Pro Met Ile Ala Val Asn Gln Glu Leu Ala Tyr 130 135 140Phe Tyr Pro Gln Leu Phe Arg Gln Phe Tyr Gln Leu Asp Ala Tyr Pro145 150 155 160Ser Gly Ala Trp Tyr Tyr Leu Pro Leu Gly Thr Gln Tyr Thr Asp Ala 165 170 175Pro Ser Phe Ser Asp Ile Pro Asn Pro Ile Gly Ser Glu Asn Ser Gly 180 185 190Lys Thr Thr Met Pro Leu Trp 195100199PRTOvis aries 100Arg Pro Lys His Pro Ile Lys His Gln Gly Leu Ser Ser Glu Val Leu1 5 10 15Asn Glu Asn Leu Leu Arg Phe Val Val Ala Pro Phe Pro Glu Val Phe 20 25 30Arg Lys Glu Asn Ile Asn Glu Leu Ser Lys Asp Ile Gly Ser Glu Ser 35 40 45Ile Glu Asp Gln Ala Met Glu Asp Ala Lys Gln Met Lys Ala Gly Ser 50 55 60Ser Ser Ser Ser Glu Glu Ile Val Pro Asn Ser Ala Glu Gln Lys Tyr65 70 75 80Ile Gln Lys Glu Asp Val Pro Ser Glu Arg Tyr Leu Gly Tyr Leu Glu 85 90 95Gln Leu Leu Arg Leu Lys Lys Tyr Asn Val Pro Gln Leu Glu Ile Val 100 105 110Pro Lys Ser Ala Glu Glu Gln Leu His Ser Met Lys Glu Gly Asn Pro 115 120 125Ala His Gln Lys Gln Pro Met Ile Ala Val Asn Gln Glu Leu Ala Tyr 130 135 140Phe Tyr Pro Gln Leu Phe Arg Gln Phe Tyr Gln Leu Asp Ala Tyr Pro145 150 155 160Ser Gly Ala Trp Tyr Tyr Leu 
Pro Leu Gly Thr Gln Tyr Thr Asp Ala 165 170 175Pro Ser Phe Ser Asp Ile Pro Asn Pro Ile Gly Ser Glu Asn Ser Gly 180 185 190Lys Ile Thr Met Pro Leu Trp 195101199PRTBubalus bubalis 101Arg Pro Lys Gln Pro Ile Lys His Gln Gly Leu Pro Gln Gly Val Leu1 5 10 15Asn Glu Asn Leu Leu Arg Phe Phe Val Ala Pro Phe Pro Glu Val Phe 20 25 30Gly Lys Glu Lys Val Asn Glu Leu Ser Thr Asp Ile Gly Ser Glu Ser 35 40 45Thr Glu Asp Gln Ala Met Glu Asp Ile Lys Gln Met Glu Ala Glu Ser 50 55 60Ile Ser Ser Ser Glu Glu Ile Val Pro Ile Ser Val Glu Gln Lys His65 70 75 80Ile Gln Lys Glu Asp Val Pro Ser Glu Arg Tyr Leu Gly Tyr Leu Glu 85 90 95Gln Leu Leu Arg Leu Lys Lys Tyr Asn Val Pro Gln Leu Glu Ile Val 100 105 110Pro Asn Leu Ala Glu Glu Gln Leu His Ser Met Lys Glu Gly Ile His 115 120 125Ala Gln Gln Lys Glu Pro Met Ile Gly Val Asn Gln Glu Leu Ala Tyr 130 135 140Phe Tyr Pro Gln Leu Phe Arg Gln Phe Tyr Gln Leu Asp Ala Tyr Pro145 150 155 160Ser Gly Ala Trp Tyr Tyr Val Pro Leu Gly Thr Gln Tyr Pro Asp Ala 165 170 175Pro Ser Phe Ser Asp Ile Pro Asn Pro Ile Gly Ser Glu Asn Ser Gly 180 185 190Lys Thr Thr Met Pro Leu Trp 195102154PRTCamelus dromedarius 102Asp Thr Glu Arg Lys Glu Ser Gly Ser Ser Ser Ser Glu Glu Val Val1 5 10 15Ser Ser Thr Thr Glu Gln Lys Asp Ile Leu Lys Glu Asp Met Pro Ser 20 25 30Gln Arg Tyr Leu Glu Glu Leu His Arg Leu Asn Lys Tyr Lys Leu Leu 35 40 45Gln Leu Glu Ala Ile Arg Asp Gln Lys Leu Ile Pro Arg Val Lys Leu 50 55 60Ser Ser His Pro Tyr Leu Glu Gln Leu Tyr Arg Ile Asn Glu Asp Asn65 70 75 80His Pro Gln Leu Gly Glu Pro Val Lys Val Val Thr Gln Glu Gln Ala 85 90 95Tyr Phe His Leu Glu Pro Phe Pro Gln Phe Phe Gln Leu Gly Ala Ser 100 105 110Pro Tyr Val Ala Trp Tyr Tyr Pro Pro Gln Val Met Gln Tyr Ile Ala 115 120 125His Pro Ser Ser Tyr Asp Thr Pro Glu Gly Ile Ala Ser Glu Asp Gly 130 135 140Gly Lys Thr Asp Val Met Pro Gln Trp Trp145 150103207PRTCamelus bactrianus 103Arg Pro Lys Tyr Pro Leu Arg Tyr Pro Glu Val Phe Gln Asn Glu Pro1 5 10 15Asp Ser Ile Glu Glu Val Leu Asn Lys Arg Lys Ile Leu Glu Leu 
Ala 20 25 30Val Val Ser Pro Ile Gln Phe Arg Gln Glu Asn Ile Asp Glu Leu Lys 35 40 45Asp Thr Arg Asn Glu Pro Thr Glu Asp His Ile Met Glu Asp Thr Glu 50 55 60Arg Lys Glu Ser Gly Ser Ser Ser Ser Glu Glu Val Val Ser Ser Thr65 70 75 80Thr Glu Gln Lys Asp Ile Leu Lys Glu Asp Met Pro Ser Gln Arg Tyr 85 90 95Leu Glu Glu Leu His Arg Leu Asn Lys Tyr Lys Leu Leu Gln Leu Glu 100 105 110Ala Ile Arg Asp Gln Lys Leu Ile Pro Arg Val Lys Leu Ser Ser His 115 120 125Pro Tyr Leu Glu Gln Leu Tyr Arg Ile Asn Glu Asp Asn His Pro Gln 130 135 140Leu Gly Glu Pro Val Lys Val Val Thr Gln Pro Phe Pro Gln Phe Phe145 150 155 160Gln Leu Gly Ala Ser Pro Tyr Val Ala Trp Tyr Tyr Pro Pro Gln Val 165 170 175Met Gln Tyr Ile Ala His Pro Ser Ser Tyr Asp Thr Pro Glu Gly Ile 180 185 190Ala Ser Glu Asp Gly Gly Lys Thr Asp Val Met Pro Gln Trp Trp 195 200 205104199PRTBos mutus 104Arg Pro Lys His Pro Ile Lys His Gln Gly Leu Pro Gln Glu Val Leu1 5 10 15Asn Glu Asn Leu Leu Arg Phe Phe Val Ala Pro Phe Pro Glu Val Phe 20 25 30Gly Lys Glu Lys Val Asn Glu Leu Ser Lys Asp Ile Gly Ser Glu Ser 35 40 45Thr Glu Asp Gln Ala Met Glu Asp Ile Lys Gln Met Glu Ala Glu Ser 50 55 60Ile Ser Ser Ser Glu Glu Ile Val Pro Asn Ser Val Glu Gln Lys His65 70 75 80Ile Gln Lys Glu Asp Val Pro Ser Glu His Tyr Leu Gly Tyr Leu Glu 85 90 95Gln Leu Leu Arg Leu Lys Lys Tyr Lys Val Pro Gln Leu Glu Ile Val 100 105 110Pro Asn Ser Ala Glu Glu Arg Leu His Ser Met Lys Glu Gly Ile His 115 120 125Ala Gln Gln Lys Glu Pro Met Ile Gly Val Asn Gln Glu Leu Ala Tyr 130 135 140Phe Tyr Pro Glu Leu Phe Arg Gln Phe Tyr Gln Leu Asp Ala Tyr Pro145 150 155 160Ser Gly Ala Trp Tyr Tyr Val Pro Leu Gly Thr Gln Tyr Thr Asp Ala 165 170 175Pro Ser Phe Ser Asp Ile Pro Asn Pro Ile Gly Ser Glu Asn Ser Gly 180 185 190Lys Thr Thr Met Pro Leu Trp 195105226PRTEquus caballus 105Arg Glu Lys Glu Glu Leu Asn Val Ser Ser Glu Thr Val Glu Ser Leu1 5 10 15Ser Ser Asn Glu Pro Asp Ser Ser Ser Glu Glu Ser Ile Thr His Ile 20 25 30Asn Lys Glu Lys Leu Gln Lys Phe Lys His Glu Gly Gln Gln Gln 
Arg 35 40 45Glu Val Glu Arg Gln Asp Lys Ile Ser Arg Phe Val Gln Pro Gln Pro 50 55 60Val Val Tyr Pro Tyr Ala Glu Pro Val Pro Tyr Ala Val Val Pro Gln65 70 75 80Ser Ile Leu Pro Leu Ala Gln Pro Pro Ile Leu Pro Phe Leu Gln Pro 85 90 95Glu Ile Met Glu Val Ser Gln Ala Lys Glu Thr Ile Leu Pro Lys Arg 100 105 110Lys Val Met Pro Phe Leu Lys Ser Pro Ile Val Pro Phe Ser Glu Arg 115 120 125Gln Ile Leu Asn Pro Thr Asn Gly Glu Asn Leu Arg Leu Pro Val His 130 135 140Leu Ile Gln Pro Phe Met His Gln Val Pro Gln Ser Leu Leu Gln Thr145 150 155 160Leu Met Leu Pro Ser Gln Pro Val Leu Ser Pro Pro Gln Ser Lys Val 165 170 175Ala Pro Phe Pro Gln Pro Val Val Pro Tyr Pro Gln Arg Asp Thr Pro 180 185 190Val Gln Ala Phe Leu Leu Tyr Gln Asp Pro Arg Leu Gly Pro Thr Gly 195 200 205Glu Leu Asp Pro Ala Thr Gln Pro Ile Val Ala Val His Asn Pro Val 210 215 220Ile Val225106202PRTEquus asinus 106Arg Pro Lys Leu Pro His Arg His Pro Glu Ile Ile Gln Asn Glu Gln1 5 10 15Asp Ser Arg Glu Lys Val Leu Lys Glu Arg Lys Phe Pro Ser Phe Ala 20 25 30Leu His Thr Pro Arg Glu Glu Tyr Ile Asn Glu Leu Asn Arg Gln Arg 35 40 45Glu Leu Leu Lys Glu Lys Gln Lys Asp Glu His Lys Glu Tyr Leu Ile 50 55 60Glu Asp Pro Glu Gln Gln Glu Ser Ser Ser Thr Ser Ser Ser Glu Glu65 70 75 80Val Val Pro Ile Asn Thr Glu Gln Lys Arg Ile Pro Arg Glu Asp Met 85 90 95Leu Tyr Gln His Thr Leu Glu Gln Leu Arg Arg Leu Ser Lys Tyr Asn 100 105 110Gln Leu Gln Leu Gln Ala Ile Tyr Ala Gln Glu Gln Leu Ile Arg Met 115 120 125Lys Glu Asn Ser Gln Arg Lys Pro Met Arg Val Val Asn Gln Glu Gln 130 135 140Ala Tyr Phe Tyr Leu Glu Pro Phe Gln Pro Ser Tyr Gln Leu Asp Val145 150 155 160Tyr Pro Tyr Ala Ala Trp Phe His Pro Ala Gln Ile Met Gln His Val 165 170 175Ala Tyr Ser Pro Phe His Asp Thr Ala Lys Leu Ile Ala Ser Glu Asn 180 185 190Ser Glu Lys Thr Asp Ile Ile Pro Glu Trp 195 200107199PRTBos indicusmisc_feature(84)..(84)Xaa can be any naturally occurring amino acid 107Arg Pro Lys His Pro Ile Lys His Gln Gly Leu Pro Gln Glu Val Leu1 5 10 15Asn Glu Asn Leu Leu Arg Phe Phe 
Val Ala Pro Phe Pro Glu Val Phe 20 25 30Gly Lys Glu Lys Val Asn Glu Leu Ser Lys Asp Ile Gly Ser Glu Ser 35 40 45Thr Glu Asp Gln Ala Met Glu Asp Ile Lys Gln Met Glu Ala Glu Ser 50 55 60Ile Ser Ser Ser Glu Glu Ile Val Pro Asn Ser Val Glu Gln Lys His65 70 75 80Ile Gln Lys Xaa Asp Val Pro Ser Glu Arg Tyr Leu Gly Tyr Leu Glu 85 90 95Gln Leu Leu Arg Leu Lys Lys Tyr Lys Val Pro Gln Leu Glu Ile Val 100 105 110Pro Asn Ser Ala Glu Glu Arg Leu His Ser Met Lys Glu Gly Ile His 115 120 125Ala Gln Gln Lys Glu Pro Met Ile Gly Val Asn Gln Glu Leu Ala Tyr 130 135 140Phe Tyr Pro Glu Leu Phe Arg Gln Phe Tyr Gln Leu Asp Ala Tyr Pro145 150 155 160Ser Gly Ala Trp Tyr Tyr Val Pro Leu Gly Thr Gln Tyr Thr Asp Ala 165 170 175Pro Ser Phe Ser Asp Ile Pro Asn Pro Ile Gly Ser Glu Asn Ser Gly 180 185 190Lys Thr Thr Met Pro Leu Trp 195108215PRTLama glama 108Arg Pro Lys Tyr Pro Leu Arg Tyr Pro Glu Val Phe Gln Asn Glu Pro1 5 10 15Asp Ser Ile Gln Glu Val Leu Asn Lys Arg Lys Ile Leu Glu Leu Ala 20 25 30Val Val Ser Pro Ile Gln Phe Arg Gln Glu Asn Ile Asp Glu Leu Lys 35 40 45Asp Thr Arg Asn Glu Pro Thr Glu Asp His Ile Met Glu Asp Thr Glu 50 55 60Arg Thr Val Ser Gly Ser Ser Ser Ser Glu Glu Val Val Ser Ser Thr65 70 75 80Thr Glu Gln Lys Asp Ile Leu Lys Glu Asp Met Pro Ser Gln Arg Ile 85 90 95Leu Glu Glu Leu His Arg Leu Asn Lys Tyr Lys Leu Leu Gln Leu Glu 100 105 110Ala Ile Arg Asp Gln Lys Leu Ile Pro Arg Val Lys Leu Ser Ser His 115 120 125Pro Tyr Leu Glu Gln Leu Tyr Arg Ile Asn Glu Asp Asn His Pro Gln 130 135 140Leu Gly Glu Pro Val Lys Val Val Thr Gln Glu Gln Ala Tyr Phe His145 150 155 160Leu Glu Pro Phe Gln Gln Phe Phe Gln Leu Gly Ala Ser Pro Tyr Val 165 170 175Ala Trp Tyr Tyr Pro Pro Gln Val Met Gln Tyr Ile Ala His Pro Ser 180 185 190Ser His Asp Thr Pro Glu Gly Ile Ala Ser Glu Asp Gly Gly Lys Thr 195 200 205Asp Val Met Pro Gln Trp Trp 210 215109170PRTHomo sapiens 109Arg Pro Lys Leu Pro Leu Arg Tyr Pro Glu Arg Leu Gln Asn Pro Ser1 5 10 15Glu Ser Ser Glu Pro Ile Pro Leu Glu Ser Arg Glu Glu Tyr Met Asn 20 
25 30Gly Met Asn Arg Gln Arg Asn Ile Leu Arg Glu Lys Gln Thr Asp Glu 35 40 45Ile Lys Asp Thr Arg Asn Glu Ser Thr Gln Asn Cys Val Val Ala Glu 50 55 60Pro Glu Lys Met Glu Ser Ser Ile Ser Ser Ser Ser Glu Glu Met Ser65 70 75 80Leu Ser Lys Cys Ala Glu Gln Phe Cys Arg Leu Asn Glu Tyr Asn Gln 85 90 95Leu Gln Leu Gln Ala Ala His Ala Gln Glu Gln Ile Arg Arg Met Asn 100 105 110Glu Asn Ser His Val Gln Val Pro Phe Gln Gln Leu Asn Gln Leu Ala 115 120 125Ala Tyr Pro Tyr Ala Val Trp Tyr Tyr Pro Gln Ile Met Gln Tyr Val 130 135 140Pro Phe Pro Pro Phe Ser Asp Ile Ser Asn Pro Thr Ala His Glu Asn145 150 155 160Tyr Glu Lys Asn Asn Val Met Leu Gln Trp 165 170110208PRTCapra hircus 110Lys His Lys Met Glu His Val Ser Ser Ser Glu Glu Pro Ile Asn Ile1 5 10 15Phe Gln Glu Ile Tyr Lys Gln Glu Lys Asn Met Ala Ile His Pro Arg 20 25 30Lys Glu Lys Leu Cys Thr Thr Ser Cys Glu Glu Val Val Arg Asn Ala 35 40 45Asn Glu Glu Glu Tyr Ser Ile Arg Ser Ser Ser Glu Glu Ser Ala Glu 50 55 60Val Ala Pro Glu Glu Ile Lys Ile Thr Val Asp Asp Lys His Tyr Gln65 70 75 80Lys Ala Leu Asn Glu Ile Asn Gln Phe Tyr Gln Lys Phe Pro Gln Tyr 85 90 95Leu Gln Tyr Pro Tyr Gln Gly Pro Ile Val Leu Asn Pro Trp Asp Gln 100 105 110Val Lys Arg Asn Ala Gly Pro Phe Thr Pro Thr Val Asn Arg Glu Gln 115 120 125Leu Ser Thr Ser Glu Glu Asn Ser Lys Lys Thr Ile Asp Met Glu Ser 130 135 140Thr Glu Val Phe Thr Lys Lys Thr Lys Leu Thr Glu Glu Glu Lys Asn145 150 155 160Arg Leu Asn Phe Leu Lys Lys Ile Ser Gln Tyr Tyr Gln Lys Phe Ala 165 170 175Trp Pro Gln Tyr Leu Lys Thr Val Asp Gln His Gln Lys Ala Met Lys 180 185 190Pro Trp Thr Gln Pro Lys Thr Asn Ala Ile Pro Tyr Val Arg Tyr Leu 195 200 205111208PRTOvis aries 111Lys His Lys Met Glu His Val Ser
Ser Ser Glu Glu Pro Ile Asn Ile1 5 10 15Ser Gln Glu Ile Tyr Lys Gln Glu Lys Asn Met Ala Ile His Pro Arg 20 25 30Lys Glu Lys Leu Cys Thr Thr Ser Cys Glu Glu Val Val Arg Asn Ala 35 40 45Asp Glu Glu Glu Tyr Ser Ile Arg Ser Ser Ser Glu Glu Ser Ala Glu 50 55 60Val Ala Pro Glu Glu Val Lys Ile Thr Val Asp Asp Lys His Tyr Gln65 70 75 80Lys Ala Leu Asn Glu Ile Asn Gln Phe Tyr Gln Lys Phe Pro Gln Tyr 85 90 95Leu Gln Tyr Leu Tyr Gln Gly Pro Ile Val Leu Asn Pro Trp Asp Gln 100 105 110Val Lys Arg Asn Ala Gly Pro Phe Thr Pro Thr Val Asn Arg Glu Gln 115 120 125Leu Ser Thr Ser Glu Glu Asn Ser Lys Lys Thr Ile Asp Met Glu Ser 130 135 140Thr Glu Val Phe Thr Lys Lys Thr Lys Leu Thr Glu Glu Glu Lys Asn145 150 155 160Arg Leu Asn Phe Leu Lys Lys Ile Ser Gln Tyr Tyr Gln Lys Phe Ala 165 170 175Trp Pro Gln Tyr Leu Lys Thr Val Asp Gln His Gln Lys Ala Met Lys 180 185 190Pro Trp Thr Gln Pro Lys Thr Asn Ala Ile Pro Tyr Val Arg Tyr Leu 195 200 205112207PRTBubalus bubalis 112Lys His Thr Met Glu His Val Ser Ser Ser Glu Glu Ser Ile Ile Ser1 5 10 15Gln Glu Thr Tyr Lys Gln Glu Lys Asn Met Ala Ile His Pro Ser Lys 20 25 30Glu Asn Leu Cys Ser Thr Phe Cys Lys Glu Val Ile Arg Asn Ala Asn 35 40 45Glu Glu Glu Tyr Ser Ile Gly Ser Ser Ser Glu Glu Ser Ala Glu Val 50 55 60Ala Thr Glu Glu Val Lys Ile Thr Val Asp Asp Lys His Tyr Gln Lys65 70 75 80Ala Leu Asn Glu Ile Asn Gln Phe Tyr Gln Lys Phe Pro Gln Tyr Leu 85 90 95Gln Tyr Leu Tyr Gln Gly Pro Ile Val Leu Asn Pro Trp Asp Gln Val 100 105 110Lys Arg Asn Ala Val Pro Ile Thr Pro Thr Leu Asn Arg Glu Gln Leu 115 120 125Ser Thr Ser Glu Glu Asn Ser Lys Lys Thr Val Asp Met Glu Ser Thr 130 135 140Glu Val Ile Thr Lys Lys Thr Lys Leu Thr Glu Glu Asp Lys Asn Arg145 150 155 160Leu Asn Phe Leu Lys Lys Ile Ser Gln His Tyr Gln Lys Phe Thr Trp 165 170 175Pro Gln Tyr Leu Lys Thr Val Tyr Gln Tyr Gln Lys Ala Met Lys Pro 180 185 190Trp Thr Gln Pro Lys Thr Asn Val Ile Pro Tyr Val Arg Tyr Leu 195 200 205113178PRTCamelus dromedarius 113Lys His Glu Met Asp Gln Gly Ser Ser Ser Glu Glu Ser 
Ile Asn Val1 5 10 15Ser Gln Gln Lys Phe Lys Gln Val Lys Lys Val Ala Ile His Pro Ser 20 25 30Lys Glu Asp Ile Cys Ser Thr Phe Cys Glu Glu Ala Val Arg Asn Ile 35 40 45Lys Glu Val Glu Ser Ala Glu Val Pro Thr Glu Asn Lys Ile Ser Gln 50 55 60Phe Tyr Gln Lys Trp Lys Phe Leu Gln Tyr Leu Gln Ala Leu His Gln65 70 75 80Gly Gln Ile Val Met Asn Pro Trp Asp Gln Gly Lys Thr Arg Ala Tyr 85 90 95Pro Phe Ile Pro Thr Val Asn Thr Glu Gln Leu Ser Ile Ser Glu Glu 100 105 110Ser Thr Glu Val Pro Thr Glu Glu Ser Thr Glu Val Phe Thr Lys Lys 115 120 125Thr Glu Leu Thr Glu Glu Glu Lys Asp His Gln Lys Phe Leu Asn Lys 130 135 140Ile Tyr Gln Tyr Tyr Gln Thr Phe Leu Trp Pro Glu Tyr Leu Lys Thr145 150 155 160Val Tyr Gln Tyr Gln Lys Thr Met Thr Pro Trp Asn His Ile Lys Arg 165 170 175Tyr Phe114178PRTCamelus bactrianus 114Lys His Glu Met Asp Gln Gly Ser Ser Ser Glu Glu Ser Ile Asn Val1 5 10 15Ser Gln Gln Lys Phe Lys Gln Val Lys Lys Val Ala Ile His Pro Ser 20 25 30Lys Glu Asp Ile Cys Ser Thr Phe Cys Glu Glu Ala Val Arg Asn Ile 35 40 45Lys Glu Val Glu Ser Ala Glu Val Pro Thr Glu Asn Lys Ile Ser Gln 50 55 60Phe Tyr Gln Lys Trp Lys Phe Leu Gln Tyr Leu Gln Ala Leu His Gln65 70 75 80Gly Gln Ile Val Met Asn Pro Trp Asp Gln Gly Lys Thr Arg Ala Tyr 85 90 95Pro Phe Ile Pro Thr Val Asn Thr Glu Gln Leu Ser Ile Ser Glu Glu 100 105 110Ser Thr Glu Val Pro Thr Glu Glu Ser Thr Glu Val Phe Asn Lys Lys 115 120 125Thr Glu Leu Thr Glu Glu Glu Lys Asp His Gln Lys Phe Leu Asn Lys 130 135 140Ile Tyr Gln Tyr Tyr Gln Thr Phe Leu Trp Pro Glu Tyr Leu Lys Thr145 150 155 160Val Tyr Gln Tyr Gln Lys Thr Met Thr Pro Trp Asn His Ile Lys Arg 165 170 175Tyr Phe115204PRTBos mutus 115Lys Asn Thr Met Glu His Val Ser Ser Ser Glu Glu Ser Ile Ile Ser1 5 10 15Gln Glu Thr Tyr Lys Gln Glu Lys Asn Met Ala Ile Asn Pro Ser Lys 20 25 30Gly Asn Leu Cys Ser Thr Phe Cys Lys Glu Val Val Arg Asn Ala Asn 35 40 45Glu Glu Glu Tyr Ser Ile Gly Ser Ser Ser Glu Glu Ser Ala Glu Val 50 55 60Ala Thr Glu Glu Val Lys Ile Thr Val Asp Asp Lys His Tyr Gln Lys65 70 75 
80Ala Leu Asn Glu Ile Asn Gln Phe Tyr Gln Lys Phe Pro Gln Tyr Leu 85 90 95Gln Tyr Leu Tyr Gln Gly Pro Ile Val Leu Asn Pro Trp Asp Gln Val 100 105 110Lys Arg Asn Ala Val Pro Ile Thr Pro Thr Leu Asn Arg Glu Gln Leu 115 120 125Ser Thr Ser Glu Glu Asn Ser Lys Lys Thr Val Asp Met Glu Ser Thr 130 135 140Glu Val Phe Thr Lys Lys Thr Lys Leu Thr Glu Glu Glu Lys Asn Arg145 150 155 160Leu Asn Phe Leu Lys Lys Ile Ser Gln Arg Tyr Gln Lys Phe Ala Leu 165 170 175Pro Gln Tyr Leu Lys Thr Val Tyr Gln His Gln Lys Ala Met Lys Pro 180 185 190Trp Ile Gln Pro Lys Thr Lys Val Ile Pro Tyr Val 195 200116216PRTEquus caballus 116Lys His Asn Met Glu His Arg Ser Ser Ser Glu Asp Ser Val Asn Ile1 5 10 15Ser Gln Glu Lys Phe Lys Gln Glu Lys Tyr Val Val Ile Pro Thr Ser 20 25 30Lys Glu Ser Ile Cys Ser Thr Ser Cys Glu Glu Ala Thr Arg Asn Ile 35 40 45Asn Glu Met Glu Ser Ala Lys Phe Pro Thr Glu Val Tyr Ser Ser Ser 50 55 60Ser Ser Ser Glu Glu Ser Ala Lys Phe Pro Thr Glu Arg Glu Glu Lys65 70 75 80Glu Val Glu Glu Lys His His Leu Lys Gln Leu Asn Lys Ile Asn Gln 85 90 95Phe Tyr Glu Lys Leu Asn Phe Leu Gln Tyr Leu Gln Ala Leu Arg Gln 100 105 110Pro Arg Ile Val Leu Thr Pro Trp Asp Gln Thr Lys Thr Gly Asp Ser 115 120 125Pro Phe Ile Pro Ile Val Asn Thr Glu Gln Leu Phe Thr Ser Glu Glu 130 135 140Ile Pro Lys Lys Thr Val Asp Met Glu Ser Thr Glu Val Val Thr Glu145 150 155 160Lys Thr Glu Leu Thr Glu Glu Glu Lys Asn Tyr Leu Lys Leu Leu Tyr 165 170 175Tyr Glu Lys Phe Thr Leu Pro Gln Tyr Phe Lys Ile Val Arg Gln His 180 185 190Gln Thr Thr Met Asp Pro Arg Ser His Arg Lys Thr Asn Ser Tyr Gln 195 200 205Ile Ile Pro Val Leu Arg Tyr Phe 210 215117221PRTEquus asinus 117Lys His Asn Met Glu His Arg Ser Ser Ser Glu Asp Ser Val Asn Ile1 5 10 15Ser Gln Glu Lys Phe Lys Gln Glu Lys Tyr Val Val Ile Pro Thr Ser 20 25 30Lys Glu Ser Ile Cys Ser Thr Ser Cys Glu Glu Ala Thr Arg Asn Ile 35 40 45Asn Glu Met Glu Ser Ala Lys Phe Pro Thr Glu Val Tyr Ser Ser Ser 50 55 60Ser Ser Ser Glu Glu Ser Ala Lys Phe Pro Thr Glu Arg Glu Glu Lys65 70 75 
80Glu Val Glu Glu Lys His His Leu Lys Gln Leu Asn Lys Ile Asn Gln 85 90 95Phe Tyr Glu Lys Leu Asn Phe Leu Gln Tyr Leu Gln Ala Leu Arg Gln 100 105 110Pro Arg Ile Val Leu Thr Pro Trp Asp Gln Thr Lys Thr Gly Ala Ser 115 120 125Pro Phe Ile Pro Ile Val Asn Thr Glu Gln Leu Phe Thr Ser Glu Glu 130 135 140Ile Pro Lys Lys Thr Val Asp Met Glu Ser Thr Glu Val Val Thr Glu145 150 155 160Lys Thr Glu Leu Thr Glu Glu Glu Lys Asn Tyr Leu Lys Leu Leu Asn 165 170 175Lys Ile Asn Gln Tyr Tyr Glu Lys Phe Thr Leu Pro Gln Tyr Phe Lys 180 185 190Ile Val His Gln His Gln Thr Thr Met Asp Pro Gln Ser His Ser Lys 195 200 205Thr Asn Ser Tyr Gln Ile Ile Pro Val Leu Arg Tyr Phe 210 215 220118192PRTVicugna pacos 118Lys His Glu Met Asp Gln Gly Ser Ser Ser Glu Glu Ser Ile Asn Val1 5 10 15Ser Gln Gln Lys Leu Lys Gln Val Lys Lys Val Ala Ile His Pro Ser 20 25 30Lys Glu Asp Ile Cys Ser Thr Phe Cys Glu Glu Ala Val Arg Asn Ile 35 40 45Lys Glu Val Glu Ser Val Glu Val Pro Thr Glu Asn Lys Ile Ser Gln 50 55 60Phe Tyr Gln Lys Trp Lys Phe Leu Gln Tyr Leu Gln Ala Leu His Gln65 70 75 80Gly Gln Ile Val Met Asn Pro Trp Asp Gln Gly Lys Thr Met Val Tyr 85 90 95Pro Phe Ile Pro Thr Val Asn Thr Glu Gln Leu Ser Ile Ser Glu Glu 100 105 110Ser Thr Glu Val Pro Thr Glu Glu Ser Thr Glu Val Phe Thr Lys Lys 115 120 125Thr Glu Leu Thr Glu Glu Glu Lys Asp His Gln Lys Phe Leu Asn Lys 130 135 140Ile Tyr Gln Tyr Tyr Gln Thr Phe Leu Trp Pro Glu Tyr Leu Lys Thr145 150 155 160Val Tyr Gln Tyr Gln Lys Thr Met Thr Pro Trp Asn His Ile Lys Val 165 170 175Lys Ala Tyr Gln Ile Ile Pro Asn Leu Val Ser Ser Thr Phe Tyr Leu 180 185 190119207PRTBos indicus 119Lys Asn Thr Met Glu His Val Ser Ser Ser Glu Glu Ser Ile Ile Ser1 5 10 15Gln Glu Thr Tyr Lys Gln Glu Lys Asn Met Ala Ile Asn Pro Ser Lys 20 25 30Glu Asn Leu Cys Ser Thr Phe Cys Lys Glu Val Val Arg Asn Ala Asn 35 40 45Glu Glu Glu Tyr Ser Ile Gly Ser Ser Ser Glu Glu Ser Ala Glu Val 50 55 60Ala Thr Glu Glu Val Lys Ile Thr Val Asp Asp Lys His Tyr Gln Lys65 70 75 80Ala Leu Asn Glu Ile Asn Gln Phe Tyr 
Gln Lys Phe Pro Gln Tyr Leu 85 90 95Gln Tyr Leu Tyr Gln Gly Pro Ile Val Leu Asn Pro Trp Asp Gln Val 100 105 110Lys Arg Asn Ala Val Pro Ile Thr Pro Thr Leu Asn Arg Glu Gln Leu 115 120 125Ser Thr Ser Glu Glu Asn Ser Lys Lys Thr Val Asp Met Glu Ser Thr 130 135 140Glu Val Phe Thr Lys Lys Thr Lys Leu Thr Glu Glu Glu Lys Asn Arg145 150 155 160Leu Asn Phe Leu Lys Lys Ile Ser Gln Arg Tyr Gln Lys Phe Ala Leu 165 170 175Pro Gln Tyr Leu Lys Thr Val Tyr Gln His Gln Lys Ala Met Lys Pro 180 185 190Trp Ile Gln Pro Lys Thr Lys Val Ile Pro Tyr Val Arg Tyr Leu 195 200 205120187PRTLama glama 120Lys His Glu Met Asp Gln Gly Ser Ser Ser Glu Glu Ser Ile Asn Val1 5 10 15Ser Gln Gln Lys Leu Lys Gln Val Lys Lys Val Ala Ile His Pro Ser 20 25 30Lys Glu Asp Ile Cys Ser Thr Phe Cys Glu Glu Ala Val Arg Asn Ile 35 40 45Lys Glu Val Glu Ser Val Glu Val Pro Thr Glu Asn Lys Ile Ser Gln 50 55 60Phe Tyr Gln Lys Trp Lys Phe Leu Gln Tyr Leu Gln Ala Leu His Gln65 70 75 80Gly Gln Ile Val Met Asn Pro Trp Asp Gln Gly Lys Thr Met Val Tyr 85 90 95Pro Phe Ile Pro Thr Val Asn Thr Glu Gln Leu Ser Ile Ser Glu Glu 100 105 110Ser Thr Glu Val Pro Thr Glu Glu Asn Ser Lys Lys Thr Val Asp Thr 115 120 125Glu Ser Thr Glu Val Phe Thr Lys Lys Thr Glu Leu Thr Glu Glu Glu 130 135 140Lys Asp His Gln Lys Phe Leu Asn Lys Ile Tyr Gln Tyr Tyr Gln Thr145 150 155 160Phe Leu Trp Pro Glu Tyr Leu Lys Thr Val Tyr Gln Tyr Gln Lys Thr 165 170 175Met Thr Pro Trp Asn His Ile Lys Arg Tyr Phe 180 185121207PRTCapra hircus 121Arg Glu Gln Glu Glu Leu Asn Val Val Gly Glu Thr Val Glu Ser Leu1 5 10 15Ser Ser Ser Glu Glu Ser Ile Thr His Ile Asn Lys Lys Ile Glu Lys 20 25 30Phe Gln Ser Glu Glu Gln Gln Gln Thr Glu Asp Glu Leu Gln Asp Lys 35 40 45Ile His Pro Phe Ala Gln Ala Gln Ser Leu Val Tyr Pro Phe Thr Gly 50 55 60Pro Ile Pro Asn Ser Leu Pro Gln Asn Ile Leu Pro Leu Thr Gln Thr65 70 75 80Pro Val Val Val Pro Pro Phe Leu Gln Pro Glu Ile Met Gly Val Pro 85 90 95Lys Val Lys Glu Thr Met Val Pro Lys His Lys Glu Met Pro Phe Pro 100 105 110Lys Tyr Pro Val Glu 
Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu Thr 115 120 125Asp Val Glu Lys Leu His Leu Pro Leu Pro Leu Val Gln Ser Trp Met 130 135 140His Gln Pro Pro Gln Pro Leu Ser Pro Thr Val Met Phe Pro Pro Gln145 150 155 160Ser Val Leu Ser Leu Ser Gln Pro Lys Val Leu Pro Val Pro Gln Lys 165 170 175Ala Val Pro Gln Arg Asp Met Pro Ile Gln Ala Phe Leu Leu Tyr Gln 180 185 190Glu Pro Val Leu Gly Pro Val Arg Gly Pro Phe Pro Ile Leu Val 195 200 205122207PRTOvis aries 122Arg Glu Gln Glu Glu Leu Asn Val Val Gly Glu Thr Val Glu Ser Leu1 5 10 15Ser Ser Ser Glu Glu Ser Ile Thr His Ile Asn Lys Lys Ile Glu Lys 20 25 30Phe Gln Ser Glu Glu Gln Gln Gln Thr Glu Asp Glu Leu Gln Asp Lys 35 40 45Ile His Pro Phe Ala Gln Ala Gln Ser Leu Val Tyr Pro Phe Thr Gly 50 55 60Pro Ile Pro Asn Ser Leu Pro Gln Asn Ile Leu Pro Leu Thr Gln Thr65 70 75 80Pro Val Val Val Pro Pro Phe Leu Gln Pro Glu Ile Met Gly Val Pro 85 90 95Lys Val Lys Glu Thr Met Val Pro Lys His Lys Glu Met Pro Phe Pro 100 105 110Lys Tyr Pro Val Glu Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu Thr 115 120 125Asp Val Glu Lys Leu His Leu Pro Leu Pro Leu Val Gln Ser Trp Met 130 135 140His Gln Pro Pro Gln Pro Leu Pro Pro Thr Val Met Phe Pro Pro Gln145 150 155 160Ser Val Leu Ser Leu Ser Gln Pro Lys Val Leu Pro Val Pro Gln Lys 165 170 175Ala Val Pro Gln Arg Asp Met Pro Ile Gln Ala Phe Leu Leu Tyr Gln 180 185 190Glu Pro Val Leu Gly Pro Val Arg Gly Pro Phe Pro Ile Leu Val 195 200 205123209PRTBubalus bubalis 123Arg Glu Leu Glu Glu Leu Asn Val Pro Gly Glu Ile Val Glu Ser Leu1 5 10 15Ser Ser Ser Glu Glu Ser Ile Thr His Ile Asn Lys Lys Ile Glu Lys 20 25 30Phe Gln
Ser Glu Glu Gln Gln Gln Met Glu Asp Glu Leu Gln Asp Lys 35 40 45Ile His Pro Phe Ala Gln Thr Gln Ser Leu Val Tyr Pro Phe Pro Gly 50 55 60Pro Ile Pro Lys Ser Leu Pro Gln Asn Ile Pro Pro Leu Thr Gln Thr65 70 75 80Pro Val Val Val Pro Pro Phe Leu Gln Pro Glu Ile Met Gly Val Ser 85 90 95Lys Val Lys Glu Ala Met Ala Pro Lys His Lys Glu Met Pro Phe Pro 100 105 110Lys Tyr Pro Val Glu Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu Thr 115 120 125Asp Val Glu Asn Leu His Leu Pro Leu Pro Leu Leu Gln Ser Trp Met 130 135 140His Gln Pro Pro Gln Pro Leu Pro Pro Thr Val Met Phe Pro Pro Gln145 150 155 160Ser Val Leu Ser Leu Ser Gln Ser Lys Val Leu Pro Val Pro Gln Lys 165 170 175Ala Val Pro Tyr Pro Gln Arg Asp Met Pro Ile Gln Ala Phe Leu Leu 180 185 190Tyr Gln Glu Pro Val Leu Gly Pro Val Arg Gly Pro Phe Pro Ile Ile 195 200 205Val124217PRTCamelus dromedarius 124Arg Glu Lys Glu Glu Phe Lys Thr Ala Gly Glu Ala Leu Glu Ser Ile1 5 10 15Ser Ser Ser Glu Glu Ser Ile Thr His Ile Asn Lys Gln Lys Ile Glu 20 25 30Lys Phe Lys Ile Glu Glu Gln Gln Gln Thr Glu Asp Glu Gln Gln Asp 35 40 45Lys Ile Tyr Thr Phe Pro Gln Pro Gln Ser Leu Val Tyr Ser His Thr 50 55 60Glu Pro Ile Pro Tyr Pro Ile Leu Pro Gln Asn Phe Leu Pro Pro Leu65 70 75 80Gln Pro Ala Val Met Val Pro Phe Leu Gln Pro Lys Val Met Asp Val 85 90 95Pro Lys Thr Lys Glu Thr Ile Ile Pro Lys Arg Lys Glu Met Pro Leu 100 105 110Leu Gln Ser Pro Val Val Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu 115 120 125Thr Asp Leu Glu Asn Leu His Leu Pro Leu Pro Leu Leu Gln Ser Leu 130 135 140Met Tyr Gln Ile Pro Gln Pro Val Pro Gln Thr Pro Met Ile Pro Pro145 150 155 160Gln Ser Leu Leu Ser Leu Ser Gln Phe Lys Val Leu Pro Val Pro Gln 165 170 175Gln Met Val Pro Tyr Pro Gln Arg Ala Met Pro Val Gln Ala Val Leu 180 185 190Pro Phe Gln Glu Pro Val Pro Asp Pro Val Arg Gly Leu His Pro Val 195 200 205Pro Gln Pro Leu Val Pro Val Ile Ala 210 215125217PRTCamelus bactrianus 125Arg Glu Lys Glu Glu Phe Lys Thr Ala Gly Glu Ala Leu Glu Ser Ile1 5 10 15Ser Ser Ser Glu Glu Ser Ile Thr His Ile Asn 
Lys Gln Lys Ile Glu 20 25 30Lys Phe Lys Ile Glu Glu Gln Gln Gln Thr Glu Asp Glu Gln Gln Asp 35 40 45Lys Ile Tyr Thr Phe Pro Gln Pro Gln Ser Leu Val Tyr Ser His Thr 50 55 60Glu Pro Ile Pro Tyr Pro Ile Leu Pro Gln Asn Phe Leu Pro Pro Leu65 70 75 80Gln Pro Ala Val Met Val Pro Phe Leu Gln Pro Lys Val Met Asp Val 85 90 95Pro Lys Thr Lys Glu Thr Ile Ile Pro Lys Arg Lys Glu Met Pro Leu 100 105 110Leu Gln Ser Pro Val Val Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu 115 120 125Thr Asp Leu Glu Asn Leu His Leu Pro Leu Pro Leu Leu Gln Ser Leu 130 135 140Met Tyr Gln Ile Pro Gln Pro Val Pro Gln Thr Pro Met Ile Pro Pro145 150 155 160Gln Ser Leu Leu Ser Leu Ser Gln Phe Lys Val Leu Pro Val Pro Gln 165 170 175Gln Met Val Pro Tyr Pro Gln Arg Ala Ile Pro Val Gln Ala Val Leu 180 185 190Pro Phe Gln Glu Pro Val Pro Asp Pro Val Arg Gly Leu His Pro Val 195 200 205Pro Gln Pro Leu Val Pro Val Ile Ala 210 215126209PRTBos mutus 126Arg Glu Leu Glu Glu Leu Asn Val Pro Gly Glu Ile Val Glu Ser Leu1 5 10 15Ser Ser Ser Glu Glu Ser Ile Thr Arg Ile Asn Lys Lys Ile Glu Lys 20 25 30Phe Gln Ser Glu Glu Gln Gln Gln Thr Glu Asp Glu Leu Gln Asp Lys 35 40 45Ile His Pro Phe Ala Gln Thr Gln Ser Leu Val Tyr Pro Phe Pro Gly 50 55 60Pro Ile Pro Asn Ser Leu Pro Gln Asn Ile Pro Pro Leu Thr Gln Thr65 70 75 80Pro Val Val Val Pro Pro Phe Leu Gln Pro Glu Val Met Gly Val Ser 85 90 95Lys Val Lys Glu Ala Met Ala Pro Lys His Lys Glu Met Pro Phe Pro 100 105 110Lys Tyr Pro Val Glu Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu Thr 115 120 125Asp Val Glu Asn Leu His Leu Pro Leu Pro Leu Leu Gln Ser Trp Met 130 135 140His Gln Pro His Gln Pro Leu Pro Pro Thr Val Met Phe Pro Pro Gln145 150 155 160Ser Val Leu Ser Leu Ser Gln Ser Lys Val Leu Pro Val Pro Gln Lys 165 170 175Ala Val Pro Tyr Pro Gln Arg Asp Met Pro Ile Gln Ala Phe Leu Leu 180 185 190Tyr Gln Glu Pro Val Leu Gly Pro Val Arg Gly Pro Phe Pro Ile Ile 195 200 205Val127226PRTEquus caballus 127Arg Glu Lys Glu Glu Leu Asn Val Ser Ser Glu Thr Val Glu Ser Leu1 5 10 15Ser Ser Asn Glu Pro Asp 
Ser Ser Ser Glu Glu Ser Ile Thr His Ile 20 25 30Asn Lys Glu Lys Leu Gln Lys Phe Lys His Glu Gly Gln Gln Gln Arg 35 40 45Glu Val Glu Arg Gln Asp Lys Ile Ser Arg Phe Val Gln Pro Gln Pro 50 55 60Val Val Tyr Pro Tyr Ala Glu Pro Val Pro Tyr Ala Val Val Pro Gln65 70 75 80Ser Ile Leu Pro Leu Ala Gln Pro Pro Ile Leu Pro Phe Leu Gln Pro 85 90 95Glu Ile Met Glu Val Ser Gln Ala Lys Glu Thr Ile Leu Pro Lys Arg 100 105 110Lys Val Met Pro Phe Leu Lys Ser Pro Ile Val Pro Phe Ser Glu Arg 115 120 125Gln Ile Leu Asn Pro Thr Asn Gly Glu Asn Leu Arg Leu Pro Val His 130 135 140Leu Ile Gln Pro Phe Met His Gln Val Pro Gln Ser Leu Leu Gln Thr145 150 155 160Leu Met Leu Pro Ser Gln Pro Val Leu Ser Pro Pro Gln Ser Lys Val 165 170 175Ala Pro Phe Pro Gln Pro Val Val Pro Tyr Pro Gln Arg Asp Thr Pro 180 185 190Val Gln Ala Phe Leu Leu Tyr Gln Asp Pro Arg Leu Gly Pro Thr Gly 195 200 205Glu Leu Asp Pro Ala Thr Gln Pro Ile Val Ala Val His Asn Pro Val 210 215 220Ile Val225128226PRTEquus asinus 128Arg Glu Lys Glu Glu Leu Asn Val Ser Ser Glu Thr Val Glu Ser Leu1 5 10 15Ser Ser Asn Glu Pro Asp Ser Ser Ser Glu Glu Ser Ile Thr His Ile 20 25 30Asn Lys Glu Lys Ser Gln Lys Phe Lys His Glu Gly Gln Gln Gln Arg 35 40 45Glu Val Glu His Gln Asp Lys Ile Ser Arg Phe Val Gln Pro Gln Pro 50 55 60Val Val Tyr Pro Tyr Ala Glu Pro Val Pro Tyr Ala Val Val Pro Gln65 70 75 80Asn Ile Leu Val Leu Ala Gln Pro Pro Ile Val Pro Phe Leu Gln Pro 85 90 95Glu Ile Met Glu Val Ser Gln Ala Lys Glu Thr Ile Leu Pro Lys Arg 100 105 110Lys Val Met Pro Phe Leu Lys Ser Pro Ile Val Pro Phe Ser Glu Arg 115 120 125Gln Ile Leu Asn Pro Thr Asn Gly Glu Asn Leu Arg Leu Pro Val His 130 135 140Leu Ile Gln Pro Phe Met His Gln Val Pro Gln Ser Leu Leu Gln Thr145 150 155 160Leu Met Leu Pro Ser Gln Pro Val Leu Ser Pro Pro Gln Ser Lys Val 165 170 175Ala Pro Phe Pro Gln Pro Val Val Pro Tyr Pro Gln Arg Asp Thr Pro 180 185 190Val Gln Ala Phe Leu Leu Tyr Gln Asp Pro Gln Leu Gly Leu Thr Gly 195 200 205Glu Phe Asp Pro Ala Thr Gln Pro Ile Val Pro Val His Asn Pro 
Val 210 215 220Ile Val225129141PRTAlces alcesmisc_feature(6)..(6)Xaa can be any naturally occurring amino acidmisc_feature(17)..(17)Xaa can be any naturally occurring amino acidmisc_feature(65)..(65)Xaa can be any naturally occurring amino acid 129Ile His Pro Phe Ala Xaa Thr Gln Ser Leu Val Tyr Pro Phe Thr Gly1 5 10 15Xaa Ile Pro Tyr Ser Leu Pro Gln Asn Phe Leu Pro Leu Pro Gln Thr 20 25 30Pro Gly Met Val Pro Pro Phe Leu Gln Pro Glu Ile Met Gly Val Ser 35 40 45Glu Val Lys Glu Thr Met Val Pro Lys Asn Lys Glu Met Pro Phe Pro 50 55 60Xaa Tyr Pro Val Glu Pro Phe Ala Glu Gly Gln Ser Leu Thr Leu Thr65 70 75 80Asp Val Glu Asn Leu His Leu Pro Leu Pro Leu Leu Gln Ser Trp Met 85 90 95His Gln Thr Pro Gln Pro Leu Pro Pro Thr Val Met Phe Pro Pro Gln 100 105 110Ser Val Leu Ser Leu Ser Gln Pro Lys Val Leu Ser Val Pro Gln Lys 115 120 125Ala Val Pro Tyr Pro Gln Arg Asp Met Pro Ile Gln Ala 130 135 140130173PRTVicugna pacos 130Asp Glu Gln Gln Asp Lys Ile Tyr Thr Phe Pro Gln Pro Gln Ser Leu1 5 10 15Val Tyr Ser His Thr Glu Pro Ile Pro Tyr Pro Ile Leu Pro Gln Asn 20 25 30Phe Leu Pro Pro Leu Gln Pro Ala Val Met Val Pro Phe Leu Gln Pro 35 40 45Lys Val Met Asp Val Pro Lys Thr Lys Glu Ile Val Ile Pro Lys Arg 50 55 60Lys Glu Met Pro Leu Leu Gln Ser Pro Leu Val Pro Phe Thr Glu Ser65 70 75 80Gln Ser Leu Thr Leu Thr Asp Leu Glu Asn Leu His Leu Pro Leu Pro 85 90 95Leu Leu Gln Ser Leu Met His Gln Ile Pro Gln Pro Val Pro Gln Thr 100 105 110Pro Met Ile Pro Pro Gln Ser Leu Leu Ser Leu Ser Gln Phe Lys Val 115 120 125Leu Pro Val Pro Gln Gln Met Val Pro Tyr Pro Gln Arg Ala Met Pro 130 135 140Val Gln Ala Leu Leu Pro Phe Gln Glu Pro Ile Pro Asp Pro Val Arg145 150 155 160Gly Leu His Pro Val Pro Gln Pro Leu Val Pro Val Ile 165 170131209PRTBos indicus 131Arg Glu Leu Glu Glu Leu Asn Val Pro Gly Glu Ile Val Glu Ser Leu1 5 10 15Ser Ser Ser Glu Glu Ser Ile Thr Arg Ile Asn Lys Lys Ile Glu Lys 20 25 30Phe Gln Ser Glu Glu Gln Gln Gln Thr Glu Asp Glu Leu Gln Asp Lys 35 40 45Ile His Pro Phe Ala Gln Thr Gln Ser Leu 
Val Tyr Pro Phe Pro Gly 50 55 60Pro Ile Pro Asn Ser Leu Pro Gln Asn Ile Pro Pro Leu Thr Gln Thr65 70 75 80Pro Val Val Val Pro Pro Phe Leu Gln Pro Glu Val Met Gly Val Ser 85 90 95Lys Val Lys Glu Ala Met Ala Pro Lys His Lys Glu Met Pro Phe Pro 100 105 110Lys Tyr Pro Val Glu Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu Thr 115 120 125Asp Val Glu Asn Leu His Leu Pro Leu Pro Leu Leu Gln Ser Trp Met 130 135 140His Gln Pro His Gln Pro Leu Pro Pro Thr Val Met Phe Pro Pro Gln145 150 155 160Ser Val Leu Ser Leu Ser Gln Ser Lys Val Leu Pro Val Pro Gln Lys 165 170 175Ala Val Pro Tyr Pro Gln Arg Asp Met Pro Ile Gln Ala Phe Leu Leu 180 185 190Tyr Gln Glu Pro Val Leu Gly Pro Val Arg Gly Pro Phe Pro Ile Ile 195 200 205Val132217PRTLama glama 132Arg Glu Lys Glu Glu Phe Lys Thr Ala Gly Glu Ala Val Glu Ser Ile1 5 10 15Ser Ser Ser Glu Glu Ser Ile Thr His Ile Asn Lys Gln Lys Ile Glu 20 25 30Lys Phe Lys Ile Glu Glu Gln Gln Gln Thr Glu Asp Glu Gln Gln Asp 35 40 45Lys Ile Tyr Thr Phe Pro Gln Pro Gln Ser Leu Val Tyr Ser His Thr 50 55 60Glu Pro Ile Pro Tyr Pro Ile Leu Pro Gln Asn Phe Leu Pro Pro Leu65 70 75 80Gln Pro Ala Val Met Val Pro Phe Leu Gln Pro Lys Val Met Asp Val 85 90 95Pro Lys Thr Lys Glu Ile Val Ile Pro Lys Arg Lys Glu Met Pro Leu 100 105 110Leu Gln Ser Pro Leu Val Pro Phe Thr Glu Ser Gln Ser Leu Thr Leu 115 120 125Thr Asp Leu Glu Asn Leu His Leu Pro Leu Pro Leu Leu Gln Ser Leu 130 135 140Met His Gln Ile Pro Gln Pro Val Pro Gln Thr Pro Met Ile Pro Pro145 150 155 160Gln Ser Leu Leu Ser Leu Ser Gln Phe Lys Val Leu Pro Val Pro Gln 165 170 175Gln Met Val Pro Tyr Pro Gln Arg Ala Met Pro Val Gln Ala Leu Leu 180 185 190Pro Phe Gln Glu Pro Ile Pro Asp Pro Val Arg Gly Leu His Pro Val 195 200 205Pro Gln Pro Leu Val Pro Val Ile Ala 210 215133211PRTHomo sapiens 133Arg Glu Thr Ile Glu Ser Leu Ser Ser Ser Glu Glu Ser Ile Thr Glu1 5 10 15Tyr Lys Gln Lys Val Glu Lys Val Lys His Glu Asp Gln Gln Gln Gly 20 25 30Glu Asp Glu His Gln Asp Lys Ile Tyr Pro Ser Phe Gln Pro Gln Pro 35 40 45Leu Ile Tyr Pro Phe 
Val Glu Pro Ile Pro Tyr Gly Phe Leu Pro Gln 50 55 60Asn Ile Leu Pro Leu Ala Gln Pro Ala Val Val Leu Pro Val Pro Gln65 70 75 80Pro Glu Ile Met Glu Val Pro Lys Ala Lys Asp Thr Val Tyr Thr Lys 85 90 95Gly Arg Val Met Pro Val Leu Lys Ser Pro Thr Ile Pro Phe Phe Asp 100 105 110Pro Gln Ile Pro Lys Leu Thr Asp Leu Glu Asn Leu His Leu Pro Leu 115 120 125Pro Leu Leu Gln Pro Leu Met Gln Gln Val Pro Gln Pro Ile Pro Gln 130 135 140Thr Leu Ala Leu Pro Pro Gln Pro Leu Trp Ser Val Pro Gln Pro Lys145 150 155 160Val Leu Pro Ile Pro Gln Gln Val Val Pro Tyr Pro Gln Arg Ala Val 165 170 175Pro Val Gln Ala Leu Leu Leu Asn Gln Glu Leu Leu Leu Asn Pro Thr 180 185 190His Gln Ile Tyr Pro Val Thr Gln Pro Leu Ala Pro Val His Asn Pro 195 200 205Ile Ser Val 2101341059DNAArtificial SequenceNucleic acid encoding fusion protein sig2OKC1-TFMOLG1 134atggccaagc tagttttttc cctttgtttt ctgcttttca gtggctgctg cttcgctcaa 60gagcagaatc aagagcagcc aatccgttgt gagaaggacg agaggttctt ctcagacaag 120atcgccaaat atatacccat acaatatgta ctctcacgct accctagcta cgggcttaac 180tactatcagc aaaaacctgt agcactgata aataaccagt ttctccccta tccctattat 240gctaaacctg ccgccgtgag gagtccagca caaatacttc agtggcaagt gctcagtaac 300accgtgccag caaaaagctg ccaggctcag cccaccacaa tggcccgtca tccccatcct 360caccttagct tcatggcaat cccaccaaag aagaatcaag acaagaccga aatacctacc 420atcaacacaa ttgcatctgg agagcctacc agtacaccaa caactgaggc agtagagtct 480actgttgcta cccttgagga cagccccgag gttatagagt ccccacctga gataaatacc 540gtgcaggtga caagtaccgc cgtattcatg ttgatcgtaa cacagactat gaagggtctt 600gatatacaga aggtggccgg gacttggtac agtttggcaa tggccgcatc cgacatctcc 660ttgttggacg cacaatcagc cccattgcgt gtgtacgtag aagagcttaa accaactccc 720gagggggatc tggaaattct gctccagaaa tgggagaacg gtgagtgcgc ccagaagaag 780atcatcgcag agaagaccaa aattccagca gtattcaaaa tcgacgcatt gaacgaaaat 840aaggtgctcg tactggacac tgattataag aagtatctcc ttttctgtat ggagaactca 900gcagagcctg aacagagtct tgcctgccaa tgccttgttc gtaccccaga ggtagatgat 960gaagctctgg aaaagttcga taaggccctt aaggctctgc ctatgcacat taggctttct 
1020ttcaatccaa ctcaacttga ggaacaatgt cacatttaa 1059135352PRTArtificial SequenceFusion protein sig2OKC1-TFMOLG1 135Met Ala Lys Leu Val Phe Ser Leu Cys Phe Leu
Leu Phe Ser Gly Cys1 5 10 15Cys Phe Ala Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys 20 25 30Asp Glu Arg Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln 35 40 45Tyr Val Leu Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln 50 55 60Lys Pro Val Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr65 70 75 80Ala Lys Pro Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln 85 90 95Val Leu Ser Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr 100 105 110Thr Met Ala Arg His Pro His Pro His Leu Ser Phe Met Ala Ile Pro 115 120 125Pro Lys Lys Asn Gln Asp Lys Thr Glu Ile Pro Thr Ile Asn Thr Ile 130 135 140Ala Ser Gly Glu Pro Thr Ser Thr Pro Thr Thr Glu Ala Val Glu Ser145 150 155 160Thr Val Ala Thr Leu Glu Asp Ser Pro Glu Val Ile Glu Ser Pro Pro 165 170 175Glu Ile Asn Thr Val Gln Val Thr Ser Thr Ala Val Phe Met Leu Ile 180 185 190Val Thr Gln Thr Met Lys Gly Leu Asp Ile Gln Lys Val Ala Gly Thr 195 200 205Trp Tyr Ser Leu Ala Met Ala Ala Ser Asp Ile Ser Leu Leu Asp Ala 210 215 220Gln Ser Ala Pro Leu Arg Val Tyr Val Glu Glu Leu Lys Pro Thr Pro225 230 235 240Glu Gly Asp Leu Glu Ile Leu Leu Gln Lys Trp Glu Asn Gly Glu Cys 245 250 255Ala Gln Lys Lys Ile Ile Ala Glu Lys Thr Lys Ile Pro Ala Val Phe 260 265 270Lys Ile Asp Ala Leu Asn Glu Asn Lys Val Leu Val Leu Asp Thr Asp 275 280 285Tyr Lys Lys Tyr Leu Leu Phe Cys Met Glu Asn Ser Ala Glu Pro Glu 290 295 300Gln Ser Leu Ala Cys Gln Cys Leu Val Arg Thr Pro Glu Val Asp Asp305 310 315 320Glu Ala Leu Glu Lys Phe Asp Lys Ala Leu Lys Ala Leu Pro Met His 325 330 335Ile Arg Leu Ser Phe Asn Pro Thr Gln Leu Glu Glu Gln Cys His Ile 340 345 3501361071DNAArtificial SequenceNucleic acid encoding fusion protein sig2OKC1-TFMOLG1KDEL 136atggccaagc tagttttttc cctttgtttt ctgcttttca gtggctgctg cttcgctcaa 60gagcagaatc aagagcagcc aatccgttgt gagaaggacg agaggttctt ctcagacaag 120atcgccaaat atatacccat acaatatgta ctctcacgct accctagcta cgggcttaac 180tactatcagc aaaaacctgt agcactgata aataaccagt ttctccccta tccctattat 240gctaaacctg ccgccgtgag gagtccagca 
caaatacttc agtggcaagt gctcagtaac 300accgtgccag caaaaagctg ccaggctcag cccaccacaa tggcccgtca tccccatcct 360caccttagct tcatggcaat cccaccaaag aagaatcaag acaagaccga aatacctacc 420atcaacacaa ttgcatctgg agagcctacc agtacaccaa caactgaggc agtagagtct 480actgttgcta cccttgagga cagccccgag gttatagagt ccccacctga gataaatacc 540gtgcaggtga caagtaccgc cgtattcatg ttgatcgtaa cacagactat gaagggtctt 600gatatacaga aggtggccgg gacttggtac agtttggcaa tggccgcatc cgacatctcc 660ttgttggacg cacaatcagc cccattgcgt gtgtacgtag aagagcttaa accaactccc 720gagggggatc tggaaattct gctccagaaa tgggagaacg gtgagtgcgc ccagaagaag 780atcatcgcag agaagaccaa aattccagca gtattcaaaa tcgacgcatt gaacgaaaat 840aaggtgctcg tactggacac tgattataag aagtatctcc ttttctgtat ggagaactca 900gcagagcctg aacagagtct tgcctgccaa tgccttgttc gtaccccaga ggtagatgat 960gaagctctgg aaaagttcga taaggccctt aaggctctgc ctatgcacat taggctttct 1020ttcaatccaa ctcaacttga ggaacaatgt cacattaagg atgagcttta a 1071137356PRTArtificial SequenceFusion protein sig2OKC1-TFMOLG1KDEL 137Met Ala Lys Leu Val Phe Ser Leu Cys Phe Leu Leu Phe Ser Gly Cys1 5 10 15Cys Phe Ala Gln Glu Gln Asn Gln Glu Gln Pro Ile Arg Cys Glu Lys 20 25 30Asp Glu Arg Phe Phe Ser Asp Lys Ile Ala Lys Tyr Ile Pro Ile Gln 35 40 45Tyr Val Leu Ser Arg Tyr Pro Ser Tyr Gly Leu Asn Tyr Tyr Gln Gln 50 55 60Lys Pro Val Ala Leu Ile Asn Asn Gln Phe Leu Pro Tyr Pro Tyr Tyr65 70 75 80Ala Lys Pro Ala Ala Val Arg Ser Pro Ala Gln Ile Leu Gln Trp Gln 85 90 95Val Leu Ser Asn Thr Val Pro Ala Lys Ser Cys Gln Ala Gln Pro Thr 100 105 110Thr Met Ala Arg His Pro His Pro His Leu Ser Phe Met Ala Ile Pro 115 120 125Pro Lys Lys Asn Gln Asp Lys Thr Glu Ile Pro Thr Ile Asn Thr Ile 130 135 140Ala Ser Gly Glu Pro Thr Ser Thr Pro Thr Thr Glu Ala Val Glu Ser145 150 155 160Thr Val Ala Thr Leu Glu Asp Ser Pro Glu Val Ile Glu Ser Pro Pro 165 170 175Glu Ile Asn Thr Val Gln Val Thr Ser Thr Ala Val Phe Met Leu Ile 180 185 190Val Thr Gln Thr Met Lys Gly Leu Asp Ile Gln Lys Val Ala Gly Thr 195 200 205Trp Tyr Ser Leu Ala Met Ala Ala Ser Asp Ile 
Ser Leu Leu Asp Ala 210 215 220Gln Ser Ala Pro Leu Arg Val Tyr Val Glu Glu Leu Lys Pro Thr Pro225 230 235 240Glu Gly Asp Leu Glu Ile Leu Leu Gln Lys Trp Glu Asn Gly Glu Cys 245 250 255Ala Gln Lys Lys Ile Ile Ala Glu Lys Thr Lys Ile Pro Ala Val Phe 260 265 270Lys Ile Asp Ala Leu Asn Glu Asn Lys Val Leu Val Leu Asp Thr Asp 275 280 285Tyr Lys Lys Tyr Leu Leu Phe Cys Met Glu Asn Ser Ala Glu Pro Glu 290 295 300Gln Ser Leu Ala Cys Gln Cys Leu Val Arg Thr Pro Glu Val Asp Asp305 310 315 320Glu Ala Leu Glu Lys Phe Asp Lys Ala Leu Lys Ala Leu Pro Met His 325 330 335Ile Arg Leu Ser Phe Asn Pro Thr Gln Leu Glu Glu Gln Cys His Ile 340 345 350Lys Asp Glu Leu 355


