Patent application title: METHOD FOR INCREASING N-GLYCOSYLATION SITE OCCUPANCY ON THERAPEUTIC GLYCOPROTEINS PRODUCED IN PICHIA PASTORIS
Inventors:
IPC8 Class: AC07K1628FI
USPC Class:
Class name:
Publication date: 2014-08-14
Patent application number: 20140227290
Abstract:
Described is a method for increasing the N-glycosylation site occupancy
of a therapeutic glycoprotein produced in recombinant host cells modified
as described herein and genetically engineered to express the
glycoprotein compared to the N-glycosylation site occupancy of the
therapeutic glycoprotein produced in a recombinant host cell not modified
as described herein. In particular, the method provides recombinant host
cells that overexpress a heterologous single-subunit
oligosaccharyltransferase, which in particular embodiments is capable of
functionally suppressing the lethal phenotype of a mutation of at least
one essential protein of the yeast oligosaccharyltransferase (OTase)
complex, for example, the Leishmania major STT3D protein, in the presence
of expression of the host cell genes encoding the endogenous OTase
complex. The method is useful for producing therapeutic
glycoproteins with increased N-glycosylation site occupancy both in lower
eukaryote cells such as yeast and filamentous fungi and in higher
eukaryote cells such as plant and insect cells and mammalian cells.
Claims:
1-22. (canceled)
23. A glycoprotein composition comprising: a plurality of antibodies wherein at least 70% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 50 to 70 mole % of the N-glycans are G0, 15-25 mole % of the N-glycans are G1, and about 5 to 15 mole % of the N-glycans comprise a Man5GlcNAc2 core structure and a pharmaceutically acceptable carrier.
24. The composition of claim 23, wherein the antibodies comprise an antibody selected from the group consisting of anti-Her2 antibody, anti-RSV (respiratory syncytial virus) antibody, anti-TNFα antibody, anti-VEGF antibody, anti-CD3 receptor antibody, anti-CD41 7E3 antibody, anti-CD25 antibody, anti-CD52 antibody, anti-CD33 antibody, anti-IgE antibody, anti-CD11a antibody, anti-EGF receptor antibody, and anti-CD20 antibody.
25-26. (canceled)
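As an illustration only, the numerical limits recited in claim 23 can be written as a simple range check. The sketch below is not part of the application: the function name, the input format (percentage of antibody molecules with both sites occupied, plus mole % per N-glycan class), and the treatment of "about" as exact bounds are all assumptions made for this example, and the sample numbers are invented.

```python
def meets_claim_23_limits(both_sites_occupied_pct, glycan_mole_pct):
    """Return True if a measured profile falls inside the ranges recited in claim 23.

    Hypothetical helper. "About" is treated as exact bounds (an assumption).
    glycan_mole_pct maps 'G0', 'G1', and 'Man5GlcNAc2' to mole % of total N-glycans.
    """
    ranges = {
        "G0": (50.0, 70.0),           # about 50 to 70 mole %
        "G1": (15.0, 25.0),           # 15-25 mole %
        "Man5GlcNAc2": (5.0, 15.0),   # about 5 to 15 mole %
    }
    # At least 70% of antibody molecules must have both sites occupied.
    if both_sites_occupied_pct < 70.0:
        return False
    for glycan, (low, high) in ranges.items():
        value = glycan_mole_pct.get(glycan, 0.0)
        if not (low <= value <= high):
            return False
    return True


# Invented example profile, for illustration only.
example = {"G0": 60.0, "G1": 20.0, "Man5GlcNAc2": 10.0}
print(meets_claim_23_limits(82.0, example))
```

How "about" should be construed for a real composition is a legal question that this sketch does not address.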
Description:
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This patent application is a divisional of U.S. application Ser. No. 13/579,972 filed 20 Aug. 2012, now pending, which is a National Phase entry of PCT International Application No. PCT/US2011/025878 filed 23 Feb. 2011, and which claims benefit of U.S. Provisional Application No. 61/307,642, filed 24 Feb. 2010, now expired.
REFERENCE TO SEQUENCE LISTING SUBMITTED ELECTRONICALLY
[0002] The sequence listing of the present application is submitted electronically via EFS-Web as an ASCII formatted sequence listing with a file name "GFIMIS00010USPCT-SEQTXT-28JAN2014.txt", creation date of Jan. 28, 2014, and a size of 151 KB. This sequence listing submitted via EFS-Web is part of the specification and is herein incorporated by reference in its entirety.
BACKGROUND OF THE INVENTION
[0003] (1) Field of the Invention
[0004] The present invention relates to methods for increasing the N-glycosylation site occupancy of a heterologous glycoprotein produced in a recombinant host cell modified according to the present invention and genetically engineered to express the glycoprotein compared to the N-glycosylation site occupancy of the therapeutic glycoprotein produced in a recombinant host cell not modified according to the present invention. In particular, the present invention provides recombinant host cells that overexpress a heterologous single-subunit oligosaccharyltransferase, which in particular embodiments is capable of functionally suppressing the lethal phenotype of a mutation of at least one essential protein of the yeast oligosaccharyltransferase (OTase) complex in the presence of the host cell's endogenous OTase complex, and methods of using these host cells to produce heterologous glycoproteins.
[0005] (2) Description of Related Art
[0006] The ability to produce recombinant human proteins has led to major advances in human health care and remains an active area of drug discovery. Many therapeutic proteins require the posttranslational addition of glycans to specific asparagine residues (N-glycosylation) of the protein to ensure proper structure-function activity and subsequent stability in human serum. For therapeutic use in humans, glycoproteins require human-like N-glycosylation. Mammalian cell lines (e.g., Chinese hamster ovary (CHO) cells, human retinal cells) that can mimic human-like glycoprotein processing have several drawbacks including low protein titers, long fermentation times, heterogeneous products, and continued viral containment. It is therefore desirable to use an expression system that not only produces high protein titers with short fermentation times, but can also produce human-like glycoproteins.
[0007] Fungal hosts such as Saccharomyces cerevisiae or methylotrophic yeast such as Pichia pastoris have distinct advantages for therapeutic protein expression, for example, they do not secrete high amounts of endogenous proteins, strong inducible promoters for producing heterologous proteins are available, they can be grown in defined chemical media and without the use of animal sera, and they can produce high titers of recombinant proteins (Cregg et al., FEMS Microbiol. Rev. 24: 45-66 (2000)). However, glycosylated proteins expressed in yeast generally contain additional mannose sugars resulting in "high mannose" glycans. Because these high mannose N-glycans can result in adverse responses when administered to certain individuals, yeast have not generally been used to produce therapeutic glycoproteins intended for human use. However, methods for genetically engineering yeast to produce human-like N-glycans are described in U.S. Pat. Nos. 7,029,872 and 7,449,308 along with methods described in U.S. Published Application Nos. 20040230042, 20050208617, 20040171826, and 20060286637. These methods have been used to construct recombinant yeast that can produce therapeutic glycoproteins that have predominantly human-like complex or hybrid N-glycans thereon instead of yeast-type N-glycans.
[0008] It has been found that while the genetically engineered yeast can produce glycoproteins that have mammalian- or human-like N-glycans, the occupancy of N-glycan attachment sites on glycoproteins varies widely and is generally lower than the occupancy of these same sites in glycoproteins produced in mammalian cells. This has been observed for various recombinant antibodies produced in Pichia pastoris. However, variability of occupancy of N-glycan attachment sites has also been observed in mammalian cells. For example, Gawlitzek et al., Identification of cell culture conditions to control N-glycosylation site-occupancy of recombinant glycoproteins expressed in CHO cells, Biotechnol. Bioengin. 103: 1164-1175 (2009), disclosed that N-glycosylation site occupancy can vary for particular sites for particular glycoproteins produced in CHO cells and that modifications in growth conditions can be made to control occupancy at these sites. International Published Application No. WO 2006107990 discloses a method for improving protein N-glycosylation of eukaryotic cells using the dolichol-linked oligosaccharide synthesis pathway. Control of N-glycosylation site occupancy has been reviewed by Jones et al., Biochim. Biophys. Acta. 1726: 121-137 (2005). However, there still remains a need for methods for increasing N-glycosylation site occupancy of therapeutic proteins produced in recombinant host cells.
BRIEF SUMMARY OF THE INVENTION
[0009] The present invention provides a method for producing therapeutic glycoproteins in recombinant host cells modified as disclosed herein wherein the N-glycosylation site occupancy of the glycoproteins produced in the host cells modified as disclosed herein is increased over the N-glycosylation site occupancy of the same glycoproteins produced in host cells not modified as disclosed herein. For example, in yeast host cells modified as disclosed herein, the N-glycosylation site occupancy of glycoproteins produced therein will be the same as or more similar to the N-glycosylation site occupancy of the same glycoproteins produced in recombinant mammalian or human cells.
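For illustration, the comparison of site occupancy described in the preceding paragraph can be sketched as a short calculation. This is a minimal sketch and not a method taken from the application: the function name, the input format (counts of glycoprotein molecules grouped by number of occupied sites), and the example counts are hypothetical.

```python
def site_occupancy(counts_by_occupied_sites, sites_per_molecule):
    """Compute two occupancy measures from counts of molecules.

    counts_by_occupied_sites maps k (number of occupied sites) to the number of
    molecules observed with exactly k sites occupied. Hypothetical input format.
    Returns (fraction of all sites occupied, fraction of molecules fully occupied).
    """
    if any(k < 0 or k > sites_per_molecule for k in counts_by_occupied_sites):
        raise ValueError("occupied-site count outside 0..sites_per_molecule")
    total_molecules = sum(counts_by_occupied_sites.values())
    if total_molecules == 0:
        raise ValueError("no molecules counted")
    occupied_sites = sum(k * n for k, n in counts_by_occupied_sites.items())
    per_site = occupied_sites / (total_molecules * sites_per_molecule)
    fully_occupied = counts_by_occupied_sites.get(sites_per_molecule, 0) / total_molecules
    return per_site, fully_occupied


# Invented counts for a glycoprotein with two N-glycosylation sites.
unmodified_host = {0: 5, 1: 15, 2: 80}
modified_host = {0: 0, 1: 2, 2: 98}
print(site_occupancy(unmodified_host, 2))
print(site_occupancy(modified_host, 2))
```

The second returned value, the fraction of molecules with every site occupied, is the kind of measure that a limit such as "both N-glycosylation sites occupied" refers to; the first is the average occupancy over all sites.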
[0010] To increase the N-glycosylation site occupancy on a glycoprotein produced in a recombinant host cell, one or more heterologous single-subunit oligosaccharyltransferases (OTases) is/are overexpressed in the recombinant host cell either before or simultaneously with the expression of the glycoprotein in the host cell. In particular aspects, at least one of the heterologous single-subunit oligosaccharyltransferases is capable of functionally complementing a lethal mutation of one or more essential subunits comprising the endogenous host cell hetero-oligomeric oligosaccharyltransferase (OTase) complex. The Leishmania major STT3D protein is an example of a heterologous single-subunit oligosaccharyltransferase that has been shown to suppress a lethal mutation in the STT3 locus and at least one locus selected from WBP1, OST1, SWP1, and OST2 in Saccharomyces cerevisiae (Nasab et al., Molec. Biol. Cell 19: 3758-3768 (2008)). In general, the one or more heterologous single-subunit oligosaccharyltransferases is/are overexpressed constitutively or inducibly in the presence of the proteins comprising the host cell's endogenous OTase complex, including the host cell's endogenous STT3 protein. Expression cassettes encoding the heterologous single-subunit oligosaccharyltransferase gene can either be integrated into any site within the host cell genome or located in the extrachromosomal space of the host cell, i.e., autonomously replicating genetic elements such as plasmids, viruses, 2 μm plasmid, minichromosomes, and the like.
[0011] In particular embodiments, one or more of the single-subunit oligosaccharyltransferases is/are the Leishmania sp. STT3A protein, STT3B protein, STT3C protein, STT3D protein, or combinations thereof. In particular embodiments, the one or more single-subunit oligosaccharyltransferases is/are the Leishmania major STT3A protein, STT3B protein, STT3C protein, STT3D protein, or combinations thereof. The nucleic acid molecules encoding the single-subunit oligosaccharyltransferases are not overexpressed in lieu of the expression of the endogenous genes encoding the proteins comprising the host cell's OTase complex, including the host cell STT3 protein. Instead the nucleic acid molecules encoding the single-subunit oligosaccharyltransferases are overexpressed constitutively or inducibly in the presence of the expression of the genes encoding the proteins comprising the host cell's endogenous oligosaccharyltransferase (OTase) complex, which includes expression of the endogenous gene encoding the host cell's STT3. Each expression cassette encoding a single-subunit OTase can either be integrated into any site within the host cell genome or located in the extrachromosomal space of the host cell, i.e., autonomously replicating genetic elements such as plasmids, viruses, 2 μm plasmid, minichromosomes, and the like.
[0012] The present invention has been exemplified herein using Pichia pastoris host cells genetically engineered to produce mammalian- or human-like complex N-glycans; however, the present invention can be applied to other yeast or filamentous fungal host cells, in particular, yeast or filamentous fungi genetically engineered to produce mammalian- or human-like complex or hybrid N-glycans, to improve the overall N-glycosylation site occupancy of glycoproteins produced in the yeast or filamentous fungus host cell. In further aspects, the host cells are yeast or filamentous fungi that produce recombinant heterologous proteins that have wild-type or endogenous host cell N-glycosylation patterns, e.g., hypermannosylated or high mannose N-glycans. In further aspects, the host cells are yeast or filamentous fungi that lack alpha-1,6-mannosyltransferase activity (e.g., och1p activity in the case of various yeast strains such as but not limited to Saccharomyces cerevisiae or Pichia pastoris) and thus produce recombinant heterologous proteins that have high mannose N-glycans. Furthermore, the present invention can also be applied to plant and mammalian expression systems to improve the overall N-glycosylation site occupancy of glycoproteins produced in these plant or mammalian expression systems, particularly glycoproteins that have more than two N-linked glycosylation sites.
[0013] Therefore, in one aspect of the above, provided is a method for producing a heterologous glycoprotein in a recombinant host cell, comprising providing a recombinant host cell that includes one or more nucleic acid molecules encoding one or more heterologous single-subunit oligosaccharyltransferases and a nucleic acid molecule encoding the heterologous glycoprotein; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein.
[0014] In a further aspect of the above, provided is a method for producing a heterologous glycoprotein with mammalian- or human-like complex or hybrid N-glycans in a host cell, comprising providing a host cell that includes one or more nucleic acid molecules encoding one or more heterologous single-subunit oligosaccharyltransferases and a nucleic acid molecule encoding the heterologous glycoprotein; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein.
[0015] In general, in the above aspects, the endogenous host cell genes encoding the proteins comprising the endogenous oligosaccharyltransferase (OTase) complex are expressed.
[0016] In further aspects of the above method, the host cell is selected from the group consisting of Pichia pastoris, Pichia finlandica, Pichia trehalophila, Pichia kodamae, Pichia membranaefaciens, Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia quercuum, Pichia pijperi, Pichia stipitis, Pichia methanolica, Pichia minuta (Ogataea minuta, Pichia lindneri), Pichia sp., Saccharomyces cerevisiae, Saccharomyces sp., Hansenula polymorpha, Kluyveromyces sp., Kluyveromyces lactis, Candida albicans, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Trichoderma reesei, Chrysosporium lucknowense, Fusarium sp., Fusarium gramineum, Fusarium venenatum, and Neurospora crassa. In other aspects, the host cell is an insect, plant or mammalian host cell.
[0017] In a further aspect of the above, provided is a method for producing a heterologous glycoprotein in a lower eukaryote host cell, comprising providing a recombinant lower eukaryote host cell that includes at least one nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase and a nucleic acid molecule encoding the heterologous glycoprotein and wherein the endogenous host cell genes encoding the proteins comprising the endogenous oligosaccharyltransferase (OTase) complex are expressed; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein.
[0018] In further aspects of the above method, the lower eukaryote host cell is selected from the group consisting of Pichia pastoris, Pichia finlandica, Pichia trehalophila, Pichia kodamae, Pichia membranaefaciens, Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia quercuum, Pichia pijperi, Pichia stipitis, Pichia methanolica, Pichia minuta (Ogataea minuta, Pichia lindneri), Pichia sp., Saccharomyces cerevisiae, Saccharomyces sp., Hansenula polymorpha, Kluyveromyces sp., Kluyveromyces lactis, Candida albicans, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Trichoderma reesei, Chrysosporium lucknowense, Fusarium sp., Fusarium gramineum, Fusarium venenatum, and Neurospora crassa.
[0019] In a further aspect of the above, provided is a method for producing a heterologous glycoprotein in a recombinant yeast host cell, comprising providing a recombinant yeast host cell that includes at least one nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase and a nucleic acid molecule encoding the heterologous glycoprotein and wherein the endogenous host cell genes encoding the proteins comprising the endogenous oligosaccharyltransferase (OTase) complex are expressed; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein.
[0020] In the above methods, the recombinant yeast host cell either produces the glycoprotein with a yeast N-glycan pattern or the yeast has been genetically engineered to produce glycoproteins with a yeast pattern that includes high mannose N-glycans but which lack hypermannosylation. For example, the yeast can be genetically engineered to lack α1,6-mannosyltransferase activity, e.g., Och1p activity. In further aspects, the yeast is genetically engineered to produce glycoproteins that have mammalian or human-like N-glycans.
[0021] In particular embodiments, the single-subunit oligosaccharyltransferase is the Leishmania sp. STT3A protein, STT3B protein, STT3C protein, STT3D protein, or combinations thereof. In particular embodiments, the single-subunit oligosaccharyltransferase is the Leishmania major STT3A protein, STT3B protein, STT3C protein, STT3D protein, or combinations thereof. In further embodiments, the single-subunit oligosaccharyltransferase is capable of functionally suppressing the lethal phenotype of a mutation of at least one essential protein of an OTase complex, for example, a yeast OTase complex. In further aspects, the essential protein of the OTase complex is encoded by the Saccharomyces cerevisiae and/or Pichia pastoris STT3 locus, WBP1 locus, OST1 locus, SWP1 locus, or OST2 locus, or homologue thereof. For example, in further aspects, the single-subunit oligosaccharyltransferase is the Leishmania major STT3D protein, which is capable of functionally suppressing (or rescuing or complementing) the lethal phenotype of at least one essential protein of the Saccharomyces cerevisiae OTase complex.
[0022] In further aspects of the above method, the yeast host cell is selected from the group consisting of Pichia pastoris, Pichia finlandica, Pichia trehalophila, Pichia kodamae, Pichia membranaefaciens, Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia quercuum, Pichia pijperi, Pichia stipitis, Pichia methanolica, Pichia minuta (Ogataea minuta, Pichia lindneri), Pichia sp., Saccharomyces cerevisiae, Saccharomyces sp., Hansenula polymorpha, Kluyveromyces sp., Kluyveromyces lactis, and Candida albicans.
[0023] In a further aspect of the above, provided is a method for producing a heterologous glycoprotein in a recombinant yeast host cell, comprising providing a recombinant yeast host cell that includes at least one nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase capable of functionally suppressing the lethal phenotype of a mutation of at least one essential protein of a yeast oligosaccharyltransferase (OTase) complex, and a nucleic acid molecule encoding the heterologous glycoprotein and wherein the endogenous host cell genes encoding the proteins comprising the endogenous oligosaccharyltransferase (OTase) complex are expressed; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein.
[0024] In the above methods, the recombinant yeast host cell either produces the glycoprotein with a yeast N-glycan pattern or the yeast has been genetically engineered to produce glycoproteins with a yeast pattern that includes high mannose N-glycans but which lack hypermannosylation. For example, the yeast can be genetically engineered to lack α1,6-mannosyltransferase activity, e.g., Och1p activity. In further aspects, the yeast is genetically engineered to produce glycoproteins that have mammalian or human-like N-glycans.
[0025] In particular embodiments, the host cell further includes one or more nucleic acid molecules encoding the Leishmania sp. STT3A protein, STT3B protein, STT3C protein, STT3D protein, or combinations thereof. In particular embodiments, the host cell further includes one or more nucleic acids encoding the Leishmania major STT3A protein, STT3B protein, STT3C protein, or combinations thereof.
[0026] In further aspects of the above method, the yeast host cell is selected from the group consisting of Pichia pastoris, Pichia finlandica, Pichia trehalophila, Pichia kodamae, Pichia membranaefaciens, Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia quercuum, Pichia pijperi, Pichia stipitis, Pichia methanolica, Pichia minuta (Ogataea minuta, Pichia lindneri), Pichia sp., Saccharomyces cerevisiae, Saccharomyces sp., Hansenula polymorpha, Kluyveromyces sp., Kluyveromyces lactis, and Candida albicans.
[0027] In a further aspect of the above, provided is a method for producing a heterologous glycoprotein in a filamentous fungus host cell, comprising providing a filamentous fungus host cell that includes at least one nucleic acid molecule encoding a single-subunit heterologous oligosaccharyltransferase and a nucleic acid molecule encoding the heterologous glycoprotein and wherein the endogenous host cell genes encoding the proteins comprising the endogenous oligosaccharyltransferase (OTase) complex are expressed; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein. The filamentous fungus host cell produces the glycoprotein in which the N-glycans have a filamentous fungus pattern or it is genetically engineered to produce glycoproteins that have mammalian or human-like N-glycans.
[0028] In particular embodiments, the single-subunit oligosaccharyltransferase is the Leishmania sp. STT3A protein, STT3B protein, STT3C protein, STT3D protein, or combinations thereof. In particular embodiments, the single-subunit oligosaccharyltransferase is the Leishmania major STT3A protein, STT3B protein, STT3C protein, STT3D protein or combinations thereof. In further embodiments, the single-subunit oligosaccharyltransferase is capable of functionally suppressing the lethal phenotype of a mutation of at least one essential protein of an OTase complex, for example, a yeast OTase complex. In further aspects, the essential protein of the OTase complex is encoded by the Saccharomyces cerevisiae and/or Pichia pastoris STT3 locus, WBP1 locus, OST1 locus, SWP1 locus, or OST2 locus, or homologue thereof. For example, in further aspects, the single-subunit oligosaccharyltransferase is the Leishmania major STT3D protein, which is capable of functionally suppressing (or rescuing or complementing) the lethal phenotype of at least one essential protein of the Saccharomyces cerevisiae OTase complex.
[0029] In further aspects of the above, the filamentous fungus host cell is selected from the group consisting of Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Trichoderma reesei, Chrysosporium lucknowense, Fusarium sp., Fusarium gramineum, Fusarium venenatum, and Neurospora crassa.
[0030] In further embodiments of any one of the above methods, the host cell is genetically engineered to produce glycoproteins comprising one or more mammalian- or human-like complex N-glycans selected from G0, G1, G2, A1, or A2. In further embodiments, the host cell is genetically engineered to produce glycoproteins comprising one or more mammalian- or human-like complex N-glycans that have bisected N-glycans or have multiantennary N-glycans. In other embodiments, the host cell is genetically engineered to produce glycoproteins comprising one or more mammalian- or human-like hybrid N-glycans selected from GlcNAcMan3GlcNAc2; GalGlcNAcMan3GlcNAc2; NANAGalGlcNAcMan3GlcNAc2; Man5GlcNAc2, GlcNAcMan5GlcNAc2, GalGlcNAcMan5GlcNAc2, and NANAGalGlcNAcMan5GlcNAc2. In further embodiments, the N-glycan structure consists of the G-2 structure Man3GlcNAc2.
[0031] In particular embodiments of any one of the above methods, the heterologous glycoprotein can be, for example, erythropoietin (EPO); cytokines such as interferon α, interferon β, interferon γ, and interferon ω; and granulocyte-colony stimulating factor (GCSF); granulocyte macrophage-colony stimulating factor (GM-CSF); coagulation factors such as factor VIII, factor IX, and human protein C; antithrombin III; thrombin; soluble IgE receptor α-chain; immunoglobulins such as IgG, IgG fragments, IgG fusions, and IgM; immunoadhesions and other Fc fusion proteins such as soluble TNF receptor-Fc fusion proteins; RAGE-Fc fusion proteins; interleukins; urokinase; chymase; urea trypsin inhibitor; IGF-binding protein; epidermal growth factor; growth hormone-releasing factor; annexin V fusion protein; angiostatin; vascular endothelial growth factor-2; myeloid progenitor inhibitory factor-1; osteoprotegerin; α-1-antitrypsin; α-feto proteins; DNase II; kringle 3 of human plasminogen; glucocerebrosidase; TNF binding protein 1; follicle stimulating hormone; cytotoxic T lymphocyte associated antigen 4-Ig; transmembrane activator and calcium modulator and cyclophilin ligand; glucagon like protein 1; or IL-2 receptor agonist.
[0032] In further embodiments of any one of the above methods, the heterologous protein is an antibody, examples of which include, but are not limited to, an anti-Her2 antibody, anti-RSV (respiratory syncytial virus) antibody, anti-TNFα antibody, anti-VEGF antibody, anti-CD3 receptor antibody, anti-CD41 7E3 antibody, anti-CD25 antibody, anti-CD52 antibody, anti-CD33 antibody, anti-IgE antibody, anti-CD11a antibody, anti-EGF receptor antibody, or anti-CD20 antibody.
[0033] In particular aspects of any one of the above methods, the host cell includes one or more nucleic acid molecules encoding one or more catalytic domains of a glycosidase, mannosidase, or glycosyltransferase activity derived from a member of the group consisting of UDP-GlcNAc transferase (GnT) I, GnT II, GnT III, GnT IV, GnT V, GnT VI, UDP-galactosyltransferase (GalT), fucosyltransferase, and sialyltransferase. In particular embodiments, the mannosidase is selected from the group consisting of C. elegans mannosidase IA, C. elegans mannosidase IB, D. melanogaster mannosidase IA, H. sapiens mannosidase IB, P. citrinum mannosidase I, mouse mannosidase IA, mouse mannosidase IB, A. nidulans mannosidase IA, A. nidulans mannosidase IB, A. nidulans mannosidase IC, mouse mannosidase II, C. elegans mannosidase II, H. sapiens mannosidase II, and mannosidase III.
[0034] In certain aspects of any one of the above methods, at least one catalytic domain is localized by forming a fusion protein comprising the catalytic domain and a cellular targeting signal peptide. The fusion protein can be encoded by at least one genetic construct formed by the in-frame ligation of a DNA fragment encoding a cellular targeting signal peptide with a DNA fragment encoding a catalytic domain having enzymatic activity. Examples of targeting signal peptides include, but are not limited to, membrane-bound proteins of the ER or Golgi, retrieval signals, Type II membrane proteins, Type I membrane proteins, membrane spanning nucleotide sugar transporters, mannosidases, sialyltransferases, glucosidases, mannosyltransferases, and phospho-mannosyltransferases.
[0035] In particular aspects of any one of the above methods, the host cell further includes one or more nucleic acid molecules encoding one or more enzymes selected from the group consisting of UDP-GlcNAc transporter, UDP-galactose transporter, GDP-fucose transporter, CMP-sialic acid transporter, and nucleotide diphosphatases.
[0036] In further aspects of any one of the above methods, the host cell includes one or more nucleic acid molecules encoding an α1,2-mannosidase activity, a UDP-GlcNAc transferase (GnT) I activity, a mannosidase II activity, and a GnT II activity.
[0037] In further still aspects of any one of the above methods, the host cell includes one or more nucleic acid molecules encoding an α1,2-mannosidase activity, a UDP-GlcNAc transferase (GnT) I activity, a mannosidase II activity, a GnT II activity, and a UDP-galactosyltransferase (GalT) activity.
[0038] In further still aspects of any one of the above methods, the host cell is deficient in the activity of one or more enzymes selected from the group consisting of mannosyltransferases and phosphomannosyltransferases. In further still aspects, the host cell does not express an enzyme selected from the group consisting of 1,6 mannosyltransferase, 1,3 mannosyltransferase, and 1,2 mannosyltransferase.
[0039] In a particular aspect of any one of the above methods, the host cell is an och1 mutant of Pichia pastoris.
[0040] Further provided is a host cell, comprising (a) a first nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase; and (b) a second nucleic acid molecule encoding a heterologous glycoprotein; and the endogenous host cell genes encoding the proteins comprising the endogenous oligosaccharyltransferase (OTase) complex are expressed, which includes expression of the endogenous host cell STT3 gene.
[0041] Further provided is a lower eukaryotic host cell, comprising (a) a first nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase; and (b) a second nucleic acid molecule encoding a heterologous glycoprotein; and the endogenous host cell genes encoding the proteins comprising the endogenous oligosaccharyltransferase (OTase) complex are expressed.
[0042] Further provided is a yeast host cell, comprising (a) a first nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase; and (b) a second nucleic acid molecule encoding a heterologous glycoprotein; and the endogenous host cell genes encoding the proteins comprising the endogenous oligosaccharyltransferase (OTase) complex are expressed.
[0043] Further provided is a yeast host cell, comprising (a) a first nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase capable of functionally suppressing the lethal phenotype of a mutation of at least one essential protein of a yeast oligosaccharyltransferase (OTase) complex; and (b) a second nucleic acid molecule encoding a heterologous glycoprotein; and the endogenous host cell genes encoding the proteins comprising the endogenous oligosaccharyltransferase (OTase) complex are expressed.
[0044] Further provided is a filamentous fungus host cell comprising (a) a first nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase; and (b) a second nucleic acid molecule encoding a heterologous glycoprotein; and the endogenous host cell genes encoding the proteins comprising the endogenous oligosaccharyltransferase (OTase) complex are expressed.
[0045] Further provided is a filamentous fungus host cell, comprising (a) a first nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase capable of functionally suppressing the lethal phenotype of a mutation of at least one essential protein of a yeast or filamentous fungus oligosaccharyltransferase (OTase) complex; and (b) a second nucleic acid molecule encoding a heterologous glycoprotein; and the endogenous host cell genes encoding the proteins comprising the endogenous oligosaccharyltransferase (OTase) complex are expressed.
[0046] In particular embodiments, the single-subunit oligosaccharyltransferase is the Leishmania sp. STT3A protein, STT3B protein, STT3C protein, STT3D protein, or combinations thereof. In particular embodiments, the single-subunit oligosaccharyltransferase is the Leishmania major STT3A protein, STT3B protein, STT3C protein, STT3D protein, or combinations thereof. In further embodiments, the single-subunit oligosaccharyltransferase is capable of functionally suppressing the lethal phenotype of a mutation of at least one essential protein of an OTase complex, for example, a yeast OTase complex. In further aspects, the essential protein of the OTase complex is encoded by the Saccharomyces cerevisiae and/or Pichia pastoris STT3 locus, WBP1 locus, OST1 locus, SWP1 locus, or OST2 locus, or homologue thereof. For example, in further aspects, the single-subunit oligosaccharyltransferase is the Leishmania major STT3D protein, which is capable of functionally suppressing (or rescuing or complementing) the lethal phenotype of at least one essential protein of the Saccharomyces cerevisiae OTase complex.
[0047] In further aspects, the above host cells further include one or more nucleic acid molecules encoding a Leishmania sp. STT3A protein, STT3B protein, STT3C protein, or combinations thereof.
[0048] In further embodiments of any one of the above host cells, the host cell is genetically engineered to produce glycoproteins comprising one or more mammalian- or human-like complex N-glycans selected from G0, G1, G2, A1, or A2. In further embodiments, the host cell is genetically engineered to produce glycoproteins comprising one or more human-like complex N-glycans that are bisected N-glycans or multiantennary N-glycans. In other embodiments, the host cell is genetically engineered to produce glycoproteins comprising one or more mammalian- or human-like hybrid N-glycans selected from GlcNAcMan3GlcNAc2; GalGlcNAcMan3GlcNAc2; NANAGalGlcNAcMan3GlcNAc2; Man5GlcNAc2; GlcNAcMan5GlcNAc2; GalGlcNAcMan5GlcNAc2; and NANAGalGlcNAcMan5GlcNAc2. In further embodiments, the N-glycan structure consists of the G-2 structure Man3GlcNAc2.
[0049] In particular embodiments of any one of the above host cells, the heterologous glycoprotein can be, for example, selected from the group consisting of erythropoietin (EPO); cytokines such as interferon α, interferon β, interferon γ, and interferon ω; granulocyte-colony stimulating factor (GCSF); granulocyte macrophage-colony stimulating factor (GM-CSF); coagulation factors such as factor VIII, factor IX, and human protein C; antithrombin III; thrombin; soluble IgE receptor α-chain; immunoglobulins such as IgG, IgG fragments, IgG fusions, and IgM; immunoadhesions and other Fc fusion proteins such as soluble TNF receptor-Fc fusion proteins; RAGE-Fc fusion proteins; interleukins; urokinase; chymase; urea trypsin inhibitor; IGF-binding protein; epidermal growth factor; growth hormone-releasing factor; annexin V fusion protein; angiostatin; vascular endothelial growth factor-2; myeloid progenitor inhibitory factor-1; osteoprotegerin; α-1-antitrypsin; α-feto proteins; DNase II; kringle 3 of human plasminogen; glucocerebrosidase; TNF binding protein 1; follicle stimulating hormone; cytotoxic T lymphocyte associated antigen 4-Ig; transmembrane activator and calcium modulator and cyclophilin ligand; glucagon like protein 1; and IL-2 receptor agonist.
[0050] In further embodiments of any one of the above host cells, the heterologous protein is an antibody, examples of which include, but are not limited to, an anti-Her2 antibody, anti-RSV (respiratory syncytial virus) antibody, anti-TNFα antibody, anti-VEGF antibody, anti-CD3 receptor antibody, anti-CD41 7E3 antibody, anti-CD25 antibody, anti-CD52 antibody, anti-CD33 antibody, anti-IgE antibody, anti-CD11a antibody, anti-EGF receptor antibody, or anti-CD20 antibody.
[0051] In particular aspects of the above host cells, the host cell includes one or more nucleic acid molecules encoding one or more catalytic domains of a glycosidase, mannosidase, or glycosyltransferase activity derived from a member of the group consisting of UDP-GlcNAc transferase (GnT) I, GnT II, GnT III, GnT IV, GnT V, GnT VI, UDP-galactosyltransferase (GalT), fucosyltransferase, and sialyltransferase. In particular embodiments, the mannosidase is selected from the group consisting of C. elegans mannosidase IA, C. elegans mannosidase IB, D. melanogaster mannosidase IA, H. sapiens mannosidase IB, P. citrinum mannosidase I, mouse mannosidase IA, mouse mannosidase IB, A. nidulans mannosidase IA, A. nidulans mannosidase IB, A. nidulans mannosidase IC, mouse mannosidase II, C. elegans mannosidase II, H. sapiens mannosidase II, and mannosidase III.
[0052] In certain aspects of any one of the above host cells, at least one catalytic domain is localized by forming a fusion protein comprising the catalytic domain and a cellular targeting signal peptide. The fusion protein can be encoded by at least one genetic construct formed by the in-frame ligation of a DNA fragment encoding a cellular targeting signal peptide with a DNA fragment encoding a catalytic domain having enzymatic activity. Examples of targeting signal peptides include, but are not limited to, those to membrane-bound proteins of the ER or Golgi, retrieval signals such as HDEL or KDEL, Type II membrane proteins, Type I membrane proteins, membrane spanning nucleotide sugar transporters, mannosidases, sialyltransferases, glucosidases, mannosyltransferases, and phospho-mannosyltransferases.
[0053] In particular aspects of any one of the above host cells, the host cell further includes one or more nucleic acid molecules encoding one or more enzymes selected from the group consisting of UDP-GlcNAc transporter, UDP-galactose transporter, GDP-fucose transporter, CMP-sialic acid transporter, and nucleotide diphosphatases.
[0054] In further aspects of any one of the above host cells, the host cell includes one or more nucleic acid molecules encoding an α1,2-mannosidase activity, a UDP-GlcNAc transferase (GnT) I activity, a mannosidase II activity, and a GnT II activity.
[0055] In further still aspects of any one of the above host cells, the host cell includes one or more nucleic acid molecules encoding an α1,2-mannosidase activity, a UDP-GlcNAc transferase (GnT) I activity, a mannosidase II activity, a GnT II activity, and a UDP-galactosyltransferase (GalT) activity.
[0056] In further aspects of any one of the above host cells, the host cell is selected from the group consisting of Pichia pastoris, Pichia finlandica, Pichia trehalophila, Pichia kodamae, Pichia membranaefaciens, Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia quercuum, Pichia pijperi, Pichia stipitis, Pichia methanolica, Pichia sp., Saccharomyces cerevisiae, Saccharomyces sp., Hansenula polymorpha, Kluyveromyces sp., Kluyveromyces lactis, Candida albicans, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Trichoderma reesei, Chrysosporium lucknowense, Fusarium sp., Fusarium gramineum, Fusarium venenatum, Neurospora crassa, plant cells, insect cells, and mammalian cells.
[0057] In further still aspects of any one of the above host cells, the host cell is deficient in the activity of one or more enzymes selected from the group consisting of mannosyltransferases and phosphomannosyltransferases. In further still aspects, the host cell does not express an enzyme selected from the group consisting of 1,6 mannosyltransferase, 1,3 mannosyltransferase, and 1,2 mannosyltransferase.
[0058] In a particular aspect of any one of the above host cells, the host cell is Pichia pastoris. In a further aspect, the host cell is an och1 mutant of Pichia pastoris. The methods and host cells herein can be used to produce glycoprotein compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the N-glycosylation sites of the glycoproteins in the composition are occupied.
[0059] Further, the methods and host cells herein can be used to produce glycoprotein compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the N-glycosylation sites of the glycoproteins in the composition are occupied and which in further aspects have mammalian- or human-like N-glycans that lack fucose.
[0060] Further, the methods and yeast or filamentous fungus host cells genetically engineered to produce mammalian-like or human-like N-glycans can be used to produce glycoprotein compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the N-glycosylation sites of the glycoproteins in the composition are occupied and which in further aspects have mammalian- or human-like N-glycans that lack fucose.
[0061] In some aspects, the yeast or filamentous fungus host cells genetically engineered to produce fucosylated mammalian- or human-like N-glycans can be used to produce glycoprotein compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the N-glycosylation sites of the glycoproteins in the composition are occupied and which in further aspects have mammalian- or human-like N-glycans that have fucose.
[0062] The methods and host cells herein can be used to produce antibody compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the antibody molecules in the compositions have both N-glycosylation sites occupied.
[0063] Further, the methods and host cells herein can be used to produce antibody compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the antibody molecules in the compositions have both N-glycosylation sites occupied and the N-glycans lack fucose.
[0064] Further, the methods and yeast or filamentous fungus host cells herein can be used to produce antibody compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the antibody molecules in the compositions have both N-glycosylation sites occupied and the N-glycans lack fucose.
[0065] Further, the methods and yeast or filamentous fungus host cells genetically engineered to produce mammalian-like or human-like N-glycans can be used to produce antibody compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the antibody molecules in the compositions have both N-glycosylation sites occupied and the antibodies have mammalian- or human-like N-glycans that lack fucose. In some aspects, the yeast or filamentous fungus host cells genetically engineered to produce fucosylated mammalian- or human-like N-glycans can be used to produce antibody compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the antibody molecules in the compositions have both N-glycosylation sites occupied and the antibodies have mammalian- or human-like N-glycans with fucose.
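The occupancy figures recited for antibody compositions can be illustrated with a short calculation. The following is a minimal sketch, assuming that an analytical method (not specified here) has already sorted intact antibody molecules into those with zero, one, or two occupied N-glycosylation sites. The function names and the example counts are hypothetical and are not taken from the disclosure.

```python
# Thresholds recited in the text for the share of antibody molecules
# having both N-glycosylation sites occupied.
THRESHOLDS = (70, 75, 80, 85, 90, 95, 98, 99)


def percent_fully_occupied(n_zero, n_one, n_two):
    """Percent of intact antibody molecules with both sites occupied.

    n_zero, n_one, n_two are amounts (counts or relative abundances) of
    molecules with 0, 1, and 2 occupied N-glycosylation sites.
    """
    total = n_zero + n_one + n_two
    if total <= 0:
        raise ValueError("total amount of antibody must be positive")
    return 100.0 * n_two / total


def thresholds_met(percent):
    """Return the recited 'at least X%' thresholds satisfied by `percent`."""
    return [t for t in THRESHOLDS if percent >= t]


# Hypothetical example: 2 unoccupied, 10 half-occupied, 88 fully occupied.
occupancy = percent_fully_occupied(2, 10, 88)
print(occupancy, thresholds_met(occupancy))
```

The sketch counts whole molecules rather than individual sites, because the recited percentages refer to antibody molecules having both sites occupied.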
[0066] Further provided is a glycoprotein composition comprising a plurality of antibodies wherein about 70% to about 99% of the intact antibody molecules in the composition have both N-glycosylation sites occupied and about 50-70 mole % of the N-glycans have a G0 structure, 15-25 mole % of the N-glycans have a G1 structure, 4-12 mole % of the N-glycans have a G2 structure, 5-17 mole % of the N-glycans have a Man5 structure, and 3-15 mole % of the N-glycans have a hybrid structure, and a pharmaceutically acceptable carrier.
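As an illustration only, the mole % ranges recited in the preceding paragraph can be written as a simple lookup and compared against a measured N-glycan profile. This is a sketch under stated assumptions: the term "about" is treated as an exact inclusive bound, which the text does not say, and the dictionary keys, function name, and example profile are hypothetical.

```python
# Mole % ranges for each N-glycan class as recited in the paragraph above.
# Assumption: bounds are treated as exact and inclusive.
RECITED_RANGES = {
    "G0": (50.0, 70.0),
    "G1": (15.0, 25.0),
    "G2": (4.0, 12.0),
    "Man5": (5.0, 17.0),
    "hybrid": (3.0, 15.0),
}


def within_recited_ranges(profile):
    """True if every class in RECITED_RANGES falls inside its range.

    `profile` maps class name to measured mole %; a missing class counts as 0.
    """
    return all(
        low <= profile.get(name, 0.0) <= high
        for name, (low, high) in RECITED_RANGES.items()
    )


# Hypothetical profile (mole %), for illustration only.
example = {"G0": 55.0, "G1": 20.0, "G2": 8.0, "Man5": 10.0, "hybrid": 7.0}
print(within_recited_ranges(example))
```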
[0067] Further still provided is a glycoprotein composition comprising a plurality of antibodies wherein about 70% to 99% of the intact antibody molecules in the composition have both N-glycosylation sites occupied and about 53 to 58 mole % of the N-glycans have a G0 structure, 20-22 mole % of the N-glycans have a G1 structure, and about 16 to 18 mole % of the N-glycans comprise a Man5GlcNAc2 core structure, and a pharmaceutically acceptable carrier.
[0068] In particular embodiments, the antibodies comprise an antibody selected from the group consisting of anti-Her2 antibody, anti-RSV (respiratory syncytial virus) antibody, anti-TNFα antibody, anti-VEGF antibody, anti-CD3 receptor antibody, anti-CD41 7E3 antibody, anti-CD25 antibody, anti-CD52 antibody, anti-CD33 antibody, anti-IgE antibody, anti-CD11a antibody, anti-EGF receptor antibody, and anti-CD20 antibody.
[0069] Further provided are compositions comprising one or more glycoproteins produced by the host cells and methods described herein.
[0070] In particular embodiments, the glycoprotein compositions provided herein comprise glycoproteins having fucosylated and non-fucosylated hybrid and complex N-glycans, including bisected and multiantennary species, including but not limited to N-glycans such as GlcNAc(1-4)Man3GlcNAc2; Gal(1-4)GlcNAc(1-4)Man3GlcNAc2; and NANA(1-4)Gal(1-4)GlcNAc(1-4)Man3GlcNAc2.
[0071] In particular embodiments, the glycoprotein compositions provided herein comprise glycoproteins having at least one hybrid N-glycan selected from the group consisting of GlcNAcMan3GlcNAc2; GalGlcNAcMan3GlcNAc2; NANAGalGlcNAcMan3GlcNAc2; GlcNAcMan5GlcNAc2; GalGlcNAcMan5GlcNAc2; and NANAGalGlcNAcMan5GlcNAc2. In particular aspects, the hybrid N-glycan is the predominant N-glycan species in the composition. In further aspects, the hybrid N-glycan is a particular N-glycan species that comprises about 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99%, or 100% of the hybrid N-glycans in the composition.
[0072] In particular embodiments, the glycoprotein compositions provided herein comprise glycoproteins having at least one complex N-glycan selected from the group consisting of GlcNAc2Man3GlcNAc2; GalGlcNAc2Man3GlcNAc2; Gal2GlcNAc2Man3GlcNAc2; NANAGal2GlcNAc2Man3GlcNAc2; and NANA2Gal2GlcNAc2Man3GlcNAc2. In particular aspects, the complex N-glycan is the predominant N-glycan species in the composition. In further aspects, the complex N-glycan is a particular N-glycan species that comprises about 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99%, or 100% of the complex N-glycans in the composition.
[0073] In particular embodiments, the N-glycan is fucosylated. In general, the fucose is in an α1,3-linkage with the GlcNAc at the reducing end of the N-glycan, an α1,6-linkage with the GlcNAc at the reducing end of the N-glycan, an α1,2-linkage with the Gal at the non-reducing end of the N-glycan, an α1,3-linkage with the GlcNAc at the non-reducing end of the N-glycan, or an α1,4-linkage with a GlcNAc at the non-reducing end of the N-glycan.
[0074] Therefore, in particular aspects of the above glycoprotein compositions, the fucose is in an α1,3-linkage or α1,6-linkage to produce a glycoform selected from the group consisting of GlcNAcMan5GlcNAc2(Fuc), GlcNAcMan3GlcNAc2(Fuc), GlcNAc2Man3GlcNAc2(Fuc), GalGlcNAc2Man3GlcNAc2(Fuc), Gal2GlcNAc2Man3GlcNAc2(Fuc), NANAGal2GlcNAc2Man3GlcNAc2(Fuc), and NANA2Gal2GlcNAc2Man3GlcNAc2(Fuc); in an α1,3-linkage or α1,4-linkage to produce a glycoform selected from the group consisting of GlcNAc(Fuc)Man5GlcNAc2, GlcNAc(Fuc)Man3GlcNAc2, GlcNAc2(Fuc1-2)Man3GlcNAc2, GalGlcNAc2(Fuc1-2)Man3GlcNAc2, Gal2GlcNAc2(Fuc1-2)Man3GlcNAc2, NANAGal2GlcNAc2(Fuc1-2)Man3GlcNAc2, and NANA2Gal2GlcNAc2(Fuc1-2)Man3GlcNAc2; or in an α1,2-linkage to produce a glycoform selected from the group consisting of Gal(Fuc)GlcNAc2Man3GlcNAc2, Gal2(Fuc1-2)GlcNAc2Man3GlcNAc2, NANAGal2(Fuc1-2)GlcNAc2Man3GlcNAc2, and
[0075] NANA2Gal2(Fuc1-2)GlcNAc2Man3GlcNAc2.
[0076] In further aspects of the above, the complex N-glycans further include fucosylated and non-fucosylated bisected and multiantennary species.
[0077] In further aspects, the glycoproteins comprise high mannose N-glycans, including but not limited to, Man5GlcNAc2, or N-glycans that consist of the Man3GlcNAc2 N-glycan structure.
DEFINITIONS
[0078] As used herein, the terms "N-glycan" and "glycoform" are used interchangeably and refer to an N-linked oligosaccharide, for example, one that is attached by an asparagine-N-acetylglucosamine linkage to an asparagine residue of a polypeptide. N-linked glycoproteins contain an N-acetylglucosamine residue linked to the amide nitrogen of an asparagine residue in the protein. The predominant sugars found on glycoproteins are glucose, galactose, mannose, fucose, N-acetylgalactosamine (GalNAc), N-acetylglucosamine (GlcNAc) and sialic acid (e.g., N-acetyl-neuraminic acid (NANA)). The processing of the sugar groups occurs co-translationally in the lumen of the ER and continues post-translationally in the Golgi apparatus for N-linked glycoproteins.
[0079] N-glycans have a common pentasaccharide core of Man3GlcNAc2 ("Man" refers to mannose; "Glc" refers to glucose; and "NAc" refers to N-acetyl; GlcNAc refers to N-acetylglucosamine). Usually, N-glycan structures are presented with the non-reducing end to the left and the reducing end to the right. The reducing end of the N-glycan is the end that is attached to the Asn residue comprising the glycosylation site on the protein. N-glycans differ with respect to the number of branches (antennae) comprising peripheral sugars (e.g., GlcNAc, galactose, fucose and sialic acid) that are added to the Man3GlcNAc2 ("Man3") core structure, which is also referred to as the "trimannose core", the "pentasaccharide core" or the "paucimannose core". N-glycans are classified according to their branched constituents (e.g., high mannose, complex or hybrid). A "high mannose" type N-glycan has five or more mannose residues. A "complex" type N-glycan typically has at least one GlcNAc attached to the 1,3 mannose arm and at least one GlcNAc attached to the 1,6 mannose arm of a "trimannose" core. Complex N-glycans may also have galactose ("Gal") or N-acetylgalactosamine ("GalNAc") residues that are optionally modified with sialic acid or derivatives (e.g., "NANA" or "NeuAc", where "Neu" refers to neuraminic acid and "Ac" refers to acetyl). Complex N-glycans may also have intrachain substitutions comprising "bisecting" GlcNAc and core fucose ("Fuc"). Complex N-glycans may also have multiple antennae on the "trimannose core," often referred to as "multiple antennary glycans." A "hybrid" N-glycan has at least one GlcNAc on the terminal of the 1,3 mannose arm of the trimannose core and zero or more mannoses on the 1,6 mannose arm of the trimannose core. The various N-glycans are also referred to as "glycoforms."
[0080] With respect to complex N-glycans, the terms "G-2", "G-1", "G0", "G1", "G2", "A1", and "A2" mean the following. "G-2" refers to an N-glycan structure that can be characterized as Man3GlcNAc2; the term "G-1" refers to an N-glycan structure that can be characterized as GlcNAcMan3GlcNAc2; the term "G0" refers to an N-glycan structure that can be characterized as GlcNAc2Man3GlcNAc2; the term "G1" refers to an N-glycan structure that can be characterized as GalGlcNAc2Man3GlcNAc2; the term "G2" refers to an N-glycan structure that can be characterized as Gal2GlcNAc2Man3GlcNAc2; the term "A1" refers to an N-glycan structure that can be characterized as NANAGal2GlcNAc2Man3GlcNAc2; and, the term "A2" refers to an N-glycan structure that can be characterized as NANA2Gal2GlcNAc2Man3GlcNAc2. Unless otherwise indicated, the terms "G-2", "G-1", "G0", "G1", "G2", "A1", and "A2" refer to N-glycan species that lack fucose attached to the GlcNAc residue at the reducing end of the N-glycan. When the term includes an "F", the "F" indicates that the N-glycan species contains a fucose residue on the GlcNAc residue at the reducing end of the N-glycan. For example, G0F, G1F, G2F, A1F, and A2F all indicate that the N-glycan further includes a fucose residue attached to the GlcNAc residue at the reducing end of the N-glycan. Lower eukaryotes such as yeast and filamentous fungi do not normally produce N-glycans that contain fucose.
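For readers who work with these shorthand terms in data analysis, the definitions above can be captured in a small lookup table. The following is a minimal sketch; the function name is hypothetical, and the "(Fuc)" suffix used for reducing-end fucose simply follows the notation of the fucosylated glycoforms listed earlier in this description.

```python
# Shorthand terms and the compositions given for them in the paragraph above.
GLYCAN_SHORTHAND = {
    "G-2": "Man3GlcNAc2",
    "G-1": "GlcNAcMan3GlcNAc2",
    "G0": "GlcNAc2Man3GlcNAc2",
    "G1": "GalGlcNAc2Man3GlcNAc2",
    "G2": "Gal2GlcNAc2Man3GlcNAc2",
    "A1": "NANAGal2GlcNAc2Man3GlcNAc2",
    "A2": "NANA2Gal2GlcNAc2Man3GlcNAc2",
}


def expand_shorthand(name):
    """Return the composition for a shorthand term.

    A trailing "F" (for example "G0F") marks a fucose on the GlcNAc residue
    at the reducing end and is rendered here as a "(Fuc)" suffix.
    Unknown terms raise KeyError.
    """
    fucosylated = name.endswith("F")
    base = name[:-1] if fucosylated else name
    structure = GLYCAN_SHORTHAND[base]
    return structure + "(Fuc)" if fucosylated else structure


print(expand_shorthand("G0"))
print(expand_shorthand("G1F"))
```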
[0081] With respect to multiantennary N-glycans, the term "multiantennary N-glycan" refers to N-glycans that further comprise a GlcNAc residue on the mannose residue comprising the non-reducing end of the 1,6 arm or the 1,3 arm of the N-glycan or a GlcNAc residue on each of the mannose residues comprising the non-reducing end of the 1,6 arm and the 1,3 arm of the N-glycan. Thus, multiantennary N-glycans can be characterized by the formulas GlcNAc(2-4)Man3GlcNAc2, Gal(1-4)GlcNAc(2-4)Man3GlcNAc2, or NANA(1-4)Gal(1-4)GlcNAc(2-4)Man3GlcNAc2. The term "1-4" refers to 1, 2, 3, or 4 residues.
[0082] With respect to bisected N-glycans, the term "bisected N-glycan" refers to N-glycans in which a GlcNAc residue is linked to the mannose residue at the reducing end of the N-glycan. A bisected N-glycan can be characterized by the formula GlcNAc3Man3GlcNAc2 wherein each mannose residue is linked at its non-reducing end to a GlcNAc residue. In contrast, when a multiantennary N-glycan is characterized as GlcNAc3Man3GlcNAc2, the formula indicates that two GlcNAc residues are linked to the mannose residue at the non-reducing end of one of the two arms of the N-glycans and one GlcNAc residue is linked to the mannose residue at the non-reducing end of the other arm of the N-glycan.
[0083] Abbreviations used herein are of common usage in the art, see, e.g., abbreviations of sugars, above. Other common abbreviations include "PNGase", or "glycanase" or "glucosidase" which all refer to peptide N-glycosidase F (EC 3.2.2.18).
[0084] As used herein, the term "glycoprotein" refers to any protein having one or more N-glycans attached thereto. Thus, the term refers both to proteins that are generally recognized in the art as a glycoprotein and to proteins which have been genetically engineered to contain one or more N-linked glycosylation sites.
[0085] As used herein, a "humanized glycoprotein" or a "human-like glycoprotein" refers alternatively to a protein having attached thereto N-glycans having fewer than four mannose residues, and synthetic glycoprotein intermediates (which are also useful and can be manipulated further in vitro or in vivo) having at least five mannose residues. Preferably, glycoproteins produced according to the invention contain at least 30 mole %, preferably at least 40 mole % and more preferably 50, 60, 70, 80, 90, or even 100 mole % of the Man5GlcNAc2 intermediate, at least transiently. This may be achieved, e.g., by engineering a host cell of the invention to express a "better", i.e., a more efficient glycosylation enzyme. For example, a mannosidase is selected such that it will have optimal activity under the conditions present at the site in the host cell where proteins are glycosylated and is introduced into the host cell preferably by targeting the enzyme to a host cell organelle where activity is desired.
[0086] The term "recombinant host cell" ("expression host cell", "expression host system", "expression system" or simply "host cell"), as used herein, is intended to refer to a cell into which a recombinant vector has been introduced. It should be understood that such terms are intended to refer not only to the particular subject cell but to the progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term "host cell" as used herein. A recombinant host cell may be an isolated cell or cell line grown in culture or may be a cell which resides in a living tissue or organism. Preferred host cells are yeasts and fungi.
[0087] When referring to "mole percent" of a glycan present in a preparation of a glycoprotein, the term means the molar percent of a particular glycan present in the pool of N-linked oligosaccharides released when the protein preparation is treated with PNGase and then quantified by a method that is not affected by glycoform composition (for instance, labeling a PNGase-released glycan pool with a fluorescent tag such as 2-aminobenzamide, separating by high performance liquid chromatography or capillary electrophoresis, and then quantifying glycans by fluorescence intensity). For example, 50 mole percent GlcNAc2Man3GlcNAc2Gal2NANA2 means that 50 percent of the released glycans are GlcNAc2Man3GlcNAc2Gal2NANA2 and the remaining 50 percent are comprised of other N-linked oligosaccharides. In embodiments, the mole percent of a particular glycan in a preparation of glycoprotein will be between 20% and 100%, preferably above 25%, 30%, 35%, 40% or 45%, more preferably above 50%, 55%, 60%, 65% or 70% and most preferably above 75%, 80%, 85%, 90% or 95%.
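The mole percent definition above amounts to normalizing each glycan's signal by the total signal of the released pool. The following is a minimal sketch, assuming that each released glycan carries one fluorescent label so that integrated peak area is proportional to molar amount. The function name and the peak areas are hypothetical.

```python
def mole_percent(peak_areas):
    """Convert fluorescence peak areas of released N-glycans to mole percent.

    `peak_areas` maps a glycan name to its integrated peak area.
    Assumption: one label per glycan, so area is proportional to moles.
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical peak areas (arbitrary units), for illustration only.
areas = {"G0": 50.0, "G1": 30.0, "Man5": 20.0}
for glycan, percent in mole_percent(areas).items():
    print(glycan, round(percent, 1))
```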
[0088] The term "operably linked" expression control sequences refers to a linkage in which the expression control sequence is contiguous with the gene of interest to control the gene of interest, as well as expression control sequences that act in trans or at a distance to control the gene of interest.
[0089] The terms "expression control sequence" and "regulatory sequences" are used interchangeably and as used herein refer to polynucleotide sequences which are necessary to affect the expression of coding sequences to which they are operably linked. Expression control sequences are sequences which control the transcription, post-transcriptional events and translation of nucleic acid sequences. Expression control sequences include appropriate transcription initiation, termination, promoter and enhancer sequences; efficient RNA processing signals such as splicing and polyadenylation signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (e.g., ribosome binding sites); sequences that enhance protein stability; and when desired, sequences that enhance protein secretion. The nature of such control sequences differs depending upon the host organism; in prokaryotes, such control sequences generally include promoter, ribosomal binding site, and transcription termination sequence. The term "control sequences" is intended to include, at a minimum, all components whose presence is essential for expression, and can also include additional components whose presence is advantageous, for example, leader sequences and fusion partner sequences.
[0090] The terms "transfect", "transfection", "transfecting" and the like refer to the introduction of a heterologous nucleic acid into eukaryote cells, both higher and lower eukaryote cells. Historically, the term "transformation" has been used to describe the introduction of a nucleic acid into a yeast or fungal cell; however, herein the term "transfection" is used to refer to the introduction of a nucleic acid into any eukaryote cell, including yeast and fungal cells.
[0091] The term "eukaryotic" refers to a nucleated cell or organism, and includes insect cells, plant cells, mammalian cells, animal cells and lower eukaryotic cells.
[0092] The term "lower eukaryotic cells" includes yeast and filamentous fungi. Yeast and filamentous fungi include, but are not limited to, Pichia pastoris, Pichia finlandica, Pichia trehalophila, Pichia kodamae, Pichia membranaefaciens, Pichia minuta (Ogataea minuta, Pichia lindneri), Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia quercuum, Pichia pijperi, Pichia stipitis, Pichia methanolica, Pichia sp., Saccharomyces cerevisiae, Saccharomyces sp., Hansenula polymorpha, Kluyveromyces sp., Kluyveromyces lactis, Candida albicans, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Trichoderma reesei, Chrysosporium lucknowense, Fusarium sp., Fusarium gramineum, Fusarium venenatum, Physcomitrella patens, and Neurospora crassa. Also included are Pichia sp., any Saccharomyces sp., Hansenula polymorpha, any Kluyveromyces sp., Candida albicans, any Aspergillus sp., Trichoderma reesei, Chrysosporium lucknowense, any Fusarium sp., and Neurospora crassa.
[0093] As used herein, the terms "antibody," "immunoglobulin," "immunoglobulins" and "immunoglobulin molecule" are used interchangeably. Each immunoglobulin molecule has a unique structure that allows it to bind its specific antigen, but all immunoglobulins have the same overall structure as described herein. The basic immunoglobulin structural unit is known to comprise a tetramer of subunits. Each tetramer has two identical pairs of polypeptide chains, each pair having one "light" chain (about 25 kDa) and one "heavy" chain (about 50-70 kDa). The amino-terminal portion of each chain includes a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition. The carboxy-terminal portion of each chain defines a constant region primarily responsible for effector function. Light chains are classified as either kappa or lambda. Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, and define the antibody's isotype as IgG, IgM, IgA, IgD, and IgE, respectively.
[0094] The light and heavy chains are subdivided into variable regions and constant regions (See generally, Fundamental Immunology (Paul, W., ed., 2nd ed. Raven Press, N.Y., 1989), Ch. 7). The variable regions of each light/heavy chain pair form the antibody binding site. Thus, an intact antibody has two binding sites. Except in bifunctional or bispecific antibodies, the two binding sites are the same. The chains all exhibit the same general structure of relatively conserved framework regions (FR) joined by three hypervariable regions, also called complementarity determining regions or CDRs. The CDRs from the two chains of each pair are aligned by the framework regions, enabling binding to a specific epitope. The terms include naturally occurring forms, as well as fragments and derivatives. Included within the scope of the term are classes of immunoglobulins (Igs), namely, IgG, IgA, IgE, IgM, and IgD. Also included within the scope of the terms are the subtypes of IgGs, namely, IgG1, IgG2, IgG3, and IgG4. The term is used in the broadest sense and includes single monoclonal antibodies (including agonist and antagonist antibodies) as well as antibody compositions which will bind to multiple epitopes or antigens. The terms specifically cover monoclonal antibodies (including full length monoclonal antibodies), polyclonal antibodies, multispecific antibodies (for example, bispecific antibodies), and antibody fragments so long as they contain or are modified to contain at least the portion of the CH2 domain of the heavy chain immunoglobulin constant region which comprises an N-linked glycosylation site of the CH2 domain, or a variant thereof. Included within the terms are molecules comprising only the Fc region, such as immunoadhesions (U.S. Published Patent Application No. 2004/0136986; the disclosure of which is incorporated herein by reference), Fc fusions, and antibody-like molecules.
[0095] The term "Fc fragment" refers to the 'fragment crystallizable' C-terminal region of the antibody containing the CH2 and CH3 domains. The term "Fab fragment" refers to the 'fragment antigen binding' region of the antibody containing the VH, CH1, VL and CL domains.
[0096] The term "monoclonal antibody" (mAb) as used herein refers to an antibody obtained from a population of substantially homogeneous antibodies, i.e., the individual antibodies comprising the population are identical except for possible naturally occurring mutations that may be present in minor amounts. Monoclonal antibodies are highly specific, being directed against a single antigenic site. Furthermore, in contrast to conventional (polyclonal) antibody preparations which typically include different antibodies directed against different determinants (epitopes), each mAb is directed against a single determinant on the antigen. In addition to their specificity, monoclonal antibodies are advantageous in that they can be produced, for example, by hybridoma culture, uncontaminated by other immunoglobulins. The term "monoclonal" indicates the character of the antibody as being obtained from a substantially homogeneous population of antibodies, and is not to be construed as requiring production of the antibody by any particular method. For example, the monoclonal antibodies to be used in accordance with the present invention may be made by the hybridoma method first described by Kohler et al., (1975) Nature, 256:495, or may be made by recombinant DNA methods (See, for example, U.S. Pat. No. 4,816,567; the disclosure of which is incorporated herein by reference).
[0097] The term "fragments" within the scope of the terms "antibody" or "immunoglobulin" include those produced by digestion with various proteases, those produced by chemical cleavage and/or chemical dissociation and those produced recombinantly, so long as the fragment remains capable of specific binding to a target molecule. Among such fragments are Fc, Fab, Fab', Fv, F(ab')2, and single chain Fv (scFv) fragments. Hereinafter, the term "immunoglobulin" also includes the term "fragments" as well.
[0098] Immunoglobulins further include immunoglobulins or fragments that have been modified in sequence but remain capable of specific binding to a target molecule, including: interspecies chimeric and humanized antibodies; antibody fusions; heteromeric antibody complexes and antibody fusions, such as diabodies (bispecific antibodies), single-chain diabodies, and intrabodies (See, for example, Intracellular Antibodies: Research and Disease Applications (Marasco, ed., Springer-Verlag New York, Inc., 1998)).
[0099] The term "catalytic antibody" refers to immunoglobulin molecules that are capable of catalyzing a biochemical reaction. Catalytic antibodies are well known in the art and have been described in U.S. Pat. Nos. 7,205,136; 4,888,281; 5,037,750 to Schochetman et al., U.S. Pat. Nos. 5,733,757; 5,985,626; and 6,368,839 to Barbas, III et al. (the disclosures of which are all incorporated herein by reference).
[0100] The interaction of antibodies and antibody-antigen complexes with cells of the immune system and the variety of responses, including antibody-dependent cell-mediated cytotoxicity (ADCC) and complement-dependent cytotoxicity (CDC), clearance of immunocomplexes (phagocytosis), antibody production by B cells and IgG serum half-life are defined respectively in the following: Daeron et al., Annu. Rev. Immunol. 15: 203-234 (1997); Ward and Ghetie, Therapeutic Immunol. 2:77-94 (1995); Cox and Greenberg, Semin. Immunol. 13: 339-345 (2001); Heyman, Immunol. Lett. 88:157-161 (2003); and Ravetch, Curr. Opin. Immunol. 9: 121-125 (1997).
[0101] As used herein, the term "consisting essentially of" will be understood to imply the inclusion of a stated integer or group of integers; while excluding modifications or other integers which would materially affect or alter the stated integer. With respect to species of N-glycans, the term "consisting essentially of" a stated N-glycan will be understood to include the N-glycan whether or not that N-glycan is fucosylated at the N-acetylglucosamine (GlcNAc) which is directly linked to the asparagine residue of the glycoprotein.
[0102] As used herein, the term "predominantly" or variations such as "the predominant" or "which is predominant" will be understood to mean the glycan species that has the highest mole percent (%) of total neutral N-glycans after the glycoprotein has been treated with PNGase and released glycans analyzed by mass spectroscopy, for example, MALDI-TOF MS or HPLC. In other words, the phrase "predominantly" means that an individual entity, such as a specific glycoform, is present in greater mole percent than any other individual entity. For example, if a composition consists of species A at 40 mole percent, species B at 35 mole percent and species C at 25 mole percent, the composition comprises predominantly species A, and species B would be the next most predominant species. Some host cells may produce compositions comprising neutral N-glycans and charged N-glycans such as mannosylphosphate. Therefore, a composition of glycoproteins can include a plurality of charged and uncharged or neutral N-glycans. In the present invention, it is within the context of the total plurality of neutral N-glycans in the composition that the predominant N-glycan is determined. Thus, as used herein, "predominant N-glycan" means that of the total plurality of neutral N-glycans in the composition, the predominant N-glycan is of a particular structure.
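The definition of "predominant" above can be illustrated with a minimal Python sketch. The function names and the dictionary layout are hypothetical and chosen only for illustration; the input is assumed to be the mole percent of each species among the total neutral N-glycans, as in the species A, B, and C example. Treating an exact tie as having no predominant species is also an assumption, since the text only addresses the case of a single highest value.

```python
def rank_neutral_glycans(mole_percent):
    """Return species names sorted from highest to lowest mole percent.

    `mole_percent` maps a species name to its mole percent of the total
    neutral N-glycans (hypothetical input layout, for illustration only).
    """
    return sorted(mole_percent, key=mole_percent.get, reverse=True)


def predominant_species(mole_percent):
    """Return the species with the highest mole percent, or None.

    None is returned for an empty composition or when the top two species
    are exactly tied (an assumption; the text does not cover ties).
    """
    ranked = rank_neutral_glycans(mole_percent)
    if not ranked:
        return None
    if len(ranked) > 1 and mole_percent[ranked[0]] == mole_percent[ranked[1]]:
        return None
    return ranked[0]


# The worked example from the text: A at 40, B at 35, C at 25 mole percent.
composition = {"A": 40.0, "B": 35.0, "C": 25.0}
top = predominant_species(composition)
next_most = rank_neutral_glycans(composition)[1]
```

With the example composition, `top` is species A and `next_most` is species B, matching the wording of the definition.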
[0103] As used herein, the term "essentially free of" a particular sugar residue, such as fucose, or galactose and the like, is used to indicate that the glycoprotein composition is substantially devoid of N-glycans which contain such residues. Expressed in terms of purity, essentially free means that the amount of N-glycan structures containing such sugar residues does not exceed 10%, and preferably is below 5%, more preferably below 1%, most preferably below 0.5%, wherein the percentages are by weight or by mole percent. Thus, substantially all of the N-glycan structures in a glycoprotein composition according to the present invention are free of, for example, fucose, or galactose, or both.
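The thresholds recited for "essentially free of" can be expressed as a short Python sketch. The function name and the tier labels are hypothetical; the numeric limits (not exceeding 10%, below 5%, below 1%, below 0.5%) are taken from the definition above, and the input is assumed to be the percentage, by weight or by mole percent, of N-glycan structures containing the sugar residue in question.

```python
def essentially_free_tier(percent_with_residue):
    """Classify a composition against the 'essentially free of' thresholds.

    `percent_with_residue` is the percent of N-glycan structures that
    contain the residue (for example fucose). Labels are illustrative only.
    """
    if not 0.0 <= percent_with_residue <= 100.0:
        raise ValueError("percentage must be between 0 and 100")
    if percent_with_residue > 10.0:
        return "not essentially free"   # exceeds the 10% upper limit
    if percent_with_residue < 0.5:
        return "most preferred"         # below 0.5%
    if percent_with_residue < 1.0:
        return "more preferred"         # below 1%
    if percent_with_residue < 5.0:
        return "preferred"              # below 5%
    return "essentially free"           # between 5% and 10% inclusive
```

A composition in which 0.7% of the N-glycans carry the residue would fall in the "more preferred" tier under this sketch, while one at 12% would not qualify as essentially free.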
[0104] As used herein, a glycoprotein composition "lacks" or "is lacking" a particular sugar residue, such as fucose or galactose, when no detectable amount of such sugar residue is present on the N-glycan structures at any time. For example, in preferred embodiments of the present invention, the glycoprotein compositions are produced by lower eukaryotic organisms, as defined above, including yeast (for example, Pichia sp.; Saccharomyces sp.; Kluyveromyces sp.; Aspergillus sp.), and will "lack fucose," because the cells of these organisms do not have the enzymes needed to produce fucosylated N-glycan structures. Thus, the term "essentially free of fucose" encompasses the term "lacking fucose." However, a composition may be "essentially free of fucose" even if the composition at one time contained fucosylated N-glycan structures or contains limited, but detectable amounts of fucosylated N-glycan structures as described above.
BRIEF DESCRIPTION OF THE DRAWINGS
[0105] FIG. 1A-H shows the genealogy of P. pastoris strain YGLY13992 (FIG. 1F) and strain YGLY14401 (FIG. 1G) beginning from wild-type strain NRRL-Y11430 (FIG. 1A).
[0106] FIG. 2 shows a map of plasmid pGLY6301 encoding the LmSTT3D ORF under the control of the Pichia pastoris alcohol oxidase I (AOX1) promoter and S. cerevisiae CYC transcription termination sequence. The plasmid is a roll-in vector that targets the URA6 locus. The selection of transformants uses arsenic resistance encoded by the S. cerevisiae ARR3 ORF under the control of the P. pastoris RPL10 promoter and S. cerevisiae CYC transcription termination sequence.
[0107] FIG. 3 shows a map of plasmid pGLY6294 encoding the LmSTT3D ORF under the control of the P. pastoris GAPDH promoter and S. cerevisiae CYC transcription termination sequence. The plasmid is a KINKO vector that targets the TRP1 locus: the 3' end of the TRP1 ORF is adjacent to the P. pastoris ALG3 transcription termination sequence. The selection of transformants uses nourseothricin resistance encoded by the Streptomyces noursei nourseothricin acetyltransferase (NAT) ORF under the control of the Ashbya gossypii TEF1 promoter (PTEF) and Ashbya gossypii TEF1 termination sequence (TTEF).
[0108] FIG. 4 shows a map of plasmid pGLY6. Plasmid pGLY6 is an integration vector that targets the URA5 locus and contains a nucleic acid molecule comprising the S. cerevisiae invertase gene or transcription unit (ScSUC2) flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region of the P. pastoris URA5 gene (PpURA5-5') and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the P. pastoris URA5 gene (PpURA5-3').
[0109] FIG. 5 shows a map of plasmid pGLY40. Plasmid pGLY40 is an integration vector that targets the OCH1 locus and contains a nucleic acid molecule comprising the P. pastoris URA5 gene or transcription unit (PpURA5) flanked by nucleic acid molecules comprising lacZ repeats (lacZ repeat) which in turn is flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region of the OCH1 gene (PpOCH1-5') and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the OCH1 gene (PpOCH1-3').
[0110] FIG. 6 shows a map of plasmid pGLY43a. Plasmid pGLY43a is an integration vector that targets the BMT2 locus and contains a nucleic acid molecule comprising the K. lactis UDP-N-acetylglucosamine (UDP-GlcNAc) transporter gene or transcription unit (KlGlcNAc Transp.) adjacent to a nucleic acid molecule comprising the P. pastoris URA5 gene or transcription unit (PpURA5) flanked by nucleic acid molecules comprising lacZ repeats (lacZ repeat). The adjacent genes are flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region of the BMT2 gene (PpPBS2-5') and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the BMT2 gene (PpPBS2-3').
[0111] FIG. 7 shows a map of plasmid pGLY48. Plasmid pGLY48 is an integration vector that targets the MNN4L1 locus and contains an expression cassette comprising a nucleic acid molecule encoding the mouse homologue of the UDP-GlcNAc transporter (MmGlcNAc Transp.) open reading frame (ORF) operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris GAPDH promoter (PpGAPDH Prom) and at the 3' end to a nucleic acid molecule comprising the S. cerevisiae CYC termination sequence (ScCYC TT) adjacent to a nucleic acid molecule comprising the P. pastoris URA5 gene or transcription unit (PpURA5) flanked by lacZ repeats (lacZ repeat) and in which the expression cassettes together are flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region of the P. pastoris MNN4L1 gene (PpMNN4L1-5') and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the MNN4L1 gene (PpMNN4L1-3').
[0112] FIG. 8 shows a map of plasmid pGLY45. Plasmid pGLY45 is an integration vector that targets the PNO1/MNN4 loci and contains a nucleic acid molecule comprising the P. pastoris URA5 gene or transcription unit (PpURA5) flanked by nucleic acid molecules comprising lacZ repeats (lacZ repeat) which in turn is flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region of the PNO1 gene (PpPNO1-5') and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the MNN4 gene (PpMNN4-3').
[0113] FIG. 9 shows a map of plasmid pGLY1430. Plasmid pGLY1430 is a KINKO integration vector that targets the ADE1 locus without disrupting expression of the locus and contains in tandem four expression cassettes encoding (1) the human GlcNAc transferase I catalytic domain (codon optimized) fused at the N-terminus to P. pastoris SEC12 leader peptide (CO-NA10), (2) the mouse homologue of the UDP-GlcNAc transporter (MmTr), (3) the mouse mannosidase IA catalytic domain (FB) fused at the N-terminus to S. cerevisiae SEC12 leader peptide (FB8), and (4) the P. pastoris URA5 gene or transcription unit (PpURA5) flanked by lacZ repeats (lacZ). All are flanked by the 5' region of the ADE1 gene and ORF (ADE1 5' and ORF) and the 3' region of the ADE1 gene (PpADE1-3'). PpPMA1 prom is the P. pastoris PMA1 promoter; PpPMA1 TT is the P. pastoris PMA1 termination sequence; SEC4 is the P. pastoris SEC4 promoter; OCH1 TT is the P. pastoris OCH1 termination sequence; ScCYC TT is the S. cerevisiae CYC termination sequence; PpOCH1 Prom is the P. pastoris OCH1 promoter; PpALG3 TT is the P. pastoris ALG3 termination sequence; and PpGAPDH is the P. pastoris GAPDH promoter.
[0114] FIG. 10 shows a map of plasmid pGLY582. Plasmid pGLY582 is an integration vector that targets the HIS1 locus and contains in tandem four expression cassettes encoding (1) the S. cerevisiae UDP-glucose epimerase (ScGAL10), (2) the human galactosyltransferase I (hGalT) catalytic domain fused at the N-terminus to the S. cerevisiae KRE2-s leader peptide (33), (3) the P. pastoris URA5 gene or transcription unit (PpURA5) flanked by lacZ repeats (lacZ repeat), and (4) the D. melanogaster UDP-galactose transporter (DmUGT). All are flanked by the 5' region of the HIS1 gene (PpHIS1-5') and the 3' region of the HIS1 gene (PpHIS1-3'). PMA1 is the P. pastoris PMA1 promoter; PpPMA1 TT is the P. pastoris PMA1 termination sequence; GAPDH is the P. pastoris GAPDH promoter and ScCYC TT is the S. cerevisiae CYC termination sequence; PpOCH1 Prom is the P. pastoris OCH1 promoter and PpALG12 TT is the P. pastoris ALG12 termination sequence.
[0115] FIG. 11 shows a map of plasmid pGLY167b. Plasmid pGLY167b is an integration vector that targets the ARG1 locus and contains in tandem three expression cassettes encoding (1) the D. melanogaster mannosidase II catalytic domain (codon optimized) fused at the N-terminus to S. cerevisiae MNN2 leader peptide (CO-KD53), (2) the P. pastoris HIS1 gene or transcription unit, and (3) the rat N-acetylglucosamine (GlcNAc) transferase II catalytic domain (codon optimized) fused at the N-terminus to S. cerevisiae MNN2 leader peptide (CO-TC54). All are flanked by the 5' region of the ARG1 gene (PpARG1-5') and the 3' region of the ARG1 gene (PpARG1-3'). PpPMA1 prom is the P. pastoris PMA1 promoter; PpPMA1 TT is the P. pastoris PMA1 termination sequence; PpGAPDH is the P. pastoris GAPDH promoter; ScCYC TT is the S. cerevisiae CYC termination sequence; PpOCH1 Prom is the P. pastoris OCH1 promoter; and PpALG12 TT is the P. pastoris ALG12 termination sequence.
[0116] FIG. 12 shows a map of plasmid pGLY3411 (pSH1092). Plasmid pGLY3411 (pSH1092) is an integration vector that contains the expression cassette comprising the P. pastoris URA5 gene or transcription unit (PpURA5) flanked by lacZ repeats (lacZ repeat) flanked on one side with the 5' nucleotide sequence of the P. pastoris BMT4 gene (PpPBS4 5') and on the other side with the 3' nucleotide sequence of the P. pastoris BMT4 gene (PpPBS4 3').
[0117] FIG. 13 shows a map of plasmid pGLY3419 (pSH1110). Plasmid pGLY3419 (pSH1110) is an integration vector that contains an expression cassette comprising the P. pastoris URA5 gene or transcription unit (PpURA5) flanked by lacZ repeats (lacZ repeat) flanked on one side with the 5' nucleotide sequence of the P. pastoris BMT1 gene (PBS1 5') and on the other side with the 3' nucleotide sequence of the P. pastoris BMT1 gene (PBS1 3').
[0118] FIG. 14 shows a map of plasmid pGLY3421 (pSH1106). Plasmid pGLY3421 (pSH1106) contains an expression cassette comprising the P. pastoris URA5 gene or transcription unit (PpURA5) flanked by lacZ repeats (lacZ repeat) flanked on one side with the 5' nucleotide sequence of the P. pastoris BMT3 gene (PpPBS3 5') and on the other side with the 3' nucleotide sequence of the P. pastoris BMT3 gene (PpPBS3 3').
[0119] FIG. 15 shows a map of plasmid pGLY3673. Plasmid pGLY3673 is a KINKO integration vector that targets the PRO1 locus without disrupting expression of the locus and contains expression cassettes encoding the T. reesei α-1,2-mannosidase catalytic domain fused at the N-terminus to S. cerevisiae αMATpre signal peptide (aMATTrMan) to target the chimeric protein to the secretory pathway and secretion from the cell.
[0120] FIG. 16 shows a map of pGLY6833 encoding the light and heavy chains of an anti-Her2 antibody. The plasmid is a roll-in vector that targets the TRP2 locus. The ORFs encoding the light and heavy chains are under the control of a P. pastoris AOX1 promoter and the P. pastoris CIT1 3UTR transcription termination sequence. Selection of transformants uses zeocin resistance encoded by the zeocin resistance protein (ZeocinR) ORF under the control of the P. pastoris TEF1 promoter and S. cerevisiae CYC termination sequence.
[0121] FIG. 17 shows a map of pGLY6564 encoding the light and heavy chains of an anti-RSV antibody. The plasmid is a roll-in vector that targets the TRP2 locus. The ORF encoding the heavy chain is under the control of a P. pastoris AOX1 promoter and the S. cerevisiae CYC transcription termination sequence. The ORF encoding the light chain is under the control of a P. pastoris AOX1 promoter and the P. pastoris AOX1 transcription termination sequence. Selection of transformants uses zeocin resistance encoded by the zeocin resistance protein (ZeocinR) ORF under the control of the P. pastoris TEF1 promoter and S. cerevisiae CYC termination sequence.
[0122] FIG. 18 shows the percent N-glycosylation site occupancy of anti-Her2 and anti-RSV antibodies produced in control strains versus strains in which the LmSTT3D is constitutively expressed (GAPDH promoter) or inducibly expressed (AOX1 promoter).
[0123] FIG. 19 A-C shows a comparison of N-glycosylation site occupancy of the anti-Her2 antibody produced in strain YGLY13992 (FIG. 19B) and strain YGLY17351 (FIG. 19C) compared to N-glycosylation site occupancy of a commercially available anti-Her2 antibody produced in CHO cells (HERCEPTIN) (FIG. 19A). Strain YGLY13992 does not include an expression cassette encoding the LmSTT3D whereas strain YGLY17351 includes an expression cassette encoding the LmSTT3D under the control of the inducible PpAOX1 promoter.
[0124] FIG. 20 shows that the percent N-glycosylation site occupancy of anti-Her2 antibodies produced in strain YGLY17351 grown in various bioreactors was consistent regardless of bioreactor scale.
[0125] FIG. 21 A-B shows the results of a CE (FIG. 21B) and Q-TOF (FIG. 21A) analysis of a commercial lot of anti-Her2 antibody (HERCEPTIN).
[0126] FIG. 22 A-B shows the results of a CE (FIG. 22B) and Q-TOF (FIG. 22A) analysis for the same commercial lot as used for FIG. 21 but after treatment with PNGase F for a period of time.
[0127] FIG. 23 A-E shows the genealogy of P. pastoris strain YGLY12900 beginning from YGLY7961.
[0128] FIG. 24 shows a map of plasmid pGLY2456. Plasmid pGLY2456 is a KINKO integration vector that targets the TRP2 locus without disrupting expression of the locus and contains six expression cassettes encoding (1) the mouse CMP-sialic acid transporter codon optimized (CO mCMP-Sia Transp), (2) the human UDP-GlcNAc 2-epimerase/N-acetylmannosamine kinase codon optimized (CO hGNE), (3) the Pichia pastoris ARG1 gene or transcription unit, (4) the human CMP-sialic acid synthase codon optimized (CO hCMP-NANA S), (5) the human N-acetylneuraminate-9-phosphate synthase codon optimized (CO hSIAP S), and (6) the mouse α-2,6-sialyltransferase catalytic domain codon optimized fused at the N-terminus to S. cerevisiae KRE2 leader peptide (comST6-33). All are flanked by the 5' region of the TRP2 gene and ORF (PpTRP2 5') and the 3' region of the TRP2 gene (PpTRP2-3'). PpPMA1 prom is the P. pastoris PMA1 promoter; PpPMA1 TT is the P. pastoris PMA1 termination sequence; CYC TT is the S. cerevisiae CYC termination sequence; PpTEF Prom is the P. pastoris TEF1 promoter; PpTEF TT is the P. pastoris TEF1 termination sequence; PpALG3 TT is the P. pastoris ALG3 termination sequence; and pGAP is the P. pastoris GAPDH promoter.
[0129] FIG. 25 shows a map of plasmid pGLY5048. Plasmid pGLY5048 is an integration vector that targets the STE13 locus and contains expression cassettes encoding (1) the T. reesei α-1,2-mannosidase catalytic domain fused at the N-terminus to S. cerevisiae αMATpre signal peptide (aMATTrMan) to target the chimeric protein to the secretory pathway and secretion from the cell and (2) the P. pastoris URA5 gene or transcription unit.
[0130] FIG. 26 shows a map of plasmid pGLY5019. Plasmid pGLY5019 is an integration vector that targets the DAP2 locus and contains an expression cassette comprising a nucleic acid molecule encoding the Nourseothricin resistance (NATR) ORF operably linked to the Ashbya gossypii TEF1 promoter and A. gossypii TEF1 termination sequences flanked on one side with the 5' nucleotide sequence of the P. pastoris DAP2 gene and on the other side with the 3' nucleotide sequence of the P. pastoris DAP2 gene.
[0131] FIG. 27 shows a plasmid map of pGLY5085. Plasmid pGLY5085 is a KINKO plasmid for introducing a second set of the genes involved in producing sialylated N-glycans into P. pastoris. The plasmid is similar to plasmid pGLY2456 except that the P. pastoris ARG1 gene has been replaced with an expression cassette encoding hygromycin resistance (HygR) and the plasmid targets the P. pastoris TRP5 locus. The six tandem cassettes are flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region and ORF of the TRP5 gene ending at the stop codon followed by a P. pastoris ALG3 termination sequence and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the TRP5 gene.
[0132] FIG. 28 shows a plasmid map of pGLY7240. The plasmid is an integration vector that targets the TRP2 locus and contains an ORF encoding the zeocin resistance protein (ZeocinR) under the control of the P. pastoris TEF1 promoter and S. cerevisiae CYC termination sequence. The plasmid encodes the GM-CSF/CWP1 fusion protein operably linked at the 5' end to the Pichia pastoris AOX1 promoter and at the 3' end to the S. cerevisiae CYC transcription termination sequence.
[0133] FIG. 29 shows a Western blot of GM-CSF produced in strain YGLY16349, which co-expresses LmSTT3D. The blot shows that the majority of GM-CSF (lanes 2-8) is glycosylated at two N-linked sites, in contrast to the control strain (YGLY15560, lane 9), where GM-CSF is predominantly N-glycosylated at one site, along with minor portions that are glycosylated at two N-linked sites or are non-glycosylated.
[0134] FIG. 30 shows a Q-TOF analysis of GM-CSF expressed from YGLY15560 (A) and YGLY16349 (B), respectively. Non-glycosylated GM-CSF was not detected.
DETAILED DESCRIPTION OF THE INVENTION
[0135] The present invention provides a method for producing a therapeutic glycoprotein in a host cell in which the N-glycosylation site occupancy of the glycoprotein is increased over the N-glycosylation site occupancy of the same glycoprotein produced in a host cell not modified as disclosed herein. When the present invention is practiced in a lower eukaryote host cell, e.g., yeast host cells or filamentous fungal host cells, the N-glycosylation site occupancy of recombinant glycoproteins produced in the host cell is the same as or more similar to the N-glycosylation site occupancy of the same recombinant glycoproteins produced in mammalian or human host cells.
[0136] To increase the N-glycosylation site occupancy on a glycoprotein produced in a recombinant host cell, at least one nucleic acid molecule encoding at least one heterologous single-subunit oligosaccharyltransferase is overexpressed in the recombinant host cell either before or simultaneously with the expression of the glycoprotein in the host cell. In particular embodiments, at least one of the heterologous single-subunit oligosaccharyltransferases is capable of functionally suppressing a lethal mutation of one or more essential subunits comprising the endogenous host cell hetero-oligomeric oligosaccharyltransferase (OTase) complex.
[0137] The Leishmania major STT3A protein, Leishmania major STT3B protein, and Leishmania major STT3D protein are single-subunit oligosaccharyltransferases that have been shown to suppress the lethal phenotype of a deletion of the STT3 locus in Saccharomyces cerevisiae (Nasab et al., Molec. Biol. Cell 19: 3758-3768 (2008)). Nasab et al. (ibid.) further showed that the Leishmania major STT3D protein could suppress the lethal phenotype of a deletion of the WBP1, OST1, SWP1, or OST2 loci. Hese et al. (Glycobiology 19: 160-171 (2009)) teach that the Leishmania major STT3A (STT3-1), STT3B (STT3-2), and STT3D (STT3-4) proteins can functionally complement deletions of the OST2, SWP1, and WBP1 loci. The Leishmania major STT3D (LmSTT3D) protein is a heterologous single-subunit oligosaccharyltransferase that is capable of suppressing a lethal phenotype of a Δstt3 mutation and at least one lethal phenotype of a Δwbp1, Δost1, Δswp1, or Δost2 mutation, and that is shown in the examples herein to be capable of enhancing the N-glycosylation site occupancy of heterologous glycoproteins, for example antibodies, produced by the host cell.
[0138] The one or more heterologous single-subunit oligosaccharyltransferases is/are overexpressed constitutively or inducibly in the presence of the proteins comprising the host cell's endogenous OTase complex, including the host cell's STT3 protein. An expression cassette encoding each heterologous single-subunit oligosaccharyltransferase gene can either be integrated into any site within the host cell genome or located in the extrachromosomal space of the host cell, i.e., autonomously replicating genetic elements such as plasmids, viruses, 2μ plasmid, minichromosomes, and the like. In general, the heterologous single-subunit oligosaccharyltransferases are provided to the host cell in expression cassettes, each comprising a nucleic acid molecule encoding a single-subunit oligosaccharyltransferase open reading frame (ORF) operably linked to a heterologous constitutive or inducible promoter and other heterologous transcriptional or translational regulatory elements suitable for expressing heterologous proteins in a particular host cell. One or more copies of each expression cassette is/are integrated into one or more locations in the host cell's genome either by site-specific targeting of a particular locus for integration or randomly integrating the expression cassette into the genome. The locus for targeted integration can be selected based upon the suitability of the locus for ectopic constitutive or inducible expression of the single-subunit oligosaccharyltransferase in the expression cassette. Methods for integrating heterologous nucleic acid molecules into a host cell genome by techniques such as single- and double-crossover homologous recombination and the like are well known in the art (See for example, U.S. Published Application No. 20090124000 and International Published Application No. WO2009085135, the disclosures of which are incorporated herein by reference).
Alternatively, or in addition to integrating one or more copies of the expression cassette into the host cell genome, one or more copies of the expression cassette are located in the extrachromosomal space of the host cell using a 2μ plasmid, viral vector, mini-chromosome, or other genetic vector that replicates autonomously.
[0139] While the present invention has been exemplified herein with Pichia pastoris host cells genetically engineered to produce mammalian or human-like glycosylation patterns comprising complex N-glycans, the present invention, which increases the overall amount of N-glycosylation site occupancy of the glycoproteins produced in the host cell compared to that of glycoproteins produced in the host cell not modified as disclosed herein to express the single-subunit oligosaccharyltransferase gene, can also be applied to Pichia pastoris host cells that are not genetically engineered to produce glycoproteins that have mammalian or human glycosylation patterns but instead express glycoproteins that have endogenous or wild-type glycosylation patterns, for example hypermannosylated N-glycosylation or, when the host cell lacks alpha-1,6-mannosyltransferase (och1p) activity, high mannose N-glycosylation. The present invention can also be applied to other yeast or filamentous fungi or to plant or algal host cells, which express glycoproteins that have endogenous or wild-type glycosylation patterns, for example hypermannosylated N-glycosylation or, when the host cell lacks alpha-1,6-mannosyltransferase (och1p) activity, high mannose N-glycosylation, or which have been genetically engineered to produce mammalian or human-like complex or hybrid N-glycans, to increase the overall amount of N-glycosylation site occupancy of the glycoproteins produced in the host cell compared to that of glycoproteins produced in the host cell not modified as disclosed herein to express the single-subunit oligosaccharyltransferase gene. The present invention can also be applied to mammalian expression systems to increase the overall N-glycosylation site occupancy of glycoproteins that have more than two N-linked sites compared to that of glycoproteins produced in the host cell not modified as disclosed herein to express the single-subunit oligosaccharyltransferase gene.
[0140] The OTase complex of animals, plants, and fungi is a hetero-oligomeric protein complex. In the well-studied model organism Saccharomyces cerevisiae, the OTase complex currently appears to consist of at least eight different subunits: Ost1p, Ost2p, Wbp1, Stt3p, Swp1p, Ost4p, Ost5p, and Ost3p/Ost6p (Silberstein & Gilmore, FASEB J. 10: 849-858 (1996); Knauer & Lehle, Biochim. Biophys. Acta. 1426: 259-273 (1999); Dempski & Imperiali, Curr. Opin. Chem. Biol. 6: 844-850 (2002); Yan & Lennarz, J. Biol. Chem. 277: 47692-47700 (2002); Kelleher & Gilmore, Glycobiol. 16:47R-62R (2006); Weerapana & Imperiali, Glycobiol. 16: 91R-101R (2006)). In Pichia pastoris, the OTase complex appears to include at least Ost1p, Ost2p, Ost3p, Ost4p, Ost6p, Wbp1, Swp1p, and Stt3p (See De Schutter et al., Nat. Biotechnol. 27: 561-566 (2009)).
[0141] It has been hypothesized that the STT3 protein is the catalytic subunit in the OTase complex (Yan & Lennarz, J. Biol. Chem. 277: 47692-47700 (2002); Kelleher et al., Mol. Cell. 12: 101-111 (2003); Nilsson et al., J. Cell Biol. 161: 715-725 (2003)). Support for this hypothesis comes from experiments showing that the prokaryotic homologue of yeast Stt3p is an active oligosaccharyltransferase in the absence of any other accessory proteins (Wacker et al., Science. 298: 1790-1793 (2002); Kowarik et al., Science 314: 1148-1150 (2006)). Proteins homologous to yeast Stt3p are encoded in almost all eukaryotic genomes (Kelleher & Gilmore, Glycobiol. 16:47R-62R (2006)). However, comparative genome analysis suggests that the composition of the OTase became increasingly complex during the evolutionary divergence of eukaryotes.
[0142] Single-subunit oligosaccharyltransferases are present in Giardia and kinetoplastids, whereas four subunit oligosaccharyltransferases consisting of the STT3, OST1, OST2, and WBP1 homologues are found in diplomonads, entamoebas, and apicomplexan species. Additionally, multiple forms of the putative STT3 proteins can be encoded in trypanosomatid genomes: three STT3 homologues are found in Trypanosoma brucei and four in Leishmania major (McConville et al., Microbiol. Mol. Biol. Rev. 66: 122-154 (2002); Berriman et al., Science. 309: 416-422 (2005); Ivens et al., Science. 309: 436-442 (2005); Samuelson et al., Proc. Natl. Acad. Sci. USA 102: 1548-1553 (2005); Kelleher & Gilmore, Glycobiol. 16:47R-62R (2006)).
[0143] In trypanosomatid parasites, N-linked glycosylation principally follows the pathway described for fungal or animal cells, but with different oligosaccharide structures transferred to protein (Parodi, Glycobiology 3: 193-199 (1993); McConville et al., Microbiol. Mol. Biol. Rev. 66: 122-154 (2002)). It has been shown that, depending on the species, either Man6GlcNAc2 or Man7GlcNAc2 is the largest glycan transferred to protein in the genus Leishmania (Parodi, Glycobiology 3: 193-199 (1993)). Unlike the yeast and mammalian oligosaccharyltransferases that preferentially use Glc3Man9GlcNAc2, the trypanosome oligosaccharyltransferase is not selective and transfers different lipid-linked oligosaccharides at the same rate (Bosch et al., J. Biol. Chem. 263:17360-17365 (1988)). Therefore, the simplest eukaryotic oligosaccharyltransferase is a single subunit STT3 protein, similar to the oligosaccharyltransferase found in bacterial N-glycosylation systems. Nasab et al., Molecular Biology of the Cell 19: 3758-3768 (2008) expressed each of the four Leishmania major STT3 proteins individually in Saccharomyces cerevisiae and found that three of them, LmSTT3A protein, LmSTT3B protein, and LmSTT3D protein, were able to complement a deletion of the yeast STT3 locus. In addition, LmSTT3D expression suppressed the lethal phenotype of single and double deletions in genes encoding various essential OTase subunits. The LmSTT3 proteins did not incorporate into the yeast OTase complex but instead formed a homodimeric enzyme, capable of replacing the endogenous, multimeric enzyme of the yeast cell. The results indicate that while these single-subunit oligosaccharyltransferases may resemble the prokaryotic enzymes, they use substrates typical for eukaryote glycosylation: that is, the N-X-S/T N-glycosylation recognition site and dolicholpyrophosphate-linked high mannose oligosaccharides.
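The N-X-S/T recognition site mentioned above can be located in a protein sequence with a short Python sketch. The function name and the toy sequence are hypothetical and are not taken from any protein described herein. Excluding proline at the X position follows the commonly cited sequon rule and is an assumption of this sketch, since the text states only "N-X-S/T". The sketch identifies candidate sites only and says nothing about whether a given site is actually occupied.

```python
import re

# N-X-S/T sequon. Proline is excluded at X (assumption: commonly cited rule,
# not stated in the text). The lookahead lets overlapping sequons be found.
SEQUON_PATTERN = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein_sequence):
    """Return 1-based positions of Asn residues in candidate N-X-S/T sites.

    `protein_sequence` is a one-letter amino acid string (hypothetical input).
    """
    sequence = protein_sequence.upper()
    return [match.start() + 1 for match in SEQUON_PATTERN.finditer(sequence)]


# Toy sequence for illustration only.
candidate_sites = find_sequons("MNGSANPTNNTS")
```

For the toy sequence, the asparagine at position 6 is skipped because it is followed by proline, and the adjacent asparagines at positions 9 and 10 are both reported because their sequons overlap.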
[0144] N-glycosylation site occupancy in yeast has also been discussed in reports by, for example, Schulz and Aebi, Molec. Cell. Proteomics 8: 357-364 (2009); Hese et al. (op. cit.); and Nasab et al. (op. cit.). Expression of the Toxoplasma gondii or Trypanosoma cruzi STT3 protein in Saccharomyces cerevisiae has been shown to complement the lethal phenotype of an stt3 deletion (Shams-Eldin et al., Mol. Biochem. Parasitol. 143: 6-11 (2005); Castro et al., Proc. Natl. Acad. Sci. USA 103: 14756-14760 (2006)), and while the Trypanosoma cruzi STT3 protein integrates into the yeast OTase complex, the Leishmania major STT3 proteins appear to form homodimers instead (Nasab et al., op. cit.). However, in these reports, the LmSTT3D protein had been tested for its functional suppression of a lethal mutation of the endogenous yeast STT3 locus and other essential components of the yeast OTase complex in studies that measured N-glycosylation site occupancy of endogenous proteins. In addition, the yeast strains that were used in the studies produced glycoproteins that had a yeast glycosylation pattern, not a mammalian or human-like glycosylation pattern comprising hybrid or complex N-glycans.
[0145] In contrast to the above reports, in the present invention the open reading frame encoding a heterologous single-subunit oligosaccharyltransferase (as exemplified herein with the open reading frame encoding the LmSTT3D protein) is overexpressed constitutively or inducibly in the recombinant host cell, in which the host cell further expresses the endogenous genes encoding the proteins comprising the host cell oligosaccharyltransferase (OTase) complex, which includes the expression of the endogenous host cell STT3 gene. Thus, the host cell expresses both the heterologous single-subunit oligosaccharyltransferase and the endogenous host cell OTase complex, including the endogenous host cell STT3 protein. Furthermore, with respect to recombinant yeast, filamentous fungus, algal, or plant host cells, the host cells can further be genetically engineered to produce glycoproteins that comprise a mammalian or human-like glycosylation pattern comprising complex and/or hybrid N-glycans and not glycoproteins that have the host cells' endogenous glycosylation pattern.
[0146] The present invention has been exemplified herein using Pichia pastoris host cells genetically engineered to produce mammalian- or human-like complex N-glycans; however, the present invention can be applied to other yeast host cells (including but not limited to Saccharomyces cerevisiae, Schizosaccharomyces pombe, Ogataea minuta, and Pichia pastoris) or filamentous fungi (including but not limited to Trichoderma reesei) that produce glycoproteins that have yeast or fungal N-glycans (either hypermannosylated N-glycans or high mannose N-glycans) or that are genetically engineered to produce glycoproteins that have mammalian- or human-like high mannose, complex, or hybrid N-glycans, to improve the overall N-glycosylation site occupancy of glycoproteins produced in the host cell. Furthermore, the present invention can also be applied to plant and mammalian expression systems to improve the overall N-glycosylation site occupancy of glycoproteins produced in these plant or mammalian expression systems, particularly glycoproteins that have more than two N-linked glycosylation sites.
[0147] Therefore, in one aspect of the above, provided is a method for producing a heterologous glycoprotein in a host cell, comprising providing a host cell that includes a nucleic acid molecule encoding at least one heterologous single-subunit oligosaccharyltransferase and a nucleic acid molecule encoding the heterologous glycoprotein and wherein the endogenous host cell genes encoding the proteins comprising the endogenous oligosaccharyltransferase (OTase) complex are expressed; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein.
[0148] In a further aspect of the above, provided is a method for producing a heterologous glycoprotein with mammalian- or human-like complex or hybrid N-glycans in a host cell, comprising providing a host cell that is genetically engineered to produce glycoproteins that have human-like N-glycans and includes a nucleic acid molecule encoding at least one heterologous single-subunit oligosaccharyltransferase and a nucleic acid molecule encoding the heterologous glycoprotein and wherein the endogenous host cell genes encoding the proteins comprising the endogenous oligosaccharyltransferase (OTase) complex are expressed; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein.
[0149] Expression of the endogenous host cell genes encoding the proteins comprising the oligosaccharyltransferase (OTase) complex includes expression of the endogenous host cell gene encoding the endogenous STT3 protein or homologue. In the case of yeast host cells, the endogenous host cell genes encoding the proteins comprising the OTase complex are expressed, which includes the expression of the endogenous STT3 gene. Currently, the genes encoding proteins comprising the Saccharomyces cerevisiae OTase complex are known to include OST1, OST2, OST3, OST4, OST5, OST6, WBP1, SWP1, and STT3 (See for example, Spirig et al., Molec. Gen. Genet. 256: 628-637 (1997)), and in Pichia pastoris, the OTase complex appears to include at least Ost1p, Ost2p, Ost3p, Ost4p, Ost6p, Wbp1p, Swp1p, and Stt3p (See Shutter et al., op. cit.).
[0150] In general, the heterologous single-subunit oligosaccharyltransferase is capable of functionally suppressing the lethal phenotype of a mutation of at least one essential protein of an OTase complex, for example, a yeast OTase complex. Thus, the heterologous single-subunit oligosaccharyltransferase is capable of functionally complementing or rescuing a lethal mutation of at least one essential protein of an OTase complex. In further aspects, the essential protein of the OTase complex is encoded by the Saccharomyces cerevisiae and/or Pichia pastoris STT3 locus, WBP1 locus, OST1 locus, SWP1 locus, or OST2 locus, or homologue thereof. In general, a heterologous single-subunit oligosaccharyltransferase that can be used in the methods herein for increasing N-glycosylation site occupancy is one that in particular embodiments is capable of functionally suppressing (or rescuing or complementing) the lethal phenotype of at least one essential protein of the Saccharomyces cerevisiae and/or Pichia pastoris OTase complex. For example, in further aspects, the heterologous single-subunit oligosaccharyltransferase is the Leishmania major STT3D protein, which is capable of functionally suppressing (or rescuing or complementing) the lethal phenotype of at least one essential protein of the Saccharomyces cerevisiae or Pichia pastoris OTase complex. Therefore, for a particular host cell, a particular heterologous single-subunit oligosaccharyltransferase is suitable for expression in the particular host cell provided the single-subunit heterologous oligosaccharyltransferase is capable of suppressing the lethal phenotype of at least one essential protein of the yeast OTase complex. In a further aspect, a heterologous single-subunit oligosaccharyltransferase is selected for expression in a particular host cell provided the single-subunit heterologous oligosaccharyltransferase is capable of suppressing the lethal phenotype of at least one essential protein of the Saccharomyces cerevisiae and/or Pichia pastoris OTase complex. The essential proteins include OST1, OST2, WBP1, SWP1, and STT3.
[0151] As used herein, a lethal mutation includes a deletion or disruption of the gene encoding the essential protein of the OTase complex or a mutation in the coding sequence that renders the essential protein non-functional. The term can further include knock-down mutations wherein production of a functional essential protein is abrogated using shRNA or RNAi.
[0152] Further provided is a host cell, comprising a first nucleic acid molecule encoding at least one heterologous single-subunit oligosaccharyltransferase; and a second nucleic acid molecule encoding a heterologous glycoprotein; and the host cell expresses its endogenous genes encoding the proteins comprising the endogenous oligosaccharyltransferase (OTase) complex, which includes expressing the endogenous host cell gene encoding the host cell STT3 protein, which in yeast is the STT3 gene. In further aspects of a yeast host cell, the host cell expresses the endogenous genes encoding the proteins comprising the OTase complex.
[0153] In particular aspects of any of the above, the host cell further comprises one or more nucleic acid molecules encoding additional heterologous oligosaccharyltransferases, which can include single-subunit or multimeric oligosaccharyltransferases. For example, the host cell can comprise one or more nucleic acid molecules encoding one or more single-subunit oligosaccharyltransferases selected from the group consisting of the LmSTT3A protein, LmSTT3B protein, and LmSTT3D protein. In further aspects, the host cell can further include a nucleic acid molecule encoding LmSTT3C protein. In further aspects of any one of the above, the host cell can include one or more nucleic acid molecules encoding one or more oligosaccharyltransferases selected from the group consisting of the Toxoplasma gondii STT3 protein, Trypanosoma cruzi STT3 protein, Trypanosoma brucei STT3 protein, and C. elegans STT3 protein. In still further aspects of any one of the above, the host cell can further include a nucleic acid molecule encoding the Pichia pastoris STT3 protein.
[0154] Lower eukaryotes such as yeast or filamentous fungi are often used for expression of recombinant glycoproteins because they can be economically cultured, give high yields, and when appropriately modified are capable of suitable glycosylation. Yeast in particular offers established genetics allowing for rapid transfections, tested protein localization strategies and facile gene knock-out techniques. Suitable vectors have expression control sequences, such as promoters, including 3-phosphoglycerate kinase or other glycolytic enzymes, and an origin of replication, termination sequences, and the like as desired.
[0155] Useful lower eukaryote host cells include but are not limited to Pichia pastoris, Pichia finlandica, Pichia trehalophila, Pichia kodamae, Pichia membranaefaciens, Pichia minuta (Ogataea minuta, Pichia lindneri), Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia quercuum, Pichia pijperi, Pichia stipitis, Pichia methanolica, Pichia sp., Saccharomyces cerevisiae, Saccharomyces sp., Hansenula polymorpha, Kluyveromyces sp., Kluyveromyces lactis, Candida albicans, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Trichoderma reesei, Chrysosporium lucknowense, Fusarium sp., Fusarium gramineum, Fusarium venenatum and Neurospora crassa. Various yeasts, such as K. lactis, Pichia pastoris, Pichia methanolica, and Hansenula polymorpha are particularly suitable for cell culture because they are able to grow to high cell densities and secrete large quantities of recombinant protein. Likewise, filamentous fungi, such as Aspergillus niger, Fusarium sp., Neurospora crassa and others can be used to produce glycoproteins of the invention at an industrial scale. In the case of lower eukaryotes, cells are routinely grown for between about one and a half and three days.
[0156] Therefore, provided is a method for producing a heterologous glycoprotein in a lower eukaryote host cell, comprising providing a lower eukaryote host cell that includes a nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase and a nucleic acid molecule encoding the heterologous glycoprotein; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein.
[0157] Further provided is a lower eukaryote host cell, comprising a first nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase; and a second nucleic acid molecule encoding a heterologous glycoprotein; and wherein the endogenous host cell genes encoding the proteins comprising the oligosaccharyltransferase (OTase) complex are expressed.
[0158] Further provided is a yeast or filamentous fungus host cell, comprising a first nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase; and a second nucleic acid molecule encoding a heterologous glycoprotein; and wherein the endogenous host cell genes encoding the proteins comprising the oligosaccharyltransferase (OTase) complex are expressed. This includes expression of the endogenous STT3 gene, which in yeast is the STT3 gene.
[0159] In particular aspects, the above yeast or filamentous fungus host cell can be a host cell that produces glycoproteins that have a yeast-like or filamentous fungus-like glycosylation pattern. The yeast glycosylation pattern can include hypermannosylated N-glycans or the yeast can be genetically engineered to lack α1,6-mannosyltransferase activity, that is, the yeast host is genetically engineered to lack och1p activity, in which case, the yeast produces glycoproteins that have high mannose N-glycans that are not further hypermannosylated.
[0160] In particular embodiments of the above methods and host cells, the heterologous single-subunit oligosaccharyltransferase is capable of functionally suppressing the lethal phenotype of a mutation of at least one essential protein of the OTase complex. In further aspects, the essential protein of the OTase complex is encoded by the STT3 locus, WBP1 locus, OST1 locus, SWP1 locus, or OST2 locus, or homologue thereof. In further aspects, the single-subunit oligosaccharyltransferase is, for example, the Leishmania major STT3D protein.
[0161] The methods and host cells herein provide a means for producing heterologous glycoproteins in a host cell wherein the N-glycosylation site occupancy of a composition of the heterologous glycoproteins is greater than the N-glycosylation site occupancy for the heterologous glycoprotein produced in the host cell not modified as described herein to express a heterologous single-subunit oligosaccharyltransferase and the endogenous host cell genes encoding the proteins comprising the oligosaccharyltransferase (OTase) complex. For a lower eukaryote host cell such as yeast, when the N-glycosylation site occupancy of a heterologous glycoprotein is lower than that obtained for the heterologous glycoprotein when produced in mammalian or human cells, the N-glycosylation site occupancy of the glycoprotein produced in the host cell can be made the same as or more similar to the N-glycosylation site occupancy of the glycoprotein in the mammalian or human cell by producing the glycoprotein in a host cell that expresses a heterologous single-subunit oligosaccharyltransferase and the endogenous host cell genes encoding the proteins comprising the oligosaccharyltransferase (OTase) complex. As shown in the examples, Pichia pastoris host cells that express a heterologous single-subunit oligosaccharyltransferase and the endogenous host cell genes encoding the proteins comprising the oligosaccharyltransferase (OTase) complex are capable of producing antibodies wherein the N-glycosylation site occupancy of the antibodies is similar to that of the antibodies produced in Chinese hamster ovary (CHO) cells (See also FIG. 19A-C).
[0162] A method for measuring N-glycosylation site occupancy is to separate and measure the amount of glycosylated protein and non-glycosylated protein and determine the N-glycosylation site occupancy using the formula
(Moles glycosylated protein)/(moles glycosylated protein+moles non-glycosylated protein)×100=percent N-glycosylation site occupancy
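The formula above can be written as a short function. This is a minimal sketch only; the function name and argument names are illustrative and do not appear in the specification.

```python
def percent_site_occupancy(moles_glycosylated: float,
                           moles_nonglycosylated: float) -> float:
    """Percent N-glycosylation site occupancy for a single-site protein.

    Implements: glycosylated / (glycosylated + non-glycosylated) x 100.
    Both inputs must be in the same molar units.
    """
    total = moles_glycosylated + moles_nonglycosylated
    if total <= 0:
        raise ValueError("total moles of protein must be positive")
    return moles_glycosylated / total * 100.0


# Hypothetical measurement: 94 nmol glycosylated, 6 nmol non-glycosylated.
print(percent_site_occupancy(94.0, 6.0))
```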
[0163] When measuring the N-glycosylation site occupancy of antibodies in an antibody composition, the antibodies in the composition are reduced and the moles of glycosylated and non-glycosylated heavy chains determined. Each heavy chain has one N-glycosylation site at Asn-297. The percent N-glycosylation site occupancy is determined based upon total moles of N-glycans released and the total moles of antibody heavy chains. For example, an N-glycosylation site occupancy of 94% would indicate that 94% of the heavy chains in the composition have an N-glycan at Asn-297 and 6% of the heavy chains would lack an N-glycan. Antibodies consist of two heavy chains and two light chains. In the above example, antibodies in the composition can have both heavy chains linked to an N-glycan, one of the two heavy chains with an N-glycan, or neither chain with an N-glycan. Therefore, a 94% N-glycosylation site occupancy of heavy chains would suggest that about 88% of the antibodies in the composition would have both heavy chains N-glycosylated and about 11.3% of the antibodies would have only one of the two heavy chains N-glycosylated. To get a qualitative indication that the above is correct, whole antibodies are analyzed by a method such as Q-TOF (hybrid quadrupole time of flight mass spectrometer with MS/MS capability).
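The estimate for whole antibodies in the example above can be reproduced with a simple binomial calculation. The sketch below assumes that the two heavy chains of an antibody are glycosylated independently of one another; that independence is an assumption made here for illustration, and the function name is hypothetical.

```python
def antibody_glycoform_distribution(hc_occupancy_percent: float) -> dict:
    """Estimate the share of whole antibodies with 2, 1, or 0 occupied sites.

    Assumes each heavy chain is glycosylated independently with the same
    probability p (the measured heavy chain occupancy).
    """
    if not 0.0 <= hc_occupancy_percent <= 100.0:
        raise ValueError("occupancy must be between 0 and 100 percent")
    p = hc_occupancy_percent / 100.0
    return {
        "both": p * p * 100.0,             # both heavy chains N-glycosylated
        "one": 2.0 * p * (1.0 - p) * 100.0,  # exactly one heavy chain
        "none": (1.0 - p) ** 2 * 100.0,    # neither heavy chain
    }


# 94% heavy chain occupancy, as in the example in the text.
for state, percent in antibody_glycoform_distribution(94.0).items():
    print(f"{state}: {percent:.2f}%")
```

Under this assumption, 94% heavy chain occupancy corresponds to about 88% of antibodies with both sites occupied, roughly 11% with one site occupied, and well under 1% with neither site occupied.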
[0164] A general method for measuring N-glycosylation site occupancy of antibodies can use the following method, which is exemplified in Example 3. The antibodies are reduced to heavy chains (HC) and light chains (LC) and the amounts of glycosylated heavy chains (GHC) and non-glycosylated heavy chains (NGHC) are determined by a method such as capillary electrophoresis. The N-glycosylation site occupancy is then calculated using the formula
(Moles GHC)/(moles GHC+moles NGHC)×100=percent N-glycosylation site HC occupancy
For any N-glycosylation site, the site is either occupied or not. Therefore, N-glycan occupancy of 100% would be equivalent to a ratio of 1:1 (1 mole of N-glycan per 1 mole of N-glycosylation site, e.g., heavy chain from reduced antibody) or 2:1 (2 moles of N-glycan per 1 mole of protein with two N-glycosylation sites, e.g., non-reduced antibody). N-glycan occupancy of 80% would be equivalent to a ratio of 0.8:1 (0.8 mole of N-glycan per 1 mole of N-glycosylation site, e.g., heavy chain from reduced antibody) or 1.6:1 (1.6 moles of N-glycan per mole of protein with two N-glycosylation sites, e.g., non-reduced antibody).
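The conversion between percent occupancy and the molar ratios described above can be sketched as follows. The function name and the `sites_per_protein` parameter are illustrative and are not taken from the specification.

```python
def glycans_per_protein(occupancy_percent: float,
                        sites_per_protein: int = 1) -> float:
    """Moles of N-glycan expected per mole of protein.

    sites_per_protein=1 models a reduced heavy chain (one site at Asn-297);
    sites_per_protein=2 models a whole, non-reduced antibody.
    """
    if not 0.0 <= occupancy_percent <= 100.0:
        raise ValueError("occupancy must be between 0 and 100 percent")
    if sites_per_protein < 1:
        raise ValueError("a glycoprotein needs at least one site")
    return occupancy_percent / 100.0 * sites_per_protein


print(glycans_per_protein(80.0, 1))  # reduced heavy chain
print(glycans_per_protein(80.0, 2))  # non-reduced antibody
```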
[0165] An estimate of the proportion of whole antibodies in which both heavy chains are glycosylated can be approximated by the formula (fraction GHC)^2×100=percent fully occupied antibodies (whole, non-reduced antibodies in which both N-glycosylation sites are occupied). Example 3 shows that the methods herein enable the production of antibody compositions wherein about 70% to about 98% of the non-reduced whole antibody molecules in the composition have both N-glycosylation sites occupied. Since measurement of N-glycosylation site occupancy was determined using reduced antibody molecules, the results herein show that for compositions comprising glycoprotein molecules containing a single glycosylation site, more than 84% to at least 99% of the glycoprotein molecules were N-glycosylated. Therefore, the methods and host cells herein enable production of glycoprotein compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the N-glycosylation sites of the glycoproteins in the composition are occupied.
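The relationship between heavy chain occupancy and fully occupied whole antibodies can be checked in both directions. This is a sketch that relies on the same squared-fraction approximation as the paragraph above; the function names are hypothetical.

```python
import math


def fully_occupied_percent(hc_occupancy_percent: float) -> float:
    """Percent of whole antibodies with both sites occupied: (fraction GHC)^2 x 100."""
    return (hc_occupancy_percent / 100.0) ** 2 * 100.0


def hc_occupancy_from_fully_occupied(fully_occupied: float) -> float:
    """Inverse estimate: heavy chain occupancy implied by a fully occupied percentage."""
    if not 0.0 <= fully_occupied <= 100.0:
        raise ValueError("percentage must be between 0 and 100")
    return math.sqrt(fully_occupied / 100.0) * 100.0


# The 70% to 98% range for whole antibodies quoted in the text.
for whole in (70.0, 98.0):
    print(f"{whole:.0f}% fully occupied -> "
          f"{hc_occupancy_from_fully_occupied(whole):.1f}% heavy chain occupancy")
```

Applied to the 70% and 98% figures for whole antibodies, the inverse gives single-site occupancies of roughly 84% and 99%, which is the range quoted for glycoproteins with a single glycosylation site.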
[0166] Another method for measuring N-glycosylation site occupancy of glycoproteins in a glycoprotein composition can be accomplished by releasing the N-glycans from the glycoproteins in the composition and measuring the molar amount of the N-glycans released and the molar amount of glycoprotein times the number of glycosylation sites on the glycoprotein. The following formula can be used
(Total moles of N-glycans)/(Total moles of glycoprotein×No. of sites)×100=percent N-glycosylation site occupancy.
The above formula will give the percent of total N-glycosylation sites that are occupied.
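The released-glycan formula can be sketched in the same way. The function and parameter names below are illustrative only, and the example quantities are hypothetical.

```python
def percent_occupancy_from_released_glycans(moles_glycans: float,
                                            moles_glycoprotein: float,
                                            sites_per_molecule: int) -> float:
    """Percent of all N-glycosylation sites that are occupied.

    Implements: total moles of N-glycans /
                (total moles of glycoprotein x number of sites) x 100.
    """
    total_sites = moles_glycoprotein * sites_per_molecule
    if total_sites <= 0:
        raise ValueError("total moles of sites must be positive")
    return moles_glycans / total_sites * 100.0


# Hypothetical: 1.6 nmol of N-glycans released from 1.0 nmol of a
# glycoprotein that carries two N-glycosylation sites per molecule.
print(percent_occupancy_from_released_glycans(1.6, 1.0, 2))
```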
[0167] Lower eukaryotes, particularly yeast, can be genetically modified so that they express glycoproteins in which the glycosylation pattern is mammalian or human-like or humanized. In this manner, glycoprotein compositions can be produced in which a specific desired glycoform is predominant in the composition. Such can be achieved by eliminating selected endogenous glycosylation enzymes and/or genetically engineering the host cells and/or supplying exogenous enzymes to mimic all or part of the mammalian glycosylation pathway as described in U.S. Published Application No. 2004/0018590, the disclosure of which is incorporated herein by reference. If desired, additional genetic engineering of the glycosylation can be performed, such that the glycoprotein can be produced with or without core fucosylation.
[0168] Lower eukaryotes such as yeast can be genetically modified so that they express glycoproteins in which the glycosylation pattern is mammalian-like or human-like or humanized. Such can be achieved by eliminating selected endogenous glycosylation enzymes and/or supplying exogenous enzymes as described by Gerngross et al., U.S. Pat. No. 7,449,308, the disclosure of which is incorporated herein by reference. Thus, in particular aspects of the invention, the host cell is yeast, for example, a methylotrophic yeast such as Pichia pastoris or Ogataea minuta and mutants thereof and genetically engineered variants thereof. In this manner, glycoprotein compositions can be produced in which a specific desired glycoform is predominant in the composition. Such can be achieved by eliminating selected endogenous glycosylation enzymes and/or genetically engineering the host cells and/or supplying exogenous enzymes to mimic all or part of the mammalian glycosylation pathway as described in U.S. Pat. No. 7,449,308, the disclosure of which is incorporated herein by reference. If desired, additional genetic engineering of the glycosylation can be performed, such that the glycoprotein can be produced with or without core fucosylation. Use of lower eukaryotic host cells such as yeast is further advantageous in that these cells are able to produce relatively homogenous compositions of glycoprotein, such that the predominant glycoform of the glycoprotein may be present as greater than thirty mole percent of the glycoprotein in the composition. In particular aspects, the predominant glycoform may be present in greater than forty mole percent, fifty mole percent, sixty mole percent, seventy mole percent and, most preferably, greater than eighty mole percent of the glycoprotein present in the composition. Such can be achieved by eliminating selected endogenous glycosylation enzymes and/or supplying exogenous enzymes as described by Gerngross et al., U.S. Pat. No. 7,029,872 and U.S. Pat. No. 7,449,308, the disclosures of which are incorporated herein by reference. For example, a host cell can be selected or engineered to be depleted in 1,6-mannosyl transferase activities, which would otherwise add mannose residues onto the N-glycan on a glycoprotein.
[0169] In one embodiment, the host cell further includes an α1,2-mannosidase catalytic domain fused to a cellular targeting signal peptide not normally associated with the catalytic domain and selected to target the α1,2-mannosidase activity to the ER or Golgi apparatus of the host cell. Passage of a recombinant glycoprotein through the ER or Golgi apparatus of the host cell produces a recombinant glycoprotein comprising a Man5GlcNAc2 glycoform, for example, a recombinant glycoprotein composition comprising predominantly a Man5GlcNAc2 glycoform. For example, U.S. Pat. No. 7,029,872, U.S. Pat. No. 7,449,308, and U.S. Published Patent Application No. 2005/0170452, the disclosures of which are all incorporated herein by reference, disclose lower eukaryote host cells capable of producing a glycoprotein comprising a Man5GlcNAc2 glycoform.
[0170] In a further embodiment, the immediately preceding host cell further includes an N-acetylglucosaminyltransferase I (GlcNAc transferase I or GnT I) catalytic domain fused to a cellular targeting signal peptide not normally associated with the catalytic domain and selected to target GlcNAc transferase I activity to the ER or Golgi apparatus of the host cell. Passage of the recombinant glycoprotein through the ER or Golgi apparatus of the host cell produces a recombinant glycoprotein comprising a GlcNAcMan5GlcNAc2 glycoform, for example a recombinant glycoprotein composition comprising predominantly a GlcNAcMan5GlcNAc2 glycoform. U.S. Pat. No. 7,029,872, U.S. Pat. No. 7,449,308, and U.S. Published Patent Application No. 2005/0170452, the disclosures of which are all incorporated herein by reference, disclose lower eukaryote host cells capable of producing a glycoprotein comprising a GlcNAcMan5GlcNAc2 glycoform. The glycoprotein produced in the above cells can be treated in vitro with a hexosaminidase to produce a recombinant glycoprotein comprising a Man5GlcNAc2 glycoform.
[0171] In a further embodiment, the immediately preceding host cell further includes a mannosidase II catalytic domain fused to a cellular targeting signal peptide not normally associated with the catalytic domain and selected to target mannosidase II activity to the ER or Golgi apparatus of the host cell. Passage of the recombinant glycoprotein through the ER or Golgi apparatus of the host cell produces a recombinant glycoprotein comprising a GlcNAcMan3GlcNAc2 glycoform, for example a recombinant glycoprotein composition comprising predominantly a GlcNAcMan3GlcNAc2 glycoform. U.S. Pat. No. 7,029,872 and U.S. Pat. No. 7,625,756, the disclosures of which are all incorporated herein by reference, disclose lower eukaryote host cells that express mannosidase II enzymes and are capable of producing glycoproteins having predominantly a GlcNAc2Man3GlcNAc2 glycoform. The glycoprotein produced in the above cells can be treated in vitro with a hexosaminidase to produce a recombinant glycoprotein comprising a Man3GlcNAc2 glycoform.
[0172] In a further embodiment, the immediately preceding host cell further includes an N-acetylglucosaminyltransferase II (GlcNAc transferase II or GnT II) catalytic domain fused to a cellular targeting signal peptide not normally associated with the catalytic domain and selected to target GlcNAc transferase II activity to the ER or Golgi apparatus of the host cell. Passage of the recombinant glycoprotein through the ER or Golgi apparatus of the host cell produces a recombinant glycoprotein comprising a GlcNAc2Man3GlcNAc2 glycoform, for example a recombinant glycoprotein composition comprising predominantly a GlcNAc2Man3GlcNAc2 glycoform. U.S. Pat. Nos. 7,029,872 and 7,449,308 and U.S. Published Patent Application No. 2005/0170452, the disclosures of which are all incorporated herein by reference, disclose lower eukaryote host cells capable of producing a glycoprotein comprising a GlcNAc2Man3GlcNAc2 glycoform. The glycoprotein produced in the above cells can be treated in vitro with a hexosaminidase to produce a recombinant glycoprotein comprising a Man3GlcNAc2 glycoform.
[0173] In a further embodiment, the immediately preceding host cell further includes a galactosyltransferase catalytic domain fused to a cellular targeting signal peptide not normally associated with the catalytic domain and selected to target galactosyltransferase activity to the ER or Golgi apparatus of the host cell. Passage of the recombinant glycoprotein through the ER or Golgi apparatus of the host cell produces a recombinant glycoprotein comprising a GalGlcNAc2Man3GlcNAc2 or Gal2GlcNAc2Man3GlcNAc2 glycoform, or mixture thereof, for example a recombinant glycoprotein composition comprising predominantly a GalGlcNAc2Man3GlcNAc2 glycoform or Gal2GlcNAc2Man3GlcNAc2 glycoform or mixture thereof. U.S. Pat. No. 7,029,872 and U.S. Published Patent Application No. 2006/0040353, the disclosures of which are incorporated herein by reference, disclose lower eukaryote host cells capable of producing a glycoprotein comprising a Gal2GlcNAc2Man3GlcNAc2 glycoform. The glycoprotein produced in the above cells can be treated in vitro with a galactosidase to produce a recombinant glycoprotein comprising a GlcNAc2Man3GlcNAc2 glycoform, for example a recombinant glycoprotein composition comprising predominantly a GlcNAc2Man3GlcNAc2 glycoform.
[0174] In a further embodiment, the immediately preceding host cell further includes a sialyltransferase catalytic domain fused to a cellular targeting signal peptide not normally associated with the catalytic domain and selected to target sialyltransferase activity to the ER or Golgi apparatus of the host cell. Passage of the recombinant glycoprotein through the ER or Golgi apparatus of the host cell produces a recombinant glycoprotein comprising predominantly a NANA2Gal2GlcNAc2Man3GlcNAc2 glycoform or NANAGal2GlcNAc2Man3GlcNAc2 glycoform or mixture thereof. For lower eukaryote host cells such as yeast and filamentous fungi, it is useful that the host cell further include a means for providing CMP-sialic acid for transfer to the N-glycan. U.S. Published Patent Application No. 2005/0260729, the disclosure of which is incorporated herein by reference, discloses a method for genetically engineering lower eukaryotes to have a CMP-sialic acid synthesis pathway and U.S. Published Patent Application No. 2006/0286637, the disclosure of which is incorporated herein by reference, discloses a method for genetically engineering lower eukaryotes to produce sialylated glycoproteins. The glycoprotein produced in the above cells can be treated in vitro with a neuraminidase to produce a recombinant glycoprotein comprising predominantly a Gal2GlcNAc2Man3GlcNAc2 glycoform or GalGlcNAc2Man3GlcNAc2 glycoform or mixture thereof.
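The stepwise engineering described in the preceding embodiments, in which each host cell builds on the immediately preceding one, can be summarized as an ordered lookup. The sketch below is only an illustrative restatement of those paragraphs; the data structure, the plain-text enzyme labels, and the function name are assumptions made here and are not part of the specification.

```python
# Ordered (enzyme activity added, resulting predominant glycoform) pairs,
# restating the sequence of embodiments in the text. Each step assumes that
# all earlier activities are already present in the host cell.
ENGINEERING_STEPS = [
    ("alpha-1,2-mannosidase", "Man5GlcNAc2"),
    ("GnT I", "GlcNAcMan5GlcNAc2"),
    ("mannosidase II", "GlcNAcMan3GlcNAc2"),
    ("GnT II", "GlcNAc2Man3GlcNAc2"),
    ("galactosyltransferase",
     "GalGlcNAc2Man3GlcNAc2 and/or Gal2GlcNAc2Man3GlcNAc2"),
    ("sialyltransferase",
     "NANAGal2GlcNAc2Man3GlcNAc2 and/or NANA2Gal2GlcNAc2Man3GlcNAc2"),
]


def predominant_glycoform(steps_completed: int) -> str:
    """Return the glycoform expected after the first N engineering steps."""
    if not 1 <= steps_completed <= len(ENGINEERING_STEPS):
        raise ValueError(f"steps_completed must be 1..{len(ENGINEERING_STEPS)}")
    return ENGINEERING_STEPS[steps_completed - 1][1]


for number, (enzyme, glycoform) in enumerate(ENGINEERING_STEPS, start=1):
    print(f"step {number}: + {enzyme} -> {glycoform}")
```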
[0175] Any one of the preceding host cells can further include one or more GlcNAc transferases selected from the group consisting of GnT III, GnT IV, GnT V, GnT VI, and GnT IX to produce glycoproteins having bisected (GnT III) and/or multiantennary (GnT IV, V, VI, and IX) N-glycan structures such as disclosed in U.S. Pat. No. 7,598,055 and U.S. Published Patent Application No. 2007/0037248, the disclosures of which are all incorporated herein by reference.
[0176] In further embodiments, the host cell that produces glycoproteins that have predominantly GlcNAcMan5GlcNAc2 N-glycans further includes a galactosyltransferase catalytic domain fused to a cellular targeting signal peptide not normally associated with the catalytic domain and selected to target galactosyltransferase activity to the ER or Golgi apparatus of the host cell. Passage of the recombinant glycoprotein through the ER or Golgi apparatus of the host cell produces a recombinant glycoprotein comprising predominantly the GalGlcNAcMan5GlcNAc2 glycoform.
[0177] In a further embodiment, the immediately preceding host cell that produces glycoproteins that have predominantly the GalGlcNAcMan5GlcNAc2 N-glycans further includes a sialyltransferase catalytic domain fused to a cellular targeting signal peptide not normally associated with the catalytic domain and selected to target sialyltransferase activity to the ER or Golgi apparatus of the host cell. Passage of the recombinant glycoprotein through the ER or Golgi apparatus of the host cell produces a recombinant glycoprotein comprising a NANAGalGlcNAcMan5GlcNAc2 glycoform.
[0178] In further aspects of any one of the aforementioned host cells, the host cell is further modified to include a fucosyltransferase and a pathway for producing fucose and transporting fucose into the ER or Golgi. Examples of methods for modifying Pichia pastoris to render it capable of producing glycoproteins in which one or more of the N-glycans thereon are fucosylated are disclosed in Published International Application No. WO 2008112092, the disclosure of which is incorporated herein by reference. In particular aspects of the invention, the Pichia pastoris host cell is further modified to include a fucosylation pathway comprising a GDP-mannose-4,6-dehydratase, GDP-keto-deoxy-mannose-epimerase/GDP-keto-deoxy-galactose-reductase, GDP-fucose transporter, and a fucosyltransferase. In particular aspects, the fucosyltransferase is selected from the group consisting of α1,2-fucosyltransferase, α1,3-fucosyltransferase, α1,4-fucosyltransferase, and α1,6-fucosyltransferase.
[0179] Various of the preceding host cells further include one or more sugar transporters such as UDP-GlcNAc transporters (for example, Kluyveromyces lactis and Mus musculus UDP-GlcNAc transporters), UDP-galactose transporters (for example, Drosophila melanogaster UDP-galactose transporter), and CMP-sialic acid transporter (for example, human sialic acid transporter). Because lower eukaryote host cells such as yeast and filamentous fungi lack the above transporters, it is preferable that lower eukaryote host cells such as yeast and filamentous fungi be genetically engineered to include the above transporters.
[0180] Host cells further include Pichia pastoris that are genetically engineered to eliminate glycoproteins having phosphomannose residues by deleting or disrupting one or both of the phosphomannosyltransferase genes PNO1 and MNN4B (See for example, U.S. Pat. Nos. 7,198,921 and 7,259,007; the disclosures of which are all incorporated herein by reference), which in further aspects can also include deleting or disrupting the MNN4A gene. Disruption includes disrupting the open reading frame encoding the particular enzymes or disrupting expression of the open reading frame or abrogating translation of RNAs encoding one or more of the β-mannosyltransferases and/or phosphomannosyltransferases using interfering RNA, antisense RNA, or the like. The host cells can further include any one of the aforementioned host cells modified to produce particular N-glycan structures.
[0181] Host cells further include lower eukaryote cells (e.g., yeast such as Pichia pastoris) that are genetically modified to control O-glycosylation of the glycoprotein by deleting or disrupting one or more of the protein O-mannosyltransferase (Dol-P-Man:Protein (Ser/Thr) Mannosyl Transferase genes) (PMTs) (See U.S. Pat. No. 5,714,377; the disclosure of which is incorporated herein by reference) or grown in the presence of Pmtp inhibitors and/or an α1,2 mannosidase as disclosed in Published International Application No. WO 2007061631 (the disclosure of which is incorporated herein by reference), or both. Disruption includes disrupting the open reading frame encoding the Pmtp or disrupting expression of the open reading frame or abrogating translation of RNAs encoding one or more of the Pmtps using interfering RNA, antisense RNA, or the like. The host cells can further include any one of the aforementioned host cells modified to produce particular N-glycan structures.
[0182] Pmtp inhibitors include but are not limited to benzylidene thiazolidinediones. Examples of benzylidene thiazolidinediones that can be used are 5-[[3,4-bis(phenylmethoxy)phenyl]methylene]-4-oxo-2-thioxo-3-thiazolidineacetic Acid; 5-[[3-(1-Phenylethoxy)-4-(2-phenylethoxy)]phenyl]methylene]-4-oxo-2-thioxo-3-thiazolidineacetic Acid; and 5-[[3-(1-Phenyl-2-hydroxy)ethoxy)-4-(2-phenylethoxy)]phenyl]methylene]-4-oxo-2-thioxo-3-thiazolidineacetic Acid.
[0183] In particular embodiments, the function or expression of at least one endogenous PMT gene is reduced, disrupted, or deleted. For example, in particular embodiments the function or expression of at least one endogenous PMT gene selected from the group consisting of the PMT1, PMT2, PMT3, and PMT4 genes is reduced, disrupted, or deleted; or the host cells are cultivated in the presence of one or more PMT inhibitors. In further embodiments, the host cells include one or more PMT gene deletions or disruptions and the host cells are cultivated in the presence of one or more Pmtp inhibitors. In particular aspects of these embodiments, the host cells also express a secreted α-1,2-mannosidase.
[0184] PMT deletions or disruptions and/or Pmtp inhibitors control O-glycosylation by reducing O-glycosylation occupancy; that is by reducing the total number of O-glycosylation sites on the glycoprotein that are glycosylated. The further addition of an α-1,2-mannosidase that is secreted by the cell controls O-glycosylation by reducing the mannose chain length of the O-glycans that are on the glycoprotein. Thus, combining PMT deletions or disruptions and/or Pmtp inhibitors with expression of a secreted α-1,2-mannosidase controls O-glycosylation by reducing occupancy and chain length. In particular circumstances, the particular combination of PMT deletions or disruptions, Pmtp inhibitors, and α-1,2-mannosidase is determined empirically as particular heterologous glycoproteins (antibodies, for example) may be expressed and transported through the Golgi apparatus with different degrees of efficiency and thus may require a particular combination of PMT deletions or disruptions, Pmtp inhibitors, and α-1,2-mannosidase. In another aspect, genes encoding one or more endogenous mannosyltransferase enzymes are deleted. The deletion(s) can be in combination with providing the secreted α-1,2-mannosidase and/or PMT inhibitors or can be in lieu of providing the secreted α-1,2-mannosidase and/or PMT inhibitors.
[0185] Thus, the control of O-glycosylation can be useful for producing particular glycoproteins in the host cells disclosed herein in better total yield or in yield of properly assembled glycoprotein. The reduction or elimination of O-glycosylation appears to have a beneficial effect on the assembly and transport of glycoproteins such as whole antibodies as they traverse the secretory pathway and are transported to the cell surface. Thus, in cells in which O-glycosylation is controlled, the yield of properly assembled glycoproteins such as antibody fragments is increased over the yield obtained in host cells in which O-glycosylation is not controlled.
[0186] To reduce or eliminate the likelihood of N-glycans and O-glycans with β-linked mannose residues, which are resistant to α-mannosidases, the recombinant glycoengineered Pichia pastoris host cells are genetically engineered to eliminate glycoproteins having α-mannosidase-resistant N-glycans by deleting or disrupting one or more of the β-mannosyltransferase genes (e.g., BMT1, BMT2, BMT3, and BMT4) (See, U.S. Pat. No. 7,465,577 and U.S. Pat. No. 7,713,719). The deletion or disruption of BMT2 and one or more of BMT1, BMT3, and BMT4 also reduces or eliminates detectable cross reactivity to antibodies against host cell protein.
[0187] Yield of glycoprotein can in some situations be improved by overexpressing nucleic acid molecules encoding mammalian or human chaperone proteins or replacing the genes encoding one or more endogenous chaperone proteins with nucleic acid molecules encoding one or more mammalian or human chaperone proteins. In addition, the expression of mammalian or human chaperone proteins in the host cell also appears to control O-glycosylation in the cell. Thus, further included are the host cells herein wherein the function of at least one endogenous gene encoding a chaperone protein has been reduced or eliminated, and a vector encoding at least one mammalian or human homolog of the chaperone protein is expressed in the host cell. Also included are host cells in which the endogenous host cell chaperones and the mammalian or human chaperone proteins are expressed. In further aspects, the lower eukaryotic host cell is a yeast or filamentous fungus host cell. Examples of host cells in which human chaperone proteins are introduced to improve the yield and reduce or control O-glycosylation of recombinant proteins have been disclosed in Published International Application Nos. WO 2009105357 and WO2010019487 (the disclosures of which are incorporated herein by reference). As above, further included are lower eukaryotic host cells wherein, in addition to replacing the genes encoding one or more of the endogenous chaperone proteins with nucleic acid molecules encoding one or more mammalian or human chaperone proteins or overexpressing one or more mammalian or human chaperone proteins as described above, the function or expression of at least one endogenous gene encoding a protein O-mannosyltransferase (PMT) protein is reduced, disrupted, or deleted. In particular embodiments, the function of at least one endogenous PMT gene selected from the group consisting of the PMT1, PMT2, PMT3, and PMT4 genes is reduced, disrupted, or deleted.
[0188] Therefore, the methods disclosed herein can use any host cell that has been genetically modified to produce glycoproteins wherein the predominant N-glycan is selected from the group consisting of complex N-glycans, hybrid N-glycans, and high mannose N-glycans, wherein complex N-glycans may be selected from the group consisting of GlcNAc(2-4)Man3GlcNAc2, Gal(1-4)GlcNAc(2-4)Man3GlcNAc2, and NANA(1-4)Gal(1-4)GlcNAc(2-4)Man3GlcNAc2; hybrid N-glycans may be selected from the group consisting of GlcNAcMan3GlcNAc2, GalGlcNAcMan3GlcNAc2, NANAGalGlcNAcMan3GlcNAc2, GlcNAcMan5GlcNAc2, GalGlcNAcMan5GlcNAc2, and NANAGalGlcNAcMan5GlcNAc2; and high mannose N-glycans may be selected from the group consisting of Man5GlcNAc2, Man6GlcNAc2, Man7GlcNAc2, Man8GlcNAc2, and Man9GlcNAc2. Further included are glycoproteins having N-glycans consisting of the N-glycan structure Man3GlcNAc2, for example, as shown in U.S. Published Application No. 20050170452.
[0189] Therefore, provided is a method for producing a heterologous glycoprotein with mammalian- or human-like complex or hybrid N-glycans in a lower eukaryote host cell, comprising providing a lower eukaryote host cell that is genetically engineered to produce glycoproteins that have human-like N-glycans and includes a nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase and a nucleic acid molecule encoding the heterologous glycoprotein; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein.
[0190] In a further aspect of the above, provided is a method for producing a heterologous glycoprotein with mammalian- or human-like complex or hybrid N-glycans in a yeast or filamentous fungus host cell, comprising providing a yeast or filamentous fungus host cell that is genetically engineered to produce glycoproteins that have human-like N-glycans and includes a nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase and a nucleic acid molecule encoding the heterologous glycoprotein; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein.
[0191] Further provided is a yeast or filamentous fungus host cell genetically engineered to produce glycoproteins having mammalian- or human-like N-glycans, comprising a first nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase; and a second nucleic acid molecule encoding a heterologous glycoprotein; and wherein the endogenous host cell genes encoding the proteins comprising the oligosaccharyltransferase (OTase) complex are expressed. This includes expression of the endogenous STT3 gene, which in yeast is the STT3 gene.
[0192] In general, in the above methods and host cells, the single-subunit oligosaccharyltransferase is capable of functionally suppressing the lethal phenotype of a mutation of at least one essential protein of the OTase complex. In further aspects, the essential protein of the OTase complex is encoded by the STT3 locus, WBP1 locus, OST1 locus, SWP1 locus, or OST2 locus, or homologue thereof. In further aspects, the single-subunit oligosaccharyltransferase is, for example, the Leishmania major STT3D protein.
[0193] Promoters are DNA sequence elements for controlling gene expression. In particular, promoters specify transcription initiation sites and can include a TATA box and upstream promoter elements. The promoters selected are those which would be expected to be operable in the particular host system selected. For example, yeast promoters are used when a yeast such as Saccharomyces cerevisiae, Kluyveromyces lactis, Ogataea minuta, or Pichia pastoris is the host cell whereas fungal promoters would be used in host cells such as Aspergillus niger, Neurospora crassa, or Trichoderma reesei. Examples of yeast promoters include but are not limited to the GAPDH, AOX1, SEC4, HH1, PMA1, OCH1, GAL1, PGK, GAP, TPI, CYC1, ADH2, PHO5, CUP1, MFα1, FLD1, PDI, TEF, RPL10, and GUT1 promoters. Romanos et al., Yeast 8: 423-488 (1992) provide a review of yeast promoters and expression vectors. Hartner et al., Nucleic Acids Res. 36: e76 (pub on-line 6 Jun. 2008) describe a library of promoters for fine-tuned expression of heterologous proteins in Pichia pastoris.
[0194] The promoters that are operably linked to the nucleic acid molecules disclosed herein can be constitutive promoters or inducible promoters. An inducible promoter, for example the AOX1 promoter, is a promoter that directs transcription at an increased or decreased rate upon binding of a transcription factor in response to an inducer. Transcription factors as used herein include any factor that can bind to a regulatory or control region of a promoter and thereby affect transcription. The RNA synthesis or the promoter binding ability of a transcription factor within the host cell can be controlled by exposing the host to an inducer or removing an inducer from the host cell medium. Accordingly, to regulate expression of an inducible promoter, an inducer is added or removed from the growth medium of the host cell. Such inducers can include sugars, phosphate, alcohol, metal ions, hormones, heat, cold and the like. For example, commonly used inducers in yeast are glucose, galactose, alcohol, and the like.
[0195] Transcription termination sequences that are selected are those that are operable in the particular host cell selected. For example, yeast transcription termination sequences are used in expression vectors when a yeast host cell such as Saccharomyces cerevisiae, Kluyveromyces lactis, or Pichia pastoris is the host cell whereas fungal transcription termination sequences would be used in host cells such as Aspergillus niger, Neurospora crassa, or Trichoderma reesei. Transcription termination sequences include but are not limited to the Saccharomyces cerevisiae CYC transcription termination sequence (ScCYC TT), the Pichia pastoris ALG3 transcription termination sequence (ALG3 TT), the Pichia pastoris ALG6 transcription termination sequence (ALG6 TT), the Pichia pastoris ALG12 transcription termination sequence (ALG12 TT), the Pichia pastoris AOX1 transcription termination sequence (AOX1 TT), the Pichia pastoris OCH1 transcription termination sequence (OCH1 TT), and the Pichia pastoris PMA1 transcription termination sequence (PMA1 TT). Other transcription termination sequences can be found in the examples and in the art.
[0196] For genetically engineering yeast, selectable markers that can be used to construct the recombinant host cells include drug resistance markers and genetic functions which allow the yeast host cell to synthesize essential cellular nutrients, e.g. amino acids. Drug resistance markers which are commonly used in yeast include chloramphenicol, kanamycin, methotrexate, G418 (geneticin), Zeocin, and the like. Genetic functions which allow the yeast host cell to synthesize essential cellular nutrients are used with available yeast strains having auxotrophic mutations in the corresponding genomic function. Common yeast selectable markers provide genetic functions for synthesizing leucine (LEU2), tryptophan (TRP1 and TRP2), proline (PRO1), uracil (URA3, URA5, URA6), histidine (HIS3), lysine (LYS2), adenine (ADE1 or ADE2), and the like. Other yeast selectable markers include the ARR3 gene from S. cerevisiae, which confers arsenite resistance to yeast cells that are grown in the presence of arsenite (Bobrowicz et al., Yeast, 13:819-828 (1997); Wysocki et al., J. Biol. Chem. 272:30061-30066 (1997)). Suitable integration sites include those enumerated in U.S. Pat. No. 7,479,389 (the disclosure of which is incorporated herein by reference) and include homologs to loci known for Saccharomyces cerevisiae and other yeast or fungi. Methods for integrating vectors into yeast are well known (See for example, U.S. Pat. No. 7,479,389, U.S. Pat. No. 7,514,253, U.S. Published Application No. 2009012400, and WO2009/085135; the disclosures of which are all incorporated herein by reference). Examples of insertion sites include, but are not limited to, Pichia ADE genes; Pichia TRP (including TRP1 through TRP2) genes; Pichia MCA genes; Pichia CYM genes; Pichia PEP genes; Pichia PRB genes; and Pichia LEU genes. The Pichia ADE1 and ARG4 genes have been described in Lin Cereghino et al., Gene 263:159-169 (2001) and U.S. Pat. No. 4,818,700 (the disclosure of which is incorporated herein by reference), the HIS3 and TRP1 genes have been described in Cosano et al., Yeast 14:861-867 (1998), and HIS4 has been described in GenBank Accession No. X56180.
[0197] The methods disclosed herein can be adapted for use in mammalian, plant, and insect cells. Examples of animal cells include, but are not limited to, SC-I cells, LLC-MK cells, CV-I cells, CHO cells, COS cells, murine cells, human cells, HeLa cells, 293 cells, VERO cells, MDBK cells, MDCK cells, CRFK cells, RAF cells, TCMK cells, LLC-PK cells, PK15 cells, WI-38 cells, MRC-5 cells, T-FLY cells, BHK cells, SP2/0, NSO cells, and derivatives thereof. Insect cells include cells of Drosophila melanogaster origin. These cells can be genetically engineered to render the cells capable of making immunoglobulins that have particular or predominantly particular N-glycans. For example, U.S. Pat. No. 6,949,372 discloses methods for making glycoproteins in insect cells that are sialylated. Yamane-Ohnuki et al. Biotechnol. Bioeng. 87: 614-622 (2004), Kanda et al., Biotechnol. Bioeng. 94: 680-688 (2006), Kanda et al., Glycobiol. 17: 104-118 (2006), and U.S. Pub. Application Nos. 2005/0216958 and 2007/0020260 (the disclosures of which are incorporated herein by reference) disclose mammalian cells that are capable of producing immunoglobulins in which the N-glycans thereon lack fucose or have reduced fucose. U.S. Published Patent Application No. 2005/0074843 (the disclosure of which is incorporated herein by reference) discloses making antibodies in mammalian cells that have bisected N-glycans.
[0198] The regulatable promoters selected for regulating expression of the expression cassettes in mammalian, insect, or plant cells should be selected for functionality in the cell-type chosen. Examples of suitable regulatable promoters include but are not limited to the tetracycline-regulatable promoters (See for example, Berens & Hillen, Eur. J. Biochem. 270: 3109-3121 (2003)), RU 486-inducible promoters, ecdysone-inducible promoters, and kanamycin-regulatable systems. These promoters can replace the promoters exemplified in the expression cassettes described in the examples. The capture moiety can be fused to a cell surface anchoring protein suitable for use in the cell-type chosen. Cell surface anchoring proteins including GPI proteins are well known for mammalian, insect, and plant cells. GPI-anchored fusion proteins have been described by Kennard et al., Methods Biotechnol. Vol. 8: Animal Cell Biotechnology (Ed. Jenkins. Humana Press, Inc., Totowa, N.J.) pp. 187-200 (1999). The genome targeting sequences for integrating the expression cassettes into the host cell genome for making stable recombinants can replace the genome targeting and integration sequences exemplified in the examples. Transfection methods for making stable and transiently transfected mammalian, insect, and plant host cells are well known in the art. Once the transfected host cells have been constructed as disclosed herein, the cells can be screened for expression of the immunoglobulin of interest and selected as disclosed herein.
[0199] Therefore, in a further aspect of the above, provided is a method for producing a heterologous glycoprotein in a mammalian or insect host cell, comprising providing a mammalian or insect host cell that includes a nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase (e.g., Leishmania major STT3 protein) and a nucleic acid molecule encoding the heterologous glycoprotein; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein. In further aspects, the host cell is genetically engineered to produce glycoproteins with human-like N-glycans or N-glycans not normally endogenous to the host cell.
[0200] In a further aspect of the above, provided is a method for producing a heterologous glycoprotein wherein the N-glycosylation site occupancy of the heterologous glycoprotein is greater than 83% in a mammalian or insect host cell, comprising providing a mammalian or insect host cell that includes a nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase (e.g., Leishmania major STT3 protein) and a nucleic acid molecule encoding the heterologous glycoprotein; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein wherein the N-glycosylation site occupancy of the heterologous glycoprotein is greater than 83%. In further aspects, the host cell is genetically engineered to produce glycoproteins with human-like N-glycans or N-glycans not normally endogenous to the host cell.
[0201] In a further embodiment of the above methods, the endogenous host cell genes encoding the proteins comprising the oligosaccharyltransferase (OTase) complex are expressed.
[0202] In particular embodiments of the above methods, the N-glycosylation site occupancy is at least 94%. In further still embodiments, the N-glycosylation site occupancy is at least 99%.
[0203] Further provided is a mammalian or insect host cell, comprising a first nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase (e.g., the Leishmania major STT3D protein); and a second nucleic acid molecule encoding a heterologous glycoprotein; and wherein the endogenous host cell genes encoding the proteins comprising the endogenous host cell oligosaccharyltransferase (OTase) complex are expressed.
[0204] In particular embodiments, the higher eukaryote cell, tissue, or organism can also be from the plant kingdom, for example, wheat, rice, corn, tobacco, and the like. Alternatively, bryophyte cells can be selected, for example from species of the genera Physcomitrella, Funaria, Sphagnum, Ceratodon, Marchantia, and Sphaerocarpos. Exemplary of plant cells is the bryophyte cell of Physcomitrella patens, which has been disclosed in WO 2004/057002 and WO2008/006554 (the disclosures of which are all incorporated herein by reference). Expression systems using plant cells can be further manipulated to have altered glycosylation pathways to enable the cells to produce immunoglobulins that have predominantly particular N-glycans. For example, the cells can be genetically engineered to have a dysfunctional or no core fucosyltransferase and/or a dysfunctional or no xylosyltransferase, and/or a dysfunctional or no β1,4-galactosyltransferase. Alternatively, the galactose, fucose and/or xylose can be removed from the immunoglobulin by treatment with enzymes removing the residues. Any enzyme known in the art that releases galactose, fucose and/or xylose residues from N-glycans can be used, for example α-galactosidase, β-xylosidase, and α-fucosidase. Alternatively, an expression system can be used which synthesizes modified N-glycans which cannot be used as substrates by 1,3-fucosyltransferase and/or 1,2-xylosyltransferase, and/or 1,4-galactosyltransferase. Methods for modifying glycosylation pathways in plant cells are disclosed in U.S. Pat. Nos. 7,449,308, 6,998,267 and 7,388,081 (the disclosures of which are incorporated herein by reference) which disclose methods for genetically engineering plants to make recombinant glycoproteins that have human-like N-glycans. WO 2008006554 (the disclosure of which is incorporated herein by reference) discloses methods for making glycoproteins such as antibodies in plants genetically engineered to make glycoproteins without xylose or fucose. WO 2007006570 (the disclosure of which is incorporated herein by reference) discloses methods for genetically engineering bryophytes, ciliates, algae, and yeast to make glycoproteins that have animal or human-like glycosylation patterns.
[0205] Therefore, in a further aspect of the above, provided is a method for producing a heterologous glycoprotein with mammalian- or human-like complex or hybrid N-glycans in a plant host cell, comprising providing a plant host cell that is genetically engineered to produce glycoproteins that have mammalian- or human-like N-glycans and includes a nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase (e.g., the Leishmania major STT3D protein) and a nucleic acid molecule encoding the heterologous glycoprotein; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein.
[0206] In a further aspect of the above, provided is a method for producing a heterologous glycoprotein with mammalian- or human-like complex or hybrid N-glycans wherein the N-glycosylation site occupancy of the heterologous glycoprotein is greater than 83% in a plant host cell, comprising providing a plant host cell that is genetically engineered to produce glycoproteins that have mammalian- or human-like N-glycans and includes a nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase (e.g., the Leishmania major STT3D protein) and a nucleic acid molecule encoding the heterologous glycoprotein; and culturing the host cell under conditions for expressing the heterologous glycoprotein to produce the heterologous glycoprotein with mammalian- or human-like N-glycans wherein the N-glycosylation site occupancy of the heterologous glycoprotein is greater than 83%.
[0207] In a further embodiment of the above methods, the endogenous host cell genes encoding the proteins comprising the endogenous host cell oligosaccharyltransferase (OTase) complex are expressed.
[0208] In particular embodiments of the above methods, the N-glycosylation site occupancy is at least 94%. In further still embodiments, the N-glycosylation site occupancy is at least 99%.
[0209] Further provided is a plant host cell, comprising a first nucleic acid molecule encoding a heterologous single-subunit oligosaccharyltransferase (e.g., the Leishmania major STT3D protein); and a second nucleic acid molecule encoding a heterologous glycoprotein; and wherein the endogenous host cell genes encoding the proteins comprising the endogenous host cell oligosaccharyltransferase (OTase) complex are expressed.
[0210] The host cells and methods herein are useful for producing a wide range of recombinant proteins and glycoproteins. Examples of recombinant proteins and glycoproteins that can be produced in the host cells disclosed herein include but are not limited to erythropoietin (EPO); cytokines such as interferon α, interferon β, interferon γ, and interferon ω; granulocyte-colony stimulating factor (GCSF); granulocyte macrophage-colony stimulating factor (GM-CSF); coagulation factors such as factor VIII, factor IX, and human protein C; antithrombin III; thrombin; soluble IgE receptor α-chain; immunoglobulins or antibodies such as IgG, IgG fragments, IgG fusions, and IgM; immunoadhesins and other Fc fusion proteins such as soluble TNF receptor-Fc fusion proteins; RAGE-Fc fusion proteins; interleukins; urokinase; chymase; urea trypsin inhibitor; IGF-binding protein; epidermal growth factor; growth hormone-releasing factor; annexin V fusion protein; angiostatin; vascular endothelial growth factor-2; myeloid progenitor inhibitory factor-1; osteoprotegerin; α-1-antitrypsin; α-fetoproteins; DNase II; kringle 3 of human plasminogen; glucocerebrosidase; TNF binding protein 1; follicle stimulating hormone; cytotoxic T lymphocyte associated antigen 4-Ig; transmembrane activator and calcium modulator and cyclophilin ligand; glucagon like protein 1; and IL-2 receptor agonist.
[0211] The recombinant host cells and methods disclosed herein are particularly useful for producing antibodies, Fc fusion proteins, and the like where it is desirable to provide antibody or Fc fusion protein compositions wherein the percent galactose-containing N-glycans is increased compared to the percent galactose obtainable in the host cells prior to modification as taught herein. Examples of antibodies that can be made in the host cells herein include but are not limited to human antibodies, humanized antibodies, chimeric antibodies, and heavy chain antibodies (e.g., camel or llama). Specific antibodies include but are not limited to the following antibodies recited under their generic name (target): Muromonab-CD3 (anti-CD3 receptor antibody), Abciximab (anti-CD41 7E3 antibody), Rituximab (anti-CD20 antibody), Daclizumab (anti-CD25 antibody), Basiliximab (anti-CD25 antibody), Palivizumab (anti-RSV (respiratory syncytial virus) antibody), Infliximab (anti-TNFα antibody), Trastuzumab (anti-Her2 antibody), Gemtuzumab ozogamicin (anti-CD33 antibody), Alemtuzumab (anti-CD52 antibody), Ibritumomab tiuxetan (anti-CD20 antibody), Adalimumab (anti-TNFα antibody), Omalizumab (anti-IgE antibody), Tositumomab-131I (iodinated derivative of an anti-CD20 antibody), Efalizumab (anti-CD11a antibody), Cetuximab (anti-EGF receptor antibody), Golimumab (anti-TNFα antibody), Bevacizumab (anti-VEGF-A antibody), and variants thereof. Examples of Fc-fusion proteins that can be made in the host cells disclosed herein include but are not limited to etanercept (TNFR-Fc fusion protein), FGF-21-Fc fusion proteins, GLP-1-Fc fusion proteins, RAGE-Fc fusion proteins, EPO-Fc fusion proteins, ActRIIA-Fc fusion proteins, ActRIIB-Fc fusion proteins, glucagon-Fc fusions, oxyntomodulin-Fc-fusions, and analogs and variants thereof.
[0212] Thus, the methods and host cells herein can be used to produce glycoprotein compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the N-glycosylation sites of the glycoproteins in the composition are occupied and the glycoproteins have mammalian- or human-like N-glycans.
[0213] Further, the methods and host cells herein can be used to produce glycoprotein compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the N-glycosylation sites of the glycoproteins in the composition are occupied and the glycoproteins have mammalian- or human-like N-glycans that lack fucose.
[0214] Further, the methods and yeast or filamentous fungus host cells genetically engineered to produce mammalian-like or human-like N-glycans can be used to produce glycoprotein compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the N-glycosylation sites of the glycoproteins in the composition are occupied and the glycoproteins have mammalian- or human-like N-glycans that lack fucose.
[0215] In some aspects, the yeast or filamentous fungus host cells genetically engineered to produce fucosylated mammalian- or human-like N-glycans can be used to produce glycoprotein compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the N-glycosylation sites of the glycoproteins in the composition are occupied and the glycoproteins have mammalian- or human-like N-glycans that have fucose.
[0216] The recombinant cells disclosed herein can be used to produce antibodies and Fc fragments suitable for chemically conjugating to a heterologous peptide or drug molecule. For example, WO2005047334, WO2005047336, WO2005047337, and WO2006107124 (the disclosures of which are incorporated herein by reference) disclose chemically conjugating peptides or drug molecules to Fc fragments. EP1180121, EP1105409, and U.S. Pat. No. 6,593,295 (the disclosures of which are incorporated herein by reference) disclose chemically conjugating peptides and the like to blood components, which includes whole antibodies.
[0217] Thus, the methods and host cells herein can be used to produce antibody compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the antibody molecules in the composition have both N-glycosylation sites occupied and the antibodies have mammalian- or human-like N-glycans.
[0218] Further, the methods and host cells herein can be used to produce antibody compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the antibody molecules in the composition have both N-glycosylation sites occupied and the antibodies have mammalian- or human-like N-glycans that lack fucose.
[0219] Further, the methods and yeast or filamentous fungus host cells genetically engineered to produce mammalian-like or human-like N-glycans can be used to produce antibody compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the antibody molecules in the composition have both N-glycosylation sites occupied and the antibodies have mammalian- or human-like N-glycans that lack fucose.
[0220] In some aspects, the yeast or filamentous fungus host cells genetically engineered to produce fucosylated mammalian- or human-like N-glycans can be used to produce antibody compositions in which at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the antibody molecules in the composition have both N-glycosylation sites occupied and the antibodies have mammalian- or human-like N-glycans that have fucose.
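As an illustration only, and not part of the disclosed methods, the following minimal Python sketch shows how the fraction of antibody molecules having both N-glycosylation sites occupied differs from the per-site occupancy. It assumes each antibody molecule has two N-glycosylation sites, as the preceding paragraphs do. The function names and the molecule counts are hypothetical; such counts might come from any assay that resolves molecules carrying zero, one, or two occupied sites.

```python
def antibody_occupancy(n_zero, n_one, n_two):
    """Return (both_fraction, per_site_fraction) for a population of antibodies.

    n_zero, n_one, n_two: hypothetical counts (or relative abundances) of
    antibody molecules with 0, 1, or 2 occupied N-glycosylation sites.
    Assumes exactly two N-glycosylation sites per antibody molecule.
    """
    total = n_zero + n_one + n_two
    if total <= 0:
        raise ValueError("at least one molecule must be counted")
    # Fraction of molecules in which both sites carry an N-glycan.
    both_fraction = n_two / total
    # Fraction of all sites (two per molecule) that carry an N-glycan.
    per_site_fraction = (n_one + 2 * n_two) / (2 * total)
    return both_fraction, per_site_fraction


def meets_threshold(both_fraction, threshold_percent):
    """True if the both-sites-occupied percentage is at least the threshold."""
    return both_fraction * 100 >= threshold_percent


# Hypothetical example: 5 unoccupied, 15 half-occupied, 80 fully occupied.
both, per_site = antibody_occupancy(5, 15, 80)
print(both, per_site, meets_threshold(both, 70))
```

The per-site value is never lower than the both-sites value, so a threshold stated in terms of molecules with both sites occupied is the stricter of the two measures.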
[0221] As shown in Example 3, the N-glycosylation composition of antibodies produced in Pichia pastoris strains that have been genetically engineered to make galactose-terminated N-glycans appears to range from about 50-60 mole % G0, 18-24 mole % G1, 3-8 mole % G2, 12-17 mole % Man5, and 3-6 mole % hybrids.
[0223] Therefore, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 70% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 50-70 mole % of the N-glycans have a G0 structure, 15-25 mole % of the N-glycans have a G1 structure, 4-12 mole % of the N-glycans have a G2 structure, 5-17 mole % of the N-glycans have a Man5 structure, and 5-15 mole % of the N-glycans have a hybrid structure, and a pharmaceutically acceptable carrier. Further still, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 70% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 53 to 58 mole % of the N-glycans have a G0 structure, 20-22 mole % of the N-glycans have a G1 structure, and about 16 to 18 mole % of the N-glycans comprise a Man5GlcNAc2 core structure, and a pharmaceutically acceptable carrier. In further aspects of the above, the N-glycans further include fucose.
[0224] Therefore, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 75% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 50-70 mole % of the N-glycans have a G0 structure, 15-25 mole % of the N-glycans have a G1 structure, 4-12 mole % of the N-glycans have a G2 structure, 5-17 mole % of the N-glycans have a Man5 structure, and 5-15 mole % of the N-glycans have a hybrid structure, and a pharmaceutically acceptable carrier. Further still, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 75% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 53 to 58 mole % of the N-glycans have a G0 structure, 20-22 mole % of the N-glycans have a G1 structure, and about 16 to 18 mole % of the N-glycans comprise a Man5GlcNAc2 core structure, and a pharmaceutically acceptable carrier. In further aspects of the above, the N-glycans further include fucose.
[0225] Further still, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 80% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 50-70 mole % of the N-glycans have a G0 structure, 15-25 mole % of the N-glycans have a G1 structure, 4-12 mole % of the N-glycans have a G2 structure, 5-17 mole % of the N-glycans have a Man5 structure, and 5-15 mole % of the N-glycans have a hybrid structure, and a pharmaceutically acceptable carrier. Further still, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 80% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 53 to 58 mole % of the N-glycans have a G0 structure, 20-22 mole % of the N-glycans have a G1 structure, and about 16 to 18 mole % of the N-glycans comprise a Man5GlcNAc2 core structure, and a pharmaceutically acceptable carrier. In further aspects of the above, the N-glycans further include fucose.
[0226] Therefore, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 85% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 50-70 mole % of the N-glycans have a G0 structure, 15-25 mole % of the N-glycans have a G1 structure, 4-12 mole % of the N-glycans have a G2 structure, 5-17 mole % of the N-glycans have a Man5 structure, and 5-15 mole % of the N-glycans have a hybrid structure, and a pharmaceutically acceptable carrier. Further still, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 85% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 53 to 58 mole % of the N-glycans have a G0 structure, 20-22 mole % of the N-glycans have a G1 structure, and about 16 to 18 mole % of the N-glycans comprise a Man5GlcNAc2 core structure, and a pharmaceutically acceptable carrier. In further aspects of the above, the N-glycans further include fucose.
[0227] Further still, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 90% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 50-70 mole % of the N-glycans have a G0 structure, 15-25 mole % of the N-glycans have a G1 structure, 4-12 mole % of the N-glycans have a G2 structure, 5-17 mole % of the N-glycans have a Man5 structure, and 5-15 mole % of the N-glycans have a hybrid structure, and a pharmaceutically acceptable carrier. Further still, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 90% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 53 to 58 mole % of the N-glycans have a G0 structure, 20-22 mole % of the N-glycans have a G1 structure, and about 16 to 18 mole % of the N-glycans comprise a Man5GlcNAc2 core structure, and a pharmaceutically acceptable carrier. In further aspects of the above, the N-glycans further include fucose.
[0228] Therefore, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 95% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 50-70 mole % of the N-glycans have a G0 structure, 15-25 mole % of the N-glycans have a G1 structure, 4-12 mole % of the N-glycans have a G2 structure, 5-17 mole % of the N-glycans have a Man5 structure, and 5-15 mole % of the N-glycans have a hybrid structure, and a pharmaceutically acceptable carrier. Further still, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 95% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 53 to 58 mole % of the N-glycans have a G0 structure, 20-22 mole % of the N-glycans have a G1 structure, and about 16 to 18 mole % of the N-glycans comprise a Man5GlcNAc2 core structure, and a pharmaceutically acceptable carrier. In further aspects of the above, the N-glycans further include fucose.
[0229] Further still, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 98% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 50-70 mole % of the N-glycans have a G0 structure, 15-25 mole % of the N-glycans have a G1 structure, 4-12 mole % of the N-glycans have a G2 structure, 5-17 mole % of the N-glycans have a Man5 structure, and 5-15 mole % of the N-glycans have a hybrid structure, and a pharmaceutically acceptable carrier. Further still, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 98% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 53 to 58 mole % of the N-glycans have a G0 structure, 20-22 mole % of the N-glycans have a G1 structure, and about 16 to 18 mole % of the N-glycans comprise a Man5GlcNAc2 core structure, and a pharmaceutically acceptable carrier. In further aspects of the above, the N-glycans further include fucose.
[0230] Therefore, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 99% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 50-70 mole % of the N-glycans have a G0 structure, 15-25 mole % of the N-glycans have a G1 structure, 4-12 mole % of the N-glycans have a G2 structure, 5-17 mole % of the N-glycans have a Man5 structure, and 5-15 mole % of the N-glycans have a hybrid structure, and a pharmaceutically acceptable carrier. Further still, provided is a glycoprotein composition comprising a plurality of antibodies wherein at least 99% of the antibody molecules in the composition have both N-glycosylation sites occupied and about 53 to 58 mole % of the N-glycans have a G0 structure, 20-22 mole % of the N-glycans have a G1 structure, and about 16 to 18 mole % of the N-glycans comprise a Man5GlcNAc2 core structure, and a pharmaceutically acceptable carrier. In further aspects of the above, the N-glycans further include fucose.
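The composition criteria recited in the preceding paragraphs can be read as a set of range checks. The following Python sketch, offered only as an illustration, tests a glycan profile against the first set of mole % ranges recited above together with a minimum occupancy. The names `GLYCAN_RANGES` and `meets_composition` and the example profile are hypothetical; the word "about" in the recited ranges is not modeled, so the bounds are treated as hard limits, which is a simplifying assumption.

```python
from typing import Dict, Tuple

# Mole % ranges from the first composition recited in the paragraphs above.
GLYCAN_RANGES: Dict[str, Tuple[float, float]] = {
    "G0": (50.0, 70.0),
    "G1": (15.0, 25.0),
    "G2": (4.0, 12.0),
    "Man5": (5.0, 17.0),
    "hybrid": (5.0, 15.0),
}


def meets_composition(
    profile: Dict[str, float],
    occupancy_percent: float,
    min_occupancy_percent: float = 70.0,
) -> bool:
    """Return True if occupancy and every glycan class fall within range.

    `profile` maps glycan class names to mole %; a class missing from the
    profile is treated as 0 mole %. Bounds are inclusive (an assumption).
    """
    if occupancy_percent < min_occupancy_percent:
        return False
    return all(
        low <= profile.get(name, 0.0) <= high
        for name, (low, high) in GLYCAN_RANGES.items()
    )


# Hypothetical measured profile (mole %), for illustration only.
example_profile = {"G0": 55.0, "G1": 21.0, "G2": 6.0, "Man5": 12.0, "hybrid": 6.0}
```

Passing a different `min_occupancy_percent` (75, 80, 85, 90, 95, 98, or 99) reproduces the stepwise occupancy levels of the successive paragraphs.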
[0231] In particular embodiments, the antibodies comprise an antibody selected from the group consisting of anti-Her2 antibody, anti-RSV (respiratory syncytial virus) antibody, anti-TNFα antibody, anti-VEGF antibody, anti-CD3 receptor antibody, anti-CD41 7E3 antibody, anti-CD25 antibody, anti-CD52 antibody, anti-CD33 antibody, anti-IgE antibody, anti-CD11a antibody, anti-EGF receptor antibody, and anti-CD20 antibody.
[0232] All patents and publications referenced or mentioned herein are indicative of the levels of skill of those skilled in the art to which the invention pertains, and each such referenced patent or publication is hereby incorporated by reference to the same extent as if it had been incorporated by reference in its entirety individually or set forth herein in its entirety.
[0233] The following examples are intended to promote a further understanding of the present invention.
Example 1
[0234] Plasmids comprising expression cassettes encoding the Leishmania major STT3D (LmSTT3D) open reading frame (ORF) operably linked to an inducible or constitutive promoter were constructed as follows.
[0235] The open reading frame encoding the LmSTT3D (SEQ ID NO:12) was codon-optimized for optimal expression in P. pastoris and synthesized by GeneArt AG, Brandenburg, Germany. The codon-optimized nucleic acid molecule encoding the LmSTT3D was designated pGLY6287 and has the nucleotide sequence shown in SEQ ID NO:11.
[0236] Plasmid pGLY6301 (FIG. 2) is a roll-in integration plasmid that targets the URA6 locus in P. pastoris. The expression cassette encoding the LmSTT3D comprises a nucleic acid molecule encoding the LmSTT3D ORF codon-optimized for effective expression in P. pastoris operably linked at the 5' end to a nucleic acid molecule that has the inducible P. pastoris AOX1 promoter sequence (SEQ ID NO:23) and at the 3' end to a nucleic acid molecule that has the S. cerevisiae CYC transcription termination sequence (SEQ ID NO:24). For selecting transformants, the plasmid comprises an expression cassette encoding the S. cerevisiae ARR3 ORF in which the nucleic acid molecule encoding the ORF (SEQ ID NO:32) is operably linked at the 5' end to a nucleic acid molecule having the P. pastoris RPL10 promoter sequence (SEQ ID NO:25) and at the 3' end to a nucleic acid molecule having the S. cerevisiae CYC transcription termination sequence (SEQ ID NO:24). The plasmid further includes a nucleic acid molecule for targeting the URA6 locus (SEQ ID NO:33). Plasmid pGLY6301 was constructed by cloning the DNA fragment encoding the codon-optimized LmSTT3D ORF (pGLY6287) flanked by an EcoRI site at the 5' end and an FseI site at the 3' end into plasmid pGF130t, which had been digested with EcoRI and FseI.
[0237] Plasmid pGLY6294 (FIG. 3) is a KINKO integration vector that targets the TRP1 locus in P. pastoris without disrupting expression of the locus. KINKO (Knock-In with little or No Knock-Out) integration vectors enable insertion of heterologous DNA into a targeted locus without disrupting expression of the gene at the targeted locus and have been described in U.S. Published Application No. 20090124000. The expression cassette encoding the LmSTT3D comprises a nucleic acid molecule encoding the LmSTT3D ORF operably linked at the 5' end to a nucleic acid molecule that has the constitutive P. pastoris GAPDH promoter sequence (SEQ ID NO:26) and at the 3' end to a nucleic acid molecule having the S. cerevisiae CYC transcription termination sequence (SEQ ID NO:24). For selecting transformants, the plasmid comprises an expression cassette encoding the Nourseothricin resistance (NATR) ORF (originally from pAG25 from EUROSCARF, Scientific Research and Development GmbH, Daimlerstrasse 13a, D-61352 Bad Homburg, Germany; see Goldstein et al., Yeast 15: 1541 (1999)), wherein the nucleic acid molecule encoding the ORF (SEQ ID NO:34) is operably linked at the 5' end to a nucleic acid molecule having the Ashbya gossypii TEF1 promoter sequence (SEQ ID NO:86) and at the 3' end to a nucleic acid molecule that has the Ashbya gossypii TEF1 termination sequence (SEQ ID NO:87). The two expression cassettes are flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region of the ORF encoding Trp1p ending at the stop codon (SEQ ID NO:30) linked to a nucleic acid molecule having the P. pastoris ALG3 termination sequence (SEQ ID NO:29) and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the TRP1 gene (SEQ ID NO:31).
Plasmid pGLY6294 was constructed by cloning the DNA fragment encoding the codon-optimized LmSTT3D ORF (pGLY6287) flanked by a NotI site at the 5' end and a PacI site at the 3' end into plasmid pGLY597, which had been digested with NotI and FseI. An expression cassette comprising a nucleic acid molecule encoding the Nourseothricin resistance ORF (NAT) operably linked to the Ashbya gossypii TEF1 promoter (PTEF) and Ashbya gossypii TEF1 termination sequence (TTEF).
[0238] The above plasmids can be used to introduce the LmSTT3D expression cassettes into P. pastoris to increase the N-glycosylation site occupancy on glycoproteins produced therein as shown in the following examples.
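As an aid to reading Example 1, the following Python sketch records the two LmSTT3D plasmids as simple data structures, with each expression cassette written as a promoter, ORF, and terminator. The class and field names are hypothetical conveniences; the element names, SEQ ID NOs, target loci, and integration types are those given in the paragraphs above. The promoter type of each selection-marker cassette is not stated in the text and is recorded as "unspecified".

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Cassette:
    promoter: str
    orf: str
    terminator: str
    promoter_type: str  # "inducible", "constitutive", or "unspecified"


@dataclass(frozen=True)
class Plasmid:
    name: str
    integration: str  # "roll-in" or "KINKO"
    target_locus: str
    cassettes: Tuple[Cassette, ...]


PGLY6301 = Plasmid(
    name="pGLY6301",
    integration="roll-in",
    target_locus="URA6",
    cassettes=(
        Cassette("PpAOX1 (SEQ ID NO:23)", "LmSTT3D (SEQ ID NO:11)",
                 "ScCYC (SEQ ID NO:24)", "inducible"),
        Cassette("PpRPL10 (SEQ ID NO:25)", "ScARR3 (SEQ ID NO:32)",
                 "ScCYC (SEQ ID NO:24)", "unspecified"),
    ),
)

PGLY6294 = Plasmid(
    name="pGLY6294",
    integration="KINKO",
    target_locus="TRP1",
    cassettes=(
        Cassette("PpGAPDH (SEQ ID NO:26)", "LmSTT3D (SEQ ID NO:11)",
                 "ScCYC (SEQ ID NO:24)", "constitutive"),
        Cassette("AgTEF1 (SEQ ID NO:86)", "NATR (SEQ ID NO:34)",
                 "AgTEF1 (SEQ ID NO:87)", "unspecified"),
    ),
)


def lmstt3d_promoter_type(plasmid: Plasmid) -> str:
    """Return the promoter type driving the LmSTT3D ORF on `plasmid`."""
    for cassette in plasmid.cassettes:
        if cassette.orf.startswith("LmSTT3D"):
            return cassette.promoter_type
    raise ValueError(f"{plasmid.name} carries no LmSTT3D cassette")
```

The two records differ in exactly the features the example emphasizes: inducible versus constitutive expression of LmSTT3D, and roll-in versus KINKO integration.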
Example 2
[0239] Genetically engineered Pichia pastoris strain YGLY13992 is a strain that produces recombinant human anti-Her2 antibodies and Pichia pastoris strain YGLY14401 is a strain that produces recombinant human anti-RSV antibodies. Construction of the strains is illustrated schematically in FIG. 1A-1H. Briefly, the strains were constructed as follows.
[0240] The strain YGLY8316 was constructed from wild-type Pichia pastoris strain NRRL-Y 11430 using methods described earlier (See for example, U.S. Pat. No. 7,449,308; U.S. Pat. No. 7,479,389; U.S. Published Application No. 20090124000; Published PCT Application No. WO2009085135; Nett and Gerngross, Yeast 20:1279 (2003); Choi et al., Proc. Natl. Acad. Sci. USA 100:5022 (2003); Hamilton et al., Science 301:1244 (2003)). All plasmids were made in a pUC19 plasmid using standard molecular biology procedures. For nucleotide sequences that were optimized for expression in P. pastoris, the native nucleotide sequences were analyzed by the GENEOPTIMIZER software (GeneArt, Regensburg, Germany) and the results used to generate nucleotide sequences in which the codons were optimized for P. pastoris expression. Yeast strains were transformed by electroporation (using standard techniques as recommended by the manufacturer of the electroporator BioRad).
[0241] Plasmid pGLY6 (FIG. 4) is an integration vector that targets the URA5 locus. It contains a nucleic acid molecule comprising the S. cerevisiae invertase gene or transcription unit (ScSUC2; SEQ ID NO:38) flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region of the P. pastoris URA5 gene (SEQ ID NO:39) and on the other side by a nucleic acid molecule comprising the nucleotide sequence from the 3' region of the P. pastoris URA5 gene (SEQ ID NO:40). Plasmid pGLY6 was linearized and the linearized plasmid transformed into wild-type strain NRRL-Y 11430 to produce a number of strains in which the ScSUC2 gene was inserted into the URA5 locus by double-crossover homologous recombination. Strain YGLY1-3 was selected from the strains produced and is auxotrophic for uracil.
[0242] Plasmid pGLY40 (FIG. 5) is an integration vector that targets the OCH1 locus and contains a nucleic acid molecule comprising the P. pastoris URA5 gene or transcription unit (SEQ ID NO:41) flanked by nucleic acid molecules comprising lacZ repeats (SEQ ID NO:42) which in turn is flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region of the OCH1 gene (SEQ ID NO:43) and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the OCH1 gene (SEQ ID NO:44). Plasmid pGLY40 was linearized with SfiI and the linearized plasmid transformed into strain YGLY1-3 to produce a number of strains in which the URA5 gene flanked by the lacZ repeats has been inserted into the OCH1 locus by double-crossover homologous recombination. Strain YGLY2-3 was selected from the strains produced and is prototrophic for uracil. Strain YGLY2-3 was counterselected in the presence of 5-fluoroorotic acid (5-FOA) to produce a number of strains in which the URA5 gene has been lost and only the lacZ repeats remain in the OCH1 locus. This renders the strain auxotrophic for uracil. Strain YGLY4-3 was selected.
[0243] Plasmid pGLY43a (FIG. 6) is an integration vector that targets the BMT2 locus and contains a nucleic acid molecule comprising the K. lactis UDP-N-acetylglucosamine (UDP-GlcNAc) transporter gene or transcription unit (KlMNN2-2, SEQ ID NO:45) adjacent to a nucleic acid molecule comprising the P. pastoris URA5 gene or transcription unit flanked by nucleic acid molecules comprising lacZ repeats. The adjacent genes are flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region of the BMT2 gene (SEQ ID NO: 46) and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the BMT2 gene (SEQ ID NO:47). Plasmid pGLY43a was linearized with SfiI and the linearized plasmid transformed into strain YGLY4-3 to produce a number of strains in which the KlMNN2-2 gene and URA5 gene flanked by the lacZ repeats have been inserted into the BMT2 locus by double-crossover homologous recombination. The BMT2 gene has been disclosed in Mille et al., J. Biol. Chem. 283: 9724-9736 (2008) and U.S. Pat. No. 7,465,557. Strain YGLY6-3 was selected from the strains produced and is prototrophic for uracil. Strain YGLY6-3 was counterselected in the presence of 5-FOA to produce strains in which the URA5 gene has been lost and only the lacZ repeats remain. This renders the strain auxotrophic for uracil. Strain YGLY8-3 was selected.
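The alternation between URA5 integration and 5-FOA counterselection described in the preceding paragraphs can be followed with a small bookkeeping model. The Python sketch below is an illustration only: the `Strain` class and both functions are hypothetical, and the model tracks nothing beyond the uracil phenotype and the list of modified loci. It encodes the rule that a URA5-marked cassette is selected in a uracil-auxotrophic parent, and that 5-FOA counterselection returns the strain to uracil auxotrophy while leaving the locus modified.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Strain:
    name: str
    ura_plus: bool  # True if prototrophic for uracil
    modified_loci: List[str] = field(default_factory=list)


def integrate_ura5_cassette(parent: Strain, locus: str, new_name: str) -> Strain:
    """Integrate a lacZ-URA5-lacZ cassette at `locus`; the result is Ura+."""
    if parent.ura_plus:
        raise ValueError("URA5 selection requires a uracil-auxotrophic parent")
    return Strain(new_name, True, parent.modified_loci + [locus])


def counterselect_5foa(parent: Strain, new_name: str) -> Strain:
    """5-FOA counterselection: URA5 is lost, the lacZ repeats remain."""
    if not parent.ura_plus:
        raise ValueError("parent carries no URA5 marker to counterselect")
    return Strain(new_name, False, list(parent.modified_loci))


# Lineage as described above, starting from the URA5-disrupted strain.
ygly1_3 = Strain("YGLY1-3", ura_plus=False, modified_loci=["URA5"])
ygly2_3 = integrate_ura5_cassette(ygly1_3, "OCH1", "YGLY2-3")
ygly4_3 = counterselect_5foa(ygly2_3, "YGLY4-3")
ygly6_3 = integrate_ura5_cassette(ygly4_3, "BMT2", "YGLY6-3")
ygly8_3 = counterselect_5foa(ygly6_3, "YGLY8-3")
```

Because each counterselection restores uracil auxotrophy, the same URA5 marker can be reused for every subsequent locus in the lineage.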
[0244] Plasmid pGLY48 (FIG. 7) is an integration vector that targets the MNN4L1 locus and contains an expression cassette comprising a nucleic acid molecule encoding the mouse homologue of the UDP-GlcNAc transporter (SEQ ID NO:48) open reading frame (ORF) operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris GAPDH promoter (SEQ ID NO:26) and at the 3' end to a nucleic acid molecule comprising the S. cerevisiae CYC termination sequences (SEQ ID NO:24) adjacent to a nucleic acid molecule comprising the P. pastoris URA5 gene flanked by lacZ repeats and in which the expression cassettes together are flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region of the P. pastoris MNN4L1 gene (SEQ ID NO:49) and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the MNN4L1 gene (SEQ ID NO:50). Plasmid pGLY48 was linearized with SfiI and the linearized plasmid transformed into strain YGLY8-3 to produce a number of strains in which the expression cassette encoding the mouse UDP-GlcNAc transporter and the URA5 gene have been inserted into the MNN4L1 locus by double-crossover homologous recombination. The MNN4L1 gene (also referred to as MNN4B) has been disclosed in U.S. Pat. No. 7,259,007. Strain YGLY10-3 was selected from the strains produced and then counterselected in the presence of 5-FOA to produce a number of strains in which the URA5 gene has been lost and only the lacZ repeats remain. Strain YGLY12-3 was selected.
[0245] Plasmid pGLY45 (FIG. 8) is an integration vector that targets the PNO1/MNN4 loci and contains a nucleic acid molecule comprising the P. pastoris URA5 gene or transcription unit flanked by nucleic acid molecules comprising lacZ repeats which in turn is flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region of the PNO1 gene (SEQ ID NO:51) and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the MNN4 gene (SEQ ID NO:52). Plasmid pGLY45 was linearized with SfiI and the linearized plasmid transformed into strain YGLY12-3 to produce a number of strains in which the URA5 gene flanked by the lacZ repeats has been inserted into the PNO1/MNN4 loci by double-crossover homologous recombination. The PNO1 gene has been disclosed in U.S. Pat. No. 7,198,921 and the MNN4 gene (also referred to as MNN4B) has been disclosed in U.S. Pat. No. 7,259,007. Strain YGLY14-3 was selected from the strains produced and then counterselected in the presence of 5-FOA to produce a number of strains in which the URA5 gene has been lost and only the lacZ repeats remain. Strain YGLY16-3 was selected.
[0246] Plasmid pGLY1430 (FIG. 9) is a KINKO integration vector that targets the ADE1 locus without disrupting expression of the locus and contains in tandem four expression cassettes encoding (1) the human GlcNAc transferase I catalytic domain (NA) fused at the N-terminus to P. pastoris SEC12 leader peptide (10) to target the chimeric enzyme to the ER or Golgi, (2) mouse homologue of the UDP-GlcNAc transporter (MmTr), (3) the mouse mannosidase IA catalytic domain (FB) fused at the N-terminus to S. cerevisiae SEC12 leader peptide (8) to target the chimeric enzyme to the ER or Golgi, and (4) the P. pastoris URA5 gene or transcription unit. KINKO (Knock-In with little or No Knock-Out) integration vectors enable insertion of heterologous DNA into a targeted locus without disrupting expression of the gene at the targeted locus and have been described in U.S. Published Application No. 20090124000. The expression cassette encoding the NA10 comprises a nucleic acid molecule encoding the human GlcNAc transferase I catalytic domain codon-optimized for expression in P. pastoris (SEQ ID NO:53) fused at the 5' end to a nucleic acid molecule encoding the SEC12 leader 10 (SEQ ID NO:54), which is operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris PMA1 promoter and at the 3' end to a nucleic acid molecule comprising the P. pastoris PMA1 transcription termination sequence. The expression cassette encoding MmTr comprises a nucleic acid molecule encoding the mouse homologue of the UDP-GlcNAc transporter ORF operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris SEC4 promoter (SEQ ID NO:55) and at the 3' end to a nucleic acid molecule comprising the P. pastoris OCH1 termination sequences (SEQ ID NO:56). 
The expression cassette encoding the FB8 comprises a nucleic acid molecule encoding the mouse mannosidase IA catalytic domain (SEQ ID NO:57) fused at the 5' end to a nucleic acid molecule encoding the SEC12-m leader 8 (SEQ ID NO:58), which is operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris GAPDH promoter and at the 3' end to a nucleic acid molecule comprising the S. cerevisiae CYC transcription termination sequence. The URA5 expression cassette comprises a nucleic acid molecule comprising the P. pastoris URA5 gene or transcription unit flanked by nucleic acid molecules comprising lacZ repeats. The four tandem cassettes are flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region and complete ORF of the ADE1 gene (SEQ ID NO:59) followed by a P. pastoris ALG3 termination sequence (SEQ ID NO:29) and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the ADE1 gene (SEQ ID NO:60). Plasmid pGLY1430 was linearized with SfiI and the linearized plasmid transformed into strain YGLY16-3 to produce a number of strains in which the four tandem expression cassettes have been inserted into the ADE1 locus immediately following the ADE1 ORF by double-crossover homologous recombination. The strain YGLY2798 was selected from the strains produced and is auxotrophic for arginine and now prototrophic for uridine, histidine, and adenine. The strain was then counterselected in the presence of 5-FOA to produce a number of strains now auxotrophic for uridine. Strain YGLY3794 was selected and is capable of making glycoproteins that have predominantly galactose terminated N-glycans.
[0247] Plasmid pGLY582 (FIG. 10) is an integration vector that targets the HIS1 locus and contains in tandem four expression cassettes encoding (1) the S. cerevisiae UDP-glucose epimerase (ScGAL10), (2) the human galactosyltransferase I (hGalT) catalytic domain fused at the N-terminus to the S. cerevisiae KRE2-s leader peptide (33) to target the chimeric enzyme to the ER or Golgi, (3) the P. pastoris URA5 gene or transcription unit flanked by lacZ repeats, and (4) the D. melanogaster UDP-galactose transporter (DmUGT). The expression cassette encoding the ScGAL10 comprises a nucleic acid molecule encoding the ScGAL10 ORF (SEQ ID NO:61) operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris PMA1 promoter (SEQ ID NO:88) and operably linked at the 3' end to a nucleic acid molecule comprising the P. pastoris PMA1 transcription termination sequence (SEQ ID NO:62). The expression cassette encoding the chimeric galactosyltransferase I comprises a nucleic acid molecule encoding the hGalT catalytic domain codon optimized for expression in P. pastoris (SEQ ID NO:63) fused at the 5' end to a nucleic acid molecule encoding the KRE2-s leader 33 (SEQ ID NO:64), which is operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris GAPDH promoter and at the 3' end to a nucleic acid molecule comprising the S. cerevisiae CYC transcription termination sequence. The URA5 expression cassette comprises a nucleic acid molecule comprising the P. pastoris URA5 gene or transcription unit flanked by nucleic acid molecules comprising lacZ repeats. The expression cassette encoding the DmUGT comprises a nucleic acid molecule encoding the DmUGT ORF (SEQ ID NO:65) operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris OCH1 promoter (SEQ ID NO:66) and operably linked at the 3' end to a nucleic acid molecule comprising the P. pastoris ALG12 transcription termination sequence (SEQ ID NO:67).
The four tandem cassettes are flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region of the HIS1 gene (SEQ ID NO:68) and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the HIS1 gene (SEQ ID NO:69). Plasmid pGLY582 was linearized and the linearized plasmid transformed into strain YGLY3794 to produce a number of strains in which the four tandem expression cassettes have been inserted into the HIS1 locus by homologous recombination. Strain YGLY3853 was selected and is auxotrophic for histidine and prototrophic for uridine.
[0248] Plasmid pGLY167b (FIG. 11) is an integration vector that targets the ARG1 locus and contains in tandem three expression cassettes encoding (1) the D. melanogaster mannosidase II catalytic domain (KD) fused at the N-terminus to S. cerevisiae MNN2 leader peptide (53) to target the chimeric enzyme to the ER or Golgi, (2) the P. pastoris HIS1 gene or transcription unit, and (3) the rat N-acetylglucosamine (GlcNAc) transferase II catalytic domain (TC) fused at the N-terminus to S. cerevisiae MNN2 leader peptide (54) to target the chimeric enzyme to the ER or Golgi. The expression cassette encoding the KD53 comprises a nucleic acid molecule encoding the D. melanogaster mannosidase II catalytic domain codon-optimized for expression in P. pastoris (SEQ ID NO:70) fused at the 5' end to a nucleic acid molecule encoding the MNN2 leader 53 (SEQ ID NO:71), which is operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris GAPDH promoter and at the 3' end to a nucleic acid molecule comprising the S. cerevisiae CYC transcription termination sequence. The HIS1 expression cassette comprises a nucleic acid molecule comprising the P. pastoris HIS1 gene or transcription unit (SEQ ID NO:72). The expression cassette encoding the TC54 comprises a nucleic acid molecule encoding the rat GlcNAc transferase II catalytic domain codon-optimized for expression in P. pastoris (SEQ ID NO:73) fused at the 5' end to a nucleic acid molecule encoding the MNN2 leader 54 (SEQ ID NO:74), which is operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris PMA1 promoter and at the 3' end to a nucleic acid molecule comprising the P. pastoris PMA1 transcription termination sequence. 
The three tandem cassettes are flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region of the ARG1 gene (SEQ ID NO:75) and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the ARG1 gene (SEQ ID NO:76). Plasmid pGLY167b was linearized with SfiI and the linearized plasmid transformed into strain YGLY3853 to produce a number of strains in which the three tandem expression cassettes have been inserted into the ARG1 locus by double-crossover homologous recombination. The strain YGLY4754 was selected from the strains produced and is auxotrophic for arginine and prototrophic for uridine and histidine. The strain was then counterselected in the presence of 5-FOA to produce a number of strains now auxotrophic for uridine. Strain YGLY4799 was selected.
[0249] Plasmid pGLY3411 (FIG. 12) is an integration vector that contains the expression cassette comprising the P. pastoris URA5 gene flanked by lacZ repeats flanked on one side with the 5' nucleotide sequence of the P. pastoris BMT4 gene (SEQ ID NO:77) and on the other side with the 3' nucleotide sequence of the P. pastoris BMT4 gene (SEQ ID NO:78). Plasmid pGLY3411 was linearized and the linearized plasmid transformed into YGLY4799 to produce a number of strains in which the URA5 expression cassette has been inserted into the BMT4 locus by double-crossover homologous recombination. Strain YGLY6903 was selected from the strains produced and is prototrophic for uracil, adenine, histidine, proline, arginine, and tryptophan. The strain was then counterselected in the presence of 5-FOA to produce a number of strains now auxotrophic for uridine. Strains YGLY7432 and YGLY7433 were selected.
[0250] Plasmid pGLY3419 (FIG. 13) is an integration vector that contains an expression cassette comprising the P. pastoris URA5 gene flanked by lacZ repeats flanked on one side with the 5' nucleotide sequence of the P. pastoris BMT1 gene (SEQ ID NO:79) and on the other side with the 3' nucleotide sequence of the P. pastoris BMT1 gene (SEQ ID NO:80). Plasmid pGLY3419 was linearized and the linearized plasmid transformed into strain YGLY7432 and YGLY7433 to produce a number of strains in which the URA5 expression cassette has been inserted into the BMT1 locus by double-crossover homologous recombination. The strains YGLY7656 and YGLY7651 were selected from the strains produced and are prototrophic for uracil, adenine, histidine, proline, arginine, and tryptophan. The strains were then counterselected in the presence of 5-FOA to produce a number of strains now auxotrophic for uridine. Strains YGLY7930 and YGLY7940 were selected.
[0251] Plasmid pGLY3421 (FIG. 14) is an integration vector that contains an expression cassette comprising the P. pastoris URA5 gene flanked by lacZ repeats flanked on one side with the 5' nucleotide sequence of the P. pastoris BMT3 gene (SEQ ID NO:81) and on the other side with the 3' nucleotide sequence of the P. pastoris BMT3 gene (SEQ ID NO:82). Plasmid pGLY3421 was linearized and the linearized plasmid transformed into strain YGLY7930 and YGLY7940 to produce a number of strains in which the URA5 expression cassette has been inserted into the BMT3 locus by double-crossover homologous recombination. The strains YGLY7965 and YGLY7961 were selected from the strains produced and are prototrophic for uracil, adenine, histidine, proline, arginine, and tryptophan.
[0252] Plasmid pGLY3673 (FIG. 15) is a KINKO integration vector that targets the PRO1 locus without disrupting expression of the locus and contains expression cassettes encoding the T. reesei α-1,2-mannosidase catalytic domain fused at the N-terminus to S. cerevisiae αMATpre signal peptide (aMATTrMan) to target the chimeric protein to the secretory pathway and secretion from the cell. The expression cassette encoding the aMATTrMan comprises a nucleic acid molecule encoding the T. reesei catalytic domain (SEQ ID NO:83) fused at the 5' end to a nucleic acid molecule encoding the S. cerevisiae αMATpre signal peptide (SEQ ID NO:13), which is operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris AOX1 promoter (SEQ ID NO:23) and at the 3' end to a nucleic acid molecule comprising the S. cerevisiae CYC transcription termination sequence (SEQ ID NO:24). The cassette is flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region and complete ORF of the PRO1 gene (SEQ ID NO:89) followed by a P. pastoris ALG3 termination sequence and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the PRO1 gene (SEQ ID NO:90). The plasmid contains the PpARG1 gene. Plasmid pGLY3673 was transformed into strains YGLY7965 and YGLY7961 to produce a number of strains, of which strains YGLY8316 and YGLY8323 were selected.
[0253] Plasmid pGLY6833 (FIG. 16) is a roll-in integration plasmid encoding the light and heavy chains of an anti-Her2 antibody that targets the TRP2 locus in P. pastoris. The expression cassette encoding the anti-Her2 heavy chain comprises a nucleic acid molecule encoding the heavy chain ORF codon-optimized for effective expression in P. pastoris (SEQ ID NO:15) operably linked at the 5' end to a nucleic acid molecule encoding the Saccharomyces cerevisiae mating factor pre-signal sequence (SEQ ID NO:14) which in turn is fused at its N-terminus to a nucleic acid molecule that has the inducible P. pastoris AOX1 promoter sequence (SEQ ID NO:23) and at the 3' end to a nucleic acid molecule that has the P. pastoris CIT1 transcription termination sequence (SEQ ID NO:85). The expression cassette encoding the anti-Her2 light chain comprises a nucleic acid molecule encoding the light chain ORF codon-optimized for effective expression in P. pastoris (SEQ ID NO:17) operably linked at the 5' end to a nucleic acid molecule encoding the Saccharomyces cerevisiae mating factor pre-signal sequence (SEQ ID NO:14) which in turn is fused at its N-terminus to a nucleic acid molecule that has the inducible P. pastoris AOX1 promoter sequence (SEQ ID NO:23) and at the 3' end to a nucleic acid molecule that has the P. pastoris CIT1 transcription termination sequence (SEQ ID NO:85). For selecting transformants, the plasmid comprises an expression cassette encoding the Zeocin ORF in which the nucleic acid molecule encoding the ORF (SEQ ID NO:35) is operably linked at the 5' end to a nucleic acid molecule having the S. cerevisiae TEF promoter sequence (SEQ ID NO:37) and at the 3' end to a nucleic acid molecule having the S. cerevisiae CYC transcription termination sequence (SEQ ID NO:24). The plasmid further includes a nucleic acid molecule for targeting the TRP2 locus (SEQ ID NO:91).
[0254] Plasmid pGLY6564 (FIG. 17) is a roll-in integration plasmid encoding the light and heavy chains of an anti-RSV antibody that targets the TRP2 locus in P. pastoris. The expression cassette encoding the anti-RSV heavy chain comprises a nucleic acid molecule encoding the heavy chain ORF codon-optimized for effective expression in P. pastoris (SEQ ID NO:19) operably linked at the 5' end to a nucleic acid molecule encoding the Saccharomyces cerevisiae mating factor pre-signal sequence (SEQ ID NO:14) which in turn is fused at its N-terminus to a nucleic acid molecule that has the inducible P. pastoris AOX1 promoter sequence (SEQ ID NO:23) and at the 3' end to a nucleic acid molecule that has the S. cerevisiae CYC transcription termination sequence (SEQ ID NO:24). The expression cassette encoding the anti-RSV light chain comprises a nucleic acid molecule encoding the light chain ORF codon-optimized for effective expression in P. pastoris (SEQ ID NO:21) operably linked at the 5' end to a nucleic acid molecule encoding the Saccharomyces cerevisiae mating factor pre-signal sequence (SEQ ID NO:14) which in turn is fused at its N-terminus to a nucleic acid molecule that has the inducible P. pastoris AOX1 promoter sequence (SEQ ID NO:23) and at the 3' end to a nucleic acid molecule that has the P. pastoris AOX1 transcription termination sequence (SEQ ID NO:36). For selecting transformants, the plasmid comprises an expression cassette encoding the Zeocin ORF in which the nucleic acid molecule encoding the ORF (SEQ ID NO:35) is operably linked at the 5' end to a nucleic acid molecule having the S. cerevisiae TEF promoter sequence (SEQ ID NO:37) and at the 3' end to a nucleic acid molecule having the S. cerevisiae CYC transcription termination sequence (SEQ ID NO:24). The plasmid further includes a nucleic acid molecule for targeting the TRP2 locus (SEQ ID NO:91).
[0255] Strain YGLY13992 was generated by transforming pGLY6833, which encodes the anti-Her2 antibody, into YGLY8316. The strain YGLY13992 was selected from the strains produced. In this strain, the expression cassettes encoding the anti-Her2 heavy and light chains are targeted to the Pichia pastoris TRP2 locus (PpTRP2). This strain does not include the LmSTT3D expression cassette. Strain YGLY14401 was generated by transforming pGLY6564, which encodes the anti-RSV antibody, into YGLY8323. The strain YGLY14401 was selected from the strains produced. In this strain, the expression cassettes encoding the anti-RSV heavy and light chains are targeted to the Pichia pastoris TRP2 locus (PpTRP2). This strain does not include the LmSTT3D expression cassette.
[0256] Transformation of the appropriate strains disclosed herein with the above LmSTT3D expression/integration plasmid vectors was performed essentially as follows. Appropriate Pichia pastoris strains were grown in 50 mL YPD media (yeast extract (1%), peptone (2%), and dextrose (2%)) overnight to an OD of about 0.2 to 6. After incubation on ice for 30 minutes, cells were pelleted by centrifugation at 2500-3000 rpm for five minutes. Media was removed and the cells were washed three times with ice cold sterile 1 M sorbitol before resuspension in 0.5 mL ice cold sterile 1 M sorbitol. Ten μL linearized DNA (5-20 μg) and 100 μL cell suspension were combined in an electroporation cuvette and incubated for 5 minutes on ice. Electroporation was in a Bio-Rad GenePulser Xcell following the preset Pichia pastoris protocol (2 kV, 25 μF, 200Ω), immediately followed by the addition of 1 mL YPDS recovery media (YPD media plus 1 M sorbitol). The transformed cells were allowed to recover for four hours to overnight at room temperature (24° C.) before plating the cells on selective media.
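The electrical settings of the preset protocol (2 kV, 25 μF, 200Ω) can be related to a nominal pulse time constant with a short calculation. The sketch below is illustrative only: it treats the pulse as a simple RC discharge through the parallel resistor, which gives an upper bound on the time constant, and the cuvette gap used for the field strength is a hypothetical value because the text does not state which cuvette was used.

```python
# Nominal pulse parameters for the preset protocol quoted above (2 kV, 25 uF, 200 ohm).
# Assumption: simple RC discharge through the parallel resistor only; the sample's own
# resistance is ignored, so the real time constant would be somewhat shorter.

PRESET = {"voltage_kv": 2.0, "capacitance_uF": 25.0, "resistance_ohm": 200.0}


def time_constant_ms(resistance_ohm: float, capacitance_uF: float) -> float:
    """Return the nominal RC time constant in milliseconds."""
    capacitance_F = capacitance_uF * 1e-6
    return resistance_ohm * capacitance_F * 1e3


def field_strength_kv_per_cm(voltage_kv: float, gap_cm: float) -> float:
    """Return the nominal field strength across a cuvette of the given gap."""
    return voltage_kv / gap_cm


tau_ms = time_constant_ms(PRESET["resistance_ohm"], PRESET["capacitance_uF"])

# Hypothetical cuvette gap; NOT stated in the text.
HYPOTHETICAL_GAP_CM = 0.2
field_kv_per_cm = field_strength_kv_per_cm(PRESET["voltage_kv"], HYPOTHETICAL_GAP_CM)

print(f"nominal time constant: {tau_ms:.1f} ms")
print(f"field strength at assumed {HYPOTHETICAL_GAP_CM} cm gap: {field_kv_per_cm:.1f} kV/cm")
```

With these settings the nominal time constant is 5 ms; the field strength depends entirely on the assumed gap.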
[0257] Strains YGLY13992 and YGLY14401 were each then transformed with pGLY6301, which encodes the LmSTT3D under the control of the inducible AOX1 promoter, or pGLY6294, which encodes the LmSTT3D under the control of the constitutive GAPDH promoter, as described above to produce the strains described in Example 3.
Example 3
[0258] Integration/expression plasmid pGLY6301, which comprises the expression cassette in which the ORF encoding the LmSTT3D is operably-linked to the inducible PpAOX1 promoter, or pGLY6294, which comprises the expression cassette in which the ORF encoding the LmSTT3D is operably-linked to the constitutive PpGAPDH promoter, was linearized with SpeI or SfiI, respectively, and the linearized plasmids transformed into Pichia pastoris strain YGLY13992 or YGLY14401 to produce strains YGLY17351, YGLY17368, YGLY17319, and YGLY17354 shown in Table 1. Transformations were performed essentially as described in Example 2.
TABLE 1

Strain       Antibody     LmSTT3D expression
YGLY13992    Anti-Her2    none
YGLY17351    Anti-Her2    inducible
YGLY17368    Anti-Her2    constitutive
YGLY14401    Anti-RSV     none
YGLY17319    Anti-RSV     inducible
YGLY17354    Anti-RSV     constitutive
[0259] The genomic integration of pGLY6301 at the URA6 locus was confirmed by colony PCR (cPCR) using the primers PpURA6out/UP (5'-CTGAGGAGTCAGATATCAGCTCAATCTCCAT-3'; SEQ ID NO: 1) and Puc19/LP (5'-TCCGGCTCGTATGTTGTGTGGAATTGT-3'; SEQ ID NO: 2) or ScARR3/UP (5'-GGCAATAGTCGCGAGAATCCTTAAACCAT-3'; SEQ ID NO: 3) and PpURA6out/LP (5'-CTGGATGTTTGATGGGTTCAGTTTCAGCTGGA-3'; SEQ ID NO: 4).
[0260] The genomic integration of pGLY6294 at the TRP1 locus was confirmed by cPCR using the primers PpTRP-5' out/UP (5'-CCTCGTAAAGATCTGCGGTTTGCAAAGT-3'; SEQ ID NO: 5) and PpALG3TT/LP (5'-CCTCCCACTGGAACCGATGATATGGAA-3'; SEQ ID NO: 6) or PpTEFTT/UP (5'-GATGCGAAGTTAAGTGCGCAGAAAGTAATATCA-3'; SEQ ID NO: 7) and PpTRP1-3' out/LP (5'-CGTGTGTACCTTGAAACGTCAATGATACTTTGA-3'; SEQ ID NO: 8). Integration of the expression cassette encoding the LmSTT3D into the genome was confirmed using cPCR primers LmSTT3D/iUP (5'-GCGACTGGTTCCAATTGACAAGCTT-3'; SEQ ID NO: 9) and LmSTT3D/iLP (5'-CAACAGTAGAACCAGAAGCCTCGTAAGTACAG-3'; SEQ ID NO: 10). The PCR conditions were one cycle of 95° C. for two minutes; 35 cycles of 95° C. for 20 seconds, 55° C. for 20 seconds, and 72° C. for one minute; followed by one cycle of 72° C. for 10 minutes.
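The cPCR cycling conditions stated above can be written down as a small data structure, which makes the programmed hold time easy to check. This is a minimal sketch; the stage names and the `hold_time_seconds` helper are illustrative and not taken from the text, and ramp times between temperatures are ignored.

```python
# cPCR program as described above: (temperature in deg C, hold time in seconds).
# Stage names and this layout are illustrative; ramp times are not included.
PCR_PROGRAM = [
    {"name": "initial denaturation", "cycles": 1, "steps": [(95, 120)]},
    {"name": "amplification", "cycles": 35, "steps": [(95, 20), (55, 20), (72, 60)]},
    {"name": "final extension", "cycles": 1, "steps": [(72, 600)]},
]


def hold_time_seconds(program) -> int:
    """Sum the programmed hold times over all stages and cycles."""
    return sum(
        stage["cycles"] * sum(seconds for _temp, seconds in stage["steps"])
        for stage in program
    )


total_seconds = hold_time_seconds(PCR_PROGRAM)
print(f"programmed hold time: {total_seconds} s (~{total_seconds / 60:.0f} min)")
```

The programmed hold time comes to 4220 seconds, a little over 70 minutes before ramping is accounted for.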
[0261] The strains were cultivated in a SixFors fermentor to produce the antibodies for N-glycosylation site occupancy analysis. Cell growth conditions of the transformed strains for antibody production were generally as follows.
[0262] Protein expression for the transformed yeast strains was carried out in shake flasks at 24° C. with buffered glycerol-complex medium (BMGY) consisting of 1% yeast extract, 2% peptone, 100 mM potassium phosphate buffer pH 6.0, 1.34% yeast nitrogen base, 4×10⁻⁵% biotin, and 1% glycerol. The induction medium for protein expression was buffered methanol-complex medium (BMMY), which contains 1% methanol instead of the glycerol in BMGY. Pmt inhibitor Pmti-3 in methanol was added to the growth medium to a final concentration of 18.3 μM at the time the induction medium was added. Cells were harvested and centrifuged at 2,000 rpm for five minutes.
[0263] SixFors Fermentor Screening Protocol followed the parameters shown in Table 2.
TABLE 2
SixFors Fermentor Parameters

Parameter       Set-point         Actuated Element
pH              6.5 ± 0.1         30% NH4OH
Temperature     24 ± 0.1° C.      Cooling Water & Heating Blanket
Dissolved O2    n/a               Initial impeller speed of 550 rpm is ramped to 1200 rpm over first 10 hr, then fixed at 1200 rpm for remainder of run
[0264] At time of about 18 hours post-inoculation, SixFors vessels containing 350 mL media A (See Table 3 below) plus 4% glycerol were inoculated with the strain of interest. A small dose (0.3 mL of 0.2 mg/mL in 100% methanol) of Pmti-3 (5-[[3-(1-Phenyl-2-hydroxy)ethoxy)-4-(2-phenylethoxy)]phenyl]methylene]-4-oxo-2-thioxo-3-thiazolidineacetic Acid) (See Published International Application No. WO 2007061631) was added with the inoculum. At about 20 hours, a bolus of 17 mL 50% glycerol solution (Glycerol Fed-Batch Feed, See Table 4 below) plus a larger dose (0.3 mL of 4 mg/mL) of Pmti-3 was added per vessel. At about 26 hours, when the glycerol was consumed, as indicated by a positive spike in the dissolved oxygen (DO) concentration, a methanol feed (See Table 5 below) was initiated at 0.7 mL/hr continuously. At the same time, another dose of Pmti-3 (0.3 mL of 4 mg/mL stock) was added per vessel. At about 48 hours, another dose (0.3 mL of 4 mg/mL) of Pmti-3 was added per vessel. Cultures were harvested and processed at about 60 hours post-inoculation.
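The dosing and feed schedule described above lends itself to a quick mass-balance check per vessel. The sketch below is an illustration under stated assumptions: the quoted times are taken as nominal points on a single run clock, the methanol feed is assumed to run uninterrupted from about 26 hours to harvest at about 60 hours, and all variable names are hypothetical.

```python
# Per-vessel totals implied by the schedule above.
# Assumptions: quoted times are nominal points on one run clock, and the methanol
# feed runs continuously from ~26 h until harvest at ~60 h.

# (event label, dose volume in mL, stock concentration in mg/mL)
PMTI3_DOSES = [
    ("with inoculum", 0.3, 0.2),
    ("~20 h, with glycerol bolus", 0.3, 4.0),
    ("~26 h, at methanol feed start", 0.3, 4.0),
    ("~48 h", 0.3, 4.0),
]

METHANOL_FEED_START_H = 26.0
HARVEST_H = 60.0
METHANOL_FEED_RATE_ML_PER_H = 0.7

# Mass of inhibitor delivered = volume x stock concentration, summed over doses.
total_pmti3_mg = sum(volume_ml * conc_mg_per_ml for _label, volume_ml, conc_mg_per_ml in PMTI3_DOSES)

# Volume of methanol feed = rate x duration of the feed phase.
methanol_feed_ml = METHANOL_FEED_RATE_ML_PER_H * (HARVEST_H - METHANOL_FEED_START_H)

print(f"total Pmti-3 per vessel: {total_pmti3_mg:.2f} mg")
print(f"methanol feed delivered: {methanol_feed_ml:.1f} mL")
```

Under these assumptions each vessel receives about 3.66 mg of Pmti-3 and about 23.8 mL of methanol feed over the run.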
TABLE 3
Composition of Media A

Soytone L-1                                                   20 g/L
Yeast Extract                                                 10 g/L
KH2PO4                                                        11.9 g/L
K2HPO4                                                        2.3 g/L
Sorbitol                                                      18.2 g/L
Glycerol                                                      40 g/L
Antifoam Sigma 204                                            8 drops/L
10X YNB w/Ammonium Sulfate w/o Amino Acids (134 g/L)          100 mL/L
250X Biotin (0.4 g/L)                                         10 mL/L
500X Chloramphenicol (50 g/L)                                 2 mL/L
500X Kanamycin (50 g/L)                                       2 mL/L
TABLE 4
Glycerol Fed-Batch Feed

Glycerol                          50 % m/m
PTM1 Salts (See Table 6)          12.5 mL/L
250X Biotin (0.4 g/L)             12.5 mL/L
TABLE 5
Methanol Feed

Methanol                          100 % m/m
PTM1 Salts (See Table 6)          12.5 mL/L
250X Biotin (0.4 g/L)             12.5 mL/L
TABLE 6
PTM1 Salts

CuSO4·5H2O        6 g/L
NaI               80 mg/L
MnSO4·7H2O        3 g/L
NaMoO4·2H2O       200 mg/L
H3BO3             20 mg/L
CoCl2·6H2O        500 mg/L
ZnCl2             20 g/L
FeSO4·7H2O        65 g/L
Biotin            200 mg/L
H2SO4 (98%)       5 mL/L
[0265] The occupancy of N-glycan on anti-Her2 or anti-RSV antibodies was determined using capillary electrophoresis (CE) as follows. The antibodies were recovered from the cell culture medium and purified by protein A column chromatography. The protein A purified sample (100-200 μg) was concentrated to about 100 μL and then buffer was exchanged with 100 mM Tris-HCl pH 9.0 with 1% SDS. Then, the sample along with 2 μL of 10 kDa internal standard provided by Beckman was reduced by addition of 5 μL β-mercaptoethanol and boiled for five minutes. About 20 μL of reduced sample was then resolved over a bare-fused silica capillary (about 70 mm, 50 μm I.D.) according to the method recommended by Beckman Coulter.
[0266] FIG. 18 shows the N-glycosylation site occupancy of heavy chains from the CE analysis. The figure shows that for both antibodies, the amount of N-linked heavy chain species increased from about 80% to about 94% when the LmSTT3D was constitutively expressed and to about 99% when expression of the LmSTT3D was induced at the same time as expression of the antibodies was induced.
[0267] Table 7 shows that N-glycosylation site occupancy of anti-HER2 and anti-RSV antibodies was increased for compositions in which the antibodies were obtained from host cells in which the LmSTT3D was overexpressed in the presence of the endogenous oligosaccharyltransferase (OST) complex. To determine N-glycosylation site occupancy, antibodies were reduced and the N-glycan occupancy of the heavy chains determined. The table shows that in general, overexpression of the LmSTT3D under the control of an inducible promoter effected an increase of N-glycosylation site occupancy from about 82-83% to about 99% for both antibodies tested (about a 19% increase over the N-glycosylation site occupancy in the absence of LmSTT3D overexpression). The expression of the LmSTT3D and the antibodies were under the control of the same inducible promoter. When overexpression of the LmSTT3D was under the control of a constitutive promoter, N-glycosylation site occupancy was increased to about 94% for both antibodies tested (about a 13% increase over the N-glycosylation site occupancy in the absence of LmSTT3D overexpression).
TABLE 7

             LmSTT3D                                          Heavy Chain
             AOX1 Prom.       GAPDH Prom.                     N-glycosylation
             (pGLY6301)       (pGLY6294)                      site occupancy#
Strain       (inducible)      (constitutive)    Antibody      (%)
YGLY13992    None             None              Anti-HER2     83
YGLY17368    None             overexpressed     Anti-HER2     94
YGLY17351    overexpressed    None              Anti-HER2     99
YGLY14401    None             None              Anti-RSV      82
YGLY17354    None             overexpressed     Anti-RSV      94
YGLY17319    overexpressed    None              Anti-RSV      99

#N-glycosylation site occupancy based upon percent glycosylation site occupancy of total heavy chains from reduced antibodies
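The "about 19%" and "about 13%" figures quoted for the anti-HER2 strains are consistent with a relative increase computed from the heavy chain occupancy values in Table 7. The sketch below assumes that interpretation (relative change against the strain without LmSTT3D overexpression); the function name is illustrative.

```python
# Relative increase in heavy chain N-glycosylation site occupancy, using Table 7 values.
# Assumption: the quoted "% increase" is relative to the strain without LmSTT3D.

def relative_increase_percent(baseline: float, improved: float) -> float:
    """Return the increase of `improved` over `baseline`, as a percentage of baseline."""
    return (improved - baseline) / baseline * 100.0


# (antibody, baseline %, constitutive GAPDH %, inducible AOX1 %)
TABLE_7 = [
    ("Anti-HER2", 83, 94, 99),
    ("Anti-RSV", 82, 94, 99),
]

for antibody, baseline, constitutive, inducible in TABLE_7:
    print(
        f"{antibody}: constitutive +{relative_increase_percent(baseline, constitutive):.1f}%, "
        f"inducible +{relative_increase_percent(baseline, inducible):.1f}%"
    )
```

For the anti-HER2 antibody this gives roughly 13% (constitutive) and 19% (inducible); the anti-RSV values come out slightly higher because its baseline is 82% rather than 83%.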
[0268] Table 8 shows the N-glycosylation site occupancy for compositions comprising whole antibodies obtained from host cells in which the LmSTT3D was overexpressed in the presence of the endogenous oligosaccharyltransferase (OST) complex based upon the determination of N-glycosylation site occupancy of the individual heavy chains from reduced antibody preparations. The formula (fraction GHC)²×100, in which GHC is the glycosylated heavy chain, will provide an estimate or approximation of the percent fully occupied antibodies based upon the determination of the fraction of heavy chains that are N-glycosylated.
TABLE 8

             LmSTT3D                                          Fully
             AOX1 Prom.       GAPDH Prom.                     Occupied
             (pGLY6301)       (pGLY6294)                      Antibodies#
Strain       (inducible)      (constitutive)    Antibody      (%)
YGLY13992    None             None              Anti-HER2     68.9
YGLY17368    None             overexpressed     Anti-HER2     88.4
YGLY17351    overexpressed    None              Anti-HER2     98.0
YGLY14401    None             None              Anti-RSV      67.2
YGLY17354    None             overexpressed     Anti-RSV      88.4
YGLY17319    overexpressed    None              Anti-RSV      98.0

#based upon results obtained from Table 7.
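The estimate in Table 8 follows from squaring the glycosylated heavy chain fraction, since an antibody carries two heavy chains and is fully occupied only when both are glycosylated. The sketch below reproduces the Table 8 values from the Table 7 occupancies; it assumes, as the formula implicitly does, that the two heavy chains of an antibody are glycosylated independently. Function and variable names are illustrative.

```python
# Estimate of fully occupied antibodies from heavy chain occupancy:
#   percent fully occupied = (fraction of glycosylated heavy chains)^2 x 100
# Assumption: the two heavy chains of one antibody are glycosylated independently.

def fully_occupied_percent(heavy_chain_occupancy_percent: float) -> float:
    """Return the estimated percent of antibodies with both sites occupied."""
    fraction = heavy_chain_occupancy_percent / 100.0
    return round(fraction ** 2 * 100.0, 1)


# Heavy chain N-glycosylation site occupancy (%) from Table 7.
TABLE_7_OCCUPANCY = {
    "YGLY13992": 83,
    "YGLY17368": 94,
    "YGLY17351": 99,
    "YGLY14401": 82,
    "YGLY17354": 94,
    "YGLY17319": 99,
}

table_8_estimates = {
    strain: fully_occupied_percent(occupancy)
    for strain, occupancy in TABLE_7_OCCUPANCY.items()
}

for strain, estimate in table_8_estimates.items():
    print(f"{strain}: {estimate}% fully occupied (estimated)")
```

The computed values (68.9, 88.4, 98.0, 67.2, 88.4, 98.0) match the entries listed in Table 8.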
Q-TOF Analysis
[0269] The high performance liquid chromatography (HPLC) system used consisted of an Agilent 1200 equipped with an autoinjector, a column-heating compartment, and a UV detector detecting at 210 and 280 nm. All LC-MS experiments performed with this system were run at 1 mL/min. The flow rate was not split for MS detection. Mass spectrometric analysis was carried out in positive ion mode on an Accurate-Mass Q-TOF LC/MS 6520 (Agilent Technologies). The temperature of the dual ESI source was set at 350° C. The nitrogen gas flow rates were set at 13 L/h for the cone and 350 L/h and the nebulizer was set at 45 psig with 4500 volts applied to the capillary. A reference mass of 922.009 was prepared from HP-0921 according to the API-TOF reference mass solution kit for mass calibration and the protein mass measurements. The data for the ion spectrum range from 300-3000 m/z were acquired and processed using Agilent Masshunter.
[0270] Sample preparation was as follows. An intact antibody sample (50 μg) was prepared in 50 μL of 25 mM NH4HCO3, pH 7.8. For deglycosylated antibody, a 50 μL aliquot of intact antibody sample was treated with PNGase F (10 units) for 18 hours at 37° C. Reduced antibody was prepared by adding 1 M DTT to a final concentration of 10 mM to an aliquot of either intact antibody or deglycosylated antibody and incubating for 30 min at 37° C.
[0271] Three micrograms of intact or deglycosylated antibody sample was loaded onto a Poroshell 300SB-C3 column (2.1 mm×75 mm, 5 μm) (Agilent Technologies) maintained at 70° C. The protein was first rinsed on the cartridge for 1 minute with 90% solvent A (0.1% HCOOH), 5% solvent B (90% Acetonitrile in 0.1% HCOOH). Elution was then performed using a gradient of 5-100% of B over 26 minutes followed by a three-minute regeneration at 100% B and by a final equilibration period of 10 minutes at 5% B.
[0272] For reduced antibody, a three microgram sample was loaded onto a Poroshell 300SB-C3 column (2.1 mm×75 mm, 5 μm) (Agilent Technologies) maintained at 40° C. The protein was first rinsed on the cartridge for three minutes with 90% solvent A, 5% solvent B.
[0273] Elution was then performed using a gradient of 5-80% of B over 20 minutes followed by a seven-minute regeneration at 80% B and by a final equilibration period of 10 minutes at 5% B.
[0274] FIG. 19A-C shows the results of a Q-TOF analysis in which the N-glycosylation site occupancy of non-reduced anti-Her2 antibody produced in YGLY17351 (FIG. 19C) was compared to the N-glycosylation site occupancy of non-reduced commercially available anti-Her2 antibody produced in CHO cells (HERCEPTIN) (FIG. 19A). The figure shows that anti-Her2 antibody produced in strain YGLY17351 has an N-glycosylation site occupancy that is like the N-glycosylation site occupancy of an anti-Her2 antibody made in CHO cells. The figure shows that the amount of antibodies in which only one N-glycosylation site was occupied decreased and the amount of antibodies in which both N-glycosylation sites were occupied increased when the antibodies were produced by strain YGLY17351. The results shown for anti-Her2 antibody produced in YGLY17351 were consistent with the approximated occupancy shown in Table 8.
[0275] FIG. 20 demonstrates the scalability of N-glycosylation site occupancy on anti-Her2 antibodies produced in YGLY17351. In order to evaluate scalability of N-glycan occupancy, YGLY17351 was tested in bioreactors ranging from 5 mL through 40 L. In general, N-glycosylation site occupancy of glycoproteins in glycoengineered P. pastoris has been observed to vary with the process conditions used to produce the glycoproteins. However, the LmSTT3D overexpressing strains showed very consistent N-glycosylation site occupancy (99%) regardless of scale of bioreactors and process conditions. Thus, the present invention provides a method in which the N-glycosylation site occupancy of glycoproteins in glycoengineered P. pastoris grown under small scale conditions is maintained when grown under large scale conditions.
[0276] FIGS. 21 A-B and 22 A-B are provided for illustrative purposes. FIG. 21A-B shows the results of a CE (FIG. 21B) and Q-TOF (FIG. 21A) analysis for a commercial lot of anti-Her2 antibody produced in CHO cells (HERCEPTIN). FIG. 22 A-B shows the results of a CE (FIG. 22B) and Q-TOF (FIG. 22A) analysis for the same commercial lot of anti-Her2 antibody following treatment with PNGase F for a time. The CE shows an increase in non-glycosylated heavy chain and the Q-TOF shows the presence of non-glycosylated antibody following PNGase F treatment (compare FIG. 21 A-B to FIG. 22 A-B).
[0277] Table 9 shows the N-glycan composition of the anti-Her2 and anti-RSV antibodies produced in strains that overexpress LmSTT3D compared to strains that do not overexpress LmSTT3D. The table confirms that the quality of N-glycans of antibodies from LmSTT3D overexpressing strains is comparable to that from strains that do not overexpress LmSTT3D. Antibodies were produced from SixFors (0.5 L bioreactor) and N-glycans from protein A-purified antibodies were analyzed with 2AB labeling. Overall, overexpression of LmSTT3D did not appear to significantly affect the N-glycan composition of the antibodies. The glycosylation composition can vary as a function of fermentation conditions; therefore, the glycosylation composition of antibodies produced in Pichia pastoris strains can range from about 50-70 mole % G0, 15-25 mole % G1, 4-12 mole % G2, 5-17 mole % Man5, and 3-15 mole % hybrids.
TABLE 9

                                        N-glycans (%)
Antibody     LmSTT3D          G0            G1            G2           Man5          Hybrids
Anti-Her2    none             58.1 ± 1.8    20.5 ± 0.6    3.0 ± 0.9    14.0 ± 2.1    4.3 ± 1.2
Anti-Her2    overexpressed    53.9 ± 2.0    22.4 ± 3.0    4.5 ± 1.7    14.7 ± 1.5    4.2 ± 1.5
Anti-RSV     none             51.6 ± 1.6    22.9 ± 2.0    5.3 ± 2.2    15.2 ± 1.1    4.9 ± 0.6
Anti-RSV     overexpressed    58.4 ± 5.3    20.9 ± 2.8    3.5 ± 0.3    12.4 ± 0.1    4.7 ± 2.3

G0 = GlcNAc2Man3GlcNAc2
G1 = GalGlcNAc2Man3GlcNAc2
G2 = Gal2GlcNAc2Man3GlcNAc2
Man5 = Man5GlcNAc2
Hybrid = GlcNAcMan5GlcNAc2 and/or GalGlcNAcMan5GlcNAc2
[0278] Table 10 shows a comparison of the glycosylation pattern of the anti-RSV antibody produced in strain YGLY14401 compared to several commercial lots of an anti-RSV antibody produced in CHO cells and marketed as palivizumab under the tradename SYNAGIS.
TABLE 10

                                      SYNAGIS             SYNAGIS             Anti-RSV antibody
                                      (Commercial lot     (Commercial lot     produced in
                                      07A621)             09A621)             YGLY14401
Glycoform                             % of total          % of total          % of total
Man5                                  6.4                 6.8                 9.5
G0                                    <1.0                <1.0                59.9
G0F                                   33.9                30                  0
G1                                    <1.0                <1.0                20
G1F                                   41.7                48.8                0
G2                                    0                   0                   2.8
G2F                                   10.9                12.3                0
A2                                    5.1                 3.7                 0
Hybrid                                --                  --                  7.8
O-glycans occupancy (mol/mol)         0                   0                   3.0
Mannose (single mannose)              0                   0                   96
Mannobiose (two mannose residues)     0                   0                   4
[0279] This example shows then that the present invention enables the production of recombinant glycoproteins in Pichia pastoris in which the N-glycosylation site occupancy of the recombinant glycoproteins is comparable to the N-glycosylation site occupancy of recombinant glycoproteins produced in mammalian expression systems such as CHO cells.
Example 4
[0280] The Leishmania major STT3A protein, Leishmania major STT3B protein, and Leishmania major STT3D protein, are all examples of heterologous single-subunit oligosaccharyltransferases that have been shown to suppress the lethal phenotype of a deletion of the STT3 locus in Saccharomyces cerevisiae (Naseb et al., Molec. Biol. Cell 19: 3758-3768 (2008)). Naseb et al. (ibid.) further showed that the Leishmania major STT3D protein could suppress the lethal phenotype of a mutation of the WBP1, OST1, SWP1, or OST2 loci in Saccharomyces cerevisiae. Hese et al. (Glycobiology 19: 160-171 (2009)) provides data that suggest the Leishmania major STT3A, STT3B, and STT3D proteins can functionally complement mutations of the WBP1, OST1, SWP1, and OST2 loci. Other single-subunit heterologous oligosaccharyltransferases include but are not limited to single-subunit Giardia or kinetoplastid STT3 proteins, for example, the Caenorhabditis elegans STT3 protein, Trypanosoma brucei STT3 protein, Trypanosoma cruzi STT3 protein, and Toxoplasma gondii STT3 protein. In contrast to the Leishmania major STT3D protein, which Naseb et al. (op. cit.) teaches does not incorporate into the Saccharomyces cerevisiae OTase complex, Castro et al. (Proc. Natl. Acad. Sci. USA 103: 14756-14760 (2006)) teaches that the Trypanosoma cruzi STT3 appears to integrate into the Saccharomyces cerevisiae OTase complex.
[0281] In this example, host cells constructed similarly to the host cells in the previous example were transformed with plasmid vectors containing expression cassettes encoding the STT3 protein from Caenorhabditis elegans, Trypanosoma cruzi, and Leishmania major STT3C operably linked to the AOX1 promoter. A vector containing an expression cassette encoding the Pichia pastoris Stt3p was included in the experiment. As shown in Table 11, expression of the various STT3 proteins concurrently with expression of the anti-Her2 antibody did not appear to result in an increase in N-glycosylation site occupancy. However, various STT3 proteins can display substrate specificity. For example, the Leishmania major STT3A, B, C, and D proteins differ in substrate specificity at the level of glycosylation, which suggests that in addition to the essential N-X-S/T attachment site additional features of the substrate may influence N-linked glycosylation at a particular attachment site (Naseb et al., op cit.). The results shown in Table 11 used the anti-Her2 antibody as the substrate. The CH2 domain of each heavy chain of an antibody contains a single site for N-linked glycosylation: this is usually at the asparagine residue 297 (Asn-297) (Kabat et al., Sequences of proteins of immunological interest, Fifth Ed., U.S. Department of Health and Human Services, NIH Publication No. 91-3242). Thus, the results shown in Table 11 suggest that the percent N-glycosylation site occupancy might be influenced by the substrate specificity of the particular single-subunit oligosaccharyltransferase being used.
TABLE 11

                    STT3                              N-glycosylation site
STT3 source         (AOX1 Prom)      Antibody         occupancy (%)
C. elegans          overexpressed    Anti-Her2        83
T. cruzi            overexpressed    Anti-Her2        83
L. major (STT3C)    overexpressed    Anti-Her2        82
P. pastoris         overexpressed    Anti-Her2        80
Example 5
[0282] A strain capable of producing sialylated N-glycans was constructed as follows. The strain was transfected with a plasmid vector encoding human GM-CSF and a plasmid vector encoding the Leishmania major STT3D. Construction of the strains is illustrated schematically in FIG. 23A-23E. Briefly, the strains were constructed as follows.
[0283] Plasmid pGLY2456 (FIG. 24) is a KINKO integration vector that targets the TRP2 locus without disrupting expression of the locus and contains six expression cassettes encoding (1) the mouse CMP-sialic acid transporter (mCMP-Sia Transp), (2) the human UDP-GlcNAc 2-epimerase/N-acetylmannosamine kinase (hGNE), (3) the Pichia pastoris ARG1 gene or transcription unit, (4) the human CMP-sialic acid synthase (hCSS), (5) the human N-acetylneuraminate-9-phosphate synthase (hSPS), and (6) the mouse α-2,6-sialyltransferase catalytic domain (mST6) fused at the N-terminus to S. cerevisiae KRE2 leader peptide (33) to target the chimeric enzyme to the ER or Golgi. The expression cassette encoding the mouse CMP-sialic acid transporter comprises a nucleic acid molecule encoding the mCMP Sia Transp ORF codon optimized for expression in P. pastoris (SEQ ID NO:92), which is operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris PMA1 promoter and at the 3' end to a nucleic acid molecule comprising the P. pastoris PMA1 transcription termination sequence. The expression cassette encoding the human UDP-GlcNAc 2-epimerase/N-acetylmannosamine kinase comprises a nucleic acid molecule encoding the hGNE ORF codon optimized for expression in P. pastoris (SEQ ID NO:93), which is operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris GAPDH promoter and at the 3' end to a nucleic acid molecule comprising the S. cerevisiae CYC transcription termination sequence. The expression cassette encoding the P. pastoris ARG1 gene comprises (SEQ ID NO:94). The expression cassette encoding the human CMP-sialic acid synthase comprises a nucleic acid molecule encoding the hCSS ORF codon optimized for expression in P. pastoris (SEQ ID NO:95), which is operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris GAPDH promoter and at the 3' end to a nucleic acid molecule comprising the S. cerevisiae CYC transcription termination sequence. The expression cassette encoding the human N-acetylneuraminate-9-phosphate synthase comprises a nucleic acid molecule encoding the hSPS ORF codon optimized for expression in P. pastoris (SEQ ID NO:96), which is operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris PMA1 promoter and at the 3' end to a nucleic acid molecule comprising the P. pastoris PMA1 transcription termination sequence. The expression cassette encoding the chimeric mouse α-2,6-sialyltransferase comprises a nucleic acid molecule encoding the mST6 catalytic domain codon optimized for expression in P. pastoris (SEQ ID NO:97) fused at the 5' end to a nucleic acid molecule encoding the S. cerevisiae KRE2 signal peptide, which is operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris TEF promoter and at the 3' end to a nucleic acid molecule comprising the P. pastoris TEF transcription termination sequence. The six tandem cassettes are flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region and ORF of the TRP2 gene ending at the stop codon (SEQ ID NO:98) followed by a P. pastoris ALG3 termination sequence and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the TRP2 gene (SEQ ID NO:99). Plasmid pGLY2456 was linearized with SfiI and the linearized plasmid transformed into strain YGLY7961 to produce a number of strains in which the six expression cassettes have been inserted into the TRP2 locus immediately following the TRP2 ORF by double-crossover homologous recombination. The strain YGLY8146 was selected from the strains produced. The strain was then counterselected in the presence of 5-FOA to produce a number of strains now auxotrophic for uridine. Strain YGLY9296 was selected.
[0284] Plasmid pGLY5048 (FIG. 25) is an integration vector that targets the STE13 locus and contains expression cassettes encoding (1) the T. reesei α-1,2-mannosidase catalytic domain fused at the N-terminus to S. cerevisiae αMATpre signal peptide (aMATTrMan) to target the chimeric protein to the secretory pathway and secretion from the cell and (2) the P. pastoris URA5 gene or transcription unit. The expression cassette encoding the aMATTrMan comprises a nucleic acid molecule encoding the T. reesei catalytic domain (SEQ ID NO:83) fused at the 5' end to a nucleic acid molecule encoding the S. cerevisiae αMATpre signal peptide (SEQ ID NO:13), which is operably linked at the 5' end to a nucleic acid molecule comprising the P. pastoris AOX1 promoter and at the 3' end to a nucleic acid molecule comprising the S. cerevisiae CYC transcription termination sequence. The URA5 expression cassette comprises a nucleic acid molecule comprising the P. pastoris URA5 gene or transcription unit flanked by nucleic acid molecules comprising lacZ repeats. The two tandem cassettes are flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region of the STE13 gene (SEQ ID NO:100) and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the STE13 gene (SEQ ID NO:101). Plasmid pGLY5048 was linearized with SfiI and the linearized plasmid transformed into strain YGLY9296 to produce a number of strains. The strain YGLY9469 was selected from the strains produced. This strain is capable of producing glycoproteins that have single-mannose O-glycosylation (See Published U.S. Application No. 20090170159).
[0285] Plasmid pGLY5019 (FIG. 26) is an integration vector that targets the DAP2 locus and contains an expression cassette comprising a nucleic acid molecule encoding the Nourseothricin resistance (NATR) expression cassette (originally from pAG25 from EUROSCARF, Scientific Research and Development GmbH, Daimlerstrasse 13a, D-61352 Bad Homburg, Germany, See Goldstein et al., Yeast 15: 1541 (1999)). The NATR expression cassette (SEQ ID NO:34) is operably regulated to the Ashbya gossypii TEF1 promoter and A. gossypii TEF1 termination sequences flanked on one side with the 5' nucleotide sequence of the P. pastoris DAP2 gene (SEQ ID NO:102) and on the other side with the 3' nucleotide sequence of the P. pastoris DAP2 gene (SEQ ID NO:103). Plasmid pGLY5019 was linearized and the linearized plasmid transformed into strain YGLY9469 to produce a number of strains in which the NATR expression cassette has been inserted into the DAP2 locus by double-crossover homologous recombination. The strain YGLY9797 was selected from the strains produced.
[0286] Plasmid pGLY5085 (FIG. 27) is a KINKO plasmid for introducing a second set of the genes involved in producing sialylated N-glycans into P. pastoris. The plasmid is similar to plasmid pGLY2456 except that the P. pastoris ARG1 gene has been replaced with an expression cassette encoding hygromycin resistance (HygR) and the plasmid targets the P. pastoris TRP5 locus. The HygR expression cassette (SEQ ID NO:104) is operably regulated to the Ashbya gossypii TEF1 promoter and A. gossypii TEF1 termination sequences (See Goldstein et al., Yeast 15: 1541 (1999)). The six tandem cassettes are flanked on one side by a nucleic acid molecule comprising a nucleotide sequence from the 5' region and ORF of the TRP5 gene ending at the stop codon (SEQ ID NO:105) followed by a P. pastoris ALG3 termination sequence and on the other side by a nucleic acid molecule comprising a nucleotide sequence from the 3' region of the TRP5 gene (SEQ ID NO:106). Plasmid pGLY5085 was transformed into strain YGLY9797 to produce a number of strains of which strain YGLY12900 was selected.
[0287] Plasmid pGLY7240 (FIG. 28), which targets the Pichia pastoris TRP2 locus (PpTRP2), encodes a fusion protein comprising the human GM-CSF fused to the Pichia pastoris CWP1 protein via a linker containing a Kex2 cleavage site. The CWP1 protein is removed from GM-CSF in the late Golgi by the Kex2 endoprotease so that free GM-CSF is secreted into the fermentation supernatant. The human GM-CSF has the amino acid sequence shown in SEQ ID NO:108 and is encoded by the nucleotide sequence shown in SEQ ID NO:108. The fusion protein (SEQ ID NO:109) is encoded by the nucleotide sequence shown in SEQ ID NO:110. The CWP1 signal sequence is amino acids 1-18, the CWP1 amino acid sequence is from amino acids 19-289, the GGGSLVKR Kex2 linker amino acid sequence (SEQ ID NO:111) is from amino acids 290-297, and the GM-CSF amino acid sequence is from amino acids 298-424. The expression of the fusion protein is operably linked to the Pp AOX1 promoter and ScCYC termination sequences. Plasmid pGLY7240 was transformed into strain YGLY12900 to produce a number of strains of which strain YGLY15660 was selected. Strain YGLY15660 was transformed with plasmid pGLY6301 (encodes Leishmania major STT3D) to produce a number of strains of which YGLY16349 was selected.
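The domain layout of the fusion protein described above can be checked with a short sketch. The segment boundaries and the linker sequence are taken from the text; the function names are hypothetical and used only for illustration. The sketch assumes, as the text states, that Kex2 processing releases everything downstream of the linker.

```python
# Segment boundaries (1-based, inclusive) as stated for the fusion protein (SEQ ID NO:109).
SEGMENTS = [
    ("CWP1 signal sequence", 1, 18),
    ("CWP1", 19, 289),
    ("Kex2 linker", 290, 297),
    ("GM-CSF", 298, 424),
]
KEX2_LINKER = "GGGSLVKR"  # SEQ ID NO:111; Kex2 cleaves C-terminal to the Lys-Arg pair


def segment_lengths(segments):
    """Return the length in residues of each named segment."""
    return {name: end - start + 1 for name, start, end in segments}


def is_contiguous(segments):
    """True if each segment starts one residue after the previous segment ends."""
    return all(
        segments[i][2] + 1 == segments[i + 1][1] for i in range(len(segments) - 1)
    )


def released_product(segments, linker_name="Kex2 linker"):
    """Return (start, end) of the region downstream of the linker."""
    names = [name for name, _, _ in segments]
    idx = names.index(linker_name)
    return segments[idx + 1][1], segments[-1][2]


if __name__ == "__main__":
    print(segment_lengths(SEGMENTS))
    print("contiguous:", is_contiguous(SEGMENTS))
    print("released region:", released_product(SEGMENTS))
```

Under these boundaries the linker spans eight residues, which agrees with the eight-residue GGGSLVKR sequence, and the released GM-CSF region spans 127 residues.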
[0288] FIG. 29 shows that LmSTT3D also improved N-glycan occupancy of a non-antibody glycoprotein, GM-CSF. GM-CSF contains two N-linked glycosylation sites, and in wild-type Pichia predominantly one of the two sites is glycosylated. To investigate the impact of LmSTT3D on N-glycan occupancy of GM-CSF, methanol-inducible LmSTT3D was overexpressed in the GM-CSF-producing strain yGLY15560. N-glycan occupancy was evaluated using a Micro24 bioreactor (M24). The cell-free supernatants from the M24 were analyzed for N-glycan occupancy by 15% SDS-PAGE and Western blot. As shown in the Western blot detected with a GM-CSF-specific antibody, the majority of GM-CSF (lanes 2-8) is glycosylated at both N-linked sites, in contrast to the control strain (yGLY15560, lane 9), where GM-CSF is predominantly N-glycosylated at one site, with minor portions glycosylated at both sites or non-glycosylated. Taken together, these results indicate that LmSTT3D can improve N-glycan occupancy of glycoproteins carrying multiple N-linked sites.
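Candidate N-linked sites are conventionally located by searching a protein sequence for the N-X-S/T sequon, where X is any residue except proline. The following is a minimal sketch of such a search; it is not part of the disclosed method, and the function name is hypothetical. The test fragment is a short stretch of the anti-Her2 heavy chain (SEQ ID NO:16) that contains the conserved Fc sequon.

```python
import re


def find_sequons(protein):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro).

    A lookahead is used so that overlapping sequons are all reported.
    """
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", protein)]


if __name__ == "__main__":
    # Fragment of SEQ ID NO:16 (anti-Her2 heavy chain) around the Fc N-glycosylation site.
    fragment = "KTKPREEQYNSTYRVVSVLT"
    print(find_sequons(fragment))
```

A sequon only marks a potential site; whether it is occupied has to be determined experimentally, for example by the Western blot and mass analyses described here.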
[0289] FIG. 30 shows Q-TOF analysis of GM-CSF expressed from yGLY15560 (A) and yGLY16349 (B), respectively. This analysis confirms that the majority of GM-CSF is glycosylated at both N-linked sites in the presence of LmSTT3D, as shown in FIG. 29. Non-glycosylated GM-CSF was not detected.
LC-ESI-TOF
[0290] The high performance liquid chromatography (HPLC) system used in this study consisted of an Agilent 1200 equipped with an autoinjector, a column-heating compartment, and a UV detector detecting at 210 and 280 nm. All LC-MS experiments performed with this system were run at 1 ml/min. The flow was not split for MS detection. Mass spectrometric analysis was carried out in positive ion mode on an Accurate-Mass Q-TOF LC/MS 6520 (Agilent Technologies). The temperature of the dual ESI source was set at 350° C. The nitrogen gas flow rates were set at 13 l/h for the cone and 350 l/h, and the nebulizer was set at 45 psig with 4500 volts applied to the capillary. A reference mass of 922.009 was prepared from HP-0921 according to the API-TOF reference mass solution kit for mass calibration and the protein mass measurements. Data for the ion spectrum range of 300-3000 m/z were acquired and processed using Agilent MassHunter.
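For orientation, the acquisition window of 300-3000 m/z determines which charge states of a protein can be observed in positive ion mode. The sketch below uses the standard relation m/z = (M + z × 1.007276)/z for a protonated ion; the neutral mass of 10,000 Da in the example is a hypothetical value chosen for illustration and is not taken from this disclosure.

```python
import math

PROTON_MASS = 1.007276  # Da, mass of a proton


def mz(neutral_mass, z):
    """m/z of an [M + zH]z+ ion in positive ion mode."""
    return (neutral_mass + z * PROTON_MASS) / z


def charge_states_in_window(neutral_mass, mz_min=300.0, mz_max=3000.0):
    """Return the range of charge states whose m/z falls inside the window."""
    z_low = max(1, math.ceil(neutral_mass / (mz_max - PROTON_MASS)))
    z_high = math.floor(neutral_mass / (mz_min - PROTON_MASS))
    return range(z_low, z_high + 1)


if __name__ == "__main__":
    mass = 10000.0  # hypothetical neutral mass in Da
    states = charge_states_in_window(mass)
    print(states[0], states[-1])
    print(round(mz(mass, states[0]), 3))
```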
Sample Preparation
[0291] An intact antibody sample (50 μg) was prepared in 50 μl of 25 mM NH4HCO3, pH 7.8. For deglycosylated antibody, a 50 μl aliquot of intact antibody sample was treated with PNGase F (10 units) for 18 hr at 37° C. Reduced antibody was prepared by adding 1 M DTT to a final concentration of 10 mM to an aliquot of either intact antibody or deglycosylated antibody and incubating for 30 min at 37° C.
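The quantities in this preparation follow from simple dilution arithmetic, sketched below. The antibody amount, buffer volume, and DTT concentrations are taken from the text; the 50 μl aliquot used for the DTT calculation is an assumption, since the text does not state the aliquot volume for the reduction step, and the function names are hypothetical.

```python
def concentration_ug_per_ul(mass_ug, volume_ul):
    """Protein concentration in micrograms per microliter."""
    return mass_ug / volume_ul


def stock_volume_to_add(sample_ul, stock_mm, final_mm):
    """Volume of stock to add so the mixture reaches final_mm.

    Solves stock_mm * v = final_mm * (sample_ul + v) for v, so the
    dilution caused by the added stock itself is accounted for.
    """
    if final_mm >= stock_mm:
        raise ValueError("final concentration must be below stock concentration")
    return sample_ul * final_mm / (stock_mm - final_mm)


if __name__ == "__main__":
    # 50 ug antibody in 50 ul buffer, as stated in the text.
    print(concentration_ug_per_ul(50.0, 50.0), "ug/ul")
    # 1 M (1000 mM) DTT stock to 10 mM final in an assumed 50 ul aliquot.
    print(round(stock_volume_to_add(50.0, 1000.0, 10.0), 3), "ul of 1 M DTT")
```

Because the required DTT volume is small relative to the sample, the antibody concentration is essentially unchanged by the reduction step.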
[0292] Three micrograms of intact or deglycosylated antibody sample were loaded onto a Poroshell 300SB-C3 column (2.1 mm×75 mm, 5 μm) (Agilent Technologies) maintained at 70° C. The protein was first rinsed on the cartridge for 1 min with 90% solvent A (0.1% HCOOH), 5% solvent B (90% Acetonitrile in 0.1% HCOOH). Elution was then performed using a gradient of 5-100% of B over 26 min followed by a 3 min regeneration at 100% B and by a final equilibration period of 10 min at 5% B.
[0293] For reduced antibody, a three microgram sample was loaded onto a Poroshell 300SB-C3 column (2.1 mm×75 mm, 5 μm) (Agilent Technologies) maintained at 40° C. The protein was first rinsed on the cartridge for 3 min with 90% solvent A, 5% solvent B. Elution was then performed using a gradient of 5-80% of B over 20 min followed by a 7 min regeneration at 80% B and by a final equilibration period of 10 min at 5% B.
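The two elution programs described above can be written as simple time tables, which makes the total run time and the solvent B fraction at any time point easy to compute. This is a minimal sketch: the step durations and percentages are taken from the text, while the representation, the function names, and the assumption that the return to 5% B is instantaneous are illustrative choices.

```python
# Each step is (duration in min, %B at start of step, %B at end of step).
INTACT_PROGRAM = [(1, 5, 5), (26, 5, 100), (3, 100, 100), (10, 5, 5)]
REDUCED_PROGRAM = [(3, 5, 5), (20, 5, 80), (7, 80, 80), (10, 5, 5)]


def total_minutes(program):
    """Total run time of the program in minutes."""
    return sum(duration for duration, _, _ in program)


def percent_b(program, t):
    """Percent solvent B at time t (min), interpolating linearly within a step."""
    if t < 0 or t > total_minutes(program):
        raise ValueError("time is outside the program")
    elapsed = 0.0
    for duration, start, end in program:
        if t <= elapsed + duration:
            fraction = (t - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    return float(program[-1][2])


if __name__ == "__main__":
    for name, program in (("intact", INTACT_PROGRAM), ("reduced", REDUCED_PROGRAM)):
        print(name, total_minutes(program), "min total")
    print(percent_b(INTACT_PROGRAM, 14))   # midpoint of the intact gradient
    print(percent_b(REDUCED_PROGRAM, 13))  # midpoint of the reduced gradient
```

Both programs come to the same total run time, with the reduced-antibody method trading a shallower, shorter gradient for a longer rinse and regeneration.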
Sequences
[0294] Sequences that were used to produce some of the strains disclosed in Examples 1-4 are provided in Table 12.
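Paired DNA and protein entries in the sequence listing can be cross-checked by translating the DNA with the standard genetic code. The sketch below does this for the S. cerevisiae mating factor pre-signal peptide (SEQ ID NO:13 and SEQ ID NO:14); the function name is hypothetical, and the check assumes the standard nuclear codon table applies to the listed coding sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence; whitespace in the input is ignored."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# SEQ ID NO:13 (DNA) and SEQ ID NO:14 (protein) as listed in Table 12.
SEQ_ID_13 = "ATGAGATTCCCATCCATCTTCACTGCTGTTTTGTTCGCT GCTTCTTCTGCTTTGGCT"
SEQ_ID_14 = "MRFPSIFTAVLFAASSALA"

if __name__ == "__main__":
    print(translate(SEQ_ID_13) == SEQ_ID_14)
```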
TABLE-US-00012 TABLE 12 BRIEF DESCRIPTION OF THE SEQUENCES SEQ ID NO: Description Sequence 1 PCR primer CTGAGGAGTCAGATATCAGCTCAATCTCCAT PpURA6out/UP 2 PCR primer TCCGGCTCGTATGTTGTGTGGAATTGT Puc19/LP 3 PCR primer CTGGATGTTTGATGGGTTCAGTTTCAGCTGGA PpURA6out/LP 4 PCR primer GGCAATAGTCGCGAGAATCCTTAAACCAT ScARR3/UP 5 PCR primer CCTCGTAAAGATCTGCGGTTTGCAAAGT PpTRP1- 5'out/UP 6 PCR primer CCTCCCACTGGAACCGATGATATGGAA PpALG3TT/LP 7 PCR primer GATGCGAAGTTAAGTGCGCAGAAAGTAATATCA PpTEFTT/UP 8 PCR primer CGTGTGTACCTTGAAACGTCAATGATACTTTGA PpTRP- 3'1out/LP 9 PCR primer CAGACTAAGACTGCTTCTCCACCTGCTAAG LmSTT3D/iUP 10 PCR primer CAACAGTAGAACCAGAAGCCTCGTAAGTACAG LmSTT3D/iLP 11 Leishmania ATGGGTAAAAGAAAGGGAAACTCCTTGGGAGATTCTG major STT3D GTTCTGCTGCTACTGCTTCCAGAGAGGCTTCTGCTCAA (DNA) GCTGAAGATGCTGCTTCCCAGACTAAGACTGCTTCTCC ACCTGCTAAGGTTATCTTGTTGCCAAAGACTTTGACTG ACGAGAAGGACTTCATCGGTATCTTCCCATTTCCATTC TGGCCAGTTCACTTCGTTTTGACTGTTGTTGCTTTGTTC GTTTTGGCTGCTTCCTGTTTCCAGGCTTTCACTGTTAGA ATGATCTCCGTTCAAATCTACGGTTACTTGATCCACGA ATTTGACCCATGGTTCAACTACAGAGCTGCTGAGTAC ATGTCTACTCACGGATGGAGTGCTTTTTTCTCCTGGTT CGATTACATGTCCTGGTATCCATTGGGTAGACCAGTTG GTTCTACTACTTACCCAGGATTGCAGTTGACTGCTGTT GCTATCCATAGAGCTTTGGCTGCTGCTGGAATGCCAAT GTCCTTGAACAATGTTTGTGTTTTGATGCCAGCTTGGT TTGGTGCTATCGCTACTGCTACTTTGGCTTTCTGTACTT ACGAGGCTTCTGGTTCTACTGTTGCTGCTGCTGCAGCT GCTTTGTCCTTCTCCATTATCCCTGCTCACTTGATGAG ATCCATGGCTGGTGAGTTCGACAACGAGTGTATTGCT GTTGCTGCTATGTTGTTGACTTTCTACTGTTGGGTTCGT TCCTTGAGAACTAGATCCTCCTGGCCAATCGGTGTTTT GACAGGTGTTGCTTACGGTTACATGGCTGCTGCTTGGG GAGGTTACATCTTCGTTTTGAACATGGTTGCTATGCAC GCTGGTATCTCTTCTATGGTTGACTGGGCTAGAAACAC TTACAACCCATCCTTGTTGAGAGCTTACACTTTGTTCT ACGTTGTTGGTACTGCTATCGCTGTTTGTGTTCCACCA GTTGGAATGTCTCCATTCAAGTCCTTGGAGCAGTTGGG AGCTTTGTTGGTTTTGGTTTTCTTGTGTGGATTGCAAGT TTGTGAGGTTTTGAGAGCTAGAGCTGGTGTTGAAGTTA GATCCAGAGCTAATTTCAAGATCAGAGTTAGAGTTTTC TCCGTTATGGCTGGTGTTGCTGCTTTGGCTATCTCTGTT TTGGCTCCAACTGGTTACTTTGGTCCATTGTCTGTTAG AGTTAGAGCTTTGTTTGTTGAGCACACTAGAACTGGTA ACCCATTGGTTGACTCCGTTGCTGAACATCAACCAGCT 
TCTCCAGAGGCTATGTGGGCTTTCTTGCATGTTTGTGG TGTTACTTGGGGATTGGGTTCCATTGTTTTGGCTGTTTC CACTTTCGTTCACTACTCCCCATCTAAGGTTTTCTGGTT GTTGAACTCCGGTGCTGTTTACTACTTCTCCACTAGAA TGGCTAGATTGTTGTTGTTGTCCGGTCCAGCTGCTTGT TTGTCCACTGGTATCTTCGTTGGTACTATCTTGGAGGC TGCTGTTCAATTGTCTTTCTGGGACTCCGATGCTACTA AGGCTAAGAAGCAGCAAAAGCAGGCTCAAAGACACC AAAGAGGTGCTGGTAAAGGTTCTGGTAGAGATGACGC TAAGAACGCTACTACTGCTAGAGCTTTCTGTGACGTTT TCGCTGGTTCTTCTTTGGCTTGGGGTCACAGAATGGTT TTGTCCATTGCTATGTGGGCTTTGGTTACTACTACTGC TGTTTCCTTCTTCTCCTCCGAATTTGCTTCTCACTCCAC TAAGTTCGCTGAACAATCCTCCAACCCAATGATCGTTT TCGCTGCTGTTGTTCAGAACAGAGCTACTGGAAAGCC AATGAACTTGTTGGTTGACGACTACTTGAAGGCTTACG AGTGGTTGAGAGACTCTACTCCAGAGGACGCTAGAGT TTTGGCTTGGTGGGACTACGGTTACCAAATCACTGGTA TCGGTAACAGAACTTCCTTGGCTGATGGTAACACTTGG AACCACGAGCACATTGCTACTATCGGAAAGATGTTGA CTTCCCCAGTTGTTGAAGCTCACTCCCTTGTTAGACAC ATGGCTGACTACGTTTTGATTTGGGCTGGTCAATCTGG TGACTTGATGAAGTCTCCACACATGGCTAGAATCGGT AACTCTGTTTACCACGACATTTGTCCAGATGACCCATT GTGTCAGCAATTCGGTTTCCACAGAAACGATTACTCCA GACCAACTCCAATGATGAGAGCTTCCTTGTTGTACAAC TTGCACGAGGCTGGAAAAAGAAAGGGTGTTAAGGTTA ACCCATCTTTGTTCCAAGAGGTTTACTCCTCCAAGTAC GGACTTGTTAGAATCTTCAAGGTTATGAACGTTTCCGC TGAGTCTAAGAAGTGGGTTGCAGACCCAGCTAACAGA GTTTGTCACCCACCTGGTTCTTGGATTTGTCCTGGTCA ATACCCACCTGCTAAAGAAATCCAAGAGATGTTGGCT CACAGAGTTCCATTCGACCAGGTTACAAACGCTGACA GAAAGAACAATGTTGGTTCCTACCAAGAGGAATACAT GAGAAGAATGAGAGAGTCCGAGAACAGAAGATAATA G 12 Leishmania MGKRKGNSLGDSGSAATASREASAQAEDAASQTKTASP major STT3D PAKVILLPKTLTDEKDFIGIFPFPFWPVHFVLTVVALFVLA (protein) ASCFQAFTVRMISVQIYGYLIHEFDPWFNYRAAEYMSTH GWSAFFSWFDYMSWYPLGRPVGSTTYPGLQLTAVAIHR ALAAAGMPMSLNNVCVLMPAWFGAIATATLAFCTYEAS GSTVAAAAAALSFSIIPAHLMRSMAGEFDNECIAVAAML LTFYCWVRSLRTRSSWPIGVLTGVAYGYMAAAWGGYIF VLNMVAMHAGISSMVDWARNTYNPSLLRAYTLFYVVG TAIAVCVPPVGMSPFKSLEQLGALLVLVFLCGLQVCEVL RARAGVEVRSRANFKIRVRVFSVMAGVAALAISVLAPTG YFGPLSVRVRALFVEHTRTGNPLVDSVAEHQPASPEAM WAFLHVCGVTWGLGSIVLAVSTFVHYSPSKVFWLLNSG AVYYFSTRMARLLLLSGPAACLSTGIFVGTILEAAVQLSF WDSDATKAKKQQKQAQRHQRGAGKGSGRDDAKNATT ARAFCDVFAGSSLAWGHRMVLSIAMWALVTTTAVSFFS 
SEFASHSTKFAEQSSNPMIVFAAVVQNRATGKPMNLLVD DYLKAYEWLRDSTPEDARVLAWWDYGYQITGIGNRTSL ADGNTWNHEHIATIGKMLTSPVVEAHSLVRHMADYVLI WAGQSGDLMKSPHMARIGNSVYHDICPDDPLCQQFGFH RNDYSRPTPMMRASLLYNLHEAGKRKGVKVNPSLFQEV YSSKYGLVRIFKVMNVSAESKKWVADPANRVCHPPGSW ICPGQYPPAKEIQEMLAHRVPFDQVTNADRKNNVGSYQ EEYMRRMRESENRR 13 Saccharomyces ATGAGATTCCCATCCATCTTCACTGCTGTTTTGTTCGCT cerevisiae GCTTCTTCTGCTTTGGCT mating factor pre-signal peptide (DNA) 14 Saccharomyces MRFPSIFTAVLFAASSALA cerevisiae mating factor pre-signal peptide (protein) 15 Anti-Her2 GAGGTTCAGTTGGTTGAATCTGGAGGAGGATTGGTTC Heavy chain AACCTGGTGGTTCTTTGAGATTGTCCTGTGCTGCTTCC (VH + IgG1 GGTTTCAACATCAAGGACACTTACATCCACTGGGTTA constant region) GACAAGCTCCAGGAAAGGGATTGGAGTGGGTTGCTAG (DNA) AATCTACCCAACTAACGGTTACACAAGATACGCTGAC TCCGTTAAGGGAAGATTCACTATCTCTGCTGACACTTC CAAGAACACTGCTTACTTGCAGATGAACTCCTTGAGA GCTGAGGATACTGCTGTTTACTACTGTTCCAGATGGGG TGGTGATGGTTTCTACGCTATGGACTACTGGGGTCAAG GAACTTTGGTTACTGTTTCCTCCGCTTCTACTAAGGGA CCATCTGTTTTCCCATTGGCTCCATCTTCTAAGTCTACT TCCGGTGGTACTGCTGCTTTGGGATGTTTGGTTAAAGA CTACTTCCCAGAGCCAGTTACTGTTTCTTGGAACTCCG GTGCTTTGACTTCTGGTGTTCACACTTTCCCAGCTGTTT TGCAATCTTCCGGTTTGTACTCTTTGTCCTCCGTTGTTA CTGTTCCATCCTCTTCCTTGGGTACTCAGACTTACATCT GTAACGTTAACCACAAGCCATCCAACACTAAGGTTGA CAAGAAGGTTGAGCCAAAGTCCTGTGACAAGACACAT ACTTGTCCACCATGTCCAGCTCCAGAATTGTTGGGTGG TCCATCCGTTTTCTTGTTCCCACCAAAGCCAAAGGACA CTTTGATGATCTCCAGAACTCCAGAGGTTACATGTGTT GTTGTTGACGTTTCTCACGAGGACCCAGAGGTTAAGTT CAACTGGTACGTTGACGGTGTTGAAGTTCACAACGCT AAGACTAAGCCAAGAGAAGAGCAGTACAACTCCACTT ACAGAGTTGTTTCCGTTTTGACTGTTTTGCACCAGGAC TGGTTGAACGGTAAAGAATACAAGTGTAAGGTTTCCA ACAAGGCTTTGCCAGCTCCAATCGAAAAGACTATCTC CAAGGCTAAGGGTCAACCAAGAGAGCCACAGGTTTAC ACTTTGCCACCATCCAGAGAAGAGATGACTAAGAACC AGGTTTCCTTGACTTGTTTGGTTAAAGGATTCTACCCA TCCGACATTGCTGTTGAGTGGGAATCTAACGGTCAAC CAGAGAACAACTACAAGACTACTCCACCAGTTTTGGA TTCTGATGGTTCCTTCTTCTTGTACTCCAAGTTGACTGT TGACAAGTCCAGATGGCAACAGGGTAACGTTTTCTCC TGTTCCGTTATGCATGAGGCTTTGCACAACCACTACAC TCAAAAGTCCTTGTCTTTGTCCCCTGGTTAA 16 Anti-Her2 
EVQLVESGGGLVQPGGSLRLSCAASGFNIKDTYIHWVRQ Heavy chain APGKGLEWVARIYPTNGYTRYADSVKGRFTISADTSKNT (VH + IgG1 AYLQMNSLRAEDTAVYYCSRWGGDGFYAMDYWGQGT constant region) LVTVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFP (protein) EPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSS SLGTQTYICNVNHKPSNTKVDKKVEPKSCDKTHTCPPCP APELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHED PEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLT VLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREP QVYTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNG QPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFS CSVMHEALHNHYTQKSLSLSPG 17 Anti-Her2 light GACATCCAAATGACTCAATCCCCATCTTCTTTGTCTGC chain (VL + TTCCGTTGGTGACAGAGTTACTATCACTTGTAGAGCTT Kappa constant CCCAGGACGTTAATACTGCTGTTGCTTGGTATCAACAG region) (DNA) AAGCCAGGAAAGGCTCCAAAGTTGTTGATCTACTCCG CTTCCTTCTTGTACTCTGGTGTTCCATCCAGATTCTCTG GTTCCAGATCCGGTACTGACTTCACTTTGACTATCTCC TCCTTGCAACCAGAAGATTTCGCTACTTACTACTGTCA GCAGCACTACACTACTCCACCAACTTTCGGACAGGGT ACTAAGGTTGAGATCAAGAGAACTGTTGCTGCTCCAT CCGTTTTCATTTTCCCACCATCCGACGAACAGTTGAAG TCTGGTACAGCTTCCGTTGTTTGTTTGTTGAACAACTT CTACCCAAGAGAGGCTAAGGTTCAGTGGAAGGTTGAC AACGCTTTGCAATCCGGTAACTCCCAAGAATCCGTTAC TGAGCAAGACTCTAAGGACTCCACTTACTCCTTGTCCT CCACTTTGACTTTGTCCAAGGCTGATTACGAGAAGCAC AAGGTTTACGCTTGTGAGGTTACACATCAGGGTTTGTC CTCCCCAGTTACTAAGTCCTTCAACAGAGGAGAGTGTT AA 18 Anti-Her2 light DIQMTQSPSSLSASVGDRVTITCRASQDVNTAVAWYQQ chain (VL + KPGKAPKLLIYSASFLYSGVPSRFSGSRSGTDFTLTISSLQ Kappa constant PEDFATYYCQQHYTTPPTFGQGTKVEIKRTVAAPSVFIFP region) PSDEQLKSGTASVVCLLNNFYPREAKVQWKVDNALQSG NSQESVTEQDSKDSTYSLSSTLTLSKADYEKHKVYACEV THQGLSSPVTKSFNRGEC 19 Anti-RSV CAGGTTACATTGAGAGAATCCGGTCCAGCTTTGGTTA Heavy chain AGCCAACTCAGACTTTGACTTTGACTTGTACTTTCTCC (VH + IgG1 GGTTTCTCCTTGTCTACTTCCGGAATGTCTGTTGGATG constant region) GATCAGACAACCACCTGGAAAGGCTTTGGAATGGCTT (DNA) GCTGACATTTGGTGGGATGACAAGAAGGACTACAACC CATCCTTGAAGTCCAGATTGACTATCTCCAAGGACACT TCCAAGAATCAAGTTGTTTTGAAGGTTACAAACATGG ACCCAGCTGACACTGCTACTTACTACTGTGCTAGATCC ATGATCACTAACTGGTACTTCGATGTTTGGGGTGCTGG TACTACTGTTACTGTCTCGAGTGCTTCTACTAAGGGAC CATCCGTTTTTCCATTGGCTCCATCCTCTAAGTCTACTT 
CCGGTGGAACCGCTGCTTTGGGATGTTTGGTTAAAGA CTACTTCCCAGAGCCAGTTACTGTTTCTTGGAACTCCG GTGCTTTGACTTCTGGTGTTCACACTTTCCCAGCTGTTT TGCAATCTTCCGGTTTGTACTCTTTGTCCTCCGTTGTTA CTGTTCCATCCTCTTCCTTGGGTACTCAGACTTACATCT GTAACGTTAACCACAAGCCATCCAACACTAAGGTTGA CAAGAGAGTTGAGCCAAAGTCCTGTGACAAGACACAT ACTTGTCCACCATGTCCAGCTCCAGAATTGTTGGGTGG TCCATCCGTTTTCTTGTTCCCACCAAAGCCAAAGGACA CTTTGATGATCTCCAGAACTCCAGAGGTTACATGTGTT GTTGTTGACGTTTCTCACGAGGACCCAGAGGTTAAGTT CAACTGGTACGTTGACGGTGTTGAAGTTCACAACGCT AAGACTAAGCCAAGAGAAGAGCAGTACAACTCCACTT ACAGAGTTGTTTCCGTTTTGACTGTTTTGCACCAGGAC TGGTTGAACGGTAAAGAATACAAGTGTAAGGTTTCCA ACAAGGCTTTGCCAGCTCCAATCGAAAAGACTATCTC CAAGGCTAAGGGTCAACCAAGAGAGCCACAGGTTTAC ACTTTGCCACCATCCAGAGAAGAGATGACTAAGAACC
AGGTTTCCTTGACTTGTTTGGTTAAAGGATTCTACCCA TCCGACATTGCTGTTGAGTGGGAATCTAACGGTCAAC CAGAGAACAACTACAAGACTACTCCACCAGTTTTGGA TTCTGATGGTTCCTTCTTCTTGTACTCCAAGTTGACTGT TGACAAGTCCAGATGGCAACAGGGTAACGTTTTCTCC TGTTCCGTTATGCATGAGGCTTTGCACAACCACTACAC TCAAAAGTCCTTGTCTTTGTCCCCTGGTTAA 20 Anti-RSV QVTLRESGPALVKPTQTLTLTCTFSGFSLSTSGMSVGWIR Heavy chain QPPGKALEWLADIWWDDKKDYNPSLKSRLTISKDTSKN (VH + IgG1 QVVLKVTNMDPADTATYYCARSMITNWYFDVWGAGTT constant region) VTVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPE (protein) PVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSS LGTQTYICNVNHKPSNTKVDKRVEPKSCDKTHTCPPCPA PELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDP EVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTV LHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQ VYTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNGQ PENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSC SVMHEALHNHYTQKSLSLSPG 21 Anti-RSV light ATGAGATTCCCATCCATCTTCACTGCTGTTTTGTTCGCT chain (VL + GCTTCTTCTGCTTTGGCTGACATTCAGATGACACAGTC Kappa constant CCCATCTACTTTGTCTGCTTCCGTTGGTGACAGAGTTA region (DNA) CTATCACTTGTAAGTGTCAGTTGTCCGTTGGTTACATG CACTGGTATCAGCAAAAGCCAGGAAAGGCTCCAAAGT TGTTGATCTACGACACTTCCAAGTTGGCTTCCGGTGTT CCATCTAGATTCTCTGGTTCCGGTTCTGGTACTGAGTT CACTTTGACTATCTCTTCCTTGCAACCAGATGACTTCG CTACTTACTACTGTTTCCAGGGTTCTGGTTACCCATTC ACTTTCGGTGGTGGTACTAAGTTGGAGATCAAGAGAA CTGTTGCTGCTCCATCCGTTTTCATTTTCCCACCATCCG ACGAACAATTGAAGTCCGGTACCGCTTCCGTTGTTTGT TTGTTGAACAACTTCTACCCACGTGAGGCTAAGGTTCA GTGGAAGGTTGACAACGCTTTGCAATCCGGTAACTCC CAAGAATCCGTTACTGAGCAGGATTCTAAGGATTCCA CTTACTCATTGTCCTCCACTTTGACTTTGTCCAAGGCT GATTACGAGAAGCACAAGGTTTACGCTTGCGAGGTTA CACATCAGGGTTTGTCCTCCCCAGTTACTAAGTCCTTC AACAGAGGAGAGTGTTAA 22 Anti-RSV light DIQMTQSPSTLSASVGDRVTITCKCQLSVGYMHWYQQK chain (VL + PGKAPKLLIYDTSKLASGVPSRFSGSGSGTEFTLTISSLQP Kappa constant DDFATYYCFQGSGYPFTFGGGTKLEIKRTVAAPSVFIFPP region) (protein) SDEQLKSGTASVVCLLNNFYPREAKVQWKVDNALQSGN SQESVTEQDSKDSTYSLSSTLTLSKADYEKHKVYACEVT HQGLSSPVTKSFNRGEC 23 Pp AOX1 AACATCCAAAGACGAAAGGTTGAATGAAACCTTTTTG promoter CCATCCGACATCCACAGGTCCATTCTCACACATAAGTG CCAAACGCAACAGGAGGGGATACACTAGCAGCAGAC 
CGTTGCAAACGCAGGACCTCCACTCCTCTTCTCCTCAA CACCCACTTTTGCCATCGAAAAACCAGCCCAGTTATTG GGCTTGATTGGAGCTCGCTCATTCCAATTCCTTCTATT AGGCTACTAACACCATGACTTTATTAGCCTGTCTATCC TGGCCCCCCTGGCGAGGTTCATGTTTGTTTATTTCCGA ATGCAACAAGCTCCGCATTACACCCGAACATCACTCC AGATGAGGGCTTTCTGAGTGTGGGGTCAAATAGTTTC ATGTTCCCCAAATGGCCCAAAACTGACAGTTTAAACG CTGTCTTGGAACCTAATATGACAAAAGCGTGATCTCAT CCAAGATGAACTAAGTTTGGTTCGTTGAAATGCTAAC GGCCAGTTGGTCAAAAAGAAACTTCCAAAAGTCGGCA TACCGTTTGTCTTGTTTGGTATTGATTGACGAATGCTC AAAAATAATCTCATTAATGCTTAGCGCAGTCTCTCTAT CGCTTCTGAACCCCGGTGCACCTGTGCCGAAACGCAA ATGGGGAAACACCCGCTTTTTGGATGATTATGCATTGT CTCCACATTGTATGCTTCCAAGATTCTGGTGGGAATAC TGCTGATAGCCTAACGTTCATGATCAAAATTTAACTGT TCTAACCCCTACTTGACAGCAATATATAAACAGAAGG AAGCTGCCCTGTCTTAAACCTTTTTTTTTATCATCATTA TTAGCTTACTTTCATAATTGCGACTGGTTCCAATTGAC AAGCTTTTGATTTTAACGACTTTTAACGACAACTTGAG AAGATCAAAAAACAACTAATTATTCGAAACG 24 ScCYC TT ACAGGCCCCTTTTCCTTTGTCGATATCATGTAATTAGT TATGTCACGCTTACATTCACGCCCTCCTCCCACATCCG CTCTAACCGAAAAGGAAGGAGTTAGACAACCTGAAGT CTAGGTCCCTATTTATTTTTTTTAATAGTTATGTTAGTA TTAAGAACGTTATTTATATTTCAAATTTTTCTTTTTTTT CTGTACAAACGCGTGTACGCATGTAACATTATACTGA AAACCTTGCTTGAGAAGGTTTTGGGACGCTCGAAGGC TTTAATTTGCAAGCTGCCGGCTCTTAAG 25 PpRPL10 GTTCTTCGCTTGGTCTTGTATCTCCTTACACTGTATCTTCCC promoter ATTTGCGTTTAGGTGGTTATCAAAAACTAAAAGGAAAAATT TCAGATGTTTATCTCTAAGGTTTTTTCTTTTTACAGTATAAC ACGTGATGCGTCACGTGGTACTAGATTACGTAAGTTATTTT GGTCCGGTGGGTAAGTGGGTAAGAATAGAAAGCATGAAGG TTTACAAAAACGCAGTCACGAATTATTGCTACTTCGAGCTT GGAACCACCCCAAAGATTATATTGTACTGATGCACTACCTT CTCGATTTTGCTCCTCCAAGAACCTACGAAAAACATTTCTT GAGCCTTTTCAACCTAGACTACACATCAAGTTATTTAAGGT ATGTTCCGTTAACATGTAAGAAAAGGAGAGGATAGATCGT TTATGGGGTACGTCGCCTGATTCAAGCGTGACCATTCGAAG AATAGGCCTTCGAAAGCTGAATAAAGCAAATGTCAGTTGC GATTGGTATGCTGACAAATTAGCATAAAAAGCAATAGACTT TCTAACCACCTGTTTTTTTCCTTTTACTTTATTTATATTTTGC CACCGTACTAACAAGTTCAGACAAA 26 PpGAPDH TTTTTGTAGAAATGTCTTGGTGTCCTCGTCCAATCAGG promoter TAGCCATCTCTGAAATATCTGGCTCCGTTGCAACTCCG AACGACCTGCTGGCAACGTAAAATTCTCCGGGGTAAA ACTTAAATGTGGAGTAATGGAACCAGAAACGTCTCTT 
CCCTTCTCTCTCCTTCCACCGCCCGTTACCGTCCCTAG GAAATTTTACTCTGCTGGAGAGCTTCTTCTACGGCCCC CTTGCAGCAATGCTCTTCCCAGCATTACGTTGCGGGTA AAACGGAGGTCGTGTACCCGACCTAGCAGCCCAGGGA TGGAAAAGTCCCGGCCGTCGCTGGCAATAATAGCGGG CGGACGCATGTCATGAGATTATTGGAAACCACCAGAA TCGAATATAAAAGGCGAACACCTTTCCCAATTTTGGTT TCTCCTGACCCAAAGACTTTAAATTTAATTTATTTGTC CCTATTTCAATCAATTGAACAACTATCAAAACACA 27 PpTEF1 TTAAGGTTTGGAACAACACTAAACTACCTTGCGGTACT promoter ACCATTGACACTACACATCCTTAATTCCAATCCTGTCT GGCCTCCTTCACCTTTTAACCATCTTGCCCATTCCAAC TCGTGTCAGATTGCGTATCAAGTGAAAAAAAAAAAAT TTTAAATCTTTAACCCAATCAGGTAATAACTGTCGCCT CTTTTATCTGCCGCACTGCATGAGGTGTCCCCTTAGTG GGAAAGAGTACTGAGCCAACCCTGGAGGACAGCAAG GGAAAAATACCTACAACTTGCTTCATAATGGTCGTAA AAACAATCCTTGTCGGATATAAGTGTTGTAGACTGTCC CTTATCCTCTGCGATGTTCTTCCTCTCAAAGTTTGCGAT TTCTCTCTATCAGAATTGCCATCAAGAGACTCAGGACT AATTTCGCAGTCCCACACGCACTCGTACATGATTGGCT GAAATTTCCCTAAAGAATTTCTTTTTCACGAAAATTTT TTTTTTACACAAGATTTTCAGCAGATATAAAATGGAGA GCAGGACCTCCGCTGTGACTCTTCTTTTTTTTCTTTTAT TCTCACTACATACATTTTAGTTATTCGCCAAC 28 PpTEF1 TT ATTGCTTGAAGCTTTAATTTATTTTATTAACATAATAA TAATACAAGCATGATATATTTGTATTTTGTTCGTTAAC ATTGATGTTTTCTTCATTTACTGTTATTGTTTGTAACTT TGATCGATTTATCTTTTCTACTTTACTGTAATATGGCTG GCGGGTGAGCCTTGAACTCCCTGTATTACTTTACCTTG CTATTACTTAATCTATTGACTAGCAGCGACCTCTTCAA CCGAAGGGCAAGTACACAGCAAGTTCATGTCTCCGTA AGTGTCATCAACCCTGGAAACAGTGGGCCATGTC 29 PpALG3 TT ATTTACAATTAGTAATATTAAGGTGGTAAAAACATTC GTAGAATTGAAATGAATTAATATAGTATGACAATGGT TCATGTCTATAAATCTCCGGCTTCGGTACCTTCTCCCC AATTGAATACATTGTCAAAATGAATGGTTGAACTATT AGGTTCGCCAGTTTCGTTATTAAGAAAACTGTTAAAAT CAAATTCCATATCATCGGTTCCAGTGGGAGGACCAGT TCCATCGCCAAAATCCTGTAAGAATCCATTGTCAGAA CCTGTAAAGTCAGTTTGAGATGAAATTTTTCCGGTCTT TGTTGACTTGGAAGCTTCGTTAAGGTTAGGTGAAACA GTTTGATCAACCAGCGGCTCCCGTTTTCGTCGCTTAGT AG 30 PpTRP1 5' GCGGAAACGGCAGTAAACAATGGAGCTTCATTAGTGGGTG region and ORF TTATTATGGTCCCTGGCCGGGAACGAACGGTGAAACAAGA GGTTGCGAGGGAAATTTCGCAGATGGTGCGGGAAAAGAGA ATTTCAAAGGGCTCAAAATACTTGGATTCCAGACAACTGAG GAAAGAGTGGGACGACTGTCCTCTGGAAGACTGGTTTGAG TACAACGTGAAAGAAATAAACAGCAGTGGTCCATTTTTAGT 
TGGAGTTTTTCGTAATCAAAGTATAGATGAAATCCAGCAAG CTATCCACACTCATGGTTTGGATTTCGTCCAACTACATGGG TCTGAGGATTTTGATTCGTATATACGCAATATCCCAGTTCCT GTGATTACCAGATACACAGATAATGCCGTCGATGGTCTTAC CGGAGAAGACCTCGCTATAAATAGGGCCCTGGTGCTACTG GACAGCGAGCAAGGAGGTGAAGGAAAAACCATCGATTGGG CTCGTGCACAAAAATTTGGAGAACGTAGAGGAAAATATTT ACTAGCCGGAGGTTTGACACCTGATAATGTTGCTCATGCTC GATCTCATACTGGCTGTATTGGTGTTGACGTCTCTGGTGGG GTAGAAACAAATGCCTCAAAAGATATGGACAAGATCACAC AATTTATCAGAAACGCTACATAA 31 PpTRP1 3' AAGTCAATTAAATACACGCTTGAAAGGACATTACATAGCTT region TCGATTTAAGCAGAACCAGAAATGTAGAACCACTTGTCAAT AGATTGGTCAATCTTAGCAGGAGCGGCTGGGCTAGCAGTTG GAACAGCAGAGGTTGCTGAAGGTGAGAAGGATGGAGTGGA TTGCAAAGTGGTGTTGGTTAAGTCAATCTCACCAGGGCTGG TTTTGCCAAAAATCAACTTCTCCCAGGCTTCACGGCATTCTT GAATGACCTCTTCTGCATACTTCTTGTTCTTGCATTCACCAG AGAAAGCAAACTGGTTCTCAGGTTTTCCATCAGGGATCTTG TAAATTCTGAACCATTCGTTGGTAGCTCTCAACAAGCCCGG CATGTGCTTTTCAACATCCTCGATGTCATTGAGCTTAGGAG CCAATGGGTCGTTGATGTCGATGACGATGACCTTCCAGTCA GTCTCTCCCTCATCCAACAAAGCCATAACACCGAGGACCTT GACTTGCTTGACCTGTCCAGTGTAACCTACGGCTTCACCAA TTTCGCAAACGTCCAATGGATCATTGTCACCCTTGGCCTTG GTCTCTGGATGAGTGACGTTAGGGTCTTCCCATGTCTGAGG GAAGGCACCGTAGTTGTGAATGTATCCGTGGTGAGGGAAA CAGTTACGAACGAAACGAAGTTTTCCCTTCTTTGTGTCCTG AAGAATTGGGTTCAGTTTCTCCTCCTTGGAAATCTCCAACTT GGCGTTGGTCCAACGGGGGACTTCAACAACCATGTTGAGA ACCTTCTTGGATTCGTCAGCATAAAGTGGGATGTCGTGGAA AGGAGATACGACTT 32 ScARR3 ORF ATGTCAGAAGATCAAAAAAGTGAAAATTCCGTACCTTCTAA GGTTAATATGGTGAATCGCACCGATATACTGACTACGATCA AGTCATTGTCATGGCTTGACTTGATGTTGCCATTTACTATAA TTCTCTCCATAATCATTGCAGTAATAATTTCTGTCTATGTGC CTTCTTCCCGTCACACTTTTGACGCTGAAGGTCATCCCAATC TAATGGGAGTGTCCATTCCTTTGACTGTTGGTATGATTGTA ATGATGATTCCCCCGATCTGCAAAGTTTCCTGGGAGTCTAT TCACAAGTACTTCTACAGGAGCTATATAAGGAAGCAACTA GCCCTCTCGTTATTTTTGAATTGGGTCATCGGTCCTTTGTTG ATGACAGCATTGGCGTGGATGGCGCTATTCGATTATAAGGA ATACCGTCAAGGCATTATTATGATCGGAGTAGCTAGATGCA TTGCCATGGTGCTAATTTGGAATCAGATTGCTGGAGGAGAC AATGATCTCTGCGTCGTGCTTGTTATTACAAACTCGCTTTTA CAGATGGTATTATATGCACCATTGCAGATATTTTACTGTTAT GTTATTTCTCATGACCACCTGAATACTTCAAATAGGGTATT 
ATTCGAAGAGGTTGCAAAGTCTGTCGGAGTTTTTCTCGGCA TACCACTGGGAATTGGCATTATCATACGTTTGGGAAGTCTT ACCATAGCTGGTAAAAGTAATTATGAAAAATACATTTTGAG ATTTATTTCTCCATGGGCAATGATCGGATTTCATTACACTTT ATTTGTTATTTTTATTAGTAGAGGTTATCAATTTATCCACGA AATTGGTTCTGCAATATTGTGCTTTGTCCCATTGGTGCTTTA CTTCTTTATTGCATGGTTTTTGACCTTCGCATTAATGAGGTA CTTATCAATATCTAGGAGTGATACACAAAGAGAATGTAGCT GTGACCAAGAACTACTTTTAAAGAGGGTCTGGGGAAGAAA GTCTTGTGAAGCTAGCTTTTCTATTACGATGACGCAATGTTT CACTATGGCTTCAAATAATTTTGAACTATCCCTGGCAATTG CTATTTCCTTATATGGTAACAATAGCAAGCAAGCAATAGCT GCAACATTTGGGCCGTTGCTAGAAGTTCCAATTTTATTGAT TTTGGCAATAGTCGCGAGAATCCTTAAACCATATTATATAT GGAACAATAGAAATTAA 33 URA6 region CAAATGCAAGAGGACATTAGAAATGTGTTTGGTAAGAACA TGAAGCCGGAGGCATACAAACGATTCACAGATTTGAAGGA GGAAAACAAACTGCATCCACCGGAAGTGCCAGCAGCCGTG TATGCCAACCTTGCTCTCAAAGGCATTCCTACGGATCTGAG TGGGAAATATCTGAGATTCACAGACCCACTATTGGAACAGT ACCAAACCTAGTTTGGCCGATCCATGATTATGTAATGCATA TAGTTTTTGTCGATGCTCACCCGTTTCGAGTCTGTCTCGTAT CGTCTTACGTATAAGTTCAAGCATGTTTACCAGGTCTGTTA GAAACTCCTTTGTGAGGGCAGGACCTATTCGTCTCGGTCCC GTTGTTTCTAAGAGACTGTACAGCCAAGCGCAGAATGGTGG CATTAACCATAAGAGGATTCTGATCGGACTTGGTCTATTGG CTATTGGAACCACCCTTTACGGGACAACCAACCCTACCAAG ACTCCTATTGCATTTGTGGAACCAGCCACGGAAAGAGCGTT TAAGGACGGAGACGTCTCTGTGATTTTTGTTCTCGGAGGTC CAGGAGCTGGAAAAGGTACCCAATGTGCCAAACTAGTGAG TAATTACGGATTTGTTCACCTGTCAGCTGGAGACTTGTTAC GTGCAGAACAGAAGAGGGAGGGGTCTAAGTATGGAGAGAT GATTTCCCAGTATATCAGAGATGGACTGATAGTACCTCAAG AGGTCACCATTGCGCTCTTGGAGCAGGCCATGAAGGAAAA CTTCGAGAAAGGGAAGACACGGTTCTTGATTGATGGATTCC CTCGTAAGATGGACCAGGCCAAAACTTTTGAGGAAAAAGT CGCAAAGTCCAAGGTGACACTTTTCTTTGATTGTCCCGAAT CAGTGCTCCTTGAGAGATTACTTAAAAGAGGACAGACAAG CGGAAGAGAGGATGATAATGCGGAGAGTATCAAAAAAAG ATTCAAAACATTCGTGGAAACTTCGATGCCTGTGGTGGACT ATTTCGGGAAGCAAGGACGCGTTTTGAAGGTATCTTGTGAC CACCCTGTGGATCAAGTGTATTCACAGGTTGTGTCGGTGCT AAAAGAGAAGGGGATCTTTGCCGATAACGAGACGGAGAAT AAATAA
34 NatR ORF ATGGGTACCACTCTTGACGACACGGCTTACCGGTACC GCACCAGTGTCCCGGGGGACGCCGAGGCCATCGAGGC ACTGGATGGGTCCTTCACCACCGACACCGTCTTCCGCG TCACCGCCACCGGGGACGGCTTCACCCTGCGGGAGGT GCCGGTGGACCCGCCCCTGACCAAGGTGTTCCCCGAC GACGAATCGGACGACGAATCGGACGACGGGGAGGAC GGCGACCCGGACTCCCGGACGTTCGTCGCGTACGGGG ACGACGGCGACCTGGCGGGCTTCGTGGTCGTCTCGTA CTCCGGCTGGAACCGCCGGCTGACCGTCGAGGACATC GAGGTCGCCCCGGAGCACCGGGGGCACGGGGTCGGG CGCGCGTTGATGGGGCTCGCGACGGAGTTCGCCCGCG AGCGGGGCGCCGGGCACCTCTGGCTGGAGGTCACCAA CGTCAACGCACCGGCGATCCACGCGTACCGGCGGATG GGGTTCACCCTCTGCGGCCTGGACACCGCCCTGTACG ACGGCACCGCCTCGGACGGCGAGCAGGCGCTCTACAT GAGCATGCCCTGCCCCTAATCAGTACTG 35 Sequence of the ATGGCCAAGTTGACCAGTGCCGTTCCGGTGCTCACCG Sh ble ORF CGCGCGACGTCGCCGGAGCGGTCGAGTTCTGGACCGA (Zeocin CCGGCTCGGGTTCTCCCGGGACTTCGTGGAGGACGAC resistance TTCGCCGGTGTGGTCCGGGACGACGTGACCCTGTTCAT marker): CAGCGCGGTCCAGGACCAGGTGGTGCCGGACAACACC CTGGCCTGGGTGTGGGTGCGCGGCCTGGACGAGCTGT ACGCCGAGTGGTCGGAGGTCGTGTCCACGAACTTCCG GGACGCCTCCGGGCCGGCCATGACCGAGATCGGCGAG CAGCCGTGGGGGCGGGAGTTCGCCCTGCGCGACCCGG CCGGCAACTGCGTGCACTTCGTGGCCGAGGAGCAGGA CTGA 36 PpAOX1 TT TCAAGAGGATGTCAGAATGCCATTTGCCTGAGAGATGCAG GCTTCATTTTGATACTTTTTTATTTGTAACCTATATAGTATA GGATTTTTTTTGTCATTTTGTTTCTTCTCGTACGAGCTTGCTC CTGATCAGCCTATCTCGCAGCTGATGAATATCTTGTGGTAG GGGTTTGGGAAAATCATTCGAGTTTGATGTTTTTCTTGGTAT TTCCCACTCCTCTTCAGAGTACAGAAGATTAAGTGAGACGT TCGTTTGTGCA 37 ScTEF1 GATCCCCCACACACCATAGCTTCAAAATGTTTCTACTC promoter CTTTTTTACTCTTCCAGATTTTCTCGGACTCCGCGCATC GCCGTACCACTTCAAAACACCCAAGCACAGCATACTA AATTTCCCCTCTTTCTTCCTCTAGGGTGTCGTTAATTAC CCGTACTAAAGGTTTGGAAAAGAAAAAAGAGACCGCC TCGTTTCTTTTTCTTCGTCGAAAAAGGCAATAAAAATT TTTATCACGTTTCTTTTTCTTGAAAATTTTTTTTTTTGAT TTTTTTCTCTTTCGATGACCTCCCATTGATATTTAAGTT AATAAACGGTCTTCAATTTCTCAAGTTTCAGTTTCATT TTTCTTGTTCTATTACAACTTTTTTTACTTCTTGCTCATT AGAAAGAAAGCATAGCAATCTAATCTAAGTTTTAATT ACAAA 38 S. 
cerevisiae AGGCCTCGCAACAACCTATAATTGAGTTAAGTGCCTTT invertase gene CCAAGCTAAAAAGTTTGAGGTTATAGGGGCTTAGCAT (ScSUC2) ORF CCACACGTCACAATCTCGGGTATCGAGTATAGTATGT underlined AGAATTACGGCAGGAGGTTTCCCAATGAACAAAGGAC AGGGGCACGGTGAGCTGTCGAAGGTATCCATTTTATC ATGTTTCGTTTGTACAAGCACGACATACTAAGACATTT ACCGTATGGGAGTTGTTGTCCTAGCGTAGTTCTCGCTC CCCCAGCAAAGCTCAAAAAAGTACGTCATTTAGAATA GTTTGTGAGCAAATTACCAGTCGGTATGCTACGTTAGA AAGGCCCACAGTATTCTTCTACCAAAGGCGTGCCTTTG TTGAACTCGATCCATTATGAGGGCTTCCATTATTCCCC GCATTTTTATTACTCTGAACAGGAATAAAAAGAAAAA ACCCAGTTTAGGAAATTATCCGGGGGCGAAGAAATAC GCGTAGCGTTAATCGACCCCACGTCCAGGGTTTTTCCA TGGAGGTTTCTGGAAAAACTGACGAGGAATGTGATTA TAAATCCCTTTATGTGATGTCTAAGACTTTTAAGGTAC GCCCGATGTTTGCCTATTACCATCATAGAGACGTTTCT TTTCGAGGAATGCTTAAACGACTTTGTTTGACAAAAAT GTTGCCTAAGGGCTCTATAGTAAACCATTTGGAAGAA AGATTTGACGACTTTTTTTTTTTGGATTTCGATCCTATA ATCCTTCCTCCTGAAAAGAAACATATAAATAGATATG TATTATTCTTCAAAACATTCTCTTGTTCTTGTGCTTTTT TTTTACCATATATCTTACTTTTTTTTTTCTCTCAGAGAA ACAAGCAAAACAAAAAGCTTTTCTTTTCACTAACGTAT ATGATGCTTTTGCAAGCTTTCCTTTTCCTTTTGGCTGGT TTTGCAGCCAAAATATCTGCATCAATGACAAACGAAA CTAGCGATAGACCTTTGGTCCACTTCACACCCAACAA GGGCTGGATGAATGACCCAAATGGGTTGTGGTACGAT GAAAAAGATGCCAAATGGCATCTGTACTTTCAATACA ACCCAAATGACACCGTATGGGGTACGCCATTGTTTTG GGGCCATGCTACTTCCGATGATTTGACTAATTGGGAA GATCAACCCATTGCTATCGCTCCCAAGCGTAACGATTC AGGTGCTTTCTCTGGCTCCATGGTGGTTGATTACAACA ACACGAGTGGGTTTTTCAATGATACTATTGATCCAAGA CAAAGATGCGTTGCGATTTGGACTTATAACACTCCTGA AAGTGAAGAGCAATACATTAGCTATTCTCTTGATGGT GGTTACACTTTTACTGAATACCAAAAGAACCCTGTTTT AGCTGCCAACTCCACTCAATTCAGAGATCCAAAGGTG TTCTGGTATGAACCTTCTCAAAAATGGATTATGACGGC TGCCAAATCACAAGACTACAAAATTGAAATTTACTCC TCTGATGACTTGAAGTCCTGGAAGCTAGAATCTGCATT TGCCAATGAAGGTTTCTTAGGCTACCAATACGAATGTC CAGGTTTGATTGAAGTCCCAACTGAGCAAGATCCTTCC AAATCTTATTGGGTCATGTTTATTTCTATCAACCCAGG TGCACCTGCTGGCGGTTCCTTCAACCAATATTTTGTTG GATCCTTCAATGGTACTCATTTTGAAGCGTTTGACAAT CAATCTAGAGTGGTAGATTTTGGTAAGGACTACTATG CCTTGCAAACTTTCTTCAACACTGACCCAACCTACGGT TCAGCATTAGGTATTGCCTGGGCTTCAAACTGGGAGT ACAGTGCCTTTGTCCCAACTAACCCATGGAGATCATCC 
ATGTCTTTGGTCCGCAAGTTTTCTTTGAACACTGAATA TCAAGCTAATCCAGAGACTGAATTGATCAATTTGAAA GCCGAACCAATATTGAACATTAGTAATGCTGGTCCCT GGTCTCGTTTTGCTACTAACACAACTCTAACTAAGGCC AATTCTTACAATGTCGATTTGAGCAACTCGACTGGTAC CCTAGAGTTTGAGTTGGTTTACGCTGTTAACACCACAC AAACCATATCCAAATCCGTCTTTGCCGACTTATCACTT TGGTTCAAGGGTTTAGAAGATCCTGAAGAATATTTGA GAATGGGTTTTGAAGTCAGTGCTTCTTCCTTCTTTTTG GACCGTGGTAACTCTAAGGTCAAGTTTGTCAAGGAGA ACCCATATTTCACAAACAGAATGTCTGTCAACAACCA ACCATTCAAGTCTGAGAACGACCTAAGTTACTATAAA GTGTACGGCCTACTGGATCAAAACATCTTGGAATTGT ACTTCAACGATGGAGATGTGGTTTCTACAAATACCTAC TTCATGACCACCGGTAACGCTCTAGGATCTGTGAACAT GACCACTGGTGTCGATAATTTGTTCTACATTGACAAGT TCCAAGTAAGGGAAGTAAAATAGAGGTTATAAAACTT ATTGTCTTTTTTATTTTTTTCAAAAGCCATTCTAAAGGG CTTTAGCTAACGAGTGACGAATGTAAAACTTTATGATT TCAAAGAATACCTCCAAACCATTGAAAATGTATTTTTA TTTTTATTTTCTCCCGACCCCAGTTACCTGGAATTTGTT CTTTATGTACTTTATATAAGTATAATTCTCTTAAAAAT TTTTACTACTTTGCAATAGACATCATTTTTTCACGTAAT AAACCCACAATCGTAATGTAGTTGCCTTACACTACTAG GATGGACCTTTTTGCCTTTATCTGTTTTGTTACTGACAC AATGAAACCGGGTAAAGTATTAGTTATGTGAAAATTT AAAAGCATTAAGTAGAAGTATACCATATTGTAAAAAA AAAAAGCGTTGTCTTCTACGTAAAAGTGTTCTCAAAA AGAAGTAGTGAGGGAAATGGATACCAAGCTATCTGTA ACAGGAGCTAAAAAATCTCAGGGAAAAGCTTCTGGTT TGGGAAACGGTCGAC 39 Sequence of the ATCGGCCTTTGTTGATGCAAGTTTTACGTGGATCATGG 5'-Region used ACTAAGGAGTTTTATTTGGACCAAGTTCATCGTCCTAG for knock out of ACATTACGGAAAGGGTTCTGCTCCTCTTTTTGGAAACT PpURA5: TTTTGGAACCTCTGAGTATGACAGCTTGGTGGATTGTA CCCATGGTATGGCTTCCTGTGAATTTCTATTTTTTCTAC ATTGGATTCACCAATCAAAACAAATTAGTCGCCATGG CTTTTTGGCTTTTGGGTCTATTTGTTTGGACCTTCTTGG AATATGCTTTGCATAGATTTTTGTTCCACTTGGACTAC TATCTTCCAGAGAATCAAATTGCATTTACCATTCATTT CTTATTGCATGGGATACACCACTATTTACCAATGGATA AATACAGATTGGTGATGCCACCTACACTTTTCATTGTA CTTTGCTACCCAATCAAGACGCTCGTCTTTTCTGTTCT ACCATATTACATGGCTTGTTCTGGATTTGCAGGTGGAT TCCTGGGCTATATCATGTATGATGTCACTCATTACGTT CTGCATCACTCCAAGCTGCCTCGTTATTTCCAAGAGTT GAAGAAATATCATTTGGAACATCACTACAAGAATTAC GAGTTAGGCTTTGGTGTCACTTCCAAATTCTGGGACAA AGTCTTTGGGACTTATCTGGGTCCAGACGATGTGTATC AAAAGACAAATTAGAGTATTTATAAAGTTATGTAAGC 
AAATAGGGGCTAATAGGGAAAGAAAAATTTTGGTTCT TTATCAGAGCTGGCTCGCGCGCAGTGTTTTTCGTGCTC CTTTGTAATAGTCATTTTTGACTACTGTTCAGATTGAA ATCACATTGAAGATGTCACTCGAGGGGTACCAAAAAA GGTTTTTGGATGCTGCAGTGGCTTCGC 40 Sequence of the GGTCTTTTCAACAAAGCTCCATTAGTGAGTCAGCTGGC 3'-Region used TGAATCTTATGCACAGGCCATCATTAACAGCAACCTG for knock out of GAGATAGACGTTGTATTTGGACCAGCTTATAAAGGTA PpURA5: TTCCTTTGGCTGCTATTACCGTGTTGAAGTTGTACGAG CTCGGCGGCAAAAAATACGAAAATGTCGGATATGCGT TCAATAGAAAAGAAAAGAAAGACCACGGAGAAGGTG GAAGCATCGTTGGAGAAAGTCTAAAGAATAAAAGAGT ACTGATTATCGATGATGTGATGACTGCAGGTACTGCTA TCAACGAAGCATTTGCTATAATTGGAGCTGAAGGTGG GAGAGTTGAAGGTAGTATTATTGCCCTAGATAGAATG GAGACTACAGGAGATGACTCAAATACCAGTGCTACCC AGGCTGTTAGTCAGAGATATGGTACCCCTGTCTTGAGT ATAGTGACATTGGACCATATTGTGGCCCATTTGGGCG AAACTTTCACAGCAGACGAGAAATCTCAAATGGAAAC GTATAGAAAAAAGTATTTGCCCAAATAAGTATGAATC TGCTTCGAATGAATGAATTAATCCAATTATCTTCTCAC CATTATTTTCTTCTGTTTCGGAGCTTTGGGCACGGCGG CGGGTGGTGCGGGCTCAGGTTCCCTTTCATAAACAGA TTTAGTACTTGGATGCTTAATAGTGAATGGCGAATGCA AAGGAACAATTTCGTTCATCTTTAACCCTTTCACTCGG GGTACACGTTCTGGAATGTACCCGCCCTGTTGCAACTC AGGTGGACCGGGCAATTCTTGAACTTTCTGTAACGTTG TTGGATGTTCAACCAGAAATTGTCCTACCAACTGTATT AGTTTCCTTTTGGTCTTATATTGTTCATCGAGATACTTC CCACTCTCCTTGATAGCCACTCTCACTCTTCCTGGATT ACCAAAATCTTGAGGATGAGTCTTTTCAGGCTCCAGG ATGCAAGGTATATCCAAGTACCTGCAAGCATCTAATA TTGTCTTTGCCAGGGGGTTCTCCACACCATACTCCTTT TGGCGCATGC 41 Sequence of the TCTAGAGGGACTTATCTGGGTCCAGACGATGTGTATC PpURA5 AAAAGACAAATTAGAGTATTTATAAAGTTATGTAAGC auxotrophic AAATAGGGGCTAATAGGGAAAGAAAAATTTTGGTTCT marker: TTATCAGAGCTGGCTCGCGCGCAGTGTTTTTCGTGCTC CTTTGTAATAGTCATTTTTGACTACTGTTCAGATTGAA ATCACATTGAAGATGTCACTGGAGGGGTACCAAAAAA GGTTTTTGGATGCTGCAGTGGCTTCGCAGGCCTTGAAG TTTGGAACTTTCACCTTGAAAAGTGGAAGACAGTCTCC ATACTTCTTTAACATGGGTCTTTTCAACAAAGCTCCAT TAGTGAGTCAGCTGGCTGAATCTTATGCTCAGGCCATC ATTAACAGCAACCTGGAGATAGACGTTGTATTTGGAC CAGCTTATAAAGGTATTCCTTTGGCTGCTATTACCGTG TTGAAGTTGTACGAGCTGGGCGGCAAAAAATACGAAA ATGTCGGATATGCGTTCAATAGAAAAGAAAAGAAAGA CCACGGAGAAGGTGGAAGCATCGTTGGAGAAAGTCTA AAGAATAAAAGAGTACTGATTATCGATGATGTGATGA 
CTGCAGGTACTGCTATCAACGAAGCATTTGCTATAATT GGAGCTGAAGGTGGGAGAGTTGAAGGTTGTATTATTG CCCTAGATAGAATGGAGACTACAGGAGATGACTCAAA TACCAGTGCTACCCAGGCTGTTAGTCAGAGATATGGT ACCCCTGTCTTGAGTATAGTGACATTGGACCATATTGT GGCCCATTTGGGCGAAACTTTCACAGCAGACGAGAAA TCTCAAATGGAAACGTATAGAAAAAAGTATTTGCCCA AATAAGTATGAATCTGCTTCGAATGAATGAATTAATC CAATTATCTTCTCACCATTATTTTCTTCTGTTTCGGAGC TTTGGGCACGGCGGCGGATCC 42 Sequence of the CCTGCACTGGATGGTGGCGCTGGATGGTAAGCCGCTG part of the Ec GCAAGCGGTGAAGTGCCTCTGGATGTCGCTCCACAAG lacZ gene that GTAAACAGTTGATTGAACTGCCTGAACTACCGCAGCC was used to GGAGAGCGCCGGGCAACTCTGGCTCACAGTACGCGTA construct the GTGCAACCGAACGCGACCGCATGGTCAGAAGCCGGGC PpURA5 blaster ACATCAGCGCCTGGCAGCAGTGGCGTCTGGCGGAAAA (recyclable CCTCAGTGTGACGCTCCCCGCCGCGTCCCACGCCATCC auxotrophic CGCATCTGACCACCAGCGAAATGGATTTTTGCATCGA marker) GCTGGGTAATAAGCGTTGGCAATTTAACCGCCAGTCA GGCTTTCTTTCACAGATGTGGATTGGCGATAAAAAAC AACTGCTGACGCCGCTGCGCGATCAGTTCACCCGTGC ACCGCTGGATAACGACATTGGCGTAAGTGAAGCGACC CGCATTGACCCTAACGCCTGGGTCGAACGCTGGAAGG CGGCGGGCCATTACCAGGCCGAAGCAGCGTTGTTGCA GTGCACGGCAGATACACTTGCTGATGCGGTGCTGATT ACGACCGCTCACGCGTGGCAGCATCAGGGGAAAACCT TATTTATCAGCCGGAAAACCTACCGGATTGATGGTAG TGGTCAAATGGCGATTACCGTTGATGTTGAAGTGGCG AGCGATACACCGCATCCGGCGCGGATTGGCCTGAACT GCCAG 43 Sequence of the AAAACCTTTTTTCCTATTCAAACACAAGGCATTGCTTC 5'-Region used AACACGTGTGCGTATCCTTAACACAGATACTCCATACT for knock out of TCTAATAATGTGATAGACGAATACAAAGATGTTCACT PpOCH1: CTGTGTTGTGTCTACAAGCATTTCTTATTCTGATTGGG GATATTCTAGTTACAGCACTAAACAACTGGCGATACA AACTTAAATTAAATAATCCGAATCTAGAAAATGAACT TTTGGATGGTCCGCCTGTTGGTTGGATAAATCAATACC GATTAAATGGATTCTATTCCAATGAGAGAGTAATCCA AGACACTCTGATGTCAATAATCATTTGCTTGCAACAAC AAACCCGTCATCTAATCAAAGGGTTTGATGAGGCTTA CCTTCAATTGCAGATAAACTCATTGCTGTCCACTGCTG TATTATGTGAGAATATGGGTGATGAATCTGGTCTTCTC CACTCAGCTAACATGGCTGTTTGGGCAAAGGTGGTAC AATTATACGGAGATCAGGCAATAGTGAAATTGTTGAA TATGGCTACTGGACGATGCTTCAAGGATGTACGTCTA
GTAGGAGCCGTGGGAAGATTGCTGGCAGAACCAGTTG GCACGTCGCAACAATCCCCAAGAAATGAAATAAGTGA AAACGTAACGTCAAAGACAGCAATGGAGTCAATATTG ATAACACCACTGGCAGAGCGGTTCGTACGTCGTTTTG GAGCCGATATGAGGCTCAGCGTGCTAACAGCACGATT GACAAGAAGACTCTCGAGTGACAGTAGGTTGAGTAAA GTATTCGCTTAGATTCCCAACCTTCGTTTTATTCTTTCG TAGACAAAGAAGCTGCATGCGAACATAGGGACAACTT TTATAAATCCAATTGTCAAACCAACGTAAAACCCTCTG GCACCATTTTCAACATATATTTGTGAAGCAGTACGCAA TATCGATAAATACTCACCGTTGTTTGTAACAGCCCCAA CTTGCATACGCCTTCTAATGACCTCAAATGGATAAGCC GCAGCTTGTGCTAACATACCAGCAGCACCGCCCGCGG TCAGCTGCGCCCACACATATAAAGGCAATCTACGATC ATGGGAGGAATTAGTTTTGACCGTCAGGTCTTCAAGA GTTTTGAACTCTTCTTCTTGAACTGTGTAACCTTTTAAA TGACGGGATCTAAATACGTCATGGATGAGATCATGTG TGTAAAAACTGACTCCAGCATATGGAATCATTCCAAA GATTGTAGGAGCGAACCCACGATAAAAGTTTCCCAAC CTTGCCAAAGTGTCTAATGCTGTGACTTGAAATCTGGG TTCCTCGTTGAAGACCCTGCGTACTATGCCCAAAAACT TTCCTCCACGAGCCCTATTAACTTCTCTATGAGTTTCA AATGCCAAACGGACACGGATTAGGTCCAATGGGTAAG TGAAAAACACAGAGCAAACCCCAGCTAATGAGCCGGC CAGTAACCGTCTTGGAGCTGTTTCATAAGAGTCATTAG GGATCAATAACGTTCTAATCTGTTCATAACATACAAAT TTTATGGCTGCATAGGGAAAAATTCTCAACAGGGTAG CCGAATGACCCTGATATAGACCTGCGACACCATCATA CCCATAGATCTGCCTGACAGCCTTAAAGAGCCCGCTA AAAGACCCGGAAAACCGAGAGAACTCTGGATTAGCA GTCTGAAAAAGAATCTTCACTCTGTCTAGTGGAGCAA TTAATGTCTTAGCGGCACTTCCTGCTACTCCGCCAGCT ACTCCTGAATAGATCACATACTGCAAAGACTGCTTGTC GATGACCTTGGGGTTATTTAGCTTCAAGGGCAATTTTT GGGACATTTTGGACACAGGAGACTCAGAAACAGACAC AGAGCGTTCTGAGTCCTGGTGCTCCTGACGTAGGCCTA GAACAGGAATTATTGGCTTTATTTGTTTGTCCATTTCA TAGGCTTGGGGTAATAGATAGATGACAGAGAAATAGA GAAGACCTAATATTTTTTGTTCATGGCAAATCGCGGGT TCGCGGTCGGGTCACACACGGAGAAGTAATGAGAAGA GCTGGTAATCTGGGGTAAAAGGGTTCAAAAGAAGGTC GCCTGGTAGGGATGCAATACAAGGTTGTCTTGGAGTT TACATTGACCAGATGATTTGGCTTTTTCTCTGTTCAATT CACATTTTTCAGCGAGAATCGGATTGACGGAGAAATG GCGGGGTGTGGGGTGGATAGATGGCAGAAATGCTCGC AATCACCGCGAAAGAAAGACTTTATGGAATAGAACTA CTGGGTGGTGTAAGGATTACATAGCTAGTCCAATGGA GTCCGTTGGAAAGGTAAGAAGAAGCTAAAACCGGCTA AGTAACTAGGGAAGAATGATCAGACTTTGATTTGATG AGGTCTGAAAATACTCTGCTGCTTTTTCAGTTGCTTTTT CCCTGCAACCTATCATTTTCCTTTTCATAAGCCTGCCTT 
TTCTGTTTTCACTTATATGAGTTCCGCCGAGACTTCCC CAAATTCTCTCCTGGAACATTCTCTATCGCTCTCCTTCC AAGTTGCGCCCCCTGGCACTGCCTAGTAATATTACCAC GCGACTTATATTCAGTTCCACAATTTCCAGTGTTCGTA GCAAATATCATCAGCCATGGCGAAGGCAGATGGCAGT TTGCTCTACTATAATCCTCACAATCCACCCAGAAGGTA TTACTTCTACATGGCTATATTCGCCGTTTCTGTCATTTG CGTTTTGTACGGACCCTCACAACAATTATCATCTCCAA AAATAGACTATGATCCATTGACGCTCCGATCACTTGAT TTGAAGACTTTGGAAGCTCCTTCACAGTTGAGTCCAGG CACCGTAGAAGATAATCTTCG 44 Sequence of the AAAGCTAGAGTAAAATAGATATAGCGAGATTAGAGA 3'-Region used ATGAATACCTTCTTCTAAGCGATCGTCCGTCATCATAG for knock out of AATATCATGGACTGTATAGTTTTTTTTTTGTACATATA PpOCH1: ATGATTAAACGGTCATCCAACATCTCGTTGACAGATCT CTCAGTACGCGAAATCCCTGACTATCAAAGCAAGAAC CGATGAAGAAAAAAACAACAGTAACCCAAACACCAC AACAAACACTTTATCTTCTCCCCCCCAACACCAATCAT CAAAGAGATGTCGGAACCAAACACCAAGAAGCAAAA ACTAACCCCATATAAAAACATCCTGGTAGATAATGCT GGTAACCCGCTCTCCTTCCATATTCTGGGCTACTTCAC GAAGTCTGACCGGTCTCAGTTGATCAACATGATCCTCG AAATGGGTGGCAAGATCGTTCCAGACCTGCCTCCTCT GGTAGATGGAGTGTTGTTTTTGACAGGGGATTACAAG TCTATTGATGAAGATACCCTAAAGCAACTGGGGGACG TTCCAATATACAGAGACTCCTTCATCTACCAGTGTTTT GTGCACAAGACATCTCTTCCCATTGACACTTTCCGAAT TGACAAGAACGTCGACTTGGCTCAAGATTTGATCAAT AGGGCCCTTCAAGAGTCTGTGGATCATGTCACTTCTGC CAGCACAGCTGCAGCTGCTGCTGTTGTTGTCGCTACCA ACGGCCTGTCTTCTAAACCAGACGCTCGTACTAGCAA AATACAGTTCACTCCCGAAGAAGATCGTTTTATTCTTG ACTTTGTTAGGAGAAATCCTAAACGAAGAAACACACA TCAACTGTACACTGAGCTCGCTCAGCACATGAAAAAC CATACGAATCATTCTATCCGCCACAGATTTCGTCGTAA TCTTTCCGCTCAACTTGATTGGGTTTATGATATCGATC CATTGACCAACCAACCTCGAAAAGATGAAAACGGGAA CTACATCAAGGTACAAGGCCTTCCA 45 K. 
lactis UDP- AAACGTAACGCCTGGCACTCTATTTTCTCAAACTTCTG GlcNAc GGACGGAAGAGCTAAATATTGTGTTGCTTGAACAAAC transporter gene CCAAAAAAACAAAAAAATGAACAAACTAAAACTACA (KIMNN2-2) CCTAAATAAACCGTGTGTAAAACGTAGTACCATATTA ORF underlined CTAGAAAAGATCACAAGTGTATCACACATGTGCATCT CATATTACATCTTTTATCCAATCCATTCTCTCTATCCCG TCTGTTCCTGTCAGATTCTTTTTCCATAAAAAGAAGAA GACCCCGAATCTCACCGGTACAATGCAAAACTGCTGA AAAAAAAAGAAAGTTCACTGGATACGGGAACAGTGC CAGTAGGCTTCACCACATGGACAAAACAATTGACGAT AAAATAAGCAGGTGAGCTTCTTTTTCAAGTCACGATCC CTTTATGTCTCAGAAACAATATATACAAGCTAAACCCT TTTGAACCAGTTCTCTCTTCATAGTTATGTTCACATAA ATTGCGGGAACAAGACTCCGCTGGCTGTCAGGTACAC GTTGTAACGTTTTCGTCCGCCCAATTATTAGCACAACA TTGGCAAAAAGAAAAACTGCTCGTTTTCTCTACAGGT AAATTACAATTTTTTTCAGTAATTTTCGCTGAAAAATT TAAAGGGCAGGAAAAAAAGACGATCTCGACTTTGCAT AGATGCAAGAACTGTGGTCAAAACTTGAAATAGTAAT TTTGCTGTGCGTGAACTAATAAATATATATATATATAT ATATATATATTTGTGTATTTTGTATATGTAATTGTGCA CGTCTTGGCTATTGGATATAAGATTTTCGCGGGTTGAT GACATAGAGCGTGTACTACTGTAATAGTTGTATATTCA AAAGCTGCTGCGTGGAGAAAGACTAAAATAGATAAA AAGCACACATTTTGACTTCGGTACCGTCAACTTAGTGG GACAGTCTTTTATATTTGGTGTAAGCTCATTTCTGGTA CTATTCGAAACAGAACAGTGTTTTCTGTATTACCGTCC AATCGTTTGTCATGAGTTTTGTATTGATTTTGTCGTTA GTGTTCGGAGGATGTTGTTCCAATGTGATTAGTTTCGA GCACATGGTGCAAGGCAGCAATATAAATTTGGGAAAT ATTGTTACATTCACTCAATTCGTGTCTGTGACGCTAAT TCAGTTGCCCAATGCTTTGGACTTCTCTCACTTTCCGTT TAGGTTGCGACCTAGACACATTCCTCTTAAGATCCATA TGTTAGCTGTGTTTTTGTTCTTTACCAGTTCAGTCGCCA ATAACAGTGTGTTTAAATTTGACATTTCCGTTCCGATT CATATTATCATTAGATTTTCAGGTACCACTTTGACGAT GATAATAGGTTGGGCTGTTTGTAATAAGAGGTACTCC AAACTTCAGGTGCAATCTGCCATCATTATGACGCTTGG TGCGATTGTCGCATCATTATACCGTGACAAAGAATTTT CAATGGACAGTTTAAAGTTGAATACGGATTCAGTGGG TATGACCCAAAAATCTATGTTTGGTATCTTTGTTGTGC TAGTGGCCACTGCCTTGATGTCATTGTTGTCGTTGCTC AACGAATGGACGTATAACAAGTACGGGAAACATTGGA AAGAAACTTTGTTCTATTCGCATTTCTTGGCTCTACCG TTGTTTATGTTGGGGTACACAAGGCTCAGAGACGAAT TCAGAGACCTCTTAATTTCCTCAGACTCAATGGATATT CCTATTGTTAAATTACCAATTGCTACGAAACTTTTCAT GCTAATAGCAAATAACGTGACCCAGTTCATTTGTATCA AAGGTGTTAACATGCTAGCTAGTAACACGGATGCTTT GACACTTTCTGTCGTGCTTCTAGTGCGTAAATTTGTTA 
GTCTTTTACTCAGTGTCTACATCTACAAGAACGTCCTA TCCGTGACTGCATACCTAGGGACCATCACCGTGTTCCT GGGAGCTGGTTTGTATTCATATGGTTCGGTCAAAACTG CACTGCCTCGCTGAAACAATCCACGTCTGTATGATACT CGTTTCAGAATTTTTTTGATTTTCTGCCGGATATGGTTT CTCATCTTTACAATCGCATTCTTAATTATACCAGAACG TAATTCAATGATCCCAGTGACTCGTAACTCTTATATGT CAATTTAAGC 46 Sequence of the GGCCGAGCGGGCCTAGATTTTCACTACAAATTTCAAA 5'-Region used ACTACGCGGATTTATTGTCTCAGAGAGCAATTTGGCAT for knock out of TTCTGAGCGTAGCAGGAGGCTTCATAAGATTGTATAG PpBMT2: GACCGTACCAACAAATTGCCGAGGCACAACACGGTAT GCTGTGCACTTATGTGGCTACTTCCCTACAACGGAATG AAACCTTCCTCTTTCCGCTTAAACGAGAAAGTGTGTCG CAATTGAATGCAGGTGCCTGTGCGCCTTGGTGTATTGT TTTTGAGGGCCCAATTTATCAGGCGCCTTTTTTCTTGG TTGTTTTCCCTTAGCCTCAAGCAAGGTTGGTCTATTTC ATCTCCGCTTCTATACCGTGCCTGATACTGTTGGATGA GAACACGACTCAACTTCCTGCTGCTCTGTATTGCCAGT GTTTTGTCTGTGATTTGGATCGGAGTCCTCCTTACTTG GAATGATAATAATCTTGGCGGAATCTCCCTAAACGGA GGCAAGGATTCTGCCTATGATGATCTGCTATCATTGGG AAGCTTCAACGACATGGAGGTCGACTCCTATGTCACC AACATCTACGACAATGCTCCAGTGCTAGGATGTACGG ATTTGTCTTATCATGGATTGTTGAAAGTCACCCCAAAG CATGACTTAGCTTGCGATTTGGAGTTCATAAGAGCTCA GATTTTGGACATTGACGTTTACTCCGCCATAAAAGACT TAGAAGATAAAGCCTTGACTGTAAAACAAAAGGTTGA AAAACACTGGTTTACGTTTTATGGTAGTTCAGTCTTTC TGCCCGAACACGATGTGCATTACCTGGTTAGACGAGT CATCTTTTCGGCTGAAGGAAAGGCGAACTCTCCAGTA ACATC 47 Sequence of the CCATATGATGGGTGTTTGCTCACTCGTATGGATCAAAA 3'-Region used TTCCATGGTTTCTTCTGTACAACTTGTACACTTATTTGG for knock out of ACTTTTCTAACGGTTTTTCTGGTGATTTGAGAAGTCCT PpBMT2: TATTTTGGTGTTCGCAGCTTATCCGTGATTGAACCATC AGAAATACTGCAGCTCGTTATCTAGTTTCAGAATGTGT TGTAGAATACAATCAATTCTGAGTCTAGTTTGGGTGGG TCTTGGCGACGGGACCGTTATATGCATCTATGCAGTGT TAAGGTACATAGAATGAAAATGTAGGGGTTAATCGAA AGCATCGTTAATTTCAGTAGAACGTAGTTCTATTCCCT ACCCAAATAATTTGCCAAGAATGCTTCGTATCCACATA CGCAGTGGACGTAGCAAATTTCACTTTGGACTGTGAC CTCAAGTCGTTATCTTCTACTTGGACATTGATGGTCAT TACGTAATCCACAAAGAATTGGATAGCCTCTCGTTTTA TCTAGTGCACAGCCTAATAGCACTTAAGTAAGAGCAA TGGACAAATTTGCATAGACATTGAGCTAGATACGTAA CTCAGATCTTGTTCACTCATGGTGTACTCGAAGTACTG CTGGAACCGTTACCTCTTATCATTTCGCTACTGGCTCG TGAAACTACTGGATGAAAAAAAAAAAAGAGCTGAAA 
GCGAGATCATCCCATTTTGTCATCATACAAATTCACGC TTGCAGTTTTGCTTCGTTAACAAGACAAGATGTCTTTA TCAAAGACCCGTTTTTTCTTCTTGAAGAATACTTCCCT GTTGAGCACATGCAAACCATATTTATCTCAGATTTCAC TCAACTTGGGTGCTTCCAAGAGAAGTAAAATTCTTCCC ACTGCATCAACTTCCAAGAAACCCGTAGACCAGTTTCT CTTCAGCCAAAAGAAGTTGCTCGCCGATCACCGCGGT AACAGAGGAGTCAGAAGGTTTCACACCCTTCCATCCC GATTTCAAAGTCAAAGTGCTGCGTTGAACCAAGGTTTT CAGGTTGCCAAAGCCCAGTCTGCAAAAACTAGTTCCA AATGGCCTATTAATTCCCATAAAAGTGTTGGCTACGTA TGTATCGGTACCTCCATTCTGGTATTTGCTATTGTTGTC GTTGGTGGGTTGACTAGACTGACCGAATCCGGTCTTTC CATAACGGAGTGGAAACCTATCACTGGTTCGGTTCCC CCACTGACTGAGGAAGACTGGAAGTTGGAATTTGAAA AATACAAACAAAGCCCTGAGTTTCAGGAACTAAATTC TCACATAACATTGGAAGAGTTCAAGTTTATATTTTCCA TGGAATGGGGACATAGATTGTTGGGAAGGGTCATCGG CCTGTCGTTTGTTCTTCCCACGTTTTACTTCATTGCCCG TCGAAAGTGTTCCAAAGATGTTGCATTGAAACTGCTTG CAATATGCTCTATGATAGGATTCCAAGGTTTCATCGGC TGGTGGATGGTGTATTCCGGATTGGACAAACAGCAAT TGGCTGAACGTAACTCCAAACCAACTGTGTCTCCATAT CGCTTAACTACCCATCTTGGAACTGCATTTGTTATTTA CTGTTACATGATTTACACAGGGCTTCAAGTTTTGAAGA ACTATAAGATCATGAAACAGCCTGAAGCGTATGTTCA AATTTTCAAGCAAATTGCGTCTCCAAAATTGAAAACTT TCAAGAGACTCTCTTCAGTTCTATTAGGCCTGGTG 48 DNA encodes ATGTCTGCCAACCTAAAATATCTTTCCTTGGGAATTTT MmSLC35A3 GGTGTTTCAGACTACCAGTCTGGTTCTAACGATGCGGT UDP-GlcNAc ATTCTAGGACTTTAAAAGAGGAGGGGCCTCGTTATCT transporter GTCTTCTACAGCAGTGGTTGTGGCTGAATTTTTGAAGA TAATGGCCTGCATCTTTTTAGTCTACAAAGACAGTAAG TGTAGTGTGAGAGCACTGAATAGAGTACTGCATGATG AAATTCTTAATAAGCCCATGGAAACCCTGAAGCTCGC TATCCCGTCAGGGATATATACTCTTCAGAACAACTTAC TCTATGTGGCACTGTCAAACCTAGATGCAGCCACTTAC CAGGTTACATATCAGTTGAAAATACTTACAACAGCAT TATTTTCTGTGTCTATGCTTGGTAAAAAATTAGGTGTG TACCAGTGGCTCTCCCTAGTAATTCTGATGGCAGGAGT TGCTTTTGTACAGTGGCCTTCAGATTCTCAAGAGCTGA ACTCTAAGGACCTTTCAACAGGCTCACAGTTTGTAGGC CTCATGGCAGTTCTCACAGCCTGTTTTTCAAGTGGCTT TGCTGGAGTTTATTTTGAGAAAATCTTAAAAGAAACA AAACAGTCAGTATGGATAAGGAACATTCAACTTGGTT TCTTTGGAAGTATATTTGGATTAATGGGTGTATACGTT TATGATGGAGAATTGGTCTCAAAGAATGGATTTTTTCA GGGATATAATCAACTGACGTGGATAGTTGTTGCTCTGC AGGCACTTGGAGGCCTTGTAATAGCTGCTGTCATCAA ATATGCAGATAACATTTTAAAAGGATTTGCGACCTCCT 
TATCCATAATATTGTCAACAATAATATCTTATTTTTGG TTGCAAGATTTTGTGCCAACCAGTGTCTTTTTCCTTGG AGCCATCCTTGTAATAGCAGCTACTTTCTTGTATGGTT ACGATCCCAAACCTGCAGGAAATCCCACTAAAGCATAG
49 Sequence of the 5'-Region used for knock out of PpMNN4L1: GATCTGGCCATTGTGAAACTTGACACTAAAGACAAAA CTCTTAGAGTTTCCAATCACTTAGGAGACGATGTTTCC TACAACGAGTACGATCCCTCATTGATCATGAGCAATTT GTATGTGAAAAAAGTCATCGACCTTGACACCTTGGAT AAAAGGGCTGGAGGAGGTGGAACCACCTGTGCAGGC GGTCTGAAAGTGTTCAAGTACGGATCTACTACCAAAT ATACATCTGGTAACCTGAACGGCGTCAGGTTAGTATA CTGGAACGAAGGAAAGTTGCAAAGCTCCAAATTTGTG GTTCGATCCTCTAATTACTCTCAAAAGCTTGGAGGAAA CAGCAACGCCGAATCAATTGACAACAATGGTGTGGGT TTTGCCTCAGCTGGAGACTCAGGCGCATGGATTCTTTC CAAGCTACAAGATGTTAGGGAGTACCAGTCATTCACT GAAAAGCTAGGTGAAGCTACGATGAGCATTTTCGATT TCCACGGTCTTAAACAGGAGACTTCTACTACAGGGCTT GGGGTAGTTGGTATGATTCATTCTTACGACGGTGAGTT CAAACAGTTTGGTTTGTTCACTCCAATGACATCTATTC TACAAAGACTTCAACGAGTGACCAATGTAGAATGGTG TGTAGCGGGTTGCGAAGATGGGGATGTGGACACTGAA GGAGAACACGAATTGAGTGATTTGGAACAACTGCATA TGCATAGTGATTCCGACTAGTCAGGCAAGAGAGAGCC CTCAAATTTACCTCTCTGCCCCTCCTCACTCCTTTTGGT ACGCATAATTGCAGTATAAAGAACTTGCTGCCAGCCA GTAATCTTATTTCATACGCAGTTCTATATAGCACATAA TCTTGCTTGTATGTATGAAATTTACCGCGTTTTAGTTG AAATTGTTTATGTTGTGTGCCTTGCATGAAATCTCTCG TTAGCCCTATCCTTACATTTAACTGGTCTCAAAACCTC TACCAATTCCATTGCTGTACAACAATATGAGGCGGCA TTACTGTAGGGTTGGAAAAAAATTGTCATTCCAGCTA GAGATCACACGACTTCATCACGCTTATTGCTCCTCATT GCTAAATCATTTACTCTTGACTTCGACCCAGAAAAGTT CGCC
50 Sequence of the 3'-Region used for knock out of PpMNN4L1: GCATGTCAAACTTGAACACAACGACTAGATAGTTGTT TTTTCTATATAAAACGAAACGTTATCATCTTTAATAAT CATTGAGGTTTACCCTTATAGTTCCGTATTTTCGTTTCC AAACTTAGTAATCTTTTGGAAATATCATCAAAGCTGGT GCCAATCTTCTTGTTTGAAGTTTCAAACTGCTCCACCA AGCTACTTAGAGACTGTTCTAGGTCTGAAGCAACTTCG AACACAGAGACAGCTGCCGCCGATTGTTCTTTTTTGTG TTTTTCTTCTGGAAGAGGGGCATCATCTTGTATGTCCA ATGCCCGTATCCTTTCTGAGTTGTCCGACACATTGTCC TTCGAAGAGTTTCCTGACATTGGGCTTCTTCTATCCGT GTATTAATTTTGGGTTAAGTTCCTCGTTTGCATAGCAG TGGATACCTCGATTTTTTTGGCTCCTATTTACCTGACAT AATATTCTACTATAATCCAACTTGGACGCGTCATCTAT GATAACTAGGCTCTCCTTTGTTCAAAGGGGACGTCTTC ATAATCCACTGGCACGAAGTAAGTCTGCAACGAGGCG GCTTTTGCAACAGAACGATAGTGTCGTTTCGTACTTGG ACTATGCTAAACAAAAGGATCTGTCAAACATTTCAAC CGTGTTTCAAGGCACTCTTTACGAATTATCGACCAAGA CCTTCCTAGACGAACATTTCAACATATCCAGGCTACTG CTTCAAGGTGGTGCAAATGATAAAGGTATAGATATTA GATGTGTTTGGGACCTAAAACAGTTCTTGCCTGAAGAT 
TCCCTTGAGCAACAGGCTTCAATAGCCAAGTTAGAGA AGCAGTACCAAATCGGTAACAAAAGGGGGAAGCATA TAAAACCTTTACTATTGCGACAAAATCCATCCTTGAAA GTAAAGCTGTTTGTTCAATGTAAAGCATACGAAACGA AGGAGGTAGATCCTAAGATGGTTAGAGAACTTAACGG GACATACTCCAGCTGCATCCCATATTACGATCGCTGGA AGACTTTTTTCATGTACGTATCGCCCACCAACCTTTCA AAGCAAGCTAGGTATGATTTTGACAGTTCTCACAATCC ATTGGTTTTCATGCAACTTGAAAAAACCCAACTCAAA CTTCATGGGGATCCATACAATGTAAATCATTACGAGA GGGCGAGGTTGAAAAGTTTCCATTGCAATCACGTCGC ATCATGGCTACTGAAAGGCCTTAAC 51 Sequence of the TCATTCTATATGTTCAAGAAAAGGGTAGTGAAAGGAA 5'-Region used AGAAAAGGCATATAGGCGAGGGAGAGTTAGCTAGCA for knock out of TACAAGATAATGAAGGATCAATAGCGGTAGTTAAAGT PpPNO1 and GCACAAGAAAAGAGCACCTGTTGAGGCTGATGATAAA PpMNN4: GCTCCAATTACATTGCCACAGAGAAACACAGTAACAG AAATAGGAGGGGATGCACCACGAGAAGAGCATTCAG TGAACAACTTTGCCAAATTCATAACCCCAAGCGCTAA TAAGCCAATGTCAAAGTCGGCTACTAACATTAATAGT ACAACAACTATCGATTTTCAACCAGATGTTTGCAAGG ACTACAAACAGACAGGTTACTGCGGATATGGTGACAC TTGTAAGTTTTTGCACCTGAGGGATGATTTCAAACAGG GATGGAAATTAGATAGGGAGTGGGAAAATGTCCAAA AGAAGAAGCATAATACTCTCAAAGGGGTTAAGGAGAT CCAAATGTTTAATGAAGATGAGCTCAAAGATATCCCG TTTAAATGCATTATATGCAAAGGAGATTACAAATCAC CCGTGAAAACTTCTTGCAATCATTATTTTTGCGAACAA TGTTTCCTGCAACGGTCAAGAAGAAAACCAAATTGTA TTATATGTGGCAGAGACACTTTAGGAGTTGCTTTACCA GCAAAGAAGTTGTCCCAATTTCTGGCTAAGATACATA ATAATGAAAGTAATAAAGTTTAGTAATTGCATTGCGTT GACTATTGATTGCATTGATGTCGTGTGATACTTTCACC GAAAAAAAACACGAAGCGCAATAGGAGCGGTTGCAT ATTAGTCCCCAAAGCTATTTAATTGTGCCTGAAACTGT TTTTTAAGCTCATCAAGCATAATTGTATGCATTGCGAC GTAACCAACGTTTAGGCGCAGTTTAATCATAGCCCACT GCTAAGCC 52 Sequence of the CGGAGGAATGCAAATAATAATCTCCTTAATTACCCAC 3'-Region used TGATAAGCTCAAGAGACGCGGTTTGAAAACGATATAA for knock out of TGAATCATTTGGATTTTATAATAAACCCTGACAGTTTT PpPNO1 and TCCACTGTATTGTTTTAACACTCATTGGAAGCTGTATT PpMNN4: GATTCTAAGAAGCTAGAAATCAATACGGCCATACAAA AGATGACATTGAATAAGCACCGGCTTTTTTGATTAGCA TATACCTTAAAGCATGCATTCATGGCTACATAGTTGTT AAAGGGCTTCTTCCATTATCAGTATAATGAATTACATA ATCATGCACTTATATTTGCCCATCTCTGTTCTCTCACTC TTGCCTGGGTATATTCTATGAAATTGCGTATAGCGTGT CTCCAGTTGAACCCCAAGCTTGGCGAGTTTGAAGAGA 
ATGCTAACCTTGCGTATTCCTTGCTTCAGGAAACATTC AAGGAGAAACAGGTCAAGAAGCCAAACATTTTGATCC TTCCCGAGTTAGCATTGACTGGCTACAATTTTCAAAGC CAGCAGCGGATAGAGCCTTTTTTGGAGGAAACAACCA AGGGAGCTAGTACCCAATGGGCTCAAAAAGTATCCAA GACGTGGGATTGCTTTACTTTAATAGGATACCCAGAA AAAAGTTTAGAGAGCCCTCCCCGTATTTACAACAGTG CGGTACTTGTATCGCCTCAGGGAAAAGTAATGAACAA CTACAGAAAGTCCTTCTTGTATGAAGCTGATGAACATT GGGGATGTTCGGAATCTTCTGATGGGTTTCAAACAGT AGATTTATTAATTGAAGGAAAGACTGTAAAGACATCA TTTGGAATTTGCATGGATTTGAATCCTTATAAATTTGA AGCTCCATTCACAGACTTCGAGTTCAGTGGCCATTGCT TGAAAACCGGTACAAGACTCATTTTGTGCCCAATGGC CTGGTTGTCCCCTCTATCGCCTTCCATTAAAAAGGATC TTAGTGATATAGAGAAAAGCAGACTTCAAAAGTTCTA CCTTGAAAAAATAGATACCCCGGAATTTGACGTTAAT TACGAATTGAAAAAAGATGAAGTATTGCCCACCCGTA TGAATGAAACGTTGGAAACAATTGACTTTGAGCCTTC AAAACCGGACTACTCTAATATAAATTATTGGATACTA AGGTTTTTTCCCTTTCTGACTCATGTCTATAAACGAGA TGTGCTCAAAGAGAATGCAGTTGCAGTCTTATGCAAC CGAGTTGGCATTGAGAGTGATGTCTTGTACGGAGGAT CAACCACGATTCTAAACTTCAATGGTAAGTTAGCATC GACACAAGAGGAGCTGGAGTTGTACGGGCAGACTAAT AGTCTCAACCCCAGTGTGGAAGTATTGGGGGCCCTTG GCATGGGTCAACAGGGAATTCTAGTACGAGACATTGA ATTAACATAATATACAATATACAATAAACACAAATAA AGAATACAAGCCTGACAAAAATTCACAAATTATTGCC TAGACTTGTCGTTATCAGCAGCGACCTTTTTCCAATGC TCAATTTCACGATATGCCTTTTCTAGCTCTGCTTTAAG CTTCTCATTGGAATTGGCTAACTCGTTGACTGCTTGGT CAGTGATGAGTTTCTCCAAGGTCCATTTCTCGATGTTG TTGTTTTCGTTTTCCTTTAATCTCTTGATATAATCAACA GCCTTCTTTAATATCTGAGCCTTGTTCGAGTCCCCTGTT GGCAACAGAGCGGCCAGTTCCTTTATTCCGTGGTTTAT ATTTTCTCTTCTACGCCTTTCTACTTCTTTGTGATTCTC TTTACGCATCTTATGCCATTCTTCAGAACCAGTGGCTG GCTTAACCGAATAGCCAGAGCCTGAAGAAGCCGCACT AGAAGAAGCAGTGGCATTGTTGACTATGG 53 DNA encodes TCAGTCAGTGCTCTTGATGGTGACCCAGCAAGTTTGAC human GnTI CAGAGAAGTGATTAGATTGGCCCAAGACGCAGAGGTG catalytic domain GAGTTGGAGAGACAACGTGGACTGCTGCAGCAAATCG (NA) GAGATGCATTGTCTAGTCAAAGAGGTAGGGTGCCTAC Codon- CGCAGCTCCTCCAGCACAGCCTAGAGTGCATGTGACC optimized CCTGCACCAGCTGTGATTCCTATCTTGGTCATCGCCTG TGACAGATCTACTGTTAGAAGATGTCTGGACAAGCTG TTGCATTACAGACCATCTGCTGAGTTGTTCCCTATCAT CGTTAGTCAAGACTGTGGTCACGAGGAGACTGCCCAA GCCATCGCCTCCTACGGATCTGCTGTCACTCACATCAG 
ACAGCCTGACCTGTCATCTATTGCTGTGCCACCAGACC ACAGAAAGTTCCAAGGTTACTACAAGATCGCTAGACA CTACAGATGGGCATTGGGTCAAGTCTTCAGACAGTTT AGATTCCCTGCTGCTGTGGTGGTGGAGGATGACTTGG AGGTGGCTCCTGACTTCTTTGAGTACTTTAGAGCAACC TATCCATTGCTGAAGGCAGACCCATCCCTGTGGTGTGT CTCTGCCTGGAATGACAACGGTAAGGAGCAAATGGTG GACGCTTCTAGGCCTGAGCTGTTGTACAGAACCGACTT CTTTCCTGGTCTGGGATGGTTGCTGTTGGCTGAGTTGT GGGCTGAGTTGGAGCCTAAGTGGCCAAAGGCATTCTG GGACGACTGGATGAGAAGACCTGAGCAAAGACAGGG TAGAGCCTGTATCAGACCTGAGATCTCAAGAACCATG ACCTTTGGTAGAAAGGGAGTGTCTCACGGTCAATTCTT TGACCAACACTTGAAGTTTATCAAGCTGAACCAGCAA TTTGTGCACTTCACCCAACTGGACCTGTCTTACTTGCA GAGAGAGGCCTATGACAGAGATTTCCTAGCTAGAGTC TACGGAGCTCCTCAACTGCAAGTGGAGAAAGTGAGGA CCAATGACAGAAAGGAGTTGGGAGAGGTGAGAGTGC AGTACACTGGTAGGGACTCCTTTAAGGCTTTCGCTAAG GCTCTGGGTGTCATGGATGACCTTAAGTCTGGAGTTCC TAGAGCTGGTTACAGAGGTATTGTCACCTTTCAATTCA GAGGTAGAAGAGTCCACTTGGCTCCTCCACCTACTTG GGAGGGTTATGATCCTTCTTGGAATTAG
54 DNA encodes Pp SEC12 (10). The last 9 nucleotides are the linker containing the AscI restriction site used for fusion to proteins of interest. ATGCCCAGAAAAATATTTAACTACTTCATTTTGACTGT ATTCATGGCAATTCTTGCTATTGTTTTACAATGGTCTA TAGAGAATGGACATGGGCGCGCC
55 Sequence of the PpSEC4 promoter: GAAGTAAAGTTGGCGAAACTTTGGGAACCTTTGGTTA AAACTTTGTAATTTTTGTCGCTACCCATTAGGCAGAAT CTGCATCTTGGGAGGGGGATGTGGTGGCGTTCTGAGA TGTACGCGAAGAATGAAGAGCCAGTGGTAACAACAG GCCTAGAGAGATACGGGCATAATGGGTATAACCTACA AGTTAAGAATGTAGCAGCCCTGGAAACCAGATTGAAA CGAAAAACGAAATCATTTAAACTGTAGGATGTTTTGG CTCATTGTCTGGAAGGCTGGCTGTTTATTGCCCTGTTC TTTGCATGGGAATAAGCTATTATATCCCTCACATAATC CCAGAAAATAGATTGAAGCAACGCGAAATCCTTACGT ATCGAAGTAGCCTTCTTACACATTCACGTTGTACGGAT AAGAAAACTACTCAAACGAACAATC
56 Sequence of the PpOCH1 terminator: AATAGATATAGCGAGATTAGAGAATGAATACCTTCTT CTAAGCGATCGTCCGTCATCATAGAATATCATGGACT GTATAGTTTTTTTTTTGTACATATAATGATTAAACGGT CATCCAACATCTCGTTGACAGATCTCTCAGTACGCGAA ATCCCTGACTATCAAAGCAAGAACCGATGAAGAAAAA AACAACAGTAACCCAAACACCACAACAAACACTTTAT CTTCTCCCCCCCAACACCAATCATCAAAGAGATGTCG GAACACAAACACCAAGAAGCAAAAACTAACCCCATAT AAAAACATCCTGGTAGATAATGCTGGTAACCCGCTCT CCTTCCATATTCTGGGCTACTTCACGAAGTCTGACCGG TCTCAGTTGATCAACATGATCCTCGAAATGG
57 DNA encodes Mm ManI catalytic domain (FB): GAGCCCGCTGACGCCACCATCCGTGAGAAGAGGGCAA AGATCAAAGAGATGATGACCCATGCTTGGAATAATTA TAAACGCTATGCGTGGGGCTTGAACGAACTGAAACCT ATATCAAAAGAAGGCCATTCAAGCAGTTTGTTTGGCA ACATCAAAGGAGCTACAATAGTAGATGCCCTGGATAC CCTTTTCATTATGGGCATGAAGACTGAATTTCAAGAAG CTAAATCGTGGATTAAAAAATATTTAGATTTTAATGTG AATGCTGAAGTTTCTGTTTTTGAAGTCAACATACGCTT CGTCGGTGGACTGCTGTCAGCCTACTATTTGTCCGGAG AGGAGATATTTCGAAAGAAAGCAGTGGAACTTGGGGT AAAATTGCTACCTGCATTTCATACTCCCTCTGGAATAC CTTGGGCATTGCTGAATATGAAAAGTGGGATCGGGCG GAACTGGCCCTGGGCCTCTGGAGGCAGCAGTATCCTG GCCGAATTTGGAACTCTGCATTTAGAGTTTATGCACTT GTCCCACTTATCAGGAGACCCAGTCTTTGCCGAAAAG GTTATGAAAATTCGAACAGTGTTGAACAAACTGGACA AACCAGAAGGCCTTTATCCTAACTATCTGAACCCCAGT AGTGGACAGTGGGGTCAACATCATGTGTCGGTTGGAG GACTTGGAGACAGCTTTTATGAATATTTGCTTAAGGCG TGGTTAATGTCTGACAAGACAGATCTCGAAGCCAAGA AGATGTATTTTGATGCTGTTCAGGCCATCGAGACTCAC TTGATCCGCAAGTCAAGTGGGGGACTAACGTACATCG CAGAGTGGAAGGGGGGCCTCCTGGAACACAAGATGG GCCACCTGACGTGCTTTGCAGGAGGCATGTTTGCACTT GGGGCAGATGGAGCTCCGGAAGCCCGGGCCCAACACT ACCTTGAACTCGGAGCTGAAATTGCCCGCACTTGTCAT 
GAATCTTATAATCGTACATATGTGAAGTTGGGACCGG AAGCGTTTCGATTTGATGGCGGTGTGGAAGCTATTGCC ACGAGGCAAAATGAAAAGTATTACATCTTACGGCCCG AGGTCATCGAGACATACATGTACATGTGGCGACTGAC TCACGACCCCAAGTACAGGACCTGGGCCTGGGAAGCC GTGGAGGCTCTAGAAAGTCACTGCAGAGTGAACGGAG GCTACTCAGGCTTACGGGATGTTTACATTGCCCGTGAG AGTTATGACGATGTCCAGCAAAGTTTCTTCCTGGCAGA GACACTGAAGTATTTGTACTTGATATTTTCCGATGATG ACCTTCTTCCACTAGAACACTGGATCTTCAACACCGAG GCTCATCCTTTCCCTATACTCCGTGAACAGAAGAAGG
AAATTGATGGCAAAGAGAAATGA 58 DNA encodes ATGAACACTATCCACATAATAAAATTACCGCTTAACT ScSEC12 (8) ACGCCAACTACACCTCAATGAAACAAAAAATCTCTAA The last 9 ATTTTTCACCAACTTCATCCTTATTGTGCTGCTTTCTTA nucleotides are CATTTTACAGTTCTCCTATAAGCACAATTTGCATTCCA the linker TGCTTTTCAATTACGCGAAGGACAATTTTCTAACGAAA containing the AGAGACACCATCTCTTCGCCCTACGTAGTTGATGAAG AscI restriction ACTTACATCAAACAACTTTGTTTGGCAACCACGGTACA site used for AAAACATCTGTACCTAGCGTAGATTCCATAAAAGTGC fusion to ATGGCGTGGGGCGCGCC proteins of interest 59 Sequence of the GAGTCGGCCAAGAGATGATAACTGTTACTAAGCTTCT 5'-region that CCGTAATTAGTGGTATTTTGTAACTTTTACCAATAATC was used to GTTTATGAATACGGATATTTTTCGACCTTATCCAGTGC knock into the CAAATCACGTAACTTAATCATGGTTTAAATACTCCACT PpADE1 locus: TGAACGATTCATTATTCAGAAAAAAGTCAGGTTGGCA GAAACACTTGGGCGCTTTGAAGAGTATAAGAGTATTA AGCATTAAACATCTGAACTTTCACCGCCCCAATATACT ACTCTAGGAAACTCGAAAAATTCCTTTCCATGTGTCAT CGCTTCCAACACACTTTGCTGTATCCTTCCAAGTATGT CCATTGTGAACACTGATCTGGACGGAATCCTACCTTTA ATCGCCAAAGGAAAGGTTAGAGACATTTATGCAGTCG ATGAGAACAACTTGCTGTTCGTCGCAACTGACCGTATC TCCGCTTACGATGTGATTATGACAAACGGTATTCCTGA TAAGGGAAAGATTTTGACTCAGCTCTCAGTTTTCTGGT TTGATTTTTTGGCACCCTACATAAAGAATCATTTGGTT GCTTCTAATGACAAGGAAGTCTTTGCTTTACTACCATC AAAACTGTCTGAAGAAAAaTACAAATCTCAATTAGAG GGACGATCCTTGATAGTAAAAAAGCACAGACTGATAC CTTTGGAAGCCATTGTCAGAGGTTACATCACTGGAAG TGCATGGAAAGAGTACAAGAACTCAAAAACTGTCCAT GGAGTCAAGGTTGAAAACGAGAACCTTCAAGAGAGC GACGCCTTTCCAACTCCGATTTTCACACCTTCAACGAA AGCTGAACAGGGTGAACACGATGAAAACATCTCTATT GAACAAGCTGCTGAGATTGTAGGTAAAGACATTTGTG AGAAGGTCGCTGTCAAGGCGGTCGAGTTGTATTCTGC TGCAAAAAACCTCGCCCTTTTGAAGGGGATCATTATTG CTGATACGAAATTCGAATTTGGACTGGACGAAAACAA TGAATTGGTACTAGTAGATGAAGTTTTAACTCCAGATT CTTCTAGATTTTGGAATCAAAAGACTTACCAAGTGGGT AAATCGCAAGAGAGTTACGATAAGCAGTTTCTCAGAG ATTGGTTGACGGCCAACGGATTGAATGGCAAAGAGGG CGTAGCCATGGATGCAGAAATTGCTATCAAGAGTAAA GAAAAGTATATTGAAGCTTATGAAGCAATTACTGGCA AGAAATGGGCTTGA 60 Sequence of the ATGATTAGTACCCTCCTCGCCTTTTTCAGACATCTGAA 3'-region that ATTTCCCTTATTCTTCCAATTCCATATAAAATCCTATTT was used to 
AGGTAATTAGTAAACAATGATCATAAAGTGAAATCAT knock into the TCAAGTAACCATTCCGTTTATCGTTGATTTAAAATCAA PpADE1 locus: TAACGAATGAATGTCGGTCTGAGTAGTCAATTTGTTGC CTTGGAGCTCATTGGCAGGGGGTCTTTTGGCTCAGTAT GGAAGGTTGAAAGGAAAACAGATGGAAAGTGGTTCG TCAGAAAAGAGGTATCCTACATGAAGATGAATGCCAA AGAGATATCTCAAGTGATAGCTGAGTTCAGAATTCTT AGTGAGTTAAGCCATCCCAACATTGTGAAGTACCTTC ATCACGAACATATTTCTGAGAATAAAACTGTCAATTTA TACATGGAATACTGTGATGGTGGAGATCTCTCCAAGC TGATTCGAACACATAGAAGGAACAAAGAGTACATTTC AGAAGAAAAAATATGGAGTATTTTTACGCAGGTTTTA TTAGCATTGTATCGTTGTCATTATGGAACTGATTTCAC GGCTTCAAAGGAGTTTGAATCGCTCAATAAAGGTAAT AGACGAACCCAGAATCCTTCGTGGGTAGACTCGACAA GAGTTATTATTCACAGGGATATAAAACCCGACAACAT CTTTCTGATGAACAATTCAAACCTTGTCAAACTGGGAG ATTTTGGATTAGCAAAAATTCTGGACCAAGAAAACGA TTTTGCCAAAACATACGTCGGTACGCCGTATTACATGT CTCCTGAAGTGCTGTTGGACCAACCCTACTCACCATTA TGTGATATATGGTCTCTTGGGTGCGTCATGTATGAGCT ATGTGCATTGAGGCCTCCTT 61 DNA encodes ATGACAGCTCAGTTACAAAGTGAAAGTACTTCTAAAA ScGAL10 TTGTTTTGGTTACAGGTGGTGCTGGATACATTGGTTCA CACACTGTGGTAGAGCTAATTGAGAATGGATATGACT GTGTTGTTGCTGATAACCTGTCGAATTCAACTTATGAT TCTGTAGCCAGGTTAGAGGTCTTGACCAAGCATCACA TTCCCTTCTATGAGGTTGATTTGTGTGACCGAAAAGGT CTGGAAAAGGTTTTCAAAGAATATAAAATTGATTCGG TAATTCACTTTGCTGGTTTAAAGGCTGTAGGTGAATCT ACACAAATCCCGCTGAGATACTATCACAATAACATTTT GGGAACTGTCGTTTTATTAGAGTTAATGCAACAATAC AACGTTTCCAAATTTGTTTTTTCATCTTCTGCTACTGTC TATGGTGATGCTACGAGATTCCCAAATATGATTCCTAT CCCAGAAGAATGTCCCTTAGGGCCTACTAATCCGTAT GGTCATACGAAATACGCCATTGAGAATATCTTGAATG ATCTTTACAATAGCGACAAAAAAAGTTGGAAGTTTGC TATCTTGCGTTATTTTAACCCAATTGGCGCACATCCCT CTGGATTAATCGGAGAAGATCCGCTAGGTATACCAAA CAATTTGTTGCCATATATGGCTCAAGTAGCTGTTGGTA GGCGCGAGAAGCTTTACATCTTCGGAGACGATTATGA TTCCAGAGATGGTACCCCGATCAGGGATTATATCCAC GTAGTTGATCTAGCAAAAGGTCATATTGCAGCCCTGC AATACCTAGAGGCCTACAATGAAAATGAAGGTTTGTG TCGTGAGTGGAACTTGGGTTCCGGTAAAGGTTCTACA GTTTTTGAAGTTTATCATGCATTCTGCAAAGCTTCTGG TATTGATCTTCCATACAAAGTTACGGGCAGAAGAGCA GGTGATGTTTTGAACTTGACGGCTAAACCAGATAGGG CCAAACGCGAACTGAAATGGCAGACCGAGTTGCAGGT TGAAGACTCCTGCAAGGATTTATGGAAATGGACTACT GAGAATCCTTTTGGTTACCAGTTAAGGGGTGTCGAGG 
CCAGATTTTCCGCTGAAGATATGCGTTATGACGCAAG ATTTGTGACTATTGGTGCCGGCACCAGATTTCAAGCCA CGTTTGCCAATTTGGGCGCCAGCATTGTTGACCTGAAA GTGAACGGACAATCAGTTGTTCTTGGCTATGAAAATG AGGAAGGGTATTTGAATCCTGATAGTGCTTATATAGG CGCCACGATCGGCAGGTATGCTAATCGTATTTCGAAG GGTAAGTTTAGTTTATGCAACAAAGACTATCAGTTAA CCGTTAATAACGGCGTTAATGCGAATCATAGTAGTAT CGGTTCTTTCCACAGAAAAAGATTTTTGGGACCCATCA TTCAAAATCCTTCAAAGGATGTTTTTACCGCCGAGTAC ATGCTGATAGATAATGAGAAGGACACCGAATTTCCAG GTGATCTATTGGTAACCATACAGTATACTGTGAACGTT GCCCAAAAAAGTTTGGAAATGGTATATAAAGGTAAAT TGACTGCTGGTGAAGCGACGCCAATAAATTTAACAAA TCATAGTTATTTCAATCTGAACAAGCCATATGGAGAC ACTATTGAGGGTACGGAGATTATGGTGCGTTCAAAAA AATCTGTTGATGTCGACAAAAACATGATTCCTACGGG TAATATCGTCGATAGAGAAATTGCTACCTTTAACTCTA CAAAGCCAACGGTCTTAGGCCCCAAAAATCCCCAGTT TGATTGTTGTTTTGTGGTGGATGAAAATGCTAAGCCAA GTCAAATCAATACTCTAAACAATGAATTGACGCTTATT GTCAAGGCTTTTCATCCCGATTCCAATATTACATTAGA AGTTTTAAGTACAGAGCCAACTTATCAATTTTATACCG GTGATTTCTTGTCTGCTGGTTACGAAGCAAGACAAGGT TTTGCAATTGAGCCTGGTAGATACATTGATGCTATCAA TCAAGAGAACTGGAAAGATTGTGTAACCTTGAAAAAC GGTGAAACTTACGGGTCCAAGATTGTCTACAGATTTTC CTGA 62 Sequence of the TAAGCTTCACGATTTGTGTTCCAGTTTATCCCCCCTTTA PpPMA1 TATACCGTTAACCCTTTCCCTGTTGAGCTGACTGTTGT terminator: TGTATTACCGCAATTTTTCCAAGTTTGCCATGCTTTTCG TGTTATTTGACCGATGTCTTTTTTCCCAAATCAAACTA TATTTGTTACCATTTAAACCAAGTTATCTTTTGTATTAA GAGTCTAAGTTTGTTCCCAGGCTTCATGTGAGAGTGAT AACCATCCAGACTATGATTCTTGTTTTTTATTGGGTTT GTTTGTGTGATACATCTGAGTTGTGATTCGTAAAGTAT GTCAGTCTATCTAGATTTTTAATAGTTAATTGGTAATC AATGACTTGTTTGTTTTAACTTTTAAATTGTGGGTCGT ATCCACGCGTTTAGTATAGCTGTTCATGGCTGTTAGAG GAGGGCGATGTTTATATACAGAGGACAAGAATGAGGA GGCGGCGTGTATTTTTAAAATGGAGACGCGACTCCTG TACACCTTATCGGTTGG 63 hGalT codon GGTAGAGATTTGTCTAGATTGCCACAGTTGGTTGGTGT optimized (XB) TTCCACTCCATTGCAAGGAGGTTCTAACTCTGCTGCTG CTATTGGTCAATCTTCCGGTGAGTTGAGAACTGGTGGA GCTAGACCACCTCCACCATTGGGAGCTTCCTCTCAACC AAGACCAGGTGGTGATTCTTCTCCAGTTGTTGACTCTG GTCCAGGTCCAGCTTCTAACTTGACTTCCGTTCCAGTT CCACACACTACTGCTTTGTCCTTGCCAGCTTGTCCAGA AGAATCCCCATTGTTGGTTGGTCCAATGTTGATCGAGT TCAACATGCCAGTTGACTTGGAGTTGGTTGCTAAGCA 
GAACCCAAACGTTAAGATGGGTGGTAGATACGCTCCA AGAGACTGTGTTTCCCCACACAAAGTTGCTATCATCAT CCCATTCAGAAACAGACAGGAGCACTTGAAGTACTGG TTGTACTACTTGCACCCAGTTTTGCAAAGACAGCAGTT GGACTACGGTATCTACGTTATCAACCAGGCTGGTGAC ACTATTTTCAACAGAGCTAAGTTGTTGAATGTTGGTTT CCAGGAGGCTTTGAAGGATTACGACTACACTTGTTTCG TTTTCTCCGACGTTGACTTGATTCCAATGAACGACCAC AACGCTTACAGATGTTTCTCCCAGCCAAGACACATTTC TGTTGCTATGGACAAGTTCGGTTTCTCCTTGCCATACG TTCAATACTTCGGTGGTGTTTCCGCTTTGTCCAAGCAG CAGTTCTTGACTATCAACGGTTTCCCAAACAATTACTG GGGATGGGGTGGTGAAGATGACGACATCTTTAACAGA TTGGTTTTCAGAGGAATGTCCATCTCTAGACCAAACGC TGTTGTTGGTAGATGTAGAATGATCAGACACTCCAGA GACAAGAAGAACGAGCCAAACCCACAAAGATTCGAC AGAATCGCTCACACTAAGGAAACTATGTTGTCCGACG GATTGAACTCCTTGACTTACCAGGTTTTGGACGTTCAG AGATACCCATTGTACACTCAGATCACTGTTGACATCGG TACTCCATCCTAG 64 DNA encodes ATGGCCCTCTTTCTCAGTAAGAGACTGTTGAGATTTAC ScMnt1 (Kre2) CGTCATTGCAGGTGCGGTTATTGTTCTCCTCCTAACAT (33) TGAATTCCAACAGTAGAACTCAGCAATATATTCCGAG TTCCATCTCCGCTGCATTTGATTTTACCTCAGGATCTAT ATCCCCTGAACAACAAGTCATCGGGCGCGCC 65 DNA encodes ATGAATAGCATACACATGAACGCCAATACGCTGAAGT DmUGT ACATCAGCCTGCTGACGCTGACCCTGCAGAATGCCAT CCTGGGCCTCAGCATGCGCTACGCCCGCACCCGGCCA GGCGACATCTTCCTCAGCTCCACGGCCGTACTCATGGC AGAGTTCGCCAAACTGATCACGTGCCTGTTCCTGGTCT TCAACGAGGAGGGCAAGGATGCCCAGAAGTTTGTACG CTCGCTGCACAAGACCATCATTGCGAATCCCATGGAC ACGCTGAAGGTGTGCGTCCCCTCGCTGGTCTATATCGT TCAAAACAATCTGCTGTACGTCTCTGCCTCCCATTTGG ATGCGGCCACCTACCAGGTGACGTACCAGCTGAAGAT TCTCACCACGGCCATGTTCGCGGTTGTCATTCTGCGCC GCAAGCTGCTGAACACGCAGTGGGGTGCGCTGCTGCT CCTGGTGATGGGCATCGTCCTGGTGCAGTTGGCCCAA ACGGAGGGTCCGACGAGTGGCTCAGCCGGTGGTGCCG CAGCTGCAGCCACGGCCGCCTCCTCTGGCGGTGCTCCC GAGCAGAACAGGATGCTCGGACTGTGGGCCGCACTGG GCGCCTGCTTCCTCTCCGGATTCGCGGGCATCTACTTT GAGAAGATCCTCAAGGGTGCCGAGATCTCCGTGTGGA TGCGGAATGTGCAGTTGAGTCTGCTCAGCATTCCCTTC GGCCTGCTCACCTGTTTCGTTAACGACGGCAGTAGGAT CTTCGACCAGGGATTCTTCAAGGGCTACGATCTGTTTG TCTGGTACCTGGTCCTGCTGCAGGCCGGCGGTGGATTG ATCGTTGCCGTGGTGGTCAAGTACGCGGATAACATTCT CAAGGGCTTCGCCACCTCGCTGGCCATCATCATCTCGT GCGTGGCCTCCATATACATCTTCGACTTCAATCTCACG CTGCAGTTCAGCTTCGGAGCTGGCCTGGTCATCGCCTC 
CATATTTCTCTACGGCTACGATCCGGCCAGGTCGGCGC CGAAGCCAACTATGCATGGTCCTGGCGGCGATGAGGA GAAGCTGCTGCCGCGCGTCTAG
66 Sequence of the PpOCH1 promoter: TGGACACAGGAGACTCAGAAACAGACACAGAGCGTTC TGAGTCCTGGTGCTCCTGACGTAGGCCTAGAACAGGA ATTATTGGCTTTATTTGTTTGTCCATTTCATAGGCTTGG GGTAATAGATAGATGACAGAGAAATAGAGAAGACCT AATATTTTTTGTTCATGGCAAATCGCGGGTTCGCGGTC GGGTCACACACGGAGAAGTAATGAGAAGAGCTGGTA ATCTGGGGTAAAAGGGTTCAAAAGAAGGTCGCCTGGT AGGGATGCAATACAAGGTTGTCTTGGAGTTTACATTG ACCAGATGATTTGGCTTTTTCTCTGTTCAATTCACATTT TTCAGCGAGAATCGGATTGACGGAGAAATGGCGGGGT GTGGGGTGGATAGATGGCAGAAATGCTCGCAATCACC GCGAAAGAAAGACTTTATGGAATAGAACTACTGGGTG GTGTAAGGATTACATAGCTAGTCCAATGGAGTCCGTT GGAAAGGTAAGAAGAAGCTAAAACCGGCTAAGTAAC TAGGGAAGAATGATCAGACTTTGATTTGATGAGGTCT GAAAATACTCTGCTGCTTTTTCAGTTGCTTTTTCCCTGC AACCTATCATTTTCCTTTTCATAAGCCTGCCTTTTCTGT TTTCACTTATATGAGTTCCGCCGAGACTTCCCCAAATT CTCTCCTGGAACATTCTCTATCGCTCTCCTTCCAAGTT GCGCCCCCTGGCACTGCCTAGTAATATTACCACGCGA CTTATATTCAGTTCCACAATTTCCAGTGTTCGTAGCAA ATATCATCAGCC
67 Sequence of the PpALG12 terminator: AATATATACCTCATTTGTTCAATTTGGTGTAAAGAGTG TGGCGGATAGACTTCTTGTAAATCAGGAAAGCTACAA TTCCAATTGCTGCAAAAAATACCAATGCCCATAAACC AGTATGAGCGGTGCCTTCGACGGATTGCTTACTTTCCG ACCCTTTGTCGTTTGATTCTTCTGCCTTTGGTGAGTCAG TTTGTTTCGACTTTATATCTGACTCATCAACTTCCTTTA CGGTTGCGTTTTTAATCATAATTTTAGCCGTTGGCTTA TTATCCCTTGAGTTGGTAGGAGTTTTGATGATGCTG
68 Sequence of the 5'-Region used for knock out of PpHIS1: TAACTGGCCCTTTGACGTTTCTGACAATAGTTCTAGAG GAGTCGTCCAAAAACTCAACTCTGACTTGGGTGACAC CACCACGGGATCCGGTTCTTCCGAGGACCTTGATGAC CTTGGCTAATGTAACTGGAGTTTTAGTATCCATTTTAA GATGTGTGTTTCTGTAGGTTCTGGGTTGGAAAAAAATT TTAGACACCAGAAGAGAGGAGTGAACTGGTTTGCGTG
GGTTTAGACTGTGTAAGGCACTACTCTGTCGAAGTTTT AGATAGGGGTTACCCGCTCCGATGCATGGGAAGCGAT TAGCCCGGCTGTTGCCCGTTTGGTTTTTGAAGGGTAAT TTTCAATATCTCTGTTTGAGTCATCAATTTCATATTCAA AGATTCAAAAACAAAATCTGGTCCAAGGAGCGCATTT AGGATTATGGAGTTGGCGAATCACTTGAACGATAGAC TATTATTTGC 69 Sequence of the GTGACATTCTTGTCTTTGAGATCAGTAATTGTAGAGCA 3'-Region used TAGATAGAATAATATTCAAGACCAACGGCTTCTCTTCG for knock out of GAAGCTCCAAGTAGCTTATAGTGATGAGTACCGGCAT PpHIS1: ATATTTATAGGCTTAAAATTTCGAGGGTTCACTATATT CGTTTAGTGGGAAGAGTTCCTTTCACTCTTGTTATCTA TATTGTCAGCGTGGACTGTTTATAACTGTACCAACTTA GTTTCTTTCAACTCCAGGTTAAGAGACATAAATGTCCT TTGATGCTGACAATAATCAGTGGAATTCAAGGAAGGA CAATCCCGACCTCAATCTGTTCATTAATGAAGAGTTCG AATCGTCCTTAAATCAAGCGCTAGACTCAATTGTCAAT GAGAACCCTTTCTTTGACCAAGAAACTATAAATAGAT CGAATGACAAAGTTGGAAATGAGTCCATTAGCTTACA TGATATTGAGCAGGCAGACCAAAATAAACCGTCCTTT GAGAGCGATATTGATGGTTCGGCGCCGTTGATAAGAG ACGACAAATTGCCAAAGAAACAAAGCTGGGGGCTGA GCAATTTTTTTTCAAGAAGAAATAGCATATGTTTACCA CTACATGAAAATGATTCAAGTGTTGTTAAGACCGAAA GATCTATTGCAGTGGGAACACCCCATCTTCAATACTGC TTCAATGGAATCTCCAATGCCAAGTACAATGCATTTAC CTTTTTCCCAGTCATCCTATACGAGCAATTCAAATTTT TTTTCAATTTATACTTTACTTTAGTGGCTCTCTCTCAAG CGATACCGCAACTTCGCATTGGATATCTTTCTTCGTAT GTCGTCCCACTTTTGTTTGTACTCATAGTGACCATGTC AAAAGAGGCGATGGATGATATTCAACGCCGAAGAAG GGATAGAGAACAGAACAATGAACCATATGAGGTTCTG TCCAGCCCATCACCAGTTTTGTCCAAAAACTTAAAATG TGGTCACTTGGTTCGATTGCATAAGGGAATGAGAGTG CCCGCAGATATGGTTCTTGTCCAGTCAAGCGAATCCAC CGGAGAGTCATTTATCAAGACAGATCAGCTGGATGGT GAGACTGATTGGAAGCTTCGGATTGTTTCTCCAGTTAC ACAATCGTTACCAATGACTGAACTTCAAAATGTCGCC ATCACTGCAAGCGCACCCTCAAAATCAATTCACTCCTT TCTTGGAAGATTGACCTACAATGGGCAATCATATGGT CTTACGATAGACAACACAATGTGGTGTAATACTGTATT AGCTTCTGGTTCAGCAATTGGTTGTATAATTTACACAG GTAAAGATACTCGACAATCGATGAACACAACTCAGCC CAAACTGAAAACGGGCTTGTTAGAACTGGAAATCAAT AGTTTGTCCAAGATCTTATGTGTTTGTGTGTTTGCATT ATCTGTCATCTTAGTGCTATTCCAAGGAATAGCTGATG ATTGGTACGTCGATATCATGCGGTTTCTCATTCTATTC TCCACTATTATCCCAGTGTCTCTGAGAGTTAACCTTGA TCTTGGAAAGTCAGTCCATGCTCATCAAATAGAAACT GATAGCTCAATACCTGAAACCGTTGTTAGAACTAGTA CAATACCGGAAGACCTGGGAAGAATTGAATACCTATT 
AAGTGACAAAACTGGAACTCTTACTCAAAATGATATG GAAATGAAAAAACTACACCTAGGAACAGTCTCTTATG CTGGTGATACCATGGATATTATTTCTGATCATGTTAAA GGTCTTAATAACGCTAAAACATCGAGGAAAGATCTTG GTATGAGAATAAGAGATTTGGTTACAACTCTGGCCAT CTG 70 DNA encodes AGAGACGATCCAATTAGACCTCCATTGAAGGTTGCTA Drosophila GATCCCCAAGACCAGGTCAATGTCAAGATGTTGTTCA melanogaster GGACGTCCCAAACGTTGATGTCCAGATGTTGGAGTTG ManII codon- TACGATAGAATGTCCTTCAAGGACATTGATGGTGGTG optimized (KD) TTTGGAAGCAGGGTTGGAACATTAAGTACGATCCATT GAAGTACAACGCTCATCACAAGTTGAAGGTCTTCGTT GTCCCACACTCCCACAACGATCCTGGTTGGATTCAGAC CTTCGAGGAATACTACCAGCACGACACCAAGCACATC TTGTCCAACGCTTTGAGACATTTGCACGACAACCCAG AGATGAAGTTCATCTGGGCTGAAATCTCCTACTTCGCT AGATTCTACCACGATTTGGGTGAGAACAAGAAGTTGC AGATGAAGTCCATCGTCAAGAACGGTCAGTTGGAATT CGTCACTGGTGGATGGGTCATGCCAGACGAGGCTAAC TCCCACTGGAGAAACGTTTTGTTGCAGTTGACCGAAG GTCAAACTTGGTTGAAGCAATTCATGAACGTCACTCC AACTGCTTCCTGGGCTATCGATCCATTCGGACACTCTC CAACTATGCCATACATTTTGCAGAAGTCTGGTTTCAAG AATATGTTGATCCAGAGAACCCACTACTCCGTTAAGA AGGAGTTGGCTCAACAGAGACAGTTGGAGTTCTTGTG GAGACAGATCTGGGACAACAAAGGTGACACTGCTTTG TTCACCCACATGATGCCATTCTACTCTTACGACATTCC TCATACCTGTGGTCCAGATCCAAAGGTTTGTTGTCAGT TCGATTTCAAAAGAATGGGTTCCTTCGGTTTGTCTTGT CCATGGAAGGTTCCACCTAGAACTATCTCTGATCAAA ATGTTGCTGCTAGATCCGATTTGTTGGTTGATCAGTGG AAGAAGAAGGCTGAGTTGTACAGAACCAACGTCTTGT TGATTCCATTGGGTGACGACTTCAGATTCAAGCAGAA CACCGAGTGGGATGTTCAGAGAGTCAACTACGAAAGA TTGTTCGAACACATCAACTCTCAGGCTCACTTCAATGT CCAGGCTCAGTTCGGTACTTTGCAGGAATACTTCGATG CTGTTCACCAGGCTGAAAGAGCTGGACAAGCTGAGTT CCCAACCTTGTCTGGTGACTTCTTCACTTACGCTGATA GATCTGATAACTACTGGTCTGGTTACTACACTTCCAGA CCATACCATAAGAGAATGGACAGAGTCTTGATGCACT ACGTTAGAGCTGCTGAAATGTTGTCCGCTTGGCACTCC TGGGACGGTATGGCTAGAATCGAGGAAAGATTGGAGC AGGCTAGAAGAGAGTTGTCCTTGTTCCAGCACCACGA CGGTATTACTGGTACTGCTAAAACTCACGTTGTCGTCG ACTACGAGCAAAGAATGCAGGAAGCTTTGAAAGCTTG TCAAATGGTCATGCAACAGTCTGTCTACAGATTGTTGA CTAAGCCATCCATCTACTCTCCAGACTTCTCCTTCTCCT ACTTCACTTTGGACGACTCCAGATGGCCAGGTTCTGGT GTTGAGGACTCTAGAACTACCATCATCTTGGGTGAGG ATATCTTGCCATCCAAGCATGTTGTCATGCACAACACC TTGCCACACTGGAGAGAGCAGTTGGTTGACTTCTACGT 
CTCCTCTCCATTCGTTTCTGTTACCGACTTGGCTAACA ATCCAGTTGAGGCTCAGGTTTCTCCAGTTTGGTCTTGG CACCACGACACTTTGACTAAGACTATCCACCCACAAG GTTCCACCACCAAGTACAGAATCATCTTCAAGGCTAG AGTTCCACCAATGGGTTTGGCTACCTACGTTTTGACCA TCTCCGATTCCAAGCCAGAGCACACCTCCTACGCTTCC AATTTGTTGCTTAGAAAGAACCCAACTTCCTTGCCATT GGGTCAATACCCAGAGGATGTCAAGTTCGGTGATCCA AGAGAGATCTCCTTGAGAGTTGGTAACGGTCCAACCT TGGCTTTCTCTGAGCAGGGTTTGTTGAAGTCCATTCAG TTGACTCAGGATTCTCCACATGTTCCAGTTCACTTCAA GTTCTTGAAGTACGGTGTTAGATCTCATGGTGATAGAT CTGGTGCTTACTTGTTCTTGCCAAATGGTCCAGCTTCT CCAGTCGAGTTGGGTCAGCCAGTTGTCTTGGTCACTAA GGGTAAATTGGAGTCTTCCGTTTCTGTTGGTTTGCCAT CTGTCGTTCACCAGACCATCATGAGAGGTGGTGCTCC AGAGATTAGAAATTTGGTCGATATTGGTTCTTTGGACA ACACTGAGATCGTCATGAGATTGGAGACTCATATCGA CTCTGGTGATATCTTCTACACTGATTTGAATGGATTGC AATTCATCAAGAGGAGAAGATTGGACAAGTTGCCATT GCAGGCTAACTACTACCCAATTCCATCTGGTATGTTCA TTGAGGATGCTAATACCAGATTGACTTTGTTGACCGGT CAACCATTGGGTGGATCTTCTTTGGCTTCTGGTGAGTT GGAGATTATGCAAGATAGAAGATTGGCTTCTGATGAT GAAAGAGGTTTGGGTCAGGGTGTTTTGGACAACAAGC CAGTTTTGCATATTTACAGATTGGTCTTGGAGAAGGTT AACAACTGTGTCAGACCATCTAAGTTGCATCCAGCTG GTTACTTGACTTCTGCTGCTCACAAAGCTTCTCAGTCT TTGTTGGATCCATTGGACAAGTTCATCTTCGCTGAAAA TGAGTGGATCGGTGCTCAGGGTCAATTCGGTGGTGAT CATCCATCTGCTAGAGAGGATTTGGATGTCTCTGTCAT GAGAAGATTGACCAAGTCTTCTGCTAAAACCCAGAGA GTTGGTTACGTTTTGCACAGAACCAATTTGATGCAATG TGGTACTCCAGAGGAGCATACTCAGAAGTTGGATGTC TGTCACTTGTTGCCAAATGTTGCTAGATGTGAGAGAAC TACCTTGACTTTCTTGCAGAATTTGGAGCACTTGGATG GTATGGTTGCTCCAGAAGTTTGTCCAATGGAAACCGCT GCTTACGTCTCTTCTCACTCTTCTTGA 71 DNA encodes ATGCTGCTTACCAAAAGGTTTTCAAAGCTGTTCAAGCT Mnn2 leader GACGTTCATAGTTTTGATATTGTGCGGGCTGTTCGTCA (53) TTACAAACAAATACATGGATGAGAACACGTCG 72 Sequence of the CAAGTTGCGTCCGGTATACGTAACGTCTCACGATGATC PpHIS1 AAAGATAATACTTAATCTTCATGGTCTACTGAATAACT auxotrophic CATTTAAACAATTGACTAATTGTACATTATATTGAACT marker: TATGCATCCTATTAACGTAATCTTCTGGCTTCTCTCTCA GACTCCATCAGACACAGAATATCGTTCTCTCTAACTGG TCCTTTGACGTTTCTGACAATAGTTCTAGAGGAGTCGT CCAAAAACTCAACTCTGACTTGGGTGACACCACCACG GGATCCGGTTCTTCCGAGGACCTTGATGACCTTGGCTA ATGTAACTGGAGTTTTAGTATCCATTTTAAGATGTGTG 
TTTCTGTAGGTTCTGGGTTGGAAAAAAATTTTAGACAC CAGAAGAGAGGAGTGAACTGGTTTGCGTGGGTTTAGA CTGTGTAAGGCACTACTCTGTCGAAGTTTTAGATAGGG GTTACCCGCTCCGATGCATGGGAAGCGATTAGCCCGG CTGTTGCCCGTTTGGTTTTTGAAGGGTAATTTTCAATA TCTCTGTTTGAGTCATCAATTTCATATTCAAAGATTCA AAAACAAAATCTGGTCCAAGGAGCGCATTTAGGATTA TGGAGTTGGCGAATCACTTGAACGATAGACTATTATTT GCTGTTCCTAAAGAGGGCAGATTGTATGAGAAATGCG TTGAATTACTTAGGGGATCAGATATTCAGTTTCGAAGA TCCAGTAGATTGGATATAGCTTTGTGCACTAACCTGCC CCTGGCATTGGTTTTCCTTCCAGCTGCTGACATTCCCA CGTTTGTAGGAGAGGGTAAATGTGATTTGGGTATAAC TGGTATTGACCAGGTTCAGGAAAGTGACGTAGATGTC ATACCTTTATTAGACTTGAATTTCGGTAAGTGCAAGTT GCAGATTCAAGTTCCCGAGAATGGTGACTTGAAAGAA CCTAAACAGCTAATTGGTAAAGAAATTGTTTCCTCCTT TACTAGCTTAACCACCAGGTACTTTGAACAACTGGAA GGAGTTAAGCCTGGTGAGCCACTAAAGACAAAAATCA AATATGTTGGAGGGTCTGTTGAGGCCTCTTGTGCCCTA GGAGTTGCCGATGCTATTGTGGATCTTGTTGAGAGTGG AGAAACCATGAAAGCGGCAGGGCTGATCGATATTGAA ACTGTTCTTTCTACTTCCGCTTACCTGATCTCTTCGAAG CATCCTCAACACCCAGAACTGATGGATACTATCAAGG AGAGAATTGAAGGTGTACTGACTGCTCAGAAGTATGT CTTGTGTAATTACAACGCACCTAGAGGTAACCTTCCTC AGCTGCTAAAACTGACTCCAGGCAAGAGAGCTGCTAC CGTTTCTCCATTAGATGAAGAAGATTGGGTGGGAGTG TCCTCGATGGTAGAGAAGAAAGATGTTGGAAGAATCA TGGACGAATTAAAGAAACAAGGTGCCAGTGACATTCT TGTCTTTGAGATCAGTAATTGTAGAGCATAGATAGAA TAATATTCAAGACCAACGGCTTCTCTTCGGAAGCTCCA AGTAGCTTATAGTGATGAGTACCGGCATATATTTATAG GCTTAAAATTTCGAGGGTTCACTATATTCGTTTAGTGG GAAGAGTTCCTTTCACTCTTGTTATCTATATTGTCAGC GTGGACTGTTTATAACTGTACCAACTTAGTTTCTTTCA ACTCCAGGTTAAGAGACATAAATGTCCTTTGATGC 73 DNA encodes TCCTTGGTTTACCAATTGAACTTCGACCAGATGTTGAG Rat GnT II AAACGTTGACAAGGACGGTACTTGGTCTCCTGGTGAG (TC) TTGGTTTTGGTTGTTCAGGTTCACAACAGACCAGAGTA Codon- CTTGAGATTGTTGATCGACTCCTTGAGAAAGGCTCAA optimized GGTATCAGAGAGGTTTTGGTTATCTTCTCCCACGATTT CTGGTCTGCTGAGATCAACTCCTTGATCTCCTCCGTTG ACTTCTGTCCAGTTTTGCAGGTTTTCTTCCCATTCTCCA TCCAATTGTACCCATCTGAGTTCCCAGGTTCTGATCCA AGAGACTGTCCAAGAGACTTGAAGAAGAACGCTGCTT TGAAGTTGGGTTGTATCAACGCTGAATACCCAGATTCT TTCGGTCACTACAGAGAGGCTAAGTTCTCCCAAACTA AGCATCATTGGTGGTGGAAGTTGCACTTTGTTTGGGAG AGAGTTAAGGTTTTGCAGGACTACACTGGATTGATCTT 
GTTCTTGGAGGAGGATCATTACTTGGCTCCAGACTTCT ACCACGTTTTCAAGAAGATGTGGAAGTTGAAGCAACA AGAGTGTCCAGGTTGTGACGTTTTGTCCTTGGGAACTT ACACTACTATCAGATCCTTCTACGGTATCGCTGACAAG GTTGACGTTAAGACTTGGAAGTCCACTGAACACAACA TGGGATTGGCTTTGACTAGAGATGCTTACCAGAAGTT GATCGAGTGTACTGACACTTTCTGTACTTACGACGACT ACAACTGGGACTGGACTTTGCAGTACTTGACTTTGGCT TGTTTGCCAAAAGTTTGGAAGGTTTTGGTTCCACAGGC TCCAAGAATTTTCCACGCTGGTGACTGTGGAATGCACC ACAAGAAAACTTGTAGACCATCCACTCAGTCCGCTCA AATTGAGTCCTTGTTGAACAACAACAAGCAGTACTTG TTCCCAGAGACTTTGGTTATCGGAGAGAAGTTTCCAAT GGCTGCTATTTCCCCACCAAGAAAGAATGGTGGATGG GGTGATATTAGAGACCACGAGTTGTGTAAATCCTACA GAAGATTGCAGTAG 74 DNA encodes ATGCTGCTTACCAAAAGGTTTTCAAAGCTGTTCAAGCT Mnn2 leader GACGTTCATAGTTTTGATATTGTGCGGGCTGTTCGTCA (54) TTACAAACAAATACATGGATGAGAACACGTCGGTCAA The last 9 GGAGTACAAGGAGTACTTAGACAGATATGTCCAGAGT nucleotides are TACTCCAATAAGTATTCATCTTCCTCAGACGCCGCCAG the linker CGCTGACGATTCAACCCCATTGAGGGACAATGATGAG containing the GCAGGCAATGAAAAGTTGAAAAGCTTCTACAACAACG AscI restriction TTTTCAACTTTCTAATGGTTGATTCGCCCGGGCGCGCC site) 75 Sequence of the GATCTGGCCTTCCCTGAATTTTTACGTCCAGCTATACG 5'-Region used ATCCGTTGTGACTGTATTTCCTGAAATGAAGTTTCAAC for knock out of CTAAAGTTTTGGTTGTACTTGCTCCACCTACCACGGAA PpARG1: ACTAATATCGAAACCAATGAAAAAGTAGAACTGGAAT CGTCAATCGAAATTCGCAACCAAGTGGAACCCAAAGA CTTGAATCTTTCTAAAGTCTATTCTAGTGACACTAATG GCAACAGAAGATTTGAGCTGACTTTTCAAATGAATCT CAATAATGCAATATCAACATCAGACAATCAATGGGCT TTGTCTAGTGACACAGGATCAATTATAGTAGTGTCTTC TGCAGGAAGAATAACTTCCCCGATCCTAGAAGTCGGG GCATCCGTCTGTGTCTTAAGATCGTACAACGAACACCT TTTGGCAATAACTTGTGAAGGAACATGCTTTTCATGGA ATTTAAAGAAGCAAGAATGTGTTCTAAACAGCATTTC ATTAGCACCTATAGTCAATTCACACATGCTAGTTAAGA AAGTTGGAGATGCAAGGAACTATTCTATTGTATCTGCC GAAGGAGACAACAATCCGTTACCCCAGATTCTAGACT GCGAACTTTCCAAAAATGGCGCTCCAATTGTGGCTCTT
AGCACGAAAGACATCTACTCTTATTCAAAGAAAATGA AATGCTGGATCCATTTGATTGATTCGAAATACTTTGAA TTGTTGGGTGCTGACAATGCACTGTTTGAGTGTGTGGA AGCGCTAGAAGGTCCAATTGGAATGCTAATTCATAGA TTGGTAGATGAGTTCTTCCATGAAAACACTGCCGGTA AAAAACTCAAACTTTACAACAAGCGAGTACTGGAGGA CCTTTCAAATTCACTTGAAGAACTAGGTGAAAATGCG TCTCAATTAAGAGAGAAACTTGACAAACTCTATGGTG ATGAGGTTGAGGCTTCTTGACCTCTTCTCTCTATCTGC GTTTCTTTTTTTTTTTTTTTTTTTTTTTTTTTCAGTTGAG CCAGACCGCGCTAAACGCATACCAATTGCCAAATCAG GCAATTGTGAGACAGTGGTAAAAAAGATGCCTGCAAA GTTAGATTCACACAGTAAGAGAGATCCTACTCATAAA TGAGGCGCTTATTTAGTAGCTAGTGATAGCCACTGCG GTTCTGCTTTATGCTATTTGTTGTATGCCTTACTATCTT TGTTTGGCTCCTTTTTCTTGACGTTTTCCGTTGGAGGGA CTCCCTATTCTGAGTCATGAGCCGCACAGATTATCGCC CAAAATTGACAAAATCTTCTGGCGAAAAAAGTATAAA AGGAGAAAAAAGCTCACCCTTTTCCAGCGTAGAAAGT ATATATCAGTCATTGAAGAC 76 Sequence of the GGGACTTTAACTCAAGTAAAAGGATAGTTGTACAATT 3'-Region used ATATATACGAAGAATAAATCATTACAAAAAGTATTCG for knock out of TTTCTTTGATTCTTAACAGGATTCATTTTCTGGGTGTCA PpARG1: TCAGGTACAGCGCTGAATATCTTGAAGTTAACATCGA GCTCATCATCGACGTTCATCACACTAGCCACGTTTCCG CAACGGTAGCAATAATTAGGAGCGGACCACACAGTGA CGACATCTTTCTCTTTGAAATGGTATCTGAAGCCTTCC ATGACCAATTGATGGGCTCTAGCGATGAGTTGCAAGT TATTAATGTGGTTGAACTCACGTGCTACTCGAGCACCG AATAACCAGCCAGCTCCACGAGGAGAAACAGCCCAAC TGTCGACTTCATCTGGGTCAGACCAAACCAAGTCACA AAATCCTCCTTCATGAGGGACCTCTTGCGCTCGGCTGA GAACTCTGATTTGATCTAACATGCGAATATCGGGAGA GAGACCACCATGGATACATAATATTTTACCATCAATG ATGGCACTAAGGGTTAAAAAGTCGAACACCTGGCAAC AGTACTTCCAGACAGTGGTGGAACCATATTTATTGAG ACATTCCTCATAAAATCCATAAACCTGAGTGATCTGTC TGGATTCATGATTTCCCCTTACCAATGTGATATGTTGA GGAAACTTAATTTTTAAAATCATGAGTAACGTGAACG TCTCCAACGAGAAATAGCCTCTATCCACATAGTCTCCT AGGAAGATATAGTTCTGTTTTATTCCATTAGAGGAGG ATCCGGGAAACCCACCACTAATCTTGAAAAGTTCCAG TAGATCGTGAAATTGGCCGTGAATATCTCCGCATACTG TCACTGGACTCTGCACTGGCTGTATATTGGATTCCTCC ATCAGCAAATCCTTCACCCGTTCGCAAAGATGCTTCAT ATCATTTTCACTTAAAGCCTTGCAGCTTTTGACTTCTTC AAACCACTGATCTGGTCCTCTTTCTGGCATGATTAAGG TCTATAATATTTCTGAGCTGAGATGTAAAAAAAAATA ATAAAAATGGGGAGTGAAAAAGTGTGTAGCTTTTAGG AGTTTGGGATTGATACCCCAAAATGATCTTTATGAGA 
ATTAAAAGGTAGATACGCTTTTAATAAGAACACCTAT CTATAGTACTTTGTGGTCTTGAGTAATTGAGATGTTCA GCTTCTGAGGTTTGCCGTTATTCTGGGATAGTAGTGCG CGACCAAACAACCCGCCAGGCAAAGTGTGTTGTGCTC GAAGACGATTGCCAGAAGAGTAAGTCCGTCCTGCCTC AGATGTTACACACTTTCTTCCCTAGACAGTCGATGCAT CATCGGATTTAAACCTGAAACTTTGATGCCATGATACG CCTAGTCACGTCGACTGAGATTTTAGATAAGCCCCGAT CCCTTTAGTACATTCCTGTTATCCATGGATGGAATGGC CTGATA 77 Sequence of the AAGCTTGTTCACCGTTGGGACTTTTCCGTGGACAATGT 5'-Region used TGACTACTCCAGGAGGGATTCCAGCTTTCTCTACTAGC for knock out of TCAGCAATAATCAATGCAGCCCCAGGCGCCCGTTCTG BMT4 ATGGCTTGATGACCGTTGTATTGCCTGTCACTATAGCC AGGGGTAGGGTCCATAAAGGAATCATAGCAGGGAAA TTAAAAGGGCATATTGATGCAATCACTCCCAATGGCT CTCTTGCCATTGAAGTCTCCATATCAGCACTAACTTCC AAGAAGGACCCCTTCAAGTCTGACGTGATAGAGCACG CTTGCTCTGCCACCTGTAGTCCTCTCAAAACGTCACCT TGTGCATCAGCAAAGACTTTACCTTGCTCCAATACTAT GACGGAGGCAATTCTGTCAAAATTCTCTCTCAGCAATT CAACCAACTTGAAAGCAAATTGCTGTCTCTTGATGATG GAGACTTTTTTCCAAGATTGAAATGCAATGTGGGACG ACTCAATTGCTTCTTCCAGCTCCTCTTCGGTTGATTGA GGAACTTTTGAAACCACAAAATTGGTCGTTGGGTCAT GTACATCAAACCATTCTGTAGATTTAGATTCGACGAA AGCGTTGTTGATGAAGGAAAAGGTTGGATACGGTTTG TCGGTCTCTTTGGTATGGCCGGTGGGGTATGCAATTGC AGTAGAAGATAATTGGACAGCCATTGTTGAAGGTAGA GAAAAGGTCAGGGAACTTGGGGGTTATTTATACCATT TTACCCCACAAATAACAACTGAAAAGTACCCATTCCA TAGTGAGAGGTAACCGACGGAAAAAGACGGGCCCAT GTTCTGGGACCAATAGAACTGTGTAATCCATTGGGAC TAATCAACAGACGATTGGCAATATAATGAAATAGTTC GTTGAAAAGCCACGTCAGCTGTCTTTTCATTAACTTTG GTCGGACACAACATTTTCTACTGTTGTATCTGTCCTAC TTTGCTTATCATCTGCCACAGGGCAAGTGGATTTCCTT CTCGCGCGGCTGGGTGAAAACGGTTAACGTGAA 78 Sequence of the GCCTTGGGGGACTTCAAGTCTTTGCTAGAAACTAGAT 3'-Region used GAGGTCAGGCCCTCTTATGGTTGTGTCCCAATTGGGCA for knock out of ATTTCACTCACCTAAAAAGCATGACAATTATTTAGCGA BMT4 AATAGGTAGTATATTTTCCCTCATCTCCCAAGCAGTTT CGTTTTTGCATCCATATCTCTCAAATGAGCAGCTACGA CTCATTAGAACCAGAGTCAAGTAGGGGTGAGCTCAGT CATCAGCCTTCGTTTCTAAAACGATTGAGTTCTTTTGT TGCTACAGGAAGCGCCCTAGGGAACTTTCGCACTTTG GAAATAGATTTTGATGACCAAGAGCGGGAGTTGATAT TAGAGAGGCTGTCCAAAGTACATGGGATCAGGCCGGC CAAATTGATTGGTGTGACTAAACCATTGTGTACTTGGA CACTCTATTACAAAAGCGAAGATGATTTGAAGTATTA 
CAAGTCCCGAAGTGTTAGAGGATTCTATCGAGCCCAG AATGAAATCATCAACCGTTATCAGCAGATTGATAAAC TCTTGGAAAGCGGTATCCCATTTTCATTATTGAAGAAC TACGATAATGAAGATGTGAGAGACGGCGACCCTCTGA ACGTAGACGAAGAAACAAATCTACTTTTGGGGTACAA TAGAGAAAGTGAATCAAGGGAGGTATTTGTGGCCATA ATACTCAACTCTATCATTAATG 79 Sequence of the CATATGGTGAGAGCCGTTCTGCACAACTAGATGTTTTC 5'-Region used GAGCTTCGCATTGTTTCCTGCAGCTCGACTATTGAATT for knock out of AAGATTTCCGGATATCTCCAATCTCACAAAAACTTATG BMT1 TTGACCACGTGCTTTCCTGAGGCGAGGTGTTTTATATG CAAGCTGCCAAAAATGGAAAACGAATGGCCATTTTTC GCCCAGGCAAATTATTCGATTACTGCTGTCATAAAGA CAGTGTTGCAAGGCTCACATTTTTTTTTAGGATCCGAG ATAAAGTGAATACAGGACAGCTTATCTCTATATCTTGT ACCATTCGTGAATCTTAAGAGTTCGGTTAGGGGGACT CTAGTTGAGGGTTGGCACTCACGTATGGCTGGGCGCA GAAATAAAATTCAGGCGCAGCAGCACTTATCGATG 80 Sequence of the GAATTCACAGTTATAAATAAAAACAAAAACTCAAAAA 3'-Region used GTTTGGGCTCCACAAAATAACTTAATTTAAATTTTTGT for knock out of CTAATAAATGAATGTAATTCCAAGATTATGTGATGCA BMT1 AGCACAGTATGCTTCAGCCCTATGCAGCTACTAATGTC AATCTCGCCTGCGAGCGGGCCTAGATTTTCACTACAA ATTTCAAAACTACGCGGATTTATTGTCTCAGAGAGCA ATTTGGCATTTCTGAGCGTAGCAGGAGGCTTCATAAG ATTGTATAGGACCGTACCAACAAATTGCCGAGGCACA ACACGGTATGCTGTGCACTTATGTGGCTACTTCCCTAC AACGGAATGAAACCTTCCTCTTTCCGCTTAAACGAGA AAGTGTGTCGCAATTGAATGCAGGTGCCTGTGCGCCTT GGTGTATTGTTTTTGAGGGCCCAATTTATCAGGCGCCT TTTTTCTTGGTTGTTTTCCCTTAGCCTCAAGCAAGGTTG GTCTATTTCATCTCCGCTTCTATACCGTGCCTGATACT GTTGGATGAGAACACGACTCAACTTCCTGCTGCTCTGT ATTGCCAGTGTTTTGTCTGTGATTTGGATCGGAGTCCT CCTTACTTGGAATGATAATAATCTTGGCGGAATCTCCC TAAACGGAGGCAAGGATTCTGCCTATGATGATCTGCT ATCATTGGGAAGCTT 81 Sequence of the GATATCTCCCTGGGGACAATATGTGTTGCAACTGTTCG 5'-Region used TTGTTGGTGCCCCAGTCCCCCAACCGGTACTAATCGGT for knock out of CTATGTTCCCGTAACTCATATTCGGTTAGAACTAGAAC BMT3 AATAAGTGCATCATTGTTCAACATTGTGGTTCAATTGT CGAACATTGCTGGTGCTTATATCTACAGGGAAGACGA TAAGCCTTTGTACAAGAGAGGTAACAGACAGTTAATT GGTATTTCTTTGGGAGTCGTTGCCCTCTACGTTGTCTC CAAGACATACTACATTCTGAGAAACAGATGGAAGACT CAAAAATGGGAGAAGCTTAGTGAAGAAGAGAAAGTT GCCTACTTGGACAGAGCTGAGAAGGAGAACCTGGGTT CTAAGAGGCTGGACTTTTTGTTCGAGAGTTAAACTGCA 
TAATTTTTTCTAAGTAAATTTCATAGTTATGAAATTTCT GCAGCTTAGTGTTTACTGCATCGTTTACTGCATCACCC TGTAAATAATGTGAGCTTTTTTCCTTCCATTGCTTGGT ATCTTCCTTGCTGCTGTTT 82 Sequence of the ACAAAACAGTCATGTACAGAACTAACGCCTTTAAGAT 3'-Region used GCAGACCACTGAAAAGAATTGGGTCCCATTTTTCTTGA for knock out of AAGACGACCAGGAATCTGTCCATTTTGTTTACTCGTTC BMT3 AATCCTCTGAGAGTACTCAACTGCAGTCTTGATAACG GTGCATGTGATGTTCTATTTGAGTTACCACATGATTTT GGCATGTCTTCCGAGCTACGTGGTGCCACTCCTATGCT CAATCTTCCTCAGGCAATCCCGATGGCAGACGACAAA GAAATTTGGGTTTCATTCCCAAGAACGAGAATATCAG ATTGCGGGTGTTCTGAAACAATGTACAGGCCAATGTT AATGCTTTTTGTTAGAGAAGGAACAAACTTTTTTGCTG AGC 83 DNA encodes Tr CGCGCCGGATCTCCCAACCCTACGAGGGCGGCAGCAG ManI catalytic TCAAGGCCGCATTCCAGACGTCGTGGAACGCTTACCA domain CCATTTTGCCTTTCCCCATGACGACCTCCACCCGGTCA GCAACAGCTTTGATGATGAGAGAAACGGCTGGGGCTC GTCGGCAATCGATGGCTTGGACACGGCTATCCTCATG GGGGATGCCGACATTGTGAACACGATCCTTCAGTATG TACCGCAGATCAACTTCACCACGACTGCGGTTGCCAA CCAAGGCATCTCCGTGTTCGAGACCAACATTCGGTAC CTCGGTGGCCTGCTTTCTGCCTATGACCTGTTGCGAGG TCCTTTCAGCTCCTTGGCGACAAACCAGACCCTGGTAA ACAGCCTTCTGAGGCAGGCTCAAACACTGGCCAACGG CCTCAAGGTTGCGTTCACCACTCCCAGCGGTGTCCCGG ACCCTACCGTCTTCTTCAACCCTACTGTCCGGAGAAGT GGTGCATCTAGCAACAACGTCGCTGAAATTGGAAGCC TGGTGCTCGAGTGGACACGGTTGAGCGACCTGACGGG AAACCCGCAGTATGCCCAGCTTGCGCAGAAGGGCGAG TCGTATCTCCTGAATCCAAAGGGAAGCCCGGAGGCAT GGCCTGGCCTGATTGGAACGTTTGTCAGCACGAGCAA CGGTACCTTTCAGGATAGCAGCGGCAGCTGGTCCGGC CTCATGGACAGCTTCTACGAGTACCTGATCAAGATGT ACCTGTACGACCCGGTTGCGTTTGCACACTACAAGGA TCGCTGGGTCCTTGCTGCCGACTCGACCATTGCGCATC TCGCCTCTCACCCGTCGACGCGCAAGGACTTGACCTTT TTGTCTTCGTACAACGGACAGTCTACGTCGCCAAACTC AGGACATTTGGCCAGTTTTGCCGGTGGCAACTTCATCT TGGGAGGCATTCTCCTGAACGAGCAAAAGTACATTGA CTTTGGAATCAAGCTTGCCAGCTCGTACTTTGCCACGT ACAACCAGACGGCTTCTGGAATCGGCCCCGAAGGCTT CGCGTGGGTGGACAGCGTGACGGGCGCCGGCGGCTCG CCGCCCTCGTCCCAGTCCGGGTTCTACTCGTCGGCAGG ATTCTGGGTGACGGCACCGTATTACATCCTGCGGCCG GAGACGCTGGAGAGCTTGTACTACGCATACCGCGTCA CGGGCGACTCCAAGTGGCAGGACCTGGCGTGGGAAGC GTTCAGTGCCATTGAGGACGCATGCCGCGCCGGCAGC GCGTACTCGTCCATCAACGACGTGACGCAGGCCAACG 
GCGGGGGTGCCTCTGACGATATGGAGAGCTTCTGGTT TGCCGAGGCGCTCAAGTATGCGTACCTGATCTTTGCGG AGGAGTCGGATGTGCAGGTGCAGGCCAACGGCGGGA ACAAATTTGTCTTTAACACGGAGGCGCACCCCTTTAGC ATCCGTTCATCATCACGACGGGGCGGCCACCTTGCTTAA 84 5'ARG1 and TACCAATTGCCAAATCAGGCAATTGTGAGACAGTGGTAAA ORF AAAGATGCCTGCAAAGTTAGATTCACACAGTAAGAGAGAT CCTACTCATAAATGAGGCGCTTATTTAGTAGCTAGTGATAG CCACTGCGGTTCTGCTTTATGCTATTTGTTGTATGCCTTACT ATCTTTGTTTGGCTCCTTTTTCTTGACGTTTTCCGTTGGAGG GACTCCCTATTCTGAGTCATGAGCCGCACAGATTATCGCCC AAAATTGACAAAATCTTCTGGCGAAAAAAGTATAAAAGGA GAAAAAAGCTCACCCTTTTCCAGCGTAGAAAGTATATATCA GTCATTGAAGACTATTATTTAAATAACACAATGTCTAAAGG AAAAGTTTGTTTGGCCTACTCCGGTGGTTTGGATACCTCCA TCATCCTAGCTTGGTTGTTGGAGCAGGGATACGAAGTCGTT GCCTTTTTAGCCAACATTGGTCAAGAGGAAGACTTTGAGGC TGCTAGAGAGAAAGCTCTGAAGATCGGTGCTACCAAGTTTA TCGTCAGTGACGTTAGGAAGGAATTTGTTGAGGAAGTTTTG TTCCCAGCAGTCCAAGTTAACGCTATCTACGAGAACGTCTA CTTACTGGGTACCTCTTTGGCCAGACCAGTCATTGCCAAGG CCCAAATAGAGGTTGCTGAACAAGAAGGTTGTTTTGCTGTT GCCCACGGTTGTACCGGAAAGGGTAACGATCAGGTTAGAT TTGAGCTTTCCTTTTATGCTCTGAAGCCTGACGTTGTCTGTA TCGCCCCATGGAGAGACCCAGAATTCTTCGAAAGATTCGCT GGTAGAAATGACTTGCTGAATTACGCTGCTGAGAAGGATAT TCCAGTTGCTCAGACTAAAGCCAAGCCATGGTCTACTGATG AGAACATGGCTCACATCTCCTTCGAGGCTGGTATTCTAGAA GATCCAAACACTACTCCTCCAAAGGACATGTGGAAGCTCAC TGTTGACCCAGAAGATGCACCAGACAAGCCAGAGTTCTTTG ACGTCCACTTTGAGAAGGGTAAGCCAGTTAAATTAGTTCTC GAGAACAAAACTGAGGTCACCGATCCGGTTGAGATCTTTTT GACTGCTAACGCCATTGCTAGAAGAAACGGTGTTGGTAGA ATTGACATTGTCGAGAACAGATTCATCGGAATCAAGTCCAG AGGTTGTTATGAAACTCCAGGTTTGACTCTACTGAGAACCA CTCACATCGACTTGGAAGGTCTTACCGTTGACCGTGAAGTT AGATCGATCAGAGACACTTTTGTTACCCCAACCTACTCTAA GTTGTTATACAACGGGTTGTACTTTACCCCAGAAGGTGAGT ACGTCAGAACTATGATTCAGCCTTCTCAAAACACCGTCAAC GGTGTTGTTAGAGCCAAGGCCTACAAAGGTAATGTGTATAA CCTAGGAAGATACTCTGAAACCGAGAAATTGTACGATGCT ACCGAATCTTCCATGGATGAGTTGACCGGATTCCACCCTCA AGAAGCTGGAGGATTTATCACAACACAAGCCATCAGAATC AAGAAGTACGGAGAAAGTGTCAGAGAGAAGGGAAAGTTTT
TGGGACTTTAACTCAAGTAAAAGGATAGTTGTACAATTATA TATACGAAGAATAAATCATTACAAAAAGTATTCGTTTCTTT GATTCTTAACAGGATTCATTTTCTGGGTGTCATCAGGTACA GCGCTGAATATCTTGAAGTTAACATCGAGCTCATCATCGAC GTTCATCACACTAGCCACGTTTCCGCAACGGTAG 85 PpCITI TT CCGGCCATTTAAATATGTGACGACTGGGTGATCCGGGTTAG TGAGTTGTTCTCCCATCTGTATATTTTTCATTTACGATGAAT ACGAAATGAGTATTAAGAAATCAGGCGTAGCAATATGGGC AGTGTTCAGTCCTGTCATAGATGGCAAGCACTGGCACATCC TTAATAGGTTAGAGAAAATCATTGAATCATTTGGGTGGTGA AAAAAAATTGATGTAAACAAGCCACCCACGCTGGGAGTCG AACCCAGAATCTTTTGATTAGAAGTCAAACGCGTTAACCAT TACGCTACGCAGGCATGTTTCACGTCCATTTTTGATTGCTTT CTATCATAATCTAAAGATGTGAACTCAATTAGTTGCAATTT GACCAATTCTTCCATTACAAGTCGTGCTTCCTCCGTTGATGC AAC 86 Ashbya gossypii GATCTGTTTAGCTTGCCTCGTCCCCGCCGGGTCACCCG TEF1 promoter GCCAGCGACATGGAGGCCCAGAATACCCTCCTTGACA GTCTTGACGTGCGCAGCTCAGGGGCATGATGTGACTG TCGCCCGTACATTTAGCCCATACATCCCCATGTATAAT CATTTGCATCCATACATTTTGATGGCCGCACGGCGCGA AGCAAAAATTACGGCTCCTCGCTGCAGACCTGCGAGC AGGGAAACGCTCCCCTCACAGACGCGTTGAATTGTCC CCACGCCGCGCCCCTGTAGAGAAATATAAAAGGTTAG GATTTGCCACTGAGGTTCTTCTTTCATATACTTCCTTTT AAAATCTTGCTAGGATACAGTTCTCACATCACATCCGA ACATAAACAACC 87 Ashbya gossypii TAATCAGTACTGACAATAAAAAGATTCTTGTTTTCAAG TEF1 AACTTGTCATTTGTATAGTTTTTTTATATTGTAGTTGTT termination CTATTTTAATCAAATGTTAGCGTGATTTATATTTTTTTT sequence CGCCTCGACATCATCTGCCCAGATGCGAAGTTAAGTG CGCAGAAAGTAATATCATGCGTCAATCGTATGTGAAT GCTGGTCGCTATACTGCTGTCGATTCGATACTAACGCC GCCATCCAGTGTCGAAAAC 88 Sequence of the AAATGCGTACCTCTTCTACGAGATTCAAGCGAATGAG PpPMA1 AATAATGTAATATGCAAGATCAGAAAGAATGAAAGG promoter: AGTTGAAAAAAAAAACCGTTGCGTTTTGACCTTGAAT GGGGTGGAGGTTTCCATTCAAAGTAAAGCCTGTGTCTT GGTATTTTCGGCGGCACAAGAAATCGTAATTTTCATCT TCTAAACGATGAAGATCGCAGCCCAACCTGTATGTAG TTAACCGGTCGGAATTATAAGAAAGATTTTCGATCAA CAAACCCTAGCAAATAGAAAGCAGGGTTACAACTTTA AACCGAAGTCACAAACGATAAACCACTCAGCTCCCAC CCAAATTCATTCCCACTAGCAGAAAGGAATTATTTAAT CCCTCAGGAAACCTCGATGATTCTCCCGTTCTTCCATG GGCGGGTATCGCAAAATGAGGAATTTTTCAAATTTCTC TATTGTCAAGACTGTTTATTATCTAAGAAATAGCCCAA TCCGAAGCTCAGTTTTGAAAAAATCACTTCCGCGTTTC TTTTTTACAGCCCGATGAATATCCAAATTTGGAATATG 
GATTACTCTATCGGGACTGCAGATAATATGACAACAA CGCAGATTACATTTTAGGTAAGGCATAAACACCAGCC AGAAATGAAACGCCCACTAGCCATGGTCGAATAGTCC AATGAATTCAGATAGCTATGGTCTAAAAGCTGATGTTT TTTATTGGGTAATGGCGAAGAGTCCAGTACGACTTCC AGCAGAGCTGAGATGGCCATTTTTGGGGGTATTAGTA ACTTTTTGAGCTCTTTTCACTTCGATGAAGTGTCCCATT CGGGATATAATCGGATCGCGTCGTTTTCTCGAAAATAC AGCTTAGCGTCGTCCGCTTGTTGTAAAAGCAGCACCA CATTCCTAATCTCTTATATAAACAAAACAACCCAAATT ATCAGTGCTGTTTTCCCACCAGATATAAGTTTCTTTTCT CTTCCGCTTTTTGATTTTTTATCTCTTTCCTTTAAAAAC TTCTTTACCTTAAAGGGCGGCC 89 Sequence of the GAAGGGCCATCGAATTGTCATCGTCTCCTCAGGTGCC 5'-region that ATCGCTGTGGGCATGAAGAGAGTCAACATGAAGCGGA was used to AACCAAAAAAGTTACAGCAAGTGCAGGCATTGGCTGC knock into the TATAGGACAAGGCCGTTTGATAGGACTTTGGGACGAC PpPRO1 locus: CTTTTCCGTCAGTTGAATCAGCCTATTGCGCAGATTTT ACTGACTAGAACGGATTTGGTCGATTACACCCAGTTTA AGAACGCTGAAAATACATTGGAACAGCTTATTAAAAT GGGTATTATTCCTATTGTCAATGAGAATGACACCCTAT CCATTCAAGAAATCAAATTTGGTGACAATGACACCTT ATCCGCCATAACAGCTGGTATGTGTCATGCAGACTAC CTGTTTTTGGTGACTGATGTGGACTGTCTTTACACGGA TAACCCTCGTACGAATCCGGACGCTGAGCCAATCGTG TTAGTTAGAAATATGAGGAATCTAAACGTCAATACCG AAAGTGGAGGTTCCGCCGTAGGAACAGGAGGAATGA CAACTAAATTGATCGCAGCTGATTTGGGTGTATCTGCA GGTGTTACAACGATTATTTGCAAAAGTGAACATCCCG AGCAGATTTTGGACATTGTAGAGTACAGTATCCGTGCT GATAGAGTCGAAAATGAGGCTAAATATCTGGTCATCA ACGAAGAGGAAACTGTGGAACAATTTCAAGAGATCAA TCGGTCAGAACTGAGGGAGTTGAACAAGCTGGACATT CCTTTGCATACACGTTTCGTTGGCCACAGTTTTAATGC TGTTAATAACAAAGAGTTTTGGTTACTCCATGGACTAA AGGCCAACGGAGCCATTATCATTGATCCAGGTTGTTAT AAGGCTATCACTAGAAAAAACAAAGCTGGTATTCTTC CAGCTGGAATTATTTCCGTAGAGGGTAATTTCCATGAA TACGAGTGTGTTGATGTTAAGGTAGGACTAAGAGATC CAGATGACCCACATTCACTAGACCCCAATGAAGAACT TTACGTCGTTGGCCGTGCCCGTTGTAATTACCCCAGCA ATCAAATCAACAAAATTAAGGGTCTACAAAGCTCGCA GATCGAGCAGGTTCTAGGTTACGCTGACGGTGAGTAT GTTGTTCACAGGGACAACTTGGCTTTCCCAGTATTTGC CGATCCAGAACTGTTGGATGTTGTTGAGAGTACCCTGT CTGAACAGGAGAGAGAATCCAAACCAAATAAATAG 90 Sequence of the AATTTCACATATGCTGCTTGATTATGTAATTATACCTT 3'-region that GCGTTCGATGGCATCGATTTCCTCTTCTGTCAATCGCG was used to CATCGCATTAAAAGTATACTTTTTTTTTTTTCCTATAGT knock 
into the ACTATTCGCCTTATTATAAACTTTGCTAGTATGAGTTC PpPRO1 locus: TACCCCCAAGAAAGAGCCTGATTTGACTCCTAAGAAG AGTCAGCCTCCAAAGAATAGTCTCGGTGGGGGTAAAG GCTTTAGTGAGGAGGGTTTCTCCCAAGGGGACTTCAG CGCTAAGCATATACTAAATCGTCGCCCTAACACCGAA GGCTCTTCTGTGGCTTCGAACGTCATCAGTTCGTCATC ATTGCAAAGGTTACCATCCTCTGGATCTGGAAGCGTTG CTGTGGGAAGTGTGTTGGGATCTTCGCCATTAACTCTT TCTGGAGGGTTCCACGGGCTTGATCCAACCAAGAATA AAATAGACGTTCCAAAGTCGAAACAGTCAAGGAGACA AAGTGTTCTTTCTGACATGATTTCCACTTCTCATGCAG CTAGAAATGATCACTCAGAGCAGCAGTTACAAACTGG ACAACAATCAGAACAAAAAGAAGAAGATGGTAGTCG ATCTTCTTTTTCTGTTTCTTCCCCCGCAAGAGATATCCG GCACCCAGATGTACTGAAAACTGTCGAGAAACATCTT GCCAATGACAGCGAGATCGACTCATCTTTACAACTTC AAGGTGGAGATGTCACTAGAGGCATTTATCAATGGGT AACTGGAGAAAGTAGTCAAAAAGATAACCCGCCTTTG AAACGAGCAAATAGTTTTAATGATTTTTCTTCTGTGCA TGGTGACGAGGTAGGCAAGGCAGATGCTGACCACGAT CGTGAAAGCGTATTCGACGAGGATGATATCTCCATTG ATGATATCAAAGTTCCGGGAGGGATGCGTCGAAGTTT TTTATTACAAAAGCATAGAGACCAACAACTTTCTGGA CTGAATAAAACGGCTCACCAACCAAAACAACTTACTA AACCTAATTTCTTCACGAACAACTTTATAGAGTTTTTG GCATTGTATGGGCATTTTGCAGGTGAAGATTTGGAGG AAGACGAAGATGAAGATTTAGACAGTGGTTCCGAATC AGTCGCAGTCAGTGATAGTGAGGGAGAATTCAGTGAG GCTGACAACAATTTGTTGTATGATGAAGAGTCTCTCCT ATTAGCACCTAGTACCTCCAACTATGCGAGATCAAGA ATAGGAAGTATTCGTACTCCTACTTATGGATCTTTCAG TTCAAATGTTGGTTCTTCGTCTATTCATCAGCAGTTAA TGAAAAGTCAAATCCCGAAGCTGAAGAAACGTGGACA GCACAAGCATAAAACACAATCAAAAATACGCTCGAAG AAGCAAACTACCACCGTAAAAGCAGTGTTGCTGCTAT TAAA 91 Sequence of the GGTTTCTCAATTACTATATACTACTAACCATTTACCTG PpTRP2 gene TAGCGTATTTCTTTTCCCTCTTCGCGAAAGCTCAAGGG integration CATCTTCTTGACTCATGAAAAATATCTGGATTTCTTCT locus: GACAGATCATCACCCTTGAGCCCAACTCTCTAGCCTAT GAGTGTAAGTGATAGTCATCTTGCAACAGATTATTTTG GAACGCAACTAACAAAGCAGATACACCCTTCAGCAGA ATCCTTTCTGGATATTGTGAAGAATGATCGCCAAAGTC ACAGTCCTGAGACAGTTCCTAATCTTTACCCCATTTAC AAGTTCATCCAATCAGACTTCTTAACGCCTCATCTGGC TTATATCAAGCTTACCAACAGTTCAGAAACTCCCAGTC CAAGTTTCTTGCTTGAAAGTGCGAAGAATGGTGACAC CGTTGACAGGTACACCTTTATGGGACATTCCCCCAGA AAAATAATCAAGACTGGGCCTTTAGAGGGTGCTGAAG TTGACCCCTTGGTGCTTCTGGAAAAAGAACTGAAGGG CACCAGACAAGCGCAACTTCCTGGTATTCCTCGTCTAA 
GTGGTGGTGCCATAGGATACATCTCGTACGATTGTATT AAGTACTTTGAACCAAAAACTGAAAGAAAACTGAAAG ATGTTTTGCAACTTCCGGAAGCAGCTTTGATGTTGTTC GACACGATCGTGGCTTTTGACAATGTTTATCAAAGATT CCAGGTAATTGGAAACGTTTCTCTATCCGTTGATGACT CGGACGAAGCTATTCTTGAGAAATATTATAAGACAAG AGAAGAAGTGGAAAAGATCAGTAAAGTGGTATTTGAC AATAAAACTGTTCCCTACTATGAACAGAAAGATATTA TTCAAGGCCAAACGTTCACCTCTAATATTGGTCAGGA AGGGTATGAAAACCATGTTCGCAAGCTGAAAGAACAT ATTCTGAAAGGAGACATCTTCCAAGCTGTTCCCTCTCA AAGGGTAGCCAGGCCGACCTCATTGCACCCTTTCAAC ATCTATCGTCATTTGAGAACTGTCAATCCTTCTCCATA CATGTTCTATATTGACTATCTAGACTTCCAAGTTGTTG GTGCTTCACCTGAATTACTAGTTAAATCCGACAACAAC AACAAAATCATCACACATCCTATTGCTGGAACTCTTCC CAGAGGTAAAACTATCGAAGAGGACGACAATTATGCT AAGCAATTGAAGTCGTCTTTGAAAGACAGGGCCGAGC ACGTCATGCTGGTAGATTTGGCCAGAAATGATATTAA CCGTGTGTGTGAGCCCACCAGTACCACGGTTGATCGTT TATTGACTGTGGAGAGATTTTCTCATGTGATGCATCTT GTGTCAGAAGTCAGTGGAACATTGAGACCAAACAAGA CTCGCTTCGATGCTTTCAGATCCATTTTCCCAGCAGGA ACCGTCTCCGGTGCTCCGAAGGTAAGAGCAATGCAAC TCATAGGAGAATTGGAAGGAGAAAAGAGAGGTGTTTA TGCGGGGGCCGTAGGACACTGGTCGTACGATGGAAAA TCGATGGACACATGTATTGCCTTAAGAACAATGGTCG TCAAGGACGGTGTCGCTTACCTTCAAGCCGGAGGTGG AATTGTCTACGATTCTGACCCCTATGACGAGTACATCG AAACCATGAACAAAATGAGATCCAACAATAACACCAT CTTGGAGGCTGAGAAAATCTGGACCGATAGGTTGGCC AGAGACGAGAATCAAAGTGAATCCGAAGAAAACGAT CAATGAACGGAGGACGTAAGTAGGAATTTATG 92 Human UDP- ATGGAAAAGAACGGTAACAACAGAAAGTTGAGAGTTT GlcNAc 2- GTGTTGCTACTTGTAACAGAGCTGACTACTCCAAGTTG epimerase/N- GCTCCAATCATGTTCGGTATCAAGACTGAGCCAGAGT acetylmannosamine TCTTCGAGTTGGACGTTGTTGTTTTGGGTTCCCACTTG kinase ATTGATGACTACGGTAACACTTACAGAATGATCGAGC (HsGNE) AGGACGACTTCGACATCAACACTAGATTGCACACTAT codon TGTTAGAGGAGAGGACGAAGCTGCTATGGTTGAATCT optimized GTTGGATTGGCTTTGGTTAAGTTGCCAGACGTTTTGAA CAGATTGAAGCCAGACATCATGATTGTTCACGGTGAC AGATTCGATGCTTTGGCTTTGGCTACTTCCGCTGCTTT GATGAACATTAGAATCTTGCACATCGAGGGTGGTGAA GTTTCTGGTACTATCGACGACTCCATCAGACACGCTAT CACTAAGTTGGCTCACTACCATGTTTGTTGTACTAGAT CCGCTGAGCAACACTTGATTTCCATGTGTGAGGACCA CGACAGAATTTTGTTGGCTGGTTGTCCATCTTACGACA AGTTGTTGTCCGCTAAGAACAAGGACTACATGTCCAT 
CATCAGAATGTGGTTGGGTGACGACGTTAAGTCTAAG GACTACATCGTTGCTTTGCAGCACCCAGTTACTACTGA CATCAAGCACTCCATCAAGATGTTCGAGTTGACTTTGG ACGCTTTGATCTCCTTCAACAAGAGAACTTTGGTTTTG TTCCCAAACATTGACGCTGGTTCCAAAGAGATGGTTA GAGTTATGAGAAAGAAGGGTATCGAACACCACCCAAA CTTCAGAGCTGTTAAGCACGTTCCATTCGACCAATTCA TCCAGTTGGTTGCTCATGCTGGTTGTATGATCGGTAAC TCCTCCTGTGGTGTTAGAGAAGTTGGTGCTTTCGGTAC TCCAGTTATCAACTTGGGTACTAGACAGATCGGTAGA GAGACTGGAGAAAACGTTTTGCATGTTAGAGATGCTG ACACTCAGGACAAGATTTTGCAGGCTTTGCACTTGCA ATTCGGAAAGCAGTACCCATGTTCCAAAATCTACGGT GACGGTAACGCTGTTCCAAGAATCTTGAAGTTTTTGAA GTCCATCGACTTGCAAGAGCCATTGCAGAAGAAGTTC TGTTTCCCACCAGTTAAGGAGAACATCTCCCAGGACA TTGACCACATCTTGGAGACATTGTCCGCTTTGGCTGTT GATTTGGGTGGAACTAACTTGAGAGTTGCTATCGTTTC CATGAAGGGAGAGATCGTTAAGAAGTACACTCAGTTC AACCCAAAGACTTACGAGGAGAGAATCAACTTGATCT TGCAGATGTGTGTTGAAGCTGCTGCTGAGGCTGTTAA GTTGAACTGTAGAATCTTGGGTGTTGGTATCTCTACTG GTGGTAGAGTTAATCCAAGAGAGGGTATCGTTTTGCA CTCCACTAAGTTGATTCAGGAGTGGAACTCCGTTGATT TGAGAACTCCATTGTCCGACACATTGCACTTGCCAGTT TGGGTTGACAACGACGGTAATTGTGCTGCTTTGGCTGA GAGAAAGTTCGGTCAAGGAAAGGGATTGGAGAACTTC GTTACTTTGATCACTGGTACTGGTATTGGTGGTGGTAT CATTCACCAGCACGAGTTGATTCACGGTTCTTCCTTCT GTGCTGCTGAATTGGGACACTTGGTTGTTTCTTTGGAC GGTCCAGACTGTTCTTGTGGTTCCCACGGTTGTATTGA AGCTTACGCATCAGGAATGGCATTGCAGAGAGAGGCT AAGAAGTTGCACGACGAGGACTTGTTGTTGGTTGAGG GAATGTCTGTTCCAAAGGACGAGGCTGTTGGTGCTTTG CATTTGATCCAGGCTGCTAAGTTGGGTAATGCTAAGG CTCAGTCCATCTTGAGAACTGCTGGTACTGCTTTGGGA TTGGGTGTTGTTAATATCTTGCACACTATGAACCCATC CTTGGTTATCTTGTCCGGTGTTTTGGCTTCTCACTACAT CCACATCGTTAAGGACGTTATCAGACAGCAAGCTTTG TCCTCCGTTCAAGACGTTGATGTTGTTGTTTCCGACTT GGTTGACCCAGCTTTGTTGGGTGCTGCTTCCATGGTTT TGGACTACACTACTAGAAGAATCTACTAATAG 93 Sequence of the CAGTTGAGCCAGACCGCGCTAAACGCATACCAATTGC PpARG1 CAAATCAGGCAATTGTGAGACAGTGGTAAAAAAGATG
auxotrophic CCTGCAAAGTTAGATTCACACAGTAAGAGAGATCCTA marker: CTCATAAATGAGGCGCTTATTTAGTAGCTAGTGATAGC CACTGCGGTTCTGCTTTATGCTATTTGTTGTATGCCTTA CTATCTTTGTTTGGCTCCTTTTTCTTGACGTTTTCCGTT GGAGGGACTCCCTATTCTGAGTCATGAGCCGCACAGA TTATCGCCCAAAATTGACAAAATCTTCTGGCGAAAAA AGTATAAAAGGAGAAAAAAGCTCACCCTTTTCCAGCG TAGAAAGTATATATCAGTCATTGAAGACTATTATTTAA ATAACACAATGTCTAAAGGAAAAGTTTGTTTGGCCTA CTCCGGTGGTTTGGATACCTCCATCATCCTAGCTTGGT TGTTGGAGCAGGGATACGAAGTCGTTGCCTTTTTAGCC AACATTGGTCAAGAGGAAGACTTTGAGGCTGCTAGAG AGAAAGCTCTGAAGATCGGTGCTACCAAGTTTATCGT CAGTGACGTTAGGAAGGAATTTGTTGAGGAAGTTTTG TTCCCAGCAGTCCAAGTTAACGCTATCTACGAGAACG TCTACTTACTGGGTACCTCTTTGGCCAGACCAGTCATT GCCAAGGCCCAAATAGAGGTTGCTGAACAAGAAGGTT GTTTTGCTGTTGCCCACGGTTGTACCGGAAAGGGTAAC GATCAGGTTAGATTTGAGCTTTCCTTTTATGCTCTGAA GCCTGACGTTGTCTGTATCGCCCCATGGAGAGACCCA GAATTCTTCGAAAGATTCGCTGGTAGAAATGACTTGCT GAATTACGCTGCTGAGAAGGATATTCCAGTTGCTCAG ACTAAAGCCAAGCCATGGTCTACTGATGAGAACATGG CTCACATCTCCTTCGAGGCTGGTATTCTAGAAGATCCA AACACTACTCCTCCAAAGGACATGTGGAAGCTCACTG TTGACCCAGAAGATGCACCAGACAAGCCAGAGTTCTT TGACGTCCACTTTGAGAAGGGTAAGCCAGTTAAATTA GTTCTCGAGAACAAAACTGAGGTCACCGATCCGGTTG AGATCTTTTTGACTGCTAACGCCATTGCTAGAAGAAAC GGTGTTGGTAGAATTGACATTGTCGAGAACAGATTCA TCGGAATCAAGTCCAGAGGTTGTTATGAAACTCCAGG TTTGACTCTACTGAGAACCACTCACATCGACTTGGAAG GTCTTACCGTTGACCGTGAAGTTAGATCGATCAGAGA CACTTTTGTTACCCCAACCTACTCTAAGTTGTTATACA ACGGGTTGTACTTTACCCCAGAAGGTGAGTACGTCAG AACTATGATTCAGCCTTCTCAAAACACCGTCAACGGT GTTGTTAGAGCCAAGGCCTACAAAGGTAATGTGTATA ACCTAGGAAGATACTCTGAAACCGAGAAATTGTACGA TGCTACCGAATCTTCCATGGATGAGTTGACCGGATTCC ACCCTCAAGAAGCTGGAGGATTTATCACAACACAAGC CATCAGAATCAAGAAGTACGGAGAAAGTGTCAGAGA GAAGGGAAAGTTTTTGGGACTTTAACTCAAGTAAAAG GATAGTTGTACAATTATATATACGAAGAATAAATCAT TACAAAAAGTATTCGTTTCTTTGATTCTTAACAGGATT CATTTTCTGGGTGTCATCAGGTACAGCGCTGAATATCT TGAAGTTAACATCGAGCTCATCATCGACGTTCATCACA CTAGCCACGTTTCCGCAACGGTAGCAATAATTAGGAG CGGACCACACAGTGACGACATC 94 Human CMP- ATGGACTCTGTTGAAAAGGGTGCTGCTACTTCTGTTTC sialic acid CAACCCAAGAGGTAGACCATCCAGAGGTAGACCTCCT synthase AAGTTGCAGAGAAACTCCAGAGGTGGTCAAGGTAGAG 
(HsCSS) codon GTGTTGAAAAGCCACCACACTTGGCTGCTTTGATCTTG optimized GCTAGAGGAGGTTCTAAGGGTATCCCATTGAAGAACA TCAAGCACTTGGCTGGTGTTCCATTGATTGGATGGGTT TTGAGAGCTGCTTTGGACTCTGGTGCTTTCCAATCTGT TTGGGTTTCCACTGACCACGACGAGATTGAGAACGTT GCTAAGCAATTCGGTGCTCAGGTTCACAGAAGATCCT CTGAGGTTTCCAAGGACTCTTCTACTTCCTTGGACGCT ATCATCGAGTTCTTGAACTACCACAACGAGGTTGACA TCGTTGGTAACATCCAAGCTACTTCCCCATGTTTGCAC CCAACTGACTTGCAAAAAGTTGCTGAGATGATCAGAG AAGAGGGTTACGACTCCGTTTTCTCCGTTGTTAGAAGG CACCAGTTCAGATGGTCCGAGATTCAGAAGGGTGTTA GAGAGGTTACAGAGCCATTGAACTTGAACCCAGCTAA AAGACCAAGAAGGCAGGATTGGGACGGTGAATTGTAC GAAAACGGTTCCTTCTACTTCGCTAAGAGACACTTGAT CGAGATGGGATACTTGCAAGGTGGAAAGATGGCTTAC TACGAGATGAGAGCTGAACACTCCGTTGACATCGACG TTGATATCGACTGGCCAATTGCTGAGCAGAGAGTTTTG AGATACGGTTACTTCGGAAAGGAGAAGTTGAAGGAGA TCAAGTTGTTGGTTTGTAACATCGACGGTTGTTTGACT AACGGTCACATCTACGTTTCTGGTGACCAGAAGGAGA TTATCTCCTACGACGTTAAGGACGCTATTGGTATCTCC TTGTTGAAGAAGTCCGGTATCGAAGTTAGATTGATCTC CGAGAGAGCTTGTTCCAAGCAAACATTGTCCTCTTTGA AGTTGGACTGTAAGATGGAGGTTTCCGTTTCTGACAA GTTGGCTGTTGTTGACGAATGGAGAAAGGAGATGGGT TTGTGTTGGAAGGAAGTTGCTTACTTGGGTAACGAAG TTTCTGACGAGGAGTGTTTGAAGAGAGTTGGTTTGTCT GGTGCTCCAGCTGATGCTTGTTCCACTGCTCAAAAGGC TGTTGGTTACATCTGTAAGTGTAACGGTGGTAGAGGT GCTATTAGAGAGTTCGCTGAGCACATCTGTTTGTTGAT GGAGAAAGTTAATAACTCCTGTCAGAAGTAGTAG 95 Human N- ATGCCATTGGAATTGGAGTTGTGTCCTGGTAGATGGGT acetylneuraminate- TGGTGGTCAACACCCATGTTTCATCATCGCTGAGATCG 9-phosphate GTCAAAACCACCAAGGAGACTTGGACGTTGCTAAGAG synthase AATGATCAGAATGGCTAAGGAATGTGGTGCTGACTGT (HsSPS) codon GCTAAGTTCCAGAAGTCCGAGTTGGAGTTCAAGTTCA optimized ACAGAAAGGCTTTGGAAAGACCATACACTTCCAAGCA CTCTTGGGGAAAGACTTACGGAGAACACAAGAGACAC TTGGAGTTCTCTCACGACCAATACAGAGAGTTGCAGA GATACGCTGAGGAAGTTGGTATCTTCTTCACTGCTTCT GGAATGGACGAAATGGCTGTTGAGTTCTTGCACGAGT TGAACGTTCCATTCTTCAAAGTTGGTTCCGGTGACACT AACAACTTCCCATACTTGGAAAAGACTGCTAAGAAAG GTAGACCAATGGTTATCTCCTCTGGAATGCAGTCTATG GACACTATGAAGCAGGTTTACCAGATCGTTAAGCCAT TGAACCCAAACTTTTGTTTCTTGCAGTGTACTTCCGCT TACCCATTGCAACCAGAGGACGTTAATTTGAGAGTTA TCTCCGAGTACCAGAAGTTGTTCCCAGACATCCCAATT 
GGTTACTCTGGTCACGAGACTGGTATTGCTATTTCCGT TGCTGCTGTTGCTTTGGGTGCTAAGGTTTTGGAGAGAC ACATCACTTTGGACAAGACTTGGAAGGGTTCTGATCA CTCTGCTTCTTTGGAACCTGGTGAGTTGGCTGAACTTG TTAGATCAGTTAGATTGGTTGAGAGAGCTTTGGGTTCC CCAACTAAGCAATTGTTGCCATGTGAGATGGCTTGTA ACGAGAAGTTGGGAAAGTCCGTTGTTGCTAAGGTTAA GATCCCAGAGGGTACTATCTTGACTATGGACATGTTG ACTGTTAAAGTTGGAGAGCCAAAGGGTTACCCACCAG AGGACATCTTTAACTTGGTTGGTAAAAAGGTTTTGGTT ACTGTTGAGGAGGACGACACTATTATGGAGGAGTTGG TTGACAACCACGGAAAGAAGATCAAGTCCTAG 96 Mouse alpha- GTTTTTCAAATGCCAAAGTCCCAGGAGAAAGTTGCTG 2,6-sialyl TTGGTCCAGCTCCACAAGCTGTTTTCTCCAACTCCAAG transferase CAAGATCCAAAGGAGGGTGTTCAAATCTTGTCCTACC catalytic domain CAAGAGTTACTGCTAAGGTTAAGCCACAACCATCCTT (MmmST6) GCAAGTTTGGGACAAGGACTCCACTTACTCCAAGTTG codon optimized AACCCAAGATTGTTGAAGATTTGGAGAAACTACTTGA ACATGAACAAGTACAAGGTTTCCTACAAGGGTCCAGG TCCAGGTGTTAAGTTCTCCGTTGAGGCTTTGAGATGTC ACTTGAGAGACCACGTTAACGTTTCCATGATCGAGGC TACTGACTTCCCATTCAACACTACTGAATGGGAGGGA TACTTGCCAAAGGAGAACTTCAGAACTAAGGCTGGTC CATGGCATAAGTGTGCTGTTGTTTCTTCTGCTGGTTCC TTGAAGAACTCCCAGTTGGGTAGAGAAATTGACAACC ACGACGCTGTTTTGAGATTCAACGGTGCTCCAACTGAC AACTTCCAGCAGGATGTTGGTACTAAGACTACTATCA GATTGGTTAACTCCCAATTGGTTACTACTGAGAAGAG ATTCTTGAAGGACTCCTTGTACACTGAGGGAATCTTGA TTTTGTGGGACCCATCTGTTTACCACGCTGACATTCCA CAATGGTATCAGAAGCCAGACTACAACTTCTTCGAGA CTTACAAGTCCTACAGAAGATTGCACCCATCCCAGCC ATTCTACATCTTGAAGCCACAAATGCCATGGGAATTGT GGGACATCATCCAGGAAATTTCCCCAGACTTGATCCA ACCAAACCCACCATCTTCTGGAATGTTGGGTATCATCA TCATGATGACTTTGTGTGACCAGGTTGACATCTACGAG TTCTTGCCATCCAAGAGAAAGACTGATGTTTGTTACTA CCACCAGAAGTTCTTCGACTCCGCTTGTACTATGGGAG CTTACCACCCATTGTTGTTCGAGAAGAACATGGTTAAG CACTTGAACGAAGGTACTGACGAGGACATCTACTTGT TCGGAAAGGCTACTTTGTCCGGTTTCAGAAACAACAG ATGTTAG 97 Human UDP- ATGGAAAAGAACGGTAACAACAGAAAGTTGAGAGTTT GlcNAc 2- GTGTTGCTACTTGTAACAGAGCTGACTACTCCAAGTTG epimerase/N- GCTCCAATCATGTTCGGTATCAAGACTGAGCCAGAGT acetylmannosamine TCTTCGAGTTGGACGTTGTTGTTTTGGGTTCCCACTTG kinase ATTGATGACTACGGTAACACTTACAGAATGATCGAGC (HsGNE) AGGACGACTTCGACATCAACACTAGATTGCACACTAT codon 
TGTTAGAGGAGAGGACGAAGCTGCTATGGTTGAATCT opitimized GTTGGATTGGCTTTGGTTAAGTTGCCAGACGTTTTGAA CAGATTGAAGCCAGACATCATGATTGTTCACGGTGAC AGATTCGATGCTTTGGCTTTGGCTACTTCCGCTGCTTT GATGAACATTAGAATCTTGCACATCGAGGGTGGTGAA GTTTCTGGTACTATCGACGACTCCATCAGACACGCTAT CACTAAGTTGGCTCACTACCATGTTTGTTGTACTAGAT CCGCTGAGCAACACTTGATTTCCATGTGTGAGGACCA CGACAGAATTTTGTTGGCTGGTTGTCCATCTTACGACA AGTTGTTGTCCGCTAAGAACAAGGACTACATGTCCAT CATCAGAATGTGGTTGGGTGACGACGTTAAGTCTAAG GACTACATCGTTGCTTTGCAGCACCCAGTTACTACTGA CATCAAGCACTCCATCAAGATGTTCGAGTTGACTTTGG ACGCTTTGATCTCCTTCAACAAGAGAACTTTGGTTTTG TTCCCAAACATTGACGCTGGTTCCAAAGAGATGGTTA GAGTTATGAGAAAGAAGGGTATCGAACACCACCCAAA CTTCAGAGCTGTTAAGCACGTTCCATTCGACCAATTCA TCCAGTTGGTTGCTCATGCTGGTTGTATGATCGGTAAC TCCTCCTGTGGTGTTAGAGAAGTTGGTGCTTTCGGTAC TCCAGTTATCAACTTGGGTACTAGACAGATCGGTAGA GAGACTGGAGAAAACGTTTTGCATGTTAGAGATGCTG ACACTCAGGACAAGATTTTGCAGGCTTTGCACTTGCA ATTCGGAAAGCAGTACCCATGTTCCAAAATCTACGGT GACGGTAACGCTGTTCCAAGAATCTTGAAGTTTTTGAA GTCCATCGACTTGCAAGAGCCATTGCAGAAGAAGTTC TGTTTCCCACCAGTTAAGGAGAACATCTCCCAGGACA TTGACCACATCTTGGAGACATTGTCCGCTTTGGCTGTT GATTTGGGTGGAACTAACTTGAGAGTTGCTATCGTTTC CATGAAGGGAGAGATCGTTAAGAAGTACACTCAGTTC AACCCAAAGACTTACGAGGAGAGAATCAACTTGATCT TGCAGATGTGTGTTGAAGCTGCTGCTGAGGCTGTTAA GTTGAACTGTAGAATCTTGGGTGTTGGTATCTCTACTG GTGGTAGAGTTAATCCAAGAGAGGGTATCGTTTTGCA CTCCACTAAGTTGATTCAGGAGTGGAACTCCGTTGATT TGAGAACTCCATTGTCCGACACATTGCACTTGCCAGTT TGGGTTGACAACGACGGTAATTGTGCTGCTTTGGCTGA GAGAAAGTTCGGTCAAGGAAAGGGATTGGAGAACTTC GTTACTTTGATCACTGGTACTGGTATTGGTGGTGGTAT CATTCACCAGCACGAGTTGATTCACGGTTCTTCCTTCT GTGCTGCTGAATTGGGACACTTGGTTGTTTCTTTGGAC GGTCCAGACTGTTCTTGTGGTTCCCACGGTTGTATTGA AGCTTACGCATCAGGAATGGCATTGCAGAGAGAGGCT AAGAAGTTGCACGACGAGGACTTGTTGTTGGTTGAGG GAATGTCTGTTCCAAAGGACGAGGCTGTTGGTGCTTTG CATTTGATCCAGGCTGCTAAGTTGGGTAATGCTAAGG CTCAGTCCATCTTGAGAACTGCTGGTACTGCTTTGGGA TTGGGTGTTGTTAATATCTTGCACACTATGAACCCATC CTTGGTTATCTTGTCCGGTGTTTTGGCTTCTCACTACAT CCACATCGTTAAGGACGTTATCAGACAGCAAGCTTTG TCCTCCGTTCAAGACGTTGATGTTGTTGTTTCCGACTT GGTTGACCCAGCTTTGTTGGGTGCTGCTTCCATGGTTT 
TGGACTACACTACTAGAAGAATCTACTAATAG 98 Pp TRP2: 5' and ACTGGGCCTTTAGAGGGTGCTGAAGTTGACCCCTTGGT ORF GCTTCTGGAAAAAGAACTGAAGGGCACCAGACAAGC GCAACTTCCTGGTATTCCTCGTCTAAGTGGTGGTGCCA TAGGATACATCTCGTACGATTGTATTAAGTACTTTGAA CCAAAAACTGAAAGAAAACTGAAAGATGTTTTGCAAC TTCCGGAAGCAGCTTTGATGTTGTTCGACACGATCGTG GCTTTTGACAATGTTTATCAAAGATTCCAGGTAATTGG AAACGTTTCTCTATCCGTTGATGACTCGGACGAAGCTA TTCTTGAGAAATATTATAAGACAAGAGAAGAAGTGGA AAAGATCAGTAAAGTGGTATTTGACAATAAAACTGTT CCCTACTATGAACAGAAAGATATTATTCAAGGCCAAA CGTTCACCTCTAATATTGGTCAGGAAGGGTATGAAAA CCATGTTCGCAAGCTGAAAGAACATATTCTGAAAGGA GACATCTTCCAAGCTGTTCCCTCTCAAAGGGTAGCCAG GCCGACCTCATTGCACCCTTTCAACATCTATCGTCATT TGAGAACTGTCAATCCTTCTCCATACATGTTCTATATT GACTATCTAGACTTCCAAGTTGTTGGTGCTTCACCTGA ATTACTAGTTAAATCCGACAACAACAACAAAATCATC ACACATCCTATTGCTGGAACTCTTCCCAGAGGTAAAA CTATCGAAGAGGACGACAATTATGCTAAGCAATTGAA GTCGTCTTTGAAAGACAGGGCCGAGCACGTCATGCTG GTAGATTTGGCCAGAAATGATATTAACCGTGTGTCTG AGCCCACCAGTACCACGGTTGATCGTTTATTGACTGTG GAGAGATTTTCTCATGTGATGCATCTTGTGTCAGAAGT CAGTGGAACATTGAGACCAAACAAGACTCGCTTCGAT GCTTTCAGATCCATTTTCCCAGCAGGTACCGTCTCCGG TGCTCCGAAGGTAAGAGCAATGCAACTCATAGGAGAA TTGGAAGGAGAAAAGAGAGGTGTTTATGCGGGGGCCG TAGGACACTGGTCGTACGATGGAAAATCGATGGACAC ATGTATTGCCTTAAGAACAATGGTCGTCAAGGACGGT GTCGCTTACCTTCAAGCCGGAGGTGGAATTGTCTACG ATTCTGACCCCTATGACGAGTACATCGAAACCATGAA CAAAATGAGATCCAACAATAACACCATCTTGGAGGCT GAGAAAATCTGGACCGATAGGTTGGCCAGAGACGAG AATCAAAGTGAATCCGAAGAAAACGATCAATGA 99 PpTRP2 3' ACGGAGGACGTAAGTAGGAATTTATGTAATCATGCCA region ATACATCTTTAGATTTCTTCCTCTTCTTTTTAACGAAAG ACCTCCAGTTTTGCACTCTCGACTCTCTAGTATCTTCCC ATTTCTGTTGCTGCAACCTCTTGCCTTCTGTTTCCTTCA ATTGTTCTTCTTTCTTCTGTTGCACTTGGCCTTCTTCCT CCATCTTTCGTTTTTTTTCAAGCCTTTTCAGCAGTTCTT CTTCCAAGAGCAGTTCTTTGATTTTCTCTCTCCAATCC ACCAAAAAACTGGATGAATTCAACCGGGCATCATCAA TGTTCCACTTTCTTTCTCTTATCAATAATCTACGTGCTT CGGCATACGAGGAATCCAGTTGCTCCCTAATCGAGTC
ATCCACAAGGTTAGCATGGGCCTTTTTCAGGGTGTCAA AAGCATCTGGAGCTCGTTTATTCGGAGTCTTGTCTGGA TGGATCAGCAAAGACTTTTTGCGGAAAGTCTTTCTTAT ATCTTCCGGAGAACAACCTGGTTTCAAATCCAAGATG GCATAGCTGTCCAATTTGAAAGTGGAAAGAATCCTGC CAATTTCCTTCTCTCGTGTCAGCTCGTTCTCCTCCTTTT GCAACAGGTCCACTTCATCTGGCATTTTTCTTTATGTT AACTTTAATTATTATTAATTATAAAGTTGATTATCGTT ATCAAAATAATCATATTCGAGAAATAATCCGTCCATG CAATATATAAATAAGAATTCATAATAATGTAATGATA ACAGTACCTCTGATGACCTTTGATGAACCGCAATTTTC TTTCCAATGACAAGACATCCCTATAATACAATTATACA GTTTATATATCACAAATAATCACCTTTTTATAAGAAAA CCGTCCTCTCCGTAACAGAACTTATTATCCGCACGTTA TGGTTAACACACTACTAATACCGATATAGTGTATGAA GTCGCTACGAGATAGCCATCCAGGAAACTTACCAATT CATCAGCACTTTCATGATCCGATTGTTGGCTTTATTCTT TGCGAGACAGATACTTGCCAATGAAATAACTGATCCC ACAGATGAGAATCCGGTGCTCGT 100 Sequence of the TTGGGGGCCTCCAGGACTTGCTGAAATTTGCTGACTCA 5'-Region used TCTTCGCCATCCAAGGATAATGAGTTAGCTAATGTGAC for knock out of AGTTAATGAGTCGTCTTGACTAACGGGGAACATTTCAT STE13 TATTTATATCCAGAGTCAATTTGATAGCAGAGTTTGTG GTTGAAATACCTATGATTCGGGAGACTTTGTTGTAACG ACCATTATCCACAGTTTGGACCGTGAAAATGTCATCG AAGAGAGCAGACGACATATTATCTATTGTGGTAAGTG ATAGTTGGAAGTCCGACTAAGGCATGAAAATGAGAAG ACTGAAAATTTAAAGTTTTTGAAAACACTAATCGGGT AATAACTTGGAAATTACGTTTACGTGCCTTTAGCTCTT GTCCTTACCCCTGATAATCTATCCATTTCCCGAGAGAC AATGACATCTCGGACAGCTGAGAACCCGTTCGATATA GAGCTTCAAGAGAATCTAAGTCCACGTTCTTCCAATTC GTCCATATTGGAAAACATTAATGAGTATGCTAGAAGA CATCGCAATGATTCGCTTTCCCAAGAATGTGATAATGA AGATGAGAACGAAAATCTCAATTATACTGATAACTTG GCCAAGTTTTCAAAGTCTGGAGTATCAAGAAAGAGCT GTATGCTAATATTTGGTATTTGCTTTGTTATCTGGCTGT TTCTCTTTGCCTTGTATGCGAGGGACAATCGATTTTCC AATTTGAACGAGTACGTTCCAGATTCAAACAG 101 Sequence of the CTACTGGGAACCACGAGACATCACTGCAGTAGTTTCC 3'-Region used AAGTGGATTTCAGATCACTCATTTGTGAATCCTGACAA for knock out of AACTGCGATATGGGGGTGGTCTTACGGTGGGTTCACT STE13 ACGCTTAAGACATTGGAATATGATTCTGGAGAGGTTTT CAAATATGGTATGGCTGTTGCTCCAGTAACTAATTGGC TTTTGTATGACTCCATCTACACTGAAAGATACATGAAC CTTCCAAAGGACAATGTTGAAGGCTACAGTGAACACA GCGTCATTAAGAAGGTTTCCAATTTTAAGAATGTAAA CCGATTCTTGGTTTGTCACGGGACTACTGATGATAACG TGCATTTTCAGAACACACTAACCTTACTGGACCAGTTC 
AATATTAATGGTGTTGTGAATTACGATCTTCAGGTGTA TCCCGACAGTGAACATAGCATTGCCCATCACAACGCA AATAAAGTGATCTACGAGAGGTTATTCAAGTGGTTAG AGCGGGCATTTAACGATAGATTTTTGTAACATTCCGTA CTTCATGCCATACTATATATCCTGCAAGGTTTCCCTTT CAGACACAATAATTGCTTTGCAATTTTACATACCACCA ATTGGCAAAAATAATCTCTTCAGTAAGTTGAATGCTTT TCAAGCCAGCACCGTGAGAAATTGCTACAGCGCGCAT TCTAACATCACTTTAAAATTCCCTCGCCGGTGCTCACT GGAGTTTCCAACCCTTAGCTTATCAAAATCGGGTGATA ACTCTGAGTTTTTTTTTTCACTTCTATTCCTAAACCTTC GCCCAATGCTACCACCTCCAATCAACATCCCGAAATG GATAGAAGAGAATGGACATCTCTTGCAACCTCCGGTT AATAATTACTGTCTCCACAGAGGAGGATTTACGGTAA TGATTGTAGGTGGGCCTAATG 102 Sequence of the CACCTGGGCCTGTTGCTGCTGGTACTGCTGTTGGAACT 5'-Region used GTTGGTATTGTTGCTGATCTAAGGCCGCCTGTTCCACA for knock out of CCGTGTGTATCGAATGCTTGGGCAAAATCATCGCCTGC DAP2 CGGAGGCCCCACTACCGCTTGTTCCTCCTGCTCTTGTT TGTTTTGCTCATTGATGATATCGGCGTCAATGAATTGA TCCTCAATCGTGTGGTGGTGGTGTCGTGATTCCTCTTC TTTCTTGAGTGCCTTATCCATATTCCTATCTTAGTGTAC CAATAATTTTGTTAAACACACGCTGTTGTTTATGAAAA GTCGTCAAAAGGTTAAAAATTCTACTTGGTGTGTGTCA GAGAAAGTAGTGCAGACCCCCAGTTTGTTGACTAGTT GAGAAGGCGGCTCACTATTGCGCGAATAGCATGAGAA ATTTGCAAACATCTGGCAAAGTGGTCAATACCTGCCA ACCTGCCAATCTTCGCGACGGAGGCTGTTAAGCGGGT TGGGTTCCCAAAGTGAATGGATATTACGGGCAGGAAA AACAGCCCCTTCCACACTAGTCTTTGCTACTGACATCT TCCCTCTCATGTATCCCGAACACAAGTATCGGGAGTAT CAACGGAGGGTGCCCTTATGGCAGTACTCCCTGTTGGT GATTGTACTGCTATACGGGTCTCATTTGCTTATCAGCA CCATCAACTTGATACACTATAACCACAAAAATTATCAT GCACACCCAGTCAATAGTGGTATCGTTCTTAATGAGTT TGCTGATGACGATTCATTCTCTTTGAATGGCACTCTGA ACTTGGAGAACTGGAGAAATGGTACCTTTTCCCCTAA ATTTCATTCCATTCAGTGGACCGAAATAGGTCAGGAA GATGACCAGGGATATTACATTCTCTCTTCCAATTCCTC TTACATAGTAAAGTCTTTATCCGACCCAGACTTTGAAT CTGTTCTATTCAACGAGTCTACAATCACTTACAACG 103 Sequence of the GGCAGCAAAGCCTTACGTTGATGAGAATAGACTGGCC 3'-Region used ATTTGGGGTTGGTCTTATGGAGGTTACATGACGCTAAA for knock out of GGTTTTAGAACAGGATAAAGGTGAAACATTCAAATAT DAP2 GGAATGTCTGTTGCCCCTGTGACGAATTGGAAATTCTA TGATTCTATCTACACAGAAAGATACATGCACACTCCTC AGGACAATCCAAACTATTATAATTCGTCAATCCATGA GATTGATAATTTGAAGGGAGTGAAGAGGTTCTTGCTA ATGCACGGAACTGGTGACGACAATGTTCACTTCCAAA 
ATACACTCAAAGTTCTAGATTTATTTGATTTACATGGT CTTGAAAACTATGATATCCACGTGTTCCCTGATAGTGA TCACAGTATTAGATATCACAACGGTAATGTTATAGTGT ATGATAAGCTATTCCATTGGATTAGGCGTGCATTCAAG GCTGGCAAATAAATAGGTGCAAAAATATTATTAGACT TTTTTTTTCGTTCGCAAGTTATTACTGTGTACCATACCG ATCCAATCCGTATTGTAATTCATGTTCTAGATCCAAAA TTTGGGACTCTAATTCATGAGGTCTAGGAAGATGATC ATCTCTATAGTTTTCAGCGGGGGGCTCGATTTGCGGTT GGTCAAAGCTAACATCAAAATGTTTGTCAGGTTCAGT GAATGGTAACTGCTGCTCTTGAATTGGTCGTCTGACAA ATTCTCTAAGTGATAGCACTTCATCTACAATCATTTGC TTCATCGTTTCTATATCGTCCACGACCTCAAACGAGAA ATCGAATTTGGAAGAACAGACGGGCTCATCGTTAGGA TCATGCCAAACCTTGAGATATGGATGCTCTAAAGCCTC AGTAACTGTAATTCTGTGAGTGGGATCTACCGTGAGC ATTCGATCCAGTAAGTCTATCGCTTCAGGGTTGGCACC GGGAAATAACTGGCTGAATGGGATCTTGGGCATGAAT GGCAGGGAGCGAACATAATCCTGGGCACGCTCTGATC TGATAGACTGAAGTGTCTCTTCCGAAACAGTACCCAG CGTACTCAAAATCAAGTTCAATTGATCCACATAGTCTC TTCCTCTAAAAATGGGTCGGCCACCTA 104 HYGR resistance GATCTGTTTAGCTTGCCTCGTCCCCGCCGGGTCACCCG cassette GCCAGCGACATGGAGGCCCAGAATACCCTCCTTGACA GTCTTGACGTGCGCAGCTCAGGGGCATGATGTGACTG TCGCCCGTACATTTAGCCCATACATCCCCATGTATAAT CATTTGCATCCATACATTTTGATGGCCGCACGGCGCGA AGCAAAAATTACGGCTCCTCGCTGCGGACCTGCGAGC AGGGAAACGCTCCCCTCACAGACGCGTTGAATTGTCC CCACGCCGCGCCCCTGTAGAGAAATATAAAAGGTTAG GATTTGCCACTGAGGTTCTTCTTTCATATACTTCCTTTT AAAATCTTGCTAGGATACAGTTCTCACATCACATCCGA ACATAAACAACCATGGGTAAAAAGCCTGAACTCACCG CGACGTCTGTCGAGAAGTTTCTGATCGAAAAGTTCGA CAGCGTCTCCGACCTGATGCAGCTCTCGGAGGGCGAA GAATCTCGTGCTTTCAGCTTCGATGTAGGAGGGCGTG GATATGTCCTGCGGGTAAATAGCTGCGCCGATGGTTTC TACAAAGATCGTTATGTTTATCGGCACTTTGCATCGGC CGCGCTCCCGATTCCGGAAGTGCTTGACATTGGGGAA TTCAGCGAGAGCCTGACCTATTGCATCTCCCGCCGTGC ACAGGGTGTCACGTTGCAAGACCTGCCTGAAACCGAA CTGCCCGCTGTTCTGCAGCCGGTCGCGGAGGCCATGG ATGCGATCGCTGCGGCCGATCTTAGCCAGACGAGCGG GTTCGGCCCATTCGGACCGCAAGGAATCGGTCAATAC ACTACATGGCGTGATTTCATATGCGCGATTGCTGATCC CCATGTGTATCACTGGCAAACTGTGATGGACGACACC GTCAGTGCGTCCGTCGCGCAGGCTCTCGATGAGCTGA TGCTTTGGGCCGAGGACTGCCCCGAAGTCCGGCACCT CGTGCACGCGGATTTCGGCTCCAACAATGTCCTGACG GACAATGGCCGCATAACAGCGGTCATTGACTGGAGCG AGGCGATGTTCGGGGATTCCCAATACGAGGTCGCCAA 
CATCTTCTTCTGGAGGCCGTGGTTGGCTTGTATGGAGC AGCAGACGCGCTACTTCGAGCGGAGGCATCCGGAGCT TGCAGGATCGCCGCGGCTCCGGGCGTATATGCTCCGC ATTGGTCTTGACCAACTCTATCAGAGCTTGGTTGACGG CAATTTCGATGATGCAGCTTGGGCGCAGGGTCGATGC GACGCAATCGTCCGATCCGGAGCCGGGACTGTCGGGC GTACACAAATCGCCCGCAGAAGCGCGGCCGTCTGGAC CGATGGCTGTGTAGAAGTACTCGCCGATAGTGGAAAC CGACGCCCCAGCACTCGTCCGAGGGCAAAGGAATAAT CAGTACTGACAATAAAAAGATTCTTGTTTTCAAGAACT TGTCATTTGTATAGTTTTTTTATATTGTAGTTGTTCTAT TTTAATCAAATGTTAGCGTGATTTATATTTTTTTTCGCC TCGACATCATCTGCCCAGATGCGAAGTTAAGTGCGCA GAAAGTAATATCATGCGTCAATCGTATGTGAATGCTG GTCGCTATACTGCTGTCGATTCGATACTAACGCCGCCA TCCAGTGTCGAAAACGAGCT 105 Sequence of ACGACGGCCAAATTCATGATACACACTCTGTTTCAGCT PpTRP5 5' GGTTTGGACTACCCTGGAGTTGGTCCTGAATTGGCTGC integration CTGGAAAGCAAATGGTAGAGCCCAATTTTCCGCTGTA fragment ACTGATGCCCAAGCATTAGAGGGATTCAAAATCCTGT CTCAATTGGAAGGGATCATTCCAGCACTAGAGTCTAG TCATGCAATCTACGGCGCATTGCAAATTGCAAAGACT ATGTCTTCGGACCAGTCCTTAGTTATTAATGTATCTGG AAGGGGTGATAAGGACGTCCAGAGTGTAGCTGAGATT TTACCTAAATTGGGACCTCAAATTGGATGGGATTTGCG TTTCAGCGAAGACATTACTAAAGAGTGA 106 Sequence of TCGATAGCACAATATTCAACTTGACTGGGTGTTAAGA PpTRP5 3' ACTAAGAGCTCTGGGAAACTTTGTATTTATTACTACCA integration ACACAGTCAAATTATTGGATGTGTTTTTTTTTCCAGTA fragment CATTTCACTGAGCAGTTTGTTATACTCGGTCTTTAATC TCCATATACATGCAGATTGTAATACAGATCTGAACAG TTTGATTCTGATTGATCTTGCCACCAATATTCTATTTTT GTATCAAGTAACAGAGTCAATGATCATTGGTAACGTA ACGGTTTTCGTGTATAGTAGTTAGAGCCCATCTTGTAA CCTCATTTCCTCCCATATTAAAGTATCAGTGATTCGCT GGAACGATTAACTAAGAAAAAAAAAATATCTGCACAT ACTCATCAGTCTGTAAATCTAAGTCAAAACTGCTGTAT CCAATAGAAATCGGGATATACCTGGATGTTTTTTCCAC ATAAACAAACGGGAGTTCAGCTTACTTATGGTGTTGA TGCAATTCAGTATGATCCTACCAATAAAACGAAACTTT GGGATTTTGGCTGTTTGAGGGATCAAAAGCTGCACCTT TACAAGATTGACGGATCGACCATTAGACCAAAGCAAA TGGCCACCAA 107 DNA encodes CCAGCTAGATCTCCATCTCCATCCACTCAACCATGGGAACA GM-CSF CGTTAACGCTATCCAAGAGGCTTTGAGATTGTTGAACTTGT CCAGAGACACTGCTGCTGAAATGAACGAGACTGTTGAGGT TATCTCCGAGATGTTCGACTTGCAAGAGCCAACTTGTTTGC AGACTAGATTGGAGTTGTACAAGCAGGGATTGAGAGGATC CTTGACTAAGTTGAAGGGACCATTGACTATGATGGCTTCCC 
ACTACAAGCAACACTGTCCACCAACTCCAGAAACATCCTGT GCTACTCAGATCATCACTTTCGAGTCCTT CAAAGAGAACTTGAAGGACTTCTTGTTGGTTATCCCATTCG ACTGTTGGGAACCAGTTCAAGAATAATAA 108 GM-CSF PARSPSPSTQPWEHVNAIQEALRLLNLSRDTAAEMNETVEVIS EMFDLQEPTCLQTRLELYKQGLRGSLTKLKGPLTMMASHYK QHCPPTPETSCATQIITFESFKENLKDFLLVIPFDCWEPVQE 109 DNA encodes ATGTTCAACCTGAAAACTATTCTCATCTCAACACTTGC CWP1-GMCSF ATCGATCGCTGTTGCCGACCAAACCTTCGGTGTCCTTC fusion protein TAATCCGGAGTGGATCCCCATATCACTATTCGACTCTC ACTAATAGAGACGAAAAGATTGTTGCTGGAGGTGGCA ACAAAAAAGTGACCCTCACAGATGAGGGAGCTCTGAA GTATGATGGTGGTAAATGGATAGGTCTTGATGATGAT GGCTATGCGGTACAGACCGACAAACCAGTTACAGGTT GGAGCACTAACGGTGGATACCTCTATTTTGACCAAGG CTTAATTGTTTGCACGGAGGACTATATCGGATATGTGA AGAAACATGGTGAATGCAAAGGTGACAGCTATGGTAT GGCTTGGAAGGTACTCCCAGCCGACGATGACAAGGAT GATGACAAGGATGATGATAAAGATGATGACAAGGATT ATGACGATGACAATGACCACGGTGATGGTGATTACTA TTGCTCGATCACAGGAACCTATGCCATCAAATCCAAA GGCAGTAAGCATCAATACGAGGCCATCAAAAAAGTTG ATGCACATCCTCATGTCTTCTCTGTAGGAGGAGATCAG GGAAACGATCTGATTGTGACTTTCCAAAAGGATTGTTC GCTGGTAGATCAAGATAACAGAGGCGTATATGTTGAC CCTAATTCTGGAGAAGTCGGAAACGTTGACCCTTGGG GAGAACTCACGCCATCTGTTAAATGGGATATTGACGA CGGATACCTGATCTTTAATGGTGAGTCCAATTTCAGGT CATGTCCATCTGGTAATGGATATTCATTGTCTATCAAG GATTGTGTTGGGGGAACTGACATTGGCCTTAAAGTAT GGGAGAAAGGTGGAGGTTCTTTGGTTAAGAGGGCTCC AGCTAGATCTCCATCTCCATCCACTCAACCATGGGAAC ACGTTAACGCTATCCAAGAGGCTTTGAGATTGTTGAA CTTGTCCAGAGACACTGCTGCTGAAATGAACGAGACT GTTGAGGTTATCTCCGAGATGTTCGACTTGCAAGAGCC AACTTGTTTGCAGACTAGATTGGAGTTGTACAAGCAG GGATTGAGAGGATCCTTGACTAAGTTGAAGGGACCAT TGACTATGATGGCTTCCCACTACAAGCAACACTGTCCA CCAACTCCAGAAACATCCTGTGCTACTCAGATCATCAC TTTCGAGTCCTTCAAAGAGAACTTGAAGGACTTCTTGT TGGTTATCCCATTCGACTGTTGGGAACCAGTTCAAGAA TAA
110 CWP1-GMCSF MFNLKTILISTLASIAVADQTFGVLLIRSGSPYHYSTLTNR fusion protein DEKIVAGGGNKKVTLTDEGALKYDGGKWIGLDDDGYA VQTDKPVTGWSTNGGYLYFDQGLIVCTEDYIGYVKKHG ECKGDSYGMAWKVLPADDDKDDDKDDDKDDDKDYDD DNDHGDGDYYCSITGTYAIKSKGSKHQYEAIKKVDAHP HVFSVGGDQGNDLIVTFQKDCSLVDQDNRGVYVDPNSG EVGNVDPWGELTPSVKWDIDDGYLIFNGESNFRSCPSGN GYSLSIKDCVGGTDIGLKVWEKGGGSLVKRAPARSPSPS TQPWEHVNAIQEALRLLNLSRDTAAEMNETVEVISEMFD LQEPTCLQTRLELYKQGLRGSLTKLKGPLTMMASHYKQ HCPPTPETSCATQIITFESFKENLKDFLLVIPFDCWEPVQE 111 Kex2 linker GGGSLVKR amino acid sequence
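The relationship among SEQ ID NOs: 108, 110 and 111 in the table above can be checked with a short string comparison. The following is a minimal sketch, not part of the described method: the sequences are copied from the table with line-break whitespace removed, and the helper name `split_at_linker` is hypothetical. It assumes only that the Kex2 linker (SEQ ID NO: 111) appears once in the fusion protein (SEQ ID NO: 110), and it shows which residues lie on either side of the linker.

```python
# Illustrative consistency check; sequences copied from the table above
# (SEQ ID NOs: 108, 110, 111). Helper name is hypothetical.

def join_lines(block: str) -> str:
    """Remove the whitespace introduced by line wrapping in the listing."""
    return "".join(block.split())

GMCSF = join_lines("""
PARSPSPSTQPWEHVNAIQEALRLLNLSRDTAAEMNETVEVIS
EMFDLQEPTCLQTRLELYKQGLRGSLTKLKGPLTMMASHYK
QHCPPTPETSCATQIITFESFKENLKDFLLVIPFDCWEPVQE
""")  # SEQ ID NO: 108

CWP1_GMCSF = join_lines("""
MFNLKTILISTLASIAVADQTFGVLLIRSGSPYHYSTLTNR
DEKIVAGGGNKKVTLTDEGALKYDGGKWIGLDDDGYA
VQTDKPVTGWSTNGGYLYFDQGLIVCTEDYIGYVKKHG
ECKGDSYGMAWKVLPADDDKDDDKDDDKDDDKDYDD
DNDHGDGDYYCSITGTYAIKSKGSKHQYEAIKKVDAHP
HVFSVGGDQGNDLIVTFQKDCSLVDQDNRGVYVDPNSG
EVGNVDPWGELTPSVKWDIDDGYLIFNGESNFRSCPSGN
GYSLSIKDCVGGTDIGLKVWEKGGGSLVKRAPARSPSPS
TQPWEHVNAIQEALRLLNLSRDTAAEMNETVEVISEMFD
LQEPTCLQTRLELYKQGLRGSLTKLKGPLTMMASHYKQ
HCPPTPETSCATQIITFESFKENLKDFLLVIPFDCWEPVQE
""")  # SEQ ID NO: 110

KEX2_LINKER = "GGGSLVKR"  # SEQ ID NO: 111


def split_at_linker(fusion: str, linker: str) -> tuple[str, str]:
    """Return the residues before and after the first occurrence of the linker."""
    head, sep, tail = fusion.partition(linker)
    if not sep:
        raise ValueError("linker not found in fusion sequence")
    return head, tail


n_term, c_term = split_at_linker(CWP1_GMCSF, KEX2_LINKER)

# Residues following the linker, and whether they end with SEQ ID NO: 108.
print("residues after linker:", len(c_term))
print("ends with GM-CSF (SEQ ID NO: 108):", c_term.endswith(GMCSF))
# Any residues between the linker and the start of SEQ ID NO: 108.
print("spacer:", c_term[: len(c_term) - len(GMCSF)])
```

Because Kex2 protease is generally understood to cleave on the carboxyl side of a Lys-Arg pair, the second element returned by `split_at_linker` corresponds to what would remain after cleavage at the linker. The sketch itself performs only string matching and makes no claim about processing in the host cell.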
[0295] While the present invention is described herein with reference to illustrated embodiments, it should be understood that the invention is not limited thereto. Those having ordinary skill in the art and access to the teachings herein will recognize additional modifications and embodiments within the scope thereof. Therefore, the present invention is limited only by the claims attached hereto.
Sequence CWU
1
1
111131DNAArtificial SequencePCR primer PpURA6out/UP 1ctgaggagtc agatatcagc
tcaatctcca t 31227DNAArtificial
SequencePCR primer Puc19/LP 2tccggctcgt atgttgtgtg gaattgt
27332DNAArtificial SequencePCR primer
PpURA6out/LP 3ctggatgttt gatgggttca gtttcagctg ga
32429DNAArtificial SequencePCR primer ScARR3/UP 4ggcaatagtc
gcgagaatcc ttaaaccat
29528DNAArtificial SequencePCR primer PpTRP1-5'out/UP 5cctcgtaaag
atctgcggtt tgcaaagt
28627DNAArtificial SequencePCR primer PpALG3TT/LP 6cctcccactg gaaccgatga
tatggaa 27733DNAArtificial
SequencePCR primer PpTEFTT/UP 7gatgcgaagt taagtgcgca gaaagtaata tca
33833DNAArtificial SequencePCR primer
PpTRP-3'1out/LP 8cgtgtgtacc ttgaaacgtc aatgatactt tga
33930DNAArtificial SequencePCR primer LmSTT3D/iUP
9cagactaaga ctgcttctcc acctgctaag
301032DNAArtificial SequencePCR primer LmSTT3D/iLP 10caacagtaga
accagaagcc tcgtaagtac ag
32112577DNALeishmania major 11atgggtaaaa gaaagggaaa ctccttggga gattctggtt
ctgctgctac tgcttccaga 60gaggcttctg ctcaagctga agatgctgct tcccagacta
agactgcttc tccacctgct 120aaggttatct tgttgccaaa gactttgact gacgagaagg
acttcatcgg tatcttccca 180tttccattct ggccagttca cttcgttttg actgttgttg
ctttgttcgt tttggctgct 240tcctgtttcc aggctttcac tgttagaatg atctccgttc
aaatctacgg ttacttgatc 300cacgaatttg acccatggtt caactacaga gctgctgagt
acatgtctac tcacggatgg 360agtgcttttt tctcctggtt cgattacatg tcctggtatc
cattgggtag accagttggt 420tctactactt acccaggatt gcagttgact gctgttgcta
tccatagagc tttggctgct 480gctggaatgc caatgtcctt gaacaatgtt tgtgttttga
tgccagcttg gtttggtgct 540atcgctactg ctactttggc tttctgtact tacgaggctt
ctggttctac tgttgctgct 600gctgcagctg ctttgtcctt ctccattatc cctgctcact
tgatgagatc catggctggt 660gagttcgaca acgagtgtat tgctgttgct gctatgttgt
tgactttcta ctgttgggtt 720cgttccttga gaactagatc ctcctggcca atcggtgttt
tgacaggtgt tgcttacggt 780tacatggctg ctgcttgggg aggttacatc ttcgttttga
acatggttgc tatgcacgct 840ggtatctctt ctatggttga ctgggctaga aacacttaca
acccatcctt gttgagagct 900tacactttgt tctacgttgt tggtactgct atcgctgttt
gtgttccacc agttggaatg 960tctccattca agtccttgga gcagttggga gctttgttgg
ttttggtttt cttgtgtgga 1020ttgcaagttt gtgaggtttt gagagctaga gctggtgttg
aagttagatc cagagctaat 1080ttcaagatca gagttagagt tttctccgtt atggctggtg
ttgctgcttt ggctatctct 1140gttttggctc caactggtta ctttggtcca ttgtctgtta
gagttagagc tttgtttgtt 1200gagcacacta gaactggtaa cccattggtt gactccgttg
ctgaacatca accagcttct 1260ccagaggcta tgtgggcttt cttgcatgtt tgtggtgtta
cttggggatt gggttccatt 1320gttttggctg tttccacttt cgttcactac tccccatcta
aggttttctg gttgttgaac 1380tccggtgctg tttactactt ctccactaga atggctagat
tgttgttgtt gtccggtcca 1440gctgcttgtt tgtccactgg tatcttcgtt ggtactatct
tggaggctgc tgttcaattg 1500tctttctggg actccgatgc tactaaggct aagaagcagc
aaaagcaggc tcaaagacac 1560caaagaggtg ctggtaaagg ttctggtaga gatgacgcta
agaacgctac tactgctaga 1620gctttctgtg acgttttcgc tggttcttct ttggcttggg
gtcacagaat ggttttgtcc 1680attgctatgt gggctttggt tactactact gctgtttcct
tcttctcctc cgaatttgct 1740tctcactcca ctaagttcgc tgaacaatcc tccaacccaa
tgatcgtttt cgctgctgtt 1800gttcagaaca gagctactgg aaagccaatg aacttgttgg
ttgacgacta cttgaaggct 1860tacgagtggt tgagagactc tactccagag gacgctagag
ttttggcttg gtgggactac 1920ggttaccaaa tcactggtat cggtaacaga acttccttgg
ctgatggtaa cacttggaac 1980cacgagcaca ttgctactat cggaaagatg ttgacttccc
cagttgttga agctcactcc 2040cttgttagac acatggctga ctacgttttg atttgggctg
gtcaatctgg tgacttgatg 2100aagtctccac acatggctag aatcggtaac tctgtttacc
acgacatttg tccagatgac 2160ccattgtgtc agcaattcgg tttccacaga aacgattact
ccagaccaac tccaatgatg 2220agagcttcct tgttgtacaa cttgcacgag gctggaaaaa
gaaagggtgt taaggttaac 2280ccatctttgt tccaagaggt ttactcctcc aagtacggac
ttgttagaat cttcaaggtt 2340atgaacgttt ccgctgagtc taagaagtgg gttgcagacc
cagctaacag agtttgtcac 2400ccacctggtt cttggatttg tcctggtcaa tacccacctg
ctaaagaaat ccaagagatg 2460ttggctcaca gagttccatt cgaccaggtt acaaacgctg
acagaaagaa caatgttggt 2520tcctaccaag aggaatacat gagaagaatg agagagtccg
agaacagaag ataatag 257712857PRTLeishmania major 12Met Gly Lys Arg
Lys Gly Asn Ser Leu Gly Asp Ser Gly Ser Ala Ala 1 5
10 15 Thr Ala Ser Arg Glu Ala Ser Ala Gln
Ala Glu Asp Ala Ala Ser Gln 20 25
30 Thr Lys Thr Ala Ser Pro Pro Ala Lys Val Ile Leu Leu Pro
Lys Thr 35 40 45
Leu Thr Asp Glu Lys Asp Phe Ile Gly Ile Phe Pro Phe Pro Phe Trp 50
55 60 Pro Val His Phe Val
Leu Thr Val Val Ala Leu Phe Val Leu Ala Ala 65 70
75 80 Ser Cys Phe Gln Ala Phe Thr Val Arg Met
Ile Ser Val Gln Ile Tyr 85 90
95 Gly Tyr Leu Ile His Glu Phe Asp Pro Trp Phe Asn Tyr Arg Ala
Ala 100 105 110 Glu
Tyr Met Ser Thr His Gly Trp Ser Ala Phe Phe Ser Trp Phe Asp 115
120 125 Tyr Met Ser Trp Tyr Pro
Leu Gly Arg Pro Val Gly Ser Thr Thr Tyr 130 135
140 Pro Gly Leu Gln Leu Thr Ala Val Ala Ile His
Arg Ala Leu Ala Ala 145 150 155
160 Ala Gly Met Pro Met Ser Leu Asn Asn Val Cys Val Leu Met Pro Ala
165 170 175 Trp Phe
Gly Ala Ile Ala Thr Ala Thr Leu Ala Phe Cys Thr Tyr Glu 180
185 190 Ala Ser Gly Ser Thr Val Ala
Ala Ala Ala Ala Ala Leu Ser Phe Ser 195 200
205 Ile Ile Pro Ala His Leu Met Arg Ser Met Ala Gly
Glu Phe Asp Asn 210 215 220
Glu Cys Ile Ala Val Ala Ala Met Leu Leu Thr Phe Tyr Cys Trp Val 225
230 235 240 Arg Ser Leu
Arg Thr Arg Ser Ser Trp Pro Ile Gly Val Leu Thr Gly 245
250 255 Val Ala Tyr Gly Tyr Met Ala Ala
Ala Trp Gly Gly Tyr Ile Phe Val 260 265
270 Leu Asn Met Val Ala Met His Ala Gly Ile Ser Ser Met
Val Asp Trp 275 280 285
Ala Arg Asn Thr Tyr Asn Pro Ser Leu Leu Arg Ala Tyr Thr Leu Phe 290
295 300 Tyr Val Val Gly
Thr Ala Ile Ala Val Cys Val Pro Pro Val Gly Met 305 310
315 320 Ser Pro Phe Lys Ser Leu Glu Gln Leu
Gly Ala Leu Leu Val Leu Val 325 330
335 Phe Leu Cys Gly Leu Gln Val Cys Glu Val Leu Arg Ala Arg
Ala Gly 340 345 350
Val Glu Val Arg Ser Arg Ala Asn Phe Lys Ile Arg Val Arg Val Phe
355 360 365 Ser Val Met Ala
Gly Val Ala Ala Leu Ala Ile Ser Val Leu Ala Pro 370
375 380 Thr Gly Tyr Phe Gly Pro Leu Ser
Val Arg Val Arg Ala Leu Phe Val 385 390
395 400 Glu His Thr Arg Thr Gly Asn Pro Leu Val Asp Ser
Val Ala Glu His 405 410
415 Gln Pro Ala Ser Pro Glu Ala Met Trp Ala Phe Leu His Val Cys Gly
420 425 430 Val Thr Trp
Gly Leu Gly Ser Ile Val Leu Ala Val Ser Thr Phe Val 435
440 445 His Tyr Ser Pro Ser Lys Val Phe
Trp Leu Leu Asn Ser Gly Ala Val 450 455
460 Tyr Tyr Phe Ser Thr Arg Met Ala Arg Leu Leu Leu Leu
Ser Gly Pro 465 470 475
480 Ala Ala Cys Leu Ser Thr Gly Ile Phe Val Gly Thr Ile Leu Glu Ala
485 490 495 Ala Val Gln Leu
Ser Phe Trp Asp Ser Asp Ala Thr Lys Ala Lys Lys 500
505 510 Gln Gln Lys Gln Ala Gln Arg His Gln
Arg Gly Ala Gly Lys Gly Ser 515 520
525 Gly Arg Asp Asp Ala Lys Asn Ala Thr Thr Ala Arg Ala Phe
Cys Asp 530 535 540
Val Phe Ala Gly Ser Ser Leu Ala Trp Gly His Arg Met Val Leu Ser 545
550 555 560 Ile Ala Met Trp Ala
Leu Val Thr Thr Thr Ala Val Ser Phe Phe Ser 565
570 575 Ser Glu Phe Ala Ser His Ser Thr Lys Phe
Ala Glu Gln Ser Ser Asn 580 585
590 Pro Met Ile Val Phe Ala Ala Val Val Gln Asn Arg Ala Thr Gly
Lys 595 600 605 Pro
Met Asn Leu Leu Val Asp Asp Tyr Leu Lys Ala Tyr Glu Trp Leu 610
615 620 Arg Asp Ser Thr Pro Glu
Asp Ala Arg Val Leu Ala Trp Trp Asp Tyr 625 630
635 640 Gly Tyr Gln Ile Thr Gly Ile Gly Asn Arg Thr
Ser Leu Ala Asp Gly 645 650
655 Asn Thr Trp Asn His Glu His Ile Ala Thr Ile Gly Lys Met Leu Thr
660 665 670 Ser Pro
Val Val Glu Ala His Ser Leu Val Arg His Met Ala Asp Tyr 675
680 685 Val Leu Ile Trp Ala Gly Gln
Ser Gly Asp Leu Met Lys Ser Pro His 690 695
700 Met Ala Arg Ile Gly Asn Ser Val Tyr His Asp Ile
Cys Pro Asp Asp 705 710 715
720 Pro Leu Cys Gln Gln Phe Gly Phe His Arg Asn Asp Tyr Ser Arg Pro
725 730 735 Thr Pro Met
Met Arg Ala Ser Leu Leu Tyr Asn Leu His Glu Ala Gly 740
745 750 Lys Arg Lys Gly Val Lys Val Asn
Pro Ser Leu Phe Gln Glu Val Tyr 755 760
765 Ser Ser Lys Tyr Gly Leu Val Arg Ile Phe Lys Val Met
Asn Val Ser 770 775 780
Ala Glu Ser Lys Lys Trp Val Ala Asp Pro Ala Asn Arg Val Cys His 785
790 795 800 Pro Pro Gly Ser
Trp Ile Cys Pro Gly Gln Tyr Pro Pro Ala Lys Glu 805
810 815 Ile Gln Glu Met Leu Ala His Arg Val
Pro Phe Asp Gln Val Thr Asn 820 825
830 Ala Asp Arg Lys Asn Asn Val Gly Ser Tyr Gln Glu Glu Tyr
Met Arg 835 840 845
Arg Met Arg Glu Ser Glu Asn Arg Arg 850 855
1357DNAArtificial SequenceSaccharomyces cerevisiae mating factor
pre-signal peptide (DNA) 13atgagattcc catccatctt cactgctgtt ttgttcgctg
cttcttctgc tttggct 571419PRTArtificial SequenceSaccharomyces
cerevisiae mating factor pre-signal peptide 14Met Arg Phe Pro Ser
Ile Phe Thr Ala Val Leu Phe Ala Ala Ser Ser 1 5
10 15 Ala Leu Ala 151350DNAArtificial
SequenceAnti-Her2 Heavy chain (VH + IgG1 constant region) (DNA)
15gaggttcagt tggttgaatc tggaggagga ttggttcaac ctggtggttc tttgagattg
60tcctgtgctg cttccggttt caacatcaag gacacttaca tccactgggt tagacaagct
120ccaggaaagg gattggagtg ggttgctaga atctacccaa ctaacggtta cacaagatac
180gctgactccg ttaagggaag attcactatc tctgctgaca cttccaagaa cactgcttac
240ttgcagatga actccttgag agctgaggat actgctgttt actactgttc cagatggggt
300ggtgatggtt tctacgctat ggactactgg ggtcaaggaa ctttggttac tgtttcctcc
360gcttctacta agggaccatc tgttttccca ttggctccat cttctaagtc tacttccggt
420ggtactgctg ctttgggatg tttggttaaa gactacttcc cagagccagt tactgtttct
480tggaactccg gtgctttgac ttctggtgtt cacactttcc cagctgtttt gcaatcttcc
540ggtttgtact ctttgtcctc cgttgttact gttccatcct cttccttggg tactcagact
600tacatctgta acgttaacca caagccatcc aacactaagg ttgacaagaa ggttgagcca
660aagtcctgtg acaagacaca tacttgtcca ccatgtccag ctccagaatt gttgggtggt
720ccatccgttt tcttgttccc accaaagcca aaggacactt tgatgatctc cagaactcca
780gaggttacat gtgttgttgt tgacgtttct cacgaggacc cagaggttaa gttcaactgg
840tacgttgacg gtgttgaagt tcacaacgct aagactaagc caagagaaga gcagtacaac
900tccacttaca gagttgtttc cgttttgact gttttgcacc aggactggtt gaacggtaaa
960gaatacaagt gtaaggtttc caacaaggct ttgccagctc caatcgaaaa gactatctcc
1020aaggctaagg gtcaaccaag agagccacag gtttacactt tgccaccatc cagagaagag
1080atgactaaga accaggtttc cttgacttgt ttggttaaag gattctaccc atccgacatt
1140gctgttgagt gggaatctaa cggtcaacca gagaacaact acaagactac tccaccagtt
1200ttggattctg atggttcctt cttcttgtac tccaagttga ctgttgacaa gtccagatgg
1260caacagggta acgttttctc ctgttccgtt atgcatgagg ctttgcacaa ccactacact
1320caaaagtcct tgtctttgtc ccctggttaa
135016449PRTArtificial SequenceAnti-Her2 Heavy chain (VH + IgG1
constant region) 16Glu Val Gln Leu Val Glu Ser Gly Gly Gly Leu Val Gln
Pro Gly Gly 1 5 10 15
Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly Phe Asn Ile Lys Asp Thr
20 25 30 Tyr Ile His Trp
Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val 35
40 45 Ala Arg Ile Tyr Pro Thr Asn Gly Tyr
Thr Arg Tyr Ala Asp Ser Val 50 55
60 Lys Gly Arg Phe Thr Ile Ser Ala Asp Thr Ser Lys Asn
Thr Ala Tyr 65 70 75
80 Leu Gln Met Asn Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys
85 90 95 Ser Arg Trp Gly
Gly Asp Gly Phe Tyr Ala Met Asp Tyr Trp Gly Gln 100
105 110 Gly Thr Leu Val Thr Val Ser Ser Ala
Ser Thr Lys Gly Pro Ser Val 115 120
125 Phe Pro Leu Ala Pro Ser Ser Lys Ser Thr Ser Gly Gly Thr
Ala Ala 130 135 140
Leu Gly Cys Leu Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val Ser 145
150 155 160 Trp Asn Ser Gly Ala
Leu Thr Ser Gly Val His Thr Phe Pro Ala Val 165
170 175 Leu Gln Ser Ser Gly Leu Tyr Ser Leu Ser
Ser Val Val Thr Val Pro 180 185
190 Ser Ser Ser Leu Gly Thr Gln Thr Tyr Ile Cys Asn Val Asn His
Lys 195 200 205 Pro
Ser Asn Thr Lys Val Asp Lys Lys Val Glu Pro Lys Ser Cys Asp 210
215 220 Lys Thr His Thr Cys Pro
Pro Cys Pro Ala Pro Glu Leu Leu Gly Gly 225 230
235 240 Pro Ser Val Phe Leu Phe Pro Pro Lys Pro Lys
Asp Thr Leu Met Ile 245 250
255 Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser His Glu
260 265 270 Asp Pro
Glu Val Lys Phe Asn Trp Tyr Val Asp Gly Val Glu Val His 275
280 285 Asn Ala Lys Thr Lys Pro Arg
Glu Glu Gln Tyr Asn Ser Thr Tyr Arg 290 295
300 Val Val Ser Val Leu Thr Val Leu His Gln Asp Trp
Leu Asn Gly Lys 305 310 315
320 Glu Tyr Lys Cys Lys Val Ser Asn Lys Ala Leu Pro Ala Pro Ile Glu
325 330 335 Lys Thr Ile
Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln Val Tyr 340
345 350 Thr Leu Pro Pro Ser Arg Glu Glu
Met Thr Lys Asn Gln Val Ser Leu 355 360
365 Thr Cys Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala
Val Glu Trp 370 375 380
Glu Ser Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro Pro Val 385
390 395 400 Leu Asp Ser Asp
Gly Ser Phe Phe Leu Tyr Ser Lys Leu Thr Val Asp 405
410 415 Lys Ser Arg Trp Gln Gln Gly Asn Val
Phe Ser Cys Ser Val Met His 420 425
430 Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu
Ser Pro 435 440 445
Gly 17645DNAArtificial SequenceAnti-Her2 light chain (VL + Kappa
constant region) (DNA) 17gacatccaaa tgactcaatc cccatcttct ttgtctgctt
ccgttggtga cagagttact 60atcacttgta gagcttccca ggacgttaat actgctgttg
cttggtatca acagaagcca 120ggaaaggctc caaagttgtt gatctactcc gcttccttct
tgtactctgg tgttccatcc 180agattctctg gttccagatc cggtactgac ttcactttga
ctatctcctc cttgcaacca 240gaagatttcg ctacttacta ctgtcagcag cactacacta
ctccaccaac tttcggacag 300ggtactaagg ttgagatcaa gagaactgtt gctgctccat
ccgttttcat tttcccacca 360tccgacgaac agttgaagtc tggtacagct tccgttgttt
gtttgttgaa caacttctac 420ccaagagagg ctaaggttca gtggaaggtt gacaacgctt
tgcaatccgg taactcccaa 480gaatccgtta ctgagcaaga ctctaaggac tccacttact
ccttgtcctc cactttgact 540ttgtccaagg ctgattacga gaagcacaag gtttacgctt
gtgaggttac acatcagggt 600ttgtcctccc cagttactaa gtccttcaac agaggagagt
gttaa 64518214PRTArtificial SequenceAnti-Her2 light
chain (VL + Kappa constant region) 18Asp Ile Gln Met Thr Gln Ser Pro
Ser Ser Leu Ser Ala Ser Val Gly 1 5 10
15 Asp Arg Val Thr Ile Thr Cys Arg Ala Ser Gln Asp Val
Asn Thr Ala 20 25 30
Val Ala Trp Tyr Gln Gln Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile
35 40 45 Tyr Ser Ala Ser
Phe Leu Tyr Ser Gly Val Pro Ser Arg Phe Ser Gly 50
55 60 Ser Arg Ser Gly Thr Asp Phe Thr
Leu Thr Ile Ser Ser Leu Gln Pro 65 70
75 80 Glu Asp Phe Ala Thr Tyr Tyr Cys Gln Gln His Tyr
Thr Thr Pro Pro 85 90
95 Thr Phe Gly Gln Gly Thr Lys Val Glu Ile Lys Arg Thr Val Ala Ala
100 105 110 Pro Ser Val
Phe Ile Phe Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly 115
120 125 Thr Ala Ser Val Val Cys Leu Leu
Asn Asn Phe Tyr Pro Arg Glu Ala 130 135
140 Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser Gly
Asn Ser Gln 145 150 155
160 Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser Thr Tyr Ser Leu Ser
165 170 175 Ser Thr Leu Thr
Leu Ser Lys Ala Asp Tyr Glu Lys His Lys Val Tyr 180
185 190 Ala Cys Glu Val Thr His Gln Gly Leu
Ser Ser Pro Val Thr Lys Ser 195 200
205 Phe Asn Arg Gly Glu Cys 210
191350DNAArtificial SequenceAnti-RSV Heavy chain (VH + IgG1 constant
region) (DNA) 19caggttacat tgagagaatc cggtccagct ttggttaagc caactcagac
tttgactttg 60acttgtactt tctccggttt ctccttgtct acttccggaa tgtctgttgg
atggatcaga 120caaccacctg gaaaggcttt ggaatggctt gctgacattt ggtgggatga
caagaaggac 180tacaacccat ccttgaagtc cagattgact atctccaagg acacttccaa
gaatcaagtt 240gttttgaagg ttacaaacat ggacccagct gacactgcta cttactactg
tgctagatcc 300atgatcacta actggtactt cgatgtttgg ggtgctggta ctactgttac
tgtctcgagt 360gcttctacta agggaccatc cgtttttcca ttggctccat cctctaagtc
tacttccggt 420ggaaccgctg ctttgggatg tttggttaaa gactacttcc cagagccagt
tactgtttct 480tggaactccg gtgctttgac ttctggtgtt cacactttcc cagctgtttt
gcaatcttcc 540ggtttgtact ctttgtcctc cgttgttact gttccatcct cttccttggg
tactcagact 600tacatctgta acgttaacca caagccatcc aacactaagg ttgacaagag
agttgagcca 660aagtcctgtg acaagacaca tacttgtcca ccatgtccag ctccagaatt
gttgggtggt 720ccatccgttt tcttgttccc accaaagcca aaggacactt tgatgatctc
cagaactcca 780gaggttacat gtgttgttgt tgacgtttct cacgaggacc cagaggttaa
gttcaactgg 840tacgttgacg gtgttgaagt tcacaacgct aagactaagc caagagaaga
gcagtacaac 900tccacttaca gagttgtttc cgttttgact gttttgcacc aggactggtt
gaacggtaaa 960gaatacaagt gtaaggtttc caacaaggct ttgccagctc caatcgaaaa
gactatctcc 1020aaggctaagg gtcaaccaag agagccacag gtttacactt tgccaccatc
cagagaagag 1080atgactaaga accaggtttc cttgacttgt ttggttaaag gattctaccc
atccgacatt 1140gctgttgagt gggaatctaa cggtcaacca gagaacaact acaagactac
tccaccagtt 1200ttggattctg atggttcctt cttcttgtac tccaagttga ctgttgacaa
gtccagatgg 1260caacagggta acgttttctc ctgttccgtt atgcatgagg ctttgcacaa
ccactacact 1320caaaagtcct tgtctttgtc ccctggttaa
135020449PRTArtificial SequenceAnti-RSV Heavy chain (VH +
IgG1 constant region) 20Gln Val Thr Leu Arg Glu Ser Gly Pro Ala Leu Val
Lys Pro Thr Gln 1 5 10
15 Thr Leu Thr Leu Thr Cys Thr Phe Ser Gly Phe Ser Leu Ser Thr Ser
20 25 30 Gly Met Ser
Val Gly Trp Ile Arg Gln Pro Pro Gly Lys Ala Leu Glu 35
40 45 Trp Leu Ala Asp Ile Trp Trp Asp
Asp Lys Lys Asp Tyr Asn Pro Ser 50 55
60 Leu Lys Ser Arg Leu Thr Ile Ser Lys Asp Thr Ser Lys
Asn Gln Val 65 70 75
80 Val Leu Lys Val Thr Asn Met Asp Pro Ala Asp Thr Ala Thr Tyr Tyr
85 90 95 Cys Ala Arg Ser
Met Ile Thr Asn Trp Tyr Phe Asp Val Trp Gly Ala 100
105 110 Gly Thr Thr Val Thr Val Ser Ser Ala
Ser Thr Lys Gly Pro Ser Val 115 120
125 Phe Pro Leu Ala Pro Ser Ser Lys Ser Thr Ser Gly Gly Thr
Ala Ala 130 135 140
Leu Gly Cys Leu Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val Ser 145
150 155 160 Trp Asn Ser Gly Ala
Leu Thr Ser Gly Val His Thr Phe Pro Ala Val 165
170 175 Leu Gln Ser Ser Gly Leu Tyr Ser Leu Ser
Ser Val Val Thr Val Pro 180 185
190 Ser Ser Ser Leu Gly Thr Gln Thr Tyr Ile Cys Asn Val Asn His
Lys 195 200 205 Pro
Ser Asn Thr Lys Val Asp Lys Arg Val Glu Pro Lys Ser Cys Asp 210
215 220 Lys Thr His Thr Cys Pro
Pro Cys Pro Ala Pro Glu Leu Leu Gly Gly 225 230
235 240 Pro Ser Val Phe Leu Phe Pro Pro Lys Pro Lys
Asp Thr Leu Met Ile 245 250
255 Ser Arg Thr Pro Glu Val Thr Cys Val Val Val Asp Val Ser His Glu
260 265 270 Asp Pro
Glu Val Lys Phe Asn Trp Tyr Val Asp Gly Val Glu Val His 275
280 285 Asn Ala Lys Thr Lys Pro Arg
Glu Glu Gln Tyr Asn Ser Thr Tyr Arg 290 295
300 Val Val Ser Val Leu Thr Val Leu His Gln Asp Trp
Leu Asn Gly Lys 305 310 315
320 Glu Tyr Lys Cys Lys Val Ser Asn Lys Ala Leu Pro Ala Pro Ile Glu
325 330 335 Lys Thr Ile
Ser Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln Val Tyr 340
345 350 Thr Leu Pro Pro Ser Arg Glu Glu
Met Thr Lys Asn Gln Val Ser Leu 355 360
365 Thr Cys Leu Val Lys Gly Phe Tyr Pro Ser Asp Ile Ala
Val Glu Trp 370 375 380
Glu Ser Asn Gly Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro Pro Val 385
390 395 400 Leu Asp Ser Asp
Gly Ser Phe Phe Leu Tyr Ser Lys Leu Thr Val Asp 405
410 415 Lys Ser Arg Trp Gln Gln Gly Asn Val
Phe Ser Cys Ser Val Met His 420 425
430 Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Leu Ser Leu
Ser Pro 435 440 445
Gly 21699DNAArtificial SequenceAnti-RSV light chain (VL + Kappa
constant region) (DNA) 21atgagattcc catccatctt cactgctgtt ttgttcgctg
cttcttctgc tttggctgac 60attcagatga cacagtcccc atctactttg tctgcttccg
ttggtgacag agttactatc 120acttgtaagt gtcagttgtc cgttggttac atgcactggt
atcagcaaaa gccaggaaag 180gctccaaagt tgttgatcta cgacacttcc aagttggctt
ccggtgttcc atctagattc 240tctggttccg gttctggtac tgagttcact ttgactatct
cttccttgca accagatgac 300ttcgctactt actactgttt ccagggttct ggttacccat
tcactttcgg tggtggtact 360aagttggaga tcaagagaac tgttgctgct ccatccgttt
tcattttccc accatccgac 420gaacaattga agtccggtac cgcttccgtt gtttgtttgt
tgaacaactt ctacccacgt 480gaggctaagg ttcagtggaa ggttgacaac gctttgcaat
ccggtaactc ccaagaatcc 540gttactgagc aggattctaa ggattccact tactcattgt
cctccacttt gactttgtcc 600aaggctgatt acgagaagca caaggtttac gcttgcgagg
ttacacatca gggtttgtcc 660tccccagtta ctaagtcctt caacagagga gagtgttaa
69922213PRTArtificial SequenceAnti-RSV light chain
(VL + Kappa constant region) 22Asp Ile Gln Met Thr Gln Ser Pro Ser
Thr Leu Ser Ala Ser Val Gly 1 5 10
15 Asp Arg Val Thr Ile Thr Cys Lys Cys Gln Leu Ser Val Gly
Tyr Met 20 25 30
His Trp Tyr Gln Gln Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile Tyr
35 40 45 Asp Thr Ser Lys
Leu Ala Ser Gly Val Pro Ser Arg Phe Ser Gly Ser 50
55 60 Gly Ser Gly Thr Glu Phe Thr Leu
Thr Ile Ser Ser Leu Gln Pro Asp 65 70
75 80 Asp Phe Ala Thr Tyr Tyr Cys Phe Gln Gly Ser Gly
Tyr Pro Phe Thr 85 90
95 Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys Arg Thr Val Ala Ala Pro
100 105 110 Ser Val Phe
Ile Phe Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly Thr 115
120 125 Ala Ser Val Val Cys Leu Leu Asn
Asn Phe Tyr Pro Arg Glu Ala Lys 130 135
140 Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser Gly Asn
Ser Gln Glu 145 150 155
160 Ser Val Thr Glu Gln Asp Ser Lys Asp Ser Thr Tyr Ser Leu Ser Ser
165 170 175 Thr Leu Thr Leu
Ser Lys Ala Asp Tyr Glu Lys His Lys Val Tyr Ala 180
185 190 Cys Glu Val Thr His Gln Gly Leu Ser
Ser Pro Val Thr Lys Ser Phe 195 200
205 Asn Arg Gly Glu Cys 210
23934DNAArtificial SequencePp AOX1 promoter 23aacatccaaa gacgaaaggt
tgaatgaaac ctttttgcca tccgacatcc acaggtccat 60tctcacacat aagtgccaaa
cgcaacagga ggggatacac tagcagcaga ccgttgcaaa 120cgcaggacct ccactcctct
tctcctcaac acccactttt gccatcgaaa aaccagccca 180gttattgggc ttgattggag
ctcgctcatt ccaattcctt ctattaggct actaacacca 240tgactttatt agcctgtcta
tcctggcccc cctggcgagg ttcatgtttg tttatttccg 300aatgcaacaa gctccgcatt
acacccgaac atcactccag atgagggctt tctgagtgtg 360gggtcaaata gtttcatgtt
ccccaaatgg cccaaaactg acagtttaaa cgctgtcttg 420gaacctaata tgacaaaagc
gtgatctcat ccaagatgaa ctaagtttgg ttcgttgaaa 480tgctaacggc cagttggtca
aaaagaaact tccaaaagtc ggcataccgt ttgtcttgtt 540tggtattgat tgacgaatgc
tcaaaaataa tctcattaat gcttagcgca gtctctctat 600cgcttctgaa ccccggtgca
cctgtgccga aacgcaaatg gggaaacacc cgctttttgg 660atgattatgc attgtctcca
cattgtatgc ttccaagatt ctggtgggaa tactgctgat 720agcctaacgt tcatgatcaa
aatttaactg ttctaacccc tacttgacag caatatataa 780acagaaggaa gctgccctgt
cttaaacctt tttttttatc atcattatta gcttactttc 840ataattgcga ctggttccaa
ttgacaagct tttgatttta acgactttta acgacaactt 900gagaagatca aaaaacaact
aattattcga aacg 93424293DNAArtificial
SequenceScCYC TT 24acaggcccct tttcctttgt cgatatcatg taattagtta tgtcacgctt
acattcacgc 60cctcctccca catccgctct aaccgaaaag gaaggagtta gacaacctga
agtctaggtc 120cctatttatt ttttttaata gttatgttag tattaagaac gttatttata
tttcaaattt 180ttcttttttt tctgtacaaa cgcgtgtacg catgtaacat tatactgaaa
accttgcttg 240agaaggtttt gggacgctcg aaggctttaa tttgcaagct gccggctctt
aag 29325600DNAArtificial SequencePpRPL10 promoter 25gttcttcgct
tggtcttgta tctccttaca ctgtatcttc ccatttgcgt ttaggtggtt 60atcaaaaact
aaaaggaaaa atttcagatg tttatctcta aggttttttc tttttacagt 120ataacacgtg
atgcgtcacg tggtactaga ttacgtaagt tattttggtc cggtgggtaa 180gtgggtaaga
atagaaagca tgaaggttta caaaaacgca gtcacgaatt attgctactt 240cgagcttgga
accaccccaa agattatatt gtactgatgc actaccttct cgattttgct 300cctccaagaa
cctacgaaaa acatttcttg agccttttca acctagacta cacatcaagt 360tatttaaggt
atgttccgtt aacatgtaag aaaaggagag gatagatcgt ttatggggta 420cgtcgcctga
ttcaagcgtg accattcgaa gaataggcct tcgaaagctg aataaagcaa 480atgtcagttg
cgattggtat gctgacaaat tagcataaaa agcaatagac tttctaacca 540cctgtttttt
tccttttact ttatttatat tttgccaccg tactaacaag ttcagacaaa
60026486DNAArtificial SequencePpGAPDH promoter 26tttttgtaga aatgtcttgg
tgtcctcgtc caatcaggta gccatctctg aaatatctgg 60ctccgttgca actccgaacg
acctgctggc aacgtaaaat tctccggggt aaaacttaaa 120tgtggagtaa tggaaccaga
aacgtctctt cccttctctc tccttccacc gcccgttacc 180gtccctagga aattttactc
tgctggagag cttcttctac ggcccccttg cagcaatgct 240cttcccagca ttacgttgcg
ggtaaaacgg aggtcgtgta cccgacctag cagcccaggg 300atggaaaagt cccggccgtc
gctggcaata atagcgggcg gacgcatgtc atgagattat 360tggaaaccac cagaatcgaa
tataaaaggc gaacaccttt cccaattttg gtttctcctg 420acccaaagac tttaaattta
atttatttgt ccctatttca atcaattgaa caactatcaa 480aacaca
48627600DNAArtificial
SequencePpTEF1 promoter 27ttaaggtttg gaacaacact aaactacctt gcggtactac
cattgacact acacatcctt 60aattccaatc ctgtctggcc tccttcacct tttaaccatc
ttgcccattc caactcgtgt 120cagattgcgt atcaagtgaa aaaaaaaaaa ttttaaatct
ttaacccaat caggtaataa 180ctgtcgcctc ttttatctgc cgcactgcat gaggtgtccc
cttagtggga aagagtactg 240agccaaccct ggaggacagc aagggaaaaa tacctacaac
ttgcttcata atggtcgtaa 300aaacaatcct tgtcggatat aagtgttgta gactgtccct
tatcctctgc gatgttcttc 360ctctcaaagt ttgcgatttc tctctatcag aattgccatc
aagagactca ggactaattt 420cgcagtccca cacgcactcg tacatgattg gctgaaattt
ccctaaagaa tttctttttc 480acgaaaattt tttttttaca caagattttc agcagatata
aaatggagag caggacctcc 540gctgtgactc ttcttttttt tcttttattc tcactacata
cattttagtt attcgccaac 60028301DNAArtificial SequencePpTEF1 TT
28attgcttgaa gctttaattt attttattaa cataataata atacaagcat gatatatttg
60tattttgttc gttaacattg atgttttctt catttactgt tattgtttgt aactttgatc
120gatttatctt ttctacttta ctgtaatatg gctggcgggt gagccttgaa ctccctgtat
180tactttacct tgctattact taatctattg actagcagcg acctcttcaa ccgaagggca
240agtacacagc aagttcatgt ctccgtaagt gtcatcaacc ctggaaacag tgggccatgt
300c
30129376DNAArtificial SequencePpALG3 TT 29atttacaatt agtaatatta
aggtggtaaa aacattcgta gaattgaaat gaattaatat 60agtatgacaa tggttcatgt
ctataaatct ccggcttcgg taccttctcc ccaattgaat 120acattgtcaa aatgaatggt
tgaactatta ggttcgccag tttcgttatt aagaaaactg 180ttaaaatcaa attccatatc
atcggttcca gtgggaggac cagttccatc gccaaaatcc 240tgtaagaatc cattgtcaga
acctgtaaag tcagtttgag atgaaatttt tccggtcttt 300gttgacttgg aagcttcgtt
aaggttaggt gaaacagttt gatcaaccag cggctcccgt 360tttcgtcgct tagtag
37630672DNAArtificial
SequencePpTRP1 5' region and ORF 30gcggaaacgg cagtaaacaa tggagcttca
ttagtgggtg ttattatggt ccctggccgg 60gaacgaacgg tgaaacaaga ggttgcgagg
gaaatttcgc agatggtgcg ggaaaagaga 120atttcaaagg gctcaaaata cttggattcc
agacaactga ggaaagagtg ggacgactgt 180cctctggaag actggtttga gtacaacgtg
aaagaaataa acagcagtgg tccattttta 240gttggagttt ttcgtaatca aagtatagat
gaaatccagc aagctatcca cactcatggt 300ttggatttcg tccaactaca tgggtctgag
gattttgatt cgtatatacg caatatccca 360gttcctgtga ttaccagata cacagataat
gccgtcgatg gtcttaccgg agaagacctc 420gctataaata gggccctggt gctactggac
agcgagcaag gaggtgaagg aaaaaccatc 480gattgggctc gtgcacaaaa atttggagaa
cgtagaggaa aatatttact agccggaggt 540ttgacacctg ataatgttgc tcatgctcga
tctcatactg gctgtattgg tgttgacgtc 600tctggtgggg tagaaacaaa tgcctcaaaa
gatatggaca agatcacaca atttatcaga 660aacgctacat aa
67231834DNAArtificial SequencePpTRP1 3'
region 31aagtcaatta aatacacgct tgaaaggaca ttacatagct ttcgatttaa
gcagaaccag 60aaatgtagaa ccacttgtca atagattggt caatcttagc aggagcggct
gggctagcag 120ttggaacagc agaggttgct gaaggtgaga aggatggagt ggattgcaaa
gtggtgttgg 180ttaagtcaat ctcaccaggg ctggttttgc caaaaatcaa cttctcccag
gcttcacggc 240attcttgaat gacctcttct gcatacttct tgttcttgca ttcaccagag
aaagcaaact 300ggttctcagg ttttccatca gggatcttgt aaattctgaa ccattcgttg
gtagctctca 360acaagcccgg catgtgcttt tcaacatcct cgatgtcatt gagcttagga
gccaatgggt 420cgttgatgtc gatgacgatg accttccagt cagtctctcc ctcatccaac
aaagccataa 480caccgaggac cttgacttgc ttgacctgtc cagtgtaacc tacggcttca
ccaatttcgc 540aaacgtccaa tggatcattg tcacccttgg ccttggtctc tggatgagtg
acgttagggt 600cttcccatgt ctgagggaag gcaccgtagt tgtgaatgta tccgtggtga
gggaaacagt 660tacgaacgaa acgaagtttt cccttctttg tgtcctgaag aattgggttc
agtttctcct 720ccttggaaat ctccaacttg gcgttggtcc aacgggggac ttcaacaacc
atgttgagaa 780ccttcttgga ttcgtcagca taaagtggga tgtcgtggaa aggagatacg
actt 834321215DNAArtificial SequenceScARR3 ORF 32atgtcagaag
atcaaaaaag tgaaaattcc gtaccttcta aggttaatat ggtgaatcgc 60accgatatac
tgactacgat caagtcattg tcatggcttg acttgatgtt gccatttact 120ataattctct
ccataatcat tgcagtaata atttctgtct atgtgccttc ttcccgtcac 180acttttgacg
ctgaaggtca tcccaatcta atgggagtgt ccattccttt gactgttggt 240atgattgtaa
tgatgattcc cccgatctgc aaagtttcct gggagtctat tcacaagtac 300ttctacagga
gctatataag gaagcaacta gccctctcgt tatttttgaa ttgggtcatc 360ggtcctttgt
tgatgacagc attggcgtgg atggcgctat tcgattataa ggaataccgt 420caaggcatta
ttatgatcgg agtagctaga tgcattgcca tggtgctaat ttggaatcag 480attgctggag
gagacaatga tctctgcgtc gtgcttgtta ttacaaactc gcttttacag 540atggtattat
atgcaccatt gcagatattt tactgttatg ttatttctca tgaccacctg 600aatacttcaa
atagggtatt attcgaagag gttgcaaagt ctgtcggagt ttttctcggc 660ataccactgg
gaattggcat tatcatacgt ttgggaagtc ttaccatagc tggtaaaagt 720aattatgaaa
aatacatttt gagatttatt tctccatggg caatgatcgg atttcattac 780actttatttg
ttatttttat tagtagaggt tatcaattta tccacgaaat tggttctgca 840atattgtgct
ttgtcccatt ggtgctttac ttctttattg catggttttt gaccttcgca 900ttaatgaggt
acttatcaat atctaggagt gatacacaaa gagaatgtag ctgtgaccaa 960gaactacttt
taaagagggt ctggggaaga aagtcttgtg aagctagctt ttctattacg 1020atgacgcaat
gtttcactat ggcttcaaat aattttgaac tatccctggc aattgctatt 1080tccttatatg
gtaacaatag caagcaagca atagctgcaa catttgggcc gttgctagaa 1140gttccaattt
tattgatttt ggcaatagtc gcgagaatcc ttaaaccata ttatatatgg 1200aacaatagaa
attaa
1215331144DNAArtificial SequenceURA6 region 33caaatgcaag aggacattag
aaatgtgttt ggtaagaaca tgaagccgga ggcatacaaa 60cgattcacag atttgaagga
ggaaaacaaa ctgcatccac cggaagtgcc agcagccgtg 120tatgccaacc ttgctctcaa
aggcattcct acggatctga gtgggaaata tctgagattc 180acagacccac tattggaaca
gtaccaaacc tagtttggcc gatccatgat tatgtaatgc 240atatagtttt tgtcgatgct
cacccgtttc gagtctgtct cgtatcgtct tacgtataag 300ttcaagcatg tttaccaggt
ctgttagaaa ctcctttgtg agggcaggac ctattcgtct 360cggtcccgtt gtttctaaga
gactgtacag ccaagcgcag aatggtggca ttaaccataa 420gaggattctg atcggacttg
gtctattggc tattggaacc accctttacg ggacaaccaa 480ccctaccaag actcctattg
catttgtgga accagccacg gaaagagcgt ttaaggacgg 540agacgtctct gtgatttttg
ttctcggagg tccaggagct ggaaaaggta cccaatgtgc 600caaactagtg agtaattacg
gatttgttca cctgtcagct ggagacttgt tacgtgcaga 660acagaagagg gaggggtcta
agtatggaga gatgatttcc cagtatatca gagatggact 720gatagtacct caagaggtca
ccattgcgct cttggagcag gccatgaagg aaaacttcga 780gaaagggaag acacggttct
tgattgatgg attccctcgt aagatggacc aggccaaaac 840ttttgaggaa aaagtcgcaa
agtccaaggt gacacttttc tttgattgtc ccgaatcagt 900gctccttgag agattactta
aaagaggaca gacaagcgga agagaggatg ataatgcgga 960gagtatcaaa aaaagattca
aaacattcgt ggaaacttcg atgcctgtgg tggactattt 1020cgggaagcaa ggacgcgttt
tgaaggtatc ttgtgaccac cctgtggatc aagtgtattc 1080acaggttgtg tcggtgctaa
aagagaaggg gatctttgcc gataacgaga cggagaataa 1140ataa
114434582DNAArtificial
SequenceNatR ORF 34atgggtacca ctcttgacga cacggcttac cggtaccgca ccagtgtccc
gggggacgcc 60gaggccatcg aggcactgga tgggtccttc accaccgaca ccgtcttccg
cgtcaccgcc 120accggggacg gcttcaccct gcgggaggtg ccggtggacc cgcccctgac
caaggtgttc 180cccgacgacg aatcggacga cgaatcggac gacggggagg acggcgaccc
ggactcccgg 240acgttcgtcg cgtacgggga cgacggcgac ctggcgggct tcgtggtcgt
ctcgtactcc 300ggctggaacc gccggctgac cgtcgaggac atcgaggtcg ccccggagca
ccgggggcac 360ggggtcgggc gcgcgttgat ggggctcgcg acggagttcg cccgcgagcg
gggcgccggg 420cacctctggc tggaggtcac caacgtcaac gcaccggcga tccacgcgta
ccggcggatg 480gggttcaccc tctgcggcct ggacaccgcc ctgtacgacg gcaccgcctc
ggacggcgag 540caggcgctct acatgagcat gccctgcccc taatcagtac tg
58235375DNAArtificial SequenceSequence of the Sh ble ORF
(Zeocin resistance marker) 35atggccaagt tgaccagtgc cgttccggtg ctcaccgcgc
gcgacgtcgc cggagcggtc 60gagttctgga ccgaccggct cgggttctcc cgggacttcg
tggaggacga cttcgccggt 120gtggtccggg acgacgtgac cctgttcatc agcgcggtcc
aggaccaggt ggtgccggac 180aacaccctgg cctgggtgtg ggtgcgcggc ctggacgagc
tgtacgccga gtggtcggag 240gtcgtgtcca cgaacttccg ggacgcctcc gggccggcca
tgaccgagat cggcgagcag 300ccgtgggggc gggagttcgc cctgcgcgac ccggccggca
actgcgtgca cttcgtggcc 360gaggagcagg actga
37536260DNAArtificial SequencePpAOX1 TT
36tcaagaggat gtcagaatgc catttgcctg agagatgcag gcttcatttt gatacttttt
60tatttgtaac ctatatagta taggattttt tttgtcattt tgtttcttct cgtacgagct
120tgctcctgat cagcctatct cgcagctgat gaatatcttg tggtaggggt ttgggaaaat
180cattcgagtt tgatgttttt cttggtattt cccactcctc ttcagagtac agaagattaa
240gtgagacgtt cgtttgtgca
26037427DNAArtificial SequenceScTEF1 promoter 37gatcccccac acaccatagc
ttcaaaatgt ttctactcct tttttactct tccagatttt 60ctcggactcc gcgcatcgcc
gtaccacttc aaaacaccca agcacagcat actaaatttc 120ccctctttct tcctctaggg
tgtcgttaat tacccgtact aaaggtttgg aaaagaaaaa 180agagaccgcc tcgtttcttt
ttcttcgtcg aaaaaggcaa taaaaatttt tatcacgttt 240ctttttcttg aaaatttttt
tttttgattt ttttctcttt cgatgacctc ccattgatat 300ttaagttaat aaacggtctt
caatttctca agtttcagtt tcatttttct tgttctatta 360caactttttt tacttcttgc
tcattagaaa gaaagcatag caatctaatc taagttttaa 420ttacaaa
427383029DNASaccharomyces
cerevisiae 38aggcctcgca acaacctata attgagttaa gtgcctttcc aagctaaaaa
gtttgaggtt 60ataggggctt agcatccaca cgtcacaatc tcgggtatcg agtatagtat
gtagaattac 120ggcaggaggt ttcccaatga acaaaggaca ggggcacggt gagctgtcga
aggtatccat 180tttatcatgt ttcgtttgta caagcacgac atactaagac atttaccgta
tgggagttgt 240tgtcctagcg tagttctcgc tcccccagca aagctcaaaa aagtacgtca
tttagaatag 300tttgtgagca aattaccagt cggtatgcta cgttagaaag gcccacagta
ttcttctacc 360aaaggcgtgc ctttgttgaa ctcgatccat tatgagggct tccattattc
cccgcatttt 420tattactctg aacaggaata aaaagaaaaa acccagttta ggaaattatc
cgggggcgaa 480gaaatacgcg tagcgttaat cgaccccacg tccagggttt ttccatggag
gtttctggaa 540aaactgacga ggaatgtgat tataaatccc tttatgtgat gtctaagact
tttaaggtac 600gcccgatgtt tgcctattac catcatagag acgtttcttt tcgaggaatg
cttaaacgac 660tttgtttgac aaaaatgttg cctaagggct ctatagtaaa ccatttggaa
gaaagatttg 720acgacttttt ttttttggat ttcgatccta taatccttcc tcctgaaaag
aaacatataa 780atagatatgt attattcttc aaaacattct cttgttcttg tgcttttttt
ttaccatata 840tcttactttt ttttttctct cagagaaaca agcaaaacaa aaagcttttc
ttttcactaa 900cgtatatgat gcttttgcaa gctttccttt tccttttggc tggttttgca
gccaaaatat 960ctgcatcaat gacaaacgaa actagcgata gacctttggt ccacttcaca
cccaacaagg 1020gctggatgaa tgacccaaat gggttgtggt acgatgaaaa agatgccaaa
tggcatctgt 1080actttcaata caacccaaat gacaccgtat ggggtacgcc attgttttgg
ggccatgcta 1140cttccgatga tttgactaat tgggaagatc aacccattgc tatcgctccc
aagcgtaacg 1200attcaggtgc tttctctggc tccatggtgg ttgattacaa caacacgagt
gggtttttca 1260atgatactat tgatccaaga caaagatgcg ttgcgatttg gacttataac
actcctgaaa 1320gtgaagagca atacattagc tattctcttg atggtggtta cacttttact
gaataccaaa 1380agaaccctgt tttagctgcc aactccactc aattcagaga tccaaaggtg
ttctggtatg 1440aaccttctca aaaatggatt atgacggctg ccaaatcaca agactacaaa
attgaaattt 1500actcctctga tgacttgaag tcctggaagc tagaatctgc atttgccaat
gaaggtttct 1560taggctacca atacgaatgt ccaggtttga ttgaagtccc aactgagcaa
gatccttcca 1620aatcttattg ggtcatgttt atttctatca acccaggtgc acctgctggc
ggttccttca 1680accaatattt tgttggatcc ttcaatggta ctcattttga agcgtttgac
aatcaatcta 1740gagtggtaga ttttggtaag gactactatg ccttgcaaac tttcttcaac
actgacccaa 1800cctacggttc agcattaggt attgcctggg cttcaaactg ggagtacagt
gcctttgtcc 1860caactaaccc atggagatca tccatgtctt tggtccgcaa gttttctttg
aacactgaat 1920atcaagctaa tccagagact gaattgatca atttgaaagc cgaaccaata
ttgaacatta 1980gtaatgctgg tccctggtct cgttttgcta ctaacacaac tctaactaag
gccaattctt 2040acaatgtcga tttgagcaac tcgactggta ccctagagtt tgagttggtt
tacgctgtta 2100acaccacaca aaccatatcc aaatccgtct ttgccgactt atcactttgg
ttcaagggtt 2160tagaagatcc tgaagaatat ttgagaatgg gttttgaagt cagtgcttct
tccttctttt 2220tggaccgtgg taactctaag gtcaagtttg tcaaggagaa cccatatttc
acaaacagaa 2280tgtctgtcaa caaccaacca ttcaagtctg agaacgacct aagttactat
aaagtgtacg 2340gcctactgga tcaaaacatc ttggaattgt acttcaacga tggagatgtg
gtttctacaa 2400atacctactt catgaccacc ggtaacgctc taggatctgt gaacatgacc
actggtgtcg 2460ataatttgtt ctacattgac aagttccaag taagggaagt aaaatagagg
ttataaaact 2520tattgtcttt tttatttttt tcaaaagcca ttctaaaggg ctttagctaa
cgagtgacga 2580atgtaaaact ttatgatttc aaagaatacc tccaaaccat tgaaaatgta
tttttatttt 2640tattttctcc cgaccccagt tacctggaat ttgttcttta tgtactttat
ataagtataa 2700ttctcttaaa aatttttact actttgcaat agacatcatt ttttcacgta
ataaacccac 2760aatcgtaatg tagttgcctt acactactag gatggacctt tttgccttta
tctgttttgt 2820tactgacaca atgaaaccgg gtaaagtatt agttatgtga aaatttaaaa
gcattaagta 2880gaagtatacc atattgtaaa aaaaaaaagc gttgtcttct acgtaaaagt
gttctcaaaa 2940agaagtagtg agggaaatgg ataccaagct atctgtaaca ggagctaaaa
aatctcaggg 3000aaaagcttct ggtttgggaa acggtcgac
302939898DNAArtificial SequenceSequence of the 5'-Region used
for knock out of PpURA5 39atcggccttt gttgatgcaa gttttacgtg
gatcatggac taaggagttt tatttggacc 60aagttcatcg tcctagacat tacggaaagg
gttctgctcc tctttttgga aactttttgg 120aacctctgag tatgacagct tggtggattg
tacccatggt atggcttcct gtgaatttct 180attttttcta cattggattc accaatcaaa
acaaattagt cgccatggct ttttggcttt 240tgggtctatt tgtttggacc ttcttggaat
atgctttgca tagatttttg ttccacttgg 300actactatct tccagagaat caaattgcat
ttaccattca tttcttattg catgggatac 360accactattt accaatggat aaatacagat
tggtgatgcc acctacactt ttcattgtac 420tttgctaccc aatcaagacg ctcgtctttt
ctgttctacc atattacatg gcttgttctg 480gatttgcagg tggattcctg ggctatatca
tgtatgatgt cactcattac gttctgcatc 540actccaagct gcctcgttat ttccaagagt
tgaagaaata tcatttggaa catcactaca 600agaattacga gttaggcttt ggtgtcactt
ccaaattctg ggacaaagtc tttgggactt 660atctgggtcc agacgatgtg tatcaaaaga
caaattagag tatttataaa gttatgtaag 720caaatagggg ctaataggga aagaaaaatt
ttggttcttt atcagagctg gctcgcgcgc 780agtgtttttc gtgctccttt gtaatagtca
tttttgacta ctgttcagat tgaaatcaca 840ttgaagatgt cactcgaggg gtaccaaaaa
aggtttttgg atgctgcagt ggcttcgc 898401060DNAArtificial
SequenceSequence of the 3'-Region used for knock out of PpURA5
40ggtcttttca acaaagctcc attagtgagt cagctggctg aatcttatgc acaggccatc
60attaacagca acctggagat agacgttgta tttggaccag cttataaagg tattcctttg
120gctgctatta ccgtgttgaa gttgtacgag ctcggcggca aaaaatacga aaatgtcgga
180tatgcgttca atagaaaaga aaagaaagac cacggagaag gtggaagcat cgttggagaa
240agtctaaaga ataaaagagt actgattatc gatgatgtga tgactgcagg tactgctatc
300aacgaagcat ttgctataat tggagctgaa ggtgggagag ttgaaggtag tattattgcc
360ctagatagaa tggagactac aggagatgac tcaaatacca gtgctaccca ggctgttagt
420cagagatatg gtacccctgt cttgagtata gtgacattgg accatattgt ggcccatttg
480ggcgaaactt tcacagcaga cgagaaatct caaatggaaa cgtatagaaa aaagtatttg
540cccaaataag tatgaatctg cttcgaatga atgaattaat ccaattatct tctcaccatt
600attttcttct gtttcggagc tttgggcacg gcggcgggtg gtgcgggctc aggttccctt
660tcataaacag atttagtact tggatgctta atagtgaatg gcgaatgcaa aggaacaatt
720tcgttcatct ttaacccttt cactcggggt acacgttctg gaatgtaccc gccctgttgc
780aactcaggtg gaccgggcaa ttcttgaact ttctgtaacg ttgttggatg ttcaaccaga
840aattgtccta ccaactgtat tagtttcctt ttggtcttat attgttcatc gagatacttc
900ccactctcct tgatagccac tctcactctt cctggattac caaaatcttg aggatgagtc
960ttttcaggct ccaggatgca aggtatatcc aagtacctgc aagcatctaa tattgtcttt
1020gccagggggt tctccacacc atactccttt tggcgcatgc
106041957DNAArtificial SequenceSequence of the PpURA5 auxotrophic marker
41tctagaggga cttatctggg tccagacgat gtgtatcaaa agacaaatta gagtatttat
60aaagttatgt aagcaaatag gggctaatag ggaaagaaaa attttggttc tttatcagag
120ctggctcgcg cgcagtgttt ttcgtgctcc tttgtaatag tcatttttga ctactgttca
180gattgaaatc acattgaaga tgtcactgga ggggtaccaa aaaaggtttt tggatgctgc
240agtggcttcg caggccttga agtttggaac tttcaccttg aaaagtggaa gacagtctcc
300atacttcttt aacatgggtc ttttcaacaa agctccatta gtgagtcagc tggctgaatc
360ttatgctcag gccatcatta acagcaacct ggagatagac gttgtatttg gaccagctta
420taaaggtatt cctttggctg ctattaccgt gttgaagttg tacgagctgg gcggcaaaaa
480atacgaaaat gtcggatatg cgttcaatag aaaagaaaag aaagaccacg gagaaggtgg
540aagcatcgtt ggagaaagtc taaagaataa aagagtactg attatcgatg atgtgatgac
600tgcaggtact gctatcaacg aagcatttgc tataattgga gctgaaggtg ggagagttga
660aggttgtatt attgccctag atagaatgga gactacagga gatgactcaa ataccagtgc
720tacccaggct gttagtcaga gatatggtac ccctgtcttg agtatagtga cattggacca
780tattgtggcc catttgggcg aaactttcac agcagacgag aaatctcaaa tggaaacgta
840tagaaaaaag tatttgccca aataagtatg aatctgcttc gaatgaatga attaatccaa
900ttatcttctc accattattt tcttctgttt cggagctttg ggcacggcgg cggatcc
95742709DNAArtificial SequenceSequence of the part of the Ec lacZ gene
that was used to construct the PpURA5 blaster (recyclable
auxotrophic marker) 42cctgcactgg atggtggcgc tggatggtaa gccgctggca
agcggtgaag tgcctctgga 60tgtcgctcca caaggtaaac agttgattga actgcctgaa
ctaccgcagc cggagagcgc 120cgggcaactc tggctcacag tacgcgtagt gcaaccgaac
gcgaccgcat ggtcagaagc 180cgggcacatc agcgcctggc agcagtggcg tctggcggaa
aacctcagtg tgacgctccc 240cgccgcgtcc cacgccatcc cgcatctgac caccagcgaa
atggattttt gcatcgagct 300gggtaataag cgttggcaat ttaaccgcca gtcaggcttt
ctttcacaga tgtggattgg 360cgataaaaaa caactgctga cgccgctgcg cgatcagttc
acccgtgcac cgctggataa 420cgacattggc gtaagtgaag cgacccgcat tgaccctaac
gcctgggtcg aacgctggaa 480ggcggcgggc cattaccagg ccgaagcagc gttgttgcag
tgcacggcag atacacttgc 540tgatgcggtg ctgattacga ccgctcacgc gtggcagcat
caggggaaaa ccttatttat 600cagccggaaa acctaccgga ttgatggtag tggtcaaatg
gcgattaccg ttgatgttga 660agtggcgagc gatacaccgc atccggcgcg gattggcctg
aactgccag 709432875DNAArtificial SequenceSequence of the
5'-Region used for knock out of PpOCH1 43aaaacctttt ttcctattca
aacacaaggc attgcttcaa cacgtgtgcg tatccttaac 60acagatactc catacttcta
ataatgtgat agacgaatac aaagatgttc actctgtgtt 120gtgtctacaa gcatttctta
ttctgattgg ggatattcta gttacagcac taaacaactg 180gcgatacaaa cttaaattaa
ataatccgaa tctagaaaat gaacttttgg atggtccgcc 240tgttggttgg ataaatcaat
accgattaaa tggattctat tccaatgaga gagtaatcca 300agacactctg atgtcaataa
tcatttgctt gcaacaacaa acccgtcatc taatcaaagg 360gtttgatgag gcttaccttc
aattgcagat aaactcattg ctgtccactg ctgtattatg 420tgagaatatg ggtgatgaat
ctggtcttct ccactcagct aacatggctg tttgggcaaa 480ggtggtacaa ttatacggag
atcaggcaat agtgaaattg ttgaatatgg ctactggacg 540atgcttcaag gatgtacgtc
tagtaggagc cgtgggaaga ttgctggcag aaccagttgg 600cacgtcgcaa caatccccaa
gaaatgaaat aagtgaaaac gtaacgtcaa agacagcaat 660ggagtcaata ttgataacac
cactggcaga gcggttcgta cgtcgttttg gagccgatat 720gaggctcagc gtgctaacag
cacgattgac aagaagactc tcgagtgaca gtaggttgag 780taaagtattc gcttagattc
ccaaccttcg ttttattctt tcgtagacaa agaagctgca 840tgcgaacata gggacaactt
ttataaatcc aattgtcaaa ccaacgtaaa accctctggc 900accattttca acatatattt
gtgaagcagt acgcaatatc gataaatact caccgttgtt 960tgtaacagcc ccaacttgca
tacgccttct aatgacctca aatggataag ccgcagcttg 1020tgctaacata ccagcagcac
cgcccgcggt cagctgcgcc cacacatata aaggcaatct 1080acgatcatgg gaggaattag
ttttgaccgt caggtcttca agagttttga actcttcttc 1140ttgaactgtg taacctttta
aatgacggga tctaaatacg tcatggatga gatcatgtgt 1200gtaaaaactg actccagcat
atggaatcat tccaaagatt gtaggagcga acccacgata 1260aaagtttccc aaccttgcca
aagtgtctaa tgctgtgact tgaaatctgg gttcctcgtt 1320gaagaccctg cgtactatgc
ccaaaaactt tcctccacga gccctattaa cttctctatg 1380agtttcaaat gccaaacgga
cacggattag gtccaatggg taagtgaaaa acacagagca 1440aaccccagct aatgagccgg
ccagtaaccg tcttggagct gtttcataag agtcattagg 1500gatcaataac gttctaatct
gttcataaca tacaaatttt atggctgcat agggaaaaat 1560tctcaacagg gtagccgaat
gaccctgata tagacctgcg acaccatcat acccatagat 1620ctgcctgaca gccttaaaga
gcccgctaaa agacccggaa aaccgagaga actctggatt 1680agcagtctga aaaagaatct
tcactctgtc tagtggagca attaatgtct tagcggcact 1740tcctgctact ccgccagcta
ctcctgaata gatcacatac tgcaaagact gcttgtcgat 1800gaccttgggg ttatttagct
tcaagggcaa tttttgggac attttggaca caggagactc 1860agaaacagac acagagcgtt
ctgagtcctg gtgctcctga cgtaggccta gaacaggaat 1920tattggcttt atttgtttgt
ccatttcata ggcttggggt aatagataga tgacagagaa 1980atagagaaga cctaatattt
tttgttcatg gcaaatcgcg ggttcgcggt cgggtcacac 2040acggagaagt aatgagaaga
gctggtaatc tggggtaaaa gggttcaaaa gaaggtcgcc 2100tggtagggat gcaatacaag
gttgtcttgg agtttacatt gaccagatga tttggctttt 2160tctctgttca attcacattt
ttcagcgaga atcggattga cggagaaatg gcggggtgtg 2220gggtggatag atggcagaaa
tgctcgcaat caccgcgaaa gaaagacttt atggaataga 2280actactgggt ggtgtaagga
ttacatagct agtccaatgg agtccgttgg aaaggtaaga 2340agaagctaaa accggctaag
taactaggga agaatgatca gactttgatt tgatgaggtc 2400tgaaaatact ctgctgcttt
ttcagttgct ttttccctgc aacctatcat tttccttttc 2460ataagcctgc cttttctgtt
ttcacttata tgagttccgc cgagacttcc ccaaattctc 2520tcctggaaca ttctctatcg
ctctccttcc aagttgcgcc ccctggcact gcctagtaat 2580attaccacgc gacttatatt
cagttccaca atttccagtg ttcgtagcaa atatcatcag 2640ccatggcgaa ggcagatggc
agtttgctct actataatcc tcacaatcca cccagaaggt 2700attacttcta catggctata
ttcgccgttt ctgtcatttg cgttttgtac ggaccctcac 2760aacaattatc atctccaaaa
atagactatg atccattgac gctccgatca cttgatttga 2820agactttgga agctccttca
cagttgagtc caggcaccgt agaagataat cttcg 287544997DNAArtificial
SequenceSequence of the 3'-Region used for knock out of PpOCH1
44aaagctagag taaaatagat atagcgagat tagagaatga ataccttctt ctaagcgatc
60gtccgtcatc atagaatatc atggactgta tagttttttt tttgtacata taatgattaa
120acggtcatcc aacatctcgt tgacagatct ctcagtacgc gaaatccctg actatcaaag
180caagaaccga tgaagaaaaa aacaacagta acccaaacac cacaacaaac actttatctt
240ctccccccca acaccaatca tcaaagagat gtcggaacca aacaccaaga agcaaaaact
300aaccccatat aaaaacatcc tggtagataa tgctggtaac ccgctctcct tccatattct
360gggctacttc acgaagtctg accggtctca gttgatcaac atgatcctcg aaatgggtgg
420caagatcgtt ccagacctgc ctcctctggt agatggagtg ttgtttttga caggggatta
480caagtctatt gatgaagata ccctaaagca actgggggac gttccaatat acagagactc
540cttcatctac cagtgttttg tgcacaagac atctcttccc attgacactt tccgaattga
600caagaacgtc gacttggctc aagatttgat caatagggcc cttcaagagt ctgtggatca
660tgtcacttct gccagcacag ctgcagctgc tgctgttgtt gtcgctacca acggcctgtc
720ttctaaacca gacgctcgta ctagcaaaat acagttcact cccgaagaag atcgttttat
780tcttgacttt gttaggagaa atcctaaacg aagaaacaca catcaactgt acactgagct
840cgctcagcac atgaaaaacc atacgaatca ttctatccgc cacagatttc gtcgtaatct
900ttccgctcaa cttgattggg tttatgatat cgatccattg accaaccaac ctcgaaaaga
960tgaaaacggg aactacatca aggtacaagg ccttcca
997452159DNAKluyveromyces lactis 45aaacgtaacg cctggcactc tattttctca
aacttctggg acggaagagc taaatattgt 60gttgcttgaa caaacccaaa aaaacaaaaa
aatgaacaaa ctaaaactac acctaaataa 120accgtgtgta aaacgtagta ccatattact
agaaaagatc acaagtgtat cacacatgtg 180catctcatat tacatctttt atccaatcca
ttctctctat cccgtctgtt cctgtcagat 240tctttttcca taaaaagaag aagaccccga
atctcaccgg tacaatgcaa aactgctgaa 300aaaaaaagaa agttcactgg atacgggaac
agtgccagta ggcttcacca catggacaaa 360acaattgacg ataaaataag caggtgagct
tctttttcaa gtcacgatcc ctttatgtct 420cagaaacaat atatacaagc taaacccttt
tgaaccagtt ctctcttcat agttatgttc 480acataaattg cgggaacaag actccgctgg
ctgtcaggta cacgttgtaa cgttttcgtc 540cgcccaatta ttagcacaac attggcaaaa
agaaaaactg ctcgttttct ctacaggtaa 600attacaattt ttttcagtaa ttttcgctga
aaaatttaaa gggcaggaaa aaaagacgat 660ctcgactttg catagatgca agaactgtgg
tcaaaacttg aaatagtaat tttgctgtgc 720gtgaactaat aaatatatat atatatatat
atatatattt gtgtattttg tatatgtaat 780tgtgcacgtc ttggctattg gatataagat
tttcgcgggt tgatgacata gagcgtgtac 840tactgtaata gttgtatatt caaaagctgc
tgcgtggaga aagactaaaa tagataaaaa 900gcacacattt tgacttcggt accgtcaact
tagtgggaca gtcttttata tttggtgtaa 960gctcatttct ggtactattc gaaacagaac
agtgttttct gtattaccgt ccaatcgttt 1020gtcatgagtt ttgtattgat tttgtcgtta
gtgttcggag gatgttgttc caatgtgatt 1080agtttcgagc acatggtgca aggcagcaat
ataaatttgg gaaatattgt tacattcact 1140caattcgtgt ctgtgacgct aattcagttg
cccaatgctt tggacttctc tcactttccg 1200tttaggttgc gacctagaca cattcctctt
aagatccata tgttagctgt gtttttgttc 1260tttaccagtt cagtcgccaa taacagtgtg
tttaaatttg acatttccgt tccgattcat 1320attatcatta gattttcagg taccactttg
acgatgataa taggttgggc tgtttgtaat 1380aagaggtact ccaaacttca ggtgcaatct
gccatcatta tgacgcttgg tgcgattgtc 1440gcatcattat accgtgacaa agaattttca
atggacagtt taaagttgaa tacggattca 1500gtgggtatga cccaaaaatc tatgtttggt
atctttgttg tgctagtggc cactgccttg 1560atgtcattgt tgtcgttgct caacgaatgg
acgtataaca agtacgggaa acattggaaa 1620gaaactttgt tctattcgca tttcttggct
ctaccgttgt ttatgttggg gtacacaagg 1680ctcagagacg aattcagaga cctcttaatt
tcctcagact caatggatat tcctattgtt 1740aaattaccaa ttgctacgaa acttttcatg
ctaatagcaa ataacgtgac ccagttcatt 1800tgtatcaaag gtgttaacat gctagctagt
aacacggatg ctttgacact ttctgtcgtg 1860cttctagtgc gtaaatttgt tagtctttta
ctcagtgtct acatctacaa gaacgtccta 1920tccgtgactg catacctagg gaccatcacc
gtgttcctgg gagctggttt gtattcatat 1980ggttcggtca aaactgcact gcctcgctga
aacaatccac gtctgtatga tactcgtttc 2040agaatttttt tgattttctg ccggatatgg
tttctcatct ttacaatcgc attcttaatt 2100ataccagaac gtaattcaat gatcccagtg
actcgtaact cttatatgtc aatttaagc 215946870DNAArtificial
SequenceSequence of the 5'-Region used for knock out of PpBMT2
46ggccgagcgg gcctagattt tcactacaaa tttcaaaact acgcggattt attgtctcag
60agagcaattt ggcatttctg agcgtagcag gaggcttcat aagattgtat aggaccgtac
120caacaaattg ccgaggcaca acacggtatg ctgtgcactt atgtggctac ttccctacaa
180cggaatgaaa ccttcctctt tccgcttaaa cgagaaagtg tgtcgcaatt gaatgcaggt
240gcctgtgcgc cttggtgtat tgtttttgag ggcccaattt atcaggcgcc ttttttcttg
300gttgttttcc cttagcctca agcaaggttg gtctatttca tctccgcttc tataccgtgc
360ctgatactgt tggatgagaa cacgactcaa cttcctgctg ctctgtattg ccagtgtttt
420gtctgtgatt tggatcggag tcctccttac ttggaatgat aataatcttg gcggaatctc
480cctaaacgga ggcaaggatt ctgcctatga tgatctgcta tcattgggaa gcttcaacga
540catggaggtc gactcctatg tcaccaacat ctacgacaat gctccagtgc taggatgtac
600ggatttgtct tatcatggat tgttgaaagt caccccaaag catgacttag cttgcgattt
660ggagttcata agagctcaga ttttggacat tgacgtttac tccgccataa aagacttaga
720agataaagcc ttgactgtaa aacaaaaggt tgaaaaacac tggtttacgt tttatggtag
780ttcagtcttt ctgcccgaac acgatgtgca ttacctggtt agacgagtca tcttttcggc
840tgaaggaaag gcgaactctc cagtaacatc
870471733DNAArtificial SequenceSequence of the 3'-Region used for knock
out of PpBMT2 47ccatatgatg ggtgtttgct cactcgtatg gatcaaaatt
ccatggtttc ttctgtacaa 60cttgtacact tatttggact tttctaacgg tttttctggt
gatttgagaa gtccttattt 120tggtgttcgc agcttatccg tgattgaacc atcagaaata
ctgcagctcg ttatctagtt 180tcagaatgtg ttgtagaata caatcaattc tgagtctagt
ttgggtgggt cttggcgacg 240ggaccgttat atgcatctat gcagtgttaa ggtacataga
atgaaaatgt aggggttaat 300cgaaagcatc gttaatttca gtagaacgta gttctattcc
ctacccaaat aatttgccaa 360gaatgcttcg tatccacata cgcagtggac gtagcaaatt
tcactttgga ctgtgacctc 420aagtcgttat cttctacttg gacattgatg gtcattacgt
aatccacaaa gaattggata 480gcctctcgtt ttatctagtg cacagcctaa tagcacttaa
gtaagagcaa tggacaaatt 540tgcatagaca ttgagctaga tacgtaactc agatcttgtt
cactcatggt gtactcgaag 600tactgctgga accgttacct cttatcattt cgctactggc
tcgtgaaact actggatgaa 660aaaaaaaaaa gagctgaaag cgagatcatc ccattttgtc
atcatacaaa ttcacgcttg 720cagttttgct tcgttaacaa gacaagatgt ctttatcaaa
gacccgtttt ttcttcttga 780agaatacttc cctgttgagc acatgcaaac catatttatc
tcagatttca ctcaacttgg 840gtgcttccaa gagaagtaaa attcttccca ctgcatcaac
ttccaagaaa cccgtagacc 900agtttctctt cagccaaaag aagttgctcg ccgatcaccg
cggtaacaga ggagtcagaa 960ggtttcacac ccttccatcc cgatttcaaa gtcaaagtgc
tgcgttgaac caaggttttc 1020aggttgccaa agcccagtct gcaaaaacta gttccaaatg
gcctattaat tcccataaaa 1080gtgttggcta cgtatgtatc ggtacctcca ttctggtatt
tgctattgtt gtcgttggtg 1140ggttgactag actgaccgaa tccggtcttt ccataacgga
gtggaaacct atcactggtt 1200cggttccccc actgactgag gaagactgga agttggaatt
tgaaaaatac aaacaaagcc 1260ctgagtttca ggaactaaat tctcacataa cattggaaga
gttcaagttt atattttcca 1320tggaatgggg acatagattg ttgggaaggg tcatcggcct
gtcgtttgtt cttcccacgt 1380tttacttcat tgcccgtcga aagtgttcca aagatgttgc
attgaaactg cttgcaatat 1440gctctatgat aggattccaa ggtttcatcg gctggtggat
ggtgtattcc ggattggaca 1500aacagcaatt ggctgaacgt aactccaaac caactgtgtc
tccatatcgc ttaactaccc 1560atcttggaac tgcatttgtt atttactgtt acatgattta
cacagggctt caagttttga 1620agaactataa gatcatgaaa cagcctgaag cgtatgttca
aattttcaag caaattgcgt 1680ctccaaaatt gaaaactttc aagagactct cttcagttct
attaggcctg gtg 173348981DNAArtificial SequenceDNA encodes
MmSLC35A3 UDP-GlcNAc transporter 48atgtctgcca acctaaaata tctttccttg
ggaattttgg tgtttcagac taccagtctg 60gttctaacga tgcggtattc taggacttta
aaagaggagg ggcctcgtta tctgtcttct 120acagcagtgg ttgtggctga atttttgaag
ataatggcct gcatcttttt agtctacaaa 180gacagtaagt gtagtgtgag agcactgaat
agagtactgc atgatgaaat tcttaataag 240cccatggaaa ccctgaagct cgctatcccg
tcagggatat atactcttca gaacaactta 300ctctatgtgg cactgtcaaa cctagatgca
gccacttacc aggttacata tcagttgaaa 360atacttacaa cagcattatt ttctgtgtct
atgcttggta aaaaattagg tgtgtaccag 420tggctctccc tagtaattct gatggcagga
gttgcttttg tacagtggcc ttcagattct 480caagagctga actctaagga cctttcaaca
ggctcacagt ttgtaggcct catggcagtt 540ctcacagcct gtttttcaag tggctttgct
ggagtttatt ttgagaaaat cttaaaagaa 600acaaaacagt cagtatggat aaggaacatt
caacttggtt tctttggaag tatatttgga 660ttaatgggtg tatacgttta tgatggagaa
ttggtctcaa agaatggatt ttttcaggga 720tataatcaac tgacgtggat agttgttgct
ctgcaggcac ttggaggcct tgtaatagct 780gctgtcatca aatatgcaga taacatttta
aaaggatttg cgacctcctt atccataata 840ttgtcaacaa taatatctta tttttggttg
caagattttg tgccaaccag tgtctttttc 900cttggagcca tccttgtaat agcagctact
ttcttgtatg gttacgatcc caaacctgca 960ggaaatccca ctaaagcata g
981
49 1128 DNA Artificial Sequence Sequence of the 5'-Region used for knock out of PpMNN4L1
49 gatctggcca
ttgtgaaact tgacactaaa gacaaaactc ttagagtttc caatcactta 60ggagacgatg
tttcctacaa cgagtacgat ccctcattga tcatgagcaa tttgtatgtg 120aaaaaagtca
tcgaccttga caccttggat aaaagggctg gaggaggtgg aaccacctgt 180gcaggcggtc
tgaaagtgtt caagtacgga tctactacca aatatacatc tggtaacctg 240aacggcgtca
ggttagtata ctggaacgaa ggaaagttgc aaagctccaa atttgtggtt 300cgatcctcta
attactctca aaagcttgga ggaaacagca acgccgaatc aattgacaac 360aatggtgtgg
gttttgcctc agctggagac tcaggcgcat ggattctttc caagctacaa 420gatgttaggg
agtaccagtc attcactgaa aagctaggtg aagctacgat gagcattttc 480gatttccacg
gtcttaaaca ggagacttct actacagggc ttggggtagt tggtatgatt 540cattcttacg
acggtgagtt caaacagttt ggtttgttca ctccaatgac atctattcta 600caaagacttc
aacgagtgac caatgtagaa tggtgtgtag cgggttgcga agatggggat 660gtggacactg
aaggagaaca cgaattgagt gatttggaac aactgcatat gcatagtgat 720tccgactagt
caggcaagag agagccctca aatttacctc tctgcccctc ctcactcctt 780ttggtacgca
taattgcagt ataaagaact tgctgccagc cagtaatctt atttcatacg 840cagttctata
tagcacataa tcttgcttgt atgtatgaaa tttaccgcgt tttagttgaa 900attgtttatg
ttgtgtgcct tgcatgaaat ctctcgttag ccctatcctt acatttaact 960ggtctcaaaa
cctctaccaa ttccattgct gtacaacaat atgaggcggc attactgtag 1020ggttggaaaa
aaattgtcat tccagctaga gatcacacga cttcatcacg cttattgctc 1080ctcattgcta
aatcatttac tcttgacttc gacccagaaa agttcgcc
1128
50 1231 DNA Artificial Sequence Sequence of the 3'-Region used for knock out of PpMNN4L1
50 gcatgtcaaa cttgaacaca acgactagat agttgttttt
tctatataaa acgaaacgtt 60atcatcttta ataatcattg aggtttaccc ttatagttcc
gtattttcgt ttccaaactt 120agtaatcttt tggaaatatc atcaaagctg gtgccaatct
tcttgtttga agtttcaaac 180tgctccacca agctacttag agactgttct aggtctgaag
caacttcgaa cacagagaca 240gctgccgccg attgttcttt tttgtgtttt tcttctggaa
gaggggcatc atcttgtatg 300tccaatgccc gtatcctttc tgagttgtcc gacacattgt
ccttcgaaga gtttcctgac 360attgggcttc ttctatccgt gtattaattt tgggttaagt
tcctcgtttg catagcagtg 420gatacctcga tttttttggc tcctatttac ctgacataat
attctactat aatccaactt 480ggacgcgtca tctatgataa ctaggctctc ctttgttcaa
aggggacgtc ttcataatcc 540actggcacga agtaagtctg caacgaggcg gcttttgcaa
cagaacgata gtgtcgtttc 600gtacttggac tatgctaaac aaaaggatct gtcaaacatt
tcaaccgtgt ttcaaggcac 660tctttacgaa ttatcgacca agaccttcct agacgaacat
ttcaacatat ccaggctact 720gcttcaaggt ggtgcaaatg ataaaggtat agatattaga
tgtgtttggg acctaaaaca 780gttcttgcct gaagattccc ttgagcaaca ggcttcaata
gccaagttag agaagcagta 840ccaaatcggt aacaaaaggg ggaagcatat aaaaccttta
ctattgcgac aaaatccatc 900cttgaaagta aagctgtttg ttcaatgtaa agcatacgaa
acgaaggagg tagatcctaa 960gatggttaga gaacttaacg ggacatactc cagctgcatc
ccatattacg atcgctggaa 1020gacttttttc atgtacgtat cgcccaccaa cctttcaaag
caagctaggt atgattttga 1080cagttctcac aatccattgg ttttcatgca acttgaaaaa
acccaactca aacttcatgg 1140ggatccatac aatgtaaatc attacgagag ggcgaggttg
aaaagtttcc attgcaatca 1200cgtcgcatca tggctactga aaggccttaa c
1231
51 937 DNA Artificial Sequence Sequence of the 5'-Region used for knock out of PpPNO1 and PpMNN4
51 tcattctata
tgttcaagaa aagggtagtg aaaggaaaga aaaggcatat aggcgaggga 60gagttagcta
gcatacaaga taatgaagga tcaatagcgg tagttaaagt gcacaagaaa 120agagcacctg
ttgaggctga tgataaagct ccaattacat tgccacagag aaacacagta 180acagaaatag
gaggggatgc accacgagaa gagcattcag tgaacaactt tgccaaattc 240ataaccccaa
gcgctaataa gccaatgtca aagtcggcta ctaacattaa tagtacaaca 300actatcgatt
ttcaaccaga tgtttgcaag gactacaaac agacaggtta ctgcggatat 360ggtgacactt
gtaagttttt gcacctgagg gatgatttca aacagggatg gaaattagat 420agggagtggg
aaaatgtcca aaagaagaag cataatactc tcaaaggggt taaggagatc 480caaatgttta
atgaagatga gctcaaagat atcccgttta aatgcattat atgcaaagga 540gattacaaat
cacccgtgaa aacttcttgc aatcattatt tttgcgaaca atgtttcctg 600caacggtcaa
gaagaaaacc aaattgtatt atatgtggca gagacacttt aggagttgct 660ttaccagcaa
agaagttgtc ccaatttctg gctaagatac ataataatga aagtaataaa 720gtttagtaat
tgcattgcgt tgactattga ttgcattgat gtcgtgtgat actttcaccg 780aaaaaaaaca
cgaagcgcaa taggagcggt tgcatattag tccccaaagc tatttaattg 840tgcctgaaac
tgttttttaa gctcatcaag cataattgta tgcattgcga cgtaaccaac 900gtttaggcgc
agtttaatca tagcccactg ctaagcc
937
52 1906 DNA Artificial Sequence Sequence of the 3'-Region used for knock out of PpPNO1 and PpMNN4
52 cggaggaatg caaataataa tctccttaat
tacccactga taagctcaag agacgcggtt 60tgaaaacgat ataatgaatc atttggattt
tataataaac cctgacagtt tttccactgt 120attgttttaa cactcattgg aagctgtatt
gattctaaga agctagaaat caatacggcc 180atacaaaaga tgacattgaa taagcaccgg
cttttttgat tagcatatac cttaaagcat 240gcattcatgg ctacatagtt gttaaagggc
ttcttccatt atcagtataa tgaattacat 300aatcatgcac ttatatttgc ccatctctgt
tctctcactc ttgcctgggt atattctatg 360aaattgcgta tagcgtgtct ccagttgaac
cccaagcttg gcgagtttga agagaatgct 420aaccttgcgt attccttgct tcaggaaaca
ttcaaggaga aacaggtcaa gaagccaaac 480attttgatcc ttcccgagtt agcattgact
ggctacaatt ttcaaagcca gcagcggata 540gagccttttt tggaggaaac aaccaaggga
gctagtaccc aatgggctca aaaagtatcc 600aagacgtggg attgctttac tttaatagga
tacccagaaa aaagtttaga gagccctccc 660cgtatttaca acagtgcggt acttgtatcg
cctcagggaa aagtaatgaa caactacaga 720aagtccttct tgtatgaagc tgatgaacat
tggggatgtt cggaatcttc tgatgggttt 780caaacagtag atttattaat tgaaggaaag
actgtaaaga catcatttgg aatttgcatg 840gatttgaatc cttataaatt tgaagctcca
ttcacagact tcgagttcag tggccattgc 900ttgaaaaccg gtacaagact cattttgtgc
ccaatggcct ggttgtcccc tctatcgcct 960tccattaaaa aggatcttag tgatatagag
aaaagcagac ttcaaaagtt ctaccttgaa 1020aaaatagata ccccggaatt tgacgttaat
tacgaattga aaaaagatga agtattgccc 1080acccgtatga atgaaacgtt ggaaacaatt
gactttgagc cttcaaaacc ggactactct 1140aatataaatt attggatact aaggtttttt
ccctttctga ctcatgtcta taaacgagat 1200gtgctcaaag agaatgcagt tgcagtctta
tgcaaccgag ttggcattga gagtgatgtc 1260ttgtacggag gatcaaccac gattctaaac
ttcaatggta agttagcatc gacacaagag 1320gagctggagt tgtacgggca gactaatagt
ctcaacccca gtgtggaagt attgggggcc 1380cttggcatgg gtcaacaggg aattctagta
cgagacattg aattaacata atatacaata 1440tacaataaac acaaataaag aatacaagcc
tgacaaaaat tcacaaatta ttgcctagac 1500ttgtcgttat cagcagcgac ctttttccaa
tgctcaattt cacgatatgc cttttctagc 1560tctgctttaa gcttctcatt ggaattggct
aactcgttga ctgcttggtc agtgatgagt 1620ttctccaagg tccatttctc gatgttgttg
ttttcgtttt cctttaatct cttgatataa 1680tcaacagcct tctttaatat ctgagccttg
ttcgagtccc ctgttggcaa cagagcggcc 1740agttccttta ttccgtggtt tatattttct
cttctacgcc tttctacttc tttgtgattc 1800tctttacgca tcttatgcca ttcttcagaa
ccagtggctg gcttaaccga atagccagag 1860cctgaagaag ccgcactaga agaagcagtg
gcattgttga ctatgg 1906
53 1224 DNA Artificial Sequence DNA encodes human GnTI catalytic domain (NA) Codon-optimized
53 tcagtcagtg ctcttgatgg tgacccagca agtttgacca gagaagtgat tagattggcc
60caagacgcag aggtggagtt ggagagacaa cgtggactgc tgcagcaaat cggagatgca
120ttgtctagtc aaagaggtag ggtgcctacc gcagctcctc cagcacagcc tagagtgcat
180gtgacccctg caccagctgt gattcctatc ttggtcatcg cctgtgacag atctactgtt
240agaagatgtc tggacaagct gttgcattac agaccatctg ctgagttgtt ccctatcatc
300gttagtcaag actgtggtca cgaggagact gcccaagcca tcgcctccta cggatctgct
360gtcactcaca tcagacagcc tgacctgtca tctattgctg tgccaccaga ccacagaaag
420ttccaaggtt actacaagat cgctagacac tacagatggg cattgggtca agtcttcaga
480cagtttagat tccctgctgc tgtggtggtg gaggatgact tggaggtggc tcctgacttc
540tttgagtact ttagagcaac ctatccattg ctgaaggcag acccatccct gtggtgtgtc
600tctgcctgga atgacaacgg taaggagcaa atggtggacg cttctaggcc tgagctgttg
660tacagaaccg acttctttcc tggtctggga tggttgctgt tggctgagtt gtgggctgag
720ttggagccta agtggccaaa ggcattctgg gacgactgga tgagaagacc tgagcaaaga
780cagggtagag cctgtatcag acctgagatc tcaagaacca tgacctttgg tagaaaggga
840gtgtctcacg gtcaattctt tgaccaacac ttgaagttta tcaagctgaa ccagcaattt
900gtgcacttca cccaactgga cctgtcttac ttgcagagag aggcctatga cagagatttc
960ctagctagag tctacggagc tcctcaactg caagtggaga aagtgaggac caatgacaga
1020aaggagttgg gagaggtgag agtgcagtac actggtaggg actcctttaa ggctttcgct
1080aaggctctgg gtgtcatgga tgaccttaag tctggagttc ctagagctgg ttacagaggt
1140attgtcacct ttcaattcag aggtagaaga gtccacttgg ctcctccacc tacttgggag
1200ggttatgatc cttcttggaa ttag
1224
54 99 DNA Artificial Sequence DNA encodes Pp SEC12 (10)
54 atgcccagaa
aaatatttaa ctacttcatt ttgactgtat tcatggcaat tcttgctatt 60gttttacaat
ggtctataga gaatggacat gggcgcgcc
99
55 435 DNA Artificial Sequence Sequence of the PpSEC4 promoter
55 gaagtaaagt
tggcgaaact ttgggaacct ttggttaaaa ctttgtaatt tttgtcgcta 60cccattaggc
agaatctgca tcttgggagg gggatgtggt ggcgttctga gatgtacgcg 120aagaatgaag
agccagtggt aacaacaggc ctagagagat acgggcataa tgggtataac 180ctacaagtta
agaatgtagc agccctggaa accagattga aacgaaaaac gaaatcattt 240aaactgtagg
atgttttggc tcattgtctg gaaggctggc tgtttattgc cctgttcttt 300gcatgggaat
aagctattat atccctcaca taatcccaga aaatagattg aagcaacgcg 360aaatccttac
gtatcgaagt agccttctta cacattcacg ttgtacggat aagaaaacta 420ctcaaacgaa
caatc
435
56 404 DNA Artificial Sequence Sequence of the PpOCH1 terminator
56 aatagatata gcgagattag agaatgaata ccttcttcta agcgatcgtc cgtcatcata
60gaatatcatg gactgtatag tttttttttt gtacatataa tgattaaacg gtcatccaac
120atctcgttga cagatctctc agtacgcgaa atccctgact atcaaagcaa gaaccgatga
180agaaaaaaac aacagtaacc caaacaccac aacaaacact ttatcttctc ccccccaaca
240ccaatcatca aagagatgtc ggaacacaaa caccaagaag caaaaactaa ccccatataa
300aaacatcctg gtagataatg ctggtaaccc gctctccttc catattctgg gctacttcac
360gaagtctgac cggtctcagt tgatcaacat gatcctcgaa atgg
404
57 1407 DNA Artificial Sequence DNA encodes Mm ManI catalytic domain (FB)
57 gagcccgctg acgccaccat ccgtgagaag agggcaaaga tcaaagagat gatgacccat
60gcttggaata attataaacg ctatgcgtgg ggcttgaacg aactgaaacc tatatcaaaa
120gaaggccatt caagcagttt gtttggcaac atcaaaggag ctacaatagt agatgccctg
180gatacccttt tcattatggg catgaagact gaatttcaag aagctaaatc gtggattaaa
240aaatatttag attttaatgt gaatgctgaa gtttctgttt ttgaagtcaa catacgcttc
300gtcggtggac tgctgtcagc ctactatttg tccggagagg agatatttcg aaagaaagca
360gtggaacttg gggtaaaatt gctacctgca tttcatactc cctctggaat accttgggca
420ttgctgaata tgaaaagtgg gatcgggcgg aactggccct gggcctctgg aggcagcagt
480atcctggccg aatttggaac tctgcattta gagtttatgc acttgtccca cttatcagga
540gacccagtct ttgccgaaaa ggttatgaaa attcgaacag tgttgaacaa actggacaaa
600ccagaaggcc tttatcctaa ctatctgaac cccagtagtg gacagtgggg tcaacatcat
660gtgtcggttg gaggacttgg agacagcttt tatgaatatt tgcttaaggc gtggttaatg
720tctgacaaga cagatctcga agccaagaag atgtattttg atgctgttca ggccatcgag
780actcacttga tccgcaagtc aagtggggga ctaacgtaca tcgcagagtg gaaggggggc
840ctcctggaac acaagatggg ccacctgacg tgctttgcag gaggcatgtt tgcacttggg
900gcagatggag ctccggaagc ccgggcccaa cactaccttg aactcggagc tgaaattgcc
960cgcacttgtc atgaatctta taatcgtaca tatgtgaagt tgggaccgga agcgtttcga
1020tttgatggcg gtgtggaagc tattgccacg aggcaaaatg aaaagtatta catcttacgg
1080cccgaggtca tcgagacata catgtacatg tggcgactga ctcacgaccc caagtacagg
1140acctgggcct gggaagccgt ggaggctcta gaaagtcact gcagagtgaa cggaggctac
1200tcaggcttac gggatgttta cattgcccgt gagagttatg acgatgtcca gcaaagtttc
1260ttcctggcag agacactgaa gtatttgtac ttgatatttt ccgatgatga ccttcttcca
1320ctagaacact ggatcttcaa caccgaggct catcctttcc ctatactccg tgaacagaag
1380aaggaaattg atggcaaaga gaaatga
1407
58 318 DNA Artificial Sequence DNA encodes ScSEC12 (8)
58 atgaacacta
tccacataat aaaattaccg cttaactacg ccaactacac ctcaatgaaa 60caaaaaatct
ctaaattttt caccaacttc atccttattg tgctgctttc ttacatttta 120cagttctcct
ataagcacaa tttgcattcc atgcttttca attacgcgaa ggacaatttt 180ctaacgaaaa
gagacaccat ctcttcgccc tacgtagttg atgaagactt acatcaaaca 240actttgtttg
gcaaccacgg tacaaaaaca tctgtaccta gcgtagattc cataaaagtg 300catggcgtgg
ggcgcgcc
318
59 1250 DNA Artificial Sequence Sequence of the 5'-region that was used to knock into the PpADE1 locus
59 gagtcggcca agagatgata actgttacta
agcttctccg taattagtgg tattttgtaa 60cttttaccaa taatcgttta tgaatacgga
tatttttcga ccttatccag tgccaaatca 120cgtaacttaa tcatggttta aatactccac
ttgaacgatt cattattcag aaaaaagtca 180ggttggcaga aacacttggg cgctttgaag
agtataagag tattaagcat taaacatctg 240aactttcacc gccccaatat actactctag
gaaactcgaa aaattccttt ccatgtgtca 300tcgcttccaa cacactttgc tgtatccttc
caagtatgtc cattgtgaac actgatctgg 360acggaatcct acctttaatc gccaaaggaa
aggttagaga catttatgca gtcgatgaga 420acaacttgct gttcgtcgca actgaccgta
tctccgctta cgatgtgatt atgacaaacg 480gtattcctga taagggaaag attttgactc
agctctcagt tttctggttt gattttttgg 540caccctacat aaagaatcat ttggttgctt
ctaatgacaa ggaagtcttt gctttactac 600catcaaaact gtctgaagaa aaatacaaat
ctcaattaga gggacgatcc ttgatagtaa 660aaaagcacag actgatacct ttggaagcca
ttgtcagagg ttacatcact ggaagtgcat 720ggaaagagta caagaactca aaaactgtcc
atggagtcaa ggttgaaaac gagaaccttc 780aagagagcga cgcctttcca actccgattt
tcacaccttc aacgaaagct gaacagggtg 840aacacgatga aaacatctct attgaacaag
ctgctgagat tgtaggtaaa gacatttgtg 900agaaggtcgc tgtcaaggcg gtcgagttgt
attctgctgc aaaaaacctc gcccttttga 960aggggatcat tattgctgat acgaaattcg
aatttggact ggacgaaaac aatgaattgg 1020tactagtaga tgaagtttta actccagatt
cttctagatt ttggaatcaa aagacttacc 1080aagtgggtaa atcgcaagag agttacgata
agcagtttct cagagattgg ttgacggcca 1140acggattgaa tggcaaagag ggcgtagcca
tggatgcaga aattgctatc aagagtaaag 1200aaaagtatat tgaagcttat gaagcaatta
ctggcaagaa atgggcttga 1250
60 882 DNA Artificial Sequence Sequence of the 3'-region that was used to knock into the PpADE1 locus
60 atgattagta ccctcctcgc ctttttcaga catctgaaat ttcccttatt
cttccaattc 60catataaaat cctatttagg taattagtaa acaatgatca taaagtgaaa
tcattcaagt 120aaccattccg tttatcgttg atttaaaatc aataacgaat gaatgtcggt
ctgagtagtc 180aatttgttgc cttggagctc attggcaggg ggtcttttgg ctcagtatgg
aaggttgaaa 240ggaaaacaga tggaaagtgg ttcgtcagaa aagaggtatc ctacatgaag
atgaatgcca 300aagagatatc tcaagtgata gctgagttca gaattcttag tgagttaagc
catcccaaca 360ttgtgaagta ccttcatcac gaacatattt ctgagaataa aactgtcaat
ttatacatgg 420aatactgtga tggtggagat ctctccaagc tgattcgaac acatagaagg
aacaaagagt 480acatttcaga agaaaaaata tggagtattt ttacgcaggt tttattagca
ttgtatcgtt 540gtcattatgg aactgatttc acggcttcaa aggagtttga atcgctcaat
aaaggtaata 600gacgaaccca gaatccttcg tgggtagact cgacaagagt tattattcac
agggatataa 660aacccgacaa catctttctg atgaacaatt caaaccttgt caaactggga
gattttggat 720tagcaaaaat tctggaccaa gaaaacgatt ttgccaaaac atacgtcggt
acgccgtatt 780acatgtctcc tgaagtgctg ttggaccaac cctactcacc attatgtgat
atatggtctc 840ttgggtgcgt catgtatgag ctatgtgcat tgaggcctcc tt
882
61 2100 DNA Artificial Sequence DNA encodes ScGAL10
61 atgacagctc agttacaaag tgaaagtact tctaaaattg ttttggttac aggtggtgct
60ggatacattg gttcacacac tgtggtagag ctaattgaga atggatatga ctgtgttgtt
120gctgataacc tgtcgaattc aacttatgat tctgtagcca ggttagaggt cttgaccaag
180catcacattc ccttctatga ggttgatttg tgtgaccgaa aaggtctgga aaaggttttc
240aaagaatata aaattgattc ggtaattcac tttgctggtt taaaggctgt aggtgaatct
300acacaaatcc cgctgagata ctatcacaat aacattttgg gaactgtcgt tttattagag
360ttaatgcaac aatacaacgt ttccaaattt gttttttcat cttctgctac tgtctatggt
420gatgctacga gattcccaaa tatgattcct atcccagaag aatgtccctt agggcctact
480aatccgtatg gtcatacgaa atacgccatt gagaatatct tgaatgatct ttacaatagc
540gacaaaaaaa gttggaagtt tgctatcttg cgttatttta acccaattgg cgcacatccc
600tctggattaa tcggagaaga tccgctaggt ataccaaaca atttgttgcc atatatggct
660caagtagctg ttggtaggcg cgagaagctt tacatcttcg gagacgatta tgattccaga
720gatggtaccc cgatcaggga ttatatccac gtagttgatc tagcaaaagg tcatattgca
780gccctgcaat acctagaggc ctacaatgaa aatgaaggtt tgtgtcgtga gtggaacttg
840ggttccggta aaggttctac agtttttgaa gtttatcatg cattctgcaa agcttctggt
900attgatcttc catacaaagt tacgggcaga agagcaggtg atgttttgaa cttgacggct
960aaaccagata gggccaaacg cgaactgaaa tggcagaccg agttgcaggt tgaagactcc
1020tgcaaggatt tatggaaatg gactactgag aatccttttg gttaccagtt aaggggtgtc
1080gaggccagat tttccgctga agatatgcgt tatgacgcaa gatttgtgac tattggtgcc
1140ggcaccagat ttcaagccac gtttgccaat ttgggcgcca gcattgttga cctgaaagtg
1200aacggacaat cagttgttct tggctatgaa aatgaggaag ggtatttgaa tcctgatagt
1260gcttatatag gcgccacgat cggcaggtat gctaatcgta tttcgaaggg taagtttagt
1320ttatgcaaca aagactatca gttaaccgtt aataacggcg ttaatgcgaa tcatagtagt
1380atcggttctt tccacagaaa aagatttttg ggacccatca ttcaaaatcc ttcaaaggat
1440gtttttaccg ccgagtacat gctgatagat aatgagaagg acaccgaatt tccaggtgat
1500ctattggtaa ccatacagta tactgtgaac gttgcccaaa aaagtttgga aatggtatat
1560aaaggtaaat tgactgctgg tgaagcgacg ccaataaatt taacaaatca tagttatttc
1620aatctgaaca agccatatgg agacactatt gagggtacgg agattatggt gcgttcaaaa
1680aaatctgttg atgtcgacaa aaacatgatt cctacgggta atatcgtcga tagagaaatt
1740gctaccttta actctacaaa gccaacggtc ttaggcccca aaaatcccca gtttgattgt
1800tgttttgtgg tggatgaaaa tgctaagcca agtcaaatca atactctaaa caatgaattg
1860acgcttattg tcaaggcttt tcatcccgat tccaatatta cattagaagt tttaagtaca
1920gagccaactt atcaatttta taccggtgat ttcttgtctg ctggttacga agcaagacaa
1980ggttttgcaa ttgagcctgg tagatacatt gatgctatca atcaagagaa ctggaaagat
2040tgtgtaacct tgaaaaacgg tgaaacttac gggtccaaga ttgtctacag attttcctga
2100
62 512 DNA Artificial Sequence Sequence of the PpPMA1 terminator
62 taagcttcac gatttgtgtt ccagtttatc ccccctttat ataccgttaa ccctttccct
60gttgagctga ctgttgttgt attaccgcaa tttttccaag tttgccatgc ttttcgtgtt
120atttgaccga tgtctttttt cccaaatcaa actatatttg ttaccattta aaccaagtta
180tcttttgtat taagagtcta agtttgttcc caggcttcat gtgagagtga taaccatcca
240gactatgatt cttgtttttt attgggtttg tttgtgtgat acatctgagt tgtgattcgt
300aaagtatgtc agtctatcta gatttttaat agttaattgg taatcaatga cttgtttgtt
360ttaactttta aattgtgggt cgtatccacg cgtttagtat agctgttcat ggctgttaga
420ggagggcgat gtttatatac agaggacaag aatgaggagg cggcgtgtat ttttaaaatg
480gagacgcgac tcctgtacac cttatcggtt gg
512
63 1068 DNA Artificial Sequence hGalT codon optimized (XB)
63 ggtagagatt
tgtctagatt gccacagttg gttggtgttt ccactccatt gcaaggaggt 60tctaactctg
ctgctgctat tggtcaatct tccggtgagt tgagaactgg tggagctaga 120ccacctccac
cattgggagc ttcctctcaa ccaagaccag gtggtgattc ttctccagtt 180gttgactctg
gtccaggtcc agcttctaac ttgacttccg ttccagttcc acacactact 240gctttgtcct
tgccagcttg tccagaagaa tccccattgt tggttggtcc aatgttgatc 300gagttcaaca
tgccagttga cttggagttg gttgctaagc agaacccaaa cgttaagatg 360ggtggtagat
acgctccaag agactgtgtt tccccacaca aagttgctat catcatccca 420ttcagaaaca
gacaggagca cttgaagtac tggttgtact acttgcaccc agttttgcaa 480agacagcagt
tggactacgg tatctacgtt atcaaccagg ctggtgacac tattttcaac 540agagctaagt
tgttgaatgt tggtttccag gaggctttga aggattacga ctacacttgt 600ttcgttttct
ccgacgttga cttgattcca atgaacgacc acaacgctta cagatgtttc 660tcccagccaa
gacacatttc tgttgctatg gacaagttcg gtttctcctt gccatacgtt 720caatacttcg
gtggtgtttc cgctttgtcc aagcagcagt tcttgactat caacggtttc 780ccaaacaatt
actggggatg gggtggtgaa gatgacgaca tctttaacag attggttttc 840agaggaatgt
ccatctctag accaaacgct gttgttggta gatgtagaat gatcagacac 900tccagagaca
agaagaacga gccaaaccca caaagattcg acagaatcgc tcacactaag 960gaaactatgt
tgtccgacgg attgaactcc ttgacttacc aggttttgga cgttcagaga 1020tacccattgt
acactcagat cactgttgac atcggtactc catcctag
1068
64 183 DNA Artificial Sequence DNA encodes ScMnt1 (Kre2) (33)
64 atggccctct ttctcagtaa gagactgttg agatttaccg tcattgcagg tgcggttatt
60gttctcctcc taacattgaa ttccaacagt agaactcagc aatatattcc gagttccatc
120tccgctgcat ttgattttac ctcaggatct atatcccctg aacaacaagt catcgggcgc
180gcc
183
65 1074 DNA Artificial Sequence DNA encodes DmUGT
65 atgaatagca tacacatgaa
cgccaatacg ctgaagtaca tcagcctgct gacgctgacc 60ctgcagaatg ccatcctggg
cctcagcatg cgctacgccc gcacccggcc aggcgacatc 120ttcctcagct ccacggccgt
actcatggca gagttcgcca aactgatcac gtgcctgttc 180ctggtcttca acgaggaggg
caaggatgcc cagaagtttg tacgctcgct gcacaagacc 240atcattgcga atcccatgga
cacgctgaag gtgtgcgtcc cctcgctggt ctatatcgtt 300caaaacaatc tgctgtacgt
ctctgcctcc catttggatg cggccaccta ccaggtgacg 360taccagctga agattctcac
cacggccatg ttcgcggttg tcattctgcg ccgcaagctg 420ctgaacacgc agtggggtgc
gctgctgctc ctggtgatgg gcatcgtcct ggtgcagttg 480gcccaaacgg agggtccgac
gagtggctca gccggtggtg ccgcagctgc agccacggcc 540gcctcctctg gcggtgctcc
cgagcagaac aggatgctcg gactgtgggc cgcactgggc 600gcctgcttcc tctccggatt
cgcgggcatc tactttgaga agatcctcaa gggtgccgag 660atctccgtgt ggatgcggaa
tgtgcagttg agtctgctca gcattccctt cggcctgctc 720acctgtttcg ttaacgacgg
cagtaggatc ttcgaccagg gattcttcaa gggctacgat 780ctgtttgtct ggtacctggt
cctgctgcag gccggcggtg gattgatcgt tgccgtggtg 840gtcaagtacg cggataacat
tctcaagggc ttcgccacct cgctggccat catcatctcg 900tgcgtggcct ccatatacat
cttcgacttc aatctcacgc tgcagttcag cttcggagct 960ggcctggtca tcgcctccat
atttctctac ggctacgatc cggccaggtc ggcgccgaag 1020ccaactatgc atggtcctgg
cggcgatgag gagaagctgc tgccgcgcgt ctag 1074
66 798 DNA Artificial Sequence Sequence of the PpOCH1 promoter
66 tggacacagg agactcagaa
acagacacag agcgttctga gtcctggtgc tcctgacgta 60ggcctagaac aggaattatt
ggctttattt gtttgtccat ttcataggct tggggtaata 120gatagatgac agagaaatag
agaagaccta atattttttg ttcatggcaa atcgcgggtt 180cgcggtcggg tcacacacgg
agaagtaatg agaagagctg gtaatctggg gtaaaagggt 240tcaaaagaag gtcgcctggt
agggatgcaa tacaaggttg tcttggagtt tacattgacc 300agatgatttg gctttttctc
tgttcaattc acatttttca gcgagaatcg gattgacgga 360gaaatggcgg ggtgtggggt
ggatagatgg cagaaatgct cgcaatcacc gcgaaagaaa 420gactttatgg aatagaacta
ctgggtggtg taaggattac atagctagtc caatggagtc 480cgttggaaag gtaagaagaa
gctaaaaccg gctaagtaac tagggaagaa tgatcagact 540ttgatttgat gaggtctgaa
aatactctgc tgctttttca gttgcttttt ccctgcaacc 600tatcattttc cttttcataa
gcctgccttt tctgttttca cttatatgag ttccgccgag 660acttccccaa attctctcct
ggaacattct ctatcgctct ccttccaagt tgcgccccct 720ggcactgcct agtaatatta
ccacgcgact tatattcagt tccacaattt ccagtgttcg 780tagcaaatat catcagcc
798
67 302 DNA Artificial Sequence Sequence of the PpALG12 terminator
67 aatatatacc tcatttgttc
aatttggtgt aaagagtgtg gcggatagac ttcttgtaaa 60tcaggaaagc tacaattcca
attgctgcaa aaaataccaa tgcccataaa ccagtatgag 120cggtgccttc gacggattgc
ttactttccg accctttgtc gtttgattct tctgcctttg 180gtgagtcagt ttgtttcgac
tttatatctg actcatcaac ttcctttacg gttgcgtttt 240taatcataat tttagccgtt
ggcttattat cccttgagtt ggtaggagtt ttgatgatgc 300tg
302
68 461 DNA Artificial Sequence Sequence of the 5'-Region used for knock out of PpHIS1
68 taactggccc tttgacgttt ctgacaatag ttctagagga gtcgtccaaa aactcaactc
60tgacttgggt gacaccacca cgggatccgg ttcttccgag gaccttgatg accttggcta
120atgtaactgg agttttagta tccattttaa gatgtgtgtt tctgtaggtt ctgggttgga
180aaaaaatttt agacaccaga agagaggagt gaactggttt gcgtgggttt agactgtgta
240aggcactact ctgtcgaagt tttagatagg ggttacccgc tccgatgcat gggaagcgat
300tagcccggct gttgcccgtt tggtttttga agggtaattt tcaatatctc tgtttgagtc
360atcaatttca tattcaaaga ttcaaaaaca aaatctggtc caaggagcgc atttaggatt
420atggagttgg cgaatcactt gaacgataga ctattatttg c
461
69 1841 DNA Artificial Sequence Sequence of the 3'-Region used for knock out of PpHIS1
69 gtgacattct tgtctttgag atcagtaatt gtagagcata
gatagaataa tattcaagac 60caacggcttc tcttcggaag ctccaagtag cttatagtga
tgagtaccgg catatattta 120taggcttaaa atttcgaggg ttcactatat tcgtttagtg
ggaagagttc ctttcactct 180tgttatctat attgtcagcg tggactgttt ataactgtac
caacttagtt tctttcaact 240ccaggttaag agacataaat gtcctttgat gctgacaata
atcagtggaa ttcaaggaag 300gacaatcccg acctcaatct gttcattaat gaagagttcg
aatcgtcctt aaatcaagcg 360ctagactcaa ttgtcaatga gaaccctttc tttgaccaag
aaactataaa tagatcgaat 420gacaaagttg gaaatgagtc cattagctta catgatattg
agcaggcaga ccaaaataaa 480ccgtcctttg agagcgatat tgatggttcg gcgccgttga
taagagacga caaattgcca 540aagaaacaaa gctgggggct gagcaatttt ttttcaagaa
gaaatagcat atgtttacca 600ctacatgaaa atgattcaag tgttgttaag accgaaagat
ctattgcagt gggaacaccc 660catcttcaat actgcttcaa tggaatctcc aatgccaagt
acaatgcatt tacctttttc 720ccagtcatcc tatacgagca attcaaattt tttttcaatt
tatactttac tttagtggct 780ctctctcaag cgataccgca acttcgcatt ggatatcttt
cttcgtatgt cgtcccactt 840ttgtttgtac tcatagtgac catgtcaaaa gaggcgatgg
atgatattca acgccgaaga 900agggatagag aacagaacaa tgaaccatat gaggttctgt
ccagcccatc accagttttg 960tccaaaaact taaaatgtgg tcacttggtt cgattgcata
agggaatgag agtgcccgca 1020gatatggttc ttgtccagtc aagcgaatcc accggagagt
catttatcaa gacagatcag 1080ctggatggtg agactgattg gaagcttcgg attgtttctc
cagttacaca atcgttacca 1140atgactgaac ttcaaaatgt cgccatcact gcaagcgcac
cctcaaaatc aattcactcc 1200tttcttggaa gattgaccta caatgggcaa tcatatggtc
ttacgataga caacacaatg 1260tggtgtaata ctgtattagc ttctggttca gcaattggtt
gtataattta cacaggtaaa 1320gatactcgac aatcgatgaa cacaactcag cccaaactga
aaacgggctt gttagaactg 1380gaaatcaata gtttgtccaa gatcttatgt gtttgtgtgt
ttgcattatc tgtcatctta 1440gtgctattcc aaggaatagc tgatgattgg tacgtcgata
tcatgcggtt tctcattcta 1500ttctccacta ttatcccagt gtctctgaga gttaaccttg
atcttggaaa gtcagtccat 1560gctcatcaaa tagaaactga tagctcaata cctgaaaccg
ttgttagaac tagtacaata 1620ccggaagacc tgggaagaat tgaataccta ttaagtgaca
aaactggaac tcttactcaa 1680aatgatatgg aaatgaaaaa actacaccta ggaacagtct
cttatgctgg tgataccatg 1740gatattattt ctgatcatgt taaaggtctt aataacgcta
aaacatcgag gaaagatctt 1800ggtatgagaa taagagattt ggttacaact ctggccatct g
1841
70 3105 DNA Artificial Sequence DNA encodes Drosophila melanogaster ManII codon-optimized (KD)
70 agagacgatc
caattagacc tccattgaag gttgctagat ccccaagacc aggtcaatgt 60caagatgttg
ttcaggacgt cccaaacgtt gatgtccaga tgttggagtt gtacgataga 120atgtccttca
aggacattga tggtggtgtt tggaagcagg gttggaacat taagtacgat 180ccattgaagt
acaacgctca tcacaagttg aaggtcttcg ttgtcccaca ctcccacaac 240gatcctggtt
ggattcagac cttcgaggaa tactaccagc acgacaccaa gcacatcttg 300tccaacgctt
tgagacattt gcacgacaac ccagagatga agttcatctg ggctgaaatc 360tcctacttcg
ctagattcta ccacgatttg ggtgagaaca agaagttgca gatgaagtcc 420atcgtcaaga
acggtcagtt ggaattcgtc actggtggat gggtcatgcc agacgaggct 480aactcccact
ggagaaacgt tttgttgcag ttgaccgaag gtcaaacttg gttgaagcaa 540ttcatgaacg
tcactccaac tgcttcctgg gctatcgatc cattcggaca ctctccaact 600atgccataca
ttttgcagaa gtctggtttc aagaatatgt tgatccagag aacccactac 660tccgttaaga
aggagttggc tcaacagaga cagttggagt tcttgtggag acagatctgg 720gacaacaaag
gtgacactgc tttgttcacc cacatgatgc cattctactc ttacgacatt 780cctcatacct
gtggtccaga tccaaaggtt tgttgtcagt tcgatttcaa aagaatgggt 840tccttcggtt
tgtcttgtcc atggaaggtt ccacctagaa ctatctctga tcaaaatgtt 900gctgctagat
ccgatttgtt ggttgatcag tggaagaaga aggctgagtt gtacagaacc 960aacgtcttgt
tgattccatt gggtgacgac ttcagattca agcagaacac cgagtgggat 1020gttcagagag
tcaactacga aagattgttc gaacacatca actctcaggc tcacttcaat 1080gtccaggctc
agttcggtac tttgcaggaa tacttcgatg ctgttcacca ggctgaaaga 1140gctggacaag
ctgagttccc aaccttgtct ggtgacttct tcacttacgc tgatagatct 1200gataactact
ggtctggtta ctacacttcc agaccatacc ataagagaat ggacagagtc 1260ttgatgcact
acgttagagc tgctgaaatg ttgtccgctt ggcactcctg ggacggtatg 1320gctagaatcg
aggaaagatt ggagcaggct agaagagagt tgtccttgtt ccagcaccac 1380gacggtatta
ctggtactgc taaaactcac gttgtcgtcg actacgagca aagaatgcag 1440gaagctttga
aagcttgtca aatggtcatg caacagtctg tctacagatt gttgactaag 1500ccatccatct
actctccaga cttctccttc tcctacttca ctttggacga ctccagatgg 1560ccaggttctg
gtgttgagga ctctagaact accatcatct tgggtgagga tatcttgcca 1620tccaagcatg
ttgtcatgca caacaccttg ccacactgga gagagcagtt ggttgacttc 1680tacgtctcct
ctccattcgt ttctgttacc gacttggcta acaatccagt tgaggctcag 1740gtttctccag
tttggtcttg gcaccacgac actttgacta agactatcca cccacaaggt 1800tccaccacca
agtacagaat catcttcaag gctagagttc caccaatggg tttggctacc 1860tacgttttga
ccatctccga ttccaagcca gagcacacct cctacgcttc caatttgttg 1920cttagaaaga
acccaacttc cttgccattg ggtcaatacc cagaggatgt caagttcggt 1980gatccaagag
agatctcctt gagagttggt aacggtccaa ccttggcttt ctctgagcag 2040ggtttgttga
agtccattca gttgactcag gattctccac atgttccagt tcacttcaag 2100ttcttgaagt
acggtgttag atctcatggt gatagatctg gtgcttactt gttcttgcca 2160aatggtccag
cttctccagt cgagttgggt cagccagttg tcttggtcac taagggtaaa 2220ttggagtctt
ccgtttctgt tggtttgcca tctgtcgttc accagaccat catgagaggt 2280ggtgctccag
agattagaaa tttggtcgat attggttctt tggacaacac tgagatcgtc 2340atgagattgg
agactcatat cgactctggt gatatcttct acactgattt gaatggattg 2400caattcatca
agaggagaag attggacaag ttgccattgc aggctaacta ctacccaatt 2460ccatctggta
tgttcattga ggatgctaat accagattga ctttgttgac cggtcaacca 2520ttgggtggat
cttctttggc ttctggtgag ttggagatta tgcaagatag aagattggct 2580tctgatgatg
aaagaggttt gggtcagggt gttttggaca acaagccagt tttgcatatt 2640tacagattgg
tcttggagaa ggttaacaac tgtgtcagac catctaagtt gcatccagct 2700ggttacttga
cttctgctgc tcacaaagct tctcagtctt tgttggatcc attggacaag 2760ttcatcttcg
ctgaaaatga gtggatcggt gctcagggtc aattcggtgg tgatcatcca 2820tctgctagag
aggatttgga tgtctctgtc atgagaagat tgaccaagtc ttctgctaaa 2880acccagagag
ttggttacgt tttgcacaga accaatttga tgcaatgtgg tactccagag 2940gagcatactc
agaagttgga tgtctgtcac ttgttgccaa atgttgctag atgtgagaga 3000actaccttga
ctttcttgca gaatttggag cacttggatg gtatggttgc tccagaagtt 3060tgtccaatgg
aaaccgctgc ttacgtctct tctcactctt cttga
3105
71 108 DNA Artificial Sequence DNA encodes Mnn2 leader (53)
71 atgctgctta
ccaaaaggtt ttcaaagctg ttcaagctga cgttcatagt tttgatattg 60tgcgggctgt
tcgtcattac aaacaaatac atggatgaga acacgtcg
108
72 1729 DNA Artificial Sequence Sequence of the PpHIS1 auxotrophic marker
72 caagttgcgt ccggtatacg taacgtctca cgatgatcaa agataatact taatcttcat
60ggtctactga ataactcatt taaacaattg actaattgta cattatattg aacttatgca
120tcctattaac gtaatcttct ggcttctctc tcagactcca tcagacacag aatatcgttc
180tctctaactg gtcctttgac gtttctgaca atagttctag aggagtcgtc caaaaactca
240actctgactt gggtgacacc accacgggat ccggttcttc cgaggacctt gatgaccttg
300gctaatgtaa ctggagtttt agtatccatt ttaagatgtg tgtttctgta ggttctgggt
360tggaaaaaaa ttttagacac cagaagagag gagtgaactg gtttgcgtgg gtttagactg
420tgtaaggcac tactctgtcg aagttttaga taggggttac ccgctccgat gcatgggaag
480cgattagccc ggctgttgcc cgtttggttt ttgaagggta attttcaata tctctgtttg
540agtcatcaat ttcatattca aagattcaaa aacaaaatct ggtccaagga gcgcatttag
600gattatggag ttggcgaatc acttgaacga tagactatta tttgctgttc ctaaagaggg
660cagattgtat gagaaatgcg ttgaattact taggggatca gatattcagt ttcgaagatc
720cagtagattg gatatagctt tgtgcactaa cctgcccctg gcattggttt tccttccagc
780tgctgacatt cccacgtttg taggagaggg taaatgtgat ttgggtataa ctggtattga
840ccaggttcag gaaagtgacg tagatgtcat acctttatta gacttgaatt tcggtaagtg
900caagttgcag attcaagttc ccgagaatgg tgacttgaaa gaacctaaac agctaattgg
960taaagaaatt gtttcctcct ttactagctt aaccaccagg tactttgaac aactggaagg
1020agttaagcct ggtgagccac taaagacaaa aatcaaatat gttggagggt ctgttgaggc
1080ctcttgtgcc ctaggagttg ccgatgctat tgtggatctt gttgagagtg gagaaaccat
1140gaaagcggca gggctgatcg atattgaaac tgttctttct acttccgctt acctgatctc
1200ttcgaagcat cctcaacacc cagaactgat ggatactatc aaggagagaa ttgaaggtgt
1260actgactgct cagaagtatg tcttgtgtaa ttacaacgca cctagaggta accttcctca
1320gctgctaaaa ctgactccag gcaagagagc tgctaccgtt tctccattag atgaagaaga
1380ttgggtggga gtgtcctcga tggtagagaa gaaagatgtt ggaagaatca tggacgaatt
1440aaagaaacaa ggtgccagtg acattcttgt ctttgagatc agtaattgta gagcatagat
1500agaataatat tcaagaccaa cggcttctct tcggaagctc caagtagctt atagtgatga
1560gtaccggcat atatttatag gcttaaaatt tcgagggttc actatattcg tttagtggga
1620agagttcctt tcactcttgt tatctatatt gtcagcgtgg actgtttata actgtaccaa
1680cttagtttct ttcaactcca ggttaagaga cataaatgtc ctttgatgc
1729
73 1068 DNA Artificial Sequence DNA encodes Rat GnT II (TC) Codon-optimized
73 tccttggttt accaattgaa cttcgaccag atgttgagaa acgttgacaa
ggacggtact 60tggtctcctg gtgagttggt tttggttgtt caggttcaca acagaccaga
gtacttgaga 120ttgttgatcg actccttgag aaaggctcaa ggtatcagag aggttttggt
tatcttctcc 180cacgatttct ggtctgctga gatcaactcc ttgatctcct ccgttgactt
ctgtccagtt 240ttgcaggttt tcttcccatt ctccatccaa ttgtacccat ctgagttccc
aggttctgat 300ccaagagact gtccaagaga cttgaagaag aacgctgctt tgaagttggg
ttgtatcaac 360gctgaatacc cagattcttt cggtcactac agagaggcta agttctccca
aactaagcat 420cattggtggt ggaagttgca ctttgtttgg gagagagtta aggttttgca
ggactacact 480ggattgatct tgttcttgga ggaggatcat tacttggctc cagacttcta
ccacgttttc 540aagaagatgt ggaagttgaa gcaacaagag tgtccaggtt gtgacgtttt
gtccttggga 600acttacacta ctatcagatc cttctacggt atcgctgaca aggttgacgt
taagacttgg 660aagtccactg aacacaacat gggattggct ttgactagag atgcttacca
gaagttgatc 720gagtgtactg acactttctg tacttacgac gactacaact gggactggac
tttgcagtac 780ttgactttgg cttgtttgcc aaaagtttgg aaggttttgg ttccacaggc
tccaagaatt 840ttccacgctg gtgactgtgg aatgcaccac aagaaaactt gtagaccatc
cactcagtcc 900gctcaaattg agtccttgtt gaacaacaac aagcagtact tgttcccaga
gactttggtt 960atcggagaga agtttccaat ggctgctatt tccccaccaa gaaagaatgg
tggatggggt 1020gatattagag accacgagtt gtgtaaatcc tacagaagat tgcagtag
1068
SEQ ID NO: 74; length: 300; type: DNA; organism: Artificial Sequence; description: DNA encodes Mnn2 leader (54)
74 atgctgctta ccaaaaggtt ttcaaagctg ttcaagctga cgttcatagt tttgatattg
60tgcgggctgt tcgtcattac aaacaaatac atggatgaga acacgtcggt caaggagtac
120aaggagtact tagacagata tgtccagagt tactccaata agtattcatc ttcctcagac
180gccgccagcg ctgacgattc aaccccattg agggacaatg atgaggcagg caatgaaaag
240ttgaaaagct tctacaacaa cgttttcaac tttctaatgg ttgattcgcc cgggcgcgcc
300
SEQ ID NO: 75; length: 1373; type: DNA; organism: Artificial Sequence; description: Sequence of the 5'-Region used for knock out of PpARG1
75 gatctggcct tccctgaatt tttacgtcca gctatacgat
ccgttgtgac tgtatttcct 60gaaatgaagt ttcaacctaa agttttggtt gtacttgctc
cacctaccac ggaaactaat 120atcgaaacca atgaaaaagt agaactggaa tcgtcaatcg
aaattcgcaa ccaagtggaa 180cccaaagact tgaatctttc taaagtctat tctagtgaca
ctaatggcaa cagaagattt 240gagctgactt ttcaaatgaa tctcaataat gcaatatcaa
catcagacaa tcaatgggct 300ttgtctagtg acacaggatc aattatagta gtgtcttctg
caggaagaat aacttccccg 360atcctagaag tcggggcatc cgtctgtgtc ttaagatcgt
acaacgaaca ccttttggca 420ataacttgtg aaggaacatg cttttcatgg aatttaaaga
agcaagaatg tgttctaaac 480agcatttcat tagcacctat agtcaattca cacatgctag
ttaagaaagt tggagatgca 540aggaactatt ctattgtatc tgccgaagga gacaacaatc
cgttacccca gattctagac 600tgcgaacttt ccaaaaatgg cgctccaatt gtggctctta
gcacgaaaga catctactct 660tattcaaaga aaatgaaatg ctggatccat ttgattgatt
cgaaatactt tgaattgttg 720ggtgctgaca atgcactgtt tgagtgtgtg gaagcgctag
aaggtccaat tggaatgcta 780attcatagat tggtagatga gttcttccat gaaaacactg
ccggtaaaaa actcaaactt 840tacaacaagc gagtactgga ggacctttca aattcacttg
aagaactagg tgaaaatgcg 900tctcaattaa gagagaaact tgacaaactc tatggtgatg
aggttgaggc ttcttgacct 960cttctctcta tctgcgtttc tttttttttt tttttttttt
tttttttcag ttgagccaga 1020ccgcgctaaa cgcataccaa ttgccaaatc aggcaattgt
gagacagtgg taaaaaagat 1080gcctgcaaag ttagattcac acagtaagag agatcctact
cataaatgag gcgcttattt 1140agtagctagt gatagccact gcggttctgc tttatgctat
ttgttgtatg ccttactatc 1200tttgtttggc tcctttttct tgacgttttc cgttggaggg
actccctatt ctgagtcatg 1260agccgcacag attatcgccc aaaattgaca aaatcttctg
gcgaaaaaag tataaaagga 1320gaaaaaagct cacccttttc cagcgtagaa agtatatatc
agtcattgaa gac 1373
SEQ ID NO: 76; length: 1470; type: DNA; organism: Artificial Sequence; description: Sequence of the 3'-Region used for knock out of PpARG1
76 gggactttaa ctcaagtaaa
aggatagttg tacaattata tatacgaaga ataaatcatt 60acaaaaagta ttcgtttctt
tgattcttaa caggattcat tttctgggtg tcatcaggta 120cagcgctgaa tatcttgaag
ttaacatcga gctcatcatc gacgttcatc acactagcca 180cgtttccgca acggtagcaa
taattaggag cggaccacac agtgacgaca tctttctctt 240tgaaatggta tctgaagcct
tccatgacca attgatgggc tctagcgatg agttgcaagt 300tattaatgtg gttgaactca
cgtgctactc gagcaccgaa taaccagcca gctccacgag 360gagaaacagc ccaactgtcg
acttcatctg ggtcagacca aaccaagtca caaaatcctc 420cttcatgagg gacctcttgc
gctcggctga gaactctgat ttgatctaac atgcgaatat 480cgggagagag accaccatgg
atacataata ttttaccatc aatgatggca ctaagggtta 540aaaagtcgaa cacctggcaa
cagtacttcc agacagtggt ggaaccatat ttattgagac 600attcctcata aaatccataa
acctgagtga tctgtctgga ttcatgattt ccccttacca 660atgtgatatg ttgaggaaac
ttaattttta aaatcatgag taacgtgaac gtctccaacg 720agaaatagcc tctatccaca
tagtctccta ggaagatata gttctgtttt attccattag 780aggaggatcc gggaaaccca
ccactaatct tgaaaagttc cagtagatcg tgaaattggc 840cgtgaatatc tccgcatact
gtcactggac tctgcactgg ctgtatattg gattcctcca 900tcagcaaatc cttcacccgt
tcgcaaagat gcttcatatc attttcactt aaagccttgc 960agcttttgac ttcttcaaac
cactgatctg gtcctctttc tggcatgatt aaggtctata 1020atatttctga gctgagatgt
aaaaaaaaat aataaaaatg gggagtgaaa aagtgtgtag 1080cttttaggag tttgggattg
ataccccaaa atgatcttta tgagaattaa aaggtagata 1140cgcttttaat aagaacacct
atctatagta ctttgtggtc ttgagtaatt gagatgttca 1200gcttctgagg tttgccgtta
ttctgggata gtagtgcgcg accaaacaac ccgccaggca 1260aagtgtgttg tgctcgaaga
cgattgccag aagagtaagt ccgtcctgcc tcagatgtta 1320cacactttct tccctagaca
gtcgatgcat catcggattt aaacctgaaa ctttgatgcc 1380atgatacgcc tagtcacgtc
gactgagatt ttagataagc cccgatccct ttagtacatt 1440cctgttatcc atggatggaa
tggcctgata 1470
SEQ ID NO: 77; length: 1043; type: DNA; organism: Artificial Sequence; description: Sequence of the 5'-Region used for knock out of BMT4
77 aagcttgttc accgttggga cttttccgtg gacaatgttg actactccag gagggattcc
60agctttctct actagctcag caataatcaa tgcagcccca ggcgcccgtt ctgatggctt
120gatgaccgtt gtattgcctg tcactatagc caggggtagg gtccataaag gaatcatagc
180agggaaatta aaagggcata ttgatgcaat cactcccaat ggctctcttg ccattgaagt
240ctccatatca gcactaactt ccaagaagga ccccttcaag tctgacgtga tagagcacgc
300ttgctctgcc acctgtagtc ctctcaaaac gtcaccttgt gcatcagcaa agactttacc
360ttgctccaat actatgacgg aggcaattct gtcaaaattc tctctcagca attcaaccaa
420cttgaaagca aattgctgtc tcttgatgat ggagactttt ttccaagatt gaaatgcaat
480gtgggacgac tcaattgctt cttccagctc ctcttcggtt gattgaggaa cttttgaaac
540cacaaaattg gtcgttgggt catgtacatc aaaccattct gtagatttag attcgacgaa
600agcgttgttg atgaaggaaa aggttggata cggtttgtcg gtctctttgg tatggccggt
660ggggtatgca attgcagtag aagataattg gacagccatt gttgaaggta gagaaaaggt
720cagggaactt gggggttatt tataccattt taccccacaa ataacaactg aaaagtaccc
780attccatagt gagaggtaac cgacggaaaa agacgggccc atgttctggg accaatagaa
840ctgtgtaatc cattgggact aatcaacaga cgattggcaa tataatgaaa tagttcgttg
900aaaagccacg tcagctgtct tttcattaac tttggtcgga cacaacattt tctactgttg
960tatctgtcct actttgctta tcatctgcca cagggcaagt ggatttcctt ctcgcgcggc
1020tgggtgaaaa cggttaacgt gaa
1043
SEQ ID NO: 78; length: 695; type: DNA; organism: Artificial Sequence; description: Sequence of the 3'-Region used for knock out of BMT4
78 gccttggggg acttcaagtc tttgctagaa actagatgag gtcaggccct
cttatggttg 60tgtcccaatt gggcaatttc actcacctaa aaagcatgac aattatttag
cgaaataggt 120agtatatttt ccctcatctc ccaagcagtt tcgtttttgc atccatatct
ctcaaatgag 180cagctacgac tcattagaac cagagtcaag taggggtgag ctcagtcatc
agccttcgtt 240tctaaaacga ttgagttctt ttgttgctac aggaagcgcc ctagggaact
ttcgcacttt 300ggaaatagat tttgatgacc aagagcggga gttgatatta gagaggctgt
ccaaagtaca 360tgggatcagg ccggccaaat tgattggtgt gactaaacca ttgtgtactt
ggacactcta 420ttacaaaagc gaagatgatt tgaagtatta caagtcccga agtgttagag
gattctatcg 480agcccagaat gaaatcatca accgttatca gcagattgat aaactcttgg
aaagcggtat 540cccattttca ttattgaaga actacgataa tgaagatgtg agagacggcg
accctctgaa 600cgtagacgaa gaaacaaatc tacttttggg gtacaataga gaaagtgaat
caagggaggt 660atttgtggcc ataatactca actctatcat taatg
695
SEQ ID NO: 79; length: 1103; type: DNA; organism: Artificial Sequence; description: Sequence of the 5'-Region used for knock out of BMT1
79 catatggtga gagccgttct gcacaactag atgttttcga
gcttcgcatt gtttcctgca 60gctcgactat tgaattaaga tttccggata tctccaatct
cacaaaaact tatgttgacc 120acgtgctttc ctgaggcgag gtgttttata tgcaagctgc
caaaaatgga aaacgaatgg 180ccatttttcg cccaggcaaa ttattcgatt actgctgtca
taaagacagt gttgcaaggc 240tcacattttt ttttaggatc cgagataaag tgaatacagg
acagcttatc tctatatctt 300gtaccattcg tgaatcttaa gagttcggtt agggggactc
tagttgaggg ttggcactca 360cgtatggctg ggcgcagaaa taaaattcag gcgcagcagc
acttatcgat ggaattcaca 420gttataaata aaaacaaaaa ctcaaaaagt ttgggctcca
caaaataact taatttaaat 480ttttgtctaa taaatgaatg taattccaag attatgtgat
gcaagcacag tatgcttcag 540ccctatgcag ctactaatgt caatctcgcc tgcgagcggg
cctagatttt cactacaaat 600ttcaaaacta cgcggattta ttgtctcaga gagcaatttg
gcatttctga gcgtagcagg 660aggcttcata agattgtata ggaccgtacc aacaaattgc
cgaggcacaa cacggtatgc 720tgtgcactta tgtggctact tccctacaac ggaatgaaac
cttcctcttt ccgcttaaac 780gagaaagtgt gtcgcaattg aatgcaggtg cctgtgcgcc
ttggtgtatt gtttttgagg 840gcccaattta tcaggcgcct tttttcttgg ttgttttccc
ttagcctcaa gcaaggttgg 900tctatttcat ctccgcttct ataccgtgcc tgatactgtt
ggatgagaac acgactcaac 960ttcctgctgc tctgtattgc cagtgttttg tctgtgattt
ggatcggagt cctccttact 1020tggaatgata ataatcttgg cggaatctcc ctaaacggag
gcaaggattc tgcctatgat 1080gatctgctat cattgggaag ctt
1103
SEQ ID NO: 80; length: 692; type: DNA; organism: Artificial Sequence; description: Sequence of the 3'-Region used for knock out of BMT1
80 gaattcacag ttataaataa
aaacaaaaac tcaaaaagtt tgggctccac aaaataactt 60aatttaaatt tttgtctaat
aaatgaatgt aattccaaga ttatgtgatg caagcacagt 120atgcttcagc cctatgcagc
tactaatgtc aatctcgcct gcgagcgggc ctagattttc 180actacaaatt tcaaaactac
gcggatttat tgtctcagag agcaatttgg catttctgag 240cgtagcagga ggcttcataa
gattgtatag gaccgtacca acaaattgcc gaggcacaac 300acggtatgct gtgcacttat
gtggctactt ccctacaacg gaatgaaacc ttcctctttc 360cgcttaaacg agaaagtgtg
tcgcaattga atgcaggtgc ctgtgcgcct tggtgtattg 420tttttgaggg cccaatttat
caggcgcctt ttttcttggt tgttttccct tagcctcaag 480caaggttggt ctatttcatc
tccgcttcta taccgtgcct gatactgttg gatgagaaca 540cgactcaact tcctgctgct
ctgtattgcc agtgttttgt ctgtgatttg gatcggagtc 600ctccttactt ggaatgataa
taatcttggc ggaatctccc taaacggagg caaggattct 660gcctatgatg atctgctatc
attgggaagc tt 692
SEQ ID NO: 81; length: 546; type: DNA; organism: Artificial Sequence; description: Sequence of the 5'-Region used for knock out of BMT3
81 gatatctccc tggggacaat atgtgttgca actgttcgtt gttggtgccc cagtccccca
60accggtacta atcggtctat gttcccgtaa ctcatattcg gttagaacta gaacaataag
120tgcatcattg ttcaacattg tggttcaatt gtcgaacatt gctggtgctt atatctacag
180ggaagacgat aagcctttgt acaagagagg taacagacag ttaattggta tttctttggg
240agtcgttgcc ctctacgttg tctccaagac atactacatt ctgagaaaca gatggaagac
300tcaaaaatgg gagaagctta gtgaagaaga gaaagttgcc tacttggaca gagctgagaa
360ggagaacctg ggttctaaga ggctggactt tttgttcgag agttaaactg cataattttt
420tctaagtaaa tttcatagtt atgaaatttc tgcagcttag tgtttactgc atcgtttact
480gcatcaccct gtaaataatg tgagcttttt tccttccatt gcttggtatc ttccttgctg
540ctgttt
546
SEQ ID NO: 82; length: 378; type: DNA; organism: Artificial Sequence; description: Sequence of the 3'-Region used for knock out of BMT3
82 acaaaacagt catgtacaga actaacgcct ttaagatgca gaccactgaa
aagaattggg 60tcccattttt cttgaaagac gaccaggaat ctgtccattt tgtttactcg
ttcaatcctc 120tgagagtact caactgcagt cttgataacg gtgcatgtga tgttctattt
gagttaccac 180atgattttgg catgtcttcc gagctacgtg gtgccactcc tatgctcaat
cttcctcagg 240caatcccgat ggcagacgac aaagaaattt gggtttcatt cccaagaacg
agaatatcag 300attgcgggtg ttctgaaaca atgtacaggc caatgttaat gctttttgtt
agagaaggaa 360caaacttttt tgctgagc
378
SEQ ID NO: 83; length: 1494; type: DNA; organism: Artificial Sequence; description: DNA encodes Tr ManI catalytic domain
83 cgcgccggat ctcccaaccc tacgagggcg gcagcagtca aggccgcatt
ccagacgtcg 60tggaacgctt accaccattt tgcctttccc catgacgacc tccacccggt
cagcaacagc 120tttgatgatg agagaaacgg ctggggctcg tcggcaatcg atggcttgga
cacggctatc 180ctcatggggg atgccgacat tgtgaacacg atccttcagt atgtaccgca
gatcaacttc 240accacgactg cggttgccaa ccaaggcatc tccgtgttcg agaccaacat
tcggtacctc 300ggtggcctgc tttctgccta tgacctgttg cgaggtcctt tcagctcctt
ggcgacaaac 360cagaccctgg taaacagcct tctgaggcag gctcaaacac tggccaacgg
cctcaaggtt 420gcgttcacca ctcccagcgg tgtcccggac cctaccgtct tcttcaaccc
tactgtccgg 480agaagtggtg catctagcaa caacgtcgct gaaattggaa gcctggtgct
cgagtggaca 540cggttgagcg acctgacggg aaacccgcag tatgcccagc ttgcgcagaa
gggcgagtcg 600tatctcctga atccaaaggg aagcccggag gcatggcctg gcctgattgg
aacgtttgtc 660agcacgagca acggtacctt tcaggatagc agcggcagct ggtccggcct
catggacagc 720ttctacgagt acctgatcaa gatgtacctg tacgacccgg ttgcgtttgc
acactacaag 780gatcgctggg tccttgctgc cgactcgacc attgcgcatc tcgcctctca
cccgtcgacg 840cgcaaggact tgaccttttt gtcttcgtac aacggacagt ctacgtcgcc
aaactcagga 900catttggcca gttttgccgg tggcaacttc atcttgggag gcattctcct
gaacgagcaa 960aagtacattg actttggaat caagcttgcc agctcgtact ttgccacgta
caaccagacg 1020gcttctggaa tcggccccga aggcttcgcg tgggtggaca gcgtgacggg
cgccggcggc 1080tcgccgccct cgtcccagtc cgggttctac tcgtcggcag gattctgggt
gacggcaccg 1140tattacatcc tgcggccgga gacgctggag agcttgtact acgcataccg
cgtcacgggc 1200gactccaagt ggcaggacct ggcgtgggaa gcgttcagtg ccattgagga
cgcatgccgc 1260gccggcagcg cgtactcgtc catcaacgac gtgacgcagg ccaacggcgg
gggtgcctct 1320gacgatatgg agagcttctg gtttgccgag gcgctcaagt atgcgtacct
gatctttgcg 1380gaggagtcgg atgtgcaggt gcaggccaac ggcgggaaca aatttgtctt
taacacggag 1440gcgcacccct ttagcatccg ttcatcatca cgacggggcg gccaccttgc
ttaa 1494
SEQ ID NO: 84; length: 1792; type: DNA; organism: Artificial Sequence; description: 5'ARG1 and ORF
84 taccaattgc
caaatcaggc aattgtgaga cagtggtaaa aaagatgcct gcaaagttag 60attcacacag
taagagagat cctactcata aatgaggcgc ttatttagta gctagtgata 120gccactgcgg
ttctgcttta tgctatttgt tgtatgcctt actatctttg tttggctcct 180ttttcttgac
gttttccgtt ggagggactc cctattctga gtcatgagcc gcacagatta 240tcgcccaaaa
ttgacaaaat cttctggcga aaaaagtata aaaggagaaa aaagctcacc 300cttttccagc
gtagaaagta tatatcagtc attgaagact attatttaaa taacacaatg 360tctaaaggaa
aagtttgttt ggcctactcc ggtggtttgg atacctccat catcctagct 420tggttgttgg
agcagggata cgaagtcgtt gcctttttag ccaacattgg tcaagaggaa 480gactttgagg
ctgctagaga gaaagctctg aagatcggtg ctaccaagtt tatcgtcagt 540gacgttagga
aggaatttgt tgaggaagtt ttgttcccag cagtccaagt taacgctatc 600tacgagaacg
tctacttact gggtacctct ttggccagac cagtcattgc caaggcccaa 660atagaggttg
ctgaacaaga aggttgtttt gctgttgccc acggttgtac cggaaagggt 720aacgatcagg
ttagatttga gctttccttt tatgctctga agcctgacgt tgtctgtatc 780gccccatgga
gagacccaga attcttcgaa agattcgctg gtagaaatga cttgctgaat 840tacgctgctg
agaaggatat tccagttgct cagactaaag ccaagccatg gtctactgat 900gagaacatgg
ctcacatctc cttcgaggct ggtattctag aagatccaaa cactactcct 960ccaaaggaca
tgtggaagct cactgttgac ccagaagatg caccagacaa gccagagttc 1020tttgacgtcc
actttgagaa gggtaagcca gttaaattag ttctcgagaa caaaactgag 1080gtcaccgatc
cggttgagat ctttttgact gctaacgcca ttgctagaag aaacggtgtt 1140ggtagaattg
acattgtcga gaacagattc atcggaatca agtccagagg ttgttatgaa 1200actccaggtt
tgactctact gagaaccact cacatcgact tggaaggtct taccgttgac 1260cgtgaagtta
gatcgatcag agacactttt gttaccccaa cctactctaa gttgttatac 1320aacgggttgt
actttacccc agaaggtgag tacgtcagaa ctatgattca gccttctcaa 1380aacaccgtca
acggtgttgt tagagccaag gcctacaaag gtaatgtgta taacctagga 1440agatactctg
aaaccgagaa attgtacgat gctaccgaat cttccatgga tgagttgacc 1500ggattccacc
ctcaagaagc tggaggattt atcacaacac aagccatcag aatcaagaag 1560tacggagaaa
gtgtcagaga gaagggaaag tttttgggac tttaactcaa gtaaaaggat 1620agttgtacaa
ttatatatac gaagaataaa tcattacaaa aagtattcgt ttctttgatt 1680cttaacagga
ttcattttct gggtgtcatc aggtacagcg ctgaatatct tgaagttaac 1740atcgagctca
tcatcgacgt tcatcacact agccacgttt ccgcaacggt ag
1792
SEQ ID NO: 85; length: 414; type: DNA; organism: Artificial Sequence; description: PpCITI TT
85 ccggccattt aaatatgtga
cgactgggtg atccgggtta gtgagttgtt ctcccatctg 60tatatttttc atttacgatg
aatacgaaat gagtattaag aaatcaggcg tagcaatatg 120ggcagtgttc agtcctgtca
tagatggcaa gcactggcac atccttaata ggttagagaa 180aatcattgaa tcatttgggt
ggtgaaaaaa aattgatgta aacaagccac ccacgctggg 240agtcgaaccc agaatctttt
gattagaagt caaacgcgtt aaccattacg ctacgcaggc 300atgtttcacg tccatttttg
attgctttct atcataatct aaagatgtga actcaattag 360ttgcaatttg accaattctt
ccattacaag tcgtgcttcc tccgttgatg caac 414
SEQ ID NO: 86; length: 388; type: DNA; organism: Artificial Sequence; description: Ashbya gossypii TEF1 promoter
86 gatctgttta gcttgcctcg tccccgccgg
gtcacccggc cagcgacatg gaggcccaga 60ataccctcct tgacagtctt gacgtgcgca
gctcaggggc atgatgtgac tgtcgcccgt 120acatttagcc catacatccc catgtataat
catttgcatc catacatttt gatggccgca 180cggcgcgaag caaaaattac ggctcctcgc
tgcagacctg cgagcaggga aacgctcccc 240tcacagacgc gttgaattgt ccccacgccg
cgcccctgta gagaaatata aaaggttagg 300atttgccact gaggttcttc tttcatatac
ttccttttaa aatcttgcta ggatacagtt 360ctcacatcac atccgaacat aaacaacc
388
SEQ ID NO: 87; length: 247; type: DNA; organism: Artificial Sequence; description: Ashbya gossypii TEF1 termination sequence
87 taatcagtac tgacaataaa aagattcttg
ttttcaagaa cttgtcattt gtatagtttt 60tttatattgt agttgttcta ttttaatcaa
atgttagcgt gatttatatt ttttttcgcc 120tcgacatcat ctgcccagat gcgaagttaa
gtgcgcagaa agtaatatca tgcgtcaatc 180gtatgtgaat gctggtcgct atactgctgt
cgattcgata ctaacgccgc catccagtgt 240cgaaaac
247
SEQ ID NO: 88; length: 1037; type: DNA; organism: Artificial Sequence; description: Sequence of the PpPMA1 promoter
88 aaatgcgtac ctcttctacg agattcaagc gaatgagaat
aatgtaatat gcaagatcag 60aaagaatgaa aggagttgaa aaaaaaaacc gttgcgtttt
gaccttgaat ggggtggagg 120tttccattca aagtaaagcc tgtgtcttgg tattttcggc
ggcacaagaa atcgtaattt 180tcatcttcta aacgatgaag atcgcagccc aacctgtatg
tagttaaccg gtcggaatta 240taagaaagat tttcgatcaa caaaccctag caaatagaaa
gcagggttac aactttaaac 300cgaagtcaca aacgataaac cactcagctc ccacccaaat
tcattcccac tagcagaaag 360gaattattta atccctcagg aaacctcgat gattctcccg
ttcttccatg ggcgggtatc 420gcaaaatgag gaatttttca aatttctcta ttgtcaagac
tgtttattat ctaagaaata 480gcccaatccg aagctcagtt ttgaaaaaat cacttccgcg
tttctttttt acagcccgat 540gaatatccaa atttggaata tggattactc tatcgggact
gcagataata tgacaacaac 600gcagattaca ttttaggtaa ggcataaaca ccagccagaa
atgaaacgcc cactagccat 660ggtcgaatag tccaatgaat tcagatagct atggtctaaa
agctgatgtt ttttattggg 720taatggcgaa gagtccagta cgacttccag cagagctgag
atggccattt ttgggggtat 780tagtaacttt ttgagctctt ttcacttcga tgaagtgtcc
cattcgggat ataatcggat 840cgcgtcgttt tctcgaaaat acagcttagc gtcgtccgct
tgttgtaaaa gcagcaccac 900attcctaatc tcttatataa acaaaacaac ccaaattatc
agtgctgttt tcccaccaga 960tataagtttc ttttctcttc cgctttttga ttttttatct
ctttccttta aaaacttctt 1020taccttaaag ggcggcc
1037
SEQ ID NO: 89; length: 1231; type: DNA; organism: Artificial Sequence; description: Sequence of the 5'-region that was used to knock into the PpPRO1 locus
89 gaagggccat
cgaattgtca tcgtctcctc aggtgccatc gctgtgggca tgaagagagt 60caacatgaag
cggaaaccaa aaaagttaca gcaagtgcag gcattggctg ctataggaca 120aggccgtttg
ataggacttt gggacgacct tttccgtcag ttgaatcagc ctattgcgca 180gattttactg
actagaacgg atttggtcga ttacacccag tttaagaacg ctgaaaatac 240attggaacag
cttattaaaa tgggtattat tcctattgtc aatgagaatg acaccctatc 300cattcaagaa
atcaaatttg gtgacaatga caccttatcc gccataacag ctggtatgtg 360tcatgcagac
tacctgtttt tggtgactga tgtggactgt ctttacacgg ataaccctcg 420tacgaatccg
gacgctgagc caatcgtgtt agttagaaat atgaggaatc taaacgtcaa 480taccgaaagt
ggaggttccg ccgtaggaac aggaggaatg acaactaaat tgatcgcagc 540tgatttgggt
gtatctgcag gtgttacaac gattatttgc aaaagtgaac atcccgagca 600gattttggac
attgtagagt acagtatccg tgctgataga gtcgaaaatg aggctaaata 660tctggtcatc
aacgaagagg aaactgtgga acaatttcaa gagatcaatc ggtcagaact 720gagggagttg
aacaagctgg acattccttt gcatacacgt ttcgttggcc acagttttaa 780tgctgttaat
aacaaagagt tttggttact ccatggacta aaggccaacg gagccattat 840cattgatcca
ggttgttata aggctatcac tagaaaaaac aaagctggta ttcttccagc 900tggaattatt
tccgtagagg gtaatttcca tgaatacgag tgtgttgatg ttaaggtagg 960actaagagat
ccagatgacc cacattcact agaccccaat gaagaacttt acgtcgttgg 1020ccgtgcccgt
tgtaattacc ccagcaatca aatcaacaaa attaagggtc tacaaagctc 1080gcagatcgag
caggttctag gttacgctga cggtgagtat gttgttcaca gggacaactt 1140ggctttccca
gtatttgccg atccagaact gttggatgtt gttgagagta ccctgtctga 1200acaggagaga
gaatccaaac caaataaata g
1231
SEQ ID NO: 90; length: 1425; type: DNA; organism: Artificial Sequence; description: Sequence of the 3'-region that was used to knock into the PpPRO1 locus
90 aatttcacat atgctgcttg attatgtaat
tataccttgc gttcgatggc atcgatttcc 60tcttctgtca atcgcgcatc gcattaaaag
tatacttttt tttttttcct atagtactat 120tcgccttatt ataaactttg ctagtatgag
ttctaccccc aagaaagagc ctgatttgac 180tcctaagaag agtcagcctc caaagaatag
tctcggtggg ggtaaaggct ttagtgagga 240gggtttctcc caaggggact tcagcgctaa
gcatatacta aatcgtcgcc ctaacaccga 300aggctcttct gtggcttcga acgtcatcag
ttcgtcatca ttgcaaaggt taccatcctc 360tggatctgga agcgttgctg tgggaagtgt
gttgggatct tcgccattaa ctctttctgg 420agggttccac gggcttgatc caaccaagaa
taaaatagac gttccaaagt cgaaacagtc 480aaggagacaa agtgttcttt ctgacatgat
ttccacttct catgcagcta gaaatgatca 540ctcagagcag cagttacaaa ctggacaaca
atcagaacaa aaagaagaag atggtagtcg 600atcttctttt tctgtttctt cccccgcaag
agatatccgg cacccagatg tactgaaaac 660tgtcgagaaa catcttgcca atgacagcga
gatcgactca tctttacaac ttcaaggtgg 720agatgtcact agaggcattt atcaatgggt
aactggagaa agtagtcaaa aagataaccc 780gcctttgaaa cgagcaaata gttttaatga
tttttcttct gtgcatggtg acgaggtagg 840caaggcagat gctgaccacg atcgtgaaag
cgtattcgac gaggatgata tctccattga 900tgatatcaaa gttccgggag ggatgcgtcg
aagtttttta ttacaaaagc atagagacca 960acaactttct ggactgaata aaacggctca
ccaaccaaaa caacttacta aacctaattt 1020cttcacgaac aactttatag agtttttggc
attgtatggg cattttgcag gtgaagattt 1080ggaggaagac gaagatgaag atttagacag
tggttccgaa tcagtcgcag tcagtgatag 1140tgagggagaa ttcagtgagg ctgacaacaa
tttgttgtat gatgaagagt ctctcctatt 1200agcacctagt acctccaact atgcgagatc
aagaatagga agtattcgta ctcctactta 1260tggatctttc agttcaaatg ttggttcttc
gtctattcat cagcagttaa tgaaaagtca 1320aatcccgaag ctgaagaaac gtggacagca
caagcataaa acacaatcaa aaatacgctc 1380gaagaagcaa actaccaccg taaaagcagt
gttgctgcta ttaaa 1425
SEQ ID NO: 91; length: 1793; type: DNA; organism: Artificial Sequence; description: Sequence of the PpTRP2 gene integration locus
91 ggtttctcaa
ttactatata ctactaacca tttacctgta gcgtatttct tttccctctt 60cgcgaaagct
caagggcatc ttcttgactc atgaaaaata tctggatttc ttctgacaga 120tcatcaccct
tgagcccaac tctctagcct atgagtgtaa gtgatagtca tcttgcaaca 180gattattttg
gaacgcaact aacaaagcag atacaccctt cagcagaatc ctttctggat 240attgtgaaga
atgatcgcca aagtcacagt cctgagacag ttcctaatct ttaccccatt 300tacaagttca
tccaatcaga cttcttaacg cctcatctgg cttatatcaa gcttaccaac 360agttcagaaa
ctcccagtcc aagtttcttg cttgaaagtg cgaagaatgg tgacaccgtt 420gacaggtaca
cctttatggg acattccccc agaaaaataa tcaagactgg gcctttagag 480ggtgctgaag
ttgacccctt ggtgcttctg gaaaaagaac tgaagggcac cagacaagcg 540caacttcctg
gtattcctcg tctaagtggt ggtgccatag gatacatctc gtacgattgt 600attaagtact
ttgaaccaaa aactgaaaga aaactgaaag atgttttgca acttccggaa 660gcagctttga
tgttgttcga cacgatcgtg gcttttgaca atgtttatca aagattccag 720gtaattggaa
acgtttctct atccgttgat gactcggacg aagctattct tgagaaatat 780tataagacaa
gagaagaagt ggaaaagatc agtaaagtgg tatttgacaa taaaactgtt 840ccctactatg
aacagaaaga tattattcaa ggccaaacgt tcacctctaa tattggtcag 900gaagggtatg
aaaaccatgt tcgcaagctg aaagaacata ttctgaaagg agacatcttc 960caagctgttc
cctctcaaag ggtagccagg ccgacctcat tgcacccttt caacatctat 1020cgtcatttga
gaactgtcaa tccttctcca tacatgttct atattgacta tctagacttc 1080caagttgttg
gtgcttcacc tgaattacta gttaaatccg acaacaacaa caaaatcatc 1140acacatccta
ttgctggaac tcttcccaga ggtaaaacta tcgaagagga cgacaattat 1200gctaagcaat
tgaagtcgtc tttgaaagac agggccgagc acgtcatgct ggtagatttg 1260gccagaaatg
atattaaccg tgtgtgtgag cccaccagta ccacggttga tcgtttattg 1320actgtggaga
gattttctca tgtgatgcat cttgtgtcag aagtcagtgg aacattgaga 1380ccaaacaaga
ctcgcttcga tgctttcaga tccattttcc cagcaggaac cgtctccggt 1440gctccgaagg
taagagcaat gcaactcata ggagaattgg aaggagaaaa gagaggtgtt 1500tatgcggggg
ccgtaggaca ctggtcgtac gatggaaaat cgatggacac atgtattgcc 1560ttaagaacaa
tggtcgtcaa ggacggtgtc gcttaccttc aagccggagg tggaattgtc 1620tacgattctg
acccctatga cgagtacatc gaaaccatga acaaaatgag atccaacaat 1680aacaccatct
tggaggctga gaaaatctgg accgataggt tggccagaga cgagaatcaa 1740agtgaatccg
aagaaaacga tcaatgaacg gaggacgtaa gtaggaattt atg
1793
SEQ ID NO: 92; length: 2172; type: DNA; organism: Artificial Sequence; description: Human UDP-GlcNAc 2-epimerase/N-acetylmannosamine kinase (HsGNE) codon optimized
92 atggaaaaga
acggtaacaa cagaaagttg agagtttgtg ttgctacttg taacagagct 60gactactcca
agttggctcc aatcatgttc ggtatcaaga ctgagccaga gttcttcgag 120ttggacgttg
ttgttttggg ttcccacttg attgatgact acggtaacac ttacagaatg 180atcgagcagg
acgacttcga catcaacact agattgcaca ctattgttag aggagaggac 240gaagctgcta
tggttgaatc tgttggattg gctttggtta agttgccaga cgttttgaac 300agattgaagc
cagacatcat gattgttcac ggtgacagat tcgatgcttt ggctttggct 360acttccgctg
ctttgatgaa cattagaatc ttgcacatcg agggtggtga agtttctggt 420actatcgacg
actccatcag acacgctatc actaagttgg ctcactacca tgtttgttgt 480actagatccg
ctgagcaaca cttgatttcc atgtgtgagg accacgacag aattttgttg 540gctggttgtc
catcttacga caagttgttg tccgctaaga acaaggacta catgtccatc 600atcagaatgt
ggttgggtga cgacgttaag tctaaggact acatcgttgc tttgcagcac 660ccagttacta
ctgacatcaa gcactccatc aagatgttcg agttgacttt ggacgctttg 720atctccttca
acaagagaac tttggttttg ttcccaaaca ttgacgctgg ttccaaagag 780atggttagag
ttatgagaaa gaagggtatc gaacaccacc caaacttcag agctgttaag 840cacgttccat
tcgaccaatt catccagttg gttgctcatg ctggttgtat gatcggtaac 900tcctcctgtg
gtgttagaga agttggtgct ttcggtactc cagttatcaa cttgggtact 960agacagatcg
gtagagagac tggagaaaac gttttgcatg ttagagatgc tgacactcag 1020gacaagattt
tgcaggcttt gcacttgcaa ttcggaaagc agtacccatg ttccaaaatc 1080tacggtgacg
gtaacgctgt tccaagaatc ttgaagtttt tgaagtccat cgacttgcaa 1140gagccattgc
agaagaagtt ctgtttccca ccagttaagg agaacatctc ccaggacatt 1200gaccacatct
tggagacatt gtccgctttg gctgttgatt tgggtggaac taacttgaga 1260gttgctatcg
tttccatgaa gggagagatc gttaagaagt acactcagtt caacccaaag 1320acttacgagg
agagaatcaa cttgatcttg cagatgtgtg ttgaagctgc tgctgaggct 1380gttaagttga
actgtagaat cttgggtgtt ggtatctcta ctggtggtag agttaatcca 1440agagagggta
tcgttttgca ctccactaag ttgattcagg agtggaactc cgttgatttg 1500agaactccat
tgtccgacac attgcacttg ccagtttggg ttgacaacga cggtaattgt 1560gctgctttgg
ctgagagaaa gttcggtcaa ggaaagggat tggagaactt cgttactttg 1620atcactggta
ctggtattgg tggtggtatc attcaccagc acgagttgat tcacggttct 1680tccttctgtg
ctgctgaatt gggacacttg gttgtttctt tggacggtcc agactgttct 1740tgtggttccc
acggttgtat tgaagcttac gcatcaggaa tggcattgca gagagaggct 1800aagaagttgc
acgacgagga cttgttgttg gttgagggaa tgtctgttcc aaaggacgag 1860gctgttggtg
ctttgcattt gatccaggct gctaagttgg gtaatgctaa ggctcagtcc 1920atcttgagaa
ctgctggtac tgctttggga ttgggtgttg ttaatatctt gcacactatg 1980aacccatcct
tggttatctt gtccggtgtt ttggcttctc actacatcca catcgttaag 2040gacgttatca
gacagcaagc tttgtcctcc gttcaagacg ttgatgttgt tgtttccgac 2100ttggttgacc
cagctttgtt gggtgctgct tccatggttt tggactacac tactagaaga 2160atctactaat
ag
2172
SEQ ID NO: 93; length: 1854; type: DNA; organism: Artificial Sequence; description: Sequence of the PpARG1 auxotrophic marker
93 cagttgagcc agaccgcgct aaacgcatac caattgccaa atcaggcaat tgtgagacag
60tggtaaaaaa gatgcctgca aagttagatt cacacagtaa gagagatcct actcataaat
120gaggcgctta tttagtagct agtgatagcc actgcggttc tgctttatgc tatttgttgt
180atgccttact atctttgttt ggctcctttt tcttgacgtt ttccgttgga gggactccct
240attctgagtc atgagccgca cagattatcg cccaaaattg acaaaatctt ctggcgaaaa
300aagtataaaa ggagaaaaaa gctcaccctt ttccagcgta gaaagtatat atcagtcatt
360gaagactatt atttaaataa cacaatgtct aaaggaaaag tttgtttggc ctactccggt
420ggtttggata cctccatcat cctagcttgg ttgttggagc agggatacga agtcgttgcc
480tttttagcca acattggtca agaggaagac tttgaggctg ctagagagaa agctctgaag
540atcggtgcta ccaagtttat cgtcagtgac gttaggaagg aatttgttga ggaagttttg
600ttcccagcag tccaagttaa cgctatctac gagaacgtct acttactggg tacctctttg
660gccagaccag tcattgccaa ggcccaaata gaggttgctg aacaagaagg ttgttttgct
720gttgcccacg gttgtaccgg aaagggtaac gatcaggtta gatttgagct ttccttttat
780gctctgaagc ctgacgttgt ctgtatcgcc ccatggagag acccagaatt cttcgaaaga
840ttcgctggta gaaatgactt gctgaattac gctgctgaga aggatattcc agttgctcag
900actaaagcca agccatggtc tactgatgag aacatggctc acatctcctt cgaggctggt
960attctagaag atccaaacac tactcctcca aaggacatgt ggaagctcac tgttgaccca
1020gaagatgcac cagacaagcc agagttcttt gacgtccact ttgagaaggg taagccagtt
1080aaattagttc tcgagaacaa aactgaggtc accgatccgg ttgagatctt tttgactgct
1140aacgccattg ctagaagaaa cggtgttggt agaattgaca ttgtcgagaa cagattcatc
1200ggaatcaagt ccagaggttg ttatgaaact ccaggtttga ctctactgag aaccactcac
1260atcgacttgg aaggtcttac cgttgaccgt gaagttagat cgatcagaga cacttttgtt
1320accccaacct actctaagtt gttatacaac gggttgtact ttaccccaga aggtgagtac
1380gtcagaacta tgattcagcc ttctcaaaac accgtcaacg gtgttgttag agccaaggcc
1440tacaaaggta atgtgtataa cctaggaaga tactctgaaa ccgagaaatt gtacgatgct
1500accgaatctt ccatggatga gttgaccgga ttccaccctc aagaagctgg aggatttatc
1560acaacacaag ccatcagaat caagaagtac ggagaaagtg tcagagagaa gggaaagttt
1620ttgggacttt aactcaagta aaaggatagt tgtacaatta tatatacgaa gaataaatca
1680ttacaaaaag tattcgtttc tttgattctt aacaggattc attttctggg tgtcatcagg
1740tacagcgctg aatatcttga agttaacatc gagctcatca tcgacgttca tcacactagc
1800cacgtttccg caacggtagc aataattagg agcggaccac acagtgacga catc
1854
SEQ ID NO: 94; length: 1308; type: DNA; organism: Artificial Sequence; description: Encodes human CMP-sialic acid synthase (HsCSS) codon optimized
94 atggactctg ttgaaaaggg tgctgctact
tctgtttcca acccaagagg tagaccatcc 60agaggtagac ctcctaagtt gcagagaaac
tccagaggtg gtcaaggtag aggtgttgaa 120aagccaccac acttggctgc tttgatcttg
gctagaggag gttctaaggg tatcccattg 180aagaacatca agcacttggc tggtgttcca
ttgattggat gggttttgag agctgctttg 240gactctggtg ctttccaatc tgtttgggtt
tccactgacc acgacgagat tgagaacgtt 300gctaagcaat tcggtgctca ggttcacaga
agatcctctg aggtttccaa ggactcttct 360acttccttgg acgctatcat cgagttcttg
aactaccaca acgaggttga catcgttggt 420aacatccaag ctacttcccc atgtttgcac
ccaactgact tgcaaaaagt tgctgagatg 480atcagagaag agggttacga ctccgttttc
tccgttgtta gaaggcacca gttcagatgg 540tccgagattc agaagggtgt tagagaggtt
acagagccat tgaacttgaa cccagctaaa 600agaccaagaa ggcaggattg ggacggtgaa
ttgtacgaaa acggttcctt ctacttcgct 660aagagacact tgatcgagat gggatacttg
caaggtggaa agatggctta ctacgagatg 720agagctgaac actccgttga catcgacgtt
gatatcgact ggccaattgc tgagcagaga 780gttttgagat acggttactt cggaaaggag
aagttgaagg agatcaagtt gttggtttgt 840aacatcgacg gttgtttgac taacggtcac
atctacgttt ctggtgacca gaaggagatt 900atctcctacg acgttaagga cgctattggt
atctccttgt tgaagaagtc cggtatcgaa 960gttagattga tctccgagag agcttgttcc
aagcaaacat tgtcctcttt gaagttggac 1020tgtaagatgg aggtttccgt ttctgacaag
ttggctgttg ttgacgaatg gagaaaggag 1080atgggtttgt gttggaagga agttgcttac
ttgggtaacg aagtttctga cgaggagtgt 1140ttgaagagag ttggtttgtc tggtgctcca
gctgatgctt gttccactgc tcaaaaggct 1200gttggttaca tctgtaagtg taacggtggt
agaggtgcta ttagagagtt cgctgagcac 1260atctgtttgt tgatggagaa agttaataac
tcctgtcaga agtagtag 1308
SEQ ID NO: 95; length: 1080; type: DNA; organism: Artificial Sequence; description: Encodes human N-acetylneuraminate-9-phosphate synthase (HsSPS) codon optimized
95 atgccattgg aattggagtt gtgtcctggt agatgggttg
gtggtcaaca cccatgtttc 60atcatcgctg agatcggtca aaaccaccaa ggagacttgg
acgttgctaa gagaatgatc 120agaatggcta aggaatgtgg tgctgactgt gctaagttcc
agaagtccga gttggagttc 180aagttcaaca gaaaggcttt ggaaagacca tacacttcca
agcactcttg gggaaagact 240tacggagaac acaagagaca cttggagttc tctcacgacc
aatacagaga gttgcagaga 300tacgctgagg aagttggtat cttcttcact gcttctggaa
tggacgaaat ggctgttgag 360ttcttgcacg agttgaacgt tccattcttc aaagttggtt
ccggtgacac taacaacttc 420ccatacttgg aaaagactgc taagaaaggt agaccaatgg
ttatctcctc tggaatgcag 480tctatggaca ctatgaagca ggtttaccag atcgttaagc
cattgaaccc aaacttttgt 540ttcttgcagt gtacttccgc ttacccattg caaccagagg
acgttaattt gagagttatc 600tccgagtacc agaagttgtt cccagacatc ccaattggtt
actctggtca cgagactggt 660attgctattt ccgttgctgc tgttgctttg ggtgctaagg
ttttggagag acacatcact 720ttggacaaga cttggaaggg ttctgatcac tctgcttctt
tggaacctgg tgagttggct 780gaacttgtta gatcagttag attggttgag agagctttgg
gttccccaac taagcaattg 840ttgccatgtg agatggcttg taacgagaag ttgggaaagt
ccgttgttgc taaggttaag 900atcccagagg gtactatctt gactatggac atgttgactg
ttaaagttgg agagccaaag 960ggttacccac cagaggacat ctttaacttg gttggtaaaa
aggttttggt tactgttgag 1020gaggacgaca ctattatgga ggagttggtt gacaaccacg
gaaagaagat caagtcctag 1080
SEQ ID NO: 96; length: 1092; type: DNA; organism: Artificial Sequence; description: Encodes mouse alpha-2,6-sialyl transferase catalytic domain (MmmST6) (codon optimized)
96 gtttttcaaa tgccaaagtc ccaggagaaa gttgctgttg gtccagctcc
acaagctgtt 60ttctccaact ccaagcaaga tccaaaggag ggtgttcaaa tcttgtccta
cccaagagtt 120actgctaagg ttaagccaca accatccttg caagtttggg acaaggactc
cacttactcc 180aagttgaacc caagattgtt gaagatttgg agaaactact tgaacatgaa
caagtacaag 240gtttcctaca agggtccagg tccaggtgtt aagttctccg ttgaggcttt
gagatgtcac 300ttgagagacc acgttaacgt ttccatgatc gaggctactg acttcccatt
caacactact 360gaatgggagg gatacttgcc aaaggagaac ttcagaacta aggctggtcc
atggcataag 420tgtgctgttg tttcttctgc tggttccttg aagaactccc agttgggtag
agaaattgac 480aaccacgacg ctgttttgag attcaacggt gctccaactg acaacttcca
gcaggatgtt 540ggtactaaga ctactatcag attggttaac tcccaattgg ttactactga
gaagagattc 600ttgaaggact ccttgtacac tgagggaatc ttgattttgt gggacccatc
tgtttaccac 660gctgacattc cacaatggta tcagaagcca gactacaact tcttcgagac
ttacaagtcc 720tacagaagat tgcacccatc ccagccattc tacatcttga agccacaaat
gccatgggaa 780ttgtgggaca tcatccagga aatttcccca gacttgatcc aaccaaaccc
accatcttct 840ggaatgttgg gtatcatcat catgatgact ttgtgtgacc aggttgacat
ctacgagttc 900ttgccatcca agagaaagac tgatgtttgt tactaccacc agaagttctt
cgactccgct 960tgtactatgg gagcttacca cccattgttg ttcgagaaga acatggttaa
gcacttgaac 1020gaaggtactg acgaggacat ctacttgttc ggaaaggcta ctttgtccgg
tttcagaaac 1080aacagatgtt ag
1092
SEQ ID NO: 97; length: 2172; type: DNA; organism: Artificial Sequence; description: Encodes human UDP-GlcNAc 2-epimerase/N-acetylmannosamine kinase (HsGNE) (codon optimized)
97 atggaaaaga acggtaacaa cagaaagttg agagtttgtg ttgctacttg taacagagct
60gactactcca agttggctcc aatcatgttc ggtatcaaga ctgagccaga gttcttcgag
120ttggacgttg ttgttttggg ttcccacttg attgatgact acggtaacac ttacagaatg
180atcgagcagg acgacttcga catcaacact agattgcaca ctattgttag aggagaggac
240gaagctgcta tggttgaatc tgttggattg gctttggtta agttgccaga cgttttgaac
300agattgaagc cagacatcat gattgttcac ggtgacagat tcgatgcttt ggctttggct
360acttccgctg ctttgatgaa cattagaatc ttgcacatcg agggtggtga agtttctggt
420actatcgacg actccatcag acacgctatc actaagttgg ctcactacca tgtttgttgt
480actagatccg ctgagcaaca cttgatttcc atgtgtgagg accacgacag aattttgttg
540gctggttgtc catcttacga caagttgttg tccgctaaga acaaggacta catgtccatc
600atcagaatgt ggttgggtga cgacgttaag tctaaggact acatcgttgc tttgcagcac
660ccagttacta ctgacatcaa gcactccatc aagatgttcg agttgacttt ggacgctttg
720atctccttca acaagagaac tttggttttg ttcccaaaca ttgacgctgg ttccaaagag
780atggttagag ttatgagaaa gaagggtatc gaacaccacc caaacttcag agctgttaag
840cacgttccat tcgaccaatt catccagttg gttgctcatg ctggttgtat gatcggtaac
900tcctcctgtg gtgttagaga agttggtgct ttcggtactc cagttatcaa cttgggtact
960agacagatcg gtagagagac tggagaaaac gttttgcatg ttagagatgc tgacactcag
1020gacaagattt tgcaggcttt gcacttgcaa ttcggaaagc agtacccatg ttccaaaatc
1080tacggtgacg gtaacgctgt tccaagaatc ttgaagtttt tgaagtccat cgacttgcaa
1140gagccattgc agaagaagtt ctgtttccca ccagttaagg agaacatctc ccaggacatt
1200gaccacatct tggagacatt gtccgctttg gctgttgatt tgggtggaac taacttgaga
1260gttgctatcg tttccatgaa gggagagatc gttaagaagt acactcagtt caacccaaag
1320acttacgagg agagaatcaa cttgatcttg cagatgtgtg ttgaagctgc tgctgaggct
1380gttaagttga actgtagaat cttgggtgtt ggtatctcta ctggtggtag agttaatcca
1440agagagggta tcgttttgca ctccactaag ttgattcagg agtggaactc cgttgatttg
1500agaactccat tgtccgacac attgcacttg ccagtttggg ttgacaacga cggtaattgt
1560gctgctttgg ctgagagaaa gttcggtcaa ggaaagggat tggagaactt cgttactttg
1620atcactggta ctggtattgg tggtggtatc attcaccagc acgagttgat tcacggttct
1680tccttctgtg ctgctgaatt gggacacttg gttgtttctt tggacggtcc agactgttct
1740tgtggttccc acggttgtat tgaagcttac gcatcaggaa tggcattgca gagagaggct
1800aagaagttgc acgacgagga cttgttgttg gttgagggaa tgtctgttcc aaaggacgag
1860gctgttggtg ctttgcattt gatccaggct gctaagttgg gtaatgctaa ggctcagtcc
1920atcttgagaa ctgctggtac tgctttggga ttgggtgttg ttaatatctt gcacactatg
1980aacccatcct tggttatctt gtccggtgtt ttggcttctc actacatcca catcgttaag
2040gacgttatca gacagcaagc tttgtcctcc gttcaagacg ttgatgttgt tgtttccgac
2100ttggttgacc cagctttgtt gggtgctgct tccatggttt tggactacac tactagaaga
2160atctactaat ag
2172
SEQ ID NO: 98; length: 1302; type: DNA; organism: Artificial Sequence; description: Pichia pastoris TRP2 5' and ORF
98actgggcctt tagagggtgc tgaagttgac cccttggtgc ttctggaaaa agaactgaag
60ggcaccagac aagcgcaact tcctggtatt cctcgtctaa gtggtggtgc cataggatac
120atctcgtacg attgtattaa gtactttgaa ccaaaaactg aaagaaaact gaaagatgtt
180ttgcaacttc cggaagcagc tttgatgttg ttcgacacga tcgtggcttt tgacaatgtt
240tatcaaagat tccaggtaat tggaaacgtt tctctatccg ttgatgactc ggacgaagct
300attcttgaga aatattataa gacaagagaa gaagtggaaa agatcagtaa agtggtattt
360gacaataaaa ctgttcccta ctatgaacag aaagatatta ttcaaggcca aacgttcacc
420tctaatattg gtcaggaagg gtatgaaaac catgttcgca agctgaaaga acatattctg
480aaaggagaca tcttccaagc tgttccctct caaagggtag ccaggccgac ctcattgcac
540cctttcaaca tctatcgtca tttgagaact gtcaatcctt ctccatacat gttctatatt
600gactatctag acttccaagt tgttggtgct tcacctgaat tactagttaa atccgacaac
660aacaacaaaa tcatcacaca tcctattgct ggaactcttc ccagaggtaa aactatcgaa
720gaggacgaca attatgctaa gcaattgaag tcgtctttga aagacagggc cgagcacgtc
780atgctggtag atttggccag aaatgatatt aaccgtgtgt gtgagcccac cagtaccacg
840gttgatcgtt tattgactgt ggagagattt tctcatgtga tgcatcttgt gtcagaagtc
900agtggaacat tgagaccaaa caagactcgc ttcgatgctt tcagatccat tttcccagca
960ggtaccgtct ccggtgctcc gaaggtaaga gcaatgcaac tcataggaga attggaagga
1020gaaaagagag gtgtttatgc gggggccgta ggacactggt cgtacgatgg aaaatcgatg
1080gacacatgta ttgccttaag aacaatggtc gtcaaggacg gtgtcgctta ccttcaagcc
1140ggaggtggaa ttgtctacga ttctgacccc tatgacgagt acatcgaaac catgaacaaa
1200atgagatcca acaataacac catcttggag gctgagaaaa tctggaccga taggttggcc
1260agagacgaga atcaaagtga atccgaagaa aacgatcaat ga
1302
SEQ ID NO: 99; length: 1085; type: DNA; organism: Artificial Sequence; description: Pichia pastoris TRP2 3' region
99acggaggacg taagtaggaa tttatgtaat catgccaata catctttaga tttcttcctc
60ttctttttaa cgaaagacct ccagttttgc actctcgact ctctagtatc ttcccatttc
120tgttgctgca acctcttgcc ttctgtttcc ttcaattgtt cttctttctt ctgttgcact
180tggccttctt cctccatctt tcgttttttt tcaagccttt tcagcagttc ttcttccaag
240agcagttctt tgattttctc tctccaatcc accaaaaaac tggatgaatt caaccgggca
300tcatcaatgt tccactttct ttctcttatc aataatctac gtgcttcggc atacgaggaa
360tccagttgct ccctaatcga gtcatccaca aggttagcat gggccttttt cagggtgtca
420aaagcatctg gagctcgttt attcggagtc ttgtctggat ggatcagcaa agactttttg
480cggaaagtct ttcttatatc ttccggagaa caacctggtt tcaaatccaa gatggcatag
540ctgtccaatt tgaaagtgga aagaatcctg ccaatttcct tctctcgtgt cagctcgttc
600tcctcctttt gcaacaggtc cacttcatct ggcatttttc tttatgttaa ctttaattat
660tattaattat aaagttgatt atcgttatca aaataatcat attcgagaaa taatccgtcc
720atgcaatata taaataagaa ttcataataa tgtaatgata acagtacctc tgatgacctt
780tgatgaaccg caattttctt tccaatgaca agacatccct ataatacaat tatacagttt
840atatatcaca aataatcacc tttttataag aaaaccgtcc tctccgtaac agaacttatt
900atccgcacgt tatggttaac acactactaa taccgatata gtgtatgaag tcgctacgag
960atagccatcc aggaaactta ccaattcatc agcactttca tgatccgatt gttggcttta
1020ttctttgcga gacagatact tgccaatgaa ataactgatc ccacagatga gaatccggtg
1080ctcgt
1085
SEQ ID NO: 100; length: 747; type: DNA; organism: Artificial Sequence; description: 5'-Region of STE13
100ttgggggcct
ccaggacttg ctgaaatttg ctgactcatc ttcgccatcc aaggataatg 60agttagctaa
tgtgacagtt aatgagtcgt cttgactaac ggggaacatt tcattattta 120tatccagagt
caatttgata gcagagtttg tggttgaaat acctatgatt cgggagactt 180tgttgtaacg
accattatcc acagtttgga ccgtgaaaat gtcatcgaag agagcagacg 240acatattatc
tattgtggta agtgatagtt ggaagtccga ctaaggcatg aaaatgagaa 300gactgaaaat
ttaaagtttt tgaaaacact aatcgggtaa taacttggaa attacgttta 360cgtgccttta
gctcttgtcc ttacccctga taatctatcc atttcccgag agacaatgac 420atctcggaca
gctgagaacc cgttcgatat agagcttcaa gagaatctaa gtccacgttc 480ttccaattcg
tccatattgg aaaacattaa tgagtatgct agaagacatc gcaatgattc 540gctttcccaa
gaatgtgata atgaagatga gaacgaaaat ctcaattata ctgataactt 600ggccaagttt
tcaaagtctg gagtatcaag aaagagctgt atgctaatat ttggtatttg 660ctttgttatc
tggctgtttc tctttgcctt gtatgcgagg gacaatcgat tttccaattt 720gaacgagtac
gttccagatt caaacag
747
SEQ ID NO: 101; length: 924; type: DNA; organism: Artificial Sequence; description: 3'-Region of STE13
101ctactgggaa
ccacgagaca tcactgcagt agtttccaag tggatttcag atcactcatt 60tgtgaatcct
gacaaaactg cgatatgggg gtggtcttac ggtgggttca ctacgcttaa 120gacattggaa
tatgattctg gagaggtttt caaatatggt atggctgttg ctccagtaac 180taattggctt
ttgtatgact ccatctacac tgaaagatac atgaaccttc caaaggacaa 240tgttgaaggc
tacagtgaac acagcgtcat taagaaggtt tccaatttta agaatgtaaa 300ccgattcttg
gtttgtcacg ggactactga tgataacgtg cattttcaga acacactaac 360cttactggac
cagttcaata ttaatggtgt tgtgaattac gatcttcagg tgtatcccga 420cagtgaacat
agcattgccc atcacaacgc aaataaagtg atctacgaga ggttattcaa 480gtggttagag
cgggcattta acgatagatt tttgtaacat tccgtacttc atgccatact 540atatatcctg
caaggtttcc ctttcagaca caataattgc tttgcaattt tacataccac 600caattggcaa
aaataatctc ttcagtaagt tgaatgcttt tcaagccagc accgtgagaa 660attgctacag
cgcgcattct aacatcactt taaaattccc tcgccggtgc tcactggagt 720ttccaaccct
tagcttatca aaatcgggtg ataactctga gttttttttt tcacttctat 780tcctaaacct
tcgcccaatg ctaccacctc caatcaacat cccgaaatgg atagaagaga 840atggacatct
cttgcaacct ccggttaata attactgtct ccacagagga ggatttacgg 900taatgattgt
aggtgggcct aatg
924
SEQ ID NO: 102; length: 980; type: DNA; organism: Artificial Sequence; description: 5'-Region of DAP2
102cacctgggcc tgttgctgct
ggtactgctg ttggaactgt tggtattgtt gctgatctaa 60ggccgcctgt tccacaccgt
gtgtatcgaa tgcttgggca aaatcatcgc ctgccggagg 120ccccactacc gcttgttcct
cctgctcttg tttgttttgc tcattgatga tatcggcgtc 180aatgaattga tcctcaatcg
tgtggtggtg gtgtcgtgat tcctcttctt tcttgagtgc 240cttatccata ttcctatctt
agtgtaccaa taattttgtt aaacacacgc tgttgtttat 300gaaaagtcgt caaaaggtta
aaaattctac ttggtgtgtg tcagagaaag tagtgcagac 360ccccagtttg ttgactagtt
gagaaggcgg ctcactattg cgcgaatagc atgagaaatt 420tgcaaacatc tggcaaagtg
gtcaatacct gccaacctgc caatcttcgc gacggaggct 480gttaagcggg ttgggttccc
aaagtgaatg gatattacgg gcaggaaaaa cagccccttc 540cacactagtc tttgctactg
acatcttccc tctcatgtat cccgaacaca agtatcggga 600gtatcaacgg agggtgccct
tatggcagta ctccctgttg gtgattgtac tgctatacgg 660gtctcatttg cttatcagca
ccatcaactt gatacactat aaccacaaaa attatcatgc 720acacccagtc aatagtggta
tcgttcttaa tgagtttgct gatgacgatt cattctcttt 780gaatggcact ctgaacttgg
agaactggag aaatggtacc ttttccccta aatttcattc 840cattcagtgg accgaaatag
gtcaggaaga tgaccaggga tattacattc tctcttccaa 900ttcctcttac atagtaaagt
ctttatccga cccagacttt gaatctgttc tattcaacga 960gtctacaatc acttacaacg
980
SEQ ID NO: 103; length: 1117; type: DNA; organism: Artificial Sequence; description: 3'-Region of DAP2
103ggcagcaaag ccttacgttg atgagaatag actggccatt
tggggttggt cttatggagg 60ttacatgacg ctaaaggttt tagaacagga taaaggtgaa
acattcaaat atggaatgtc 120tgttgcccct gtgacgaatt ggaaattcta tgattctatc
tacacagaaa gatacatgca 180cactcctcag gacaatccaa actattataa ttcgtcaatc
catgagattg ataatttgaa 240gggagtgaag aggttcttgc taatgcacgg aactggtgac
gacaatgttc acttccaaaa 300tacactcaaa gttctagatt tatttgattt acatggtctt
gaaaactatg atatccacgt 360gttccctgat agtgatcaca gtattagata tcacaacggt
aatgttatag tgtatgataa 420gctattccat tggattaggc gtgcattcaa ggctggcaaa
taaataggtg caaaaatatt 480attagacttt ttttttcgtt cgcaagttat tactgtgtac
cataccgatc caatccgtat 540tgtaattcat gttctagatc caaaatttgg gactctaatt
catgaggtct aggaagatga 600tcatctctat agttttcagc ggggggctcg atttgcggtt
ggtcaaagct aacatcaaaa 660tgtttgtcag gttcagtgaa tggtaactgc tgctcttgaa
ttggtcgtct gacaaattct 720ctaagtgata gcacttcatc tacaatcatt tgcttcatcg
tttctatatc gtccacgacc 780tcaaacgaga aatcgaattt ggaagaacag acgggctcat
cgttaggatc atgccaaacc 840ttgagatatg gatgctctaa agcctcagta actgtaattc
tgtgagtggg atctaccgtg 900agcattcgat ccagtaagtc tatcgcttca gggttggcac
cgggaaataa ctggctgaat 960gggatcttgg gcatgaatgg cagggagcga acataatcct
gggcacgctc tgatctgata 1020gactgaagtg tctcttccga aacagtaccc agcgtactca
aaatcaagtt caattgatcc 1080acatagtctc ttcctctaaa aatgggtcgg ccaccta
1117
SEQ ID NO: 104; length: 1666; type: DNA; organism: Artificial Sequence; description: HYGR resistance cassette
104gatctgttta gcttgcctcg tccccgccgg gtcacccggc cagcgacatg
gaggcccaga 60ataccctcct tgacagtctt gacgtgcgca gctcaggggc atgatgtgac
tgtcgcccgt 120acatttagcc catacatccc catgtataat catttgcatc catacatttt
gatggccgca 180cggcgcgaag caaaaattac ggctcctcgc tgcggacctg cgagcaggga
aacgctcccc 240tcacagacgc gttgaattgt ccccacgccg cgcccctgta gagaaatata
aaaggttagg 300atttgccact gaggttcttc tttcatatac ttccttttaa aatcttgcta
ggatacagtt 360ctcacatcac atccgaacat aaacaaccat gggtaaaaag cctgaactca
ccgcgacgtc 420tgtcgagaag tttctgatcg aaaagttcga cagcgtctcc gacctgatgc
agctctcgga 480gggcgaagaa tctcgtgctt tcagcttcga tgtaggaggg cgtggatatg
tcctgcgggt 540aaatagctgc gccgatggtt tctacaaaga tcgttatgtt tatcggcact
ttgcatcggc 600cgcgctcccg attccggaag tgcttgacat tggggaattc agcgagagcc
tgacctattg 660catctcccgc cgtgcacagg gtgtcacgtt gcaagacctg cctgaaaccg
aactgcccgc 720tgttctgcag ccggtcgcgg aggccatgga tgcgatcgct gcggccgatc
ttagccagac 780gagcgggttc ggcccattcg gaccgcaagg aatcggtcaa tacactacat
ggcgtgattt 840catatgcgcg attgctgatc cccatgtgta tcactggcaa actgtgatgg
acgacaccgt 900cagtgcgtcc gtcgcgcagg ctctcgatga gctgatgctt tgggccgagg
actgccccga 960agtccggcac ctcgtgcacg cggatttcgg ctccaacaat gtcctgacgg
acaatggccg 1020cataacagcg gtcattgact ggagcgaggc gatgttcggg gattcccaat
acgaggtcgc 1080caacatcttc ttctggaggc cgtggttggc ttgtatggag cagcagacgc
gctacttcga 1140gcggaggcat ccggagcttg caggatcgcc gcggctccgg gcgtatatgc
tccgcattgg 1200tcttgaccaa ctctatcaga gcttggttga cggcaatttc gatgatgcag
cttgggcgca 1260gggtcgatgc gacgcaatcg tccgatccgg agccgggact gtcgggcgta
cacaaatcgc 1320ccgcagaagc gcggccgtct ggaccgatgg ctgtgtagaa gtactcgccg
atagtggaaa 1380ccgacgcccc agcactcgtc cgagggcaaa ggaataatca gtactgacaa
taaaaagatt 1440cttgttttca agaacttgtc atttgtatag tttttttata ttgtagttgt
tctattttaa 1500tcaaatgtta gcgtgattta tatttttttt cgcctcgaca tcatctgccc
agatgcgaag 1560ttaagtgcgc agaaagtaat atcatgcgtc aatcgtatgt gaatgctggt
cgctatactg 1620ctgtcgattc gatactaacg ccgccatcca gtgtcgaaaa cgagct
1666
SEQ ID NO: 105; length: 365; type: DNA; organism: Artificial Sequence; description: Sequence of Pichia pastoris TRP5 5' integration fragment
105acgacggcca aattcatgat acacactctg
tttcagctgg tttggactac cctggagttg 60gtcctgaatt ggctgcctgg aaagcaaatg
gtagagccca attttccgct gtaactgatg 120cccaagcatt agagggattc aaaatcctgt
ctcaattgga agggatcatt ccagcactag 180agtctagtca tgcaatctac ggcgcattgc
aaattgcaaa gactatgtct tcggaccagt 240ccttagttat taatgtatct ggaaggggtg
ataaggacgt ccagagtgta gctgagattt 300tacctaaatt gggacctcaa attggatggg
atttgcgttt cagcgaagac attactaaag 360agtga
365
SEQ ID NO: 106; length: 613; type: DNA; organism: Artificial Sequence; description: Sequence of Pichia pastoris TRP5 3' integration fragment
106tcgatagcac
aatattcaac ttgactgggt gttaagaact aagagctctg ggaaactttg 60tatttattac
taccaacaca gtcaaattat tggatgtgtt tttttttcca gtacatttca 120ctgagcagtt
tgttatactc ggtctttaat ctccatatac atgcagattg taatacagat 180ctgaacagtt
tgattctgat tgatcttgcc accaatattc tatttttgta tcaagtaaca 240gagtcaatga
tcattggtaa cgtaacggtt ttcgtgtata gtagttagag cccatcttgt 300aacctcattt
cctcccatat taaagtatca gtgattcgct ggaacgatta actaagaaaa 360aaaaaatatc
tgcacatact catcagtctg taaatctaag tcaaaactgc tgtatccaat 420agaaatcggg
atatacctgg atgttttttc cacataaaca aacgggagtt cagcttactt 480atggtgttga
tgcaattcag tatgatccta ccaataaaac gaaactttgg gattttggct 540gtttgaggga
tcaaaagctg cacctttaca agattgacgg atcgaccatt agaccaaagc 600aaatggccac
caa
613
SEQ ID NO: 107; length: 384; type: DNA; organism: Artificial Sequence; description: Encodes human GM-CSF
107ccagctagat
ctccatctcc atccactcaa ccatgggaac acgttaacgc tatccaagag 60gctttgagat
tgttgaactt gtccagagac actgctgctg aaatgaacga gactgttgag 120gttatctccg
agatgttcga cttgcaagag ccaacttgtt tgcagactag attggagttg 180tacaagcagg
gattgagagg atccttgact aagttgaagg gaccattgac tatgatggct 240tcccactaca
agcaacactg tccaccaact ccagaaacat cctgtgctac tcagatcatc 300actttcgagt
ccttcaaaga gaacttgaag gacttcttgt tggttatccc attcgactgt 360tgggaaccag
ttcaagaata ataa
384
SEQ ID NO: 108; length: 126; type: PRT; organism: Homo sapiens
108Pro Ala Arg Ser Pro Ser Pro Ser Thr Gln Pro
Trp Glu His Val Asn 1 5 10
15 Ala Ile Gln Glu Ala Leu Arg Leu Leu Asn Leu Ser Arg Asp Thr Ala
20 25 30 Ala Glu
Met Asn Glu Thr Val Glu Val Ile Ser Glu Met Phe Asp Leu 35
40 45 Gln Glu Pro Thr Cys Leu Gln
Thr Arg Leu Glu Leu Tyr Lys Gln Gly 50 55
60 Leu Arg Gly Ser Leu Thr Lys Leu Lys Gly Pro Leu
Thr Met Met Ala 65 70 75
80 Ser His Tyr Lys Gln His Cys Pro Pro Thr Pro Glu Thr Ser Cys Ala
85 90 95 Thr Gln Ile
Ile Thr Phe Glu Ser Phe Lys Glu Asn Leu Lys Asp Phe 100
105 110 Leu Leu Val Ile Pro Phe Asp Cys
Trp Glu Pro Val Gln Glu 115 120
125
SEQ ID NO: 109; length: 1275; type: DNA; organism: Artificial Sequence; description: Encodes CWP1-GMCSF fusion protein
109atgttcaacc tgaaaactat tctcatctca acacttgcat cgatcgctgt tgccgaccaa
60accttcggtg tccttctaat ccggagtgga tccccatatc actattcgac tctcactaat
120agagacgaaa agattgttgc tggaggtggc aacaaaaaag tgaccctcac agatgaggga
180gctctgaagt atgatggtgg taaatggata ggtcttgatg atgatggcta tgcggtacag
240accgacaaac cagttacagg ttggagcact aacggtggat acctctattt tgaccaaggc
300ttaattgttt gcacggagga ctatatcgga tatgtgaaga aacatggtga atgcaaaggt
360gacagctatg gtatggcttg gaaggtactc ccagccgacg atgacaagga tgatgacaag
420gatgatgata aagatgatga caaggattat gacgatgaca atgaccacgg tgatggtgat
480tactattgct cgatcacagg aacctatgcc atcaaatcca aaggcagtaa gcatcaatac
540gaggccatca aaaaagttga tgcacatcct catgtcttct ctgtaggagg agatcaggga
600aacgatctga ttgtgacttt ccaaaaggat tgttcgctgg tagatcaaga taacagaggc
660gtatatgttg accctaattc tggagaagtc ggaaacgttg acccttgggg agaactcacg
720ccatctgtta aatgggatat tgacgacgga tacctgatct ttaatggtga gtccaatttc
780aggtcatgtc catctggtaa tggatattca ttgtctatca aggattgtgt tgggggaact
840gacattggcc ttaaagtatg ggagaaaggt ggaggttctt tggttaagag ggctccagct
900agatctccat ctccatccac tcaaccatgg gaacacgtta acgctatcca agaggctttg
960agattgttga acttgtccag agacactgct gctgaaatga acgagactgt tgaggttatc
1020tccgagatgt tcgacttgca agagccaact tgtttgcaga ctagattgga gttgtacaag
1080cagggattga gaggatcctt gactaagttg aagggaccat tgactatgat ggcttcccac
1140tacaagcaac actgtccacc aactccagaa acatcctgtg ctactcagat catcactttc
1200gagtccttca aagagaactt gaaggacttc ttgttggtta tcccattcga ctgttgggaa
1260ccagttcaag aataa
1275
SEQ ID NO: 110; length: 424; type: PRT; organism: Artificial Sequence; description: CWP1-GMCSF fusion protein
110Met Phe Asn
Leu Lys Thr Ile Leu Ile Ser Thr Leu Ala Ser Ile Ala 1 5
10 15 Val Ala Asp Gln Thr Phe Gly Val
Leu Leu Ile Arg Ser Gly Ser Pro 20 25
30 Tyr His Tyr Ser Thr Leu Thr Asn Arg Asp Glu Lys Ile
Val Ala Gly 35 40 45
Gly Gly Asn Lys Lys Val Thr Leu Thr Asp Glu Gly Ala Leu Lys Tyr 50
55 60 Asp Gly Gly Lys
Trp Ile Gly Leu Asp Asp Asp Gly Tyr Ala Val Gln 65 70
75 80 Thr Asp Lys Pro Val Thr Gly Trp Ser
Thr Asn Gly Gly Tyr Leu Tyr 85 90
95 Phe Asp Gln Gly Leu Ile Val Cys Thr Glu Asp Tyr Ile Gly
Tyr Val 100 105 110
Lys Lys His Gly Glu Cys Lys Gly Asp Ser Tyr Gly Met Ala Trp Lys
115 120 125 Val Leu Pro Ala
Asp Asp Asp Lys Asp Asp Asp Lys Asp Asp Asp Lys 130
135 140 Asp Asp Asp Lys Asp Tyr Asp Asp
Asp Asn Asp His Gly Asp Gly Asp 145 150
155 160 Tyr Tyr Cys Ser Ile Thr Gly Thr Tyr Ala Ile Lys
Ser Lys Gly Ser 165 170
175 Lys His Gln Tyr Glu Ala Ile Lys Lys Val Asp Ala His Pro His Val
180 185 190 Phe Ser Val
Gly Gly Asp Gln Gly Asn Asp Leu Ile Val Thr Phe Gln 195
200 205 Lys Asp Cys Ser Leu Val Asp Gln
Asp Asn Arg Gly Val Tyr Val Asp 210 215
220 Pro Asn Ser Gly Glu Val Gly Asn Val Asp Pro Trp Gly
Glu Leu Thr 225 230 235
240 Pro Ser Val Lys Trp Asp Ile Asp Asp Gly Tyr Leu Ile Phe Asn Gly
245 250 255 Glu Ser Asn Phe
Arg Ser Cys Pro Ser Gly Asn Gly Tyr Ser Leu Ser 260
265 270 Ile Lys Asp Cys Val Gly Gly Thr Asp
Ile Gly Leu Lys Val Trp Glu 275 280
285 Lys Gly Gly Gly Ser Leu Val Lys Arg Ala Pro Ala Arg Ser
Pro Ser 290 295 300
Pro Ser Thr Gln Pro Trp Glu His Val Asn Ala Ile Gln Glu Ala Leu 305
310 315 320 Arg Leu Leu Asn Leu
Ser Arg Asp Thr Ala Ala Glu Met Asn Glu Thr 325
330 335 Val Glu Val Ile Ser Glu Met Phe Asp Leu
Gln Glu Pro Thr Cys Leu 340 345
350 Gln Thr Arg Leu Glu Leu Tyr Lys Gln Gly Leu Arg Gly Ser Leu
Thr 355 360 365 Lys
Leu Lys Gly Pro Leu Thr Met Met Ala Ser His Tyr Lys Gln His 370
375 380 Cys Pro Pro Thr Pro Glu
Thr Ser Cys Ala Thr Gln Ile Ile Thr Phe 385 390
395 400 Glu Ser Phe Lys Glu Asn Leu Lys Asp Phe Leu
Leu Val Ile Pro Phe 405 410
415 Asp Cys Trp Glu Pro Val Gln Glu 420
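As a consistency check on the two fusion entries, the codons of SEQ ID NO: 109 that span the CWP1/GM-CSF junction can be translated and compared with the residues listed at the same place in SEQ ID NO: 110 (Gly Gly Gly Ser Leu Val Lys Arg). The following is a minimal sketch, not part of the application: the `translate` helper and the compact table layout are assumptions of this example, it assumes the standard genetic code, and the 24-nucleotide fragment is copied from nucleotides 868-891 of SEQ ID NO: 109 as printed above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in reading frame 0, ignoring whitespace and case.

    Stop codons are rendered as '*'; a trailing partial codon is dropped.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical variable name; fragment copied from SEQ ID NO: 109.
junction = "ggtggaggttctttggttaagagg"
print(translate(junction))  # GGGSLVKR
```

The same helper can be pointed at the first codons of SEQ ID NO: 107 to compare them with the start of SEQ ID NO: 108.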
SEQ ID NO: 111; length: 8; type: PRT; organism: Artificial Sequence; description: KEX2 linker
111Gly Gly Gly Ser Leu Val Lys Arg
1 5
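The protein entries in this listing interleave three-letter residue codes with position numbers, and a sequence identifier can be fused to the first residue. A minimal sketch of one way to recover a plain one-letter sequence from such text is given below. The function name `to_one_letter` is hypothetical, the mapping covers only the 20 standard amino acids, and the input string is the KEX2 linker entry (SEQ ID NO: 111) as printed above.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing_text: str) -> str:
    """Convert listing text to a one-letter sequence.

    Position numbers are skipped, and digits fused to the front of a
    residue (for example '111Gly') are stripped before lookup.
    """
    residues = []
    for token in listing_text.split():
        name = token.lstrip("0123456789")
        if name:  # pure numbers become empty strings and are ignored
            residues.append(THREE_TO_ONE[name])
    return "".join(residues)


linker = to_one_letter("111Gly Gly Gly Ser Leu Val Lys Arg 1 5")
print(linker, len(linker))  # GGGSLVKR 8
```

The recovered length of 8 agrees with the length stated for SEQ ID NO: 111.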