Patent application title: ENTROPIC BRISTLE DOMAIN SEQUENCES AND THEIR USE IN RECOMBINANT PROTEIN PRODUCTION
Inventors:
A. Keith Dunker (Indianapolis, IN, US)
Vladimir N. Uversky (Carmel, IN, US)
Marc S. Cortese (Indianapolis, IN, US)
James Mueller (Indianapolis, IN, US)
Assignees:
MOLECULAR KINETICS INCORPORATED
IPC8 Class: AC12P2106FI
USPC Class:
435 691
Class name: Chemistry: molecular biology and microbiology micro-organism, tissue cell culture or enzyme using process to synthesize a desired chemical compound or composition recombinant dna technique included in method of making a protein or polypeptide
Publication date: 2009-09-03
Patent application number: 20090221032
Agents:
SEED INTELLECTUAL PROPERTY LAW GROUP PLLC
Origin: SEATTLE, WA US
Abstract:
Compositions and methods for recombinant protein production and, more
particularly, fusion polypeptides, polynucleotides encoding fusion
polypeptides, expression vectors, kits, and related methods for
recombinant protein production, are provided.
Claims:
1. An isolated fusion polypeptide comprising at least one entropic bristle
domain (EBD) sequence and at least one heterologous polypeptide sequence,
wherein the fusion polypeptide has increased solubility relative to the
heterologous polypeptide sequence, reduced aggregation relative to the
heterologous polypeptide sequence and/or improved folding relative to the
heterologous polypeptide sequence.
2. The polypeptide according to claim 1, wherein the EBD sequence is derived from a mammalian neurofilament protein.
3. The polypeptide according to claim 1, wherein the EBD sequence is derived from a mammalian neurofilament NF-H protein.
4. The polypeptide according to claim 1, wherein the EBD sequence is derived from a human neurofilament NF-H protein having a sequence set forth in SEQ ID NO: 1 or 3.
5. The polypeptide according to claim 1, wherein the EBD sequence comprises a neurofilament NF-H sequence selected from the group consisting of SPEAEK (SEQ ID NO:23), SPAAVK (SEQ ID NO:24), SPAEAK (SEQ ID NO:25), SPAEPK (SEQ ID NO:26), SPAEVK (SEQ ID NO:27), SPATVK (SEQ ID NO:28), SPEKAK (SEQ ID NO:29), SPGEAK (SEQ ID NO:30), SPIEVK (SEQ ID NO:31), SPPEAK (SEQ ID NO:32), SPSEAK (SEQ ID NO:33), SPEKEAK (SEQ ID NO:34), SPAKEKAK (SEQ ID NO:35), SPEKEEAK (SEQ ID NO:36), SPTKEEAK (SEQ ID NO:37), SPVKEEAK (SEQ ID NO:38), SPVKAEAK (SEQ ID NO:39), SPVKEEAK (SEQ ID NO:40), SPVKEEVK (SEQ ID NO:41), SPVKEEEKP (SEQ ID NO:42), SPEKAKTLDVK (SEQ ID NO:43), SPADKFPEKAK (SEQ ID NO:44), SPEAKTPAKEEAR (SEQ ID NO:45), SPEKAKTPVKEGAK (SEQ ID NO:46), SPVKEEAKTPEKAK (SEQ ID NO:47), SPVKEGAKPPEKAKPLDVK (SEQ ID NO:48), SPVKEDIKPPAEAKSPEKAK (SEQ ID NO:49), SPLKEDAKAPEKEIPKKEEVK (SEQ ID NO:50), SPEKEEAKTSEKVAPKKEEVK (SEQ ID NO:51), SPEAQTPVQEEATVPTDIRPPEQVK (SEQ ID NO:52), SPVKEEVKAKEPPKKVEEEKTLPTPKTEAKESKKDE (SEQ ID NO:53), or a combination thereof.
6. The polypeptide according to claim 1, wherein the EBD sequence is derived from a mammalian neurofilament protein NF-M.
7. The polypeptide according to claim 1, wherein the EBD sequence is derived from a mammalian neurofilament NF-M protein having the sequence set forth in any one of SEQ ID NOs: 5, 7, 9, 11, 13 or 15.
8. The polypeptide according to claim 1, wherein the EBD sequence comprises a neurofilament NF-M sequence selected from the group consisting of SPPK (SEQ ID NO:54), SPVK (SEQ ID NO:55), SPAAK (SEQ ID NO:56), SPAPK (SEQ ID NO:57), SPEAK (SEQ ID NO:58), SPMPK (SEQ ID NO:59), SPPAK (SEQ ID NO:60), SPTAK (SEQ ID NO:61), SPTTK (SEQ ID NO:62), SPVAK (SEQ ID NO:63), SPVAK (SEQ ID NO:64), SPVPK (SEQ ID NO:65), SPVSK (SEQ ID NO:66), SPEKPA (SEQ ID NO:67), SPVEEKAK (SEQ ID NO:68), SPVEEKGK (SEQ ID NO:69), SPVEEVKP (SEQ ID NO:70), SPEKPATPKVT (SEQ ID NO:71), SPEKPRTPEKPA (SEQ ID NO:72), SPEKPTTPEKW (SEQ ID NO:73), SPEKPSSPLKDEKA (SEQ ID NO:74), SPVKEKAVEEMITIT (SEQ ID NO:75), SPVKEEAAEEAATITK (SEQ ID NO:76), SPVPKSPVEEVKPKAEATAG (SEQ ID NO:77), SPVKAESPVKEEVPAKPVKV (SEQ ID NO:78), SPEKEAKEEEKPQEKEKEKEK (SEQ ID NO:79), SPVKATTPEIKEEEGEKEEEGQE (SEQ ID NO:80), SPVEEVKPKPEAKAGKGEQKEE (SEQ ID NO:81), SPEKPATPEKPPTPEKAITPEKVR (SEQ ID NO:82), SPEKPATPEKPRTPEKPATPEKPR (SEQ ID NO:83), SPKEEKVEKKEEKPKDVPKKKAE (SEQ ID NO:84), SPKEEKAEKKEEKPKDVPEKKKAE (SEQ ID NO:85), SPVEEAKSKAEVGKGEQKEEEEKE (SEQ ID NO:86), SPKEEKVEKKEEKPKDVPDKKKAE (SEQ ID NO:87), SPVKEEAVAEVVTITKSVKVHLEKET (SEQ ID NO:88), SSEKDEGEQEEEEGETEAEGEGEEAEAKEEK (SEQ ID NO:89), SPVEEVKPKAEAGAEKGEQKEKVEEEKKEAKE (SEQ ID NO:90), SPVTEQAKAVQKAAAEVGKDQKAEKAAEKAAKEEKAA (SEQ ID NO:91), SPEAKEEEEEGEKEEEEEGQEEEEEEDEGVKSDQAEEGGSEKEG (SEQ ID NO:92), or a combination thereof.
9. The polypeptide according to claim 1, wherein the EBD sequence is derived from a phage sequence.
10. The polypeptide according to claim 1, wherein the EBD sequence is derived from a filamentous phage fd.
11. The polypeptide according to claim 1, wherein the EBD sequence comprises at least one linker region derived from a filamentous phage fd adsorption protein pIII having a sequence set forth in SEQ ID NO: 17.
12. The polypeptide according to claim 1, wherein the EBD sequence comprises a filamentous phage fd adsorption protein pIII sequence selected from the group consisting of EGGGS (SEQ ID NO:93), EGGGT (SEQ ID NO:94), SEGGG (SEQ ID NO:95), GGGSGGG (SEQ ID NO:96), SGGGSGSG (SEQ ID NO:97), and SGGGSEGGG (SEQ ID NO:98), or a combination thereof.
13. The polypeptide according to claim 1, wherein the EBD sequence is derived from a nuclear pore Nup2p protein having a sequence set forth in SEQ ID NO: 19.
14. The polypeptide according to claim 1, wherein the EBD sequence comprises a yeast nucleoporin Nup2p sequence selected from the group consisting of FSFGTSQPNNTPS (SEQ ID NO:99), FSFSIPSKNTPDASKPS (SEQ ID NO:100), FVFGQAAAKPSLEKSS (SEQ ID NO:101), FSFGVPNSSKNETSKPV (SEQ ID NO:102), FTFGTKHAADSQNNKPS (SEQ ID NO:103), FTFGSSALADNKEDVKKP (SEQ ID NO:104), FSFGINTNTTKTADTKAPT (SEQ ID NO:105), FSFGKTTANLPANSSTSPAPSIPSTG (SEQ ID NO:106), FSFGPKKENRKKDESDSENDIEIKGPE (SEQ ID NO:107), FKFSGTVSSDVFKLNPSTDKNEKKTETNAKP (SEQ ID NO:108), FKFSLPFEQKGSQTTTNDSKEESTTEATGNESQ (SEQ ID NO:109), FTFGSTTIEKKNDENSTSNSKPEKSSDSNDSNPS (SEQ ID NO:110), FSFGISNGSESKDSDKPSLPSAVDGENDKKEATKPA (SEQ ID NO:111), FSFSSATSTTEQTKSKNPLSLTEATKTNVDNNSKAEAS (SEQ ID NO:112) and FSFGAATPSAKEASQEDDNNNVEKPSSKPAFNLISNAGTEKEKESKKDSKPA (SEQ ID NO:113), or a combination thereof.
15. The polypeptide according to claim 1, wherein the EBD sequence is derived from a mammalian elastin protein.
16. The polypeptide according to claim 1, wherein the EBD sequence is derived from a mouse elastin protein having a sequence set forth in SEQ ID NO: 21.
17. The polypeptide according to claim 1, wherein the EBD sequence is an elastin sequence selected from the group consisting of VPGA (SEQ ID NO:114), GAGGL (SEQ ID NO:115), GAGGG (SEQ ID NO:116), VPGVG (SEQ ID NO:117), VPGFGAGA (SEQ ID NO:118), VPGALPGA (SEQ ID NO:119), VPGFGAGAG (SEQ ID NO:120), VPAVPGAGG (SEQ ID NO:121), VPGGVGVGG (SEQ ID NO:122), VGAGGFPGYG (SEQ ID NO:123), VPGAVPGGLPGG (SEQ ID NO:124), VSPAAAAKAAKYGAA (SEQ ID NO:125), VPQVGAGIGAGGKPGK (SEQ ID NO:126), VPGGVGVGGIPGGVGVGG (SEQ ID NO:127), VPGGVGGIGGIGGLGVSTGAV (SEQ ID NO:128), VPGGAAGAAAAYKAAAKAGAGLGGVGG (SEQ ID NO:129), VSPAAAAKAAAKAAKYGARGGVGIPTYG (SEQ ID NO:130), KPPKPYGGALGALGYQGGGCFGKSCGRKRK (SEQ ID NO:131), VPGAGTPAAAAAAAAAKAAAKAGLGPGVGG (SEQ ID NO:132), VPGRVAGAAPPAAAAAAAKAAAKAAQYGLG (SEQ ID NO:133), VPGVGLPGVYPGGVLPGTGARFPGVGVLPG (SEQ ID NO:134), VPTGTGVKAKAPGGGGAFSGIPGVGPFGGQQPG (SEQ ID NO:135), VPGGVYYPGAGIGGLGGGGGALGPGGKPPKPGAG (SEQ ID NO:136), VGAGAGLGGASPAAAAAAAKAAKYGAGGAGALGGL (SEQ ID NO:137), GLGGVLGARPFPGGGVAARPGFGLSPIYPGGGAGGLGVGG (SEQ ID NO:138), VPGSLAASKAAKYGAAGGLGGPGGLGGPGGLGGPGGLGGAG (SEQ ID NO:139), VPGGPGVRLPGAGIPGVGGIPGVGGIPGVGGPGIGGPGIVGGPGA (SEQ ID NO:140), VLPGVGGGGIPGGAGAIPGIGGIAGAGTPAAAAAAKAAAKAAKYGAAGGL (SEQ ID NO:141), VPGGVGPGGVTGIGAGPGGLGGAGSPAAAKSAAKAAAKAQYRAAAGLGAG (SEQ ID NO:142), and VPLGYPIKAPKLPGGYGLPYTNGKLPYGVAGAGGKAGYPTGTGVGSQAAAAAAKAAKYGAGGAG (SEQ ID NO:143), or a combination thereof.
18. The polypeptide according to claim 1, wherein the polypeptide further comprises a cleavable linker.
19. An isolated polynucleotide encoding a fusion polypeptide according to claim 1.
20. An expression vector comprising an isolated polynucleotide according to claim 19.
21. A host cell comprising an expression vector according to claim 20.
22. A kit comprising an isolated polynucleotide according to claim 19.
23. A kit comprising an expression vector according to claim 20.
24. A kit comprising a host cell according to claim 21.
25. A method for producing a recombinant protein comprising the steps of: (a) introducing into a host cell a polynucleotide according to claim 19 or an expression vector according to claim 20; and (b) expressing in the host cell a fusion polypeptide comprising at least one EBD sequence and at least one heterologous polypeptide sequence.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001]This application is a continuation of U.S. patent application Ser. No. 11/485,613, filed Jul. 11, 2006, now U.S. Pat. No. 7,494,788, which application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application No. 60/698,456, filed Jul. 11, 2005, where these applications are incorporated herein by reference in their entireties.
STATEMENT REGARDING SEQUENCE LISTING
[0002]The Sequence Listing associated with this application is provided in text format in lieu of a paper copy, and is hereby incorporated by reference into the specification. The name of the text file containing the Sequence Listing is 670098--402Cl_SEQUENCE_LISTING.txt. The text file is 151 KB, was created on Jan. 22, 2009, and is being submitted electronically via EFS-Web.
FIELD OF THE INVENTION
[0003]The present invention relates generally to compositions and methods for recombinant protein production and, more particularly, to fusion polypeptides, polynucleotides encoding fusion polypeptides, expression vectors, kits, and related methods for recombinant protein production.
DESCRIPTION OF THE RELATED ART
[0004]A large percentage of the proteins identified via the different genome sequencing efforts have been difficult to express and/or purify as recombinant proteins using standard methods. For example, a trial study using Methanobacterium thermoautotrophicum as a model system identified a number of problems associated with high throughput structure determination (Christendat et al. (2000) Prog. Biophys. Mol. Biol. 73(5): 339-345; Christendat et al. (2000) Nat Struct Biol 7(10): 903-909). The complete list of genome-encoded proteins was filtered to remove proteins with predicted transmembrane regions or homologues to known structures. When these filtered proteins were taken through the cloning, expression, and structural determination steps of a high throughput process, only about 50% of the selected proteins could be purified in a state suitable for structural studies, with roughly 45% of large expressed proteins and 30% of small expressed proteins failing due to insolubility. The study concluded that considerable effort must be invested in reducing the attrition rate due to proteins with poor expression levels and unfavorable biophysical properties (Christendat et al. (2000) Prog. Biophys. Mol. Biol. 73(5): 339-345; Christendat et al. (2000) Nat Struct Biol 7(10): 903-909).
[0005]Similar results have been observed for other prokaryotic proteomes. One study reported the successful cloning and attempted expression of 1376 (73%) of the predicted 1877 genes of the Thermotoga maritima proteome. However, crystallization conditions could be determined for only 432 proteins (23%). A significant component of the drop between the cloning and crystallization success levels was due to poor protein solubility and stability (Kuhn et al. (2002) Proteins 49(1): 142-5).
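The percentages quoted for the Thermotoga maritima study follow directly from the reported counts. A minimal Python sketch of that arithmetic is given here as a check; the variable and function names are illustrative only and do not come from the cited study.

```python
# Counts as reported in the text for the Thermotoga maritima proteome.
predicted_genes = 1877
cloned_and_expressed = 1376
crystallization_conditions_found = 432


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


cloned_pct = percent(cloned_and_expressed, predicted_genes)
crystallized_pct = percent(crystallization_conditions_found, predicted_genes)

print(f"cloned and expressed: {cloned_pct}%")
print(f"crystallization conditions found: {crystallized_pct}%")
```

Both figures are expressed relative to the full set of predicted genes, not relative to the cloned subset.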
[0006]Similarly low success rates have been reported for eukaryotic proteomes. A study of a sample set of human proteins, for example, reported that the failure rate using high-throughput methods for three classes of proteins based on cellular location was 50% for soluble proteins, 70% for extracellular proteins, and more than 80% for membrane proteins (Braun et al. (2002) Proc Natl Acad Sci USA 99(5): 2654-9).
[0007]Interactions between individual recombinant proteins are responsible for a significant number of the previously mentioned failures. In a high-throughput structural determination study, Christendat and colleagues found that 24 of 32 proteins that were classified by nuclear magnetic resonance as aggregated displayed circular dichroism spectra consistent with stable folded proteins, suggesting that these proteins were folded properly but aggregated due to surface interactions (Christendat et al. (2000) Prog. Biophys. Mol. Biol. 73(5): 339-345). One possible explanation for this is that these proteins function in vivo as part of multimeric units but when they are recombinantly expressed, dimerization domains are exposed that mediate protein-protein interactions.
[0008]Prior methods used to increase recombinant protein stability include production in E. coli strains that are deficient in proteases (Gottesman and Zipser (1978) J Bacteriol 133(2): 844-51). Production of fusions of bacterial protein fragments to a recombinant polypeptide/protein of interest (Itakura et al., Science, 1977. 198:1056-63; Shen, Proc Natl Acad Sci USA, 1984. 81:4627-31) has also been attempted to stabilize foreign proteins in E. coli. In addition, fusing a leader sequence to a recombinant protein may cause a gene product to accumulate in the periplasm or be excreted, which may result in increased recovery of properly folded soluble protein (Nilsson et al., EMBO J, 1985. 4:1075-80; Abrahmsen et al., Nucleic Acids Res, 1986. 14:7487-500). These strategies have advantages for some proteins but they generally do not succeed when used, for example, with membrane proteins or proteins capable of strong protein-protein interactions.
[0009]Fusion polypeptides have also been used as an approach for improving the solubility and folding of recombinant polypeptides/proteins produced in E. coli (Zhan et al., Gene, 2001. 281:1-9). Some commonly used fusion partners which have been linked to heterologous protein sequences of interest include calmodulin-binding peptide (CBP) (Vaillancourt et al., Biotechniques, 1997. 22:451-3), glutathione-S-transferase (GST) (Smith, Methods Enzymol, 2000. 326:254-70), thioredoxin (TRX) (Martin Hammarstrom et al., Protein Science, 2002. 11:313-321), and maltose-binding protein (MBP) (Sachdev et al., Methods Enzymol, 2000. 326:312-21). Glutathione-S-transferase and maltose-binding protein have been found to increase the recombinant protein purification success rate when fused to a heterologous sequence in a controlled trial of 32 human test proteins (Braun et al., Proc Natl Acad Sci USA, 2002. 99:2654-9). Further, maltose-binding protein domain fusions have been shown to increase the solubility of recombinant proteins (Kapust et al., Protein Sci, 1999. 8:1668-74; Braun et al., Proc Natl Acad Sci USA, 2002. 99:2654-9; Martin Hammarstrom et al., Protein Science, 2002. 11:313-321). Maltose-binding protein may further benefit recombinant protein solubility and folding in that it may have chaperone-like properties that assist in folding of the fusion partner (Richarme et al., J Biol Chem, 1997. 272:15607-12; Bach et al., J Mol Biol, 2001. 312:79-93). However, the fusion approaches used to date have not been amenable to all classes of proteins, and have thus met with only limited success.
[0010]Entropic bristles have been used in a variety of polymers to reduce aggregation of small particles such as latex particles in paints and to stabilize a wide variety of other colloidal products (Hoh, Proteins, 1998. 32:223-228). Entropic bristles generally comprise amino acid residues that do not have a tendency to form secondary structure and in the process of random motion about their attachment points sweep out a significant region in space and entropically exclude other molecules by their random motion (Hoh, Proteins, 1998. 32:223-228). Entropic bristles are singular elements, comprising highly flexible, non-aggregating polymer chains, of which entropic brushes are assembled. In polymer chemistry, entropic bristles have been affixed to the surfaces of particles (e.g. latex beads), thereby forming entropic brushes which, in turn, prevent particle aggregation (Stabilization by attached polymer: steric stabilization, in Polymeric stabilization of colloidal dispersions, D. H. Napper, Editor. 1983, Academic Press: London. p. 18-30). EBDs can exclude large molecules but do not exclude small molecules such as water, salts, metal ions, or cofactors (Hoh, Proteins, 1998. 32:223-228).
[0011]EBDs can also function as steric stabilizers and operate through steric hindrance stabilization (Stabilization by attached polymer: steric stabilization, in Polymeric stabilization of colloidal dispersions, D. H. Napper, Editor. 1983, Academic Press: London. p. 18-30). Napper described characteristics of steric stabilizers that contribute to their stabilizing function, including that (1) they have an amphipathic sequence; (2) they are attached to the colloidal particle by one end rather than being totally adsorbed; (3) they are soluble in the medium used; (4) they are mutually repulsive; (5) they are thermodynamically stable; and (6) they exhibit stabilizing ability in proportion to their length. Steric stabilizers intended to function in aqueous media extend from the surface of colloidal molecules, thus transforming their surfaces from hydrophobic to hydrophilic. The fact that sterically stabilized particles are thermodynamically stable leads them to spontaneously re-disperse when dried residue is reintroduced to solvent. Entropic bristles can adopt random-walk configurations in solution (Milner, Science, 1991. 251:905-914). These chains extend from an attachment point because of their affinity for the solvent. This affinity is due in part to the highly charged nature of the entropic bristle sequence.
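The charged character mentioned above can be illustrated by counting charged residues in a candidate sequence. The following Python sketch is an illustration only: the function name and the choice of D, E, K and R as the charged residues are assumptions made here, not definitions from this application. The example motif SPEKEEAK is one of the NF-H sequences listed elsewhere in this application.

```python
# Assumption for this sketch: Asp (D), Glu (E), Lys (K) and Arg (R)
# are the residues counted as charged.
CHARGED = set("DEKR")


def charged_fraction(seq):
    """Return the fraction of residues in seq that are D, E, K or R."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return sum(residue in CHARGED for residue in seq) / len(seq)


# SPEKEEAK contains E, K, E, E and K: five charged residues out of eight.
frac = charged_fraction("SPEKEEAK")
print(f"charged fraction: {frac:.3f}")
```

A simple residue count of this kind says nothing about conformation or solvent affinity; it is only a quick way to compare the charge content of candidate sequences.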
[0012]While certain prior approaches have met with some success, there remains a need for new compositions and methods for improving the properties and characteristics of recombinant proteins, e.g., improving solubility, stability, yield and/or folding of recombinant proteins. The present invention addresses these needs and offers other related advantages by employing entropic bristle domain sequences as fusion partners in recombinant protein production, as described herein.
SUMMARY OF THE INVENTION
[0013]According to a general aspect of the present invention, there are provided isolated fusion polypeptides comprising at least one entropic bristle domain (EBD) sequence and at least one heterologous polypeptide sequence of interest. By providing an EBD sequence which effectively sweeps out the three-dimensional space surrounding a newly synthesized heterologous polypeptide, the fusion polypeptides of the invention offer a number of advantages over prior fusion polypeptides and methods relating thereto.
[0014]In one embodiment, a fusion polypeptide comprising an EBD sequence and a heterologous polypeptide sequence exhibits improved solubility relative to the corresponding heterologous polypeptide in the absence of the EBD sequence. In a related embodiment, the fusion polypeptide has at least 5% increased solubility relative to the heterologous polypeptide sequence, at least 25% increased solubility relative to the heterologous polypeptide sequence, or at least 50% increased solubility relative to the heterologous polypeptide sequence.
[0015]In another embodiment, a fusion polypeptide of the invention exhibits reduced aggregation relative to the level of aggregation of the heterologous polypeptide sequence in the absence of the EBD sequence. For example, a fusion polypeptide of the invention generally exhibits at least 10% reduced aggregation relative to the heterologous polypeptide sequence or at least 25% reduced aggregation relative to the heterologous polypeptide sequence.
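The percentage thresholds recited in the two preceding embodiments can be read as relative changes against the heterologous polypeptide expressed without the EBD sequence. The Python sketch below shows one way such a comparison might be computed. The measurement values, units and function name are hypothetical, and the text does not prescribe a particular assay or formula.

```python
def percent_change(fusion_value, reference_value):
    """Percent change of fusion_value relative to reference_value."""
    if reference_value <= 0:
        raise ValueError("reference value must be positive")
    return 100.0 * (fusion_value - reference_value) / reference_value


# Hypothetical measurements in arbitrary units, for illustration only.
soluble_reference, soluble_fusion = 2.0, 3.0        # soluble protein recovered
aggregated_reference, aggregated_fusion = 4.0, 3.0  # aggregated protein

solubility_gain = percent_change(soluble_fusion, soluble_reference)
aggregation_drop = -percent_change(aggregated_fusion, aggregated_reference)

# Thresholds taken from the text: 5%, 25%, 50% for solubility
# and 10%, 25% for aggregation.
print([t for t in (5, 25, 50) if solubility_gain >= t])
print([t for t in (10, 25) if aggregation_drop >= t])
```

With these hypothetical inputs the fusion shows a 50% gain in solubility and a 25% drop in aggregation, so every listed threshold is met.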
[0016]In another embodiment, a fusion polypeptide of the invention exhibits improved self-folding relative to the heterologous polypeptide sequence in the absence of the EBD sequence.
[0017]In another embodiment of the present invention, an EBD sequence employed in a fusion polypeptide comprises an amino acid sequence that maintains a substantially random coil conformation.
[0018]In another embodiment, the EBD sequence of a fusion polypeptide of the invention comprises an amino acid sequence that is substantially mutually repulsive.
[0019]In another embodiment, the EBD sequence of a fusion polypeptide of the invention comprises an amino acid sequence that remains in substantially constant motion.
[0020]In a more particular embodiment, an EBD sequence of a fusion polypeptide of the invention is derived from a mammalian neurofilament protein. In a related embodiment, the EBD sequence of a fusion polypeptide of the invention is derived from a mammalian neurofilament NF-H protein. In another related embodiment, the EBD sequence of a fusion polypeptide of the invention is derived from a human neurofilament NF-H protein having the sequence set forth in SEQ ID NO: 1. In another related embodiment, the EBD sequence of a fusion polypeptide of the invention is derived from a mouse neurofilament NF-H protein having the sequence set forth in SEQ ID NO: 3.
[0021]In yet another related embodiment, the EBD sequence of a fusion polypeptide of the invention comprises a neurofilament NF-H sequence selected from the group consisting of SPEAEK (SEQ ID NO:23), SPAAVK (SEQ ID NO:24), SPAEAK (SEQ ID NO:25), SPAEPK (SEQ ID NO:26), SPAEVK (SEQ ID NO:27), SPATVK (SEQ ID NO:28), SPEKAK (SEQ ID NO:29), SPGEAK (SEQ ID NO:30), SPIEVK (SEQ ID NO:31), SPPEAK (SEQ ID NO:32), SPSEAK (SEQ ID NO:33), SPEKEAK (SEQ ID NO:34), SPAKEKAK (SEQ ID NO:35), SPEKEEAK (SEQ ID NO:36), SPTKEEAK (SEQ ID NO:37), SPVKEEAK (SEQ ID NO:38), SPVKAEAK (SEQ ID NO:39), SPVKEEAK (SEQ ID NO:40), SPVKEEVK (SEQ ID NO:41), SPVKEEEKP (SEQ ID NO:42), SPEKAKTLDVK (SEQ ID NO:43), SPADKFPEKAK (SEQ ID NO:44), SPEAKTPAKEEAR (SEQ ID NO:45), SPEKAKTPVKEGAK (SEQ ID NO:46), SPVKEEAKTPEKAK (SEQ ID NO:47), SPVKEGAKPPEKAKPLDVK (SEQ ID NO:48), SPVKEDIKPPAEAKSPEKAK (SEQ ID NO:49), SPLKEDAKAPEKEIPKKEEVK (SEQ ID NO:50), SPEKEEAKTSEKVAPKKEEVK (SEQ ID NO:51), SPEAQTPVQEEATVPTDIRPPEQVK (SEQ ID NO:52), SPVKEEVKAKEPPKKVEEEKTLPTPKTEAKESKKDE (SEQ ID NO:53).
[0022]In yet another related embodiment, the EBD sequence of a fusion polypeptide of the invention comprises at least 2-100 repeats of a neurofilament NF-H sequence set forth above, or a combination thereof.
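As an illustration of a repeat-based EBD sequence, the Python sketch below concatenates a single motif a chosen number of times. It assumes that the repeats are contiguous (tandem) and that the intended range is 2 to 100 copies; both are readings adopted here for illustration, and the function name is hypothetical.

```python
def tandem_repeat(motif, n_repeats):
    """Concatenate motif n_repeats times.

    Assumption: repeats are contiguous and n_repeats lies in 2..100.
    """
    if not motif:
        raise ValueError("motif must not be empty")
    if not 2 <= n_repeats <= 100:
        raise ValueError("n_repeats must be between 2 and 100")
    return motif * n_repeats


# SPEAEK is SEQ ID NO:23 in the list above.
ebd = tandem_repeat("SPEAEK", 3)
print(ebd, len(ebd))
```

A combination of different motifs could be built the same way by joining the outputs of several calls.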
[0023]According to another particular embodiment of the present invention, an EBD sequence of a fusion polypeptide is derived from a mammalian neurofilament protein NF-M. In a related embodiment, the EBD sequence of a fusion polypeptide of the invention is derived from a bovine neurofilament NF-M protein having the sequence set forth in SEQ ID NO: 5. In another related embodiment, the EBD sequence of a fusion polypeptide of the invention is derived from a chicken neurofilament NF-M protein having the sequence set forth in SEQ ID NO: 7. In yet another related embodiment, the EBD sequence of a fusion polypeptide of the invention is derived from a human neurofilament NF-M protein having the sequence set forth in SEQ ID NO: 9. In another related embodiment, the EBD sequence of a fusion polypeptide of the invention is derived from a mouse neurofilament NF-M protein having the sequence set forth in SEQ ID NO: 11. In yet another related embodiment, the EBD sequence of a fusion polypeptide of the invention is derived from a rat neurofilament NF-M protein having the sequence set forth in SEQ ID NO: 13. In another related embodiment, the EBD sequence of a fusion polypeptide of the invention is derived from a rabbit neurofilament NF-M protein having the sequence set forth in SEQ ID NO: 15.
[0024]In yet another related embodiment, the EBD sequence of a fusion polypeptide of the invention comprises a neurofilament NF-M sequence selected from the group consisting of SPPK (SEQ ID NO:54), SPVK (SEQ ID NO:55), SPAAK (SEQ ID NO:56), SPAPK (SEQ ID NO:57), SPEAK (SEQ ID NO:58), SPMPK (SEQ ID NO:59), SPPAK (SEQ ID NO:60), SPTAK (SEQ ID NO:61), SPTTK (SEQ ID NO:62), SPVAK (SEQ ID NO:63), SPVAK (SEQ ID NO:64), SPVPK (SEQ ID NO:65), SPVSK (SEQ ID NO:66), SPEKPA (SEQ ID NO:67), SPVEEKAK (SEQ ID NO:68), SPVEEKGK (SEQ ID NO:69), SPVEEVKP (SEQ ID NO:70), SPEKPATPKVT (SEQ ID NO:71), SPEKPRTPEKPA (SEQ ID NO:72), SPEKPTTPEKW (SEQ ID NO:73), SPEKPSSPLKDEKA (SEQ ID NO:74), SPVKEKAVEEMITIT (SEQ ID NO:75), SPVKEEAAEEAATITK (SEQ ID NO:76), SPVPKSPVEEVKPKAEATAG (SEQ ID NO:77), SPVKAESPVKEEVPAKPVKV (SEQ ID NO:78), SPEKEAKEEEKPQEKEKEKEK (SEQ ID NO:79), SPVKATTPEIKEEEGEKEEEGQE (SEQ ID NO:80), SPVEEVKPKPEAKAGKGEQKEE (SEQ ID NO:81), SPEKPATPEKPPTPEKAITPEKVR (SEQ ID NO:82), SPEKPATPEKPRTPEKPATPEKPR (SEQ ID NO:83), SPKEEKVEKKEEKPKDVPKKKAE (SEQ ID NO:84), SPKEEKAEKKEEKPKDVPEKKKAE (SEQ ID NO:85), SPVEEAKSKAEVGKGEQKEEEEKE (SEQ ID NO:86), SPKEEKVEKKEEKPKDVPDKKKAE (SEQ ID NO:87), SPVKEEAVAEVVTITKSVKVHLEKET (SEQ ID NO:88), SSEKDEGEQEEEEGETEAEGEGEEAEAKEEK (SEQ ID NO:89), SPVEEVKPKAEAGAEKGEQKEKVEEEKKEAKE (SEQ ID NO:90), SPVTEQAKAVQKAAAEVGKDQKAEKAAEKAAKEEKAA (SEQ ID NO:91), SPEAKEEEEEGEKEEEEEGQEEEEEEDEGVKSDQAEEGGSEKEG (SEQ ID NO:92).
[0025]According to another particular embodiment of the present invention, an EBD sequence of a fusion polypeptide is derived from a phage sequence. In a related embodiment, the EBD sequence of a fusion polypeptide of the invention is derived from a filamentous phage fd. In another related embodiment, the EBD sequence of a fusion polypeptide of the invention comprises at least one linker region derived from a filamentous phage fd adsorption protein pIII. In another related embodiment, the EBD sequence of a fusion polypeptide of the invention comprises a filamentous phage fd adsorption protein pIII having a sequence set forth in SEQ ID NO: 17. In another related embodiment, the EBD sequence of a fusion polypeptide of the invention comprises a filamentous phage fd adsorption protein pIII sequence selected from the group consisting of EGGGS (SEQ ID NO:93), EGGGT (SEQ ID NO:94), SEGGG (SEQ ID NO:95), GGGSGGG (SEQ ID NO:96), SGGGSGSG (SEQ ID NO:97), and SGGGSEGGG (SEQ ID NO:98).
[0026]In yet another related embodiment, the EBD sequence of a fusion polypeptide of the invention comprises at least 2-100 repeats of a filamentous phage fd adsorption protein pIII sequence set forth above, or a combination thereof.
[0027]In another particular embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention is derived from a nuclear pore protein. In a more particular embodiment, the EBD sequence of a fusion polypeptide of the invention is derived from a yeast nuclear pore Nup2p protein having the sequence set forth in SEQ ID NO: 19. In a related embodiment, the EBD is derived from the yeast nucleoporin Nup2p protein and is selected from the group consisting of FSFGTSQPNNTPS (SEQ ID NO:99), FSFSIPSKNTPDASKPS (SEQ ID NO:100), FVFGQAAAKPSLEKSS (SEQ ID NO:101), FSFGVPNSSKNETSKPV (SEQ ID NO:102), FTFGTKHAADSQNNKPS (SEQ ID NO:103), FTFGSSALADNKEDVKKP (SEQ ID NO:104), FSFGINTNTTKTADTKAPT (SEQ ID NO:105), FSFGKTTANLPANSSTSPAPSIPSTG (SEQ ID NO:106), FSFGPKKENRKKDESDSENDIEIKGPE (SEQ ID NO:107), FKFSGTVSSDVFKLNPSTDKNEKKTETNAKP (SEQ ID NO:108), FKFSLPFEQKGSQTTTNDSKEESTTEATGNESQ (SEQ ID NO:109), FTFGSTTIEKKNDENSTSNSKPEKSSDSNDSNPS (SEQ ID NO:110), FSFGISNGSESKDSDKPSLPSAVDGENDKKEATKPA (SEQ ID NO:111), FSFSSATSTTEQTKSKNPLSLTEATKTNVDNNSKAEAS (SEQ ID NO:112) and FSFGAATPSAKEASQEDDNNNVEKPSSKPAFNLISNAGTEKEKESKKDSKPA (SEQ ID NO:113).
[0028]In yet another related embodiment, the EBD sequence of a fusion polypeptide of the invention comprises at least 2-100 repeats of a Nup2p sequence set forth above, or a combination thereof.
[0029]According to another particular embodiment of the present invention, an EBD sequence is a sequence derived from a mammalian elastin protein. In another related embodiment, the EBD sequence of a fusion polypeptide of the invention is derived from a mouse elastin having the sequence set forth in SEQ ID NO: 21.
[0030]In a related embodiment, the EBD comprises a sequence derived from an elastin protein and is selected from the group consisting of VPGA (SEQ ID NO:114), GAGGL (SEQ ID NO:115), GAGGG (SEQ ID NO:116), VPGVG (SEQ ID NO:117), VPGFGAGA (SEQ ID NO:118), VPGALPGA (SEQ ID NO:119), VPGFGAGAG (SEQ ID NO:120), VPAVPGAGG (SEQ ID NO:121), VPGGVGVGG (SEQ ID NO:122), VGAGGFPGYG (SEQ ID NO:123), VPGAVPGGLPGG (SEQ ID NO:124), VSPAAAAKAAKYGAA (SEQ ID NO:125), VPQVGAGIGAGGKPGK (SEQ ID NO:126), VPGGVGVGGIPGGVGVGG (SEQ ID NO:127), VPGGVGGIGGIGGLGVSTGAV (SEQ ID NO:128), VPGGAAGAAAAYKAAAKAGAGLGGVGG (SEQ ID NO:129), VSPAAAAKAAAKAAKYGARGGVGIPTYG (SEQ ID NO:130), KPPKPYGGALGALGYQGGGCFGKSCGRKRK (SEQ ID NO:131), VPGAGTPAAAAAAAAAKAAAKAGLGPGVGG (SEQ ID NO:132), VPGRVAGAAPPAAAAAAAKAAAKAAQYGLG (SEQ ID NO:133), VPGVGLPGVYPGGVLPGTGARFPGVGVLPG (SEQ ID NO:134), VPTGTGVKAKAPGGGGAFSGIPGVGPFGGQQPG (SEQ ID NO:135), VPGGVYYPGAGIGGLGGGGGALGPGGKPPKPGAG (SEQ ID NO:136), VGAGAGLGGASPAAAAAAAKAAKYGAGGAGALGGL (SEQ ID NO:137), GLGGVLGARPFPGGGVAARPGFGLSPIYPGGGAGGLGVGG (SEQ ID NO:138), VPGSLAASKAAKYGAAGGLGGPGGLGGPGGLGGPGGLGGAG (SEQ ID NO:139), VPGGPGVRLPGAGIPGVGGIPGVGGIPGVGGPGIGGPGIVGGPGA (SEQ ID NO:140), VLPGVGGGGIPGGAGAIPGIGGIAGAGTPAAAAAAKAAAKAAKYGAAGGL (SEQ ID NO:141), VPGGVGPGGVTGIGAGPGGLGGAGSPAAAKSAAKAAAKAQYRAAAGLGAG (SEQ ID NO:142), and VPLGYPIKAPKLPGGYGLPYTNGKLPYGVAGAGGKAGYPTGTGVGSQAAAAAAKAAKYGAGGAG (SEQ ID NO:143).
[0031]In yet another related embodiment, the EBD sequence of a fusion polypeptide of the invention comprises at least 2-100 repeats of an elastin sequence set forth above, or a combination thereof.
[0032]In another embodiment, the EBD sequence of a fusion polypeptide of the invention comprises a combination of any one or more of the EBD sequences set forth herein.
[0033]In yet another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a combination of NF-H and NF-M sequences set forth herein.
[0034]In yet another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a combination of NF-H and Nup2p sequences set forth herein.
[0035]In yet another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a combination of NF-M and Nup2p sequences set forth herein.
[0036]In yet another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a combination of NF-H and filamentous phage fd adsorption protein pIII sequences set forth herein.
[0037]In yet another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a combination of NF-M and filamentous phage fd adsorption protein pIII sequences set forth herein.
[0038]In yet another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a combination of Nup2p and filamentous phage fd adsorption protein pIII sequences set forth herein.
[0039]In yet another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a combination of NF-H, NF-M and filamentous phage fd adsorption protein pIII sequences set forth herein.
[0040]In yet another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a combination of NF-H, NF-M and Nup2p sequences set forth herein.
[0041]In yet another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a combination of Nup2p, NF-M and filamentous phage fd adsorption protein pIII sequences set forth herein.
[0042]In yet another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a combination of NF-H, Nup2p and filamentous phage fd adsorption protein pIII sequences set forth herein.
[0043]In yet another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a combination of Nup2p, NF-H, NF-M and filamentous phage fd adsorption protein pIII sequences set forth herein.
[0044]According to another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a variant version of an amino acid sequence of NF-H described herein, wherein the resulting sequence preserves the amino acid composition of the parent sequence.
[0045]According to another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a variant version of an amino acid sequence of NF-M described herein, wherein the resulting sequence preserves the amino acid composition of the parent sequence.
[0046]According to another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a variant version of an amino acid sequence of Nup2p described herein, wherein the resulting sequence preserves the amino acid composition of the parent sequence.
[0047]According to another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a variant version of an amino acid sequence of filamentous phage fd adsorption protein pIII described herein, wherein the resulting sequence preserves the amino acid composition of the parent sequence.
[0048]According to another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention comprises a variant version of an amino acid sequence of elastin described herein, wherein the resulting sequence preserves the amino acid composition of the parent sequence.
[0049]According to another embodiment of the invention, an EBD sequence of a fusion polypeptide of the invention generally comprises between about 5-600 amino acid residues, between about 5-300 amino acid residues or between about 5-100 amino acid residues; however, other polypeptide lengths may also be used.
[0050]In another embodiment, an EBD sequence of a fusion polypeptide of the invention is cleavable, e.g., can be removed and/or separated from the heterologous polypeptide sequence after recombinant expression by, for example, enzymatic or chemical cleavage methods.
[0051]In another embodiment, an EBD sequence of a fusion polypeptide of the invention is covalently linked at the N-terminus of the heterologous polypeptide sequence of interest. In another embodiment, an EBD sequence of a fusion polypeptide of the invention is covalently linked at the C-terminus of the heterologous polypeptide sequence of interest. In yet another embodiment, an EBD sequence of a fusion polypeptide of the invention is covalently linked at the N- and C-termini of the heterologous polypeptide sequence of interest.
[0052]In another embodiment of the invention, the charge of an EBD sequence of a fusion polypeptide of the invention is modulated by, for example, enzymatic and/or chemical methods, in order to modulate the activity of the EBD sequence. In a particular embodiment, the charge of the EBD sequence is modulated by phosphorylation.
[0053]According to another aspect of the invention, an isolated polynucleotide is provided, wherein the polynucleotide encodes a fusion polypeptide as described herein.
[0054]According to yet another aspect of the invention, there is provided an expression vector comprising an isolated polynucleotide encoding a fusion polypeptide as described herein. In a related embodiment, an expression vector is provided comprising a polynucleotide encoding an EBD sequence and further comprising a cloning site for insertion of a polynucleotide encoding a heterologous polypeptide of interest.
[0055]According to yet another aspect of the invention, there is provided a host cell comprising an expression vector as described herein.
[0056]According to yet another aspect of the invention, there is provided a kit comprising an isolated polynucleotide as described herein, an isolated polypeptide as described herein and/or an isolated host cell as described herein.
[0057]Yet another aspect of the invention provides a method for producing a recombinant protein comprising the steps of: introducing into a host cell an expression vector comprising a polynucleotide sequence encoding a fusion polypeptide, the fusion polypeptide comprising at least one entropic bristle domain sequence and at least one polypeptide sequence of interest; and expressing the fusion polypeptide in the host cell. In another embodiment, the method further comprises the step of isolating the fusion polypeptide from the host cell. In another related embodiment, the method further comprises the step of removing the entropic bristle domain sequence from the fusion polypeptide before or after isolating the fusion polypeptide from the host cell.
[0058]These and other aspects of the present invention will become apparent upon reference to the following detailed description. All references disclosed herein and in the enclosed Application Data Sheet are hereby incorporated by reference in their entirety as if each was incorporated individually.
BRIEF DESCRIPTION OF THE DRAWING
[0059]FIG. 1 depicts the average net charge of a 5 residue moving window for residues 422 to 916 of human neurofilament medium (NF-M) protein sequence.
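The moving-window calculation underlying a plot of this kind can be sketched as follows. This is a minimal illustration only, not the procedure used to prepare FIG. 1: the charge assignments (Lys/Arg = +1, Asp/Glu = -1, all other residues 0, with histidine and phosphorylation ignored) and the function name windowed_net_charge are assumptions introduced here.

```python
# Assumed charge model: K/R = +1, D/E = -1, everything else 0.
# Histidine and any phosphorylated residues are deliberately ignored.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def windowed_net_charge(seq, window=5):
    """Return the average net charge of every `window`-residue stretch of `seq`."""
    charges = [CHARGE.get(res, 0) for res in seq.upper()]
    if len(charges) < window:
        return []
    total = sum(charges[:window])
    means = [total / window]
    # Slide the window one residue at a time, updating the running total.
    for i in range(window, len(charges)):
        total += charges[i] - charges[i - window]
        means.append(total / window)
    return means


# Toy input: two copies of the SPEAEK repeat (SEQ ID NO:23).
profile = windowed_net_charge("SPEAEKSPEAEK")
for start, value in enumerate(profile, start=1):
    print(f"window starting at residue {start}: {value:+.1f}")
```

A sequence of length n yields n - 4 values for a 5-residue window; each value would be plotted against the position of its window.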
BRIEF DESCRIPTION OF THE SEQUENCE IDENTIFIERS
[0060]SEQ ID NO: 1 is the amino acid sequence of a human NF-H protein, Swiss-Prot accession number P12036, having an illustrative EB-domain corresponding to residues 414-1026.
[0061]SEQ ID NO: 2 is a polynucleotide sequence encoding the amino acid sequence of SEQ ID NO: 1, GenBank accession number BC073969, having an illustrative EB-domain corresponding to residues 1242-3081.
[0062]SEQ ID NO: 3 is the amino acid sequence of a mouse NF-H protein, Swiss-Prot accession number P19246, having an illustrative EB domain corresponding to residues 409-1087.
[0063]SEQ ID NO: 4 is a polynucleotide sequence encoding the amino acid sequence of SEQ ID NO: 3, GenBank accession number M35131, having an illustrative EB-domain corresponding to residues 1227-3219.
[0064]SEQ ID NO: 5 is the amino acid sequence of a bovine NF-M protein, Swiss-Prot accession number O77788, having an illustrative EB domain corresponding to residues 412-925.
[0065]SEQ ID NO: 6 is a polynucleotide sequence encoding protein residues 116-925 of bovine NF-M, GenBank accession number AF091342, having an illustrative EB domain corresponding to residues 891-2433.
[0066]SEQ ID NO: 7 is the amino acid sequence of a chicken NF-M protein, Swiss-Prot accession number P16053, having an illustrative EB domain corresponding to residues 407-857.
[0067]SEQ ID NO: 8 is a polynucleotide sequence encoding the protein fragment 259-857 of chicken NF-M, GenBank accession number X05558, having an illustrative EB domain corresponding to residues 177-1530.
[0068]SEQ ID NO: 9 is the amino acid sequence of a human NF-M protein, Swiss-Prot accession number P07197, having an illustrative EB domain corresponding to residues 412-915.
[0069]SEQ ID NO: 10 is a polynucleotide sequence encoding the amino acid sequence of SEQ ID NO: 9, GenBank accession number Y00067, having an illustrative EB domain corresponding to residues 1236-2751.
[0070]SEQ ID NO: 11 is the amino acid sequence of a mouse NF-M protein, Swiss-Prot accession number P08553, having an illustrative EB domain corresponding to residues 411-848.
[0071]SEQ ID NO: 12 is a polynucleotide sequence encoding the amino acid sequence of SEQ ID NO: 11, GenBank accession number X05640, having an illustrative EB domain corresponding to residues 1233-2550.
[0072]SEQ ID NO: 13 is the amino acid sequence of a rat NF-M protein, Swiss-Prot accession number P12839, having an illustrative EB domain corresponding to residues 411-845.
[0073]SEQ ID NO: 14 is a polynucleotide sequence encoding the amino acid sequence of SEQ ID NO: 13, GenBank accession number Z12152, having an illustrative EB domain corresponding to residues 1233-2538.
[0074]SEQ ID NO: 15 is the amino acid sequence of a rabbit NF-M protein, Swiss-Prot accession number P54938, having an illustrative EB domain corresponding to residues 198-644.
[0075]SEQ ID NO: 16 is a polynucleotide sequence encoding the amino acid sequence of SEQ ID NO: 15, GenBank accession number Z47378, having an illustrative EB domain corresponding to residues 594-1938.
[0076]SEQ ID NO: 17 is the amino acid sequence of a phage fd pIII protein, Swiss-Prot accession number P69168, having illustrative EB-domains corresponding to residues 86-104 and 236-274.
[0077]SEQ ID NO: 18 is a polynucleotide sequence encoding the amino acid sequence of SEQ ID NO: 17, GenBank accession number V00604, having illustrative EB domains corresponding to residues 258-312 and 708-822.
[0078]SEQ ID NO: 19 is the amino acid sequence of a Yeast Nup2p protein, Swiss-Prot accession number P32499, having an illustrative EB-domain corresponding to residues 189-582.
[0079]SEQ ID NO: 20 is a polynucleotide sequence encoding the amino acid sequence of SEQ ID NO: 19, GenBank accession number X69964, having an illustrative EB domain corresponding to residues 567-1748.
[0080]SEQ ID NO: 21 is the amino acid sequence of a mouse elastin protein, Swiss-Prot accession number P54320, the entire sequence of which represents an illustrative EB domain.
[0081]SEQ ID NO: 22 is a polynucleotide sequence encoding the amino acid sequence of SEQ ID NO: 21, GenBank accession number U08210.
[0082]SEQ ID NOs: 23 to 144 represent further illustrative EBD sequences according to the present invention.
DETAILED DESCRIPTION OF THE INVENTION
[0083]The practice of the present invention will employ, unless indicated specifically to the contrary, conventional methods of molecular biology and recombinant DNA techniques within the skill of the art, many of which are described below for the purpose of illustration. Such techniques are explained fully in the literature. See, e.g., Sambrook, et al., Molecular Cloning: A Laboratory Manual (2nd Edition, 1989); Maniatis et al., Molecular Cloning: A Laboratory Manual (1982); DNA Cloning: A Practical Approach, vol. I & II (D. Glover, ed.); Oligonucleotide Synthesis (N. Gait, ed., 1984); Nucleic Acid Hybridization (B. Hames & S. Higgins, eds., 1985); Transcription and Translation (B. Hames & S. Higgins, eds., 1984); Animal Cell Culture (R. Freshney, ed., 1986); Perbal, A Practical Guide to Molecular Cloning (1984).
[0084]All publications, patents and patent applications cited herein, whether supra or infra, are hereby incorporated by reference in their entirety.
[0085]As used in this specification and the appended claims, the singular forms "a," "an" and "the" include plural references unless the content clearly dictates otherwise.
[0086]As used herein, the terms "polypeptide" and "protein" are used interchangeably, unless specified to the contrary, and according to conventional meaning, i.e., as a sequence of amino acids. Polypeptides are not limited to a specific length, e.g., they may comprise a full length protein sequence or a fragment of a full length protein, and may include post-expression modifications of the polypeptide, for example, glycosylations, acetylations, phosphorylations and the like, as well as other modifications known in the art, both naturally occurring and non-naturally occurring. Polypeptides of the invention may be prepared using any of a variety of well known recombinant and/or synthetic techniques, illustrative examples of which are further discussed below.
[0087]As noted above, the present invention, in a general aspect, relates to isolated fusion polypeptides comprising at least one entropic bristle domain (EBD) sequence and at least one heterologous polypeptide sequence. By providing an EBD sequence which sweeps out the three-dimensional space surrounding a newly synthesized heterologous polypeptide, the EBD sequences of the invention effectively exclude other polypeptides and thereby minimize aggregation with other newly synthesized heterologous polypeptides during recombinant polypeptide production.
[0088]In addition, an EBD sequence of the invention can provide steric stabilization to recombinant polypeptides, a property that is relatively independent of concentration, and can thus minimize problems associated with high-level recombinant production of polypeptides and proteins (e.g., precipitation, toxicity and/or inclusion body formation). Thus, EBD fusion polypeptides described herein exhibit both steric effects (via the entropic bristle's motion) and electrostatic effects (via the bristle's highly charged sequence) to minimize interactions between recombinant polypeptides expressed as fusions according to the present invention. These characteristics allow EBD polypeptide sequences to more effectively solubilize recombinantly expressed polypeptides than, for example, other fusion partners which do not have a steric exclusion component that contributes to their activity.
[0089]Therefore, according to one embodiment of the invention, fusion polypeptides comprising an EBD sequence and a heterologous polypeptide are provided which exhibit improved solubility relative to the corresponding heterologous polypeptide in the absence of the EBD sequence. In one embodiment, for example, the fusion polypeptide has at least 5% increased solubility relative to the heterologous polypeptide sequence alone. In another related embodiment, the fusion polypeptide has at least 25% increased solubility relative to the heterologous polypeptide sequence. In yet another related embodiment, the fusion polypeptide has at least 50% increased solubility relative to the heterologous polypeptide sequence.
[0090]The extent of improved solubility provided by an EBD sequence described herein can be determined using any of a number of available approaches (see for example, Kapust, R. B. and D. S. Waugh, Escherichia coli maltose-binding protein is uncommonly effective at promoting the solubility of polypeptides to which it is fused. Protein Sci, 1999. 8:1668-74; Fox, J. D., et al., Maltodextrin-binding proteins from diverse bacteria and archaea are potent solubility enhancers. FEBS Lett, 2003. 537:53-7; Dyson M R, Shadbolt S P, Vincent K J, Perera R L, McCafferty J. Production of soluble mammalian proteins in Escherichia coli: identification of protein features that correlate with successful expression. BMC Biotechnol. 2004 Dec. 14; 4(1):32).
[0091]Cells from a single, drug-resistant colony of E. coli overproducing the fusion polypeptide are grown to saturation in LB broth (Miller J H. 1972. Experiments in molecular genetics. Cold Spring Harbor, N.Y.: Cold Spring Harbor Press. p 433) supplemented with 100 μg/mL ampicillin and 30 μg/mL chloramphenicol at 37° C. The saturated cultures are diluted 50-fold in the same medium and grown in shake-flasks to mid-log phase (A600 ~0.5-0.7), at which time IPTG is added to a final concentration of 1 mM. After 3 h, the cells are recovered by centrifugation. The cell pellets are resuspended in 0.1 culture volumes of lysis buffer (50 mM Tris-HCl (pH 8.0), 150 mM NaCl, 1 mM EDTA), and disrupted by sonication. A total protein sample is collected from the cell suspension after sonication, and a soluble protein sample is collected from the supernatant after the insoluble debris is pelleted by centrifugation (20,000×g). These samples are subjected to SDS-PAGE and proteins are visualized by staining with Coomassie Brilliant Blue. At least three independent experiments are typically performed to obtain numerical estimates of the solubility of each fusion protein in E. coli. Coomassie-stained gels are scanned with a gel-scanning densitometer and the pixel densities of the bands corresponding to the fusion proteins are obtained directly by volumetric integration. In each lane, the collective density of all E. coli proteins that are larger than the largest fusion protein is also determined by volumetric integration and used to normalize the values in each lane relative to the others. The percent solubility of each fusion protein is calculated by dividing the amount of soluble fusion protein by the total amount of fusion protein in the cells, after first subtracting the normalized background values obtained from negative control lanes (cells containing no expression vector). Descriptive statistical data (e.g., the mean and standard deviation) is then generated using standard methods.
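The arithmetic of the percent solubility calculation described above can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' analysis software: the function names, the representation of each lane as a (band density, reference density) pair, and all numerical readings are hypothetical.

```python
from statistics import mean, stdev


def normalize(band_density, reference_density):
    """Scale a band density by its lane's reference density (the collective
    density of host proteins larger than the largest fusion protein)."""
    if reference_density <= 0:
        raise ValueError("reference density must be positive")
    return band_density / reference_density


def percent_solubility(soluble, total, soluble_blank, total_blank):
    """Each argument is a (band_density, reference_density) pair for one lane.

    The *_blank lanes are negative controls (cells with no expression vector)
    whose normalized values are subtracted as background.
    """
    soluble_signal = normalize(*soluble) - normalize(*soluble_blank)
    total_signal = normalize(*total) - normalize(*total_blank)
    if total_signal <= 0:
        raise ValueError("no fusion protein signal above background")
    return 100.0 * soluble_signal / total_signal


# Hypothetical densitometry readings from three independent experiments.
replicates = [
    percent_solubility((450, 1000), (900, 1000), (50, 1000), (100, 1000)),
    percent_solubility((520, 1000), (900, 1000), (50, 1000), (100, 1000)),
    percent_solubility((380, 1000), (900, 1000), (50, 1000), (100, 1000)),
]
print(f"mean = {mean(replicates):.1f}%, sd = {stdev(replicates):.1f}%")
```

The same arithmetic applies to a percent insolubility estimate if the insoluble (pellet) lane is supplied in place of the soluble lane.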
[0092]The presence of an EBD sequence in fusion polypeptides of the present invention can also serve to reduce the extent of aggregation of a heterologous polypeptide sequence. In one embodiment, for example, the fusion polypeptide exhibits at least 10% reduced aggregation relative to the heterologous polypeptide. In another embodiment, the fusion polypeptide has at least 25% reduced aggregation relative to the heterologous polypeptide.
[0093]The extent of reduced aggregation provided by the fusion polypeptides of the present invention can be determined using any of a number of available techniques (see for example, Kapust, R. B. and D. S. Waugh, Escherichia coli maltose-binding protein is uncommonly effective at promoting the solubility of polypeptides to which it is fused. Protein Sci, 1999. 8:1668-74; Fox, J. D., et al., Maltodextrin-binding proteins from diverse bacteria and archaea are potent solubility enhancers. FEBS Lett, 2003. 537:53-7).
[0094]Cells from a single, drug-resistant colony of E. coli overproducing the fusion polypeptide are grown to saturation in LB broth (Miller J H. 1972. Experiments in molecular genetics. Cold Spring Harbor, N.Y.: Cold Spring Harbor Press. p 433) supplemented with 100 μg/mL ampicillin and 30 μg/mL chloramphenicol at 37° C. The saturated cultures are diluted 50-fold in the same medium and grown in shake-flasks to mid-log phase (A600 ~0.5-0.7), at which time IPTG is added to a final concentration of 1 mM. After 3 h, the cells are recovered by centrifugation. The cell pellets are resuspended in 0.1 culture volumes of lysis buffer (50 mM Tris-HCl (pH 8.0), 150 mM NaCl, 1 mM EDTA), and disrupted by sonication. A total protein sample is collected from the cell suspension after sonication, and an insoluble protein sample is collected from the pellet after centrifugation (20,000×g). These samples are subjected to SDS-PAGE and proteins are visualized by staining with Coomassie Brilliant Blue. At least three independent experiments are typically performed to obtain numerical estimates of the solubility of each fusion protein in E. coli. Coomassie-stained gels are scanned with a gel-scanning densitometer and the pixel densities of the bands corresponding to the fusion proteins are obtained directly by volumetric integration. In each lane, the collective density of all insoluble E. coli proteins that are larger than the largest fusion protein is also determined by volumetric integration and used to normalize the values in each lane relative to the others. The percent insolubility of each fusion protein is calculated by dividing the amount of insoluble fusion protein by the total amount of fusion protein in the cells, after first subtracting the normalized background values obtained from negative control lanes (cells containing no expression vector). Descriptive statistical data (e.g., the mean and standard deviation) is generated by standard methods.
[0095]The presence of an EBD sequence in the fusion polypeptides of the present invention can also serve to improve the folding characteristics of the fusion polypeptides relative to the corresponding heterologous polypeptide, e.g., by minimizing interference caused by interaction with other proteins.
[0096]Assays for evaluating the folding characteristics of a fusion polypeptide of the invention can be carried out using conventional techniques, such as circular dichroism spectroscopy in far ultra-violet region, circular dichroism in near ultra-violet region, nuclear magnetic resonance spectroscopy, infra-red spectroscopy, Raman spectroscopy, intrinsic fluorescence spectroscopy, extrinsic fluorescence spectroscopy, fluorescence resonance energy transfer, fluorescence anisotropy and polarization, steady-state fluorescence, time-domain fluorescence, numerous hydrodynamic techniques including gel-filtration, viscometry, small-angle X-ray scattering, small angle neutron scattering, dynamic light scattering, static light scattering, scanning microcalorimetry, and limited proteolysis.
[0097]In another embodiment of the invention, an EBD comprises an amino acid sequence that maintains a substantially random coil conformation. Whether a given amino acid sequence maintains a substantially random coil conformation can be determined by circular dichroism spectroscopy in far ultra-violet region, nuclear magnetic resonance spectroscopy, infra-red spectroscopy, Raman spectroscopy, fluorescence spectroscopy, numerous hydrodynamic techniques including gel-filtration, viscometry, small-angle X-ray scattering, small angle neutron scattering, dynamic light scattering, static light scattering, scanning microcalorimetry, and limited proteolysis.
[0098]In another embodiment of the invention, an EBD sequence comprises an amino acid sequence that is substantially mutually repulsive. This property of being mutually repulsive can be determined by simple calculations of charge distribution within the polypeptide sequence.
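One simple form such a charge calculation might take is sketched below. It is an illustration only: the residue classes (Lys/Arg positive, Asp/Glu negative, histidine and post-translational modifications ignored) and the function name charge_summary are assumptions, and the specification does not prescribe any particular threshold for "substantially mutually repulsive".

```python
POSITIVE = set("KR")  # assumed positively charged residues
NEGATIVE = set("DE")  # assumed negatively charged residues


def charge_summary(seq):
    """Tally charged residues and report the net charge of a sequence."""
    seq = seq.upper()
    n = len(seq)
    pos = sum(res in POSITIVE for res in seq)
    neg = sum(res in NEGATIVE for res in seq)
    return {
        "positive": pos,
        "negative": neg,
        "net_charge": pos - neg,
        "net_charge_per_residue": (pos - neg) / n if n else 0.0,
        "fraction_charged": (pos + neg) / n if n else 0.0,
    }


# Ten tandem copies of SPEAEK (SEQ ID NO:23) as a toy input.
summary = charge_summary("SPEAEK" * 10)
print(summary)
```

A sequence in which like charges predominate, giving a net charge per residue well away from zero, would be expected to repel other copies of itself.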
[0099]In yet another embodiment of the invention, an EBD sequence comprises an amino acid sequence that remains in substantially constant motion, particularly in an aqueous environment. The property of being in substantially constant motion can be determined by nuclear magnetic resonance spectroscopy, small-angle X-ray scattering, small angle neutron scattering, dynamic light scattering, intrinsic fluorescence spectroscopy, extrinsic fluorescence spectroscopy, fluorescence resonance energy transfer, fluorescence anisotropy and polarization, steady-state fluorescence, and time-domain fluorescence.
[0100]According to a more particular embodiment of the present invention, an EBD sequence is derived from one of the three subunits that make up mammalian axon neurofilaments (including human, bovine, chicken, rabbit, mouse, and rat neurofilaments). Axon neurofilaments are major cytoskeletal components of the axonal cell. One of the functions of neurofilaments is to maintain the bore of the axon. Spacing between the filaments is maintained by the action of an entropic brush formed by entropic bristles carried by certain of the neurofilament subunits. The combination of the entropic bristles along the length of the fiber results in the formation of an entropic brush that functions to sterically exclude interfiber contact by thermally-driven motion, thereby maintaining the bore of the axon. Interfilament spacing is thought to be maintained by long-range interactions between the entropic brushes formed by the EBDs that project from the NF-M and NF-H monomers (Brown and Hoh, 1997).
[0101]Therefore, in another embodiment of the invention, an EBD sequence of the invention comprises a C-terminal entropic bristle sequence of an NF-M or NF-H neurofilament protein. For example, in one embodiment, an EBD sequence of the invention comprises at least one amino acid sequence, SPEAEK (SEQ ID NO:23), derived from the neurofilament triplet H protein. In a related embodiment, multiple repeats of the SPEAEK (SEQ ID NO:23) sequence are provided within the same isolated fusion polypeptide. In a more particular embodiment, about 1-10, 1-50 or 1-100 repeats of the sequence SPEAEK (SEQ ID NO:23) are provided in a polypeptide.
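As a brief illustration of how a repeat-based EBD sequence might be assembled in silico, the sketch below concatenates copies of SPEAEK (SEQ ID NO:23) and checks the result against the general length range of about 5-600 residues given above. The function names are hypothetical, and treating the "about" bounds as hard limits is a simplifying assumption.

```python
def tandem_repeats(unit, copies):
    """Concatenate `copies` head-to-tail copies of the repeat `unit`."""
    if copies < 1:
        raise ValueError("at least one copy is required")
    return unit * copies


def within_length_range(seq, shortest=5, longest=600):
    """Check a sequence against the general EBD length range (assumed strict)."""
    return shortest <= len(seq) <= longest


ebd_10 = tandem_repeats("SPEAEK", 10)    # 60 residues
ebd_100 = tandem_repeats("SPEAEK", 100)  # 600 residues
print(len(ebd_10), within_length_range(ebd_10))
print(len(ebd_100), within_length_range(ebd_100))
```

The same helper would apply to any of the other repeat units set forth herein.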
[0102]In another embodiment of the invention, an EBD sequence is a sequence derived from a phage protein. In a more particular embodiment, the EBD sequence comprises at least one sequence derived from the linker region of a filamentous phage, such as the filamentous phage fd. In a more particular embodiment, the EBD sequence comprises at least one sequence derived from the linker region derived from the filamentous phage fd adsorption protein pIII. In a more particular embodiment, the EBD sequence comprises at least one sequence derived from the 36 amino acid linker region derived from filamentous phage fd adsorption protein pIII. In a more particular embodiment, an EBD sequence of the invention comprises between about 1-10, 1-50 or 1-100 repeats of the amino acid sequence EGGGS (SEQ ID NO:93), derived from the linker region of a filamentous phage fd adsorption protein pIII.
[0103]In another embodiment of the invention, an EBD sequence is a sequence derived from nucleoporin. In eukaryotic cells, the translocation of biomolecules between the nucleus and cytosol occurs through nuclear pore complexes (NPCs), supramolecular protein structures embedded in the double lipid membrane of the nuclear envelope (Nakielny, S., and Dreyfuss, G. (1999) Cell 99, 677-690; Pemberton, L. F., Blobel, G., and Rosenblum, J. S. (1998) Curr. Opin. Cell Biol. 10, 392-399; Rout, M., and Aitchison, J. (2001) J. Biol. Chem. 276, 16593-16596). For example, the Saccharomyces cerevisiae NPC is a 60-MDa structure (Yang, Q., Rout, M. P., and Akey, C. W. (1998) Mol. Cell. 1, 223-234) formed by 30 different nucleoporins present in multiple copies per NPC (Rout, M. P., Aitchison, J. D., Suprapto, A., Hjertaas, K., Zhao, Y., and Chait, B. T. (2000) J. Cell Biol. 148, 635-651). The yeast NPC contains a core ring structure with 8-fold symmetry measuring 95 nm in diameter and 35 nm in depth (Yang, Q., Rout, M. P., and Akey, C. W. (1998) Mol. Cell. 1, 223-234). It is believed that nucleoporins form a barrier meshwork that excludes most macromolecules larger than a threshold size from entering the NPC (Rout, M., and Aitchison, J. (2001) J. Biol. Chem. 276, 16593-16596; Rout, M. P., Aitchison, J. D., Suprapto, A., Hjertaas, K., Zhao, Y., and Chait, B. T. (2000) J. Cell Biol. 148, 635-651; Denning D P, Uversky V, Patel S S, Fink A L, Rexach M (2002) The Saccharomyces cerevisiae nucleoporin Nup2p is a natively unfolded protein. J Biol. Chem. 277(36):33447-55).
[0104]Therefore, in another embodiment of the invention, an EBD sequence of the invention comprises a central fragment of yeast nucleoporin Nup2p, such as those described herein. For example, in one embodiment, an EBD sequence of the invention comprises at least one amino acid sequence, FSFGTSQPNNTPS (SEQ ID NO:99), derived from the yeast nucleoporin protein Nup2p. In a related embodiment, multiple repeats of the FSFGTSQPNNTPS (SEQ ID NO:99) sequence are provided within the same isolated fusion polypeptide. In a more particular embodiment, about 1-10, 1-50 or 1-100 repeats of the sequence FSFGTSQPNNTPS (SEQ ID NO:99) are provided in a polypeptide.
[0105]In another embodiment of the invention, an EBD sequence is a sequence derived from an elastin-like polypeptide (ELP). ELPs comprise multiple repeats of the elastin-derived pentamer VPGXG (SEQ ID NO:144), where X, the guest residue, is not proline. ELPs are disordered and highly solvated at normal temperatures. They undergo an inverse transition at elevated temperatures, i.e., above the transition temperature (Tt) of a particular ELP sequence. The conformation of ELPs transitions from extended to collapsed and is dependent on temperature and salt concentration. Purification of proteins using ELPs may be carried out using inverse transition cycling. The ELP is soluble at temperatures below its Tt and insoluble at temperatures above its Tt. Using ELPs to purify protein may be accomplished by making a fusion construct that includes the target heterologous protein and a suitable ELP multimer, e.g., comprising about 5-100 residues.
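The VPGXG repeat rule can be expressed as a short pattern check. The sketch below is illustrative only: it assumes that the guest residue is drawn from the 19 standard amino acids other than proline and that the sequence consists solely of whole pentamers; the function name elp_guest_residues is hypothetical.

```python
import re

# Assumption: guest residue X is any standard amino acid except proline.
NON_PROLINE = "ACDEFGHIKLMNQRSTVWY"
ELP_PATTERN = re.compile(rf"(?:VPG[{NON_PROLINE}]G)+")


def elp_guest_residues(seq):
    """Return the guest residue of each VPGXG pentamer, or None if `seq`
    is not made up entirely of such pentamers."""
    seq = seq.upper()
    if not ELP_PATTERN.fullmatch(seq):
        return None
    # The guest residue is the fourth position of every pentamer.
    return [seq[i + 3] for i in range(0, len(seq), 5)]


print(elp_guest_residues("VPGVGVPGAGVPGGG"))
print(elp_guest_residues("VPGPG"))
```

The first call lists one guest residue per pentamer; the second returns None because proline occupies the guest position.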
[0106]As will be understood by those skilled in the art, the propensity of a polypeptide chain to maintain a substantially random coil and flexible conformation is encoded in its amino acid composition rather than in its amino acid sequence (Uversky V N, Gillespie J R, Fink A L (2000) Why are "natively unfolded" proteins unstructured under physiologic conditions? Proteins. 41(3):415-27). This means that polypeptides sharing similar amino acid compositions will be similarly unfolded. The function of EBDs to increase protein solubility is based at least in part on their random coil and flexible conformation. Therefore, in one preferred embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence of a mammalian NF-H protein. In another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence of a mammalian NF-M protein. In yet another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence of a Nup2p protein. In another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence of a mammalian elastin protein. In yet another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence of a filamentous phage fd adsorption protein pIII.
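A scrambled variant that preserves the amino acid composition of its parent can be generated by a random permutation of the parent's residues, as sketched below. This is a minimal illustration; the function names and the optional seed argument are introduced here for reproducibility and are not part of the specification.

```python
import random
from collections import Counter


def scramble(seq, seed=None):
    """Return a random permutation of the residues in `seq`."""
    rng = random.Random(seed)
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def same_composition(a, b):
    """True if two sequences contain exactly the same residue counts."""
    return Counter(a) == Counter(b)


parent = "SPEAEK" * 5  # toy parent built from SEQ ID NO:23
variant = scramble(parent, seed=0)
print(variant, same_composition(parent, variant))
```

Because a permutation only reorders residues, the composition check holds for any seed; only the order of the residues changes.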
[0107]In another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence corresponding to any combination of fragments derived from the sequence of a mammalian NF-H protein. In yet another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence corresponding to any combination of fragments derived from the sequence of a mammalian NF-M protein. In another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence corresponding to any combination of fragments derived from the sequence of a Nup2p protein. In another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence corresponding to any combination of fragments derived from the sequence of an elastin protein. In yet another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence corresponding to any combination of fragments derived from the sequence of a filamentous phage fd adsorption protein pIII.
[0108]In another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence corresponding to multiple repeats of any combination of fragments derived from the sequence of a mammalian NF-H protein. In yet another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence corresponding to multiple repeats of any combination of fragments derived from the sequence of a mammalian NF-M protein. In another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence corresponding to multiple repeats of any combination of fragments derived from the sequence of a Nup2p protein. In another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence corresponding to multiple repeats of any combination of fragments derived from the sequence of an elastin protein. In yet another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence corresponding to multiple repeats of any combination of fragments derived from the sequence of a filamentous phage fd adsorption protein pIII.
[0109]In another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence corresponding to any pairwise or multiple combinations of fragments derived from the sequences of a mammalian NF-H protein, a mammalian NF-M protein, a Nup2p protein, an elastin protein and a filamentous phage fd adsorption protein pIII.
[0110]In yet another embodiment of the invention, an EBD sequence of the invention comprises a scrambled variant sequence corresponding to multiple repeats of any pairwise or multiple combinations of fragments derived from the sequences of a mammalian NF-H protein, a mammalian NF-M protein, a Nup2p protein, an elastin protein and a filamentous phage fd adsorption protein pIII.
[0111]In another embodiment, the fusion polypeptides of the invention further comprise independent cleavable linkers, which allow an EBD sequence, for example at either the N or C terminus, to be easily cleaved from a heterologous polypeptide sequence of interest. Such cleavable linkers are known and available in the art. This embodiment thus provides improved isolation and purification of a heterologous polypeptide sequence and facilitates downstream high-throughput processes.
[0112]The present invention also provides polypeptide fragments of an EBD polypeptide sequence described herein, wherein the fragment comprises at least about 5, 10, 15, 20, 25, 50, or 100 contiguous amino acids, or more, including all intermediate lengths, of an EBD polypeptide sequence set forth herein, or those encoded by a polynucleotide sequence set forth herein. In a preferred embodiment, an EBD fragment provides similar or improved activity relative to the activity of the EBD sequence from which it is derived (wherein the activity includes, for example, one or more of improved solubility, improved folding, reduced aggregation and/or improved yield, when in fusion with a heterologous polypeptide sequence of interest).
[0113]In another aspect, the present invention provides variants of an EBD polypeptide sequence described herein. EBD polypeptide variants will typically exhibit at least about 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or more identity (e.g., determined as described below), along their length, to an EBD polypeptide sequence set forth herein. Preferably the EBD variant provides similar or improved activity relative to the activity of the EBD sequence from which the variant was derived (wherein the activity includes one or more of improved solubility, improved folding, reduced aggregation and/or improved yield, when in fusion with a heterologous polypeptide sequence of interest).
[0114]An EBD polypeptide variant thus refers to a polypeptide that differs from an EBD polypeptide sequence disclosed herein in one or more substitutions, deletions, additions and/or insertions. Such variants may be naturally occurring or may be synthetically generated, for example, by modifying one or more of the EBD polypeptide sequences of the invention and evaluating their activity as described herein and/or using any of a number of techniques well known in the art.
[0115]In many instances, a variant will contain conservative substitutions. A "conservative substitution" is one in which an amino acid is substituted for another amino acid that has similar properties, such that one skilled in the art of peptide chemistry would expect the secondary structure and hydropathic nature of the polypeptide to be substantially unchanged. As described above, modifications may be made in the structure of the EBD polynucleotides and polypeptides of the present invention and still obtain a functional molecule that encodes a variant or derivative polypeptide with desirable activity. When it is desired to alter the amino acid sequence of an EBD polypeptide to create an equivalent or an improved EBD variant or EBD fragment, one skilled in the art can readily change one or more of the codons of the encoding DNA sequence, for example according to Table 1.
[0116]For example, certain amino acids may be substituted for other amino acids in a protein structure without appreciable loss of desired activity. It is thus contemplated that various changes may be made in the EBD polypeptide sequences of the invention, or corresponding DNA sequences which encode said EBD polypeptide sequences, without appreciable loss of their desired activity.
TABLE 1
Amino Acids                   Codons
Alanine         Ala   A       GCA GCC GCG GCU
Cysteine        Cys   C       UGC UGU
Aspartic acid   Asp   D       GAC GAU
Glutamic acid   Glu   E       GAA GAG
Phenylalanine   Phe   F       UUC UUU
Glycine         Gly   G       GGA GGC GGG GGU
Histidine       His   H       CAC CAU
Isoleucine      Ile   I       AUA AUC AUU
Lysine          Lys   K       AAA AAG
Leucine         Leu   L       UUA UUG CUA CUC CUG CUU
Methionine      Met   M       AUG
Asparagine      Asn   N       AAC AAU
Proline         Pro   P       CCA CCC CCG CCU
Glutamine       Gln   Q       CAA CAG
Arginine        Arg   R       AGA AGG CGA CGC CGG CGU
Serine          Ser   S       AGC AGU UCA UCC UCG UCU
Threonine       Thr   T       ACA ACC ACG ACU
Valine          Val   V       GUA GUC GUG GUU
Tryptophan      Trp   W       UGG
Tyrosine        Tyr   Y       UAC UAU
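As an illustration of how Table 1 may be used when changing codons, the following is a minimal Python sketch that stores the table as a dictionary and counts or lists the RNA sequences able to encode a short peptide. The function names and the example peptide "KSP" are assumptions made for illustration only and are not taken from the sequences of the invention.

```python
from itertools import product

# Table 1, written with RNA codons exactly as listed in the text.
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCU"],
    "C": ["UGC", "UGU"],
    "D": ["GAC", "GAU"],
    "E": ["GAA", "GAG"],
    "F": ["UUC", "UUU"],
    "G": ["GGA", "GGC", "GGG", "GGU"],
    "H": ["CAC", "CAU"],
    "I": ["AUA", "AUC", "AUU"],
    "K": ["AAA", "AAG"],
    "L": ["UUA", "UUG", "CUA", "CUC", "CUG", "CUU"],
    "M": ["AUG"],
    "N": ["AAC", "AAU"],
    "P": ["CCA", "CCC", "CCG", "CCU"],
    "Q": ["CAA", "CAG"],
    "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGU"],
    "S": ["AGC", "AGU", "UCA", "UCC", "UCG", "UCU"],
    "T": ["ACA", "ACC", "ACG", "ACU"],
    "V": ["GUA", "GUC", "GUG", "GUU"],
    "W": ["UGG"],
    "Y": ["UAC", "UAU"],
}


def count_encodings(peptide):
    """Number of distinct codon sequences that encode the peptide."""
    total = 1
    for residue in peptide:
        total *= len(CODONS[residue])
    return total


def enumerate_encodings(peptide):
    """List every codon sequence for the peptide (short peptides only)."""
    choices = (CODONS[residue] for residue in peptide)
    return ["".join(combo) for combo in product(*choices)]
```

Because the count grows multiplicatively with peptide length, enumeration is practical only for short stretches; for longer sequences one would typically choose a single codon per residue according to the host's codon preference.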
[0117]In making such changes, the hydropathic index of amino acids may also be considered. The importance of the hydropathic amino acid index in conferring interactive biologic function on a protein is generally understood in the art (Kyte and Doolittle, 1982, incorporated herein by reference). It is accepted that the relative hydropathic character of the amino acid contributes to the secondary structure of the resultant protein, which in turn has potential bearing on the interaction of the protein with other molecules, for example, enzymes, substrates, receptors, DNA, antibodies, antigens, and the like. Each amino acid has been assigned a hydropathic index on the basis of its hydrophobicity and charge characteristics (Kyte and Doolittle, 1982). These values are: isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5); methionine (+1.9); alanine (+1.8); glycine (-0.4); threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine (-1.3); proline (-1.6); histidine (-3.2); glutamate (-3.5); glutamine (-3.5); aspartate (-3.5); asparagine (-3.5); lysine (-3.9); and arginine (-4.5).
[0118]Therefore, according to certain embodiments, amino acids within an EBD sequence of the invention may be substituted by other amino acids having a similar hydropathic index or score. Preferably, any such changes result in an EBD sequence with a similar level of activity as the unmodified EBD sequence. In making such changes, the substitution of amino acids whose hydropathic indices are within ±2 is preferred, those within ±1 are particularly preferred, and those within ±0.5 are even more particularly preferred. It is also understood in the art that the substitution of like amino acids can be made effectively on the basis of hydrophilicity. As detailed in U.S. Pat. No. 4,554,101, the following hydrophilicity values have been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0±1); glutamate (+3.0±1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (-0.4); proline (-0.5±1); alanine (-0.5); histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine (-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4). Thus, an amino acid can be substituted for another having a similar hydrophilicity value and in many cases still retain a desired level of activity. In such changes, the substitution of amino acids whose hydrophilicity values are within ±2 is preferred, those within ±1 are particularly preferred, and those within ±0.5 are even more particularly preferred.
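The hydropathic-index rule can be sketched as a simple lookup. The following minimal Python example, offered only as an illustration, uses the Kyte and Doolittle values listed above and returns the residues whose index lies within a chosen tolerance (2, 1 or 0.5) of a given residue. The function name and the small rounding allowance are assumptions of this sketch.

```python
# Kyte and Doolittle hydropathic indices as listed in the text.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def similar_residues(residue, tolerance=2.0):
    """Residues whose hydropathic index is within +/- tolerance of residue."""
    reference = HYDROPATHY[residue]
    # The 1e-9 allowance only guards against floating-point rounding at
    # the boundary; it is not part of the rule described in the text.
    return sorted(
        other
        for other, value in HYDROPATHY.items()
        if other != residue and abs(value - reference) <= tolerance + 1e-9
    )
```

On this scale lysine falls within 0.5 of aspartate and glutamate despite their opposite charge, which illustrates why hydropathic index is considered together with charge, size and hydrophilicity rather than alone.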
[0119]As outlined above, amino acid substitutions are generally therefore based on the relative similarity of the amino acid side-chain substituents, for example, their hydrophobicity, hydrophilicity, charge, size, and the like.
[0120]In addition, any polynucleotide of the invention, such as a polynucleotide encoding an EBD polypeptide sequence, or a vector comprising a polynucleotide encoding an EBD polypeptide sequence, may be further modified to increase stability in vivo. Possible modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' ends; the use of phosphorothioate or 2' O-methyl rather than phosphodiester linkages in the backbone; and/or the inclusion of nontraditional bases such as inosine, queuosine and wybutosine, as well as acetyl-, methyl-, thio- and other modified forms of adenine, cytidine, guanine, thymine and uridine.
[0121]Amino acid substitutions within an EBD sequence of the invention may further be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity and/or the amphipathic nature of the residues. For example, negatively charged amino acids include aspartic acid and glutamic acid; positively charged amino acids include lysine and arginine; and amino acids with uncharged polar head groups having similar hydrophilicity values include leucine, isoleucine and valine; glycine and alanine; asparagine and glutamine; and serine, threonine, phenylalanine and tyrosine. Other groups of amino acids that may represent conservative changes include: (1) ala, pro, gly, glu, asp, gln, asn, ser, thr; (2) cys, ser, tyr, thr; (3) val, ile, leu, met, ala, phe; (4) lys, arg, his; and (5) phe, tyr, trp, his. A variant may also, or alternatively, contain nonconservative changes.
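The five groups listed above can be applied mechanically. The following is a minimal Python sketch, assuming one-letter residue codes and taking the glutamine entry of group (1) as Gln; the function name is hypothetical and the check is only a first-pass screen, not a prediction of retained activity.

```python
# Groups (1) to (5) from the text, in one-letter code.
CONSERVATIVE_GROUPS = [
    set("APGEDQNST"),  # (1) ala, pro, gly, glu, asp, gln, asn, ser, thr
    set("CSYT"),       # (2) cys, ser, tyr, thr
    set("VILMAF"),     # (3) val, ile, leu, met, ala, phe
    set("KRH"),        # (4) lys, arg, his
    set("FYWH"),       # (5) phe, tyr, trp, his
]


def is_conservative(original, replacement):
    """True if two different residues share at least one listed group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )
```

Because the groups overlap (for example, serine appears in groups (1) and (2)), a residue may have conservative partners drawn from more than one group.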
[0122]In an illustrative embodiment, a variant EBD polypeptide differs from the corresponding unmodified EBD sequence by substitution, deletion or addition of five percent of the original amino acids or fewer. Variants may also (or alternatively) be modified by, for example, the deletion or addition of amino acids that have minimal influence on the desired activity.
[0123]A polypeptide of the invention may further comprise a signal (or leader) sequence at the N-terminal end of the polypeptide, which co-translationally or post-translationally directs transfer of the protein. The polypeptide may also be conjugated to a linker or other sequence for ease of synthesis, purification or identification of the polypeptide (e.g., poly-His), or to enhance binding of the polypeptide to a solid support.
[0124]As noted above, the present invention provides EBD polypeptide variant sequences which share some degree of sequence identity with an EBD polypeptide specifically described herein, such as those having at least 40%, 50%, 60%, 70%, 80%, 90% or 95% identity with an EBD polypeptide sequence described herein. When comparing polypeptide sequences to evaluate their extent of shared sequence identity, two sequences are said to be "identical" if the sequence of amino acids in the two sequences is the same when aligned for maximum correspondence, as described below. Comparisons between two sequences are typically performed by comparing the sequences over a comparison window to identify and compare local regions of sequence similarity.
[0125]A "comparison window" as used herein, refers to a segment of at least about 20 contiguous positions, usually 30 to about 75, or 40 to about 50, in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned.
[0126]Optimal alignment of sequences for comparison may be conducted using the Megalign program in the Lasergene suite of bioinformatics software (DNASTAR, Inc., Madison, Wis.), using default parameters. This program embodies several alignment schemes described in the following references: Dayhoff, M. O., (1978) A model of evolutionary change in proteins--Matrices for detecting distant relationships. In Dayhoff, M. O. (ed.) Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, Washington D.C. Vol. 5, Suppl. 3, pp. 345-358; Hein J. (1990) Unified Approach to Alignment and Phylogenies, pp. 626-645 Methods in Enzymology vol. 183, Academic Press, Inc., San Diego, Calif.; Higgins, D. G. and Sharp, P. M., CABIOS 5:151-153 (1989); Myers, E. W. and Miller W., CABIOS 4:11-17 (1988); Robinson, E. D., Comb. Theor 11:105 (1971); Saitou, N. and Nei, M., Mol. Biol. Evol. 4:406-425 (1987); Sneath, P. H. A. and Sokal, R. R., Numerical Taxonomy--the Principles and Practice of Numerical Taxonomy, Freeman Press, San Francisco, Calif. (1973); Wilbur, W. J. and Lipman, D. J., Proc. Natl. Acad. Sci. USA 80:726-730 (1983).
[0127]Alternatively, optimal alignment of sequences for comparison may be conducted by the local identity algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the identity alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity methods of Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85: 2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group (GCG), 575 Science Dr., Madison, Wis.), or by inspection.
[0128]Preferred examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., J. Mol. Biol. 215:403-410 (1990), and Altschul et al., Nucl. Acids Res. 25:3389-3402 (1997), respectively. BLAST and BLAST 2.0 can be used, for example with the parameters described herein, to determine percent sequence identity for the polynucleotides and polypeptides of the invention. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. For amino acid sequences, a scoring matrix can be used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T and X determine the sensitivity and speed of the alignment.
[0129]In one preferred approach, the "percentage of sequence identity" is determined by comparing two optimally aligned sequences over a window of comparison of at least 20 positions, wherein the portion of the polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) of 20 percent or less, usually 5 to 15 percent, or 10 to 12 percent, as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the reference sequence (i.e., the window size) and multiplying the results by 100 to yield the percentage of sequence identity.
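The percentage calculation just described can be sketched in a few lines of Python. This is a minimal illustration that assumes the two sequences have already been optimally aligned by one of the programs named above and that gaps are written as "-"; the function name and the example sequences are hypothetical and are not sequences of the invention.

```python
def percent_identity(aligned_query, aligned_reference, gap="-"):
    """Matched positions / positions in the reference window x 100."""
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    # Count positions at which the identical residue occurs in both.
    matches = sum(
        1
        for q, r in zip(aligned_query, aligned_reference)
        if q == r and q != gap
    )
    # The window size is the number of residues in the reference.
    window = sum(1 for r in aligned_reference if r != gap)
    if window == 0:
        raise ValueError("reference contains no residues")
    return 100.0 * matches / window


# Hypothetical 20-position window: one substitution and one gap.
reference = "KSPEKAKSPVKEEAKSPEKA"
query = "KSPEKAKSPAKEEAKS-EKA"
identity = percent_identity(query, reference)  # 18 of 20 positions match
```

The sketch performs no alignment itself; the value it returns depends entirely on the alignment supplied.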
[0130]In another aspect of the invention, there is provided an isolated polynucleotide sequence encoding a fusion polypeptide, the fusion polypeptide comprising at least one entropic bristle domain sequence and at least one heterologous polypeptide sequence of interest. In a related aspect, the invention provides expression vectors comprising a polynucleotide encoding an EBD fusion polypeptide of the invention. In another related aspect, an expression vector of the invention comprises a polynucleotide encoding one or more EBD sequence and further comprises a multiple cloning site for the insertion of a polynucleotide encoding a heterologous polypeptide sequence of interest.
[0131]Polynucleotide compositions of the present invention may be identified, prepared and/or manipulated using any of a variety of well established techniques (see generally, Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, N.Y., 1989, and other like references).
[0132]The terms "DNA" and "polynucleotide" are used essentially interchangeably herein to refer to a DNA molecule that has been isolated free of total genomic DNA of a particular species. "Isolated", as used herein, means that a polynucleotide is substantially away from other coding sequences, and that the DNA molecule does not contain large portions of unrelated coding DNA, such as large chromosomal fragments or other functional genes or polypeptide coding regions. Of course, this refers to the DNA molecule as originally isolated, and does not exclude genes or coding regions later added to the segment by the hand of man.
[0133]As will be understood by those skilled in the art, the polynucleotide compositions of this invention can include genomic sequences, extra-genomic and plasmid-encoded sequences and smaller engineered gene segments that express, or may be adapted to express, proteins, polypeptides, peptides and the like. Such segments may be naturally isolated, or modified synthetically by the hand of man.
[0134]As will also be recognized, polynucleotides of the invention may be single-stranded (coding or antisense) or double-stranded, and may be DNA (genomic, cDNA or synthetic) or RNA molecules. RNA molecules may include HnRNA molecules, which contain introns and correspond to a DNA molecule in a one-to-one manner, and mRNA molecules, which do not contain introns. Additional coding or non-coding sequences may, but need not, be present within a polynucleotide of the present invention, and a polynucleotide may, but need not, be linked to other molecules and/or support materials.
[0135]In addition to the EBD polynucleotide sequences set forth herein, the present invention also provides EBD polynucleotide variants having substantial identity to an EBD polynucleotide sequence disclosed herein, for example those comprising at least 50% sequence identity, preferably at least 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or higher, sequence identity compared to an EBD polynucleotide sequence of this invention using the methods described herein (e.g., BLAST analysis using standard parameters, as described herein). One skilled in this art will recognize that these values can be appropriately adjusted to determine corresponding identity of polypeptides encoded by two polynucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning and the like.
[0136]Typically, EBD polynucleotide variants will contain one or more substitutions, additions, deletions and/or insertions, preferably such that the activity (e.g., improved folding, reduced aggregation and/or improved yield, when in fusion with a heterologous sequence of interest) of the polypeptide encoded by the variant polynucleotide is not substantially diminished relative to the corresponding unmodified polynucleotide sequence.
[0137]In additional embodiments, the present invention provides polynucleotide fragments comprising or consisting of various lengths of contiguous stretches of sequence identical to or complementary to one or more of the EBD polynucleotide sequences disclosed herein. For example, polynucleotides are provided by this invention that comprise or consist of at least about 10, 15, 20, 30, 40, 50, 75, 100, 150, 200, 300, 400, 500 or 1000 or more contiguous nucleotides of one or more of the sequences disclosed herein as well as all intermediate lengths therebetween. It will be readily understood that "intermediate lengths", in this context, means any length between the quoted values, such as 16, 17, 18, 19, etc.; 21, 22, 23, etc.; 30, 31, 32, etc.; 50, 51, 52, 53, etc.; 100, 101, 102, 103, etc.; 150, 151, 152, 153, etc.; including all integers through 200-500; 500-1,000, and the like. A polynucleotide sequence as described here may be extended at one or both ends by additional nucleotides not found in the native sequence. This additional sequence may consist of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides at either end of the disclosed sequence or at both ends of the disclosed sequence. Preferably, an EBD polynucleotide fragment of the invention encodes a fusion polypeptide that retains one or more desired activities, e.g., improved folding, reduced aggregation and/or improved yield, when in fusion with a heterologous sequence of interest.
[0138]The EBD polynucleotides of the present invention, or fragments thereof, regardless of the length of the coding sequence itself, may be combined with other DNA sequences, such as promoters, polyadenylation signals, additional restriction enzyme sites, multiple cloning sites, other coding segments, and the like, such that their overall length may vary considerably. It is therefore contemplated that a nucleic acid fragment of almost any length may be employed, with the total length preferably being limited by the ease of preparation and use in the intended recombinant DNA protocol. For example, illustrative polynucleotide segments with total lengths of about 10,000, about 5,000, about 3,000, about 2,000, about 1,000, about 500, about 200, about 100, or about 50 base pairs in length, and the like (including all intermediate lengths), are contemplated to be useful in many implementations of this invention.
[0139]It will be appreciated by those of ordinary skill in the art that, as a result of the degeneracy of the genetic code, there are many nucleotide sequences that will encode a polypeptide as described herein. Some of these polynucleotides bear minimal homology to the native polynucleotide sequence. Nonetheless, polynucleotides that vary due to differences in codon usage are specifically contemplated by the present invention. Further, different alleles of an EBD polynucleotide sequence provided herein are within the scope of the present invention. Alleles are endogenous sequences that are altered as a result of one or more mutations, such as deletions, additions and/or substitutions of nucleotides. The resulting mRNA and protein may, but need not, have an altered structure or function. Alleles may be identified using standard techniques (such as hybridization, amplification and/or database sequence comparison).
[0140]In another embodiment of the invention, a mutagenesis approach, such as site-specific mutagenesis, may be employed for the preparation of variants and/or derivatives of the EBD polynucleotides and polypeptides described herein. By this approach, for example, specific modifications in a polypeptide sequence can be made through mutagenesis of the underlying polynucleotides that encode them. These techniques provide a straightforward approach to prepare and test sequence variants, for example, incorporating one or more of the foregoing considerations, by introducing one or more nucleotide sequence changes into the polynucleotide.
[0141]Site-specific mutagenesis allows the production of mutants through the use of specific oligonucleotide sequences which encode the DNA sequence of the desired mutation, as well as a sufficient number of adjacent nucleotides, to provide a primer sequence of sufficient size and sequence complexity to form a stable duplex on both sides of the deletion junction being traversed. Mutations may be employed in a selected polynucleotide sequence to improve, alter, decrease, modify, or otherwise change the properties of the polynucleotide itself, and/or alter the properties, activity, composition, stability, or primary sequence of the encoded polypeptide.
[0142]In certain embodiments, the present invention contemplates the mutagenesis of the disclosed polynucleotide sequences to alter one or more activities/properties of the encoded polypeptide. The techniques of site-specific mutagenesis are well-known in the art, and are widely used to create variants of both polypeptides and polynucleotides. For example, site-specific mutagenesis is often used to alter a specific portion of a DNA molecule. In such embodiments, a primer comprising typically about 14 to about 25 nucleotides or so in length may be employed, with about 5 to about 10 residues on both sides of the junction of the sequence being altered.
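The primer geometry described above (a changed region flanked by about 5 to about 10 unaltered residues on each side) can be sketched as follows. This minimal Python example is an illustration only: the function name, the template and the chosen substitution are hypothetical, the primer is written in the sense of the strand supplied, and no check of melting temperature or secondary structure is attempted.

```python
def design_mutagenic_primer(template, start, new_bases, flank=10):
    """Return flank + new_bases + flank taken from a DNA template.

    template  : DNA sequence (one strand, 5' to 3')
    start     : 0-based index of the first base to be replaced
    new_bases : replacement bases (same length as the region replaced)
    flank     : unaltered bases kept on each side (5 to 10, per the text)
    """
    if not 5 <= flank <= 10:
        raise ValueError("flank should be about 5 to about 10 bases")
    end = start + len(new_bases)
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough template on one side of the change")
    upstream = template[start - flank:start]
    downstream = template[end:end + flank]
    return upstream + new_bases + downstream


# Hypothetical 45-nt template; codon 11 (GTT, Val) is changed to GCT (Ala).
template = "ATGAAATCTCCAGAAAAAGCTAAATCTCCAGTTAAAGAAGAAGCT"
primer = design_mutagenic_primer(template, start=30, new_bases="GCT")
```

With a three-base change and ten flanking bases on each side the primer is 23 nucleotides long, inside the range of about 14 to about 25 nucleotides mentioned in the text.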
[0143]As will be appreciated by those of skill in the art, site-specific mutagenesis techniques have often employed a phage vector that exists in both a single stranded and double stranded form. Typical vectors useful in site-directed mutagenesis include vectors such as the M13 phage. These phages are readily available commercially and their use is generally well-known to those skilled in the art. Double-stranded plasmids are also routinely employed in site-directed mutagenesis, which eliminates the step of transferring the gene of interest from a plasmid to a phage.
[0144]In general, site-directed mutagenesis in accordance herewith is performed by first obtaining a single-stranded vector or melting apart the two strands of a double-stranded vector that includes within its sequence a DNA sequence that encodes the desired peptide. An oligonucleotide primer bearing the desired mutated sequence is prepared, generally synthetically. This primer is then annealed with the single-stranded vector, and subjected to DNA polymerizing enzymes such as E. coli polymerase I Klenow fragment, in order to complete the synthesis of the mutation-bearing strand. Thus, a heteroduplex is formed wherein one strand encodes the original non-mutated sequence and the second strand bears the desired mutation. This heteroduplex vector is then used to transform appropriate cells, such as E. coli cells, and clones are selected which include recombinant vectors bearing the mutated sequence arrangement.
[0145]The preparation of sequence variants of the selected peptide-encoding DNA segments using site-directed mutagenesis provides a means of producing potentially useful species and is not meant to be limiting as there are other ways in which sequence variants of peptides and the DNA sequences encoding them may be obtained. For example, recombinant vectors encoding the desired peptide sequence may be treated with mutagenic agents, such as hydroxylamine, to obtain sequence variants. Specific details regarding these methods and protocols are found in the teachings of Maloy et al., 1994; Segal, 1976; Prokop and Bajpai, 1991; Kuby, 1994; and Maniatis et al., 1982, each incorporated herein by reference, for that purpose.
[0146]As used herein, the term "oligonucleotide directed mutagenesis procedure" refers to template-dependent processes and vector-mediated propagation which result in an increase in the concentration of a specific nucleic acid molecule relative to its initial concentration, or in an increase in the concentration of a detectable signal, such as amplification. As used herein, the term "oligonucleotide directed mutagenesis procedure" is intended to refer to a process that involves the template-dependent extension of a primer molecule. The term template dependent process refers to nucleic acid synthesis of an RNA or a DNA molecule wherein the sequence of the newly synthesized strand of nucleic acid is dictated by the well-known rules of complementary base pairing (see, for example, Watson, 1987). Typically, vector mediated methodologies involve the introduction of the nucleic acid fragment into a DNA or RNA vector, the clonal amplification of the vector, and the recovery of the amplified nucleic acid fragment. Examples of such methodologies are provided by U.S. Pat. No. 4,237,224, specifically incorporated herein by reference in its entirety.
[0147]In another approach for the production of polypeptide variants of the present invention, recursive sequence recombination, as described in U.S. Pat. No. 5,837,458, may be employed. In this approach, iterative cycles of recombination and screening or selection are performed to "evolve" individual polynucleotide variants of the invention wherein one or more desired activities is improved or modified.
[0148]In other embodiments of the present invention, the polynucleotide sequences provided herein can be advantageously used as probes or primers for nucleic acid hybridization. As such, it is contemplated that nucleic acid segments that comprise or consist of a sequence region of at least about a 15 nucleotide long contiguous sequence that has the same sequence as, or is complementary to, a 15 nucleotide long contiguous sequence disclosed herein may be used. Longer contiguous identical or complementary sequences, e.g., those of about 20, 30, 40, 50, 100, 200, 500, 1000 (including all intermediate lengths) and even up to full length sequences will also be of use in certain embodiments.
[0149]Many template dependent processes are available to amplify target sequences of interest present in a sample. One of the best known amplification methods is the polymerase chain reaction (PCR®) which is described in detail in U.S. Pat. Nos. 4,683,195, 4,683,202 and 4,800,159, each of which is incorporated herein by reference in its entirety. Briefly, in PCR®, two primer sequences are prepared which are complementary to regions on opposite complementary strands of the target sequence. An excess of deoxynucleoside triphosphates is added to a reaction mixture along with a DNA polymerase (e.g., Taq polymerase). If the target sequence is present in a sample, the primers will bind to the target and the polymerase will cause the primers to be extended along the target sequence by adding on nucleotides. By raising and lowering the temperature of the reaction mixture, the extended primers will dissociate from the target to form reaction products, excess primers will bind to the target and to the reaction product and the process is repeated. Preferably, a reverse transcription and PCR® amplification procedure may be performed in order to quantify the amount of mRNA amplified. Polymerase chain reaction methodologies are well known in the art.
[0150]Any of a number of other template dependent processes, many of which are variations of the PCR® amplification technique, are readily known and available in the art. Illustratively, some such methods include the ligase chain reaction (referred to as LCR), described, for example, in Eur. Pat. Appl. Publ. No. 320,308 and U.S. Pat. No. 4,883,750; Qbeta Replicase, described in PCT Intl. Pat. Appl. Publ. No. PCT/US87/00880; Strand Displacement Amplification (SDA) and Repair Chain Reaction (RCR). Still other amplification methods are described in Great Britain Pat. Appl. No. 2 202 328, and in PCT Intl. Pat. Appl. Publ. No. PCT/US89/01025. Other nucleic acid amplification procedures include transcription-based amplification systems (TAS) (PCT Intl. Pat. Appl. Publ. No. WO 88/10315), including nucleic acid sequence based amplification (NASBA) and 3SR. Eur. Pat. Appl. Publ. No. 329,822 describes a nucleic acid amplification process involving cyclically synthesizing single-stranded RNA ("ssRNA"), ssDNA, and double-stranded DNA (dsDNA). PCT Intl. Pat. Appl. Publ. No. WO 89/06700 describes a nucleic acid sequence amplification scheme based on the hybridization of a promoter/primer sequence to a target single-stranded DNA ("ssDNA") followed by transcription of many RNA copies of the sequence. Other amplification methods such as "RACE" (Frohman, 1990), and "one-sided PCR" (Ohara, 1989) are also well-known to those of skill in the art.
[0151]As noted, the EBD fusion polynucleotides, polypeptides and vectors of the present invention are advantageous in the context of recombinant polypeptide production, particularly where it is desired to achieve, for example, improved solubility, improved yield, improved folding and/or reduced aggregation of a heterologous polypeptide to which an EBD polypeptide sequence has been operably fused. Therefore, another aspect of the invention provides methods for producing a recombinant protein, for example by introducing into a host cell an expression vector comprising a polynucleotide sequence encoding a fusion polypeptide as described herein, e.g., a fusion polypeptide comprising at least one EBD sequence and at least one heterologous polypeptide sequence of interest; and expressing the fusion polypeptide in the host cell. In a related embodiment, the method further comprises the step of isolating the fusion polypeptide from the host cell. In another embodiment, the method further comprises the step of removing an entropic bristle domain sequence from the fusion polypeptide before or after isolating the fusion polypeptide from the host cell.
[0152]For recombinant production of a fusion polypeptide of the invention, DNA sequences encoding the polypeptide components of a fusion polypeptide (e.g., one or more EBD sequences and a heterologous polypeptide sequence of interest) may be assembled using conventional methodologies. In one example, the components may be assembled separately and ligated into an appropriate expression vector. For example, the 3' end of the DNA sequence encoding one polypeptide component is ligated, with or without a peptide linker, to the 5' end of a DNA sequence encoding the second polypeptide component so that the reading frames of the sequences are in phase. This permits translation into a single fusion polypeptide that retains the activities of both component polypeptides.
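By way of illustration only, the in-frame joining of two coding sequences described above may be sketched in Python as follows. The function name fuse_in_frame, the checks it performs, and the toy sequences are hypothetical and are not taken from the disclosure; the sketch merely shows one way of confirming that the joined components share a single reading frame with no internal stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA string into consecutive 3-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(first_orf, second_orf, linker=""):
    """Join two coding sequences 5'->3', optionally through a linker.

    first_orf  : begins with ATG and carries no stop codon
    second_orf : encodes the second component and ends with a stop codon
    linker     : optional linker coding sequence
    """
    parts = [p.upper() for p in (first_orf, linker, second_orf)]
    for part in parts:
        if len(part) % 3 != 0:
            raise ValueError("component length is not a multiple of 3")
    fused = "".join(parts)
    triplets = codons(fused)
    if triplets[0] != "ATG":
        raise ValueError("fusion must begin with an ATG start codon")
    if any(c in STOP_CODONS for c in triplets[:-1]):
        raise ValueError("internal stop codon would truncate the fusion")
    if triplets[-1] not in STOP_CODONS:
        raise ValueError("fusion must end with a stop codon")
    return fused


# Toy, made-up sequences: Met-Glu-Lys + Gly-Ser linker + Ala-Pro-stop
fusion = fuse_in_frame("ATGGAAAAA", "GCTCCGTAA", linker="GGTTCT")
print(fusion)  # ATGGAAAAAGGTTCTGCTCCGTAA
```

A component whose length is not a multiple of three raises an error rather than silently shifting the frame of everything downstream.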
[0153]A peptide linker sequence may be employed to separate an EBD polypeptide sequence from a heterologous polypeptide sequence by some defined distance, for example a distance sufficient to ensure that the advantages of the invention are achieved, e.g., advantages such as improved folding, reduced aggregation and/or improved yield. Such a peptide linker sequence may be incorporated into the fusion polypeptide using standard techniques well known in the art. Suitable peptide linker sequences may be chosen based, for example, on factors such as: (1) their ability to adopt a flexible extended conformation; and (2) their inability to adopt a secondary structure that could interfere with the activity of the EBD sequence. Illustrative peptide linker sequences, for example, may contain Gly, Asn and Ser residues. Other near neutral amino acids, such as Thr and Ala may also be used in the linker sequence. Amino acid sequences which may be usefully employed as linkers include those disclosed in Maratea et al., Gene 40:39-46, 1985; Murphy et al., Proc. Natl. Acad. Sci. USA 83:8258-8262, 1986; U.S. Pat. No. 4,935,233 and U.S. Pat. No. 4,751,180. The linker sequence may generally be from 1 to about 50 amino acids in length, for example.
[0154]The ligated DNA sequences of a fusion polynucleotide are operably linked to suitable transcriptional and/or translational regulatory elements. The regulatory elements responsible for expression of DNA are located only 5' to the DNA sequence encoding the first polypeptide. Similarly, stop codons required to end translation and transcription termination signals are only present 3' to the DNA sequence encoding the second polypeptide.
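As a non-limiting illustration, the composition and length guidance for linkers given above may be expressed as a simple screen. The function name is_candidate_linker is hypothetical, and treating "about 50" as a hard upper limit is an assumption made only for this sketch; the screen looks at composition and length alone and says nothing about the conformation a linker actually adopts.

```python
# One-letter codes for Gly, Asn, Ser and the near-neutral Thr and Ala
LINKER_RESIDUES = set("GNSTA")


def is_candidate_linker(peptide, max_len=50):
    """Return True if the peptide is 1..max_len residues long and
    contains only Gly, Asn, Ser, Thr or Ala."""
    peptide = peptide.upper()
    return 1 <= len(peptide) <= max_len and set(peptide) <= LINKER_RESIDUES


print(is_candidate_linker("GGSGGS"))  # True
print(is_candidate_linker("GGPKGS"))  # False (contains Pro and Lys)
```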
[0155]The EBD and heterologous polynucleotide sequences may comprise a sequence as described herein, or may comprise a sequence that has been modified to facilitate recombinant polypeptide production. As will be understood by those of skill in the art, it may be advantageous in some instances to produce polypeptide-encoding polynucleotide sequences possessing non-naturally occurring codons. For example, codons preferred by a particular prokaryotic or eukaryotic host can be selected to increase the rate of protein expression or to produce a recombinant RNA transcript having desirable properties, such as a half-life which is longer than that of a transcript generated from the naturally occurring sequence.
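Purely as an illustrative sketch, host-preferred codon selection of the kind described above may be pictured as a synonymous substitution over the open reading frame. The PREFERRED table below is a small, hypothetical example covering a few codons commonly reported as rare in E. coli; it is not a complete or authoritative usage table, and the function name recode is likewise an assumption.

```python
# Illustrative substitutions only: each rare codon is mapped to a more
# common synonymous codon. A real design would use a full codon-usage
# table for the chosen host.
PREFERRED = {
    "AGA": "CGT", "AGG": "CGT", "CGA": "CGT",  # Arg
    "ATA": "ATC",                              # Ile
    "CTA": "CTG",                              # Leu
    "CCC": "CCG",                              # Pro
}


def recode(orf):
    """Replace listed codons with synonymous ones; leave all others as-is."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return "".join(
        PREFERRED.get(orf[i:i + 3], orf[i:i + 3])
        for i in range(0, len(orf), 3)
    )


# Toy ORF: Met-Arg-Ile-Leu-Pro-stop
print(recode("ATGAGAATACTACCCTAA"))  # ATGCGTATCCTGCCGTAA
```

Because every substitution is synonymous, the encoded polypeptide is unchanged; only the nucleotide sequence differs.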
[0156]Moreover, the polynucleotide sequences of the present invention can be engineered using methods generally known in the art in order to alter polypeptide encoding sequences for a variety of reasons, including but not limited to, alterations which modify the cloning, processing, and/or expression of the gene product. For example, DNA shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides may be used to engineer the nucleotide sequences. In addition, site-directed mutagenesis may be used to insert new restriction sites, alter glycosylation patterns, change codon preference, produce splice variants, or introduce mutations, and so forth.
[0157]In a particular embodiment, a fusion polynucleotide is engineered to further comprise a cleavage site located between the EBD polypeptide-encoding sequence and the heterologous polypeptide sequence, so that the heterologous polypeptide may be cleaved and purified away from an EBD polypeptide sequence at any desired stage following expression of the fusion polypeptide. Illustratively, a fusion polynucleotide of the invention may be designed to include heparin, thrombin, or factor Xa protease cleavage sites.
[0158]In order to express a desired polypeptide, the nucleotide sequences encoding the polypeptide, or functional equivalents, may be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for the transcription and translation of an inserted coding sequence. Methods which are well known to those skilled in the art may be used to construct expression vectors containing sequences encoding a polypeptide of interest and appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. Such techniques are described, for example, in Sambrook, J. et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview, N.Y., and Ausubel, F. M. et al. (1989) Current Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y.
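For illustration only, the arrangement of a cleavage site between the EBD sequence and the heterologous sequence may be modeled at the amino acid level as follows. The sketch assumes the factor Xa recognition sequence Ile-Glu-Gly-Arg with cleavage after the Arg residue and an EBD placed N-terminal to the heterologous polypeptide; the function names and the toy sequences are hypothetical.

```python
FACTOR_XA_SITE = "IEGR"  # Ile-Glu-Gly-Arg; cleavage occurs after Arg


def build_fusion(ebd, target, site=FACTOR_XA_SITE):
    """Place the cleavage site between the EBD and the target polypeptide."""
    return ebd + site + target


def cleave(fusion, site=FACTOR_XA_SITE):
    """Cut after the first occurrence of the site.

    Returns (n_terminal_part_including_site, released_c_terminal_part).
    If the site is absent the fusion is returned uncut.
    """
    idx = fusion.find(site)
    if idx == -1:
        return fusion, ""
    cut = idx + len(site)
    return fusion[:cut], fusion[cut:]


# Toy, made-up sequences in one-letter code
fusion_protein = build_fusion("KSPEKAKSPVKEE", "MASTQ")
ebd_part, released = cleave(fusion_protein)
print(ebd_part)   # KSPEKAKSPVKEEIEGR
print(released)   # MASTQ
```

With the EBD on the N-terminal side, cutting after the Arg leaves no site residues on the released polypeptide.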
[0159]A variety of expression vector/host systems may be utilized to contain and express polynucleotide sequences of the present invention. These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with virus expression vectors (e.g., baculovirus); plant cell systems transformed with virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal cell systems.
[0160]The "control elements" or "regulatory sequences" present in an expression vector are those non-translated regions of the vector--enhancers, promoters, 5' and 3' untranslated regions--which interact with host cellular proteins to carry out transcription and translation. Such elements may vary in their strength and specificity. Depending on the vector system and host utilized, any number of suitable transcription and translation elements, including constitutive and inducible promoters, may be used. For example, when cloning in bacterial systems, inducible promoters such as the hybrid lacZ promoter of the pBLUESCRIPT phagemid (Stratagene, La Jolla, Calif.) or pSPORT1 plasmid (Gibco BRL, Gaithersburg, Md.) and the like may be used. In mammalian cell systems, promoters from mammalian genes or from mammalian viruses are generally preferred. If it is necessary to generate a cell line that contains multiple copies of the sequence encoding a polypeptide, vectors based on SV40 or EBV may be advantageously used with an appropriate selectable marker.
[0161]In bacterial systems, any of a number of expression vectors may be selected depending upon the use intended for the expressed polypeptide. For example, when large quantities are needed, for example for the induction of antibodies, vectors which direct high level expression of fusion proteins that are readily purified may be used. Such vectors include, but are not limited to, the multifunctional E. coli cloning and expression vectors such as pBLUESCRIPT (Stratagene), in which the sequence encoding the polypeptide of interest may be ligated into the vector in frame with sequences for the amino-terminal Met and the subsequent 7 residues of β-galactosidase so that a hybrid protein is produced; pIN vectors (Van Heeke, G. and S. M. Schuster (1989) J. Biol. Chem. 264:5503-5509); and the like. Proteins made in such systems may be designed to include heparin, thrombin, or factor Xa protease cleavage sites so that the cloned polypeptide of interest can be released from the EBD moiety at will.
[0162]In the yeast, Saccharomyces cerevisiae, a number of vectors containing constitutive or inducible promoters such as alpha factor, alcohol oxidase, and PGH may be used. For reviews, see Ausubel et al. (supra) and Grant et al. (1987) Methods Enzymol. 153:516-544.
[0163]In cases where plant expression vectors are used, the expression of sequences encoding polypeptides may be driven by any of a number of promoters. For example, viral promoters such as the 35S and 19S promoters of CaMV may be used alone or in combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J. 6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock promoters may be used (Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984) Science 224:838-843; and Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105). These constructs can be introduced into plant cells by direct DNA transformation or pathogen-mediated transfection. Such techniques are described in a number of generally available reviews (see, for example, Hobbs, S. or Murry, L. E. in McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York, N.Y.; pp. 191-196).
[0164]An insect system may also be used to express a polypeptide of interest. For example, in one such system, Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a vector to express foreign genes in Spodoptera frugiperda cells or in Trichoplusia larvae. The sequences encoding the polypeptide may be cloned into a non-essential region of the virus, such as the polyhedrin gene, and placed under control of the polyhedrin promoter. Successful insertion of the polypeptide-encoding sequence will render the polyhedrin gene inactive and produce recombinant virus lacking coat protein. The recombinant viruses may then be used to infect, for example, S. frugiperda cells or Trichoplusia larvae in which the polypeptide of interest may be expressed (Engelhard, E. K. et al. (1994) Proc. Natl. Acad. Sci. 91:3224-3227).
[0165]In mammalian host cells, a number of viral-based expression systems are generally available. For example, in cases where an adenovirus is used as an expression vector, sequences encoding a polypeptide of interest may be ligated into an adenovirus transcription/translation complex consisting of the late promoter and tripartite leader sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to obtain a viable virus which is capable of expressing the polypeptide in infected host cells (Logan, J. and Shenk, T. (1984) Proc. Natl. Acad. Sci. 81:3655-3659). In addition, transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer, may be used to increase expression in mammalian host cells.
[0166]Specific initiation signals may also be used to achieve more efficient translation of sequences encoding a polypeptide of interest. Such signals include the ATG initiation codon and adjacent sequences. In cases where sequences encoding the polypeptide, its initiation codon, and upstream sequences are inserted into the appropriate expression vector, no additional transcriptional or translational control signals may be needed. However, in cases where only coding sequence, or a portion thereof, is inserted, exogenous translational control signals including the ATG initiation codon should be provided. Furthermore, the initiation codon should be in the correct reading frame to ensure translation of the entire insert. Exogenous translational elements and initiation codons may be of various origins, both natural and synthetic. The efficiency of expression may be enhanced by the inclusion of enhancers which are appropriate for the particular cell system which is used, such as those described in the literature (Scharf, D. et al. (1994) Results Probl. Cell Differ. 20:125-162).
[0167]In addition, a host cell strain may be chosen for its ability to modulate the expression of the inserted sequences or to process the expressed protein in the desired fashion. Such modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a "prepro" form of the protein may also be used to facilitate correct insertion, folding and/or function. Different host cells such as CHO, COS, HeLa, MDCK, HEK293, and WI38, which have specific cellular machinery and characteristic mechanisms for such post-translational activities, may be chosen to ensure the correct modification and processing of the foreign protein.
[0168]For long-term, high-yield production of recombinant proteins, stable expression is generally preferred. For example, cell lines which stably express a polynucleotide of interest may be transformed using expression vectors which may contain viral origins of replication and/or endogenous expression elements and a selectable marker gene on the same or on a separate vector. Following the introduction of the vector, cells may be allowed to grow for 1-2 days in an enriched media before they are switched to selective media. The purpose of the selectable marker is to confer resistance to selection, and its presence allows growth and recovery of cells which successfully express the introduced sequences. Resistant clones of stably transformed cells may be proliferated using tissue culture techniques appropriate to the cell type.
[0169]Any number of selection systems may be used to recover transformed cell lines. These include, but are not limited to, the herpes simplex virus thymidine kinase (Wigler, M. et al. (1977) Cell 11:223-32) and adenine phosphoribosyltransferase (Lowy, I. et al. (1990) Cell 22:817-23) genes which can be employed in tk- or aprt- cells, respectively. Also, antimetabolite, antibiotic or herbicide resistance can be used as the basis for selection; for example, dhfr which confers resistance to methotrexate (Wigler, M. et al. (1980) Proc. Natl. Acad. Sci. 77:3567-70); npt, which confers resistance to the aminoglycosides, neomycin and G-418 (Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14); and als or pat, which confer resistance to chlorsulfuron and phosphinotricin acetyltransferase, respectively (Murry, supra). Additional selectable genes have been described, for example, trpB, which allows cells to utilize indole in place of tryptophan, or hisD, which allows cells to utilize histidinol in place of histidine (Hartman, S. C. and R. C. Mulligan (1988) Proc. Natl. Acad. Sci. 85:8047-51). The use of visible markers has gained popularity with such markers as anthocyanins, beta-glucuronidase and its substrate GUS, and luciferase and its substrate luciferin, being widely used not only to identify transformants, but also to quantify the amount of transient or stable protein expression attributable to a specific vector system (Rhodes, C. A. et al. (1995) Methods Mol. Biol. 55:121-131).
[0170]Although the presence/absence of marker gene expression suggests that the gene of interest is also present, its presence and expression may need to be confirmed. For example, if the sequence encoding a polypeptide is inserted within a marker gene sequence, recombinant cells containing sequences can be identified by the absence of marker gene function. Alternatively, a marker gene can be placed in tandem with a polypeptide-encoding sequence under the control of a single promoter. Expression of the marker gene in response to induction or selection usually indicates expression of the tandem gene as well.
[0171]Alternatively, host cells that contain and express a desired polynucleotide sequence may be identified by a variety of procedures known to those of skill in the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations and protein bioassay or immunoassay techniques which include, for example, membrane, solution, or chip based technologies for the detection and/or quantification of nucleic acid or protein.
[0172]A variety of protocols for detecting and measuring the expression of polynucleotide-encoded products, using either polyclonal or monoclonal antibodies specific for the product, are known in the art. Examples include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), and fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering epitopes on a given polypeptide may be preferred for some applications, but a competitive binding assay may also be employed. These and other assays are described, among other places, in Hampton, R. et al. (1990; Serological Methods, a Laboratory Manual, APS Press, St. Paul, Minn.) and Maddox, D. E. et al. (1983; J. Exp. Med. 158:1211-1216).
[0173]A wide variety of labels and conjugation techniques are known by those skilled in the art and may be used in various nucleic acid and amino acid assays. Means for producing labeled hybridization or PCR probes for detecting sequences related to polynucleotides include oligolabeling, nick translation, end-labeling or PCR amplification using a labeled nucleotide. Alternatively, the sequences, or any portions thereof may be cloned into a vector for the production of an mRNA probe. Such vectors are known in the art, are commercially available, and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase such as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety of commercially available kits. Suitable reporter molecules or labels, which may be used include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents as well as substrates, cofactors, inhibitors, magnetic particles, and the like.
[0174]Host cells transformed with a polynucleotide sequence of interest may be cultured under conditions suitable for the expression and recovery of the polypeptide from cell culture. The polypeptide produced by a recombinant cell may be secreted or contained intracellularly depending on the sequence and/or the vector used. As will be understood by those of skill in the art, expression vectors containing polynucleotides of the invention may be designed to contain signal sequences which direct secretion of the encoded polypeptide through a prokaryotic or eukaryotic cell membrane. Other recombinant constructions may be used to join sequences encoding a polypeptide of interest to polynucleotide sequence encoding a polypeptide domain which will facilitate purification of soluble proteins. Such purification facilitating domains include, but are not limited to, metal chelating peptides such as histidine-tryptophan modules that allow purification on immobilized metals, protein A domains that allow purification on immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity purification system (Immunex Corp., Seattle, Wash.). The inclusion of cleavable linker sequences such as those specific for Factor Xa or enterokinase (Invitrogen, San Diego, Calif.) between the purification domain and the encoded polypeptide may be used to facilitate purification. One such expression vector provides for expression of a fusion protein containing a polypeptide of interest and a nucleic acid encoding 6 histidine residues preceding a thioredoxin or an enterokinase cleavage site. The histidine residues facilitate purification on IMAC (immobilized metal ion affinity chromatography) as described in Porath, J. et al. (1992, Prot. Exp. Purif. 3:263-281) while the enterokinase cleavage site provides a means for purifying the desired polypeptide from the fusion protein. Further discussion of vectors which comprise fusion proteins can be found in Kroll, D. J. et al. (1993; DNA Cell Biol. 12:441-453).
[0175]In addition to recombinant production methods, polypeptides of the invention, and fragments thereof, may be produced by direct peptide synthesis using solid-phase techniques (Merrifield J. (1963) J. Am. Chem. Soc. 85:2149-2154). Polypeptide synthesis may be performed using manual techniques or by automation. Automated synthesis may be achieved, for example, using Applied Biosystems 431A Peptide Synthesizer (Perkin Elmer). Alternatively, various fragments may be chemically synthesized separately and combined using chemical methods to produce the full length molecule.
[0176]According to another aspect, the present invention further provides binding agents, such as antibodies and antigen-binding fragments thereof, that specifically bind to an EBD sequence according to the present invention, or to a portion, variant or derivative thereof. Such binding agents may be used, for example, to detect the presence of a polypeptide comprising an EBD sequence, to facilitate purification of a polypeptide comprising an EBD sequence, and the like. An antibody, or antigen-binding fragment thereof, is said to "specifically bind" to a polypeptide if it reacts at a detectable level (within, for example, an ELISA assay) with the polypeptide, and does not react detectably with unrelated polypeptides under similar conditions.
[0177]Antibodies and other binding agents can be prepared using conventional methodologies. For example, monoclonal antibodies specific for a polypeptide of interest may be prepared using the technique of Kohler and Milstein, Eur. J. Immunol. 6:511-519, 1976, and improvements thereto. Briefly, these methods involve the preparation of immortal cell lines capable of producing antibodies having the desired specificity (i.e., reactivity with the polypeptide of interest). Such cell lines may be produced, for example, from spleen cells obtained from an animal immunized as described above. The spleen cells are then immortalized by, for example, fusion with a myeloma cell fusion partner, preferably one that is syngeneic with the immunized animal. A variety of fusion techniques may be employed. For example, the spleen cells and myeloma cells may be combined with a nonionic detergent for a few minutes and then plated at low density on a selective medium that supports the growth of hybrid cells, but not myeloma cells. A preferred selection technique uses HAT (hypoxanthine, aminopterin, thymidine) selection. After a sufficient time, usually about 1 to 2 weeks, colonies of hybrids are observed. Single colonies are selected and their culture supernatants tested for binding activity against the polypeptide. Hybridomas having high reactivity and specificity are preferred.
[0178]Monoclonal antibodies may be isolated from the supernatants of growing hybridoma colonies. In addition, various techniques may be employed to enhance the yield, such as injection of the hybridoma cell line into the peritoneal cavity of a suitable vertebrate host, such as a mouse. Monoclonal antibodies may then be harvested from the ascites fluid or the blood. Contaminants may be removed from the antibodies by conventional techniques, such as chromatography, gel filtration, precipitation, and extraction. The polypeptides of this invention may be used in the purification process in, for example, an affinity chromatography step.
[0179]A number of "humanized" antibody molecules comprising an antigen-binding site derived from a non-human immunoglobulin have been described, including chimeric antibodies having rodent V regions and their associated CDRs fused to human constant domains (Winter et al. (1991) Nature 349:293-299; Lobuglio et al. (1989) Proc. Nat. Acad. Sci. USA 86:4220-4224; Shaw et al. (1987) J. Immunol. 138:4534-4538; and Brown et al. (1987) Cancer Res. 47:3577-3583), rodent CDRs grafted into a human supporting FR prior to fusion with an appropriate human antibody constant domain (Riechmann et al. (1988) Nature 332:323-327; Verhoeyen et al. (1988) Science 239:1534-1536; and Jones et al. (1986) Nature 321:522-525), and rodent CDRs supported by recombinantly veneered rodent FRs (European Patent Publication No. 519,596, published Dec. 23, 1992). These "humanized" molecules are designed to minimize unwanted immunological response toward rodent antihuman antibody molecules which limits the duration and effectiveness of therapeutic applications of those moieties in human recipients.
[0180]Yet another aspect of the invention provides kits comprising one or more compositions described herein, e.g., an isolated EBD polynucleotide, polypeptide, antibody, vector, host cell, etc. In a particular embodiment, the invention provides a kit containing an expression vector comprising a polynucleotide sequence encoding an EBD polypeptide sequence and a multiple cloning site for easily introducing into the vector a polynucleotide sequence encoding a heterologous polypeptide sequence of interest. In another embodiment, the expression vector further comprises an engineered cleavage site to facilitate separation of an EBD polypeptide sequence from the heterologous polypeptide sequence of interest following recombinant production.
[0181]The following Examples are offered by way of illustration and not by way of limitation.
EXAMPLES
Example 1
Use of Neurofilament Triplet M Protein (NF-M) in an Entropic Bristle Domain Vector
[0182]The heterogeneity in the charge distribution of the human NF-M protein sequence was determined (shown below). The observed heterogeneity of the sequence suggests that EBDs with different characteristics may result for different regions of the sequence. For example, a 422-600 fragment is predominantly negatively charged. This fragment could be used as a basis to design EBDs for negatively charged proteins. The charge distribution in the 601-916 fragment is very heterogeneous. It can be used as a basis to design EBDs both for positively- and negatively-charged proteins.
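By way of a hedged illustration, a charge distribution of the kind referred to above may be approximated with a sliding-window net charge. The sketch assumes a simple scoring scheme (Lys and Arg count +1, Asp and Glu count -1, His and the termini are ignored), which may differ from the analysis actually used; the function names, window size and toy fragments are hypothetical.

```python
POSITIVE = set("KR")  # Lys, Arg
NEGATIVE = set("DE")  # Asp, Glu


def net_charge(seq):
    """Approximate net charge near neutral pH: +1 per K/R, -1 per D/E."""
    return sum((aa in POSITIVE) - (aa in NEGATIVE) for aa in seq.upper())


def charge_profile(seq, window=10):
    """Net charge of every window of the given length along the sequence."""
    return [net_charge(seq[i:i + window])
            for i in range(len(seq) - window + 1)]


# Toy fragments in one-letter code (made up for illustration)
print(net_charge("EEEKEAKEEEGKEEE"))        # -7
print(net_charge("KSPEKAKSPVKEEAKSPEKA"))   # 2
print(charge_profile("EEKK", window=2))     # [-2, 0, 2]
```

A region whose windows are uniformly negative would correspond to a predominantly negatively charged fragment, whereas windows that alternate in sign would indicate a heterogeneous charge distribution.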
[0183]Cloning of EBD sequence: We obtained the full-length cDNA for human NF-M from Origene Technologies (Rockville, Md.) and cloned the coding region for a 494-residue EBD sequence (residues 422 to 916 of the NF-M protein) into a pMALc2E vector from which the maltose-binding protein coding region had been deleted. (See FIG. 1.) Restriction sites suitable for cloning the test proteins were engineered at the appropriate locations. The proximity of the start codon in the cloned target sequences to the Shine-Dalgarno sequence of the vector was the same as that in pMALc2E. This construct is referred to as pEBDM.
[0184]Preparation of heterologous sequence: The coding region of a heterologous sequence of interest may be examined for rare E. coli codons and restriction sites for a suitable cloning strategy. Prior to cloning, incompatible codons and restriction sites may be altered by site directed mutagenesis. The heterologous protein coding region, not including the stop codon, is PCR-amplified using primers containing the relevant restriction sites for the 5' and the 3' ends of the test protein open reading frame respectively.
[0185]Assembly of EBD expression vector: The PCR-amplified open reading frame of the heterologous polypeptide sequence of interest is ligated into the pEBDM vector backbone following digestion with appropriate restriction enzymes. In addition to cloning the heterologous sequence into an EBD expression vector, the test proteins may be cloned, for example, into an MBP expression vector (e.g., pMAL®-c2E, which already contains a maltose-binding protein coding region) as well as a control vector. The pMAL®-c2E serves as a positive control. To construct the control vector backbone, a KpnI site is added to pMAL®-c2E at base 1524 by site-directed mutagenesis of 4 bases. This allows excision of the MBP coding region (including the start codon) by KpnI digestion and re-ligation.
[0186]Protein expression and solubility analysis are carried out essentially according to the procedures of Kapust and Waugh. Briefly, the construct is transformed into E. coli BL21/DE3 cells (Stratagene, La Jolla, Calif.). This cell line provides increased protein stability due to its deficiency in both the OmpT and Lon proteases. The transformed cells are grown at 37° C. with shaking in LB broth supplemented with the appropriate antibiotics, diluted 50 fold, and grown to an OD600 of 0.6 before induction. Recombinant protein production is induced by adding IPTG to a final concentration of 1 mM, and the cells are grown for 3 more hours and harvested by centrifugation. The pellets are resuspended in 0.1 volume of lysis buffer and sonicated to disrupt cells. A sample of this crude lysate is reserved and used for total protein analyses. After the crude lysate is cleared by centrifugation, a sample of the cleared lysate is used for soluble protein analyses. These samples are run on SDS-PAGE gels using standard procedures and visualized by Coomassie staining. The non-degraded soluble recombinant protein is apparent as a heavy band of the appropriate size.
[0187]The stained gels are scanned using an Epson Perfection 3200 scanner (Epson, Long Beach, Calif.) and the density of the protein bands is quantified using Total Lab image analysis software (Nonlinear Dynamics, Newcastle upon Tyne, UK). The densities of the bands corresponding to the fusion protein are normalized by dividing by the combined density of all the E. coli proteins larger than the largest fusion protein. Percent solubility is calculated by dividing the normalized density of the fusion protein band in the cleared lysate (soluble protein) lane by the normalized density of the fusion protein band in the crude lysate (total protein) lane after subtracting the normalized background density obtained from lanes containing equivalent protein extracts from E. coli cells grown with an empty vector. Mean and standard deviation are calculated for at least three independent experiments.
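As a minimal sketch under stated assumptions, the check for internal restriction sites and the design of primers that omit the stop codon may be illustrated as follows. The enzymes chosen, the 4-base 5' clamp, the 18-base annealing length, the function names and the toy open reading frame are all hypothetical; the sketch does not address the reading frame relative to any particular vector.

```python
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC",
         "HindIII": "AAGCTT", "KpnI": "GGTACC"}
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def internal_sites(orf, enzymes):
    """Names of the given enzymes whose sites occur inside the ORF."""
    return [name for name in enzymes if SITES[name] in orf]


def design_primers(orf, enzyme5, enzyme3, anneal=18, clamp="GCGC"):
    """Return (forward, reverse) primers carrying the chosen sites.

    A terminal stop codon, if present, is excluded from the amplicon.
    """
    orf = orf.upper()
    if orf[-3:] in STOP_CODONS:
        orf = orf[:-3]
    clash = internal_sites(orf, [enzyme5, enzyme3])
    if clash:
        raise ValueError("ORF contains internal site(s): " + ", ".join(clash))
    forward = clamp + SITES[enzyme5] + orf[:anneal]
    reverse = clamp + SITES[enzyme3] + revcomp(orf[-anneal:])
    return forward, reverse


# Toy, made-up ORF ending in a TAA stop codon
toy_orf = "ATGGCTAGCAAAGGTGAACTGTTCACCGGTGTTGTTCCGTAA"
fwd, rev = design_primers(toy_orf, "BamHI", "HindIII")
print(fwd)  # GCGCGGATCCATGGCTAGCAAAGGTGAA
print(rev)  # GCGCAAGCTTCGGAACAACACCGGTGAA
```

An open reading frame that already contains one of the chosen sites raises an error, flagging it as a candidate for site-directed mutagenesis before cloning.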
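The dilution and induction volumes implied by the procedure above may be worked out as in the following sketch. The 1 M IPTG stock concentration and the culture volumes are assumptions chosen for illustration, and the function names are hypothetical.

```python
def diluted_volume_ml(inoculum_ml, fold=50):
    """Total culture volume after diluting an inoculum `fold`-fold."""
    return inoculum_ml * fold


def stock_volume_ml(culture_ml, final_mM=1.0, stock_mM=1000.0):
    """Volume of stock to add so that the mixture reaches final_mM.

    Solves stock_mM * v = final_mM * (culture_ml + v) for v.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return culture_ml * final_mM / (stock_mM - final_mM)


culture = diluted_volume_ml(10)        # 10 mL starter diluted 50 fold
iptg = stock_volume_ml(culture)        # assumed 1 M (1000 mM) IPTG stock
print(culture, round(iptg, 3))         # 500 0.501
```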
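The percent solubility calculation described above may be sketched as follows. The sketch assumes that the empty-vector background is subtracted from both the soluble and the total normalized densities before the ratio is taken, which is one reading of the procedure; the function names and the density readings are hypothetical and do not represent experimental data.

```python
from statistics import mean, stdev


def normalize(band, reference):
    """Band density divided by the combined density of the E. coli
    proteins larger than the largest fusion protein in the same lane."""
    if reference <= 0:
        raise ValueError("reference density must be positive")
    return band / reference


def percent_solubility(soluble, total, bg_soluble=0.0, bg_total=0.0):
    """Percent solubility from normalized densities, background-corrected."""
    corrected_total = total - bg_total
    if corrected_total <= 0:
        raise ValueError("total density does not exceed background")
    return 100.0 * (soluble - bg_soluble) / corrected_total


# Hypothetical readings for three independent experiments:
# (soluble band, soluble-lane reference, total band, total-lane reference)
readings = [
    (60.0, 100.0, 80.0, 100.0),
    (45.0, 90.0, 66.0, 110.0),
    (70.0, 100.0, 100.0, 100.0),
]
values = [
    percent_solubility(normalize(sb, sr), normalize(tb, tr))
    for sb, sr, tb, tr in readings
]
print([round(v, 1) for v in values])
print(round(mean(values), 1), round(stdev(values), 1))
```

Normalizing each band to host proteins in the same lane compensates for lane-to-lane differences in loading before the soluble and total lanes are compared.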
Sequence CWU
1
14411026PRTHomo sapiens 1Met Met Ser Phe Gly Gly Ala Asp Ala Leu Leu Gly
Ala Pro Phe Ala1 5 10
15Pro Leu His Gly Gly Gly Ser Leu His Tyr Ala Leu Ala Arg Lys Gly20
25 30Gly Ala Gly Gly Thr Arg Ser Ala Ala Gly
Ser Ser Ser Gly Phe His35 40 45Ser Trp
Thr Arg Thr Ser Val Ser Ser Val Ser Ala Ser Pro Ser Arg50
55 60Phe Arg Gly Ala Gly Ala Ala Ser Ser Thr Asp Ser
Leu Asp Thr Leu65 70 75
80Ser Asn Gly Pro Glu Gly Cys Met Val Ala Val Ala Thr Ser Arg Ser85
90 95Glu Lys Glu Gln Leu Gln Ala Leu Asn Asp
Arg Phe Ala Gly Tyr Ile100 105 110Asp Lys
Val Arg Gln Leu Glu Ala His Asn Arg Ser Leu Glu Gly Glu115
120 125Ala Ala Ala Leu Arg Gln Gln Gln Ala Gly Arg Ser
Ala Met Gly Glu130 135 140Leu Tyr Glu Arg
Glu Val Arg Glu Met Arg Gly Ala Val Leu Arg Leu145 150
155 160Gly Ala Ala Arg Gly Gln Leu Arg Leu
Glu Gln Glu His Leu Leu Glu165 170 175Asp
Ile Ala His Val Arg Gln Arg Leu Asp Asp Glu Ala Arg Gln Arg180
185 190Glu Glu Ala Glu Ala Ala Ala Arg Ala Leu Ala
Arg Phe Ala Gln Glu195 200 205Ala Glu Ala
Ala Arg Val Asp Leu Gln Lys Lys Ala Gln Ala Leu Gln210
215 220Glu Glu Cys Gly Tyr Leu Arg Arg His His Gln Glu
Glu Val Gly Glu225 230 235
240Leu Leu Gly Gln Ile Gln Gly Ser Gly Ala Ala Gln Ala Gln Met Gln245
250 255Ala Glu Thr Arg Asp Ala Leu Lys Cys
Asp Val Thr Ser Ala Leu Arg260 265 270Glu
Ile Arg Ala Gln Leu Glu Gly His Ala Val Gln Ser Thr Leu Gln275
280 285Ser Glu Glu Trp Phe Arg Val Arg Leu Asp Arg
Leu Ser Glu Ala Ala290 295 300Lys Val Asn
Thr Asp Ala Met Arg Ser Ala Gln Glu Glu Ile Thr Glu305
310 315 320Tyr Arg Arg Gln Leu Gln Ala
Arg Thr Thr Glu Leu Glu Ala Leu Lys325 330
335Ser Thr Lys Asp Ser Leu Glu Arg Gln Arg Ser Glu Leu Glu Asp Arg340
345 350His Gln Ala Asp Ile Ala Ser Tyr Gln
Glu Ala Ile Gln Gln Leu Asp355 360 365Ala
Glu Leu Arg Asn Thr Lys Trp Glu Met Ala Ala Gln Leu Arg Glu370
375 380Tyr Gln Asp Leu Leu Asn Val Lys Met Ala Leu
Asp Ile Glu Ile Ala385 390 395
400Ala Tyr Arg Lys Leu Leu Glu Gly Glu Glu Cys Arg Ile Gly Phe
Gly405 410 415Pro Ile Pro Phe Ser Leu Pro
Glu Gly Leu Pro Lys Ile Pro Ser Val420 425
430Ser Thr His Ile Lys Val Lys Ser Glu Glu Lys Ile Lys Val Val Glu435
440 445Lys Ser Glu Lys Glu Thr Val Ile Val
Glu Glu Gln Thr Glu Glu Thr450 455 460Gln
Val Thr Glu Glu Val Thr Glu Glu Glu Glu Lys Glu Ala Lys Glu465
470 475 480Glu Glu Gly Lys Glu Glu
Glu Gly Gly Glu Glu Glu Glu Ala Glu Gly485 490
495Gly Glu Glu Glu Thr Lys Ser Pro Pro Ala Glu Glu Ala Ala Ser
Pro500 505 510Glu Lys Glu Ala Lys Ser Pro
Val Lys Glu Glu Ala Lys Ser Pro Ala515 520
525Glu Ala Lys Ser Pro Glu Lys Glu Glu Ala Lys Ser Pro Ala Glu Val530
535 540Lys Ser Pro Glu Lys Ala Lys Ser Pro
Ala Lys Glu Glu Ala Lys Ser545 550 555
560Pro Pro Glu Ala Lys Ser Pro Glu Lys Glu Glu Ala Lys Ser
Pro Ala565 570 575Glu Val Lys Ser Pro Glu
Lys Ala Lys Ser Pro Ala Lys Glu Glu Ala580 585
590Lys Ser Pro Ala Glu Ala Lys Ser Pro Glu Lys Ala Lys Ser Pro
Val595 600 605Lys Glu Glu Ala Lys Ser Pro
Ala Glu Ala Lys Ser Pro Val Lys Glu610 615
620Glu Ala Lys Ser Pro Ala Glu Val Lys Ser Pro Glu Lys Ala Lys Ser625
630 635 640Pro Thr Lys Glu
Glu Ala Lys Ser Pro Glu Lys Ala Lys Ser Pro Glu645 650
655Lys Ala Lys Ser Pro Glu Lys Glu Glu Ala Lys Ser Pro Glu
Lys Ala660 665 670Lys Ser Pro Val Lys Ala
Glu Ala Lys Ser Pro Glu Lys Ala Lys Ser675 680
685Pro Val Lys Ala Glu Ala Lys Ser Pro Glu Lys Ala Lys Ser Pro
Val690 695 700Lys Glu Glu Ala Lys Ser Pro
Glu Lys Ala Lys Ser Pro Val Lys Glu705 710
715 720Glu Ala Lys Ser Pro Glu Lys Ala Lys Ser Pro Val
Lys Glu Glu Ala725 730 735Lys Thr Pro Glu
Lys Ala Lys Ser Pro Val Lys Glu Glu Ala Lys Ser740 745
750Pro Glu Lys Ala Lys Ser Pro Glu Lys Ala Lys Thr Leu Asp
Val Lys755 760 765Ser Pro Glu Ala Lys Thr
Pro Ala Lys Glu Glu Ala Arg Ser Pro Ala770 775
780Asp Lys Phe Pro Glu Lys Ala Lys Ser Pro Val Lys Glu Glu Val
Lys785 790 795 800Ser Pro
Glu Lys Ala Lys Ser Pro Leu Lys Glu Asp Ala Lys Ala Pro805
810 815Glu Lys Glu Ile Pro Lys Lys Glu Glu Val Lys Ser
Pro Val Lys Glu820 825 830Glu Glu Lys Pro
Gln Glu Val Lys Val Lys Glu Pro Pro Lys Lys Ala835 840
845Glu Glu Glu Lys Ala Pro Ala Thr Pro Lys Thr Glu Glu Lys
Lys Asp850 855 860Ser Lys Lys Glu Glu Ala
Pro Lys Lys Glu Ala Pro Lys Pro Lys Val865 870
875 880Glu Glu Lys Lys Glu Pro Ala Val Glu Lys Pro
Lys Glu Ser Lys Val885 890 895Glu Ala Lys
Lys Glu Glu Ala Glu Asp Lys Lys Lys Val Pro Thr Pro900
905 910Glu Lys Glu Ala Pro Ala Lys Val Glu Val Lys Glu
Asp Ala Lys Pro915 920 925Lys Glu Lys Thr
Glu Val Ala Lys Lys Glu Pro Asp Asp Ala Lys Ala930 935
940Lys Glu Pro Ser Lys Pro Ala Glu Lys Lys Glu Ala Ala Pro
Glu Lys945 950 955 960Lys
Asp Thr Lys Glu Glu Lys Ala Lys Lys Pro Glu Glu Lys Pro Lys965
970 975Thr Glu Ala Lys Ala Lys Glu Asp Asp Lys Thr
Leu Ser Lys Glu Pro980 985 990Ser Lys Pro
Lys Ala Glu Lys Ala Glu Lys Ser Ser Ser Thr Asp Gln995
1000 1005Lys Asp Ser Lys Pro Pro Glu Lys Ala Thr Glu Asp
Lys Ala Ala Lys1010 1015 1020Gly
Lys1025
SEQ ID NO: 2; length 3081; type DNA; organism Homo sapiens
atgatgagct tcggcggcgc ggacgcgctg ctgggcgccc
cgttcgcgcc gctgcatggc 60ggcggcagcc tccactacgc gctagcccga aagggtggcg
caggcgggac gcgctccgcc 120gctggctcct ccagcggctt ccactcgtgg acacggacgt
ccgtgagctc cgtgtccgcc 180tcgcccagcc gcttccgtgg cgcaggcgcc gcctcaagca
ccgactcgct ggacacgctg 240agcaacgggc cggagggctg catggtggcg gtggccacct
cacgcagtga gaaggagcag 300ctgcaggcgc tgaacgaccg cttcgccggg tacatcgaca
aggtgcggca gctggaggcg 360cacaaccgca gcctggaggg cgaggctgcg gcgctgcggc
agcagcaggc gggccgctcc 420gctatgggcg agctgtacga gcgcgaggtc cgcgagatgc
gcggcgcggt gctgcgcctg 480ggcgcggcgc gcggtcagct acgcctggag caggagcacc
tgctcgagga catcgcgcac 540gtgcgccagc gcctagacga cgaggcccgg cagcgagagg
aggccgaggc ggcggcccgc 600gcgctggcgc gcttcgcgca ggaggccgag gcggcgcgcg
tggacctgca gaagaaggcg 660caggcgctgc aggaggagtg cggctacctg cggcgccacc
accaggaaga ggtgggcgag 720ctgctcggcc agatccaggg ctccggcgcc gcgcaggcgc
agatgcaggc cgagacgcgc 780gacgccctga agtgcgacgt gacgtcggcg ctgcgcgaga
ttcgcgcgca gcttgaaggc 840cacgcggtgc agagcacgct gcagtccgag gagtggttcc
gagtgaggct ggaccgactg 900tcggaggcag ccaaggtgaa cacagacgct atgcgctcag
cgcaggagga gataactgag 960taccggcgtc agctgcaggc caggaccaca gagctggagg
cactgaaaag caccaaggac 1020tcactggaga ggcagcgctc tgagctggag gaccgtcatc
aggccgacat tgcctcctac 1080caggaagcca ttcagcagct ggacgctgag ctgaggaaca
ccaagtggga gatggccgcc 1140cagctgcgag aataccagga cctgctcaat gtcaagatgg
ctctggatat agagatagcc 1200gcttacagaa aactcctgga aggtgaagag tgtcggattg
gctttggccc aattcctttc 1260tcgcttccag aaggactccc caaaattccc tctgtgtcca
ctcacataaa ggtgaaaagc 1320gaagagaaga tcaaagtggt ggagaagtct gagaaagaaa
ctgtgattgt ggaggaacag 1380acagaggaga cccaagtgac tgaagaagtg actgaagaag
aggagaaaga ggccaaagag 1440gaggagggca aggaggaaga agggggtgaa gaagaggagg
cagaaggggg agaagaagaa 1500acaaagtctc ccccagcaga agaggctgca tccccagaga
aggaagccaa gtcaccagta 1560aaggaagagg caaagtcacc ggctgaggcc aagtccccag
agaaggagga agcaaaatcc 1620ccagccgaag tcaagtcccc tgagaaggcc aagtctccag
caaaggaaga ggcaaagtca 1680ccgcctgagg ccaagtcccc agagaaggag gaagcaaaat
ctccagctga ggtcaagtcc 1740cccgagaagg ccaagtcccc agcaaaggaa gaggcaaagt
caccggctga ggccaagtct 1800ccagagaagg ccaagtcccc agtgaaggaa gaagcaaagt
caccggctga ggccaagtcc 1860ccagtgaagg aagaagcaaa atctccagct gaggtcaagt
ccccggaaaa ggccaagtct 1920ccaacgaagg aggaagcaaa gtcccctgag aaggccaagt
cccctgagaa ggccaagtcc 1980ccagagaagg aagaggccaa gtcccctgag aaggccaagt
ccccagtgaa ggcagaagca 2040aagtcccctg agaaggccaa gtccccagtg aaggcagaag
caaagtcccc tgagaaggcc 2100aagtccccag tgaaggaaga agcaaagtcc cctgagaagg
ccaagtcccc agtgaaggaa 2160gaagcaaagt cccctgagaa ggccaagtcc ccagtgaagg
aagaagcaaa gacccccgag 2220aaggccaagt ccccagtgaa ggaagaagcc aagtccccag
agaaggccaa gtccccagag 2280aaggccaaga ctcttgatgt gaagtctcca gaagccaaga
ctccagcgaa ggaggaagca 2340aggtcccctg cagacaaatt ccctgaaaag gccaaaagcc
ctgtcaagga ggaggtcaag 2400tccccagaga aggcgaaatc tcccctgaag gaggatgcca
aggcccctga gaaggagatc 2460ccaaaaaagg aagaggtgaa gtccccagtg aaggaggagg
agaagcccca ggaggtgaaa 2520gtcaaagagc ccccaaagaa ggcagaggaa gagaaagccc
ctgccacacc aaaaacagag 2580gagaagaagg acagcaagaa agaggaggca cccaagaagg
aggctccaaa gcccaaggtg 2640gaggagaaga aggaacctgc tgtcgaaaag cccaaagaat
ccaaagttga agccaagaag 2700gaagaggctg aagataagaa aaaagtcccc accccagaga
aggaggctcc tgccaaggtg 2760gaggtgaagg aagacgctaa acccaaagaa aagacagagg
tggccaagaa ggaaccagat 2820gatgccaagg ccaaggaacc cagcaaacca gcagagaaga
aggaggcagc accggagaaa 2880aaagacacca aggaggagaa ggccaagaag cctgaggaga
aacccaagac agaggccaaa 2940gccaaggaag atgacaagac cctctcaaaa gagcctagca
agcctaaggc agaaaaggct 3000gaaaaatcct ccagcacaga ccaaaaagac agcaagcctc
cagagaaggc cacagaagac 3060aaggccgcca aggggaagta a
3081
SEQ ID NO: 3; length 1087; type PRT; organism Mus musculus
Met Ser Phe Gly Ser Ala
Asp Ala Leu Leu Gly Ala Pro Phe Ala Pro1 5
10 15Leu His Gly Gly Gly Ser Leu His Tyr Ser Leu Ser
Arg Lys Ala Gly20 25 30Pro Gly Gly Thr
Arg Ser Ala Ala Gly Ser Ser Ser Gly Phe His Ser35 40
45Trp Ala Arg Thr Ser Val Ser Ser Val Ser Ala Ser Pro Ser
Arg Phe50 55 60Arg Gly Ala Ala Ser Ser
Thr Asp Ser Leu Asp Thr Leu Ser Asn Gly65 70
75 80Pro Glu Gly Cys Val Val Ala Ala Val Ala Ala
Arg Ser Glu Lys Glu85 90 95Gln Leu Gln
Ala Leu Asn Asp Arg Phe Ala Gly Tyr Ile Asp Lys Val100
105 110Arg Gln Leu Glu Ala His Asn Arg Ser Leu Glu Gly
Glu Ala Ala Ala115 120 125Leu Arg Gln Gln
Lys Gly Arg Ala Ala Met Gly Glu Leu Tyr Glu Arg130 135
140Glu Val Arg Glu Met Arg Gly Ala Val Leu Arg Leu Gly Ala
Ala Arg145 150 155 160Gly
Gln Leu Arg Leu Glu Gln Glu His Leu Leu Glu Asp Ile Ala His165
170 175Val Arg Gln Arg Leu Asp Glu Glu Ala Arg Gln
Arg Glu Glu Ala Glu180 185 190Ala Ala Ala
Arg Ala Leu Ala Phe Ala Gln Glu Ala Glu Ala Ala Arg195
200 205Val Glu Leu Gln Lys Lys Ala Gln Ala Leu Gln Glu
Glu Cys Gly Tyr210 215 220Leu Arg Arg His
His Gln Glu Glu Val Gly Glu Leu Leu Gly Gln Ile225 230
235 240Gln Gly Cys Gly Ala Ala Gln Ala Gln
Ala Gln Ala Glu Ala Arg Asp245 250 255Ala
Leu Lys Cys Asp Val Thr Ser Ala Leu Arg Glu Ile Arg Ala Gln260
265 270Leu Glu Gly His Ala Val Gln Ser Ser Leu Gln
Ser Glu Glu Trp Phe275 280 285Arg Val Arg
Leu Asp Arg Leu Ser Glu Ala Ala Lys Val Asn Thr Asp290
295 300Ala Met Arg Ser Ala Gln Glu Glu Ile Thr Glu Tyr
Arg Arg Gln Leu305 310 315
320Gln Ala Arg Thr Thr Glu Leu Glu Ala Leu Lys Ser Thr Lys Glu Ser325
330 335Leu Glu Arg Gln Arg Ser Glu Leu Glu
Asp Arg His Gln Ala Asp Ile340 345 350Ala
Ser Tyr Gln Asp Ala Ile Gln Gln Leu Asp Ser Glu Leu Arg Asn355
360 365Thr Lys Trp Glu Met Ala Ala Gln Leu Arg Glu
Tyr Gln Asp Leu Leu370 375 380Asn Val Lys
Met Ala Leu Asp Ile Glu Ile Ala Ala Tyr Arg Lys Leu385
390 395 400Leu Glu Gly Glu Glu Cys Arg
Ile Gly Phe Gly Pro Ser Pro Phe Ser405 410
415Leu Thr Glu Gly Leu Pro Lys Ile Pro Ser Ile Ser Thr His Ile Lys420
425 430Val Lys Ser Glu Glu Met Ile Lys Val
Val Glu Lys Ser Glu Lys Glu435 440 445Thr
Val Ile Val Glu Gly Gln Thr Glu Glu Ile Arg Val Thr Glu Gly450
455 460Val Thr Glu Glu Glu Asp Lys Glu Ala Gln Gly
Gln Glu Gly Glu Glu465 470 475
480Ala Glu Glu Gly Glu Glu Lys Glu Glu Glu Glu Leu Ala Ala Ala
Thr485 490 495Ser Pro Pro Ala Glu Glu Ala
Ala Ser Pro Glu Lys Glu Thr Lys Ser500 505
510Arg Val Lys Glu Glu Ala Lys Ser Pro Gly Glu Ala Lys Ser Pro Gly515
520 525Glu Ala Lys Ser Pro Ala Glu Ala Lys
Ser Pro Gly Glu Ala Lys Ser530 535 540Pro
Gly Glu Ala Lys Ser Pro Gly Glu Ala Lys Ser Pro Ala Glu Pro545
550 555 560Lys Ser Pro Ala Glu Pro
Lys Ser Pro Ala Glu Ala Lys Ser Pro Ala565 570
575Glu Pro Lys Ser Pro Ala Thr Val Lys Ser Pro Gly Glu Ala Lys
Ser580 585 590Pro Ser Glu Ala Lys Ser Pro
Ala Glu Ala Lys Ser Pro Ala Glu Ala595 600
605Lys Ser Pro Ala Glu Ala Lys Ser Pro Ala Glu Ala Lys Ser Pro Ala610
615 620Glu Ala Lys Ser Pro Ala Glu Ala Lys
Ser Pro Ala Thr Val Lys Ser625 630 635
640Pro Gly Glu Ala Lys Ser Pro Ser Glu Ala Lys Ser Pro Ala
Glu Ala645 650 655Lys Ser Pro Ala Glu Ala
Lys Ser Pro Ala Glu Ala Lys Ser Pro Ala660 665
670Glu Val Lys Ser Pro Gly Glu Ala Lys Ser Pro Ala Glu Pro Lys
Ser675 680 685Pro Ala Glu Ala Lys Ser Pro
Ala Glu Val Lys Ser Pro Ala Glu Ala690 695
700Lys Ser Pro Ala Glu Val Lys Ser Pro Gly Glu Ala Lys Ser Pro Ala705
710 715 720Ala Val Lys Ser
Pro Ala Glu Ala Lys Ser Pro Ala Ala Val Lys Ser725 730
735Pro Gly Glu Ala Lys Ser Pro Gly Glu Ala Lys Ser Pro Ala
Glu Ala740 745 750Lys Ser Pro Ala Glu Ala
Lys Ser Pro Ile Glu Val Lys Ser Pro Glu755 760
765Lys Ala Lys Thr Pro Val Lys Glu Gly Ala Lys Ser Pro Ala Glu
Ala770 775 780Lys Ser Pro Glu Lys Ala Lys
Ser Pro Val Lys Glu Asp Ile Lys Pro785 790
795 800Pro Ala Glu Ala Lys Ser Pro Glu Lys Ala Lys Ser
Pro Val Lys Glu805 810 815Gly Ala Lys Pro
Pro Glu Lys Ala Lys Pro Leu Asp Val Lys Ser Pro820 825
830Glu Ala Gln Thr Pro Val Gln Glu Glu Ala Thr Val Pro Thr
Asp Ile835 840 845Arg Pro Pro Glu Gln Val
Lys Ser Pro Ala Lys Glu Lys Ala Lys Ser850 855
860Pro Glu Lys Glu Glu Ala Lys Thr Ser Glu Lys Val Ala Pro Lys
Lys865 870 875 880Glu Glu
Val Lys Ser Pro Val Lys Glu Glu Val Lys Ala Lys Glu Pro885
890 895Pro Lys Lys Val Glu Glu Glu Lys Thr Leu Pro Thr
Pro Lys Thr Glu900 905 910Ala Lys Glu Ser
Lys Lys Asp Glu Ala Pro Lys Glu Ala Pro Lys Pro915 920
925Lys Val Glu Glu Lys Lys Glu Thr Pro Thr Glu Lys Pro Lys
Asp Ser930 935 940Thr Ala Glu Ala Lys Lys
Glu Glu Ala Gly Glu Lys Lys Lys Ala Val945 950
955 960Ala Ser Glu Glu Glu Thr Pro Ala Lys Leu Gly
Val Lys Glu Glu Ala965 970 975Lys Pro Lys
Glu Lys Thr Glu Thr Thr Lys Thr Glu Ala Glu Asp Thr980
985 990Lys Ala Lys Glu Pro Ser Lys Pro Thr Glu Thr Glu
Lys Pro Lys Lys995 1000 1005Glu Glu Met
Pro Ala Ala Pro Glu Lys Lys Asp Thr Lys Glu Glu Lys1010
1015 1020Thr Thr Glu Ser Arg Lys Pro Glu Glu Lys Pro Lys
Met Glu Ala Lys1025 1030 1035
1040Val Lys Glu Asp Asp Lys Ser Leu Ser Lys Glu Pro Ser Lys Pro Lys1045
1050 1055Thr Glu Lys Ala Glu Lys Ser Ser Ser
Thr Asp Gln Lys Glu Ser Gln1060 1065
1070Pro Pro Glu Lys Thr Thr Glu Asp Lys Ala Thr Lys Gly Glu Lys1075
1080 1085
SEQ ID NO: 4; length 3219; type DNA; organism Mus musculus
atgatgagct
tcggcagcgc cgatgcgctg ctgggcgccc cgttcgcgcc gctgcacgga 60ggcggcagcc
tgcactactc gctgagccgc aaggcaggcc cgggcggcac gcgctccgcg 120gccggctcct
ccagcggctt ccactcgtgg gcgcggacgt ccgtgagctc cgtgtccgcc 180tcacccagcc
gcttccgcgg cgccgcctcg agcaccgact cgctagacac cctaagcaac 240ggcccagagg
gctgcgtggt ggcggcggtg gcggcgcgca gcgagaagga gcagctgcag 300gctctgaacg
accgcttcgc gggctacatc gacaaggtga ggcagctcga ggcgcacaac 360cgcagcctgg
agggcgaggc ggcggcgctg cggcagcaac aagccggccg cgccgccatg 420ggcgagctgt
acgagcgcga ggtgcgcgag atgcgcggcg ccgtgctgcg cctcggggcg 480gcgcgcgggc
agctgcgcct ggagcaggag cacctgctgg aggacatcgc tcacgtccgc 540cagcggctgg
acgaggaggc ccggcagcgt gaggaggcgg aggcggcggc gcgcgccctg 600gcgcgcttcg
cgcaggaggc ggaagcggcg cgcgtggagc tgcagaagaa ggcgcaggcg 660ctgcaggagg
agtgcggcta cctgcggcgc caccaccagg aggaggtggg cgagctgctc 720ggtcagatcc
agggctgcgg ggccgcgcag gcgcaggctc aggccgaggc tcgcgacgcc 780ctcaagtgcg
acgtgacgtc ggcgctgcgg gagatccgcg cgcagctcga aggccacgcg 840gtgcagagca
cgctgcagtc cgaggagtgg ttccgagtga ggttggaccg actctcagag 900gcagccaaag
tgaacacaga tgctatgcgc tcggcccaag aggagataac tgagtaccgg 960cggcagctgc
aagccaggac cacagagttg gaggccctga aaagcaccaa ggagtcactg 1020gagaggcagc
gctctgagct agaggaccgt catcaggcag acattgcctc ctaccaggac 1080gctattcagc
agctggacag tgagctgaga aacaccaagt gggagatggc tgcacagctc 1140cgagagtacc
aggacctgct caacgtcaag atggccctgg acattgagat tgccgcttac 1200agaaagctcc
tggaaggcga agagtgtcgg attggctttg gtccgagtcc cttctctctt 1260actgaaggac
tcccaaaaat tccctccata tccacgcaca taaaagtcaa aagcgaagag 1320atgataaagg
tagtagagaa atccgagaag gaaactgtga ttgtagaagg acagacagaa 1380gagatccggg
tgacggaagg agtgacagaa gaggaggaca aagaggccca aggtcaggaa 1440ggagaagaag
cagaagaggg agaagaaaaa gaagaagagg aaggagcagc agctacatct 1500ccccctgcag
aagaggctgc atctccagaa aaagaaacca agtctcgtgt gaaagaagag 1560gccaagtccc
caggtgaggc caagtcccca ggtgaggcca agtccccagg tgaggccaag 1620tccccagctg
aggccaagtc cccaggtgag gccaagtccc cacgtgaggc caagtcccca 1680ggtgaggcca
agtctccagc tgagcccaag tctccagctg agcccaagtc tccagctgag 1740gccaagtcac
cagctgagcc caagtctcca gctacagtga agtctccagg tgaggccaag 1800tcaccatctg
aggccaaatc tccagctgaa gccaaatctc cagctgaggc caaatctcca 1860gctgaggcca
aatctccagc tgaggccaag tcaccagctg aagccaagtc accagctgaa 1920gccaaatctc
cagctacagt gaagtctcca ggtgaggcca agtcaccatc tgaggccaaa 1980tctccagctg
aagccaaatc tccagctgag gccaaatctc cagctgaggc caaatctcca 2040gctgaggtca
agtcaccagg tgaggccaag tctccagctg agcccaagtc accagctgag 2100gccaaatctc
cagctgcagt gaagtcacca gctgaggcca agtctccagc tgcagtcaag 2160tccccaggtg
aggccaagtc cccaggtgag gccaagtcac cagctgaggc caaatctcca 2220gctgaggcca
agtcaccaat tgaggtaaaa tctccagaga aggccaagac ccccgtcaag 2280gaaggagcaa
aatctccagc tgaggccaag tctcctgaga aggccaagtc ccccgtgaag 2340gaagatatca
agcccccagc tgaggcgaaa tcccctgaga aggccaagag ccccatgaag 2400gaaggagcaa
agcctcctga gaaggccaag cctctagatg tgaagtctcc ggaagcccag 2460actccagtac
aggaggaagc gaacgacccc acagacatca gaccccctga gcaggtgaaa 2520agtcctgcca
aggagaaggc caagtcccct gagaaggaag aagccaagac ttctgaaaag 2580gtggctccca
agaaggaaga ggtgaagtcc cctgtgaagg aggaggtaaa agccaaagaa 2640cccccaaaga
aggtagaaga agagaagaca ctgcctacac caaagacaga ggcgaaggag 2700agtaagaaag
acgaagctcc caaggaggcc ccgaagccca aggtggagga gaagaaggaa 2760actcccacgg
aaaagcccaa ggactctaca gcagaagcca agaaggaaga ggctggagag 2820aagaagaaag
ccgtggcctc agaggaggag actcctgcca agttgggtgt gaaggaagaa 2880gctaaaccca
aagagaagac agagacaacc aagacagaag cagaagacac caaggccaaa 2940gaacctagca
aacccacaga gacggaaaag ccaaagaaag aggagatgcc agcggcacca 3000gagaagaaag
acaccaagga ggagaagacc acagagtcca ggaagcctga ggagaagccc 3060aaaatggagg
ccaaggtcaa ggaggatgac aagagccttt ccaaagagcc tagcaaaccc 3120aagacagaaa
aggctgaaaa atcctctagc acagaccaga aagaaagcca gcccccagag 3180aagaccacag
aggacaaggc caccaaggga gagaagtaa 3219
SEQ ID NO: 5; length 925; type PRT; organism Bos taurus
Ser Tyr Thr Leu Asp Ser Leu Gly Asn Pro Ser Ala Tyr Arg Arg Val1
5 10 15Thr Glu Thr Arg
Ser Ser Phe Ser Arg Ile Ser Gly Ser Pro Ser Ser20 25
30Gly Phe Arg Ser Gln Ser Trp Ser Arg Gly Ser Pro Ser Thr
Val Ser35 40 45Ser Ser Tyr Lys Arg Ser
Ala Leu Ala Pro Arg Leu Thr Tyr Ser Ser50 55
60Ala Met Leu Ser Ser Ala Glu Ser Ser Leu Asp Phe Ser Gln Ser Ser65
70 75 80Ser Leu Leu Asp
Gly Gly Ser Gly Pro Gly Gly Asp Tyr Lys Leu Ser85 90
95Arg Ser Asn Glu Lys Glu Gln Ile Gln Gly Leu Asn Asp Arg
Phe Ala100 105 110Gly Tyr Ile Glu Lys Val
His Tyr Leu Glu Gln Gln Asn Lys Glu Ile115 120
125Glu Ala Glu Ile Gln Ala Leu Arg Gln Lys Gln Ala Ser His Ala
Gln130 135 140Leu Gly Asp Ala Tyr Asp Gln
Glu Ile Arg Glu Leu Arg Ala Thr Leu145 150
155 160Glu Met Val Asn His Glu Lys Ala Gln Val Gln Leu
Asp Ser Asp His165 170 175Leu Glu Glu Asp
Ile His Arg Leu Lys Glu Arg Phe Glu Glu Glu Ala180 185
190Arg Leu Arg Asp Asp Thr Glu Ala Ala Ile Arg Ala Leu Arg
Lys Asp195 200 205Ile Glu Glu Ser Ser Leu
Val Lys Val Glu Leu Asp Lys Lys Val Gln210 215
220Ser Leu Gln Asp Glu Val Ala Phe Leu Arg Ser Asn His Glu Glu
Glu225 230 235 240Val Ala
Asp Leu Leu Ala Gln Ile Gln Ala Ser His Ile Thr Val Glu245
250 255Arg Lys Asp Tyr Leu Lys Thr Asp Ile Ser Thr Ala
Leu Lys Glu Ile260 265 270Arg Ser Gln Leu
Glu Ser His Ser Asp Gln Asn Met His Gln Ala Glu275 280
285Glu Trp Phe Lys Cys Arg Tyr Ala Lys Leu Thr Glu Ala Ala
Glu Gln290 295 300Asn Lys Glu Ala Ile Arg
Ser Ala Lys Glu Glu Ile Ala Glu Tyr Arg305 310
315 320Arg Gln Leu Gln Ser Lys Ser Ile Glu Leu Glu
Ser Val Arg Gly Thr325 330 335Lys Glu Ser
Leu Glu Arg Gln Leu Ser Asp Ile Glu Glu Arg His Asn340
345 350His Asp Leu Ser Ser Tyr Gln Asp Thr Ile Gln Gln
Leu Glu Asn Glu355 360 365Leu Arg Gly Thr
Lys Trp Glu Met Ala Arg His Leu Arg Glu Tyr Gln370 375
380Asp Leu Leu Asn Val Lys Met Ala Leu Asp Ile Glu Ile Ala
Ala Tyr385 390 395 400Arg
Lys Leu Leu Glu Gly Glu Glu Thr Arg Phe Ser Thr Phe Ala Gly405
410 415Ser Ile Thr Gly Pro Leu Tyr Thr His Arg Gln
Pro Ser Ile Ala Ile420 425 430Ser Ser Lys
Ile Gln Lys Thr Lys Val Glu Ala Pro Lys Leu Lys Val435
440 445Gln His Lys Phe Val Glu Glu Ile Ile Glu Glu Thr
Lys Val Glu Asp450 455 460Glu Lys Ser Glu
Met Glu Glu Ala Leu Thr Ala Ile Thr Glu Glu Leu465 470
475 480Ala Val Ser Val Lys Glu Glu Val Lys
Glu Glu Glu Ala Glu Glu Lys485 490 495Glu
Glu Lys Glu Glu Ala Glu Glu Glu Val Val Ala Ala Lys Lys Ser500
505 510Pro Val Lys Ala Thr Ala Pro Glu Leu Lys Glu
Glu Glu Gly Glu Lys515 520 525Glu Glu Glu
Glu Gly Gln Glu Glu Glu Glu Glu Glu Glu Glu Ala Ala530
535 540Lys Ser Asp Gln Ala Glu Glu Gly Gly Ser Glu Lys
Glu Gly Ser Ser545 550 555
560Glu Lys Glu Glu Gly Glu Gln Glu Glu Glu Gly Glu Thr Glu Ala Glu565
570 575Gly Glu Gly Glu Glu Ala Ala Ala Glu
Ala Lys Glu Glu Lys Lys Met580 585 590Glu
Glu Lys Ala Glu Glu Val Ala Pro Lys Glu Glu Leu Ala Ala Glu595
600 605Ala Lys Val Glu Lys Pro Glu Lys Ala Lys Ser
Pro Val Ala Lys Ser610 615 620Pro Thr Thr
Lys Ser Pro Thr Ala Lys Ser Pro Glu Ala Lys Ser Pro625
630 635 640Glu Ala Lys Ser Pro Thr Ala
Lys Ser Pro Thr Ala Lys Ser Pro Val645 650
655Ala Lys Ser Pro Thr Ala Lys Ser Pro Glu Ala Lys Ser Pro Glu Ala660
665 670Lys Ser Pro Thr Ala Lys Ser Pro Thr
Ala Lys Ser Pro Ala Ala Lys675 680 685Ser
Pro Ala Pro Lys Ser Pro Val Glu Glu Val Lys Pro Lys Ala Glu690
695 700Ala Gly Ala Glu Lys Gly Glu Gln Lys Glu Lys
Val Glu Glu Glu Lys705 710 715
720Lys Glu Ala Lys Glu Ser Pro Lys Glu Glu Lys Ala Glu Lys Lys
Glu725 730 735Glu Lys Pro Lys Asp Val Pro
Glu Lys Lys Lys Ala Glu Ser Pro Val740 745
750Lys Ala Glu Ser Pro Val Lys Glu Glu Val Pro Ala Lys Pro Val Lys755
760 765Val Ser Pro Glu Lys Glu Ala Lys Glu
Glu Glu Lys Pro Gln Glu Lys770 775 780Glu
Lys Glu Lys Glu Lys Val Glu Glu Val Gly Gly Lys Glu Glu Gly785
790 795 800Gly Leu Lys Glu Ser Arg
Lys Glu Asp Ile Ala Ile Asn Gly Glu Val805 810
815Glu Gly Lys Glu Glu Glu Gln Glu Thr Lys Glu Lys Gly Ser Gly
Gly820 825 830Glu Glu Glu Lys Gly Val Val
Thr Asn Gly Leu Asp Val Ser Pro Gly835 840
845Asp Glu Lys Lys Gly Gly Asp Lys Ser Glu Glu Lys Val Val Val Thr850
855 860Lys Met Val Glu Lys Ile Thr Ser Glu
Gly Gly Asp Gly Ala Thr Lys865 870 875
880Tyr Ile Thr Lys Ser Val Thr Val Thr Gln Lys Val Glu Glu
His Glu885 890 895Glu Thr Phe Glu Glu Lys
Leu Val Ser Thr Lys Lys Val Glu Lys Val900 905
910Thr Ser His Ala Ile Val Lys Glu Val Thr Gln Ser Asp915
920 925
SEQ ID NO: 6; length 2433; type DNA; organism Bos taurus
gagaaggtcc actacctgga
gcagcagaac aaggagatcg aggcagagat ccaggcgctg 60cggcagaagc aggcctcgca
cgcccagctg ggcgacgcgt acgaccagga aatccgcgag 120ctacgcgcca ccctggagat
ggtgaaccat gagaaggctc aggtacagct ggactcggac 180cacctggaag aggatatcca
ccggctcaag gagcgcttcg aggaggaggc acggctgcgc 240gacgacaccg aggcggctat
ccgcgcgctg cgcaaagata tcgaggagtc gtcgctggtc 300aaggtggagc tggacaagaa
ggtgcagtcg ctgcaggatg aggtggcctt cctgcggagc 360aatcacgagg aggaggtggc
cgacctgctg gcccagatcc aagcgtcgca catcacggtg 420gagcgcaaag actacctgaa
gacggacatc tcgacggcgc tgaaagagat ccgctcccag 480ctcgagagtc actccgacca
gaacatgcac caggccgaag agtggtttaa gtgccgctac 540gccaagctca ccgaggcggc
cgagcagaac aaggaagcca tccgctccgc caaggaagag 600atcgccgagt accggcgcca
gctgcagtcc aagagcatcg agctcgagtc agtgcgcggc 660accaaggagt ccctggagcg
gcagctcagc gacatcgagg agcgccacaa ccacgacctt 720agcagctacc aggacaccat
ccagcagctg gaaaatgagc ttcggggcac aaagtgggaa 780atggctcgtc atctgcgaga
ataccaggat ctcctcaacg tcaagatggc tctggatatt 840gagatcgcgg cgtacaggaa
actcctggag ggtgaagaga ccagatttag cacatttgcg 900ggtagcatca ctgggccact
gtatacacac cgacagccct ccatcgccat atccagtaag 960attcagaaaa ccaaggtaga
ggctcccaag ctaaaggtcc aacacaaatt tgttgaggag 1020attatagagg aaaccaaggt
ggaagatgag aaatcagaaa tggaagaagc cctgacggcc 1080attaccgagg aattggccgt
ttccgtgaaa gaggaggtca aggaagagga ggctgaagaa 1140aaggaggaga aagaagaagc
cgaagaagaa gttgttgctg ccaaaaagtc tccagtgaaa 1200gctactgcac ctgaacttaa
agaagaggaa ggagaaaagg aggaggaaga gggccaagag 1260gaagaggaag aggaagaaga
ggctgctaag tcagaccaag ccgaggaagg aggatctgag 1320aaggaaggtt ctagtgaaaa
agaggaaggt gagcaagaag aggaaggaga aacagaggct 1380gagggggaag gagaggaagc
cgctgccgaa gctaaggagg aaaagaaaat ggaggaaaag 1440gctgaagaag tggctccaaa
ggaggagctg gcggcagaag ccaaggtgga gaagccagag 1500aaagccaagt ccccagtggc
caagtcccca acaacaaagt ccccaacggc caagtcccca 1560gaggcaaagt ccccagaggc
aaagtcccca acagcaaaat ccccgacggc caagtcccca 1620gtggccaagt ccccgacggc
caagtcccca gaggcaaagt ccccagaggc aaagtcccca 1680acagcaaaat ccccgacggc
caagtcccca gcagcaaagt ccccagcgcc aaaatcacct 1740gtggaggaag tgaaacccaa
agcagaagct ggagctgaga aaggagaaca gaaggagaag 1800gtggaggaag aaaagaaaga
agcaaaggaa tctcccaagg aagagaaggc agagaaaaag 1860gaggagaagc caaaggatgt
gccagagaag aagaaggctg aatccccagt gaaggctgag 1920tccccagtga aggaggaggt
gcctgccaag ccagtaaagg tgagcccaga gaaggaagcc 1980aaagaggagg agaagccaca
ggagaaagag aaggagaagg agaaagtgga agaggtggga 2040gggaaggagg agggaggttt
gaaggaatcc aggaaggaag acatagccat caatggggag 2100gtggaaggga aggaggaaga
acaggaaact aaggagaaag gcagtggggg agaagaggag 2160aaaggagtcg tcaccaacgg
cctagacgtg agcccagggg atgaaaagaa gggcggtgat 2220aaaagtgagg agaaagtggt
ggtaaccaaa atggtggaaa aaatcaccag tgagggggga 2280gatggtgcta ccaagtatat
caccaaatct gtaaccgtca ctcaaaaggt cgaagagcat 2340gaagagacct ttgaggagaa
actagtgtct actaaaaagg tagagaaagt cacttcacac 2400gccatagtaa aggaagtcac
ccagagtgac taa 2433
SEQ ID NO: 7; length 857; type PRT; organism Gallus gallus
Ser Tyr Thr Met Glu Pro Leu Gly Asn Pro Ser Tyr Arg Arg Val Met1
5 10 15Thr Glu Thr Arg Ala Thr
Tyr Ser Arg Ala Ser Ala Ser Pro Ser Ser20 25
30Gly Phe Arg Ser Gln Ser Trp Ser Arg Gly Ser Gly Ser Thr Val Ser35
40 45Ser Ser Tyr Lys Arg Thr Asn Leu Gly
Ala Pro Arg Thr Ala Tyr Gly50 55 60Ser
Thr Val Leu Ser Ser Ala Glu Ser Leu Asp Val Ser Gln Ser Ser65
70 75 80Leu Leu Asn Gly Ala Ala
Glu Leu Lys Leu Ser Arg Ser Asn Glu Lys85 90
95Glu Gln Leu Gln Gly Leu Asn Asp Arg Phe Ala Gly Tyr Ile Glu Lys100
105 110Val His Tyr Leu Glu Gln Gln Asn
Lys Glu Ile Glu Ala Glu Leu Ala115 120
125Ala Leu Arg Gln Lys His Ala Gly Arg Ala Gln Leu Gly Asp Ala Tyr130
135 140Glu Gln Glu Leu Arg Glu Leu Arg Gly
Ala Leu Glu Gln Val Ser His145 150 155
160Glu Lys Ala Gln Ile Gln Leu Asp Ser Glu His Ile Glu Glu
Asp Ile165 170 175Gln Arg Leu Arg Glu Arg
Phe Glu Asp Glu Ala Arg Leu Arg Asp Glu180 185
190Thr Glu Ala Thr Ile Ala Ala Leu Arg Lys Glu Met Glu Glu Ala
Ser195 200 205Leu Met Arg Ala Glu Leu Asp
Lys Lys Val Gln Ser Leu Gln Asp Glu210 215
220Val Ala Phe Leu Arg Gly Asn His Glu Glu Glu Val Ala Glu Leu Leu225
230 235 240Ala Gln Leu Gln
Ala Ser His Ala Thr Val Glu Arg Lys Asp Tyr Leu245 250
255Lys Thr Asp Leu Thr Thr Ala Leu Lys Glu Ile Arg Ala Gln
Leu Glu260 265 270Cys Gln Ser Asp His Asn
Met His Gln Ala Glu Glu Trp Phe Lys Cys275 280
285Arg Tyr Ala Lys Leu Thr Glu Ala Ala Glu Gln Asn Lys Glu Ala
Ile290 295 300Arg Ser Ala Lys Glu Glu Ile
Ala Glu Tyr Arg Arg Gln Leu Gln Ser305 310
315 320Lys Ser Ile Glu Leu Glu Ser Val Arg Gly Thr Lys
Glu Ser Leu Glu325 330 335Arg Gln Leu Ser
Asp Ile Glu Glu Arg His Asn Asn Asp Leu Thr Thr340 345
350Tyr Gln Asp Thr Ile His Gln Leu Glu Asn Glu Leu Arg Gly
Thr Lys355 360 365Trp Glu Met Ala Arg His
Leu Arg Glu Tyr Gln Asp Leu Leu Asn Val370 375
380Lys Met Ala Leu Asp Ile Glu Ile Ala Ala Tyr Arg Lys Leu Leu
Glu385 390 395 400Gly Glu
Glu Thr Arg Phe Ser Ala Phe Ser Gly Ser Ile Thr Gly Pro405
410 415Ile Phe Thr His Arg Gln Pro Ser Val Thr Ile Ala
Ser Thr Lys Ile420 425 430Gln Lys Thr Lys
Ile Glu Pro Pro Lys Leu Lys Val Gln His Lys Phe435 440
445Val Glu Glu Ile Ile Glu Glu Thr Lys Val Glu Asp Glu Lys
Ser Glu450 455 460Met Glu Asp Ala Leu Ser
Ala Ile Ala Glu Glu Met Ala Ala Lys Ala465 470
475 480Gln Glu Glu Glu Gln Glu Glu Glu Lys Ala Glu
Glu Glu Ala Val Glu485 490 495Glu Glu Ala
Val Ser Glu Lys Ala Ala Glu Gln Ala Ala Glu Glu Glu500
505 510Glu Lys Glu Glu Glu Glu Ala Glu Glu Glu Glu Ala
Ala Lys Ser Asp515 520 525Ala Ala Glu Glu
Gly Gly Ser Lys Lys Glu Glu Ile Glu Glu Lys Glu530 535
540Glu Gly Glu Glu Ala Glu Glu Glu Glu Ala Glu Ala Lys Gly
Lys Ala545 550 555 560Glu
Glu Ala Gly Ala Lys Val Glu Lys Val Lys Ser Pro Pro Ala Lys565
570 575Ser Pro Pro Lys Ser Pro Pro Lys Ser Pro Val
Thr Glu Gln Ala Lys580 585 590Ala Val Gln
Lys Ala Ala Ala Glu Val Gly Lys Asp Gln Lys Ala Glu595
600 605Lys Ala Ala Glu Lys Ala Ala Lys Glu Glu Lys Ala
Ala Ser Pro Glu610 615 620Lys Pro Ala Thr
Pro Lys Val Thr Ser Pro Glu Lys Pro Ala Thr Pro625 630
635 640Glu Lys Pro Pro Thr Pro Glu Lys Ala
Ile Thr Pro Glu Lys Val Arg645 650 655Ser
Pro Glu Lys Pro Thr Thr Pro Glu Lys Val Val Ser Pro Glu Lys660
665 670Pro Ala Ser Pro Glu Lys Pro Arg Thr Pro Glu
Lys Pro Ala Ser Pro675 680 685Glu Lys Pro
Ala Thr Pro Glu Lys Pro Arg Thr Pro Glu Lys Pro Ala690
695 700Thr Pro Glu Lys Pro Arg Ser Pro Glu Lys Pro Ser
Ser Pro Leu Lys705 710 715
720Asp Glu Lys Ala Val Val Glu Glu Ser Ile Thr Val Thr Lys Val Thr725
730 735Lys Val Thr Ala Glu Val Glu Val Ser
Lys Glu Ala Arg Lys Glu Asp740 745 750Ile
Ala Val Asn Gly Glu Val Glu Glu Lys Lys Asp Glu Ala Lys Glu755
760 765Lys Glu Ala Glu Glu Glu Glu Lys Gly Val Val
Thr Asn Gly Leu Asp770 775 780Val Ser Pro
Val Asp Glu Lys Gly Glu Lys Val Val Val Thr Lys Lys785
790 795 800Ala Glu Lys Ile Thr Ser Glu
Gly Gly Asp Ser Thr Thr Thr Tyr Ile805 810
815Thr Lys Ser Val Thr Val Thr Gln Lys Val Glu Glu His Glu Glu Ser820
825 830Phe Glu Glu Lys Leu Val Ser Thr Lys
Lys Val Glu Lys Val Thr Ser835 840 845His
Ala Val Val Lys Glu Ile Lys Glu850 855
SEQ ID NO: 8; length 1530; type DNA; organism Gallus gallus
gacctcacca cctatcagga cacgatccat cagctggaaa atgagctcag aggaacgaag
60tgggagatgg cacgtcattt gagggagtac caggatctcc tcaatgtcaa gatggccctg
120gatatcgaaa ttgctgcata caggaagctg ctggagggtg aggagacaag attcagtgcc
180ttctctggaa gcatcactgg acccatattc acacacagac aaccatcggt cacaatagca
240tccactaaaa tacagaaaac caaaatcgag ccaccaaagc tgaaggtcca gcacaaattt
300gtagaagaaa tcattgaaga gacgaaagta gaggatgaga agtctgaaat ggaagatgcc
360ctctcagcca ttgcagaaga aatggcagca aaggctcagg aggaagaaca ggaggaggaa
420aaggcagaag aagaagctgt agaggaagaa gctgtttctg agaaggctgc agaacaggca
480gctgaggaag aagagaagga ggaagaagaa gcagaggagg aagaagctgc aaaatcagac
540gctgcagaag aaggaggctc taaaaaggaa gaaatagagg aaaaggaaga aagggaggag
600gctgaagaag aagaagctga agccaagggc aaagctgaag aggcaggtgc aaaggtagaa
660aaagtgaaat cacctcctgc aaagtcaccc cctaaatccc cccctaaatc ccctgtaaca
720gagcaagcca aggccgtcca gaaagcagca gcagaggtag gaaaggatca gaaagcagag
780aaagctgctg agaaggcagc caaggaggag aaggcagcat ccccagagaa gccggcgaca
840ccaaaggtga cctccccgga gaaaccagcg actccggaga aaccaccaac cccagagaaa
900gcgatcaccc cggagaaggt ccgttcccca gaaaaaccaa caaccccgga aaaagtggtg
960agcccagaga aaccagcaag cccagagaag ccccgaaccc cagagaaacc agcaagcccc
1020gaaaaaccgg caacaccaga gaagccccgc actcctgaaa agccagcgac gccggagaag
1080ccccgttctc cagagaagcc atcctccccg ctcaaagatg aaaaggctgt ggtggaggag
1140agcatcactg tcacaaaggt aacaaaagtc actgcagagg tggaggtgtc gaaggaagcc
1200aggaaagaag acattgcggt gaatggtgaa gtggaggaga agaaggatga ggcgaaggag
1260aaggaggctg aggaggaaga gaagggcgtt gtcaccaatg ggctcgatgt gagccccgtc
1320gatgagaagg gtgagaaagt tgtagtaacc aaaaaagcag agaaaatcac aagtgaagga
1380ggggacagta ctaccacgta catcacgaag tcggtgacgg tcactcagaa ggtggaggaa
1440cacgaagaga gctttgagga gaaattggtg tccactaaga aagtggagaa agttacttca
1500catgctgtag taaaagagat taaagaatga
1530
SEQ ID NO: 9; length 915; type PRT; organism Homo sapiens
Ser Tyr Thr Leu Asp Ser Leu Gly Asn Pro Ser Ala
Tyr Arg Arg Val1 5 10
15Thr Glu Thr Arg Ser Ser Phe Ser Arg Val Ser Gly Ser Pro Ser Ser20
25 30Gly Phe Arg Ser Gln Ser Trp Ser Arg Gly
Ser Pro Ser Thr Val Ser35 40 45Ser Ser
Tyr Lys Arg Ser Met Leu Ala Pro Arg Leu Ala Tyr Ser Ser50
55 60Ala Met Leu Ser Ser Ala Glu Ser Ser Leu Asp Phe
Ser Gln Ser Ser65 70 75
80Ser Leu Leu Asn Gly Gly Ser Gly Pro Gly Gly Asp Tyr Lys Leu Ser85
90 95Arg Ser Asn Glu Lys Glu Gln Leu Gln Gly
Leu Asn Asp Arg Phe Ala100 105 110Gly Tyr
Ile Glu Lys Val His Tyr Leu Glu Gln Gln Asn Lys Glu Ile115
120 125Glu Ala Glu Ile Gln Ala Leu Arg Gln Lys Gln Ala
Ser His Ala Gln130 135 140Leu Gly Asp Ala
Tyr Asp Gln Glu Ile Arg Glu Leu Arg Ala Thr Leu145 150
155 160Glu Met Val Asn His Glu Lys Ala Gln
Val Gln Leu Asp Ser Asp His165 170 175Leu
Glu Glu Asp Ile His Arg Leu Lys Glu Arg Phe Glu Glu Glu Ala180
185 190Arg Leu Arg Asp Asp Thr Glu Ala Ala Ile Arg
Ala Leu Arg Lys Asp195 200 205Ile Glu Glu
Ala Ser Leu Val Lys Val Glu Leu Asp Lys Lys Val Gln210
215 220Ser Leu Gln Asp Glu Val Ala Phe Leu Arg Ser Asn
His Glu Glu Glu225 230 235
240Val Ala Asp Leu Leu Ala Gln Ile Gln Ala Ser His Ile Thr Val Glu245
250 255Arg Lys Asp Tyr Leu Lys Thr Asp Ile
Ser Thr Ala Leu Lys Glu Ile260 265 270Arg
Ser Gln Leu Glu Ser His Ser Asp Gln Asn Met His Gln Ala Glu275
280 285Glu Trp Phe Lys Cys Arg Tyr Ala Lys Leu Thr
Glu Ala Ala Glu Gln290 295 300Asn Lys Glu
Ala Ile Arg Ser Ala Lys Glu Glu Ile Ala Glu Tyr Arg305
310 315 320Arg Gln Leu Gln Ser Lys Ser
Ile Glu Leu Glu Ser Val Arg Gly Thr325 330
335Lys Glu Ser Leu Glu Arg Gln Leu Ser Asp Ile Glu Glu Arg His Asn340
345 350His Asp Leu Ser Ser Tyr Gln Asp Thr
Ile Gln Gln Leu Glu Asn Glu355 360 365Leu
Arg Gly Thr Lys Trp Glu Met Ala Arg His Leu Arg Glu Tyr Gln370
375 380Asp Leu Leu Asn Val Lys Met Ala Leu Asp Ile
Glu Ile Ala Ala Tyr385 390 395
400Arg Lys Leu Leu Glu Gly Glu Glu Thr Arg Phe Ser Thr Phe Ala
Gly405 410 415Ser Ile Thr Gly Pro Leu Tyr
Thr His Arg Pro Pro Ile Thr Ile Ser420 425
430Ser Lys Ile Gln Lys Thr Lys Val Glu Ala Pro Lys Leu Lys Val Gln435
440 445His Lys Phe Val Glu Glu Ile Ile Glu
Glu Thr Lys Val Glu Asp Glu450 455 460Lys
Ser Glu Met Glu Glu Ala Leu Thr Ala Ile Thr Glu Glu Leu Ala465
470 475 480Ala Ser Met Lys Glu Glu
Lys Lys Glu Ala Ala Glu Glu Lys Glu Glu485 490
495Glu Pro Glu Ala Glu Glu Glu Glu Val Ala Ala Lys Lys Ser Pro
Val500 505 510Lys Ala Thr Ala Pro Glu Val
Lys Glu Glu Glu Gly Glu Lys Glu Glu515 520
525Glu Glu Gly Gln Glu Glu Glu Glu Glu Glu Asp Glu Gly Ala Lys Ser530
535 540Asp Gln Ala Glu Glu Gly Gly Ser Glu
Lys Glu Gly Ser Ser Glu Lys545 550 555
560Glu Glu Gly Glu Gln Glu Glu Gly Glu Thr Glu Ala Glu Ala
Glu Gly565 570 575Glu Glu Ala Glu Ala Lys
Glu Glu Lys Lys Val Glu Glu Lys Ser Glu580 585
590Glu Val Ala Thr Lys Glu Glu Leu Val Ala Asp Ala Lys Val Glu
Lys595 600 605Pro Glu Lys Ala Lys Ser Pro
Val Pro Lys Ser Pro Val Glu Glu Lys610 615
620Gly Lys Ser Pro Val Pro Lys Ser Pro Val Glu Glu Lys Gly Lys Ser625
630 635 640Pro Val Pro Lys
Ser Pro Val Glu Glu Lys Gly Lys Ser Pro Val Pro645 650
655Lys Ser Pro Val Glu Glu Lys Gly Lys Ser Pro Val Ser Lys
Ser Pro660 665 670Val Glu Glu Lys Ala Lys
Ser Pro Val Pro Lys Ser Pro Val Glu Glu675 680
685Ala Lys Ser Lys Ala Glu Val Gly Lys Gly Glu Gln Lys Glu Glu
Glu690 695 700Glu Lys Glu Val Lys Glu Ala
Pro Lys Glu Glu Lys Val Glu Lys Lys705 710
715 720Glu Glu Lys Pro Lys Asp Val Pro Glu Lys Lys Lys
Ala Glu Ser Pro725 730 735Val Lys Glu Glu
Ala Val Ala Glu Val Val Thr Ile Thr Lys Ser Val740 745
750Lys Val His Leu Glu Lys Glu Thr Lys Glu Glu Gly Lys Pro
Leu Gln755 760 765Gln Glu Lys Glu Lys Glu
Lys Ala Gly Gly Glu Gly Gly Ser Glu Glu770 775
780Glu Gly Ser Asp Lys Gly Ala Lys Gly Ser Arg Lys Glu Asp Ile
Ala785 790 795 800Val Asn
Gly Glu Val Glu Gly Lys Glu Glu Val Glu Gln Glu Thr Lys805
810 815Glu Lys Gly Ser Gly Arg Glu Glu Glu Lys Gly Val
Val Thr Asn Gly820 825 830Leu Asp Leu Ser
Pro Ala Asp Glu Lys Lys Gly Gly Asp Lys Ser Glu835 840
845Glu Lys Val Val Val Thr Lys Thr Val Glu Lys Ile Thr Ser
Glu Gly850 855 860Gly Asp Gly Ala Thr Lys
Tyr Ile Thr Lys Ser Val Thr Val Thr Gln865 870
875 880Lys Val Glu Glu His Glu Glu Thr Phe Glu Glu
Lys Leu Val Ser Thr885 890 895Lys Lys Val
Glu Lys Val Thr Ser His Ala Ile Val Lys Glu Val Thr900
905 910Gln Ser Asp915
<210> 10
<211> 2751
<212> DNA
<213> Homo sapiens
<400> 10
atgagctaca
cgttggactc gctgggcaac ccgtccgcct accggcgggt aaccgagacc 60cgctcgagct
tcagccgcgt cagcggctcc ccgtccagtg gcttccgctc gcagtcgtgg 120tcccgcggct
cgcccagcac cgtgtcctcc tcctataagc gcagcatgct cgccccgcgc 180ctcgcttaca
gctcggccat gctcagctcc gccgagagca gccttgactt cagccagtcc 240tcgtccctgc
tcaacggcgg ctccggaccc ggcggcgact acaagctgtc ccgctccaac 300gagaaggagc
agctgcaggg gctgaacgac cgctttgccg gctacataga gaaggtgcac 360tacctggagc
agcagaataa ggagattgag gcggagatcc aggcgctgcg gcagaagcag 420gcctcgcacg
cccagctggg cgacgcgtac gaccaggaga tccgcgagct gcgcgccacc 480ctggagatgg
tgaaccacga gaaggctcag gtgcagctgg actcggacca cctggaggaa 540gacatccacc
ggctcaagga gcgctttgag gaggaggcgc ggttgcggga cgacactgag 600gcggccatcc
gggcgctgcg caaagacatc gaggaggcgt cgctggtcaa ggtggagctg 660gacaagaagg
tgcagtcgct gcaggatgag gtggccttcc tgcggagcaa ccacgaggag 720gaggtggccg
accttctggc ccagatccag gcatcgcaca tcacggtgga gcgcaaagac 780tacctgaaga
cagacatctc gacggcgctg aaggaaatcc gctcccagct cgaaagccac 840tcagaccaga
atatgcacca ggccgaagag tggttcaaat gccgctacgc caagctcacc 900gaggcggccg
agcagaacaa ggaggccatc cgctccgcca aggaagagat cgccgagtac 960cggcgccagc
tgcagtccaa gagcatcgag ctagagtcgg tgcgcggcac caaggagtcc 1020ctggagcggc
agctcagcga catcgaggag cgccacaacc acgacctcag cagctaccag 1080gacaccatcc
agcagctgga aaatgagctt cggggcacaa agtgggaaat ggctcgtcat 1140ttgcgcgaat
accaggacct cctcaacgtc aagatggctc tggatataga aatcgctgcg 1200tacagaaaac
tcctggaggg tgaagagact agatttagca catttgcagg aagcatcact 1260gggccactgt
atacacaccg acccccaatc acaatatcca gtaagattca gaaaaccaag 1320gtggaagctc
ccaagcttaa ggtccaacac aaatttgtcg aggagatcat agaggaaacc 1380aaagtggagg
atgagaagtc agaaatggaa gaggccctga cagccattac agaggaattg 1440gccgcttcca
tgaaggaaga gaagaaagaa gcagcagaag aaaaggaaga ggaacccgaa 1500gctgaagaag
aagaagtagc tgccaaaaag tctccagtga aagcaactgc acctgaagtt 1560aaagaagagg
aaggggaaaa ggaggaagaa gaaggccagg aagaagagga ggaagaagat 1620gagggagcta
agtcagacca agccgaagag ggaggatccg agaaggaagg ctctagtgaa 1680aaagaggaag
gtgagcagga agaaggagaa acagaagctg aagctgaagg agaggaagcc 1740gaagctaaag
aggaaaagaa agtggaggaa aagagtgagg aagtggctac caaggaggag 1800ctggtggcag
atgccaaggt ggaaaagcca gaaaaagcca agtctcctgt gccaaaatca 1860ccagtggaag
agaaaggcaa gtctcctgtg cccaagtcac cagtggaaga gaaaggcaag 1920tctcctgtgc
ccaagtcacc agtggaagag aaaggcaagt ctcctgtgcc gaaatcacca 1980gtggaagaga
aaggcaagtc tcctgtgtca aaatcaccag tggaagagaa agccaaatct 2040cctgtgccaa
aatcaccagt ggaagaggca aagtcaaaag cagaagtggg gaaaggtgaa 2100cagaaagagg
aagaagaaaa ggaagtcaag gaagctccca aggaagagaa ggtagagaaa 2160aaggaagaga
aaccaaagga tgtgccagag aagaagaaag ctgagtcccc tgtaaaggag 2220gaagctgtgg
cagaggtggt caccatcacc aaatcggtaa aggtgcactt ggagaaagag 2280accaaagaag
aggggaagcc actgcagcag gagaaagaga aggagaaagc gggaggagag 2340ggaggaagtg
aggaggaagg gagtgataaa ggtgccaagg gatccaggaa ggaagacata 2400gctgtcaatg
gggaggtaga aggaaaagag gaggtagagc aggagaccaa ggaaaaaggc 2460agtgggaggg
aagaggagaa aggcgttgtc accaatggcc tagacttgag cccagcagat 2520gaaaagaagg
ggggtgataa aagtgaggag aaagtggtgg tgaccaaaac ggtagaaaaa 2580atcaccagtg
aggggggaga tggtgctacc aaatacatca ctaaatctgt aaccgtcact 2640caaaaggttg
aagagcatga agagaccttt gaggagaaac tagtgtctac taaaaaggta 2700gaaaaagtca
cttcacacgc catagtaaag gaagtcaccc agagtgacta a 2751
<210> 11
<211> 848
<212> PRT
<213> Mus musculus
<400> 11
Ser Tyr Thr Leu Asp Ser Leu Gly Asn Pro Ser Ala Tyr Arg Arg
Val1 5 10 15Pro Thr Glu
Thr Arg Ser Ser Phe Ser Arg Val Ser Gly Ser Pro Ser20 25
30Ser Gly Phe Arg Ser Gln Ser Trp Ser Arg Gly Ser Pro
Ser Thr Val35 40 45Ser Ser Ser Tyr Thr
Arg Ser Ala Val Ala Pro Arg Leu Ala Tyr Ser50 55
60Ser Ala Met Leu Ser Ser Ala Glu Ser Ser Leu Asp Phe Ser Gln
Ser65 70 75 80Ser Ser
Leu Leu Asn Gly Gly Ser Gly Gly Asp Tyr Lys Leu Ser Arg85
90 95Ser Asn Glu Lys Glu Gln Leu Gln Gly Leu Asn Asp
Arg Phe Ala Gly100 105 110Tyr Ile Glu Lys
Val His Tyr Leu Glu Gln Gln Asn Lys Glu Ile Glu115 120
125Ala Glu Ile Gln Ala Leu Arg Gln Lys Gln Ala Ser His Ala
Gln Leu130 135 140Gly Asp Ala Tyr Asp Gln
Glu Ile Arg Glu Leu Arg Ala Thr Leu Glu145 150
155 160Met Val Asn His Glu Lys Ala Gln Val Gln Leu
Asp Ser Asp His Leu165 170 175Glu Glu Asp
Ile His Arg Leu Lys Glu Arg Phe Glu Glu Glu Ala Arg180
185 190Leu Arg Asp Asp Thr Glu Ala Ala Ile Arg Ala Leu
Arg Lys Asp Ile195 200 205Glu Glu Ser Ser
Met Val Lys Val Glu Leu Asp Lys Lys Val Gln Ser210 215
220Leu Gln Asp Glu Val Ala Phe Leu Arg Arg Asn His Glu Glu
Glu Val225 230 235 240Ala
Asp Leu Leu Ala Gln Ile Gln Ala Ser His Ile Thr Val Glu Arg245
250 255Lys Asp Tyr Leu Lys Thr Asp Ile Ser Thr Ala
Leu Lys Glu Ile Arg260 265 270Ser Gln Leu
Glu Cys His Ser Asp Gln Asn Met His Gln Ala Glu Glu275
280 285Trp Phe Lys Cys Arg Tyr Ala Lys Leu Thr Glu Ala
Ala Glu Gln Asn290 295 300Lys Glu Ala Ile
Arg Ser Ala Lys Glu Glu Ile Ala Glu Tyr Arg Arg305 310
315 320Gln Leu Gln Ser Lys Ser Ile Glu Leu
Glu Ser Val Arg Gly Thr Lys325 330 335Glu
Ser Leu Glu Arg Gln Leu Ser Asp Ile Glu Glu Arg His Asn His340
345 350Asp Leu Ser Ser Tyr Gln Asp Thr Ile Gln Gln
Leu Glu Asn Glu Leu355 360 365Arg Gly Thr
Lys Trp Glu Met Ala Arg His Leu Arg Glu Tyr Gln Asp370
375 380Leu Leu Asn Val Lys Met Ala Leu Asp Ile Glu Ile
Ala Ala Tyr Arg385 390 395
400Lys Leu Leu Glu Gly Glu Glu Thr Arg Phe Ser Thr Phe Ser Gly Ser405
410 415Ile Thr Gly Pro Leu Tyr Thr His Arg
Gln Pro Ser Val Thr Ile Ser420 425 430Ser
Lys Ile Gln Lys Thr Lys Val Glu Ala Pro Lys Leu Lys Val Gln435
440 445His Lys Phe Val Glu Glu Ile Ile Glu Glu Thr
Lys Val Glu Asp Glu450 455 460Lys Ser Glu
Met Glu Glu Thr Leu Thr Ala Ile Ala Glu Glu Leu Ala465
470 475 480Ala Ser Ala Lys Glu Glu Lys
Glu Glu Ala Glu Glu Lys Glu Glu Glu485 490
495Pro Glu Ala Glu Lys Ser Pro Val Lys Ser Pro Glu Ala Lys Glu Glu500
505 510Glu Glu Glu Gly Glu Lys Glu Glu Glu
Glu Glu Gly Gln Glu Glu Glu515 520 525Glu
Glu Glu Asp Glu Gly Val Lys Ser Asp Gln Ala Glu Glu Gly Gly530
535 540Ser Glu Lys Glu Gly Ser Ser Glu Lys Asp Glu
Gly Glu Gln Glu Glu545 550 555
560Glu Glu Gly Glu Thr Glu Ala Glu Gly Glu Gly Glu Glu Ala Glu
Ala565 570 575Lys Glu Glu Lys Lys Ile Glu
Gly Lys Val Glu Glu Val Ala Val Lys580 585
590Glu Glu Ile Lys Val Glu Lys Pro Glu Lys Ala Lys Ser Pro Met Pro595
600 605Lys Ser Pro Val Glu Glu Val Lys Pro
Lys Pro Glu Ala Lys Ala Gly610 615 620Lys
Gly Glu Gln Lys Glu Glu Glu Lys Val Glu Glu Glu Lys Lys Glu625
630 635 640Val Thr Lys Glu Ser Pro
Lys Glu Glu Lys Val Glu Lys Lys Glu Glu645 650
655Lys Pro Lys Asp Val Ala Asp Lys Lys Lys Ala Glu Ser Pro Val
Lys660 665 670Glu Lys Ala Val Glu Glu Val
Ile Thr Ile Ser Lys Ser Val Lys Val675 680
685Ser Leu Glu Lys Asp Thr Lys Glu Glu Lys Pro Gln Pro Gln Glu Lys690
695 700Val Lys Glu Lys Ala Glu Glu Glu Gly
Gly Ser Glu Glu Glu Gly Ser705 710 715
720Asp Arg Ser Pro Gln Glu Ser Lys Lys Glu Asp Ile Ala Ile
Asn Gly725 730 735Glu Val Glu Gly Lys Glu
Glu Glu Glu Gln Glu Thr Gln Glu Lys Gly740 745
750Ser Gly Arg Glu Glu Glu Lys Gly Val Val Thr Asn Gly Leu Asp
Val755 760 765Ser Pro Ala Glu Glu Lys Lys
Gly Glu Asp Ser Ser Asp Asp Lys Val770 775
780Val Val Thr Lys Lys Val Glu Lys Ile Thr Ser Glu Gly Gly Asp Gly785
790 795 800Ala Thr Lys Tyr
Ile Thr Lys Ser Val Thr Val Thr Gln Lys Val Glu805 810
815Glu His Glu Glu Thr Phe Glu Glu Lys Leu Val Ser Thr Lys
Lys Val820 825 830Glu Lys Val Thr Ser His
Ala Ile Val Lys Glu Val Thr Gln Gly Asp835 840 845
<210> 12
<211> 2550
<212> DNA
<213> Mus musculus
<400> 12
atgagctaca cgctggactc gctgggcaac
ccgtccgcct accggcgcgt tccaaccgag 60acccggtcca gcttcagccg cgtgagcggt
tccccgtcca gcggcttccg ctcgcagtcc 120tggtcccgcg gctcgcccag caccgtgtcc
tcctcctaca cgcgcagcgc ggtcgccccg 180cgtctcgcct acagctcggc tatgctcagc
tcggccgaga gcagcctcga cttcagccag 240tcctcgtcgc tgctcaacgg cggctccggc
ggcgactaca aactgtcccg ctctaacgag 300aaagagcagc tgcaggggct gaacgaccgc
ttcgccggct acatcgagaa agtgcactac 360ttggaacaac agaacaagga gatcgaagca
gagatccagg cactgcggca gaagcaggcc 420tcgcacgccc agctgggtga tgcttacgac
caggagatcc gagagctgcg cgccaccctc 480gagatggtga accacgagaa ggctcaagtg
cagctggact ccgatcactt ggaggaagac 540atccaccggc tcaaggagcg cttcgaggag
gaggcgcggc tgcgggacga caccgaggct 600gccattcgcg cgctgcgcaa agacatcgaa
gagtcgtcga tggttaaggt ggagctggac 660aagaaggtgc agtcgctgca ggatgaggtg
gctttcctgc ggcgtaatca cgaagaggag 720gtggccgacc tgctggctca gatccaggcg
tcgcacatca cggtagagcg caaagattac 780ctgaagacag acatctccac ggcgctgaag
gagatccgct cccagctcga gtgtcactca 840gaccagaaca tgcaccaggc cgaagagtgg
ttcaaatgcc gctacgccaa gctcaccgag 900gcggccgagc agaacaagga ggccattcgc
tctgccaagg aagagatcgc cgagtaccgg 960cgccagctgc agtccaagag catcgagctc
gagtcggtgc gaggcactaa ggagtccctg 1020gaacggcagc tcagcgacat cgaggagcgc
cacaaccacg acctcagcag ctaccaggac 1080accatccagc agttggaaaa tgaacttcgg
ggaaccaagt gggaaatggc tcgtcatttg 1140cgagaatacc aggatctcct taacgtcaag
atggccctgg acatcgagat cgccgcgtac 1200aggaaactcc tagaggggga agagaccaga
tttagcacat tttcaggaag catcaccggg 1260cctctgtaca cacaccgaca gccctcagtc
acaatatcca gtaagattca gaagaccaaa 1320gtcgaggccc ccaagctcaa ggtccaacac
aaatttgtgg aggagatcat cgaagaaact 1380aaagtggaag atgagaagtc agaaatggaa
gaaaccctca cagccatcgc agaggagttg 1440gcagcctccg ccaaagagga gaaggaagag
gccgaagaaa aggaggagga accagaagcc 1500gaaaagtctc ccgtgaagtc tcctgaggct
aaggaagagg aggaggaagg ggaaaaggag 1560gaagaagagg aaggccagga ggaagaagag
gaggaagatg aaggtgtcaa gtcagaccag 1620gcagaagagg ggggatctga gaaggaaggc
tccagtgaaa aagatgaagg tgagcaggaa 1680gaagaagaag gagaaaccga ggcagaaggt
gaaggagagg aagcagaagc taaggaggaa 1740aagaaaattg agggaaaggt tgaggaagtg
gctgtcaagg aggaaatcaa ggtcgagaag 1800cctgagaaag ccaaatcccc tatgcccaaa
tcacccgtgg aagaagtaaa gccaaaacca 1860gaggccaagg ccgggaaggg tgagcagaag
gaggaagaga aagttgagga agagaagaag 1920gaagtcacca aagaatcacc caaggaagag
aaggtggaga aaaaggagga gaagccaaaa 1980gatgttgcag ataaaaagaa ggccgagtcc
ccggtgaaag agaaggctgt ggaggaggtg 2040atcaccatca gcaagtcggt aaaggtgagc
ctggagaaag acaccaaaga ggagaagccg 2100cagccgcagg agaaggtgaa ggagaaggca
gaggaggagg ggggcagtga ggaggaaggg 2160agtgaccgta gcccgcagga gtccaagaag
gaagacatag ctatcaatgg ggaggtggaa 2220ggaaaagagg aggaggagca ggaaactcag
gagaagggca gtgggcggga ggaggagaaa 2280ggggtggtca ctaatggctt agatgtgagc
cctgcagagg agaagaaagg agaggatagc 2340agtgatgata aagtggtggt caccaagaag
gtagaaaaga tcaccagcga gggaggcgat 2400ggtgctacca aatacatcac caaatctgta
accgtcactc aaaaggttga agagcatgag 2460gagacctttg aggagaagct ggtctcaact
aaaaaggtag aaaaggtcac ttcacacgcc 2520atagtcaagg aagtcaccca gggtgactaa 2550
<210> 13
<211> 845
<212> PRT
<213> Rattus norvegicus
<400> 13
Ser Tyr
Thr Leu Asp Ser Leu Gly Asn Pro Ser Ala Tyr Arg Arg Val1 5
10 15Pro Thr Glu Thr Arg Ser Ser Phe
Ser Arg Val Ser Gly Ser Pro Ser20 25
30Ser Gly Phe Arg Ser Gln Ser Trp Ser Arg Gly Ser Pro Ser Thr Val35
40 45Ser Ser Ser Tyr Lys Arg Ser Ala Leu Ala
Pro Arg Leu Ala Tyr Ser50 55 60Ser Ala
Met Leu Ser Ser Ala Glu Ser Ser Leu Asp Phe Ser Gln Ser65
70 75 80Ser Ser Leu Leu Asn Gly Gly
Ser Gly Gly Asp Tyr Lys Leu Ser Arg85 90
95Ser Asn Glu Lys Glu Gln Leu Gln Gly Leu Asn Asp Arg Phe Ala Gly100
105 110Tyr Ile Glu Lys Val His Tyr Leu Glu
Gln Gln Asn Lys Glu Ile Glu115 120 125Ala
Glu Ile His Ala Leu Arg Gln Lys Gln Ala Ser His Ala Gln Leu130
135 140Gly Asp Ala Tyr Asp Gln Glu Ile Arg Glu Leu
Arg Ala Thr Leu Glu145 150 155
160Met Val Asn His Glu Lys Ala Gln Val Gln Leu Asp Ser Asp His
Leu165 170 175Glu Glu Asp Ile His Arg Leu
Lys Glu Arg Phe Glu Glu Glu Ala Arg180 185
190Leu Arg Asp Asp Thr Glu Ala Ala Ile Arg Ala Val Arg Lys Asp Ile195
200 205Glu Glu Ser Ser Met Val Lys Val Glu
Leu Asp Lys Lys Val Gln Ser210 215 220Leu
Gln Asp Glu Val Ala Phe Leu Arg Ser Asn His Glu Glu Glu Val225
230 235 240Ala Asp Leu Leu Ala Gln
Ile Gln Ala Ser His Ile Thr Val Glu Arg245 250
255Lys Asp Tyr Leu Lys Thr Asp Ile Ser Thr Ala Leu Lys Glu Ile
Arg260 265 270Ser Gln Leu Glu Cys His Ser
Asp Gln Asn Met His Gln Ala Glu Glu275 280
285Trp Phe Lys Cys Arg Tyr Ala Lys Leu Thr Glu Ala Ala Glu Gln Asn290
295 300Lys Glu Ala Ile Arg Ser Ala Lys Glu
Glu Ile Ala Glu Tyr Arg Arg305 310 315
320Gln Leu Gln Ser Lys Ser Ile Glu Leu Glu Ser Val Arg Gly
Thr Lys325 330 335Glu Ser Leu Glu Arg Gln
Leu Ser Asp Ile Glu Glu Arg His Asn His340 345
350Asp Leu Ser Ser Tyr Gln Asp Thr Ile Gln Gln Leu Glu Asn Glu
Leu355 360 365Arg Gly Thr Lys Trp Glu Met
Ala Arg His Leu Arg Glu Tyr Gln Asp370 375
380Leu Leu Asn Val Lys Met Ala Leu Asp Ile Glu Ile Ala Ala Tyr Arg385
390 395 400Lys Leu Leu Glu
Gly Glu Glu Thr Arg Phe Ser Thr Phe Ser Gly Ser405 410
415Ile Thr Gly Pro Leu Tyr Thr His Arg Gln Pro Ser Val Thr
Ile Ser420 425 430Ser Lys Ile Gln Lys Thr
Lys Val Glu Ala Pro Lys Leu Lys Val Gln435 440
445His Lys Phe Val Glu Glu Ile Ile Glu Glu Thr Lys Val Glu Asp
Glu450 455 460Lys Ser Glu Met Glu Asp Ala
Leu Thr Val Ile Ala Glu Glu Leu Ala465 470
475 480Ala Ser Ala Lys Glu Glu Lys Glu Glu Ala Glu Glu
Lys Glu Glu Glu485 490 495Pro Glu Val Glu
Lys Ser Pro Val Lys Ser Pro Glu Ala Lys Glu Glu500 505
510Glu Glu Gly Glu Lys Glu Glu Glu Glu Glu Gly Gln Glu Glu
Glu Glu515 520 525Glu Glu Asp Glu Gly Val
Lys Ser Asp Gln Ala Glu Glu Gly Gly Ser530 535
540Glu Lys Glu Gly Ser Ser Glu Lys Asp Glu Gly Glu Gln Glu Glu
Glu545 550 555 560Gly Glu
Thr Glu Ala Glu Gly Glu Gly Glu Glu Ala Glu Ala Lys Glu565
570 575Glu Lys Lys Thr Glu Gly Lys Val Glu Glu Met Ala
Ile Lys Glu Glu580 585 590Ile Lys Val Glu
Lys Pro Glu Lys Ala Lys Ser Pro Val Pro Lys Ser595 600
605Pro Val Glu Glu Val Lys Pro Lys Pro Glu Ala Lys Ala Gly
Lys Asp610 615 620Glu Gln Lys Glu Glu Glu
Lys Val Glu Glu Lys Lys Glu Val Ala Lys625 630
635 640Glu Ser Pro Lys Glu Glu Lys Val Glu Lys Lys
Glu Glu Lys Pro Lys645 650 655Asp Val Pro
Asp Lys Lys Lys Ala Glu Ser Pro Val Lys Glu Lys Ala660
665 670Val Glu Glu Met Ile Thr Ile Thr Lys Ser Val Lys
Val Ser Leu Glu675 680 685Lys Asp Thr Lys
Glu Glu Lys Pro Gln Gln Gln Glu Lys Val Lys Glu690 695
700Lys Ala Glu Glu Glu Gly Gly Ser Glu Glu Glu Val Gly Asp
Lys Ser705 710 715 720Pro
Gln Glu Ser Lys Lys Glu Asp Ile Ala Ile Asn Gly Glu Val Glu725
730 735Gly Lys Glu Glu Glu Glu Gln Glu Thr Gln Glu
Lys Gly Ser Gly Gln740 745 750Glu Glu Glu
Lys Gly Val Val Thr Asn Gly Leu Asp Val Ser Pro Ala755
760 765Glu Glu Lys Lys Gly Glu Asp Arg Ser Asp Asp Lys
Val Val Val Thr770 775 780Lys Lys Val Glu
Lys Ile Thr Ser Glu Gly Gly Asp Gly Ala Thr Lys785 790
795 800Tyr Ile Thr Lys Ser Val Thr Val Thr
Gln Lys Val Glu Glu His Glu805 810 815Glu
Thr Phe Glu Glu Lys Leu Val Ser Thr Lys Lys Val Glu Lys Val820
825 830Thr Ser His Ala Ile Val Lys Glu Val Thr Gln
Gly Asp835 840 845
<210> 14
<211> 2538
<212> DNA
<213> Rattus norvegicus
<400> 14
atgagctaca cgctggactc gctgggcaac ccgtccgcct accggcgcgt
caccgagacc 60ccgtccagct tcagtcgtgt gagcggttcc ccgtccagcg gcttccgctc
gcagtcctgg 120tcccgcggct cgcccagcac cgtgtcctcc tcctacaagc gcagcgcgct
cgccccgcgc 180ctcgcctaca gctcggctat gctcagctcg gccgagagca gcctcgactt
cagccagtcc 240tcttcgctgc ttaacggcgg ctccggcggc gactacaagc tgtcccgctc
aaacgagaaa 300gagcagctgc aggggctgaa cgaccgtttc gccggctaca tcgagaaagt
gcactacttg 360gaacaacaga acaaggagat cgaggcagag atccacgcgc tgcggcagaa
gcaggcctcg 420cacgcccagc tgggtgacgc ttacgaccag gagatccgag agctgcgcgc
caccctggag 480atggtgaatc acgagaaggc tcaagtgcag ctggactctg atcacttgga
ggaagacatc 540caccggctca aggagcgctt cgaggaggag gcgcggctgc gggacgacac
cgaggctgcc 600atccgggcgc tgcgcaaaga catagaggag tcgtcgatgg ttaaggtgga
gctggacaag 660aaggtgcagt cgctgcagga tgaggtggcc ttcctgcgga gcaatcacga
agaggaggtg 720gccgacctgc tggcccagat ccaggcgtcg cacatcaccg tagagcgcaa
agactacctg 780aagacagaca tctccacggc gctgaaagag atccgctccc agctcgagtg
tcactccgac 840cagaacatgc accaggccga agagtggttc aaatgccgct acgccaagct
caccgaggcg 900gccgagcaga acaaggaggc catccgctcc gctaaagaag agatcgccga
gtaccggcgc 960cagctgcagt ccaagagcat tgagctcgag tcggtgcgag gcactaagga
gtccctggaa 1020cggcagctca gcgacatcga ggagcgccac aaccacgacc tcagcagcta
ccaggacacc 1080atccagcagc tggaaaatga gcttcgggga acaaagtggg aaatggctcg
tcatttgcga 1140gaataccagg atctccttaa cgtcaagatg gctctggaca tcgagatcgc
cgcatatagg 1200aaactactgg agggtgaaga gaccagattt agcacatttt caggaagcat
cactgggcct 1260ctgtacacac accgacagcc ctcagtcaca atatccagta agattcagaa
gaccaaagtc 1320gaggccccca agctcaaggt ccaacacaaa tttgtggagg agatcattga
ggagactaaa 1380gtggaagatg agaagtcaga aatggaagac gccctcacag tcattgcaga
ggaattggca 1440gcctctgcca aagaggagaa agaagaggca gaagaaaagg aagaggaacc
ggaagttgaa 1500aagtctcccg tgaagtctcc tgaggctaag gaagaggagg aaggggaaaa
ggaggaagaa 1560gaggaaggcc aagaggaaga agaggaggaa gatgaaggtg tcaagtcaga
ccaggcagaa 1620gagggaggat ctgagaagga aggctcgagt gaaaaggatg aaggtgagca
agaagaagaa 1680ggggaaactg aggcagaagg tgaaggagag gaagcagaag ctaaggagga
aaagaaaaca 1740gagggaaagg tcgaggaaat ggctatcaag gaggaaatca aggtcgagaa
gcccgagaaa 1800gccaagtccc ctgtgccaaa atcacccgtg gaagaagtaa agccaaaacc
agaagccaaa 1860gccggaaagg atgagcagaa ggaggaagag aaagttgagg agaagaagga
ggtagccaag 1920gaatcaccca aggaagagaa ggtggagaaa aaggaggaga agccaaaaga
tgtcccagat 1980aaaaagaagg ctgagtcccc agtgaaagaa aaggccgtag aggaaatgat
caccattact 2040aagtcggtaa aggtgagcct ggagaaagac accaaagagg agaagcctca
gcagcaggag 2100aaggtgaagg agaaggcaga ggaggagggg ggtagtgagg aggaagtggg
tgacaaaagc 2160ccgcaagaat ccaagaagga agacatagct atcaatgggg aggtggaagg
aaaagaggag 2220gaggagcagg aaactcagga gaagggcagt gggcaagagg aggagaaagg
ggtggtcact 2280aatggcttag atgtgagccc tgcggaggaa aagaaagggg aggatagaag
tgatgacaaa 2340gtggtggtga ccaagaaggt agaaaaaatc accagcgagg gaggcgatgg
tgctaccaaa 2400tacatcacca aatctgttac tgtcactcaa aaggttgaag agcatgagga
gacctttgag 2460gagaagctgg tgtcaactaa aaaggtagaa aaggtcactt cacatgccat
agtcaaggaa 2520gtcacccagg gtgactaa 2538
<210> 15
<211> 644
<212> PRT
<213> Oryctolagus cuniculus
<400> 15
Val Lys Val Glu Leu Asp
Lys Lys Val Gln Ser Leu Gln Asp Glu Val1 5
10 15Ala Phe Leu Arg Thr Asn His Glu Glu Glu Val Ala
Asp Leu Leu Ala20 25 30Gln Ile Gln Ala
Ser His Ile Thr Val Glu Arg Lys Asp Tyr Leu Lys35 40
45Thr Asp Ile Ser Ser Ala Leu Lys Glu Ile Arg Ser Gln Leu
Glu Cys50 55 60His Ser Asp Gln Asn Met
His Gln Ala Glu Glu Trp Phe Lys Cys Arg65 70
75 80Tyr Ala Lys Leu Thr Glu Ala Ala Glu Gln Asn
Lys Glu Ala Ile Arg85 90 95Ser Ala Lys
Glu Glu Ile Ala Glu Tyr Arg Arg Gln Leu Gln Ser Lys100
105 110Ser Ile Glu Leu Glu Ser Val Ala Trp His Lys Glu
Ser Leu Glu Arg115 120 125His Val Ser Asp
Ile Glu Glu Arg His Asn His Asp Leu Ser Ser Tyr130 135
140Gln Asp Thr Ile Gln Gln Leu Glu Asn Glu Leu Arg Gly Thr
Lys Trp145 150 155 160Glu
Met Ala Arg His Leu Arg Glu Tyr Gln Asp Leu Leu Asn Val Lys165
170 175Met Ala Leu Asp Ile Glu Ile Ala Ala Tyr Arg
Lys Leu Leu Glu Gly180 185 190Glu Glu Thr
Arg Phe Ser Thr Phe Ser Gly Ser Ile Thr Gly Pro Leu195
200 205Tyr Thr His Arg Gln Pro Ser Val Thr Ile Ser Ser
Lys Ile Gln Lys210 215 220Thr Lys Val Glu
Ala Pro Lys Leu Lys Val Gln His Lys Phe Val Glu225 230
235 240Glu Ile Ile Glu Glu Thr Lys Val Glu
Asp Glu Lys Ser Glu Met Glu245 250 255Asp
Ala Leu Thr Ala Ile Ala Glu Glu Leu Ala Val Ser Val Lys Glu260
265 270Glu Glu Lys Glu Glu Glu Ala Glu Gly Lys Glu
Glu Glu Gln Glu Ala275 280 285Glu Glu Glu
Val Ala Ala Ala Lys Lys Ser Pro Val Lys Ala Thr Thr290
295 300Pro Glu Ile Lys Glu Glu Glu Gly Glu Lys Glu Glu
Glu Gly Gln Glu305 310 315
320Glu Glu Glu Glu Glu Glu Asp Glu Gly Val Lys Ser Asp Gln Ala Glu325
330 335Glu Gly Gly Ser Glu Lys Glu Gly Ser
Ser Lys Asn Glu Gly Glu Gln340 345 350Glu
Glu Gly Glu Thr Glu Ala Glu Gly Glu Val Glu Glu Ala Glu Ala355
360 365Lys Glu Glu Lys Lys Thr Glu Glu Lys Ser Glu
Glu Val Ala Ala Lys370 375 380Glu Glu Pro
Val Thr Glu Ala Lys Val Gly Lys Pro Glu Lys Ala Lys385
390 395 400Ser Pro Val Pro Lys Ser Pro
Val Glu Glu Val Lys Pro Lys Ala Glu405 410
415Ala Thr Ala Gly Lys Gly Glu Gln Lys Glu Glu Glu Glu Lys Val Glu420
425 430Glu Glu Lys Lys Lys Ala Ala Lys Glu
Ser Pro Lys Glu Glu Lys Val435 440 445Glu
Lys Lys Glu Glu Lys Pro Lys Asp Val Pro Lys Lys Lys Ala Glu450
455 460Ser Pro Val Lys Glu Glu Ala Ala Glu Glu Ala
Ala Thr Ile Thr Lys465 470 475
480Pro Thr Lys Val Gly Leu Glu Lys Glu Thr Lys Glu Gly Glu Lys
Pro485 490 495Leu Gln Gln Glu Lys Glu Lys
Glu Lys Ala Gly Glu Glu Gly Gly Ser500 505
510Glu Glu Glu Gly Ser Asp Gln Gly Ser Lys Arg Ala Lys Lys Glu Asp515
520 525Ile Ala Val Asn Gly Glu Gly Glu Gly
Lys Glu Glu Glu Glu Pro Glu530 535 540Thr
Lys Glu Lys Gly Ser Gly Arg Glu Glu Glu Lys Gly Val Val Thr545
550 555 560Asn Gly Leu Asp Leu Ser
Pro Ala Asp Glu Lys Lys Gly Gly Asp Arg565 570
575Ser Glu Glu Lys Val Val Val Thr Lys Lys Val Glu Lys Ile Thr
Thr580 585 590Glu Gly Gly Asp Gly Ala Thr
Lys Tyr Ile Thr Lys Ser Val Thr Ala595 600
605Gln Lys Val Glu Glu His Glu Glu Thr Phe Glu Glu Lys Leu Val Ser610
615 620Thr Lys Lys Val Glu Lys Val Thr Ser
His Ala Ile Val Lys Glu Val625 630 635
640Thr Gln Ser Asp
<210> 16
<211> 1938
<212> DNA
<213> Oryctolagus cuniculus
<400> 16
tggtcaaggt ggagctggac aagaaggtcc agtcgctgca ggatgaggtg gccttcctgc
60ggacgaacca cgaggaggag gtagcggacc tgctggccca gatccaggcg tcgcacatca
120cggtggagcg caaagactac ctgaagacgg acatctcgtc ggcgctgaag gagatccgct
180cccagctcga gtgccactcc gaccagaaca tgcatcaggc cgaagagtgg tttaagtgcc
240gctacgccaa gctcaccgaa gccgccgagc agaacaagga ggccatccgc tccgccaagg
300aagagatcgc cgagtaccgg cgccagctgc agtccaagag catcgagctc gagtcggtcg
360cgtggcacaa ggagtccctg gagcggcacg tcagcgacat cgaggagcgc cacaaccacg
420acctcagcag ctaccaggac accattcagc agctggaaaa tgagcttcgg ggaacgaagt
480gggaaatggc ccgccacttg cgagagtacc aggatctcct caatgtcaag atggctctgg
540atatcgagat cgcagcctac agaaaactcc tggagggtga agagaccaga ttcagcacat
600tttcaggaag catcactggg ccactgtata cacaccgaca gccctcagtc accatatcca
660gtaagattca gaagacaaag gtggaagctc ccaagctcaa agtccaacac aaatttgttg
720aggagatcat agaggaaacc aaagtggagg atgagaagtc agaaatggaa gatgcactga
780cagccattgc agaggaactg gccgtgtctg tgaaggaaga ggagaaggaa gaagaggcag
840aaggaaagga agaggagcaa gaagctgaag aagaagttgc agctgccaag aagtctccag
900tgaaggctac cacacccgag attaaagagg aagaagggga aaaggaagaa gaaggccagg
960aggaggaaga agaggaagaa gatgaaggtg ttaagtcaga ccaagctgaa gagggaggat
1020cagagaagga aggctctagc aagaacgagg gtgagcagga agaaggagaa accgaggctg
1080aaggtgaagt agaagaagca gaagccaagg aggaaaagaa aaccgaggag aagagtgaag
1140aagtggctgc taaagaggag ccagtgacag aagccaaggt gggaaagcca gagaaagcca
1200agtcccctgt gccaaaatca ccagtggaag aggtgaagcc aaaagctgaa gccacagcag
1260ggaaagggga gcagaaagag gaagaagaga aggttgagga agaaaagaaa aaggcagcca
1320aggaatctcc aaaggaagag aaggtggaga agaaggagga gaaaccaaaa gatgtgccaa
1380agaagaaagc tgaatccccg gtaaaagagg aggccgcaga agaggctgcc accatcacca
1440aacccacaaa ggtgggcttg gagaaagaga ccaaagaagg ggagaagccg ctgcagcagg
1500agaaggaaaa ggagaaagca ggagaggagg gagggagtga ggaggaaggg agcgaccagg
1560ggtcaaagag ggccaagaag gaagacatag cagtcaatgg ggagggcgaa gggaaagagg
1620aggaagagcc ggagaccaag gaaaagggca gtgggcgaga agaggagaaa ggcgtcgtca
1680ccaatgggtt agacctgagc ccagcagacg agaagaaggg gggtgacaga agcgaggaga
1740aagtggtggt gaccaaaaag gtagaaaaaa tcaccactga ggggggcgat ggtgctacca
1800aatacatcac taaatctgta accgctcaaa aggtcgaaga gcatgaagag acctttgagg
1860agaaactagt gtctactaaa aaggtagaaa aagtcacttc acacgccatt gtaaaggaag
1920tcacccagag tgactaag 1938
<210> 17
<211> 424
<212> PRT
<213> Enterobacteria phage M13
<400> 17
Met Lys Lys Leu Leu Phe Ala Ile
Pro Leu Val Val Pro Phe Tyr Ser1 5 10
15His Ser Ala Glu Thr Val Glu Ser Cys Leu Ala Lys Pro His
Thr Glu20 25 30Asn Ser Phe Thr Asn Val
Trp Lys Asp Asp Lys Thr Leu Asp Arg Tyr35 40
45Ala Asn Tyr Glu Gly Cys Leu Trp Asn Ala Thr Gly Val Val Val Cys50
55 60Thr Gly Asp Glu Thr Gln Cys Tyr Gly
Thr Trp Val Pro Ile Gly Leu65 70 75
80Ala Ile Pro Glu Asn Glu Gly Gly Gly Ser Glu Gly Gly Gly
Ser Glu85 90 95Gly Gly Gly Ser Glu Gly
Gly Gly Thr Lys Pro Pro Glu Tyr Gly Asp100 105
110Thr Pro Ile Pro Gly Tyr Thr Tyr Ile Asn Pro Leu Asp Gly Thr
Tyr115 120 125Pro Pro Gly Thr Glu Gln Asn
Pro Ala Asn Pro Asn Pro Ser Leu Glu130 135
140Glu Ser Gln Pro Leu Asn Thr Phe Met Phe Gln Asn Asn Arg Phe Arg145
150 155 160Asn Arg Gln Gly
Ala Leu Thr Val Tyr Thr Gly Thr Val Thr Gln Gly165 170
175Thr Asp Pro Val Lys Thr Tyr Tyr Gln Tyr Thr Pro Val Ser
Ser Lys180 185 190Ala Met Tyr Asp Ala Tyr
Trp Asn Gly Lys Phe Arg Asp Cys Ala Phe195 200
205His Ser Gly Phe Asn Glu Asp Pro Phe Val Cys Glu Tyr Gln Gly
Gln210 215 220Ser Ser Asp Leu Pro Gln Pro
Pro Val Asn Ala Gly Gly Gly Ser Gly225 230
235 240Gly Gly Ser Gly Gly Gly Ser Glu Gly Gly Gly Ser
Glu Gly Gly Gly245 250 255Ser Glu Gly Gly
Gly Ser Glu Gly Gly Gly Ser Gly Gly Gly Ser Gly260 265
270Ser Gly Asp Phe Asp Tyr Glu Lys Met Ala Asn Ala Asn Lys
Gly Ala275 280 285Met Thr Glu Asn Ala Asp
Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly290 295
300Lys Leu Asp Ser Val Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly
Phe305 310 315 320Ile Gly
Asp Val Ser Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp325
330 335Phe Ala Gly Ser Asn Ser Gln Met Ala Gln Val Gly
Asp Gly Asp Asn340 345 350Ser Pro Leu Met
Asn Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln355 360
365Ser Val Glu Cys Arg Pro Phe Val Phe Ser Ala Gly Lys Pro
Tyr Glu370 375 380Phe Ser Ile Asp Cys Asp
Lys Ile Asn Leu Phe Arg Gly Val Phe Ala385 390
395 400Phe Leu Leu Tyr Val Ala Thr Phe Met Tyr Val
Phe Ser Thr Phe Ala405 410 415Asn Ile Leu
Arg Asn Lys Glu Ser420
<210> 18
<211> 1275
<212> DNA
<213> Enterobacteria phage M13
<400> 18
atgaaaaaat
tattattcgc aattccttta gttgttcctt tctattctca ctccgctgaa 60actgttgaaa
gttgtttagc aaaaccccat acagaaaatt catttactaa cgtctggaaa 120gacgacaaaa
ctttagatcg ttacgctaac tatgagggtt gtctgtggaa tgctacaggc 180gttgtagttt
gtactggtga cgaaactcag tgttacggta catgggttcc tattgggctt 240gctatccctg
aaaatgaggg tggtggctct gagggtggcg gttctgaggg tggcggttct 300gagggtggcg
gtactaaacc tcctgagtac ggtgatacac ctattccggg ctatacttat 360atcaaccctc
tcgacggcac ttatccgcct ggtactgagc aaaaccccgc taatcctaat 420ccttctcttg
aggagtctca gcctcttaat actttcatgt ttcagaataa taggttccga 480aataggcagg
gggcattaac tgtttatacg ggcactgtta ctcaaggcac tgaccccgtt 540aaaacttatt
accagtacac tcctgtatca tcaaaagcca tgtatgacgc ttactggaac 600ggtaaattca
gagactgcgc tttccattct ggctttaatg aggatccatt cgtttgtgaa 660tatcaaggcc
aatcgtctga cctgcctcaa cctcctgtca atgctggcgg cggctctggt 720ggtggttctg
gtggcggctc tgagggtggt ggctctgagg gtggcggttc tgagggtggc 780ggctctgagg
gaggcggttc cggtggtggc tctggttccg gtgattttga ttatgaaaag 840atggcaaacg
ctaataaggg ggctatgacc gaaaatgccg atgaaaacgc gctacagtct 900gacgctaaag
gcaaacttga ttctgtcgct actgattacg gtgctgctat cgatggtttc 960attggtgacg
tttccggcct tgctaatggt aatggtgcta ctggtgattt tgctggctct 1020aattcccaaa
tggctcaagt cggtgacggt gataattcac ctttaatgaa taatttccgt 1080caatatttac
cttccctccc tcaatcggtt gaatgtcgcc cttttgtctt tagcgctggt 1140aaaccatatg
aattttctat tgattgtgac aaaataaact tattccgtgg tgtctttgcg 1200tttcttttat
atgttgccac ctttatgtat gtattttcta cgtttgctaa catactgcgt 1260aataaggagt
cttaa 1275
<210> 19
<211> 720
<212> PRT
<213> Saccharomyces cerevisiae
<400> 19
Met Ala Lys Arg Val Ala Asp Ala
Gln Ile Gln Arg Glu Thr Tyr Asp1 5 10
15Ser Asn Glu Ser Asp Asp Asp Val Thr Pro Ser Thr Lys Val
Ala Ser20 25 30Ser Ala Val Met Asn Arg
Arg Lys Ile Ala Met Pro Lys Arg Arg Met35 40
45Ala Phe Lys Pro Phe Gly Ser Ala Lys Ser Asp Glu Thr Lys Gln Ala50
55 60Ser Ser Phe Ser Phe Leu Asn Arg Ala
Asp Gly Thr Gly Glu Ala Gln65 70 75
80Val Asp Asn Ser Pro Thr Thr Glu Ser Asn Ser Arg Leu Lys
Ala Leu85 90 95Asn Leu Gln Phe Lys Ala
Lys Val Asp Asp Leu Val Leu Gly Lys Pro100 105
110Leu Ala Asp Leu Arg Pro Leu Phe Thr Arg Tyr Glu Leu Tyr Ile
Lys115 120 125Asn Ile Leu Glu Ala Pro Val
Lys Ser Ile Glu Asn Pro Thr Gln Thr130 135
140Lys Gly Asn Asp Ala Lys Pro Ala Lys Val Glu Asp Val Gln Lys Ser145
150 155 160Ser Asp Ser Ser
Ser Glu Asp Glu Val Lys Val Glu Gly Pro Lys Phe165 170
175Thr Ile Asp Ala Lys Pro Pro Ile Ser Asp Ser Val Phe Ser
Phe Gly180 185 190Pro Lys Lys Glu Asn Arg
Lys Lys Asp Glu Ser Asp Ser Glu Asn Asp195 200
205Ile Glu Ile Lys Gly Pro Glu Phe Lys Phe Ser Gly Thr Val Ser
Ser210 215 220Asp Val Phe Lys Leu Asn Pro
Ser Thr Asp Lys Asn Glu Lys Lys Thr225 230
235 240Glu Thr Asn Ala Lys Pro Phe Ser Phe Ser Ser Ala
Thr Ser Thr Thr245 250 255Glu Gln Thr Lys
Ser Lys Asn Pro Leu Ser Leu Thr Glu Ala Thr Lys260 265
270Thr Asn Val Asp Asn Asn Ser Lys Ala Glu Ala Ser Phe Thr
Phe Gly275 280 285Thr Lys His Ala Ala Asp
Ser Gln Asn Asn Lys Pro Ser Phe Val Phe290 295
300Gly Gln Ala Ala Ala Lys Pro Ser Leu Glu Lys Ser Ser Phe Thr
Phe305 310 315 320Gly Ser
Thr Thr Ile Glu Lys Lys Asn Asp Glu Asn Ser Thr Ser Asn325
330 335Ser Lys Pro Glu Lys Ser Ser Asp Ser Asn Asp Ser
Asn Pro Ser Phe340 345 350Ser Phe Ser Ile
Pro Ser Lys Asn Thr Pro Asp Ala Ser Lys Pro Ser355 360
365Phe Ser Phe Gly Val Pro Asn Ser Ser Lys Asn Glu Thr Ser
Lys Pro370 375 380Val Phe Ser Phe Gly Ala
Ala Thr Pro Ser Ala Lys Glu Ala Ser Gln385 390
395 400Glu Asp Asp Asn Asn Asn Val Glu Lys Pro Ser
Ser Lys Pro Ala Phe405 410 415Asn Leu Ile
Ser Asn Ala Gly Thr Glu Lys Glu Lys Glu Ser Lys Lys420
425 430Asp Ser Lys Pro Ala Phe Ser Phe Gly Ile Ser Asn
Gly Ser Glu Ser435 440 445Lys Asp Ser Asp
Lys Pro Ser Leu Pro Ser Ala Val Asp Gly Glu Asn450 455
460Asp Lys Lys Glu Ala Thr Lys Pro Ala Phe Ser Phe Gly Ile
Asn Thr465 470 475 480Asn
Thr Thr Lys Thr Ala Asp Thr Lys Ala Pro Thr Phe Thr Phe Gly485
490 495Ser Ser Ala Leu Ala Asp Asn Lys Glu Asp Val
Lys Lys Pro Phe Ser500 505 510Phe Gly Thr
Ser Gln Pro Asn Asn Thr Pro Ser Phe Ser Phe Gly Lys515
520 525Thr Thr Ala Asn Leu Pro Ala Asn Ser Ser Thr Ser
Pro Ala Pro Ser530 535 540Ile Pro Ser Thr
Gly Phe Lys Phe Ser Leu Pro Phe Glu Gln Lys Gly545 550
555 560Ser Gln Thr Thr Thr Asn Asp Ser Lys
Glu Glu Ser Thr Thr Glu Ala565 570 575Thr
Gly Asn Glu Ser Gln Asp Ala Thr Lys Val Asp Ala Thr Pro Glu580
585 590Glu Ser Lys Pro Ile Asn Leu Gln Asn Gly Glu
Glu Asp Glu Val Ala595 600 605Leu Phe Ser
Gln Lys Ala Lys Leu Met Thr Phe Asn Ala Glu Thr Lys610
615 620Ser Tyr Asp Ser Arg Gly Val Gly Glu Met Lys Leu
Leu Lys Lys Lys625 630 635
640Asp Asp Pro Ser Lys Val Arg Leu Leu Cys Arg Ser Asp Gly Met Gly645
650 655Asn Val Leu Leu Asn Ala Thr Val Val
Asp Ser Phe Lys Tyr Glu Pro660 665 670Leu
Ala Pro Gly Asn Asp Asn Leu Ile Lys Ala Pro Thr Val Ala Ala675
680 685Asp Gly Lys Leu Val Thr Tyr Ile Val Lys Phe
Lys Gln Lys Glu Glu690 695 700Gly Arg Ser
Phe Thr Lys Ala Ile Glu Asp Ala Lys Lys Glu Met Lys705
710 715 720
<210> 20
<211> 2163
<212> DNA
<213> Saccharomyces cerevisiae
<400> 20
atggccaaaa gagttgccga tgcgcaaata cagagagaaa cgtacgattc
taacgagtct 60gacgatgacg tgactccctc cactaaggtt gcgtcatctg ctgtgatgaa
tagaagaaaa 120attgccatgc caaagcgcag gatggcgttc aaaccttttg gttctgcaaa
atcggatgaa 180accaagcagg ctagttcctt tagcttcctg aaccgggcgg acggcactgg
agaagctcag 240gttgataata gccctaccac agaaagcaat tccagactaa aagcattgaa
cctccagttc 300aaggctaagg ttgatgactt agttctaggc aagccgttag cggacttgag
gccccttttc 360
accaggtacg aattatacat aaagaatatc ttagaagctc ccgtgaaatt tatcgagaat 420
ccaacgcaga caaagggaaa tgatgctaaa cctgccaaag tagaagatgt ccaaaaaagt 480
tccgattctt catctgaaga tgaggttaag gtggaggggc ccaagttcac aatagatgct 540
aaaccgccta tttcagattc cgttttctca tttggcccaa aaaaagaaaa tcgcaagaaa 600
gatgaaagtg atagcgaaaa cgatatagaa atcaagggcc ctgaatttaa attttctgga 660
actgtatcaa gtgatgtatt taagctgaat ccaagcaccg ataaaaatga aaagaaaacc 720
gagactaatg ctaaaccatt ttcattttct tcggccactt caactactga acaaacgaag 780
agtaaaaatc ccctttcatt gacagaagct accaagacca atgtggacaa caacagtaaa 840
gccgaggctt ccttcacttt tggaacaaaa catgctgcgg attctcaaaa taataaacca 900
tcttttgtat ttggtcaagc agctgcaaaa ccatcgctag aaaagagctc attcacgttt 960
ggttcaacaa caattgaaaa aaaaaatgac gaaaactcaa cctctaactc aaaacctgaa 1020
aagtctagtg atagcaatga ttcaaaccca tctttttcct tttccatacc cagtaagaat 1080
acacctgatg catctaagcc atcttttaat tttggggtcc caaactcttc caaaaacgaa 1140
acttcaaaac cggtattttc gtttggtgca gcaacaccat cggccaaaga agctagtcag 1200
gaagatgaca acaacaacgt tgaaaaacct tcctctaagc ctgccttcaa tttcatatct 1260
aacgctggta ccgagaaaga gaaggaaagt aaaaaggact caaagccagc tttttcattt 1320
ggcatatcaa acggaagtga aagcaaagac tctgacaaac cctctttacc ctctgcggtt 1380
gatggtgaaa atgacaagaa agaagcaaca aaacctgctt tttcgtttgg aataaataca 1440
aatactacta aaaccgcgga tactaaagct ccaactttta catttggctc ctctgcactc 1500
gctgacaata aagaggatgt taagaaacct ttttcattcg gtacctccca gcctaataat 1560
actccatcct tctcattcgg aaaaacaaca gcaaacttgc ctgctaattc ttcaacatct 1620
cctgctccct ctataccatc gacggggttc aaattttctt tgccatttga acaaaaaggt 1680
agtcaaacaa ctacaaatga tagcaaggaa gaatcaacaa cagaagcaac tggaaatgag 1740
tcgcaagatg caaccaaagt agatgctacc ccagaagaat caaagccaat aaacttgcaa 1800
aacggtgagg aagacgaagt ggctttattt tcgcaaaaag caaaattaat gacattcaat 1860
gctgaaacca aatcgtacga ttcaagaggc gtaggcgaaa tgaagctttt gaagaaaaag 1920
gacgatcctt ctaaagtgcg cctactttgt aggtctgacg gtatgggtaa tgtattacta 1980
aatgcaactg ttgtagactc cttcaaatat gagcctttag ctcccggaaa tgataatctc 2040
attaaagctc ctactgttgc ggctgatggg aaacttgtaa cttatatcgt caagtttaag 2100
cagaaggaag aaggccgctc atttacgaaa gctattgaag atgctaaaaa
agaaatgaaa 2160
taa 2163

<210> 21  <211> 860  <212> PRT  <213> Mus musculus
<400> 21
Met Ala Gly Leu Thr Ala Val Val Pro Gln Pro Gly Val Leu Leu Ile
1               5                   10                  15
Leu Leu Leu Asn Leu Leu His Pro Ala Gln Pro Gly Gly Val Pro Gly
            20                  25                  30
Ala Val Pro Gly Gly Leu Pro Gly Gly Val Pro Gly Gly Val Tyr Tyr
        35                  40                  45
Pro Gly Ala Gly Ile Gly Gly Leu Gly Gly Gly Gly Gly Ala Leu Gly
    50                  55                  60
Pro Gly Gly Lys Pro Pro Lys Pro Gly Ala Gly Leu Leu Gly Thr Phe
65                  70                  75                  80
Gly Ala Gly Pro Gly Gly Leu Gly Gly Ala Gly Pro Gly Ala Gly Leu
                85                  90                  95
Gly Ala Phe Pro Ala Gly Thr Phe Pro Gly Ala Gly Ala Leu Val Pro
            100                 105                 110
Gly Gly Ala Ala Gly Ala Ala Ala Ala Tyr Lys Ala Ala Ala Lys Ala
        115                 120                 125
Gly Ala Gly Leu Gly Gly Val Gly Gly Val Pro Gly Gly Val Gly Val
    130                 135                 140
Gly Gly Val Pro Gly Gly Val Gly Val Gly Gly Val Pro Gly Gly Val
145                 150                 155                 160
Gly Val Gly Gly Val Pro Gly Gly Val Gly Gly Ile Gly Gly Ile Gly
                165                 170                 175
Gly Leu Gly Val Ser Thr Gly Ala Val Val Pro Gln Val Gly Ala Gly
            180                 185                 190
Ile Gly Ala Gly Gly Lys Pro Gly Lys Val Pro Gly Val Gly Leu Pro
        195                 200                 205
Gly Val Tyr Pro Gly Gly Val Leu Pro Gly Thr Gly Ala Arg Phe Pro
    210                 215                 220
Gly Val Gly Val Leu Pro Gly Val Pro Thr Gly Thr Gly Val Lys Ala
225                 230                 235                 240
Lys Ala Pro Gly Gly Gly Gly Ala Phe Ser Gly Ile Pro Gly Val Gly
                245                 250                 255
Pro Phe Gly Gly Gln Gln Pro Gly Val Pro Leu Gly Tyr Pro Ile Lys
            260                 265                 270
Ala Pro Lys Leu Pro Gly Gly Tyr Gly Leu Pro Tyr Thr Asn Gly Lys
        275                 280                 285
Leu Pro Tyr Gly Val Ala Gly Ala Gly Gly Lys Ala Gly Tyr Pro Thr
    290                 295                 300
Gly Thr Gly Val Gly Ser Gln Ala Ala Ala Ala Ala Ala Lys Ala Ala
305                 310                 315                 320
Lys Tyr Gly Ala Gly Gly Ala Gly Val Leu Pro Gly Val Gly Gly Gly
                325                 330                 335
Gly Ile Pro Gly Gly Ala Gly Ala Ile Pro Gly Ile Gly Gly Ile Ala
            340                 345                 350
Gly Ala Gly Thr Pro Ala Ala Ala Ala Ala Ala Lys Ala Ala Ala Lys
        355                 360                 365
Ala Ala Lys Tyr Gly Ala Ala Gly Gly Leu Val Pro Gly Gly Pro Gly
    370                 375                 380
Val Arg Leu Pro Gly Ala Gly Ile Pro Gly Val Gly Gly Ile Pro Gly
385                 390                 395                 400
Val Gly Gly Ile Pro Gly Val Gly Gly Pro Gly Ile Gly Gly Pro Gly
                405                 410                 415
Ile Val Gly Gly Pro Gly Ala Val Ser Pro Ala Ala Ala Ala Lys Ala
            420                 425                 430
Ala Ala Lys Ala Ala Lys Tyr Gly Ala Arg Gly Gly Val Gly Ile Pro
        435                 440                 445
Thr Tyr Gly Val Gly Ala Gly Gly Phe Pro Gly Tyr Gly Val Gly Ala
    450                 455                 460
Gly Ala Gly Leu Gly Gly Ala Ser Pro Ala Ala Ala Ala Ala Ala Ala
465                 470                 475                 480
Lys Ala Ala Lys Tyr Gly Ala Gly Gly Ala Gly Ala Leu Gly Gly Leu
                485                 490                 495
Val Pro Gly Ala Val Pro Gly Ala Leu Pro Gly Ala Val Pro Ala Val
            500                 505                 510
Pro Gly Ala Gly Gly Val Pro Gly Ala Gly Thr Pro Ala Ala Ala Ala
        515                 520                 525
Ala Ala Ala Ala Ala Lys Ala Ala Ala Lys Ala Gly Leu Gly Pro Gly
    530                 535                 540
Val Gly Gly Val Pro Gly Gly Val Gly Val Gly Gly Ile Pro Gly Gly
545                 550                 555                 560
Val Gly Val Gly Gly Val Pro Gly Gly Val Gly Pro Gly Gly Val Thr
                565                 570                 575
Gly Ile Gly Ala Gly Pro Gly Gly Leu Gly Gly Ala Gly Ser Pro Ala
            580                 585                 590
Ala Ala Lys Ser Ala Ala Lys Ala Ala Ala Lys Ala Gln Tyr Arg Ala
        595                 600                 605
Ala Ala Gly Leu Gly Ala Gly Val Pro Gly Phe Gly Ala Gly Ala Gly
    610                 615                 620
Val Pro Gly Phe Gly Ala Gly Ala Gly Val Pro Gly Phe Gly Ala Gly
625                 630                 635                 640
Ala Gly Val Pro Gly Phe Gly Ala Gly Ala Gly Val Pro Gly Phe Gly
                645                 650                 655
Ala Gly Ala Val Pro Gly Ser Leu Ala Ala Ser Lys Ala Ala Lys Tyr
            660                 665                 670
Gly Ala Ala Gly Gly Leu Gly Gly Pro Gly Gly Leu Gly Gly Pro Gly
        675                 680                 685
Gly Leu Gly Gly Pro Gly Gly Leu Gly Gly Ala Gly Val Pro Gly Arg
    690                 695                 700
Val Ala Gly Ala Ala Pro Pro Ala Ala Ala Ala Ala Ala Ala Lys Ala
705                 710                 715                 720
Ala Ala Lys Ala Ala Gln Tyr Gly Leu Gly Gly Ala Gly Gly Leu Gly
                725                 730                 735
Ala Gly Gly Leu Gly Ala Gly Gly Leu Gly Ala Gly Gly Leu Gly Ala
            740                 745                 750
Gly Gly Leu Gly Ala Gly Gly Leu Gly Ala Gly Gly Leu Gly Ala Gly
        755                 760                 765
Gly Leu Gly Ala Gly Gly Gly Val Ser Pro Ala Ala Ala Ala Lys Ala
    770                 775                 780
Ala Lys Tyr Gly Ala Ala Gly Leu Gly Gly Val Leu Gly Ala Arg Pro
785                 790                 795                 800
Phe Pro Gly Gly Gly Val Ala Ala Arg Pro Gly Phe Gly Leu Ser Pro
                805                 810                 815
Ile Tyr Pro Gly Gly Gly Ala Gly Gly Leu Gly Val Gly Gly Lys Pro
            820                 825                 830
Pro Lys Pro Tyr Gly Gly Ala Leu Gly Ala Leu Gly Tyr Gln Gly Gly
        835                 840                 845
Gly Cys Phe Gly Lys Ser Cys Gly Arg Lys Arg Lys
    850                 855                 860

<210> 22  <211> 2583  <212> DNA  <213> Mus musculus
<400> 22
atggcgggtc tgacagcggt agtcccgcag cctggcgtct tgctgatcct cttgctcaac 60
ctcctccatc ccgcgcagcc tggaggggtt ccaggagctg tgcctggcgg acttcctggt 120
ggagttcccg gtggagtcta ttatccaggg gctggtattg gaggcctggg aggaggagga 180
ggagctctgg gacctggagg aaaaccacct aagccaggtg ccggacttct gggaacgttt 240
ggagcaggtc ctggaggact tggaggtgct ggcccgggtg caggtctcgg ggcctttcct 300
gcaggcacct tcccaggggc aggagctctg gtgcccgggg gagcagcagg ggctgctgcc 360
gcttataaag ctgccgccaa agctggggct gggcttggtg gcgttggcgg agtcccaggt 420
ggtgttggcg ttggtggagt tccaggtggt gttggagttg gcggagtccc aggtggtgtt 480
ggagttggtg gagtccctgg cggtgttggt ggtattggtg gcatcggtgg cttaggagtc 540
tcgacaggtg ctgtggtgcc ccaagtcgga gctggcatcg gagctggagg aaagcctggg 600
aaagttcctg gtgttggtct tccaggtgta tacccaggcg gagtgctccc aggaacagga 660
gctcggttcc ctggtgtggg ggtgctccct ggagttccca ctggcacagg agtcaaagcc 720
aaggctccag gtggaggtgg tgctttttct ggaatcccag gggtcggacc ctttgggggt 780
cagcagcctg gtgtcccact gggttatccc atcaaagcac caaagctgcc aggtggctac 840
ggactgccct ataccaatgg gaaattgccc tatggagtag ctggtgcagg gggcaaggct 900
ggctacccaa cagggacagg ggtcggatcc caggcggcgg cggcagcagc taaagcagcc 960
aagtatggtg ctgggggagc tggagtcctc cctggtgttg gagggggtgg cattcctggt 1020
ggtgctggcg caattcctgg gattggaggc attgcaggcg ctggaactcc tgcagcagca 1080
gctgctgcaa aggctgctgc taaggctgct aagtatggag ctgctggagg tttagtgcct 1140
ggtggaccag gagttaggct cccaggtgct ggaatcccag gtgttggtgg cattcctggt 1200
gttggtggca tcccaggtgt tgggggccct ggtattggag gtccaggcat tgtgggtgga 1260
ccaggagctg tgtcaccagc tgctgctgct aaagctgctg ccaaagctgc caaatacgga 1320
gccagaggtg gagttggcat cccgacatat ggggttggtg ctggtggctt tcctggctat 1380
ggtgttggag ctggagcagg acttggaggt gcaagcccag ctgctgctgc tgccgccgcc 1440
aaagctgcta agtatggtgc tggaggagct ggagccctgg gaggcctggt gccaggtgca 1500
gtaccaggtg cactgccagg tgcagtacca gctgtgccgg gagctggtgg agtgccagga 1560
gcaggtaccc ctgcagctgc agctgctgcc gccgccgcta aagcagccgc caaagcaggt 1620
ttgggtcctg gtgttggtgg ggttcctggt ggagttggtg ttggtgggat tcccggtgga 1680
gttggtgttg gtggggttcc tggtggagtt ggccctggtg gtgttactgg tattggagct 1740
ggtcctggcg gtcttggagg agcagggtca ccggctgccg ctaaatctgc tgctaaggca 1800
gctgccaaag cccagtacag agctgccgct gggcttggag ctggtgtccc tggatttggg 1860
gctggtgctg gtgtccccgg atttggggct ggtgctggtg tccccggatt tggggctggt 1920
gctggtgtcc ccggatttgg ggctggtgct ggtgtccctg gatttggagc tggagcagta 1980
cctggatcgc tggctgcatc caaagctgct aaatatggag cagcaggtgg ccttggtggc 2040
cctggaggtc tcggtggccc tggaggtctc ggtggacctg gaggacttgg tggggctggt 2100
gttcccggta gagtagcagg agctgcaccc cctgctgctg ccgctgctgc tgccaaagct 2160
gctgctaagg ctgcccagta tggccttggt ggagccggag gattgggagc cggtggactg 2220
ggggccggtg gactgggagc cggtggactg ggagctggtg gactgggagc cggtggactg 2280
ggagctggtg gactgggagc cggtggactg ggagctggtg gaggtgtgtc ccctgctgca 2340
gctgctaagg cagccaaata tggtgctgct ggccttggag gtgtcctagg agccaggcca 2400
ttcccaggtg gaggagttgc agcaagacct ggctttggac tttctcccat ttatccaggt 2460
ggtggtgctg ggggcctggg agttggtgga aaacccccga agccctatgg aggagccctt 2520
ggagccctgg gataccaagg tgggggctgc tttgggaaat cctgtgggcg gaagagaaag 2580
tga 2583

<210> 23  <211> 6  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 23
Ser Pro Glu Ala Glu Lys
1               5

<210> 24  <211> 6  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or
mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 24
Ser Pro Ala Ala Val Lys
1               5

<210> 25  <211> 6  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 25
Ser Pro Ala Glu Ala Lys
1               5

<210> 26  <211> 6  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 26
Ser Pro Ala Glu Pro Lys
1               5

<210> 27  <211> 6  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 27
Ser Pro Ala Glu Val Lys
1               5

<210> 28  <211> 6  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 28
Ser Pro Ala Thr Val Lys
1               5

<210> 29  <211> 6  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 29
Ser Pro Glu Lys Ala Lys
1               5

<210> 30  <211> 6  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 30
Ser Pro Gly Glu Ala Lys
1               5

<210> 31  <211> 6  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 31
Ser Pro Ile Glu Val Lys
1               5

<210> 32  <211> 6  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 32
Ser Pro Pro Glu Ala Lys
1               5

<210> 33  <211> 6  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 33
Ser Pro Ser Glu Ala Lys
1               5

<210> 34  <211> 7  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 34
Ser Pro Glu Lys Glu Ala Lys
1               5

<210> 35  <211> 8  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 35
Ser Pro Ala Lys Glu Lys Ala Lys
1               5

<210> 36  <211> 8  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 36
Ser Pro Glu Lys Glu Glu Ala Lys
1               5

<210> 37  <211> 8  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 37
Ser Pro Thr Lys Glu Glu Ala Lys
1               5

<210> 38  <211> 8  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 38
Ser Pro Val Lys Glu Glu Ala Lys
1               5

<210> 39  <211> 8  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 39
Ser Pro Val Lys Ala Glu Ala Lys
1               5

<210> 40  <211> 8  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 40
Ser Pro Val Lys Glu Glu Ala Lys
1               5

<210> 41  <211> 8  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 41
Ser Pro Val Lys Glu Glu Val Lys
1               5

<210> 42  <211> 9  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 42
Ser Pro Val Lys Glu Glu Glu Lys Pro
1               5

<210> 43  <211> 11  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 43
Ser Pro Glu Lys Ala Lys Thr Leu Asp Val Lys
1               5                   10

<210> 44  <211> 11  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 44
Ser Pro Ala Asp Lys Phe Pro Glu Lys Ala Lys
1               5                   10

<210> 45  <211> 13  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 45
Ser Pro Glu Ala Lys Thr Pro Ala Lys Glu Glu Ala Arg
1               5                   10

<210> 46  <211> 14  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 46
Ser Pro Glu Lys Ala Lys Thr Pro Val Lys Glu Gly Ala Lys
1               5                   10

<210> 47  <211> 14  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 47
Ser Pro Val Lys Glu Glu Ala Lys Thr Pro Glu Lys Ala Lys
1               5                   10

<210> 48  <211> 19  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 48
Ser Pro Val Lys Glu Gly Ala Lys Pro Pro Glu Lys Ala Lys Pro Leu
1               5                   10                  15
Asp Val Lys

<210> 49  <211> 20  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 49
Ser Pro Val Lys Glu Asp Ile Lys Pro Pro Ala Glu Ala Lys Ser Pro
1               5                   10                  15
Glu Lys Ala Lys
            20

<210> 50  <211> 21  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 50
Ser Pro Leu Lys Glu Asp Ala Lys Ala Pro Glu Lys Glu Ile Pro Lys
1               5                   10                  15
Lys Glu Glu Val Lys
            20

<210> 51  <211> 21  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 51
Ser Pro Glu Lys Glu Glu Ala Lys Thr Ser Glu Lys Val Ala Pro Lys
1               5                   10                  15
Lys Glu Glu Val Lys
            20

<210> 52  <211> 25  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 52
Ser Pro Glu Ala Gln Thr Pro Val Gln Glu Glu Ala Thr Val Pro Thr
1               5                   10                  15
Asp Ile Arg Pro Pro Glu Gln Val Lys
            20                  25

<210> 53  <211> 36  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from human or mouse neurofilament NF-H proteins (SEQ ID Nos. 1 and 3)
<400> 53
Ser Pro Val Lys Glu Glu Val Lys Ala Lys Glu Pro Pro Lys Lys Val
1               5                   10                  15
Glu Glu Glu Lys Thr Leu Pro Thr Pro Lys Thr Glu Ala Lys Glu Ser
            20                  25                  30
Lys Lys Asp Glu
        35

<210> 54  <211> 4  <212> PRT  <213> Artificial Sequence
<223> Entropic
bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 54
Ser Pro Pro Lys
1

<210> 55  <211> 4  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 55
Ser Pro Val Lys
1

<210> 56  <211> 5  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 56
Ser Pro Ala Ala Lys
1               5

<210> 57  <211> 5  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 57
Ser Pro Ala Pro Lys
1               5

<210> 58  <211> 5  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 58
Ser Pro Glu Ala Lys
1               5

<210> 59  <211> 5  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 59
Ser Pro Met Pro Lys
1               5

<210> 60  <211> 5  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 60
Ser Pro Pro Ala Lys
1               5

<210> 61  <211> 5  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 61
Ser Pro Thr Ala Lys
1               5

<210> 62  <211> 5  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 62
Ser Pro Thr Thr Lys
1               5

<210> 63  <211> 5  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 63
Ser Pro Val Ala Lys
1               5

<210> 64  <211> 5  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 64
Ser Pro Val Ala Lys
1               5

<210> 65  <211> 5  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 65
Ser Pro Val Pro Lys
1               5

<210> 66  <211> 5  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 66
Ser Pro Val Ser Lys
1               5

<210> 67  <211> 6  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 67
Ser Pro Glu Lys Pro Ala
1               5

<210> 68  <211> 8  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 68
Ser Pro Val Glu Glu Lys Ala Lys
1               5

<210> 69  <211> 8  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 69
Ser Pro Val Glu Glu Lys Gly Lys
1               5

<210> 70  <211> 8  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 70
Ser Pro Val Glu Glu Val Lys Pro
1               5

<210> 71  <211> 11  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 71
Ser Pro Glu Lys Pro Ala Thr Pro Lys Val Thr
1               5                   10

<210> 72  <211> 12  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 72
Ser Pro Glu Lys Pro Arg Thr Pro Glu Lys Pro Ala
1               5                   10

<210> 73  <211> 12  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 73
Ser Pro Glu Lys Pro Thr Thr Pro Glu Lys Val Val
1               5                   10

<210> 74  <211> 14  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 74
Ser Pro Glu Lys Pro Ser Ser Pro Leu Lys Asp Glu Lys Ala
1               5                   10

<210> 75  <211> 15  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 75
Ser Pro Val Lys Glu Lys Ala Val Glu Glu Met Ile Thr Ile Thr
1               5                   10                  15

<210> 76  <211> 16  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 76
Ser Pro Val Lys Glu Glu Ala Ala Glu Glu Ala Ala Thr Ile Thr Lys
1               5                   10                  15

<210> 77  <211> 20  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 77
Ser Pro Val Pro Lys Ser Pro Val Glu Glu Val Lys Pro Lys Ala Glu
1               5                   10                  15
Ala Thr Ala Gly
            20

<210> 78  <211> 20  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 78
Ser Pro Val Lys Ala Glu Ser Pro Val Lys Glu Glu Val Pro Ala Lys
1               5                   10                  15
Pro Val Lys Val
            20

<210> 79  <211> 21  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 79
Ser Pro Glu Lys Glu Ala Lys Glu Glu Glu Lys Pro Gln Glu Lys Glu
1               5                   10                  15
Lys Glu Lys Glu Lys
            20

<210> 80  <211> 23  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 80
Ser Pro Val Lys Ala Thr Thr Pro Glu Ile Lys Glu Glu Glu Gly Glu
1               5                   10                  15
Lys Glu Glu Glu Gly Gln Glu
            20

<210> 81  <211> 22  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 81
Ser Pro Val Glu Glu Val Lys Pro Lys Pro Glu Ala Lys Ala Gly Lys
1               5                   10                  15
Gly Glu Gln Lys Glu Glu
            20

<210> 82  <211> 24  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 82
Ser Pro Glu Lys Pro Ala Thr Pro Glu Lys Pro Pro Thr Pro Glu Lys
1               5                   10                  15
Ala Ile Thr Pro Glu Lys Val Arg
            20

<210> 83  <211> 24  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 83
Ser Pro Glu Lys Pro Ala Thr Pro Glu Lys Pro Arg Thr Pro Glu Lys
1               5                   10                  15
Pro Ala Thr Pro Glu Lys Pro Arg
            20

<210> 84  <211> 23  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 84
Ser Pro Lys Glu Glu Lys Val Glu Lys Lys Glu Glu Lys Pro Lys Asp
1               5                   10                  15
Val Pro Lys Lys Lys Ala Glu
            20

<210> 85  <211> 24  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 85
Ser Pro Lys Glu Glu Lys Ala Glu Lys Lys Glu Glu Lys Pro Lys Asp
1               5                   10                  15
Val Pro Glu Lys Lys Lys Ala Glu
            20

<210> 86  <211> 24  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 86
Ser Pro Val Glu Glu Ala Lys Ser Lys Ala Glu Val Gly Lys Gly Glu
1               5                   10                  15
Gln Lys Glu Glu Glu Glu Lys Glu
            20

<210> 87  <211> 24  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 87
Ser Pro Lys Glu Glu Lys Val Glu Lys Lys Glu Glu Lys Pro Lys Asp
1               5                   10                  15
Val Pro Asp Lys Lys Lys Ala Glu
            20

<210> 88  <211> 26  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 88
Ser Pro Val Lys Glu Glu Ala Val Ala Glu Val Val Thr Ile Thr Lys
1               5                   10                  15
Ser Val Lys Val His Leu Glu Lys Glu Thr
            20                  25

<210> 89  <211> 31  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 89
Ser Ser Glu Lys Asp Glu Gly Glu Gln Glu Glu Glu Glu Gly Glu Thr
1               5                   10                  15
Glu Ala Glu Gly Glu Gly Glu Glu Ala Glu Ala Lys Glu Glu Lys
            20                  25                  30

<210> 90  <211> 32  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 90
Ser Pro Val Glu Glu Val Lys Pro Lys Ala Glu Ala Gly Ala Glu Lys
1               5                   10                  15
Gly Glu Gln Lys Glu Lys Val Glu Glu Glu Lys Lys Glu Ala Lys Glu
            20                  25                  30

<210> 91  <211> 37  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 91
Ser Pro Val Thr Glu Gln Ala Lys Ala Val Gln Lys Ala Ala Ala Glu
1               5                   10                  15
Val Gly Lys Asp Gln Lys Ala Glu Lys Ala Ala Glu Lys Ala Ala Lys
            20                  25                  30
Glu Glu Lys Ala Ala
        35

<210> 92  <211> 44  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from bovine, chicken, human, mouse, rat and rabbit neurofilament NF-M proteins (SEQ ID Nos. 5, 7, 9, 11, 13 and 15)
<400> 92
Ser Pro Glu Ala Lys Glu Glu Glu Glu Glu Gly Glu Lys Glu Glu Glu
1               5                   10                  15
Glu Glu Gly Gln Glu Glu Glu Glu Glu Glu Asp Glu Gly Val Lys Ser
            20                  25                  30
Asp Gln Ala Glu Glu Gly Gly Ser Glu Lys Glu Gly
        35                  40

<210> 93  <211> 5  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain
(EBD) sequence derived from filamentous phage fd adsorption protein pIII (SEQ ID No. 17)
<400> 93
Glu Gly Gly Gly Ser
1               5

<210> 94  <211> 5  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from filamentous phage fd adsorption protein pIII (SEQ ID No. 17)
<400> 94
Glu Gly Gly Gly Thr
1               5

<210> 95  <211> 5  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from filamentous phage fd adsorption protein pIII (SEQ ID No. 17)
<400> 95
Ser Glu Gly Gly Gly
1               5

<210> 96  <211> 7  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from filamentous phage fd adsorption protein pIII (SEQ ID No. 17)
<400> 96
Gly Gly Gly Ser Gly Gly Gly
1               5

<210> 97  <211> 8  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from filamentous phage fd adsorption protein pIII (SEQ ID No. 17)
<400> 97
Ser Gly Gly Gly Ser Gly Ser Gly
1               5

<210> 98  <211> 9  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from filamentous phage fd adsorption protein pIII (SEQ ID No. 17)
<400> 98
Ser Gly Gly Gly Ser Glu Gly Gly Gly
1               5

<210> 99  <211> 13  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived
from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 99
Phe Ser Phe Gly Thr Ser Gln Pro Asn Asn Thr Pro Ser
1               5                   10

<210> 100  <211> 17  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 100
Phe Ser Phe Ser Ile Pro Ser Lys Asn Thr Pro Asp Ala Ser Lys Pro
1               5                   10                  15
Ser

<210> 101  <211> 16  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 101
Phe Val Phe Gly Gln Ala Ala Ala Lys Pro Ser Leu Glu Lys Ser Ser
1               5                   10                  15

<210> 102  <211> 17  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 102
Phe Ser Phe Gly Val Pro Asn Ser Ser Lys Asn Glu Thr Ser Lys Pro
1               5                   10                  15
Val

<210> 103  <211> 17  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 103
Phe Thr Phe Gly Thr Lys His Ala Ala Asp Ser Gln Asn Asn Lys Pro
1               5                   10                  15
Ser

<210> 104  <211> 18  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 104
Phe Thr Phe Gly Ser Ser Ala Leu Ala Asp Asn Lys Glu Asp Val Lys
1               5                   10                  15
Lys Pro

<210> 105  <211> 19  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 105
Phe Ser Phe Gly Ile Asn Thr Asn Thr Thr Lys Thr Ala Asp Thr Lys
1               5                   10                  15
Ala Pro Thr

<210> 106  <211> 26  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 106
Phe Ser Phe Gly Lys Thr Thr Ala Asn Leu Pro Ala Asn Ser Ser Thr
1               5                   10                  15
Ser Pro Ala Pro Ser Ile Pro Ser Thr Gly
            20                  25

<210> 107  <211> 27  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 107
Phe Ser Phe Gly Pro Lys Lys Glu Asn Arg Lys Lys Asp Glu Ser Asp
1               5                   10                  15
Ser Glu Asn Asp Ile Glu Ile Lys Gly Pro Glu
            20                  25

<210> 108  <211> 31  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 108
Phe Lys Phe Ser Gly Thr Val Ser Ser Asp Val Phe Lys Leu Asn Pro
1               5                   10                  15
Ser Thr Asp Lys Asn Glu Lys Lys Thr Glu Thr Asn Ala Lys Pro
            20                  25                  30

<210> 109  <211> 33  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 109
Phe Lys Phe Ser Leu Pro Phe Glu Gln Lys Gly Ser Gln Thr Thr Thr
1               5                   10                  15
Asn Asp Ser Lys Glu Glu Ser Thr Thr Glu Ala Thr Gly Asn Glu Ser
            20                  25                  30
Gln

<210> 110  <211> 34  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 110
Phe Thr Phe Gly Ser Thr Thr Ile Glu Lys Lys Asn Asp Glu Asn Ser
1               5                   10                  15
Thr Ser Asn Ser Lys Pro Glu Lys Ser Ser Asp Ser Asn Asp Ser Asn
            20                  25                  30
Pro Ser

<210> 111  <211> 36  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 111
Phe Ser Phe Gly Ile Ser Asn Gly Ser Glu Ser Lys Asp Ser Asp Lys
1               5                   10                  15
Pro Ser Leu Pro Ser Ala Val Asp Gly Glu Asn Asp Lys Lys Glu Ala
            20                  25                  30
Thr Lys Pro Ala
        35

<210> 112  <211> 38  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 112
Phe Ser Phe Ser Ser Ala Thr Ser Thr Thr Glu Gln Thr Lys Ser Lys
1               5                   10                  15
Asn Pro Leu Ser Leu Thr Glu Ala Thr Lys Thr Asn Val Asp Asn Asn
            20                  25                  30
Ser Lys Ala Glu Ala Ser
        35

<210> 113  <211> 52  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence derived from yeast nucleoporin Nup2p protein (SEQ ID No. 19)
<400> 113
Phe Ser Phe Gly Ala Ala Thr Pro Ser Ala Lys Glu Ala Ser Gln Glu
1               5                   10                  15
Asp Asp Asn Asn Asn Val Glu Lys Pro Ser Ser Lys Pro Ala Phe Asn
            20                  25                  30
Leu Ile Ser Asn Ala Gly Thr Glu Lys Glu Lys Glu Ser Lys Lys Asp
        35                  40                  45
Ser Lys Pro Ala
    50

<210> 114  <211> 4  <212> PRT  <213> Artificial Sequence
<223> Entropic bristle domain (EBD) sequence
derived from mouse elastin protein (SEQ ID No. 21) 114Val Pro Gly
Ala11155PRTArtificial SequenceEntropic bristle domain (EBD) sequence
derived from mouse elastin protein (SEQ ID No. 21) 115Gly Ala Gly
Gly Leu1 51165PRTArtificial SequenceEntropic bristle domain
(EBD) sequence derived from mouse elastin protein (SEQ ID No. 21)
116Gly Ala Gly Gly Gly1 51175PRTArtificial SequenceEntropic
bristle domain (EBD) sequence derived from mouse elastin protein
(SEQ ID No. 21) 117Val Pro Gly Val Gly1 51188PRTArtificial
SequenceEntropic bristle domain (EBD) sequence derived from mouse
elastin protein (SEQ ID No. 21) 118Val Pro Gly Phe Gly Ala Gly Ala1
51198PRTArtificial SequenceEntropic bristle domain (EBD) sequence
derived from mouse elastin protein (SEQ ID No. 21) 119Val Pro Gly
Ala Leu Pro Gly Ala1 51209PRTArtificial SequenceEntropic
bristle domain (EBD) sequence derived from mouse elastin protein
(SEQ ID No. 21) 120Val Pro Gly Phe Gly Ala Gly Ala Gly1
51219PRTArtificial SequenceEntropic bristle domain (EBD) sequence derived
from mouse elastin protein (SEQ ID No. 21) 121Val Pro Ala Val Pro
Gly Ala Gly Gly1 51229PRTArtificial SequenceEntropic
bristle domain (EBD) sequence derived from mouse elastin protein
(SEQ ID No. 21) 122Val Pro Gly Gly Val Gly Val Gly Gly1
512310PRTArtificial SequenceEntropic bristle domain (EBD) sequence
derived from mouse elastin protein (SEQ ID No. 21) 123Val Gly Ala
Gly Gly Phe Pro Gly Tyr Gly1 5
1012412PRTArtificial SequenceEntropic bristle domain (EBD) sequence
derived from mouse elastin protein (SEQ ID No. 21) 124Val Pro Gly
Ala Val Pro Gly Gly Leu Pro Gly Gly1 5
1012515PRTArtificial SequenceEntropic bristle domain (EBD) sequence
derived from mouse elastin protein (SEQ ID No. 21) 125Val Ser Pro
Ala Ala Ala Ala Lys Ala Ala Lys Tyr Gly Ala Ala1 5
10 1512616PRTArtificial SequenceEntropic
bristle domain (EBD) sequence derived from mouse elastin protein
(SEQ ID No. 21) 126Val Pro Gln Val Gly Ala Gly Ile Gly Ala Gly Gly Lys
Pro Gly Lys1 5 10
1512718PRTArtificial SequenceEntropic bristle domain (EBD) sequence
derived from mouse elastin protein (SEQ ID No. 21) 127Val Pro Gly
Gly Val Gly Val Gly Gly Ile Pro Gly Gly Val Gly Val1 5
10 15Gly Gly12821PRTArtificial
SequenceEntropic bristle domain (EBD) sequence derived from mouse
elastin protein (SEQ ID No. 21) 128Val Pro Gly Gly Val Gly Gly Ile Gly
Gly Ile Gly Gly Leu Gly Val1 5 10
15Ser Thr Gly Ala Val2012927PRTArtificial SequenceEntropic
bristle domain (EBD) sequence derived from mouse elastin protein
(SEQ ID No. 21) 129Val Pro Gly Gly Ala Ala Gly Ala Ala Ala Ala Tyr Lys
Ala Ala Ala1 5 10 15Lys
Ala Gly Ala Gly Leu Gly Gly Val Gly Gly20
2513028PRTArtificial SequenceEntropic bristle domain (EBD) sequence
derived from mouse elastin protein (SEQ ID No. 21) 130Val Ser Pro
Ala Ala Ala Ala Lys Ala Ala Ala Lys Ala Ala Lys Tyr1 5
10 15Gly Ala Arg Gly Gly Val Gly Ile Pro
Thr Tyr Gly20 2513130PRTArtificial SequenceEntropic
bristle domain (EBD) sequence derived from mouse elastin protein
(SEQ ID No. 21) 131Lys Pro Pro Lys Pro Tyr Gly Gly Ala Leu Gly Ala Leu
Gly Tyr Gln1 5 10 15Gly
Gly Gly Cys Phe Gly Lys Ser Cys Gly Arg Lys Arg Lys20 25
3013230PRTArtificial SequenceEntropic bristle domain
(EBD) sequence derived from mouse elastin protein (SEQ ID No. 21)
132Val Pro Gly Ala Gly Thr Pro Ala Ala Ala Ala Ala Ala Ala Ala Ala1
5 10 15Lys Ala Ala Ala Lys Ala
Gly Leu Gly Pro Gly Val Gly Gly20 25
3013330PRTArtificial SequenceEntropic bristle domain (EBD) sequence
derived from mouse elastin protein (SEQ ID No. 21) 133Val Pro Gly
Arg Val Ala Gly Ala Ala Pro Pro Ala Ala Ala Ala Ala1 5
10 15Ala Ala Lys Ala Ala Ala Lys Ala Ala
Gln Tyr Gly Leu Gly20 25
3013430PRTArtificial SequenceEntropic bristle domain (EBD) sequence
derived from mouse elastin protein (SEQ ID No. 21) 134Val Pro Gly
Val Gly Leu Pro Gly Val Tyr Pro Gly Gly Val Leu Pro1 5
10 15Gly Thr Gly Ala Arg Phe Pro Gly Val
Gly Val Leu Pro Gly20 25

SEQ ID NO: 135
Length: 33
Type: PRT
Organism: Artificial Sequence
Other information: Entropic bristle domain (EBD) sequence derived from mouse elastin protein (SEQ ID No. 21)
Sequence 135:
Val Pro Thr Gly Thr Gly Val Lys Ala Lys Ala Pro Gly Gly Gly Gly
Ala Phe Ser Gly Ile Pro Gly Val Gly Pro Phe Gly Gly Gln Gln Pro
Gly

SEQ ID NO: 136
Length: 34
Type: PRT
Organism: Artificial Sequence
Other information: Entropic bristle domain (EBD) sequence derived from mouse elastin protein (SEQ ID No. 21)
Sequence 136:
Val Pro Gly Gly Val Tyr Tyr Pro Gly Ala Gly Ile Gly Gly Leu Gly
Gly Gly Gly Gly Ala Leu Gly Pro Gly Gly Lys Pro Pro Lys Pro Gly
Ala Gly

SEQ ID NO: 137
Length: 35
Type: PRT
Organism: Artificial Sequence
Other information: Entropic bristle domain (EBD) sequence derived from mouse elastin protein (SEQ ID No. 21)
Sequence 137:
Val Gly Ala Gly Ala Gly Leu Gly Gly Ala Ser Pro Ala Ala Ala Ala
Ala Ala Ala Lys Ala Ala Lys Tyr Gly Ala Gly Gly Ala Gly Ala Leu
Gly Gly Leu

SEQ ID NO: 138
Length: 40
Type: PRT
Organism: Artificial Sequence
Other information: Entropic bristle domain (EBD) sequence derived from mouse elastin protein (SEQ ID No. 21)
Sequence 138:
Gly Leu Gly Gly Val Leu Gly Ala Arg Pro Phe Pro Gly Gly Gly Val
Ala Ala Arg Pro Gly Phe Gly Leu Ser Pro Ile Tyr Pro Gly Gly Gly
Ala Gly Gly Leu Gly Val Gly Gly

SEQ ID NO: 139
Length: 41
Type: PRT
Organism: Artificial Sequence
Other information: Entropic bristle domain (EBD) sequence derived from mouse elastin protein (SEQ ID No. 21)
Sequence 139:
Val Pro Gly Ser Leu Ala Ala Ser Lys Ala Ala Lys Tyr Gly Ala Ala
Gly Gly Leu Gly Gly Pro Gly Gly Leu Gly Gly Pro Gly Gly Leu Gly
Gly Pro Gly Gly Leu Gly Gly Ala Gly

SEQ ID NO: 140
Length: 45
Type: PRT
Organism: Artificial Sequence
Other information: Entropic bristle domain (EBD) sequence derived from mouse elastin protein (SEQ ID No. 21)
Sequence 140:
Val Pro Gly Gly Pro Gly Val Arg Leu Pro Gly Ala Gly Ile Pro Gly
Val Gly Gly Ile Pro Gly Val Gly Gly Ile Pro Gly Val Gly Gly Pro
Gly Ile Gly Gly Pro Gly Ile Val Gly Gly Pro Gly Ala

SEQ ID NO: 141
Length: 50
Type: PRT
Organism: Artificial Sequence
Other information: Entropic bristle domain (EBD) sequence derived from mouse elastin protein (SEQ ID No. 21)
Sequence 141:
Val Leu Pro Gly Val Gly Gly Gly Gly Ile Pro Gly Gly Ala Gly Ala
Ile Pro Gly Ile Gly Gly Ile Ala Gly Ala Gly Thr Pro Ala Ala Ala
Ala Ala Ala Lys Ala Ala Ala Lys Ala Ala Lys Tyr Gly Ala Ala Gly
Gly Leu

SEQ ID NO: 142
Length: 50
Type: PRT
Organism: Artificial Sequence
Other information: Entropic bristle domain (EBD) sequence derived from mouse elastin protein (SEQ ID No. 21)
Sequence 142:
Val Pro Gly Gly Val Gly Pro Gly Gly Val Thr Gly Ile Gly Ala Gly
Pro Gly Gly Leu Gly Gly Ala Gly Ser Pro Ala Ala Ala Lys Ser Ala
Ala Lys Ala Ala Ala Lys Ala Gln Tyr Arg Ala Ala Ala Gly Leu Gly
Ala Gly

SEQ ID NO: 143
Length: 64
Type: PRT
Organism: Artificial Sequence
Other information: Entropic bristle domain (EBD) sequence derived from mouse elastin protein (SEQ ID No. 21)
Sequence 143:
Val Pro Leu Gly Tyr Pro Ile Lys Ala Pro Lys Leu Pro Gly Gly Tyr
Gly Leu Pro Tyr Thr Asn Gly Lys Leu Pro Tyr Gly Val Ala Gly Ala
Gly Gly Lys Ala Gly Tyr Pro Thr Gly Thr Gly Val Gly Ser Gln Ala
Ala Ala Ala Ala Ala Lys Ala Ala Lys Tyr Gly Ala Gly Gly Ala Gly

SEQ ID NO: 144
Length: 5
Type: PRT
Organism: Artificial Sequence
Other information: Entropic bristle domain (EBD) sequence derived from mouse elastin protein (SEQ ID No. 21)
Sequence 144:
Val Pro Gly Xaa Gly