Patent application title: POST-NATAL TRANSPLANTATION OF FACTOR VIII-EXPRESSING CELLS FOR TREATMENT OF HEMOPHILIA
Inventors:
IPC8 Class: AA61K3528FI
Publication date: 2020-11-19
Patent application number: 20200360441
Abstract:
Disclosed herein are methods of treating hemophilia A in a subject
comprising injecting the subject with mesenchymal stromal/stem cells
(MSC) modified to express high levels of Factor VIII protein. The MSC are
isolated prenatally, at birth, or after the subject's birth. The modified
MSC may also express high levels of von Willebrand factor protein.
Claims:
1. A method of treating a subject diagnosed with hemophilia A,
comprising: (a) modifying mesenchymal stem/stromal cells (MSC) to express
high levels of Factor VIII protein or high levels of both Factor VIII
protein and von Willebrand factor (vWF) protein thereby generating
modified MSC, the MSC comprising bone-marrow MSC isolated from the
subject; (b) generating an expanded modified MSC population by in vitro
culturing the modified MSC; and (c) injecting MSC from the expanded
modified MSC population into the subject.
2. The method of claim 1, wherein isolating the MSC comprises isolating cells that express at least one of Stro-1 or CD146.
3. The method of claim 1, wherein the subject has received prior treatment with exogenous Factor VIII and has developed an inhibitory immune response that diminishes the effectiveness of the exogenous Factor VIII treatment.
4. A method of treating a subject prenatally diagnosed as having hemophilia A, comprising: (a) modifying mesenchymal stem/stromal cells (MSC) to express high levels of Factor VIII protein or high levels of both Factor VIII protein and von Willebrand factor (vWF) protein thereby generating modified MSC, the MSC comprising MSC isolated from at least one of amniotic fluid, placental tissue, or umbilical cord tissue obtained at the time of the subject's birth or prenatally from the subject's mother; (b) generating an expanded modified MSC population by in vitro culturing the modified MSC; and (c) injecting MSC from the expanded modified MSC population into the subject.
5. The method of claim 4, wherein isolating the MSC comprises isolating cells that express c-kit.
6. The method of claim 4, wherein the isolated MSC express c-kit, CD34, CD90, and CD133.
7. (canceled)
8. (canceled)
9. The method of claim 1, wherein the modified MSC comprise one or both of a Factor VIII gene sequence or a vWF gene sequence comprising one or more modifications that increase protein expression, protein stability, or both, of the Factor VIII protein or vWF protein.
10. (canceled)
11. (canceled)
12. The method of claim 1, wherein the MSC are modified by introducing into the MSC a viral vector comprising a Factor VIII gene sequence operatively linked to a constitutively active promoter, a viral vector comprising a vWF gene sequence operatively linked to a constitutively active promoter, or a viral vector comprising a Factor VIII gene sequence operatively linked to a constitutively active promoter and a vWF gene sequence operatively linked to a constitutively active promoter.
13. The method of claim 1, wherein the MSC are modified via gene editing, wherein the gene editing introduces one or more modifications to one or both of an endogenous Factor VIII gene sequence or an endogenous vWF gene sequence that increase protein expression, protein stability, or both.
14. (canceled)
15. (canceled)
16. The method of claim 12, wherein the Factor VIII gene sequence and the vWF gene sequence are operatively linked to the same constitutively active promoter.
17. (canceled)
18. The method of claim 1, wherein the MSC from the expanded modified MSC population are injected into the subject via at least one of intraperitoneal injection, intravenous injection, or intra-articular injection.
19. The method of claim 1, wherein the expanded modified MSC population are injected into the subject at least once, at least twice, or at least three times.
20. The method of claim 1, wherein the expanded modified MSC population are injected into the subject in an amount of about 10^7 to about 10^9 MSC per kilogram weight of the subject.
21. The method of claim 4, wherein the modified MSC comprise one or both of a Factor VIII gene sequence or a vWF gene sequence comprising one or more modifications that increase protein expression, protein stability, or both, of the Factor VIII protein or vWF protein.
22. The method of claim 4, wherein the MSC are modified by introducing into the MSC a viral vector comprising a Factor VIII gene sequence operatively linked to a constitutively active promoter, a viral vector comprising a vWF gene sequence operatively linked to a constitutively active promoter, or a viral vector comprising a Factor VIII gene sequence operatively linked to a constitutively active promoter and a vWF gene sequence operatively linked to a constitutively active promoter.
23. The method of claim 22, wherein the Factor VIII gene sequence and the vWF gene sequence are operatively linked to the same constitutively active promoter.
24. The method of claim 4, wherein the MSC are modified via gene editing, wherein the gene editing introduces one or more modifications to one or both of an endogenous Factor VIII gene sequence or an endogenous vWF gene sequence that increase protein expression, protein stability, or both.
25. The method of claim 4, wherein the MSC from the expanded modified MSC population are injected into the subject via at least one of intraperitoneal injection, intravenous injection, or intra-articular injection.
26. The method of claim 4, wherein the expanded modified MSC population are injected into the subject at least once, at least twice, or at least three times.
27. The method of claim 4, wherein the expanded modified MSC population are injected into the subject in an amount of about 10^7 to about 10^9 MSC per kilogram weight of the subject.
Description:
CROSS REFERENCE TO RELATED APPLICATION
[0001] This application claims the benefit of U.S. Provisional Application Ser. No. 62/549,280, filed Aug. 23, 2017, the contents of which are herein incorporated by reference in their entirety.
REFERENCE TO A SEQUENCE LISTING SUBMITTED AS A TEXT FILE VIA EFS-WEB
[0003] The official copy of the sequence listing is submitted electronically via EFS-Web as an ASCII formatted sequence listing with a file named SEQ_WFIRM17-908.txt, created on Aug. 23, 2018, and having a size of 176 KB and is filed concurrently with the specification. The sequence listing contained in this ASCII formatted document is part of the specification and is herein incorporated by reference in its entirety.
BACKGROUND
[0004] Factor VIII is an essential blood clotting factor. The protein circulates in the bloodstream in an inactive form, bound to another molecule called von Willebrand factor, until an injury that damages blood vessels occurs. In response to injury, coagulation factor VIII is activated and separates from von Willebrand factor. The active protein interacts with another coagulation factor called Factor IX. This interaction sets off a chain of additional chemical reactions that form a blood clot.
[0005] Hemophilia A (HA) is the most common inheritable coagulation deficiency, affecting 1 in 5000 boys, approximately 60% of whom present with the severe form of the disease. Mutations in the Factor VIII gene that result in decreased or defective Factor VIII protein give rise to HA, a recessive X-linked disorder. Individuals with severe HA experience recurrent hematomas of subcutaneous connective tissue/muscle, internal bleeding, and frequent hemarthrosis, leading to chronic debilitating arthropathies. Current treatment is frequent infusions of Factor VIII (plasma-derived or recombinant) to maintain hemostasis, which greatly improves quality of life for many HA patients. While current therapeutic products for HA offer reliable prophylactic and therapeutic efficacy, they are very expensive and do not cure the underlying disease, thus requiring administration for the entire life of the patient. In addition, more than 30% of patients with severe HA develop inhibitory antibodies to the infused Factor VIII therapeutic, placing them in danger of treatment failure. This is a significant and serious complication/challenge in the clinical management/treatment of HA. While protein-based immune tolerance induction (ITI) therapy has been used with some success in this patient group, its cost extends into the millions of dollars per patient, it is only effective in about 60% of patients, and its mechanism of action is largely unknown. These shortcomings with existing therapy for patients who develop inhibitors highlight the need for innovative approaches to surmount this immunological hurdle.
BRIEF SUMMARY
[0006] In one aspect, provided are methods of treating a subject diagnosed with hemophilia A, the method involving the steps of (a) modifying mesenchymal stem/stromal cells (MSC) to express high levels of Factor VIII protein thereby generating modified MSC, the MSC comprising bone-marrow MSC isolated from the subject; (b) generating an expanded modified MSC population by in vitro culturing the modified MSC; and (c) injecting MSC from the expanded modified MSC population into the subject.
[0007] In another aspect, provided are methods of treating a subject prenatally diagnosed as having hemophilia A, the method involving the steps of (a) modifying mesenchymal stem/stromal cells (MSC) to express high levels of Factor VIII protein thereby generating modified MSC, the MSC comprising MSC isolated from at least one of amniotic fluid, placental tissue, or umbilical cord tissue obtained at the time of the subject's birth or prenatally from the subject's mother; (b) generating an expanded modified MSC population by in vitro culturing the modified MSC; and (c) injecting MSC from the expanded modified MSC population into the subject.
[0008] The above described and many other features and attendant advantages of embodiments of the present disclosure will become apparent and further understood by reference to the following detailed description when considered in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] These figures are intended to be illustrative, not limiting. Although the aspects of the disclosure are generally described in the context of these figures, it should be understood that it is not intended to limit the scope of the disclosure to these particular aspects.
[0010] FIG. 1 shows the complete absence of any FVIII antigen/cross-reactive material (CRM) in the hemophilia A sheep according to some aspects of the disclosure.
[0011] FIG. 2 shows assessment of phenotypic markers in MSC isolated from sheep according to some aspects of the disclosure.
[0012] FIG. 3 shows a schematic diagram of an ovine Factor VIII transgene expression vector according to some aspects of the disclosure.
[0013] FIG. 4 shows a schematic diagram of an ovine vWF transgene expression vector according to some aspects of the disclosure.
[0014] FIG. 5 shows endogenous expression of Factor VIII in MSC isolated from amniotic fluid according to some aspects of the disclosure.
[0015] FIG. 6 shows endogenous expression of vWF in MSC isolated from amniotic fluid according to some aspects of the disclosure.
[0016] FIG. 7 shows exogenous expression of Factor VIII in MSC isolated from amniotic fluid and transduced with a lentivector encoding Factor VIII according to some aspects of the disclosure.
[0017] FIG. 8A shows assessment of phenotypic markers in PLCs according to some aspects of the disclosure.
[0018] FIG. 8B shows flow cytometric analysis of PLC constitutively expressed levels of FVIII protein according to some aspects of the disclosure.
[0019] FIG. 8C shows assessment of normalized levels of PLC constitutively expressed levels of FVIII protein according to some aspects of the disclosure.
[0020] FIG. 9A shows assessment of normalized levels of secretion of FVIII protein by PLC engineered/transduced to express high levels of FVIII according to some aspects of the disclosure.
[0021] FIG. 9B shows assessment of secretion of FVIII protein by transduced as compared to non-transduced PLCs according to some aspects of the disclosure.
[0022] FIGS. 10A-10C show assessment of phenotypic markers in transduced as compared to non-transduced PLCs according to some aspects of the disclosure.
[0023] FIG. 11A shows assessment of expression of Toll-like receptors (TLRs) in mcoET3-transduced PLCs as compared to non-transduced PLCs according to some aspects of the disclosure.
[0024] FIG. 11B shows assessment of expression of stress molecules in mcoET3-transduced PLCs as compared to non-transduced PLCs according to some aspects of the disclosure.
[0025] FIG. 12 shows assessment of normalized levels of PLC secretion of FVIII protein by non-transduced-PLC, lcoHSQ-PLC, lcoET3-PLC, ET3-PLC, and mcoET3-PLC according to some aspects of the disclosure.
DETAILED DESCRIPTION
[0026] Provided in this disclosure are methods of treatment for subjects having hemophilia A. The methods are post-natal therapies comprising administering to a subject with hemophilia A autologous mesenchymal stem/stromal cells (MSC) that have been modified to express Factor VIII. Provided methods are effective as first-line therapies for subjects that have been diagnosed prenatally or at an early age and who have not received Factor VIII therapy. Provided methods are also effective as second-line therapies for the treatment of subjects that have been receiving Factor VIII therapy and, in some instances, have developed an immune response to standard infusion therapy of exogenous Factor VIII. The MSC used in the methods are isolated from biological samples obtained prenatally or after the subject's birth. The MSC are modified to express high levels of Factor VIII protein. In some instances, the MSC are modified to express high levels of Factor VIII and high levels of another protein, such as von Willebrand factor. The MSC may be modified by the introduction of a transgene (for example, using a viral vector) or via genome-editing (for example, using the CRISPR/Cas9 system). Administering the modified MSC to the subject results in engraftment of the modified cells. The engrafted cells produce Factor VIII on a continuing basis in the subject and provide long-lasting (ideally lifelong) therapeutic benefit to the subject by promoting blood coagulation.
[0027] In one aspect, provided is a method of treating a subject diagnosed clinically or genetically with hemophilia A comprising: (a) modifying mesenchymal stem/stromal cells (MSC) to express high levels of Factor VIII protein thereby generating modified MSC, the MSC comprising bone marrow MSC isolated from the subject; (b) generating an expanded modified MSC population by in vitro culturing the modified MSC; and (c) injecting MSC from the expanded modified MSC population into the subject. The bone marrow MSC express at least one of Stro-1 or CD146. In some instances, the bone marrow MSC may be isolated based on expression of at least one of Stro-1 or CD146.
[0028] In another aspect, provided is a method of treating a subject prenatally diagnosed as having hemophilia A comprising: (a) modifying mesenchymal stem/stromal cells (MSC) to express high levels of Factor VIII protein thereby generating modified MSC, the MSC comprising MSC isolated from at least one of amniotic fluid, placental tissue, or umbilical cord tissue obtained at the time of the subject's birth or prenatally; (b) generating an expanded modified MSC population by in vitro culturing the modified MSC; and (c) injecting MSC from the expanded modified MSC population into the subject. In some instances, the MSC are amniotic fluid MSC. In some instances, the MSC are placental MSC (PLC). In some instances, the MSC are umbilical cord tissue MSC.
[0029] As used herein the terms treatment, treat, or treating refer to a method of reducing one or more symptoms of a disease or condition. In some instances, treatment results in a 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% reduction in the severity of one or more symptoms of the disease or condition. In some instances, treatment results in at least a 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% reduction in the severity of one or more symptoms of the disease or condition. In some instances, treatment results in a 100% reduction in the severity of one or more symptoms of the disease or condition. For example, a method for treating a disease is considered to be a treatment if there is a 5% reduction in one or more symptoms or signs. As used herein, control refers to the untreated condition. In some instances, the reduction can be a 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, or any percent reduction in between 10% and 100% as compared to native or control levels. In some instances, the reduction can be at least a 65%, 70%, 75%, 80%, 85%, 90%, or 95% reduction as compared to native or control levels. In some instances, the reduction can be a 100% reduction. It is understood that treatment does not necessarily refer to a cure or complete ablation of the disease, condition, or symptoms of the disease or condition. As used herein, references to decreasing, reducing, or inhibiting include a change of 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or greater as compared to a control level. Such terms can include, but do not necessarily include, complete elimination.
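As an illustration of the percent-reduction thresholds described above, the following minimal Python sketch computes the reduction of a symptom measure relative to the untreated control and checks it against the 5% threshold. The function names, the choice of symptom measure, and the example values are hypothetical and are not part of the disclosure.

```python
def percent_reduction(control: float, treated: float) -> float:
    """Return the percent reduction of `treated` relative to `control`."""
    if control <= 0:
        raise ValueError("control level must be positive")
    return (control - treated) * 100.0 / control


def qualifies_as_treatment(control: float, treated: float,
                           threshold: float = 5.0) -> bool:
    """True if the reduction meets the threshold (5% by default, per the text)."""
    return percent_reduction(control, treated) >= threshold


# Hypothetical example: 20 bleeding episodes per year in the untreated
# condition versus 8 after treatment.
print(percent_reduction(20, 8))       # 60.0
print(qualifies_as_treatment(20, 8))  # True
```

Under this sketch, a change from 100 to 96 (a 4% reduction) would fall below the 5% threshold and would not qualify.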
[0030] The subject on which the method is performed has been diagnosed with hemophilia A. The subject is mammalian, including humans; non-human primates, such as apes and monkeys; cattle; horses; sheep; rats; dogs; cats; mice; pigs; and goats. In some embodiments, the subject is a human, a dog, a horse, a sheep, a cow, or a cat. The subject may be male or female. The subject may be a juvenile subject or an adult subject. Because hemophilia A is a recessive X-linked disorder, a male subject with the disease will carry an X chromosome that has a mutation in the Factor VIII gene. A female subject that has hemophilia A will either have a mutated Factor VIII allele on both X chromosomes or will have a mutant Factor VIII allele on one X chromosome and have an inactive Factor VIII allele on the other X chromosome. In some instances, the subject may be a female carrier of hemophilia A that has a mutant Factor VIII allele on one X chromosome and a normal Factor VIII gene on the other X chromosome. Subjects may be diagnosed via prenatal genetic testing, particularly in instances where there is a family history of hemophilia. The DNA from biological samples obtained from amniocentesis, chorionic villi sampling, or cell-free fetal DNA present in the maternal peripheral blood may be analyzed for abnormalities on the X chromosome or mutations in the Factor VIII gene. Alternatively, subjects may be diagnosed after birth by assessing the ability of the subject's blood to clot properly. For example, screening tests include activated partial thromboplastin time (APTT) test, prothrombin time (PT) test, and fibrinogen test. Diagnosis of hemophilia A (type and severity) can also be performed with antigen-based tests that assess the amount of Factor VIII protein in the subject's blood. In some instances, the subject may have received therapy with infused Factor VIII protein. Where the subject has received such therapy, in some instances the subject may have developed an immune response to Factor VIII protein (developed inhibitory antibodies that impair the effectiveness of the therapy). In some embodiments, the subject has received prior treatment with exogenous Factor VIII and has developed an inhibitory immune response that diminishes the effectiveness of the exogenous Factor VIII treatment.
[0031] MSC, referred to in the field as mesenchymal stem cells, mesenchymal stromal cells, and, when isolated from bone marrow, also marrow stromal progenitors (MSP), are multipotent stromal cells that can differentiate into a variety of cell types, including: osteoblasts, chondrocytes, myocytes, and adipocytes. MSC do not have the capacity to reconstitute an entire organ. The term encompasses multipotent cells derived from other non-marrow tissues, such as placenta, umbilical cord blood, adipose tissue, or the dental pulp of deciduous baby teeth. MSC are heterogeneous and different subsets of MSC may have different capabilities. Different methods of isolation will result in different populations of MSC. Such different populations may express different protein markers. MSC subpopulations with different marker expression profiles have been found to have different capabilities. See, for example, Thierry, D., et al., Stro-1 Positive and Stro-1 Negative Human Mesenchymal Stem Cells Express Different Levels of Immunosuppression, Blood 104(11): 4964 (2004). The extent to which a MSC population isolated using one method and having a particular marker profile will share properties with a MSC population isolated using a different method and having a different marker profile has not been determined.
[0032] In some instances, the MSC used in the method are bone marrow-derived MSC--that is, the MSC are isolated from bone marrow. Specifically, the bone marrow-derived MSC are isolated from bone marrow obtained from the subject (autologous MSC). In some instances, the bone marrow-derived MSC used in the method express Stro-1, CD146, or both Stro-1 and CD146. Flow cytometry methods may be used to isolate MSC expressing these markers such as described, for example, in Sanada C., et al., Mesenchymal stem cells contribute to endogenous FVIII:c production. J Cell Physiol. 2013; 228(5):1010-1016 and Chamberlain J. L., et al., Efficient generation of human hepatocytes by the intrahepatic delivery of clonal human mesenchymal stem cells in fetal sheep. Hepatology. 2007; 46(6):1935-1945. Isolating MSC based on Stro-1 and/or CD146 results in a distinct cell population from that isolated using the traditional approach in which bulk unpurified bone marrow or Ficoll-purified bone marrow mononuclear cells are plated directly into plastic cell culture plates or flasks to which the adherent MSC population binds.
[0033] In some instances, the MSC used in the method are MSC isolated from a birth tissue or birth fluid. Specifically, the MSC may be isolated from amniotic fluid, placental tissue, chorionic villi, or umbilical cord tissue. In some instances, the MSC used in the method express c-kit. Methods of isolating such cells are described in U.S. Pat. Nos. 7,968,336 and 8,021,876, which are incorporated herein by reference in their entirety. In some instances, the MSC express at least one of c-kit, CD34, CD90, or CD133. In some instances, the MSC express c-kit and at least one of CD34, CD90, or CD133. In some instances, the MSC are isolated based on expression of c-kit.
[0034] For juvenile patients for whom prenatal biological samples are available, the MSC may be isolated from such samples (such as amniotic fluid, placental tissue, or cord tissue). Such samples may be available where the subject is diagnosed with hemophilia prior to birth. In some instances, appropriate biological samples may be obtained at the time of the subject's birth (such as amniotic fluid, placental tissue, or cord tissue). For adult patients, or juvenile patients for whom prenatal biological samples are not available, the MSC used in the method may be bone marrow-derived mesenchymal stem/stromal cells (MSC), also referred to as bone marrow stromal cells.
[0035] The MSC used in the method are modified to express high levels of Factor VIII. In some instances, the MSC may be modified to also express high levels of von Willebrand factor (vWF).
[0036] In some instances, an exogenous gene sequence encoding one or both of these proteins may be introduced into the MSC via one or more vectors. In some instances, the MSC may be modified to express high levels of Factor VIII protein via introduction into the MSC of a vector comprising a Factor VIII gene sequence operatively linked to a constitutively active promoter. In some instances, the MSC may be modified to express high levels of vWF protein via introduction into the MSC of a vector comprising a vWF gene sequence operatively linked to a constitutively active promoter. In some instances, the MSC may be modified to express high levels of Factor VIII protein and vWF protein via introduction into the MSC of a vector comprising a Factor VIII gene sequence operatively linked to a constitutively active promoter and a vector encoding a vWF gene sequence operatively linked to a constitutively active promoter. In some instances, the Factor VIII gene sequence and the vWF gene sequence may be operatively linked to the same constitutively active promoter. Alternatively, the Factor VIII gene sequence and the vWF gene sequence may be operatively linked to different constitutively active promoters.
[0037] Exemplary vectors include, for example, plasmids and viral vectors (including but not limited to adenoviral, adeno-associated viral (AAV), or retroviral vectors such as lentiviral vectors). In preferred embodiments, the vector is a viral vector. In some instances, the vector may be a vector that integrates into the genome of transduced cells. For example, the vector may be a lentivirus vector. In preferred embodiments, the vector is a lentivirus vector. In some instances, the lentivirus vector contains a 3'-modified long terminal repeat (LTR), resulting in a self-inactivating (SIN) lentivector. A lentivirus vector may integrate into the genome of dividing or non-dividing cells. The lentivirus genome in the form of RNA is reverse-transcribed to DNA when the virus enters the cell, and is then inserted into the genome by the viral integrase enzyme. The lentivirus vector, now called a provirus, remains in the genome and is passed on to the progeny of the cell when it divides. In another example, the vector may be an adeno-associated virus (AAV) vector, which, in contrast to wild-type AAV, only rarely integrates into the genome of the cells it transduces. In one example, the vector may be an adenoviral vector. An adenoviral vector does not integrate into the genome. In another instance, the vector may be a murine retrovirus vector. In another example, the vector may be a foamy virus vector, which may have a larger capacity for inserts than lentiviral vectors. In another example, the vector may be a Sendai virus vector.
[0038] The exogenous gene sequences are operatively linked to one or more promoter sequences within the vector. The term "promoter sequence" or "promoter element" refers to a nucleotide sequence that assists with controlling expression of a coding sequence. Generally, promoter elements are located 5' of the translation start site of a gene. However, in certain embodiments, a promoter element may be located within an intron sequence, or 3' of the coding sequence. In some embodiments, a promoter useful for a gene therapy vector is derived from the native gene of the target protein (e.g., a Factor VIII promoter). In some embodiments, the promoter is a constitutive promoter, which drives substantially constant expression of protein from the exogenous gene sequence. Non-limiting examples of well-characterized promoter elements include the cytomegalovirus immediate-early promoter (CMV), the β-actin promoter, the methyl CpG binding protein 2 (MeCP2) promoter, the simian virus 40 early (SV40) promoter, human Ubiquitin C promoter (UBC), human elongation factor 1α promoter (EF1α), the phosphoglycerate kinase 1 promoter (PGK), or the CMV immediate early enhancer/chicken beta actin (CAG) promoter. The vector will generally also contain one or more of a promoter regulatory region (e.g., one conferring constitutive expression), a transcription initiation start site, a ribosome binding site, an RNA processing signal, a transcription termination site, and/or a polyadenylation signal.
[0039] In some instances, the Factor VIII transgene is operably linked to a promoter. A number of promoters can be used in the practice of the invention. The promoters can be selected based on desired outcome. The nucleic acids can be combined with constitutive, inducible, tissue-preferred, or other promoters for expression in the organism of interest. See, for example, promoters set forth as SEQ ID NOs: 1-6 as described in Brown et al. (2018) Target-Cell-Directed Bioengineering Approaches for Gene Therapy of Hemophilia A. Mol. Ther. Methods Clin. Dev., 2018. 9:57-69, which is herein incorporated by reference in its entirety for all purposes.
[0040] Where the MSC are modified to express high levels of Factor VIII via transduction with an exogenous Factor VIII gene sequence, the exogenous Factor VIII gene sequence may be human Factor VIII gene sequence, porcine Factor VIII gene sequence, or a hybrid transgene comprising portions of human Factor VIII gene sequence and portions of porcine Factor VIII gene sequence. In some instances, the gene sequence comprises all or a portion of the human Factor VIII cDNA as set forth in GenBank Accession No. 192448441 as updated Jul. 17, 2017, wherein said portion would encode a functional portion of the human Factor VIII protein. In some instances, the gene sequence comprises a sequence that is at least 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 90%, 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, or 80% identical to all or a portion of the human Factor VIII cDNA as set forth in GenBank Accession No. 192448441 as updated Jul. 17, 2017, wherein said portion would encode a functional portion of the human Factor VIII protein. In some instances, the gene sequence may encode all or a functional portion of the human Factor VIII protein as set forth in GenBank Accession No. 192448441 as updated Jul. 17, 2017, reflecting the protein encoded by transcript variant 1 of the Factor VIII gene. This protein is approximately 300 kDa and contains a series of homology-defined domains designated A1-A2-B-ap-A3-C1-C2. In some instances, the exogenous Factor VIII gene sequence is modified relative to the wild-type sequence to result in increased protein expression, increased protein stability, reduced immunogenicity, or a combination of one or more thereof.
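The percent-identity thresholds recited above (for example, at least 80% to at least 99% identical) can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length, with '-' marking gaps, and it counts identical positions over the full alignment length. Other definitions of identity exist (for example, excluding gap columns), and the disclosure does not specify one. The function name and the toy sequences are hypothetical and do not represent actual Factor VIII sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same residue.

    Assumes the inputs are pre-aligned (equal length, '-' for gaps).
    Gap columns never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy 20-nucleotide sequences differing at a single position.
reference = "ATGCAAATAGAGCTCTCCAC"
variant = "ATGCAAATAGAGCTCTCGAC"

identity = percent_identity(reference, variant)
print(identity)          # 95.0
print(identity >= 80.0)  # True: meets the lowest recited threshold
```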
[0041] In some instances, the sequence of one or more of the Factor VIII protein domains may be deleted. In one example, the B domain of Factor VIII is deleted. The B domain of Factor VIII has no known function and can be deleted without loss of coagulant activity. Deletion of the B-domain has been shown to increase Factor VIII protein production in heterologous systems (Toole et al. (1986) Proc. Natl. Acad. Sci. U.S.A. 83:5939-5942). In addition, wild-type porcine Factor VIII protein having the B-domain deleted may have 10-100-fold higher expression and secretion than human Factor VIII, both in vitro and in vivo. (See, for example, Dooriss, K. L., et al., Comparison of factor VIII transgenes bioengineered for improved expression in gene therapy of hemophilia A. Hum Gene Ther. 20:465-478 (2009).) A B-domain deleted form of human Factor VIII protein (Lind et al. (1995) Eur. J. Biochem. 232:19-27) has been approved for clinical use.
[0042] In some instances, the exogenous Factor VIII gene sequence may include protein modifications to reduce immunogenicity of the protein thereby reducing the risk of an immune response due to therapy. For example, alanine substitutions may be included as described in Healey, J. F., et al., The comparative immunogenicity of human and porcine factor VIII in haemophilia A mice. Thromb Haemost. 102:35-41 (2009) and Lubin, I. M., et al., Analysis of the human factor VIII A2 inhibitor epitope by alanine scanning mutagenesis. J Biol Chem. 272:30191-30195 (1997), which are incorporated by reference herein in their entirety.
[0043] In some instances, one or more of the human Factor VIII protein domain sequences may be substituted with the sequence of the corresponding porcine Factor VIII protein domain sequences. For example, one or more porcine Factor VIII domains may be substituted for one or more human Factor VIII domains. For example, inclusion of the porcine Factor VIII domains A1 and ap-A3 may increase expression of the expressed Factor VIII protein. See, for example, Doering, C. B., et al., Identification of porcine coagulation factor VIII domains responsible for high level expression via enhanced secretion. J Biol Chem. 279:6546-6552 (2004). In some embodiments, the exogenous Factor VIII gene sequence may comprise the human Factor VIII A2 and C2 domains and the porcine Factor VIII A1, A3, and C1 domains.
[0044] In some instances, the exogenous Factor VIII gene sequence may comprise a modified Factor VIII sequence comprising a B domain-deleted (BDD) Factor VIII transgene having the sequence of the human A2 and C2 domains and the porcine A1, A3, and C1 domains, and also include three alanine substitutions in the A2 domain to reduce immunogenicity, as described in Lubin, I. M., et al., Analysis of the human factor VIII A2 inhibitor epitope by alanine scanning mutagenesis. J Biol Chem. 1997; 272(48):30191-5. This modified Factor VIII protein is referred to as the ET3 transgene in this disclosure, including in the Examples below. In some instances, the ET3 transgene is expressed at a comparable level to that of wild-type porcine Factor VIII protein while having 91% identity to the amino acid sequence of wild-type human Factor VIII protein. In one example, the exogenous Factor VIII gene sequence may comprise a human/porcine Factor VIII transgene as described in Doering, C. B., et al., Directed engineering of a high-expression chimeric transgene as a strategy for gene therapy of hemophilia A, Mol. Ther. 17(7):1145-1154 (2009), which is incorporated herein by reference in its entirety.
[0045] In some instances, the Factor VIII transgene sequence may comprise one of the modified Factor VIII sequences described in Brown et al. (2018) Target-Cell-Directed Bioengineering Approaches for Gene Therapy of Hemophilia A. Mol. Ther. Methods Clin. Dev., 2018. 9:57-69, which is incorporated herein by reference in its entirety for all purposes. Factor VIII polypeptides, including tissue-specific codon optimized variants, are described therein. Modified Factor VIII transgene sequences used in the methods described herein can be any one of SEQ ID NOs: 7-16 (as described in Brown et al.). For example, Factor VIII transgene sequences that can be used in the methods described herein include a B-domain deleted (BDD) human Factor VIII polypeptide (HSQ) as set forth in SEQ ID NO: 15, a BDD chimeric human/porcine Factor VIII polypeptide (ET3) as set forth in SEQ ID NO: 11, or an ancestral Factor VIII polypeptide (An53) as set forth in SEQ ID NO: 7.
[0046] In some instances, the exogenous Factor VIII gene sequence may be modified for expression in a particular organ or tissue type. For example, the gene sequence may be optimized for expression in myeloid tissue. In some embodiments, the Factor VIII transgene may comprise myeloid codon optimized ET3 (mcoET3) as set forth in SEQ ID NO: 12 or myeloid codon optimized HSQ (mcoHSQ) as set forth in SEQ ID NO: 16. Alternatively, the Factor VIII transgene may be optimized for expression in liver tissue. In some embodiments, the Factor VIII transgene may comprise liver codon optimized ET3 (lcoET3) as set forth in SEQ ID NO: 10; liver codon optimized An53 as set forth in SEQ ID NO: 8; or liver codon optimized HSQ (lcoHSQ) as set forth in SEQ ID NO: 14.
[0047] In some instances, the exogenous Factor VIII gene sequence may comprise one of the modified Factor VIII sequences described in U.S. Pat. No. 7,635,763, which is incorporated herein by reference in its entirety for all purposes. Regions of the porcine Factor VIII polypeptide that comprise the A1 and ap-A3 regions, and variants and fragments thereof, which impart high-level expression to both the porcine and human Factor VIII polypeptide, are described therein. The exogenous Factor VIII gene sequence encoded by the viral vector of the provided methods may be the polynucleotides set forth in any one of SEQ ID NOs: 19, 21, 23, 25, or 27 (SEQ ID NOs: 15, 17, 19, 13, or 21 as described in U.S. Pat. No. 7,635,763). The modified Factor VIII protein expressed at high levels in the modified MSC may comprise the amino acid sequences set forth in any one of SEQ ID NOs: 18, 20, 22, 24, or 26 (SEQ ID NOs: 14, 16, 18, 12, or 20 as described in U.S. Pat. No. 7,635,763). Such sequences are summarized in Table 1 below. In some instances, these sequences may be used to construct an exogenous Factor VIII gene sequence encoding a modified factor VIII polypeptide that results in a high level of expression of the encoded modified Factor VIII protein.
TABLE-US-00001 TABLE 1 Exemplary Modified Factor VIII Proteins
Modified Factor VIII Protein | SEQ ID NO. | Description
HP44/OL | aa: SEQ ID NO: 18; nt: SEQ ID NO: 19 | A1.sub.P-A2.sub.P-ap.sub.P-A3.sub.P-C1.sub.H-C2.sub.H: porcine A1, A2, ap-A3 domains, porcine-derived linker sequence S F A Q N S R P P S A S A P K P P V L R R H Q R (SEQ ID NO: 30), and human C1 and C2 domains
HP46/SQ | aa: SEQ ID NO: 20; nt: SEQ ID NO: 21 | A1.sub.P-A2.sub.H-ap.sub.H-A3.sub.H-C1.sub.H-C2.sub.H: porcine A1 domain, human A2, ap-A3, C1 and C2 domains, and human S F S Q N P P V L K R H Q R linker sequence
HP47/OL | aa: SEQ ID NO: 22; nt: SEQ ID NO: 23 | A1.sub.P-A2.sub.H-ap.sub.p-A3.sub.P-C1.sub.H-C2.sub.H: porcine A1, ap-A3 domains, porcine-derived linker sequence S F A Q N S R P P S A S A P K P P V L R R H Q R (SEQ ID NO: 31), and human A2, C1 and C2 domains
B-domain deleted | aa: SEQ ID NO: 24; nt: SEQ ID NO: 25 | Human Factor VIII protein sequence minus B-domain
HP63/OL | aa: SEQ ID NO: 26; nt: SEQ ID NO: 27 | porcine A1 domain and a partially humanized ap-A3 domain that comprises porcine amino acids from about 1690 to about 1804 and from about 1819 to about 2019
[0048] As discussed above, in some instances, the MSC are also modified to express high levels of vWF protein via introduction into the MSC of a vector. In some embodiments, coding sequences for vWF can be any one of SEQ ID NOs: 28 or 29. In some instances, the vWF gene sequence in the vector may encode all or a functional portion of the human vWF protein set forth in GenBank Accession No. 1023301060 as updated Aug. 21, 2017. However, in some instances, the vWF gene sequence may include one or more modifications to the wild-type vWF gene sequence to increase protein expression, increase protein stability, reduce immunogenicity, or a combination of one or more thereof, of the vWF protein. For example, the full cDNA sequence of the vWF gene may be too large to be packaged efficiently in certain vectors, such as, for example, a lentiviral vector. Thus, in some instances, one or more exons of the vWF gene may be deleted while still retaining biological function of the expressed protein. In some instances, exons 24-46 of the vWF gene may be deleted as described in U.S. Patent Application Publication No. 2010/0183556. In some instances, the vWF gene sequence may be codon-optimized for efficient expression in the MSC. In some instances, the exogenous vWF gene sequence may be modified for expression in a particular organ or tissue type. For example, the gene sequence may be optimized for expression in the liver. Thus, in some instances, the vWF gene sequence may comprise at least 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 90%, 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, or 80% identity to the corresponding wild-type vWF gene sequence and comprise modifications to improve expression.
In some instances, the vWF gene sequence comprises the truncated human vWF sequence set forth below in this disclosure or a sequence at least 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 90%, 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, or 80% identical thereto while retaining biological activity of the expressed protein. In some instances, the vWF gene sequence comprises the truncated sheep vWF sequence set forth below in this disclosure or a sequence at least 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 90%, 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, or 80% identical thereto while retaining biological activity of the expressed protein.
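As a non-limiting illustration, the percent identity thresholds recited above could be checked computationally once a candidate sequence has been aligned to a reference sequence. The following Python sketch assumes the two sequences are already aligned to equal length, with "-" marking gaps, and assumes identity is counted as matching non-gap positions divided by alignment length; other conventions exist, and the function names are hypothetical and not part of this disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identity = matching non-gap columns / total alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    # Count columns where both sequences carry the same residue (gaps never match).
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy 10-nucleotide example with a single mismatch at the last position.
    reference = "ACGTACGTAC"
    candidate = "ACGTACGTAA"
    print(percent_identity(reference, candidate))
    print(meets_identity_threshold(reference, candidate, 80.0))
```

In practice the alignment itself would be produced by a dedicated alignment tool; the sketch only covers the final percentage calculation.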
[0049] In some instances, gene-editing may be performed on the MSC to insert, delete, or replace the genomic sequence of one or both of the endogenous genes using engineered nucleases (molecular scissors). Gene-editing nucleases belong to one of three known categories: zinc-finger nucleases (ZFN), transcription activator-like effector nucleases (TALEN), and clustered regularly interspaced short palindromic repeats (CRISPR) and their associated proteins (Cas) tools. All operate on the same principle; they are all capable of inducing a double-strand break at a defined genomic sequence that is subsequently corrected by endogenous DNA repair mechanisms. Double-strand breaks can be repaired through homology-driven repair (HDR), in the presence of donor homologous DNA sequences, resulting in gene-editing events.
[0050] In some instances, the MSC may be modified to express high levels of the Factor VIII protein via gene-editing of an endogenous Factor VIII gene sequence of the MSC, wherein the gene-editing introduces one or more modifications to an endogenous Factor VIII gene sequence that increase protein expression, increase protein stability, reduce immunogenicity, or a combination of one or more thereof, of the Factor VIII protein. In some instances, the MSC are modified to express high levels of an exogenous FVIII protein via genome-editing, wherein the gene-editing introduces an exogenous FVIII gene, under the control of a constitutive promoter, into a "safe harbor" region within the genome, such as the AAVS1 site. In some instances, the MSC are modified to express high levels of the vWF protein via gene-editing of an endogenous vWF gene sequence of the MSC, wherein the gene editing introduces one or more modifications to the endogenous vWF gene sequence that increase protein expression, increase protein stability, reduce immunogenicity, or a combination of one or more thereof, of the vWF protein. In some instances, the MSC are modified to express high levels of an exogenous vWF protein via genome-editing, wherein the gene-editing introduces an exogenous vWF gene, under the control of a constitutive promoter, into a "safe harbor" within the genome, such as the AAVS1 site. Exemplary "safe harbor" regions are described in Cerbini, T., et al., Transfection, selection, and colony-picking of human induced pluripotent stem cells TALEN-targeted with a GFP gene into the AAVS1 safe harbor. J Vis Exp. 2015 Feb. 1; (96):52504 and Hong, S. G., et al., Rhesus iPSC Safe Harbor Gene-Editing Platform for Stable Expression of Transgenes in Differentiated Cells of All Germ Layers. Mol Ther. 2017; 25(1):44-53.
[0051] In some instances, the endogenous Factor VIII gene sequence may be modified by gene-editing to have the type of modifications described above for embodiments where an exogenous Factor VIII gene sequence is introduced via transduction. The discussion of the various modifications described above is thus also applicable to embodiments where the endogenous Factor VIII gene sequence is modified. For example, in some instances, the sequence of one or more protein domains of the endogenous Factor VIII gene sequence may be deleted. In some instances, the B domain of Factor VIII is deleted. In some instances, the endogenous Factor VIII gene sequence may be modified to reduce immunogenicity of the protein thereby reducing the risk of an immune response due to therapy. For example, alanine substitutions may be introduced as described in Healey, J. F., et al., The comparative immunogenicity of human and porcine factor VIII in haemophilia A mice. Thromb Haemost. 102:35-41 (2009) and Lubin, I. M., et al., Analysis of the human factor VIII A2 inhibitor epitope by alanine scanning mutagenesis. J Biol Chem. 272:30191-30195 (1997), which are incorporated by reference herein in their entirety.
[0052] In some instances, the endogenous Factor VIII gene sequence may be modified to substitute one or more of the Factor VIII protein domain sequences with the sequence of the corresponding Factor VIII protein domain sequences from another species. For example, for human subjects, the endogenous Factor VIII gene sequence may be modified to substitute one or more of the human Factor VIII protein domain sequences with the sequence of the corresponding porcine Factor VIII protein domain sequences. For example, substitution with the porcine Factor VIII domains A1 and ap-A3 may increase expression of the expressed Factor VIII protein. See, for example, Doering, C. B., et al., Identification of porcine coagulation factor VIII domains responsible for high level expression via enhanced secretion. J Biol Chem. 279:6546-6552 (2004). In some embodiments, the endogenous Factor VIII gene sequence may be modified to comprise the porcine Factor VIII A1, A3, and C1 domains, while retaining the human Factor VIII A2 and C2 domains.
[0053] In some instances, the endogenous Factor VIII gene sequence may be modified to include a B domain deletion, the porcine A1, A3, and C1 domains, and also include three alanine substitutions in the A2 domain to reduce immunogenicity, as described above for the exogenous Factor VIII gene sequence embodiments. In one example, the endogenous Factor VIII gene sequence may be modified to have the sequence of a human/porcine Factor VIII transgene as described in Doering, C. B., et al., Directed engineering of a high-expression chimeric transgene as a strategy for gene therapy of hemophilia A, Mol. Ther. 17(7):1145-1154 (2009), which is incorporated herein by reference in its entirety. In some instances, the modified endogenous Factor VIII gene sequence results in expression of a modified Factor VIII protein at a level comparable to that of wild-type porcine Factor VIII protein while having 91% identity to the amino acid sequence of wild-type human Factor VIII protein.
[0054] In some instances, the endogenous Factor VIII gene sequence may be modified to comprise one of the modified Factor VIII sequences described in U.S. Pat. No. 7,635,763, which is incorporated herein by reference in its entirety for all purposes. In some instances, the endogenous Factor VIII gene sequence may comprise the polynucleotides set forth in any one of SEQ ID NOs: 19, 21, 23, 25, or 27 (SEQ ID NOs: 15, 17, 19, 13, or 21 as described in U.S. Pat. No. 7,635,763). The modified Factor VIII protein expressed at high levels in the modified MSC may comprise the amino acid sequences set forth in any one of SEQ ID NOs: 18, 20, 22, 24, or 26 (SEQ ID NOs: 14, 16, 18, 12, or 20 as described in U.S. Pat. No. 7,635,763). Such sequences are summarized in Table 1 above.
[0055] As discussed above, in some instances, the MSC are also modified to express high levels of vWF protein via gene-editing. In some instances, the vWF gene sequence may include one or more modifications to the wild-type vWF gene sequence to increase protein expression, increase protein stability, reduce immunogenicity, or a combination of one or more thereof, of the vWF protein. For example, in some instances, one or more exons of the vWF gene may be deleted while still retaining biological function of the expressed protein. In some instances, exons 24-46 of the vWF gene may be deleted as described in U.S. Patent Application Publication No. 2010/0183556. In some instances, the vWF gene sequence may be codon-optimized for efficient expression in the MSC. In some instances, the exogenous vWF gene sequence may be modified for expression in a particular organ or tissue type. For example, the gene sequence may be optimized for expression in the liver. Thus, in some instances, the vWF gene sequence may be modified to comprise at least 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 90%, 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, or 80% identity to the corresponding wild-type vWF gene sequence and comprise modifications to improve expression. In some instances, the vWF gene sequence may be modified to comprise the truncated human vWF sequence set forth below in this disclosure or a sequence at least 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 90%, 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, or 80% identical thereto while retaining biological activity of the expressed protein. In some instances, the vWF gene sequence may be modified to comprise the truncated sheep vWF sequence set forth below in this disclosure or a sequence at least 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 90%, 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, or 80% identical thereto while retaining biological activity of the expressed protein.
[0056] Where the MSC are modified to express high levels of both Factor VIII and vWF, the same method of modification may be used to achieve high expression of both proteins or different methods could be used for each protein. For example, in some instances, the MSC may be modified to express high levels of both Factor VIII and vWF protein via introduction of exogenous gene sequences for both proteins. In another example, the MSC may be modified to express high levels of both Factor VIII and vWF protein via gene-editing of the endogenous gene sequences of both proteins. In some instances, the MSC may be modified to express high levels of Factor VIII via transduction of an exogenous Factor VIII gene sequence and modified to express high levels of vWF via gene-editing of the endogenous vWF gene sequences. In other instances, the MSC may be modified to express high levels of vWF via transduction of an exogenous vWF gene sequence and modified to express high levels of Factor VIII via gene-editing of the endogenous Factor VIII gene sequences.
[0057] A "high level of expression" means that the production/expression of the modified Factor VIII protein or vWF protein is at an increased level as compared to the expression level of the corresponding native Factor VIII protein or vWF protein expressed under the same conditions. An increase in protein expression levels (considered a high level of expression) comprises at least about 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20-fold or greater expression of the modified Factor VIII protein or vWF protein compared to the expression levels of the corresponding Factor VIII protein or vWF protein. Alternatively, "high-level expression" can comprise an increase in protein expression levels of at least 1-25 fold, 1-5 fold, 5-10 fold, 10-15 fold, 15-20 fold, 20-25 fold or greater expression levels of the modified Factor VIII protein or vWF protein when compared to the corresponding Factor VIII protein or vWF protein. Methods for assaying protein expression levels are routine in the art. By "corresponding" Factor VIII protein or vWF protein is intended a Factor VIII protein or vWF protein that comprises an equivalent amino acid sequence. In one example, expression of a modified human Factor VIII protein comprising the A1-A2-ap-A3-C1-C2 domains is compared to a human Factor VIII protein containing corresponding domains A1-A2-ap-A3-C1-C2. In another example, for a fragment of a modified human Factor VIII protein containing domains A1-A2-ap-A3, expression is compared to a fragment of human Factor VIII protein having the corresponding domains A1-A2-ap-A3. Alternatively, in certain instances, expression of a modified Factor VIII protein or vWF protein may be compared to the full-length corresponding proteins. In one example, for a fragment of a modified human Factor VIII protein containing domains A1-A2-ap-A3, expression is compared to human Factor VIII protein having the A1-A2-ap-A3-C1-C2 domains.
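As a non-limiting illustration of the comparison described above, the fold expression of a modified protein relative to its corresponding protein may be computed as a simple ratio of expression levels measured under the same conditions. The following Python sketch is hypothetical: the function names, the example values, and the choice of fold threshold passed by the caller are assumptions for illustration, and the measurement units only need to be the same for both inputs.

```python
def fold_expression(modified_level: float, corresponding_level: float) -> float:
    """Ratio of modified-protein expression to corresponding-protein expression.

    Both levels are assumed to be measured by the same assay, in the same units,
    under the same conditions.
    """
    if corresponding_level <= 0:
        raise ValueError("corresponding expression level must be positive")
    return modified_level / corresponding_level


def is_high_level(modified_level: float, corresponding_level: float,
                  min_fold: float) -> bool:
    """True if the fold expression meets a caller-chosen minimum fold value."""
    return fold_expression(modified_level, corresponding_level) >= min_fold


if __name__ == "__main__":
    # Hypothetical assay readouts in arbitrary but identical units.
    modified, corresponding = 50.0, 5.0
    print(fold_expression(modified, corresponding))       # ratio of the two readouts
    print(is_high_level(modified, corresponding, 10.0))   # compare against a 10-fold cutoff
```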
[0058] The modified MSC are cultured in vitro to generate an expanded modified MSC population. The expanded modified MSC population provides sufficient numbers of modified MSC for therapeutic use. Culture conditions may be selected based on the type of MSC used in the method. For example, MSC isolated from placental tissue may be grown in culture medium optimized for placental cells. In another example, MSC isolated from amnion tissue may be grown in culture medium optimized for amniotic cells. In another example, MSC isolated from umbilical cord or bone marrow may be grown in culture medium optimized for MSC cells. The modified cells may be grown on plastic culture dishes for at least 2, 3, 4, 5, or 6 passages to generate the expanded modified MSC population. In some instances, all or a portion of the expanded modified MSC population may be cryopreserved.
[0059] Following culturing of the modified MSC to generate the expanded modified MSC population, modified MSC from the expanded modified MSC population are injected into the subject. The injection may be made by at least one of intraperitoneal injection, intravenous injection, or intra-articular injection. Each injection comprises about 10.sup.5 to about 10.sup.9 MSC from the expanded modified MSC population per kilogram weight of the subject. For example, the injection may comprise 10.sup.5 MSC, 10.sup.6 MSC, 10.sup.7 MSC, 10.sup.8 MSC, or 10.sup.9 MSC. The number of cells injected into the subject is based on the amount of protein expressed per cell. This metric is determined empirically for the expanded modified MSC population. In some instances, this metric may be generally predictable based on the nature of the modified MSC (for example, method of modification, Factor VIII gene sequence, vWF gene sequence, vector and vector components).
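As a non-limiting illustration of the per-kilogram dosing described above, the total number of MSC in one injection may be computed by multiplying the chosen per-kilogram dose by the subject's weight. The following Python sketch is hypothetical: the function name and the example weight are assumptions, and the range check simply mirrors the range of about 10^5 to about 10^9 MSC per kilogram recited above.

```python
MIN_CELLS_PER_KG = 1e5  # lower end of the recited per-kilogram range
MAX_CELLS_PER_KG = 1e9  # upper end of the recited per-kilogram range


def total_cells_per_injection(cells_per_kg: float, weight_kg: float) -> float:
    """Total MSC for one injection = per-kilogram dose x subject weight."""
    if weight_kg <= 0:
        raise ValueError("subject weight must be positive")
    if not (MIN_CELLS_PER_KG <= cells_per_kg <= MAX_CELLS_PER_KG):
        raise ValueError("per-kilogram dose is outside the 1e5 to 1e9 range")
    return cells_per_kg * weight_kg


if __name__ == "__main__":
    # Hypothetical subject weighing 30 kg dosed at 1e6 MSC per kilogram.
    print(total_cells_per_injection(1e6, 30.0))
```

The per-kilogram dose itself would still be chosen empirically from the amount of protein expressed per cell, as stated above; the sketch covers only the final multiplication.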
[0060] In some instances, modified MSC are injected into the subject once, twice, 3 times, 4 times, 5 times, 6 times, 7 times, 8 times, 9 times, or 10 times. In some instances, modified MSC are injected into the subject at least once, at least twice, at least 3 times, at least 4 times, at least 5 times, at least 6 times, at least 7 times, at least 8 times, at least 9 times, or at least 10 times. For example, the modified MSC may be injected as multiple injections on the same day. In some instances, the modified MSC may be injected into the subject on multiple days. In some instances, the subject is injected with modified MSC on a first day and then the subject may be monitored over a period of time (days or weeks) to determine if there is sufficient protein expression to provide the desired therapeutic benefit. In some instances, the amount of protein expression in the subject's blood of Factor VIII protein, vWF protein, or both, may be monitored. In some instances, the efficiency of the subject's blood to clot may be assessed using routine blood clotting tests known in the art. In some instances, the subject's symptoms relating to joint pain and/or inflammation may be assessed. Where monitoring indicates that the amount of expression of Factor VIII protein alone, or the amount of expression of Factor VIII protein and vWF protein, is insufficient, the subject's disease symptoms are not alleviated, or both, the subject may be injected with modified MSC on a second day. Again, the subject may be monitored over a period of time to determine if there is sufficient protein expression to provide the desired therapeutic benefit. These steps may be repeated for a third, fourth, fifth, sixth, seventh, eighth, ninth, or tenth injection as needed to achieve the desired therapeutic benefit of alleviating the subject's disease symptoms.
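As a non-limiting illustration, the inject-then-monitor sequence described above can be viewed as a bounded loop: an injection is given, the subject is monitored, and a further injection is given only if the monitoring indicates an insufficient response. The following Python sketch is hypothetical: the function name and the callback standing in for the clinical assessment are assumptions, and the cap of ten injections follows the maximum number listed above.

```python
from typing import Callable


def injections_given(is_sufficient_after: Callable[[int], bool],
                     max_injections: int = 10) -> int:
    """Return how many injections are given before monitoring shows benefit.

    `is_sufficient_after(n)` is a stand-in for the monitoring step performed
    after the n-th injection (protein levels, clotting tests, symptoms).
    """
    if max_injections < 1:
        raise ValueError("at least one injection must be allowed")
    for n in range(1, max_injections + 1):
        # Injection n is given here; the subject is then monitored.
        if is_sufficient_after(n):
            return n
    # Monitoring never indicated sufficiency within the allowed injections.
    return max_injections


if __name__ == "__main__":
    # Hypothetical subject whose response becomes sufficient after the third injection.
    print(injections_given(lambda n: n >= 3))
```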
[0061] In some instances, the use of MSC as cellular vehicles to deliver a Factor VIII gene sequence, a vWF gene sequence, or both, to a subject (as opposed to administration of vector directly) may overcome limitations/risks observed to-date in AAV-based clinical trials for hemophilia: 1) the possibility of off-target transduction of troubling cell types, such as germline cells; 2) the inability to treat patients with pre-existing antibodies to the serotype of AAV being employed as a vector; and 3) the transient hepatotoxicity induced by the AAV capsid, that triggers subsequent immune/inflammatory destruction of many of the transduced cells. Although early studies in vitro and in normal and hemophilia A mice, have used unselected stromal cells (isolated based solely upon plastic adherence) as cellular vehicles for delivering exogenous Factor VIII, no attempts have yet been made to use phenotypically-defined MSC/pericytes to deliver FVIII in vivo in any preclinical model of hemophilia A.
[0062] In some instances, the use of MSC as cellular vehicles to deliver a therapeutic gene is also an improvement over the use of hematopoietic stem cells (HSC), as have been used in most cell-based gene therapy trials. The use of MSC eliminates the possibility of insertional leukemogenesis, which is the most serious adverse event seen to-date in clinical gene therapy trials. A successful outcome of the proposed studies targeting hemophilia A thus promises to open the door to safe correction of a variety of congenital disorders using MSC to deliver the therapeutic gene.
EXAMPLES
Example 1. Animal Model
[0063] Applicant re-established a line of sheep that emulates the genetics, inhibitor formation, and clinical symptoms of the severe form of human hemophilia A (HA), including the development of frequent, spontaneous hematomas and crippling hemarthroses, making them unique among the HA models. See Porada, C. D., et al., Clinical and molecular characterization of a re-established line of sheep exhibiting hemophilia A. J Thromb Haemost, 2010. 8(2): 276-285. Using unique antibodies developed to various regions of the ovine FVIII protein, it was determined that these sheep do not produce any FVIII antigen (FIG. 1), as demonstrated by the complete lack of any staining within the liver of two of the HA sheep, which is in marked contrast to the widespread bright staining that is seen in the liver from a normal/healthy sheep. As such, they should be an excellent model of severe, cross-reacting material (CRM)-negative hemophilia A patients. Additionally, sheep are close in size to humans, their immune system is quite similar to that of humans, and their long lifespan allows long-term efficacy and safety to be addressed.
Example 2. Preliminary Pilot Study--Allogeneic Cells
[0064] A pilot study was performed on 2 pediatric subjects from the HA sheep model described in Example 1. See Porada, et al., Phenotypic correction of hemophilia A in sheep by postnatal intraperitoneal transplantation of FVIII-expressing MSC. Exp Hematol, 2011. 39(12):1124-1135. During the first 3-5 months of life, both animals had received frequent, on-demand infusions of human FVIII (hFVIII) for multiple hematomas and chronic, progressive, debilitating hemarthroses of the leg joints which had resulted in severe defects in posture and gait, rendering them nearly immobile. Thus, for these subjects, FVIII was presented in the context of "danger signals", which is known to trigger a robust host immune response to FVIII and other proteins.
[0065] Haploidentical MSC from the ram that had sired the two HA lambs were used for the therapy. The MSC were modified to introduce via transduction a B domain-deleted, wild-type porcine FVIII cDNA as described in Porada et al., Phenotypic correction of hemophilia A in sheep by postnatal intraperitoneal transplantation of FVIII-expressing MSC. Exp Hematol. 2011; 39(12):1124-1135. MSC were simultaneously transduced with 2 lentivectors; one encoded eGFP for in vivo tracking of donor cells, and the second encoded an expression/secretion optimized porcine FVIII (pFVIII) transgene previously shown to be expressed/secreted from human cells at 10-100 times higher levels than hFVIII or sheep (ovine) FVIII (oFVIII). See Gangadharan et al., High-level expression of porcine factor VIII from genetically modified bone marrow-derived stem cells. Blood, 2006. 107(10):3859-64; Doering et al., Directed Engineering of a High-expression Chimeric Transgene as a Strategy for Gene Therapy of Hemophilia A. Mol Ther, 2009. 17(7):1145-54; Doering et al., Identification of porcine coagulation factor VIII domains responsible for high level expression via enhanced secretion. J Biol Chem, 2004. 279(8):6546-52; Dooriss et al., Comparison of Factor VIII Transgenes Bioengineered for Improved Expression in Gene Therapy of Hemophilia A. Hum Gene Ther, 2009. 20(5):465-78; Ide, L. M., et al., Hematopoietic stem-cell gene therapy of hemophilia A incorporating a porcine factor VIII transgene and nonmyeloablative conditioning regimens. Blood, 2007. 110(8):2855-63; and Johnston et al., Generation of an optimized lentiviral vector encoding a high-expression factor VIII transgene for gene therapy of hemophilia A. Gene Ther, 2013. 20(6):607-15.
[0066] FVIII/GFP-expressing MSC were then expanded and transplanted by IP injection (30.times.10.sup.6), in the absence of any preconditioning, into the first lamb. Following transplant, this animal's clinical picture improved dramatically, and he enjoyed an event-free clinical course, devoid of spontaneous bleeds, obviating the need for hFVIII infusions. Even more remarkably, the animal's existing hemarthroses resolved, his joints recovered fully, and he regained normal posture and gait, resuming a normal activity level. To the inventors' knowledge, this represents the first report of phenotypic correction of severe HA in a large animal model following transplantation of cells modified to express FVIII, and is the first time that reversal of chronic debilitating hemarthroses has been achieved in any setting.
[0067] Based on this remarkable clinical improvement, the modified MSC were transplanted into the second animal using an identical procedure, but a higher cell dose (125.times.10.sup.6). As with the first animal, hemarthroses present in this second animal at the time of transplant resolved, he resumed normal activity shortly after transplantation, and he became factor-independent.
[0068] Interestingly, despite the marked phenotypic improvement in both these animals, no circulating FVIII activity was detectable following the transplant, most likely due to the presence of high-titer inhibitors in these animals. These findings are remarkable, since despite the high titers of antibodies/inhibitors present in these animals, the transplanted allogeneic (haploidentical) MSC persisted and were not eliminated by the recipient's immune system, and the therapeutic effect of the treatment was maintained, i.e., the animals' symptoms of spontaneous joint bleeds, hematomas, and bleeding upon needle stick all improved.
Example 3. Mechanistic Study--Autologous Cells
[0069] Twenty female HA carriers will be artificially inseminated (AI) via laparoscopy, as done in Example 2, with the support of the North Carolina State Theriogenology/Ruminant Medicine team. At 50-70 days of gestation (term: 150 days), amniotic fluid will be collected, and fetal cells from the amniotic fluid will be isolated, cultured, and expanded, using standard methods in our lab. Given the severe phenotype of these animals, we will perform a PCR-based RFLP (see Porada, C. D., et al., J Thromb Haemost, 2010. 8(2):276-85) to identify affected fetuses, allowing us to plan for their subsequent delivery. Following amniocentesis, the animals will be allowed to complete term. When the sheep have nearly completed gestation, the pregnant ewes carrying affected fetuses will be placed under close observation, and ewes will either be induced into labor using intramuscular dexamethasone for natural delivery, or the lambs will be delivered by Caesarian section, with clinical veterinarians assisting in either case. Both approaches have been used previously with success.
[0070] Affected lambs will be treated immediately with recombinant full-length or B-domain deleted ovine FVIII (oFVIII) produced as described in Zakas, P. M., et al., Development and characterization of recombinant ovine coagulation factor VIII. PLoS One, 2012. 7(11):e49481. Although we have found that oFVIII is not a very high-expressing FVIII variant when compared to FVIII from other species, oFVIII protein for transfusion and an oFVIII transgene in the vectors are being used because the consensus in the hemophilia field is that the use of "same-species" FVIII is essential in preclinical gene therapy-based studies to accurately model the potential immune response in the clinical arena.
[0071] While human cells may be needed to perform definitive clinical studies, human cells are not appropriate for these mechanistic studies because using human cells would not allow us to address the critical issue of whether the use of autologous cells results in higher levels of long-term engraftment than we achieved in our pilot study with allogeneic cells, and whether the use of autologous cells may reduce the incidence of inhibitor formation. For this reason, sheep MSC will be used throughout this study.
[0072] It is our goal to treat HA with this MSC-based delivery system during the first 18 months of postnatal human life, since this is the time by which most HA patients would be diagnosed. Sheep develop much faster than humans and are weaned at 60-90 days of age, a period we take to correspond to roughly the first 12-18 months of human life. We will therefore test the MSC-based treatment during the first 2-3 months of life in the sheep.
[0073] Starting at birth, HA lambs will be treated prophylactically 2-3 times per week with recombinant oFVIII. At 4-5 weeks of age, we will collect bone marrow (under oFVIII coverage), and isolate MSC from each affected lamb, as we have done previously. These methods enable us to successfully establish primary sheep MSC that are phenotypically and functionally similar to their human counterparts; these sheep MSC are devoid of hematopoietic cells (they lack CD11b, CD34, and CD45), but they express the MSC markers CD146 and CD90. See FIG. 2. Anti-ovine antibodies to other antigens routinely used to identify MSC, such as CD44, CD105, and CD73, are not commercially available. Immunofluorescence microscopy demonstrated expression of vimentin and α-smooth muscle actin, known MSC cytoskeleton proteins, and we verified that these sheep MSC were able to differentiate into adipocytes (by Oil Red O staining) and osteocytes (by calcium deposition and alkaline phosphatase). See FIG. 2.
[0074] We will then subject the isolated MSC to 2-3 rounds of transduction with either the EF1α-[oFVIII] lentivirus vector (FIG. 3) alone, or simultaneously with the EF1α-[oFVIII] lentivirus vector and the CAG-[vWF] lentivirus vector (FIG. 4). See De Meyer, S. F., et al., Phenotypic correction of von Willebrand disease type 3 blood-derived endothelial cells with lentiviral vectors expressing von Willebrand factor. Blood, 2006. 107(12): 4728-36. The EF1α-[oFVIII] lentivirus vector contains a B-domain deleted oFVIII gene having the polynucleotide sequence set forth in SEQ ID NO: 33. The CAG-[vWF] lentiviral vector contains a truncated vWF having the polynucleotide sequence set forth in SEQ ID NO: 29. From the pilot study described in Example 2, we know that 2-3 rounds of simultaneous exposure to two different lentivirus vectors results in transduction of 90-95% of sheep MSC with both vectors. The lentivirus vectors for these studies both contain the 3'-modified LTR to produce SIN lentivirus vectors, and the constitutive EF1α promoter will drive FVIII expression. Due to packaging constraints, it is not possible to utilize the EF1α promoter in the vWF lentivirus vector. Prior studies have established the ability of vWF to be packaged within a lentivirus vector and expressed at high levels. See De Meyer, S. F., et al., 2006 (above). However, if any difficulties arise packaging vWF-encoding lentivirus vectors, we will utilize truncated vWF cassettes as described in this disclosure, or switch to foamy virus vectors, as they possess a much larger packaging capacity.
[0075] We are including a group in which autologous MSC are transduced with vectors encoding both oFVIII and vWF (ovine vWF; GI:426227037) for two reasons: 1) binding to vWF stabilizes FVIII and prolongs its half-life and, thus, delivery of MSC secreting FVIII complexed with vWF may produce a more pronounced therapeutic effect; and 2) vWF may reduce the immunogenicity of exogenously administered FVIII by preventing both its uptake and presentation by dendritic cells, and its recognition by immune effector cells. We predict that delivering the two proteins in the same vector will result in the release of FVIII:vWF as a complex from the transduced MSC. We will confirm co-localization/complex formation of vector-derived oFVIII and vWF in transduced MSC populations by confocal microscopy prior to performing the proposed transplants. Although MSC do not endogenously produce any vWF, we will add a 6-His tag (SEQ ID NO:32) to the vWF transgene, making it readily distinguishable from any trace endogenous vWF for these in vitro studies.
[0076] Following transduction, the FVIII-expressing MSC will be expanded until the animals have reached 2-3 months of age, at which point the MSC will be transplanted autologously into their respective donor HA lamb, via IP injection under ultrasound guidance, with no preconditioning (as in our pilot study [23]), using a dose of 5-10×10⁶ cells/kg. An aliquot of each cell type will be reserved to determine vector copy number by qPCR and to perform integration site analysis by LM-PCR. The qPCR method is described in Porada, C. D., et al., Phenotypic correction of hemophilia A in sheep by postnatal intraperitoneal transplantation of FVIII-expressing MSC. Exp Hematol, 2011. 39(12): 1124-1135; the LM-PCR method is described in Russo-Carbolante, E. M., et al., Integration pattern of HIV-1 based lentiviral vector carrying recombinant coagulation factor VIII in Sk-Hep and 293T cells. Biotechnol Lett, 2011. 33(1):23-31 and Tellez, J., et al., High Incidence of Vector Integration Near Cancer Related Genes within Primitive Hematopoietic Stem Cells (HSC) After Fetal Gene Transfer with γ-Retroviral Vectors. Molecular Therapy, 2010. 18(Suppl. 1): p. S331. Two experimental groups will be included: 1) autologous MSC transduced with the EF1α-[oFVIII] lentivector (n=2-3 HA lambs); and 2) autologous MSC transduced with both the EF1α-[oFVIII] and CAG-vWF lentivectors (n=2-3 HA lambs).
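By way of non-limiting illustration, the weight-based dose range stated above can be converted to a total cell number per animal. The following is a minimal sketch only: the function name and the 20 kg body weight are hypothetical and are not taken from the study.

```python
def cell_dose_range(body_weight_kg, low_per_kg=5e6, high_per_kg=10e6):
    """Return (minimum, maximum) total cells for a 5-10 x 10^6 cells/kg dose.

    The per-kg limits default to the dose range stated in the text;
    the body weight is supplied by the caller.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return body_weight_kg * low_per_kg, body_weight_kg * high_per_kg


# Hypothetical example: a 20 kg lamb (weight chosen for illustration only).
low, high = cell_dose_range(20.0)
print(f"Total dose: {low:.2e} to {high:.2e} cells")
```

Such a calculation may be used to estimate how far the transduced MSC population must be expanded before transplantation.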
[0077] Following transplantation, prophylactic oFVIII infusions will be discontinued, and any benefit as a result of this MSC-based approach should be readily apparent, given the severe, life-threatening phenotype of these animals. The sheep will be continually monitored for bleeds, and platelet-deficient plasma will be collected monthly until at least 1.5 years of age for coagulation assays, to quantify the plasma levels of oFVIII by chromogenic assay and/or ELISA. The formation/presence of inhibitors will be assessed at each time point by performing the Nijmegen modification of the Bethesda assay (as described in Verbruggen, B., et al., The Nijmegen modification of the Bethesda assay for factor VIII:C inhibitors: improved specificity and reliability. Thromb Haemost, 1995. 73(2):247-51) and a commercially available kit (Technoclone/DiaPharma Group, Inc.) on an aliquot of plasma collected from the animals. Once we have obtained these values, we will compare the HA lambs that received MSC transduced with the lentivector encoding oFVIII alone to those that received MSC transduced with both the oFVIII and vWF lentivectors, and compare each of these to the historic values from untransplanted HA sheep, and to a reference panel of normal unaffected males. These studies will allow us to: 1) establish what levels of vector-encoded oFVIII are expressed as a result of this postnatal approach; 2) determine the duration of the therapeutic effect of this approach; 3) assess whether using autologous MSC and a lentivector that lacks eGFP avoids the inhibitors seen in the pilot study of Example 2; and 4) establish whether including vWF improves the therapeutic effect of this MSC-based treatment and/or reduces the incidence/titer of inhibitor formation.
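For illustration, an inhibitor titer can be derived from the residual FVIII activity measured in a Bethesda-type assay. The sketch below assumes the conventional definition, in which one Bethesda unit (BU) is the amount of inhibitor that leaves 50% residual FVIII activity, so that the titer equals the sample dilution factor multiplied by log2(100 / residual %). The present disclosure does not recite this formula, and the function name and example values are hypothetical.

```python
import math


def bethesda_units(residual_activity_pct, dilution_factor=1.0):
    """Inhibitor titer in BU/mL from residual FVIII activity (percent of control).

    Assumes the conventional definition: 50% residual activity = 1 BU.
    """
    if not 0 < residual_activity_pct <= 100:
        raise ValueError("residual activity must be in the range (0, 100]")
    if dilution_factor <= 0:
        raise ValueError("dilution factor must be positive")
    return dilution_factor * math.log2(100.0 / residual_activity_pct)


# Hypothetical example: 25% residual activity in a 1:4 diluted sample.
print(bethesda_units(25.0, dilution_factor=4.0))
```

In practice, laboratories typically read the titer only from dilutions whose residual activity falls near the middle of the range, so a single-point calculation of this kind should be regarded as an approximation.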
[0078] At 1.5 years of age (or sooner, if we see that FVIII levels are dropping), the HA lambs will be euthanized, and all major organs will be harvested. All tissues will be fixed in 4% paraformaldehyde, processed through a sucrose gradient, embedded and frozen in OCT, and sectioned at 5 μm. 8-10 slides/tissue will be stained with an antibody specific to oFVIII and analyzed/quantitated by confocal microscopy for the presence of engrafted MSC to precisely determine the levels and localization (parenchymal vs. perivascular) of engrafted cells that are expressing FVIII, and are therefore providing therapeutic benefit. We will also collect plasma from each of these recipients at the time of euthanasia, and quantitate the circulating levels of vector-derived oFVIII in these sheep using an ELISA specific for oFVIII, correlating levels and patterns of engraftment with circulating FVIII levels. Based on ease of access to the circulation, we hypothesize that maximal plasma FVIII levels will be obtained when MSC lodge in perivascular regions of the engrafted tissues.
[0079] While confocal analysis should provide us with a fairly accurate estimate of the levels of oFVIII+MSC within each tissue, the slides selected for quantitation may or may not be representative of the engraftment levels within the organ as a whole. Therefore, we will also use an ELISA to precisely quantitate the amount of oFVIII within tissue homogenates. A standard curve will be created with known numbers of oFVIII+MSC, thereby establishing how much oFVIII is present on a per-cell basis. Protein extracts will then be prepared from the tissues from each animal and analyzed using this ELISA, comparing the tissue values to that of the standard curve, to precisely quantitate the number of MSC that have engrafted within each tissue and are expressing oFVIII. We will then compare the levels of donor MSC in each tissue with the resultant plasma FVIII levels to determine in which tissues engraftment produces the highest circulating levels of FVIII.
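The standard-curve approach described above may be sketched as follows. This is an illustrative sketch only: it assumes that the ELISA signal is linear in the number of oFVIII+ MSC over the range of the standards, and every numerical value and function name shown is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standards must span more than one cell number")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cells_from_signal(signal, slope, intercept):
    """Invert the standard curve to estimate the number of oFVIII+ cells."""
    return (signal - intercept) / slope


# Hypothetical standards: known cell numbers and measured oFVIII (ng).
standard_cells = [0, 1e4, 5e4, 1e5]
standard_ofviii_ng = [0.0, 0.2, 1.0, 2.0]
slope, intercept = fit_line(standard_cells, standard_ofviii_ng)

# Hypothetical tissue homogenate reading of 1.5 ng oFVIII.
print(round(cells_from_signal(1.5, slope, intercept)))
```

The slope of the fitted line corresponds to the amount of oFVIII per cell, which is the per-cell value the standard curve is intended to establish.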
Example 4. Study Assessing Treatment in Subjects with Pre-Existing Inhibitors
[0080] The findings of the study described in Example 3 will be used to determine the ability of this autologous cell-based approach to mediate clinical/phenotypic improvement in recipients with pre-existing inhibitors as a result of on-demand FVIII treatment.
[0081] Three to four HA lambs will be treated on-demand with oFVIII beginning at birth. We expect, based on the pilot study of Example 2, that treating on-demand with FVIII products will result in the formation of inhibitors in almost all HA sheep by 4-5 months of age. We will collect bone marrow and isolate MSC at 4-5 weeks of age, and transduce these cells with either the EF1α-[oFVIII] lentivector alone, or with both the EF1α-[oFVIII] and CAG-vWF lentivectors (depending on which approach yields the best outcome in the preceding studies), and expand the transduced cells to obtain adequate numbers for transplant.
[0082] Beginning at birth, we will draw blood from these animals every other week to obtain plasma and perform the Nijmegen modification of the Bethesda assay (as above; Technoclone/DiaPharma Group, Inc.) to assess the development of inhibitors. Once inhibitors have developed, we will transplant the transduced autologous MSC into each animal by ultrasound-guided IP injection, as in Example 3, at a dose of 5-10×10⁶ cells/kg.
[0083] Following transplantation, we will analyze the animals as detailed in Example 3, continually monitoring them for bleeds, and collecting platelet-deficient plasma monthly until at least 1 year of age for coagulation assays, to quantify the plasma levels of oFVIII by chromogenic assay and/or ELISA, and to quantitate the levels of inhibitors present, to ascertain whether the MSC-based treatment impacts upon the levels of the pre-existing inhibitors. We will then compare the results obtained with these HA lambs with pre-existing inhibitors to those in Example 3 in which the HA lambs lacked inhibitors at the time of MSC infusion.
Example 5. First Line Therapy Study
[0084] We hypothesize that previously untreated patients (PUPs) represent the ideal group to initially target with this MSC-based treatment, because their immune systems are completely naive to exogenous FVIII, and will be exposed to it for the first time when it is released by the transplanted MSC; we anticipate this will reduce/eliminate the risk of inhibitor formation in this population.
[0085] In families with no prior history, HA is normally diagnosed during the first 18 months of life, after the child exhibits abnormal bruising/bleeding after a minor trauma. In families with a history of HA, diagnosis can be made at birth, or even prior to birth [148-158]. Regardless of when diagnosis is made, however, it would not be possible to collect bone marrow MSC from these patients without treating with factor to prevent hemorrhage during the procedure. This same issue exists with the HA sheep, since they present with a severe phenotype and spontaneous bleeding from birth. However, in similarity to patients with a family history of HA, the affected sheep can be diagnosed in utero by amniocentesis (as can be done in human patients), making it possible to collect autologous cells from the amniotic fluid part way through gestation, transduce these cells, expand them, and have them ready to transplant as the first-line therapy at birth, or shortly thereafter. We recently found that MSC-like cells present within the amniotic fluid, "AF-MSC", are readily transduced with lentivirus vectors, they endogenously produce low levels of FVIII (FIG. 5) and vWF (FIG. 6), and they express very high levels of FVIII following lentivector transduction (FIG. 7). These findings suggest that amniotic fluid can replace marrow as a source of autologous MSC for delivering a FVIII transgene, enabling us to test the MSC-based treatment's efficacy as first-line therapy in PUPs.
[0086] To test the efficacy of the MSC-based treatment in PUPs, HA carrier ewes will be bred or artificially inseminated as detailed in Example 3. At 50-60 days of gestation (term: 150 days), amniotic fluid will be collected, AF-MSC isolated, and a PCR-based RFLP performed to detect the HA mutation, as detailed in Example 3. AF-MSC from the affected fetuses will then be subjected to 2-3 rounds of transduction with the EF1α-[oFVIII] lentivirus vector comprising the polynucleotide sequence set forth in SEQ ID NO: 33 (FIG. 3) and expanded. At near term, labor will be induced in the ewes with affected lambs, or a C-section performed to deliver the lambs. Immediately after birth, the HA lamb "PUPs" (n=3-4) will be injected IP with the transduced autologous AF-MSC at a dose of 5-10×10⁶ cells/kg, as described for the BM-MSC in Example 3.
[0087] Following transplantation, we will analyze the animals as detailed in Examples 3 and 4, continually monitoring them for bleeds, and collecting platelet-deficient plasma monthly until at least 1 year of age to: 1) perform coagulation assays; 2) quantify the plasma levels of oFVIII by chromogenic assay and/or ELISA; and 3) perform the Nijmegen modified Bethesda assay to assess the development of inhibitors. We will then compare the results obtained by using this MSC-based treatment as a first-line therapy in these HA lamb "PUPs" to the results obtained in the HA lambs treated prophylactically (Example 3) and to those treated on-demand (Example 4) prior to MSC infusion.
Example 6. Study to Assess Induction of Immune Tolerance to Factor VIII
[0088] We hypothesize that the continued delivery of FVIII to the circulation by the lentivector-modified MSC can serve as a much-needed novel method of inducing immune tolerance to FVIII.
[0089] The aim of this study is to test this MSC-based approach as a novel method of inducing immune tolerance through the continued delivery of FVIII to the circulation by the genetically-modified MSC. To accomplish this objective, 2-3 HA lambs (more will be added if initial data are not clear-cut) will be treated on-demand with oFVIII, beginning at birth, as we know that treating on-demand with FVIII products results in the formation of inhibitors in almost all HA sheep by 4-5 months of age. As in Example 4, we will isolate BM-MSC at 4-5 weeks of age, and transduce these cells with both the EF1α-[oFVIII] and CAG-vWF lentivectors, as clinical data indicate that the inclusion/presence of vWF may facilitate immune tolerance induction (ITI) [123, 159, 160]. The transduced cells will then be expanded to obtain adequate numbers for subsequent transplant.
[0090] Beginning at birth, we will draw blood from these animals every other week to obtain plasma and perform the Nijmegen modified Bethesda assay (as above) to assess inhibitor induction. Once inhibitors have developed, we will transplant transduced autologous MSC, at a dose of 10' cells/kg, into the peritoneal cavity of each animal, as in Example 3. This procedure will be repeated every 4-5 days until we observe a drop in inhibitor titer (as detailed below); a maximum of 10 infusions will be given initially.
[0091] Following transplantation, we will analyze the animals as in Example 3, continually monitoring for bleeds (as the repeated MSC-based treatment should produce clinical/phenotypic improvement), and collecting platelet-deficient plasma bi-weekly for ≥3 months, to perform coagulation assays, to quantify the oFVIII plasma levels by chromogenic assay and/or ELISA, and to quantitate the levels of inhibitors present, to ascertain whether the repeated infusion of MSC expressing high levels of FVIII can break the existing inhibitors and induce tolerance to FVIII. To further confirm that this cell-based ITI has overcome the existing inhibitors, we will assess the restoration of normal FVIII pharmacokinetics, using well-established methodology (plasma FVIII recovery ≥66% of expected and a ≥6 h half-life, determined following a 72-hour FVIII-exposure-free period).
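The pharmacokinetic criteria stated above (recovery of at least 66% of expected and a half-life of at least 6 hours) can be checked as sketched below. This is an illustration only: the half-life estimate assumes simple first-order decay between two post-infusion measurements, which the disclosure does not specify, and the function names and example values are hypothetical.

```python
import math


def half_life_hours(t1_h, activity1, t2_h, activity2):
    """Half-life from two activity measurements, assuming first-order decay."""
    if t2_h <= t1_h:
        raise ValueError("second time point must be later than the first")
    if not activity1 > activity2 > 0:
        raise ValueError("activity must be positive and decreasing")
    rate_constant = math.log(activity1 / activity2) / (t2_h - t1_h)
    return math.log(2) / rate_constant


def pharmacokinetics_restored(recovery_pct, half_life_h):
    """Apply the two thresholds given in the text: >=66% recovery, >=6 h half-life."""
    return recovery_pct >= 66.0 and half_life_h >= 6.0


# Hypothetical example: activity falls from 80% to 40% of normal over 8 hours,
# with a measured recovery of 70% of the expected value.
t_half = half_life_hours(1.0, 80.0, 9.0, 40.0)
print(pharmacokinetics_restored(70.0, t_half))
```

Both thresholds must be met for the criteria to be satisfied; a normal recovery with a shortened half-life would not qualify.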
Example 7. Comparison of Recombinant Factor VIII Infusion Therapy with MSC-Based Tolerance Induction
[0092] One of the only clinical options for HA patients who develop inhibitors is immune tolerance induction (ITI), which involves the long-term administration of high doses of FVIII protein. ITI is extremely expensive, is only effective in a percentage of patients with inhibitors, and the mechanism for its success is unknown. To date, no preclinical HA model has been used to study and/or optimize ITI. Given the high incidence of inhibitor formation in the HA sheep and their lack of any cross-reactive material, they represent an excellent model in which to investigate ITI. We propose to perform a head-to-head comparison of traditional ITI, using repeated high-dose recombinant oFVIII, to the MSC-based ITI protocol developed/tested in Example 6.
[0093] To achieve these goals, 2-3 HA lambs will be treated on-demand with oFVIII, beginning at birth, as described in Example 6, until inhibitors develop. We will then commence a clinically employed, protein-based ITI regimen, infusing the inhibitor animals with a dose of 100 IU/kg/day for 3 months (as described in Oldenburg, J., et al., Primary and rescue immune tolerance induction in children and adults: a multicentre international study with a VWF-containing plasma-derived FVIII concentrate. Haemophilia, 2014. 20(1):83-91). During the course of this ITI protocol, we will analyze the animals as detailed in Example 6, collecting platelet-deficient plasma weekly, to: 1) perform coagulation assays; 2) quantify the plasma levels of oFVIII by chromogenic assay and/or ELISA; and 3) quantitate the levels of inhibitors present, to assess the ability of the protein-based ITI to break existing inhibitors, and define the kinetics with which this happens. To further confirm that ITI has overcome existing inhibitors, we will assess the restoration of normal FVIII pharmacokinetics, as detailed above. The success rate and kinetics of tolerance induction with the protein-based ITI will be compared to those of the cell-based protocol in Example 6, to determine whether the cell-based method is a viable alternative to the time consuming and expensive protein-based method that represents the current state-of-the-art in clinical ITI.
Example 8. Transduction Efficiency with Different Vectors
[0094] The transduction efficiency, FVIII production, and FVIII secretion of human PLC were compared following transduction at an identical multiplicity of infection (MOI) of 7.5 with an identical lentiviral vector (LV) encoding one of the following four different FVIII transgenes: (1) a bioengineered human-porcine hybrid FVIII (ET3) having the polynucleotide sequence set forth in SEQ ID NO: 11; (2) a liver codon-optimized ET3 (lcoET3) having the polynucleotide sequence set forth in SEQ ID NO: 10; (3) a liver codon-optimized human FVIII (lcoHSQ) having the polynucleotide sequence set forth in SEQ ID NO: 14; and (4) a myeloid-codon optimized ET3 (mcoET3) having the polynucleotide sequence set forth in SEQ ID NO: 12. Brown et al. (2018) Mol. Ther. Methods Clin. Dev. 9:57-69, demonstrated that vectors encoding FVIII, when codon-optimized to the target cells or tissue, result in dramatically increased expression of functional FVIII. Following transduction, PLCs were analyzed by flow cytometry and confocal microscopy to measure transduction efficiency and FVIII production. Conditioned media of PLCs were assayed by aPTT to quantitate FVIII activity. Analysis of the culture supernatants by aPTT demonstrated FVIII activity was readily detectable in supernatants of all transduced cell lines. It also revealed marked differences in the secretion of functional FVIII following transduction with each of these vectors. Specifically, PLCs transduced with mcoET3 (SEQ ID NO: 12), ET3 (SEQ ID NO: 11), lcoET3 (SEQ ID NO: 10), and lcoHSQ (SEQ ID NO: 14) LV secreted 25±9, 19±8, 11±2, and 1±0.1 IU of FVIII/24 h/10⁶ cells, respectively (FIG. 10). PLC population doubling time was not affected by transduction with any of the vectors; nor were phenotype or expression of signaling molecules involved in innate immunity.
Importantly, at passage 3 following transduction with any of the 4 lentiviral vectors, PLCs continued to produce and secrete FVIII at similar levels to those observed shortly after transduction, demonstrating stable vector integration and durability/longevity of FVIII expression. The relative levels of FVIII expression by PLCs following transduction with each lentiviral vector were also assessed by immunofluorescence microscopy with an antibody specific to a region of FVIII that is conserved in all 4 FVIII transgenes. These analyses confirmed the results of the aPTT analyses on the supernatants from these cells, with mcoET3-PLC exhibiting the brightest/highest intensity staining for FVIII, followed by PLC transduced with ET3 (SEQ ID NO: 11), then those transduced with lcoET3 (SEQ ID NO: 10) and with lcoHSQ (SEQ ID NO: 14) (data not shown).
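To make the relative differences between the four transgenes explicit, the mean secretion values reported in this Example can be expressed as fold changes over the lowest-secreting construct. The sketch below uses only the reported means (in IU of FVIII per 24 h per 10⁶ cells) and ignores the reported variability; the variable names are hypothetical.

```python
# Mean FVIII secretion reported for each transgene (IU / 24 h / 10^6 cells).
mean_secretion = {
    "mcoET3": 25.0,
    "ET3": 19.0,
    "lcoET3": 11.0,
    "lcoHSQ": 1.0,
}

# Use the lowest-secreting construct as the reference for fold change.
reference = min(mean_secretion, key=mean_secretion.get)
fold_over_reference = {
    name: value / mean_secretion[reference]
    for name, value in mean_secretion.items()
}

# Print constructs from highest to lowest secretion.
for name, fold in sorted(fold_over_reference.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {fold:.0f}-fold relative to {reference}")
```

Because the standard deviations are not propagated, these ratios indicate rank order and approximate magnitude only.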
[0095] The gene transfer efficiency of these gene-modified cells was assessed by determining the final proviral/vector copy number (VCN) using a commercially available qPCR-based kit (Lenti-X Provirus Quantitation Kit, Takara Bio USA, Inc., Mountain View, Calif.). To ensure that only integrated copies were detected by the assay, qPCR for VCN was performed in PLCs that had been passaged at least three times after transduction. After transducing the cells at the same MOI (7.5) with each lentiviral vector, the VCNs for mcoET3-PLC, lcoHSQ-PLC, lcoET3-PLC, and ET3-PLC were all around 1.
Example 9. Optimizing Factor VIII Expression in Placental Cells
[0096] The aim of this study was to investigate the suitability of placental cells (PLC) as cellular delivery vehicles for FVIII. The expression of phenotypic markers was determined in three different master cell banks (101, 103, and 104) of PLCs, each of which was derived from a different human donor by the Regenerative Medicine Clinical Core (RMCC) at WFIRM following GMP-compliant standard operating procedures (SOPs) established by the RMCC for PLC. Expression of CD29, CD44, CD73, CD90, CD105, HLA-ABC, HLA-E, CD31, CD34, CD35, CD144, HLA-G, HLA-DR/DP/DQ, and ABO blood group was determined using flow cytometric analysis. These markers were selected to confirm that the PLC isolated possessed a phenotype characteristic of MSC from other tissues (CD29, CD44, CD73, CD90, CD105), to assess their potential for stimulating an immune response upon transplantation (HLA-ABC, HLA-E, CD35, CD144, HLA-G, HLA-DR/DP/DQ, and ABO blood group), and to discern whether they expressed markers indicative of endothelial properties (CD31, CD34). No statistically significant differences (significance threshold p<0.05) were found in expression of phenotypic markers between PLCs derived from the three master cell banks (101, 103, and 104). PLCs from each of the master cell banks expressed CD29, CD44, CD73, CD90, CD105, HLA-ABC, and HLA-E (FIG. 8A); had negligible amounts (<1%) of CD31, CD34, CD35, CD144, HLA-G, and HLA-DR/DP/DQ (data not shown); and were devoid of ABO blood group (data not shown). Collectively, these findings support the conclusion that PLC are an MSC-like population and that they should exhibit minimal immunogenicity upon transplantation.
[0097] PLCs derived from each of the master cell banks [MCBs] (101, 103, and 104) were assessed for their ability to express FVIII protein constitutively. Immunofluorescence microscopy with a primary antibody specific to hFVIIIc and a fluorochrome-conjugated secondary antibody was used to determine the levels of constitutively expressed FVIII protein, and flow cytometric analysis with fluorochrome-conjugated antibodies was used to define the phenotype of the PLCs. As shown in FIG. 8A, these cells expressed markers characteristic of MSC from bone marrow and other tissues. All three MCBs endogenously expressed detectable amounts of FVIII by immunofluorescence microscopy, and MCB 103 expressed the highest levels, as indicated by the brightest fluorescence intensity (data not shown). The fold increase of FVIII expression over isotype control for PLCs derived from each of the MCBs is presented as relative mean fluorescence intensity (MFI) as shown in FIG. 8B.
[0098] The activated partial thromboplastin time (aPTT or PTT) assay is a functional measure of the intrinsic and common pathways of the coagulation cascade (i.e., it characterizes blood coagulation). The aPTT assay was used to quantitate levels of functional FVIII secreted by PLCs. PLCs were plated at the same density and cultured for 24 hours in Phenol Red-free alpha-MEM AmnioMax Complete Medium (ThermoFisher Scientific, Raleigh, N.C.). Supernatants were collected and the number of cells present counted, and then the levels of secreted functional FVIII were measured by the Clinical Hematology Laboratory at Wake Forest Baptist Health using a commercial aPTT assay. Levels of FVIII were normalized by adjusting to account for the number of cells present at the time of supernatant collection, and expressing FVIII activity on a per cell basis. The data from this analysis are shown in FIG. 8C.
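The per-cell normalization described above may be illustrated as follows. The sketch assumes that the aPTT-derived activity is reported as a concentration (IU/mL) and that the supernatant volume is known, neither of which is stated in the text; the function name and example values are hypothetical.

```python
def fviii_per_million_cells(activity_iu_per_ml, supernatant_volume_ml, cell_count):
    """Express secreted FVIII as IU per 10^6 cells for one collection period.

    Total activity in the supernatant is divided by the number of cells
    counted at the time of collection, then scaled to 10^6 cells.
    """
    if cell_count <= 0:
        raise ValueError("cell count must be positive")
    if activity_iu_per_ml < 0 or supernatant_volume_ml <= 0:
        raise ValueError("activity must be non-negative and volume positive")
    total_iu = activity_iu_per_ml * supernatant_volume_ml
    return total_iu * 1e6 / cell_count


# Hypothetical example: 0.5 IU/mL in 2 mL of supernatant from 5 x 10^5 cells.
print(fviii_per_million_cells(0.5, 2.0, 5e5))
```

Normalizing in this way allows supernatants from cultures containing different numbers of cells to be compared on the same scale.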
[0099] FVIII mRNA levels in the PLCs derived from three different master cell banks (101, 103, and 104) were evaluated by qPCR using primers specific to human FVIII. Relative expression of endogenous mRNA for FVIII was calculated by comparing the threshold cycle (CT) value for FVIII with the CT of each master cell bank's respective internal reference gene, GAPDH. The relative expression of endogenous FVIII mRNA was 0.01.+-.0.0005, 0.075.+-.0.007, and 0.011.+-.0.0002, for PLCs 101, 103, and 104, respectively.
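One common way to compare the CT of a target gene with that of a reference gene is the 2^(-ΔCT) calculation, sketched below. The text does not state which calculation was actually used, so this is an assumption, as is the further assumption of 100% amplification efficiency (a doubling of product per cycle). The CT values shown are hypothetical and are not the study data.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression of a target gene versus a reference gene.

    Uses the 2^(-delta CT) calculation, which assumes that both
    amplicons double with every PCR cycle.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical example: FVIII CT of 30 and GAPDH CT of 25.
print(relative_expression(30.0, 25.0))
```

A lower value indicates that the target transcript is less abundant than the reference transcript, which is consistent with the low endogenous FVIII mRNA levels reported for the three master cell banks.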
Example 10. Suitability of PLCs as a Transgenic FVIII Production Platform
[0100] PLC 101, 103, and 104 were transduced at the same MOI (7.5) using a lentiviral vector (LV) encoding mcoET3 (mcoET3-PLC) as described in Example 8 above. Vector copy number (VCN) was determined, as described in detail above. The VCN was found to be similar between the three different PLC MCBs (0.71-0.75). After transduction, the relative levels of expression of FVIII by the 3 MCB PLCs were assessed by immunofluorescence microscopy after staining with a primary antibody specific to hFVIIIc and a fluorochrome-conjugated secondary antibody. All 3 MCBs expressed high levels of FVIII after transduction with the mcoET3 lentiviral vector, but MCB 103 exhibited the highest levels of FVIII protein, as evidenced by the brightest/highest fluorescence intensity (data not shown). The secretion of FVIII was determined using aPTT performed on 24-hour culture supernatants harvested from PLCs that were plated at the same density and normalized for the number of cells present at the time of the supernatant collection, as described in detail in the preceding paragraphs (FIG. 9A). Levels of FVIII in the culture supernatant increased significantly (p<0.05) in PLCs derived from all 3 MCBs (101, 103, and 104) when compared with respective non-transduced PLC counterparts (FIG. 9B). No significant differences were found between the different transduced cells.
[0101] The effect of transduction of PLCs with LV encoding mcoET3 on phenotype or molecules involved in immunity was assessed. Expression of CD29, CD44, CD73, CD90, CD105, CD58, CD112, CD155, CD47, HLA-ABC, HLA-E, HLA-G, and HLA-DR/DP/DQ was determined by flow cytometric analysis, as described above. No statistically significant differences (significance threshold p<0.05) were found between transduced and non-transduced cells (FIGS. 8A-8C). Additionally, PLC population doubling time was not affected by transduction (data not shown). Both non-transduced and transduced PLCs expressed CD47, a molecule involved in immune evasion (FIG. 8B). Transduced cells did not significantly upregulate the expression of HLA-DR/DP/DQ (FIG. 8C).
[0102] To further examine whether transduction of the PLC with the mcoET3 lentiviral vector had the potential to alter the immunogenicity of these cells, we examined the levels of expression of various Toll-like Receptors (TLRs) on the PLC prior to and following transduction, as these molecules play a key role in innate immunity, and their upregulation could potentially trigger an immune response to the transduced cells upon transplantation. To address this possibility, the effect of PLC transduction with the mcoET3 lentiviral vector on TLR-3, TLR-4, TLR-7, TLR-8, and TLR-9 expression was assessed. TLR expression on transduced (t) and non-transduced (n) PLCs (101, 103, and 104) was determined using flow cytometric analysis. No significant differences in expression of TLR molecules were detected in the PLC populations. As shown in FIG. 11A, there was no difference in the levels of expression of any of these TLRs in transduced vs. non-transduced PLCs, confirming that transduction of these 3 MCB PLCs with the mcoET3 lentiviral vector did not lead to upregulation of any of these immune-stimulating molecules.
[0103] In order to evaluate the demands of PLC transduction with mcoET3 and increased Factor VIII expression on the secretory and endoplasmic reticulum pathways, expression of stress molecules MICA/B, ULBP-1, ULBP-2, and ULBP-3 was determined in transduced (t) and non-transduced (n) PLCs (101, 103, and 104). Flow cytometric analysis demonstrated that no significant expression or alteration/upregulation of MICA/B or ULBP-1 was found before or after transduction with the mcoET3 lentiviral vector (FIG. 11B).
[0104] The production of interferon-gamma (IFN-γ) by mcoET3 transduced and non-transduced PLCs was measured using a high-sensitivity ELISA (assay range: 0.16-10.0 pg/mL). PLCs were cultured for 24 hours in AmnioMax Complete Medium (ThermoFisher). Supernatants were collected and IFN-γ production was determined. No IFN-γ was detected in any of the culture supernatants of mcoET3-transduced or non-transduced PLCs (data not shown).
[0105] All features of the described systems are applicable to the described methods mutatis mutandis, and vice versa.
[0106] All patents, patent publications, patent applications, journal articles, books, technical references, and the like discussed in the instant disclosure are incorporated herein by reference in their entirety for all purposes.
[0107] It is to be understood that the figures and descriptions of the disclosure have been simplified to illustrate elements that are relevant for a clear understanding of the disclosure. It should be appreciated that the figures are presented for illustrative purposes and not as construction drawings. Omitted details and modifications or alternative embodiments are within the purview of persons of ordinary skill in the art.
[0108] It can be appreciated that, in certain aspects of the disclosure, a single component may be replaced by multiple components, and multiple components may be replaced by a single component, to provide an element or structure or to perform a given function or functions. Except where such substitution would not be operative to practice certain embodiments, such substitution is considered within the scope of the disclosure.
[0109] The examples presented herein are intended to illustrate potential and specific implementations of the invention. It can be appreciated that the examples are intended primarily for purposes of illustration for those skilled in the art. There may be variations to these diagrams or the operations described herein without departing from the spirit of the invention. For instance, in certain cases, method steps or operations may be performed or executed in differing order, or operations may be added, deleted or modified.
[0110] Different arrangements of the components depicted in the drawings or described above, as well as components and steps not shown or described, are possible. Similarly, some features and sub-combinations are useful and may be employed without reference to other features and sub-combinations. Aspects and embodiments of the invention have been described for illustrative and not restrictive purposes, and alternative embodiments will become apparent to readers of this patent. Accordingly, the present invention is not limited to the embodiments described above or depicted in the drawings, and various embodiments and modifications can be made without departing from the scope of the claims below.
[0111] While exemplary embodiments have been described in some detail, by way of example and for clarity of understanding, those of skill in the art will recognize that a variety of modifications, adaptations, and changes may be employed. Hence, the scope of the present invention should be limited solely by the claims.
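The sequences listed after this description are printed in ten-base groups with running position counters interleaved (for example `60caggacaaac`). The Python sketch below shows how the residue lines of one sequence could be normalized, checked against the declared length, and scanned for CpG dinucleotides, the feature named in the "lcoET3 with CpGs" and "lcoET3 CpGs removed" entries. It assumes that only the residue lines of a single sequence are passed in. The helper names `clean_sequence` and `count_cpg` are illustrative and not part of the disclosure.

```python
import re


def clean_sequence(raw: str) -> str:
    """Drop position counters, whitespace, and any other non-base characters.

    Assumes `raw` holds only residue lines; header text would be mangled.
    """
    return re.sub(r"[^acgt]", "", raw.lower())


def count_cpg(seq: str) -> int:
    """Count CpG dinucleotides, i.e. a 'c' immediately followed by a 'g'."""
    return seq.count("cg")


# Residue lines of SEQ ID NO: 1 as printed in the listing (declared length: 146).
RAW_SEQ_1 = """
1gttaatcatt
aagtcgttaa tttttgtggc ccttgcgatg tttgctctgg ttaataatct 60caggacaaac
agaggttaat aattttccag atctctctga gcaatagtat aaaaggccag 120cagcagcctg
accacatctc atcctc
"""

seq_1 = clean_sequence(RAW_SEQ_1)

# The cleaned sequence should match the length declared in the listing header.
assert len(seq_1) == 146

print("length:", len(seq_1))
print("CpG dinucleotides:", count_cpg(seq_1))
```

The same two helpers could be applied to any other entry in the listing by pasting its residue lines in place of `RAW_SEQ_1` and comparing the result with that entry's declared length.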
SEQUENCE LISTING
33
1 146 DNA Artificial Sequence synthetic HNF1-shortABP-SynO-TSS
1 gttaatcatt
aagtcgttaa tttttgtggc ccttgcgatg tttgctctgg ttaataatct 60caggacaaac
agaggttaat aattttccag atctctctga gcaatagtat aaaaggccag 120cagcagcctg
accacatctc atcctc
146
2 159 DNA Artificial Sequence synthetic ABPshort-HP1-AFP-TSS
2 gttaattttt
gtggcccttg cgatgtttgc tctggttaat aatctcagga caaacataca 60ttttcagtca
tatgtttgct cactgaaggt tactagttaa caggcatccc ttaaacagga 120tataaaaggc
cagcagcagc ctgaccacat ctcatcctc
159
3 152 DNA Artificial Sequence HNF1-ABP-SynO
3 gttaatcatt aagtcgttaa
tttttaaaaa gcagtcaaaa gtccaagtgg cccttgcgag 60catttactct ctctgtttgc
tctggttaat aatctcagga gcacaaacag aggttaataa 120ttttccagat ctctctgagc
aatagtataa aa
152
4 150 DNA Artificial Sequence synthetic SP1-ABP-SynO
4 tgggcggagt gtcgttaatt tttaaaaagc
agtcaaaagt ccaagtggcc cttgcgagca 60tttactctct ctgtttgctc tggttaataa
tctcaggagc acaaacagag gttaataatt 120ttccagatct ctctgagcaa tagtataaaa
150
5 137 DNA Artificial Sequence synthetic ABP-SynO
5 gttaattttt aaaaagcagt caaaagtcca agtggccctt gcgagcattt
actctctctg 60tttgctctgg ttaataatct caggagcaca aacagaggtt aataattttc
cagatctctc 120tgagcaatag tataaaa
137
6 172 DNA Artificial Sequence synthetic ABP-HP1-AFP
6 gttaattttt aaaaagcagt caaaagtcca agtggccctt gcgagcattt actctctctg
60tttgctctgg ttaataatct caggagcaca aacagaggtt aataattttc agtcatatgt
120ttgctcactg aaggttacta gttaacaggc atcccttaaa caggatataa aa
172
7 4377 DNA Artificial Sequence synthetic hcoAn53
7 atgcagattg agctgtccac
ttgctttttc ctgtgcctgc tgcagttttc attttccgcc 60actagaagat actacctggg
ggctgtcgaa ctgtcctggg attacatgca gtccgacctg 120ctgtctgagc tgcatgtgga
cacccgattt ccacctcgcg tcccacgaag cttccccttt 180aatacatccg tgatgtacaa
gaaaactgtg ttcgtcgagt tcaccgatca cctgttcaac 240atcgcaaagc cccggccacc
ctggatggga ctgctgggcc ctaccatcag agccgaggtg 300tacgacaccg tggtcattac
actgaaaaac atggcaagtc accccgtgtc actgcatgcc 360gtgggagtct cctactggaa
ggcatctgaa ggcgccgagt atgacgatca gactagtcag 420agagaaaaag aggacgataa
ggtgtttccc ggagaatctc atacctatgt gtggcaggtc 480ctgaaggaga atggccctat
ggccagcgac cctccatgcc tgacctactc ctatctgtct 540cacgtggacc tggtcaaaga
tctgaactcc gggctgatcg gagccctgct ggtgtgtcgg 600gaaggatctc tggctaagga
gagaacccag acactgcatc agttcgtgct gctgttcgct 660gtctttgacg aaggcaaaag
ttggcactca gagacaaagg attccctgac tcaggcaatg 720gactctgcca gtgctagggc
atggccaaaa atgcacaccg tgaacggcta cgtcaataga 780agcctgccag gactgatcgg
atgccacagg aagtccgtgt attggcatgt catcggcatg 840gggaccacac cagaagtcca
ctctattttc ctggagggac atacatttct ggtgcggaat 900cacagacagg ctagcctgga
gatctccccc attaccttcc tgacagcaca gactctgctg 960atggatctgg gccagttcct
gctgttttgc cacatcagct cccaccagca tgatgggatg 1020gaggcctacg tgaaagtcga
cagctgtcca gaggaacccc agctgaggat gaagaacaat 1080gaggaagagg aagactacga
cgatgacctg tatgacagcg agatggatgt ggtccgattc 1140gatgacgata actcaccccc
ttttatccag attagaagcg tcgccaagaa acaccctaag 1200acttgggtgc attacatcgc
cgctgaggaa gaggactggg attatgctcc ttccgtgctg 1260accccagacg atcgcagcta
caaatcccag tatctgaaca atggccctca gaggattggg 1320cgcaagtaca agaaagtgag
gttcatggct tataccgatg aaaccttcaa gactcgcgaa 1380gcaatccagt acgagtccgg
aattctgggc ccactgctgt atggggaagt gggagacacc 1440ctgctgatca ttttcaagaa
ccaggcctct aggccctaca atatctatcc tcatggcatt 1500acagatgtgt ctcccctgca
cagtggacgc ctgcctaagg gcgtgaaaca cctgaaggac 1560ctgcctatcc tgccagggga
aatttttaag tacaaatgga ctgtgaccgt cgaggatgga 1620ccaactaaga gcgaccccag
gtgcctgacc cgctactatt ctagtttcat caatctggaa 1680agagatctgg caagcggact
gatcggacca ctgctgattt gttacaaaga gtccgtggat 1740cagcgaggca accagatgat
gtctgacaag cggaatgtga tcctgttctc agtctttgac 1800gaaaaccgca gctggtatct
gaccgagaac atgcagcgat tcctgccaaa tgcagcagga 1860gtgcagccac aggatcctga
gtttcaggct agtaacatca tgcattcaat taatggctac 1920gtgttcgact cactgcagct
gagcgtgtgt ctgcacgagg tcgcttactg gtatatcctg 1980agcgtcggag cacagacaga
tttcctgtcc gtgttctttt ctggctacac ttttaagcat 2040aaaatggtgt atgaggacac
actgactctg ttcccttttt ccggcgaaac cgtctttatg 2100tctatggaga atccagggct
gtgggtgctg ggatgccaca actccgattt ccggaataga 2160ggaatgactg ccctgctgaa
agtgtcaagc tgtgaccgga acaccggcga ctactatgaa 2220gatacatacg aggacatccc
aacttatctg ctgtctgaaa acaatgtgat tgagcccaga 2280agcttcagcc agaatccacc
cgtgctgaag cgacaccagc gggaaatcac cctgactacc 2340ctgcagtcag agcaggaaga
gattgattac gacgatacca tcagcattga aacaaaaagg 2400gaggacttcg atatctatgg
ggaagacgag aaccagggac ctcgctcctt ccagaagagg 2460acacgccatt actttattgc
tgcagtggag aggctgtggg attatgggat gtcccgctct 2520ccccacgtcc tgcgaaatcg
ggcccagagt ggatcagtgc ctcagttcaa gaaagtggtc 2580ttccaggagt ttactgacgg
gagctttacc cagcctctgt accggggaga actgaacgag 2640cacctgggac tgctgggccc
atatatcaga gcagaagtgg aggataacat tatggtcacc 2700ttcaagaatc aggccagtcg
gccctactca ttttattcct ctctgatcag ctacgaagag 2760gaccagcgcc agggcgcaga
accacgaaaa aacttcgtga agcccaatga gaccaaaaca 2820tacttttgga aggtgcagca
ccatatggct cctacaaaag acgaattcga ttgcaaggcc 2880tgggcttatt ttagtgacgt
ggatctggag aaggacatgc actcagggct gatcggacct 2940ctgctgattt gtcatactaa
caccctgaat ccagcacacg gacgacaggt gacagtccag 3000gaattcgctc tgttctttac
aatcttcgat gagactaaga gctggtactt cactgaaaac 3060atggagagaa attgcagggc
cccttgtaat atccagatgg aagacccaac attcaaggag 3120aactacagat ttcatgctat
taatggctat gtgatggata ctctgccagg gctggtcatg 3180gcacaggacc agagaatcag
gtggtacctg ctgtctatgg ggagtaacga gaatatccac 3240agcattcatt tctccggaca
cgtgtttact gtcaggaaga aagaagagta taaaatggcc 3300gtgtacaacc tgtatccagg
cgtgttcgaa accgtcgaga tgctgccaag caaggcagga 3360atctggcgag tggaatgcct
gattggcgag cacctgcatg ctgggatgag taccctgttt 3420ctggtgtact caaaacagtg
tcagacacct ctgggaatgg catctggcca tatccgggat 3480ttccagatta ccgcaagtgg
acagtacgga cagtgggctc caaagctggc aagactgcac 3540tatagcggct ccatcaacgc
ctggtctaca aaagagccct ttagttggat taaggtggac 3600ctgctggccc ctatgatcat
tcatggcatc aaaactcagg gggctaggca gaagttcagt 3660tcactgtaca tcagccagtt
tatcatcatg tactccctgg atgggaagaa atggcagacc 3720taccgcggga atagcacagg
aactctgatg gtgttctttg gaaacgtcga cagctccggc 3780atcaagcaca acattttcaa
tcctccaatc attgcccgct acatccgact gcaccccacc 3840cattattcaa ttcgaagcac
actgcggatg gaactgatgg gctgcgatct gaactcttgt 3900agtatgcctc tggggatgga
gtctaaggcc atcagtgacg ctcagattac cgcatctagt 3960tacttcacca atatgtttgc
cacatggtca ccaagccagg ctaggctgca cctgcaggga 4020agaacaaacg cctggaggcc
tcaggtgaac aatccaaagg agtggctgca ggtggatttc 4080cagaaaacta tgaaggtcac
cggaatcaca actcagggcg tgaaatcact gctgaccagc 4140atgtatgtga aggagtttct
gatttcaagc tcccaggacg gccaccattg gacactgttc 4200ctgcagaacg ggaaggtgaa
agtcttccag ggaaatcagg attcttttac accagtggtc 4260aacagtctgg acccccctct
gctgactcgg tacctgagaa tccaccccca gagctgggtc 4320catcagattg cactgcgact
ggaagtgctg ggatgcgagg cacagcagct gtattga
4377
8 4359 DNA Artificial Sequence synthetic lcoAn53
8 atgcagattg agctgagcac ctgcttcttc ctgtgcctgc
tgcagttctc attctctgcc 60accaggagat actacctggg cgccgtggag ctgagctggg
actacatgca gtctgacctg 120ctgtctgagc tgcatgtgga caccaggttc ccccccagag
tgccccgaag cttccccttc 180aacaccagcg tgatgtacaa gaagaccgtg ttcgtggagt
tcactgacca cctgttcaac 240atcgccaagc ccaggccccc ctggatgggc ctgctgggcc
ccaccatcag agccgaggtg 300tacgacaccg tggtcatcac cctgaagaac atggccagcc
accccgtctc cctgcacgcc 360gtgggggtga gctactggaa ggcctctgag ggcgccgagt
acgacgacca gaccagccag 420agggagaagg aggacgacaa ggtgttccct ggggaaagcc
acacctacgt gtggcaggtc 480ctgaaggaga acggccccat ggcctctgac cccccatgcc
tgacctacag ctacctgagc 540cacgtggacc tggtgaagga cctgaactct ggcctgattg
gggccctgct ggtgtgcagg 600gagggcagcc tggccaagga gagaacccag accctgcacc
agttcgtgct gctgttcgcc 660gtgttcgacg agggcaagag ctggcactct gaaaccaagg
atagcctgac tcaggccatg 720gactctgcct ctgccagggc ctggcccaag atgcacaccg
tcaacggcta cgtcaacagg 780agcctgcctg gcctgattgg ctgccacagg aagagcgtgt
actggcatgt gatcggcatg 840ggcaccaccc ctgaggtgca cagcatcttc ctggagggcc
acaccttcct ggtcaggaac 900cacaggcagg ccagcctgga gatcagcccc atcaccttcc
tgaccgccca gaccctgctg 960atggacctgg gccagttcct gctgttctgc cacatctcca
gccaccagca cgacggcatg 1020gaggcctacg tgaaagtgga cagctgccct gaggagcccc
agctgaggat gaagaacaac 1080gaggaggagg aggactatga tgacgacctg tatgacagcg
agatggacgt ggtcaggttc 1140gacgacgaca acagcccccc tttcatccag atcaggagcg
tggccaagaa gcaccccaag 1200acctgggtgc actacatcgc tgctgaggag gaggactggg
actatgcccc ctccgtgctg 1260acccctgatg acaggagcta caagagccag tacctgaaca
atggccccca gaggattggc 1320aggaagtaca agaaagtcag gttcatggcc tacactgatg
aaaccttcaa gaccagggag 1380gccatccagt acgagtctgg catcctgggc cccctgctgt
acggggaggt gggggacacc 1440ctgctgatca tcttcaagaa ccaggccagc aggccctaca
acatctaccc ccatggcatc 1500accgacgtga gccccctgca cagcggaagg ctgcctaagg
gggtgaagca cctgaaagac 1560ctgcccatcc tgcctgggga gatcttcaag tacaagtgga
ctgtgactgt ggaggacggc 1620cccaccaaga gcgaccccag gtgcctgacc agatactaca
gcagcttcat caacctggag 1680agggacctgg cctctggcct gattggcccc ctgctgatct
gctacaagga gtctgtggac 1740cagaggggca accagatgat gagcgacaag aggaacgtga
tcctgttctc tgtcttcgac 1800gagaacagga gctggtacct gaccgagaac atgcagaggt
tcctgcccaa cgcagctggg 1860gtgcagccac aggaccccga gttccaggcc agcaacatca
tgcacagcat caatggctac 1920gtgttcgaca gcctgcagct gagcgtgtgc ctgcacgagg
tggcctactg gtacatcctg 1980agcgtcggcg cccagaccga cttcctgagc gtgttcttct
ctggctacac cttcaagcac 2040aagatggtgt atgaggacac cctgaccctg ttccccttca
tgagcatgga gaaccctggc 2100ctgtgggtgc tgggctgcca caacagcgac ttcaggaaca
ggggcatgac tgccctgctg 2160aaagtctcca gctgtgaccg gaacaccggg gactactacg
aggacacata cgaggacatc 2220ccaacttacc tgctgagcga aaacaatgtg atcgagccca
ggagcttctc tcagaacccc 2280ccagtgctga agaggcacca gagggagatc accctgacca
ccctgcagtc tgagcaggag 2340gagatcgact atgatgacac catcagcatt gagacaaaga
gggaggactt cgacatctac 2400ggggaggacg agaaccaggg acccaggagc ttccagaaga
ggaccaggca ctacttcatt 2460gctgctgtgg agaggctgtg ggactatggc atgtcccgca
gcccccatgt gctgaggaac 2520agggcccagt ctggcagcgt gccccagttc aagaaagtcg
tgttccagga gttcaccgac 2580ggcagcttca cccagcccct gtacagaggg gagctgaacg
agcacctggg cctgctgggc 2640ccctacatca gggccgaggt ggaggacaac atcatggtga
ccttcaagaa ccaggccagc 2700aggccctaca gcttctacag cagcctgatc agctacgagg
aggaccagag gcagggggct 2760gagcccagga agaactttgt gaagcccaat gaaaccaaga
cctacttctg gaaggtgcag 2820caccacatgg cccccaccaa ggacgagttc gactgcaagg
cctgggccta cttctctgac 2880gtggacctgg agaaggacat gcactctggc ctgattggcc
ccctgctgat ttgccacacc 2940aacaccctga accctgccca tggcaggcag gtgactgtgc
aggagttcgc cctgttcttc 3000accatcttcg atgaaaccaa gagctggtac ttcactgaga
acatggagag gaactgcagg 3060gccccctgca acatccagat ggaggacccc accttcaagg
agaactacag gttccatgcc 3120atcaatggct acgtgatgga caccctgcct ggcctggtca
tggcccagga ccagaggatc 3180aggtggtatc tgctgagcat gggcagcaac gagaacatcc
acagcatcca cttctctggc 3240cacgtgttca ctgtgaggaa gaaggaggag tacaagatgg
ccgtgtacaa cctgtaccct 3300ggggtgttcg aaaccgtgga gatgctgccc agcaaggccg
gcatctggag ggtggagtgc 3360ctgattgggg agcacctgca cgccggcatg agcaccctgt
tcctggtgta cagcaaacag 3420tgccagaccc ccctgggcat ggcctctggc cacatcaggg
acttccagat cactgcctct 3480ggccagtacg gccagtgggc ccccaagctg gccaggctgc
actactccgg aagcatcaat 3540gcctggagca ccaaggagcc cttcagctgg atcaaagtgg
acctgctggc ccccatgatc 3600atccacggca tcaagaccca gggggccagg cagaagttct
ccagcctgta catcagccag 3660ttcatcatca tgtacagcct ggacggcaag aagtggcaga
cctacagggg caacagcacc 3720ggcaccctga tggtgttctt cggcaacgtg gacagcagcg
gcatcaagca caacatcttc 3780aaccccccca tcatcgccag atacatcagg ctgcacccca
cccactacag catcaggagc 3840accctgagga tggagctgat gggctgtgac ctgaacagct
gcagcatgcc cctgggcatg 3900gagagcaagg ccatctctga cgcccagatc actgcctcca
gctacttcac caacatgttt 3960gccacctgga gccccagcca ggccaggctg cacctgcagg
gcaggacaaa tgcctggagg 4020ccccaggtca acaaccccaa ggagtggctg caggtggact
tccagaagac catgaaggtg 4080actgggatca ccacccaggg ggtgaagagc ctgctgacca
gcatgtacgt gaaggagttc 4140ctgatctcca gcagccagga cggccaccat tggaccctgt
tcctgcagaa tggcaaggtg 4200aaggtgttcc agggcaacca ggacagcttc acccctgtgg
tcaacagcct ggaccccccc 4260ctgctgacca gatacctgag gatccacccc cagagctggg
tgcaccagat cgccctgagg 4320ctggaggtgc tgggctgtga ggcccagcag ctgtactga
4359
9 4404 DNA Artificial Sequence synthetic lcoET3 with CpGs
9 atgcagctgg aactgtctac ctgtgtgttt ctgtgtctgc tgcctctggg
gttttctgct 60atccgccgct actatctggg agccgtggag ctgtcctggg actacaggca
gagcgagctg 120ctgagagaac tgcacgtgga taccagattc ccagctaccg ctccaggagc
tctgcctctg 180ggcccatccg tgctgtacaa gaaaaccgtc ttcgtggagt ttaccgacca
gctgttcagc 240gtggccaggc caagaccacc ttggatggga ctgctgggac caaccatcca
ggctgaggtg 300tacgataccg tggtcgtgac cctgaaaaac atggcctccc atcccgtgag
cctgcacgct 360gtcggggtgt ccttctggaa gtccagcgag ggagccgagt acgaagacca
tacctcccag 420cgcgagaaag aagacgataa ggtgctgcct ggcaaaagcc agacctatgt
ctggcaggtg 480ctgaaggaga acggaccaac cgctagcgac ccaccatgcc tgacctactc
ttatctgtcc 540cacgtcgatc tggtgaagga cctgaattcc ggactgatcg gagctctgct
ggtgtgtaga 600gagggaagcc tgaccagaga aagaacccag aacctgcatg agttcgtcct
gctgttcgcc 660gtgtttgacg aagggaagag ctggcactct gcccgcaatg actcctggac
cagagctatg 720gatccagctc ctgctagagc tcagcctgct atgcacaccg tcaacggcta
cgtgaatcgg 780tctctgccag gactgatcgg ctgccataag aaaagcgtct attggcacgt
gatcggaatg 840ggcaccagcc ccgaggtgca ttctatcttc ctggaaggcc acacctttct
ggtcaggcac 900catagacagg cctctctgga gatctcccct ctgaccttcc tgaccgctca
gacctttctg 960atggacctgg ggcagttcct gctgttttgc catatctctt cccaccatca
cggaggaatg 1020gaggctcacg tcagggtgga atcctgtgct gaggaaccac agctgagaag
aaaggctgat 1080gaggaagagg actacgacga taacctgtat gacagcgata tggacgtcgt
gcgcctggac 1140ggcgacgatg tcagcccttt catccagatc cggtctgtgg ccaagaaaca
tccaaagacc 1200tgggtccact acatcgccgc tgaagaggaa gattgggact atgcccccct
ggtgctggct 1260cctgacgata gatcctacaa aagccagtat ctgaacaatg ggccccagcg
catcggacgg 1320aagtacaaga aagtgaggtt catggcctat accgacgaga cctttaagac
cagagaggct 1380atccagcacg aatccgggat cctgggacct ctgctgtacg gcgaagtggg
ggataccctg 1440ctgatcatct tcaagaacca ggcctccagg ccatacaata tctatcccca
tggcatcacc 1500gacgtgagac cactgtacag caggagactg cccaaggggg tcaaacacct
gaaggatttc 1560cccatcctgc ctggagagat ctttaagtat aaatggaccg tcaccgtgga
agacgggcct 1620accaagtccg atccacgctg cctgacccgg tactatagct ctttcgtgaa
catggagaga 1680gacctggcta gcggactgat cggacccctg ctgatctgtt acaaagagag
cgtggaccag 1740aggggcaacc agatcatgtc tgataagaga aatgtcatcc tgttctccgt
gtttgacgag 1800aaccgcagct ggtacctgac cgagaacatc cagcggttcc tgccaaatcc
agctggagtg 1860cagctggagg acccagaatt tcaggcttcc aacatcatgc atagcatcaa
tggctacgtg 1920ttcgatagcc tgcagctgtc tgtctgcctg cacgaggtgg cctactggta
tatcctgtcc 1980atcggcgctc agaccgactt cctgtccgtg ttctttagcg ggtacacctt
taagcataaa 2040atggtgtatg aggataccct gaccctgttc cccttttctg gcgagaccgt
gttcatgtcc 2100atggaaaacc ctggcctgtg gatcctgggg tgccacaaca gcgacttcag
gaatagagga 2160atgaccgccc tgctgaaagt gtccagctgt gataagaata ccggcgatta
ctatgaggac 2220tcttacgaag atatctccgc ttatctgctg agcaagaaca atgccatcga
gcccaggtct 2280ttcgctcaga actccagacc tccaagcgct tctgctccta agccacctgt
gctgagaaga 2340catcagaggg acatctccct gcctaccttc cagccagagg aagataaaat
ggactacgac 2400gatatcttca gcaccgagac caagggggaa gattttgaca tctatggaga
ggacgaaaac 2460caggatccaa gatccttcca gaagagaacc agacactact ttatcgccgc
tgtggagcag 2520ctgtgggact atgggatgtc cgaaagccca cgggccctga ggaacagagc
tcagaatgga 2580gaggtgcccc gcttcaagaa agtcgtgttc cgggagtttg ccgacggcag
ctttacccag 2640ccatcttaca ggggggagct gaacaagcat ctggggctgc tgggacccta
tatcagagcc 2700gaggtcgaag ataacatcat ggtgaccttc aagaatcagg cttctcgccc
ctactccttt 2760tattcttccc tgatctccta ccctgacgat caggagcagg gcgccgaacc
taggcacaac 2820ttcgtgcagc caaatgagac cagaacctac ttttggaagg tgcagcatca
catggctccc 2880accgaggatg aattcgactg caaagcttgg gcctattttt ccgatgtcga
cctggagaag 2940gacgtgcata gcggcctgat cgggcctctg ctgatctgtc gcgccaacac
cctgaatgct 3000gctcacggaa gacaggtcac cgtgcaggag ttcgctctgt tctttaccat
ctttgacgaa 3060accaagagct ggtacttcac cgagaacgtg gaaaggaatt gcagagcccc
ctgtcatctg 3120cagatggagg accctaccct gaaggaaaac tacaggttcc acgccatcaa
tggatatgtc 3180atggataccc tgcccggcct ggtcatggct cagaaccagc gcatccggtg
gtacctgctg 3240tctatgggat ccaacgagaa tatccatagc atccacttct ctggccatgt
cttttccgtg 3300aggaagaaag aggaatacaa aatggccgtg tacaatctgt atcctggggt
cttcgagacc 3360gtggaaatgc tgccaagcaa agtgggaatc tggagaatcg agtgcctgat
cggcgaacac 3420ctgcaggccg ggatgagcac caccttcctg gtgtactcta agaaatgtca
gaccccactg 3480gggatggcct ccggacatat ccgcgacttc cagatcaccg ctagcggaca
gtacggacag 3540tgggctccaa agctggctag actgcactat tctggctcca tcaacgcctg
gtctaccaaa 3600gagccattct cctggatcaa ggtggacctg ctggccccca tgatcatcca
cggaatcaaa 3660acccagggcg ctaggcagaa gttcagctct ctgtacatct cccagtttat
catcatgtat 3720agcctggacg ggaagaaatg gcagacctac agaggcaatt ccaccgggac
cctgatggtc 3780ttctttggaa acgtggattc cagcggcatc aagcacaaca tcttcaatcc
acccatcatc 3840gcccgctaca tccggctgca tcctacccac tatagcatca ggtctaccct
gagaatggag 3900ctgatgggat gcgacctgaa cagctgttct atgccactgg gcatggagtc
caaggctatc 3960agcgatgccc agatcaccgc ttcttcctac ttcaccaata tgtttgctac
ctggtcccca 4020agcaaggcta gactgcacct gcagggaaga tccaacgctt ggagacccca
ggtgaacaat 4080cctaaggagt ggctgcaggt cgacttccag aaaaccatga aggtcaccgg
ggtgaccacc 4140cagggagtga aatctctgct gacctccatg tacgtcaagg agttcctgat
cagctcttcc 4200caggacggcc accagtggac cctgttcttt cagaacggca aggtcaaagt
gttccagggg 4260aatcaggact cttttacccc cgtcgtgaac tccctggatc ctccactgct
gaccaggtac 4320ctgagaatcc atcctcagag ctgggtgcac cagatcgctc tgagaatgga
ggtcctggga 4380tgcgaagctc aggacctgta ttga
4404
10 4403 DNA Artificial Sequence synthetic lcoET3 CpGs removed
10 atgcagctgg aactgtctac ctgtgtgttt ctgtgtctgc tgcctctggg gttttctgct
60atcaggagat actatctggg agctgtggag ctgtcctggg actacaggca gtctgagctg
120ctgagagaac tgcatgtgga taccagattc ccagctacag ctccaggagc tctgcctctg
180ggcccatctg tgctgtacaa gaaaacagtc tttgtggagt ttacagacca gctgttctct
240gtggccaggc caagaccacc ttggatggga ctgctgggac caaccatcca ggctgaggtg
300tatgatacag tggtggtgac cctgaaaaac atggcctccc atcctgtgag cctgcatgct
360gtgggggtgt ccttctggaa gtcctctgag ggagctgagt atgaagacca tacctcccag
420agggagaaag aagatgataa ggtgctgcct ggcaaaagcc agacctatgt ctggcaggtg
480ctgaaggaga atggaccaac tgcttctgac ccaccatgcc tgacctactc ttatctgtcc
540catgtggatc tggtgaagga cctgaattct ggactgattg gagctctgct ggtgtgtaga
600gagggaagcc tgaccagaga aagaacccag aacctgcatg agtttgtcct gctgtttgct
660gtgtttgatg aagggaagag ctggcactct gccaggaatg actcctggac cagagctatg
720gatccagctc ctgctagagc tcagcctgct atgcacacag tcaatggcta tgtgaatagg
780tctctgccag gactgattgg ctgccataag aaatctgtct attggcatgt gattggaatg
840ggcaccagcc ctgaggtgca ttctatcttc ctggaaggcc acacctttct ggtcaggcac
900catagacagg cctctctgga gatctcccct ctgaccttcc tgacagctca gacctttctg
960atggacctgg ggcagttcct gctgttttgc catatctctt cccaccatca tggaggaatg
1020gaggctcatg tcagggtgga atcctgtgct gaggaaccac agctgagaag aaaggctgat
1080gaggaagagg actatgatga taacctgtat gactctgata tggatgtggt gaggctggat
1140ggggatgatg tcagcccttt catccagatc aggtctgtgg ccaagaaaca tccaaagacc
1200tgggtccact acattgctgc tgaagaggaa gattgggact atgcccccct ggtgctggct
1260cctgatgata gatcctacaa aagccagtat ctgaacaatg ggccccagag gattggaagg
1320aagtacaaga aagtgaggtt catggcctat acagatgaga cctttaagac cagagaggct
1380atccagcatg aatctgggat cctgggacct ctgctgtatg gagaagtggg ggatacctgc
1440tgatcatctt caagaaccag gcctccaggc catacaatat ctatccccat ggcatcacag
1500atgtgagacc actgtacagc aggagactgc ccaagggggt caaacacctg aaggatttcc
1560ccatcctgcc tggagagatc tttaagtata aatggacagt cacagtggaa gatgggccta
1620ccaagtctga tccaaggtgc ctgaccagat actatagctc ttttgtgaac atggagagag
1680acctggcttc tggactgatt ggacccctgc tgatctgtta caaagagtct gtggaccaga
1740ggggcaacca gatcatgtct gataagagaa atgtcatcct gttctctgtg tttgatgaga
1800acaggagctg gtacctgaca gagaacatcc agaggttcct gccaaatcca gctggagtgc
1860agctggagga cccagaattt caggcttcca acatcatgca tagcatcaat ggctatgtgt
1920ttgatagcct gcagctgtct gtctgcctgc atgaggtggc ctactggtat atcctgtcca
1980ttggagctca gacagacttc ctgtctgtgt tctttagtgg gtacaccttt aagcataaaa
2040tggtgtatga ggataccctg accctgttcc ccttttctgg ggagacagtg ttcatgtcca
2100tggaaaaccc tggcctgtgg atcctggggt gccacaactc tgacttcagg aatagaggaa
2160tgacagccct gctgaaagtg tccagctgtg ataagaatac aggggattac tatgaggact
2220cttatgaaga tatctctgct tatctgctga gcaagaacaa tgccattgag cccaggtctt
2280ttgctcagaa ctccagacct ccatctgctt ctgctcctaa gccacctgtg ctgagaagac
2340atcagaggga catctccctg cctaccttcc agccagagga agataaaatg gactatgatg
2400atatcttcag cacagagacc aagggggaag attttgacat ctatggagag gatgaaaacc
2460aggatccaag atccttccag aagagaacca gacactactt tattgctgct gtggagcagc
2520tgtgggacta tgggatgtct gaaagcccaa gggccctgag gaacagagct cagaatggag
2580aggtgcccag attcaagaaa gtggtgttca gagagtttgc tgatggcagc tttacccagc
2640catcttacag gggggagctg aacaagcatc tggggctgct gggaccctat atcagagctg
2700aggtggaaga taacatcatg gtgaccttca agaatcaggc ttctaggccc tactcctttt
2760attcttccct gatctcctac cctgatgatc aggagcaggg agctgaacct aggcacaact
2820ttgtgcagcc aaatgagacc agaacctact tttggaaggt gcagcatcac atggctccca
2880cagaggatga atttgactgc aaagcttggg cctatttttc tgatgtggac ctggagaagg
2940atgtgcattc tggcctgatt gggcctctgc tgatctgtag ggccaacacc ctgaatgctg
3000ctcatggaag acaggtcaca gtgcaggagt ttgctctgtt ctttaccatc tttgatgaaa
3060ccaagagctg gtacttcaca gagaatgtgg aaaggaattg cagagccccc tgtcatctgc
3120agatggagga ccctaccctg aaggaaaact acaggttcca tgccatcaat ggatatgtca
3180tggataccct gcctggcctg gtcatggctc agaaccagag gatcagatgg tacctgctgt
3240ctatgggatc caatgagaat atccatagca tccacttctc tggccatgtc ttttctgtga
3300ggaagaaaga ggaatacaaa atggctgtgt acaatctgta tcctggggtc tttgagacag
3360tggaaatgct gccaagcaaa gtgggaatct ggagaattga gtgcctgatt ggggaacacc
3420tgcaggctgg gatgagcacc accttcctgg tgtactctaa gaaatgtcag accccactgg
3480ggatggcctc tggacatatc agggacttcc agatcacagc ttctggacag tatggacagt
3540gggctccaaa gctggctaga ctgcactatt ctggctccat caatgcctgg tctaccaaag
3600agccattctc ctggatcaag gtggacctgc tggcccccat gatcatccat ggaatcaaaa
3660cccagggagc taggcagaag ttcagctctc tgtacatctc ccagtttatc atcatgtata
3720gcctggatgg gaagaaatgg cagacctaca gaggcaattc cactgggacc ctgatggtct
3780tctttggaaa tgtggattcc tctggcatca agcacaacat cttcaatcca cccatcattg
3840ccaggtacat caggctgcat cctacccact atagcatcag gtctaccctg agaatggagc
3900tgatgggatg tgacctgaac agctgttcta tgccactggg catggagtcc aaggctatct
3960ctgatgccca gatcacagct tcttcctact tcaccaatat gtttgctacc tggtccccaa
4020gcaaggctag actgcacctg cagggaagat ccaatgcttg gagaccccag gtgaacaatc
4080ctaaggagtg gctgcaggtg gacttccaga aaaccatgaa ggtcacaggg gtgaccaccc
4140agggagtgaa atctctgctg acctccatgt atgtcaagga gttcctgatc agctcttccc
4200aggatggcca ccagtggacc ctgttctttc agaatggcaa ggtcaaagtg ttccagggga
4260atcaggactc ttttacccca gtggtgaact ccctggatcc tccactgctg accaggtacc
4320tgagaatcca tcctcagagc tgggtgcacc agattgctct gagaatggag gtcctgggat
4380gtgaagctca ggacctgtat tga
4403
11 4404 DNA Artificial Sequence synthetic NoCoET3
11 atgcagctag agctctccac
ctgtgtcttt ctgtgtctct tgccactcgg ctttagtgcc 60atcaggagat actacctggg
cgcagtggaa ctgtcctggg actaccggca aagtgaactc 120ctccgtgagc tgcacgtgga
caccagattt cctgctacag cgccaggagc tcttccgttg 180ggcccgtcag tcctgtacaa
aaagactgtg ttcgtagagt tcacggatca acttttcagc 240gttgccaggc ccaggccacc
atggatgggt ctgctgggtc ctaccatcca ggctgaggtt 300tacgacacgg tggtcgttac
cctgaagaac atggcttctc atcccgttag tcttcacgct 360gtcggcgtct ccttctggaa
atcttccgaa ggcgctgaat atgaggatca caccagccaa 420agggagaagg aagacgataa
agtccttccc ggtaaaagcc aaacctacgt ctggcaggtc 480ctgaaagaaa atggtccaac
agcctctgac ccaccatgtc ttacctactc atacctgtct 540cacgtggacc tggtgaaaga
cctgaattcg ggcctcattg gagccctgct ggtttgtaga 600gaagggagtc tgaccagaga
aaggacccag aacctgcacg aatttgtact actttttgct 660gtctttgatg aagggaaaag
ttggcactca gcaagaaatg actcctggac acgggccatg 720gatcccgcac ctgccagggc
ccagcctgca atgcacacag tcaatggcta tgtcaacagg 780tctctgccag gtctgatcgg
atgtcataag aaatcagtct actggcacgt gattggaatg 840ggcaccagcc cggaagtgca
ctccattttt cttgaaggcc acacgtttct cgtgaggcac 900catcgccagg cttccttgga
gatctcgcca ctaactttcc tcactgctca gacattcctg 960atggaccttg gccagttcct
actgttttgt catatctctt cccaccacca tggtggcatg 1020gaggctcacg tcagagtaga
aagctgcgcc gaggagcccc agctgcggag gaaagctgat 1080gaagaggaag attatgatga
caatttgtac gactcggaca tggacgtggt ccggctcgat 1140ggtgacgacg tgtctccctt
tatccaaatc cgctcagttg ccaagaagca tcctaaaact 1200tgggtacatt acattgctgc
tgaagaggag gactgggact atgctccctt agtcctcgcc 1260cccgatgaca gaagttataa
aagtcaatat ttgaacaatg gccctcagcg gattggtagg 1320aagtacaaaa aagtccgatt
tatggcatac acagatgaaa cctttaagac gcgtgaagct 1380attcagcatg aatcaggaat
cttgggacct ttactttatg gggaagttgg agacacactg 1440ttgattatat ttaagaatca
agcaagcaga ccatataaca tctaccctca cggaatcact 1500gatgtccgtc ctttgtattc
aaggagatta ccaaaaggtg taaaacattt gaaggatttt 1560ccaattctgc caggagaaat
attcaaatat aaatggacag tgactgtaga agatgggcca 1620actaaatcag atccgcggtg
cctgacccgc tattactcta gtttcgttaa tatggagaga 1680gatctagctt caggactcat
tggccctctc ctcatctgct acaaagaatc tgtagatcaa 1740agaggaaacc agataatgtc
agacaagagg aatgtcatcc tgttttctgt atttgatgag 1800aaccgaagct ggtacctcac
agagaatata caacgctttc tccccaatcc agctggagtg 1860cagcttgagg atccagagtt
ccaagcctcc aacatcatgc acagcatcaa tggctatgtt 1920tttgatagtt tgcagttgtc
agtttgtttg catgaggtgg catactggta cattctaagc 1980attggagcac agactgactt
cctttctgtc ttcttctctg gatatacctt caaacacaaa 2040atggtctatg aagacacact
caccctattc ccattctcag gagaaactgt cttcatgtcg 2100atggaaaacc caggtctatg
gattctgggg tgccacaact cagactttcg gaacagaggc 2160atgaccgcct tactgaaggt
ttctagttgt gacaagaaca ctggtgatta ttacgaggac 2220agttatgaag atatttcagc
atacttgctg agtaaaaaca atgccattga acctaggagc 2280tttgcccaga attcaagacc
ccctagtgcg agcgctccaa agcctccggt cctgcgacgg 2340catcagaggg acataagcct
tcctactttt cagccggagg aagacaaaat ggactatgat 2400gatatcttct caactgaaac
gaagggagaa gattttgaca tttacggtga ggatgaaaat 2460caggaccctc gcagctttca
gaagagaacc cgacactatt tcattgctgc ggtggagcag 2520ctctgggatt acgggatgag
cgaatccccc cgggcgctaa gaaacagggc tcagaacgga 2580gaggtgcctc ggttcaagaa
ggtggtcttc cgggaatttg ctgacggctc cttcacgcag 2640ccgtcgtacc gcggggaact
caacaaacac ttggggctct tgggacccta catcagagcg 2700gaagttgaag acaacatcat
ggtaactttc aaaaaccagg cgtctcgtcc ctattccttc 2760tactcgagcc ttatttctta
tccggatgat caggagcaag gggcagaacc tcgacacaac 2820ttcgtccagc caaatgaaac
cagaacttac ttttggaaag tgcagcatca catggcaccc 2880acagaagacg agtttgactg
caaagcctgg gcctactttt ctgatgttga cctggaaaaa 2940gatgtgcact caggcttgat
cggccccctt ctgatctgcc gcgccaacac cctgaacgct 3000gctcacggta gacaagtgac
cgtgcaagaa tttgctctgt ttttcactat ttttgatgag 3060acaaagagct ggtacttcac
tgaaaatgtg gaaaggaact gccgggcccc ctgccatctg 3120cagatggagg accccactct
gaaagaaaac tatcgcttcc atgcaatcaa tggctatgtg 3180atggatacac tccctggctt
agtaatggct cagaatcaaa ggatccgatg gtatctgctc 3240agcatgggca gcaatgaaaa
tatccattcg attcatttta gcggacacgt gttcagtgta 3300cggaaaaagg aggagtataa
aatggccgtg tacaatctct atccgggtgt ctttgagaca 3360gtggaaatgc taccgtccaa
agttggaatt tggcgaatag aatgcctgat tggcgagcac 3420ctgcaagctg ggatgagcac
gactttcctg gtgtacagca agaagtgtca gactcccctg 3480ggaatggctt ctggacacat
tagagatttt cagattacag cttcaggaca atatggacag 3540tgggccccaa agctggccag
acttcattat tccggatcaa tcaatgcctg gagcaccaag 3600gagccctttt cttggatcaa
ggtggatctg ttggcaccaa tgattattca cggcatcaag 3660acccagggtg cccgtcagaa
gttctccagc ctctacatct ctcagtttat catcatgtat 3720agtcttgatg ggaagaagtg
gcagacttat cgaggaaatt ccactggaac cttaatggtc 3780ttctttggca atgtggattc
atctgggata aaacacaata tttttaaccc tccaattatt 3840gctcgataca tccgtttgca
cccaactcat tatagcattc gcagcactct tcgcatggag 3900ttgatgggct gtgatttaaa
tagttgcagc atgccattgg gaatggagag taaagcaata 3960tcagatgcac agattactgc
ttcatcctac tttaccaata tgtttgccac ctggtctcct 4020tcaaaagctc gacttcacct
ccaagggagg agtaatgcct ggagacctca ggtgaataat 4080ccaaaagagt ggctgcaagt
ggacttccag aagacaatga aagtcacagg agtaactact 4140cagggagtaa aatctctgct
taccagcatg tatgtgaagg agttcctcat ctccagcagt 4200caagatggcc atcagtggac
tctctttttt cagaatggca aagtaaaggt ttttcaggga 4260aatcaagact ccttcacacc
tgtggtgaac tctctagacc caccgttact gactcgctac 4320cttcgaattc acccccagag
ttgggtgcac cagattgccc tgaggatgga ggttctgggc 4380tgcgaggcac aggacctcta
ctga
4404
12 4404 DNA Artificial Sequence synthetic mcoET3
12 atgcagctgg agctctcaac ctgtgtgttc ctctgcctgc
tccccctggg attttcagct 60atcaggagat actatctggg agcagtggaa ctgtcctggg
actacaggca gtcagagctg 120ctcagagaac tgcatgtgga tactaggttc cctgcaacag
ctcctggagc actgccactg 180ggaccttcag tgctgtacaa gaaaactgtc tttgtggagt
ttacagacca gctgttcagt 240gtggccaggc ccaggccccc ctggatgggg ctgctgggac
ccaccatcca ggctgaagtg 300tatgatactg tggtggtgac cctgaaaaac atggcctctc
atccagtcag cctgcatgct 360gtgggagtga gcttctggaa gagcagtgag ggagctgagt
atgaagacca tacctcacag 420agggagaaag aagatgataa ggtgctgcca ggaaaaagcc
agacctatgt gtggcaggtg 480ctgaaggaga atggccctac agcttcagat cctccctgcc
tcacatactc ttatctgagc 540catgtggatc tggtgaagga cctcaatagt ggcctgattg
gggcactgct ggtgtgcaga 600gaggggtccc tcacaaggga aagaactcag aacctgcatg
agtttgtcct gctctttgct 660gtgtttgatg agggaaagtc ctggcactca gcaaggaatg
acagctggac cagggctatg 720gacccagcac cagccagagc tcagccagct atgcacactg
tcaatggcta tgtgaatagg 780tccctgcctg gactcattgg ctgccataag aaatcagtct
attggcatgt gattggaatg 840ggcaccagcc cagaggtgca ttccatcttc ctggaaggcc
acacatttct ggtcaggcac 900catagacagg ccagcctgga gatcagccca ctgactttcc
tcacagcaca gacatttctg 960atggacctgg ggcagttcct gctcttttgc catatctcaa
gtcaccatca tggagggatg 1020gaggctcatg tcagggtgga aagctgtgca gaggaacctc
agctgaggag gaaggcagat 1080gaggaagagg actatgatga taacctgtat gactcagata
tggatgtggt gaggctggat 1140ggagatgatg tcagcccatt catccagatc aggtcagtgg
ctaagaaaca ccctaagacc 1200tgggtccact acattgcagc tgaagaggaa gattgggact
atgcacccct ggtgctggcc 1260ccagatgata gaagttacaa atctcagtat ctgaacaatg
ggccccagag gattggaagg 1320aagtacaaga aagtgaggtt catggcttat actgatgaga
cctttaagac aagagaggca 1380atccagcatg aaagtggcat cctgggacca ctgctctatg
gagaagtggg ggataccctg 1440ctcatcatct tcaagaacca ggcctcaagg ccttacaata
tctatcccca tggcatcaca 1500gatgtgaggc ctctctacag caggagactg cccaagggag
tcaaacacct caaggatttc 1560cccatcctgc caggggaaat cttcaagtat aaatggacag
tcactgtgga agatgggcca 1620actaagtcag atcctaggtg cctgaccagg tactattcta
gctttgtgaa catggagagg 1680gacctggctt caggactgat tggacctctg ctcatctgct
acaaagaatc agtggaccag 1740aggggcaacc agatcatgag tgataagaga aatgtcatcc
tgttctcagt gtttgatgag 1800aataggagtt ggtatctgac agaaaacatc cagaggttcc
tgcctaatcc tgcaggagtg 1860cagctggagg acccagaatt tcaggcttca aacatcatgc
atagtatcaa tggctatgtg 1920tttgatagtc tgcagctctc tgtctgcctg catgaggtgg
cctactggta tatcctcagc 1980attggagctc agactgactt cctgagtgtg ttcttttcag
gctacacatt caagcataag 2040atggtctatg aagataccct gacactcttc cccttttctg
gggagactgt gtttatgagc 2100atggaaaacc caggcctgtg gattctgggg tgccacaaca
gtgacttcag gaatagaggg 2160atgactgctc tgctcaaagt gtcctcatgt gataagaata
ctggagatta ctatgaggac 2220tcttatgaag atatcagtgc atatctgctc tccaaaaaca
atgccattga gcccaggtca 2280tttgctcaga acagtagacc accttctgca agtgcaccaa
agcctccagt gctgaggaga 2340caccagaggg acatcagcct gccaaccttc cagcctgagg
aagataaaat ggactatgat 2400gatatcttct ccactgagac caagggggaa gattttgaca
tctatggaga ggatgaaaac 2460caggacccca ggtccttcca gaagaggacc agacactact
ttattgcagc tgtggagcag 2520ctgtgggact atggcatgtc tgaatcacct agagctctga
ggaacagagc acagaatggg 2580gaggtgccca ggttcaagaa agtggtgttc agagaatttg
cagatggctc ttttacccag 2640cctagctaca ggggggagct caacaagcat ctggggctgc
tgggacccta tatcagagca 2700gaggtggaag ataacatcat ggtgacattc aagaatcagg
cctcaagacc ctacagtttt 2760tatagttctc tgatcagcta cccagatgat caggagcagg
gggctgaacc aaggcacaac 2820tttgtgcagc ctaatgagac aagaacttac ttttggaagg
tccagcatca catggctccc 2880acagaggatg agtttgactg caaggcctgg gcatattttt
ctgatgtgga cctggagaag 2940gatgtgcata gtggcctcat tgggccactg ctcatctgca
gggcaaacac actgaatgct 3000gcacatggca ggcaggtcac tgtgcaggag tttgccctgt
tctttacaat ctttgatgaa 3060actaagtcct ggtacttcac agagaatgtg gaaaggaatt
gcagagcccc ctgccatctc 3120cagatggagg acccaactct gaaggaaaac tacaggttcc
atgctatcaa tggatatgtc 3180atggatacac tgccaggcct ggtgatggca cagaaccaga
ggatcaggtg gtatctgctc 3240agcatggggt ccaatgagaa tatccattct atccacttct
caggacatgt cttttcagtg 3300aggaagaaag aggaatataa aatggctgtg tacaatctgt
atccaggggt ctttgagaca 3360gtggaaatgc tgcctagcaa agtggggatc tggagaattg
agtgcctcat tggagaacac 3420ctgcaggcag ggatgtccac cacatttctg gtgtactcaa
agaaatgcca gactcccctg 3480gggatggcaa gtggacatat cagggacttc cagatcactg
catcaggaca gtatggacag 3540tgggcaccaa agctggctag gctccactat agtggctcta
tcaatgcttg gagtaccaaa 3600gagcctttct cttggatcaa ggtggatctg ctggccccca
tgatcatcca tggaatcaaa 3660acacagggag ctagacagaa gttcagctcc ctgtacatca
gtcagtttat catcatgtat 3720tctctggatg ggaagaaatg gcagacctac aggggcaata
gcactgggac actgatggtc 3780ttctttggaa atgtggattc aagtggcatc aagcacaaca
tcttcaatcc tcccatcatt 3840gccaggtaca tcagactgca tcccacacac tattcaatca
ggagtactct cagaatggag 3900ctgatggggt gtgacctcaa cagctgctcc atgccactgg
gaatggaatc caaggcaatc 3960tcagatgccc agatcactgc ttctagctac ttcaccaata
tgtttgcaac atggtcaccc 4020agtaaagcaa ggctgcacct ccagggaagg tccaatgctt
ggagacccca ggtgaacaat 4080ccaaaggagt ggctgcaggt ggactttcag aaaaccatga
aggtcacagg ggtgactacc 4140cagggagtga aaagtctgct cacctctatg tatgtcaagg
agttcctgat ctcctcaagt 4200caggatggcc accagtggac actgttcttt cagaatggca
aggtcaaagt gttccagggg 4260aatcaggaca gctttacacc agtggtgaac agcctggacc
cccctctgct cactagatat 4320ctgagaatcc atccacagag ctgggtgcac cagattgcac
tcagaatgga ggtcctgggc 4380tgtgaagccc aggacctgta ttga
4404
SEQ ID NO: 13; length: 4374; type: DNA; organism: Artificial Sequence; description: synthetic lcoHSQ with CpGs
atgcagatcg aactgtctac ctgtttcttt ctgtgcctgc tgcggttttg
tttttccgct 60accagaagat actacctggg agccgtcgaa ctgagctggg attacatgca
gtctgacctg 120ggagagctgc ccgtggacgc tagattccca cctagagtcc ctaagtcctt
ccccttcaac 180accagcgtgg tctacaagaa aaccctgttc gtggagttta ccgaccacct
gttcaacatc 240gctaagccta gaccaccatg gatgggactg ctgggaccaa ccatccaggc
cgaggtgtac 300gacaccgtgg tcatcaccct gaaaaacatg gcttctcacc ccgtgtccct
gcatgctgtg 360ggcgtctcct actggaaggc cagcgaaggg gctgagtatg acgatcagac
cagccagcgg 420gaaaaagagg acgataaggt gttccctggc gggtcccata cctacgtgtg
gcaggtcctg 480aaggagaatg gaccaatggc ttccgaccct ctgtgcctga cctactctta
tctgtcccac 540gtggacctgg tcaaggatct gaacagcggc ctgatcgggg ctctgctggt
gtgtcgcgaa 600gggtccctgg ccaaggagaa aacccagacc ctgcataagt tcatcctgct
gttcgccgtg 660tttgacgaag gaaaaagctg gcactctgag accaagaact ctctgatgca
ggacagggat 720gccgcttccg ccagagcttg gcccaagatg cacaccgtga acggctacgt
caataggagc 780ctgcctggac tgatcggctg ccacagaaag tccgtgtatt ggcatgtcat
cggaatgggc 840accacccctg aagtgcacag catcttcctg gaggggcata cctttctggt
ccgcaaccac 900cggcaggcta gcctggagat ctctccaatc accttcctga ccgcccagac
cctgctgatg 960gacctgggac agttcctgct gttttgccac atctccagcc accagcatga
tggcatggag 1020gcttacgtga aagtcgactc ctgtcccgag gaacctcagc tgaggatgaa
gaacaatgag 1080gaagccgaag actatgacga tgacctgacc gacagcgaga tggatgtggt
ccgcttcgat 1140gacgataact ctccctcctt tatccagatc cggtccgtgg ccaagaaaca
ccctaagacc 1200tgggtccatt acatcgccgc tgaggaagag gactgggatt atgctccact
ggtgctggcc 1260cccgacgata gatcctacaa aagccagtat ctgaacaatg gaccccagag
gatcggcaga 1320aagtacaaga aagtgaggtt catggcttat accgatgaga cctttaagac
cagagaagcc 1380atccagcacg agtccgggat cctgggacct ctgctgtacg gcgaagtggg
ggacaccctg 1440ctgatcatct tcaagaacca ggccagcagg ccttacaata tctatccaca
tggcatcacc 1500gatgtgagac ctctgtactc ccgccggctg ccaaagggcg tgaaacacct
gaaggacttc 1560ccaatcctgc ccggggaaat ctttaagtat aaatggaccg tcaccgtcga
ggatgggccc 1620accaagagcg accctaggtg cctgaccaga tactattctt ccttcgtgaa
tatggagaga 1680gacctggctt ccggactgat cggacccctg ctgatctgtt acaaagagag
cgtggatcag 1740cgcggcaacc agatcatgtc tgacaagcgg aatgtgatcc tgttcagcgt
ctttgacgaa 1800aaccgctctt ggtacctgac cgagaacatc cagcggttcc tgcctaatcc
agctggagtg 1860cagctggaag atcccgagtt ccaggcctct aacatcatgc attccatcaa
tggctacgtg 1920ttcgactccc tgcagctgag cgtgtgcctg cacgaggtcg cttactggta
tatcctgagc 1980atcggagccc agaccgattt cctgtctgtg ttcttttccg gctacacctt
taagcataaa 2040atggtgtatg aggacaccct gaccctgttc ccattttccg gcgaaaccgt
gttcatgagc 2100atggagaatc ccgggctgtg gatcctggga tgccacaact ccgatttcag
gaatagaggg 2160atgaccgccc tgctgaaagt gagctcttgt gacaagaaca ccggagacta
ctatgaagat 2220agctacgagg acatctctgc ttatctgctg tccaaaaaca atgccatcga
gcccaggagc 2280ttctctcaga accctccagt gctgaagcgc caccagcggg agatcaccag
aaccaccctg 2340cagagcgatc aggaagagat cgactacgac gataccatct ccgtggaaat
gaagaaagag 2400gacttcgata tctatgacga agatgagaac cagtctccca ggtccttcca
gaagaaaacc 2460agacattact ttatcgccgc tgtggagcgg ctgtgggact atggcatgtc
cagctctcct 2520cacgtgctga gaaatagagc tcagtccgga agcgtcccac agttcaagaa
agtggtcttc 2580caggagttta ccgacggaag ctttacccag ccactgtacc gcggcgaact
gaacgagcac 2640ctggggctgc tgggacccta tatccgggct gaagtggagg ataacatcat
ggtcaccttc 2700aggaatcagg ccagcagacc ctactctttt tattccagcc tgatctccta
cgaagaggac 2760cagagacagg gagctgaacc aagaaaaaac ttcgtgaagc ctaatgagac
caaaacctac 2820ttttggaagg tgcagcacca tatggcccct accaaagacg agttcgattg
caaggcctgg 2880gcttatttta gcgacgtgga tctggagaag gacgtccact ccggcctgat
cgggccactg 2940ctggtgtgtc ataccaacac cctgaatcca gctcacggaa ggcaggtgac
cgtccaggaa 3000ttcgccctgt tctttaccat ctttgatgag accaagagct ggtacttcac
cgaaaacatg 3060gagaggaatt gcagagcccc atgtaacatc cagatggaag accccacctt
caaggagaac 3120tacagatttc atgctatcaa tgggtatatc atggataccc tgccaggact
ggtcatggct 3180caggaccaga ggatcagatg gtacctgctg agcatggggt ctaacgagaa
tatccactcc 3240atccatttca gcggacacgt gtttaccgtc cgcaagaaag aagagtacaa
gatggccctg 3300tacaacctgt atcccggcgt gttcgaaacc gtcgagatgc tgccttccaa
ggctgggatc 3360tggcgggtgg aatgcctgat cggggagcac ctgcatgccg gaatgtctac
cctgttcctg 3420gtgtactcca ataagtgtca gacccccctg gggatggcta gcggacatat
ccgcgacttc 3480cagatcaccg cttccggaca gtacggacag tgggctccta agctggctag
actgcactat 3540tctggctcca tcaacgcttg gtctaccaaa gagcctttct cctggatcaa
ggtggacctg 3600ctggctccaa tgatcatcca tggcatcaaa acccaggggg ccaggcagaa
gttctcttcc 3660ctgtacatca gccagtttat catcatgtat tctctggatg ggaagaaatg
gcagacctac 3720agaggcaatt ccaccgggac cctgatggtg ttctttggca acgtcgacag
ctctgggatc 3780aagcacaaca tcttcaatcc ccctatcatc gcccgctaca tccggctgca
cccaacccat 3840tattccatcc gcagcaccct gcggatggag ctgatggggt gcgatctgaa
cagctgttct 3900atgcccctgg gaatggagtc taaggccatc tccgacgctc agatcaccgc
ctccagctac 3960ttcaccaata tgtttgctac ctggtcccca agcaaggcta gactgcatct
gcagggaaga 4020agcaacgctt ggagaccaca ggtgaacaat cccaaggagt ggctgcaggt
cgacttccag 4080aaaaccatga aggtgaccgg agtcaccacc cagggcgtga aaagcctgct
gacctctatg 4140tacgtcaagg agttcctgat ctcttccagc caggacgggc accagtggac
cctgttcttt 4200cagaacggaa aggtgaaagt cttccagggc aatcaggatt cctttacccc
tgtggtcaac 4260agcctggacc cacccctgct gaccaggtac ctgagaatcc acccacagtc
ctgggtgcat 4320cagatcgctc tgaggatgga agtcctgggc tgcgaggccc aggacctgta
ttga 4374
SEQ ID NO: 14; length: 4374; type: DNA; organism: Artificial Sequence; description: synthetic lcoHSQ CpGs removed
atgcagattg aactgtctac ctgtttcttt ctgtgcctgc tgaggttttg
tttttctgct 60accagaagat actacctggg agctgtggaa ctgagctggg attacatgca
gtctgacctg 120ggagagctgc ctgtggatgc tagattccca cctagagtcc ctaagtcctt
ccccttcaac 180acctctgtgg tctacaagaa aaccctgttt gtggagttta cagaccacct
gttcaacatt 240gctaagccta gaccaccatg gatgggactg ctgggaccaa ccatccaggc
agaggtgtat 300gacacagtgg tcatcaccct gaaaaacatg gcttctcacc ctgtgtccct
gcatgctgtg 360ggagtctcct actggaaggc ctctgaaggg gctgagtatg atgatcagac
cagccagagg 420gaaaaagagg atgataaggt gttccctgga gggtcccata cctatgtgtg
gcaggtcctg 480aaggagaatg gaccaatggc ttctgaccct ctgtgcctga cctactctta
tctgtcccat 540gtggacctgg tcaaggatct gaactctggc ctgattgggg ctctgctggt
gtgtagggaa 600gggtccctgg ccaaggagaa aacccagacc ctgcataagt tcatcctgct
gtttgctgtg 660tttgatgaag gaaaaagctg gcactctgag accaagaact ctctgatgca
ggacagggat 720gctgcttctg ccagagcttg gcccaagatg cacacagtga atggctatgt
caataggagc 780ctgcctggac tgattggctg ccacagaaag tctgtgtatt ggcatgtcat
tggaatgggc 840accacccctg aagtgcacag catcttcctg gaggggcata cctttctggt
caggaaccac 900aggcaggcta gcctggagat ctctccaatc accttcctga cagcccagac
cctgctgatg 960gacctgggac agttcctgct gttttgccac atctccagcc accagcatga
tggcatggag 1020gcttatgtga aagtggactc ctgtcctgag gaacctcagc tgaggatgaa
gaacaatgag 1080gaagctgaag actatgatga tgacctgaca gactctgaga tggatgtggt
caggtttgat 1140gatgataact ctccctcctt tatccagatc aggtctgtgg ccaagaaaca
ccctaagacc 1200tgggtccatt acattgctgc tgaggaagag gactgggatt atgctccact
ggtgctggcc 1260cctgatgata gatcctacaa aagccagtat ctgaacaatg gaccccagag
gattggcaga 1320aagtacaaga aagtgaggtt catggcttat acagatgaga cctttaagac
cagagaagcc 1380atccagcatg agtctgggat cctgggacct ctgctgtatg gggaagtggg
ggacaccctg 1440ctgatcatct tcaagaacca ggccagcagg ccttacaata tctatccaca
tggcatcaca 1500gatgtgagac ctctgtactc caggaggctg ccaaaggggg tgaaacacct
gaaggacttc 1560ccaatcctgc ctggggaaat ctttaagtat aaatggacag tcacagtgga
ggatgggccc 1620accaagtctg accctaggtg cctgaccaga tactattctt cctttgtgaa
tatggagaga 1680gacctggctt ctggactgat tggacccctg ctgatctgtt acaaagagtc
tgtggatcag 1740aggggcaacc agatcatgtc tgacaagagg aatgtgatcc tgttctctgt
ctttgatgaa 1800aacaggtctt ggtacctgac agagaacatc cagaggttcc tgcctaatcc
agctggagtg 1860cagctggaag atcctgagtt ccaggcctct aacatcatgc attccatcaa
tggctatgtg 1920tttgactccc tgcagctgtc tgtgtgcctg catgaggtgg cttactggta
tatcctgagc 1980attggagccc agacagattt cctgtctgtg ttcttttctg gctacacctt
taagcataaa 2040atggtgtatg aggacaccct gaccctgttc ccattttctg gagaaactgt
gttcatgagc 2100atggagaatc ctgggctgtg gatcctggga tgccacaact ctgatttcag
gaatagaggg 2160atgacagccc tgctgaaagt gagctcttgt gacaagaaca caggagacta
ctatgaagat 2220agctatgagg acatctctgc ttatctgctg tccaaaaaca atgccattga
gcccaggagc 2280ttctctcaga accctccagt gctgaagagg caccagaggg agatcaccag
aaccaccctg 2340cagtctgatc aggaagagat tgactatgat gataccatct ctgtggaaat
gaagaaagag 2400gactttgata tctatgatga agatgagaac cagtctccca ggtccttcca
gaagaaaacc 2460agacattact ttattgctgc tgtggagagg ctgtgggact atggcatgtc
cagctctcct 2520catgtgctga gaaatagagc tcagtctgga tctgtcccac agttcaagaa
agtggtcttc 2580caggagttta cagatggaag ctttacccag ccactgtaca ggggagaact
gaatgagcac 2640ctggggctgc tgggacccta tatcagggct gaagtggagg ataacatcat
ggtcaccttc 2700aggaatcagg ccagcagacc ctactctttt tattccagcc tgatctccta
tgaagaggac 2760cagagacagg gagctgaacc aagaaaaaac tttgtgaagc ctaatgagac
caaaacctac 2820ttttggaagg tgcagcacca tatggcccct accaaagatg agtttgattg
caaggcctgg 2880gcttattttt ctgatgtgga tctggagaag gatgtccact ctggcctgat
tgggccactg 2940ctggtgtgtc ataccaacac cctgaatcca gctcatggaa ggcaggtgac
agtccaggaa 3000tttgccctgt tctttaccat ctttgatgag accaagagct ggtacttcac
agaaaacatg 3060gagaggaatt gcagagcccc atgtaacatc cagatggaag accccacctt
caaggagaac 3120tacagatttc atgctatcaa tgggtatatc atggataccc tgccaggact
ggtcatggct 3180caggaccaga ggatcagatg gtacctgctg agcatggggt ctaatgagaa
tatccactcc 3240atccatttct ctggacatgt gtttacagta aggaagaaag aagagtacaa
gatggccctg 3300tacaacctgt atcctggggt gtttgaaaca gtggagatgc tgccttccaa
ggctgggatc 3360tggagggtgg aatgcctgat tggggagcac ctgcatgctg gaatgtctac
cctgttcctg 3420gtgtactcca ataagtgtca gacccccctg gggatggctt ctggacatat
cagggacttc 3480cagatcacag cttctggaca gtatggacag tgggctccta agctggctag
actgcactat 3540tctggctcca tcaatgcttg gtctaccaaa gagcctttct cctggatcaa
ggtggacctg 3600ctggctccaa tgatcatcca tggcatcaaa acccaggggg ccaggcagaa
gttctcttcc 3660ctgtacatca gccagtttat catcatgtat tctctggatg ggaagaaatg
gcagacctac 3720agaggcaatt ccacagggac cctgatggtg ttctttggca atgtggacag
ctctgggatc 3780aagcacaaca tcttcaatcc ccctatcatt gccaggtaca tcagactgca
cccaacccat 3840tattccatca ggagcaccct gagaatggag ctgatggggt gtgatctgaa
cagctgttct 3900atgcccctgg gaatggagtc taaggccatc tctgatgctc agatcacagc
ctccagctac 3960ttcaccaata tgtttgctac ctggtcccca agcaaggcta gactgcatct
gcagggaaga 4020agcaatgctt ggagaccaca ggtgaacaat cccaaggagt ggctgcaggt
ggacttccag 4080aaaaccatga aggtgacagg agtcaccacc cagggagtga aaagcctgct
gacctctatg 4140tatgtcaagg agttcctgat ctcttccagc caggatgggc accagtggac
cctgttcttt 4200cagaatggaa aggtgaaagt cttccagggc aatcaggatt cctttacccc
tgtggtcaac 4260agcctggacc cacccctgct gaccaggtac ctgagaatcc acccacagtc
ctgggtgcat 4320cagattgctc tgaggatgga agtcctgggc tgtgaggccc aggacctgta
ttga 4374
SEQ ID NO: 15; length: 4374; type: DNA; organism: Artificial Sequence; description: synthetic NoCoHSQ
atgcaaatag agctctccac ctgcttcttt ctgtgccttt tgcgattctg ctttagtgcc
60accagaagat actacctggg tgcagtggaa ctgtcatggg actatatgca aagtgatctc
120ggtgagctgc ctgtggacgc aagatttcct cctagagtgc caaaatcttt tccattcaac
180acctcagtcg tgtacaaaaa gactctgttt gtagaattca cggaccacct tttcaacatc
240gctaagccaa ggccaccctg gatgggtctg ctaggtccta ccatccaggc tgaggtttat
300gatacagtgg tcattacact taagaacatg gcttcccatc ctgtcagtct tcatgctgtt
360ggtgtatcct actggaaagc ttctgaggga gctgaatatg atgatcagac cagtcaaagg
420gagaaagaag atgataaagt cttccctggt ggaagccata catatgtctg gcaggtcctg
480aaagagaatg gtccaatggc ctctgaccca ctgtgcctta cctactcata tctttctcat
540gtggacctgg taaaagactt gaattcaggc ctcattggag ccctactagt atgtagagaa
600gggagtctgg ccaaggaaaa gacacagacc ttgcacaaat ttatactact ttttgctgta
660tttgatgaag ggaaaagttg gcactcagaa acaaagaact ccttgatgca ggatagggat
720gctgcatctg ctcgggcctg gcctaaaatg cacacagtca atggttatgt aaacaggtct
780ctgccaggtc tgattggatg ccacaggaaa tcagtctatt ggcatgtgat tggaatgggc
840accactcctg aagtgcactc aatattcctc gaaggtcaca catttcttgt gaggaaccat
900cgccaggcgt ccttggaaat ctcgccaata actttcctta ctgctcaaac actcttgatg
960gaccttggac agtttctact gttttgtcat atctcttccc accaacatga tggcatggaa
1020gcttatgtca aagtagacag ctgtccagag gaaccccaac tacgaatgaa aaataatgaa
1080gaagcggaag actatgatga tgatcttact gattctgaaa tggatgtggt caggtttgat
1140gatgacaact ctccttcctt tatccaaatt cgctcagttg ccaagaagca tcctaaaact
1200tgggtacatt acattgctgc tgaagaggag gactgggact atgctccctt agtcctcgcc
1260cccgatgaca gaagttataa aagtcaatat ttgaacaatg gccctcagcg gattggtagg
1320aagtacaaaa aagtccgatt tatggcatac acagatgaaa cctttaagac gcgtgaagct
1380attcagcatg aatcaggaat cttgggacct ttactttatg gggaagttgg agacacactg
1440ttgattatat ttaagaatca agcaagcaga ccatataaca tctaccctca cggaatcact
1500gatgtccgtc ctttgtattc aaggagatta ccaaaaggtg taaaacattt gaaggatttt
1560ccaattctgc caggagaaat attcaaatat aaatggacag tgactgtaga agatgggcca
1620actaaatcag atccgcggtg cctgacccgc tattactcta gtttcgttaa tatggagaga
1680gatctagctt caggactcat tggccctctc ctcatctgct acaaagaatc tgtagatcaa
1740agaggaaacc agataatgtc agacaagagg aatgtcatcc tgttttctgt atttgatgag
1800aaccgaagct ggtacctcac agagaatata caacgctttc tccccaatcc agctggagtg
1860cagcttgagg atccagagtt ccaagcctcc aacatcatgc acagcatcaa tggctatgtt
1920tttgatagtt tgcagttgtc agtttgtttg catgaggtgg catactggta cattctaagc
1980attggagcac agactgactt cctttctgtc ttcttctctg gatatacctt caaacacaaa
2040atggtctatg aagacacact caccctattc ccattctcag gagaaactgt cttcatgtcg
2100atggaaaacc caggtctatg gattctgggg tgccacaact cagactttcg gaacagaggc
2160atgaccgcct tactgaaggt ttctagttgt gacaagaaca ctggtgatta ttacgaggac
2220agttatgaag atatttcagc atacttgctg agtaaaaaca atgccattga acctaggagc
2280ttctctcaga atccaccagt cttgaaacgc catcaacggg aaataactcg tactactctt
2340cagtcagatc aagaggaaat tgactatgat gataccatat cagttgaaat gaagaaggaa
2400gattttgaca tttatgatga ggatgaaaat cagagccccc gcagctttca aaagaaaaca
2460cgacactatt ttattgctgc agtggagagg ctctgggatt atgggatgag tagctcccca
2520catgttctaa gaaacagggc tcagagtggc agtgtccctc agttcaagaa agttgttttc
2580caggaattta ctgatggctc ctttactcag cccttatacc gtggagaact aaatgaacat
2640ttgggactcc tggggccata tataagagca gaagttgaag ataatatcat ggtaactttc
2700agaaatcagg cctctcgtcc ctattccttc tattctagcc ttatttctta tgaggaagat
2760cagaggcaag gagcagaacc tagaaaaaac tttgtcaagc ctaatgaaac caaaacttac
2820ttttggaaag tgcaacatca tatggcaccc actaaagatg agtttgactg caaagcctgg
2880gcttatttct ctgatgttga cctggaaaaa gatgtgcact caggcctgat tggacccctt
2940ctggtctgcc acactaacac actgaaccct gctcatggga gacaagtgac agtacaggaa
3000tttgctctgt ttttcaccat ctttgatgag accaaaagct ggtacttcac tgaaaatatg
3060gaaagaaact gcagggctcc ctgcaatatc cagatggaag atcccacttt taaagagaat
3120tatcgcttcc atgcaatcaa tggctacata atggatacac tacctggctt agtaatggct
3180caggatcaaa ggattcgatg gtatctgctc agcatgggca gcaatgaaaa catccattct
3240attcatttca gtggacatgt gttcactgta cgaaaaaaag aggagtataa aatggcactg
3300tacaatctct atccaggtgt ttttgagaca gtggaaatgt taccatccaa agctggaatt
3360tggcgggtgg aatgccttat tggcgagcat ctacatgctg ggatgagcac actttttctg
3420gtgtacagca ataagtgtca gactcccctg ggaatggctt ctggacacat tagagatttt
3480cagattacag cttcaggaca atatggacag tgggccccaa agctggccag acttcattat
3540tccggatcaa tcaatgcctg gagcaccaag gagccctttt cttggatcaa ggtggatctg
3600ttggcaccaa tgattattca cggcatcaag acccagggtg cccgtcagaa gttctccagc
3660ctctacatct ctcagtttat catcatgtat agtcttgatg ggaagaagtg gcagacttat
3720cgaggaaatt ccactggaac cttaatggtc ttctttggca atgtggattc atctgggata
3780aaacacaata tttttaaccc tccaattatt gctcgataca tccgtttgca cccaactcat
3840tatagcattc gcagcactct tcgcatggag ttgatgggct gtgatttaaa tagttgcagc
3900atgccattgg gaatggagag taaagcaata tcagatgcac agattactgc ttcatcctac
3960tttaccaata tgtttgccac ctggtctcct tcaaaagctc gacttcacct ccaagggagg
4020agtaatgcct ggagacctca ggtgaataat ccaaaagagt ggctgcaagt ggacttccag
4080aagacaatga aagtcacagg agtaactact cagggagtaa aatctctgct taccagcatg
4140tatgtgaagg agttcctcat ctccagcagt caagatggcc atcagtggac tctctttttt
4200cagaatggca aagtaaaggt ttttcaggga aatcaagact ccttcacacc tgtggtgaac
4260tctctagacc caccgttact gactcgctac cttcgaattc acccccagag ttgggtgcac
4320cagattgccc tgaggatgga ggttctgggc tgcgaggcac aggacctcta ctga
4374
SEQ ID NO: 16; length: 4375; type: DNA; organism: Artificial Sequence; description: synthetic mcoHSQ
atgcagattg agctcagcac
ctgcttcttt ctgtgcctgc tcaggttctg cttttcagcc 60acaaggagat actatctggg
agctgtggaa ctgtcatggg attacatgca gagtgacctg 120ggagagctcc ctgtggatgc
taggttcccc ccaagggtcc caaagtcttt cccttttaat 180accagtgtgg tctataagaa
aacactcttt gtggaattta ctgatcacct gttcaacatt 240gcaaagccaa ggcctccctg
gatgggactg ctgggaccta ccatccaggc tgaggtgtat 300gacactgtgg tcatcacact
gaaaaacatg gcatctcacc ctgtcagcct gcatgcagtg 360ggagtcagct actggaaggc
ttcagaaggg gcagagtatg atgatcagac aagccagaga 420gaaaaagagg atgataaggt
gttcccagga gggagccata cttatgtgtg gcaggtcctg 480aaggagaatg gcccaatggc
cagtgaccca ctgtgcctca cctactcata tctgagtcat 540gtggacctgg tcaaggatct
caactcaggc ctgattgggg cactgctggt gtgcagggaa 600ggctcactgg ccaaggagaa
aacccagaca ctgcataagt tcatcctgct ctttgctgtg 660tttgatgaag ggaaatcttg
gcacagtgag accaagaaca gtctgatgca ggacagggat 720gctgcttctg ccagagcttg
gcccaagatg cacacagtga atggatatgt caataggtcc 780ctgccaggac tcattggctg
ccacagaaag tcagtgtatt ggcatgtcat tggaatgggc 840accacaccag aagtgcacag
catcttcctg gaggggcata cctttctggt caggaaccac 900aggcaggcca gcctggagat
cagcccaatc accttcctga cagcccagac tctgctcatg 960gatctggggc agttcctgct
cttttgccac atcagctccc accagcatga tggaatggag 1020gcatatgtga aagtggactc
ctgcccagag gaaccacagc tgaggatgaa gaacaatgag 1080gaagctgaag actatgatga
tgacctgaca gactcagaga tggatgtggt caggtttgat 1140gatgataaca gcccctcctt
tatccagatc agaagtgtgg ccaagaaaca cccaaagaca 1200tgggtccatt acattgcagc
tgaggaagag gactgggatt atgcacctct ggtgctggcc 1260ccagatgata gatcctacaa
atcacagtat ctgaacaatg gaccccagag gattggcaga 1320aagtacaaga aagtgaggtt
catggcctat actgatgaaa catttaagac tagagaagct 1380atccagcatg agtcaggcat
cctgggacca ctgctctatg gagaagtggg ggacaccctg 1440ctcatcatct tcaagaacca
ggcttccagg ccatacaata tctatcctca tggcatcaca 1500gatgtgagac cactctactc
aaggagactg cctaagggag tcaaacacct caaggacttc 1560cctatcctgc caggggaaat
ctttaagtat aaatggactg tgacagtgga ggatgggccc 1620actaagagtg acccaaggtg
cctgaccaga tactattcaa gttttgtgaa tatggaaagg 1680gatctggcat caggactgat
tggacctctg ctcatctgct acaaagagag tgtggatcag 1740aggggcaacc agatcatgtc
agacaagagg aatgtgatcc tgttcagtgt ctttgatgaa 1800aacaggtctt ggtatctgac
agagaacatc cagagattcc tgccaaatcc tgcaggggtg 1860cagctggaag atccagagtt
tcaggcctca aacatcatgc atagtatcaa tggatatgtg 1920tttgacagtc tgcagctctc
tgtgtgcctg catgaagtgg cctactggta tatcctgtcc 1980attggagctc agacagattt
cctgagtgtg ttcttttcag gctacacttt taagcataaa 2040atggtctatg aggacacact
gactctcttc ccttttagtg gggaaacagt gtttatgagc 2100atggagaatc cagggctgtg
gattctggga tgccacaaca gtgatttcag gaatagaggc 2160atgactgctc tgctcaaagt
gtctagctgt gacaagaaca caggggacta ctatgaagat 2220tcttatgagg acatcagtgc
ttatctgctc tccaaaaaca atgcaattga acccagatca 2280ttcagtcaga atccacctgt
gctgaagagg caccagagag agatcactag gactaccctg 2340cagtcagatc aggaagagat
tgactatgat gataccatct cagtggaaat gaagaaagag 2400gactttgata tctatgatga
agatgagaac cagagtccaa ggtctttcca gaagaaaacc 2460agacattact ttattgctgc
agtggagagg ctgtgggatt atggaatgtc ctcaagtcca 2520catgtgctga ggaatagggc
acagtctggc agtgtccctc agttcaagaa agtggtcttc 2580caggagttta cagatggcag
cttcactcag cctctgtaca ggggagaact caatgagcac 2640ctggggctgc tgggacccta
tatcagagct gaagtggagg ataacatcat ggtcaccttc 2700aggaatcagg cttcaagacc
ctacagtttt tattctagcc tgatcagcta tgaagaggac 2760cagaggcagg gagctgaacc
taggaaaaac tttgtgaagc caaatgagac caaaacatac 2820ttttggaagg tccagcacca
catggcacca accaaagatg agtttgattg caaggcatgg 2880gcctattttt cagatgtgga
tctggagaag gatgtccaca gtggcctcat tgggcctctg 2940ctggtgtgcc atactaacac
cctgaatcca gctcatggca ggcaggtgac agtccaggag 3000tttgcactgt tctttaccat
ctttgatgag acaaagtcct ggtacttcac tgaaaacatg 3060gagaggaatt gcagagctcc
ttgcaacatc cagatggaag accccacctt caaggagaac 3120tacagatttc atgcaatcaa
tgggtatatc atggatacac tgccaggact ggtgatggcc 3180caggaccaga ggatcagatg
gtatctgctc agcatggggt ccaatgagaa tatccactct 3240atccatttca gtggacatgt
gtttacagtc agaaagaaag aagagtataa aatggccctg 3300tacaacctct atccaggagt
gtttgaaaca gtggagatgc tgccaagcaa ggctgggatc 3360tggagggtgg aatgcctcat
tggggagcac ctgcatgcag gaatgtcaac cctgtttctg 3420gtctacagta ataagtgcca
gacacctctg ggaatggcaa gtggacatat cagggatttc 3480cagatcactg ctagtggaca
gtatggacag tgggcaccaa agctggctag actccactat 3540tcaggctcaa tcaatgcttg
gtccaccaaa gagccattct catggatcaa ggtggacctg 3600ctggctccta tgatcatcca
tggcatcaaa acacaggggg caaggcagaa gttctcctca 3660ctgtacatct ctcagtttat
catcatgtat agcctggatg gcaagaaatg gcagacctac 3720aggggcaata gcacagggac
tctgatggtg ttctttggca atgtggacag cagtgggatc 3780aagcacaaca tcttcaatcc
cccaatcatt gcaaggtaca tcagactgca ccccacccat 3840tattcaatca ggagtacact
caggatggaa ctgatggggt gtgatctcaa cagttgctct 3900atgccactgg gaatggagtc
caaggcaatc tcagatgccc agatcactgc tagctcctac 3960ttcactaata tgtttgctac
ctggagcccc tccaaagcaa ggctgcacct ccagggaagg 4020agcaatgcat ggaggcctca
ggtgaacaat cccaaggaat ggctgcaggt ggatttccag 4080aaaactatga aggtgactgg
agtcacaact cagggagtga aaagtctgct cacttctatg 4140tatgtcaagg agttcctgat
ctcaagttct caggatggcc accagtggac cctgttcttt 4200cagaatggaa aggtgaaagt
cttccagggc aatcaggatt cctttacacc agtggtcaac 4260tcactggacc ctcccctgct
cactagatat ctgagaatcc accctcagag ctgggtgcat 4320cagattgctc tcagaatgga
agtcctgggc tgtgaggcac aggacctgta ttgag 4375
SEQ ID NO: 17; length: 2903; type: DNA; organism: Artificial Sequence; description: synthetic 6x-tRNA
gtcgactacg tagaatccgg agccgtcttt gtcttccagc
tccatctttt ccaccttttg 60cttaggcagt cccccgagtc gtgtcaaggc tgaggagtag
aaatggaaca gcactaatat 120taatggcaaa accgttgtga aatagggtta ctttctgttt
aagcaaggaa aataaagtaa 180agcaatggga aaaaaattaa aagcaaaagg aatggaggtg
ccggggattg aacccggggc 240ctcgtgcatg ctaagcacgc gctctaccac tgagctacac
ccccgtactg aaacggttct 300ctcgagagta tattcaagat cagaatctga cccttttgct
aggtttcaga accattagtt 360gtaatcagcc aaggtctatt ttatttagtt atttctgata
tctcaaattt aggttttgcg 420tccctctttg ctgacagctg agcaaaccgc attctacacc
gaaggccctc tattgatggc 480cctgatttaa atcccttccg ccgcctgccg caggtggcta
gggtctgagc acacttgaac 540ctcacacccg ccccaggggt agctccttgg ttcccttcag
ccccgatgtg tccctcgtgc 600tttgaaatgg aattacagtt ttggttaaaa acatgccttt
tccgagttag gaagaatcta 660aatcgactga acgccagtct aaaatttcgg cgttcccaca
ccgggagtcg aacccgggcc 720gcctgggtga aaaccaggaa tcctaaccgc tagaccatgt
gggagacggc aatagcgact 780ccaagcctag acaaattgag tcttctcggt cggcttccgc
ccactccatc gcgttcatcc 840gtaggcgtca aacctgctcc tgcgcctgcg cggagtctgc
agcggtttaa accgttcagg 900ttcgcattct actgtttctc tccttgcagg ggcccttgaa
tctttcttca atctactctc 960gcgtgcccgg gcgactgggc ataaccctac aggttcatgt
ggggtgggtg gcgcgcgcta 1020gcggtgaagg tcactcacaa ttgcgcgctg ggcagacgac
ggcagccatt acttttacct 1080cgatcgctgt tttcctggat ccgcacgggt ccaacccgac
tcatcccaac caacctgagg 1140tatgaaaacc aggaaagaga gctagcaccg gagcgttggt
ggtatagtgg taagcatagc 1200tgccttccaa gcagttgacc cgggttcgat tcccggccaa
cgcaagtcgt tttgggtgtt 1260ttttcccccc cccgcctttt ccttttcgtg ttttctgggc
cccagcatcg ttgagggttt 1320tcgtgaggtt ttcctgagga aacttccgct ccgaaaggac
ccactttccg ctacacccgc 1380gaccacggct ggaccaccgc gctcctgacg gatgcgccct
gcaagcccct ccaggcgaga 1440gcaggccggc ctgtgctcag ttttgtagca tcaaaactag
gatttctctt gttaccccca 1500gtcactccat tcagttttcg tgtctttccc agctgcatcc
atcctttcct cattttcgta 1560tgcagccgac tttttgtgac atctttgtat tcattctctg
caattcagct gacctggcca 1620aggaaacaag atcctaagcg tctttccggc ggcgccgtgg
cttagttggt taaagcgcct 1680gtctagtaaa caggagatcc tgggttcgaa tcccagcggt
gcctccgtgt ttcccccacg 1740cttttgccaa cattaaacat tgtgaggaca gttgcagaaa
ctcataactt ccatcctaca 1800tggtttactc acgtacccat ctatcctctc ccggtgcatc
tgccacacgc tgttgggttt 1860ttgctcttcg tgcacatggt acttgcgcct cgacctgcag
ttacaccagt cgcatcatct 1920gtacagcgct aaaacctagc tgggcgtggt ggtcctgcag
tcccagctac tcgggaggct 1980gaggcaggag aatggcttga atccaggagg cggaggtggc
agtgagccga gattgcgcca 2040ctgcactcca gcctggtgac agagcgagac tccatctcaa
aaaaaaaaaa aaaaaaaaag 2100tcagaattag gtactaaaag ccatatgaca tgcctgaaca
gggacttgaa ccctggaccc 2160tcagattaaa agtctgatgc tctaccaact gagctatcca
ggcttcttcc ctgctagttt 2220attttaatgc agtaataaat aacagcactt tgttaaaaat
aataaaggta taatctgtga 2280cacatccaaa gtggacaaga tgaagagata ataggtatac
accaggattc ccctgggcaa 2340actgggacct cttggtaccc tatatattaa gagtctcggg
ttttgttttc acttaagcaa 2400atggttaacg aattagcagg ttaagaaaaa ctgtttcccc
gtaagaagca gggttctttg 2460gtgttcaatg tggagctccg ccactcccag cccgggtgaa
ggaaaactgg gaaacagaat 2520gaatgtgatt atctattcga agataaattt ccacaaagca
tgccgtttga tagtagctta 2580taatgtggaa gtaaggcatc ctgtcatccg gccggttagc
tcagttggtt agagcgtggt 2640gctaataacg ccaaggtcgc gggttcgatc cccgtactgg
ccaagtattc tctgtggctt 2700ttatcaccag aatggatagt aacccagaca tcgatctaaa
cgtgtacctg tgtgtttctc 2760caggcttaac tttgccccga gaaaacggat ctgtgaattt
ggtgcgccct cgcttactcg 2820acagcggtta atttgaacgg ggacgtttct ttccgctgcc
tccaaggcat acccacatcc 2880taccacgatg gtggcggccg caa
2903181467PRTArtificial Sequencesynthetic HP44/OL
aa 18Met Gln Leu Glu Leu Ser Thr Cys Val Phe Leu Cys Leu Leu Pro Leu1
5 10 15Gly Phe Ser Ala Ile
Arg Arg Tyr Tyr Leu Gly Ala Val Glu Leu Ser 20
25 30Trp Asp Tyr Arg Gln Ser Glu Leu Leu Arg Glu Leu
His Val Asp Thr 35 40 45Arg Phe
Pro Ala Thr Ala Pro Gly Ala Leu Pro Leu Gly Pro Ser Val 50
55 60Leu Tyr Lys Lys Thr Val Phe Val Glu Phe Thr
Asp Gln Leu Phe Ser65 70 75
80Val Ala Arg Pro Arg Pro Pro Trp Met Gly Leu Leu Gly Pro Thr Ile
85 90 95Gln Ala Glu Val Tyr
Asp Thr Val Val Val Thr Leu Lys Asn Met Ala 100
105 110Ser His Pro Val Ser Leu His Ala Val Gly Val Ser
Phe Trp Lys Ser 115 120 125Ser Glu
Gly Ala Glu Tyr Glu Asp His Thr Ser Gln Arg Glu Lys Glu 130
135 140Asp Asp Lys Val Leu Pro Gly Lys Ser Gln Thr
Tyr Val Trp Gln Val145 150 155
160Leu Lys Glu Asn Gly Pro Thr Ala Ser Asp Pro Pro Cys Leu Thr Tyr
165 170 175Ser Tyr Leu Ser
His Val Asp Leu Val Lys Asp Leu Asn Ser Gly Leu 180
185 190Ile Gly Ala Leu Leu Val Cys Arg Glu Gly Ser
Leu Thr Arg Glu Arg 195 200 205Thr
Gln Asn Leu His Glu Phe Val Leu Leu Phe Ala Val Phe Asp Glu 210
215 220Gly Lys Ser Trp His Ser Ala Arg Asn Asp
Ser Trp Thr Arg Ala Met225 230 235
240Asp Pro Ala Pro Ala Arg Ala Gln Pro Ala Met His Thr Val Asn
Gly 245 250 255Tyr Val Asn
Arg Ser Leu Pro Gly Leu Ile Gly Cys His Lys Lys Ser 260
265 270Val Tyr Trp His Val Ile Gly Met Gly Thr
Ser Pro Glu Val His Ser 275 280
285Ile Phe Leu Glu Gly His Thr Phe Leu Val Arg His His Arg Gln Ala 290
295 300Ser Leu Glu Ile Ser Pro Leu Thr
Phe Leu Thr Ala Gln Thr Phe Leu305 310
315 320Met Asp Leu Gly Gln Phe Leu Leu Phe Cys His Ile
Ser Ser His His 325 330
335His Gly Gly Met Glu Ala His Val Arg Val Glu Ser Cys Ala Glu Glu
340 345 350Pro Gln Leu Arg Arg Lys
Ala Asp Glu Glu Glu Asp Tyr Asp Asp Asn 355 360
365Leu Tyr Asp Ser Asp Met Asp Val Val Arg Leu Asp Gly Asp
Asp Val 370 375 380Ser Pro Phe Ile Gln
Ile Arg Ser Val Ala Lys Lys His Pro Lys Thr385 390
395 400Trp Val His Tyr Ile Ser Ala Glu Glu Glu
Asp Trp Asp Tyr Ala Pro 405 410
415Ala Val Pro Ser Pro Ser Asp Arg Ser Tyr Lys Ser Leu Tyr Leu Asn
420 425 430Ser Gly Pro Gln Arg
Ile Gly Arg Lys Tyr Lys Lys Ala Arg Phe Val 435
440 445Ala Tyr Thr Asp Val Thr Phe Lys Thr Arg Lys Ala
Ile Pro Tyr Glu 450 455 460Ser Gly Ile
Leu Gly Pro Leu Leu Tyr Gly Glu Val Gly Asp Thr Leu465
470 475 480Leu Ile Ile Phe Lys Asn Lys
Ala Ser Arg Pro Tyr Asn Ile Tyr Pro 485
490 495His Gly Ile Thr Asp Val Ser Ala Leu His Pro Gly
Arg Leu Leu Lys 500 505 510Gly
Trp Lys His Leu Lys Asp Met Pro Ile Leu Pro Gly Glu Thr Phe 515
520 525Lys Tyr Lys Trp Thr Val Thr Val Glu
Asp Gly Pro Thr Lys Ser Asp 530 535
540Pro Arg Cys Leu Thr Arg Tyr Tyr Ser Ser Ser Ile Asn Leu Glu Lys545
550 555 560Asp Leu Ala Ser
Gly Leu Ile Gly Pro Leu Leu Ile Cys Tyr Lys Glu 565
570 575Ser Val Asp Gln Arg Gly Asn Gln Met Met
Ser Asp Lys Arg Asn Val 580 585
590Ile Leu Phe Ser Val Phe Asp Glu Asn Gln Ser Trp Tyr Leu Ala Glu
595 600 605Asn Ile Gln Arg Phe Leu Pro
Asn Pro Asp Gly Leu Gln Pro Gln Asp 610 615
620Pro Glu Phe Gln Ala Ser Asn Ile Met His Ser Ile Asn Gly Tyr
Val625 630 635 640Phe Asp
Ser Leu Gln Leu Ser Val Cys Leu His Glu Val Ala Tyr Trp
645 650 655Tyr Ile Leu Ser Val Gly Ala
Gln Thr Asp Phe Leu Ser Val Phe Phe 660 665
670Ser Gly Tyr Thr Phe Lys His Lys Met Val Tyr Glu Asp Thr
Leu Thr 675 680 685Leu Phe Pro Phe
Ser Gly Glu Thr Val Phe Met Ser Met Glu Asn Pro 690
695 700Gly Leu Trp Val Leu Gly Cys His Asn Ser Asp Leu
Arg Asn Arg Gly705 710 715
720Met Thr Ala Leu Leu Lys Val Tyr Ser Cys Asp Arg Asp Ile Gly Asp
725 730 735Tyr Tyr Asp Asn Thr
Tyr Glu Asp Ile Pro Gly Phe Leu Leu Ser Gly 740
745 750Lys Asn Val Ile Glu Pro Arg Ser Phe Ala Gln Asn
Ser Arg Pro Pro 755 760 765Ser Ala
Ser Ala Pro Lys Pro Pro Val Leu Arg Arg His Gln Arg Asp 770
775 780Ile Ser Leu Pro Thr Phe Gln Pro Glu Glu Asp
Lys Met Asp Tyr Asp785 790 795
800Asp Ile Phe Ser Thr Glu Thr Lys Gly Glu Asp Phe Asp Ile Tyr Gly
805 810 815Glu Asp Glu Asn
Gln Asp Pro Arg Ser Phe Gln Lys Arg Thr Arg His 820
825 830Tyr Phe Ile Ala Ala Val Glu Gln Leu Trp Asp
Tyr Gly Met Ser Glu 835 840 845Ser
Pro Arg Ala Leu Arg Asn Arg Ala Gln Asn Gly Glu Val Pro Arg 850
855 860Phe Lys Lys Val Val Phe Arg Glu Phe Ala
Asp Gly Ser Phe Thr Gln865 870 875
880Pro Ser Tyr Arg Gly Glu Leu Asn Lys His Leu Gly Leu Leu Gly
Pro 885 890 895Tyr Ile Arg
Ala Glu Val Glu Asp Asn Ile Met Val Thr Phe Lys Asn 900
905 910Gln Ala Ser Arg Pro Tyr Ser Phe Tyr Ser
Ser Leu Ile Ser Tyr Pro 915 920
925Asp Asp Gln Glu Gln Gly Ala Glu Pro Arg His Asn Phe Val Gln Pro 930
935 940Asn Glu Thr Arg Thr Tyr Phe Trp
Lys Val Gln His His Met Ala Pro945 950
955 960Thr Glu Asp Glu Phe Asp Cys Lys Ala Trp Ala Tyr
Phe Ser Asp Val 965 970
975Asp Leu Glu Lys Asp Val His Ser Gly Leu Ile Gly Pro Leu Leu Ile
980 985 990Cys Arg Ala Asn Thr Leu
Asn Ala Ala His Gly Arg Gln Val Thr Val 995 1000
1005Gln Glu Phe Ala Leu Phe Phe Thr Ile Phe Asp Glu
Thr Lys Ser 1010 1015 1020Trp Tyr Phe
Thr Glu Asn Val Glu Arg Asn Cys Arg Ala Pro Cys 1025
1030 1035His Leu Gln Met Glu Asp Pro Thr Leu Lys Glu
Asn Tyr Arg Phe 1040 1045 1050His Ala
Ile Asn Gly Tyr Val Met Asp Thr Leu Pro Gly Leu Val 1055
1060 1065Met Ala Gln Asn Gln Arg Ile Arg Trp Tyr
Leu Leu Ser Met Gly 1070 1075 1080Ser
Asn Glu Asn Ile His Ser Ile His Phe Ser Gly His Val Phe 1085
1090 1095Ser Val Arg Lys Lys Glu Glu Tyr Lys
Met Ala Val Tyr Asn Leu 1100 1105
1110Tyr Pro Gly Val Phe Glu Thr Val Glu Met Leu Pro Ser Lys Val
1115 1120 1125Gly Ile Trp Arg Ile Glu
Cys Leu Ile Gly Glu His Leu Gln Ala 1130 1135
1140Gly Met Ser Thr Thr Phe Leu Val Tyr Ser Lys Lys Cys Gln
Thr 1145 1150 1155Pro Leu Gly Met Ala
Ser Gly His Ile Arg Asp Phe Gln Ile Thr 1160 1165
1170Ala Ser Gly Gln Tyr Gly Gln Trp Ala Pro Lys Leu Ala
Arg Leu 1175 1180 1185His Tyr Ser Gly
Ser Ile Asn Ala Trp Ser Thr Lys Glu Pro Phe 1190
1195 1200Ser Trp Ile Lys Val Asp Leu Leu Ala Pro Met
Ile Ile His Gly 1205 1210 1215Ile Lys
Thr Gln Gly Ala Arg Gln Lys Phe Ser Ser Leu Tyr Ile 1220
1225 1230Ser Gln Phe Ile Ile Met Tyr Ser Leu Asp
Gly Lys Lys Trp Gln 1235 1240 1245Thr
Tyr Arg Gly Asn Ser Thr Gly Thr Leu Met Val Phe Phe Gly 1250
1255 1260Asn Val Asp Ser Ser Gly Ile Lys His
Asn Ile Phe Asn Pro Pro 1265 1270
1275Ile Ile Ala Arg Tyr Ile Arg Leu His Pro Thr His Tyr Ser Ile
1280 1285 1290Arg Ser Thr Leu Arg Met
Glu Leu Met Gly Cys Asp Leu Asn Ser 1295 1300
1305Cys Ser Met Pro Leu Gly Met Glu Ser Lys Ala Ile Ser Asp
Ala 1310 1315 1320Gln Ile Thr Ala Ser
Ser Tyr Phe Thr Asn Met Phe Ala Thr Trp 1325 1330
1335Ser Pro Ser Lys Ala Arg Leu His Leu Gln Gly Arg Ser
Asn Ala 1340 1345 1350Trp Arg Pro Gln
Val Asn Asn Pro Lys Glu Trp Leu Gln Val Asp 1355
1360 1365Phe Gln Lys Thr Met Lys Val Thr Gly Val Thr
Thr Gln Gly Val 1370 1375 1380Lys Ser
Leu Leu Thr Ser Met Tyr Val Lys Glu Phe Leu Ile Ser 1385
1390 1395Ser Ser Gln Asp Gly His Gln Trp Thr Leu
Phe Phe Gln Asn Gly 1400 1405 1410Lys
Val Lys Val Phe Gln Gly Asn Gln Asp Ser Phe Thr Pro Val 1415
1420 1425Val Asn Ser Leu Asp Pro Pro Leu Leu
Thr Arg Tyr Leu Arg Ile 1430 1435
1440His Pro Gln Ser Trp Val His Gln Ile Ala Leu Arg Met Glu Val
1445 1450 1455Leu Gly Cys Glu Ala Gln
Asp Leu Tyr 1460 1465
19 4401 DNA Artificial Sequence synthetic HP44/OL NT 19
atgcagctag agctctccac ctgtgtcttt
ctgtgtctct tgccactcgg ctttagtgcc 60atcaggagat actacctggg cgcagtggaa
ctgtcctggg actaccggca aagtgaactc 120ctccgtgagc tgcacgtgga caccagattt
cctgctacag cgccaggagc tcttccgttg 180ggcccgtcag tcctgtacaa aaagactgtg
ttcgtagagt tcacggatca acttttcagc 240gttgccaggc ccaggccacc atggatgggt
ctgctgggtc ctaccatcca ggctgaggtt 300tacgacacgg tggtcgttac cctgaagaac
atggcttctc atcccgttag tcttcacgct 360gtcggcgtct ccttctggaa atcttccgaa
ggcgctgaat atgaggatca caccagccaa 420agggagaagg aagacgataa agtccttccc
ggtaaaagcc aaacctacgt ctggcaggtc 480ctgaaagaaa atggtccaac agcctctgac
ccaccatgtc ttacctactc atacctgtct 540cacgtggacc tggtgaaaga cctgaattcg
ggcctcattg gagccctgct ggtttgtaga 600gaagggagtc tgaccagaga aaggacccag
aacctgcacg aatttgtact actttttgct 660gtctttgatg aagggaaaag ttggcactca
gcaagaaatg actcctggac acgggccatg 720gatcccgcac ctgccagggc ccagcctgca
atgcacacag tcaatggcta tgtcaacagg 780tctctgccag gtctgatcgg atgtcataag
aaatcagtct actggcacgt gattggaatg 840ggcaccagcc cggaagtgca ctccattttt
cttgaaggcc acacgtttct cgtgaggcac 900catcgccagg cttccttgga gatctcgcca
ctaactttcc tcactgctca gacattcctg 960atggaccttg gccagttcct actgttttgt
catatctctt cccaccacca tggtggcatg 1020gaggctcacg tcagagtaga aagctgcgcc
gaggagcccc agctgcggag gaaagctgat 1080gaagaggaag attatgatga caatttgtac
gactcggaca tggacgtggt ccggctcgat 1140ggtgacgacg tgtctccctt tatccaaatc
cgctcggttg ccaagaagca tcccaaaacc 1200tgggtgcact acatctctgc agaggaggag
gactgggact acgcccccgc ggtccccagc 1260cccagtgaca gaagttataa aagtctctac
ttgaacagtg gtcctcagcg aattggtagg 1320aaatacaaaa aagctcgatt cgtcgcttac
acggatgtaa catttaagac tcgtaaagct 1380attccgtatg aatcaggaat cctgggacct
ttactttatg gagaagttgg agacacactt 1440ttgattatat ttaagaataa agcgagccga
ccatataaca tctaccctca tggaatcact 1500gatgtcagcg ctttgcaccc agggagactt
ctaaaaggtt ggaaacattt gaaagacatg 1560ccaattctgc caggagagac tttcaagtat
aaatggacag tgactgtgga agatgggcca 1620accaagtccg atcctcggtg cctgacccgc
tactactcga gctccattaa tctagagaaa 1680gatctggctt cgggactcat tggccctctc
ctcatctgct acaaagaatc tgtagaccaa 1740agaggaaacc agatgatgtc agacaagaga
aacgtcatcc tgttttctgt attcgatgag 1800aatcaaagct ggtacctcgc agagaatatt
cagcgcttcc tccccaatcc ggatggatta 1860cagccccagg atccagagtt ccaagcttct
aacatcatgc acagcatcaa tggctatgtt 1920tttgatagct tgcagctgtc ggtttgtttg
cacgaggtgg catactggta cattctaagt 1980gttggagcac agacggactt cctctccgtc
ttcttctctg gctacacctt caaacacaaa 2040atggtctatg aagacacact caccctgttc
cccttctcag gagaaacggt cttcatgtca 2100atggaaaacc caggtctctg ggtccttggg
tgccacaact cagacttgcg gaacagaggg 2160atgacagcct tactgaaggt gtatagttgt
gacagggaca ttggtgatta ttatgacaac 2220acttatgaag atattccagg cttcttgctg
agtggaaaga atgtcattga acctaggagc 2280tttgcccaga attcaagacc ccctagtgcg
agcgctccaa agcctccggt cctgcgacgg 2340catcagaggg acataagcct tcctactttt
cagccggagg aagacaaaat ggactatgat 2400gatatcttct caactgaaac gaagggagaa
gattttgaca tttacggtga ggatgaaaat 2460caggaccctc gcagctttca gaagagaacc
cgacactatt tcattgctgc ggtggagcag 2520ctctgggatt acgggatgag cgaatccccc
cgggcgctaa gaaacagggc tcagaacgga 2580gaggtgcctc ggttcaagaa ggtggtcttc
cgggaatttg ctgacggctc cttcacgcag 2640ccgtcgtacc gcggggaact caacaaacac
ttggggctct tgggacccta catcagagcg 2700gaagttgaag acaacatcat ggtaactttc
aaaaaccagg cgtctcgtcc ctattccttc 2760tactcgagcc ttatttctta tccggatgat
caggagcaag gggcagaacc tcgacacaac 2820ttcgtccagc caaatgaaac cagaacttac
ttttggaaag tgcagcatca catggcaccc 2880acagaagacg agtttgactg caaagcctgg
gcctactttt ctgatgttga cctggaaaaa 2940gatgtgcact caggcttgat cggccccctt
ctgatctgcc gcgccaacac cctgaacgct 3000gctcacggta gacaagtgac cgtgcaagaa
tttgctctgt ttttcactat ttttgatgag 3060acaaagagct ggtacttcac tgaaaatgtg
gaaaggaact gccgggcccc ctgccatctg 3120cagatggagg accccactct gaaagaaaac
tatcgcttcc atgcaatcaa tggctatgtg 3180atggatacac tccctggctt agtaatggct
cagaatcaaa ggatccgatg gtatctgctc 3240agcatgggca gcaatgaaaa tatccattcg
attcatttta gcggacacgt gttcagtgta 3300cggaaaaagg aggagtataa aatggccgtg
tacaatctct atccgggtgt ctttgagaca 3360gtggaaatgc taccgtccaa agttggaatt
tggcgaatag aatgcctgat tggcgagcac 3420ctgcaagctg ggatgagcac gactttcctg
gtgtacagca agaagtgtca gactcccctg 3480ggaatggctt ctggacacat tagagatttt
cagattacag cttcaggaca atatggacag 3540tgggccccaa agctggccag acttcattat
tccggatcaa tcaatgcctg gagcaccaag 3600gagccctttt cttggatcaa ggtggatctg
ttggcaccaa tgattattca cggcatcaag 3660acccagggtg cccgtcagaa gttctccagc
ctctacatct ctcagtttat catcatgtat 3720agtcttgatg ggaagaagtg gcagacttat
cgaggaaatt ccactggaac cttaatggtc 3780ttctttggca atgtggattc atctgggata
aaacacaata tttttaaccc tccaattatt 3840gctcgataca tccgtttgca cccaactcat
tatagcattc gcagcactct tcgcatggag 3900ttgatgggct gtgatttaaa tagttgcagc
atgccattgg gaatggagag taaagcaata 3960tcagatgcac agattactgc ttcatcctac
tttaccaata tgtttgccac ctggtctcct 4020tcaaaagctc gacttcacct ccaagggagg
agtaatgcct ggagacctca ggtgaataat 4080ccaaaagagt ggctgcaagt ggacttccag
aagacaatga aagtcacagg agtaactact 4140cagggagtaa aatctctgct taccagcatg
tatgtgaagg agttcctcat ctccagcagt 4200caagatggcc atcagtggac tctctttttt
cagaatggca aagtaaaggt ttttcaggga 4260aatcaagact ccttcacacc tgtggtgaac
tctctagacc caccgttact gactcgctac 4320cttcgaattc acccccagag ttgggtgcac
cagattgccc tgaggatgga ggttctgggc 4380tgcgaggcac aggacctcta c
4401
20 1457 PRT Artificial Sequence synthetic HP46/SQ aa 20
Met Gln Leu Glu Leu Ser Thr Cys Val Phe
Leu Cys Leu Leu Pro Leu1 5 10
15Gly Phe Ser Ala Ile Arg Arg Tyr Tyr Leu Gly Ala Val Glu Leu Ser
20 25 30Trp Asp Tyr Arg Gln Ser
Glu Leu Leu Arg Glu Leu His Val Asp Thr 35 40
45Arg Phe Pro Ala Thr Ala Pro Gly Ala Leu Pro Leu Gly Pro
Ser Val 50 55 60Leu Tyr Lys Lys Thr
Val Phe Val Glu Phe Thr Asp Gln Leu Phe Ser65 70
75 80Val Ala Arg Pro Arg Pro Pro Trp Met Gly
Leu Leu Gly Pro Thr Ile 85 90
95Gln Ala Glu Val Tyr Asp Thr Val Val Val Thr Leu Lys Asn Met Ala
100 105 110Ser His Pro Val Ser
Leu His Ala Val Gly Val Ser Phe Trp Lys Ser 115
120 125Ser Glu Gly Ala Glu Tyr Glu Asp His Thr Ser Gln
Arg Glu Lys Glu 130 135 140Asp Asp Lys
Val Leu Pro Gly Lys Ser Gln Thr Tyr Val Trp Gln Val145
150 155 160Leu Lys Glu Asn Gly Pro Thr
Ala Ser Asp Pro Pro Cys Leu Thr Tyr 165
170 175Ser Tyr Leu Ser His Val Asp Leu Val Lys Asp Leu
Asn Ser Gly Leu 180 185 190Ile
Gly Ala Leu Leu Val Cys Arg Glu Gly Ser Leu Thr Arg Glu Arg 195
200 205Thr Gln Asn Leu His Glu Phe Val Leu
Leu Phe Ala Val Phe Asp Glu 210 215
220Gly Lys Ser Trp His Ser Ala Arg Asn Asp Ser Trp Thr Arg Ala Met225
230 235 240Asp Pro Ala Pro
Ala Arg Ala Gln Pro Ala Met His Thr Val Asn Gly 245
250 255Tyr Val Asn Arg Ser Leu Pro Gly Leu Ile
Gly Cys His Lys Lys Ser 260 265
270Val Tyr Trp His Val Ile Gly Met Gly Thr Ser Pro Glu Val His Ser
275 280 285Ile Phe Leu Glu Gly His Thr
Phe Leu Val Arg His His Arg Gln Ala 290 295
300Ser Leu Glu Ile Ser Pro Leu Thr Phe Leu Thr Ala Gln Thr Phe
Leu305 310 315 320Met Asp
Leu Gly Gln Phe Leu Leu Phe Cys His Ile Ser Ser His His
325 330 335His Gly Gly Met Glu Ala His
Val Arg Val Glu Ser Cys Ala Glu Glu 340 345
350Pro Gln Leu Arg Arg Lys Ala Asp Glu Glu Glu Asp Tyr Asp
Asp Asn 355 360 365Leu Tyr Asp Ser
Asp Met Asp Val Val Arg Leu Asp Gly Asp Asp Val 370
375 380Ser Pro Phe Ile Gln Ile Arg Ser Val Ala Lys Lys
His Pro Lys Thr385 390 395
400Trp Val His Tyr Ile Ala Ala Glu Glu Glu Asp Trp Asp Tyr Ala Pro
405 410 415Leu Val Leu Ala Pro
Asp Asp Arg Ser Tyr Lys Ser Gln Tyr Leu Asn 420
425 430Asn Gly Pro Gln Arg Ile Gly Arg Lys Tyr Lys Lys
Val Arg Phe Met 435 440 445Ala Tyr
Thr Asp Glu Thr Phe Lys Thr Arg Glu Ala Ile Gln His Glu 450
455 460Ser Gly Ile Leu Gly Pro Leu Leu Tyr Gly Glu
Val Gly Asp Thr Leu465 470 475
480Leu Ile Ile Phe Lys Asn Gln Ala Ser Arg Pro Tyr Asn Ile Tyr Pro
485 490 495His Gly Ile Thr
Asp Val Arg Pro Leu Tyr Ser Arg Arg Leu Pro Lys 500
505 510Gly Val Lys His Leu Lys Asp Phe Pro Ile Leu
Pro Gly Glu Ile Phe 515 520 525Lys
Tyr Lys Trp Thr Val Thr Val Glu Asp Gly Pro Thr Lys Ser Asp 530
535 540Pro Arg Cys Leu Thr Arg Tyr Tyr Ser Ser
Phe Val Asn Met Glu Arg545 550 555
560Asp Leu Ala Ser Gly Leu Ile Gly Pro Leu Leu Ile Cys Tyr Lys
Glu 565 570 575Ser Val Asp
Gln Arg Gly Asn Gln Ile Met Ser Asp Lys Arg Asn Val 580
585 590Ile Leu Phe Ser Val Phe Asp Glu Asn Arg
Ser Trp Tyr Leu Thr Glu 595 600
605Asn Ile Gln Arg Phe Leu Pro Asn Pro Ala Gly Val Gln Leu Glu Asp 610
615 620Pro Glu Phe Gln Ala Ser Asn Ile
Met His Ser Ile Asn Gly Tyr Val625 630
635 640Phe Asp Ser Leu Gln Leu Ser Val Cys Leu His Glu
Val Ala Tyr Trp 645 650
655Tyr Ile Leu Ser Ile Gly Ala Gln Thr Asp Phe Leu Ser Val Phe Phe
660 665 670Ser Gly Tyr Thr Phe Lys
His Lys Met Val Tyr Glu Asp Thr Leu Thr 675 680
685Leu Phe Pro Phe Ser Gly Glu Thr Val Phe Met Ser Met Glu
Asn Pro 690 695 700Gly Leu Trp Ile Leu
Gly Cys His Asn Ser Asp Phe Arg Asn Arg Gly705 710
715 720Met Thr Ala Leu Leu Lys Val Ser Ser Cys
Asp Lys Asn Thr Gly Asp 725 730
735Tyr Tyr Glu Asp Ser Tyr Glu Asp Ile Ser Ala Tyr Leu Leu Ser Lys
740 745 750Asn Asn Ala Ile Glu
Pro Arg Ser Phe Ser Gln Asn Pro Pro Val Leu 755
760 765Lys Arg His Gln Arg Glu Ile Thr Arg Thr Thr Leu
Gln Ser Asp Gln 770 775 780Glu Glu Ile
Asp Tyr Asp Asp Thr Ile Ser Val Glu Met Lys Lys Glu785
790 795 800Asp Phe Asp Ile Tyr Asp Glu
Asp Glu Asn Gln Ser Pro Arg Ser Phe 805
810 815Gln Lys Lys Thr Arg His Tyr Phe Ile Ala Ala Val
Glu Arg Leu Trp 820 825 830Asp
Tyr Gly Met Ser Ser Ser Pro His Val Leu Arg Asn Arg Ala Gln 835
840 845Ser Gly Ser Val Pro Gln Phe Lys Lys
Val Val Phe Gln Glu Phe Thr 850 855
860Asp Gly Ser Phe Thr Gln Pro Leu Tyr Arg Gly Glu Leu Asn Glu His865
870 875 880Leu Gly Leu Leu
Gly Pro Tyr Ile Arg Ala Glu Val Glu Asp Asn Ile 885
890 895Met Val Thr Phe Arg Asn Gln Ala Ser Arg
Pro Tyr Ser Phe Tyr Ser 900 905
910Ser Leu Ile Ser Tyr Glu Glu Asp Gln Arg Gln Gly Ala Glu Pro Arg
915 920 925Lys Asn Phe Val Lys Pro Asn
Glu Thr Lys Thr Tyr Phe Trp Lys Val 930 935
940Gln His His Met Ala Pro Thr Lys Asp Glu Phe Asp Cys Lys Ala
Trp945 950 955 960Ala Tyr
Phe Ser Asp Val Asp Leu Glu Lys Asp Val His Ser Gly Leu
965 970 975Ile Gly Pro Leu Leu Val Cys
His Thr Asn Thr Leu Asn Pro Ala His 980 985
990Gly Arg Gln Val Thr Val Gln Glu Phe Ala Leu Phe Phe Thr
Ile Phe 995 1000 1005Asp Glu Thr
Lys Ser Trp Tyr Phe Thr Glu Asn Met Glu Arg Asn 1010
1015 1020Cys Arg Ala Pro Cys Asn Ile Gln Met Glu Asp
Pro Thr Phe Lys 1025 1030 1035Glu Asn
Tyr Arg Phe His Ala Ile Asn Gly Tyr Ile Met Asp Thr 1040
1045 1050Leu Pro Gly Leu Val Met Ala Gln Asp Gln
Arg Ile Arg Trp Tyr 1055 1060 1065Leu
Leu Ser Met Gly Ser Asn Glu Asn Ile His Ser Ile His Phe 1070
1075 1080Ser Gly His Val Phe Thr Val Arg Lys
Lys Glu Glu Tyr Lys Met 1085 1090
1095Ala Leu Tyr Asn Leu Tyr Pro Gly Val Phe Glu Thr Val Glu Met
1100 1105 1110Leu Pro Ser Lys Ala Gly
Ile Trp Arg Val Glu Cys Leu Ile Gly 1115 1120
1125Glu His Leu His Ala Gly Met Ser Thr Leu Phe Leu Val Tyr
Ser 1130 1135 1140Asn Lys Cys Gln Thr
Pro Leu Gly Met Ala Ser Gly His Ile Arg 1145 1150
1155Asp Phe Gln Ile Thr Ala Ser Gly Gln Tyr Gly Gln Trp
Ala Pro 1160 1165 1170Lys Leu Ala Arg
Leu His Tyr Ser Gly Ser Ile Asn Ala Trp Ser 1175
1180 1185Thr Lys Glu Pro Phe Ser Trp Ile Lys Val Asp
Leu Leu Ala Pro 1190 1195 1200Met Ile
Ile His Gly Ile Lys Thr Gln Gly Ala Arg Gln Lys Phe 1205
1210 1215Ser Ser Leu Tyr Ile Ser Gln Phe Ile Ile
Met Tyr Ser Leu Asp 1220 1225 1230Gly
Lys Lys Trp Gln Thr Tyr Arg Gly Asn Ser Thr Gly Thr Leu 1235
1240 1245Met Val Phe Phe Gly Asn Val Asp Ser
Ser Gly Ile Lys His Asn 1250 1255
1260Ile Phe Asn Pro Pro Ile Ile Ala Arg Tyr Ile Arg Leu His Pro
1265 1270 1275Thr His Tyr Ser Ile Arg
Ser Thr Leu Arg Met Glu Leu Met Gly 1280 1285
1290Cys Asp Leu Asn Ser Cys Ser Met Pro Leu Gly Met Glu Ser
Lys 1295 1300 1305Ala Ile Ser Asp Ala
Gln Ile Thr Ala Ser Ser Tyr Phe Thr Asn 1310 1315
1320Met Phe Ala Thr Trp Ser Pro Ser Lys Ala Arg Leu His
Leu Gln 1325 1330 1335Gly Arg Ser Asn
Ala Trp Arg Pro Gln Val Asn Asn Pro Lys Glu 1340
1345 1350Trp Leu Gln Val Asp Phe Gln Lys Thr Met Lys
Val Thr Gly Val 1355 1360 1365Thr Thr
Gln Gly Val Lys Ser Leu Leu Thr Ser Met Tyr Val Lys 1370
1375 1380Glu Phe Leu Ile Ser Ser Ser Gln Asp Gly
His Gln Trp Thr Leu 1385 1390 1395Phe
Phe Gln Asn Gly Lys Val Lys Val Phe Gln Gly Asn Gln Asp 1400
1405 1410Ser Phe Thr Pro Val Val Asn Ser Leu
Asp Pro Pro Leu Leu Thr 1415 1420
1425Arg Tyr Leu Arg Ile His Pro Gln Ser Trp Val His Gln Ile Ala
1430 1435 1440Leu Arg Met Glu Val Leu
Gly Cys Glu Ala Gln Asp Leu Tyr 1445 1450
1455
21 4368 DNA Artificial Sequence synthetic HP46/SQ NT 21
atgcagctag
agctctccac ctgtgtcttt ctgtgtctct tgccactcgg ctttagtgcc 60atcaggagat
actacctggg cgcagtggaa ctgtcctggg actaccggca aagtgaactc 120ctccgtgagc
tgcacgtgga caccagattt cctgctacag cgccaggagc tcttccgttg 180ggcccgtcag
tcctgtacaa aaagactgtg ttcgtagagt tcacggatca acttttcagc 240gttgccaggc
ccaggccacc atggatgggt ctgctgggtc ctaccatcca ggctgaggtt 300tacgacacgg
tggtcgttac cctgaagaac atggcttctc atcccgttag tcttcacgct 360gtcggcgtct
ccttctggaa atcttccgaa ggcgctgaat atgaggatca caccagccaa 420agggagaagg
aagacgataa agtccttccc ggtaaaagcc aaacctacgt ctggcaggtc 480ctgaaagaaa
atggtccaac agcctctgac ccaccatgtc ttacctactc atacctgtct 540cacgtggacc
tggtgaaaga cctgaattcg ggcctcattg gagccctgct ggtttgtaga 600gaagggagtc
tgaccagaga aaggacccag aacctgcacg aatttgtact actttttgct 660gtctttgatg
aagggaaaag ttggcactca gcaagaaatg actcctggac acgggccatg 720gatcccgcac
ctgccagggc ccagcctgca atgcacacag tcaatggcta tgtcaacagg 780tctctgccag
gtctgatcgg atgtcataag aaatcagtct actggcacgt gattggaatg 840ggcaccagcc
cggaagtgca ctccattttt cttgaaggcc acacgtttct cgtgaggcac 900catcgccagg
cttccttgga gatctcgcca ctaactttcc tcactgctca gacattcctg 960atggaccttg
gccagttcct actgttttgt catatctctt cccaccacca tggtggcatg 1020gaggctcacg
tcagagtaga aagctgcgcc gaggagcccc agctgcggag gaaagctgat 1080gaagaggaag
attatgatga caatttgtac gactcggaca tggacgtggt ccggctcgat 1140ggtgacgacg
tgtctccctt tatccaaatc cgctcagttg ccaagaagca tcctaaaact 1200tgggtacatt
acattgctgc tgaagaggag gactgggact atgctccctt agtcctcgcc 1260cccgatgaca
gaagttataa aagtcaatat ttgaacaatg gccctcagcg gattggtagg 1320aagtacaaaa
aagtccgatt tatggcatac acagatgaaa cctttaagac gcgtgaagct 1380attcagcatg
aatcaggaat cttgggacct ttactttatg gggaagttgg agacacactg 1440ttgattatat
ttaagaatca agcaagcaga ccatataaca tctaccctca cggaatcact 1500gatgtccgtc
ctttgtattc aaggagatta ccaaaaggtg taaaacattt gaaggatttt 1560ccaattctgc
caggagaaat attcaaatat aaatggacag tgactgtaga agatgggcca 1620actaaatcag
atccgcggtg cctgacccgc tattactcta gtttcgttaa tatggagaga 1680gatctagctt
caggactcat tggccctctc ctcatctgct acaaagaatc tgtagatcaa 1740agaggaaacc
agataatgtc agacaagagg aatgtcatcc tgttttctgt atttgatgag 1800aaccgaagct
ggtacctcac agagaatata caacgctttc tccccaatcc agctggagtg 1860cagcttgagg
atccagagtt ccaagcctcc aacatcatgc acagcatcaa tggctatgtt 1920tttgatagtt
tgcagttgtc agtttgtttg catgaggtgg catactggta cattctaagc 1980attggagcac
agactgactt cctttctgtc ttcttctctg gatatacctt caaacacaaa 2040atggtctatg
aagacacact caccctattc ccattctcag gagaaactgt cttcatgtcg 2100atggaaaacc
caggtctatg gattctgggg tgccacaact cagactttcg gaacagaggc 2160atgaccgcct
tactgaaggt ttctagttgt gacaagaaca ctggtgatta ttacgaggac 2220agttatgaag
atatttcagc atacttgctg agtaaaaaca atgccattga acctaggagc 2280ttctctcaga
atccaccagt cttgaaacgc catcaacggg aaataactcg tactactctt 2340cagtcagatc
aagaggaaat tgactatgat gataccatat cagttgaaat gaagaaggaa 2400gattttgaca
tttatgatga ggatgaaaat cagagccccc gcagctttca aaagaaaaca 2460cgacactatt
ttattgctgc agtggagagg ctctgggatt atgggatgag tagctcccca 2520catgttctaa
gaaacagggc tcagagtggc agtgtccctc agttcaagaa agttgttttc 2580caggaattta
ctgatggctc ctttactcag cccttatacc gtggagaact aaatgaacat 2640ttgggactcc
tggggccata tataagagca gaagttgaag ataatatcat ggtaactttc 2700agaaatcagg
cctctcgtcc ctattccttc tattctagcc ttatttctta tgaggaagat 2760cagaggcaag
gagcagaacc tagaaaaaac tttgtcaagc ctaatgaaac caaaacttac 2820ttttggaaag
tgcaacatca tatggcaccc actaaagatg agtttgactg caaagcctgg 2880gcttatttct
ctgatgttga cctggaaaaa gatgtgcact caggcctgat tggacccctt 2940ctggtctgcc
acactaacac actgaaccct gctcatggga gacaagtgac agtacaggaa 3000tttgctctgt
ttttcaccat ctttgatgag accaaaagct ggtacttcac tgaaaatatg 3060gaaagaaact
gcagggctcc ctgcaatatc cagatggaag atcccacttt taaagagaat 3120tatcgcttcc
atgcaatcaa tggctacata atggatacac tacctggctt agtaatggct 3180caggatcaaa
ggattcgatg gtatctgctc agcatgggca gcaatgaaaa catccattct 3240attcatttca
gtggacatgt gttcactgta cgaaaaaaag aggagtataa aatggcactg 3300tacaatctct
atccaggtgt ttttgagaca gtggaaatgt taccatccaa agctggaatt 3360tggcgggtgg
aatgccttat tggcgagcat ctacatgctg ggatgagcac actttttctg 3420gtgtacagca
ataagtgtca gactcccctg ggaatggctt ctggacacat tagagatttt 3480cagattacag
cttcaggaca atatggacag tgggccccaa agctggccag acttcattat 3540tccggatcaa
tcaatgcctg gagcaccaag gagccctttt cttggatcaa ggtggatctg 3600ttggcaccaa
tgattattca cggcatcaag acccagggtg cccgtcagaa gttctccagc 3660ctctacatct
ctcagtttat catcatgtat agtcttgatg ggaagaagtg gcagacttat 3720cgaggaaatt
ccactggaac cttaatggtc ttctttggca atgtggattc atctgggata 3780aaacacaata
tttttaaccc tccaattatt gctcgataca tccgtttgca cccaactcat 3840tatagcattc
gcagcactct tcgcatggag ttgatgggct gtgatttaaa tagttgcagc 3900atgccattgg
gaatggagag taaagcaata tcagatgcac agattactgc ttcatcctac 3960tttaccaata
tgtttgccac ctggtctcct tcaaaagctc gacttcacct ccaagggagg 4020agtaatgcct
ggagacctca ggtgaataat ccaaaagagt ggctgcaagt ggacttccag 4080aagacaatga
aagtcacagg agtaactact cagggagtaa aatctctgct taccagcatg 4140tatgtgaagg
agttcctcat ctccagcagt caagatggcc atcagtggac tctctttttt 4200cagaatggca
aagtaaaggt ttttcaggga aatcaagact ccttcacacc tgtggtgaac 4260tctctagacc
caccgttact gactcgctac cttcgaattc acccccagag ttgggtgcac 4320cagattgccc
tgaggatgga ggttctgggc tgcgaggcac aggacctc
4368
22 1467 PRT Artificial Sequence synthetic HP47/OL aa 22
Met Gln Leu Glu
Leu Ser Thr Cys Val Phe Leu Cys Leu Leu Pro Leu1 5
10 15Gly Phe Ser Ala Ile Arg Arg Tyr Tyr Leu
Gly Ala Val Glu Leu Ser 20 25
30Trp Asp Tyr Arg Gln Ser Glu Leu Leu Arg Glu Leu His Val Asp Thr
35 40 45Arg Phe Pro Ala Thr Ala Pro Gly
Ala Leu Pro Leu Gly Pro Ser Val 50 55
60Leu Tyr Lys Lys Thr Val Phe Val Glu Phe Thr Asp Gln Leu Phe Ser65
70 75 80Val Ala Arg Pro Arg
Pro Pro Trp Met Gly Leu Leu Gly Pro Thr Ile 85
90 95Gln Ala Glu Val Tyr Asp Thr Val Val Val Thr
Leu Lys Asn Met Ala 100 105
110Ser His Pro Val Ser Leu His Ala Val Gly Val Ser Phe Trp Lys Ser
115 120 125Ser Glu Gly Ala Glu Tyr Glu
Asp His Thr Ser Gln Arg Glu Lys Glu 130 135
140Asp Asp Lys Val Leu Pro Gly Lys Ser Gln Thr Tyr Val Trp Gln
Val145 150 155 160Leu Lys
Glu Asn Gly Pro Thr Ala Ser Asp Pro Pro Cys Leu Thr Tyr
165 170 175Ser Tyr Leu Ser His Val Asp
Leu Val Lys Asp Leu Asn Ser Gly Leu 180 185
190Ile Gly Ala Leu Leu Val Cys Arg Glu Gly Ser Leu Thr Arg
Glu Arg 195 200 205Thr Gln Asn Leu
His Glu Phe Val Leu Leu Phe Ala Val Phe Asp Glu 210
215 220Gly Lys Ser Trp His Ser Ala Arg Asn Asp Ser Trp
Thr Arg Ala Met225 230 235
240Asp Pro Ala Pro Ala Arg Ala Gln Pro Ala Met His Thr Val Asn Gly
245 250 255Tyr Val Asn Arg Ser
Leu Pro Gly Leu Ile Gly Cys His Lys Lys Ser 260
265 270Val Tyr Trp His Val Ile Gly Met Gly Thr Ser Pro
Glu Val His Ser 275 280 285Ile Phe
Leu Glu Gly His Thr Phe Leu Val Arg His His Arg Gln Ala 290
295 300Ser Leu Glu Ile Ser Pro Leu Thr Phe Leu Thr
Ala Gln Thr Phe Leu305 310 315
320Met Asp Leu Gly Gln Phe Leu Leu Phe Cys His Ile Ser Ser His His
325 330 335His Gly Gly Met
Glu Ala His Val Arg Val Glu Ser Cys Ala Glu Glu 340
345 350Pro Gln Leu Arg Arg Lys Ala Asp Glu Glu Glu
Asp Tyr Asp Asp Asn 355 360 365Leu
Tyr Asp Ser Asp Met Asp Val Val Arg Leu Asp Gly Asp Asp Val 370
375 380Ser Pro Phe Ile Gln Ile Arg Ser Val Ala
Lys Lys His Pro Lys Thr385 390 395
400Trp Val His Tyr Ile Ala Ala Glu Glu Glu Asp Trp Asp Tyr Ala
Pro 405 410 415Leu Val Leu
Ala Pro Asp Asp Arg Ser Tyr Lys Ser Gln Tyr Leu Asn 420
425 430Asn Gly Pro Gln Arg Ile Gly Arg Lys Tyr
Lys Lys Val Arg Phe Met 435 440
445Ala Tyr Thr Asp Glu Thr Phe Lys Thr Arg Glu Ala Ile Gln His Glu 450
455 460Ser Gly Ile Leu Gly Pro Leu Leu
Tyr Gly Glu Val Gly Asp Thr Leu465 470
475 480Leu Ile Ile Phe Lys Asn Gln Ala Ser Arg Pro Tyr
Asn Ile Tyr Pro 485 490
495His Gly Ile Thr Asp Val Arg Pro Leu Tyr Ser Arg Arg Leu Pro Lys
500 505 510Gly Val Lys His Leu Lys
Asp Phe Pro Ile Leu Pro Gly Glu Ile Phe 515 520
525Lys Tyr Lys Trp Thr Val Thr Val Glu Asp Gly Pro Thr Lys
Ser Asp 530 535 540Pro Arg Cys Leu Thr
Arg Tyr Tyr Ser Ser Phe Val Asn Met Glu Arg545 550
555 560Asp Leu Ala Ser Gly Leu Ile Gly Pro Leu
Leu Ile Cys Tyr Lys Glu 565 570
575Ser Val Asp Gln Arg Gly Asn Gln Ile Met Ser Asp Lys Arg Asn Val
580 585 590Ile Leu Phe Ser Val
Phe Asp Glu Asn Arg Ser Trp Tyr Leu Thr Glu 595
600 605Asn Ile Gln Arg Phe Leu Pro Asn Pro Ala Gly Val
Gln Leu Glu Asp 610 615 620Pro Glu Phe
Gln Ala Ser Asn Ile Met His Ser Ile Asn Gly Tyr Val625
630 635 640Phe Asp Ser Leu Gln Leu Ser
Val Cys Leu His Glu Val Ala Tyr Trp 645
650 655Tyr Ile Leu Ser Ile Gly Ala Gln Thr Asp Phe Leu
Ser Val Phe Phe 660 665 670Ser
Gly Tyr Thr Phe Lys His Lys Met Val Tyr Glu Asp Thr Leu Thr 675
680 685Leu Phe Pro Phe Ser Gly Glu Thr Val
Phe Met Ser Met Glu Asn Pro 690 695
700Gly Leu Trp Ile Leu Gly Cys His Asn Ser Asp Phe Arg Asn Arg Gly705
710 715 720Met Thr Ala Leu
Leu Lys Val Ser Ser Cys Asp Lys Asn Thr Gly Asp 725
730 735Tyr Tyr Glu Asp Ser Tyr Glu Asp Ile Ser
Ala Tyr Leu Leu Ser Lys 740 745
750Asn Asn Ala Ile Glu Pro Arg Ser Phe Ala Gln Asn Ser Arg Pro Pro
755 760 765Ser Ala Ser Ala Pro Lys Pro
Pro Val Leu Arg Arg His Gln Arg Asp 770 775
780Ile Ser Leu Pro Thr Phe Gln Pro Glu Glu Asp Lys Met Asp Tyr
Asp785 790 795 800Asp Ile
Phe Ser Thr Glu Thr Lys Gly Glu Asp Phe Asp Ile Tyr Gly
805 810 815Glu Asp Glu Asn Gln Asp Pro
Arg Ser Phe Gln Lys Arg Thr Arg His 820 825
830Tyr Phe Ile Ala Ala Val Glu Gln Leu Trp Asp Tyr Gly Met
Ser Glu 835 840 845Ser Pro Arg Ala
Leu Arg Asn Arg Ala Gln Asn Gly Glu Val Pro Arg 850
855 860Phe Lys Lys Val Val Phe Arg Glu Phe Ala Asp Gly
Ser Phe Thr Gln865 870 875
880Pro Ser Tyr Arg Gly Glu Leu Asn Lys His Leu Gly Leu Leu Gly Pro
885 890 895Tyr Ile Arg Ala Glu
Val Glu Asp Asn Ile Met Val Thr Phe Lys Asn 900
905 910Gln Ala Ser Arg Pro Tyr Ser Phe Tyr Ser Ser Leu
Ile Ser Tyr Pro 915 920 925Asp Asp
Gln Glu Gln Gly Ala Glu Pro Arg His Asn Phe Val Gln Pro 930
935 940Asn Glu Thr Arg Thr Tyr Phe Trp Lys Val Gln
His His Met Ala Pro945 950 955
960Thr Glu Asp Glu Phe Asp Cys Lys Ala Trp Ala Tyr Phe Ser Asp Val
965 970 975Asp Leu Glu Lys
Asp Val His Ser Gly Leu Ile Gly Pro Leu Leu Ile 980
985 990Cys Arg Ala Asn Thr Leu Asn Ala Ala His Gly
Arg Gln Val Thr Val 995 1000
1005Gln Glu Phe Ala Leu Phe Phe Thr Ile Phe Asp Glu Thr Lys Ser
1010 1015 1020Trp Tyr Phe Thr Glu Asn
Val Glu Arg Asn Cys Arg Ala Pro Cys 1025 1030
1035His Leu Gln Met Glu Asp Pro Thr Leu Lys Glu Asn Tyr Arg
Phe 1040 1045 1050His Ala Ile Asn Gly
Tyr Val Met Asp Thr Leu Pro Gly Leu Val 1055 1060
1065Met Ala Gln Asn Gln Arg Ile Arg Trp Tyr Leu Leu Ser
Met Gly 1070 1075 1080Ser Asn Glu Asn
Ile His Ser Ile His Phe Ser Gly His Val Phe 1085
1090 1095Ser Val Arg Lys Lys Glu Glu Tyr Lys Met Ala
Val Tyr Asn Leu 1100 1105 1110Tyr Pro
Gly Val Phe Glu Thr Val Glu Met Leu Pro Ser Lys Val 1115
1120 1125Gly Ile Trp Arg Ile Glu Cys Leu Ile Gly
Glu His Leu Gln Ala 1130 1135 1140Gly
Met Ser Thr Thr Phe Leu Val Tyr Ser Lys Lys Cys Gln Thr 1145
1150 1155Pro Leu Gly Met Ala Ser Gly His Ile
Arg Asp Phe Gln Ile Thr 1160 1165
1170Ala Ser Gly Gln Tyr Gly Gln Trp Ala Pro Lys Leu Ala Arg Leu
1175 1180 1185His Tyr Ser Gly Ser Ile
Asn Ala Trp Ser Thr Lys Glu Pro Phe 1190 1195
1200Ser Trp Ile Lys Val Asp Leu Leu Ala Pro Met Ile Ile His
Gly 1205 1210 1215Ile Lys Thr Gln Gly
Ala Arg Gln Lys Phe Ser Ser Leu Tyr Ile 1220 1225
1230Ser Gln Phe Ile Ile Met Tyr Ser Leu Asp Gly Lys Lys
Trp Gln 1235 1240 1245Thr Tyr Arg Gly
Asn Ser Thr Gly Thr Leu Met Val Phe Phe Gly 1250
1255 1260Asn Val Asp Ser Ser Gly Ile Lys His Asn Ile
Phe Asn Pro Pro 1265 1270 1275Ile Ile
Ala Arg Tyr Ile Arg Leu His Pro Thr His Tyr Ser Ile 1280
1285 1290Arg Ser Thr Leu Arg Met Glu Leu Met Gly
Cys Asp Leu Asn Ser 1295 1300 1305Cys
Ser Met Pro Leu Gly Met Glu Ser Lys Ala Ile Ser Asp Ala 1310
1315 1320Gln Ile Thr Ala Ser Ser Tyr Phe Thr
Asn Met Phe Ala Thr Trp 1325 1330
1335Ser Pro Ser Lys Ala Arg Leu His Leu Gln Gly Arg Ser Asn Ala
1340 1345 1350Trp Arg Pro Gln Val Asn
Asn Pro Lys Glu Trp Leu Gln Val Asp 1355 1360
1365Phe Gln Lys Thr Met Lys Val Thr Gly Val Thr Thr Gln Gly
Val 1370 1375 1380Lys Ser Leu Leu Thr
Ser Met Tyr Val Lys Glu Phe Leu Ile Ser 1385 1390
1395Ser Ser Gln Asp Gly His Gln Trp Thr Leu Phe Phe Gln
Asn Gly 1400 1405 1410Lys Val Lys Val
Phe Gln Gly Asn Gln Asp Ser Phe Thr Pro Val 1415
1420 1425Val Asn Ser Leu Asp Pro Pro Leu Leu Thr Arg
Tyr Leu Arg Ile 1430 1435 1440His Pro
Gln Ser Trp Val His Gln Ile Ala Leu Arg Met Glu Val 1445
1450 1455Leu Gly Cys Glu Ala Gln Asp Leu Tyr
1460 1465
23 4401 DNA Artificial Sequence synthetic HP47/OL NT 23
atgcagctag agctctccac ctgtgtcttt ctgtgtctct tgccactcgg ctttagtgcc
60atcaggagat actacctggg cgcagtggaa ctgtcctggg actaccggca aagtgaactc
120ctccgtgagc tgcacgtgga caccagattt cctgctacag cgccaggagc tcttccgttg
180ggcccgtcag tcctgtacaa aaagactgtg ttcgtagagt tcacggatca acttttcagc
240gttgccaggc ccaggccacc atggatgggt ctgctgggtc ctaccatcca ggctgaggtt
300tacgacacgg tggtcgttac cctgaagaac atggcttctc atcccgttag tcttcacgct
360gtcggcgtct ccttctggaa atcttccgaa ggcgctgaat atgaggatca caccagccaa
420agggagaagg aagacgataa agtccttccc ggtaaaagcc aaacctacgt ctggcaggtc
480ctgaaagaaa atggtccaac agcctctgac ccaccatgtc ttacctactc atacctgtct
540cacgtggacc tggtgaaaga cctgaattcg ggcctcattg gagccctgct ggtttgtaga
600gaagggagtc tgaccagaga aaggacccag aacctgcacg aatttgtact actttttgct
660gtctttgatg aagggaaaag ttggcactca gcaagaaatg actcctggac acgggccatg
720gatcccgcac ctgccagggc ccagcctgca atgcacacag tcaatggcta tgtcaacagg
780tctctgccag gtctgatcgg atgtcataag aaatcagtct actggcacgt gattggaatg
840ggcaccagcc cggaagtgca ctccattttt cttgaaggcc acacgtttct cgtgaggcac
900catcgccagg cttccttgga gatctcgcca ctaactttcc tcactgctca gacattcctg
960atggaccttg gccagttcct actgttttgt catatctctt cccaccacca tggtggcatg
1020gaggctcacg tcagagtaga aagctgcgcc gaggagcccc agctgcggag gaaagctgat
1080gaagaggaag attatgatga caatttgtac gactcggaca tggacgtggt ccggctcgat
1140ggtgacgacg tgtctccctt tatccaaatc cgctcagttg ccaagaagca tcctaaaact
1200tgggtacatt acattgctgc tgaagaggag gactgggact atgctccctt agtcctcgcc
1260cccgatgaca gaagttataa aagtcaatat ttgaacaatg gccctcagcg gattggtagg
1320aagtacaaaa aagtccgatt tatggcatac acagatgaaa cctttaagac gcgtgaagct
1380attcagcatg aatcaggaat cttgggacct ttactttatg gggaagttgg agacacactg
1440ttgattatat ttaagaatca agcaagcaga ccatataaca tctaccctca cggaatcact
1500gatgtccgtc ctttgtattc aaggagatta ccaaaaggtg taaaacattt gaaggatttt
1560ccaattctgc caggagaaat attcaaatat aaatggacag tgactgtaga agatgggcca
1620actaaatcag atccgcggtg cctgacccgc tattactcta gtttcgttaa tatggagaga
1680gatctagctt caggactcat tggccctctc ctcatctgct acaaagaatc tgtagatcaa
1740agaggaaacc agataatgtc agacaagagg aatgtcatcc tgttttctgt atttgatgag
1800aaccgaagct ggtacctcac agagaatata caacgctttc tccccaatcc agctggagtg
1860cagcttgagg atccagagtt ccaagcctcc aacatcatgc acagcatcaa tggctatgtt
1920tttgatagtt tgcagttgtc agtttgtttg catgaggtgg catactggta cattctaagc
1980attggagcac agactgactt cctttctgtc ttcttctctg gatatacctt caaacacaaa
2040atggtctatg aagacacact caccctattc ccattctcag gagaaactgt cttcatgtcg
2100atggaaaacc caggtctatg gattctgggg tgccacaact cagactttcg gaacagaggc
2160atgaccgcct tactgaaggt ttctagttgt gacaagaaca ctggtgatta ttacgaggac
2220agttatgaag atatttcagc atacttgctg agtaaaaaca atgccattga acctaggagc
2280tttgcccaga attcaagacc ccctagtgcg agcgctccaa agcctccggt cctgcgacgg
2340catcagaggg acataagcct tcctactttt cagccggagg aagacaaaat ggactatgat
2400gatatcttct caactgaaac gaagggagaa gattttgaca tttacggtga ggatgaaaat
2460caggaccctc gcagctttca gaagagaacc cgacactatt tcattgctgc ggtggagcag
2520ctctgggatt acgggatgag cgaatccccc cgggcgctaa gaaacagggc tcagaacgga
2580gaggtgcctc ggttcaagaa ggtggtcttc cgggaatttg ctgacggctc cttcacgcag
2640ccgtcgtacc gcggggaact caacaaacac ttggggctct tgggacccta catcagagcg
2700gaagttgaag acaacatcat ggtaactttc aaaaaccagg cgtctcgtcc ctattccttc
2760tactcgagcc ttatttctta tccggatgat caggagcaag gggcagaacc tcgacacaac
2820ttcgtccagc caaatgaaac cagaacttac ttttggaaag tgcagcatca catggcaccc
2880acagaagacg agtttgactg caaagcctgg gcctactttt ctgatgttga cctggaaaaa
2940gatgtgcact caggcttgat cggccccctt ctgatctgcc gcgccaacac cctgaacgct
3000gctcacggta gacaagtgac cgtgcaagaa tttgctctgt ttttcactat ttttgatgag
3060acaaagagct ggtacttcac tgaaaatgtg gaaaggaact gccgggcccc ctgccatctg
3120cagatggagg accccactct gaaagaaaac tatcgcttcc atgcaatcaa tggctatgtg
3180atggatacac tccctggctt agtaatggct cagaatcaaa ggatccgatg gtatctgctc
3240agcatgggca gcaatgaaaa tatccattcg attcatttta gcggacacgt gttcagtgta
3300cggaaaaagg aggagtataa aatggccgtg tacaatctct atccgggtgt ctttgagaca
3360gtggaaatgc taccgtccaa agttggaatt tggcgaatag aatgcctgat tggcgagcac
3420ctgcaagctg ggatgagcac gactttcctg gtgtacagca agaagtgtca gactcccctg
3480ggaatggctt ctggacacat tagagatttt cagattacag cttcaggaca atatggacag
3540tgggccccaa agctggccag acttcattat tccggatcaa tcaatgcctg gagcaccaag
3600gagccctttt cttggatcaa ggtggatctg ttggcaccaa tgattattca cggcatcaag
3660acccagggtg cccgtcagaa gttctccagc ctctacatct ctcagtttat catcatgtat
3720agtcttgatg ggaagaagtg gcagacttat cgaggaaatt ccactggaac cttaatggtc
3780ttctttggca atgtggattc atctgggata aaacacaata tttttaaccc tccaattatt
3840gctcgataca tccgtttgca cccaactcat tatagcattc gcagcactct tcgcatggag
3900ttgatgggct gtgatttaaa tagttgcagc atgccattgg gaatggagag taaagcaata
3960tcagatgcac agattactgc ttcatcctac tttaccaata tgtttgccac ctggtctcct
4020tcaaaagctc gacttcacct ccaagggagg agtaatgcct ggagacctca ggtgaataat
4080ccaaaagagt ggctgcaagt ggacttccag aagacaatga aagtcacagg agtaactact
4140cagggagtaa aatctctgct taccagcatg tatgtgaagg agttcctcat ctccagcagt
4200caagatggcc atcagtggac tctctttttt cagaatggca aagtaaaggt ttttcaggga
4260aatcaagact ccttcacacc tgtggtgaac tctctagacc caccgttact gactcgctac
4320cttcgaattc acccccagag ttgggtgcac cagattgccc tgaggatgga ggttctgggc
4380tgcgaggcac aggacctcta c
4401
24 1457 PRT Homo sapiens 24
Met Gln Ile Glu Leu Ser Thr Cys Phe Phe Leu
Cys Leu Leu Arg Phe1 5 10
15Cys Phe Ser Ala Thr Arg Arg Tyr Tyr Leu Gly Ala Val Glu Leu Ser
20 25 30Trp Asp Tyr Met Gln Ser Asp
Leu Gly Glu Leu Pro Val Asp Ala Arg 35 40
45Phe Pro Pro Arg Val Pro Lys Ser Phe Pro Phe Asn Thr Ser Val
Val 50 55 60Tyr Lys Lys Thr Leu Phe
Val Glu Phe Thr Val His Leu Phe Asn Ile65 70
75 80Ala Lys Pro Arg Pro Pro Trp Met Gly Leu Leu
Gly Pro Thr Ile Gln 85 90
95Ala Glu Val Tyr Asp Thr Val Val Ile Thr Leu Lys Asn Met Ala Ser
100 105 110His Pro Val Ser Leu His
Ala Val Gly Val Ser Tyr Trp Lys Ala Ser 115 120
125Glu Gly Ala Glu Tyr Asp Asp Gln Thr Ser Gln Arg Glu Lys
Glu Asp 130 135 140Asp Lys Val Phe Pro
Gly Gly Ser His Thr Tyr Val Trp Gln Val Leu145 150
155 160Lys Glu Asn Gly Pro Met Ala Ser Asp Pro
Leu Cys Leu Thr Tyr Ser 165 170
175Tyr Leu Ser His Val Asp Leu Val Lys Asp Leu Asn Ser Gly Leu Ile
180 185 190Gly Ala Leu Leu Val
Cys Arg Glu Gly Ser Leu Ala Lys Glu Lys Thr 195
200 205Gln Thr Leu His Lys Phe Ile Leu Leu Phe Ala Val
Phe Asp Glu Gly 210 215 220Lys Ser Trp
His Ser Glu Thr Lys Asn Ser Leu Met Gln Asp Arg Asp225
230 235 240Ala Ala Ser Ala Arg Ala Trp
Pro Lys Met His Thr Val Asn Gly Tyr 245
250 255Val Asn Arg Ser Leu Pro Gly Leu Ile Gly Cys His
Arg Lys Ser Val 260 265 270Tyr
Trp His Val Ile Gly Met Gly Thr Thr Pro Glu Val His Ser Ile 275
280 285Phe Leu Glu Gly His Thr Phe Leu Val
Arg Asn His Arg Gln Ala Ser 290 295
300Leu Glu Ile Ser Pro Ile Thr Phe Leu Thr Ala Gln Thr Leu Leu Met305
310 315 320Asp Leu Gly Gln
Phe Leu Leu Phe Cys His Ile Ser Ser His Gln His 325
330 335Asp Gly Met Glu Ala Tyr Val Lys Val Asp
Ser Cys Pro Glu Glu Pro 340 345
350Gln Leu Arg Met Lys Asn Asn Glu Glu Ala Glu Asp Tyr Asp Asp Asp
355 360 365Leu Thr Asp Ser Glu Met Asp
Val Val Arg Phe Asp Asp Asp Asn Ser 370 375
380Pro Ser Phe Ile Gln Ile Arg Ser Val Ala Lys Lys His Pro Lys
Thr385 390 395 400Trp Val
His Tyr Ile Ala Ala Glu Glu Glu Asp Trp Asp Tyr Ala Pro
405 410 415Leu Val Leu Ala Pro Asp Asp
Arg Ser Tyr Lys Ser Gln Tyr Leu Asn 420 425
430Asn Gly Pro Gln Arg Ile Gly Arg Lys Tyr Lys Lys Val Arg
Phe Met 435 440 445Ala Tyr Thr Asp
Glu Thr Phe Lys Thr Arg Glu Ala Ile Gln His Glu 450
455 460Ser Gly Ile Leu Gly Pro Leu Leu Tyr Gly Glu Val
Gly Asp Thr Leu465 470 475
480Leu Ile Ile Phe Lys Asn Gln Ala Ser Arg Pro Tyr Asn Ile Tyr Pro
485 490 495His Gly Ile Thr Asp
Val Arg Pro Leu Tyr Ser Arg Arg Leu Pro Lys 500
505 510Gly Val Lys His Leu Lys Asp Phe Pro Ile Leu Pro
Gly Glu Ile Phe 515 520 525Lys Tyr
Lys Trp Thr Val Thr Val Glu Asp Gly Pro Thr Lys Ser Asp 530
535 540Pro Arg Cys Leu Thr Arg Tyr Tyr Ser Ser Phe
Val Asn Met Glu Arg545 550 555
560Asp Leu Ala Ser Gly Leu Ile Gly Pro Leu Leu Ile Cys Tyr Lys Glu
565 570 575Ser Val Asp Gln
Arg Gly Asn Gln Ile Met Ser Asp Lys Arg Asn Val 580
585 590Ile Leu Phe Ser Val Phe Asp Glu Asn Arg Ser
Trp Tyr Leu Thr Glu 595 600 605Asn
Ile Gln Arg Phe Leu Pro Asn Pro Ala Gly Val Gln Leu Glu Asp 610
615 620Pro Glu Phe Gln Ala Ser Asn Ile Met His
Ser Ile Asn Gly Tyr Val625 630 635
640Phe Asp Ser Leu Gln Leu Ser Val Cys Leu His Glu Val Ala Tyr
Trp 645 650 655Tyr Ile Leu
Ser Ile Gly Ala Gln Thr Asp Phe Leu Ser Val Phe Phe 660
665 670Ser Gly Tyr Thr Phe Lys His Lys Met Val
Tyr Glu Asp Thr Leu Thr 675 680
685Leu Phe Pro Phe Ser Gly Glu Thr Val Phe Met Ser Met Glu Asn Pro 690
695 700Gly Leu Trp Ile Leu Gly Cys His
Asn Ser Asp Phe Arg Asn Arg Gly705 710
715 720Met Thr Ala Leu Leu Lys Val Ser Ser Cys Asp Lys
Asn Thr Gly Asp 725 730
735Tyr Tyr Glu Asp Ser Tyr Glu Asp Ile Ser Ala Tyr Leu Leu Ser Lys
740 745 750Asn Asn Ala Ile Glu Pro
Arg Ser Phe Ser Gln Asn Pro Pro Val Leu 755 760
765Lys Arg His Gln Arg Glu Ile Thr Arg Thr Thr Leu Gln Ser
Asp Gln 770 775 780Glu Glu Ile Asp Tyr
Asp Asp Thr Ile Ser Val Glu Met Lys Lys Glu785 790
795 800Asp Phe Asp Ile Tyr Asp Glu Asp Glu Asn
Gln Ser Pro Arg Ser Phe 805 810
815Gln Lys Lys Thr Arg His Tyr Phe Ile Ala Ala Val Glu Arg Leu Trp
820 825 830Asp Tyr Gly Met Ser
Ser Ser Pro His Val Leu Arg Asn Arg Ala Gln 835
840 845Ser Gly Ser Val Pro Gln Phe Lys Lys Val Val Phe
Gln Glu Phe Thr 850 855 860Asp Gly Ser
Phe Thr Gln Pro Leu Tyr Arg Gly Glu Leu Asn Glu His865
870 875 880Leu Gly Leu Leu Gly Pro Tyr
Ile Arg Ala Glu Val Glu Asp Asn Ile 885
890 895Met Val Thr Phe Arg Asn Gln Ala Ser Arg Pro Tyr
Ser Phe Tyr Ser 900 905 910Ser
Leu Ile Ser Tyr Glu Glu Asp Gln Arg Gln Gly Ala Glu Pro Arg 915
920 925Lys Asn Phe Val Lys Pro Asn Glu Thr
Lys Thr Tyr Phe Trp Lys Val 930 935
940Gln His His Met Ala Pro Thr Lys Asp Glu Phe Asp Cys Lys Ala Trp945
950 955 960Ala Tyr Phe Ser
Asp Val Asp Leu Glu Lys Asp Val His Ser Gly Leu 965
970 975Ile Gly Pro Leu Leu Val Cys His Thr Asn
Thr Leu Asn Pro Ala His 980 985
990Gly Arg Gln Val Thr Val Gln Glu Phe Ala Leu Phe Phe Thr Ile Phe
995 1000 1005Asp Glu Thr Lys Ser Trp
Tyr Phe Thr Glu Asn Met Glu Arg Asn 1010 1015
1020Cys Arg Ala Pro Cys Asn Ile Gln Met Glu Asp Pro Thr Phe
Lys 1025 1030 1035Glu Asn Tyr Arg Phe
His Ala Ile Asn Gly Tyr Ile Met Asp Thr 1040 1045
1050Leu Pro Gly Leu Val Met Ala Gln Asp Gln Arg Ile Arg
Trp Tyr 1055 1060 1065Leu Leu Ser Met
Gly Ser Asn Glu Asn Ile His Ser Ile His Phe 1070
1075 1080Ser Gly His Val Phe Thr Val Arg Lys Lys Glu
Glu Tyr Lys Met 1085 1090 1095Ala Leu
Tyr Asn Leu Tyr Pro Gly Val Phe Glu Thr Val Glu Met 1100
1105 1110Leu Pro Ser Lys Ala Gly Ile Trp Arg Val
Glu Cys Leu Ile Gly 1115 1120 1125Glu
His Leu His Ala Gly Met Ser Thr Leu Phe Leu Val Tyr Ser 1130
1135 1140Asn Lys Cys Gln Thr Pro Leu Gly Met
Ala Ser Gly His Ile Arg 1145 1150
1155Asp Phe Gln Ile Thr Ala Ser Gly Gln Tyr Gly Gln Trp Ala Pro
1160 1165 1170Lys Leu Ala Arg Leu His
Tyr Ser Gly Ser Ile Asn Ala Trp Ser 1175 1180
1185Thr Lys Glu Pro Phe Ser Trp Ile Lys Val Asp Leu Leu Ala
Pro 1190 1195 1200Met Ile Ile His Gly
Ile Lys Thr Gln Gly Ala Arg Gln Lys Phe 1205 1210
1215Ser Ser Leu Tyr Ile Ser Gln Phe Ile Ile Met Tyr Ser
Leu Asp 1220 1225 1230Gly Lys Lys Trp
Gln Thr Tyr Arg Gly Asn Ser Thr Gly Thr Leu 1235
1240 1245Met Val Phe Phe Gly Asn Val Asp Ser Ser Gly
Ile Lys His Asn 1250 1255 1260Ile Phe
Asn Pro Pro Ile Ile Ala Arg Tyr Ile Arg Leu His Pro 1265
1270 1275Thr His Tyr Ser Ile Arg Ser Thr Leu Arg
Met Glu Leu Met Gly 1280 1285 1290Cys
Asp Leu Asn Ser Cys Ser Met Pro Leu Gly Met Glu Ser Lys 1295
1300 1305Ala Ile Ser Asp Ala Gln Ile Thr Ala
Ser Ser Tyr Phe Thr Asn 1310 1315
1320Met Phe Ala Thr Trp Ser Pro Ser Lys Ala Arg Leu His Leu Gln
1325 1330 1335Gly Arg Ser Asn Ala Trp
Arg Pro Gln Val Asn Asn Pro Lys Glu 1340 1345
1350Trp Leu Gln Val Asp Phe Gln Lys Thr Met Lys Val Thr Gly
Val 1355 1360 1365Thr Thr Gln Gly Val
Lys Ser Leu Leu Thr Ser Met Tyr Val Lys 1370 1375
1380Glu Phe Leu Ile Ser Ser Ser Gln Asp Gly His Gln Trp
Thr Leu 1385 1390 1395Phe Phe Gln Asn
Gly Lys Val Lys Val Phe Gln Gly Asn Gln Asp 1400
1405 1410Ser Phe Thr Pro Val Val Asn Ser Leu Asp Pro
Pro Leu Leu Thr 1415 1420 1425Arg Tyr
Leu Arg Ile His Pro Gln Ser Trp Val His Gln Ile Ala 1430
1435 1440Leu Arg Met Glu Val Leu Gly Cys Glu Ala
Gln Asp Leu Tyr 1445 1450
1455
25 4371 DNA Homo sapiens 25
atgcaaatag agctctccac ctgcttcttt ctgtgccttt
tgcgattctg ctttagtgcc 60accagaagat actacctggg tgcagtggaa ctgtcatggg
actatatgca aagtgatctc 120ggtgagctgc ctgtggacgc aagatttcct cctagagtgc
caaaatcttt tccattcaac 180acctcagtcg tgtacaaaaa gactctgttt gtagaattca
cggttcacct tttcaacatc 240gctaagccaa ggccaccctg gatgggtctg ctaggtccta
ccatccaggc tgaggtttat 300gatacagtgg tcattacact taagaacatg gcttcccatc
ctgtcagtct tcatgctgtt 360ggtgtatcct actggaaagc ttctgaggga gctgaatatg
atgatcagac cagtcaaagg 420gagaaagaag atgataaagt cttccctggt ggaagccata
catatgtctg gcaggtcctg 480aaagagaatg gtccaatggc ctctgaccca ctgtgcctta
cctactcata tctttctcat 540gtggacctgg taaaagactt gaattcaggc ctcattggag
ccctactagt atgtagagaa 600gggagtctgg ccaaggaaaa gacacagacc ttgcacaaat
ttatactact ttttgctgta 660tttgatgaag ggaaaagttg gcactcagaa acaaagaact
ccttgatgca ggatagggat 720gctgcatctg ctcgggcctg gcctaaaatg cacacagtca
atggttatgt aaacaggtct 780ctgccaggtc tgattggatg ccacaggaaa tcagtctatt
ggcatgtgat tggaatgggc 840accactcctg aagtgcactc aatattcctc gaaggtcaca
catttcttgt gaggaaccat 900cgccaggcgt ccttggaaat ctcgccaata actttcctta
ctgctcaaac actcttgatg 960gaccttggac agtttctact gttttgtcat atctcttccc
accaacatga tggcatggaa 1020gcttatgtca aagtagacag ctgtccagag gaaccccaac
tacgaatgaa aaataatgaa 1080gaagcggaag actatgatga tgatcttact gattctgaaa
tggatgtggt caggtttgat 1140gatgacaact ctccttcctt tatccaaatt cgctcagttg
ccaagaagca tcctaaaact 1200tgggtacatt acattgctgc tgaagaggag gactgggact
atgctccctt agtcctcgcc 1260cccgatgaca gaagttataa aagtcaatat ttgaacaatg
gccctcagcg gattggtagg 1320aagtacaaaa aagtccgatt tatggcatac acagatgaaa
cctttaagac gcgtgaagct 1380attcagcatg aatcaggaat cttgggacct ttactttatg
gggaagttgg agacacactg 1440ttgattatat ttaagaatca agcaagcaga ccatataaca
tctaccctca cggaatcact 1500gatgtccgtc ctttgtattc aaggagatta ccaaaaggtg
taaaacattt gaaggatttt 1560ccaattctgc caggagaaat attcaaatat aaatggacag
tgactgtaga agatgggcca 1620actaaatcag atccgcggtg cctgacccgc tattactcta
gtttcgttaa tatggagaga 1680gatctagctt caggactcat tggccctctc ctcatctgct
acaaagaatc tgtagatcaa 1740agaggaaacc agataatgtc agacaagagg aatgtcatcc
tgttttctgt atttgatgag 1800aaccgaagct ggtacctcac agagaatata caacgctttc
tccccaatcc agctggagtg 1860cagcttgagg atccagagtt ccaagcctcc aacatcatgc
acagcatcaa tggctatgtt 1920tttgatagtt tgcagttgtc agtttgtttg catgaggtgg
catactggta cattctaagc 1980attggagcac agactgactt cctttctgtc ttcttctctg
gatatacctt caaacacaaa 2040atggtctatg aagacacact caccctattc ccattctcag
gagaaactgt cttcatgtcg 2100atggaaaacc caggtctatg gattctgggg tgccacaact
cagactttcg gaacagaggc 2160atgaccgcct tactgaaggt ttctagttgt gacaagaaca
ctggtgatta ttacgaggac 2220agttatgaag atatttcagc atacttgctg agtaaaaaca
atgccattga acctaggagc 2280ttctctcaga atccaccagt cttgaaacgc catcaacggg
aaataactcg tactactctt 2340cagtcagatc aagaggaaat tgactatgat gataccatat
cagttgaaat gaagaaggaa 2400gattttgaca tttatgatga ggatgaaaat cagagccccc
gcagctttca aaagaaaaca 2460cgacactatt ttattgctgc agtggagagg ctctgggatt
atgggatgag tagctcccca 2520catgttctaa gaaacagggc tcagagtggc agtgtccctc
agttcaagaa agttgttttc 2580caggaattta ctgatggctc ctttactcag cccttatacc
gtggagaact aaatgaacat 2640ttgggactcc tggggccata tataagagca gaagttgaag
ataatatcat ggtaactttc 2700agaaatcagg cctctcgtcc ctattccttc tattctagcc
ttatttctta tgaggaagat 2760cagaggcaag gagcagaacc tagaaaaaac tttgtcaagc
ctaatgaaac caaaacttac 2820ttttggaaag tgcaacatca tatggcaccc actaaagatg
agtttgactg caaagcctgg 2880gcttatttct ctgatgttga cctggaaaaa gatgtgcact
caggcctgat tggacccctt 2940ctggtctgcc acactaacac actgaaccct gctcatggga
gacaagtgac agtacaggaa 3000tttgctctgt ttttcaccat ctttgatgag accaaaagct
ggtacttcac tgaaaatatg 3060gaaagaaact gcagggctcc ctgcaatatc cagatggaag
atcccacttt taaagagaat 3120tatcgcttcc atgcaatcaa tggctacata atggatacac
tacctggctt agtaatggct 3180caggatcaaa ggattcgatg gtatctgctc agcatgggca
gcaatgaaaa catccattct 3240attcatttca gtggacatgt gttcactgta cgaaaaaaag
aggagtataa aatggcactg 3300tacaatctct atccaggtgt ttttgagaca gtggaaatgt
taccatccaa agctggaatt 3360tggcgggtgg aatgccttat tggcgagcat ctacatgctg
ggatgagcac actttttctg 3420gtgtacagca ataagtgtca gactcccctg ggaatggctt
ctggacacat tagagatttt 3480cagattacag cttcaggaca atatggacag tgggccccaa
agctggccag acttcattat 3540tccggatcaa tcaatgcctg gagcaccaag gagccctttt
cttggatcaa ggtggatctg 3600ttggcaccaa tgattattca cggcatcaag acccagggtg
cccgtcagaa gttctccagc 3660ctctacatct ctcagtttat catcatgtat agtcttgatg
ggaagaagtg gcagacttat 3720cgaggaaatt ccactggaac cttaatggtc ttctttggca
atgtggattc atctgggata 3780aaacacaata tttttaaccc tccaattatt gctcgataca
tccgtttgca cccaactcat 3840tatagcattc gcagcactct tcgcatggag ttgatgggct
gtgatttaaa tagttgcagc 3900atgccattgg gaatggagag taaagcaata tcagatgcac
agattactgc ttcatcctac 3960tttaccaata tgtttgccac ctggtctcct tcaaaagctc
gacttcacct ccaagggagg 4020agtaatgcct ggagacctca ggtgaataat ccaaaagagt
ggctgcaagt ggacttccag 4080aagacaatga aagtcacagg agtaactact cagggagtaa
aatctctgct taccagcatg 4140tatgtgaagg agttcctcat ctccagcagt caagatggcc
atcagtggac tctctttttt 4200cagaatggca aagtaaaggt ttttcaggga aatcaagact
ccttcacacc tgtggtgaac 4260tctctagacc caccgttact gactcgctac cttcgaattc
acccccagag ttgggtgcac 4320cagattgccc tgaggatgga ggttctgggc tgcgaggcac
aggacctcta c 4371
26 1467 PRT Artificial Sequence synthetic HP63/OL aa 26
Met Gln Leu Glu Leu Ser Thr Cys Val Phe Leu Cys Leu Leu Pro Leu1
5 10 15Gly Phe Ser Ala Ile
Arg Arg Tyr Tyr Leu Gly Ala Val Glu Leu Ser 20
25 30Trp Asp Tyr Arg Gln Ser Glu Leu Leu Arg Glu Leu
His Val Asp Thr 35 40 45Arg Phe
Pro Ala Thr Ala Pro Gly Ala Leu Pro Leu Gly Pro Ser Val 50
55 60Leu Tyr Lys Lys Thr Val Phe Val Glu Phe Thr
Asp Gln Leu Phe Ser65 70 75
80Val Ala Arg Pro Arg Pro Pro Trp Met Gly Leu Leu Gly Pro Thr Ile
85 90 95Gln Ala Glu Val Tyr
Asp Thr Val Val Val Thr Leu Lys Asn Met Ala 100
105 110Ser His Pro Val Ser Leu His Ala Val Gly Val Ser
Phe Trp Lys Ser 115 120 125Ser Glu
Gly Ala Glu Tyr Glu Asp His Thr Ser Gln Arg Glu Lys Glu 130
135 140Asp Asp Lys Val Leu Pro Gly Lys Ser Gln Thr
Tyr Val Trp Gln Val145 150 155
160Leu Lys Glu Asn Gly Pro Thr Ala Ser Asp Pro Pro Cys Leu Thr Tyr
165 170 175Ser Tyr Leu Ser
His Val Asp Leu Val Lys Asp Leu Asn Ser Gly Leu 180
185 190Ile Gly Ala Leu Leu Val Cys Arg Glu Gly Ser
Leu Thr Arg Glu Arg 195 200 205Thr
Gln Asn Leu His Glu Phe Val Leu Leu Phe Ala Val Phe Asp Glu 210
215 220Gly Lys Ser Trp His Ser Ala Arg Asn Asp
Ser Trp Thr Arg Ala Met225 230 235
240Asp Pro Ala Pro Ala Arg Ala Gln Pro Ala Met His Thr Val Asn
Gly 245 250 255Tyr Val Asn
Arg Ser Leu Pro Gly Leu Ile Gly Cys His Lys Lys Ser 260
265 270Val Tyr Trp His Val Ile Gly Met Gly Thr
Ser Pro Glu Val His Ser 275 280
285Ile Phe Leu Glu Gly His Thr Phe Leu Val Arg His His Arg Gln Ala 290
295 300Ser Leu Glu Ile Ser Pro Leu Thr
Phe Leu Thr Ala Gln Thr Phe Leu305 310
315 320Met Asp Leu Gly Gln Phe Leu Leu Phe Cys His Ile
Ser Ser His His 325 330
335His Gly Gly Met Glu Ala His Val Arg Val Glu Ser Cys Ala Glu Glu
340 345 350Pro Gln Leu Arg Arg Lys
Ala Asp Glu Glu Glu Asp Tyr Asp Asp Asn 355 360
365Leu Tyr Asp Ser Asp Met Asp Val Val Arg Leu Asp Gly Asp
Asp Val 370 375 380Ser Pro Phe Ile Gln
Ile Arg Ser Val Ala Lys Lys His Pro Lys Thr385 390
395 400Trp Val His Tyr Ile Ala Ala Glu Glu Glu
Asp Trp Asp Tyr Ala Pro 405 410
415Leu Val Leu Ala Pro Asp Asp Arg Ser Tyr Lys Ser Gln Tyr Leu Asn
420 425 430Asn Gly Pro Gln Arg
Ile Gly Arg Lys Tyr Lys Lys Val Arg Phe Met 435
440 445Ala Tyr Thr Asp Glu Thr Phe Lys Thr Arg Glu Ala
Ile Gln His Glu 450 455 460Ser Gly Ile
Leu Gly Pro Leu Leu Tyr Gly Glu Val Gly Asp Thr Leu465
470 475 480Leu Ile Ile Phe Lys Asn Gln
Ala Ser Arg Pro Tyr Asn Ile Tyr Pro 485
490 495His Gly Ile Thr Asp Val Arg Pro Leu Tyr Ser Arg
Arg Leu Pro Lys 500 505 510Gly
Val Lys His Leu Lys Asp Phe Pro Ile Leu Pro Gly Glu Ile Phe 515
520 525Lys Tyr Lys Trp Thr Val Thr Val Glu
Asp Gly Pro Thr Lys Ser Asp 530 535
540Pro Arg Cys Leu Thr Arg Tyr Tyr Ser Ser Phe Val Asn Met Glu Arg545
550 555 560Asp Leu Ala Ser
Gly Leu Ile Gly Pro Leu Leu Ile Cys Tyr Lys Glu 565
570 575Ser Val Asp Gln Arg Gly Asn Gln Ile Met
Ser Asp Lys Arg Asn Val 580 585
590Ile Leu Phe Ser Val Phe Asp Glu Asn Arg Ser Trp Tyr Leu Thr Glu
595 600 605Asn Ile Gln Arg Phe Leu Pro
Asn Pro Ala Gly Val Gln Leu Glu Asp 610 615
620Pro Glu Phe Gln Ala Ser Asn Ile Met His Ser Ile Asn Gly Tyr
Val625 630 635 640Phe Asp
Ser Leu Gln Leu Ser Val Cys Leu His Glu Val Ala Tyr Trp
645 650 655Tyr Ile Leu Ser Ile Gly Ala
Gln Thr Asp Phe Leu Ser Val Phe Phe 660 665
670Ser Gly Tyr Thr Phe Lys His Lys Met Val Tyr Glu Asp Thr
Leu Thr 675 680 685Leu Phe Pro Phe
Ser Gly Glu Thr Val Phe Met Ser Met Glu Asn Pro 690
695 700Gly Leu Trp Ile Leu Gly Cys His Asn Ser Asp Phe
Arg Asn Arg Gly705 710 715
720Met Thr Ala Leu Leu Lys Val Ser Ser Cys Asp Lys Asn Thr Gly Asp
725 730 735Tyr Tyr Glu Asp Ser
Tyr Glu Asp Ile Ser Ala Tyr Leu Leu Ser Lys 740
745 750Asn Asn Ala Ile Glu Pro Arg Ser Phe Ser Gln Asn
Ser Arg His Pro 755 760 765Ser Thr
Arg Ser Gln Asn Pro Pro Val Leu Lys Arg His Gln Arg Glu 770
775 780Ile Thr Arg Thr Thr Leu Gln Ser Asp Gln Glu
Glu Ile Asp Tyr Asp785 790 795
800Asp Thr Ile Ser Val Glu Met Lys Lys Glu Asp Phe Asp Ile Tyr Asp
805 810 815Glu Asp Glu Asn
Gln Ser Pro Arg Ser Phe Gln Lys Arg Thr Arg His 820
825 830Tyr Phe Ile Ala Ala Val Glu Gln Leu Trp Asp
Tyr Gly Met Ser Glu 835 840 845Ser
Pro Arg Ala Leu Arg Asn Arg Ala Gln Asn Gly Glu Val Pro Arg 850
855 860Phe Lys Lys Val Val Phe Arg Glu Phe Ala
Asp Gly Ser Phe Thr Gln865 870 875
880Pro Ser Tyr Arg Gly Glu Leu Asn Lys His Leu Gly Leu Leu Gly
Pro 885 890 895Tyr Ile Arg
Ala Glu Val Glu Asp Asn Ile Met Val Thr Phe Lys Asn 900
905 910Gln Ala Ser Arg Pro Tyr Ser Phe Tyr Ser
Ser Leu Ile Ser Tyr Pro 915 920
925Asp Asp Gln Glu Gln Gly Ala Glu Pro Arg Lys Asn Phe Val Lys Pro 930
935 940Asn Glu Thr Lys Thr Tyr Phe Trp
Lys Val Gln His His Met Ala Pro945 950
955 960Thr Glu Asp Glu Phe Asp Cys Lys Ala Trp Ala Tyr
Phe Ser Asp Val 965 970
975Asp Leu Glu Lys Asp Val His Ser Gly Leu Ile Gly Pro Leu Leu Ile
980 985 990Cys Arg Ala Asn Thr Leu
Asn Ala Ala His Gly Arg Gln Val Thr Val 995 1000
1005Gln Glu Phe Ala Leu Phe Phe Thr Ile Phe Asp Glu
Thr Lys Ser 1010 1015 1020Trp Tyr Phe
Thr Glu Asn Val Glu Arg Asn Cys Arg Ala Pro Cys 1025
1030 1035His Leu Gln Met Glu Asp Pro Thr Leu Lys Glu
Asn Tyr Arg Phe 1040 1045 1050His Ala
Ile Asn Gly Tyr Val Met Asp Thr Leu Pro Gly Leu Val 1055
1060 1065Met Ala Gln Asn Gln Arg Ile Arg Trp Tyr
Leu Leu Ser Met Gly 1070 1075 1080Ser
Asn Glu Asn Ile His Ser Ile His Phe Ser Gly His Val Phe 1085
1090 1095Ser Val Arg Lys Lys Glu Glu Tyr Lys
Met Ala Val Tyr Asn Leu 1100 1105
1110Tyr Pro Gly Val Phe Glu Thr Val Glu Met Leu Pro Ser Lys Val
1115 1120 1125Gly Ile Trp Arg Asn Arg
Cys Leu Ile Gly Glu His Leu Gln Ala 1130 1135
1140Gly Met Ser Thr Thr Phe Leu Val Tyr Ser Lys Lys Cys Gln
Thr 1145 1150 1155Pro Leu Gly Met Ala
Ser Gly His Ile Arg Asp Phe Gln Ile Thr 1160 1165
1170Ala Ser Gly Gln Tyr Gly Gln Trp Ala Pro Lys Leu Ala
Arg Leu 1175 1180 1185His Tyr Ser Gly
Ser Ile Asn Ala Trp Ser Thr Lys Glu Pro Phe 1190
1195 1200Ser Trp Ile Lys Val Asp Leu Leu Ala Pro Met
Ile Ile His Gly 1205 1210 1215Ile Lys
Thr Gln Gly Ala Arg Gln Lys Phe Ser Ser Leu Tyr Ile 1220
1225 1230Ser Gln Phe Ile Ile Met Tyr Ser Leu Asp
Gly Lys Lys Trp Gln 1235 1240 1245Thr
Tyr Arg Gly Asn Ser Thr Gly Thr Leu Met Val Phe Phe Gly 1250
1255 1260Asn Val Asp Ser Ser Gly Ile Lys His
Asn Ile Phe Asn Pro Pro 1265 1270
1275Ile Ile Ala Arg Tyr Ile Arg Leu His Pro Thr His Tyr Ser Ile
1280 1285 1290Arg Ser Thr Leu Arg Met
Glu Leu Met Gly Cys Asp Leu Asn Ser 1295 1300
1305Cys Ser Met Pro Leu Gly Met Glu Ser Lys Ala Ile Ser Asp
Ala 1310 1315 1320Gln Ile Thr Ala Ser
Ser Tyr Phe Thr Asn Met Phe Ala Thr Trp 1325 1330
1335Ser Pro Ser Lys Ala Arg Leu His Leu Gln Gly Arg Ser
Asn Ala 1340 1345 1350Trp Arg Pro Gln
Val Asn Asn Pro Lys Glu Trp Leu Gln Val Asp 1355
1360 1365Phe Gln Lys Thr Met Lys Val Thr Gly Val Thr
Thr Gln Gly Val 1370 1375 1380Lys Ser
Leu Leu Thr Ser Met Tyr Val Lys Glu Phe Leu Ile Ser 1385
1390 1395Ser Ser Gln Asp Gly His Gln Trp Thr Leu
Phe Phe Gln Asn Gly 1400 1405 1410Lys
Val Lys Val Phe Gln Gly Asn Gln Asp Ser Phe Thr Pro Val 1415
1420 1425Val Asn Ser Leu Asp Pro Pro Leu Leu
Thr Arg Tyr Leu Arg Ile 1430 1435
1440His Pro Gln Ser Trp Val His Gln Ile Ala Leu Arg Met Glu Val
1445 1450 1455Leu Gly Cys Glu Ala Gln
Asp Leu Tyr 1460 1465
27 4398 DNA Artificial Sequence synthetic HP63/OL NT 27
atgcagctag agctctccac ctgtgtcttt
ctgtgtctct tgccactcgg ctttagtgcc 60atcaggagat actacctggg cgcagtggaa
ctgtcctggg actaccggca aagtgaactc 120ctccgtgagc tgcacgtgga caccagattt
cctgctacag cgccaggagc tcttccgttg 180ggcccgtcag tcctgtacaa aaagactgtg
ttcgtagagt tcacggatca acttttcagc 240gttgccaggc ccaggccacc atggatgggt
ctgctgggtc ctaccatcca ggctgaggtt 300tacgacacgg tggtcgttac cctgaagaac
atggcttctc atcccgttag tcttcacgct 360gtcggcgtct ccttctggaa atcttccgaa
ggcgctgaat atgaggatca caccagccaa 420agggagaagg aagacgataa agtccttccc
ggtaaaagcc aaacctacgt ctggcaggtc 480ctgaaagaaa atggtccaac agcctctgac
ccaccatgtc ttacctactc atacctgtct 540cacgtggacc tggtgaaaga cctgaattcg
ggcctcattg gagccctgct ggtttgtaga 600gaagggagtc tgaccagaga aaggacccag
aacctgcacg aatttgtact actttttgct 660gtctttgatg aagggaaaag ttggcactca
gcaagaaatg actcctggac acgggccatg 720gatcccgcac ctgccagggc ccagcctgca
atgcacacag tcaatggcta tgtcaacagg 780tctctgccag gtctgatcgg atgtcataag
aaatcagtct actggcacgt gattggaatg 840ggcaccagcc cggaagtgca ctccattttt
cttgaaggcc acacgtttct cgtgaggcac 900catcgccagg cttccttgga gatctcgcca
ctaactttcc tcactgctca gacattcctg 960atggaccttg gccagttcct actgttttgt
catatctctt cccaccacca tggtggcatg 1020gaggctcacg tcagagtaga aagctgcgcc
gaggagcccc agctgcggag gaaagctgat 1080gaagaggaag attatgatga caatttgtac
gactcggaca tggacgtggt ccggctcgat 1140ggtgacgacg tgtctccctt tatccaaatc
cgctcagttg ccaagaagca tcctaaaact 1200tgggtacatt acattgctgc tgaagaggag
gactgggact atgctccctt agtcctcgcc 1260cccgatgaca gaagttataa aagtcaatat
ttgaacaatg gccctcagcg gattggtagg 1320aagtacaaaa aagtccgatt tatggcatac
acagatgaaa cctttaagac tcgtgaagct 1380attcagcatg aatcaggaat cttgggacct
ttactttatg gggaagttgg agacacactg 1440ttgattatat ttaagaatca agcaagcaga
ccatataaca tctaccctca cggaatcact 1500gatgtccgtc ctttgtattc aaggagatta
ccaaaaggtg taaaacattt gaaggatttt 1560ccaattctgc caggagaaat attcaaatat
aaatggacag tgactgtaga agatgggcca 1620actaaatcag atcctcggtg cctgacccgc
tattactcta gtttcgttaa tatggagaga 1680gatctagctt caggactcat tggccctctc
ctcatctgct acaaagaatc tgtagatcaa 1740agaggaaacc agataatgtc agacaagagg
aatgtcatcc tgttttctgt atttgatgag 1800aaccgaagct ggtacctcac agagaatata
caacgctttc tccccaatcc agctggagtg 1860cagcttgagg atccagagtt ccaagcctcc
aacatcatgc acagcatcaa tggctatgtt 1920tttgatagtt tgcagttgtc agtttgtttg
catgaggtgg catactggta cattctaagc 1980attggagcac agactgactt cctttctgtc
ttcttctctg gatatacctt caaacacaaa 2040atggtctatg aagacacact caccctattc
ccattctcag gagaaactgt cttcatgtcg 2100atggaaaacc caggtctatg gattctgggg
tgccacaact cagactttcg gaacagaggc 2160atgaccgcct tactgaaggt ttctagttgt
gacaagaaca ctggtgatta ttacgaggac 2220agttatgaag atatttcagc atacttgctg
agtaaaaaca atgccattga acctaggagc 2280ttctcccaga attcaagaca ccctagcact
aggtctcaaa acccaccagt cttgaaacgc 2340catcaacggg aaataactcg tactactctt
cagtcagatc aagaggaaat tgactatgat 2400gataccatat cagttgaaat gaagaaggaa
gattttgaca tttatgatga ggatgaaaat 2460cagagccccc gcagctttca aaagagaacc
cgacactatt tcattgctgc ggtggagcag 2520ctctgggatt acgggatgag cgaatccccc
cgggcgctaa gaaacagggc tcagaacgga 2580gaggtgcctc ggttcaagaa ggtggtcttc
cgggaatttg ctgacggctc cttcacgcag 2640ccgtcgtacc gcggggaact caacaaacac
ttggggctct tgggacccta catcagagcg 2700gaagttgaag acaacatcat ggtaactttc
aaaaaccagg cgtctcgtcc ctattccttc 2760tactcgagcc ttatttctta tccggatgat
caggagcaag gggcagaacc tcgaaaaaac 2820tttgtcaagc ctaatgaaac caaaacttac
ttttggaaag tgcagcatca catggcaccc 2880acagaagacg agtttgactg caaagcctgg
gcctactttt ctgatgttga cctggaaaaa 2940gatgtgcact caggcttgat cggccccctt
ctgatctgcc gcgccaacac cctgaacgct 3000gctcacggta gacaagtgac cgtgcaagaa
tttgctctgt ttttcactat ttttgatgag 3060acaaagagct ggtacttcac tgaaaatgtg
gaaaggaact gccgggcccc ctgccatctg 3120cagatggagg accccactct gaaagaaaac
tatcgcttcc atgcaatcaa tggctatgtg 3180atggatacac tccctggctt agtaatggct
cagaatcaaa ggatccgatg gtatctgctc 3240agcatgggca gcaatgaaaa tatccattcg
attcatttta gcggacacgt gttcagtgta 3300cggaaaaagg aggagtataa aatggccgtg
tacaatctct atccgggtgt ctttgagaca 3360gtggaaatgc taccgtccaa agttggaatt
tggcggaata gatgcctgat tggcgagcac 3420ctgcaagctg ggatgagcac gactttcctg
gtgtacagca agaagtgtca gactcccctg 3480ggaatggctt ctggacacat tagagatttt
cagattacag cttcaggaca atatggacag 3540tgggccccaa agctggccag acttcattat
tccggatcaa tcaatgcctg gagcaccaag 3600gagccctttt cttggatcaa ggtggatctg
ttggcaccaa tgattattca cggcatcaag 3660acccagggtg cccgtcagaa gttctccagc
ctctacatct ctcagtttat catcatgtat 3720agtcttgatg ggaagaagtg gcagacttat
cgaggaaatt ccactggaac cttaatggtc 3780ttctttggca atgtggattc atctgggata
aaacacaata tttttaaccc tccaattatt 3840gctcgataca tccgtttgca cccaactcat
tatagcattc gcagcactct tcgcatggag 3900ttgatgggct gtgatttaaa tagttgcagc
atgccattgg gaatggagag taaagcaata 3960tcagatgcac agattactgc ttcatcctac
tttaccaata tgtttgccac ctggtctcct 4020tcaaaagctc gacttcacct ccaagggagg
agtaatgcct ggagacctca ggtgaataat 4080ccaaaagagt ggctgcaagt ggacttccag
aagacaatga aagtcacagg agtaactact 4140cagggagtaa aatctctgct taccagcatg
tatgtgaagg agttcctcat ctccagcagt 4200caagatggcc atcagtggac tttttttcag
aatggcaaag taaaggtttt tcagggaaat 4260caagactcct tcacacctgt ggtgaactct
ctagacccac cgttactgac tcgctacctt 4320cgaattcacc cccagagttg ggtgcaccag
attgccctga ggatggaggt tctgggctgc 4380gaggcacagg acctctac
4398
28 5508 DNA Homo sapiens 28
atgattcctg
ccagatttgc cggggtgctg cttgctctgg ccctcatttt gccagggacc 60ctttgtgcag
aaggaactcg cggcaggtca tccacggccc gatgcagcct tttcggaagt 120gacttcgtca
acacctttga tgggagcatg tacagctttg cgggatactg cagttacctc 180ctggcagggg
gctgccagaa acgctccttc tcgattattg gggacttcca gaatggcaag 240agagtgagcc
tctccgtgta tcttggggaa ttttttgaca tccatttgtt tgtcaatggt 300accgtgacac
agggggacca aagagtctcc atgccctatg cctccaaagg gctgtatcta 360gaaactgagg
ctgggtacta caagctgtcc ggtgaggcct atggctttgt ggccaggatc 420gatggcagcg
gcaactttca agtcctgctg tcagacagat acttcaacaa gacctgcggg 480ctgtgtggca
actttaacat ctttgctgaa gatgacttta tgacccaaga agggaccttg 540acctcggacc
cttatgactt tgccaactca tgggctctga gcagtggaga acagtggtgt 600gaacgggcat
ctcctcccag cagctcatgc aacatctcct ctggggaaat gcagaagggc 660ctgtgggagc
agtgccagct tctgaagagc acctcggtgt ttgcccgctg ccaccctctg 720gtggaccccg
agccttttgt ggccctgtgt gagaagactt tgtgtgagtg tgctgggggg 780ctggagtgcg
cctgccctgc cctcctggag tacgcccgga cctgtgccca ggagggaatg 840gtgctgtacg
gctggaccga ccacagcgcg tgcagcccag tgtgccctgc tggtatggag 900tataggcagt
gtgtgtcccc ttgcgccagg acctgccaga gcctgcacat caatgaaatg 960tgtcaggagc
gatgcgtgga tggctgcagc tgccctgagg gacagctcct ggatgaaggc 1020ctctgcgtgg
agagcaccga gtgtccctgc gtgcattccg gaaagcgcta ccctcccggc 1080acctccctct
ctcgagactg caacacctgc atttgccgaa acagccagtg gatctgcagc 1140aatgaagaat
gtccagggga gtgccttgtc acaggtcaat cacacttcaa gagctttgac 1200aacagatact
tcaccttcag tgggatctgc cagtacctgc tggcccggga ttgccaggac 1260cactccttct
ccattgtcat tgagactgtc cagtgtgctg atgaccgcga cgctgtgtgc 1320acccgctccg
tcaccgtccg gctgcctggc ctgcacaaca gccttgtgaa actgaagcat 1380ggggcaggag
ttgccatgga tggccaggac gtccagctcc ccctcctgaa aggtgacctc 1440cgcatccagc
atacagtgac ggcctccgtg cgcctcagct acggggagga cctgcagatg 1500gactgggatg
gccgcgggag gctgctggtg aagctgtccc ccgtctatgc cgggaagacc 1560tgcggcctgt
gtgggaatta caatggcaac cagggcgacg acttccttac cccctctggg 1620ctggcggagc
cccgggtgga ggacttcggg aacgcctgga agctgcacgg ggactgccag 1680gacctgcaga
agcagcacag cgatccctgc gccctcaacc cgcgcatgac caggttctcc 1740gaggaggcgt
gcgcggtcct gacgtccccc acattcgagg cctgccatcg tgccgtcagc 1800ccgctgccct
acctgcggaa ctgccgctac gacgtgtgct cctgctcgga cggccgcgag 1860tgcctgtgcg
gcgccctggc cagctatgcc gcggcctgcg cggggagagg cgtgcgcgtc 1920gcgtggcgcg
agccaggccg ctgtgagctg aactgcccga aaggccaggt gtacctgcag 1980tgcgggaccc
cctgcaacct gacctgccgc tctctctctt acccggatga ggaatgcaat 2040gaggcctgcc
tggagggctg cttctgcccc ccagggctct acatggatga gaggggggac 2100tgcgtgccca
aggcccagtg cccctgttac tatgacggtg agatcttcca gccagaagac 2160atcttctcag
accatcacac catgtgctac tgtgaggatg gcttcatgca ctgtaccatg 2220agtggagtcc
ccggaagctt gctgcctgac gctgtcctca gcagtcccct gtctcatcgc 2280agcaaaagga
gcctatcctg tcggcccccc atggtcaagc tggtgtgtcc cgctgacaac 2340ctgcgggctg
aagggctcga gtgtaccaaa acgtgccaga actatgacct ggagtgcatg 2400agcatgggct
gtgtctctgg ctgcctctgc cccccgggca tggtccggca tgagaacaga 2460tgtgtggccc
tggaaaggtg tccctgcttc catcagggca aggagtatgc ccctggagaa 2520acagtgaaga
ttggctgcaa cacttgtgtc tgtcgggacc ggaagtggaa ctgcacagac 2580catgtgtgtg
atgccacgtg ctccacgatc ggcatggccc actacctcac cttcgacggg 2640ctcaaatacc
tgttccccgg ggagtgccag tacgttctgg tgcaggatta ctgcggcagt 2700aaccctggga
cctttcggat cctagtgggg aataagggat gcagccaccc ctcagtgaaa 2760tgcaagaaac
gggtcaccat cctggtggag ggaggagaga ttgagctgtt tgacggggag 2820gtgaatgtga
agaggcccat gaaggatgag actcactttg aggtggtgga gtctggccgg 2880tacatcattc
tgctgctggg caaagccctc tccgtggtct gggaccgcca cctgagcatc 2940tccgtggtcc
tgaagcagac ataccaggag aaagtgtgtg gcctgtgtgg gaattttgat 3000ggcatccaga
acaatgacct caccagcagc aacctccaag tggaggaaga ccctgtggac 3060tttgggaact
cctggaaagt gagctcgcag tgtgctgaca ccagaaaagt gcctctggac 3120tcatcccctg
ccacctgcca taacaacatc atgaagcaga cgatggtgga ttcctcctgt 3180agaatcctta
ccagtgacgt cttccaggac tgcaacaagc tggtggaccc cgagccatat 3240ctggatgtct
gcatttacga cacctgctcc tgtgagtcca ttggggactg cgcctgcttc 3300tgcgacacca
ttgctgccta tgcccacgtg tgtgcccagc atggcaaggt ggtgacctgg 3360aggacggcca
cattgtgccc ccagagctgc gaggagagga atctccggga gaacgggtat 3420gagtgtgagt
ggcgctataa cagctgtgca cctgcctgtc aagtcacgtg tcagcaccct 3480gagccactgg
cctgccctgt gcagtgtgtg gagggctgcc atgcccactg ccctccaggg 3540aaaatcctgg
atgagctttt gcagacctgc gttgaccctg aagactgtcc agtgtgtgag 3600gtggctggcc
ggcgttttgc ctcaggaaag aaagtcacct tgaatcccag tgaccctgag 3660cactgccaga
tttgccactg tgatgttgtc aacctcacct gtgaagcctg ccaggagccg 3720ggaggcctgg
tggtgcctcc cacagatgcc ccggtgagcc ccaccactct gtatgtggag 3780gacatctcgg
aaccgccgtt gcacgatttc tactgcagca ggctactgga cctggtcttc 3840ctgctggatg
gctcctccag gctgtccgag gctgagtttg aagtgctgaa ggcctttgtg 3900gtggacatga
tggagcggct gcgcatctcc cagaagtggg tccgcgtggc cgtggtggag 3960taccacgacg
gctcccacgc ctacatcggg ctcaaggacc ggaagcgacc gtcagagctg 4020cggcgcattg
ccagccaggt gaagtatgcg ggcagccagg tggcctccac cagcgaggtc 4080ttgaaataca
cactgttcca aatcttcagc aagatcgacc gccctgaagc ctcccgcatc 4140accctgctcc
tgatggccag ccaggagccc caacggatgt cccggaactt tgtccgctac 4200gtccagggcc
tgaagaagaa gaaggtcatt gtgatcccgg tgggcattgg gccccatgcc 4260aacctcaagc
agatccgcct catcgagaag caggcccctg agaacaaggc cttcgtgctg 4320agcagtgtgg
atgagctgga gcagcaaagg gacgagatcg ttagctacct ctgtgacctt 4380gcccctgaag
cccctcctcc tactctgccc cccgacatgg cacaagtcac tgtgggcccg 4440gggctcttgg
gggtttcgac cctggggccc aagaggaact ccatggttct ggatgtggcg 4500ttcgtcctgg
aaggatcgga caaaattggt gaagccgact tcaacaggag caaggagttc 4560atggaggagg
tgattcagcg gatggatgtg ggccaggaca gcatccacgt cacggtgctg 4620cagtactcct
acatggtgac tgtggagtac cccttcagcg aggcacagtc caaaggggac 4680atcctgcagc
gggtgcgaga gatccgctac cagggcggca acaggaccaa cactgggctg 4740gccctgcggt
acctctctga ccacagcttc ttggtcagcc agggtgaccg ggagcaggcg 4800cccaacctgg
tctacatggt caccggaaat cctgcctctg atgagatcaa gaggctgcct 4860ggagacatcc
aggtggtgcc cattggagtg ggccctaatg ccaacgtgca ggagctggag 4920aggattggct
ggcccaatgc ccctatcctc atccaggact ttgagacgct cccccgagag 4980gctcctgacc
tggtgctgca gaggtgctgc tccggagagg ggctgcagat ccccaccctc 5040tcccctgcac
ctcgtgatga gacgctccag gatggctgtg atactcactt ctgcaaggtc 5100aatgagagag
gagagtactt ctgggagaag agggtcacag gctgcccacc ctttgatgaa 5160cacaagtgtc
tggctgaggg aggtaaaatt atgaaaattc caggcacctg ctgtgacaca 5220tgtgaggagc
ctgagtgcaa cgacatcact gccaggctgc agtatgtcaa ggtgggaagc 5280tgtaagtctg
aagtagaggt ggatatccac tactgccagg gcaaatgtgc cagcaaagcc 5340atgtactcca
ttgacatcaa cgatgtgcag gaccagtgct cctgctgctc tccgacacgg 5400acggagccca
tgcaggtggc cctgcactgc accaatggct ctgttgtgta ccatgaggtt 5460ctcaatgcca
tggagtgcaa atgctccccc aggaagtgca gcaagtga
5508
29 6648 DNA Ovis aries 29
atgtttccca ccaggctcgc gaggctgctg cttgctgtgg
ccctcacttt gccaggggcc 60ctttgtggag aaggtgctcc tggcaagtca tcgatggccc
ggtgcagcct cttcggagct 120gacttcatca acacctttga tgagagcatg tacagctttg
cgggagactg tagttacctc 180ctggcagggg attgcaagac acactccttt tcaatcgtag
gggacttcca agctggtaga 240agagtgggtc tctctgtgta ccttggggaa tttttcgaca
tccatgtgtt tgtcaacggt 300actgtgctgc aggggggcca gcatgtctcc atgccctatg
ccaccagagg gctgtacctg 360gacaccgagg ctgggtacca caagctgtcc agcgagtctt
atggctttgt ggccaggatc 420gacagcagcg ggaacttcca aatcctgctg tcggacagac
acttcaacaa gacctgtggg 480ctgtgcggtg actttaacat cttcgccgaa gatgacttca
ggactcaaga aggaaccctg 540acctcagacc cctacgactt tgcaaactcc tgggccctga
gcagtgagga gcagcggtgt 600ccacgggtgt cccctcccag cagctcctgc aacatctcct
ctgagctgca gaagggcctg 660tgggagcagt gccagcttct gaagacggcc tccgtgttcg
cccgctgcca cgccctggtg 720gaccccgagc ctttcgtggc cctgtgtgag cggatgctgt
gcgcatgcgc ccaggggctg 780cgctgcccct gcccggtgct cctggagtac gcccgcgcct
gcgcccagca agggatgctg 840ctgtacggct gggcggacca cagctcctgc cgaccggact
gccccgcggg catggagtac 900aaggagtgtg tgtccccatg ccacaggacc tgccggagcc
tgagtatcac cgaagtgtgt 960cgggagcagt gtgtggatgg ctgcagctgc cctgagggac
agctcctgga tgaaggccgc 1020tgtgtggaaa gtgccgagtg tccctgtgtg catgctggaa
agccataccc tcctggcgcc 1080tccctctcgc gagactgcaa cacctgcatc tgccgaaaca
gccagtgggt ctgcagcaat 1140gaggactgtc caggagagtg tctcatcaca ggacaatccc
acttcaagag ctttgacgac 1200aggcacttca ccttcagcgg ggtctgccag tacctgctgg
cccaggactg ccaggaccac 1260tccttctccg tcatcataga gactgttcag tgtgctgacg
accctgatgc ggtctgcacc 1320cgctccgtca ccgtccgcct gcccagcccg caccacggcc
tcctgaagct gaagcacggg 1380ggtggagtcg ccctggatgg ccaggacgtc cagattcccc
tcctgcaagg tgacctccgg 1440atccagcaca ctgtgacggc ctccctgcac ctcatcttcg
gggaggacct gcagatagac 1500tgggacggtc gcgggaggct gctgctgaag ctgtccccgg
tctacgcggg gaggacctgc 1560gggctgtgcg ggaattacaa cggcaaccag agggatgact
tcctgacgcc cgcgggcctg 1620gtcgagcccc tggtggagca ctttggaaac tcctggaagc
tacgtgcaga ctgtgaggac 1680ttgcaggagc agcccagtga cccctgcagc ctcaacccgc
gcctgaccaa gttcgcagac 1740caggcctgcg ccatcctgac gtcccgcaag ttcgaggcct
gccacagcgc cgtgggcccg 1800ctgccctacc tgcgcaactg ccgcttcgac gtgtgcgcct
gctccgatgg cagagactgc 1860ctgtgcgacg cggtggccaa ctacgcggcg gcctgcgcca
ggaggggcgt gcacatcggg 1920tggcgggagc ccagcttctg tgcactgagc tgcccacacg
gccaggtgta ccagcagtgt 1980gggaccccct gcaacctcac ctgccgctca ctctcccacc
cggacgagga atgcactgag 2040gtctgtctgg agggctgctt ctgccctgct gggctcttcc
tggatgagac ggggtcctgt 2100gtgcccaagg cccagtgccc ctgttactat gacggcgaga
tcttccaacc tgaagacatc 2160ttctcggacc atcacaccat gtgctactgc gaagatggct
tcatgcactg ctccacgagc 2220ggagccccgg ggagcctgct gcctgaagca gtcctcagca
gccctctgtc ccaccgcagc 2280aaaaggagcc tgtcctgccg gccccccatg gtcaagctgg
tgtgccctgc tgacaacccg 2340agggccgaag ggctcgaatg caccaagacc tgccagaact
atgacctgga atgcgtgagc 2400acgggctgtg tgtccggctg cctctgcccc ccgggcatgg
tccggcatga gaacaggtgt 2460gtggccctgg aaaggtgccc ctgcttccac cagggcagag
agtacgcccc cggagacagg 2520gtgaaggccg actgcaacac ctgcgtctgt caggaccgga
agtggaactg tacggaccgc 2580gtgtgtgatg ctggctgctc tgccgtgggc ctggctcact
acttcacctt tgatgggctc 2640aagtacctgt tcccggggga gtgccagtac gtcctggtac
aggaccactg cggtagtaac 2700cctgggacct tccgggtcct ggtggggaat gaggggtgca
gcgtcccctc cctgaagtgc 2760aggaagcgca tcaccatcct ggtggaggga ggagagatcg
agctgtttga cggggaggtg 2820aacgtgaaga agcccatgaa ggatgagacg cacttcgagg
tggtggaatc tggccggtac 2880atcactgtgc tgctgggcaa ggccctctct gtggtctggg
acgggcacct ggccatctct 2940gtgttcctga agcggatgta ccaggagcgg gtatgcggcc
tgtgtgggaa tttcgatggc 3000gtccagaaca atgacctcac cagtagcagc ctccaagtgg
aggaagaccc tgtggacttt 3060gggaattcct ggaaagtgag cccgcattgc gctgacaccc
agaaagtgcc gctggactcg 3120gcccctgcca cctgccacaa gaacgtcatg aagcagacca
tggtggattc ctcctgcagg 3180gtcctcacca gtgatgtttt ccgggagtgc aacaggctgg
tgaaccccga gccgtacctg 3240gatgtttgca tctacgacag ctgctcctgc gagtccatcg
gggactgcgc ctgcttctgt 3300gacaccatcg ccgcctacgc ccacgagtgt gcccagcacg
gcgaagtggt gacctggagg 3360acagccacac tgtgccccca gaattgtgag gagcggaacc
tgaaggagag tgggtaccaa 3420tgtgagtggc gctacaacag ctgtgctccc gcctgtccag
tcacgtgcca gcacccagag 3480cccctggcct gccccgtgca gtgcgtggag ggctgccacg
cacactgccc gcctgggaaa 3540atcctggacg agcttttgca gacctgtgtc aaccccgagg
actgccctgt gtgccaggtg 3600gagggccggc gcttagcctc cgggaagaaa gtgaccctga
accctgggga ccctgagcat 3660tgccagctct gtcactgtga tggtgtcagc ctcacttgtg
aagcctgcag ggagccagga 3720ggcctgcccg tgccccccac cgaaggcccg gtcagcccca
caaccccgta cgtggaggac 3780accccggagc cacccctgca cgacttcttc tgcagcaaac
tgctggacct ggtcttcctg 3840ctggacggct cctccaagct gtctgaggcc gacttcgaga
cgctgaaggc gttcgtggtg 3900ggcatgatgg agcgtctgca catttcccag aaacgcatcc
gtgtggccgt ggtggagtac 3960cacgatggct cccatgccta ccttgcactg caagaccgga
agcggccatc cgagctgcgg 4020cgcattgccg ggcaggtgaa gtacgcgggc agcgaggtgg
cttccaccag cgaggtcttg 4080aagtacacgc tcttccagat cttcggcagg attgaccggc
ccgaggcctc tcgcgtggcc 4140ctgctgctta cggccagcca ggagcccccc aggctggccc
ggaacttggt ccgctacctc 4200cagggcctga agaagaagaa ggtctccgtg gtcccggtgg
gcatcgggcc ccacgtcagc 4260ctcaagcaga tccgcctcat cgagaagcag gcgtctgaga
acaaagcctt tgtgctaagc 4320ggtgtgcacg agctggagca gcggatgaac gagattgtcg
gctacctctg tgacctcgcc 4380cccgaggtgc ctgccccgac cccgacgcga catcctctca
ttgcgcaggt cactgtggcg 4440ccgcagctcc tgggtccttc gccaccagga cccaagagga
gctctgtggt cctggatgtg 4500gcattcctcc tggaaggctc ggatgaggta ggcgaggcca
acttcaacag gagcgcagag 4560tttgtggagg aggtgatccg acgcatggac gttggccggg
acggcatcca tgtcacggtg 4620ctgcagtact cgtacacggt gaccgtggag cactcgttca
gggagccaca gtccaaggag 4680gtggtcctgc agcgactcca tgaagtccgc taccggggtg
gcaaccagac gaacacgggg 4740ctggccctgc agtacctgtc ggagcacagc ttctccgcca
gccaggggga ccgggagcag 4800gcgcccaacc tggtctacat ggtgacgggc agcccggcct
cggacaagat ccagcggatg 4860ccaggagaca tccagctcgt gcccatcggc gtgggccccc
gtgtggacgt gcaggagctg 4920gagagggtca gctggcccca gacccccatc ttcatccagg
acttcgagag gctcccccga 4980gaagctccgg atctggtgct gcagcggtgc tgctccgaag
acggcccgca cctccccacc 5040ctcgcccctg ccccagactg cagccagccc ctggatgtag
tcctcctcct ggatggctcc 5100tccacccctc cagcctctta ctttgacgaa atgaagagtt
tcgccaaggc tttcatctca 5160aaagccaacc taggccctca gcttacccag gtgtcagtcc
tgcagtacgg gagcatcacc 5220aacgtcgacg tatcctggaa tgtgcacgtg gacaaagccc
acctgctgag ccttgtggac 5280cccatgcagc gcgagggagg ccccagccga gtcggggagg
cgttgtcctt cgcggtgcgc 5340tacatcacgt cccaagtcca cggtgccagg cccggggcct
ccaaggtggt ggtgatcctg 5400gtcacaggct cctccatgga ctcagtggag gcggccgccg
ctgctgccag atccaaccga 5460gtggctgtgt tccccatcgg gatcggggac cagtatgacg
cagcccagct gagggtcttg 5520gcgggcccgg gggccagctc caatgtggca gagctccagc
ggattgaaga cctccccagc 5580atggttgccc ttggcaactc cttcttccag aggctatgct
ctgggttcgt cagtgtttgc 5640gtggatgagg acgggaatga gaggaggcct ggggacgtct
ggaccttgct ggatcagtgc 5700cacacagtga cttgcctgcc agatggccag accttgctga
agagtcaccg ggtcaactgt 5760gatcaggggc cacagccatc atgccccgac ggccagatcc
cgctcaggat ggaggaagcc 5820tgtggctgcc gctgggcctg tccctgtgtg tgcacaggca
gctccactcg gcacatcgtg 5880acctttgatg gacggaattt caagctgacc ggcaactgct
catacgttct gtttcacaac 5940aaggagcagg acttggaggt gattctccat aacggattct
gcagcgctgg ggcgaggcag 6000gcctgcatga aatctgtgga ggtgaagcac agcggcctct
cggttgagct ccgcagcaac 6060atggaggtga tggtgaatgg gagactggtc tctgtccctt
acctgggtgg ggacatggag 6120gtcagagtct atggtaccat catgttcgag gtcagattca
accttctggg ccacatcctc 6180tccttcaccc cacgtgatga gacactccag gatggctgtg
acagtcactt ctgcaaagtc 6240aacaagagag gagagttcat ttgggagaag agggtcatgg
gctgcccgcc cttcaacgaa 6300cacaaatgtc tggctgaggg ggggaaagtc atgaaaattc
ctggcacgtg ctgtgacaca 6360tgtgaggagc ccgagtgcaa ggacatcaca gccagggtgc
agtacatcaa ggtgggagac 6420tgcaaatctg aagaggaagt ggacattaac tactgccagg
gaagatgcac cagcaaagcc 6480ctgtactcca tcgacacgga ggacgtgcaa gaccagtgtt
cctgctgctc gcccacgcgc 6540acggagccca tgtcagtgcc cctgcgctgc accaacggct
ccatcattca ccatgtgatc 6600ctcaacgccc tgcagtgcaa gtgctcatcc aggaagtgcc
gcccgtga 6648

<210> 30
<211> 24
<212> PRT
<213> Artificial Sequence
<223> synthetic porcine-derived linker sequence
<400> 30
Ser Phe Ala Gln Asn Ser Arg Pro Pro Ser Ala Ser Ala Pro Lys Pro
1               5                   10                  15
Pro Val Leu Arg Arg His Gln Arg
            20

<210> 31
<211> 14
<212> PRT
<213> Artificial Sequence
<223> synthetic human linker sequence
<400> 31
Ser Phe Ser Gln Asn Pro Pro Val Leu Lys Arg His Gln Arg
1               5                   10

<210> 32
<211> 6
<212> PRT
<213> Artificial Sequence
<223> synthetic 6-His tag
<400> 32
His His His His His His
1               5

<210> 33
<211> 4350
<212> DNA
<213> Artificial Sequence
<223> Synthetic truncated ovine (sheep) gene
<400> 33
atgcacatca agctctgtac ctgcctcttt ctgtgcctct ggccatgcag
cttcagtgcc 60atcagaagat actacctggg tgcagtggaa ctgtcctggg actatacgcg
aagtgaactg 120ctcagtgagc tgcatgtgaa cacgaggttt cctcccagag tgcccaaacc
ttttccattc 180aacacatcag tcatgtacag aaagactgtg tttgtagagt tcacggatca
actttttaac 240atcgccaagc ccaggccacc atggatgggt ctgctgggtc cagctatcca
ggctgaggtt 300tatgacaccg tggtcattac atttaagaat atggcttctc atcctgttag
tcttcatgct 360attggcgtat cctactggaa atcttctgaa ggtgctgcat ataaggatga
aaccagccaa 420agggagaagg aagatgacaa agtcattcct ggtaaaagtc atacctacgt
ctggcatatc 480ctgaaagaaa acggtccaac agcctctgac ccaccatgtc tcacctactc
atatctttct 540catgtggacc tggtgaaaga cgtgaactca ggtctcatcg gagccctgct
aatttgtagg 600gaagggactc tgatcaaaga aaggacacag accttgcaca aattcgtact
actgtttgct 660gtatttgatg aagggaaaag ttggcactcg ggaaaaaatg agtccttgac
acatgttatg 720gattctgcct ctgtactgca caccatcaat ggctatataa acaggtctct
gccaggtctg 780attggatgtc ataagaaatc agtctattgg catgtgattg gaatgggcac
caccccagaa 840gtgcactcaa ttttcctcga aggccacaca tttctcgtga ggaaccatcg
ccaggcttcc 900ttggagatct caccaataac tttccttacc gctcagacag tcctgatgga
ctgtggccag 960tttctactgt tttgtcgtat ctcttcccac caacatgatg gtatggaagc
ttatgtcaaa 1020gtagatagtt gcccagagga accccgacta tggatgaaaa ataaccaaga
agaagattac 1080gatgatggtt tggatgactc tgacatggat gtggtcaggt tcgatggtga
cagtgtgcct 1140ccctttatcc aaatgcgctc agttgcaaag aagcatccta aaacctgggt
ccactacatt 1200gctgctgaag aggatgactg ggactatgcc ccctcggtcc tcacctccaa
tgacagaagt 1260tataaaagtc tgtatctgaa ccacggtcct cagcggattg gtaggaagta
cagaaaagta 1320cgatttatag cttacacaga tggaacattt aagactcgtg aagctattca
gcatgaatca 1380gggatcctgg ggcctttact ttatggagaa gttggagaca cacttttgat
tatatttaag 1440aatcaagcaa gccggccata taacatctac cctcatggaa tcactgatgt
cagtcctttg 1500cactcaggga gatttccaaa aggtgtgaaa catttgaaag acatgccaat
tctgccagga 1560gaagtgttca agtataaatg gacagtgact gtagaagatg ggccaactaa
atcagatcct 1620cggtgtctga cccgatatta ctcgagtttc attaacttag agaaagatct
agcttcagga 1680ctcattggcc ccctcctcat ctgttacaaa gaatctgtag atcaaagagg
aaaccagatg 1740atgtcagaca agagaaatgt catcctgttt tctgtatttg atgaaaacaa
aagttggtac 1800ctcacagaga atattcaacg cttcctcccc agtggagtac agccccagga
tccagagttc 1860caagtctcca atgtcatgca cagcatcaat ggctatgttt ttgatagctt
gcagctgtcg 1920gtttgtttgc atgaggtggc gtactggtac attctaagtg ttggagccca
aattgacttc 1980ctctctgtct tcttctctgg atataccttc aaacacaaaa tggtctatga
agacacactc 2040accctattcc ccttctcggg agaaactgtc ttcatgtcaa tggaaaatcc
aggtctgtgg 2100gttctggggt gccacaactc agactttcga aacagaggca tgacagcctt
actgaaggtt 2160gatagttgtg acaggaacgt tggcgattat tatgacacat atgaagctat
tccaaccttc 2220ctgctgagtg aaaacaatgt cattgaaccc agaagcttct cccagaatcc
accaagcttg 2280aaacgccatc aaagggagat aacccttact acttttcagc cagagccaga
caaaactgac 2340tatgatgata ctttgtcgat tgaaacaaag agagaagatt ttgacattta
tggtgaagat 2400gaaaatcagg acccccgcag ctttcaaaag agaacacgcc actattttat
tgctgcagtg 2460gagcggctct gggattatgg gatgagtaga tccccccacg cactaagaaa
caggtctcag 2520aatggaggag tccctcagtt caaaaaggtg gtgttcgagg aatttactga
tggctccttt 2580actcaggccg tataccgtgg acaattaaat gaacacctgg gactcttggg
accatatata 2640agagcagaag tggaagacaa tatcatggta actttcaaaa accaggcctc
tcgtccctac 2700tccttctatt ctagccttat ttcttataac ggagatcaga gacaaggagc
agaacctcga 2760aaaaagtttg tcaagcctaa tgaaacccaa agctactttt ggaaagtgca
gcaccatatg 2820gcacccacca aagatgagtt tgactgcaaa gcctgggctt acttttctga
tgttgatctg 2880gaaaaagatg tgcactcagg cttgattggc cccattctga tctgccgtgt
ggacacgctg 2940agtgctgctc atgggagaca agtgacagta caggaattcg ctctgttttt
caccattttt 3000gatgagacca agagctggta ctttgccgaa aacatggcaa ggaactgcgt
ggcaccctgc 3060catgtccagc cagaggaccc tactttccaa gaaaagtatc gcttccatgc
aatcaatggc 3120tacgtgatgg atacactccc tggcttagtc atggctcagc atcaaaggat
taggtggtat 3180ctgctcagca tgggcagcaa tgaaaatatc cattccattc atttcagtgg
ccatgtgttc 3240actgtgagaa aaacggagga gtataaaatg gcggtctaca atctctaccc
aggtgtcttt 3300gagaccgtgg aaatgctacc atccaaggtt gggacttggc ggatagaatg
tcttattggc 3360gagcacctac aagctgggat gagcactctc ttcctggtgt acagcaagga
gtgtcaaatt 3420ccactgggaa tggcttctgg acgcattaga gattttcaga ttacagcttc
aggacaatat 3480ggacagtggg ccccaaagct ggccagactt cattattctg gatcaatcaa
tgcctggagc 3540accaaggatc cctctccttg gatcaaggtg gatctgttgg cgccgatgat
tattcacagc 3600atcctgactc agggtgcccg gcagaagttc tccagcctgt acatctctca
gtttatcatc 3660atgtacagcc tcgatggaca gcggtggcag ggttatcggg ggaactccac
tgggacttta 3720atggtgttct ttggcaatgt ggattcatct ggagtaaaac acaatatttt
taaccctcca 3780attattgcta gatatatccg tttgcaccca acgcattaca gcatccgcag
cactcttcgc 3840atggagttga tgggctgtga cttaaatagt tgcaacatgc cactgggaat
ggagaataaa 3900gctatatcag atacacagat tactgcctca tcccacttaa gcaacatgtt
tgccacctgg 3960tctccttcac aagcccgact taaccttcaa gggaggacaa atgcctggag
accccaggtg 4020aataatccaa aagagtggct acaagtggac ttccagaaga caatgagagt
tacaggaata 4080accactcaag gggtgaaatc tctgcttacc agcatgtatg tgaaggagtt
ccttatatcc 4140agtagtcaag agggccataa ctggactcca tttcttcaga atggcaaagt
gaaggttttt 4200cagggaaatc aagactcctt cacccccgtg gtgaatactc tagacccccc
actgtttacc 4260cgcttccttc ggattcaccc gcagagctgg gtgcaccata tcgccctgag
gctggagttt 4320tggggttgtg aggcacagca gcagtactag
4350