Patent application title: Novel Codon-Optimized CFTR mRNA
Inventors:
IPC8 Class: AA61K4764FI
USPC Class:
1 1
Class name:
Publication date: 2022-06-23
Patent application number: 20220193247
Abstract:
The present invention provides, among other things, improved methods and
pharmaceutical compositions for treating cystic fibrosis based on codon
optimized mRNA encoding a Cystic Fibrosis Transmembrane Conductance
Regulator (CFTR) protein.
Claims:
1. A pharmaceutical composition for treating cystic fibrosis, comprising
a codon optimized mRNA encoding a Cystic Fibrosis Transmembrane
Conductance Regulator (CFTR) protein and wherein the codon optimized CFTR
mRNA comprises a polynucleotide sequence of SEQ ID NO: 1.
2-4. (canceled)
5. The pharmaceutical composition of claim 1, wherein the codon optimized CFTR mRNA encoding the CFTR protein is encapsulated within a nanoparticle.
6. The pharmaceutical composition of claim 5, wherein the nanoparticle is a liposome.
7. The pharmaceutical composition of claim 6, wherein the liposome comprises one or more cationic lipids, one or more non-cationic lipids, one or more cholesterol-based lipids and one or more PEG-modified lipids.
8. The pharmaceutical composition of claim 6, wherein the liposome comprises no more than three distinct lipid components.
9-10. (canceled)
11. A method of large scale production of codon optimized mRNA encoding a Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) protein, comprising in vitro synthesizing codon optimized CFTR mRNA using a SP6 RNA polymerase, wherein at least 80% of the synthesized codon optimized CFTR mRNA molecules are full-length and wherein at least 100 mg of codon optimized mRNA is synthesized at a single batch, and wherein the codon optimized CFTR mRNA comprises a polynucleotide sequence of SEQ ID NO: 1.
12. (canceled)
13. The method of claim 11, wherein the in vitro synthesis of codon optimized CFTR mRNA results in a secondary polynucleotide species that constitutes less than 10%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.4%, 0.3%, 0.2% or 0.1% of the total mRNA synthesized.
14. The method of claim 11, wherein at least 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the synthesized codon optimized CFTR mRNA molecules are full-length.
15. (canceled)
16. The method of claim 11, wherein at least 200 mg, 300 mg, 400 mg, 500 mg, 600 mg, 700 mg, 800 mg, 900 mg, 1 g, 5 g, 10 g, 25 g, 50 g, 75 g, 100 g, 150 g, 200 g, 250 g, 500 g, 750 g, 1 kg, 5 kg, 10 kg, 50 kg, 100 kg, 1000 kg, or more of codon optimized CFTR mRNA is synthesized at a single batch.
17.-20. (canceled)
21. The method of claim 11, wherein the method further comprises a step of capping and/or tailing of the synthesized codon optimized CFTR mRNA.
22. (canceled)
23. A method of treating cystic fibrosis, comprising administering to a subject in need of treatment a composition comprising a codon optimized mRNA encoding a Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) protein, wherein the codon optimized CFTR mRNA comprises a polynucleotide sequence at least 85% identical to SEQ ID NO: 1.
24. The method of claim 23, wherein the codon optimized CFTR mRNA comprises SEQ ID NO: 1.
25.-26. (canceled)
27. The method of claim 23, wherein the codon optimized CFTR mRNA is encapsulated within a nanoparticle.
28. The method of claim 27, wherein the nanoparticle is a liposome.
29. The method of claim 28, wherein the liposome comprises one or more cationic lipids, one or more non-cationic lipids, one or more cholesterol-based lipids and one or more PEG-modified lipids.
30-32. (canceled)
33. The method of claim 23, wherein the codon optimized CFTR mRNA is administered to the subject via pulmonary delivery.
34. The method of claim 33, wherein the pulmonary delivery is nebulization.
35. A pharmaceutical composition for treating cystic fibrosis, comprising an mRNA encoding a Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) protein and wherein the mRNA encoding the CFTR protein comprises a polynucleotide sequence comprising any one of SEQ ID NO: 21-40.
36-40. (canceled)
41. The pharmaceutical composition of claim 35, wherein the mRNA is encapsulated in a nanoparticle, and wherein the nanoparticle is a liposome.
42. The pharmaceutical composition of claim 41, wherein the liposome comprises one or more cationic lipids, one or more non-cationic lipids, one or more cholesterol-based lipids and one or more PEG-modified lipids.
43-44. (canceled)
Description:
RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional Application Ser. No. 62/464,215, filed Feb. 27, 2017, the disclosures of which are hereby incorporated by reference.
SEQUENCE LISTING
[0002] The present specification makes reference to a Sequence Listing (submitted electronically as a .txt file named MRT-2001 US_ST25 on Feb. 27, 2018). The .txt file was generated on date and is 166,293 bytes in size. The entire contents of the sequence are herein incorporated by reference.
BACKGROUND
[0003] Cystic fibrosis is an autosomal inherited disorder resulting from mutation of the CFTR gene, which encodes a chloride ion channel believed to be involved in regulation of multiple other ion channels and transport systems in epithelial cells. Loss of function of CFTR results in chronic lung disease, aberrant mucus production, and dramatically reduced life expectancy. See generally Rowe et al., New Engl. J. Med. 352, 1992-2001 (2005).
[0004] Currently there is no cure for cystic fibrosis. The literature has documented numerous difficulties encountered in attempting to induce expression of CFTR in the lung. For example, viral vectors comprising CFTR DNA triggered immune responses and CF symptoms persisted after administration. Conese et al., J. Cyst. Fibros. 10 Suppl 2, S114-28 (2011); Rosenecker et al., Curr. Opin. Mol. Ther. 8, 439-45 (2006). Non-viral delivery of DNA, including CFTR DNA, has also been reported to trigger immune responses. Alton et al., Lancet 353, 947-54 (1999); Rosenecker et al., J Gene Med. 5, 49-60 (2003). Furthermore, non-viral DNA vectors encounter the additional problem that the machinery of the nuclear pore complex does not ordinarily import DNA into the nucleus, where transcription would occur. Pearson, Nature 460, 164-69 (2009).
SUMMARY OF THE INVENTION
[0005] The present invention provides, among other things, pharmaceutical compositions comprising messenger RNA (mRNA) encoding a Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) protein and methods of making and using thereof. These pharmaceutical compositions can be used for improved treatment of cystic fibrosis.
[0006] In one aspect, the present invention provides pharmaceutical compositions for treating cystic fibrosis, comprising an mRNA encoding a Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) protein and wherein the mRNA encoding the CFTR protein comprises a polynucleotide sequence at least 85% identical to SEQ ID NO: 1. In some embodiments, the mRNA encoding the CFTR protein comprises SEQ ID NO: 1. In some embodiments, the mRNA further comprises a 5' untranslated region (UTR) sequence of SEQ ID NO: 4. In some embodiments, the mRNA further comprises a 3' untranslated region (UTR) sequence of SEQ ID NO: 5 or SEQ ID NO: 6.
[0007] In some embodiments, the mRNA encoding the CFTR protein is encapsulated within a nanoparticle. In some embodiments, the nanoparticle is a liposome. In some embodiments, the liposome comprises one or more cationic lipids, one or more non-cationic lipids, one or more cholesterol-based lipids and one or more PEG-modified lipids. In some embodiments, the liposome comprises no more than three distinct lipid components. In some embodiments, one distinct lipid component is a sterol-based cationic lipid. In some embodiments, the liposome has a size less than about 100 nm. In another aspect, the present invention provides methods for large scale production of mRNA encoding Cystic Fibrosis Transmembrane Conductance Regulator (CFTR). In some embodiments, a method according to the present invention comprises in vitro synthesizing mRNA encoding a CFTR protein using a SP6 RNA polymerase, wherein at least 80% of the synthesized mRNA molecules are full-length and wherein at least 100 mg of mRNA is synthesized at a single batch.
[0008] In some embodiments, the in vitro synthesized mRNA encoding CFTR is substantially free of a secondary polynucleotide species of approximately 1800 nucleotides in length. In some embodiments, the in vitro synthesis of mRNA results in a secondary polynucleotide species that constitutes less than 10%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.4%, 0.3%, 0.2% or 0.1% of the total mRNA synthesized.
[0009] In some embodiments, at least 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the synthesized mRNA molecules are full-length. In some embodiments, the synthesized mRNA molecules are substantially full-length.
[0010] In some embodiments, at least 200 mg, 300 mg, 400 mg, 500 mg, 600 mg, 700 mg, 800 mg, 900 mg, 1 g, 5 g, 10 g, 25 g, 50 g, 75 g, 100 g, 150 g, 200 g, 250 g, 500 g, 750 g, 1 kg, 5 kg, 10 kg, 50 kg, 100 kg, 1000 kg, or more of mRNA is synthesized at a single batch.
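By way of illustration only, the quantitative thresholds recited above (batch size, fraction of full-length molecules, and fraction of the secondary polynucleotide species) can be expressed as a simple check. The following is a minimal sketch; the class name, field names, and example values are hypothetical and are not taken from the specification, and the assay used to obtain each fraction is left unspecified.

```python
from dataclasses import dataclass


@dataclass
class BatchReport:
    """Hypothetical summary of one in vitro synthesis batch (illustrative only)."""
    total_mrna_mg: float               # total mRNA synthesized in the batch, in mg
    full_length_fraction: float        # fraction of mRNA molecules that are full-length (0 to 1)
    secondary_species_fraction: float  # fraction of total mRNA that is the secondary species (0 to 1)


def meets_thresholds(report: BatchReport,
                     min_total_mg: float = 100.0,
                     min_full_length: float = 0.80,
                     max_secondary: float = 0.10) -> bool:
    """Return True when the batch satisfies all three thresholds.

    Defaults mirror the broadest values recited in the text: at least 100 mg,
    at least 80% full-length, and less than 10% secondary species.
    """
    return (report.total_mrna_mg >= min_total_mg
            and report.full_length_fraction >= min_full_length
            and report.secondary_species_fraction < max_secondary)


# Invented example values, for illustration only.
example = BatchReport(total_mrna_mg=250.0,
                      full_length_fraction=0.92,
                      secondary_species_fraction=0.01)
print(meets_thresholds(example))
```

Narrower embodiments (e.g., at least 95% full-length, or less than 1% secondary species) correspond to passing stricter values for the keyword arguments.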
[0011] In some embodiments, the CFTR protein comprises the amino acid sequence of SEQ ID NO: 3. In some embodiments, the mRNA comprises a polynucleotide sequence at least 85% identical to SEQ ID NO: 1. In some embodiments, the mRNA further comprises a 5' untranslated region (UTR) sequence of SEQ ID NO: 4. In some embodiments, the mRNA further comprises a 3' untranslated region (UTR) sequence of SEQ ID NO: 5 or SEQ ID NO: 6.
[0012] In some embodiments, the method further comprises a step of capping and/or tailing of the synthesized CFTR mRNA.
[0013] Among other things, the present invention provides mRNA encoding Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) synthesized using various methods described herein and pharmaceutical compositions containing the same.
[0014] In yet another aspect, the present invention provides methods of delivering mRNA encoding CFTR described herein for in vivo protein expression and/or for treatment of Cystic Fibrosis. In some embodiments, the present invention provides methods of treating cystic fibrosis, comprising administering to a subject in need of treatment a composition comprising an mRNA encoding a Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) protein wherein the mRNA comprises a polynucleotide sequence at least 85% (e.g., at least 90%, 92%, 94%, 95%, 96%, 97%, 98%, or 99%) identical to SEQ ID NO: 1.
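For illustration, a percent-identity figure such as "at least 85% identical" can be computed as sketched below. The specification does not state which alignment method is used, so this sketch assumes two sequences that are already aligned, of equal length, and without gaps; the function name and the toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions with identical residues.

    Assumes the two sequences are pre-aligned, equal in length, and gap-free.
    A real comparison would first align the sequences with an alignment tool.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Toy 10-nucleotide sequences differing at one position (illustrative only).
identity = percent_identity("AUGCAACGCU", "AUGCAGCGCU")
print(identity >= 85.0)
```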
[0015] In some embodiments, the mRNA encoding the CFTR protein comprises SEQ ID NO: 1. In some embodiments, the mRNA further comprises a 5' untranslated region (UTR) sequence of SEQ ID NO: 4. In some embodiments, the mRNA further comprises a 3' untranslated region (UTR) sequence of SEQ ID NO: 5 or SEQ ID NO: 6.
[0016] In some embodiments, the mRNA encoding the CFTR protein is encapsulated within a nanoparticle. In some embodiments, the nanoparticle is a liposome. In some embodiments, the liposome comprises one or more cationic lipids, one or more non-cationic lipids, one or more cholesterol-based lipids and one or more PEG-modified lipids. In some embodiments, the liposome comprises no more than three distinct lipid components. In some embodiments, one distinct lipid component is a sterol-based cationic lipid. In some embodiments, the sterol-based cationic lipid is the imidazole cholesterol ester "ICE" lipid (3S, 10R, 13R, 17R)-10, 13-dimethyl-17-((R)-6-methylheptan-2-yl)-2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17-tetradecahydro-1H-cyclopenta[a]phenanthren-3-yl 3-(1H-imidazol-4-yl)propanoate. In some embodiments, the liposome has a size less than about 100 nm.
[0017] In some embodiments, the mRNA is administered to the subject via pulmonary delivery. In some embodiments, the pulmonary delivery is nebulization.
[0018] Other features, objects, and advantages of the present invention are apparent in the detailed description, drawings and claims that follow. It should be understood, however, that the detailed description, the drawings, and the claims, while indicating embodiments of the present invention, are given by way of illustration only, not limitation. Various changes and modifications within the scope of the invention will become apparent to those skilled in the art.
BRIEF DESCRIPTION OF THE DRAWING
[0019] The drawings are for illustration purposes only, not for limitation.
[0020] FIG. 1 depicts an exemplary gel showing that synthesis of the novel codon-optimized Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) sequence using an SP6 promoter eliminated the secondary polynucleotide species (lane 2), as compared to a previous codon-optimized CFTR sequence (lane 3). Arrow indicates a secondary polynucleotide species approximately 1800 nucleotides in length.
DEFINITIONS
[0021] In order for the present invention to be more readily understood, certain terms are first defined below. Additional definitions for the following terms and other terms are set forth throughout the specification. The publications and other reference materials referenced herein to describe the background of the invention and to provide additional detail regarding its practice are hereby incorporated by reference.
[0022] Approximately or about: As used herein, the term "approximately" or "about," as applied to one or more values of interest, refers to a value that is similar to a stated reference value. In certain embodiments, the term "approximately" or "about" refers to a range of values that fall within 25%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or less in either direction (greater than or less than) of the stated reference value unless otherwise stated or otherwise evident from the context (except where such number would exceed 100% of a possible value).
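As an illustration of this definition, a value can be tested against a reference value and a chosen percentage tolerance as sketched below. The function name and the default tolerance are assumptions made for the example; the definition itself lists a range of possible tolerances (25% down to 1% or less) without selecting one.

```python
def is_about(value: float, reference: float, tolerance_pct: float = 25.0) -> bool:
    """Return True if `value` lies within `tolerance_pct` percent of `reference`,
    in either direction (greater than or less than the reference)."""
    allowed_deviation = abs(reference) * tolerance_pct / 100.0
    return abs(value - reference) <= allowed_deviation


# With a 10% tolerance, 95 counts as "about 100" while 120 does not.
print(is_about(95.0, 100.0, tolerance_pct=10.0))
print(is_about(120.0, 100.0, tolerance_pct=10.0))
```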
[0023] Batch: As used herein, the term "batch" refers to a quantity or amount of mRNA synthesized at one time, e.g., produced according to a single manufacturing order during the same cycle of manufacture. A batch may refer to an amount of mRNA synthesized in one reaction that occurs via a single aliquot of enzyme and/or a single aliquot of DNA template for continuous synthesis under one set of conditions. In some embodiments, a batch would include the mRNA produced from a reaction in which not all reagents and/or components are supplemented and/or replenished as the reaction progresses. The term "not in a single batch" would mean mRNA synthesized at different times that are combined to achieve the desired amount.
[0024] Delivery: As used herein, the term "delivery" encompasses both local and systemic delivery. For example, delivery of mRNA encompasses situations in which an mRNA is delivered to a target tissue and the encoded protein is expressed and retained within the target tissue (also referred to as "local distribution" or "local delivery"), and situations in which an mRNA is delivered to a target tissue and the encoded protein is expressed and secreted into the patient's circulatory system (e.g., serum) and systemically distributed and taken up by other tissues (also referred to as "systemic distribution" or "systemic delivery"). In some embodiments, delivery is pulmonary delivery, e.g., comprising nebulization.
[0025] Encapsulation: As used herein, the term "encapsulation," or grammatical equivalent, refers to the process of confining an mRNA molecule within a nanoparticle.
[0026] Expression: As used herein, "expression" of a nucleic acid sequence refers to translation of an mRNA into a polypeptide, assembly of multiple polypeptides (e.g., heavy chain or light chain of antibody) into an intact protein (e.g., antibody) and/or post-translational modification of a polypeptide or fully assembled protein (e.g., antibody). In this application, the terms "expression" and "production," and grammatical equivalents, are used interchangeably.
[0027] Functional: As used herein, a "functional" biological molecule is a biological molecule in a form in which it exhibits a property and/or activity by which it is characterized.
[0028] Half-life: As used herein, the term "half-life" is the time required for a quantity such as nucleic acid or protein concentration or activity to fall to half of its value as measured at the beginning of a time period.
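As a worked illustration of this definition, the sketch below computes the fraction of a quantity remaining after a given time. It assumes simple first-order (exponential) decay, which the definition does not require; the function name and units are hypothetical.

```python
def remaining_fraction(elapsed: float, half_life: float) -> float:
    """Fraction of the starting quantity remaining after `elapsed` time,
    assuming first-order decay with the given half-life (same time units)."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (elapsed / half_life)


# After one half-life, half remains; after two half-lives, one quarter remains.
print(remaining_fraction(elapsed=6.0, half_life=6.0))
print(remaining_fraction(elapsed=12.0, half_life=6.0))
```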
[0029] Improve, increase, or reduce: As used herein, the terms "improve," "increase" or "reduce," or grammatical equivalents, indicate values that are relative to a baseline measurement, such as a measurement in the same individual prior to initiation of the treatment described herein, or a measurement in a control subject (or multiple control subjects) in the absence of the treatment described herein. A "control subject" is a subject afflicted with the same form of disease as the subject being treated, who is about the same age as the subject being treated.
[0030] Impurities: As used herein, the term "impurities" refers to substances inside a confined amount of liquid, gas, or solid, which differ from the chemical composition of the target material or compound. Impurities are also referred to as contaminants.
[0031] In Vitro: As used herein, the term "in vitro" refers to events that occur in an artificial environment, e.g., in a test tube or reaction vessel, in cell culture, etc., rather than within a multi-cellular organism.
[0032] In Vivo: As used herein, the term "in vivo" refers to events that occur within a multi-cellular organism, such as a human or a non-human animal. In the context of cell-based systems, the term may be used to refer to events that occur within a living cell (as opposed to, for example, in vitro systems).
[0033] Isolated: As used herein, the term "isolated" refers to a substance and/or entity that has been (1) separated from at least some of the components with which it was associated when initially produced (whether in nature and/or in an experimental setting), and/or (2) produced, prepared, and/or manufactured by the hand of man. Isolated substances and/or entities may be separated from about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more than about 99% of the other components with which they were initially associated. In some embodiments, isolated agents are about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more than about 99% pure. As used herein, a substance is "pure" if it is substantially free of other components. As used herein, calculation of percent purity of isolated substances and/or entities should not include excipients (e.g., buffer, solvent, water, etc.).
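To illustrate the statement that percent purity should not include excipients, the following sketch computes purity from component masses while leaving out any component designated as an excipient. The function name, component names, and masses are hypothetical and chosen only for the example.

```python
from typing import Iterable, Mapping


def percent_purity(target_mass: float,
                   other_components: Mapping[str, float],
                   excipients: Iterable[str]) -> float:
    """Percent purity of the target substance, excluding excipients.

    `other_components` maps component name to mass (same units as `target_mass`).
    Components named in `excipients` (e.g., buffer, solvent, water) are ignored.
    """
    excluded = set(excipients)
    counted_other = sum(mass for name, mass in other_components.items()
                        if name not in excluded)
    return 100.0 * target_mass / (target_mass + counted_other)


# Invented masses: the buffer is an excipient and does not lower the purity figure.
purity = percent_purity(
    target_mass=95.0,
    other_components={"truncated RNA": 5.0, "buffer": 900.0},
    excipients=["buffer"],
)
print(round(purity, 1))
```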
[0034] messenger RNA (mRNA): As used herein, the term "messenger RNA (mRNA)" refers to a polynucleotide that encodes at least one polypeptide. mRNA as used herein encompasses both modified and unmodified RNA. mRNA may contain one or more coding and non-coding regions. mRNA can be purified from natural sources, produced using recombinant expression systems and optionally purified, chemically synthesized, etc. Where appropriate, e.g., in the case of chemically synthesized molecules, mRNA can comprise nucleoside analogs such as analogs having chemically modified bases or sugars, backbone modifications, etc. An mRNA sequence is presented in the 5' to 3' direction unless otherwise indicated.
[0035] Nucleic acid: As used herein, the term "nucleic acid," in its broadest sense, refers to any compound and/or substance that is or can be incorporated into a polynucleotide chain. In some embodiments, a nucleic acid is a compound and/or substance that is or can be incorporated into a polynucleotide chain via a phosphodiester linkage. In some embodiments, "nucleic acid" refers to individual nucleic acid residues (e.g., nucleotides and/or nucleosides). In some embodiments, "nucleic acid" refers to a polynucleotide chain comprising individual nucleic acid residues. In some embodiments, "nucleic acid" encompasses RNA as well as single and/or double-stranded DNA and/or cDNA. Furthermore, the terms "nucleic acid," "DNA," "RNA," and/or similar terms include nucleic acid analogs, i.e., analogs having other than a phosphodiester backbone. For example, the so-called "peptide nucleic acids," which are known in the art and have peptide bonds instead of phosphodiester bonds in the backbone, are considered within the scope of the present invention. The term "nucleotide sequence encoding an amino acid sequence" includes all nucleotide sequences that are degenerate versions of each other and/or encode the same amino acid sequence. Nucleotide sequences that encode proteins and/or RNA may include introns. Nucleic acids can be purified from natural sources, produced using recombinant expression systems and optionally purified, chemically synthesized, etc. Where appropriate, e.g., in the case of chemically synthesized molecules, nucleic acids can comprise nucleoside analogs such as analogs having chemically modified bases or sugars, backbone modifications, etc. A nucleic acid sequence is presented in the 5' to 3' direction unless otherwise indicated. 
In some embodiments, a nucleic acid is or comprises natural nucleosides (e.g., adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine); nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C-5 propynyl-cytidine, C-5 propynyl-uridine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and 2-thiocytidine); chemically modified bases; biologically modified bases (e.g., methylated bases); intercalated bases; modified sugars (e.g., 2'-fluororibose, ribose, 2'-deoxyribose, arabinose, and hexose); and/or modified phosphate groups (e.g., phosphorothioates and 5'-N-phosphoramidite linkages). In some embodiments, the present invention is specifically directed to "unmodified nucleic acids," meaning nucleic acids (e.g., polynucleotides and residues, including nucleotides and/or nucleosides) that have not been chemically modified in order to facilitate or achieve delivery. In some embodiments, the nucleotides T and U are used interchangeably in sequence descriptions.
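Two points in this definition lend themselves to a short illustration: that T and U may be used interchangeably in sequence descriptions, and that degenerate nucleotide sequences can encode the same amino acid sequence. The sketch below uses the standard genetic code; the helper names are hypothetical, and the two example inputs are the first seven codons of SEQ ID NO: 1 and SEQ ID NO: 2 as shown in Table 1.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def to_rna(sequence: str) -> str:
    """Normalise a sequence to RNA notation by treating T as U."""
    return sequence.upper().replace("T", "U")


def translate(sequence: str) -> str:
    """Translate from the first base in frame, stopping at the first stop codon."""
    rna = to_rna(sequence)
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# First seven codons of SEQ ID NO: 1 and SEQ ID NO: 2 (from Table 1).
seq1_start = "AUGCAACGCUCUCCUCUUGAA"
seq2_start = "AUGCAGCGGUCCCCGCUCGAA"
print(translate(seq1_start) == translate(seq2_start))
```

The two fragments differ at five nucleotide positions yet translate to the same seven residues, which is the sense in which codon optimized sequences are degenerate versions of one another.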
[0036] Patient: As used herein, the term "patient" or "subject" refers to any organism to which a provided composition may be administered, e.g., for experimental, diagnostic, prophylactic, cosmetic, and/or therapeutic purposes. Typical patients include animals (e.g., mammals such as mice, rats, rabbits, non-human primates, and/or humans). In some embodiments, a patient is a human. A human includes pre- and post-natal forms.
[0037] Pharmaceutically acceptable: The term "pharmaceutically acceptable" as used herein, refers to substances that, within the scope of sound medical judgment, are suitable for use in contact with the tissues of human beings and animals without excessive toxicity, irritation, allergic response, or other problem or complication, commensurate with a reasonable benefit/risk ratio.
[0038] Subject: As used herein, the term "subject" refers to a human or any non-human animal (e.g., mouse, rat, rabbit, dog, cat, cattle, swine, sheep, horse or primate). A human includes pre- and post-natal forms. In many embodiments, a subject is a human being. A subject can be a patient, which refers to a human presenting to a medical provider for diagnosis or treatment of a disease. The term "subject" is used herein interchangeably with "individual" or "patient." A subject can be afflicted with or susceptible to a disease or disorder but may or may not display symptoms of the disease or disorder.
[0039] Substantially: As used herein, the term "substantially" refers to the qualitative condition of exhibiting total or near-total extent or degree of a characteristic or property of interest. One of ordinary skill in the biological arts will understand that biological and chemical phenomena rarely, if ever, go to completion and/or proceed to completeness or achieve or avoid an absolute result. The term "substantially" is therefore used herein to capture the potential lack of completeness inherent in many biological and chemical phenomena.
[0040] Treating: As used herein, the term "treat," "treatment," or "treating" refers to any method used to partially or completely alleviate, ameliorate, relieve, inhibit, prevent, delay onset of, reduce severity of and/or reduce incidence of one or more symptoms or features of a particular disease, disorder, and/or condition. Treatment may be administered to a subject who does not exhibit signs of a disease and/or exhibits only early signs of the disease for the purpose of decreasing the risk of developing pathology associated with the disease.
DETAILED DESCRIPTION
[0041] The present invention provides, among other things, improved methods and pharmaceutical compositions for treating cystic fibrosis based on codon optimized messenger RNA (mRNA) encoding a Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) protein. In particular, these codon optimized mRNAs may be synthesized efficiently at a large scale by, e.g., SP6 RNA polymerase. Certain codon optimized mRNAs may be particularly useful for producing a homogeneous, safe and efficacious clinical product.
[0042] In some embodiments, the present invention provides methods of producing a pharmaceutical composition comprising an mRNA, wherein the mRNA is an in vitro transcribed mRNA encoding a Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) protein, wherein the in vitro transcribed mRNA is synthesized from a DNA template using an SP6 RNA polymerase, and wherein the synthesis of the in vitro transcribed mRNA does not result in the production of a secondary polynucleotide species of approximately 1800 nucleotides in length.
Cystic Fibrosis
[0043] The present invention may be used to treat a subject who is suffering from or susceptible to cystic fibrosis. Cystic fibrosis is a genetic disorder characterized by mutations in the gene for Cystic Fibrosis Transmembrane Conductance Regulator (CFTR). The CFTR protein functions as a channel across the membrane of cells that produce mucus, sweat, saliva, tears, and digestive enzymes. The channel transports negatively charged particles called chloride ions into and out of cells. The transport of chloride ions helps control the movement of water in tissues, which is necessary for the production of thin, freely flowing mucus. Mucus is a slippery substance that lubricates and protects the lining of the airways, digestive system, reproductive system, and other organs and tissues.
[0044] Respiratory symptoms of cystic fibrosis include: a persistent cough that produces thick mucus (sputum), wheezing, breathlessness, exercise intolerance, repeated lung infections and inflamed nasal passages or a stuffy nose. Digestive symptoms of cystic fibrosis include: foul-smelling, greasy stools, poor weight gain and growth, intestinal blockage, particularly in newborns (meconium ileus), and severe constipation.
Codon Optimized mRNA Encoding CFTR
[0045] In some embodiments, the present invention provides methods and compositions for delivering codon optimized mRNA encoding CFTR to a subject for the treatment of cystic fibrosis. A suitable codon optimized CFTR mRNA encodes any full length, fragment or portion of a CFTR protein which can be substituted for naturally-occurring CFTR protein activity and/or reduce the intensity, severity, and/or frequency of one or more symptoms associated with cystic fibrosis.
[0046] In some embodiments, a suitable codon optimized mRNA sequence is an mRNA sequence encoding a human CFTR (hCFTR) protein. Exemplary codon optimized CFTR mRNA coding sequence and the corresponding amino acid sequence are shown in Table 1:
TABLE-US-00001 TABLE 1 Exemplary Codon-Optimized Human CFTR SEQ ID AUGCAACGCUCUCCUCUUGAAAAGGCCUCGGUGGUGUCCAAGCUCUU NO: 1 CUUCUCGUGGACUAGACCCAUCCUGAGAAAGGGGUACAGACAGCGCU UGGAGCUGUCCGAUAUCUAUCAAAUCCCUUCCGUGGACUCCGCGGAC AACCUGUCCGAGAAGCUCGAGAGAGAAUGGGACAGAGAACUCGCCUC AAAGAAGAACCCGAAGCUGAUUAAUGCGCUUAGGCGGUGCUUUUUC UGGCGGUUCAUGUUCUACGGCAUCUUCCUCUACCUGGGAGAGGUCAC CAAGGCCGUGCAGCCCCUGUUGCUGGGACGGAUUAUUGCCUCCUACG ACCCCGACAACAAGGAAGAAAGAAGCAUCGCUAUCUACUUGGGCAUC GGUCUGUGCCUGCUUUUCAUCGUCCGGACCCUCUUGUUGCAUCCUGC UAUUUUCGGCCUGCAUCACAUUGGCAUGCAGAUGAGAAUUGCCAUG UUUUCCCUGAUCUACAAGAAAACUCUGAAGCUCUCGAGCCGCGUGCU UGACAAGAUUUCCAUCGGCCAGCUCGUGUCCCUGCUCUCCAACAAUC UGAACAAGUUCGACGAGGGCCUCGCCCUGGCCCACUUCGUGUGGAUC GCCCCUCUGCAAGUGGCGCUUCUGAUGGGCCUGAUCUGGGAGCUGCU GCAAGCCUCGGCAUUCUGUGGGCUUGGAUUCCUGAUCGUGCUGGCAC UGUUCCAGGCCGGACUGGGGCGGAUGAUGAUGAAGUACAGGGACCA GAGAGCCGGAAAGAUUUCCGAACGGCUGGUGAUCACUUCGGAAAUG AUCGAAAACAUCCAGUCAGUGAAGGCCUACUGCUGGGAAGAGGCCAU GGAAAAGAUGAUUGAAAACCUCCGGCAAACCGAGCUGAAGCUGACCC GCAAGGCCGCUUACGUGCGCUAUUUCAACUCGUCCGCUUUCUUCUUC UCCGGGUUCUUCGUGGUGUUUCUCUCCGUGCUCCCCUACGCCCUGAU UAAGGGAAUCAUCCUCAGGAAGAUCUUCACCACCAUUUCCUUCUGUA UCGUGCUCCGCAUGGCCGUGACCCGGCAGUUCCCAUGGGCCGUGCAG ACUUGGUACGACUCCCUGGGAGCCAUUAACAAGAUCCAGGACUUCCU UCAAAAGCAGGAGUACAAGACCCUCGAGUACAACCUGACUACUACCG AGGUCGUGAUGGAAAACGUCACCGCCUUUUGGGAGGAGGGAUUUGG CGAACUGUUCGAGAAGGCCAAGCAGAACAACAACAACCGCAAGACCU CGAACGGUGACGACUCCCUCUUCUUUUCAAACUUCAGCCUGCUCGGG ACGCCCGUGCUGAAGGACAUUAACUUCAAGAUCGAAAGAGGACAGCU CCUGGCGGUGGCCGGAUCGACCGGAGCCGGAAAGACUUCCCUGCUGA UGGUGAUCAUGGGAGAGCUUGAACCUAGCGAGGGAAAGAUCAAGCA CUCCGGCCGCAUCAGCUUCUGUAGCCAGUUUUCCUGGAUCAUGCCCG GAACCAUUAAGGAAAACAUCAUCUUCGGCGUGUCCUACGAUGAAUAC CGCUACCGGUCCGUGAUCAAAGCCUGCCAGCUGGAAGAGGAUAUUUC AAAGUUCGCGGAGAAAGAUAACAUCGUGCUGGGCGAAGGGGGUAUU ACCUUGUCGGGGGGCCAGCGGGCUAGAAUCUCGCUGGCCAGAGCCGU GUAUAAGGACGCCGACCUGUAUCUCCUGGACUCCCCCUUCGGAUACC UGGACGUCCUGACCGAAAAGGAGAUCUUCGAAUCGUGCGUGUGCAA GCUGAUGGCUAACAAGACUCGCAUCCUCGUGACCUCCAAAAUGGAGC ACCUGAAGAAGGCAGACAAGAUUCUGAUUCUGCAUGAGGGGUCCUCC 
UACUUUUACGGCACCUUCUCGGAGUUGCAGAACUUGCAGCCCGACUU CUCAUCGAAGCUGAUGGGUUGCGACAGCUUCGACCAGUUCUCCGCCG AAAGAAGGAACUCGAUCCUGACGGAAACCUUGCACCGCUUCUCUUUG GAAGGCGACGCCCCUGUGUCAUGGACCGAGACUAAGAAGCAGAGCUU CAAGCAGACCGGGGAAUUCGGCGAAAAGAGGAAGAACAGCAUCUUG AACCCCAUUAACUCCAUCCGCAAGUUCUCAAUCGUGCAAAAGACGCC ACUGCAGAUGAACGGCAUUGAGGAGGACUCCGACGAACCCCUUGAGA GGCGCCUGUCCCUGGUGCCGGACAGCGAGCAGGGAGAAGCCAUCCUG CCUCGGAUUUCCGUGAUCUCCACUGGUCCGACGCUCCAAGCCCGGCG GCGGCAGUCCGUGCUGAACCUGAUGACCCACAGCGUGAACCAGGGCC AAAACAUUCACCGCAAGACUACCGCAUCCACCCGGAAAGUGUCCCUG GCACCUCAAGCGAAUCUUACCGAGCUCGACAUCUACUCCCGGAGACU GUCGCAGGAAACCGGGCUCGAAAUUUCCGAAGAAAUCAACGAGGAG GAUCUGAAAGAGUGCUUCUUCGACGAUAUGGAGUCGAUACCCGCCGU GACGACUUGGAACACUUAUCUGCGGUACAUCACUGUGCACAAGUCAU UGAUCUUCGUGCUGAUUUGGUGCCUGGUGAUUUUCCUGGCCGAGGU CGCGGCCUCACUGGUGGUGCUCUGGCUGUUGGGAAACACGCCUCUGC AAGACAAGGGAAACUCCACGCACUCGAGAAACAACAGCUAUGCCGUG AUUAUCACUUCCACCUCCUCUUAUUACGUGUUCUACAUCUACGUCGG AGUGGCGGAUACCCUGCUCGCGAUGGGUUUCUUCAGAGGACUGCCGC UGGUCCACACCUUGAUCACCGUCAGCAAGAUUCUUCACCACAAGAUG UUGCAUAGCGUGCUGCAGGCCCCCAUGUCCACCCUCAACACUCUGAA GGCCGGAGGCAUUCUGAACAGAUUCUCCAAGGACAUCGCUAUCCUGG ACGAUCUCCUGCCGCUUACCAUCUUUGACUUCAUCCAGCUGCUGCUG AUCGUGAUUGGAGCAAUCGCAGUGGUGGCGGUGCUGCAGCCUUACA UUUUCGUGGCCACUGUGCCGGUCAUUGUGGCGUUCAUCAUGCUGCGG GCCUACUUCCUCCAAACCAGCCAGCAGCUGAAGCAACUGGAAUCCGA GGGACGAUCCCCCAUCUUCACUCACCUUGUGACGUCGUUGAAGGGAC UGUGGACCCUCCGGGCUUUCGGACGGCAGCCCUACUUCGAAACCCUC UUCCACAAGGCCCUGAACCUCCACACCGCCAAUUGGUUCCUGUACCU GUCCACCCUGCGGUGGUUCCAGAUGCGCAUCGAGAUGAUUUUCGUCA UCUUCUUCAUCGCGGUCACAUUCAUCAGCAUCCUGACUACCGGAGAG GGAGAGGGACGGGUCGGAAUAAUCCUGACCCUCGCCAUGAACAUUAU GAGCACCCUGCAGUGGGCAGUGAACAGCUCGAUCGACGUGGACAGCC UGAUGCGAAGCGUCAGCCGCGUGUUCAAGUUCAUCGACAUGCCUACU GAGGGAAAACCCACUAAGUCCACUAAGCCCUACAAAAAUGGCCAGCU GAGCAAGGUCAUGAUCAUCGAAAACUCCCACGUGAAGAAGGACGAU AUUUGGCCCUCCGGAGGUCAAAUGACCGUGAAGGACCUGACCGCAAA GUACACCGAGGGAGGAAACGCCAUUCUCGAAAACAUCAGCUUCUCCA UUUCGCCGGGACAGCGGGUCGGCCUUCUCGGGCGGACCGGUUCCGGG AAGUCAACUCUGCUGUCGGCUUUCCUCCGGCUGCUGAAUACCGAGGG 
GGAAAUCCAAAUUGACGGCGUGUCUUGGGAUUCCAUUACUCUGCAGC AGUGGCGGAAGGCCUUCGGCGUGAUCCCCCAGAAGGUGUUCAUCUUC UCGGGUACCUUCCGGAAGAACCUGGAUCCUUACGAGCAGUGGAGCGA CCAAGAAAUCUGGAAGGUCGCCGACGAGGUCGGCCUGCGCUCCGUGA UUGAACAAUUUCCUGGAAAGCUGGACUUCGUGCUCGUCGACGGGGG AUGUGUCCUGUCGCACGGACAUAAGCAGCUCAUGUGCCUCGCACGGU CCGUGCUCUCCAAGGCCAAGAUUCUGCUGCUGGACGAACCUUCGGCC CACCUGGAUCCGGUCACCUACCAGAUCAUCAGGAGGACCCUGAAGCA GGCCUUUGCCGAUUGCACCGUGAUUCUCUGCGAGCACCGCAUCGAGG CCAUGCUGGAGUGCCAGCAGUUCCUGGUCAUCGAGGAGAACAAGGUC CGCCAAUACGACUCCAUUCAAAAGCUCCUCAACGAGCGGUCGCUGUU CAGACAAGCUAUUUCACCGUCCGAUAGAGUGAAGCUCUUCCCGCAUC GGAACAGCUCAAAGUGCAAAUCGAAGCCGCAGAUCGCAGCCUUGAAG GAAGAGACUGAGGAAGAGGUGCAGGACACCCGGCUUUAA SEQ ID AUGCAGCGGUCCCCGCUCGAAAAGGCCAGUGUCGUGUCCAAACUCUU NO: 2 CUUCUCAUGGACUCGGCCUAUCCUUAGAAAGGGGUAUCGGCAGAGGC UUGAGUUGUCUGACAUCUACCAGAUCCCCUCGGUAGAUUCGGCGGAU AACCUCUCGGAGAAGCUCGAACGGGAAUGGGACCGCGAACUCGCGUC UAAGAAAAACCCGAAGCUCAUCAACGCACUGAGAAGGUGCUUCUUCU GGCGGUUCAUGUUCUACGGUAUCUUCUUGUAUCUCGGGGAGGUCAC AAAAGCAGUCCAACCCCUGUUGUUGGGUCGCAUUAUCGCCUCGUACG ACCCCGAUAACAAAGAAGAACGGAGCAUCGCGAUCUACCUCGGGAUC GGACUGUGUUUGCUUUUCAUCGUCAGAACACUUUUGUUGCAUCCAGC AAUCUUCGGCCUCCAUCACAUCGGUAUGCAGAUGCGAAUCGCUAUGU UUAGCUUGAUCUACAAAAAGACACUGAAACUCUCGUCGCGGGUGUU GGAUAAGAUUUCCAUCGGUCAGUUGGUGUCCCUGCUUAGUAAUAAC CUCAACAAAUUCGAUGAGGGACUGGCGCUGGCACAUUUCGUGUGGA UUGCCCCGUUGCAAGUCGCCCUUUUGAUGGGCCUUAUUUGGGAGCUG UUGCAGGCAUCUGCCUUUUGUGGCCUGGGAUUUCUGAUUGUGUUGG CAUUGUUUCAGGCUGGGCUUGGGCGGAUGAUGAUGAAGUAUCGCGA CCAGAGAGCGGGUAAAAUCUCGGAAAGACUCGUCAUCACUUCGGAAA UGAUCGAAAACAUCCAGUCGGUCAAAGCCUAUUGCUGGGAAGAAGC UAUGGAGAAGAUGAUUGAAAACCUCCGCCAAACUGAGCUGAAACUG ACCCGCAAGGCGGCGUAUGUCCGGUAUUUCAAUUCGUCAGCGUUCUU CUUUUCCGGGUUCUUCGUUGUCUUUCUCUCGGUUUUGCCUUAUGCCU UGAUUAAGGGGAUUAUCCUCCGCAAGAUUUUCACCACGAUUUCGUUC UGCAUUGUAUUGCGCAUGGCAGUGACACGGCAAUUUCCGUGGGCCGU GCAGACAUGGUAUGACUCGCUUGGAGCGAUCAACAAAAUCCAAGACU UCUUGCAAAAGCAAGAGUACAAGACCCUGGAGUACAAUCUUACUACU ACGGAGGUAGUAAUGGAGAAUGUGACGGCUUUUUGGGAAGAGGGUU UUGGAGAACUGUUUGAGAAAGCAAAGCAGAAUAACAACAACCGCAA 
GACCUCAAAUGGGGACGAUUCCCUGUUUUUCUCGAACUUCUCCCUGC UCGGAACACCCGUGUUGAAGGACAUCAAUUUCAAGAUUGAGAGGGG ACAGCUUCUCGCGGUAGCGGGAAGCACUGGUGCGGGAAAAACUAGCC UCUUGAUGGUGAUUAUGGGGGAGCUUGAGCCCAGCGAGGGGAAGAU UAAACACUCCGGGCGUAUCUCAUUCUGUAGCCAGUUUUCAUGGAUCA UGCCCGGAACCAUUAAAGAGAACAUCAUUUUCGGAGUAUCCUAUGA UGAGUACCGAUACAGAUCGGUCAUUAAGGCGUGCCAGUUGGAAGAG GACAUUUCUAAGUUCGCCGAGAAGGAUAACAUCGUCUUGGGAGAAG GGGGUAUUACAUUGUCGGGAGGGCAGCGAGCGCGGAUCAGCCUCGCG AGAGCGGUAUACAAAGAUGCAGAUUUGUAUCUGCUUGAUUCACCGU UUGGAUACCUCGACGUAUUGACAGAAAAAGAAAUCUUCGAGUCGUG CGUGUGUAAACUUAUGGCUAAUAAGACGAGAAUCCUGGUGACAUCA AAAAUGGAACACCUUAAGAAGGCGGACAAGAUCCUGAUCCUCCACGA AGGAUCGUCCUACUUUUACGGCACUUUCUCAGAGUUGCAAAACUUGC AGCCGGACUUCUCAAGCAAACUCAUGGGGUGUGACUCAUUCGACCAG UUCAGCGCGGAACGGCGGAACUCGAUCUUGACGGAAACGCUGCACCG AUUCUCGCUUGAGGGUGAUGCCCCGGUAUCGUGGACCGAGACAAAGA AGCAGUCGUUUAAGCAGACAGGAGAAUUUGGUGAGAAAAGAAAGAA CAGUAUCUUGAAUCCUAUUAACUCAAUUCGCAAGUUCUCAAUCGUCC AGAAAACUCCACUGCAGAUGAAUGGAAUUGAAGAGGAUUCGGACGA ACCCCUGGAGCGCAGGCUUAGCCUCGUGCCGGAUUCAGAGCAAGGGG AGGCCAUUCUUCCCCGGAUUUCGGUGAUUUCAACCGGACCUACACUU CAGGCGAGGCGAAGGCAAUCCGUGCUCAACCUCAUGACGCAUUCGGU AAACCAGGGGCAAAACAUUCACCGCAAAACGACGGCCUCAACGAGAA AAGUGUCACUUGCACCCCAGGCGAAUUUGACUGAACUCGACAUCUAC AGCCGUAGGCUUUCGCAAGAAACCGGACUUGAGAUCAGCGAAGAAA UCAAUGAAGAAGAUUUGAAAGAGUGUUUCUUUGAUGACAUGGAAUC AAUCCCAGCGGUGACAACGUGGAACACAUACUUGCGUUACAUCACGG UGCACAAGUCCUUGAUUUUCGUCCUCAUCUGGUGUCUCGUGAUCUUU CUCGCUGAGGUCGCAGCGUCACUUGUGGUCCUCUGGCUGCUUGGUAA UACGCCCUUGCAAGACAAAGGCAAUUCUACACACUCAAGAAACAAUU CCUAUGCCGUGAUUAUCACUUCUACAAGCUCGUAUUACGUGUUUUAC AUCUACGUAGGAGUGGCCGACACUCUGCUCGCGAUGGGUUUCUUCCG AGGACUCCCACUCGUUCACACGCUUAUCACUGUCUCCAAGAUUCUCC ACCAUAAGAUGCUUCAUAGCGUACUGCAGGCUCCCAUGUCCACCUUG AAUACGCUCAAGGCGGGAGGUAUUUUGAAUCGCUUCUCAAAAGAUA UUGCAAUUUUGGAUGACCUUCUGCCCCUGACGAUCUUCGACUUCAUC CAGUUGUUGCUGAUCGUGAUUGGGGCUAUUGCAGUAGUCGCUGUCC UCCAGCCUUACAUUUUUGUCGCGACCGUUCCGGUGAUCGUGGCGUUU AUCAUGCUGCGGGCCUAUUUCUUGCAGACGUCACAGCAGCUUAAGCA ACUGGAGUCUGAAGGGAGGUCGCCUAUCUUUACGCAUCUUGUGACCA 
GUUUGAAGGGAUUGUGGACGUUGCGCGCCUUUGGCAGGCAGCCCUAC UUUGAAACACUGUUCCACAAAGCGCUGAAUCUCCAUACGGCAAAUUG GUUUUUGUAUUUGAGUACCCUCCGAUGGUUUCAGAUGCGCAUUGAG AUGAUUUUUGUGAUCUUCUUUAUCGCGGUGACUUUUAUCUCCAUCU UGACCACGGGAGAGGGCGAGGGACGGGUCGGUAUUAUCCUGACACUC GCCAUGAACAUUAUGAGCACUUUGCAGUGGGCAGUGAACAGCUCGA UUGAUGUGGAUAGCCUGAUGAGGUCCGUUUCGAGGGUCUUUAAGUU CAUCGACAUGCCGACGGAGGGAAAGCCCACAAAAAGUACGAAACCCU AUAAGAAUGGGCAAUUGAGUAAGGUAAUGAUCAUCGAGAACAGUCA CGUGAAGAAGGAUGACAUCUGGCCUAGCGGGGGUCAGAUGACCGUG AAGGACCUGACGGCAAAAUACACCGAGGGAGGGAACGCAAUCCUUGA AAACAUCUCGUUCAGCAUUAGCCCCGGUCAGCGUGUGGGGUUGCUCG GGAGGACCGGGUCAGGAAAAUCGACGUUGCUGUCGGCCUUCUUGAG ACUUCUGAAUACAGAGGGUGAGAUCCAGAUCGACGGCGUUUCGUGG GAUAGCAUCACCUUGCAGCAGUGGCGGAAAGCGUUUGGAGUAAUCCC CCAAAAGGUCUUUAUCUUUAGCGGAACCUUCCGAAAGAAUCUCGAUC CUUAUGAACAGUGGUCAGAUCAAGAGAUUUGGAAAGUCGCGGACGA GGUUGGCCUUCGGAGUGUAAUCGAGCAGUUUCCGGGAAAACUCGAC UUUGUCCUUGUAGAUGGGGGAUGCGUCCUGUCGCAUGGGCACAAGC AGCUCAUGUGCCUGGCGCGAUCCGUCCUCUCUAAAGCGAAAAUUCUU CUCUUGGAUGAACCUUCGGCCCAUCUGGACCCGGUAACGUAUCAGAU CAUCAGAAGGACACUUAAGCAGGCGUUUGCCGACUGCACGGUGAUUC UCUGUGAGCAUCGUAUCGAGGCCAUGCUCGAAUGCCAGCAAUUUCUU GUCAUCGAAGAGAAUAAGGUCCGCCAGUACGACUCCAUCCAGAAGCU GCUUAAUGAGAGAUCAUUGUUCCGGCAGGCGAUUUCACCAUCCGAUA GGGUGAAACUUUUUCCACACAGAAAUUCGUCGAAGUGCAAGUCCAA ACCGCAGAUCGCGGCCUUGAAAGAAGAGACUGAAGAAGAAGUUCAA GACACGCGUCUUUAA Human MQRSPLEKASVVSKLFFSWTRPILRKGYRQRLELSDIYQIPSVDSADNLSEK CFTR LEREWDRELASKKNPKLINALRRCFFWRFMFYGIFLYLGEVTKAVQPLLL Protein GRIIASYDPDNKEERSIAIYLGIGLCLLFIVRTLLLHPAIFGLHHIGMQMRIA Sequence MFSLIYKKTLKLSSRVLDKISIGQLVSLLSNNLNKFDEGLALAHFVWIAPLQ VALLMGLIWELLQASAFCGLGFLIVLALFQAGLGRMMMKYRDQRAGKIS ERLVITSEMIENIQSVKAYCWEEAMEKMIENLRQTELKLTRKAAYVRYFN SSAFFFSGFFVVFLSVLPYALIKGIILRKIFTTISFCIVLRMAVTRQFPWAVQT WYDSLGAINKIQDFLQKQEYKTLEYNLTTTEVVMENVTAFWEEGFGELFE KAKQNNNNRKTSNGDDSLFFSNFSLLGTPVLKDINFKIERGQLLAVAGSTG AGKTSLLMVIMGELEPSEGKIKHSGRISFCSQFSWIMPGTIKENIIFGVSYDE YRYRSVIKACQLEEDISKFAEKDNIVLGEGGITLSGGQRARISLARAVYKD ADLYLLDSPFGYLDVLTEKEIFESCVCKLMANKTRILVTSKMEHLKKADKI 
LILHEGSSYFYGTFSELQNLQPDFSSKLMGCDSFDQFSAERRNSILTETLHR FSLEGDAPVSWTETKKQSFKQTGEFGEKRKNSILNPINSIRKFSIVQKTPLQ MNGIEEDSDEPLERRLSLVPDSEQGEAILPRISVISTGPTLQARRRQSVLNL MTHSVNQGQNIHRKTTASTRKVSLAPQANLTELDIYSRRLSQETGLEISEEI NEEDLKECFFDDMESIPAVTTWNTYLRYITVHKSLIFVLIWCLVIFLAEVAA SLVVLWLLGNTPLQDKGNSTHSRNNSYAVIITSTSSYYVFYIYVGVADTLL AMGFFRGLPLVHTLITVSKILHHKMLHSVLQAPMSTLNTLKAGGILNRFSK DIAILDDLLPLTIFDFIQLLLIVIGAIAVVAVLQPYIFVATVPVIVAFIMLRAY FLQTSQQLKQLESEGRSPIFTHLVTSLKGLWTLRAFGRQPYFETLFHKALN LHTANWFLYLSTLRWFQMRIEMIFVIFFIAVTFISILTTGEGEGRVGIILTLA MNIMSTLQWAVNSSIDVDSLMRSVSRVFKFIDMPTEGKPTKSTKPYKNGQ LSKVMIIENSHVKKDDIWPSGGQMTVKDLTAKYTEGGNAILENISFSISPGQ RVGLLGRTGSGKSTLLSAFLRLLNTEGEIQIDGVSWDSITLQQWRKAFGVIP QKVFIFSGTFRKNLDPYEQWSDQEIWKVADEVGLRSVIEQFPGKLDFVLVD GGCVLSHGHKQLMCLARSVLSKAKILLLDEPSAHLDPVTYQIIRRTLKQAF ADCTVILCEHRIEAMLECQQFLVIEENKVRQYDSIQKLLNERSLFRQAISPS DRVKLFPHRNSSKCKSKPQIAALKEETEEEVQDTRL (SEQ ID NO: 3)
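The relationship between the codon-optimized mRNA sequences of Table 1 and the encoded protein (SEQ ID NO: 3) can be illustrated by a short translation sketch. This is a minimal illustration that assumes the standard genetic code and an open reading frame beginning at the first nucleotide; the function name `translate` is hypothetical and is not part of the disclosure.

```python
# Standard genetic code, with codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Translate an mRNA reading frame, stopping at the first stop codon.

    Whitespace in the input (as found in the flattened table) is ignored.
    """
    seq = "".join(mrna.split()).upper()
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":  # stop codon ends the protein
            break
        protein.append(residue)
    return "".join(protein)


# The first nine codons of SEQ ID NO: 1 and of SEQ ID NO: 2 differ in
# nucleotide sequence but encode the same N-terminal residues.
print(translate("AUGCAACGCUCUCCUCUUGAAAAGGCC"))
print(translate("AUGCAGCGGUCCCCGCUCGAAAAGGCC"))
```

Applied to a complete coding sequence, the same function would be expected to return a protein sequence that can be compared against SEQ ID NO: 3.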
[0047] Additional exemplary codon optimized mRNA sequences are described in the Examples section below, for example, SEQ ID NO: 7 and SEQ ID NO: 8, both of which include 5' and 3' untranslated regions framing a codon-optimized hCFTR-encoding mRNA, and SEQ ID NO: 27 to SEQ ID NO: 40.
[0048] In some embodiments, a suitable mRNA sequence may be an mRNA sequence encoding a homolog or an analog of human CFTR (hCFTR) protein. For example, a homolog or an analog of hCFTR protein may be a modified hCFTR protein containing one or more amino acid substitutions, deletions, and/or insertions as compared to a wild-type or naturally-occurring hCFTR protein while retaining substantial hCFTR protein activity. In some embodiments, an mRNA suitable for the present invention encodes an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to SEQ ID NO: 3. In some embodiments, an mRNA suitable for the present invention encodes a protein substantially identical to hCFTR protein. In some embodiments, an mRNA suitable for the present invention encodes an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 3. In some embodiments, an mRNA suitable for the present invention encodes a fragment or a portion of hCFTR protein. In some embodiments, an mRNA suitable for the present invention encodes a fragment or a portion of hCFTR protein, wherein the fragment or portion of the protein still maintains CFTR activity similar to that of the wild-type protein. In some embodiments, an mRNA suitable for the present invention has a nucleotide sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 1, SEQ ID NO: 7 or SEQ ID NO: 8.
[0049] In some embodiments, an mRNA suitable for the present invention has a nucleotide sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to any one of SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39 or SEQ ID NO: 40.
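The percent identity thresholds recited above can be illustrated with a small sketch. It assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and uses the alignment length as the denominator, which is only one of several conventions; the function name `percent_identity` is hypothetical, and in practice an established alignment tool would typically be used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumes pre-aligned input of equal length; gap columns ("-") never count
    as matches but do count toward the alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# First twelve nucleotides of SEQ ID NO: 1 versus SEQ ID NO: 2 (no gaps).
print(percent_identity("AUGCAACGCUCU", "AUGCAGCGGUCC"))
```

The short fragments differ at three of twelve positions, yet both encode the same amino acids, which illustrates why nucleotide-level identity between codon-optimized variants can be well below the identity of the encoded proteins.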
[0050] In some embodiments, a suitable mRNA encodes a fusion protein comprising a full length, fragment or portion of an hCFTR protein fused to another protein (e.g., an N or C terminal fusion). In some embodiments, the protein fused to the full length, fragment or portion of an hCFTR protein comprises a signal or a cellular targeting sequence.
Synthesis of mRNA
[0051] mRNAs according to the present invention may be synthesized according to any of a variety of known methods. For example, mRNAs according to the present invention may be synthesized via in vitro transcription (IVT). Briefly, IVT is typically performed with a linear or circular DNA template containing a promoter, a pool of ribonucleotide triphosphates, a buffer system that may include DTT and magnesium ions, and an appropriate RNA polymerase (e.g., T3, T7, or SP6 RNA polymerase), DNAse I, pyrophosphatase, and/or RNAse inhibitor. The exact conditions will vary according to the specific application.
[0052] In some embodiments, for the preparation of mRNA according to the invention, a DNA template is transcribed in vitro. A suitable DNA template typically has a promoter, for example a T3, T7 or SP6 promoter, for in vitro transcription, followed by the desired nucleotide sequence for the desired mRNA and a termination signal.
[0053] Synthesis of mRNA Using SP6 RNA Polymerase
[0054] In some embodiments, CFTR mRNA is produced using SP6 RNA Polymerase. SP6 RNA Polymerase is a DNA-dependent RNA polymerase with high sequence specificity for SP6 promoter sequences. The SP6 polymerase catalyzes the 5'→3' in vitro synthesis of RNA on either single-stranded DNA or double-stranded DNA downstream from its promoter; it incorporates native ribonucleotides and/or modified ribonucleotides and/or labeled ribonucleotides into the polymerized transcript. Examples of such labeled ribonucleotides include biotin-, fluorescein-, digoxigenin-, aminoallyl-, and isotope-labeled nucleotides.
[0055] The sequence for bacteriophage SP6 RNA polymerase was initially described (GenBank: Y00105.1) as having the following amino acid sequence:
TABLE-US-00002 (SEQ ID NO: 9) MQDLHAIQLQLEEEMFNGGIRRFEADQQRQIAAGSESDTAWNRRLLSE LIAPMAEGIQAYKEEYEGKKGRAPRALAFLQCVENEVAAYITMKVVMD MLNTDATLQAIAMSVAERIEDQVRFSKLEGHAAKYFEKVKKSLKASRT KSYRHNVAVVAEKSVAEKDADFDRWEAWPKETQLQIGTTLLEILEGSV FYNGEPVFMRAMRTYGGKTIYYLQTSESVGQWISAFKEHVAQLSPAYA PCVIPPRPWRTPFNGGFHTEKVASRIRLVKGNREHVRKLTQKQMPKVY KAINALQNTQWQINKDVLAVIEEVIRLDLGYGVPSFKPLIDKENKPAN PVPVEFQHLRGRELKEMLSPEQWQQFINWKGECARLYTAETKRGSKSA AVVRMVGQARKYSAFESIYFVYAMDSRSRVYVQSSTLSPQSNDLGKAL LRFTEGRPVNGVEALKWFCINGANLWGWDKKTFDVRVSNVLDEEFQDM CRDIAADPLTFTQWAKADAPYEFLAWCFEYAQYLDLVDEGRADEFRTH LPVHQDGSCSGIQHYSAMLRDEVGAKAVNLKPSDAPQDIYGAVAQVVI KKNALYMDADDATTFTSGSVTLSGTELRAMASAWDSIGITRSLTKKPV MTLPYGSTRLTCRESVIDYIVDLEEKEAQKAVAEGRTANKVHPFEDDR QDYLTPGAAYNYMTALIWPSISEVVKAPIVAMKMIRQLARFAAKRNEG LMYTLPTGFILEQKIMATEMLRVRTCLMGDIKMSLQVETDIVDEAAMM GAAAPNFVHGHDASHLILTVCELVDKGVTSIAVIHDSFGTHADNTLTL RVALKGQMVAMYIDGNALQKLLEEHEVRWMVDTGIEVPEQGEFDLNEI MDSEYVFA.
[0056] An SP6 RNA polymerase suitable for the present invention can be any enzyme having substantially the same polymerase activity as bacteriophage SP6 RNA polymerase. Thus, in some embodiments, an SP6 RNA polymerase suitable for the present invention may be modified from SEQ ID NO: 9. For example, a suitable SP6 RNA polymerase may contain one or more amino acid substitutions, deletions, or additions. In some embodiments, a suitable SP6 RNA polymerase has an amino acid sequence about 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, 80%, 75%, 70%, 65%, or 60% identical or homologous to SEQ ID NO: 9. In some embodiments, a suitable SP6 RNA polymerase may be a truncated protein (from N-terminus, C-terminus, or internally) but retain the polymerase activity. In some embodiments, a suitable SP6 RNA polymerase is a fusion protein.
[0057] An SP6 RNA polymerase suitable for the invention may be a commercially-available product, e.g., from Aldevron, Ambion, New England Biolabs (NEB), Promega, and Roche. The SP6 may be ordered and/or custom designed from a commercial source or a non-commercial source according to the amino acid sequence of SEQ ID NO: 9 or a variant of SEQ ID NO: 9 as described herein. The SP6 may be a standard-fidelity polymerase or may be a high-fidelity/high-efficiency/high-capacity polymerase which has been modified to promote RNA polymerase activities, e.g., by mutations in the SP6 RNA polymerase gene or post-translational modifications of the SP6 RNA polymerase itself. Examples of such modified SP6 include SP6 RNA Polymerase-Plus™ from Ambion, HiScribe SP6 from NEB, and RiboMAX™ and Riboprobe® Systems from Promega.
[0058] In some embodiments, a suitable SP6 RNA polymerase is a fusion protein. For example, an SP6 RNA polymerase may include one or more tags to promote isolation, purification, or solubility of the enzyme. A suitable tag may be located at the N-terminus, C-terminus, and/or internally. Non-limiting examples of a suitable tag include Calmodulin-binding protein (CBP); Fasciola hepatica 8-kDa antigen (Fh8); FLAG tag peptide; glutathione-S-transferase (GST); Histidine tag (e.g., hexahistidine tag (His6)); maltose-binding protein (MBP); N-utilization substance (NusA); small ubiquitin related modifier (SUMO) fusion tag; Streptavidin binding peptide (STREP); Tandem affinity purification (TAP); and thioredoxin (TrxA). Other tags may be used in the present invention. These and other fusion tags have been described, e.g., in Costa et al. Frontiers in Microbiology 5 (2014): 63 and in PCT/US16/57044, the contents of which are incorporated herein by reference in their entireties. In certain embodiments, a His tag is located at SP6's N-terminus.
[0059] SP6 Promoter
[0060] Any promoter that can be recognized by an SP6 RNA polymerase may be used in the present invention. Typically, an SP6 promoter comprises 5'-ATTTAGGTGACACTATAG-3' (SEQ ID NO: 10). Variants of the SP6 promoter have been discovered and/or created to optimize recognition and/or binding of SP6 to its promoter. Non-limiting examples of such variants include: 5'-ATTTAGGGGACACTATAGAAGAG-3'; 5'-ATTTAGGGGACACTATAGAAGG-3'; 5'-ATTTAGGGGACACTATAGAAGGG-3'; 5'-ATTTAGGTGACACTATAGAA-3'; 5'-ATTTAGGTGACACTATAGAAGA-3'; 5'-ATTTAGGTGACACTATAGAAGAG-3'; 5'-ATTTAGGTGACACTATAGAAGG-3'; 5'-ATTTAGGTGACACTATAGAAGGG-3'; 5'-ATTTAGGTGACACTATAGAAGNG-3'; and 5'-CATACGATTTAGGTGACACTATAG-3' (SEQ ID NO: 11 to SEQ ID NO: 20).
[0061] In addition, a suitable SP6 promoter for the present invention may be about 95%, 90%, 85%, 80%, 75%, or 70% identical or homologous to any one of SEQ ID NO: 10 to SEQ ID NO: 20. Moreover, an SP6 promoter useful in the present invention may include one or more additional nucleotides 5' and/or 3' to any of the promoter sequences described herein.
[0062] DNA Template
[0063] Typically, a CFTR DNA template is either entirely double-stranded or mostly single-stranded with a double-stranded SP6 promoter sequence.
[0064] Linearized plasmid DNA (linearized via one or more restriction enzymes), linearized genomic DNA fragments (via restriction enzyme and/or physical means), PCR products, and/or synthetic DNA oligonucleotides can be used as templates for in vitro transcription with SP6, provided that they contain a double-stranded SP6 promoter upstream (and in the correct orientation) of the DNA sequence to be transcribed.
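The requirement that a template carry an SP6 promoter upstream of, and in the correct orientation relative to, the sequence to be transcribed can be checked with a simple string search. The sketch below is illustrative only: it searches both strands for the exact promoter of SEQ ID NO: 10 and returns whatever lies 3' of the match on that strand. The function names are hypothetical, promoter variants and mismatches are not handled, and the returned region is not meant to define the exact transcription start site.

```python
SP6_PROMOTER = "ATTTAGGTGACACTATAG"  # SEQ ID NO: 10 as given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def downstream_of_promoter(template: str, promoter: str = SP6_PROMOTER):
    """Return the sequence 3' of the first promoter match, or None if absent.

    The given strand is searched first, then its reverse complement, so a
    promoter placed on the opposite strand is still found and the returned
    region is read in the promoter's own orientation.
    """
    for strand in (template.upper(), reverse_complement(template)):
        index = strand.find(promoter)
        if index != -1:
            return strand[index + len(promoter):]
    return None


# Hypothetical toy template: two filler bases, the promoter, then an insert.
toy_template = "GG" + SP6_PROMOTER + "ATGCAACGC"
print(downstream_of_promoter(toy_template))
print(downstream_of_promoter(reverse_complement(toy_template)))
print(downstream_of_promoter("ACGTACGT"))
```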
[0065] In some embodiments, the linearized DNA template has a blunt end.
[0066] In some embodiments, the DNA sequence to be transcribed may be optimized to facilitate more efficient transcription and/or translation. For example, the DNA sequence may be optimized regarding cis-regulatory elements (e.g., TATA box, termination signals, and protein binding sites), artificial recombination sites, chi sites, CpG dinucleotide content, negative CpG islands, GC content, polymerase slippage sites, and/or other elements relevant to transcription; the DNA sequence may be optimized regarding cryptic splice sites, mRNA secondary structure, stable free energy of mRNA, repetitive sequences, RNA instability motif, and/or other elements relevant to mRNA processing and stability; the DNA sequence may be optimized regarding codon usage bias, codon adaptability, internal chi sites, ribosomal binding sites (e.g., IRES), premature polyA sites, Shine-Dalgarno (SD) sequences, and/or other elements relevant to translation; and/or the DNA sequence may be optimized regarding codon context, codon-anticodon interaction, translational pause sites, and/or other elements relevant to protein folding. Optimization methods known in the art may be used in the present invention, e.g., GeneOptimizer by ThermoFisher and OptimumGene™, which are described in US 20110081708, the contents of which are incorporated herein by reference in their entirety.
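Two of the sequence properties named above, GC content and CpG dinucleotide content, are straightforward to compute. The following sketch shows one way to do so; the function names are hypothetical, and this is not a description of how the cited optimization tools work internally.

```python
def gc_content(seq: str) -> float:
    """Percent of residues that are G or C (works for DNA or RNA strings)."""
    s = "".join(seq.split()).upper()
    if not s:
        raise ValueError("sequence must not be empty")
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def cpg_count(seq: str) -> int:
    """Number of CpG dinucleotides, i.e. a C immediately followed by a G."""
    s = "".join(seq.split()).upper()
    return s.count("CG")


# First twelve nucleotides of SEQ ID NO: 1, used only as a toy input.
fragment = "AUGCAACGCUCU"
print(gc_content(fragment), cpg_count(fragment))
```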
[0067] In some embodiments, the DNA template includes a 5' and/or 3' untranslated region. In some embodiments, a 5' untranslated region includes one or more elements that affect an mRNA's stability or translation, for example, an iron responsive element. In some embodiments, a 5' untranslated region may be between about 50 and 500 nucleotides in length.
[0068] In some embodiments, a 3' untranslated region includes one or more of a polyadenylation signal, a binding site for proteins that affect an mRNA's stability or location in a cell, or one or more binding sites for miRNAs. In some embodiments, a 3' untranslated region may be between 50 and 500 nucleotides in length or longer.
[0069] Exemplary 3' and/or 5' UTR sequences can be derived from mRNA molecules which are stable (e.g., globin, actin, GAPDH, tubulin, histone, or citric acid cycle enzymes) to increase the stability of the sense mRNA molecule. For example, a 5' UTR sequence may include a partial sequence of a CMV immediate-early 1 (IE1) gene, or a fragment thereof to improve the nuclease resistance and/or improve the half-life of the polynucleotide. Also contemplated is the inclusion of a sequence encoding human growth hormone (hGH), or a fragment thereof to the 3' end or untranslated region of the polynucleotide (e.g., mRNA) to further stabilize the polynucleotide. Generally, these modifications improve the stability and/or pharmacokinetic properties (e.g., half-life) of the polynucleotide relative to their unmodified counterparts, and include, for example modifications made to improve such polynucleotides' resistance to in vivo nuclease digestion.
[0070] Large-Scale mRNA Synthesis
[0071] The present invention relates to large-scale production of codon optimized CFTR mRNA. In some embodiments, a method according to the invention synthesizes mRNA at least 100 mg, 150 mg, 200 mg, 300 mg, 400 mg, 500 mg, 600 mg, 700 mg, 800 mg, 900 mg, 1 g, 5 g, 10 g, 25 g, 50 g, 75 g, 100 g, 250 g, 500 g, 750 g, 1 kg, 5 kg, 10 kg, 50 kg, 100 kg, 1000 kg, or more at a single batch. As used herein, the term "batch" refers to a quantity or amount of mRNA synthesized at one time, e.g., produced according to a single manufacturing setting. A batch may refer to an amount of mRNA synthesized in one reaction that occurs via a single aliquot of enzyme and/or a single aliquot of DNA template for continuous synthesis under one set of conditions. mRNA synthesized at a single batch would not include mRNA synthesized at different times that are combined to achieve the desired amount. Generally, a reaction mixture includes SP6 RNA polymerase, a linear DNA template, and an RNA polymerase reaction buffer (which may include ribonucleotides or may require addition of ribonucleotides).
[0072] According to the present invention, 1-100 mg of SP6 polymerase is typically used per gram (g) of mRNA produced. In some embodiments, about 1-90 mg, 1-80 mg, 1-60 mg, 1-50 mg, 1-40 mg, 10-100 mg, 10-80 mg, 10-60 mg, 10-50 mg of SP6 polymerase is used per gram of mRNA produced. In some embodiments, about 5-20 mg of SP6 polymerase is used to produce about 1 gram of mRNA. In some embodiments, about 0.5 to 2 grams of SP6 polymerase is used to produce about 100 grams of mRNA. In some embodiments, about 5 to 20 grams of SP6 polymerase is used to produce about 1 kilogram of mRNA. In some embodiments, at least 5 mg of SP6 polymerase is used to produce at least 1 gram of mRNA. In some embodiments, at least 500 mg of SP6 polymerase is used to produce at least 100 grams of mRNA. In some embodiments, at least 5 grams of SP6 polymerase is used to produce at least 1 kilogram of mRNA. In some embodiments, about 10 mg, 20 mg, 30 mg, 40 mg, 50 mg, 60 mg, 70 mg, 80 mg, 90 mg, or 100 mg of plasmid DNA is used per gram of mRNA produced. In some embodiments, about 10-30 mg of plasmid DNA is used to produce about 1 gram of mRNA. In some embodiments, about 1 to 3 grams of plasmid DNA is used to produce about 100 grams of mRNA. In some embodiments, about 10 to 30 grams of plasmid DNA is used to produce about 1 kilogram of mRNA. In some embodiments, at least 10 mg of plasmid DNA is used to produce at least 1 gram of mRNA. In some embodiments, at least 1 gram of plasmid DNA is used to produce at least 100 grams of mRNA. In some embodiments, at least 10 grams of plasmid DNA is used to produce at least 1 kilogram of mRNA.
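The per-gram ratios above scale linearly with batch size, which is how the 1 gram, 100 gram and 1 kilogram figures relate to one another. The arithmetic can be sketched as follows; the function name `reagent_range_g` is hypothetical, and the sketch assumes simple linear scaling with no allowance for yield losses.

```python
def reagent_range_g(mrna_grams: float, low_mg_per_g: float, high_mg_per_g: float):
    """Return (low, high) reagent mass in grams for a target mRNA batch size.

    Inputs are the batch size in grams of mRNA and a reagent ratio range in
    milligrams of reagent per gram of mRNA produced.
    """
    return (
        mrna_grams * low_mg_per_g / 1000.0,
        mrna_grams * high_mg_per_g / 1000.0,
    )


# About 5-20 mg SP6 polymerase and about 10-30 mg plasmid DNA per gram of mRNA.
print(reagent_range_g(100, 5, 20))    # SP6 polymerase for a 100 g batch
print(reagent_range_g(100, 10, 30))   # plasmid DNA for a 100 g batch
print(reagent_range_g(1000, 5, 20))   # SP6 polymerase for a 1 kg batch
```

For a 100 gram batch this gives 0.5 to 2 grams of SP6 polymerase and 1 to 3 grams of plasmid DNA, consistent with the figures recited in the paragraph.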
[0073] In some embodiments, the concentration of the SP6 RNA polymerase in the reaction mixture may be from about 1 to 100 nM, 1 to 90 nM, 1 to 80 nM, 1 to 70 nM, 1 to 60 nM, 1 to 50 nM, 1 to 40 nM, 1 to 30 nM, 1 to 20 nM, or about 1 to 10 nM. In certain embodiments, the concentration of the SP6 RNA polymerase is from about 10 to 50 nM, 20 to 50 nM, or 30 to 50 nM. A concentration of 100 to 10000 Units/ml of the SP6 RNA polymerase may be used; for example, concentrations of 100 to 9000 Units/ml, 100 to 8000 Units/ml, 100 to 7000 Units/ml, 100 to 6000 Units/ml, 100 to 5000 Units/ml, 100 to 1000 Units/ml, 200 to 2000 Units/ml, 500 to 1000 Units/ml, 500 to 2000 Units/ml, 500 to 3000 Units/ml, 500 to 4000 Units/ml, 500 to 5000 Units/ml, 500 to 6000 Units/ml, 1000 to 7500 Units/ml, or 2500 to 5000 Units/ml may be used.
[0074] The concentration of each ribonucleotide (e.g., ATP, UTP, GTP, and CTP) in a reaction mixture is between about 0.1 mM and about 10 mM, e.g., between about 1 mM and about 10 mM, between about 2 mM and about 10 mM, between about 3 mM and about 10 mM, between about 1 mM and about 8 mM, between about 1 mM and about 6 mM, between about 3 mM and about 8 mM, between about 3 mM and about 6 mM, or between about 4 mM and about 5 mM. In some embodiments, each ribonucleotide is at about 5 mM in a reaction mixture. In some embodiments, the total concentration of rNTPs (for example, ATP, GTP, CTP and UTP combined) used in the reaction ranges between 1 mM and 40 mM. In some embodiments, the total concentration of rNTPs (for example, ATP, GTP, CTP and UTP combined) used in the reaction ranges between 1 mM and 30 mM, or between 1 mM and 28 mM, or between 1 mM and 25 mM, or between 1 mM and 20 mM. In some embodiments, the total rNTPs concentration is less than 30 mM. In some embodiments, the total rNTPs concentration is less than 25 mM. In some embodiments, the total rNTPs concentration is less than 20 mM. In some embodiments, the total rNTPs concentration is less than 15 mM. In some embodiments, the total rNTPs concentration is less than 10 mM.
[0075] The RNA polymerase reaction buffer typically includes a salt/buffering agent, e.g., Tris, HEPES, ammonium sulfate, sodium bicarbonate, sodium citrate, sodium acetate, potassium phosphate, sodium phosphate, sodium chloride, and magnesium chloride.
[0076] The pH of the reaction mixture may be from about 6 to 8.5, from 6.5 to 8.0, or from 7.0 to 7.5; in some embodiments, the pH is 7.5.
[0077] Linear or linearized DNA template (e.g., as described above and in an amount/concentration sufficient to provide a desired amount of RNA), the RNA polymerase reaction buffer, and SP6 RNA polymerase are combined to form the reaction mixture. The reaction mixture is incubated at between about 37° C. and about 42° C. for thirty minutes to six hours, e.g., about sixty to about ninety minutes.
[0078] In some embodiments, about 5 mM NTPs, about 0.05 mg/mL SP6 polymerase, and about 0.1 mg/ml DNA template in a suitable RNA polymerase reaction buffer (final reaction mixture pH of about 7.5) is incubated at about 37° C. to about 42° C. for sixty to ninety minutes.
[0079] In some embodiments, a reaction mixture contains linearized double stranded DNA template with an SP6 polymerase-specific promoter, SP6 RNA polymerase, RNase inhibitor, pyrophosphatase, 29 mM NTPs, 10 mM DTT and a reaction buffer (when at 10× is 800 mM HEPES, 20 mM spermidine, 250 mM MgCl2, pH 7.7) and is brought to a desired reaction volume with RNase-free water (quantity sufficient, QS); this reaction mixture is then incubated at 37° C. for 60 minutes. The polymerase reaction is then quenched by addition of DNase I and a DNase I buffer (when at 10× is 100 mM Tris-HCl, 5 mM MgCl2 and 25 mM CaCl2, pH 7.6) to facilitate digestion of the double-stranded DNA template in preparation for purification. This embodiment has been shown to be sufficient to produce 100 grams of mRNA.
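The buffer compositions above are stated for a tenfold concentrated stock. Assuming, as the notation suggests but the text does not state outright, that the stock is used at 1x in the final reaction, the working concentrations and the stock volume needed follow from a simple dilution. The names in this sketch are hypothetical.

```python
# Reaction buffer composition at 10x, in mM, as given in the text.
REACTION_BUFFER_10X_MM = {"HEPES": 800.0, "spermidine": 20.0, "MgCl2": 250.0}


def working_concentrations(stock_mm: dict, fold: float = 10.0) -> dict:
    """Concentrations (mM) after diluting a concentrated stock `fold` times."""
    return {component: conc / fold for component, conc in stock_mm.items()}


def stock_volume_ml(reaction_volume_ml: float, fold: float = 10.0) -> float:
    """Volume of concentrated stock needed to reach 1x in the final reaction volume."""
    return reaction_volume_ml / fold


print(working_concentrations(REACTION_BUFFER_10X_MM))
print(stock_volume_ml(500.0))  # hypothetical 500 ml reaction
```

Under this assumption the 1x buffer contributes 80 mM HEPES, 2 mM spermidine and 25 mM MgCl2 to the reaction.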
[0080] In some embodiments, a reaction mixture includes NTPs at a concentration ranging from 1-10 mM, DNA template at a concentration ranging from 0.01-0.5 mg/ml, and SP6 RNA polymerase at a concentration ranging from 0.01-0.1 mg/ml, e.g., the reaction mixture comprises NTPs at a concentration of 5 mM, the DNA template at a concentration of 0.1 mg/ml, and the SP6 RNA polymerase at a concentration of 0.05 mg/ml.
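The concentration ranges in this embodiment can be expressed as a small check, together with the component masses implied for a given reaction volume. The sketch below is illustrative; the dictionary keys, function names and the 1 litre example volume are assumptions chosen for the example and are not taken from the disclosure.

```python
# Ranges recited in the text: NTPs in mM, template and polymerase in mg/ml.
RANGES = {
    "ntp_mM": (1.0, 10.0),
    "template_mg_per_ml": (0.01, 0.5),
    "sp6_mg_per_ml": (0.01, 0.1),
}

# The specific example recited in the text.
EXAMPLE_MIX = {"ntp_mM": 5.0, "template_mg_per_ml": 0.1, "sp6_mg_per_ml": 0.05}


def within_ranges(mix: dict) -> bool:
    """True if every component concentration lies inside its recited range."""
    return all(low <= mix[name] <= high for name, (low, high) in RANGES.items())


def component_masses_mg(volume_ml: float, mix: dict) -> dict:
    """Mass (mg) of DNA template and SP6 polymerase for a reaction volume in ml."""
    return {
        "template_mg": mix["template_mg_per_ml"] * volume_ml,
        "sp6_mg": mix["sp6_mg_per_ml"] * volume_ml,
    }


print(within_ranges(EXAMPLE_MIX))
print(component_masses_mg(1000.0, EXAMPLE_MIX))  # hypothetical 1 litre reaction
```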
[0081] Nucleotides
[0082] Various naturally-occurring or modified nucleosides may be used to produce mRNA according to the present invention. In some embodiments, an mRNA is or comprises natural nucleosides (e.g., adenosine, guanosine, cytidine, uridine); nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C-5 propynyl-cytidine, C-5 propynyl-uridine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, pseudouridine (e.g., N-1-methyl-pseudouridine), 2-thiouridine, and 2-thiocytidine); chemically modified bases; biologically modified bases (e.g., methylated bases); intercalated bases; modified sugars (e.g., 2'-fluororibose, ribose, 2'-deoxyribose, arabinose, and hexose); and/or modified phosphate groups (e.g., phosphorothioates and 5'-N-phosphoramidite linkages).
[0083] In some embodiments, the mRNA comprises one or more nonstandard nucleotide residues. The nonstandard nucleotide residues may include, e.g., 5-methyl-cytidine ("5mC"), pseudouridine ("ψU"), and/or 2-thio-uridine ("2sU"). See, e.g., U.S. Pat. No. 8,278,036 or WO2011012316 for a discussion of such residues and their incorporation into mRNA. The mRNA may be RNA, which is defined as RNA in which 25% of U residues are 2-thio-uridine and 25% of C residues are 5-methylcytidine. Teachings for the use of RNA are disclosed in US Patent Publication US20120195936 and international publication WO2011012316, both of which are hereby incorporated by reference in their entirety. The presence of nonstandard nucleotide residues may render an mRNA more stable and/or less immunogenic than a control mRNA with the same sequence but containing only standard residues. In further embodiments, the mRNA may comprise one or more nonstandard nucleotide residues chosen from isocytosine, pseudoisocytosine, 5-bromouracil, 5-propynyluracil, 6-aminopurine, 2-aminopurine, inosine, diaminopurine and 2-chloro-6-aminopurine cytosine, as well as combinations of these modifications and other nucleobase modifications. Some embodiments may further include additional modifications to the furanose ring or nucleobase. Additional modifications may include, for example, sugar modifications or substitutions (e.g., one or more of a 2'-O-alkyl modification, a locked nucleic acid (LNA)). In some embodiments, the RNAs may be complexed or hybridized with additional polynucleotides and/or peptide polynucleotides (PNA). In some embodiments where the sugar modification is a 2'-O-alkyl modification, such modifications may include, but are not limited to, a 2'-deoxy-2'-fluoro modification, a 2'-O-methyl modification, a 2'-O-methoxyethyl modification and a 2'-deoxy modification. 
In some embodiments, any of these modifications may be present in 0-100% of the nucleotides, for example, more than 0%, 1%, 10%, 25%, 50%, 75%, 85%, 90%, 95%, or 100% of the constituent nucleotides individually or in combination.
Post-synthesis processing
[0084] Typically, a 5' cap and/or a 3' tail may be added after the synthesis. The presence of the cap is important in providing resistance to nucleases found in most eukaryotic cells. The presence of a "tail" serves to protect the mRNA from exonuclease degradation.
[0085] A 5' cap is typically added as follows: first, an RNA terminal phosphatase removes one of the terminal phosphate groups from the 5' nucleotide, leaving two terminal phosphates; guanosine triphosphate (GTP) is then added to the terminal phosphates via a guanylyl transferase, producing a 5'-5' triphosphate linkage; and the 7-nitrogen of guanine is then methylated by a methyltransferase. Examples of cap structures include, but are not limited to, m7G(5')ppp(5')A, G(5')ppp(5')A and G(5')ppp(5')G. Additional cap structures are described in published US Application No. US 2016/0032356 and U.S. Provisional Application 62/464,327, filed Feb. 27, 2017, which are incorporated herein by reference.
[0086] Typically, a tail structure includes a poly(A) and/or poly(C) tail. A poly-A or poly-C tail on the 3' terminus of mRNA typically includes at least 50 adenosine or cytosine nucleotides, at least 150 adenosine or cytosine nucleotides, at least 200 adenosine or cytosine nucleotides, at least 250 adenosine or cytosine nucleotides, at least 300 adenosine or cytosine nucleotides, at least 350 adenosine or cytosine nucleotides, at least 400 adenosine or cytosine nucleotides, at least 450 adenosine or cytosine nucleotides, at least 500 adenosine or cytosine nucleotides, at least 550 adenosine or cytosine nucleotides, at least 600 adenosine or cytosine nucleotides, at least 650 adenosine or cytosine nucleotides, at least 700 adenosine or cytosine nucleotides, at least 750 adenosine or cytosine nucleotides, at least 800 adenosine or cytosine nucleotides, at least 850 adenosine or cytosine nucleotides, at least 900 adenosine or cytosine nucleotides, at least 950 adenosine or cytosine nucleotides, or at least 1 kb adenosine or cytosine nucleotides, respectively. 
In some embodiments, a poly A or poly C tail may be about 10 to 800 adenosine or cytosine nucleotides (e.g., about 10 to 200 adenosine or cytosine nucleotides, about 10 to 300 adenosine or cytosine nucleotides, about 10 to 400 adenosine or cytosine nucleotides, about 10 to 500 adenosine or cytosine nucleotides, about 10 to 550 adenosine or cytosine nucleotides, about 10 to 600 adenosine or cytosine nucleotides, about 50 to 600 adenosine or cytosine nucleotides, about 100 to 600 adenosine or cytosine nucleotides, about 150 to 600 adenosine or cytosine nucleotides, about 200 to 600 adenosine or cytosine nucleotides, about 250 to 600 adenosine or cytosine nucleotides, about 300 to 600 adenosine or cytosine nucleotides, about 350 to 600 adenosine or cytosine nucleotides, about 400 to 600 adenosine or cytosine nucleotides, about 450 to 600 adenosine or cytosine nucleotides, about 500 to 600 adenosine or cytosine nucleotides, about 10 to 150 adenosine or cytosine nucleotides, about 10 to 100 adenosine or cytosine nucleotides, about 20 to 70 adenosine or cytosine nucleotides, or about 20 to 60 adenosine or cytosine nucleotides) respectively. In some embodiments, a tail structure includes a combination of poly (A) and poly (C) tails with various lengths described herein. In some embodiments, a tail structure includes at least 50%, 55%, 65%, 70%, 75%, 80%, 85%, 90%, 92%, 94%, 95%, 96%, 97%, 98%, or 99% adenosine nucleotides. In some embodiments, a tail structure includes at least 50%, 55%, 65%, 70%, 75%, 80%, 85%, 90%, 92%, 94%, 95%, 96%, 97%, 98%, or 99% cytosine nucleotides.
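By way of illustration only, the length and nucleotide-percentage criteria for a tail structure can be checked with a short calculation. The following Python sketch is a minimal example; the function name `tail_composition` and the example sequence are hypothetical and are not taken from the present disclosure.

```python
from collections import Counter


def tail_composition(tail):
    """Return the length and the percent adenosine / cytosine of a 3' tail.

    `tail` is the tail sequence written as a string of nucleotide letters.
    """
    seq = tail.upper()
    if not seq:
        raise ValueError("tail sequence is empty")
    counts = Counter(seq)
    n = len(seq)
    return {
        "length": n,
        "pct_A": 100.0 * counts["A"] / n,
        "pct_C": 100.0 * counts["C"] / n,
    }


# Hypothetical mixed tail: 95 adenosines followed by 5 cytosines.
example_tail = "A" * 95 + "C" * 5
comp = tail_composition(example_tail)
print(comp)

# Example check against one of the recited thresholds (at least 90% adenosine).
print("at least 90% adenosine:", comp["pct_A"] >= 90.0)
```

In this sketch a tail is treated simply as a character string; any modified nucleotides would need their own letter codes, which is an assumption outside the scope of the example.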
[0087] As described herein, the addition of the 5' cap and/or the 3' tail facilitates the detection of abortive transcripts generated during in vitro synthesis because without capping and/or tailing, the size of those prematurely aborted mRNA transcripts can be too small to be detected. Thus, in some embodiments, the 5' cap and/or the 3' tail are added to the synthesized mRNA before the mRNA is tested for purity (e.g., the level of abortive transcripts present in the mRNA). In some embodiments, the 5' cap and/or the 3' tail are added to the synthesized mRNA before the mRNA is purified as described herein. In other embodiments, the 5' cap and/or the 3' tail are added to the synthesized mRNA after the mRNA is purified as described herein.
[0088] mRNA synthesized according to the present invention may be used without further purification. In particular, mRNA synthesized according to the present invention may be used without a step of removing shortmers. In some embodiments, mRNA synthesized according to the present invention may be further purified. Various methods may be used to purify mRNA synthesized according to the present invention. For example, purification of mRNA can be performed using centrifugation, filtration and/or chromatographic methods. In some embodiments, the synthesized mRNA is purified by ethanol precipitation or filtration or chromatography, or gel purification or any other suitable means. In some embodiments, the mRNA is purified by HPLC. In some embodiments, the mRNA is extracted in a standard phenol: chloroform: isoamyl alcohol solution, well known to one of skill in the art. In some embodiments, the mRNA is purified using Tangential Flow Filtration. Suitable purification methods include those described in US 2016/0040154, US 2015/0376220, PCT application PCT/US18/19954 entitled "METHODS FOR PURIFICATION OF MESSENGER RNA" filed on Feb. 27, 2018, and PCT application PCT/US18/19978 entitled "METHODS FOR PURIFICATION OF MESSENGER RNA" filed on Feb. 27, 2018, all of which are incorporated by reference herein and may be used to practice the present invention.
[0089] In some embodiments, the mRNA is purified before capping and tailing. In some embodiments, the mRNA is purified after capping and tailing. In some embodiments, the mRNA is purified both before and after capping and tailing.
[0090] In some embodiments, the mRNA is purified either before or after or both before and after capping and tailing, by centrifugation.
[0091] In some embodiments, the mRNA is purified either before or after or both before and after capping and tailing, by filtration.
[0092] In some embodiments, the mRNA is purified either before or after or both before and after capping and tailing, by Tangential Flow Filtration (TFF).
[0093] In some embodiments, the mRNA is purified either before or after or both before and after capping and tailing by chromatography.
Characterization of mRNA
[0094] Full-length or abortive transcripts of mRNA may be detected and quantified using any methods available in the art. In some embodiments, the synthesized mRNA molecules are detected using blotting, capillary electrophoresis, chromatography, fluorescence, gel electrophoresis, HPLC, silver stain, spectroscopy, ultraviolet (UV), or UPLC, or a combination thereof. Other detection methods known in the art are included in the present invention. In some embodiments, the synthesized mRNA molecules are detected using UV absorption spectroscopy with separation by capillary electrophoresis. In some embodiments, mRNA is first denatured by a Glyoxal dye before gel electrophoresis ("Glyoxal gel electrophoresis"). In some embodiments, synthesized mRNA is characterized before capping or tailing. In some embodiments, synthesized mRNA is characterized after capping and tailing.
[0095] In some embodiments, mRNA generated by the method disclosed herein comprises less than 10%, less than 9%, less than 8%, less than 7%, less than 6%, less than 5%, less than 4%, less than 3%, less than 2%, less than 1%, less than 0.5%, or less than 0.1% impurities other than full length mRNA. The impurities include IVT contaminants, e.g., proteins, enzymes, free nucleotides and/or shortmers.
[0096] In some embodiments, mRNA produced according to the invention is substantially free of shortmers or abortive transcripts. In particular, mRNA produced according to the invention contains an undetectable level of shortmers or abortive transcripts by capillary electrophoresis or Glyoxal gel electrophoresis. As used herein, the term "shortmers" or "abortive transcripts" refers to any transcripts that are less than full-length. In some embodiments, "shortmers" or "abortive transcripts" are less than 100 nucleotides in length, less than 90, less than 80, less than 70, less than 60, less than 50, less than 40, less than 30, less than 20, or less than 10 nucleotides in length. In some embodiments, shortmers are detected or quantified after adding a 5'-cap, and/or a 3'-poly A tail.
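As a non-limiting illustration of how such purity percentages might be computed, the following Python sketch derives the percent full-length mRNA and percent impurities from integrated peak areas, for example from a capillary electrophoresis trace. It assumes that peak area is proportional to the mass of each species; the function name and the peak-area values are hypothetical and do not represent measured data.

```python
from typing import Sequence


def percent_full_length(full_length_area: float,
                        other_areas: Sequence[float]) -> float:
    """Percent of total integrated signal attributed to full-length mRNA.

    Assumes peak area scales linearly with the amount of each species.
    """
    total = full_length_area + sum(other_areas)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * full_length_area / total


# Hypothetical integrated peak areas (arbitrary units).
main_peak = 970.0
shortmer_peaks = [20.0, 10.0]

pct_full = percent_full_length(main_peak, shortmer_peaks)
pct_impurity = 100.0 - pct_full
print(f"full-length: {pct_full:.1f}%  impurities: {pct_impurity:.1f}%")

# Example comparison against recited thresholds.
print("at least 80% full-length:", pct_full >= 80.0)
print("less than 5% impurities:", pct_impurity < 5.0)
```

The same arithmetic applies whether the areas come from capillary electrophoresis, HPLC or gel densitometry, provided the linearity assumption holds for the detection method used.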
[0097] mRNA Solution
[0098] In some embodiments, mRNA may be provided in a solution to be mixed with a lipid solution such that the mRNA may be encapsulated in lipid nanoparticles. A suitable mRNA solution may be any aqueous solution containing mRNA to be encapsulated at various concentrations. For example, a suitable mRNA solution may contain an mRNA at a concentration of or greater than about 0.01 mg/ml, 0.05 mg/ml, 0.06 mg/ml, 0.07 mg/ml, 0.08 mg/ml, 0.09 mg/ml, 0.1 mg/ml, 0.15 mg/ml, 0.2 mg/ml, 0.3 mg/ml, 0.4 mg/ml, 0.5 mg/ml, 0.6 mg/ml, 0.7 mg/ml, 0.8 mg/ml, 0.9 mg/ml, or 1.0 mg/ml. In some embodiments, a suitable mRNA solution may contain an mRNA at a concentration ranging from about 0.01-1.0 mg/ml, 0.01-0.9 mg/ml, 0.01-0.8 mg/ml, 0.01-0.7 mg/ml, 0.01-0.6 mg/ml, 0.01-0.5 mg/ml, 0.01-0.4 mg/ml, 0.01-0.3 mg/ml, 0.01-0.2 mg/ml, 0.01-0.1 mg/ml, 0.05-1.0 mg/ml, 0.05-0.9 mg/ml, 0.05-0.8 mg/ml, 0.05-0.7 mg/ml, 0.05-0.6 mg/ml, 0.05-0.5 mg/ml, 0.05-0.4 mg/ml, 0.05-0.3 mg/ml, 0.05-0.2 mg/ml, 0.05-0.1 mg/ml, 0.1-1.0 mg/ml, 0.2-0.9 mg/ml, 0.3-0.8 mg/ml, 0.4-0.7 mg/ml, or 0.5-0.6 mg/ml. In some embodiments, a suitable mRNA solution may contain an mRNA at a concentration up to about 5.0 mg/ml, 4.0 mg/ml, 3.0 mg/ml, 2.0 mg/ml, 1.0 mg/ml, 0.09 mg/ml, 0.08 mg/ml, 0.07 mg/ml, 0.06 mg/ml, or 0.05 mg/ml.
[0099] Typically, a suitable mRNA solution may also contain a buffering agent and/or salt. Generally, buffering agents can include HEPES, ammonium sulfate, sodium bicarbonate, sodium citrate, sodium acetate, potassium phosphate and sodium phosphate. In some embodiments, suitable concentration of the buffering agent may range from about 0.1 mM to 100 mM, 0.5 mM to 90 mM, 1.0 mM to 80 mM, 2 mM to 70 mM, 3 mM to 60 mM, 4 mM to 50 mM, 5 mM to 40 mM, 6 mM to 30 mM, 7 mM to 20 mM, 8 mM to 15 mM, or 9 to 12 mM. In some embodiments, suitable concentration of the buffering agent is or greater than about 0.1 mM, 0.5 mM, 1 mM, 2 mM, 4 mM, 6 mM, 8 mM, 10 mM, 15 mM, 20 mM, 25 mM, 30 mM, 35 mM, 40 mM, 45 mM, or 50 mM.
[0100] Exemplary salts can include sodium chloride, magnesium chloride, and potassium chloride. In some embodiments, suitable concentration of salts in an mRNA solution may range from about 1 mM to 500 mM, 5 mM to 400 mM, 10 mM to 350 mM, 15 mM to 300 mM, 20 mM to 250 mM, 30 mM to 200 mM, 40 mM to 190 mM, 50 mM to 180 mM, 50 mM to 170 mM, 50 mM to 160 mM, 50 mM to 150 mM, or 50 mM to 100 mM. Salt concentration in a suitable mRNA solution is or greater than about 1 mM, 5 mM, 10 mM, 20 mM, 30 mM, 40 mM, 50 mM, 60 mM, 70 mM, 80 mM, 90 mM, or 100 mM.
[0101] In some embodiments, a suitable mRNA solution may have a pH ranging from about 3.5-6.5, 3.5-6.0, 3.5-5.5, 3.5-5.0, 3.5-4.5, 4.0-5.5, 4.0-5.0, 4.0-4.9, 4.0-4.8, 4.0-4.7, 4.0-4.6, or 4.0-4.5. In some embodiments, a suitable mRNA solution may have a pH of or no greater than about 3.5, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, 5.2, 5.4, 5.6, 5.8, 6.0, 6.1, 6.3, or 6.5.
[0102] Various methods may be used to prepare an mRNA solution suitable for the present invention. In some embodiments, mRNA may be directly dissolved in a buffer solution described herein. In some embodiments, an mRNA solution may be generated by mixing an mRNA stock solution with a buffer solution prior to mixing with a lipid solution for encapsulation. In some embodiments, an mRNA solution may be generated by mixing an mRNA stock solution with a buffer solution immediately before mixing with a lipid solution for encapsulation. In some embodiments, a suitable mRNA stock solution may contain mRNA in water at a concentration at or greater than about 0.2 mg/ml, 0.4 mg/ml, 0.5 mg/ml, 0.6 mg/ml, 0.8 mg/ml, 1.0 mg/ml, 1.2 mg/ml, 1.4 mg/ml, 1.5 mg/ml, 1.6 mg/ml, 2.0 mg/ml, 2.5 mg/ml, 3.0 mg/ml, 3.5 mg/ml, 4.0 mg/ml, 4.5 mg/ml, or 5.0 mg/ml.
[0103] In some embodiments, an mRNA stock solution is mixed with a buffer solution using a pump. Exemplary pumps include but are not limited to gear pumps, peristaltic pumps and centrifugal pumps.
[0104] Typically, the buffer solution is mixed at a rate greater than that of the mRNA stock solution. For example, the buffer solution may be mixed at a rate at least 1×, 2×, 3×, 4×, 5×, 6×, 7×, 8×, 9×, 10×, 15×, or 20× greater than the rate of the mRNA stock solution. In some embodiments, a buffer solution is mixed at a flow rate ranging between about 100-6000 ml/minute (e.g., about 100-300 ml/minute, 300-600 ml/minute, 600-1200 ml/minute, 1200-2400 ml/minute, 2400-3600 ml/minute, 3600-4800 ml/minute, 4800-6000 ml/minute, or 60-420 ml/minute). In some embodiments, a buffer solution is mixed at a flow rate of or greater than about 60 ml/minute, 100 ml/minute, 140 ml/minute, 180 ml/minute, 220 ml/minute, 260 ml/minute, 300 ml/minute, 340 ml/minute, 380 ml/minute, 420 ml/minute, 480 ml/minute, 540 ml/minute, 600 ml/minute, 1200 ml/minute, 2400 ml/minute, 3600 ml/minute, 4800 ml/minute, or 6000 ml/minute.
[0105] In some embodiments, an mRNA stock solution is mixed at a flow rate ranging between about 10-600 ml/minute (e.g., about 5-50 ml/minute, about 10-30 ml/minute, about 30-60 ml/minute, about 60-120 ml/minute, about 120-240 ml/minute, about 240-360 ml/minute, about 360-480 ml/minute, or about 480-600 ml/minute). In some embodiments, an mRNA stock solution is mixed at a flow rate of or greater than about 5 ml/minute, 10 ml/minute, 15 ml/minute, 20 ml/minute, 25 ml/minute, 30 ml/minute, 35 ml/minute, 40 ml/minute, 45 ml/minute, 50 ml/minute, 60 ml/minute, 80 ml/minute, 100 ml/minute, 200 ml/minute, 300 ml/minute, 400 ml/minute, 500 ml/minute, or 600 ml/minute.
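For illustration, the relationship between the two flow rates and the mRNA concentration of the resulting solution can be estimated as sketched below. The sketch assumes ideal, steady-state volumetric mixing of the two streams, and it interprets the flow-rate multiple as the ratio of the buffer rate to the stock rate; both are assumptions made for the example. The function names and numerical values are hypothetical.

```python
def flow_rate_ratio(buffer_ml_per_min: float, stock_ml_per_min: float) -> float:
    """Ratio of the buffer flow rate to the mRNA stock flow rate."""
    if stock_ml_per_min <= 0:
        raise ValueError("stock flow rate must be positive")
    return buffer_ml_per_min / stock_ml_per_min


def mixed_concentration(stock_mg_per_ml: float,
                        stock_ml_per_min: float,
                        buffer_ml_per_min: float) -> float:
    """mRNA concentration (mg/ml) after combining the two streams.

    Assumes ideal mixing: mRNA mass per minute is conserved and the
    stream volumes are additive.
    """
    total_rate = stock_ml_per_min + buffer_ml_per_min
    if total_rate <= 0:
        raise ValueError("combined flow rate must be positive")
    return stock_mg_per_ml * stock_ml_per_min / total_rate


# Hypothetical settings: 1.0 mg/ml stock at 50 ml/minute,
# buffer at 200 ml/minute (a 4x ratio).
ratio = flow_rate_ratio(200.0, 50.0)
conc = mixed_concentration(1.0, 50.0, 200.0)
print(f"buffer:stock ratio = {ratio:.1f}x, mixed mRNA = {conc:.2f} mg/ml")
```

Under these assumed settings the combined stream leaves at 250 ml/minute, and the mixed concentration falls within the 0.01-1.0 mg/ml range recited above for a suitable mRNA solution.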
Delivery Vehicles
[0106] According to the present invention, mRNA encoding a CFTR protein (e.g., a full length, fragment, or portion of a CFTR protein) as described herein may be delivered as naked RNA (unpackaged) or via delivery vehicles. As used herein, the terms "delivery vehicle," "transfer vehicle," "nanoparticle" or grammatical equivalent, are used interchangeably.
[0107] Delivery vehicles can be formulated in combination with one or more additional nucleic acids, carriers, targeting ligands or stabilizing reagents, or in pharmacological compositions where it is mixed with suitable excipients. Techniques for formulation and administration of drugs may be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, Pa., latest edition. A particular delivery vehicle is selected based upon its ability to facilitate the transfection of a nucleic acid to a target cell.
[0108] In some embodiments, a delivery vehicle comprising CFTR mRNA is administered by pulmonary delivery, e.g., comprising nebulization. In these embodiments, the delivery vehicle may be in an aerosolized composition which can be inhaled. In some embodiments, the mRNA is expressed in the tissue in which the delivery vehicle was administered, e.g., nasal cavity, trachea, bronchi, bronchioles, and/or other pulmonary system-related cell or tissue. Additional teaching of pulmonary delivery and nebulization are described in the related international application PCT/US17/61100 filed Nov. 10, 2017 by Applicant entitled "NOVEL ICE-BASED LIPID NANOPARTICLE FORMULATION FOR DELIVERY OF MRNA", and the U.S. Ser. No. 62/507,061, each of which is incorporated by reference in its entirety.
[0109] In some embodiments, mRNAs encoding a CFTR protein may be delivered via a single delivery vehicle. In some embodiments, mRNAs encoding a CFTR protein may be delivered via one or more delivery vehicles each of a different composition. According to various embodiments, suitable delivery vehicles include, but are not limited to polymer based carriers, such as polyethyleneimine (PEI), lipid nanoparticles and liposomes, nanoliposomes, ceramide-containing nanoliposomes, proteoliposomes, both natural and synthetically-derived exosomes, natural, synthetic and semi-synthetic lamellar bodies, nanoparticulates, calcium phosphor-silicate nanoparticulates, calcium phosphate nanoparticulates, silicon dioxide nanoparticulates, nanocrystalline particulates, semiconductor nanoparticulates, poly(D-arginine), sol-gels, nanodendrimers, starch-based delivery systems, micelles, emulsions, niosomes, multi-domain-block polymers (vinyl polymers, polypropyl acrylic acid polymers, dynamic polyconjugates), dry powder formulations, plasmids, viruses, calcium phosphate nucleotides, aptamers, peptides and other vectorial tags. Also contemplated is the use of bionanocapsules and other viral capsid proteins assemblies as a suitable transfer vehicle. (Hum. Gene Ther. 2008 September; 19(9):887-95).
[0110] A delivery vehicle comprising CFTR mRNA may be administered and dosed in accordance with current medical practice, taking into account the clinical condition of the subject, the site and method of administration (e.g., local and systemic, including oral, pulmonary, and via injection), the scheduling of administration, the subject's age, sex, body weight, and other factors relevant to clinicians of ordinary skill in the art. The "effective amount" for the purposes herein may be determined by such relevant considerations as are known to those of ordinary skill in experimental clinical research, pharmacological, clinical and medical arts. In some embodiments, the amount administered is effective to achieve at least some stabilization, improvement or elimination of symptoms and other indicators as are selected as appropriate measures of disease progress, regression or improvement by those of skill in the art. For example, a suitable amount and dosing regimen is one that causes at least transient protein production.
[0111] In some embodiments, delivery vehicles are formulated such that they are suitable for extended-release of the mRNA contained therein. Such extended-release compositions may be conveniently administered to a subject at extended dosing intervals.
[0112] Liposomal Delivery Vehicles
[0113] In some embodiments, a suitable delivery vehicle is a liposomal delivery vehicle, e.g., a lipid nanoparticle. As used herein, liposomal delivery vehicles, e.g., lipid nanoparticles, are usually characterized as microscopic vesicles having an interior aqueous space sequestered from an outer medium by a membrane of one or more bilayers. Bilayer membranes of liposomes are typically formed by amphiphilic molecules, such as lipids of synthetic or natural origin that comprise spatially separated hydrophilic and hydrophobic domains (Lasic, Trends Biotechnol., 16: 307-321, 1998). Bilayer membranes of the liposomes can also be formed by amphiphilic polymers and surfactants (e.g., polymerosomes, niosomes, etc.). In the context of the present invention, a liposomal delivery vehicle typically serves to transport a desired mRNA to a target cell or tissue. In some embodiments, a nanoparticle delivery vehicle is a liposome. In some embodiments, a liposome comprises one or more cationic lipids, one or more non-cationic lipids, one or more cholesterol-based lipids and one or more PEG-modified lipids. In some embodiments, a liposome comprises no more than three distinct lipid components. In some embodiments, one distinct lipid component is a sterol-based cationic lipid.
[0114] Cationic Lipids
[0115] In some embodiments, liposomes may comprise one or more cationic lipids. As used herein, the phrase "cationic lipid" refers to any of a number of lipid species that have a net positive charge at a selected pH, such as physiological pH. Several cationic lipids have been described in the literature, many of which are commercially available. Examples of suitable cationic lipids for use in the compositions and methods of the invention include those described in international patent publications WO 2010/053572 (for example, C12-200 described at paragraph [00225]) and WO 2012/170930, both of which are incorporated herein by reference. In certain embodiments, the compositions and methods of the invention employ lipid nanoparticles comprising an ionizable cationic lipid described in U.S. provisional patent application 61/617,468, filed Mar. 29, 2012 (incorporated herein by reference), such as, e.g., (15Z,18Z)-N,N-dimethyl-6-((9Z,12Z)-octadeca-9,12-dien-1-yl)tetracosa-15,18-dien-1-amine (HGT5000), (15Z,18Z)-N,N-dimethyl-6-((9Z,12Z)-octadeca-9,12-dien-1-yl)tetracosa-4,15,18-trien-1-amine (HGT5001), and (15Z,18Z)-N,N-dimethyl-6-((9Z,12Z)-octadeca-9,12-dien-1-yl)tetracosa-5,15,18-trien-1-amine (HGT5002).
[0116] In some embodiments, provided liposomes include a cationic lipid described in WO 2013/063468 and in U.S. provisional application entitled "Lipid Formulations for Delivery of Messenger RNA" filed concurrently with the present application on even date, both of which are incorporated by reference herein.
[0117] In some embodiments, a cationic lipid comprises a compound of formula I-c1-a:
##STR00001##
[0118] or a pharmaceutically acceptable salt thereof, wherein:
[0119] each R.sup.2 independently is hydrogen or C.sub.1-3 alkyl;
[0120] each q independently is 2 to 6;
[0121] each R' independently is hydrogen or C.sub.1-3 alkyl;
[0122] and each R.sup.L independently is C.sub.8-12 alkyl.
[0123] In some embodiments, each R.sup.2 independently is hydrogen, methyl or ethyl. In some embodiments, each R.sup.2 independently is hydrogen or methyl. In some embodiments, each R.sup.2 is hydrogen.
[0124] In some embodiments, each q independently is 3 to 6. In some embodiments, each q independently is 3 to 5. In some embodiments, each q is 4.
[0125] In some embodiments, each R' independently is hydrogen, methyl or ethyl. In some embodiments, each R' independently is hydrogen or methyl. In some embodiments, each R' independently is hydrogen.
[0126] In some embodiments, each R.sup.L independently is C.sub.8-12 alkyl. In some embodiments, each R.sup.L independently is n-C.sub.8-12 alkyl. In some embodiments, each R.sup.L independently is C.sub.9-11 alkyl. In some embodiments, each R.sup.L independently is n-C.sub.9-11 alkyl. In some embodiments, each R.sup.L independently is C.sub.10 alkyl. In some embodiments, each R.sup.L independently is n-C.sub.10 alkyl.
[0127] In some embodiments, each R.sup.2 independently is hydrogen or methyl; each q independently is 3 to 5; each R' independently is hydrogen or methyl; and each R.sup.L independently is C.sub.8-12 alkyl.
[0128] In some embodiments, each R.sup.2 is hydrogen; each q independently is 3 to 5; each R' is hydrogen; and each R.sup.L independently is C.sub.8-12 alkyl.
[0129] In some embodiments, each R.sup.2 is hydrogen; each q is 4; each R' is hydrogen; and each R.sup.L independently is C.sub.8-12 alkyl.
[0130] In some embodiments, a cationic lipid comprises a compound of formula I-g:
##STR00002##
or a pharmaceutically acceptable salt thereof, wherein each R.sup.L independently is C.sub.8-12 alkyl. In some embodiments, each R.sup.L independently is n-C.sub.8-12 alkyl. In some embodiments, each R.sup.L independently is C.sub.9-11 alkyl. In some embodiments, each R.sup.L independently is n-C.sub.9-11 alkyl. In some embodiments, each R.sup.L independently is C.sub.10 alkyl. In some embodiments, each R.sup.L is n-C.sub.10 alkyl.
[0131] In particular embodiments, provided liposomes include a cationic lipid cKK-E12, or (3,6-bis(4-(bis(2-hydroxydodecyl)amino)butyl)piperazine-2,5-dione). The structure of cKK-E12 is shown below:
##STR00003##
[0132] Additional exemplary cationic lipids include those of formula I:
##STR00004##
and pharmaceutically acceptable salts thereof, wherein,
[0133] R is
##STR00005##
[0134] R is
##STR00006##
[0135] R is
##STR00007##
or
[0136] R is
##STR00008##
(see, e.g., Fenton, Owen S., et al. "Bioinspired Alkenyl Amino Alcohol Ionizable Lipid Materials for Highly Potent In Vivo mRNA Delivery." Advanced materials (2016)).
[0137] In some embodiments, the one or more cationic lipids may be N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride or "DOTMA" (Felgner et al., Proc. Nat'l Acad. Sci. 84, 7413 (1987); U.S. Pat. No. 4,897,355). DOTMA can be formulated alone or can be combined with the neutral lipid, dioleoylphosphatidyl-ethanolamine or "DOPE" or other cationic or non-cationic lipids into a liposomal transfer vehicle or a lipid nanoparticle, and such liposomes can be used to enhance the delivery of nucleic acids into target cells. Other suitable cationic lipids include, for example, 5-carboxyspermylglycinedioctadecylamide or "DOGS," 2,3-dioleyloxy-N-[2(spermine-carboxamido)ethyl]-N,N-dimethyl-1-propanaminium or "DOSPA" (Behr et al., Proc. Nat'l Acad. Sci. 86, 6982 (1989); U.S. Pat. Nos. 5,171,678; 5,334,761), 1,2-Dioleoyl-3-Dimethylammonium-Propane or "DODAP", and 1,2-Dioleoyl-3-Trimethylammonium-Propane or "DOTAP".
[0138] Additional exemplary cationic lipids also include 1,2-distearyloxy-N,N-dimethyl-3-aminopropane or "DSDMA", 1,2-dioleyloxy-N,N-dimethyl-3-aminopropane or "DODMA", 1,2-dilinoleyloxy-N,N-dimethyl-3-aminopropane or "DLinDMA", 1,2-dilinolenyloxy-N,N-dimethyl-3-aminopropane or "DLenDMA", N-dioleyl-N,N-dimethylammonium chloride or "DODAC", N,N-distearyl-N,N-dimethylammonium bromide or "DDAB", N-(1,2-dimyristyloxyprop-3-yl)-N,N-dimethyl-N-hydroxyethyl ammonium bromide or "DMRIE", 3-dimethylamino-2-(cholest-5-en-3-beta-oxybutan-4-oxy)-1-(cis,cis-9,12-octadecadienoxy)propane or "CLinDMA", 2-[5'-(cholest-5-en-3-beta-oxy)-3'-oxapentoxy)-3-dimethyl-1-(cis,cis-9',12'-octadecadienoxy)propane or "CpLinDMA", N,N-dimethyl-3,4-dioleyloxybenzylamine or "DMOBA", 1,2-N,N'-dioleylcarbamyl-3-dimethylaminopropane or "DOcarbDAP", 2,3-Dilinoleoyloxy-N,N-dimethylpropylamine or "DLinDAP", 1,2-N,N'-Dilinoleylcarbamyl-3-dimethylaminopropane or "DLincarbDAP", 1,2-Dilinoleoylcarbamyl-3-dimethylaminopropane or "DLinCDAP", 2,2-dilinoleyl-4-dimethylaminomethyl-[1,3]-dioxolane or "DLin-K-DMA", 2,2-dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane or "DLin-K-XTC2-DMA", and 2-(2,2-di((9Z,12Z)-octadeca-9,12-dien-1-yl)-1,3-dioxolan-4-yl)-N,N-dimethylethanamine (DLin-KC2-DMA) (See, WO 2010/042877; Semple et al., Nature Biotech. 28: 172-176 (2010)), or mixtures thereof. (Heyes, J., et al., J Controlled Release 107: 276-287 (2005); Morrissey, D V., et al., Nat. Biotechnol. 23(8): 1003-1007 (2005); PCT Publication WO2005/121348A1). In some embodiments, one or more of the cationic lipids comprise at least one of an imidazole, dialkylamino, or guanidinium moiety.
[0139] In some embodiments, the one or more cationic lipids may be chosen from XTC (2,2-Dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane), MC3 ((6Z,9Z,28Z,31Z)-heptatriaconta-6,9,28,31-tetraen-19-yl 4-(dimethylamino)butanoate), ALNY-100 ((3aR,5s,6aS)-N,N-dimethyl-2,2-di((9Z,12Z)-octadeca-9,12-dienyl)tetrahydro-3aH-cyclopenta[d][1,3]dioxol-5-amine), NC98-5 (4,7,13-tris(3-oxo-3-(undecylamino)propyl)-N1,N16-diundecyl-4,7,10,13-tetraazahexadecane-1,16-diamide), DODAP (1,2-dioleyl-3-dimethylammonium propane), HGT4003 (WO 2012/170889, the teachings of which are incorporated herein by reference in their entirety), ICE (WO 2011/068810, the teachings of which are incorporated herein by reference in their entirety), HGT5000 (U.S. Provisional Patent Application No. 61/617,468, the teachings of which are incorporated herein by reference in their entirety) or HGT5001 (cis or trans) (Provisional Patent Application No. 61/617,468), aminoalcohol lipidoids such as those disclosed in WO2010/053572, DOTAP (1,2-dioleyl-3-trimethylammonium propane), DOTMA (1,2-di-O-octadecenyl-3-trimethylammonium propane), DLinDMA (Heyes, J.; Palmer, L.; Bremner, K.; MacLachlan, I. "Cationic lipid saturation influences intracellular delivery of encapsulated nucleic acids" J. Contr. Rel. 2005, 107, 276-287), DLin-KC2-DMA (Semple, S. C. et al. "Rational Design of Cationic Lipids for siRNA Delivery" Nature Biotech. 2010, 28, 172-176), C12-200 (Love, K. T. et al. "Lipid-like materials for low-dose in vivo gene silencing" PNAS 2010, 107, 1864-1869).
[0140] Sterol Cationic Lipids
[0141] In some embodiments, sterol-based cationic lipids are dialkylamino-, imidazole-, and guanidinium-containing sterol-based cationic lipids. For example, certain embodiments are directed to a composition comprising one or more sterol-based cationic lipids comprising an imidazole, for example, the imidazole cholesterol ester or "ICE" lipid (3S, 10R, 13R, 17R)-10, 13-dimethyl-17-((R)-6-methylheptan-2-yl)-2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17-tetradecahydro-1H-cyclopenta[a]phenanthren-3-yl 3-(1H-imidazol-4-yl)propanoate, as represented by structure (II) below. In certain embodiments, a lipid nanoparticle for delivery of RNA (e.g., mRNA) encoding a functional protein may comprise one or more imidazole-based cationic lipids, for example, the imidazole cholesterol ester or "ICE" lipid (3S, 10R, 13R, 17R)-10, 13-dimethyl-17-((R)-6-methylheptan-2-yl)-2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17-tetradecahydro-1H-cyclopenta[a]phenanthren-3-yl 3-(1H-imidazol-4-yl)propanoate, as represented by structure (II).
##STR00009##
[0142] In some embodiments, the percentage of cationic lipid in a liposome may be greater than 10%, greater than 20%, greater than 30%, greater than 40%, greater than 50%, greater than 60%, or greater than 70%. In some embodiments, cationic lipid(s) constitute(s) about 30-50% (e.g., about 30-45%, about 30-40%, about 35-50%, about 35-45%, or about 35-40%) of the liposome by weight. In some embodiments, the cationic lipid (e.g., ICE lipid) constitutes about 30%, about 35%, about 40%, about 45%, or about 50% of the liposome by molar ratio.
[0143] Non-Cationic/Helper Lipids
[0144] In some embodiments, provided liposomes contain one or more non-cationic ("helper") lipids. As used herein, the phrase "non-cationic lipid" refers to any neutral, zwitterionic or anionic lipid. As used herein, the phrase "anionic lipid" refers to any of a number of lipid species that carry a net negative charge at a selected pH, such as physiological pH. Non-cationic lipids include, but are not limited to, distearoylphosphatidylcholine (DSPC), dioleoylphosphatidylcholine (DOPC), dipalmitoylphosphatidylcholine (DPPC), dioleoylphosphatidylglycerol (DOPG), dipalmitoylphosphatidylglycerol (DPPG), dioleoylphosphatidylethanolamine (DOPE), palmitoyloleoylphosphatidylcholine (POPC), palmitoyloleoyl-phosphatidylethanolamine (POPE), dioleoyl-phosphatidylethanolamine 4-(N-maleimidomethyl)-cyclohexane-1-carboxylate (DOPE-mal), dipalmitoyl phosphatidyl ethanolamine (DPPE), dimyristoylphosphoethanolamine (DMPE), distearoyl-phosphatidyl-ethanolamine (DSPE), phosphatidylserine, sphingolipids, cerebrosides, gangliosides, 16-O-monomethyl PE, 16-O-dimethyl PE, 18-1-trans PE, 1-stearoyl-2-oleoyl-phosphatidylethanolamine (SOPE), or a mixture thereof.
[0145] In some embodiments, such non-cationic lipids may be used alone, but are preferably used in combination with other lipids, for example, cationic lipids. In some embodiments, the non-cationic lipid may comprise a molar ratio of about 5% to about 90%, or about 10% to about 70% of the total lipid present in a liposome. In some embodiments, a non-cationic lipid is a neutral lipid, i.e., a lipid that does not carry a net charge in the conditions under which the composition is formulated and/or administered. In some embodiments, the percentage of non-cationic lipid in a liposome may be greater than 5%, greater than 10%, greater than 20%, greater than 30%, or greater than 40%.
[0146] Cholesterol-Based Lipids
[0147] In some embodiments, provided liposomes comprise one or more cholesterol-based lipids. Suitable cholesterol-based cationic lipids include, for example, DC-Chol (N,N-dimethyl-N-ethylcarboxamidocholesterol), 1,4-bis(3-N-oleylamino-propyl)piperazine (Gao, et al. Biochem. Biophys. Res. Comm. 179, 280 (1991); Wolf et al. BioTechniques 23, 139 (1997); U.S. Pat. No. 5,744,335), or ICE. In some embodiments, the cholesterol-based lipid may comprise a molar ratio of about 2% to about 30%, or about 5% to about 20% of the total lipid present in a liposome. In some embodiments, the percentage of cholesterol-based lipid in the lipid nanoparticle may be greater than 5%, greater than 10%, greater than 20%, greater than 30%, or greater than 40%.
[0148] PEG-Modified Lipids
[0149] The use of polyethylene glycol (PEG)-modified phospholipids and derivatized lipids such as derivatized ceramides (PEG-CER), including N-Octanoyl-Sphingosine-1-[Succinyl(Methoxy Polyethylene Glycol)-2000] (C8 PEG-2000 ceramide) is also contemplated by the present invention, either alone or preferably in combination with other lipid formulations together which comprise the transfer vehicle (e.g., a lipid nanoparticle). Contemplated PEG-modified lipids include, but are not limited to, a polyethylene glycol chain of up to S kDa in length covalently attached to a lipid with alkyl chain(s) of C.sub.6-C.sub.20 length. The addition of such components may prevent complex aggregation and may also provide a means for increasing circulation lifetime and increasing the delivery of the lipid-nucleic acid composition to the target tissues, (Klibanov et al. (1990) FEBS Letters, 268 (1): 235-237), or they may be selected to rapidly exchange out of the formulation in vivo (see U.S. Pat. No. 5,885,613). Particularly useful exchangeable lipids are PEG-ceramides having shorter acyl chains (e.g., C14 or C18). The PEG-modified phospholipid and derivitized lipids of the present invention may comprise a molar ratio from about 0% to about 20%, about 0.5% to about 20%, about 1% to about 15%, about 4% to about 10%, or about 2% of the total lipid present in the liposomal transfer vehicle.
[0150] According to various embodiments, the selection of cationic lipids, non-cationic lipids and/or PEG-modified lipids which comprise the lipid nanoparticle, as well as the relative molar ratio of such lipids to each other, is based upon the characteristics of the selected lipid(s), the nature of the intended target cells, and the characteristics of the MCNA to be delivered. Additional considerations include, for example, the saturation of the alkyl chain, as well as the size, charge, pH, pKa, fusogenicity and toxicity of the selected lipid(s). Thus the molar ratios may be adjusted accordingly.
[0151] Polymers
[0152] In some embodiments, a suitable delivery vehicle is formulated using a polymer as a carrier, alone or in combination with other carriers including various lipids described herein. Thus, in some embodiments, liposomal delivery vehicles, as used herein, also encompass nanoparticles comprising polymers. Suitable polymers may include, for example, polyacrylates, polyalkylcyanoacrylates, polylactide, polylactide-polyglycolide copolymers, polycaprolactones, dextran, albumin, gelatin, alginate, collagen, chitosan, cyclodextrins, protamine, PEGylated protamine, PLL, PEGylated PLL and polyethylenimine (PEI). When PEI is present, it may be branched PEI of a molecular weight ranging from 10 to 40 kDa, e.g., 25 kDa branched PEI (Sigma #408727).
[0153] A suitable liposome for the present invention may include one or more of any of the cationic lipids, non-cationic lipids, cholesterol lipids, PEG-modified lipids and/or polymers described herein at various ratios. As non-limiting examples, a suitable liposome formulation may include a combination selected from cKK-E12, DOPE, cholesterol and DMG-PEG2K; C12-200, DOPE, cholesterol and DMG-PEG2K; HGT4003, DOPE, cholesterol and DMG-PEG2K; ICE, DOPE, cholesterol and DMG-PEG2K; or ICE, DOPE, and DMG-PEG2K.
[0154] In various embodiments, cationic lipids (e.g., cKK-E12, C12-200, ICE, and/or HGT4003) constitute about 30-60% (e.g., about 30-55%, about 30-50%, about 30-45%, about 30-40%, about 35-50%, about 35-45%, or about 35-40%) of the liposome by molar ratio. In some embodiments, the percentage of cationic lipids (e.g., cKK-E12, C12-200, ICE, and/or HGT4003) is or is greater than about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, or about 60% of the liposome by molar ratio.
[0155] In some embodiments, the ratio of cationic lipid(s) to non-cationic lipid(s) to cholesterol-based lipid(s) to PEG-modified lipid(s) may be between about 30-60:25-35:20-30:1-15, respectively. In some embodiments, the ratio of cationic lipid(s) to non-cationic lipid(s) to cholesterol-based lipid(s) to PEG-modified lipid(s) is approximately 40:30:20:10, respectively. In some embodiments, the ratio of cationic lipid(s) to non-cationic lipid(s) to cholesterol-based lipid(s) to PEG-modified lipid(s) is approximately 40:30:25:5, respectively. In some embodiments, the ratio of cationic lipid(s) to non-cationic lipid(s) to cholesterol-based lipid(s) to PEG-modified lipid(s) is approximately 40:32:25:3, respectively. In some embodiments, the ratio of cationic lipid(s) to non-cationic lipid(s) to cholesterol-based lipid(s) to PEG-modified lipid(s) is approximately 50:25:20:5.
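The four-component ratios above can be checked numerically. The following is a minimal sketch, not part of the described formulations: it normalizes a ratio to molar percentages and tests it against the "about 30-60:25-35:20-30:1-15" range. The function names and the treatment of "about" as hard bounds are illustrative assumptions.

```python
# Illustrative hard bounds (assumption: "about" is treated as exact) for
# cationic : non-cationic : cholesterol-based : PEG-modified lipids.
RANGES = {
    "cationic": (30.0, 60.0),
    "non_cationic": (25.0, 35.0),
    "cholesterol_based": (20.0, 30.0),
    "peg_modified": (1.0, 15.0),
}


def to_molar_percent(ratio):
    """Normalize a ratio such as (40, 30, 20, 10) so that its parts sum to 100."""
    total = float(sum(ratio))
    if total <= 0:
        raise ValueError("ratio parts must sum to a positive number")
    return [100.0 * part / total for part in ratio]


def within_ranges(ratio):
    """Return True if every normalized component lies inside its range."""
    if len(ratio) != len(RANGES):
        raise ValueError("expected one value per lipid class")
    percents = to_molar_percent(ratio)
    return all(
        low <= pct <= high
        for pct, (low, high) in zip(percents, RANGES.values())
    )


# The four approximate ratios recited in the paragraph above.
for example in [(40, 30, 20, 10), (40, 30, 25, 5), (40, 32, 25, 3), (50, 25, 20, 5)]:
    print(example, within_ranges(example))
```

Normalizing first means a ratio written in any units (for example 8:6:5:1) is compared on the same percentage scale as the recited ranges.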
[0156] Ratio of Distinct Lipid Components
[0157] In embodiments where a lipid nanoparticle comprises three and no more than three distinct components of lipids, the ratio of total lipid content (i.e., the ratio of lipid component (1):lipid component (2):lipid component (3)) can be represented as x:y:z, wherein
(y+z)=100-x.
[0158] In some embodiments, each of "x," "y," and "z" represents molar percentages of the three distinct components of lipids, and the ratio is a molar ratio.
[0159] In some embodiments, each of "x," "y," and "z" represents weight percentages of the three distinct components of lipids, and the ratio is a weight ratio.
[0160] In some embodiments, lipid component (1), represented by variable "x," is a sterol-based cationic lipid.
[0161] In some embodiments, lipid component (2), represented by variable "y," is a helper lipid.
[0162] In some embodiments, lipid component (3), represented by variable "z" is a PEG lipid.
[0163] In some embodiments, variable "x," representing the molar percentage of lipid component (1) (e.g., a sterol-based cationic lipid), is at least about 10%, about 20%, about 30%, about 40%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, or about 95%.
[0164] In some embodiments, variable "x," representing the molar percentage of lipid component (1) (e.g., a sterol-based cationic lipid), is no more than about 95%, about 90%, about 85%, about 80%, about 75%, about 70%, about 65%, about 60%, about 55%, about 50%, about 40%, about 30%, about 20%, or about 10%. In embodiments, variable "x" is no more than about 65%, about 60%, about 55%, about 50%, or about 40%.
[0165] In some embodiments, variable "x," representing the molar percentage of lipid component (1) (e.g., a sterol-based cationic lipid), is: at least about 50% but less than about 95%; at least about 50% but less than about 90%; at least about 50% but less than about 85%; at least about 50% but less than about 80%; at least about 50% but less than about 75%; at least about 50% but less than about 70%; at least about 50% but less than about 65%; or at least about 50% but less than about 60%. In embodiments, variable "x" is at least about 50% but less than about 70%; at least about 50% but less than about 65%; or at least about 50% but less than about 60%.
[0166] In some embodiments, variable "x," representing the weight percentage of lipid component (1) (e.g., a sterol-based cationic lipid), is at least about 10%, about 20%, about 30%, about 40%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, or about 95%.
[0167] In some embodiments, variable "x," representing the weight percentage of lipid component (1) (e.g., a sterol-based cationic lipid), is no more than about 95%, about 90%, about 85%, about 80%, about 75%, about 70%, about 65%, about 60%, about 55%, about 50%, about 40%, about 30%, about 20%, or about 10%. In embodiments, variable "x" is no more than about 65%, about 60%, about 55%, about 50%, or about 40%.
[0168] In some embodiments, variable "x," representing the weight percentage of lipid component (1) (e.g., a sterol-based cationic lipid), is: at least about 50% but less than about 95%; at least about 50% but less than about 90%; at least about 50% but less than about 85%; at least about 50% but less than about 80%; at least about 50% but less than about 75%; at least about 50% but less than about 70%; at least about 50% but less than about 65%; or at least about 50% but less than about 60%. In embodiments, variable "x" is at least about 50% but less than about 70%; at least about 50% but less than about 65%; or at least about 50% but less than about 60%.
[0169] In some embodiments, variable "z," representing the molar percentage of lipid component (3) (e.g., a PEG lipid) is no more than about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, or 25%. In embodiments, variable "z," representing the molar percentage of lipid component (3) (e.g., a PEG lipid) is about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, or 10%. In embodiments, variable "z," representing the molar percentage of lipid component (3) (e.g., a PEG lipid) is about 1% to about 10%, about 2% to about 10%, about 3% to about 10%, about 4% to about 10%, about 1% to about 7.5%, about 2.5% to about 10%, about 2.5% to about 7.5%, about 2.5% to about 5%, about 5% to about 7.5%, or about 5% to about 10%.
[0170] In some embodiments, variable "z," representing the weight percentage of lipid component (3) (e.g., a PEG lipid) is no more than about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, or 25%. In embodiments, variable "z," representing the weight percentage of lipid component (3) (e.g., a PEG lipid) is about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, or 10%. In embodiments, variable "z," representing the weight percentage of lipid component (3) (e.g., a PEG lipid) is about 1% to about 10%, about 2% to about 10%, about 3% to about 10%, about 4% to about 10%, about 1% to about 7.5%, about 2.5% to about 10%, about 2.5% to about 7.5%, about 2.5% to about 5%, about 5% to about 7.5%, or about 5% to about 10%.
[0171] For compositions having three and only three distinct lipid components, variables "x," "y," and "z" may be in any combination so long as the total of the three variables sums to 100% of the total lipid content.
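As an illustration of the constraint (y+z)=100-x for a three-component composition, the sketch below derives the helper-lipid percentage "y" once "x" and "z" are chosen. The function name and the specific values 60 and 5 are hypothetical and serve only to show the arithmetic; they are not a recited formulation.

```python
def three_component_ratio(x, z):
    """Return (x, y, z) with y = 100 - x - z, all in percent of total lipid.

    x: percentage of lipid component (1), e.g. a sterol-based cationic lipid
    z: percentage of lipid component (3), e.g. a PEG lipid
    y: percentage of lipid component (2), e.g. a helper lipid (derived)
    """
    if not 0.0 < x < 100.0:
        raise ValueError("x must be strictly between 0 and 100")
    if not 0.0 < z < 100.0:
        raise ValueError("z must be strictly between 0 and 100")
    y = 100.0 - x - z
    if y <= 0.0:
        raise ValueError("x + z leaves no room for lipid component (2)")
    return (x, y, z)


# Hypothetical values: x = 60 (within "at least about 50% but less than
# about 70%") and z = 5 (within "about 1% to about 10%").
print(three_component_ratio(60, 5))
```

The same function applies whether the percentages are read as molar or weight percentages, since the constraint only requires that the three values sum to 100.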
[0172] Formation of Liposomes Encapsulating mRNA
[0173] The liposomal transfer vehicles for use in the compositions of the invention can be prepared by various techniques which are presently known in the art. For example, multilamellar vesicles (MLV) may be prepared according to conventional techniques, such as by depositing a selected lipid on the inside wall of a suitable container or vessel by dissolving the lipid in an appropriate solvent, and then evaporating the solvent to leave a thin film on the inside of the vessel or by spray drying. An aqueous phase may then be added to the vessel with a vortexing motion which results in the formation of MLVs. Unilamellar vesicles (ULV) can then be formed by homogenization, sonication or extrusion of the multilamellar vesicles. In addition, unilamellar vesicles can be formed by detergent removal techniques.
[0174] In certain embodiments, provided compositions comprise a liposome wherein the mRNA is both associated on the surface of the liposome and encapsulated within the same liposome. For example, during preparation of the compositions of the present invention, cationic liposomes may associate with the mRNA through electrostatic interactions.
[0175] In some embodiments, the compositions and methods of the invention comprise mRNA encapsulated in a liposome. In some embodiments, the one or more mRNA species may be encapsulated in the same liposome. In some embodiments, the one or more mRNA species may be encapsulated in different liposomes. In some embodiments, the mRNA is encapsulated in one or more liposomes, which differ in their lipid composition, molar ratio of lipid components, size, charge (zeta potential), targeting ligands and/or combinations thereof. In some embodiments, the one or more liposomes may have a different composition of sterol-based cationic lipids, neutral lipid, PEG-modified lipid and/or combinations thereof. In some embodiments the one or more liposomes may have a different molar ratio of cholesterol-based cationic lipid, neutral lipid, and PEG-modified lipid used to create the liposome.
[0176] The process of incorporation of a desired mRNA into a liposome is often referred to as "loading". Exemplary methods are described in Lasic, et al., FEBS Lett., 312: 255-258, 1992, which is incorporated herein by reference. The liposome-incorporated nucleic acids may be completely or partially located in the interior space of the liposome, within the bilayer membrane of the liposome, or associated with the exterior surface of the liposome membrane. The incorporation of a nucleic acid into liposomes is also referred to herein as "encapsulation" wherein the nucleic acid is entirely contained within the interior space of the liposome. The purpose of incorporating an mRNA into a transfer vehicle, such as a liposome, is often to protect the nucleic acid from an environment which may contain enzymes or chemicals that degrade nucleic acids and/or systems or receptors that cause the rapid excretion of the nucleic acids. Accordingly, in some embodiments, a suitable delivery vehicle is capable of enhancing the stability of the mRNA contained therein and/or facilitating the delivery of mRNA to the target cell or tissue.
[0177] Suitable liposomes in accordance with the present invention may be made in various sizes. In some embodiments, provided liposomes may be made smaller than previously known mRNA encapsulating liposomes. In some embodiments, decreased size of liposomes is associated with more efficient delivery of mRNA. Selection of an appropriate liposome size may take into consideration the site of the target cell or tissue and to some extent the application for which the liposome is being made.
[0178] In some embodiments, an appropriate size of liposome is selected to facilitate systemic distribution of the protein encoded by the mRNA. In some embodiments, it may be desirable to limit transfection of the mRNA to certain cells or tissues. For example, to target hepatocytes a liposome may be sized such that its dimensions are smaller than the fenestrations of the endothelial layer lining hepatic sinusoids in the liver; in such cases the liposome could readily penetrate such endothelial fenestrations to reach the target hepatocytes.
[0179] Alternatively or additionally, a liposome may be sized such that the dimensions of the liposome are of a sufficient diameter to limit or expressly avoid distribution into certain cells or tissues.
[0180] A variety of alternative methods known in the art are available for sizing of a population of liposomes. One such sizing method is described in U.S. Pat. No. 4,737,323, incorporated herein by reference. Sonicating a liposome suspension either by bath or probe sonication produces a progressive size reduction down to small ULV less than about 0.05 microns in diameter. Homogenization is another method that relies on shearing energy to fragment large liposomes into smaller ones. In a typical homogenization procedure, MLV are recirculated through a standard emulsion homogenizer until selected liposome sizes, typically between about 0.1 and 0.5 microns, are observed. The size of the liposomes may be determined by quasi-elastic light scattering (QELS) as described in Bloomfield, Ann. Rev. Biophys. Bioeng., 10:421-450 (1981), incorporated herein by reference. Average liposome diameter may be reduced by sonication of formed liposomes. Intermittent sonication cycles may be alternated with QELS assessment to guide efficient liposome synthesis.
EXAMPLES
[0181] While certain compounds, compositions and methods of the present invention have been described with specificity in accordance with certain embodiments, the following examples serve only to illustrate the compounds of the invention and are not intended to limit the same.
Example 1. Synthesis and Comparison of hCFTR mRNA Constructs
[0182] Codon-optimized Human Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) messenger RNA was synthesized by in vitro transcription from a plasmid DNA template encoding the gene, which was followed by the addition of a 5' cap structure (Cap 1) (Fechter, P.; Brownlee, G. G. "Recognition of mRNA cap structures by viral and cellular proteins" J. Gen. Virology 2005, 86, 1239-1249) and a 3' poly(A) tail of approximately 250 nucleotides in length as determined by gel electrophoresis. 5' and 3' untranslated regions present in each mRNA product are represented as X and Y, respectively and defined as stated (vide infra).
[0183] Exemplary Codon-Optimized Human Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) mRNAs
[0184] Construct design:
X-SEQ ID NO: 1-Y
[0185] 5' and 3' UTR Sequences:
TABLE-US-00003
X (5' UTR Sequence) = (SEQ ID NO: 4)
GGACAGAUCGCCUGGAGACGCCAUCCACGCUGUUUUGACCUCCAUAGA
AGACACCGGGACCGAUCCAGCCUCCGCGGCCGGGAACGGUGCAUUGGA
ACGCGGAUUCCCCGUGCCAAGAGUGACUCACCGUCCUUGACACG
Y (3' UTR Sequence) = (SEQ ID NO: 5)
CGGGUGGCAUCCCUGUGACCCCUCCCCAGUGCCUCUCCUGGCCCUGGA
AGUUGCCACUCCAGUGCCCACCAGCCUUGUCCUAAUAAAAUUAAGUUG
CAUCAAGCU
OR (SEQ ID NO: 6)
GGGUGGCAUCCCUGUGACCCCUCCCCAGUGCCUCUCCUGGCCCUGGAA
GUUGCCACUCCAGUGCCCACCAGCCUUGUCCUAAUAAAAUUAAGUUGC
AUCAAAGCU
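The construct design X-SEQ ID NO: 1-Y is a simple concatenation of a 5' UTR, a coding sequence and a 3' UTR. The sketch below shows one way to assemble such a construct from sequence text that still contains the spaces and line breaks of a printed listing. The function names and the short toy fragments are hypothetical; they are not the full sequences of this application.

```python
RNA_ALPHABET = set("ACGU")


def clean(seq):
    """Remove whitespace left over from a printed listing and upper-case."""
    return "".join(seq.split()).upper()


def assemble_construct(utr5, coding, utr3):
    """Concatenate 5' UTR (X), coding sequence and 3' UTR (Y) into one mRNA."""
    full = clean(utr5) + clean(coding) + clean(utr3)
    invalid = set(full) - RNA_ALPHABET
    if invalid:
        raise ValueError(f"unexpected characters in RNA sequence: {sorted(invalid)}")
    return full


# Toy fragments for illustration only (not the real X, SEQ ID NO: 1 or Y).
toy_x = "GGAC AG"
toy_coding = "AUG\nUAA"
toy_y = "CGGG"
print(assemble_construct(toy_x, toy_coding, toy_y))
```

Rejecting characters outside A, C, G and U catches a DNA-alphabet sequence (containing T) pasted into an mRNA construct by mistake.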
[0186] An exemplary codon-optimized human CFTR mRNA sequence includes SEQ ID NO: 1 as described in the detailed description section.
[0187] An exemplary full-length codon-optimized human CFTR mRNA sequence is shown below:
TABLE-US-00004 (SEQ ID NO: 7) GGACAGAUCGCCUGGAGACGCCAUCCACGCUGUUUUGACCUCCAUAGAAGACACCGGGACC GAUCCAGCCUCCGCGGCCGGGAACGGUGCAUUGGAACGCGGAUUCCCCG UGCCAAGAGUGACUCACCGUCCUUGACACGAUGCAACGCUCUCCUCUUGAAAAGG CCUCGGUGGUGUCCAAGCUCUUCUUCUCGUGGACUAGACCCAUCCUGAGAAAGGG GUACAGACAGCGCUUGGAGCUGUCCGAUAUCUAUCAAAUCCCUUCCGUGGACUCC GCGGACAACCUGUCCGAGAAGCUCGAGAGAGAAUGGGACAGAGAACUCGCCUCAA AGAAGAACCCGAAGCUGAUUAAUGCGCUUAGGCGGUGCUUUUUCUGGCGGUUCA UGUUCUACGGCAUCUUCCUCUACCUGGGAGAGGUCACCAAGGCCGUGCAGCCCCU GUUGCUGGGACGGAUUAUUGCCUCCUACGACCCCGACAACAAGGAAGAAAGAAGC AUCGCUAUCUACUUGGGCAUCGGUCUGUGCCUGCUUUUCAUCGUCCGGACCCUCU UGUUGCAUCCUGCUAUUUUCGGCCUGCAUCACAUUGGCAUGCAGAUGAGAAUUG CCAUGUUUUCCCUGAUCUACAAGAAAACUCUGAAGCUCUCGAGCCGCGUGCUUGA CAAGAUUUCCAUCGGCCAGCUCGUGUCCCUGCUCUCCAACAAUCUGAACAAGUUC GACGAGGGCCUCGCCCUGGCCCACUUCGUGUGGAUCGCCCCUCUGCAAGUGGCGC UUCUGAUGGGCCUGAUCUGGGAGCUGCUGCAAGCCUCGGCAUUCUGUGGGCUUG GAUUCCUGAUCGUGCUGGCACUGUUCCAGGCCGGACUGGGGCGGAUGAUGAUGA AGUACAGGGACCAGAGAGCCGGAAAGAUUUCCGAACGGCUGGUGAUCACUUCGG AAAUGAUCGAAAACAUCCAGUCAGUGAAGGCCUACUGCUGGGAAGAGGCCAUGG AAAAGAUGAUUGAAAACCUCCGGCAAACCGAGCUGAAGCUGACCCGCAAGGCCGC UUACGUGCGCUAUUUCAACUCGUCCGCUUUCUUCUUCUCCGGGUUCUUCGUGGUG UUUCUCUCCGUGCUCCCCUACGCCCUGAUUAAGGGAAUCAUCCUCAGGAAGAUCU UCACCACCAUUUCCUUCUGUAUCGUGCUCCGCAUGGCCGUGACCCGGCAGUUCCC AUGGGCCGUGCAGACUUGGUACGACUCCCUGGGAGCCAUUAACAAGAUCCAGGAC UUCCUUCAAAAGCAGGAGUACAAGACCCUCGAGUACAACCUGACUACUACCGAGG UCGUGAUGGAAAACGUCACCGCCUUUUGGGAGGAGGGAUUUGGCGAACUGUUCG AGAAGGCCAAGCAGAACAACAACAACCGCAAGACCUCGAACGGUGACGACUCCCU CUUCUUUUCAAACUUCAGCCUGCUCGGGACGCCCGUGCUGAAGGACAUUAACUUC AAGAUCGAAAGAGGACAGCUCCUGGCGGUGGCCGGAUCGACCGGAGCCGGAAAG ACUUCCCUGCUGAUGGUGAUCAUGGGAGAGCUUGAACCUAGCGAGGGAAAGAUC AAGCACUCCGGCCGCAUCAGCUUCUGUAGCCAGUUUUCCUGGAUCAUGCCCGGAA CCAUUAAGGAAAACAUCAUCUUCGGCGUGUCCUACGAUGAAUACCGCUACCGGUC CGUGAUCAAAGCCUGCCAGCUGGAAGAGGAUAUUUCAAAGUUCGCGGAGAAAGA UAACAUCGUGCUGGGCGAAGGGGGUAUUACCUUGUCGGGGGGCCAGCGGGCUAG AAUCUCGCUGGCCAGAGCCGUGUAUAAGGACGCCGACCUGUAUCUCCUGGACUCC CCCUUCGGAUACCUGGACGUCCUGACCGAAAAGGAGAUCUUCGAAUCGUGCGUGU 
GCAAGCUGAUGGCUAACAAGACUCGCAUCCUCGUGACCUCCAAAAUGGAGCACCU GAAGAAGGCAGACAAGAUUCUGAUUCUGCAUGAGGGGUCCUCCUACUUUUACGG CACCUUCUCGGAGUUGCAGAACUUGCAGCCCGACUUCUCAUCGAAGCUGAUGGGU UGCGACAGCUUCGACCAGUUCUCCGCCGAAAGAAGGAACUCGAUCCUGACGGAAA CCUUGCACCGCUUCUCUUUGGAAGGCGACGCCCCUGUGUCAUGGACCGAGACUAA GAAGCAGAGCUUCAAGCAGACCGGGGAAUUCGGCGAAAAGAGGAAGAACAGCAU CUUGAACCCCAUUAACUCCAUCCGCAAGUUCUCAAUCGUGCAAAAGACGCCACUG CAGAUGAACGGCAUUGAGGAGGACUCCGACGAACCCCUUGAGAGGCGCCUGUCCC UGGUGCCGGACAGCGAGCAGGGAGAAGCCAUCCUGCCUCGGAUUUCCGUGAUCUC CACUGGUCCGACGCUCCAAGCCCGGCGGCGGCAGUCCGUGCUGAACCUGAUGACC CACAGCGUGAACCAGGGCCAAAACAUUCACCGCAAGACUACCGCAUCCACCCGGA AAGUGUCCCUGGCACCUCAAGCGAAUCUUACCGAGCUCGACAUCUACUCCCGGAG ACUGUCGCAGGAAACCGGGCUCGAAAUUUCCGAAGAAAUCAACGAGGAGGAUCU GAAAGAGUGCUUCUUCGACGAUAUGGAGUCGAUACCCGCCGUGACGACUUGGAA CACUUAUCUGCGGUACAUCACUGUGCACAAGUCAUUGAUCUUCGUGCUGAUUUG GUGCCUGGUGAUUUUCCUGGCCGAGGUCGCGGCCUCACUGGUGGUGCUCUGGCUG UUGGGAAACACGCCUCUGCAAGACAAGGGAAACUCCACGCACUCGAGAAACAACA GCUAUGCCGUGAUUAUCACUUCCACCUCCUCUUAUUACGUGUUCUACAUCUACGU CGGAGUGGCGGAUACCCUGCUCGCGAUGGGUUUCUUCAGAGGACUGCCGCUGGUC CACACCUUGAUCACCGUCAGCAAGAUUCUUCACCACAAGAUGUUGCAUAGCGUGC UGCAGGCCCCCAUGUCCACCCUCAACACUCUGAAGGCCGGAGGCAUUCUGAACAG AUUCUCCAAGGACAUCGCUAUCCUGGACGAUCUCCUGCCGCUUACCAUCUUUGAC UUCAUCCAGCUGCUGCUGAUCGUGAUUGGAGCAAUCGCAGUGGUGGCGGUGCUG CAGCCUUACAUUUUCGUGGCCACUGUGCCGGUCAUUGUGGCGUUCAUCAUGCUGC GGGCCUACUUCCUCCAAACCAGCCAGCAGCUGAAGCAACUGGAAUCCGAGGGACG AUCCCCCAUCUUCACUCACCUUGUGACGUCGUUGAAGGGACUGUGGACCCUCCGG GCUUUCGGACGGCAGCCCUACUUCGAAACCCUCUUCCACAAGGCCCUGAACCUCC ACACCGCCAAUUGGUUCCUGUACCUGUCCACCCUGCGGUGGUUCCAGAUGCGCAU CGAGAUGAUUUUCGUCAUCUUCUUCAUCGCGGUCACAUUCAUCAGCAUCCUGACU ACCGGAGAGGGAGAGGGACGGGUCGGAAUAAUCCUGACCCUCGCCAUGAACAUU AUGAGCACCCUGCAGUGGGCAGUGAACAGCUCGAUCGACGUGGACAGCCUGAUGC GAAGCGUCAGCCGCGUGUUCAAGUUCAUCGACAUGCCUACUGAGGGAAAACCCAC UAAGUCCACUAAGCCCUACAAAAAUGGCCAGCUGAGCAAGGUCAUGAUCAUCGAA AACUCCCACGUGAAGAAGGACGAUAUUUGGCCCUCCGGAGGUCAAAUGACCGUGA AGGACCUGACCGCAAAGUACACCGAGGGAGGAAACGCCAUUCUCGAAAACAUCAG 
CUUCUCCAUUUCGCCGGGACAGCGGGUCGGCCUUCUCGGGCGGACCGGUUCCGGG AAGUCAACUCUGCUGUCGGCUUUCCUCCGGCUGCUGAAUACCGAGGGGGAAAUCC AAAUUGACGGCGUGUCUUGGGAUUCCAUUACUCUGCAGCAGUGGCGGAAGGCCU UCGGCGUGAUCCCCCAGAAGGUGUUCAUCUUCUCGGGUACCUUCCGGAAGAACCU GGAUCCUUACGAGCAGUGGAGCGACCAAGAAAUCUGGAAGGUCGCCGACGAGGU CGGCCUGCGCUCCGUGAUUGAACAAUUUCCUGGAAAGCUGGACUUCGUGCUCGUC GACGGGGGAUGUGUCCUGUCGCACGGACAUAAGCAGCUCAUGUGCCUCGCACGGU CCGUGCUCUCCAAGGCCAAGAUUCUGCUGCUGGACGAACCUUCGGCCCACCUGGA UCCGGUCACCUACCAGAUCAUCAGGAGGACCCUGAAGCAGGCCUUUGCCGAUUGC ACCGUGAUUCUCUGCGAGCACCGCAUCGAGGCCAUGCUGGAGUGCCAGCAGUUCC UGGUCAUCGAGGAGAACAAGGUCCGCCAAUACGACUCCAUUCAAAAGCUCCUCAA CGAGCGGUCGCUGUUCAGACAAGCUAUUUCACCGUCCGAUAGAGUGAAGCUCUUC CCGCAUCGGAACAGCUCAAAGUGCAAAUCGAAGCCGCAGAUCGCAGCCUUGAAGG AAGAGACUGAGGAAGAGGUGCAGGACACCCGGCUUUAACGGGUGGCAUCCCUGU GACCCCUCCCCAGUGCCUCUCCUGGCCCUGGAAGUUGCCACUCCAGUGCCCACCA GCCUUGUCCUAAUAAAAUUAAGUUGCAUCAAGCU
[0188] In another example, a full length codon-optimized human CFTR mRNA sequence is shown below:
TABLE-US-00005 (SEQ ID NO: 8) GGACAGAUCGCCUGGAGACGCCAUCCACGCUGUUUUGACCUCCAUAGAAGACACCGGGACC GAUCCAGCCUCCGCGGCCGGGAACGGUGCAUUGGAACGCGGAUUCCCCG UGCCAAGAGUGACUCACCGUCCUUGACACGAUGCAACGCUCUCCUCUUGAAAAGG CCUCGGUGGUGUCCAAGCUCUUCUUCUCGUGGACUAGACCCAUCCUGAGAAAGGG GUACAGACAGCGCUUGGAGCUGUCCGAUAUCUAUCAAAUCCCUUCCGUGGACUCC GCGGACAACCUGUCCGAGAAGCUCGAGAGAGAAUGGGACAGAGAACUCGCCUCAA AGAAGAACCCGAAGCUGAUUAAUGCGCUUAGGCGGUGCUUUUUCUGGCGGUUCA UGUUCUACGGCAUCUUCCUCUACCUGGGAGAGGUCACCAAGGCCGUGCAGCCCCU GUUGCUGGGACGGAUUAUUGCCUCCUACGACCCCGACAACAAGGAAGAAAGAAGC AUCGCUAUCUACUUGGGCAUCGGUCUGUGCCUGCUUUUCAUCGUCCGGACCCUCU UGUUGCAUCCUGCUAUUUUCGGCCUGCAUCACAUUGGCAUGCAGAUGAGAAUUG CCAUGUUUUCCCUGAUCUACAAGAAAACUCUGAAGCUCUCGAGCCGCGUGCUUGA CAAGAUUUCCAUCGGCCAGCUCGUGUCCCUGCUCUCCAACAAUCUGAACAAGUUC GACGAGGGCCUCGCCCUGGCCCACUUCGUGUGGAUCGCCCCUCUGCAAGUGGCGC UUCUGAUGGGCCUGAUCUGGGAGCUGCUGCAAGCCUCGGCAUUCUGUGGGCUUG GAUUCCUGAUCGUGCUGGCACUGUUCCAGGCCGGACUGGGGCGGAUGAUGAUGA AGUACAGGGACCAGAGAGCCGGAAAGAUUUCCGAACGGCUGGUGAUCACUUCGG AAAUGAUCGAAAACAUCCAGUCAGUGAAGGCCUACUGCUGGGAAGAGGCCAUGG AAAAGAUGAUUGAAAACCUCCGGCAAACCGAGCUGAAGCUGACCCGCAAGGCCGC UUACGUGCGCUAUUUCAACUCGUCCGCUUUCUUCUUCUCCGGGUUCUUCGUGGUG UUUCUCUCCGUGCUCCCCUACGCCCUGAUUAAGGGAAUCAUCCUCAGGAAGAUCU UCACCACCAUUUCCUUCUGUAUCGUGCUCCGCAUGGCCGUGACCCGGCAGUUCCC AUGGGCCGUGCAGACUUGGUACGACUCCCUGGGAGCCAUUAACAAGAUCCAGGAC UUCCUUCAAAAGCAGGAGUACAAGACCCUCGAGUACAACCUGACUACUACCGAGG UCGUGAUGGAAAACGUCACCGCCUUUUGGGAGGAGGGAUUUGGCGAACUGUUCG AGAAGGCCAAGCAGAACAACAACAACCGCAAGACCUCGAACGGUGACGACUCCCU CUUCUUUUCAAACUUCAGCCUGCUCGGGACGCCCGUGCUGAAGGACAUUAACUUC AAGAUCGAAAGAGGACAGCUCCUGGCGGUGGCCGGAUCGACCGGAGCCGGAAAG ACUUCCCUGCUGAUGGUGAUCAUGGGAGAGCUUGAACCUAGCGAGGGAAAGAUC AAGCACUCCGGCCGCAUCAGCUUCUGUAGCCAGUUUUCCUGGAUCAUGCCCGGAA CCAUUAAGGAAAACAUCAUCUUCGGCGUGUCCUACGAUGAAUACCGCUACCGGUC CGUGAUCAAAGCCUGCCAGCUGGAAGAGGAUAUUUCAAAGUUCGCGGAGAAAGA UAACAUCGUGCUGGGCGAAGGGGGUAUUACCUUGUCGGGGGGCCAGCGGGCUAG AAUCUCGCUGGCCAGAGCCGUGUAUAAGGACGCCGACCUGUAUCUCCUGGACUCC CCCUUCGGAUACCUGGACGUCCUGACCGAAAAGGAGAUCUUCGAAUCGUGCGUGU 
GCAAGCUGAUGGCUAACAAGACUCGCAUCCUCGUGACCUCCAAAAUGGAGCACCU GAAGAAGGCAGACAAGAUUCUGAUUCUGCAUGAGGGGUCCUCCUACUUUUACGG CACCUUCUCGGAGUUGCAGAACUUGCAGCCCGACUUCUCAUCGAAGCUGAUGGGU UGCGACAGCUUCGACCAGUUCUCCGCCGAAAGAAGGAACUCGAUCCUGACGGAAA CCUUGCACCGCUUCUCUUUGGAAGGCGACGCCCCUGUGUCAUGGACCGAGACUAA GAAGCAGAGCUUCAAGCAGACCGGGGAAUUCGGCGAAAAGAGGAAGAACAGCAU CUUGAACCCCAUUAACUCCAUCCGCAAGUUCUCAAUCGUGCAAAAGACGCCACUG CAGAUGAACGGCAUUGAGGAGGACUCCGACGAACCCCUUGAGAGGCGCCUGUCCC UGGUGCCGGACAGCGAGCAGGGAGAAGCCAUCCUGCCUCGGAUUUCCGUGAUCUC CACUGGUCCGACGCUCCAAGCCCGGCGGCGGCAGUCCGUGCUGAACCUGAUGACC CACAGCGUGAACCAGGGCCAAAACAUUCACCGCAAGACUACCGCAUCCACCCGGA AAGUGUCCCUGGCACCUCAAGCGAAUCUUACCGAGCUCGACAUCUACUCCCGGAG ACUGUCGCAGGAAACCGGGCUCGAAAUUUCCGAAGAAAUCAACGAGGAGGAUCU GAAAGAGUGCUUCUUCGACGAUAUGGAGUCGAUACCCGCCGUGACGACUUGGAA CACUUAUCUGCGGUACAUCACUGUGCACAAGUCAUUGAUCUUCGUGCUGAUUUG GUGCCUGGUGAUUUUCCUGGCCGAGGUCGCGGCCUCACUGGUGGUGCUCUGGCUG UUGGGAAACACGCCUCUGCAAGACAAGGGAAACUCCACGCACUCGAGAAACAACA GCUAUGCCGUGAUUAUCACUUCCACCUCCUCUUAUUACGUGUUCUACAUCUACGU CGGAGUGGCGGAUACCCUGCUCGCGAUGGGUUUCUUCAGAGGACUGCCGCUGGUC CACACCUUGAUCACCGUCAGCAAGAUUCUUCACCACAAGAUGUUGCAUAGCGUGC UGCAGGCCCCCAUGUCCACCCUCAACACUCUGAAGGCCGGAGGCAUUCUGAACAG AUUCUCCAAGGACAUCGCUAUCCUGGACGAUCUCCUGCCGCUUACCAUCUUUGAC UUCAUCCAGCUGCUGCUGAUCGUGAUUGGAGCAAUCGCAGUGGUGGCGGUGCUG CAGCCUUACAUUUUCGUGGCCACUGUGCCGGUCAUUGUGGCGUUCAUCAUGCUGC GGGCCUACUUCCUCCAAACCAGCCAGCAGCUGAAGCAACUGGAAUCCGAGGGACG AUCCCCCAUCUUCACUCACCUUGUGACGUCGUUGAAGGGACUGUGGACCCUCCGG GCUUUCGGACGGCAGCCCUACUUCGAAACCCUCUUCCACAAGGCCCUGAACCUCC ACACCGCCAAUUGGUUCCUGUACCUGUCCACCCUGCGGUGGUUCCAGAUGCGCAU CGAGAUGAUUUUCGUCAUCUUCUUCAUCGCGGUCACAUUCAUCAGCAUCCUGACU ACCGGAGAGGGAGAGGGACGGGUCGGAAUAAUCCUGACCCUCGCCAUGAACAUU AUGAGCACCCUGCAGUGGGCAGUGAACAGCUCGAUCGACGUGGACAGCCUGAUGC GAAGCGUCAGCCGCGUGUUCAAGUUCAUCGACAUGCCUACUGAGGGAAAACCCAC UAAGUCCACUAAGCCCUACAAAAAUGGCCAGCUGAGCAAGGUCAUGAUCAUCGAA AACUCCCACGUGAAGAAGGACGAUAUUUGGCCCUCCGGAGGUCAAAUGACCGUGA AGGACCUGACCGCAAAGUACACCGAGGGAGGAAACGCCAUUCUCGAAAACAUCAG 
CUUCUCCAUUUCGCCGGGACAGCGGGUCGGCCUUCUCGGGCGGACCGGUUCCGGG AAGUCAACUCUGCUGUCGGCUUUCCUCCGGCUGCUGAAUACCGAGGGGGAAAUCC AAAUUGACGGCGUGUCUUGGGAUUCCAUUACUCUGCAGCAGUGGCGGAAGGCCU UCGGCGUGAUCCCCCAGAAGGUGUUCAUCUUCUCGGGUACCUUCCGGAAGAACCU GGAUCCUUACGAGCAGUGGAGCGACCAAGAAAUCUGGAAGGUCGCCGACGAGGU CGGCCUGCGCUCCGUGAUUGAACAAUUUCCUGGAAAGCUGGACUUCGUGCUCGUC GACGGGGGAUGUGUCCUGUCGCACGGACAUAAGCAGCUCAUGUGCCUCGCACGGU CCGUGCUCUCCAAGGCCAAGAUUCUGCUGCUGGACGAACCUUCGGCCCACCUGGA UCCGGUCACCUACCAGAUCAUCAGGAGGACCCUGAAGCAGGCCUUUGCCGAUUGC ACCGUGAUUCUCUGCGAGCACCGCAUCGAGGCCAUGCUGGAGUGCCAGCAGUUCC UGGUCAUCGAGGAGAACAAGGUCCGCCAAUACGACUCCAUUCAAAAGCUCCUCAA CGAGCGGUCGCUGUUCAGACAAGCUAUUUCACCGUCCGAUAGAGUGAAGCUCUUC CCGCAUCGGAACAGCUCAAAGUGCAAAUCGAAGCCGCAGAUCGCAGCCUUGAAGG AAGAGACUGAGGAAGAGGUGCAGGACACCCGGCUUUAAGGGUGGCAUCCCUGUG ACCCCUCCCCAGUGCCUCUCCUGGCCCUGGAAGUUGCCACUCCAGUGCCCACCAGC CUUGUCCUAAUAAAAUUAAGUUGCAUCAAAGCU
Comparison of hCFTR mRNA Constructs
[0189] A previous hCFTR sequence (SEQ ID NO: 2) was codon-optimized using a T7 promoter. Upon changing the promoter used to synthesize the hCFTR mRNA to SP6, "cleaner" mRNA was synthesized with respect to pre-aborted sequences, but a second species of approximately 1800 nt ("longmer") was being produced in low quantities. This was visualized by gel electrophoresis as depicted in FIG. 1. In FIG. 1, lane 1 contains an RNA ladder, lane 2 contains mRNA of SEQ ID NO: 1 and lane 3 contains mRNA of SEQ ID NO: 2. As indicated by the arrow, a secondary polynucleotide species approximately 1800 nucleotides in length is present in lane 3. Several new sequences (relative to SEQ ID NO: 2) were designed with site mutations to remove suspected cryptic promoters, but that did not result in the disappearance of the ~1800 nt secondary species. Complete codon re-optimization was performed to create SEQ ID NO: 1, which successfully led to an mRNA product without the additional production of the second species at ~1800 nt (lane 2).
[0190] Thus, SEQ ID NO: 1 is particularly useful in a homogenous, safe and efficacious pharmaceutical composition.
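Codon optimization changes the nucleotide sequence while leaving the encoded protein unchanged. As a minimal sketch of how that property can be checked, the code below translates two short fragments with the standard genetic code and counts their nucleotide differences. The two 27-nucleotide fragments are read off the start of the coding region in the SEQ ID NO: 7 listing and the start of the SEQ ID NO: 21 listing; the function names are illustrative assumptions, and the check says nothing about the remainder of either sequence.

```python
# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def normalize(seq):
    """Strip whitespace and map RNA (U) onto the DNA alphabet (T)."""
    return "".join(seq.split()).upper().replace("U", "T")


def translate(seq):
    """Translate an in-frame DNA or RNA sequence; '*' marks a stop codon."""
    dna = normalize(seq)
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def nucleotide_differences(seq_a, seq_b):
    """Count positions at which two equal-length sequences differ."""
    a, b = normalize(seq_a), normalize(seq_b)
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


# First nine codons after the 5' UTR in the SEQ ID NO: 7 listing (mRNA).
fragment_7 = "AUGCAACGCUCUCCUCUUGAAAAGGCC"
# First nine codons of the SEQ ID NO: 21 listing (DNA alphabet).
fragment_21 = "ATGCAGAGGAGCCCACTGGAGAAAGCC"

print(translate(fragment_7), translate(fragment_21))
print(nucleotide_differences(fragment_7, fragment_21))
```

Accepting either U or T lets the same check run on the mRNA listings of Example 1 and the DNA-alphabet listings of Example 2.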
Example 2. Additional Exemplary Codon Optimized CFTR Sequences
[0191] The following additional exemplary codon optimized sequences are used for synthesis of CFTR mRNA for safe and efficacious clinical use:
TABLE-US-00006 (SEQ ID NO: 21) ATGCAGAGGAGCCCACTGGAGAAAGCCTCCGTGGTGAGTAAACTCTTTTTTAGTTGG ACCAGACCCATCCTGCGAAAAGGATACAGGCAGCGCCTCGAGTTGTCAGATATCTA CCAGATTCCTTCTGTGGACTCAGCTGACAATTTGAGTGAGAAGCTGGAGCGGGAGTG GGATAGAGAGCTGGCGAGCAAAAAAAACCCCAAGCTTATCAATGCTCTGCGCCGCT GCTTTTTCTGGAGGTTCATGTTTTATGGGATCTTCCTGTACCTGGGGGAGGTCACCAA AGCTGTTCAGCCGCTCCTTCTTGGCCGCATCATCGCCAGCTATGACCCTGATAATAA AGAAGAAAGGTCTATTGCTATTTATCTGGGAATTGGCCTCTGCTTGCTCTTCATCGTC CGCACCCTTCTGCTGCACCCTGCCATTTTTGGCCTTCACCACATCGGCATGCAAATG AGAATTGCCATGTTCTCCCTCATTTACAAAAAGACCCTGAAACTTTCCTCAAGAGTG TTAGATAAAATATCCATTGGTCAGCTGGTCAGCCTGCTGTCCAACAATCTTAACAAA TTTGATGAAGGCTTGGCGCTGGCCCACTTCGTGTGGATTGCACCTCTGCAGGTGGCC CTGTTGATGGGACTTATATGGGAGCTGCTTCAAGCCTCTGCTTTCTGTGGGCTGGGCT TTTTGATTGTACTGGCACTTTTTCAGGCTGGGCTCGGAAGAATGATGATGAAATACA GAGATCAGCGGGCCGGGAAGATATCAGAGCGACTTGTGATCACCAGTGAAATGATT GAAAATATTCAGAGCGTGAAAGCCTACTGCTGGGAAGAAGCCATGGAGAAGATGAT TGAGAACCTGAGGCAGACAGAGCTCAAGCTCACTCGGAAGGCTGCTTATGTTCGCT ATTTCAACAGCAGCGCCTTCTTCTTCAGTGGCTTCTTTGTTGTCTTCCTGTCTGTTCTG CCATATGCACTGATAAAAGGCATTATTTTACGAAAGATCTTCACCACCATCAGTTTT TGCATCGTTCTCAGGATGGCCGTCACAAGACAGTTCCCCTGGGCTGTGCAGACCTGG TACGATTCCTTGGGGGCCATCAACAAGATTCAAGATTTCTTGCAAAAACAAGAATAT AAAACTTTAGAATACAACCTCACCACCACTGAAGTGGTCATGGAAAATGTGACAGC CTTTTGGGAGGAGGGTTTTGGAGAATTGTTCGAGAAGGCAAAGCAGAATAACAACA ACAGGAAGACGAGCAATGGGGACGACTCTCTCTTCTTCAGCAACTTTTCACTGCTCG GGACCCCTGTGTTGAAAGATATAAACTTCAAGATCGAGAGGGGCCAGCTCTTGGCT GTGGCAGGCTCCACTGGAGCTGGTAAAACATCTCTTCTCATGGTGATCATGGGGGAA CTGGAGCCTTCCGAAGGAAAAATCAAGCACAGTGGGAGAATCTCATTCTGCAGCCA GTTTTCCTGGATCATGCCCGGCACCATTAAGGAAAACATCATATTTGGAGTGTCCTA TGATGAGTACCGCTACCGGTCAGTCATCAAAGCCTGTCAGTTGGAGGAGGACATCTC CAAGTTTGCAGAGAAAGACAACATTGTGCTTGGAGAGGGGGGTATCACTCTTTCTGG AGGACAAAGAGCCAGGATCTCTTTGGCCCGGGCAGTCTACAAGGATGCAGACCTCT ACTTGTTGGACAGTCCCTTCGGCTACCTCGACGTGCTGACTGAAAAAGAAATTTTTG AAAGCTGTGTGTGCAAACTGATGGCAAACAAGACCAGGATTCTTGTCACCAGCAAG ATGGAACATCTGAAGAAAGCGGACAAAATTCTGATTCTGCATGAAGGGAGCTCCTA CTTCTATGGAACATTTAGCGAGCTTCAGAACCTACAGCCAGACTTCTCCTCCAAATT 
AATGGGCTGTGACTCCTTCGACCAGTTCTCTGCAGAAAGAAGAAACTCTATACTCAC AGAGACCCTCCACCGCTTCTCCCTTGAGGGAGATGCCCCAGTTTCTTGGACAGAAAC CAAGAAGCAGTCCTTTAAGCAGACTGGCGAGTTTGGTGAAAAGAGGAAAAATTCAA TTCTCAATCCAATTAACAGTATTCGCAAGTTCAGCATTGTCCAGAAGACACCCCTCC AGATGAATGGCATCGAAGAAGATAGTGACGAGCCGCTGGAGAGACGGCTGAGTCTG GTGCCAGATTCAGAACAGGGGGAGGCCATCCTGCCCCGGATCAGCGTCATTTCCAC AGGCCCCACATTACAAGCACGGCGCCGGCAGAGTGTTTTAAATCTCATGACCCATTC AGTGAACCAGGGCCAAAATATCCACAGGAAGACTACAGCTTCTACCCGGAAAGTGT CTCTGGCCCCTCAGGCCAATCTGACCGAGCTGGACATCTACAGCAGGAGGCTCTCCC AGGAAACAGGGCTGGAAATATCTGAAGAGATTAATGAAGAGGATCTTAAAGAGTGC TTCTTTGATGACATGGAGAGCATCCCCGCGGTGACCACATGGAACACCTACCTTAGA TATATTACTGTCCACAAGAGCCTCATATTTGTCCTCATCTGGTGCCTGGTTATTTTCC TCGCTGAGGTGGCGGCCAGTCTTGTTGTGCTCTGGCTGCTGGGCAACACTCCTCTCC AGGACAAGGGCAATAGTACTCACAGCAGAAATAATTCTTATGCCGTCATCATTACA AGCACCTCCAGCTACTACGTGTTCTACATCTATGTGGGCGTGGCTGACACCCTCCTG GCCATGGGTTTCTTCCGGGGCCTGCCTTTGGTGCACACCCTCATCACAGTGTCAAAA ATTCTGCACCATAAAATGCTTCATTCTGTCCTGCAGGCACCCATGAGCACTTTGAAC ACATTGAAGGCTGGCGGCATCCTCAACAGATTTTCTAAAGATATTGCTATCCTGGAT GATCTCCTCCCCCTGACAATCTTTGACTTTATCCAGCTTCTGCTGATCGTGATTGGAG CCATAGCAGTGGTTGCTGTCCTGCAGCCCTACATTTTTGTGGCCACCGTGCCCGTGAT TGTTGCCTTTATTATGCTCAGAGCTTACTTCCTGCAAACTTCTCAACAGCTCAAACAG CTAGAATCTGAGGGCCGGAGCCCCATTTTTACCCACCTGGTGACTTCCCTGAAGGGA CTGTGGACTCTGAGAGCATTCGGGCGACAGCCTTACTTTGAGACACTGTTCCACAAG GCCCTGAACTTGCACACTGCCAACTGGTTTCTTTACCTGAGCACACTCCGCTGGTTCC AGATGCGGATAGAGATGATCTTCGTCATCTTTTTTATAGCTGTAACCTTCATTTCTAT CCTTACAACAGGAGAAGGAGAGGGCAGGGTGGGAATCATCCTCACGCTGGCTATGA ACATAATGTCCACCTTGCAGTGGGCCGTGAATTCCAGTATAGATGTGGATTCTCTAA TGAGGAGTGTCTCCCGGGTGTTTAAATTCATTGATATGCCTACTGAGGGGAAACCCA CCAAGTCAACAAAACCTTATAAGAATGGACAGCTGAGCAAGGTGATGATAATTGAG AACAGCCACGTGAAGAAGGATGACATTTGGCCCAGCGGGGGCCAGATGACTGTGAA GGACCTGACGGCCAAGTACACCGAAGGTGGAAATGCCATTTTGGAAAACATCAGCT TCTCAATCTCTCCTGGGCAGAGAGTTGGATTGCTGGGTCGCACGGGCAGCGGCAAAT CAACCCTGCTCAGTGCCTTCCTTCGGCTCCTGAATACAGAAGGCGAAATCCAAATTG ACGGGGTGAGCTGGGACAGCATCACCCTGCAGCAGTGGAGAAAAGCATTTGGGGTC 
ATTCCACAGAAAGTTTTCATCTTCTCTGGCACTTTCAGAAAGAACCTGGACCCCTAT GAGCAGTGGAGCGACCAGGAGATCTGGAAGGTTGCAGATGAAGTTGGCCTGCGGAG TGTGATAGAACAATTTCCTGGCAAGCTGGATTTTGTGCTGGTAGATGGAGGCTGCGT GCTGTCCCACGGCCACAAACAGCTGATGTGCCTCGCCCGCTCCGTTCTTTCAAAGGC CAAAATCTTGCTTTTGGATGAGCCCAGTGCTCACCTCGACCCAGTGACCTATCAGAT AATCCGCAGGACCTTAAAGCAAGCTTTTGCCGACTGCACCGTCATACTGTGTGAGCA CCGGATTGAAGCAATGCTGGAATGCCAGCAGTTTCTGGTGATCGAGGAGAATAAGG TCCGGCAGTACGACAGCATCCAGAAGTTGTTGAATGAGCGCAGCCTTTTCCGCCAGG CCATCTCCCCATCTGACAGAGTCAAGCTGTTTCCACATAGGAACTCCTCTAAGTGCA AGTCCAAGCCCCAGATCGCTGCCCTCAAGGAGGAAACTGAGGAAGAGGTGCAGGAT ACCCGCCTGTGA (SEQ ID NO: 22) ATGCAGAGGAGCCCACTGGAGAAAGCCTCCGTGGTGAGTAAACTCTTTTTTAGTTGG ACCAGACCCATCCTGCGAAAAGGATACAGGCAGCGCCTCGAGTTGTCTGATATCTA CCAGATTCCTTCTGTGGACTCAGCTGACAATTTGAGTGAGAAGCTGGAGCGGGAGTG GGATAGAGAGCTGGCGAGCAAAAAAAACCCCAAGCTTATCAATGCTCTGCGCCGCT GCTTTTTCTGGAGGTTCATGTTTTATGGGATCTTCCTGTACCTGGGGGAGGTCACCAA AGCTGTTCAGCCGCTCCTTCTTGGCCGCATCATCGCCAGCTATGACCCTGATAATAA AGAAGAAAGGTCTATTGCTATTTATCTGGGAATTGGCCTCTGCTTGCTCTTCATCGTC CGCACCCTTCTGCTGCACCCTGCCATTTTTGGCCTTCACCACATCGGCATGCAAATG AGAATTGCCATGTTCTCCCTCATTTACAAAAAGACCCTGAAACTTTCCTCAAGAGTG TTAGATAAAATATCCATTGGTCAGCTGGTCAGCCTGCTGTCCAACAATCTTAACAAA TTTGATGAAGGCTTGGCGCTGGCCCACTTCGTGTGGATTGCACCTCTGCAGGTGGCC CTGTTGATGGGACTTATATGGGAGCTGCTTCAAGCCTCTGCTTTCTGTGGGCTGGGCT TTTTGATTGTACTGGCACTTTTTCAGGCTGGGCTCGGAAGAATGATGATGAAATACA GAGATCAGCGGGCCGGGAAGATTTCAGAGCGACTTGTGATCACCAGTGAAATGATT GAAAATATTCAGAGCGTGAAAGCCTACTGCTGGGAAGAAGCCATGGAGAAGATGAT TGAGAACCTGAGGCAGACAGAGCTCAAGCTCACTCGGAAGGCTGCTTATGTTCGCT ATTTCAACAGCAGCGCCTTCTTCTTCAGTGGCTTCTTTGTTGTCTTCCTGTCTGTTCTG CCATATGCACTGATAAAAGGCATTATTTTACGAAAGATCTTCACCACCATCAGTTTT TGCATCGTTCTCAGGATGGCCGTCACAAGACAGTTCCCCTGGGCTGTGCAGACCTGG TACGATTCCTTGGGGGCCATCAACAAGATTCAAGATTTCTTGCAAAAACAAGAATAT AAAACTTTAGAATACAACCTCACCACCACTGAAGTGGTCATGGAAAATGTGACAGC CTTTTGGGAGGAGGGTTTTGGAGAATTGTTCGAGAAGGCAAAGCAGAATAACAACA ACAGGAAGACGAGCAATGGGGACGACTCTCTCTTCTTCAGCAACTTTTCACTGCTCG GGACCCCTGTGTTGAAAGATATAAACTTCAAGATCGAGAGGGGCCAGCTCTTGGCT 
GTGGCAGGCTCCACTGGAGCTGGTAAAACATCTCTTCTCATGGTGATCATGGGGGAA CTGGAGCCTTCCGAAGGAAAAATCAAGCACAGTGGGAGAATCTCATTCTGCAGCCA GTTTTCCTGGATCATGCCCGGCACCATTAAGGAAAACATCATATTTGGAGTGTCCTA TGATGAGTACCGCTACCGGTCAGTCATCAAAGCCTGTCAGTTGGAGGAGGACATCTC CAAGTTTGCAGAGAAAGACAACATTGTGCTTGGAGAGGGGGGTATCACTCTTTCTGG AGGACAAAGAGCCAGGATCTCTTTGGCCCGGGCAGTCTACAAGGATGCAGACCTCT ACTTGTTGGACAGTCCCTTCGGCTACCTCGACGTGCTGACTGAAAAAGAAATTTTTG AAAGCTGTGTGTGCAAACTGATGGCAAACAAGACCAGGATTCTTGTCACCAGCAAG ATGGAACATCTGAAGAAAGCGGACAAAATTCTGATTCTGCATGAAGGGAGCTCCTA CTTCTATGGAACATTTAGCGAGCTTCAGAACCTACAGCCAGACTTCTCCTCCAAATT AATGGGCTGTGACTCCTTCGACCAGTTCTCTGCAGAAAGAAGAAACTCTATACTCAC AGAGACCCTCCACCGCTTCTCCCTTGAGGGAGATGCCCCAGTTTCTTGGACAGAAAC CAAGAAGCAGTCCTTTAAGCAGACTGGCGAGTTTGGTGAAAAGAGGAAAAATTCAA TTCTCAATCCAATTAACAGTATTCGCAAGTTCAGCATTGTCCAGAAGACACCCCTCC AGATGAATGGCATCGAAGAAGATAGTGACGAGCCGCTGGAGAGACGGCTGAGTCTG GTGCCAGATTCAGAACAGGGGGAGGCCATCCTGCCCCGGATCAGCGTCATTTCCAC AGGCCCCACATTACAAGCACGGCGCCGGCAGAGTGTTTTAAATCTCATGACCCATTC AGTGAACCAGGGCCAAAATATCCACAGGAAGACTACAGCTTCTACCCGGAAAGTGT CTCTGGCCCCTCAGGCCAATCTGACCGAGCTGGACATCTACAGCAGGAGGCTCTCCC AGGAAACAGGGCTGGAAATATCTGAAGAGATTAATGAAGAGGATCTTAAAGAGTGC TTCTTTGATGACATGGAGAGCATCCCCGCGGTGACCACATGGAACACCTACCTTAGA
TATATTACTGTCCACAAGAGCCTCATATTTGTCCTCATCTGGTGCCTGGTTATTTTCC TCGCTGAGGTGGCGGCCAGTCTTGTTGTGCTCTGGCTGCTGGGCAACACTCCTCTCC AGGACAAGGGCAATAGTACTCACAGCAGAAATAATTCTTATGCCGTCATCATTACA AGCACCTCCAGCTACTACGTGTTCTACATCTATGTGGGCGTGGCTGACACCCTCCTG GCCATGGGTTTCTTCCGGGGCCTGCCTTTGGTGCACACCCTCATCACAGTGTCAAAA ATTCTGCACCATAAAATGCTTCATTCTGTCCTGCAGGCACCCATGAGCACTTTGAAC ACATTGAAGGCTGGCGGCATCCTCAACAGATTTTCTAAAGATATTGCTATCCTGGAT GATCTCCTCCCCCTGACAATCTTTGACTTTATCCAGCTTCTGCTGATCGTGATTGGAG CCATAGCAGTGGTTGCTGTCCTGCAGCCCTACATTTTTGTGGCCACCGTGCCCGTGAT TGTTGCCTTTATTATGCTCAGAGCTTACTTCCTGCAAACTTCTCAACAGCTCAAACAG CTAGAATCTGAGGGCCGGAGCCCCATTTTTACCCACCTGGTGACTTCCCTGAAGGGA CTGTGGACTCTGAGAGCATTCGGGCGACAGCCTTACTTTGAGACACTGTTCCACAAG GCCCTGAACTTGCACACTGCCAACTGGTTTCTTTACCTGAGCACACTCCGCTGGTTCC AGATGCGGATAGAGATGATCTTCGTCATCTTTTTTATAGCTGTAACCTTCATTTCTAT CCTTACAACAGGAGAAGGAGAGGGCAGGGTGGGAATCATCCTCACGCTGGCTATGA ACATAATGTCCACCTTGCAGTGGGCCGTGAATTCCAGTATAGATGTGGATTCTCTAA TGAGGAGTGTCTCCCGGGTGTTTAAATTCATTGATATGCCTACTGAGGGGAAACCCA CCAAGTCAACAAAACCTTATAAGAATGGACAGCTGAGCAAGGTGATGATAATTGAG AACAGCCACGTGAAGAAGGATGACATTTGGCCCAGCGGGGGCCAGATGACTGTGAA GGACCTGACGGCCAAGTACACCGAAGGTGGAAATGCCATTTTGGAAAACATCAGCT TCTCAATCTCTCCTGGGCAGAGAGTTGGATTGCTGGGTCGCACGGGCAGCGGCAAAT CAACCCTGCTCAGTGCCTTCCTTCGGCTCCTGAATACAGAAGGCGAAATCCAAATTG ACGGGGTGAGCTGGGACAGCATCACCCTGCAGCAGTGGAGAAAAGCATTTGGGGTC ATTCCACAGAAAGTTTTCATCTTCTCTGGCACTTTCAGAAAGAACCTGGACCCCTAT GAGCAGTGGAGCGACCAGGAGATCTGGAAGGTTGCAGATGAAGTTGGCCTGCGGAG TGTGATAGAACAATTTCCTGGCAAGCTGGATTTTGTGCTGGTAGATGGAGGCTGCGT GCTGTCCCACGGCCACAAACAGCTGATGTGCCTCGCCCGCTCCGTTCTTTCAAAGGC CAAAATCTTGCTTTTGGATGAGCCCAGTGCTCACCTTGACCCAGTGACCTATCAGAT AATCCGCAGGACCTTAAAGCAAGCTTTTGCCGACTGCACCGTCATACTGTGTGAGCA CCGGATTGAAGCAATGCTGGAATGCCAGCAGTTTCTGGTGATCGAGGAGAATAAGG TCCGGCAGTACGACAGCATCCAGAAGTTGTTGAATGAGCGCAGCCTTTTCCGCCAGG CCATCTCCCCATCTGACAGAGTCAAGCTGTTTCCACATAGGAACTCCTCTAAGTGCA AGTCCAAGCCCCAGATCGCTGCCCTCAAGGAGGAAACTGAGGAAGAGGTGCAGGAT ACCCGCCTGTGA (SEQ ID NO: 23) ATGCAGAGGAGCCCACTGGAGAAAGCCTCCGTGGTGAGTAAACTCTTTTTTAGTTGG 
ACCAGACCCATCCTGCGAAAAGGATACAGGCAGCGCCTCGAGTTGTCAGATATCTA CCAGATTCCTTCTGTGGACTCAGCTGACAATTTGAGTGAGAAGCTGGAGCGGGAGTG GGATAGAGAGCTGGCGAGCAAAAAAAACCCCAAGCTTATCAATGCTCTGCGCCGCT GCTTTTTCTGGAGGTTCATGTTTTATGGGATCTTCCTGTACCTGGGGGAGGTCACCAA AGCTGTTCAGCCGCTCCTTCTTGGCCGCATCATCGCCAGCTATGACCCTGATAATAA AGAAGAAAGGTCTATTGCTATTTATCTGGGAATTGGCCTCTGCTTGCTCTTCATCGTC CGCACCCTTCTGCTGCACCCTGCCATTTTTGGCCTTCACCACATCGGCATGCAAATG AGAATTGCCATGTTCTCCCTCATTTACAAAAAGACCCTGAAACTTTCCTCAAGAGTG TTAGATAAAATATCCATTGGTCAGCTGGTCAGCCTGCTGTCCAACAATCTTAACAAA TTTGATGAAGGCTTGGCGCTGGCCCACTTCGTGTGGATTGCACCTCTGCAGGTGGCC CTGTTGATGGGACTTATATGGGAGCTGCTTCAAGCCTCTGCTTTCTGTGGGCTGGGCT TTTTGATTGTACTGGCACTTTTTCAGGCTGGGCTCGGAAGAATGATGATGAAATACA GAGATCAGCGGGCCGGGAAGATATCAGAGCGACTTGTGATCACCAGTGAAATGATT GAAAATATTCAGAGCGTGAAAGCCTACTGCTGGGAAGAAGCCATGGAGAAGATGAT TGAGAACCTGAGGCAGACAGAGCTCAAGCTCACTCGGAAGGCTGCTTATGTTCGCT ATTTCAACAGCAGCGCCTTCTTCTTCAGTGGCTTCTTTGTTGTCTTCCTGTCTGTTCTG CCATATGCACTGATAAAAGGCATTATTTTACGAAAGATCTTCACCACCATCAGTTTT TGCATCGTTCTCAGGATGGCCGTCACAAGACAGTTCCCCTGGGCTGTGCAGACCTGG TACGATTCCTTGGGGGCCATCAACAAGATTCAAGATTTCTTGCAAAAACAAGAATAT AAAACTTTAGAATACAACCTCACCACCACTGAAGTGGTCATGGAAAATGTGACAGC CTTTTGGGAGGAGGGTTTTGGAGAATTGTTCGAGAAGGCAAAGCAGAATAACAACA ACAGGAAGACGAGCAATGGGGACGACTCTCTCTTCTTCAGCAACTTTTCACTGCTCG GGACCCCTGTGTTGAAAGATATAAACTTCAAGATCGAGAGGGGCCAGCTCTTGGCT GTGGCAGGCTCCACTGGAGCTGGTAAAACATCTCTTCTCATGGTGATCATGGGGGAA CTGGAGCCTTCCGAAGGAAAAATCAAGCACAGTGGGAGAATCTCATTCTGCAGCCA GTTTTCCTGGATCATGCCCGGCACCATTAAGGAAAACATCATATTTGGAGTGTCCTA TGATGAGTACCGCTACCGGTCAGTCATCAAAGCCTGTCAGTTGGAGGAGGACATCTC CAAGTTTGCAGAGAAAGACAACATTGTGCTTGGAGAGGGGGGTATCACTCTTTCTGG AGGACAAAGAGCCAGGATCTCTTTGGCCCGGGCAGTCTACAAGGATGCAGACCTCT ACTTGTTGGACAGTCCCTTCGGCTACCTCGACGTGCTGACTGAAAAAGAAATTTTTG AAAGCTGTGTGTGCAAACTGATGGCAAACAAGACCAGGATTCTTGTCACCAGCAAG ATGGAACATCTGAAGAAAGCGGACAAAATTCTGATTCTGCATGAAGGGAGCTCCTA CTTCTATGGAACATTTAGCGAGCTTCAGAACCTACAGCCAGACTTCTCCTCCAAATT AATGGGCTGTGACTCCTTCGACCAGTTCTCTGCAGAAAGAAGAAACTCTATACTCAC 
AGAGACCCTCCACCGCTTCTCCCTTGAGGGAGATGCCCCAGTTTCTTGGACAGAAAC CAAGAAGCAGTCCTTTAAGCAGACTGGCGAGTTTGGTGAAAAGAGGAAAAATTCAA TTCTCAATCCAATTAACAGTATTCGCAAGTTCAGCATTGTCCAGAAGACACCCCTCC AGATGAATGGCATCGAAGAAGATAGTGACGAGCCGCTGGAGAGACGGCTGAGTCTG GTGCCAGATTCAGAACAGGGGGAGGCCATCCTGCCCCGGATCAGCGTCATTTCCAC AGGCCCCACATTACAAGCACGGCGCCGGCAGAGTGTTTTAAATCTCATGACCCATTC AGTGAACCAGGGCCAAAATATCCACAGGAAGACTACAGCTTCTACCCGGAAAGTGT CTCTGGCCCCTCAGGCCAATCTGACCGAGCTGGACATCTACAGCAGGAGGCTCTCCC AGGAAACAGGGCTTGAAATATCTGAAGAGATTAATGAAGAGGATCTTAAAGAGTGC TTCTTTGATGACATGGAGAGCATCCCCGCGGTGACCACATGGAACACCTACCTTAGA TATATTACTGTCCACAAGAGCCTCATATTTGTCCTCATCTGGTGCCTGGTTATTTTCC TCGCTGAGGTGGCGGCCAGTCTTGTTGTGCTCTGGCTGCTGGGCAACACTCCTCTCC AGGACAAGGGCAATAGTACACACAGCAGAAATAATTCTTATGCCGTCATCATTACA AGCACCTCCAGCTACTACGTGTTCTACATCTATGTGGGCGTGGCTGACACCCTCCTG GCCATGGGTTTCTTCCGGGGCCTGCCTTTGGTGCACACCCTCATCACAGTGTCAAAA ATTCTGCACCATAAAATGCTTCATTCTGTCCTGCAGGCACCCATGAGCACTTTGAAC ACATTGAAGGCTGGCGGCATCCTCAACAGATTTTCTAAAGATATTGCTATCCTGGAT GATCTCCTCCCCCTGACAATCTTTGACTTTATCCAGCTTCTGCTGATCGTGATTGGAG CCATAGCAGTGGTTGCTGTCCTGCAGCCCTACATTTTTGTGGCCACCGTGCCCGTGAT TGTTGCCTTTATTATGCTCAGAGCTTACTTCCTGCAAACTTCTCAACAGCTCAAACAG CTAGAATCTGAGGGCCGGAGCCCCATTTTTACCCACCTGGTGACTTCCCTGAAGGGA CTGTGGACTCTGAGAGCATTCGGGCGACAGCCTTACTTTGAGACACTGTTCCACAAG GCCCTGAACTTGCACACTGCCAACTGGTTTCTTTACCTGAGCACACTCCGCTGGTTCC AGATGCGGATAGAGATGATCTTCGTCATCTTTTTTATAGCTGTAACCTTCATTTCTAT CCTTACAACAGGAGAAGGAGAGGGCAGGGTGGGAATCATCCTCACGCTGGCTATGA ACATAATGTCCACCTTGCAGTGGGCCGTGAATTCCAGTATAGATGTGGATTCTCTAA TGAGGAGTGTCTCCCGGGTGTTTAAATTCATTGATATGCCTACTGAGGGGAAACCCA CCAAGTCAACAAAACCTTATAAGAATGGACAGCTGAGCAAGGTGATGATAATTGAG AACAGCCACGTGAAGAAGGATGACATTTGGCCCAGCGGGGGCCAGATGACTGTGAA GGACCTGACGGCCAAGTACACCGAAGGTGGAAATGCCATTTTGGAAAACATCAGCT TCTCAATCTCTCCTGGGCAGAGAGTTGGATTGCTGGGTCGCACGGGCAGCGGCAAAT CAACCCTGCTCAGTGCCTTCCTTCGGCTCCTGAATACAGAAGGCGAAATCCAAATTG ACGGGGTGAGCTGGGACAGCATCACCCTGCAGCAGTGGAGAAAAGCATTTGGGGTC ATTCCACAGAAAGTTTTCATCTTCTCTGGCACTTTCAGAAAGAACCTGGACCCCTAT 
GAGCAGTGGAGCGACCAGGAGATCTGGAAGGTTGCAGATGAAGTTGGCCTGCGGAG TGTGATAGAACAATTTCCTGGCAAGCTGGATTTTGTGCTGGTAGATGGAGGCTGCGT GCTGTCCCACGGCCACAAACAGCTGATGTGCCTCGCCCGCTCCGTTCTTTCAAAGGC CAAAATCTTGCTTTTGGATGAGCCCAGTGCTCACCTTGACCCAGTGACCTATCAGAT AATCCGCAGGACCTTAAAGCAAGCTTTTGCCGACTGCACCGTCATACTGTGTGAGCA CCGGATTGAAGCAATGCTGGAATGCCAGCAGTTTCTGGTGATCGAGGAGAATAAGG TCCGGCAGTACGACAGCATCCAGAAGTTGTTGAATGAGCGCAGCCTTTTCCGCCAGG CCATCTCCCCATCTGACAGAGTCAAGCTGTTTCCACATAGGAACTCCTCTAAGTGCA AGTCCAAGCCCCAGATCGCTGCCCTCAAGGAGGAAACTGAGGAAGAGGTGCAGGAT ACCCGCCTGTGA (SEQ ID NO: 24) ATGCAGAGGAGCCCACTGGAGAAAGCCTCCGTGGTGAGTAAACTCTTTTTTAGTTGG ACCAGACCCATCCTGCGAAAAGGATACAGGCAGCGCCTCGAGTTGTCAGATATCTA CCAGATTCCTTCTGTGGACTCAGCTGACAATTTGAGTGAGAAGCTGGAGCGGGAGTG GGATAGAGAGCTGGCGAGCAAAAAAAACCCCAAGCTTATCAATGCTCTGCGCCGCT GCTTTTTCTGGAGGTTCATGTTTTATGGGATCTTCCTGTACCTGGGGGAGGTCACCAA AGCTGTTCAGCCGCTCCTTCTTGGCCGCATCATCGCCAGCTATGACCCTGATAATAA AGAAGAAAGGTCTATTGCTATTTATCTGGGAATTGGCCTCTGCTTGCTCTTCATCGTC CGCACCCTTCTGCTGCACCCTGCCATTTTTGGCCTTCACCACATCGGCATGCAAATG AGAATTGCCATGTTCTCCCTCATTTACAAAAAGACCCTGAAACTTTCCTCAAGAGTG TTAGATAAAATATCCATTGGTCAGCTGGTCAGCCTGCTGTCCAACAATCTTAACAAA TTTGATGAAGGCTTGGCGCTGGCCCACTTCGTGTGGATTGCACCTCTGCAGGTGGCC CTGTTGATGGGACTTATATGGGAGCTGCTTCAAGCCTCTGCTTTCTGTGGGCTGGGCT
TTTTGATTGTACTGGCACTTTTTCAGGCTGGGCTCGGAAGAATGATGATGAAATACA GAGATCAGCGGGCCGGGAAGATATCAGAGCGACTTGTGATCACCAGTGAAATGATT GAAAATATTCAGAGCGTGAAAGCCTACTGCTGGGAAGAAGCCATGGAGAAGATGAT TGAGAACCTGAGGCAGACAGAGCTCAAGCTCACTCGGAAGGCTGCTTATGTTCGCT ATTTCAACAGCAGCGCCTTCTTCTTCAGTGGCTTCTTTGTTGTCTTCCTGTCTGTTCTG CCATATGCACTGATAAAAGGCATTATTTTACGAAAGATCTTCACCACCATCAGTTTT TGCATCGTTCTCAGGATGGCCGTCACAAGACAGTTCCCCTGGGCTGTGCAGACCTGG TACGATTCCTTGGGGGCCATCAACAAGATTCAAGATTTCTTGCAAAAACAAGAATAT AAAACTTTAGAATACAACCTCACCACCACTGAAGTGGTCATGGAAAATGTGACAGC CTTTTGGGAGGAGGGTTTTGGAGAATTGTTCGAGAAGGCAAAGCAGAATAACAACA ACAGGAAGACGAGCAATGGGGACGACTCTCTCTTCTTCAGCAACTTTTCACTGCTCG GGACCCCTGTGTTGAAAGATATAAACTTCAAGATCGAGAGGGGCCAGCTCTTGGCT GTGGCAGGCTCCACTGGAGCTGGTAAAACATCTCTTCTCATGGTGATCATGGGGGAA CTGGAGCCTTCCGAAGGAAAAATCAAGCACAGTGGGAGAATCTCATTCTGCAGCCA GTTTTCCTGGATCATGCCCGGCACCATTAAGGAAAACATCATATTTGGAGTGTCCTA TGATGAGTACCGCTACCGGTCAGTCATCAAAGCCTGTCAGTTGGAGGAGGACATCTC CAAGTTTGCAGAGAAAGACAACATTGTGCTTGGAGAGGGGGGTATCACTCTTTCTGG AGGACAAAGAGCCAGGATCTCTTTGGCCCGGGCAGTCTACAAGGATGCAGACCTCT ACTTGTTGGACAGTCCCTTCGGCTACCTCGACGTGCTGACTGAAAAAGAAATTTTTG AAAGCTGTGTGTGCAAACTGATGGCAAACAAGACCAGGATTCTTGTCACCAGCAAG ATGGAACATCTGAAGAAAGCGGACAAAATTCTGATTCTGCATGAAGGGAGCTCCTA CTTCTATGGAACATTTAGCGAGCTTCAGAACCTACAGCCAGACTTCTCCTCCAAATT AATGGGCTGTGACTCCTTCGACCAGTTCTCTGCAGAAAGAAGAAACTCTATACTCAC AGAGACCCTCCACCGCTTCTCCCTTGAGGGAGATGCCCCAGTTTCTTGGACAGAAAC CAAGAAGCAGTCCTTTAAGCAGACTGGCGAGTTTGGTGAAAAGAGGAAAAATTCAA TTCTCAATCCAATTAACAGTATTCGCAAGTTCAGCATTGTCCAGAAGACACCCCTCC AGATGAATGGCATCGAAGAAGATAGTGACGAGCCGCTGGAGAGACGGCTGAGTCTG GTGCCAGATTCAGAACAGGGGGAGGCCATCCTGCCCCGGATCAGCGTCATTTCCAC AGGCCCCACATTACAAGCACGGCGCCGGCAGAGTGTTTTAAATCTCATGACCCATTC AGTGAACCAGGGCCAAAATATCCACAGGAAGACTACAGCTTCTACCCGGAAAGTGT CTCTGGCCCCTCAGGCCAATCTGACCGAGCTGGACATCTACAGCAGGAGGCTCTCCC AGGAAACAGGGCTGGAAATATCTGAAGAGATTAATGAAGAGGATCTTAAAGAGTGC TTCTTTGATGACATGGAGAGCATCCCCGCGGTGACCACATGGAACACCTACCTTAGA TATATTACTGTCCACAAGAGCCTCATATTTGTCCTCATCTGGTGCCTGGTTATTTTCC 
TCGCTGAGGTGGCGGCCAGTCTTGTTGTGCTCTGGCTGCTGGGCAACACTCCTCTCC AGGACAAGGGCAATAGTACTCACAGCAGAAATAATTCTTATGCCGTCATCATTACA AGCACCTCCAGCTACTACGTGTTCTACATCTATGTGGGCGTGGCTGACACCCTCCTG GCCATGGGTTTCTTCCGGGGCCTGCCTTTGGTGCACACCCTCATCACAGTGTCAAAA ATTCTGCACCATAAAATGCTTCATTCTGTCCTGCAGGCACCCATGAGCACTTTGAAC ACATTGAAGGCTGGCGGCATCCTCAACAGATTTTCTAAAGATATTGCTATCCTGGAT GATCTCCTCCCCCTGACAATCTTTGACTTTATCCAGCTTCTGCTGATCGTGATTGGAG CCATAGCAGTGGTTGCTGTCCTGCAGCCCTACATTTTTGTGGCCACCGTGCCCGTGAT TGTTGCCTTTATTATGCTCAGAGCTTACTTCCTGCAAACTTCTCAACAGCTCAAACAG CTAGAATCTGAGGGCCGGAGCCCCATTTTTACCCACCTGGTGACTTCCCTGAAGGGA CTGTGGACTCTGAGAGCATTCGGGCGACAGCCTTACTTTGAGACACTGTTCCACAAG GCCCTGAACTTGCACACTGCCAACTGGTTTCTTTACCTGAGCACACTCCGCTGGTTCC AGATGCGGATAGAGATGATCTTCGTCATCTTTTTTATAGCTGTAACCTTCATTTCTAT CCTTACAACAGGAGAAGGAGAGGGCAGGGTGGGAATCATCCTCACGCTGGCTATGA ACATAATGTCCACCTTGCAGTGGGCCGTGAATTCCAGTATAGATGTGGATTCTCTAA TGAGGAGTGTCTCCCGGGTGTTTAAATTCATTGATATGCCAACTGAGGGGAAACCCA CCAAGTCAACAAAACCTTATAAGAATGGACAGCTGAGCAAGGTGATGATAATTGAG AACAGCCACGTGAAGAAGGATGACATTTGGCCCAGCGGGGGCCAGATGACTGTGAA GGACCTGACGGCCAAGTACACCGAAGGTGGAAATGCCATTTTGGAAAACATCAGCT TCTCAATCTCTCCTGGGCAGAGAGTTGGATTGCTGGGTCGCACGGGCAGCGGCAAAT CAACCCTGCTCAGTGCCTTCCTTCGGCTCCTGAATACAGAAGGCGAAATCCAAATTG ACGGGGTGAGCTGGGACAGCATCACCCTGCAGCAGTGGAGAAAAGCATTTGGGGTC ATTCCACAGAAAGTTTTCATCTTCTCTGGCACTTTCAGAAAGAACCTGGACCCCTAT GAGCAGTGGAGCGACCAGGAGATCTGGAAGGTTGCAGATGAAGTTGGCCTGCGGAG TGTGATAGAACAATTTCCTGGCAAGCTGGATTTTGTGCTGGTAGATGGAGGCTGCGT GCTGTCCCACGGCCACAAACAGCTGATGTGCCTCGCCCGCTCCGTTCTTTCAAAGGC CAAAATCTTGCTTTTGGATGAGCCCAGTGCTCACCTCGACCCAGTGACCTATCAGAT AATCCGCAGGACCTTAAAGCAAGCTTTTGCCGACTGCACCGTCATACTGTGTGAGCA CCGGATTGAAGCAATGCTGGAATGCCAGCAGTTTCTGGTGATCGAGGAGAATAAGG TCCGGCAGTACGACAGCATCCAGAAGTTGTTGAATGAGCGCAGCCTTTTCCGCCAGG CCATCTCCCCATCTGACAGAGTCAAGCTGTTTCCACATAGGAACTCCTCTAAGTGCA AGTCCAAGCCCCAGATCGCTGCCCTCAAGGAGGAAACTGAGGAAGAGGTGCAGGAT ACCCGCCTGTGA (SEQ ID NO: 25) ATGCAGAGGAGCCCACTGGAGAAAGCCTCCGTGGTGAGTAAACTCTTTTTTAGTTGG ACCAGACCCATCCTGCGAAAAGGATACAGGCAGCGCCTCGAGTTGTCAGATATCTA 
CCAGATTCCTTCTGTGGACTCAGCTGACAATTTGAGTGAGAAGCTGGAGCGGGAGTG GGATAGAGAGCTGGCGAGCAAAAAAAACCCCAAGCTTATCAATGCTCTGCGCCGCT GCTTTTTCTGGAGGTTCATGTTTTATGGGATCTTCCTGTACCTGGGGGAGGTCACCAA AGCTGTTCAGCCGCTCCTTCTTGGCCGCATCATCGCCAGCTATGACCCTGATAATAA AGAAGAAAGGTCTATTGCTATTTATCTGGGAATTGGCCTCTGCTTGCTCTTCATCGTC CGCACCCTTCTGCTGCACCCTGCCATTTTTGGCCTTCACCACATCGGCATGCAAATG AGAATTGCCATGTTCTCCCTCATTTACAAAAAGACCCTGAAACTTTCCTCAAGAGTG TTAGATAAAATATCCATTGGTCAGCTGGTCAGCCTGCTGTCCAACAATCTTAACAAA TTTGATGAAGGCTTGGCGCTGGCCCACTTCGTGTGGATTGCACCTCTGCAGGTGGCC CTGTTGATGGGACTTATATGGGAGCTGCTTCAAGCCTCTGCTTTCTGTGGGCTGGGCT TTTTGATTGTACTGGCACTTTTTCAGGCTGGGCTCGGAAGAATGATGATGAAATACA GAGATCAGCGGGCCGGGAAGATATCAGAGCGACTTGTGATCACCAGTGAAATGATT GAAAATATTCAGAGCGTGAAAGCCTACTGCTGGGAAGAAGCCATGGAGAAGATGAT TGAGAACCTGAGGCAGACAGAGCTCAAGCTCACTCGGAAGGCTGCTTATGTTCGCT ATTTCAACAGCAGCGCCTTCTTCTTCAGTGGCTTCTTTGTTGTCTTCCTGTCTGTTCTG CCATATGCACTGATAAAAGGCATTATTTTACGAAAGATCTTCACCACCATCAGTTTT TGCATCGTTCTCAGGATGGCCGTCACAAGACAGTTCCCCTGGGCTGTGCAGACCTGG TACGATTCCTTGGGGGCCATCAACAAGATTCAAGATTTCTTGCAAAAACAAGAATAT AAAACTTTAGAATACAACCTCACCACCACTGAAGTGGTCATGGAAAATGTGACAGC CTTTTGGGAGGAGGGTTTTGGAGAATTGTTCGAGAAGGCAAAGCAGAATAACAACA ACAGGAAGACGAGCAATGGGGACGACTCTCTCTTCTTCAGCAACTTTTCACTGCTCG GGACCCCTGTGTTGAAAGATATAAACTTCAAGATCGAGAGGGGCCAGCTCTTGGCT GTGGCAGGCTCCACTGGAGCTGGTAAAACATCTCTTCTCATGGTGATCATGGGGGAA CTGGAGCCTTCCGAAGGAAAAATCAAGCACAGTGGGAGAATCTCATTCTGCAGCCA GTTTTCCTGGATCATGCCCGGCACCATTAAGGAAAACATCATATTTGGAGTGTCCTA TGATGAGTACCGCTACCGGTCAGTCATCAAAGCCTGTCAGTTGGAGGAGGACATCTC CAAGTTTGCAGAGAAAGACAACATTGTGCTTGGAGAGGGGGGTATCACTCTTTCTGG AGGACAAAGAGCCAGGATCTCTTTGGCCCGGGCAGTCTACAAGGATGCAGACCTCT ACTTGTTGGACAGTCCCTTCGGCTACCTCGACGTGCTGACTGAAAAAGAAATTTTTG AAAGCTGTGTGTGCAAACTGATGGCAAACAAGACCAGGATTCTTGTCACCAGCAAG ATGGAACATCTGAAGAAAGCGGACAAAATTCTGATTCTGCATGAAGGGAGCTCCTA CTTCTATGGAACATTTAGCGAGCTTCAGAACCTACAGCCAGACTTCTCCTCCAAATT AATGGGCTGTGACTCCTTCGACCAGTTCTCTGCAGAAAGAAGAAACTCTATACTCAC AGAGACCCTCCACCGCTTCTCCCTTGAGGGAGATGCCCCAGTTTCTTGGACAGAAAC 
CAAGAAGCAGTCCTTTAAGCAGACTGGCGAGTTTGGTGAAAAGAGGAAAAATTCAA TTCTCAATCCTATTAACAGTATTCGCAAGTTCAGCATTGTCCAGAAGACACCCCTCC AGATGAATGGCATCGAAGAAGATAGTGACGAGCCGCTGGAGAGACGGCTGAGTCTG GTGCCAGATTCAGAACAGGGGGAGGCCATCCTGCCCCGGATCAGCGTCATTTCCAC AGGCCCCACATTACAAGCACGGCGCCGGCAGAGTGTTTTAAATCTCATGACCCATTC AGTGAACCAGGGCCAAAATATCCACAGGAAGACTACAGCTTCTACCCGGAAAGTGT CTCTGGCCCCTCAGGCCAATCTGACCGAGCTGGACATCTACAGCAGGAGGCTCTCCC AGGAAACAGGGCTTGAAATATCTGAAGAGATTAATGAAGAGGATCTTAAAGAGTGC TTCTTTGATGACATGGAGAGCATCCCCGCGGTGACCACATGGAACACCTACCTTAGA TATATTACTGTCCACAAGAGCCTCATATTTGTCCTCATCTGGTGCCTGGTTATTTTCC TCGCTGAGGTGGCGGCCAGTCTTGTTGTGCTCTGGCTGCTGGGCAACACTCCTCTCC AGGACAAGGGCAATAGTACTCACAGCAGAAATAATTCTTATGCCGTCATCATTACA AGCACCTCCAGCTACTACGTGTTCTACATCTATGTGGGCGTGGCTGACACCCTCCTG GCCATGGGTTTCTTCCGGGGCCTGCCTTTGGTGCACACCCTCATCACAGTGTCAAAA ATTCTGCACCATAAAATGCTTCATTCTGTCCTGCAGGCACCCATGAGCACTTTGAAC ACATTGAAGGCTGGCGGCATCCTCAACAGATTTTCTAAAGATATTGCTATCCTGGAT GATCTCCTCCCCCTGACAATCTTTGACTTTATCCAGCTTCTGCTGATCGTGATTGGAG CCATAGCAGTGGTTGCTGTCCTGCAGCCCTACATTTTTGTGGCCACCGTGCCCGTGAT TGTTGCCTTTATTATGCTCAGAGCTTACTTCCTGCAAACTTCTCAACAGCTCAAACAG CTAGAATCTGAGGGCCGGAGCCCCATTTTTACCCACCTGGTGACTTCCCTGAAGGGA CTGTGGACTCTGAGAGCATTCGGGCGACAGCCTTACTTTGAGACACTGTTCCACAAG GCCCTGAACTTGCACACTGCCAACTGGTTTCTTTACCTGAGCACACTCCGCTGGTTCC
AGATGCGGATAGAGATGATCTTCGTCATCTTTTTTATAGCTGTAACCTTCATTTCTAT CCTTACAACAGGAGAAGGAGAGGGCAGGGTGGGAATCATCCTCACGCTGGCTATGA ACATAATGTCCACCTTGCAGTGGGCCGTGAATTCCAGTATAGATGTGGATTCTCTAA TGAGGAGTGTCTCCCGGGTGTTTAAATTCATTGATATGCCTACTGAGGGGAAACCCA CCAAGTCAACAAAACCTTATAAGAATGGACAGCTGAGCAAGGTGATGATAATTGAG AACAGCCACGTGAAGAAGGATGACATTTGGCCCAGCGGGGGCCAGATGACTGTGAA GGACCTGACGGCCAAGTACACCGAAGGTGGAAATGCCATTTTGGAAAACATCAGCT TCTCAATCTCTCCTGGGCAGAGAGTTGGATTGCTGGGTCGCACGGGCAGCGGCAAAT CAACCCTGCTCAGTGCCTTCCTTCGGCTCCTGAATACAGAAGGCGAAATCCAAATTG ACGGGGTGAGCTGGGACAGCATCACCCTGCAGCAGTGGAGAAAAGCATTTGGGGTC ATTCCACAGAAAGTTTTCATCTTCTCTGGCACTTTCAGAAAGAACCTGGACCCCTAT GAGCAGTGGAGCGACCAGGAGATCTGGAAGGTTGCAGATGAAGTTGGCCTGCGGAG TGTGATAGAACAATTTCCTGGCAAGCTGGATTTTGTGCTGGTAGATGGAGGCTGCGT GCTGTCCCACGGCCACAAACAGCTGATGTGCCTCGCCCGCTCCGTTCTTTCAAAGGC CAAAATCTTGCTTTTGGATGAGCCCAGTGCTCACCTCGACCCAGTGACCTATCAGAT AATCCGCAGGACCTTAAAGCAAGCTTTTGCCGACTGCACCGTCATACTGTGTGAGCA CCGGATTGAAGCAATGCTGGAATGCCAGCAGTTTCTGGTGATCGAGGAGAATAAGG TCCGGCAGTACGACAGCATCCAGAAGTTGTTGAATGAGCGCAGCCTTTTCCGCCAGG CCATCTCCCCATCTGACAGAGTCAAGCTGTTTCCACATAGGAACTCCTCTAAGTGCA AGTCCAAGCCCCAGATCGCTGCCCTCAAGGAGGAAACTGAGGAAGAGGTGCAGGAT ACCCGCCTGTGA (SEQ ID NO: 26) ATGCAGAGGAGCCCACTGGAGAAAGCCTCCGTGGTGAGTAAACTCTTTTTTAGTTGG ACCAGACCCATCCTGCGAAAAGGATACAGGCAGCGCCTCGAGTTGTCAGATATCTA CCAGATTCCTTCTGTGGACTCAGCTGACAATTTGAGTGAGAAGCTGGAGCGGGAGTG GGATAGAGAGCTGGCGAGCAAAAAAAACCCCAAGCTTATCAATGCTCTGCGCCGCT GCTTTTTCTGGAGGTTCATGTTTTATGGGATCTTCCTGTACCTGGGGGAGGTCACCAA AGCTGTTCAGCCGCTCCTTCTTGGCCGCATCATCGCCAGCTATGACCCTGATAATAA AGAAGAAAGGTCTATTGCTATTTATCTGGGAATTGGCCTCTGCTTGCTCTTCATCGTC CGCACCCTTCTGCTGCACCCTGCCATTTTTGGCCTTCACCACATCGGCATGCAAATG AGAATTGCCATGTTCTCCCTCATTTACAAAAAGACCCTGAAACTTTCCTCAAGAGTG TTAGATAAAATATCCATTGGTCAGCTGGTCAGCCTGCTGTCCAACAATCTTAACAAA TTTGATGAAGGCTTGGCGCTGGCCCACTTCGTGTGGATTGCACCTCTGCAGGTGGCC CTGTTGATGGGACTTATATGGGAGCTGCTTCAAGCCTCTGCTTTCTGTGGGCTGGGCT TTTTGATTGTACTGGCACTTTTTCAGGCTGGGCTCGGAAGAATGATGATGAAATACA GAGATCAGCGGGCCGGGAAGATTTCAGAGCGACTTGTGATCACCAGTGAAATGATT 
GAAAATATTCAGAGCGTGAAAGCCTACTGCTGGGAAGAAGCCATGGAGAAGATGAT TGAGAACCTGAGGCAGACAGAGCTCAAGCTCACTCGGAAGGCTGCTTATGTTCGCT ATTTCAACAGCAGCGCCTTCTTCTTCAGTGGCTTCTTTGTTGTCTTCCTGTCTGTTCTG CCATATGCACTGATAAAAGGCATTATTTTACGAAAGATCTTCACCACCATCAGTTTT TGCATCGTTCTCAGGATGGCCGTCACAAGACAGTTCCCCTGGGCTGTGCAGACCTGG TACGATTCCTTGGGGGCCATCAACAAGATTCAAGATTTCTTGCAAAAACAAGAATAT AAAACTTTAGAATACAACCTCACCACCACTGAAGTGGTCATGGAAAATGTGACAGC CTTTTGGGAGGAGGGTTTTGGAGAATTGTTCGAGAAGGCAAAGCAGAATAACAACA ACAGGAAGACGAGCAATGGGGACGACTCTCTCTTCTTCAGCAACTTTTCACTGCTCG GGACCCCTGTGTTGAAAGATATAAACTTCAAGATCGAGAGGGGCCAGCTCTTGGCT GTGGCAGGCTCCACTGGAGCTGGTAAAACATCTCTTCTCATGGTGATCATGGGGGAA CTGGAGCCTTCCGAAGGAAAAATCAAGCACAGTGGGAGAATCTCATTCTGCAGCCA GTTTTCCTGGATCATGCCCGGCACCATTAAGGAAAACATCATATTTGGAGTGTCCTA TGATGAGTACCGCTACCGGTCAGTCATCAAAGCCTGTCAGTTGGAGGAGGACATCTC CAAGTTTGCAGAGAAAGACAACATTGTGCTTGGAGAGGGGGGTATCACTCTTTCTGG AGGACAAAGAGCCAGGATCTCTTTGGCCCGGGCAGTCTACAAGGATGCAGACCTCT ACTTGTTGGACAGTCCCTTCGGCTACCTCGACGTGCTGACTGAAAAAGAAATTTTTG AAAGCTGTGTGTGCAAACTGATGGCAAACAAGACCAGGATTCTTGTCACCAGCAAG ATGGAACATCTGAAGAAAGCGGACAAAATTCTGATTCTGCATGAAGGGAGCTCCTA CTTCTATGGAACATTTAGCGAGCTTCAGAACCTACAGCCAGACTTCTCCTCCAAATT AATGGGCTGTGACTCCTTCGACCAGTTCTCTGCAGAAAGAAGAAACTCTATACTCAC AGAGACCCTCCACCGCTTCTCCCTTGAGGGAGATGCCCCAGTTTCTTGGACAGAAAC CAAGAAGCAGTCCTTTAAGCAGACTGGCGAGTTTGGTGAAAAGAGGAAAAATTCAA TTCTCAATCCAATTAACAGTATTCGCAAGTTCAGCATTGTCCAGAAGACACCCCTCC AGATGAATGGCATCGAAGAAGATAGTGACGAGCCGCTGGAGAGACGGCTGAGTCTG GTGCCAGATTCAGAACAGGGGGAGGCCATCCTGCCCCGGATCAGCGTCATTTCCAC AGGCCCCACATTACAAGCACGGCGCCGGCAGAGTGTTTTAAATCTCATGACCCATTC AGTGAACCAGGGCCAAAATATCCACAGGAAGACTACAGCTTCTACCCGGAAAGTGT CTCTGGCCCCTCAGGCCAATCTGACCGAGCTGGACATCTACAGCAGGAGGCTCTCCC AGGAAACAGGGCTGGAAATATCTGAAGAGATTAATGAAGAGGATCTTAAAGAGTGC TTCTTTGATGACATGGAGAGCATCCCCGCGGTGACCACATGGAACACCTACCTTAGA TATATTACTGTCCACAAGAGCCTCATATTTGTCCTCATCTGGTGCCTGGTTATTTTCC TCGCTGAGGTGGCGGCCAGTCTTGTTGTGCTCTGGCTGCTGGGCAACACTCCTCTCC AGGACAAGGGCAATAGTACTCACAGCAGAAATAATTCTTATGCCGTCATCATTACA 
AGCACCTCCAGCTACTACGTGTTCTACATCTATGTGGGCGTGGCTGACACCCTCCTG GCCATGGGTTTCTTCCGGGGCCTGCCTTTGGTGCACACCCTCATCACAGTGTCAAAA ATTCTGCACCATAAAATGCTTCATTCTGTCCTGCAGGCACCCATGAGCACTTTGAAC ACATTGAAGGCTGGCGGCATCCTCAACAGATTTTCTAAAGATATTGCTATCCTGGAT GATCTCCTCCCCCTGACAATCTTTGACTTTATCCAGCTTCTGCTGATCGTGATTGGAG CCATAGCAGTGGTTGCTGTCCTGCAGCCCTACATTTTTGTGGCCACCGTGCCCGTGAT TGTTGCCTTTATTATGCTCAGAGCTTACTTCCTGCAAACTTCTCAACAGCTCAAACAG CTAGAATCTGAGGGCCGGAGCCCCATTTTTACCCACCTGGTGACTTCCCTGAAGGGA CTGTGGACTCTGAGAGCATTCGGGCGACAGCCTTACTTTGAGACACTGTTCCACAAG GCCCTGAACTTGCACACTGCCAACTGGTTTCTTTACCTGAGCACACTCCGCTGGTTCC AGATGCGGATAGAGATGATCTTCGTCATCTTTTTTATAGCTGTAACCTTCATTTCTAT CCTTACAACAGGAGAAGGAGAGGGCAGGGTGGGAATCATCCTCACGCTGGCTATGA ACATAATGTCCACCTTGCAGTGGGCCGTGAATTCCAGTATAGATGTGGATTCTCTAA TGAGGAGTGTCTCCCGGGTGTTTAAATTCATTGATATGCCAACTGAGGGGAAACCCA CCAAGTCAACAAAACCTTATAAGAATGGACAGCTGAGCAAGGTGATGATAATTGAG AACAGCCACGTGAAGAAGGATGACATTTGGCCCAGCGGGGGCCAGATGACTGTGAA GGACCTGACGGCCAAGTACACCGAAGGTGGAAATGCCATTTTGGAAAACATCAGCT TCTCAATCTCTCCTGGGCAGAGAGTTGGATTGCTGGGTCGCACGGGCAGCGGCAAAT CAACCCTGCTCAGTGCCTTCCTTCGGCTCCTGAATACAGAAGGCGAAATCCAAATTG ACGGGGTGAGCTGGGACAGCATCACCCTGCAGCAGTGGAGAAAAGCATTTGGGGTC ATTCCACAGAAAGTTTTCATCTTCTCTGGCACTTTCAGAAAGAACCTGGACCCCTAT GAGCAGTGGAGCGACCAGGAGATCTGGAAGGTTGCAGATGAAGTTGGCCTGCGGAG TGTGATAGAACAATTTCCTGGCAAGCTGGATTTTGTGCTGGTAGATGGAGGCTGCGT GCTGTCCCACGGCCACAAACAGCTGATGTGCCTCGCCCGCTCCGTTCTTTCAAAGGC CAAAATCTTGCTTTTGGATGAGCCCAGTGCTCACCTCGACCCAGTGACCTATCAGAT AATCCGCAGGACCTTAAAGCAAGCTTTTGCCGACTGCACCGTCATACTGTGTGAGCA CCGGATTGAAGCAATGCTGGAATGCCAGCAGTTTCTGGTGATCGAGGAGAATAAGG TCCGGCAGTACGACAGCATCCAGAAGTTGTTGAATGAGCGCAGCCTTTTCCGCCAGG CCATCTCCCCATCTGACAGAGTCAAGCTGTTTCCACATAGGAACTCCTCTAAGTGCA AGTCCAAGCCCCAGATCGCTGCCCTCAAGGAGGAAACTGAGGAAGAGGTGCAGGAT ACCCGCCTGTGA (SEQ ID NO: 27) ATGCAGAGGAGCCCACTGGAGAAAGCCTCCGTGGTGAGTAAACTCTTTTTTAGTTGG ACCAGACCCATCCTGCGAAAAGGATACAGGCAGCGCCTCGAGTTGTCTGATATCTA CCAGATTCCTTCTGTGGACTCAGCTGACAATTTGAGTGAGAAGCTGGAGCGGGAGTG GGATAGAGAGCTGGCGAGCAAAAAAAACCCCAAGCTTATCAATGCTCTGCGCCGCT 
GCTTTTTCTGGAGGTTCATGTTTTATGGGATCTTCCTGTACCTGGGGGAGGTCACCAA AGCTGTTCAGCCGCTCCTTCTTGGCCGCATCATCGCCAGCTATGACCCTGATAATAA AGAAGAAAGGTCTATTGCTATTTATCTGGGAATTGGCCTCTGCTTGCTCTTCATCGTC CGCACCCTTCTGCTGCACCCTGCCATTTTTGGCCTTCACCACATCGGCATGCAAATG AGAATTGCCATGTTCTCCCTCATTTACAAAAAGACCCTGAAACTTTCCTCAAGAGTG TTAGATAAAATATCCATTGGTCAGCTGGTCAGCCTGCTGTCCAACAATCTTAACAAA TTTGATGAAGGCTTGGCGCTGGCCCACTTCGTGTGGATTGCACCTCTGCAGGTGGCC CTGTTGATGGGACTTATATGGGAGCTGCTTCAAGCCTCTGCTTTCTGTGGGCTGGGCT TTTTGATTGTACTGGCACTTTTTCAGGCTGGGCTCGGAAGAATGATGATGAAATACA GAGATCAGCGGGCCGGGAAGATATCAGAGCGACTTGTGATCACCAGTGAAATGATT GAAAATATTCAGAGCGTGAAAGCCTACTGCTGGGAAGAAGCCATGGAGAAGATGAT TGAGAACCTGAGGCAGACAGAGCTCAAGCTCACTCGGAAGGCTGCTTATGTTCGCT ATTTCAACAGCAGCGCCTTCTTCTTCAGTGGCTTCTTTGTTGTCTTCCTGTCTGTTCTG CCATATGCACTGATAAAAGGCATTATTTTACGAAAGATCTTCACCACCATCAGTTTT TGCATCGTTCTCAGGATGGCCGTCACAAGACAGTTCCCCTGGGCTGTGCAGACCTGG TACGATTCCTTGGGGGCCATCAACAAGATTCAAGATTTCTTGCAAAAACAAGAATAT AAAACTTTAGAATACAACCTCACCACCACTGAAGTGGTCATGGAAAATGTGACAGC CTTTTGGGAGGAGGGTTTTGGAGAATTGTTCGAGAAGGCAAAGCAGAATAACAACA ACAGGAAGACGAGCAATGGGGACGACTCTCTCTTCTTCAGCAACTTTTCACTGCTCG GGACCCCTGTGTTGAAAGATATAAACTTCAAGATCGAGAGGGGCCAGCTCTTGGCT
GTGGCAGGCTCCACTGGAGCTGGTAAAACATCTCTTCTCATGGTGATCATGGGGGAA CTGGAGCCTTCCGAAGGAAAAATCAAGCACAGTGGGAGAATCTCATTCTGCAGCCA GTTTTCCTGGATCATGCCCGGCACCATTAAGGAAAACATCATATTTGGAGTGTCCTA TGATGAGTACCGCTACCGGTCCGTCATCAAAGCCTGTCAGTTGGAGGAGGACATCTC CAAGTTTGCAGAGAAAGACAACATTGTGCTTGGAGAGGGGGGTATCACTCTTTCTGG AGGACAAAGAGCCAGGATCTCTTTGGCCCGGGCAGTCTACAAGGATGCAGACCTCT ACTTGTTGGACAGTCCCTTCGGCTACCTCGACGTGCTGACTGAAAAAGAAATTTTTG AAAGCTGTGTGTGCAAACTGATGGCAAACAAGACCAGGATTCTTGTCACCAGCAAG ATGGAACATCTGAAGAAAGCGGACAAAATTCTGATTCTGCATGAAGGGAGCTCCTA CTTCTATGGAACATTTAGCGAGCTTCAGAACCTACAGCCAGACTTCTCCTCCAAATT AATGGGCTGTGACTCCTTCGACCAGTTCTCTGCAGAAAGAAGAAACTCTATACTCAC AGAGACCCTCCACCGCTTCTCCCTTGAGGGAGATGCCCCAGTTTCTTGGACAGAAAC CAAGAAGCAGTCCTTTAAGCAGACTGGCGAGTTTGGTGAAAAGAGGAAAAATTCAA TTCTCAATCCAATTAACAGTATTCGCAAGTTCAGCATTGTCCAGAAGACACCCCTCC AGATGAATGGCATCGAAGAAGATAGTGACGAGCCGCTGGAGAGACGGCTGAGTCTG GTGCCAGATTCAGAACAGGGGGAGGCCATCCTGCCCCGGATCAGCGTCATTTCCAC AGGCCCCACATTACAAGCACGGCGCCGGCAGAGTGTTTTAAATCTCATGACCCATTC AGTGAACCAGGGCCAAAATATCCACAGGAAGACTACAGCTTCTACCCGGAAAGTGT CTCTGGCCCCTCAGGCCAATCTGACCGAGCTGGACATCTACAGCAGGAGGCTCTCCC AGGAAACAGGGCTTGAAATATCTGAAGAGATTAATGAAGAGGATCTTAAAGAGTGC TTCTTTGATGACATGGAGAGCATCCCCGCGGTGACCACATGGAACACCTACCTTAGA TATATTACTGTCCACAAGAGCCTCATATTTGTCCTCATCTGGTGCCTGGTTATTTTCC TCGCTGAGGTGGCGGCCAGTCTTGTTGTGCTCTGGCTGCTGGGCAACACTCCTCTCC AGGACAAGGGCAATAGTACTCACAGCAGAAATAATTCTTATGCCGTCATCATTACA AGCACCTCCAGCTACTACGTGTTCTACATCTATGTGGGCGTGGCTGACACCCTCCTG GCCATGGGTTTCTTCCGGGGCCTGCCTTTGGTGCACACCCTCATCACAGTGTCAAAA ATTCTGCACCATAAAATGCTTCATTCTGTCCTGCAGGCACCCATGAGCACTTTGAAC ACATTGAAGGCTGGCGGCATCCTCAACAGATTTTCTAAAGATATTGCTATCCTGGAT GATCTCCTCCCCCTGACAATCTTTGACTTTATCCAGCTTCTGCTGATCGTGATTGGAG CCATAGCAGTGGTTGCTGTCCTGCAGCCCTACATTTTTGTGGCCACCGTGCCCGTGAT TGTTGCCTTTATTATGCTCAGAGCTTACTTCCTGCAAACTTCTCAACAGCTCAAACAG CTAGAATCTGAGGGCCGGAGCCCCATTTTTACCCACCTGGTGACTTCCCTGAAGGGA CTGTGGACTCTGAGAGCATTCGGGCGACAGCCTTACTTTGAGACACTGTTCCACAAG GCCCTGAACTTGCACACTGCCAACTGGTTTCTTTACCTGAGCACACTCCGCTGGTTCC 
AGATGCGGATAGAGATGATCTTCGTCATCTTTTTTATAGCTGTAACCTTCATTTCTAT CCTTACAACAGGAGAAGGAGAGGGCAGGGTGGGAATCATCCTCACGCTGGCTATGA ACATAATGTCCACCTTGCAGTGGGCCGTGAATTCCAGTATAGATGTGGATTCTCTAA TGAGGAGTGTCTCCCGGGTGTTTAAATTCATTGATATGCCTACTGAGGGGAAACCCA CCAAGTCAACAAAGCCTTATAAGAATGGACAGCTGAGCAAGGTGATGATAATTGAG AACAGCCACGTGAAGAAGGATGACATTTGGCCCAGCGGGGGCCAGATGACTGTGAA GGACCTGACGGCCAAGTACACCGAAGGTGGAAATGCCATTTTGGAAAACATCAGCT TCTCAATCTCTCCTGGGCAGAGAGTTGGATTGCTGGGTCGCACGGGCAGCGGCAAAT CAACCCTGCTCAGTGCCTTCCTTCGGCTCCTGAATACAGAAGGCGAAATCCAAATTG ACGGGGTGAGCTGGGACAGCATCACCCTGCAGCAGTGGAGAAAAGCATTTGGGGTC ATTCCACAGAAAGTTTTCATCTTCTCTGGCACTTTCAGAAAGAACCTGGACCCCTAT GAGCAGTGGAGCGACCAGGAGATCTGGAAGGTTGCAGATGAAGTTGGCCTGCGGAG TGTGATAGAACAATTTCCTGGCAAGCTGGATTTTGTGCTGGTAGATGGAGGCTGCGT GCTGTCCCACGGCCACAAACAGCTGATGTGCCTCGCCCGCTCCGTTCTTTCAAAGGC CAAAATCTTGCTTTTGGATGAGCCCAGTGCTCACCTCGACCCAGTGACCTATCAGAT AATCCGCAGGACCTTAAAGCAAGCTTTTGCCGACTGCACCGTCATACTGTGTGAGCA CCGGATTGAAGCAATGCTGGAATGCCAGCAGTTTCTGGTGATCGAGGAGAATAAGG TCCGGCAGTACGACAGCATCCAGAAGTTGTTGAATGAGCGCAGCCTTTTCCGCCAGG CCATCTCCCCATCTGACAGAGTCAAGCTGTTTCCACATAGGAACTCCTCTAAGTGCA AGTCCAAGCCCCAGATCGCTGCCCTCAAGGAGGAAACTGAGGAAGAGGTGCAGGAT ACCCGCCTGTGA (SEQ ID NO: 28) ATGCAGAGGAGCCCACTGGAGAAAGCCTCCGTGGTGAGTAAACTCTTTTTTAGTTGG ACCAGACCCATCCTGCGAAAAGGATACAGGCAGCGCCTCGAGTTGTCAGATATCTA CCAGATTCCTTCTGTGGACTCAGCTGACAATTTGAGTGAGAAGCTGGAGCGGGAGTG GGATAGAGAGCTGGCGAGCAAAAAAAACCCCAAGCTTATCAATGCTCTGCGCCGCT GCTTTTTCTGGAGGTTCATGTTTTATGGGATCTTCCTGTACCTGGGGGAGGTCACCAA AGCTGTTCAGCCGCTCCTTCTTGGCCGCATCATCGCCAGCTATGACCCTGATAATAA AGAAGAAAGGTCTATTGCTATTTATCTGGGAATTGGCCTCTGCTTGCTCTTCATCGTC CGCACCCTTCTGCTGCACCCTGCCATTTTTGGCCTTCACCACATCGGCATGCAAATG AGAATTGCCATGTTCTCCCTCATTTACAAAAAGACCCTGAAACTTTCCTCAAGAGTG TTAGATAAAATATCCATTGGTCAGCTGGTCAGCCTGCTGTCCAACAATCTTAACAAA TTTGATGAAGGCTTGGCGCTGGCCCACTTCGTGTGGATTGCACCTCTGCAGGTGGCC CTGTTGATGGGACTTATATGGGAGCTGCTTCAAGCCTCTGCTTTCTGTGGGCTGGGCT TTTTGATTGTACTGGCACTTTTTCAGGCTGGGCTCGGAAGAATGATGATGAAATACA GAGATCAGCGGGCCGGGAAGATATCAGAGCGACTTGTGATCACCAGTGAAATGATT 
GAAAATATTCAGAGCGTGAAAGCCTACTGCTGGGAAGAAGCCATGGAGAAGATGAT TGAGAACCTGAGGCAGACAGAGCTCAAGCTCACTCGGAAGGCTGCTTATGTTCGCT ATTTCAACAGCAGCGCCTTCTTCTTCAGTGGCTTCTTTGTTGTCTTCCTGTCTGTTCTG CCATATGCACTGATAAAAGGCATTATTTTACGAAAGATCTTCACCACCATCAGTTTT TGCATCGTTCTCAGGATGGCCGTCACAAGACAGTTCCCCTGGGCTGTGCAGACCTGG TACGATTCCTTGGGGGCCATCAACAAGATTCAAGATTTCTTGCAAAAACAAGAATAT AAAACTTTAGAATACAACCTCACCACCACTGAAGTGGTCATGGAAAATGTGACAGC CTTTTGGGAGGAGGGTTTTGGAGAATTGTTCGAGAAGGCAAAGCAGAATAACAACA ACAGGAAGACGAGCAATGGGGACGACTCTCTCTTCTTCAGCAACTTTTCACTGCTCG GGACCCCTGTGTTGAAAGATATAAACTTCAAGATCGAGAGGGGCCAGCTCTTGGCT GTGGCAGGCTCCACTGGAGCTGGTAAAACATCTCTTCTCATGGTGATCATGGGGGAA CTGGAGCCTTCCGAAGGAAAAATCAAGCACAGTGGGAGAATCTCATTCTGCAGCCA GTTTTCCTGGATCATGCCCGGCACCATTAAGGAAAACATCATATTTGGAGTGTCCTA TGATGAGTACCGCTACCGGTCAGTCATCAAAGCCTGTCAGTTGGAGGAGGACATCTC CAAGTTTGCAGAGAAAGACAACATTGTGCTTGGAGAGGGGGGTATCACTCTTTCTGG AGGACAAAGAGCCAGGATCTCTTTGGCCCGGGCAGTCTACAAGGATGCAGACCTCT ACTTGTTGGACAGTCCCTTCGGCTACCTCGACGTGCTGACTGAAAAAGAAATTTTTG AAAGCTGTGTGTGCAAACTGATGGCAAACAAGACCAGGATTCTTGTCACCAGCAAG ATGGAACATCTGAAGAAAGCGGACAAAATTCTGATTCTGCATGAAGGGAGCTCCTA CTTCTATGGAACATTTAGCGAGCTTCAGAACCTACAGCCAGACTTCTCCTCCAAATT AATGGGCTGTGACTCCTTCGACCAGTTCTCTGCAGAAAGAAGAAACTCTATACTCAC AGAGACCCTCCACCGCTTCTCCCTTGAGGGAGATGCCCCAGTTTCTTGGACAGAAAC CAAGAAGCAGTCCTTTAAGCAGACTGGCGAGTTTGGTGAAAAGAGGAAAAATTCAA TTCTCAATCCTATTAACAGTATTCGCAAGTTCAGCATTGTCCAGAAGACACCCCTCC AGATGAATGGCATCGAAGAAGATAGTGACGAGCCGCTGGAGAGACGGCTGAGTCTG GTGCCAGATTCAGAACAGGGGGAGGCCATCCTGCCCCGGATCAGCGTCATTTCCAC AGGCCCCACATTACAAGCACGGCGCCGGCAGAGTGTTTTAAATCTCATGACCCATTC AGTGAACCAGGGCCAAAATATCCACAGGAAGACTACAGCTTCTACCCGGAAAGTGT CTCTGGCCCCTCAGGCCAATCTGACCGAGCTGGACATCTACAGCAGGAGGCTCTCCC AGGAAACAGGGCTTGAAATATCTGAAGAGATTAATGAAGAGGATCTTAAAGAGTGC TTCTTTGATGACATGGAGAGCATCCCCGCGGTGACCACATGGAACACCTACCTTAGA TATATTACTGTCCACAAGAGCCTCATATTTGTCCTCATCTGGTGCCTGGTTATTTTCC TCGCTGAGGTGGCGGCCAGTCTTGTTGTGCTCTGGCTGCTGGGCAACACTCCTCTCC AGGACAAGGGCAATAGTACACACAGCAGAAATAATTCTTATGCCGTCATCATTACA 
AGCACCTCCAGCTACTACGTGTTCTACATCTATGTGGGCGTGGCTGACACCCTCCTG GCCATGGGTTTCTTCCGGGGCCTGCCTTTGGTGCACACCCTCATCACAGTGTCAAAA ATTCTGCACCATAAAATGCTTCATTCTGTCCTGCAGGCACCCATGAGCACTTTGAAC ACATTGAAGGCTGGCGGCATCCTCAACAGATTTTCTAAAGATATTGCTATCCTGGAT GATCTCCTCCCCCTGACAATCTTTGACTTTATCCAGCTTCTGCTGATCGTGATTGGAG CCATAGCAGTGGTTGCTGTCCTGCAGCCCTACATTTTTGTGGCCACCGTGCCCGTGAT TGTTGCCTTTATTATGCTCAGAGCTTACTTCCTGCAAACTTCTCAACAGCTCAAACAG CTAGAATCTGAGGGCCGGAGCCCCATTTTTACCCACCTGGTGACTTCCCTGAAGGGA CTGTGGACTCTGAGAGCATTCGGGCGACAGCCTTACTTTGAGACACTGTTCCACAAG GCCCTGAACTTGCACACTGCCAACTGGTTTCTTTACCTGAGCACACTCCGCTGGTTCC AGATGCGGATAGAGATGATCTTCGTCATCTTTTTTATAGCTGTAACCTTCATTTCTAT CCTTACAACAGGAGAAGGAGAGGGCAGGGTGGGAATCATCCTCACGCTGGCTATGA ACATAATGTCCACCTTGCAGTGGGCCGTGAATTCCAGTATAGATGTGGATTCTCTAA TGAGGAGTGTCTCCCGGGTGTTTAAATTCATTGATATGCCTACTGAGGGGAAACCCA CCAAGTCAACAAAACCTTATAAGAATGGACAGCTGAGCAAGGTGATGATAATTGAG AACAGCCACGTGAAGAAGGATGACATTTGGCCCAGCGGGGGCCAGATGACTGTGAA GGACCTGACGGCCAAGTACACCGAAGGTGGAAATGCCATTTTGGAAAACATCAGCT TCTCAATCTCTCCTGGGCAGAGAGTTGGATTGCTGGGTCGCACGGGCAGCGGCAAAT CAACCCTGCTCAGTGCCTTCCTTCGGCTCCTGAATACAGAAGGCGAAATCCAAATTG ACGGGGTGAGCTGGGACAGCATCACCCTGCAGCAGTGGAGAAAAGCATTTGGGGTC ATTCCACAGAAAGTTTTCATCTTCTCTGGCACTTTCAGAAAGAACCTGGACCCCTAT GAGCAGTGGAGCGACCAGGAGATCTGGAAGGTTGCAGATGAAGTTGGCCTGCGGAG
TGTGATAGAACAATTTCCTGGCAAGCTGGATTTTGTGCTGGTAGATGGAGGCTGCGT GCTGTCCCACGGCCACAAACAGCTGATGTGCCTCGCCCGCTCCGTTCTTTCAAAGGC CAAAATCTTGCTTTTGGATGAGCCCAGTGCTCACCTCGACCCAGTGACCTATCAGAT AATCCGCAGGACCTTAAAGCAAGCTTTTGCCGACTGCACCGTCATACTGTGTGAGCA CCGGATTGAAGCAATGCTGGAATGCCAGCAGTTTCTGGTGATCGAGGAGAATAAGG TCCGGCAGTACGACAGCATCCAGAAGTTGTTGAATGAGCGCAGCCTTTTCCGCCAGG CCATCTCCCCATCTGACAGAGTCAAGCTGTTTCCACATAGGAACTCCTCTAAGTGCA AGTCCAAGCCCCAGATCGCTGCCCTCAAGGAGGAAACTGAGGAAGAGGTGCAGGAT ACCCGCCTGTGA (SEQ ID NO: 29) ATGCAGAGGAGCCCACTGGAGAAAGCCTCCGTGGTGAGTAAACTCTTTTTTAGTTGG ACCAGACCCATCCTGCGAAAAGGATACAGGCAGCGCCTCGAGTTGTCAGATATCTA CCAGATTCCTTCTGTGGACTCAGCTGACAATTTGAGTGAGAAGCTGGAGCGGGAGTG GGATAGAGAGCTGGCGAGCAAAAAAAACCCCAAGCTTATCAATGCTCTGCGCCGCT GCTTTTTCTGGAGGTTCATGTTTTATGGGATCTTCCTGTACCTGGGGGAGGTCACCAA AGCTGTTCAGCCGCTCCTTCTTGGCCGCATCATCGCCAGCTATGACCCTGATAATAA AGAAGAAAGGTCTATTGCTATTTATCTGGGAATTGGCCTCTGCTTGCTCTTCATCGTC CGCACCCTTCTGCTGCACCCTGCCATTTTTGGCCTTCACCACATCGGCATGCAAATG AGAATTGCCATGTTCTCCCTCATTTACAAAAAGACCCTGAAACTTTCCTCAAGAGTG TTAGATAAAATATCCATTGGTCAGCTGGTCAGCCTGCTGTCCAACAATCTTAACAAA TTTGATGAAGGCTTGGCGCTGGCCCACTTCGTGTGGATTGCACCTCTGCAGGTGGCC CTGTTGATGGGACTTATATGGGAGCTGCTTCAAGCCTCTGCTTTCTGTGGGCTGGGCT TTTTGATTGTACTGGCACTTTTTCAGGCTGGGCTCGGAAGAATGATGATGAAATACA GAGATCAGCGGGCCGGGAAGATATCAGAGCGACTTGTGATCACCAGTGAAATGATT GAAAATATTCAGAGCGTGAAAGCCTACTGCTGGGAAGAAGCCATGGAGAAGATGAT TGAGAACCTGAGGCAGACAGAGCTCAAGCTCACTCGGAAGGCTGCTTATGTTCGCT ATTTCAACAGCAGCGCCTTCTTCTTCAGTGGCTTCTTTGTTGTCTTCCTGTCTGTTCTG CCATATGCACTGATAAAAGGCATTATTTTACGAAAGATCTTCACCACCATCAGTTTT TGCATCGTTCTCAGGATGGCCGTCACAAGACAGTTCCCCTGGGCTGTGCAGACCTGG TACGATTCCTTGGGGGCCATCAACAAGATTCAAGATTTCTTGCAAAAACAAGAATAT AAAACTTTAGAATACAACCTCACCACCACTGAAGTGGTCATGGAAAATGTGACAGC CTTTTGGGAGGAGGGTTTTGGAGAATTGTTCGAGAAGGCAAAGCAGAATAACAACA ACAGGAAGACGAGCAATGGGGACGACTCTCTCTTCTTCAGCAACTTTTCACTGCTCG GGACCCCTGTGTTGAAAGATATAAACTTCAAGATCGAGAGGGGCCAGCTCTTGGCT GTGGCAGGCTCCACTGGAGCTGGTAAAACATCTCTTCTCATGGTGATCATGGGGGAA CTGGAGCCTTCCGAAGGAAAAATCAAGCACAGTGGGAGAATCTCATTCTGCAGCCA 
GTTTTCCTGGATCATGCCCGGCACCATTAAGGAAAACATCATATTTGGAGTGTCCTA TGATGAGTACCGCTACCGGTCCGTCATCAAAGCCTGTCAGTTGGAGGAGGACATCTC CAAGTTTGCAGAGAAAGACAACATTGTGCTTGGAGAGGGGGGTATCACTCTTTCTGG AGGACAAAGAGCCAGGATCTCTTTGGCCCGGGCAGTCTACAAGGATGCAGACCTCT ACTTGTTGGACAGTCCCTTCGGCTACCTCGACGTGCTGACTGAAAAAGAAATTTTTG AAAGCTGTGTGTGCAAACTGATGGCAAACAAGACCAGGATTCTTGTCACCAGCAAG ATGGAACATCTGAAGAAAGCGGACAAAATTCTGATTCTGCATGAAGGGAGCTCCTA CTTCTATGGAACATTTAGCGAGCTTCAGAACCTACAGCCAGACTTCTCCTCCAAATT AATGGGCTGTGACTCCTTCGACCAGTTCTCTGCAGAAAGAAGAAACTCTATACTCAC AGAGACCCTCCACCGCTTCTCCCTTGAGGGAGATGCCCCAGTTTCTTGGACAGAAAC CAAGAAGCAGTCCTTTAAGCAGACTGGCGAGTTTGGTGAAAAGAGGAAAAATTCAA TTCTCAATCCAATTAACAGTATTCGCAAGTTCAGCATTGTCCAGAAGACACCCCTCC AGATGAATGGCATCGAAGAAGATAGTGACGAGCCGCTGGAGAGACGGCTGAGTCTG GTGCCAGATTCAGAACAGGGGGAGGCCATCCTGCCCCGGATCAGCGTCATTTCCAC AGGCCCCACATTACAAGCACGGCGCCGGCAGAGTGTTTTAAATCTCATGACCCATTC AGTGAACCAGGGCCAAAATATCCACAGGAAGACTACAGCTTCTACCCGGAAAGTGT CTCTGGCCCCTCAGGCCAATCTGACCGAGCTGGACATCTACAGCAGGAGGCTCTCCC AGGAAACAGGGCTGGAAATATCTGAAGAGATTAATGAAGAGGATCTTAAAGAGTGC TTCTTTGATGACATGGAGAGCATCCCCGCGGTGACCACATGGAACACCTACCTTAGA TATATTACTGTCCACAAGAGCCTCATATTTGTCCTCATCTGGTGCCTGGTTATTTTCC TCGCTGAGGTGGCGGCCAGTCTTGTTGTGCTCTGGCTGCTGGGCAACACTCCTCTCC AGGACAAGGGCAATAGTACTCACAGCAGAAATAATTCTTATGCCGTCATCATTACA AGCACCTCCAGCTACTACGTGTTCTACATCTATGTGGGCGTGGCTGACACCCTCCTG GCCATGGGTTTCTTCCGGGGCCTGCCTTTGGTGCACACCCTCATCACAGTGTCAAAA ATTCTGCACCATAAAATGCTTCATTCTGTCCTGCAGGCACCCATGAGCACTTTGAAC ACATTGAAGGCTGGCGGCATCCTCAACAGATTTTCTAAAGATATTGCTATCCTGGAT GATCTCCTCCCCCTGACAATCTTTGACTTTATCCAGCTTCTGCTGATCGTGATTGGAG CCATAGCAGTGGTTGCTGTCCTGCAGCCCTACATTTTTGTGGCCACCGTGCCCGTGAT TGTTGCCTTTATTATGCTCAGAGCTTACTTCCTGCAAACTTCTCAACAGCTCAAACAG CTAGAGTCTGAGGGCCGGAGCCCCATTTTTACCCACCTGGTGACTTCCCTGAAGGGA CTGTGGACTCTGAGAGCATTCGGGCGACAGCCTTACTTTGAGACACTGTTCCACAAG GCCCTGAACTTGCACACTGCCAACTGGTTTCTTTACCTGAGCACACTCCGCTGGTTCC AGATGCGGATAGAGATGATCTTCGTCATCTTTTTTATAGCTGTAACCTTCATTTCTAT CCTTACAACAGGAGAAGGAGAGGGCAGGGTGGGAATCATCCTCACGCTGGCTATGA 
ACATAATGTCCACCTTGCAGTGGGCCGTGAATTCCAGTATAGATGTGGATTCTCTAA TGAGGAGTGTCTCCCGGGTGTTTAAATTCATTGATATGCCTACTGAGGGGAAACCCA CCAAGTCAACAAAACCTTATAAGAATGGACAGCTGAGCAAGGTGATGATAATTGAG AACAGCCACGTGAAGAAGGATGACATTTGGCCCAGCGGGGGCCAGATGACTGTGAA GGACCTGACGGCCAAGTACACCGAAGGTGGAAATGCCATTTTGGAAAACATCAGCT TCTCAATCTCTCCTGGGCAGAGAGTTGGATTGCTGGGTCGCACGGGCAGCGGCAAAT CAACCCTGCTCAGTGCCTTCCTTCGGCTCCTGAATACAGAAGGCGAAATCCAAATTG ACGGGGTGAGCTGGGACAGCATCACCCTGCAGCAGTGGAGAAAAGCATTTGGGGTC ATTCCACAGAAAGTTTTCATCTTCTCTGGCACTTTCAGAAAGAACCTGGACCCCTAT GAGCAGTGGAGCGACCAGGAGATCTGGAAGGTTGCAGATGAAGTTGGCCTGCGGAG TGTGATAGAACAATTTCCTGGCAAGCTGGATTTTGTGCTGGTAGATGGAGGCTGCGT GCTGTCCCACGGCCACAAACAGCTGATGTGCCTCGCCCGCTCCGTTCTTTCAAAGGC CAAAATCTTGCTTTTGGATGAGCCCAGTGCTCACCTCGACCCAGTGACCTATCAGAT AATCCGCAGGACCTTAAAGCAAGCTTTTGCCGACTGCACCGTCATACTGTGTGAGCA CCGGATTGAAGCAATGCTGGAATGCCAGCAGTTTCTGGTGATCGAGGAGAATAAGG TCCGGCAGTACGACAGCATCCAGAAGTTGTTGAATGAGCGCAGCCTTTTCCGCCAGG CCATCTCCCCATCTGACAGAGTCAAGCTGTTTCCACATAGGAACTCCTCTAAGTGCA AGTCCAAGCCCCAGATCGCTGCCCTCAAGGAGGAAACTGAGGAAGAGGTGCAGGAT ACCCGCCTGTGA (SEQ ID NO: 30) ATGCAGAGGAGCCCACTGGAGAAAGCCTCCGTGGTGAGTAAACTCTTTTTTAGTTGG ACCAGACCCATCCTGCGAAAAGGATACAGGCAGCGCCTCGAGTTGTCTGATATCTA CCAGATTCCTTCTGTGGACTCAGCTGACAATTTGAGTGAGAAGCTGGAGCGGGAGTG GGATAGAGAGCTGGCGAGCAAAAAAAACCCCAAGCTTATCAATGCTCTGCGCCGCT GCTTTTTCTGGAGGTTCATGTTTTATGGGATCTTCCTGTACCTGGGGGAGGTCACCAA AGCTGTTCAGCCGCTCCTTCTTGGCCGCATCATCGCCAGCTATGACCCTGATAATAA AGAAGAAAGGTCTATTGCTATTTATCTGGGAATTGGCCTCTGCTTGCTCTTCATCGTC CGCACCCTTCTGCTGCACCCTGCCATTTTTGGCCTTCACCACATCGGCATGCAAATG AGAATTGCCATGTTCTCCCTCATTTACAAAAAGACCCTGAAACTTTCCTCAAGAGTG TTAGATAAAATATCCATTGGTCAGCTGGTCAGCCTGCTGTCCAACAATCTTAACAAA TTTGATGAAGGCTTGGCGCTGGCCCACTTCGTGTGGATTGCACCTCTGCAGGTGGCC CTGTTGATGGGACTTATATGGGAGCTGCTTCAAGCCTCTGCTTTCTGTGGGCTGGGCT TTTTGATTGTACTGGCACTTTTTCAGGCTGGGCTCGGAAGAATGATGATGAAATACA GAGATCAGCGGGCCGGGAAGATTTCAGAGCGACTTGTGATCACCAGTGAAATGATT GAAAATATTCAGAGCGTGAAAGCCTACTGCTGGGAAGAAGCCATGGAGAAGATGAT TGAGAACCTGAGGCAGACAGAGCTCAAGCTCACTCGGAAGGCTGCTTATGTTCGCT 
ATTTCAACAGCAGCGCCTTCTTCTTCAGTGGCTTCTTTGTTGTCTTCCTGTCTGTTCTG CCATATGCACTGATAAAAGGCATTATTTTACGAAAGATCTTCACCACCATCAGTTTT TGCATCGTTCTCAGGATGGCCGTCACAAGACAGTTCCCCTGGGCTGTGCAGACCTGG TACGATTCCTTGGGGGCCATCAACAAGATTCAAGATTTCTTGCAAAAACAAGAATAT AAAACTTTAGAATACAACCTCACCACCACTGAAGTGGTCATGGAAAATGTGACAGC CTTTTGGGAGGAGGGTTTTGGAGAATTGTTCGAGAAGGCAAAGCAGAATAACAACA ACAGGAAGACGAGCAATGGGGACGACTCTCTCTTCTTCAGCAACTTTTCACTGCTCG GGACCCCTGTGTTGAAAGATATAAACTTCAAGATCGAGAGGGGCCAGCTCTTGGCT GTGGCAGGCTCCACTGGAGCTGGTAAAACATCTCTTCTCATGGTGATCATGGGGGAA CTGGAGCCTTCCGAAGGAAAAATCAAGCACAGTGGGAGAATCTCATTCTGCAGCCA GTTTTCCTGGATCATGCCCGGCACCATTAAGGAAAACATCATATTTGGAGTGTCCTA TGATGAGTACCGCTACCGGTCAGTCATCAAAGCCTGTCAGTTGGAGGAGGACATCTC CAAGTTTGCAGAGAAAGACAACATTGTGCTTGGAGAGGGGGGTATCACTCTTTCTGG AGGACAAAGAGCCAGGATCTCTTTGGCCCGGGCAGTCTACAAGGATGCAGACCTCT ACTTGTTGGACAGTCCCTTCGGCTACCTCGACGTGCTGACTGAAAAAGAAATTTTTG AAAGCTGTGTGTGCAAACTGATGGCAAACAAGACCAGGATTCTTGTCACCAGCAAG ATGGAACATCTGAAGAAAGCGGACAAAATTCTGATTCTGCATGAAGGGAGCTCCTA CTTCTATGGAACATTTAGCGAGCTTCAGAACCTACAGCCAGACTTCTCCTCCAAATT AATGGGCTGTGACTCCTTCGACCAGTTCTCTGCAGAAAGAAGAAACTCTATACTCAC AGAGACCCTCCACCGCTTCTCCCTTGAGGGAGATGCCCCAGTTTCTTGGACAGAAAC CAAGAAGCAGTCCTTTAAGCAGACTGGCGAGTTTGGTGAAAAGAGGAAAAATTCAA
TTCTCAATCCTATTAACAGTATTCGCAAGTTCAGCATTGTCCAGAAGACACCCCTCC AGATGAATGGCATCGAAGAAGATAGTGACGAGCCGCTGGAGAGACGGCTGAGTCTG GTGCCAGATTCAGAACAGGGGGAGGCCATCCTGCCCCGGATCAGCGTCATTTCCAC AGGCCCCACATTACAAGCACGGCGCCGGCAGAGTGTTTTAAATCTCATGACCCATTC AGTGAACCAGGGCCAAAATATCCACAGGAAGACTACAGCTTCTACCCGGAAAGTGT CTCTGGCCCCTCAGGCCAATCTGACCGAGCTGGACATCTACAGCAGGAGGCTCTCCC AGGAAACAGGGCTGGAAATATCTGAAGAGATTAATGAAGAGGATCTTAAAGAGTGC TTCTTTGATGACATGGAGAGCATCCCCGCGGTGACCACATGGAACACCTACCTTAGA TATATTACTGTCCACAAGAGCCTCATATTTGTCCTCATCTGGTGCCTGGTTATTTTCC TCGCTGAGGTGGCGGCCAGTCTTGTTGTGCTCTGGCTGCTGGGCAACACTCCTCTCC AGGACAAGGGCAATAGTACACACAGCAGAAATAATTCTTATGCCGTCATCATTACA AGCACCTCCAGCTACTACGTGTTCTACATCTATGTGGGCGTGGCTGACACCCTCCTG GCCATGGGTTTCTTCCGGGGCCTGCCTTTGGTGCACACCCTCATCACAGTGTCAAAA ATTCTGCACCATAAAATGCTTCATTCTGTCCTGCAGGCACCCATGAGCACTTTGAAC ACATTGAAGGCTGGCGGCATCCTCAACAGATTTTCTAAAGATATTGCTATCCTGGAT GATCTCCTCCCCCTGACAATCTTTGACTTTATCCAGCTTCTGCTGATCGTGATTGGAG CCATAGCAGTGGTTGCTGTCCTGCAGCCCTACATTTTTGTGGCCACCGTGCCCGTGAT TGTTGCCTTTATTATGCTCAGAGCTTACTTCCTGCAAACTTCTCAACAGCTCAAACAG CTAGAATCTGAGGGCCGGAGCCCCATTTTTACCCACCTGGTGACTTCCCTGAAGGGA CTGTGGACTCTGAGAGCATTCGGGCGACAGCCTTACTTTGAGACACTGTTCCACAAG GCCCTGAACTTGCACACTGCCAACTGGTTTCTTTACCTGAGCACACTCCGCTGGTTCC AGATGCGGATAGAGATGATCTTCGTCATCTTTTTTATAGCTGTAACCTTCATTTCTAT CCTTACAACAGGAGAAGGAGAGGGCAGGGTGGGAATCATCCTCACGCTGGCTATGA ACATAATGTCCACCTTGCAGTGGGCCGTGAATTCCAGTATAGATGTGGATTCTCTAA TGAGGAGTGTCTCCCGGGTGTTTAAATTCATTGATATGCCTACTGAGGGGAAACCCA CCAAGTCAACAAAACCTTATAAGAATGGACAGCTGAGCAAGGTGATGATAATTGAG AACAGCCACGTGAAGAAGGATGACATTTGGCCCAGCGGGGGCCAGATGACTGTGAA GGACCTGACGGCCAAGTACACCGAAGGTGGAAATGCCATTTTGGAAAACATCAGCT TCTCAATCTCTCCTGGGCAGAGAGTTGGATTGCTGGGTCGCACGGGCAGCGGCAAAT CAACCCTGCTCAGTGCCTTCCTTCGGCTCCTGAATACAGAAGGCGAAATCCAAATTG ACGGGGTGAGCTGGGACAGCATCACCCTGCAGCAGTGGAGAAAAGCATTTGGGGTC ATTCCACAGAAAGTTTTCATCTTCTCTGGCACTTTCAGAAAGAACCTGGACCCCTAT GAGCAGTGGAGCGACCAGGAGATCTGGAAGGTTGCAGATGAAGTTGGCCTGCGGAG TGTGATAGAACAATTTCCTGGCAAGCTGGATTTTGTGCTGGTAGATGGAGGCTGCGT 
GCTGTCCCACGGCCACAAACAGCTGATGTGCCTCGCCCGCTCCGTTCTTTCAAAGGC CAAAATCTTGCTTTTGGATGAGCCCAGTGCTCACCTCGACCCAGTGACCTATCAGAT AATCCGCAGGACCTTAAAGCAAGCTTTTGCCGACTGCACCGTCATACTGTGTGAGCA CCGGATTGAAGCAATGCTGGAATGCCAGCAGTTTCTGGTGATCGAGGAGAATAAGG TCCGGCAGTACGACAGCATCCAGAAGTTGTTGAATGAGCGCAGCCTTTTCCGCCAGG CCATCTCCCCATCTGACAGAGTCAAGCTGTTTCCACATAGGAACTCCTCTAAGTGCA AGTCCAAGCCCCAGATCGCTGCCCTCAAGGAGGAAACTGAGGAAGAGGTGCAGGAT ACCCGCCTGTGA
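Coding sequences such as those listed above are printed with spaces introduced by the listing layout, so a basic open-reading-frame check can be useful before a sequence is used as a template. The following is a minimal Python sketch of such a check, offered only as an illustration. The helper names (`clean_sequence`, `check_orf`, `gc_content`) are hypothetical, and the example input is a short toy sequence, not any SEQ ID NO of this listing. The sketch removes whitespace and then reports whether the sequence contains only A, C, G and T, has a length divisible by three, begins with ATG, ends with a stop codon, and is free of in-frame internal stop codons.

```python
import re

# The three standard DNA stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def clean_sequence(raw: str) -> str:
    """Remove all whitespace from a listed sequence and upper-case it."""
    return re.sub(r"\s+", "", raw).upper()


def check_orf(raw: str) -> dict:
    """Run basic open-reading-frame checks on a DNA coding sequence.

    Hypothetical helper for illustration only; it does not translate
    the sequence or compare it against a reference protein.
    """
    seq = clean_sequence(raw)
    # Split into complete in-frame codons, ignoring any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Indices (0-based, in codons) of stop codons before the final codon.
    internal_stops = [i for i, c in enumerate(codons[:-1]) if c in STOP_CODONS]
    return {
        "length": len(seq),
        "only_acgt": set(seq) <= set("ACGT"),
        "multiple_of_three": len(seq) % 3 == 0,
        "starts_with_atg": seq.startswith("ATG"),
        "ends_with_stop": seq[-3:] in STOP_CODONS,
        "internal_stops": internal_stops,
    }


def gc_content(raw: str) -> float:
    """Return the fraction of G and C nucleotides in the cleaned sequence."""
    seq = clean_sequence(raw)
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Toy input with layout spaces; NOT a sequence from this listing.
    toy = "ATGCAG AGAAGC CTGTGA"
    print(check_orf(toy))
    print(gc_content(toy))
```

A full-length sequence from the listing would be pasted in as a single string in place of the toy input. Confirming that two codon optimized sequences encode the same protein would additionally require translating each sequence with a codon table and comparing the results, which this sketch does not attempt.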
Example 3. Additional Exemplary Codon Optimized CFTR Sequences
[0192] The following additional exemplary codon optimized sequences are used for generating human CFTR mRNA for safe and efficacious clinical use:
TABLE-US-00007 (SEQ ID NO: 31) ATGCAGAGAAGCCCCCTGGAGAAGGCCTCTGTGGTGAGCAAGCTGTTCTTCAGCTG GACCAGACCCATCCTGAGAAAGGGCTACAGACAGAGACTGGAGCTGTCTGACATCT ACCAGATCCCCTCTGTGGACTCTGCCGACAACCTGTCTGAGAAGCTGGAGAGAGAG TGGGACAGAGAGCTGGCCAGCAAGAAGAACCCCAAGCTGATCAATGCCCTGAGAA GATGCTTCTTCTGGAGATTCATGTTCTATGGCATCTTCCTGTACCTGGGAGAGGTGAC CAAGGCCGTGCAGCCCCTGCTGCTGGGCAGGATCATTGCCAGCTATGACCCTGACA ACAAGGAGGAGAGAAGCATTGCCATCTACCTGGGCATTGGCCTGTGCCTGCTGTTCA TTGTGAGAACCCTGCTGCTGCACCCTGCCATCTTTGGCCTGCACCACATTGGCATGC AGATGAGAATTGCCATGTTCAGCCTGATCTACAAGAAGACCCTGAAGCTGAGCAGC AGAGTGCTGGACAAGATCAGCATTGGCCAGCTGGTGAGCCTGCTGAGCAACAACCT GAACAAGTTTGATGAGGGCCTGGCCCTGGCCCACTTTGTGTGGATTGCCCCCCTGCA GGTGGCCCTGCTGATGGGCCTGATCTGGGAGCTGCTGCAGGCCTCTGCCTTCTGTGG CCTGGGCTTCCTGATTGTGCTGGCCCTGTTCCAGGCCGGCCTGGGCAGAATGATGAT GAAGTACAGAGACCAGAGAGCCGGCAAGATCTCTGAGAGACTGGTGATCACCTCTG AGATGATTGAGAACATCCAGTCTGTGAAGGCCTACTGCTGGGAGGAGGCCATGGAG AAGATGATTGAGAACCTGAGACAGACAGAGCTGAAGCTGACCAGGAAGGCCGCCTA TGTGAGATACTTCAACAGCTCTGCCTTCTTCTTCTCTGGCTTCTTTGTGGTGTTCCTGT CTGTGCTGCCCTATGCCCTGATCAAGGGCATCATCCTGAGGAAGATCTTCACCACCA TCAGCTTCTGCATTGTGCTGAGGATGGCCGTGACCAGGCAGTTCCCCTGGGCCGTGC AGACCTGGTATGACAGCCTGGGGGCCATCAACAAGATCCAGGACTTCCTGCAGAAG CAGGAGTACAAGACCCTGGAGTACAACCTGACCACCACAGAGGTGGTGATGGAGAA TGTGACAGCCTTCTGGGAGGAGGGCTTTGGAGAGCTGTTTGAGAAGGCCAAGCAGA ACAACAACAACAGAAAGACCAGCAATGGAGATGACAGCCTGTTCTTCAGCAACTTC AGCCTGCTGGGCACCCCTGTGCTGAAGGACATCAACTTCAAGATTGAGAGGGGCCA GCTGCTGGCCGTGGCCGGCAGCACAGGAGCCGGCAAGACCAGCCTGCTGATGGTGA TCATGGGAGAGCTGGAGCCCTCTGAGGGCAAGATCAAGCACTCTGGCAGAATCAGC TTCTGCAGCCAGTTCAGCTGGATCATGCCTGGCACCATCAAGGAGAACATCATCTTT GGGGTGAGCTATGATGAGTACAGGTACAGATCTGTGATCAAGGCCTGCCAGCTGGA GGAGGACATCTCCAAGTTTGCCGAGAAGGACAACATTGTGCTGGGGGAGGGAGGCA TCACCCTGTCTGGGGGCCAGAGAGCCAGAATCAGCCTGGCCAGAGCCGTGTACAAG GATGCCGACCTGTACCTGCTGGACAGCCCCTTTGGCTACCTGGATGTGCTGACAGAG AAGGAGATCTTTGAGAGCTGTGTGTGCAAGCTGATGGCCAACAAGACCAGGATCCT GGTGACCAGCAAGATGGAGCACCTGAAGAAGGCCGACAAGATCCTGATCCTGCATG AGGGCAGCAGCTACTTCTATGGCACCTTCTCTGAGCTGCAGAACCTGCAGCCTGACT 
TCAGCAGCAAGCTGATGGGCTGTGACAGCTTTGACCAGTTCTCTGCTGAGAGAAGA AACAGCATCCTGACAGAGACCCTGCACAGGTTCAGCCTGGAGGGGGATGCCCCTGT GAGCTGGACAGAGACCAAGAAGCAGAGCTTCAAGCAGACAGGAGAGTTTGGGGAG AAGAGGAAGAACAGCATCCTGAACCCCATCAACAGCATCAGGAAGTTCAGCATTGT GCAGAAGACCCCCCTGCAGATGAATGGCATTGAGGAGGACTCTGATGAGCCCCTGG AGAGAAGACTGAGCCTGGTGCCAGACTCTGAGCAGGGAGAGGCCATCCTGCCCAGG ATCTCTGTGATCAGCACAGGCCCCACCCTGCAGGCCAGAAGAAGACAGTCTGTGCT GAACCTGATGACCCACTCTGTGAACCAGGGCCAGAATATCCACAGAAAGACCACAG CCAGCACCAGAAAGGTGAGCCTGGCCCCCCAGGCCAACCTGACAGAGCTGGACATC TACAGCAGAAGGCTGAGCCAGGAGACAGGCCTGGAGATCTCTGAGGAGATCAATGA GGAGGACCTGAAGGAGTGCTTCTTTGATGACATGGAGAGCATCCCTGCCGTGACCA CCTGGAACACCTACCTGAGATACATCACAGTGCACAAGAGCCTGATCTTTGTGCTGA TCTGGTGCCTGGTGATCTTCCTGGCCGAGGTGGCCGCCAGCCTGGTGGTGCTGTGGC TGCTGGGCAACACCCCCCTGCAGGACAAGGGCAACAGCACCCACAGCAGAAACAAC AGCTATGCTGTGATCATCACCAGCACCAGCAGCTACTATGTGTTCTACATCTATGTG GGAGTGGCTGACACCCTGCTGGCCATGGGCTTCTTCAGAGGCCTGCCCCTGGTGCAC ACCCTGATCACAGTGAGCAAGATCCTGCACCACAAGATGCTGCACTCTGTGCTGCAG GCCCCCATGAGCACCCTGAACACCCTGAAGGCTGGAGGCATCCTGAACAGATTCAG CAAGGACATTGCCATCCTGGATGACCTGCTGCCCCTGACCATCTTTGACTTCATCCA GCTGCTGCTGATTGTGATTGGAGCCATTGCCGTGGTGGCCGTGCTGCAGCCCTACAT CTTTGTGGCCACAGTGCCTGTGATTGTGGCCTTCATCATGCTGAGGGCCTACTTCCTG CAGACCAGCCAGCAGCTGAAGCAGCTGGAGTCTGAGGGCAGAAGCCCCATCTTCAC CCACCTGGTGACCAGCCTGAAGGGCCTGTGGACCCTGAGGGCCTTTGGCAGACAGC CCTACTTTGAGACCCTGTTCCACAAGGCCCTGAACCTGCACACAGCCAACTGGTTCC TGTACCTGAGCACCCTGAGATGGTTCCAGATGAGGATTGAGATGATCTTTGTGATCT TCTTCATTGCCGTGACCTTCATCAGCATCCTGACCACAGGGGAGGGCGAGGGCAGA GTGGGCATCATCCTGACCCTGGCCATGAACATCATGAGCACCCTGCAGTGGGCCGTG AACAGCAGCATTGATGTGGACAGCCTGATGAGATCTGTGAGCAGAGTGTTCAAGTT CATTGACATGCCCACAGAGGGCAAGCCCACCAAGAGCACCAAGCCCTACAAGAATG GCCAGCTGAGCAAGGTGATGATCATTGAGAACAGCCATGTGAAGAAGGATGACATC TGGCCCTCTGGAGGCCAGATGACAGTGAAGGACCTGACAGCCAAGTACACAGAGGG GGGCAATGCCATCCTGGAGAACATCAGCTTCAGCATCAGCCCTGGCCAGAGGGTGG GCCTGCTGGGCAGAACAGGCTCTGGCAAGAGCACCCTGCTGTCTGCCTTCCTGAGGC TGCTGAACACAGAGGGAGAGATCCAGATTGATGGGGTGAGCTGGGACAGCATCACC 
CTGCAGCAGTGGAGGAAGGCCTTTGGGGTGATCCCCCAGAAGGTGTTCATCTTCTCT GGCACCTTCAGGAAGAACCTGGACCCCTATGAGCAGTGGTCTGACCAGGAGATCTG GAAGGTGGCCGATGAGGTGGGCCTGAGATCTGTGATTGAGCAGTTCCCTGGCAAGC TGGACTTTGTGCTGGTGGATGGAGGCTGTGTGCTGAGCCATGGCCACAAGCAGCTGA TGTGCCTGGCCAGATCTGTGCTGAGCAAGGCCAAGATCCTGCTGCTGGATGAGCCCT CTGCCCACCTGGACCCTGTGACCTACCAGATCATCAGAAGAACCCTGAAGCAGGCC TTTGCCGACTGCACAGTGATCCTGTGTGAGCACAGAATTGAGGCCATGCTGGAGTGC CAGCAGTTCCTGGTGATTGAGGAGAACAAGGTGAGGCAGTATGACAGCATCCAGAA GCTGCTGAATGAGAGAAGCCTGTTCAGACAGGCCATCAGCCCCTCTGACAGAGTGA AGCTGTTCCCCCACAGGAACAGCAGCAAGTGCAAGAGCAAGCCCCAGATTGCCGCC CTGAAGGAGGAGACAGAGGAGGAGGTGCAGGACACCAGACTGTGA (SEQ ID NO: 32) ATGCAGAGGAGCCCCCTGGAGAAGGCCAGCGTGGTGAGCAAGCTGTTCTTCAGCTG GACCAGGCCCATCCTGAGGAAGGGCTACAGGCAGAGGCTGGAGCTGAGCGACATCT ACCAGATCCCCAGCGTGGACAGCGCCGACAACCTGAGCGAGAAGCTGGAGAGGGA GTGGGACAGGGAGCTGGCCAGCAAGAAGAACCCCAAGCTGATCAACGCCCTGAGG AGGTGCTTCTTCTGGAGGTTCATGTTCTACGGCATCTTCCTGTACCTGGGCGAGGTG ACCAAGGCCGTGCAGCCCCTGCTGCTGGGCAGGATCATCGCCAGCTACGACCCCGA CAACAAGGAGGAGAGGAGCATCGCCATCTACCTGGGCATCGGCCTGTGCCTGCTGT TCATCGTGAGGACCCTGCTGCTGCACCCCGCCATCTTCGGCCTGCACCACATCGGCA TGCAGATGAGGATCGCCATGTTCAGCCTGATCTACAAGAAGACCCTGAAGCTGAGC AGCAGGGTGCTGGACAAGATCAGCATCGGCCAGCTGGTGAGCCTGCTGAGCAACAA CCTGAACAAGTTCGACGAGGGCCTGGCCCTGGCCCACTTCGTGTGGATCGCCCCCCT GCAGGTGGCCCTGCTGATGGGCCTGATCTGGGAGCTGCTGCAGGCCAGCGCCTTCTG CGGCCTGGGCTTCCTGATCGTGCTGGCCCTGTTCCAGGCCGGCCTGGGCAGGATGAT GATGAAGTACAGGGACCAGAGGGCCGGCAAGATCAGCGAGAGGCTGGTGATCACC AGCGAGATGATCGAGAACATCCAGAGCGTGAAGGCCTACTGCTGGGAGGAGGCCAT GGAGAAGATGATCGAGAACCTGAGGCAGACCGAGCTGAAGCTGACCAGGAAGGCC GCCTACGTGAGGTACTTCAACAGCAGCGCCTTCTTCTTCAGCGGCTTCTTCGTGGTGT TCCTGAGCGTGCTGCCCTACGCCCTGATCAAGGGCATCATCCTGAGGAAGATCTTCA CCACCATCAGCTTCTGCATCGTGCTGAGGATGGCCGTGACCAGGCAGTTCCCCTGGG CCGTGCAGACCTGGTACGACAGCCTGGGCGCCATCAACAAGATCCAGGACTTCCTG CAGAAGCAGGAGTACAAGACCCTGGAGTACAACCTGACCACCACCGAGGTGGTGAT GGAGAACGTGACCGCCTTCTGGGAGGAGGGCTTCGGCGAGCTGTTCGAGAAGGCCA AGCAGAACAACAACAACAGGAAGACCAGCAACGGCGACGACAGCCTGTTCTTCAGC 
AACTTCAGCCTGCTGGGCACCCCCGTGCTGAAGGACATCAACTTCAAGATCGAGAG GGGCCAGCTGCTGGCCGTGGCCGGCAGCACCGGCGCCGGCAAGACCAGCCTGCTGA TGGTGATCATGGGCGAGCTGGAGCCCAGCGAGGGCAAGATCAAGCACAGCGGCAG GATCAGCTTCTGCAGCCAGTTCAGCTGGATCATGCCCGGCACCATCAAGGAGAACA TCATCTTCGGCGTGAGCTACGACGAGTACAGGTACAGGAGCGTGATCAAGGCCTGC CAGCTGGAGGAGGACATCAGCAAGTTCGCCGAGAAGGACAACATCGTGCTGGGCGA GGGCGGCATCACCCTGAGCGGCGGCCAGAGGGCCAGGATCAGCCTGGCCAGGGCCG TGTACAAGGACGCCGACCTGTACCTGCTGGACAGCCCCTTCGGCTACCTGGACGTGC TGACCGAGAAGGAGATCTTCGAGAGCTGCGTGTGCAAGCTGATGGCCAACAAGACC AGGATCCTGGTGACCAGCAAGATGGAGCACCTGAAGAAGGCCGACAAGATCCTGAT CCTGCACGAGGGCAGCAGCTACTTCTACGGCACCTTCAGCGAGCTGCAGAACCTGC AGCCCGACTTCAGCAGCAAGCTGATGGGCTGCGACAGCTTCGACCAGTTCAGCGCC GAGAGGAGGAACAGCATCCTGACCGAGACCCTGCACAGGTTCAGCCTGGAGGGCGA CGCCCCCGTGAGCTGGACCGAGACCAAGAAGCAGAGCTTCAAGCAGACCGGCGAGT TCGGCGAGAAGAGGAAGAACAGCATCCTGAACCCCATCAACAGCATCAGGAAGTTC AGCATCGTGCAGAAGACCCCCCTGCAGATGAACGGCATCGAGGAGGACAGCGACG AGCCCCTGGAGAGGAGGCTGAGCCTGGTGCCCGACAGCGAGCAGGGCGAGGCCATC CTGCCCAGGATCAGCGTGATCAGCACCGGCCCCACCCTGCAGGCCAGGAGGAGGCA GAGCGTGCTGAACCTGATGACCCACAGCGTGAACCAGGGCCAGAACATCCACAGGA AGACCACCGCCAGCACCAGGAAGGTGAGCCTGGCCCCCCAGGCCAACCTGACCGAG CTGGACATCTACAGCAGGAGGCTGAGCCAGGAGACCGGCCTGGAGATCAGCGAGG AGATCAACGAGGAGGACCTGAAGGAGTGCTTCTTCGACGACATGGAGAGCATCCCC
GCCGTGACCACCTGGAACACCTACCTGAGGTACATCACCGTGCACAAGAGCCTGAT CTTCGTGCTGATCTGGTGCCTGGTGATCTTCCTGGCCGAGGTGGCCGCCAGCCTGGT GGTGCTGTGGCTGCTGGGCAACACCCCCCTGCAGGACAAGGGCAACAGCACCCACA GCAGGAACAACAGCTACGCCGTGATCATCACCAGCACCAGCAGCTACTACGTGTTC TACATCTACGTGGGCGTGGCCGACACCCTGCTGGCCATGGGCTTCTTCAGGGGCCTG CCCCTGGTGCACACCCTGATCACCGTGAGCAAGATCCTGCACCACAAGATGCTGCAC AGCGTGCTGCAGGCCCCCATGAGCACCCTGAACACCCTGAAGGCCGGCGGCATCCT GAACAGGTTCAGCAAGGACATCGCCATCCTGGACGACCTGCTGCCCCTGACCATCTT CGACTTCATCCAGCTGCTGCTGATCGTGATCGGCGCCATCGCCGTGGTGGCCGTGCT GCAGCCCTACATCTTCGTGGCCACCGTGCCCGTGATCGTGGCCTTCATCATGCTGAG GGCCTACTTCCTGCAGACCAGCCAGCAGCTGAAGCAGCTGGAGAGCGAGGGCAGGA GCCCCATCTTCACCCACCTGGTGACCAGCCTGAAGGGCCTGTGGACCCTGAGGGCCT TCGGCAGGCAGCCCTACTTCGAGACCCTGTTCCACAAGGCCCTGAACCTGCACACCG CCAACTGGTTCCTGTACCTGAGCACCCTGAGGTGGTTCCAGATGAGGATCGAGATGA TCTTCGTGATCTTCTTCATCGCCGTGACCTTCATCAGCATCCTGACCACCGGCGAGG GCGAGGGCAGGGTGGGCATCATCCTGACCCTGGCCATGAACATCATGAGCACCCTG CAGTGGGCCGTGAACAGCAGCATCGACGTGGACAGCCTGATGAGGAGCGTGAGCAG GGTGTTCAAGTTCATCGACATGCCCACCGAGGGCAAGCCCACCAAGAGCACCAAGC CCTACAAGAACGGCCAGCTGAGCAAGGTGATGATCATCGAGAACAGCCACGTGAAG AAGGACGACATCTGGCCCAGCGGCGGCCAGATGACCGTGAAGGACCTGACCGCCAA GTACACCGAGGGCGGCAACGCCATCCTGGAGAACATCAGCTTCAGCATCAGCCCCG GCCAGAGGGTGGGCCTGCTGGGCAGGACCGGCAGCGGCAAGAGCACCCTGCTGAGC GCCTTCCTGAGGCTGCTGAACACCGAGGGCGAGATCCAGATCGACGGCGTGAGCTG GGACAGCATCACCCTGCAGCAGTGGAGGAAGGCCTTCGGCGTGATCCCCCAGAAGG TGTTCATCTTCAGCGGCACCTTCAGGAAGAACCTGGACCCCTACGAGCAGTGGAGC GACCAGGAGATCTGGAAGGTGGCCGACGAGGTGGGCCTGAGGAGCGTGATCGAGC AGTTCCCCGGCAAGCTGGACTTCGTGCTGGTGGACGGCGGCTGCGTGCTGAGCCAC GGCCACAAGCAGCTGATGTGCCTGGCCAGGAGCGTGCTGAGCAAGGCCAAGATCCT GCTGCTGGACGAGCCCAGCGCCCACCTGGACCCCGTGACCTACCAGATCATCAGGA GGACCCTGAAGCAGGCCTTCGCCGACTGCACCGTGATCCTGTGCGAGCACAGGATC GAGGCCATGCTGGAGTGCCAGCAGTTCCTGGTGATCGAGGAGAACAAGGTGAGGCA GTACGACAGCATCCAGAAGCTGCTGAACGAGAGGAGCCTGTTCAGGCAGGCCATCA GCCCCAGCGACAGGGTGAAGCTGTTCCCCCACAGGAACAGCAGCAAGTGCAAGAGC AAGCCCCAGATCGCCGCCCTGAAGGAGGAGACCGAGGAGGAGGTGCAGGACACCA GGCTGTGA (SEQ ID NO: 33) 
ATGCAGAGATCCCCTCTGGAGAAGGCCTCAGTGGTGTCCAAGCTTTTCTTCTCCTGG ACCAGGCCCATTTTAAGAAAGGGCTACAGGCAGAGACTTGAGCTGTCTGACATCTAT CAGATCCCTTCTGTGGATTCTGCTGACAATCTTAGTGAAAAATTGGAAAGGGAGTGG GACAGAGAGCTGGCAAGTAAAAAGAACCCCAAGCTGATTAATGCCCTGAGGCGCTG CTTTTTTTGGAGATTCATGTTCTATGGCATATTCCTCTACCTTGGAGAAGTAACCAAA GCTGTACAGCCTCTCCTCCTTGGCAGAATCATTGCCTCCTATGATCCTGATAACAAG GAGGAGAGAAGCATAGCCATCTACCTGGGCATTGGGCTGTGCCTCTTGTTTATTGTG AGGACCCTTCTCTTGCACCCTGCCATCTTTGGCCTTCATCACATTGGCATGCAAATGA GAATAGCAATGTTTAGTCTTATTTACAAAAAAACATTAAAACTCTCTTCCAGGGTGT TGGACAAGATCAGTATTGGACAACTGGTCAGCCTGCTGAGCAACAACCTGAACAAG TTTGATGAAGGACTGGCCCTGGCCCACTTTGTCTGGATTGCCCCCCTTCAGGTGGCTC TTTTGATGGGCCTGATCTGGGAACTCCTGCAGGCCTCTGCCTTCTGTGGGTTAGGCTT CCTGATAGTGCTAGCTCTCTTTCAGGCAGGGTTGGGTAGAATGATGATGAAGTACAG AGACCAGAGGGCTGGGAAGATATCTGAGAGGCTGGTCATTACTTCTGAAATGATAG AAAACATCCAGTCTGTTAAAGCTTACTGCTGGGAGGAGGCTATGGAAAAGATGATT GAGAACTTGAGGCAAACAGAGCTCAAGCTGACTAGGAAGGCAGCCTATGTCAGGTA TTTCAACAGCAGTGCTTTCTTCTTCTCAGGCTTTTTCGTGGTCTTCTTGAGTGTTCTGC CCTATGCCCTCATCAAGGGGATAATTTTGAGAAAGATTTTCACCACTATTTCCTTTTG CATTGTCCTGAGGATGGCTGTCACCAGGCAATTCCCCTGGGCTGTGCAGACATGGTA TGACTCTCTGGGGGCCATCAACAAAATCCAAGATTTCCTGCAGAAGCAGGAGTACA AGACCCTGGAATACAACCTCACCACCACAGAAGTTGTGATGGAGAATGTGACTGCA TTCTGGGAGGAAGGATTTGGGGAGCTGTTTGAGAAAGCAAAACAAAACAATAATAA CAGGAAAACCAGCAATGGAGATGACTCCCTGTTCTTTTCCAACTTCTCTTTGTTGGG CACCCCTGTCCTGAAAGATATAAACTTTAAAATTGAAAGAGGGCAGCTGTTGGCAGT TGCTGGCTCCACAGGAGCTGGAAAAACTTCACTACTGATGGTGATCATGGGGGAGTT AGAACCCTCTGAAGGGAAAATAAAACATTCTGGGAGGATTAGTTTCTGCAGCCAGT TCAGCTGGATCATGCCTGGGACCATTAAAGAAAATATTATATTTGGAGTGAGCTATG ATGAATATAGATATAGGAGTGTCATCAAAGCCTGTCAGTTGGAGGAAGACATCAGC AAATTTGCAGAGAAAGACAACATTGTTCTGGGTGAAGGTGGCATCACCCTGTCAGG AGGGCAAAGGGCCAGGATCAGCTTGGCCAGAGCAGTCTATAAAGATGCTGATCTGT ACCTCCTGGATAGCCCTTTTGGCTATCTGGATGTTTTGACAGAGAAGGAAATTTTTG AGTCCTGTGTCTGCAAGTTAATGGCAAATAAAACAAGGATACTTGTGACCTCAAAA ATGGAACACCTGAAGAAGGCTGACAAAATTCTGATCCTGCATGAGGGCAGCAGCTA CTTTTATGGAACATTTTCTGAACTGCAGAATTTGCAACCAGACTTTTCATCAAAGCTC 
ATGGGATGTGACAGTTTTGATCAGTTTTCTGCAGAAAGGAGAAACTCCATTTTGACT GAGACCCTGCACAGGTTCAGTCTGGAGGGGGATGCCCCAGTGAGTTGGACTGAGAC AAAGAAACAGAGCTTCAAGCAGACTGGAGAGTTTGGAGAAAAGAGGAAAAACTCA ATTCTCAATCCCATCAATAGCATCAGGAAGTTCAGCATAGTTCAGAAGACTCCTTTG CAGATGAATGGGATTGAAGAGGACTCAGATGAGCCCCTGGAAAGGAGACTCTCCTT GGTGCCAGATTCAGAGCAGGGGGAAGCCATACTGCCAAGGATCTCTGTGATTTCTAC AGGGCCCACCCTCCAAGCAAGAAGGAGACAGTCAGTTTTAAACCTGATGACCCACT CTGTCAACCAGGGACAGAACATTCATAGAAAGACAACAGCATCTACAAGAAAAGTT TCACTGGCCCCTCAAGCCAATTTAACTGAACTAGATATCTACAGCAGGAGGCTCAGC CAAGAAACAGGCCTGGAGATCTCAGAAGAAATAAATGAGGAGGATTTGAAGGAAT GCTTCTTTGATGATATGGAGAGCATCCCAGCTGTCACAACCTGGAACACCTACCTGA GATACATCACAGTGCACAAATCCCTCATCTTTGTACTTATATGGTGCCTTGTCATCTT CTTAGCTGAGGTGGCTGCTTCCCTGGTGGTGCTGTGGCTGCTGGGAAACACACCCCT CCAGGATAAAGGGAACTCTACTCACAGCAGGAACAACAGTTATGCTGTGATCATCA CCAGTACCTCCTCCTACTATGTGTTCTACATTTATGTTGGAGTTGCAGACACATTGCT TGCCATGGGTTTTTTTAGAGGACTCCCCCTGGTGCATACTCTCATCACTGTTTCCAAA ATCCTTCACCACAAGATGCTGCACAGTGTACTACAGGCTCCCATGAGCACCCTCAAC ACTCTTAAAGCAGGAGGAATCTTGAACAGATTTAGCAAGGACATTGCAATTCTTGAT GACCTGCTTCCACTGACCATCTTTGACTTCATCCAGCTTCTGCTCATTGTAATTGGTG CCATTGCTGTGGTAGCAGTGCTCCAGCCATATATTTTTGTGGCCACTGTGCCTGTTAT TGTGGCCTTCATTATGTTGAGAGCCTACTTCCTGCAGACCTCTCAGCAGCTCAAGCA ACTTGAAAGTGAGGGCAGGAGCCCCATATTTACACACTTGGTCACTTCCCTCAAAGG CCTCTGGACACTCAGAGCTTTTGGAAGACAACCTTATTTTGAAACTCTCTTCCACAA GGCTCTGAATCTCCACACAGCCAACTGGTTTCTGTATCTTTCAACACTGCGCTGGTTC CAGATGAGGATTGAGATGATCTTTGTTATCTTCTTCATAGCTGTTACCTTCATCTCTA TTCTGACAACTGGTGAGGGGGAAGGGAGAGTAGGCATCATCCTCACACTAGCCATG AACATAATGTCTACCTTACAATGGGCCGTGAACAGCTCCATAGATGTGGACAGCCTC ATGAGAAGTGTGTCAAGAGTTTTCAAATTCATTGACATGCCCACAGAAGGCAAACC AACCAAGAGCACAAAACCCTACAAGAATGGCCAGCTGAGTAAGGTCATGATCATTG AAAATTCTCATGTGAAGAAGGATGATATTTGGCCCAGTGGGGGCCAGATGACAGTC AAGGACCTCACTGCCAAATACACAGAGGGTGGAAATGCTATCCTAGAGAACATCTC CTTCTCCATCTCCCCAGGCCAAAGAGTTGGCTTGCTGGGCAGGACTGGCAGTGGCAA GTCCACCTTGCTCTCAGCATTTCTCAGGCTTTTAAATACAGAGGGAGAGATTCAAAT TGATGGGGTGTCTTGGGATAGTATAACACTTCAACAGTGGAGGAAAGCCTTTGGTGT 
GATTCCTCAGAAAGTGTTTATCTTCTCTGGCACTTTCAGAAAAAATCTGGACCCCTAT GAACAGTGGAGTGACCAGGAAATCTGGAAGGTGGCAGATGAAGTGGGCCTAAGATC AGTCATAGAGCAGTTTCCTGGAAAGTTGGATTTTGTGCTTGTAGATGGAGGCTGTGT GCTGTCCCATGGCCATAAACAGCTAATGTGCCTGGCTAGGTCAGTGCTGAGCAAGGC CAAGATCCTGCTGTTAGATGAGCCTTCAGCCCATCTGGACCCTGTGACATACCAGAT TATCAGAAGAACTCTGAAGCAGGCCTTTGCTGACTGCACTGTCATCCTGTGTGAGCA CAGAATTGAGGCCATGCTGGAGTGCCAGCAGTTCCTTGTTATAGAAGAGAATAAGG TTAGGCAGTATGACAGCATTCAGAAACTGCTAAATGAAAGATCTCTCTTCAGGCAAG CTATTTCACCATCTGATAGAGTGAAACTTTTTCCCCACAGAAATTCCTCTAAATGTAA ATCTAAGCCCCAGATAGCTGCCTTGAAAGAGGAGACTGAAGAAGAAGTCCAGGACA CCAGACTGTGA (SEQ ID NO: 34) ATGCAGAGATCCCCGCTGGAGAAGGCATCTGTGGTGTCAAAACTGTTCTTTAGCTGG ACAAGGCCCATCCTTAGGAAAGGGTACAGACAGAGGTTGGAGCTGTCAGACATATA TCAGATCCCTTCAGTGGACTCTGCAGACAACCTCTCTGAAAAGCTGGAGAGGGAAT GGGACAGGGAACTGGCCAGCAAAAAAAACCCTAAACTGATTAATGCCCTGAGGAGG TGCTTCTTTTGGAGATTCATGTTCTATGGGATCTTCCTTTACCTGGGGGAGGTGACTA AAGCTGTTCAGCCTCTTCTTCTGGGGAGGATTATTGCCTCCTATGACCCAGACAACA AAGAAGAAAGAAGCATAGCCATTTACTTAGGCATAGGCCTCTGCTTGCTCTTCATAG TTAGAACCCTCCTACTCCACCCAGCCATCTTTGGTCTCCACCACATAGGTATGCAGA TGAGAATAGCAATGTTCTCCTTGATCTACAAGAAGACCCTCAAGCTGTCCAGCAGGG TGCTGGACAAGATCTCCATAGGCCAGTTAGTCAGTCTACTGTCCAATAACTTAAATA AGTTTGATGAGGGACTGGCACTGGCACATTTTGTGTGGATTGCCCCCCTCCAAGTGG
CCCTTCTTATGGGCCTTATCTGGGAGCTGTTGCAGGCCTCTGCTTTCTGTGGCCTGGG TTTCCTCATAGTCCTAGCCTTATTCCAGGCTGGACTGGGCAGAATGATGATGAAGTA TAGGGACCAAAGAGCAGGGAAGATTTCTGAAAGGCTGGTTATAACTTCTGAGATGA TTGAGAACATTCAGTCAGTGAAAGCTTACTGCTGGGAAGAAGCTATGGAAAAAATG ATTGAAAATCTCAGACAGACTGAATTAAAGTTGACCAGGAAAGCTGCTTATGTCAG ATACTTCAACTCCTCAGCCTTCTTTTTTTCTGGCTTCTTTGTTGTATTCCTTTCAGTCC TCCCCTATGCCCTGATTAAGGGCATTATCTTGAGGAAAATTTTCACAACCATCTCCTT TTGTATTGTCCTCAGGATGGCTGTTACAAGGCAATTTCCTTGGGCTGTGCAAACTTG GTATGATAGCCTTGGAGCAATCAACAAGATCCAGGATTTCCTGCAAAAGCAGGAGT ACAAGACATTGGAATACAACCTTACCACCACTGAGGTGGTGATGGAAAATGTGACT GCCTTCTGGGAGGAGGGGTTTGGAGAGCTGTTTGAGAAAGCCAAACAGAACAACAA CAATAGAAAGACCTCTAATGGTGATGATTCCCTGTTCTTTTCTAACTTTAGTCTTCTG GGGACCCCAGTTCTGAAAGATATTAACTTTAAAATTGAAAGGGGACAGTTGCTGGCT GTGGCTGGGTCCACTGGGGCTGGGAAGACAAGCCTGCTCATGGTGATCATGGGAGA GCTGGAACCCAGTGAAGGAAAGATCAAACACTCAGGCAGGATCTCCTTCTGCAGCC AGTTCTCATGGATTATGCCAGGCACTATTAAAGAAAATATCATCTTTGGTGTAAGCT ATGATGAGTACAGGTATAGATCTGTAATTAAAGCCTGCCAGCTGGAGGAAGACATC TCTAAGTTTGCTGAGAAGGATAACATTGTGTTGGGGGAAGGGGGCATCACCCTTTCT GGTGGGCAGAGGGCTAGGATCTCCCTTGCTAGGGCAGTATACAAGGATGCTGACTT GTACCTCTTGGATAGTCCTTTTGGCTACCTAGATGTGCTGACAGAGAAAGAAATATT TGAAAGCTGTGTGTGTAAGCTCATGGCTAACAAGACCAGGATCCTGGTCACCAGTA AAATGGAACACCTCAAAAAAGCAGACAAGATCCTTATTCTCCATGAGGGCTCCTCCT ACTTCTATGGGACCTTCAGTGAGCTGCAGAATCTGCAGCCAGACTTCTCCTCAAAAC TTATGGGCTGTGACTCCTTTGACCAATTCTCTGCAGAAAGAAGGAATAGCATACTGA CAGAAACACTGCATAGATTCTCCCTGGAAGGAGATGCCCCAGTGAGTTGGACAGAA ACCAAAAAGCAGAGCTTCAAGCAGACTGGTGAGTTTGGTGAAAAGAGGAAGAATTC TATCCTGAACCCCATCAATAGCATCAGGAAATTTAGCATAGTCCAAAAGACCCCCCT CCAGATGAATGGAATAGAGGAGGATAGTGATGAGCCTCTTGAGAGAAGGCTGTCCC TGGTTCCAGACAGTGAACAGGGTGAAGCCATTCTTCCGAGGATCAGTGTCATCTCCA CTGGGCCCACATTGCAGGCCAGAAGAAGACAGTCTGTTCTGAATTTGATGACACATT CTGTGAATCAAGGCCAGAATATCCATAGAAAAACCACTGCCAGCACCAGAAAAGTT TCTCTAGCCCCCCAGGCTAACCTGACTGAGTTAGACATCTACAGCAGAAGGCTGAGC CAAGAGACTGGCTTGGAAATATCTGAGGAGATCAATGAGGAGGACCTCAAGGAGTG CTTCTTTGATGACATGGAGTCAATCCCTGCAGTCACTACATGGAACACTTACCTAAG 
GTACATCACAGTTCATAAGAGCCTCATCTTTGTCCTCATATGGTGTCTGGTCATCTTT TTAGCAGAAGTGGCTGCCAGCCTAGTTGTGCTGTGGTTACTGGGCAATACACCTCTT CAGGACAAAGGCAATAGCACACACAGCAGAAACAACTCCTATGCAGTGATCATCAC CTCTACAAGCTCTTACTATGTATTCTATATATATGTGGGAGTGGCAGATACTCTCCTG GCCATGGGATTCTTCAGGGGATTACCTCTAGTTCACACATTGATCACAGTGTCAAAA ATTCTCCACCACAAGATGTTACACAGTGTCCTGCAAGCCCCAATGTCTACTCTGAAC ACACTTAAGGCAGGTGGAATTTTGAATAGGTTTAGCAAGGACATAGCTATCCTGGAT GATCTCCTCCCTCTGACCATCTTTGACTTCATCCAGTTACTGCTCATTGTAATTGGAG CCATTGCAGTGGTAGCAGTCCTACAGCCTTACATTTTTGTGGCTACTGTTCCTGTTAT TGTGGCCTTCATTATGCTAAGAGCTTACTTCCTGCAAACAAGCCAACAGTTGAAACA GCTAGAAAGTGAGGGAAGGTCCCCCATCTTCACCCACCTGGTGACATCACTCAAGG GGCTATGGACTCTTAGGGCTTTTGGGAGACAGCCGTACTTTGAGACCTTATTCCATA AGGCCCTTAACCTCCATACAGCAAACTGGTTCTTATACCTGAGTACTCTGAGGTGGT TTCAAATGAGGATTGAAATGATTTTTGTGATCTTCTTCATTGCTGTGACCTTCATCTC AATCTTGACCACAGGAGAGGGGGAGGGCAGGGTGGGCATCATACTGACCTTGGCCA TGAACATTATGTCAACCCTGCAGTGGGCTGTCAATAGCTCCATTGATGTGGACAGTC TGATGAGGAGTGTCTCCAGGGTCTTCAAGTTTATTGACATGCCAACTGAGGGCAAAC CCACCAAAAGCACTAAGCCATATAAAAATGGCCAACTGTCCAAAGTGATGATCATT GAAAATTCACATGTAAAGAAGGATGATATCTGGCCCTCTGGAGGACAGATGACAGT GAAAGACCTGACTGCCAAGTACACAGAGGGTGGTAATGCCATTCTTGAGAACATTA GTTTCAGTATTTCCCCGGGGCAAAGGGTGGGCCTCCTTGGCAGAACAGGCTCTGGCA AGAGTACCCTGCTGTCAGCCTTTTTAAGACTGTTGAACACTGAGGGAGAAATTCAGA TTGATGGTGTCTCCTGGGATAGCATCACCCTCCAGCAGTGGAGAAAAGCTTTTGGAG TGATCCCGCAAAAGGTTTTCATCTTTTCAGGCACCTTCCGGAAGAACCTGGACCCCT ATGAGCAGTGGTCTGACCAGGAAATATGGAAGGTAGCTGATGAAGTTGGGCTTAGG TCAGTCATAGAGCAGTTCCCAGGCAAACTGGACTTTGTCCTGGTGGATGGTGGATGT GTACTGAGTCATGGGCACAAACAGCTGATGTGCCTAGCCAGGTCTGTGCTCAGCAA GGCAAAGATATTGCTGCTTGATGAACCCAGTGCCCATCTGGACCCAGTCACATATCA GATCATCAGAAGAACATTGAAGCAGGCCTTTGCTGATTGCACAGTTATCCTCTGTGA GCACAGGATTGAGGCCATGCTGGAGTGCCAGCAGTTTCTGGTGATTGAGGAGAATA AAGTAAGGCAGTATGACTCCATCCAGAAGCTGCTCAATGAAAGAAGCCTCTTTAGA CAAGCTATCTCCCCCTCAGACAGGGTCAAATTGTTCCCTCACAGAAACAGCAGCAA GTGCAAGAGCAAGCCCCAAATTGCAGCCTTGAAAGAGGAGACAGAGGAAGAGGTG CAGGACACCAGACTCTGA (SEQ ID NO: 35) ATGCAGAGAAGCCCCCTGGAGAAGGCCAGCGTGGTGAGCAAGCTGTTCTTCAGCTG 
GACCAGACCCATCCTGAGAAAGGGCTACAGACAGAGACTGGAGCTGAGCGACATCT ACCAGATCCCCAGCGTGGACAGCGCCGACAACCTGAGCGAGAAGCTGGAGAGAGA GTGGGACAGAGAGCTGGCCAGCAAGAAGAACCCCAAGCTGATCAACGCCCTGAGA AGATGCTTCTTCTGGAGATTCATGTTCTACGGCATCTTCCTGTACCTGGGCGAGGTG ACCAAGGCCGTGCAGCCCCTGCTGCTGGGCAGAATCATCGCCAGCTACGACCCCGA CAACAAGGAGGAGAGAAGCATCGCCATCTACCTGGGCATCGGCCTGTGCCTGCTGT TCATCGTGAGAACCCTGCTGCTGCACCCCGCCATCTTCGGCCTGCACCACATCGGCA TGCAGATGAGAATCGCCATGTTCAGCCTGATCTACAAGAAGACCCTGAAGCTGAGC AGCAGAGTGCTGGACAAGATCAGCATCGGCCAGCTGGTGAGCCTGCTGAGCAACAA CCTGAACAAGTTCGACGAGGGCCTGGCCCTGGCCCACTTCGTGTGGATCGCCCCCCT GCAGGTGGCCCTGCTGATGGGCCTGATCTGGGAGCTGCTGCAGGCCAGCGCCTTCTG CGGCCTGGGCTTCCTGATCGTGCTGGCCCTGTTCCAGGCCGGCCTGGGCAGAATGAT GATGAAGTACAGAGACCAGAGAGCCGGCAAGATCAGCGAGAGACTGGTGATCACC AGCGAGATGATCGAGAACATCCAGAGCGTGAAGGCCTACTGCTGGGAGGAGGCCAT GGAGAAGATGATCGAGAACCTGAGACAGACCGAGCTGAAGCTGACCAGAAAGGCC GCCTACGTGAGATACTTCAACAGCAGCGCCTTCTTCTTCAGCGGCTTCTTCGTGGTGT TCCTGAGCGTGCTGCCCTACGCCCTGATCAAGGGCATCATCCTGAGAAAGATCTTCA CCACCATCAGCTTCTGCATCGTGCTGAGAATGGCCGTGACCAGACAGTTCCCCTGGG CCGTGCAGACCTGGTACGACAGCCTGGGCGCCATCAACAAGATCCAGGACTTCCTG CAGAAGCAGGAGTACAAGACCCTGGAGTACAACCTGACCACCACCGAGGTGGTGAT GGAGAACGTGACCGCCTTCTGGGAGGAGGGCTTCGGCGAGCTGTTCGAGAAGGCCA AGCAGAACAACAACAACAGAAAGACCAGCAACGGCGACGACAGCCTGTTCTTCAGC AACTTCAGCCTGCTGGGCACCCCCGTGCTGAAGGACATCAACTTCAAGATCGAGAG AGGCCAGCTGCTGGCCGTGGCCGGCAGCACCGGCGCCGGCAAGACCAGCCTGCTGA TGGTGATCATGGGCGAGCTGGAGCCCAGCGAGGGCAAGATCAAGCACAGCGGCAG AATCAGCTTCTGCAGCCAGTTCAGCTGGATCATGCCCGGCACCATCAAGGAGAACA TCATCTTCGGCGTGAGCTACGACGAGTACAGATACAGAAGCGTGATCAAGGCCTGC CAGCTGGAGGAGGACATCAGCAAGTTCGCCGAGAAGGACAACATCGTGCTGGGCGA GGGCGGCATCACCCTGAGCGGCGGCCAGAGAGCCAGAATCAGCCTGGCCAGAGCCG TGTACAAGGACGCCGACCTGTACCTGCTGGACAGCCCCTTCGGCTACCTGGACGTGC TGACCGAGAAGGAGATCTTCGAGAGCTGCGTGTGCAAGCTGATGGCCAACAAGACC AGAATCCTGGTGACCAGCAAGATGGAGCACCTGAAGAAGGCCGACAAGATCCTGAT CCTGCACGAGGGCAGCAGCTACTTCTACGGCACCTTCAGCGAGCTGCAGAACCTGC AGCCCGACTTCAGCAGCAAGCTGATGGGCTGCGACAGCTTCGACCAGTTCAGCGCC GAGAGAAGAAACAGCATCCTGACCGAGACCCTGCACAGATTCAGCCTGGAGGGCGA 
CGCCCCCGTGAGCTGGACCGAGACCAAGAAGCAGAGCTTCAAGCAGACCGGCGAGT TCGGCGAGAAGAGAAAGAACAGCATCCTGAACCCCATCAACAGCATCAGAAAGTTC AGCATCGTGCAGAAGACCCCCCTGCAGATGAACGGCATCGAGGAGGACAGCGACG AGCCCCTGGAGAGAAGACTGAGCCTGGTGCCCGACAGCGAGCAGGGCGAGGCCATC CTGCCCAGAATCAGCGTGATCAGCACCGGCCCCACCCTGCAGGCCAGAAGAAGACA GAGCGTGCTGAACCTGATGACCCACAGCGTGAACCAGGGCCAGAACATCCACAGAA AGACCACCGCCAGCACCAGAAAGGTGAGCCTGGCCCCCCAGGCCAACCTGACCGAG CTGGACATCTACAGCAGAAGACTGAGCCAGGAGACCGGCCTGGAGATCAGCGAGG AGATCAACGAGGAGGACCTGAAGGAGTGCTTCTTCGACGACATGGAGAGCATCCCC GCCGTGACCACCTGGAACACCTACCTGAGATACATCACCGTGCACAAGAGCCTGAT CTTCGTGCTGATCTGGTGCCTGGTGATCTTCCTGGCCGAGGTGGCCGCCAGCCTGGT GGTGCTGTGGCTGCTGGGCAACACCCCCCTGCAGGACAAGGGCAACAGCACCCACA GCAGAAACAACAGCTACGCCGTGATCATCACCAGCACCAGCAGCTACTACGTGTTC TACATCTACGTGGGCGTGGCCGACACCCTGCTGGCCATGGGCTTCTTCAGAGGCCTG CCCCTGGTGCACACCCTGATCACCGTGAGCAAGATCCTGCACCACAAGATGCTGCAC AGCGTGCTGCAGGCCCCCATGAGCACCCTGAACACCCTGAAGGCCGGCGGCATCCT GAACAGATTCAGCAAGGACATCGCCATCCTGGACGACCTGCTGCCCCTGACCATCTT CGACTTCATCCAGCTGCTGCTGATCGTGATCGGCGCCATCGCCGTGGTGGCCGTGCT GCAGCCCTACATCTTCGTGGCCACCGTGCCCGTGATCGTGGCCTTCATCATGCTGAG AGCCTACTTCCTGCAGACCAGCCAGCAGCTGAAGCAGCTGGAGAGCGAGGGCAGAA GCCCCATCTTCACCCACCTGGTGACCAGCCTGAAGGGCCTGTGGACCCTGAGAGCCT
TCGGCAGACAGCCCTACTTCGAGACCCTGTTCCACAAGGCCCTGAACCTGCACACCG CCAACTGGTTCCTGTACCTGAGCACCCTGAGATGGTTCCAGATGAGAATCGAGATGA TCTTCGTGATCTTCTTCATCGCCGTGACCTTCATCAGCATCCTGACCACCGGCGAGG GCGAGGGCAGAGTGGGCATCATCCTGACCCTGGCCATGAACATCATGAGCACCCTG CAGTGGGCCGTGAACAGCAGCATCGACGTGGACAGCCTGATGAGAAGCGTGAGCAG AGTGTTCAAGTTCATCGACATGCCCACCGAGGGCAAGCCCACCAAGAGCACCAAGC CCTACAAGAACGGCCAGCTGAGCAAGGTGATGATCATCGAGAACAGCCACGTGAAG AAGGACGACATCTGGCCCAGCGGCGGCCAGATGACCGTGAAGGACCTGACCGCCAA GTACACCGAGGGCGGCAACGCCATCCTGGAGAACATCAGCTTCAGCATCAGCCCCG GCCAGAGAGTGGGCCTGCTGGGCAGAACCGGCAGCGGCAAGAGCACCCTGCTGAGC GCCTTCCTGAGACTGCTGAACACCGAGGGCGAGATCCAGATCGACGGCGTGAGCTG GGACAGCATCACCCTGCAGCAGTGGAGAAAGGCCTTCGGCGTGATCCCCCAGAAGG TGTTCATCTTCAGCGGCACCTTCAGAAAGAACCTGGACCCCTACGAGCAGTGGAGC GACCAGGAGATCTGGAAGGTGGCCGACGAGGTGGGCCTGAGAAGCGTGATCGAGC AGTTCCCCGGCAAGCTGGACTTCGTGCTGGTGGACGGCGGCTGCGTGCTGAGCCAC GGCCACAAGCAGCTGATGTGCCTGGCCAGAAGCGTGCTGAGCAAGGCCAAGATCCT GCTGCTGGACGAGCCCAGCGCCCACCTGGACCCCGTGACCTACCAGATCATCAGAA GAACCCTGAAGCAGGCCTTCGCCGACTGCACCGTGATCCTGTGCGAGCACAGAATC GAGGCCATGCTGGAGTGCCAGCAGTTCCTGGTGATCGAGGAGAACAAGGTGAGACA GTACGACAGCATCCAGAAGCTGCTGAACGAGAGAAGCCTGTTCAGACAGGCCATCA GCCCCAGCGACAGAGTGAAGCTGTTCCCCCACAGAAACAGCAGCAAGTGCAAGAGC AAGCCCCAGATCGCCGCCCTGAAGGAGGAGACCGAGGAGGAGGTGCAGGACACCA GACTGTGA (SEQ ID NO: 36) ATGCAGCGCAGCCCCCTGGAGAAGGCCAGCGTGGTGAGCAAGCTGTTCTTCAGCTG GACCCGCCCCATCCTGCGCAAGGGCTACCGCCAGCGCCTGGAGCTGAGCGACATCT ACCAGATCCCCAGCGTGGACAGCGCCGACAACCTGAGCGAGAAGCTGGAGCGCGA GTGGGACCGCGAGCTGGCCAGCAAGAAGAACCCCAAGCTGATCAACGCCCTGCGCC GCTGCTTCTTCTGGCGCTTCATGTTCTACGGCATCTTCCTGTACCTGGGCGAGGTGAC CAAGGCCGTGCAGCCCCTGCTGCTGGGCCGCATCATCGCCAGCTACGACCCCGACA ACAAGGAGGAGCGCAGCATCGCCATCTACCTGGGCATCGGCCTGTGCCTGCTGTTCA TCGTGCGCACCCTGCTGCTGCACCCCGCCATCTTCGGCCTGCACCACATCGGCATGC AGATGCGCATCGCCATGTTCAGCCTGATCTACAAGAAGACCCTGAAGCTGAGCAGC CGCGTGCTGGACAAGATCAGCATCGGCCAGCTGGTGAGCCTGCTGAGCAACAACCT GAACAAGTTCGACGAGGGCCTGGCCCTGGCCCACTTCGTGTGGATCGCCCCCCTGCA GGTGGCCCTGCTGATGGGCCTGATCTGGGAGCTGCTGCAGGCCAGCGCCTTCTGCGG 
CCTGGGCTTCCTGATCGTGCTGGCCCTGTTCCAGGCCGGCCTGGGCCGCATGATGAT GAAGTACCGCGACCAGCGCGCCGGCAAGATCAGCGAGCGCCTGGTGATCACCAGCG AGATGATCGAGAACATCCAGAGCGTGAAGGCCTACTGCTGGGAGGAGGCCATGGAG AAGATGATCGAGAACCTGCGCCAGACCGAGCTGAAGCTGACCCGCAAGGCCGCCTA CGTGCGCTACTTCAACAGCAGCGCCTTCTTCTTCAGCGGCTTCTTCGTGGTGTTCCTG AGCGTGCTGCCCTACGCCCTGATCAAGGGCATCATCCTGCGCAAGATCTTCACCACC ATCAGCTTCTGCATCGTGCTGCGCATGGCCGTGACCCGCCAGTTCCCCTGGGCCGTG CAGACCTGGTACGACAGCCTGGGCGCCATCAACAAGATCCAGGACTTCCTGCAGAA GCAGGAGTACAAGACCCTGGAGTACAACCTGACCACCACCGAGGTGGTGATGGAGA ACGTGACCGCCTTCTGGGAGGAGGGCTTCGGCGAGCTGTTCGAGAAGGCCAAGCAG AACAACAACAACCGCAAGACCAGCAACGGCGACGACAGCCTGTTCTTCAGCAACTT CAGCCTGCTGGGCACCCCCGTGCTGAAGGACATCAACTTCAAGATCGAGCGCGGCC AGCTGCTGGCCGTGGCCGGCAGCACCGGCGCCGGCAAGACCAGCCTGCTGATGGTG ATCATGGGCGAGCTGGAGCCCAGCGAGGGCAAGATCAAGCACAGCGGCCGCATCA GCTTCTGCAGCCAGTTCAGCTGGATCATGCCCGGCACCATCAAGGAGAACATCATCT TCGGCGTGAGCTACGACGAGTACCGCTACCGCAGCGTGATCAAGGCCTGCCAGCTG GAGGAGGACATCAGCAAGTTCGCCGAGAAGGACAACATCGTGCTGGGCGAGGGCG GCATCACCCTGAGCGGCGGCCAGCGCGCCCGCATCAGCCTGGCCCGCGCCGTGTAC AAGGACGCCGACCTGTACCTGCTGGACAGCCCCTTCGGCTACCTGGACGTGCTGACC GAGAAGGAGATCTTCGAGAGCTGCGTGTGCAAGCTGATGGCCAACAAGACCCGCAT CCTGGTGACCAGCAAGATGGAGCACCTGAAGAAGGCCGACAAGATCCTGATCCTGC ACGAGGGCAGCAGCTACTTCTACGGCACCTTCAGCGAGCTGCAGAACCTGCAGCCC GACTTCAGCAGCAAGCTGATGGGCTGCGACAGCTTCGACCAGTTCAGCGCCGAGCG CCGCAACAGCATCCTGACCGAGACCCTGCACCGCTTCAGCCTGGAGGGCGACGCCC CCGTGAGCTGGACCGAGACCAAGAAGCAGAGCTTCAAGCAGACCGGCGAGTTCGGC GAGAAGCGCAAGAACAGCATCCTGAACCCCATCAACAGCATCCGCAAGTTCAGCAT CGTGCAGAAGACCCCCCTGCAGATGAACGGCATCGAGGAGGACAGCGACGAGCCCC TGGAGCGCCGCCTGAGCCTGGTGCCCGACAGCGAGCAGGGCGAGGCCATCCTGCCC CGCATCAGCGTGATCAGCACCGGCCCCACCCTGCAGGCCCGCCGCCGCCAGAGCGT GCTGAACCTGATGACCCACAGCGTGAACCAGGGCCAGAACATCCACCGCAAGACCA CCGCCAGCACCCGCAAGGTGAGCCTGGCCCCCCAGGCCAACCTGACCGAGCTGGAC ATCTACAGCCGCCGCCTGAGCCAGGAGACCGGCCTGGAGATCAGCGAGGAGATCAA CGAGGAGGACCTGAAGGAGTGCTTCTTCGACGACATGGAGAGCATCCCCGCCGTGA CCACCTGGAACACCTACCTGCGCTACATCACCGTGCACAAGAGCCTGATCTTCGTGC 
TGATCTGGTGCCTGGTGATCTTCCTGGCCGAGGTGGCCGCCAGCCTGGTGGTGCTGT GGCTGCTGGGCAACACCCCCCTGCAGGACAAGGGCAACAGCACCCACAGCCGCAAC AACAGCTACGCCGTGATCATCACCAGCACCAGCAGCTACTACGTGTTCTACATCTAC GTGGGCGTGGCCGACACCCTGCTGGCCATGGGCTTCTTCCGCGGCCTGCCCCTGGTG CACACCCTGATCACCGTGAGCAAGATCCTGCACCACAAGATGCTGCACAGCGTGCT GCAGGCCCCCATGAGCACCCTGAACACCCTGAAGGCCGGCGGCATCCTGAACCGCT TCAGCAAGGACATCGCCATCCTGGACGACCTGCTGCCCCTGACCATCTTCGACTTCA TCCAGCTGCTGCTGATCGTGATCGGCGCCATCGCCGTGGTGGCCGTGCTGCAGCCCT ACATCTTCGTGGCCACCGTGCCCGTGATCGTGGCCTTCATCATGCTGCGCGCCTACTT CCTGCAGACCAGCCAGCAGCTGAAGCAGCTGGAGAGCGAGGGCCGCAGCCCCATCT TCACCCACCTGGTGACCAGCCTGAAGGGCCTGTGGACCCTGCGCGCCTTCGGCCGCC AGCCCTACTTCGAGACCCTGTTCCACAAGGCCCTGAACCTGCACACCGCCAACTGGT TCCTGTACCTGAGCACCCTGCGCTGGTTCCAGATGCGCATCGAGATGATCTTCGTGA TCTTCTTCATCGCCGTGACCTTCATCAGCATCCTGACCACCGGCGAGGGCGAGGGCC GCGTGGGCATCATCCTGACCCTGGCCATGAACATCATGAGCACCCTGCAGTGGGCC GTGAACAGCAGCATCGACGTGGACAGCCTGATGCGCAGCGTGAGCCGCGTGTTCAA GTTCATCGACATGCCCACCGAGGGCAAGCCCACCAAGAGCACCAAGCCCTACAAGA ACGGCCAGCTGAGCAAGGTGATGATCATCGAGAACAGCCACGTGAAGAAGGACGA CATCTGGCCCAGCGGCGGCCAGATGACCGTGAAGGACCTGACCGCCAAGTACACCG AGGGCGGCAACGCCATCCTGGAGAACATCAGCTTCAGCATCAGCCCCGGCCAGCGC GTGGGCCTGCTGGGCCGCACCGGCAGCGGCAAGAGCACCCTGCTGAGCGCCTTCCT GCGCCTGCTGAACACCGAGGGCGAGATCCAGATCGACGGCGTGAGCTGGGACAGCA TCACCCTGCAGCAGTGGCGCAAGGCCTTCGGCGTGATCCCCCAGAAGGTGTTCATCT TCAGCGGCACCTTCCGCAAGAACCTGGACCCCTACGAGCAGTGGAGCGACCAGGAG ATCTGGAAGGTGGCCGACGAGGTGGGCCTGCGCAGCGTGATCGAGCAGTTCCCCGG CAAGCTGGACTTCGTGCTGGTGGACGGCGGCTGCGTGCTGAGCCACGGCCACAAGC AGCTGATGTGCCTGGCCCGCAGCGTGCTGAGCAAGGCCAAGATCCTGCTGCTGGAC GAGCCCAGCGCCCACCTGGACCCCGTGACCTACCAGATCATCCGCCGCACCCTGAA GCAGGCCTTCGCCGACTGCACCGTGATCCTGTGCGAGCACCGCATCGAGGCCATGCT GGAGTGCCAGCAGTTCCTGGTGATCGAGGAGAACAAGGTGCGCCAGTACGACAGCA TCCAGAAGCTGCTGAACGAGCGCAGCCTGTTCCGCCAGGCCATCAGCCCCAGCGAC CGCGTGAAGCTGTTCCCCCACCGCAACAGCAGCAAGTGCAAGAGCAAGCCCCAGAT CGCCGCCCTGAAGGAGGAGACCGAGGAGGAGGTGCAGGACACCCGCCTGTAA (SEQ ID NO: 37) ATGCAGAGAAGCCCCCTGGAGAAGGCCAGCGTGGTGAGCAAGCTGTTCTTCAGCTG 
GACCAGACCCATCCTGAGAAAGGGCTACAGACAGAGACTGGAGCTGAGCGACATCT ACCAGATCCCCAGCGTGGACAGCGCCGACAACCTGAGCGAGAAGCTGGAGAGAGA GTGGGACAGAGAGCTGGCCAGCAAGAAGAACCCCAAGCTGATCAACGCCCTGAGA AGATGCTTCTTCTGGAGATTCATGTTCTACGGCATCTTCCTGTACCTGGGCGAGGTG ACCAAGGCCGTGCAGCCCCTGCTGCTGGGCAGAATCATCGCCAGCTACGACCCCGA CAACAAGGAGGAGAGAAGCATCGCCATCTACCTGGGCATCGGCCTGTGCCTGCTGT TCATCGTGAGAACCCTGCTGCTGCACCCCGCCATCTTCGGCCTGCACCACATCGGCA TGCAGATGAGAATCGCCATGTTCAGCCTGATCTACAAGAAGACCCTGAAGCTGAGC AGCAGAGTGCTGGACAAGATCAGCATCGGCCAGCTGGTGAGCCTGCTGAGCAACAA CCTGAACAAGTTCGACGAGGGCCTGGCCCTGGCCCACTTCGTGTGGATCGCCCCCCT GCAGGTGGCCCTGCTGATGGGCCTGATCTGGGAGCTGCTGCAGGCCAGCGCCTTCTG CGGCCTGGGCTTCCTGATCGTGCTGGCCCTGTTCCAGGCCGGCCTGGGCAGAATGAT GATGAAGTACAGGGACCAGAGAGCCGGCAAGATCAGCGAGAGACTGGTGATCACC AGCGAGATGATCGAGAACATCCAGAGCGTGAAGGCCTACTGCTGGGAGGAGGCCAT GGAGAAGATGATCGAGAACCTGAGACAGACCGAGCTGAAGCTGACCAGAAAGGCC GCCTACGTGAGATACTTCAACAGCAGCGCCTTCTTCTTCAGCGGCTTCTTCGTGGTGT TCCTGAGCGTGCTGCCCTACGCCCTGATCAAGGGCATCATCCTGAGAAAGATCTTCA CCACCATCAGCTTCTGCATCGTGCTGAGAATGGCCGTGACCAGACAGTTCCCCTGGG CCGTGCAGACCTGGTACGACAGCCTGGGCGCCATCAACAAGATCCAGGACTTCCTG CAGAAGCAGGAGTACAAGACCCTGGAGTACAACCTGACCACCACCGAGGTGGTGAT GGAGAACGTGACCGCCTTCTGGGAGGAGGGCTTCGGCGAGCTGTTCGAGAAGGCCA
AGCAGAACAACAACAACAGAAAGACCAGCAACGGCGACGACAGCCTGTTCTTCAGC AACTTCAGCCTGCTGGGCACCCCCGTGCTGAAGGACATCAACTTCAAGATCGAGAG AGGCCAGCTGCTGGCCGTGGCCGGCAGCACCGGCGCCGGCAAGACCAGCCTGCTGA TGGTGATCATGGGCGAGCTGGAGCCCAGCGAGGGCAAGATCAAGCACAGCGGCAG AATCAGCTTCTGCAGCCAGTTCAGCTGGATCATGCCCGGCACCATCAAGGAGAACA TCATCTTCGGCGTGAGCTACGACGAGTACAGATACAGAAGCGTGATCAAGGCCTGC CAGCTGGAGGAGGACATCAGCAAGTTCGCCGAGAAGGACAACATCGTGCTGGGCGA GGGCGGCATCACCCTGAGCGGCGGCCAGAGAGCCAGAATCAGCCTGGCCAGAGCCG TGTACAAGGACGCCGACCTGTACCTGCTGGACAGCCCCTTCGGCTACCTGGACGTGC TGACCGAGAAGGAGATCTTCGAGAGCTGCGTGTGCAAGCTGATGGCCAACAAGACC AGAATCCTGGTGACCAGCAAGATGGAGCACCTGAAGAAGGCCGACAAGATCCTGAT CCTGCACGAGGGCAGCAGCTACTTCTACGGCACCTTCAGCGAGCTGCAGAACCTGC AGCCCGACTTCAGCAGCAAGCTGATGGGCTGCGACAGCTTCGACCAGTTCAGCGCC GAGAGAAGAAACAGCATCCTGACCGAGACCCTGCACAGATTCAGCCTGGAGGGCGA CGCCCCCGTGAGCTGGACCGAGACCAAGAAGCAGAGCTTCAAGCAGACCGGCGAGT TCGGCGAGAAGAGAAAGAACAGCATCCTGAACCCCATCAACAGCATCAGAAAGTTC AGCATCGTGCAGAAGACCCCCCTGCAGATGAACGGCATCGAGGAGGACAGCGACG AGCCCCTGGAGAGAAGACTGAGCCTGGTGCCCGACAGCGAGCAGGGCGAGGCCATC CTGCCCAGAATCAGCGTGATCAGCACCGGCCCCACCCTGCAGGCCAGAAGAAGACA GAGCGTGCTGAACCTGATGACCCACAGCGTGAACCAGGGCCAGAACATCCACAGAA AGACCACCGCCAGCACCAGAAAGGTGAGCCTGGCCCCCCAGGCCAACCTGACCGAG CTGGACATCTACAGCAGAAGACTGAGCCAGGAGACCGGCCTGGAGATCAGCGAGG AGATCAACGAGGAGGACCTGAAGGAGTGCTTCTTCGACGACATGGAGAGCATCCCC GCCGTGACCACCTGGAACACCTACCTGAGATACATCACCGTGCACAAGAGCCTGAT CTTCGTGCTGATCTGGTGCCTGGTGATCTTCCTGGCCGAGGTGGCCGCCAGCCTGGT GGTGCTGTGGCTGCTGGGCAACACCCCCCTGCAGGACAAGGGCAACAGCACCCACA GCAGAAACAACAGCTACGCCGTGATCATCACCAGCACCAGCAGCTACTACGTGTTC TACATCTACGTGGGCGTGGCCGACACCCTGCTGGCCATGGGCTTCTTCAGAGGCCTG CCCCTGGTGCACACCCTGATCACCGTGAGCAAGATCCTGCACCACAAGATGCTGCAC AGCGTGCTGCAGGCCCCCATGAGCACCCTGAACACCCTGAAGGCCGGCGGCATCCT GAACAGATTCAGCAAGGACATCGCCATCCTGGACGACCTGCTGCCCCTGACCATCTT CGACTTCATCCAGCTGCTGCTGATCGTGATCGGCGCCATCGCCGTGGTGGCCGTGCT GCAGCCCTACATCTTCGTGGCCACCGTGCCCGTGATCGTGGCCTTCATCATGCTGAG AGCCTACTTCCTGCAGACCAGCCAGCAGCTGAAGCAGCTGGAGAGCGAGGGCAGGA GCCCCATCTTCACCCACCTGGTGACCAGCCTGAAGGGCCTGTGGACCCTGAGAGCCT 
TCGGCAGACAGCCCTACTTCGAGACCCTGTTCCACAAGGCCCTGAACCTGCACACCG CCAACTGGTTCCTGTACCTGAGCACCCTGAGATGGTTCCAGATGAGAATCGAGATGA TCTTCGTGATCTTCTTCATCGCCGTGACCTTCATCAGCATCCTGACCACCGGCGAGG GCGAGGGCAGAGTGGGCATCATCCTGACCCTGGCCATGAACATCATGAGCACCCTG CAGTGGGCCGTGAACAGCAGCATCGACGTGGACAGCCTGATGAGAAGCGTGAGCAG AGTGTTCAAGTTCATCGACATGCCCACCGAGGGCAAGCCCACCAAGAGCACCAAGC CCTACAAGAACGGCCAGCTGAGCAAGGTGATGATCATCGAGAACAGCCACGTGAAG AAGGACGACATCTGGCCCAGCGGCGGCCAGATGACCGTGAAGGACCTGACCGCCAA GTACACCGAGGGCGGCAACGCCATCCTGGAGAACATCAGCTTCAGCATCAGCCCCG GCCAGAGAGTGGGCCTGCTGGGCAGAACCGGCAGCGGCAAGAGCACCCTGCTGAGC GCCTTCCTGAGACTGCTGAACACCGAGGGCGAGATCCAGATCGACGGCGTGAGCTG GGACAGCATCACCCTGCAGCAGTGGAGAAAGGCCTTCGGCGTGATCCCCCAGAAGG TGTTCATCTTCAGCGGCACCTTCAGAAAGAACCTGGACCCCTACGAGCAGTGGAGC GACCAGGAGATCTGGAAGGTGGCCGACGAGGTGGGCCTGAGAAGCGTGATCGAGC AGTTCCCCGGCAAGCTGGACTTCGTGCTGGTGGACGGCGGCTGCGTGCTGAGCCAC GGCCACAAGCAGCTGATGTGCCTGGCCAGAAGCGTGCTGAGCAAGGCCAAGATCCT GCTGCTGGACGAGCCCAGCGCCCACCTGGACCCCGTGACCTACCAGATCATCAGAA GAACCCTGAAGCAGGCCTTCGCCGACTGCACCGTGATCCTGTGCGAGCACAGAATC GAGGCCATGCTGGAGTGCCAGCAGTTCCTGGTGATCGAGGAGAACAAGGTGAGACA GTACGACAGCATCCAGAAGCTGCTGAACGAGAGAAGCCTGTTCAGACAGGCCATCA GCCCCAGCGACAGAGTGAAGCTGTTCCCCCACAGAAACAGCAGCAAGTGCAAGAGC AAGCCCCAGATCGCCGCCCTGAAGGAGGAGACCGAGGAGGAGGTGCAGGACACCA GACTGTGA (SEQ ID NO: 38) ATGCAGAGGTCACCTCTGGAAAAGGCTAGCGTGGTCAGCAAGCTATTTTTTTCCTGG ACCCGCCCGATACTCAGGAAGGGCTACCGACAGCGGCTGGAGCTGAGTGACATTTA TCAGATTCCCTCCGTCGATTCCGCTGACAACCTGTCTGAGAAACTGGAGCGGGAATG GGATAGGGAACTGGCGTCCAAAAAAAACCCCAAACTCATCAATGCACTCCGCAGAT GCTTCTTCTGGCGGTTTATGTTTTATGGCATATTCCTGTATCTGGGGGAGGTGACGAA AGCCGTGCAGCCGCTGCTGCTTGGTCGCATTATCGCGTCATACGATCCAGATAACAA GGAGGAAAGAAGTATCGCTATCTATCTCGGGATAGGGCTGTGCCTGCTCTTCATTGT GCGGACTCTTCTCTTGCACCCCGCCATTTTCGGTCTGCATCATATAGGTATGCAGATG AGAATTGCGATGTTCTCATTGATTTACAAAAAAACGCTTAAGCTAAGTTCAAGGGTG CTAGATAAGATATCGATCGGCCAGCTGGTGTCTCTGCTTAGCAACAACCTCAATAAA TTCGACGAAGGCCTTGCACTGGCCCACTTCGTGTGGATCGCCCCTCTGCAGGTGGCT CTGCTGATGGGGTTAATATGGGAGCTGTTGCAGGCCTCCGCTTTTTGTGGCCTGGGG 
TTTCTCATCGTGTTGGCCTTGTTTCAGGCAGGGCTGGGACGTATGATGATGAAATAT AGGGATCAGAGGGCTGGCAAAATCTCTGAGCGCCTGGTTATTACGAGTGAAATGAT TGAGAACATCCAGTCAGTGAAGGCCTATTGCTGGGAGGAGGCCATGGAAAAAATGA TTGAGAACCTACGCCAGACTGAGCTGAAGTTAACCAGAAAAGCCGCCTATGTGCGC TACTTTAACAGTAGCGCATTTTTCTTCTCCGGTTTTTTCGTGGTGTTTCTTAGTGTGTT GCCGTATGCCTTAATCAAGGGAATAATACTCCGGAAGATTTTCACTACCATCAGCTT CTGTATCGTGTTGCGGATGGCCGTCACCCGGCAGTTTCCCTGGGCAGTACAGACTTG GTACGATTCTCTCGGAGCAATTAACAAAATCCAAGACTTTCTACAAAAGCAGGAGT ACAAGACCCTGGAGTACAATCTGACCACCACAGAAGTCGTAATGGAGAATGTAACT GCCTTCTGGGAAGAGGGCTTTGGCGAACTCTTTGAAAAGGCCAAGCAGAACAATAA CAACCGGAAGACCTCCAACGGGGACGACAGCTTATTTTTCAGCAATTTTTCTTTGCT CGGGACCCCTGTACTGAAAGATATTAACTTTAAGATCGAGCGCGGACAACTCCTGGC TGTCGCCGGCAGCACTGGAGCTGGAAAAACATCACTGCTTATGGTGATAATGGGAG AACTCGAACCAAGCGAGGGAAAAATAAAGCACTCTGGACGGATTAGTTTTTGCTCC CAGTTCTCGTGGATAATGCCTGGCACCATTAAGGAGAATATCATCTTTGGAGTGAGT TACGACGAATACCGGTACCGGTCCGTTATCAAGGCTTGTCAACTCGAGGAGGACATT TCTAAATTCGCCGAAAAAGATAATATAGTGCTGGGCGAAGGAGGCATTACACTGAG CGGGGGTCAGAGAGCTCGAATTAGCCTCGCCCGAGCAGTCTATAAAGACGCCGATC TTTACCTGCTGGATTCCCCTTTTGGGTATTTGGATGTTCTGACAGAGAAGGAAATCTT TGAATCATGTGTCTGTAAACTGATGGCCAATAAGACTAGGATTCTAGTGACTTCGAA AATGGAGCACCTGAAAAAAGCGGACAAAATTCTGATACTCCATGAAGGGTCTTCCT ACTTCTACGGCACCTTCTCAGAGTTGCAGAACTTACAACCTGATTTTTCATCTAAGCT TATGGGGTGCGACTCGTTTGACCAGTTCTCCGCTGAAAGACGAAACAGCATCTTAAC GGAAACTCTTCACAGGTTCTCATTAGAGGGAGATGCGCCGGTGTCCTGGACAGAGA CAAAAAAACAGTCTTTCAAACAGACAGGAGAGTTTGGCGAGAAGAGAAAAAACTC AATCCTCAATCCCATCAATTCTATTAGAAAGTTTAGCATCGTCCAAAAAACACCATT GCAGATGAATGGGATTGAGGAGGACAGTGATGAGCCTTTGGAACGAAGACTGTCCC TGGTACCCGATAGCGAACAGGGTGAGGCCATCCTTCCTAGGATCTCGGTCATAAGTA CAGGGCCCACACTGCAGGCCAGGCGACGTCAAAGTGTCCTCAATCTTATGACGCAC AGTGTGAATCAGGGGCAGAACATCCATCGTAAGACGACAGCTTCAACTCGAAAGGT CAGTCTAGCTCCACAAGCCAATCTTACAGAGCTGGACATTTATTCCCGCCGCCTCAG TCAGGAGACCGGATTGGAAATATCAGAGGAAATTAATGAAGAGGATCTGAAGGAAT GCTTCTTTGATGACATGGAATCGATCCCCGCTGTTACTACCTGGAACACATATCTGA GATATATTACCGTCCATAAGAGCTTAATCTTTGTACTGATATGGTGCTTGGTGATTTT 
CCTGGCAGAGGTTGCGGCGAGTTTGGTCGTGCTATGGCTCCTTGGAAACACTCCCCT GCAGGATAAGGGGAACTCCACTCATAGCAGGAATAACAGCTATGCCGTGATCATCA CCTCTACCTCCTCTTATTACGTGTTTTACATATACGTCGGTGTTGCGGATACCCTGTT GGCAATGGGGTTCTTTAGAGGACTACCCCTAGTTCACACCCTGATCACCGTTTCGAA GATCTTGCACCACAAGATGCTTCATAGCGTTCTCCAAGCTCCTATGAGCACCCTTAA TACACTGAAAGCAGGAGGTATCCTTAACCGCTTTTCCAAAGACATCGCTATACTCGA CGATTTGCTCCCATTGACCATCTTCGACTTCATTCAGCTGCTCCTCATTGTGATCGGC GCCATTGCCGTGGTCGCAGTGTTACAGCCATATATTTTCGTAGCCACCGTGCCCGTC ATCGTGGCATTTATCATGCTGCGCGCATATTTCTTACAGACATCTCAGCAACTGAAG CAGCTGGAATCTGAGGGCAGATCTCCTATTTTTACACACCTGGTTACCAGCCTGAAG GGCCTGTGGACCCTGCGTGCTTTCGGTCGCCAACCCTACTTTGAGACTCTCTTCCATA AGGCTCTGAATTTACATACTGCCAATTGGTTCCTATACCTTAGTACCCTTCGGTGGTT CCAGATGCGGATAGAAATGATCTTCGTGATTTTCTTCATCGCAGTCACTTTCATCTCT ATTTTGACGACCGGTGAGGGCGAGGGCAGGGTGGGCATCATTCTGACTTTGGCCATG AACATTATGTCAACACTCCAGTGGGCCGTTAATTCAAGCATTGATGTGGATTCCTTG ATGCGTTCCGTCAGCAGGGTATTTAAATTCATAGACATGCCCACCGAGGGCAAGCCA ACAAAATCTACCAAGCCATACAAAAATGGCCAACTAAGCAAGGTCATGATTATCGA GAATTCTCATGTGAAAAAGGACGACATTTGGCCTTCCGGGGGTCAAATGACTGTAA AGGACCTGACGGCTAAATACACTGAGGGCGGTAATGCTATCTTGGAGAACATCTCTT TCAGCATCTCCCCTGGCCAGAGAGTGGGACTGCTCGGGCGGACAGGCTCCGGAAAG TCTACGCTCCTTTCAGCATTCCTTAGACTTCTGAACACCGAAGGTGAGATTCAGATT
GACGGGGTCTCTTGGGACTCCATCACACTTCAGCAATGGAGGAAGGCATTCGGTGTA ATCCCCCAAAAGGTTTTTATCTTCTCCGGAACATTTCGTAAGAATCTGGACCCGTAC GAGCAGTGGTCAGATCAGGAGATCTGGAAAGTAGCAGACGAGGTCGGGCTACGGA GCGTTATTGAACAGTTTCCTGGCAAACTGGACTTCGTTTTGGTGGACGGAGGCTGTG TGCTGAGTCACGGCCATAAACAACTGATGTGCTTAGCTAGGTCTGTTCTCAGCAAGG CAAAGATTTTACTGCTGGATGAACCAAGCGCCCACCTTGATCCAGTGACATATCAAA TCATCAGAAGAACTCTTAAACAGGCGTTCGCCGACTGCACAGTGATCCTGTGTGAGC ACAGAATAGAAGCCATGCTGGAATGTCAACAGTTTCTCGTGATTGAGGAGAACAAG GTGCGCCAGTACGATAGCATCCAGAAGTTACTCAATGAAAGGTCACTCTTCAGGCA GGCCATCTCACCCAGCGACCGCGTTAAGCTGTTTCCACACCGAAACAGTTCCAAGTG CAAAAGTAAGCCACAGATTGCTGCACTGAAGGAAGAGACAGAAGAAGAAGTTCAG GACACTCGGCTCTGA (SEQ ID NO: 39) ATGCAGAGGAGCCCACTGGAGAAAGCCTCCGTGGTGAGTAAACTCTTTTTTAGTTGG ACCAGACCCATCCTGCGAAAAGGATACAGGCAGCGCCTCGAGTTGTCAGATATCTA CCAGATTCCTTCTGTGGACTCAGCTGACAATTTGAGTGAGAAGCTGGAGCGGGAGTG GGATAGAGAGCTGGCGAGCAAAAAAAACCCCAAGCTTATCAATGCTCTGCGCCGCT GCTTTTTCTGGAGGTTCATGTTTTATGGGATCTTCCTGTACCTGGGGGAGGTCACCAA AGCTGTTCAGCCGCTCCTTCTTGGCCGCATCATCGCCAGCTATGACCCTGATAATAA AGAAGAAAGGTCTATTGCTATTTATCTGGGAATTGGCCTCTGCTTGCTCTTCATCGTC CGCACCCTTCTGCTGCACCCTGCCATTTTTGGCCTTCACCACATCGGCATGCAAATG AGAATTGCCATGTTCTCCCTCATTTACAAAAAGACCCTGAAACTTTCCTCAAGAGTG TTAGATAAAATATCCATTGGTCAGCTGGTCAGCCTGCTGTCCAACAATCTTAACAAA TTTGATGAAGGCTTGGCGCTGGCCCACTTCGTGTGGATTGCACCTCTGCAGGTGGCC CTGTTGATGGGACTTATATGGGAGCTGCTTCAAGCCTCTGCTTTCTGTGGGCTGGGCT TTTTGATTGTACTGGCACTTTTTCAGGCTGGGCTCGGAAGAATGATGATGAAATACA GAGATCAGCGGGCCGGGAAGATATCAGAGCGACTTGTGATCACCAGTGAAATGATT GAAAATATTCAGAGCGTGAAAGCCTACTGCTGGGAAGAAGCCATGGAGAAGATGAT TGAGAACCTGAGGCAGACAGAGCTCAAGCTCACTCGGAAGGCTGCTTATGTTCGCT ATTTCAACAGCAGCGCCTTCTTCTTCAGTGGCTTCTTTGTTGTCTTCCTGTCTGTTCTG CCATATGCACTGATAAAAGGCATTATTTTACGAAAGATCTTCACCACCATCAGTTTT TGCATCGTTCTCAGGATGGCCGTCACAAGACAGTTCCCCTGGGCTGTGCAGACCTGG TACGATTCCTTGGGGGCCATCAACAAGATTCAAGATTTCTTGCAAAAACAAGAATAT AAAACTTTAGAATACAACCTCACCACCACTGAAGTGGTCATGGAAAATGTGACAGC CTTTTGGGAGGAGGGTTTTGGAGAATTGTTCGAGAAGGCAAAGCAGAATAACAACA ACAGGAAGACGAGCAATGGGGACGACTCTCTCTTCTTCAGCAACTTTTCACTGCTCG 
GGACCCCTGTGTTGAAAGATATAAACTTCAAGATCGAGAGGGGCCAGCTCTTGGCT GTGGCAGGCTCCACTGGAGCTGGTAAAACATCTCTTCTCATGGTGATCATGGGGGAA CTGGAGCCTTCCGAAGGAAAAATCAAGCACAGTGGGAGAATCTCATTCTGCAGCCA GTTTTCCTGGATCATGCCCGGCACCATTAAGGAAAACATCATATTTGGAGTGTCCTA TGATGAGTACCGCTACCGGTCAGTCATCAAAGCCTGTCAGTTGGAGGAGGACATCTC CAAGTTTGCAGAGAAAGACAACATTGTGCTTGGAGAGGGGGGTATCACTCTTTCTGG AGGACAAAGAGCCAGGATCTCTTTGGCCCGGGCAGTCTACAAGGATGCAGACCTCT ACTTGTTGGACAGTCCCTTCGGCTACCTCGACGTGCTGACTGAAAAAGAAATTTTTG AAAGCTGTGTGTGCAAACTGATGGCAAACAAGACCAGGATTCTTGTCACCAGCAAG ATGGAACATCTGAAGAAAGCGGACAAAATTCTGATTCTGCATGAAGGGAGCTCCTA CTTCTATGGAACATTTAGCGAGCTTCAGAACCTACAGCCAGACTTCTCCTCCAAATT AATGGGCTGTGACTCCTTCGACCAGTTCTCTGCAGAAAGAAGAAACTCTATACTCAC AGAGACCCTCCACCGCTTCTCCCTTGAGGGAGATGCCCCAGTTTCTTGGACAGAAAC CAAGAAGCAGTCCTTTAAGCAGACTGGCGAGTTTGGTGAAAAGAGGAAAAATTCAA TTCTCAATCCAATTAACAGTATTCGCAAGTTCAGCATTGTCCAGAAGACACCCCTCC AGATGAATGGCATCGAAGAAGATAGTGACGAGCCGCTGGAGAGACGGCTGAGTCTG GTGCCAGATTCAGAACAGGGGGAGGCCATCCTGCCCCGGATCAGCGTCATTTCCAC AGGCCCCACATTACAAGCACGGCGCCGGCAGAGTGTTTTAAATCTCATGACCCATTC AGTGAACCAGGGCCAAAATATCCACAGGAAGACTACAGCTTCTACCCGGAAAGTGT CTCTGGCCCCTCAGGCCAATCTGACCGAGCTGGACATCTACAGCAGGAGGCTCTCCC AGGAAACAGGGCTGGAAATATCTGAAGAGATTAATGAAGAGGATCTTAAAGAGTGC TTCTTTGATGACATGGAGAGCATCCCCGCGGTGACCACATGGAACACCTACCTTAGA TATATTACTGTCCACAAGAGCCTCATATTTGTCCTCATCTGGTGCCTGGTTATTTTCC TCGCTGAGGTGGCGGCCAGTCTTGTTGTGCTCTGGCTGCTGGGCAACACTCCTCTCC AGGACAAGGGCAATAGTACTCACAGCAGAAATAATTCTTATGCCGTCATCATTACA AGCACCTCCAGCTACTACGTGTTCTACATCTATGTGGGCGTGGCTGACACCCTCCTG GCCATGGGTTTCTTCCGGGGCCTGCCTTTGGTGCACACCCTCATCACAGTGTCAAAA ATTCTGCACCATAAAATGCTTCATTCTGTCCTGCAGGCACCCATGAGCACTTTGAAC ACATTGAAGGCTGGCGGCATCCTCAACAGATTTTCTAAAGATATTGCTATCCTGGAT GATCTCCTCCCCCTGACAATCTTTGACTTTATCCAGCTTCTGCTGATCGTGATTGGAG CCATAGCAGTGGTTGCTGTCCTGCAGCCCTACATTTTTGTGGCCACCGTGCCCGTGAT TGTTGCCTTTATTATGCTCAGAGCTTACTTCCTGCAAACTTCTCAACAGCTCAAACAG CTAGAATCTGAGGGCCGGAGCCCCATTTTTACCCACCTGGTGACTTCCCTGAAGGGA CTGTGGACTCTGAGAGCATTCGGGCGACAGCCTTACTTTGAGACACTGTTCCACAAG 
GCCCTGAACTTGCACACTGCCAACTGGTTTCTTTACCTGAGCACACTCCGCTGGTTCC AGATGCGGATAGAGATGATCTTCGTCATCTTTTTTATAGCTGTAACCTTCATTTCTAT CCTTACAACAGGAGAAGGAGAGGGCAGGGTGGGAATCATCCTCACGCTGGCTATGA ACATAATGTCCACCTTGCAGTGGGCCGTGAATTCCAGTATAGATGTGGATTCTCTAA TGAGGAGTGTCTCCCGGGTGTTTAAATTCATTGATATGCCTACTGAGGGGAAACCCA CCAAGTCAACAAAACCTTATAAGAATGGACAGCTGAGCAAGGTGATGATAATTGAG AACAGCCACGTGAAGAAGGATGACATTTGGCCCAGCGGGGGCCAGATGACTGTGAA GGACCTGACGGCCAAGTACACCGAAGGTGGAAATGCCATTTTGGAAAACATCAGCT TCTCAATCTCTCCTGGGCAGAGAGTTGGATTGCTGGGTCGCACGGGCAGCGGCAAAT CAACCCTGCTCAGTGCCTTCCTTCGGCTCCTGAATACAGAAGGCGAAATCCAAATTG ACGGGGTGAGCTGGGACAGCATCACCCTGCAGCAGTGGAGAAAAGCATTTGGGGTC ATTCCACAGAAAGTTTTCATCTTCTCTGGCACTTTCAGAAAGAACCTGGACCCCTAT GAGCAGTGGAGCGACCAGGAGATCTGGAAGGTTGCAGATGAAGTTGGCCTGCGGAG TGTGATAGAACAATTTCCTGGCAAGCTGGATTTTGTGCTGGTAGATGGAGGCTGCGT GCTGTCCCACGGCCACAAACAGCTGATGTGCCTCGCCCGCTCCGTTCTTTCAAAGGC CAAAATCTTGCTTTTGGATGAGCCCAGTGCTCACCTCGACCCAGTGACCTATCAGAT AATCCGCAGGACCTTAAAGCAAGCTTTTGCCGACTGCACCGTCATACTGTGTGAGCA CCGGATTGAAGCAATGCTGGAATGCCAGCAGTTTCTGGTGATCGAGGAGAATAAGG TCCGGCAGTACGACAGCATCCAGAAGTTGTTGAATGAGCGCAGCCTTTTCCGCCAGG CCATCTCCCCATCTGACAGAGTCAAGCTGTTTCCACATAGGAACTCCTCTAAGTGCA AGTCCAAGCCCCAGATCGCTGCCCTCAAGGAGGAAACTGAGGAAGAGGTGCAGGAT ACCCGCCTGTGA (SEQ ID NO: 40) ATGCAACGGAGTCCTCTGGAAAAAGCCTCTGTCGTATCTAAGCTTTTCTTCAGTTGG ACACGCCCGATTTTGAGAAAGGGTTATCGGCAACGCTTGGAACTTAGTGACATCTAC CAAATTCCAAGTGTAGACTCAGCCGATAACTTGAGCGAAAAGCTCGAACGAGAGTG GGATCGAGAACTGGCTAGCAAAAAAAATCCCAAACTCATAAATGCCCTGCGACGCT GTTTCTTTTGGCGATTTATGTTTTACGGTATTTTCCTTTATTTGGGTGAGGTCACGAA GGCTGTACAGCCACTGCTGCTGGGTCGCATCATTGCCTCTTACGACCCTGACAACAA AGAGGAGCGGTCAATAGCTATCTACCTTGGTATAGGACTTTGCTTGCTCTTCATAGT CCGCACGTTGCTTCTCCACCCTGCTATATTTGGTCTCCATCACATTGGGATGCAAATG CGGATCGCGATGTTCAGTCTTATATATAAAAAGACTCTTAAACTTTCCAGCCGGGTT CTGGATAAGATCTCTATTGGTCAACTGGTATCTCTTTTGTCTAACAACCTGAATAAGT TCGACGAGGGCCTTGCATTGGCCCATTTTGTATGGATTGCCCCTTTGCAAGTCGCCCT CCTGATGGGATTGATCTGGGAACTCCTGCAAGCTAGTGCTTTTTGCGGATTGGGATT CCTCATAGTCCTTGCGCTCTTTCAGGCGGGACTTGGACGCATGATGATGAAGTATCG 
CGACCAACGAGCTGGCAAGATCAGTGAACGGCTTGTAATAACCAGTGAAATGATAG AGAACATCCAGAGCGTAAAAGCTTACTGTTGGGAAGAAGCGATGGAAAAGATGATT GAGAACCTTCGCCAGACAGAACTTAAACTTACACGAAAGGCCGCTTATGTCCGGTA CTTCAACTCTTCAGCATTTTTTTTTAGTGGCTTCTTTGTAGTGTTCCTGTCCGTCCTTC CGTATGCACTTATCAAGGGTATAATACTTAGGAAAATCTTCACAACAATCAGTTTTT GCATAGTCCTTCGCATGGCAGTAACTCGCCAATTTCCCTGGGCAGTTCAGACGTGGT ACGACTCACTTGGCGCAATTAACAAAATTCAAGATTTCCTCCAAAAGCAAGAGTATA AAACCTTGGAATACAACCTTACCACCACAGAAGTTGTAATGGAAAATGTCACAGCC TTCTGGGAGGAAGGTTTCGGCGAACTTTTTGAGAAGGCGAAGCAAAATAACAATAA TCGGAAAACATCAAACGGTGACGATTCACTGTTCTTTTCTAACTTTAGCCTTCTTGGG ACGCCCGTCCTGAAGGACATAAACTTTAAGATTGAACGGGGTCAACTTCTCGCGGTC GCAGGGAGTACTGGAGCGGGGAAAACGAGCCTGCTGATGGTGATAATGGGGGAGTT GGAGCCCTCAGAAGGCAAGATCAAGCATAGTGGTAGAATTAGCTTCTGCAGTCAAT TTAGTTGGATTATGCCGGGCACGATCAAAGAAAATATAATCTTTGGGGTATCCTACG ATGAATACAGGTACCGATCAGTGATAAAAGCGTGCCAGCTTGAAGAAGACATTTCA AAGTTTGCTGAGAAGGATAATATCGTACTTGGAGAAGGAGGTATCACCCTGTCTGG GGGTCAACGAGCGAGGATCTCCCTGGCACGCGCCGTCTACAAGGACGCGGACCTCT ATCTGTTGGATTCACCGTTCGGATATTTGGACGTGCTTACGGAGAAAGAAATATTTG AGAGCTGTGTTTGCAAGCTCATGGCAAATAAAACCAGAATATTGGTTACAAGCAAG ATGGAGCATCTTAAGAAAGCAGATAAAATCCTGATATTGCACGAGGGCTCTTCATAC TTCTACGGGACGTTTTCTGAGTTGCAGAACCTCCAGCCGGATTTCAGCTCTAAGCTG
ATGGGCTGTGATTCCTTTGATCAGTTTAGTGCGGAAAGACGAAACAGTATACTCACC GAAACACTGCACAGGTTCTCTCTGGAGGGCGACGCCCCGGTTTCCTGGACAGAGAC GAAGAAGCAGTCCTTCAAACAGACAGGCGAGTTTGGGGAGAAAAGGAAAAATAGC ATACTCAACCCGATTAACAGCATTCGCAAGTTCAGTATAGTACAAAAGACCCCGTTG CAGATGAACGGTATAGAGGAAGATTCTGATGAGCCACTGGAAAGACGGCTTTCTCT CGTTCCGGACAGTGAACAGGGAGAGGCAATACTGCCTCGGATCAGCGTTATCTCTAC AGGACCTACTTTGCAAGCTCGGCGCCGACAGTCAGTCTTGAATCTTATGACTCATAG TGTTAATCAAGGCCAGAATATCCATCGCAAGACCACCGCAAGTACAAGGAAAGTGA GCTTGGCACCTCAAGCAAACCTTACTGAACTTGATATCTACTCACGGCGACTTTCAC AGGAGACCGGACTTGAAATTAGTGAAGAAATTAACGAGGAGGACCTCAAGGAGTGC TTCTTCGATGACATGGAATCAATCCCCGCAGTCACAACCTGGAACACTTATCTGAGG TATATAACAGTTCACAAGAGCCTCATTTTTGTACTTATTTGGTGTTTGGTAATTTTCC TGGCGGAGGTTGCTGCTTCTTTGGTCGTCCTTTGGCTCCTCGGGAATACACCGCTCCA AGACAAAGGCAACTCTACCCATAGTAGGAACAATTCATATGCAGTGATTATAACCA GTACATCATCTTATTACGTTTTCTATATTTATGTCGGGGTAGCTGACACGCTGTTGGC GATGGGCTTCTTTAGGGGCCTCCCCTTGGTACACACCCTTATCACGGTGAGTAAAAT CCTGCATCACAAAATGCTTCATTCTGTACTCCAAGCGCCGATGAGTACGCTTAATAC GCTGAAAGCAGGAGGGATACTGAATCGGTTCAGCAAGGACATCGCCATTCTGGATG ACCTGCTTCCATTGACAATATTTGATTTCATTCAGCTCCTTCTCATAGTTATTGGAGC CATAGCGGTGGTGGCTGTGCTTCAGCCTTATATATTCGTTGCCACAGTTCCCGTTATA GTGGCATTTATAATGCTCAGGGCCTACTTTCTCCAGACTTCCCAGCAGTTGAAGCAA CTCGAATCAGAAGGAAGGTCACCTATTTTCACACATCTTGTGACTTCCTTGAAGGGC TTGTGGACGCTGCGGGCCTTCGGAAGACAACCATATTTTGAAACTCTCTTCCACAAA GCTTTGAATCTTCATACTGCGAACTGGTTCCTGTATTTGAGTACTTTGCGCTGGTTCC AGATGAGGATAGAAATGATATTCGTTATCTTCTTTATCGCGGTTACGTTCATAAGTA TCCTCACTACGGGGGAGGGTGAGGGTAGAGTGGGCATAATACTGACCCTCGCCATG AACATTATGTCCACCCTGCAGTGGGCGGTAAACAGCAGCATAGATGTGGATTCTTTG ATGCGCAGTGTGAGCAGGGTTTTTAAGTTTATCGATATGCCGACGGAAGGAAAGCC CACTAAAAGCACGAAACCCTATAAAAATGGACAGCTTAGCAAAGTAATGATAATCG AGAATAGCCATGTGAAAAAGGATGACATATGGCCTTCCGGAGGCCAAATGACTGTT AAAGATCTGACCGCTAAATATACCGAGGGCGGCAACGCAATACTCGAAAACATAAG CTTTTCCATAAGCCCCGGCCAACGCGTGGGTCTTCTGGGGAGGACTGGCTCCGGAAA ATCAACGTTGCTTAGCGCGTTTTTGCGGCTCCTTAACACTGAAGGTGAGATCCAAAT AGATGGCGTTAGTTGGGACTCTATAACACTGCAACAATGGCGGAAAGCTTTCGGCGT 
CATACCTCAGAAGGTGTTCATCTTTAGCGGAACGTTCAGGAAGAACTTGGATCCCTA CGAACAATGGAGTGATCAAGAAATATGGAAAGTGGCAGATGAGGTAGGCTTGCGCA GTGTCATTGAACAATTCCCAGGGAAACTCGACTTTGTACTGGTGGACGGCGGTTGCG TCTTGTCACACGGGCACAAACAGTTGATGTGTTTGGCCCGCAGTGTTTTGTCTAAGG CGAAGATTCTGTTGCTCGACGAACCGAGTGCTCATCTTGATCCCGTCACCTACCAAA TCATCAGAAGGACGTTGAAGCAAGCTTTCGCCGACTGCACTGTAATCCTTTGTGAGC ATAGGATCGAAGCAATGCTCGAGTGCCAACAGTTCTTGGTTATAGAGGAGAATAAG GTTCGGCAATACGACTCAATACAGAAACTGCTTAATGAGCGGTCACTCTTTCGACAA GCTATCTCTCCTAGTGACAGGGTAAAGCTTTTTCCTCATCGGAATTCCAGCAAGTGT AAGAGTAAACCACAGATCGCCGCCCTTAAAGAGGAGACCGAAGAAGAGGTGCAGG ATACGAGACTTTAG
EQUIVALENTS
[0193] Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. The scope of the present invention is not intended to be limited to the above Description, but rather is as set forth in the following claims:
Sequence CWU
1
1
40 1 4443 RNA Artificial Sequence chemically synthesized oligonucleotide mRNA (1)..(4443) Coding Sequence mRNA (1)..(4443) 1
augcaacgcu
cuccucuuga aaaggccucg guggugucca agcucuucuu cucguggacu 60agacccaucc
ugagaaaggg guacagacag cgcuuggagc uguccgauau cuaucaaauc 120ccuuccgugg
acuccgcgga caaccugucc gagaagcucg agagagaaug ggacagagaa 180cucgccucaa
agaagaaccc gaagcugauu aaugcgcuua ggcggugcuu uuucuggcgg 240uucauguucu
acggcaucuu ccucuaccug ggagagguca ccaaggccgu gcagccccug 300uugcugggac
ggauuauugc cuccuacgac cccgacaaca aggaagaaag aagcaucgcu 360aucuacuugg
gcaucggucu gugccugcuu uucaucgucc ggacccucuu guugcauccu 420gcuauuuucg
gccugcauca cauuggcaug cagaugagaa uugccauguu uucccugauc 480uacaagaaaa
cucugaagcu cucgagccgc gugcuugaca agauuuccau cggccagcuc 540gugucccugc
ucuccaacaa ucugaacaag uucgacgagg gccucgcccu ggcccacuuc 600guguggaucg
ccccucugca aguggcgcuu cugaugggcc ugaucuggga gcugcugcaa 660gccucggcau
ucugugggcu uggauuccug aucgugcugg cacuguucca ggccggacug 720gggcggauga
ugaugaagua cagggaccag agagccggaa agauuuccga acggcuggug 780aucacuucgg
aaaugaucga aaacauccag ucagugaagg ccuacugcug ggaagaggcc 840auggaaaaga
ugauugaaaa ccuccggcaa accgagcuga agcugacccg caaggccgcu 900uacgugcgcu
auuucaacuc guccgcuuuc uucuucuccg gguucuucgu gguguuucuc 960uccgugcucc
ccuacgcccu gauuaaggga aucauccuca ggaagaucuu caccaccauu 1020uccuucugua
ucgugcuccg cauggccgug acccggcagu ucccaugggc cgugcagacu 1080ugguacgacu
cccugggagc cauuaacaag auccaggacu uccuucaaaa gcaggaguac 1140aagacccucg
aguacaaccu gacuacuacc gaggucguga uggaaaacgu caccgccuuu 1200ugggaggagg
gauuuggcga acuguucgag aaggccaagc agaacaacaa caaccgcaag 1260accucgaacg
gugacgacuc ccucuucuuu ucaaacuuca gccugcucgg gacgcccgug 1320cugaaggaca
uuaacuucaa gaucgaaaga ggacagcucc uggcgguggc cggaucgacc 1380ggagccggaa
agacuucccu gcugauggug aucaugggag agcuugaacc uagcgaggga 1440aagaucaagc
acuccggccg caucagcuuc uguagccagu uuuccuggau caugcccgga 1500accauuaagg
aaaacaucau cuucggcgug uccuacgaug aauaccgcua ccgguccgug 1560aucaaagccu
gccagcugga agaggauauu ucaaaguucg cggagaaaga uaacaucgug 1620cugggcgaag
gggguauuac cuugucgggg ggccagcggg cuagaaucuc gcuggccaga 1680gccguguaua
aggacgccga ccuguaucuc cuggacuccc ccuucggaua ccuggacguc 1740cugaccgaaa
aggagaucuu cgaaucgugc gugugcaagc ugauggcuaa caagacucgc 1800auccucguga
ccuccaaaau ggagcaccug aagaaggcag acaagauucu gauucugcau 1860gagggguccu
ccuacuuuua cggcaccuuc ucggaguugc agaacuugca gcccgacuuc 1920ucaucgaagc
ugauggguug cgacagcuuc gaccaguucu ccgccgaaag aaggaacucg 1980auccugacgg
aaaccuugca ccgcuucucu uuggaaggcg acgccccugu gucauggacc 2040gagacuaaga
agcagagcuu caagcagacc ggggaauucg gcgaaaagag gaagaacagc 2100aucuugaacc
ccauuaacuc cauccgcaag uucucaaucg ugcaaaagac gccacugcag 2160augaacggca
uugaggagga cuccgacgaa ccccuugaga ggcgccuguc ccuggugccg 2220gacagcgagc
agggagaagc cauccugccu cggauuuccg ugaucuccac ugguccgacg 2280cuccaagccc
ggcggcggca guccgugcug aaccugauga cccacagcgu gaaccagggc 2340caaaacauuc
accgcaagac uaccgcaucc acccggaaag ugucccuggc accucaagcg 2400aaucuuaccg
agcucgacau cuacucccgg agacugucgc aggaaaccgg gcucgaaauu 2460uccgaagaaa
ucaacgagga ggaucugaaa gagugcuucu ucgacgauau ggagucgaua 2520cccgccguga
cgacuuggaa cacuuaucug cgguacauca cugugcacaa gucauugauc 2580uucgugcuga
uuuggugccu ggugauuuuc cuggccgagg ucgcggccuc acugguggug 2640cucuggcugu
ugggaaacac gccucugcaa gacaagggaa acuccacgca cucgagaaac 2700aacagcuaug
ccgugauuau cacuuccacc uccucuuauu acguguucua caucuacguc 2760ggaguggcgg
auacccugcu cgcgaugggu uucuucagag gacugccgcu gguccacacc 2820uugaucaccg
ucagcaagau ucuucaccac aagauguugc auagcgugcu gcaggccccc 2880auguccaccc
ucaacacucu gaaggccgga ggcauucuga acagauucuc caaggacauc 2940gcuauccugg
acgaucuccu gccgcuuacc aucuuugacu ucauccagcu gcugcugauc 3000gugauuggag
caaucgcagu gguggcggug cugcagccuu acauuuucgu ggccacugug 3060ccggucauug
uggcguucau caugcugcgg gccuacuucc uccaaaccag ccagcagcug 3120aagcaacugg
aauccgaggg acgauccccc aucuucacuc accuugugac gucguugaag 3180ggacugugga
cccuccgggc uuucggacgg cagcccuacu ucgaaacccu cuuccacaag 3240gcccugaacc
uccacaccgc caauugguuc cuguaccugu ccacccugcg gugguuccag 3300augcgcaucg
agaugauuuu cgucaucuuc uucaucgcgg ucacauucau cagcauccug 3360acuaccggag
agggagaggg acgggucgga auaauccuga cccucgccau gaacauuaug 3420agcacccugc
agugggcagu gaacagcucg aucgacgugg acagccugau gcgaagcguc 3480agccgcgugu
ucaaguucau cgacaugccu acugagggaa aacccacuaa guccacuaag 3540cccuacaaaa
auggccagcu gagcaagguc augaucaucg aaaacuccca cgugaagaag 3600gacgauauuu
ggcccuccgg aggucaaaug accgugaagg accugaccgc aaaguacacc 3660gagggaggaa
acgccauucu cgaaaacauc agcuucucca uuucgccggg acagcggguc 3720ggccuucucg
ggcggaccgg uuccgggaag ucaacucugc ugucggcuuu ccuccggcug 3780cugaauaccg
agggggaaau ccaaauugac ggcgugucuu gggauuccau uacucugcag 3840caguggcgga
aggccuucgg cgugaucccc cagaaggugu ucaucuucuc ggguaccuuc 3900cggaagaacc
uggauccuua cgagcagugg agcgaccaag aaaucuggaa ggucgccgac 3960gaggucggcc
ugcgcuccgu gauugaacaa uuuccuggaa agcuggacuu cgugcucguc 4020gacgggggau
guguccuguc gcacggacau aagcagcuca ugugccucgc acgguccgug 4080cucuccaagg
ccaagauucu gcugcuggac gaaccuucgg cccaccugga uccggucacc 4140uaccagauca
ucaggaggac ccugaagcag gccuuugccg auugcaccgu gauucucugc 4200gagcaccgca
ucgaggccau gcuggagugc cagcaguucc uggucaucga ggagaacaag 4260guccgccaau
acgacuccau ucaaaagcuc cucaacgagc ggucgcuguu cagacaagcu 4320auuucaccgu
ccgauagagu gaagcucuuc ccgcaucgga acagcucaaa gugcaaaucg 4380aagccgcaga
ucgcagccuu gaaggaagag acugaggaag aggugcagga cacccggcuu 4440uaa
4443
2 4443 RNA Artificial Sequence chemically synthesized oligonucleotide
2 augcagcggu ccccgcucga aaaggccagu gucgugucca aacucuucuu cucauggacu
60cggccuaucc uuagaaaggg guaucggcag aggcuugagu ugucugacau cuaccagauc
120cccucgguag auucggcgga uaaccucucg gagaagcucg aacgggaaug ggaccgcgaa
180cucgcgucua agaaaaaccc gaagcucauc aacgcacuga gaaggugcuu cuucuggcgg
240uucauguucu acgguaucuu cuuguaucuc ggggagguca caaaagcagu ccaaccccug
300uuguuggguc gcauuaucgc cucguacgac cccgauaaca aagaagaacg gagcaucgcg
360aucuaccucg ggaucggacu guguuugcuu uucaucguca gaacacuuuu guugcaucca
420gcaaucuucg gccuccauca caucgguaug cagaugcgaa ucgcuauguu uagcuugauc
480uacaaaaaga cacugaaacu cucgucgcgg guguuggaua agauuuccau cggucaguug
540gugucccugc uuaguaauaa ccucaacaaa uucgaugagg gacuggcgcu ggcacauuuc
600guguggauug ccccguugca agucgcccuu uugaugggcc uuauuuggga gcuguugcag
660gcaucugccu uuuguggccu gggauuucug auuguguugg cauuguuuca ggcugggcuu
720gggcggauga ugaugaagua ucgcgaccag agagcgggua aaaucucgga aagacucguc
780aucacuucgg aaaugaucga aaacauccag ucggucaaag ccuauugcug ggaagaagcu
840auggagaaga ugauugaaaa ccuccgccaa acugagcuga aacugacccg caaggcggcg
900uauguccggu auuucaauuc gucagcguuc uucuuuuccg gguucuucgu ugucuuucuc
960ucgguuuugc cuuaugccuu gauuaagggg auuauccucc gcaagauuuu caccacgauu
1020ucguucugca uuguauugcg cauggcagug acacggcaau uuccgugggc cgugcagaca
1080ugguaugacu cgcuuggagc gaucaacaaa auccaagacu ucuugcaaaa gcaagaguac
1140aagacccugg aguacaaucu uacuacuacg gagguaguaa uggagaaugu gacggcuuuu
1200ugggaagagg guuuuggaga acuguuugag aaagcaaagc agaauaacaa caaccgcaag
1260accucaaaug gggacgauuc ccuguuuuuc ucgaacuucu cccugcucgg aacacccgug
1320uugaaggaca ucaauuucaa gauugagagg ggacagcuuc ucgcgguagc gggaagcacu
1380ggugcgggaa aaacuagccu cuugauggug auuauggggg agcuugagcc cagcgagggg
1440aagauuaaac acuccgggcg uaucucauuc uguagccagu uuucauggau caugcccgga
1500accauuaaag agaacaucau uuucggagua uccuaugaug aguaccgaua cagaucgguc
1560auuaaggcgu gccaguugga agaggacauu ucuaaguucg ccgagaagga uaacaucguc
1620uugggagaag gggguauuac auugucggga gggcagcgag cgcggaucag ccucgcgaga
1680gcgguauaca aagaugcaga uuuguaucug cuugauucac cguuuggaua ccucgacgua
1740uugacagaaa aagaaaucuu cgagucgugc guguguaaac uuauggcuaa uaagacgaga
1800auccugguga caucaaaaau ggaacaccuu aagaaggcgg acaagauccu gauccuccac
1860gaaggaucgu ccuacuuuua cggcacuuuc ucagaguugc aaaacuugca gccggacuuc
1920ucaagcaaac ucauggggug ugacucauuc gaccaguuca gcgcggaacg gcggaacucg
1980aucuugacgg aaacgcugca ccgauucucg cuugagggug augccccggu aucguggacc
2040gagacaaaga agcagucguu uaagcagaca ggagaauuug gugagaaaag aaagaacagu
2100aucuugaauc cuauuaacuc aauucgcaag uucucaaucg uccagaaaac uccacugcag
2160augaauggaa uugaagagga uucggacgaa ccccuggagc gcaggcuuag ccucgugccg
2220gauucagagc aaggggaggc cauucuuccc cggauuucgg ugauuucaac cggaccuaca
2280cuucaggcga ggcgaaggca auccgugcuc aaccucauga cgcauucggu aaaccagggg
2340caaaacauuc accgcaaaac gacggccuca acgagaaaag ugucacuugc accccaggcg
2400aauuugacug aacucgacau cuacagccgu aggcuuucgc aagaaaccgg acuugagauc
2460agcgaagaaa ucaaugaaga agauuugaaa gaguguuucu uugaugacau ggaaucaauc
2520ccagcgguga caacguggaa cacauacuug cguuacauca cggugcacaa guccuugauu
2580uucguccuca ucuggugucu cgugaucuuu cucgcugagg ucgcagcguc acuugugguc
2640cucuggcugc uugguaauac gcccuugcaa gacaaaggca auucuacaca cucaagaaac
2700aauuccuaug ccgugauuau cacuucuaca agcucguauu acguguuuua caucuacgua
2760ggaguggccg acacucugcu cgcgaugggu uucuuccgag gacucccacu cguucacacg
2820cuuaucacug ucuccaagau ucuccaccau aagaugcuuc auagcguacu gcaggcuccc
2880auguccaccu ugaauacgcu caaggcggga gguauuuuga aucgcuucuc aaaagauauu
2940gcaauuuugg augaccuucu gccccugacg aucuucgacu ucauccaguu guugcugauc
3000gugauugggg cuauugcagu agucgcuguc cuccagccuu acauuuuugu cgcgaccguu
3060ccggugaucg uggcguuuau caugcugcgg gccuauuucu ugcagacguc acagcagcuu
3120aagcaacugg agucugaagg gaggucgccu aucuuuacgc aucuugugac caguuugaag
3180ggauugugga cguugcgcgc cuuuggcagg cagcccuacu uugaaacacu guuccacaaa
3240gcgcugaauc uccauacggc aaauugguuu uuguauuuga guacccuccg augguuucag
3300augcgcauug agaugauuuu ugugaucuuc uuuaucgcgg ugacuuuuau cuccaucuug
3360accacgggag agggcgaggg acgggucggu auuauccuga cacucgccau gaacauuaug
3420agcacuuugc agugggcagu gaacagcucg auugaugugg auagccugau gagguccguu
3480ucgagggucu uuaaguucau cgacaugccg acggagggaa agcccacaaa aaguacgaaa
3540cccuauaaga augggcaauu gaguaaggua augaucaucg agaacaguca cgugaagaag
3600gaugacaucu ggccuagcgg gggucagaug accgugaagg accugacggc aaaauacacc
3660gagggaggga acgcaauccu ugaaaacauc ucguucagca uuagccccgg ucagcgugug
3720ggguugcucg ggaggaccgg gucaggaaaa ucgacguugc ugucggccuu cuugagacuu
3780cugaauacag agggugagau ccagaucgac ggcguuucgu gggauagcau caccuugcag
3840caguggcgga aagcguuugg aguaaucccc caaaaggucu uuaucuuuag cggaaccuuc
3900cgaaagaauc ucgauccuua ugaacagugg ucagaucaag agauuuggaa agucgcggac
3960gagguuggcc uucggagugu aaucgagcag uuuccgggaa aacucgacuu uguccuugua
4020gaugggggau gcguccuguc gcaugggcac aagcagcuca ugugccuggc gcgauccguc
4080cucucuaaag cgaaaauucu ucucuuggau gaaccuucgg cccaucugga cccgguaacg
4140uaucagauca ucagaaggac acuuaagcag gcguuugccg acugcacggu gauucucugu
4200gagcaucgua ucgaggccau gcucgaaugc cagcaauuuc uugucaucga agagaauaag
4260guccgccagu acgacuccau ccagaagcug cuuaaugaga gaucauuguu ccggcaggcg
4320auuucaccau ccgauagggu gaaacuuuuu ccacacagaa auucgucgaa gugcaagucc
4380aaaccgcaga ucgcggccuu gaaagaagag acugaagaag aaguucaaga cacgcgucuu
4440uaa
4443
3 1480 PRT Homo sapiens
3 Met Gln Arg Ser Pro Leu Glu Lys Ala Ser Val Val
Ser Lys Leu Phe1 5 10
15Phe Ser Trp Thr Arg Pro Ile Leu Arg Lys Gly Tyr Arg Gln Arg Leu
20 25 30Glu Leu Ser Asp Ile Tyr Gln
Ile Pro Ser Val Asp Ser Ala Asp Asn 35 40
45Leu Ser Glu Lys Leu Glu Arg Glu Trp Asp Arg Glu Leu Ala Ser
Lys 50 55 60Lys Asn Pro Lys Leu Ile
Asn Ala Leu Arg Arg Cys Phe Phe Trp Arg65 70
75 80Phe Met Phe Tyr Gly Ile Phe Leu Tyr Leu Gly
Glu Val Thr Lys Ala 85 90
95Val Gln Pro Leu Leu Leu Gly Arg Ile Ile Ala Ser Tyr Asp Pro Asp
100 105 110Asn Lys Glu Glu Arg Ser
Ile Ala Ile Tyr Leu Gly Ile Gly Leu Cys 115 120
125Leu Leu Phe Ile Val Arg Thr Leu Leu Leu His Pro Ala Ile
Phe Gly 130 135 140Leu His His Ile Gly
Met Gln Met Arg Ile Ala Met Phe Ser Leu Ile145 150
155 160Tyr Lys Lys Thr Leu Lys Leu Ser Ser Arg
Val Leu Asp Lys Ile Ser 165 170
175Ile Gly Gln Leu Val Ser Leu Leu Ser Asn Asn Leu Asn Lys Phe Asp
180 185 190Glu Gly Leu Ala Leu
Ala His Phe Val Trp Ile Ala Pro Leu Gln Val 195
200 205Ala Leu Leu Met Gly Leu Ile Trp Glu Leu Leu Gln
Ala Ser Ala Phe 210 215 220Cys Gly Leu
Gly Phe Leu Ile Val Leu Ala Leu Phe Gln Ala Gly Leu225
230 235 240Gly Arg Met Met Met Lys Tyr
Arg Asp Gln Arg Ala Gly Lys Ile Ser 245
250 255Glu Arg Leu Val Ile Thr Ser Glu Met Ile Glu Asn
Ile Gln Ser Val 260 265 270Lys
Ala Tyr Cys Trp Glu Glu Ala Met Glu Lys Met Ile Glu Asn Leu 275
280 285Arg Gln Thr Glu Leu Lys Leu Thr Arg
Lys Ala Ala Tyr Val Arg Tyr 290 295
300Phe Asn Ser Ser Ala Phe Phe Phe Ser Gly Phe Phe Val Val Phe Leu305
310 315 320Ser Val Leu Pro
Tyr Ala Leu Ile Lys Gly Ile Ile Leu Arg Lys Ile 325
330 335Phe Thr Thr Ile Ser Phe Cys Ile Val Leu
Arg Met Ala Val Thr Arg 340 345
350Gln Phe Pro Trp Ala Val Gln Thr Trp Tyr Asp Ser Leu Gly Ala Ile
355 360 365Asn Lys Ile Gln Asp Phe Leu
Gln Lys Gln Glu Tyr Lys Thr Leu Glu 370 375
380Tyr Asn Leu Thr Thr Thr Glu Val Val Met Glu Asn Val Thr Ala
Phe385 390 395 400Trp Glu
Glu Gly Phe Gly Glu Leu Phe Glu Lys Ala Lys Gln Asn Asn
405 410 415Asn Asn Arg Lys Thr Ser Asn
Gly Asp Asp Ser Leu Phe Phe Ser Asn 420 425
430Phe Ser Leu Leu Gly Thr Pro Val Leu Lys Asp Ile Asn Phe
Lys Ile 435 440 445Glu Arg Gly Gln
Leu Leu Ala Val Ala Gly Ser Thr Gly Ala Gly Lys 450
455 460Thr Ser Leu Leu Met Val Ile Met Gly Glu Leu Glu
Pro Ser Glu Gly465 470 475
480Lys Ile Lys His Ser Gly Arg Ile Ser Phe Cys Ser Gln Phe Ser Trp
485 490 495Ile Met Pro Gly Thr
Ile Lys Glu Asn Ile Ile Phe Gly Val Ser Tyr 500
505 510Asp Glu Tyr Arg Tyr Arg Ser Val Ile Lys Ala Cys
Gln Leu Glu Glu 515 520 525Asp Ile
Ser Lys Phe Ala Glu Lys Asp Asn Ile Val Leu Gly Glu Gly 530
535 540Gly Ile Thr Leu Ser Gly Gly Gln Arg Ala Arg
Ile Ser Leu Ala Arg545 550 555
560Ala Val Tyr Lys Asp Ala Asp Leu Tyr Leu Leu Asp Ser Pro Phe Gly
565 570 575Tyr Leu Asp Val
Leu Thr Glu Lys Glu Ile Phe Glu Ser Cys Val Cys 580
585 590Lys Leu Met Ala Asn Lys Thr Arg Ile Leu Val
Thr Ser Lys Met Glu 595 600 605His
Leu Lys Lys Ala Asp Lys Ile Leu Ile Leu His Glu Gly Ser Ser 610
615 620Tyr Phe Tyr Gly Thr Phe Ser Glu Leu Gln
Asn Leu Gln Pro Asp Phe625 630 635
640Ser Ser Lys Leu Met Gly Cys Asp Ser Phe Asp Gln Phe Ser Ala
Glu 645 650 655Arg Arg Asn
Ser Ile Leu Thr Glu Thr Leu His Arg Phe Ser Leu Glu 660
665 670Gly Asp Ala Pro Val Ser Trp Thr Glu Thr
Lys Lys Gln Ser Phe Lys 675 680
685Gln Thr Gly Glu Phe Gly Glu Lys Arg Lys Asn Ser Ile Leu Asn Pro 690
695 700Ile Asn Ser Ile Arg Lys Phe Ser
Ile Val Gln Lys Thr Pro Leu Gln705 710
715 720Met Asn Gly Ile Glu Glu Asp Ser Asp Glu Pro Leu
Glu Arg Arg Leu 725 730
735Ser Leu Val Pro Asp Ser Glu Gln Gly Glu Ala Ile Leu Pro Arg Ile
740 745 750Ser Val Ile Ser Thr Gly
Pro Thr Leu Gln Ala Arg Arg Arg Gln Ser 755 760
765Val Leu Asn Leu Met Thr His Ser Val Asn Gln Gly Gln Asn
Ile His 770 775 780Arg Lys Thr Thr Ala
Ser Thr Arg Lys Val Ser Leu Ala Pro Gln Ala785 790
795 800Asn Leu Thr Glu Leu Asp Ile Tyr Ser Arg
Arg Leu Ser Gln Glu Thr 805 810
815Gly Leu Glu Ile Ser Glu Glu Ile Asn Glu Glu Asp Leu Lys Glu Cys
820 825 830Phe Phe Asp Asp Met
Glu Ser Ile Pro Ala Val Thr Thr Trp Asn Thr 835
840 845Tyr Leu Arg Tyr Ile Thr Val His Lys Ser Leu Ile
Phe Val Leu Ile 850 855 860Trp Cys Leu
Val Ile Phe Leu Ala Glu Val Ala Ala Ser Leu Val Val865
870 875 880Leu Trp Leu Leu Gly Asn Thr
Pro Leu Gln Asp Lys Gly Asn Ser Thr 885
890 895His Ser Arg Asn Asn Ser Tyr Ala Val Ile Ile Thr
Ser Thr Ser Ser 900 905 910Tyr
Tyr Val Phe Tyr Ile Tyr Val Gly Val Ala Asp Thr Leu Leu Ala 915
920 925Met Gly Phe Phe Arg Gly Leu Pro Leu
Val His Thr Leu Ile Thr Val 930 935
940Ser Lys Ile Leu His His Lys Met Leu His Ser Val Leu Gln Ala Pro945
950 955 960Met Ser Thr Leu
Asn Thr Leu Lys Ala Gly Gly Ile Leu Asn Arg Phe 965
970 975Ser Lys Asp Ile Ala Ile Leu Asp Asp Leu
Leu Pro Leu Thr Ile Phe 980 985
990Asp Phe Ile Gln Leu Leu Leu Ile Val Ile Gly Ala Ile Ala Val Val
995 1000 1005Ala Val Leu Gln Pro Tyr
Ile Phe Val Ala Thr Val Pro Val Ile 1010 1015
1020Val Ala Phe Ile Met Leu Arg Ala Tyr Phe Leu Gln Thr Ser
Gln 1025 1030 1035Gln Leu Lys Gln Leu
Glu Ser Glu Gly Arg Ser Pro Ile Phe Thr 1040 1045
1050His Leu Val Thr Ser Leu Lys Gly Leu Trp Thr Leu Arg
Ala Phe 1055 1060 1065Gly Arg Gln Pro
Tyr Phe Glu Thr Leu Phe His Lys Ala Leu Asn 1070
1075 1080Leu His Thr Ala Asn Trp Phe Leu Tyr Leu Ser
Thr Leu Arg Trp 1085 1090 1095Phe Gln
Met Arg Ile Glu Met Ile Phe Val Ile Phe Phe Ile Ala 1100
1105 1110Val Thr Phe Ile Ser Ile Leu Thr Thr Gly
Glu Gly Glu Gly Arg 1115 1120 1125Val
Gly Ile Ile Leu Thr Leu Ala Met Asn Ile Met Ser Thr Leu 1130
1135 1140Gln Trp Ala Val Asn Ser Ser Ile Asp
Val Asp Ser Leu Met Arg 1145 1150
1155Ser Val Ser Arg Val Phe Lys Phe Ile Asp Met Pro Thr Glu Gly
1160 1165 1170Lys Pro Thr Lys Ser Thr
Lys Pro Tyr Lys Asn Gly Gln Leu Ser 1175 1180
1185Lys Val Met Ile Ile Glu Asn Ser His Val Lys Lys Asp Asp
Ile 1190 1195 1200Trp Pro Ser Gly Gly
Gln Met Thr Val Lys Asp Leu Thr Ala Lys 1205 1210
1215Tyr Thr Glu Gly Gly Asn Ala Ile Leu Glu Asn Ile Ser
Phe Ser 1220 1225 1230Ile Ser Pro Gly
Gln Arg Val Gly Leu Leu Gly Arg Thr Gly Ser 1235
1240 1245Gly Lys Ser Thr Leu Leu Ser Ala Phe Leu Arg
Leu Leu Asn Thr 1250 1255 1260Glu Gly
Glu Ile Gln Ile Asp Gly Val Ser Trp Asp Ser Ile Thr 1265
1270 1275Leu Gln Gln Trp Arg Lys Ala Phe Gly Val
Ile Pro Gln Lys Val 1280 1285 1290Phe
Ile Phe Ser Gly Thr Phe Arg Lys Asn Leu Asp Pro Tyr Glu 1295
1300 1305Gln Trp Ser Asp Gln Glu Ile Trp Lys
Val Ala Asp Glu Val Gly 1310 1315
1320Leu Arg Ser Val Ile Glu Gln Phe Pro Gly Lys Leu Asp Phe Val
1325 1330 1335Leu Val Asp Gly Gly Cys
Val Leu Ser His Gly His Lys Gln Leu 1340 1345
1350Met Cys Leu Ala Arg Ser Val Leu Ser Lys Ala Lys Ile Leu
Leu 1355 1360 1365Leu Asp Glu Pro Ser
Ala His Leu Asp Pro Val Thr Tyr Gln Ile 1370 1375
1380Ile Arg Arg Thr Leu Lys Gln Ala Phe Ala Asp Cys Thr
Val Ile 1385 1390 1395Leu Cys Glu His
Arg Ile Glu Ala Met Leu Glu Cys Gln Gln Phe 1400
1405 1410Leu Val Ile Glu Glu Asn Lys Val Arg Gln Tyr
Asp Ser Ile Gln 1415 1420 1425Lys Leu
Leu Asn Glu Arg Ser Leu Phe Arg Gln Ala Ile Ser Pro 1430
1435 1440Ser Asp Arg Val Lys Leu Phe Pro His Arg
Asn Ser Ser Lys Cys 1445 1450 1455Lys
Ser Lys Pro Gln Ile Ala Ala Leu Lys Glu Glu Thr Glu Glu 1460
1465 1470Glu Val Gln Asp Thr Arg Leu 1475
1480
SEQ ID NO: 4; length 140; RNA; Artificial Sequence; chemically synthesized oligonucleotide
ggacagaucg ccuggagacg ccauccacgc uguuuugacc uccauagaag acaccgggac 60
cgauccagcc uccgcggccg ggaacggugc auuggaacgc ggauuccccg ugccaagagu 120
gacucaccgu ccuugacacg 140
SEQ ID NO: 5; length 105; RNA; Artificial Sequence; chemically synthesized oligonucleotide
cggguggcau cccugugacc ccuccccagu gccucuccug gcccuggaag uugccacucc 60
agugcccacc agccuugucc uaauaaaauu aaguugcauc aagcu 105
SEQ ID NO: 6; length 105; RNA; Artificial Sequence; chemically synthesized oligonucleotide
ggguggcauc ccugugaccc cuccccagug ccucuccugg cccuggaagu ugccacucca 60
gugcccacca gccuuguccu aauaaaauua aguugcauca aagcu 105
SEQ ID NO: 7; length 4688; RNA; Artificial Sequence; chemically synthesized oligonucleotide
ggacagaucg ccuggagacg ccauccacgc uguuuugacc uccauagaag
acaccgggac 60cgauccagcc uccgcggccg ggaacggugc auuggaacgc ggauuccccg
ugccaagagu 120gacucaccgu ccuugacacg augcaacgcu cuccucuuga aaaggccucg
guggugucca 180agcucuucuu cucguggacu agacccaucc ugagaaaggg guacagacag
cgcuuggagc 240uguccgauau cuaucaaauc ccuuccgugg acuccgcgga caaccugucc
gagaagcucg 300agagagaaug ggacagagaa cucgccucaa agaagaaccc gaagcugauu
aaugcgcuua 360ggcggugcuu uuucuggcgg uucauguucu acggcaucuu ccucuaccug
ggagagguca 420ccaaggccgu gcagccccug uugcugggac ggauuauugc cuccuacgac
cccgacaaca 480aggaagaaag aagcaucgcu aucuacuugg gcaucggucu gugccugcuu
uucaucgucc 540ggacccucuu guugcauccu gcuauuuucg gccugcauca cauuggcaug
cagaugagaa 600uugccauguu uucccugauc uacaagaaaa cucugaagcu cucgagccgc
gugcuugaca 660agauuuccau cggccagcuc gugucccugc ucuccaacaa ucugaacaag
uucgacgagg 720gccucgcccu ggcccacuuc guguggaucg ccccucugca aguggcgcuu
cugaugggcc 780ugaucuggga gcugcugcaa gccucggcau ucugugggcu uggauuccug
aucgugcugg 840cacuguucca ggccggacug gggcggauga ugaugaagua cagggaccag
agagccggaa 900agauuuccga acggcuggug aucacuucgg aaaugaucga aaacauccag
ucagugaagg 960ccuacugcug ggaagaggcc auggaaaaga ugauugaaaa ccuccggcaa
accgagcuga 1020agcugacccg caaggccgcu uacgugcgcu auuucaacuc guccgcuuuc
uucuucuccg 1080gguucuucgu gguguuucuc uccgugcucc ccuacgcccu gauuaaggga
aucauccuca 1140ggaagaucuu caccaccauu uccuucugua ucgugcuccg cauggccgug
acccggcagu 1200ucccaugggc cgugcagacu ugguacgacu cccugggagc cauuaacaag
auccaggacu 1260uccuucaaaa gcaggaguac aagacccucg aguacaaccu gacuacuacc
gaggucguga 1320uggaaaacgu caccgccuuu ugggaggagg gauuuggcga acuguucgag
aaggccaagc 1380agaacaacaa caaccgcaag accucgaacg gugacgacuc ccucuucuuu
ucaaacuuca 1440gccugcucgg gacgcccgug cugaaggaca uuaacuucaa gaucgaaaga
ggacagcucc 1500uggcgguggc cggaucgacc ggagccggaa agacuucccu gcugauggug
aucaugggag 1560agcuugaacc uagcgaggga aagaucaagc acuccggccg caucagcuuc
uguagccagu 1620uuuccuggau caugcccgga accauuaagg aaaacaucau cuucggcgug
uccuacgaug 1680aauaccgcua ccgguccgug aucaaagccu gccagcugga agaggauauu
ucaaaguucg 1740cggagaaaga uaacaucgug cugggcgaag gggguauuac cuugucgggg
ggccagcggg 1800cuagaaucuc gcuggccaga gccguguaua aggacgccga ccuguaucuc
cuggacuccc 1860ccuucggaua ccuggacguc cugaccgaaa aggagaucuu cgaaucgugc
gugugcaagc 1920ugauggcuaa caagacucgc auccucguga ccuccaaaau ggagcaccug
aagaaggcag 1980acaagauucu gauucugcau gagggguccu ccuacuuuua cggcaccuuc
ucggaguugc 2040agaacuugca gcccgacuuc ucaucgaagc ugauggguug cgacagcuuc
gaccaguucu 2100ccgccgaaag aaggaacucg auccugacgg aaaccuugca ccgcuucucu
uuggaaggcg 2160acgccccugu gucauggacc gagacuaaga agcagagcuu caagcagacc
ggggaauucg 2220gcgaaaagag gaagaacagc aucuugaacc ccauuaacuc cauccgcaag
uucucaaucg 2280ugcaaaagac gccacugcag augaacggca uugaggagga cuccgacgaa
ccccuugaga 2340ggcgccuguc ccuggugccg gacagcgagc agggagaagc cauccugccu
cggauuuccg 2400ugaucuccac ugguccgacg cuccaagccc ggcggcggca guccgugcug
aaccugauga 2460cccacagcgu gaaccagggc caaaacauuc accgcaagac uaccgcaucc
acccggaaag 2520ugucccuggc accucaagcg aaucuuaccg agcucgacau cuacucccgg
agacugucgc 2580aggaaaccgg gcucgaaauu uccgaagaaa ucaacgagga ggaucugaaa
gagugcuucu 2640ucgacgauau ggagucgaua cccgccguga cgacuuggaa cacuuaucug
cgguacauca 2700cugugcacaa gucauugauc uucgugcuga uuuggugccu ggugauuuuc
cuggccgagg 2760ucgcggccuc acugguggug cucuggcugu ugggaaacac gccucugcaa
gacaagggaa 2820acuccacgca cucgagaaac aacagcuaug ccgugauuau cacuuccacc
uccucuuauu 2880acguguucua caucuacguc ggaguggcgg auacccugcu cgcgaugggu
uucuucagag 2940gacugccgcu gguccacacc uugaucaccg ucagcaagau ucuucaccac
aagauguugc 3000auagcgugcu gcaggccccc auguccaccc ucaacacucu gaaggccgga
ggcauucuga 3060acagauucuc caaggacauc gcuauccugg acgaucuccu gccgcuuacc
aucuuugacu 3120ucauccagcu gcugcugauc gugauuggag caaucgcagu gguggcggug
cugcagccuu 3180acauuuucgu ggccacugug ccggucauug uggcguucau caugcugcgg
gccuacuucc 3240uccaaaccag ccagcagcug aagcaacugg aauccgaggg acgauccccc
aucuucacuc 3300accuugugac gucguugaag ggacugugga cccuccgggc uuucggacgg
cagcccuacu 3360ucgaaacccu cuuccacaag gcccugaacc uccacaccgc caauugguuc
cuguaccugu 3420ccacccugcg gugguuccag augcgcaucg agaugauuuu cgucaucuuc
uucaucgcgg 3480ucacauucau cagcauccug acuaccggag agggagaggg acgggucgga
auaauccuga 3540cccucgccau gaacauuaug agcacccugc agugggcagu gaacagcucg
aucgacgugg 3600acagccugau gcgaagcguc agccgcgugu ucaaguucau cgacaugccu
acugagggaa 3660aacccacuaa guccacuaag cccuacaaaa auggccagcu gagcaagguc
augaucaucg 3720aaaacuccca cgugaagaag gacgauauuu ggcccuccgg aggucaaaug
accgugaagg 3780accugaccgc aaaguacacc gagggaggaa acgccauucu cgaaaacauc
agcuucucca 3840uuucgccggg acagcggguc ggccuucucg ggcggaccgg uuccgggaag
ucaacucugc 3900ugucggcuuu ccuccggcug cugaauaccg agggggaaau ccaaauugac
ggcgugucuu 3960gggauuccau uacucugcag caguggcgga aggccuucgg cgugaucccc
cagaaggugu 4020ucaucuucuc ggguaccuuc cggaagaacc uggauccuua cgagcagugg
agcgaccaag 4080aaaucuggaa ggucgccgac gaggucggcc ugcgcuccgu gauugaacaa
uuuccuggaa 4140agcuggacuu cgugcucguc gacgggggau guguccuguc gcacggacau
aagcagcuca 4200ugugccucgc acgguccgug cucuccaagg ccaagauucu gcugcuggac
gaaccuucgg 4260cccaccugga uccggucacc uaccagauca ucaggaggac ccugaagcag
gccuuugccg 4320auugcaccgu gauucucugc gagcaccgca ucgaggccau gcuggagugc
cagcaguucc 4380uggucaucga ggagaacaag guccgccaau acgacuccau ucaaaagcuc
cucaacgagc 4440ggucgcuguu cagacaagcu auuucaccgu ccgauagagu gaagcucuuc
ccgcaucgga 4500acagcucaaa gugcaaaucg aagccgcaga ucgcagccuu gaaggaagag
acugaggaag 4560aggugcagga cacccggcuu uaacgggugg caucccugug accccucccc
agugccucuc 4620cuggcccugg aaguugccac uccagugccc accagccuug uccuaauaaa
auuaaguugc 4680aucaagcu
4688
SEQ ID NO: 8; length 4688; RNA; Artificial Sequence; chemically synthesized oligonucleotide
ggacagaucg ccuggagacg ccauccacgc uguuuugacc uccauagaag
acaccgggac 60cgauccagcc uccgcggccg ggaacggugc auuggaacgc ggauuccccg
ugccaagagu 120gacucaccgu ccuugacacg augcaacgcu cuccucuuga aaaggccucg
guggugucca 180agcucuucuu cucguggacu agacccaucc ugagaaaggg guacagacag
cgcuuggagc 240uguccgauau cuaucaaauc ccuuccgugg acuccgcgga caaccugucc
gagaagcucg 300agagagaaug ggacagagaa cucgccucaa agaagaaccc gaagcugauu
aaugcgcuua 360ggcggugcuu uuucuggcgg uucauguucu acggcaucuu ccucuaccug
ggagagguca 420ccaaggccgu gcagccccug uugcugggac ggauuauugc cuccuacgac
cccgacaaca 480aggaagaaag aagcaucgcu aucuacuugg gcaucggucu gugccugcuu
uucaucgucc 540ggacccucuu guugcauccu gcuauuuucg gccugcauca cauuggcaug
cagaugagaa 600uugccauguu uucccugauc uacaagaaaa cucugaagcu cucgagccgc
gugcuugaca 660agauuuccau cggccagcuc gugucccugc ucuccaacaa ucugaacaag
uucgacgagg 720gccucgcccu ggcccacuuc guguggaucg ccccucugca aguggcgcuu
cugaugggcc 780ugaucuggga gcugcugcaa gccucggcau ucugugggcu uggauuccug
aucgugcugg 840cacuguucca ggccggacug gggcggauga ugaugaagua cagggaccag
agagccggaa 900agauuuccga acggcuggug aucacuucgg aaaugaucga aaacauccag
ucagugaagg 960ccuacugcug ggaagaggcc auggaaaaga ugauugaaaa ccuccggcaa
accgagcuga 1020agcugacccg caaggccgcu uacgugcgcu auuucaacuc guccgcuuuc
uucuucuccg 1080gguucuucgu gguguuucuc uccgugcucc ccuacgcccu gauuaaggga
aucauccuca 1140ggaagaucuu caccaccauu uccuucugua ucgugcuccg cauggccgug
acccggcagu 1200ucccaugggc cgugcagacu ugguacgacu cccugggagc cauuaacaag
auccaggacu 1260uccuucaaaa gcaggaguac aagacccucg aguacaaccu gacuacuacc
gaggucguga 1320uggaaaacgu caccgccuuu ugggaggagg gauuuggcga acuguucgag
aaggccaagc 1380agaacaacaa caaccgcaag accucgaacg gugacgacuc ccucuucuuu
ucaaacuuca 1440gccugcucgg gacgcccgug cugaaggaca uuaacuucaa gaucgaaaga
ggacagcucc 1500uggcgguggc cggaucgacc ggagccggaa agacuucccu gcugauggug
aucaugggag 1560agcuugaacc uagcgaggga aagaucaagc acuccggccg caucagcuuc
uguagccagu 1620uuuccuggau caugcccgga accauuaagg aaaacaucau cuucggcgug
uccuacgaug 1680aauaccgcua ccgguccgug aucaaagccu gccagcugga agaggauauu
ucaaaguucg 1740cggagaaaga uaacaucgug cugggcgaag gggguauuac cuugucgggg
ggccagcggg 1800cuagaaucuc gcuggccaga gccguguaua aggacgccga ccuguaucuc
cuggacuccc 1860ccuucggaua ccuggacguc cugaccgaaa aggagaucuu cgaaucgugc
gugugcaagc 1920ugauggcuaa caagacucgc auccucguga ccuccaaaau ggagcaccug
aagaaggcag 1980acaagauucu gauucugcau gagggguccu ccuacuuuua cggcaccuuc
ucggaguugc 2040agaacuugca gcccgacuuc ucaucgaagc ugauggguug cgacagcuuc
gaccaguucu 2100ccgccgaaag aaggaacucg auccugacgg aaaccuugca ccgcuucucu
uuggaaggcg 2160acgccccugu gucauggacc gagacuaaga agcagagcuu caagcagacc
ggggaauucg 2220gcgaaaagag gaagaacagc aucuugaacc ccauuaacuc cauccgcaag
uucucaaucg 2280ugcaaaagac gccacugcag augaacggca uugaggagga cuccgacgaa
ccccuugaga 2340ggcgccuguc ccuggugccg gacagcgagc agggagaagc cauccugccu
cggauuuccg 2400ugaucuccac ugguccgacg cuccaagccc ggcggcggca guccgugcug
aaccugauga 2460cccacagcgu gaaccagggc caaaacauuc accgcaagac uaccgcaucc
acccggaaag 2520ugucccuggc accucaagcg aaucuuaccg agcucgacau cuacucccgg
agacugucgc 2580aggaaaccgg gcucgaaauu uccgaagaaa ucaacgagga ggaucugaaa
gagugcuucu 2640ucgacgauau ggagucgaua cccgccguga cgacuuggaa cacuuaucug
cgguacauca 2700cugugcacaa gucauugauc uucgugcuga uuuggugccu ggugauuuuc
cuggccgagg 2760ucgcggccuc acugguggug cucuggcugu ugggaaacac gccucugcaa
gacaagggaa 2820acuccacgca cucgagaaac aacagcuaug ccgugauuau cacuuccacc
uccucuuauu 2880acguguucua caucuacguc ggaguggcgg auacccugcu cgcgaugggu
uucuucagag 2940gacugccgcu gguccacacc uugaucaccg ucagcaagau ucuucaccac
aagauguugc 3000auagcgugcu gcaggccccc auguccaccc ucaacacucu gaaggccgga
ggcauucuga 3060acagauucuc caaggacauc gcuauccugg acgaucuccu gccgcuuacc
aucuuugacu 3120ucauccagcu gcugcugauc gugauuggag caaucgcagu gguggcggug
cugcagccuu 3180acauuuucgu ggccacugug ccggucauug uggcguucau caugcugcgg
gccuacuucc 3240uccaaaccag ccagcagcug aagcaacugg aauccgaggg acgauccccc
aucuucacuc 3300accuugugac gucguugaag ggacugugga cccuccgggc uuucggacgg
cagcccuacu 3360ucgaaacccu cuuccacaag gcccugaacc uccacaccgc caauugguuc
cuguaccugu 3420ccacccugcg gugguuccag augcgcaucg agaugauuuu cgucaucuuc
uucaucgcgg 3480ucacauucau cagcauccug acuaccggag agggagaggg acgggucgga
auaauccuga 3540cccucgccau gaacauuaug agcacccugc agugggcagu gaacagcucg
aucgacgugg 3600acagccugau gcgaagcguc agccgcgugu ucaaguucau cgacaugccu
acugagggaa 3660aacccacuaa guccacuaag cccuacaaaa auggccagcu gagcaagguc
augaucaucg 3720aaaacuccca cgugaagaag gacgauauuu ggcccuccgg aggucaaaug
accgugaagg 3780accugaccgc aaaguacacc gagggaggaa acgccauucu cgaaaacauc
agcuucucca 3840uuucgccggg acagcggguc ggccuucucg ggcggaccgg uuccgggaag
ucaacucugc 3900ugucggcuuu ccuccggcug cugaauaccg agggggaaau ccaaauugac
ggcgugucuu 3960gggauuccau uacucugcag caguggcgga aggccuucgg cgugaucccc
cagaaggugu 4020ucaucuucuc ggguaccuuc cggaagaacc uggauccuua cgagcagugg
agcgaccaag 4080aaaucuggaa ggucgccgac gaggucggcc ugcgcuccgu gauugaacaa
uuuccuggaa 4140agcuggacuu cgugcucguc gacgggggau guguccuguc gcacggacau
aagcagcuca 4200ugugccucgc acgguccgug cucuccaagg ccaagauucu gcugcuggac
gaaccuucgg 4260cccaccugga uccggucacc uaccagauca ucaggaggac ccugaagcag
gccuuugccg 4320auugcaccgu gauucucugc gagcaccgca ucgaggccau gcuggagugc
cagcaguucc 4380uggucaucga ggagaacaag guccgccaau acgacuccau ucaaaagcuc
cucaacgagc 4440ggucgcuguu cagacaagcu auuucaccgu ccgauagagu gaagcucuuc
ccgcaucgga 4500acagcucaaa gugcaaaucg aagccgcaga ucgcagccuu gaaggaagag
acugaggaag 4560aggugcagga cacccggcuu uaaggguggc aucccuguga ccccucccca
gugccucucc 4620uggcccugga aguugccacu ccagugccca ccagccuugu ccuaauaaaa
uuaaguugca 4680ucaaagcu
4688
SEQ ID NO: 9; length 874; PRT; Bacteriophage SP6
Met Gln Asp Leu His Ala Ile Gln
Leu Gln Leu Glu Glu Glu Met Phe1 5 10
15Asn Gly Gly Ile Arg Arg Phe Glu Ala Asp Gln Gln Arg Gln
Ile Ala 20 25 30Ala Gly Ser
Glu Ser Asp Thr Ala Trp Asn Arg Arg Leu Leu Ser Glu 35
40 45Leu Ile Ala Pro Met Ala Glu Gly Ile Gln Ala
Tyr Lys Glu Glu Tyr 50 55 60Glu Gly
Lys Lys Gly Arg Ala Pro Arg Ala Leu Ala Phe Leu Gln Cys65
70 75 80Val Glu Asn Glu Val Ala Ala
Tyr Ile Thr Met Lys Val Val Met Asp 85 90
95Met Leu Asn Thr Asp Ala Thr Leu Gln Ala Ile Ala Met
Ser Val Ala 100 105 110Glu Arg
Ile Glu Asp Gln Val Arg Phe Ser Lys Leu Glu Gly His Ala 115
120 125Ala Lys Tyr Phe Glu Lys Val Lys Lys Ser
Leu Lys Ala Ser Arg Thr 130 135 140Lys
Ser Tyr Arg His Ala His Asn Val Ala Val Val Ala Glu Lys Ser145
150 155 160Val Ala Glu Lys Asp Ala
Asp Phe Asp Arg Trp Glu Ala Trp Pro Lys 165
170 175Glu Thr Gln Leu Gln Ile Gly Thr Thr Leu Leu Glu
Ile Leu Glu Gly 180 185 190Ser
Val Phe Tyr Asn Gly Glu Pro Val Phe Met Arg Ala Met Arg Thr 195
200 205Tyr Gly Gly Lys Thr Ile Tyr Tyr Leu
Gln Thr Ser Glu Ser Val Gly 210 215
220Gln Trp Ile Ser Ala Phe Lys Glu His Val Ala Gln Leu Ser Pro Ala225
230 235 240Tyr Ala Pro Cys
Val Ile Pro Pro Arg Pro Trp Arg Thr Pro Phe Asn 245
250 255Gly Gly Phe His Thr Glu Lys Val Ala Ser
Arg Ile Arg Leu Val Lys 260 265
270Gly Asn Arg Glu His Val Arg Lys Leu Thr Gln Lys Gln Met Pro Lys
275 280 285Val Tyr Lys Ala Ile Asn Ala
Leu Gln Asn Thr Gln Trp Gln Ile Asn 290 295
300Lys Asp Val Leu Ala Val Ile Glu Glu Val Ile Arg Leu Asp Leu
Gly305 310 315 320Tyr Gly
Val Pro Ser Phe Lys Pro Leu Ile Asp Lys Glu Asn Lys Pro
325 330 335Ala Asn Pro Val Pro Val Glu
Phe Gln His Leu Arg Gly Arg Glu Leu 340 345
350Lys Glu Met Leu Ser Pro Glu Gln Trp Gln Gln Phe Ile Asn
Trp Lys 355 360 365Gly Glu Cys Ala
Arg Leu Tyr Thr Ala Glu Thr Lys Arg Gly Ser Lys 370
375 380Ser Ala Ala Val Val Arg Met Val Gly Gln Ala Arg
Lys Tyr Ser Ala385 390 395
400Phe Glu Ser Ile Tyr Phe Val Tyr Ala Met Asp Ser Arg Ser Arg Val
405 410 415Tyr Val Gln Ser Ser
Thr Leu Ser Pro Gln Ser Asn Asp Leu Gly Lys 420
425 430Ala Leu Leu Arg Phe Thr Glu Gly Arg Pro Val Asn
Gly Val Glu Ala 435 440 445Leu Lys
Trp Phe Cys Ile Asn Gly Ala Asn Leu Trp Gly Trp Asp Lys 450
455 460Lys Thr Phe Asp Val Arg Val Ser Asn Val Leu
Asp Glu Glu Phe Gln465 470 475
480Asp Met Cys Arg Asp Ile Ala Ala Asp Pro Leu Thr Phe Thr Gln Trp
485 490 495Ala Lys Ala Asp
Ala Pro Tyr Glu Phe Leu Ala Trp Cys Phe Glu Tyr 500
505 510Ala Gln Tyr Leu Asp Leu Val Asp Glu Gly Arg
Ala Asp Glu Phe Arg 515 520 525Thr
His Leu Pro Val His Gln Asp Gly Ser Cys Ser Gly Ile Gln His 530
535 540Tyr Ser Ala Met Leu Arg Asp Glu Val Gly
Ala Lys Ala Val Asn Leu545 550 555
560Lys Pro Ser Asp Ala Pro Gln Asp Ile Tyr Gly Ala Val Ala Gln
Val 565 570 575Val Ile Lys
Lys Asn Ala Leu Tyr Met Asp Ala Asp Asp Ala Thr Thr 580
585 590Phe Thr Ser Gly Ser Val Thr Leu Ser Gly
Thr Glu Leu Arg Ala Met 595 600
605Ala Ser Ala Trp Asp Ser Ile Gly Ile Thr Arg Ser Leu Thr Lys Lys 610
615 620Pro Val Met Thr Leu Pro Tyr Gly
Ser Thr Arg Leu Thr Cys Arg Glu625 630
635 640Ser Val Ile Asp Tyr Ile Val Asp Leu Glu Glu Lys
Glu Ala Gln Lys 645 650
655Ala Val Ala Glu Gly Arg Thr Ala Asn Lys Val His Pro Phe Glu Asp
660 665 670Asp Arg Gln Asp Tyr Leu
Thr Pro Gly Ala Ala Tyr Asn Tyr Met Thr 675 680
685Ala Leu Ile Trp Pro Ser Ile Ser Glu Val Val Lys Ala Pro
Ile Val 690 695 700Ala Met Lys Met Ile
Arg Gln Leu Ala Arg Phe Ala Ala Lys Arg Asn705 710
715 720Glu Gly Leu Met Tyr Thr Leu Pro Thr Gly
Phe Ile Leu Glu Gln Lys 725 730
735Ile Met Ala Thr Glu Met Leu Arg Val Arg Thr Cys Leu Met Gly Asp
740 745 750Ile Lys Met Ser Leu
Gln Val Glu Thr Asp Ile Val Asp Glu Ala Ala 755
760 765Met Met Gly Ala Ala Ala Pro Asn Phe Val His Gly
His Asp Ala Ser 770 775 780His Leu Ile
Leu Thr Val Cys Glu Leu Val Asp Lys Gly Val Thr Ser785
790 795 800Ile Ala Val Ile His Asp Ser
Phe Gly Thr His Ala Asp Asn Thr Leu 805
810 815Thr Leu Arg Val Ala Leu Lys Gly Gln Met Val Ala
Met Tyr Ile Asp 820 825 830Gly
Asn Ala Leu Gln Lys Leu Leu Glu Glu His Glu Val Arg Trp Met 835
840 845Val Asp Thr Gly Ile Glu Val Pro Glu
Gln Gly Glu Phe Asp Leu Asn 850 855
860Glu Ile Met Asp Ser Glu Tyr Val Phe Ala865
870
SEQ ID NO: 10; length 18; DNA; Artificial Sequence; chemically synthesized oligonucleotide
atttaggtga cactatag 18
SEQ ID NO: 11; length 23; DNA; Artificial Sequence; chemically synthesized oligonucleotide
atttagggga cactatagaa gag 23
SEQ ID NO: 12; length 22; DNA; Artificial Sequence; chemically synthesized oligonucleotide
atttagggga cactatagaa gg 22
SEQ ID NO: 13; length 23; DNA; Artificial Sequence; chemically synthesized oligonucleotide
atttagggga cactatagaa ggg 23
SEQ ID NO: 14; length 20; DNA; Artificial Sequence; chemically synthesized oligonucleotide
atttaggtga cactatagaa 20
SEQ ID NO: 15; length 22; DNA; Artificial Sequence; chemically synthesized oligonucleotide
atttaggtga cactatagaa ga 22
SEQ ID NO: 16; length 23; DNA; Artificial Sequence; chemically synthesized oligonucleotide
atttaggtga cactatagaa gag 23
SEQ ID NO: 17; length 22; DNA; Artificial Sequence; chemically synthesized oligonucleotide
atttaggtga cactatagaa gg 22
SEQ ID NO: 18; length 23; DNA; Artificial Sequence; chemically synthesized oligonucleotide
atttaggtga cactatagaa ggg 23
SEQ ID NO: 19; length 23; DNA; Artificial Sequence; chemically synthesized oligonucleotide; misc_feature (22)..(22): n is a, c, g, or t
atttaggtga cactatagaa gng 23
SEQ ID NO: 20; length 24; DNA; Artificial Sequence; chemically synthesized oligonucleotide
catacgattt aggtgacact atag 24
SEQ ID NO: 21; length 4443; DNA; Artificial Sequence; chemically synthesized oligonucleotide
atgcagagga gcccactgga gaaagcctcc gtggtgagta aactcttttt tagttggacc
60agacccatcc tgcgaaaagg atacaggcag cgcctcgagt tgtcagatat ctaccagatt
120ccttctgtgg actcagctga caatttgagt gagaagctgg agcgggagtg ggatagagag
180ctggcgagca aaaaaaaccc caagcttatc aatgctctgc gccgctgctt tttctggagg
240ttcatgtttt atgggatctt cctgtacctg ggggaggtca ccaaagctgt tcagccgctc
300cttcttggcc gcatcatcgc cagctatgac cctgataata aagaagaaag gtctattgct
360atttatctgg gaattggcct ctgcttgctc ttcatcgtcc gcacccttct gctgcaccct
420gccatttttg gccttcacca catcggcatg caaatgagaa ttgccatgtt ctccctcatt
480tacaaaaaga ccctgaaact ttcctcaaga gtgttagata aaatatccat tggtcagctg
540gtcagcctgc tgtccaacaa tcttaacaaa tttgatgaag gcttggcgct ggcccacttc
600gtgtggattg cacctctgca ggtggccctg ttgatgggac ttatatggga gctgcttcaa
660gcctctgctt tctgtgggct gggctttttg attgtactgg cactttttca ggctgggctc
720ggaagaatga tgatgaaata cagagatcag cgggccggga agatatcaga gcgacttgtg
780atcaccagtg aaatgattga aaatattcag agcgtgaaag cctactgctg ggaagaagcc
840atggagaaga tgattgagaa cctgaggcag acagagctca agctcactcg gaaggctgct
900tatgttcgct atttcaacag cagcgccttc ttcttcagtg gcttctttgt tgtcttcctg
960tctgttctgc catatgcact gataaaaggc attattttac gaaagatctt caccaccatc
1020agtttttgca tcgttctcag gatggccgtc acaagacagt tcccctgggc tgtgcagacc
1080tggtacgatt ccttgggggc catcaacaag attcaagatt tcttgcaaaa acaagaatat
1140aaaactttag aatacaacct caccaccact gaagtggtca tggaaaatgt gacagccttt
1200tgggaggagg gttttggaga attgttcgag aaggcaaagc agaataacaa caacaggaag
1260acgagcaatg gggacgactc tctcttcttc agcaactttt cactgctcgg gacccctgtg
1320ttgaaagata taaacttcaa gatcgagagg ggccagctct tggctgtggc aggctccact
1380ggagctggta aaacatctct tctcatggtg atcatggggg aactggagcc ttccgaagga
1440aaaatcaagc acagtgggag aatctcattc tgcagccagt tttcctggat catgcccggc
1500accattaagg aaaacatcat atttggagtg tcctatgatg agtaccgcta ccggtcagtc
1560atcaaagcct gtcagttgga ggaggacatc tccaagtttg cagagaaaga caacattgtg
1620cttggagagg ggggtatcac tctttctgga ggacaaagag ccaggatctc tttggcccgg
1680gcagtctaca aggatgcaga cctctacttg ttggacagtc ccttcggcta cctcgacgtg
1740ctgactgaaa aagaaatttt tgaaagctgt gtgtgcaaac tgatggcaaa caagaccagg
1800attcttgtca ccagcaagat ggaacatctg aagaaagcgg acaaaattct gattctgcat
1860gaagggagct cctacttcta tggaacattt agcgagcttc agaacctaca gccagacttc
1920tcctccaaat taatgggctg tgactccttc gaccagttct ctgcagaaag aagaaactct
1980atactcacag agaccctcca ccgcttctcc cttgagggag atgccccagt ttcttggaca
2040gaaaccaaga agcagtcctt taagcagact ggcgagtttg gtgaaaagag gaaaaattca
2100attctcaatc caattaacag tattcgcaag ttcagcattg tccagaagac acccctccag
2160atgaatggca tcgaagaaga tagtgacgag ccgctggaga gacggctgag tctggtgcca
2220gattcagaac agggggaggc catcctgccc cggatcagcg tcatttccac aggccccaca
2280ttacaagcac ggcgccggca gagtgtttta aatctcatga cccattcagt gaaccagggc
2340caaaatatcc acaggaagac tacagcttct acccggaaag tgtctctggc ccctcaggcc
2400aatctgaccg agctggacat ctacagcagg aggctctccc aggaaacagg gctggaaata
2460tctgaagaga ttaatgaaga ggatcttaaa gagtgcttct ttgatgacat ggagagcatc
2520cccgcggtga ccacatggaa cacctacctt agatatatta ctgtccacaa gagcctcata
2580tttgtcctca tctggtgcct ggttattttc ctcgctgagg tggcggccag tcttgttgtg
2640ctctggctgc tgggcaacac tcctctccag gacaagggca atagtactca cagcagaaat
2700aattcttatg ccgtcatcat tacaagcacc tccagctact acgtgttcta catctatgtg
2760ggcgtggctg acaccctcct ggccatgggt ttcttccggg gcctgccttt ggtgcacacc
2820ctcatcacag tgtcaaaaat tctgcaccat aaaatgcttc attctgtcct gcaggcaccc
2880atgagcactt tgaacacatt gaaggctggc ggcatcctca acagattttc taaagatatt
2940gctatcctgg atgatctcct ccccctgaca atctttgact ttatccagct tctgctgatc
3000gtgattggag ccatagcagt ggttgctgtc ctgcagccct acatttttgt ggccaccgtg
3060cccgtgattg ttgcctttat tatgctcaga gcttacttcc tgcaaacttc tcaacagctc
3120aaacagctag aatctgaggg ccggagcccc atttttaccc acctggtgac ttccctgaag
3180ggactgtgga ctctgagagc attcgggcga cagccttact ttgagacact gttccacaag
3240gccctgaact tgcacactgc caactggttt ctttacctga gcacactccg ctggttccag
3300atgcggatag agatgatctt cgtcatcttt tttatagctg taaccttcat ttctatcctt
3360acaacaggag aaggagaggg cagggtggga atcatcctca cgctggctat gaacataatg
3420tccaccttgc agtgggccgt gaattccagt atagatgtgg attctctaat gaggagtgtc
3480tcccgggtgt ttaaattcat tgatatgcct actgagggga aacccaccaa gtcaacaaaa
3540ccttataaga atggacagct gagcaaggtg atgataattg agaacagcca cgtgaagaag
3600gatgacattt ggcccagcgg gggccagatg actgtgaagg acctgacggc caagtacacc
3660gaaggtggaa atgccatttt ggaaaacatc agcttctcaa tctctcctgg gcagagagtt
3720ggattgctgg gtcgcacggg cagcggcaaa tcaaccctgc tcagtgcctt ccttcggctc
3780ctgaatacag aaggcgaaat ccaaattgac ggggtgagct gggacagcat caccctgcag
3840cagtggagaa aagcatttgg ggtcattcca cagaaagttt tcatcttctc tggcactttc
3900agaaagaacc tggaccccta tgagcagtgg agcgaccagg agatctggaa ggttgcagat
3960gaagttggcc tgcggagtgt gatagaacaa tttcctggca agctggattt tgtgctggta
4020gatggaggct gcgtgctgtc ccacggccac aaacagctga tgtgcctcgc ccgctccgtt
4080ctttcaaagg ccaaaatctt gcttttggat gagcccagtg ctcacctcga cccagtgacc
4140tatcagataa tccgcaggac cttaaagcaa gcttttgccg actgcaccgt catactgtgt
4200gagcaccgga ttgaagcaat gctggaatgc cagcagtttc tggtgatcga ggagaataag
4260gtccggcagt acgacagcat ccagaagttg ttgaatgagc gcagcctttt ccgccaggcc
4320atctccccat ctgacagagt caagctgttt ccacatagga actcctctaa gtgcaagtcc
4380aagccccaga tcgctgccct caaggaggaa actgaggaag aggtgcagga tacccgcctg
4440tga
4443
SEQ ID NO: 22; length 4443; DNA; Artificial Sequence; chemically synthesized oligonucleotide
atgcagagga gcccactgga gaaagcctcc gtggtgagta aactcttttt tagttggacc
60agacccatcc tgcgaaaagg atacaggcag cgcctcgagt tgtctgatat ctaccagatt
120ccttctgtgg actcagctga caatttgagt gagaagctgg agcgggagtg ggatagagag
180ctggcgagca aaaaaaaccc caagcttatc aatgctctgc gccgctgctt tttctggagg
240ttcatgtttt atgggatctt cctgtacctg ggggaggtca ccaaagctgt tcagccgctc
300cttcttggcc gcatcatcgc cagctatgac cctgataata aagaagaaag gtctattgct
360atttatctgg gaattggcct ctgcttgctc ttcatcgtcc gcacccttct gctgcaccct
420gccatttttg gccttcacca catcggcatg caaatgagaa ttgccatgtt ctccctcatt
480tacaaaaaga ccctgaaact ttcctcaaga gtgttagata aaatatccat tggtcagctg
540gtcagcctgc tgtccaacaa tcttaacaaa tttgatgaag gcttggcgct ggcccacttc
600gtgtggattg cacctctgca ggtggccctg ttgatgggac ttatatggga gctgcttcaa
660gcctctgctt tctgtgggct gggctttttg attgtactgg cactttttca ggctgggctc
720ggaagaatga tgatgaaata cagagatcag cgggccggga agatttcaga gcgacttgtg
780atcaccagtg aaatgattga aaatattcag agcgtgaaag cctactgctg ggaagaagcc
840atggagaaga tgattgagaa cctgaggcag acagagctca agctcactcg gaaggctgct
900tatgttcgct atttcaacag cagcgccttc ttcttcagtg gcttctttgt tgtcttcctg
960tctgttctgc catatgcact gataaaaggc attattttac gaaagatctt caccaccatc
1020agtttttgca tcgttctcag gatggccgtc acaagacagt tcccctgggc tgtgcagacc
1080tggtacgatt ccttgggggc catcaacaag attcaagatt tcttgcaaaa acaagaatat
1140aaaactttag aatacaacct caccaccact gaagtggtca tggaaaatgt gacagccttt
1200tgggaggagg gttttggaga attgttcgag aaggcaaagc agaataacaa caacaggaag
1260acgagcaatg gggacgactc tctcttcttc agcaactttt cactgctcgg gacccctgtg
1320ttgaaagata taaacttcaa gatcgagagg ggccagctct tggctgtggc aggctccact
1380ggagctggta aaacatctct tctcatggtg atcatggggg aactggagcc ttccgaagga
1440aaaatcaagc acagtgggag aatctcattc tgcagccagt tttcctggat catgcccggc
1500accattaagg aaaacatcat atttggagtg tcctatgatg agtaccgcta ccggtcagtc
1560atcaaagcct gtcagttgga ggaggacatc tccaagtttg cagagaaaga caacattgtg
1620cttggagagg ggggtatcac tctttctgga ggacaaagag ccaggatctc tttggcccgg
1680gcagtctaca aggatgcaga cctctacttg ttggacagtc ccttcggcta cctcgacgtg
1740ctgactgaaa aagaaatttt tgaaagctgt gtgtgcaaac tgatggcaaa caagaccagg
1800attcttgtca ccagcaagat ggaacatctg aagaaagcgg acaaaattct gattctgcat
1860gaagggagct cctacttcta tggaacattt agcgagcttc agaacctaca gccagacttc
1920tcctccaaat taatgggctg tgactccttc gaccagttct ctgcagaaag aagaaactct
1980atactcacag agaccctcca ccgcttctcc cttgagggag atgccccagt ttcttggaca
2040gaaaccaaga agcagtcctt taagcagact ggcgagtttg gtgaaaagag gaaaaattca
2100attctcaatc caattaacag tattcgcaag ttcagcattg tccagaagac acccctccag
2160atgaatggca tcgaagaaga tagtgacgag ccgctggaga gacggctgag tctggtgcca
2220gattcagaac agggggaggc catcctgccc cggatcagcg tcatttccac aggccccaca
2280ttacaagcac ggcgccggca gagtgtttta aatctcatga cccattcagt gaaccagggc
2340caaaatatcc acaggaagac tacagcttct acccggaaag tgtctctggc ccctcaggcc
2400aatctgaccg agctggacat ctacagcagg aggctctccc aggaaacagg gctggaaata
2460tctgaagaga ttaatgaaga ggatcttaaa gagtgcttct ttgatgacat ggagagcatc
2520cccgcggtga ccacatggaa cacctacctt agatatatta ctgtccacaa gagcctcata
2580tttgtcctca tctggtgcct ggttattttc ctcgctgagg tggcggccag tcttgttgtg
2640ctctggctgc tgggcaacac tcctctccag gacaagggca atagtactca cagcagaaat
2700aattcttatg ccgtcatcat tacaagcacc tccagctact acgtgttcta catctatgtg
2760ggcgtggctg acaccctcct ggccatgggt ttcttccggg gcctgccttt ggtgcacacc
2820ctcatcacag tgtcaaaaat tctgcaccat aaaatgcttc attctgtcct gcaggcaccc
2880atgagcactt tgaacacatt gaaggctggc ggcatcctca acagattttc taaagatatt
2940gctatcctgg atgatctcct ccccctgaca atctttgact ttatccagct tctgctgatc
3000gtgattggag ccatagcagt ggttgctgtc ctgcagccct acatttttgt ggccaccgtg
3060cccgtgattg ttgcctttat tatgctcaga gcttacttcc tgcaaacttc tcaacagctc
3120aaacagctag aatctgaggg ccggagcccc atttttaccc acctggtgac ttccctgaag
3180ggactgtgga ctctgagagc attcgggcga cagccttact ttgagacact gttccacaag
3240gccctgaact tgcacactgc caactggttt ctttacctga gcacactccg ctggttccag
3300atgcggatag agatgatctt cgtcatcttt tttatagctg taaccttcat ttctatcctt
3360acaacaggag aaggagaggg cagggtggga atcatcctca cgctggctat gaacataatg
3420tccaccttgc agtgggccgt gaattccagt atagatgtgg attctctaat gaggagtgtc
3480tcccgggtgt ttaaattcat tgatatgcct actgagggga aacccaccaa gtcaacaaaa
3540ccttataaga atggacagct gagcaaggtg atgataattg agaacagcca cgtgaagaag
3600gatgacattt ggcccagcgg gggccagatg actgtgaagg acctgacggc caagtacacc
3660gaaggtggaa atgccatttt ggaaaacatc agcttctcaa tctctcctgg gcagagagtt
3720ggattgctgg gtcgcacggg cagcggcaaa tcaaccctgc tcagtgcctt ccttcggctc
3780ctgaatacag aaggcgaaat ccaaattgac ggggtgagct gggacagcat caccctgcag
3840cagtggagaa aagcatttgg ggtcattcca cagaaagttt tcatcttctc tggcactttc
3900agaaagaacc tggaccccta tgagcagtgg agcgaccagg agatctggaa ggttgcagat
3960gaagttggcc tgcggagtgt gatagaacaa tttcctggca agctggattt tgtgctggta
4020gatggaggct gcgtgctgtc ccacggccac aaacagctga tgtgcctcgc ccgctccgtt
4080ctttcaaagg ccaaaatctt gcttttggat gagcccagtg ctcaccttga cccagtgacc
4140tatcagataa tccgcaggac cttaaagcaa gcttttgccg actgcaccgt catactgtgt
4200gagcaccgga ttgaagcaat gctggaatgc cagcagtttc tggtgatcga ggagaataag
4260gtccggcagt acgacagcat ccagaagttg ttgaatgagc gcagcctttt ccgccaggcc
4320atctccccat ctgacagagt caagctgttt ccacatagga actcctctaa gtgcaagtcc
4380aagccccaga tcgctgccct caaggaggaa actgaggaag aggtgcagga tacccgcctg
4440tga
4443
SEQ ID NO: 23; length 4443; DNA; Artificial Sequence; chemically synthesized oligonucleotide
atgcagagga gcccactgga gaaagcctcc gtggtgagta aactcttttt tagttggacc
60agacccatcc tgcgaaaagg atacaggcag cgcctcgagt tgtcagatat ctaccagatt
120ccttctgtgg actcagctga caatttgagt gagaagctgg agcgggagtg ggatagagag
180ctggcgagca aaaaaaaccc caagcttatc aatgctctgc gccgctgctt tttctggagg
240ttcatgtttt atgggatctt cctgtacctg ggggaggtca ccaaagctgt tcagccgctc
300cttcttggcc gcatcatcgc cagctatgac cctgataata aagaagaaag gtctattgct
360atttatctgg gaattggcct ctgcttgctc ttcatcgtcc gcacccttct gctgcaccct
420gccatttttg gccttcacca catcggcatg caaatgagaa ttgccatgtt ctccctcatt
480tacaaaaaga ccctgaaact ttcctcaaga gtgttagata aaatatccat tggtcagctg
540gtcagcctgc tgtccaacaa tcttaacaaa tttgatgaag gcttggcgct ggcccacttc
600gtgtggattg cacctctgca ggtggccctg ttgatgggac ttatatggga gctgcttcaa
660gcctctgctt tctgtgggct gggctttttg attgtactgg cactttttca ggctgggctc
720ggaagaatga tgatgaaata cagagatcag cgggccggga agatatcaga gcgacttgtg
780atcaccagtg aaatgattga aaatattcag agcgtgaaag cctactgctg ggaagaagcc
840atggagaaga tgattgagaa cctgaggcag acagagctca agctcactcg gaaggctgct
900tatgttcgct atttcaacag cagcgccttc ttcttcagtg gcttctttgt tgtcttcctg
960tctgttctgc catatgcact gataaaaggc attattttac gaaagatctt caccaccatc
1020agtttttgca tcgttctcag gatggccgtc acaagacagt tcccctgggc tgtgcagacc
1080tggtacgatt ccttgggggc catcaacaag attcaagatt tcttgcaaaa acaagaatat
1140aaaactttag aatacaacct caccaccact gaagtggtca tggaaaatgt gacagccttt
1200tgggaggagg gttttggaga attgttcgag aaggcaaagc agaataacaa caacaggaag
1260acgagcaatg gggacgactc tctcttcttc agcaactttt cactgctcgg gacccctgtg
1320ttgaaagata taaacttcaa gatcgagagg ggccagctct tggctgtggc aggctccact
1380ggagctggta aaacatctct tctcatggtg atcatggggg aactggagcc ttccgaagga
1440aaaatcaagc acagtgggag aatctcattc tgcagccagt tttcctggat catgcccggc
1500accattaagg aaaacatcat atttggagtg tcctatgatg agtaccgcta ccggtcagtc
1560atcaaagcct gtcagttgga ggaggacatc tccaagtttg cagagaaaga caacattgtg
1620cttggagagg ggggtatcac tctttctgga ggacaaagag ccaggatctc tttggcccgg
1680gcagtctaca aggatgcaga cctctacttg ttggacagtc ccttcggcta cctcgacgtg
1740ctgactgaaa aagaaatttt tgaaagctgt gtgtgcaaac tgatggcaaa caagaccagg
1800attcttgtca ccagcaagat ggaacatctg aagaaagcgg acaaaattct gattctgcat
1860gaagggagct cctacttcta tggaacattt agcgagcttc agaacctaca gccagacttc
1920tcctccaaat taatgggctg tgactccttc gaccagttct ctgcagaaag aagaaactct
1980atactcacag agaccctcca ccgcttctcc cttgagggag atgccccagt ttcttggaca
2040gaaaccaaga agcagtcctt taagcagact ggcgagtttg gtgaaaagag gaaaaattca
2100attctcaatc caattaacag tattcgcaag ttcagcattg tccagaagac acccctccag
2160atgaatggca tcgaagaaga tagtgacgag ccgctggaga gacggctgag tctggtgcca
2220gattcagaac agggggaggc catcctgccc cggatcagcg tcatttccac aggccccaca
2280ttacaagcac ggcgccggca gagtgtttta aatctcatga cccattcagt gaaccagggc
2340caaaatatcc acaggaagac tacagcttct acccggaaag tgtctctggc ccctcaggcc
2400aatctgaccg agctggacat ctacagcagg aggctctccc aggaaacagg gcttgaaata
2460tctgaagaga ttaatgaaga ggatcttaaa gagtgcttct ttgatgacat ggagagcatc
2520cccgcggtga ccacatggaa cacctacctt agatatatta ctgtccacaa gagcctcata
2580tttgtcctca tctggtgcct ggttattttc ctcgctgagg tggcggccag tcttgttgtg
2640ctctggctgc tgggcaacac tcctctccag gacaagggca atagtacaca cagcagaaat
2700aattcttatg ccgtcatcat tacaagcacc tccagctact acgtgttcta catctatgtg
2760ggcgtggctg acaccctcct ggccatgggt ttcttccggg gcctgccttt ggtgcacacc
2820ctcatcacag tgtcaaaaat tctgcaccat aaaatgcttc attctgtcct gcaggcaccc
2880atgagcactt tgaacacatt gaaggctggc ggcatcctca acagattttc taaagatatt
2940gctatcctgg atgatctcct ccccctgaca atctttgact ttatccagct tctgctgatc
3000gtgattggag ccatagcagt ggttgctgtc ctgcagccct acatttttgt ggccaccgtg
3060cccgtgattg ttgcctttat tatgctcaga gcttacttcc tgcaaacttc tcaacagctc
3120aaacagctag aatctgaggg ccggagcccc atttttaccc acctggtgac ttccctgaag
3180ggactgtgga ctctgagagc attcgggcga cagccttact ttgagacact gttccacaag
3240gccctgaact tgcacactgc caactggttt ctttacctga gcacactccg ctggttccag
3300atgcggatag agatgatctt cgtcatcttt tttatagctg taaccttcat ttctatcctt
3360acaacaggag aaggagaggg cagggtggga atcatcctca cgctggctat gaacataatg
3420tccaccttgc agtgggccgt gaattccagt atagatgtgg attctctaat gaggagtgtc
3480tcccgggtgt ttaaattcat tgatatgcct actgagggga aacccaccaa gtcaacaaaa
3540ccttataaga atggacagct gagcaaggtg atgataattg agaacagcca cgtgaagaag
3600gatgacattt ggcccagcgg gggccagatg actgtgaagg acctgacggc caagtacacc
3660gaaggtggaa atgccatttt ggaaaacatc agcttctcaa tctctcctgg gcagagagtt
3720ggattgctgg gtcgcacggg cagcggcaaa tcaaccctgc tcagtgcctt ccttcggctc
3780ctgaatacag aaggcgaaat ccaaattgac ggggtgagct gggacagcat caccctgcag
3840cagtggagaa aagcatttgg ggtcattcca cagaaagttt tcatcttctc tggcactttc
3900agaaagaacc tggaccccta tgagcagtgg agcgaccagg agatctggaa ggttgcagat
3960gaagttggcc tgcggagtgt gatagaacaa tttcctggca agctggattt tgtgctggta
4020gatggaggct gcgtgctgtc ccacggccac aaacagctga tgtgcctcgc ccgctccgtt
4080ctttcaaagg ccaaaatctt gcttttggat gagcccagtg ctcaccttga cccagtgacc
4140tatcagataa tccgcaggac cttaaagcaa gcttttgccg actgcaccgt catactgtgt
4200gagcaccgga ttgaagcaat gctggaatgc cagcagtttc tggtgatcga ggagaataag
4260gtccggcagt acgacagcat ccagaagttg ttgaatgagc gcagcctttt ccgccaggcc
4320atctccccat ctgacagagt caagctgttt ccacatagga actcctctaa gtgcaagtcc
4380aagccccaga tcgctgccct caaggaggaa actgaggaag aggtgcagga tacccgcctg
4440tga 4443

SEQ ID NO: 24
Length: 4443
Type: DNA
Organism: Artificial Sequence
Other information: chemically synthesized oligonucleotide
Sequence 24:
atgcagagga gcccactgga gaaagcctcc gtggtgagta aactcttttt tagttggacc
60agacccatcc tgcgaaaagg atacaggcag cgcctcgagt tgtcagatat ctaccagatt
120ccttctgtgg actcagctga caatttgagt gagaagctgg agcgggagtg ggatagagag
180ctggcgagca aaaaaaaccc caagcttatc aatgctctgc gccgctgctt tttctggagg
240ttcatgtttt atgggatctt cctgtacctg ggggaggtca ccaaagctgt tcagccgctc
300cttcttggcc gcatcatcgc cagctatgac cctgataata aagaagaaag gtctattgct
360atttatctgg gaattggcct ctgcttgctc ttcatcgtcc gcacccttct gctgcaccct
420gccatttttg gccttcacca catcggcatg caaatgagaa ttgccatgtt ctccctcatt
480tacaaaaaga ccctgaaact ttcctcaaga gtgttagata aaatatccat tggtcagctg
540gtcagcctgc tgtccaacaa tcttaacaaa tttgatgaag gcttggcgct ggcccacttc
600gtgtggattg cacctctgca ggtggccctg ttgatgggac ttatatggga gctgcttcaa
660gcctctgctt tctgtgggct gggctttttg attgtactgg cactttttca ggctgggctc
720ggaagaatga tgatgaaata cagagatcag cgggccggga agatatcaga gcgacttgtg
780atcaccagtg aaatgattga aaatattcag agcgtgaaag cctactgctg ggaagaagcc
840atggagaaga tgattgagaa cctgaggcag acagagctca agctcactcg gaaggctgct
900tatgttcgct atttcaacag cagcgccttc ttcttcagtg gcttctttgt tgtcttcctg
960tctgttctgc catatgcact gataaaaggc attattttac gaaagatctt caccaccatc
1020agtttttgca tcgttctcag gatggccgtc acaagacagt tcccctgggc tgtgcagacc
1080tggtacgatt ccttgggggc catcaacaag attcaagatt tcttgcaaaa acaagaatat
1140aaaactttag aatacaacct caccaccact gaagtggtca tggaaaatgt gacagccttt
1200tgggaggagg gttttggaga attgttcgag aaggcaaagc agaataacaa caacaggaag
1260acgagcaatg gggacgactc tctcttcttc agcaactttt cactgctcgg gacccctgtg
1320ttgaaagata taaacttcaa gatcgagagg ggccagctct tggctgtggc aggctccact
1380ggagctggta aaacatctct tctcatggtg atcatggggg aactggagcc ttccgaagga
1440aaaatcaagc acagtgggag aatctcattc tgcagccagt tttcctggat catgcccggc
1500accattaagg aaaacatcat atttggagtg tcctatgatg agtaccgcta ccggtcagtc
1560atcaaagcct gtcagttgga ggaggacatc tccaagtttg cagagaaaga caacattgtg
1620cttggagagg ggggtatcac tctttctgga ggacaaagag ccaggatctc tttggcccgg
1680gcagtctaca aggatgcaga cctctacttg ttggacagtc ccttcggcta cctcgacgtg
1740ctgactgaaa aagaaatttt tgaaagctgt gtgtgcaaac tgatggcaaa caagaccagg
1800attcttgtca ccagcaagat ggaacatctg aagaaagcgg acaaaattct gattctgcat
1860gaagggagct cctacttcta tggaacattt agcgagcttc agaacctaca gccagacttc
1920tcctccaaat taatgggctg tgactccttc gaccagttct ctgcagaaag aagaaactct
1980atactcacag agaccctcca ccgcttctcc cttgagggag atgccccagt ttcttggaca
2040gaaaccaaga agcagtcctt taagcagact ggcgagtttg gtgaaaagag gaaaaattca
2100attctcaatc caattaacag tattcgcaag ttcagcattg tccagaagac acccctccag
2160atgaatggca tcgaagaaga tagtgacgag ccgctggaga gacggctgag tctggtgcca
2220gattcagaac agggggaggc catcctgccc cggatcagcg tcatttccac aggccccaca
2280ttacaagcac ggcgccggca gagtgtttta aatctcatga cccattcagt gaaccagggc
2340caaaatatcc acaggaagac tacagcttct acccggaaag tgtctctggc ccctcaggcc
2400aatctgaccg agctggacat ctacagcagg aggctctccc aggaaacagg gctggaaata
2460tctgaagaga ttaatgaaga ggatcttaaa gagtgcttct ttgatgacat ggagagcatc
2520cccgcggtga ccacatggaa cacctacctt agatatatta ctgtccacaa gagcctcata
2580tttgtcctca tctggtgcct ggttattttc ctcgctgagg tggcggccag tcttgttgtg
2640ctctggctgc tgggcaacac tcctctccag gacaagggca atagtactca cagcagaaat
2700aattcttatg ccgtcatcat tacaagcacc tccagctact acgtgttcta catctatgtg
2760ggcgtggctg acaccctcct ggccatgggt ttcttccggg gcctgccttt ggtgcacacc
2820ctcatcacag tgtcaaaaat tctgcaccat aaaatgcttc attctgtcct gcaggcaccc
2880atgagcactt tgaacacatt gaaggctggc ggcatcctca acagattttc taaagatatt
2940gctatcctgg atgatctcct ccccctgaca atctttgact ttatccagct tctgctgatc
3000gtgattggag ccatagcagt ggttgctgtc ctgcagccct acatttttgt ggccaccgtg
3060cccgtgattg ttgcctttat tatgctcaga gcttacttcc tgcaaacttc tcaacagctc
3120aaacagctag aatctgaggg ccggagcccc atttttaccc acctggtgac ttccctgaag
3180ggactgtgga ctctgagagc attcgggcga cagccttact ttgagacact gttccacaag
3240gccctgaact tgcacactgc caactggttt ctttacctga gcacactccg ctggttccag
3300atgcggatag agatgatctt cgtcatcttt tttatagctg taaccttcat ttctatcctt
3360acaacaggag aaggagaggg cagggtggga atcatcctca cgctggctat gaacataatg
3420tccaccttgc agtgggccgt gaattccagt atagatgtgg attctctaat gaggagtgtc
3480tcccgggtgt ttaaattcat tgatatgcca actgagggga aacccaccaa gtcaacaaaa
3540ccttataaga atggacagct gagcaaggtg atgataattg agaacagcca cgtgaagaag
3600gatgacattt ggcccagcgg gggccagatg actgtgaagg acctgacggc caagtacacc
3660gaaggtggaa atgccatttt ggaaaacatc agcttctcaa tctctcctgg gcagagagtt
3720ggattgctgg gtcgcacggg cagcggcaaa tcaaccctgc tcagtgcctt ccttcggctc
3780ctgaatacag aaggcgaaat ccaaattgac ggggtgagct gggacagcat caccctgcag
3840cagtggagaa aagcatttgg ggtcattcca cagaaagttt tcatcttctc tggcactttc
3900agaaagaacc tggaccccta tgagcagtgg agcgaccagg agatctggaa ggttgcagat
3960gaagttggcc tgcggagtgt gatagaacaa tttcctggca agctggattt tgtgctggta
4020gatggaggct gcgtgctgtc ccacggccac aaacagctga tgtgcctcgc ccgctccgtt
4080ctttcaaagg ccaaaatctt gcttttggat gagcccagtg ctcacctcga cccagtgacc
4140tatcagataa tccgcaggac cttaaagcaa gcttttgccg actgcaccgt catactgtgt
4200gagcaccgga ttgaagcaat gctggaatgc cagcagtttc tggtgatcga ggagaataag
4260gtccggcagt acgacagcat ccagaagttg ttgaatgagc gcagcctttt ccgccaggcc
4320atctccccat ctgacagagt caagctgttt ccacatagga actcctctaa gtgcaagtcc
4380aagccccaga tcgctgccct caaggaggaa actgaggaag aggtgcagga tacccgcctg
4440tga 4443

SEQ ID NO: 25
Length: 4443
Type: DNA
Organism: Artificial Sequence
Other information: chemically synthesized oligonucleotide
Sequence 25:
atgcagagga gcccactgga gaaagcctcc gtggtgagta aactcttttt tagttggacc
60agacccatcc tgcgaaaagg atacaggcag cgcctcgagt tgtcagatat ctaccagatt
120ccttctgtgg actcagctga caatttgagt gagaagctgg agcgggagtg ggatagagag
180ctggcgagca aaaaaaaccc caagcttatc aatgctctgc gccgctgctt tttctggagg
240ttcatgtttt atgggatctt cctgtacctg ggggaggtca ccaaagctgt tcagccgctc
300cttcttggcc gcatcatcgc cagctatgac cctgataata aagaagaaag gtctattgct
360atttatctgg gaattggcct ctgcttgctc ttcatcgtcc gcacccttct gctgcaccct
420gccatttttg gccttcacca catcggcatg caaatgagaa ttgccatgtt ctccctcatt
480tacaaaaaga ccctgaaact ttcctcaaga gtgttagata aaatatccat tggtcagctg
540gtcagcctgc tgtccaacaa tcttaacaaa tttgatgaag gcttggcgct ggcccacttc
600gtgtggattg cacctctgca ggtggccctg ttgatgggac ttatatggga gctgcttcaa
660gcctctgctt tctgtgggct gggctttttg attgtactgg cactttttca ggctgggctc
720ggaagaatga tgatgaaata cagagatcag cgggccggga agatatcaga gcgacttgtg
780atcaccagtg aaatgattga aaatattcag agcgtgaaag cctactgctg ggaagaagcc
840atggagaaga tgattgagaa cctgaggcag acagagctca agctcactcg gaaggctgct
900tatgttcgct atttcaacag cagcgccttc ttcttcagtg gcttctttgt tgtcttcctg
960tctgttctgc catatgcact gataaaaggc attattttac gaaagatctt caccaccatc
1020agtttttgca tcgttctcag gatggccgtc acaagacagt tcccctgggc tgtgcagacc
1080tggtacgatt ccttgggggc catcaacaag attcaagatt tcttgcaaaa acaagaatat
1140aaaactttag aatacaacct caccaccact gaagtggtca tggaaaatgt gacagccttt
1200tgggaggagg gttttggaga attgttcgag aaggcaaagc agaataacaa caacaggaag
1260acgagcaatg gggacgactc tctcttcttc agcaactttt cactgctcgg gacccctgtg
1320ttgaaagata taaacttcaa gatcgagagg ggccagctct tggctgtggc aggctccact
1380ggagctggta aaacatctct tctcatggtg atcatggggg aactggagcc ttccgaagga
1440aaaatcaagc acagtgggag aatctcattc tgcagccagt tttcctggat catgcccggc
1500accattaagg aaaacatcat atttggagtg tcctatgatg agtaccgcta ccggtcagtc
1560atcaaagcct gtcagttgga ggaggacatc tccaagtttg cagagaaaga caacattgtg
1620cttggagagg ggggtatcac tctttctgga ggacaaagag ccaggatctc tttggcccgg
1680gcagtctaca aggatgcaga cctctacttg ttggacagtc ccttcggcta cctcgacgtg
1740ctgactgaaa aagaaatttt tgaaagctgt gtgtgcaaac tgatggcaaa caagaccagg
1800attcttgtca ccagcaagat ggaacatctg aagaaagcgg acaaaattct gattctgcat
1860gaagggagct cctacttcta tggaacattt agcgagcttc agaacctaca gccagacttc
1920tcctccaaat taatgggctg tgactccttc gaccagttct ctgcagaaag aagaaactct
1980atactcacag agaccctcca ccgcttctcc cttgagggag atgccccagt ttcttggaca
2040gaaaccaaga agcagtcctt taagcagact ggcgagtttg gtgaaaagag gaaaaattca
2100attctcaatc ctattaacag tattcgcaag ttcagcattg tccagaagac acccctccag
2160atgaatggca tcgaagaaga tagtgacgag ccgctggaga gacggctgag tctggtgcca
2220gattcagaac agggggaggc catcctgccc cggatcagcg tcatttccac aggccccaca
2280ttacaagcac ggcgccggca gagtgtttta aatctcatga cccattcagt gaaccagggc
2340caaaatatcc acaggaagac tacagcttct acccggaaag tgtctctggc ccctcaggcc
2400aatctgaccg agctggacat ctacagcagg aggctctccc aggaaacagg gcttgaaata
2460tctgaagaga ttaatgaaga ggatcttaaa gagtgcttct ttgatgacat ggagagcatc
2520cccgcggtga ccacatggaa cacctacctt agatatatta ctgtccacaa gagcctcata
2580tttgtcctca tctggtgcct ggttattttc ctcgctgagg tggcggccag tcttgttgtg
2640ctctggctgc tgggcaacac tcctctccag gacaagggca atagtactca cagcagaaat
2700aattcttatg ccgtcatcat tacaagcacc tccagctact acgtgttcta catctatgtg
2760ggcgtggctg acaccctcct ggccatgggt ttcttccggg gcctgccttt ggtgcacacc
2820ctcatcacag tgtcaaaaat tctgcaccat aaaatgcttc attctgtcct gcaggcaccc
2880atgagcactt tgaacacatt gaaggctggc ggcatcctca acagattttc taaagatatt
2940gctatcctgg atgatctcct ccccctgaca atctttgact ttatccagct tctgctgatc
3000gtgattggag ccatagcagt ggttgctgtc ctgcagccct acatttttgt ggccaccgtg
3060cccgtgattg ttgcctttat tatgctcaga gcttacttcc tgcaaacttc tcaacagctc
3120aaacagctag aatctgaggg ccggagcccc atttttaccc acctggtgac ttccctgaag
3180ggactgtgga ctctgagagc attcgggcga cagccttact ttgagacact gttccacaag
3240gccctgaact tgcacactgc caactggttt ctttacctga gcacactccg ctggttccag
3300atgcggatag agatgatctt cgtcatcttt tttatagctg taaccttcat ttctatcctt
3360acaacaggag aaggagaggg cagggtggga atcatcctca cgctggctat gaacataatg
3420tccaccttgc agtgggccgt gaattccagt atagatgtgg attctctaat gaggagtgtc
3480tcccgggtgt ttaaattcat tgatatgcct actgagggga aacccaccaa gtcaacaaaa
3540ccttataaga atggacagct gagcaaggtg atgataattg agaacagcca cgtgaagaag
3600gatgacattt ggcccagcgg gggccagatg actgtgaagg acctgacggc caagtacacc
3660gaaggtggaa atgccatttt ggaaaacatc agcttctcaa tctctcctgg gcagagagtt
3720ggattgctgg gtcgcacggg cagcggcaaa tcaaccctgc tcagtgcctt ccttcggctc
3780ctgaatacag aaggcgaaat ccaaattgac ggggtgagct gggacagcat caccctgcag
3840cagtggagaa aagcatttgg ggtcattcca cagaaagttt tcatcttctc tggcactttc
3900agaaagaacc tggaccccta tgagcagtgg agcgaccagg agatctggaa ggttgcagat
3960gaagttggcc tgcggagtgt gatagaacaa tttcctggca agctggattt tgtgctggta
4020gatggaggct gcgtgctgtc ccacggccac aaacagctga tgtgcctcgc ccgctccgtt
4080ctttcaaagg ccaaaatctt gcttttggat gagcccagtg ctcacctcga cccagtgacc
4140tatcagataa tccgcaggac cttaaagcaa gcttttgccg actgcaccgt catactgtgt
4200gagcaccgga ttgaagcaat gctggaatgc cagcagtttc tggtgatcga ggagaataag
4260gtccggcagt acgacagcat ccagaagttg ttgaatgagc gcagcctttt ccgccaggcc
4320atctccccat ctgacagagt caagctgttt ccacatagga actcctctaa gtgcaagtcc
4380aagccccaga tcgctgccct caaggaggaa actgaggaag aggtgcagga tacccgcctg
4440tga 4443

SEQ ID NO: 26
Length: 4443
Type: DNA
Organism: Artificial Sequence
Other information: chemically synthesized oligonucleotide
Sequence 26:
atgcagagga gcccactgga gaaagcctcc gtggtgagta aactcttttt tagttggacc
60agacccatcc tgcgaaaagg atacaggcag cgcctcgagt tgtcagatat ctaccagatt
120ccttctgtgg actcagctga caatttgagt gagaagctgg agcgggagtg ggatagagag
180ctggcgagca aaaaaaaccc caagcttatc aatgctctgc gccgctgctt tttctggagg
240ttcatgtttt atgggatctt cctgtacctg ggggaggtca ccaaagctgt tcagccgctc
300cttcttggcc gcatcatcgc cagctatgac cctgataata aagaagaaag gtctattgct
360atttatctgg gaattggcct ctgcttgctc ttcatcgtcc gcacccttct gctgcaccct
420gccatttttg gccttcacca catcggcatg caaatgagaa ttgccatgtt ctccctcatt
480tacaaaaaga ccctgaaact ttcctcaaga gtgttagata aaatatccat tggtcagctg
540gtcagcctgc tgtccaacaa tcttaacaaa tttgatgaag gcttggcgct ggcccacttc
600gtgtggattg cacctctgca ggtggccctg ttgatgggac ttatatggga gctgcttcaa
660gcctctgctt tctgtgggct gggctttttg attgtactgg cactttttca ggctgggctc
720ggaagaatga tgatgaaata cagagatcag cgggccggga agatttcaga gcgacttgtg
780atcaccagtg aaatgattga aaatattcag agcgtgaaag cctactgctg ggaagaagcc
840atggagaaga tgattgagaa cctgaggcag acagagctca agctcactcg gaaggctgct
900tatgttcgct atttcaacag cagcgccttc ttcttcagtg gcttctttgt tgtcttcctg
960tctgttctgc catatgcact gataaaaggc attattttac gaaagatctt caccaccatc
1020agtttttgca tcgttctcag gatggccgtc acaagacagt tcccctgggc tgtgcagacc
1080tggtacgatt ccttgggggc catcaacaag attcaagatt tcttgcaaaa acaagaatat
1140aaaactttag aatacaacct caccaccact gaagtggtca tggaaaatgt gacagccttt
1200tgggaggagg gttttggaga attgttcgag aaggcaaagc agaataacaa caacaggaag
1260acgagcaatg gggacgactc tctcttcttc agcaactttt cactgctcgg gacccctgtg
1320ttgaaagata taaacttcaa gatcgagagg ggccagctct tggctgtggc aggctccact
1380ggagctggta aaacatctct tctcatggtg atcatggggg aactggagcc ttccgaagga
1440aaaatcaagc acagtgggag aatctcattc tgcagccagt tttcctggat catgcccggc
1500accattaagg aaaacatcat atttggagtg tcctatgatg agtaccgcta ccggtcagtc
1560atcaaagcct gtcagttgga ggaggacatc tccaagtttg cagagaaaga caacattgtg
1620cttggagagg ggggtatcac tctttctgga ggacaaagag ccaggatctc tttggcccgg
1680gcagtctaca aggatgcaga cctctacttg ttggacagtc ccttcggcta cctcgacgtg
1740ctgactgaaa aagaaatttt tgaaagctgt gtgtgcaaac tgatggcaaa caagaccagg
1800attcttgtca ccagcaagat ggaacatctg aagaaagcgg acaaaattct gattctgcat
1860gaagggagct cctacttcta tggaacattt agcgagcttc agaacctaca gccagacttc
1920tcctccaaat taatgggctg tgactccttc gaccagttct ctgcagaaag aagaaactct
1980atactcacag agaccctcca ccgcttctcc cttgagggag atgccccagt ttcttggaca
2040gaaaccaaga agcagtcctt taagcagact ggcgagtttg gtgaaaagag gaaaaattca
2100attctcaatc caattaacag tattcgcaag ttcagcattg tccagaagac acccctccag
2160atgaatggca tcgaagaaga tagtgacgag ccgctggaga gacggctgag tctggtgcca
2220gattcagaac agggggaggc catcctgccc cggatcagcg tcatttccac aggccccaca
2280ttacaagcac ggcgccggca gagtgtttta aatctcatga cccattcagt gaaccagggc
2340caaaatatcc acaggaagac tacagcttct acccggaaag tgtctctggc ccctcaggcc
2400aatctgaccg agctggacat ctacagcagg aggctctccc aggaaacagg gctggaaata
2460tctgaagaga ttaatgaaga ggatcttaaa gagtgcttct ttgatgacat ggagagcatc
2520cccgcggtga ccacatggaa cacctacctt agatatatta ctgtccacaa gagcctcata
2580tttgtcctca tctggtgcct ggttattttc ctcgctgagg tggcggccag tcttgttgtg
2640ctctggctgc tgggcaacac tcctctccag gacaagggca atagtactca cagcagaaat
2700aattcttatg ccgtcatcat tacaagcacc tccagctact acgtgttcta catctatgtg
2760ggcgtggctg acaccctcct ggccatgggt ttcttccggg gcctgccttt ggtgcacacc
2820ctcatcacag tgtcaaaaat tctgcaccat aaaatgcttc attctgtcct gcaggcaccc
2880atgagcactt tgaacacatt gaaggctggc ggcatcctca acagattttc taaagatatt
2940gctatcctgg atgatctcct ccccctgaca atctttgact ttatccagct tctgctgatc
3000gtgattggag ccatagcagt ggttgctgtc ctgcagccct acatttttgt ggccaccgtg
3060cccgtgattg ttgcctttat tatgctcaga gcttacttcc tgcaaacttc tcaacagctc
3120aaacagctag aatctgaggg ccggagcccc atttttaccc acctggtgac ttccctgaag
3180ggactgtgga ctctgagagc attcgggcga cagccttact ttgagacact gttccacaag
3240gccctgaact tgcacactgc caactggttt ctttacctga gcacactccg ctggttccag
3300atgcggatag agatgatctt cgtcatcttt tttatagctg taaccttcat ttctatcctt
3360acaacaggag aaggagaggg cagggtggga atcatcctca cgctggctat gaacataatg
3420tccaccttgc agtgggccgt gaattccagt atagatgtgg attctctaat gaggagtgtc
3480tcccgggtgt ttaaattcat tgatatgcca actgagggga aacccaccaa gtcaacaaaa
3540ccttataaga atggacagct gagcaaggtg atgataattg agaacagcca cgtgaagaag
3600gatgacattt ggcccagcgg gggccagatg actgtgaagg acctgacggc caagtacacc
3660gaaggtggaa atgccatttt ggaaaacatc agcttctcaa tctctcctgg gcagagagtt
3720ggattgctgg gtcgcacggg cagcggcaaa tcaaccctgc tcagtgcctt ccttcggctc
3780ctgaatacag aaggcgaaat ccaaattgac ggggtgagct gggacagcat caccctgcag
3840cagtggagaa aagcatttgg ggtcattcca cagaaagttt tcatcttctc tggcactttc
3900agaaagaacc tggaccccta tgagcagtgg agcgaccagg agatctggaa ggttgcagat
3960gaagttggcc tgcggagtgt gatagaacaa tttcctggca agctggattt tgtgctggta
4020gatggaggct gcgtgctgtc ccacggccac aaacagctga tgtgcctcgc ccgctccgtt
4080ctttcaaagg ccaaaatctt gcttttggat gagcccagtg ctcacctcga cccagtgacc
4140tatcagataa tccgcaggac cttaaagcaa gcttttgccg actgcaccgt catactgtgt
4200gagcaccgga ttgaagcaat gctggaatgc cagcagtttc tggtgatcga ggagaataag
4260gtccggcagt acgacagcat ccagaagttg ttgaatgagc gcagcctttt ccgccaggcc
4320atctccccat ctgacagagt caagctgttt ccacatagga actcctctaa gtgcaagtcc
4380aagccccaga tcgctgccct caaggaggaa actgaggaag aggtgcagga tacccgcctg
4440tga 4443

SEQ ID NO: 27
Length: 4443
Type: DNA
Organism: Artificial Sequence
Other information: chemically synthesized oligonucleotide
Sequence 27:
atgcagagga gcccactgga gaaagcctcc gtggtgagta aactcttttt tagttggacc
60agacccatcc tgcgaaaagg atacaggcag cgcctcgagt tgtctgatat ctaccagatt
120ccttctgtgg actcagctga caatttgagt gagaagctgg agcgggagtg ggatagagag
180ctggcgagca aaaaaaaccc caagcttatc aatgctctgc gccgctgctt tttctggagg
240ttcatgtttt atgggatctt cctgtacctg ggggaggtca ccaaagctgt tcagccgctc
300cttcttggcc gcatcatcgc cagctatgac cctgataata aagaagaaag gtctattgct
360atttatctgg gaattggcct ctgcttgctc ttcatcgtcc gcacccttct gctgcaccct
420gccatttttg gccttcacca catcggcatg caaatgagaa ttgccatgtt ctccctcatt
480tacaaaaaga ccctgaaact ttcctcaaga gtgttagata aaatatccat tggtcagctg
540gtcagcctgc tgtccaacaa tcttaacaaa tttgatgaag gcttggcgct ggcccacttc
600gtgtggattg cacctctgca ggtggccctg ttgatgggac ttatatggga gctgcttcaa
660gcctctgctt tctgtgggct gggctttttg attgtactgg cactttttca ggctgggctc
720ggaagaatga tgatgaaata cagagatcag cgggccggga agatatcaga gcgacttgtg
780atcaccagtg aaatgattga aaatattcag agcgtgaaag cctactgctg ggaagaagcc
840atggagaaga tgattgagaa cctgaggcag acagagctca agctcactcg gaaggctgct
900tatgttcgct atttcaacag cagcgccttc ttcttcagtg gcttctttgt tgtcttcctg
960tctgttctgc catatgcact gataaaaggc attattttac gaaagatctt caccaccatc
1020agtttttgca tcgttctcag gatggccgtc acaagacagt tcccctgggc tgtgcagacc
1080tggtacgatt ccttgggggc catcaacaag attcaagatt tcttgcaaaa acaagaatat
1140aaaactttag aatacaacct caccaccact gaagtggtca tggaaaatgt gacagccttt
1200tgggaggagg gttttggaga attgttcgag aaggcaaagc agaataacaa caacaggaag
1260acgagcaatg gggacgactc tctcttcttc agcaactttt cactgctcgg gacccctgtg
1320ttgaaagata taaacttcaa gatcgagagg ggccagctct tggctgtggc aggctccact
1380ggagctggta aaacatctct tctcatggtg atcatggggg aactggagcc ttccgaagga
1440aaaatcaagc acagtgggag aatctcattc tgcagccagt tttcctggat catgcccggc
1500accattaagg aaaacatcat atttggagtg tcctatgatg agtaccgcta ccggtccgtc
1560atcaaagcct gtcagttgga ggaggacatc tccaagtttg cagagaaaga caacattgtg
1620cttggagagg ggggtatcac tctttctgga ggacaaagag ccaggatctc tttggcccgg
1680gcagtctaca aggatgcaga cctctacttg ttggacagtc ccttcggcta cctcgacgtg
1740ctgactgaaa aagaaatttt tgaaagctgt gtgtgcaaac tgatggcaaa caagaccagg
1800attcttgtca ccagcaagat ggaacatctg aagaaagcgg acaaaattct gattctgcat
1860gaagggagct cctacttcta tggaacattt agcgagcttc agaacctaca gccagacttc
1920tcctccaaat taatgggctg tgactccttc gaccagttct ctgcagaaag aagaaactct
1980atactcacag agaccctcca ccgcttctcc cttgagggag atgccccagt ttcttggaca
2040gaaaccaaga agcagtcctt taagcagact ggcgagtttg gtgaaaagag gaaaaattca
2100attctcaatc caattaacag tattcgcaag ttcagcattg tccagaagac acccctccag
2160atgaatggca tcgaagaaga tagtgacgag ccgctggaga gacggctgag tctggtgcca
2220gattcagaac agggggaggc catcctgccc cggatcagcg tcatttccac aggccccaca
2280ttacaagcac ggcgccggca gagtgtttta aatctcatga cccattcagt gaaccagggc
2340caaaatatcc acaggaagac tacagcttct acccggaaag tgtctctggc ccctcaggcc
2400aatctgaccg agctggacat ctacagcagg aggctctccc aggaaacagg gcttgaaata
2460tctgaagaga ttaatgaaga ggatcttaaa gagtgcttct ttgatgacat ggagagcatc
2520cccgcggtga ccacatggaa cacctacctt agatatatta ctgtccacaa gagcctcata
2580tttgtcctca tctggtgcct ggttattttc ctcgctgagg tggcggccag tcttgttgtg
2640ctctggctgc tgggcaacac tcctctccag gacaagggca atagtactca cagcagaaat
2700aattcttatg ccgtcatcat tacaagcacc tccagctact acgtgttcta catctatgtg
2760ggcgtggctg acaccctcct ggccatgggt ttcttccggg gcctgccttt ggtgcacacc
2820ctcatcacag tgtcaaaaat tctgcaccat aaaatgcttc attctgtcct gcaggcaccc
2880atgagcactt tgaacacatt gaaggctggc ggcatcctca acagattttc taaagatatt
2940gctatcctgg atgatctcct ccccctgaca atctttgact ttatccagct tctgctgatc
3000gtgattggag ccatagcagt ggttgctgtc ctgcagccct acatttttgt ggccaccgtg
3060cccgtgattg ttgcctttat tatgctcaga gcttacttcc tgcaaacttc tcaacagctc
3120aaacagctag aatctgaggg ccggagcccc atttttaccc acctggtgac ttccctgaag
3180ggactgtgga ctctgagagc attcgggcga cagccttact ttgagacact gttccacaag
3240gccctgaact tgcacactgc caactggttt ctttacctga gcacactccg ctggttccag
3300atgcggatag agatgatctt cgtcatcttt tttatagctg taaccttcat ttctatcctt
3360acaacaggag aaggagaggg cagggtggga atcatcctca cgctggctat gaacataatg
3420tccaccttgc agtgggccgt gaattccagt atagatgtgg attctctaat gaggagtgtc
3480tcccgggtgt ttaaattcat tgatatgcct actgagggga aacccaccaa gtcaacaaag
3540ccttataaga atggacagct gagcaaggtg atgataattg agaacagcca cgtgaagaag
3600gatgacattt ggcccagcgg gggccagatg actgtgaagg acctgacggc caagtacacc
3660gaaggtggaa atgccatttt ggaaaacatc agcttctcaa tctctcctgg gcagagagtt
3720ggattgctgg gtcgcacggg cagcggcaaa tcaaccctgc tcagtgcctt ccttcggctc
3780ctgaatacag aaggcgaaat ccaaattgac ggggtgagct gggacagcat caccctgcag
3840cagtggagaa aagcatttgg ggtcattcca cagaaagttt tcatcttctc tggcactttc
3900agaaagaacc tggaccccta tgagcagtgg agcgaccagg agatctggaa ggttgcagat
3960gaagttggcc tgcggagtgt gatagaacaa tttcctggca agctggattt tgtgctggta
4020gatggaggct gcgtgctgtc ccacggccac aaacagctga tgtgcctcgc ccgctccgtt
4080ctttcaaagg ccaaaatctt gcttttggat gagcccagtg ctcacctcga cccagtgacc
4140tatcagataa tccgcaggac cttaaagcaa gcttttgccg actgcaccgt catactgtgt
4200gagcaccgga ttgaagcaat gctggaatgc cagcagtttc tggtgatcga ggagaataag
4260gtccggcagt acgacagcat ccagaagttg ttgaatgagc gcagcctttt ccgccaggcc
4320atctccccat ctgacagagt caagctgttt ccacatagga actcctctaa gtgcaagtcc
4380aagccccaga tcgctgccct caaggaggaa actgaggaag aggtgcagga tacccgcctg
4440tga 4443

SEQ ID NO: 28
Length: 4443
Type: DNA
Organism: Artificial Sequence
Other information: chemically synthesized oligonucleotide
Sequence 28:
atgcagagga gcccactgga gaaagcctcc gtggtgagta aactcttttt tagttggacc
60agacccatcc tgcgaaaagg atacaggcag cgcctcgagt tgtcagatat ctaccagatt
120ccttctgtgg actcagctga caatttgagt gagaagctgg agcgggagtg ggatagagag
180ctggcgagca aaaaaaaccc caagcttatc aatgctctgc gccgctgctt tttctggagg
240ttcatgtttt atgggatctt cctgtacctg ggggaggtca ccaaagctgt tcagccgctc
300cttcttggcc gcatcatcgc cagctatgac cctgataata aagaagaaag gtctattgct
360atttatctgg gaattggcct ctgcttgctc ttcatcgtcc gcacccttct gctgcaccct
420gccatttttg gccttcacca catcggcatg caaatgagaa ttgccatgtt ctccctcatt
480tacaaaaaga ccctgaaact ttcctcaaga gtgttagata aaatatccat tggtcagctg
540gtcagcctgc tgtccaacaa tcttaacaaa tttgatgaag gcttggcgct ggcccacttc
600gtgtggattg cacctctgca ggtggccctg ttgatgggac ttatatggga gctgcttcaa
660gcctctgctt tctgtgggct gggctttttg attgtactgg cactttttca ggctgggctc
720ggaagaatga tgatgaaata cagagatcag cgggccggga agatatcaga gcgacttgtg
780atcaccagtg aaatgattga aaatattcag agcgtgaaag cctactgctg ggaagaagcc
840atggagaaga tgattgagaa cctgaggcag acagagctca agctcactcg gaaggctgct
900tatgttcgct atttcaacag cagcgccttc ttcttcagtg gcttctttgt tgtcttcctg
960tctgttctgc catatgcact gataaaaggc attattttac gaaagatctt caccaccatc
1020agtttttgca tcgttctcag gatggccgtc acaagacagt tcccctgggc tgtgcagacc
1080tggtacgatt ccttgggggc catcaacaag attcaagatt tcttgcaaaa acaagaatat
1140aaaactttag aatacaacct caccaccact gaagtggtca tggaaaatgt gacagccttt
1200tgggaggagg gttttggaga attgttcgag aaggcaaagc agaataacaa caacaggaag
1260acgagcaatg gggacgactc tctcttcttc agcaactttt cactgctcgg gacccctgtg
1320ttgaaagata taaacttcaa gatcgagagg ggccagctct tggctgtggc aggctccact
1380ggagctggta aaacatctct tctcatggtg atcatggggg aactggagcc ttccgaagga
1440aaaatcaagc acagtgggag aatctcattc tgcagccagt tttcctggat catgcccggc
1500accattaagg aaaacatcat atttggagtg tcctatgatg agtaccgcta ccggtcagtc
1560atcaaagcct gtcagttgga ggaggacatc tccaagtttg cagagaaaga caacattgtg
1620cttggagagg ggggtatcac tctttctgga ggacaaagag ccaggatctc tttggcccgg
1680gcagtctaca aggatgcaga cctctacttg ttggacagtc ccttcggcta cctcgacgtg
1740ctgactgaaa aagaaatttt tgaaagctgt gtgtgcaaac tgatggcaaa caagaccagg
1800attcttgtca ccagcaagat ggaacatctg aagaaagcgg acaaaattct gattctgcat
1860gaagggagct cctacttcta tggaacattt agcgagcttc agaacctaca gccagacttc
1920tcctccaaat taatgggctg tgactccttc gaccagttct ctgcagaaag aagaaactct
1980atactcacag agaccctcca ccgcttctcc cttgagggag atgccccagt ttcttggaca
2040gaaaccaaga agcagtcctt taagcagact ggcgagtttg gtgaaaagag gaaaaattca
2100attctcaatc ctattaacag tattcgcaag ttcagcattg tccagaagac acccctccag
2160atgaatggca tcgaagaaga tagtgacgag ccgctggaga gacggctgag tctggtgcca
2220gattcagaac agggggaggc catcctgccc cggatcagcg tcatttccac aggccccaca
2280ttacaagcac ggcgccggca gagtgtttta aatctcatga cccattcagt gaaccagggc
2340caaaatatcc acaggaagac tacagcttct acccggaaag tgtctctggc ccctcaggcc
2400aatctgaccg agctggacat ctacagcagg aggctctccc aggaaacagg gcttgaaata
2460tctgaagaga ttaatgaaga ggatcttaaa gagtgcttct ttgatgacat ggagagcatc
2520cccgcggtga ccacatggaa cacctacctt agatatatta ctgtccacaa gagcctcata
2580tttgtcctca tctggtgcct ggttattttc ctcgctgagg tggcggccag tcttgttgtg
2640ctctggctgc tgggcaacac tcctctccag gacaagggca atagtacaca cagcagaaat
2700aattcttatg ccgtcatcat tacaagcacc tccagctact acgtgttcta catctatgtg
2760ggcgtggctg acaccctcct ggccatgggt ttcttccggg gcctgccttt ggtgcacacc
2820ctcatcacag tgtcaaaaat tctgcaccat aaaatgcttc attctgtcct gcaggcaccc
2880atgagcactt tgaacacatt gaaggctggc ggcatcctca acagattttc taaagatatt
2940gctatcctgg atgatctcct ccccctgaca atctttgact ttatccagct tctgctgatc
3000gtgattggag ccatagcagt ggttgctgtc ctgcagccct acatttttgt ggccaccgtg
3060cccgtgattg ttgcctttat tatgctcaga gcttacttcc tgcaaacttc tcaacagctc
3120aaacagctag aatctgaggg ccggagcccc atttttaccc acctggtgac ttccctgaag
3180ggactgtgga ctctgagagc attcgggcga cagccttact ttgagacact gttccacaag
3240gccctgaact tgcacactgc caactggttt ctttacctga gcacactccg ctggttccag
3300atgcggatag agatgatctt cgtcatcttt tttatagctg taaccttcat ttctatcctt
3360acaacaggag aaggagaggg cagggtggga atcatcctca cgctggctat gaacataatg
3420tccaccttgc agtgggccgt gaattccagt atagatgtgg attctctaat gaggagtgtc
3480tcccgggtgt ttaaattcat tgatatgcct actgagggga aacccaccaa gtcaacaaaa
3540ccttataaga atggacagct gagcaaggtg atgataattg agaacagcca cgtgaagaag
3600gatgacattt ggcccagcgg gggccagatg actgtgaagg acctgacggc caagtacacc
3660gaaggtggaa atgccatttt ggaaaacatc agcttctcaa tctctcctgg gcagagagtt
3720ggattgctgg gtcgcacggg cagcggcaaa tcaaccctgc tcagtgcctt ccttcggctc
3780ctgaatacag aaggcgaaat ccaaattgac ggggtgagct gggacagcat caccctgcag
3840cagtggagaa aagcatttgg ggtcattcca cagaaagttt tcatcttctc tggcactttc
3900agaaagaacc tggaccccta tgagcagtgg agcgaccagg agatctggaa ggttgcagat
3960gaagttggcc tgcggagtgt gatagaacaa tttcctggca agctggattt tgtgctggta
4020gatggaggct gcgtgctgtc ccacggccac aaacagctga tgtgcctcgc ccgctccgtt
4080ctttcaaagg ccaaaatctt gcttttggat gagcccagtg ctcacctcga cccagtgacc
4140tatcagataa tccgcaggac cttaaagcaa gcttttgccg actgcaccgt catactgtgt
4200gagcaccgga ttgaagcaat gctggaatgc cagcagtttc tggtgatcga ggagaataag
4260gtccggcagt acgacagcat ccagaagttg ttgaatgagc gcagcctttt ccgccaggcc
4320atctccccat ctgacagagt caagctgttt ccacatagga actcctctaa gtgcaagtcc
4380aagccccaga tcgctgccct caaggaggaa actgaggaag aggtgcagga tacccgcctg
4440tga 4443

SEQ ID NO: 29
Length: 4443
Type: DNA
Organism: Artificial Sequence
Other information: chemically synthesized oligonucleotide
Sequence 29:
atgcagagga gcccactgga gaaagcctcc gtggtgagta aactcttttt tagttggacc
60agacccatcc tgcgaaaagg atacaggcag cgcctcgagt tgtcagatat ctaccagatt
120ccttctgtgg actcagctga caatttgagt gagaagctgg agcgggagtg ggatagagag
180ctggcgagca aaaaaaaccc caagcttatc aatgctctgc gccgctgctt tttctggagg
240ttcatgtttt atgggatctt cctgtacctg ggggaggtca ccaaagctgt tcagccgctc
300cttcttggcc gcatcatcgc cagctatgac cctgataata aagaagaaag gtctattgct
360atttatctgg gaattggcct ctgcttgctc ttcatcgtcc gcacccttct gctgcaccct
420gccatttttg gccttcacca catcggcatg caaatgagaa ttgccatgtt ctccctcatt
480tacaaaaaga ccctgaaact ttcctcaaga gtgttagata aaatatccat tggtcagctg
540gtcagcctgc tgtccaacaa tcttaacaaa tttgatgaag gcttggcgct ggcccacttc
600gtgtggattg cacctctgca ggtggccctg ttgatgggac ttatatggga gctgcttcaa
660gcctctgctt tctgtgggct gggctttttg attgtactgg cactttttca ggctgggctc
720ggaagaatga tgatgaaata cagagatcag cgggccggga agatatcaga gcgacttgtg
780atcaccagtg aaatgattga aaatattcag agcgtgaaag cctactgctg ggaagaagcc
840atggagaaga tgattgagaa cctgaggcag acagagctca agctcactcg gaaggctgct
900tatgttcgct atttcaacag cagcgccttc ttcttcagtg gcttctttgt tgtcttcctg
960tctgttctgc catatgcact gataaaaggc attattttac gaaagatctt caccaccatc
1020agtttttgca tcgttctcag gatggccgtc acaagacagt tcccctgggc tgtgcagacc
1080tggtacgatt ccttgggggc catcaacaag attcaagatt tcttgcaaaa acaagaatat
1140aaaactttag aatacaacct caccaccact gaagtggtca tggaaaatgt gacagccttt
1200tgggaggagg gttttggaga attgttcgag aaggcaaagc agaataacaa caacaggaag
1260acgagcaatg gggacgactc tctcttcttc agcaactttt cactgctcgg gacccctgtg
1320ttgaaagata taaacttcaa gatcgagagg ggccagctct tggctgtggc aggctccact
1380ggagctggta aaacatctct tctcatggtg atcatggggg aactggagcc ttccgaagga
1440aaaatcaagc acagtgggag aatctcattc tgcagccagt tttcctggat catgcccggc
1500accattaagg aaaacatcat atttggagtg tcctatgatg agtaccgcta ccggtccgtc
1560atcaaagcct gtcagttgga ggaggacatc tccaagtttg cagagaaaga caacattgtg
1620cttggagagg ggggtatcac tctttctgga ggacaaagag ccaggatctc tttggcccgg
1680gcagtctaca aggatgcaga cctctacttg ttggacagtc ccttcggcta cctcgacgtg
1740ctgactgaaa aagaaatttt tgaaagctgt gtgtgcaaac tgatggcaaa caagaccagg
1800attcttgtca ccagcaagat ggaacatctg aagaaagcgg acaaaattct gattctgcat
1860gaagggagct cctacttcta tggaacattt agcgagcttc agaacctaca gccagacttc
1920tcctccaaat taatgggctg tgactccttc gaccagttct ctgcagaaag aagaaactct
1980atactcacag agaccctcca ccgcttctcc cttgagggag atgccccagt ttcttggaca
2040gaaaccaaga agcagtcctt taagcagact ggcgagtttg gtgaaaagag gaaaaattca
2100attctcaatc caattaacag tattcgcaag ttcagcattg tccagaagac acccctccag
2160atgaatggca tcgaagaaga tagtgacgag ccgctggaga gacggctgag tctggtgcca
2220gattcagaac agggggaggc catcctgccc cggatcagcg tcatttccac aggccccaca
2280ttacaagcac ggcgccggca gagtgtttta aatctcatga cccattcagt gaaccagggc
2340caaaatatcc acaggaagac tacagcttct acccggaaag tgtctctggc ccctcaggcc
2400aatctgaccg agctggacat ctacagcagg aggctctccc aggaaacagg gctggaaata
2460tctgaagaga ttaatgaaga ggatcttaaa gagtgcttct ttgatgacat ggagagcatc
2520cccgcggtga ccacatggaa cacctacctt agatatatta ctgtccacaa gagcctcata
2580tttgtcctca tctggtgcct ggttattttc ctcgctgagg tggcggccag tcttgttgtg
2640ctctggctgc tgggcaacac tcctctccag gacaagggca atagtactca cagcagaaat
2700aattcttatg ccgtcatcat tacaagcacc tccagctact acgtgttcta catctatgtg
2760ggcgtggctg acaccctcct ggccatgggt ttcttccggg gcctgccttt ggtgcacacc
2820ctcatcacag tgtcaaaaat tctgcaccat aaaatgcttc attctgtcct gcaggcaccc
2880atgagcactt tgaacacatt gaaggctggc ggcatcctca acagattttc taaagatatt
2940gctatcctgg atgatctcct ccccctgaca atctttgact ttatccagct tctgctgatc
3000gtgattggag ccatagcagt ggttgctgtc ctgcagccct acatttttgt ggccaccgtg
3060cccgtgattg ttgcctttat tatgctcaga gcttacttcc tgcaaacttc tcaacagctc
3120aaacagctag agtctgaggg ccggagcccc atttttaccc acctggtgac ttccctgaag
3180ggactgtgga ctctgagagc attcgggcga cagccttact ttgagacact gttccacaag
3240gccctgaact tgcacactgc caactggttt ctttacctga gcacactccg ctggttccag
3300atgcggatag agatgatctt cgtcatcttt tttatagctg taaccttcat ttctatcctt
3360acaacaggag aaggagaggg cagggtggga atcatcctca cgctggctat gaacataatg
3420tccaccttgc agtgggccgt gaattccagt atagatgtgg attctctaat gaggagtgtc
3480tcccgggtgt ttaaattcat tgatatgcct actgagggga aacccaccaa gtcaacaaaa
3540ccttataaga atggacagct gagcaaggtg atgataattg agaacagcca cgtgaagaag
3600gatgacattt ggcccagcgg gggccagatg actgtgaagg acctgacggc caagtacacc
3660gaaggtggaa atgccatttt ggaaaacatc agcttctcaa tctctcctgg gcagagagtt
3720ggattgctgg gtcgcacggg cagcggcaaa tcaaccctgc tcagtgcctt ccttcggctc
3780ctgaatacag aaggcgaaat ccaaattgac ggggtgagct gggacagcat caccctgcag
3840cagtggagaa aagcatttgg ggtcattcca cagaaagttt tcatcttctc tggcactttc
3900agaaagaacc tggaccccta tgagcagtgg agcgaccagg agatctggaa ggttgcagat
3960gaagttggcc tgcggagtgt gatagaacaa tttcctggca agctggattt tgtgctggta
4020gatggaggct gcgtgctgtc ccacggccac aaacagctga tgtgcctcgc ccgctccgtt
4080ctttcaaagg ccaaaatctt gcttttggat gagcccagtg ctcacctcga cccagtgacc
4140tatcagataa tccgcaggac cttaaagcaa gcttttgccg actgcaccgt catactgtgt
4200gagcaccgga ttgaagcaat gctggaatgc cagcagtttc tggtgatcga ggagaataag
4260gtccggcagt acgacagcat ccagaagttg ttgaatgagc gcagcctttt ccgccaggcc
4320atctccccat ctgacagagt caagctgttt ccacatagga actcctctaa gtgcaagtcc
4380aagccccaga tcgctgccct caaggaggaa actgaggaag aggtgcagga tacccgcctg
4440tga
4443
SEQ ID NO: 30, length 4443, DNA, Artificial Sequence, chemically synthesized oligonucleotide
30atgcagagga gcccactgga gaaagcctcc gtggtgagta aactcttttt tagttggacc
60agacccatcc tgcgaaaagg atacaggcag cgcctcgagt tgtctgatat ctaccagatt
120ccttctgtgg actcagctga caatttgagt gagaagctgg agcgggagtg ggatagagag
180ctggcgagca aaaaaaaccc caagcttatc aatgctctgc gccgctgctt tttctggagg
240ttcatgtttt atgggatctt cctgtacctg ggggaggtca ccaaagctgt tcagccgctc
300cttcttggcc gcatcatcgc cagctatgac cctgataata aagaagaaag gtctattgct
360atttatctgg gaattggcct ctgcttgctc ttcatcgtcc gcacccttct gctgcaccct
420gccatttttg gccttcacca catcggcatg caaatgagaa ttgccatgtt ctccctcatt
480tacaaaaaga ccctgaaact ttcctcaaga gtgttagata aaatatccat tggtcagctg
540gtcagcctgc tgtccaacaa tcttaacaaa tttgatgaag gcttggcgct ggcccacttc
600gtgtggattg cacctctgca ggtggccctg ttgatgggac ttatatggga gctgcttcaa
660gcctctgctt tctgtgggct gggctttttg attgtactgg cactttttca ggctgggctc
720ggaagaatga tgatgaaata cagagatcag cgggccggga agatttcaga gcgacttgtg
780atcaccagtg aaatgattga aaatattcag agcgtgaaag cctactgctg ggaagaagcc
840atggagaaga tgattgagaa cctgaggcag acagagctca agctcactcg gaaggctgct
900tatgttcgct atttcaacag cagcgccttc ttcttcagtg gcttctttgt tgtcttcctg
960tctgttctgc catatgcact gataaaaggc attattttac gaaagatctt caccaccatc
1020agtttttgca tcgttctcag gatggccgtc acaagacagt tcccctgggc tgtgcagacc
1080tggtacgatt ccttgggggc catcaacaag attcaagatt tcttgcaaaa acaagaatat
1140aaaactttag aatacaacct caccaccact gaagtggtca tggaaaatgt gacagccttt
1200tgggaggagg gttttggaga attgttcgag aaggcaaagc agaataacaa caacaggaag
1260acgagcaatg gggacgactc tctcttcttc agcaactttt cactgctcgg gacccctgtg
1320ttgaaagata taaacttcaa gatcgagagg ggccagctct tggctgtggc aggctccact
1380ggagctggta aaacatctct tctcatggtg atcatggggg aactggagcc ttccgaagga
1440aaaatcaagc acagtgggag aatctcattc tgcagccagt tttcctggat catgcccggc
1500accattaagg aaaacatcat atttggagtg tcctatgatg agtaccgcta ccggtcagtc
1560atcaaagcct gtcagttgga ggaggacatc tccaagtttg cagagaaaga caacattgtg
1620cttggagagg ggggtatcac tctttctgga ggacaaagag ccaggatctc tttggcccgg
1680gcagtctaca aggatgcaga cctctacttg ttggacagtc ccttcggcta cctcgacgtg
1740ctgactgaaa aagaaatttt tgaaagctgt gtgtgcaaac tgatggcaaa caagaccagg
1800attcttgtca ccagcaagat ggaacatctg aagaaagcgg acaaaattct gattctgcat
1860gaagggagct cctacttcta tggaacattt agcgagcttc agaacctaca gccagacttc
1920tcctccaaat taatgggctg tgactccttc gaccagttct ctgcagaaag aagaaactct
1980atactcacag agaccctcca ccgcttctcc cttgagggag atgccccagt ttcttggaca
2040gaaaccaaga agcagtcctt taagcagact ggcgagtttg gtgaaaagag gaaaaattca
2100attctcaatc ctattaacag tattcgcaag ttcagcattg tccagaagac acccctccag
2160atgaatggca tcgaagaaga tagtgacgag ccgctggaga gacggctgag tctggtgcca
2220gattcagaac agggggaggc catcctgccc cggatcagcg tcatttccac aggccccaca
2280ttacaagcac ggcgccggca gagtgtttta aatctcatga cccattcagt gaaccagggc
2340caaaatatcc acaggaagac tacagcttct acccggaaag tgtctctggc ccctcaggcc
2400aatctgaccg agctggacat ctacagcagg aggctctccc aggaaacagg gctggaaata
2460tctgaagaga ttaatgaaga ggatcttaaa gagtgcttct ttgatgacat ggagagcatc
2520cccgcggtga ccacatggaa cacctacctt agatatatta ctgtccacaa gagcctcata
2580tttgtcctca tctggtgcct ggttattttc ctcgctgagg tggcggccag tcttgttgtg
2640ctctggctgc tgggcaacac tcctctccag gacaagggca atagtacaca cagcagaaat
2700aattcttatg ccgtcatcat tacaagcacc tccagctact acgtgttcta catctatgtg
2760ggcgtggctg acaccctcct ggccatgggt ttcttccggg gcctgccttt ggtgcacacc
2820ctcatcacag tgtcaaaaat tctgcaccat aaaatgcttc attctgtcct gcaggcaccc
2880atgagcactt tgaacacatt gaaggctggc ggcatcctca acagattttc taaagatatt
2940gctatcctgg atgatctcct ccccctgaca atctttgact ttatccagct tctgctgatc
3000gtgattggag ccatagcagt ggttgctgtc ctgcagccct acatttttgt ggccaccgtg
3060cccgtgattg ttgcctttat tatgctcaga gcttacttcc tgcaaacttc tcaacagctc
3120aaacagctag aatctgaggg ccggagcccc atttttaccc acctggtgac ttccctgaag
3180ggactgtgga ctctgagagc attcgggcga cagccttact ttgagacact gttccacaag
3240gccctgaact tgcacactgc caactggttt ctttacctga gcacactccg ctggttccag
3300atgcggatag agatgatctt cgtcatcttt tttatagctg taaccttcat ttctatcctt
3360acaacaggag aaggagaggg cagggtggga atcatcctca cgctggctat gaacataatg
3420tccaccttgc agtgggccgt gaattccagt atagatgtgg attctctaat gaggagtgtc
3480tcccgggtgt ttaaattcat tgatatgcct actgagggga aacccaccaa gtcaacaaaa
3540ccttataaga atggacagct gagcaaggtg atgataattg agaacagcca cgtgaagaag
3600gatgacattt ggcccagcgg gggccagatg actgtgaagg acctgacggc caagtacacc
3660gaaggtggaa atgccatttt ggaaaacatc agcttctcaa tctctcctgg gcagagagtt
3720ggattgctgg gtcgcacggg cagcggcaaa tcaaccctgc tcagtgcctt ccttcggctc
3780ctgaatacag aaggcgaaat ccaaattgac ggggtgagct gggacagcat caccctgcag
3840cagtggagaa aagcatttgg ggtcattcca cagaaagttt tcatcttctc tggcactttc
3900agaaagaacc tggaccccta tgagcagtgg agcgaccagg agatctggaa ggttgcagat
3960gaagttggcc tgcggagtgt gatagaacaa tttcctggca agctggattt tgtgctggta
4020gatggaggct gcgtgctgtc ccacggccac aaacagctga tgtgcctcgc ccgctccgtt
4080ctttcaaagg ccaaaatctt gcttttggat gagcccagtg ctcacctcga cccagtgacc
4140tatcagataa tccgcaggac cttaaagcaa gcttttgccg actgcaccgt catactgtgt
4200gagcaccgga ttgaagcaat gctggaatgc cagcagtttc tggtgatcga ggagaataag
4260gtccggcagt acgacagcat ccagaagttg ttgaatgagc gcagcctttt ccgccaggcc
4320atctccccat ctgacagagt caagctgttt ccacatagga actcctctaa gtgcaagtcc
4380aagccccaga tcgctgccct caaggaggaa actgaggaag aggtgcagga tacccgcctg
4440tga
4443
SEQ ID NO: 31, length 4443, DNA, Artificial Sequence, chemically synthesized oligonucleotide
31atgcagagaa gccccctgga gaaggcctct gtggtgagca agctgttctt cagctggacc
60agacccatcc tgagaaaggg ctacagacag agactggagc tgtctgacat ctaccagatc
120ccctctgtgg actctgccga caacctgtct gagaagctgg agagagagtg ggacagagag
180ctggccagca agaagaaccc caagctgatc aatgccctga gaagatgctt cttctggaga
240ttcatgttct atggcatctt cctgtacctg ggagaggtga ccaaggccgt gcagcccctg
300ctgctgggca ggatcattgc cagctatgac cctgacaaca aggaggagag aagcattgcc
360atctacctgg gcattggcct gtgcctgctg ttcattgtga gaaccctgct gctgcaccct
420gccatctttg gcctgcacca cattggcatg cagatgagaa ttgccatgtt cagcctgatc
480tacaagaaga ccctgaagct gagcagcaga gtgctggaca agatcagcat tggccagctg
540gtgagcctgc tgagcaacaa cctgaacaag tttgatgagg gcctggccct ggcccacttt
600gtgtggattg cccccctgca ggtggccctg ctgatgggcc tgatctggga gctgctgcag
660gcctctgcct tctgtggcct gggcttcctg attgtgctgg ccctgttcca ggccggcctg
720ggcagaatga tgatgaagta cagagaccag agagccggca agatctctga gagactggtg
780atcacctctg agatgattga gaacatccag tctgtgaagg cctactgctg ggaggaggcc
840atggagaaga tgattgagaa cctgagacag acagagctga agctgaccag gaaggccgcc
900tatgtgagat acttcaacag ctctgccttc ttcttctctg gcttctttgt ggtgttcctg
960tctgtgctgc cctatgccct gatcaagggc atcatcctga ggaagatctt caccaccatc
1020agcttctgca ttgtgctgag gatggccgtg accaggcagt tcccctgggc cgtgcagacc
1080tggtatgaca gcctgggggc catcaacaag atccaggact tcctgcagaa gcaggagtac
1140aagaccctgg agtacaacct gaccaccaca gaggtggtga tggagaatgt gacagccttc
1200tgggaggagg gctttggaga gctgtttgag aaggccaagc agaacaacaa caacagaaag
1260accagcaatg gagatgacag cctgttcttc agcaacttca gcctgctggg cacccctgtg
1320ctgaaggaca tcaacttcaa gattgagagg ggccagctgc tggccgtggc cggcagcaca
1380ggagccggca agaccagcct gctgatggtg atcatgggag agctggagcc ctctgagggc
1440aagatcaagc actctggcag aatcagcttc tgcagccagt tcagctggat catgcctggc
1500accatcaagg agaacatcat ctttggggtg agctatgatg agtacaggta cagatctgtg
1560atcaaggcct gccagctgga ggaggacatc tccaagtttg ccgagaagga caacattgtg
1620ctgggggagg gaggcatcac cctgtctggg ggccagagag ccagaatcag cctggccaga
1680gccgtgtaca aggatgccga cctgtacctg ctggacagcc cctttggcta cctggatgtg
1740ctgacagaga aggagatctt tgagagctgt gtgtgcaagc tgatggccaa caagaccagg
1800atcctggtga ccagcaagat ggagcacctg aagaaggccg acaagatcct gatcctgcat
1860gagggcagca gctacttcta tggcaccttc tctgagctgc agaacctgca gcctgacttc
1920agcagcaagc tgatgggctg tgacagcttt gaccagttct ctgctgagag aagaaacagc
1980atcctgacag agaccctgca caggttcagc ctggaggggg atgcccctgt gagctggaca
2040gagaccaaga agcagagctt caagcagaca ggagagtttg gggagaagag gaagaacagc
2100atcctgaacc ccatcaacag catcaggaag ttcagcattg tgcagaagac ccccctgcag
2160atgaatggca ttgaggagga ctctgatgag cccctggaga gaagactgag cctggtgcca
2220gactctgagc agggagaggc catcctgccc aggatctctg tgatcagcac aggccccacc
2280ctgcaggcca gaagaagaca gtctgtgctg aacctgatga cccactctgt gaaccagggc
2340cagaatatcc acagaaagac cacagccagc accagaaagg tgagcctggc cccccaggcc
2400aacctgacag agctggacat ctacagcaga aggctgagcc aggagacagg cctggagatc
2460tctgaggaga tcaatgagga ggacctgaag gagtgcttct ttgatgacat ggagagcatc
2520cctgccgtga ccacctggaa cacctacctg agatacatca cagtgcacaa gagcctgatc
2580tttgtgctga tctggtgcct ggtgatcttc ctggccgagg tggccgccag cctggtggtg
2640ctgtggctgc tgggcaacac ccccctgcag gacaagggca acagcaccca cagcagaaac
2700aacagctatg ctgtgatcat caccagcacc agcagctact atgtgttcta catctatgtg
2760ggagtggctg acaccctgct ggccatgggc ttcttcagag gcctgcccct ggtgcacacc
2820ctgatcacag tgagcaagat cctgcaccac aagatgctgc actctgtgct gcaggccccc
2880atgagcaccc tgaacaccct gaaggctgga ggcatcctga acagattcag caaggacatt
2940gccatcctgg atgacctgct gcccctgacc atctttgact tcatccagct gctgctgatt
3000gtgattggag ccattgccgt ggtggccgtg ctgcagccct acatctttgt ggccacagtg
3060cctgtgattg tggccttcat catgctgagg gcctacttcc tgcagaccag ccagcagctg
3120aagcagctgg agtctgaggg cagaagcccc atcttcaccc acctggtgac cagcctgaag
3180ggcctgtgga ccctgagggc ctttggcaga cagccctact ttgagaccct gttccacaag
3240gccctgaacc tgcacacagc caactggttc ctgtacctga gcaccctgag atggttccag
3300atgaggattg agatgatctt tgtgatcttc ttcattgccg tgaccttcat cagcatcctg
3360accacagggg agggcgaggg cagagtgggc atcatcctga ccctggccat gaacatcatg
3420agcaccctgc agtgggccgt gaacagcagc attgatgtgg acagcctgat gagatctgtg
3480agcagagtgt tcaagttcat tgacatgccc acagagggca agcccaccaa gagcaccaag
3540ccctacaaga atggccagct gagcaaggtg atgatcattg agaacagcca tgtgaagaag
3600gatgacatct ggccctctgg aggccagatg acagtgaagg acctgacagc caagtacaca
3660gaggggggca atgccatcct ggagaacatc agcttcagca tcagccctgg ccagagggtg
3720ggcctgctgg gcagaacagg ctctggcaag agcaccctgc tgtctgcctt cctgaggctg
3780ctgaacacag agggagagat ccagattgat ggggtgagct gggacagcat caccctgcag
3840cagtggagga aggcctttgg ggtgatcccc cagaaggtgt tcatcttctc tggcaccttc
3900aggaagaacc tggaccccta tgagcagtgg tctgaccagg agatctggaa ggtggccgat
3960gaggtgggcc tgagatctgt gattgagcag ttccctggca agctggactt tgtgctggtg
4020gatggaggct gtgtgctgag ccatggccac aagcagctga tgtgcctggc cagatctgtg
4080ctgagcaagg ccaagatcct gctgctggat gagccctctg cccacctgga ccctgtgacc
4140taccagatca tcagaagaac cctgaagcag gcctttgccg actgcacagt gatcctgtgt
4200gagcacagaa ttgaggccat gctggagtgc cagcagttcc tggtgattga ggagaacaag
4260gtgaggcagt atgacagcat ccagaagctg ctgaatgaga gaagcctgtt cagacaggcc
4320atcagcccct ctgacagagt gaagctgttc ccccacagga acagcagcaa gtgcaagagc
4380aagccccaga ttgccgccct gaaggaggag acagaggagg aggtgcagga caccagactg
4440tga
4443
SEQ ID NO: 32, length 4443, DNA, Artificial Sequence, chemically synthesized oligonucleotide
32atgcagagga gccccctgga gaaggccagc gtggtgagca agctgttctt cagctggacc
60aggcccatcc tgaggaaggg ctacaggcag aggctggagc tgagcgacat ctaccagatc
120cccagcgtgg acagcgccga caacctgagc gagaagctgg agagggagtg ggacagggag
180ctggccagca agaagaaccc caagctgatc aacgccctga ggaggtgctt cttctggagg
240ttcatgttct acggcatctt cctgtacctg ggcgaggtga ccaaggccgt gcagcccctg
300ctgctgggca ggatcatcgc cagctacgac cccgacaaca aggaggagag gagcatcgcc
360atctacctgg gcatcggcct gtgcctgctg ttcatcgtga ggaccctgct gctgcacccc
420gccatcttcg gcctgcacca catcggcatg cagatgagga tcgccatgtt cagcctgatc
480tacaagaaga ccctgaagct gagcagcagg gtgctggaca agatcagcat cggccagctg
540gtgagcctgc tgagcaacaa cctgaacaag ttcgacgagg gcctggccct ggcccacttc
600gtgtggatcg cccccctgca ggtggccctg ctgatgggcc tgatctggga gctgctgcag
660gccagcgcct tctgcggcct gggcttcctg atcgtgctgg ccctgttcca ggccggcctg
720ggcaggatga tgatgaagta cagggaccag agggccggca agatcagcga gaggctggtg
780atcaccagcg agatgatcga gaacatccag agcgtgaagg cctactgctg ggaggaggcc
840atggagaaga tgatcgagaa cctgaggcag accgagctga agctgaccag gaaggccgcc
900tacgtgaggt acttcaacag cagcgccttc ttcttcagcg gcttcttcgt ggtgttcctg
960agcgtgctgc cctacgccct gatcaagggc atcatcctga ggaagatctt caccaccatc
1020agcttctgca tcgtgctgag gatggccgtg accaggcagt tcccctgggc cgtgcagacc
1080tggtacgaca gcctgggcgc catcaacaag atccaggact tcctgcagaa gcaggagtac
1140aagaccctgg agtacaacct gaccaccacc gaggtggtga tggagaacgt gaccgccttc
1200tgggaggagg gcttcggcga gctgttcgag aaggccaagc agaacaacaa caacaggaag
1260accagcaacg gcgacgacag cctgttcttc agcaacttca gcctgctggg cacccccgtg
1320ctgaaggaca tcaacttcaa gatcgagagg ggccagctgc tggccgtggc cggcagcacc
1380ggcgccggca agaccagcct gctgatggtg atcatgggcg agctggagcc cagcgagggc
1440aagatcaagc acagcggcag gatcagcttc tgcagccagt tcagctggat catgcccggc
1500accatcaagg agaacatcat cttcggcgtg agctacgacg agtacaggta caggagcgtg
1560atcaaggcct gccagctgga ggaggacatc agcaagttcg ccgagaagga caacatcgtg
1620ctgggcgagg gcggcatcac cctgagcggc ggccagaggg ccaggatcag cctggccagg
1680gccgtgtaca aggacgccga cctgtacctg ctggacagcc ccttcggcta cctggacgtg
1740ctgaccgaga aggagatctt cgagagctgc gtgtgcaagc tgatggccaa caagaccagg
1800atcctggtga ccagcaagat ggagcacctg aagaaggccg acaagatcct gatcctgcac
1860gagggcagca gctacttcta cggcaccttc agcgagctgc agaacctgca gcccgacttc
1920agcagcaagc tgatgggctg cgacagcttc gaccagttca gcgccgagag gaggaacagc
1980atcctgaccg agaccctgca caggttcagc ctggagggcg acgcccccgt gagctggacc
2040gagaccaaga agcagagctt caagcagacc ggcgagttcg gcgagaagag gaagaacagc
2100atcctgaacc ccatcaacag catcaggaag ttcagcatcg tgcagaagac ccccctgcag
2160atgaacggca tcgaggagga cagcgacgag cccctggaga ggaggctgag cctggtgccc
2220gacagcgagc agggcgaggc catcctgccc aggatcagcg tgatcagcac cggccccacc
2280ctgcaggcca ggaggaggca gagcgtgctg aacctgatga cccacagcgt gaaccagggc
2340cagaacatcc acaggaagac caccgccagc accaggaagg tgagcctggc cccccaggcc
2400aacctgaccg agctggacat ctacagcagg aggctgagcc aggagaccgg cctggagatc
2460agcgaggaga tcaacgagga ggacctgaag gagtgcttct tcgacgacat ggagagcatc
2520cccgccgtga ccacctggaa cacctacctg aggtacatca ccgtgcacaa gagcctgatc
2580ttcgtgctga tctggtgcct ggtgatcttc ctggccgagg tggccgccag cctggtggtg
2640ctgtggctgc tgggcaacac ccccctgcag gacaagggca acagcaccca cagcaggaac
2700aacagctacg ccgtgatcat caccagcacc agcagctact acgtgttcta catctacgtg
2760ggcgtggccg acaccctgct ggccatgggc ttcttcaggg gcctgcccct ggtgcacacc
2820ctgatcaccg tgagcaagat cctgcaccac aagatgctgc acagcgtgct gcaggccccc
2880atgagcaccc tgaacaccct gaaggccggc ggcatcctga acaggttcag caaggacatc
2940gccatcctgg acgacctgct gcccctgacc atcttcgact tcatccagct gctgctgatc
3000gtgatcggcg ccatcgccgt ggtggccgtg ctgcagccct acatcttcgt ggccaccgtg
3060cccgtgatcg tggccttcat catgctgagg gcctacttcc tgcagaccag ccagcagctg
3120aagcagctgg agagcgaggg caggagcccc atcttcaccc acctggtgac cagcctgaag
3180ggcctgtgga ccctgagggc cttcggcagg cagccctact tcgagaccct gttccacaag
3240gccctgaacc tgcacaccgc caactggttc ctgtacctga gcaccctgag gtggttccag
3300atgaggatcg agatgatctt cgtgatcttc ttcatcgccg tgaccttcat cagcatcctg
3360accaccggcg agggcgaggg cagggtgggc atcatcctga ccctggccat gaacatcatg
3420agcaccctgc agtgggccgt gaacagcagc atcgacgtgg acagcctgat gaggagcgtg
3480agcagggtgt tcaagttcat cgacatgccc accgagggca agcccaccaa gagcaccaag
3540ccctacaaga acggccagct gagcaaggtg atgatcatcg agaacagcca cgtgaagaag
3600gacgacatct ggcccagcgg cggccagatg accgtgaagg acctgaccgc caagtacacc
3660gagggcggca acgccatcct ggagaacatc agcttcagca tcagccccgg ccagagggtg
3720ggcctgctgg gcaggaccgg cagcggcaag agcaccctgc tgagcgcctt cctgaggctg
3780ctgaacaccg agggcgagat ccagatcgac ggcgtgagct gggacagcat caccctgcag
3840cagtggagga aggccttcgg cgtgatcccc cagaaggtgt tcatcttcag cggcaccttc
3900aggaagaacc tggaccccta cgagcagtgg agcgaccagg agatctggaa ggtggccgac
3960gaggtgggcc tgaggagcgt gatcgagcag ttccccggca agctggactt cgtgctggtg
4020gacggcggct gcgtgctgag ccacggccac aagcagctga tgtgcctggc caggagcgtg
4080ctgagcaagg ccaagatcct gctgctggac gagcccagcg cccacctgga ccccgtgacc
4140taccagatca tcaggaggac cctgaagcag gccttcgccg actgcaccgt gatcctgtgc
4200gagcacagga tcgaggccat gctggagtgc cagcagttcc tggtgatcga ggagaacaag
4260gtgaggcagt acgacagcat ccagaagctg ctgaacgaga ggagcctgtt caggcaggcc
4320atcagcccca gcgacagggt gaagctgttc ccccacagga acagcagcaa gtgcaagagc
4380aagccccaga tcgccgccct gaaggaggag accgaggagg aggtgcagga caccaggctg
4440tga
4443
SEQ ID NO: 33, length 4443, DNA, Artificial Sequence, chemically synthesized oligonucleotide
33atgcagagat cccctctgga gaaggcctca gtggtgtcca agcttttctt ctcctggacc
60aggcccattt taagaaaggg ctacaggcag agacttgagc tgtctgacat ctatcagatc
120ccttctgtgg attctgctga caatcttagt gaaaaattgg aaagggagtg ggacagagag
180ctggcaagta aaaagaaccc caagctgatt aatgccctga ggcgctgctt tttttggaga
240ttcatgttct atggcatatt cctctacctt ggagaagtaa ccaaagctgt acagcctctc
300ctccttggca gaatcattgc ctcctatgat cctgataaca aggaggagag aagcatagcc
360atctacctgg gcattgggct gtgcctcttg tttattgtga ggacccttct cttgcaccct
420gccatctttg gccttcatca cattggcatg caaatgagaa tagcaatgtt tagtcttatt
480tacaaaaaaa cattaaaact ctcttccagg gtgttggaca agatcagtat tggacaactg
540gtcagcctgc tgagcaacaa cctgaacaag tttgatgaag gactggccct ggcccacttt
600gtctggattg ccccccttca ggtggctctt ttgatgggcc tgatctggga actcctgcag
660gcctctgcct tctgtgggtt aggcttcctg atagtgctag ctctctttca ggcagggttg
720ggtagaatga tgatgaagta cagagaccag agggctggga agatatctga gaggctggtc
780attacttctg aaatgataga aaacatccag tctgttaaag cttactgctg ggaggaggct
840atggaaaaga tgattgagaa cttgaggcaa acagagctca agctgactag gaaggcagcc
900tatgtcaggt atttcaacag cagtgctttc ttcttctcag gctttttcgt ggtcttcttg
960agtgttctgc cctatgccct catcaagggg ataattttga gaaagatttt caccactatt
1020tccttttgca ttgtcctgag gatggctgtc accaggcaat tcccctgggc tgtgcagaca
1080tggtatgact ctctgggggc catcaacaaa atccaagatt tcctgcagaa gcaggagtac
1140aagaccctgg aatacaacct caccaccaca gaagttgtga tggagaatgt gactgcattc
1200tgggaggaag gatttgggga gctgtttgag aaagcaaaac aaaacaataa taacaggaaa
1260accagcaatg gagatgactc cctgttcttt tccaacttct ctttgttggg cacccctgtc
1320ctgaaagata taaactttaa aattgaaaga gggcagctgt tggcagttgc tggctccaca
1380ggagctggaa aaacttcact actgatggtg atcatggggg agttagaacc ctctgaaggg
1440aaaataaaac attctgggag gattagtttc tgcagccagt tcagctggat catgcctggg
1500accattaaag aaaatattat atttggagtg agctatgatg aatatagata taggagtgtc
1560atcaaagcct gtcagttgga ggaagacatc agcaaatttg cagagaaaga caacattgtt
1620ctgggtgaag gtggcatcac cctgtcagga gggcaaaggg ccaggatcag cttggccaga
1680gcagtctata aagatgctga tctgtacctc ctggatagcc cttttggcta tctggatgtt
1740ttgacagaga aggaaatttt tgagtcctgt gtctgcaagt taatggcaaa taaaacaagg
1800atacttgtga cctcaaaaat ggaacacctg aagaaggctg acaaaattct gatcctgcat
1860gagggcagca gctactttta tggaacattt tctgaactgc agaatttgca accagacttt
1920tcatcaaagc tcatgggatg tgacagtttt gatcagtttt ctgcagaaag gagaaactcc
1980attttgactg agaccctgca caggttcagt ctggaggggg atgccccagt gagttggact
2040gagacaaaga aacagagctt caagcagact ggagagtttg gagaaaagag gaaaaactca
2100attctcaatc ccatcaatag catcaggaag ttcagcatag ttcagaagac tcctttgcag
2160atgaatggga ttgaagagga ctcagatgag cccctggaaa ggagactctc cttggtgcca
2220gattcagagc agggggaagc catactgcca aggatctctg tgatttctac agggcccacc
2280ctccaagcaa gaaggagaca gtcagtttta aacctgatga cccactctgt caaccaggga
2340cagaacattc atagaaagac aacagcatct acaagaaaag tttcactggc ccctcaagcc
2400aatttaactg aactagatat ctacagcagg aggctcagcc aagaaacagg cctggagatc
2460tcagaagaaa taaatgagga ggatttgaag gaatgcttct ttgatgatat ggagagcatc
2520ccagctgtca caacctggaa cacctacctg agatacatca cagtgcacaa atccctcatc
2580tttgtactta tatggtgcct tgtcatcttc ttagctgagg tggctgcttc cctggtggtg
2640ctgtggctgc tgggaaacac acccctccag gataaaggga actctactca cagcaggaac
2700aacagttatg ctgtgatcat caccagtacc tcctcctact atgtgttcta catttatgtt
2760ggagttgcag acacattgct tgccatgggt ttttttagag gactccccct ggtgcatact
2820ctcatcactg tttccaaaat ccttcaccac aagatgctgc acagtgtact acaggctccc
2880atgagcaccc tcaacactct taaagcagga ggaatcttga acagatttag caaggacatt
2940gcaattcttg atgacctgct tccactgacc atctttgact tcatccagct tctgctcatt
3000gtaattggtg ccattgctgt ggtagcagtg ctccagccat atatttttgt ggccactgtg
3060cctgttattg tggccttcat tatgttgaga gcctacttcc tgcagacctc tcagcagctc
3120aagcaacttg aaagtgaggg caggagcccc atatttacac acttggtcac ttccctcaaa
3180ggcctctgga cactcagagc ttttggaaga caaccttatt ttgaaactct cttccacaag
3240gctctgaatc tccacacagc caactggttt ctgtatcttt caacactgcg ctggttccag
3300atgaggattg agatgatctt tgttatcttc ttcatagctg ttaccttcat ctctattctg
3360acaactggtg agggggaagg gagagtaggc atcatcctca cactagccat gaacataatg
3420tctaccttac aatgggccgt gaacagctcc atagatgtgg acagcctcat gagaagtgtg
3480tcaagagttt tcaaattcat tgacatgccc acagaaggca aaccaaccaa gagcacaaaa
3540ccctacaaga atggccagct gagtaaggtc atgatcattg aaaattctca tgtgaagaag
3600gatgatattt ggcccagtgg gggccagatg acagtcaagg acctcactgc caaatacaca
3660gagggtggaa atgctatcct agagaacatc tccttctcca tctccccagg ccaaagagtt
3720ggcttgctgg gcaggactgg cagtggcaag tccaccttgc tctcagcatt tctcaggctt
3780ttaaatacag agggagagat tcaaattgat ggggtgtctt gggatagtat aacacttcaa
3840cagtggagga aagcctttgg tgtgattcct cagaaagtgt ttatcttctc tggcactttc
3900agaaaaaatc tggaccccta tgaacagtgg agtgaccagg aaatctggaa ggtggcagat
3960gaagtgggcc taagatcagt catagagcag tttcctggaa agttggattt tgtgcttgta
4020gatggaggct gtgtgctgtc ccatggccat aaacagctaa tgtgcctggc taggtcagtg
4080ctgagcaagg ccaagatcct gctgttagat gagccttcag cccatctgga ccctgtgaca
4140taccagatta tcagaagaac tctgaagcag gcctttgctg actgcactgt catcctgtgt
4200gagcacagaa ttgaggccat gctggagtgc cagcagttcc ttgttataga agagaataag
4260gttaggcagt atgacagcat tcagaaactg ctaaatgaaa gatctctctt caggcaagct
4320atttcaccat ctgatagagt gaaacttttt ccccacagaa attcctctaa atgtaaatct
4380aagccccaga tagctgcctt gaaagaggag actgaagaag aagtccagga caccagactg
4440tga
4443
SEQ ID NO: 34, length 4443, DNA, Artificial Sequence, chemically synthesized oligonucleotide
34atgcagagat ccccgctgga gaaggcatct gtggtgtcaa aactgttctt tagctggaca
60aggcccatcc ttaggaaagg gtacagacag aggttggagc tgtcagacat atatcagatc
120ccttcagtgg actctgcaga caacctctct gaaaagctgg agagggaatg ggacagggaa
180ctggccagca aaaaaaaccc taaactgatt aatgccctga ggaggtgctt cttttggaga
240ttcatgttct atgggatctt cctttacctg ggggaggtga ctaaagctgt tcagcctctt
300cttctgggga ggattattgc ctcctatgac ccagacaaca aagaagaaag aagcatagcc
360atttacttag gcataggcct ctgcttgctc ttcatagtta gaaccctcct actccaccca
420gccatctttg gtctccacca cataggtatg cagatgagaa tagcaatgtt ctccttgatc
480tacaagaaga ccctcaagct gtccagcagg gtgctggaca agatctccat aggccagtta
540gtcagtctac tgtccaataa cttaaataag tttgatgagg gactggcact ggcacatttt
600gtgtggattg cccccctcca agtggccctt cttatgggcc ttatctggga gctgttgcag
660gcctctgctt tctgtggcct gggtttcctc atagtcctag ccttattcca ggctggactg
720ggcagaatga tgatgaagta tagggaccaa agagcaggga agatttctga aaggctggtt
780ataacttctg agatgattga gaacattcag tcagtgaaag cttactgctg ggaagaagct
840atggaaaaaa tgattgaaaa tctcagacag actgaattaa agttgaccag gaaagctgct
900tatgtcagat acttcaactc ctcagccttc tttttttctg gcttctttgt tgtattcctt
960tcagtcctcc cctatgccct gattaagggc attatcttga ggaaaatttt cacaaccatc
1020tccttttgta ttgtcctcag gatggctgtt acaaggcaat ttccttgggc tgtgcaaact
1080tggtatgata gccttggagc aatcaacaag atccaggatt tcctgcaaaa gcaggagtac
1140aagacattgg aatacaacct taccaccact gaggtggtga tggaaaatgt gactgccttc
1200tgggaggagg ggtttggaga gctgtttgag aaagccaaac agaacaacaa caatagaaag
1260acctctaatg gtgatgattc cctgttcttt tctaacttta gtcttctggg gaccccagtt
1320ctgaaagata ttaactttaa aattgaaagg ggacagttgc tggctgtggc tgggtccact
1380ggggctggga agacaagcct gctcatggtg atcatgggag agctggaacc cagtgaagga
1440aagatcaaac actcaggcag gatctccttc tgcagccagt tctcatggat tatgccaggc
1500actattaaag aaaatatcat ctttggtgta agctatgatg agtacaggta tagatctgta
1560attaaagcct gccagctgga ggaagacatc tctaagtttg ctgagaagga taacattgtg
1620ttgggggaag ggggcatcac cctttctggt gggcagaggg ctaggatctc ccttgctagg
1680gcagtataca aggatgctga cttgtacctc ttggatagtc cttttggcta cctagatgtg
1740ctgacagaga aagaaatatt tgaaagctgt gtgtgtaagc tcatggctaa caagaccagg
1800atcctggtca ccagtaaaat ggaacacctc aaaaaagcag acaagatcct tattctccat
1860gagggctcct cctacttcta tgggaccttc agtgagctgc agaatctgca gccagacttc
1920tcctcaaaac ttatgggctg tgactccttt gaccaattct ctgcagaaag aaggaatagc
1980atactgacag aaacactgca tagattctcc ctggaaggag atgccccagt gagttggaca
2040gaaaccaaaa agcagagctt caagcagact ggtgagtttg gtgaaaagag gaagaattct
2100atcctgaacc ccatcaatag catcaggaaa tttagcatag tccaaaagac ccccctccag
2160atgaatggaa tagaggagga tagtgatgag cctcttgaga gaaggctgtc cctggttcca
2220gacagtgaac agggtgaagc cattcttccg aggatcagtg tcatctccac tgggcccaca
2280ttgcaggcca gaagaagaca gtctgttctg aatttgatga cacattctgt gaatcaaggc
2340cagaatatcc atagaaaaac cactgccagc accagaaaag tttctctagc cccccaggct
2400aacctgactg agttagacat ctacagcaga aggctgagcc aagagactgg cttggaaata
2460tctgaggaga tcaatgagga ggacctcaag gagtgcttct ttgatgacat ggagtcaatc
2520cctgcagtca ctacatggaa cacttaccta aggtacatca cagttcataa gagcctcatc
2580tttgtcctca tatggtgtct ggtcatcttt ttagcagaag tggctgccag cctagttgtg
2640ctgtggttac tgggcaatac acctcttcag gacaaaggca atagcacaca cagcagaaac
2700aactcctatg cagtgatcat cacctctaca agctcttact atgtattcta tatatatgtg
2760ggagtggcag atactctcct ggccatggga ttcttcaggg gattacctct agttcacaca
2820ttgatcacag tgtcaaaaat tctccaccac aagatgttac acagtgtcct gcaagcccca
2880atgtctactc tgaacacact taaggcaggt ggaattttga ataggtttag caaggacata
2940gctatcctgg atgatctcct ccctctgacc atctttgact tcatccagtt actgctcatt
3000gtaattggag ccattgcagt ggtagcagtc ctacagcctt acatttttgt ggctactgtt
3060cctgttattg tggccttcat tatgctaaga gcttacttcc tgcaaacaag ccaacagttg
3120aaacagctag aaagtgaggg aaggtccccc atcttcaccc acctggtgac atcactcaag
3180gggctatgga ctcttagggc ttttgggaga cagccgtact ttgagacctt attccataag
3240gcccttaacc tccatacagc aaactggttc ttatacctga gtactctgag gtggtttcaa
3300atgaggattg aaatgatttt tgtgatcttc ttcattgctg tgaccttcat ctcaatcttg
3360accacaggag agggggaggg cagggtgggc atcatactga ccttggccat gaacattatg
3420tcaaccctgc agtgggctgt caatagctcc attgatgtgg acagtctgat gaggagtgtc
3480tccagggtct tcaagtttat tgacatgcca actgagggca aacccaccaa aagcactaag
3540ccatataaaa atggccaact gtccaaagtg atgatcattg aaaattcaca tgtaaagaag
3600gatgatatct ggccctctgg aggacagatg acagtgaaag acctgactgc caagtacaca
3660gagggtggta atgccattct tgagaacatt agtttcagta tttccccggg gcaaagggtg
3720ggcctccttg gcagaacagg ctctggcaag agtaccctgc tgtcagcctt tttaagactg
3780ttgaacactg agggagaaat tcagattgat ggtgtctcct gggatagcat caccctccag
3840cagtggagaa aagcttttgg agtgatcccg caaaaggttt tcatcttttc aggcaccttc
3900cggaagaacc tggaccccta tgagcagtgg tctgaccagg aaatatggaa ggtagctgat
3960gaagttgggc ttaggtcagt catagagcag ttcccaggca aactggactt tgtcctggtg
4020gatggtggat gtgtactgag tcatgggcac aaacagctga tgtgcctagc caggtctgtg
4080ctcagcaagg caaagatatt gctgcttgat gaacccagtg cccatctgga cccagtcaca
4140tatcagatca tcagaagaac attgaagcag gcctttgctg attgcacagt tatcctctgt
4200gagcacagga ttgaggccat gctggagtgc cagcagtttc tggtgattga ggagaataaa
4260gtaaggcagt atgactccat ccagaagctg ctcaatgaaa gaagcctctt tagacaagct
4320atctccccct cagacagggt caaattgttc cctcacagaa acagcagcaa gtgcaagagc
4380aagccccaaa ttgcagcctt gaaagaggag acagaggaag aggtgcagga caccagactc
4440tga
4443
SEQ ID NO: 35, length 4443, DNA, Artificial Sequence, chemically synthesized oligonucleotide
35atgcagagaa gccccctgga gaaggccagc gtggtgagca agctgttctt cagctggacc
60agacccatcc tgagaaaggg ctacagacag agactggagc tgagcgacat ctaccagatc
120cccagcgtgg acagcgccga caacctgagc gagaagctgg agagagagtg ggacagagag
180ctggccagca agaagaaccc caagctgatc aacgccctga gaagatgctt cttctggaga
240ttcatgttct acggcatctt cctgtacctg ggcgaggtga ccaaggccgt gcagcccctg
300ctgctgggca gaatcatcgc cagctacgac cccgacaaca aggaggagag aagcatcgcc
360atctacctgg gcatcggcct gtgcctgctg ttcatcgtga gaaccctgct gctgcacccc
420gccatcttcg gcctgcacca catcggcatg cagatgagaa tcgccatgtt cagcctgatc
480tacaagaaga ccctgaagct gagcagcaga gtgctggaca agatcagcat cggccagctg
540gtgagcctgc tgagcaacaa cctgaacaag ttcgacgagg gcctggccct ggcccacttc
600gtgtggatcg cccccctgca ggtggccctg ctgatgggcc tgatctggga gctgctgcag
660gccagcgcct tctgcggcct gggcttcctg atcgtgctgg ccctgttcca ggccggcctg
720ggcagaatga tgatgaagta cagagaccag agagccggca agatcagcga gagactggtg
780atcaccagcg agatgatcga gaacatccag agcgtgaagg cctactgctg ggaggaggcc
840atggagaaga tgatcgagaa cctgagacag accgagctga agctgaccag aaaggccgcc
900tacgtgagat acttcaacag cagcgccttc ttcttcagcg gcttcttcgt ggtgttcctg
960agcgtgctgc cctacgccct gatcaagggc atcatcctga gaaagatctt caccaccatc
1020agcttctgca tcgtgctgag aatggccgtg accagacagt tcccctgggc cgtgcagacc
1080tggtacgaca gcctgggcgc catcaacaag atccaggact tcctgcagaa gcaggagtac
1140aagaccctgg agtacaacct gaccaccacc gaggtggtga tggagaacgt gaccgccttc
1200tgggaggagg gcttcggcga gctgttcgag aaggccaagc agaacaacaa caacagaaag
1260accagcaacg gcgacgacag cctgttcttc agcaacttca gcctgctggg cacccccgtg
1320ctgaaggaca tcaacttcaa gatcgagaga ggccagctgc tggccgtggc cggcagcacc
1380ggcgccggca agaccagcct gctgatggtg atcatgggcg agctggagcc cagcgagggc
1440aagatcaagc acagcggcag aatcagcttc tgcagccagt tcagctggat catgcccggc
1500accatcaagg agaacatcat cttcggcgtg agctacgacg agtacagata cagaagcgtg
1560atcaaggcct gccagctgga ggaggacatc agcaagttcg ccgagaagga caacatcgtg
1620ctgggcgagg gcggcatcac cctgagcggc ggccagagag ccagaatcag cctggccaga
1680gccgtgtaca aggacgccga cctgtacctg ctggacagcc ccttcggcta cctggacgtg
1740ctgaccgaga aggagatctt cgagagctgc gtgtgcaagc tgatggccaa caagaccaga
1800atcctggtga ccagcaagat ggagcacctg aagaaggccg acaagatcct gatcctgcac
1860gagggcagca gctacttcta cggcaccttc agcgagctgc agaacctgca gcccgacttc
1920agcagcaagc tgatgggctg cgacagcttc gaccagttca gcgccgagag aagaaacagc
1980atcctgaccg agaccctgca cagattcagc ctggagggcg acgcccccgt gagctggacc
2040gagaccaaga agcagagctt caagcagacc ggcgagttcg gcgagaagag aaagaacagc
2100atcctgaacc ccatcaacag catcagaaag ttcagcatcg tgcagaagac ccccctgcag
2160atgaacggca tcgaggagga cagcgacgag cccctggaga gaagactgag cctggtgccc
2220gacagcgagc agggcgaggc catcctgccc agaatcagcg tgatcagcac cggccccacc
2280ctgcaggcca gaagaagaca gagcgtgctg aacctgatga cccacagcgt gaaccagggc
2340cagaacatcc acagaaagac caccgccagc accagaaagg tgagcctggc cccccaggcc
2400aacctgaccg agctggacat ctacagcaga agactgagcc aggagaccgg cctggagatc
2460agcgaggaga tcaacgagga ggacctgaag gagtgcttct tcgacgacat ggagagcatc
2520cccgccgtga ccacctggaa cacctacctg agatacatca ccgtgcacaa gagcctgatc
2580ttcgtgctga tctggtgcct ggtgatcttc ctggccgagg tggccgccag cctggtggtg
2640ctgtggctgc tgggcaacac ccccctgcag gacaagggca acagcaccca cagcagaaac
2700aacagctacg ccgtgatcat caccagcacc agcagctact acgtgttcta catctacgtg
2760ggcgtggccg acaccctgct ggccatgggc ttcttcagag gcctgcccct ggtgcacacc
2820ctgatcaccg tgagcaagat cctgcaccac aagatgctgc acagcgtgct gcaggccccc
2880atgagcaccc tgaacaccct gaaggccggc ggcatcctga acagattcag caaggacatc
2940gccatcctgg acgacctgct gcccctgacc atcttcgact tcatccagct gctgctgatc
3000gtgatcggcg ccatcgccgt ggtggccgtg ctgcagccct acatcttcgt ggccaccgtg
3060cccgtgatcg tggccttcat catgctgaga gcctacttcc tgcagaccag ccagcagctg
3120aagcagctgg agagcgaggg cagaagcccc atcttcaccc acctggtgac cagcctgaag
3180ggcctgtgga ccctgagagc cttcggcaga cagccctact tcgagaccct gttccacaag
3240gccctgaacc tgcacaccgc caactggttc ctgtacctga gcaccctgag atggttccag
3300atgagaatcg agatgatctt cgtgatcttc ttcatcgccg tgaccttcat cagcatcctg
3360accaccggcg agggcgaggg cagagtgggc atcatcctga ccctggccat gaacatcatg
3420agcaccctgc agtgggccgt gaacagcagc atcgacgtgg acagcctgat gagaagcgtg
3480agcagagtgt tcaagttcat cgacatgccc accgagggca agcccaccaa gagcaccaag
3540ccctacaaga acggccagct gagcaaggtg atgatcatcg agaacagcca cgtgaagaag
3600gacgacatct ggcccagcgg cggccagatg accgtgaagg acctgaccgc caagtacacc
3660gagggcggca acgccatcct ggagaacatc agcttcagca tcagccccgg ccagagagtg
3720ggcctgctgg gcagaaccgg cagcggcaag agcaccctgc tgagcgcctt cctgagactg
3780ctgaacaccg agggcgagat ccagatcgac ggcgtgagct gggacagcat caccctgcag
3840cagtggagaa aggccttcgg cgtgatcccc cagaaggtgt tcatcttcag cggcaccttc
3900agaaagaacc tggaccccta cgagcagtgg agcgaccagg agatctggaa ggtggccgac
3960gaggtgggcc tgagaagcgt gatcgagcag ttccccggca agctggactt cgtgctggtg
4020gacggcggct gcgtgctgag ccacggccac aagcagctga tgtgcctggc cagaagcgtg
4080ctgagcaagg ccaagatcct gctgctggac gagcccagcg cccacctgga ccccgtgacc
4140taccagatca tcagaagaac cctgaagcag gccttcgccg actgcaccgt gatcctgtgc
4200gagcacagaa tcgaggccat gctggagtgc cagcagttcc tggtgatcga ggagaacaag
4260gtgagacagt acgacagcat ccagaagctg ctgaacgaga gaagcctgtt cagacaggcc
4320atcagcccca gcgacagagt gaagctgttc ccccacagaa acagcagcaa gtgcaagagc
4380aagccccaga tcgccgccct gaaggaggag accgaggagg aggtgcagga caccagactg
4440tga
4443
SEQ ID NO: 36; length 4443; DNA; Artificial Sequence; chemically synthesized oligonucleotide
36atgcagcgca gccccctgga gaaggccagc gtggtgagca agctgttctt cagctggacc
60cgccccatcc tgcgcaaggg ctaccgccag cgcctggagc tgagcgacat ctaccagatc
120cccagcgtgg acagcgccga caacctgagc gagaagctgg agcgcgagtg ggaccgcgag
180ctggccagca agaagaaccc caagctgatc aacgccctgc gccgctgctt cttctggcgc
240ttcatgttct acggcatctt cctgtacctg ggcgaggtga ccaaggccgt gcagcccctg
300ctgctgggcc gcatcatcgc cagctacgac cccgacaaca aggaggagcg cagcatcgcc
360atctacctgg gcatcggcct gtgcctgctg ttcatcgtgc gcaccctgct gctgcacccc
420gccatcttcg gcctgcacca catcggcatg cagatgcgca tcgccatgtt cagcctgatc
480tacaagaaga ccctgaagct gagcagccgc gtgctggaca agatcagcat cggccagctg
540gtgagcctgc tgagcaacaa cctgaacaag ttcgacgagg gcctggccct ggcccacttc
600gtgtggatcg cccccctgca ggtggccctg ctgatgggcc tgatctggga gctgctgcag
660gccagcgcct tctgcggcct gggcttcctg atcgtgctgg ccctgttcca ggccggcctg
720ggccgcatga tgatgaagta ccgcgaccag cgcgccggca agatcagcga gcgcctggtg
780atcaccagcg agatgatcga gaacatccag agcgtgaagg cctactgctg ggaggaggcc
840atggagaaga tgatcgagaa cctgcgccag accgagctga agctgacccg caaggccgcc
900tacgtgcgct acttcaacag cagcgccttc ttcttcagcg gcttcttcgt ggtgttcctg
960agcgtgctgc cctacgccct gatcaagggc atcatcctgc gcaagatctt caccaccatc
1020agcttctgca tcgtgctgcg catggccgtg acccgccagt tcccctgggc cgtgcagacc
1080tggtacgaca gcctgggcgc catcaacaag atccaggact tcctgcagaa gcaggagtac
1140aagaccctgg agtacaacct gaccaccacc gaggtggtga tggagaacgt gaccgccttc
1200tgggaggagg gcttcggcga gctgttcgag aaggccaagc agaacaacaa caaccgcaag
1260accagcaacg gcgacgacag cctgttcttc agcaacttca gcctgctggg cacccccgtg
1320ctgaaggaca tcaacttcaa gatcgagcgc ggccagctgc tggccgtggc cggcagcacc
1380ggcgccggca agaccagcct gctgatggtg atcatgggcg agctggagcc cagcgagggc
1440aagatcaagc acagcggccg catcagcttc tgcagccagt tcagctggat catgcccggc
1500accatcaagg agaacatcat cttcggcgtg agctacgacg agtaccgcta ccgcagcgtg
1560atcaaggcct gccagctgga ggaggacatc agcaagttcg ccgagaagga caacatcgtg
1620ctgggcgagg gcggcatcac cctgagcggc ggccagcgcg cccgcatcag cctggcccgc
1680gccgtgtaca aggacgccga cctgtacctg ctggacagcc ccttcggcta cctggacgtg
1740ctgaccgaga aggagatctt cgagagctgc gtgtgcaagc tgatggccaa caagacccgc
1800atcctggtga ccagcaagat ggagcacctg aagaaggccg acaagatcct gatcctgcac
1860gagggcagca gctacttcta cggcaccttc agcgagctgc agaacctgca gcccgacttc
1920agcagcaagc tgatgggctg cgacagcttc gaccagttca gcgccgagcg ccgcaacagc
1980atcctgaccg agaccctgca ccgcttcagc ctggagggcg acgcccccgt gagctggacc
2040gagaccaaga agcagagctt caagcagacc ggcgagttcg gcgagaagcg caagaacagc
2100atcctgaacc ccatcaacag catccgcaag ttcagcatcg tgcagaagac ccccctgcag
2160atgaacggca tcgaggagga cagcgacgag cccctggagc gccgcctgag cctggtgccc
2220gacagcgagc agggcgaggc catcctgccc cgcatcagcg tgatcagcac cggccccacc
2280ctgcaggccc gccgccgcca gagcgtgctg aacctgatga cccacagcgt gaaccagggc
2340cagaacatcc accgcaagac caccgccagc acccgcaagg tgagcctggc cccccaggcc
2400aacctgaccg agctggacat ctacagccgc cgcctgagcc aggagaccgg cctggagatc
2460agcgaggaga tcaacgagga ggacctgaag gagtgcttct tcgacgacat ggagagcatc
2520cccgccgtga ccacctggaa cacctacctg cgctacatca ccgtgcacaa gagcctgatc
2580ttcgtgctga tctggtgcct ggtgatcttc ctggccgagg tggccgccag cctggtggtg
2640ctgtggctgc tgggcaacac ccccctgcag gacaagggca acagcaccca cagccgcaac
2700aacagctacg ccgtgatcat caccagcacc agcagctact acgtgttcta catctacgtg
2760ggcgtggccg acaccctgct ggccatgggc ttcttccgcg gcctgcccct ggtgcacacc
2820ctgatcaccg tgagcaagat cctgcaccac aagatgctgc acagcgtgct gcaggccccc
2880atgagcaccc tgaacaccct gaaggccggc ggcatcctga accgcttcag caaggacatc
2940gccatcctgg acgacctgct gcccctgacc atcttcgact tcatccagct gctgctgatc
3000gtgatcggcg ccatcgccgt ggtggccgtg ctgcagccct acatcttcgt ggccaccgtg
3060cccgtgatcg tggccttcat catgctgcgc gcctacttcc tgcagaccag ccagcagctg
3120aagcagctgg agagcgaggg ccgcagcccc atcttcaccc acctggtgac cagcctgaag
3180ggcctgtgga ccctgcgcgc cttcggccgc cagccctact tcgagaccct gttccacaag
3240gccctgaacc tgcacaccgc caactggttc ctgtacctga gcaccctgcg ctggttccag
3300atgcgcatcg agatgatctt cgtgatcttc ttcatcgccg tgaccttcat cagcatcctg
3360accaccggcg agggcgaggg ccgcgtgggc atcatcctga ccctggccat gaacatcatg
3420agcaccctgc agtgggccgt gaacagcagc atcgacgtgg acagcctgat gcgcagcgtg
3480agccgcgtgt tcaagttcat cgacatgccc accgagggca agcccaccaa gagcaccaag
3540ccctacaaga acggccagct gagcaaggtg atgatcatcg agaacagcca cgtgaagaag
3600gacgacatct ggcccagcgg cggccagatg accgtgaagg acctgaccgc caagtacacc
3660gagggcggca acgccatcct ggagaacatc agcttcagca tcagccccgg ccagcgcgtg
3720ggcctgctgg gccgcaccgg cagcggcaag agcaccctgc tgagcgcctt cctgcgcctg
3780ctgaacaccg agggcgagat ccagatcgac ggcgtgagct gggacagcat caccctgcag
3840cagtggcgca aggccttcgg cgtgatcccc cagaaggtgt tcatcttcag cggcaccttc
3900cgcaagaacc tggaccccta cgagcagtgg agcgaccagg agatctggaa ggtggccgac
3960gaggtgggcc tgcgcagcgt gatcgagcag ttccccggca agctggactt cgtgctggtg
4020gacggcggct gcgtgctgag ccacggccac aagcagctga tgtgcctggc ccgcagcgtg
4080ctgagcaagg ccaagatcct gctgctggac gagcccagcg cccacctgga ccccgtgacc
4140taccagatca tccgccgcac cctgaagcag gccttcgccg actgcaccgt gatcctgtgc
4200gagcaccgca tcgaggccat gctggagtgc cagcagttcc tggtgatcga ggagaacaag
4260gtgcgccagt acgacagcat ccagaagctg ctgaacgagc gcagcctgtt ccgccaggcc
4320atcagcccca gcgaccgcgt gaagctgttc ccccaccgca acagcagcaa gtgcaagagc
4380aagccccaga tcgccgccct gaaggaggag accgaggagg aggtgcagga cacccgcctg
4440taa
4443
SEQ ID NO: 37; length 4443; DNA; Artificial Sequence; chemically synthesized oligonucleotide
37atgcagagaa gccccctgga gaaggccagc gtggtgagca agctgttctt cagctggacc
60agacccatcc tgagaaaggg ctacagacag agactggagc tgagcgacat ctaccagatc
120cccagcgtgg acagcgccga caacctgagc gagaagctgg agagagagtg ggacagagag
180ctggccagca agaagaaccc caagctgatc aacgccctga gaagatgctt cttctggaga
240ttcatgttct acggcatctt cctgtacctg ggcgaggtga ccaaggccgt gcagcccctg
300ctgctgggca gaatcatcgc cagctacgac cccgacaaca aggaggagag aagcatcgcc
360atctacctgg gcatcggcct gtgcctgctg ttcatcgtga gaaccctgct gctgcacccc
420gccatcttcg gcctgcacca catcggcatg cagatgagaa tcgccatgtt cagcctgatc
480tacaagaaga ccctgaagct gagcagcaga gtgctggaca agatcagcat cggccagctg
540gtgagcctgc tgagcaacaa cctgaacaag ttcgacgagg gcctggccct ggcccacttc
600gtgtggatcg cccccctgca ggtggccctg ctgatgggcc tgatctggga gctgctgcag
660gccagcgcct tctgcggcct gggcttcctg atcgtgctgg ccctgttcca ggccggcctg
720ggcagaatga tgatgaagta cagggaccag agagccggca agatcagcga gagactggtg
780atcaccagcg agatgatcga gaacatccag agcgtgaagg cctactgctg ggaggaggcc
840atggagaaga tgatcgagaa cctgagacag accgagctga agctgaccag aaaggccgcc
900tacgtgagat acttcaacag cagcgccttc ttcttcagcg gcttcttcgt ggtgttcctg
960agcgtgctgc cctacgccct gatcaagggc atcatcctga gaaagatctt caccaccatc
1020agcttctgca tcgtgctgag aatggccgtg accagacagt tcccctgggc cgtgcagacc
1080tggtacgaca gcctgggcgc catcaacaag atccaggact tcctgcagaa gcaggagtac
1140aagaccctgg agtacaacct gaccaccacc gaggtggtga tggagaacgt gaccgccttc
1200tgggaggagg gcttcggcga gctgttcgag aaggccaagc agaacaacaa caacagaaag
1260accagcaacg gcgacgacag cctgttcttc agcaacttca gcctgctggg cacccccgtg
1320ctgaaggaca tcaacttcaa gatcgagaga ggccagctgc tggccgtggc cggcagcacc
1380ggcgccggca agaccagcct gctgatggtg atcatgggcg agctggagcc cagcgagggc
1440aagatcaagc acagcggcag aatcagcttc tgcagccagt tcagctggat catgcccggc
1500accatcaagg agaacatcat cttcggcgtg agctacgacg agtacagata cagaagcgtg
1560atcaaggcct gccagctgga ggaggacatc agcaagttcg ccgagaagga caacatcgtg
1620ctgggcgagg gcggcatcac cctgagcggc ggccagagag ccagaatcag cctggccaga
1680gccgtgtaca aggacgccga cctgtacctg ctggacagcc ccttcggcta cctggacgtg
1740ctgaccgaga aggagatctt cgagagctgc gtgtgcaagc tgatggccaa caagaccaga
1800atcctggtga ccagcaagat ggagcacctg aagaaggccg acaagatcct gatcctgcac
1860gagggcagca gctacttcta cggcaccttc agcgagctgc agaacctgca gcccgacttc
1920agcagcaagc tgatgggctg cgacagcttc gaccagttca gcgccgagag aagaaacagc
1980atcctgaccg agaccctgca cagattcagc ctggagggcg acgcccccgt gagctggacc
2040gagaccaaga agcagagctt caagcagacc ggcgagttcg gcgagaagag aaagaacagc
2100atcctgaacc ccatcaacag catcagaaag ttcagcatcg tgcagaagac ccccctgcag
2160atgaacggca tcgaggagga cagcgacgag cccctggaga gaagactgag cctggtgccc
2220gacagcgagc agggcgaggc catcctgccc agaatcagcg tgatcagcac cggccccacc
2280ctgcaggcca gaagaagaca gagcgtgctg aacctgatga cccacagcgt gaaccagggc
2340cagaacatcc acagaaagac caccgccagc accagaaagg tgagcctggc cccccaggcc
2400aacctgaccg agctggacat ctacagcaga agactgagcc aggagaccgg cctggagatc
2460agcgaggaga tcaacgagga ggacctgaag gagtgcttct tcgacgacat ggagagcatc
2520cccgccgtga ccacctggaa cacctacctg agatacatca ccgtgcacaa gagcctgatc
2580ttcgtgctga tctggtgcct ggtgatcttc ctggccgagg tggccgccag cctggtggtg
2640ctgtggctgc tgggcaacac ccccctgcag gacaagggca acagcaccca cagcagaaac
2700aacagctacg ccgtgatcat caccagcacc agcagctact acgtgttcta catctacgtg
2760ggcgtggccg acaccctgct ggccatgggc ttcttcagag gcctgcccct ggtgcacacc
2820ctgatcaccg tgagcaagat cctgcaccac aagatgctgc acagcgtgct gcaggccccc
2880atgagcaccc tgaacaccct gaaggccggc ggcatcctga acagattcag caaggacatc
2940gccatcctgg acgacctgct gcccctgacc atcttcgact tcatccagct gctgctgatc
3000gtgatcggcg ccatcgccgt ggtggccgtg ctgcagccct acatcttcgt ggccaccgtg
3060cccgtgatcg tggccttcat catgctgaga gcctacttcc tgcagaccag ccagcagctg
3120aagcagctgg agagcgaggg caggagcccc atcttcaccc acctggtgac cagcctgaag
3180ggcctgtgga ccctgagagc cttcggcaga cagccctact tcgagaccct gttccacaag
3240gccctgaacc tgcacaccgc caactggttc ctgtacctga gcaccctgag atggttccag
3300atgagaatcg agatgatctt cgtgatcttc ttcatcgccg tgaccttcat cagcatcctg
3360accaccggcg agggcgaggg cagagtgggc atcatcctga ccctggccat gaacatcatg
3420agcaccctgc agtgggccgt gaacagcagc atcgacgtgg acagcctgat gagaagcgtg
3480agcagagtgt tcaagttcat cgacatgccc accgagggca agcccaccaa gagcaccaag
3540ccctacaaga acggccagct gagcaaggtg atgatcatcg agaacagcca cgtgaagaag
3600gacgacatct ggcccagcgg cggccagatg accgtgaagg acctgaccgc caagtacacc
3660gagggcggca acgccatcct ggagaacatc agcttcagca tcagccccgg ccagagagtg
3720ggcctgctgg gcagaaccgg cagcggcaag agcaccctgc tgagcgcctt cctgagactg
3780ctgaacaccg agggcgagat ccagatcgac ggcgtgagct gggacagcat caccctgcag
3840cagtggagaa aggccttcgg cgtgatcccc cagaaggtgt tcatcttcag cggcaccttc
3900agaaagaacc tggaccccta cgagcagtgg agcgaccagg agatctggaa ggtggccgac
3960gaggtgggcc tgagaagcgt gatcgagcag ttccccggca agctggactt cgtgctggtg
4020gacggcggct gcgtgctgag ccacggccac aagcagctga tgtgcctggc cagaagcgtg
4080ctgagcaagg ccaagatcct gctgctggac gagcccagcg cccacctgga ccccgtgacc
4140taccagatca tcagaagaac cctgaagcag gccttcgccg actgcaccgt gatcctgtgc
4200gagcacagaa tcgaggccat gctggagtgc cagcagttcc tggtgatcga ggagaacaag
4260gtgagacagt acgacagcat ccagaagctg ctgaacgaga gaagcctgtt cagacaggcc
4320atcagcccca gcgacagagt gaagctgttc ccccacagaa acagcagcaa gtgcaagagc
4380aagccccaga tcgccgccct gaaggaggag accgaggagg aggtgcagga caccagactg
4440tga
4443
SEQ ID NO: 38; length 4443; DNA; Artificial Sequence; chemically synthesized oligonucleotide
38atgcagaggt cacctctgga aaaggctagc gtggtcagca agctattttt ttcctggacc
60cgcccgatac tcaggaaggg ctaccgacag cggctggagc tgagtgacat ttatcagatt
120ccctccgtcg attccgctga caacctgtct gagaaactgg agcgggaatg ggatagggaa
180ctggcgtcca aaaaaaaccc caaactcatc aatgcactcc gcagatgctt cttctggcgg
240tttatgtttt atggcatatt cctgtatctg ggggaggtga cgaaagccgt gcagccgctg
300ctgcttggtc gcattatcgc gtcatacgat ccagataaca aggaggaaag aagtatcgct
360atctatctcg ggatagggct gtgcctgctc ttcattgtgc ggactcttct cttgcacccc
420gccattttcg gtctgcatca tataggtatg cagatgagaa ttgcgatgtt ctcattgatt
480tacaaaaaaa cgcttaagct aagttcaagg gtgctagata agatatcgat cggccagctg
540gtgtctctgc ttagcaacaa cctcaataaa ttcgacgaag gccttgcact ggcccacttc
600gtgtggatcg cccctctgca ggtggctctg ctgatggggt taatatggga gctgttgcag
660gcctccgctt tttgtggcct ggggtttctc atcgtgttgg ccttgtttca ggcagggctg
720ggacgtatga tgatgaaata tagggatcag agggctggca aaatctctga gcgcctggtt
780attacgagtg aaatgattga gaacatccag tcagtgaagg cctattgctg ggaggaggcc
840atggaaaaaa tgattgagaa cctacgccag actgagctga agttaaccag aaaagccgcc
900tatgtgcgct actttaacag tagcgcattt ttcttctccg gttttttcgt ggtgtttctt
960agtgtgttgc cgtatgcctt aatcaaggga ataatactcc ggaagatttt cactaccatc
1020agcttctgta tcgtgttgcg gatggccgtc acccggcagt ttccctgggc agtacagact
1080tggtacgatt ctctcggagc aattaacaaa atccaagact ttctacaaaa gcaggagtac
1140aagaccctgg agtacaatct gaccaccaca gaagtcgtaa tggagaatgt aactgccttc
1200tgggaagagg gctttggcga actctttgaa aaggccaagc agaacaataa caaccggaag
1260acctccaacg gggacgacag cttatttttc agcaattttt ctttgctcgg gacccctgta
1320ctgaaagata ttaactttaa gatcgagcgc ggacaactcc tggctgtcgc cggcagcact
1380ggagctggaa aaacatcact gcttatggtg ataatgggag aactcgaacc aagcgaggga
1440aaaataaagc actctggacg gattagtttt tgctcccagt tctcgtggat aatgcctggc
1500accattaagg agaatatcat ctttggagtg agttacgacg aataccggta ccggtccgtt
1560atcaaggctt gtcaactcga ggaggacatt tctaaattcg ccgaaaaaga taatatagtg
1620ctgggcgaag gaggcattac actgagcggg ggtcagagag ctcgaattag cctcgcccga
1680gcagtctata aagacgccga tctttacctg ctggattccc cttttgggta tttggatgtt
1740ctgacagaga aggaaatctt tgaatcatgt gtctgtaaac tgatggccaa taagactagg
1800attctagtga cttcgaaaat ggagcacctg aaaaaagcgg acaaaattct gatactccat
1860gaagggtctt cctacttcta cggcaccttc tcagagttgc agaacttaca acctgatttt
1920tcatctaagc ttatggggtg cgactcgttt gaccagttct ccgctgaaag acgaaacagc
1980atcttaacgg aaactcttca caggttctca ttagagggag atgcgccggt gtcctggaca
2040gagacaaaaa aacagtcttt caaacagaca ggagagtttg gcgagaagag aaaaaactca
2100atcctcaatc ccatcaattc tattagaaag tttagcatcg tccaaaaaac accattgcag
2160atgaatggga ttgaggagga cagtgatgag cctttggaac gaagactgtc cctggtaccc
2220gatagcgaac agggtgaggc catccttcct aggatctcgg tcataagtac agggcccaca
2280ctgcaggcca ggcgacgtca aagtgtcctc aatcttatga cgcacagtgt gaatcagggg
2340cagaacatcc atcgtaagac gacagcttca actcgaaagg tcagtctagc tccacaagcc
2400aatcttacag agctggacat ttattcccgc cgcctcagtc aggagaccgg attggaaata
2460tcagaggaaa ttaatgaaga ggatctgaag gaatgcttct ttgatgacat ggaatcgatc
2520cccgctgtta ctacctggaa cacatatctg agatatatta ccgtccataa gagcttaatc
2580tttgtactga tatggtgctt ggtgattttc ctggcagagg ttgcggcgag tttggtcgtg
2640ctatggctcc ttggaaacac tcccctgcag gataagggga actccactca tagcaggaat
2700aacagctatg ccgtgatcat cacctctacc tcctcttatt acgtgtttta catatacgtc
2760ggtgttgcgg ataccctgtt ggcaatgggg ttctttagag gactacccct agttcacacc
2820ctgatcaccg tttcgaagat cttgcaccac aagatgcttc atagcgttct ccaagctcct
2880atgagcaccc ttaatacact gaaagcagga ggtatcctta accgcttttc caaagacatc
2940gctatactcg acgatttgct cccattgacc atcttcgact tcattcagct gctcctcatt
3000gtgatcggcg ccattgccgt ggtcgcagtg ttacagccat atattttcgt agccaccgtg
3060cccgtcatcg tggcatttat catgctgcgc gcatatttct tacagacatc tcagcaactg
3120aagcagctgg aatctgaggg cagatctcct atttttacac acctggttac cagcctgaag
3180ggcctgtgga ccctgcgtgc tttcggtcgc caaccctact ttgagactct cttccataag
3240gctctgaatt tacatactgc caattggttc ctatacctta gtacccttcg gtggttccag
3300atgcggatag aaatgatctt cgtgattttc ttcatcgcag tcactttcat ctctattttg
3360acgaccggtg agggcgaggg cagggtgggc atcattctga ctttggccat gaacattatg
3420tcaacactcc agtgggccgt taattcaagc attgatgtgg attccttgat gcgttccgtc
3480agcagggtat ttaaattcat agacatgccc accgagggca agccaacaaa atctaccaag
3540ccatacaaaa atggccaact aagcaaggtc atgattatcg agaattctca tgtgaaaaag
3600gacgacattt ggccttccgg gggtcaaatg actgtaaagg acctgacggc taaatacact
3660gagggcggta atgctatctt ggagaacatc tctttcagca tctcccctgg ccagagagtg
3720ggactgctcg ggcggacagg ctccggaaag tctacgctcc tttcagcatt ccttagactt
3780ctgaacaccg aaggtgagat tcagattgac ggggtctctt gggactccat cacacttcag
3840caatggagga aggcattcgg tgtaatcccc caaaaggttt ttatcttctc cggaacattt
3900cgtaagaatc tggacccgta cgagcagtgg tcagatcagg agatctggaa agtagcagac
3960gaggtcgggc tacggagcgt tattgaacag tttcctggca aactggactt cgttttggtg
4020gacggaggct gtgtgctgag tcacggccat aaacaactga tgtgcttagc taggtctgtt
4080ctcagcaagg caaagatttt actgctggat gaaccaagcg cccaccttga tccagtgaca
4140tatcaaatca tcagaagaac tcttaaacag gcgttcgccg actgcacagt gatcctgtgt
4200gagcacagaa tagaagccat gctggaatgt caacagtttc tcgtgattga ggagaacaag
4260gtgcgccagt acgatagcat ccagaagtta ctcaatgaaa ggtcactctt caggcaggcc
4320atctcaccca gcgaccgcgt taagctgttt ccacaccgaa acagttccaa gtgcaaaagt
4380aagccacaga ttgctgcact gaaggaagag acagaagaag aagttcagga cactcggctc
4440tga
4443
SEQ ID NO: 39; length 4443; DNA; Artificial Sequence; chemically synthesized oligonucleotide
39atgcagagga gcccactgga gaaagcctcc gtggtgagta aactcttttt tagttggacc
60agacccatcc tgcgaaaagg atacaggcag cgcctcgagt tgtcagatat ctaccagatt
120ccttctgtgg actcagctga caatttgagt gagaagctgg agcgggagtg ggatagagag
180ctggcgagca aaaaaaaccc caagcttatc aatgctctgc gccgctgctt tttctggagg
240ttcatgtttt atgggatctt cctgtacctg ggggaggtca ccaaagctgt tcagccgctc
300cttcttggcc gcatcatcgc cagctatgac cctgataata aagaagaaag gtctattgct
360atttatctgg gaattggcct ctgcttgctc ttcatcgtcc gcacccttct gctgcaccct
420gccatttttg gccttcacca catcggcatg caaatgagaa ttgccatgtt ctccctcatt
480tacaaaaaga ccctgaaact ttcctcaaga gtgttagata aaatatccat tggtcagctg
540gtcagcctgc tgtccaacaa tcttaacaaa tttgatgaag gcttggcgct ggcccacttc
600gtgtggattg cacctctgca ggtggccctg ttgatgggac ttatatggga gctgcttcaa
660gcctctgctt tctgtgggct gggctttttg attgtactgg cactttttca ggctgggctc
720ggaagaatga tgatgaaata cagagatcag cgggccggga agatatcaga gcgacttgtg
780atcaccagtg aaatgattga aaatattcag agcgtgaaag cctactgctg ggaagaagcc
840atggagaaga tgattgagaa cctgaggcag acagagctca agctcactcg gaaggctgct
900tatgttcgct atttcaacag cagcgccttc ttcttcagtg gcttctttgt tgtcttcctg
960tctgttctgc catatgcact gataaaaggc attattttac gaaagatctt caccaccatc
1020agtttttgca tcgttctcag gatggccgtc acaagacagt tcccctgggc tgtgcagacc
1080tggtacgatt ccttgggggc catcaacaag attcaagatt tcttgcaaaa acaagaatat
1140aaaactttag aatacaacct caccaccact gaagtggtca tggaaaatgt gacagccttt
1200tgggaggagg gttttggaga attgttcgag aaggcaaagc agaataacaa caacaggaag
1260acgagcaatg gggacgactc tctcttcttc agcaactttt cactgctcgg gacccctgtg
1320ttgaaagata taaacttcaa gatcgagagg ggccagctct tggctgtggc aggctccact
1380ggagctggta aaacatctct tctcatggtg atcatggggg aactggagcc ttccgaagga
1440aaaatcaagc acagtgggag aatctcattc tgcagccagt tttcctggat catgcccggc
1500accattaagg aaaacatcat atttggagtg tcctatgatg agtaccgcta ccggtcagtc
1560atcaaagcct gtcagttgga ggaggacatc tccaagtttg cagagaaaga caacattgtg
1620cttggagagg ggggtatcac tctttctgga ggacaaagag ccaggatctc tttggcccgg
1680gcagtctaca aggatgcaga cctctacttg ttggacagtc ccttcggcta cctcgacgtg
1740ctgactgaaa aagaaatttt tgaaagctgt gtgtgcaaac tgatggcaaa caagaccagg
1800attcttgtca ccagcaagat ggaacatctg aagaaagcgg acaaaattct gattctgcat
1860gaagggagct cctacttcta tggaacattt agcgagcttc agaacctaca gccagacttc
1920tcctccaaat taatgggctg tgactccttc gaccagttct ctgcagaaag aagaaactct
1980atactcacag agaccctcca ccgcttctcc cttgagggag atgccccagt ttcttggaca
2040gaaaccaaga agcagtcctt taagcagact ggcgagtttg gtgaaaagag gaaaaattca
2100attctcaatc caattaacag tattcgcaag ttcagcattg tccagaagac acccctccag
2160atgaatggca tcgaagaaga tagtgacgag ccgctggaga gacggctgag tctggtgcca
2220gattcagaac agggggaggc catcctgccc cggatcagcg tcatttccac aggccccaca
2280ttacaagcac ggcgccggca gagtgtttta aatctcatga cccattcagt gaaccagggc
2340caaaatatcc acaggaagac tacagcttct acccggaaag tgtctctggc ccctcaggcc
2400aatctgaccg agctggacat ctacagcagg aggctctccc aggaaacagg gctggaaata
2460tctgaagaga ttaatgaaga ggatcttaaa gagtgcttct ttgatgacat ggagagcatc
2520cccgcggtga ccacatggaa cacctacctt agatatatta ctgtccacaa gagcctcata
2580tttgtcctca tctggtgcct ggttattttc ctcgctgagg tggcggccag tcttgttgtg
2640ctctggctgc tgggcaacac tcctctccag gacaagggca atagtactca cagcagaaat
2700aattcttatg ccgtcatcat tacaagcacc tccagctact acgtgttcta catctatgtg
2760ggcgtggctg acaccctcct ggccatgggt ttcttccggg gcctgccttt ggtgcacacc
2820ctcatcacag tgtcaaaaat tctgcaccat aaaatgcttc attctgtcct gcaggcaccc
2880atgagcactt tgaacacatt gaaggctggc ggcatcctca acagattttc taaagatatt
2940gctatcctgg atgatctcct ccccctgaca atctttgact ttatccagct tctgctgatc
3000gtgattggag ccatagcagt ggttgctgtc ctgcagccct acatttttgt ggccaccgtg
3060cccgtgattg ttgcctttat tatgctcaga gcttacttcc tgcaaacttc tcaacagctc
3120aaacagctag aatctgaggg ccggagcccc atttttaccc acctggtgac ttccctgaag
3180ggactgtgga ctctgagagc attcgggcga cagccttact ttgagacact gttccacaag
3240gccctgaact tgcacactgc caactggttt ctttacctga gcacactccg ctggttccag
3300atgcggatag agatgatctt cgtcatcttt tttatagctg taaccttcat ttctatcctt
3360acaacaggag aaggagaggg cagggtggga atcatcctca cgctggctat gaacataatg
3420tccaccttgc agtgggccgt gaattccagt atagatgtgg attctctaat gaggagtgtc
3480tcccgggtgt ttaaattcat tgatatgcct actgagggga aacccaccaa gtcaacaaaa
3540ccttataaga atggacagct gagcaaggtg atgataattg agaacagcca cgtgaagaag
3600gatgacattt ggcccagcgg gggccagatg actgtgaagg acctgacggc caagtacacc
3660gaaggtggaa atgccatttt ggaaaacatc agcttctcaa tctctcctgg gcagagagtt
3720ggattgctgg gtcgcacggg cagcggcaaa tcaaccctgc tcagtgcctt ccttcggctc
3780ctgaatacag aaggcgaaat ccaaattgac ggggtgagct gggacagcat caccctgcag
3840cagtggagaa aagcatttgg ggtcattcca cagaaagttt tcatcttctc tggcactttc
3900agaaagaacc tggaccccta tgagcagtgg agcgaccagg agatctggaa ggttgcagat
3960gaagttggcc tgcggagtgt gatagaacaa tttcctggca agctggattt tgtgctggta
4020gatggaggct gcgtgctgtc ccacggccac aaacagctga tgtgcctcgc ccgctccgtt
4080ctttcaaagg ccaaaatctt gcttttggat gagcccagtg ctcacctcga cccagtgacc
4140tatcagataa tccgcaggac cttaaagcaa gcttttgccg actgcaccgt catactgtgt
4200gagcaccgga ttgaagcaat gctggaatgc cagcagtttc tggtgatcga ggagaataag
4260gtccggcagt acgacagcat ccagaagttg ttgaatgagc gcagcctttt ccgccaggcc
4320atctccccat ctgacagagt caagctgttt ccacatagga actcctctaa gtgcaagtcc
4380aagccccaga tcgctgccct caaggaggaa actgaggaag aggtgcagga tacccgcctg
4440tga
4443
SEQ ID NO: 40; length 4443; DNA; Artificial Sequence; chemically synthesized oligonucleotide
40atgcaacgga gtcctctgga aaaagcctct gtcgtatcta agcttttctt cagttggaca
60cgcccgattt tgagaaaggg ttatcggcaa cgcttggaac ttagtgacat ctaccaaatt
120ccaagtgtag actcagccga taacttgagc gaaaagctcg aacgagagtg ggatcgagaa
180ctggctagca aaaaaaatcc caaactcata aatgccctgc gacgctgttt cttttggcga
240tttatgtttt acggtatttt cctttatttg ggtgaggtca cgaaggctgt acagccactg
300ctgctgggtc gcatcattgc ctcttacgac cctgacaaca aagaggagcg gtcaatagct
360atctaccttg gtataggact ttgcttgctc ttcatagtcc gcacgttgct tctccaccct
420gctatatttg gtctccatca cattgggatg caaatgcgga tcgcgatgtt cagtcttata
480tataaaaaga ctcttaaact ttccagccgg gttctggata agatctctat tggtcaactg
540gtatctcttt tgtctaacaa cctgaataag ttcgacgagg gccttgcatt ggcccatttt
600gtatggattg cccctttgca agtcgccctc ctgatgggat tgatctggga actcctgcaa
660gctagtgctt tttgcggatt gggattcctc atagtccttg cgctctttca ggcgggactt
720ggacgcatga tgatgaagta tcgcgaccaa cgagctggca agatcagtga acggcttgta
780ataaccagtg aaatgataga gaacatccag agcgtaaaag cttactgttg ggaagaagcg
840atggaaaaga tgattgagaa ccttcgccag acagaactta aacttacacg aaaggccgct
900tatgtccggt acttcaactc ttcagcattt ttttttagtg gcttctttgt agtgttcctg
960tccgtccttc cgtatgcact tatcaagggt ataatactta ggaaaatctt cacaacaatc
1020agtttttgca tagtccttcg catggcagta actcgccaat ttccctgggc agttcagacg
1080tggtacgact cacttggcgc aattaacaaa attcaagatt tcctccaaaa gcaagagtat
1140aaaaccttgg aatacaacct taccaccaca gaagttgtaa tggaaaatgt cacagccttc
1200tgggaggaag gtttcggcga actttttgag aaggcgaagc aaaataacaa taatcggaaa
1260acatcaaacg gtgacgattc actgttcttt tctaacttta gccttcttgg gacgcccgtc
1320ctgaaggaca taaactttaa gattgaacgg ggtcaacttc tcgcggtcgc agggagtact
1380ggagcgggga aaacgagcct gctgatggtg ataatggggg agttggagcc ctcagaaggc
1440aagatcaagc atagtggtag aattagcttc tgcagtcaat ttagttggat tatgccgggc
1500acgatcaaag aaaatataat ctttggggta tcctacgatg aatacaggta ccgatcagtg
1560ataaaagcgt gccagcttga agaagacatt tcaaagtttg ctgagaagga taatatcgta
1620cttggagaag gaggtatcac cctgtctggg ggtcaacgag cgaggatctc cctggcacgc
1680gccgtctaca aggacgcgga cctctatctg ttggattcac cgttcggata tttggacgtg
1740cttacggaga aagaaatatt tgagagctgt gtttgcaagc tcatggcaaa taaaaccaga
1800atattggtta caagcaagat ggagcatctt aagaaagcag ataaaatcct gatattgcac
1860gagggctctt catacttcta cgggacgttt tctgagttgc agaacctcca gccggatttc
1920agctctaagc tgatgggctg tgattccttt gatcagttta gtgcggaaag acgaaacagt
1980atactcaccg aaacactgca caggttctct ctggagggcg acgccccggt ttcctggaca
2040gagacgaaga agcagtcctt caaacagaca ggcgagtttg gggagaaaag gaaaaatagc
2100atactcaacc cgattaacag cattcgcaag ttcagtatag tacaaaagac cccgttgcag
2160atgaacggta tagaggaaga ttctgatgag ccactggaaa gacggctttc tctcgttccg
2220gacagtgaac agggagaggc aatactgcct cggatcagcg ttatctctac aggacctact
2280ttgcaagctc ggcgccgaca gtcagtcttg aatcttatga ctcatagtgt taatcaaggc
2340cagaatatcc atcgcaagac caccgcaagt acaaggaaag tgagcttggc acctcaagca
2400aaccttactg aacttgatat ctactcacgg cgactttcac aggagaccgg acttgaaatt
2460agtgaagaaa ttaacgagga ggacctcaag gagtgcttct tcgatgacat ggaatcaatc
2520cccgcagtca caacctggaa cacttatctg aggtatataa cagttcacaa gagcctcatt
2580tttgtactta tttggtgttt ggtaattttc ctggcggagg ttgctgcttc tttggtcgtc
2640ctttggctcc tcgggaatac accgctccaa gacaaaggca actctaccca tagtaggaac
2700aattcatatg cagtgattat aaccagtaca tcatcttatt acgttttcta tatttatgtc
2760ggggtagctg acacgctgtt ggcgatgggc ttctttaggg gcctcccctt ggtacacacc
2820cttatcacgg tgagtaaaat cctgcatcac aaaatgcttc attctgtact ccaagcgccg
2880atgagtacgc ttaatacgct gaaagcagga gggatactga atcggttcag caaggacatc
2940gccattctgg atgacctgct tccattgaca atatttgatt tcattcagct ccttctcata
3000gttattggag ccatagcggt ggtggctgtg cttcagcctt atatattcgt tgccacagtt
3060cccgttatag tggcatttat aatgctcagg gcctactttc tccagacttc ccagcagttg
3120aagcaactcg aatcagaagg aaggtcacct attttcacac atcttgtgac ttccttgaag
3180ggcttgtgga cgctgcgggc cttcggaaga caaccatatt ttgaaactct cttccacaaa
3240gctttgaatc ttcatactgc gaactggttc ctgtatttga gtactttgcg ctggttccag
3300atgaggatag aaatgatatt cgttatcttc tttatcgcgg ttacgttcat aagtatcctc
3360actacggggg agggtgaggg tagagtgggc ataatactga ccctcgccat gaacattatg
3420tccaccctgc agtgggcggt aaacagcagc atagatgtgg attctttgat gcgcagtgtg
3480agcagggttt ttaagtttat cgatatgccg acggaaggaa agcccactaa aagcacgaaa
3540ccctataaaa atggacagct tagcaaagta atgataatcg agaatagcca tgtgaaaaag
3600gatgacatat ggccttccgg aggccaaatg actgttaaag atctgaccgc taaatatacc
3660gagggcggca acgcaatact cgaaaacata agcttttcca taagccccgg ccaacgcgtg
3720ggtcttctgg ggaggactgg ctccggaaaa tcaacgttgc ttagcgcgtt tttgcggctc
3780cttaacactg aaggtgagat ccaaatagat ggcgttagtt gggactctat aacactgcaa
3840caatggcgga aagctttcgg cgtcatacct cagaaggtgt tcatctttag cggaacgttc
3900aggaagaact tggatcccta cgaacaatgg agtgatcaag aaatatggaa agtggcagat
3960gaggtaggct tgcgcagtgt cattgaacaa ttcccaggga aactcgactt tgtactggtg
4020gacggcggtt gcgtcttgtc acacgggcac aaacagttga tgtgtttggc ccgcagtgtt
4080ttgtctaagg cgaagattct gttgctcgac gaaccgagtg ctcatcttga tcccgtcacc
4140taccaaatca tcagaaggac gttgaagcaa gctttcgccg actgcactgt aatcctttgt
4200gagcatagga tcgaagcaat gctcgagtgc caacagttct tggttataga ggagaataag
4260gttcggcaat acgactcaat acagaaactg cttaatgagc ggtcactctt tcgacaagct
4320atctctccta gtgacagggt aaagcttttt cctcatcgga attccagcaa gtgtaagagt
4380aaaccacaga tcgccgccct taaagaggag accgaagaag aggtgcagga tacgagactt
4440tag
4443
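
The sequence rows in this listing carry a number fused to the front of each row. The following is a minimal Python sketch, not part of the application as filed, of how such rows might be cleaned and sanity-checked before use. It assumes the fused number is either a running position count or, on the first row, the SEQ ID NO, and that only the rows of one sequence are passed in, without the header line. The names `assemble_sequence` and `check_coding_sequence` are hypothetical helpers introduced for illustration.

```python
import re

STOP_CODONS = {"taa", "tag", "tga"}


def assemble_sequence(rows):
    """Join listing rows into one lowercase string of a/c/g/t only.

    Assumption: every row belongs to a single sequence and carries no
    header text, so anything that is not a, c, g or t (fused counters,
    spaces) can be dropped safely.
    """
    return "".join(re.sub(r"[^acgt]", "", row.lower()) for row in rows)


def check_coding_sequence(seq, expected_length):
    """Return basic sanity checks for an assembled coding sequence."""
    return {
        "length_ok": len(seq) == expected_length,
        "whole_codons": len(seq) % 3 == 0,
        "starts_with_atg": seq.startswith("atg"),
        "ends_with_stop": seq[-3:] in STOP_CODONS,
    }


# Last two rows of SEQ ID NO: 40, copied from the listing above.
tail_rows = [
    "4380aaaccacaga tcgccgccct taaagaggag accgaagaag aggtgcagga tacgagactt",
    "4440tag",
]
tail = assemble_sequence(tail_rows)
report = check_coding_sequence(tail, expected_length=63)
```

For a complete sequence, all of its rows would be passed to `assemble_sequence` and the result checked with `expected_length=4443`, the length stated in each sequence header.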